## Archived seminars in Statistics

Seminars 51 to 100

### Ihaka Lecture #3: Alberto Cairo

*University of Miami*

Date: Wednesday 21 March 2018

The use of graphs, charts, maps and infographics to explore data and communicate science to the public has become more and more popular. However, this rise in popularity has not been accompanied by an increasing awareness of the rules that should guide the design of these visualisations.

This talk teaches ordinary citizens principles for becoming more critical and better informed readers of charts.

*Alberto Cairo is the Knight Chair in Visual Journalism at the University of Miami. He’s also the director of the visualisation programme at UM’s Center for Computational Science. Cairo has been a director of infographics and multimedia at news publications in Spain (El Mundo, 2000-2005) and Brazil (Editora Globo, 2010-2012), and a professor at the University of North Carolina-Chapel Hill. Besides teaching at UM, he works as a freelancer and consultant for companies such as Google and Microsoft. He’s the author of the books The Functional Art: An Introduction to Information Graphics and Visualization (2012) and The Truthful Art: Data, Charts, and Maps for Communication (2016).*

[The lectures are live-streamed](https://goo.gl/forms/ycwHTR6k8aD8Tquk1) from 6.30pm NZDST onwards on 7, 14 and 21 March 2018.

Join the local group in the Mathematics and Statistics Department for this live-stream viewing and discussion.

Local contact: Timothy.Bilton@agresearch.co.nz

### Ihaka Lecture #2: Paul Murrell

*University of Auckland*

Date: Wednesday 14 March 2018

When combined with screen reader software, this provides information for blind and visually-impaired R users about the contents of an R plot. A minor difficulty that arises in the generation of these text descriptions involves the information about colours within a plot. As far as R is concerned, colours are described as six-digit hexadecimal strings, e.g. "#123456", but that is not very helpful for a human audience. It would be more useful to report colour names like "red" or "blue".

This talk will make a mountain out of that molehill and embark on a daring Statistical Graphics journey featuring colour spaces, high-performance computing, Te Reo, and XKCD. The only disappointment will be the ending.
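As a rough illustration of the colour-naming problem described above, the sketch below maps a six-digit hexadecimal string to the nearest entry in a small palette, using squared distance in RGB. The palette, the function names and the choice of RGB distance are all assumptions made for illustration; the abstract mentions colour spaces and XKCD, which suggests the talk takes a richer approach than this.

```python
# Hypothetical mini-palette; a real tool would use a much larger list of names.
NAMED_COLOURS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "grey": (128, 128, 128),
}


def hex_to_rgb(hex_colour):
    """Convert a string like "#123456" to an (r, g, b) tuple of ints."""
    digits = hex_colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {hex_colour!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def nearest_colour_name(hex_colour, palette=NAMED_COLOURS):
    """Return the palette name closest to hex_colour in squared RGB distance."""
    rgb = hex_to_rgb(hex_colour)

    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))

    return min(palette, key=squared_distance)


print(nearest_colour_name("#FE0000"))
```

Plain RGB distance is a crude stand-in for perceived similarity: with this palette the dark blue "#123456" is closer to "black" than to "blue", which is one reason to consider other colour spaces and larger name lists.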

*Paul Murrell is an Associate Professor in the Department of Statistics at The University of Auckland. He is a member of the core development team for R, with primary responsibility for the graphics system.*

[The lectures are live-streamed](https://goo.gl/forms/ycwHTR6k8aD8Tquk1) from 6.30pm NZDST onwards on 7, 14 and 21 March 2018.

Join the local group in the Mathematics and Statistics Department for this live-stream viewing and discussion.

Local contact: Timothy.Bilton@agresearch.co.nz

### Ihaka Lectures: A thousand words: visualising statistical data

*Live-streamed, 1st of 3 lectures*

Date: Wednesday 7 March 2018

[The lectures are live-streamed](https://goo.gl/forms/ycwHTR6k8aD8Tquk1) from 6.30pm NZDST onwards on 7, 14 and 21 March 2018.

Local contact: Timothy.Bilton@agresearch.co.nz

### Alan E Gelfand

*Duke University*

Date: Tuesday 28 November 2017

The predictive process is simple to understand, routine to implement, with straightforward bias correction. It enjoys several attractive properties within the class of dimension reduction approaches and works well for datasets of order $10^3$ or $10^4$. It suffers several limitations including spanning only a finite dimensional subspace, over-smoothing, and underestimation of uncertainty.

So, we focus primarily on the nearest neighbor Gaussian process (NNGP), which draws upon earlier ideas of Vecchia and of Stein. It is a bit more complicated to grasp and implement but it is highly scalable, having been applied to datasets as large as $10^6$. It is a well-defined spatial process providing legitimate finite dimensional Gaussian densities with sparse precision matrices. Scalability is achieved by using local information from few nearest neighbors, i.e., by using the neighbor sets in a conditional specification of the model. This is equivalent to sparse modeling of Cholesky factors of large covariance matrices. We show a multivariate spatial illustration as well as a space-time example. We also consider automating the selection of the neighbor set size.

For either specification, we embed the PGP as a dimension reduction prior and the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed. However, the future likely lies with the NNGP since it can accommodate spatial scales that preclude dimension-reducing methods.
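The conditional specification behind the nearest neighbor approach can be sketched in a deliberately tiny setting. The code below is an illustration under stated assumptions, not the speaker's implementation: it assumes a zero-mean process on one-dimensional locations, an exponential covariance with hypothetical parameters `sigma2` and `phi`, and a neighbor set of size one (the nearest point that comes earlier in the ordering). The joint density is approximated by a product of univariate conditional normal densities, so no large covariance matrix is ever formed.

```python
import math


def exp_cov(s, t, sigma2=1.0, phi=1.0):
    """Exponential covariance between locations s and t (an assumed choice)."""
    return sigma2 * math.exp(-phi * abs(s - t))


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def vecchia_loglik_m1(locs, y, sigma2=1.0, phi=1.0):
    """Approximate log-likelihood using one nearest earlier neighbor per point."""
    total = normal_logpdf(y[0], 0.0, exp_cov(locs[0], locs[0], sigma2, phi))
    for i in range(1, len(locs)):
        # Nearest neighbor among the points that precede i in the ordering.
        j = min(range(i), key=lambda k: abs(locs[i] - locs[k]))
        c_ii = exp_cov(locs[i], locs[i], sigma2, phi)
        c_jj = exp_cov(locs[j], locs[j], sigma2, phi)
        c_ij = exp_cov(locs[i], locs[j], sigma2, phi)
        # Conditional distribution of y[i] given y[j] under the Gaussian model.
        cond_mean = c_ij / c_jj * y[j]
        cond_var = c_ii - c_ij ** 2 / c_jj
        total += normal_logpdf(y[i], cond_mean, cond_var)
    return total


print(vecchia_loglik_m1([0.0, 0.5, 1.2, 2.0], [0.3, -0.2, 0.1, 0.4]))
```

Each point contributes one cheap conditional term, so the cost grows linearly with the number of locations for a fixed neighbor set size. Larger neighbor sets would require solving a small linear system per point, which this sketch omits.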

### Matthew Parry

*Department of Mathematics and Statistics*

Date: Tuesday 24 October 2017

### Rachel Weir

*Allegheny College, Pennsylvania*

Date: Monday 16 October 2017

A common theme in the United States in recent years has been a call to increase the number of graduates in STEM (science, technology, engineering, and mathematics) fields and to enhance the scientific literacy of students in other disciplines. For example, in the 2012 report Engage to Excel, the Obama administration announced a goal of "producing, over the next decade, 1 million more college graduates in STEM fields than expected under current assumptions." Achieving these types of goals will require us to harness the potential of all students, forcing us to identify and acknowledge the barriers encountered by students from traditionally underrepresented groups. Over the past few years, I have been working to understand these barriers to success, particularly in mathematics. In this talk, I will share what I have learned so far and how it has influenced my teaching.

### David Fletcher

*Department of Mathematics and Statistics*

Date: Thursday 12 October 2017

### Honours and PGDip students

*Department of Mathematics and Statistics*

Date: Friday 6 October 2017

Jodie Buckby: *Model checking for hidden Markov models*

Jie Kang: *Model averaging for renewal process*

Yu Yang: *Robustness of temperature reconstruction for the past 500 years*

MATHEMATICS

Sam Bremer: *An effective model for particle distribution in waterways*

Joshua Mills: *Hyperbolic equations and finite difference schemes*

### Ken Ono

*Emory University; 2017 NZMS/AMS Maclaurin Lecturer*

Date: Thursday 5 October 2017

Ramanujan’s work has had a truly transformative effect on modern mathematics, and continues to do so as we understand further lines from his letters and notebooks. This lecture presents some of the studies of Ramanujan that are most accessible to the general public, and discusses how Ramanujan’s findings fundamentally changed modern mathematics and influenced the lecturer’s own work. The speaker is an Associate Producer of the film *The Man Who Knew Infinity* (starring Dev Patel and Jeremy Irons) about Ramanujan. He will share several clips from the film in the lecture.

Biography: Ken Ono is the Asa Griggs Candler Professor of Mathematics at Emory University. He is considered to be an expert in the theory of integer partitions and modular forms. He has been invited to speak to audiences all over North America, Asia and Europe. His contributions include several monographs and over 150 research and popular articles in number theory, combinatorics and algebra. He received his Ph.D. from UCLA and has received many awards for his research in number theory, including a Guggenheim Fellowship, a Packard Fellowship and a Sloan Fellowship. He was awarded a Presidential Early Career Award for Science and Engineering (PECASE) by Bill Clinton in 2000 and he was named the National Science Foundation’s Distinguished Teaching Scholar in 2005. In addition to being a thesis advisor and postdoctoral mentor, he has also mentored dozens of undergraduates and high school students. He serves as Editor-in-Chief for several journals and is an editor of The Ramanujan Journal. He is also a member of the US National Committee for Mathematics at the National Academy of Sciences.

### Katie Jones and Olya Shatova

*Oritain Dunedin*

Date: Monday 2 October 2017

**Note day, time and venue**

Oritain Global Ltd is a scientific traceability company that verifies the origin of food, fibre, and pharmaceutical products by combining trace element and isotope chemistry with statistics. Born in the research labs at the Chemistry Department in the University of Otago, Oritain has grown to become a multinational company with offices in Dunedin, London, and Sydney, and with clients from around the globe. Dr Katie Jones and Dr Olya Shatova are Otago alumni working as scientists at Oritain Dunedin. They will provide an overview of the science behind Oritain and discuss their transition from academic research to commercialized science.

### Mike and Sue Carson

*Carson Associates Ltd*

Date: Thursday 28 September 2017

### Mik Black

*Department of Biochemistry*

Date: Thursday 21 September 2017

### Timothy Bilton

*Department of Mathematics and Statistics*

Date: Thursday 14 September 2017

### Martin Hazelton

*Massey University*

Date: Thursday 7 September 2017

In this talk I will discuss network tomography for a rather general class of traffic models. I will describe some recent progress on model identifiability. I will then discuss the development of effective MCMC samplers for simulation-based inference, based on insight provided by an examination of the geometry of the space of feasible route flows.

### Lech Szymanski

*Department of Computer Science*

Date: Thursday 31 August 2017

### John Holmes

*Department of Mathematics and Statistics*

Date: Thursday 24 August 2017

### Moana Theodore

*Department of Psychology*

Date: Thursday 17 August 2017

### Matthew Schofield

*Department of Mathematics and Statistics*

Date: Thursday 10 August 2017

### Phil Wilcox

*Department of Mathematics and Statistics*

Date: Thursday 3 August 2017

### Peter Dillingham

*Department of Mathematics and Statistics*

Date: Thursday 27 July 2017

### Alastair Lamont

*Department of Mathematics and Statistics*

Date: Thursday 20 July 2017

### Michael Lee

*Department of Mathematics and Statistics*

Date: Thursday 13 July 2017

### Jim Cotter

*School of Physical Education, Sport and Exercise Sciences*

Date: Thursday 1 June 2017

### Amina Shahzadi

*Department of Mathematics and Statistics*

Date: Thursday 25 May 2017

### Jin Zhang

*Department of Accountancy and Finance*

Date: Thursday 18 May 2017

### Paula Bran

*Department of Mathematics and Statistics*

Date: Thursday 11 May 2017

### Richard Arnold

*Victoria University Wellington*

Date: Thursday 4 May 2017

This is joint work with Stefanka Chukova (VUW) and Yu Hayakawa (Waseda University, Tokyo).

### Fiona Hely

*AbacusBio*

Date: Thursday 27 April 2017

### Will Rayment

*Department of Marine Science*

Date: Thursday 13 April 2017

### Peter Dillingham

*Department of Mathematics and Statistics*

Date: Thursday 6 April 2017

### Mike Paulin

*Department of Zoology*

Date: Thursday 30 March 2017

*not* doing this in late Precambrian ecosystems leads us to model spike trains recorded from sensory neurons (in sharks, frogs and other animals) as samples from a family of Inverse Gaussian-censored Poisson, a.k.a. Exwald, point-processes. Neurons that evolved for other reasons turn out to be natural mechanisms for generating samples from Exwald processes, and natural computers for inferring the posterior density of their parameters. This is a consequence of a curious correspondence between the likelihood function for sequential inference from a censored Poisson process and the impulse response function of a neuronal membrane. We conclude that modern animals, including humans, are natural Bayesians because when neurons evolved 560 million years ago they provided our ancestors with a choice between being Bayesian or being dead.

This is joint work with recent Otago PhD students Kiri Pullar and Travis Monk, honours student Ethan Smith, and UCLA neuroscientist Larry Hoffman.

### Nicolas Cullen

*Department of Geography*

Date: Thursday 23 March 2017

### Farzana Afroz

*Department of Mathematics and Statistics*

Date: Thursday 16 March 2017

### Tilman Davies

*Department of Mathematics and Statistics*

Date: Thursday 9 March 2017

*d*-dimensional spatial data can be represented as a slice of a *fixed*-bandwidth kernel estimator in *(d+1)*-dimensional "scale space", enabling fast computation using discrete Fourier transforms. Edge correction factors have a similar representation. Different values of global bandwidth correspond to different slices of the scale space, so that bandwidth selection is greatly accelerated. Potential applications include estimation of multivariate probability density and spatial or spatiotemporal point process intensity, relative risk, and regression functions. The new methods perform well in simulations and real applications.

Joint work with Professor Adrian Baddeley, Curtin University, Perth.

### Jiancang Zhuang

*Institute of Statistical Mathematics, Tokyo*

Date: Thursday 2 March 2017

### Shiyong Zhou

*Peking University*

Date: Tuesday 14 February 2017

**NOTE venue is not our usual**

Following the stress release model (SRM) proposed by Vere-Jones (1978), we developed a new multidimensional SRM, which is a space-time-magnitude version based on multidimensional point processes. First, we interpreted the exponential hazard functional of the SRM as the mathematical expression of static fatigue failure caused by stress corrosion. Then, we reconstructed the SRM in multidimensions through incorporating four independent submodels: the magnitude distribution function, the space weighting function, the loading rate function and the coseismic stress transfer model. Finally, we applied the new model to analyze the historical earthquake catalogues in North China. An expanded catalogue, which contains the information of origin time, epicentre, magnitude, strike, dip angle, rupture length, rupture width and average dislocation, is composed for the new model. The estimated model can simulate the variations of seismicity with space, time and magnitude. Compared with the previous SRMs with the same data, the new model yields much smaller values of Akaike information criterion and corrected Akaike information criterion. We compared the predicted rates of earthquakes at the epicentres just before the related earthquakes with the mean spatial seismic rate. Among all 37 earthquakes in the expanded catalogue, the epicentres of 21 earthquakes are located in the regions of higher rates.
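For readers unfamiliar with the model comparison criteria mentioned above, the following sketch computes the Akaike information criterion and its small-sample correction from a maximised log-likelihood. It uses the standard textbook definitions, which may differ in detail from the form used in the talk; the function and argument names are hypothetical, and the numbers in the usage line are made up purely to exercise the code.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the usual small-sample correction 2k(k+1)/(n-k-1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction


# Made-up values: log-likelihood -100 with 5 parameters and 40 observations.
print(aic(-100.0, 5), aicc(-100.0, 5, 40))
```

The correction term grows as the number of parameters approaches the number of observations, so it matters most for small catalogues and richly parameterised models.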

### Keolu Fox

*University of California San Diego*

Date: Wednesday 1 February 2017

*Keolu has a strong background in using genomic technologies to understand human variation and disease. Throughout his career he has made it his priority to focus on the interface of minority health and genomic technologies. Keolu earned a Ph.D. in Debbie Nickerson's lab in the University of Washington's Department of Genome Sciences (August 2016). In collaboration with experts at Bloodworks Northwest (Seattle, WA), he focused on the application of next-generation genome sequencing to increase compatibility for blood transfusion therapy and organ transplantation. Currently Keolu is a postdoc in Alan Saltiel's lab at the University of California San Diego (UCSD) School of Medicine, Division of Endocrinology and Metabolism and the Institute for Diabetes and Metabolic Health. His current project focuses on using genome editing technologies to investigate the molecular events involved in chronic inflammatory states resulting in obesity and catecholamine resistance.*

### Roy Costilla

*Victoria University Wellington*

Date: Tuesday 24 January 2017

Despite their name, however, Bayesian nonparametric (BNP) models are actually massively parametric. A parametric model uses a function with a finite dimensional parameter vector as prior. Bayesian inference then proceeds to approximate the posterior of these parameters given the observed data. In contrast, a BNP model is defined on an infinite dimensional probability space thanks to the use of a stochastic process as a prior. In other words, the prior for a BNP model is a space of functions with an infinite dimensional parameter vector. Therefore, instead of avoiding parametric forms, BNP inference uses a large number of them to gain more flexibility.

To illustrate this, we present simulations and also a case study where we use life satisfaction in NZ over 2009-2013. We estimate the models using a finite Dirichlet Process Mixture (DPM) prior. We show that this BNP model is tractable, i.e. is easily computed using Markov Chain Monte Carlo (MCMC) methods; allowing us to handle data with big sample sizes and estimate correctly the model parameters. Coupled with a post-hoc clustering of the DPM locations, the BNP model also allows an approximation of the number of mixture components, a very important parameter in mixture modelling.

### Jorge Navarro Alberto

*Universidad Autónoma de Yucatán (UADY)*

Date: Wednesday 9 November 2016

**NOTE day and time of this seminar**

The subject of the talk is statistical methods (both theoretical and applied) and computational algorithms for the analysis of binary data, which have been applied in ecology in the study of species composition in systems of patches with the ultimate goal to uncover ecological patterns. As a starting point, I review Gotelli and Ulrich's (2012) six statistical challenges in null model analysis in Ecology. Then, I exemplify the most recent research carried out by me and other statisticians and ecologists to face those challenges, and applications of the algorithms outside the biological sciences. Several topics of research are proposed, seeking to motivate statisticians and computer scientists to venture and, eventually, to specialize in the subject of the analysis of co-occurrences.

Reference: Gotelli, NJ and Ulrich, W, 2012. Statistical challenges in null model analysis. *Oikos* 121: 171-180.

### Scotland Leman

*Virginia Tech, USA*

Date: Tuesday 8 November 2016

**NOTE day and time of this seminar**

In this talk I will primarily discuss the Multiset Sampler (MSS): a general ensemble based Markov Chain Monte Carlo (MCMC) method for sampling from complicated stochastic models. Afterwards, I will briefly introduce the audience to my interactive visual analytics based research.

Proposal distributions for complex structures are essential for virtually all MCMC sampling methods. However, such proposal distributions are difficult to construct so that their probability distribution matches that of the true target distribution, in turn hampering the efficiency of the overall MCMC scheme. The MSS entails sampling from an augmented distribution that has more desirable mixing properties than the original target model, while utilizing simple independent proposal distributions that are easily tuned. I will discuss applications of the MSS for sampling from tree based models (e.g. Bayesian CART; phylogenetic models), and for general model selection, model averaging and predictive sampling.

In the final 10 minutes of the presentation I will discuss my research interests in interactive visual analytics and the Visual To Parametric Interaction (V2PI) paradigm. I'll discuss the general concepts in V2PI with an application of Multidimensional Scaling, its technical merits, and the integration of such concepts into core statistics undergraduate and graduate programs.

### Ivor Cribben

*University of Alberta*

Date: Wednesday 19 October 2016

**NOTE day and time of this seminar**

Spectral clustering is a computationally feasible and model-free method widely used in the identification of communities in networks. We introduce a data-driven method, namely Network Change Points Detection (NCPD), which detects change points in the network structure of a multivariate time series, with each component of the time series represented by a node in the network. Spectral clustering allows us to consider high dimensional time series where the number of time series is greater than the number of time points. NCPD allows for estimation of both the time of change in the network structure and the graph between each pair of change points, without prior knowledge of the number or location of the change points. Permutation and bootstrapping methods are used to perform inference on the change points. NCPD is applied to various simulated high dimensional data sets as well as to a resting state functional magnetic resonance imaging (fMRI) data set. The new methodology also allows us to identify common functional states across subjects and groups. Extensions of the method are also discussed. Finally, the method promises to offer a deep insight into the large-scale characterisations and dynamics of the brain.

### John Tipton

*Colorado State University*

Date: Tuesday 18 October 2016

**NOTE day and time of this seminar**

Many scientific disciplines have strong traditions of developing models to approximate nature. Traditionally, statistical models have not included scientific models and have instead focused on regression methods that exploit correlation structures in data. The development of Bayesian methods has generated many examples of forward models that bridge the gap between scientific and statistical disciplines. The ability to fit forward models using Bayesian methods has generated interest in paleoclimate reconstructions, but there are many challenges in model construction and estimation that remain.

I will present two statistical reconstructions of climate variables using paleoclimate proxy data. The first example is a joint reconstruction of temperature and precipitation from tree rings using a mechanistic process model. The second reconstruction uses microbial species assemblage data to predict peat bog water table depth. I validate predictive skill using proper scoring rules in simulation experiments, providing justification for the empirical reconstruction. Results show forward models that leverage scientific knowledge can improve paleoclimate reconstruction skill and increase understanding of the latent natural processes.

### Benjamin Fitzpatrick

*Queensland University of Technology*

Date: Monday 17 October 2016

**NOTE day and time of this seminar**

When making inferences concerning the environment, ground truthed data will frequently be available as point referenced (geostatistical) observations accompanied by a rich ensemble of potentially relevant remotely sensed and in-situ observations.

Modern soil mapping is one such example characterised by the need to interpolate geostatistical observations from soil cores and the availability of data on large numbers of environmental characteristics for consideration as covariates to aid this interpolation.

In this talk I will outline my application of Least Absolute Shrinkage and Selection Operator (LASSO) regularized multiple linear regression (MLR) to build models for predicting full cover maps of soil carbon when the number of potential covariates greatly exceeds the number of observations available (the p > n or ultrahigh dimensional scenario). I will outline how I have applied LASSO regularized MLR models to data from multiple (geographic) sites and discuss investigations into treatments of site membership in models and the geographic transferability of models developed. I will also present novel visualisations of the results of ultrahigh dimensional variable selection and briefly outline some related work in ground cover classification from remotely sensed imagery.
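To make the p > n setting concrete, here is a minimal sketch of LASSO regularized linear regression fitted by cyclic coordinate descent with soft thresholding. It is a generic textbook scheme, not the speaker's code: the objective is assumed to be (1/(2n)) times the residual sum of squares plus `lam` times the sum of absolute coefficients, the intercept is omitted (data are assumed centred), and the toy design matrix is invented for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j's own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Invented toy data with more covariates (3) than observations (2).
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [3.0, 0.5]
print(lasso_coordinate_descent(X, y, lam=0.5))
```

Coefficients whose covariance with the residual stays inside the threshold are set exactly to zero, which is what lets the penalty act as a variable selector when covariates outnumber observations.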

Key references:

Fitzpatrick, B. R., Lamb, D. W., & Mengersen, K. (2016). Ultrahigh Dimensional Variable Selection for Interpolation of Point Referenced Spatial Data: A Digital Soil Mapping Case Study. *PLoS ONE*, 11(9): e0162489.

Fitzpatrick, B. R., Lamb, D. W., & Mengersen, K. (2016). Assessing Site Effects and Geographic Transferability when Interpolating Point Referenced Spatial Data: A Digital Soil Mapping Case Study. https://arxiv.org/abs/1608.00086

### Paul van Dam-Bates

*Department of Conservation*

Date: Thursday 13 October 2016

Authors: Paul van Dam-Bates[1], Ollie Gansell[1] and Blair Robertson[2]

1 Department of Conservation, New Zealand

2 University of Canterbury, Department of Mathematics and Statistics

### Murray Efford

*Department of Mathematics and Statistics*

Date: Thursday 6 October 2016

### Jimmy Zeng

*Department of Preventive and Social Medicine*

Date: Thursday 29 September 2016

### Richard Barker

*Department of Mathematics and Statistics*

Date: Thursday 22 September 2016

*N* and detectability *p*. They are popular because they allow inference about *N* while controlling for factors that influence *p* without the need for marking animals. Using a capture-recapture perspective we show that the loss of information that results from not marking animals is critical, making reliable statistical modeling of *N* and *p* problematic using just count data. We are unable to fit a model in which the detection probabilities are distinct among repeat visits as this model is overspecified. This makes uncontrolled variation in *p* problematic. By counterexample we show that even if *p* is constant after adjusting for covariate effects (the 'constant *p*' assumption), scientifically plausible alternative models in which *N* (or its expectation) is non-identifiable or does not even exist lead to data that are practically indistinguishable from data generated under an N-mixture model. This is particularly the case for sparse data as is commonly seen in applications. We conclude that under the constant *p* assumption reliable inference is only possible for relative abundance in the absence of questionable and/or untestable assumptions or with better quality data than seen in typical applications. Relative abundance models for counts can be readily fitted using Poisson regression in standard software such as R and are sufficiently flexible to allow controlling for *p* through the use of covariates while simultaneously modeling variation in relative abundance. If users require estimates of absolute abundance they should collect auxiliary data that help with estimation of *p*.

### Mohammad Ali Nilforooshan

*Department of Mathematics and Statistics*

Date: Thursday 15 September 2016

### Katrina Sharples

*Department of Mathematics and Statistics*

Date: Thursday 8 September 2016

### Sander Greenland

*University of California*

Date: Monday 5 September 2016

**Note day, time and venue of this special seminar**

Sander Greenland is Research Professor and Emeritus Professor of Epidemiology and Statistics at the University of California, Los Angeles. He is a leading contributor to epidemiological statistics, theory, and methods, with a focus on the limitations and misuse of statistical methods in observational studies. He has authored or co-authored over 400 articles and book chapters in epidemiology, statistics, and medical publications, and co-authored the textbook Modern Epidemiology.

Professor Greenland has played an important role in the recent discussion following the American Statistical Association’s statement on the use of p values.[1-3] He will discuss lessons he took away from the process and how they apply to properly interpreting what is ubiquitous but rarely interpreted correctly by researchers: Statistical tests, P-values, power, and confidence intervals.

1. Ronald L. Wasserstein & Nicole A. Lazar (2016): The ASA's statement on p-values: context, process, and purpose, *The American Statistician*, 70, 129-133, DOI: 10.1080/00031305.2016.1154108

2. Greenland, S., Senn, S.J., Rothman, K.J., Carlin, J.C., Poole, C., Goodman, S.N., and Altman, D.G. (2016). Statistical tests, confidence intervals, and power: A guide to misinterpretations. *The American Statistician*, 70, online supplement 1 at http://www.tandfonline.com/doi/suppl/10.1080/00031305.2016.1154108; reprinted in the European Journal of Epidemiology, 31, 337-350.

3. Greenland, S. (2016). The ASA guidelines and null bias in current teaching and practice. *The American Statistician*, 70, online supplement 10 at http://www.tandfonline.com/doi/suppl/10.1080/00031305.2016.1154108