Applied Bayesian modelling for ecologists and epidemiologists (ABME)

Delivered by Dr. Matt Denwood and Prof. Jason Matthiopoulos

http://prstatistics.com/course/applied-bayesian-modelling-for-ecologists-and-epidemiologists-abme/

This 6-day course will run from 24th to 29th October 2016 at SCENE field station, Loch Lomond National Park, Scotland.

This application-driven course will provide a grounding in the basic theory & practice of Bayesian statistics, with a focus on MCMC modelling for ecological & epidemiological problems. Starting from a refresher on probability & likelihood, the course will take students all the way to cutting-edge applications such as state-space population modelling & spatial point-process modelling. By the end of the week, you should have a basic understanding of how common MCMC samplers work and how to program them, and have practical experience with the BUGS language for common ecological and epidemiological models. The experience gained will be a sufficient foundation to understand current papers using Bayesian methods, carry out simple Bayesian analyses on your own data and springboard into more elaborate applications such as dynamical, spatial and hierarchical modelling.

Course content is as follows:

Day 1
• Revision of likelihoods using full likelihood profiles and an introduction to the theory of Bayesian statistics.
  o Probability and likelihood
  o Conditional, joint and total probability, independence, Bayes' law
  o Probability distributions
  o Uniform, Bernoulli, Binomial, Poisson, Gamma, Beta and Normal distributions – their range, parameters and common uses
  o Likelihood and parameter estimation by maximum likelihood
  o Numerical likelihood profiles and maximum likelihood
• Introduction to Bayesian statistics
  o Relationship between prior, likelihood & posterior distributions
  o Summarising a posterior distribution
  o The philosophical differences between frequentist & Bayesian statistics, & the practical implications of these
  o Applying Bayes' theorem to discrete & continuous data for common data types given different priors
  o Building a posterior profile for a given dataset, & comparing the effect of different priors for the same data
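As a rough illustration of the last Day 1 item (building a posterior profile and comparing priors), the sketch below evaluates prior times likelihood on a grid for a single proportion. It is not course material: the course itself works in R and JAGS, Python is used here only for illustration, and the data (7 positives out of 20), the two Beta priors and all function names are assumptions made up for the example.

```python
import math

def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) distribution at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta))

def binomial_likelihood(theta, successes, trials):
    """Probability of the observed count given the proportion theta."""
    return math.comb(trials, successes) * theta**successes * (1 - theta)**(trials - successes)

def grid_posterior(successes, trials, prior_a, prior_b, n_grid=999):
    """Posterior profile on an evenly spaced grid strictly inside (0, 1)."""
    grid = [(i + 1) / (n_grid + 1) for i in range(n_grid)]
    unnormalised = [
        beta_pdf(t, prior_a, prior_b) * binomial_likelihood(t, successes, trials)
        for t in grid
    ]
    total = sum(unnormalised)
    weights = [u / total for u in unnormalised]  # weights sum to 1 over the grid
    return grid, weights

def posterior_mean(grid, weights):
    """Mean of the gridded posterior."""
    return sum(t * w for t, w in zip(grid, weights))

# Hypothetical data: 7 positive animals out of 20 sampled.
grid, flat = grid_posterior(7, 20, prior_a=1, prior_b=1)         # flat prior
_, informative = grid_posterior(7, 20, prior_a=2, prior_b=8)     # prior favouring low values
print(posterior_mean(grid, flat), posterior_mean(grid, informative))
```

Because the Beta prior is conjugate to the binomial likelihood, the exact posteriors here are Beta(8, 14) and Beta(9, 21), with means 8/22 and 9/30, which gives a simple check on the grid approximation.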

Day 2
• An introduction to the workings of MCMC, and the potential dangers of MCMC inference. Participants will program their own (basic) MCMC sampler to illustrate the concepts and fully understand the strengths and weaknesses of the general approach. The day will end with an introduction to the BUGS language.
  o Introduction to MCMC.
  o The curse of dimensionality & the advantages of MCMC sampling to determine a posterior distribution.
  o Monte Carlo integration, standard error, & summarising samples from posterior distributions in R.
  o Writing a Metropolis algorithm & generating a posterior distribution for a simple problem using MCMC.
  o Markov chains, autocorrelation & convergence.
  o Definition of a Markov chain.
  o Autocorrelation, effective sample size and Monte Carlo error.
  o The concept of a stationary distribution and burn-in.
  o Requirement for convergence diagnostics, and common statistics for assessing convergence.
  o Adapting an existing Metropolis algorithm to use two chains, & assessing the effect of the sampling distribution on the autocorrelation.
  o Introduction to BUGS & running simple models in JAGS.
  o Introduction to the BUGS language & how a BUGS model is translated to an MCMC sampler during compilation.
  o The difference between deterministic & stochastic nodes, & the contribution of priors & the likelihood.
  o Running, extending & interpreting the output of simple JAGS models from within R using the runjags interface.
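For readers who have not met a Metropolis algorithm before, a minimal sketch of the kind of sampler described for Day 2 might look like the following. It is not the course's own code (the practicals use R): the target (a flat prior with hypothetical binomial data of 7 positives out of 20), the proposal standard deviation, the burn-in length and all function names are assumptions chosen for illustration. Two chains are run from different starting values and compared with the Gelman-Rubin statistic, one common convergence diagnostic.

```python
import math
import random

def log_posterior(theta, successes=7, trials=20):
    """Log of (flat prior x binomial likelihood), up to an additive constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)

def metropolis(n_iter, start, proposal_sd, rng):
    """Random-walk Metropolis sampler with a symmetric Normal proposal."""
    chain = []
    current, current_lp = start, log_posterior(start)
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, proposal_sd)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    pooled_var = (n - 1) / n * within + between / n
    return math.sqrt(pooled_var / within)

rng = random.Random(1)  # fixed seed so the run is reproducible
burn_in = 1000          # assumed burn-in length, discarded below
raw = [metropolis(5000, start, 0.2, rng) for start in (0.1, 0.9)]
kept = [chain[burn_in:] for chain in raw]
pooled_mean = sum(sum(c) for c in kept) / sum(len(c) for c in kept)
r_hat = gelman_rubin(kept)
print(pooled_mean, r_hat)
```

Changing `proposal_sd` is a quick way to see the effect of the sampling distribution on autocorrelation: very small or very large values both make the chains mix more slowly.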

Day 3
• This day will focus on the common models for which JAGS/BUGS would be used in practice, with examples given for different types of model code. All aspects of writing, running, assessing and interpreting these models will be extensively discussed so that participants are able and confident to run similar models on their own. There will be a particularly heavy focus on practical sessions during this day. The day will finish with a discussion of how to assess the fit of MCMC models using the deviance information criterion (DIC) and other methods.
  o Using JAGS for common problems in biology.
  o Understanding and generating code for basic generalised linear mixed models in JAGS.
  o Syntax for quadratic terms and interaction terms in JAGS.
  o Essential fitting tips and model selection.
  o The need for minimal cross-correlation and independence between parameters and how to design a model with these properties.
  o The practical methods and implications of minimizing Monte Carlo error and autocorrelation, including thinning.
  o Interpreting the DIC for nested models, and understanding the limitations of how this is calculated.
  o Other methods of model selection and where these might be more useful than DIC.
  o Most commonly used methods: rationale and use for fixed threshold, ABGD, K/theta, PTP, GMYC, with computer practicals.
  o Other methods: Haplowebs, bGMYC, etc., with computer practicals.

Day 4
• Day 4 will focus on the flexibility of MCMC, and precautions required for using MCMC to model commonly encountered datasets. An introduction to conjugate priors and the potential benefits of exploiting Gibbs sampling will be given. More complex types of models such as hierarchical models, latent class models, mixture models and state space models will be introduced and discussed. The practical sessions will follow on from day 3.
  o General guidance for model specification.
  o The flexibility of the BUGS language and MCMC methods.
  o The difference between informative and diffuse priors.
  o Conjugate priors and how they can be used.
  o Gibbs sampling.
  o State space models.
  o Hierarchical and state space models.
  o Latent class and mixture models.
  o Conceptual application to animal movement.
  o Hands-on application to population biology.
  o Conceptual application to epidemiology.

Day 5
• Day 5 will give some additional practical guidance for the use of Bayesian methods in practice, and finish with a brief overview of more advanced Bayesian tools such as INLA and Stan.
  o Additional Bayesian methods.
  o Understand the usefulness of conjugate priors for robust analysis of proportions (Binomial and Multinomial data).
  o Be aware of some methods of prior elicitation.
  o Advanced Bayesian tools.
  o Strengths and weaknesses of Integrated Nested Laplace Approximation (INLA) compared to BUGS.
  o Strengths and weaknesses of Stan compared to BUGS.
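To give a flavour of why conjugate priors are convenient for proportions, the sketch below shows the standard Dirichlet-multinomial update, in which the posterior is obtained simply by adding the observed counts to the prior parameters (the Beta-binomial case is the same update with two categories). The counts, the flat Dirichlet(1, 1, 1) prior and the function names are hypothetical and are not taken from the course material.

```python
def dirichlet_update(prior_alpha, counts):
    """Posterior Dirichlet parameters: prior parameters plus observed counts."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    return [a + c for a, c in zip(prior_alpha, counts)]

def dirichlet_mean(alpha):
    """Mean proportion for each category under a Dirichlet(alpha) distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Hypothetical data: 20 animals classified into three categories.
prior = [1, 1, 1]      # flat prior over the three proportions
counts = [12, 5, 3]    # observed counts per category
posterior = dirichlet_update(prior, counts)
means = dirichlet_mean(posterior)
print(posterior, means)
```

No MCMC is needed for this model, which is one reason conjugate analyses are a useful check on sampler-based results for simple proportion data.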

Day 6
• Round table discussions and problem solving with final Q and A.
  o The final day will consist of round table discussions; the class will be split into smaller groups to discuss set topics/problems. This will include participants' own data where possible. After an early lunch there will be a general question and answer session as a whole group until approx. 2pm, before transport to Balloch train station.

There will be a 15-minute morning coffee break, an hour for lunch, and a 15-minute afternoon coffee break. We keep the timing of these flexible depending on how the course advances. Breakfast is from 08:00-08:45 and dinner is at 18:00 each day.

Please email any inquiries to oliverhooker@prstatistics.com or visit our website www.prstatistics.com

Please feel free to distribute this material anywhere you feel is suitable.

Upcoming courses - email for details oliverhooker@prstatistics.com
1. INTRODUCTION TO PYTHON FOR BIOLOGISTS (October)
2. LANDSCAPE GENETIC DATA ANALYSIS USING R (October)
3. PHYLOGENETIC DATA ANALYSIS USING R (October/November)
4. SPATIAL ANALYSIS OF ECOLOGICAL DATA USING R (November)
5. ADVANCING IN STATISTICAL MODELLING USING R (December)
6. MODEL BASED MULTIVARIATE ANALYSIS OF ECOLOGICAL DATA USING R (January)
7. ADVANCED PYTHON FOR BIOLOGISTS (February)
8. NETWORK ANALYSIS FOR ECOLOGISTS USING R (March)
9. INTRODUCTION TO GEOMETRIC MORPHOMETRICS USING R (June)

Dates still to be confirmed - email for details oliverhooker@prstatistics.com
• STABLE ISOTOPE MIXING MODELS USING SIAR, SIBER AND MIXSIAR USING R
• INTRODUCTION TO R AND STATISTICS FOR BIOLOGISTS
• BIOINFORMATICS FOR GENETICISTS AND BIOLOGISTS
• GENETIC DATA ANALYSIS USING R
• INTRODUCTION TO BIOINFORMATICS USING LINUX
• INTRODUCTION TO BAYESIAN HIERARCHICAL MODELLING

Oliver Hooker PhD. PR statistics

3/1 128 Brunswick Street Glasgow G1 1TF

+44 (0) 7966500340 www.prstatistics.com