Timetable
Courses for 2012-13 are below. Booking opens on Monday 1st October 2012, and no bookings or requests to be added to waiting lists can be taken prior to this. For details of how to register for a course, or to be added to the waiting list, please visit this page.
Please note that the courses on Longitudinal and Clustered Data (D1), Causal Modelling (D4) and Structural Equation Modelling (D5) are advanced courses requiring COMPLETE familiarity with multiple regression. Those taking an introductory statistics course should not expect to be allowed to take one of these advanced courses in the same year.
Lecture notes for 2012-13 courses will be available in Moodle (this is also the location of last year's course notes).
There are four categories of courses:
A) Introductory Statistics Teaching
(no prior knowledge assumed)
A1. Multiple Testing
A2. Power Analysis
NB - notes from last year's 'Introduction to Statistics' course are available in Moodle, but it will not be re-run this year.
B) Training in Statistical Software
(only knowledge of basic statistical methodologies assumed)
B1. Introduction to STATA
B2. Getting Started in R
C) Specialist Statistical Methodologies for Behavioural Research
(knowledge of basic statistical methodologies and respective software package assumed)
C1. Logistic and Poisson Regression Analysis
C2. Model Selection and Penalised Regression
C3. Introduction to Structural Equation Modelling using AMOS
D) Advanced Statistical courses
(these require COMPLETE familiarity with multiple regression. Those taking an introductory statistics course are strongly advised against taking one of these advanced courses before they are comfortable with multiple regression at the level of our self-taught ‘Introduction to Statistics’ course, which is available in Moodle)
D1. Longitudinal and Clustered Data
D2. Missing Data in Mental Health Research: A Practical Approach Using Stata
D3. Gllamm course
D4. Causal Modelling
D5. Structural Equation Modelling using MPlus
A) Introductory Statistics Teaching
(no prior knowledge assumed)
A1. Multiple Testing - Daniel Stahl
Two sessions (attendees must attend both sessions to retain deposit)
Time: 9:30am – 12:30pm
Dates: 27th February 2013 and 6th March 2013
Requirements: None
Course aim and content: This course introduces the problem of multiple testing and methods for adjusting results when many statistical tests have been performed. Multiple tests, such as multiple comparisons of different means, are a common procedure in statistical analysis. However, as the number of statistical tests increases, the probability that at least one test will be significant by chance alone rises rapidly ("alpha error inflation"). A common remedy is the Bonferroni method, which multiplies the p-value of each test by the number of comparisons (equivalently, divides the alpha level by that number). However, this method has very low power, and many alternative multiple comparison procedures have been developed, such as improved Bonferroni methods (Holm, Hochberg), Fisher’s LSD, Tukey’s HSD, Dunnett’s test and False Discovery Rate procedures.
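As a rough illustration of the ideas above (not course material), the following Python sketch shows how the chance of at least one false positive grows with the number of independent tests, and how Bonferroni and Holm adjusted p-values are computed. The function names and the example p-values are hypothetical and chosen only for illustration.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment, returned in the original order of the p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values from four comparisons.
p_values = [0.01, 0.04, 0.03, 0.005]
fwer_10_tests = familywise_error_rate(0.05, 10)
bonferroni_adjusted = bonferroni(p_values)
holm_adjusted = holm(p_values)

print(fwer_10_tests)
print(bonferroni_adjusted)
print(holm_adjusted)
```

Holm adjusted p-values are never larger than the Bonferroni ones, which is why Holm is described as an improved Bonferroni method.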
A2. Power Analysis - Nick Magill
One session
Time: 9:30am – 12:30pm
Date: 28th February 2013
Requirements: None
Course aim and content: Power requirements are becoming ever more important in studies, both in terms of funding and ethical approval. This course aims to introduce participants to the concept of significance testing and power, how to develop a power analysis strategy and how to perform a power analysis. The computer package GPOWER for calculating power or sample sizes will be introduced, and practical examples used to explain how to calculate power for such purposes as comparing means (through an independent samples t-test), comparing proportions (through a chi-squared test) and testing correlations. Participants will also be able to put their knowledge into practice through a series of exercises.
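To give a flavour of what a power calculation involves, here is a minimal Python sketch for comparing two means. It uses a normal approximation to the independent samples t-test, which is an assumption made here for simplicity and not necessarily what GPOWER does; dedicated software typically reports a slightly larger sample size. The function names and the chosen effect size are illustrative only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardised mean difference (Cohen's d).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given n per group (ignores the negligible opposite tail)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n_per_group / 2) - z_alpha)


# A "medium" standardised effect of 0.5, 5% two-sided alpha, 80% target power.
n_required = sample_size_per_group(0.5)
achieved_power = approximate_power(0.5, n_required)

print(n_required, achieved_power)
```

Halving the effect size roughly quadruples the required sample size, because n depends on the inverse square of the effect size.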
B) Training in Statistical Software
(only knowledge of basic statistical methodologies assumed)
B1. Introduction to STATA - Andrew Pickles, Mizan Khondoker, and Sabine Landau
Seven sessions (attendees must attend at least five sessions to retain deposit)
Time: 1pm – 4pm
Dates (all 2012): October – 15th, 22nd, 29th; November – 5th, 12th, 19th, 26th
Requirements: None
Course aim and content: The course aims to demonstrate how to carry out standard statistical procedures in STATA using its command language. The course is intended to provide a foundation in STATA use. Please note that it will not teach introductory statistical concepts; those requiring such teaching should refer to the self-taught ‘Introduction to Statistics’ course in Moodle.
Please also note that most courses on specialist statistical methodologies for behavioural research will demonstrate methods in STATA and will assume STATA proficiency at the level of this course.
B2. Getting Started in R - Mizan Khondoker
One session
Time: 10am – 12pm
Date: 6th February 2013
Requirement: No prior knowledge of R is assumed, but familiarity with basic statistical concepts and some basic experience in any programming language would be useful.
Course aim and content: The course aims to provide a basic introduction to the R software.
Course outline:
Introduction and preliminaries:
• Obtaining and installing R
• Running R and the R window system
Simple manipulation in R:
• Scalar, vector, arrays (matrix), list, data frame
• R objects, their mode and attributes
Importing/Exporting data:
• Reading in data from ASCII/csv files
• Importing data from other software (e.g., Stata, SPSS)
• Exporting and saving data
Simple summary and statistical analysis in R
• Frequency table and other summary statistics
• t-test, chi-squared test and their non-parametric equivalents
• Fitting regression type models in R (lm, glm, etc.)
R Graphics:
• Creating and storing graphics in R.
C) Specialist Statistical Methodologies for Behavioural Research
(knowledge of basic statistical methodologies and respective software package assumed)
C1. Logistic and Poisson Regression Analysis - Victoria Harris and John Hodsoll
Three sessions (attendees must attend at least two sessions to retain deposit)
Time: 2pm – 5pm
Dates: 7th, 14th and 21st February 2013
Requirement: Familiarity with basic statistical concepts and STATA experience is essential.
Course aim and content: The course aims to demonstrate how to carry out the statistical analysis of binary and count data in STATA using logistic and Poisson regression. Binary data (e.g. yes/no outcomes) or count data (e.g. number of relapses) arise frequently in behavioural research, especially in epidemiological studies.
C2. Model Selection and Penalised Regression - Daniel Stahl and Mizan Khondoker
Three sessions (attendees must attend at least two sessions to retain deposit)
Time: 9:30am – 12:30pm
Dates: 13th, 20th and 27th March 2013
Requirement: Familiarity with regression analyses and some basic knowledge of Stata are essential.
Course aim and content: The aim of many behavioural or medical studies, especially observational studies, is to explain an outcome of interest by a number of independent variables using multiple regression analyses. The goal of such an analysis is to find a model with a parsimonious set of predictor variables that best explains the variation of the dependent variable. However, some common problems in the analysis of such studies with many explanatory variables are (i) finding a "best" model for prediction, (ii) feature/variable selection (finding the optimal set of predictors), and (iii) dealing with collinearity/redundancy, overfitting and the curse of dimensionality (the number of variables is large in comparison to the sample size).
The standard approach in behavioural studies is to rely on hypothesis testing to include significant variables, often using automated backward, forward or stepwise model selection procedures. However, different automatic selection procedures may assess variables in different orders. Consequently different model selection procedures may lead to different answers, especially with a larger number of independent variables. Furthermore, one will get only one model and similarly good ones or even better models will be ignored.
An alternative approach is to use information criteria and likelihood statistics. Instead of testing the significance of a parameter, the likelihood of the data under a hypothesized model is calculated using maximum likelihood methods. This allows the goodness of fit of different models to be compared quantitatively. This course will cover an introduction to model selection using Akaike’s information criterion (AIC).
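As a small illustration of comparing models by AIC, the Python sketch below fits an intercept-only model and a straight-line model to the same data and computes AIC = 2k − 2 log L for each, assuming normally distributed errors. The data, function names and the Gaussian assumption are all chosen here for illustration; the course itself works in STATA.

```python
import math


def gaussian_aic(residuals, n_mean_params):
    """AIC of a linear model with normal errors, using the ML variance estimate RSS/n."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    k = n_mean_params + 1  # regression coefficients plus the error variance
    return 2 * k - 2 * log_lik


# Hypothetical data with a clear linear trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Model 1: intercept only (predict the mean of y for every observation).
residuals_null = [yi - mean_y for yi in y]

# Model 2: simple linear regression fitted by least squares.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
residuals_linear = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

aic_null = gaussian_aic(residuals_null, n_mean_params=1)
aic_linear = gaussian_aic(residuals_linear, n_mean_params=2)

print(aic_null, aic_linear)
```

The model with the smaller AIC is preferred; the penalty term 2k means an extra parameter is only worthwhile if it improves the log-likelihood by more than one unit.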
The course will also provide an introduction to penalised regression methods for feature/variable selection and for dealing with the redundancy (high correlation), high variability and overfitting commonly seen in models with high dimensional data. The least squares (LS) or maximum likelihood (ML) estimates from a standard regression model are unbiased, but can be highly variable (unstable). This is especially the case when the sample size (n) and the number of variables (p) are of similar size. Furthermore, when p>n, no unique LS or ML estimate exists. A practical alternative in these circumstances is to use penalised regression methods, which deal with the high variability, identifiability and overfitting problems by introducing a little bias in the estimated regression parameters. The course will mainly discuss two basic penalised methods, namely Ridge regression and the LASSO (least absolute shrinkage and selection operator). Computer practical sessions using STATA will help participants apply penalised regression methods to real data examples.
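The shrinkage idea can be seen in the simplest possible case: one centred predictor and no intercept. The Python sketch below is an illustrative special case, not the STATA implementation used in the practicals; the data, the penalty values and the exact scaling of the penalty terms are assumptions made here. Ridge minimises Σ(y − bx)² + λb², and the LASSO minimises ½Σ(y − bx)² + λ|b|, both of which have closed-form solutions with a single predictor.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning exactly 0 inside the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ols_slope(x, y):
    """Least squares slope for a centred predictor with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def ridge_slope(x, y, lam):
    """Minimiser of sum((y - b*x)^2) + lam * b^2."""
    s_xy = sum(a * b for a, b in zip(x, y))
    s_xx = sum(a * a for a in x)
    return s_xy / (s_xx + lam)


def lasso_slope(x, y, lam):
    """Minimiser of 0.5 * sum((y - b*x)^2) + lam * |b|."""
    s_xy = sum(a * b for a, b in zip(x, y))
    s_xx = sum(a * a for a in x)
    return soft_threshold(s_xy, lam) / s_xx


# Hypothetical centred data.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-3.9, -2.1, 0.1, 1.9, 4.0]

b_ols = ols_slope(x, y)
b_ridge = ridge_slope(x, y, lam=10.0)
b_lasso_small = lasso_slope(x, y, lam=5.0)
b_lasso_large = lasso_slope(x, y, lam=25.0)

print(b_ols, b_ridge, b_lasso_small, b_lasso_large)
```

Ridge shrinks the coefficient towards zero but never sets it exactly to zero, whereas a large enough LASSO penalty removes the variable altogether, which is why the LASSO also performs variable selection.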
C3. Introduction to Structural Equation Modelling using AMOS - Daniel Stahl
Three full-day sessions (attendees must attend at least two days to retain deposit)
Time: 9am – 5pm
Dates: 9th, 10th and 11th April 2013
Requirement: Familiarity with multiple linear regression analysis would help participants.
Course aim and content: This course is a brief introduction and overview of path analysis and structural equation modelling using the AMOS software. The course features an introduction to the logic of SEM, the assumptions and required input for SEM analysis, and how to perform SEM analyses using AMOS. The course includes topics such as mediation analysis and confirmatory factor analysis.
D) Advanced Statistical courses
(these require COMPLETE familiarity with multiple regression. Those taking an introductory statistics course are strongly advised against taking one of these advanced courses before they are comfortable with multiple regression at the level of our self-taught ‘Introduction to Statistics’ course, which is available in Moodle)
D1. Longitudinal and Clustered Data - Sabine Landau and Mizan Khondoker
Five sessions (attendees must attend at least four sessions to retain deposit)
Time: 9:30am – 12:30pm
Dates (all 2013): March – 14th, 21st, 28th; April – 4th and 11th
Requirement: Familiarity with basic statistical concepts and STATA experience at the level provided by ‘Introduction to STATA’ is essential.
Course aim and content: The course aims to demonstrate how to carry out statistical analyses of clustered data in STATA. Here the term clustered data refers to sets of correlated observations. Clustered data arise frequently in behavioural research, typically due to an individual being measured repeatedly (e.g. in longitudinal studies, repeated measures designs) or subjects falling into natural clusters of correlated observations (e.g. families, twins). Such data require special methods since conventional methods such as regression or logistic regression are based on the assumption that individual observations are statistically independent and will produce invalid inferences in the presence of correlated observations. (Note that the relevant methodologies have nothing to do with the multivariate statistical method cluster analysis which aims to identify groups of observations.)
D2. Missing Data in Mental Health Research: A Practical Approach Using Stata - Sabine Landau, and guest lecturer Ian White
One full-day session
Time: 9am – 5pm
Date: 16th May 2013
Requirement: The target audience of this course is mental health researchers needing to analyse incomplete data. Participants should be familiar with running Stata from the command line (i.e. not using menus) at least to the level of fitting a multiple regression model to complete data. No prior knowledge of missing data methods is assumed. Participants will need to bring their own laptop computer with Stata version 10 or higher (not version 9).
Course aim and content: This course aims to explain the problems arising from missing data in mental health studies. It will introduce some statistical analysis methods that can be used to deal with missing data in (i) the outcome variables (with a focus on randomised trials) and (ii) in the covariates (with a focus on observational studies). The course will demonstrate the steps involved in carrying out relevant analyses in Stata. Throughout the course emphasis will be on providing participants with an awareness of the assumptions underlying various analysis approaches and their limitations.
D3. Gllamm course - Andrew Pickles
Three full-day sessions (attendees must attend at least two days to retain deposit)
Time: 9am – 5pm
Dates: 28th, 29th and 30th May 2013
Requirement: This course is demanding, requiring familiarity with Stata, statistical modelling and a good conceptual understanding of research. You should be entirely comfortable with the use of ordinary and logistic regression, already know the basics of Stata, and preferably have undertaken the analysis of a significant research project.
Course aim and content: The gllamm procedure within Stata allows the fitting of a vast array of models from the Generalized Linear Latent and Mixed Model framework, including longitudinal random effects, latent class and trajectory, instrumental variable, and multilevel factor models for normal, binary, ordinal and censored data. This course introduces the framework and illustrates its use through an extended series of examples.
Background Text:
- Skrondal, A. & Rabe-Hesketh, S. (2004) Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton, FL: Chapman and Hall/CRC.
- Rabe-Hesketh, S., Skrondal, A. & Pickles, A. (2004) Generalized multilevel structural equation modeling. Psychometrika 69, 167-190.
D4. Causal Modelling - Sabine Landau and Richard Emsley
Three full-day sessions (attendees must attend at least two days to retain deposit)
Time: 9am – 5pm
Dates: 11th, 12th and 13th June 2013
Requirement: Familiarity with basic statistical concepts and STATA experience at the level provided by ‘Introduction to STATA’ is essential.
Background: The aim of many studies is to infer causal effects of explanatory variables on outcomes. Clinical trials are aimed at evaluating the causal effects of treatments, while observational studies tend to target the effects of risk factors. In observational studies, allocation of subjects to levels of potential risk factors is by self-selection. Common causes of both the outcome and the risk factor (so-called “confounders”) can then lead to biased estimation of causal effects, either when such third variables have not been measured or when, despite their being recorded, standard statistical analysis methods fail to take proper account of them. In clinical trials, randomization ensures that effects of (random) treatment offers on clinical outcomes cannot be confounded by measured or unmeasured baseline variables. However, it is common, especially in mental health trials, that participants do not comply (e.g. do not turn up for their session) or only partly comply (e.g. attend only a portion of their treatment sessions) with their allocated treatment. In this case, treatment receipt is subject to self-selection and the effect of the treatment received on the clinical outcome (efficacy) can be confounded.
Course aim and content: We aim to introduce participants to the problem of inferring the effects of treatment or exposure in the presence of self-selection and/or confounding. To this end we will review some of the concepts and vocabulary of the causal inference literature and provide a non-technical introduction to accessible methods for estimating causal effects. Implementations of these methods in STATA will be illustrated.
D5. Structural Equation Modelling using MPlus - Andrew Pickles and Sabine Landau
Three full-day sessions (attendees must attend at least two days to retain deposit)
Time: 9am – 5pm
Dates: 18th, 19th and 20th June 2013
Requirement: Formal course requirements are modest; however, participants should be entirely familiar with the use of regression models and have significant prior research data analysis experience.
Course aim and content: This course introduces the concepts and methods of structural equation modelling using the software MPlus. Although it starts from first principles and simple examples, the course develops rapidly to cover item response models, growth curve and latent trajectory models, and instrumental variable models for more rigorous causal analysis.