Applied Statistical Modelling & Health Informatics MSc, PGCert, PGDip
This course has been created to deliver a skill set and knowledge base in “multimodal” and “big data” analysis techniques, which are recognised as scarce within UK life sciences.
You will receive world-class training in core statistical, machine learning and computational methodology, and you will have the opportunity to apply your skills to real-life settings facilitated by the world-leading Institute of Psychiatry, Psychology & Neuroscience.
Our PhD programme is designed to train and guide students for a research-intensive career in academia or industry. Our aim is to train scholars who can do original, innovative research in theory, methods, or applications. Students will benefit from world-class research and facilities plus internationally recognised supervisors.
Biostatistics and Health Informatics Executive Education Programme 2019/20
Introduction to Statistical Programming
The course will cover the theoretical basis of programming and apply this knowledge to a diverse range of practical problems. It will relate general concepts in programming to language-specific instances in R and Stata. On completion of the course, participants will:
- Develop a theoretical understanding of the basics of programming.
- Understand the use and usefulness of conditionals, iterations, and functions.
- Identify appropriate methods for solving particular data manipulation problems.
- Formulate sensible programming solutions for a data set, and demonstrate an ability to check the correctness of one’s manipulations.
- Become an independent user of R and Stata, able to find help online and to select appropriate packages for one’s problems.
Practical Statistics for Public Health Research
The course will cover the theoretical basis of generalised linear and generalised linear mixed models and apply it to a diverse range of practical problems. The course will relate modern statistical models and methods to real-life situations and use relevant computer software (STATA) for statistical analysis. On completion of the course, participants will:
- Develop an understanding of the underlying assumptions and principles of statistical modelling
- Understand the foundation theory of Generalised Linear Models and Generalised Linear Mixed Models
- Identify appropriate methods of estimation using Bayesian and frequentist approaches
- Understand different missing data mechanisms and their assumptions
- Formulate sensible models for a set of data, conduct statistical inference and interpret the results of any analysis.
- Critique, and adapt, statistical models to cope with atypical error structures and non-independence
- With limited guidance, deploy established techniques of analysis and enquiry in scientific endeavour
Health Informatics - a Data Science Focused Short Course
This course will provide a comprehensive introduction to the fundamentals of modern health-focused informatics research. The course will delve into problems intrinsic to the domain, as well as general questions of how informatics techniques can help alleviate them, enabling the re-use of data to improve workflow and care. Students will come to understand the major issues related to applying informatics techniques to the medical domain, as well as the challenges faced by researchers working on medical records today. Through the analysis of case studies, students will also gain hands-on experience of the advantages and challenges of applying health informatics techniques in various clinical and medical settings. On completion of the course, participants will:
- Understand the sources of data in healthcare and medicine and their individual and shared properties and characteristics.
- Understand the need for informatics techniques to mine, structure, aggregate and analyse health and medical data.
- Recognise available longitudinal medical datasets and become familiar with a number of them.
- Understand the concepts of data aggregation, cleaning and structuring, and the different formats for storing healthcare and medical data.
- Appreciate the need for standardisation of medical data, and have good knowledge of existing medical nomenclature standards, e.g. ICD and SNOMED CT.
- Understand the role played by knowledge representation and management, as well as machine learning techniques, in health informatics research.
Introduction to R
This course aims to provide a basic introduction to the R software. This will cover:
(i) obtaining and installing R,
(ii) dealing with various data objects such as vectors, matrices and data frames,
(iii) importing/exporting data from/to ASCII text, spreadsheets (Excel) and other statistical software (e.g., Stata, SPSS),
(iv) some statistical analysis in R such as summary statistics, t-test, chi-squared test, linear regression and logistic regression, and
(v) creating and saving graphics in R using ggplot2.
Introduction to STATA
The course aims to demonstrate how to carry out standard statistical procedures in STATA using its command language, and is intended to provide a foundation in STATA use. Please note that it will not teach introductory statistical concepts. Please also note that most courses on specialist statistical methodologies for behavioural research will demonstrate methods in STATA and will assume STATA proficiency at the level of this course.
Multilevel and Longitudinal Modelling
As health care data becomes more complex and multimodal, the structure of the data becomes increasingly complicated. There could be an increasing number of repeated measurements on the same individuals over time, longer duration of follow-up for time-to-event outcomes, or nested hierarchies leading to non-independence of individuals. The use of simpler statistical approaches to analyse these data is invalid because the key assumptions of those approaches do not hold. In this module, we introduce the concept of multilevel and longitudinal modelling, including time-to-event or survival analysis. The aim is for the student to understand the challenges of longitudinal and clustered data, and the concept and implementation of multilevel models. Students will also become familiar with the most common models for time-to-event data (including Cox proportional hazards models and additive hazards models), and finally link all these concepts together through joint modelling of survival and longitudinal data. Students will discover Stata commands that can fit all of these models and become familiar with the resulting Stata output, whilst applying them to real data structures. On completion of this course the student should be able:
- To properly interpret a multilevel model, and its parameter estimates
- To specify multilevel models, appropriate to an example study data setting
- To implement multilevel models in Stata and be able to interpret the output
- To achieve the above for continuous and binary responses
- To properly interpret a time-to-event model, and its parameter estimates
- To specify time-to-event models appropriate to an example study data setting
- To implement time-to-event models in Stata and be able to interpret the output
- To properly interpret a joint model and its parameter estimates
- To implement joint models in Stata and be able to interpret the output
- To read, understand, critique and discuss an application in the research literature
- To make use of face-to-face learning to apply multilevel modelling or survival analysis to a typical example dataset to address a particular scientific problem
Prediction Modelling
This course provides a comprehensive introduction to the fundamentals of clinical prediction modelling using modern statistical modelling techniques for health research. It will cover all steps of developing and assessing a prediction model. Computer-based teaching introduces students to the theory and practical implementation of cutting-edge predictive statistical and machine learning modelling techniques using the R statistical software. At the end of the course the students will:
- Have a good understanding of core clinical prediction concepts, such as prognosis, prognostic factors, prognostic models, and stratified medicine and will be able to apply this understanding to the design, conduct, and interpretation of clinical prediction modelling research studies
- Be able to describe how modern statistical concepts, regression and machine learning methods can be applied in medical prediction problems.
- Be familiar with the principles that play a role in internal validation such as over-fitting, optimism and shrinkage and understand key components of internal validation methods such as cross-validation or bootstrapping.
- Be able to develop simple prediction models, assess their quality and validate them using R software.
- Be able to critically assess the general applicability of a developed model to predict future outcomes.
- Be equipped with a range of statistical and machine learning skills, including problem-solving, project work and presentation, which will enable students to take prominent roles in a wide spectrum of employment and research.
Contemporary Psychometrics
The course provides a comprehensive introduction to the fundamental ideas of psychometric theory and implementation. Starting from scale construction and gradually moving to the most recent statistical methods employed in measurement, the course provides a complete methodological framework for applied researchers. For the measurement of the latent variable(s), the course presents exploratory and confirmatory factor analysis (EFA and CFA) for numerical data. For categorical data, it presents both the item response theory approach (IRT; 2-parameter logistic, graded response and partial credit models) and the item factor analysis model (IFA; with regard to both EFA and CFA). The course also presents the methods to explore measurement differences (measurement invariance) between groups (MG CFA), measurement differences due to covariates (MIMIC models), and measurement differences between raters or time points (longitudinal IFA), for all types of data. On completion of this module the student should be able to:
- develop a psychometric tool for a characteristic of interest (latent variable)
- apply classical test theory methods in psychometrics to test the reliability and the validity of a scale
- apply item response theory methods in psychometrics
- apply exploratory and confirmatory factor analysis for binary, categorical, and numerical data
- understand the concept of measurement invariance and be able to apply the corresponding methods across groups
- choose the appropriate model for each type of data
- understand the links between different schools of thought in psychometrics, their similarities and their differences
- critically evaluate the quality of published psychometric assessments
- understand and explain the concept of latent constructs and their measurement
Clinical Trials: Conduct and Analysis
The course provides a comprehensive introduction to trial design features used to mitigate bias, important aspects of trial design, conduct, analysis and reporting, and challenges and solutions for conducting RCTs with some focus on behavioural interventions. This will include some coverage of methods for elucidation of treatment mechanisms (e.g. mediation). Throughout the course the emphasis will be on practical issues faced by researchers in the conduct and analysis of RCTs through the lens of the mental health setting, and participants will be provided with skills to design, conduct and analyse rigorous RCTs in this research area. On completion of this course the student should be able to:
- Understand the key sources of bias within study designs
- Understand the fundamental features of a clinical trial within a mental health framework and potential challenges
- Implement suitable clinical trial designs for different settings
- Perform robust analyses of clinical trial data and disseminate the results