
03 June 2024

New software to help predict individuals' genetic risk of health conditions

The new GenoPred pipeline enables researchers to easily calculate an individual’s likelihood of different health-related outcomes, such as diseases and disorders or response to treatment, based on their genetics.


Most health-related outcomes are underpinned by thousands of genetic variants. Inherited changes in each gene affect our risk of different diseases. This information can be used to calculate polygenic scores, which represent an individual’s genetic risk or propensity for a certain health outcome: for example, how likely an individual is to develop depression based on their genetics.

The new GenoPred pipeline uses advanced machine learning on genetic data to robustly calculate polygenic scores. This allows researchers to generate polygenic scores that can predict disease risk, age of onset and treatment response, as well as understand the genetics underlying different health-related outcomes, and control for genetic effects when investigating the role of the environment.

GenoPred takes individual genetic data (for example from the UK Biobank or cohort studies such as the Twins Early Development Study) and summary statistics from large genome-wide association studies (GWAS), which can be found in online catalogues, and uses current best practices to calculate polygenic scores. Pipelines like GenoPred are easy to implement and ensure accurate and reproducible analysis of genetic data. GenoPred supports global capacity building by enabling research groups without substantial statistical and computational expertise to engage with genetic research.
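To illustrate the basic idea behind a polygenic score, the sketch below computes the simplest form: a weighted sum of each individual's risk-allele counts, with weights taken from GWAS effect sizes. This is not GenoPred's implementation, which applies more advanced methods. The function name, variant weights and genotype values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (hypothetical helper, not GenoPred's API).

    dosages:      array of shape (n_individuals, n_variants), each entry the
                  number of copies (0, 1 or 2) of the effect allele.
    effect_sizes: array of shape (n_variants,), per-variant GWAS weights.
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.shape[-1] != effect_sizes.shape[0]:
        raise ValueError("dosages and effect_sizes must cover the same variants")
    return dosages @ effect_sizes


# Made-up GWAS effect sizes for three variants (illustrative only).
effect_sizes = np.array([0.10, -0.05, 0.20])

# Made-up genotypes: three individuals by three variants.
dosages = np.array([
    [0, 1, 2],
    [2, 2, 0],
    [1, 0, 1],
])

scores = polygenic_score(dosages, effect_sizes)

# Scores are usually interpreted relative to a reference population,
# so here they are standardised within this toy sample.
z_scores = (scores - scores.mean()) / scores.std()
```

In practice the weights are adjusted before scoring, for example to account for correlation between nearby variants, and scores are standardised against a reference population of matching ancestry rather than within the sample itself.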

“GenoPred fills a critical gap in the landscape of genetic research tools by providing a standardised, efficient, and accessible pipeline implementing advanced polygenic scoring methodology.”

Dr Oliver Pain, Postdoctoral Research Fellow at the NIHR Maudsley BRC and Social, Genetic & Developmental Psychiatry Centre

As well as polygenic scores, GenoPred can infer an individual’s ancestry, based on which populations their genetics match most closely, and estimate the degree of relatedness between individuals. This is important for improving research quality, as it allows researchers to ensure they are comparing individuals from the same population and to avoid comparing individuals who are closely related.

Polygenic scores are already used extensively in research to understand the contribution of genetics to health outcomes and to investigate the interplay of genetic and environmental factors. They also show great promise as a tool for personalised medicine based on an individual’s risk for different health outcomes. However, more research is needed before they can be used in clinical settings.

The GenoPred pipeline software has already been implemented in research teams internationally, including at the University of Oslo and the Medical Research Council/Uganda Virus Research Institute and London School of Hygiene & Tropical Medicine Uganda Research Unit. It was used in a recently published peer-reviewed paper at the IoPPN which found that DNA sequences which originate from ancient viral infections contribute to susceptibility for psychiatric disorders like schizophrenia, bipolar disorder and depression.

The GenoPred pipeline is part of a wider research programme at the IoPPN, funded by Wellcome and the National Institute for Health and Care Research (NIHR) Maudsley Biomedical Research Centre (BRC), that aims to address the key issues of using polygenic scores in clinical settings and develop more tools that implement current best practices.

For more information, please contact Milly Remmington (School of Mental Health & Psychological Sciences Communications Manager).

In this story

Oliver Pain

Sir Henry Wellcome Postdoctoral Research Fellow at the Maurice Wohl Clinical Neuroscience Institute

Cathryn Lewis

Professor of Genetic Epidemiology & Statistics