Polygenic scores released for four national cohort studies

Data release
24 September 2025

The research community can now access a range of polygenic scores from more than 30,000 people taking part in four of the UK’s national cohort studies.

This data release makes cross-cohort analyses using derived genetic measures possible for the first time. 

Researchers from the UCL Centre for Longitudinal Studies (CLS) have generated a range of polygenic scores – summary measures that combine the estimated effects of many different genes on a specific trait or characteristic. The rich new resource covers 44 traits across a wide range of domains, including physical and mental health, cognition, health behaviours and social outcomes.  
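
As a rough illustration of what "combining the estimated effects" means, a polygenic score is commonly computed as a weighted sum: for each genetic variant, the number of effect alleles a person carries (0, 1 or 2) is multiplied by the effect size estimated in a GWAS, and the products are added together. The Python sketch below assumes this simple additive form. The variant names, weights and genotypes are invented for illustration, and the function is not the CLS pipeline, which is published separately on GitHub.

```python
def polygenic_score(allele_counts, gwas_weights):
    """Weighted sum of effect-allele counts over variants present in both inputs."""
    shared = allele_counts.keys() & gwas_weights.keys()
    return sum(allele_counts[v] * gwas_weights[v] for v in shared)


# Hypothetical GWAS effect sizes per variant (invented values).
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}

# Hypothetical effect-allele counts (0, 1 or 2) for one person.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(person, weights)
print(score)
```

Variants that appear in only one of the two inputs are skipped here, which is one simple way to handle incomplete overlap between a cohort's genotype data and a published GWAS; real pipelines typically apply further quality control and adjustment steps.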

These scores, when used in combination with rich information about the study members’ lives – their social background, education, income, mental health, relationships and more – offer the scientific community opportunities to build a more nuanced understanding of how cohort members’ outcomes may be shaped. For example, the scores could be used to investigate the changing impact of genetic variation on mental wellbeing or depression over time, and how gene-environment interactions differ across life.

Dr Tim Morris (UCL Centre for Longitudinal Studies), who led the development of these scores, said:

“Unlike raw genome-wide data that require specialist expertise to use, these polygenic scores make it far easier for researchers to incorporate genetic data in their analyses.”

Dr Tim Morris (UCL Centre for Longitudinal Studies)

What information was collected? 

Genotyping was carried out on biological samples, such as blood and saliva, collected from participants in four studies, each of which follows a different generation. Polygenic scores were derived from these genetic data together with information from publicly available large-scale genome-wide association studies (GWAS). Among study members, polygenic scores are available for:

  • 6,396 participants of the 1958 National Child Development Study (NCDS)  
  • 5,361 participants of the 1970 British Cohort Study (BCS70)  
  • 1,272 Next Steps participants (born in 1989-90) 
  • 7,641 Millennium Cohort Study (MCS) participants (born in 2000-02) 
  • 7,781 MCS co-resident mothers and 4,634 co-resident fathers  

The polygenic scores have been developed using a consistent methodology, making cross-cohort analysis possible. This approach allows for a much wider use of the genetic data collected in the studies. The code used to generate the scores is publicly available on GitHub.

Dr Tim Morris said: “By sharing the methods openly, we hope to support high standards of reproducibility and transparency, and ensure that our work benefits the wider research community.”

What do they cover?

The polygenic scores will aid researchers interested in genetically informed research across a range of disciplines, including health, social science and psychology. The 44 traits for which polygenic scores have been developed are:

  • Anthropometrics: birth weight, body fat distribution, body mass index (childhood and adulthood), grip strength, height, waist circumference. 
  • Brain structure and cognition: Alzheimer’s disease, cognition, hippocampal volume, Parkinson’s disease. 
  • Health behaviours: substance abuse, age at initiation of smoking, alcoholic drinks per week, cigarettes per day, diet. 
  • Mental health: anxiety, ADHD, autism spectrum disorder, bipolar disorder, depressive symptoms, externalising problems, major depressive disorder, schizophrenia. 
  • Personality: agreeableness, conscientiousness, extraversion, openness to experience, neuroticism. 
  • Physical health: age at menopause, asthma, blood pressure, coronary artery disease, C-reactive protein, fasting glucose, HbA1c, hypertension, rheumatoid arthritis, type 1 diabetes, type 2 diabetes. 
  • Social outcomes: education, household income, human longevity, parental lifespan. 

“For researchers, this is a rare and powerful resource – accessible genetic data brought together with decades of detailed life-course information across multiple comparable generations. It creates unprecedented opportunities to better understand how genes and environments together influence health and wellbeing.”

Professor David Bann (UCL Centre for Longitudinal Studies)

How to access the data? 

The datasets have been pseudonymised and are available from the UK Data Service as special safeguarded data, subject to the UK Data Service Special Licence. They are available without charge once a data access request has been approved. More information is available on the data access training page.

To access these data, visit the UK Data Service website:  

NCDS PGIs under Special Licence access

BCS70 PGIs under Special Licence access

Next Steps PGIs under Special Licence access

MCS PGIs under Special Licence access

Find out more about the data

To find out more about the polygenic indices (PGIs) dataset, read the user guide on the CLS website.

Further Information 

The CLS genomics GitHub website contains detailed information about the genotyping, imputation and quality control of the underlying genetic resources used to generate the polygenic indices (PGIs). The CLS data resource profile paper provides a broad overview as well as the scientific motivation.

Watch the training webinar

If you are interested in learning more about the polygenic scores, watch the recording of the training webinar hosted by CLS on 30 September from 12-1pm. The experts provide an overview of the polygenic scores, explain how they can be accessed, give examples of how they have been used before, and illustrate the unique research opportunities they offer.

 

 



Media enquiries

Ryan Bradshaw
Editorial Content Manager

Phone: 020 7612 6516
Email: r.bradshaw@ucl.ac.uk

Contact us

Centre for Longitudinal Studies
UCL Social Research Institute

20 Bedford Way
London WC1H 0AL

Email: clsdata@ucl.ac.uk
