Linked education data opens up new research opportunities

News, Data release
18 July 2019

*Update June 2023* – New datasets linking education data – from Key Stage 1 to Key Stage 4 – to the records of MCS participants based in Wales are now available from the UK Data Service.

New datasets have just been released, linking education data, including GCSE exam results, to the records of Millennium Cohort Study (MCS) participants based in England.

This new information, produced by the Centre for Longitudinal Studies (CLS), will allow researchers to more accurately study how pupils born at the turn of this century have progressed through school, including the factors associated with different academic trajectories and attainment. The linked data will be uniquely placed to provide insight into how people from different backgrounds across the country fare in England’s education system today.

The MCS study team worked with the Department for Education (DfE) to securely link the National Pupil Database (NPD) data to cohort members’ survey data.

Cohort members’ parents had been asked for consent to link the NPD education records to their children’s MCS survey data when they were growing up. In England, 94 per cent agreed. Key Stage 1 and 2 test results were first linked in 2012. The new data linkage, which includes GCSE exam results, means researchers will be able to explore detailed information on cohort members’ progress through schools in England, from reception to year 11.

Other NPD pupil-level data in the new datasets includes information on absences, ethnicity, languages spoken at home, special educational needs, and eligibility for free school meals. This information is included in the School Census data on pupils attending maintained schools.

In addition, school-level data from Local Education Authority School Information Service (LEASIS) files has been linked to the study, covering the type of school the cohort member attended, the percentage of pupils on free school meals and the percentage of pupils with special needs at their school. School-level variables have also been derived, allowing researchers to identify which schools changed their Unique Reference Number between years and the reasons for that change.

Professor Emla Fitzsimons, Director of MCS, said: “Education researchers and policymakers can now track the performance of a generation of young people in England across much of their school lives. Linking this education data to MCS provides a unique resource for the study of a wide range of topics, for instance, the links between early childhood, family background and education trajectories, and the interplay between education and other aspects of people’s lives such as mental health and wellbeing.

“This new linked data will help to improve the scope and quality of education-related research, which will have the potential to benefit policy and practice, and ultimately current and future school pupils.”

Cohort members’ survey data and linked education records are anonymised and can be accessed securely by bona fide, registered researchers through the UK Data Service (UKDS).

These newly released datasets relate to cohort members in England only, but work is ongoing to also link education data to the survey data of cohort members in Scotland, Wales and Northern Ireland. Key Stage 1 test scores have already been linked for Scotland and Wales, and these datasets are available to researchers now. Plans are in place to link Key Stage 2 test scores and GCSE exam results for these cohort members.

This latest development is part of a wider programme of data linkage work under way at CLS, which runs the MCS and three other cohort studies – the 1958 National Child Development Study, the 1970 British Cohort Study, and Next Steps.

Further information

Read the Millennium Cohort Study User Guide to the Linked Administrative Datasets.

The datasets are available via the UKDS’s secure-access system, the Secure Lab. Information about how users can apply for access to these data can be found here.


  1. CLS securely transferred personal details (such as the cohort member’s name, sex, date of birth and address) when working with the DfE to link education records to survey data. No other information about the person, or any of their answers to the surveys, was sent. The DfE used these details only to identify the person in its systems and to send CLS their education records. Once the correct records were identified, these personal details were destroyed. When the data from the education records was sent to CLS, it was added to the information collected in the study. All survey and linked data are made available to researchers under restricted access arrangements. Names, addresses, and National Insurance or NHS numbers are never disclosed.
  2. Highly confidential datasets managed by the UKDS are classified as controlled data, which are not available under standard UKDS access arrangements. Users applying for controlled data must meet additional conditions of access, including completing special forms and attending a training course. Once researchers and their projects are approved, the datasets are available to view on the UKDS’s secure system, the Secure Lab. Data accessed in this way cannot be downloaded. Approved researchers can analyse the data remotely from their organisational desktop, or by using the UKDS’s Safe Room. Both routes provide access to statistical and office software to make remote analysis and collaboration secure and convenient.

Media enquiries

Ryan Bradshaw
Senior Communications Officer

Phone: 020 7612 6516
