The idea of CPF

The Comparative Panel File (CPF) harmonises the world’s largest and longest-running household panel surveys from seven countries: Australia (HILDA), Germany (SOEP), Great Britain (BHPS and UKHLS), Korea (KLIPS), Russia (RLMS), Switzerland (SHP), and the United States (PSID). The project aims to support the social science community in the analysis of comparative life course data. CPF is not a data product but open-source code that integrates individual and household panel data from all seven surveys into a harmonised three-level data structure. Because the code is open source, users can develop it further and extend its areas of application.

CPF is an open-science project that responds to the growing need for cross-nationally comparative longitudinal data in the social sciences. It also contributes to open and replicable science by providing access to data resources and enabling collaborative improvement of research tools.

Currently, CPF is developed by Konrad Turek and Matthijs Kalmijn at the Netherlands Interdisciplinary Demographic Institute (NIDI-KNAW) and Thomas Leopold at the University of Cologne [see The CPF Team]. The CPF code was designed and prepared by Konrad Turek and will be continuously developed and improved by the CPF team and the community of users.

How did it start?

The first version of CPF (1.0) was published in December 2020. The idea of the comparative dataset originated in 2019 among a group of sociologists from the Netherlands Interdisciplinary Demographic Institute, the University of Cologne, and the University of Amsterdam, who were involved in the research project “CRITEVENTS”, funded by the NORFACE/DIAL programme*, which focused on critical life events and the dynamics of inequality over the life course. CPF was developed in an attempt to extend and popularise the approach implemented in the Cross-National Equivalent File (CNEF). CNEF is a long-running and well-established project which harmonises international longitudinal surveys of households. It is an extraordinary endeavour, but it has some limitations: a restricted range of topics, no option to add new variables, and complex application procedures. Building on the CNEF approach, CPF attempts to overcome these limitations for users who require more flexibility and control over the data management process.

* “Critical Life Events and the Dynamics of Inequality: Risk, Vulnerability, and Cumulative Disadvantage” (CRITEVENTS) was funded by NORFACE through the transnational research programme “Dynamics of Inequality Across the Life-Course: Structures and Processes” (DIAL), which is co‐funded by the European Commission through Horizon 2020 under grant agreement No 724363.

Open Science Platform

CPF is an open-science project, which means that it provides access to all resources, including the programming code. Furthermore, the code can be improved and developed by anyone who wishes to contribute to the project. To enable open access and community-based development, we have built an open-science platform that connects several tools: a website, an online forum, GitHub and OSF.

CPF's open-science framework

The central element is the project’s website, which contains all important information, documentation and the latest major version of the code. The website also includes a forum (www.cpfdata.com/forum) for general communication, discussions and suggestions related to the code. It may also be used for asking and answering questions.

GitHub is dedicated specifically to the development of the CPF code. It is a code hosting platform for collaborative code development and is especially useful for managing open-source projects. It allows users to access the main and alternative versions of the code, share their modifications, track changes and continuously integrate them into consecutive versions. Extensions, improvements or alternative versions of the code can be offered by any researcher or programmer who registers, free of charge, on GitHub. Notably, all changes are recorded, which provides version control.

The Open Science Framework (OSF) is one of the most popular open-science platforms and facilitates open collaboration in research. OSF integrates many tools and services that support managing, organising, documenting and sharing all aspects of a project. Among other things, OSF allows users to pre-register studies and to store code and data, and it is linked to preprint services and many scientific platforms. It facilitates a collaborative workflow and makes it possible to document work and progress. Like GitHub, OSF uses a version control system, so all changes to the project are recorded. In addition, OSF allows the project to be registered at each stage, which creates an archival version of the project with a unique hyperlink. All materials can be registered this way, receiving permanent links and DOIs. Importantly, OSF includes a GitHub add-on which directly links files stored in a GitHub repository to the OSF project. This way, changes to the code can be introduced through either GitHub or OSF. The two are also synchronised, so the code on OSF is always up to date.

Our team

Core team

Konrad Turek

Assistant professor at Tilburg University. Research areas: sociology, work and ageing, labour markets, life course inequalities.


Thomas Leopold

Full professor of Sociology at the University of Cologne, Chair for Methods of Empirical Social Research.


Matthijs Kalmijn

Professor of demography and sociology at the Netherlands Interdisciplinary Demographic Institute (NIDI), theme leader ‘Families and Gender’. Professor of Sociology at the University of Groningen.


Supporting team

Isabel Voets

Research Assistant for CPF (2022-2023) at NIDI-KNAW. MSc Double Degree in Sociology and Population Studies (Tilburg University / Universitat Pompeu Fabra).


Xu Xiao

In this Chinese name, the family name is Xu.

Postdoctoral Researcher for CPF (2025) at Tilburg University. PhD candidate in Computational Demography (NIDI-KNAW / University of Groningen).