The idea of CPF

The Comparative Panel File (CPF) is an ongoing open-science project to harmonise the world’s largest and longest-running household panel surveys from seven countries. The project aims to support the social science community in analysing comparative life course data. Harmonising repeated individual-level data that cover long periods across several general population surveys allows researchers to analyse both time trends and country differences. Currently, CPF includes household panel data from seven of the world’s most important longitudinal surveys: Australia (HILDA), Germany (SOEP), Great Britain (BHPS/UKHLS), South Korea (KLIPS), Switzerland (SHP), the United States (PSID), and the Netherlands (LISS). Earlier versions of CPF included Russia (the Russian Longitudinal Monitoring Survey, RLMS), which was removed as of version 2.0.

Rather than a static dataset, CPF offers flexible code (in Stata) that combines original survey data into a unified, three-level panel structure (observations nested within individuals, within countries). Thus, CPF is not a data product. Researchers must download the raw data from national data providers and apply the CPF code to create their harmonized dataset. The open-source nature of the code enables the development and expansion of its areas of application. The main features of CPF include:

• Open source: All code is fully transparent, editable, and extendable.

• Broad and extendable scope: Covers a broad range of variables, which can be further extended.

• Flexible and modular: Users can easily adapt it to include different countries, waves, or variables.

• Community-driven: Designed for collaborative development through open coding platforms such as GitHub.
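
The three-level structure mentioned above (observations nested within individuals, within countries) can be illustrated with a small sketch. This is not the CPF code, which is written in Stata; it is a minimal Python illustration, and the survey rows, the variable names and the helper functions harmonise and nest are all invented for the example.

```python
from collections import defaultdict

# Invented example rows standing in for two national surveys. The variable
# names are hypothetical and do not match the real source files.
survey_de = [
    {"person": 7, "svy_year": 2019, "age_yrs": 41},
    {"person": 7, "svy_year": 2018, "age_yrs": 40},
]
survey_au = [
    {"id": "0012", "wave_year": 2018, "age": 33},
]

# Hypothetical mappings: source variable name -> common (harmonised) name.
MAPPINGS = {
    "Germany": {"person": "pid", "svy_year": "year", "age_yrs": "age"},
    "Australia": {"id": "pid", "wave_year": "year", "age": "age"},
}


def harmonise(records, country):
    """Rename source variables to common names and tag rows with the country."""
    mapping = MAPPINGS[country]
    rows = []
    for record in records:
        row = {common: record[source] for source, common in mapping.items()}
        row["pid"] = str(row["pid"])  # use one ID type across surveys
        row["country"] = country
        rows.append(row)
    return rows


def nest(rows):
    """Group observations within individuals, within countries."""
    panel = defaultdict(lambda: defaultdict(list))
    for row in rows:
        panel[row["country"]][row["pid"]].append(row)
    # Order each person's observations by survey year.
    for persons in panel.values():
        for observations in persons.values():
            observations.sort(key=lambda r: r["year"])
    return panel


rows = harmonise(survey_de, "Germany") + harmonise(survey_au, "Australia")
panel = nest(rows)

# Print one line per individual: country, person ID, observed years.
for country, persons in panel.items():
    for pid, observations in persons.items():
        print(country, pid, [obs["year"] for obs in observations])
```

The flat list rows corresponds to a long-format panel with one row per person-year, while panel makes the nesting explicit. Real harmonisation also involves recoding values and handling survey-specific rules, which this sketch leaves out.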

CPF is developed by Konrad Turek and Matthijs Kalmijn at the Netherlands Interdisciplinary Demographic Institute (NIDI-KNAW) and Thomas Leopold at the University of Cologne (see The CPF Team). The CPF code was designed and prepared by Konrad Turek and is continuously developed and improved by the CPF team and the community of users.

How did it start?

The first version of CPF (1.0) was published in December 2020 (Turek, Kalmijn, Leopold, 2021). The idea of the comparative dataset originated in 2019 among a group of sociologists from the Netherlands Interdisciplinary Demographic Institute, the University of Cologne, and the University of Amsterdam who were involved in the research project “CRITEVENTS”, funded by the NORFACE/DIAL programme*, which focused on critical life events and the dynamics of inequality over the life course. CPF was developed to extend and popularise the approach implemented in the Cross-National Equivalent File (CNEF). CNEF is a long-running and well-established project that harmonizes international longitudinal household surveys. It is an extraordinary endeavor. CPF, however, was an attempt to move the harmonization process towards open science and crowdsourced cooperation, and to provide novel functionalities as well as more flexibility and control over the data management process (Turek, 2025).

Open Science Platform

CPF is an open-science project, which means that it provides access to all resources, including the programming code. Furthermore, the code can be improved and developed by anyone who wishes to contribute to the project. To enable open access and community-based development, we have built an open-science platform that connects several tools: the website, GitHub, and the Open Science Framework (OSF). Users’ improvements and suggestions will be recorded, incorporated, and shared using open online tools to allow continuous development and regular updates to the official versions of the code. This design balances community-based development with centralized coordination through a core team that supervises development and ensures quality control (Turek, 2025).

CPF's open-science framework

The central element is the project’s website, which contains all important information, documentation, and the latest major version of the code. The website also includes a forum.

GitHub is dedicated to the development of the CPF code. It is a code hosting platform for collaborative code development and is especially useful for managing open-source projects. It allows users to access the main and alternative versions of the code, share their modifications, track changes, and continuously integrate them into consecutive versions. Extensions, improvements, or alternative versions of the code can be offered by all researchers and programmers who register free of charge on GitHub. Notably, all changes are recorded, providing version control functionality.

The Open Science Framework (OSF) is one of the most popular open-science platforms and facilitates open collaboration in research. OSF integrates many tools and services that support managing, organising, documenting, and sharing all aspects of a project. Among other things, OSF allows users to pre-register studies and to store code and data, and it is linked to preprint services and many scientific platforms. It facilitates collaborative workflows and makes it possible to document work and progress. Like GitHub, OSF uses a version control system, so all changes to the project are recorded. OSF additionally allows the project to be registered at each stage, which creates an archival version of the project with a unique hyperlink. All materials can be registered this way, receiving permanent links and DOIs. Importantly, OSF includes a GitHub add-on that directly links files stored in a GitHub repository to the OSF project. This way, changes to the code can be introduced through either GitHub or OSF. The two are also synchronised so that the code on OSF is always up to date.

Links to the resources:

Our team

Core team

Konrad Turek

Assistant professor at Tilburg University (sociology, work and ageing labour markets, life course inequalities).


Thomas Leopold

Full professor of Sociology at the University of Cologne, Chair for Methods of Empirical Social Research.


Matthijs Kalmijn

Professor of demography and sociology at the Netherlands Interdisciplinary Demographic Institute (NIDI), theme leader ‘Families and Gender’. Professor of Sociology at the University of Groningen.


Supporting team

Isabel Voets

Research Assistant for CPF (2022-2023) at NIDI-KNAW. MSc Double Degree Sociology and Population Studies (Tilburg University / Universitat Pompeu Fabra)


Xu Xiao

In this Chinese name, the family name is Xu.

Postdoctoral Researcher for CPF (2025) at Tilburg University. PhD candidate Computational Demography (NIDI-KNAW / University of Groningen)