Linking National Hospital Care Survey and CMS Data

Evaluating privacy-preserving linkage techniques and other support
  • Client
    National Center for Health Statistics
  • Dates
    2020 - 2022


The National Center for Health Statistics wanted to link several data sources while maintaining privacy.

The National Center for Health Statistics (NCHS) wanted to evaluate commercially available privacy-preserving record linkage (PPRL) software to determine whether it could link data as accurately as non-privacy-preserving methods. NCHS’s interest in PPRL methods stems from its expectation that, for certain proposed linkages, data custodians may be unable (e.g., because of legal restrictions) or unwilling to share complete files or sensitive data elements. PPRL makes it possible to complete linkages under these limitations. NCHS expected that limited access to privacy-sensitive data elements would somewhat diminish linkage results, but by how much?

NCHS sought to increase the analytic utility of the 2014 and 2016 NHCS data by linking it with T-MSIS claims and enrollment data. These linked data allow researchers to analyze the health and health outcomes of persons enrolled in a means-tested government healthcare program.


NORC provided technical support to the Data Linkage Team at the National Center for Health Statistics (NCHS).

NCHS asked NORC to:

  • Evaluate privacy-preserving record linkage techniques by re-performing the linkage between the National Hospital Care Survey (NHCS, 2016) and the National Death Index (NDI) using encrypted identifiers with Datavant software
  • Conduct linkage between NHCS (2014 and 2016) and Centers for Medicare & Medicaid Services (CMS) Transformed Medicaid Statistical Information System (T-MSIS) data containing enrollment and claims data from a range of years
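The "encrypted identifiers" in the first task can be pictured as keyed hashes of patient identifiers that are matched in place of the raw values. The sketch below is only an illustration of that general idea, not Datavant's actual algorithm: the key, the choice of identifier fields, the normalization rules, and the function names are all assumptions made for this example.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret for illustration only. In a real linkage the
# key would be managed by the linkage software or a trusted party.
SECRET_KEY = b"example-key-not-for-production"


def normalize(value: str) -> str:
    """Strip accents, punctuation, and case so formatting differences
    do not change the resulting hash."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def make_token(first: str, last: str, dob: str, sex: str) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the normalized identifiers."""
    message = "|".join(normalize(x) for x in (first, last, dob, sex))
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_by_token(file_a, file_b):
    """Exact-match join on tokens; returns (id_a, id_b) pairs."""
    index = {}
    for rec in file_b:
        index.setdefault(rec["token"], []).append(rec["id"])
    return [(rec["id"], other) for rec in file_a for other in index.get(rec["token"], [])]


# Made-up records: each data custodian tokenizes its own file and shares
# only the record ID and the token, never the identifiers themselves.
survey_file = [
    {"id": "S1", "token": make_token("José", "O'Neil", "1950-01-02", "F")},
    {"id": "S2", "token": make_token("Ann", "Lee", "1980-07-15", "F")},
]
claims_file = [
    {"id": "C9", "token": make_token("jose", "ONEIL", "1950/01/02", "f")},
    {"id": "C7", "token": make_token("Ann", "Lee", "1980-07-16", "F")},
]
links = link_by_token(survey_file, claims_file)
```

Because a hash changes completely when any input character changes, the one-day date-of-birth discrepancy in the second record prevents a match that a method with access to the raw identifiers might still recover. This is one reason linkage quality can fall when only hashed values are available.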

We performed a methodological assessment of Datavant software for conducting privacy-preserving record linkage (PPRL). The project assessed how the use of hashing algorithms might affect the quality of the linked data and the inferences drawn in secondary analyses of those data (i.e., the accuracy of tabulations and analyses made with linked data from the PPRL approach).
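One common way to quantify linkage quality is to treat the links from the non-privacy-preserving linkage as a reference standard and measure how well the PPRL links agree with it. The sketch below assumes each linkage is available as a set of record-ID pairs. The function name, the metrics chosen, and the example pairs are illustrative assumptions, not the project's reported method or results.

```python
def linkage_quality(pprl_links, reference_links):
    """Compare PPRL links against reference links.

    Both arguments are iterables of (id_a, id_b) pairs. The reference
    linkage is assumed to be correct for the purpose of the comparison.
    """
    pprl, ref = set(pprl_links), set(reference_links)
    true_links = len(pprl & ref)   # links found by both methods
    false_links = len(pprl - ref)  # PPRL links absent from the reference
    missed_links = len(ref - pprl) # reference links PPRL did not find

    precision = true_links / len(pprl) if pprl else 0.0
    recall = true_links / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "true_links": true_links,
        "false_links": false_links,
        "missed_links": missed_links,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up example pairs, for illustration only.
reference = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
pprl = [(1, "a"), (2, "b"), (3, "x")]
quality = linkage_quality(pprl, reference)
```

Precision and recall describe the links themselves. Assessing the effect on secondary analyses would additionally require repeating the same tabulations on both linked files and comparing the estimates.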

The second part of the project linked NHCS patient records with T-MSIS claims and encounter data, creating a database of detailed health insurance claims data for all NHCS patients receiving health insurance coverage from Medicare and Medicaid, the two largest U.S. public health insurance programs.


NORC’s groundbreaking data linkage provides myriad benefits to health care research. 

This new data resource supports patients, caregivers, and providers as they strive to improve health, prevent chronic disease, and improve the efficacy and quality of health care services. The project expanded data capacity for studies of HHS priority issues such as opioids, obesity, and infectious diseases, particularly among the Medicaid-covered population, in a way that no single data source could provide. It also supported a wide array of health outcomes research studies, such as examining differences in the efficiency and effectiveness of treatment protocols or in post-acute care utilization among patients covered by Medicare fee-for-service, Medicare Advantage, and Medicaid programs. The project also provides a rich data source for researchers examining the association between health and housing.
