NHANES-CMS Linked Data Overview
Purpose
This module provides an overview of the NHANES-CMS linked data by briefly describing the available linked datasets and the history of the NHANES and CMS data collection.
Task 1: Sources and Structure of Linked Datasets
The data linkage program at the National Center for Health Statistics (NCHS) is designed to maximize the scientific value of the Center’s population-based surveys by linking survey responses with information gathered from administrative records. This task will help you understand the source and structure of some of the linked data files.
Data Linkage Overview
The linked data files enable you to examine factors that influence disability, chronic disease, health care utilization, morbidity, and mortality. Participants are considered eligible for linkage to CMS administrative records if they have provided consent as well as the necessary personally identifiable information (PII), such as date of birth and full or partial SSN or Medicare Health Insurance Claim (HIC) number.
This tutorial’s focus is on the linkage of National Health and Nutrition Examination Survey (NHANES) data with enrollment and claims data from CMS. Depending on your research goals, you may be interested in adding information from other linked NHANES datasets at NCHS, including the NHANES linked mortality files. For confidentiality reasons, the linked CMS Medicare and Medicaid data files described in this tutorial are restricted and need to be accessed through the NCHS Research Data Center (RDC). For more information, refer to Module 1. In order to access the data, you must submit a proposal to the RDC that fully describes the data needed for your intended analysis. NCHS has provided a Feasibility Study data file that includes a limited set of variables for you to use in determining the feasibility and potential sample sizes of your proposed research project(s). Links to the feasibility files webpages and the RDC website are included in the Resources section of this module.
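As a first pass before drafting an RDC proposal, you might count how many participants in a feasibility file are linkage eligible and how many matched to CMS records. The sketch below uses made-up rows. The field names (`seqn`, `linkage_eligible`, `cms_match`) are placeholders chosen for illustration and are not the actual feasibility file variable names, which are documented on the feasibility files webpages.

```python
# Hypothetical rows standing in for a feasibility file; the field names are
# illustrative placeholders, not actual NCHS variable names.
feasibility_rows = [
    {"seqn": 1, "linkage_eligible": True, "cms_match": True},
    {"seqn": 2, "linkage_eligible": True, "cms_match": False},
    {"seqn": 3, "linkage_eligible": False, "cms_match": False},
    {"seqn": 4, "linkage_eligible": True, "cms_match": True},
]


def summarize_feasibility(rows):
    """Count participants, linkage-eligible participants, and CMS matches."""
    eligible = [r for r in rows if r["linkage_eligible"]]
    matched = [r for r in eligible if r["cms_match"]]
    return {
        "participants": len(rows),
        "eligible": len(eligible),
        "matched": len(matched),
    }


summary = summarize_feasibility(feasibility_rows)
print(summary)
```

These are unweighted counts, which is usually what a sample-size check needs. Estimates for the population would require the appropriate survey weights.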
Resources
- Linkage of NCHS Population Health Surveys to Administrative Records From Social Security Administration and Centers for Medicare & Medicaid Services
- Centers for Medicare & Medicaid Services (CMS) website
- NCHS Research Data Center webpage
- NCHS-CMS Medicare Feasibility Files
- NCHS-CMS Medicaid Feasibility Files
- NHANES Mortality Data Linkage webpage
- National Health and Nutrition Examination Survey (NHANES) webpage
Medicare enrollment and fee-for-service (FFS) claims data are available for NHANES participants who are considered linkage eligible and matched to CMS administrative data files using the Medicare Enrollment Database. To link the NHANES participants with their Medicare data, participants had to provide consent as well as the necessary personally identifiable information (PII), such as date of birth and full or partial Social Security number (SSN) or Medicare Health Insurance Claim (HIC) number. Linkage methods differed depending on the survey and year of data collection. For more information on linkage methods, please refer to the NCHS-CMS Medicare Matching Methodology and Analytic Considerations (link is below under Resources).
CMS provided NCHS with Master Beneficiary Summary File (MBSF) and Medicare FFS claims data for all successfully matched NHANES participants. CMS also provided Medicare Part D prescription drug event data to NCHS, beginning with the onset of the Part D benefit in 2006. Refer to Course 1, Module 4, Task 2, “Years of NHANES-CMS Linked Data” for the range of NHANES and CMS years of data that are linked.
The CMS claims files linked to NHANES data are all finalized claims files. These files are generated from provider-submitted health care services claims for payment through final action algorithms that match the original claim with adjusted claims to resolve any adjustments. The files are annual and are available for Part A (Inpatient, Outpatient, Skilled Nursing Facility, Hospice, or Home Health Agency) and for Part B (Carrier, Durable Medical Equipment) health care claims.
The files contain information collected by Medicare to pay for health care services provided to a Medicare beneficiary. There is one record in the file for each claim, although an episode of care may include more than one claim.
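To illustrate what "final action" means, the sketch below keeps only the latest version of each claim and drops claims whose latest version is a cancellation. This is a simplified illustration of the idea, not the algorithm CMS uses. The field names (`claim_id`, `version`, `action`) are hypothetical. Because the linked files already contain final action claims, you would not need to apply this step yourself.

```python
def final_action_claims(records):
    """Keep the latest version of each claim; drop claims that end in a cancellation."""
    latest = {}
    for rec in records:
        current = latest.get(rec["claim_id"])
        if current is None or rec["version"] > current["version"]:
            latest[rec["claim_id"]] = rec
    return [r for r in latest.values() if r["action"] != "cancel"]


# Hypothetical claim history: C1 is adjusted, C2 is cancelled, C3 is unchanged.
records = [
    {"claim_id": "C1", "version": 1, "action": "original", "payment": 100.0},
    {"claim_id": "C1", "version": 2, "action": "adjust", "payment": 80.0},
    {"claim_id": "C2", "version": 1, "action": "original", "payment": 50.0},
    {"claim_id": "C2", "version": 2, "action": "cancel", "payment": 0.0},
    {"claim_id": "C3", "version": 1, "action": "original", "payment": 25.0},
]

final = final_action_claims(records)
```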
Available Linked Data Sets
The following is a list and description of Medicare files that have been linked with NHANES.
Diagram: NHANES-Medicare Linked Data Structure
Master Beneficiary Summary File
The MBSF is an annual file containing demographic and enrollment information about beneficiaries enrolled in Medicare during each calendar year. It does not contain information on all beneficiaries ever entitled to Medicare, only those enrolled during the calendar year. The MBSF provides information on beneficiary demographic characteristics, reason for Medicare entitlement, and beneficiary enrollment in various Medicare programs including Parts A, B, C, and D. The MBSF consists of several segments as noted below. Each segment is provided as a separate data file.
The Base (A/B) segment includes beneficiary demographic and enrollment information, such as date of birth, date of death, sex, race, age, geographic information, monthly entitlement indicators, reasons for entitlement (initial and current), and monthly Medicare Advantage enrollment.
The Part D segment includes variables specific to Medicare Part D Prescription Drug plan enrollment dating back to its inception in 2006.
The Cost & Utilization segment includes summarized information about the service utilization and Medicare payment amounts by type of claims.
The Chronic Conditions segment includes variables that indicate a Medicare beneficiary has received a service or treatment for up to 21 chronic health conditions.
Note: Medicare Advantage (MA) plans (also known as Medicare Part C) are offered by private companies approved by Medicare. MA plans do not submit health care claims to Medicare for reimbursement of health care services. Therefore, the linked NHANES-CMS Medicare files currently contain no health care claims for beneficiaries enrolled in an MA plan. Exclude these beneficiaries from any analysis that relies on claims data.
Remember: You should always include a request for the Master Beneficiary Summary File as part of your RDC proposal to use the linked NHANES-CMS data.
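One way to apply the note above is to use the monthly Medicare Advantage enrollment information in the MBSF Base segment to separate person-years with any MA enrollment from those with none. The sketch below assumes each person-year record carries a list of 12 monthly flags named `ma_by_month`. That name, and the record layout, are hypothetical and stand in for the actual MBSF monthly indicator variables.

```python
def split_by_ma_enrollment(person_years):
    """Separate person-years with no MA months from those with any MA month."""
    ffs_only, any_ma = [], []
    for py in person_years:
        if any(py["ma_by_month"]):
            any_ma.append(py)
        else:
            ffs_only.append(py)
    return ffs_only, any_ma


# Hypothetical person-year records; True marks a month of MA enrollment.
person_years = [
    {"seqn": 1, "year": 2007, "ma_by_month": [False] * 12},
    {"seqn": 2, "year": 2007, "ma_by_month": [False] * 6 + [True] * 6},
    {"seqn": 3, "year": 2007, "ma_by_month": [True] * 12},
]

ffs_only, any_ma = split_by_ma_enrollment(person_years)
```

Dropping every person-year with any MA month is the strictest choice. Depending on the analysis, restricting to the months without MA enrollment may be a reasonable alternative.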
Medicare Fee-for-Service Claims Files
Medicare Provider Analysis and Review (MedPAR): The MedPAR file consolidates individual Inpatient Hospital or Skilled Nursing Facility (SNF) health care claims data into stay level records. Inpatient or SNF claims data are consolidated from the beneficiary’s date of admission to date of discharge into one facility stay record.
Carrier: The Carrier File contains final action claims data submitted by noninstitutional providers. The data are largely made up of physician claim records, although the file also includes claims from other professional providers such as physician assistants, clinical social workers, nurse practitioners, independent clinical laboratories, ambulance providers, and free-standing ambulatory surgical centers.
Durable Medical Equipment (DME): The DME File contains final action claims data submitted by DME suppliers to a DME Medicare Administrative Contractor (MAC).
Home Health Agency (HHA): The HHA File contains final action claims submitted by Home Health Agency providers for reimbursement of home health covered services. An HHA claim may cover services provided over a period of time, rather than a single day.
Hospice: The Hospice File contains final action claims data submitted by hospice providers. The data in this file include the type of hospice care received including hospice home care or inpatient respite care.
Outpatient: The Outpatient File contains Medicare Part B final action claims from institutional outpatient providers which can include Hospital outpatient departments, rural health clinics, renal dialysis facilities, outpatient rehabilitation facilities, and community mental health centers.
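The stay-level consolidation described for MedPAR above can be pictured as grouping a beneficiary's claims for one admission into a single record. The sketch below is only an illustration of that idea, not the procedure CMS uses to build MedPAR. The field names (`bene_id`, `provider`, `admit`, `thru`, `payment`) are hypothetical.

```python
from datetime import date


def consolidate_stays(claims):
    """Roll claims that share beneficiary, provider, and admission date into one stay."""
    stays = {}
    for c in claims:
        key = (c["bene_id"], c["provider"], c["admit"])
        stay = stays.get(key)
        if stay is None:
            stays[key] = {
                "bene_id": c["bene_id"],
                "provider": c["provider"],
                "admit": c["admit"],
                "discharge": c["thru"],
                "payment": c["payment"],
                "n_claims": 1,
            }
        else:
            # Extend the stay to the latest through-date and add up payments.
            stay["discharge"] = max(stay["discharge"], c["thru"])
            stay["payment"] += c["payment"]
            stay["n_claims"] += 1
    return list(stays.values())


# Hypothetical claims: beneficiary A has two claims for one admission.
claims = [
    {"bene_id": "A", "provider": "H1", "admit": date(2007, 3, 1),
     "thru": date(2007, 3, 10), "payment": 5000.0},
    {"bene_id": "A", "provider": "H1", "admit": date(2007, 3, 1),
     "thru": date(2007, 3, 15), "payment": 2000.0},
    {"bene_id": "B", "provider": "H2", "admit": date(2007, 6, 5),
     "thru": date(2007, 6, 8), "payment": 3000.0},
]

stays = consolidate_stays(claims)
```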
Other Files
Medicare Part D Prescription Drug Event (PDE): The Part D PDE File contains a summary of prescription drug costs and payment data used by CMS to administer benefits for Medicare Part D enrollees, including payments to the plan providers. It does not contain individual drug claims, but rather summary extracts submitted to CMS by Medicare Part D prescription drug plan providers.
Resources
- NCHS-CMS Medicare Matching Methodology and Analytic Considerations
- Linkage of NCHS Population Health Surveys to Administrative Records From Social Security Administration and Centers for Medicare & Medicaid Services
- Centers for Medicare & Medicaid Services (CMS) website
- Chronic Condition Data Warehouse (CCW) website
- National Health and Nutrition Examination Survey (NHANES) webpage
Medicaid enrollment and claims data are available for NHANES participants who are considered linkage eligible and matched to the Medicaid Person Summary (PS) File, which identifies all Medicaid recipients. To link NHANES participants with their Medicaid data, participants had to provide consent as well as the necessary personally identifiable information (PII), such as date of birth and full or partial Social Security number (SSN). Linkage methods differed depending on the survey and year of data collection. For more information on linkage methods, please refer to the NCHS-CMS Medicaid Matching Methodology and Analytic Considerations (link below under Resources).
CMS provided NCHS with Medicaid enrollment and health care claims and encounter data for all successfully matched NHANES participants. Refer to Course 1, Module 4, Task 2 “Years of NHANES-CMS Linked Data” for the range of years of data that are linked.
Available Linked Data Sets
The following is a list and description of Medicaid files that have been linked with NHANES.
Diagram: NHANES-Medicaid Linked Data Structure
For each calendar year there are five MAX data files: the PS File, which includes demographic, enrollment, and summary utilization statistics, and four claims files, one for each type of utilization.
Summary File
Person Summary (PS) File: The PS File consists of person-level information about Medicaid-eligible persons who have enrolled in a state Medicaid program; it includes demographic data, basis of eligibility, maintenance assistance status, monthly enrollment status, and a utilization summary. The file contains one record for every individual enrolled for at least one day during the year.
You should always include a request for the Person Summary (PS) File as part of the data request.
Claims Files
Inpatient Hospital (IP) File: The IP File contains complete stay records for enrollees who used inpatient services. Records include fee-for-service (FFS) claims and encounter records submitted for inpatient stays covered by Medicaid managed care.
Other Services (OT) File: The OT File contains claim records for all non-institutional Medicaid services, including physician services, lab/X-ray, clinic services, durable medical equipment (DME), personal care, and premium payments. Non-institutional services take place in outpatient and hospice facilities, physician offices, and home health settings.
Long Term Care (LT) File: The LT File contains claims records for long-term care services provided by Skilled Nursing Facilities (SNF), Intermediate Care Facilities (ICFs), and independent psychiatric facilities. Types of services include mental health, inpatient psychiatric, and intermediate care facilities for the mentally retarded (ICF/MR), and nursing facility days.
Prescription Drug (RX) File: The Prescription Drug File includes fee-for-service (FFS) claims and managed care encounter records that contain a National Drug Code (NDC) for a filled prescription. The National Drug Code is used for identification of each drug product. Drugs provided during an inpatient stay are not included on this file. Injectable drugs administered by a health care professional are included in the OT File.
Medicaid Federal Funding and Services Covered
Medicaid is administered by states under general guidelines established by the federal government and is financed jointly by federal and state funds. The Federal Medical Assistance Percentage (FMAP), also called the federal match rate, represents the percent of Medicaid financed by the federal government in each state. The FMAP differs by state and considers the average per capita income in a state relative to the national average. You can find the FMAP for individual states for each federal fiscal year in the Federal Percentages and Federal Medical Assistance Percentages table produced by the Department of Health and Human Services’ Assistant Secretary for Planning and Evaluation.
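The text above does not give the formula. As commonly stated from section 1905(b) of the Social Security Act, the regular FMAP is 1 minus 0.45 times the squared ratio of state to national per capita income, bounded below at 50 percent and above at 83 percent. The sketch below implements that formula under those assumptions. The function and parameter names are made up for illustration, and the sketch ignores statutory exceptions and temporary adjustments, so use the published table for actual rates.

```python
def regular_fmap(state_pci, national_pci, floor=0.50, ceiling=0.83):
    """Approximate the regular FMAP from per capita incomes (PCI).

    Assumes the statutory formula 1 - 0.45 * (state PCI / national PCI)^2,
    limited to the range [floor, ceiling].
    """
    raw = 1.0 - 0.45 * (state_pci / national_pci) ** 2
    return min(max(raw, floor), ceiling)


# Hypothetical per capita incomes, for illustration only.
average_state = regular_fmap(40000, 40000)    # state income equals the national average
high_income_state = regular_fmap(60000, 40000)  # formula falls below the floor
low_income_state = regular_fmap(20000, 40000)   # formula exceeds the ceiling
```

A state at the national average gets a rate of about 55 percent. Higher-income states are held at the floor, and lower-income states receive a higher match up to the ceiling.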
State Medicaid programs must cover mandatory services specified in federal law to receive federal matching funds. Beneficiaries are entitled to receive the following mandatory services:
- Physicians’ services
- Hospital services (inpatient and outpatient)
- Laboratory and x-ray services
- Early and periodic screening, diagnostic and treatment (EPSDT) services for persons under 21
- Federally-qualified health center and rural health clinic services
- Family planning services and supplies
- Pediatric and family nurse practitioner services
- Nurse midwife services
- Nursing facility services for persons 21 and older
- Home health care for persons eligible for nursing facility services
- Transportation services
- Medicaid long-term care services
Medicaid long-term care services include comprehensive services provided in nursing homes and intermediate care facilities (ICF). Long-term care also includes a wide range of services and supports needed by people to live independently in the community, including home health care, personal care, medical equipment, rehabilitative therapy, adult day care, case management and respite for caregivers.
States are also permitted to cover many services that federal law designates as optional, including dental services, prescription drugs, case management, and hospice services. State variation in Medicaid coverage, with regard to both program eligibility and covered services, results in state differences in enrollment rates and expenditures. Other factors, including the age distribution, the poverty rate, and the Medicaid provider reimbursement rates, also contribute to variation among states in enrollment, service use, and costs. As a result, Medicaid operates as more than 50 distinct programs – one in each state, the District of Columbia, and each of the territories. Consideration of these state-level differences may be necessary for many analyses. State identifiers are not available in the NHANES public use data files, so for such analyses you must specifically request them in your NCHS Research Data Center proposal.
Resources
- NCHS-CMS Medicaid Matching Methodology and Analytic Considerations
- Linkage of NCHS Population Health Surveys to Administrative Records From Social Security Administration and Centers for Medicare & Medicaid Services
- Centers for Medicare & Medicaid Services (CMS) website
- Medicaid Analytic eXtract (MAX) General Information
- National Health and Nutrition Examination Survey (NHANES) webpage
While this tutorial’s focus is on using NHANES data that are linked to Medicare and Medicaid enrollment and claims data from CMS, you may find it useful to include mortality data in your analysis to gain additional information about the participant. In addition to the CMS linked data files, NCHS survey data are also linked to the National Death Index (NDI). The linked mortality files contain date and cause of death information for NHANES participants who are deceased. The NDI is maintained by NCHS and is the nation’s most complete and detailed source of information on mortality in the US, including vital status (alive or dead), date of death and cause of death.
The NHANES Mortality Data Linkage web pages provide a complete overview of the linked mortality files, how to access the files, and description of the matching methodology (link below).
CMS files contain the date of death for CMS program beneficiaries who are deceased, but not the cause of death. The Medicare MBSF and the Medicaid MAX PS File include information on date of death if it occurred during the calendar year of the data file. The death information in CMS data files comes from SSA records, rather than the NDI.
Death information is occasionally misreported to CMS. CMS does not correct this erroneous information, but data users can identify these cases. One indicator that a beneficiary identified by CMS as deceased may still be alive is that the beneficiary continues to be eligible for benefits in later years or has new death information recorded in a later file. Use extra caution when analyzing CMS death information to ensure that deaths are not over-counted. The date of death itself is also occasionally misreported to CMS. These cases can be identified by examining the variable “Valid Date of Death Switch,” where a value of “V” indicates that CMS has validated the actual date the beneficiary died and a blank indicates that the reported death date was not validated. If the date of death is not validated, CMS assigns the date of death as the last day of the month.
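The checks described above can be sketched as a small classification step. The function below is a minimal illustration under assumed inputs: `death_date` is the CMS-reported date (or `None`), `valid_switch` holds the value of the Valid Date of Death Switch, and `enrollment_years` lists the calendar years in which the beneficiary appears as enrolled. The function name, argument names, and labels are hypothetical.

```python
from datetime import date


def classify_cms_death(death_date, valid_switch, enrollment_years):
    """Label a CMS death record using the checks described in the text."""
    if death_date is None:
        return "no death reported"
    # Enrollment in a year after the reported death suggests a reporting error.
    if any(year > death_date.year for year in enrollment_years):
        return "possibly erroneous"
    if valid_switch == "V":
        return "validated date"
    # Unvalidated dates are assigned to the last day of the month.
    return "unvalidated date"


# Hypothetical records, for illustration only.
labels = [
    classify_cms_death(None, "", [2005]),
    classify_cms_death(date(2006, 5, 31), "", [2005, 2006, 2007]),
    classify_cms_death(date(2006, 5, 12), "V", [2005, 2006]),
    classify_cms_death(date(2006, 5, 31), "", [2005, 2006]),
]
```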
No attempt has been made to reconcile inconsistent death information from CMS with linked mortality information collected by NCHS from linkage with the NDI. RDC research proposals that intend to analyze mortality outcomes should consider investigating death information from both the CMS data and the NHANES Linked Mortality Files.
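If your proposal draws on both sources, a simple comparison of the two dates can show where they disagree. The sketch below is one possible approach, not an NCHS procedure. The function name and labels are hypothetical. The "same month" category is included because an unvalidated CMS date is set to the last day of the month and may differ from the NDI date only in the day.

```python
from datetime import date


def compare_death_dates(cms_date, ndi_date):
    """Compare a CMS death date with an NDI death date (either may be None)."""
    if cms_date is None and ndi_date is None:
        return "no death in either source"
    if cms_date is None:
        return "NDI only"
    if ndi_date is None:
        return "CMS only"
    if cms_date == ndi_date:
        return "same date"
    if (cms_date.year, cms_date.month) == (ndi_date.year, ndi_date.month):
        return "same month"
    return "different month"


# Hypothetical date pairs, for illustration only.
results = [
    compare_death_dates(None, None),
    compare_death_dates(date(2006, 5, 31), None),
    compare_death_dates(date(2006, 5, 31), date(2006, 5, 12)),
    compare_death_dates(date(2006, 5, 12), date(2006, 5, 12)),
    compare_death_dates(date(2006, 5, 31), date(2006, 7, 2)),
]
```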
Resources
Task 2: Background of NHANES and CMS Data
NHANES is a program of studies designed to assess the health and nutritional status of adults and children in the United States. Medicare and Medicaid data from the CMS are obtained from administrative data files. This task will give you a brief overview of NHANES and CMS data.
Key Concepts about NHANES Data
NHANES is a continuous, nationally representative survey that examines about 5,000 persons from 15 different counties each year. For a variety of reasons, including disclosure issues, the NHANES data are released on public-use data files in two-year increments. The survey includes a standardized physical examination, laboratory tests, and questionnaires that cover various health-related topics. NHANES includes an interview in the household followed by an examination in a mobile examination center (MEC). The sample is cross-sectional and is selected from the U.S. civilian, noninstitutionalized population using a complex, multistage probability design.
Prior to becoming a continuous survey in 1999, NHANES was conducted periodically. NHANES I was conducted in 1971–75 and was followed by the NHANES I Epidemiologic Follow-up Study (NHEFS), a national longitudinal study conducted in collaboration with the National Institutes of Health, National Institute on Aging and other agencies of the Public Health Service. The last periodic survey, NHANES III, was conducted between 1988 and 1994. NHANES III was designed to provide national estimates of health and nutritional status of the civilian, non-institutionalized population of the United States aged 2 months and older. Similar to the continuous survey, NHANES III included a standardized physical examination, laboratory tests, and questionnaires that covered various health-related topics. For detailed information about the Continuous NHANES, NHANES III and NHEFS contents and methods, refer to the NHANES website.
Goals of NHANES
Since its inception, the basic survey goal of monitoring the health status of the US population has been refined as well as expanded. The current goals of the NHANES survey are to:
- Estimate the number and percent of persons in the U.S. population, and designated subgroups, with selected diseases and risk factors
- Monitor trends in the prevalence, awareness, treatment, and control of selected diseases
- Monitor trends in risk behaviors and environmental exposures
- Analyze risk factors for selected diseases
- Study the relationship between diet, nutrition, and health
- Explore emerging public health issues and new technologies
- Establish a national probability sample of genetic material for future genetic research
- Establish and maintain a national probability sample of baseline information on health and nutritional status
Administered by CMS, Medicare is the primary health insurance program for people aged 65 or older, people under age 65 with certain disabilities, and people of all ages with End-Stage Renal Disease (ESRD). Nearly all Medicare beneficiaries receive Part A hospital insurance benefits, which help cover inpatient hospital care, skilled nursing facility stays, home health care, and hospice care. Most beneficiaries also subscribe to Part B medical insurance benefits, which help cover physician services, outpatient care, durable medical equipment, and some home health care. Additionally, many beneficiaries elect to purchase Medicare Part D prescription drug coverage (available since 2006). Beneficiaries may elect to receive traditional fee-for-service (FFS) Medicare or, as an alternative, enroll in Medicare Part C plans. Medicare Part C plans are also referred to as Medicare Advantage (MA) and include Health Maintenance Organizations (HMOs), Managed Care Plans, Preferred Provider Organizations (PPOs), Private Fee-for-Service (PFFS) Plans, Special Needs Plans, and Medicare Medical Savings Account Plans. These are private plans, comparable to managed care organizations, that provide Medicare Part A and Part B services and, from 2006 forward, must also provide Part D prescription drug coverage.
Where do the Medicare data come from?
The Medicare data are from CMS enrollment files and FFS administrative claims submitted for payment to CMS. This information, beginning with 1999 data, has been linked with the continuous NHANES files. The range of years of Medicare data that are linked is described in Course 1, Module 4, Task 2, “Years of NHANES-CMS Linked Data.”
Resources
- National Health and Nutrition Examination Survey (NHANES) webpage
- Centers for Medicare & Medicaid Services (CMS) website
- NCHS Research Data Center webpage
- NCHS-CMS Medicare Feasibility Files
- Chronic Condition Data Warehouse (CCW) website
- National Death Index (NDI) webpage
- NHANES Mortality Data Linkage webpage
- Continuous NHANES Web Tutorial Survey Overview page
Also administered by CMS, Medicaid is a U.S. public health insurance program (Title XIX of the Social Security Act from 1965) covering low-income adults and children and people with certain disabilities. It is jointly funded by the individual states and the Federal government. Each state manages its own Medicaid program within the bounds of minimum Federal requirements. Thus, Medicaid eligible populations and available benefits will vary among states and over time.
Many groups of people are covered by Medicaid depending on the state’s requirements (e.g., age, whether pregnant, disabled, blind, or aged, income level and resources, U.S. citizenship or lawful immigration status).
A link to the CMS webpage is in the Resources section.
Where do the Medicaid data come from?
The Medicaid Analytic eXtract (MAX) data are extracted from the Medicaid Statistical Information System (MSIS) data. MSIS is a database of claims that have been submitted, adjusted, and paid by the states. MAX data are organized by CMS into annual calendar year files and include finalized claims. The MAX files, beginning with 1999 data, have been linked with the continuous NHANES files. The range of years of Medicaid data that are linked is described in Course 1, Module 4, Task 2, “Years of NHANES-CMS Linked Data.”
Resources
- National Health and Nutrition Examination Survey (NHANES) webpage
- Centers for Medicare & Medicaid Services (CMS) website
- NCHS Research Data Center webpage
- NCHS-CMS Medicaid Feasibility Files
- National Death Index (NDI) webpage
- NHANES Mortality Data Linkage webpage
- Continuous NHANES Web Tutorial Survey Overview page