Data Sets
Research Data Set Resources
This overview document includes important data sets available for research on healthcare topics.
Caveat for Researchers
Investigators are highly encouraged to review the detailed documentation for any data set being considered. As with many research databases, there may be missing key elements and/or variations in scope or exactness. Be sure you understand the limitations and strengths of the data elements in relation to the research question you are asking.
Medicare Summary Files
Medicare offers researchers extensive aggregated healthcare utilization data, available for download from its website. Data files that anesthesia researchers may find useful include:
Provider Utilization and Payment Data: Physician and Other Supplier
Summarizes Part B allowed services, charges, and payments by National Provider Identifier (NPI) and Healthcare Common Procedure Coding System (HCPCS) code.
Physician/Supplier Procedure Summary
Summarizes Part B claims for allowed services and costs by carrier (can be linked to state), HCPCS, modifier, provider specialty, type of service, and place of service.
Part B National and Carrier Summary Data
Summarizes Part B claims for allowed services and costs at the national and carrier level, by HCPCS and prominent modifiers.
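As a rough illustration of working with a provider-level summary file, the sketch below totals allowed services and payments for each HCPCS code across providers. The column names (`npi`, `hcpcs_code`, `allowed_services`, `total_payment`) and all values are assumptions made for the example; the actual CMS file layouts differ and should be checked against their documentation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract: column names and values are illustrative only,
# not the actual CMS file layout.
SAMPLE = """npi,hcpcs_code,allowed_services,total_payment
1000000001,00142,120,9600.00
1000000001,00300,40,4400.00
1000000002,00142,80,6000.00
"""

def summarize_by_hcpcs(rows):
    """Total allowed services and payments for each HCPCS code across providers."""
    totals = defaultdict(lambda: {"services": 0, "payment": 0.0, "providers": set()})
    for row in rows:
        entry = totals[row["hcpcs_code"]]
        entry["services"] += int(row["allowed_services"])
        entry["payment"] += float(row["total_payment"])
        entry["providers"].add(row["npi"])
    summary = {}
    for code, entry in totals.items():
        services = entry["services"]
        summary[code] = {
            "services": services,
            "payment": entry["payment"],
            "providers": len(entry["providers"]),  # distinct NPIs billing this code
            "payment_per_service": entry["payment"] / services if services else None,
        }
    return summary

summary = summarize_by_hcpcs(csv.DictReader(io.StringIO(SAMPLE)))
for code, stats in sorted(summary.items()):
    print(code, stats)
```

For a real download, the same function could be fed a `csv.DictReader` opened on the file itself, after renaming the keys to match the published column headers.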
Claims are a key data source for detailed healthcare encounters and associated research on cost and quality outcomes. While more complex to access and use, for many purposes they are the best or only data source available.
Medicare Limited Data Set (LDS) Files
Medicare LDS files have been stripped of data elements that might permit identification of beneficiaries. These files contain beneficiary-level health information but exclude specified direct identifiers as outlined in the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. Many files are available, including MEDPAR for inpatient hospital stays and Standard Analytic Files for Inpatient, Outpatient, etc. Accessing LDS data requires submitting a Data Use Agreement that requests specific data files and paying the associated costs.
All-Payer Claims Database (APCD)
The APCD is supported by the APCD Council and represents a series of state-level efforts to compile medical claims from all payers using consistent standards. Arkansas, Colorado, Delaware, Massachusetts, Maine, Minnesota, New Hampshire, Oregon, Rhode Island, Utah, and Vermont have procedures for requesting data. A formal application and approval are generally required.
Healthcare Cost and Utilization Project (HCUP)
HCUP (pronounced “H-Cup”) is a family of health care databases and related software tools and products developed through a Federal-State-Industry partnership and sponsored by the Agency for Healthcare Research and Quality. HCUP databases bring together the data collection efforts of State data organizations, hospital associations, private data organizations, and the Federal government to create a national information resource of encounter-level health care data. In general, HCUP is an excellent resource for all-payer cost and regional healthcare analysis. The datasets also offer a wealth of information on surgery, diagnoses, and comorbidities that supports some quality, cost, and outcomes research. A Data Use Agreement (DUA) is required for all datasets.
National Inpatient Sample (NIS)
The NIS is a national sample of hospital discharge records representing approximately 20% of all hospital stays. All payment sources are represented. Data can be used to study healthcare utilization, cost, quality, access, and outcomes. The unit of analysis is the hospital discharge, so multiple hospitalizations for the same patient cannot be linked over time. Data elements include ICD-9 diagnoses and procedures, charges, payment, demographics, severity and comorbidities, discharge status, and hospital characteristics. Data have been harmonized across available states, resulting in some missing values for certain fields.
State Inpatient Databases (SID) and State Ambulatory Surgery and Services Databases (SASD)
The SID is a state-level database of hospital discharge records representing the full universe of discharges in participating states. The SASD includes ambulatory surgeries and services. Data elements vary by state but generally include significant patient detail, including ICD-9 diagnoses and procedures, charges, payment, demographics, and admission/discharge status.
Nationwide Readmissions Database (NRD)
The NRD is based on the SID but retains the essential patient linkage and timing measures needed for readmission analyses.
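To show why patient linkage and timing measures matter, here is a minimal sketch of flagging 30-day readmissions. It assumes each record carries a patient linkage key, a relative admission-timing value in days, and a length of stay; the field names and records are hypothetical, and the 30-day window is just a common choice. A real analysis would also need to handle transfers, index-event definitions, and the database's own documented variables.

```python
from collections import defaultdict

# Hypothetical records: (patient_link, days_to_event, length_of_stay).
# days_to_event is a per-patient relative timing value for the admission,
# not a calendar date. All values are made up for illustration.
DISCHARGES = [
    ("A", 100, 3),
    ("A", 120, 2),  # admitted 17 days after the prior discharge
    ("A", 200, 1),  # admitted 78 days after the prior discharge
    ("B", 50, 5),
]

def flag_readmissions(discharges, window_days=30):
    """Return (patient, index_admission_day, gap_days) for each stay that is
    followed by another admission within window_days of its discharge."""
    by_patient = defaultdict(list)
    for patient, start, los in discharges:
        by_patient[patient].append((start, los))

    flagged = []
    for patient, stays in by_patient.items():
        stays.sort()  # order each patient's stays by admission timing
        for (start, los), (next_start, _) in zip(stays, stays[1:]):
            gap = next_start - (start + los)  # days from discharge to next admission
            if 0 <= gap <= window_days:
                flagged.append((patient, start, gap))
    return flagged

readmissions = flag_readmissions(DISCHARGES)
print(readmissions)
```

Without a linkage key, as in a discharge-level file, the grouping step is impossible, which is why readmission work needs a dataset that retains it.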
Provider Distribution Data Sources
Physician Compare National Downloadable File
Includes demographic information, practice locations, and quality program participation for providers eligible for participation in Medicare. This is one of the best data sources for assessing provider availability by geography.
Area Health Resources Files (AHRF)
The AHRF dataset includes demographic and population characteristics, utilization, and healthcare workforce summary data at the county level, which can be aggregated and linked to other data sources using various geographic identifiers. The AHRF is released annually with data from 50+ sources, including the American Hospital Association, Census Bureau, Bureau of Labor Statistics, American Medical Association, and others.
Anesthesia in AHRF: County-level counts of both CRNAs/nurse anesthesiologists (from NPPES) and anesthesiologists (from the AMA survey) are available.
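The linking and aggregation described above might look like the following sketch. It assumes counties are keyed by a 5-digit county FIPS code, whose first two digits identify the state; the field names, the second data source, and all counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical county-level rows keyed by 5-digit county FIPS code.
# All counts and populations are made up for illustration.
PROVIDER_COUNTS = {
    "01001": {"crna": 4, "anesthesiologist": 2},
    "01003": {"crna": 10, "anesthesiologist": 12},
    "02020": {"crna": 6, "anesthesiologist": 20},
}
POPULATION = {"01001": 50000, "01003": 200000, "02020": 300000}

def link_counties(providers, population):
    """Join provider counts to population on county FIPS and add a rate."""
    linked = {}
    for fips, counts in providers.items():
        if fips not in population:
            continue  # skip counties missing from the second source
        total = counts["crna"] + counts["anesthesiologist"]
        linked[fips] = {
            "crna": counts["crna"],
            "anesthesiologist": counts["anesthesiologist"],
            "population": population[fips],
            "per_100k": 100000 * total / population[fips],
        }
    return linked

def aggregate_to_state(linked):
    """Roll county rows up to state level using the 2-digit FIPS prefix."""
    fields = ("crna", "anesthesiologist", "population")
    states = defaultdict(lambda: dict.fromkeys(fields, 0))
    for fips, row in linked.items():
        for field in fields:
            states[fips[:2]][field] += row[field]
    return dict(states)

linked = link_counties(PROVIDER_COUNTS, POPULATION)
states = aggregate_to_state(linked)
print(states)
```

Rates are recomputed from summed counts rather than averaged across counties, since averaging county rates would weight small and large counties equally.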
National Plan and Provider Enumeration System (NPPES)
The NPPES Downloadable File is a downloadable directory of all National Provider Identifiers (NPIs). NPIs identify individuals or organizations with a unique ID as required by CMS under federal law. However, many providers do not regularly update their NPI record and therefore this data source may not always have accurate practice location or credential information.
Anesthesia in the NPPES: Provider taxonomy code is available.
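As one way the taxonomy code could be used, the sketch below picks out anesthesia providers from a provider directory extract. The two taxonomy codes are believed to be the NUCC codes for anesthesiology physicians and certified registered nurse anesthetists, but they should be verified against the current code set. The column names are simplified stand-ins, not the actual NPPES headers, and the NPIs are made up.

```python
import csv
import io

# Taxonomy codes to match; verify against the current NUCC code set before use.
ANESTHESIA_TAXONOMIES = {
    "207L00000X": "Anesthesiology (physician)",
    "367500000X": "Nurse Anesthetist, Certified Registered",
}

# Hypothetical extract: simplified column names and made-up NPIs.
# A provider record may list more than one taxonomy code.
SAMPLE = """npi,taxonomy_code_1,taxonomy_code_2
1000000001,207L00000X,
1000000002,363L00000X,367500000X
1000000003,208D00000X,
"""

def anesthesia_providers(rows, taxonomy_columns):
    """Map each NPI to the anesthesia taxonomy codes found on its record."""
    matches = {}
    for row in rows:
        # Collect non-empty codes from every taxonomy column on the record.
        codes = {row[col] for col in taxonomy_columns if row.get(col)}
        hits = sorted(codes.intersection(ANESTHESIA_TAXONOMIES))
        if hits:
            matches[row["npi"]] = hits
    return matches

found = anesthesia_providers(
    csv.DictReader(io.StringIO(SAMPLE)),
    taxonomy_columns=["taxonomy_code_1", "taxonomy_code_2"],
)
print(found)
```

Checking every taxonomy column, not only the first, avoids missing providers whose anesthesia credential is listed as a secondary taxonomy.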
Other Sources
National Practitioner Data Bank (NPDB)
The NPDB includes records of malpractice awards and other actions taken against licensed healthcare practitioners by state licensing boards, hospitals, professional societies, the DEA, or the Department of Health and Human Services (HHS).
Bureau of Labor Statistics
The BLS has data available on occupational employment trends by industry and geographic profiles. BLS data also include average wages for various categories of employed providers.
Dartmouth Atlas Project
The Dartmouth Atlas Project offers resources for analyzing geographic disparities in healthcare utilization. There are tools for visualizing trends and downloading data on many healthcare topics.
Kaiser Family Foundation
The Kaiser Family Foundation produces aggregated state-level data tables relevant for research and policy analysis.