NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Center for Health Statistics (US). Health, United States, 2010: With Special Feature on Death and Dying. Hyattsville (MD): National Center for Health Statistics (US); 2011 Feb.

Health, United States, 2010: With Special Feature on Death and Dying.

Appendix I: Data Sources

Health, United States consolidates the most current data on the health of the population of the United States, the availability and use of health resources, and health care expenditures. Information was obtained from the data files and published reports of many federal government, private, and global agencies and organizations. In each case, the sponsoring agency or organization collected data using its own methods and procedures. Therefore, data in this report may vary considerably with respect to source, method of collection, definitions, and reference period.

Although a detailed description and comprehensive evaluation of each data source are beyond the scope of this appendix, readers should be aware of the general strengths and weaknesses of the different data collection systems. For example, population-based surveys obtain socioeconomic data, data on family characteristics, and information on the impact of an illness, such as days lost from work or limitation of activity. These data are limited by the amount of information a respondent remembers or is willing to report. For example, a respondent may not know detailed medical information, such as a precise diagnosis or the type of procedure performed, and therefore cannot report that information. In contrast, records-based surveys, which collect data from physician and hospital records, usually contain good diagnostic information but little or no information about the socioeconomic characteristics of individuals or the impact of illnesses on individuals.

Different data collection systems may cover different populations, and understanding these differences is critical to interpreting the resulting data. Data on vital statistics and national expenditures cover the entire population. However, most data on morbidity and the utilization of health resources cover only the civilian noninstitutionalized population and thus may not include data for military personnel, who are usually young; for institutionalized people, including the prison population, who may be of any age; or for nursing home residents, who are usually older.

All data collection systems are subject to error, and records may be incomplete or contain inaccurate information. Respondents may not remember essential information, a question may not mean the same thing to different respondents, and some institutions or individuals may not respond at all. It is not always possible to measure the magnitude of these errors or their effect on the data. Where possible, table notes describe the universe and method of data collection to assist users in evaluating data quality.

Some information is collected in more than one survey, and estimates of the same statistic may vary among surveys because of different survey methodologies, sampling frames, questionnaires, definitions, and tabulation categories. For example, cigarette use is measured by the National Health Interview Survey, the National Survey on Drug Use & Health, the Monitoring the Future Survey, and the Youth Risk Behavior Survey. These surveys use slightly different questions, cover persons of differing ages, and interview in diverse settings (e.g., at school compared with at home), so estimates will differ.

Overall estimates generally have relatively small sampling errors, but estimates for certain population subgroups may be based on a small sample size and have relatively large sampling errors. Numbers of births and deaths from the National Vital Statistics System (NVSS) represent complete counts (except for births in those states where data are based on a 50% sample for certain years). Therefore, these data are not subject to sampling error. However, when the figures are used for analytical purposes, such as the comparison of rates over a period, the number of events that actually occurred may be considered as one of a large series of possible results that could have arisen under the same circumstances. When the number of events is small and the probability of such an event is rare, estimates may be unstable, and considerable caution must be used in interpreting the statistics. Estimates that are unreliable because of large sampling errors or small numbers of events are noted with asterisks in tables, and the criteria used to designate unreliable estimates are indicated in an accompanying footnote.
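The instability of rates based on small numbers of events can be illustrated with a minimal sketch. It assumes that an event count behaves like a Poisson variable, so that its relative standard error (RSE) is 1 divided by the square root of the count. The function names and the 23% cutoff are hypothetical and chosen only for illustration; the criteria actually used for a given table are those stated in its footnote.

```python
import math


def relative_standard_error(events: int) -> float:
    """RSE (as a fraction) of an event count treated as a Poisson variable."""
    if events <= 0:
        raise ValueError("events must be a positive count")
    return 1.0 / math.sqrt(events)


def is_unreliable(events: int, max_rse: float = 0.23) -> bool:
    """Flag a count whose RSE exceeds max_rse (illustrative cutoff, an assumption)."""
    return relative_standard_error(events) > max_rse


if __name__ == "__main__":
    # Smaller counts give larger relative errors.
    for n in (10, 100, 10_000):
        print(n, round(relative_standard_error(n), 3), is_unreliable(n))
```

Under this assumption, a count of 10 events has an RSE of about 32%, whereas a count of 10,000 has an RSE of 1%.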

In this appendix, government data sources are listed alphabetically by data set name, and private and global sources are listed separately. To the extent possible, government data systems are described using a standard format. The Overview is a brief, general statement about the purpose or objectives of the data system. The Selected Content section lists major data elements that are collected or estimated using interpolation or modeling. The Data Years section gives the years that the survey or data system has existed or been fielded. The Coverage section describes the population that the data system represents: for example, residents of the United States, the noninstitutionalized population, persons in specific population groups, or other entities that make up the survey. The Methodology section presents a short description of the methods used to collect data. Sample size and response rates are given for surveys. The Issues Affecting Interpretation section describes major changes in the data collection methodology or other factors that must be considered when analyzing trends: for example, a major survey redesign that may introduce a discontinuity in the trend. For additional information about the methodology, data files, and history of a data source, consult the References and For More Information sections that follow each summary.

Government Sources

Abortion Surveillance System

CDC/National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP)

Overview. The Abortion Surveillance Program documents the number and characteristics of women obtaining legal induced abortions, monitors unintended pregnancy, and assists efforts to identify and reduce preventable causes of morbidity and mortality associated with abortions.

Selected Content. Content includes age, race/ethnicity, marital status, previous live births, period of gestation, and previous induced abortions of women obtaining legal induced abortions.

Data Years. Between 1973 and 1997, the number of abortions is based on reporting from 52 reporting areas: 50 states, the District of Columbia, and New York City. In 1998 and 1999, CDC compiled abortion data from 48 reporting areas. Alaska, California, New Hampshire, and Oklahoma did not report, and data for these areas were not estimated. In 2000–2004, CDC compiled data from 49 reporting areas. Alaska, California, and New Hampshire did not report abortion data to CDC in 2000–2002. In 2003 and 2004, California, New Hampshire, and West Virginia did not report. In 2005 and 2006, California, Louisiana, and New Hampshire did not report.

Coverage. The system includes women of all ages, including adolescents, who obtain legal induced abortions.

Methodology. Starting with 2000 data, the number and characteristics of women who obtain legal induced abortions are provided for 49 reporting areas by central health agencies, such as state health departments and the health departments of New York City and the District of Columbia, and by hospitals and other medical facilities. In general, the procedures are reported by the state in which the procedure is performed (i.e., state of occurrence). Although the total number of legal induced abortions is available for those 49 reporting areas, not all areas collect information on the characteristics of women who obtain abortions. The number of areas reporting each characteristic and the number of areas with complete data for each characteristic vary from year to year. For example, in 2005 the number of areas reporting a given characteristic ranged from 28 areas with adequate data for the Office of Management and Budget (OMB) recommended race categories (accounting for 39% of the total number of reported abortions) to 48 areas reporting age; 30 areas reported adequate data on Hispanic ethnicity, and 43 areas reported marital status. Data from reporting areas with more than 15% unknown for a given characteristic are excluded from the analysis of that characteristic.
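The 15% exclusion rule described above can be sketched as a simple filter. The area names, the shares of unknown values, and the function name below are hypothetical and are not taken from the surveillance data; the sketch only shows how the rule selects areas for the analysis of one characteristic.

```python
def areas_for_analysis(unknown_share_by_area, max_unknown=0.15):
    """Return the areas retained for one characteristic.

    An area is excluded when more than max_unknown of its records
    have an unknown value for that characteristic.
    """
    return sorted(
        area
        for area, share in unknown_share_by_area.items()
        if share <= max_unknown
    )


if __name__ == "__main__":
    # Hypothetical shares of records with unknown marital status.
    unknown_marital_status = {"Area A": 0.02, "Area B": 0.15, "Area C": 0.31}
    print(areas_for_analysis(unknown_marital_status))
```

An area with exactly 15% unknown is retained, because the rule excludes only areas with more than 15% unknown.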

Issues Affecting Interpretation. The drug mifepristone for medical abortion was approved in September 2000 by the U.S. Food and Drug Administration (FDA) for distribution and use in the United States. The percentage of medical abortions increased from 1% in 2000 to 10% in 2005. Between 1989 and 1997, the total number of abortions reported to CDC was about 10% less than the total estimated independently by the Guttmacher Institute (previously, the Alan Guttmacher Institute, or AGI), a not-for-profit organization for reproductive health research, policy analysis, and public education. Between 1998 and 2005, the total number of abortions reported to CDC was about 34% less than the total estimated by Guttmacher. The three reporting areas (the largest of which was California) that did not report abortions to CDC in 2005 accounted for 18% of all abortions tallied by Guttmacher's 2005 survey. (Also see Appendix I, Guttmacher Institute Abortion Provider Census.)

Reference
For More Information

AIDS Surveillance

CDC/National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention (NCHHSTP)

Overview. Acquired immunodeficiency syndrome (AIDS) surveillance data are used to detect and monitor cases of human immunodeficiency virus (HIV) disease and AIDS in the United States, identify epidemiologic trends, identify unusual cases requiring follow-up, and inform public health efforts to prevent and control the disease.

Selected Content. Data collected on cases diagnosed with AIDS include age, sex, race/ethnicity, mode of exposure, and geographic region.

Data Years. Reports on AIDS cases are available from the beginning of the epidemic that started in 1981.

Coverage. All 50 states, the District of Columbia (D.C.), U.S. dependencies and possessions, and independent nations in free association with the United States report AIDS cases to CDC using a uniform surveillance case definition and case report form. As of April 2008, all states had implemented confidential, name-based HIV infection reporting.

Methodology. AIDS surveillance is conducted by health departments in each state or territory and D.C. Although surveillance activities range from passive to active, most areas employ multifaceted active surveillance programs, which include four major reporting sources of AIDS information: hospitals and hospital-based physicians, physicians in nonhospital practice, public and private clinics, and medical record systems (death certificates, tumor registries, hospital discharge abstracts, and communicable disease reports). Using a standard confidential case report form, the health departments collect information that is then transmitted electronically, without personal identifiers, to CDC.

Adjustments of the estimated data on HIV infection (not AIDS) and AIDS to account for reporting delays are calculated by a maximum likelihood statistical procedure that takes into account the differences in reporting delays among exposure, geographic, racial/ethnic, age, sex, and vital status categories and is based on the assumption that reporting delays in these categories have not changed over time. AIDS surveillance data are provisional and are updated annually.

Issues Affecting Interpretation. Although the completeness of reporting of AIDS cases to state and local health departments differs by geographic region and patient population, studies conducted by state and local health departments indicate that the reporting of AIDS cases in most areas of the United States is more than 85% complete. To assess trends in AIDS cases, deaths, and prevalence, it is preferable to use case data adjusted for reporting delays and presented by year of diagnosis, rather than straight counts of cases presented by year of report.

The definition of AIDS was modified in 1985 and 1987. The case definition for adults and adolescents was modified again in 1993. The revisions incorporated a broader range of AIDS-indicator diseases and conditions and used HIV diagnostic tests to improve the sensitivity and specificity of the definition. Laboratory and diagnostic criteria for the 1987 pediatric case definition were updated in 1994. Effective January 2000, the surveillance case definition for HIV infection was revised to reflect advances in laboratory HIV virologic tests. The definition incorporates the reporting criteria for HIV infection and AIDS into a single case definition for adults and children.

In 2008, changes were made to the case definition for HIV infection. The new case definition combined the two previous case definitions for HIV and AIDS and established a new disease staging classification. This change prompted changes to the title of the report and the use of new terminology (“diagnoses of HIV infection” and “AIDS diagnoses”) throughout the report. The term “HIV/AIDS,” previously used to refer to a new diagnosis of HIV infection regardless of the person’s disease stage at the time of diagnosis, was replaced with the term “diagnosis of HIV infection” to reflect implementation of the revised case definition.

Decreases in AIDS incidence and in the number of AIDS deaths, first noted in 1996, have been ascribed to the effect of new treatments, which prevent or delay the onset of AIDS and premature death among HIV-infected persons and result in an increase in the number of persons living with HIV and AIDS. A growing number of states require confidential reporting of persons with HIV infection and participate in CDC's integrated HIV/AIDS surveillance system that compiles information on the population of persons newly diagnosed and living with HIV infection.

Reference
For More Information

Census of Fatal Occupational Injuries (CFOI)

Bureau of Labor Statistics (BLS)

Overview. CFOI compiles comprehensive and timely information on fatal work injuries occurring in the 50 states and the District of Columbia (D.C.), to monitor workplace safety and inform private and public health efforts to improve workplace safety.

Selected Content. Information is collected about each workplace fatality, including occupation and other worker characteristics, equipment involved, and circumstances of the event.

Data Years. Data have been collected annually since 1992.

Coverage. The data cover all 50 states and D.C.

Methodology. CFOI is administered by BLS, in conjunction with participating state agencies, to compile counts that are as complete as possible to identify, verify, and profile fatal work injuries. Key information about each workplace fatality (occupation and other worker characteristics, equipment or machinery involved, and circumstances of the event) is obtained by cross-referencing source records. For a fatality to be included in the census, the decedent must have been employed (that is, working for pay, compensation, or profit) at the time of the event, engaged in a legal work activity, or present at the site of the incident as a requirement of his or her job. These criteria are generally broader than those used by federal and state agencies administering specific laws and regulations. Fatalities that occur during a person's commute to or from work are excluded from the census counts. Fatalities to volunteer workers who are exposed to the same work hazards and perform the same duties or functions as paid employees and that meet the CFOI work relationship criteria are included.

Data for CFOI are compiled from various federal, state, and local administrative sources including death certificates, workers' compensation reports and claims, reports to various regulatory agencies, medical examiner reports, police reports, and news reports. Diverse sources are used because studies have shown that no single source captures all job-related fatalities. Source documents are matched so that each fatality is counted only once. To ensure that a fatality occurred while the decedent was at work, information is verified from two or more independent source documents or from a source document and a follow-up questionnaire.

Denominator data for the calculation of fatal injury rates are provided by the Current Population Survey (CPS). CPS and CFOI differ in scope. Where these differences occur, CFOI-adjusted fatal injury counts are used in calculating the rates, to maintain consistency between the rate numerator (number of fatal injuries) and the denominator (annual average employment and/or average hours at work). Workers under 16 years of age are excluded from fatal injury rate data. Starting with 2008 data, volunteers and military personnel also are excluded. Volunteers and military personnel are not included in the CPS data, and CFOI has been unable to obtain reliable hours-worked data for these groups.

Issues Affecting Interpretation. The numbers of occupational fatalities and the fatality rates are revised periodically. States have up to 8 months to update their initial published counts and may identify additional fatal work injuries after data collection has closed for a reference year. Fatalities initially excluded from the published count because of insufficient information to determine work relationship may subsequently be verified as work-related and included in the revised counts and rates. Increases in the published counts over the last 5 years based on additional information have averaged approximately 110 fatalities per year, or less than 2% of the annual total.

Beginning with 2003 data, CFOI has used the North American Industry Classification System (NAICS) to classify industries. Prior to 2003, the program used the Standard Industrial Classification (SIC) system and the U.S. Census Bureau’s occupational classification system. Although some titles in SIC and NAICS are similar, there is limited comparability between the two systems because the industry groupings are defined differently. (See Appendix II, Industry of employment.)

Starting with 2008 data, fatal injury rates presented in Health, United States are based on hours, rather than employment, and consequently are not directly comparable with earlier injury rate data. Hours-based rates standardize the amount of exposure and are considered more accurate than employment-based rates. Hours-based rates use the average number of employees at work and the average hours each employee works. Employment- and hours-based rates will be similar for groups of workers who usually work full time. Differences in these rates are more likely for groups of workers who have a high percentage of part-time workers, like younger workers. Hours-worked data are provided by CPS. For more information, see: http://www.bls.gov/iif/oshnotice10.htm.
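The difference between the two kinds of rates can be illustrated with a minimal sketch. It assumes that rates are expressed per 100,000 workers (employment-based) or per 100,000 full-time equivalent workers (hours-based), with a full-time equivalent taken to be 2,000 hours per year (40 hours per week for 50 weeks). That convention, the function names, and the input figures are assumptions made for illustration and are not stated in this appendix.

```python
FTE_HOURS_PER_YEAR = 2_000  # assumed: 40 hours/week x 50 weeks/year


def employment_based_rate(fatalities, employed):
    """Fatal injuries per 100,000 workers."""
    return fatalities / employed * 100_000


def hours_based_rate(fatalities, employed, avg_annual_hours):
    """Fatal injuries per 100,000 full-time equivalent workers."""
    total_hours = employed * avg_annual_hours
    return fatalities / total_hours * FTE_HOURS_PER_YEAR * 100_000


if __name__ == "__main__":
    # Hypothetical group: 50 fatalities among 1,000,000 workers.
    for hours in (2_000, 1_000):  # full-time vs. half-time on average
        emp = employment_based_rate(50, 1_000_000)
        hrs = hours_based_rate(50, 1_000_000, hours)
        print(hours, round(emp, 2), round(hrs, 2))
```

In this sketch the two rates agree for a group that works full time on average, while the hours-based rate is twice the employment-based rate for a group that averages half-time hours, which is the pattern described above for groups with many part-time workers.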

Reference
For More Information

Consumer Price Index (CPI)

Bureau of Labor Statistics (BLS)

Overview. The CPI is designed to produce a monthly measure of the average change in the prices paid by urban consumers for a fixed market basket of goods and services.

Selected Content. Price indexes are available for the United States, the four census regions, size of city, cross-classifications of regions and size-classes, and 26 local areas. For other local areas, data are bimonthly or semiannual. Indexes are available for major groups of consumer expenditures (food and beverages, housing, apparel, transportation, medical care, recreation, education and communications, and other goods and services), for items within each group, and for special categories such as services. Monthly indexes are available for the United States, the four census regions, and some local areas. More detailed item indexes are available for the United States than for regions and local areas. Indexes are available for two population groups: a CPI for All Urban Consumers (CPI–U), which covers approximately 87% of the total population; and a CPI for Urban Wage Earners and Clerical Workers (CPI–W), which covers 32% of the population.

Data Years. Data are available back to 1913. Prior to 1978, the data are based on the CPI–W population.

Coverage. The all-urban index (CPI–U), introduced in 1978, covers residents of metropolitan areas and residents of urban parts of nonmetropolitan areas (about 87% of the U.S. population in 2000).

Methodology. In calculating the index, price changes for the various items in each location are averaged together with weights that represent their importance in the spending of all urban consumers. Local data are aggregated to obtain a U.S. city average.

The index measures price changes from a designated reference date, 1982–1984, which equals 100. An increase of 22%, for example, is shown as 122. Change can also be expressed in dollars; for example, the price of a base period market basket of goods and services bought by all urban consumers has risen from $100 in 1982–1984 to $215 in 2008.
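The index arithmetic described above can be sketched in a few lines. The weighted average shown here is a simplified arithmetic mean with hypothetical weights and item indexes; it is not the actual BLS aggregation formula. All function names are illustrative.

```python
def weighted_index(item_indexes, weights):
    """Simplified weighted arithmetic mean of item indexes (illustrative only)."""
    if len(item_indexes) != len(weights):
        raise ValueError("item_indexes and weights must have the same length")
    return sum(i * w for i, w in zip(item_indexes, weights)) / sum(weights)


def percent_change(index_start, index_end):
    """Percent change in prices between two index values."""
    return (index_end / index_start - 1) * 100


def basket_cost(base_cost, index_value, base_index=100.0):
    """Cost of the base-period market basket at a later index value."""
    return base_cost * index_value / base_index


if __name__ == "__main__":
    # An index of 122 relative to the 1982-1984 base of 100 is a 22% increase.
    print(round(percent_change(100, 122), 1))
    # A basket costing $100 in the base period at an index value of 215.
    print(round(basket_cost(100, 215), 2))
    # Hypothetical item indexes and expenditure weights.
    print(round(weighted_index([110, 130], [0.75, 0.25]), 1))
```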

The CPI currently reflects spending patterns based on the Survey of Consumer Expenditures from 2007–2008, the 1990 Census of Population, and the ongoing Point-of-Purchase Survey. Using an improved sample design, prices for the goods and services required to calculate the index are collected in urban areas throughout the country and from retail and service establishments. Data on rents are collected from tenants of rented housing and residents of owner-occupied housing units. Food, fuels, and other goods and services are priced monthly in urban locations. Price information is obtained through visits or calls by trained BLS field representatives using computer-assisted telephone interviews.

Issues Affecting Interpretation. A 1987 revision changed the treatment of health insurance in the cost–weight definitions for medical care items. This change has no effect on the overall index result but provides a clearer picture of the role of health insurance in the CPI. As part of the revision, three new indexes were created by separating previously combined items; for example, eye care is separated from other professional services, and inpatient and outpatient treatment are separated from other hospital and medical care services.

Effective January 1997, the hospital index was restructured by combining the three categories (room, inpatient services, and outpatient services) into one category: hospital services. In addition, new procedures for hospital data collection identify a payor, diagnosis, and the payor's reimbursement arrangement from selected hospital bills.

References
  • Bureau of Labor Statistics. BLS handbook of methods. Washington, DC: U.S. Department of Labor; 1997. BLS bulletin no 2490.
  • Bureau of Labor Statistics. Revising the Consumer Price Index. Mon Labor Rev 1996;119(12).
  • Ford IK, Ginsburg DH. Medical care in the Consumer Price Index. In: Cutler DM, Berndt ER, editors. Medical care output and productivity. National Bureau of Economic Research studies in income and wealth, vol 62. Chicago, IL: University of Chicago Press; 2001. pp. 203–19.
For More Information

Current Population Survey (CPS)

Bureau of Labor Statistics (BLS) and U.S. Census Bureau

Overview. CPS provides current estimates and trends in employment, unemployment, and other characteristics of the general labor force, the population as a whole, and various population subgroups.

Selected Content. The CPS interview is divided into three basic parts: (a) household and demographic information, (b) labor force information, and (c) supplement information for months that include supplements. Comprehensive work experience information is gathered on the employment status, occupation, and industry of persons interviewed.

Estimates of poverty and health insurance coverage presented in Health, United States from CPS are derived from the Annual Social and Economic Supplement (ASEC), formerly called the Annual Demographic Supplement (ADS) and commonly called the March Supplement. ASEC collects data on family characteristics, household composition, marital status, migration, income from all sources, information on weeks worked, time spent looking for work or on layoff from a job, occupation and industry classification of the job held longest during the year, health insurance coverage, and receipt of noncash benefits such as food stamps, school lunch program, employer-provided group health insurance plan, employer-provided pension plan, personal health insurance, Medicaid, Medicare, CHAMPUS or military health care, and energy assistance.

Data Years. The basic CPS has been conducted since 1945, although some data were collected prior to that time. The U.S. Census Bureau has collected data in the ASEC or ADS since 1947.

Coverage. The 2000-based basic CPS sample was introduced in April 2004, and implementation was completed by July 2005 with coverage in every state and the District of Columbia. The adult universe (i.e., the population of marriageable age) is composed of persons 15 years of age and over in the civilian noninstitutionalized population for CPS labor force data. The sample for the March CPS supplement is expanded to include members of the Armed Forces who are living in a household that includes at least one civilian adult, as well as additional Hispanic households that are not included in the monthly labor force estimates.

Methodology. The basic CPS sample is selected from multiple frames using multiple stages of selection. Each unit is selected with a known probability to represent similar units in the universe. The sample design is state-based, with the sample in each state being independent of the others.

One person generally responds for all eligible members of a household. For those who are employed, employment information is collected on the job held in the reference week. The reference week is defined as the 7-day period, Sunday through Saturday, that includes the 12th of the month. In CPS, a person with two or more jobs is classified according to the job at which he or she worked the greatest number of hours. In general, the BLS publishes labor force data only for persons 16 years of age and over because those under 16 are substantially limited in their labor market activities by compulsory schooling and child labor laws. No upper age limit is used, and full-time students are treated the same as nonstudents.
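The reference week rule stated above (the Sunday-through-Saturday week that contains the 12th of the month) can be computed directly. The sketch below applies only that stated rule; the function name is illustrative, and any exceptions to the rule that the survey may apply in particular months are not modeled.

```python
from datetime import date, timedelta


def reference_week(year, month):
    """Return (Sunday, Saturday) of the week containing the 12th of the month."""
    twelfth = date(year, month, 12)
    # date.weekday() returns Monday=0 ... Sunday=6, so this is the
    # number of days since the most recent Sunday (0 if the 12th is a Sunday).
    days_since_sunday = (twelfth.weekday() + 1) % 7
    start = twelfth - timedelta(days=days_since_sunday)
    return start, start + timedelta(days=6)


if __name__ == "__main__":
    start, end = reference_week(2010, 3)
    print(start.isoformat(), end.isoformat())
```

For March 2010, when the 12th fell on a Friday, the function returns Sunday, March 7, through Saturday, March 13.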

The additional Hispanic sample is from the previous November's basic CPS sample. If a person is identified as being of Hispanic origin from the November interview and is still residing at the same address in March, that housing unit is eligible for the March survey. This amounts to a near doubling of the Hispanic sample because there is no overlap of housing units between the basic CPS samples in November and March.

For all CPS data files, a single weight is prepared and used to compute the monthly labor force status estimates. An additional weight is prepared for the earnings universe that roughly corresponds to wage and salary workers in the two outgoing rotations. The final weight is the product of the basic weight, the adjustments for special weighting, the noninterview adjustment, the first-stage ratio adjustment factor, and the second-stage ratio adjustment factor. This final weight should be used when producing estimates from the basic CPS data. Differences in the questionnaire, sample, and data uses for the March CPS supplement result in the need for additional adjustment procedures to produce what is called the March Supplement weight.
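The construction of the final weight as a product of factors can be sketched as follows. The parameter names mirror the factors listed above, but the function itself and the numeric values are hypothetical and do not come from CPS documentation.

```python
from math import prod


def final_weight(
    basic_weight,
    special_weighting=1.0,
    noninterview=1.0,
    first_stage_ratio=1.0,
    second_stage_ratio=1.0,
):
    """Final weight as the product of the basic weight and its adjustment factors."""
    return prod(
        [
            basic_weight,
            special_weighting,
            noninterview,
            first_stage_ratio,
            second_stage_ratio,
        ]
    )


if __name__ == "__main__":
    # Hypothetical factors for one sample household.
    print(final_weight(1_000, special_weighting=2.0, noninterview=1.5))
```

A factor of 1.0 leaves the weight unchanged, so a household that needs no special weighting or ratio adjustment keeps its basic weight.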

Sample Size and Response Rate. Beginning with 2001, the Children's Health Insurance Program (CHIP) sample expansion was introduced. This included an increase in the basic CPS sample to 60,000 households per month. Prior to 2001, estimates were based on 50,000 households per month. The expansion also included an additional 12,000 households that were allocated differentially across states, based on prior information of the number of uninsured children in each state, to produce statistically reliable current state data on the number of low-income children who do not have health insurance coverage. In an average month, the nonresponse rate for the basic CPS is about 7%–8%.

Issues Affecting Interpretation. Over the years, the number of income questions has expanded, questions on work experience and other characteristics have been added, and the month of interview was moved to March. In 2002, an ASEC sample increase was implemented, requiring more time for data collection. Thus, additional ASEC interviews are now taking place in February and April. However, even with this sample increase, most of the data collection still occurs in March.

In 1994, major changes were introduced that included a complete redesign of the questionnaire to include new health insurance questions and the introduction of computer-assisted interviewing for the entire survey. In addition, some of the labor force concepts and definitions were revised. Prior to the redesign, CPS data were primarily collected using a paper-and-pencil form. Beginning in 1994, population controls were based on the 1990 census and adjusted for the estimated population undercount. Starting with Health, United States, 2003, poverty estimates for data years 2000 and beyond were recalculated based on the expanded CHIP sample, and Census 2000-based population controls were implemented. Starting with 2002 health insurance data, 1997 race standards were implemented that allowed respondents to report more than one race.

Reference
For More Information

Department of Veterans Affairs National Patient Care Database, Patient Treatment File, and National Enrollment Database

Department of Veterans Affairs (VA)

Overview. The VA compiles and analyzes multiple data sets on the health and health care of its clients and other veterans to monitor access and quality of care and to conduct program and policy evaluations.

Selected Content. The VA maintains the National Patient Care Database (NPCD), the Patient Treatment File (PTF), and the National Enrollment Database (NED).

The NPCD and PTF are nationwide systems that contain a statistical record for each episode of care provided under VA auspices, in VA and non-VA hospitals, nursing homes, VA residential rehabilitation treatment programs (formerly called domiciliaries), and VA outpatient clinics. Three major extracts are the PTF, the Patient Census File (PCF), and the NPCD.

The PTF collects data at the time of the patient's discharge on each episode of inpatient care provided to patients at VA hospitals, VA nursing homes, VA residential rehabilitation treatment programs, community nursing homes, and other non-VA facilities. The PTF record contains unique patient identifiers, dates of inpatient treatment, date of birth, state and county of residence, type of disposition, place of disposition after discharge, and International Classification of Diseases, 9th Revision, Clinical Modification (ICD–9–CM) diagnostic and procedure or operative codes for each episode of care.

The PCF collects data on each patient remaining in a VA medical facility at midnight at the end of each quarter of the fiscal year. The census record includes information similar to that reported in the PTF record.

The NPCD collects data on each instance of medical treatment provided to a veteran in an outpatient setting. The NPCD record includes the age, unique patient identifiers, state and county of residence, VA eligibility code, clinic(s) visited, purpose of visit, and date of visit for each episode of care.

The VA also maintains the NED as the official repository of enrollment information for each veteran enrolled in the VA health care system.

Coverage. U.S. veterans who receive services within the VA medical system are included. Data are available for some nonveterans who receive care at VA facilities.

Methodology. The NPCD and PTF are the source data for the Veterans Health Administration (VHA) Medical SAS Datasets. The NPCD and PTF are also the VHA's centralized relational databases (a data warehouse) that receive encounter data from VHA clinical information systems. The databases are updated daily. Data are collected locally at each VA medical center and transmitted electronically to the VA's Austin Automation Center for use in providing nationwide statistics, reports, and comparisons.

Issues Affecting Interpretation. The databases include users of the VA health care system. VA eligibility is a hierarchy based on service-connected disabilities, income, age, and availability of services. Therefore, different VA programs may serve populations with different sociodemographic characteristics than those served by other health care systems.

For More Information: See the VA Information Resource Center website at: http://www.virec.research.va.gov/Support/Training-NewUsersToolkit/IntroToVAData.htm.

Employee Benefits Survey—See National Compensation Survey

Medicaid Statistical Information System (MSIS)

Centers for Medicare & Medicaid Services (CMS)

Overview. CMS works with its state partners to collect data on each person served by the Medicaid program, to monitor and evaluate access and quality of care, trends in program eligibility, characteristics of enrollees, changes in payment policy, and other program-related issues.

Selected Content. Data collected include claims for services and their associated payments for each Medicaid beneficiary, by type of service. MSIS also collects information on the characteristics of every Medicaid eligible, including eligibility and demographic information.

Data Years. Selected state data are available starting in 1992. MSIS was an optional program until 1999, when the Balanced Budget Act of 1997 mandated that all states use MSIS. Data for the 50 states and the District of Columbia are available starting in 1999.

Coverage. The data include information about all individuals enrolled in the Medicaid program, the services they receive, and the payments made for those services.

Methodology. The primary data sources for Medicaid statistical data are the MSIS and CMS–64 reports.

MSIS is the basic source of state-reported eligibility and claims data on the Medicaid population, its characteristics, utilization, and payments. Beginning in FY 1999, as a result of the Balanced Budget Act of 1997, states were required to submit individual eligibility and claims data tapes to CMS quarterly through MSIS. Prior to FY 1999, states were required either to submit an annual HCFA–2082 report, designed to collect aggregated statistical data on eligibles, recipients, services, and expenditures during a federal fiscal year (October 1 through September 30), or, at state option, to submit eligibility data and claims through MSIS. The claims data reflect bills adjudicated or processed during the year, rather than services used during the year.
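The distinction between the year a bill was adjudicated and the year a service was used can be illustrated with a minimal Python sketch. The function name and the example dates are illustrative assumptions, not part of MSIS; the only facts used are the October 1 through September 30 fiscal year and the convention that a federal fiscal year is named for the calendar year in which it ends.

```python
from datetime import date

def federal_fiscal_year(d: date) -> int:
    """Return the federal fiscal year (October 1 through September 30) containing d."""
    # October, November, and December belong to the next-numbered fiscal year.
    return d.year + 1 if d.month >= 10 else d.year

# Hypothetical claim: service used in September 1999, bill adjudicated in October 1999.
service_date = date(1999, 9, 20)
adjudication_date = date(1999, 10, 15)

# Claims data follow the adjudication date, so this claim falls in FY 2000
# even though the service was used in FY 1999.
service_year = federal_fiscal_year(service_date)
reporting_year = federal_fiscal_year(adjudication_date)
```

In this sketch the claim is reported one fiscal year later than the service it pays for, which is why claims-based totals for a year need not match service use in that year.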

CMS–64, a product of the financial budget and grant system, is a statement of expenditures for the Medicaid program that the states submit to CMS 30 days after each quarter. The report is an accounting statement of actual expenditures made by the states for which they are entitled to receive federal reimbursement under Title XIX for that quarter. The amount claimed on CMS–64 is a summary of expenditures derived from source documents such as invoices, cost reports, and eligibility records.

CMS–64 shows the disposition of Medicaid grant funds for the quarter being reported and for previous years, the recoupments made or refunds received, and income earned on grant funds. The data on CMS–64 are used to reconcile the monetary advance made on the basis of states' funding estimates filed prior to the beginning of the quarter on CMS–37. As such, CMS–64 is the primary source for making adjustments for any identified overpayments and underpayments to the states. Also incorporated into this process are disallowance actions forwarded from other federal financial adjustments. Finally, CMS–64 provides information that forms the basis for a series of Medicaid financial reports and budget analyses. Also included are third-party liability (TPL) collections tables. TPL refers to the legal obligation of certain health care sources to pay the medical claims of Medicaid recipients before Medicaid pays these claims. Medicaid pays only after the TPL sources have met their legal obligation to pay.

Issues Affecting Interpretation. Medicaid tables in Health, United States are based on MSIS data. Users of Medicaid data may note apparent inconsistencies in the data that are primarily due to the difference in information captured in MSIS compared with CMS–64 reports. The most substantive difference is due to payments made to disproportionate share hospitals. Payments to disproportionate share hospitals do not appear in MSIS because states reimburse these hospitals directly and there is no fee-for-service billing. Other, less significant, differences between MSIS and CMS–64 occur because adjudicated claims data are used in MSIS versus actual payments reflected in CMS–64. Differences also may occur because of internal state practices for capturing and reporting these data through two separate systems. Finally, national totals for CMS–64 are different because they include other jurisdictions, such as the Northern Mariana Islands and American Samoa. Starting with 1999 data, MSIS excluded data from Puerto Rico and the U.S. Virgin Islands, which accounted for approximately 1 million eligibles and $250 million in Medicaid payments.

For More Information: See the CMS websites at: http://www.cms.hhs.gov/home/medicaid.asp and http://www.cms.hhs.gov/msis and the Research Data Assistance Center (ResDAC) website at: http://www.resdac.umn.edu/medicaid/data_available.asp. (Also see Appendix II, Medicaid.)

Medical Expenditure Panel Survey (MEPS)

Agency for Healthcare Research and Quality (AHRQ)

Overview. MEPS produces nationally representative estimates of health care use, expenditures, sources of payment, insurance coverage, and quality of care for the U.S. civilian noninstitutionalized population.

Selected Content. MEPS data in Health, United States include total health care expenses and prescribed medicine expenses, presented by sociodemographic characteristics, type of health insurance, and sources of payment.

Data Years. The 1977 National Medical Care Expenditure Survey and the 1987 National Medical Expenditure Survey (NMES) are earlier versions of this survey. Since 1996, MEPS has been conducted on an annual basis.

Coverage. The U.S. civilian noninstitutionalized population is the primary population represented. The 1987 and 1996 surveys also had an institutionalized population component.

Methodology. MEPS consists of three components: the Household Component (HC), the Medical Provider Component (MPC), and the Insurance Component (IC). The MEPS–HC is a national probability survey conducted on an annual basis since 1996. Its panel design features five rounds of interviewing covering two full calendar years.

The HC is a nationally representative survey of the civilian noninstitutionalized population drawn from a subsample of households that participated in the prior year’s National Health Interview Survey conducted by NCHS. Whenever possible, missing expenditure data are imputed using data collected in the MPC.

The MPC collects data from hospitals, physicians, home health care providers, and pharmacies that were reported in the HC as providing care to MEPS sample persons. Data are collected in the MPC to improve the accuracy of expenditure estimates that would be obtained if derived solely from the HC. The MPC is particularly useful in obtaining expenditure information for persons enrolled in managed care plans and for Medicaid recipients. Sample sizes for the MPC vary from year to year, depending on the HC sample size and the MPC sampling rates for providers.

The IC is a separate component that collects data on the types and costs of workplace health insurance from a sample of about 40,000 business establishments and 3,000 state and local governments each year.

The MEPS predecessor, the 1987 NMES, consisted of two components: the Household Survey (HS) and the Medical Provider Survey (MPS). The NMES–HS component was designed to provide nationally representative estimates of health insurance status, health insurance coverage, and health care use for the U.S. civilian noninstitutionalized population for calendar year 1987. Data from the NMES–MPS component were used in conjunction with HS data to produce estimates of health care expenditures. The NMES–HS consisted of four rounds of household interviews. Income was collected in a special supplement administered early in 1988. Events under the scope of the NMES–MPS included medical services provided by or under the direction of a physician, all hospital events, and home health care.

Sample Size and Response Rate. In recent years, the MEPS annual survey has consisted of approximately 12,500 families and 32,000 individuals. The annual response rate, which reflects nonresponse to the National Health Interview Survey from which the MEPS sample is selected as well as nonresponse and attrition in MEPS, has averaged about 60% in recent years.

Issues Affecting Interpretation. The 1987 estimates are based on NMES, and 1996 and later years estimates are based on MEPS. Because expenditures in NMES were based primarily on charges, whereas those for MEPS were based on payments, data for NMES were adjusted to be more comparable with MEPS by using estimated charge-to-payment ratios for 1987. For a detailed explanation of this adjustment, see Zuvekas and Cohen (2002).

References
  • Hahn B, Lefkowitz D. Annual expenses and sources of payment for health care services, National Medical Expenditure Survey research findings no 14. AHCPR pub no 93–0007. Rockville, MD: Agency for Health Care Policy and Research; 1992.
  • Ezzati-Rice TM, Rohde F, Greenblatt J. Sample design of the Medical Expenditure Panel Survey Household Component, 1998–2007. Rockville, MD: Agency for Healthcare Research and Quality; 2008. Methodology report no 22. Available from: http://www.meps.ahrq.gov/mepsweb/data_files/publications/mr22/mr22.shtml.
  • Zuvekas SH, Cohen JW. A guide to comparing health care expenditures in the 1996 MEPS to the 1987 NMES. Inquiry. 2002;39(1):76–86. [PubMed: 12067078]

Medicare Administrative Data

Centers for Medicare & Medicaid Services (CMS)

Overview. CMS collects and synthesizes Medicare enrollment, spending, and claims data to monitor and evaluate access to and quality of care, trends in utilization, changes in payment policy, and other program-related issues.

Selected Content. Data include claims information for services furnished to Medicare beneficiaries and Medicare enrollment data. Claims data include type of service, procedures, diagnoses, dates of service, charge amounts, and payment amounts. Enrollment data include date of birth, sex, race or ethnicity, and reason for entitlement.

Data Years. Some data files are available as far back as 1987, but CMS no longer provides technical support for files with data prior to 1991.

Coverage. Enrollment data are for all persons enrolled in the Medicare program. Claims data include data for Medicare beneficiaries who filed claims.

Methodology. The claims and utilization data files contain extensive utilization information at various levels of summarization for a variety of providers and services. There are many types and levels of these files: the National Claims History files, the Standard Analytic files (SAFs), Medicare Provider and Analysis Review (MEDPAR) files, Medicare enrollment files, and various other files.

The NCH 100% Nearline file contains all institutional and noninstitutional claims and provides records of every Medicare claim submitted, including adjustment claims. SAFs contain final action claims data in which all adjustments have been resolved. These files contain information collected by Medicare to pay for health care services provided to a Medicare beneficiary. SAFs are available for each institutional (inpatient, outpatient, skilled nursing facility, hospice, or home health agency) and noninstitutional (physician and durable medical equipment providers) claim type. The record unit of SAFs is the claim (some episodes of care may have more than one claim). SAFs include the Inpatient SAF, the Skilled Nursing Facility SAF, the Outpatient SAF, the Home Health Agency SAF, the Hospice SAF, the Durable Medical Equipment SAF, and the Physician/Supplier SAF.

MEDPAR files contain inpatient hospital and skilled nursing facility (SNF) final action stay records. Each MEDPAR record represents a stay in an inpatient hospital or SNF. An inpatient stay record summarizes all services rendered to a beneficiary from the time of admission to a facility, through discharge. Each MEDPAR record may represent one claim or multiple claims, depending on the length of a beneficiary's stay and the amount of inpatient services used throughout the stay.
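The relationship between claims and stay records can be sketched in Python as follows. This is an illustration only: the field names (bene_id, facility_id, admit_date, through_date, payment) and the grouping key are hypothetical and do not reflect the actual MEDPAR file layout or the rules CMS uses to build it.

```python
from collections import defaultdict

def roll_up_stays(claims):
    """Group claim records into one summary record per stay (hypothetical key)."""
    stays = defaultdict(lambda: {"n_claims": 0, "payment": 0.0, "discharge": ""})
    for claim in claims:
        # Assumption: claims with the same beneficiary, facility, and
        # admission date belong to the same stay.
        key = (claim["bene_id"], claim["facility_id"], claim["admit_date"])
        stay = stays[key]
        stay["n_claims"] += 1
        stay["payment"] += claim["payment"]
        # ISO-format date strings compare correctly as text.
        stay["discharge"] = max(stay["discharge"], claim["through_date"])
    return dict(stays)

# Made-up claims: one long stay billed on two claims, one short stay billed on one.
claims = [
    {"bene_id": "A1", "facility_id": "H01", "admit_date": "2008-03-01",
     "through_date": "2008-03-31", "payment": 9000.0},
    {"bene_id": "A1", "facility_id": "H01", "admit_date": "2008-03-01",
     "through_date": "2008-04-12", "payment": 4000.0},
    {"bene_id": "B2", "facility_id": "H02", "admit_date": "2008-05-10",
     "through_date": "2008-05-14", "payment": 6500.0},
]
stays = roll_up_stays(claims)
```

Here three claims collapse into two stay records, mirroring the statement that one stay record may represent one claim or several.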

The Denominator file contains demographic and enrollment information about each beneficiary enrolled in Medicare during a calendar year. The information in the Denominator file is frozen in March of the following calendar year. Some of the information contained in this file includes the beneficiary unique identifier, state and county codes, ZIP code, date of birth, date of death, sex, race, age, monthly entitlement indicators (for Medicare Part A, Medicare Part B, or Part A and Part B), reasons for entitlement, state buy-in indicators, and monthly managed care indicators (yes/no). The Denominator file is used to determine beneficiary demographic characteristics, entitlement, and beneficiary participation in Medicare Managed Care Organizations (MCOs).

The Vital Status file contains demographic information about each beneficiary ever entitled to Medicare. Some of the information contained in this file includes the beneficiary unique identifier, state and county codes, ZIP code, date of birth, date of death, sex, race, and age. Often the Vital Status file is used to obtain recent death information for a cohort of Medicare beneficiaries.

The Group Health Plan (GHP) master file contains data on beneficiaries who are currently enrolled, or have ever been enrolled, in an MCO under contract with CMS. Each record represents one beneficiary, and each beneficiary has one record. Some of the information contained in this file includes the beneficiary unique identifier, date of birth, date of death, state and county, and managed care enrollment information such as dates of membership and MCO contract number. The GHP master file is used to identify the exact MCO in which beneficiaries were enrolled.

Issues Affecting Interpretation. Because Medicare managed care programs may not file claims, files based only on claims data will exclude care for persons enrolled in Medicare managed care programs. In addition, to maintain a manageable file size, some files are based on a sample of enrollees, rather than on all Medicare enrollees. Coding and the interpretation of Medicare coverage rules have also changed over the life of the Medicare program.

For More Information: See the CMS Research Data Assistance Center (ResDAC) website at: http://www.resdac.umn.edu/medicare/index.asp and the CMS website at: http://www.cms.hhs.gov/home/medicare.asp. (Also see Appendix II, Medicare.)

Medicare Current Beneficiary Survey (MCBS)

Centers for Medicare & Medicaid Services (CMS)

Overview. MCBS produces nationally representative estimates of health status, health care use and expenditures, health insurance coverage, and socioeconomic and demographic characteristics of Medicare beneficiaries. It is used to estimate expenditures and sources of payment for all services used by Medicare beneficiaries, including copayments, deductibles, and noncovered services; to ascertain all types of health insurance coverage and relate coverage to sources of payment; and to trace processes over time, such as changes in health status and the effects of program changes.

Selected Content. The survey collects data on the utilization of health services, health and functional status, health care expenditures, and health insurance and beneficiary information (such as income, living arrangement, family assistance, and quality of life).

Data Years. The first round of interviewing was conducted from September through December 1991, and the survey has been in the field continuously since then. The data are designed to support both cross-sectional and longitudinal analyses.

Coverage. MCBS is a continuous survey of a nationally representative sample of aged, institutionalized, and disabled Medicare beneficiaries.

Methodology. The overlapping panel design of the survey allows each sample person to be interviewed three times a year for 4 years, whether he or she resides in the community or a facility or moves between the two settings, using the version of the questionnaire appropriate to the setting. Sample persons are interviewed using computer-assisted personal interviewing (CAPI) survey instruments. Because residents of long-term care facilities often are in poor health, information about institutionalized residents is collected from proxy respondents such as nurses and other primary caregivers affiliated with the facility. The sample is selected from the Medicare enrollment files, with oversampling among disabled persons under 65 years of age and among persons 80 years and over.

MCBS has two components: the Cost and Use file and the Access to Care file. Medicare claims are linked to survey-reported events to produce the Cost and Use file, which provides complete expenditure and source of payment data on all health care services, including those not covered by Medicare. The Access to Care file contains information on beneficiaries' access to health care, satisfaction with care, and usual source of care. The sample for this file represents the always enrolled population—those who participated in the Medicare program for the entire year. In contrast, the Cost and Use file represents the ever enrolled population, including those who entered Medicare and those who died during the year.
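The difference between the always enrolled and ever enrolled populations can be expressed as a small Python sketch. The input, a list of 12 monthly enrollment flags per beneficiary, is a hypothetical simplification and is not the MCBS file structure.

```python
def enrollment_populations(monthly_enrolled):
    """Return (always_enrolled, ever_enrolled) from 12 monthly flags.

    monthly_enrolled is a hypothetical list of 12 booleans, one per month
    of the calendar year, True if the person was enrolled in that month.
    """
    if len(monthly_enrolled) != 12:
        raise ValueError("expected 12 monthly flags")
    always_enrolled = all(monthly_enrolled)  # enrolled for the entire year
    ever_enrolled = any(monthly_enrolled)    # enrolled at any point in the year
    return always_enrolled, ever_enrolled

full_year = [True] * 12
entered_in_july = [False] * 6 + [True] * 6
died_in_march = [True] * 3 + [False] * 9

full_year_status = enrollment_populations(full_year)
new_entrant_status = enrollment_populations(entered_in_july)
decedent_status = enrollment_populations(died_in_march)
```

Under these assumptions, the person enrolled all year belongs to both populations, while the mid-year entrant and the decedent belong only to the ever enrolled population.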

Sample Size and Response Rate. Each fall, about one-third of the sample is retired and roughly 6,000 new sample persons are included in the survey; the exact number chosen is based on projections of target samples of 12,000 persons with 3 years of cost and use information distributed appropriately across the sample cells. In the community, response rates for initial interviews fall in the mid- to high 80s (percent); once respondents have completed the first interview, their participation in subsequent rounds is 95% or more. In recent rounds, data have been collected from approximately 16,000 beneficiaries. Roughly 90% of the sample is made up of persons who live in the community, with the remaining persons living in long-term care facilities. Response rates for facility interviews approach 100%.

Issues Affecting Interpretation. Because only Medicare enrollees are included in the survey, the survey excludes a small proportion of persons 65 years of age and over who are not enrolled in Medicare. This should be noted when using the MCBS to make estimates of the entire population 65 years and over in the United States.


Monitoring the Future Study (MTF)

National Institute on Drug Abuse (NIDA)

Overview. MTF is an ongoing study of the behaviors, attitudes, and values of U.S. secondary school students, college students, and young adults.

Selected Content. Data collected include lifetime, annual, and 30-day prevalence of use of specific illegal drugs and substances, inhalants, tobacco, and alcohol. Data are also collected on usage levels, frequency of use, perceived risks associated with use, opinions about whether use is approved or disapproved by others, and opinions about availability of the substances.

Data Years. MTF has been conducted annually since 1975, initially with high school seniors. Ongoing panel studies of representative samples from each graduating class have been conducted by mail since 1976, and annual surveys of 8th and 10th graders were initiated in 1991.

Coverage. MTF surveys a sample of high school seniors, 10th graders, and 8th graders selected to be representative of all seniors, 10th graders, and 8th graders in public and private high schools in the coterminous United States.

Methodology. The survey design is a multistage random sample, with stage 1 being selection of particular geographic areas, stage 2 being selection of one or more schools in each area, and stage 3 being selection of classes within each school. Data are collected in the classroom using self-administered questionnaires, with sessions conducted by representatives of the Institute for Social Research. Dropouts and students who are absent on the day of the survey are excluded. Because the dropout population is at higher risk for drug use, the survey was expanded in 1991 to include similar nationally representative samples of 8th and 10th graders, who have lower dropout rates than seniors and include high-risk students who will drop out before completing 12th grade. For more information on MTF adjustments for absentees and dropouts, see:

Johnston LD, O'Malley PM, Bachman JG, Schulenberg JE. Monitoring the Future: National survey results on drug use, 1975–2009, vol I: Secondary school students. Appendix A. NIH pub no 10–7584. Bethesda, MD: National Institute on Drug Abuse; 2010. Available from: http://www.monitoringthefuture.org/pubs/monographs/vol1_2009.pdf.

Sample Size and Response Rates. In 2009, a total of 46,097 students in the 8th, 10th, and 12th grades in 389 secondary schools were surveyed. The annual senior samples comprised 14,268 seniors in 125 public and private high schools nationwide. The 10th-grade samples involved 16,320 students in 119 schools, and the 8th-grade samples had 15,509 students in 145 schools. Response rates were 82% for 12th graders, 89% for 10th graders, and 88% for 8th graders and have been relatively constant across time. Absentees constitute virtually all of the nonresponding students.

Issues Affecting Interpretation. Estimates of substance use among youth based on the National Survey on Drug Use & Health (NSDUH) are not directly comparable with estimates based on MTF and the Youth Risk Behavior Surveillance System (YRBSS). In addition to the fact that MTF excludes dropouts and absentees, rates are not directly comparable across these surveys because of differences in populations covered, sample design, questionnaires, and interview setting. NSDUH collects data in residences, whereas MTF and YRBSS collect data in school classrooms. In addition, NSDUH estimates are tabulated by age, whereas MTF and YRBSS estimates are tabulated by grade, representing different ages as well as different populations.

References
  • Johnston LD, O'Malley PM, Bachman JG, Schulenberg JE. Monitoring the Future: National results on adolescent drug use. Overview of key findings, 2009. NIH pub no 10–7583. Bethesda, MD: National Institute on Drug Abuse; 2010. Available from: http://www.monitoringthefuture.org/pubs/monographs/overview2009.pdf.
  • Johnston LD, O'Malley PM, Bachman JG, Schulenberg JE. Monitoring the Future: National survey results on drug use, 1975–2008, vol I: Secondary school students. NIH pub no 09–7402. Bethesda, MD: National Institute on Drug Abuse; 2009. Available from: http://www.monitoringthefuture.org/pubs/monographs/vol1_2008.pdf.
  • Cowan CD. Coverage, sample design, and weighting in three federal surveys. J Drug Issues. 2001;31(3):599–614.

National Ambulatory Medical Care Survey (NAMCS)

CDC/NCHS

Overview. NAMCS is a national survey designed to provide information about the provision and use of medical care services in office-based physician practices in the United States.

Selected Content. Data are collected from medical records on type of providers seen; reason for visit; diagnoses; drugs ordered, provided, or continued; and selected procedures and tests ordered or performed during the visit. Patient data include age, sex, race, and expected source of payment. Data are also collected on selected characteristics of physician practices.

Data Years. NAMCS, which began in 1973, was conducted annually until 1981, once in 1985, and resumed an annual schedule in 1989.

Coverage. The scope of the survey covers patient encounters in the offices of nonfederally employed physicians classified by the American Medical Association (AMA) or American Osteopathic Association (AOA) as office-based patient care physicians. Patient encounters with physicians engaged in prepaid practices, including health maintenance organizations (HMOs), independent practice associations (IPAs), and other prepaid practices, are included in NAMCS. Excluded are visits to hospital-based physicians; visits to specialists in anesthesiology, pathology, and radiology; and visits to physicians who are principally engaged in teaching, research, or administration. Telephone contacts and nonoffice visits are also excluded.

Methodology. A multistage probability design is employed. The first-stage sample consisted of 84 primary sampling units (PSUs) in 1985, and beginning in 1989, 112 PSUs, which were selected from about 1,900 such units into which the United States had been divided. In each sample PSU, a sample of practicing nonfederal office-based physicians is selected from master files maintained by the AMA and the AOA. The final stage involves systematic random samples of office visits during randomly assigned 7-day reporting periods. In 1985, the survey excluded Alaska and Hawaii. Starting in 1989, the survey included all 50 states and the District of Columbia.

The U.S. Census Bureau acts as the data collection agent for NAMCS. Screening interviews are conducted by Census field representatives to obtain information about physicians' office-based practices and to ensure that the practice is within the scope of the survey. Field representatives visit eligible physicians prior to their participation in the survey to provide them with survey materials and instruct them on how to sample patient visits and complete patient record forms. Participants are asked to complete forms for a systematic random sample of approximately 30 office visits occurring during a randomly assigned 1-week period, but increasingly patient record forms are abstracted by field representatives.
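A systematic random sample of visits, as described above, can be sketched in Python. This is a generic textbook version of the technique under stated assumptions: the function name, the rule for choosing the sampling interval, and the example count of 120 expected visits are illustrative and are not taken from NAMCS field procedures.

```python
import random

def systematic_sample(n_visits, target=30, rng=random):
    """Pick every k-th visit after a random start, aiming for about `target` visits.

    Returns 0-based positions in the practice's visit log for the reporting week.
    """
    if n_visits <= 0:
        return []
    # Sampling interval k: roughly n_visits / target, never below 1.
    interval = max(1, round(n_visits / target))
    # Random start within the first interval gives every visit an equal chance.
    start = rng.randrange(interval)
    return list(range(start, n_visits, interval))

# Hypothetical practice expecting 120 visits during its assigned week.
sampled_positions = systematic_sample(120, target=30, rng=random.Random(1))
```

With 120 expected visits and a target of 30, the interval is 4, so every fourth visit is sampled after the random start and exactly 30 visits are drawn; for other visit counts the sample size is only approximately 30.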

Sample data are weighted to produce national estimates. The estimation procedure used in NAMCS has three basic components: inflation by the reciprocal of the probability of selection, adjustment for nonresponse, and ratio adjustment to fixed totals.
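The three components of the estimation procedure can be illustrated with a simplified Python sketch. It applies each step once to a single adjustment cell; the function names and all numbers are illustrative assumptions, and the actual NAMCS procedure works across sampling stages and adjustment classes not shown here.

```python
def base_weights(selection_probs):
    """Step 1: inflate by the reciprocal of the probability of selection."""
    return [1.0 / p for p in selection_probs]

def nonresponse_adjust(weights, responded):
    """Step 2: shift the weight of nonrespondents onto respondents in the cell."""
    total = sum(weights)
    responding_total = sum(w for w, r in zip(weights, responded) if r)
    factor = total / responding_total
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

def ratio_adjust(weights, fixed_total):
    """Step 3: scale weights so they sum to a known (fixed) total."""
    factor = fixed_total / sum(weights)
    return [w * factor for w in weights]

# Made-up example: four sampled units, one of which did not respond.
probs = [0.1, 0.1, 0.2, 0.2]
responded = [True, False, True, True]

w1 = base_weights(probs)                 # base weights of 10, 10, 5, 5
w2 = nonresponse_adjust(w1, responded)   # respondents' weights scaled by 30/20
w3 = ratio_adjust(w2, fixed_total=33.0)  # final weights sum to the fixed total
```

After step 2 the respondents carry the full base-weight total of 30, and after step 3 the weights sum to the assumed fixed total of 33.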

Sample Size and Response Rate. In each sample year from 2003 to 2005, 3,000 physicians were sampled, and the response rates were 66%–70%. Data were provided for approximately 25,000 visits per survey year. In sample years 2006 and 2007, 3,500 physicians were sampled, and the response rates were 64%–65%. Data were provided for approximately 29,000 visits in 2006 and almost 33,000 visits in 2007. In 2008, a sample of 3,319 physicians was selected: 2,229 were in scope and 1,334 participated, for a response rate of 59.1%. The response rate has been modified to accommodate the mixture of one- and two-stage samples of providers. Data were provided for 28,741 visits.

Issues Affecting Interpretation. The NAMCS patient record form is modified approximately every 2–4 years to reflect changes in physician practice characteristics, patterns of care, and technological innovations. Examples of recent changes include increasing the number of drugs recorded on the patient record form and adding checkboxes for specific tests or procedures performed. Sample sizes vary by survey year. For some years it is suggested that analysts combine two or more years of data if they wish to examine relatively rare populations or events. Starting with Health, United States, 2005, data for survey years 2001–2002 were revised to be consistent with the weighting scheme introduced in the 2003 NAMCS data. For more information on the new weighting scheme, see the “National Ambulatory Medical Care Survey: 2003 Summary” (2005).

Reference
  • Hing E, Cherry DK, Woodwell DA. National Ambulatory Medical Care Survey: 2003 summary. Advance data from vital and health statistics; no 365. Hyattsville, MD: NCHS; 2005. Available from: http://www.cdc.gov/nchs/data/ad/ad365.pdf.

National Compensation Survey (NCS)

Bureau of Labor Statistics (BLS)

Overview. NCS provides comprehensive measures of occupational earnings, compensation cost trends, benefit incidence, and detailed plan provisions.

Selected Content. Detailed occupational earnings are collected for metropolitan and nonmetropolitan areas, for broad geographic regions, and on a national basis. The Employment Cost Index (ECI) and Employer Costs for Employee Compensation (ECEC) are compensation measures derived from NCS. ECI measures changes in labor costs; average hourly employer costs for employee compensation are presented in ECEC. National benefits data are presented for five broad occupational groupings: professional, management, and related; sales and office; service; natural resources, construction, and maintenance; and production, transportation, and material moving. Data are also available by goods- and service-producing industries, union affiliation, and establishment size.

Data Years. NCS replaces three existing BLS surveys: ECI, the Occupational Compensation Survey Program (OCSP), and the Employee Benefits Survey (EBS). ECI and EBS were fully integrated into NCS in 1999. Prior to 1999, EBS was collected for small private establishments (those employing fewer than 100 workers) and from state and local governments regardless of employment size. In odd-numbered years, data were collected for medium and large private establishments (those employing 100 workers or more). ECI was created in the mid-1970s, and EBS was added to an existing data collection effort—the Professional, Administrative, and Technical Pay Survey—in the late 1970s. ECEC was developed in 1987.

Coverage. NCS provides information for the Nation, for the nine census divisions, and for 152 selected areas (combined statistical areas, metropolitan statistical areas, micropolitan statistical areas, and county clusters). Not all areas have information for all occupations. NCS includes both full- and part-time workers who are paid a wage or salary and includes data for the civilian economy, including both private industry and state and local government. It excludes agriculture, fishing, and forestry industries; private household workers; and the federal government.

Methodology. NCS is conducted quarterly by the BLS’s Office of Compensation and Working Conditions. The sample is selected using a three-stage design. The first stage involves the selection of areas for the state and local government sample and the private industry sample. In the second stage, establishments are selected systematically, with the probability of selection proportionate to their relative employment size within the industry. Use of this technique means that the larger an establishment's employment, the greater its chance of selection. The third stage of sampling is a probability sample of occupations within a sampled establishment. This step is performed by the BLS field economist during an interview with the respondent establishment. Occupations are selected with probability proportionate to their employment in the establishment, and each selected occupation is classified under its corresponding major occupational group.
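Systematic selection with probability proportionate to size, the technique named for the second stage, can be sketched in Python. This is a generic textbook version offered under stated assumptions; the function name and the employment counts are made up, and the sketch does not reproduce BLS's actual selection software or stratification.

```python
import random

def pps_systematic(sizes, n, rng=random):
    """Select n units systematically with probability proportionate to size.

    sizes: employment counts for the establishments in one industry (hypothetical).
    Returns the 0-based indexes of selected establishments. A unit whose size
    exceeds the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    # Random start in [0, interval); later selection points are evenly spaced.
    start = rng.random() * interval
    points = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0
    next_point = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Select this unit once for every selection point inside its size range.
        while next_point < n and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected

# Made-up industry: one large, two small, and one mid-sized establishment.
employment = [500, 50, 50, 400]
chosen = pps_systematic(employment, n=2, rng=random.Random(7))
```

In this made-up example the sampling interval is 500 employees, so the 500-employee establishment is always selected, the 400-employee establishment is selected with probability 0.8, and each 50-employee establishment with probability 0.1. This illustrates the statement that larger employment means a greater chance of selection.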

Data collection is conducted by BLS field economists. Data are gathered from each establishment on the primary business activity of the establishment; types of occupations; number of employees; wages, salaries, and benefits; hours of work; and duties and responsibilities. Wage data obtained by occupation and work level allow NCS to publish occupational wage statistics for localities, census divisions, and the Nation.

Sample. The sample consists of approximately 152 areas that represent the Nation’s almost 370 metropolitan statistical areas and almost 580 micropolitan statistical areas, as defined by the Office of Management and Budget (OMB), and the remaining portions of the 50 states. NCS is in the midst of a 6-year transition from the OMB's December 1993 area definitions to the December 2003 area definitions. During this transition, NCS is surveying additional areas as new areas are being phased into the sample and others are being phased out. For more information, see: http://www.bls.gov/ncs/ncswage2007.htm#AppendixA.

Issues Affecting Interpretation. Because NCS merges separate surveys, trend analyses prior to 2000 should be interpreted with care. The industrial coverage, establishment size coverage, and geographic coverage for EBS have changed since 1990. All surveys conducted during 1979–1989 excluded part-time employees, as well as establishments in Alaska and Hawaii. The surveys conducted during 1979–1986 covered only medium and large private establishments and excluded most of the service industries. Establishments that employed at least 50, 100, or 250 workers (depending on the industry) were included. The survey conducted in 1987 consisted of state and local governments with 50 or more employees. The surveys carried out in 1988 and 1989 included all private-sector establishments that employed 100 or more people.

ECEC switched to new industry and occupation classification systems with the release of the March 2004 data. The North American Industry Classification System (NAICS) is now used to classify industries, and the 2000 Standard Occupational Classification (SOC) system is used to classify occupations. ECEC data based on the 1987 Standard Industrial Classification System and the 1990 Occupational Classification System are no longer produced, and data classified under these coding schemes are not comparable to data classified under NAICS or SOC. The 2007 NAICS is gradually replacing the 2002 NAICS, but this does not affect trends. Beginning with the March 2004 quarter, historical data are available based on NAICS and the 2000 SOC. The historical tables are available from: http://www.bls.gov/ncs/ect/home.htm or upon request from BLS. For more detailed information on NAICS and SOC, including background definitions and implementation schedules, see the BLS websites at: http://www.bls.gov/bls/naics.htm and http://www.bls.gov/soc/home.htm.

The state and local government sample, which is replaced less frequently than the private industry sample, was replaced in its entirety in September 2007. As a result of this replacement, the number of state and local government occupations and establishments increased substantially. The private industry sample is rotated over approximately 5 years, which makes the sample more representative of the economy and reduces respondent burden. Data are collected for the pay period including the 12th day of the survey months of March, June, September, and December. The sample is replaced on a cross-area, cross-industry basis.


National Health Expenditure Accounts

Centers for Medicare & Medicaid Services (CMS)

Overview. National Health Expenditure Accounts provide estimates of how much money is spent on different types of health-care-related services and programs in the United States.

Selected Content. National health expenditures measure spending for health care in the United States by type of service delivered (e.g., hospital care, physician services, nursing home care) and source of funding for those services (e.g., private health insurance, Medicare, Medicaid, out-of-pocket spending).

Data Years. Expenditure estimates are available starting from 1960 in data files or published articles.

Methodology. The American Hospital Association data on hospital finances, and the U.S. Census Bureau's Services Annual Survey (SAS), are the primary sources for estimates relating to hospital care. These are supplemented by data on federal hospitals. The salaries of physicians and dentists on the staffs of hospitals, hospital outpatient clinics, hospital-based home health care agencies, and nursing home care provided in the hospital setting are also considered to be components of hospital care. Expenditures for nursing home care and home health care, and for the services of health care professionals (i.e., doctors, chiropractors, private duty nurses, therapists, and podiatrists), are estimated primarily by using a combination of data from SAS and the quinquennial Census of Service Industries.

The estimates of retail spending for prescription drugs are based on industry data on prescription drug transactions from the Census of Retail Trade (U.S. Bureau of the Census) and IMS Health, an organization that collects data from the pharmaceutical industry. Expenditures for other medical nondurables and for vision products and other medical durables purchased in retail outlets are based on input-output (I/O) tables prepared by the U.S. Department of Commerce's Bureau of Economic Analysis; the U.S. Bureau of Labor Statistics (BLS) Consumer Expenditure Survey; the 1987 National Medical Expenditure Survey and the Medical Expenditure Panel Surveys conducted by the Agency for Healthcare Research and Quality; and spending by Medicare and Medicaid. Those durable and nondurable products provided to inpatients in hospitals or nursing homes, and those provided by licensed professionals or through home health care agencies, are excluded here but are included with the expenditure estimates for the provider service category.

The construction estimates measure the value put in place in the construction of some medical sector buildings, mainly hospitals and nursing homes; these estimates are derived from the Bureau of the Census C–30 survey of new construction. Medical capital equipment comprises the value of new capital equipment (including software) purchased or put in place by the medical sector during the year.

Expenditures for noncommercial research are developed from information gathered by the National Institutes of Health and the National Science Foundation. The cost of commercial research by drug companies is assumed to be embedded in the price charged for the product, so including it again would result in double counting.

Source of funding estimates likewise come from many sources. Data on federal health care programs are taken from administrative records maintained by the servicing agencies. Among the sources used to estimate state and local government spending for health care are the U.S. Census Bureau's Government Finances reports and the National Academy of Social Insurance reports on state-operated workers' compensation programs. Federal, state, and local expenditures for education and training of medical personnel are excluded from these measures where they are separable. For the private financing of health care, data on the financial experience of health insurance organizations come from special CMS analyses of private health insurers and from the BLS survey on the cost of employer-sponsored health insurance and on consumer expenditures.

Estimates of direct spending by customers are developed using information on out-of-pocket spending from the U.S. Bureau of the Census Services Annual Survey; the U.S. BLS Consumer Expenditure Survey; the 1987 National Medical Expenditure Survey and the Medical Expenditure Panel Surveys conducted by the Agency for Healthcare Research and Quality; and private surveys conducted by the American Hospital Association, the American Medical Association, the American Dental Association, and IMS Health.

Reference
  • Hartman M, Martin A, Nuccio O, Catlin A. the National Health Expenditure Accounts Team. Health spending growth at a historic low in 2008. Health Aff (Millwood) 2010;29(1):147–55. [PubMed: 20048374]

National Health and Nutrition Examination Survey (NHANES)

CDC/NCHS

Overview. The NHANES program includes a series of cross-sectional, nationally representative health examination surveys conducted in mobile examination units or clinics (MECs). In the first series of surveys, the National Health Examination Survey (NHES), data were collected on the prevalence of certain chronic diseases, the distributions of various physical and psychological measures, and measures of growth and development. In 1971, a nutrition surveillance component was added, and the survey name was changed to NHANES. See the Data Years section for more information on the survey name and the years it was conducted.

Selected Content. NHANES has collected data on chronic disease prevalence and conditions (including undiagnosed conditions) and risk factors such as obesity and smoking, serum cholesterol levels, hypertension, diet and nutritional status, immunization status, infectious disease prevalence, health insurance, and measures of environmental exposures. Other topics addressed include hearing, vision, mental health, anemia, diabetes, cardiovascular disease, osteoporosis, oral health, pharmaceuticals and dietary supplements used, and physical fitness.

NHES I data were collected on the prevalence of certain chronic diseases, as well as the distribution of various physical and psychological measures, including blood pressure and serum cholesterol levels. NHES II and NHES III focused on factors related to growth and development in children and youth.

For NHANES I, data were collected on indicators of the nutritional and health status of the American people through dietary intake data, biochemical tests, physical measurements, and clinical assessments for evidence of nutritional deficiency. Detailed examinations were conducted by dentists, ophthalmologists, and dermatologists, with an assessment of need for treatment. In addition, data were obtained for a subsample of adults on overall health care needs and behavior, and more detailed examination data were collected on cardiovascular, respiratory, arthritic, and hearing conditions. For NHANES II, the nutrition component was expanded and the medical area focused on diabetes, kidney and liver function, allergy, and speech pathology. The third survey (NHANES III) also included data on antibodies, spirometry, and bone health.

Beginning in 1999 with continuous data collection for NHANES, new topics have included cardiorespiratory fitness, physical functioning, lower extremity disease, full body scan (DXA) for body fat and bone density, and tuberculosis infection.

Data Years. Data have been collected from surveys conducted during 1960–1962 (NHES I), 1963–1965 (NHES II), 1966–1970 (NHES III), 1971–1974 (NHANES I), 1976–1980 (NHANES II), 1982–1984 (Hispanic Health and Nutrition Examination Survey (HHANES)), and 1988–1994 (NHANES III). Beginning in 1999, the survey has been conducted continuously.

Coverage. With the exception of HHANES (see Methodology, below), NHES and NHANES provide estimates of the health status of the civilian noninstitutionalized population of the United States. NHES II and NHES III examined probability samples of the Nation's noninstitutionalized children 6–11 years of age and 12–17 years, respectively.

The NHANES I target population was the civilian noninstitutionalized population 1–74 years of age residing in the coterminous United States, except for people residing on any of the reservation lands set aside for the use of American Indians.

The NHANES II target population was the civilian noninstitutionalized population 6 months to 74 years of age residing in the United States, including Alaska and Hawaii.

HHANES studied three geographically and ethnically distinct populations: Mexican Americans living in Texas, New Mexico, Arizona, Colorado, and California; Cuban Americans living in Dade County, Florida; and Puerto Ricans living in parts of New York, New Jersey, and Connecticut.

The NHANES III target population was the civilian noninstitutionalized population 2 months of age and over. The sample design provided for oversampling among children 2 months to 5 years of age, persons 60 years and over, black persons, and persons of Mexican origin.

Beginning in 1999, NHANES oversampled low-income persons, adolescents 12–19 years of age, persons 60 years and over, African Americans, and persons of Mexican origin. The sample for data years 1999–2006 is not designed to give a nationally representative sample for the total population of Hispanics residing in the United States. Starting with 2007–2008 data collection, all Hispanics were oversampled, not just Mexican Americans. For more information on the sampling methodology changes, see: http://www.cdc.gov/nchs/nhanes/nhanes2007-2008/sampling_0708.htm.

Methodology. NHANES includes clinical examinations, selected medical and laboratory tests, and self-reported data. NHANES and previous surveys interviewed persons in their homes and conducted medical examinations, including laboratory analysis of blood, urine, and other tissue samples. Medical examinations and laboratory tests follow very specific protocols and are as standard as possible to ensure comparability across sites and providers. In 1999–2002, as a substitute for the MEC examinations, a small number of survey participants received an abbreviated health examination in their homes if they were unable to come to the MEC.

For the first program or cycle of NHES I, a highly stratified multistage probability sample was selected to represent the 111 million civilian noninstitutionalized adults 18–79 years of age in the United States at that time. The sample areas consisted of 42 primary sampling units (PSUs) from the 1,900 geographic units. NHES II and NHES III were also multistage stratified probability samples of clusters of households in land-based segments. NHES II and III used the same 40 PSUs.

For NHANES I, the sample areas consisted of 65 PSUs. A subsample of persons 25–74 years of age was selected to receive the more detailed health examination. Groups at high risk of malnutrition were oversampled.

NHANES II used a multistage probability design that involved selection of PSUs, segments (clusters of households) within PSUs, households, eligible persons, and, finally, sample persons. The sample design provided for oversampling among persons 6 months to 5 years of age, 60–74 years, and those living in poverty areas.

HHANES was similar in content and design to NHANES I and II. The major difference between HHANES and the previous national surveys is that HHANES used a probability sample of three special subgroups of the population living in selected areas of the United States, rather than a national probability sample. The three HHANES universes included approximately 84%, 57%, and 59% of the respective 1980 Mexican-, Cuban-, and Puerto Rican-origin populations in the continental United States.

The survey for NHANES III was conducted from 1988 to 1994 and consisted of two phases of equal length and sample size. Phases 1 and 2 comprised random samples of the civilian U.S. population living in households. About 40,000 persons 2 months of age and over were selected and asked to complete an extensive interview and an examination. Participants were selected from households in 81 counties across the United States. Children 2 months to 5 years of age and persons 60 years and over were oversampled to provide precise descriptive information on the health status of selected population groups in the United States.

Beginning in 1999, NHANES became a continuous, annual survey, which allows increased flexibility in survey content. Since April 1999, NHANES has collected data every year from a representative sample of the civilian noninstitutionalized U.S. population, newborns and older, by in-home personal interviews and physical examinations in the MEC. The sample design is a complex, multistage, clustered design using unequal probabilities of selection. The first-stage sample frame for continuous NHANES during 1999–2001 was the list of PSUs selected for the design of the National Health Interview Survey. Typically, an NHANES PSU is a county. For 2002, an independent sample of PSUs (based on current census data) was selected. This independent design was used for the period 2002–2008. For 1999, because of a delay in the start of data collection, 12 distinct PSUs were in the annual sample. For each year in 2000–2008, 15 PSUs were selected. The within-PSU design involves forming secondary sampling units that are nested within census tracts, selecting dwelling units within secondary units, and then selecting sample persons within dwelling units. The final sample person selection involves differential probabilities of selection according to the demographic variables of sex (male or female), race/ethnicity (Hispanic, black, all others), and age. Because of the differential probabilities of selection, dwelling units are screened for potential sample persons. Sample weights are available and should be used in estimating descriptive statistics. The complex design features should be used in estimating standard errors for the descriptive estimates.
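
The reason sample weights matter for descriptive statistics can be shown with a small, entirely hypothetical example. The values and weights below are invented for illustration and are not NHANES data; the sketch covers only the point estimate, not the design-based standard errors, which require specialized survey software.

```python
def weighted_proportion(values, weights):
    """Weighted proportion of a 0/1 indicator using sample weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample of four persons. The first two belong to an
# oversampled group, so each represents fewer people (smaller weight).
has_condition = [1, 1, 0, 0]
weights = [100, 100, 400, 400]

unweighted = sum(has_condition) / len(has_condition)      # 0.5
weighted = weighted_proportion(has_condition, weights)    # 0.2
```

The unweighted figure overstates prevalence here because the oversampled group makes up half of the sample but only a fifth of the weighted population.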

The estimation procedure used to produce national statistics for all NHANES involved inflation by the reciprocal of the probability of selection, adjustment for nonresponse, and poststratified ratio adjustment to population totals. Sampling errors also were estimated to measure the reliability of the statistics.
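
The three steps named above (inflation by the reciprocal of the selection probability, nonresponse adjustment, and poststratified ratio adjustment) can be sketched as follows. The record layout, the nonresponse classes, the poststrata, and the control totals are all hypothetical simplifications; the actual NHANES adjustment cells are more detailed.

```python
from collections import defaultdict

def final_weights(sample, control_totals):
    """Return {person id: final weight} for respondents.

    sample: list of dicts with keys 'id', 'prob', 'nr_class',
            'stratum', and 'responded' (hypothetical layout).
    control_totals: population count for each poststratum.
    Assumes every nonresponse class has at least one respondent.
    """
    # Step 1: base weight is the reciprocal of the selection probability.
    base = [1.0 / p["prob"] for p in sample]

    # Step 2: within each nonresponse class, respondents absorb the
    # base weight of the nonrespondents.
    selected_sum = defaultdict(float)
    respondent_sum = defaultdict(float)
    for p, w in zip(sample, base):
        selected_sum[p["nr_class"]] += w
        if p["responded"]:
            respondent_sum[p["nr_class"]] += w
    adjusted = [
        (p, w * selected_sum[p["nr_class"]] / respondent_sum[p["nr_class"]])
        for p, w in zip(sample, base)
        if p["responded"]
    ]

    # Step 3: ratio-adjust so weights in each poststratum sum to the
    # known population total.
    stratum_sum = defaultdict(float)
    for p, w in adjusted:
        stratum_sum[p["stratum"]] += w
    return {
        p["id"]: w * control_totals[p["stratum"]] / stratum_sum[p["stratum"]]
        for p, w in adjusted
    }

people = [
    {"id": 1, "prob": 0.01, "nr_class": "urban", "stratum": "male", "responded": True},
    {"id": 2, "prob": 0.01, "nr_class": "urban", "stratum": "female", "responded": True},
    {"id": 3, "prob": 0.02, "nr_class": "urban", "stratum": "male", "responded": False},
    {"id": 4, "prob": 0.02, "nr_class": "rural", "stratum": "female", "responded": True},
]
weights = final_weights(people, {"male": 500, "female": 700})
```

After the last step, the respondent weights in each poststratum add up to the control total for that poststratum, which is what ties the estimates to the known population counts.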

Sample Size and Response Rates. NHES I sampled 7,710 adults. The examination response rate was 87%. NHES II sampled 7,417 children and reported a response rate of 96% for the questionnaire sample and 73% for the examination sample. NHES III sampled 7,514 youth and reported a response rate of 90%.

A sample of 28,043 persons was selected for NHANES I. Household interviews were completed for more than 96% of the persons selected, and about 75% (20,749) were examined. A sample of 27,801 persons was selected for NHANES II; 73% (20,322 persons) were examined.

In HHANES, 9,894 persons in the Southwest were selected (75% or 7,462 were examined); in Dade County, 2,244 persons were selected (60% or 1,357 were examined); and in the Northeast, 3,786 persons were selected (75% or 2,834 were examined). Over the 6-year survey period of NHANES III, 39,695 persons were selected, the household interview response rate was 86%, and the medical examination response rate was 78%.

In the sample selection for NHANES 1999–2000, there were 22,839 dwelling units screened. Of these, 6,005 households had at least one eligible sample person identified for interviewing, for a total of 12,160 eligible sample persons. The overall response rate in NHANES 1999–2000 for those interviewed was 82% (9,965 of 12,160), and the response rate for those examined was 76% (9,282 of 12,160). For NHANES 2001–2002 there were 13,156 persons selected in the sample, of whom 84% (11,039) were interviewed and 80% (10,480) completed the health examination component of the survey. For NHANES 2003–2004, 6,410 households had at least one eligible sample person identified for interviewing. A total of 12,761 eligible sample persons were identified, of whom 79% (10,115) were interviewed and 76% (9,653) completed the health examination component. For NHANES 2005–2006, a total of 12,862 persons were identified, of whom 80% (10,348) were interviewed and 77% (9,950) completed the health examination component. For NHANES 2007–2008, a total of 12,943 persons were identified, of whom 78% (10,149) were interviewed and 75% (9,762) completed the health examination component. For more information on unweighted NHANES response rates and response weights using sample size weighted to Current Population Survey population totals, see: http://www.cdc.gov/nchs/nhanes/response_rates_CPS.htm.
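
The unweighted response rates quoted above are simple ratios of persons interviewed or examined to eligible persons selected. The short sketch below recomputes them from the counts given in the text; the dictionary layout and function name are illustrative choices only.

```python
# (eligible persons selected, interviewed, examined) per survey cycle,
# taken from the counts reported in the paragraph above.
cycles = {
    "1999-2000": (12160, 9965, 9282),
    "2001-2002": (13156, 11039, 10480),
    "2003-2004": (12761, 10115, 9653),
    "2005-2006": (12862, 10348, 9950),
    "2007-2008": (12943, 10149, 9762),
}

def response_rates(selected, interviewed, examined):
    """Unweighted interview and examination response rates, in whole percent."""
    return (round(100 * interviewed / selected),
            round(100 * examined / selected))

rates = {cycle: response_rates(*counts) for cycle, counts in cycles.items()}
```

Both rates use the same denominator (eligible persons selected), so the examination rate can never exceed the interview rate for a given cycle.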

Issues Affecting Interpretation. Data elements, laboratory tests performed, and the technological sophistication of medical examination and laboratory equipment have changed over time. Therefore, trend analyses should carefully examine how specific data elements were collected across the various NHES and NHANES surveys.


National Health Interview Survey (NHIS)

CDC/NCHS

Overview. NHIS monitors the health of the U.S. population through the collection and analysis of data on a broad range of health topics. A major strength of this survey lies in the ability to analyze health measures by many demographic and socioeconomic characteristics.

Selected Content. NHIS obtains information, during household interviews, on illnesses, injuries, activity limitation, chronic conditions, health insurance coverage, utilization of health care, and other health topics. Demographic data gathered include age, sex, education, race/ethnicity (reported by respondent or proxy), place of birth, income, and residence. Other data collected include risk factors such as lack of exercise, smoking, alcohol consumption, and use of prevention services such as vaccinations, mammography, and Pap smears. Special modules and supplements focus on different issues each year and have included topics such as vaccinations, aging, cancer screening, prevention, alternative and complementary medicine, and many other topics.

Data Years. NHIS has been conducted annually since 1957, with a major redesign every 10–15 years.

Coverage. NHIS covers the civilian noninstitutionalized population of the United States. Among those excluded are patients in long-term care facilities, persons on active duty with the Armed Forces (although their dependents are included), incarcerated persons, and U.S. nationals living in foreign countries.

Methodology. NHIS is a cross-sectional household interview survey. Sampling and interviewing are continuous throughout each year. The sampling plan follows a multistage area probability design that permits the representative sampling of households. Traditionally, the sample for NHIS is redesigned and redrawn about every 10 years to better measure the changing U.S. population and to meet new survey objectives. A new sample design was implemented in the 2006 survey. The fundamental structure of the new design is very similar to the previous design for the 1995–2005 surveys. Information is presented only for the current sampling plan covering design years 2006–2014. The first stage of the current sampling plan consists of a sample of 428 primary sampling units (PSUs) drawn from approximately 1,900 geographically defined PSUs that cover the 50 states and the District of Columbia. A PSU consists of a county, a small group of contiguous counties, or a metropolitan statistical area.

Within a PSU, two types of second-stage units are used: area segments and permit segments. Area segments are defined geographically and contain an expected 8, 12, or 16 addresses. Permit segments cover housing units built after the 2000 census. The permit segments are defined using updated lists of building permits issued in the PSU since 2000 and contain an expected four addresses. Within each segment, all occupied households at the sample addresses are targeted for interview.

The total NHIS sample of PSUs is subdivided into four separate panels, or subdesigns, such that each panel is a representative sample of the U.S. population. This design feature has a number of advantages, including flexibility for the total sample size. The households selected for interview each week in NHIS are a probability sample representative of the target population.

In the 2006–2014 redesign, the NHIS sample was reduced by 13% compared with the 1995–2005 design. In addition, the NHIS sample was reduced by approximately 50% during the third quarter of 2006, cutting about 13% of the sample size of the original 2006 sample. In 2007, the NHIS sample was reduced by approximately 50% during July–September. The 2007 sample reduction was implemented in the same way and during the same time of year as the 2006 sample reduction. Overall, about 13% of the households in the 2007 NHIS sample were deleted from interviewers' assignments. The NHIS sample was reduced by approximately 50% during October–December 2008 and by approximately 50% during January–March 2009. The 2009 sample reduction was implemented in the same way as the 2006, 2007, and 2008 sample reductions; however, the timing of the 2009 reduction was different: the 2006 and 2007 reductions occurred during July–September, and the 2008 reduction occurred during October–December. Newly available funding later in 2009 permitted an expansion during October–December to increase that quarter’s normal sample size by approximately 50%. The net effect of the January–March cut and the October–December expansion is that the 2009 NHIS sample size is approximately the same as it would have been if the sample had been maintained at a normal level during the entire calendar year.
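
The relationship between a 50% cut in one quarter and a cut of about 13% in the annual sample is simple arithmetic, sketched below. The sketch assumes the four quarters would normally have equal sample sizes, which the text implies but does not state exactly.

```python
def annual_sample_fraction(quarter_factors):
    """Annual sample as a fraction of normal, given one factor per quarter.

    Assumes each quarter normally contributes an equal share of the
    annual sample (an assumption made for this illustration).
    """
    return sum(quarter_factors) / len(quarter_factors)

# 2006 and 2007: third quarter (July-September) cut by about 50%.
cut_2006 = 1 - annual_sample_fraction([1.0, 1.0, 0.5, 1.0])   # 0.125, about 13%

# 2009: first quarter cut by about 50%, fourth quarter expanded by about 50%.
net_2009 = annual_sample_fraction([0.5, 1.0, 1.0, 1.5])       # 1.0, no net change
```

A 50% cut applied to one quarter removes half of one-fourth of the year's sample, or 12.5%, which the text rounds to about 13%.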

Oversampling of the black and Hispanic populations was retained in the 2006–2014 design to allow for more precise estimation of health characteristics in these growing minority populations. The new sample design also oversamples the Asian population. In addition, the sample adult selection process was revised so that when black, Hispanic, or Asian persons 65 years of age and over are present, they have an increased chance of being selected as the sample adult.

The NHIS that was fielded during 1982–1996 consisted of two parts: (a) a set of basic health and demographic items (known as the Core questionnaire) and (b) one or more sets of questions on current health topics (known as Supplements). The Core questionnaire remained the same over that time period, whereas the current health topics changed depending on data needs.

The NHIS questionnaire revision, implemented in 1997, has two basic parts: a Basic Module or Core and one or more supplements that vary by year. The Core remains largely unchanged from year to year and allows for trend analysis and for data from more than 1 year to be pooled to increase the sample size for analytic purposes. The Core contains three components: the Family, the Sample Adult, and the Sample Child. The Family component collects information on everyone in the family and allows NHIS to serve as a sampling frame for additional integrated surveys as needed. Information collected in the Family section for all family members includes household composition and sociodemographic characteristics, tracking information, information for matches to administrative databases, health insurance coverage, and basic indicators of health status and utilization of health care services. Information from the Family component is included on the Person file (see the NHIS website, below). From each family in NHIS, one sample adult and, for families with children under 18 years of age, one sample child are randomly selected to participate in the Sample Adult and Sample Child questionnaires. For children, information is provided by a knowledgeable family member 18 years or over residing in the household. Because some health issues are different for children and adults, these two questionnaires differ in some items but both collect basic information on health status, use of health care services, health conditions, and health behaviors.

Sample Size and Response Rates. Between 1997 and 2005, the sample numbered about 100,000 persons, with about 30,000–36,000 persons participating in the Sample Adult and about 12,000–14,000 persons in the Sample Child questionnaires. In 2009, the sample numbered 88,446 persons, with 27,731 persons participating in the Sample Adult and 11,156 persons in the Sample Child questionnaires. In 2009, the total household response rate was 82%. The final response rate for the Sample Adult file was 65% and for the Sample Child file was 73%.

Issues Affecting Interpretation. In 1997, the questionnaire was redesigned; some basic concepts were changed, and other concepts were measured in different ways. For some questions there was a change in the reference period. Also in 1997, the collection methodology changed from paper-and-pencil questionnaires to computer-assisted personal interviewing (CAPI). Because of the major redesign of the questionnaire in 1997, most NHIS trend tables in Health, United States begin with 1997 data. Starting with Health, United States, 2005, estimates for 2000–2002 were revised to use 2000-based weights and differ from previous editions of Health, United States that used 1990-based weights for those data years. The weights available on the public-use NHIS files for 2000–2002 are 1990-based. Data for 2003 and later years use weights derived from the 2000 Census. In 2006 and beyond, the sample size was reduced, and this is associated with slightly larger variance estimates than in previous years when a larger sample was fielded.


National Home and Hospice Care Survey (NHHCS)

CDC/NCHS

Overview. NHHCS is a national probability sample survey of U.S. home health and hospice care agencies. The survey is designed to provide descriptive information on the agencies and their staffs, services, and patients.

Selected Content. NHHCS provides information on home health and hospice care agencies from two perspectives—that of the provider of services and that of the recipient of services. Data about the agencies include characteristics such as ownership; affiliation; services offered; and number, training, and characteristics of staff. Data about the current home health care patients and discharged hospice care patients include demographic characteristics, diagnoses, health status, level of assistance needed with activities of daily living, services received, sources of payment, and discharge disposition (for discharges). The redesigned NHHCS, conducted in 2007, included new agency data items on electronic information systems, cultural competency, end-of-life practices, and special service programs, as well as new patient-level data items on pain assessment and pain relief, medications, family and caregiver services, end-of-life care, and advance directives. The 2007 survey also included a supplemental survey of home health aides employed by home health and/or hospice care agencies, called the National Home Health Aide Survey.

Data Years. NHHCS was first conducted in 1992 and was repeated in 1993, 1994, 1996, 1998, 2000, and most recently in 2007. The 2007 NHHCS, which was reintroduced into the field after a 7-year break that included a redesign, was conducted between August 2007 and February 2008.

Coverage. The survey covers agencies that provide home health and hospice care services in the United States and the care recipients of these agencies. Agencies are freestanding health facilities or units of larger organizations, such as hospitals or nursing homes. Agencies that provide only homemaker services or housekeeping services, assistance with instrumental activities of daily living (IADLs), or durable medical equipment and supplies are excluded from the survey.

Methodology. The survey uses a stratified two-stage probability sample design; the 1992–1994 surveys used a stratified three-stage probability sample design. The first stage of the 2007 survey, carried out by NCHS, was the selection of home health and hospice care agencies from the sample frame of over 15,000 agencies, representing the universe of agencies providing home health and hospice care services in the United States. The primary sampling strata of agencies were defined by agency type (i.e., home health care only, hospice care only, and mixed (provides both home health and hospice care services)) and metropolitan statistical area (MSA) status. Within these sampling strata, agencies were sorted by census region, ownership, certification status, state, county, ZIP code, and size (number of employees).

The second stage of sample selection was completed by the interviewers during the agency interviews. The current home health care patients and hospice care discharges were randomly selected by a computer algorithm, based on a census list provided by each agency director or his or her designee. Up to 10 current home health care patients were randomly selected per home health care agency; up to 10 hospice care discharges were randomly selected per hospice care agency; and a combination of up to 10 current home health care patients and hospice care discharges were randomly selected per mixed agency. Current home health care patients were defined as patients who were on the rolls of the home health care agency as of midnight of the day immediately before the agency interview. The hospice care discharges were defined as patients who were discharged from the hospice care agency during the 3-month period beginning 4 months before the agency interview. Discharges that occurred because of the death of a sampled hospice patient were included.

All data, except for the paper-and-pencil self-administered staffing questionnaire, were collected using a computer-assisted personal interviewing instrument. Agency data, available in agency administrative records, were collected through in-person interviews with agency directors and their designated staff. Data on home health care patients and hospice care discharges, available in medical records, were collected by interviewing the staff member most familiar with the care provided to the sampled patients/discharges. No interviews were conducted directly with patients or their families or friends.

Estimates based on NHHCS take into account the selection procedures of the complete survey design to develop the final sample weight for each sampled agency and each sampled patient/discharge. The final weight for each sampled unit is the product of up to three components: inverse of the probability of selection; nonresponse adjustment; and ratio adjustment. The data from the surveys are adjusted for three types of nonresponse: an in-scope agency did not respond; an in-scope agency did not provide the number of current home health care patients and/or hospice care discharges; and the administrative and medical records of the sampled current home health care patients and/or hospice discharges were not made available to complete the survey.
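
The weighting described above can be illustrated with a minimal sketch. The function name and the numeric factors below are hypothetical and are not taken from NHHCS documentation; only the rule that the final weight is a product of up to three components comes from the text.

```python
def final_weight(selection_prob, nonresponse_adj=1.0, ratio_adj=1.0):
    """Return a final sample weight as the product of up to three components.

    selection_prob  : probability that the unit was selected (0 < p <= 1)
    nonresponse_adj : multiplicative nonresponse adjustment (1.0 if not applied)
    ratio_adj       : multiplicative ratio adjustment (1.0 if not applied)
    """
    if not 0 < selection_prob <= 1:
        raise ValueError("selection_prob must be in (0, 1]")
    base_weight = 1.0 / selection_prob  # inverse of the probability of selection
    return base_weight * nonresponse_adj * ratio_adj


# Hypothetical agency: selected with probability 0.1, then adjusted
# upward for nonresponse (1.25) and slightly downward by the ratio step (0.96).
example_weight = final_weight(0.1, nonresponse_adj=1.25, ratio_adj=0.96)
print(round(example_weight, 2))
```

Components that do not apply to a given unit default to 1.0, which matches the "up to three components" wording.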

Sample Size and Response Rates. The sampling frame for the 2007 NHHCS was constructed using three sources: (a) Centers for Medicare & Medicaid Services Provider of Services File of home health care agencies and hospices, (b) state licensing lists of home health care agencies compiled by a private organization, and (c) the National Hospice and Palliative Care Organization file of hospices. The combined files were matched and identified duplicates were removed, resulting in a sampling frame of 15,488 agencies. A sample of 1,545 agencies was selected, of which 1,461 (95%) were considered in scope. Of the in-scope agencies, 1,036 agreed to participate, resulting in a first-stage agency unweighted response rate of 71% and a weighted response rate of 59%. A total of 10,009 current home health care patients and hospice care discharges were sampled from the responding agencies: 5,026 current home health care patients and 4,983 hospice care discharges. Of these, 106 home health care patients and 19 hospice care discharges were considered out of scope. Furthermore, 237 current home health care patients and 231 hospice care discharges were excluded for one of the following reasons: consent problems, record problems, refusals, insufficient time, or nonresponse. This resulted in 4,683 home health cases and 4,733 hospice cases, for a second-stage unweighted response rate of 95% and a weighted response rate of 96%.
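
The unweighted figures in this paragraph can be checked with a short calculation. The sketch below assumes that an unweighted response rate is the number of responding units divided by the number of in-scope units; the text does not state the formula, and the weighted rates cannot be reproduced here because the weights are not given.

```python
def percent(numerator, denominator):
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# First stage: agencies (counts from the text).
in_scope_agencies = 1461
participating_agencies = 1036
first_stage_rate = percent(participating_agencies, in_scope_agencies)

# Second stage: patients and discharges (counts from the text).
sampled = {"home_health": 5026, "hospice": 4983}
out_of_scope = {"home_health": 106, "hospice": 19}
excluded = {"home_health": 237, "hospice": 231}

# Completed cases = sampled - out of scope - excluded.
completed = {k: sampled[k] - out_of_scope[k] - excluded[k] for k in sampled}

# Assumed denominator: all sampled cases that were in scope.
in_scope_cases = sum(sampled.values()) - sum(out_of_scope.values())
second_stage_rate = percent(sum(completed.values()), in_scope_cases)

print(completed)
print(round(first_stage_rate), round(second_stage_rate))
```

Under that assumption, the counts reproduce the 4,683 and 4,733 completed cases and the reported unweighted rates of 71% and 95%.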

Issues Affecting Interpretation. The current home health care patient sample describes individuals receiving home health care on the night before data collection began and represents home health care utilization on any given day between August 2007 and February 2008. Because frequent short-term users are less likely than long-term users to be enrolled with the agency on any given day, the number of current home health care patients with a very short length of service may be underestimated. The hospice care discharge sample describes the annual number of discharges from hospice care. Estimates based on hospice discharges may underrepresent patients who tend to receive care for longer periods of time. Finally, various survey items were added or modified in the 2007 survey, which may preclude comparisons with previous years or trend analyses.
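
The difference between a current-patient sample and a discharge sample can be shown with a small steady-state calculation. The admission counts and lengths of service below are hypothetical; the sketch relies only on the fact that, when admissions are steady, the number of patients on the rolls on a given day equals daily admissions multiplied by length of service.

```python
def steady_state_shares(admissions_per_day, service_days):
    """Compare each group's share among discharges and among current patients.

    admissions_per_day : dict of group -> admissions per day
    service_days       : dict of group -> length of service in days
    Returns (discharge_share, current_share), two dicts keyed by group.
    """
    total_admissions = sum(admissions_per_day.values())
    # In steady state, discharges per day equal admissions per day.
    discharge_share = {
        g: admissions_per_day[g] / total_admissions for g in admissions_per_day
    }
    # Patients on the rolls on a given day = admissions per day x days of service.
    on_rolls = {g: admissions_per_day[g] * service_days[g] for g in admissions_per_day}
    total_on_rolls = sum(on_rolls.values())
    current_share = {g: on_rolls[g] / total_on_rolls for g in on_rolls}
    return discharge_share, current_share


# Hypothetical agency: 10 short-term admissions (5 days each) and
# 1 long-term admission (100 days) per day.
discharge_share, current_share = steady_state_shares(
    {"short": 10, "long": 1}, {"short": 5, "long": 100}
)
print(discharge_share["short"], current_share["short"])
```

In this example short-term users make up about 91% of discharges but only one-third of patients on the rolls on a given day, which is the pattern the paragraph describes.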

National Hospital Ambulatory Medical Care Survey (NHAMCS)

CDC/NCHS

Overview. NHAMCS collects data on the utilization and provision of medical care services provided in hospital emergency and outpatient departments.

Selected Content. Data are collected from medical records on types of providers seen; reason for visit; diagnoses; drugs ordered, provided, or continued; and selected procedures and tests performed during the visit. Patient data include age, sex, race, and expected source of payment. Data are also collected on selected characteristics of the hospitals included in the survey.

Data Years. Annual data collection began in 1992.

Coverage. The survey is a representative sample of visits to emergency departments (EDs) and outpatient departments (OPDs) of nonfederal, short-stay, or general hospitals. Telephone contacts are excluded.

Methodology. A four-stage probability sample design is used in NHAMCS, involving (a) samples of geographically defined primary sampling units (PSUs), (b) hospitals within PSUs, (c) clinics within OPDs, and (d) patient visits within clinics. EDs are treated as their own stratum, and all service areas within EDs are included. The first-stage sample of NHAMCS consists of 112 PSUs selected from 1,900 such units that make up the United States. Within PSUs, 600 general and short-stay hospitals were sampled and assigned to 1 of 16 panels. In any given year, 13 panels are included. Each panel is assigned to a 4-week reporting period during the calendar year.

In the NHAMCS OPD survey, a clinic is defined as an administrative unit of the OPD in which ambulatory medical care is provided under the supervision of a physician. Clinics where only ancillary services—such as radiology, laboratory services, physical rehabilitation, renal dialysis, and pharmacy—are provided, or other settings in which physician services are not typically provided, are considered out of scope. If a hospital OPD has five or fewer in-scope clinics, all are included in the sample. If an outpatient department has more than five clinics, the clinics are assigned into one of six specialty groups: general medicine, surgery, pediatrics, obstetrics/gynecology, substance abuse, and other. Within these specialty groups, clinics are grouped into clinic sampling units (SUs). A clinic SU is generally one clinic, except when a clinic expects fewer than 30 visits. In that case, it is grouped with one or more other clinics to form a clinic SU. If the grouped SU is selected, all clinics included in that SU are included in the sample. Prior to 2001, a sample of generally five clinic SUs was selected per hospital, based on probability proportional to the total expected number of patient visits to the clinic during the assigned 4-week reporting period. Starting in 2001, clinic sampling within each hospital was stratified. If an OPD had more than five clinics, two clinic SUs were selected from each of the six specialty groups with a probability proportional to the total expected number of visits to the clinic. The change was made to ensure that at least two SUs were sampled from each of the specialty group strata.
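
The formation of clinic SUs and their selection with probability proportional to expected visits can be sketched as follows. The pooling rule for small clinics and the use of systematic selection are assumptions made for illustration; the text specifies only the 30-visit threshold, the selection of two SUs per specialty group, and selection proportional to expected visits. All clinic names and visit counts are hypothetical.

```python
import random

MIN_VISITS = 30  # threshold stated in the text


def form_sampling_units(clinics):
    """Group the clinics of one specialty group into sampling units (SUs).

    clinics: list of (name, expected_visits).
    Returns a list of (clinic_names, total_expected_visits).
    Assumed rule: small clinics are pooled in list order until the pool
    reaches MIN_VISITS; any leftover clinics join the last SU formed.
    """
    units = [((name,), v) for name, v in clinics if v >= MIN_VISITS]
    pool_names, pool_visits = (), 0
    for name, v in clinics:
        if v >= MIN_VISITS:
            continue
        pool_names, pool_visits = pool_names + (name,), pool_visits + v
        if pool_visits >= MIN_VISITS:
            units.append((pool_names, pool_visits))
            pool_names, pool_visits = (), 0
    if pool_names:
        if units:
            last_names, last_visits = units.pop()
            units.append((last_names + pool_names, last_visits + pool_visits))
        else:
            units.append((pool_names, pool_visits))
    return units


def pps_systematic(units, n, rng):
    """Select n SUs with probability proportional to expected visits.

    The SUs are laid end to end on a line whose length is the total
    expected visits, and n equally spaced points are taken from a random
    start. An SU larger than the spacing could be hit more than once.
    """
    total = sum(visits for _, visits in units)
    step = total / n
    start = rng.random() * step
    points = [start + i * step for i in range(n)]
    chosen, cumulative, next_point = [], 0.0, 0
    for unit in units:
        cumulative += unit[1]
        while next_point < n and points[next_point] < cumulative:
            chosen.append(unit)
            next_point += 1
    return chosen


# Hypothetical specialty group with three clinics below the threshold.
clinics = [("A", 60), ("B", 10), ("C", 25), ("D", 45), ("E", 5)]
units = form_sampling_units(clinics)
selected = pps_systematic(units, 2, random.Random(0))
print(units)
print(selected)
```

Because a selected SU brings all of its clinics into the sample, pooling keeps small clinics eligible without giving any single low-volume clinic its own selection.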

The U.S. Census Bureau acts as the data collection agent for NHAMCS. Census field representatives contact sample hospitals to determine whether they have a 24-hour ED or an OPD that offers physician services. Visits to eligible EDs and OPDs are systematically sampled over the 4-week reporting period such that about 100 ED encounters and about 200 OPD encounters are selected. Hospital staff are asked to complete patient record forms (PRFs) for each sampled visit, but census field representatives typically abstract data for more than one-third of these visits.
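
Systematic sampling of visits toward a target count can be sketched as below. The function name and the expected visit volume are hypothetical, and the rule for setting the interval (expected visits divided by the target) is an assumption; the text states only that visits are systematically sampled to yield about 100 ED or 200 OPD encounters.

```python
import random


def systematic_sample(n_visits, target, rng):
    """Return the positions (0-based) of visits chosen by systematic sampling.

    n_visits : expected number of visits during the reporting period
    target   : approximate number of visits to select
    rng      : random.Random instance supplying the random start
    """
    interval = max(1, round(n_visits / target))  # take every interval-th visit
    start = rng.randrange(interval)              # random start within the first interval
    return list(range(start, n_visits, interval))


# Hypothetical ED expecting 2,400 visits in its 4-week period, target of 100.
positions = systematic_sample(2400, 100, random.Random(1))
print(len(positions), positions[:3])
```

In practice the visit volume is an estimate made before the reporting period, so the realized sample is about, not exactly, the target.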

Sample data are weighted to produce national estimates. The estimation procedure used in NHAMCS has three basic components: inflation by the reciprocal of the probability of selection, adjustment for nonresponse, and ratio adjustment to fixed totals.

Sample Size and Response Rate. In any given year, the hospital sample consists of approximately 500 hospitals, of which 80% have EDs and about one-half have eligible OPDs. Typically, about 1,000 clinics are selected from participating hospital OPDs. In each sample year from 2002 to 2008, the number of PRFs completed for EDs ranged from 33,000 to 40,000 and for OPDs from 30,000 to 36,000. The hospital response rate was 83%–94% for EDs and 73%–84% for OPDs during this timeframe. In 2008, the number of PRFs completed for EDs was 34,134 and for OPDs was 33,908, and the hospital response rate was 87% for EDs and 75% for OPDs.

Issues Affecting Interpretation. The NHAMCS PRF is modified approximately every 2 to 4 years to reflect changes in physician practice characteristics, patterns of care, and technological innovations. Examples of recent changes are the number of drugs recorded on the PRF and the number of checkboxes for specific tests or procedures performed.

National Hospital Discharge Survey (NHDS)

CDC/NCHS

Overview. NHDS collects and produces national estimates on characteristics of inpatient stays in nonfederal, short-stay hospitals in the United States.

Selected Content. Patient information collected includes demographics, length of stay, diagnoses, and procedures. Hospital characteristics collected include region, ownership, and bed size.

Data Years. NHDS has been conducted annually since 1965.

Coverage. The survey design covers the 50 states and the District of Columbia. Included in the survey are hospitals with an average length of stay of less than 30 days for all inpatients, general hospitals, and children's general hospitals. Excluded are federal, military, and Department of Veterans Affairs hospitals, as well as hospital units of institutions (such as prison hospitals) and hospitals with fewer than six beds staffed for patient use. All discharged patients from in-scope hospitals are included in the survey; however, data for newborns are not included in Health, United States.

Methodology. The NHDS design implemented in 1965 continued through 1987, and a redesign with a new sample of hospitals, fielded in 1988, is currently in place. The sample for the 1965 NHDS was selected in 1964 from a frame of short-stay hospitals listed in the National Master Facility Inventory. A two-stage stratified sample design was used, with hospitals stratified according to bed size and geographic region. Sample hospitals were selected with probabilities ranging from certainty for some hospitals to 1 in 40 for other hospitals. Within each participating hospital, a systematic random sample was selected from a daily listing sheet of discharges. Within-hospital sampling rates for discharges varied inversely with the probability of hospital selection, so the overall probability of selecting a discharge was approximately the same across the sample.
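
The inverse relationship between hospital selection probability and within-hospital sampling rate can be made concrete with a short sketch. The overall target probability of 1 in 400 is hypothetical; only the range of hospital probabilities (certainty to 1 in 40) comes from the text.

```python
from fractions import Fraction


def within_hospital_rate(overall_prob, hospital_prob):
    """Return the discharge sampling rate that yields the overall probability.

    overall probability = hospital selection probability x within-hospital rate
    """
    rate = overall_prob / hospital_prob
    if rate > 1:
        raise ValueError("overall_prob cannot exceed hospital_prob")
    return rate


overall = Fraction(1, 400)  # hypothetical overall target
hospital_probs = {
    "certainty hospital": Fraction(1),
    "1-in-40 hospital": Fraction(1, 40),
}
rates = {
    name: within_hospital_rate(overall, p) for name, p in hospital_probs.items()
}
for name, rate in rates.items():
    print(name, rate, hospital_probs[name] * rate)
```

With these assumed values, the certainty hospital samples 1 in 400 discharges and the 1-in-40 hospital samples 1 in 10, so every discharge has the same overall chance of selection.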

Data collection was conducted by means of manual abstraction of patient information from sampled medical records. Sample selection and transcription of information from inpatient medical records to NHDS survey forms were performed by hospital staff, representatives of NCHS, or both. In 1985, a second data collection procedure was introduced that involved the purchase of computer data tapes from commercial abstracting services that contained automated discharge data for some hospitals participating in NHDS. This procedure was used in approximately 17% of the sample hospitals for 1985–1987. Discharges on these computer files were subjected to the NHDS sampling specifications as well as the computer edits and estimation procedures. Two data collection methods, manual and automated, continue to be used in NHDS.

A redesign of NHDS was implemented for the 1988 survey. Under the redesign, hospitals were selected using a modified three-stage stratified design. Units selected at the first stage consisted of either hospitals or geographic areas. The geographic areas were the primary sampling units (PSUs) used for the 1985–1994 National Health Interview Survey, which are geographic areas such as counties or townships. Hospitals within PSUs were then selected at the second stage. Strata at this stage were defined by geographic region, PSU size, abstracting service status, and hospital specialty-size groups. Within these strata, hospitals were selected with probabilities proportional to their annual number of discharges. At the third stage, a sample of discharges was selected by a systematic random sampling technique. The sampling rate was determined by the hospital's sampling stratum and the type of data collection system (manual or automated) used. Discharge records from hospitals submitting data from commercial abstracting services and selected state data systems (approximately 45% of sample hospitals in 2007) were arrayed by primary diagnoses, patient sex and age group, and date of discharge, before sampling.

The NHDS hospital sample is updated every 3 years by continuing the sampling process among hospitals that become eligible for the survey during the intervening years and by deleting hospitals that are no longer eligible. This update was conducted in 1991, 1994, 1997, 2000, 2003, and 2006.

The basic unit of estimation for NHDS is a sampled discharge. The basic estimation procedure involves inflation by the reciprocal of the probability of selection. Adjustments are made for nonresponding hospitals and discharges, and a post-ratio adjustment to fixed totals is employed.

Sample Size and Response Rate. In 2007, 501 hospitals were selected: 477 were within scope, 422 participated (88%), and data were collected from medical records for approximately 366,000 discharges.

Issues Affecting Interpretation. NHDS was redesigned in 1988, and caution is required in comparing trend data from before and after the redesign. In addition, annual modifications to the International Classification of Diseases, 9th Revision, Clinical Modification (ICD–9–CM) may affect diagnosis and procedure categories. (See Appendix II, International Classification of Diseases, 9th Revision, Clinical Modification; and Tables XI and XII.)

Hospital utilization rates per 10,000 population were computed using estimates of the civilian population of the United States as of July 1 of each year. Rates for 1990–1999 use postcensal estimates of the civilian population based on the 1990 census, adjusted for net underenumeration using the 1990 National Population Adjustment Matrix from the U.S. Census Bureau. The estimates for 2000 and beyond that appear in Health, United States, 2003 and later editions were calculated using estimates of the civilian population based on Census 2000, and therefore are not strictly comparable with postcensal rates calculated for the 1990s. (See Appendix I, Population Census and Population Estimates.)
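
The effect of the population denominator on a rate per 10,000 can be illustrated as follows. All figures are hypothetical and are chosen only to show how the same count of discharges produces different rates under two population estimates.

```python
def rate_per_10000(events, population):
    """Return the number of events per 10,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 10_000 * events / population


discharges = 3_500_000               # hypothetical annual discharges
population_1990_based = 29_000_000   # hypothetical estimate based on the 1990 census
population_2000_based = 30_000_000   # hypothetical estimate based on Census 2000

rate_1990_based = rate_per_10000(discharges, population_1990_based)
rate_2000_based = rate_per_10000(discharges, population_2000_based)
print(round(rate_1990_based, 1), round(rate_2000_based, 1))
```

A shift in the population base changes the rate even when utilization is unchanged, which is why rates computed on different census bases are not strictly comparable.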

National Immunization Survey (NIS)

CDC/National Center for Immunization and Respiratory Diseases (NCIRD) and NCHS

Overview. NIS is a continuing nationwide telephone sample survey to monitor vaccination coverage rates among children 19–35 months of age and among teenagers (NIS–Teen) 13–17 years.

Selected Content. Data collected for children include vaccination status and date of vaccinations for diphtheria, tetanus toxoids, and acellular pertussis vaccine (DTP/DT/DTaP); poliovirus vaccine (Polio); measles, mumps, and rubella vaccine (MMR); Haemophilus influenzae type b vaccine (Hib); hepatitis B vaccine (Hep B); varicella zoster vaccine; pneumococcal conjugate vaccine (PCV); hepatitis A (Hep A); influenza; and for adolescents meningococcal conjugate vaccine (MCV4) and human papillomavirus vaccine (HPV). Demographic data include age, gender, race and ethnicity, and poverty level. Data are available at a variety of geographic levels, including census regions, state, and selected urban areas.

Data Years. Annual household data collection began with data year 1994. Data collection for varicella began in July 1996; data collection for PCV began in July 2001. Data collection for adolescents 13–17 years of age began in 2006.

Coverage. Children 19–35 months of age and adolescents 13–17 years in the civilian noninstitutionalized population are represented in this survey. Estimates of vaccine-specific coverage are available for the Nation, states, and selected urban areas.

Methodology. NIS is a nationwide telephone sample survey of households with age-eligible children. NIS uses a two-phase sample design. First, a random-digit-dialing sample of telephone numbers is drawn. When households with age-eligible children are contacted, the interviewer collects information on the vaccinations received by all age-eligible children and obtains permission to contact the children's vaccination providers. Second, identified providers are sent vaccination history questionnaires by mail. Providers' responses are compared with information obtained from households to provide a more accurate estimate of vaccination coverage levels. Final estimates are adjusted for households without telephones and for nonresponse. NIS–Teen followed the same sample design and data collection procedures as NIS except that only one age-eligible adolescent was selected from each household for data collection.

Sample Size and Response Rate. In 2009, vaccination data were collected from providers for 17,313 children 19–35 months of age. The overall interview response rate was 64%. Vaccination information from providers was obtained for 71% of all children who were eligible for provider follow-up in 2009.

In 2009, vaccination data were collected from providers for 20,399 adolescents 13–17 years of age. The overall interview response rate was 58%. Vaccination information from providers was obtained for 57% of all adolescents who were eligible for provider follow-up in 2009.

Issues Affecting Interpretation. For data years 1998, 2002, 2004, and 2005, slight modifications to the estimation procedure were implemented to obtain vaccination coverage rates from the provider data. Published estimates of vaccination coverage based on NIS data for years prior to 1998 (e.g., estimates published in Morbidity and Mortality Weekly Report (MMWR) articles) may differ slightly from estimates published in Health, United States and on the NIS website for the same NIS data. All released public-use data files include the sampling weights using the revised estimation procedure. The findings in recent years are subject to at least three limitations. First, NIS is a telephone survey, and statistical adjustments might not compensate fully for nonresponse and for households without landline telephones. Second, underestimates of vaccination coverage might have resulted from the exclusive use of provider-reported vaccination histories because completeness of records is unknown. Finally, although national coverage estimates are precise, annual estimates and trends for state and local areas should be interpreted with caution because of smaller sample sizes and wider confidence intervals.

Before January 2009, NIS did not distinguish between Hib vaccine product types; therefore, children who received three doses of a vaccine product that requires four doses were misclassified as fully vaccinated. For more information, see “Changes in Measurement of Haemophilus influenzae Serotype b (Hib) Vaccination Coverage—National Immunization Survey, United States, 2009” (2010).

National Medical Expenditure Survey (NMES)—See Medical Expenditure Panel Survey

National Notifiable Disease Surveillance System (NNDSS)

CDC

Overview. NNDSS provides weekly provisional information on the occurrence of diseases defined as notifiable by the Council of State and Territorial Epidemiologists (CSTE).

Selected Content. Data include incidence of reportable diseases using uniform case definitions.

Data Years. The first annual summary of the notifiable diseases in 1912 included reports of 10 diseases from 19 states, the District of Columbia (D.C.), and Hawaii. By 1928, all states, D.C., Hawaii, and Puerto Rico were participating in national reporting of 29 specified diseases. At their annual meeting in 1950, State and Territorial Health Officers authorized a conference of state and territorial epidemiologists whose purpose was to determine which diseases should be reported to the Public Health Service. In 1961, CDC assumed responsibility for the collection and publication of data concerning nationally notifiable diseases.

Coverage. Notifiable disease reports are received from health departments in the 50 states, five territories, New York City, and D.C. Policies for reporting notifiable disease cases can vary by disease or by reporting jurisdiction, depending on case status classification (i.e., confirmed, probable, or suspect).

Methodology. CDC, in partnership with CSTE, operates NNDSS. Notifiable disease surveillance is conducted by public health practitioners at local, state, and national levels to support disease prevention and control activities. The system also provides annual summaries of the data. CSTE and CDC annually review the status of national infectious disease surveillance and recommend additions or deletions to the list of nationally notifiable diseases, based on the need to respond to emerging priorities. For example, Q fever and tularemia became nationally notifiable in 2000. However, reporting nationally notifiable diseases to CDC is voluntary. Because reporting is currently mandated by law or regulation only at the local and state levels, the list of diseases that are considered notifiable varies slightly by state. For example, reporting of cyclosporiasis to CDC is not done by some states in which this disease is not notifiable to local or state authorities.

State epidemiologists report cases of notifiable diseases to CDC, which tabulates and publishes these data in Morbidity and Mortality Weekly Report (MMWR) and in Summary of Notifiable Diseases, United States (before 1985, titled Annual Summary).

Issues Affecting Interpretation. NNDSS data must be interpreted in light of reporting practices. Some diseases that cause severe clinical illness (for example, plague and rabies) are likely reported accurately if diagnosed by a clinician. However, persons who have diseases that are clinically mild and infrequently associated with serious consequences (e.g., salmonellosis) may not seek medical care from a health care provider. Even if these less severe diseases are diagnosed, they are less likely to be reported.

The degree of completeness of data reporting is also influenced by the diagnostic facilities available, the control measures in effect, public awareness of a specific disease, and the interests, resources, and priorities of state and local officials responsible for disease control and public health surveillance. Finally, factors such as changes in case definitions for public health surveillance, introduction of new diagnostic tests, or discovery of new disease entities can cause changes in disease reporting that are independent of the true incidence of disease.

National Nursing Home Survey (NNHS)

CDC/NCHS

Overview. NNHS collects and provides national estimates on the characteristics of nursing homes and their residents and staff.

Selected Content. NNHS provides information on nursing homes from two perspectives—that of the provider of services and that of the recipient. Data about the facilities include characteristics such as bed size, ownership, affiliation, Medicare/Medicaid certification, specialty units, services offered, number and characteristics of staff, expenses, and charges. Data about the current residents and discharges include demographic characteristics, health status, level of assistance needed with activities of daily living, vision and hearing impairment, continence, services received, sources of payment, and discharge disposition (for discharges). The redesigned NNHS conducted in 2004 included new facility data items on Joint Commission on Accreditation of Healthcare Organizations (JCAHO) accreditation, electronic information systems, cultural competency, immunization policies and practices, end-of-life practices, and special service programs, as well as new patient-level data items on hospitalizations and emergency department admissions, pain assessment and pain relief, medications, family and caregiver services, end-of-life care and advance directives, pressure ulcers, behavior or mood symptoms, falls, and out-of-pocket charges. In addition to these facility and resident data items, data were also collected on nurse staffing and a supplemental survey was conducted on nursing assistants working in nursing homes.

Data Years. NCHS has conducted seven NNHSs. The first survey was performed August 1973–April 1974; the second, May–December 1977; the third, August 1985–January 1986; the fourth, July–December 1995; the fifth, July–December 1997; and the sixth, July–December 1999. The seventh and most recent NNHS, which had undergone a major redesign, was conducted August 2004–January 2005.

Coverage. The initial NNHS, conducted in 1973–1974, included the universe of nursing homes that provided some level of nursing care and excluded homes providing only personal or domiciliary care. The 1977 NNHS encompassed all types of nursing homes, including personal care and domiciliary care homes. The 1985 NNHS was designed to be similar to the 1973–1974 survey in that it excluded personal or domiciliary care homes; however, in 1985 an unknown number of residential care facilities were present in the sampling frame. These facilities were identified in the 1986 inventory survey and can be removed from the estimate of facilities and beds for 1985. The 1995, 1997, 1999, and 2004 NNHS also included only nursing homes that provided some level of nursing care and excluded homes providing only personal or domiciliary care, similar to the 1985 and 1973–1974 surveys.

Methodology. The survey uses a stratified two-stage probability design. The first stage is the selection of facilities, and the second stage is the selection of residents and discharges. Prior to the 2004 NNHS, up to six current residents and/or six discharges were selected for each facility. The 2004 survey was designed to select only 12 current residents from each facility to participate in the survey. Information on the facility was collected through a personal interview with the administrator or with staff designated by the administrator. Resident data were provided by staff familiar with the care provided to the resident. Staff relied on the medical record and personal knowledge of the resident. In addition to employee data collected during the interview with the administrator, in several years staffing data were collected by means of a self-administered questionnaire. Discharge data, when collected, were based on information recorded in the medical record.

Current residents are those on the facility's roster as of the night before the survey. Included are all residents for whom beds are maintained, even though they may be away on an overnight leave or in the hospital. People residing in personal care or domiciliary care homes are excluded. Discharges are those who are formally discharged from care by the facility during a designated reference period randomly selected for each facility before data collection. Both live and deceased discharges are included. Residents were counted more than once if they were discharged more than once during the reference period. Resident rates are calculated using estimates of the civilian population of the United States, including institutionalized persons. Population data are from unpublished tabulations provided by the U.S. Census Bureau. The 2004 population estimates are postcensal estimates as of July 1, 2004, based on the 2000 census. For more information about the 2004 population estimates, see Technical Notes in: Kozak LJ, DeFrances CJ, Hall MJ. National Hospital Discharge Survey: 2004 annual summary with detailed diagnosis and procedure data. Vital Health Stat 13(162). Hyattsville, MD: NCHS; 2006. Available from: http://www.cdc.gov/nchs/data/series/sr_13/sr13_162acc.pdf.

Statistics for NNHS are derived by a multistage estimation procedure that has three major components: (a) inflation by the reciprocals of the probabilities of sample selection, (b) adjustment for nonresponse, and (c) ratio adjustment to fixed totals. The surveys are adjusted for four types of nonresponse: (a) when an eligible nursing facility did not respond, (b) when the facility failed to complete the sampling lists, (c) when the facility did not complete the facility questionnaire but did complete the questionnaire for residents in the facility, and (d) when the facility did not provide information to complete the questionnaire for the sample resident or discharge.

Sample Size and Response Rates. In 1973–1974, the sample of 2,118 homes was selected from the 1971 National Master Facility Inventory (NMFI) and from those that opened for business in 1972. For the 1977 NNHS, the sample of 1,698 facilities was selected from nursing homes in the sampling frame, which consisted of all homes listed in the 1973 NMFI and those opening for business between 1973 and December 1976. The sample for the 1985 survey consisted of the 1,220 facilities selected from the 1982 NMFI, data for homes identified in the 1982 Complement Survey of the NMFI, data on hospital-based nursing homes obtained from the Health Care Financing Administration (now known as the Centers for Medicare & Medicaid Services), and data on nursing homes open for business between 1982 and June 1, 1984. The 1995 sample of 1,500 homes was selected from a sampling frame consisting of nursing homes from the 1991 National Health Provider Inventory (NHPI) and updated lists from the Agency Reporting System (ARS). The ARS was an ongoing system designed to periodically update the NHPI and consisted primarily of lists or directories of facilities from state agencies, federal agencies, and national voluntary organizations. For the 1997 survey, data were obtained from about 1,488 nursing homes from a sampling frame consisting of nursing homes listed on the 1991 NHPI that was updated with a current listing of nursing facilities supplied by the Health Care Financing Administration and other national organizations. The facility frame for the 1999 NNHS consisted of all nursing homes identified in the 1997 NNHS and updated with current nursing facilities listed by the Centers for Medicare & Medicaid Services and other national organizations. The 1999 sample consisted of 1,496 nursing homes. In 1995, 1997, and 1999, facility-level response rates were over 93%. For the 2004 redesigned and expanded NNHS, 1,500 nursing homes were selected and a facility response rate of 81% was achieved.

Issues Affecting Interpretation. Samples of discharges and residents contain different populations with different characteristics. The resident sample is more likely to contain long-term nursing home residents and, conversely, to underestimate short nursing home stays. Because short-term residents are less likely to be on the nursing home rolls on a given night, they are less likely to be sampled. Estimates of discharges underestimate long nursing home stays. In addition, analysts should ensure that the underlying populations are similar across survey years—for example, whether the survey includes personal or domiciliary care homes.


National Survey on Drug Use & Health (NSDUH)

Substance Abuse and Mental Health Services Administration (SAMHSA)

Overview. NSDUH, formerly called the National Household Survey on Drug Abuse (NHSDA), collects data on substance use, abuse, and dependence; mental health problems; and receipt of substance abuse and mental health treatment.

Selected Content. NSDUH reports on the prevalence, incidence, and patterns of drug and alcohol use and abuse in the general U.S. civilian noninstitutionalized population 12 years of age and over. Data are collected primarily on the use of illicit drugs, the nonmedical use of prescription psychotherapeutic drugs, and the use of alcohol and tobacco products; dependence and abuse involving drugs and alcohol; mental health problems; and treatment of substance use and mental health problems. Data are also collected on special topics of interest, such as attitudes about drugs, health conditions, driving under the influence of alcohol and illicit drugs, and criminal behavior.

Data Years. NHSDA has been conducted periodically since 1971 and annually starting in 1990. In 1999, NHSDA underwent a major redesign affecting the method of data collection, sample design, sample size, and oversampling. In 2002, the survey’s name was changed to NSDUH, a monetary incentive for participation was introduced, and other improvements were made.

Coverage. The survey is representative of persons 12 years of age and over in the civilian noninstitutionalized population of the United States in each state and the District of Columbia. This includes civilians living on military bases and persons living in noninstitutionalized group quarters, such as college dormitories, rooming houses, and shelters. Persons excluded from the survey include homeless people who do not use shelters, active military personnel, and residents of institutional group quarters such as jails and hospitals.

Methodology. The data collection method is in-person interviews conducted with a sample of individuals at their place of residence. Prior to 1999, NSDUH used a paper-and-pencil interviewing methodology. Since 1999, the interview has been carried out with computer-assisted interviewing methodology. The survey uses a combination of computer-assisted personal interviewing (CAPI), conducted by the interviewer to obtain basic demographic information, and audio computer-assisted self-interviewing (ACASI) for most of the questions. ACASI provides a highly private and confidential means of responding to questions, to increase the level of honest reporting of illicit drug use and other sensitive behaviors.

In 1999, a 50-state sample design was introduced. Eight states (California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas) are designated as large sample states with target sample sizes of 3,600 per year. The remaining states and the District of Columbia have target sample sizes of 900 per year. This approach ensures that there are sufficient samples in every state to support small area estimation, while maintaining efficiency for national estimates. In the 1999–2001 and 2002–2004 surveys, the first-stage sampling units were clusters of census blocks called area segments. In 2005, NSDUH introduced a coordinated 5-year sample design in which the first stage of selection involved census tracts, with sample segments within a single census tract to the extent possible. States were first stratified into a total of 900 state sampling (SS) regions (48 regions in each large sample state and 12 regions in each small sample state). These regions were contiguous geographic areas designed to yield the same number of interviews on average. In the 2005–2009 surveys, a total of 48 census tracts per SS region were selected with probability proportional to size. Within sampled census tracts, adjacent census blocks were combined to form the second-stage sampling units, or area segments. One segment was selected within each sampled census tract with probability proportional to population size to support the 5-year sample and any supplemental studies that SAMHSA may choose to field. Of these segments, 24 were designated for the coordinated 5-year sample and 24 were designated as reserve segments. Eight sample segments per SS region were fielded during the 2005 survey year. These sampled segments were allocated equally into four separate samples, one for each 3-month period (calendar quarter) during the year, so that the survey was essentially continuous in the field.
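
The allocation arithmetic in this design can be checked with a short sketch. The counts and targets are those stated above; the variable names are illustrative, and the national target and per-region figures are derived here rather than quoted from the survey documentation.

```python
# Figures come from the design description above; names are illustrative only.
LARGE_STATES = 8                      # CA, FL, IL, MI, NY, OH, PA, TX
SMALL_AREAS = 50 - LARGE_STATES + 1   # remaining 42 states plus the District of Columbia

TARGET_LARGE = 3_600   # target interviews per large sample state per year
TARGET_SMALL = 900     # target interviews per remaining state (and D.C.) per year

REGIONS_LARGE = 48     # state sampling (SS) regions in each large sample state
REGIONS_SMALL = 12     # SS regions in each small sample state

# Derived national totals
total_target = LARGE_STATES * TARGET_LARGE + SMALL_AREAS * TARGET_SMALL
total_regions = LARGE_STATES * REGIONS_LARGE + SMALL_AREAS * REGIONS_SMALL

# Interviews per SS region implied by the targets; the two values should match
# because regions were designed to yield the same number of interviews on average.
per_region_large = TARGET_LARGE / REGIONS_LARGE
per_region_small = TARGET_SMALL / REGIONS_SMALL

# Eight segments per region per year, split equally over four calendar quarters
SEGMENTS_PER_REGION_PER_YEAR = 8
segments_per_quarter = SEGMENTS_PER_REGION_PER_YEAR // 4

print("SS regions:", total_regions)
print("Derived national target:", total_target)
print("Interviews per region:", per_region_large, per_region_small)
print("Segments per region per quarter:", segments_per_quarter)
```

The region count reproduces the 900 SS regions given in the text, and the equal per-region yield is consistent with the statement that regions were designed to produce the same number of interviews on average.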

The design also oversampled youths and young adults, so that each state's sample was approximately equally distributed among three major age groups: 12–17 years, 18–25 years, and 26 years and over.

Sample Size and Response Rate. Nationally, of the 160,133 eligible households sampled, 142,938 addresses were successfully screened for the 2008 survey, and in these screened households, a total of 86,435 sample persons were selected, from which 68,736 completed interviews were obtained. The survey was conducted from January to December 2008. Weighted response rates were 89% for household screening and 74% for interviewing.
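
As a rough cross-check, the unweighted rates implied by these counts can be computed directly. This is only a sketch: the published rates are weighted, so the unweighted figures need not match them, and multiplying the two weighted rates to get an overall rate is one common convention assumed here, not a figure stated in the source.

```python
# Counts reported for the 2008 survey
eligible_households = 160_133
screened_households = 142_938
selected_persons = 86_435
completed_interviews = 68_736

# Unweighted rates implied by the raw counts
unweighted_screening = screened_households / eligible_households
unweighted_interview = completed_interviews / selected_persons

# Published weighted rates
weighted_screening = 0.89
weighted_interview = 0.74

# Assumed convention: overall response rate as the product of the two stages
overall_weighted = weighted_screening * weighted_interview

print(f"Unweighted screening rate: {unweighted_screening:.1%}")
print(f"Unweighted interview rate: {unweighted_interview:.1%}")
print(f"Overall weighted rate (product of stages): {overall_weighted:.1%}")
```

The unweighted interview rate differs from the weighted 74% because the weights account for unequal selection probabilities across the sample.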

Issues Affecting Interpretation. Several improvements to the survey were implemented in 2002. In addition to the name change, respondents were offered a $30 incentive payment for participation in the survey starting in 2002, and quality control procedures for data collection were enhanced in 2001 and 2002. Because of these improvements and modifications, estimates from the NSDUH completed in 2002 and later should not be compared with estimates from the 2001 or earlier versions of the survey. The data collected in 2002 represent a new baseline for tracking trends in substance use and other measures. Special questions on methamphetamine were added in 2005 and 2006. Data for years prior to 2007 were adjusted for comparability. Estimates of substance use for youth based on NSDUH are not directly comparable with estimates based on Monitoring the Future (MTF) and the Youth Risk Behavior Surveillance System (YRBSS). In addition to the fact that MTF excludes dropouts and absentees, rates are not directly comparable across these surveys because of differences in the populations covered, sample design, questionnaires, and interview setting. NSDUH collects data in residences, whereas MTF and YRBSS collect data in school classrooms. In addition, NSDUH estimates are tabulated by age, whereas MTF and YRBSS estimates are tabulated by grade, representing different ages as well as different populations.

References
  • Hughes A, Muhuri P, Sathe N, Spagnola K. State estimates of substance use from the 2007–2008 National Surveys on Drug Use and Health. NSDUH series H–37, HHS pub no SMA 10–4472. Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies; 2010. Available from: http://www.oas.samhsa.gov/2k8State/toc.cfm.
  • Office of Applied Studies. Results from the 2008 National Survey on Drug Use and Health: National findings. NSDUH series H–36; HHS pub no SMA 09–4434. Rockville, MD: Substance Abuse and Mental Health Services Administration; 2009. Available from: http://www.oas.samhsa.gov/NSDUH/2k8NSDUH/2k8results.cfm.

National Survey of Family Growth (NSFG)

CDC/NCHS

Overview. NSFG provides national data on factors affecting birth and pregnancy rates, adoption, and maternal and infant health.

Selected Content. Data elements include sexual activity, marriage, divorce and remarriage, unmarried cohabitation, forced sexual intercourse, contraception and sterilization, infertility, breastfeeding, pregnancy loss, low birthweight, and use of medical care for family planning and infertility.

Data Years. Seven cycles of the survey have been completed: 1973, 1976, 1982, 1988, 1995, 2002, and 2006–2008.

Coverage. The 1973–1995 cycles of NSFG were based on samples of women 15–44 years of age in the civilian noninstitutionalized population of the United States. Cycles 1 and 2 (1973 and 1976) excluded most women who had never been married. Cycles 3–5 (1982, 1988, and 1995) included all women 15–44 years of age in the civilian noninstitutionalized population of the United States. Cycles 6 (2002) and 7 (2006–2008) included men and women 15–44 years of age in the household population of the United States.

Methodology. Interviews are conducted in person by professional female interviewers using a standardized questionnaire. In all cycles, black women were sampled at higher rates than white women so that detailed statistics for black women could be produced. In cycles 5 and 6 (1995 and 2002), Hispanic persons were also oversampled. In cycle 7 (2006–2008), black and Hispanic adults and all 15–19 year olds were oversampled.

To produce national estimates from the sample for the millions of women 15–44 years of age in the United States, data for the interviewed sample women were (a) inflated by the reciprocal of the probability of selection at each stage of sampling (for example, if there was a 1 in 5,000 chance that a woman would be selected for the sample, her sampling weight was 5,000); (b) adjusted for nonresponse; and (c) poststratified, or forced to agree with benchmark population values based on data from the U.S. Census Bureau.
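
The three weighting steps can be sketched as follows. This is a minimal illustration, not the NCHS procedure: the sample records, adjustment cells (region for nonresponse, age group for poststratification), and benchmark totals are all hypothetical.

```python
from collections import defaultdict
from math import prod


def base_weight(stage_probs):
    """Step (a): reciprocal of the overall probability of selection."""
    return 1.0 / prod(stage_probs)


def adjust_to_totals(records, weight_key, cell_key, totals, out_key):
    """Scale weights so they sum to totals[cell] within each cell."""
    sums = defaultdict(float)
    for r in records:
        sums[r[cell_key]] += r[weight_key]
    for r in records:
        r[out_key] = r[weight_key] * totals[r[cell_key]] / sums[r[cell_key]]


# Hypothetical sample; 0.01 * 0.02 gives the 1-in-5,000 chance from the text.
sample = [
    {"id": 1, "stage_probs": [0.01, 0.02], "region": "A", "age": "15-24", "responded": True},
    {"id": 2, "stage_probs": [0.01, 0.02], "region": "A", "age": "25-44", "responded": False},
    {"id": 3, "stage_probs": [0.01, 0.02], "region": "A", "age": "25-44", "responded": True},
    {"id": 4, "stage_probs": [0.02, 0.02], "region": "B", "age": "15-24", "responded": True},
    {"id": 5, "stage_probs": [0.02, 0.02], "region": "B", "age": "25-44", "responded": True},
]

# (a) Inflate by the reciprocal of the selection probability.
for r in sample:
    r["w_base"] = base_weight(r["stage_probs"])

# (b) Nonresponse adjustment: respondents in each region carry the weight
#     of everyone sampled in that region.
sampled_totals = defaultdict(float)
for r in sample:
    sampled_totals[r["region"]] += r["w_base"]
respondents = [r for r in sample if r["responded"]]
adjust_to_totals(respondents, "w_base", "region", sampled_totals, "w_nr")

# (c) Poststratification: force weights to agree with benchmark population
#     values (hypothetical numbers standing in for Census Bureau figures).
benchmark = {"15-24": 12_000, "25-44": 18_000}
adjust_to_totals(respondents, "w_nr", "age", benchmark, "w_final")

for r in respondents:
    print(r["id"], round(r["w_base"]), round(r["w_nr"]), round(r["w_final"]))
```

After step (c), the final weights in each age group sum to the benchmark for that group, which is what "forced to agree" means in practice.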

Sample Size and Response Rates. For cycle 1, from 101 primary sampling units (PSUs), 10,879 women 15–44 years of age were selected; 9,797 of these were interviewed. In cycle 2, from 79 PSUs, 10,202 eligible women were identified; of these, 8,611 were interviewed. In cycle 3, household screener interviews were completed in 29,511 households (95%). Of the 9,964 eligible women identified, 7,969 were interviewed. In cycle 4, 10,566 eligible women 15–44 years of age were sampled. Interviews were completed with 8,450 women. The response rate for the 1990 telephone reinterview was 68% of those responding to the 1988 survey and still eligible for the 1990 survey. In cycle 5, of the 13,795 eligible women in the sample, 10,847 were interviewed. In cycle 6, from 120 PSUs, 7,643 (about 80%) interviews were completed with eligible women and 4,928 (78%) interviews were completed with men. In cycle 7, from 110 PSUs, 7,356 (about 76%) interviews were completed with eligible women and 6,139 (about 73%) interviews were completed with men.


National Vital Statistics System (NVSS)

CDC/NCHS

Overview. NVSS collects and publishes official national statistics on births, deaths, fetal deaths, and, prior to 1996, marriages and divorces occurring in the United States, based on U.S. Standard Certificates. Fetal deaths are classified and tabulated separately from other deaths. The five vital statistics files—Birth, Mortality, Multiple Cause-of-Death, Linked Birth/Infant Death, and Compressed Mortality—are described in detail below.

Data Years. The death registration area for 1900 consisted of 10 states, the District of Columbia (D.C.), and a number of cities located in nonregistration states; it covered 40% of the continental U.S. population. The birth registration area was established in 1915 with 10 states and D.C. The birth and death registration areas continued to expand until 1933, when they included all 48 states and D.C. Alaska and Hawaii were added to both registration areas in 1959 and 1960, respectively—the years in which they gained statehood.

Coverage. NVSS collects and presents U.S. resident data for the aggregate of 50 states, New York City, and D.C., as well as for each individual state and D.C. Vital events occurring in the United States to non-U.S. residents and vital events occurring abroad to U.S. residents are excluded.

Methodology. NCHS's Division of Vital Statistics obtains information on births and deaths from the registration offices of each of the 50 states, New York City, D.C., Puerto Rico, the U.S. Virgin Islands, Guam, American Samoa, and Northern Mariana Islands. Until 1972, microfilm copies of all death certificates and a 50% sample of birth certificates were received from all registration areas and processed by NCHS. In 1972, some states began sending their data to NCHS through the Cooperative Health Statistics System (CHSS). States that participated in the CHSS program processed 100% of their death and birth records and sent the entire data file to NCHS on computer tapes. Currently, data are sent to NCHS through the Vital Statistics Cooperative Program (VSCP), following the same procedures as with CHSS. The number of participating states grew from 6 in 1972 to 46 in 1984. Starting in 1985, all 50 states and D.C. participated in VSCP.

U.S. Standard Certificates. U.S. Standard Certificates of Live Birth and Death and Fetal Death Reports are revised periodically, allowing evaluation and addition, modification, and deletion of items. Beginning with 1989, revised Standard Certificates replaced the 1978 versions. The 1989 revision of the birth certificate included items to identify the Hispanic parentage of newborns and to expand information about maternal and infant health characteristics. The 1989 revision of the death certificate included items on educational attainment and Hispanic origin of decedents, as well as changes to improve the medical certification of cause of death. Standard Certificates recommended by NCHS are modified in each registration area to serve the area's needs. However, most certificates conform closely in content and arrangement to the Standard Certificate, and all certificates contain a minimum data set specified by NCHS. The 2003 revision of vital records went into effect in some states beginning in 2003, but full implementation in all states will be phased in over several years.

Birth File

Overview. Vital statistics natality data are a fundamental source of demographic, geographic, and medical and health information on all births occurring in the United States. This is one of the few sources of comparable health-related data for small geographic areas over an extended time period. The data are used to present the characteristics of babies and their mothers, track trends such as birth rates for teenagers, and compare natality trends with those in other countries.

Selected Content. The Birth file includes characteristics of the baby, such as sex, birthweight, and weeks of gestation; demographic information about the parents, such as age, race, Hispanic origin, parity, educational attainment, marital status, and state of residence; medical and health information, such as prenatal care, based on hospital records; and behavioral risk factors for the birth, such as mother's tobacco use during pregnancy.

Data Years. The birth registration area began in 1915 with 10 states and the District of Columbia.

Methodology. In the United States, state laws require birth certificates to be completed for all births. The registration of births is the responsibility of the professional attendant at birth, generally a physician or midwife. The birth certificate must be filed with the local registrar of the district in which the birth occurs. Each birth must be reported promptly; the reporting requirements vary from state to state, ranging from 24 hours to as much as 10 days after the birth.

Federal law mandates national collection and publication of birth and other vital statistics data. NVSS is the result of cooperation between NCHS and the states to provide access to statistical information from birth certificates. Standard forms for the collection of the data, and model procedures for the uniform registration of the events, are developed and recommended for state use through cooperative activities of the states and NCHS. NCHS shares the costs incurred by the states in providing vital statistics data for national use.

Issues Affecting Interpretation. Data on mother's educational attainment, tobacco use during pregnancy, and prenatal care based on the 2003 revision of the U.S. Standard Certificate of Live Birth are not comparable with data based on the 1989 revision of the U.S. Standard Certificate of Live Birth. For 2006 and 2007, data on mother's educational attainment, tobacco use during pregnancy, and prenatal care are shown separately for the 17–19 reporting areas that used the 2003 revision in 2006–2007 and for the 28 reporting areas that continued to use the 1989 revision in 2007, in order to provide 2 years of comparable data. Data are not shown for reporting areas that were transitioning from the 1989 revision to the 2003 revision during 2006–2007 or for states that had other comparability issues with these three items during that timeframe. The states that implemented the 2003 revision of the U.S. Standard Certificate of Live Birth are as follows: starting in 2003, Pennsylvania and Washington; and starting in 2004, Idaho, Kentucky, New York state (excluding New York City), South Carolina, and Tennessee. Starting in 2005, the reporting area using the 2003 revision expanded to 13 states, adding Florida, Kansas, Nebraska, New Hampshire, Texas, and Vermont (midyear). Starting in 2006, the reporting area using the 2003 revision included 19 states, with the addition of California, Delaware, North Dakota, Ohio, South Dakota, and Wyoming. California does not report information on tobacco use during pregnancy. In 2007, 22 states (California, Colorado, Delaware, Florida, Idaho, Indiana, Iowa, Kansas, Kentucky, Nebraska, New Hampshire, New York state (excluding New York City), North Dakota, Ohio, Pennsylvania, South Carolina, South Dakota, Tennessee, Texas, Vermont, Washington, and Wyoming) reported births using the 2003 revision. Approximately one-half (53%) of all births in 2007 were reported using the 2003 revision.
Prior to 2003, the number of states reporting information on maternal education, Hispanic origin, marital status, and tobacco use during pregnancy increased over the years. Interpretation of trend data should take into consideration changes to reporting areas and immigration. For methodological and reporting area changes for the following birth certificate items, see Appendix II: Age (maternal); Cigarette smoking; Education (maternal); Hispanic origin; Marital status; Prenatal care; Race.
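
Because the reporting area for the 2003 revision grew year by year, it can help to tally the adoption timeline given above. The sketch below uses only the state names listed in the text; "New York" stands for New York state excluding New York City, and the data structure is illustrative.

```python
# States named in the text as starting to use the 2003 revision, by year
adopters_by_year = {
    2003: ["Pennsylvania", "Washington"],
    2004: ["Idaho", "Kentucky", "New York", "South Carolina", "Tennessee"],
    2005: ["Florida", "Kansas", "Nebraska", "New Hampshire", "Texas", "Vermont"],
    2006: ["California", "Delaware", "North Dakota", "Ohio", "South Dakota", "Wyoming"],
}

# The 22-state list given in the text
reported_22 = {
    "California", "Colorado", "Delaware", "Florida", "Idaho", "Indiana",
    "Iowa", "Kansas", "Kentucky", "Nebraska", "New Hampshire", "New York",
    "North Dakota", "Ohio", "Pennsylvania", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Vermont", "Washington", "Wyoming",
}

# Cumulative size of the reporting area at each step
cumulative = set()
counts = {}
for year in sorted(adopters_by_year):
    cumulative |= set(adopters_by_year[year])
    counts[year] = len(cumulative)

# States in the 22-state list that the year-by-year timeline does not name
not_in_timeline = sorted(reported_22 - cumulative)

print(counts)
print(not_in_timeline)
```

The cumulative counts reproduce the 13-state and 19-state figures for 2005 and 2006, and the set difference identifies the three states that bring the total to 22.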


Mortality File

Overview. Vital statistics mortality data are a fundamental source of demographic, geographic, and cause-of-death information. This is one of the few sources of comparable health-related data for small geographic areas over an extended time period. The data are used to present the characteristics of those dying in the United States, to determine life expectancy, and to compare mortality trends with those in other countries.

Selected Content. The Mortality file includes demographic information on age, sex, race, Hispanic origin, state of residence, and educational attainment, as well as medical information on cause of death.

Data Years. The death registration area began in 1900 with 10 states and the District of Columbia.

Methodology. By law, the registration of deaths is the responsibility of the funeral director. The funeral director obtains demographic data for the death certificate from an informant. The physician in attendance at the death is required to certify the cause of death. Where death is from other than natural causes, a coroner or medical examiner may be required to examine the body and certify the cause of death. Data for the entire United States refer to events occurring within the United States; data for geographic areas are by place of residence. For methodological and reporting area changes for the following death certificate items, see Appendix II: Education; Hispanic origin; Race.

Issues Affecting Interpretation. The International Classification of Diseases (ICD), by which cause of death is coded and classified, is revised approximately every 10–20 years. Because revisions of the ICD may cause discontinuities in trend data by cause of death, comparison of death rates by cause of death across ICD revisions should be done with caution and with reference to the comparability ratio. (See Appendix II, Comparability ratio.) Prior to 1999, modifications to the ICD were made only when a new revision of the ICD was implemented. A process for updating the ICD was introduced with the 10th revision (ICD–10) that allows for mid-revision changes. These changes, however, may affect comparability of data between years for select causes of death. Minor changes may be implemented every year, whereas major changes may be implemented every 3 years (e.g., 2003 data year). In data year 2006, major changes were implemented, including the addition and deletion of several ICD codes. For more information, see:

Heron M, Hoyert DL, Murphy SL, Xu JQ, Kochanek KD, Tejada-Vera B. Deaths: Final data for 2006. National vital statistics reports; vol 57 no 14. Hyattsville, MD: NCHS; 2009. Available from: http://www.cdc.gov/nchs/data/nvsr/nvsr57/nvsr57_14.pdf.

The death certificate has been revised periodically. A revised U.S. Standard Certificate of Death was recommended for state use beginning January 1, 1989. Among the changes were the addition of new items on educational attainment and Hispanic origin of the decedent and changes to improve the medical certification of cause of death. The U.S. Standard Certificate of Death was revised again in 2003; states are adopting this new certificate on a rolling basis. The 2003 revision included significant changes in the way that information on educational attainment, maternal mortality, and race is collected and coded. The educational attainment item was changed to be consistent with the U.S. Census Bureau data and to improve the ability to identify specific types of educational degrees. Educational attainment data collected using the 2003 revision are not comparable with data collected using the 1989 revision. The 2003 revision introduced a standard question on pregnancy status of female decedents. This change, in addition to changes in the classification of maternal death under ICD–10, allows for more complete reporting of deaths associated with pregnancy, childbirth, and the puerperium. These changes may affect trends in maternal mortality. The 2003 revision also permits reporting of more than one race (multiple races). This change was implemented to reflect the increasing diversity of the U.S. population and to be consistent with the decennial census. Many states, however, are still using the 1989 revision of the U.S. Standard Certificate of Death, which allows only a single race to be reported. Until all states adopt the new death certificate, race data reported using the 2003 revision are “bridged” to a single race for decedents for whom more than one race was reported, to provide comparability with race data reported on the 1989 revision.
For more information on the impact of the 2003 certificate revisions on mortality data presented in Health, United States, including a list of states that have adopted the 2003 certificate, see Appendix II: Education; Maternal death; Race.


Multiple Cause-of-Death File

Overview. Multiple cause-of-death data reflect all medical information reported on death certificates and complement traditional underlying cause-of-death data. Multiple-cause data give information on diseases that are a factor in death, whether or not they are the underlying cause of death; on associations among diseases; and on injuries leading to death.

Selected Content. In addition to the same demographic variables listed for the Mortality file, the Multiple Cause-of-Death file includes record axis and entity axis cause-of-death data (see Methodology, below).

Data Years. Multiple cause-of-death data files are available for every data year since 1968.

Methodology. NCHS is responsible for compiling and publishing annual national statistics on causes of death. In carrying out this responsibility, NCHS adheres to the World Health Organization (WHO) Nomenclature Regulations. These regulations require (a) that cause of death be coded in accordance with the applicable revision of the International Classification of Diseases (ICD) (see Appendix II, International Classification of Diseases; and Table IV); and (b) that underlying cause of death be selected in accordance with international rules. Traditionally, national mortality statistics have been based on a count of deaths, with one underlying cause assigned for each death.

Prior to 1968, mortality medical data were based on manual coding of an underlying cause of death for each certificate, in accordance with WHO rules. Starting with 1968, NCHS converted to computerized coding of the underlying cause and manual coding of all causes (multiple causes) on the death certificate. In this system, called Automated Classification of Medical Entities (ACME), multiple cause codes serve as inputs to the computer software that employs WHO rules to select the underlying cause. All cause-of-death data in this report are coded using ACME. ACME is used to select the underlying cause of death for all death certificates in the United States. In addition, NCHS has developed two computer systems as inputs to ACME. Beginning with 1990 data, the Mortality Medical Indexing, Classification, and Retrieval system (MICAR) was introduced to automate coding multiple causes of death. In addition, MICAR provides more detailed information on the conditions reported on death certificates than is available through the ICD code structure. Then, beginning with data year 1993, SuperMICAR, an enhancement of MICAR, was introduced. SuperMICAR allows for literal entry of the multiple cause-of-death text as reported by the certifier. This information is then processed automatically by the MICAR and ACME computer systems. Records that cannot be processed automatically by MICAR or SuperMICAR are manually multiple-cause coded and then further processed through ACME. In 2006, SuperMICAR was used to process all of the Nation's death records.

Issues Affecting Interpretation. The ICD, by which cause of death is coded and classified, is revised approximately every 10 to 15 years. Revisions of the ICD may cause discontinuities in trend data by cause of death; therefore, comparison of death rates by cause of death across ICD revisions should be done with caution and with reference to the comparability ratio. (See Appendix II, Comparability ratio.) Data were obtained from all certificates for 1968–1971, 1973–1980, and 1983–present. Data were obtained from a 50% sample of certificates for 1972. Multiple-cause data for 1981 and 1982 were obtained from a 50% sample of certificates from 19 registration areas. For the other states, data were obtained from all certificates.


Linked Birth/Infant Death Data Set

Overview. National linked files of live births and infant deaths are used for research on infant mortality.

Selected Content. The Linked Birth/Infant Death data set includes all variables on the natality (Birth) file, including racial and ethnic information, birthweight, and maternal smoking, as well as variables on the Mortality file, including cause of death and age at death.

Data Years. National linked files of live births and infant deaths were first produced for the 1983 birth cohort. Birth cohort linked file data are available for 1983–1991, and both period linked files and birth cohort linked files are available starting with 1995. National linked files do not exist for 1992–1994.

Coverage. To be included in the U.S. linked file, both the birth and death must have occurred in the 50 states or the District of Columbia.

Methodology. Infant mortality rates are based on infant deaths per 1,000 live births. An infant death is defined as a death before the infant's first birthday. About 98%–99% of infant death records can be linked to their corresponding birth certificates. The linkage makes available extensive information about the pregnancy, maternal risk factors, infant characteristics, and health items at birth that can be used in analyses of infant mortality.

Starting with data year 1995, more timely linked file data are produced in a period data format preceding the release of the corresponding birth cohort format. The 2006 period linked file contains a numerator file that consists of all infant deaths occurring in 2006 that have been linked to their corresponding birth certificates, whether the birth occurred in 2005 or 2006. In contrast, the 2006 birth cohort linked file will contain a numerator file that consists of all infant deaths to babies born in 2006, whether the death occurred in 2006 or 2007. Starting with 1995 data, the infant mortality rate tables in Health, United States that use linked file data are based on period linked files. For the 2006 file, NCHS accepted birth records that could be linked to infant deaths even if the births were registered after the closure of the 2006 Birth file (fewer than 100 cases). This improved the infant birth/death linkage and made the denominator file distinctly different from the official 2006 Birth file.
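
The difference between the period and birth cohort numerators can be made concrete with a small sketch. The record type and the sample records are hypothetical; only the selection rules come from the description above.

```python
from typing import NamedTuple


class LinkedDeath(NamedTuple):
    """A linked infant death record (hypothetical, reduced to two fields)."""
    birth_year: int
    death_year: int


def period_numerator(records, year):
    """Infant deaths occurring in `year`, whatever the year of birth."""
    return [r for r in records if r.death_year == year]


def cohort_numerator(records, year):
    """Infant deaths to babies born in `year`, whatever the year of death."""
    return [r for r in records if r.birth_year == year]


records = [
    LinkedDeath(birth_year=2005, death_year=2005),
    LinkedDeath(birth_year=2005, death_year=2006),
    LinkedDeath(birth_year=2006, death_year=2006),
    LinkedDeath(birth_year=2006, death_year=2007),
]

period_2006 = period_numerator(records, 2006)
cohort_2006 = cohort_numerator(records, 2006)

print("2006 period numerator:", period_2006)
print("2006 cohort numerator:", cohort_2006)
```

The 2007 death to a baby born in 2006 appears only in the cohort numerator, which is why the cohort file cannot be closed until the following data year is complete.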

Other changes to the data set starting with 1995 include addition of record weights to compensate for the 1%–2% infant death records that could not be linked to their corresponding birth records. In addition, not-stated birthweight was imputed if the period of gestation was known. This imputation was done to improve the accuracy of birthweight-specific infant mortality rates because the percentage of records with not-stated birthweight is generally higher for infant deaths (3.1% in 2006) than for live births (0.1% in 2006). In 2006, not-stated birthweight was imputed for 0.09% of births.
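
One simple way to build a record weight that compensates for unlinked deaths is to inflate each linked record by the ratio of all infant deaths to linked infant deaths within a cell. The sketch below assumes that form; the cell labels and counts are hypothetical, and the actual NCHS weighting cells are not described here.

```python
def linkage_weight(linked, unlinked):
    """Weight applied to each linked record so linked records sum to all deaths."""
    if linked <= 0:
        raise ValueError("at least one linked record is required")
    return (linked + unlinked) / linked


# Hypothetical cells: (linked infant deaths, unlinked infant deaths)
cells = {
    "cell A": (985, 15),   # 1.5% unlinked
    "cell B": (495, 5),    # 1.0% unlinked
}

for name, (linked, unlinked) in cells.items():
    w = linkage_weight(linked, unlinked)
    print(f"{name}: weight {w:.4f}, weighted deaths {w * linked:.0f}")
```

With 1%–2% of records unlinked, the resulting weights are only slightly above 1, so they restore the total death count without materially changing the distribution of linked records.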

Issues Affecting Interpretation. Period linked file data starting with 1995 are not strictly comparable with birth cohort data for 1983–1991. Although birth cohort linked files have methodological advantages, their production incurs substantial delays in data availability because it is necessary to wait until the close of a second data year to include all infant deaths to the birth cohort. Data on mother's educational attainment, tobacco use during pregnancy, and prenatal care based on the 2003 revision are not comparable with data based on the 1989 revision of the U.S. Standard Certificate of Live Birth and are currently excluded from the Health, United States statistics on infant mortality by mother's educational attainment. (See Appendix II, Education.)


Compressed Mortality File (CMF)

Overview. The CMF is a county-level national mortality and population database.

Selected Content. The CMF contains mortality data derived from the detailed Mortality files of the National Vital Statistics System and estimates of U.S. national, state, and county resident populations from the U.S. Census Bureau. For 1968–1998, number of deaths, crude death rates, and age-adjusted death rates can be obtained by place of residence (total U.S., state, and county), age group, race (white, black, and other), sex, year of death, and underlying cause of death. For 1999–2006, mortality statistics can be obtained by place of residence, by age group and expanded race groups (white, black, American Indian or Alaska Native, Asian or Pacific Islander), and by Hispanic origin.

Data Years. The CMF spans the years 1968–2006. On CDC WONDER, data are available starting with 1979.

Methodology. In Health, United States, the CMF is used to compute death rates by urbanization level of the decedent's county of residence. Counties are categorized according to level of urbanization based on the 2006 NCHS Urban–Rural Classification Scheme for Counties. This scheme assigns counties and county equivalents to one of six urbanization levels: four metropolitan and two nonmetropolitan.
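
A death rate by urbanization level can be sketched by pooling county deaths and populations within each level. The level names follow the 2006 NCHS scheme; the county records are hypothetical, and the sketch computes a crude rate per 100,000 population only, leaving out age adjustment for brevity.

```python
from collections import defaultdict

# Six levels of the 2006 NCHS scheme: four metropolitan, two nonmetropolitan
URBANIZATION_LEVELS = {
    1: ("Large central metro", "metropolitan"),
    2: ("Large fringe metro", "metropolitan"),
    3: ("Medium metro", "metropolitan"),
    4: ("Small metro", "metropolitan"),
    5: ("Micropolitan", "nonmetropolitan"),
    6: ("Noncore", "nonmetropolitan"),
}

# Hypothetical county records: (level code, deaths, resident population)
counties = [
    (1, 8_000, 1_000_000),
    (1, 4_000, 500_000),
    (6, 300, 25_000),
    (6, 200, 25_000),
]


def crude_rates_by_level(county_records):
    """Crude death rate per 100,000 population for each urbanization level."""
    deaths = defaultdict(int)
    population = defaultdict(int)
    for level, d, p in county_records:
        deaths[level] += d
        population[level] += p
    return {lvl: deaths[lvl] * 100_000 / population[lvl] for lvl in deaths}


rates = crude_rates_by_level(counties)
for level, rate in sorted(rates.items()):
    name, group = URBANIZATION_LEVELS[level]
    print(f"{name} ({group}): {rate:.1f} per 100,000")
```

Pooling deaths and populations before dividing weights each county by its population, which differs from averaging county-level rates.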

For More Information: See the CMF website at: http://www.cdc.gov/nchs/data_access/cmf.htm and the CDC WONDER website at: http://wonder.cdc.gov. (Also see Appendix II, Urbanization.)

Occupational Employment Statistics (OES)

Bureau of Labor Statistics (BLS)

Overview. The OES program conducts a semiannual survey designed to produce estimates of employment and wages for specific occupations.

Selected Content. The OES survey produces estimates of occupational employment and wages for most, three-, four-, and selected five-digit North American Industry Classification System (NAICS) levels in these sectors: forestry and logging; mining; utilities; construction; manufacturing; wholesale trade; retail trade; transportation and warehousing; information; finance and insurance; real estate and rental and leasing; professional, scientific, and technical services; management of companies and enterprises; administrative and support and waste management and remediation services; educational services; health care and social assistance; arts, entertainment, and recreation; accommodation and food services; other services (except public administration); and federal, state, and local government.

Data Years. Prior to 1996, the OES program collected only occupational employment data for selected industries in each year of the 3-year survey cycle and produced only industry-specific estimates of occupational employment. The 1996 survey round was the first year that the OES program began collecting occupational employment and wage data in every state. In addition, the program's 3-year survey cycle was modified to collect data from all covered industries each year. The year 1997 is the earliest year available for which the OES program produced estimates of cross-industry as well as industry-specific occupational employment and wages.

Coverage. The OES survey covers all full-time and part-time wage and salary workers in nonfarm establishments. Surveys collect data for the payroll period including the 12th day of May or November, depending on the industry surveyed. The survey does not cover the self-employed, owners and partners in unincorporated firms, household workers, or unpaid family workers.

Methodology. The OES survey is a federal–state cooperative program between the BLS and state workforce agencies (SWAs). The OES program surveys approximately 200,000 establishments per panel (every 6 months), taking 3 years to fully collect the sample of 1.2 million establishments. Mail surveys collect data for the payroll period including the 12th day of May or November, depending on the industry surveyed. The estimates for occupations in nonfarm establishments are based on OES data collected for the reference months of May and November. BLS provides the procedures and technical support, draws the sample, and produces the survey materials, while SWAs collect the data. SWAs from all 50 states plus the District of Columbia (D.C.), Puerto Rico, Guam, and the U.S. Virgin Islands participate in the survey. Occupational employment and wage rate estimates at the national level are produced by BLS using data from the 50 states and D.C. Employers who respond to states' requests to participate in the OES survey make these estimates possible. The nationwide response rate for the May 2009 survey was 78% for establishments, covering 74% of employment. The survey included establishments sampled in the May 2009, November 2008, May 2008, November 2007, May 2007, and November 2006 semiannual panels.
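The panel arithmetic described above can be sketched in Python. This is an illustration only, not BLS code: the function name `oes_panels` is hypothetical, and the figure of 200,000 establishments per panel is the approximate number quoted in the text.

```python
def oes_panels(end_year, end_month, n_panels=6):
    """Return the n_panels semiannual (year, month) panels ending at the given panel.

    Panels use the reference months May (5) and November (11), newest first.
    """
    if end_month not in (5, 11):
        raise ValueError("OES reference months are May (5) and November (11)")
    panels = []
    year, month = end_year, end_month
    for _ in range(n_panels):
        panels.append((year, month))
        if month == 11:
            month = 5                    # step back from November to May of the same year
        else:
            year, month = year - 1, 11   # step back from May to November of the prior year
    return panels


ESTABLISHMENTS_PER_PANEL = 200_000  # approximate figure from the text

panels = oes_panels(2009, 5)
total_sample = ESTABLISHMENTS_PER_PANEL * len(panels)
```

For the May 2009 estimates, `panels` runs from May 2009 back to November 2006, and six panels of roughly 200,000 establishments give the full sample of about 1.2 million.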

Issues Affecting Interpretation. The OES survey began using NAICS in 2002. Data prior to 2002 are based on the Standard Industrial Classification system. In 1999, the OES survey began using the new Office of Management and Budget (OMB) Standard Occupational Classification (SOC) system. The new SOC system, which will be used by all federal statistical agencies for reporting occupational data, consists of 821 detailed occupations, grouped into 449 broad occupations, 96 minor groups, and 23 major groups. The OES program provides occupational employment and wage estimates at the major group and detailed occupation level. Because of the OES survey's transition to the SOC system, estimates for 1999 and subsequent years are not directly comparable with previous years' OES estimates, which were based on a classification system having seven major occupational groups and 770 detailed occupations. Approximately one-half of the detailed occupations were unchanged under the new SOC system, with the other half being SOC occupations or occupations that are slightly different from similar occupations in the old OES classification system. Guam, Puerto Rico, and the U.S. Virgin Islands were surveyed, but their data were not included in the May 2008 survey.

Reference
  • Bureau of Labor Statistics. Occupational employment and wages, May 2009. Washington, DC: U.S. Department of Labor; 2010 May.

Online Survey Certification and Reporting Database (OSCAR)

Centers for Medicare & Medicaid Services (CMS)

Overview. OSCAR is an administrative database containing detailed information on all Medicare- and Medicaid-certified institutional health care providers, including all currently and previously certified Medicare and Medicaid nursing homes, short-term hospitals, and intermediate care facilities for the mentally retarded in the United States and territories. (Data for the territories are not shown in Health, United States.) The purpose of the facility survey certification process is to ensure that facilities meet the current CMS care requirements and thus can be reimbursed for services furnished to Medicare and Medicaid beneficiaries.

Selected Content. OSCAR contains information on facility and patient characteristics and health deficiencies issued by the government during state surveys.

Data Years. OSCAR has been maintained by CMS, formerly the Health Care Financing Administration (HCFA), since 1992. OSCAR is an updated version of the Medicare and Medicaid Automated Certification System that had been in existence since 1972.

Coverage. Facilities in the United States that receive Medicare or Medicaid payments are included.

Methodology. A facility representative fills out the forms with the required information, and the forms are submitted to CMS. The information provided can be audited at any time.

All certified facilities are inspected periodically by representatives of the state survey agency (generally the department of health). For nursing homes, for example, the survey cycle is every 15 months. Therefore, a complete census of nursing homes must be based on a 15-month reporting cycle rather than a 12-month cycle. Some nursing homes are inspected twice, or more often, during any given reporting cycle. To avoid overcounting, the data must be edited and duplicates removed. Data editing and compilation of nursing home data were performed by Cowles Research Group and published in the group's Nursing Home Statistical Yearbook series. Data editing and compilation for other facilities were performed by NCHS staff.
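The duplicate-removal step can be illustrated with a short Python sketch. The record layout (facility identifier plus survey date) and the rule of keeping the most recent survey in the window are assumptions made for illustration; the text does not describe the editing rules actually used by Cowles Research Group or NCHS.

```python
from datetime import date


def latest_survey_per_facility(surveys, cycle_end, cycle_months=15):
    """Keep one record per facility: its most recent survey within the reporting cycle.

    surveys: iterable of (facility_id, survey_date) tuples (hypothetical layout).
    The window covers cycle_months calendar months ending with the month of cycle_end.
    """
    end_index = cycle_end.year * 12 + (cycle_end.month - 1)
    start_index = end_index - (cycle_months - 1)
    cycle_start = date(start_index // 12, start_index % 12 + 1, 1)

    latest = {}
    for facility_id, survey_date in surveys:
        if not (cycle_start <= survey_date <= cycle_end):
            continue  # survey falls outside the 15-month reporting cycle
        if facility_id not in latest or survey_date > latest[facility_id]:
            latest[facility_id] = survey_date
    return latest


# Made-up records: facility "A" was inspected twice, "C" only before the window.
surveys = [
    ("A", date(2008, 11, 3)),
    ("A", date(2009, 10, 20)),
    ("B", date(2009, 2, 14)),
    ("C", date(2008, 6, 1)),
]
census = latest_survey_per_facility(surveys, date(2009, 12, 31))
```

Each facility inspected during the cycle is counted once, which is the property a census of nursing homes needs.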

References
  • Cowles CM, editor. Nursing home statistical yearbooks for 1995, 1996, and 1997. Anacortes, WA: Cowles Research Group (CRG); published 1995, 1997, and 1998, respectively.
  • Cowles CM, editor. Nursing home statistical yearbooks for 1998, 1999, 2000, 2001, and 2002. Washington, DC: American Association of Homes and Services for the Aging (AAHSA); published 1999, 2000, 2001, 2002, and 2003, respectively.
  • Cowles CM, editor. Nursing home statistical yearbooks for 2003–2009. McMinnville, OR: Cowles Research Group (CRG); published 2004, 2005, 2006, 2007, 2008, 2009, and 2010, respectively.
  • Centers for Medicare & Medicaid Services. Certification and compliance. 2005. Available from: http://www.cms.gov/CertificationandComplianc/01_Overview.asp.

Population Census and Population Estimates

U.S. Census Bureau Decennial Census

The census of population (decennial census) has been held in the United States every 10 years since 1790. It has enumerated the resident population as of April 1 of the census year since 1930. Data on sex, race, Hispanic origin, age, and marital status are collected from 100% of the enumerated population. More detailed information such as income, education, housing, occupation, and industry are collected from a representative sample of the population.

Race Data on the 1990 Census

The question on race on the 1990 census was based on the Office of Management and Budget's (OMB) 1977 Race and Ethnic Standards for Federal Statistics and Administrative Reporting (Statistical Policy Directive 15). This document specified rules for the collection, tabulation, and reporting of race/ethnicity data within the federal statistical system. The 1977 Standards required federal agencies to report race-specific tabulations using four single-race categories: American Indian or Alaska Native, Asian or Pacific Islander, black, and white. Under the 1977 Standards, race and ethnicity were considered to be two separate and distinct concepts. Thus, persons of Hispanic origin may be of any race.

Race Data on the 2000 Census

The question on race on the 2000 census was based on OMB's 1997 Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity (Fed Regist 1997 October 30;62:58781–90). (Also see Appendix II, Race.) The 1997 Standards incorporated two major changes in the collection, tabulation, and presentation of race data. First, the 1997 Standards increased from four to five the minimum set of categories to be used by federal agencies for identification of race: American Indian or Alaska Native, Asian, black or African American, Native Hawaiian or Other Pacific Islander, and white. Second, the 1997 Standards included the requirement that federal data collection programs allow respondents to select one or more race categories when responding to a query on their racial identity. This provision means that there are potentially 31 race groups, depending on whether an individual selects one, two, three, four, or all five of the race categories. The 1997 Standards continue to call for use, when possible, of a separate question on Hispanic or Latino ethnicity and specify that the ethnicity question should appear before the question on race. Thus, under the 1997 Standards, as under the 1977 Standards, Hispanics may be of any race.
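The count of 31 race groups follows from choosing any nonempty subset of the five categories (2^5 − 1 = 31). A short Python sketch enumerates them; the variable names are illustrative only.

```python
from itertools import combinations

RACE_CATEGORIES = [
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
]

# Every way of selecting one, two, three, four, or all five categories.
race_groups = [
    combo
    for size in range(1, len(RACE_CATEGORIES) + 1)
    for combo in combinations(RACE_CATEGORIES, size)
]

# Number of groups by how many categories were selected.
counts_by_size = {
    size: sum(1 for group in race_groups if len(group) == size)
    for size in range(1, len(RACE_CATEGORIES) + 1)
}
```

The enumeration yields 5 single-race groups and 26 multiple-race groups (10 + 10 + 5 + 1), for 31 in total.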

Modified Decennial Census Files

For several decades the U.S. Census Bureau has produced Modified Decennial Census files. These modified files incorporate adjustments to the 100% April 1 count data for (a) errors in the census data discovered subsequent to publication, (b) misreported age data, and (c) nonspecified race.

For the 1990 census, the U.S. Census Bureau modified the age, race, and sex data on the census and produced the Modified Age Race Sex (MARS) file. The differences between the population counts in the original census file and the MARS file are primarily due to modification of the race data. Of the 248.7 million persons enumerated in 1990, 9.8 million persons did not specify their race (over 95% were of Hispanic origin). For the 1990 MARS file, these persons were assigned the race reported by a nearby person with an identical response to the Hispanic origin question.

For the 2000 census, the U.S. Census Bureau modified the race data on the census and produced the Modified Race Data Summary file. For this file, persons who reported the category Some Other Race as part of their race response were assigned to one of the 31 race groups, which are the single- and multiple-race combinations of the five race categories specified in the 1997 race and ethnicity standards. Persons who did not specify their race were assigned to one of the 31 race groups by imputation. Of the 18.5 million persons who reported the category Some Other Race as part of their race response, or who did not specify their race, 16.8 million (90.4%) were of Hispanic origin.

Bridged-race Population Estimates for Census 2000

Race data on the 2000 census are not comparable with race data on other data systems that are continuing to collect data using the 1977 Standards on race and ethnicity during the transition to full implementation of the 1997 Standards. For example, states are implementing the revised birth and death certificates, which have race and ethnicity items that are compliant with the 1997 OMB Standards, at different times, and to date, many states are still using the 1989 certificates that collect race and ethnicity data in accordance with the 1977 Standards. Thus, population estimates for 2000 and beyond with race categories comparable to the 1977 categories are needed so that race-specific birth and death rates can be calculated. To meet this need, NCHS, in collaboration with the U.S. Census Bureau, developed methodology to bridge the 31 race groups in Census 2000 to the four single-race categories specified under the 1977 Standards.

The bridging methodology was developed using information from the 1997–2000 National Health Interview Survey (NHIS). The NHIS provides a unique opportunity to investigate multiple-race groups because, since 1982, it has allowed respondents to choose more than one race but has also asked respondents reporting multiple races to choose a primary race. The bridging methodology developed by NCHS involved the application of regression models relating person-level and county-level covariates to the selection of a particular primary race by the multiple-race respondents. Bridging proportions derived from these models were applied by the U.S. Census Bureau to the Census 2000 Modified Race Data Summary file. This application resulted in bridged counts of the April 1, 2000, resident single-race populations for four racial groups: American Indian or Alaska Native, Asian or Pacific Islander, black, and white. As bridged-race population estimates continue to be needed for the calculation of vital rates, the Census Bureau annually produces postcensal bridged-race estimates of the July 1 resident single-race populations.
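The application of bridging proportions to multiple-race counts can be sketched as follows. This is a simplified illustration, not the NCHS method: the actual proportions come from regression models and vary with person-level and county-level covariates, whereas this sketch assumes a single fixed proportion per multiple-race group. The function name and all numbers are hypothetical.

```python
def bridge_counts(single_race_counts, multiple_race_counts, bridging_proportions):
    """Allocate multiple-race counts to single-race categories.

    bridging_proportions maps each multiple-race group to the share of that
    group assigned to each single-race category (shares sum to 1).
    """
    bridged = dict(single_race_counts)
    for group, count in multiple_race_counts.items():
        for race, share in bridging_proportions[group].items():
            bridged[race] = bridged.get(race, 0.0) + count * share
    return bridged


# Hypothetical inputs for one small area.
single = {"white": 1000.0, "black": 200.0}
multiple = {("black", "white"): 40.0}
proportions = {("black", "white"): {"black": 0.75, "white": 0.25}}

bridged = bridge_counts(single, multiple, proportions)
```

The total population is unchanged by bridging; only its distribution across the four single-race categories changes.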

Postcensal Population Estimates

Postcensal population estimates are estimates made for the years following a census, before the next census has been taken. National postcensal population estimates are derived annually by updating the resident population enumerated in the decennial census using a components-of-population-change approach. Each annual series includes estimates for the current data year and revised estimates for the earlier years in the decade. The following formula is used to derive the estimates for a given year from those for the previous year, starting with the decennial census enumerated resident population as the base:

  • Resident population
  • + Births to U.S. resident women
  • − Deaths to U.S. residents
  • + Net international migration.
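The components-of-population-change formula above can be written as a small Python sketch. The function names and the example figures are made up for illustration; the actual Census Bureau procedure also handles the April 1 to July 1 reference dates and demographic detail that this sketch ignores.

```python
def postcensal_estimate(base_population, births, deaths, net_international_migration):
    """Apply one year of components of change to a base population."""
    return base_population + births - deaths + net_international_migration


def postcensal_series(census_population, yearly_components):
    """Chain the formula year by year, starting from the census count.

    yearly_components: list of (births, deaths, net_international_migration).
    """
    estimates = []
    population = census_population
    for births, deaths, migration in yearly_components:
        population = postcensal_estimate(population, births, deaths, migration)
        estimates.append(population)
    return estimates


# Made-up figures, in thousands.
series = postcensal_series(1000, [(20, 10, 5), (22, 11, 4)])
```

Each year's estimate becomes the base for the next, so a revision to any component in an early year carries forward through the rest of the series.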

The postcensal estimates are consistent with official decennial census figures and do not reflect estimated decennial census underenumeration.

Estimates for the earlier years in a given series are revised to reflect changes in the components-of-change data sets (for example, births to U.S. resident women from a preliminary natality file are replaced with counts from a final natality file). To help users keep track of which postcensal estimate is being used, each annual series is referred to as a vintage and the last year in the series is used to name the series. For example, the Vintage 2001 postcensal series has estimates for July 1, 2000, and July 1, 2001, and the Vintage 2002 postcensal series has revised estimates for July 1, 2000, and July 1, 2001, as well as estimates for July 1, 2002. The estimates for July 1, 2000, and July 1, 2001, therefore differ between the Vintage 2001 and Vintage 2002 series.

The U.S. Census Bureau also produces postcensal estimates of the resident population for each state and county by applying a components-of-population-change method at the county level. This method involves an additional component of population change, net internal migration. The state population estimates are produced by summing all county populations within each state.

The Census Bureau has annually produced a postcensal series of estimates of the July 1 resident population of the United States based on Census 2000 by applying the components of change methodology to the Modified Race Data Summary file. These series of postcensal estimates have race data for 31 race groups, in accordance with the 1997 race and ethnicity standards. So that the race data for 2000-based postcensal estimates will be comparable with race data on vital records, the Census Bureau has applied the NHIS bridging methodology to each 31-race-group postcensal series of population estimates to obtain bridged-race postcensal estimates (estimates for the four single-race categories: American Indian or Alaska Native, Asian or Pacific Islander, black, and white). Bridged-race postcensal population estimates are available from: http://www.cdc.gov/nchs/nvss/bridged_race.htm.

Vital rates for 2000 were calculated using the bridged-race April 1, 2000, census counts, and vital rates for 2001 and beyond were calculated using bridged-race estimates of the July 1 population from the corresponding postcensal vintage.

Intercensal Population Estimates

Intercensal population estimates are estimates made for the years between two censuses and are produced once the decennial census at the end of the decade has been completed. They replace the postcensal estimates that were produced prior to the completion of the census at the end of the decade. Intercensal estimates are more accurate than postcensal estimates because they are based on both the census at the beginning and the census at the end of the decade and thus correct for the error of closure (the difference between the estimated population at the end of the decade and the census count for that date). The error of closure at the national level was quite small for the 1960s (379,000). However, for the 1970s it amounted to almost 5 million; for the 1980s, 1.5 million; and for the 1990s, about 6 million. The error of closure affects age, race, sex, and Hispanic origin subgroup populations differently, as well as the rates based on these populations. Vital rates that were calculated using postcensal population estimates are routinely revised when intercensal estimates become available.
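A brief Python sketch shows how the error of closure is computed and why rates are revised once intercensal estimates are available. The sign convention (estimate minus census count) and all figures are assumptions for illustration.

```python
def error_of_closure(postcensal_estimate, census_count):
    """Estimated population at the end of the decade minus the census count."""
    return postcensal_estimate - census_count


def rate_per_100k(events, population):
    """Events per 100,000 population."""
    return events * 100_000 / population


# Made-up figures for one population subgroup.
postcensal_population = 1_000_000
census_population = 1_060_000
deaths = 8_000

closure = error_of_closure(postcensal_population, census_population)
original_rate = rate_per_100k(deaths, postcensal_population)
revised_rate = rate_per_100k(deaths, census_population)
```

Here the postcensal series underestimated the population, so the rate based on it is too high and falls when the denominator is corrected.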

Intercensal estimates for the 1990s with race data comparable to the 1977 Standards have been derived so that vital rates for the 1990s could be revised to reflect Census 2000. Calculation of the intercensal population estimates for the 1990s was complicated by the incomparability of the race data on the 1990 and 2000 censuses. The Census Bureau, in collaboration with the National Cancer Institute and NCHS, derived race-specific intercensal population estimates for the 1990s using the 1990 MARS file as the beginning population base and the bridged-race population estimates for April 1, 2000, as the ending population base. Bridged-race intercensal population estimates are available from: http://www.cdc.gov/nchs/nvss/bridged_race.htm.

For More Information: See the U.S. Census Bureau website at: http://www.census.gov.

Sexually Transmitted Disease (STD) Surveillance

CDC/National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention (NCHHSTP)

Overview. Surveillance information on the incidence and prevalence of STDs is used to inform public and private health efforts to control these diseases.

Selected Content. Case reporting data are available for nationally notifiable chancroid, chlamydia, gonorrhea, and syphilis. Surveillance of other STDs, such as genital herpes simplex virus, genital warts or other human papillomavirus infections, and trichomoniasis, is based on estimates of office visits in physicians' office practices provided by the National Disease and Therapeutic Index.

Data Years. STD national surveillance data have been collected since 1941.

Coverage. STD cases are reported to CDC by STD surveillance systems operated by state and local STD control programs and health departments in 50 states, the District of Columbia, selected cities, 3,141 U.S. counties, and outlying areas consisting of U.S. dependencies, possessions, and independent nations in free association with the United States. Data from outlying areas are not included in Health, United States.

Methodology. Information is obtained from the following data sources: (a) case reports from STD project areas; (b) prevalence data from the Regional Infertility Prevention Project, the National Job Training Program (formerly the Job Corps), the Corrections STD Prevalence Monitoring Projects, and the Men Who Have Sex With Men Prevalence Monitoring Project; (c) sentinel surveillance of gonococcal antimicrobial resistance from the Gonococcal Isolate Surveillance Project; and (d) national sample surveys implemented by federal and private organizations. STD data are submitted to CDC on a variety of hard-copy summary reporting forms (monthly, quarterly, and annually) and in electronic summary or individual case-specific (line-listed) formats via the National Electronic Telecommunications System for Surveillance.

Issues Affecting Interpretation. Because of incomplete diagnosis and reporting, the number of STD cases reported to CDC undercounts the actual number of cases occurring among the U.S. population.

Reference
  • CDC. Sexually transmitted diseases surveillance 2008. Atlanta, GA: CDC, National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention; 2009. Available from: http://www.cdc.gov/std/stats08/toc.htm.

Surveillance, Epidemiology, and End Results Program (SEER)

National Cancer Institute (NCI)

Overview. SEER tracks the incidence of new cancers each year and collects follow-up information on all previously diagnosed patients until their death.

Selected Content. For each cancer, SEER registries routinely collect data on patient demographics, primary tumor site, morphology, stage at diagnosis, first course of treatment, and follow-up for vital status.

Data Years. Case ascertainment for SEER began January 1, 1973, and has continued for more than 37 years. The most recent data available are for 2007.

Coverage. The SEER 9 registries (Atlanta, Connecticut, Detroit, Hawaii, Iowa, New Mexico, San Francisco–Oakland, Seattle–Puget Sound, and Utah) have been part of the program continuously since 1975. The SEER 13 registries (the SEER 9 registries plus Los Angeles, San Jose–Monterey, rural Georgia, and the Alaska Native Tumor Registry) have been part of the program continuously since 1992. The SEER 17 registries (the SEER 13 plus Kentucky, Greater California, New Jersey, and Louisiana) have been part of the program continuously since 2000. SEER currently collects and publishes cancer incidence and survival data from 17 population-based cancer registries covering approximately 26% of the U.S. population.

To ensure continuity in reporting areas for trend data, the SEER data file is commonly used both for statistical analyses and for analysis of cancer survival rates in Health, United States. The SEER 13 data file is commonly used for analysis of cancer incidence by expanded racial and ethnic groups.

Methodology. A cancer registry collects and stores data on cancers diagnosed in a specific hospital or medical facility (hospital-based registry) or in a defined geographic area (population-based registry). A population-based registry includes, but is not limited to, a number of hospital-based registries. In SEER registry areas, trained coders abstract medical records using the International Classification of Diseases for Oncology, Third Edition (ICD–O–3), which provides coding systems for site and tumor morphology. The third edition, implemented in 2001, is the first complete review and revision of the text and guidelines since the original publication in 1988. The major staging systems used by cancer registries are American Joint Committee on Cancer TNM (tumor, nodes, metastasis) staging and SEER Summary Stage. The SEER Extent of Disease (EOD) and TNM stages include schemes for all sites and morphologies and are used by NCI to derive SEER Summary Stage and Collaborative Staging.

NCI obtains population counts from the U.S. Census Bureau and uses them to calculate incidence rates. It also uses estimation procedures as needed to obtain estimates for years and races not included in data provided by the U.S. Census Bureau. Life tables used to determine general population life expectancy when calculating relative survival rates were obtained from NCHS and in-house calculations. Separate life tables are used for each race-sex-specific group included in SEER.
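The role of the life tables can be illustrated with the basic definition of relative survival: observed survival among cancer patients divided by the survival expected for a comparable group in the general population. The Python sketch below uses made-up proportions and a hypothetical function name; SEER's actual estimation procedures are more detailed than this single ratio.

```python
def relative_survival(observed_survival, expected_survival):
    """Ratio of observed patient survival to expected general-population survival.

    Both arguments are proportions surviving over the same interval; the
    expected value would come from a life table for the matching race-sex group.
    """
    if not 0 < expected_survival <= 1:
        raise ValueError("expected_survival must be in (0, 1]")
    if not 0 <= observed_survival <= 1:
        raise ValueError("observed_survival must be in [0, 1]")
    return observed_survival / expected_survival


# Made-up 5-year proportions for one race-sex group.
ratio = relative_survival(observed_survival=0.60, expected_survival=0.80)
```

Because the expected survival comes from a race-sex-specific life table, the same observed survival can correspond to different relative survival ratios in different groups.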

Issues Affecting Interpretation. Because of the addition of registries over time, analysis of long-term incidence and survival trends is limited to those registries that have been in SEER for similar lengths of time. Analysis of Hispanic and American Indian and Alaska Native data is limited to shorter trends. Starting with Health, United States, 2006, the North American Association of Central Cancer Registries (NAACCR) Hispanic Identification Algorithm was used on a combination of variables to classify cases as Hispanic for analytic purposes. Starting with Health, United States, 2007, Hispanic incidence data exclude data for Alaska. Earlier editions of Health, United States also excluded Hispanic data for Hawaii and Seattle. Starting with Health, United States, 2007, incidence estimates for the American Indian or Alaska Native population are limited to contract health service delivery area (CHSDA) counties within SEER reporting areas. This change is believed to produce estimates that more accurately reflect the incidence rates for this population group. More information on CHSDA is available from: http://www.ihs.gov/NonMedicalPrograms/dqwg/dqwg-section1-home.asp. For more information on SEER estimates by race/ethnicity, see: http://seer.cancer.gov/seerstat/variables/seer/race_ethnicity/index.html. Rates presented in this report may differ somewhat from those reported previously due to changes in population estimates and the addition and deletion of small numbers of incidence cases.

Reference
  • Altekruse SF, Kosary CL, Krapcho M, Neyman N, Aminou R, Waldron W, et al., editors. SEER cancer statistics review, 1975–2007. (Based on November 2009 SEER data submission). Bethesda, MD: National Cancer Institute; 2010. Available from: http://seer.cancer.gov/csr/1975_2007.

Survey of Mental Health Organizations (SMHO)

Substance Abuse and Mental Health Services Administration (SAMHSA)

Overview. SMHO/General Hospital Mental Health Services (GHMHS) collects data on the number and characteristics of specialty mental health organizations in the United States.

Selected Content. This inventory collects basic information such as types of mental health organizations, ownership, number of additions and residents, and number of beds. The sample survey is a more detailed questionnaire that covers types of services provided, revenues and expenditures, staffing, and many items relating to managed behavioral health care.

Data Years. The Inventory of Mental Health Organizations (IMHO/GHMHS) was conducted biennially from 1986 until 1994. SMHO replaced IMHO/GHMHS in 1998. SMHO and the inventory used as its sampling frame have been conducted biennially, starting in 1998.

Coverage. Organizations included are state and county mental hospitals, private psychiatric hospitals, nonfederal general hospitals with separate psychiatric services, Department of Veterans Affairs medical centers, residential treatment centers for emotionally disturbed children, freestanding outpatient psychiatric clinics, partial care organizations, freestanding day–night organizations, and multiservice mental health organizations not elsewhere classified.

Methodology. IMHO was an inventory of all mental health organizations. Its core questionnaire included a version designed for specialty mental health organizations and another for nonfederal general hospitals with separate psychiatric services. The data system was based on questionnaires mailed every other year to mental health organizations in the United States. In 1998, IMHO was replaced by SMHO. SMHO is made up of two parts. A complete inventory is done by postcard, gathering a limited amount of information. The inventory is then used as a sampling frame for SMHO, which contains most of the information from the IMHO core questionnaire as well as new items about managed behavioral health care.

Sample Size and Response Rate. In Phase I, all organizations (about 10,000) were inventoried by postcard. A complete enumeration was needed to define the sampling frame for the sample survey. In Phase II, general hospitals without separate mental health units, community residential organizations, and managed behavioral health care organizations are dropped from the sampling frame. From this number, approximately 1,600–2,200 organizations are drawn for the sample survey and are sent a questionnaire, with a response rate of approximately 90%.

Issues Affecting Interpretation. Revisions to definitions of providers include phasing out Community Mental Health Centers as a category after 1981–1982; increasing the number of multiservice mental health organizations from 1981–1986; increasing the number of psychiatric outpatient clinics in 1981–1982 but decreasing the number in 1983–1984, 1986, 1990, and 1992; and increasing the number of partial care services in 1983–1984. These changes should be noted when interyear comparisons for the affected organizations and service types are made. The increase in the number of general hospitals with separate psychiatric services was partially due to a more concerted effort to identify these organizations. Forms had been sent only to those hospitals previously identified as having a separate psychiatric service. Beginning in 1980–1981, a screener form was sent to general hospitals not previously identified as providing a separate psychiatric service, to determine whether they had such a service.

Survey of Occupational Injuries and Illnesses (SOII)

Bureau of Labor Statistics (BLS)

Overview. SOII is a federal/state program that collects statistics used to identify problems with workplace safety and to develop programs to improve workplace safety. Occupational Safety and Health Administration (OSHA) regulations require the recording and reporting by employers of occupational fatalities, injuries, and illnesses. Each January, a sample of employers is selected by BLS to participate in a mandatory SOII for that calendar year.

Selected Content. Data include the number of new nonfatal injuries and illnesses by industry. The case and demographic data provide additional details on workers injured, the nature of the disabling condition, and the event and source producing that condition for those cases that involve one or more days away from work.

Data Years. BLS has conducted an annual survey since 1971.

Coverage. The data represent persons employed in private industry establishments in the United States. The survey excludes the self-employed, farms with fewer than 11 employees, private households, and federal government agencies. BLS produces annual estimates of injuries and illnesses for many of the two-, three-, four-, five-, and six-digit private-sector industries as defined by the North American Industry Classification System (NAICS).

Methodology. Survey estimates of occupational injuries and illnesses are based on a scientifically selected probability sample of establishments, rather than a census of all establishments. Each January, an independent sample of establishments is selected for each state and the District of Columbia to participate in the mandatory SOII. BLS includes all the state samples in the national sample.

Establishments included in the survey are instructed to maintain lists of injuries and illnesses and to track days away from work, days of restricted work activity, and days of job transfer for the calendar year, using the OSHA Summary of Work-Related Injuries and Illnesses form (OSHA No. 300A). In January following the year of data collection, BLS mails the SOII to this sample of employers. An occupational injury is any injury, such as a cut, fracture, sprain, or amputation, that results from a work-related event or from a single instantaneous exposure in the work environment. An occupational illness is any abnormal condition or disorder, other than one resulting from an occupational injury, caused by exposure to factors associated with employment. It includes acute and chronic illnesses or diseases that may be caused by inhalation, absorption, ingestion, or direct contact. Prior to 2002, injury and illness cases involved days away from work, days of restricted work activity, or both (lost workday cases). Starting in 2002, injury and illness cases may involve days away from work, job transfer, or restricted work activity. Restriction may involve shortened hours, a temporary job change, or temporary restrictions on certain duties (for example, no heavy lifting) of a worker's regular job.

Sample Size and Response Rates. Employer reports were collected from about 205,500 private industry establishments in 2008. The survey response rate was 91% in 2008.

Issues Affecting Interpretation. The number of new injuries and illnesses reported in any given year can be influenced by the level of economic activity, working conditions and work practices, worker experience and training, and number of hours worked. Long-term latent illnesses caused by exposure to carcinogens are believed to be understated in the survey's illness measures. In contrast, new illnesses such as contact dermatitis and carpal tunnel syndrome are easier to relate directly to workplace activity.

Effective January 1, 2002, OSHA revised its requirement for recording occupational injuries and illnesses. Because of the revised recordkeeping rule, the estimates from the 2002 survey and beyond are not comparable with those from previous years. See http://www.osha.gov/recordkeeping/index.html for details on the revised recordkeeping requirements.

Data for the mining industry and for railroad activities are provided by the Department of Labor's Mine Safety and Health Administration and the Department of Transportation's Federal Railroad Administration. Neither of these agencies adopted the revised OSHA recordkeeping requirements for 2002. Therefore, estimates for these industries for 2002 and beyond are not comparable with estimates for other industries but are comparable with estimates for prior years. Excluded from the survey are self-employed individuals, farmers with fewer than 11 employees, private households, federal government agencies, and employees in state and local government agencies.

Starting with 2003 data, SOII began using NAICS to classify industries. Prior to 2003, the program used the Standard Industrial Classification (SIC) system and the Bureau of the Census occupational classification system. Although some titles in SIC and NAICS are similar, there is limited compatibility because industry groupings are defined differently in the two systems. (See Appendix II, Industry of employment.)

United States Renal Data System (USRDS)

National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), in conjunction with the Centers for Medicare & Medicaid Services (CMS) and the Health Resources and Services Administration (HRSA)

Overview. USRDS is a national data system that collects, analyzes, and distributes information about end-stage renal disease (ESRD) in the United States. USRDS staff collaborate with staff from CMS, HRSA, the Organ Procurement and Transplantation Network (OPTN, which operates under the auspices of HRSA), and the ESRD networks, sharing data sets and working to improve the accuracy of ESRD patient information. USRDS has five goals: (a) to characterize the ESRD population; (b) to describe the prevalence and incidence of ESRD, along with trends in mortality and disease rates; (c) to investigate relationships among patient demographics, treatment modalities, and morbidity; (d) to identify new areas for special renal studies and support investigator-initiated research; and (e) to provide data sets and samples of national data to support research by the Special Studies Centers.

Selected Content. USRDS maintains a stand-alone database with data on the diagnoses and demographic characteristics of ESRD patients, along with biochemical data, dialysis claims, and information on treatment and payor histories, hospitalization events, deaths, physician/supplier services, and providers.

Data Years. Data have been compiled annually since 1988.

Coverage. The primary source of ESRD identification is the ESRD Medical Evidence form that is used to register patients at the onset of ESRD and must be submitted by dialysis or transplant providers within 45 days of initiation. The form establishes Medicare eligibility for individuals previously not Medicare beneficiaries, reclassifies previously eligible beneficiaries as ESRD patients, and provides demographic and diagnostic information on all new patients. The CMS, USRDS, and renal research communities rely on the form to ascertain patient demographics, primary diagnosis, comorbidities, and biochemical test results at the time of ESRD initiation. Since 1995, providers have been required to complete the form for all new ESRD patients (Medicare and non-Medicare eligible).

Methodology. Data for the USRDS database are compiled from existing data sources, including the CMS Renal Management Information System (REMIS), CMS claims data, facility survey data, CDC survey data (NHANES), the Standard Information Management System (SIMS), the Medical Evidence form (CMS–2728), the ESRD Death Notification form (CMS–2746), and OPTN transplant and wait-list data. CMS supplements its data files with enrollment, payer history, and other administrative data to provide utilization and demographic information on ESRD patients.

Sample Size and Response Rate. Coverage is 100% of people treated for ESRD since May 1995 because the amended ESRD entitlement policy requires a Medical Evidence form to be submitted for all ESRD patients, regardless of their insurance and eligibility status. However, payment data for non-Medicare ESRD patients may be absent during the 30-month coordination period. Ascertainment of incident cases may also be incomplete because the data cover only persons receiving ESRD treatment as reported to CMS; they do not include patients who die of ESRD before receiving treatment or those who are not reported to CMS.

For More Information: See the USRDS website at: http://www.usrds.org.

Youth Risk Behavior Survey (YRBS)

CDC/National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP)

Overview. YRBS monitors health risk behaviors among students in grades 9–12 that contribute to morbidity and mortality in both adolescence and adulthood.

Selected Content. Data are collected on behaviors that contribute to unintentional injuries and violence; tobacco use; alcohol and other drug use; sexual behaviors that contribute to unintended pregnancy and sexually transmitted diseases (STDs), including human immunodeficiency virus (HIV) infection; unhealthy dietary behaviors; and physical inactivity. In addition, YRBS monitors the prevalence of obesity and asthma.

Data Years. The national YRBS of high school students was conducted in 1990, 1991, 1993, 1995, 1997, 1999, 2001, 2003, 2005, 2007, and 2009.

Coverage. Data are representative of high school students in public and private schools in the United States.

Methodology. The national YRBS school-based surveys employ a three-stage cluster sample design to produce a nationally representative sample of students in grades 9–12 attending public and private high schools. The first-stage sampling frame contains primary sampling units (PSUs) consisting of large counties or groups of smaller, adjacent counties. The PSUs are then stratified based on degree of urbanization and relative percentage of black and Hispanic students in the PSU. The PSUs are selected from these strata with probability proportional to school enrollment size. At the second sampling stage, schools are selected with probability proportional to school enrollment size. To enable separate analysis of data for black and Hispanic students, schools with substantial numbers of black and Hispanic students are sampled at higher rates than all other schools. The third stage of sampling consists of randomly selecting one or two intact classes of a required subject from grades 9–12 at each chosen school. All students in the selected classes are eligible to participate in the survey. A weighting factor is applied to each student record to adjust for nonresponse and for the varying probabilities of selection, including those resulting from the oversampling of black and Hispanic students.
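The weighting step can be illustrated with a minimal Python sketch. It assumes, as is common for multistage designs but is not spelled out in this appendix, that a student's weight is the inverse of the product of the selection probabilities at the three stages, inflated by the inverse of a response rate. The function name, the probabilities, and the response rate are hypothetical and are not YRBS values.

```python
def student_weight(p_psu, p_school, p_class, response_rate):
    """Return an illustrative design weight for one student record."""
    inputs = [("p_psu", p_psu), ("p_school", p_school),
              ("p_class", p_class), ("response_rate", response_rate)]
    for name, value in inputs:
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in (0, 1], got {value}")
    # Overall chance that the student's class was selected across the three stages.
    selection_probability = p_psu * p_school * p_class
    # Inverse-probability weight, adjusted upward for nonresponse.
    return 1.0 / (selection_probability * response_rate)


# Hypothetical values: a school sampled at twice the rate of another school
# gives each of its students half the weight, which offsets the oversampling.
oversampled = student_weight(p_psu=0.1, p_school=0.2, p_class=0.5, response_rate=0.8)
other = student_weight(p_psu=0.1, p_school=0.1, p_class=0.5, response_rate=0.8)
print(round(oversampled), round(other))
```

Under these assumptions, weighted estimates are not distorted by the higher sampling rate applied to schools with substantial numbers of black and Hispanic students.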

Sample Size and Response Rate. The sample size for the 2009 YRBS was 16,460 students in 158 schools. The school response rate was 81%, and the student response rate was 88%, for an overall response rate of 71%.
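The overall response rate reported above is consistent with multiplying the school and student response rates. The short Python check below uses the two published rates; the variable names are illustrative.

```python
school_response_rate = 0.81   # share of sampled schools that participated
student_response_rate = 0.88  # share of eligible students who completed the survey

# The overall rate is the product of the two stage-level rates.
overall_response_rate = school_response_rate * student_response_rate
print(f"{overall_response_rate:.0%}")  # 0.7128, displayed as 71%
```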

Issues Affecting Interpretation. National YRBS data are subject to at least two limitations. First, these data apply only to adolescents who attend regular high school. These students may not be representative of all persons in this age group because those who have dropped out of high school or attend an alternative high school are not surveyed. Second, the extent of underreporting or overreporting cannot be determined, although the survey questions demonstrate good test–retest reliability.

Estimates of substance use for youth based on the YRBS differ from the National Survey on Drug Use & Health (NSDUH) and Monitoring the Future (MTF). Rates are not directly comparable across these surveys because of differences in populations covered, sample design, questionnaires, and interview setting. NSDUH collects data in residences, whereas MTF and YRBS collect data in school classrooms. In addition, NSDUH estimates are tabulated by age, whereas MTF and YRBS estimates are tabulated by grade, representing different ages as well as different populations.

Private and Global Sources

American Association of Colleges of Osteopathic Medicine (AACOM)

AACOM, founded in 1898, compiles data on various aspects of osteopathic medical education for distribution to the profession, the government, and the public. Questionnaires are sent annually to schools of osteopathic medicine requesting information on characteristics of applicants, students and graduates, faculty, curriculum, contract and grant activity, revenues and expenditures, and clinical facilities. The response rate is 100%.

Reference

  • American Association of Colleges of Osteopathic Medicine. 2006 Annual statistical report on osteopathic medical education. Chevy Chase, MD: American Association of Colleges of Osteopathic Medicine; 2007.

For More Information

  • Contact the American Association of Colleges of Osteopathic Medicine, 5550 Friendship Boulevard, Suite 310, Chevy Chase, MD 20815–7231; or see the AACOM website at: http://www.aacom.org.

American Association of Colleges of Pharmacy (AACP)

AACP compiles data on colleges of pharmacy, including information on student enrollment and types of degrees conferred. Data are collected through an annual survey. In 2007, the response rate was 99%.

Reference

  • American Association of Colleges of Pharmacy. Profile of pharmacy students: Fall 2008. Alexandria, VA: American Association of Colleges of Pharmacy; 2009.

For More Information

  • Contact the American Association of Colleges of Pharmacy, 1727 King Street, Alexandria, VA 22314; or see the AACP website at: http://www.aacp.org.

American Association of Colleges of Podiatric Medicine (AACPM)

AACPM compiles data on colleges of podiatric medicine, including information on the schools and enrollment. Data are collected annually through written questionnaires. The response rate is 100%.

For More Information: Contact the American Association of Colleges of Podiatric Medicine, 15850 Crabbs Branch Way, Suite 320, Rockville, MD 20855; or see the AACPM website at: http://www.aacpm.org.

American Dental Association (ADA)

ADA's Division of Educational Measurement conducts annual surveys of predoctoral dental educational institutions. A questionnaire, mailed to all dental schools, collects information on academic programs, admissions, enrollment, attrition, graduates, educational expenses and financial assistance, patient care, advanced dental education, and faculty positions.

Reference

  • American Dental Association. 2007–2008 Survey of dental education, vol 1, Academic programs, enrollments, and graduates. Chicago, IL: American Dental Association; 2009.

For More Information

  • Contact the American Dental Association, 211 East Chicago Avenue, Chicago, IL 60611–2678; or see the ADA website at: http://www.ada.org.

American Hospital Association (AHA) Annual Survey of Hospitals

Data from the AHA's annual survey are based on questionnaires sent to all AHA-registered and nonregistered hospitals in the United States and its associated areas. U.S. government hospitals located outside the United States are excluded. Overall, the average response rate over the past 5 years has been approximately 85%. For nonreporting hospitals and for the survey questionnaires of reporting hospitals on which some information was missing, estimates are made for all data except those on beds, bassinets, and facilities. Data for beds and bassinets of nonreporting hospitals are based on the most recent information available from those hospitals. Data for facilities and services are based only on reporting hospitals. Estimates of other types of missing data are based on data reported the previous year, if available. When unavailable, estimates are based on data furnished by reporting hospitals similar in size, control, major service provided, length of stay, and geographic and demographic characteristics.

For More Information: Contact the AHA Annual Survey of Hospitals, Health Forum, LLC, an American Hospital Association Company, One North Franklin Street, Chicago, IL 60606; or see the AHA website at: http://www.aha.org.

American Medical Association (AMA) Physician Masterfile

A master file of physicians has been maintained by the AMA since 1906. The Physician Masterfile contains data on all physicians in the United States, both members and nonmembers of the AMA, and on those graduates of American medical schools temporarily practicing overseas. The file also includes information on international medical graduates (IMGs), who are graduates of foreign medical schools, who reside in the United States, and who meet U.S. educational standards for primary recognition as physicians.

A file is initiated on each individual upon entry into medical school or, in the case of IMGs, upon entry into the United States. Between 1965 and 1985, a mail questionnaire survey was conducted every 4 years to update the file information on professional activities, self-designated area of specialization, and present employment status. Since 1985, approximately one-fourth of all physicians have been surveyed each year.

Reference

  • American Medical Association, Division of Survey and Data Resources. Physician characteristics and distribution in the U.S., 2009. Chicago, IL: American Medical Association; 2009.

For More Information

  • Contact the American Medical Association, 515 North State Street, Chicago, IL 60654; or see the AMA website at: http://www.ama-assn.org.

American Osteopathic Association (AOA)

AOA was established to promote the public health, to encourage scientific research, and to maintain and improve high standards of medical education in osteopathic colleges. The AOA Department of Educational Affairs sets the standards for and accredits osteopathic medical colleges and hospitals, postdoctoral training, and board certification programs. AOA publishes both professional and public informational materials. Professional publications include information on osteopathic education, accreditation of hospitals and other health care delivery facilities, and physician licensing. Public information materials include introductory materials on osteopathic medicine, brochures on osteopathic physicians and osteopathic medicine, and patient education materials. AOA compiles the number of osteopathic physicians (DOs); the number of active DOs by gender, age, and specialty and by 50 states and the District of Columbia; and the number of osteopathic medical students by selected characteristics. Statistics for 2007 are available from: http://www.osteopathic.org/inside-aoa/about/who-we-are/Pages/aoa-annual-statistics.aspx.

For More Information: Contact the American Osteopathic Association, 142 East Ontario Street, Chicago, IL 60611; or see the AOA website at: http://www.osteopathic.org.

Association of American Medical Colleges (AAMC)

AAMC collects information on student enrollment in medical schools through its annual Liaison Committee on Medical Education questionnaire, the fall enrollment questionnaire, and the American Medical College Application Service (AMCAS) data system. Other data sources are the Medical School Profile System, the Pre-MCAT questionnaire, the Minority Student Opportunities in Medicine questionnaire, the Faculty Roster system, data from the Medical College Admission Test, and one-time surveys developed for special projects.

The AAMC Data Warehouse (DW) stores two sections of data relevant to applicants and students: AAMC DW: AMF (Applicant Matriculant file) and AAMC DW: Student. From these two source files, AAMC derives summary statistics about applicants, accepted applicants, matriculants, enrollees, and graduates. AAMC DW: AMF compiles applicant and matriculant data from AMCAS and other medical school application processes. AAMC DW: Student compiles enrollee and graduate data from the AAMC Student Records System. Applicant, enrollment, and graduate statistical data are arranged by academic year, which begins July 1 and ends June 30.

Reference

  • Association of American Medical Colleges. Statistical information related to medical schools and teaching hospitals. Washington, DC: Association of American Medical Colleges; 2008.

For More Information

  • Contact the Association of American Medical Colleges, 2450 N Street, NW, Washington, DC 20037–1126; or see the AAMC website at: http://www.aamc.org.

Association of Schools and Colleges of Optometry (ASCO)

ASCO compiles data on various aspects of optometric education, including data on schools and enrollment. Questionnaires are sent annually to all schools and colleges of optometry. The response rate is 100%.

Reference

  • Association of Schools and Colleges of Optometry. Annual survey of optometric educational institutions: July 1992–June 1993. Rockville, MD: Association of Schools and Colleges of Optometry; 1994.

For More Information

  • Contact the Association of Schools and Colleges of Optometry, 6110 Executive Boulevard, Suite 420, Rockville, MD 20852; or see the ASCO website at: http://www.opted.org.

Association of Schools of Public Health (ASPH)

ASPH compiles data on schools of public health in the United States and Puerto Rico. Questionnaires are sent annually to all member schools. The response rate is 100%.

Unlike health professional schools that emphasize specific clinical occupations, schools of public health offer study in specialty areas such as biostatistics, epidemiology, environmental health, occupational health, health administration, health planning, nutrition, maternal and child health, social and behavioral sciences, and other population-based sciences.

For More Information: Contact the Association of Schools of Public Health, 1101 15th Street, NW, Suite 910, Washington, DC 20005; or see the ASPH website at: http://www.asph.org.

Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) Census

The CT/MRI Census is a biennial telephone survey that queries all hospital and nonhospital sites in the United States performing CT and MRI procedures. The census details the types of procedures being performed, procedure volumes, staffing and productivity, installed equipment, planned equipment purchases, and annual budgets for consumables, including contrast media.

Candidate sites for MRI/CT procedures are identified in the American Hospital Association's AHA Guide. U.S. territories are not included.

References

  • American Hospital Association. AHA guide, 2010. Chicago, IL: American Hospital Association; 2009.

  • IMV, Medical Information Division. 2006 Computed tomography (CT) and magnetic resonance imaging (MRI) census, Benchmark report: Installed base of CT scanners; Installed base of MRI scanners. Des Plaines, IL: IMV Ltd., Medical Information Division; 2007.

Dartmouth Atlas of Health Care

The Dartmouth Institute

Overview. The Dartmouth Atlas Project (DAP) began in 1993 as a study of health care markets in the United States, measuring variations in health care resources and their utilization by geographic areas: local hospital market areas, regional referral regions, and states. More recently, the research agenda has expanded to reporting on the resources and utilization among patients at specific hospitals. DAP research uses very large claims databases from the Medicare program and other sources to define where Americans seek care and what kind of care they receive, and to correlate increasing expenditures and the supply of health providers and services with health outcomes.

Selected Content. The database contains information on Medicare spending and on Medicare utilization of selected services, providers, and facilities, by state, local, and regional market areas; by selected subpopulations of Medicare beneficiaries, including decedents and chronically ill beneficiaries; and by providers. The database also allows users to compare quality measures across hospitals.

Data Years. Dartmouth Atlas data are available for 1994 onward.

Coverage. Medicare beneficiaries between the ages of 65 and 99 years with full Part A and Part B entitlement are included in the database. Persons enrolled in managed care organizations are excluded from the analysis.

Methodology. Data reported in Health, United States, as computed by DAP, use Medicare claims and administrative data (see Appendix I, Medicare Administrative Data). The percentage of Medicare deaths occurring in a hospital was computed using “death in a hospital” (discharge status B in the Medicare Provider Analysis and Review (MEDPAR) file) as the numerator event. For the percentage of Medicare decedents who were admitted to an intensive care unit (ICU) in the last 6 months of life, the numerator event was “death in a hospital with admission to an ICU within 6 months of the death date” using MEDPAR files. Rates were age-, sex-, and race-adjusted and were expressed as a percentage of deaths. Medicare decedents are identified by their ZIP code of residence.

Total ICU days combines intensive care days (medical, surgical, trauma, and burn care) with coronary care days. Days in intermediate care or step-down units are also included.

Sample Size and Response Rate. The data are from the MEDPAR file, a 100% sample of inpatient claims. The file includes one record for each hospital stay by a Medicare beneficiary, including data on dates of admission and discharge, diagnoses, procedures, and Medicare reimbursements to the hospital.

Issues Affecting Interpretation. The data do not include Medicare enrollees enrolled in managed care organizations under Medicare Advantage.

For More Information: Contact Dartmouth Atlas of Health Care, c/o The Dartmouth Institute for Health Policy and Clinical Practice, 35 Centerra Parkway, Suite 202, Lebanon, NH 03766; or see the Dartmouth Atlas of Health Care website at: http://www.dartmouthatlas.org/faq.shtm.

Guttmacher Institute Abortion Provider Census

Overview. The Guttmacher Institute (previously called the Alan Guttmacher Institute, or AGI) is a not-for-profit organization for reproductive health research, policy analysis, and public education. The institute's abortion provider surveillance program documents the number of legal induced abortions, monitors unintended pregnancy, and assists in efforts to identify and reduce preventable causes of morbidity and mortality associated with abortions.

Selected Content. Guttmacher reports the number of induced abortions; number, types, and locations of providers; and types of procedures performed by state and region. Health, United States presents the total number of abortions reported by Guttmacher for each data year.

Data Years. Guttmacher has collected or estimated national abortion data since 1973. Fourteen provider surveys have been conducted for selected data years 1973–2005. No data were collected for 1983, 1986, 1989, 1990, 1993, 1994, 1997, 1998, 2001, 2002, and 2003.

Coverage. The abortion data reported to Guttmacher include women of all ages, including adolescents, who obtain legal induced abortions, and cover both surgical and medication (e.g., using mifepristone, misoprostol, or methotrexate) abortion procedures. Data are collected from three major categories of providers that were identified as potential providers of abortion services: clinics, physicians, and hospitals.

Methodology. For 1999–2000 and 2004–2005, a version of the survey questionnaire was created for each of the three major categories of providers, modeled on the survey questionnaire used for Guttmacher's data collection in 1997. Questionnaires were mailed to all potential providers, with two additional mailings and telephone follow-up for nonresponse. All surveys asked for the number of induced abortions performed at the provider's location. State health statistics agencies were also asked for all available data reported by providers to each state health agency on the number of abortions performed in the survey year. For states that provided data to the Guttmacher Institute, the health agency figures were used for providers who did not respond to the survey. Estimates of the number of abortions performed by some providers were ascertained from knowledgeable sources in the community.

To estimate the number of abortions performed in 2001, 2002, and 2003, the Guttmacher Institute first estimated the change in the number of abortions between 2000 and 2001, beginning with the number of abortions occurring in each state, as reported by the CDC, in each of those 2 years (see Appendix I, Abortion Surveillance System). The three states without reporting systems were excluded. Guttmacher also eliminated the states with very incomplete or inconsistent reporting (Arizona, Maryland, Nevada, and the District of Columbia (D.C.)) and summed the number of abortions that took place in the 44 remaining states for each year. The percentage change between 2000 and 2001 was then applied to Guttmacher's more complete nationwide count of 1,312,990 abortions in 2000 to arrive at the national estimate for 2001. The same procedure was used to estimate the change in the number of abortions between 2001 and 2002 and between 2002 and 2003, except that the data for both years were collected directly from state health departments because the CDC abortion surveillance report for the latest year was not yet available. The states without reporting systems were not included, and, as before, Guttmacher excluded states with incomplete or inconsistent reporting. Further adjustments were made after the 2004–2005 Guttmacher survey results became available.
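The year-to-year projection described above amounts to a simple ratio estimate, sketched below in Python. Only the 2000 national count of 1,312,990 comes from the text; the function name and the state counts are hypothetical placeholders, and the later adjustments made after the 2004–2005 survey are not modeled.

```python
def project_national_total(baseline_total, counts_prev, counts_next):
    """Scale a national baseline by the change seen in consistently reporting states."""
    # Keep only states with counts in both years, mirroring the exclusion of
    # states with no, incomplete, or inconsistent reporting.
    common_states = sorted(set(counts_prev) & set(counts_next))
    if not common_states:
        raise ValueError("no states with counts in both years")
    prev_sum = sum(counts_prev[state] for state in common_states)
    next_sum = sum(counts_next[state] for state in common_states)
    if prev_sum <= 0:
        raise ValueError("previous-year total must be positive")
    # Apply the relative change in the summed state counts to the baseline.
    return baseline_total * (next_sum / prev_sum)


GUTTMACHER_2000_TOTAL = 1_312_990  # nationwide count for 2000 cited in the text

# Hypothetical state counts, for illustration only.
counts_2000 = {"State A": 1_000, "State B": 3_000}
counts_2001 = {"State A": 980, "State B": 2_940}

estimate_2001 = project_national_total(GUTTMACHER_2000_TOTAL, counts_2000, counts_2001)
print(round(estimate_2001))
```

Repeating the call with each new estimate as the baseline would chain the projection forward to 2002 and 2003.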

Sample Size and Response Rate. Of the 2,310 potential providers surveyed for 2004–2005 data, 1,552 responded directly or in follow-up; health department data were used for 274 providers; knowledgeable sources were used for 59 providers; and Guttmacher made its own estimates for 330 facilities. The level of internal estimation was higher than in previous years because health department data from New York and California were less complete.

Issues Affecting Interpretation. The drug mifepristone for medical abortion was approved in September 2000 by the U.S. Food and Drug Administration (FDA) for distribution and use in the United States. For the 2004–2005 data, the distributor of mifepristone also mailed surveys to all facilities and medical professionals that had ever purchased mifepristone.

The CDC national count of abortions was 15% lower than the Guttmacher survey in 1977 and 1978, 12% lower in 1987, 11% lower in 1991 and 1992, and 12% lower in 1995. Beginning in 1998, CDC reported totals for only 48 states and D.C.; since then, the total number of abortions reported to CDC has been about 34% less than the total estimated by Guttmacher. The three reporting areas that did not report abortions to CDC in 2005 (the largest of which was California) accounted for 18% of all abortions tallied by Guttmacher's 2005 survey. (See Appendix I, Abortion Surveillance System.)

For More Information

  • Contact the Guttmacher Institute, 125 Maiden Lane, 7th floor, New York, NY 10038; or see the Guttmacher Institute website at: http://www.guttmacher.org.

Organisation for Economic Cooperation and Development (OECD) Health Data

OECD provides annual data on statistical indicators for health and health systems collected from 30 member countries, with some time series going back to 1960. The international comparability of health expenditure estimates depends on the quality of national health accounts in OECD member countries. In recent years, an increasing number of countries have adopted the standards for health accounting defined by OECD, greatly increasing the comparability of national health expenditure data reporting. Additional limitations in international comparisons include differing boundaries between health care and other social care, particularly for the disabled and elderly, and underestimation of private expenditures on health.

OECD was established in 1961 with a mandate to promote policies to achieve the highest sustainable economic growth and a rising standard of living among member countries. The organization now comprises 30 member countries: Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Luxembourg, Mexico, the Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, Turkey, the United Kingdom, and the United States.

As part of its mission, OECD has developed a number of activities related to health and health care systems. The main aim of OECD work on health policy is to conduct cross-national studies of the performance of OECD health systems and to facilitate exchanges between member countries regarding their experiences in financing, delivering, and managing health services. To support this work, each year OECD compiles cross-country data in the OECD Health Data database, one of the most comprehensive sources of comparable health-related statistics. OECD Health Data is an essential tool for conducting comparative analyses and drawing lessons from international comparisons of diverse health care systems. This international database now incorporates the first results arising from implementation of the OECD manual, A System of Health Accounts, which provides a standard framework for producing a set of comprehensive, consistent, and internationally comparable data on health spending. OECD collaborates with other international organizations such as the World Health Organization.

For More Information

  • Contact the OECD Washington Center, 2001 L Street, NW, Suite 650, Washington, DC 20036–4922; or see the OECD website at: http://www.oecd.org/health.
