NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Research Council (US) Panel on Performance Measures and Data for Public Health Performance Partnership Grants; Perrin EB, Durch JS, Skillman SM, editors. Health Performance Measurement in the Public Sector: Principles and Policies for Implementing an Information Network. Washington (DC): National Academies Press (US); 1999.


4 Data and Information Systems: Issues for Performance Measurement

A performance measurement program must begin by identifying outcome goals and then using those goals to guide the selection of suitable measures of desired outcomes and related processes and capacities. Once those steps have been completed, operationalizing performance measurement requires access to appropriate data and analytic resources. In its first report, the panel observed that many types of data useful for monitoring the performance of publicly funded health programs are collected and assembled across the country, but that few data sources are ideal for this purpose. For the most part, data systems have not been created specifically for performance measurement, so they may currently be narrower, less timely, or less comparable to other data systems than is optimal.

Despite such shortcomings, there are a number of reasons why the panel favors enhancing this extensive and often strong information base rather than establishing wholly new and specialized data systems for performance measurement. Although current health data collection processes and the resulting data sets often are not well coordinated with each other (Thacker and Stroup, 1994), the panel is hopeful that the current interest in performance measurement, reflected in reports such as this one, will encourage policy makers and health professionals at the federal, state, and local levels to transform the many different existing data sources into a more efficient and effective health information system with the capability of responding to varied information needs.

Collecting and assembling data is expensive, and expanding data collection efforts carries the risk of reducing the resources available for program services. Building on existing data systems for purposes of performance measurement would still require a substantial commitment of resources, but should be expected to promote more efficient and effective use of those systems, and to improve their value for other applications as well. In relying on data collected for other primary purposes, however, those who develop and use performance measures must have a good understanding of the nature and limitations of those data.

This chapter begins by reviewing various health data resources. It then examines analytic and operational challenges involved in using those data, including assuring the quality of data and data analysis; developing and implementing standards for data and data systems; enhancing performance measurement through advances in information technology; and protecting the privacy, confidentiality, and security of health data. The chapter then outlines steps that can be taken to strengthen the data and data systems used to support performance measurement, in particular by investing in health data and data systems and by taking a collaborative approach to their development.

Health Data Resources

Diverse health-related data are required to monitor and better understand the health of the population, including the incidence and prevalence of disease, morbidity and mortality associated with acute and chronic illness, behavioral risk factors, disability, and the quality of life. Data are also needed to plan, implement, and evaluate health policies, programs, and services. The data to meet these needs are produced and used in both the public and private sectors and, increasingly, by public-private partnerships. The Performance Partnership Grants (PPG) proposal that was the impetus for the work of this panel focused attention specifically on data for performance measures to be used in the context of state reporting requirements for federal grants. The panel emphasizes, however, that if performance measurement activities are to succeed, they should fit into a broader agenda for collecting and using health data to protect the health of the public, as well as for guiding the development and implementation of health policies at the local, state, and federal levels.

Although the panel did not attempt to address measurement of the quality and performance of individual health care providers or health plans, it should be noted that these activities are generating similar concerns about such matters as the selection of suitable performance measures, the limitations of administrative data sets for assessing health outcomes, the need for greater standardization of measures and data and for methods to improve data quality, and broader use of new information technologies (see, e.g., Iezzoni, 1997a; National Committee for Quality Assurance, 1997; Palmer, 1997; Foundation for Accountability, 1998; and Joint Commission on Accreditation of Healthcare Organizations, 1998). Major changes in social welfare programs are also prompting a reexamination of the adequacy of data resources for monitoring those programs, especially at the state and local levels (e.g., Joint Center for Poverty Research, 1998; National Research Council, 1998).

Data for performance measurement can be drawn from a variety of sources, such as reports to disease surveillance or vital statistics systems, environmental monitoring systems, population surveys, and clinical or administrative records from service encounters. Considering only the program areas covered by the original PPG proposal, the panel identified 48 data systems that might provide data for performance measurement (National Research Council, 1997a). Most states and communities can be expected to have a similarly large number of systems from which to draw data for performance measurement.

Four basic types of data resources are available: (1) registries, often referred to as census data systems, that attempt to capture information about all events of interest on such matters as health status (e.g., births, deaths, cases of disease) or risk factors (e.g., immunizations, environmental contaminants); (2) surveys that obtain data through the systematic collection of information from a representative sample of a population of interest; (3) patient records that contain clinical information obtained in the course of providing health care; and (4) administrative data, such as billing records, that are collected as part of the operation of a program (although these records may include data on health status or clinical care, that is not their primary purpose). Each type of data has a place in performance measurement, but each also has limitations that must be taken into account. Linking data over time or across data sets can potentially overcome some of those limitations and result in more useful information than is obtainable using a single data set or data for a single point in time. The basic types of health data and some of the issues related to linkage of data sets are reviewed in this section.


Registries

Registries are census-like data systems designed to compile information on all events of a specified type, such as births, deaths, specific injuries and environmental or infectious diseases, cancers, immunizations, hospital discharges, and birth defects. Vital records and disease surveillance registries are some of the most long-standing examples of these health data systems. Reporting systems also compile information on air and water quality, work-related injuries, and motor vehicle crashes resulting in deaths. Registries rely on reports of specific information to a designated authority. Some registries collect data through direct reporting of the events of interest (e.g., births, cases of reportable diseases), whereas others rely on assembling information originally collected in whole or in part for other purposes (e.g., work-related injuries).

Some of these systems operate locally, while others are connected to a state- or nationwide data system. For example, hospitals file reports on births with local or state registrars, and states then transmit these records to the National Center for Health Statistics (NCHS), where national vital statistics data are compiled. The rules governing which data are collected and how they are reported are developed and maintained through a federal-state collaborative system. In contrast, immunization registries are being developed by some states and communities to capture reports on all immunizations administered to children (and also to serve as an information resource for health care providers on the immunization status of children under their care), but there is no national registry of immunization reports.

Registry systems benefit from standardized reporting practices. For example, NCHS and the states work together to develop standard birth and death certificates and guidelines for completing them. Systems differ substantially in their completeness, however. For example, virtually all births are reported, but reporting of fetal deaths is much less complete. Data on some reportable but often clinically mild or asymptomatic diseases (e.g., chlamydia, hepatitis C) are often incomplete because cases may not receive medical care or may not be diagnosed. The quality of the reported data also varies. Birth certificate data on birth weight, for example, are generally more reliable than some of the accompanying information, such as reports of birth defects or the mother's use of tobacco during pregnancy.

The significance of such limitations in these data depends on how the data are to be used. Estimation of reliable incidence and prevalence rates, for example, requires nearly complete reporting, whereas monitoring of trends depends more (within limits) on consistency of reporting than on completeness. For example, consistent and essentially complete reporting of births and deaths is the basis for calculation of comparable birth and death rates at the local, state, and national levels. In contrast, reportable disease data compiled at the national level are useful for monitoring disease trends even if they are not complete; however, these data are appropriate for more precise assessments of incidence rates only for those conditions for which reporting is essentially complete. And any variation in reporting practices from state to state means the resulting data will not be appropriate for assessing small differences in incidence rates across states.
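The distinction drawn above between completeness and consistency of reporting can be sketched numerically. The figures in this sketch are invented for illustration only: a reporting fraction that is incomplete but constant leaves year-over-year trends intact while understating the incidence rate.

```python
# Hypothetical sketch: constant underreporting preserves trend direction
# but biases incidence estimates downward. All numbers are invented.

true_cases = {1995: 1000, 1996: 1100, 1997: 1210}  # true annual case counts
population = 500_000
reporting_rate = 0.6  # only 60% of cases are reported, but consistently so

reported = {yr: n * reporting_rate for yr, n in true_cases.items()}

# The year-over-year trend is unaffected by a constant reporting fraction...
true_trend = true_cases[1996] / true_cases[1995]    # ≈ 1.10
reported_trend = reported[1996] / reported[1995]    # also ≈ 1.10

# ...but the incidence rate per 100,000 is understated by 40%.
true_rate = true_cases[1995] / population * 100_000       # ≈ 200 per 100,000
reported_rate = reported[1995] / population * 100_000     # ≈ 120 per 100,000
```

If the reporting fraction drifts from year to year (or differs from state to state), even the trend comparison is no longer trustworthy, which is the substance of the caution in the text.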


Surveys

Surveys are an essential resource for population-based performance measurement data. Well-designed surveys produce information about an entire population by collecting data from a representative sample of that population. The population of interest in a survey is often defined by residence in a geographic area, such as a state or county, but may also be defined by other characteristics, such as age, place of employment, enrollment in a public assistance program (e.g., Medicaid), or use of a specific clinic. Continuing survey programs that have a defined schedule (e.g., the National Immunization Survey, the Behavioral Risk Factor Survey) can combine a stable core of questions, yielding results that can be compared over time, with changing sets of questions that can address topics of special interest. One-time surveys or surveys repeated on an irregular schedule have less value for performance measurement because they provide at best a limited basis for comparisons over time. The use of surveys requires special expertise in such matters as questionnaire and sample design.

Surveys are particularly well suited to obtaining data for many measures of health status, functioning, and risk that depend on reports of behaviors, perceptions, and attitudes. They also are good tools for collecting information on general activities and events. Survey data are, however, vulnerable to misreporting and can be adversely affected by nonresponse. Respondents may misreport unintentionally because of recall errors or lack of knowledge (e.g., date of last illness or hypertension status), may refuse to answer certain questions, or may intentionally alter their responses on sensitive topics (e.g., drug use or even exercise habits). Careful questionnaire design can help reduce some forms of misreporting. Nonresponse is a concern because individuals who are missed may differ from the respondents in important ways (e.g., older or younger, lower or higher income, sicker or healthier) that cannot be determined with certainty. Despite such limitations, surveys may be the best or only option for obtaining data on key topics of interest.

The cost of surveys is a major constraint on their use. In contrast to data collection that occurs as a byproduct of other activities, such as restaurant inspections or health care visits, surveys require a set of specialized activities, including developing a sampling frame, selecting the sample, locating the eligible respondents, and gathering the survey data. For each of these activities, choices can be made that affect costs, but those choices may also affect the quality of the survey results. For example, telephone interviews tend to be less costly than in-person interviews, but cannot reach people who do not have a telephone.

Such cost trade-offs should be weighed carefully. For some purposes, a factor such as telephone access may have little impact on the quality of the data, readily justifying the use of a less costly method of data collection. An analysis of National Health Interview Survey data, which were obtained through in-person interviews, found little difference in results between respondents who had telephone access and the overall responses, even when the analysis was restricted to persons below the poverty level (Anderson et al., 1998). Similarly, studies of the Behavioral Risk Factor Surveillance System suggest that its telephone-based methods are sufficiently reliable to justify continued use of this less expensive method (e.g., Arday et al., 1997). In contrast, a study focusing on health insurance coverage suggests that reliance on telephone interviews alone may not be adequate for some analyses (Strouse et al., 1997). It may, however, be possible to use baseline data from in-person interviews to adjust estimates based on data collected by telephone in subsequent rounds of a study.
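The adjustment idea mentioned above can be sketched as a simple ratio correction. This is a hypothetical illustration, not the method used in any of the cited studies, and all figures are invented: a baseline round that collected both in-person and telephone data yields a scaling factor applied to later telephone-only rounds.

```python
# Hypothetical ratio adjustment: a baseline round measured the same quantity
# both in person and by telephone; later rounds are telephone-only and are
# scaled by the baseline ratio. All figures are invented for illustration.

baseline_in_person = 0.245   # baseline prevalence from in-person interviews
baseline_telephone = 0.220   # same quantity estimated by telephone only

adjustment = baseline_in_person / baseline_telephone  # > 1: phone understates

def adjust(telephone_estimate: float) -> float:
    """Scale a telephone-only estimate toward the in-person benchmark."""
    return telephone_estimate * adjustment

followup_telephone = 0.210           # telephone-only estimate, later round
adjusted = adjust(followup_telephone)  # ≈ 0.234
```

The adjustment assumes the telephone/in-person gap is stable between rounds; if the composition of the non-telephone population changes, the baseline ratio no longer applies.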

Patient Records and Related Clinical Encounter Data

The detailed clinical records maintained by physicians, hospitals, health plans, and most other health care providers on each patient they treat are repositories for an array of data such as patient-reported health status and risk factors, clinical observations, diagnoses, procedures performed, medications prescribed, and results of laboratory tests. Access to clinical data from medical records would improve the analytic strength of many health survey and administrative data sets. However, these records have important limitations. Most patient records are still maintained in paper form, which makes it difficult to aggregate and analyze the data or integrate them into broader health data systems. Extracting data from paper records requires time-consuming and costly review of individual files. Research studies that require specific clinical data often review samples of records, but even that approach is likely to be too costly and time-consuming to be practical for the periodic reporting required for performance measurement. Furthermore, the completeness and consistency of records may differ across records or within a single record over time, and may vary more for certain types of information than for others. For example, numerical data, such as blood pressure readings, are more readily recorded in a consistent manner than are notes describing clinical observations.

There is widespread support for the development of computer-based patient records (CPRs), and considerable progress has been made in this area in recent years (Institute of Medicine, 1997). The CPR holds the promise that documentation of the process and outcomes of care will become a byproduct of the use of such an information system in the delivery of care, and that patient records will become a more practical source of data for performance measurement for both the health care industry and health agencies at the federal, state, and local levels. Major advances are needed in at least three areas, however, if more extensive use is to be made of clinical data in computerized form: standards defining the structure and content of electronic clinical records must be established, technology for converting natural medical language into standardized coding systems must be developed, and privacy concerns must be resolved.

Despite progress, there are still substantial differences and incompatibilities among the CPR systems now in use. Standards for the data elements included in patient records, the codes and vocabulary used to represent clinical data, and the format of electronic records are still evolving. Additional research and testing are also needed to move beyond prototype systems for converting natural medical language into medical procedure and diagnosis codes. Among the groups working on these CPR issues are federal agencies such as the National Library of Medicine and the Agency for Health Care Policy and Research, private organizations such as the Computerized Patient Record Institute, and various private companies. Progress toward a CPR should also result from the Health Insurance Portability and Accountability Act of 1996 (HIPAA) (Public Law 104-191), which directs the Secretary of Health and Human Services to promulgate guidelines for computerized medical records by August 2000. HIPAA also calls for establishing policies to protect the security and privacy of electronic health data transactions. Privacy concerns are an issue for all health-related data, but are particularly acute for information contained in medical records. (Other provisions of HIPAA are reviewed elsewhere in this chapter.)

Computerization per se will not, however, overcome certain limitations inherent in patient records. For example, as used in most health care settings, patient records are not well suited to capturing information on patients' views about the care they receive. Clinical records can also be incomplete when people receive health care services from several sources, each of which maintains a separate record. For some of its performance measures, the Health Plan Employer Data and Information Set (HEDIS), version 3.0, compensates for such factors by requiring health plans to use data from a member survey rather than from administrative or medical records (National Committee for Quality Assurance, 1996; see also Chapter 2). For example, the rate of influenza vaccination among older adults is to be tracked with survey data because health plan members may receive these shots through community programs instead of their health plan.

Administrative Data

The operation of health programs typically generates substantial amounts of nonclinical administrative data that can be useful for performance measurement. Some of this information describes program resources or characteristics of program operation, such as numbers and qualifications of staff members or features of facilities used to provide services (e.g., number of laboratories meeting quality standards). Administrative records on population-based services can produce such information as the number and results of restaurant inspections, the number of immunizations administered at special immunization events, or the number of health education programs offered. Programs that provide services to specific individuals (e.g., substance abuse treatment or prenatal care) generate administrative records that contain information about those individuals and the services they receive. Administrative data produced by various other activities that are not specifically health-related can also provide useful information for health programs. For example, traffic safety records can provide data on motor vehicle crashes resulting in injuries, and state corrections records can provide information on incarcerated adults with serious mental illness. In addition, administrative records can sometimes be used to identify a population for a separate survey-based data collection activity.

Most administrative data sets are created to serve operational purposes rather than the needs of performance measurement or other analytic tasks. Even so, they are a valuable resource with some advantages over other types of data. A recent assessment of the utility of administrative data for policy studies of public assistance programs provides useful insights for the health-related programs of interest to this panel (Joint Center for Poverty Research, 1998). Administrative data sets can offer detailed and generally accurate program information, large enough numbers of records to permit analyses of subgroups of participants, greater state and local specificity and applicability than many national data sources, longitudinal information on programs and program participants, and low marginal costs for data collection.

At the same time, these data sets have important limitations for secondary uses such as performance monitoring. They generally cover a selected set of people and activities and are not necessarily representative of the population as a whole. In the case of health services, for example, such data sets have no information on individuals in a community who might be in need of those services but have not sought care. The data sets also may lack useful descriptive information on the economic and demographic characteristics of the individuals who are included. Measures such as program participation rates require that administrative data (the numerator) be supplemented by population data (the denominator) from another source. Information on outcomes and events that occur outside the framework of the program are rarely available. For example, the records of a substance abuse treatment program can produce data such as the number of participants who complete treatment, but will not directly capture the drug-related arrests of program drop-outs or the subsequent employment history of people who have successfully completed treatment. Similarly, records of a water treatment facility can provide data on observed levels of bacterial contaminants, but will not reflect outbreaks of waterborne disease. Linkages to other data sets (discussed below) can overcome some of these limitations, but the linkage process poses its own technical and policy challenges. These issues are discussed elsewhere in this chapter.
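The numerator/denominator point above is worth making concrete. In this minimal sketch the program name and all counts are hypothetical; the administrative data supply only the numerator, and the denominator must come from a population source such as a survey or census estimate.

```python
# Minimal sketch: a participation rate combines an administrative numerator
# with a population denominator from another source. Names and figures
# are hypothetical.

enrolled_in_prenatal_program = 1_840   # from program administrative records
eligible_pregnant_women = 2_600        # estimated from a population survey

participation_rate = enrolled_in_prenatal_program / eligible_pregnant_women
print(f"Participation rate: {participation_rate:.1%}")
```

Note that any error in the survey-based denominator propagates directly into the rate, so the quality of both sources constrains the quality of the measure.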

Use of program-specific data definitions can hinder or prevent valid comparisons across data sets. For example, one health program may define adolescents as young people between the ages of 12 and 18, while another may use ages 13 to 17. Greater coordination and collaboration and the development of standard measures may overcome some definitional differences. Yet other differences in data definitions reflect true variations in program features; if comparisons are necessary, those variations must be taken into account. Operational and design factors may also affect the usefulness of administrative data sets for purposes such as performance monitoring. Programs that serve families may not identify each family member separately, making it difficult to distinguish who received what services. If closed or inactive cases are dropped, the data set cannot provide a complete record of services or participants. And the installation of new or upgraded information systems (either equipment or programs) may result in lost or limited access to records created with the previous system.
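The adolescent-age example above can be sketched in code. The records and field names here are invented for illustration: before comparing the two programs, both data sets are restricted to the age band they have in common (13 to 17), discarding the ages only one program counts.

```python
# Sketch of harmonizing two programs' "adolescent" definitions (ages 12-18
# vs. 13-17) by recoding both data sets to the overlapping band before any
# comparison. Records are invented for illustration.

program_a = [{"age": a} for a in (12, 13, 15, 18)]  # adolescents = ages 12-18
program_b = [{"age": a} for a in (13, 14, 17)]      # adolescents = ages 13-17

def in_common_band(record: dict, low: int = 13, high: int = 17) -> bool:
    """Keep only records in the age band both programs cover."""
    return low <= record["age"] <= high

comparable_a = [r for r in program_a if in_common_band(r)]  # ages 13, 15
comparable_b = [r for r in program_b if in_common_band(r)]  # ages 13, 14, 17
```

Restricting to the common band makes counts comparable at the cost of discarding records; whether that cost is acceptable depends on the analysis.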

Claims Data

A specialized administrative data resource that bridges the public and private sectors in health care is insurance and other third-party claims for payment for health services. An enormous quantity of data is produced from the billing and payment of health insurance claims. In accordance with the administrative simplification provisions of HIPAA, standards for the format and content of electronic claims transactions are being established. Claims data have been used to study the effectiveness and outcomes of health care and may also have a place in performance measurement. As with the other data sources discussed in this section, however, their limitations must be kept in mind.

Claims data generally include only a minimal amount of clinical information (e.g., diagnosis, procedure performed) to document the fact that a covered service was provided and payment is owed. Moreover, medical conditions and treatments can be characterized in varying ways in insurance claims. This factor can reduce the consistency and comparability of claims records. Incentives such as higher reimbursement rates for certain types of care can encourage more deliberate changes over time in the content of claims data. The timeliness of these data can also be a concern. Greater use of electronic data interchange (EDI) allows faster claims processing, but delays of several months may still occur in filing and settling claims.

Another limitation of claims data is that they may not provide a complete record of services received by an individual or by a population in a given community or state because claims are submitted only for covered services and only for the individuals served by a specific insurer. Typically, a defined geographic area (a state or a community) is served by several insurers, each of which may offer many different insurance products that vary in scope and terms of coverage. In addition, Medicaid claims records may be managed separately by state agencies, and prepaid managed care plans generally do not generate claims records. With nearly universal participation in Medicare among those aged 65 and older, Medicare claims files have been more complete than other claims databases and therefore often more useful for state and local analyses. However, claims records are generally not available for Medicare services provided through prepaid managed care plans.

The experience of the State of Maryland in using Medicaid claims data in conjunction with public health initiatives illustrates both the strengths and limitations of such data when used for a purpose other than that for which they were originally collected (see Box 4-1). Although these data are a promising means of monitoring health care services for a vulnerable population, they do not capture all of the information that may be needed for some purposes.

Box 4-1

Development of a Health Care Services Database in Maryland. In 1985, Maryland began developing person-based analytic files from Medicaid data. These files were used to conduct analyses that provided a basis for statewide public health initiatives such (more...)

Linkage of Data Sets

Data linkage involves matching records on specific individuals to other records for those individuals in the same or other data sets. This panel believes that in many cases, better performance measurement data could be obtained if selected data sets could be linked. As noted earlier, such linkages can overcome some of the limitations of specific data sets. This is especially true in efforts to relate health outcomes to services provided. For example, linking data from a community survey to administrative records from a prenatal care program could help identify eligible mothers who did not participate in the program and therefore do not appear in the administrative data system. Alternatively, program records could compensate for survey respondents' recall errors about numbers of visits or timing of specific services. Another approach to linking data sets is taken with some immunization registries: birth records are used to create an initial entry in the registry to which subsequent immunization reports are linked. In health care studies, efforts have been made to link multiple insurance claims for an individual to construct a more coherent picture of care for an episode of illness.

A particularly broad pilot project on data linkage that is relevant to the panel's earlier work on performance measures for emergency medical services was initiated by the National Highway Traffic Safety Administration (1996) of the U.S. Department of Transportation. The Crash Outcome Data Evaluation System (CODES), originally tested in seven states, is designed to link data on motor vehicle crashes, emergency medical services, emergency department care, hospital and outpatient care, rehabilitation and long-term care, death certificates, and insurance claims. Using these linked data, states have been able to explore such factors as populations at increased risk for injury (e.g., on the basis of age, alcohol use, or failure to use seatbelts), the consequences of specific types of crashes or injuries (e.g., collisions with pedestrians, abdominal versus head injuries), and the effects of delayed prehospital care.

Attempting to match records from separate data systems poses significant technical challenges. Reasonably successful techniques have been developed that rely on combinations of information, such as name, address, and date of birth, to establish highly probable matches. Use of unique personal identifiers might simplify the process of establishing exact matches, but such identifiers have not been uniformly employed. Provisions of HIPAA now call for adoption of these identifiers, especially for use in electronic health care data transactions, but there is serious concern that stronger privacy protections must be enacted before unique personal identifiers can be used with confidence or comfort (see National Committee on Vital and Health Statistics, 1997b). Even without the use of personal identifiers, the linkage of data sets must be undertaken only with firm assurance that personal privacy and the confidentiality of the data will be protected. (See the discussion of these issues later in this chapter.)
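The matching-on-combinations approach described above can be sketched as a toy scoring rule. Real linkage systems use formally weighted probabilistic methods; this simplified sketch, with invented records and an arbitrary threshold, only illustrates the idea of combining name similarity with exact matches on date of birth and location.

```python
# Toy sketch of record matching without a unique identifier: candidate pairs
# are scored on name similarity plus exact agreement on date of birth and
# ZIP code. Records, weights, and threshold are invented for illustration.

from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; tolerates minor spelling variants."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec1: dict, rec2: dict) -> float:
    score = name_similarity(rec1["name"], rec2["name"])  # up to 1.0
    score += 1.0 if rec1["dob"] == rec2["dob"] else 0.0  # exact DOB agreement
    score += 0.5 if rec1["zip"] == rec2["zip"] else 0.0  # same ZIP code
    return score

birth_record = {"name": "Maria Lopez", "dob": "1996-04-02", "zip": "21201"}
immunization = {"name": "Maria Lopes", "dob": "1996-04-02", "zip": "21201"}

THRESHOLD = 2.0  # tunable: trades false matches against missed matches
is_probable_match = match_score(birth_record, immunization) >= THRESHOLD
```

Here the spelling variant ("Lopez" vs. "Lopes") lowers the name score, but agreement on date of birth and ZIP code pushes the pair over the threshold, which is how such systems establish highly probable rather than certain matches.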

Steps Toward Integration of Data Sets

In the public sector, many states are working to enhance the integration and accessibility of health data (Mendelson and Salinsky, 1997; U.S. Department of Health and Human Services, 1998b). For example, Georgia has provided a single Internet access point to county- and state-level information from several data sets, such as vital statistics and notifiable diseases. In Illinois, the Cornerstone system integrates client records for various maternal and child health services provided by local health agencies. Most states are also enhancing their Medicaid data systems and promoting the use of electronic claims transactions (Mendelson and Salinsky, 1997). The Center for Substance Abuse Treatment in the Substance Abuse and Mental Health Services Administration (SAMHSA) is working with three states to explore ways of linking state Medicaid, mental health, and substance abuse data sets to provide more complete information about clients and service use and to support the implementation of performance measurement.

Attempts to establish public-private partnerships to facilitate the integration of health care data have had limited success. For example, the Community Health Management Information System (CHMIS), proposed by the Hartford Foundation in the early 1990s, was envisioned as a community repository and resource for health care data for use in assessing the cost and quality of care offered in the community. Support for such a community-based approach, however, has been weakened by fundamental changes in the organization of health care services that have resulted in the growth of large regional and national insurers, integrated health care delivery systems, and managed care organizations (Starr, 1997). These organizations are now investing in their own information systems to provide corporate- rather than community-based analyses of cost and quality. Other obstacles included the technical complexity and expense of community-based systems, concerns regarding the confidentiality of patient records, and the reluctance of some health care organizations to share information with business competitors.

An alternative model, sometimes referred to as the Community Health Information Network (CHIN), has shifted the focus from the collection and storage of information for use in the community to the development of clearinghouses that would transmit information among diverse proprietary information systems maintained by insurers, managed care organizations, and individual hospitals and clinicians (Starr, 1997). Such efforts are hampered, however, by the proliferation of proprietary information systems using customized administrative transactions, which also impose a substantial burden on the health care system. For example, hospitals and physicians often find that each insurer and health plan uses a different claims form that requires somewhat different information. Administrative overhead is estimated to account for about 26 percent of health care expenses (U.S. Department of Health and Human Services, 1997c). The administrative simplification provisions of HIPAA are an effort to reduce this burden by establishing standards for electronic health data transactions. These standards can also be expected to facilitate the integration of administrative health care data into other applications.

Assuring the Quality of Data and Data Analysis

Health professionals and policy makers seeking to use performance measurement in conjunction with publicly funded health programs must consider the quality, consistency, and comparability of the data available for this purpose and determine how to address the limitations of those data. The panel cannot offer simple, straightforward criteria for judging the quality of health data and data systems. However, because it is nearly impossible to evaluate data quality on the basis of summary measures, quality is an essential consideration at every step, from the planning of data systems and data collection to the calculation and use of final measures. These issues are not unique to performance measurement, and lessons learned in other contexts merit attention (see, e.g., Hoaglin et al., 1982; Bailar and Mosteller, 1992).

Ideally, the data used for performance measurement would be totally accurate and complete. In practice, however, data rarely meet these requirements. Many different factors may affect the quality of medical and scientific data, whether the data are used for scientific study, administrative purposes, or management oversight as in performance measurement. Some ways in which data can be compromised include inaccurate reporting, incomplete reporting, poorly designed survey samples, errors introduced during data processing procedures, inappropriate aggregation of detailed data to facilitate analysis, and inaccurate calculation of measures.

Standards for data quality and practices adopted to meet those standards should be based on informed assessments of the intended and anticipated uses of the data. Some consideration should be given to potential future uses of the data, but data systems should not be overdesigned in an effort to meet all possible but as yet unidentified requirements. Other concerns relate to the implications of data quality for analysis. By itself, a data set may appear to produce data of satisfactory quality. Problems may arise, however, if the characteristics of older data differ from those of newer data or if the data differ in important ways from other data with which they might be used.

Early and continuing advice from and participation by experts in such fields as statistics, epidemiology, and informatics can reduce the likelihood and severity of many problems involved in the design and use of data and data systems. For example, the use of observational and administrative data for performance measurement poses analytic challenges that differ from those for studies that can rely on more carefully controlled experimental data. Although opportunities to redesign the existing data systems that will provide much of the data for performance measurement will be limited, expert advice can help maintain or improve the quality of those systems. For any new data systems, expert advice early in the design of the system is particularly important because well-designed data systems can prevent many problems that are difficult, and sometimes impossible, to overcome by analytic techniques.

Policy makers who use performance measures should ensure that there is a review process to determine what problems are most likely to affect the data, what has been done to manage those problems, and (at least roughly) how large any residual problems are likely to be. Data and data systems should be held to high standards, but the use of reasonably good data with known limitations may be acceptable, even desirable, for some purposes given the opportunity costs of collecting better data. For example, data from a survey of teenagers, with all the biases inherent in such surveys, are likely to be better for determining the frequency of violation of laws restricting cigarette sales to minors than highly accurate and complete court records covering only violations that have come to judicial attention. Substantial investments of time and money in data that are not appropriate for the analysis at hand or in activities that will produce only marginal improvements in the data do not represent a good use of resources. In cases in which bias dominates random variation, for example, the benefit gained from stringent reductions in the random component of uncertainty (e.g., through use of larger samples) may not justify the cost involved.

Many observers agree that making data useful and important to those who produce the data creates a strong incentive for ensuring that the data are of high quality. Performance measurement may help provide such an incentive by requiring that data be used for internal purposes or for external reporting. As noted in Chapter 2, however, care must also be taken to avoid the creation of adverse incentives that could encourage deliberate distortions of the data to make performance measures appear more favorable than is warranted.

A few of the statistical and operational factors that can affect the quality of data and their analysis are reviewed briefly in the following subsections. These discussions provide only an introduction to potentially complex issues that should be addressed in more detail by those responsible for implementing performance measurement.

Random Variation and Bias

In collecting and using performance measurement data, policy makers and program staff must keep in mind the effects of random variation and bias. Some degree of random variation should be expected among otherwise similar measurements. For example, small year-to-year changes can occur in the number of infant deaths without representing a meaningful change in the underlying infant mortality rate. Similarly, two independent random samples drawn from the same population are likely to produce slightly different but still representative estimates of the average age of all of the members of that population or of measures such as the percentage of adults who have had their blood pressure checked in the past 2 years. The effect of random variation tends to be greater in measures based on small numbers of events or small sample sizes in surveys. For example, random variation in the annual number of infant deaths will have less impact on the stability of the national infant mortality rate than on the rate in an individual community. Statistical techniques can be used to estimate the likely contribution of random variability to a sample estimate or to the difference between two measurements (e.g., infant mortality rates for successive years). In some situations, the variation arising from small numbers of cases can be reduced by using measures that pool related but individually rare events. For example, a community might measure the percentage of the population using any illegal drugs rather than attempting to measure separately the use of several different drugs.
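The greater instability of rates based on small numbers can be illustrated with a short simulation (a hypothetical sketch: the population sizes and the underlying rate of 7 deaths per 1,000 live births are invented for illustration, not drawn from this report's data):

```python
import random
import statistics

random.seed(1)

def simulated_rates(births, true_rate, n_years=500):
    """Simulate annual infant mortality rates (per 1,000 live births)
    for a population whose underlying rate never changes."""
    rates = []
    for _ in range(n_years):
        deaths = sum(random.random() < true_rate for _ in range(births))
        rates.append(1000 * deaths / births)
    return rates

# Hypothetical underlying rate: 7 deaths per 1,000 live births.
community = simulated_rates(births=500, true_rate=0.007)   # small community
state = simulated_rates(births=5000, true_rate=0.007)      # larger jurisdiction

# The underlying rate is identical, but the year-to-year spread of the
# observed rate is much larger for the small community.
print(statistics.stdev(community), statistics.stdev(state))
```

Both series fluctuate around 7 per 1,000, yet none of the variation reflects a real change in risk; the smaller denominator simply produces a noisier observed rate.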

Bias reflects systematic distortions in the data and poses a more serious challenge to successful use of those data. Bias can be introduced in the design of data collection procedures and in the data collection process itself. In surveys, for example, bias may result from a sample design that excludes certain groups (e.g., households without telephones, as discussed earlier). Bias may also result from differing response rates by specific population groups (e.g., fewer responses by single adults than by married adults with children) or from intentional misreporting (e.g., underreporting of tobacco and alcohol use in panel surveys). Data from registries and administrative records can be affected by systematic differences in the populations they cover or by incomplete or inaccurate reporting. For example, population differences could be reflected in health insurance claims. A firm with an older workforce would tend to have more claims related to care for such conditions as diabetes and hypertension than a firm with younger employees. Financial incentives associated with variations in reimbursement rates may also influence the way diagnoses or health services are characterized in health insurance claims.

Although bias is undesirable, it does not necessarily make data unusable. When bias is constant (over time, across geographic locations, or across population segments), it cancels out of many kinds of analysis. For some purposes, imperfect data with a constant bias may even be more useful than data whose quality is continually improving, because a changing bias complicates comparisons over time. For example, if the incidence of a disease has been underreported by a consistent 20 percent over a period of years, the trend in its incidence can still be assessed accurately, and ratios and proportions calculated using data from that period will be unaffected by the underreporting.
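The cancellation of a constant bias can be made concrete with a small numeric sketch (the case counts and the 20 percent underreporting factor are invented for illustration):

```python
# True annual case counts (hypothetical) and what a registry reports
# when a constant 20 percent of cases are missed every year.
true_cases = {1994: 1000, 1995: 1100, 1996: 1210}
reported = {year: 0.8 * n for year, n in true_cases.items()}

# The year-over-year trend (ratio of successive years) is unaffected,
# because the constant 0.8 factor cancels in the ratio.
true_trend = true_cases[1995] / true_cases[1994]
reported_trend = reported[1995] / reported[1994]
print(true_trend, reported_trend)  # both 1.1

# Absolute levels, by contrast, remain biased downward by 20 percent.
print(reported[1994])  # 800.0
```

The same cancellation holds for proportions whose numerator and denominator are underreported by the same factor; it fails as soon as the bias changes from year to year.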

Data Management

Other factors that can impair data quality include coverage problems arising when data are collected or during data processing. Records may be duplicated, inappropriate records may be included, or appropriate records may be missing. Duplication may occur if multiple reports about a single individual are received from separate sources and cannot be matched. Some states, for example, require that health care providers and laboratories submit reports on cases of HIV infection without the use of individuals' names, but the reports on a given individual that come from separate sources can then be difficult to match. Data linkage can also cause problems if records are matched incorrectly. Moreover, data sources such as registries and administrative data systems can be affected by delays in receiving or entering records, which result in missing records at the time a report is produced. A data set may also be incomplete because of such factors as a very low response rate among those who believe that a survey may harm their interests or failure to identify and collect the death certificates of all cancer patients in a given cancer registry.

Other problems arise if data are used incorrectly to construct performance measures or measures that may be used for other operational or policy purposes. For example, a measure of immunization rates among 2-year-olds will be flawed if either the numerator or the denominator includes children of the wrong age. Audits of health plan performance measures by the National Committee for Quality Assurance (1997) found average error rates of 20 percent, denominator error rates of up to 63 percent, and numerator error rates as high as 72 percent.

Attention to operational policies and practices throughout the collection and processing of data to produce performance measures is likely to help ensure that performance measurement is based on high-quality data. The National Committee for Quality Assurance (1997) has specifically recommended that health plans implement routine data-quality audits to improve the accuracy and completeness of their clinical and administrative data sets. Among the considerations highlighted is good documentation for all steps involved in data collection and processing, including both instructions for each step and records of what was actually done. Audits can verify the accuracy of individual data elements and of the measures calculated using those data. Automated edit checks can test the consistency of data entries. For example, a record showing a 10-year-old respondent in a survey of adults can be flagged for review. Automated and manual checks at the data processing stage should ensure that data are being drawn from appropriate sources (e.g., survey data for the correct year), that calculations are being performed correctly, and that the data being used are consistent with established definitions. For example, if adolescents are defined as 14 to 17 years of age for a given measure, data for those aged 13 to 17 do not meet this definition.
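An automated edit check of the kind described above can be as simple as a set of rules that flags out-of-range or internally inconsistent entries for review (a minimal sketch; the field names and the age cutoff for an adult survey are hypothetical):

```python
def edit_checks(record):
    """Return a list of problems found in a single survey record.
    Rules are illustrative: an adult survey expects respondents
    aged 18 or older, and a response cannot precede the survey year."""
    problems = []
    age = record.get("age")
    if age is None:
        problems.append("missing age")
    elif age < 18:
        problems.append(f"age {age} out of range for adult survey")
    if record.get("survey_year", 0) > record.get("response_year", 0):
        problems.append("response recorded before survey year")
    return problems

# A record showing a 10-year-old respondent is flagged for review.
print(edit_checks({"age": 10, "survey_year": 1998, "response_year": 1998}))
# ['age 10 out of range for adult survey']
```

Flagged records would then go to manual review rather than being silently dropped or corrected, preserving an audit trail of what was done at each step.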

Challenges in Data Analysis

As discussed in this report, performance measurement is most likely to rely on agreed-upon measures that are widely accepted as representing specific programmatic activities and that use data from existing, defined sources. Once those measures have been selected and the data produced, policy makers and other analysts may face several challenges in the successful use and interpretation of the data.

Almost every statistical analysis requires some sort of a statistical model to summarize the data and guide interpretations. A correct model can add great strength to the analysis, but an incorrect model can lead to unreliable findings. Because the correct model is generally unknown, experienced analysts may make a critical contribution through their inferences about the form of an appropriate model. For performance measurement, this observation relates back to our understanding of the evidence that links processes and capacity to health outcomes. If that understanding is good, it becomes possible to select measures of outcomes, processes, and capacity that provide reliable signals of progress toward health goals. If, however, that understanding is incomplete or flawed, the process and capacity measures selected may provide little insight into the change (or lack of change) in health outcomes.

The issue of data comparability noted earlier represents an important challenge to data analysis. Comparisons among groups or over time are of particular relevance for performance measurement. The panel anticipates that performance data used in the framework of performance partnership agreements will frequently become the basis for comparisons across states. Similarly, states may make comparisons across counties, cities, or other community units. Data comparability is also a common issue in the interpretation of changes over time. Results can be affected, for example, by differences in methods of collecting the data, in the health care or program environments, and in the underlying characteristics of the populations being measured. A study designed specifically to test the ability of five states to report comparable data for a set of mental health performance measures demonstrated that such differences are currently an obstacle for performance measurement (National Association of State Mental Health Program Directors Research Institute, 1998; see also the discussion of this study in Chapter 3). Even efforts to improve existing data systems (e.g., through more complete coverage or better questionnaire design) have the undesirable, though often acceptable, side effect of hindering the interpretation of time trends. Lack of complete comparability does not preclude the use of the data, but it necessarily affects the nature and strength of the conclusions drawn from analysis of the data. Several common concerns related to data comparability are discussed below.

Concerns Related to Differences in Data Collection Methods

Different methods of collecting data regarding a particular phenomenon can produce different findings. For example, within the broad domain of surveys, the specific techniques employed will affect the accuracy of the data obtained and therefore the comparability of those data. Substance abuse rates ascertained from computer-aided interviews may be more accurate than those derived from telephone or in-person interviews (see Turner et al., 1998). Similarly, surveys that rely on self-reporting may produce more accurate data on respondents than surveys that allow reports about an individual by another person (i.e., proxy responses), although the use of proxies may improve data quality by reducing the nonresponse rate.

A different circumstance is illustrated by estimates of current tobacco use based on tax records as compared with estimates based on survey reports. Tax records nearly always show substantially higher use than do survey data, apparently because tobacco use is underreported in surveys. Although the tax records themselves may be less than perfect, it appears likely that a substantial fraction of the tobacco sold and taxed will be used. Unlike surveys, however, tax data cannot provide information on the characteristics of those who purchase tobacco products. The findings from these two data sources might be used together to develop an adjustment factor for inflating the survey data on tobacco consumption to match estimates from tax receipts.
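The adjustment-factor idea can be sketched in a few lines of arithmetic (all consumption figures here are hypothetical):

```python
# Hypothetical annual cigarette consumption (packs, thousands).
taxed_sales = 24_000      # from tax records
survey_estimate = 18_000  # from self-reports, which understate use

# Inflation factor that aligns survey totals with tax receipts.
adjustment = taxed_sales / survey_estimate
print(round(adjustment, 3))  # 1.333

# Applied to a survey-based subgroup estimate (e.g., packs reported
# by one age group), preserving the demographic detail that tax
# data alone cannot provide.
subgroup_reported = 2_400
print(round(subgroup_reported * adjustment))  # 3200
```

This sketch assumes the underreporting factor is roughly uniform across subgroups; if some groups underreport more than others, a single inflation factor would distort the subgroup comparisons.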

When data sources are as fundamentally different as administrative records and surveys (as in the example just cited), analysts may be more alert to the hazards of direct comparisons than when the differences are less obvious. If, however, a major discrepancy in findings between two or more data sources cannot reasonably be accounted for by differences in the methods of collecting or analyzing the data, analysts must consider which, if either, source should be used. If this choice is not clear, new data collection efforts may be warranted.

Concerns Related to Differences in the Program Environment

External factors in the program environment can affect the results of performance measurement in ways that are unrelated to program activities or goals and should be considered in interpreting performance results over time or across groups. Special circumstances, such as natural disasters or unrelated disease outbreaks (e.g., unusually high rates of influenza), might affect performance measurement results through either a deterioration in health outcomes or a reduction in the resources available for program activities. Apparent rates of disease incidence may also be affected by such factors as increased awareness of a given health problem or new health care technologies that alter patterns of detection and treatment of disease. For example, reported incidence rates for prostate cancer increased from 79.8 per 100,000 in 1980 to 132.0 per 100,000 in 1990 (both age-adjusted to the U.S. population in 1970) (Ries et al., 1997). During this period, increasingly widespread use of the prostate-specific antigen test led to a marked rise in the number of diagnosed prostate cancers without evidence that the underlying incidence of the disease had changed substantially and without a corresponding change in reported mortality rates. This change is readily attributable to a specific factor, but the factors underlying other differences can be less obvious or entirely hidden. Sorting out such matters generally requires the application of both statistical and subject-matter expertise.

Concerns Related to Differences in the Characteristics of Populations

Health outcomes are often closely linked to biological and social risk factors. Since the nature and distribution of these factors can be expected to vary across the populations being served in various programs and geographic areas, some of the differences in outcomes may be attributable to these variations rather than to true differences in program performance. Methods are needed to adjust performance data for differences in important covariables over time or between comparison populations (see, e.g., Rothman, 1986; Gordis, 1996). Without adjustment, comparisons may often be difficult to interpret. For example, apparent variations in performance might be a reflection of differences in the characteristics of program participants (e.g., educational attainment, access to transportation), the general population in a state or community (e.g., average age), or the socioeconomic and other characteristics of the states or communities served (e.g., unemployment rates, population density).

One method of accounting for such differences is stratification—the calculation of performance measurements separately for specific population subgroups (e.g., younger and older age groups)—which will provide more comparable results within those subgroups. This approach may not be feasible, however, if available data sources do not identify the subgroups of interest or if small numbers of cases compromise the reliability of the subgroup measures.

Differences in the mix of subgroup characteristics across populations can also be addressed by adjustment methods that permit calculation of a single measurement for each population to be compared. The "direct" adjustment method is one of the most widely used. With this method, population-wide rates are calculated by applying the observed subgroup measurements from each population of interest (e.g., age-specific rates for smoking or completion of substance abuse treatment) to the equivalent subgroups (e.g., age groups) in a single "standard" population. For example, in comparisons of cancer incidence, which is generally higher in older age groups, rates are often "age adjusted" using this method to ensure that observed differences can be attributed to disease incidence (or its detection), rather than to the age distributions of the populations.
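The direct method can be sketched as follows (a minimal example with invented rates and a two-stratum age structure; real applications typically use a published standard population, such as the 1970 U.S. population used for the cancer rates cited above):

```python
def direct_adjusted_rate(stratum_rates, standard_pop):
    """Directly standardized rate: apply a population's observed
    stratum-specific rates to a common standard population."""
    total = sum(standard_pop.values())
    return sum(stratum_rates[s] * standard_pop[s] for s in standard_pop) / total

# Hypothetical standard population and age-specific rates (per 1,000).
standard = {"under_65": 600, "65_plus": 400}
rates = {"under_65": 2.0, "65_plus": 10.0}

# Community A is younger than community B, but both have the same
# age-specific rates, so their crude rates differ only because of
# their age structures.
pop_a = {"under_65": 900, "65_plus": 100}
pop_b = {"under_65": 500, "65_plus": 500}
crude_a = sum(rates[s] * pop_a[s] for s in pop_a) / sum(pop_a.values())
crude_b = sum(rates[s] * pop_b[s] for s in pop_b) / sum(pop_b.values())
print(crude_a, crude_b)  # 2.8 6.0

# Direct adjustment removes the effect of age structure: both
# communities receive the same standardized rate.
print(direct_adjusted_rate(rates, standard))  # 5.2
```

If the two communities had genuinely different age-specific rates, their adjusted rates would differ, and that difference could then be attributed to something other than age structure.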

More complex forms of statistical analysis offer other ways of accounting for differences in population characteristics. For example, "risk adjustment" techniques have been used to account for differences in initial severity of illness among patients in comparisons of clinical outcomes, such as hospital mortality rates for cardiac surgery (see, e.g., Luft and Romano, 1993; Iezzoni, 1994; Landon et al., 1996). These techniques are, however, still evolving, and different severity-adjustment methods have been shown to produce differing performance results (Iezzoni, 1997b).

In theory, similar adjustments could be made in evaluating performance data for health programs, but specific methods of adjustment have not yet been adopted for this purpose. Doing so would require determining which factors are appropriate to use in an adjustment, developing the statistical model to be used, and ensuring the availability of the necessary data on the adjustment factors.

Any adjustment method must, of course, be used carefully. One concern is ensuring that adjustment does not disguise meaningful differences in program performance among subgroups in the population. An adjustment based on income, for example, might mask different levels of performance for lower- and higher-income groups. Another concern is that currently limited knowledge regarding the relationships between health outcomes and many social or biological factors may lead to inappropriate uses of adjustment. Determining which factors provide an appropriate basis for adjustment of performance data will require careful consideration of both the technical and policy implications of these methods.

The Drug Evaluation Network Study illustrates attempts to make such adjustments, in this case for comparisons of drug treatment centers.1 The Addiction Severity Index (ASI) (see McLellan et al., 1992), an extensive patient interview instrument, is being used to collect information about the characteristics of the substance abuse patients served, including employment, legal involvement, family, and psychiatric problems, as well as the nature and extent of their illegal drug use.

Implications for Data Analysis

What are the implications of the preceding observations for data analysis and for policy that flows from that analysis? The experienced analyst is far from helpless even in the face of serious bias and/or incompatibilities in the data. First, however, the analyst must be familiar with the details of the data collection methods, as well as the procedures used to prepare the data for analysis and the specific format of the resulting data files. Every increment in understanding may reveal additional influences on the data that should be considered. Second, analysts and policy makers need to know that bias is likely to be critically important, but that there are some means to control or understand its influence. Third, ordinary tests of statistical significance and confidence bounds do not capture the broad and perverse effects of bias, and hence may be seriously misleading. Fourth, analysts and policy makers must anticipate that different methods of answering a question will sometimes produce apparently incompatible results. Similar uncertainties about the meaning of results arise even when only one set of observations has been made; if the data had been obtained by other reasonable methods, the results would have been different. Thus the policy analyst and the statistical analyst must work together to understand the strengths and limitations of a specific data set so that policy will be robust against problems in the data as obtained.

Developing and Implementing Standards for Data and Data Systems

One of the difficulties the panel faced in the first phase of its work was the limited availability of data that are comparable across states for use in performance measurement. Achieving greater comparability will require more standardization in the content and methods of data collection, in the coding and vocabulary used to record data, and in the selection and definition of performance measures. In addition to standards for the substantive comparability of data, effective use of information technologies requires standards for the format in which data are stored and transmitted. These issues are relevant for all forms of health data, from infectious disease and vital records reporting, to survey data, to clinical and administrative records.

Standardization has proven elusive for at least two reasons. First, health data are often complex. Second, many data systems have been developed independently to meet local needs, and it can be difficult to reach consensus on standards that may require substantial change in those data systems or may seem less likely to meet those local needs. With regard to choosing performance measures, lack of consensus can also reflect a field's need for further development of a framework for assessing performance. Although standards can be imposed through regulation and legislation, the panel favors a collaborative approach based on the participation of interested parties at the national, state, and local levels to ensure consideration of a broad range of views.

Standardization Activities

Although much remains to be done to improve standardization in methods of data collection, in the coding of health data, in the formats for storing and transmitting data, and in the definition of performance measures, many activities in the public and private sectors are making useful contributions in these areas. Several of these efforts are briefly reviewed below.

Centralized Data Systems

Centralized data collection efforts at the national level (sponsored by the federal government or organizations with national and multistate agendas) can use comparable definitions, questions, and methods across many or all states. Most of these activities result in national-level data, but usually cannot provide subnational estimates. One exception is the National Immunization Survey, a random-digit-dial telephone survey that yields state and regional estimates of immunization rates for children aged 19–35 months. This federally run survey uses comparable data collection methods across all states and regions, and comparisons of rates of immunization can reasonably be made among states.

Federal-State Collaboration in Public Health

Often, a centralized approach to data collection and analysis is too costly and inflexible to provide adequate state- and local-level detail, and other means of achieving comparability are necessary. Collaboration by the states and the federal government has led to a few well-recognized successes in harmonizing independent state systems. A notable success is the national vital statistics system, a cooperative state-federal program through which recommended forms and procedures for the collection and reporting of vital records data have been developed. Data collected by each state vital records system are reported to and compiled by the National Center for Health Statistics to produce national totals. The National Notifiable Diseases System, which relies on state reporting of new cases of specific conditions, was enhanced in 1990 by the development of standard case definitions for nationally reportable conditions (Centers for Disease Control and Prevention, 1997). SAMHSA compiles the Treatment Episode Data Set from a minimum set of data collected by states on clients admitted to substance abuse treatment programs that receive funding through the state substance abuse agency.

Under the Behavioral Risk Factor Surveillance System (BRFSS), the Centers for Disease Control and Prevention (CDC) has worked with the states to reach agreement on a core set of questions and standard sets of supplemental modules. CDC provides overall support and technical oversight, but individual states administer the survey and have the opportunity to add their own questions. However, because sampling design and data collection methods vary among states, comparisons of BRFSS data among states must be made cautiously, and it has not been possible to aggregate state estimates into national totals. State surveys may, for example, have significantly different response rates, and users of the data should consider how nonresponse bias may have affected the estimates.

Healthy People

Over the past 20 years the Healthy People initiative has provided a framework for establishing a common, national set of measurable health promotion and disease prevention objectives (U.S. Public Health Service, 1979; U.S. Department of Health and Human Services, 1991), and most states report using Healthy People 2000, at least in part, to guide the development of similar state-level health objectives (Public Health Foundation, 1998). The national objectives for 1990 and 2000 were not created with performance measurement in mind, but initial proposals for Healthy People 2010 specifically call for efforts to link the objectives to performance measurement activities under the Government Performance and Results Act (GPRA) (U.S. Department of Health and Human Services, 1998a).

Healthy People is contributing to standards that will be useful to performance measurement by promoting the adoption of specific measures for tracking progress toward identified objectives and the development of detailed operational definitions for those measures. Some of the Healthy People 2000 measures have been adopted as performance measures for federal block grants (e.g., the Maternal and Child Health Services Block Grant [see Chapter 2] and the Preventive Health and Health Services Block Grant). Also, several of the measures proposed by this panel in its first report (National Research Council, 1997a) are quite similar to Healthy People 2000 measures, allowing for differences in data sources for state- versus national-level data. A series of reports from the National Center for Health Statistics is providing detailed specifications for operational definitions and data sources for the measures used to track progress toward the national Healthy People objectives (e.g., Seitz and Jonas, 1998). This information can help states and communities employ comparably defined measures.

Mental Health and Substance Abuse

Publicly funded programs in mental health and substance abuse are closely involved with the delivery of personal health services, often through providers in the private sector (see Chapter 3). A variety of data collection activities have developed, many of which have focused on services and service providers. The federally initiated Mental Health Statistics Improvement Program (MHSIP), for example, has been an essential resource supporting the development of information systems for public mental health services. Currently, however, comparable state-level data are limited for both mental health and substance abuse programs. As interest in treatment outcomes and performance measures has grown, state mental health and substance abuse programs have recognized the need to develop new and more comparable measures.

In 1997, the members of the National Association of State Mental Health Program Directors (NASMHPD) (1998) adopted a standardized performance indicator framework (see also Chapter 3). Using this framework as a guide, NASMHPD is working closely with SAMHSA and other organizations to identify and test measures that all states can use. Similarly, the National Association of State Alcohol and Drug Abuse Directors is working with its state members and SAMHSA to achieve consensus on a framework for performance measurement for substance abuse treatment and to develop detailed specifications for measures. Work is also being done on standard measures for substance abuse prevention.

Performance Measures in Health Care

As was noted in Chapter 2, several organizations are actively involved in the development and use of performance measures in health care, most notably the American Medical Accreditation Program, the Foundation for Accountability, the Joint Commission on Accreditation of Healthcare Organizations, and the National Committee for Quality Assurance (NCQA). Standard measures, data definitions, and data collection systems are being developed for assessing providers, health care facilities, and health plans to determine whether accreditation requirements are being met and to provide purchasers and consumers of health services with comparative performance information. As these performance measurement programs are implemented, they will tend to encourage greater consistency in the health data components related to those measures. NCQA (1997) has also emphasized the importance of adopting recognized standards for the structure and content of both clinical and administrative components of health plan information systems.

Standards in Health Informatics

The increasing computerization of health data and the proliferation of incompatible information systems have generated efforts on many fronts to standardize various elements of the structure, function, and content of these information systems and of the format for transmitting information among systems. Organizations such as the American National Standards Institute (ANSI), the American Society for Testing and Materials (ASTM), Health Level Seven (HL7), the Institute of Electrical and Electronics Engineers (IEEE), and the National Uniform Claims Committee (NUCC) serve as private-sector forums for voluntary collaboration among parties interested in formulating standards for specific information system features.

Other groups are focusing on the development of standard coding sets and vocabularies for recording clinical information such as symptoms, diagnoses, procedures, and laboratory findings; examples of these coding sets and vocabularies are the International Classification of Diseases (ICD), the Current Procedural Terminology (CPT), and the Systematized Nomenclature of Medicine (SNOMED). No one coding system or vocabulary has become a comprehensive standard, and the Unified Medical Language System (UMLS), a project of the National Library of Medicine (1998), provides a "translation" tool to link information represented using these varying systems.

Health Insurance Portability and Accountability Act of 1996

HIPAA should result in substantial advances in the standardization of health care data and data systems. The administrative simplification provisions of HIPAA direct the Secretary of the Department of Health and Human Services (DHHS) to adopt standards for electronic transmission of administrative and financial health care data (see Box 4-2); for unique health identification numbers for health plans, health care providers, employers, and individuals; for code sets for data elements used in health care transactions; and for security of electronic transactions. The act also directs the secretary to promulgate guidelines for computerized medical records within 4 years. DHHS will base HIPAA standards on existing standards in any of these areas that have been developed, adopted, or modified by standards-setting organizations accredited by ANSI.

Box 4-2

Administrative and Financial Transactions Covered Under Standardization Requirements of the Health Insurance Portability and Accountability Act: claims or equivalent encounter information; coordination of benefits information.

The use of electronic transactions is not required, but if they are used, they must adhere to the HIPAA standards. Despite anticipated high initial costs of modifying or developing information systems to implement the new transactions, overall savings are expected to amount to billions of dollars (Office of the Secretary, U.S. Department of Health and Human Services, 1998). Standardized enrollment, coding, and billing formats will eliminate the need for health care providers to customize transactions to the varied requirements of many different health plans and insurers.

The impact of HIPAA will extend to state and local health departments and other health-related agencies. Those that function either as payors or as service providers seeking reimbursement will have to implement information systems that use the transaction standards. Standardization of data elements, data definitions, transaction formats, and code sets should aid the conversion of health encounter data into public health data. For example, with a standardized transaction format and standardized electronic data interchange, it should be possible to piggy-back notices of reportable illness on an electronic transaction. This process offers the potential for greatly enhancing the quality and timeliness of these reports. In addition, because a separate report would no longer be needed, the percentage of cases that are reported could be expected to increase.
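The piggy-backing idea can be sketched in a few lines: scan the diagnosis codes carried on a standardized encounter transaction and emit a case-report stub whenever a reportable condition appears. The record layout, field names, and reportable-code list below are hypothetical illustrations, not drawn from any actual HIPAA transaction standard.

```python
# Hedged sketch: deriving public health case reports from standardized
# encounter data. The encounter structure and the reportable-condition
# list are invented for illustration.

REPORTABLE_DIAGNOSES = {"011.9": "tuberculosis", "070.30": "hepatitis B"}

def extract_case_reports(encounter):
    """Return a case-report stub for each reportable diagnosis on the encounter."""
    reports = []
    for code in encounter.get("diagnosis_codes", []):
        if code in REPORTABLE_DIAGNOSES:
            reports.append({
                "condition": REPORTABLE_DIAGNOSES[code],
                "diagnosis_code": code,
                "provider_id": encounter["provider_id"],
                "service_date": encounter["service_date"],
            })
    return reports

encounter = {
    "provider_id": "H-4321",
    "service_date": "1998-09-14",
    "diagnosis_codes": ["250.00", "011.9"],  # second code is reportable
}
reports = extract_case_reports(encounter)   # one stub, for tuberculosis
```

Because the report is generated as a by-product of a transaction the provider must submit anyway, no separate reporting step is required, which is the source of the expected gains in completeness and timeliness.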

Balancing Standardization and Change

As discussed earlier, greater standardization of data, data collection methods, and measures is essential to permit comparisons of performance over time or across groups. However, this standardization must be pursued thoughtfully. The participants in the performance measurement process must have an opportunity to gain experience with both the conceptual and practical aspects of performance measurement, and the process must be seen as a continuing activity that allows for the reassessment and revision of standard measures. In its first report the panel emphasized that the proposed measures of health outcomes and risk status were reasonable candidates for use in most states (if the necessary data were available), but that process and capacity measures had to be selected to match the particular program strategies that a state had adopted. The panel also observed that greater consensus regarding appropriate measurement domains may sometimes be necessary before performance measures can be proposed.

There is a risk that setting standards for performance measures or data sources will discourage improvement and innovation. Current limitations on the availability of data could, for example, encourage adoption of "least common denominator" measures for which data are widely available, rather than better measures for which new data collection efforts would be required. The desire for continuity of measures over time could also work to discourage constructive changes in the selection and definition of performance measures or in data collection and analysis.

Policy makers and others who develop and use performance measures need to recognize that they must be engaged in a continuing process in which measures and data are reviewed and revised in response to advances in knowledge and changes in program practices and priorities. Within a framework that strives for comparability, this process should allow for the introduction of new measures, the acceptance of new data sources, and the adoption of new techniques for data collection and analysis. As new measures and data systems are introduced, efforts should be made to calibrate them against previously established data systems to facilitate continued use of the data generated by those older systems in longitudinal analyses. The review process should also ensure that the measures and data sources already in use continue to be suitable and are used in appropriate ways. For example, for the most recent revision of the HEDIS measures (version 3.0), NCQA (1996) instituted a supplemental "testing set" of measures to be examined further before being adopted as required measures. With HEDIS 3.0, NCQA also established a standing Committee on Performance Measurement to oversee an ongoing process of reviewing and testing measures and to develop a research agenda for the development of new measures.

For a review process to be credible and acceptable, those whose performance is to be measured must have an effective means of participating in the deliberations conducted to select and review performance measures. For the public-sector programs that have been the focus of the panel's attention, formal mechanisms should be developed to ensure that each major group of stakeholders at the state and local levels is a full partner in such discussions. The regional meetings organized in response to the PPG proposal were a welcome opportunity for broad participation, but are not a viable model for a continuing forum for discussion of program priorities, performance measures, or data resources. Federal, state, and local governments should ensure that policy, program, and technical perspectives are all represented, and might work with various organizations to identify representative participants from these constituencies for such an effort. Examples of these organizations include the Association of State and Territorial Health Officials, the National Association of County and City Health Officials, the Council of State and Territorial Epidemiologists, the Association of Maternal and Child Health Programs, the National Association of State Mental Health Program Directors, the National Association of State Alcohol and Drug Abuse Directors, the National Association of Local Boards of Health, and the Association of Public Health Laboratories. (Box 5-1 in Chapter 5 lists additional organizations that might be involved in these activities.)

Enhancing Performance Measurement Through Advances in Information Technology

Advances in information technology are changing the environment not only for performance measurement, but also for many aspects of the health data infrastructure that support decision making for policies and programs (see, e.g., Lasker et al., 1995). These developments include improved capabilities for linking and merging electronic data sets, access to enhanced analytic resources as smaller computers become capable of using more powerful software to analyze larger data sets, and vastly expanded desktop access to information and options for data collection and transmission through the Internet and the World Wide Web. These advances in information technology enhance the ability to monitor the performance of state and local health agencies, as well as private providers of health care services. Furthermore, these developments can improve the availability of health performance information through new methods of communicating performance results to key audiences. There is, however, great variability among states and communities in access to and expertise in the use of information technologies. Noted here are a few of the developments in information technology that should enhance the ability to implement performance measurement.

Data Collection and Transmission Technologies

Technology is providing new options for data collection and transmission that can improve the quality and timeliness of the data. The widespread adoption of electronic birth certificates, for example, allows hospitals to enter birth certificate data directly into an information system that can check for missing or inconsistent data and then transmit the record to the appropriate office to register the birth. This eliminates two steps required by the older, paper-based reporting process: hospitals no longer need to forward written records for transcription in a central office, and that office no longer needs to send questions back to the hospital (sometimes more than once) about missing or suspect data before the birth record can be completed. For the most part, electronic birth certification is currently being used to automate the paper record process, but it could become the core of a more comprehensive information system on infant health that would link birth certificates with other data sources, such as records on prenatal care, metabolic disease screening, and immunization (Starr and Starr, 1995).
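The up-front edit checks such a system can apply before transmission might look like the following sketch. The field names, plausibility ranges, and consistency rule are illustrative assumptions, not those of any actual vital records system.

```python
# Hedged sketch: point-of-entry edit checks for an electronic birth
# record. All field names and thresholds are invented for illustration.

def check_birth_record(record):
    """Return a list of problems; an empty list means the record can be sent."""
    problems = []
    for field in ("infant_name", "date_of_birth",
                  "birth_weight_grams", "gestation_weeks"):
        if record.get(field) in (None, ""):
            problems.append(f"missing: {field}")
    weight = record.get("birth_weight_grams")
    weeks = record.get("gestation_weeks")
    if isinstance(weight, (int, float)) and not 200 <= weight <= 8000:
        problems.append("birth weight out of plausible range")
    # A simple consistency edit: a term gestation with a very low weight is suspect.
    if isinstance(weight, (int, float)) and isinstance(weeks, (int, float)):
        if weeks >= 37 and weight < 1500:
            problems.append("weight inconsistent with gestational age")
    return problems

record = {"infant_name": "", "date_of_birth": "1998-05-02",
          "birth_weight_grams": 3400, "gestation_weeks": 39}
issues = check_birth_record(record)  # the missing name is caught at entry
```

Catching such problems at the point of entry, rather than after a paper record reaches a central office, is what shortens the back-and-forth the text describes.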

The Drug Evaluation Network Study, mentioned earlier, also illustrates new technological capabilities in data collection and analysis. Trained staff record information collected during an extensive patient interview directly into laptop computers. This allows the data to be monitored for invalid or inconsistent entries during the course of the interview and transmitted electronically to the researchers conducting the study. The electronic linkage between the treatment centers and the study staff makes it possible to update the interview protocol overnight to address policy changes or specific concerns about the nature of illegal drug use across the country.

Data Management and Analysis

Data management systems such as relational databases and data warehousing make it possible to maintain data in many separate files and link data from those files as needed. These systems can store information that includes personal identifiers, or they can be based on anonymous data records for which identifying information has been replaced by system-specific codes that allow records to be linked, but do not relate them to identifiable individuals.
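One common way to produce such system-specific linkage codes is a keyed hash of the identifying fields, sketched below. This is a minimal illustration of the general approach, not a recommended production design; key management, spelling variation in names, and collision handling are all omitted.

```python
# Hedged sketch: replacing personal identifiers with a system-specific
# code so records can be linked across files without exposing identities.
# A keyed hash (HMAC) means the code cannot be reproduced without the
# system's secret key. The key below is a placeholder.

import hashlib
import hmac

SYSTEM_KEY = b"replace-with-a-securely-stored-secret"

def link_code(name, date_of_birth):
    raw = f"{name.lower()}|{date_of_birth}".encode()
    return hmac.new(SYSTEM_KEY, raw, hashlib.sha256).hexdigest()[:16]

def deidentify(record):
    """Strip direct identifiers, substituting a reproducible linkage code."""
    out = dict(record)
    out["link_code"] = link_code(out.pop("name"), out.pop("date_of_birth"))
    return out

visit = deidentify({"name": "Jane Doe", "date_of_birth": "1960-03-15",
                    "diagnosis": "asthma"})
lab = deidentify({"name": "Jane Doe", "date_of_birth": "1960-03-15",
                  "test": "spirometry"})
# The two anonymous records carry the same code and can be joined on it,
# but neither carries the name or birth date.
```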

The addition of geographic detail to health records and other types of information, often referred to as geocoding, can enhance the analytic value of many kinds of health data. Geocoded data can be grouped into geographic subunits for analysis. For example, responses from a state's Behavioral Risk Factor Survey might be grouped by county or other substate region to gain additional insight into possible differences in risk behaviors and program impacts or needs across the state. It may also be useful to include geographic information, such as the distance between a substance abuse client's residence and treatment site, in analyses of program outcomes. For some purposes, specific addresses may be needed, but even zip codes can provide useful geographic information for many analyses.
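A derived variable such as residence-to-treatment-site distance is straightforward to compute once records are geocoded. The sketch below uses the standard haversine great-circle formula; the coordinates are invented for illustration.

```python
# Hedged sketch: great-circle distance between two geocoded points,
# the kind of derived analytic variable (e.g., client residence to
# treatment site) described in the text.

from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def distance_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Invented residence and treatment-site coordinates, about 35 km apart.
d = distance_km(38.90, -77.04, 39.18, -76.85)
```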

Geocoding also makes it possible to use new mapping technologies to display and analyze data. These geographic information systems (GIS) can capture and plot data from multiple sources to examine the spatial relationships among several factors that may affect health outcomes or health program services (see, e.g., Clarke et al., 1996). For example, data on birth outcomes (e.g., birth weight, prematurity) might be plotted with data on the residence of the Medicaid-eligible population and the location of such services as health care providers, child care facilities, and grocery stores. The capture of geographic data for GIS is being enhanced by data collection systems that can record specific geographic coordinates by using satellite-based global positioning systems (GPS).

Computer-Based Patient Records

In health care settings, the use of information technology is expanding beyond the management of administrative and financial data to computer-based clinical information systems. As discussed earlier, there is great potential for computer-based patient records (CPRs) to meet the need for timely and accurate clinical information that is difficult to access with traditional paper records (Institute of Medicine, 1997). Prototype CPR systems can convert natural medical language into medical procedure and diagnosis codes. Work is also being done to integrate decision support tools into these systems by incorporating clinical knowledge resources such as accepted treatment protocols. Based on these protocols, deviations or oversights in patient management can be detected and alternatives suggested.
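In its simplest form, protocol-based decision support amounts to comparing a proposed action against a stored rule and raising an alert on deviation. The sketch below illustrates that idea only; the protocol table, drug name, and dose range are invented, not clinical guidance.

```python
# Hedged sketch of the decision-support idea: compare a proposed order
# against a stored protocol and flag deviations. The protocol entries
# are hypothetical and carry no clinical meaning.

PROTOCOLS = {
    # drug -> (minimum dose, maximum dose, unit)
    "isoniazid": (200, 300, "mg/day"),
}

def check_order(drug, dose):
    """Return None if the order matches the protocol, else an alert string."""
    if drug not in PROTOCOLS:
        return f"no protocol on file for {drug}"
    low, high, unit = PROTOCOLS[drug]
    if not low <= dose <= high:
        return (f"{drug} dose {dose} {unit} outside protocol range "
                f"{low}-{high} {unit}")
    return None

alert = check_order("isoniazid", 600)  # flagged: above the protocol maximum
```

Real CPR decision support layers many such rules over coded clinical data, which is why the consensus on clinical vocabularies discussed above matters so much.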

Individual institutions have made progress in developing CPR systems. However, further advances will require not only technological innovation but also organizational and policy changes, such as greater consensus on clinical vocabularies and greater acceptance of changes in methods of recording information by those who use such systems. For example, voice-activated interfaces and increased familiarity with computer use are overcoming the past resistance of many physicians to typing notes directly into the patient record. The high cost of current CPR systems is another barrier to their wide acceptance. Further refinement of the technology and cost reductions are key factors for more widespread adoption.

The Internet and the World Wide Web

Both health data systems and performance measurement are affected by the extraordinary growth of the Internet and the World Wide Web as tools that facilitate communication and data exchange. Performance data can be collected and submitted to central repositories for processing and analysis and then made available to a broad range of interested parties. It is even possible to provide access to data in a form that permits customized analysis (see Box 4-3). Although use of these tools has expanded rapidly, they may not yet be appropriate for certain communities. Some of the more rural parts of the country may still lack affordable access to the high-speed telecommunications services that substantially enhance the utility of the Internet and the World Wide Web.

Box 4-3

Using the Internet and the World Wide Web. SHARING CURRENT INFORMATION: Recent Wisconsin legislation requires each community to conduct a biannual assessment of its population health needs and to develop a plan that reflects community health priorities.

Limits of Technology

Developing and maintaining information systems designed to be long-lived is challenging when the technology is evolving at a rapid pace. As equipment and software advance, an agency may lose easy access to data from either obsolete systems or systems more advanced than its own. Incompatibilities can emerge in equipment, storage media, and programs, especially if information systems are developed and maintained in isolation. Collaborative efforts to develop information system standards can foster the evolution of independent information systems able to exchange information successfully. It is also essential for policy makers and data system managers and users to ensure that information technologies are employed only in ways that maintain the confidentiality of health data and protect the privacy of the individuals to whom the information applies. This issue is discussed in the next section.

Privacy, Confidentiality, and Security of Health Data

Protecting the privacy, confidentiality, and security of all forms of health-related data is a critical consideration in the collection and use of data for performance measurement. The public is concerned that personal health data may be used in detrimental ways, particularly as information technologies become more powerful and more pervasive. Fears that disclosure of information such as HIV test results or records of mental health or substance abuse treatment could lead to loss of employment or refusal of insurance may be especially acute. This concern is creating pressures for stricter technical and policy controls on access to and use of health data. At the same time, health policy makers and researchers worry that overly strict controls may hinder responsible uses of the data for research, performance measurement, and other such purposes aimed at controlling health threats or improving health and health services. Linkage of data sets can be a source of special concern because combining data in ways that may not have been anticipated when personal information was disclosed for a more limited purpose could compromise the privacy of the individuals involved.

Privacy, confidentiality, and security are closely related but distinct issues. As used here, privacy refers to an individual's interest in limiting the disclosure of personal information; confidentiality refers to controlling the release of information once it has been disclosed; and security refers to measures for controlling and protecting information and the systems through which it is accessed (National Research Council, 1997b). The fundamental concerns about unauthorized access to and use of health data are relevant for both paper records and electronic systems. The scope, power, and speed of electronic information systems magnify these concerns, but use of electronic information systems also offers new means for protecting data. Moreover, concerns about protecting the privacy, confidentiality, and security of personally identifiable records apply not just to health data, but also to data collected for a variety of purposes, such as tax and census records.

In discussions of health data, medical records and administrative files that identify specific individuals are generally viewed as the most vulnerable to inappropriate disclosure, but other materials, such as vital statistics records and survey responses, also require adequate protection. Moreover, even supposedly anonymous or aggregated data, such as those published in vital statistics reports, must be handled appropriately because distinctive combinations of characteristics such as age, race, occupation, and diagnosis could permit the identification of individuals. Linking records across data sets can add valuable information, but poses added risks that privacy and confidentiality will be compromised. The proposed use of unique personal identifiers in health records discussed earlier would facilitate record linkage, but many observers oppose their adoption until more effective privacy protections are in place (e.g., Institute of Medicine, 1994; National Committee on Vital and Health Statistics, 1997b; National Research Council, 1997b).
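The re-identification risk from distinctive attribute combinations can be made concrete with a small counting sketch, in the spirit of what is now often called a k-anonymity check. The records, attributes, and threshold below are invented for illustration.

```python
# Hedged sketch: counting how many records share each combination of
# quasi-identifying characteristics. Combinations held by fewer than k
# records could permit identification of individuals even in data with
# no names attached. All records here are invented.

from collections import Counter

def risky_combinations(records, quasi_identifiers, k=5):
    """Return the attribute combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

records = [
    {"age_group": "30-39", "county": "A", "occupation": "teacher"},
    {"age_group": "30-39", "county": "A", "occupation": "teacher"},
    {"age_group": "70-79", "county": "B", "occupation": "surgeon"},  # unique
]
risky = risky_combinations(records, ["age_group", "county", "occupation"], k=2)
# Only the surgeon's combination appears fewer than twice and is flagged.
```

Aggregated tables face the same hazard: a published cell built from a combination this rare effectively points to one person.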

Several federal laws and regulations provide privacy protection for data collected by federal agencies (see National Research Council, 1993), and states have adopted varying provisions regarding the privacy and confidentiality of publicly and privately held data related to health status (e.g., infectious disease reports) and health care (Institute of Medicine, 1994; Gostin et al., 1996). Many observers believe that federal legislation is necessary to ensure a uniform minimum level of protection for health data at the national, state, and local levels and in the public and private sectors (e.g., Institute of Medicine, 1994; National Committee on Vital and Health Statistics, 1997a). Others have advocated the adoption of more consistent policies governing the collection and use of data for statistical purposes by federal agencies (National Research Council, 1993).

Despite widespread support for strong protection of the privacy and confidentiality of most health data, considerations of personal or public safety may sometimes require controlled release of data related to matters such as contagious diseases or mental illness. Other public policy priorities may also preempt protections for health-related information. For example, states now have the option to bar permanently from the Temporary Assistance for Needy Families program persons convicted of drug-related felonies (U.S. Department of Health and Human Services, 1997d).

Over the past few years, Congress has considered but not acted on proposals to establish policies regarding the privacy and confidentiality of individually identifiable health data. Action may now be more likely because HIPAA calls for Congress to pass such legislation by August 1999 or, if Congress does not act, for the Secretary of Health and Human Services to issue regulations covering electronic administrative and financial transactions. DHHS has submitted recommendations to Congress for federal privacy standards (U.S. Department of Health and Human Services, 1997b), and discussion continues over broader federal action. In terms of performance measurement, the panel notes that the DHHS recommendations include provisions for disclosure of information to public health agencies and state health data systems. The National Committee on Vital and Health Statistics (1997a) has strongly recommended passage of a health privacy law rather than reliance on departmental regulations that will govern only electronic transactions because the restricted scope of those regulations may make them impossible to administer appropriately.

Ensuring the physical security of health data, especially data in electronic form, is an essential adjunct to policies on privacy and confidentiality, but relates also to protecting data from intentional or inadvertent alteration. Although technological measures can increase data security, strong organizational policies and practices are also needed. Specific recommendations regarding health data have been made by a committee of the National Research Council (1997b) and the National Committee on Vital and Health Statistics (1997a). Among the technological steps that have been recommended to ensure the security of electronic health data are individual authentication of information system users, procedures to control user access to data, tracking and review of user transactions (i.e., use of "audit trails"), and protection of electronic communications and points of remote access to an information system. Among the recommended organizational practices are establishing formal security and confidentiality policies that include sanctions for violations, designating an information security officer, and providing training for staff and other users of an organization's information systems.
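The audit-trail idea reduces to a simple discipline: every access is logged with the authenticated user, the action, and a timestamp before the record is released, so use of the system can later be reviewed. The sketch below stubs out storage, authentication, and record retrieval; the identifiers are invented.

```python
# Hedged sketch of an audit trail. Authentication and actual record
# retrieval are stubbed out; user and record identifiers are invented.

from datetime import datetime, timezone

AUDIT_LOG = []

def access_record(user_id, record_id, action):
    """Log the access before releasing the record (retrieval stubbed here)."""
    AUDIT_LOG.append({
        "user": user_id,
        "record": record_id,
        "action": action,
        "time": datetime.now(timezone.utc).isoformat(),
    })
    return {"record_id": record_id}  # stand-in for the actual record

access_record("clerk01", "PT-1009", "read")
access_record("clerk01", "PT-1009", "update")

# Review step: reconstruct all transactions by a given user.
clerk_actions = [e["action"] for e in AUDIT_LOG if e["user"] == "clerk01"]
```

In practice the log itself must be protected against alteration, which is one reason the organizational measures listed above accompany the technological ones.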

The policies and practices that emerge in response to these concerns are likely to have significant implications for health information systems and the use of health data for performance measurement. The panel urges careful consideration of all perspectives, recognizing that there are strong views and compelling interests on many sides. These issues lie beyond the scope of the panel's work, but policy makers and the public are urged to consider carefully the recommendations that others have made.

Investing in Health Data and Data Systems

For performance measurement to be effective, good data and information systems, including a skilled staff with expertise to manage and use those systems, must be developed and maintained. A stable long-term investment must be made in the equipment and program activities needed to collect, manage, and use health data. To ensure that staff at the federal, state, and local levels are prepared to perform the tasks associated with performance measurement, a similar investment must be made in training and technical assistance. The DHHS strategic plan gives high priority to investments in electronic data systems and the training and technical assistance needed to apply new technologies at the state and local levels (U.S. Department of Health and Human Services, 1997a).

These investments are a responsibility that should be shared by all who expect to make use of the data and information systems involved, whether for performance measurement or for other purposes. Support must include adequate direct funding, as well as commitments of staff time and access to computing and other technical resources. Because publicly funded health programs often face serious funding constraints, the panel emphasizes the importance of mobilizing the resources needed for data and information system development in ways that do not compromise funding for program services. At the same time, it is important to emphasize that only with good data and good program monitoring is it possible to assess whether program services are effective and being used appropriately.

Data and Information Systems

The panel urges DHHS to initiate a comprehensive review of the nation's current portfolio of health data activities to explore with states, communities, and the private sector opportunities for producing better data more efficiently. Although the panel has focused on performance measurement, this review should adopt a broader perspective that takes into account the variety of purposes served by health data and information systems.

Among the issues for exploration in this review are the investment in federal versus state and local data systems and opportunities for more efficient use of data system resources. For example, the DHHS strategic plan calls for additional investment in departmental surveys to generate state-level data, but the merits of using federal resources to strengthen state and local survey programs should also be considered. This latter approach might make it possible to consolidate reports to meet certain national data needs while producing data that respond to specific state and local needs as well.

A key starting point might be the BRFSS model. This collaboration between CDC and the states has resulted over time in a survey program in which all states and the District of Columbia participate. Each of the annual state-administered surveys uses a standard core questionnaire and can also include separately funded customized supplements that respond to specific state interests. These surveys also provide a framework that states can use to produce more detailed substate data. The BRFSS has been cited as a key source of state-level data for measures related to Healthy People (Kim and Keppel, 1997). It is also an essential resource for the state-level data on health status and health risk factors that will be needed for many performance measures, including a number of the measures proposed by this panel in its first report (National Research Council, 1997a). The panel is concerned that the BRFSS program has not received a strong commitment at the federal level for continuing support consistent with its importance as an information resource. The federal and state funding arrangements vary from state to state, but overall, direct federal funding has generally supported about half of the modest annual cost of the survey program. For the 1996–1997 grant cycle, this direct federal support amounted to about $3.5 million of the combined federal-state funding of about $7 million (D. Nelson, Centers for Disease Control and Prevention, personal communication, September 1998). For the 1997–1998 grant cycle, however, direct CDC support was reduced by about 25 percent to $2.7 million. A funding loss of this magnitude is a serious concern. For the 1998–1999 round of grants, CDC funding for the BRFSS increased to $3.9 million, but even with this increase, the funding level allows for an average grant of only about $76,500 per state.

Efforts to identify opportunities for making more efficient use of existing data resources could include assessing the usefulness of the data currently being collected, exploring opportunities to build new data collection capabilities within existing systems, and identifying ways to remove obstacles that may hinder more efficient operation or the sharing of data across systems. Given changes in program priorities and a more outcome-oriented approach to monitoring program operations (as reflected in the PPG proposal), some current data collection programs may no longer be appropriate or may require redirection. If out-of-date activities were identified, the resources used to support them could be shifted to more useful data activities.

Opportunities may also exist to expand data collection within an existing framework, which would tend to be less costly than establishing a new freestanding activity. For example, the National Center for Health Statistics has proposed the State and Local Area Integrated Telephone Survey (SLAITS), which would take advantage of the National Immunization Survey sampling frame. For the latter survey, a large number of ineligible households must be contacted in the course of identifying those with children of an appropriate age. With SLAITS, these contacts with ineligible households could be transformed into contacts with households eligible for alternative surveys.
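The screening-reuse mechanism behind SLAITS can be sketched in a few lines. This is a purely hypothetical illustration — the household records and field names are invented, not drawn from the actual survey instruments — showing how contacts that the immunization survey would otherwise discard become a sampling frame for another survey:

```python
# Hypothetical sketch: screening for the National Immunization Survey
# contacts many households with no age-eligible children. Instead of
# discarding those contacts, SLAITS would reuse them as the sampling
# frame for other state or local surveys.
screened_households = [
    {"id": 1, "has_age_eligible_children": True},
    {"id": 2, "has_age_eligible_children": False},
    {"id": 3, "has_age_eligible_children": False},
    {"id": 4, "has_age_eligible_children": True},
]

# Eligible households proceed to the immunization interview...
immunization_sample = [h for h in screened_households
                       if h["has_age_eligible_children"]]

# ...while otherwise-wasted screening contacts feed an alternative survey.
alternative_survey_frame = [h for h in screened_households
                            if not h["has_age_eligible_children"]]
```

The efficiency gain is that the expensive step — dialing and screening households — is paid for once and serves two surveys.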

Also of concern to the panel are data system inefficiencies that may exist because of constraints on the use of categorical funding or demands for specialized data systems and reporting. For example, states have found it difficult to integrate some federally developed reporting systems into existing state information systems. In a recent audit, the Illinois Department of Health found eight separate information systems for HIV/AIDS, each of which required independent data entry (J. Lumpkin, Illinois Department of Health, personal communication, August 1998). Specialized turnkey or proprietary systems that are customized for a single program area can be difficult to link to other information systems or to adapt for other, related applications. Moreover, because such systems must be used in operational settings that vary across communities and states, a single version is unlikely to be suitable for every setting. A basic principle of information science is that information systems should be designed to support service delivery, rather than requiring service delivery to adapt to the information systems.

Problems of redundancy and incompatibility can be traced to all levels of government. If such problems can be identified, efforts can be made to overcome them, although they may not be easy to eliminate. At the federal level, CDC and the Health Resources and Services Administration (HRSA) recently took steps in this direction by endorsing the use of their categorical grant funds to support the development of integrated health information systems, noting that integration will benefit categorical programs and serve cross-cutting information needs (Broome and Fox, 1998). The panel encourages other federal agencies, as well as state and local health agencies, to explore similar policies. Any formal legislative and regulatory restrictions that constrain the use of program funds in support of more integrated health information systems should be reviewed to determine whether they can be revised or removed.

Information Technology
Rapid advances in information technology are presenting new opportunities to collect, manage, analyze, and disseminate data for performance measurement and other purposes. To take full advantage of those opportunities, however, federal, state, and local governments must invest in more sophisticated computers, software, and communications capabilities. To optimize their investment, they should look for efficient approaches to system design and operation. For example, a modular object-oriented approach to programming facilitates the transfer of software development efforts from one application to another. This reuse of software can dramatically reduce the time and cost of system development. The creation of a national repository of software objects that perform common core functions might be one means of facilitating the development of state or local information systems and leveraging the funds currently available for system development.
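As an illustrative sketch of how a repository of reusable software objects might work — all names here are hypothetical, not taken from any actual state system — a common core function such as date-of-birth validation could be written once and shared by many program-specific applications:

```python
from datetime import date

class DateOfBirthValidator:
    """A reusable 'core function' object. Any program-specific system
    (e.g., an immunization registry or a disease-reporting application)
    can use it rather than re-implementing the same check."""

    def __init__(self, earliest_year=1880):
        # Reject dates of birth before this year as implausible.
        self.earliest_year = earliest_year

    def is_valid(self, dob, as_of=None):
        as_of = as_of or date.today()
        return date(self.earliest_year, 1, 1) <= dob <= as_of

# Two hypothetical categorical-program systems sharing the same object:
validator = DateOfBirthValidator()
registry_record_ok = validator.is_valid(date(1995, 6, 1))  # plausible -> True
case_report_ok = validator.is_valid(date(1850, 1, 1))      # implausible -> False
```

Because the validation logic lives in one shared object, a correction or enhancement made once benefits every system that reuses it — the leverage the panel has in mind.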

The panel has not attempted to estimate the level of investment that would be appropriate, but notes that in the private sector, the health care industry spent an estimated $10–$15 billion on information technology in 1996 (Munro, 1996). Further growth in the level of effort is expected as health care organizations implement computer-based patient record systems; upgrade administrative and billing systems; install networks for sharing information with affiliated entities; and use public networks, such as the Internet, to distribute health-related information and provide access to clinical databases in remote areas. The scale of private-sector investment signals broad recognition of the importance of supporting information systems.

If the public sector is not to be left behind, it, too, must make a significant investment in information systems. Estimates of the current spending on state and local health data systems are not readily available,2 but a reference point might be sought in state spending on environmental data systems. The Environmental Protection Agency (EPA) (1998) reports that states engaged in reforming their environmental reporting processes are spending $3–$10 million per year for data system improvements and operations. EPA is making demonstration grants of up to $500,000 to support these efforts. In DHHS, the Maternal and Child Health Bureau (1998) is funding a systems development initiative that offers states grants of $100,000 that can be used to support information system activities, especially those related to performance measures for the Maternal and Child Health Services Block Grant. In the future, federal, state, and local health data systems may benefit from the savings expected to result from the administrative simplification required by HIPAA. Strategies to realize this potential and reinvest some portion of the savings in data systems and performance measurement for publicly funded health programs should be explored.

Training and Technical Assistance

The adoption of performance-based systems of accountability for publicly funded health programs will require staff who oversee and operate these programs to apply skills in planning and assessment that may be unfamiliar to them. Federal agencies have found the development of specific performance goals and the definition of related outcome measures to be among the most difficult challenges posed by the Government Performance and Results Act (GPRA) (U.S. General Accounting Office, 1997). The use of performance measurement also draws further attention to the need for expertise in data analysis and in the design and operation of data collection and data management systems. Greater access to data and to more powerful computers and software makes it easier to perform complex analyses, but it also increases the importance of ensuring that users have sufficient skills and expertise to apply these technologies appropriately. Moreover, in commenting on the requirements for successfully implementing GPRA, the U.S. General Accounting Office (1996) observed that staff will need skills in strategic planning, performance measurement, and the use of performance information in decision making, and that agencies should view training to develop these skills as a worthwhile investment. Support for training and technical assistance is essential to ensure that the necessary skills and expertise are available.

In theory, health departments and other health-related agencies might add new staff to obtain the expertise needed to support performance measurement and related activities. In practice, however, most states and communities have limited resources for hiring additional program and data system staff, and some may face other pressures to maintain or reduce staff size. Furthermore, the relatively low salaries traditionally offered by state and local health agencies can make it difficult to attract and retain highly trained staff. These limitations may be especially acute in technical areas. The rapid growth of the information technology industry, as well as the need to address the year 2000 problem for huge numbers of computer systems, has placed a high premium on information technology skills. Given these constraints, access to training programs that can enhance the skills of existing staff and to technical assistance that draws on the expertise of others becomes especially important.

For staff to obtain the needed training, suitable materials and programs are required, as well as time and funds to support the staff members' participation. Training opportunities may take many forms, including formal academic programs (e.g., graduate programs in schools of public health) and specialized courses and training sessions offered by federal agencies (e.g., the CDC Public Health Training Network), academic institutions, or others in the private sector. Funding for scholarships and dissertation grants could assist staff in obtaining advanced academic training. Support for other training opportunities is also needed. Teleconferencing, self-guided instruction, and other forms of distance-based learning can bring a variety of training to large audiences and can compensate in part for constraints on funding for travel to attend courses and conferences. However, supplementing distance-based training with attendance at off-site programs may give staff valuable opportunities to learn through direct interaction with colleagues from other states or communities. The panel was informed that even though CDC provides funds specifically to allow staff from each state to attend the annual BRFSS conference, these funds generally cover participation by the data managers who oversee the collection and maintenance of state BRFSS data sets, but are not adequate to support the attendance of most users of BRFSS data.3

Technical assistance can make a large reservoir of expertise available to meet diverse needs. The assistance can take many forms, including publications, information clearinghouses, conferences, and consultations with experts. In a recent activity of particular relevance to the interests of this panel, CDC and HRSA worked with the Association of State and Territorial Health Officials and the National Association of County and City Health Officials to develop an "investment guide" to assist states in planning and developing integrated health information systems (Centers for Disease Control and Prevention and Health Resources and Services Administration, 1998).

A review of technical assistance activities in DHHS led to the conclusion that these activities could be enhanced by greater coordination and evaluation of the effectiveness of current forms of assistance (U.S. Department of Health and Human Services, 1997f). It was suggested to the panel that in the area of epidemiologic analysis, for example, states could benefit from greater access to more senior CDC epidemiologists to supplement programs that currently rely primarily on newly trained epidemiologists.4 Because of their national perspective and their influential role as funders of many health programs, federal agencies are well placed to serve as a focal point for technical assistance. User groups that draw participants from local, state, and federal health agencies could open other channels for obtaining technical assistance and learning about a broader range of health data issues. States and communities might also look to academic institutions and others in the private sector, particularly in a rapidly evolving area such as information technology. Foundations or other nonprofit groups might be able to serve as intermediaries in sponsoring such public-private collaborations.

Taking a Collaborative Approach to the Development of Health Data and Information Systems

The panel's deliberations regarding performance measurement have led to the conclusion that much greater collaboration and coordination are an essential foundation for further development of the nation's health data and data systems. It appears that by adopting a broadly based approach to health data needs and resources, it will be possible to make more effective use of available data and information systems for performance measurement, as well as for other purposes, including monitoring health status in the population, managing health programs, and informing policy makers and the public. For publicly funded health programs, it is essential that information needs at the federal, state, and local levels all be taken into account. The DHHS strategic plan recognizes the need for accurate and timely data at all these levels for assessing changes in health status and managing health programs (U.S. Department of Health and Human Services, 1997a).

States are responding to these concerns with initiatives aimed at strengthening their health data infrastructure by improving data quality; developing standards for data definitions, information system configurations, and electronic transmission of data; and linking data systems (see the earlier discussion in this chapter).5 The panel is encouraged to see states taking these steps and believes there is additional value in promoting a national approach to these matters. State-specific solutions may limit the comparability of data across states, and states may miss opportunities to collaborate or to adopt successful strategies developed elsewhere. Likewise, the panel applauds the advances that HIPAA is expected to bring to standards for electronic health care transactions, but also urges support for efforts that will encourage the development of standards for an even broader range of health data elements, such as those likely to be used in performance measures for a variety of publicly funded health programs.

Meeting the Needs of Many Data Users

Many federal health data systems have been designed to provide national-level data for an overall assessment of health status to help guide the planning and implementation of national health policies and programs. At the state and local levels—the "front line" for service delivery—the perspective is somewhat different. Detailed local data are needed to guide planning and program operations, and they have more immediate value than national estimates. Even summary state-level data may lack sufficient detail to be useful for understanding health needs and program outcomes at the local level. For example, data for Illinois as a whole are not likely to provide a satisfactory picture of health status and program activities in either Chicago or a rural county in southern Illinois. Developing a more efficient and effective approach to information systems used to support performance measurement may depend on finding a way to accommodate differing perspectives on several issues.

One concern is the tension between the program-specific perspective that is often the basis for funding and oversight of publicly funded health programs and a more functional perspective on the operation of data systems that focuses on the commonalities among the data collection and management tasks to be performed for many program areas. Categorical grant programs help ensure that funds are directed to specific needs, but they may hinder both a broad view of health and the efficient organization of data systems at the state and local levels.

At the federal level, the programmatic perspective often dominates. The various categorical funding programs often have specialized reporting requirements, and some require the use of independent, customized systems to file those reports (e.g., for HIV/AIDS cases as noted earlier). In contrast, an approach that consolidates data collection systems across program areas can be beneficial at the state and local levels, where limited staff and operational funding can be used more efficiently if similar tasks can be combined. For example, a single ongoing survey such as a state's Behavioral Risk Factor Survey can collect data on such topics as smoking habits, alcohol use, disabling conditions, and mammography use without requiring each program to operate a separate survey.

The specialized data systems developed to meet categorical program requirements tend to have a limited scope and may be costly to maintain. They may require duplication of data collection and management tasks, and if their reporting requirements are incompatible, they may preclude use of a single, more efficient data collection method at the state or local level. Cooperation across programs can provide an opportunity to combine resources from diverse program areas to support similar tasks in data collection and analysis. For example, a single system of notifiable disease surveillance could accommodate reports on AIDS cases and pesticide exposures, or a single ongoing telephone survey of adults could integrate questions about domestic violence and mammography use. For this approach to work, compromises may be needed to balance the interests of diverse program areas. If a single survey that addresses both domestic violence and mammography use is to remain short enough to be practical, it may have to collect less detail on each topic than would be gathered by separate surveys.
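The core-plus-modules design described above can be sketched as follows. The program names and question identifiers are hypothetical, used only to illustrate how one consolidated survey could serve several categorical programs:

```python
# A single survey instrument built from a shared core plus
# separately funded, program-specific modules.
CORE_QUESTIONS = ["smoking_status", "alcohol_use", "disability_status"]

PROGRAM_MODULES = {
    "cancer_screening_program": ["mammography_use"],
    "injury_prevention_program": ["domestic_violence_screening"],
}

def build_instrument(funded_programs):
    """Assemble one questionnaire instead of one survey per program."""
    questions = list(CORE_QUESTIONS)
    for program in funded_programs:
        questions.extend(PROGRAM_MODULES[program])
    return questions

instrument = build_instrument(["cancer_screening_program",
                               "injury_prevention_program"])
```

The compromise noted in the text appears here as a design constraint: each module must stay short enough that the combined instrument remains practical to administer.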

State and local officials and health planners are also concerned about the flexibility and timeliness of data collection and reporting. They require access to current information about the specific populations they serve for effective program implementation and management. Often, data systems managed at the federal level have not been able to respond to these needs. For example, the National Health Interview Survey, National Health and Nutrition Examination Survey, and National Hospital Discharge Survey produce valuable national data, but are not designed to produce state- or local-level estimates. Moreover, state and local health departments and other health-related agencies have had little organized opportunity to participate in shaping the design and content of many federally operated data systems. Without this input, such systems are less likely to be relevant to state and local concerns, and opportunities to improve comparability or coordination across federal, state, and local data systems may be missed. Also, data managed at the national level have often been produced more slowly than is useful for state and local purposes. New computer and communications technologies are reducing the time needed to collect and process data and produce reports, but they may require expertise and equipment that are not yet available in some states and communities. To the extent that federal data systems will be relied upon to meet the need for state and local data for performance measurement, those systems will require the capacity and flexibility to respond in a timely way to state or local information needs. They will also have to ensure that data processing and reporting proceed as expeditiously as possible.

The need for timely data of state or local relevance should not, however, be met at the expense of data quality in terms of validity, reliability, completeness, or accuracy. For example, new survey questions or modules must be validated, and survey staff must be trained to administer them. Concerns at the federal level about the quality and comparability of data produced by states have tended to encourage federal centralization of data systems rather than aggregation of state-level data. Although states acknowledge shortcomings in some areas, they are committed to producing high-quality data. Federal-state collaborations in areas such as vital records data and AIDS case reporting have achieved good quality and comparability in state-based data systems. These collaborations stand as examples for efforts that could be undertaken in other areas, such as enhancing the comparability of states' behavioral risk factor data.

Another source of tension is the burden associated with the reporting requirements for the federal block grants (e.g., the Preventive Health and Health Services Block Grant or the Substance Abuse Prevention and Treatment Block Grant) that provide a portion of the funds used to support state and local health programs. In the past, the reporting requirements associated with these grants have imposed a significant burden because some of the required information is not readily available and is often expensive or time-consuming to obtain. In addition, the reporting requirements of different federal grants are not always consistent across program areas. Constraints on the use of grant funds have also tended to prevent the consolidation of funding to support the development of integrated data systems. The performance partnership concept represents an effort to reduce this burden by making the states partners in a negotiation with the granting agency that leads to the selection of some of the measures to be reported. Plans should be made to assess the impact of this approach on states' reporting burden.

Collaboration in the Design and Implementation of Data Systems

The panel has concluded that a more collaborative approach to the planning, design, and operation of health data systems would better serve the needs of all parties at the federal, state, and local levels. This conclusion is consistent with the views of the Council of State and Territorial Epidemiologists (1997a,b), as reflected in that organization's recommendations in support of a National Public Health Surveillance System and for enhanced usefulness of state and local data collection by the National Center for Health Statistics. Those recommendations included improving access to surveillance data through better coordination of data systems, and planning surveillance and other data collection activities at the state and local levels in a standardized but collaborative fashion that includes local, state, and federal partners from relevant organizations.

The panel's position is also consistent with the follow-up steps proposed as a result of the 1997 review of progress toward the Healthy People 2000 objectives on surveillance and data systems (U.S. Department of Health and Human Services, 1997e). Those proposals included involving state and local governments at every stage of national data collection, analysis, and dissemination; providing easier access to national data sets, including additional geocoding to facilitate subnational analyses; improving coordination of data resources within DHHS and between census and health program data; and giving greater attention to state and local priorities in the development of health objectives for Healthy People 2010.

Collaborative efforts are complicated by the multiplicity of stakeholders across the federal, state, and local levels. No single voice at any of these levels can speak to all of the issues that need to be addressed, and no established framework is currently available for selecting representatives and involving them in deliberations about data system issues (e.g., survey design, question selection). At the federal level, DHHS has a critical leadership role to play in these activities, but it must function as a partner with other stakeholders. Mechanisms are needed for designating recognized representatives of key stakeholder groups and for supporting their participation in formal and informal efforts to improve coordination and collaboration. Currently, opportunities for state officials to meet with their federal counterparts may be lost because funding constraints prevent out-of-state travel. Similarly, states must work in partnership with community-level stakeholders, as well as with relevant federal and private-sector groups, to ensure that community information needs are addressed. For state and local government, the stakeholders include both staff with policy and programmatic responsibilities who use health data and staff with technical expertise in data collection and analysis who produce and manage health data.

Collaboration must be pursued not only in an intergovernmental framework, but also intragovernmentally. Better coordination among federal agencies, both within DHHS and between DHHS and other departments, could contribute to more effective use of available data and data collection systems and help reduce duplication of reporting effort for states and communities. Similarly, greater collaboration among states and communities increases the likelihood that they will be able to learn from each other and develop comparable measures, definitions, and data collection methods for monitoring health programs.

Conclusions
Health-related data are needed for the formulation of health policies and for the optimal targeting of resources to address priority health issues. Recent interest in performance measurement and performance-based accountability has brought renewed and broader attention to many long-standing concerns about these data and the data systems through which they are produced and used. The panel is convinced that this interest could and should be translated into the sustained commitment of time and resources needed to develop a more comprehensive and coherent approach to health data and health data systems that would build effectively on existing data resources and be capable of meeting health information needs at the federal, state, and local levels. The panel has focused primarily on the public-sector perspective, but recognizes that there are closely related private-sector interests and developments that must not be overlooked.

Attention must be given both to operational concerns and to policy issues. On the operational side, one of the most fundamental requirements must be ensuring that good-quality data are available and used in appropriate analyses. To make health data more useful in a broader context, greater consistency and comparability are needed. Key to achieving this objective will be the variety of activities under way to establish standards for the methods used to collect the data; the content and format of data files; the formats for exchanging data electronically; the protection of data privacy, confidentiality, and security; and the measures used to assess performance. Advances in computer technology and electronic data transmission could speed the collection and analysis of data and facilitate access to a broader range of health-related data for many more users.

The fundamental need is for a collaborative partnership across the local, state, and federal levels as a basis for strengthening and better coordinating the health data and information systems needed to support performance measurement.

Mechanisms must allow stakeholders to participate in an ongoing process that encourages them to contribute to policy determinations about what information is to be collected and how it is to be used. Because health issues affect everyone and are addressed in a variety of ways, the panel supports a national approach that recognizes a broad range of interests.

DHHS has an important leadership role to play in furthering these efforts, but all of the participants must share responsibility for ensuring that health data and data systems receive the support they need to operate efficiently and effectively. An investment must be made in the data collection programs and information technology that are at the core of these information systems and in the necessary training and technical assistance for the people who produce and use health data.

References
  • Anderson, J.E., D.E. Nelson, and R.W. Wilson 1998. Telephone coverage and measurement of health risk indicators: Data from the National Health Interview Survey. American Journal of Public Health 88:1392–1395. [PMC free article: PMC1509082] [PubMed: 9736886]
  • Arday, D.R., S.L. Tomar, D.E. Nelson, R.K. Merritt, M.W. Schooley, and P. Mowery 1997. State smoking prevalence estimates: A comparison of the behavioral risk factor surveillance system and current population surveys. American Journal of Public Health 87:1665–1669. [PMC free article: PMC1381131] [PubMed: 9357350]
  • Bailar, J.C., and F. Mosteller, eds. 1992. Medical Uses of Statistics, 2nd ed. Boston: NEJM Books.
  • Broome, C.V., and C.E. Fox 1998. CDC/HRSA Grant Funding Flexibility for Integrated Health Information Systems. Grant funding transmittal letter. April 1, 1998. U.S. Department of Health and Human Services. http://www (April 21, 1998).
  • Centers for Disease Control and Prevention 1997. Case definitions for infectious conditions under public health surveillance. MMWR 46(RR-10).
  • Centers for Disease Control and Prevention and Health Resources and Services Administration 1998. Integrated Health Information Systems Investment Analysis Guide. http://www.hrsa.dhhs.gov/investment.htm#iv (April 21, 1998).
  • Clarke, K.C., S.L. McLafferty, and B.J. Tempalski 1996. On epidemiology and geographic information systems: A review and discussion of future directions. Emerging Infectious Diseases 2(2):85–92. [PMC free article: PMC2639830] [PubMed: 8903207]
  • Council of State and Territorial Epidemiologists 1997a. Implementation of National Public Health Surveillance System. Position statement #EC-1. Adopted June 19, 1997. http://www (September 28, 1998).
  • 1997b. NCHS-State Data Coordination. Position statement #CD-2. Adopted June 19, 1997. http://www (September 28, 1998).
  • Environmental Protection Agency 1998. One Stop Program Strategy and Grant Award Criteria. http://www/onestop/strategy.htm (April 21, 1998).
  • Foundation for Accountability 1998. About FACCT. http://www (April 15, 1998).
  • Gordis, L. 1996. Epidemiology. Philadelphia: Saunders.
  • Gostin, L., Z. Lazzarini, V.S. Neslund, and M.T. Osterholm 1996. The public health information infrastructure: A national review of the law on health information privacy. JAMA 275:1921–1927. [PubMed: 8648874]
  • Hoaglin, D.C., R.J. Light, B. McPeek, F. Mosteller, and M.A. Stoto 1982. Data for Decisions: Information Strategies for Policymakers. Cambridge, Mass.: Abt Books.
  • Iezzoni, L.I., ed. 1994. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, Mich.: Health Administration Press.
  • Iezzoni, L.I. 1997a. Assessing quality using administrative data. Annals of Internal Medicine 127:666–674. [PubMed: 9382378]
  • 1997b. The risks of risk adjustment. JAMA 278:1600–1607. [PubMed: 9370507]
  • Institute of Medicine 1994. Health Data in the Information Age: Use, Disclosure and Privacy. M.S. Donaldson and K.N. Lohr, eds. Committee on Regional Health Data Networks. Washington, D.C.: National Academy Press. [PubMed: 25144051]
  • 1997. The Computer-Based Patient Record: An Essential Technology for Health Care, Revised ed. R.S. Dick, E.B. Steen, and D.E. Detmer, eds. Committee on Improving the Patient Record. Washington, D.C.: National Academy Press.
  • Joint Center for Poverty Research 1998. Administrative Data for Policy Relevant Research: Assessment of Current Utility and Recommendations for Development. V.J. Hotz, R. Goerge, J.D. Balzekas, and F. Margolin, eds. Report of the Advisory Panel on Research Uses of Administrative Data. Evanston, Ill.: Northwestern University/University of Chicago Joint Center for Poverty Research.
  • Joint Commission on Accreditation of Healthcare Organizations 1998. Performance Measurement. http://www​​.htm (July 24, 1998).
  • Kim, I., and K.G. Keppel 1997. Priority Data Needs: Sources of National, State, and Local-Level Data and Data Collection Systems. Healthy People 2000 Statistical Notes, No. 15. Hyattsville, Md.: National Center for Health Statistics. [PubMed: 10620824]
  • Landon, B., L.I. Iezzoni, A.S. Ash, M. Shwartz, J. Daley, J.S. Hughes, and Y.D. Mackiernan 1996. Judging hospitals by severity-adjusted mortality rates: The case of CABG surgery. Inquiry 33:155–166. [PubMed: 8675279]
  • Lasker, R.D., B.L. Humphreys, and W.R. Braithwaite 1995. Making a Powerful Connection: The Health of the Public and the National Information Infrastructure. Report of the U.S. Public Health Service Public Health Data Policy Coordinating Committee. Washington, D.C. http://www​​/pubs/staffpubs/lo/makingpd.html (August 11, 1998).
  • Luft, H.S., and P.S. Romano 1993. Chance, continuity, and change in hospital mortality rates. Coronary artery bypass graft patients in California hospitals, 1983 to 1989. JAMA 270:331–337. [PubMed: 8315777]
  • Maternal and Child Health Bureau 1998. State Systems Development Initiative (SSDI) Grant Application Guidance for FY98. U.S. Department of Health and Human Services, Health Resources and Services Administration. (June 4, 1998).
  • McLellan, A.T., H. Kushner, D. Metzger, R. Peters, I. Smith, G. Grisson, H. Pettinati, and M. Argeriou 1992. The fifth edition of the Addiction Severity Index. Journal of Substance Abuse Treatment 9:199–213. [PubMed: 1334156]
  • Mendelson, D.N., and E.M. Salinsky 1997. Health information systems and the role of state government. Health Affairs 16(3):106–119. [PubMed: 9141327]
  • Munro, N. 1996. Infotech reshapes health care marketplace. Washington Technology 11(August 8):1–3.
  • National Association of State Mental Health Program Directors 1998. State Mental Health Directors Adopt a Framework of Performance Indicators for Mental Health Systems. Press announcement. May 1998. Alexandria, Va.
  • National Association of State Mental Health Program Directors Research Institute 1998. Five State Feasibility Study on State Mental Health Agency Performance Measures. Final Report. Alexandria, Va.: National Association of State Mental Health Program Directors Research Institute.
  • National Committee for Quality Assurance 1996. NCQA Issues Final Technical Specifications for HEDIS 3.0. Press release. October 31, 1996. (April 7, 1998).
  • 1997. HEDIS 3.0/1998. Vol. 4, A Roadmap for Information Systems: Evolving Systems to Support Performance Measurement. Washington, D.C.: National Committee for Quality Assurance.
  • National Committee on Vital and Health Statistics 1997a. Health Privacy and Confidentiality Recommendations. June 25, 1997. Washington, D.C. (June 11, 1998).
  • 1997b. Letter to Secretary Shalala on Unique Health Identifiers. September 9, 1997. Washington, D.C. (June 11, 1998).
  • National Highway Traffic Safety Administration 1996. Why Data Linkage? The Importance of CODES. Washington, D.C.: U.S. Department of Transportation, National Highway Traffic Safety Administration. Available at (April 28, 1998).
  • National Library of Medicine 1998. Fact Sheet: Unified Medical Language System. National Institutes of Health. February 23, 1998. (July 10, 1998).
  • National Research Council 1993. Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics. G.T. Duncan, T.B. Jabine, and V.A. de Wolf, eds. Panel on Confidentiality and Data Access, Committee on National Statistics. Washington, D.C.: National Academy Press.
  • 1997a. Assessment of Performance Measures for Public Health, Substance Abuse, and Mental Health. E.B. Perrin and J.J. Koshel, eds. Panel on Performance Measures and Data for Public Health Performance Partnership Grants, Committee on National Statistics. Washington, D.C.: National Academy Press.
  • 1997b. For the Record: Protecting Electronic Health Information. Committee on Maintaining Privacy and Security in Health Care Applications of the National Information Infrastructure, Computer Science and Telecommunications Board. Washington, D.C.: National Academy Press. [PubMed: 25121276]
  • 1998. Providing National Statistics on Health and Social Welfare Programs in an Era of Change. Summary of a workshop. C.F. Citro, C.F. Manski, and J. Pepper, eds. Committee on National Statistics. Washington, D.C.: National Academy Press. [PubMed: 25101444]
  • Office of the Secretary, U.S. Department of Health and Human Services 1998. National standard health care provider identifier. Proposed rule. May 7. Federal Register 63(88):25320–25357. [PubMed: 10179329]
  • Palmer, R.H. 1997. Process-based measures of quality: The need for detailed clinical data in large health care databases. Annals of Internal Medicine 127:733–738. [PubMed: 9382389]
  • Powe, N., J. Weiner, B. Starfield, M. Stuart, A. Baker, and D. Steinwachs 1996. The development and testing of a claims data-based approach for evaluating the care of patients with chronic illnesses. Medical Care 34:798–810. [PubMed: 8709661]
  • Public Health Foundation 1998. Measuring Health Objectives and Indicators: 1997 State and Local Capacity Survey. Washington, D.C.: Public Health Foundation.
  • Ries, L.A.G., C.L. Kosary, B.F. Hankey, A. Harras, and B.K. Edwards, eds. 1997. SEER Cancer Statistics Review, 1973–1994. NIH Pub. No. 97-2789. Bethesda, Md.: National Cancer Institute.
  • Rothman, K.J. 1986. Stratified analysis. Pp. 176–236 in Modern Epidemiology. Boston: Little, Brown.
  • Rubin, H., M.W. Jenckes, M. Stuart, and M. Wickham 1994. Maryland Medicaid Recipients' Ratings of Prenatal and Pediatric Care by HMOs and Fee-for-Service Providers. Paper presented at 122nd meeting of the American Public Health Association (abstract, p. 324). Washington, D.C.
  • Seitz, F., and B. Jonas 1998. Operational Definitions for Year 2000 Objectives: Priority Area 6, Mental Health and Mental Disorders. Healthy People 2000 Statistical Notes, No. 16. Hyattsville, Md.: National Center for Health Statistics. [PubMed: 10620825]
  • Starfield, B., N. Powe, J. Weiner, M. Stuart, D. Steinwachs, S. Scholle, and A. Gerstenberger 1994. Costs vs. quality in different types of primary care settings. JAMA 272(24):1903–1908. [PubMed: 7990241]
  • Starr, P. 1997. Smart technology, stunted policy: Developing health information networks. Health Affairs 16(3):91–105. [PubMed: 9141326]
  • Starr, P., and S. Starr 1995. Reinventing vital statistics: The impact of changes in information technology, welfare policy, and health care. Public Health Reports 110:534–544. [PMC free article: PMC1381625] [PubMed: 7480607]
  • Strouse, R., J. Hall, B.L. Carlson, and J. Cheng 1997. Impact of the Nontelephone Sample on Family Health Insurance Survey Estimates. Report submitted by Mathematica Policy Research, Inc., to the Robert Wood Johnson Foundation. March 3, 1997. Princeton, N.J.
  • Stuart, M. 1994. Redefining boundaries in the financing and care of diabetes: The Maryland experience. Milbank Quarterly 72(4):679–694. [PubMed: 7997223]
  • 1995. Public health issues in the development of centralized health care databases. Journal of Public Health Management and Practice 1(2):31–38. [PubMed: 10186593]
  • Stuart, M., D. Steinwachs, J. Harlow, and M. Fox 1990. Ambulatory practice variation in Maryland: Implications for Medicaid cost management. Health Care Financing Review (Annual) December:57–67. [PMC free article: PMC4195161] [PubMed: 10113498]
  • Svikis, D., M. McCaul, T. Feng, M. Stuart, M. Fox, and E. Stokes 1998. Drug dependence during pregnancy: Effect of an on-site support group. Journal of Reproductive Medicine 43:799–805. [PubMed: 9777620]
  • Thacker, S.B., and D.F. Stroup 1994. Future directions for comprehensive public health surveillance and health information systems in the United States. American Journal of Epidemiology 140:383–397. [PubMed: 8067331]
  • Turner, C.F., L. Ku, S.M. Rogers, L.D. Lindberg, J.H. Pleck, and F.L. Sonenstein 1998. Adolescent sexual behavior, drug use, and violence: Increased reporting with computer survey technology. Science 280(5365):867–873. [PubMed: 9572724]
  • U.S. Department of Health and Human Services 1991. Healthy People 2000: National Health Promotion and Disease Prevention Objectives. DHHS Pub. No. (PHS) 91-50212. Washington, D.C.: Office of the Assistant Secretary for Health.
  • 1997a. 1997 Strategic Plan. September 30, 1997. Washington, D.C.
  • 1997b. Confidentiality of Individually-Identifiable Health Information. Recommendations of the Secretary of Health and Human Services, pursuant to Section 264 of the Health Insurance Portability and Accountability Act of 1996. September 11, 1997. Washington, D.C.
  • 1997c. Health Insurance Portability and Accountability Act of 1996 Administrative Simplification. Fact sheet. February 13, 1997. (June 29, 1998).
  • 1997d. Major Provisions of the Personal Responsibility and Work Opportunity Reconciliation Act of 1996 (P.L. 104-193). April 28, 1997. (July 8, 1998).
  • 1997e. Progress review: Surveillance and data systems. Prevention Report 12(1).
  • 1997f. Technical Assistance in the U.S. Department of Health and Human Services. Report of the Technical Assistance and Training Liaison Work Group. July 1997. Washington, D.C. (April 13, 1998).
  • 1998a. Leading Indicators for Healthy People 2000. Report from the HHS Working Group on Sentinel Objectives. Washington, D.C.: U.S. Department of Health and Human Services.
  • 1998b. Registry of State-Level Efforts to Integrate Health Information. (February 9, 1998).
  • U.S. General Accounting Office 1996. Executive Guide: Effectively Implementing the Government Performance and Results Act. GAO/GGD-96-118. Washington, D.C.: U.S. Government Printing Office.
  • 1997. Managing for Results: Analytic Challenges in Measuring Performance. GAO/HEHS/GGD-97-138. Washington, D.C.: U.S. Government Printing Office.
  • U.S. Public Health Service 1979. Healthy People: Surgeon General's Report on Health Promotion and Disease Prevention. DHEW (PHS) Pub. No. 79-55071. Washington, D.C.: U.S. Department of Health, Education, and Welfare.



The Drug Evaluation Network Study and its use of the Addiction Severity Index were described to the panel by Thomas McLellan (professor of psychiatry at the University of Pennsylvania and senior scientist at the Veterans Administration Center for Studies of Addiction at the University of Pennsylvania) and colleagues at a workshop held by the panel in July 1997.


The Association of State and Territorial Health Officials, the National Association of County and City Health Officials, the National Association of Local Boards of Health, and the Public Health Foundation are collaborating in a federally funded project aimed at developing a methodology for measuring state and local public health expenditures in support of the essential public health functions (see Chapter 3 for a list of these functions). The results of this project are expected to lead to better information about public health spending, but may not directly address investment in information systems. Descriptions of the project can be found at <> and <>.


This information was reported to the panel in the background paper "Improving Federal-State Data Collection to Monitor Program Performance Measures," which was prepared by the Science and Epidemiology Committee of the Association of State and Territorial Chronic Disease Program Directors and the Council of State and Territorial Epidemiologists.


This suggestion was also made to the panel in the background paper "Improving Federal-State Data Collection to Monitor Program Performance Measures."


A summary of state efforts to integrate health information was compiled by DHHS and The Lewin Group. Information on activities in each state can be found at <>.

Copyright 1999 by the National Academy of Sciences. All rights reserved.
Bookshelf ID: NBK231008