NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Institute of Medicine (US) Roundtable on Value & Science-Driven Health Care. Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary. Washington (DC): National Academies Press (US); 2010.


Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary.




Because of their potential to enable the development of new knowledge and to guide the development of best practices from the growing sum of individual clinical experiences, clinical data represent the resource most central to healthcare progress (Arrow et al., 2009; Detmer, 2003). Whether captured during product development activities such as clinical research trials and studies, or as a part of the care delivery process, these data are fundamental to the delivery of timely, appropriate care of value to individual patients, and essential to building a system that continually learns from and improves upon care delivered. The opportunities for learning from practice are substantial, from improved understanding of the effects of different treatments and therapies in specific patient subpopulations, to developing and refining practices to streamline or tailor care processes for complex patients, to the development of a delivery system that can advance the evidence base on novel diagnostic and therapeutic techniques (Hrynaszkiewicz and Altman, 2009; Nass et al., 2009; NRC, 2009; Safran, 2007). Furthermore, U.S. per capita healthcare costs are now nearly double those of comparable nations (Health care spending in the United States and OECD countries, 2007), and broader access to and use of existing and future clinical data may be a key opportunity to better understand and address system-wide factors, such as waste and inefficiencies, that contribute to rising healthcare expenditures.

Clinical data now reside in many often unconnected and inaccessible repositories, making linkage, analysis, and interpretation of these data challenging on an individual or population level. The increase in potentially interoperable electronic and personal health datasets (integrated with laboratory values, diagnostic images, and patient demographic information and preferences) and the development of approaches to link and network these data offer even greater opportunity to create and use rich data resources to help transform healthcare delivery and improve the public’s health. Concerns about privacy of health data, as well as the treatment of medical data (even those generated with public funds) as proprietary goods, pose additional challenges to data use (Blumenthal, 2006; Nass et al., 2009; Editors, Nature, 2005; Ness, 2007; Piwowar et al., 2008).

The utility of clinical data as a transformative agent in the U.S. healthcare system was the focus of the February 2008 Institute of Medicine (IOM) workshop, Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good. Issues motivating discussion included the potential for clinical data as a resource for continuous learning and a key component of an efficient healthcare system; the opportunities presented by vastly larger and potentially interoperable data resources, particularly those developed with public funds; the challenges and barriers to more appropriate use of these resources (e.g., related to the fragmentation of data, proprietary nature of data, and privacy concerns); the lag of public policy development and public awareness of and attention to these issues; and the need to address key issues, including the extent to which data constitute a public good (Box S-1).



BOX S-1 Issues Motivating Discussion. Discovering what works best in medical care, including for whom and under what circumstances, requires that clinical data be carefully nurtured as a resource for continuous learning. Transformational opportunities ...

During the 2-day workshop, participants representing a variety of healthcare perspectives reviewed current use of data for benchmarking and generating new clinical and operational insights, and discussed a sampling of innovative efforts to aggregate data for greater insights. In evaluating the current marketplace for care data, participants presented opportunities to increase access to and sharing of health information as private and public goods, while devoting particular attention to legal and social aspects of privacy and security of healthcare data. The workshop addressed multiple health-sector perspectives in the identification of specific policy areas for developing strategies and next-generation health data systems. Engaging the public in the advances necessary to develop a learning health system was viewed as a particularly important area for further discussion.

The IOM Roundtable and the Clinical Data Utility

Convened by the IOM in 2006, the Roundtable on Value & Science-Driven Health Care (formerly the Roundtable on Evidence-Based Medicine) serves as a mechanism for bringing stakeholders from multiple sectors together to evaluate means through which improving the generation and application of evidence will accelerate progress toward an efficient, effective U.S. medical care system. These stakeholders span the realm of health care, and include patients, employers, health product manufacturers, payers, policy makers, providers, and researchers. As a guiding principle for the Roundtable, decisions shaping American health and health care will draw from a proven evidence base, appropriately accommodate patient variation, and simultaneously generate additional insight into clinical effectiveness.

Roundtable participants established a goal that, by the year 2020, 90 percent of clinical decisions will be supported by accurate, timely, and up-to-date clinical information and will reflect the best available evidence. Central to this goal is the development of a learning health system designed to generate the best evidence for the collaborative healthcare choices of each patient and each provider; to drive the process of discovery as a natural outgrowth of patient care; and to ensure innovation, quality, safety, and value in health care. The broader availability and use of clinical data is an essential component of a learning system given the large potential for gains in the efficiency, quality, and safety of the care delivered; however, such a shift will have systemwide implications: drawing upon resources in each sector, and requiring cross-sector cooperation and discussion to ensure the appropriate development, support, and use of these resources.

The Roundtable’s Learning Health System series of workshops and publications are opportunities to foster the broad cross-sector discussions needed to better characterize the key elements, barriers, and needs of a transformed healthcare system. Each workshop is summarized in a publication available through the National Academies Press. Workshops and publications in this series since 2006 include:

  • The Learning Healthcare System
  • Judging the Evidence: Standards for Determining Clinical Effectiveness
  • Leadership Commitments to Improve Value in Health Care: Finding Common Ground
  • Redesigning the Clinical Effectiveness Research Paradigm: Innovation and Practice-Based Approaches
  • Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good
  • Engineering a Learning Healthcare System: A Look at the Future
  • Learning What Works: Infrastructure Required to Learn Which Care Is Best
  • Value in Health Care: Accounting for Cost, Quality, Safety, Outcomes, and Innovation
  • The Healthcare Imperative: Lowering Costs and Improving Outcomes—A Four-Part Workshop Series

This publication summarizes the proceedings of the sixth workshop in the series, Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good. A summary chapter includes highlights from each workshop session; manuscripts submitted by each speaker and panel discussion summaries can be found in the subsequent chapters. Two keynote presentations, included in Chapter 1, titled “Clinical Data as the Basic Staple of the Learning Health System” and “Creating a Public Good for the Public’s Health,” offered critical context for the workshop. The first day of the 2-day workshop also featured presentations that profiled data in the current healthcare system (Chapter 2), provided an overview of innovative efforts to use data (Chapter 3), evaluated the public and private natures of healthcare data (Chapter 4), and discussed issues related to privacy and security (Chapter 5). The second day featured a panel discussion on policy opportunities (also in Chapter 5) and presentations and discussions that identified next-generation data utilities (Chapter 6). The workshop concluded with a focus on engaging the public in efforts to use clinical data for insights (Chapter 7) and some final observations on meeting themes and potential follow-on activities (Chapter 8). The workshop agenda, biographical sketches, and a list of participants are located in the appendixes.


Apart from shedding light on the issues that impede or challenge improved data utility, the discussion identified a rich array of ideas for accelerating progress toward better application of data. Across the 2 days of presentations and discussion, a compelling set of recurring themes emerged for follow-on attention.

BOX S-2 Workshop Common Themes

  • Clarity on the basic principles of clinical data stewardship.
  • Incentives for real-time use of clinical data in evidence development.
  • Transparency to the patient when data are applied for research.
  • Addressing the market failure for expanding EHRs.
  • Personal records and portals that center patients in the learning process.
  • Coordinated EHR user organization evidence development work.
  • The business case for expanded data sharing in a distributed network.
  • Assuring publicly funded data are used for the public benefit.
  • Broader semantic strategies for data mining.
  • Public engagement in evidence development strategies.

  • Clarity on the basic principles of clinical data stewardship. The starting point for expanded access and use of clinical data for knowledge development is agreement on some of the fundamental notions to guide the activities for all individuals and organizations with responsibility for managing clinical data. Workshop participants repeatedly mentioned the need for consensus on approaches to such issues as data structure, standards, reporting requirements, quality assurance, timeliness, deidentification or security measures, and access and use procedures, all of which will determine the pace and nature of evidence development.
  • Incentives for real-time use of clinical data in evidence development. Current barriers to the real-time use of clinical data for new knowledge discussed at the workshop ranged from regulatory and commercial issues to cost and quality issues. Participants suggested the need for a dedicated program of activities, incentives, and strategies to improve the methods and approaches, their testing and demonstration, the cooperative decision making on priorities and programs, and the collective approach to regulatory barriers.
  • Transparency to the patient when data are applied for research. Patient acceptance is key to use of clinical data for knowledge development, and patient engagement and control are key to acceptance. In this respect, clarity to individual patients on the structure, risks, and benefits of access to data for knowledge development was noted by participants as particularly important. Patient confidence and system accountability may be enhanced through transparent notification and audit processes in which patients are informed of when and by whom their information has been accessed for knowledge development.
  • Addressing the market failure for expanding electronic health records. Currently, market incentives are not enough to bring about the expansion of use of electronic health records necessary to make the point of care a locus for the development, sharing, and application of knowledge about what works best for individual patients. Shortfalls noted by participants included demand by providers or patients that is not sufficient to counter the expense to small organizations; competing platforms and asynchronous reporting requirements that work against the utility of these records for broad quality and outcome determinations; and the fact that even the larger payers, apart from government, do not possess the critical mass necessary to drive broader scale applicability and complementarity. A deeper, more directed, and coordinated strategy involving Medicare leadership will likely be needed to foster such changes.
  • Personal records and portals that center patients in the learning process. Patient demand could be instrumental in spreading the availability of electronic health records for improving patient care and knowledge development. Such demand will depend on much greater patient access to, comfort with, and regular use of programs that allow either the maintenance of personal electronic health records or access through a dedicated portal to their provider-maintained electronic medical record. As noted during the workshop, many consumer-oriented products under development give patients and consumers more active roles in managing personal clinical information. These may help to demonstrate value in the speed and ease of personal access to the information, better accommodate patient preference in care, and foster a partnership spirit conducive to the broader application of electronic health records (EHRs).
  • Coordinated EHR user organization evidence development work. The development of a vehicle to enhance collaboration among larger EHR users of different vendors was raised during the workshop as a means to accelerate the emergence of more standardized agreements and approaches to integrating and sharing data across multiple platforms, common query strategies, virtual data warehousing rules and strategies, relational standards, and ways to reduce misperceptions about regulatory compliance issues.
  • The business case for expanded data sharing in a distributed network. Demonstrating the net benefits of data sharing could promote its use. Benefits suggested by participants included cost savings or avoidance from facilitated feedback to providers on quality and outcomes; quick, continuous improvement information; and improved management, coordination, and assessment of patient care.
  • Assuring publicly funded data are used for the public benefit. Federal and state funds that support medical care and support insights into medical care through clinical research grant funding are the source of substantial clinical data, yet many participants observed that these resources are not yet effectively applied to the generation of new knowledge for the common good.
  • Broader semantic strategies for data mining. Platform incompatibilities for clinical data substantially limit the spread of electronic health records and their use for knowledge development. Yet discussion identified strategies using alternative semantic approaches for mining clinical data for health insights, which may warrant dedicated cooperative efforts to develop and apply them.
  • Public engagement in evidence development strategies. Generating a base of support for and shared emphasis on developing a healthcare ecosystem in which all stakeholders play a contributory role was noted by many participants as important for progress. Ultimately, the public will determine the broad acceptance and applicability of clinical data for knowledge development, underscoring the importance of keeping the public closely involved and informed on all relevant activities to use clinical data to generate new knowledge.


Presentations at the Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good workshop included perspectives from healthcare sectors and beyond on the current state of clinical data and data systems, the implications of healthcare data as a public good, and potential opportunities for improving their collection and use. These workshop presentations served as the genesis of the papers that constitute the chapters that follow, and are summarized below.

Clinical Data as the Basic Staple of Health Learning

Clinical data consist of information ranging from determinants of health and measures of health and health status to documentation of care delivery. These data are captured for a variety of purposes and stored in numerous databases across the healthcare system. Advances in health information technology (HIT) and analytics raise the potential for these data to be used to fill substantial knowledge gaps in health care, but complicating the needed aggregation and use of these data are technical, cultural, and legal barriers. Although efforts are underway to address technical barriers and privacy concerns, many have suggested that a shift is needed to a data-sharing culture in which clinical data are considered a basic staple of health learning. Chapter 1 provides a brief overview of these issues and includes summaries of the context-setting remarks of the workshop’s two keynote speakers. David Brailer’s presentation profiled current clinical data collection and use, and offered his perspective on future applications that would improve care delivery, research, and health outcomes; and Carol Diamond offered her perspective on what might be possible if data were treated as a public good and identified several policy and technical issues important to achieving this vision.

Clinical Data as the Basic Staple of the Learning Health System

Brailer, Chair of Health Evolution Partners, was appointed in 2004 as the first national coordinator for health information technology. He served in that position until 2006. He described the potential of clinical data as a key building block for a learning health system by profiling the utility of clinical data currently available, as well as what might be possible if all data sources could be readily and reliably drawn upon for new insights into healthcare effectiveness. A significant gap exists between the potential utility of clinical data and how data are treated in the current healthcare market—where clinical information is proprietary and used for strategic benefit. A question important for progress is whether clinical data are a public or private good.

Brailer noted that significant progress has been made in the past few years to broaden adoption of HIT. Advances in HIT certification and standardization efforts have produced portable health information and enabled exchange of significant volumes of information. Moreover, many hospitals have made progress in implementing electronic records, as have many physicians, especially those in large group practices. Emphasizing the important role of engaging the public, Brailer discussed opportunities for HIT companies such as Microsoft and Google to interact with the public and to raise the potential for data access and use for health improvement.

Yet, misaligned incentives, based on current systems of reimbursement, and an outdated privacy paradigm currently hinder progress. Brailer suggested that the development of a framework for privacy that recognizes the dynamic, portable, and compounding nature of health information assets is a necessary first step to facilitating greater data sharing and use. A second major challenge is ensuring that information developed via data sharing efforts is truly useful to clinical care, and in this respect, there is a major tension between adoption and interoperability. Interoperability, or the capacity to share, integrate, and apply health information from disparate sources, has been the principal priority of the nation’s health information agenda. However, as the push for adoption gains momentum, health information tools may move into broad use that are not well suited to also supporting the interoperability needed for a learning health system.

Many aspects of the data and data systems essential for the development and assembly of coherent, representative, timely, and valid information that can inform decisions at patient or population levels are understood. However, leadership is needed to help maintain a focus on developing and using data to make health care smarter in the face of competing near-term priorities, especially given the many challenges faced by providers, payers, and consumers in terms of access and cost. Progress in efforts to make clinical data structured, intelligent, useful, assembled, and applied in a way that makes care better requires a sharper focus on data stewardship. Under one scenario, health information could become a true public good that is not proprietary; alternatively, clinical information could become a private good, used differentially for comparative advantage.

Nothing in federal or state statutes, regulations, or other guidance confers control of health information on any data originator (e.g., provider, hospital, manufacturers); yet, in practice, clinical information today is largely a private good, controlled by data producers. Brailer observed that the Health Insurance Portability and Accountability Act (HIPAA) enables de facto provider control over health information as patients cannot direct that their information be sent to a third party, nor are providers obligated to make data available in a timely or convenient format. Other regulations also create barriers to portable, available, and acceptable health information. Brailer described the potential implications of proprietary use versus transparency of health information for health system stakeholders, and noted that it should be feasible to create a system in which providers can gain advantage from being performance driven, yet not gain advantage from exclusive use of health information as a private good.

Ultimately, advancing the notion of clinical data as a public good is essential to a healthcare system that learns. Such efforts also offer the potential to extend the benefits of the information revolution—already experienced by many other industries—to health care by providing the power of choice to consumers. Brailer contended there is a limited window of opportunity available to achieve this end through technical and policy advances that make data more useful and valuable while also ensuring equitable access and maintaining a competitive marketplace. Clarification of data stewardship is needed to promote a shared understanding and transparency with respect to data control, ownership, and access.

Vision for the Future: Creating a Public Good for the Public’s Health

Carol Diamond, managing director of the Markle Foundation’s Health Program, delivered the keynote address on the second day of the workshop. She outlined a vision for clinical data positioned as a public good and provided guidance on the technical and policy issues required to build the public trust necessary to achieve this vision. The work of the public–private collaborative, Connecting for Health (CFH), was discussed to illustrate key opportunities to develop a health information sharing environment that seeks to improve the quality and cost effectiveness of health care. The work of CFH to improve how information is used to address research, public health, and quality measurement was emphasized.

Achieving population health goals requires analysis, decision support, and feedback loops embedded throughout the system. However, as revealed by the significant challenges of collecting, cleaning, and analyzing health data for existing data reporting demands, progress will require a new approach to collecting, accessing, and using health information. To guide system development, Diamond suggested the need to consider three central requirements for responsible information policies: fulfilling seven core privacy principles (openness and transparency, purpose specification, collection and use limitation, individual participation and control, data integrity and quality, and security safeguards and controls); ensuring sound network design; and enabling accountability and oversight. As the needs of information users constantly evolve, Diamond also raised the importance of developing a flexible information technology architecture that is adaptable to different users, data sources, and research methods. A vision for a 21st-century approach to information sharing for public health was illustrated through nine “First Principles for Population Health”: (1) designed for decisions; (2) designed for many; (3) shaped by public policy goals and values; (4) boldly led, broadly implemented; (5) possible, responsive, and effective; (6) distributed, but queriable; (7) trusted through safeguards and transparency; (8) layers of protection; and (9) accountability and enforcement of good networking citizenship.

To convey the potential impact of a 21st-century vision for data, Diamond offered three scenarios for how decision making by providers, consumers, and policy makers might be enhanced by broader access to information grounded in reliable evidence. Realizing this vision will require moving to a new paradigm for health information in which, instead of collecting data in centralized databases for research, questions are brought to the data. Such an approach would emphasize the specific information needs of decision makers; a networked approach that supports efficient research analyses and allows data to remain distributed; and greater involvement of consumers as participants and producers of information.
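The idea of bringing questions to the data can be illustrated with a minimal sketch. It is not drawn from the workshop or from Connecting for Health; the `Site` class, the `distributed_count` function, the site names, and the records are all hypothetical and invented for illustration. The sketch assumes each data holder answers a question locally and returns only an aggregate count, so patient-level records never leave the site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Site:
    """Hypothetical data holder that keeps patient-level records locally."""
    name: str
    records: List[Record]

    def count(self, predicate: Callable[[Record], bool]) -> int:
        # The question runs where the data live; only a count is returned.
        return sum(1 for record in self.records if predicate(record))


def distributed_count(sites: List[Site],
                      predicate: Callable[[Record], bool]) -> Dict[str, object]:
    """Send the same question to every site and combine the aggregate answers."""
    per_site = {site.name: site.count(predicate) for site in sites}
    return {"per_site": per_site, "total": sum(per_site.values())}


# Invented example data; not real patients or real organizations.
sites = [
    Site("clinic_a", [
        {"age": 70, "on_statin": True},
        {"age": 45, "on_statin": False},
    ]),
    Site("hospital_b", [
        {"age": 66, "on_statin": True},
        {"age": 81, "on_statin": True},
        {"age": 59, "on_statin": True},
    ]),
]


def older_statin_user(record: Record) -> bool:
    # Example question: patients aged 65 or older who take a statin.
    return record["age"] >= 65 and bool(record["on_statin"])


result = distributed_count(sites, older_statin_user)
print(result)
```

A real network would also need the safeguards, accountability, and oversight that Diamond described; the sketch only shows the direction of data flow, with questions travelling to the sites and aggregates travelling back.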

U.S. Healthcare Data Today: Current State of Play

The first set of workshop sessions provided an overview of existing healthcare data—the sources, types, accessibility, and uses in the United States. In an exploration of example initiatives in the current healthcare marketplace that collect and use these data, presentations considered factors motivating the work and profiled elements of the system from different perspectives. Issues considered included the accessibility of data for new clinical insights, the extent of current uses of clinical data, and barriers to the advancement of next-generation data applications. The manuscripts in Chapter 2 reflect opportunities present within the current healthcare data profile to assess and manage clinical outcomes, as well as to glean new healthcare insights through the use of data from public and private sources.

Current Healthcare Data Profile

When discussing elements associated with evidence-based medicine or when defining the data or the taxonomies regarding health and health care, the healthcare community does not always consider all of the potential effects on health. As evidence-based medicine is more fully adopted, it will be important to evaluate all facets of evidence development and application necessary to transform health care. Simon P. Cohn, chair of the National Committee on Vital and Health Statistics (NCVHS) and associate executive director of the Permanente Federation at Kaiser Permanente, provided an overview of the current state of healthcare data. At the broadest levels, available data inputs include biomedical and genetic factors, individual health status and health behaviors, and socioeconomic and environmental factors, along with information about health resources, healthcare use, healthcare financing and expenditures, and healthcare outcomes. In aggregate, the scope and variety of healthcare data have the potential to significantly shape the future of research and care delivery.

Clinical data today tend to be distributed widely across healthcare systems, patients, manufacturers, and researchers. Data are fragmented rather than integrated. To achieve the goal of using data to draw on evidence for patient and provider decisions and to promote increased comparability, interoperability, and standardization of data, improved terminologies and classifications of available and future data are required. Better analytical tools, networking, and data sharing are needed to leverage access to administrative and clinical data. In addition, better means of providing meaningful data to clinicians on demand for increased integration into practice for clinical decision support will accelerate adoption. Overall, Cohn suggested that a national strategy, supported by adequate funding, is needed to address these problems and fill gaps in data use.

Recommendations made through the NCVHS are relevant to these goals. In the area of health data stewardship, for example, the NCVHS suggested that covered entities be more specific about what data will be used, how, and by whom. Recognizing that transparency is very important to consumers, the NCVHS recommends that individuals should be able to request and be given information about the specific uses and users of their data. The NCVHS also suggests that data stewardship principles should be extended to include personal health data held by non-covered entities in personal health records (PHRs) and similar instruments.

Data Used as Indicators for Assessing, Managing, and Improving Health Care

Massachusetts Health Quality Partners (MHQP) is a multistakeholder coalition that measures and reports on physician performance using health plan claims data. Speaking to the benefits and challenges of using large aggregated databases for performance measurement, MHQP Executive Director Barbra Rabson shared some of the organization’s experiences using data to assess, manage, and improve health care.

MHQP aggregates healthcare data to enhance transparency in measuring and reporting on physician performance. The organization develops reports for both doctors and consumers on provider performance at the physician network, medical group, practice site, and individual physician levels. On MHQP’s website, for example, consumers can compare primary care physicians and medical groups in the state on preventive care service and chronic disease management measures. The organization also provides information for consumers on conditions, measurement, and suggestions for what both patients and clinicians might do to improve care and outcomes. Rabson observed that to date, MHQP data reporting has had greater impact on physician behavior than on consumer behavior. Massachusetts physicians have improved over the past 4 years on eight of nine measures. The public release of the data has influenced physician organizations’ investments in information systems, and MHQP continues to develop strategies to engage consumers with the measures to support quality and incentives for individual physicians. Although consumers access MHQP’s website, the overall impact on consumer behavior is unclear. Information gathered in consumer focus groups indicates that perhaps consumers do not always value the information made available—for example, one woman preferred to know whether a physician would be likely to deliver treatment in a patient-centered, respectful way rather than how well physicians provided breast cancer screening. Rabson noted that this suggested the need for new types of measures and data sources that provide more meaningful information to consumers.

Based on the experience of MHQP, Rabson cited some of the challenges associated with creating quality measures from electronic clinical information, including the difficult trade-offs and tensions between offering physicians flexibility to enter data and standardization of data for easier data capture; the lack of standards in data definitions and terms; the lack of standardization across vendors; and the absence of required elements from EHR data. Encrypted patient identifiers, a mechanism for protecting patient privacy, make it difficult to provide patient-specific feedback to physicians. Ideally, clinical claims and personal data would be integrated for quality improvement; Rabson cited important work at the national level by the American Health Information Community toward this goal to define how health information technology can effectively support quality improvement. At the local level, MHQP and the Massachusetts eHealth Collaborative have been designated as the Massachusetts Chartered Value Exchange with a goal to integrate quality and HIT. Additionally, MHQP is one of six organizations selected to be part of the Centers for Medicare & Medicaid Services-funded Better Quality Information project, which involves the aggregation of claims and other clinical data from commercial payers and Medicare. MHQP is also a lead partner in a project to implement and measure the impact of EHRs in three Massachusetts communities, and to use EHRs as a data source for clinical quality measurement.

Data Primarily Collected for New Insights

Clinical researchers and epidemiologists attribute success in understanding and discovering advances in health care to the ability to collect, sort, and analyze increasingly vast amounts of numerical data. Currently, clinical and public health scientists have at least three major types of data available to them:

  • Data based on clinical care that come from electronic health records, clinic-based administrative datasets, and government payer datasets;
  • Large-scale registries generated and maintained by government entities, professional societies, and the private sector; and
  • Clinical trials, both publicly and privately funded.

Despite the wealth of data available to researchers and policy makers, a number of major limitations hinder researchers’ use of data, observed Michael S. Lauer, director of the Division of Prevention and Population Sciences at the National Heart, Lung, and Blood Institute. Although soon to change, currently relatively few American clinicians use computers to document care, and even when they do, much of the imported data are unstructured narrative text that is challenging to analyze. Most data generated today are based on nonrandomized observations drawing from the care delivery experience. Although some examples demonstrate that it is possible to incorporate rigorous and prospective data collection into routine clinical care, most clinical data are not collected at the point of care in a manner that is easily retrievable later. Access to data varies. Some datasets are widely available, while others are only available to personnel working at specific clinical sites or for specific sponsors.

As Lauer noted, the Roundtable’s goal of integrating evidence-based medicine into routine clinical practice depends on the use of data as a staple for developing scientifically sound guidelines. If the Roundtable’s goal is to be realized, Lauer suggested, clinical data must be recognized as a staple that should be widely available and integrated across sites and practices. As a caution, even if a “data paradise” could be achieved with universally obtained and available clinical data, policy leaders should take care not to place too much reliance on these largely observational datasets for generating evidence-based recommendations. Even though modern statistical techniques and collection of more data elements may reduce biases, observational analyses of treatments must be recognized as inherently biased because of failure to take into account selection biases and unmeasured confounders. Lauer also emphasized the importance of well-designed experiments for building a scientific base to support evidence-based medicine.

Health Product Marketing Data

Significant amounts of data are available and used by public and private organizations to better understand public attitudes and consumer trends in the healthcare marketplace. William D. Marder from Thomson Healthcare Administrative identified three major types of data used by public and private entities to market healthcare products and services: health survey data, information about general consumption patterns, and administrative data generated by the healthcare delivery system. Much of the information about patient/consumer attitudes comes from health survey data (private and public) combined with general consumption pattern and market segmentation data. Administrative data are often used in retrospective database studies to examine the cost effectiveness of interventions in the general population. Marder described how these data comprise an information base that organizations often use to develop effective communications with the public and the business models that support collection of the data.

A number of healthcare entities engage in marketing (e.g., hospitals, pharmaceutical companies, device suppliers, government agencies, physicians) that relies heavily on information that helps target marketing to specific consumers. One example is the Thomson PULSE Survey that models healthcare use as a function of household and neighborhood characteristics. Surveying 100,000 random households per year, PULSE results are identifiable by respondents’ Census tracts and linkable to other Census tract data, including socioeconomic characteristics of a particular area and lifestyle modeling done by general marketing firms. These models can be used to drive healthcare marketing and planning decisions. Marder illustrated how these resources might be used to find the best groups for clinical trial participation.

Resources that provide retail store sales, billing services, health plans, and employer-based data are sometimes incomplete, but can be useful for marketing data analyses. Claims data can be applied tactically to identify the effect of marketing campaigns and measure sales-force effectiveness. At a more strategic level, claims data can offer insights for evaluating unmet medical needs, understanding the cost of acquiring a drug in a broader context, pricing new products, gaining a favorable formulary position, and convincing prescribers about the value of a drug.

However, data collection is often expensive. For surveys such as PULSE, revenue streams to offset data collection costs come from use of the data in marketing and planning tools sold to providers and suppliers. Revenue also covers licensing of general marketing information. As for the funding of administrative data, the costs of retail and product-switch data are largely covered by pharma. Health plan and employer data are largely covered by the operations of payer organizations, with additional support from consultants serving many organizations, including pharma, government, benefits consultants, and reinsurance companies.

Licensing data in a for-profit setting has certain benefits. Licensing helps customers achieve their goals by making data easy to use and to sort according to their interests. For license holders, the process of licensing offers the capability to recoup some costs of developing the data. A key consideration is how to best manage the intersection and interaction of private data assets and academic research. Despite the challenge of balancing the costs and benefits of making data available to researchers, Marder suggested it is essential that channels be maintained to make data available at no charge to academic audiences and to ensure access to data for replication of results.

In sum, marketing data can be seen as a synergy of inputs and interests from a variety of entities. The public sector provides raw material and models of data collection, at minimal cost. The private sector builds databases with clear commercial value that fill needs suggested by, but not covered by, public sources. As electronic medical record systems become more common, a blend of databases can be envisioned to draw on both public and private data sources—the mix will depend on government willingness to fund aggregation.

Changing the Terms: Data System Transformation in Progress

Building on workshop discussion that described the current landscape of clinical data, several presentations explored the evolution of the national data utility by highlighting efforts to coordinate clinical data (through large linked sets, aggregated data, and registries) and to make medical care data more readily available and usable. Speakers described incentives and drivers that push this evolution, including integration dynamics, as well as disincentives, including the shortfalls, limitations, and challenges of various approaches to organizing and aggregating data.

Emerging Large-Scale Linked Data Systems and Tools: The Example of caBIG

The complexity of cancer research is reflected by the many, widely differing diseases categorized as cancer. Understanding the molecular mechanisms behind these diseases is an endeavor that must involve many individuals, laboratories, and institutions across an array of specialties and subspecialties and on an international scale. The Cancer Biomedical Informatics Grid (caBIG™) aims to help provide the resources needed for such research. It was developed in response to demand at the National Cancer Institute (NCI) for a more highly coordinated approach to informatics resource development and management. As described by Peter Covitz of the NCI, caBIG is a voluntary network of infrastructure, tools, and ideas that enables the collection, analysis, and sharing of data and knowledge along the entire research pathway, from laboratory bench to patient bedside. It was designed to speed research discoveries and improve patient outcomes by linking researchers, physicians, and patients throughout the cancer community. The program was designed intentionally to identify governance structures, organizations, and technologies that can move cancer research forward and, ultimately, have a much bigger impact on patient health than we have been able to see thus far. In its first year, for example, caBIG defined high-level interoperability and compatibility requirements for information models, common data elements, vocabularies, and programming interfaces.

The caBIG vision is to connect the cancer research community through a sharable, interoperable structure; to employ and extend standard rules in a common language; to more easily share information; and to build or adapt tools for collection, analysis, integration, and dissemination. caBIG is seen as an essential resource to fulfill the NCI’s goal of eliminating suffering and death due to cancer. Moreover, Covitz suggested, the experience of designing a governance structure for caBIG as well as the nuances of the initiative’s internal architecture can be instructive for healthcare data as a whole. In short, caBIG is a possible model or prototype for the broader challenge of creating an interoperable health information network across the nation.

Networked Data Sharing and Standardized Reporting Initiatives

Research questions today are more complex and data more complicated. At the same time, low-frequency events of interest demand larger pools of data, and greater geographic and demographic diversity is needed. Translational research uses institutional entities as the unit of analysis so researchers can compare outcome differences and patterns of practice. Data from single entities are insufficient, and thus the need for sharing data across research entities and collaborators continues to grow. Solutions must be tailored that are fast, inexpensive, sustainable, safe, and high quality with understood meanings. As described by Pierre-André La Chance, chief information officer at the Kaiser Permanente Center for Health Research, the Center and its research collaborators address these criteria in their data sharing efforts.

The chosen approach involves constructing research-friendly, secure, locally controlled data warehouses, as well as secure networks of local interoperable data warehouses. With controlled access to data warehouses with such characteristics, researchers can use available internal data quickly, cheaply, and expertly. Warehoused data are of sufficiently high quality to have credibility for decisions that affect both treatment and policy. The system is able to show data quickly for more than 10 million members per month. Developers are working to create sharable versions of data that include enrollment, demographics, a tumor registry, pharmacy, vital signs, procedures, diagnosis, and laboratory values. Also in development is a biolibrary that will allow people from multiple institutions to access Kaiser Permanente tumor registries and histology data, as well as electronic inventories of slides that expedite the identification of appropriate participants for research studies. Developers have also focused on a specific aspect of data warehousing, creating counters or specific data marts to share deidentified data quickly; this is especially valuable for preparatory research purposes. These models are seen as valuable tactical tools that are available now to advance clinical data sharing.

Large Health Database Aggregation

Steven Waldren, director of the Center for Health Information Technology at the American Academy of Family Physicians (AAFP), provided an overview of AAFP’s work to provide valued services and lower the costs of the technology. The holy grail of HIT is the ability to drive rapid improvement in the quality and safety of healthcare delivery. Yet, current financing of health care rewards high-cost, high-volume care, not low-cost, high-quality care. This disconnect creates potential conflicts of interest between those who need the technology and those who will financially benefit from the technology.

As a central tool in data aggregation, the EHR can drive and support quality improvement, public reporting, health services research, clinical research, healthcare value analysis, biosurveillance, population management, and public health. In practice, however, the AAFP has found that physicians are adopting the EHR and other technologies not in the interest of data aggregation, but primarily for business support, suggesting that a paradigm shift must occur to achieve data aggregation at any level other than administrative data. Also important is the need to clarify the value proposition for those who collect clinical data. Data codification, structure, standardization, and input into systems present barriers to data aggregation; confidentiality and data privacy concerns also impede progress. These issues must be addressed to foster greater willingness to share data.

In support of data aggregation, the AAFP has worked to establish and promote HIT standards focused on clinical data, such as the American Society for Testing and Materials Continuity of Care Record standard, as well as to map individual data to a common data structure. The AAFP continues to advocate for payment reform to incentivize quality of care, not volume of care, and works with the Ambulatory Care Quality Alliance to articulate the concept of a National Health Data Stewardship Entity. In addressing privacy and confidentiality issues, Waldren discussed AAFP’s work to clarify members’ misconceptions about the Health Insurance Portability and Accountability Act (HIPAA), which can be an unnecessary barrier to the sharing of data. Future uses for aggregated data that are of highest priority to AAFP members include quality improvement and clinical research efforts.

Registries and Care with Evidence Development

The challenges faced in data collection and knowledge dispersion to the point of care include standardization of the language of medicine, individual confidentiality, and cooperation among professional societies. In addition, the methodology associated with informing point-of-care decisions must be improved. Initiatives are needed to advance observational data adjustment techniques and to ensure that analysis of the data is unbiased because sharing of the data requires establishing trust and a common understanding between patients and providers on data issues. Finally, this is an expensive process, and allocating the expense of this to Medicare Part A or getting a fundamental base payment will be essential.

As described by Peter Smith, professor and chief of thoracic surgery at Duke University, one promising model that is impacting cardiac surgery outcomes is the Society of Thoracic Surgeons’ (STS’) Adult Cardiac Surgery Database (ACSD). The largest of three distinct databases that comprise the STS National Database, the ACSD is a voluntary clinical registry developed for the purpose of continuous quality improvement in cardiac surgery. It contains more than 3 million surgical procedure records from 850 participant groups, representing approximately 80 percent of adult cardiac surgical procedures performed nationally. Data are harvested quarterly, risk adjustment algorithms are updated, and each site is then provided with its raw and risk-adjusted outcomes compared to similar groups and national benchmarks. The publicly available, individual-patient STS risk calculator, based on the most recent risk adjustment algorithms, is a tool to rapidly disseminate knowledge to the bedside.

The ACSD is a clinical database that has been studied extensively and been shown to be more accurate than administrative databases. It has been selectively audited and endorsed for public use in several states. ACSD data have been linked to administrative data to demonstrate cost effectiveness of continuous improvement, and have been used to improve the accuracy of the Medicare Physician Fee Schedule nationally. In addition, over the past 7 years, STS/Agency for Healthcare Research and Quality grant programs have demonstrated that the use of a clinical data repository and feedback can rapidly change physician behavior on a national scale. Smith noted that ultimately, the experience and success of the ACSD can be exported to inform the development of shared data in other medical specialties.

Healthcare Data: Public Good or Private Property?

Despite the potential for accelerated and expanded research offered by existing data systems and efforts to enhance their linkage and use, broader access to critical data hinges on whether healthcare data constitute a public good. As reviewed in Chapter 4, one workshop session explored this question from several perspectives. Examining the clinical data utility from a conceptual standpoint as well as from the perspectives of the marketplace and the legal system, presenters considered how the structure of the medical care data marketplace can affect research priorities, gaps, and possibilities. Questions of whether important distinctions should be made within the spectrum of data types or sources, and how a case might be made for improved access and sharing of medical data, were addressed. Several options were suggested for how to think about basic concepts related to shared data and guide their use through policy and legislation.

Characteristics of a Public Good and How They Are Applied to Healthcare Data

As understanding theoretical principles can help guide the development of practical policy and action, David Blumenthal, director of the Institute for Health Policy at Massachusetts General Hospital/Partners Health System, reviewed the classic definition of a public good and discussed how this definition applies to health information under varying circumstances. Pure public goods cannot be traded efficiently in the marketplace, he suggested, because they are both nonrival, meaning that using this good does not preclude others from using it, and nonexcludable, meaning that, even if a good is wholly owned and paid for, its use and benefit by others cannot be prevented. As an example, basic research is widely accepted in the United States, across the spectrum of ideological opinions about markets, as a public good. Basic research is considered both nonrival and nonexcludable; its support is an appropriate and necessary role of government. In addition, Blumenthal described quasi-public goods, which he considered more relevant to discussion of healthcare data. Such goods may be relevant to the public and nonrival, but not nonexcludable or vice versa. Particularly emphasized, however, were quasi-public goods for which production or consumption generates or might generate effects (positive or negative externalities) on third parties not involved in the private purchase or sale of such goods. Applied biomedical research has aspects of both a public good and a quasi-public good. It is excludable and rival within limits. For example, knowledge underlying a particular drug or device can be appropriated up to a point, but important information can also be kept secret and lead to benefit in the marketplace. Keeping knowledge private causes a potential loss of efficiency in the advancement of other knowledge, but Blumenthal noted that this loss is tolerated to incentivize innovation driven by opportunity for economic gain. Patent law seeks to mitigate this loss of efficiency, enabling scientific progress based on protected information.

The principal questions relevant to workshop discussion concerned what to do with privately maintained databases, which are arguably quasi-public goods because they have private costs and value that given parties will not likely construct or share out of altruism, but for which large externalities exist (i.e., if available, these data could generate significant social benefits). To realize these benefits, an approach is needed that does not eliminate the incentive to assemble such databases. Also relevant is another kind of informational public good. For example, data found at the National Institutes of Health (NIH) or developed through the National Health Interview Survey or National Census represent situations in which the taxpayer has paid for the information to be collected. Efforts are needed to ensure these data are efficiently made available and used.

Rationales for making data publicly available, even when not meeting the definition of a public good, apply in situations in which government has supported, through financial or other means, the development or enrichment of data or in which making the data publicly available has significant benefits not captured in traditional market transactions. Blumenthal highlighted two solutions to contending with this quasi-public good or public good nature of information. The first is to increase the appropriability or excludability of information. Traditionally we have used patents and copyrights to accomplish this by granting a period of exclusivity on the condition of revealing the science and practice that led to the patent. The second is to have the government produce the good in question. The NIH and the National Science Foundation are examples of this approach. Blumenthal concluded that the question of data is complex, and that nuanced information uses will arise that require public guidance, perhaps on a case-by-case basis.

Characteristics of the Marketplace for Medical Care Data

Various sources of medical and prescription drug data are available to support safety surveillance and generation of evidence for healthcare decision makers. Repositories are constructed as potential mechanisms for research and commercial application. Data linking drug information with medical claims data provide an opportunity to view treatments, whether by procedure or pharmaceuticals, and capture elements of the other healthcare use patterns of those patients.

Claims aggregators commonly create deidentified research databases to license to third parties, including the federal government. These databases are also licensed to academic researchers if they can afford to pay (and if they cannot, they are often given access in a spirit of good will). The largest market for these commercially licensed databases is the pharmaceutical sector, which uses them for a variety of purposes. William H. Crown, president of i3 Innovus, detailed the planning involved in the construction of large, complex datasets. Documenting many potential sources of data that can be compiled for such a purpose, Crown outlined potential barriers to data aggregation, such as adjusting and standardizing data pooled from multiple sources, protecting patient privacy, and monitoring cautions required when using data for a purpose other than the one for which the data were originally collected.

In terms of the trade-offs between a pooled mega-database and pulling data from different data aggregators, Crown indicated a growing need for a mega-database that could house data from multiple health plans, government providers and payers, as well as other sources to promote standardization and create a public good available for research, for cost-effectiveness studies, for real-world drug safety, and for guiding compliance of physician practice.

Legal Issues Related to Data Access, Pooling, and Use

The legal system enters the public good debate because it reflects and so perpetuates the current excludability state of clinical data with property and intellectual property models. Furthermore, market exchanges or shifts to public good nonexcludability face legal barriers (e.g., privacy, confidentiality, and security) that are designed to reduce or eliminate negative externalities—effects that negatively impact individuals not directly involved in the collection or use of data—suffered by data subjects. Nicolas P. Terry, Chester A. Myers Professor of Law and codirector of the Center for Health Law Studies at Saint Louis University School of Law, offered some observations on aspects of the legal system relevant to the debate of whether clinical data should be treated as a public good. These include the perceived mandate to create or support structures that treat clinical data as a private good; the design of data protection laws to eliminate or reduce potential negative externalities of data sharing on the data subjects; and the uncertainty inherent in the legal system—an indeterminacy increased by the legion of “legacy laws,” such as records laws predating electronic clinical data collection and the potential for data mining of records to improve outcomes and effectiveness. Three major clusters of legal rules that create barriers to clinical data evolving into a public good were reviewed: property or inalienability rules (ownership of medical records, and IP and trade secret protections); federal–state vectors (state restrictions on data collection, processing, or security, and state initiatives on HIT and Health Information Exchange policy); and changing data protection models (the HIPAA privacy model, and personal health records and consumer-directed health care).

Ongoing work toward solutions to challenges related to IP and data protection models was reviewed, including the notion of balancing proprietary rights in information property with public duties such as obligations of accuracy and confidentiality, and the need to facilitate scientific, technical, and educational uses of information. Terry argued that a more rigorous data protection model will be required as a predicate for greater access to patient data, noting that, as stated by NCVHS, “erosion of trust in the healthcare system may occur when there is divergence between what individuals reasonably expect health data to be used for and when uses are made for other purposes without their knowledge and permission.” Examples of efforts to confront what he described as “a tension between data protection and public utility” include the National Center for Health Statistics’ stewardship framework report and the European data directive.

Healthcare Data as a Public Good: Privacy and Security

In addition to proprietary issues, concerns related to privacy and security restrict the use of healthcare data. Maintaining confidential and secure data records is of paramount importance to ensuring public trust in the healthcare system, and is an important factor in discussions about sharing of health data. As presented in Chapter 5 and summarized below, four speakers considered key legal and social challenges to privacy and security issues from a variety of perspectives—including insights gained from prevailing public opinion, implications of HIPAA as a means of ensuring privacy, experiences of organizations outside health care, and the privacy and security practices of healthcare delivery organizations.

Public Views

Privacy is all-pervasive in terms of the future of HIT. Public beliefs about privacy issues link directly to the trust level that individuals have in the entire healthcare establishment, and factor significantly in the move to electronic health records, personal health records, and interoperability exchanges. Alan Westin, professor emeritus of public law and government at Columbia University and principal of the Privacy Consulting Group, presented results of a 2007 national Harris/Westin survey that measured public attitudes toward the current state of health information privacy and security protection, health provider handling of patient information, health research activities, and trust in health researchers.

Westin’s study indicates that 83 percent trusted their own healthcare providers to protect the privacy and confidentiality of personal medical records and health information. Sixty-nine percent believed researchers can be trusted to protect the privacy and confidentiality of medical records and health information on research participants. Fifty-eight percent said they do not believe there is adequate protection today for their health information when asked whether privacy of personal medical records and health information is protected enough by federal and state laws and organizational practices.

The results of Westin’s survey confirm that the public has strong privacy concerns regarding the handling and protection of their personal health information, especially concerning uses of data not directly used for providing care. According to Westin, privacy is a matter of balance and judgment, and it is contextual. The results also suggest that a new code of privacy, confidentiality, and security written into legislation might support new health information technology and the adoption of EHR systems. In addition to encouraging models of voluntary patient control privacy policies offered through repositories of personal health records such as Microsoft’s HealthVault and Google Health, Westin suggested the need for an independent health privacy audit of the verification process. New, easy-to-use technologies for implementing patient notice and choice could revolutionize the role of individuals in the process of how their personal information is used. Conducting additional field research into privacy in the EHR programs and sponsoring a national educational campaign to promote privacy-compliant, evidence-based health research might advance the public perception of data.

HIPAA Implications and Issues

As the healthcare system and HIT systems evolve, experience suggests that modifications are needed in the HIPAA Privacy Rule to strike the proper balance between protecting patient privacy and making data available for research necessary to improve healthcare quality and lower costs. Early changes to HIPAA allowed the disclosure of limited datasets and lightened administrative burdens on healthcare providers and plans that made data available for research purposes. Marcy Wilder, a partner at the law firm of Hogan and Hartson, LLP, and former deputy general counsel at the Department of Health and Human Services (HHS), served as the lead attorney in the development of HIPAA. She asserted that identifying the most significant barriers that remain, including those related to future unspecified research and data deidentification, and clearly defining policy alternatives will be essential to promoting the research enterprise.

HIPAA rules can be confusing and require administrative recordkeeping that challenges many covered entities—in particular, smaller hospitals. There are also liability concerns on the part of the covered entities. For these reasons, researchers who seek clinical data for evidence development and application find that such data are hard to obtain. Because HIPAA is so often used as a smokescreen to preclude the sharing of data, a more difficult challenge in policy discussions will be separating out and defining for the regulators and legislators what the real problems are regarding data sharing. Wilder suggested that HIPAA as it stands today may be somewhat outdated. Advisory committees, Congress, and agencies within HHS itself have recognized that the research provisions need improvement to encourage the use of data for research and innovation. Discussions in this regard will prompt important conversations with both regulators and legislators.

Wilder identified several areas for consideration. Under HIPAA, individuals are not permitted to give their consent for the use of data for future unspecified research, which prompts an important policy discussion. In light of developments in HIT and other technology, another issue is deidentification and safe harbor standards. Another topic for discussion concerns liability burdens distributed across covered and noncovered entities.

Examples from Other Sectors

Greater openness about data can be seen in the collaborative research of the Human Genome Project, the public registration of clinical trials, and the growth of new models of disclosure/publication of research results in open-access journals and digital repositories. As the basis for the creation of new datasets underlying evidence-based medicine, greater openness is transforming the relationship between doctors and patients, increasing market incentives for improved health care, and providing new means for detecting emerging diseases. The challenge, as with privacy and clinical records, is to determine what level of openness is most appropriate for the particular purpose to be accomplished. To provide perspective on these issues, Elliot E. Maxwell, a consultant, Fellow in the communications program at Johns Hopkins University, and Distinguished Research Fellow at Pennsylvania State University, presented an overview of the adjudication of information in the context of the report, Harnessing Openness to Transform American Health Care, from the Committee for Economic Development.

Among other recommendations, the report suggests that the Food and Drug Administration (FDA) should review existing requirements on patient consent to participate in clinical trials and make changes as appropriate. The report also suggests that the FDA should require electronic filing for all drug and device approvals and should set standards for, and require the filing of, underlying clinical data, upon approval, in a form that allows subsequent machine aggregation, search, and manipulation.

Regarding EHRs, the report recommends that individuals and groups providing and funding health care should institute appropriate incentives for the adoption of information and communication technologies (including EHRs) to reduce health care’s burdensome administrative costs. Maxwell suggested that federal research agencies should increase their support for the development of the large databases necessary for progress toward evidence-based medicine, including development of the necessary data standards. HIPAA should also be evaluated and amended to require that those parties who hold a patient’s medical records provide the patient with the opportunity to receive copies of those records in digital form.

Institutional and Technical Approaches to Ensuring Privacy and Security of Clinical Data

Healthcare providers view the protection and security of patient health information as essential to maintaining the trust and confidence of their patients and as an important element of patient satisfaction. At the same time, healthcare providers are rich sources of data, which have the potential to enhance the quality of clinical care and may result in better clinical outcomes, improved efficiencies, cost savings, and other medical advances. Alexander Eremia, associate general counsel and corporate privacy officer at MedStar Health, Inc., discussed the implications of these tensions from the standpoint of a major healthcare provider organization. He reflected on the institutional challenges inherent in balancing patient privacy interests with providing access for research purposes. In particular, Eremia indicated that providers must address perceived and actual privacy or security hurdles, patient trust considerations, potential legal consequences, and actual costs associated with retrieval of data; all of these pose barriers to releasing data for research purposes.

Healthcare providers may find HIPAA privacy and security requirements confusing, and health information data custodians and researchers may have limited awareness of HIPAA’s data access and disclosure requirements. Furthermore, even when access and disclosure are permitted under HIPAA, the willingness to make certain disclosures of identifiable information may be impeded by physician concerns related to violating the trust of their patients, minimum necessary standards, accounting for disclosure obligations, and even concern about losing patients to physician/researchers. In addition, it is often costly for healthcare providers to divert resources and personnel away from clinical care activities to attend to system and records access activities. As a result, healthcare providers are often more motivated to protect patient privacy, respect physician–patient relationships, minimize the administrative impact of data retrieval, and minimize legal risks and customer complaints than to make data available for research. According to Eremia, without adequate financial or strategic incentives, regulatory amendment, and greater appreciation of the public benefits of research, access to identifiable data for research will remain a challenge.

Creating the Next-Generation Data Utility: Building Blocks and Action Agenda

The development of a new data utility builds on the considerable past progress in health care. As summarized in Chapter 6, both theoretical perspectives and specific ideas for practice were presented at the workshop. Reviewed first are workshop presentations that identified lessons learned on important components or building blocks for a next-generation data utility. The chapter concludes with a summary of comments on emerging, practical opportunities to align policy developments with improved data access and evidence development offered by a discussion panel of key policy makers.

Building Blocks for the Next-Generation Public Agenda

Important strategic priorities for the development of an architecture for a next-generation data utility emerged from workshop discussion of collaborative models that offered insights into collaborative clinical data system management, and suggested a framework for expectations, purposes, incentives, priorities, structures, roles and responsibilities, and principles for data entry, access, linkage, and use. Similarly, presentation of efforts to aggregate clinical data from multiple institutions raised considerable technical, organizational, and operational challenges that need to be addressed. Finally, economic incentives and legal issues were considered as important levers to realize the full potential of health data.

Building on collaborative models. The Institute to Transform and Advance Children’s Healthcare at Children’s Hospital of Philadelphia (CHOP) is spearheading a novel effort to harness clinical and business information to improve children’s health, make their health care more efficient, and transform the delivery system. The Institute has developed a data system that links the full spectrum of information about a child’s health needs, from genomics to clinical to environmental data, in order to build out a vision of personalized pediatrics. Christopher Forrest, professor of pediatrics and senior vice president and chief transformation officer at CHOP, described the hospital’s approach to data. He discussed issues related to the collaborative relationships needed to realize a vision of personalized pediatrics, including forming linkages with multiple pediatric institutions, giving patients and families access to their data, obtaining information from them, and creating provider–payer collaborations.

CHOP’s model is predicated on giving care at the right time by the right person in the right setting, minimizing waste, and shifting services from specialty care to primary care. In evolving into a data-driven organization, CHOP developed a concept of personalized pediatrics, which relies on collaborations with other pediatric institutions, public institutions, payers, patients, and families. The concept of personalized pediatrics focuses on outcomes, changes in health, and reductions in costs, both financial and nonfinancial. Apart from issues related to establishing and then sustaining strong collaborations with CHOP’s partners, other challenges include communications and changing cultural assumptions. Changing the culture of providers to collect data in a high-quality way is dramatically difficult: providers can be added to an EHR, but getting them to change what they do with the EHR requires education and time. Communication across the board is important, according to Forrest, especially in regard to engaging families in a dialogue about how they can partner in personalized pediatrics. CHOP’s model of care is family centered and designed in partnership with families. Forrest also suggested that none of the programs will work without the support and participation of families.

Technical and operational challenges. Efforts to aggregate clinical data from multiple institutions for the purposes of gaining insights on clinical effectiveness or drug/device safety face many technical, operational, and organizational challenges. Drawing on experiences from previous pilot projects and other work in this area, Brian J. Kelly, the executive director of the Health & Sciences Division at Accenture, provided an on-the-ground, real-life implementation perspective on the challenges with aggregating data from multiple sources for secondary use. He also discussed the impact of current privacy regulations, based on work to prototype the Nationwide Health Information Network, in which researchers aimed to aggregate data from 15 completely separate organizations in four states.

Among the challenges to optimizing the use of data, both in patient care and for secondary uses, is getting the data into equivalent standards and terms, and finding ways to draw data into one repository from multiple systems. Systems for such data are in place and to some extent entrenched, and changing those systems will be incremental. Kelly drew from experiences in the area to suggest that a sophisticated approach to information governance is needed. Another consideration is ownership of the data: whether they belong to patients or to the entity that enters them into a database. Approaches to addressing technological and architectural challenges are needed to best support envisioned goals for the data. Because states can place restrictions on data sharing in addition to HIPAA rules, there will be continued trouble aggregating data among various institutions even if the standard notification of privacy practices is changed to say that data can be used, if they are deidentified, for secondary use in clinical research. To share data among delivery organizations, there could be a different approach to notification for privacy purposes, which Kelly indicated is one of the biggest policy areas that must be addressed.

There is a growing need for advocacy for using data as a public utility. Many organizations have started marketing campaigns to educate patients and their families on the importance of participation in clinical trials and related research endeavors. Kelly pointed out that we need to do the same thing to educate people on how important it is to be able to use data for secondary purposes. Such efforts would have to contend with security and privacy issues, but those factors can be addressed.

Economic incentives and legal issues. If we wish to change behavior, then we must directly address incentives, argued Eugene Steuerle, senior fellow at the Urban Institute. Steuerle suggested that existing incentive structures discourage information sharing, giving great weight to possible errors in protecting privacy relative to errors deriving from failing to take advantage of ways to improve public and often individual health. In addition, incentives internal to the bureaucracy also discourage optimal use of information, even for functions such as merging already existing datasets. Because government now controls nearly three fifths of the health budget, including tax subsidies, it bears substantial responsibility to improve these incentives. Some incentive changes are possible now, through reimbursement and payment systems. Others require examining the reward structure internal to the bureaucracy. In the end, however, the primary incentive needs to come from consumer demand, operating either directly on providers and insurers or on the voters’ elected representatives.

In many cases the benefits of clinical data are shared by all, but in fact the benefits to the individual come from clinical data treated as a public good. Accordingly, data-sharing solutions should examine and change the incentive structure and manage the tensions between the privacy and confidentiality of data and their use to improve well-being. Steuerle also highlighted several failures to advance the public good. We lack data sharing for individual care; we lack data sharing for an early warning system (e.g., through the Centers for Disease Control and Prevention [CDC] or other organizations); and we lack data sharing for basically solving problems and finding cures or better treatments for various health problems. In the end, however, Steuerle encouraged engaging the public to support data initiatives. Another incentive problem is the lack of bureaucratic incentives to share datasets or allow datasets under an agency’s purview to be shared. Notwithstanding good will among many public servants, there are strong disincentives in the bureaucracy to share data; consideration is needed on how to introduce incentives into the bureaucracy to reward people for enabling the sharing of data.

For solutions to some of these issues, Steuerle outlined several opportunities through which the government can leverage its position, for example, higher reimbursement of drugs prescribed electronically instead of through traditional methods. It could differentially pay for lab tests put into electronic form for sharing with patients or the CDC. It could pay for electronic filing of information on diagnoses and treatment. Government could also provide more incentives for participation in clinical trials. Steuerle also suggested that people working in the public sector need incentives to encourage data sharing.

The Action Agenda

Also summarized in Chapter 6 are discussions of a stakeholder panel charged with moving the conversation about data utility to an action agenda by offering practical ideas on the strategies or incentives needed to advance the development of an improved data utility. As session chair David Blumenthal observed, the environment for clinical data is much more distributed than ever, a phenomenon that overrides traditional instincts of policy makers to develop solutions by identifying roles and responsibilities for local, state, and federal governments. In a distributed environment, such an approach is too narrowly framed. For example, the conversation that engages consumers directly, and focuses on the personal health record, is a very different policy environment from one that could be addressed through a centralized authority. At the same time, the federal government is a big stakeholder and player in the collection of health-related data. However, the environment surrounding data differs from one part of the government to the other: the NIH, for example, has the capacity to focus on promoting sharing of data and has a broad mandate for data collection and sharing, whereas Medicare operates in a much more restrictive environment. With these observations as context, panelists offered comments on decisions and actions that could best enable access to and use of clinical data as a means of advancing learning and improving the value delivered in health care.

Government-sponsored clinical and claims data. Steve E. Phurrough, director of coverage and analysis at the Centers for Medicare & Medicaid Services, provided an inventory of the data that Medicare collects, what it does with the data it collects, and what some of its challenges are in data utility. Medicare currently collects data in each of the four parts of the program: A, B, C, and D. Collected data are used as the basis of paying claims. Data are collected to help improve the quality of health care, for payment purposes, and to develop pay-for-performance qualitative information. Another set of data collection programs is in Medicare demonstration projects, which look at a variety of issues and, generally, examine how different payment systems may affect outcomes versus clinical issues. Data are also collected in the interest of evidence development.

Given the limits of its authority, Medicare has had to be somewhat innovative. One example is linking some clinical data collection to coverage of particular technologies. One carrot Medicare has developed is to require the delivery of clinical data beyond the typical claims data as a provision for payment for certain services; a few years ago, the system required, for example, additional clinical information for the insertion of implantable defibrillators. Such an approach has the potential to provide significant amounts of information if, in fact, we can learn how to meet the challenge of what we can do with data that have been collected, and merge those data with other sources of data so that data collection can inform clinical practice.

Government-sponsored research data. The molecular biology revolution was founded on the commonality of DNA and the genetic code among living things. Discoveries at the molecular level provide unprecedented insight into the mechanisms of human disease. This understanding has developed into an expectation of wide data sharing in molecular biology and molecular genetics. Now that powerful genomewide molecular methods are being applied to populations of individuals, the necessity of broad data sharing is being brought to clinical and large cohort studies. This has prompted considerable discussions at the NIH that have resulted in the NIH Genome Wide Association Study Policy for data sharing, and a new database at the NIH’s National Center for Biotechnology Information (NCBI) called the Database of Genotypes and Phenotypes (dbGaP).

James M. Ostell, chief of the NCBI Information Engineering Branch, heads the group that provides resources such as PubMed and GenBank, an annotated collection of all publicly available DNA sequences. He observed that in the course of collecting and distributing terabytes of data, the branch has wrestled with questions concerning which data are worth centralizing versus which should be kept distributed. Although technical and policy requirements sometimes dictate answers to those questions, nature sometimes directs information engineers to pursue certain tactics. For example, the commonality of molecular data might drive the desire to have all related information in one data pool, so that a researcher could search all the data comprehensively, perhaps not even with a specific goal in mind. This could lead to the kind of serendipitous connection that is fundamental to the nature of discovery. At the same time, however, there must be a balance toward collecting only those pieces of data that make sense in a universal way.

The NIH has required researchers to pool data collected under NIH grants so that other investigators might benefit from those data. NIH created dbGaP to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, and molecular diagnostic assays, as well as association between genotype and nonclinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amounts of genotypic data required to make these analyses possible. dbGaP incorporates phenotype data collected in different studies into a single common pool so the data can be available to all researchers. Dozens of studies are now in the database, and by the end of 2008, the database was expected to hold data on more than 100,000 individuals and tens of thousands of measured attributes.

Hundreds of researchers have already begun using the resource. There is also a movement on the part of the major scientific and medical journals to require deposition accession numbers when they publish the types of studies alluded to above, the same as required for DNA sequence data. The publications recognize the importance of other people being able to confirm or deny a paper’s conclusions, which requires investigators to review the data that informed the paper. To further encourage secondary use of data, other accession numbers are used when people take data out of a database, reanalyze the data, and then publish their analysis.

Professional organization-sponsored data. Guidelines and performance measures in cardiology developed by the American College of Cardiology (ACC), often in association with the American Heart Association, typically are adopted worldwide. ACC Chief Executive Officer Jack Lewin described ongoing efforts to ensure that ACC guidelines, performance measures, and technology appropriateness criteria are adopted in clinical care, where they can benefit individual patients. Although most guidelines are currently available on paper, the vision is to have clinical decision support integrated into EHRs.

The ACC’s National Cardiovascular Data Registry (NCDR) was designed to improve the quality of cardiovascular patient care by providing information, knowledge, and tools; benchmarks for quality improvement; updated programs for quality assurance; platforms for outcomes research; and solutions for postmarket surveillance. The NCDR strives to standardize data and to provide data that are relevant, credible, timely, and actionable, and to represent real-life outcomes that help providers improve care and that help participants meet consumer, payer, and regulator demands for quality care. The NCDR’s flagship registry, the national CathPCI Registry, is considered the gold standard for measuring quality in the catheterization laboratory. Other NCDR registries collect data on acute coronary syndrome, percutaneous coronary interventions, implantable cardioverter defibrillators, and carotid artery revascularizations. The ACC is currently working to standardize registry data to be able to measure gaps in performance and adherence to guidelines, with an ultimate goal of being able to teach how to fill those gaps and thus create a cycle of continuous quality improvement.

Mandates from Medicare and states have pushed hospitals to use the ACC registries, but there is room for wider adoption. The ACC is working to alleviate barriers such as the need for standardization, the expense of collecting needed data, and the lack of clinical decision support processes built into EHRs. The ACC would also like to see a national patient identifier that would enable the tracking of an individual’s overall health continuum while preserving patient privacy; such an identifier would bolster longitudinal studies. The ACC believes wider adoption of data sharing via registries is within reach, should be encouraged, and would ultimately result in better health care overall, but that strategies need to be developed and implemented that foster systems of care versus development of data collection mechanisms specific to a single hospital. Toward the development of business strategies needed to develop the clinical decision support capacity, standardization, and interoperability, the ACC wants to collaborate with other medical specialties, EHR vendors, the government, insurers, employers, and other interested parties. Going forward, the ACC supports investment in rigorous measurement programs, advocating for government endorsements of a limited number of data collection programs, allowing professional societies to help providers meet mandated reporting requirements, and implementing systematic change designed to engage physicians and track meaningful measures.

Product development and testing data. The pharmaceutical industry collects and shares a great deal of clinical data. Because the industry is heavily regulated, the data it collects are voluminous and made available publicly under strict regulations that, it is hoped, ensure their accuracy and the accuracy of their interpretations. Eve Slater, senior vice president for worldwide policy at Pfizer, noted that the pharmaceutical industry is interested in ensuring the widespread availability of data to support research at the point of patient care and care at the point of research. In the pursuit of that goal, the industry is interested in pursuing the alignment of data quality, accessibility, integrity, and comprehensiveness. An influx of regulations and an acknowledged need for transparency are prompting the appearance of product development and testing data in the public domain. Nonetheless, attention is needed to ensure data standards, integrity, and appropriate, individualized interpretation.

Although significant amounts of product development data are required by law to be in the public domain, roadblocks prevent the effective sharing of clinical data. In the area of clinical trials posted on public registries, for example, shared information can be incomplete, duplicative, and hard to search, and nomenclature is not always standardized. The information also needs to be translated into language that patients can understand. The lack of an acceptable format for providing data summaries for the public is linked to concerns about disseminating data in the absence of independent scientific oversight; once data are in the public domain, controlling quality assurance and the accuracy with which the information is translated to patients become difficult. Policies to address some of these issues lag behind the actual availability of data.

These issues argue in support of the data-sharing and standardization principles that the IOM has articulated. The Clinical Data Interchange Standards Consortium (CDISC) and other organizations are currently focused on the issues of standardizing electronic data.

Regulatory policies to promote sharing. Although large repositories now exist for controlled clinical trial data, including primary data, Janet Woodcock, deputy commissioner and chief medical officer at the FDA, observed that much of that information unfortunately resides on paper in various archives, not in an electronic form that would readily enable sharing. The FDA’s Critical Path Initiative is an aggressive attempt to be able to combine research data from the various clinical trials in different ways and to extend learning beyond a particular research program. The FDA has been working with the CDISC to try to standardize as many data elements as possible.

Several years ago, the FDA established the ECG Warehouse, an annotated electrocardiogram (ECG) waveform data storage and review system, for which a standard was established for a digital ECG. The FDA asked companies engaged in cardiac safety trials to use that standard. Today the ECG Warehouse holds more than 500,000 digital ECGs along with the clinical data, and the FDA is collaborating with the academic community to analyze those data to learn new knowledge that would not have been accessible before the development of a standardized dataset.

The FDA is constructing quantitative disease models from clinical trials data, building electronic models that incorporate the natural history of the disease, performance of all the different biomarkers about the disease over time, and results from interventions. Given multiple interventions, the approach allows researchers to model quantitatively. The FDA expects more of these models to evolve in the future.

Within the Critical Path Initiative, the FDA worked with various pharmaceutical companies to pool all their animal data for different drug-induced toxicities, before the drugs are given to people. This groundbreaking consortium worked to cross-validate all the relevant biomarkers in each other’s laboratories. The first dataset, on drug-induced kidney toxicity in animals, has been submitted to the FDA and is under review. Similar approaches could be undertaken with humans; pooling those data from various sources could lead to new knowledge.

The FDA also plans to build a distributed network for pharmacovigilance. The Sentinel Network seeks to integrate, collect, analyze, and disseminate medical product (e.g., human drugs, biologics, and medical devices) safety information to healthcare practitioners and patients at the point of care. Required under the 2007 Food and Drug Administration Amendments Act (FDAAA), the Sentinel Network is currently the focus of discussions by many stakeholders about how best to proceed. One approach is to build a secure distributed network in which data stay with the data owners, but are accessible to others.

Legislative change to allow sharing. The Center for Medical Consumers, a nonprofit advocacy organization, was founded in 1976 to provide access to accurate, science-based information so that consumers could participate more meaningfully in medical decisions that often have profound effects on their health. Arthur Levin, the center’s cofounder and director, believes government has a role to play in regulating the healthcare sector; key questions in this arena concern what government can and cannot do, and what it should and should not do.

Legislatively, most of the action concerning data sharing is currently in the states. Levin noted that we may face a scenario similar to that with managed care legislation, where in the absence of federal legislation, states moved ahead on their own, for better or worse. Currently states are moving ahead rapidly with HIT and health information exchange. Issues of privacy and confidentiality are very much in the forefront and driving state legislation. In terms of legislation covering data sharing, we need to make sure that whatever policy is developed moves things in an agreed-upon direction that does not create new obstacles and barriers. A first step will be to develop a much better understanding of what barriers exist in the states and federal government to aggregating data for research, quality improvement, and similar goals.

Another issue is that data sharing is, in essence, a social contract between individuals and researchers who want to use their data. Patients are told there will be some payoff from sharing data, but perhaps patients do not hear enough about how that is supposed to happen. Where does the payoff come? How does the other side of that contract deliver? What are the deliverables? Is there a time line for those deliverables? Is there accountability for those deliverables? As part of the social contract, there should be a burden on collecting data, a requirement that the collector do something specific with the data being collected. Privacy and confidentiality rules and remedies can be legislated; however, trust must be built. All who believe that data represent a public good—and that data sharing is a public responsibility to advance the public interest in improving healthcare quality, safety, and efficacy—also understand that such a message may not resonate so readily with the public. The public has not yet been brought up to that level, and more is needed to engage consumers in this enterprise.

Engaging the Public

The final session of the workshop examined the public’s role in improving the clinical data utility, considering how the public currently views the use of clinical care data for research, what types of information the public is interested in deriving from such research, and how that interest might influence public response to future developments in the use of health information. The session further considered what technical, communication, and demonstration-of-value advances might help address the concerns of healthcare consumers. As summarized in Chapter 7, participants provided an overview of public knowledge, issues, concerns, and discussion of strategies on public understanding, engagement, and support for the changes necessary to create the next-generation public data utility. Also discussed were the design and implementation of tools that would be enhanced by wider availability of clinical data—such as those that help improve patient access and use of information from, about, and by those who are dealing with similar circumstances. Finally, the nature and potential use of personal health records, safeguards for data access and entry, and possible influence on public perceptions about privacy and data use were considered.

Generating Public Interest in a Public Good

In many respects, the greatest challenge associated with establishing a medical care data system to serve the public interest lies in the fact that such data largely reside in the private sector, where commercial interests and other factors inhibit sharing. This paradigm has benefited discrete entities, but it has failed to serve the public health interests of the broader U.S. population or to promote awareness of how such information can be used to improve clinical decision making at the individual level. Though the public should have considerable interest in this information, the limitations of the data system as currently structured severely inhibit demonstration of the value proposition for consumers, both individually and collectively. Alison Rein, senior manager at AcademyHealth, identified key issues to be addressed to develop public awareness and perception of medical care data use for public good applications. She provided an overview of what little is known about this domain from the public’s perspective; discussed some assumptions and attitudes that may impede progress in this direction; and highlighted examples from which we might learn and share strategies for generating public interest.

Rein discussed the public’s limited understanding of how their clinical data move within or outside our fragmented system and the consequences for discussions about data access and data protection and security. Although lessons might be learned from other industries’ transition to electronic systems for data management, the public expectation of trust and privacy between providers and patients and the potential for irrevocable harm inherent to health care heighten the challenge. Progress will require public education, outreach, and the demonstration of value in the use of health data.

Generating interest in electronic access to personal health information might help overcome market obstacles related to sequestering data for proprietary interests. However, Rein suggested that until greater regulation is put in place to compel providers and healthcare institutions to share data appropriately, use of clinical data for the public good will remain constrained. Efforts should also be made to align public and research interests toward pursuing common goals and helping the public develop a deeper appreciation for research as a public good. Public demonstration of the value of data sharing might help in this regard—showing, for example, the potential impact of clinical data on personal lifestyle, the bottom line, or other endpoints of interest to the public. Possible approaches to demonstrating the value of research as a public good included expanded reporting of limited, but meaningful, clinical health data to public health entities; the enhancement and expansion of clinical data registries; and the development of a nationwide health tracking network that could yield information of value to researchers, the public health community, providers, policy makers, and consumers.

Implications of “Patients Like Me” Databases

The longstanding tension between an individual’s desire for personalized information and the population’s interest in healthcare research is exacerbated by scientific advances such as molecular profiling, information sharing on the web, and modern data management tools. Both the public and private sectors are struggling to navigate this logistically challenging landscape to gain medical insights and occasionally to monetize these insights. Patient-focused clinical trial information services created in the past decade provide a unique view of how patients feel about healthcare research at both the individual and the population level. Courtney Hudson, chief executive officer and founder of EmergingMed, provided an overview of EmergingMed, a company that helps cancer patients gain access to clinical trials and search for treatment options. Hudson discussed how this service addresses the intersection of an individual’s need for information, access, and transparency with the U.S. healthcare system’s desire for population-based research and data sharing in light of modern data management and data-sharing capabilities.

Patients in this country support mining clinical databases for the good of public health and for learning, and they believe overwhelmingly that it already happens. Patients seek information to inform treatment decisions, and Hudson indicated it would be unconscionable to not provide as much information as we have available in the public domain to possibly help each patient. As ways to use and aggregate public datasets are developed, it would be extremely difficult ethically to justify any decision to withhold information from patients. Similarly, Hudson highlighted the concept of promoting evidence-based medicine and garnering public approval and cooperation in terms of the potential benefit to the public, rather than the public understanding of research. Transparency and trust were also emphasized. The more transparent the system, the more likely patients’ trust is gained. Regarding the informed consent process, a basic ethical concern is that the clinical trials system as it stands today has a narrow definition of informed consent. Hudson encouraged workshop participants to consider ways to provide context, full disclosure, or transparency to patients or to inform them about the larger process. A key distinction in considering the patient’s point of view might be to view clinical data utilities in terms of patient-driven solutions versus system-driven solutions.

Implications of Personal Health Records

Dramatic increases in medical information and in consumer access to information via the Internet are making health care one of the most significant hot spots for technology innovation today. Currently the practice of medicine suffers from an information management problem. Control will eventually shift, moving the current top-down doctor-patient relationship to one that is characterized by mutual control. For physicians, the issue is about aggregating data within and across provider organizations, and for consumers it is about aggregating health data across all of their sources. Ultimately, these views will connect to enable informed health decisions and better clinical outcomes. Today, we have more personal health data than ever; however, the data are dispersed over a variety of facilities, providers, and even our own monitoring devices and home computers.

As described by Jim Karkanias, partner and senior director of applied research and technology at Microsoft Corporation, Microsoft is working to address gaps in the healthcare data management system, both from an enterprise and a consumer standpoint, to enable a more connected, informed, and collaborative healthcare ecosystem. Microsoft HealthVault, a consumer health platform with specialized health search capabilities, delivers a platform that puts users in control of their information so they can access, store, and recall it on demand. Karkanias indicated that such a level of access and control contributes to the ability to make good decisions. The platform is built on the premise that the consumer is at the center of health care, so patients are the logical aggregators of this information.

HealthVault seeks to help patients proactively manage their own health care, for example by replacing costly visits to a doctor’s office with daily in-home monitoring so that proactive measures can be taken as soon as problems are detected. Chronic conditions and more serious illnesses could be handled proactively. With appropriate privacy consents, a caregiver could have a full view of a patient’s underlying data; others could be granted access to different parts of that same data, an approach useful, for example, to adult children caring for their parents from afar.


The availability of timely and reliable evidence to guide healthcare decisions depends substantially on the quality and accessibility of the data used to produce the evidence. Important information about the results of different diagnostic and treatment interventions is collected in multiple forms by many institutions for different reasons and audiences—providers, patients, insurers, manufacturers, health researchers, and public agencies. Medical care data represent a vital resource for improving insight and action for more effective treatment. With the increasing potential of technical capacity for aggregation and sharing of data while ensuring confidentiality, the prospects are at hand for powerful and unprecedented tools to determine the circumstances under which medical interventions work best, and for whom. However, these data are usually held in a proprietary manner instead of being considered a public good that can be pooled and mined for new research and, ultimately, better patient care and outcomes. There are a number of challenges to the use of such data—coding discrepancies, platform incompatibilities, patient protection tools—yet practical approaches are and can be developed to contend with these issues. The most significant challenge may be the barriers and restrictions to data access inherent in treating clinical outcome data as a proprietary commodity.

Chapter 8 summarizes the themes emerging from workshop discussion and opportunities for follow-up action by the Roundtable. Key issues discussed include clarifying basic principles of data stewardship; creating next-generation data utilities and models; creating next-generation data policy; and engaging the public. Potential opportunities for follow-up attention by the members of the IOM Roundtable on Value & Science-Driven Health Care include those noted below—Roundtable Innovation Collaboratives already engaged in related follow-on work are indicated in parentheses.

  1. Principles: Foster the development, review, and implementation of basic principles for data stewardship.
  2. Use of electronic health records for knowledge development: Convene an affinity group of EHR users and vendors to consider approaches to cooperative work on knowledge development, including issues related to standards and rules for governed data query and application (EHR Innovation Collaborative).
  3. Collaborative data mining: Organize exploratory efforts to investigate cutting-edge data-mining techniques for generating evidence on care practices and research (EHR Innovation Collaborative).
  4. Incentives: Convene an employer–payer workgroup to explore the use of economic incentives to reward providers/groups working to improve knowledge generation and application in the care process.
  5. Privacy and security: At the conclusion of a current IOM study on HIPAA and privacy protection regulations, convene a series of meetings to explore and clarify definitions as well as reduce the tendency toward unnecessarily restrictive interpretations, in particular as they relate to data sharing and secondary uses.
  6. Transparency and access to federal data: Explore the marketplace for data, opportunities to enhance data sharing, governance/stewardship issues, and ways to make federally sponsored clinical data widely available for secondary analysis. This includes not only data from federally supported research but also Medicare-related data, including from Part D (pharmaceutical) use.
  7. Public involvement in the evidence process: Engage the public through communication efforts aimed at increasing public understanding and involvement in evidence-based medicine (Evidence Communication Innovation Collaborative).


  1. Arrow K, Bertko J, Brownlee S, Casalino LP, Cooper J, Crosson FJ, Enthoven A, Falcone E, Feldman RC, Fuchs VR, Garber AM, Gold MR, Goldman D, Hadfield GK, Hall MA, Horwitz RI, Hooven M, Jacobson PD, Jost TS, Kotlikoff LJ, Levin J, Levine S, Levy R, Linscott K, Luft HS, Mashal R, McFadden D, Mechanic D, Meltzer D, Newhouse JP, Noll RG, Pietzsch JB, Pizzo P, Reischauer RD, Rosenbaum S, Sage W, Schaeffer LD, Sheen E, Silber BM, Skinner J, St. Shortell M, Thier SO, Tunis S, Wulsin L Jr, Yock P, Bin Nun G, Bryan S, Luxenburg O, van de Ven WPMM. Toward a 21st-century health care system: Recommendations for health care reform. Annals of Internal Medicine. 2009;150(7):493–495. [PubMed: 19258550]
  2. Blumenthal D, Campbell EG, Gokhale M, Yucle R, Clarridge B, Hilgartner S, Holtzman NA. Data withholding in genetics and the other life sciences: Prevalence and predictors. Academic Medicine. 2006;82(2):137–145. [PubMed: 16436574]
  3. Detmer DE. Building the national health information infrastructure for personal health, health care services, public health, and research. BMC Medical Informatics and Decision Making. 2003;3(1):1–40. [PMC free article: PMC149369] [PubMed: 12525262]
  4. Editorial. Let data speak to data. Nature. 2005;438:531. [PubMed: 16319843]
  5. Health care spending in the United States and OECD countries. 2007. [accessed July 14, 2008]. http://www​​/snapshot/chcm010307oth.cfm.
  6. Hrynaszkiewicz I, Altman D. Towards agreement on best practice for publishing raw clinical trial data. Trials. 2009;10(17) [PMC free article: PMC2662833] [PubMed: 19296844]
  7. IOM (Institute of Medicine). Beyond the HIPAA Privacy Rule: Enhancing privacy, improving health through research. Washington, DC: The National Academies Press; 2009. [PubMed: 20662116]
  8. Ness RB. Influence of the HIPAA Privacy Rule on health research. Journal of the American Medical Association. 2007;298(18):2196–2198. [PubMed: 18000206]
  9. NRC (National Research Council). Computational technology for effective health care: Immediate steps and strategic directions. Washington, DC: The National Academies Press; 2009. [PubMed: 20662117]
  10. Piwowar HA, Becich MJ, Bilofsky H, Crowley RS. Towards a data sharing culture: Recommendations for leadership from Academic Health Centers. PLoS Medicine. 2008;5(9):e183. [PMC free article: PMC2528049] [PubMed: 18767901]
  11. Safran C, Bloomrosen M, Hammond WE, Labkoff S, Markel-Fox S, Tang PC, Detmer DE, with input from the expert panel. Toward a national framework for the secondary use of health data: An American Medical Informatics Association white paper. Journal of the American Medical Informatics Association. 2007. [accessed August 18, 2008]. pp. 1–9. http://www​.healthlawyers​.org/Members/PracticeGroups​/HIT/Toolkits​/Documents/5_Health​_Data_AMIA_Summary.pdf. [PMC free article: PMC2329823] [PubMed: 17077452]
Copyright © 2010, National Academy of Sciences.
Bookshelf ID: NBK54290

