NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gliklich RE, Dreyer NA, editors. Registries for Evaluating Patient Outcomes: A User's Guide. 2nd edition. Rockville (MD): Agency for Healthcare Research and Quality (US); 2010 Sep.

Registries for Evaluating Patient Outcomes: A User's Guide. 2nd edition.

Executive Summary

Defining Patient Registries

This user’s guide is intended to support the design, implementation, analysis, interpretation, and quality evaluation of registries created to increase understanding of patient outcomes. For the purposes of this guide, a patient registry is an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure, and that serves one or more predetermined scientific, clinical, or policy purposes. A registry database is a file (or files) derived from the registry. Although registries can serve many purposes, this guide focuses on registries created for one or more of the following purposes: to describe the natural history of disease, to determine clinical effectiveness or cost-effectiveness of health care products and services, to measure or monitor safety and harm, and/or to measure quality of care.

Registries are classified according to how their populations are defined. For example, product registries include patients who have been exposed to biopharmaceutical products or medical devices. Health services registries consist of patients who have had a common procedure, clinical encounter, or hospitalization. Disease or condition registries are defined by patients having the same diagnosis, such as cystic fibrosis or heart failure.

Planning a Registry

There are several key steps in planning a patient registry, including articulating its purpose, determining whether it is an appropriate means of addressing the research question, identifying stakeholders, defining the scope and target population, assessing feasibility, and securing funding. The registry team and advisors should be selected based on their expertise and experience.

The plan for registry governance and oversight should clearly address such issues as overall direction and operations, scientific content, ethics, safety, data access, publications, and change management.

It is also helpful to plan for the entire lifespan of a registry, including how and when the registry will end and any plans for transition at that time. A registry may be stopped because it has fulfilled its original purpose, is unable to fulfill its purpose, is no longer relevant, or is unable to maintain sufficient funding, staffing, or other support.

Registry Design

A patient registry should be designed with respect to its major purpose, with the understanding that different levels of rigor may be required for registries designed to address focused analytical questions to support decisionmaking, in contrast to those intended primarily for descriptive purposes. The key points to consider in designing a registry include formulating a research question; choosing a study design; translating questions of clinical interest into measurable exposures and outcomes; choosing patients for study, including deciding whether a comparison group is needed; determining where data can be found; and deciding how many patients need to be studied and for how long. Once these key design issues have been settled, the registry design should be reviewed to evaluate potential sources of bias (systematic error); these should be addressed to the extent that is practical and achievable. The information value of a registry is enhanced by its ability to provide an assessment of the potential for bias and to quantify how this bias could affect the study results.

The specific research questions of interest will guide the registry’s design, including the choice of exposures and outcomes to be studied and the definition of the target population (the population to which the findings are meant to apply). The registry population should be designed to approximate the characteristics of the target population as much as possible. The number of study subjects to be recruited and the length of observation (followup) should be planned in accordance with the overall purpose of the registry. The desired study size (in terms of subjects or person-years of observation) is determined by specifying the magnitude of an expected, clinically meaningful effect or the desired precision of effect estimates. Study size determinants are also affected by practicality, cost, and whether or not the registry is intended to support regulatory decisionmaking. Depending on the purpose of the registry, internal, external, or historical comparison groups strengthen the understanding of whether the observed effects are indeed real and in fact different from what would have occurred under other circumstances.
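As a rough illustration of sizing a registry from the desired precision of an estimate, the sketch below computes how many subjects would be needed to estimate a single proportion (for example, an event rate) within a chosen confidence interval half-width. It assumes the normal approximation for a proportion and complete followup; the function name and the example values are illustrative and are not taken from this guide.

```python
import math
from statistics import NormalDist


def subjects_for_precision(expected_proportion, half_width, confidence=0.95):
    """Approximate number of subjects needed to estimate a proportion
    to within +/- half_width (normal approximation; an assumption here)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance = expected_proportion * (1 - expected_proportion)
    return math.ceil(z ** 2 * variance / half_width ** 2)


# Hypothetical example: an event expected in about 10% of patients,
# to be estimated within +/- 2 percentage points at 95% confidence.
print(subjects_for_precision(0.10, 0.02))
```

In practice the figure would be inflated to allow for expected losses to followup, and rare outcomes or comparative questions call for other methods.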

Registry study designs often restrict eligibility for entry to individuals with certain characteristics (e.g., age) to ensure that the registry will have subgroups with sufficient numbers of patients for analysis. Alternatively, the registry may use some form of sampling (random selection, systematic sampling, or a haphazard, nonrandom approach) to achieve this end.

Use of Registries for Product Safety Assessment

Whether as part of a postmarketing requirement or out of a desire to supplement spontaneous reporting, prospective product and disease registries are also increasingly being considered as resources for examining unresolved safety issues and/or as tools for proactive risk assessment in the postapproval setting. Registries can be valuable tools for evaluating product safety, although they are only one of many approaches to safety assessments. When designing a registry for the purposes of safety, the size of the registry, the enrolled population, and the duration of followup are all critical characteristics to ensure validity of the inferences made based on the data collected. Consideration in the design phase must also be given to other recognized aspects of product use in the real world (e.g., switching therapies during followup, use of multiple products in combination or in sequence, dose effects, delayed effects, and patient compliance).

Registries designed for safety assessment purposes should also formulate a plan that ensures that appropriate information will reach the right stakeholders (through reporting either to the manufacturer or directly to the regulator) in a timely manner. Stakeholders include patients, clinicians, providers, product manufacturers and authorization holders, and payers such as private, State, and national insurers. Registries not designed specifically for safety assessment purposes should, at a minimum, ensure that standard reporting mechanisms for adverse event information are described in the registry’s standard operating procedures and are made clear to investigators.

Data Elements

The selection of data elements requires balancing such factors as their importance for the integrity of the registry and for the analysis of primary outcomes, their reliability, their contribution to the overall burden for respondents, and the incremental costs associated with their collection. Selection begins with identifying relevant domains. Specific data elements are then selected with consideration for established clinical data standards, common data definitions, and whether patient identifiers will be used. It is important to determine which elements are absolutely necessary and which are desirable but not essential. In choosing measurement scales for the assessment of patient-reported outcomes, it is preferable to use scales that have been appropriately validated, when such tools exist. Once data elements have been selected, a data map should be created, and the data collection tools should be pilot tested. Testing allows assessment of respondent burden, the accuracy and completeness of questions, and potential areas of missing data. Inter-rater agreement for data collection instruments can also be assessed, especially in registries that rely on chart abstraction. Overall, the choice of data elements should be guided by parsimony, validity, and a focus on achieving the registry’s purpose.
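One common way to quantify inter-rater agreement during pilot testing is Cohen's kappa, which corrects the observed agreement between two abstractors for the agreement expected by chance. The guide does not prescribe a particular statistic; the sketch below is a minimal illustration under that assumption, and the function name and sample ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical pilot: two abstractors code "diabetes documented" on 10 charts.
abstractor_1 = ["y", "y", "y", "y", "y", "n", "n", "n", "n", "n"]
abstractor_2 = ["y", "y", "y", "y", "n", "n", "n", "n", "n", "y"]
print(round(cohens_kappa(abstractor_1, abstractor_2), 2))
```

Values near 1 indicate strong agreement; values near 0 indicate agreement no better than chance, which may signal an ambiguous data definition.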

Data Sources

A single registry may integrate data from various sources. The form, structure, availability, and timeliness of the required data are important considerations. Data sources can be classified as primary or secondary. Primary data are collected by the registry for its direct purposes. Secondary data have been collected by a secondary source for purposes other than the registry, and may not be uniformly structured or validated with the same rigor as the registry’s primary data. Sufficient identifiers are necessary to guarantee an accurate match between data from secondary sources and registry patients. Furthermore, it is advisable to obtain a solid understanding of the original purpose of the secondary data, because the way those data were collected and verified or validated will help shape or limit their use in a registry. Common secondary sources of data linked to registries include medical records systems, institutional or organizational databases, administrative health insurance claims data, death and birth records, census databases, and related existing registry databases.

Linking Registry Data

Registry data may be linked to other data sources (e.g., administrative data sources, other registries) to examine questions that cannot be addressed using the registry data alone. Two equally weighted and important sets of questions must be addressed in the data linkage planning process: (1) What is a feasible technical approach to linking the data? (2) Is linkage legally feasible under the permissions, terms, and conditions that applied to the original compilations of each dataset? Many statistical techniques for linking records exist (e.g., deterministic matching, probabilistic matching); the choice of a technique should be guided by the types of data available. Linkage projects should include plans for managing common issues (e.g., records that exist in only one database and variations in units of measure). In addition, it is important to understand that linkage of de-identified data may result in accidental re-identification. Risks of re-identification vary depending on the variables used, and should be managed with guidance from legal and statistical experts to minimize risk and ensure compliance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA), the Common Rule, and other legal and regulatory requirements.
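The sketch below illustrates deterministic matching in its simplest form: records are linked only when a set of shared identifiers agrees exactly after light normalization, and registry records with no match or with more than one candidate are set aside for review. The field names and sample records are hypothetical, and the two-field key is used only to keep the example short; a real linkage would need enough identifiers to make matches reliable.

```python
def normalize(value):
    """Trim whitespace and lowercase so formatting differences do not block a match."""
    return str(value).strip().lower()


def link_deterministic(registry_rows, secondary_rows, keys):
    """Link rows whose key fields agree exactly; return (linked, unlinked)."""
    # Index the secondary source by its normalized identifier tuple.
    index = {}
    for row in secondary_rows:
        index.setdefault(tuple(normalize(row[k]) for k in keys), []).append(row)

    linked, unlinked = [], []
    for row in registry_rows:
        candidates = index.get(tuple(normalize(row[k]) for k in keys), [])
        if len(candidates) == 1:
            linked.append((row, candidates[0]))
        else:
            # No match, or an ambiguous match: leave for manual review.
            unlinked.append(row)
    return linked, unlinked


# Hypothetical records, for illustration only.
registry = [
    {"patient_key": "A1", "birth_date": "1950-02-01", "sex": "F"},
    {"patient_key": "B2", "birth_date": "1961-07-15", "sex": "M"},
]
claims = [
    {"claim_id": 10, "birth_date": "1950-02-01 ", "sex": "f"},
    {"claim_id": 11, "birth_date": "1975-03-09", "sex": "M"},
]
linked, unlinked = link_deterministic(registry, claims, keys=("birth_date", "sex"))
print(len(linked), len(unlinked))
```

Probabilistic matching would instead score partial agreement across fields and link pairs above a threshold, which can be preferable when identifiers are incomplete or error-prone.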

Ethics, Data Ownership, and Privacy

Critical ethical and legal considerations should guide the development and use of patient registries. The Common Rule is the uniform set of regulations on the ethical conduct of human subjects research issued by the Federal agencies that fund such research. Institutions that conduct research agree to comply with the Common Rule for federally funded research, and may opt to apply that rule to all human subjects activities conducted within their facilities or by their employees and agents, regardless of the source of funding. HIPAA and its implementing regulations (collectively, the Privacy Rule) are the legal protections for the privacy of individually identifiable health information created and maintained by health care providers, health plans, and health care clearinghouses (called “covered entities”). The research purpose of a registry, the status of its developer, and the extent to which registry data are individually identifiable largely determine which regulatory requirements apply. Other important concerns include transparency of activities, oversight, and data ownership. This section focuses solely on U.S. law. Health information is also legally protected in European and some other countries by distinctly different rules.

Patient and Provider Recruitment and Management

Recruitment and retention of patients as registry participants and providers as registry sites are essential to the success of a registry. Recruitment typically occurs at several levels, including facilities (hospitals, physicians’ practices, and pharmacies), providers, and patients. The motivating factors for participation at each level and the factors necessary to achieve retention differ according to the registry. Factors that motivate participation include the perceived relevance, importance, or scientific credibility of the registry, as well as the risks and burdens of participation and any incentives for participation. Because patient and provider recruitment and retention can affect how well a registry represents the target population, well-planned strategies for enrollment and retention are critical. Goals for recruitment, retention, and followup should be explicitly laid out in the registry planning phase, and deviations during the conduct of the registry should be continuously evaluated for their risk of introducing bias.

Data Collection and Quality Assurance

The integrated system for collecting, cleaning, storing, monitoring, reviewing, and reporting on registry data determines the utility of those data for meeting the registry’s goals. A broad range of data collection procedures and systems are available. Some are more suitable than others for particular purposes. Critical factors in the ultimate quality of the data include how data elements are structured and defined, how personnel are trained, and how data problems are handled (e.g., missing, out-of-range, or logically inconsistent values). Registries may also be required to conform to guidelines or to the standards of specific end users of the data (e.g., 21 Code of Federal Regulations, Part 11). Quality assurance aims to affirm that the data were, in fact, collected in accordance with established procedures and that they meet the requisite standards of quality to accomplish the registry’s intended purposes and the intended use of the data.
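The three kinds of data problems named above (missing, out-of-range, and logically inconsistent values) can each be expressed as a simple automated check. The sketch below is one possible shape for such checks; the field names, the required-field list, and the plausible age range are assumptions chosen for illustration, not requirements from this guide.

```python
from datetime import date

# Hypothetical required fields for an enrollment record.
REQUIRED_FIELDS = ("patient_id", "age_years", "enrollment_date")


def check_record(record):
    """Return a list of data problems found in one registry record."""
    problems = []

    # Missing values: a required field is absent or empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing: {field}")

    # Out-of-range values: age outside an assumed plausible range.
    age = record.get("age_years")
    if age is not None and not 0 <= age <= 120:
        problems.append("out of range: age_years")

    # Logically inconsistent values: followup recorded before enrollment.
    enrolled = record.get("enrollment_date")
    last_seen = record.get("last_followup_date")
    if enrolled and last_seen and last_seen < enrolled:
        problems.append("inconsistent: last_followup_date precedes enrollment_date")

    return problems


clean = {"patient_id": "P1", "age_years": 54,
         "enrollment_date": date(2010, 1, 5),
         "last_followup_date": date(2010, 6, 1)}
flawed = {"patient_id": None, "age_years": 140,
          "enrollment_date": date(2010, 6, 1),
          "last_followup_date": date(2010, 1, 5)}
print(check_record(clean))
print(check_record(flawed))
```

Checks of this kind are typically run at data entry or in periodic batch reviews, with flagged records routed back to the site as queries rather than corrected silently.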

Requirements for quality assurance should be defined during the registry’s inception and creation. Because certain requirements may have significant cost implications, a risk-based approach to developing a quality assurance plan is recommended. It should be based on identifying the most important or likely sources of error or potential lapses in procedures that may affect the quality of the registry in the context of its intended purpose.

Interfacing Registries and Electronic Health Records

Achieving interoperability between electronic health records (EHRs) and registries will be increasingly important as adoption of EHRs and the use of patient registries for many purposes both grow significantly. Such interoperability should be based on open standards that enable any willing provider to interface with any applicable registry without requiring customization or permission from the EHR vendor. Interoperability for health information systems requires accurate and consistent data exchange and use of the information that has been exchanged. Syntactic interoperability (the ability to exchange data) and semantic interoperability (the ability to understand the exchanged data) are the core constructs of interoperability and must be present in order for EHRs and registries to share data successfully. Full interoperability is unlikely to be achieved for some time. The successive development, testing, and adoption of open standard building blocks (e.g., the Healthcare Information Technology Standards Panel’s HITSP TP-50) is a pragmatic approach toward incrementally advancing interoperability while providing real benefits today. Care must be taken to ensure that integration efforts comply with legal and regulatory requirements for the protection of patient privacy.

Adverse Event Detection, Processing, and Reporting

The U.S. Food and Drug Administration defines an adverse event (AE) as any untoward medical occurrence in a patient administered a pharmaceutical product, whether or not related to or considered to have a causal relationship with the treatment. AEs are categorized according to the seriousness and, for drugs, the expectedness of the event. Although AE reporting for all marketed products is dependent on the principle of “becoming aware,” collection of AE data falls into two categories: those events that are intentionally solicited (meaning data that are part of the uniform collection of information in the registry) and those that are unsolicited (meaning that the AE is volunteered or noted in an unsolicited manner). Determining whether the registry should use a case report form to collect AEs should be based on the scientific importance of the information for evaluating the specified outcomes of interest. Regardless of whether or not AEs constitute outcomes for the registry, it is important for any registry that has direct patient interaction to develop a plan for detecting, processing, and reporting AEs. If the registry receives sponsorship, in whole or in part, from a regulated industry (drugs or devices), the sponsor has mandated reporting requirements. In that case, the process for detecting and reporting AEs should be established, and registry personnel should receive training on how to identify AEs and to whom they should be reported. Sponsors of registries designed specifically to meet requirements for surveillance of drug or device safety are encouraged to hold discussions with health authorities about the most appropriate process for reporting serious AEs.

Analysis and Interpretation of Registry Data

Analysis and interpretation of registry data begin with answering a series of core questions: Who was studied, and how were they chosen for study? How were the data collected, edited, and verified, and how were missing data handled? How were the analyses performed? Four populations are of interest in describing who was studied: the target population, the accessible population, the intended population, and the population actually studied (the “actual population”). The degree to which the actual population represents the target population is referred to as generalizability.

Analysis of registry outcomes first requires an analysis of recruitment and retention, of the completeness of data collection, and of data quality. Considerations include an evaluation of losses to followup; completeness for most, if not all, important covariates; and an understanding of how missing data were handled and reported. Analysis of a registry should provide information on the characteristics of the patient population, the exposures of interest, and the endpoints. Descriptive registry studies focus on describing frequency and patterns of various elements in a patient population, whereas analytical studies concentrate on associations between patients or treatment characteristics and health outcomes of interest. A statistical analysis plan describes the analytical plans and statistical techniques that will be used to evaluate the primary and secondary objectives specified in the study plan. Interpretation of registry data should be provided so that the conclusions can be understood in the appropriate context and any lessons from the registry can be applied to the target population and used to improve patient care and outcomes.

Evaluating Registries

Although registries can provide useful information, there are levels of rigor that enhance validity and make the information from some registries more useful for guiding decisions than the information from others. The term “quality” can be applied to registries to describe the confidence that the design, conduct, and analysis of the registry can be shown to protect against bias and errors in inference—that is, erroneous conclusions drawn from a registry. Although there are limitations to any assessment of quality, a quality component analysis is used both to evaluate high-level factors that may affect results and to differentiate between research quality (which pertains to the scientific process) and evidence quality (which pertains to the data/findings emanating from the research process). Quality components are classified as either “basic elements of good practice,” which can be viewed as a checklist that should be considered for all patient registries, or as “potential enhancements to good practice,” which may strengthen the information value in particular circumstances. The results of such an evaluation should be considered in the context of the disease area(s), the type of registry, and the purpose of the registry, and should also take into account feasibility and affordability.