NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gliklich RE, Leavy MB, Dreyer NA, editors. Registries for Evaluating Patient Outcomes: A User’s Guide [Internet]. 4th edition. Rockville (MD): Agency for Healthcare Research and Quality (US); 2020 Sep.

Executive Summary

Defining Patient Registries

This User’s Guide is intended to support the design, implementation, analysis, interpretation, and quality evaluation of registries created to increase understanding of patient outcomes. For the purposes of this guide, a patient registry is an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure, and that serves one or more stated scientific, clinical, or policy purposes. A registry database is a file (or files) derived from the registry. Although registries can serve many purposes, this guide focuses on registries created for one or more of the following purposes: to describe the natural history of disease, to determine clinical effectiveness or cost-effectiveness of healthcare products and services, to measure or monitor safety and harm, and/or to measure quality of care.

Registries are classified according to how their populations are defined. For example, product registries include patients who have been exposed to biopharmaceutical products or medical devices. Health services registries consist of patients who have had a common procedure, clinical encounter, or hospitalization. Disease or condition registries are defined by patients having the same diagnosis, such as cystic fibrosis or heart failure.

Planning a Registry

There are several key steps in planning a patient registry, including articulating its purpose, determining whether it is an appropriate means of addressing the research question, identifying stakeholders, defining the scope and target population, assessing feasibility, and securing funding. The registry team and advisors should be selected based on their expertise and experience. The plan for registry governance and oversight should clearly address such issues as overall direction and operations, scientific content, ethics, safety, data access, publications, and change management. It is also helpful to plan for the entire lifespan of a registry, including how and when the registry will end and any plans for transition at that time. Special consideration should be given to the unique challenges of planning specific types of registries, such as rare disease registries or quality improvement registries.

Registry Design

A patient registry should be designed with respect to its major purpose, with the understanding that different levels of rigor may be required for registries designed to address focused analytical questions to support decision making, in contrast to registries intended primarily for descriptive purposes. The key points to consider in designing a registry include formulating a research question; choosing a study design; translating questions of clinical interest into measurable exposures and outcomes; choosing patients for study, including deciding whether a comparison group is needed; determining where data can be found; and deciding how many patients need to be studied and for how long. Once these key design issues have been settled, the registry design should be reviewed to evaluate potential sources of bias (systematic error); these should be addressed to the extent that is practical and achievable. The information value of a registry is enhanced by its ability to provide an assessment of the potential for bias and to quantify how this bias could affect the study results.

The specific research questions of interest will guide the registry’s design, including the choice of exposures and outcomes to be studied and the definition of the target population (the population to which the findings are meant to apply). The registry population should be designed to approximate the characteristics of the target population as much as possible. The number of study subjects to be recruited and the length of observation (followup) should be planned in accordance with the overall purpose of the registry. The desired study size (in terms of subjects or person-years of observation) is determined by specifying the magnitude of an expected, clinically meaningful effect or the desired precision of effect estimates. Study size is also affected by practicality, cost, and whether the registry is intended to support regulatory decision making. Depending on the purpose of the registry, internal, external, or historical comparison groups strengthen the understanding of whether the observed effects are real and different from what would have occurred under other circumstances. Registry study designs often restrict eligibility for entry to individuals with certain characteristics (e.g., age) to ensure that the registry will have subgroups with sufficient numbers of patients for analysis. Alternatively, the registry may use some form of sampling (random selection, systematic sampling, or a haphazard, nonrandom approach) to achieve this end.
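As a rough illustration of how a desired precision or an expected effect translates into study size, the following Python sketch applies standard normal-approximation formulas for a single proportion and for a comparison of two proportions. It is not taken from the guide. The function names and the default confidence level, significance level, and power are assumptions chosen for illustration, and a real registry would also need to allow for losses to followup, clustering by site, and person-time rather than simple counts.

```python
import math
from statistics import NormalDist


def n_for_precision(expected_proportion, half_width, confidence=0.95):
    """Subjects needed so that a normal-approximation confidence interval
    for a proportion has roughly the requested half-width."""
    if not 0 < expected_proportion < 1 or half_width <= 0:
        raise ValueError("proportion must be in (0, 1) and half_width > 0")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a difference between two
    outcome proportions with a two-sided test (unpooled variance)."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical inputs: a 10% event rate estimated to within +/- 2 points,
# and a comparison of 10% versus 15% event rates.
precision_n = n_for_precision(0.10, 0.02)
comparison_n = n_per_group_two_proportions(0.10, 0.15)
```

Smaller effects or tighter precision drive the required size up quickly, which is one reason practicality and cost enter the decision alongside the statistical target.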

Special consideration should be given to the unique challenges of designing registries for specific purposes, such as product safety surveillance, rare diseases, medical devices, and quality improvement.

Selecting and Defining Outcome Measures for Registries

The selection and definition of patient outcomes of interest is a critical step in designing a patient registry. The outcomes of interest, together with the exposure(s) of interest, drive many of the decisions regarding the study duration, the necessary data elements, and the source(s) of the data. Outcomes should be selected primarily based on the research questions of interest, with consideration given to the feasibility of capturing the desired outcomes within the study scope and budget. It is also important to consider the perspectives of multiple stakeholders when determining which outcomes are most relevant. Tools such as the Outcome Measures Framework can be helpful to guide the selection and definition of outcome measures for use within registries. In addition, the use of standardized outcome measures or other data standards, when available, is essential so that registries can maximally contribute to evolving medical knowledge. Standard terminologies (and, to a greater degree, higher level groupings into core datasets for specific conditions) not only improve efficiency in establishing registries but also promote more effective sharing, combining, or linking of datasets from different sources. Furthermore, the use of well-defined standards for data elements and data structure ensures that the meaning of information captured in different systems is the same. This is critical to maximize the value of registries as tools in learning health systems and a national research infrastructure.

Data Elements

The selection of data elements requires balancing such factors as their importance for the integrity of the registry and for the analysis of primary outcomes, their reliability, their contribution to the overall burden for respondents, and the incremental costs associated with their collection. Selection begins with identifying relevant domains. Specific data elements are then selected with consideration for established clinical data standards, common data definitions, and whether patient identifiers will be used. It is important to determine which elements are absolutely necessary and which are desirable but not essential. In choosing measurement scales for the assessment of patient-reported outcomes, it is preferable to use scales that have been appropriately validated, when such tools exist. Once data elements have been selected, a data map should be created, and the data collection tools should be pilot tested. Testing allows assessment of respondent burden, the accuracy and completeness of questions, and potential areas of missing data. Inter-rater agreement for data collection instruments can also be assessed, especially in registries that rely on chart abstraction. Overall, the choice of data elements should be guided by parsimony, validity, and a focus on achieving the registry’s purpose.
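One common way to assess inter-rater agreement during pilot testing is Cohen's kappa, which corrects the raw agreement between two abstractors for the agreement expected by chance. The guide does not prescribe a particular statistic, so the choice of kappa, the function name, and the example values below are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical pilot: two abstractors code "diabetes documented" for 10 charts.
abstractor_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
abstractor_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]
kappa = cohens_kappa(abstractor_1, abstractor_2)
```

In this made-up pilot the abstractors agree on 8 of 10 charts, yet kappa is noticeably lower than 0.8 because a good share of that agreement would be expected by chance alone.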

Data Sources

A single registry may integrate data from various sources. The form, structure, availability, and timeliness of the required data are important considerations. Data sources can be classified as primary or secondary. Primary data are collected by the registry for its direct purposes. Secondary data have been collected by a secondary source for purposes other than the registry, and may not be uniformly structured or validated with the same rigor as the registry’s primary data. Sufficient identifiers are necessary to guarantee an accurate match between data from secondary sources and registry patients. Furthermore, it is advisable to obtain a solid understanding of the original purpose of the secondary data, because the way those data were collected and verified or validated will help shape or limit their use in a registry. Common secondary sources of data linked to registries include medical records systems, institutional or organizational databases, administrative health insurance claims data, death and birth records, census databases, and related existing registry databases.
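The role of identifiers in linking secondary data can be sketched with a simple deterministic match. The example below is only an illustration: the field names (`patient_id`, `record_id`, `last_name`, `birth_date`, `sex`) and the choice of matching keys are hypothetical, and production linkage often relies on probabilistic methods and privacy-preserving identifiers instead.

```python
from collections import defaultdict


def link_records(registry_patients, secondary_records,
                 keys=("last_name", "birth_date", "sex")):
    """Match secondary-source records to registry patients on exact,
    normalized identifiers. Returns (linked, unmatched, ambiguous)."""

    def match_key(record):
        # Normalize case and whitespace so trivial differences do not block a match.
        return tuple(str(record[k]).strip().lower() for k in keys)

    index = defaultdict(list)
    for patient in registry_patients:
        index[match_key(patient)].append(patient["patient_id"])

    linked, unmatched, ambiguous = {}, [], []
    for record in secondary_records:
        candidates = index.get(match_key(record), [])
        if len(candidates) == 1:
            linked[record["record_id"]] = candidates[0]
        elif not candidates:
            unmatched.append(record["record_id"])
        else:
            # Identifiers are insufficient to pick one patient; do not guess.
            ambiguous.append(record["record_id"])
    return linked, unmatched, ambiguous


# Hypothetical data: two registry patients share the same identifiers.
patients = [
    {"patient_id": "P1", "last_name": "Garcia", "birth_date": "1950-03-02", "sex": "F"},
    {"patient_id": "P2", "last_name": "Lee", "birth_date": "1962-11-20", "sex": "M"},
    {"patient_id": "P3", "last_name": "Lee", "birth_date": "1962-11-20", "sex": "M"},
]
claims = [
    {"record_id": "C1", "last_name": " garcia", "birth_date": "1950-03-02", "sex": "f"},
    {"record_id": "C2", "last_name": "Lee", "birth_date": "1962-11-20", "sex": "M"},
    {"record_id": "C3", "last_name": "Okafor", "birth_date": "1971-07-09", "sex": "F"},
]
linked, unmatched, ambiguous = link_records(patients, claims)
```

Records that match more than one patient are set aside rather than assigned, which reflects the point that the identifiers must be sufficient before a match can be trusted.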

Ethics, Data Ownership, and Privacy

Critical ethical and legal considerations should guide the development and use of patient registries. The Common Rule is the uniform set of regulations on the ethical conduct of human subjects research, issued by the Federal agencies that fund such research. Institutions that conduct research agree to comply with the Common Rule for federally funded research, and may opt to apply that rule to all human subjects activities conducted within their facilities or by their employees and agents, regardless of the source of funding. The Privacy Rule, promulgated under the Health Insurance Portability and Accountability Act of 1996 (HIPAA), establishes Federal protections for the privacy of individually identifiable health information created and maintained by health plans, healthcare clearinghouses, and most healthcare providers (collectively, “covered entities”). The purpose of a registry, the type of entity that creates or maintains the registry, the types of entities that contribute data to the registry, and the extent to which registry data are individually identifiable affect how the regulatory requirements apply. Other important concerns include transparency of activities, oversight, and data ownership. This chapter of the User’s Guide focuses solely on U.S. law. Health information is also legally protected in European and some other countries by distinctly different rules.

Informed Consent for Registries

The requirement of informed consent often raises different issues for patient registries versus clinical trials. For example, registries may be used for public health or quality improvement activities, which may not constitute “human subjects research.” Also, registries may integrate data from multiple electronic sources (e.g., claims data, electronic health records) and may be linked to biobanks. Institutional review boards may approve waivers or alterations of informed consent (e.g., electronic consent, oral consent) for some registries, depending on the purpose and risk to participants. Established registries that undergo a change in scope (e.g., changes in data sharing policies, changes to the protocol, extension of the followup period) may need to ask patients to “re-consent.” When planning informed consent procedures, registry developers should consider several factors, including documentation and format, consent revisions and re-consent, the applicability of regulatory requirements, withdrawal of participants from the study, and the physical and electronic security of patient data and biological specimens. In addition, registry developers may need to consider the individual authorization requirements of the HIPAA Privacy Rule, where applicable.

Registry Governance

Registries function in a dynamic environment and are often shaped by the complex relationships among individual health, public health policy, economics, geography, and culture. Complexity within registries stems from the topics being studied, stakeholders with different agendas, and the legal and political climates for such research, among other factors. Governance is an important tool to help registries manage complexities such as these across the registry lifecycle, from the initial planning phase through the dissemination of results. Registry governance refers to a formalized structure or plan for managing the registry and guiding decision making related to registry funding, operations, and dissemination of information. Registry governance can take many forms depending on the scope of the registry, the number of stakeholders, and the purpose of the registry, but some principles for successful governance apply across all governance models. In particular, all aspects of governance should be codified in a written format that can be reviewed, shared, and refined over time, and transparency regarding any perceived or actual conflicts of interest is important for effective governance. Expectations of each research partner should be clearly delineated, pragmatic, and transparent. Lastly, policies and procedures should be developed to support stakeholder engagement and transparency.

Patient and Provider Recruitment and Management

Recruitment and retention of patients as registry participants, and of providers as registry sites, are essential to the success of a registry. Recruitment typically occurs at several levels, including facilities (hospitals, physicians’ practices, and pharmacies), providers, and patients. The motivating factors for participation at each level and the factors necessary to achieve retention differ according to the registry. Factors that motivate participation include the perceived relevance, importance, or scientific credibility of the registry, as well as a favorable balance of any incentives for participation versus the risks and burdens thereof. Because patient and provider recruitment and retention can affect how well a registry represents the target population, well-planned strategies for enrollment and retention are critical. Goals for recruitment, retention, and followup should be explicitly laid out in the registry planning phase, and deviations during the conduct of the registry should be continuously evaluated for their risk of introducing bias.

Obtaining Data and Quality Assurance

The integrated system for obtaining, cleaning, storing, monitoring, reviewing, and reporting on registry data determines the utility of those data for meeting the registry’s goals. A broad range of procedures and systems are available for obtaining or collecting data. Some are more suitable than others for particular purposes. Critical factors in the ultimate quality of the data include how data elements are structured and defined, how personnel are trained, and how data problems (e.g., missing, out of range, or logically inconsistent values) are handled. Registries may also be required to conform to guidelines or to the standards of specific end users of the data (e.g., 21 Code of Federal Regulations, Part 11). Quality assurance aims to affirm that the data were, in fact, collected in accordance with established procedures and that they meet the requisite standards of quality to accomplish the registry’s intended purposes and the intended use of the data. Requirements for quality assurance should be defined during the registry’s inception and creation. Because certain requirements may have significant cost implications, a risk-based approach to developing a quality assurance plan is recommended. It should be based on identifying the most important or likely sources of error or potential lapses in procedures that may affect the quality of the registry in the context of its intended purpose.
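The three kinds of data problems mentioned above (missing, out-of-range, and logically inconsistent values) can be illustrated with a small record-level check. This is a sketch under stated assumptions: the field names, the list of required fields, and the plausible range for systolic blood pressure are hypothetical and would be defined by each registry's own data dictionary.

```python
from datetime import date

# Hypothetical data dictionary entries.
REQUIRED_FIELDS = ("patient_id", "birth_date", "enrollment_date", "systolic_bp")
SYSTOLIC_BP_RANGE = (50, 300)  # mmHg; illustrative limits only


def check_record(record):
    """Return a list of data-quality issues found in one registry record."""
    issues = []

    # Missing values.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            issues.append(f"missing: {field}")

    # Out-of-range values.
    sbp = record.get("systolic_bp")
    low, high = SYSTOLIC_BP_RANGE
    if sbp is not None and not low <= sbp <= high:
        issues.append("out of range: systolic_bp")

    # Logically inconsistent values.
    born = record.get("birth_date")
    enrolled = record.get("enrollment_date")
    if born is not None and enrolled is not None and enrolled < born:
        issues.append("inconsistent: enrollment_date precedes birth_date")

    return issues


# Hypothetical record with an implausible reading and an impossible date order.
example_issues = check_record({
    "patient_id": "P1",
    "birth_date": date(1980, 1, 1),
    "enrollment_date": date(1979, 6, 1),
    "systolic_bp": 400,
})
```

Checks like these only flag problems. How each flagged value is queried, corrected, or left as missing is a procedural decision that belongs in the quality assurance plan.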

Adverse Event Detection, Processing, and Reporting

The U.S. Food and Drug Administration defines an adverse event (AE) as any untoward medical occurrence in a patient administered a pharmaceutical product, whether or not related to or considered to have a causal relationship with the treatment. AEs are categorized according to the seriousness and, for drugs, the expectedness of the event. Although AE reporting for all marketed products is dependent on the principle of “becoming aware,” collection of AE data falls into two categories: those events that are intentionally solicited (meaning data that are part of the uniform collection of information in the registry) and those that are unsolicited (meaning that the AE is volunteered or noted in an unsolicited manner). The determination of whether the registry should use a case report form to collect AEs should be based on the scientific importance of the information for evaluating the specified outcomes of interest. Regardless of whether AEs constitute a primary objective of the registry, it is important for any registry that has direct patient interaction to develop a plan for detecting, processing, and reporting AEs. If the registry receives sponsorship, in whole or in part, from a regulated industry (drugs or devices), the sponsor has mandated reporting requirements, including stringent timelines. In that case, the registry should establish the process for detecting and reporting AEs and should provide training to registry personnel on how to identify AEs and to whom they should be reported. Sponsors of registries designed specifically to meet requirements for surveillance of drug or device safety are encouraged to hold discussions with health authorities about the most appropriate process for reporting serious AEs.

Analysis, Interpretation, and Reporting of Registry Data

Analysis and interpretation of registry data begin with answering a series of core questions: Who was studied, and how were they chosen for study? How were the data collected, edited, and verified, and how were missing data handled? How were the analyses performed? Four populations are of interest in describing who was studied: the target population, the accessible population, the intended population, and the population actually studied (the “actual population”). The extent to which the actual population represents the target population is referred to as generalizability.

Analysis of registry outcomes first requires an analysis of recruitment and retention, of the completeness of data collection, and of data quality. Considerations include an evaluation of losses to followup; completeness for most, if not all, important covariates; and an understanding of how missing data were handled and reported. Analysis of a registry should provide information on the characteristics of the patient population, the exposures of interest, and the endpoints. Descriptive registry studies focus on describing frequency and patterns of various elements in a patient population, whereas analytical studies concentrate on associations between patients or treatment characteristics and health outcomes of interest. A statistical analysis plan describes the analytical plans and statistical techniques that will be used to evaluate the primary and secondary objectives specified in the study plan. Interpretation of registry data should be provided so that the conclusions can be understood in the appropriate context and any lessons from the registry can be applied to the target population and used to improve patient care and outcomes.
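Before any outcome analysis, the checks on losses to followup and covariate completeness described above can be summarized in a few lines. The sketch below is illustrative only: the record layout, the `lost_to_followup` flag, and the covariate names are hypothetical, and missing values are assumed to be stored as `None`.

```python
def completeness_report(records, covariates):
    """Summarize losses to followup and per-covariate completeness."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    lost = sum(1 for r in records if r.get("lost_to_followup"))
    completeness = {
        c: sum(1 for r in records if r.get(c) is not None) / n
        for c in covariates
    }
    return {
        "n": n,
        "loss_to_followup": lost / n,
        "completeness": completeness,
    }


# Hypothetical registry extract with four patients.
extract = [
    {"age": 60, "sex": "F", "lost_to_followup": False},
    {"age": None, "sex": "M", "lost_to_followup": True},
    {"age": 70, "sex": None, "lost_to_followup": False},
    {"age": 55, "sex": "F", "lost_to_followup": False},
]
report = completeness_report(extract, covariates=("age", "sex"))
```

A report of this kind describes how much is missing but not why. Whether the losses and gaps are likely to bias the results still has to be judged against the registry's purpose and reported alongside the findings.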

Evaluating Registries

Although registries can provide useful information, there are levels of rigor that enhance validity and make the information from some registries more useful for guiding decisions. The term “quality” can be applied to registries to describe the confidence that the design, conduct, and analysis of the registry can be shown to protect against bias and errors in inference—that is, erroneous conclusions drawn from the registry. Although there are limitations to any assessment of quality, a quality component analysis is used both to evaluate high-level factors that may affect results and to differentiate between research quality (which pertains to the scientific process) and evidence quality (which pertains to the data/findings emanating from the research process). Quality components are classified as either “essential elements of good practice,” which can be viewed as a checklist that should be considered for all patient registries, or as “potential enhancements to good practice,” which may strengthen the value of the information in particular circumstances. The results of such an evaluation should be considered in the context of the disease area(s), the type of registry, and the purpose of the registry, and should also take into account feasibility and affordability.

©2020 United States Government, as represented by the Secretary of the Department of Health and Human Services, by assignment.

All rights reserved. The Agency for Healthcare Research and Quality (AHRQ) permits members of the public to reproduce, redistribute, publicly display, and incorporate this work into other materials provided that it must be reproduced without any changes to the work or portions thereof, except as permitted as fair use under the U.S. Copyright Act. This work contains certain tables and figures noted herein that are subject to copyright by third parties. These tables and figures may not be reproduced, redistributed, or incorporated into other materials independent of this work without permission of the third-party copyright owner(s). This work may not be reproduced, reprinted, or redistributed for a fee, nor may the work be sold for profit or incorporated into a profit-making venture without the express written consent of AHRQ. This work is subject to the restrictions of Section 1140 of the Social Security Act, 42 U.S.C. § 1320b-10. When parts of this work are used or quoted, the following citation should be used:

Bookshelf ID: NBK562582