Strategic vision for improving human health at The Forefront of Genomics
Abstract
Starting with the launch of the Human Genome Project three decades ago, genomics has become progressively entrenched within the bedrock of the biomedical research enterprise. Capitalizing on the momentum of the project’s successful completion in 2003, genomics now regularly plays a central and catalytic role in basic and translational research, and studies increasingly demonstrate the vital role that genomic information can play in clinical care. Looking ahead, the anticipated advances in technology development, biological insights, and clinical applications (among others) will lead to more widespread integration of genomics into virtually all areas of biomedical research, the adoption of genomics into mainstream medical and public-health practices, and an increasing relevance of genomics in everyday life. On behalf of the research community, the National Human Genome Research Institute recently completed a multi-year process of strategic engagement to capture input about the future research priorities and opportunities in human genomics, with an emphasis on health applications. Here we articulate the highest-priority elements envisioned for the cutting-edge of human genomics going forward – that is, at “The Forefront of Genomics.”
Introduction
Three decades ago this month, a pioneering group of international researchers began an audacious journey to generate the first map and sequence of the human genome, marking the start of a 13-year odyssey called the Human Genome Project1–3. The project’s successful and early completion in 2003, which included parallel studies of a set of model organism genomes, catalyzed enormous progress in genomics research. Leading the signature advances has been a greater than one million-fold reduction in the cost of DNA sequencing4. This decrease has allowed the generation of innumerable genome sequences, including hundreds of thousands of human genome sequences (both in research and clinical settings), and the continuous development of new assays for identifying and characterizing functional genomic elements5,6. With these new tools, coupled with increasingly sophisticated statistical and computational methods, researchers have been able to create rich catalogs of human genomic variants7,8, to gain an ever-deepening understanding of the functional complexities of the human genome5, and to elucidate the genomic bases of thousands of human diseases9,10. In turn, the last decade has brought the initial realization of genomic medicine11, as research successes have been converted into powerful tools for use in healthcare, including somatic genome analysis for cancer (enabling development of targeted therapeutics)12, noninvasive prenatal genetic screening13, and genomics-based tests for a growing set of pediatric conditions and rare disorders14, among others.
In essence, with growing insights about the structure and function of the human genome and ever-improving laboratory and computational technologies, genomics has become increasingly woven into the fabric of biomedical research, medical practice, and society. The scope, scale, and pace of genomic advances to date were nearly unimaginable when the Human Genome Project began; even today, such advances are yielding scientific and clinical opportunities beyond our initial expectations, with many more anticipated in the decade ahead.
Embracing its leadership role in genomics, the National Human Genome Research Institute (NHGRI) has developed strategic visions for the field at key inflection points, in particular at the end of the Human Genome Project in 2003 (ref. 15) and then again at the beginning of the last decade in 2011 (ref. 16). These visions outlined the most compelling opportunities for human genomics research, in each case informed by a multi-year engagement process. NHGRI endeavored to start the new decade with an updated strategic vision for human genomics research. Through a planning process that involved over 50 events (e.g., dedicated workshops, conference sessions, and webinars) over the last two years (see http://genome.gov/genomics2020), the institute collected input from a large number of stakeholders, with the resulting input catalogued and synthesized using the framework depicted in Fig. 1.

Fig. 1. Together, the indicated progressive and interrelated areas serve to organize the major elements in the strategic vision described here.
Unlike in the past, this round of strategic planning was significantly influenced by the now widely disseminated nature of genomics across biomedicine. A representative glimpse into this historic phenomenon is illustrated in Fig. 2. During the Human Genome Project, NHGRI was the primary funder of human genomics research at the U.S. National Institutes of Health (NIH), but the past two decades have brought a greater than ten-fold increase in the relative fraction of funding coming from other parts of NIH.

Fig. 2. The total funding levels for NIH (top panel) and NHGRI (middle panel) are indicated for 1990, 2010, and 2020. Also shown (bottom panel) is the relative proportion of funds supporting human genomics research provided by NHGRI versus all of NIH for the three corresponding time intervals (as derived from queries of the internal NIH Research, Condition and Disease Categorization database for funds assigned to the “human genome” category). During the 30-year period when the NHGRI budget increased roughly ten-fold (middle panel), the proportion of total NIH funding for human genomics research coming from parts of NIH other than NHGRI actually increased more dramatically, from <5% during the Human Genome Project to ~90% at the beginning of the current decade (bottom panel). In essence, these trends reflect a leveraging of NHGRI’s funds that increased NIH’s overall human genomics research funding by greater than ten-fold.
The planning process continually encountered the realities associated with the broad and extensive use of genomics and the impracticality of being comprehensive, which together served to focus attention on the most cutting-edge opportunities in human genomics. This experience affirmed NHGRI’s recently rearticulated role in providing genomics leadership at NIH, embodied by our newly conceived organizational mantra: “The Forefront of Genomics.” We ultimately linked this mantra to the strategic planning process to help guide the formulation of input. From the ensuing discussions, it became apparent that responsible stewardship is a central aspect of being at (and pushing forward) The Forefront of Genomics, specifically in the four major areas detailed in Fig. 1, Boxes 1, 2, 3, and 4, and below.
Principles and values for human genomics
As genomics has matured as a discipline, the field has embraced a growing set of fundamental principles and values that together serve as a guiding compass for the research efforts – some of these emerged organically within the field, whereas others have been adopted from the broader scientific community. The growing complexities of human genomics and its many applications (especially in medicine) at The Forefront of Genomics make it imperative to reaffirm, sharpen, and even extend these tenets, such as those highlighted in Box 1.
Many of these principles and values have been informed by the recognized area of genomics that focuses on ethical, legal, and social implications (ELSI) research17, which was established at the beginning of the Human Genome Project to help guarantee that the eugenics movement and other misuses of genetics are not repeated. ELSI research has since grown to encompass a broad portfolio of studies examining issues at the interface of genomics and society, the results of which have informed policies and laws related to genetic discrimination, intellectual property, data sharing, and informed consent18. Similar efforts seek to ensure that the benefits of genomics are available to all members of society19. Genomics, like all fields of science, must reckon with systematic injustices and biases, fully cognizant of their criticality for health equity. Looking ahead, ELSI research needs to focus on aspects of genomic medicine implementation that present challenging questions about legal boundaries, study governance, data control, privacy, and consent. Complex societal issues must also be studied, including the expanded application of genomics in non-medical realms (e.g., ancestry testing, law enforcement, and genetics-based marketing of consumer goods)20. Finally, ELSI research could also examine the implications of studying genetic associations with bio-behavioral traits (e.g., intelligence, sexual behavior, social status, and educational attainment)21 and of a future where machine learning and artificial intelligence are used to tailor risk communication and clinical decisions based on analyzing an individual’s genome sequence22.
Robust foundation for genomics
Genomics is now routinely and broadly utilized throughout biomedical research, with widespread reliance on a robust foundation for facilitating genomic advances. The foundation’s integrity depends on a number of key elements, including infrastructure, resources, and dynamic areas of technology development and research. Sustaining and improving that foundation are key responsibilities at The Forefront of Genomics, the major elements of which are highlighted in Box 2 and detailed in corresponding paragraphs below.
Genome structure and function
The last two decades have brought a greater than million-fold reduction in the cost of DNA sequencing23 along with an explosion in technologies for functional genomics6,24,25 (i.e., the study of how elements in the genome contribute to biological processes). Additional opportunities are poised to be unlocked as the generation and analysis of genomic data become even faster, cheaper, and more accurate. Near-term expectations include enhanced capabilities for generating high-quality and complete (e.g., telomere-to-telomere and phased) genome sequences26,27 and continued refinement and enhanced utility of a human genome reference sequence(s) that increasingly reflects human variation and diversity on a global scale28 and that serves as a substrate for genome annotation29. Technologies for generating DNA sequence and other data types (e.g., transcriptomic data, epigenomic data, and functional readouts of DNA sequences) need to be enabled at orders-of-magnitude lower costs, at single-cell resolution, at distinct spatial locations within tissues, and longitudinally over time30–32. These genomic data should be integrated with other multi-omic data (e.g., proteomes, metabolomes, lipidomes, and/or microbiomes) in sophisticated ways, including novel methods that collect multiple data types from a single sample32. Transformative approaches will become increasingly vital for assimilating, sharing, and analyzing these complex and heterogeneous data types33 and must expand to include the integration of environmental, lifestyle, clinical, and other phenotypic data. These capabilities should be incorporated into browsers, portals, and visualization tools for use by a broadening community of researchers and clinicians.
Genome sequences have now been generated for over 1,000 vertebrate species and are increasingly accompanied by multi-species annotations34. Understanding natural genomic variation, conservation of genomic elements, and the rapid evolutionary changes in genomic regions associated with specific traits is critical for attaining a comprehensive view of genome structure and function. The study of a wide range of organisms continues to be instrumental for elucidating the impact of genomic variation on biological processes and phenotypes, providing insights about the interplay of genomic variants and environmental pressures35 and the relevance of putative pathogenic variants identified in clinical studies36. It is essential that the generation of high-quality multi-species genomic data be accompanied by community-accepted standards for data, metadata, and data interoperability. New groundbreaking methods would allow for integrating functional data from diverse species with human data and visualizing increasingly complex comparative genomic datasets. Continued progress in this area would move the field closer to the long-term aspirational goal of understanding the evolutionary history of every base in the human genome.
Genomic data science
All major genomics breakthroughs to date have been accompanied by the development of groundbreaking statistical and computational methods. Accordingly, continued innovations in both traditional and advanced methods (including machine learning and artificial intelligence) should be prioritized37. These approaches must be considered from the early stages of study planning and data collection in ways that complement and enhance, rather than inhibit, technical progress. Further, the biomedical research community requires accurate, curated, accessible, secure, and interoperable genomic data repositories and informatics platforms that benefit all populations. Approaches for improving the efficiency of such resources include the use of shared storage and computing infrastructure, the adoption of common data-management processes, and the development of increasingly automated data-curation methods38. Carefully considered funding strategies must be designed to support these methods and resources, including a global, multi-funder model that ensures their development, enhancements, and long-term sustainability39.
Recent progress has brought substantial transformations in how the petabytes of genomic data being generated each year are assimilated and analyzed, including the emergence of cloud-based and federated approaches. Effective and efficient management of increasingly complex genomic datasets requires addressing challenges with these emerging approaches as well as innovations in the use of hardware, algorithms, software, standards, and platforms40. Current barriers include the lack of interoperable genomic data resources (which limits downstream access, integration, and analyses) and the absence of controlled and consistently adopted data and metadata vocabularies and ontologies41,42. User-friendly systems that capture metadata in a scalable, intelligent, and cost-effective fashion and that allow for intuitive data visualizations are essential. Ever-improving routines and guidelines should be formulated to continue and even enhance responsible data sharing, requiring the collective efforts of researchers, funders, and publishers alike; similar attention should focus on ensuring the use of FAIR (findable, accessible, interoperable, and reusable) data standards and the reproducibility of data analyses38. Innovations in technology and policy must be integrated to develop data-stewardship models that ensure open science and reduce data-access burdens to advance research, including the use of optimally balanced and ethically sound approaches for respecting participant preferences and consent as well as engaging communities. Such developments should be done in an open-source culture to build consensus and enable the development, maintenance, and utilization of best-in-class tools, pipelines, and platforms that can be applied to all datasets.
Fully integrating genomics into medical practice will require informatics and data-science advances that effectively connect the growing body of genomic knowledge to clinical decision-making. To make genomic information readily accessible and broadly useful to clinicians, user-friendly electronic health record-based clinical decision support tools must be created to interact with a variety of clinical data from electronic health record and other data systems (e.g., laboratory, pharmacy, and radiology) as well as non-computable reports, such as those provided as portable document format (PDF) files43,44. These efforts require well-curated, highly integrated, and up-to-date knowledgebases that connect genomic information to clinical characteristics, other phenotypic data, and information on family health history45. Also needed are reliable risk-stratification and prevention algorithms, including polygenic risk scores (PRSs)46, that incorporate both common and rare genomic variants from a broad range of population subgroups, phenotypic data, and environmental information into the risk modeling47. Such algorithms should be evaluated both for their validity across multiple populations and for their impact on patient outcomes and subsequent healthcare utilization. Finally, it will be important to evaluate new genomics-oriented clinical decision support tools to ensure that they are acceptable to practitioners across the spectrum of clinical disciplines.
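As a point of reference for the polygenic risk scores mentioned above, the simplest form of a PRS is a weighted sum of effect-allele dosages across a set of variants. The following is a minimal Python sketch under that assumption only; the function name, variant identifiers, and weights are hypothetical and purely illustrative, and the sketch deliberately omits the rare-variant, phenotypic, and environmental inputs that the risk modeling described here would need to incorporate.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return the weighted sum of effect-allele dosages.

    dosages: effect-allele count per variant for one individual (0, 1, or 2,
             or a fractional value if genotypes were imputed).
    weights: per-variant effect sizes (hypothetical values in this sketch).
    """
    # Refuse to score silently when a weighted variant has no genotype.
    missing = [variant for variant in weights if variant not in dosages]
    if missing:
        raise KeyError(f"missing dosages for: {missing}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical three-variant score; identifiers and weights are made up.
weights = {"variant_a": 0.25, "variant_b": -0.5, "variant_c": 1.0}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 1}

score = polygenic_risk_score(dosages, weights)
print(score)
```

A raw score of this kind is only interpretable relative to a reference distribution, which is one reason the validity of such algorithms needs to be evaluated across multiple populations.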
Genomics and society
Understanding the role of genomics in human health requires knowledge and insights about how social, environmental, and genomic risk factors interact to produce health outcomes48,49 (Box 1). Given that such interactions are, in general, poorly understood, it is critical that studies of genomic risk (particularly of common, complex diseases) account for the social and environmental factors that influence health and disease50. These factors must be properly described, measured, and incorporated in genomics studies51. Optimal implementation of genomic medicine will require understanding how the intersectional aspects of people’s social and political identities influence the ways in which populations are described in research. Such knowledge will, in turn, provide clarity about the interrelationships among these multiple influences on health and disease.
People want to be able to make well-informed decisions about their genomic data, leading to the engagement efforts in initiatives such as the UK Biobank52 and the All of Us Research Program53. Partnering with communities and individuals is fundamental to engaging participants in such large-scale research. Genomic researchers must incorporate models and methods of community engagement in their experimental design. Such studies must be appropriately tailored for different cultures and designed to reduce inequities and healthcare disparities; they must also be accompanied by effective information dissemination54. An unrelenting focus on the optimal ways to conduct research in partnership with data stakeholders and communities would ensure the identification of the key issues and values influencing people’s choices about the provision of personal data for research55,56. Data-stewardship infrastructures that integrate appropriate policies, technologies57, and governance and legal frameworks must be developed and assessed to ensure alignment between communities’ and individuals’ decisions about their data and the practices of researchers and clinicians.
To fully realize the fruits of genomic advances, a working understanding of the basic concepts of genomics will be important for science educators58, healthcare professionals59, policymakers, and the public60. Multiple educational strategies will inevitably be required for enhancing the genomic literacy of these heterogeneous groups, which points to the need for innovative approaches that are shared, assessed, and improved over time58. A growing evidence base shows that increasing the understanding of key genomics concepts and applications attracts students to careers in genomics61, assists with the use of genomics for addressing health disparities62, and facilitates the uptake of genomic medicine63. Curricula for enhancing genomic literacy must be designed to be accessible, effective, and scalable for use in the full range of settings where genomics education is provided – including primary and secondary schools, science museums, and informal science-education venues. Researchers and educators must also disseminate information about both the science of genomics and the key ethical and societal implications of genomics64.
Training and genomics workforce development
Appropriate skills in data science and data stewardship are now prerequisites for becoming a genomics researcher65. Furthermore, given the ever-expanding use of genomics in basic, translational, social, behavioral, and clinical research, a greater number of scientists will require fundamental data-science skills appropriate for the genomic applications being utilized66. Establishing and maintaining data-science competencies for conducting genomics research requires a series of interrelated educational and training efforts67, including the recruitment of a cadre of data scientists into genomics and the reciprocal exchange of expertise between genomics researchers and data scientists.
Moving into healthcare, providers must be poised to manage questions from patients who receive genomic information, including that from direct-to-consumer testing, and this applies to the full spectrum of medical professionals (including nurses, pharmacists, physicians, and other clinicians)68. Education modules tailored to specific user groups should be designed to adapt rapidly to advances in genomics and data-science technologies; these should be available on demand and, where appropriate, integrated into existing clinical systems69. Research on the methodologies for train-the-trainer approaches, implementation of standards and competency-based education, and strategies for enhancing genomic literacy among all healthcare providers at all career stages70 should also be pursued. The involvement of patients, caregivers, educators, professional organizations71, and accreditation boards will be critical to ensure success. Importantly, cross-training in relevant aspects of genomics must also be available for specialists working in or around healthcare systems, including (but not limited to) those involved in health services research, health economics, law, bioethics, and social and behavioral sciences.
Both in research and clinical settings, the global genomics workforce – as with the general biomedical research workforce – falls considerably short of reflecting the diversity of the world’s population (a vivid example of this is seen in the U.S.72), which limits the opportunity of those systematically excluded to bring their unique ideas to scientific and clinical research73. To attain a diverse genomics workforce, new strategies and programs to reduce impediments to career opportunities in genomics are required, as are creative approaches to promote workforce diversity, leadership in the field, and inclusion practices. Efforts must intentionally include women, underrepresented racial and ethnic groups, disadvantaged populations, and individuals with disabilities. Initiatives should not focus exclusively on early-stage recruitment; rather, they must also include incentives to recruit and retain a diverse workforce at all career stages74 as well as novel approaches for cultivating the next generation of genomics practitioners.
Breaking down barriers in genomics
Genomics has benefited enormously from the proactive identification of major obstacles impeding progress and the subsequent focused efforts to break down those barriers. Prototypic successes include the call for a “$1,000 human genome sequence” following completion of the Human Genome Project15 and proposed actions to facilitate genomic medicine implementation in 2011 (ref. 16); in these cases, both the risks of failure and the benefits of success were high. Once again, breaking down barriers, as highlighted in Box 3 and detailed below, would accelerate progress and create new research and clinical opportunities at The Forefront of Genomics.
Laboratory and computational technologies
Advances in DNA synthesis and genome editing allow the field of genomics to progress from largely observational (“reading DNA”) to more experimental (“writing” and “editing” DNA) approaches. Enabling true “synthetic genomics” (i.e., the synthesis, modification, and perturbation of nucleic acid sequences at any scale) will allow for more powerful experimental testing of hypotheses about genome variation and function and improve opportunities for linking genotypes to phenotypes75. Genome editing is increasingly being used for practical applications in medicine (e.g., in gene therapy76), biotechnology, agriculture, and other areas. Despite recent triumphs, however, the current approaches are limited in their ability to interrogate genome function at the pathway or network level and to study gene regulation, chromosome organization and mechanics, and other important phenomena that involve factors acting across large chromosomal (or genomic) distances. Furthermore, radically new capabilities for understanding how the full complement of genomic variation within any individual genome contributes to phenotype should be pursued. Innovative approaches for generating nucleic acid molecules with defined sequences and of any size, coupled with technologies that allow for the concurrent and large-scale perturbation of multiple genes or simultaneous examination of multiple genomic variants, would be transformative. These advances would benefit from the development of methods for introducing large synthetic constructs into mammalian cells.
In recent years, large human genomics projects have often relied on data generated as part of existing research studies, and emerging approaches involve developing biobanks and organized cohorts77–79. Meanwhile, direct-to-consumer (DTC) companies are generating substantial amounts of genomic data, and the volume of those data is rapidly being eclipsed by that generated in the clinical care setting80. Properly leveraged, these DTC and clinical data offer opportunities for genomics-based studies at unprecedented scales; however, these data are often heavily fragmented, siloed, and mostly outside the purview of genomics researchers and their typical funders81. Eliminating the barriers to accessing these sources of data for conducting research is essential, but this will require resolving issues related to governance, policy infrastructure, and informatics and workflow solutions. Approaches are needed to mitigate the resulting gaps, limitations, and biases within this highly distributed data environment (e.g., with regard to population diversity, data-collection strategies, data standards, and data privacy), all while addressing concerns of the patients, participants, and groups. These challenges must be addressed globally81 (Box 1), so as to accommodate differences in healthcare systems and views about data privacy. In addition, healthcare stakeholders should take advantage of opportunities offered by genomics, thereby enabling virtuous-cycle routes between genomic learning healthcare systems and basic genomics research82 (Fig. 3).

Fig. 3. As human genomics has matured as a discipline, productive and connected virtuous cycles of activity have emerged, each self-improving with successive rounds of new advances. The cycle on the left reflects basic genomics research, in which technology innovations spur the collection and analysis of genomics research data, often yielding new knowledge and additional hypotheses for testing. The cycle on the right reflects a genomic learning healthcare system, in which the implementation of new genomic medicine practice innovations allows for the collection and analysis of outcomes data, often yielding new genomic knowledge and additional genomics-based strategies for improving the quality of clinical care. Note that the new knowledge emerging from either the left or right cycle has the potential to feed into the other, creating opportunities for “bench to bedside” and “bedside back to bench” progressions82 – both of which are expected to grow in the coming decade.
Biological insights
Despite progress in identifying genomic variants that cause monogenic traits or are statistically associated with complex phenotypes, connecting specific variants to phenotypes remains challenging83. Systematic approaches, including new tactics that connect high-throughput molecular readouts of functional genomic assays to organismal phenotypes, are required for establishing the phenotypic consequences of all genomic variants – individually and in combinations – in a cell-type context across the life span84. Progress in this area requires global collaboration85; advances in integrating multiple data types and in using perturbation assays, protein localization/interaction experiments, and animal models; and resources cataloging information about the fitness consequences of de novo mutations and the clinical relevance of genomic variants83. Because it is not possible to directly test every variant in all cell types and states, developmental stages, and disease processes, new data-collection strategies and analytical approaches are needed that can generalize and adapt predictions to new contexts, handle sparse data, and prioritize variants for experimental follow-up.
Recent advances have led to a greater appreciation of the extent of mosaicism – i.e., genomic variation among cells (both somatic and germline) within an individual. While there have been remarkable advances in understanding the somatic genomic changes encountered in cancer86, there is a paucity of detailed knowledge about other impacts of mosaicism beyond a few well-studied examples87. Important areas of future research include investigating the prevalence and extent of different forms of mosaic variation in both nuclear and mitochondrial DNA, the mechanisms that generate mosaicism, and the roles of mosaicism in physiology and human disease. Such efforts might reveal if this form of genomic variation contributes to variable penetrance and expressivity, comprises a form of genetic epistasis, explains any currently undiagnosed diseases or sporadic cases (or apparent phenocopies) of known inherited diseases9, and can inform the design of therapies for genetic diseases. Single-cell genomic technologies have extended knowledge about the functional impact of mosaicism in multiple experimental systems88,89, with the next challenge being to translate such single-cell understanding to in vivo settings. The development of new laboratory and clinical approaches for readily detecting genomic mosaicism at high spatial and temporal resolution, especially in ways that are relatively non-invasive (e.g., requiring minimal amounts of tissue), would be catalytic.
Implementation science
A critical barrier to using genomics for improving health and preventing disease is the lack of clinical uptake of proven genomic interventions. Implementation science approaches are needed to identify the most effective methods and strategies for facilitating the use of evidence-based genomic applications, most notably pharmacogenomics-based selection of medications90, in routine clinical care. Novel experimental designs, such as genotype-specific participant recruitment91 or integration of patient-provided genomic data92 (captured during previous healthcare encounters or from direct-to-consumer sources), should be explored for their potential to speed adoption and limit costs. The effectiveness of centralized resources for genomic referrals [e.g., genomic medicine specialists, consult services93,94, centers of excellence in undiagnosed diseases (akin to transplantation centers or cancer centers)] should be explored as potential steppingstones to the more generalized uptake of genomics in clinical care. Strategies for deploying the limited workforce of highly trained genetics/genomics specialists (e.g., systematic referral networks or telemedicine/telecounseling) should also be evaluated for their effectiveness at increasing the availability of services broadly – as opposed to being limited to select, highly specialized centers.
Universal newborn genetic screening may represent the most visible and successful approach to population-based identification of serious and treatable inherited conditions, but population screening across the lifespan for other genetic conditions is less widely accepted. Standard public health screening approaches for the U.S. Centers for Disease Control and Prevention Tier 1 conditions95,96 (e.g., Lynch syndrome, hereditary breast and ovarian cancer, and familial hypercholesterolemia) identify people at risk through blood relatives of affected individuals (referred to as “cascade testing” by geneticists97). Implementation research methods, coupled with effective science communication, are primed for optimizing approaches for engaging individuals in genetic testing for these disorders and also other emerging indications, such as genetic predisposition to adverse drug effects (pharmacogenomics), carrier testing of prospective parents, use of PRSs in disease detection and prevention46, and genomic indicators (e.g., gene-expression and epigenetic patterns) of exposure to infectious pathogens98 and other environmental agents.
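The logic of cascade testing described above can be illustrated with a short Python sketch: testing starts from an affected individual and is extended to the first-degree relatives of each person found to carry the familial variant. This is a simplified illustration rather than a clinical protocol; the function name, the family structure, and the carrier statuses are all hypothetical, and consent, uptake, and counseling (the focus of the implementation research discussed here) are not modeled.

```python
from collections import deque


def cascade_testing(proband, first_degree_relatives, carries_variant):
    """Offer testing outward from a proband, one carrier at a time.

    proband: identifier of the affected individual (assumed to be a carrier).
    first_degree_relatives: dict mapping a person to their parents,
        siblings, and children (hypothetical pedigree in this sketch).
    carries_variant: callable returning True if a tested person carries
        the familial variant.
    Returns (tested, carriers): the set of everyone tested and the list of
    carriers in the order they were identified.
    """
    tested = {proband}
    carriers = [proband]
    queue = deque([proband])
    while queue:
        person = queue.popleft()
        for relative in first_degree_relatives.get(person, ()):
            if relative in tested:
                continue
            tested.add(relative)
            # Only relatives of confirmed carriers are tested in the next round.
            if carries_variant(relative):
                carriers.append(relative)
                queue.append(relative)
    return tested, carriers


# Hypothetical family: the variant is inherited through the maternal line.
family = {
    "proband": ["mother", "father", "sister"],
    "mother": ["proband", "sister", "aunt"],
    "father": ["proband", "sister", "uncle"],
    "aunt": ["mother", "cousin"],
}
carrier_status = {"mother": True, "aunt": True}

tested, carriers = cascade_testing(
    "proband", family, lambda person: carrier_status.get(person, False)
)
print(sorted(tested))
print(carriers)
```

In this made-up pedigree the uncle is never tested, because the father (the only path to him) does not carry the variant; this is the property that lets cascade testing concentrate effort on relatives at elevated risk.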
Compelling genomics research projects
The field of genomics has routinely benefited from a willingness to articulate ambitious – often audacious – research efforts that aim to address questions and acquire knowledge that (at the time) may seem out of reach. Such boldness has served to stimulate interest in emerging opportunities, recruit new expertise, galvanize international collaborations involving multiple funders, and propel the field forward. While by no means comprehensive, the areas highlighted in Box 4 and detailed below illustrate the broadening range of compelling research projects that are ripe for pursuit at The Forefront of Genomics.
Advances in understanding gene regulation5,24, the myriad functional roles of RNA99, and the multi-dimensional nature of the nucleome100 – coupled with the utility of single-cell genomic approaches30,31 and anticipated new technological and computational capabilities for analyzing genomic datasets and variants – provide an unprecedented opportunity for deciphering the individual and combined roles of each gene and regulatory element. This must start with establishing the function of each human gene, including the phenotypic impact of human gene knockouts. Because genes and regulatory elements do not function in isolation, it is imperative to build robust experimental and computational models that deduce causal relationships and accurately predict cellular and organismal phenotypes using pathway and network models101,102. Novel analysis methods must address functional redundancy as well as the nearly boundless experimental space and complexity, including cell states and fates, temporal relationships, environmental conditions, and individual genetic background.
Building on the recent successes in unraveling the genetic underpinnings of rare and undiagnosed diseases9, the field is poised to gain a more comprehensive understanding of the genetic architecture of all human diseases and traits10,85. However, myriad complexities can be anticipated. For example, any given genomic variant(s) may affect more than one disease or trait (i.e., pleiotropy); can confer disease risk or reduce it; and can act additively, synergistically, and/or through intermediates. New methods for analyzing data that account for human diversity103, coupled with a growing clarity about genotype-phenotype relationships, must be developed to deduce associations and interactions among genomic variants and environmental factors, improve estimates of penetrance and expressivity, and enhance the clinical utility of genomic information for predicting risk, prognosis, treatment response, and, ultimately, clinical outcomes.
Prioritizing the generation of genomic and corresponding phenotypic data from ancestrally diverse participants is a scientific imperative104 and essential for achieving equitable benefits from genomic advances105 (Box 1). However, this is an area in which genomics has repeatedly fallen short19, leading to missed opportunities for understanding genome structure and function, identifying variants conferring risk for common diseases106, and implementing genomic medicine for the benefit of all107–109. Ideally, studies should be designed for different groups, tailored for local sensibilities and situations, and consistent in capturing key information beyond participants’ ancestry (e.g., the physical and social environments in which they live and receive healthcare110). Leveraging new insights from studies of diverse populations will require the development of new and robust methods for identifying novel signatures of natural selection, performing genotype imputation, mapping disease loci, characterizing genomic variant pathogenicity, and calculating PRSs103,109. Success in these efforts will yield a more complete understanding of how the human genome functions in different environments and offer benefit to those participating in genomics research. Attaining the level of population diversity that will truly benefit all people requires bold scientific and community-based leadership, dedicated resources from funders, highly committed researchers, and effective partnerships that earn the trust of diverse groups of participants and their communities.
As genomics has grown in medicine and society, its potential to influence people’s actions has also expanded. Increasingly, genomics has affected concepts of health, disease, responsibility, family, identity, and community, raising a number of important and changing questions. When and how is genomic information shared and communicated within families111? Will the identification of a strong genetic risk for a disease change a person’s perception of their health or others’ perception of that person? With some genetic risks being more common in certain identifiable populations, what role does group affiliation play in how risk is communicated and perceived, including potential group stigmatization? Research that catalogs, analyzes, and measures the impact of genomics on individuals, families, and communities is important for providing a more informed context to avoid future misrepresentations, misunderstandings, and misuses of genomics54. Finally, researchers must appreciate how their own backgrounds and experiences shape their interpretations of genomic data112.
Extending genomics research in clinical settings beyond DNA sequence to include other multi-omic data, together with clinical variables and outcomes, would advance understanding of disease onset and progression and may also prove important for drug-discovery efforts113,114. This would require tissue- and cell-specific analyses that integrate these data, providing real-time snapshots of biological and disease processes. For clinical applicability and adoption, these high-dimensional, multi-omic data should be integrated with clinical decision support tools and electronic health records. Ultimately, such efforts could reveal important relationships among genomic, environmental, and behavioral variation and facilitate a transition of the use of genomics in medicine from diagnosing and treating disease to maintaining health.
Sharp barriers between research and clinical care obstruct the virtuous cycle of moving scientific discoveries rapidly into clinical care and bringing clinical observations back to the research setting82 (Fig. 3). Learning healthcare systems – in which real-time data on outcomes of healthcare delivery are accessed and used to enhance clinical practice – can lead to continuous care improvement, but only if the barriers between research and clinical care are reduced115. For example, offering genome sequencing to all members of a healthcare system, performed in conjunction with research and participant engagement and provided in real time81, could help to assess the clinical utility of genomic information and may allow providers to improve disease diagnosis and management. System-wide implementation of such an experiment requires not only extensive patient and provider education, sophisticated informatics capabilities, and genomics-based clinical decision support, but also the development and evaluation of data security and privacy protections to ensure patient confidentiality116. Patients should be engaged in the design of such systems and informed at entry to them (and periodically thereafter), so as to be fully aware of the nature of the ongoing research with their clinical data and the goals and potential risks of their participation117. Extending such studies across multiple healthcare systems should reveal common challenges and solutions118,119, thereby enhancing the learning healthcare model for genomic medicine more broadly (Fig. 3).
Concluding thoughts
The dawn of genomics featured the launch of the Human Genome Project in October 19901. Three decades later, the field’s journey has included stunning technological advances and high-profile programmatic successes, which in turn have led to the widespread infusion of genomic methods and approaches across the life sciences and, increasingly, into medicine and society.
NHGRI has for the third time15,16 since the Human Genome Project undergone an extensive horizon-scanning process to capture, synthesize, and articulate the most compelling strategic opportunities for the next phase of genomics – with particular attention to those elements most relevant to human health. The now near-ubiquitous nature of genomics (including in the complex healthcare ecosystem) presented practical challenges for attaining a holistic assessment of the field. Another reality was that NHGRI’s investment in genomics has now been multiplied many-fold by the seeding of human genomics throughout the broader research community. These changes reflect a continued maturation of both the field (in general) and NHGRI (more specifically), nicely aligning with the institute’s evolving leadership role at The Forefront of Genomics.
Embracing that role, NHGRI formulated the strategic vision described here, which provides an optimistic outlook that the successes in human genomics over the past three decades will be amplified in the coming decade. Many of the details about what is needed to fulfill the promise of genomics have now come into focus. Major unsolved problems remain – among them determining the role for the vast majority of functional elements in the human genome (especially those outside of protein-coding regions), understanding the full spectrum of genomic variation (especially that implicated in human disease), developing data-science capabilities (especially those that keep pace with data generation), and improving healthcare through the deployment of genomic medicine (especially in the areas of prevention, diagnosis, and therapeutic development). The new decade also brings new research questions related to the societal implications of genomics, including those related to social inequities, pointing to the continued importance of investigating the ethical, legal, and social issues related to genomics. But now more than ever, solutions to these problems seem to be within striking distance. Towards that end (and with the characteristic spirit of genomics audacity), we offer ten bold predictions of what might be realized in human genomics by 2030 (Box 5).
The strategic vision articulated here was crafted on behalf of the field of human genomics and emphasized broad strategic goals as opposed to implementation tactics. The realization of these goals will require additional planning in conjunction with the collective creativity, energies, and resources of the global community of scientists, funders, and research participants. NHGRI has taken some initial steps for implementing this vision, although these will inevitably need to be adapted as advances occur and circumstances change. Indeed, the final words of this strategic vision were formulated as the world moved urgently to deal with the COVID-19 pandemic (see Epilogue), providing a vivid reminder of the need to be nimble and the importance of nurturing all parts of the research continuum – from basic to translational to clinical – for protecting public health and advancing medical science.
Despite the seismic changes seen in genomics since the inception of the field, the fundamental sense of curiosity, marvel, and purpose associated with genome science seems to be timeless. In concluding NHGRI’s previous strategic vision16 – published just under a decade ago – the then-envisioned opportunities and challenges were provided with “… a continuing sense of wonder, a continuing need for urgency, a continuing desire to balance ambition with reality, and a continuing responsibility to protect individuals while maximizing the societal benefits of genomics….” With the 2020 strategic vision described here providing a thoughtful guide and with enduring feelings of wonder, urgency, ambition, and social consciousness providing unfettered momentum, we are ready to embark on the next exciting phase of the human genomics journey.
Epilogue: COVID-19 and genomics
SARS-CoV-2 emerged as a global threat to public health at the end of the multi-year process that generated the above strategic vision. Nonetheless, the pandemic provides a potent lesson about how a tiny string of nucleic acids can wreak global havoc on humankind. Understanding the mechanisms involved in the virus’s transmission, invasion, and clearance, as well as the highly variable and at times disastrous physiologic responses to it, is fertile ground for genomics research. Indeed, genomics rapidly assumed critical roles in COVID-19 research and clinical care in areas such as the (1) deployment of DNA- and RNA-sequencing technologies for diagnostics, viral isolate tracking, and environmental monitoring; (2) use of synthetic nucleic acid technologies for studying SARS-CoV-2 virulence and facilitating vaccine development; (3) examination of how human genomic variation influences infectivity, disease severity, vaccine efficacy, and treatment response; (4) adherence to principles and values related to open science, data sharing, and consortia-based collaborations; and (5) provision of genomic data science tools for studying COVID-19 pathophysiology. The growing adoption of genomic approaches and technologies into myriad aspects of the global response to the COVID-19 pandemic serves as another important and highly visible example of the integral and vital nature of genomics in modern research and medicine.
Acknowledgements
The strategic vision described here was formulated on behalf of NHGRI. We are grateful to the many members of the institute staff for their contributions to the associated planning process (see http://genome.gov/genomics2020 for details) as well as to the numerous external colleagues who provided input to the process and draft versions of this strategic vision. The National Advisory Council for Human Genome Research (current members are Jeffrey Botkin, Trey Ideker, Sharon Plon, Jonathan Haines, Stephen Fodor, Rafael Irizarry, Patricia Deverka, Wendy Chung, Mark Craven, Hal Dietz, Stephen Rich, Howard Chang, Lisa Parker, Len Pennacchio, and Olga Troyanskaya) ratified the strategic planning process, themes, and priorities associated with this strategic vision.
Footnotes
Competing Interests
The authors declare no competing interests.




