Interrogating the Human Diplome: Computational Methods, Emerging Applications, and Challenges

Methods Mol Biol. 2023:2590:1-30. doi: 10.1007/978-1-0716-2819-5_1.

Abstract

Human DNA sequencing protocols have revolutionized human biology, biomedical science, and clinical practice, but they still have important limitations. One limitation is that most protocols do not separate or assemble (i.e., "phase") the nucleotide content of each of the maternally and paternally derived chromosomal homologs making up the 22 autosomal pairs and the chromosomal pair making up the pseudo-autosomal region of the sex chromosomes. This has led to a dearth of studies and a consequent underappreciation of many phenomena of fundamental importance to basic and clinical genomic science. We discuss a few protocols for obtaining phase information, along with their limitations, including protocols that could be used in tumor phasing settings. We then describe a number of biological and clinical phenomena that require phase information. These include phenomena that require precise knowledge of the nucleotide sequence in a chromosomal segment from germline or somatic cells, such as DNA binding events, and phenomena that require insight into unique cis- vs. trans-acting, functionally impactful variant combinations, for example, variants implicated in a phenotype governed by compound heterozygosity. We also comment on the need for reliable, consensus-based diploid-context computational workflows for variant identification, as well as the need for laboratory-based functional verification strategies for validating cis vs. trans effects of variant combinations. Finally, we briefly describe available resources, example studies, and areas for further research, and ultimately argue that the science behind the study of human diploidy, referred to as "diplomics," which will be enabled by nucleotide-level resolution of phased genomes, is a logical next step in the analysis of human genome biology.
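
As an illustration of why phase matters for compound heterozygosity, the following minimal Python sketch classifies two heterozygous variants as cis (same homolog) or trans (opposite homologs). It is not taken from the chapter. It assumes VCF-style phased genotype strings such as "0|1", and it assumes both variants were phased within the same phase block; the function name `classify_pair` is hypothetical.

```python
def classify_pair(gt_a: str, gt_b: str) -> str:
    """Classify two variants as 'cis' or 'trans' from phased genotype strings.

    Assumptions (not from the source): genotypes use VCF-style notation,
    where '|' marks a phased call and '/' an unphased one, and both
    variants belong to the same phase block, so allele order is comparable.
    """
    # Without phase, the cis/trans arrangement cannot be determined.
    if "|" not in gt_a or "|" not in gt_b:
        return "unphased"

    alleles_a = gt_a.split("|")
    alleles_b = gt_b.split("|")

    # Only diploid calls are handled in this sketch.
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        return "not diploid"

    # The cis/trans question only arises when both sites are heterozygous
    # for the reference allele (0) and one alternate allele (1).
    if sorted(alleles_a) != ["0", "1"] or sorted(alleles_b) != ["0", "1"]:
        return "not both heterozygous"

    # Same allele order: both alternate alleles sit on the same homolog (cis).
    # Opposite order: the alternate alleles sit on different homologs (trans).
    return "cis" if alleles_a == alleles_b else "trans"


if __name__ == "__main__":
    print(classify_pair("0|1", "0|1"))  # alternate alleles on the same homolog
    print(classify_pair("0|1", "1|0"))  # alternate alleles on opposite homologs
    print(classify_pair("0/1", "0|1"))  # first call carries no phase
```

Under these assumptions, two damaging variants in the same gene that are in trans affect both gene copies, which is the compound heterozygous configuration, whereas the same two variants in cis leave one copy intact. Unphased calls cannot distinguish the two cases.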

Keywords: Cancer and DNA/RNA binding; Epistasis; Functional prediction; Genetic variation; Genomics; Haplotyping; Phasing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computational Biology
  • Diploidy*
  • Genome, Human*
  • Haplotypes
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Nucleotides
  • Sequence Analysis, DNA

Substances

  • Nucleotides