Hum Mutat. 2017 Sep;38(9):1266-1276. doi: 10.1002/humu.23265. Epub 2017 Jun 19.

Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.

Author information

1 Department of Biomedical Informatics & Medical Education, University of Washington School of Medicine, Seattle, Washington.
2 The Buck Institute for Research on Aging, Novato, California.
3 Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland.
4 Department of Medicine, Division of Medical Genetics, Institute for Genomic Medicine and Moores Cancer Center, University of California San Diego, La Jolla, California.
5 Department of Computer Science, Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland.
6 Department of Biomolecular Engineering, University of California, Santa Cruz, California.
7 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland.
8 Department of Computer Science, University of Bristol, Bristol, UK.
9 Bristol Centre for Complexity Sciences, University of Bristol, Bristol, UK.
10 Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland.
11 Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland.
12 Department of Biomedical Sciences, University of Padova, Padova, Italy.
13 Department of Information Engineering, University of Padova, Padova, Italy.
14 Department of Woman and Child Health, University of Padova, Padova, Italy.
15 CNR Neuroscience Institute, Padova, Italy.
16 PersonalGenomes.org, Boston, Massachusetts.
17 Department of Plant and Microbial Biology, University of California, Berkeley, California.
18 European Bioinformatics Institute, Hinxton, UK.
19 Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland.
20 Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany.
21 Department of Oncology, The Johns Hopkins Medical Institutions, Baltimore, Maryland.

Abstract

The advent of next-generation sequencing has dramatically decreased the cost of whole-genome sequencing and increased the viability of its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under the receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
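
As an illustration of one of the assessment measures named in the abstract, the sketch below computes ROC AUC for a single binary trait as the probability that a randomly chosen individual with the trait receives a higher predicted score than a randomly chosen individual without it (ties count as one half). This is not the assessors' code: the function name `roc_auc` and the example labels and scores are hypothetical, chosen only to show the calculation.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    labels: 1 if the individual has the trait, 0 otherwise.
    scores: predicted probability (or any ranking score) of having the trait.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: five individuals, two of whom have the trait.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. Because the measure depends only on how predictions are ranked, it is typically reported alongside other measures such as the count of correct predictions.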

KEYWORDS:

biomedical informatics; community challenge; critical assessment; genome; genome interpretation; open consent; personal genome project (PGP); phenotype

PMID: 28544481
PMCID: PMC5645203
DOI: 10.1002/humu.23265
[Indexed for MEDLINE]
Free PMC Article
