Hum Mutat. 2019 Sep;40(9):1373-1391. doi: 10.1002/humu.23874. Epub 2019 Sep 3.

CAGI SickKids challenges: Assessment of phenotype and variant predictions derived from clinical and genomic data of children with undiagnosed diseases.

Author information

1. Department of Plant and Microbial Biology, University of California, Berkeley, California.
2. Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia.
3. Department of Pediatrics and Wisconsin State Lab of Hygiene, University of Wisconsin, Madison, Wisconsin.
4. Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
5. Department of Computer Science, University of Bristol, Bristol, UK.
6. Department of Computer Science, Indiana University, Indiana.
7. Tata Consultancy Services Ltd, Mumbai, India.
8. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas.
9. Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland.
10. Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland.
11. Department of Biochemistry & Molecular Biology, Department of Pharmacology, Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas.
12. Department of Biomedical Informatics and Medical Education, University of Washington, Washington.
13. Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland.
14. QIAGEN Bioinformatics, Redwood City, California.
15. Khoury College of Computer Sciences, Northeastern University, Massachusetts.
16. Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Germany.
17. Center for Human Genomics and Precision Medicine, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin.
18. Department of Paediatrics, The Hospital for Sick Children, Toronto, Canada.

Abstract

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. Performance in matching genomes to categories was no better than random, but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.
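
To illustrate what "better than chance" can mean for a genome-to-patient matching task, the following is a minimal sketch of a permutation test. It is not the assessment procedure used in the challenges, which the abstract does not describe. It assumes, for illustration only, that each group submits a one-to-one assignment of genomes to patients and that the score is the number of correct matches; the function names `count_correct` and `permutation_p_value` are hypothetical.

```python
import random


def count_correct(assignment, truth):
    """Number of positions where the predicted patient equals the true patient."""
    return sum(1 for a, t in zip(assignment, truth) if a == t)


def permutation_p_value(observed_correct, n_items, n_permutations=10000, seed=0):
    """Estimate P(correct matches >= observed) under random one-to-one matching.

    Assumption (not from the source): predictions are a permutation of the
    patients, so the null distribution is that of fixed points of a random
    permutation.
    """
    rng = random.Random(seed)
    truth = list(range(n_items))
    hits = 0
    for _ in range(n_permutations):
        shuffled = truth[:]
        rng.shuffle(shuffled)
        if count_correct(shuffled, truth) >= observed_correct:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # 24 genomes, as in CAGI5; the observed counts here are made up.
    for observed in (1, 3, 5):
        p = permutation_p_value(observed, n_items=24)
        print(f"{observed} correct of 24: estimated p = {p:.4f}")
```

Under this null model a random assignment gets about one match right on average regardless of cohort size, so even a handful of correct matches out of 24 can be statistically unusual.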

KEYWORDS: CAGI; SickKids; pediatric rare disease; phenotype prediction; variant interpretation; whole-genome sequencing data

PMID: 31322791
DOI: 10.1002/humu.23874
