BMC Bioinformatics. 2015 Sep 22;16:304. doi: 10.1186/s12859-015-0736-4.

Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies.

Author information

  • 1Biomedical Sciences Graduate Program, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA. kstandis@ucsd.edu.
  • 2Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA. kstandis@ucsd.edu.
  • 3Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA. carlandt@gmail.com.
  • 4San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA. glennklockwood@gmail.com.
  • 5San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA. pfeiffer@sdsc.edu.
  • 6San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, 92092, CA, USA. mahidhar@sdsc.edu.
  • 7Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. CHuang4@its.jnj.com.
  • 8Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. slambert@its.jnj.com.
  • 9Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. ycherka@its.jnj.com.
  • 10Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. cbrodmer@its.jnj.com.
  • 11Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. ejaeger@its.jnj.com.
  • 12R&D IT, Janssen R&D LLC, Springhouse, PA, USA. ejaeger@its.jnj.com.
  • 13Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. lsmith10@its.jnj.com.
  • 14R&D IT, Janssen R&D LLC, Springhouse, PA, USA. lsmith10@its.jnj.com.
  • 15Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. grajagop@its.jnj.com.
  • 16R&D IT, Janssen R&D LLC, Springhouse, PA, USA. grajagop@its.jnj.com.
  • 17Systems Pharmacology & Biomarkers (Immunology), Janssen R&D LLC, Springhouse, PA, USA. mcurran3@its.jnj.com.
  • 18Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, 92092, CA, USA. nschork@jcvi.org.

Abstract

MOTIVATION:

Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and more cheaply than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost.

RESULTS:

We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study.
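As a rough illustration of the grouping and job-packing ideas mentioned above, the following sketch partitions 437 sample identifiers into fixed-size groups and then packs the resulting variant-calling jobs onto compute nodes. It is not the authors' workflow: the function names, the group size of 50, the 4 cores per job, and the 16 cores per node are all hypothetical values chosen for the example; only the count of 437 genomes comes from the abstract.

```python
from typing import List, Sequence


def split_into_groups(samples: Sequence[str], group_size: int) -> List[List[str]]:
    """Partition samples into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be positive")
    return [list(samples[i:i + group_size])
            for i in range(0, len(samples), group_size)]


def pack_jobs(n_jobs: int, cores_per_job: int, cores_per_node: int) -> List[List[int]]:
    """Assign job indices to nodes, filling each node with as many jobs as fit."""
    if cores_per_job < 1 or cores_per_job > cores_per_node:
        raise ValueError("cores_per_job must be between 1 and cores_per_node")
    jobs_per_node = cores_per_node // cores_per_job
    return [list(range(i, min(i + jobs_per_node, n_jobs)))
            for i in range(0, n_jobs, jobs_per_node)]


# 437 genomes is from the abstract; every other number here is an assumption.
samples = [f"genome_{i:03d}" for i in range(437)]
groups = split_into_groups(samples, group_size=50)
nodes = pack_jobs(n_jobs=len(groups), cores_per_job=4, cores_per_node=16)

for node_id, job_ids in enumerate(nodes):
    sizes = [len(groups[j]) for j in job_ids]
    print(f"node {node_id}: jobs {job_ids}, group sizes {sizes}")
```

Under these assumed settings, each node runs several group-level jobs side by side instead of leaving cores idle, which is the basic intent of job-packing; the trade-offs around group size that the study actually evaluates are not modeled here.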

CONCLUSIONS:

We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging 'big data' problems in biomedical research brought on by the expansion of NGS technologies.

PMID: 26395405
PMCID: PMC4580299
DOI: 10.1186/s12859-015-0736-4
[PubMed - indexed for MEDLINE]