The GeT-RM Browser at NCBI

The GeT-RM Coordination Program is a CDC project for establishing a community process to create reference materials, quality control measures, and proficiency testing for genetic testing. The GeT-RM Browser is provided by NCBI to facilitate access to the data generated by this project, which is intended to aid evaluation of analytical validity of next-generation sequencing assays. All data is available on our FTP site.

About the Data

This site contains sequence data sets for the HapMap samples NA12878 and NA19240. Participating laboratories include clinical genetic testing laboratories as well as research laboratories. A variety of next-generation sequencing (NGS) technologies have been used, in conjunction with Sanger-based sequencing in a limited number of regions. Individual data sets reflect whole-genome (WGS) and whole-exome (WES) sequencing, a variety of targeted gene panels, and integrated data sets. A subset of variants has been confirmed with an orthogonal technology (such as Sanger sequencing), consistent with current recommendations for clinical genetic testing. A small number of regions were Sanger sequenced in their entirety.

This site also includes data generated by the Genome in a Bottle Consortium, which has developed a set of high-confidence calls for SNPs and small indels in NA12878 from an integrated dataset.

Ways to Access and Work With This Resource (see also "using this data" and "using website" tabs)


All data available for download

Search the Browser

Download Data for Your Region of Interest

To bulk download variant data, upload a file with a list of genes or a BED file
(only standard BED format is accepted; other formats such as BED detail are not accepted).
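As an illustration of preparing such an upload, the following minimal Python sketch writes regions as three-column BED lines. It assumes that "standard BED" means tab-separated chromosome, 0-based start, and exclusive end columns, as in the UCSC BED convention. The helper name `to_bed_line` and the example coordinates are hypothetical, and whether the Browser expects a "chr" prefix on chromosome names is not stated here.

```python
def to_bed_line(chrom, start_1based, end_1based):
    """Convert a 1-based, inclusive region to one BED line (0-based start, exclusive end)."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts by one; the inclusive 1-based end equals the exclusive BED end.
    return f"{chrom}\t{start_1based - 1}\t{end_1based}"


# Hypothetical regions of interest (placeholder coordinates, not real targets).
regions = [("chr1", 1001, 2000), ("chr2", 500, 750)]
bed_text = "\n".join(to_bed_line(*region) for region in regions) + "\n"
print(bed_text, end="")
```

Saving `bed_text` to a plain text file gives a file in the three-column layout described above.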

Upload your gene list or BED file
Download format

Guided Analysis of NGS Data for Clinical Validation (see also "using this data" tab)

TSV = Tab-separated value file, suitable for opening in Excel
VCF = Variant call format
BED = Tab-separated value file, suitable for uploading to most genome browsers

Analytical Sensitivity: High-quality variant datasets used to define true-positive variant calls
Analytical Specificity: High-quality regions used to define true-negative variant calls
  • NA12878: GeT-RM Sanger-sequenced region (BED)
  • NA12878: NIST data set (Zook et al. 2014, PMID:24531798) (BED)
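
The two measures above can be sketched in Python as follows. This is a simplified illustration, not the Browser's own method: it assumes variants are compared by exact (chromosome, position, ref, alt) match, that all inputs are already restricted to the high-quality regions, and that specificity is counted per base position. The function names and the toy data are hypothetical.

```python
def analytical_sensitivity(truth_variants, called_variants):
    """TP / (TP + FN), where the truth set defines the true-positive calls."""
    if not truth_variants:
        raise ValueError("truth set is empty")
    true_positives = len(truth_variants & called_variants)
    return true_positives / len(truth_variants)


def analytical_specificity(region_bases, truth_positions, called_positions):
    """TN / (TN + FP), counted per base inside the high-quality regions.

    Assumption: every base in the regions without a truth variant is a true negative
    unless the assay calls a variant there.
    """
    negatives = region_bases - len(truth_positions)
    if negatives <= 0:
        raise ValueError("no negative positions in the regions")
    false_positives = len(called_positions - truth_positions)
    return (negatives - false_positives) / negatives


# Toy data (hypothetical): four truth variants, three of them called, one extra call.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr1", 300, "G", "A"), ("chr1", 400, "T", "C")}
calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
         ("chr1", 300, "G", "A"), ("chr1", 500, "A", "T")}

sensitivity = analytical_sensitivity(truth, calls)
specificity = analytical_specificity(
    region_bases=1000,
    truth_positions={(v[0], v[1]) for v in truth},
    called_positions={(v[0], v[1]) for v in calls},
)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.4f}")
```

Real comparisons usually need variant normalization (for example, left-aligning indels) before matching, which this sketch omits.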
Difficult Regions: Regions excluded from variant calling by Genome in a Bottle (GIAB) due to uncertainty (Zook et al. 2014, PMID:24531798). See the README for more information about these files.
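
One way to apply such an exclusion list before computing accuracy metrics is sketched below in Python. It assumes the difficult regions are supplied as BED text (0-based start, exclusive end) and that variant positions are 1-based, as in VCF. The function names and sample data are hypothetical, and the linear scan is meant for small inputs only.

```python
def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples; start is 0-based, end is exclusive."""
    intervals = []
    for line in text.splitlines():
        # Skip blank lines and the header lines that genome browsers allow in BED files.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def in_regions(chrom, pos, intervals):
    """True if a 1-based position falls inside any BED interval on the same chromosome."""
    return any(c == chrom and s < pos <= e for c, s, e in intervals)


def drop_difficult(variants, difficult):
    """Keep only variants whose position lies outside every difficult region."""
    return [v for v in variants if not in_regions(v[0], v[1], difficult)]


# Hypothetical example: one difficult region covering chr1 positions 1000-2000 (1-based).
difficult = parse_bed("chr1\t999\t2000\n")
variants = [("chr1", 1000, "A", "G"), ("chr1", 2001, "C", "T"), ("chr2", 1500, "G", "A")]
kept = drop_difficult(variants, difficult)
print(kept)
```

For genome-scale files, an interval tree or sorted intervals with binary search would be a more practical choice than the linear scan used here.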