Sci Data. 2017 Sep 21;4:170115. doi: 10.1038/sdata.2017.115.

Whole genome characterization of sequence diversity of 15,220 Icelanders.

Author information

1. deCODE genetics/Amgen Inc., Sturlugata 8, Reykjavik 101, Iceland.
2. Bioinformatics Research Centre (BiRC), C.F.Møllers Allé 8, Aarhus University, Aarhus 8000 Aarhus C, Denmark.
3. Department of Clinical Medicine-Molekylær Medicinsk afdeling (MOMA), Palle Juul-Jensens Boulevard 99, Aarhus University Hospital, Aarhus 8200 Aarhus N, Denmark.
4. Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik 101, Iceland.
5. School of Engineering and Natural Sciences, University of Iceland, Reykjavik 101, Iceland.
6. School of Science and Engineering, Reykjavik University, Reykjavik 101, Iceland.
7. Department of Anthropology, University of Iceland, Reykjavik 101, Iceland.

Abstract

An understanding of sequence diversity is the cornerstone of the analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders, whom we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels. Calling de novo mutations (DNMs) is a formidable challenge because the false positive rate in sequencing datasets is high relative to the mutation rate. We addressed this issue by using the segregation of alleles in three-generation families. Using this transmission assay, we controlled the false positive rate and identified 108,778 high-quality DNMs. Furthermore, we used our extended family structure and read-pair tracing of DNMs to a panel of phased SNPs to determine the parent of origin of 42,961 DNMs.
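
The idea behind a transmission assay in a three-generation family can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Trio` class, the function names, and the encoding of genotypes as alternate-allele counts (0, 1, or 2) are all assumptions made for illustration, and the check is deliberately simplified to "does any child of the proband carry the allele" rather than tracking which haplotype each child inherited.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trio:
    """Genotypes at one site as alternate-allele counts (0, 1, or 2).

    Hypothetical container used only for this illustration.
    """
    father: int
    mother: int
    child: int


def is_candidate_dnm(trio: Trio) -> bool:
    """A candidate DNM: heterozygous child, both parents homozygous reference."""
    return trio.father == 0 and trio.mother == 0 and trio.child == 1


def transmitted_to_offspring(grandchild_genotypes: Sequence[int]) -> bool:
    """True if at least one child of the proband carries the alternate allele."""
    return any(g >= 1 for g in grandchild_genotypes)


def supported_by_transmission(trio: Trio, grandchild_genotypes: Sequence[int]) -> bool:
    """Simplified check: the candidate DNM is also seen in the next generation.

    Assumption: seeing the allele again in an independently sequenced
    descendant is treated as evidence that the call is a real germline variant.
    """
    return is_candidate_dnm(trio) and transmitted_to_offspring(grandchild_genotypes)


if __name__ == "__main__":
    proband = Trio(father=0, mother=0, child=1)
    # Two sequenced children of the proband; one carries the allele.
    print(supported_by_transmission(proband, [0, 1]))
```

Note that under Mendelian inheritance each child has only a one-in-two chance of inheriting a heterozygous allele, so in this sketch a lack of transmission does not by itself show that a candidate is a false positive.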

PMID: 28933420
PMCID: PMC5607473
DOI: 10.1038/sdata.2017.115
[Indexed for MEDLINE]
Free PMC Article
