Nat Genet. 2016 Jul;48(7):811-6. doi: 10.1038/ng.3571. Epub 2016 Jun 6.

Fast and accurate long-range phasing in a UK Biobank cohort.

Author information

1. Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
2. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
3. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

Abstract

Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4-cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N ≈ 150,000 samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈ 0.3%, corresponding to perfect phase in a majority of 10-Mb segments). We also observed that, when used within an imputation pipeline, Eagle prephasing improved downstream imputation accuracy in comparison to prephasing in batches using existing methods, as necessary to achieve comparable computational cost.
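
The switch error rate quoted in the abstract can be illustrated with a minimal sketch. It is not Eagle's implementation or evaluation code. It assumes the usual definition: across consecutive heterozygous sites, a switch is counted whenever the inferred haplotype changes from agreeing with the true haplotype to disagreeing with it, or the reverse. The function name `switch_error_rate` and the 0/1 allele encoding are hypothetical choices made for this example.

```python
from typing import Sequence


def switch_error_rate(true_hap: Sequence[int], inferred_hap: Sequence[int]) -> float:
    """Fraction of consecutive heterozygous-site pairs whose relative phase is wrong.

    Both inputs give the allele (0 or 1) carried by the first haplotype at each
    heterozygous site, in genomic order. This encoding is an assumption of the sketch.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")
    n_sites = len(true_hap)
    if n_sites < 2:
        return 0.0  # no adjacent pair of sites, so no switch is possible

    # agree[i] is True when the inferred haplotype matches the truth at site i.
    agree = [t == p for t, p in zip(true_hap, inferred_hap)]

    # A switch occurs wherever agreement flips between adjacent sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (n_sites - 1)


if __name__ == "__main__":
    truth = [0, 1, 0, 1, 1, 0]
    inferred = [0, 1, 1, 0, 0, 0]  # phase flips after site 2 and back after site 5
    print(switch_error_rate(truth, inferred))
```

Under this definition a haplotype that is the exact complement of the truth has zero switch errors, because labeling the two haplotypes in the opposite order does not change the phase.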

PMID: 27270109
PMCID: PMC4925291
DOI: 10.1038/ng.3571
[Indexed for MEDLINE]
Free PMC Article
