Genetics. 2009 May;182(1):295-301. doi: 10.1534/genetics.109.100479. Epub 2009 Mar 16.

Estimation of allele frequencies from high-coverage genome-sequencing projects.

Author information

Department of Biology, Indiana University, Bloomington, Indiana 47405, USA. milynch@indiana.edu

Abstract

A new generation of high-throughput sequencing strategies will soon lead to the acquisition of high-coverage genomic profiles of hundreds to thousands of individuals within species, generating unprecedented levels of information on the frequencies of nucleotides segregating at individual sites. However, because these new technologies are error prone and yield uneven coverage of alleles in diploid individuals, they also introduce the need for novel methods for analyzing the raw read data. A maximum-likelihood method for the estimation of allele frequencies is developed, eliminating both the need to arbitrarily discard individuals with low coverage and the requirement for an extrinsic measure of the sequence error rate. The resultant estimates are nearly unbiased with asymptotically minimal sampling variance, thereby defining the limits to our ability to estimate population-genetic parameters and providing a logical basis for the optimal design of population-genomic surveys.
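
The abstract does not give the likelihood function, so the following is only a minimal sketch of the general idea and not the paper's estimator. It assumes a simplified two-allele site, Hardy-Weinberg genotype proportions, and a single symmetric per-read error rate; the names `log_likelihood` and `estimate`, the `(n, k)` read-count format, and the grid-search settings are all hypothetical. It illustrates how the allele frequency and the error rate might be estimated jointly from raw read counts, with every individual contributing regardless of coverage.

```python
import math

def log_likelihood(p, eps, reads):
    """Log-likelihood of minor-allele frequency p and read error rate eps.

    reads: list of (n, k) pairs, one per diploid individual, where n is the
    number of reads covering the site and k the number showing the minor allele.
    Assumptions (not from the source): Hardy-Weinberg proportions, two alleles,
    symmetric error rate.
    """
    # Genotype frequencies: homozygous major, heterozygous, homozygous minor.
    geno_freqs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    # Probability that a single read shows the minor allele, per genotype.
    minor_read_prob = (eps, 0.5, 1 - eps)
    total = 0.0
    for n, k in reads:
        lik = 0.0
        for g, q in zip(geno_freqs, minor_read_prob):
            lik += g * math.comb(n, k) * q ** k * (1 - q) ** (n - k)
        total += math.log(lik)
    return total

def estimate(reads, p_steps=199, eps_steps=50, eps_max=0.1):
    """Joint grid search for (p, eps); a crude stand-in for a real optimizer."""
    best_ll, best_p, best_eps = -math.inf, None, None
    for i in range(1, p_steps + 1):
        p = i / (p_steps + 1)             # interior grid, so 0 < p < 1
        for j in range(1, eps_steps + 1):
            eps = eps_max * j / eps_steps  # 0 < eps <= eps_max
            ll = log_likelihood(p, eps, reads)
            if ll > best_ll:
                best_ll, best_p, best_eps = ll, p, eps
    return best_p, best_eps

if __name__ == "__main__":
    # Hypothetical data: low-coverage individuals are kept, not discarded.
    example = [(20, 0), (3, 0), (12, 6), (2, 1), (15, 15), (8, 0), (5, 1)]
    p_hat, eps_hat = estimate(example)
    print(f"p_hat={p_hat:.3f}, eps_hat={eps_hat:.3f}")
```

Because each individual's likelihood sums over all three genotypes, an individual with only two or three reads still contributes information in proportion to its coverage. A grid search is used here purely for transparency; a numerical optimizer would be the natural replacement.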

PMID: 19293142
PMCID: PMC2674824
DOI: 10.1534/genetics.109.100479
[Indexed for MEDLINE]
Free PMC Article