Association studies using family pools of outcrossing crops based on allele-frequency estimates from DNA sequencing

Theor Appl Genet. 2014 Jun;127(6):1331-41. doi: 10.1007/s00122-014-2300-4. Epub 2014 Mar 26.

Abstract

We propose a method by which genotyping-by-sequencing (GBS) data can be conveniently analyzed without calling genotypes. F2 families are frequently used in breeding of outcrossing species, for instance to obtain trait measurements on plots. We propose to perform association studies by obtaining a matching "family genotype" from sequencing a pooled sample of the family, and to use allele frequencies computed from sequence read counts directly for mapping. We show that, under additivity assumptions, there is a linear relationship between the family phenotype and the family allele frequency, and that a regression of family phenotype on family allele frequency will estimate twice the allele substitution effect at a locus. However, medium-to-low sequencing depth causes underestimation of the true allele substitution effect. An expression for this underestimation is derived for the case in which the parents are diploid, such that F2 families have up to four dosages of every allele. Simulation studies verified the estimation of the allele effect from F2-family pools and showed that the derived expression correctly describes the underestimation. For a fixed sequencing budget, the optimal design for an association study uses a large sample size with lower sequencing depth, and a higher SNP density (resulting in higher LD with causative mutations) with lower sequencing depth. Association studies using genotyping by sequencing are therefore optimal when they use low sequencing depth per sample. The developed framework for association studies using allele frequencies from sequencing can be modified for other types of family pools and is also directly applicable to association studies in polyploids.
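The effect of sequencing depth on the regression can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' simulation or their derived expression: the function names, the parameter values, the Binomial(4, 0.5) distribution of family allele dosages, and the normally distributed phenotype noise are all hypothetical choices made for illustration. Reads are drawn independently from the true family frequency, so sampling of individuals into the pool is ignored. The attenuation the sketch shows is the classical errors-in-variables effect, in which noise in the regressor shrinks the estimated slope.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_slopes(n_families=2000, depth=4, true_slope=1.0,
                    noise_sd=0.5, seed=1):
    """Return (slope on true frequency, slope on read-based frequency).

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Assumption: two diploid parents carry four allele copies, so the
    # family allele frequency is k/4 with k in 0..4. Here k is drawn as
    # Binomial(4, 0.5) purely for illustration.
    true_freq = [sum(rng.random() < 0.5 for _ in range(4)) / 4
                 for _ in range(n_families)]
    # Additive model: the family phenotype is linear in allele frequency.
    phenotype = [true_slope * p + rng.gauss(0.0, noise_sd)
                 for p in true_freq]
    # Observed frequency: fraction of `depth` reads carrying the allele.
    observed_freq = [sum(rng.random() < p for _ in range(depth)) / depth
                     for p in true_freq]
    return ols_slope(true_freq, phenotype), ols_slope(observed_freq, phenotype)


if __name__ == "__main__":
    for depth in (2, 4, 16, 64):
        slope_true, slope_obs = simulate_slopes(depth=depth)
        print(f"depth={depth:3d}  slope on true freq={slope_true:.3f}  "
              f"slope on read-based freq={slope_obs:.3f}")
```

Under these assumptions, the slope estimated from read-based frequencies falls below the slope estimated from true frequencies, and the gap narrows as depth increases. This matches the qualitative behavior described in the abstract.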

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Crops, Agricultural / genetics*
  • Crosses, Genetic*
  • Gene Frequency
  • Genetic Association Studies
  • Genotype
  • Models, Genetic
  • Sequence Analysis, DNA