A spectral theory for Wright's inbreeding coefficients and related quantities

PLoS Genet. 2021 Jul 19;17(7):e1009665. doi: 10.1371/journal.pgen.1009665. eCollection 2021 Jul.

Abstract

Wright's inbreeding coefficient, FST, is a fundamental measure in population genetics. Assuming a predefined population subdivision, this statistic is classically used to evaluate population structure at a given genomic locus. With large numbers of loci, unsupervised approaches such as principal component analysis (PCA) have, however, become prominent in recent analyses of population structure. In this study, we describe the relationships between Wright's inbreeding coefficients and PCA for a model of K discrete populations. Our theory provides an equivalent definition of FST based on the decomposition of the genotype matrix into between and within-population matrices. The average value of Wright's FST over all loci included in the genotype matrix can be obtained from the PCA of the between-population matrix. Assuming that a separation condition is fulfilled and for reasonably large data sets, this value of FST approximates the proportion of genetic variation explained by the first (K - 1) principal components accurately. The new definition of FST is useful for computing inbreeding coefficients from surrogate genotypes, for example, obtained after correction of experimental artifacts or after removing adaptive genetic variation associated with environmental variables. The relationships between inbreeding coefficients and the spectrum of the genotype matrix not only allow interpretations of PCA results in terms of population genetic concepts but extend those concepts to population genetic analyses accounting for temporal, geographical and environmental contexts.
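The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes haploid 0/1 genotypes, an unscaled (centered only) genotype matrix, a between-population matrix built by replacing each individual's row with its population mean, and FST taken as the ratio of between-population to total variance summed over loci. The function name `fst_from_decomposition` and all simulation parameters are hypothetical.

```python
import numpy as np


def fst_from_decomposition(genotypes, labels):
    """Sketch: FST from the between/within split of a centered genotype matrix.

    Assumption (not taken from the paper's code): FST is the ratio of the sum
    of squared singular values of the between-population matrix to that of
    the full centered matrix.
    """
    X = np.asarray(genotypes, dtype=float)
    labels = np.asarray(labels)
    Z = X - X.mean(axis=0)                  # center each locus (no scaling)
    Z_between = np.empty_like(Z)
    for pop in np.unique(labels):
        mask = labels == pop
        Z_between[mask] = Z[mask].mean(axis=0)   # row -> its population mean
    Z_within = Z - Z_between                # within-population residuals
    s_between = np.linalg.svd(Z_between, compute_uv=False)
    s_total = np.linalg.svd(Z, compute_uv=False)
    fst = np.sum(s_between**2) / np.sum(s_total**2)
    return fst, s_between, s_total, Z_within


# Simulated data: K discrete populations with drifted allele frequencies.
rng = np.random.default_rng(0)
K, n_per_pop, n_loci = 3, 50, 500
ancestral = rng.uniform(0.2, 0.8, size=n_loci)
blocks = []
for _ in range(K):
    freqs = np.clip(ancestral + rng.normal(0.0, 0.1, size=n_loci), 0.05, 0.95)
    blocks.append(rng.binomial(1, freqs, size=(n_per_pop, n_loci)))
genotypes = np.vstack(blocks)
labels = np.repeat(np.arange(K), n_per_pop)

fst, s_between, s_total, Z_within = fst_from_decomposition(genotypes, labels)

# Share of total variance carried by the first K - 1 principal components.
pc_share = np.sum(s_total[: K - 1] ** 2) / np.sum(s_total**2)

# Classical variance-ratio check, 1 - H_S / H_T, summed over loci
# (population sizes are equal here, so a plain mean over populations is used).
p_total = genotypes.mean(axis=0)
h_total = np.sum(p_total * (1 - p_total))
h_within = np.mean(
    [np.sum(b.mean(axis=0) * (1 - b.mean(axis=0))) for b in blocks]
)
fst_classical = 1 - h_within / h_total

print(f"FST (between-matrix spectrum): {fst:.4f}")
print(f"FST (1 - H_S / H_T):           {fst_classical:.4f}")
print(f"Variance share of first K-1 PCs: {pc_share:.4f}")
```

In this sketch the between-population matrix has at most K - 1 nonzero singular values, so its full spectrum is its first K - 1 components. The variance share of the first K - 1 components of the full matrix can never fall below the FST value computed this way; how close the two are depends on the data, which corresponds to the separation condition and data-set size mentioned in the abstract.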

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Consanguinity
  • Genetic Variation / genetics*
  • Genetics, Population / methods*
  • Genome
  • Genomics
  • Genotype
  • Humans
  • Inbreeding / methods
  • Models, Genetic
  • Models, Theoretical
  • Principal Component Analysis / methods*

Grants and funding

The study was supported by the grant ANR-19-P3IA-0003 MIAI@Grenoble-Alpes – PEG: Predictive Ecological Genomics to OF from the Agence Nationale de la Recherche (https://anr.fr/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.