G3 (Bethesda). 2014 Oct 1;4(12):2317-28. doi: 10.1534/g3.114.011957.

Influence of outliers on accuracy estimation in genomic prediction in plant breeding.

Author information

1 Biostatistics Unit, Institute of Crop Science, University of Hohenheim, 70599 Stuttgart, Germany.
2 Biostatistics Unit, Institute of Crop Science, University of Hohenheim, 70599 Stuttgart, Germany. jogutu2007@gmail.com

Abstract

Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased.
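The contamination step described in the abstract (adding 5, 8, or 10 times the error SD to a single observation) can be sketched as follows. This is a minimal illustration only, not the authors' simulation code: the function names, the sample size of 50, the unit genetic and error SDs, and the random seed are all assumptions made for the example. The Pearson correlation between simulated genetic values and phenotypes serves as a simple stand-in for an accuracy measure; it is not one of the seven methods compared in the paper.

```python
import random
from math import sqrt


def simulate_phenotypes(n_genotypes, genetic_sd, error_sd, rng):
    """Draw genetic values g and phenotypes y = g + e (hypothetical toy model)."""
    g = [rng.gauss(0.0, genetic_sd) for _ in range(n_genotypes)]
    y = [gi + rng.gauss(0.0, error_sd) for gi in g]
    return g, y


def add_outlier(y, index, multiple, error_sd):
    """Return a copy of y with multiple * error_sd added to one observation."""
    contaminated = list(y)
    contaminated[index] += multiple * error_sd
    return contaminated


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (z - mean_b) for x, z in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((z - mean_b) ** 2 for z in b)
    return cov / sqrt(var_a * var_b)


# Illustrative settings (assumed, not taken from the paper).
rng = random.Random(1)
error_sd = 1.0
g, y = simulate_phenotypes(50, 1.0, error_sd, rng)
clean_r = pearson(g, y)

# Contaminate the first observation with each outlier magnitude in turn
# and report the change in correlation relative to the clean dataset.
for multiple in (5, 8, 10):
    y_out = add_outlier(y, 0, multiple, error_sd)
    print(multiple, round(pearson(g, y_out) - clean_r, 3))
```

Repeating this over many simulated datasets, and over several sample sizes and genetic variances, would give a rough picture of how the deviation grows with outlier magnitude, in the spirit of the comparison the abstract describes.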

KEYWORDS: GenPred; accuracy estimation; genomic prediction; heritability; outliers; predictive accuracy; shared data resource

PMID: 25273862
PMCID: PMC4267928
DOI: 10.1534/g3.114.011957
[Indexed for MEDLINE]
Free PMC Article