Prediction of eye color in the Slovenian population using the IrisPlex SNPs

Aim To evaluate the accuracy of eye color prediction based on six IrisPlex single nucleotide polymorphisms (SNPs) in a Slovenian population sample.

Methods Six IrisPlex predictor SNPs (HERC2 – rs12913832, OCA2 – rs1800407, SLC45A2 – rs16891982, TYR – rs1393350, SLC24A4 – rs12896399, and IRF4 – rs12203592) were analyzed in 105 individuals using a single base extension approach and SNaPshot chemistry. The IrisPlex multinomial regression prediction model was used to infer eye color probabilities. The accuracy of the IrisPlex was assessed by calculating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the receiver operating characteristic curve (AUC).

Results Blue eye color was observed in 44.7%, brown in 29.6%, and intermediate in 25.7% of participants. Prediction accuracy expressed by the AUC was 0.966 for blue, 0.913 for brown, and 0.796 for intermediate eye color. Sensitivity was 93.6% for blue, 58.1% for brown, and 0% for intermediate eye color. Specificity was 93.1% for blue, 89.2% for brown, and 100% for intermediate eye color. PPV was 91.7% for blue and 69.2% for brown eye color. NPV was 94.7% for blue and 83.5% for brown eye color. These values indicate prediction accuracy comparable to that established in other studies.

Conclusion Blue and brown eye color can be reliably predicted from DNA samples using only six polymorphisms, while intermediate eye color defies prediction, indicating that more research is needed to genetically predict the whole variation of eye color in humans.

Prediction of human visible characteristics by genotyping informative polymorphisms in DNA opens up a new perspective in the forensic field. Multiple genes including HERC2, OCA2, MC1R, SLC24A5, SLC45A2, TYR, TYRP1, ASIP, SLC24A4, TPCN2, KITLG, and IRF4 have been associated with eye, hair, and skin color in European populations and have been used in studies dealing with eye color prediction (1-14). Variation of iris color depends on the content of eumelanin, a brown light-absorbing biopolymer, which is present in higher concentrations in brown-eyed individuals (15,16). Although eye color is evidently a continuous variable, it has often been classified into three categories: blue, brown, and intermediate (4,14). Eye color variability is particularly striking in European populations, constituting a highly differentiating trait of potential use in forensic investigations (7,14,17). Recent studies have shown that a significant fraction of human iris color variation can be explained by polymorphisms within a single region of the human genome, comprising the evolutionarily conserved HERC2 gene and the neighboring OCA2 gene located on chromosome 15. It is assumed that the level of expression of the known pigmentation gene, OCA2, is controlled by the polymorphism rs12913832 in the HERC2 locus (18,19). The remaining genes that have been shown to contribute to eye color variation are SLC24A4, SLC45A2, TYR, and IRF4 (4,20,21). However, their impact on eye color prediction is lower and seems to vary between populations (8,14,22,23). Since such differences may affect the accuracy of prediction in various populations, we further addressed this issue and analyzed a population sample of individuals with defined eye color from Slovenia.
Several models have already been proposed for eye color prediction (4,8,9,17,23,24). Here we used six IrisPlex predictors, which were selected by Liu et al (4) from a larger set of polymorphisms potentially influencing pigmentation in humans and included in the IrisPlex prediction system (4,13,17). The IrisPlex prediction model is based on a multinomial logistic regression method and uses phenotype and genotype data from 3804 Dutch individuals. Based on these data, the model gives three probabilities, one each for blue, brown, and intermediate eye color (13). From the obtained probabilities, the most probable iris color is predicted following the recommendations given in Walsh et al (13).

Four SNPs (HERC2 – rs12913832, OCA2 – rs1800407, SLC45A2 – rs16891982, and TYR – rs1393350) were genotyped previously as described in Kastelic et al (26). The remaining two IrisPlex SNPs (SLC24A4 – rs12896399 and IRF4 – rs12203592) were genotyped for the purpose of this study using the protocol described by Walsh et al (17). Marker details and primer sequences are listed in Supplementary Tables 1 and 2. All cleaned products were analyzed on the ABI Prism 3130 Genetic Analyzer (Applied Biosystems) using run parameters as described previously (17,26).
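To make the multinomial logistic regression form of the prediction model concrete, the following Python sketch shows how three eye color probabilities could be derived from six SNP genotypes. It is an illustration only: the coefficients in `ALPHA` and `BETA` are invented placeholders and not the published IrisPlex parameters, the coding of genotypes as allele counts (0, 1, or 2) is an assumption, and so is the use of brown as the reference category.

```python
import math

SNPS = ["rs12913832", "rs1800407", "rs12896399",
        "rs16891982", "rs1393350", "rs12203592"]

# HYPOTHETICAL intercepts and slopes, invented for illustration only.
# They are NOT the published IrisPlex model parameters.
ALPHA = {"blue": 2.0, "intermediate": 0.5}
BETA = {
    "blue": {"rs12913832": -4.0, "rs1800407": 1.0, "rs12896399": -0.5,
             "rs16891982": -1.0, "rs1393350": 0.4, "rs12203592": 0.7},
    "intermediate": {"rs12913832": -1.5, "rs1800407": 0.8, "rs12896399": -0.2,
                     "rs16891982": -0.5, "rs1393350": 0.2, "rs12203592": 0.5},
}


def eye_color_probabilities(allele_counts):
    """Return probabilities for blue, intermediate, and brown eye color.

    allele_counts maps each SNP to an assumed allele count (0, 1, or 2).
    Brown is treated as the reference category (an assumption), so its
    unnormalized weight is fixed at 1.
    """
    weights = {}
    for color in ("blue", "intermediate"):
        linear = ALPHA[color] + sum(
            BETA[color][snp] * allele_counts[snp] for snp in SNPS
        )
        weights[color] = math.exp(linear)
    denominator = 1.0 + sum(weights.values())
    probabilities = {color: w / denominator for color, w in weights.items()}
    probabilities["brown"] = 1.0 / denominator
    return probabilities


# Example call with a made-up genotype profile.
example = {snp: 0 for snp in SNPS}
example["rs12913832"] = 1
probs = eye_color_probabilities(example)
most_probable = max(probs, key=probs.get)
```

By construction the three probabilities sum to 1, so reporting any two of them determines the third.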

Model-based prediction of eye color and evaluation of its accuracy
On the basis of the formula provided by Liu et al (4) and implemented in the eye color prediction model of the IrisPlex system, three prediction probability values were generated for each individual, one for each of the three phenotype categories (blue, intermediate, and brown) (Supplementary Table 3). The overall prediction accuracy was assessed, as explained previously (26), by calculating area under the curve (AUC) values using SPSS 19.0 (SPSS Inc., Chicago, IL, USA). The AUC is the integral of the receiver operating characteristic (ROC) curve and ranges from 0.5, which represents a total absence of prediction, to 1.0, which represents a perfect prediction. Additionally, the values of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated according to Liu et al using a prediction threshold of ≥0.7, which has been determined to be the most appropriate (1). Inconclusive results (probability below the 0.7 threshold) were considered negative results. Among these negative results, false negatives were used to calculate sensitivity and true negatives were used to calculate specificity.
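The threshold rule described above can be sketched in Python as follows. The records are hypothetical toy data, not the study data, and the function name `binary_metrics` is introduced here only for illustration. Each category is evaluated one-versus-rest, and any individual whose probability for that category is below 0.7 is counted as a negative call, as the text specifies for inconclusive results.

```python
def binary_metrics(records, category, threshold=0.7):
    """Compute one-versus-rest accuracy measures for a single eye color.

    records is a list of (observed_color, {color: probability}) pairs.
    A prediction is positive only if the probability for `category`
    reaches the threshold; everything else counts as negative.
    """
    tp = fp = tn = fn = 0
    for observed, probabilities in records:
        predicted_positive = probabilities[category] >= threshold
        actually_positive = observed == category
        if predicted_positive and actually_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actually_positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Return None when the measure is undefined (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# HYPOTHETICAL toy records for illustration only.
toy_records = [
    ("blue", {"blue": 0.92, "intermediate": 0.05, "brown": 0.03}),
    ("blue", {"blue": 0.60, "intermediate": 0.25, "brown": 0.15}),
    ("intermediate", {"blue": 0.75, "intermediate": 0.15, "brown": 0.10}),
    ("intermediate", {"blue": 0.30, "intermediate": 0.30, "brown": 0.40}),
    ("brown", {"blue": 0.02, "intermediate": 0.08, "brown": 0.90}),
    ("brown", {"blue": 0.10, "intermediate": 0.30, "brown": 0.60}),
]

blue_metrics = binary_metrics(toy_records, "blue")
intermediate_metrics = binary_metrics(toy_records, "intermediate")
```

In this toy set no individual reaches the 0.7 threshold for intermediate eye color, so sensitivity is 0, specificity is 1, and the PPV is undefined (returned as `None`) because there are no positive calls.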

Characteristics of the study population
The frequency of blue eye color in the studied sample was 44.7% (47 samples), and the frequency of brown eye color was much lower, at 29.6% (31 samples). Individuals were assigned to these two eye color groups only when the color was homogeneous, regardless of its intensity. The frequency of individuals in the intermediate eye color group was relatively high, 25.7% (27 samples).

Prediction accuracy of the IrisPlex model
Prediction accuracy expressed by the AUC (Figure 1) was 0.966 for blue, 0.913 for brown, and 0.796 for intermediate eye color.
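For readers unfamiliar with the AUC, the area under the empirical ROC curve equals the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The Python sketch below uses this pairwise formulation on hypothetical scores. It is not the SPSS procedure used in the study, and the function name `pairwise_auc` is introduced only for illustration.

```python
def pairwise_auc(positive_scores, negative_scores):
    """Area under the empirical ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher predicted probability; ties count 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# HYPOTHETICAL predicted probabilities of blue eye color, for illustration:
# scores for individuals observed as blue and as non-blue, respectively.
blue_observed = [0.9, 0.8, 0.4]
not_blue_observed = [0.7, 0.3, 0.2, 0.1]
toy_auc = pairwise_auc(blue_observed, not_blue_observed)
```

In this toy example 11 of the 12 pairs are ordered correctly, so the AUC is 11/12, or about 0.917. A value of 0.5 would mean that the scores do not separate the two groups at all.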

Discussion
In the studied population sample, blue eye color was present at a frequency of 44.7%, while according to Eupedia (http://www.eupedia.com/europe/maps_of_europe.shtml#eye_colour), the expected frequency of light-eyed individuals in Slovenia should be between 50% and 79%. However, most of the individuals included in the intermediate category had green irises or blue irises with brown spots or peripupillary rings and therefore could be included in the group of light-eyed people. Taking this into account, the percentage of light-eyed individuals in the study was 56%, which is in accordance with Eupedia.

The IrisPlex system includes six SNPs located in six genes (HERC2 rs12913832, OCA2 rs1800407, SLC24A4 rs12896399, SLC45A2 rs16891982, TYR rs1393350, and IRF4 rs12203592), which are considered to be major genetic predictors of eye color (4,12-14). Numerous studies have confirmed that rs12913832, located in the HERC2 gene, alone carries most of the eye color predictive information and is therefore the best known eye color predictor (4,5,8,14,22,27). We confirmed these results and also provided evidence that the CC genotype at rs12913832 was strongly associated with blue eye color (8,14,22,27). Among 49 Slovenian individuals carrying the CC genotype, 45 (91.8%) had blue eye color, which is comparable to other reports (93.4%) (14). In the remaining four cases, green eye color was observed; accordingly, our study suggests that the CC genotype reliably predicts light eye colors. On the other hand, 12 (80%) out of 15 individuals carrying the TT genotype had brown eye color. It is worth noting that this proportion was higher (98%) in the larger EUREYE study (14). This difference may be a consequence of the relatively small sample set in this study, especially in the brown eye color category, but also of differences in phenotype description among various studies.
The latter is supported by the phenotype distribution among CT genotype carriers: only 19 (46.3%) out of 41 individuals with the CT genotype had brown eyes in our population compared to 75.5% in Walsh et al (14). However, among the remaining 22 individuals, 2 had blue eye color but with intense brown peripupillary rings, 2 had green eye color, and 18 had hazel eye color. Overall, of the 56 individuals carrying the T allele in either the homozygous or heterozygous state, 31 (55.4%) had brown eye color and as many as 49 (87.5%) had dark irises (brown or hazel). Notably, 8 (53.3%) individuals out of the 15 with a TT genotype had dark brown eye color and 4 (26.7%) had lighter brown eye color, which implies that other polymorphisms are important modifiers of eye color intensity. This study confirmed the prevailing role of rs12913832 in the determination of blue and brown eye color. The significance of this position for human pigmentation seems to be indisputable and has also been confirmed by functional genome analysis (19). Its large effect on eye color, detected in all populations studied so far, makes this SNP a key element of all eye color prediction methods. Moreover, it has been shown that this position also influences variation in hair and skin color (1,22). Indeed, rs12913832 is also an important element of hair color prediction models (1,3,17).
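The genotype-phenotype proportions quoted above for rs12913832 can be rechecked with a few lines of Python. The counts are taken from the text. The eye colors of the three TT carriers without brown eyes are not stated and are therefore left out, and the dictionary layout and the helper name `percent` are choices made only for this sketch.

```python
# Counts as reported in the text for rs12913832 genotypes.
genotype_counts = {
    "CC": {"n": 49, "blue": 45, "green": 4},
    "CT": {"n": 41, "brown": 19, "hazel": 18, "green": 2, "blue": 2},
    "TT": {"n": 15, "brown": 12},  # colors of the other 3 are not stated
}


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


cc_blue = percent(genotype_counts["CC"]["blue"], genotype_counts["CC"]["n"])
tt_brown = percent(genotype_counts["TT"]["brown"], genotype_counts["TT"]["n"])
ct_brown = percent(genotype_counts["CT"]["brown"], genotype_counts["CT"]["n"])

# T-allele carriers are the CT and TT groups combined.
t_carriers = genotype_counts["CT"]["n"] + genotype_counts["TT"]["n"]
t_brown = genotype_counts["CT"]["brown"] + genotype_counts["TT"]["brown"]
t_dark = t_brown + genotype_counts["CT"]["hazel"]

t_brown_pct = percent(t_brown, t_carriers)
t_dark_pct = percent(t_dark, t_carriers)
total_genotyped = sum(group["n"] for group in genotype_counts.values())
```

The three genotype groups add up to the 105 individuals in the sample, and the recomputed percentages match those given in the text.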
OCA2 SNP rs1800407, which ranked second among the best eye color predictors, has a very low frequency of allele A (11.9%) and therefore may have had a weak overall influence on variation in eye color in our population sample (4,12,13). The remaining IrisPlex predictors have been shown to have a smaller effect on iris color variation, but all six are implemented in the IrisPlex macro (13). It has been pointed out that IrisPlex can accurately predict blue and brown eye color while it is inefficient in the prediction of intermediate eye color, and thus one should expect considerably lower prediction accuracy for this eye color category (Figure 1) (14). Indeed, the AUC values for the blue and brown eye color categories in the Slovenian population were very high, 0.966 and 0.913, respectively. This result is similar to the AUC values obtained using multinomial logistic regression for a much larger group of seven European populations, which were 0.964 for blue and 0.956 for brown (14). Mengel-From et al (5) investigated HERC2, OCA2, and SLC45A2 variation in 395 Danes using logistic regression and concluded that variation in the HERC2-OCA2 complex can be useful for reliable prediction of light and dark eye colors. Pośpiech et al (7) used a Bayesian network built on data from 638 Poles and, by testing 80 samples, obtained AUC values of 0.783 and 0.583 for blue and brown color, respectively. These values were calculated from the scoring results rather than from probabilistic values, which is a much more conservative approach. They confirmed high sensitivity of prediction for blue eye color (80%) and lower sensitivity for brown eye color (35%). They also concluded that eye color can be reliably predicted from the available DNA markers at the light-dark level, obtaining a high AUC value of 0.925 (7). A similar AUC value for light and dark eye color categories (AUC = 0.985) was obtained in our previous study involving the same Slovenian population and a different set of predictors (26).
The accuracy values were also similar in the study of six European populations (with AUC values of 0.986 and 0.978 for blue and brown, respectively) (8). In this latter work, a different model-building data set was used, which could certainly influence the final AUC values. Recently, Allwood and Harbison (23) proposed a novel eye color prediction method using a classification tree approach and predicted blue and brown eye color with very high accuracy.

Declaration of authorship VK participated in designing the methods, analyzing and interpreting the results, and preparing the manuscript. EP participated in study conception and design, considerably contributed to statistical analysis and data interpretation, revised the manuscript, and gave the final approval for publication. JDB revised the data and calculations, and contributed to the manuscript preparation and revision. WB wrote substantial sections of the manuscript. KD produced the idea, designed the manuscript, and contributed to results interpretation and revision of the manuscript for important intellectual content.

Competing interests All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.