Allele frequencies of 15 STR loci in Bosnian and Herzegovinian population

Aim To determine up-to-date and more accurate allele frequencies for 15 short tandem repeat (STR) loci in the Bosnian and Herzegovinian population, calculate statistical parameters, and compare them with the relevant data for seven neighboring populations.
Methods Genomic DNA was obtained from buccal swabs of 1000 unrelated individuals from all regions of Bosnia and Herzegovina. Genotyping was performed using the PowerPlex® 16 System to obtain allele frequencies for 15 polymorphic STR loci: D3S1358, TH01, D21S11, D18S51, Penta E, D5S818, D13S317, D7S820, D16S539, CSF1PO, Penta D, vWA, D8S1179, TPOX, and FGA. The calculated allele frequencies were compared with the data from neighboring populations.
Results The highest polymorphism information content (PIC) was detected at the Penta E locus and the lowest at the TPOX locus. The power of discrimination (PD) values had a similar distribution, with Penta E showing the highest PD of 0.9788. D18S51 had the highest power of exclusion (PE), whereas the lowest PE was detected at the TPOX locus.
Conclusion Upon comparison of the Bosnian and Herzegovinian population data with those of seven neighboring populations, the highest allele frequency differentiation was noticed between the Bosnian and Herzegovinian and Turkish populations at 5 loci, the most informative of which was Penta E. The neighbor-joining dendrogram constructed on the basis of genetic distance showed grouping of the Slovenian, Austrian, Hungarian, and Croatian populations. The Bosnian and Herzegovinian population was placed between this cluster and the Serbian population. To determine a more accurate distribution of allele frequencies and forensic parameters, our study included 1000 unrelated individuals from all regions of Bosnia and Herzegovina, and our findings demonstrated the applicability of these markers in both forensics and future population genetic studies.

Short tandem repeats (STRs) are common markers in population biodiversity research, paternity testing, and forensic analysis of biological evidence. The reliability of STR amplification provides a high level of individualization that is crucial for population genetic studies. To obtain precise and reliable results, it is necessary to use population data obtained from a sufficient number of samples (1). The currently available official allele frequencies for the Bosnian and Herzegovinian population at the 15 STR loci addressed in this study were published 10 years ago and were obtained from only 100 unrelated individuals, which was acceptable at the time (2).
The latest recommendations regarding the official publication and forensic use of STR population data highlight the need to increase the size of the population sample used for the calculations. Therefore, the main aim of this study was to determine up-to-date and more accurate allele frequencies and forensic statistical parameters for the 15 most commonly used STR loci in the Bosnian and Herzegovinian population and to compare them with the relevant data for neighboring populations.

Material
Buccal swab samples were collected from 1000 unrelated individuals from all regions of Bosnia and Herzegovina. Samples were randomly selected from routine casework performed between 2006 and 2016 at the Institute for Genetic Engineering and Biotechnology, University of Sarajevo. Only unrelated adults over 18 years of age were included in this study. Informed consent for the use of collected biological material and data was obtained from all subjects.

Statistical analysis
Allele frequencies, matching probability (MP), power of discrimination (PD), power of exclusion (PE), and typical paternity index (TPI) were calculated with PowerStats, a Microsoft Excel workbook template (5). PowerMarker version 3.25 was used to estimate the allele number ($A_N$) (6), deviation from Hardy-Weinberg equilibrium (7), polymorphism information content (8), and observed and expected heterozygosity (9). The exact test of population differentiation (10) was performed in Arlequin version 3.5.1.2 (11). After Bonferroni's correction, the threshold for statistical significance was P < 0.01 for deviation from Hardy-Weinberg equilibrium and P < 0.001 for the population differentiation test. The number of effective alleles ($A_E$) was estimated as $A_E = 1/\sum_i p_i^2$, where $p_i$ denotes the frequency of the $i$-th allele at a particular locus. The ratio of the effective to the detected number of alleles and its statistical significance were calculated as suggested by Pojskic and Kalamujic (12) with Alleles Ratio, a Microsoft Excel workbook template (13). A Z-score with P < 0.01 was considered statistically significant. To estimate genetic distance among populations, we used the Dsw method proposed by Shriver et al (14). The neighbor-joining dendrogram was constructed from the genetic distance results (15). These calculations were performed using POPTREE software (16).
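As an illustration of the locus-level quantities named above, the following is a minimal Python sketch, not the PowerStats or PowerMarker implementation. It assumes genotypes are given as pairs of allele labels, uses the uncorrected expected heterozygosity $1 - \sum_i p_i^2$ (the software may apply a sample-size correction), and uses the standard PIC formula $1 - \sum_i p_i^2 - \sum_{i<j} 2 p_i^2 p_j^2$. The function names and the four toy genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Relative frequency of each allele among the 2N sampled chromosomes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def effective_alleles(freqs):
    """Number of effective alleles, A_E = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs.values())


def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity, 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content: 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical toy genotypes at one locus (not data from this study).
    toy = [(6, 7), (6, 9.3), (7, 7), (9.3, 9.3)]
    freqs = allele_frequencies(toy)
    a_e = effective_alleles(freqs)
    a_n = len(freqs)  # detected number of alleles
    print(f"A_N = {a_n}, A_E = {a_e:.3f}, A_E/A_N = {a_e / a_n:.3f}")
    print(f"H_exp = {expected_heterozygosity(freqs):.3f}, PIC = {pic(freqs):.3f}")
```

The ratio $A_E/A_N$ approaches 1 when all detected alleles are equally frequent and falls toward 0 when a few alleles dominate.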

RESULTS
Allele frequencies and statistical parameters including heterozygosity (observed and expected), results of the exact test, PD, and PE for the 15 STR markers were calculated (Tables 1 and 2). No statistically significant deviation from Hardy-Weinberg equilibrium was found at the analyzed loci (P > 0.05 for all), except at the D8S1179 locus. However, after Bonferroni's correction, the deviation at this locus was not statistically significant either (P = 0.015). Excess of heterozygosity was detected for the D3S1358, D21S11, D18S51, D16S539, vWA, and TPOX loci (Table 2). A total of 160 alleles were detected, 32 of which qualified as rare alleles (frequency <0.005). The highest number of alleles was detected at the Penta E locus (18 alleles) and the lowest at the TH01 locus (7 alleles) (Table 1). The highest number of effective alleles was estimated for the Penta E locus (9.47) and the lowest for the TPOX locus (2.55). The highest ratio between the number of effective and observed alleles ($A_E/A_N$) was detected for the TH01 locus (0.652) and the lowest for the TPOX locus (0.319) (Table 2). However, the D21S11 (Z = 3.420, P = 0.001), D18S51 (Z = 3.019, P = 0.003), Penta E (Z = 3.344, P = 0.001), Penta D (Z = 2.621, P = 0.009), D8S1179 (Z = 2.616, P = 0.009), TPOX (Z = 2.874, P = 0.004), and FGA (Z = 3.699, P < 0.001) loci showed a statistically significant ratio of the effective to the detected number of alleles (P < 0.01), indicating a clear-cut departure of the effective number of alleles from the detected number of alleles.
The highest value of polymorphism information content (PIC) was found for the Penta E locus and the lowest for the TPOX locus. The same results were obtained for PD, while the highest PE value was detected for D18S51 (Table 2). The lowest matching probability was observed for the Penta E locus, whereas the highest TPI was calculated for D18S51 (Table 2). Statistically significant differences were found in allele frequencies between the Bosnian and Herzegovinian population and the data available for seven neighboring populations (Table 3). The largest differences were found between the Bosnian and Herzegovinian population and the Turkish, Croatian, Austrian, and Italian populations (Table 3).
The neighbor-joining dendrogram based on the results of genetic distance analysis showed the relationship between the Bosnian and Herzegovinian population and neighboring populations (17-23). The Bosnian and Herzegovinian population had the greatest genetic distance from the Turkish population (0.220) and the smallest from the Serbian (0.000), Slovenian (0.001), and Hungarian (0.001) populations (Figure 1 and Table 4).
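The Z-scores and P values listed above can be cross-checked with a short Python sketch. It assumes that the reported P values are two-sided tail probabilities of a standard normal distribution; this is an assumption about how the Alleles Ratio template converts Z to P, not something stated in the text. The names `two_sided_p` and `REPORTED` are introduced here for illustration only.

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z-scores and P values as reported in the text (FGA is reported as P < 0.001).
REPORTED = {
    "D21S11": (3.420, 0.001),
    "D18S51": (3.019, 0.003),
    "Penta E": (3.344, 0.001),
    "Penta D": (2.621, 0.009),
    "D8S1179": (2.616, 0.009),
    "TPOX": (2.874, 0.004),
}
FGA_Z = 3.699

if __name__ == "__main__":
    for locus, (z, p_reported) in REPORTED.items():
        print(f"{locus}: Z = {z:.3f}, computed P = {two_sided_p(z):.4f}, reported P = {p_reported}")
    print(f"FGA: Z = {FGA_Z:.3f}, computed P = {two_sided_p(FGA_Z):.5f}, reported P < 0.001")
```

Under this assumption, each computed value agrees with the reported one after rounding to three decimal places.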
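The forensic parameters discussed here can be sketched in Python for a single locus. This is an illustrative sketch that assumes the commonly cited PowerStats definitions: MP is the sum of squared genotype frequencies, PD = 1 - MP, PE = h^2 (1 - 2 h H^2), and TPI = 1/(2H), where h and H are the observed proportions of heterozygotes and homozygotes. The function name and the toy genotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def forensic_parameters(genotypes):
    """MP, PD, PE, and TPI for one locus from a list of (allele, allele) pairs."""
    n = len(genotypes)
    # Sort alleles within each genotype so (7, 6) and (6, 7) count as the same genotype.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)
    mp = sum((count / n) ** 2 for count in genotype_counts.values())
    h = sum(count for g, count in genotype_counts.items() if g[0] != g[1]) / n
    hom = 1.0 - h  # observed proportion of homozygotes (H)
    pe = h ** 2 * (1.0 - 2.0 * h * hom ** 2)
    tpi = 1.0 / (2.0 * hom) if hom > 0 else float("inf")
    return {"MP": mp, "PD": 1.0 - mp, "PE": pe, "TPI": tpi, "h": h}


if __name__ == "__main__":
    # Hypothetical toy genotypes at one locus.
    toy = [(6, 7), (7, 6), (6, 9.3), (7, 7), (9.3, 9.3), (6, 9.3), (7, 9.3), (6, 6)]
    for name, value in forensic_parameters(toy).items():
        print(f"{name} = {value:.4f}")
```

Because PD = 1 - MP under these definitions, the locus with the lowest MP is necessarily the locus with the highest PD; for example, the PD of 0.9788 reported for Penta E corresponds to an MP of about 0.0212.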

DISCUSSION
Our results were concordant with the findings of the previous study conducted in the Bosnian and Herzegovinian population (2). This was also demonstrated in previous studies for other populations (17-23). Our results showed that locus D18S51 had the highest PE and TPI values. The most discriminating STR loci in the Bosnian and Herzegovinian population were Penta E and D18S51. Therefore, it is desirable to include these two loci in paternity testing and forensic analysis of biological evidence.
Allele frequencies for the 15 STR loci in the studied Bosnian and Herzegovinian population did not differ significantly from those found in the populations of Slovenia (17), Serbia (18), and Hungary (19). In other words, according to the 15 STR loci analyzed here, the Bosnian and Herzegovinian population is genetically more similar to the Slovenian, Serbian, and Hungarian populations than to the remaining four populations. Also, no significant differences were observed in previous population studies (17,18) when compared with earlier Bosnian and Herzegovinian data (2). Statistically significant differences between the Bosnian and Herzegovinian and Turkish populations were found at five loci (TH01, D13S317, D16S539, D8S1179, and FGA) (19) and between the Bosnian and Herzegovinian and Croatian populations at four loci (D3S1358, TH01, D8S1179, and FGA) (20). A deviation of allele frequencies in the Bosnian and Herzegovinian population from those in the Croatian population was observed in the previous study (2) at the D8S1179 locus. Also, Bosnian and Herzegovinian allele frequencies differed from Austrian frequencies (21) at four loci (TH01, D18S51, D16S539, and D8S1179), whereas a deviation of the Bosnian and Herzegovinian allele frequency distribution from the Italian distribution (22) was observed only for two loci (D18S51 and D16S539).
The neighbor-joining dendrogram showed the relationship among the 8 populations on the basis of the results of genetic distance analysis. The Bosnian and Herzegovinian population has the highest genetic distance from the Turkish population and the lowest from the Serbian population. The dendrogram showed grouping of the Slovenian, Austrian, Hungarian, and Croatian populations. The Bosnian and Herzegovinian population is placed between this cluster and the Serbian population. The results of genetic distance analysis are in concordance with conclusions about similarity among neighboring populations based on STR profiles (24).
Although a study based on 100-150 respondents was sufficient at the time (25) and considered adequate for determining parameters for the given population, the latest recommendations regarding official publication and forensic use of STR population data highlight the need to increase the size of the studied population sample. The previous study of the population of Bosnia and Herzegovina analyzed fewer than 150 individuals, and DNA typing included STR markers (2). Our findings on the distribution of allele frequencies and forensic parameters obtained in 1000 unrelated individuals from all regions of Bosnia and Herzegovina demonstrate the applicability of these markers in both forensics and future population genetic studies.
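For readers unfamiliar with how a neighbor-joining dendrogram is derived from a distance matrix, the following Python sketch shows the joining step of the algorithm on a small hypothetical matrix with placeholder labels a to e. It is not the POPTREE implementation, it does not use the Dsw distances from Table 4, and it returns only the topology (no branch lengths). At each step the pair minimizing Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) is merged.

```python
def neighbor_joining(labels, dist):
    """Return (newick_topology, join_order) for a symmetric distance matrix."""
    nodes = list(labels)
    # Store distances in a dict keyed by node-name pairs (both orders present).
    d = {(a, b): dist[i][j] for i, a in enumerate(nodes) for j, b in enumerate(nodes)}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a, b] for b in nodes) for a in nodes}
        pairs = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
        # Pick the pair with the smallest Q value (first one wins on ties).
        f, g = min(pairs, key=lambda p: (n - 2) * d[p] - total[p[0]] - total[p[1]])
        u = f"({f},{g})"  # name of the new internal node
        joins.append((f, g))
        for k in nodes:
            if k not in (f, g):
                # Distance from the new node to every remaining node.
                d[u, k] = d[k, u] = 0.5 * (d[f, k] + d[g, k] - d[f, g])
        d[u, u] = 0.0
        nodes = [k for k in nodes if k not in (f, g)] + [u]
    return f"({nodes[0]},{nodes[1]});", joins


if __name__ == "__main__":
    # Hypothetical distances among five placeholder taxa (not Table 4).
    labels = ["a", "b", "c", "d", "e"]
    matrix = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    tree, joins = neighbor_joining(labels, matrix)
    print("join order:", joins)
    print("topology:", tree)
```

The resulting tree is unrooted, so the placement of a population "between" two groups, as described above, refers to its position along the internal branches rather than to an ancestral relationship.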