Genetic analysis of haplotype data for 23 Y-chromosome short tandem repeat loci in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina

Aim To explore the distribution and polymorphisms of 23 short tandem repeat (STR) loci on the Y chromosome in the Turkish population recently settled in Sarajevo, Bosnia and Herzegovina, and to investigate its genetic relationships with the homeland Turkish population and neighboring populations.
Methods This study included 100 healthy unrelated male individuals from the Turkish population living in Sarajevo. Buccal swab samples were collected as a DNA source. Genomic DNA was extracted using the salting out method and amplification was performed using the PowerPlex Y 23 amplification kit. The studied population was compared to other populations using pairwise genetic distances, which were represented with a multi-dimensional scaling plot.
Results Haplotype and allele frequencies of the sample population were calculated and the results showed that all 100 samples had unique haplotypes. The most polymorphic locus was DYS458, and the least polymorphic was DYS391. The observed haplotype diversity was 1.0000 ± 0.0014, with a discrimination capacity of 1.00 and a match probability of 0.01. Rst values showed that our sample population was closely related in both dimensions to the Lebanese and Iraqi populations, while it was more distant from the Bosnian, Croatian, and Macedonian populations.
Conclusion The Turkish population residing in Sarajevo could be regarded as a representative Turkish population, since our results were consistent with those previously published for the homeland Turkish population. This study also indicated that geographically close populations were genetically more related to each other.

Human Y chromosome short tandem repeats (Y-STRs) are repeating regions with 2-7 bp long repetitive units found in the non-recombining region of the Y chromosome. Y-STRs are characterized by a male-specific inheritance pattern. They are the most widely used Y chromosome markers due to simple typing and a high level of diversity. Typing is performed using polymerase chain reaction (PCR), which is a reliable procedure tolerant to degraded DNA. Thus, Y-STRs can be used in forensics for the investigation of sexual assault cases, for deficient paternity testing when the alleged father is not available for testing, in gang rape situations (mixture of two or more male DNA samples), for the investigation of genetic causes of male infertility, in genealogical research, particularly for surname testing, in population genetic studies, for the verification of amelogenin Y-deficient men, and in genetic epidemiology (1-9). In this study, genotyping was performed with the PowerPlex Y 23 kit (Promega Corporation, Madison, WI, USA). This kit types 23 Y-STR loci in a tested haplotype and includes 6 new Y-STR loci when compared to the previous Y-STR commercial kits, namely DYS576, DYS481, DYS549, DYS533, DYS570, and DYS643 (10,11).
More than 400 years of shared history of Bosnia and Herzegovina and the Ottoman Empire shaped cultural features of both populations, with consequences visible even in the modern era. However, according to a previously published study (12), there was no major genetic impact on the local population. Nowadays, for the first time, there is a considerable settlement process from Turkey to Bosnia and Herzegovina, which could have a certain impact on local population diversity.
This study included Turkish students currently studying in Sarajevo as a representative sample of the Turkish population that has recently settled in Sarajevo. The aim of the study was to describe the haplotype polymorphisms and distributions in this Turkish population and to estimate their forensic parameters. In addition, population pairwise genetic distances (Rst) and associated probability values (P values) with 10 000 permutations were calculated between the studied population and the neighboring populations. Multi-dimensional scaling (MDS) plots were generated using genetic distances for the comparison of different populations' haplotype data found in the Y Chromosome Haplotype Reference Database (YHRD, www.yhrd.org).

Study population
A total of 100 unrelated (not belonging to the same nuclear family) healthy male students from the Turkish population living in Sarajevo and originating from different geographical regions of Turkey were sampled for the Y-STR analysis. DNA samples were collected at the International Burch University between March 2013 and January 2014. Informed consent was obtained from all participants. Ethical approval was received from the International Burch University, Sarajevo.

Sample preparation and DNA extraction
Buccal swabs were collected and used as a DNA source (n = 100). Air-dried buccal swabs were placed in paper envelopes and stored at +4°C until DNA extraction. Genomic DNA was isolated from buccal swabs using the salting out method (13) and extracted DNA samples were stored at -20°C until PCR amplification.

Statistical analyses
Allele and haplotype frequencies were calculated by the counting method. Haplotype diversity (HD) was estimated by Nei's formula (15): $HD = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where $n$ is the sample size and $p_i$ is the frequency of the ith haplotype. Gene diversity (GD) was calculated as $GD = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the ith allele. Arlequin software version 3.5 was used (16). The match probability (MP) was calculated as $MP = \sum_i p_i^2$, where $p_i$ is the frequency of the ith haplotype. The discrimination capacity (DC) was calculated according to the formula $DC = h/n$, where $h$ is the number of different haplotypes in the observed population (17). In order to compare the countries located in the proximity of Bosnia and Herzegovina and Turkey, we used the Y Chromosome Haplotype Reference Database, YHRD (www.yhrd.org) (18). The AMOVA online tool (19) from YHRD was used to calculate population pairwise genetic distances (Rst) and associated probability values (P values) between the studied population and the neighboring populations with 10 000 permutations. MDS plots were generated using genetic distances for the comparison of different populations' haplotype data present in YHRD. For the population comparison, recently published haplotype data present in YHRD were used and the analyses were performed on 23 Y-chromosome STR loci. The compared population samples, with the number of haplotypes, were as follows: Macedonian (n = 101), Italian (n = 335), Lebanese (n = 555), Iraqi (n = 124), Greek (n = 109), Slovenian (n = 104), Hungarian (n = 143), German (n = 131), Croatian (n = 125), Bosnian (n = 100), and Turkish (n = 100).
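The forensic parameters defined above can be illustrated with a minimal Python sketch. It is not the Arlequin implementation used in the study; the function names `haplotype_stats` and `gene_diversity` and the toy haplotypes are hypothetical and serve only to show how the formulas for HD, MP, DC, and GD are applied to counted frequencies.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (HD, MP, DC) for a list of hashable haplotypes.

    HD = (1 - sum(p_i^2)) * n / (n - 1)   (Nei's haplotype diversity)
    MP = sum(p_i^2)                        (match probability)
    DC = h / n                             (discrimination capacity)
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)                      # counting method
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    hd = (1 - sum_p2) * n / (n - 1)
    mp = sum_p2
    dc = len(counts) / n                              # h distinct haplotypes
    return hd, mp, dc


def gene_diversity(alleles):
    """Return GD = 1 - sum(p_i^2) for the alleles observed at one locus."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical toy data: four men typed at two loci (not study data).
toy_haplotypes = [("14", "13"), ("14", "13"), ("15", "12"), ("16", "13")]
hd, mp, dc = haplotype_stats(toy_haplotypes)
gd_first_locus = gene_diversity([h[0] for h in toy_haplotypes])
print(f"HD = {hd:.4f}, MP = {mp:.4f}, DC = {dc:.2f}, GD = {gd_first_locus:.4f}")
```

Haplotypes are represented as tuples so that two men match only if they share the same allele at every locus, which mirrors how a full Y-STR profile is compared.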

Results
We calculated haplotype frequencies of the sample population and detected 165 alleles at the 23 Y-STR loci (Table 1). All 100 samples had unique Y 23 haplotypes, which demonstrates the discriminating power of the new PowerPlex Y 23 kit (20). Apart from DYS385a/b, the most polymorphic locus was DYS458 with 12 alleles. A high degree of polymorphism was also observed in all six loci that were analyzed for the first time using the PowerPlex Y 23 kit. The least polymorphic loci were DYS391 with 3 alleles, DYS438 with 4 alleles, and DYS389I, DYS437, DYS439, and DYS393 with 5 alleles each. Haplotype diversity was 1.0000 ± 0.0014, with a DC of 1.00 and an MP of 0.01.
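As a consistency check, the reported HD, MP, and DC values follow directly from the formulas given under Statistical analyses when every one of the n = 100 haplotypes is observed exactly once (h = 100). The working below only restates those formulas with the study's sample size; it does not reproduce the standard error of 0.0014, which requires a separate variance estimate.

```latex
\begin{align*}
p_i &= \tfrac{1}{100}, \quad i = 1, \dots, 100 \\
MP &= \sum_{i=1}^{100} p_i^2 = 100 \times \left(\tfrac{1}{100}\right)^2 = 0.01 \\
HD &= \frac{n}{n-1}\left(1 - \sum_{i=1}^{100} p_i^2\right) = \frac{100}{99} \times 0.99 = 1 \\
DC &= \frac{h}{n} = \frac{100}{100} = 1
\end{align*}
```

With all haplotypes unique, MP is simply 1/n, so it reflects the sample size rather than any property of the loci.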
Average GD for the study population was 0.703, ranging from 0.470 to 0.942. The initial analysis of GD values indicated that the highest GD was detected at DYS385a/b loci with a value of 0.942 and the lowest GD at DYS391 locus with a value of 0.470, which is consistent with the polymorphism findings presented above in this study.
Two microvariant alleles were detected at DYS458, namely 17.2 and 18.2. The total frequency of the variant alleles at DYS458 was 6% of all study samples. We detected a duplication at DYS19 (alleles 15, 16) in two samples.
The choice of world populations for comparison with our data was mostly limited by the fact that the majority of populations still lack relevant data on 23 Y-STR loci. The Turkish population was closer to the Lebanese (Rst = 0.001, P = 0.289) and Iraqi (Rst = -0.001, P = 0.522) populations than to the Italian (Rst = 0.027, P = 0.000) and Greek (Rst = 0.039, P = 0.000) populations. Also, it was closer to the Hungarian (Rst = 0.074, P = 0.000), German (Rst = 0.088, P = 0.000), and Slovenian (Rst = 0.089, P = 0.000) populations than to the Macedonian (Rst = 0.126, P = 0.000) and Croatian (Rst = 0.216, P = 0.000) populations. The greatest genetic distance was detected between the Turkish and Bosnian (Rst = 0.284, P = 0.000) populations (Table 2).
An MDS plot was generated using pairwise Rst values to estimate the genetic relationships between the compared populations (Figure 1). Our sample population was closely related to the Lebanese and Iraqi populations in both dimensions, while it was much more distant from the Bosnian, Croatian, and Macedonian populations.
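The ordering described in this paragraph can be reproduced with a short Python sketch that sorts the reported Rst values. The Rst and P values are copied from the text; the 0.05 significance threshold (`ALPHA`) is an assumption made for illustration, since the study does not state the cut-off it applied.

```python
# Pairwise (Rst, P) values versus the Sarajevo Turkish sample, as reported in the text.
rst = {
    "Lebanese": (0.001, 0.289),
    "Iraqi": (-0.001, 0.522),
    "Italian": (0.027, 0.000),
    "Greek": (0.039, 0.000),
    "Hungarian": (0.074, 0.000),
    "German": (0.088, 0.000),
    "Slovenian": (0.089, 0.000),
    "Macedonian": (0.126, 0.000),
    "Croatian": (0.216, 0.000),
    "Bosnian": (0.284, 0.000),
}

ALPHA = 0.05  # assumed conventional threshold; not stated in the study

# Sort populations from genetically closest to most distant.
ranked = sorted(rst.items(), key=lambda item: item[1][0])

# Populations whose distance is not significantly different from zero.
not_significant = [name for name, (_, p) in ranked if p >= ALPHA]

for name, (r, p) in ranked:
    flag = "" if p < ALPHA else "  (not significantly different)"
    print(f"{name:<11} Rst = {r:6.3f}  P = {p:.3f}{flag}")
```

Under the assumed threshold, only the Iraqi and Lebanese comparisons are non-significant, which matches the statement that these two populations are the closest to the studied sample.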

Discussion
A comparison of the Turkish population currently living in Sarajevo with other populations showed that populations that were geographically closer were also more similar to each other. Thus, the populations most similar to our population were Iraqi and Lebanese. On the other hand, European populations are geographically more distant from Turkey and they showed greater differences from our population. The most distant populations were those from the Balkan Peninsula: Bosnian, Croatian, and Macedonian. Another study has also found important differences between the Turkish population and European populations (1). However, there are no studies comparing Turkish population with
In recent years, Y-STR marker analysis has been increasingly used in forensic science and population studies. Studies including 7, 9, 11, or 17 Y-STRs have been performed for the Turkish population. However, this study gives the first population data for 23 Y-STR loci for the Turkish population in the Eurasia region, which is one of its main advantages, as it adds to the knowledge of the genetic characteristics of the Turkish population. The YHRD database is missing data on 23 Y-STR loci for the Turkish population and this study can help fill that gap in the literature. Although sampled in Sarajevo, the population from the present study could be considered representative of the Turkish population in general, since our data are similar to those obtained in 2003 (21). It is worth noting that the Turkish population sampled for this research comes from different parts of Turkey and can represent the Turkish population on a small scale.
Furthermore, one of our intentions was also to "genetically record" a preliminary situation within this Turkish population, as well as the relationships with the local and neighboring populations, but also with the homeland Turkish population. These data can be used as a starting point for future population studies of Turkish population in the region, which currently shows a tendency to expand.
Also, in the examined population, a high degree of haplotype diversity was observed as all haplotypes were unique (no profile appeared more than once), which is not usually observed in the literature and confirms that the tested individuals are indeed paternally unrelated. Such results are probably a consequence of using the PowerPlex Y 23 kit, which types 23 Y-STR loci, thus enabling increased haplotype diversity to be observed. The large number of identified alleles also confirms high haplotype diversity due to the use of 23 Y-STR loci, while previous studies of the Turkish population showed a much smaller number of alleles due to the use of fewer Y-STR loci (1,21-24).
Six loci that were included and analyzed for the first time in the PowerPlex Y 23 kit (DYS576, DYS481, DYS549, DYS533, DYS570, and DYS643) were found to be highly informative and to contribute to the uniqueness of the haplotypes in the present population. DYS385a/b loci showed a high degree of polymorphism, which was expected as these loci were analyzed as a duplicated locus in the generated haplotypes. It was also found that DYS458 was one of the most informative markers with 12 different alleles. This marker was shown to be highly polymorphic in a study of the Turkish population from Anatolia as well (22).
The least informative locus was DYS391, with GD value of 0.470 and with only 3 alleles, which is in concordance with the previously published data for Turkish population (1), as well as with older studies (21,23,24) that used a smaller number of Y-STR loci. One of these studies (21) typed 6 Y-STR loci but the observed results were similar. One of the least informative loci in our study was DYS437 with 5 alleles, which confirmed the results of a previous study of Anatolian population (22), in which this locus was the least informative with 3 alleles.
Two of the 12 alleles at locus DYS458 were microvariant alleles (17.2 and 18.2). The appearance of variant alleles was expected as it is a specific feature of the locus DYS458 (6). These alleles are most frequently found in Northern and Eastern Africa and the Caucasus but are less common in Europe. Having such variant alleles detected in a population increases the information content of the haplotypes, which is why we consider it an especially important finding.
Duplication was found in two individuals at the DYS19 locus, which was also expected according to the theoretical and experimental data from the literature (2). Both individuals had the same genotype: 15, 16. An identical situation was observed in a study of the Turkish population from Central Anatolia (24). In that study, one individual had a duplication at the DYS19 locus with the combination 15, 16.
To conclude, the 23 Y-STR loci showed high haplotype diversity, and kits using this number of Y-STR loci represent substantial progress in the field of Y chromosome-related testing of individuals for the purposes of forensic investigation, paternity testing, population studies, and genealogical research. Also, the 23 Y-STR markers selected for the PowerPlex Y 23 kit are very useful in the investigation of genetic relationships between examined populations.
Funding None.
Ethical approval Received from the International Burch University, Sarajevo, Bosnia and Herzegovina.
Declaration of authorship SD designed the study, performed sample collection, DNA extraction, amplification, statistical analysis, and wrote the manuscript. DP performed the interpretation of the results and the final review. DM performed genotyping, raw data evaluation, interpretation of the results and was included in all stages of the project and preparation of the manuscript.
Competing interests All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.