Y-short tandem repeat haplotype and paternal lineage of the Ezhava population of Kerala, south India

Aim To analyze the haplotype of the Ezhava population of Kerala, south India, using 8 short tandem repeat (STR) loci on the Y chromosome and to trace the paternal genetic lineage of the population. Methods Whole blood samples (n = 104) were collected from unrelated healthy men of the Ezhava population over a period of one year from October 2009. Genomic DNA was extracted by the salting-out method. All samples were genotyped for the 8 Y-STR loci with the AmpFlSTR Yfiler PCR Amplification Kit. Haplotype and allele frequencies were determined by direct counting and analyzed using Arlequin 3.1 software, and molecular variance was calculated with the online analysis tool of the Y-chromosome haplotype reference database, www.yhrd.org. Results Among the 104 examined haplotypes, we found 98 unique ones. The average gene diversity was 0.669, with the highest diversity of 0.9462 observed for the biallelic Y-STR marker DYS385. The allele frequency among DYS loci varied between 0.0096 and 0.75. Out of the 104 haplotypes, 10 were identical to haplotypes of the Jat Sikh population of Punjab, which is the greatest number among the Indian populations, and 4 to haplotypes of the Turkish population, which is the greatest number among the European populations. According to the allele frequency of Y-STR, the Ezhavas were genetically more similar to the Europeans (60%) than to the East Asians (40%). Conclusion The vast majority of haplotypes were observed only once, reflecting the enormous genetic heterogeneity of the Ezhavas. Based on the genotype, the Ezhavas showed more resemblance to the Jat Sikh population of Punjab and the Turkish population than to the East Asians, hence indicating a paternal lineage of European origin.

Due to the geographical position of the Indian Peninsula between Africa, the Pacific, and west and east Eurasia, different populations have moved through its territory. This is why the ethnic Indian population shows enormous cultural, linguistic, and genetic diversity. Indian tribal and caste populations derive their heritage largely from the Pleistocene southern and western Asians, having received limited gene flow from external regions since the Holocene (1). Also, Indian castes have been found to be more closely related to the Central Asians than to the Indian tribal groups (2).
The long seacoast of Kerala, in the southernmost part of India, has provided a gateway to India for many Asian, European, and Sri Lankan missionaries and traders. Non-tribal communities of Kerala, as shown by a human leukocyte antigen (HLA) analysis, were influenced by Dravidian, Indo-European, and East Asian gene pools (3). The Ezhava population of Kerala, according to the allele frequency distribution, had features of European, Central Asian, and East Asian gene pools. Mitochondrial DNA studies also validated the presence of two distinct lineage groups in India, one eastern and one western Eurasian-specific, suggesting that there were at least two separate migration events to India (4).
Due to the unique biology of the Y chromosome, its genetic markers have been used in many forensic and evolutionary studies to determine patrilineal relationships within and between population groups. It has been suggested that, because region-specific allele frequencies are distributed differently, Y-short tandem repeats (STR) can be used to compare closely related populations (5,6). Previous genetic studies on the Ezhavas of south India failed to achieve a consensus on their paternal origin. In view of these diverse opinions based on HLA polymorphism and mitochondrial DNA analysis, this study aims to collect conclusive genetic data for a better understanding of the paternal origin of the Ezhavas. We present the haplotype analysis of the 8 Y-STR loci included in the European minimal haplotype set in 104 men from the Ezhava population to explore their genetic relationships with the European and East Asian populations. This is the first report on the Y-STR profile in a Kerala population.

Study samples
Whole blood samples (n = 104) were collected from unrelated healthy male individuals of the Ezhava population of Kerala over a period of one year from October 2009. The familial histories of the participants were recorded to exclude related individuals before sample collection. Blood samples were collected using standard procedures in ethylenediaminetetraacetic acid (EDTA)-coated tubes. The individuals gave their written informed consent, and ethical approval was received from the Ethics Committee of the institution.

DNA analysis
DNA was extracted from EDTA blood samples with the salting-out method (7). All samples were genotyped for the 8 Y-STR loci (DYS19, DYS385, DYS389I, DYS389II, DYS390, DYS391, DYS392, and DYS393) with the AmpFlSTR Yfiler PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). The Y-STR data of the Turkish, German, North Indian Jat Sikh, and other Indian populations were obtained from previous studies (8-10) and the Y-STR haplotype reference database (www.yhrd.org).

Statistical analysis
Allelic and haplotype frequencies were estimated by direct counting. Gene diversities were calculated using Arlequin 3.1 software (11), according to the formula $GD = \frac{N}{N-1}\left(1 - \sum_i P_i^2\right)$, where $N$ is the sample size and $P_i$ is the frequency of the i-th allele or haplotype. Furthermore, we used YHRD (www.yhrd.org) to determine the similarity of the Y-STR markers to other publicly available population data. The Ezhava population was compared with other Indian populations and with selected world populations in order to investigate the pattern of paternal contributions. A total of 1890 samples were analyzed: German population sample with 685 haplotypes, Turkish population sample with 160 haplotypes, Ezhava population (our sample) with 104 haplotypes, Haryana Jat population sample with 91 haplotypes, Punjab Jat Sikh population with 108 haplotypes, Andhra Pradesh, India (Brahmin) with 109 haplotypes, Himachal Pradesh, India (Saraswat Brahmin) with 61 haplotypes, Jammu, India (Saraswat Brahmin) with 61 haplotypes, Jharkhand, India (Munda) with 68 haplotypes, Jharkhand, India (Sakaldwipi Brahmin) with 65 haplotypes, Madhya Pradesh, India (Kanyakubja Brahmin) with 78 haplotypes, Punjab, India (Balmiki) with 62 haplotypes, Tamil Nadu, India (Iyengar) with 67 haplotypes, and Tamil Nadu, India (Kuruman) with 67 haplotypes. Analysis of molecular variance (AMOVA) was used to establish the total variance among and within groups. AMOVA was performed using fixation index analysis with 10 000 permutations. From the AMOVA calculation, we obtained pairwise differ- (Table 1).
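The direct-counting and gene diversity steps described above can be illustrated with a minimal Python sketch. It is not the Arlequin implementation; the function name `gene_diversity` and the toy allele list are hypothetical and are used only to show how the formula is applied to counted observations.

```python
from collections import Counter


def gene_diversity(observations):
    """Unbiased gene diversity: N / (N - 1) * (1 - sum of squared frequencies).

    `observations` holds one allele (or one whole haplotype) per sampled man.
    Frequencies are obtained by direct counting.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(observations)          # direct counting
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy data (invented repeat numbers, not study data).
alleles = [14, 14, 15, 15, 15, 16, 14, 15]
print(round(gene_diversity(alleles), 4))    # prints 0.6786
```

The same function works for whole haplotypes if each observation is passed as a tuple of alleles, because only the counts of identical observations enter the formula.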
Allele frequencies at DYS loci varied between 0.0096 and 0.75 (Table 2). Gene diversity ranged from 0.3875 to 0.9462 (average, 0.6671), and the most polymorphic single-locus marker was DYS389II (Table 2).
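As a worked check of the reported range, and assuming each frequency is the allele count divided by the 104 sampled men at a single-copy locus, the lower bound corresponds to an allele observed once. The count of 78 for the upper bound is inferred here from the reported frequency and is not stated in the text:

$$f_{\min} = \frac{1}{104} \approx 0.0096, \qquad f_{\max} = \frac{78}{104} = 0.75$$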
The haplotypes of the Ezhavas were compared with the haplotypes of other Indian populations ( [...] Haryana Jat, and 0.1785 for Germans. These values show that the Ezhava population is distant from other Indian populations and close to the Punjab Jat and Turkish populations, as well as to the Jammu population (Saraswat Brahmin), with a pairwise distance value of 0.0561, and the Jharkhand population (Sakaldwipi Brahmin), with a value of 0.0617 (Figure 1).
Similarities in the Y-STR data were found among the Ezhava, north Indian, and Turkish populations (Table 4). Out of the 104 haplotypes, 10 were identical to haplotypes of the Jat Sikhs, 7 to those of the German population, and 4 to those of the Turkish population. One particular haplotype was found to be common to all 4 populations, with a very high incidence (8/108) in the Jat Sikh population of north India.
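The haplotype matching reported above can be sketched in Python as an exact comparison across the 8 loci. This is only an illustration under stated assumptions: the function `shared_haplotypes` is hypothetical, the allele values are invented, and the sketch counts distinct sample haplotypes that also occur in a reference population, which may differ from the matching rule used by the YHRD tool.

```python
from collections import Counter

# The eight loci named in the Methods; each haplotype is a tuple in this order.
LOCI = ("DYS19", "DYS385", "DYS389I", "DYS389II",
        "DYS390", "DYS391", "DYS392", "DYS393")


def shared_haplotypes(sample, reference):
    """Map each distinct sample haplotype found in the reference
    to the number of times it occurs in the reference."""
    ref_counts = Counter(reference)
    return {hap: ref_counts[hap] for hap in set(sample) if hap in ref_counts}


# Hypothetical haplotypes (invented allele values, not study data).
# DYS385 is written as a pair because it yields two alleles.
sample = [
    ("14", "11-14", "13", "29", "24", "10", "11", "13"),
    ("15", "13-17", "12", "28", "22", "10", "11", "12"),
    ("14", "11-14", "13", "29", "24", "10", "11", "13"),
]
reference = [
    ("14", "11-14", "13", "29", "24", "10", "11", "13"),
    ("14", "11-14", "13", "29", "24", "10", "11", "13"),
    ("16", "12-15", "13", "30", "25", "11", "13", "13"),
]

matches = shared_haplotypes(sample, reference)
print(len(matches))  # prints 1: one distinct haplotype is shared
```

The returned reference count gives an incidence comparable to the "8/108" style of reporting: it is the numerator, and the reference sample size is the denominator.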
Allelic distribution of the Y-STR markers of the Ezhavas was compared with the European and East Asian populations (Figure 2). The histogram showed that 60% of the markers had similar allele frequencies to the European population, while only 40% showed similarity to the East Asian populations. The Y-STR alleles showed marked similarities to European populations when compared using the European minimal haplotype.
Duplications, though rare in the Ezhava population, have been observed in monoallelic markers such as DYS19, DYS389I, DYS389II, and DYS393 in other populations (8,20,24

Declaration of authorship PSN contributed to designing the work, implementing, and manuscript preparation. AG contributed to sample collection and lab work. CJ contributed to sample collection and lab work.

Competing interests All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.