Reliability of long vs short COI markers in identification of forensically important flies

Aim To compare the reliability of short and long cytochrome oxidase I (COI) gene fragments in the identification of forensically important Diptera from Egypt and China.
Methods We analyzed 50 specimens belonging to 18 species. The two investigated markers were amplified by polymerase chain reaction (PCR) followed by direct sequencing. Nucleotide sequence divergences were calculated using the Kimura two-parameter (K2P) distance model, and neighbor-joining (NJ) phylogenetic trees were constructed.
Results Although both tested fragments showed an overlap between intraspecific and interspecific variation, the long marker showed more complete monophyletic separation with high bootstrap support. Moreover, the NJ tree based on the long fragment clustered species more in accordance with their taxonomic classification than the tree based on the short fragment.
Conclusion For the identification of Diptera, the long COI marker is recommended because of its greater reliability and safety.

Necrophagous insects can serve as a valuable source of information for the estimation of the minimum post-mortem interval (PMI) in legal medicine. The most suitable for forensic purposes are species from the order Diptera (eg, Calliphoridae, Muscidae, and Sarcophagidae) (1-4). In PMI estimation, an important initial step is the correct identification of these insects, which may be difficult using the traditional morphology-based approach (5,6) because several forensically important fly species can hardly be distinguished morphologically (7-9). The limitations of the morphological method can be overcome by gene sequence analysis, a fast and accurate method of species identification. Molecular analysis requires only small tissue samples and is relatively insensitive to preservation conditions (1,10). Different mitochondrial (mt) and nuclear (nu) DNA markers have been investigated as forensic tools. However, mtDNA is preferred because it can be easily extracted even from small or degraded samples (10). In addition, because of its strictly maternal inheritance and lack of genetic recombination, the mtDNA haplotype is a good candidate for evolutionary and population genetics studies.
Mitochondrial cytochrome c oxidase subunit I (COI) sequences are a rapid and powerful tool for accurate identification of species across various taxa (7,11-14). Although COI has been extensively studied by forensic entomologists, resulting in a vast amount of DNA data, there is little agreement as to which portion of the gene needs to be sequenced. The 5' end of COI is the site of the proposed universal animal DNA "barcode" (11) and has been successfully used in the identification of many blowfly species (12), but this approach cannot identify some closely related species (12,15). Therefore, to optimize discrimination power between closely related species, some authors have suggested a multi-gene approach (16,17). Surprisingly, a recent study using this approach revealed that the phylogenetic tree based on a COI fragment was similar to that based on 3 different gene fragments (16).
Fragments of the COI sequence that show low sequence divergence within species but high divergence among species can be employed as taxon "barcodes," and unknown samples can be accurately grouped to species with reference sequences of the "barcode library" (14,18,19). Therefore, it is paramount to evaluate the discrimination power of these COI fragments not only between closely related species but also between species belonging to more than one family, because in a database an unknown sample will be compared to all reference samples. In the absence of an appropriate reference sample, an unknown sample will simply group with the most closely matched reference sample (20). Thus, it is important to confirm not only that the investigated marker assigns samples to the correct species but also that the resulting grouping is in accordance with the traditional morphological classification. Therefore, we evaluated the discrimination power of the short (272-bp) COI fragment in the identification of the most forensically relevant flies (Calliphoridae, Sarcophagidae, and Muscidae) originating from Egypt and China in comparison with the long (1173-bp) COI fragment, and aimed to gather genetic data on common forensically important Diptera.
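The library-matching behavior described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes pre-aligned sequences of equal length, uses a simple uncorrected p-distance rather than K2P, and the names `p_distance`, `closest_reference`, and the placeholder library entries are hypothetical. It shows why a query always receives some "best" match, even when the true species is missing from the library.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def closest_reference(query, library):
    """Return (distance, name) of the reference sequence nearest to the query.

    `library` maps a reference name to an aligned sequence. The nearest entry
    is returned even if it is distant, so the distance itself must be
    inspected before accepting the assignment.
    """
    return min((p_distance(query, seq), name) for name, seq in library.items())


# Placeholder sequences for illustration only (not data from this study).
library = {"species_A": "ACGTACGT", "species_B": "ACGTTTTT"}
distance, name = closest_reference("ACGTACGA", library)
print(name, distance)
```

Reporting the distance together with the name makes it possible to reject a match that is nearest but still too divergent to be the same species.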

DNA extraction
mtDNA was extracted from all samples using the Mini Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. To avoid possible contamination of fly DNA with DNA from ingested proteins and eggs of gut parasites, the thoracic muscle of each insect was used as the source of DNA, whereas the head and abdomen were retained for further analysis.

Sequence analysis and phylogenetic tree construction
Analyses of DNA sequence variation, nucleotide composition, and genetic distances were performed using Molecular Evolutionary Genetics Analysis (MEGA) v. 5.10 (28). Phylogenetic trees based on the 2 investigated COI sequences were constructed by the neighbor-joining (NJ) method using the Kimura two-parameter (K2P) model implemented in MEGA and tested with 1000 bootstrap replicates. In the 272-bp sequences, 73 characters were variable and 71 were parsimony-informative. The nucleotide composition showed much higher frequencies of adenine and thymine (31.7% and 37% of the total nucleotide composition, respectively) compared with 14.2% for cytosine and 17.1% for guanine. NJ analysis was conducted to determine the relationships between the analyzed species (Figure 1 and
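As an illustration of the distance model and composition statistics named in this section, the following Python sketch computes the K2P distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, together with the percentage base composition. It is a minimal sketch and not the MEGA implementation: it assumes pre-aligned sequences, skips any site containing a gap or ambiguous base, and the function names and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term_1 = 1 - 2 * p - q
    term_2 = 1 - 2 * q
    if term_1 <= 0 or term_2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term_1 * math.sqrt(term_2))


def base_composition(seq):
    """Percentage of A, C, G, and T among the unambiguous bases of a sequence."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    return {base: 100 * n / total for base, n in counts.items()}


# Hypothetical aligned fragments for illustration only (not data from this study).
seq_1 = "ATTATAATTGGAGGATTTGG"
seq_2 = "ATTATAATCGGAGGATTTGG"
print(round(100 * k2p_distance(seq_1, seq_2), 2))  # divergence in percent
print(base_composition(seq_1))
```

Multiplying the distance by 100 gives the percentage divergence in which intraspecific and interspecific variation are usually reported.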

Discussion
This study found that although both tested fragments showed an overlap between intraspecific and interspecific variation, the long marker showed more complete monophyletic separation with high bootstrap support. To our knowledge, this is the first study to provide molecular data on forensically important species from Egypt and China using either the short 272-bp or the long 1173-bp fragment of the mt COI gene. The mt COI gene has been shown to be a major candidate gene for the identification of forensically important insects (7,14,27,29). Therefore, before it is used in real forensic entomology cases, it is worth evaluating the applicability of the 272-bp and 1173-bp COI markers using species from the specific geographic areas (30).
As expected, this region of mtDNA had a strong adenine-thymine bias, which is characteristic of insect mtDNA (6,12). No insertions or deletions were identified within the aligned sequences, as was found in studies conducted on other mtDNA fragments (6,11,31,32). For both tested COI fragments, samples of C. megacephala and M. autumnalis from both China and Egypt were sequenced and showed minimal variation between populations. However, the largest intraspecific variation was observed between specimens of the same species collected from different locations within one country. These results are in agreement with the study by Harvey et al (20), who tested 1167-bp COI for identification of Calliphoridae of Australian and South African origin. The low intraspecific variation between the two countries indicates the value of this mtDNA region for interspecific distinction (33,34).
One study suggested that intraspecific variation should be ≤1% and between-species separation ≥3% (35), whereas other studies suggested establishing group-specific thresholds (8,11). In the present study, the results for both the short and the long COI fragments support the idea of establishing group-specific thresholds, because the 3 investigated species that belong to Muscidae exhibited the lowest interspecific variation, leading to an overlap between intraspecific and interspecific nucleotide divergences. Interestingly, although low sequence divergence can result in similar haplotypes, which may lead to misidentification and a wrong PMI estimate (8)

1  C. megacephala  4  0-1.5  0-1.5  -  6  8  6  11  10  9  9  9  10  10  11  11  12  12  13  13  13
2  C. albiceps  5  0  0-0.7  7  -  4  7  11  10  9  10  11  10  12  12  12  12  11  13  12  13
3  C. rufifacies  2  0  0-1.

Based on the 1173-bp COI gene tree, the Aldrichina clade deviated from traditional taxonomy, because this species (Calliphorinae) was identified as a sister species to Chrysomya rather than to Lucilia (16). This pattern of evolution was also observed previously based on 28S rRNA alone (36) and based on COI, CYTB, and ITS2 in a multi-gene approach (16). This relation differed from that observed with the 272-bp COI fragment, in which A. grahami was embedded within the Lucilia tribe. The data obtained by 1173-bp COI phylogenetic analysis were more in accordance with the traditional morphological classification than the data obtained by 272-bp COI fragment analysis.
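The threshold criteria discussed above (intraspecific variation ≤1%, between-species separation ≥3%) and the overlap between intraspecific and interspecific divergences can be checked with a short Python sketch. It is an illustration under stated assumptions, not the analysis performed in this study: the input is a symmetric matrix of pairwise divergences in percent, the function names are hypothetical, and the example values are invented for demonstration.

```python
from itertools import combinations


def split_divergences(species, matrix):
    """Separate pairwise divergences into within-species and between-species lists."""
    intra, inter = [], []
    for i, j in combinations(range(len(species)), 2):
        target = intra if species[i] == species[j] else inter
        target.append(matrix[i][j])
    return intra, inter


def check_thresholds(species, matrix, intra_max=1.0, inter_min=3.0):
    """Compare observed divergences (percent) with fixed identification thresholds."""
    intra, inter = split_divergences(species, matrix)
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter) if inter else float("inf")
    return {
        "max_intra": max_intra,
        "min_inter": min_inter,
        "meets_fixed_thresholds": max_intra <= intra_max and min_inter >= inter_min,
        # Overlap: some within-species pair is at least as divergent
        # as the closest between-species pair.
        "overlap": max_intra >= min_inter,
    }


# Invented divergences (percent) for three specimens, two of one species.
species = ["species_A", "species_A", "species_B"]
matrix = [
    [0.0, 1.5, 1.2],
    [1.5, 0.0, 2.0],
    [1.2, 2.0, 0.0],
]
print(check_thresholds(species, matrix))
```

Running the check separately for each family, with `intra_max` and `inter_min` set per group, is one way to express the group-specific thresholds that the Muscidae results point toward.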
In this preliminary genetic identification of fly species from Egypt and China, we found that the long COI fragment outperformed the short one in species identification. Since the sample size was small, we recommend an evaluation of more samples using the same and other loci to confirm our findings. In addition, it is important to identify additional forensically important fly species and expand such analyses to all relevant Egyptian and Chinese species.
Funding None.
Ethical approval Not required.
Declaration of authorship SMA designed the study, performed sample analysis, and wrote the manuscript.