Defining orthologous groups among multicopy genes prior to inferring phylogeny, with special emphasis on the Triticeae (Poaceae)

Hereditas. 2001;135(2-3):123-38. doi: 10.1111/j.1601-5223.2001.00123.x.

Abstract

Sequence information from multicopy genes has been widely used for phylogenetic inference. Among the sequences analyzed, nuclear 5S rRNA genes, the two internal transcribed spacer regions (ITS1 and ITS2) of the 18S-26S rDNA genes, and the intergenic spacer (IGS) regions of the same 18S-26S rDNA genes have all been used at the specific, generic, familial and tribal levels. Many investigations have used direct sequencing of PCR products to generate sequence data. The merits of an alternative approach are examined and discussed: cloning prior to sequencing, followed by careful alignment of numerous cloned sequences to discern groups of putative orthologous sequences that may then be useful for inferring relationships among species and genera. This process discerns patterns through several cycles of careful alignment followed by manual editing conducted by eye, an exacting operation, especially when sequences are unequal in length due to the presence of additions/deletions. Based upon examples taken from our work on the sequencing of individual 5S rDNA clones from several wheat and barley species (Triticum and Hordeum, respectively), and on the re-analysis of data from several other studies using the nuclear genes mentioned above, we are able to identify groups of putative orthologous sequences that we have named "unit classes". Inferring phylogenetic relationships between species requires comparing putative orthologous sequences, i.e. sequences of the same unit class, isolated from those species. Paralogous sequences from different unit classes can be compared only to infer evolutionary relationships among repeat types, i.e. among unit classes. In several cases, the analysis of the sequence diversity obtained from different clones permitted the assignment of unit classes to specific haplomes.
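The grouping step described above was carried out by the authors through repeated alignment and manual editing by eye. As a rough illustration only, and not the authors' procedure, the idea of sorting cloned sequences into candidate groups before comparing them across species can be sketched as single-linkage grouping on pairwise identity. The clone names, the toy sequences, the 0.85 threshold, and the functions `identity` and `group_clones` are all hypothetical; the input is assumed to be already aligned to a common length.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical columns between two pre-aligned sequences.

    Columns where both sequences carry a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def group_clones(aligned, threshold):
    """Single-linkage grouping of clones (hypothetical stand-in for unit classes).

    Two clones end up in the same group if they are connected by a chain of
    pairwise identities >= threshold.
    """
    parent = {name: name for name in aligned}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(aligned, 2):
        if identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Toy, made-up aligned clones from two hypothetical species (sp1, sp2).
clones = {
    "sp1_c1": "ACGTACGT--AC",
    "sp1_c2": "ACGTACGT--AT",
    "sp2_c1": "ACGTACGA--AC",
    "sp1_c3": "TTGTACCTGGAC",
    "sp2_c2": "TTGTACCTGGAT",
}

candidate_groups = group_clones(clones, threshold=0.85)
for i, group in enumerate(candidate_groups, start=1):
    print(f"candidate group {i}: {sorted(group)}")
```

In this toy case the clones fall into two candidate groups, each containing sequences from both species. Under the abstract's reasoning, between-species comparisons would be made within a group, while comparisons between groups speak only to relationships among repeat types. A fixed identity threshold is a simplification; it does not replace the careful alignment and manual inspection the abstract emphasizes.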

MeSH terms

  • Base Sequence
  • Cloning, Molecular
  • Databases as Topic
  • Genes, Duplicate
  • Molecular Sequence Data
  • Phylogeny
  • RNA, Ribosomal / genetics
  • RNA, Ribosomal, 5S / metabolism
  • Sequence Homology, Nucleic Acid
  • Triticum / classification*
  • Triticum / genetics*

Substances

  • RNA, Ribosomal
  • RNA, Ribosomal, 5S