Mol Genet Genomics. 2016 Apr;291(2):763-73. doi: 10.1007/s00438-015-1144-1. Epub 2015 Nov 20.

Genome-wide identification and phylogenetic analysis of plant RNA binding proteins comprising both RNA recognition motifs and contiguous glycine residues.

Author information

1 Molecular Cell Physiology, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany.
2 Bioinformatics/Medical Informatics Department, Universitätsstr. 25, 33615, Bielefeld, Germany.
3 Department of Cellular and Developmental Biology of Plants, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany.
4 Molecular Cell Physiology, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany. dorothee.staiger@uni-bielefeld.de.

Abstract

This study focused on the identification and phylogenetic analysis of glycine-rich RNA binding proteins that contain an RNA recognition motif (RRM)-type RNA binding domain in addition to a region with contiguous glycine residues in representative plant species. In higher plants, glycine-rich proteins with an RRM have attracted considerable interest because they respond to environmental cues and play roles in cold tolerance, pathogen defense, flowering time control, and circadian timekeeping. To identify such RRM-containing proteins in plant genomes, we developed an RRM profile based on the known glycine-rich RRM-containing proteins in the reference plant Arabidopsis thaliana. Applying this remodeled RRM profile, which omits sequences from non-plant species, reduced the noise when searching plant genomes for RRM proteins compared with a search using the known RRM_1 profile. Furthermore, we developed an island scoring function that uses a sliding window approach to identify regions with contiguous glycine residues. This approach tags regions of a protein sequence that have a high content of the same amino acid, with repetitive structures scoring higher. Defining repetitive structures within a fixed sequence length offers a new way to characterize patterns that cannot easily be described by regular expressions. By combining the profile-based domain search for well-conserved regions (the RRM) with a scoring technique for regions with repetitive residues, we identified groups of proteins related to the A. thaliana glycine-rich RNA binding proteins in eight plant species.
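
The sliding-window island scoring described above can be illustrated with a small sketch. The abstract does not give the actual scoring formula, so everything here is an assumption made for illustration: the function names `window_score` and `find_islands`, the window length, the threshold, and the choice to score a window as the sum of squared run lengths of the target residue (so that contiguous runs outscore the same number of scattered residues).

```python
from typing import List, Tuple


def window_score(window: str, residue: str = "G") -> int:
    """Score a window as the sum of squared run lengths of `residue`.

    Assumed scoring rule (not taken from the paper): squaring rewards
    contiguous stretches, e.g. "GGGG" scores 16 while "GAGAGAG" scores 4.
    """
    score = 0
    run = 0
    for aa in window:
        if aa == residue:
            run += 1
        else:
            score += run * run  # close the current run
            run = 0
    return score + run * run  # account for a run that ends the window


def find_islands(
    seq: str,
    residue: str = "G",
    window: int = 10,      # hypothetical window length
    threshold: int = 16,   # hypothetical cutoff
) -> List[Tuple[int, int]]:
    """Return merged half-open (start, end) intervals of tagged windows."""
    islands: List[Tuple[int, int]] = []
    last_start = max(len(seq) - window, 0)
    for start in range(last_start + 1):
        end = min(start + window, len(seq))
        if window_score(seq[start:end], residue) >= threshold:
            if islands and start <= islands[-1][1]:
                # Overlaps or touches the previous island: extend it.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands


if __name__ == "__main__":
    toy = "MKTAYGGGGGGRGGWSAL"  # made-up sequence, not real data
    for start, end in find_islands(toy, window=6, threshold=16):
        print(start, end, toy[start:end])
```

In a pipeline like the one the abstract outlines, such a filter would be applied to proteins that already matched the RRM profile, keeping only those that also contain at least one tagged glycine island.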

KEYWORDS:

Glycine-rich domains; HMMER biosequence analysis; MUSCLE alignment; Orthology prediction; Plant; RNA binding protein; RNA recognition motif

PMID: 26589419
DOI: 10.1007/s00438-015-1144-1
[Indexed for MEDLINE]
