Search results

Items: 1 to 20 of 33

1.
J Integr Bioinform. 2017 Jun 5;14(1). pii: /j/jib.2017.14.issue-1/jib-2016-0013/jib-2016-0013.xml. doi: 10.1515/jib-2016-0013.

Towards a Consistent, Quantitative Evaluation of MicroRNA Evolution.

Abstract

The miRBase currently reports more than 25,000 microRNAs in several hundred genomes that belong to more than 1000 families of homologous sequences. Quantitative investigations of miRNA gene evolution require the construction of data sets that are consistent in their coverage and include those genomes that are of interest in a given study. Given the size and structure of the data, this can be achieved only with the help of a fully automatic pipeline that improves the available seed alignments, extends the set of available sequences by homology search, and reliably identifies true positive homology search results. Here we describe the current progress towards such a system, emphasizing the task of improving and completing the initial seed alignment.

KEYWORDS:

Alignments; Homology Search; ascertainment biases; miRBase

PMID:
28637930
DOI:
10.1515/jib-2016-0013
[Indexed for MEDLINE]
Free full text
2.
BMC Bioinformatics. 2016 Dec 15;17(Suppl 18):464. doi: 10.1186/s12859-016-1345-6.

SnoReport 2.0: new features and a refined Support Vector Machine to improve snoRNA identification.

Author information

1
Department of Computer Science, University of Brasilia, Brasília, BR-70910-900, Brazil. joaovicers@gmail.com.
2
Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg, 79110, Germany.
3
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstraße 16-18, Leipzig, D-04107, Germany.
4
German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany.
5
Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, Vienna, A-1090, Austria.
6
Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg, DK-1870, Denmark.
7
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany.
8
RNomics Group, Fraunhofer Institut for Cell Therapy and Immunology, Perlickstraße 1, Leipzig, D-04103, Germany.
9
Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA.
10
Young Investigators Group Bioinformatics & Transcriptomics, Helmholtz Centre for Environmental Research - UFZ, Permoserstraße 15, Leipzig, D-04318, Germany.
11
Department of Computer Science, University of Brasilia, Brasília, BR-70910-900, Brazil.

Abstract

BACKGROUND:

snoReport uses RNA secondary structure prediction combined with machine learning as the basis to identify the two main classes of small nucleolar RNAs, the box H/ACA snoRNAs and the box C/D snoRNAs. Here, we present snoReport 2.0, which substantially improves and extends the original method by: extracting new features for both box C/D and box H/ACA snoRNAs; developing a more sophisticated technique in the SVM training phase with recent data from vertebrate organisms and a careful choice of the SVM parameters C and γ; and using updated versions of the tools and databases used for the construction of the original version of snoReport. To validate the new version and to demonstrate its improved performance, we tested snoReport 2.0 in different organisms.

RESULTS:

Results of the training and test phases for box H/ACA and box C/D snoRNAs, in both versions of snoReport, are discussed. Validation on real data was performed to evaluate the predictions of snoReport 2.0. Our program was applied to a set of previously annotated sequences, some of them experimentally confirmed, from humans, nematodes, drosophilids, platypus, chickens and leishmania. We significantly improved the predictions for vertebrates, since the training phase used information from these organisms, but H/ACA box snoRNA identification was also improved for the other organisms.

CONCLUSION:

We presented snoReport 2.0, a method for predicting H/ACA box and C/D box snoRNAs that efficiently finds true positives and avoids false positives in vertebrate organisms. The H/ACA box snoRNA classifier showed an F-score of 93 % (an improvement of 10 % over the previous version), while the C/D box snoRNA classifier showed an F-score of 94 % (an improvement of 14 %). In addition, both classifiers exhibited performance measures above 90 %. These results show that snoReport 2.0 avoids false positives and false negatives, allowing snoRNAs to be predicted with high quality. In the validation phase, snoReport 2.0 predicted 67.43 % of vertebrate organisms for both classes. For nematodes and drosophilids, 69 % and 76.67 % of H/ACA box snoRNAs were predicted, respectively, showing that snoReport 2.0 identifies snoRNAs well in vertebrates and also H/ACA box snoRNAs in invertebrate organisms.
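
The abstract above reports F-scores without defining them. As a minimal sketch, assuming the usual definition (the harmonic mean of precision and recall, generalized by a weight beta), the score can be computed from raw classification counts as follows. The function name `f_score` and the example counts are hypothetical and are not taken from the paper.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from raw counts; beta=1.0 gives the balanced F1 score.

    Assumption: this is the standard definition, which the abstract
    does not spell out.
    """
    if true_pos == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only.
print(f_score(true_pos=93, false_pos=7, false_neg=7))
```

Because the score combines precision and recall, a classifier can only reach a high value by keeping both false positives and false negatives low, which is the property the conclusion above emphasizes.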

KEYWORDS:

C/D box snoRNA; H/ACA box snoRNA; Machine learning; Non-coding RNA; Support Vector Machine (SVM); snoRNA

PMID:
28105919
PMCID:
PMC5249026
DOI:
10.1186/s12859-016-1345-6
[Indexed for MEDLINE]
Free PMC Article
3.
Noncoding RNA. 2017 Jan 5;3(1). pii: E3. doi: 10.3390/ncrna3010003.

Evolution of Fungal U3 snoRNAs: Structural Variation and Introns.

Canzler S1, Stadler PF2,3,4,5,6,7,8, Hertel J9.

Author information

1
Bioinformatics Group, Department Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. sebastian@bioinf.uni-leipzig.de.
2
Bioinformatics Group, Department Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.de.
3
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, University Leipzig, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.de.
4
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany. studla@bioinf.uni-leipzig.de.
5
Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany. studla@bioinf.uni-leipzig.de.
6
Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria. studla@bioinf.uni-leipzig.de.
7
Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark. studla@bioinf.uni-leipzig.de.
8
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA. studla@bioinf.uni-leipzig.de.
9
Helmholtz Centre for Environmental Research-UFZ, Young Investigators Group Bioinformatics and Transcriptomics, Permoserstraße 15, D-04318 Leipzig, Germany. jana.hertel@ufz.de.

Abstract

The U3 small nucleolar RNA (snoRNA) is an essential player in the initial steps of ribosomal RNA biogenesis and is ubiquitously present in Eukarya. It is exceptional among the small nucleolar RNAs in its size, the presence of multiple conserved sequence boxes, a highly conserved secondary structure core, its biogenesis as an independent gene transcribed by polymerase III, and its involvement in pre-rRNA cleavage rather than chemical modification. Fungal U3 snoRNAs share many features with their sisters from other eukaryotic kingdoms but differ from them in particular in their 5' regions, which in fungi have a distinctive consensus structure and often harbour introns. Here we report on a comprehensive homology search and detailed analysis of the evolution of sequence and secondary structure features covering the entire kingdom Fungi.

KEYWORDS:

RNA secondary structure; RNA–RNA interactions; evolution; pre-rRNA processing; small nucleolar RNA; spliceosomal introns

4.
BMC Genomics. 2016 Nov 24;17(1):969.

Phylogenetic distribution of plant snoRNA families.

Author information

1
Bioinformatics Group, Dept. Computer Science, and Martin-Luther-Universität Halle-Wittenberg, Leipzig, D-04107, Germany.
2
Institut für Informatik, Halle (Saale), D-06120, Germany.
3
Young Investigators Group Bioinformatics & Transcriptomics, Helmholtz Centre for Environmental Research - UFZ, Permoserstrasse 15, Leipzig, D-04318, Germany.
4
German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany.
5
Bioinformatics Group, Dept. Computer Science, and Martin-Luther-Universität Halle-Wittenberg, Leipzig, D-04107, Germany. studla@bioinf.uni-leipzig.de.
6
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany. studla@bioinf.uni-leipzig.de.
7
Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, Leipzig, D-04103, Germany. studla@bioinf.uni-leipzig.de.
8
Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, Vienna, A-1090, Austria. studla@bioinf.uni-leipzig.de.
9
Center for RNA in Technology and Health, Univ. Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Copenhagen, Denmark. studla@bioinf.uni-leipzig.de.
10
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA. studla@bioinf.uni-leipzig.de.
11
German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany. studla@bioinf.uni-leipzig.de.

Abstract

BACKGROUND:

Small nucleolar RNAs (snoRNAs) are one of the most ancient families amongst non-protein-coding RNAs. They are ubiquitous in Archaea and Eukarya but absent in bacteria. Their main function is to target chemical modifications of ribosomal RNAs. They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and the type of chemical modification that they govern. Similarly to microRNAs, snoRNAs appear in distinct families of homologs that affect homologous targets. In animals, snoRNAs and their evolution have been studied in much detail. In plants, however, their evolution has attracted comparably little attention.

RESULTS:

In order to chart the phylogenetic distribution of individual snoRNA families in plants, we applied a sophisticated approach for identifying homologs of known plant snoRNAs across the plant kingdom. In response to the relatively fast evolution of snoRNAs, information on conserved sequence boxes, target sequences, and secondary structure is combined to identify additional snoRNAs. We identified 296 families of snoRNAs in 24 species and traced their evolution throughout the plant kingdom. Many of the plant snoRNA families comprise paralogs. We also found that targets are well-conserved for most snoRNA families.

CONCLUSIONS:

The sequence conservation of snoRNAs is sufficient to establish homologies between phyla. The degree of this conservation tapers off, however, between land plants and algae. Plant snoRNAs are frequently organized in highly conserved spatial clusters. As a resource for further investigations we provide carefully curated and annotated alignments for each snoRNA family under investigation.

KEYWORDS:

Evolution; Small RNAs; snoRNA targets; snoRNAs

PMID:
27881081
PMCID:
PMC5122169
DOI:
10.1186/s12864-016-3301-2
[Indexed for MEDLINE]
Free PMC Article
6.
Nucleic Acids Res. 2016 Jun 20;44(11):5068-82. doi: 10.1093/nar/gkw386. Epub 2016 May 12.

An updated human snoRNAome.

Author information

1
Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland.
2
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany.
3
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany; RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, D-04103 Leipzig, Germany; Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria; Santa Fe Institute, NM-87501 Santa Fe, USA.
4
Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland. mihaela.zavolan@unibas.ch.
5
Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland. agruber@tbi.univie.ac.at.

Abstract

Small nucleolar RNAs (snoRNAs) are a class of non-coding RNAs that guide the post-transcriptional processing of other non-coding RNAs (mostly ribosomal RNAs), but have also been implicated in processes ranging from microRNA-dependent gene silencing to alternative splicing. In order to construct an up-to-date catalog of human snoRNAs we have combined data from various databases, de novo prediction and extensive literature review. In total, we list more than 750 curated genomic loci that give rise to snoRNA and snoRNA-like genes. Utilizing small RNA-seq data from the ENCODE project, our study characterizes the plasticity of snoRNA expression, identifying both constitutively expressed snoRNAs and snoRNAs with cell-type-specific expression. In particular, the comparison of malignant to non-malignant tissues and cell types shows a dramatic perturbation of the snoRNA expression profile. Finally, we developed a high-throughput variant of the reverse-transcriptase-based method for identifying 2'-O-methyl modifications in RNAs, termed RimSeq. Using the data from this and other high-throughput protocols together with previously reported modification sites and state-of-the-art target prediction methods, we re-estimate the snoRNA target RNA interaction network. Our current results assign a reliable modification site to 83% of the canonical snoRNAs, leaving only 76 snoRNA sequences as orphans.

PMID:
27174936
PMCID:
PMC4914119
DOI:
10.1093/nar/gkw386
[Indexed for MEDLINE]
Free PMC Article
7.
Nat Genet. 2016 Apr;48(4):427-37. doi: 10.1038/ng.3526. Epub 2016 Mar 7.

The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

Author information

1
Institute of Neuroscience, University of Oregon, Eugene, Oregon, USA.
2
Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois, USA.
3
Department of Biology, University of Kentucky, Lexington, Kentucky, USA.
4
Department of Anthropology, Pennsylvania State University, University Park, Pennsylvania, USA.
5
Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece.
6
Institut National de la Recherche Agronomique (INRA), UR1037 Laboratoire de Physiologie et Génomique des Poissons (LPGP), Campus de Beaulieu, Rennes, France.
7
Department of Animal Biology, University of Illinois, Urbana-Champaign, Illinois, USA.
8
Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.
9
Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, USA.
10
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.
11
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK.
12
Department of Zoology, University of Oxford, Oxford, UK.
13
School of Biological Sciences, Bangor University, Bangor, UK.
14
Comparative Genomics Laboratory, Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Singapore.
15
Institut de Génomique Fonctionnelle de Lyon, Ecole Normale Supérieure de Lyon, Lyon, France.
16
Department of Biology, University of Konstanz, Konstanz, Germany.
17
Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, North Carolina, USA.
18
Center for Comparative Medicine and Translational Research, North Carolina State University, Raleigh, North Carolina, USA.
19
Departament de Genètica, Universitat de Barcelona, Barcelona, Spain.
20
Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Barcelona, Spain.
21
Department of Biology, University of Victoria, Victoria, British Columbia, Canada.
22
Center for Circadian Clocks, Soochow University, Suzhou, China.
23
School of Biology and Basic Medical Sciences, Medical College, Soochow University, Suzhou, China.
24
Bioinformatics Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany.
25
Department of Dental Hygiene, Nippon Dental University College at Niigata, Niigata, Japan.
26
Department of Pediatrics, University of South Florida Morsani College of Medicine, St. Petersburg, Florida, USA.
27
Department of Microbiology, Nippon Dental University School of Life Dentistry at Niigata, Niigata, Japan.
28
Department of Evolutionary Studies of Biosystems, SOKENDAI (Graduate University for Advanced Studies), Hayama, Japan.
29
Molecular Genetics Program, Benaroya Research Institute, Seattle, Washington, USA.
30
Department of Biological Sciences, Nicholls State University, Thibodaux, Louisiana, USA.
31
Instituto de Ciências Biológicas, Universidade Federal do Pará, Belem, Brazil.
32
International Max Planck Research School for Organismal Biology, University of Konstanz, Konstanz, Germany.
33
Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.

Abstract

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

PMID:
26950095
PMCID:
PMC4817229
DOI:
10.1038/ng.3526
[Indexed for MEDLINE]
Free PMC Article
8.
RNA Biol. 2016;13(2):119-27. doi: 10.1080/15476286.2015.1132139.

U6 snRNA intron insertion occurred multiple times during fungi evolution.

Canzler S1, Stadler PF1,2,3,4,5,6,7,8, Hertel J1,9.

Author information

1
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany.
2
Computational EvoDevo Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany.
3
LIFE - Leipzig Research Center for Civilization Diseases, Universität Leipzig, Germany.
4
Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany.
5
Fraunhofer Institut für Zelltherapie und Immunologie - IZI, Perlickstraße 1, D-04103 Leipzig, Germany.
6
Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria.
7
Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark.
8
Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA.
9
Department of Proteomics, Helmholtz Centre for Environmental Research - UFZ, Permoserstraße 15, 04318 Leipzig, Germany.

Abstract

U6 small nuclear RNAs are part of the splicing machinery. They exhibit several unique features setting them apart from other snRNAs. Reports of introns in structured non-coding RNAs have been very rare. U6 genes, however, were found to be interrupted by an intron in several Schizosaccharomyces species and in 2 Basidiomycota. We conducted a homology search across 147 currently available fungal genomes and identified the U6 genes in all but 2 of them. A detailed comparison of their sequences and predicted secondary structures showed that intron insertion events in the U6 snRNA were much more common in the fungal lineage than previously thought. Their positional distribution across the entire mature snRNA strongly suggests a large number of independent events. All the intron sequences reported here show canonical splice site and branch site motifs, indicating that they require the spliceosomal pathway for their removal.

KEYWORDS:

Fungi; homology search; intron; snRNA; snRNA evolution

PMID:
26828373
PMCID:
PMC4829304
DOI:
10.1080/15476286.2015.1132139
[Indexed for MEDLINE]
Free PMC Article
9.
Cytogenet Genome Res. 2015;145(2):78-179. doi: 10.1159/000430927. Epub 2015 Jul 14.

Third Report on Chicken Genes and Chromosomes 2015.

Author information

1
Department of Human Genetics, University of Würzburg, Würzburg, Germany.
PMID:
26282327
PMCID:
PMC5120589
DOI:
10.1159/000430927
[Indexed for MEDLINE]
Free PMC Article
10.
PLoS One. 2015 Mar 30;10(3):e0121797. doi: 10.1371/journal.pone.0121797. eCollection 2015.

Conservation and losses of non-coding RNAs in avian genomes.

Author information

1
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.
2
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany; ecSeq Bioinformatics, Brandvorwerkstr.43, D-04275 Leipzig, Germany.
3
European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK.
4
Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom.
5
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
6
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
7
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany; Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany; Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria; Center for RNA in Technology and Health, Univ. Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Denmark; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM 87501, USA; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Germany.

Abstract

Here we present the results of a large-scale bioinformatics annotation of non-coding RNA loci in 48 avian genomes. Our approach uses probabilistic models of hand-curated families from the Rfam database to infer conserved RNA families within each avian genome. We supplement these annotations with predictions from the tRNA annotation tool, tRNAscan-SE and microRNAs from miRBase. We identify 34 lncRNA-associated loci that are conserved between birds and mammals and validate 12 of these in chicken. We report several intriguing cases where a reported mammalian lncRNA, but not its function, is conserved. We also demonstrate extensive conservation of classical ncRNAs (e.g., tRNAs) and more recently discovered ncRNAs (e.g., snoRNAs and miRNAs) in birds. Furthermore, we describe numerous "losses" of several RNA families, and attribute these to either genuine loss, divergence or missing data. In particular, we show that many of these losses are due to the challenges associated with assembling avian microchromosomes. These combined results illustrate the utility of applying homology-based methods for annotating novel vertebrate genomes.

PMID:
25822729
PMCID:
PMC4378963
DOI:
10.1371/journal.pone.0121797
[Indexed for MEDLINE]
Free PMC Article
11.
Life (Basel). 2015 Mar 13;5(1):905-20. doi: 10.3390/life5010905.

The Expansion of Animal MicroRNA Families Revisited.

Hertel J1, Stadler PF2,3,4,5,6,7,8.

Author information

1
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. jana@bioinf.uni-leipzig.de.
2
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.
3
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5E, 04103 Leipzig, Germany. studla@bioinf.uni-leipzig.
4
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany. studla@bioinf.uni-leipzig.
5
Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany. studla@bioinf.uni-leipzig.
6
Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria. studla@bioinf.uni-leipzig.
7
Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Denmark. studla@bioinf.uni-leipzig.
8
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA. studla@bioinf.uni-leipzig.

Abstract

MicroRNAs are important regulatory small RNAs in many eukaryotes. Due to their small size and simple structure, they are readily innovated de novo. Throughout the evolution of animals, the emergence of novel microRNA families traces key morphological innovations. Here, we use a computational approach based on homology search and parsimony-based presence/absence analysis to draw a comprehensive picture of microRNA evolution in 159 animal species. We confirm previous observations regarding bursts of innovations accompanying the three rounds of genome duplications in vertebrate evolution and in the early evolution of placental mammals. With a much better resolution for the invertebrate lineage compared to large-scale studies, we observe additional bursts of innovation, e.g., in Rhabditoidea. More importantly, we see clear evidence that loss of microRNA families is not an uncommon phenomenon. The Enoplea may serve as a second dramatic example beyond the tunicates. The large-scale analysis presented here also highlights several generic technical issues in the analysis of very large gene families that will require further research.
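
The abstract above mentions a parsimony-based presence/absence analysis but does not say which parsimony criterion is used. As an illustration only, the sketch below applies Fitch's small-parsimony algorithm to a binary character (family present = 1, absent = 0) on a rooted binary tree; this is a swapped-in textbook technique, not necessarily the authors' method. The function name `fitch`, the nested-tuple tree encoding, and the toy species set are all hypothetical.

```python
def fitch(tree, leaf_states):
    """Bottom-up pass of Fitch's small-parsimony algorithm.

    tree: a leaf name (str) or a 2-tuple (left_subtree, right_subtree).
    leaf_states: dict mapping leaf name -> 0 (family absent) or 1 (present).
    Returns (candidate state set at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one gain or loss.
    return left_set | right_set, left_cost + right_cost + 1


# Toy example with made-up presence/absence data.
toy_tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))
toy_presence = {"human": 1, "mouse": 1, "chicken": 1, "fly": 0, "worm": 0}
root_states, changes = fitch(toy_tree, toy_presence)
print(root_states, changes)
```

Fitch parsimony treats gains and losses symmetrically. A criterion that forbids repeated gains of the same family would attribute more of the pattern to losses, so the choice of criterion affects how many loss events such an analysis reports.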

12.
Mol Cell. 2014 Nov 6;56(3):389-99. doi: 10.1016/j.molcel.2014.10.004. Epub 2014 Nov 6.

The coilin interactome identifies hundreds of small noncoding RNAs that traffic through Cajal bodies.

Author information

1
Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany; Department of Molecular Biophysics & Biochemistry, Yale University, 333 Cedar Street, New Haven, CT 06520, USA.
2
Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstrasse 16-18, 04107 Leipzig, Germany.
3
Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany.
4
Institute of Molecular Biology (IMB), Ackermannweg 4, 55128 Mainz, Germany.
5
Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK.
6
Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany; Department of Molecular Biophysics & Biochemistry, Yale University, 333 Cedar Street, New Haven, CT 06520, USA. Electronic address: karla.neugebauer@yale.edu.

Abstract

Coilin protein scaffolds Cajal bodies (CBs), subnuclear compartments enriched in small nuclear RNAs (snRNAs), and promotes efficient spliceosomal snRNP assembly. The molecular function of coilin, which is intrinsically disordered with no defined motifs, is poorly understood. We use UV crosslinking and immunoprecipitation (iCLIP) to determine whether mammalian coilin binds RNA in vivo and to identify targets. Robust detection of snRNA transcripts correlated with coilin ChIP-seq peaks on snRNA genes, indicating that coilin binding to nascent snRNAs is a site-specific CB nucleator. Surprisingly, several hundred small nucleolar RNAs (snoRNAs) were identified as coilin interactors, including numerous unannotated mouse and human snoRNAs. We show that all classes of snoRNAs concentrate in CBs. Moreover, snoRNAs lacking specific CB retention signals traffic through CBs en route to nucleoli, consistent with the role of CBs in small RNP assembly. Thus, coilin couples snRNA and snoRNA biogenesis, making CBs the cellular hub of small ncRNA metabolism.

PMID:
25514182
DOI:
10.1016/j.molcel.2014.10.004
[Indexed for MEDLINE]
Free full text
13.
Methods Mol Biol. 2014;1097:437-56. doi: 10.1007/978-1-62703-709-9_20.

Computational prediction of microRNA genes.

Author information

1
Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.

Abstract

The computational identification of novel microRNA (miRNA) genes is a challenging task in bioinformatics. Massive amounts of data describing unknown functional RNA transcripts have to be analyzed for putative miRNA candidates with automated computational pipelines. Beyond those miRNAs that meet the classical definition, high-throughput sequencing techniques have revealed additional miRNA-like molecules that are derived by alternative biogenesis pathways. Exhaustive bioinformatics analyses on such data involve statistical issues as well as precise sequence and structure inspection not only of the functional mature part but also of the whole precursor sequence of the putative miRNA. Apart from a considerable amount of species-specific miRNAs, the majority of all those genes are conserved at least among closely related organisms. Some miRNAs, however, can be traced back to very early points in the evolution of eukaryotic species. Thus, the investigation of the conservation of newly found miRNA candidates comprises an important step in the computational annotation of miRNAs. Topics covered in this chapter include a review on the obvious problem of miRNA annotation and family definition, recommended pipelines of computational miRNA annotation or detection, and an overview of current computer tools for the prediction of miRNAs and their limitations. The chapter closes discussing how those bioinformatic approaches address the problem of faithful miRNA prediction and correct annotation.

PMID:
24639171
DOI:
10.1007/978-1-62703-709-9_20
[Indexed for MEDLINE]
14.
Bioinformatics. 2014 Jan 1;30(1):115-6. doi: 10.1093/bioinformatics/btt604. Epub 2013 Oct 29.

snoStrip: a snoRNA annotation pipeline.

Author information

1
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.

Abstract

MOTIVATION:

Although small nucleolar RNAs form an important class of non-coding RNAs, no comprehensive annotation efforts have been undertaken, presumably because the task is complicated by both the large number of distinct small nucleolar RNA families and their relatively rapid pace of sequence evolution.

RESULTS:

With snoStrip we present an automatic annotation pipeline developed specifically for comparative genomics of small nucleolar RNAs. It makes use of sequence conservation, canonical box motifs as well as secondary structure and predicts putative targets.

AVAILABILITY AND IMPLEMENTATION:

The snoStrip web service and the download version are available at http://snostrip.bioinf.uni-leipzig.de/

PMID:
24174566
DOI:
10.1093/bioinformatics/btt604
[Indexed for MEDLINE]
15.
Mol Biol Evol. 2014 Feb;31(2):455-67. doi: 10.1093/molbev/mst209. Epub 2013 Oct 24.

Matching of Soulmates: coevolution of snoRNAs and their targets.

Author information

1
Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.

Abstract

Ribosomal and small nuclear RNAs (snRNAs) comprise numerous modified nucleotides. The modification patterns are retained during evolution, making it even possible to project them from yeast onto human. The stringent conservation of modification sites and the slow evolution of rRNAs and snRNAs contradict the rapid evolution of small nucleolar RNA (snoRNA) sequences. To explain this discrepancy, we investigated the coevolution of snoRNAs and their targeted sites throughout vertebrates. To measure and evaluate the conservation of RNA-RNA interactions, we defined the interaction conservation index (ICI). It combines the quality of an individual interaction with the scope of its conservation in a set of species and serves as an efficient measure to evaluate the conservation of the interaction of snoRNA and target. We show that functions of homologous snoRNAs are evolutionarily stable; thus, members of the same snoRNA family guide equivalent modifications. The conservation of snoRNA sequences is high at target binding regions while the remaining sequence varies significantly. In addition to elucidating principles of correlated evolution, we were able, with the help of the ICI measure, to assign functions to previously orphan snoRNAs and to associate snoRNAs as partners to known chemical modifications unassigned to a given snoRNA. Furthermore, we used predictions of snoRNA functions in conjunction with sequence conservation to identify distant homologies. Because of the high overall entropy of snoRNA sequences, such relationships are hard to detect by means of sequence homology search methods alone.

KEYWORDS:

ICI; RNA-RNA interaction; target prediction

PMID:
24162733
DOI:
10.1093/molbev/mst209
[Indexed for MEDLINE]
16.
Curr Biol. 2012 Jul 24;22(14):1309-13. doi: 10.1016/j.cub.2012.05.018. Epub 2012 Jun 14.

Genomic and morphological evidence converge to resolve the enigma of Strepsiptera.

Author information

1
Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany. o.niehuis.zfmk@uni-bonn.de

Erratum in

  • Curr Biol. 2013 Jul 22;23(14):1388.

Abstract

The phylogeny of insects, one of the most spectacular radiations of life on earth, has received considerable attention. However, the evolutionary roots of one intriguing group of insects, the twisted-wing parasites (Strepsiptera), remain unclear despite centuries of study and debate. Strepsiptera exhibit exceptional larval developmental features, consistent with a predicted step from direct (hemimetabolous) larval development to complete metamorphosis that could have set the stage for the spectacular radiation of metamorphic (holometabolous) insects. Here we report the sequencing of a Strepsiptera genome and show that the analysis of sequence-based genomic data (comprising more than 18 million nucleotides from nearly 4,500 genes obtained from a total of 13 insect genomes), along with genomic metacharacters, clarifies the phylogenetic origin of Strepsiptera and sheds light on the evolution of holometabolous insect development. Our results provide overwhelming support for Strepsiptera as the closest living relatives of beetles (Coleoptera). They demonstrate that the larval developmental features of Strepsiptera, reminiscent of those of hemimetabolous insects, are the result of convergence. Our analyses solve the long-standing enigma of the evolutionary roots of Strepsiptera and reveal that the holometabolous mode of insect development is more malleable than previously thought.

PMID:
22704986
DOI:
10.1016/j.cub.2012.05.018
[Indexed for MEDLINE]
Free full text
17.
RNA Biol. 2012 Mar;9(3):231-41. doi: 10.4161/rna.18974. Epub 2012 Mar 1.

Evolution of the let-7 microRNA family.

Author information

1
Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.

Abstract

The increase of bodyplan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.

PMID:
22617875
PMCID:
PMC3384580
DOI:
10.4161/rna.18974
[Indexed for MEDLINE]
Free PMC Article
18.
Bioinformatics. 2010 Mar 1;26(5):610-6. doi: 10.1093/bioinformatics/btp680. Epub 2009 Dec 16.

RNAsnoop: efficient target prediction for H/ACA snoRNAs.

Author information

1
Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria. htafer@tbi.univie.ac.at

Abstract

MOTIVATION:

Small nucleolar RNAs are an abundant class of non-coding RNAs that guide chemical modifications of rRNAs, snRNAs and some mRNAs. In the case of many 'orphan' snoRNAs, the targeted nucleotides remain unknown, however. The box H/ACA subclass determines uridine residues that are to be converted into pseudouridines via specific complementary binding in a well-defined secondary structure configuration that is outside the scope of common RNA (co-)folding algorithms.

RESULTS:

RNAsnoop implements a dynamic programming algorithm that computes thermodynamically optimal H/ACA-RNA interactions in an efficient scanning variant. Complemented by a support vector machine (SVM)-based machine learning approach to distinguish true binding sites from spurious solutions and a system to evaluate comparative information, it presents an efficient and reliable tool for the prediction of H/ACA snoRNA target sites. We apply RNAsnoop to identify the snoRNAs that are responsible for several of the remaining 'orphan' pseudouridine modifications in human rRNAs, and we assign a target to one of the five orphan H/ACA snoRNAs in Drosophila.

AVAILABILITY:

The C source code of RNAsnoop is freely available at http://www.tbi.univie.ac.at/~htafer/RNAsnoop

PMID:
20015949
DOI:
10.1093/bioinformatics/btp680
[Indexed for MEDLINE]
19.
BMC Genomics. 2009 Oct 8;10:464. doi: 10.1186/1471-2164-10-464.

Homology-based annotation of non-coding RNAs in the genomes of Schistosoma mansoni and Schistosoma japonicum.

Author information

1
Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. cclaudia@bioinf.uni-leipzig.de

Abstract

BACKGROUND:

Schistosomes are trematode parasites of the phylum Platyhelminthes. They are considered the most important of the human helminth parasites in terms of morbidity and mortality. Draft genome sequences are now available for Schistosoma mansoni and Schistosoma japonicum. Non-coding RNA (ncRNA) plays a crucial role in gene expression regulation, cellular function and defense, homeostasis, and pathogenesis. The genome-wide annotation of ncRNAs is a non-trivial task unless well-annotated genomes of closely related species are already available.

RESULTS:

A homology search for structured ncRNA in the genome of S. mansoni resulted in 23 types of ncRNAs with conserved primary and secondary structure. Among these, we identified rRNA, snRNA, SL RNA, SRP, tRNAs and RNase P, and also possibly MRP and 7SK RNAs. In addition, we confirmed five miRNAs that have recently been reported in S. japonicum and found two additional homologs of known miRNAs. The tRNA complement of S. mansoni is comparable to that of the free-living planarian Schmidtea mediterranea, although for some amino acids differences of more than a factor of two are observed: Leu, Ser, and His are overrepresented, while Cys, Met, and Ile are underrepresented in S. mansoni. On the other hand, the number of tRNAs in the genome of S. japonicum is reduced by more than a factor of four. Both schistosomes have a complete set of minor spliceosomal snRNAs. Several ncRNAs that are expected to exist in the S. mansoni genome were not found, among them the telomerase RNA, vault RNAs, and Y RNAs.

CONCLUSION:

The ncRNA sequences and structures presented here represent the most complete dataset of ncRNA from any lophotrochozoan reported so far. This data set provides an important reference for further analysis of the genomes of schistosomes and indeed eukaryotic genomes at large.

PMID:
19814823
PMCID:
PMC2770079
DOI:
10.1186/1471-2164-10-464
[Indexed for MEDLINE]
Free PMC Article
20.
Nucleic Acids Res. 2009 Oct;37(18):6184-93. doi: 10.1093/nar/gkp600. Epub 2009 Sep 1.

Accurate and efficient reconstruction of deep phylogenies from structured RNAs.

Author information

1
Zoologisches Forschungsmuseum Alexander Koenig, Bonn, Germany. jana@bioinf.uni-leipzig.de

Abstract

Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.

PMID:
19723687
PMCID:
PMC2764418
DOI:
10.1093/nar/gkp600
[Indexed for MEDLINE]
Free PMC Article
