Nucleic Acids Res. Jul 2007; 35(Web Server issue): W305–W309.
Published online May 25, 2007. doi:  10.1093/nar/gkm255
PMCID: PMC1933207

RNAbor: a web server for RNA structural neighbors

Abstract

RNAbor provides a new tool for researchers in the biological and related sciences to explore important aspects of RNA secondary structure and folding pathways. RNAbor computes statistics concerning δ-neighbors of a given input RNA sequence and structure (the structure can, for example, be the minimum free energy (MFE) structure). A δ-neighbor is a structure that differs from the input structure by exactly δ base pairs, that is, it can be obtained from the input structure by adding and/or removing exactly δ base pairs. For each distance δ, RNAbor computes the density of δ-neighbors, the number of δ-neighbors, and the MFE structure, or MFEδ structure, among all δ-neighbors. RNAbor can be used to study possible folding pathways, to determine alternate low-energy structures, to predict potential nucleation sites and to explore structural neighbors of an intermediate, biologically active structure. The web server is available at http://bioinformatics.bc.edu/clotelab/RNAbor.

INTRODUCTION

RNA plays a surprising and previously unsuspected role in many biological processes, such as post-transcriptional regulation, conformational switches, expansion of the genetic code (such as selenocysteine insertion), ribosomal frameshifting, metabolite binding and chemical modification of specific nucleotides in the ribosome. Apart from its catalytic role as a ribonucleic enzyme (ribozyme) (1), RNA can regulate genes in several ways. For example, by hybridizing to a portion of messenger RNA, small (~22 nt) RNA molecules perform post-transcriptional gene regulation by RNA interference (RNAi), a process so important that the 2006 Nobel Prize in Physiology or Medicine was awarded to A. Z. Fire and C. C. Mello for its discovery. In addition, by very different means, RNA can perform transcriptional and translational gene regulation by allostery, where a portion of the 5′ untranslated region (5′ UTR) of mRNA, known as a riboswitch (2,3), undergoes a conformational change upon binding a specific ligand such as adenine, guanine or lysine.

As the field of RNomics matures, many sophisticated computational tools, e.g. for RNA structure prediction, alignment and gene finding, have been developed; see (4,5) for recent overviews. Recently developed programs of most relevance here include Sfold (6,7), which computes a low energy ensemble of structures by sampling from the partition function (8), and the earlier program RNAsubopt (9), which computes all suboptimal structures within a user-specified number of kcal/mol of the minimum free energy (MFE). In addition, the program RNAshapes (10–12) provides a useful description of RNA branching structure by computing the Boltzmann probability of various shapes and also the MFE structure for various shapes. Here, an RNA shape is an equivalence class of secondary structures, describing the overall branching; for instance, the shape of a typical cloverleaf tRNA would be [ [ ] [ ] [ ] ].

In this article, we describe the web server RNAbor, which computes, for each distance δ, the Boltzmann probability of the structures that differ by exactly δ base pairs from a given initial structure, together with the MFE structure among them. Unlike most of the tools just described, which focus on the MFE structure or a low energy ensemble, RNAbor yields information concerning the secondary structure folding landscape. Potential applications of RNAbor include the design of RNA aptamers (see (13) for a suggestion of how RNA might be designed to inhibit the function of viral enzymes such as HIV-1 reverse transcriptase and hepatitis C NS3 protease), detection of conformational switches, understanding the role played by biologically active structural intermediates and improvement in secondary structure prediction.

MATERIALS AND METHODS

Let s denote a given RNA nucleotide sequence, and let S be any given secondary structure of s. The structure S could be the MFE structure of s, it could be the secondary structure obtained from the 3-dimensional X-ray conformation or by comparative sequence analysis, or it could be an arbitrary intermediate structure of particular biological significance. For an integer δ, a secondary structure S′ of s is a δ-neighbor of S if S and S′ differ by exactly δ base pairs (14). In (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), we describe new algorithms that compute the number Nδ of δ-neighbors, the partition function Zδ for δ-neighbors, and the MFEδ and corresponding MFEδ structure over all δ-neighbors of a fixed structure S.
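As a small illustration of the δ-neighbor definition, the following Python sketch computes the base pair distance between two structures as the size of the symmetric difference of their base pair sets. It assumes that structures are written in dot-bracket notation, which the text does not state explicitly; the function names are hypothetical and are not part of RNAbor.

```python
def parse_pairs(structure):
    """Return the set of base pairs (i, j), 0-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have the same length")
    return len(parse_pairs(structure_a) ^ parse_pairs(structure_b))


def is_delta_neighbor(structure_a, structure_b, delta):
    """True if the two structures differ by exactly delta base pairs."""
    return base_pair_distance(structure_a, structure_b) == delta
```

For example, under this definition the open chain `.........` is a 3-neighbor of `(((...)))`, since three base pairs must be removed to get from one to the other.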

Computing structural neighbors

To give the reader a feeling for how the algorithms work, we present the recurrence relations to compute the number Nδ of δ-neighbors of S. Write s = s_1 ⋯ s_n and let 1 ≤ i ≤ j ≤ n. If N^δ_{i,j} denotes the number of δ-neighbors of the substructure S[i,j], the restriction of S to the interval [i,j] of s, then the number of δ-neighbors of S, N^δ_{1,n}, can be computed by the following recursion:

\[
N^{\delta}_{i,j} \;=\; N^{\delta-b_0}_{i,j-1} \;+\; \sum_{\substack{i \le k < j \\ s_k s_j \in B}} \;\; \sum_{w+w'=\delta-b} N^{w}_{i,k-1} \cdot N^{w'}_{k+1,j-1} \qquad (1)
\]

where B = {AU, UA, GC, CG, GU, UG} (i.e. the set of Watson-Crick base pairs together with wobbles), b_0 = 1 if j is base-paired in S[i,j] and 0 otherwise, and b is the base pair distance between S[i,j] and a structure on the same interval [i,j] where a base pair between k and j has been added (taking into account all the base pairs in S[i,j] that need to be broken to allow the addition of this base pair).
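The counting recursion can be made concrete with a short Python sketch. It follows the decomposition described above: position j is either unpaired, which costs b_0, or paired with some k, which costs b and splits the interval in two. It is a reconstruction from that description, not the authors' implementation. The dot-bracket input, the 0-based indexing, the minimum hairpin loop size `theta = 3` and all function names are assumptions made for illustration.

```python
from functools import lru_cache

# Watson-Crick pairs together with GU wobbles.
ALLOWED = {"AU", "UA", "GC", "CG", "GU", "UG"}


def parse_pairs(structure):
    """Return the set of base pairs (i, j), 0-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def count_neighbors(seq, structure, max_delta=None, theta=3):
    """Return a list whose entry d is the number of d-neighbors of `structure`.

    theta is the assumed minimum number of unpaired bases in a hairpin loop.
    Plain recursion with memoization; intended for short sequences only.
    """
    n = len(seq)
    pairs = parse_pairs(structure)
    partner = {}
    for x, y in pairs:
        partner[x], partner[y] = y, x
    big = n if max_delta is None else max_delta

    def inside(i, j):
        # Number of base pairs of the input structure lying within [i, j].
        return sum(1 for x, y in pairs if i <= x and y <= j)

    @lru_cache(maxsize=None)
    def table(i, j):
        counts = [0] * (big + 1)
        if j - i <= theta:
            # No pair fits: the empty structure is the only candidate.
            d0 = inside(i, j)
            if d0 <= big:
                counts[d0] = 1
            return tuple(counts)
        # Case 1: j is unpaired; costs b0 if j was paired within [i, j].
        b0 = 1 if i <= partner.get(j, -1) < j else 0
        prev = table(i, j - 1)
        for d in range(b0, big + 1):
            counts[d] += prev[d - b0]
        # Case 2: j pairs with k; pairs crossing or touching k or j are broken.
        for k in range(i, j - theta):
            if seq[k] + seq[j] not in ALLOWED:
                continue
            left, right = table(i, k - 1), table(k + 1, j - 1)
            b = inside(i, j) - inside(i, k - 1) - inside(k + 1, j - 1)
            b += -1 if (k, j) in pairs else 1
            for w, left_count in enumerate(left):
                if left_count == 0:
                    continue
                for w2, right_count in enumerate(right):
                    d = w + w2 + b
                    if d > big:
                        break
                    counts[d] += left_count * right_count
        return tuple(counts)

    return list(table(0, n - 1))
```

For the toy input `count_neighbors("GGGAAACCC", "(((...)))")`, the entries sum to 20, the total number of secondary structures this sequence admits under the stated assumptions, and entry 0 is 1 because the input structure is its only 0-neighbor.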

This approach for computing Nδ can be extended to compute the partition function contribution, Zδ, of the set of δ-neighbors and also to compute the MFEδ and the MFEδ structure. Computations are made with respect to the Turner energy model (15,16); treatment of dangles is similar to that in the Vienna RNA Package (option -d2). The algorithms employ dynamic programming, and run in O(Δ · n^3) time and O(Δ · n^2) space, where n is the sequence length and Δ is the maximum value of δ. Since Δ can be at most n, the run time cannot be worse than O(n^4) and the space no worse than O(n^3), even if the user does not specify a value of Δ. Full details of the algorithms are given in (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication).

Web server

The web server, available at http://bioinformatics.bc.edu/clotelab/RNAbor, runs on a Linux cluster with 20 computational nodes, each with dual processors of between 1300 and 3000 MHz and 2 GB RAM (6 Dell PowerEdge 1650, 2 × 1300 MHz Pentium III, 2 GB RAM; 11 Dell PowerEdge 1850, 2 × 2800+ MHz Xeon EM64T, 2 GB RAM; 5 Dell PowerEdge 1850, 2 × 3000 MHz Xeon EM64T, 2 GB RAM).

RESULTS

Due to the time and space constraints of the algorithm, RNA sequences may be of length up to 300 nucleotides. Sequences of length up to 60 are processed interactively and output is displayed in the user's browser window. For sequences of length 61–300, the computation is done off-line and the results are returned to the user by email; for this, the email address is required. The user can either paste an input sequence (with optional secondary structure), or upload a file of the same. The full input consists of up to four lines, illustrated by the following example.

[Example input shown as an image in the original (gkm255i1.jpg)]

The temperature is set to a default value of 37 °C; however, the user can enter any integer temperature between 0 and 100.
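The submission limits described above can be summarized in a few lines of Python. This is only a sketch of the stated rules (length at most 300, interactive processing up to length 60, email required otherwise, integer temperature between 0 and 100); the function name, the returned dictionary and the error messages are hypothetical and do not describe the server's actual code.

```python
MAX_LENGTH = 300       # longest sequence accepted
INTERACTIVE_MAX = 60   # longest sequence processed interactively


def plan_job(sequence, temperature=37, email=None):
    """Validate a submission and decide how its results would be delivered."""
    n = len(sequence)
    if n == 0 or n > MAX_LENGTH:
        raise ValueError("sequence length must be between 1 and 300")
    if not isinstance(temperature, int) or not 0 <= temperature <= 100:
        raise ValueError("temperature must be an integer between 0 and 100")
    interactive = n <= INTERACTIVE_MAX
    if not interactive and not email:
        raise ValueError("an email address is required for sequences over 60 nt")
    return {
        "length": n,
        "temperature": temperature,
        "mode": "interactive" if interactive else "email",
    }
```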

The only required input is an RNA sequence s of length at most 300 nucleotides; the FASTA comment, initial secondary structure S and upper bound Δ are optional inputs. If no secondary structure is given, then the initial structure S is taken to be the MFE structure, as computed by RNAfold -d2. If the optional input Δ is missing, then Δ is defined to equal the length n of the input sequence s; otherwise Δ is the minimum of the input value and n. For each 0 ≤ δ ≤ Δ, RNAbor computes the Boltzmann probability pδ = Zδ/Z, where the partition function Zδ is defined by

\[
Z^{\delta} \;=\; \sum_{S'} \exp\bigl(-E(S')/RT\bigr)
\]

where E(S′) is the free energy of the structure S′, R is the universal gas constant and T is the absolute temperature in kelvin. Here, the summation is made over all secondary structures S′ of s which are δ-neighbors of S. The full partition function Z = ∑δ Zδ is computed by McCaskill's algorithm (8) if Δ ≥ n.
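To make the definition of pδ concrete, the sketch below groups an explicitly enumerated list of structures by their distance δ, sums the Boltzmann factors exp(−E/RT) within each group to obtain Zδ, and normalizes by Z. RNAbor itself uses dynamic programming and never enumerates structures; the explicit list, the function name and the toy energies in the usage note are assumptions for illustration only. Energies are taken to be in kcal/mol.

```python
import math
from collections import defaultdict

R_KCAL = 0.0019872  # universal gas constant in kcal/(mol*K)


def boltzmann_by_distance(structures, temp_celsius=37.0):
    """Map each distance delta to p_delta = Z_delta / Z.

    `structures` is an iterable of (delta, energy) pairs, one per structure,
    with energy in kcal/mol.
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    z_delta = defaultdict(float)
    for delta, energy in structures:
        z_delta[delta] += math.exp(-energy / rt)
    z_total = sum(z_delta.values())
    return {delta: z / z_total for delta, z in sorted(z_delta.items())}
```

With made-up input such as `[(0, -3.0), (1, 0.0), (1, -1.0)]`, the single structure at δ = 0 receives most of the probability mass, because its energy is lowest and the Boltzmann factor grows exponentially as the energy decreases.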

In addition to computing the probability pδ, RNAbor computes the number Nδ of δ-neighbors of S, the MFEδ over all δ-neighbors of S and the MFEδ secondary structure. Tables of the values Nδ and pδ, as well as their graphs, are made available as downloadable files. The five-column text file output, consisting of δ, pδ, Nδ, MFEδ and the MFEδ structure, is depicted in Figure 1.

Figure 1.
Text output of RNAbor on the 51 nt 3′ UTR of an mRNA with NCBI accession number MUSGBPS. The five columns in the entire text output from RNAbor are given by the following, in order: (i) value of δ, (ii) Boltzmann probability p ...

EXAMPLES

RNAbor can be used to generate alternative low energy structures, which differ markedly from the MFE structure, or from any initially given structure. Figure 1 shows the RNAbor output for a short 3′-UTR sequence of an mRNA with NCBI accession number MUSGBPS. The input structure in this example is the MFE structure (as predicted by RNAfold -d2). The RNAbor output indicates two ranges of δ that show higher probabilities than the rest, 0–9 and 20–24. The MFEδ structures at distance δ between 0 and 9 from the MFE structure all have very similar folds, and the probability of finding the RNA in a structure at δ between 0 and 9 is 0.63. The probability of finding a structure at δ between 20 and 24 is also relatively high, 0.35, and the MFEδ structures in this range are similar to each other but completely different from the MFE structure. Thus the two highly probable δ ranges represent two possible alternative folds of the RNA.
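The kind of reading described above, locating ranges of δ with elevated probability and summing pδ over each range, can be sketched as follows. The threshold-based grouping is an assumption made for illustration and is not a procedure described by the authors; the function name and the values in the usage note are likewise hypothetical and are not taken from Figure 1.

```python
def high_probability_ranges(p, threshold):
    """Group consecutive delta values with p[delta] >= threshold.

    `p` maps delta to its probability. Returns a list of (first, last, mass)
    tuples, where mass is the total probability over the range.
    """
    ranges = []
    start = prev = None
    mass = 0.0
    for delta in sorted(p):
        if p[delta] < threshold:
            continue
        if start is not None and delta == prev + 1:
            # Extend the current run of consecutive delta values.
            prev = delta
            mass += p[delta]
        else:
            # Close the previous run, if any, and start a new one.
            if start is not None:
                ranges.append((start, prev, mass))
            start = prev = delta
            mass = p[delta]
    if start is not None:
        ranges.append((start, prev, mass))
    return ranges
```

Applied to a made-up profile such as `{0: 0.3, 1: 0.2, 2: 0.01, 3: 0.02, 4: 0.25, 5: 0.22}` with a threshold of 0.1, the function reports two ranges, 0–1 and 4–5, each with its total probability mass.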

Analyzing the same sequence with Sfold gives similar results. Sfold finds three types of structures (three clusters), with probabilities 0.65, 0.22 and 0.13, respectively. One cluster contains the MFE structure and corresponds to the folds at δ values from 0 to 9, another cluster has a centroid structure resembling the structures at δ between 20 and 24, and the third cluster has a centroid structure similar to the MFE19 structure. RNAshapes, on the other hand, is less successful for this example, since the alternative folds predicted by RNAbor have the same shape [ ], even though the folds are very different.

Figure 2 displays the MFE structure and the MFE30 structure of the 101 nt SAM riboswitch with EMBL accession number AP004597.1/118941-119041, with sequence taken from Rfam (17). The MFE structure over all 30-neighbors, the MFE30 structure, is clearly much closer to the real structure than the global MFE structure. Figure 3 displays the Boltzmann probability density, showing a peak for the value δ = 30.

Figure 2.
Two alternative low energy secondary structures for the 101 nt SAM riboswitch with EMBL accession number AP004597.1 from position 118941 to ...
Figure 3.
Boltzmann probability density plot for the 101 nt SAM riboswitch (EMBL accession number AP004597.1/118941-119041). The curve shows the probability, ...

DISCUSSION

In this article, we have introduced the web server RNAbor, which computes the Boltzmann probability and MFE structure over all δ-neighbors for a given RNA sequence s and initial secondary structure S. The underlying algorithms, described in the forthcoming paper (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), use dynamic programming, involve the Turner energy model (15,16), and require considerable time, O(Δ · n^3), and space, O(Δ · n^2). Figures 2 and 3 illustrate the use of RNAbor in better understanding structural aspects of a SAM riboswitch, and indicate that RNAbor should provide a useful complementary tool to programs such as Sfold and RNAshapes for analyzing the ensemble of possible secondary structures on a given RNA sequence.

ACKNOWLEDGEMENTS

Research of P.C. was partially supported by National Science Foundation DBI-0543506, which additionally supported some travel of E.F. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. All three authors would like to thank Elena Rivas, Eric Westhof and funding agencies for organizing the meeting RNA-2006 in Benasque, Spain, in July 2006, where some of this work was carried out. Finally, thanks to Jason Persampieri for some technical assistance. Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation.

Conflict of interest statement. None declared.

REFERENCES

1. Doudna JA, Cech TR. The chemical repertoire of natural ribozymes. Nature. 2002;418:222–228.
2. Winkler WC, Cohen-Chalamish S, Breaker RR. An mRNA structure that controls gene expression by binding FMN. Proc. Natl Acad. Sci. USA. 2002;99:15908–15913.
3. Penchovsky R, Breaker RR. Computational design and experimental validation of oligonucleotide-sensing allosteric ribozymes. Nat. Biotechnol. 2005;23:1424–1431.
4. Mathews DH, Turner DH. Prediction of RNA secondary structure by free energy minimization. Curr. Opin. Struct. Biol. 2006;16:270–278.
5. Eddy SR. Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2001;2:919–929.
6. Ding Y, Lawrence CE. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res. 2003;31:7280–7301.
7. Ding Y, Chan CY, Lawrence CE. RNA secondary structure prediction by centroids in a Boltzmann weighted ensemble. RNA. 2005;11:1157–1166.
8. McCaskill JS. The equilibrium partition function and base pair binding probabilities for RNA secondary structures. Biopolymers. 1990;29:1105–1119.
9. Wuchty S, Fontana W, Hofacker IL, Schuster P. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers. 1999;49:145–164.
10. Giegerich R, Voss B, Rehmsmeier M. Abstract shapes of RNA. Nucleic Acids Res. 2004;32:4843–4851.
11. Steffen P, Voss B, Rehmsmeier M, Reeder J, Giegerich R. RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics. 2006;22:500–503.
12. Voss B, Giegerich R, Rehmsmeier M. Complete probabilistic analysis of RNA shapes. BMC Biol. 2006;4:5.
13. James W. Nucleic acid and polypeptide aptamers: a powerful approach to ligand discovery. Curr. Opin. Pharmacol. 2001;1:540–548.
14. Moulton V, Zuker M, Steel M, Pointon R, Penny D. Metrics on RNA secondary structures. J. Comput. Biol. 2000;7:277–292.
15. Mathews DH, Sabina J, Zuker M, Turner DH. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 1999;288:911–940.
16. Xia T, SantaLucia J, Jr, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, Cox C, Turner DH. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry. 1998;37:14719–14735.
17. Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003;31:439–441.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press