# RNAbor: a web server for RNA structural neighbors

^{1}Linnaeus Centre for Bioinformatics, Uppsala University, 75124 Uppsala, Sweden,

^{2}School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK and

^{3}Department of Biology, Boston College, Chestnut Hill, MA 02467, USA

## Abstract

`RNAbor` provides a new tool for researchers in the biological and related sciences to explore important aspects of RNA secondary structure and folding pathways. `RNAbor` computes statistics concerning δ-neighbors of a given input RNA sequence and structure (the structure can, for example, be the minimum free energy (MFE) structure). A δ-neighbor is a structure that differs from the input structure by exactly δ base pairs, that is, it can be obtained from the input structure by adding and/or removing exactly δ base pairs. For each distance δ, `RNAbor` computes the density of δ-neighbors, the number of δ-neighbors, and the MFE structure, or MFE^{δ} structure, among all δ-neighbors. `RNAbor` can be used to study possible folding pathways, to determine alternate low-energy structures, to predict potential nucleation sites and to explore structural neighbors of an intermediate, biologically active structure. The web server is available at `http://bioinformatics.bc.edu/clotelab/RNAbor`.

## INTRODUCTION

RNA plays a surprising and previously unsuspected role in many biological processes, such as post-transcriptional regulation, conformational switches, expansion of the genetic code (such as selenocysteine insertion), ribosomal frameshifting, metabolite binding and chemical modification of specific nucleotides in the ribosome. Apart from its catalytic role as a ribonucleic enzyme (*ribozyme*) (1), RNA can regulate genes in several ways. For example, by hybridizing to a portion of messenger RNA, small RNA molecules of ~22 nt perform post-transcriptional gene regulation by RNA interference (RNAi), a process so important that the 2006 Nobel Prize in Physiology or Medicine was awarded to A.Z. Fire and C.C. Mello for its discovery. In addition, by very different means, RNA can perform transcriptional and translational gene regulation by allostery, where a portion of the 5′ untranslated region (5′ UTR) of mRNA, known as a *riboswitch* (2,3), undergoes a conformational change upon binding a specific ligand such as adenine, guanine or lysine.

As the field of *RNomics* matures, many sophisticated computational tools, e.g. for RNA structure prediction, alignment and gene finding, have been developed; see (4,5) for recent overviews. Recently developed programs that are of most relevance here include the program `Sfold` (6,7), which computes a low energy ensemble of structures by sampling from the partition function (8), and an earlier program `RNAsubopt` (9), which computes all suboptimal structures within a user-specified number of kcal/mol of the minimum free energy (MFE). In addition, the program `RNAshapes` (10–12) provides a useful description of RNA branching structure by computing the Boltzmann probability of various shapes and also the MFE structure for various shapes. Here, an RNA shape is an equivalence class of secondary structures, describing the overall branching; for instance the shape of a typical cloverleaf tRNA would be [[][][]].

In this article, we describe the web server `RNAbor`, which computes the Boltzmann probability of, and the MFE structure among, the structures that differ by exactly δ base pairs from a given initial structure. Unlike most of the tools just described, which focus on the MFE structure or a low energy ensemble, `RNAbor` yields information concerning the secondary structure folding landscape. Potential applications of `RNAbor` include the design of RNA aptamers (see (13) for a suggestion of how RNA might be designed to inhibit the function of viral enzymes such as HIV-1 reverse transcriptase and hepatitis C NS3 protease), detection of conformational switches, understanding the role played by biologically active structural intermediates and improvement in secondary structure prediction.

## MATERIALS AND METHODS

Let *s* denote a given RNA nucleotide sequence, and let *S* be any given secondary structure of *s*. The structure *S* could be the MFE structure of *s*, it could be the secondary structure obtained from the 3-dimensional X-ray conformation or by comparative sequence analysis, or it could be an arbitrary intermediate structure of particular biological significance. For an integer δ, a secondary structure *S′* of *s* is a δ-neighbor of *S* if *S* and *S′* differ by exactly δ base pairs (14). In (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), we describe new algorithms, which compute the number N^{δ} of δ-neighbors, the partition function Z^{δ} for δ-neighbors, and the MFE^{δ} and the corresponding MFE^{δ} structure over all δ-neighbors of a fixed structure *S*.
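To make the notion of a δ-neighbor concrete, the following minimal sketch computes the number of base pairs by which two structures differ, i.e. the size of the symmetric difference of their base pair sets. It assumes that structures are written in dot-bracket notation; the function names `parse_dot_bracket` and `base_pair_distance` are illustrative and not part of `RNAbor`.

```python
def parse_dot_bracket(structure):
    """Return the set of 0-based base pairs (i, j) encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return frozenset(pairs)


def base_pair_distance(a, b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    return len(parse_dot_bracket(a) ^ parse_dot_bracket(b))


print(base_pair_distance("((...))", "(.....)"))    # → 1
print(base_pair_distance("((...)).", ".((...))"))  # → 4
```

In this sketch, a structure is a δ-neighbor of a reference structure exactly when `base_pair_distance` returns δ.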

### Computing structural neighbors

To give the reader a feeling for how the algorithms work, we present the recurrence relations to compute the number N^{δ} of δ-neighbors of the given structure *S*. Let *s* = *s*_{1}, …, *s*_{n}. If N^{δ}_{i,j} denotes the number of δ-neighbors of the substructure *S*_{[i,j]}, the restriction of *S* to interval [*i*,*j*] of *s*, then the number of δ-neighbors of *S*, N^{δ} = N^{δ}_{1,n}, can be computed by the following recursion:

$$
N^{\delta}_{i,j} = N^{\delta - b_0}_{i,j-1} + \sum_{\substack{i \le k < j \\ (s_k, s_j) \in B}} \; \sum_{w + w' = \delta - b(k)} N^{w}_{i,k-1} \cdot N^{w'}_{k+1,j-1}
$$

where *B* = {AU, UA, GC, CG, GU, UG} (i.e. the set of Watson-Crick base pairs together with wobbles), b_{0} = 1 if *j* is base-paired in *S*_{[i,j]} and 0 otherwise, and *b*(*k*) is the base pair distance between *S*_{[i,j]} and a structure on the same interval [*i*,*j*] where a base pair between *k* and *j* has been added (taking into account all the base pairs in *S*_{[i,j]} that need to be broken to allow the addition of this base pair).
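The counting scheme described above can be illustrated with a short Python sketch. It is not the authors' implementation: it counts structures only (no energies), assumes dot-bracket input, assumes a minimum hairpin loop size of three unpaired bases (`MIN_LOOP`, an assumption not stated in the text), and favours clarity over the running time quoted for the real algorithms. For each interval it distinguishes the case where the last position is unpaired from the case where it pairs with some position *k*, and in both cases adds the number of reference base pairs that are broken or newly created.

```python
from functools import lru_cache

# Watson-Crick pairs together with GU wobbles.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def parse_dot_bracket(structure):
    """Return the set of 0-based base pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return frozenset(pairs)


def count_neighbors(seq, structure, max_delta=None):
    """Return [N^0, ..., N^Delta], the numbers of delta-neighbors of `structure`."""
    n = len(seq)
    ref = parse_dot_bracket(structure)
    big = n if max_delta is None else min(max_delta, n)
    partner = {j: i for i, j in ref}  # 3' end of each reference pair -> 5' end

    def pairs_inside(i, j):
        """Number of reference base pairs lying entirely within [i, j]."""
        return sum(1 for x, y in ref if i <= x and y <= j)

    def shift(counts, b):
        """Move every count b steps towards larger delta, truncating at Delta."""
        return ([0] * b + list(counts))[: big + 1]

    def convolve(a, c):
        """out[d] = sum over w + w' = d of a[w] * c[w'], truncated at Delta."""
        out = [0] * (big + 1)
        for w, aw in enumerate(a):
            if aw:
                for v, cv in enumerate(c[: big + 1 - w]):
                    out[w + v] += aw * cv
        return out

    @lru_cache(maxsize=None)
    def counts(i, j):
        """Tuple c with c[delta] = number of delta-neighbors on interval [i, j]."""
        if i > j:  # empty interval: only the empty structure, at distance 0
            return (1,) + (0,) * big
        # Case 1: j stays unpaired, which breaks the reference pair of j, if any.
        b0 = 1 if partner.get(j, -1) >= i else 0
        total = shift(counts(i, j - 1), b0)
        # Case 2: j pairs with k; the two flanking intervals fold independently.
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) not in PAIRS:
                continue
            kept = pairs_inside(i, k - 1) + pairs_inside(k + 1, j - 1)
            in_ref = (k, j) in ref
            # Reference pairs broken by adding (k, j), plus 1 if (k, j) is new.
            b = pairs_inside(i, j) - kept - in_ref + (not in_ref)
            term = shift(convolve(counts(i, k - 1), counts(k + 1, j - 1)), b)
            total = [t + u for t, u in zip(total, term)]
        return tuple(total)

    return list(counts(0, n - 1))


print(count_neighbors("GAAAC", "(...)"))  # → [1, 1, 0, 0, 0, 0]
```

For the toy sequence `GAAAC` only the pair between the first and last nucleotide is possible under these assumptions, so there is one 0-neighbor (the input structure itself) and one 1-neighbor (the open chain).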

This approach for computing N^{δ} can be extended to compute the partition function contribution, Z^{δ}, of the set of δ-neighbors and also to compute the MFE^{δ} and the MFE^{δ} structure. Computations are made with respect to the Turner energy model (15,16); the treatment of dangles is similar to that in the Vienna RNA Package (option `-d2`). The algorithms employ dynamic programming, and run in *O*(Δ · *n*^{3}) time and *O*(Δ · *n*^{2}) space, where *n* is the sequence length and Δ is the maximum value of δ. Since Δ can be at most *n*, the run time cannot be worse than *O*(*n*^{4}) and the space no worse than *O*(*n*^{3}), even if the user does not specify a value of Δ. Full details of the algorithms are given in (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication).

### Web server

The web server, available at `http://bioinformatics.bc.edu/clotelab/RNAbor`, runs on a Linux cluster with 20 computational nodes, each with dual processors of between 1300 and 3000 MHz and 2 GB RAM (6 Dell PowerEdge 1650, 2 × 1300 MHz Pentium III, 2 GB RAM; 11 Dell PowerEdge 1850, 2 × 2800+ MHz Xeon EM64T, 2 GB RAM; 5 Dell PowerEdge 1850, 2 × 3000 MHz Xeon EM64T, 2 GB RAM).

## RESULTS

Due to the time and space constraints of the algorithm, RNA sequences may be of length up to 300 nucleotides. Sequences of length up to 60 are processed interactively and output is displayed in the user's browser window. For sequences of length 61–300, the computation is done off-line and the results are returned to the user by email; for this, an email address is required. The user can either paste an input sequence (with optional secondary structure), or upload a file containing the same. The full input consists of up to four lines (a FASTA comment, the RNA sequence, an initial secondary structure and an upper bound Δ).

The temperature is set to a default value of 37°C; however, the user can enter any integer temperature between 0 and 100.
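As an illustration of the input just described, the sketch below parses a block of up to four lines into its parts. The exact syntax accepted by the server is not reproduced here, so the layout is an assumption: an optional comment line starting with `>`, the sequence, an optional dot-bracket structure and an optional integer Δ, in that order. The names `RNAborInput` and `parse_input` are hypothetical. When Δ is absent it defaults to the sequence length *n*, and otherwise it is capped at *n*.

```python
from dataclasses import dataclass
from typing import Optional

MAX_LENGTH = 300  # longest sequence accepted, as stated in the text


@dataclass
class RNAborInput:
    sequence: str
    max_delta: int
    comment: Optional[str] = None
    structure: Optional[str] = None


def parse_input(text):
    """Parse an (assumed) four-line input: comment, sequence, structure, Delta."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    comment = None
    if lines and lines[0].startswith(">"):
        comment = lines.pop(0)[1:].strip()
    if not lines:
        raise ValueError("an RNA sequence is required")
    sequence = lines.pop(0).upper()
    if set(sequence) - set("ACGU"):
        raise ValueError("sequence may contain only A, C, G and U")
    if len(sequence) > MAX_LENGTH:
        raise ValueError("sequence longer than %d nucleotides" % MAX_LENGTH)
    structure = None
    if lines and set(lines[0]) <= set("()."):
        structure = lines.pop(0)
        if len(structure) != len(sequence):
            raise ValueError("structure and sequence differ in length")
    max_delta = len(sequence)  # default: Delta equals the sequence length n
    if lines:
        requested = int(lines.pop(0))
        if requested < 0:
            raise ValueError("Delta must be non-negative")
        max_delta = min(requested, len(sequence))
    if lines:
        raise ValueError("unexpected extra input lines")
    return RNAborInput(sequence, max_delta, comment, structure)


example = """> hypothetical example
GGGAAAUCCC
(((....)))
5
"""
print(parse_input(example))
```

A missing structure is left as `None` in this sketch; the server would then substitute the MFE structure.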

The only required input is an RNA sequence of length at most 300 nucleotides; the FASTA comment, initial secondary structure and upper bound Δ are optional inputs. If no secondary structure is given, then the initial structure is taken to be the MFE structure, as computed by `RNAfold -d2`. If the optional input Δ is missing, then Δ is defined to equal the length *n* of the input sequence *s*; otherwise Δ is the minimum of the input value and *n*. For each 0 ≤ δ ≤ Δ, `RNAbor` computes the Boltzmann probability *p*^{δ} = Z^{δ}/Z, where the partition function Z^{δ} is defined by

$$
Z^{\delta} = \sum_{S'} e^{-E(S')/RT}
$$

where *E*(*S′*) is the free energy of the secondary structure *S′*, *R* is the universal gas constant and *T* is the temperature in kelvin. Here, the summation is made over all secondary structures *S′* of *s* which are δ-neighbors of the initial structure *S*. The full partition function Z = ∑_{δ} Z^{δ} is computed by McCaskill's algorithm (8) if Δ ≥ *n*.
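The relation between the per-distance partition functions and the probabilities *p*^{δ} can be shown with a small numerical sketch. It assumes that the full ensemble is available as an explicit list of (δ, free energy) pairs, which is only feasible for toy cases; `RNAbor` itself uses dynamic programming instead of enumeration. The function name `neighbor_probabilities` and the energies in the example are invented for illustration.

```python
import math
from collections import defaultdict

R = 0.0019872  # universal gas constant in kcal/(mol K)


def neighbor_probabilities(structures, temperature_c=37.0):
    """Map each delta to p^delta = Z^delta / Z.

    `structures` is an iterable of (delta, free_energy) pairs with energies in
    kcal/mol; it is assumed to list every structure of the ensemble once.
    """
    rt = R * (temperature_c + 273.15)  # convert Celsius to kelvin
    z_delta = defaultdict(float)
    for delta, energy in structures:
        z_delta[delta] += math.exp(-energy / rt)  # Boltzmann factor
    z = sum(z_delta.values())  # full partition function Z
    return {delta: zd / z for delta, zd in sorted(z_delta.items())}


# Hypothetical toy ensemble: (delta, free energy in kcal/mol).
toy = [(0, -3.0), (1, -2.1), (1, -1.8), (2, 0.0), (3, -2.9)]
for delta, p in neighbor_probabilities(toy).items():
    print(delta, round(p, 3))
```

Because every structure lies at exactly one distance δ from the initial structure, the values *p*^{δ} returned by this sketch sum to 1.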

In addition to computing the probability *p*^{δ}, `RNAbor` computes the number *N*^{δ} of δ-neighbors of the initial structure *S*, the MFE^{δ} over all δ-neighbors of *S* and the MFE^{δ} secondary structure. Tables of the values *N*^{δ} and *p*^{δ}, as well as their graphs, are made available as downloadable files. The five-column text file output, consisting of δ, *p*^{δ}, *N*^{δ}, MFE^{δ} and the MFE^{δ} structure, is depicted in Figure 1.

## EXAMPLES

`RNAbor` can be used to generate alternative low energy structures, which differ markedly from the MFE structure, or from any initially given structure. Figure 1 shows the `RNAbor` output for a short 3′-UTR sequence of an mRNA with NCBI accession number MUSGBPS. The input structure in this example is the MFE structure (as predicted by `RNAfold -d2`). The `RNAbor` output indicates two ranges of δ that show higher probabilities than the rest, 0–9 and 20–24. The MFE^{δ} structures at distance δ between 0 and 9 from the MFE structure all have very similar folds and the probability of finding the RNA in a structure at δ between 0 and 9 is 0.63. The probability of finding a structure at δ between 20 and 24 is also relatively high, 0.35, and the MFE^{δ} structures in this range are similar to each other but completely different from the MFE structure. Thus the two highly probable δ ranges represent two possible alternative folds of the RNA.

Analyzing the same sequence with `Sfold` gives similar results. `Sfold` finds three types of structures (three clusters), with probabilities 0.65, 0.22 and 0.13, respectively. One cluster contains the MFE structure corresponding to the folds at δ values from 0 to 9, another cluster has a centroid structure resembling the structures at δ between 20 and 24, and the third cluster has a centroid structure similar to the MFE^{19} structure. `RNAshapes`, on the other hand, is less successful for this example, since the alternative folds predicted by `RNAbor` have the same shape [], even though the folds are very different.

Figure 2 displays the MFE structure and the MFE^{30} structure of the 101 nt SAM riboswitch with EMBL accession number AP004597.1/118941-119041, with sequence taken from Rfam (17). The MFE structure over all 30-neighbors, the MFE^{30} structure, is clearly much closer to the real structure than the global MFE structure. Figure 3 displays the Boltzmann probability density, showing a peak for the value δ = 30.

## DISCUSSION

In this article, we have introduced the web server `RNAbor`, which computes the Boltzmann probability and MFE structure over all δ-neighbors for a given RNA sequence and initial secondary structure *S*. The underlying algorithms, described in the forthcoming paper (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), use dynamic programming, involve the Turner energy model (15,16), and require considerable time *O*(Δ · *n*^{3}) and space *O*(Δ · *n*^{2}) resources. Figures 2 and 3 illustrate the use of `RNAbor` in better understanding structural aspects of a SAM riboswitch, and indicate that `RNAbor` should provide a useful complementary tool to programs such as `Sfold` and `RNAshapes` for analyzing the ensemble of possible secondary structures on a given RNA sequence.

## ACKNOWLEDGEMENTS

Research of P.C. was partially supported by National Science Foundation DBI-0543506, which additionally supported some travel of E.F. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. All three authors would like to thank Elena Rivas, Eric Westhof and funding agencies for organizing the meeting RNA-2006 in Benasque, Spain, in July 2006, where some of this work was carried out. Finally, thanks to Jason Persampieri for some technical assistance. Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation.

*Conflict of interest statement*. None declared.

## REFERENCES

**Oxford University Press**

RNAbor: a web server for RNA structural neighbors. Nucleic Acids Research. 2007 Jul; 35(Web Server issue): W305.
