BMC Bioinformatics. 2006; 7: 122.
Published online Mar 9, 2006. doi:  10.1186/1471-2105-7-122
PMCID: PMC1421440

Selection of antisense oligonucleotides based on multiple predicted target mRNA structures

Abstract

Background

Local structures of target mRNAs play a significant role in determining the efficacies of antisense oligonucleotides (ODNs), but some structure-based target site selection methods are limited by uncertainties in RNA secondary structure prediction. If all the predicted structures of a given mRNA within a certain energy limit could be used simultaneously, target site selection would obviously be improved in both reliability and efficiency. In this study, some key problems in ODN target selection on the basis of multiple predicted target mRNA structures are systematically discussed.

Results

Two methods were considered for merging topologically different RNA structures into integrated representations. Several parameters were derived to characterize local target site structures. Statistical analysis on a dataset with 448 ODNs against 28 different mRNAs revealed 9 features quantitatively associated with efficacy. Features of structural consistency seemed to be more highly correlated with efficacy than indices of the proportion of bases in single-stranded or double-stranded regions. The local structures of the target site 5' and 3' termini were also shown to be important in target selection. Neural network efficacy predictors using these features, defined on integrated structures as inputs, performed well in "minus-one-gene" cross-validation experiments.

Conclusion

Topologically different target mRNA structures can be merged into integrated representations and then used in computer-aided ODN design. The results of this paper imply that some features characterizing multiple predicted target site structures can be used to predict ODN efficacy.

Background

Antisense oligonucleotides (ODNs) have served as powerful tools during the post-genome era. They provide an important approach to sequence-specific knockdown of gene expression, offering significant advantages over gene knockout techniques in respect of cost, time and resource requirements, and have therefore been widely used for determining gene function, validating drug targets and elucidating pathways [1,2]. ODNs also have potential as novel therapeutic agents for various diseases; several antisense compounds have been evaluated in clinical trials with promising results [3].

However, even with careful design, only a small proportion of ODNs against a given RNA effectively suppress the target gene in living cells [4]. It is commonly accepted that the identification of accessible sites in the target RNA is of great importance in designing ODNs. Various experimental approaches to the identification of promising local target sites have been described in recent years [5-10]. There has also been much interest in computational approaches to ODN design, which have advantages over experimental methods in terms of throughput, cost and efficiency. Several approaches to efficacy prediction have been proposed for rational selection of ODN target sites [11-14].

Among the factors that influence the activity of a given ODN, the local secondary structures of the target mRNA are very significant in determining in vitro efficiency [5,15-17] and are therefore particularly important in current ODN design strategies [18-20]. Local target site structures have also been used as the basis of rational design for other kinds of nucleic acid drugs such as antisense RNAs [21], catalytic RNAs [22] and ribozymes [23]. However, the term "structure" in these studies refers to a single computationally predicted structure, not the real structure of the target mRNA, because RNA secondary structure is difficult to determine experimentally.

Many RNA secondary structure prediction algorithms have been proposed during the past 20 years. Since the thermodynamically most stable structure of a molecule is generally the one with the minimum free energy (MFE), the initial aim of these prediction methods is to determine the MFE structure [24]. Several MFE structure searching algorithms have been described and are widely used in related research [25,26], especially in ODN target selection. However, partly because of the relatively low reliability of individual target mRNA structure predictions, researchers have often drawn inconsistent conclusions about favorable local structure motifs. The results obtained by Lima et al. [18] and Thierry et al. [19] indicated that single-stranded hairpin loops in RNA were the best target sites, whereas the studies by Laptev et al. [20] suggested that ODNs targeted to sequences predicted to form clustered double-stranded structures in RNA transcripts had the best potential.

It is also possible to consider conformations close to the energy minimum, and algorithms for calculating suboptimal structures within certain energy limits have been proposed [27,28]. The popular RNA secondary structure prediction program MFold now provides results over a range of free energies, mitigating the uncertainty of MFE prediction. Although multiple predicted structures are apparently more reliable, the MFE structure of the target mRNA is still used as the only structural basis in some ODN research. The main difficulty may lie in how to use these foldings simultaneously, since they can be topologically very different.

Studies on ensembles of target structures in ODN design date back to Jaroszewski et al. [29], who considered the 30 lowest-energy computer-simulated structures of rabbit β-globin mRNA qualitatively. In some thermodynamic models, multiple predicted target structures have been merged into the form of free energy [30,31]. The earliest work on computational ODN design based on the original forms of multiple predicted target mRNA structures was perhaps that of Patzel et al. [17]. Five structures with low energy were predicted and aligned for a given sequence stretch, and ODN sequences were chosen if potentially favorable local structural elements occurred in all five. In vitro experiments showed that this theoretical protocol increased the statistical probability of identifying local target sites accessible to ODN sequences [17,32]. Another way to explore the original forms of optimal and suboptimal mRNA structures simultaneously, which is probably more straightforward, is to merge them into a single-stranded probability profile (SSPP), P = {pi}, 1 ≤ i ≤ n, where pi is the probability that base i is single-stranded. Algorithms for predicting single-stranded regions in RNA secondary structures have long been of interest, since such regions play many important roles in RNA-RNA, RNA-DNA and RNA-protein interactions [33]. The SFold web server [34] can now directly output the SSPP of an RNA molecule instead of definite individual structures. Ding and Lawrence [33] presented a method for predicting accessible sites in the SSPP of rabbit β-globin mRNA, obtained by summing statistical samples of probable secondary structures. Their results showed a significant correlation between the predicted hybridization potential and the degree of inhibition of in vitro translation. Some researchers regard this method as the most successful [11,12].

The original RNA structural information is used in essentially different ways in the two methods described above. In the method based on structure alignment, favorable structural elements are identified by base pairing patterns, which can be illustrated as graphs. The role of secondary structures in this method is similar to their role in earlier studies of ODN design based on the target mRNA MFE structure. The success of this method relies mainly on the greatly increased reliability of structural elements. However, in the method based on SSPP, the RNA structures resemble a special time series rather than molecular "structures" in the usual sense. Base pairing patterns, or topological features, can hardly be explored in SSPP. The common ground between these two methods is the emphasis on the role of single-stranded regions in determining target accessibility. In the SSPP of rabbit β-globin mRNA, Ding and Lawrence found a significant correlation between the peak value of SSPP and the degree of inhibition of translation. The "well-characterized" single-stranded regions were revealed by high probability peaks in the profile [33], while in the systematic alignment of multiple predicted target mRNA secondary structures, large (>10 nt) consecutive sequence stretches not involved in base pairing were regarded as favorable structural motifs [17]. Since these two methods were only evaluated on a single target mRNA, further research is needed on a broad range of target genes.

The purpose of this article is to systematically explore the methods for computational selection of ODN target sites based on features defined in multiple predicted structures of the target mRNA. In our approach, the predicted mRNA structures were first merged into integrated representations. Efficacy-associated features were then screened from a set of features defined on these representations. The potential of neural networks for predicting efficacy on the basis of these features was also validated.

Results

Dataset

Three ODN databases have been reported: ODNBase [35], AOdb [12] and an unnamed database with experimental data from Isis Pharmaceuticals [36]. We have also developed a database named AOBase [37] (NAR molecular biology database collection entry number 781) for both the selection and design of ODNs. Currently, it stores 705 ODNs from the published literature tested against transcripts of 54 different target genes. Since no homogeneous database is publicly available, we perforce used a heterogeneous collection of measurements made by different researchers using different experimental techniques as our dataset. Four hundred and forty-eight ODNs against 28 different mRNAs were collected from AOBase to construct this dataset; 54.2% of them had been tested at the protein level and the others at the mRNA level. The data selection criteria were similar to those used in other ODN efficacy prediction studies [11-13]: (a) at least 4 ODNs were tested under the same experimental conditions; (b) ODN efficacies were presented as percentages of the control target gene expression level; (c) virus targets were excluded; (d) ODNs targeting the translational initiation site were excluded, since regions surrounding the initiation codon are generally considered to be free of secondary structure [38]. To keep in line with most of the research on drug design, the ODN efficacies in our dataset were transformed into [100%-(% of control expression)].

RNA folding calculation times have been greatly reduced in recent years because of faster computers and improved algorithms. The MFold web server [39] can now fold 6000 bases for a batch job, which meets the need of full-length mRNA structure prediction in most cases and is therefore used in this study. Because the number of predicted suboptimal RNA secondary structures increases exponentially as the folding energy increases [40], only structures within 5 percent of the computed minimum free energy were taken into consideration. The upper bound on the number of simultaneously predicted structures was set to 50 to avoid the high computational cost of long RNA sequences. These settings were the default settings of the MFold web server. Table 1 is a brief summary of the dataset.

Table 1
Summary of antisense target genes and their predicted structures used in this study

Integrating multiple predicted target mRNA secondary structures

In this study, two methods were used to represent the multiple predicted local structures of target sites synthetically. All the predicted local structures were first merged into an SSPP, which is easily calculated from the ss-count file in the MFold output. For a more illustrative representation of the multiple predicted structures, the SSPP was further transformed to a "single-stranded/pair/uncertain" sequence (SUP representation) S = {si}, where si = 'S' if base i is single-stranded, si = 'P' if base i is paired with another base, and si = 'U' if it is uncertain whether base i is single-stranded. The thresholds suggested by Ding and Lawrence [33] were used to map SSPP {pi} into the SUP representation {si}, giving

$$
s_i =
\begin{cases}
\text{'S'}, & p_i > 0.5 \\
\text{'U'}, & 0.5 \ge p_i > 0.2 \\
\text{'P'}, & p_i \le 0.2
\end{cases}
\qquad (1)
$$

The SUP representation loses a lot of structural information in comparison to a graphical illustration or the dot-parenthesis notation of RNA secondary structure and therefore cannot be used to explore the whole RNA structure. However, for RNA local structural analysis, especially of very short RNA regions, SUP gives a competent simplified representation. Figure 1 illustrates part of these two representations (101–130 nt) of rabbit β-globin mRNA structure.
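The mapping from per-base counts to an SSPP and then to a SUP string can be sketched as follows. This is a minimal illustration rather than the authors' program: the function names and the example counts are hypothetical, and the counts are assumed to be the number of predicted structures in which each base is unpaired, which is the kind of information the ss-count output is described as providing.

```python
def sspp_from_counts(ss_counts, n_structures):
    """Convert per-base single-stranded counts into probabilities p_i."""
    if n_structures <= 0:
        raise ValueError("n_structures must be positive")
    return [c / n_structures for c in ss_counts]


def sup_from_sspp(sspp, upper=0.5, lower=0.2):
    """Map each p_i to 'S', 'U' or 'P' using the thresholds of Equation 1."""
    symbols = []
    for p in sspp:
        if p > upper:
            symbols.append("S")   # single-stranded
        elif p > lower:
            symbols.append("U")   # uncertain
        else:
            symbols.append("P")   # paired
    return "".join(symbols)


# Hypothetical counts for a 6-base stretch folded into 10 structures.
counts = [10, 6, 5, 3, 2, 0]
profile = sspp_from_counts(counts, 10)
sup = sup_from_sspp(profile)
print(sup)
```

Note that the boundary values fall on the lower category: a base with p = 0.5 is 'U' and a base with p = 0.2 is 'P'.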

Figure 1
Two representations of multiple predicted structures of rabbit β-globin mRNA (G101-G130). (a) Single-stranded probability profile; (b) 'SUP' representation.

Selection of efficacy-associated features

The first important step in computational design based on multiple predicted mRNA structures is to find the efficacy-associated features in the SSPP and SUP representations of the target sites. Since the data structures of these two linear representations of multiple predicted structures are very different from graphical illustrations of RNA molecules, the topological features known to be correlated with efficacy must be redefined. However, new representations also afford opportunities to discover novel efficacy-associated features.

A set of features characterizing the local multiply-predicted target mRNA secondary structures was derived. Seven of these features were defined on the SSPP representation (listed in Table 2) while the other eleven were defined on the SUP sequence representation (listed in Table 3). The size of the local target, n, in the definition of features is equal to the length of the ODN.

Table 2
Parameters derived from the SSPP representation
Table 3
Parameters derived from the SUP sequence representation

The mean of all single stranded probabilities within a given target site, fmean, indicates the probability that the target site is single-stranded. The maximum value, fmax, has also been used for this purpose [33]. fimpulse, can be viewed as a relative peak value compared to the mean. The other statistics, frms, fpeak, fwave, and fdifference, describe the structural consistency of the target site.
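A small sketch of the simplest SSPP statistics may help. Only fmean and fmax follow directly from the description above; the ratio used here for fimpulse (maximum divided by mean) is an assumption based on the phrase "relative peak value compared to the mean", since the exact definitions are given in Table 2 and are not reproduced here. The function name and the example window are hypothetical.

```python
def sspp_window_features(window):
    """Summary statistics of the SSPP values inside one target site."""
    n = len(window)
    if n == 0:
        raise ValueError("target site must contain at least one base")
    f_mean = sum(window) / n          # overall single-strandedness
    f_max = max(window)               # peak single-stranded probability
    # Assumed definition: peak relative to the mean (0.0 if the site is fully paired).
    f_impulse = f_max / f_mean if f_mean > 0 else 0.0
    return {"f_mean": f_mean, "f_max": f_max, "f_impulse": f_impulse}


# Hypothetical 4-base target site.
features = sspp_window_features([0.25, 0.5, 1.0, 0.25])
print(features)
```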

Numerical features defined on the SUP sequence are directly derived from research results and from empirical rules about target site selection based on local structure. Features fNS, fNP, fPS, and fPP, give an overall description of target structure, while f5S, f5P, f3S and f3P emphasize the local structure of the target site termini. Factors fCS and fCP are derived to confirm whether the occurrence of consecutive subsequences in single-stranded or helical regions is correlated with efficacy, as explored by Patzel et al. [17].
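Some of the SUP-based quantities can be illustrated with a short sketch. The definitions below are assumptions inferred from the feature names (counts, proportions and longest consecutive runs of 'S' or 'P'); the authoritative definitions are those in Table 3, and the terminus features are deliberately left out because their exact form is not stated in the text.

```python
def longest_run(sup, symbol):
    """Length of the longest consecutive stretch of `symbol` in a SUP string."""
    best = current = 0
    for s in sup:
        current = current + 1 if s == symbol else 0
        best = max(best, current)
    return best


def sup_window_features(sup):
    """Assumed count, proportion and run-length features of one target site."""
    n = len(sup)
    if n == 0:
        raise ValueError("target site must contain at least one base")
    n_s = sup.count("S")
    n_p = sup.count("P")
    return {
        "f_NS": n_s,                    # number of single-stranded bases
        "f_NP": n_p,                    # number of paired bases
        "f_PS": n_s / n,                # proportion of single-stranded bases
        "f_PP": n_p / n,                # proportion of paired bases
        "f_CS": longest_run(sup, "S"),  # longest single-stranded stretch
        "f_CP": longest_run(sup, "P"),  # longest paired stretch
    }


# Hypothetical 8-base target site.
site = sup_window_features("SSUPPPSU")
print(site)
```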

Absolute numbers of bases appear in the definitions of eight features defined on the SUP representation, viz. fNS, fNP, fCS, fCP, f5S, f5P, f3S and f3P. Since the ODN lengths in the dataset are not uniform, it is necessary to determine whether these features are bound up with or limited by the size of local target. Figure 2(a) shows the distribution of ODN lengths in the dataset, which range from 10 nt to 22 nt. Most of the ODNs were 20 nt long. The dataset was divided into groups according to ODN length. The mean values of these features were calculated for each group and are shown in Figure 2(b), which indicates no obvious relationships between these features and target size.

Figure 2
The distribution of ODN length and length-limited features. (a) The distribution of ODN lengths in the dataset; (b) Mean values of some features of ODNs with different lengths.

Two types of indices, efficacy prediction potential and classification potency, were used to measure the suitability of these parameters for rational ODN design. The efficacy prediction potential was evaluated by calculating the correlation between the features and efficacy, using Pearson linear correlation, Spearman rank correlation and Kendall rank correlation. The classification potency was evaluated by exploring the performance of Fisher linear discriminators, using the feature as the single independent variable. The performance was measured as specificity $S_p = \frac{T_n}{T_n + F_p}$ and sensitivity $S_e = \frac{T_p}{T_p + F_n}$, where $T_p$, $T_n$, $F_p$ and $F_n$ are the numbers of true positives, true negatives, false positives and false negatives, respectively. Two different efficacy threshold values, 50% and 75%, were used to distinguish between positive and negative cases in our dataset, since these indices depend on the threshold. Features matching at least one of the following two criteria were selected as efficacy-associated: (a) statistically significant correlation (p < 0.05) with efficacy; and (b) high specificity (≥0.7) or high sensitivity (≥0.7) in distinguishing between active and inactive ODNs.
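The specificity and sensitivity calculation can be sketched as below. This is an illustration under stated assumptions, not the Fisher discriminator used in the study: a fixed cutoff on a single feature stands in for the discriminator, ODNs are treated as active when efficacy is at least the threshold (whether the comparison is strict is not stated), and all data values are made up.

```python
def confusion_counts(predicted, actual):
    """Return (Tp, Tn, Fp, Fn) for two equal-length lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return tp, tn, fp, fn


def specificity(tn, fp):
    """Sp = Tn / (Tn + Fp)."""
    return tn / (tn + fp)


def sensitivity(tp, fn):
    """Se = Tp / (Tp + Fn)."""
    return tp / (tp + fn)


# Hypothetical efficacies (%) and values of one structural feature.
efficacy = [80, 30, 60, 10, 90, 40]
feature = [0.9, 0.2, 0.4, 0.6, 0.8, 0.1]

actual = [e >= 50 for e in efficacy]      # assumed 50% efficacy threshold
predicted = [f >= 0.5 for f in feature]   # hypothetical feature cutoff

tp, tn, fp, fn = confusion_counts(predicted, actual)
sp = specificity(tn, fp)
se = sensitivity(tp, fn)
print(tp, tn, fp, fn, sp, se)
```

Repeating the calculation with a 75% efficacy threshold changes which ODNs count as positive, which is why both thresholds are reported.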

The correlation between parameters and efficacy is presented in Table 4. Only four features defined on SSPP, i.e. frms, fmax, fpeak and fdifference, correlated significantly with efficacy. Table 5 compares the Fisher discrimination results for each parameter and different thresholds, indicating that frms, fmax, fpeak, fdifference, fPP, fCS, fCP, f5S and f3S can be used to distinguish between active and inactive ODNs according to our criteria.

Table 4
Correlations between features and efficacy
Table 5
Performance of Fisher linear discriminators for each parameter

The most noteworthy finding is that ODN efficacy seems not to rely greatly on the degree of single-strandedness in its target site, as suggested in previous publications [18-20], since fmean, fNS and fPS show neither sufficient correlation with efficacy nor good performance in identifying active ODNs. The lengths of consecutive single-stranded regions in the target site, which are characterized by fCS, prove useful for identifying active ODNs. This result is partly consistent with the conclusion drawn by Patzel et al. [17]. In contrast to the conclusion of Ding and Lawrence [33], although fmax is revealed to be efficacy-associated, the peak value of the target site SSPP correlates negatively with efficacy.

The helical region in the target site appears to be more important, as suggested by Laptev et al. [20], because features fPP and fCP satisfy our selection criteria for ODN classification. The analysis indicates that the structural consistency features, frms, fpeak and fdifference, are more important in target site selection than indices of the proportion of single-stranded or double-stranded bases. However, this should not be interpreted as implying simple correspondences between structural consistency and efficacy.

ODN efficacy may be closely associated with the local structures of the 5' and 3' termini of the target sites. Fisher classifiers using factors f5S and f3S gave high specificity or sensitivity in ODN discrimination.

Although some features are efficacy-associated, the relationship between structural factors and efficacy is highly complex. No single feature has been found to correlate highly with efficacy, and no feature is reliable on its own for distinguishing active from inactive ODNs. Two feature sets defined on the SSPP and SUP representations of the target site are selected as inputs of efficacy-predicting neural networks: FSSPP = {frms, fmax, fpeak, fdifference} and FSUP = {fPP, fCS, fCP, f5S, f3S}.

Efficacy prediction using neural networks

To assess the ability of selected features to predict efficacy, two neural network models were constructed, one for features defined on the SSPP and the other for features derived from the SUP sequence representation of the target structure.

Previous studies have shown that cross-validation is important for estimating accuracy [11-14]. Since ODNs always have similar properties if they are near each other on the same gene or are measured in the same study, the network training process should be completely independent of the test data [12,13]. In this research, cross-validation was done by the "minus-one-gene" (-gene) [13] approach. ODNs targeting 8 mRNAs (listed in Table 6) were selected alternately from the dataset for testing, while the remainder, assayed in the same studies, were used as the training set. The test mRNA selection criteria were: (a) more than 15 different target sites were tested; (b) the efficacy of at least one ODN was greater than 75%.
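The "minus-one-gene" idea can be sketched as a simple data split: every ODN against one test mRNA is held out, and the networks are trained on the rest. The record layout (dictionaries with `gene` and `efficacy` keys), the gene names and the function name are hypothetical, and the sketch splits by gene only; any additional filtering by study would need an extra field.

```python
def minus_one_gene_splits(records, test_genes):
    """Yield (gene, training set, test set) with all ODNs of `gene` held out."""
    for gene in test_genes:
        test = [r for r in records if r["gene"] == gene]
        train = [r for r in records if r["gene"] != gene]
        yield gene, train, test


# Hypothetical records: two ODNs against geneA, one each against geneB and geneC.
records = [
    {"gene": "geneA", "efficacy": 82.0},
    {"gene": "geneA", "efficacy": 15.0},
    {"gene": "geneB", "efficacy": 60.0},
    {"gene": "geneC", "efficacy": 40.0},
]

splits = list(minus_one_gene_splits(records, ["geneA", "geneB"]))
for gene, train, test in splits:
    print(gene, len(train), len(test))
```

Because no ODN against the held-out gene appears in the training set, neighbouring target sites on the same mRNA cannot leak information into the test result.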

Table 6
Dataset for cross-validation experiments

Sixteen neural networks for efficacy prediction were tested in our cross-validation experiments. The network group NSSPP (NSSPP1~NSSPP8) took FSSPP as inputs, and the NSUP group (NSUP1~NSUP8) took FSUP as the input parameter set. The outputs of all these networks met the condition of convergence within 100 training cycles.

Several methods have been used to measure the accuracy of ODN predictors [11-14]. To obtain well-rounded assessments for the aforementioned neural networks, two different types of indices were computed: (1) specificity Sp, sensitivity Se and accuracy $Acc = \frac{T_n + T_p}{T_p + T_n + F_p + F_n}$, calculated using fixed threshold values, as mentioned above in the account of feature selection; (2) the receiver operating characteristics (ROC) curve [41], which is a plot of Se versus 1 - Sp at different thresholds. The ROC area was calculated as a quantitative indicator of the ability of the network to classify. The cutoff efficacy value used to distinguish positive from negative ODNs in the cross-validation test was 75%.
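A ROC curve and its area can be computed from predicted scores and true labels as sketched below. This is a generic illustration with made-up scores, not the evaluation code used in the study; the area is obtained with the trapezoidal rule, which is one common choice.

```python
def roc_points(scores, labels):
    """Return (1 - Sp, Se) points obtained by lowering the decision threshold."""
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        points.append((fp / neg, tp / pos))
    return points


def roc_area(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical network outputs and true classes (1 = efficacy above the cutoff).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = roc_area(curve)
print(area)
```

An area of 0.5 corresponds to random ranking and 1.0 to perfect separation of active and inactive ODNs.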

The performances of the neural networks are listed in Table 7. The specificities, Sp, of all the networks in these two groups are greater than the related sensitivities, Se. This performance is beneficial for ODN design, since users will only be interested in candidates with high predicted efficacy in practical applications [14]. The ROC curves of the 16 networks tested on ODNs targeting 8 different mRNAs are shown in Figure 3. The best ROC curve areas were obtained in cross-validation experiment 7 (networks NSSPP7 and NSUP7), which used the data from Matveeva et al. [6] as the test set. The average ROC area for NSSPP is 0.77. The average for NSUP is 0.73, which is slightly lower.

Figure 3
ROC curves for efficacy-predicting neural networks. ROC curves are shown for networks (a) NSSPP1 and NSUP1; (b) NSSPP2 and NSUP2; (c) NSSPP3 and NSUP3; (d) NSSPP4 and NSUP4; (e) NSSPP5 and NSUP5; (f) NSSPP6 and NSUP6; (g) NSSPP7 and NSUP7; (h) NSSPP8 ...
Table 7
The performances of two groups of networks in cross-validation experiments

Discussion

Compared with most other bioinformatics research problems, studies on computer-aided ODN design are far from "data rich". Moreover, the data collected from the published literature are variable owing to the diversity of experimental methods. To provide a more reliable basis for feature-mining and predictor development, one focus of future work will be on enlargement of the dataset. A large dataset with quality control will make the analysis and cross-validation of grouped homogeneous subsets possible, and therefore make the ODN design systems more reliable.

Another "data poor" limitation in our study and related research [6,17,29] is that not all possible target RNA structures are taken into account. As pointed out by Mathews, an ideal way to integrate the predicted RNA structures would be to compute a partition function, which sums the contributions of all structures weighted by their Boltzmann probabilities [44]. However, the determination of a partition function has $O(N^3)$ computational complexity [45], so this method is practicable only for short RNA sequences. Several studies have been done on the estimation of the partition function with lower computational cost [44,46-48]. The Vienna RNA secondary structure prediction server [49] can now compute the partition function of RNA up to 5000 bases for batch jobs. One implication of this study that warrants further investigation is ODN design using the partition function of the target mRNA, which is based on more reliable structural information.
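To make the idea of Boltzmann weighting concrete, the sketch below weights a small, explicit list of structures by exp(-ΔG/RT) and derives a weighted single-stranded probability for each base. It is only an illustration: a true partition function sums over all possible structures, whereas this sum runs over the listed ones, and the structures, energies (in kcal/mol) and function names are hypothetical. Structures are written in dot-parenthesis notation, with '.' marking an unpaired base.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_weights(energies, temperature_k=310.15):
    """Normalised Boltzmann weights for a list of free energies (kcal/mol)."""
    rt = R_KCAL * temperature_k
    e_min = min(energies)  # shift energies for numerical stability
    factors = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(factors)       # restricted to the listed structures only
    return [f / z for f in factors]


def weighted_sspp(structures, energies):
    """Boltzmann-weighted probability that each base is unpaired."""
    weights = boltzmann_weights(energies)
    n = len(structures[0])
    return [sum(w for s, w in zip(structures, weights) if s[i] == ".")
            for i in range(n)]


# Two hypothetical structures of a 6-base sequence.
structures = ["((..))", "......"]
energies = [-2.0, -1.0]

weights = boltzmann_weights(energies)
profile = weighted_sspp(structures, energies)
print(weights, profile)
```

In contrast to simple counting, the lower-energy structure contributes more to each probability than the higher-energy one.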

The factors influencing the potential of an ODN are complex and so far poorly understood. Although this paper focuses on the relationship between ODN efficacy and target site structure, we do not ignore other factors that have been shown to influence efficacy, such as chemical properties, DNA-RNA duplex stability, sequence motifs and metabolic properties of the target mRNA [4]. We believe that the more factors are considered in ODN efficacy prediction, the more reliable target site selection becomes.

Conclusion

This paper presents a method, based on multiple predicted target mRNA structures, for reducing the uncertainty of structure prediction in ODN design. Several efficacy-associated features characterizing the integrated structure of the target site have been discovered. The structural consistency features of the target seem to be correlated with efficacy. In contrast, some features of favorable ODN targets reported in previous research, which emphasized single-stranded regions, were found to correlate weakly with efficacy. In addition, the local structures of the 5' and 3' termini were shown to be important in target site selection.

Neural network efficacy predictors using features defined on integrated structures as inputs have been shown to perform well, implying that these features can also be used for other forms of efficacy prediction such as Bayesian statistics (BS), multiple linear regression (MLR), decision tree (DT) and support vector machine (SVM).

Methods

After preliminary experiments, a feed-forward network architecture with a hidden layer containing 20 nodes was applied to each network. The input neurons used a hyperbolic tangent sigmoid (tan-sigmoid) activation function; the output neurons used a logarithmic sigmoid (log-sigmoid) activation function. The weights and bias values of the networks were updated according to the Levenberg-Marquardt optimization algorithm [42], which appears to be the fastest method for training a moderate-size feed-forward neural network [43]. Matlab® Neural Network Toolbox 4.0.3 was used for all neural network implementation.
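For readers unfamiliar with the two activation functions named above, the following sketch gives their standard definitions: the log-sigmoid maps any input to the interval (0, 1), and the tan-sigmoid maps it to (-1, 1). The Python function names mirror the Matlab names `logsig` and `tansig`; the code is an illustration only and makes no claim about how the networks in this study were wired or trained.

```python
import math


def logsig(x):
    """Logarithmic sigmoid: 1 / (1 + exp(-x)), with output in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow for large negative x
    return z / (1.0 + z)


def tansig(x):
    """Hyperbolic tangent sigmoid: tanh(x), with output in (-1, 1)."""
    return math.tanh(x)


for value in (-2.0, 0.0, 2.0):
    print(value, logsig(value), tansig(value))
```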

Authors' contributions

SW guided the project. XB and SW conceived of the study. XB wrote the program, analyzed the results and drafted the manuscript. SL and DS helped in dataset construction. WS and JY helped in analysis and discussion, and gave useful comments.

Acknowledgements

This work was supported by grants from the National Natural Science Foundation of China (No. 30171111), the National High Technology Research and Development Program of China (863 Program) (No. 2003AA234031) and the Special Funds for Major State Basic Research Program of China (973 Program) (No. 2004CB518904).

References

  • Taylor MF, Wiederholt K, Svetdrup F. Antisense oligonucleotides: a systematic high-throughput approach to target validation and gene function determination. Drug Discov Today. 1999;4:562–567. doi: 10.1016/S1359-6446(99)01392-6. [PubMed] [Cross Ref]
  • Flaherty KT, Stevenson JP, O'Dwyer PJ. Antisense therapeutics: lessons from early clinical trials. Curr Opin Oncol. 2001;13:499–505. doi: 10.1097/00001622-200111000-00013. [PubMed] [Cross Ref]
  • Crooke ST. An overview of progress in antisense therapeutics. Antisense Nucleic Acid Drug Dev. 1998;8:115–122. [PubMed]
  • Far RK, Nedbal W, Sczakiel G. Concepts to automate the theoretical design of effective antisense oligonucleotides. Bioinformatics. 2001;17:1058–1061. doi: 10.1093/bioinformatics/17.11.1058. [PubMed] [Cross Ref]
  • Ho SP, Bao Y, Lesher T, Malhotra R, Ma LY, Fluharty SJ, Sakai RR. Mapping of RNA accessible sites for antisense experiments with oligonucleotide libraries. Nature Biotechnology. 1998;16:59–63. doi: 10.1038/nbt0198-59. [PubMed] [Cross Ref]
  • Matveeva OV, Felden B, Tsodikov A, Johnston J, Monia BP, Atkins JF, Gesteland RF, Freier SM. Prediction of antisense oligonucleotide efficacy by in vitro methods. Nature Biotechnology. 1998;16:1374–1375. doi: 10.1038/4362. [PubMed] [Cross Ref]
  • Matveeva O, Felden B, Audlin S, Gesteland RF, Atkins JF. A rapid in vitro method for obtaining RNA accessibility patterns for complementary DNA probes: correlation with an intracellular pattern and known RNA structures. Nucleic Acids Res. 1997;25:5010–5016. doi: 10.1093/nar/25.24.5010. [PMC free article] [PubMed] [Cross Ref]
  • Milner N, Mir KU, Southern EM. Selecting effective antisense reagents on combinatorial oligonucleotide arrays. Nature Biotechnology. 1997;15:537–541. doi: 10.1038/nbt0697-537. [PubMed] [Cross Ref]
  • Allawi HT, Dong F, Ip HS, Neri BP, Lyamichev VI. Mapping of RNA accessible sites by extension of random oligonucleotide libraries with reverse transcriptase. RNA. 2001;7:314–327. doi: 10.1017/S1355838201001698. [PMC free article] [PubMed] [Cross Ref]
  • Zhang HY, Modn J, Zhou D, Xu Y, Thonberg H, Liang Z, Wahlestedt C. mRNA accessibility site tagging (MAST): a novel high throughput method for selecting effective antisense oligonucleotides. Nucleic Acid Res. 2003;31:e72. doi: 10.1093/nar/gng072. [PMC free article] [PubMed] [Cross Ref]
  • Camps-Valls G, Chalk AM, Serrano-Lopez A, Martin-Guerrero JD, Sonnhammer ELL. Profiled support vector machine for antisense oligonucleotide efficacy prediction. BMC Bioinformatics. 2004;5:135. doi: 10.1186/1471-2105-5-135. [PMC free article] [PubMed] [Cross Ref]
  • Chalk AM, Sonnhammer ELL. Computational antisense oligo prediction with a neural network model. Bioinformatics. 2002;18:1567–1575. doi: 10.1093/bioinformatics/18.12.1567. [PubMed] [Cross Ref]
  • Giddings MC, Shah AA, Freier S, Atkins JF, Gesteland RF, Matveeva OV. Artificial neural network prediction of antisense oligodeoxynucleotide activity. Nucleic Acids Res. 2002;30:4295–4304. doi: 10.1093/nar/gkf557. [PMC free article] [PubMed] [Cross Ref]
  • Sætrom P. Predicting the efficacy of short oligonucleotides in antisense and RNAi experiments with boosted genetic programming. Bioinformatics. 2004;20:3055–3063. doi: 10.1093/bioinformatics/bth364. [PubMed] [Cross Ref]
  • Scherer LJ, Rossi JJ. Approaches for the sequence-specific knockdown of mRNA. Nature Biotechnology. 2003;21:1457–1465. doi: 10.1038/nbt915. [PubMed] [Cross Ref]
  • Vickers TA, Wyatt JR, Freier SM. Effects of RNA secondary structure on cellular antisense activity. Nucleic Acids Res. 2000;28:1340–1347. doi: 10.1093/nar/28.6.1340. [PMC free article] [PubMed] [Cross Ref]
  • Patzel V, Steidl U, Kronenwett R, Haas R, Sczakiel G. A theoretical approach to select effective antisense oligodeoxyribonucleotides at high statistical probability. Nucleic Acids Res. 1999;27:4328–4334. doi: 10.1093/nar/27.22.4328. [PMC free article] [PubMed] [Cross Ref]
  • Lima WF, Monia BP, Ecker DJ, Freier SM. Implication of RNA structure on antisense oligonucleotide hybridization kinetics. Biochemistry. 1992;31:12055–12061. doi: 10.1021/bi00163a013. [PubMed] [Cross Ref]
  • Thierry AR, Rahman A, Dritschilo A. Overcoming multidrug resistance in human tumor cells using free and liposomally encapsulated antisense oligodeoxynucleotides. Biochem Biophys Res Commun. 1993;190:952–960. doi: 10.1006/bbrc.1993.1142. [PubMed] [Cross Ref]
  • Laptev AV, Lu Z, Colige A, Prockop DJ. Specific inhibition of expression of a human collagen gene (COL1A1) with modified antisense oligonucleotides. Biochemistry. 1994;33:11033–11039. doi: 10.1021/bi00202a024. [PubMed] [Cross Ref]
  • Sczakiel G, Homann M, Rittner K. Computer-aided search for effective antisense RNA target sequences of the human immunodeficiency virus type 1. Antisense Res Dev. 1993;3:45–52. [PubMed]
  • Denman RB. Using RNAFOLD to predict the activity of small catalytic RNAs. Biotechniques. 1993;15:1090–1095. [PubMed]
  • James W, Cowe E. Computational approaches to the identification of ribozyme target sites. Methods Mol Biol. 1997;74:17–26. [PubMed]
  • Higgs PG. RNA secondary structure: physical and computational aspects. Quarterly Reviews of Biophysics. 2000;33:199–253. doi: 10.1017/S0033583500003620. [PubMed] [Cross Ref]
  • Zuker M, Stiegler P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 1981;9:133–148. [PMC free article] [PubMed]
  • Dumas JP, Ninio J. Efficient algorithms for folding and comparing nucleic acid sequences. Nucleic Acids Res. 1982;10:197–206. [PMC free article] [PubMed]
  • Zuker M. On finding all suboptimal foldings of an RNA molecule. Science. 1989;244:48–52. [PubMed]
  • Yamamoto K, Kitamura Y, Yoshikura H. Computation of statistical secondary structure of nucleic acids. Nucleic Acids Res. 1984;12:335–346. [PMC free article] [PubMed]
  • Jaroszewski JW, Syi JL, Ghosh M, Ghosh K, Cohen JS. Targeting of antisense DNA: comparison of activity of anti-rabbit beta-globin oligodeoxyribonucleoside phosphorothioates with computer predictions of mRNA folding. Antisense Res Dev. 1993;3:339–348. [PubMed]
  • Walton SP, Stephanopoulos GN, Yarmush ML, Roth CM. Thermodynamic and kinetic characterization of antisense oligodeoxynucleotide binding to a structured mRNA. Biophys J. 2002;82:366–377. [PMC free article] [PubMed]
  • Mathews DH, Burkard ME, Freier SM, Wyatt JR, Turner DH. Predicting oligonucleotide affinity to nucleic acid targets. RNA. 1999;5:1458–1469. doi: 10.1017/S1355838299991148. [PMC free article] [PubMed] [Cross Ref]
  • Scherr M, Rossi JJ, Sczakiel G, Patzel V. RNA accessibility prediction: a theoretical approach is consistent with experimental studies in cell extracts. Nucleic Acids Res. 2000;28:2455–2461. doi: 10.1093/nar/28.13.2455. [PMC free article] [PubMed] [Cross Ref]
  • Ding Y, Lawrence CE. Statistical prediction of single-stranded regions in RNA secondary structure and application to predicting effective antisense target sites and beyond. Nucleic Acids Res. 2001;29:1034–1046. doi: 10.1093/nar/29.5.1034. [PMC free article] [PubMed] [Cross Ref]
  • Ding Y, Chan CY, Lawrence CE. Sfold web server for statistical folding and rational design of nucleic acids. Nucleic Acids Res. 2004;32:W135–W141. [PMC free article] [PubMed]
  • Giddings MC, Matveeva OV, Atkins JF, Gesteland RF. ODNBase – a web database for antisense oligonucleotide effectiveness studies. Bioinformatics. 2000;16:843–844. doi: 10.1093/bioinformatics/16.9.843. [PubMed] [Cross Ref]
  • Matveeva OV, Mathews DH, Tsodikov AD, Shabalina SA, Gesteland RF, Atkins JF, Freier SM. Thermodynamic criteria for high hit rate antisense oligonucleotide design. Nucleic Acids Res. 2003;31:4989–4994. doi: 10.1093/nar/gkg710. [PMC free article] [PubMed] [Cross Ref]
  • AOBase http://www.bioit.org.cn/ao/aobase
  • Sohail M, Southern EM. Selecting optimal antisense reagents. Advanced Drug Delivery Reviews. 2000;44:23–34. doi: 10.1016/S0169-409X(00)00081-8. [PubMed] [Cross Ref]
  • Zuker M. Mfold web server for nucleic acid folding and hybridization. Nucleic Acids Res. 2003;31:3406–3415. doi: 10.1093/nar/gkg595. [PMC free article] [PubMed] [Cross Ref]
  • Wuchty S, Fontana W, Hofacker IL, Schuster P. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers. 1999;49:145–165. doi: 10.1002/(SICI)1097-0282(199902)49:2<145::AID-BIP4>3.0.CO;2-G. [PubMed] [Cross Ref]
  • Hanley J, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36. [PubMed]
  • Hagan MT, Menhaj M. Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks. 1994;5:989–993. doi: 10.1109/72.329697. [PubMed] [Cross Ref]
  • Demuth H, Beale M. Neural Network Toolbox. MathWorks Inc.; 2004.
  • Mathews DH. Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization. RNA. 2004;10:1178–1190. doi: 10.1261/rna.7650904. [PMC free article] [PubMed] [Cross Ref]
  • McCaskill JS. The equilibrium partition function and base pair probabilities for RNA secondary structure. Biopolymers. 1990;29:1105–1119. doi: 10.1002/bip.360290621. [PubMed] [Cross Ref]
  • Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P. Fast folding and comparison of RNA secondary structures. Monatsh Chem. 1994;125:167–188. doi: 10.1007/BF00818163. [Cross Ref]
  • Fekete M, Hofacker IL, Stadler PF. Prediction of RNA base pairing probabilities on massively parallel computers. J Comput Biol. 2000;7:171–182. doi: 10.1089/10665270050081441. [PubMed] [Cross Ref]
  • Ding Y, Lawrence CE. A Bayesian statistical algorithm for RNA secondary structure prediction. Comput Chem. 1999;23:387–400. doi: 10.1016/S0097-8485(99)00010-8. [PubMed] [Cross Ref]
  • Hofacker IL. Vienna RNA secondary structure server. Nucleic Acids Res. 2003;31:3429–3431. doi: 10.1093/nar/gkg599. [PMC free article] [PubMed] [Cross Ref]

Articles from BMC Bioinformatics are provided here courtesy of BioMed Central
