BMC Bioinformatics. 2006; 7(Suppl 5): S20.
Published online 2006 Dec 18. doi:  10.1186/1471-2105-7-S5-S20
PMCID: PMC1764477

MicroTar: predicting microRNA targets from RNA duplexes



The accurate prediction of a comprehensive set of messenger RNAs (targets) regulated by animal microRNAs (miRNAs) remains an open problem. In particular, the prediction of targets that do not possess evolutionarily conserved complementarity to their miRNA regulators is not adequately addressed by current tools.


We have developed MicroTar, an animal miRNA target prediction tool based on miRNA-target complementarity and thermodynamic data. The algorithm uses predicted free energies of unbound mRNA and putative mRNA-miRNA heterodimers, implicitly addressing the accessibility of the mRNA 3' untranslated region. MicroTar does not rely on evolutionary conservation to discern functional targets, and is able to predict both conserved and non-conserved targets. MicroTar source code and predictions are accessible at http://tiger.dbs.nus.edu.sg/microtar/, where both serial and parallel versions of the program can be downloaded under an open-source licence.


MicroTar achieves better sensitivity than previously reported predictions when tested on three distinct datasets of experimentally-verified miRNA-target interactions in C. elegans, Drosophila, and mouse.


MicroRNAs (miRNAs) are a class of endogenous, small regulatory RNAs, averaging 22 nucleotides in length, that mediate the post-transcriptional regulation of messenger RNAs. They bind to target messages in a sequence-specific manner, and induce translational repression or endonucleolytic cleavage. The first two miRNAs, lin-4 and let-7, were discovered some seven years apart in the worm C. elegans, in genetic screens for mutants with disrupted developmental timing [1,2]. There has since been an explosion of interest in the field, and the identification of hundreds of miRNAs in vertebrates, arthropods, nematodes, and even viruses [3] has established miRNAs as pervasive regulators of gene expression. For recent reviews, see [4-6].

Functions have been experimentally assigned to only a small fraction of the few thousand known miRNAs [7]. Of the experimental strategies available to investigate miRNA function, stringent genetic tests that link miRNA loss-of-function mutants to misregulated targets, and point mutations in miRNA binding sites to specific phenotypes, are impractical on a genomic scale in any animal species [8]. Tissue-culture assays using reporter gene constructs fused to target sequences are an easier alternative, but their reliance on ectopic miRNA expression harbours the danger of measuring what may be a nonphysiological interaction between two molecules with complementary surfaces [9]. Computational approaches are thus likely to remain an important means of studying miRNA targets for the foreseeable future, not least as a means of directing wet-lab experiments. These predictions are hampered by the fact that animal miRNAs, in contrast to plant miRNAs, tend to be only partially complementary to their target mRNAs. This fact, compounded by the small size of these molecules, precludes the use of standard sequence comparison methods.

Several algorithms have been developed to predict miRNA targets in animal species; these are listed in Table 1. A common strategy in several of these programs is to rank target 3' untranslated region (UTR) complementarity by some combination of duplex free energy and/or pairing requirements at the 5' end (seed region) of the miRNA [8]. For instance, TargetScan [10] combines requirements for conserved perfect Watson-Crick pairing at positions 2–8 of the miRNA with estimates of the free energy of isolated miRNA-target site interactions, ignoring initiation free energy. While in vitro tests have shown sites containing G:U base-pairs to be functional but impaired [11], recent in vivo experiments have demonstrated them to be efficiently downregulated [9]. Taken together with the presence of a G:U base-pair in the seed region of a functional let-7 binding site in the lin-41 3'-UTR [12], these results make a case for the inclusion of seeds with G:U wobbles in target prediction algorithms.

Table 1
miRNA target prediction tools. A list of current miRNA target prediction tools, with access details. Note that only RNAHybrid and miRanda provide source code for download.

The PicTar [13,14] algorithm defines seeds as heptamers with Watson-Crick or G:U pairings at positions 1–7 or 2–8 from the miRNA 5' end. It combines seed searches with RNA duplex free energy filters, evolutionary conservation requirements, and a probabilistic scoring mechanism to predict targets that are under combinatorial control by co-expressed miRNAs. However, it makes use of RNAHybrid [15], an algorithm that approximates RNA duplex free energies by discarding intramolecular hybridizations in order to achieve linear time complexity.

Robins et al. [16] incorporate mRNA secondary structure computed from 3'-UTRs in their target prediction algorithm, but require perfect Watson-Crick complementarity in the seed site. Furthermore, the use of isolated 3'-UTRs is likely to produce structures very different from the structure of 3'-UTRs in folds that use complete mRNA sequences.

While most of the tools listed in Table 1 are accessible as web services, only miRanda [17] and RNAHybrid are available as downloadable software that can be modified, extended and run on custom datasets. Most listed algorithms also rely on target conservation across two or more species as a filter. While this is necessary to distinguish functional targets from a vast array of candidates, it results in the unavoidable omission of real targets that are not thus conserved.

Here we present MicroTar, an miRNA target prediction program that does not rely on evolutionary conservation. Through the use of the partial complementarity of miRNAs to their target messages, and the predicted free energy of complete mRNA molecules, we are able to address the problem of the prediction of targets that are not conserved across different genomes. Moreover, harnessing the power of parallel computing obviates the need for introducing approximations that discard intramolecular base pairs in estimates of miRNA-mRNA duplex free energy; we thus implicitly incorporate the accessibility of 3'-UTRs in the algorithm. MicroTar source code – available under an open-source licence – and predictions can be accessed at the MicroTar website [18].



The MicroTar algorithm is based on the following assumptions:

• miRNA target specificity is determined by a heptameric seed sequence (beginning at the first or second position from the 5' end of the miRNA) that is complementary to sites in mRNA 3'-UTRs

• targets are functional if miRNA-mRNA duplex formation is energetically favourable

Beginning with a set of fasta-formatted query (miRNA) sequences and target (mRNA) sequences, the MicroTar algorithm predicts the minimum free energy of each mRNA molecule, searches for seed sites, and performs a constrained fold where each seed match is, in turn, bound in the miRNA-mRNA heterodimer; the output is a list of putative duplexes more stable than free mRNA, along with images of bound and unbound mRNA secondary structure. This result is subsequently subjected to a statistical analysis to determine the significance of each miRNA-mRNA match. Figure 1 presents a schematic overview of this algorithm.

Figure 1
MicroTar algorithm. Beginning with a set of fasta-formatted query (miRNA) sequences and target (mRNA) sequences, the MicroTar algorithm predicts the minimum free energy of each mRNA molecule, searches for seed sites, and performs a constrained fold ...

Secondary structure prediction

The secondary structure and minimum free energy of the complete unbound mRNA molecule are predicted using the fold routine from the RNAlib library of the ViennaRNA package [19]. This is an implementation of the Zuker & Stiegler dynamic programming algorithm [20]. We denote the predicted free energy of unbound mRNA as G1.

Seed search

Loss-of-function mutation studies have demonstrated the core of miRNA sequence specificity to be a heptameric seed sequence [11], which we define as nucleotides 1–7 or 2–8 at the 5' end of the miRNA. MicroTar searches each mRNA 3'-UTR (or the complete mRNA in the absence of annotations) for sites with Watson-Crick or G:U wobble complementarity to this seed sequence; we refer to these hits as seed matches.
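The seed search can be illustrated with a minimal Python sketch. This is not the MicroTar implementation (which is written in C); the function names and the returned coordinate convention are assumptions made for illustration. Pairing is checked in antiparallel orientation, so the first seed base pairs with the last base of the candidate site.

```python
# Bases that each miRNA base may pair with: Watson-Crick plus G:U wobble.
PAIRS = {
    "A": {"U"},
    "C": {"G"},
    "G": {"C", "U"},
    "U": {"A", "G"},
}


def seeds(mirna, length=7):
    """Return (0-based offset, seed) for seeds at miRNA positions 1-7 and 2-8."""
    mirna = mirna.upper().replace("T", "U")
    return [
        (start, mirna[start:start + length])
        for start in (0, 1)
        if len(mirna) >= start + length
    ]


def seed_matches(mirna, target):
    """List (seed_start, site_start) pairs.

    seed_start is the 1-based miRNA position where the seed begins (1 or 2);
    site_start is the 0-based index in the target where the matching site begins.
    """
    target = target.upper().replace("T", "U")
    hits = []
    for offset, seed in seeds(mirna):
        k = len(seed)
        for pos in range(len(target) - k + 1):
            site = target[pos:pos + k]
            # Antiparallel pairing: seed base i faces site base k-1-i.
            if all(site[k - 1 - i] in PAIRS.get(seed[i], ()) for i in range(k)):
                hits.append((offset + 1, pos))
    return hits


if __name__ == "__main__":
    let7 = "UGAGGUAGUAGGUUGUAUAGUU"
    print(seed_matches(let7, "AAAACUACCUCAAAA"))
```

Bases outside A, C, G and U never match in this sketch; whether MicroTar treats ambiguous bases the same way is not stated in the text.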

Constrained fold

For each seed match above, the mRNA is again folded under the constraint that the miRNA seed is bound to its corresponding match. This uses the cofold [21] routine from the RNAlib library. We denote the free energy of the duplex as G2.
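The text does not say how the seed constraint is encoded. As a hedged illustration only, the sketch below builds dot-bracket style constraint strings of the kind that folding programs such as ViennaRNA accept, where matching brackets force a base pair and a dot leaves a position unconstrained. The function name, its arguments, and the choice to return the mRNA and miRNA halves separately are all assumptions; joining the halves for a particular folding routine is left to the caller.

```python
def seed_constraint(mrna_len, mirna_len, site_start, seed_offset, seed_len=7):
    """Return (mrna_constraint, mirna_constraint) forcing the seed to pair.

    site_start  : 0-based start of the seed match in the mRNA
    seed_offset : 0-based start of the seed in the miRNA (0 or 1)
    The mRNA site gets '(' and the miRNA seed gets ')', so that when the mRNA
    half is written before the miRNA half the brackets nest as antiparallel pairs.
    """
    if site_start < 0 or site_start + seed_len > mrna_len:
        raise ValueError("seed match does not fit inside the mRNA")
    if seed_offset < 0 or seed_offset + seed_len > mirna_len:
        raise ValueError("seed does not fit inside the miRNA")

    mrna = ["."] * mrna_len
    mirna = ["."] * mirna_len
    for k in range(seed_len):
        mrna[site_start + k] = "("
        mirna[seed_offset + k] = ")"
    return "".join(mrna), "".join(mirna)


if __name__ == "__main__":
    m, q = seed_constraint(15, 22, site_start=4, seed_offset=1)
    print(m)
    print(q)
```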


The output is a list of all seed matches, along with predicted energies of the unbound mRNA (G1), putative mRNA-miRNA heterodimers (G2), the estimated energy of duplex formation (g = G2 - G1), and optionally, images of the secondary structure of each mRNA before and after miRNA binding (see, e.g., Figure 2).

Figure 2
mRNA secondary structure. Sample output of the C. elegans cog-1 [GenBank:NM_001027093] mRNA secondary structure before and after binding with ...

Functional targets

Seed matches are considered functional targets if the relevant miRNA-mRNA heterodimer is more energetically stable than free mRNA, i.e., g < 0. We then estimate the significance of the prediction using extreme value statistics, much in the fashion of Rehmsmeier et al. [15]. This procedure is outlined below.
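The acceptance rule g = G2 - G1 < 0 is simple enough to state as code. The sketch below is illustrative: the record type, its field names and the energy values in the example are hypothetical, and the energies themselves would come from the folding steps described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedMatch:
    mirna: str
    mrna: str
    position: int   # start of the seed match in the mRNA
    g1: float       # predicted free energy of the unbound mRNA (G1)
    g2: float       # predicted free energy of the constrained duplex (G2)

    @property
    def g(self):
        """Estimated energy of duplex formation, g = G2 - G1."""
        return self.g2 - self.g1


def functional_targets(matches):
    """Keep matches with g < 0, most favourable (most negative g) first."""
    return sorted((m for m in matches if m.g < 0), key=lambda m: m.g)


if __name__ == "__main__":
    # Made-up energies, for illustration only.
    candidates = [
        SeedMatch("miR-x", "gene-a", 120, g1=-250.0, g2=-249.0),
        SeedMatch("miR-x", "gene-b", 45, g1=-100.0, g2=-103.0),
        SeedMatch("miR-x", "gene-c", 310, g1=-250.0, g2=-262.5),
    ]
    for hit in functional_targets(candidates):
        print(hit.mrna, hit.g)
```

In this example gene-a is rejected because binding the seed costs more in disrupted mRNA structure than the duplex gains, giving g > 0.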

Statistical analysis of predicted targets

Negative normalized free energy

The occurrence of favourable hybridizations of short miRNAs with long mRNAs can frequently be attributed to chance: the longer the mRNA, the more likely the incidence. In order to eliminate the effect of sequence length on our measure of free energy [15,22], we define the negative normalized free energy

[Equation 1]

where m is the length of the target sequence searched, and n is the length of the miRNA.
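The equation image is not reproduced in this text version. A minimal sketch of the normalization, assuming the form used by Rehmsmeier et al. [15] (energy divided by the logarithm of the product of the two sequence lengths) with the sign flipped so that more favourable duplexes score higher, could look as follows. The function name and the use of the natural logarithm are assumptions.

```python
import math


def negative_normalized_energy(g, m, n):
    """Assumed form: g_n = -g / log(m * n).

    g : estimated energy of duplex formation (negative when favourable)
    m : length of the target sequence searched
    n : length of the miRNA
    """
    if m < 1 or n < 1 or m * n <= 1:
        raise ValueError("sequence lengths must give m * n > 1")
    return -g / math.log(m * n)


if __name__ == "__main__":
    # The same duplex energy scores lower against a longer target.
    print(negative_normalized_energy(-10.0, 1000, 22))
    print(negative_normalized_energy(-10.0, 5000, 22))
```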

Extreme value statistics

Extreme value distributions (EVDs) are limiting distributions that describe the minimum or maximum of independent random variables [23]. If we consider the miRNA-mRNA duplex energy estimation to be essentially an optimization procedure that produces a minimum, the negative normalized free energy described above is a corresponding maximum, and can be described by an EVD having a distribution function of the form

[Equation 2]

A transformation then converts this distribution function into a straight line:

[Equation 3]

By scanning for targets of random miRNA sequences in the mRNA sequences in the dataset, we obtain a set of negative normalized free energies, which we expect will follow an EVD. We then transform the distribution function of the empirical EVD into a straight line, as in Equation 3, and estimate the parameters of the EVD by a linear least squares fit to the line y = mx + c, obtaining

[Equation 4]


a = cb.     (5)
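Because the equation images are not reproduced in this text version, the block below offers one reconstruction of Equations 1 to 5. It is a sketch under stated assumptions, not a transcription: it takes the Gumbel-type EVD and length normalization used by Rehmsmeier et al. [15], and chooses the sign convention in Equation 3 that makes the surviving relation a = cb (Equation 5) come out exactly. Note that m denotes the target length in Equation 1 but the fitted slope in Equations 4 and 5, following the text's own use of y = mx + c.

```latex
\begin{align}
g_n &= \frac{-g}{\log(mn)} \tag{1}\\
F(x) &= \exp\!\left(-e^{-(x-a)/b}\right) \tag{2}\\
\log\bigl(-\log F(x)\bigr) &= -\frac{x}{b} + \frac{a}{b} \tag{3}\\
b &= -\frac{1}{m} \tag{4}\\
a &= cb \tag{5}
\end{align}
```

Comparing Equation 3 with y = mx + c gives slope m = -1/b and intercept c = a/b, from which Equations 4 and 5 follow directly.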

We can now compute, for each predicted miRNA-mRNA duplex, a p-value, the probability that the same or a more favourable free energy is observed due to chance:

[Equation 6]

where a and b are estimated EVD parameters, and gn is the negative normalized free energy from Equation 1 [15].
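The fit and the p-value can be sketched in a few lines of Python. The sketch assumes a Gumbel-type EVD, F(x) = exp(-exp(-(x - a)/b)), so that log(-log F(x)) is linear in x with slope -1/b and intercept a/b, and it reads the p-value as P = 1 - F(g_n). The plotting position i/(n + 1) used for the empirical distribution function is an assumption, since the text does not say how it was computed; the function names are likewise illustrative.

```python
import math


def fit_evd(samples):
    """Estimate (a, b) by a least-squares line through (x, log(-log F(x)))."""
    xs = sorted(samples)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three samples")
    # Empirical distribution function with plotting positions i / (n + 1),
    # which keeps F strictly between 0 and 1 so both logarithms are defined.
    ys = [math.log(-math.log((i + 1) / (n + 1))) for i in range(n)]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx                      # m in y = mx + c
    intercept = mean_y - slope * mean_x    # c in y = mx + c
    b = -1.0 / slope
    a = intercept * b                      # a = cb
    return a, b


def p_value(gn, a, b):
    """Probability of a negative normalized energy >= gn under the fitted EVD."""
    # -expm1(-t) equals 1 - exp(-t) but stays accurate when t is tiny.
    return -math.expm1(-math.exp(-(gn - a) / b))


if __name__ == "__main__":
    # Synthetic check: exact quantiles of an EVD with a = 1.0, b = 0.5.
    n = 200
    data = [1.0 - 0.5 * math.log(-math.log((i + 1) / (n + 1))) for i in range(n)]
    a, b = fit_evd(data)
    print(a, b, p_value(3.0, a, b))
```

In practice the samples would be the negative normalized free energies obtained from random miRNA sequences, and the fitted a and b would then be applied to each real prediction.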

Technical details

MicroTar has been written using the C programming language, and makes use of the RNAlib library from the Vienna RNA package [19]. Great care has been taken to make the system suitable for datasets of varying sizes. Sequences are loaded into memory only as required, allowing the handling of virtually any number of sequences. The parallel version uses functions from v2.0 of the Message Passing Interface (MPI).

MicroTar should compile and run under Linux and most flavours of UNIX. It has been tested under Fedora Core 4 & 5 and CentOS 4.4 Linux distributions, on both 32 and 64 bit platforms.

Results and Discussion


We performed a test of MicroTar on three sets of experimentally verified miRNA targets in C. elegans, Drosophila, and mouse, from v3.0 of TarBase [7]. miRNA sequences were retrieved from miRBase v9.0 [3]; mRNA sequences from RefSeq entries associated with the corresponding gene entry in the Entrez Gene database. In the absence of 3'-UTR annotations, the entire mRNA sequence was scanned for seed matches by MicroTar. These results are summarized in Figure 3, which shows a density plot of free energies of the most stable predicted miRNA-target duplex for each gene-miRNA pair in the three species.

Figure 3
Energies of predicted miRNA targets. A density plot of free energies of the most stable predicted miRNA-target duplex for each gene-miRNA pair in (a) mouse, (b) C. elegans, and (c) Drosophila, with genes along the x-axis and miRNAs along the y-axis. A ...

Furthermore, we compared our predictions to the widely-used PicTar algorithm, which was recently updated and applied to miRNAs in C. elegans. This comparison is shown in Table 2, where we note that MicroTar achieves better sensitivity in all three cases. We emphasize that unverified predicted interactions should be viewed as a guide for further experiments and not as false positives. Detailed lists of targets predicted are available as supplementary data (see Additional File 1 – MicroTar target predictions compared to PicTar), and on the MicroTar website [18].

Table 2
MicroTar target predictions compared to PicTar. A comparison of MicroTar and PicTar prediction results on three datasets of experimentally verified miRNA targets; MicroTar achieves better sensitivity in all three cases.

Duplex energy estimation

At the core of the MicroTar algorithm lies a novel approach to the estimation of miRNA-mRNA duplex energy. Interactions are viewed in a global context by predicting folds for the entire mRNA, rather than just its 3'-UTR or seed match. By allowing intramolecular hybridizations, we implicitly incorporate the accessibility of the 3'-UTR; seed matches in highly inaccessible UTRs are expected to disrupt UTR secondary structure in putative duplexes. Large disruptions in base pairing cannot be compensated for by bond formation during miRNA-mRNA hybridization. This results in a putative duplex with free energy G2 far greater than that of the unbound mRNA, G1, and the match is rejected.

Significance of predictions

In order to estimate the significance of our predictions, we calculated the p-value for the lowest energy duplex for each miRNA-transcript pair, as derived in Equation 6. The parameters were estimated separately for each species from a distribution computed with random miRNAs. We shuffled miRNAs using the shuffleseq utility from the EMBOSS package [24], ensuring that there were a sufficient number of random sequences for approximately 4000 seed matches in each species. Figure 4 shows these p-values in a density plot for each miRNA-target pair, as in Figure 3.
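The random miRNAs were produced with EMBOSS shuffleseq. As a stand-in for readers without EMBOSS, the sketch below shuffles a sequence while preserving its base composition; it is not the shuffleseq program, and the function name and seeding argument are assumptions.

```python
import random


def shuffled_mirnas(mirna, count, seed=None):
    """Return `count` random permutations of `mirna` with identical composition."""
    rng = random.Random(seed)   # a private generator keeps runs reproducible
    shuffled = []
    for _ in range(count):
        bases = list(mirna)
        rng.shuffle(bases)
        shuffled.append("".join(bases))
    return shuffled


if __name__ == "__main__":
    for s in shuffled_mirnas("UGAGGUAGUAGGUUGUAUAGUU", 3, seed=1):
        print(s)
```

Each shuffled sequence would then be scanned against the same mRNA set as the real miRNAs, and the resulting negative normalized free energies used to estimate the EVD parameters for that species.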

Figure 4
p-values of predicted miRNA targets. A density plot of p-values lower than 0.1, of the most stable predicted miRNA-target duplex for each gene-miRNA pair in (a) mouse, (b) C. elegans, and (c) Drosophila, with genes along the x-axis and miRNAs along the ...


MicroTar does not rely on evolutionary conservation to filter predicted targets and is able to address the problem of the prediction of targets that are not conserved across different genomes. Parallel computing makes feasible the use of complex energy prediction algorithms on a large scale, and by using estimates of miRNA-mRNA duplex free energy that allow intramolecular pairings, MicroTar implicitly incorporates the accessibility of 3'-UTRs. In tests on three datasets of experimentally verified miRNA targets in C. elegans, Drosophila and mouse, MicroTar displays greater sensitivity than previously developed target prediction programs.

Availability and Requirements

Project name: MicroTar

Project home page: http://tiger.dbs.nus.edu.sg/microtar/

Operating systems: Linux, UNIX

Programming language: C

Other requirements: GNU autoconf/automake

Licence: New BSD licence

Any restrictions to use by non-academics: None (check ViennaRNA licence, however)

Authors' contributions

MTT and RT planned the project. RT acquired the data and implemented the algorithm. Both authors prepared and approved the final manuscript.

Supplementary Material

Additional File 1:

MicroTar target predictions compared to PicTar. A list of all experimentally verified targets in the three datasets used (C. elegans, Drosophila and mouse), with a comparison of those predicted by MicroTar and those found on the PicTar website.


This work was supported in part by grant R-154-000-265-112 from the National University of Singapore.

RT acknowledges support from the National University of Singapore Research Scholarship.

This article has been published as part of BMC Bioinformatics Volume 7, Supplement 5, 2006: APBioNet – Fifth International Conference on Bioinformatics (InCoB2006). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/7?issue=S5


References

1. Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell. 1993;75:843–854. doi: 10.1016/0092-8674(93)90529-Y.
2. Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, Rougvie AE, Horvitz HR, Ruvkun G. The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature. 2000;403:901–906. doi: 10.1038/35002607.
3. Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006;34:D140–D144. doi: 10.1093/nar/gkj112.
4. Bartel DP. MicroRNAs: Genomics, Biogenesis, Mechanism, and Function. Cell. 2004;116:281–297. doi: 10.1016/S0092-8674(04)00045-5.
5. Du T, Zamore PD. microPrimer: the biogenesis and function of microRNA. Development. 2005;132:4645–4652. doi: 10.1242/dev.02070.
6. Kim VN, Nam JW. Genomics of microRNA. Trends Genet. 2006;22:165–173. doi: 10.1016/j.tig.2006.01.003.
7. Sethupathy P, Corda B, Hatzigeorgiou AG. TarBase: A comprehensive database of experimentally supported animal microRNA targets. RNA. 2006;12:192–197. doi: 10.1261/rna.2239606.
8. Lai EC. Predicting and validating microRNA targets. Genome Biol. 2004;5:115. doi: 10.1186/gb-2004-5-9-115.
9. Didiano D, Hobert O. Perfect seed pairing is not a generally reliable predictor for miRNA-target interactions. Nat Struct Mol Biol. 2006;13:849–851. doi: 10.1038/nsmb1138.
10. Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP. Prediction of Mammalian MicroRNA Targets. Cell. 2003;115:787–798. doi: 10.1016/S0092-8674(03)01018-3.
11. Brennecke J, Stark A, Russell RB, Cohen SM. Principles of MicroRNA-Target Recognition. PLoS Biol. 2005;3:e85. doi: 10.1371/journal.pbio.0030085.
12. Vella MC, Choi EY, Lin SY, Reinert K, Slack FJ. The C. elegans microRNA let-7 binds to imperfect let-7 complementary sites from the lin-41 3'UTR. Genes Dev. 2004;18:132–137. doi: 10.1101/gad.1165404.
13. Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da Piedade I, Gunsalus KC, Stoffel M, Rajewsky N. Combinatorial microRNA target predictions. Nat Genet. 2005;37:495–500. doi: 10.1038/ng1536.
14. Lall S, Grün D, Krek A, Chen K, Wang YL, Dewey CN, Sood P, Colombo T, Bray N, MacMenamin P, Kao HL, Gunsalus KC, Pachter L, Piano F, Rajewsky N. A genome-wide map of conserved microRNA targets in C. elegans. Curr Biol. 2006;16:460–471. doi: 10.1016/j.cub.2006.01.050.
15. Rehmsmeier M, Steffen P, Höchsmann M, Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA. 2004;10:1507–1517. doi: 10.1261/rna.5248604.
16. Robins H, Li Y, Padgett RW. Incorporating structure to predict microRNA targets. Proc Natl Acad Sci U S A. 2005;102:4006–4009. doi: 10.1073/pnas.0500775102.
17. John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. Human MicroRNA Targets. PLoS Biol. 2004;2:e363. doi: 10.1371/journal.pbio.0020363.
18. MicroTar: microRNA target prediction. http://tiger.dbs.nus.edu.sg/microtar/
19. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P. Fast folding and comparison of RNA secondary structures. Monatsh Chem. 1994;125:167–188. doi: 10.1007/BF00818163.
20. Zuker M, Stiegler P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 1981;9:133–148.
21. Bernhart SH, Tafer H, Mückstein U, Flamm C, Stadler PF, Hofacker IL. Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol Biol. 2006;1:3. doi: 10.1186/1748-7188-1-3.
22. Karlin S, Altschul SF. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc Natl Acad Sci U S A. 1990;87:2264–2268. doi: 10.1073/pnas.87.6.2264.
23. Gumbel EJ. Statistics of Extremes. New York: Columbia University Press; 1958.
24. Rice P, Longden I, Bleasby A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–277. doi: 10.1016/S0168-9525(00)02024-2.
25. Rusinov V, Baev V, Minkov IN, Tabler M. MicroInspector: a web tool for detection of miRNA binding sites in an RNA sequence. Nucleic Acids Res. 2005;33:W696–W700. doi: 10.1093/nar/gki364.
26. Kiriakidou M, Nelson PT, Kouranov A, Fitziev P, Bouyioukos C, Mourelatos Z, Hatzigeorgiou A. A combined computational-experimental approach predicts human microRNA targets. Genes Dev. 2004;18:1165–1178. doi: 10.1101/gad.1184704.
27. Sætrom O, Ola Snøve J, Sætrom P. Weighted sequence motifs as an improved seeding step in microRNA target prediction algorithms. RNA. 2005;11:995–1003. doi: 10.1261/rna.7290705.
28. Stark A, Brennecke J, Russell RB, Cohen SM. Identification of Drosophila microRNA targets. PLoS Biol. 2003;1:e60. doi: 10.1371/journal.pbio.0000060.

Articles from BMC Bioinformatics are provided here courtesy of BioMed Central