PLoS One. 2011; 6(6): e19312.
Published online Jun 9, 2011. doi:  10.1371/journal.pone.0019312
PMCID: PMC3111417

Computational Prediction of Intronic microRNA Targets using Host Gene Expression Reveals Novel Regulatory Mechanisms

M. Hossein Radfar,1,2,3,* Willy Wong,1,2 and Quaid Morris3,4,5,6,*
Magnus Rattray, Editor

Abstract

Approximately half of known human miRNAs are located in the introns of protein coding genes. Some of these intronic miRNAs are only expressed when their host gene is and, as such, their steady state expression levels are highly correlated with those of the host gene's mRNA. Recently, host gene expression levels have been used to predict the targets of intronic miRNAs by identifying other mRNAs with which they show consistent negative correlation. This is a potentially powerful approach because it allows a large number of expression profiling studies to be used, but it needs refinement because mRNAs can be targeted by multiple miRNAs and not all intronic miRNAs are co-expressed with their host genes.

Here we introduce InMiR, a new computational method that uses a linear-Gaussian model to predict the targets of intronic miRNAs based on the expression profiles of their host genes across a large number of datasets. Our method recovers nearly twice as many true positives at the same fixed false positive rate as a comparable method that only considers correlations. Through an analysis of 140 Affymetrix datasets from Gene Expression Omnibus, we build a network of 19,926 interactions among 57 intronic miRNAs and 3,864 targets. InMiR can also predict which host genes have expression profiles that are good surrogates for those of their intronic miRNAs. Host genes that InMiR predicts are bad surrogates contain significantly more miRNA target sites in their 3′ UTRs and are significantly more likely to have predicted Pol II and Pol III promoters in their introns.

We provide a dataset of 1,935 predicted mRNA targets for 22 intronic miRNAs. These predictions are supported both by sequence features and expression. By combining our results with previous reports, we distinguish three classes of intronic miRNAs: those that are tightly regulated with their host gene; those that are likely to be expressed from the same promoter but whose host gene is highly regulated by miRNAs; and those likely to have independent promoters.

Introduction

MicroRNAs (miRNAs) are a large family of small, non-coding endogenous RNAs that play critical roles in a wide range of normal and disease-related biological processes [1]–[3] by post-transcriptionally repressing the expression of target genes. miRNAs repress gene expression by binding target mRNAs, often in their 3′ UTR.

MicroRNAs recognize their targets through partial complementarity; as such, they are particularly amenable to computational prediction of their target mRNA sequences [4]–[20] (for a recent review of these techniques see [21]). Substantial computational and experimental effort in this area has revealed a number of core predictive sequence features: strong base pairing between the 3′ UTR of mRNAs and the miRNA seed region [22], thermodynamic stability of binding sites [23], evolutionary conservation of binding sites (particularly the seed region) [7], [14], secondary structure accessibility [8], [11], [24]–[26], and dinucleotide composition of flanking sequence [14], [27]. For example, TargetScan [8] is a popular method that incorporates many of these features and regularly performs well in head-to-head comparisons (e.g., [28]). For a comprehensive review of sequence-based features see [29].

However, despite these efforts, recent reports claim that even the most accurate miRNA target prediction methods have false positive rates greater than 30% [28], [30], and the limited overlap of their predictions suggests that they also have high false negative rates [31]–[33].

One strategy to improve the accuracy and the sensitivity of target prediction methods is to search for inverse relationships between paired miRNA and mRNA expression levels. Although miRNA-mediated gene repression can occur through Argonaute-catalyzed mRNA cleavage, mRNA destabilization, or translational repression [34]–[40], as much as 84% of the resulting decrease in the protein product is due to miRNA-induced changes at the mRNA level [41]. This miRNA-induced mRNA degradation leaves a signature on the steady-state mRNA levels of its targets that is inversely correlated with the miRNA expression level [34], [42], [43]. This signature can be detected even when miRNAs also repress translation [37], [38]. However, the signature is difficult to detect simply by comparing expression profiles of a single miRNA and mRNAs [44], possibly because many mRNAs are regulated by multiple miRNAs [12], [32]. We have previously shown that allowing for multiple miRNA regulators of a given mRNA and Bayesian modeling of potential sources of variation can reveal this signature [12]. Another way to predict miRNA targets is to identify mRNA-miRNA pairs whose expression profiles show significant negative correlation in both human and mouse data [45]. However, these approaches require large amounts of paired miRNA and mRNA expression data. Such paired data are rarely available because different assays need to be applied to the same RNA sample and, until recently, miRNA expression levels were difficult to measure accurately.

Approximately half of mammalian miRNAs are in the introns of protein-coding genes, so it may be possible to predict the targets of some of these intronic miRNAs without having to measure their expression level. Indeed, many intronic miRNAs appear to lack their own promoters and are processed out of introns [46]–[57]. Estimates for the proportion of intronic miRNAs whose expression profiles are significantly correlated with their host gene vary between 34% (25/74 [51]) and 71% (22/31 [50]). If these co-expression relationships can be detected without having to measure the miRNA expression, then host gene expression levels can be used as a surrogate for the miRNA levels when doing target prediction (cf. [16]). There are substantial advantages to doing this. First, host gene expression levels are measured at the same time and on the same platform as the target gene expression levels, thus removing the need to model platform and laboratory-based effects. Second, there are hundreds of suitable Gene Expression Omnibus datasets for well-studied model organisms that can be used for target prediction, thus adding considerable statistical power to any target predictions.

However, not all host gene expression profiles are useful for predicting the targets of their intronic miRNAs. Some of these intronic miRNAs show evidence of having their own promoter [58]–[65]. For example, two independent studies found putative promoters for one-third of intronic miRNAs [58], [59]. Furthermore, host gene mRNAs may themselves be under post-transcriptional regulation by other miRNAs. As such, it is important to distinguish host genes with expression profiles that are good surrogates for those of their intronic miRNAs from those that are not.

Here we propose a new method that both identifies intronic miRNAs whose host gene's expression provides a good surrogate for their expression level and predicts the mRNA targets of these miRNAs. Our method takes as input a set of potential miRNA target sites based on sequence comparisons and then, among these sites, identifies those likely to be functional based on the degree to which the host gene's expression is predictive of down-regulation of the mRNA. When predicting regulators of a particular mRNA, we consider the combined effect of all of its potential regulators because most mRNAs are regulated by multiple miRNAs [12], [31], [32], [66], [67]. Our method can use any mRNA expression profiles; however, here we use 140 gene expression data series chosen for their size and their use of the same microarray platform. We distinguish between good and bad host gene surrogates based on the proportion of their hosted miRNA's potential targets that we predict to be functional. Host genes that we deem to be bad surrogates based on this test have more predicted Pol II/III promoters in their introns as well as more predicted miRNA binding sites in their 3′ UTRs.

Results

We modeled the change of an mRNA's expression level in a sample by a linear combination of the host gene expression levels of a subset of the miRNAs with potential target sites in the 3′ UTR of the mRNA. We distinguished the functional and non-functional target sites by fitting this linear model to expression profiling data from a large number of studies and then examining the distributions of weights assigned to each potential miRNA regulator.

This linear modeling approach differs from previous ones [12], [66], [67] in a number of important aspects. First, we use host gene expression levels as surrogates for miRNA expression levels. Second, we predict functional and non-functional sites by integrating evidence from multiple profiling studies rather than a single study. This change allows us to employ a much simpler linear model for each individual dataset because we need not rely upon prior assumptions to detect statistical signals of regulation. The parameters of our model can be easily estimated using ordinary least squares linear regression. One final change is that we assume that multiple miRNAs contribute additively, rather than multiplicatively, to the down-regulation of a given mRNA. In other words, the decrease in expression level of the target is proportional to the expression level of the miRNAs. As such, we do not log transform the mRNA expression profile before applying our model to it. In the following, we describe our methodology and the results obtained in detail.

1-Computing weights for putative miRNA regulators on individual datasets

Our linear model is as follows: Given $K$ gene expression datasets $D_k$, $k = 1, \ldots, K$ (see materials and Table S1), let $\mathbf{y}_i^{(k)}$ denote an $N_k$-element vector whose elements correspond to the decrease in the expression level of the $i$th target gene over $N_k$ samples in the $k$th dataset. We model this vector as a linear function of $M_i$ intronic miRNAs whose host gene expression levels are denoted by $\mathbf{x}_j^{(k)}$, $j = 1, \ldots, M_i$. These intronic miRNAs represent putative regulators of the mRNA identified based on a sequence-based miRNA prediction algorithm, such as TargetScan. Based on the above assumptions and definitions, we build the following model:

$$\mathbf{y}_i^{(k)} = \sum_{j=1}^{M_i} w_{ij}^{(k)} \mathbf{x}_j^{(k)} + \boldsymbol{\epsilon}_i^{(k)} \tag{1}$$

where $k = 1, \ldots, K$, $w_{ij}^{(k)}$ is a weight that represents the contribution of the $j$th intronic miRNA in regulating the target gene $i$ and $\boldsymbol{\epsilon}_i^{(k)}$ represents modeling error or noise. Typically, we cannot measure $\mathbf{y}_i^{(k)}$ directly, so we approximate it by the difference between the mean mRNA expression level in the sample and the measured level of target gene $i$, i.e., $y_{i,n}^{(k)} \approx \frac{1}{G} \sum_{g=1}^{G} e_{g,n}^{(k)} - e_{i,n}^{(k)}$, where $e_{g,n}^{(k)}$ is the measured expression level of gene $g$ in sample $n$ and $G$ denotes the number of genes in the dataset. We also assume that the noise vector is sampled from a multivariate Gaussian distribution whose covariance matrix is proportional to the identity matrix, i.e., is spherical. Equation (1) can be written in matrix-vector notation as

$$\mathbf{y}_i^{(k)} = X_i^{(k)} \mathbf{w}_i^{(k)} + \boldsymbol{\epsilon}_i^{(k)} \tag{2}$$

in which $X_i^{(k)} = [\mathbf{x}_1^{(k)}, \ldots, \mathbf{x}_{M_i}^{(k)}]$ denotes the expression data of $M_i$ host genes over $N_k$ samples and $\mathbf{w}_i^{(k)}$ is the vector of weights $w_{ij}^{(k)}$.

In the model, a positive (negative) weight, $w_{ij}^{(k)}$, indicates the contribution of the host gene $j$ in decreasing (increasing) the expression level ([equation image]) of the target gene $i$. We call this the unconstrained linear model (ULM) to distinguish it from previous models [12], [66] that constrain the weights $w_{ij}^{(k)}$ to be positive, thereby insisting that miRNAs act only to down-regulate the expression of their target genes. We relax this constraint for convenience because doing so simplifies the fitting procedure without impacting the predictions of the model (see Fig. S2, Fig. S3, and Fig. S4). In this paper, we focus on the down-regulating role of miRNAs as only a few miRNAs have been reported to up-regulate target gene expression [68], [69].

Under these assumptions, we can estimate the weight vector $\mathbf{w}_i^{(k)}$ using ordinary least squares linear regression, i.e., we minimize the root mean squared error between the reconstruction of the mRNA down-regulation profile based on the miRNA estimates and the observed one:

$$\hat{\mathbf{w}}_i^{(k)} = \arg\min_{\mathbf{w}} \left\lVert \mathbf{y}_i^{(k)} - X_i^{(k)} \mathbf{w} \right\rVert^2 = \left( X_i^{(k)T} X_i^{(k)} \right)^{-1} X_i^{(k)T} \mathbf{y}_i^{(k)} \tag{3}$$

where $\mathbf{y}_i^{(k)}$ is the down-regulation profile of target gene $i$ in dataset $k$, $X_i^{(k)}$ is the corresponding host gene expression matrix, and $T$ denotes the matrix transpose operation. Note that the solution to equation (3) corresponds to the maximum likelihood estimate of $\mathbf{w}_i^{(k)}$ (see materials for details).
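The link between least squares and maximum likelihood follows from the spherical Gaussian noise assumption. A brief sketch is given below, dropping dataset and gene indices and writing $\mathbf{y}$ for the down-regulation profile, $X$ for the host gene expression matrix, $\mathbf{w}$ for the weights and $N$ for the number of samples. The noise variance $\sigma^2$ is a symbol introduced here for illustration, and $X^{T}X$ is assumed to be invertible.

```latex
\begin{aligned}
\log p(\mathbf{y} \mid X, \mathbf{w}, \sigma^2)
  &= -\frac{N}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\,\lVert \mathbf{y} - X\mathbf{w} \rVert^2, \\
\frac{\partial}{\partial \mathbf{w}} \log p = 0
  &\;\Longrightarrow\; X^{T}(\mathbf{y} - X\mathbf{w}) = 0
   \;\Longrightarrow\; \hat{\mathbf{w}} = (X^{T}X)^{-1}X^{T}\mathbf{y}.
\end{aligned}
```

Because the first term does not depend on $\mathbf{w}$, maximizing the likelihood is the same as minimizing the squared error, whatever the value of $\sigma^2$.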

We solved (3) individually in each dataset to obtain $K$ weight vectors $\hat{\mathbf{w}}_i^{(k)}$, $k = 1, \ldots, K$, for the target gene $i$. In order to be able to compare weights across datasets, we rescaled the weights for each mRNA within each dataset by dividing each element in $\hat{\mathbf{w}}_i^{(k)}$ by the sum of the absolute values of its elements, i.e., $\tilde{w}_{ij}^{(k)} = \hat{w}_{ij}^{(k)} / \sum_{j'} |\hat{w}_{ij'}^{(k)}|$, thus ensuring that $\sum_{j} |\tilde{w}_{ij}^{(k)}| = 1$. In the next section we describe how we combine weights from multiple datasets to make a single prediction for each putative miRNA and mRNA interaction. A summary of symbols used is given in Table 1.
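The per-dataset fit and rescaling described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the InMiR implementation: the function names, the toy expression matrix and the row indices are hypothetical, and the model is fitted without an intercept, which is an assumption made here.

```python
import numpy as np

def decrease_profile(expr, target_row):
    """Approximate the down-regulation of one target gene in every sample.

    expr is a (genes x samples) array of expression values that have NOT
    been log transformed. The result is the mean expression over all genes
    in each sample minus the measured level of the target gene.
    """
    return expr.mean(axis=0) - expr[target_row]

def fit_ulm(expr, target_row, host_rows):
    """Unconstrained least-squares weights for the putative host genes."""
    y = decrease_profile(expr, target_row)     # one value per sample
    X = expr[host_rows].T                      # samples x host genes
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimizes ||y - Xw||^2
    return w

def rescale(w):
    """Divide by the sum of absolute values so that sum(|w|) equals 1."""
    total = np.abs(w).sum()
    return w / total if total > 0 else w

# Toy data (hypothetical): 6 genes measured in 8 samples.
rng = np.random.default_rng(0)
expr = rng.uniform(1.0, 10.0, size=(6, 8))
w = rescale(fit_ulm(expr, target_row=0, host_rows=[1, 2, 3]))
print(w, np.abs(w).sum())
```

Running the fit once per dataset yields one rescaled weight per host gene / target pair and dataset, which is the quantity compared across datasets in the following sections.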

Table 1
The description of symbols used in the paper.

2-Mapping host gene weights to miRNA weights

Our model uses host gene expression as a surrogate for the expression level(s) of its intronic miRNAs. This requires us to resolve some of the host gene / intronic miRNA relationships that are not one-to-one, because some host genes contain multiple intronic miRNAs and some intronic miRNAs are duplicated in more than one host gene. Fig. 1 shows a directed acyclic graph (DAG) representing these relationships for eight intronic miRNAs that are possible regulators of the expression of gene LSM12, whose protein product accumulates in stress granules [70]. This DAG can be interpreted as a graphical model in which the expression patterns of intronic miRNAs are hidden. Because our goal is not only to predict miRNA targets but also to determine which host genes are good surrogates for their intronic miRNAs, we assign weights directly to host genes rather than miRNAs. So, the host genes of duplicated miRNAs get separate weights. Also, when a host gene contains more than one intronic miRNA with putative targets in a given mRNA, we assign this host gene weight to each of these miRNAs. The host gene / target mRNA model that we fit for LSM12 after making these adjustments is shown in Fig. 2.
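The bookkeeping for relationships that are not one-to-one can be illustrated with a small sketch. The host and miRNA names below are placeholders, not real annotations, and `mirna_weights` is a hypothetical helper: it copies each host gene weight to every hosted miRNA with a putative site in the target, and keeps a separate weight for each host of a duplicated miRNA.

```python
from collections import defaultdict

# Hypothetical annotation: host gene -> intronic miRNAs it contains.
host_to_mirnas = {
    "HOST_A": ["miR-x", "miR-y"],  # one host carrying two intronic miRNAs
    "HOST_B": ["miR-z"],
    "HOST_C": ["miR-z"],           # miR-z is duplicated in two hosts
}

def mirna_weights(host_weights, host_to_mirnas, candidate_mirnas):
    """Assign each host gene weight to every hosted miRNA with a putative site."""
    out = defaultdict(dict)
    for host, w in host_weights.items():
        for mirna in host_to_mirnas.get(host, []):
            if mirna in candidate_mirnas:
                out[mirna][host] = w
    return dict(out)

weights = mirna_weights(
    {"HOST_A": 0.5, "HOST_B": 0.3, "HOST_C": -0.2},
    host_to_mirnas,
    candidate_mirnas={"miR-x", "miR-y", "miR-z"},
)
print(weights)
```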

Figure 1
Interaction between hosts, targets, and intronic miRNAs using DAG.
Figure 2
The simplified DAG.

3-Combining multiple datasets to predict functional targets

We make our predictions of functional targets by comparing the distribution of weights assigned to a host gene / mRNA pair across the datasets to a distribution in which the association between host genes and their expression profiles is randomized. Specifically, we generate a null distribution of weights by permuting the labels of the host genes and re-calculating the weights for all putative pairs in every dataset. All of the weights calculated during this process comprise the empirical null distribution. Then, for each host gene / mRNA pair, we compare the distribution of weights for this pair against this null distribution by calculating the two-sided Wilcoxon-Mann-Whitney (WMW) ranksum P-value; we call this value $p_{ij}$ for the $j$-th host gene and the $i$-th mRNA. We also record whether the mean of the distribution of real weights for a given pair is larger or smaller than the mean of the null distribution. A mean that is larger than random reflects a prediction by our model that a miRNA associated with the host gene is down-regulating the target mRNA. As we will describe later, we use host gene / mRNA pairs whose weights are smaller than random when distinguishing good and bad host gene surrogates.

We interpret $p_{ij}$, the P-value for the $j$-th host gene and the $i$-th mRNA, as an enrichment measure and determine a cutoff value, for both positive and negative enrichment, by comparing it to P-values calculated for host gene / mRNA pairs that are unlikely to interact. We generated P-values for these likely negative examples by calculating a two-tailed WMW P-value, $p_{ij}^{\mathrm{perm}}$, for each putative host gene / mRNA pair as described above, except that we replace the actual weight distribution with the one that we computed after permuting the host gene labels. Formally, we define $p_{ij}$ and $p_{ij}^{\mathrm{perm}}$ as follows:

$$p_{ij} = \mathrm{WMW}\left(W_{ij}, W^{\mathrm{perm}}\right) \tag{4}$$

$$p_{ij}^{\mathrm{perm}} = \mathrm{WMW}\left(W_{ij}^{\mathrm{perm}}, W^{\mathrm{perm}}\right) \tag{5}$$

where $\mathrm{WMW}(A, B)$ is a function that calculates a two-tailed WMW P-value for sets $A$ and $B$, $W_{ij}$ and $W_{ij}^{\mathrm{perm}}$ are the sets of weights obtained for the pair across datasets from the actual and the permuted data, and $W^{\mathrm{perm}}$ is the set of weights fit to the permuted data.
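The comparison against the permuted null can be sketched as follows. This is an illustration under stated assumptions: the rank-sum P-value uses the normal approximation without a tie correction (a statistics library would normally be used instead), the weights are simulated rather than fitted, and all function and variable names are hypothetical.

```python
import math
import random
from statistics import mean

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wmw_two_sided(a, b):
    """Two-sided rank-sum P-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2      # Mann-Whitney U for sample a
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - n1 * n2 / 2) / sd
    return math.erfc(abs(z) / math.sqrt(2))

# Simulated weights (hypothetical values, for illustration only).
random.seed(1)
null_weights = [random.gauss(0.0, 0.2) for _ in range(2000)]  # pooled permuted weights
real_weights = [random.gauss(0.15, 0.2) for _ in range(140)]  # one pair, 140 datasets
perm_weights = [random.gauss(0.0, 0.2) for _ in range(140)]   # same pair after permutation

p_real = wmw_two_sided(real_weights, null_weights)
p_perm = wmw_two_sided(perm_weights, null_weights)
larger_than_random = mean(real_weights) > mean(null_weights)
print(p_real, p_perm, larger_than_random)
```

Recording `larger_than_random` alongside the P-value mirrors the text: the P-value measures how distinguishable the pair is from the null, while the direction of the mean indicates whether the model predicts down-regulation.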

Fig. 3.a–d show the CDFs of weights (i.e. [equation image] and [equation image], [equation image]) for all host genes whose intronic miRNAs have potential target sites in LSM12. The CDF of the pooled weights obtained from the permuted data (the thick gray line) is also shown. These weights were obtained from two methods: ULM (Fig. 3.a–b) and a method that sets weights by correlation (Fig. 3.c–d) (the CORR method, see materials for details). Recently, the HOCTAR method was introduced that uses inverse correlation with host genes to detect intronic miRNA targets [16]; here we use the CORR method to demonstrate how well inverse correlation performs within our framework. From Fig. 3.c–d, we see that the distributions obtained by CORR from the actual and permuted data are almost indistinguishable, suggesting that CORR is underpowered and/or prone to misclassification compared to ULM. Moreover, these observations also confirm the cooperative impact of miRNAs on target genes. By contrast, the distributions of three host genes, namely CTDSP1, CTDSP2, and CTDSPL, obtained from ULM (and also from the constrained linear model, CLM; Fig. S4) are significantly different from their permuted counterparts and the pooled distribution. The table at the bottom of Fig. 3 lists the P-values from the actual data, $p_{ij}$, and from the permuted data, $p_{ij}^{\mathrm{perm}}$, for each interaction. In the next subsection we specify a cutoff point in order to determine the significant interactions that we will be using to make predictions about targets.

Figure 3
CDF plots for weights.

4-Determining a cutoff value for significant interactions

We apply ROC analysis to determine a cutoff point for specifying significant $p_{ij}$. Fig. 4 shows the ROC curves for the ULM and CORR methods when we use $p_{ij}$ as the discriminant values for the positive examples and $p_{ij}^{\mathrm{perm}}$ for the negative examples. By using a cutoff of [equation image] for the ULM $p_{ij}$ values, we are able to achieve a sensitivity of 32% at 100% predicted specificity. In other words, 32% of interactions predicted by TargetScan are assigned weights whose distributions are more distinguishable from a random distribution than any of those assigned to the permuted host gene / mRNA pairs. If we insist on 100% specificity, CORR only recovers 17% of the TargetScan predicted host gene / mRNA interactions; achieving 32% sensitivity with CORR requires lowering the specificity to [equation image]. The corresponding cumulative distribution of these $p_{ij}$ P-values is shown in Fig. S1-2. In the example in Fig. 3, we detect significant interactions between CTDSP1 and LSM12 (P-value = [equation image] (ULM)), between CTDSP2 and LSM12 (P-value = [equation image] (ULM)), and between CTDSPL and LSM12 (P-value = [equation image] (ULM)). Fig. 5 shows the boxplots of weights of 7 host genes whose intronic miRNAs putatively target LSM12.
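One simple reading of "100% predicted specificity" is a cutoff below which no permuted pair falls. The sketch below illustrates that reading with made-up P-values; it is not the ROC analysis of Fig. 4 itself, and the function names are hypothetical.

```python
def cutoff_for_full_specificity(p_perm):
    """Largest cutoff that calls no permuted (likely negative) pair significant."""
    return min(p_perm)

def sensitivity_specificity(p_real, p_perm, cutoff):
    """A pair is called significant when its P-value is strictly below the cutoff."""
    sens = sum(p < cutoff for p in p_real) / len(p_real)
    spec = sum(p >= cutoff for p in p_perm) / len(p_perm)
    return sens, spec

# Hypothetical P-values for six actual and six permuted pairs.
p_real = [1e-12, 3e-9, 2e-4, 0.03, 0.2, 0.6]
p_perm = [1e-3, 0.04, 0.3, 0.5, 0.8, 0.9]

cutoff = cutoff_for_full_specificity(p_perm)
sens, spec = sensitivity_specificity(p_real, p_perm, cutoff)
print(cutoff, sens, spec)
```

Sweeping the cutoff over all observed P-values instead of fixing it at the minimum permuted value would trace out the full sensitivity/specificity trade-off.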

Figure 4
Receiver Operating Characteristic (ROC) curve analysis.
Figure 5
Interaction between LSM12 (target gene) and the host genes of its targeting miRNAs.

5-Detecting good host gene surrogates

Using the method described in the last section, we defined for each host gene a set of significant interactions between the host gene's expression level and those of the predicted targets of its associated intronic miRNAs (i.e. those for which the P-value $p_{ij}$ falls below the cutoff). Furthermore, we know whether an interaction is a “negative” one, when the mean of weights over all datasets (i.e. $\frac{1}{K} \sum_{k=1}^{K} \tilde{w}_{ij}^{(k)}$) is larger than random expectation, or a “non-negative” one, when the mean is smaller than random expectation. When we examine all the significant interactions between a host (or equivalently its miRNA) and its predicted targets, we find that these interactions are almost exclusively negative or non-negative.
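The labelling of significant interactions can be sketched as a simple count per host gene. The cutoff, the null mean and the weights below are hypothetical, and `summarize_host` is an illustrative helper rather than part of InMiR. Note the naming convention: an interaction is "negative" when its mean weight is larger than the null mean, because the weights model a decrease in target expression.

```python
from statistics import mean

def summarize_host(interactions, null_mean, cutoff):
    """Count the significant interactions of one host gene by direction.

    interactions: list of (p_value, weights_across_datasets) tuples,
    one per predicted target of the host's intronic miRNAs.
    Returns (negative, non_negative).
    """
    negative = non_negative = 0
    for p, weights in interactions:
        if p >= cutoff:
            continue                      # not significant, ignored
        if mean(weights) > null_mean:
            negative += 1                 # host predicts down-regulation
        else:
            non_negative += 1
    return negative, non_negative

# Hypothetical host gene with four predicted targets.
interactions = [
    (1e-8, [0.3, 0.4, 0.2]),
    (1e-6, [0.25, 0.35]),
    (0.2, [-0.5, -0.4]),      # not significant
    (1e-7, [-0.3, -0.1]),
]
counts = summarize_host(interactions, null_mean=0.0, cutoff=1e-3)
print(counts)
```

A host gene whose significant interactions are almost all negative would, in the terms used here, be a candidate good surrogate for its intronic miRNAs.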

We retrieved and processed the expression profiles of 75 host genes and 3864 target genes (see materials and Table S3) over 140 datasets. For all target genes ($i = 1, \ldots, 3864$), we carried out the procedure given in Materials subsection 5 for obtaining p-values for the ULM, CLM, and CORR methods. All of these p-values are available in Table S3. We report the results for ULM; the significant interactions from CLM are similar and, as we described in the last section, using CORR reduces our sensitivity or specificity or both. After applying the cutoff at [equation image], we find that 22 ([equation image]) host genes have more negative interactions than non-negative ones. Those host genes and their 1935 target genes are shown in Fig. 6.

Figure 6
A gene-gene interaction network of target and host genes of intronic miRNAs.

Fig. 7 shows the number of TargetScan-predicted targets for each of these 22 host genes, along with the number of significant interactions for these predicted targets and the number of these significant interactions that are negative. As shown, for 21 out of 22 host genes, almost all interactions are negative (equal light green and yellow bars). We take this as evidence that the host gene expression level is a good surrogate for that of its intronic miRNAs. Indeed, when we consider all of the host genes with any significant interactions, we find that they fall into two main classes: those whose interactions are almost exclusively negative and those that are non-negative (Fig. 8). Furthermore, those that are non-negative are highly enriched for those with possible promoters, as predicted by sequence analysis in [58], for their intronic miRNAs (Fig. 8 and Fig. 9). We also observe that significantly negatively enriched host genes have, on average, high mean p-values (blue circles). For instance, 7 out of 8 host genes, namely HNRNPK, COPZ1, HUWE1, PANK2, ACADVL, LARP7, and IARS2, appear at the top of the ranked mean p-value list. Thus, significantly negative interactions and high mean p-values are two determinants which may provide strong evidence for detecting co-expressed host-intronic miRNA pairs.

Figure 7
The host genes that significantly negatively interact with the target genes.
Figure 8
The scatter plot shows the enrichment of host genes.
Figure 9
Venn diagrams.

6-Targeting of host genes by miRNAs partially explains their predicted surrogacy

Even if a host gene and intronic miRNA are expressed from the same promoter, they could have different expression levels due to different post-transcriptional regulation. To investigate this, we examined the predicted miRNA targets within the 3′ UTRs of host genes. We found that host genes are targeted by miRNAs much more than non-host genes ([equation image], Wilcoxon ranksum test), though we were unable to detect a preference for targeting by intronic versus intergenic miRNAs (Fig. S5). However, we found that negatively enriched host genes have significantly fewer ([equation image], Wilcoxon ranksum test) miRNA target sites than non-negatively enriched hosts (Fig. 10). So, down-regulation of the host gene by other miRNAs could provide another possible explanation for why some host expression levels are bad surrogates for those of their intronic miRNAs. The pattern of interactions among host genes and their intronic miRNAs suggests that there may be some hierarchical structure in intronic miRNA-based regulation (Fig. S6).

Figure 10
Number of intergenic and intronic miRNAs that putatively target our set of host genes.

7-Correlation measurements are not good indicators of surrogacy

Correlation between the expression patterns of the host genes and their intronic miRNAs in a single dataset is not a good indicator of surrogacy. We observed that correlation measurements reported by five different groups overlap little and are somewhat inconsistent (see File S1, Fig. S7, Table S5). Only 11 host-miRNA pairs show high positive correlation ([equation image]) in at least two of these five datasets (Fig. 11). Out of these 11 host genes, 4 are predicted to be good surrogates by our model. While none of the intronic miRNAs of these 4 hosts have promoters, 6 out of 7 hosts predicted to be bad surrogates have intronic miRNAs with promoters (Fig. 11). Thus, 7 highly correlated host-intronic miRNA pairs pass neither our criteria nor the promoterless condition.

Figure 11
Pearson correlation coefficients averaged over five correlation datasets.

Discussion

InMiR models the combinatorial effect of miRNAs using a simple and biologically plausible linear model. Because we use ordinary linear regression for target prediction, InMiR is fast and easy to update to incorporate new mRNA expression data. We used data from [equation image]1,500 gene expression arrays to predict interactions in human between 57 intronic miRNAs and 3,864 potential targets. InMiR can also be readily applied to other species besides human because intronic miRNAs constitute a large portion of the miRNA complement of a variety of species (Fig. 12).

Figure 12
Intronic miRNAs comprise a significant portion of identified miRNAs in other species.

Unlike previously described methods, InMiR does not assume that all host genes have expression levels that are equally good surrogates. The set of host genes predicted by InMiR to be bad surrogates is enriched for genes with predicted intronic promoters and for genes with a larger number of microRNA target sites in their 3′ UTRs.

As shown in Fig. 13, our observations suggest at least three types of regulatory relationships between host genes and their intronic microRNAs: (a) an intronic miRNA and its host gene are transcribed from the same promoter; the mature miRNA is then processed from the intron before or after splicing, either by Drosha or independently of it (mirtrons), and the subsequent steady-state expression levels of the host and intronic miRNA are highly correlated (Fig. 6a); (b) an intronic miRNA has its own promoter and is transcribed independently of the host gene at least some of the time (Fig. 6b); (c) the intronic miRNA and host are transcribed from the same promoter, but the post-transcriptional regulation of the host gene's expression level differs from that of the miRNA (Fig. 6c). For example, a host gene could be down-regulated by its own intronic miRNA; we found three self-regulated hosts, all of which were predicted to be bad surrogates by InMiR (Fig. S8). Host genes could also be down-regulated by other co-expressed miRNAs.

Figure 13
Regulatory mechanisms.

The host gene / intronic miRNA interactions that we observe suggest a variety of new regulatory mechanisms. For example, tightly coupled host gene and intronic miRNA expression could support a rapid “biological switch” in cellular state, in which transcription of the host gene also produces an intronic miRNA that immediately down-regulates genes expressed in the competing state (Fig. S9).

Our observations raise a number of interesting questions. Are intronic miRNAs with their own promoter ever expressed from the host gene's promoter? How is this decision regulated? How does the independent transcription of an intronic miRNA affect host gene transcription? Does the processing of an intronic miRNA interfere with splicing? This may depend on whether Drosha cleaves the pre-miRNA before or after splicing. Kim and Kim [56] speculated that both mechanisms may occur, but no conclusive results can be drawn yet. Answers to these questions would provide a clearer picture of intronic miRNA biogenesis.

Materials and Methods

1-Microarray data

140 curated gene expression datasets, called GDS, were downloaded from Gene Expression Omnibus (GEO) using the MATLAB Bioinformatics Toolbox function getgeodata.m. The list of these GDSs is given in Table S1. Each dataset was then processed as follows. First, we excluded genes with missing values. Then we filtered out genes with absolute values below the 10th percentile using the MATLAB function genelowvalfilter.m. The expression profiles of the host genes were normalized so that all have length one; mathematically, this means [inline formula pone.0019312.e119]. For the target genes, we obtain the decrease in expression level as [inline formula pone.0019312.e120], where [inline formula pone.0019312.e121].
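The preprocessing steps above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names `percentile`, `preprocess`, and `unit_length` are hypothetical, the percentile uses a simple nearest-rank rule, and the low-value filter is assumed to drop a gene only when all of its absolute values fall below the percentile cutoff. The target-gene transformation is omitted because its formula is not recoverable from the text.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def preprocess(profiles, pct=10.0):
    """Drop genes with missing values, then genes whose absolute values
    all lie below the pct-th percentile of all absolute values."""
    complete = {g: v for g, v in profiles.items()
                if not any(x is None or math.isnan(x) for x in v)}
    cutoff = percentile([abs(x) for v in complete.values() for x in v], pct)
    return {g: v for g, v in complete.items()
            if any(abs(x) >= cutoff for x in v)}

def unit_length(profile):
    """Scale a profile so that its Euclidean length is one."""
    norm = math.sqrt(sum(x * x for x in profile))
    return [x / norm for x in profile]

# Toy data with hypothetical gene names; pct=40 is used only because the
# example is too small for a 10th-percentile cutoff to remove anything.
profiles = {
    "HOST_A": [3.0, 4.0],
    "LOW": [0.1, 0.2],
    "GAP": [1.0, float("nan")],
    "TGT": [5.0, 12.0],
}
kept = preprocess(profiles, pct=40.0)
host_unit = unit_length(kept["HOST_A"])
print(sorted(kept), host_unit)
```

In this toy example the gene with a missing value and the uniformly low gene are removed, and the remaining host profile is rescaled to unit length.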

2-Maximum Likelihood Estimation

The maximum likelihood estimate of [inline formula pone.0019312.e122] is given by

equation image
(6)

The vector [inline formula pone.0019312.e124] is modeled as zero-mean white Gaussian noise of the form

equation image
(7)

If we assume that the noise process has a diagonal covariance matrix of the form [inline formula pone.0019312.e126], where [inline formula pone.0019312.e127] denotes the identity matrix, then the likelihood function is given by

equation image
(8)

Thus, maximizing the log of [inline formula pone.0019312.e129] is equivalent to

equation image
(9)
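Because the equation images are not available here, the standard argument that links a Gaussian noise model to least squares can be sketched as follows. The notation is an assumption and may differ from the authors' symbols: y is the target gene's expression vector in one dataset with T samples, X is the matrix whose columns are the host gene profiles, w is the weight vector, and σ² is the noise variance.

```latex
\begin{align}
\mathbf{y} &= \mathbf{X}\mathbf{w} + \boldsymbol{\epsilon},
  \qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^{2}\mathbf{I}) \\
p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}) &= (2\pi\sigma^{2})^{-T/2}
  \exp\!\left(-\frac{1}{2\sigma^{2}}
  \lVert \mathbf{y} - \mathbf{X}\mathbf{w} \rVert_{2}^{2}\right) \\
\hat{\mathbf{w}} &= \arg\max_{\mathbf{w}} \log p(\mathbf{y} \mid \mathbf{X}, \mathbf{w})
  = \arg\min_{\mathbf{w}} \lVert \mathbf{y} - \mathbf{X}\mathbf{w} \rVert_{2}^{2} \\
\hat{\mathbf{w}} &= (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}
  \quad \text{(unconstrained case, } \mathbf{X}^{\top}\mathbf{X} \text{ invertible)}
\end{align}
```

Under these assumptions the log-likelihood depends on w only through the squared residual, so the maximum likelihood estimate coincides with the ordinary least-squares solution.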

3-Predicting miRNA targets using inverse correlation (CORR method)

Gennarino and colleagues [16] recently described an algorithm, HOCTAR, that predicts intronic microRNA targets based on the inverse correlation of their host genes with other mRNAs across a large number of datasets. As we have previously demonstrated [71], linear models that consider the impact of multiple potential miRNA regulators generate more accurate target predictions than simple correlations, consistent with recent observations of miRNA-target interactions [31], [32]. To assess whether these observations hold for target predictions based on host gene expression, we also assessed a version of our method in which we replace the weights with correlations. The resulting algorithm is very similar to HOCTAR.

In particular, we denote the correlation coefficient by [inline formula pone.0019312.e131], where [inline formula pone.0019312.e132] represents the Pearson correlation coefficient. We then use these correlations [inline formula pone.0019312.e133] for real and permuted datasets in place of the weights to calculate the P-value based enrichment measures as described in Section II.C. We call this method CORR.
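As an illustration of the quantity that CORR substitutes for the regression weights, the following Python sketch computes the Pearson correlation between one host profile and one target profile, once on the real data and once after permuting the host samples. The function name `pearson`, the toy profiles, and the random seed are assumptions made for the example only.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical host and target profiles within a single dataset.
host = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [10.0, 8.0, 6.0, 4.0, 2.0]

rng = random.Random(0)
permuted_host = rng.sample(host, len(host))  # shuffle the sample order

r_real = pearson(host, target)            # strongly negative for a repressed target
r_perm = pearson(permuted_host, target)   # one draw from the permutation background
print(r_real, r_perm)
```

Repeating this across datasets yields one set of correlations for the real data and one for the permuted data, which can then be compared in the same way as the regression weights.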

4-Processing hosts and targets data

We retrieved the miRBase gene context repository and extracted all human intronic miRNA-host gene associations (Table S2). We also downloaded 140 gene expression datasets (GDS) from Gene Expression Omnibus (GEO), which were built on the Affymetrix HG-U133 microarray platform [16], using the MATLAB function getgeodata.m (Table S1 and materials). Only those probe IDs that could be mapped to gene symbols (according to HGNC) were considered for analysis. We used the list of putatively predicted target genes (9448) and their intronic miRNAs (134) from the TargetScan (release 5.1) repository.

5-Pseudo code for implementing InMiR

for [inline formula pone.0019312.e134] (number of target genes)

  find all intronic miRNAs which putatively target [inline formula pone.0019312.e135] using TargetScan

  map intronic miRNAs to their host genes, [inline formula pone.0019312.e136]

  for [inline formula pone.0019312.e137] (number of gene expression datasets)

    extract the expression data of the host genes, [inline formula pone.0019312.e138]

    extract the expression data of the target gene, [inline formula pone.0019312.e139]

    solve [inline formula pone.0019312.e140]

    permute the rows using a permutation matrix, [inline formula pone.0019312.e141], to get [inline formula pone.0019312.e142]

    solve [inline formula pone.0019312.e143]

  end

  for [inline formula pone.0019312.e144]

    compute the P-values:

      [inline formula pone.0019312.e145]

      [inline formula pone.0019312.e146]

  end

end

set two classes of data, I: [inline formula pone.0019312.e147] and II: [inline formula pone.0019312.e148]

plot the ROC curve and determine a cutoff point ([inline formula pone.0019312.e149]) to get almost zero false positives

declare the interaction between host gene [inline formula pone.0019312.e150] and target gene [inline formula pone.0019312.e151] significant if [inline formula pone.0019312.e152]
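The loop structure above can be sketched in Python for a single target gene. This is an illustration under stated assumptions, not the authors' implementation: the weights come from ordinary least squares without an intercept, the target is raw (centred) expression so that repression shows up as a negative weight, the permuted weights are pooled across hosts, and the P-value is a one-sided rank-sum test with a normal approximation. All function names and the synthetic data are hypothetical, and the ROC-based cutoff step is not shown.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def ols_weights(hosts, target):
    """Least-squares weights of the target on the host profiles (no intercept)."""
    gram = [[dot(hi, hj) for hj in hosts] for hi in hosts]
    rhs = [dot(hi, target) for hi in hosts]
    return solve_linear(gram, rhs)

def ranksum_p_less(x, y):
    """One-sided rank-sum p-value (normal approximation) that x lies below y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for t in range(i, j + 1):
            ranks[t] = (i + j) / 2 + 1  # average rank for tied values
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 0.5 * math.erfc(-(w - mean) / (sd * math.sqrt(2)))

def inmir_sketch(datasets, rng):
    """Per-host p-values for one target gene across several datasets."""
    n_hosts = len(datasets[0][0])
    real = [[] for _ in range(n_hosts)]
    permuted = []
    for hosts, target in datasets:
        for k, w in enumerate(ols_weights(hosts, target)):
            real[k].append(w)
        # Permute the samples (rows) of the host data, then refit.
        order = rng.sample(range(len(target)), len(target))
        shuffled = [[h[i] for i in order] for h in hosts]
        permuted.extend(ols_weights(shuffled, target))
    return [ranksum_p_less(real[k], permuted) for k in range(n_hosts)]

# Synthetic example: host 0 represses the target, host 1 is unrelated.
rng = random.Random(0)
datasets = []
for _ in range(30):
    h0 = [rng.gauss(0, 1) for _ in range(20)]
    h1 = [rng.gauss(0, 1) for _ in range(20)]
    target = [-0.8 * a + rng.gauss(0, 0.3) for a in h0]
    datasets.append(([h0, h1], target))

p_values = inmir_sketch(datasets, rng)
print(p_values)
```

In the synthetic example only the first host should receive a very small p-value, since its weights are consistently more negative than the pooled permutation background.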

Supporting Information

Figure S1

The cumulative distribution function obtained from ULM. The cumulative distribution functions of the negative base-10 logarithm of the p-values for the actual and permuted host-target interactions obtained from ULM (dashed and solid blue lines) and CORR (dashed and solid red lines). The cutoff point was set to 2 (the dashed black vertical line), and all p-values beyond this point are declared significant.

(TIF)

Figure S2

The cumulative distribution function obtained from CLM. The cumulative distribution functions of the negative base-10 logarithm of the p-values for the actual and permuted host-target interactions obtained from the constrained linear model (CLM) ([inline formula pone.0019312.e153]) (dashed and solid blue lines), and ULM.

(TIF)

Figure S3

Receiver Operating Characteristic (ROC) curve analysis for ULM and CLM. Receiver Operating Characteristic (ROC) curve analysis to determine the cutoff point. We set the cutoff point to 0.01 ([inline formula pone.0019312.e154]) to identify significant host-target interactions. The blue and green curves show the ROC associated with ULM and CLM.

(TIF)

Figure S4

The weight CDFs and p-values obtained from ULM. Plots e-f: the CDFs of the weights [inline formula pone.0019312.e155] [inline formula pone.0019312.e156] for seven host genes obtained from the constrained linear model (CLM) ([inline formula pone.0019312.e157]) with the actual (e) and permuted data (f). The thick gray line in each plot is the CDF obtained from the pooled permutation data for each method. The table lists the [inline formula pone.0019312.e158] p-values (Wilcoxon ranksum test) showing the probability that the weight or correlation data are drawn from the pooled permuted data (see (4) and (5) for details). Note that the host gene MIRHG1 was excluded from the analysis because expression data for this host gene were not present in the retrieved datasets.

(TIF)

Figure S5

The CDFs of the number of miRNAs targeting host and non-host genes. Top: the cumulative distribution of the number of miRNAs targeting host (blue) and non-host genes (red). The inset shows the CDF of the 3′ UTR length of host (blue) and non-host genes (blue). Bottom: the CDF of the number of miRNAs targeting host (blue) and non-host genes (red) per base; that is, number of targets / 3′ UTR length. The CDFs are obtained from analyzing 367 host genes and 17,000 non-host genes.

(TIF)

Figure S6

Host genes targeted by intronic miRNAs of other hosts. Host genes targeted by intronic miRNAs of other hosts. The nodes corresponding to hosts predicted to be good surrogates are shown in red.

(TIF)

Figure S7

Scatter plots of five correlation datasets. Scatter plots of five correlation datasets (Table S4). (a) the scatter plot of Rad's data versus Liang's, Wang's, Ruike's, and Baskerville's data. (b) the scatter plot of Liang's data versus Wang's, Ruike's, and Baskerville's data. (c) the scatter plot of Wang's data versus Ruike's and Baskerville's data. (d) the scatter plot of Ruike's data versus Baskerville's data.

(TIF)

Figure S8

The host genes targeted by their own intronic miRNAs. The host genes in our dataset which are targeted by their own intronic miRNAs. All of these hosts are predicted to be bad surrogates.

(TIF)

Figure S9

Host and intronic miRNA resemble a “biological switch”. Tightly coupled host gene and intronic miRNA expression could support a rapid “biological switch” in cellular state in which host gene expression also expresses an intronic miRNA that immediately down-regulates genes expressed in the competing state.

(TIF)

Table S1

List of GDS data for analysis. The identifiers of Gene Datasets (GDS) retrieved from the Gene Expression Omnibus repository.

(XLS)

Table S2

The Excel file contains all intronic miRNA-host gene pairs. Data were retrieved from miRBase v.15.

(XLS)

Table S3

The Excel file, consisting of 6 sheets, contains all p-values obtained for interactions between 3,864 genes targeted by intronic miRNAs and 57 host genes using the CLM, ULM, and CORR methods. Sheet 1: p-values from the CLM model. Sheet 2: p-values from the CLM model with permuted data. Sheet 3: p-values from the ULM model. Sheet 4: p-values from the ULM model with permuted data. Sheet 5: p-values from the CORR model. Sheet 6: p-values from the CORR model with permuted data. The names of the targeted genes and host genes are given in the first row and column of the first sheet. Note that a zero at position (i,j) in the tables indicates that the ith gene is not a target of the intronic miRNAs of the jth host.

(XLS)

Table S4

The Excel file contains all target-intronic miRNA pairs and their scores. Column one: target genes. Column two: intronic miRNAs. Column three: host genes. Column four: scores (p-values); scores [inline formula pone.0019312.e159]2 are significant. Column five: flag = 1 negative and flag = 1 positive interactions.

(XLS)

Table S5

coefficients. Correlation coefficients obtained from five different datasets, namely Baskerville et al., Liang et al., Wang et al., Ruike et al., and Rad. The data reported by Wang et al. are in terms of p-values. An empty cell in the table indicates that either the data were not available for the host-intronic miRNA pair or the correlation coefficient was negative or insignificant.

(XLS)

File S1

Host-intronic miRNA correlation data.

(PDF)

Acknowledgments

The authors thank Gary Bader and his lab members for help on Cytoscape.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: This project was funded by a Natural Sciences and Engineering Research Council (NSERC) of Canada fellowship to M.H.R. and partially funded by NSERC operating grants to Q.M. and W.W. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Bartel D. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136:215–233. [PMC free article] [PubMed]
2. Rajewsky N. microRNA target predictions in animals. Nature genetics. 2006;38:S8–S13. [PubMed]
3. Zhang B, Pan X, Cobb G, Anderson T. microRNAs as oncogenes and tumor suppressors. Developmental biology. 2007;302:1–12. [PubMed]
4. Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright A. miRBase: microRNA sequences, targets and gene nomenclature. NAR. 2006;34:140–144. [PMC free article] [PubMed]
5. Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Research. 2008;36:154–158. [PMC free article] [PubMed]
6. John B, Enright A, Aravin A, Tuschl T, Sander C, et al. Human microRNA targets. PLoS Biol. 2004;2:e363. [PMC free article] [PubMed]
7. Friedman R, Farh K, Burge C, Bartel D. Most mammalian mRNAs are conserved targets of microRNAs. Genome Research. 2009;19:92. [PMC free article] [PubMed]
8. Grimson A, Farh K, Johnston W, Garrett-Engele P, Lim L, et al. MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Molecular cell. 2007;27:91–105. [PMC free article] [PubMed]
9. Betel D, Koppal A, Agius P, Sander C, Leslie C. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome biology. 2010;11:R90. [PMC free article] [PubMed]
10. Lall S, Grun D, Krek A, Chen K, Wang Y, et al. A genome-wide map of conserved microRNA targets in C. elegans. Current biology. 2006;16:460–471. [PubMed]
11. Kertesz M, Iovino N, Unnerstall U, Gaul U, Segal E. The role of site accessibility in microRNA target recognition. Nature genetics. 2007;39:1278–1284. [PubMed]
12. Huang JC, Babak T, Corson TW, Chua G, Khan S, et al. Using expression profiling data to identify human microRNA target. Nature Methods. 2007;4:1045–1049. [PubMed]
13. Wang X, El Naqa I. Prediction of both conserved and nonconserved microRNA targets in animals. Bioinformatics. 2008;24:325. [PubMed]
14. Nielsen C, Shomron N, Sandberg R, Hornstein E, Kitzman J, et al. Determinants of targeting by endogenous and exogenous microRNAs and siRNAs. Rna. 2007;13:1894. [PMC free article] [PubMed]
15. Hammell M, Long D, Zhang L, Lee A, Carmack C, et al. mirWIP: microRNA target prediction based on microRNA-containing ribonucleoprotein–enriched transcripts. Nature methods. 2008;5:813–819. [PMC free article] [PubMed]
16. Gennarino VA, Sardiello M, Avellino R, Meola N, Maselli V, et al. MicroRNA target prediction by expression analysis of host genes. Genome Res. 2008;19:481–490. [PMC free article] [PubMed]
17. Maragkakis M, Reczko M, Simossis V, Alexiou P, Papadopoulos G, et al. DIANA-microT web server: elucidating microRNA functions through target prediction. Nucleic Acids Research. 2009;37(Web Server issue):W273–W276. doi: 10.1093/nar/gkp292. [PMC free article] [PubMed]
18. Gaidatzis D, Van Nimwegen E, Hausser J, Zavolan M. Inference of miRNA targets using evolutionary conservation and pathway analysis. BMC bioinformatics. 2007;8:69. [PMC free article] [PubMed]
19. Ioshikhes I, Roy S, Sen C. Algorithms for mapping of mRNA targets for microRNA. DNA and Cell Biology. 2007;26:265–272. [PubMed]
20. Hausser J, Berninger P, Rodak C, Jantscher Y, Wirth S, et al. MirZ: an integrated microRNA expression atlas and target prediction resource. Nucleic Acids Research. 2009;37(suppl 2):W266–W272. doi: 10.1093/nar/gkp412. [PMC free article] [PubMed]
21. Hammell M. Computational methods to identify miRNA targets. Seminars in Cell & Developmental Biology. 2010;21(7):738–744. [PMC free article] [PubMed]
22. Lewis B, Shih I, et al. Prediction of mammalian microRNA targets. Cell. 2003;115:787–798. [PubMed]
23. Rehmsmeier M, Steffen P, Hochsmann M, Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA. 2004;10:1507. [PMC free article] [PubMed]
24. Ameres S, Martinez J, Schroeder R. Molecular basis for target RNA recognition and cleavage by human RISC. Cell. 2007;130:101–112. [PubMed]
25. Tafer H, Ameres S, Obernosterer G, Gebeshuber C, Schroeder R, et al. The impact of target site accessibility on the design of effective siRNAs. Nature Biotechnology. 2008;26:578–583. [PubMed]
26. Majoros W, Ohler U. Spatial preferences of microRNA targets in 3′ untranslated regions. BMC Genomics. 2007;8:152. [PMC free article] [PubMed]
27. Ohler U, Yekta S, Lim L, Bartel D, Burge C. Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification. RNA. 2004;10:1309. [PMC free article] [PubMed]
28. Min H, Yoon S. Got target?: computational methods for microrna target prediction and their extension. Exp Mol Med. 2010;4:233–244. [PMC free article] [PubMed]
29. Hausser J, Landthaler M, Jaskiewicz L, Gaidatzis D, Zavolan M. Relative contribution of sequence and structure features to the mRNA binding of Argonaute/EIF2C–miRNA complexes and the degradation of miRNA targets. Genome Research. 2009;19(11):2009–2020. [PMC free article] [PubMed]
30. Thomas M, Lieberman J, Lal A. Desperately seeking microRNA targets. Nature Structural & Molecular Biology. 2010;17:1169–1174. [PubMed]
31. Ritchie W, Flamant S, Rasko J. Predicting microRNA targets and functions: traps for the unwary. Nature Methods. 2009;6:397–398. [PubMed]
32. Peter M. Targeting of mRNAs by multiple miRNAs: the next step. Oncogene. 2010;29:2161–2164. [PubMed]
33. Boross G, O K, Farkas IJ. Human microRNAs co-silence in well-separated groups and have different predicted essentialities. Bioinformatics. 2009;25:1063–1069. [PubMed]
34. Lim L, Lau N, Garrett-Engele P, Grimson A, Schelter J, et al. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature. 2005;433:769–773. [PubMed]
35. Sood P, Krek A, Zavolan M, Macino G, Rajewsky N. Cell-type-specific signatures of microRNAs on target mRNA expression. Proceedings of the National Academy of Sciences of the United States of America. 2006;103:2746. [PMC free article] [PubMed]
36. Filipowicz W, Bhattacharyya S, Sonenberg N. Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nature Reviews Genetics. 2008;9:102–114. [PubMed]
37. Baek D, Villén J, Shin C, Camargo F, Gygi S, et al. The impact of microRNAs on protein output. Nature. 2008;455:64–71. [PMC free article] [PubMed]
38. Selbach M, Schwanhausser B, Thierfelder N, Fang Z, Khanin R, et al. Widespread changes in protein synthesis induced by microRNAs. Nature. 2008;455:58–63. [PubMed]
39. Humphreys D, Westman B, Martin D, Preiss T. MicroRNAs control translation initiation by inhibiting eukaryotic initiation factor 4E/cap and poly (A) tail function. Proceedings of the National Academy of Sciences of the United States of America. 2005;102:16961. [PMC free article] [PubMed]
40. Khan A, Betel D, Miller M, Sander C, Leslie C, et al. Transfection of small RNAs globally perturbs gene regulation by endogenous microRNAs. Nature Biotechnology. 2009;27:549–555. [PMC free article] [PubMed]
41. Guo H, Ingolia N, Weissman J, Bartel D. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature. 2010;466:835–840. [PMC free article] [PubMed]
42. Farh K, Grimson A, Jan C, Lewis B, Johnston W, et al. The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science. 2005;310:1817. [PubMed]
43. Babak T, Zhang W, Morris Q, Blencowe B, Hughes T. Probing microRNAs with microarrays: tissue specificity and functional inference. RNA. 2004;10:1813. [PMC free article] [PubMed]
44. Liu H, D'Andrade P, Fulmer-Smentek S, Lorenzi P, Kohn K, et al. mRNA and microRNA expression profiles of the NCI-60 integrated with drug activities. Molecular Cancer Therapeutics. 2010;9:1080. [PMC free article] [PubMed]
45. Ritchie W, Rajasekhar M, Flamant S, Rasko J. Conserved Expression Patterns Predict microRNA Targets. PLoS Computational Biology. 2009;5(9):e1000513. [PMC free article] [PubMed]
46. Rodriguez A, Griffiths-Jones S, Ashurst J, Bradley A. Identification of mammalian microRNA host genes and transcription units. Genome Research. 2004;14:1902. [PMC free article] [PubMed]
47. Baskerville S, Bartel DP. Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA. 2005;11:241–247. [PMC free article] [PubMed]
48. Lu J, Getz G, Miska E, Alvarez-Saavedra E, Lamb J, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435:834–838. [PubMed]
49. Bargaje R, Hariharan M, Scaria V, Pillai B. Consensus miRNA expression profiles derived from interplatform normalization of microarray data. RNA. 2010;16:16. [PMC free article] [PubMed]
50. Liang Y, Ridzon D, Wong L, Chen C. Characterization of microRNA expression profiles in normal human tissues. BMC Genomics. 2007;8:166. [PMC free article] [PubMed]
51. Yu-Ping W, Kuo-Bin L. Correlation of expression profiles between microRNAs and mRNA targets using NCI-60 data. BMC Genomics. 2009;10 doi: 10.1186/1471-2164-10-218. [PMC free article] [PubMed]
52. Blower P, Verducci J, Lin S, Zhou J, Chung J, et al. MicroRNA expression profiles for the NCI-60 cancer cell panel. Molecular Cancer Therapeutics. 2007;6:1483. [PubMed]
53. Wang D, Lu M, Miao J, Li T, Wang E, et al. Cepred: predicting the co-expression patterns of the human intronic microRNAs with their host genes. PLoS One. 2009;4(2):e4421. doi: 10.1371/journal.pone.0004421. [PMC free article] [PubMed]
54. Ronchetti D, Lionetti M, Mosca L, Agnelli L, Andronache A, et al. An integrative genomic approach reveals coordinated expression of intronic miR-335, miR-342, and miR-561 with deregulated host genes in multiple myeloma. BMC Medical Genomics. 2008;1:37. [PMC free article] [PubMed]
55. Ruike Y, Ichimura A, Tsuchiya S, Shimizu K, Kunimoto R, et al. Global correlation analysis for micro-RNA and mRNA expression profiles in human cell lines. Journal of Human Genetics. 2008;53:515–523. [PubMed]
56. Kim YK, Kim VN. Processing of intronic microRNAs. The EMBO Journal. 2007;26:775–783. [PMC free article] [PubMed]
57. Li S, Tang P, Lin W. Intronic microRNA: discovery and biological implications. DNA and Cell Biology. 2007;26:195–207. [PubMed]
58. Monteys A, Spengler R, Wan J, Tecedor L, Lennox K, et al. Structure and activity of putative intronic miRNA promoters. RNA. 2010;16:495. [PMC free article] [PubMed]
59. Ozsolak F, Poling L, Wang Z, Liu H, Liu X, et al. Chromatin structure analyses identify miRNA promoters. Genes and Development. 2008;22:3172. [PMC free article] [PubMed]
60. Martinez N, Ow M, Reece-Hoyes J, Barrasa M, Ambros V, et al. Genome-scale spatiotemporal analysis of Caenorhabditis elegans microRNA promoter activity. Genome Research. 2008;18:2005. [PMC free article] [PubMed]
61. Wang X, Xuan Z, Zhao X, Li Y, Zhang M. High-resolution human core-promoter prediction with CoreBoost HM. Genome Research. 2009;19:266. [PMC free article] [PubMed]
62. Golan D, Levy C, Friedman B, Shomron N. Biased hosting of intronic microRNA genes. Bioinformatics. 2010;26:992. [PubMed]
63. Ernst J, Plasterer H, Simon I, Bar-Joseph Z. Integrating multiple evidence sources to predict transcription factor binding in the human genome. Genome Research. 2010;20:526. [PMC free article] [PubMed]
64. Corcoran D, Pandit K, Gordon B, Bhattacharjee A, Kaminski N, et al. Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data. PLoS One. 2009;4. doi: 10.1371/journal.pone.0005279. [PMC free article] [PubMed]
65. Zhou X, Ruan J, Wang G, Zhang W. Characterization and identification of microRNA core promoters in four model species. PLoS Comput Biol. 2007;3(3):e37. doi:10.1371/journal.pcbi. [PMC free article] [PubMed]
66. Huang JC, Morris QD, Frey BJ. Bayesian inference of microRNA targets from sequence and expression data. Journal of Computational Biology. 2007;14:550–563. [PubMed]
67. Krek A, Grun D, Poy M, Wolf R, Rosenberg L, et al. Combinatorial microRNA target predictions. Nature Genetics. 2005;37:495–500. [PubMed]
68. Vasudevan S, Tong Y, Steitz J. Switching from repression to activation: microRNAs can up-regulate translation. Science. 2007;318:1931. [PubMed]
69. Vasudevan S, Steitz J. AU-rich-element-mediated upregulation of translation by FXR1 and Argonaute 2. Cell. 2007;128:1105–1118. [PMC free article] [PubMed]
70. Swisher K, Parker R. Localization to, and Effects of Pbp1, Pbp4, Lsm12, Dhh1, and Pab1 on Stress Granules in Saccharomyces cerevisiae. PLoS One. 2010;5(4):e10006. doi: 10.1371/journal.pone.0010006. [PMC free article] [PubMed]
71. Huang J, Morris Q, Frey B. Detecting microRNA targets by linking sequence, microRNA and gene expression data. In: Research in Computational Molecular Biology. Berlin, Germany: Springer-Verlag; 2006. pp. 114–129.

Articles from PLoS ONE are provided here courtesy of Public Library of Science