Genes Dev. 2001 Jul 1; 15(13): 1637–1651.
PMCID: PMC312727

Identification of novel small RNAs using comparative genomics and microarrays


A burgeoning list of small RNAs with a variety of regulatory functions has been identified in both prokaryotic and eukaryotic cells. However, it remains difficult to identify small RNAs by sequence inspection. We used the high conservation of small RNAs among closely related bacterial species, as well as analysis of transcripts detected by high-density oligonucleotide probe arrays, to predict the presence of novel small RNA genes in the intergenic regions of the Escherichia coli genome. The existence of 23 distinct new RNA species was confirmed by Northern analysis. Of these, six are predicted to encode short ORFs, whereas 17 are likely to be novel functional small RNAs. We discovered that many of these small RNAs interact with the RNA-binding protein Hfq, pointing to a global role of the Hfq protein in facilitating small RNA function. The approaches used here should allow identification of small RNAs in other organisms.

Keywords: Hfq, rpoS, antisense regulation

In the last few years, the importance of regulatory small RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. Although instances of naturally occurring antisense RNAs have been known for many years, the participation of sRNAs in protein tagging for degradation, modulation of RNA polymerase activity, and stimulation of translation are relatively recent discoveries (for review, see Wassarman et al. 1999; Wassarman and Storz 2000). These findings have raised questions about how extensively sRNAs are used, what other cellular activities might be regulated by sRNAs, and what other mechanisms of action exist for sRNAs. In addition, prokaryotic sRNAs appear to target different cellular functions than their eukaryotic counterparts, which primarily act during RNA biogenesis. It is unclear whether this apparent difference between prokaryotic and eukaryotic sRNAs is accurate or stems from the incompleteness of current knowledge. Implicit in these questions is the question of how many sRNAs exist in a given organism and whether the currently known sRNAs are truly representative of sRNA function in general.

To date, most known bacterial sRNAs have been identified fortuitously by the direct detection of highly abundant sRNAs (4.5S RNA, tmRNA, 6S RNA, RNaseP RNA, and Spot42 RNA), by the observation of an sRNA during studies on proteins (OxyS RNA, Crp Tic RNA, CsrB RNA, and GcvB RNA), or by the discovery of activities associated with overexpression of genomic fragments (MicF RNA, DicF RNA, DsrA RNA, and RprA RNA) (Okamoto and Freundlich 1986; Bhasin 1989; Urbanowski et al. 2000; Wassarman and Storz 2000; Majdalani et al. 2001; for review, see Wassarman et al. 1999). None of the Escherichia coli sRNAs were found as a result of mutational screens. This observation may reflect the small target size of genes encoding sRNAs compared to protein genes, or may be a consequence of the regulatory rather than essential nature of many sRNA functions. The complete genome sequence of an organism provides a rapid inventory of most encoded proteins, tRNAs, and rRNAs, but it has not led to the immediate recognition of other genes that are not translated. In particular, new bacterial sRNA genes have been overlooked because there are no identifiable classes of sRNAs that can be found based solely on sequence determinants.

We and others have previously suggested several approaches to look for new sRNAs including computer searching of complete genomes based on parameters common to sRNAs, probing of genomic microarrays, and isolating sRNAs based on an association with general RNA-binding proteins (Eddy 1999; Wassarman et al. 1999). Using a combination of these approaches, we have identified 17 novel sRNAs; in addition, we have found six small transcripts that contain short conserved open reading frames (ORFs).


Identification of candidate sRNA genes by homology

As a starting point for detecting novel sRNAs in E. coli, we considered a number of common properties of the previously identified sRNAs that might serve as a guide to identify genes encoding new sRNAs. We are defining sRNAs as relatively short RNAs that do not function by encoding a complete ORF. Of the 13 small RNAs known when this work began, we were struck by the high conservation of these genes between closely related organisms. In most cases, the conservation between E. coli and Salmonella was >85%, whereas that of the typical gene encoding an ORF was frequently <70% (data not shown). Conservation tests on random noncoding regions of the genome suggested that extended conservation in intergenic regions was unusual enough to be used as an initial parameter to screen for new sRNA genes. We therefore tested this approach to look for novel sRNAs in the E. coli genome.

All known sRNAs are encoded within intergenic (Ig) regions (defined as regions between ORFs). A file (R. Overbeek, pers. comm.) containing all Ig sequences from the E. coli genome (Blattner et al. 1997) was used as a starting point for our homology search. We arbitrarily chose the 1.0- to 2.5-Mb region of the 4.6-Mb E. coli genome to test and refine our approach and developed the following steps for searching the full E. coli genome.

All Ig regions of 180 nucleotides or larger were compared to the NCBI Unfinished Microbial Genomes database using the BLAST program (Altschul et al. 1990). These 1097 Ig regions were rated based on the degree of conservation and length of the conserved region when compared to the closely related Salmonella and Klebsiella pneumoniae species. The highest rating was given to Ig regions with a high degree of conservation (raw BLAST score of >80) over at least 80 nt (see Materials and Methods for explanation of ratings). Note that most promoters do not meet these length and conservation requirements. Figure 1 shows a set of BLAST searches for three known sRNAs (RprA RNA, CsrB RNA, and OxyS RNA), three Ig regions with high conservation (#14, #17, and #52), and one Ig region with intermediate conservation (#36). Some Ig regions had a large number of matches, often to several chromosomal regions of the same organism. These Ig regions were noted, and many were found to contain tRNAs, rRNAs, REP, or other repeated sequences. The 40 highly conserved Ig regions containing tRNAs and/or rRNAs were eliminated from our search because these regions were complicated in their patterns of conservation.
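The length and conservation thresholds described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the `BlastHit` record and the function names are hypothetical, and only the top rating is modeled because the thresholds for the intermediate rating are given in Materials and Methods rather than here.

```python
from dataclasses import dataclass
from typing import List

MIN_IG_LENGTH = 180     # nt; Ig regions shorter than this were not searched
MIN_RAW_SCORE = 80      # raw BLAST score must exceed this for the top rating
MIN_MATCH_LENGTH = 80   # nt; the conserved stretch must be at least this long


@dataclass
class BlastHit:
    """Hypothetical record for one BLAST match of an Ig region."""
    organism: str
    raw_score: float
    match_length: int  # nt covered by the match


def is_searchable(ig_length: int) -> bool:
    """True if the Ig region is long enough to enter the search."""
    return ig_length >= MIN_IG_LENGTH


def has_top_rating(hits: List[BlastHit]) -> bool:
    """True if any single hit scores >80 over at least 80 nt."""
    return any(
        h.raw_score > MIN_RAW_SCORE and h.match_length >= MIN_MATCH_LENGTH
        for h in hits
    )
```

A region with several weak or short matches is not promoted to the top rating in this sketch; each hit is judged on its own.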

Figure 1
BLAST alignments of representative Ig regions. The indicated Ig regions were used in a BLAST search of the NCBI Unfinished Microbial Genomes database. Each panel shows the summary figure provided by the BLAST program for matches to Salmonella enteritidis, ...

Next the orientation and identity of the ORFs bordering the Ig regions were determined using the Colibri database, an annotated listing of all E. coli genes and their coordinates. Inconsistencies between the Colibri database and our original file led to the reclassification of some Ig regions as shorter than 180 nt, and these were not analyzed further. Of the remaining 1006 Ig regions, 13 contained known small RNAs, 295 were in the highest conservation group, 88 showed intermediate conservation, and 610 showed no conservation.

The location of the conservation relative to the orientation of the flanking ORFs was an important consideration in choosing candidates for further analysis. In many cases (132/295 Ig regions), the conserved region was just upstream of the start of an ORF, consistent with conservation of regulatory regions, including untranslated leaders. Cases where the conserved region was more than 50 nt from an ORF start or extended over more than 150 nt in length (RprA RNA, CsrB RNA, OxyS RNA, #17, and #52 in Fig. 1), or where the bordering ORFs ended rather than started at the Ig region (#14 in Fig. 1), were considered better candidates for novel sRNAs.
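The positional criteria in this paragraph can be written as a single predicate. The following Python sketch is illustrative only; the `ConservedBlock` fields are hypothetical names, and the study applied these criteria by inspection rather than by any code shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConservedBlock:
    """Hypothetical summary of the conserved block within one Ig region."""
    length: int                           # nt of conserved sequence
    distance_to_orf_start: Optional[int]  # nt to nearest flanking ORF start; None if no ORF starts at this Ig
    flanking_orfs_end_at_ig: bool         # True if the bordering ORFs end (rather than start) at the Ig


def is_better_candidate(block: ConservedBlock) -> bool:
    """Apply the three alternative criteria named in the text."""
    far_from_start = (
        block.distance_to_orf_start is not None
        and block.distance_to_orf_start > 50
    )
    long_block = block.length > 150
    return far_from_start or long_block or block.flanking_orfs_end_at_ig
```

A short conserved block sitting just upstream of an ORF start fails all three tests, which matches the text's reading of such blocks as likely regulatory regions or untranslated leaders.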

Published information on promoters and other known regulatory sites within conserved regions of promising candidates was tabulated and used to eliminate many candidates in which the conservation could be attributed to previously identified promoters or 5′ untranslated leaders. Finally, the remaining candidate regions were examined for sequence elements such as potential promoters, terminators, and inverted repeat regions. We considered evidence for possible stem-loops, in particular those with characteristics of rho-independent terminators, as especially indicative of possible sRNA genes (Table 1).

Table 1
sRNA Candidates

Using these criteria, together with microarray expression data (see below), a set of 59 candidates was selected (Table 1). Candidates 1–18 were chosen in the first round of screening of the 1.0- to 2.5-Mb region; some of these candidates would not have met the higher criteria applied to the rest of the genome.

Selecting candidate genes by whole genome expression analysis

In an independent series of experiments, high-density oligonucleotide probe arrays were used to detect transcripts that might correspond to sRNAs from Ig regions. Total RNA isolated from MG1655 cells grown to late exponential phase in LB medium was labeled for probes or used to generate cDNA probes (see Materials and Methods). From a single RNA isolation each labeling approach was carried out in duplicate and individually hybridized to high-density oligonucleotide microarrays. The high-density oligonucleotide probe arrays used are appropriate for this analysis because they have probes specific for both the clockwise (Watson) and counterclockwise (Crick) strands of each Ig region as well as for the sense strand of each ORF. The resulting data from the four experiments were analyzed to examine global expression within Ig regions, as well as neighboring ORFs.

Our criteria for analyzing the microarray data evolved during the course of this analysis. Stringent criteria (longer transcripts in the Ig region, higher expression levels) identified many of the previously known sRNAs but did not uncover many strong candidates for new small RNAs. More relaxed criteria (shorter transcripts, lower expression levels) gave a very large number of candidates and therefore were not by themselves useful as the initial basis for identifying candidates. However, these data were very useful as an additional criterion for selection of candidate regions based on the conservation approach. Detection of a transcript by microarray on the strand opposite to that of surrounding ORFs was considered a strong indicator of an sRNA (S* in Table 1). Microarray data contributed to the selection of 34 of 59 candidates (Table 1). Examples of the different types of expression observed in microarray experiments are shown in Figure 2. Signal corresponding to CsrB RNA clearly is detected on the Crick (C) strand. #17 and #36 have a transcript in the Ig region on the opposing strand (C) to that for the flanking genes (Watson; W). However, the expression patterns were not as obvious in many cases, either because expression levels were low or because the pattern of expression could be interpreted in a number of ways. For instance, very little expression was detected for RprA RNA encoded on the W strand, and there is unexplained signal detected from the opposite strand of the rprA and csrB Ig regions. #14 and #52 also had some expression on each strand (Fig. 2). #14 proved to express a small RNA from the Watson strand, whereas #52 expresses sRNAs from each strand (see below and Table 2).
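The opposite-strand criterion can be sketched in Python as follows. Probe signal is taken as perfect match minus mismatch intensity, as in the Figure 2 legend; the averaging step, the `threshold` parameter, and all function names are assumptions made for illustration, since the text does not give a numerical cutoff.

```python
from statistics import mean
from typing import Dict, List, Set, Tuple

ProbePair = Tuple[float, float]  # (perfect-match intensity, mismatch intensity)


def probe_signals(pairs: List[ProbePair]) -> List[float]:
    """Perfect match minus mismatch for each probe pair."""
    return [pm - mm for pm, mm in pairs]


def expressed_strands(pairs_by_strand: Dict[str, List[ProbePair]],
                      threshold: float) -> Set[str]:
    """Strands ('W' or 'C') whose mean Ig signal exceeds a hypothetical threshold."""
    expressed = set()
    for strand, pairs in pairs_by_strand.items():
        signals = probe_signals(pairs)
        if signals and mean(signals) > threshold:
            expressed.add(strand)
    return expressed


def opposite_strand_hits(pairs_by_strand: Dict[str, List[ProbePair]],
                         flanking_strands: Set[str],
                         threshold: float) -> Set[str]:
    """Expressed Ig strands that differ from the strand of every flanking ORF."""
    return {
        strand
        for strand in expressed_strands(pairs_by_strand, threshold)
        if strand not in flanking_strands
    }
```

With flanking ORFs on both strands, no Ig signal can satisfy the criterion in this sketch, which is one reason strand information alone leaves many regions ambiguous.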

Figure 2
Expression profile across high-density oligonucleotide arrays for representative Ig regions. Probe intensities are shown for the indicated Ig regions (red) and the flanking ORFs (blue), calculated from the perfect match minus the mismatch intensities. ...
Table 2
Novel sRNAs and Predicted Small ORFsa

Given that a number of the known sRNAs are relatively stable, we tested whether selection for stable RNAs might allow the microarray data to be more useful for de novo identification of sRNA candidates. The transcription inhibitor rifampicin was added to cells for 20 min prior to harvesting the RNA with the intention of enriching for stable RNAs. Many of the known sRNAs can be detected after the rifampicin treatment. Of the 59 candidates in Table 1, 12 retained a hybridization signal (marked rif in Table 1), and 4 of these proved to correspond to small transcripts (see below). Other rif-resistant transcripts detected in Ig regions appeared to be highly expressed leaders.

Small RNA transcripts detected by Northern hybridization

The final test for the presence of an sRNA gene was the direct detection of a small RNA transcript. The candidates in Table 1 were analyzed by Northern hybridization using RNA extracted from MG1655 cells harvested from three growth conditions (exponential phase in LB medium, exponential phase in M63-glucose medium, or stationary phase in LB medium). The microarray analysis discussed above used RNA isolated from cells grown to late exponential phase in LB medium, which is intermediate between the two LB growth conditions used for the Northern analysis. Initially, Northern analysis was carried out using double-stranded DNA probes containing the full Ig region for most candidates. In three cases (#8, #22, and #55) PCR amplification of the Ig region to generate a probe was not successful and therefore oligonucleotide probes were used for Northern analysis. Seventeen candidates gave distinct bands consistent with small RNAs, and one additional candidate gave a somewhat larger RNA, but the location of conservation was not consistent with a leader sequence for a flanking ORF (#36). In some of these cases, two or more RNA species were detected with a single Ig probe (Table 2; see also Fig. 3). One candidate (#43) gave a signal with the double-stranded DNA probe, but contains regions duplicated elsewhere in E. coli that probably account for this signal (see below). Of the remaining 41 candidates, 17 gave no detectable transcript. These Ig regions could encode sRNAs expressed only under very specific growth conditions. For instance, #8 has all the sequence hallmarks of an sRNA gene (a well-conserved region preceded by a possible promoter and ending with a terminator), but has not been detected. Alternatively, the observed conservation could be caused by nontranscribed regulatory regions. Fairly large RNAs were detected for another 24 candidates. Given the size of these transcripts together with data on the orientation of flanking genes and the location of conserved regions, it is likely these are leader sequences within mRNAs (Table 1).

Figure 3
Detection of novel sRNAs by Northern hybridization. Northern hybridization using strand-specific probes for each candidate was done on RNA extracted from MG1655 cells grown under three different growth conditions: (E) exponential growth in LB medium, ...

For candidates expressing RNAs not expected to be 5′ untranslated leaders, Northern analysis was carried out with strand-specific probes to determine gene orientation (Fig. 3). For many of the candidates, we used sequence elements (see below) as well as expression information from the microarray experiments to predict which strand was most likely expressed; both strands were tested when predictions were unclear. The results from the strand-specific probes generally agreed with predictions and were used to estimate the RNA size (Table 2). Interestingly, in one case there is an sRNA expressed from both the W and C strands within the Ig (#52; Fig. 3). For #12, although no sRNA had been detected using a double-stranded DNA probe, the presence of a potential terminator and promoter remained suggestive of the presence of an sRNA gene. Therefore, oligonucleotide probes also were used in Northern analysis of this candidate, and a small RNA transcript was detected (Fig. 3; Table 1).

Examination of expression profiles of the RNAs under different growth conditions gave an indication of specificity of expression. Some candidates were detected under all three growth conditions; others were preferentially expressed under one growth condition (Fig. 3; Table 2). For instance, #25 was present primarily during growth in minimal medium, consistent with the absence of detection in the whole genome expression experiment, which analyzed RNA isolated from cells grown in rich medium.

Sequence predictions of sRNA genes and ORFs

For the candidates expressing small RNA transcripts, the conserved sequence blocks (contigs) from K. pneumoniae, the highest conserved Salmonella species, and in a few cases Yersinia pestis, were selected from the NCBI Unfinished Microbial Genome database and aligned with the E. coli Ig region using GCG Gap (Devereux et al. 1984). Multiple alignments were assembled by hand, and the conserved regions were examined for likely promoters and terminators and other conserved structures (data not shown). Information from the alignments, together with results from strand-specific Northern and microarray expression analyses, allowed assignments of gene orientation, putative regulatory regions, and RNA length from the predicted starting and ending positions. Where a terminator sequence was very apparent (13 of 19 candidates), transcription was assumed to end at the terminator, and the observed size of the transcript was used to help identify possible promoters. The identification of promoters and terminators was less definite when there was only one species with conservation to E. coli.

As the alignments were assembled, the pattern of conservation in some cases was reminiscent of patterns expected from ORFs, with higher sequence variation in positions consistent with the third nucleotide of codons. GCG Map (Devereux et al. 1984) was used to predict translation in all frames for all of the candidate small RNAs. In six cases, the conservation and translation potential suggested the presence of a short ORF (data not shown). In these cases, a ribosome-binding site and the potential ORF were well conserved, with the most variation in the third position of codons, but other elements of the predicted RNA were less well conserved. For example, #17 expresses an RNA of ∼266 nt, containing a predicted ORF of only 19 amino acids. Within the predicted Shine-Dalgarno sequence and ORF, only 9/80 positions showed variation for either Klebsiella or Salmonella, but the overall RNA is <60% conserved. We predict that for #17, as well as five others (Table 2), the detected RNA transcript is functioning as an mRNA, encoding a short, conserved ORF. An evaluation of both the new predicted ORFs and the untranslated sRNAs with GLIMMER, a program designed to predict ORFs within genomes, gave complete agreement with our designations (Delcher et al. 1999).

We have assigned gene names to all candidates that we have confirmed are expressed as RNAs (see Table 2). The genes we predict to encode ORFs were given names according to accepted practice for ORFs of unknown function (Rudd 1998). The genes that express sRNAs without evidence of conserved ORFs were named with a similar nomenclature: ryx, with ry denoting RNA of unknown function and x indicating the 10 min interval on the E. coli genetic map.

We noted one instance of overlap in sequence between our new sRNAs. The conserved region within #43 is highly homologous to a duplicated region within #55, as well as to a fourth region of the chromosome within a more poorly conserved Ig (#61 in Table 1). This repeated region was previously denoted the QUAD repeat and suggested to encode sRNAs (Rudd 1999). Each of the QUAD repeats contains a short stretch homologous to boxC, a repeat element of unknown function present in 50 copies or more within the genome of E. coli (Bachellier et al. 1996). Rudd also has detected transcripts from the QUAD regions (G. Tolun, Z. Li, and K. Rudd, pers. comm.). To determine which of the four QUAD genes was being expressed, we designed oligonucleotide probes unique for each of the four repeats. These oligonucleotide probes demonstrated expression for three of the four QUAD genes (#55-I, #55-II, and #61); furthermore, each gave two RNA bands (Fig. 3; Table 2). No signal was detected for the fourth repeat (#43). The #41 Ig region encodes another pair of repeats, PAIR2 (Rudd 1999), and we observed two RNA species, suggesting that each of the repeats may be transcriptionally active. Finally, another repeat region noted by Rudd, PAIR3, is encoded by the #22 Ig region.

Many sRNAs bind Hfq and modulate rpoS expression

Hfq is a small, highly abundant RNA-binding protein first identified for its role in replication of the RNA phage Qβ (Franze de Fernandez et al. 1968; for review, see Blumenthal and Carmichael 1979). Recently, Hfq has been shown to be involved in a number of RNA transactions in the cell, including translational regulation (rpoS), mRNA polyadenylation, and mRNA stability (ompA, mutS, and miaA) (Muffler et al. 1996; Tsui et al. 1997; Vytvytska et al. 1998; Hajnsdorf and Regnier 2000; Vytvytska et al. 2000). Three of the known E. coli sRNAs regulate rpoS expression: DsrA RNA and RprA RNA positively regulate rpoS translation, whereas OxyS RNA represses its translation. In all three cases the Hfq protein is required for regulation (Zhang et al. 1998; Majdalani et al. 2001; Sledjeski et al. 2001), and binding studies have revealed a direct interaction between Hfq and the OxyS and DsrA RNAs (Zhang et al. 1998; Sledjeski et al. 2001).

Given the interaction of the Hfq protein with at least three of the known sRNAs, we asked how many of the newly discovered sRNAs are bound by this protein. Hfq-specific antiserum was used to immunoprecipitate Hfq-associated RNAs from extracts of cells grown under the conditions used for the Northern analysis. Total immunoprecipitated RNA was examined using two methods. First, RNA was 3′-end labeled and selected RNAs were visualized directly on polyacrylamide gels. Under each growth condition, several RNA species coimmunoprecipitated with Hfq-specific serum but not with preimmune serum, which suggests that many sRNAs interact with Hfq (Fig. 4A; data not shown). Second, selected RNAs were examined using Northern hybridization to determine whether other known sRNAs and any of our newly discovered sRNAs interact with Hfq. For each sRNA, Hfq binding was examined under growth conditions where the sRNA was most abundant (Fig. 4B; Table 2). sRNAs present in samples using the Hfq antiserum but not preimmune serum were concluded to interact with Hfq. Comparison of levels of a selected sRNA relative to the total amount of that sRNA in the extract revealed that many of the sRNAs bound Hfq quite efficiently (>30% bound) (#14, #24, #25, #26, #31, #41, #52-II, Spot42 RNA, and RprA RNA), but other sRNAs bound Hfq less efficiently (<10% bound) (#9, #17, and #52-I), or not at all (#27, #38, #40, 6S RNA, 5S RNA, and tmRNA) (Fig. 4; Table 2). The physiological significance of the weaker interactions remains to be tested.
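The binding categories used above amount to a simple classification of the fraction of each sRNA recovered with Hfq. The Python sketch below is one possible way to express it; the function name and inputs are hypothetical, and the "unclassified" label for values between 10% and 30% is an assumption, because the text does not assign that range to either group.

```python
def hfq_binding_class(bound_signal: float, total_signal: float) -> str:
    """Classify Hfq binding from the coimmunoprecipitated and total sRNA signal.

    Uses the >30% ("efficient") and <10% ("weak") cutoffs from the text.
    """
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    fraction = bound_signal / total_signal
    if fraction <= 0:
        return "none"
    if fraction > 0.30:
        return "efficient"
    if fraction < 0.10:
        return "weak"
    return "unclassified"  # 10-30% bound is not categorized in the text
```

The signals would come from quantified Northern bands for the immunoprecipitate and the input extract; that quantification step is outside this sketch.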

Figure 4
Coimmunoprecipitation of sRNAs with the Hfq protein. (A) Immunoprecipitations using extract from MG1655 cells grown in LB medium in exponential growth (OD600 = 0.4) were done using no antibody (lane 1); 5 μL of preimmune serum ...

As mentioned above, at least three of the known sRNAs that interact with Hfq also regulate translation of rpoS, the stationary phase σ factor. In light of the fact that many of the new sRNAs also interact with Hfq, we examined whether these new sRNAs affect rpoS expression. Plasmids carrying the Ig regions encoding either control sRNAs (pRS-DsrA and pRS-RprA) or many of our novel sRNAs were introduced into an MG1655 Δlac derivative carrying an rpoS–lacZ translational fusion. We then compared expression of the rpoS–lacZ fusion in these cells to cells carrying the control vector by measuring β-galactosidase activity at stationary phase in LB or M63–glucose medium (Table 2). As expected, overproduction of either DsrA RNA or RprA RNA increased rpoS–lacZ expression significantly (Table 2 legend). A number of plasmids (pRS-#24, pRS-#31) led to increased rpoS–lacZ expression, whereas others (pRS-#12, pRS-#14, and pRS-#25) led to decreased expression. These results suggest that the corresponding sRNAs may directly regulate rpoS expression or indirectly affect rpoS expression by altering Hfq activity, possibly by competition. Intriguingly, there is not a complete correlation between Hfq binding and altered rpoS–lacZ expression in these studies.

As a start in defining possible functions for the sRNAs, we screened strains carrying the multicopy plasmids for effects on growth in LB medium at various temperatures as well as growth in minimal medium containing a number of different carbon sources. pRS-#25 renders cells unable to grow on succinate in agreement with predictions for #25 RNA interaction with sdh mRNA (discussed below). We were unable to isolate plasmids carrying the #27 Ig region without mutations, suggesting that overproduction of this small RNA may interfere with growth. No other growth phenotypes were observed. A caveat for the interpretation of results with the multicopy plasmids is that they contain the full intergenic region; therefore, we cannot rule out effects of sequences outside the sRNA genes but within the intergenic regions.


In summary, a multifaceted search strategy to predict sRNA genes was validated by our discovery of 17 novel sRNAs. Northern analysis determined that 44 of 60 candidate regions express RNA transcripts, some of them expressing more than one RNA species. Of these transcripts, 24 were concluded to be 5′ untranslated leaders for mRNAs of flanking genes, and another 6 are predicted to encode new, short ORFs (Tables 1 and 2). The 17 transcripts believed to be novel, functional sRNAs range from 45 nt to 320 nt in length and vary significantly in expression levels and expression profiles under different growth conditions. More than half of the new sRNAs were found to interact with the RNA-binding protein Hfq, suggesting that Hfq binding may be a defining characteristic of a family of prokaryotic sRNAs.

Evaluation of selection criteria

Three general approaches for predicting sRNA genes were evaluated in this work. In the primary approach, Ig regions were scored for degree and length of conservation between closely related bacterial species followed by examination of sequence features. This approach proved to be very productive in identifying Ig regions encoding novel sRNAs in E. coli; >30% of the candidates selected primarily on the basis of their conservation proved to encode novel small transcripts. The availability of nearly completed genome sequences for Salmonella and Klebsiella made this approach possible. Any organism for which the genome sequences of closely related species are known can be analyzed in this way. Comparative genomics of this sort have been used before to search for regulatory sites (for review, see Gelfand 1999), but have not been employed previously to find sRNAs.

Although we found the conservation-based approach to be the most productive in identifying sRNA genes, we note a number of limitations to its use. A high level of conservation is not sufficient to indicate the presence of an sRNA gene. Many of the most highly conserved regions, not unexpectedly, were consistent with regulatory and leader sequences for flanking genes. We also did not analyze any Ig regions where conservation was attributable to sources other than an sRNA. For example, potential sRNAs processed from mRNAs, or any sRNAs encoded by the antisense strand of ORFs or leaders, may have been missed in our approach. We made the assumption that Ig regions must be ≥180 nt to encode an sRNA of ≥60 nt, a 50–60-nt promoter and regulatory region to control expression of the sRNA, as well as regulatory regions for flanking genes. Any sRNA genes in smaller Ig regions would have been overlooked. We also excluded the highly conserved tRNA and rRNA operons from our consideration because of their complexity. It is certainly possible that sRNA genes may be associated with these other RNA genes. In fact, sRNA genes have been predicted to be encoded in at least one tRNA operon (R. Carter, I. Dubchak, and S. Holbrook, pers. comm.). In addition, conservation need not be a property of all sRNAs. We expect sRNAs that play a role in modulating cellular metabolism to be well conserved, as is the case for the previously identified sRNAs. Nevertheless, sRNAs may be encoded within or act upon regions for which there is no homology between E. coli, Klebsiella, and Salmonella (e.g., in cryptic prophages and pathogenicity islands), and they would be missed by this approach. Only 1 of 24 Ig regions within the e14, CP4-54, or CP4-6 prophages showed conservation. A few of these Ig regions showed evidence of transcription by microarray analysis, and RNAs have been implicated in immunity regulation in phage P4 (Ghisotti et al. 1992), which is related to the prophages CP4-54 and CP4-6. Despite the limitations listed above, however, we believe the use of conservation provides a relatively quick identification of the majority of sRNAs.

An alternative genomic sequence-based strategy for identifying sRNAs would be to search for orphan promoter and terminator elements as well as other potential RNA structural elements. Potential promoter elements were generally too abundant to be useful predictors without other information on their expected location and orientation. We found sequences predicted to be rho-independent terminators a more useful indicator of sRNAs; such sequences were clearly present for 13/17 of the sRNAs and 3/6 of the new mRNAs. In a number of cases, it appears that the sRNAs share a terminator with a convergent gene for an ORF. In other cases, either no terminator was detected or it appeared to be in a neighboring ORF. A search using promoter and terminator sequences as the requirements for identifying sRNAs might therefore have found two-thirds of the sRNAs described here. Phage integration target sequences also could be scanned for nearby sRNA genes. Many phage att sites overlap tRNAs (for review, see Campbell 1992), and ssrA, encoding the tmRNA, has a 3′ structure like a tRNA and overlaps the att site of a cryptic prophage (Kirby et al. 1994). In this work, we found that the 3′ end and terminator of #14 overlaps the previously mapped phage P2 att site (Barreiro and Haggard-Ljungquist 1992). #14 sRNA does not obviously resemble a tRNA, suggesting that the overlap between phage att sites and RNA genes extends beyond tRNAs and related molecules and may be common to additional sRNAs.

Our second approach, high-density oligonucleotide probe array expression analysis, proved to be more useful in confirming the presence of sRNA genes first found by the conservation approach than in identifying new sRNA genes de novo. Further consideration of the location of microarray signal compared to flanking genes as well as analysis of microarray signals after a variety of growth conditions should expand the ability to detect sRNAs in this manner. Under a single growth condition, signal consistent with the RNA identified by Northern analysis was detected for 5/15 of the Ig regions proven to encode new sRNAs and for 4/6 of the new mRNAs. Thus, a similar analysis of microarray data in nonconserved genomic regions might help in the identification of sRNAs missed by the conservation-based approaches. We predict that sRNAs from any organism expressed at reasonably high levels under normal growth conditions will be detected by microarrays that interrogate the entire genome, inclusive of noncoding regions.

One clear limitation in detecting sRNAs with microarray or Northern analyses is the fact that some sRNAs may be expressed only under limited growth conditions or at extremely low levels. We chose three growth conditions to scan our samples. Although most of the previously known sRNAs were seen under these conditions, OxyS RNA, which is induced by oxidative stress, was not detectable. For a few of our candidates in which no RNA was detected, it is possible that an sRNA is encoded but is not expressed sufficiently to be detected under any of our growth conditions. Another possible limitation of hybridization-based approaches is that highly structured sRNAs may be refractory to probe generation. sRNA transcripts may not remain quantitatively represented after the fragmentation used in the direct labeling approach here. cDNA labeling also may underrepresent sRNAs because they are a small target for the oligonucleotide primers, and secondary structure can interfere with efficiency of extension.

As our third approach, sRNAs were selected on the basis of their ability to bind to the general RNA-binding protein Hfq. Northern analysis revealed that many of our novel sRNAs interact with Hfq. In preliminary microarray analysis of Hfq-selected RNAs to look for additional unknown sRNAs, DsrA RNA, DicF RNA, Spot42 RNA, #14, #24, #25, #31, #41, and #52-II were detected among those RNAs with the largest difference in levels between Hfq-specific sera and preimmune sera (data not shown). This preliminary experiment suggests that microarray analysis of selected RNAs will be very valuable on a genome-wide basis. Interestingly, a large number of genes with leaders and a number of RNAs for operons were found to coimmunoprecipitate with Hfq (including the known Hfq target nlpD-rpoS mRNA) (Brown and Elliott 1996). It seems likely that the subset of sRNAs binding a common protein will represent a subset in terms of function; the sRNAs of known function associated with Hfq in our experiments appear to be those involved in regulating mRNA translation and stability. Other sRNAs have been shown to interact with specific prokaryotic RNA-binding proteins, for example, tmRNA with SmpB (Karzai et al. 1999), and the possibility of other sRNAs interacting with these proteins or other general sRNA-binding proteins should be tested. This approach is adaptable to all organisms, and, in fact, binding to Sm and Fibrillarin proteins has been the basis for identification of several sRNAs in eukaryotic cells (Montzka and Steitz 1988; Tyc and Steitz 1989).

All the criteria we used to identify sRNAs also will detect short genes encoding new small peptides, and we have found six conserved short ORFs. Although our approach was intended to develop methods to identify nontranslated genes within the genome, short ORFs also are missing from annotated genome sequences. The combination of a requirement for conservation and/or transcription with sequence predictions for ORFs should add significantly to our ability to recognize short ORFs. Small polypeptides have been shown to have a variety of interesting cellular roles. It is tempting to speculate that some of the short ORFs we have found may be involved in signaling pathways, akin to those of B. subtilis peptides that enter the medium and carry out cell–cell signaling (for review, see Lazazzera 2000).

Characteristics and possible functions of new sRNAs

The current work serves as a blueprint for the initial prediction, detection, and characterization of a large group of novel sRNAs. Although we do not have definitive information on function yet, some characteristics that may provide clues regarding the cellular roles of these new sRNAs are noted. Several known sRNAs that bind the Hfq protein act via base pairing to target mRNAs. The finding that a number of our new sRNAs bind Hfq may suggest a similar mechanism of action for this subset of sRNAs. We searched the E. coli genome for possible complementary target sequences and examined phenotypes associated with multicopy plasmids containing new sRNA genes. Intriguingly, #25, an sRNA preferentially expressed in minimal medium, has extended complementarity to a sequence near the start of sdhD, the second gene of the succinate dehydrogenase operon (data not shown). When the #25 Ig region is present on a multicopy plasmid, it interferes with growth on succinate minimal medium (Table 2), consistent with #25 sRNA acting as an antisense RNA for sdhD. Complementarity to potential target mRNAs was found for a number of other novel sRNAs, but the validity of these possible interactions remains to be confirmed by experimentation.
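A search for complementary target sequences can be sketched in a few lines of Python. The example below reports the longest stretch of contiguous, perfect Watson-Crick pairing between an sRNA and a candidate target. It is a simplified illustration only: the function names and the two short sequences are invented, G-U wobble pairs and gaps are ignored, and it is not the search procedure used in this work.

```python
# Hypothetical sketch: longest perfect antisense match between two RNAs.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an uppercase RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_antisense_match(srna, target):
    """Return (length, srna_start, target_start), 0-based, for the longest
    contiguous region of the sRNA that can pair perfectly with the target."""
    probe = reverse_complement(srna)
    best = (0, 0, 0)                      # (length, probe_start, target_start)
    prev = [0] * (len(target) + 1)
    # Dynamic programming for the longest common substring of probe and target.
    for i, a in enumerate(probe, 1):
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, 1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    length, probe_start, target_start = best
    # Convert the probe coordinate back to a coordinate on the sRNA itself.
    srna_start = len(srna) - probe_start - length
    return length, srna_start, target_start


# Invented sequences for illustration only.
result = longest_antisense_match("GGAUCCAAA", "CCCUUUGGAUCCGG")
```

In practice, candidate pairings found this way would still need the experimental confirmation noted above.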

As outlined in the evaluation of each of our approaches, we do not expect our searches to have been exhaustive. sRNAs also have been detected by others using a variety of approaches. The sRNA encoded by #38 was independently identified as a regulatory RNA (CsrC RNA; T. Romeo, pers. comm.), and others have found additional sRNAs using variations of the approaches used here (Argaman et al. 2001). Nevertheless, we think it unlikely that there are many more than 50 sRNAs encoded by the E. coli chromosome and by closely related bacteria. We expect such sRNAs to be present and playing important regulatory roles in all organisms. Using the approaches described here, it is feasible to search all sequenced organisms for these important regulatory molecules. We anticipate that study of the expanded list of sRNAs in E. coli will allow a more complete understanding of the range of roles played by regulatory sRNAs.

Materials and methods

Computer searches

Ig regions are defined here as sequences between two neighboring ORFs. We compared Ig regions of ≥180 nt against the NCBI Unfinished Microbial Genomes database (http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenome.html) using the BLAST program (Altschul et al. 1990). Salmonella enteritidis sequence data were from the University of Illinois, Department of Microbiology (http://www.salmonella.org). Salmonella typhi and Yersinia pestis sequence data were from the Sanger Centre (http://www.sanger.ac.uk/Projects/S_typhi/ and http://www.sanger.ac.uk/Projects/Y_pestis/, respectively). Salmonella typhimurium, Salmonella paratyphi, and Klebsiella pneumoniae sequences were from the Washington University Genome Sequencing Center (Genome Sequencing Center, pers. comm.).
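The definition of an Ig region given above can be expressed as a short Python sketch that takes ORF coordinates and returns the gaps of at least 180 nt between neighboring ORFs. The function name, the coordinate convention (1-based, inclusive), and the example coordinates are assumptions for illustration; the sketch treats the sequence as linear and ignores strand, so the gap spanning the origin of a circular chromosome is not reported.

```python
# Hypothetical sketch: derive intergenic (Ig) regions from ORF coordinates.
def intergenic_regions(orfs, min_len=180):
    """orfs: iterable of (start, end) pairs, 1-based inclusive.
    Returns a list of (start, end) gaps that are at least min_len nt long."""
    regions = []
    prev_end = None
    for start, end in sorted(orfs):
        if prev_end is not None and start - prev_end - 1 >= min_len:
            regions.append((prev_end + 1, start - 1))
        # Track the furthest ORF end seen so overlapping ORFs are handled.
        prev_end = end if prev_end is None else max(prev_end, end)
    return regions


# Made-up coordinates: only the gap 321-500 (180 nt) passes the length filter.
example_orfs = [(1, 300), (250, 320), (501, 900), (950, 1200)]
ig = intergenic_regions(example_orfs)
```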

Each Ig region was rated based on the best match to Salmonella or K. pneumoniae species. Ig regions containing previously identified sRNAs were rated 5 (each of them met the criteria to be rated 4). Ig regions were rated 4 if the raw BLAST score was >200 (red in Fig. 1) or 80–200 (magenta in Fig. 1) extending for more than 80 nt; 3 if the raw BLAST score was 80–200 (magenta) extending for 60–80 nt; 2 if the raw BLAST score was 50–80 (green) extending for more than 65 nt; and 1 if the raw BLAST score was <50 (blue, black, or none) or <65 nt. The location of the longest conserved section(s) within each Ig and the number of matches to the NCBI Unfinished Microbial database were recorded. Note that the computer searches were done from May 2000 to December 2000; more sequences are expected to match as the database continues to expand. The identity and orientation of genes flanking each Ig region were determined from the Colibri database (http://genolist.pasteur.fr/Colibri). Ig regions that the Colibri database predicted to be <180 nt in length and Ig regions containing tRNA and/or rRNAs were rated 0 and removed from further consideration. An Excel document containing the full set of data from this analysis is available at http://dir2.nichd.nih.gov/nichd/cbmb/segr/segrPublications.html.
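The rating scheme can be summarized as a small decision function. The Python sketch below is one possible reading of the rules above, not the procedure actually used (the ratings were recorded in a spreadsheet). The function and parameter names are hypothetical, and several choices are assumptions: the >80 nt length requirement is applied only to scores of 80–200, boundary values (exactly 80 or 200) are assigned to the 80–200 class, and where the stated ranges overlap the higher rating takes precedence.

```python
# Hypothetical sketch of the conservation rating for an Ig region.
def rate_ig_region(blast_score, match_len, ig_len,
                   has_trna_or_rrna=False, known_srna=False):
    """Return the 0-5 rating for one Ig region.

    blast_score: raw BLAST score of the best match
    match_len:   length of the conserved match in nt
    ig_len:      length of the Ig region in nt
    """
    if ig_len < 180 or has_trna_or_rrna:
        return 0                      # removed from further consideration
    if known_srna:
        return 5                      # previously identified sRNA
    if blast_score > 200:
        return 4                      # assumption: no length requirement here
    if 80 <= blast_score <= 200:
        if match_len > 80:
            return 4
        if match_len >= 60:
            return 3
        return 1                      # match shorter than 65 nt
    if 50 <= blast_score < 80 and match_len > 65:
        return 2
    return 1


ratings = [
    rate_ig_region(250, 100, 300),    # strong match
    rate_ig_region(120, 70, 300),     # moderate score, 60-80 nt
    rate_ig_region(60, 70, 300),      # weak score, >65 nt
    rate_ig_region(40, 100, 300),     # score below 50
]
```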

Strains and plasmids

Strains were grown at 37°C in Luria-Bertani (LB) medium or M63 minimal medium supplemented with 0.2% glucose and 0.002% vitamin B1 (Silhavy et al. 1984) except for phenotype testing of strains carrying multicopy plasmids as described below. Ampicillin (50 μg/mL) was added where appropriate. E. coli MG1655 was the parent for all strains used in this study. MG1655 Δlac (DJ480, obtained from D. Jin, NCI), was lysogenized with a λ phage carrying an rpoS–lacZ translational fusion (Sledjeski et al. 1996) to create strain SG30013.

To generate clones containing the Ig region of each candidate (pCR-#N, where N refers to candidate number; see Table 1), Ig regions were amplified by PCR from a MG1655 colony and cloned into the pCRII vector using the TOPO TA cloning kit (Invitrogen). Oligonucleotides were designed so the entire conserved region and in most cases the full Ig region was included. In a few cases, repeated sequences or other irregularities required a reduction in the Ig regions cloned. See http://dir2.nichd.nih.gov/nichd/cbmb/segr/segrPublications.html for a list of all oligonucleotides used in this paper. Ig regions encoding sRNAs also were cloned into multicopy expression vectors (pRS-#N) in which each Ig region is flanked by several vector-encoded transcription terminators. To generate pRS-#N plasmids, pCR-#N plasmids were digested with BamHI and XhoI, and the Ig-containing fragments were cloned into the BamHI and SalI sites of pRS1553 (Pepe et al. 1997), replacing the lacZ-α peptide. To construct pBS-spot42, the Spot42-containing fragment was amplified by PCR from K12 genomic DNA, digested with EcoRI and BamHI, and cloned into corresponding sites in pBluescript II SK+ (Stratagene). All DNA manipulations were carried out using standard procedures. All clones were confirmed by sequencing.

RNA analysis

RNA for Northern analysis was isolated directly from ∼3 × 10⁹ cells in exponential growth (OD600 = 0.2–0.4) or stationary phase (overnight growth) as described previously (Wassarman and Storz 2000). Then 5-μg RNA samples were fractionated on 10% polyacrylamide urea gels and transferred to Hybond N membrane as described previously (Wassarman and Storz 2000). For Northern analysis of candidate regions, double-stranded DNA probes were generated by PCR from a colony of MG1655 cells or from the pCR-#N plasmids with oligonucleotides used for cloning the pCR-#N plasmids. PCR amplification was done with 52°C annealing for 30 cycles in 1× PCR buffer (1 mM each dATP, dGTP, and dTTP; 2.5 μM dCTP; 100 μCi [α-³²P]dCTP; 10 ng plasmid; 1 U Taq polymerase) (Perkin Elmer). Probes were purified over G-50 microspin columns (Amersham Pharmacia Biotech) prior to use. Northern membranes were prehybridized in a 1:1 mixture of Hybrisol I and Hybrisol II (Intergen) at 40°C. DNA probes with 500 μg sonicated salmon sperm DNA were heated for 5 min to 95°C and added to prehybridization solution; membranes were hybridized overnight at 40°C. Membranes were washed by rinsing twice with 4× SSC/0.1% SDS at room temperature followed by three washes with 2× SSC/0.1% SDS at 40°C. Northern blot analysis using RNA probes was done as described previously (Wassarman and Steitz 1992). RNA probes were generated by in vitro transcription according to manufacturer protocols (Roche Molecular Biochemicals) from pCR-#N plasmids linearized with EcoRV or HindIII using SP6 RNA polymerase or T7 RNA polymerase, respectively; pBS-6S (pGS0112; Wassarman and Storz 2000) or pBS-spot42 were linearized with EcoRI using T3 RNA polymerase; pGEM-5S (pG5019; Altuvia et al. 1997) or pGEM-10Sa (Altuvia et al. 1997) were linearized with EcoRI using SP6 RNA polymerase.
Oligonucleotide probes were labeled by polynucleotide kinase according to manufacturer protocols (New England Biolabs) using [γ-³²P]ATP (>5000 Ci/mmole; Amersham Pharmacia Biotech). For oligonucleotide probes, Northern membranes were prehybridized in Ultrahyb (Ambion) at 40°C followed by addition of labeled oligonucleotide probe and hybridization overnight at 40°C. Membranes were washed twice with 2× SSC/0.1% SDS at room temperature followed by two washes with 0.1× SSC/0.1% SDS for 15 min each at 40°C.


Immunoprecipitations were carried out using extracts from cells in exponential growth (OD600 = 0.2–0.4) or stationary phase (overnight growth) as described previously (Wassarman and Storz 2000), using rabbit antisera against the Hfq protein (A. Zhang and G. Storz, unpubl.) or preimmune serum. After immunoprecipitation, RNA was isolated from Protein A Sepharose-antibody pellets by extraction with phenol:chloroform:isoamyl alcohol (50:50:1), followed by ethanol precipitation. RNA was examined on gels directly after 3′-end labeling or analyzed by Northern hybridization after fractionation on 10% polyacrylamide urea gels as described previously (Wassarman and Storz 2000).

rpoS–lacZ expression

Effects on rpoS–lacZ expression by multicopy plasmids containing the novel sRNAs were determined from a single colony of SG30013 transformed with pRS-#N, grown for 18 h in 5 mL of LB–ampicillin medium or M63–ampicillin medium supplemented with 0.2% glucose at 37°C. β-Galactosidase activity in the culture was assayed as described previously (Zhou and Gottesman 1998). The numbers provided in Table 2 were calculated as the ratio between pRS-#N and the pRS1553 vector control.

Phenotype testing

To test carbon source utilization or temperature sensitivity associated with the multicopy plasmids containing the novel sRNAs, a single colony of MG1655 transformed with a given pRS-#N was grown for 6 h in 5 mL of LB–ampicillin medium at 37°C. Then 10 μL of serial dilutions (10⁻², 10⁻⁴, and 10⁻⁶) was spotted on M63–ampicillin plates containing 0.2% of the carbon source being tested (glucose, arabinose, lactose, glycerol, ribose, or succinate) and grown at 37°C; or on LB plates incubated at room temperature or 42°C. Plates were analyzed after both 1 d and 2 d. Failure to grow in Table 2 indicates an efficiency of plating of <10⁻³.

Microarray analysis

RNA for microarray analysis was isolated using the MasterPure RNA purification kit according to the manufacturer protocols (Epicentre) from MG1655 cells grown to OD600 = 0.8 in LB medium at 37°C. DNA was removed from RNA samples by digestion with DNase I for 30 min at 37°C. Probes for microarray analysis were generated by one of two methods: direct labeling of enriched mRNA or generation of labeled cDNA.

To generate direct labeled RNA probes, mRNA enrichment and labeling was done as described in the Affymetrix expression handbook (Affymetrix). Oligonucleotide primers complementary to 16S and 23S rRNA were annealed to total RNA followed by reverse transcription to synthesize cDNA strands complementary to 16S and 23S rRNA species. 16S and 23S were degraded with RNase H followed by DNase I treatment to remove cDNA and oligonucleotides. Enriched RNA was fragmented for 30 min at 95°C in 1× T4 polynucleotide kinase buffer (New England Biolabs), followed by labeling with γ-S-ATP and T4 polynucleotide kinase and ethanol precipitation. The biotin label was introduced by resuspending RNA in 96 μL of 30 mM MOPS (pH 7.5), 4 μL of a 50 mM Iodoacetylbiotin solution, and incubating at 37°C for 1 h. RNA was purified using the RNA/DNA Mini Kit according to manufacturer protocols (QIAGEN).

To generate cDNA probes, 5 μg of total RNA was reverse transcribed using the Superscript II system for first strand cDNA synthesis (Life Technologies) and 500 ng of random hexamers. RNA and primers were heated to 70°C and cooled to 25°C; reaction buffer was then added, followed by addition of Superscript II and incubation at 42°C. RNA was removed by RNase H and RNase A. The cDNA was purified using the Qiaquick cDNA purification kit (QIAGEN) and fragmented by incubation of up to 5 μg cDNA and 0.2 U DNase I for 10 min at 37°C in 1× one-phor-all buffer (Amersham Pharmacia Biotech). The reaction was stopped by incubation for 10 min at 99°C, and fragmentation was checked on a 0.7% agarose gel to verify that the average fragment length was 50–100 nt. Fragmented cDNA was 3′-end-labeled with terminal transferase (Roche Molecular Biochemicals) and biotin-N6-ddATP (DuPont/NEN) in 1× TdT buffer (Roche Molecular Biochemicals) containing 2.5 mM cobalt chloride for 2 h at 37°C.

Hybridization to microarrays and staining procedures were done according to the Affymetrix expression manual (Affymetrix). The arrays were read at 570 nm with a resolution of 3 μm using a laser scanner.

The expression of genes was analyzed using the Affymetrix Microarray Suite 4.01 software program. Detection of transcripts in intergenic regions was done using the intensities of each probe designed to be a perfect match and the corresponding probe designed to be the mismatch. If the perfect match probe showed an intensity that was 200 units higher than the mismatch probe, the probe pair was called positive. Two neighboring positive probe pairs were considered evidence of a transcript. The location and length of the transcripts were estimated based on the first and last identified positive probe pair within an Ig region.
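The probe-pair rule for intergenic transcripts can be illustrated with a short Python sketch. It is a minimal reading of the criteria above and not the Affymetrix software: the function name, the input layout (position, perfect-match intensity, mismatch intensity), and the example intensities are invented, and "200 units higher" is taken to mean a difference of at least 200.

```python
# Hypothetical sketch of transcript calling within one Ig region.
def call_transcript(probe_pairs, min_diff=200):
    """probe_pairs: list of (position, pm_intensity, mm_intensity).

    Returns (first_position, last_position) of the positive probe pairs if at
    least two neighboring probe pairs are positive, otherwise None."""
    ordered = sorted(probe_pairs)                       # order along the genome
    positive = [(pm - mm) >= min_diff for _, pm, mm in ordered]
    # Evidence of a transcript requires two adjacent positive probe pairs.
    has_neighbors = any(a and b for a, b in zip(positive, positive[1:]))
    if not has_neighbors:
        return None
    hits = [pos for (pos, _, _), ok in zip(ordered, positive) if ok]
    return hits[0], hits[-1]


# Invented intensities: the probe pairs at positions 130 and 160 are positive.
example = [(100, 500, 450), (130, 900, 300), (160, 1000, 200), (190, 300, 280)]
span = call_transcript(example)
```

Because the estimate uses the first and last positive probe pair in the region, isolated positive pairs far from the adjacent ones would extend the reported span; this follows the description above rather than any additional filtering.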


Acknowledgments

We thank R. Overbeek for the file of intergenic sequences, D. Jin for MG1655 ΔlacZ, A. Zhang for Hfq antibodies, R.M. Saxena for technical assistance, and S. Salzberg for running the GLIMMER program. We made extensive use of the NCBI Unfinished Microbial Genome database. In particular, the authors thank the Sanger Center; the Genome Sequencing Center; Washington University, St. Louis; and the University of Illinois, Department of Microbiology for communication of DNA sequence data to that database prior to publication. We thank S. Altuvia, S. Holbrook, T. Romeo, K. Rudd, and their collaborators for permission to quote unpublished results; and B. Peculis, T. Romeo, K. Rudd, R. Weisberg, and members of our laboratories for comments on the manuscript.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.


E-MAIL storz@helix.nih.gov; FAX (301) 402-0078.

E-MAIL susang@helix.nih.gov; FAX (301) 496-3875.

Article and publication are at http://www.genesdev.org/cgi/doi/10.1101/gad.901001.


  • Altschul SF, Gish W, Miller W, Myers EW, Lipman D. A basic local alignment search tool. J Mol Biol. 1990;215:403–410.
  • Altuvia S, Weinstein-Fischer D, Zhang A, Postow L, Storz G. A small stable RNA induced by oxidative stress: Role as a pleiotropic regulator and antimutator. Cell. 1997;90:43–53.
  • Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EGH, Margalit H, Altuvia S. Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol. 2001 (in press).
  • Bachellier S, Gilson E, Hofnung M, Hill CW. Repeated sequences. In: Neidhardt FC, et al., editors. Escherichia coli and Salmonella: Cellular and molecular biology. Washington, D.C.: American Society for Microbiology; 1996. pp. 2012–2040.
  • Barreiro V, Haggard-Ljungquist E. Attachment sites for bacteriophage P2 on the Escherichia coli chromosome: DNA sequences, localization on the physical map, and detection of a P2-like remnant in E. coli K-12 derivatives. J Bacteriol. 1992;174:4086–4093.
  • Bhasin RS. “Studies on the mechanism of the autoregulation of the crp operon of E. coli K12.” Ph.D. thesis. Stony Brook, NY: State University of New York at Stony Brook; 1989.
  • Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474.
  • Blumenthal T, Carmichael GG. RNA replication: Function and structure of Qβ-replicase. Annu Rev Biochem. 1979;48:525–548.
  • Bouvier J, Richaud C, Higgins W, Bogler O, Stragier P. Cloning, characterization, and expression of the dapE gene of Escherichia coli. J Bacteriol. 1992;174:5265–5271.
  • Brown L, Elliott T. Efficient translation of the RpoS σ factor in Salmonella typhimurium requires Host Factor I, an RNA-binding protein encoded by the hfq gene. J Bacteriol. 1996;178:3763–3770.
  • Campbell AM. Chromosomal insertion sites for phages and plasmids. J Bacteriol. 1992;174:7495–7499.
  • Compan I, Touati D. Anaerobic activation of arcA transcription in Escherichia coli: Roles of Fnr and ArcA. Mol Microbiol. 1994;11:955–964.
  • Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 1999;27:4636–4641.
  • Devereux J, Haeberli P, Smithies O. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 1984;12:387–395.
  • Eddy SR. Noncoding RNA genes. Curr Opin Genet Dev. 1999;9:695–699.
  • Franze de Fernandez M, Eoyang L, August J. Factor fraction required for the synthesis of bacteriophage Qβ RNA. Nature. 1968;219:588–590.
  • Gelfand MS. Recognition of regulatory sites by genome comparison. Res Microbiol. 1999;150:755–771.
  • Ghisotti D, Chiaramonte R, Forti F, Zangrossi S, Sironi G, Deho G. Genetic analysis of the immunity region of phage-plasmid P4. Mol Microbiol. 1992;6:3405–3413.
  • Hajnsdorf E, Regnier P. Host factor Hfq of Escherichia coli stimulates elongation of poly(A) tails by poly(A) polymerase I. Proc Natl Acad Sci. 2000;97:1501–1505.
  • Karzai AW, Susskind MM, Sauer RT. SmpB, a unique RNA-binding protein essential for the peptide-tagging activity of SsrA (tmRNA). EMBO J. 1999;18:3793–3799.
  • Kirby JE, Trempy JE, Gottesman S. Excision of a P4-like cryptic prophage leads to Alp protease expression in Escherichia coli. J Bacteriol. 1994;176:2068–2081.
  • Lazazzera BA. Quorum sensing and starvation: Signals for entry into stationary phase. Curr Opin Microbiol. 2000;3:177–182.
  • Majdalani N, Chen S, Murrow J, St. John K, Gottesman S. Regulation of RpoS by a novel small RNA: The characterization of RprA. Mol Microbiol. 2001;39:1382–1394.
  • McVeigh A, Fasano A, Scott DA, Jelacic S, Moseley SL, Robertson DC, Savarino SJ. IS1414, an Escherichia coli insertion sequence with a heat-stable enterotoxin gene embedded in a transposase-like gene. Infect Immun. 2000;68:5710–5715.
  • Montzka KA, Steitz JA. Additional low-abundance human small nuclear ribonucleoproteins: U11, U12, etc. Proc Natl Acad Sci. 1988;85:8885–8889.
  • Muffler A, Fischer D, Hengge-Aronis R. The RNA-binding protein HF-I, known as a host factor for phage Qβ RNA replication, is essential for rpoS translation in Escherichia coli. Genes & Dev. 1996;10:1143–1151.
  • Okamoto K, Freundlich M. Mechanism for the autogenous control of the crp operon: Transcriptional inhibition by a divergent RNA transcript. Proc Natl Acad Sci. 1986;83:5000–5004.
  • Pepe CM, Suzuki C, Laurie C, Simons RW. Regulation of the “tetCD” genes of transposon Tn10. J Mol Biol. 1997;270:14–25.
  • Rudd KE. Linkage map of Escherichia coli K-12, edition 10: The physical map. Microbiol Mol Biol Rev. 1998;62:985–1019.
  • Rudd KE. Novel intergenic repeats of Escherichia coli K-12. Res Microbiol. 1999;150:653–664.
  • Seoane AS, Levy SB. Identification of new genes regulated by the marRAB operon in Escherichia coli. J Bacteriol. 1995;177:530–535.
  • Silhavy TJ, Berman ML, Enquist LW. Experiments with gene fusions. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory; 1984.
  • Sledjeski DD, Gupta A, Gottesman S. The small RNA, DsrA, is essential for the low temperature expression of RpoS during exponential growth in Escherichia coli. EMBO J. 1996;15:3993–4000.
  • Sledjeski DD, Whitman C, Zhang A. Hfq is necessary for regulation by the untranslated RNA DsrA. J Bacteriol. 2001;183:1997–2005.
  • Tsui H-CT, Feng G, Winkler M. Negative regulation of mutS and mutH repair gene expression by the Hfq and RpoS global regulators of Escherichia coli K-12. J Bacteriol. 1997;179:7476–7487.
  • Tyc K, Steitz JA. U3, U8 and U13 comprise a new class of mammalian snRNPs localized in the cell nucleolus. EMBO J. 1989;8:3113–3119.
  • Urbanowski ML, Stauffer LT, Stauffer GV. The gcvB gene encodes a small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli. Mol Microbiol. 2000;37:856–868.
  • Vytvytska O, Jakobsen J, Balcunaite G, Andersen J, Baccarini M, von Gabain A. Host Factor I, Hfq, binds to Escherichia coli ompA mRNA in a growth-rate dependent fashion and regulates its stability. Proc Natl Acad Sci. 1998;95:14118–14123.
  • Vytvytska O, Moll I, Kaberdin VR, von Gabain A, Blasi U. Hfq (HF1) stimulates ompA mRNA decay by interfering with ribosome binding. Genes & Dev. 2000;14:1109–1118.
  • Wassarman KM, Steitz JA. The low abundance U11 and U12 snRNAs interact to form a two snRNP complex. Mol Cell Biol. 1992;12:1276–1285.
  • Wassarman KM, Storz G. 6S RNA regulates E. coli RNA polymerase activity. Cell. 2000;101:613–623.
  • Wassarman KM, Zhang A, Storz G. Small RNAs in Escherichia coli. Trends Microbiol. 1999;7:37–45.
  • Zhang A, Altuvia S, Tiwari A, Argaman L, Hengge-Aronis R, Storz G. The oxyS regulatory RNA represses rpoS translation by binding Hfq (HF-1) protein. EMBO J. 1998;17:6061–6068.
  • Zhou Y-N, Gottesman S. Regulation of proteolysis of the stationary-phase σ factor RpoS. J Bacteriol. 1998;180:1154–1158.

Articles from Genes & Development are provided here courtesy of Cold Spring Harbor Laboratory Press