Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Current Biology, Vol. 12, R138–R140, February 19, 2002, ©2002 Elsevier Science Ltd. All rights reserved. PII S0960-9822(02)00708-X MicroRNAs: Hidden in the Genome Eric G. Moss Genes for tiny RNAs have been found to be plentiful in the genomes of worms, flies, humans and probably all animals. Some of these microRNAs have been conserved through evolution, and many are expressed only at specific times or places. How they act is just beginning to be understood, but their importance to biology is likely to be great. How many genes in the genome? Add a few hundred to your best estimate. In the DNA of humans, flies and worms are genes for short (21–24 nucleotides) noncoding RNAs, dubbed microRNAs (miRNAs) [1–3]. These genes were not seen before — their existence is easily overlooked by standard methods for finding genes. They were uncovered when three groups used molecular and bioinformatic approaches expressly designed to find such very small RNAs. The Tuschl lab [1] came upon them unexpectedly while searching for the endogenous products of the RNA interference (RNAi) mechanism, which are also about 22 nucleotides long. The Bartel [2] and Ambros [3] groups each suspected that the nematode Caenorhabditis elegans would have more than the two short RNAs already known to control development in that organism. All together, the three groups found nearly 100 new genes encoding about as many miRNAs, some of which are conserved in worms, flies and humans. Because of their sequence diversity, regulated expression and resemblance to the two C. elegans RNAs, these miRNAs are likely to regulate the expression of protein-encoding genes. Their discovery is stunning for showing the vastness of this class of tiny RNAs and humbling for telling us that we are still ignorant of significant processes inside our cells. The First MicroRNAs: lin-4 and let-7 Before the recent flood of miRNAs, only two such RNAs were known. The C. elegans genes for these miRNAs, lin-4 and let-7, were discovered by way of their mutant phenotypes [4,5]. Strains carrying a mutation in either of these genes display retarded development, with some cells failing to divide and differentiate on schedule [6]. When these genes were cloned, they were found to encode unrelated 21–22 nucleotide RNAs [5,7]. Both of these RNAs are believed to act by basepairing with the RNA of the 3′ untranslated regions (UTRs) of one or more target genes in the developmental timing pathway [8–10]. By mechanisms that are not fully understood, this interaction leads to repression of the target genes at a post-transcriptional step [11,12]. The Ruvkun lab [13] discovered that a let-7like RNA can be found in diverse animals, including humans, demonstrating that these tiny RNAs are not Cell and Developmental Biology, Fox Chase Cancer Center, Philadelphia, PA 19111, USA. Dispatch peculiar to worms. Because they are different from any non-coding RNAs described previously, in their size and activity on mRNAs, lin-4 and let-7 represent a new class of RNAs, recently dubbed small temporal RNAs (stRNAs) for their roles in developmental timing. Why were not more genes for tiny RNAs like lin-4 and let-7 found before? The genetic approaches that identified lin-4 and let-7 relied on finding mutant animals by chance. But the genes for these RNAs present very small targets for mutagens. In all the many genetic screens conducted with C. elegans to find mutants with defective larval development, a mutation in lin-4 appears to have turned up only once, so it could have easily been missed. When taking a biochemical approach to finding an unknown regulator of a gene’s expression, most seek protein factors and would not think to look for small RNAs, which would require very different assays than are normally used. A gene-finding approach that relies on detecting RNA transcripts would also miss the miRNAs because they run off the bottom of gels most people use for northern blots. Bioinformatics approaches are also currently limited [14]. Without a significant open reading frame (which would reveal a protein-encoding gene) or significant similarity to genes for known RNAs, such as tRNAs and snRNAs, genes for noncoding RNAs lie camouflaged against the background of intronic and intergenic sequences. Capturing a ‘meer’ Uncovering a hundred or so genes for miRNAs (each denoted mir, pronounced ‘meer’) required taking advantage of their few special features. First, each group used the size of the RNAs as its primary criterion. Standard cDNA libraries would lack miRNAs, primarily because RNAs that small are normally excluded in the construction procedure. Total RNA from fly embryos, worms or HeLa cells was size fractionated so that only molecules 25 nucleotides or smaller would be captured. Synthetic oligomers were ligated directly to RNAs using T4 RNA ligase. Then the sequences were reverse-transcribed, amplified by PCR, cloned and sequenced. The genome databases were queried with the sequences, confirming which derive from these organisms (some sequences were from the E. coli on which C. elegans feeds). This also allowed the mir genes to be placed physically in the context of other genes. The vast majority of the cloned sequences were found to come, not from known or predicted genes, but rather from intronic regions or between genes. Occasionally they occur in clusters, some so close as to suggest that the tandemly arranged miRNAs are processed from a single transcript to allow coordinate regulation. Furthermore, the genomic sequence revealed the second bit of identifying information about mir genes, the fold-back structures of the miRNA precursors (Figure 1). Both lin-4 and let-7 are processed from longer precursors, which form incompletely double-stranded fold-back structures [5,7]. These precursors are short- Current Biology R139 lived, probably because processing occurs rapidly and only the important RNA product is protected from further degradation. All the miRNAs discovered by the cloning strategy are likely to be processed from similarly structured precursors — a conclusion drawn from the genomic sequence surrounding the miRNA sequence. In fact, the existence of a fold-back precursor structure was one major criterion in the bioinformatics approach to finding mir genes in genomic sequence. Lee and Ambros [3] used conservation between C. elegans and its relative C. briggsae as an additional criterion, a strategy validated by the Bartel group’s [2] finding that 85% of the mir genes of C. elegans had recognizable homologs in C. briggsae. Almost all of the miRNAs described by the three groups were shown to be expressed and to be about the expected size, and in many cases the longer precursor form also was seen. Lee and Ambros [3] predicted 25 additional miRNAs, but did not report them because their expression was not yet confirmed. Because many of the miRNAs described were found only once or a few times in the RNA samples examined, the search for new miRNAs is far from over and possibly hundreds more remain to be discovered. Many will likely have restricted expression so that bioinformatics or tissue-specific cloning will be necessary to reveal them. The RNAi Connection Some of the recent attention paid to RNAs in the size range of 21 to 25 nucleotides is due to the excitement over RNAi, the phenomenon in which double-stranded RNA leads to the degradation of any RNA that is homologous [15]. A powerful technique for knocking down gene function, RNAi relies on a complex and ancient cellular mechanism that probably evolved for protection against viruses and mobile genetic elements [16]. A central step in that mechanism is the generation of siRNAs, double-stranded RNAs which not coincidentally are each about 22 nucleotides long [16]. The siRNAs lead to the degradation of homologous RNA and the production of more siRNAs against the same target [17]. The stRNAs and the siRNAs are about the same length and resemble each other at their ends — a 5′ phosphate and a 3′ hydroxyl, a feature the Bartel group [2] took advantage of to capture miRNAs in their cloning procedure — because both are cleaved from their double-stranded precursors by a multidomain, RNaseIII-type ribonuclease named Dicer [18–20]. Knocking out the activity of Dicer in C. elegans prevents the processing of lin-4 and let-7, as well as RNAi [19,20]. Lee and Ambros [3] checked two of the miRNAs they found, and showed that they too require Dicer activity for processing of the precursor to the miRNA form [3]. In RNAi, Dicer forms siRNAs from perfectly basepaired precursors and yields sense and antisense RNAs in abundance. Yet the same enzyme processes the miRNA precursors, which contain variously placed bulges and loops, to yield only a single short RNA. Analysis of the collection of miRNA sequences quickly revealed that they can be derived from either the 5′ or 3′ end of the precursor. Because of the diversity of miRNA sequences, which show little miR-1: 5′PO4-UGGAAUGUAAAGAAGUAUGUA-OH 3′ miR-1 precursor RNAs: 5' A A 3' A U G G U G A UA CG CG GU G U A GG C CG G CA A A G CG UA GU CG AU UA AU CG U A UA CG C A UA UA AU CG AU UA G A C G C CG A U UA AU A CG UG AU U AA A A U CAU 5' C C A U A 3' U G A U G C AG C CG CG U U U UG C A UA A G A G A GC UG UA CG CG AU UA GU CG UA UA CG C AA U UA GU CG AU UA UA CG A G AU UA A UA GC UG UAA A C U A UU C. elegans Drosophila A Drosophila mir gene cluster mir-3 mir-4 mir-5 mir-6-1 mir-6-2 mir-6-3 Chromosome 2r 5' GU C A 3' G C A C C A U G C GG CG U C GC CG UG UA GC GU GC AU A A AU CG AU UA AU CG UA UA CG UA UA UA AU UG AU UA G A CC G CG AU UA A U CG UAA GU GC G A CCU Human A human mir gene cluster mir-17 mir-18 mir-19a mir-20 mir-19b-1 Chromosome 13 Current Biology Figure 1. MicroRNAs (miRNAs) are 21–25 nucleotide RNAs encoded in the genomes of animals. miR-1, a 22-nucleotide miRNA, is conserved in worms, flies and humans. It is processed from a larger precursor which is mostly, but not completely double-stranded. Only the region shown in red is abundant after processing, the rest of the precursor is degraded rapidly. Genes for miRNAs can be found in clusters, as shown for example in the fly and human. In these clusters, the genes are in the same orientation, they only differ by whether the miRNA, shown in red, comes from the 5′ or 3′ part of the precursor. in common beside their length and some sequence biases at certain positions [2], there are no clues to suggest how Dicer properly processes the precursor. There must be some guidance provided by the miRNA precursor and/or bound proteins. Remarkably, there is little evidence for endogenous siRNAs. The vast majority of the small RNAs cloned did not correspond to RNA from any protein coding or other RNA gene [2]. This suggests that the RNAi mechanism does not have a major task to perform in healthy cells. MicroRNAs Through Evolution What kept lin-4 an oddball in the years before let-7 was cloned was that no RNAs like it were known to exist in any organisms other than a few species of Caenorhabditis. But shortly after the cloning of let-7, RNAs of identical sequence were found in diverse animals, including humans [13]. It was the realization that RNAs like lin-4 and let-7 may be more widespread than previously recognized that spurned the search for miRNAs. As it turns out, many of the new batch of miRNAs appear to be like lin-4 in existing only in one organism, whereas others are like let-7 and are more broadly conserved. In those Dispatch R140 cases, the miRNA is conserved, though the precursor sequence varies (Figure 1). The conservation or divergence of a particular miRNA may be a consequence of how its targets have evolved. Co-evolution would be necessary to preserve an important interaction between the miRNA regulator and the genes it controls (if it does indeed act by regulating gene expression). If the miRNA has many targets, divergence may be slow, whereas those with a single target may co-evolve with their targets quickly. The emergence and disappearance of mir genes during evolution is likely to be rapid, much more rapid than the appearance and disappearance of protein-coding genes, because they are very small and need not maintain an open reading frame. Expression in Time and Space Other small RNAs in cells, such as snRNAs and snoRNAs, are uniformly expressed because they are involved in essential functions in all cells. Many miRNAs, however, are expressed only at specific times or in certain tissues. The pioneer lin-4 and let-7 RNAs, after all, are regulators of developmental timing, and their time of appearance correlates with the repression of their targets [6]. Underscoring the conservation of let-7 across phyla is its general expression profile of ‘off early, on late’ in all animals examined [13]. Many of the miRNAs discovered in C. elegans and Drosophila appear to be exclusively expressed during embryogenesis, a time when much dynamic gene regulation occurs [1,2]. Others have more complex or non-conserved patterns; miR-1, for example, is essentially ubiquitous in C. elegans and Drosophila, but appears only in human heart, among the few tissues that were examined. For these reasons it is highly likely that miRNAs act as specific regulators of gene expression. We do not know whether each of the hundred or so miRNAs have one, a few or many targets, or whether a single target may be regulated by multiple miRNAs. And how they act, whether by one or more mechanisms, is still a puzzle. The only miRNA studied in detail so far is lin-4, which appears to be part of a mechanism that blocks protein synthesis after translation initiation [11,12]. Finding targets of more miRNAs in other systems will certainly give a big boost to the effort to understand their mechanism of action. Some miRNAs may have different mechanisms, and perhaps function like the siRNAs of RNAi. RNAi itself has offered a view into a previously unknown part of the RNA world. Now miRNAs have opened a new vista. References 1. Lagos-Quintana, M., Rauhut, R., Lendeckel, W. and Tuschl, T. (2001). Identification of novel genes coding for small expressed RNAs. Science 294, 853–858. 2. Lau, N.C., Lim le, E.P., Weinstein, E.G. and Bartel, D.P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858–862. 3. Lee, R.C. and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862–864. 4. Chalfie, M., Horvitz, H.R. and Sulston, J.E. (1981). Mutations that lead to reiterations in the cell lineage of C. elegans. Cell 24, 59–69. 5. Reinhart, B.J., Slack, F.J., Basson, M., Pasquinelli, A.E., Bettinger, J.C., Rougvie, A.E., Horvitz, H.R. and Ruvkun, G. (2000). The 21nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901–906. 6. Ambros, V. (2000). Control of developmental timing in Caenorhabditis elegans. Curr. Opin. Genet. Dev. 10, 428–433. 7. Lee, R.C., Feinbaum, R.L. and Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854. 8. Wightman, B., Ha, I. and Ruvkun, G. (1993). Posttranscriptional regulation of the heterochronic gene lin-14 by lin- 4 mediates temporal pattern formation in C. elegans. Cell 75, 855–862. 9. Moss, E.G., Lee, R.C. and Ambros, V. (1997). The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88, 637–646. 10. Slack, F.J., Basson, M., Liu, Z., Ambros, V., Horvitz, H.R. and Ruvkun, G. (2000). The lin-41 RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the LIN29 transcription factor. Mol. Cell 5, 659–669. 11. Olsen, P.H. and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Dev. Biol. 216, 671–680. 12. Seggerson, K., Tang, L. and Moss, E.G. (2002). Two genetic circuits repress the C. elegans heterochronic gene lin-28 after translation initiation. Dev. Biol. in press. 13. Pasquinelli, A.E., Reinhart, B.J., Slack, F., Martindale, M.Q., Kuroda, M.I., Maller, B., Hayward, D.C., Ball, E.E., Degnan, B. and Muller, P. (2000). Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature 408, 86–89. 14. Eddy, S.R. (1999). Noncoding RNA genes. Curr. Opin. Genet. Dev. 9, 695–699. 15. Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E. and Mello, C.C. (1998). Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806–811. 16. Zamore, P.D. RNA interference: listening to the sound of silence. Nat. Struct. Biol. 8, 746–750. 17. Lipardi, C., Wei, Q. and Paterson, B.M. (2001). RNAi as Random Degradative PCR. siRNA primers convert mRNA into dsRNAs that are degraded to generate new siRNAs. Cell 107, 297–307. 18. Bernstein, E., Caudy, A.A., Hammond, S.M. and Hannon, G.J. (2001). Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409, 363–366. 19. Grishok, A., Pasquinelli, A.E., Conte, D., Li, N., Parrish, S., Ha, I., Baillie, D.L., Fire, A., Ruvkun, G. and Mello, C.C. (2001). Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106, 23–34. 20. Hutvagner, G., McLachlan, J., Pasquinelli, A.E., Balint, E., Tuschl, T. and Zamore, P.D. (2001). A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 293, 834–838.