Download MicroRNAs: Hidden in the Genome Dispatch

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Gene expression wikipedia , lookup

MicroRNA wikipedia , lookup

RNA-Seq wikipedia , lookup

RNA silencing wikipedia , lookup

RNA interference wikipedia , lookup

Transcript
Current Biology, Vol. 12, R138–R140, February 19, 2002, ©2002 Elsevier Science Ltd. All rights reserved. PII S0960-9822(02)00708-X
MicroRNAs: Hidden in the Genome
Eric G. Moss
Genes for tiny RNAs have been found to be plentiful
in the genomes of worms, flies, humans and probably all animals. Some of these microRNAs have been
conserved through evolution, and many are
expressed only at specific times or places. How they
act is just beginning to be understood, but their
importance to biology is likely to be great.
How many genes in the genome? Add a few hundred
to your best estimate. In the DNA of humans, flies and
worms are genes for short (21–24 nucleotides) noncoding RNAs, dubbed microRNAs (miRNAs) [1–3].
These genes were not seen before — their existence
is easily overlooked by standard methods for finding
genes. They were uncovered when three groups used
molecular and bioinformatic approaches expressly
designed to find such very small RNAs. The Tuschl lab
[1] came upon them unexpectedly while searching for
the endogenous products of the RNA interference
(RNAi) mechanism, which are also about 22 nucleotides
long. The Bartel [2] and Ambros [3] groups each suspected that the nematode Caenorhabditis elegans
would have more than the two short RNAs already
known to control development in that organism. All
together, the three groups found nearly 100 new
genes encoding about as many miRNAs, some of
which are conserved in worms, flies and humans.
Because of their sequence diversity, regulated expression and resemblance to the two C. elegans RNAs,
these miRNAs are likely to regulate the expression of
protein-encoding genes. Their discovery is stunning
for showing the vastness of this class of tiny RNAs
and humbling for telling us that we are still ignorant of
significant processes inside our cells.
The First MicroRNAs: lin-4 and let-7
Before the recent flood of miRNAs, only two such
RNAs were known. The C. elegans genes for these
miRNAs, lin-4 and let-7, were discovered by way of
their mutant phenotypes [4,5]. Strains carrying a mutation in either of these genes display retarded development, with some cells failing to divide and differentiate
on schedule [6]. When these genes were cloned, they
were found to encode unrelated 21–22 nucleotide
RNAs [5,7]. Both of these RNAs are believed to act by
basepairing with the RNA of the 3′ untranslated regions
(UTRs) of one or more target genes in the developmental timing pathway [8–10]. By mechanisms that are
not fully understood, this interaction leads to repression of the target genes at a post-transcriptional step
[11,12]. The Ruvkun lab [13] discovered that a let-7like RNA can be found in diverse animals, including
humans, demonstrating that these tiny RNAs are not
Cell and Developmental Biology, Fox Chase Cancer Center,
Philadelphia, PA 19111, USA.
Dispatch
peculiar to worms. Because they are different from
any non-coding RNAs described previously, in their
size and activity on mRNAs, lin-4 and let-7 represent
a new class of RNAs, recently dubbed small temporal
RNAs (stRNAs) for their roles in developmental timing.
Why were not more genes for tiny RNAs like lin-4
and let-7 found before? The genetic approaches that
identified lin-4 and let-7 relied on finding mutant
animals by chance. But the genes for these RNAs
present very small targets for mutagens. In all the
many genetic screens conducted with C. elegans to
find mutants with defective larval development, a
mutation in lin-4 appears to have turned up only once,
so it could have easily been missed. When taking a
biochemical approach to finding an unknown regulator of a gene’s expression, most seek protein factors
and would not think to look for small RNAs, which
would require very different assays than are normally
used. A gene-finding approach that relies on detecting RNA transcripts would also miss the miRNAs
because they run off the bottom of gels most people
use for northern blots. Bioinformatics approaches are
also currently limited [14]. Without a significant open
reading frame (which would reveal a protein-encoding
gene) or significant similarity to genes for known
RNAs, such as tRNAs and snRNAs, genes for noncoding RNAs lie camouflaged against the background
of intronic and intergenic sequences.
Capturing a ‘meer’
Uncovering a hundred or so genes for miRNAs (each
denoted mir, pronounced ‘meer’) required taking
advantage of their few special features. First, each
group used the size of the RNAs as its primary criterion. Standard cDNA libraries would lack miRNAs, primarily because RNAs that small are normally excluded
in the construction procedure. Total RNA from fly
embryos, worms or HeLa cells was size fractionated
so that only molecules 25 nucleotides or smaller would
be captured. Synthetic oligomers were ligated directly
to RNAs using T4 RNA ligase. Then the sequences
were reverse-transcribed, amplified by PCR, cloned
and sequenced. The genome databases were queried
with the sequences, confirming which derive from
these organisms (some sequences were from the
E. coli on which C. elegans feeds). This also allowed
the mir genes to be placed physically in the context of
other genes. The vast majority of the cloned sequences
were found to come, not from known or predicted
genes, but rather from intronic regions or between
genes. Occasionally they occur in clusters, some so
close as to suggest that the tandemly arranged
miRNAs are processed from a single transcript to
allow coordinate regulation. Furthermore, the genomic
sequence revealed the second bit of identifying information about mir genes, the fold-back structures of
the miRNA precursors (Figure 1).
Both lin-4 and let-7 are processed from longer precursors, which form incompletely double-stranded
fold-back structures [5,7]. These precursors are short-
Current Biology
R139
lived, probably because processing occurs rapidly
and only the important RNA product is protected from
further degradation. All the miRNAs discovered by the
cloning strategy are likely to be processed from similarly structured precursors — a conclusion drawn from
the genomic sequence surrounding the miRNA
sequence. In fact, the existence of a fold-back precursor structure was one major criterion in the bioinformatics approach to finding mir genes in genomic
sequence. Lee and Ambros [3] used conservation
between C. elegans and its relative C. briggsae as an
additional criterion, a strategy validated by the Bartel
group’s [2] finding that 85% of the mir genes of
C. elegans had recognizable homologs in C. briggsae.
Almost all of the miRNAs described by the three
groups were shown to be expressed and to be about
the expected size, and in many cases the longer precursor form also was seen. Lee and Ambros [3] predicted 25 additional miRNAs, but did not report them
because their expression was not yet confirmed.
Because many of the miRNAs described were found
only once or a few times in the RNA samples examined, the search for new miRNAs is far from over and
possibly hundreds more remain to be discovered.
Many will likely have restricted expression so that
bioinformatics or tissue-specific cloning will be necessary to reveal them.
The RNAi Connection
Some of the recent attention paid to RNAs in the size
range of 21 to 25 nucleotides is due to the excitement
over RNAi, the phenomenon in which double-stranded
RNA leads to the degradation of any RNA that is homologous [15]. A powerful technique for knocking down
gene function, RNAi relies on a complex and ancient
cellular mechanism that probably evolved for protection
against viruses and mobile genetic elements [16]. A
central step in that mechanism is the generation of
siRNAs, double-stranded RNAs which not coincidentally are each about 22 nucleotides long [16]. The
siRNAs lead to the degradation of homologous RNA
and the production of more siRNAs against the same
target [17]. The stRNAs and the siRNAs are about the
same length and resemble each other at their ends — a
5′ phosphate and a 3′ hydroxyl, a feature the Bartel
group [2] took advantage of to capture miRNAs in their
cloning procedure — because both are cleaved from
their double-stranded precursors by a multidomain,
RNaseIII-type ribonuclease named Dicer [18–20].
Knocking out the activity of Dicer in C. elegans prevents the processing of lin-4 and let-7, as well as RNAi
[19,20]. Lee and Ambros [3] checked two of the
miRNAs they found, and showed that they too require
Dicer activity for processing of the precursor to the
miRNA form [3]. In RNAi, Dicer forms siRNAs from
perfectly basepaired precursors and yields sense and
antisense RNAs in abundance. Yet the same enzyme
processes the miRNA precursors, which contain variously placed bulges and loops, to yield only a single
short RNA. Analysis of the collection of miRNA
sequences quickly revealed that they can be derived
from either the 5′ or 3′ end of the precursor. Because
of the diversity of miRNA sequences, which show little
miR-1:
5′PO4-UGGAAUGUAAAGAAGUAUGUA-OH 3′
miR-1 precursor RNAs:
5'
A
A
3'
A
U
G
G
U
G A UA
CG
CG
GU
G
U
A GG
C
CG
G CA
A
A
G
CG
UA
GU
CG
AU
UA
AU
CG
U A
UA
CG
C A
UA
UA
AU
CG
AU
UA
G
A
C G
C
CG
A U
UA
AU
A
CG
UG
AU
U AA
A
A
U
CAU
5'
C
C
A
U
A
3'
U
G
A
U
G
C
AG C
CG
CG
U
U
U
UG C
A UA
A
G
A
G
A
GC
UG
UA
CG
CG
AU
UA
GU
CG
UA
UA
CG
C AA
U
UA
GU
CG
AU
UA
UA
CG
A G
AU
UA
A UA
GC
UG
UAA
A
C
U
A UU
C. elegans
Drosophila
A Drosophila mir
gene cluster
mir-3
mir-4
mir-5
mir-6-1 mir-6-2 mir-6-3
Chromosome 2r
5'
GU
C
A
3'
G
C
A
C
C
A
U
G C GG
CG
U C
GC
CG
UG
UA
GC
GU
GC
AU
A A
AU
CG
AU
UA
AU
CG
UA
UA
CG
UA
UA
UA
AU
UG
AU
UA
G A
CC G
CG
AU
UA
A U CG
UAA
GU
GC
G
A
CCU
Human
A human mir
gene cluster
mir-17 mir-18 mir-19a mir-20 mir-19b-1
Chromosome 13
Current Biology
Figure 1. MicroRNAs (miRNAs) are 21–25 nucleotide RNAs
encoded in the genomes of animals.
miR-1, a 22-nucleotide miRNA, is conserved in worms, flies and
humans. It is processed from a larger precursor which is
mostly, but not completely double-stranded. Only the region
shown in red is abundant after processing, the rest of the precursor is degraded rapidly. Genes for miRNAs can be found in
clusters, as shown for example in the fly and human. In these
clusters, the genes are in the same orientation, they only differ
by whether the miRNA, shown in red, comes from the 5′ or 3′
part of the precursor.
in common beside their length and some sequence
biases at certain positions [2], there are no clues to
suggest how Dicer properly processes the precursor.
There must be some guidance provided by the miRNA
precursor and/or bound proteins. Remarkably, there is
little evidence for endogenous siRNAs. The vast majority of the small RNAs cloned did not correspond to
RNA from any protein coding or other RNA gene [2].
This suggests that the RNAi mechanism does not
have a major task to perform in healthy cells.
MicroRNAs Through Evolution
What kept lin-4 an oddball in the years before let-7 was
cloned was that no RNAs like it were known to exist in
any organisms other than a few species of Caenorhabditis. But shortly after the cloning of let-7, RNAs of identical sequence were found in diverse animals, including
humans [13]. It was the realization that RNAs like lin-4
and let-7 may be more widespread than previously recognized that spurned the search for miRNAs. As it turns
out, many of the new batch of miRNAs appear to be like
lin-4 in existing only in one organism, whereas others
are like let-7 and are more broadly conserved. In those
Dispatch
R140
cases, the miRNA is conserved, though the precursor
sequence varies (Figure 1). The conservation or divergence of a particular miRNA may be a consequence of
how its targets have evolved. Co-evolution would be
necessary to preserve an important interaction between
the miRNA regulator and the genes it controls (if it does
indeed act by regulating gene expression). If the miRNA
has many targets, divergence may be slow, whereas
those with a single target may co-evolve with their
targets quickly. The emergence and disappearance of
mir genes during evolution is likely to be rapid, much
more rapid than the appearance and disappearance of
protein-coding genes, because they are very small and
need not maintain an open reading frame.
Expression in Time and Space
Other small RNAs in cells, such as snRNAs and
snoRNAs, are uniformly expressed because they are
involved in essential functions in all cells. Many
miRNAs, however, are expressed only at specific times
or in certain tissues. The pioneer lin-4 and let-7 RNAs,
after all, are regulators of developmental timing, and
their time of appearance correlates with the repression
of their targets [6]. Underscoring the conservation of
let-7 across phyla is its general expression profile of
‘off early, on late’ in all animals examined [13]. Many of
the miRNAs discovered in C. elegans and Drosophila
appear to be exclusively expressed during embryogenesis, a time when much dynamic gene regulation
occurs [1,2]. Others have more complex or non-conserved patterns; miR-1, for example, is essentially
ubiquitous in C. elegans and Drosophila, but appears
only in human heart, among the few tissues that were
examined. For these reasons it is highly likely that
miRNAs act as specific regulators of gene expression.
We do not know whether each of the hundred or so
miRNAs have one, a few or many targets, or whether
a single target may be regulated by multiple miRNAs.
And how they act, whether by one or more mechanisms, is still a puzzle. The only miRNA studied in
detail so far is lin-4, which appears to be part of a
mechanism that blocks protein synthesis after translation initiation [11,12]. Finding targets of more miRNAs
in other systems will certainly give a big boost to the
effort to understand their mechanism of action. Some
miRNAs may have different mechanisms, and perhaps
function like the siRNAs of RNAi. RNAi itself has
offered a view into a previously unknown part of the
RNA world. Now miRNAs have opened a new vista.
References
1. Lagos-Quintana, M., Rauhut, R., Lendeckel, W. and Tuschl, T.
(2001). Identification of novel genes coding for small expressed
RNAs. Science 294, 853–858.
2. Lau, N.C., Lim le, E.P., Weinstein, E.G. and Bartel, D.P. (2001). An
abundant class of tiny RNAs with probable regulatory roles in
Caenorhabditis elegans. Science 294, 858–862.
3. Lee, R.C. and Ambros, V. (2001). An extensive class of small RNAs
in Caenorhabditis elegans. Science 294, 862–864.
4. Chalfie, M., Horvitz, H.R. and Sulston, J.E. (1981). Mutations that
lead to reiterations in the cell lineage of C. elegans. Cell 24, 59–69.
5. Reinhart, B.J., Slack, F.J., Basson, M., Pasquinelli, A.E., Bettinger,
J.C., Rougvie, A.E., Horvitz, H.R. and Ruvkun, G. (2000). The 21nucleotide let-7 RNA regulates developmental timing in
Caenorhabditis elegans. Nature 403, 901–906.
6. Ambros, V. (2000). Control of developmental timing in Caenorhabditis elegans. Curr. Opin. Genet. Dev. 10, 428–433.
7. Lee, R.C., Feinbaum, R.L. and Ambros, V. (1993). The C. elegans
heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854.
8. Wightman, B., Ha, I. and Ruvkun, G. (1993). Posttranscriptional regulation of the heterochronic gene lin-14 by lin- 4 mediates temporal
pattern formation in C. elegans. Cell 75, 855–862.
9. Moss, E.G., Lee, R.C. and Ambros, V. (1997). The cold shock
domain protein LIN-28 controls developmental timing in C. elegans
and is regulated by the lin-4 RNA. Cell 88, 637–646.
10. Slack, F.J., Basson, M., Liu, Z., Ambros, V., Horvitz, H.R. and
Ruvkun, G. (2000). The lin-41 RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the LIN29 transcription factor. Mol. Cell 5, 659–669.
11. Olsen, P.H. and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking
LIN-14 protein synthesis after the initiation of translation. Dev. Biol.
216, 671–680.
12. Seggerson, K., Tang, L. and Moss, E.G. (2002). Two genetic circuits
repress the C. elegans heterochronic gene lin-28 after translation
initiation. Dev. Biol. in press.
13. Pasquinelli, A.E., Reinhart, B.J., Slack, F., Martindale, M.Q., Kuroda,
M.I., Maller, B., Hayward, D.C., Ball, E.E., Degnan, B. and Muller, P.
(2000). Conservation of the sequence and temporal expression of
let-7 heterochronic regulatory RNA. Nature 408, 86–89.
14. Eddy, S.R. (1999). Noncoding RNA genes. Curr. Opin. Genet. Dev.
9, 695–699.
15. Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E. and
Mello, C.C. (1998). Potent and specific genetic interference by
double-stranded RNA in Caenorhabditis elegans. Nature 391,
806–811.
16. Zamore, P.D. RNA interference: listening to the sound of silence.
Nat. Struct. Biol. 8, 746–750.
17. Lipardi, C., Wei, Q. and Paterson, B.M. (2001). RNAi as Random
Degradative PCR. siRNA primers convert mRNA into dsRNAs that
are degraded to generate new siRNAs. Cell 107, 297–307.
18. Bernstein, E., Caudy, A.A., Hammond, S.M. and Hannon, G.J.
(2001). Role for a bidentate ribonuclease in the initiation step of
RNA interference. Nature 409, 363–366.
19. Grishok, A., Pasquinelli, A.E., Conte, D., Li, N., Parrish, S., Ha, I.,
Baillie, D.L., Fire, A., Ruvkun, G. and Mello, C.C. (2001). Genes and
mechanisms related to RNA interference regulate expression of the
small temporal RNAs that control C. elegans developmental timing.
Cell 106, 23–34.
20. Hutvagner, G., McLachlan, J., Pasquinelli, A.E., Balint, E., Tuschl, T.
and Zamore, P.D. (2001). A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal
RNA. Science 293, 834–838.