RNAs
Ribonucleic acids (RNAs)

INTRODUCTION
SLIDES 1+2
Ancient RNA World
Proteins cannot self-replicate, and many evolutionary geneticists consider that autocatalytic nucleic acids must have pre-dated proteins and were able to replicate without their help. The RNA World Hypothesis developed from ideas proposed by Alexander Rich and Carl Woese in the 1960s. It imagines that RNA had a dual role in the earliest stages of life, acting both as the genetic material (with the capacity for self-replication) and as effector molecules, a role filled mainly by proteins today. Both roles are still evident: some viruses have RNA genomes, and non-coding RNA molecules can work as effector molecules with catalytic activity (ribozymes). Another observation consistent with RNA being the first nucleic acid is that, in cellular pathways, deoxyribonucleotides are synthesized from ribonucleotides (by reduction of the ribose, catalyzed by ribonucleotide reductase).
As well as storing genetic information, RNA is imagined to have been used subsequently to synthesize proteins from amino acids. RNA has a rather rigid backbone and so is not very well suited as an effector molecule. Proteins are much more flexible and also offer more functional variety, because the 20 amino acids have widely different structures and allow more possible sequence combinations.
The replacement of RNA with DNA as the information storage molecule provided significant advantages. DNA is much more stable than RNA and so better suited to this task: the deoxyribose of DNA lacks the 2'-OH group that makes the ribose-containing RNA prone to hydrolytic cleavage. Greater efficiency could be achieved by separating the storage and transmission of genetic information (DNA) from protein synthesis (RNA). All that was needed was the development of a reverse transcriptase, so that DNA could be synthesized from deoxyribonucleotides by using an RNA template.
TYPES of RNAs

RNAs for PROTEIN SYNTHESIS
SLIDES 3-6

SLIDE 4
(1) Messenger RNAs have already been discussed in detail in the lecture on Genetic Regulation in Eukaryotes. Briefly:
Prokaryotes:
• transcription and translation are coupled in space and time
• instability: half-life of 1-3 min
• polycistronic mRNAs (operons: more than one gene on a single mRNA molecule)
Eukaryotes:
• 1 gene encodes 1 protein
• pre-mRNAs are processed: splicing, capping, poly(A) tail

SLIDE 5
(2) Ribosomal RNA genes
In addition to the two mitochondrial rRNA molecules (12S and 16S rRNA), there are four types of cytoplasmic rRNA: three associated with the large ribosomal subunit (28S, 5.8S, and 5S rRNAs) and one with the small ribosomal subunit (18S rRNA). The 5S rRNA genes occur in small gene clusters, the largest being a cluster of 16 genes on chromosome 1q42, close to the telomere. Only a few 5S rRNA genes are functional, and there are many dispersed pseudogenes. The 28S, 5.8S, and 18S rRNAs are encoded by a single multigenic transcription unit that is tandemly repeated to form megabase-size ribosomal DNA arrays (about 30-40 tandem repeats, or roughly 100 rRNA genes) on the short arms of the acrocentric human chromosomes 13, 14, 15, 21, and 22.

SLIDE 6
(3) Transfer RNA genes
The 22 mitochondrial tRNAs are made by 22 tRNA genes in mtDNA. In the nuclear genome, 516 human tRNA genes are known that make cytoplasmic tRNAs with defined anticodon specificity. These genes can be classified into 49 families on the basis of anticodon specificity. There is only a rough correlation between human tRNA gene number and amino acid frequency. For example, 30 tRNA genes specify the comparatively rare amino acid cysteine (which accounts for 2.5% of all amino acids in human proteins), but only 21 tRNA genes specify the more abundant proline (which has a frequency of 6.1%).
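The weak link between tRNA gene number and amino acid frequency can be made concrete with a little arithmetic. The following is only a minimal sketch using the four figures quoted above (30 and 21 genes out of 516; 2.5% and 6.1%); the function and variable names are illustrative and not part of the lecture material.

```python
# Figures quoted in the notes: cytoplasmic tRNA gene counts (out of 516)
# and amino acid frequencies in human proteins (percent).
TOTAL_TRNA_GENES = 516
EXAMPLES = {
    "cysteine": {"trna_genes": 30, "aa_frequency_pct": 2.5},
    "proline": {"trna_genes": 21, "aa_frequency_pct": 6.1},
}


def gene_share_pct(trna_genes, total=TOTAL_TRNA_GENES):
    """Percentage of all cytoplasmic tRNA genes that serve one amino acid."""
    return 100.0 * trna_genes / total


def genes_per_frequency_point(trna_genes, aa_frequency_pct):
    """tRNA genes available per percentage point of amino acid usage."""
    return trna_genes / aa_frequency_pct


if __name__ == "__main__":
    for name, data in EXAMPLES.items():
        share = gene_share_pct(data["trna_genes"])
        ratio = genes_per_frequency_point(
            data["trna_genes"], data["aa_frequency_pct"]
        )
        print(f"{name}: {share:.1f}% of tRNA genes, "
              f"{ratio:.1f} genes per % of amino acid usage")
```

If gene number tracked amino acid usage closely, the two ratios would be similar; with the numbers above, cysteine has several times more tRNA genes per unit of usage than proline.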
Although the tRNA genes seem to be dispersed throughout the human genome, more than half of them (273 out of 516) reside on either chromosome 6 or chromosome 1. In addition, 18 of the 30 Cys tRNA genes are found in a 0.5 Mb stretch of chromosome 7.
tRNAs are adaptor molecules that deliver amino acids to the ribosome and decode the information in mRNA. Their primary structure (i.e. the linear sequence of nucleotides) is 60-95 nucleotides (nt) long, most commonly 76. They contain many modified bases, sometimes accounting for 20% of the total bases in a single tRNA molecule. Indeed, over 50 different types of modified base have been observed in tRNA molecules, and all of them are created post-transcriptionally. Four of these are very common: ribothymidine (T), which contains the base thymine not usually found in RNA, pseudouridine (Ψ), dihydrouridine (D) and inosine (I). All but the last are present in nearly all tRNA molecules at similar positions in the sequence. Inosine is part of the anticodon in some tRNAs and can recognize 2 or 3 different bases (wobble).
All tRNAs have a common secondary structure, the cloverleaf structure. The stem is called the amino acid acceptor stem. See the picture for further details on the secondary structure of tRNAs. There are 9 hydrogen bonds that help form the 3-D structure of tRNA molecules.
tRNAs are joined to amino acids to become aminoacyl-tRNAs (charged tRNAs) in a reaction called aminoacylation. Special enzymes called aminoacyl-tRNA synthetases carry out the joining reaction, which is extremely specific. An example of the nomenclature: the amino acid leucine is linked to its tRNA, called tRNALeu, by leucyl-tRNA synthetase, which generates leucyl-tRNALeu.

NON-PROTEIN CODING RNAs
SLIDES 7-18

SLIDE 7
A New RNA World
RNA genes are transcribed parts of the genome that do not encode proteins, hence the other name of their products, "non-coding RNAs (ncRNAs)".
Much of the attention paid to human genes has focused on protein-coding genes because they were long considered to be by far the most functionally important part of our genome. Non-coding RNA molecules were so underappreciated that the draft human genome sequences reported in 2001 contained no analyses at all of human RNA genes! RNA was seen to be important in very early evolution (RNA World), but its functions were imagined to have been very largely overtaken by DNA and proteins. Until recently, the vast majority of RNA molecules were imagined to serve as accessory molecules in the making of proteins.
The last few years have witnessed a revolution in our understanding of the importance of RNA and, although the number of protein-coding genes has been steadily revised downward since the draft human genome sequences were reported, the number of RNA genes is constantly being revised upward. The tiny mitochondrial genome was always considered to be exceptional because 65% (24 out of 37) of its genes are RNA genes (22 tRNA and 2 rRNA genes). Now we are beginning to realize that the RNA transcribed from the nuclear genome is not as uniformly dedicated to protein synthesis as we once thought; instead, it shows great functional diversity.
What has changed our thinking? (1) First, completely unsuspected classes of ncRNAs have recently been discovered. (2) Secondly, recent whole-genome analyses have shown that at least 85% and possibly more than 90% of the human genome is transcribed. Two other major surprises were (3) the extent of multigenic transcription, and (4) the pervasiveness of bidirectional transcription (about 70% of human genes are transcribed from both strands; non-coding DNA can be transcribed from both strands, too). The recent data challenge the distinction between genes and intergenic space and have forced a radical rethink of the concept of the gene.
We have known for many decades that various ubiquitous ncRNA classes are essential for cell function.
Until recently, however, we have largely been accustomed to thinking of ncRNAs as not much more than a series of accessories that are needed to process gene transcripts and make proteins. Transfer RNAs are needed at the very end of the pathway, serving to decode the codons in mRNA and provide amino acids in the order in which they are needed for insertion into the growing polypeptide chain. Ribosomal RNAs are essential components of the ribosomes, the complex ribonucleoprotein factories of protein synthesis. Other ubiquitous ncRNAs were known to function higher up the pathway to ensure correct processing of mRNA, tRNA and rRNA precursors. Various small RNAs are components of complex ribonucleoproteins involved in different processing reactions, including splicing, cleavage of rRNA and tRNA precursors, and the base modifications that are required for RNA maturation. Typically, these RNAs work as guide RNAs, base pairing with complementary sequences in the precursor RNA.
We have also long been aware of a few ncRNAs that have other functions, such as the RNAs implicated in X-inactivation and imprinting, and the RNA component of the telomerase ribonucleoprotein needed for the synthesis of telomeric DNA. But these RNAs seemed to be quirky exceptions.
In the past decade or so, however, there has been a revolution in how we view RNAs. Many thousands of different ncRNAs have recently been identified. Many of them are developmentally regulated and have been shown to have crucial roles in a whole variety of processes that occur in specialized tissues or at different stages of development. Several ncRNAs have already been implicated in cancer and genetic disease. Maybe it is time to view our genome as more of an RNA machine than just a protein machine.

SLIDE 8
Ribozymes: relics of an ancient world?
A ribozyme (from ribonucleic acid enzyme, also called RNA enzyme or catalytic RNA) is an RNA molecule with a well-defined tertiary structure that enables it to catalyze a chemical reaction. Many natural ribozymes catalyze either the hydrolysis of one of their own phosphodiester bonds or the hydrolysis of bonds in other RNAs, but ribozymes also carry out the peptidyl transferase activity of the ribosome. RNase P, for example, is a ribozyme that can cleave substrate RNA without any requirement for proteins, and certain types of intron are autocatalytic and able to splice themselves out of RNA transcripts without any help from proteins. The peptidyl transferase activity of the ribosome, which catalyzes the formation of the peptide bond, is provided by ribosomal RNA and is therefore a ribozyme activity.

SLIDE 9
Small nuclear RNAs (snRNAs)
Various families of rather small RNA molecules (60 to 360 nucleotides long) are known to have a role in the nucleus in assisting general gene expression, mostly at the level of post-transcriptional processing. Types: spliceosomal snRNAs, non-spliceosomal snRNAs, small nucleolar RNAs (snoRNAs), small Cajal body RNAs (scaRNAs).

Spliceosomal small nuclear RNA genes
The nine human spliceosomal snRNAs vary in length from 106 to 186 nucleotides and bind a ring of seven core proteins. U1, U2, U4, U5, and U6 operate within the major spliceosome to process conventional GU-AG introns. The other four spliceosomal snRNAs form part of the minor spliceosome, which excises AU-AC introns. More than 70 genes specify snRNAs used in the major spliceosome. They include 44 identified genes specifying U6 snRNA and 16 specifying U1 snRNA.

Non-spliceosomal small nuclear RNA genes
Not all snRNAs within the nucleoplasm function as part of spliceosomes. Both U1 and U2 snRNAs also have non-spliceosomal functions. U1 is required to stimulate transcription elongation by RNA polymerase II. Several other snRNAs with a non-spliceosomal function tend to be encoded by single-copy genes, but there are many associated pseudogenes.
Three examples are given below. (1) U7 snRNA is a 63-nucleotide RNA that is dedicated to the specialized 3' processing undergone by histone mRNA which, exceptionally, is not polyadenylated. (2) 7SK RNA is a 331-nucleotide RNA that functions as a negative regulator of the RNA polymerase II elongation factor P-TEFb. (3) The Y RNA family consists of three small RNAs that are involved in chromosomal DNA replication and function as regulators of cell proliferation.

Small nucleolar RNA (snoRNA) genes
SnoRNAs are between 60 and 300 nucleotides long and were initially identified in the nucleolus, where they guide nucleotide modifications in rRNA at specific positions. They do this by forming short duplexes with a sequence of the rRNA that contains the target nucleotide. At least 340 human snoRNA genes have been found so far, but there may be many more, because snoRNAs are very difficult to identify with bioinformatics approaches. The vast majority are found within the introns of a larger gene, which is transcribed by RNA pol-II. These snoRNAs are produced by processing of the intronic RNA, and so the regulation of their synthesis is coupled to that of the host gene. Many snoRNA genes are dispersed single-copy genes; others occur in clusters. Most snoRNAs are ubiquitously expressed, but some are tissue-specific. Nonstandard functions are known or expected for some snoRNAs that do not have sequences complementary to rRNA sequences. For example, the HBII-52 snoRNA has an 18-nucleotide sequence that is perfectly complementary to a sequence within the HTR2C (serotonin receptor 2C) gene at Xq24, and regulates alternative splicing of this gene.

Small Cajal body RNA (scaRNA) genes
Cajal bodies (CBs) are spherical sub-organelles, 0.3-1.0 µm in diameter, found in the nucleus of proliferative cells such as embryonic cells and tumor cells, or of metabolically active cells such as neurons.
In contrast to cytoplasmic organelles, CBs lack a phospholipid membrane that would separate their content, largely proteins and RNA, from the surrounding nucleoplasm. They were first reported by Santiago Ramón y Cajal in 1903. The scaRNAs resemble snoRNAs and perform a similar role in RNA maturation, but their targets are spliceosomal snRNAs: they perform site-specific modifications of spliceosomal snRNA precursors in the Cajal bodies of the nucleus. There are at least 25 human scaRNA genes, each specifying one type of scaRNA. Like snoRNA genes, the scaRNA genes are typically located within the introns of genes transcribed by RNA pol-II.

ANTISENSE RNAs
SLIDES 10-18

SLIDES 10, 11
Antisense RNAs
1. trans-antisense RNAs (imperfect homology): micro RNAs, siRNAs, piRNAs
2. cis-antisense RNAs (perfect homology): overlapping RNAs, siRNAs (?)
Cis position: close to (overlapping) the gene containing homologous sequences
Trans position: far from the gene containing homologous sequences

SLIDES 12-14
MicroRNAs (miRNAs)
A continuously increasing number of miRNAs have been described in the genomes of multicellular organisms. Micro RNA genes yield RNA transcripts that fold back on themselves to form hairpin structures, which are then processed into short single-stranded RNAs. It has been proposed that they act as components of protein/RNA complexes. A miRNA can either pair exactly with an mRNA and cause its degradation via RNA interference (RNAi; see below), or pair partially with a message and shut off its translation. Recent studies involving computational approaches suggest that the human genome may encode well over 1500 different miRNAs; the number known is rising rapidly. A single micro RNA is assumed to regulate the expression of several genes. It is hypothesized that up to one-third of human genes are regulated by these small RNAs.
A miRNA is a form of single-stranded (ss) RNA that is typically 20-25 nucleotides long. The miRNAs are transcribed from DNA but are not translated into protein. The DNA sequence of a miRNA gene is longer than the miRNA itself: it includes the miRNA sequence and an approximate reverse complement. When this DNA sequence is transcribed into a single-stranded RNA molecule, the miRNA sequence and its reverse complement base-pair to form a double-stranded RNA hairpin loop; this is the primary miRNA structure (pri-miRNA). Drosha, a nuclear enzyme, cleaves the base of the hairpin to form the pre-miRNA. The pre-miRNA molecule is then actively transported out of the nucleus into the cytoplasm, where the Dicer enzyme cuts 20-25 nucleotides from the base of the hairpin to release the mature miRNA.
The function of miRNAs appears to be in gene regulation. For that purpose, a miRNA is complementary to a part of one or more mRNAs, usually at a site in the 3'-UTR (untranslated region). The annealing of the miRNA to the mRNA inhibits protein translation. In some cases, the formation of double-stranded RNA through the binding of the miRNA triggers the degradation of the mRNA transcript through a process similar to RNAi; in other cases, it is believed that the miRNA complex blocks the translation machinery, or otherwise prevents translation, without causing the mRNA to be degraded.
Because many miRNAs are strongly conserved during evolution, vertebrate miRNAs were quickly identified. In summary, miRNAs regulate the expression of selected sets of target genes by base pairing with their transcripts; bound miRNA usually inhibits translation and so down-regulates expression of the target gene.
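The hairpin step described above (a miRNA sequence followed by an approximate reverse complement on the same transcript) can be sketched in a few lines. This is a toy model under stated assumptions, not a structure-prediction method: only Watson-Crick pairs are counted, the 22-nt sequence and the loop are made-up illustration values, and the names `reverse_complement` and `stem_mismatches` are hypothetical helpers.

```python
# Watson-Crick partners in RNA (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def stem_mismatches(transcript, arm_len):
    """Count positions where the 5' arm cannot pair with the 3' arm.

    The first arm_len bases are folded back onto the last arm_len bases,
    as in a hairpin; everything in between is treated as the loop.
    """
    five_prime_arm = transcript[:arm_len]
    three_prime_arm = transcript[-arm_len:]
    # A perfectly paired 5' arm equals the reverse complement of the 3' arm.
    ideal_five_prime_arm = reverse_complement(three_prime_arm)
    return sum(1 for a, b in zip(five_prime_arm, ideal_five_prime_arm) if a != b)


# Toy 22-nt sequence and loop, chosen for illustration only.
TOY_MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_LOOP = "GGGAUUC"

perfect_arm = reverse_complement(TOY_MIRNA)
# Change one base of the 3' arm to model an "approximate" reverse complement.
changed_base = "G" if perfect_arm[5] != "G" else "A"
imperfect_arm = perfect_arm[:5] + changed_base + perfect_arm[6:]

PERFECT_HAIRPIN = TOY_MIRNA + TOY_LOOP + perfect_arm
IMPERFECT_HAIRPIN = TOY_MIRNA + TOY_LOOP + imperfect_arm

if __name__ == "__main__":
    print(stem_mismatches(PERFECT_HAIRPIN, len(TOY_MIRNA)))
    print(stem_mismatches(IMPERFECT_HAIRPIN, len(TOY_MIRNA)))
```

The second hairpin still pairs along most of its stem, which is the sense in which the complement in a miRNA gene only needs to be approximate.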
Synthesis of miRNAs involves the cleavage of RNA precursors by nucleus-specific and cytoplasm-specific RNase III ribonucleases, nucleases that specifically bind to and cleave double-stranded RNAs. The primary transcript, the pri-miRNA, has closely positioned inverted repeats that base-pair to form a hairpin RNA. This hairpin is initially cleaved from the primary transcript by a nuclear RNase III (known as Drosha) to make a short double-stranded pre-miRNA that is transported out of the nucleus. A cytoplasmic RNase III called Dicer cleaves the pre-miRNA to generate a miRNA duplex with overhanging 3' dinucleotides. A specific RNA-induced silencing complex (RISC) that contains the endoribonuclease argonaute binds the miRNA duplex and acts to unwind the double-stranded miRNA. The argonaute protein then degrades one of the RNA strands (the passenger strand) to leave the mature single-stranded miRNA (known as the guide strand) bound to argonaute. The mature miRNP (ribonucleoprotein) associates with RNA transcripts that have sequences complementary to the guide strand.
The binding of a miRNA to its target transcript normally involves a significant number of base mismatches. As a result, a typical miRNA can silence the expression of hundreds of target genes, in much the same way that a tissue-specific transcription factor can affect the expression of multiple target genes at the same time.
More than 700 human miRNA genes have so far been identified and experimentally validated, and comparative genomics analyses indicate that the number is likely to increase. Some miRNA genes have their own individual promoters; others are part of a miRNA cluster and are cleaved from a common multi-miRNA transcription unit. Another class of miRNA genes forms part of a compound transcription unit that is dedicated to making other products in addition to the miRNA, either another type of ncRNA or a protein.
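Why tolerating mismatches widens the set of targets can be illustrated with a small search over a 3' UTR. This is a minimal sketch under strong simplifying assumptions: binding is modeled as a plain count of non-complementary positions across the whole miRNA, the sequences are toy values, and `find_target_sites` and `mutate` are hypothetical helpers, not a published target-prediction algorithm.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Arbitrary base substitutions used only to build an imperfect toy site.
SWAP = {"A": "C", "C": "A", "G": "U", "U": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mirna, utr, max_mismatches):
    """Return (start, mismatches) for every UTR window the miRNA could bind.

    A window binds perfectly when it equals the reverse complement of the
    miRNA; up to max_mismatches non-complementary positions are tolerated.
    """
    ideal_site = reverse_complement(mirna)
    size = len(ideal_site)
    hits = []
    for start in range(len(utr) - size + 1):
        window = utr[start:start + size]
        mismatches = sum(1 for a, b in zip(window, ideal_site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


def mutate(site, positions):
    """Substitute the bases at the given positions to create mismatches."""
    return "".join(SWAP[b] if i in positions else b for i, b in enumerate(site))


# Toy data: one perfect and one imperfect site embedded in a poly(A) spacer.
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"
PERFECT_SITE = reverse_complement(MIRNA)
IMPERFECT_SITE = mutate(PERFECT_SITE, {3, 10, 17})
SPACER = "A" * 25
UTR = SPACER + PERFECT_SITE + SPACER + IMPERFECT_SITE + SPACER

if __name__ == "__main__":
    print("strict:  ", find_target_sites(MIRNA, UTR, max_mismatches=0))
    print("tolerant:", find_target_sites(MIRNA, UTR, max_mismatches=3))
```

With no mismatches allowed only the perfect site is reported; allowing three mismatches also picks up the imperfect site, which mirrors how relaxed pairing lets one miRNA reach many transcripts.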
SLIDE 15
Endogenous small interfering RNAs (endo-siRNAs)
Long double-stranded RNA in mammalian cells triggers nonspecific gene silencing through interferon pathways, but transfection of exogenous synthetic siRNA duplexes or hairpin RNAs induces RNAi-mediated silencing of specific genes that share sequence elements with the exogenous RNA. Very recently, it has become clear that human cells also naturally produce endo-siRNAs. Like piRNAs, endo-siRNAs are among the most varied RNA populations in the cell (many tens of thousands of endo-siRNAs have been identified in mouse oocytes, where they have been studied). One way in which endo-siRNAs arise involves the occasional transcription of some pseudogenes.

SLIDE 16
Piwi-protein-interacting RNAs (piRNAs)
piRNAs have been found in a wide variety of eukaryotes. In mammals they are expressed in germ-line cells and are typically 24-31 nucleotides long; they are thought to have a major role in limiting transposition by retrotransposons, but they may also regulate gene expression. Control of transposon activity is required because, by integrating into new locations in the genome, active transposons can interfere with gene function, causing genetic diseases and cancer. More than 15,000 different human piRNAs have been identified, so the piRNA family is among the most diverse RNA families in human cells. They are thought to be cleaved from large multigenic transcripts.
The piRNAs are processed from long RNA precursors transcribed from defined loci called piRNA clusters. Any transposon inserted in the reverse orientation in a piRNA cluster can give rise to antisense piRNAs (shown in red in the figure). Antisense piRNAs are incorporated into a piwi protein and direct its slicer activity onto sense transposon transcripts. The 3' cleavage product is bound by another piwi protein and trimmed to piRNA size. The resulting sense piRNA is, in turn, used to cleave piRNA cluster transcripts and to generate more antisense piRNAs.
Antisense piRNAs also target the piwi complexes to RNAs that are being transcribed, which leads to DNA methylation or histone modification (histone methylation) in the vicinity of the transcribed region.

SLIDE 17
Antisense overlapping RNAs
Natural cis-encoded antisense RNAs are endogenous transcripts that are transcribed from the opposite strand of the same genomic locus as the sense RNA and have a region of perfect overlap with the sense transcript. Surprising new data suggest that at least 30-40% of genes are under the control of cis-antisense RNAs. The pairing of an mRNA with its antisense transcript can sterically block translation of the mRNA or, alternatively, may trigger the RNA interference pathway, which eventually leads to degradation of the mRNA.
Many thousands of different long ncRNAs, often many kilobases in length, are also thought to have regulatory roles in animal cells. They include antisense transcripts that usually do not undergo splicing and that can regulate overlapping sense transcripts, plus a wide variety of long mRNA-like ncRNAs that undergo splicing and polyadenylation but do not seem to encode any sizable polypeptide, although some contain internal ncRNAs such as snoRNAs and piRNAs. The functions of the great majority of the mRNA-like ncRNAs are unknown. Some, however, are known to be tissue-specific and involved in gene regulation. The XIST gene encodes a long ncRNA that regulates X-chromosome inactivation, the process by which one of the two X chromosomes in female mammals is randomly selected to be condensed, with large regions becoming transcriptionally inactive (see lecture Epigenetics). Many other long ncRNAs, such as the H19 RNA, are implicated in repressing the transcription of either the paternal or the maternal allele of autosomal regions (imprinting, see lecture Epigenetics).

Role in medicine
The double-stranded (ds) RNAs that trigger RNAi may be usable as drugs.
Another speculative use of dsRNA is the repression of essential genes of eukaryotic human pathogens or viruses, provided that these genes are dissimilar from any human gene; this would be analogous to how existing drugs work. RNAi interferes with the translation step of gene expression and appears not to interact with the DNA itself. Proponents of RNAi-based therapies suggest that this lack of interaction with DNA may alleviate some patients' concerns about alteration of their DNA (as practiced in gene therapy), and that this method of treatment would likely be no more feared than taking any prescription drug. For this reason, RNAi and therapies based on it have attracted much interest in the pharmaceutical and biotech industries.

SLIDE 18
HAR (human accelerated regions)
Forty-nine regions have been described in the non-coding part of the human genome that are highly conserved (unchanged) among vertebrate species but have evolved rapidly (changed in nucleotide content) in humans; these are the human accelerated regions (HARs). Twelve of the 49 HARs are expressed in the brain. The most rapidly changing of them is HAR1: in a 118 bp DNA stretch there are only 2 base-pair differences between chimp and chicken, but 18 base-pair differences between chimp and human. Interestingly, HAR1 is expressed in the human brain cortex during gestational weeks 7-17, which raises the question of whether these regions played a role in human evolution.

The pre-initiation complex
The pre-initiation complex facilitates the binding of RNA polymerase II to the promoter of a gene, which in turn initiates transcription. RNA polymerase II is composed of 12 subunits. Binding of RNA polymerase to the promoter has to be preceded by the attachment of several transcription factors to the promoter or to the polymerase itself. The pre-initiation complex alone can drive only a basal level of expression of a given gene.
Other elements at more distant positions are needed for an elevated level of expression.

mRNA transport
Eukaryotic mRNAs must leave the nucleus in order to be translated into proteins. Mature mRNAs exit through the nuclear pores, but the underlying mechanisms are not fully understood. A large portion of unprocessed transcripts never leave the nucleus and are degraded. Proteins destined for particular organelles are directed there by signal peptides located at their N-termini. Another way to deliver a protein to the desired organelle is based on mRNA targeting: some mRNAs contain a "zip code", usually in the 3' untranslated region, which carries the information for the subcellular targeting of the mRNA.

Texts written in small letters belong to the extra requirements.