Chapter 2 DNA, RNA, Transcription and Translation

I. DNA (deoxyribonucleic acid)
The basic genetic material that establishes and maintains cellular and biochemical function.
Central Dogma: (Gene Expression)

Structure of DNA (discovered by Watson and Crick)
DNA basic unit: nucleotides, each composed of an organic base, a pentose and a phosphate group.
Four different bases in DNA (A, T, G, C): the genetic information is stored in the alignment and sequence of these 4 bases, analogous to the 0 and 1 used for information storage in computers.
The sequence of the DNA determines the polypeptide sequence and the protein function, and hence the cellular activities and functions.

Precursor of DNA synthesis: dNTP (deoxynucleoside triphosphate)
The nucleotides of DNA are joined by the addition of dNTP to the polynucleotide chain (the P at the α position reacts with the 3’-OH of the growing nucleotide chain).
**DNA synthesis proceeds from 5’ to 3’ => 5’-ATGC-….3’

Base           Nucleoside   Nucleotide        Precursor for DNA synthesis   Precursor for RNA synthesis
Adenine (A)    Adenosine    Adenylic acid     dATP                          ATP
Guanine (G)    Guanosine    Guanylic acid     dGTP                          GTP
Cytosine (C)   Cytidine     Cytidylic acid    dCTP                          CTP
Thymine (T)    Thymidine    Thymidylic acid   dTTP                          -
Uracil (U)     Uridine      Uridylic acid     -                             UTP

Note: base + sugar = nucleoside; phosphate + base + sugar = nucleotide.
ATP: adenosine triphosphate; the energy stored in ATP can drive many bioreactions (e.g. active transport, by hydrolysis to ADP or AMP).

DNA double helix
DNA in cells exists as a double helix consisting of two long chains (strands).
A pairs only with T, and G pairs only with C. These interactions are called base pairing; the two strands are complementary.
The length of DNA is expressed in base pairs (bp).
The two strands run in opposite directions => one is 5’→3’, the other is 3’→5’.
Because the two strands are complementary, ∴ if one strand is 5’-TAGGCAT-3’, the other strand must be 3’-ATCCGTA-5’.
Usually the 5’ end is written on the left.

II. DNA Replication [5]:
During cell division, new DNA is synthesized and segregated to the new daughter cells.
Replication initiates at the origin of replication (ori). Some ori have been identified in bacteria (245 bp oriC in E. coli), yeast, chloroplasts and mitochondria. ori is usually A-T rich (easier melting).
DNA topoisomerase: unwinds the helix
DNA helicase: unzips the helix
DNA polymerase: moves along the DNA template and catalyzes the incorporation of nucleotides into DNA (synthesis from 5’ to 3’).
Replication process in E. coli (500 nt/s in bacteria, 50 nt/s in mammals)
Synthesis of lagging strand: see Ref [2]
DNA polymerase requires an RNA primer to initiate DNA synthesis, but RNA polymerase does not need a primer for RNA synthesis.
In eucaryotes, a repair system exists to ensure the fidelity of DNA replication.
In E. coli, the chromosome is circular and replication proceeds in two directions (see Footnote 1).
In each daughter chromosome, one strand comes from the parent and the other is newly synthesized, ∴ → semiconservative replication.

Gene: a stretch of nucleotide bases on a strand of DNA that is transcribed into RNA.
The number and sequence of the bases within a gene determine the info the gene carries, i.e. the a.a. sequence of the specific polypeptide.
There are

Footnote 1: In human cells, the telomere is synthesized by telomerase. Telomerase is absent in many cell types, thus their telomeres become shorter with each cell division → eventually DNA damage occurs at chromosome ends → sends a signal to stabilize p53 → transcription of several genes (e.g. p21) → CKI (cdk inhibitor protein) binds to and inhibits G1/S-cdk and S-cdk => blocks entry into S phase (Alberts, p. 1007-1018).

III. RNA (ribonucleic acid):
RNA is also a polymer of nucleotides, but differs from DNA in that uracil (U) substitutes for thymine (T), and the pentose is ribose instead of deoxyribose.

Types of RNA
mRNA (messenger RNA): encodes protein (3~5%); transcribed by RNA polymerase II (RNA pol II).
DNA is always present within the cell, whereas mRNA is present transiently and is degraded after a short period of time (half-lives of yeast mRNA: 1-60 min; half-lives of animal mRNA: 1-24 h). DNA is double stranded, whereas mRNA is single stranded.
tRNA (transfer RNA): carries a.a. to the site of protein synthesis (required for protein translation) (4%); transcribed by RNA pol III.
rRNA (ribosomal RNA): several different sizes (≒90%); transcribed by RNA pol I.
  5S, 16S, 23S in procaryotes (S is the relative sedimentation rate during centrifugation).
  5S, 5.8S, 18S, 28S in eucaryotes, extensively modified (e.g. methylation of the 2’-OH position on ribose).
  rRNA combines with some proteins to form the ribosome.
microRNA:
Catalytic RNA (ribozyme): enzymatically cleaves RNA molecules.

IV. Transcription [2, 3, 5]:
RNA polymerase (RNA pol) is the enzyme that catalyzes transcription using DNA as the template (40 nt/s at 37°C for bacterial RNA pol, much slower than the DNA replication rate).
RNA pol first binds the promoter region of the template DNA. RNA pol is large and spans 75-80 bp (from -55 to +20); it separates the two DNA strands in a transient bubble and synthesizes the first 9 nt bonds.
RNA pol moves along the template strand in the 3’ to 5’ direction, in a way similar to DNA replication (RNA synthesis must be from 5’ to 3’).
As RNA pol moves, it unwinds the DNA at the front of the bubble (12-14 bp) and rewinds the DNA at the back. The RNA-DNA hybrid is shorter and transient; the nascent RNA is then released.
When the termination sequence is reached, the enzyme stops adding nt to the RNA chain, releases the product and dissociates from the DNA template.
Note: RNA pol consists of the following subunits
  2 α subunits: enzyme assembly and promoter recognition
  β and β’ subunits: catalytic center
  σ subunit: promoter specificity. In E. coli, the σ factor (e.g. σ70, σ32) is essential for initiation; it is released when the RNA chain reaches 8-9 nt.
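
The direction rule for transcription (template read 3’ to 5’, RNA made 5’ to 3’) can be sketched in a few lines of Python. This is only an illustration of base pairing, not a model of the enzyme: the function name `transcribe` and the example template are assumptions made for this sketch and do not come from the references.

```python
# Base-pairing rules for a DNA template -> RNA (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') made from a template strand.

    The template is written in the order RNA pol reads it (3'->5'),
    so the RNA grows 5'->3' simply by pairing each base in turn.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, written 3'->5'.
template = "TACGACAAA"
rna = transcribe(template)
print(rna)  # AUGCUGUUU
```

The example template was chosen so that the RNA reads AUG CUG UUU, i.e. a start codon followed by the Leu and Phe codons used in the elongation steps of the Translation section.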
Two strands of DNA
coding strand (sense strand): has the same sequence as the mRNA (with T in place of U); often the sequence shown in the literature.
template strand (antisense strand):
Rifampicin, an antibiotic used against tuberculosis, inhibits transcription.

Differences in procaryotic and eucaryotic genes:
Procaryotic genes
Genes w/ related functions are often contiguous, forming the operon (including the genes themselves and the control elements). These genes are under the control of a single promoter → generates a set of proteins (polycistronic).
Ex: Lac operon (the 1st operon studied, uncovered by Jacob and Monod in 1961); the gene products enable cells to take up and metabolize β-galactosides such as lactose.
  LacZ: encodes β-galactosidase, which cleaves lactose into glucose and galactose.
  LacY: encodes permease, which transports β-galactosides into the cell.
  LacA: encodes transacetylase, which transfers an acetyl group from acetyl-CoA to galactosides.
Note: the operon maintains basal level transcription so that small amounts of permease can transport foreign β-galactosides into the cells.

Eucaryotic genes:
Consist of a set of coding regions (exons) separated by noncoding regions (introns). mRNA is synthesized via transcription and undergoes splicing. See Ref [4]
All RNA pol II transcribed RNAs are capped by a terminal nt, 7-methylguanylate (m7G). The 5’ cap positively influences the poly(A) addition and splicing, and is essential for the initiation of translation.
After transcription, the 3’ end of mRNA is cleaved by an endonuclease, then poly(A) polymerase adds 200 A residues downstream (15 nt) of the poly(A) signal (e.g. AAUAAA). See Ref [1]
Splicing occurs in the spliceosome (in the nucleus), which consists of small nuclear RNAs (<200 nt) complexed to proteins to form snRNPs (small nuclear ribonucleoprotein particles).
Splicing occurs at the exon-intron junctions; almost all introns begin with GU and end with AG.

Alternative Splicing
A primary transcript may undergo alternative splicing events, i.e.
the mRNA may be spliced one way in one tissue, but in a different way in another tissue.
The exon skipping mechanism generates different gene products in different tissues from the same structural gene (it also facilitates the evolution of novel proteins thanks to recombination).

Genetic Code [3]: the blueprint for any living cell; universal, applicable to all living systems.
3 bases on RNA code for an a.a.
64 (4^3) combinations, many of which are redundant, e.g. UCU, UCC, UCA and UCG all specify serine.
Nonsense codons: UAA, UAG and UGA don’t encode anything but act as the stop signal in translation.
Start codon: AUG (Met)
See Ref [4]

V. Translation [2]:
Requires the interaction of mRNA, charged tRNA, ribosomes and some factors that facilitate initiation, elongation and termination.
tRNA: (also expressed by genes)
  transports a.a. to the polypeptide chain
  75~93 nucleotides
  has a secondary (2°) structure → folded
  a particular a.a. is linked by its carboxyl end (to form an aminoacyl group); after addition of the a.a., the tRNA is “charged”.
  the anticodon determines which specific a.a. can be added.

1. Initiation: (usually the rate limiting step)
Procaryote
mRNA binds to the small (30S) ribosomal subunit. The binding is mediated by the Shine-Dalgarno (SD) sequence (6-8 nt, -AGGAGGU-, RBS). The SD is 5-10 nt upstream of the AUG codon and is complementary to a sequence of nt near the 3’ end of the 16S rRNA of the small ribosomal subunit.
Several initiation factors (IF1, IF2 and IF3) are required for initiation. E.g. IF2 is required for attachment of the first aminoacyl-tRNA.
A formyl-methionine charged tRNA binds to the mRNA-small ribosomal subunit complex. (Procaryotic cells possess 2 distinct methionyl-tRNAs: one for initiation (tRNAiMet), one for incorporating Met into the interior of a polypeptide (tRNAMet).)
The large (50S) ribosomal subunit joins to complete the initiation complex assembly.
Eucaryote
mRNA is transported out of the nucleus for translation.
Requires at least 10 initiation factors (eIF), several of which bind to the 40S subunit. The initiator tRNAiMet also binds to the 40S subunit prior to its interaction with the mRNA; these and other eIFs form the 43S complex (shown on the left), which is recruited to the mRNA with the help of eIFs that are already bound to the mRNA.
The eIFs bound to the mRNA:
  Poly(A) binding protein (PABP): binds the 3’ poly(A)
  eIF4E: binds to the 5’ cap
  eIF4A: a helicase, moves along the 5’ end unwinding the mRNA
  eIF4G: serves as a linker between the 5’ capped end and the 3’ poly(A) end of the mRNA; thus eIF4G converts a linear mRNA into a circular loop.
Once the 43S complex binds to the 5’ end, the complex scans along the transcript until it reaches the first AUG.
Once the 43S complex reaches the AUG codon, eIF2-GTP is hydrolyzed to eIF2-GDP and released along with other factors, and the large (60S) subunit joins to complete the initiation.
Usually, but not always, the first AUG to be encountered is the initiation codon. However, the AUG triplet alone is not sufficient to define the start codon; it is recognized efficiently as the initiation codon only when it is in the right context. An initiation codon may be recognized in the sequence NNNPuNNAUGG (e.g. Kozak sequence -CCACCAUGG-). The purine (A or G) at the -3 position and the G immediately following AUG can influence the translation efficiency by 10X.
This scanning mechanism is found in most eucaryotic cells, but some viruses (e.g. poliovirus) use an alternative mechanism by which the 40S subunit associates directly with an internal ribosome entry site (IRES).
When the leader sequence is long, further 43S complexes can recognize the 5’ end before the first has left the initiation site, creating a queue of subunits proceeding along the leader to the initiation site.

2. Elongation: (similar for eucaryotes and procaryotes)
1. The second codon (CUG) base pairs with the anticodon (GAC) of Leu-tRNA.
2.
The Met of the initiator tRNA is joined to the Leu of Leu-tRNA by a peptide bond, and the uncharged initiator tRNA is ejected from the ribosome.
3. Translocation of the peptidyl-tRNA and the mRNA to the peptidyl site (P site, where the peptidyl-tRNA sits on the ribosome) from the aminoacyl site (A site), which opens the aminoacyl site for the next codon (UUU).
4. The third codon (UUU) base pairs with the anticodon (AAA) of Phe-tRNA.
5. The Leu of the peptidyl-tRNA is joined to the Phe of Phe-tRNA by a peptide bond, and the uncharged Leu-tRNA is ejected from the ribosome.
6. Translocation of the peptidyl-tRNA and mRNA to the peptidyl site from the aminoacyl site, which opens the aminoacyl site for the next codon and codon-anticodon interaction.

3. Termination:
The above process continues until a stop codon (UAA, UAG or UGA) is reached. The stop codon binds a termination factor; the last tRNA is cleaved from the peptide chain and ejected. The mRNA and the peptide are released; the ribosomes are prepared for recycling. In many cases, the initial Met is removed later.
Note: The correct protein sequence is synthesized only if the message is read in the correct “reading frame”. Every sequence can have 3 possible reading frames, e.g.
  THE BIG RED DOG     correct reading frame
  HEB IGR EDD…        frameshift: wrong, meaningless
Translation can occur co-transcriptionally in procaryotes, but in eucaryotes mRNA is transported to the ribosome for translation.

VI. POST-TRANSLATIONAL PROCESSING
After translation, the polypeptide is further processed to become functional. Many eucaryotic proteins require post-translational processing to be functional, while in E. coli many of the post-translational processing steps are not performed.
Several types of processing:
1. Folding into the proper conformation, which can be assisted by chaperone proteins (e.g. GroEL/GroES in E. coli).
2. Signal peptide cleavage: many proteins carry signal sequences for intracellular translocation and secretion; these signals are cleaved during secretion.
3.
Glycosylation: addition of sugar to the protein; very common in eucaryotic proteins; can affect protein function, stability and immunogenicity.
4. Phosphorylation
5. Proteolytic processing: e.g. cleavage of a polyprotein (VP) to form VP2 and VP4 (IBDV)

VII. Regulation of mRNA Transcription
Cells transcribe a common set of genes (housekeeping genes) that maintain routine cellular functions, but not all genes are transcribed and translated at the same rate or at the same time. Genes are turned on and off as needed; otherwise cell resources would be depleted. Therefore, gene expression is regulated.
If a protein is required by a cell, a signaling system initiates transcription of the pertinent genes.

In Bacteria
Gene clusters are controlled by a single promoter. Each gene has its own ATG codon.
In the promoter region, there are two binding sites for
The distance separating -35 and -10 sites is
The affinity of the promoter region to RNA polymerase is called the strength.
The sequence between TATA box and the initiation site is called

Example of Negative Regulation
Repressor protein binds to the operator → RNA pol cannot move along → off (but there could be a basal level of expression). Ex: LacI represses the Lac operon.
Effector (also called inducer) molecules bind to the repressor and release it from the operator region → on (induction).
e.g. the lacZ gene is off w/o β-galactoside; when the substrate is added, the enzyme activity appears within 2-3 min.
IPTG (isopropyl-beta-D-thiogalactopyranoside), a synthetic analogue of lactose, induces lac genes very efficiently.

Example of Positive Regulation: (increases the frequency of transcription)
Activator-operator complex attracts RNA pol and increases the transcription efficiency.
The same effector can be used in positive or negative control for different genes, i.e., it may increase or decrease the transcription rates.

In Eucaryotes
Many genes are strictly regulated and may be transcribed only in restricted types of cells, e.g.
α and β subunits of hemoglobin are only expressed in red blood cells.
The transcriptional regulation of genes is essential for maintaining cell specificity and conserving cellular energy and function.
In general, gene expression in eucaryotic cells is controlled at the initiation of transcription. Regulation at subsequent stages of transcription is relatively rare in eucaryotes.
Transcription is mediated by transcription factors (TF), many of which bind to DNA sequences that are often <10 bp in length (boxes, elements, enhancers; see Footnote 2).
Basal factors: together with RNA pol, bind at the start point and TATA box.
Activators: TF that can bind the promoters or enhancers and increase the efficiency with which the basal apparatus binds to the promoter. Some activators are ubiquitous, but others have a regulatory role and are synthesized/activated at specific times or in specific tissues, to bind the response elements.
Coactivators: do not bind the DNA themselves; provide a connection between activators and the basal apparatus.
Some regulators act to change the chromatin structure (see Appendix).
p53 is a transcription factor → upregulates p21 and p16 → cell cycle regulation → mutation of p53 often leads to cancer.

Footnote 2: Enhancers: elements that can stimulate transcription. They may be located a considerable distance from the transcription start site, and may be upstream or downstream of the gene they control. Folding and bending of DNA might bring these regions close to each other.

Each eucaryotic gene has its own set of elements within the promoter, but there are common elements within a typical promoter (200 bp). Active promoters contain at least one of these three, but may not contain all three:
TATA box (-25 bp): 8 bp containing a TATA sequence; binds general transcription factors (TFIID for pol II transcription); responsible for correctly locating the transcription start site.
CCAAT box (cat box, -75): binds the TFs CTF and NF1; affects the efficiency.
GC box (-90 bp): repeated GC nt (GGGCGG). Binds the TF SP1.
Some genes contain response elements (a common promoter or enhancer element that can respond to a certain factor), e.g. HSE (heat shock element).

Initiation Process
TFIID binds to the TATA box; other transcription factors subsequently bind to form a protein complex. RNA pol II then binds the complex (oriented toward the gene). With the aid of additional TFs, transcription is initiated at the correct starting point.
Note: Transcription control is generally positive in eucaryotes. Repression is usually accomplished at the level of unavailability or inactivation of TF, change in

In mammals, the methylation of DNA occurs at the cytosine bases in CpG dinucleotides via the methyltransferase [1]. A high CpG content is found in regions known as CpG islands (a stretch of DNA of 1-2 kb that has clusters of CpG doublets). CpG islands surround the promoters of constitutively expressed genes, where they are unmethylated. Methylation of a CpG island prevents activation of a promoter within it. The methylation can recruit HDAC activities to methylated CpG sequences to establish a repressed state of chromatin (epigenetics).

VIII. Summary
Eucaryotic promoter elements and other sequences important for transcription and translation [3].
Which strand serves as the template depends on the promoter and the direction in which RNA pol is moving.
(Figure: RNA pol on double-stranded DNA, with the 5’ and 3’ ends of both strands labeled.)
RNA synthesis must be 5’ to 3’ → the DNA template must be traversed from 3’ to 5’.
So:
If RNA pol moves to the right → top is sense, bottom is template (antisense).
If RNA pol moves to the left → bottom is sense, top is template.
At one locus on a chromosome the top strand may be the template, while at another locus the bottom strand may be the template.
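
The strand rule in the Summary can be checked with a small Python sketch. It assumes the top strand is written 5’ to 3’ from left to right; the helper names `reverse_complement` and `strands_for_direction` are hypothetical and exist only for this illustration. The example reuses the 5’-TAGGCAT-3’ sequence from the DNA double helix section.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def strands_for_direction(top_5_to_3: str, direction: str) -> dict:
    """Pick sense/template strands for RNA pol moving 'right' or 'left'.

    top_5_to_3 is the top strand, written 5'->3' from left to right.
    Both DNA strands and the mRNA are returned written 5'->3'.
    """
    top = top_5_to_3.upper()
    bottom = reverse_complement(top)  # bottom strand, written 5'->3'
    if direction == "right":
        sense, template = top, bottom
    elif direction == "left":
        sense, template = bottom, top
    else:
        raise ValueError("direction must be 'right' or 'left'")
    # The mRNA has the sense-strand sequence with U in place of T.
    return {"sense": sense, "template": template,
            "mrna": sense.replace("T", "U")}


print(strands_for_direction("TAGGCAT", "right")["mrna"])  # UAGGCAU
print(strands_for_direction("TAGGCAT", "left")["mrna"])   # AUGCCUA
```

In both cases the template is the strand that RNA pol traverses 3’ to 5’, which is why the same stretch of DNA gives two different transcripts depending on the direction of travel.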
Appendix
Chromatin: complex of histones, DNA and other proteins.
The nucleus of an interphase (G1, S, G2 phases) cell contains chromatin fibers, whose extended state is more suitable for gene transcription; only when the cells enter M phase do the chromatin fibers compact to form the chromosome.
The metaphase chromosome has undergone DNA replication, so it consists of two sister chromatids (identical copies of the chromosome after replication). After replication, they are aligned in parallel and linked at the centromere.
Alternative Splicing [4]

References:
[1] Krauss G. Biochemistry of Signal Transduction and Regulation. Weinheim: Wiley-VCH Verlag, 2003.
[2] Lewin B. Genes VIII. Upper Saddle River, NJ: Pearson Prentice Hall, 2004.
[3] Barnum SR. Biotechnology: An Introduction. New York: Thomson Learning, 2005.
[4] Watson J, Myers RM, Caudy AA, Witkowski JA. Recombinant DNA: Genes and Genomes. New York: W.H. Freeman and Company, 2007.
[5] Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. Molecular Biology of the Cell. New York: Garland Science, 2002.