How cells read the genome: From DNA to Protein
Control of Gene Expression
M. Saifur Rohman, MD. PhD. FIHA. FICA

Subtopics
• From DNA to RNA
• From RNA to protein
• The RNA world and the origin of life
• An overview of gene control
• DNA-binding motifs in gene regulatory proteins
• How genetic switches work
• The molecular genetic mechanisms that create specialized cell types
• Posttranscriptional controls
• How genomes evolve

From DNA to Protein: An overview
Protein synthesis
• DNA
• mRNA (transcription)
• Protein (translation)

From DNA to Protein

From RNA to Protein: Step by step
• The genetic code
• Open reading frames
• tRNA structure and production
• tRNA charging: tRNA synthetases
• Ribosome structure (components, tRNA binding, rRNA, peptide tunnel)
• Peptide chain elongation: EF-Tu, EF-G or EF1, EF2
• Initiation (prokaryotic & eukaryotic)
• Termination
• Polyribosomes
• mRNA template surveillance (quality control): NMD, non-stop mediated decay, tmRNA
• Changes in the code (selenocysteine, frameshifting, hardcoded)
• Protein folding (chaperones… hsp60 & hsp70, degradation, diseases)

The RNA World
• Basic tenets of the theory
• Basic timeline
• Pre-RNA world
• Ribozymes
• SELEX (Systematic Evolution of Ligands by EXponential enrichment)
• Model of central dogma

Gene control and DNA binding motifs
• Differentiated cells contain the same DNA
• Structure of DNA-binding proteins: DNA-binding and activation domains
• Types of DNA-binding motifs and how they work
• Common techniques
  • Microarray
  • 2-D gels
  • Gel mobility shift
  • DNA affinity chromatography
  • Footprinting
  • SELEX
  • One-hybrid system
  • Chromatin immunoprecipitation, ChIP-chip, ChIP-seq
  • Phylogenetic footprinting

The control of gene expression
• Each cell in the human body contains all the genetic material for the growth and development of a human
• Some of these genes need to be expressed all the time
• These are the genes that are involved in vital biochemical processes such as respiration
• Other
genes are not expressed all the time
• They are switched on and off as needed
© 2007 Paul Billiet ODWS

Operons
• An operon is a group of genes that are transcribed at the same time.
• They usually control an important biochemical process.
• They are found mainly in prokaryotes.

Jacob, Monod & Lwoff (photo © NobelPrize.org)

The lac Operon
• The lac operon consists of three genes, each involved in processing the sugar lactose
• One of them is the gene for the enzyme β-galactosidase
• This enzyme hydrolyses lactose into glucose and galactose

1. When lactose is absent
• A repressor protein is continuously synthesised. It sits on a sequence of DNA just in front of the lac operon, the operator site
• The repressor protein blocks the promoter site, where the RNA polymerase settles before it starts transcribing
[Figure: DNA with regulator gene (I), operator site (O) and the lac operon genes z, y, a; the repressor protein on the operator blocks RNA polymerase]

2. When lactose is present
• A small amount of a sugar, allolactose, is formed within the bacterial cell. This fits onto the repressor protein at another site (the allosteric site)
• This causes the repressor protein to change its shape (a conformational change). It can no longer sit on the operator site. RNA polymerase can now reach its promoter site
[Figure: DNA with I, O, promoter site and genes z, y, a]

3. When both glucose and lactose are present
• This explains how the lac operon is transcribed only when lactose is present.
• BUT this does not explain why the operon is not transcribed when both glucose and lactose are present.
• When glucose and lactose are present, RNA polymerase can sit on the promoter site, but it is unstable and keeps falling off
[Figure: repressor protein removed; RNA polymerase at the promoter site on DNA with I, O and genes z, y, a]

4. When glucose is absent and lactose is present
• Another protein is needed, an activator protein. This stabilises RNA polymerase.
• The activator protein only works when glucose is absent
• In this way E. coli only makes enzymes to metabolise other sugars in the absence of glucose
[Figure: the activator protein steadies the RNA polymerase at the promoter site; transcription of z, y, a proceeds]

Summary of lac operon control

Carbohydrates        | Activator protein | Repressor protein        | RNA polymerase                  | lac operon
+ GLUCOSE + LACTOSE  | Not bound to DNA  | Lifted off operator site | Keeps falling off promoter site | No transcription
+ GLUCOSE - LACTOSE  | Not bound to DNA  | Bound to operator site   | Blocked by the repressor        | No transcription
- GLUCOSE - LACTOSE  | Bound to DNA      | Bound to operator site   | Blocked by the repressor        | No transcription
- GLUCOSE + LACTOSE  | Bound to DNA      | Lifted off operator site | Sits on the promoter site       | Transcription

Control of Gene Expression
1. DNA-Protein Interaction
2. Transcription Regulation
3. Post-transcriptional Regulation

Neuron and lymphocyte
Different morphology, same genome

Six steps at which eukaryotic gene expression is controlled

Regulation at DNA levels
Double helix structure

Base pairs can be distinguished from their outer surfaces without opening the double helix
Figure legend: hydrogen bond donor: blue; hydrogen bond acceptor: red; hydrogen bond: pink; methyl group: yellow

DNA recognition code

One typical contact at a protein-DNA interface
In general, many such contacts form between a protein and DNA

DNA-Protein Interaction
1. Different protein motifs binding to DNA: helix-turn-helix motif; the homeodomain; leucine zipper; helix-loop-helix; zinc finger
2. Dimerization approach
3. Biotechnology to identify proteins and DNA sequences that interact with each other
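
The four rows of the lac operon summary table above can be read as two-input logic: the activator binds only when glucose is absent, and the repressor leaves the operator only when lactose is present. The sketch below is a minimal illustration of that table, not a quantitative model; the names `LacState` and `lac_operon` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacState:
    """One row of the lac operon summary table (illustrative names)."""
    activator_bound: bool        # activator protein bound to DNA
    repressor_on_operator: bool  # repressor sitting on the operator site
    polymerase: str              # what RNA polymerase is doing
    transcribed: bool            # whether the operon is transcribed


def lac_operon(glucose: bool, lactose: bool) -> LacState:
    # The activator protein only works when glucose is absent.
    activator_bound = not glucose
    # Allolactose (made when lactose is present) lifts the repressor off.
    repressor_on_operator = not lactose

    if repressor_on_operator:
        polymerase = "blocked by the repressor"
    elif activator_bound:
        polymerase = "sits on the promoter site"
    else:
        polymerase = "keeps falling off promoter site"

    # Transcription needs the repressor gone AND the activator in place.
    transcribed = activator_bound and not repressor_on_operator
    return LacState(activator_bound, repressor_on_operator, polymerase, transcribed)


if __name__ == "__main__":
    for glucose in (True, False):
        for lactose in (True, False):
            print(f"glucose={glucose}, lactose={lactose}:", lac_operon(glucose, lactose))
```

Only the combination of no glucose and lactose present yields transcription, which matches the last row of the table.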
Helix-turn-Helix
The C-terminal helix binds in the major groove; the N-terminal helix helps to position the complex. Discovered in bacteria.

Homeodomain
A protein in Drosophila utilizing the helix-turn-helix motif

Zinc Finger Motifs
• Utilize a zinc atom in the center
• An alpha helix and a two-stranded beta sheet

An example protein (a mouse DNA regulatory protein) utilizing the zinc finger motif
Three zinc finger motifs form the recognition site

A dimer of the zinc finger domain of the glucocorticoid receptor (belonging to the intracellular receptor family) bound to its specific DNA sequence
Zinc atoms stabilize the DNA-binding helix and the dimerization interface

Beta sheets can also recognize DNA sequences (bacterial Met repressor, which binds S-adenosyl methionine)

Leucine Zipper Dimer
The same motif mediates both DNA binding and protein dimerization (yeast Gcn4 protein)

Homodimers and heterodimers can recognize different patterns

Helix-loop-Helix (HLH) Motif and its dimer
Truncation of the HLH tail (DNA-binding domain) inhibits binding

Six zinc finger motifs and their interaction with DNA

Gel-mobility shift assay
Can identify the sizes of proteins associated with the desired DNA fragment

DNA affinity chromatography
After obtaining the protein: run mass spectrometry, identify the amino acid sequence, check the genome, find the gene sequence

Assay to determine the gene sequence recognized by a specific protein

Chromatin Immunoprecipitation
Identifies the genes bound in vivo by a known protein

Summary
• Helix-turn-helix, homeodomain, leucine zipper, helix-loop-helix, zinc finger motif
• Homodimer and heterodimer
• Techniques to identify gene sequences bound to a known protein (DNA affinity chromatography) or proteins bound to known sequences (gel mobility shift)

Gene Expression Regulation: Transcription

Tryptophan Gene Regulation (Negative control)
Operon: genes that are adjacent to each other and are transcribed from a single promoter

Different Mechanisms of Gene Regulation

The binding site of the lambda repressor determines its function
It can act as both an activator and a repressor

Combinatory
Regulation of Lac Operon
CAP: catabolite activator protein; lactose is broken down when glucose is low and lactose is present

The differences between the regulatory systems of eukaryotes and bacteria
1. Enhancers acting from a far distance on promoter regions
2. Transcription factors
3. Chromatin structure

Gene Activation at a distance

Regulation of a eukaryotic gene
TFs are similar; gene regulatory proteins can be very different for different genes

Functional domains of a gene activation protein
1. Activation domain
2. DNA-binding domain

Gene activation by the recruitment of the RNA polymerase II holoenzyme

Genetic engineering revealed the function of the gene activation protein
When the mediator protein is fused directly to the enhancer-binding domain, omitting the activation domain, similar enhancement is observed

Gene regulatory proteins help the recruitment and assembly of the transcription machinery (general model)

Gene activator proteins recruit chromatin modulation proteins to induce transcription

Two mechanisms of histone acetylation in gene regulation
a. Histone acetylation further attracts activator proteins
b.
Histone acetylation directly attracts TFs

Synergistic Regulation
Transcription synergy

Five major ways in which gene repressor proteins function

Proteins assemble into complexes to regulate gene expression

Integration for Gene Regulation

Regulation of Gene Activation Proteins

Insulator elements (boundary elements) help to coordinate the regulation

Gene regulatory proteins can affect the transcription process at different steps
The order of the steps may be different for different genes

Summary
• Gene activation or repression proteins
• DNA as a spacer and distant regulation
• Chromatin modulation, TF assembly, polymerase recruitment
• Combinatory regulation

Genetic Switches
• Positive, negative and combinatorial control of transcription in bacteria
• Trp and Lac operons
• Lambda repressor
• DNA bending and protein-protein interactions on DNA
• Differences in transcription regulation between prokaryotes and eukaryotes
• The structure of a eukaryotic gene control region
• How eukaryotic transcriptional activators work
• How eukaryotic transcriptional repressors work
• Steps of eukaryotic transcriptional activation
• Transcription factor complexes, coactivators and corepressors, synergy
• Control of Drosophila even-skipped (eve)
• Locus control regions and insulators

Creating Specialized Cells
• Phase variation in Salmonella
• Yeast mating type switching
• Regulation of lambda phage lysogeny: flip-flop
• Four types of feedback
• Positive and negative transcription feedback loops
• Examples:
  • Circadian clocks (don't need to know details)
  • Myogenic proteins and muscle cell formation
  • Eye development in Drosophila
• Creation of cell types by a few transcription factors
• Mechanisms by which patterns of gene expression can be passed to daughter cells:
  • X-inactivation
  • Cytosine methylation
  • Genomic imprinting
  • CpG islands

Post transcriptional controls
• Post-initiation transcriptional control of gene expression
  • Attenuation
• Alternative splicing
  • Regulation of alternative splicing
• Transcript cleavage
  • Secreted
versus membrane-bound antibodies
• RNA editing, especially as it relates to human cells
• RNA transport and localization
  • Export of HIV RNAs from the nucleus
  • Localization in the cytoplasm
• Negative control of translation initiation
  • Bacteria (e.g. bacterial ribosomal proteins)
  • How do translational repressors work in eukaryotes: aconitase
• Phosphorylation of eIF-2
• uORFs
• IRES
• Control of mRNA stability
• RNA interference, miRNAs, siRNAs

Transcription
• The transcription cycle
• The structure of E. coli RNA polymerase
• Sigma70 promoter structure (-10 region & variants)
  • Sigma factors
• Subunits of bacterial RNA polymerase
• Two types of terminators
• Eukaryotic RNA pols
  • General composition of the polymerases
  • General transcription factors
  • TATA and other promoter DNA sequence signals
  • Mediator complex
• Elongation
• RNA capping, splicing, cleavage and polyadenylation
• Differences between prokaryotic and eukaryotic transcription

Splicing
• Spliceosome structure and mechanism of splicing
• Different types of splicing (3 major types)
• Group I and II introns

1. Transcription
2. RNA modification and splicing
3. RNA transport
4.
Translation

Processing of eukaryotic pre-mRNA: the classical textbook picture

Alternative picture: co-transcriptional pre-mRNA processing
• This picture is more realistic than the previous one, particularly for long pre-mRNAs

Heterogeneous ribonucleoprotein particle (hnRNP) proteins
• In the nucleus, nascent RNA transcripts are associated with an abundant set of proteins
• hnRNPs prevent the formation of secondary structures within pre-mRNAs
• hnRNP proteins are multidomain, with one or more RNA-binding domains and at least one domain for interaction with other proteins
• Some hnRNPs contribute to pre-mRNA recognition by RNA processing enzymes
• The two most common RNA-binding domains are RNA recognition motifs (RRMs) and the RGG box (five Arg-Gly-Gly repeats interspersed with aromatic residues)

3D structures of RNA recognition motif (RRM) domains

Capping
p-p-p-N-p-N-p-N-p…
  ↓ capping enzyme (mCE)
p-p-N-p-N-p-N-p…
  ↓ mCE (another subunit), adds GMP
G-p-p-p-N-p-N-p-N-p…
  ↓ methyltransferases, using S-adenosyl methionine
G-p-p-p-N-p-N-p-N-p… carrying three CH3 groups

The capping enzyme
• A bifunctional enzyme with both 5'-triphosphatase and guanylyltransferase activities
• In yeast the capping enzyme is a heterodimer
• In metazoans the capping enzyme is monomeric with two catalytic domains
• The capping enzyme is specific for RNAs transcribed by RNA Pol II (why?)

Capping mechanism in mammals
[Figure: growing RNA, DNA]
The capping enzyme is allosterically controlled by the CTD of RNA Pol II and another stimulatory factor, hSpt5

Polyadenylation
• Poly(A) signal recognition
• Cleavage at the poly(A) site
• Slow polyadenylation
• Rapid polyadenylation

• G/U: G/U or U-rich region
• CPSF: cleavage and polyadenylation specificity factor
• CStF: cleavage stimulatory factor
• CFI: cleavage factor I
• CFII: cleavage factor II
• PAP: poly(A) polymerase

[Figure: PAP, CPSF]
PABPII: poly(A) binding protein II
PABP II functions:
1. rapid polyadenylation
2.
polyadenylation termination

Link between polyadenylation and transcription
[Figure: Pol II with a phosphorylated CTD transcribes past the aataaa signal and poly(A)-binding factors are recruited; the capped mRNA gets cleaved and polyadenylated, then goes on to splicing and nuclear transport; the remaining transcript is degraded; FCP1 phosphatase removes phosphates from the CTD and Pol II gets recycled]

Splicing

The size distribution of exons and introns in the human, Drosophila and C. elegans genomes

Consensus sequences around the splice sites
[Figure: consensus sequences, including the pyrimidine-rich (YYYY) region]

Molecular mechanism of splicing

Small nuclear RNAs U1-U6 participate in splicing
• snRNAs U1, U2, U4, U5 and U6 form complexes with 6-10 proteins each, forming small nuclear ribonucleoprotein particles (snRNPs)
• Sm: binding sites for snRNP proteins

Additional factors of exon recognition
• ESE: exon splicing enhancer sequences
• SR: ESE-binding proteins
• U2AF65/35: subunits of the U2AF factor, binding to the pyrimidine-rich region and the 3' splice site

The essential steps in splicing
• Binding of U1 and U2 snRNPs
• Binding of U4, U5 and U6 snRNPs
• Rearrangement of base-pair interactions between snRNAs, release of U1 and U4 snRNPs
• The catalytic core, formed by U2 and U6 snRNPs, catalyzes the first transesterification reaction
• Further rearrangements between U2, U6 and U5 lead to the second transesterification reaction
• The spliced lariat is linearized by the debranching enzyme and further degraded by the exosome

Not all introns are completely degraded.
Some end up as functional RNAs, different from mRNA

Co-transcriptional splicing
[Figure: Pol II with a phosphorylated CTD; SR proteins and snRNPs assemble on the intron of the nascent, capped mRNA]
SCAFs: SR-like CTD-associated factors

Self-splicing introns
• Under certain nonphysiological conditions in vitro, some introns can be spliced without the aid of any proteins or other RNAs
• Group I self-splicing introns occur in rRNA genes of protozoans
• Group II self-splicing introns occur in chloroplasts and mitochondria of plants and fungi

Group I introns utilize a guanosine cofactor, which is not part of the RNA chain

Comparison of secondary structures of group II self-splicing introns and snRNAs

Spliceosome
• The spliceosome contains snRNAs, snRNPs and many other proteins, about 300 subunits in total.
• This makes it the most complicated macromolecular machine known to date.
• But why is the spliceosome so extremely complicated if it only catalyzes a reaction as straightforward as removing an intron? Moreover, some introns seem able to excise themselves without the aid of any protein, so why have all those 300 subunits?
• No one knows for sure, but there might be at least 4 reasons:
• 1. Defective mRNAs cause a lot of problems for cells, so some subunits might ensure correct splicing and error correction
• 2. Splicing is coupled to nuclear transport, which requires accessory proteins
• 3. Splicing is coupled to transcription, and this might require additional accessory proteins
• 4.
Many genes can be spliced in several alternative ways, which also might require additional factors

One gene, several proteins
• Cleavage at alternative poly(A) sites
• Alternative promoters
• Alternative splicing of different exons
• RNA editing

Alternative splicing, promoters & poly(A) cleavage

RNA editing
• Enzymatic altering of the pre-mRNA sequence
• Common in mitochondria of protozoans and plants and in chloroplasts, where more than 50% of bases can be altered
• Much rarer in higher eukaryotes

Editing of human apoB pre-mRNA

The two types of editing
1) Substitution editing
• Chemical altering of individual nucleotides
• Examples: deamination of C to U or A to I (inosine, read as G by the ribosome)
2) Insertion/deletion editing
• Deletion/insertion of nucleotides (mostly uridines)
• For this process, special guide RNAs (gRNAs) are required

Guide RNAs (gRNAs) are required for editing

Organization of pre-rRNA genes in eukaryotes

Electron micrograph of tandem pre-rRNA genes

Small nucleolar RNAs
• ~150 different nucleolus-restricted RNA species
• snoRNAs are associated with proteins, forming small nucleolar ribonucleoprotein particles (snoRNPs)
• The three main classes of snoRNPs are involved in the following processes:
  a) removing introns from pre-rRNA
  b) methylation of 2' OH groups at specific sites
  c) converting uridine to pseudouridine

What is this pseudouridine good for?
[Figure: uridine (U) and pseudouridine (Ψ)]
• Pseudouridine (Ψ) is found in RNAs that have a tertiary structure that is important for their function, like rRNAs, tRNAs, snRNAs and snoRNAs
• The main role of Ψ and other modifications appears to be the maintenance of three-dimensional structural integrity in RNAs

Where do snoRNAs come from?
• Some are produced from their own promoters by RNA pol II or III
• The majority of snoRNAs come from introns of genes that encode proteins involved in ribosome synthesis or translation
• Some snoRNAs come from introns of genes that encode nonfunctional mRNAs

Assembly of ribosomes

Processing of pre-tRNAs
[Figure: RNase P cleavage site]

Splicing of pre-tRNAs is different from that of pre-mRNAs and pre-rRNAs
• The splicing of pre-tRNAs is catalyzed by protein only
• A pre-tRNA intron is excised in one step, not by two transesterification reactions
• Hydrolysis of GTP and ATP is required to join the two RNA halves

Macromolecular transport across the nuclear envelope

The central channel
• Small metabolites, ions and globular proteins up to ~60 kDa can diffuse freely through the channel
• Large proteins and ribonucleoprotein complexes (including mRNAs) are selectively transported with the assistance of transporter proteins

Proteins that are transported into the nucleus contain nuclear localization sequences

Two different kinds of nuclear localization sequences
[Figure: a basic sequence, recognized by importin α together with importin β, and a hydrophobic sequence, recognized by importin β, both leading to nuclear import]

Artificial fusion of a nuclear localization signal to a cytoplasmic protein causes its import into the nucleus

Mechanism for nuclear "import"

Mechanism for nuclear "export"

Mechanism for mRNA transport to the cytoplasm

Example of regulation at the nuclear transport level: HIV mRNAs

After mRNA reaches the cytoplasm...
• The mRNA exporter, mRNP proteins, nuclear cap-binding complex and nuclear poly(A)-binding proteins dissociate from the mRNA and return to the nucleus
• The 5' cap binds to translation factor eIF4E
• Cytoplasmic poly(A)-binding protein (PABPI) binds to the poly(A) tail
• Translation factor eIF4G binds to both eIF4E and PABPI, thus linking together the 5' and 3' ends of the mRNA

Quality control of translation in bacteria
Rescues translation of incomplete mRNAs and adds labels that target the product to proteases

Folding of proteins
• Required before the protein is functional
• The folding process starts at the ribosome

Protein Folding Pathway

Molecular Chaperone

An example of molecular chaperone function
Hsp70: binds to proteins early after synthesis

An example of molecular chaperone function (chaperonin)
Hsp60-like protein: acts late

The Fate of Proteins after translation
E1: ubiquitin-activating enzyme; E2/3: ubiquitin ligase

The production of proteins
• RNA translation (protein synthesis), tRNA, ribosome, start codon, stop codon
• Protein folding, molecular chaperones
• Proteasomes, ubiquitin, ubiquitin ligase

How Genomes evolve
• Mutations, gene deletions, chromosomal rearrangements, transposable elements, horizontal transfer, inversions, gene duplication, whole genome duplication
• Phylogenetic trees
• Sequence alignments
• Chromosomal rearrangements
• Gene duplication
• Neofunctionalization, subfunctionalization
• Whole genome duplication
• SNPs (mutations within a genome)
• Haplotypes
• CNVs

How genomes evolve
• No Evolution !!!

Thank You
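
As a closing illustration of the DNA → mRNA → protein flow covered in this deck, the sketch below transcribes a coding-strand DNA sequence and translates the first open reading frame using the standard genetic code. It is a simplified teaching example under stated assumptions: the function names `transcribe` and `translate` and the example sequence are hypothetical, the input is taken to be the coding strand, and initiation is reduced to "start at the first AUG", ignoring everything the ribosome, tRNAs and initiation factors actually do.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA corresponding to a coding-strand DNA sequence (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the RNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein in this simplified model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "GGATGGCCTGGTAAAC"  # hypothetical example sequence
    mrna = transcribe(dna)
    print(mrna)             # GGAUGGCCUGGUAAAC
    print(translate(mrna))  # MAW
```

The one-letter output reads Met-Ala-Trp: the reading frame opens at AUG and closes at the UAA stop codon, which is the open reading frame idea listed under "From RNA to Protein".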