Evolution of the Genetic Code: Before and After the LUCA

1. The genetic code evolved to its canonical form before the Last Universal Common Ancestor (LUCA) of Archaea, Bacteria and Eukaryotes, more than 3 billion years ago. It appears to be highly optimized. How did it get to be this way?
2. Numerous small changes have occurred to the canonical code since then. What is the mechanism of codon reassignment?

Codon Reassignment

The genetic code is variable in mitochondria (and also in some cases in other types of genomes).

                 Second Position
First Pos.    U       C       A       G       Third Pos.
    U         F       S       Y       C           U
              F       S       Y       C           C
              L       S       Stop    Stop        A
              L       S       Stop    W           G
    C         L       P       H       R           U
              L       P       H       R           C
              L       P       Q       R           A
              L       P       Q       R           G
    A         I       T       N       S           U
              I       T       N       S           C
              I       T       K       R           A
              M       T       K       R           G
    G         V       A       D       G           U
              V       A       D       G           C
              V       A       E       G           A
              V       A       E       G           G

Reassignments marked on the table:
CUN: Leu to Thr
AGR: Arg to Ser to Stop/Gly
UGA: Stop to Trp
AUA: Ile to Met
CGN: Arg to unassigned
etc.

But how can this happen? It should be disadvantageous.

Reassignments in Metazoa

[Figure: phylogeny of Porifera, Cnidaria, Arthropoda, Nematoda, Lophotrochozoa, Platyhelminthes, Echinodermata, Hemichordata, Urochordata, Cephalochordata and Craniata, with the following events marked on branches.]
Loss of tRNA-Ile(CAU) but AUA remains Ile
Loss of tRNA-Arg(UCU) and AGR : Arg -> Ser
Loss of many tRNAs + import from cytoplasm
AUA : Ile -> Met
AGR : Ser -> Stop
AGR : Ser -> Gly
AAA : Lys -> Asn
AAA : Lys -> unassigned

Example 1: AUA was reassigned from Ile to Met during the early evolution of the mitochondrial genome.

Before
Amino acid   Codons      Anticodon
Ile          AUU, AUC    GAU
Ile          AUA         k2CAU
Met          AUG         CAU
Notes: G in the wobble position of the tRNA-Ile can pair with U and C in the third codon position. Bacteria and some protist mitochondria possess another tRNA-Ile with a modified base that translates AUA only. The tRNA-Met translates AUG only.

After
Amino acid   Codons      Anticodon
Ile          AUU, AUC    GAU
Met          AUA, AUG    UAU or f5CAU
Notes: In animal mitochondria the k2CAU tRNA has been deleted. There is a gain of function of the tRNA-Met by a mutation or a base modification.

Example 2: UGA was reassigned from Stop to Trp many times (12 times in mitochondria).
Before
Amino acid   Codon       Anticodon   Notes
Stop         UGA         RF          Release Factor recognizes the UGA codon.
Trp          UGG         CCA         Normal tRNA-Trp translates only UGG codons.

After
Amino acid   Codons      Anticodon   Notes
Trp          UGA, UGG    UCA         In animal mitochondria (and elsewhere) there is a gain of function of the tRNA-Trp via mutation or base modification so that it translates both UGG and UGA.

The GAIN-LOSS framework (Sengupta & Higgs, Genetics 2005)

LOSS = deletion or loss of function of a tRNA or RF.
GAIN = gain of a new tRNA or a gain of function of an existing one.

[Figure: two routes from the initial code to the new code.]
Initial Code. No problem.
Route 1: GAIN first, giving an ambiguous codon (selective disadvantage), then LOSS.
Route 2: LOSS first, giving an unassigned codon (selective disadvantage), then GAIN.
Both routes arrive at the New Code, which carries a selective disadvantage because codons are used in the wrong places.
Mutations in coding sequences then lead to the New Code with codons used in the right places. No problem.

Note: the strength of the selective disadvantage depends on the number of times the codon is used. There is no disadvantage if the codon disappears.

Four possible mechanisms of codon reassignment

1. Codon Disappearance (CD): The codon disappears. The order of the gain and loss is irrelevant. For the other three mechanisms the codon does not disappear.
2. Ambiguous Intermediate (AI): The gain happens before the loss. There is a period when the gain is fixed in the population and translation is ambiguous.
3. Unassigned Codon (UC): The loss happens before the gain. There is a period when the loss is fixed in the population and the codon is unassigned.
4. Compensatory Change: The gain and loss are fixed in the population simultaneously (although they do not arise at the same time). There is no intermediate period between the old and the new codes. Compare the theory of compensatory substitutions in RNA helices.

Sengupta & Higgs (2005) showed that all four mechanisms work in a population genetics simulation.

Summary of Codon Reassignments in Mitochondria

Codon reassignment            No. of times   Explained by GC -> AU mutation pressure?   Change in no. of tRNAs   Is mispairing important?    Mechanism
UAG: Stop -> Leu              2              G -> A at 3rd pos.                         +1                       No                          CD
UAG: Stop -> Ala              1              G -> A at 3rd pos.                         +1                       No                          CD
UGA: Stop -> Trp              12             G -> A at 2nd pos.                         0                        Possibly. CA at 3rd pos.    CD
CUN: Leu -> Thr               1              C -> U at 1st pos.                         0                        No                          CD
CGN: Arg -> Unass             5              C -> A at 1st pos.                         -1                       No                          CD
AUA: Ile -> Met or Unass      3 / 5          No                                         -1                       Yes. GA at 3rd pos.         UC
AAA: Lys -> Asn               2              No                                         0                        Yes. GA at 3rd pos.         AI
AAA: Lys -> Unass             1              No                                         0                        Possibly. GA at 3rd pos.    UC or AI
AGR: Arg -> Ser               1              No                                         -1                       Yes. GA at 3rd pos.         UC
AGR: Ser -> Stop              1              No                                         0                        No                          AI(b)
AGR: Ser -> Gly               1              No                                         +1                       No                          AI(b)
UUA: Leu -> Stop              1              No                                         0                        No                          UC or AI
UCA: Ser -> Stop              1              No                                         0                        No                          UC or AI

The CD mechanism explains the disappearance of stop codons because they are rare initially. There are only a few examples of CD for sense codons. UC and AI are important for sense codons.

Three examples in yeasts (mutation pressure GC to AU)

[Figure: the canonical genetic code table with the three yeast reassignments marked.]
CUN is rare (replaced by UUR): CUN Leu to Thr.
CGN is rare (replaced by AGR): CGN Arg codons become unassigned.
AUA and AUU are common and AUC is rare. Nevertheless AUA is reassigned to Met. The codon does not disappear.

Leu and Arg codons in yeasts: Codon Disappearance causes reassignments

Species   Leu CUN   Leu UUR   Arg CGN   Arg AGR
S         53        192       7         33
Y.        44        618       0**       75
C         3         279       12        29
C         132       397       47        26
C         66        547       39        45
P         25        714       18        67
K         0         286       0**       48
C         11*       294       1**       45
S         33*       333       7         49
S         19*       274       0**       40
S         22*       300       0**       46

*  CUN = Thr. An unusual tRNA-Thr is present instead of tRNA-Leu.
** CGN = unassigned. The tRNA-Arg is deleted.

AUA Ile to Met in Yeasts

Codon   Amino acid   Anticodon
AUU     Ile          GAU
AUC     Ile          GAU
AUA     Ile          K2CAU
AUG     Met          CAU

Codon usage
Species   AUU   AUC   AUA   AUG   AUA is   tRNA
J         133   40    32    48    Ile      K2CAU
O         161   34    0     57    Absent   none
P         113   39    49    51    Ile      K2CAU
C         119   81    229   100   Ile      K2CAU
C         303   32    193   117   Ile      K2CAU
P         274   18    562   105   Ile      K2CAU
K         213   16    7     63    ?        none
C         207   21    16    73    Met      C*AU
S         239   31    60    73    Met      C*AU
S         203   7     101   56    Met      C*AU
S         218   11    95    70    Met      C*AU

Evolution of the canonical code: Before the LUCA

The canonical code seems to be optimized to reduce the effects of translational and mutational errors. Neighbouring codons code for similar amino acids.

Woese's polar requirement scale
Measure the difference between amino acid properties by how far apart they are on this scale.
[Figure: polar requirement axis running from about 5 to 13, with the amino acids placed in the order C, L, I, F, W, M, Y, V, P, T, A, S, G, H, Q, R, N, K, E, D.]

Principal Component Analysis
Projects the 8-d space into the two 'most important' dimensions.
[Figure: PCA plot with axes Big/Small and Hydrophobic/Hydrophilic.]

Cost function g(a,b) for replacing amino acid a by amino acid b, e.g. the difference in Polar Requirement.

$$E = \frac{\sum_i \sum_j r_{ij}\, g(a_i, a_j)}{\sum_i \sum_j r_{ij}}$$

r_ij = rate of mistaking codon i for codon j (= 1 for single-position mistakes, 0 otherwise)
E = measure of the error associated with a code

[Figure: the canonical genetic code table, codon by codon.]

Generate random codes by permuting the 20 amino acids in the code table.
[Figure: distribution p(E) for random codes, with E_real of the canonical code marked in the lower tail.]
E is smaller for the canonical code than for almost all random codes: the fraction f of random codes that are better is about 10^-6, i.e. one in a million (Freeland and Hurst).

The statistical argument shows that the code is highly non-random, but it does not explain how the code evolved to be that way. We need a step-by-step evolutionary argument that leads from a proposed first stage of the code to today's code.
Random permutations: not possible.
Random swaps: seems unlikely.

The earliest code probably had few amino acids. Which were the first?
Selection acts when new amino acids are added.
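The error measure E and the permutation test described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure used by Freeland and Hurst: the function names (`code_error`, `permuted_code`, `polar_cost`) are hypothetical, the cost g is taken to be the absolute difference in polar requirement, mistakes to or from stop codons are simply skipped, and the polar requirement values are the commonly tabulated ones for Woese's scale, which should be checked against the primary source before any serious use.

```python
import random
from itertools import product

BASES = "UCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # UUU, UUC, UUA, UUG, UCU, ...

# Canonical code in the same codon order ('*' = Stop).
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CANONICAL = dict(zip(CODONS, AMINO))

# Assumption: polar requirement values as commonly tabulated for Woese's scale.
POLAR = {"A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8, "Q": 8.6,
         "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9, "L": 4.9, "K": 10.1,
         "M": 5.3, "F": 5.0, "P": 6.6, "S": 7.5, "T": 6.6, "W": 5.2,
         "Y": 5.4, "V": 5.6}


def polar_cost(a, b):
    """g(a, b): absolute difference in polar requirement (one possible choice)."""
    return abs(POLAR[a] - POLAR[b])


def single_step_neighbours(codon):
    """All codons that differ from `codon` at exactly one position (r_ij = 1)."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def code_error(code, cost=polar_cost):
    """E = sum_ij r_ij g(a_i, a_j) / sum_ij r_ij, skipping pairs that involve Stop."""
    numerator = 0.0
    denominator = 0.0
    for ci in CODONS:
        if code[ci] == "*":
            continue
        for cj in single_step_neighbours(ci):
            if code[cj] == "*":
                continue
            numerator += cost(code[ci], code[cj])
            denominator += 1.0
    return numerator / denominator


def permuted_code(code, rng):
    """Random code: permute the 20 amino acids among the existing codon blocks."""
    amino_acids = sorted(set(code.values()) - {"*"})
    shuffled = amino_acids[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(amino_acids, shuffled))
    mapping["*"] = "*"  # stop codons stay where they are
    return {codon: mapping[aa] for codon, aa in code.items()}


def fraction_better(n_codes=200, seed=0):
    """Fraction of random codes with E smaller than the canonical code."""
    rng = random.Random(seed)
    e_real = code_error(CANONICAL)
    better = sum(code_error(permuted_code(CANONICAL, rng)) < e_real
                 for _ in range(n_codes))
    return better / n_codes


if __name__ == "__main__":
    print("E for the canonical code:", code_error(CANONICAL))
    print("fraction of random codes that are better:", fraction_better())
```

A few hundred random codes are far too few to resolve a fraction near one in a million; the sketch only shows the shape of the calculation. Note that a plain match/mismatch cost would not work here, because permuting amino acids among blocks leaves the pattern of synonymous neighbours unchanged.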
[Figure: the canonical genetic code table, codon by codon.]

Time scale for the origin of life

The origin of the genetic code is the end of the RNA World.
[Figure: timeline with the following labels.]
Dating of rocks and meteorites
What preceded RNA? Another polymer? Metabolism only?
Last ocean-vaporizing impact. Lunar craters.
Isotopic evidence for life
Microfossil evidence. Stromatolites.
Phylogenetic methods (divergence after LUCA)

Prebiotic synthesis of organic molecules

Miller-Urey experiment (1953)
Began with a mixture of CH4, NH3, H2O and H2.
Energy source = electric spark or UV light.
Obtained 10 amino acids.

Atmospheres and Chemistry

reducing: CH4, NH3, H2O, H2, or CO2, N2, H2, or CO, N2, H2. There is hydrogen gas and/or hydrogen is present combined with other elements (methane, ammonia, water).
neutral: CO or CO2, N2, H2O. No hydrogen or oxygen gas.
oxidizing: O2, CO2, N2. Oxygen gas is present.

Prebiotic chemists favour reducing atmospheres. Yields in the Miller-Urey experiment are higher and more diverse in reducing than in neutral atmospheres. It doesn't work in an oxidizing atmosphere.

Planetary Atmospheres

The major element in the universe is H (big bang), so doesn't it make sense that the atmosphere was reducing?
Jupiter retains the original mixture: H2, He + small amounts of CH4, NH3, H2O.
Smaller planets lose H2. A new atmosphere is created by outgassing from the interior.
Geologists & Astronomers favour an intermediate atmosphere.
(i) Venus: 64 Earth atmospheres pressure! Mostly CO2 and N2.
(ii) Carbonates in sedimentary rocks on Earth suggest there was previously lots of CO2.
So maybe Miller and Urey were wrong?
:-(
Current Earth: mostly N2, O2 + small amounts of CO2, H2O. Changed by life.
Mars: very low pressure, mostly CO2 and N2.

Alternative suggestion: Hydrothermal vents

Sea water passes through vents. Heated to 350 °C. Cools to 2 °C in the surrounding ocean. Supply of H2, H2S etc.
Fierce debate as to whether these conditions favour formation or breakup of organic molecules (Miller & Lazcano, 1995).

Organic compounds in meteorites

The most widely studied meteorite is the Murchison meteorite. Fell in Australia in 1969. Carbonaceous chondrite.
Contained both biological and non-biological amino acids.
Both optical isomers (later shown to be not quite equal).
Compounds are not contamination.
Just about all the building block molecules have now been found in carbonaceous meteorites (Sephton, 2002).
Astrochemistry: molecular clouds; icy grains; parent bodies of meteorites...
Delivery by: dust particles; meteorites; comets...
Was external delivery an important source of organic molecules?

The earliest code probably had few amino acids. Which were the first?
Selection acts when new amino acids are added.
[Figure: the canonical genetic code table, codon by codon.]

Prebiotic Synthesis of amino acids
Higgs and Pudritz (2009) Astrobiology

Amino acids are found in
• Meteorites
• Atmospheric chemistry experiments (Miller-Urey)
• Hydrothermal synthesis
• Icy dust grains in space
Rank amino acids in order of decreasing frequency in 12 observations. Derive ranking.
Comparison of amino acid frequencies produced non-biologically
(concentrations normalized relative to Gly)

Amino acid   Miller
Gly          1.000
Ala          1.795
Asp          0.077
Glu          0.018
Val          0.044
Ser          0.011
Ile          0.011
Leu          0.026
Pro          0.003
Thr          0.002

Values for the other columns, in source order (some cells are empty, so the row alignment is not recoverable):
Murchison / Yamato: 1.00 1.000, 0.34 0.380, 0.19 0.035, 0.40 0.110, 0.19 0.100, 0.003, 0.13 0.060, 0.04 0.035, 0.29 0.003
Ice Exp.: 1.000, 0.293, 0.022, 0.012, 0.072, 0.001

10 amino acids are found in the Miller-Urey experiments. Very similar ones are also found in meteorites, an ice grain analogue experiment, and other places. These are 'early' amino acids that were available for use by the first organisms: G A D E V S I L P T.
The other 10 are not seen. These are 'late' amino acids that were only used when organisms evolved a means of synthesizing them biochemically: K R H F Q N Y W C M.
The earliest amino acids are those that are cheapest to form thermodynamically.

Positions of early and late amino acids... What does this mean?

[Figure: the canonical genetic code table with early and late amino acids marked.]
Maybe only the 2nd position was relevant initially.
Late amino acids took over codons previously assigned to amino acids with similar properties.

Propose that the four earliest amino acids were Val, Ala, Asp, Gly

Four column code (Higgs, Biol. Direct. 2009)

                 Second Position
First Pos.    U       C       A       G       Third Pos.
  U,C,A,G     Val     Ala     Asp     Gly      U,C,A,G

This is a triplet code but only the second base means anything.
The second base is the most important for codon-anticodon recognition. It is unlikely to make a mistake at the second position.
All first and third position mistakes are synonymous.

[Figure: code structure after addition of the 10 early amino acids.]
Add new amino acids in positions that were formerly occupied by amino acids with similar properties. This minimizes disruption to existing gene sequences.
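The claim that every first- and third-position mistake is synonymous in the four-column code can be checked mechanically. The following is a minimal sketch; the names `four_column_code` and `synonymous_fraction` are hypothetical helpers introduced only for illustration, and the assignment of Val, Ala, Asp and Gly to second-position U, C, A and G follows the columns these amino acids occupy in the canonical code.

```python
BASES = "UCAG"

# Proposed four-column code: only the second base carries meaning.
SECOND_BASE_TO_AA = {"U": "Val", "C": "Ala", "A": "Asp", "G": "Gly"}


def four_column_code():
    """Map all 64 triplets to an amino acid using the second base only."""
    return {first + second + third: SECOND_BASE_TO_AA[second]
            for first in BASES for second in BASES for third in BASES}


def synonymous_fraction(code, position):
    """Fraction of single-base changes at `position` (0, 1 or 2) that keep the amino acid."""
    same = 0
    total = 0
    for codon, amino_acid in code.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            if code[mutant] == amino_acid:
                same += 1
    return same / total


if __name__ == "__main__":
    code = four_column_code()
    for position, name in enumerate(["first", "second", "third"]):
        print(name, "position:", synonymous_fraction(code, position))
```

The same `synonymous_fraction` helper could be applied to any later stage of the code, given as a codon-to-amino-acid dictionary, to see how much of the column structure survives as amino acids are added.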
Summary of my argument

Selection acts at the time of addition of new amino acids to the code.
The new amino acid is assigned to codons that formerly coded for an amino acid with similar properties. This minimizes disruption to existing genes.
The result is that codons in the same columns end up assigned to amino acids with similar properties. The column structure is retained from the earliest code.
Hence the code appears to minimize translational error with respect to randomly reshuffled codes, even though translational error was not the main factor being selected.

[Figure: pathways of amino acid synthesis in modern organisms (from Di Giulio 2008).]

Other points

Column structure suggests that translational errors were more important than mutational errors (tRNA structure/RNA world).
Precursor-product pairs tend to be neighbours (but there are doubts over statistical significance). Maybe late amino acids took over codons previously assigned to their biochemical precursors.
Direct chemical interactions between RNA motifs and amino acids ("stereochemical theory"). In vitro selection experiments suggest that binding sites of aptamers preferentially contain codon and anticodon sequences.

RNA World

First hypothesis: There was a stage of evolution in which RNA molecules performed both genetic and catalytic roles. DNA later took over the genetic role and proteins took over the catalytic role.
Almost certainly true.
Translation depends on RNA: mRNA supplies the information for protein synthesis.
The active ingredient of the ribosome is rRNA: 3D structures show the site of the peptidyl transferase reaction. Proteins were probably a late addition to the ribosome.
tRNAs are also essential for translation.

Second hypothesis: The RNA world arose de novo in the form of self-replicating ribozymes.
The jury is still out.

The RNA world idea originated in the 1960s as a theoretical solution to the chicken-and-egg problem of DNA and proteins.
Self-splicing introns. First RNA catalysts to be discovered. Tom Cech (1982).
'RNA World' term coined by Walter Gilbert (1986).

Example of an RNA catalyst: Hammerhead ribozyme

Cleaves RNA at a specific point.
Rolling circle mechanism of replication of virus-like RNAs in plants. Chops the long strand into pieces.

What can ribozymes do? Ligases

An Autocatalytic Set Made from Ligases
T. A. Lincoln, G. F. Joyce, Self-Sustained Replication of an RNA Enzyme, Science 323, 1229 (2009)
[Figure: E' ligates A and B to make E; E ligates A' and B' to make E'.]
Given a supply of A, B, A', B', the E and E' make more of themselves.

What can ribozymes do? Recombinases

E.J. Hayden, G.v. Kiedrowski & N. Lehman, Angew. Chem. Int. Edit. (2008) 120, 8552
The catalyst is autocatalytic given a supply of W, X, Y, Z. The non-covalent assembly is also a catalyst.

What can ribozymes do? Polymerases

[Figure legend: black + blue = ribozyme; red = template; orange = primer.]
Primer extended by up to 14 nucleotides. Johnston et al. (2001) Science
Gradual improvement of polymerases in the lab. Wochner et al. (2011) Science: up to 95 nucleotides.

What can ribozymes do? Nucleotide Synthetases

Unrau and Bartel (1998) Nature

An RNA organism must have had a metabolism.

Hypothetical pathway for RNA-catalyzed RNA synthesis (Joyce)
Synthesis of nucleosides
Phosphorylation
Generation of NTPs
Creation of activated nucleotides
Stepwise polymerization

Clutter of RNA synthesis (Joyce)
Why is this particular set of monomers used for nucleic acids?
How is this set synthesized specifically?
Where is the chemistry occurring? Earth, or space? Hydrothermal vents?

A new route to pyrimidine ribonucleotide assembly
MW Powner et al. Nature 459, 239-242 (2009) doi:10.1038/nature08013
[Figure caption: Previously assumed synthesis of β-ribocytidine-2',3'-cyclic phosphate 1 (blue; note the failure of the step in which cytosine 3 and ribose 4 are proposed to condense together) and the successful new synthesis described here (green). p, pyranose; f, furanose.]
Chemical synthesis of monomers and polymers must have occurred before the origin of ribozymes.

Ferris (2002) Orig. Life Evol. Biosph. Montmorillonite catalyzed synthesis of RNA oligonucleotides (30-50 mers)
Rajamani et al. (2008) Orig. Life Evol. Biosph. Lipid assisted synthesis of RNA-like polymers from mononucleotides
Costanzo et al. (2009) J. Biol. Chem. Synthesis of long RNA strands from cyclic nucleotides in water
Rajamani et al. (2010) J. Am. Chem. Soc. Measurements of error rates in non-enzymatic RNA replication

There are still some experimental issues... But this is a logical necessity!

How could the RNA world have got started? Getting from chemistry to biology...

RNA replicators must have emerged from prebiotic synthesis of random sequences.
Jump-starting the RNA World. Wu & Higgs (2009) J. Mol. Evol.
[Figure: reaction scheme with the labels Precursors, Synthesis, Monomers, Activated monomers, Polymerization, Short polymers, Long polymers, Ribozymes.]

Are there alternatives to RNA?

[Figure: RNA compared with (a) Threose Nucleic Acid (TNA), (b) Peptide Nucleic Acid (PNA), (c) Glycerol-derived nucleic acid, (d) Pyranosyl RNA.]
RNA hybridizes with other nucleic acids. Information is not lost.
DNA-RNA hybrids: DNA takes over at the end of the RNA world.
Maybe TNA or PNA preceded the RNA world. Information passed to RNA.
Would need to show that the alternative was easier to synthesize than RNA.

Two scenarios from Segré & Lancet (2000)
A: RNA first (strong RNA world hypothesis)
B: Lipids first (lipid world hypothesis; compositional genomes; metabolism without genes)