Comparative Genomics and New Evolutionary Biology
10/01/2008

The Genomic Clock
- The evolutionary rates of protein families differ too much to support the classic molecular clock concept.
- After normalization, the distributions of evolutionary rates among orthologs in two distant genomes are similar, suggesting a possible genomic clock that changes linearly with time.
- Thus, although the evolutionary rates of different sets of orthologs may differ, the genome-wide distribution remains largely invariant in shape.
[Figure: distributions of evolutionary rates among orthologs; panels: before normalization, after normalization]

The major transitions in evolution: a comparative genomic perspective
- There have been several major transitions during the history of life:
  1. Replicating molecules to populations of molecules in compartments
  2. Independent replicators (probably RNA) to chromosomes
  3. RNA as both genes and enzymes to DNA as genes, proteins as enzymes
  4. Prokaryotes to eukaryotes
  5. Asexual clones to sexual populations (evolution of sex)
  6. Protists to multicellular organisms: animals, plants, fungi (evolution of multicellularity)
  7. Solitary individuals to colonies with non-reproductive castes
  8. Primate societies to human societies with language, enabling memes
  Source: The Major Transitions in Evolution by John Maynard Smith and Eörs Szathmáry (Oxford University Press, 1995).
- Comparative genome analyses have provided a more concrete understanding of some of these transitions, such as:
  1. from the pure RNA world to RNA-protein life forms;
  2. from RNA to DNA as the substance of heredity;
  3. from the prokaryotic cell to the eukaryotic cell.

The Last Universal Common Ancestor (LUCA) and its reconstruction
- Molecular evidence for the existence of LUCA: all life forms
  1. share many homologous proteins, in particular for information processing (transcription and translation);
  2. use almost the same genetic code to translate the information stored in their genomes into proteins;
  3. have similar macromolecular assemblies, such as RNA polymerase, the ribosome and the membrane.
- The existence of LUCA is independent of ideas about the origin of life; it is compatible with panspermia, exogenesis, multiple origins of life or a primordial pre-LUCA stage.
- One goal of comparative genomics is
  1. to derive the rules that have governed gene duplication, divergence of gene function, gene loss and horizontal gene transfer (HGT) in shaping the distinct gene repertoire of each major lineage;
  2. to use these rules to reconstruct the ancestral genomes that existed at different stages of evolution, including LUCA.

What genes would LUCA have?
- Naively, we might think that LUCA contained all the essential genes shared by extant genomes, so that reconstructing the genome of LUCA becomes the trivial task of finding all the universal genes.
- However, this does not work in its simplest form, because only 65 COGs (clusters of orthologous groups of proteins) are universal, whereas an estimated minimal genome contains a few hundred genes.
- Thus, the universal set of genes was definitely part of the LUCA genome, but that genome should also have contained other genes that are not universally present in modern genomes, owing to:
  1. substantial gene loss in parasites and some free-living heterotrophs;
  2. extensive non-orthologous gene displacement of essential genes, which also makes reconstruction more difficult.
- LUCA must have been a chemoautotroph resembling modern chemoautotrophs, and contained the central metabolic and anabolic pathways as well as a membrane.
- All extant organisms have similar transcription and translation systems, suggesting that LUCA had similar systems.
- However, some main components of the DNA replication systems of archaea-eukaryotes and bacteria are different.
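
The gap between the strictly universal gene set and a plausible ancestral gene set can be illustrated with a minimal Python sketch. The genome names, the family labels (F1 to F5) and the 75% threshold are hypothetical toy values chosen for illustration; they are not data from the lecture, and the relaxed criterion is only one simple heuristic, not the reconstruction method used in actual studies.

```python
from typing import Dict, Set


def universal_core(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Gene families present in every genome (strict intersection)."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def relaxed_core(genomes: Dict[str, Set[str]], min_fraction: float) -> Set[str]:
    """Gene families present in at least `min_fraction` of the genomes.

    A crude way to tolerate lineage-specific gene loss (assumption:
    a family missing from a few genomes was lost, not gained later).
    """
    counts: Dict[str, int] = {}
    for families in genomes.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1
    n_genomes = len(genomes)
    return {f for f, c in counts.items() if c >= min_fraction * n_genomes}


# Toy presence/absence data (entirely made up): the parasite has lost F4 and F5.
genomes = {
    "free_living_bacterium": {"F1", "F2", "F3", "F4", "F5"},
    "parasitic_bacterium": {"F1", "F2", "F3"},
    "archaeon": {"F1", "F2", "F3", "F4", "F5"},
    "eukaryote": {"F1", "F2", "F3", "F4", "F5"},
}

strict = universal_core(genomes)
relaxed = relaxed_core(genomes, min_fraction=0.75)
print(sorted(strict))
print(sorted(relaxed))
```

In this toy case a single reduced genome removes F4 and F5 from the strict intersection, while the relaxed criterion retains them. Non-orthologous gene displacement is not modelled here: a displaced function would appear as two different families, so neither criterion would recover it.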
- Several explanations have been proposed for this discrepancy in the DNA replication systems:
  1. The components are too divergent for their homology to be detected. However, structural analyses refute this: archaeal-eukaryotic primases have a palm-like domain, whereas the bacterial versions have a TOPRIM domain.
  2. The existence of two distinct DNA replication systems involves non-orthologous gene displacement or differential gene loss.
[Figure: two scenarios, non-orthologous gene displacement and differential gene loss, relating DNA replication systems 1 and 2 to the archaeal-eukaryotic and bacterial DNA replication systems]
- However, it is difficult to make sense of the displacement of a well-developed, multi-domain DNA replication complex;
- No organism with two DNA replication systems has ever been found; differential gene loss would also imply that LUCA had more complicated DNA replication machinery than its modern descendants.

A proposed retrovirus-like genome of LUCA
- Instead of a DNA-based genome, LUCA might have had an RNA segment-based genome.
- This hypothesis tries to account both for the lack of conservation of several central components of modern replication systems and for the presence of other conserved components, such as the sliding clamp, the clamp-loader ATPase and RNase H, as well as the enzymes of DNA precursor biosynthesis and the basal transcription and translation machineries;
- This LUCA was not itself modern but rather a transitional form on the path from the ancient RNA world to the DNA world.
- LUCA must have had at least several hundred protein-coding genes and 30 or so genes for structural RNAs.
- RNA segments of LUCA's genome were of operon size, i.e. a typical segment carried three to five genes;
- Some operons coding for ribosomal proteins are universally conserved and must have been inherited from LUCA;
- Such a set of genomic segments could hardly segregate with high accuracy into the daughter protocells during division, although multiploidy could have increased the likelihood that each daughter received the full complement of essential genes.
- Therefore, what we call LUCA must inevitably have been a collection of protocells with similar but not identical sets of genome segments.

A brief history of early life
- A hypothetical sequence of major events in the evolution of life, from self-replicating RNA to the emergence of modern-type DNA replication.
[Figure: timeline of early evolution; labels: positive selection, purifying selection, modern organisms, 3.5 billion years]

The transition from prokaryotes to eukaryotes
- Even the simplest eukaryotic cell is more complex than the most advanced prokaryotic cell;
- The transition from prokaryotes to eukaryotes is one of the greatest mysteries of life's history.
- The endosymbiosis hypothesis: eukaryotes resulted from the uptake of prokaryotes by the protoeukaryote ("you are what you eat"):
    - the mitochondrion is derived from an ancient α-proteobacterium
    - the chloroplast is derived from an ancient cyanobacterium
    - the cytoskeleton is derived from some ancient bacterium: actin is likely derived from the bacterial proteins MreB and FtsA; tubulin is highly likely derived from FtsZ
- The eukaryotic proteome is a mix of proteins of apparent archaeal descent, proteins that seem to originate from bacteria, and eukaryote-specific ones.
[Figure: the origins of the nematode proteins]
- The genes of archaeal origin code for components of information processing systems;
- Metabolic enzymes and transporters seem to be largely of bacterial origin;
- There are at least three distinct scenarios for the integration of bacterial genes into the protoeukaryotic genome:
  1. Displacement of ancestral archaeo-eukaryotic genes by bacterial counterparts (xenologous gene displacement: an ortholog from a distant lineage, a xenolog, displaces the "native" gene in a given genome);
  2. Acquisition of bacterial genes without elimination of the ancestral archaeo-eukaryotic counterparts, so that eukaryotes end up with both versions of a particular protein;
  3. Evolution of new functions by utilization of bacterial proteins (exaptation).

Conclusions and Outlook
1. Comparative genomics shows that genomes are much more dynamic, even volatile (on the evolutionary scale), systems than previously thought.
2. Comparative genomics reaffirms, through numerous illustrations, that evolution is largely a tinkerer that achieves the best feasible result by combining, sometimes in haphazard ways, whatever materials are at hand.
3. Comparative genomics not only vastly complicates the picture of life's evolution but also provides the information necessary to resolve the principles and details of this picture.
4. The methods and concrete studies that take us in this direction are starting to appear, but much more remains to be done.
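
Supplementary sketch: segregation of a segmented genome

The claim made earlier for the retrovirus-like model of LUCA, that a set of genome segments could hardly segregate accurately but that multiploidy would help, can be made concrete with a back-of-the-envelope calculation. The model below is an assumption introduced for illustration, not part of the lecture: each of n distinct segments is present in k copies, and every copy goes to one of the two daughter protocells independently with probability 1/2. Under that assumption a given daughter receives every segment with probability (1 - 2^-k)^n, and both daughters do so with probability (1 - 2^(1-k))^n. The value n = 100 is likewise illustrative (for example, roughly 400 genes at about four genes per segment).

```python
def p_daughter_complete(n_segments: int, copies: int) -> float:
    """Probability that one given daughter gets at least one copy of each segment.

    Assumes every copy is assigned to a daughter independently with p = 1/2.
    """
    if n_segments < 1 or copies < 1:
        raise ValueError("n_segments and copies must be positive integers")
    return (1.0 - 0.5 ** copies) ** n_segments


def p_both_complete(n_segments: int, copies: int) -> float:
    """Probability that both daughters get at least one copy of each segment.

    For one segment, failure means all copies went to the same daughter,
    which happens with probability 2 * (1/2)**copies.
    """
    if n_segments < 1 or copies < 1:
        raise ValueError("n_segments and copies must be positive integers")
    return (1.0 - 2.0 * 0.5 ** copies) ** n_segments


if __name__ == "__main__":
    n = 100  # illustrative number of distinct segments (assumption)
    for k in (1, 2, 5, 10, 20):
        print(k, p_daughter_complete(n, k), p_both_complete(n, k))
```

With a single copy per segment the two daughters can never both be complete, and the chance improves steeply as the copy number grows. Even so, it stays below 1 for any finite copy number, which is consistent with the picture of LUCA as a population of protocells with similar but not identical segment sets.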