Terry Brown Genomes Third Edition Chapter 18: How Genomes Evolve Copyright © Garland Science 2007

Mutations and recombination provide the genome with the means to evolve, but we can learn very little about the evolutionary histories of genomes simply by studying these events in living cells. Rather, we need to combine our understanding of mutation and recombination with comparisons between the genomes of different organisms in order to find the patterns of genome evolution that have occurred. These studies provide revealing insights into the way genomes have evolved.

Cosmologists believe that the universe began some 14 billion years ago with the gigantic “primordial fireball” called the Big Bang. After passing through several stages, our solar system formed some 4.6 billion years ago. The early Earth was covered with water, and it was in this vast planetary ocean that the first biochemical systems appeared. Cellular life was established by the time land masses became evident, around 3.5 billion years ago.

The first oceans are thought to have had a salt composition similar to that of today's oceans, but the Earth’s atmosphere, and hence the gases dissolved in the oceans, was very different. Oxygen was very scarce, whereas ammonia and methane were abundant. Experiments that mimic these conditions have produced a range of amino acids (alanine, glycine, valine, etc.) together with hydrogen cyanide and formaldehyde, which can react further to give purines, pyrimidines and, in much smaller amounts, sugars. This means that at least some of the building blocks of biomolecules could have formed in the ancient chemosphere.

The ocean soup of the ancient chemosphere provided building blocks that may then have polymerized. This could have happened in the ocean itself, by repeated condensation and drying of water droplets in clouds, on clay particles at some muddy site, or at a hydrothermal vent.
The precise mechanism need not concern us; what matters is that conditions at the time were suitable for the synthesis of polymeric biomolecules. The next step was the ordered assembly of this random collection of biomolecules into a form that displayed at least some of the attributes associated with life. The steps of this rare event have never been reproduced experimentally, and ideas about them are based on speculation and computer simulation. However, the global ocean could have contained 10^6 biomolecules per liter, spread across very different geological settings and persisting over geological time, which leaves ample scope for scenarios that could lead to the ordered assembly of these biomolecules.

Progress in understanding the origins of life was initially stalled by the apparent requirement that polynucleotides and polypeptides must work in harness in order to produce a self-reproducing biochemical system. Proteins can catalyze biochemical reactions but cannot replicate themselves, whereas polynucleotides can act as templates but were thought unable to catalyze their own replication. This is known as the polynucleotide-polypeptide dilemma.

The major breakthrough came in the early 1980s, when it was discovered that RNA can have catalytic activity. The ribozymes found in nature today carry out three important types of biochemical reaction: self-cleavage (as in the self-splicing Group I, II and III introns); cleavage of other RNAs (as carried out by RNase P); and synthesis of peptide bonds (by the rRNA component of the ribosome).
In vitro experiments have shown that RNA molecules can also carry out several other biologically important reactions: synthesis of ribonucleotides; synthesis and copying of RNA molecules; and transfer of an RNA-bound amino acid to a second amino acid, forming a dipeptide. These activities would enable RNA to perform all the functions needed by a precellular, early biochemical system capable of reproducing itself.

The early RNA world is thought to have evolved at a very slow pace. RNA molecules initially replicated in a slow and haphazard fashion, simply by acting as templates for the binding of complementary nucleotides, which polymerized spontaneously. This replication process was very inaccurate, so a variety of RNA sequences would have been generated, eventually leading to one or more with nascent ribozyme properties that were able to direct their own, more accurate self-replication. Natural selection may then have favored the systems that replicated most efficiently, so that they came to predominate over the others (a process that has been demonstrated experimentally). Greater accuracy in replication would have enabled RNAs to increase in length without losing their sequence specificity, providing the potential for more sophisticated catalytic properties and leading eventually to structures as complex as present-day Group I introns and ribosomal RNAs.

The early replicating RNA was not a true genome; “protogenome” is a more accurate term. A protogenome is a molecule able to replicate itself and to direct some simple but important biochemical reactions, for example energy metabolism based on the release of free energy from ATP and GTP.

Long-chain unbranched lipids could somehow have been produced, either by RNA-catalyzed reactions or by chemical synthesis. Once present, these could form membrane-like structures, within which some protogenomes could have become compartmentalized. Compartmentalization would have enabled RNA to perform functions that were not possible in the open ocean.
This compartmentalization could have provided the basis for cellular life.

How did the RNA world develop into the DNA world? The first transition was the development of protein enzymes, which took over catalytic roles from RNA before DNA took the place of RNA as the coding molecule. There are many unanswered questions about the transfer of catalytic power from RNA to proteins, but it may have occurred for these reasons: the chemical diversity offered by 20 different amino acids; the diverse folding patterns, and hence diverse chemical capabilities, of proteins; or the need, once compartmentalization had occurred, to recruit proteins that gave RNA the hydrophobic coat required for functions associated with membranes.

The transition to protein-mediated catalysis demanded a radical shift in the function of the RNA protogenomes: they had to become coding molecules, directing the synthesis of proteins for early biochemical functions. It is not clear whether the ribozymes themselves became coding molecules or whether they used their catalytic ability to synthesize separate coding molecules, but the latter seems more likely. Either way, the RNA protogenomes took on a coding function and gave up their catalytic function.

RNA is poorly suited to a coding role because of its inherent instability, so transfer of the coding function to a more stable molecule such as DNA was probably inevitable. Reduction of ribonucleotides gives deoxyribonucleotides, which could then be polymerized into DNA copies of the RNA protogenomes by reverse transcription. Stability was increased by the use of thymine instead of uracil and by the adoption of a double-stranded structure, which also facilitates repair.

The first DNA genomes are therefore thought to have comprised many separate molecules, each specifying a single protein and each equivalent to a single gene. Linking these genes together into a single molecule, like a chromosome, may have made their distribution during cell division more efficient.
If our understanding of the origin of life and of early biological systems is correct, then it is possible that the initial stages of biochemical evolution occurred many times in parallel in the oceans or atmosphere of the early Earth. Is it therefore possible that life originated more than once? There is much evidence to suggest that present-day organisms derive from a single origin. A single origin is indicated by the remarkable similarity of the basic molecular biological and biochemical mechanisms in bacterial, archaeal and eukaryotic cells. For example, there is no obvious biological or biochemical reason why particular codons should specify particular amino acids, yet the genetic code is almost universal; if life had several independent origins we would expect at least some different codes.

At what stage did the single origin of modern biological systems come to predominate? The exact answer is difficult to establish, but most likely the system that first developed protein enzymes and then DNA genomes predominated, because its more efficient replication and catalysis enabled it to outcompete the early RNA-based systems.

It is possible that informational molecules other than DNA or RNA existed at that time, such as peptide nucleic acid (PNA) or a pyranosyl version of RNA. However, there is no indication that either of these molecules was more likely than RNA to have formed and evolved in the prebiotic soup.

Although the very old fossil record is difficult to interpret, there is reasonably convincing evidence that by 3.5 billion years ago biochemical systems had evolved into cells similar in appearance to modern bacteria. It is very difficult to tell what types of genome these cells contained, but it seems likely that they had double-stranded DNA genomes consisting of a small number of chromosomes, possibly just one, each containing many genes.
If we follow the fossil record forward in time we find that: the first evidence of eukaryotic cells, similar to single-celled algae, dates to about 1.4 billion years ago; multicellular algae appeared about 0.9 billion years ago; multicellular animals appeared around 640 million years ago; the Cambrian Revolution, during which many novel invertebrate life forms arose, occurred 530 million years ago; a mass extinction occurred about 500 million years ago; rapid diversification followed, and the first terrestrial insects, animals and plants were established by 350 million years ago; the dinosaurs had been and gone by the end of the Cretaceous, 65 million years ago; and the first hominids appeared a mere 4.5 million years ago.

Morphological evolution was accompanied by genome evolution. It is misleading to equate evolution with “progress”, but it is undeniable that as we move up the evolutionary tree we see increasingly complex genomes. One indication of this complexity is gene number, which varies from fewer than 1000 in some bacteria to 30,000-40,000 in vertebrates such as humans. Within an individual lineage, for example within the bacteria, change in gene number is probably gradual, with the acquisition of new genes balanced at least in part by the loss of existing ones. In certain evolutionary pathways organisms have lost more genes than they gained, as shown by the minimal genomes of Mycoplasma and other parasitic species.

There are two important points in the evolutionary pathway at which organisms with greatly increased gene numbers appeared. The first of these transitions was the arrival of the first eukaryotes about 1.4 billion years ago, which contained about 10,000 genes compared with 5000 or fewer in prokaryotic cells.
The second transition was associated with the arrival of the first vertebrates soon after the end of the Cambrian, these having at least 30,000 genes.

There are two fundamentally different ways in which new genes could be acquired by a genome: by duplication of some or all of the existing genes in the genome, or by acquisition of genes from other species.

A central role for gene duplication in genome evolution was first proposed in 1970. The initial result of gene duplication is two identical genes. Selective constraints ensure that one copy retains its sequence and continues to provide a functional protein, while the additional copy can have several fates. If the additional dose is beneficial to the organism, the copy retains its sequence. If it is not beneficial, the copy accumulates mutations; deleterious mutations inactivate it, producing a pseudogene. Analysis of pseudogenes suggests that the inactivating mutations they accumulate are mostly frameshift and nonsense mutations in the coding region. Occasionally, mutations give the copy a new function that is beneficial to the organism.

Even a cursory examination of genome sequences reveals that genes have been duplicated in the past. If an increased amount of a gene product is beneficial and the duplication is retained, the sequences remain the same and the result is two genes with identical or nearly identical sequences. Many multigene families are examples of this type of duplication, such as the rRNA genes, whose copy number ranges from two in Mycoplasma genitalium to 500 in Xenopus laevis. The high copy number reflects the need for rapid synthesis of rRNA at certain stages of the cell cycle.
There must be some mechanism that ensures that the members of such a family retain the same sequence over evolutionary time. This is called concerted evolution: an advantageous mutation in one member of the family spreads to the other members. The molecular mechanism involved is gene conversion, which depends on recombination.

If a duplicated gene is not under the same selective pressure as the original copy, it may accumulate mutations that give it a new and useful function. Multigene families provide many indications that such events have occurred frequently in the past. The prime example is the globin gene family, in which duplication followed by mutation has generated new family members with new functions. By applying a molecular clock to the sequence divergence between family members we can estimate when the duplications occurred. These data have also helped in understanding the events that gave rise to the different groups of β-globin genes present in different mammals.

Another striking example of gene evolution by duplication is provided by the homeotic selector genes, which play an important role in determining body plans. Drosophila has a single cluster of homeotic selector genes (called HOM-C) containing eight genes, each with a homeodomain sequence coding for a DNA-binding structure. These genes seem to have evolved from an ancestral gene that existed about 1000 million years ago. The pattern of evolution in this cluster is a striking example of how gene duplication and sequence divergence could have been the underlying processes responsible for increasing the morphological complexity of the series of organisms in the Drosophila evolutionary tree. Drosophila has one Hox cluster, amphibians have two and vertebrates have four, each gene showing sequence similarity to the genes at the equivalent position in the other clusters.
The ray-finned fishes, probably the most diverse group of vertebrates, with a vast range of variations on the basic body plan, have seven Hox clusters.

There are several ways in which a short segment of DNA containing a single gene or a small group of genes could be duplicated. Unequal crossing-over: a recombination event initiated by similar sequences at different positions in a pair of homologous chromosomes, resulting in duplication of a segment of DNA in one of the products. Unequal sister chromatid exchange: occurs in a similar manner but involves the pair of chromatids from a single chromosome. DNA amplification: a segment is duplicated by unequal recombination between the two daughter DNA molecules produced during replication. Replication slippage: results in duplication of short segments such as microsatellite sequences.

These processes result in tandem duplications, in which the two duplicated segments lie adjacent to one another in the genome, as in the globin gene families. Sometimes, however, duplicated genes do not lie adjacent to each other. For example, the human genome has three functional genes for the metabolic enzyme aldolase, each on a different chromosome. One possibility is that these genes were originally in tandem and were later separated by large-scale genome rearrangements. Another is that they arose by retrotransposition: a processed mRNA is converted into cDNA, which is then reinserted into the genome. Genes duplicated in this manner are called retrogenes. Because the mRNA carries no promoter, most such copies are not expressed and are pseudogenes. A retrogene that happens to be inserted next to an existing promoter can be expressed, and a distinctive feature of such genes is that they lack introns. By a similar process a complete gene, introns included, can be copied from an antisense RNA if the “wrong” strand is transcribed from a nearby promoter.

So far we have looked at processes that duplicate short stretches of DNA, up to a few tens of kilobases in length.
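As an illustration of unequal crossing-over, the first of the short-segment mechanisms listed above, the following is a minimal toy sketch rather than a model of real recombination. It assumes a chromosome can be represented as a list of segment labels; the function name `crossover` and the labels `flank5`, `gene` and `flank3` are hypothetical and chosen only for this example.

```python
def crossover(homolog_1, homolog_2, break_1, break_2):
    """Exchange the right-hand arms of two homologs at the given breakpoints.

    Each homolog is a list of segment labels (a toy representation).
    If break_1 == break_2 the exchange is equal; otherwise it is unequal.
    """
    left_1, right_1 = homolog_1[:break_1], homolog_1[break_1:]
    left_2, right_2 = homolog_2[:break_2], homolog_2[break_2:]
    return left_1 + right_2, left_2 + right_1


# Two identical homologous chromosomes, each carrying one copy of the gene.
chromosome = ["flank5", "gene", "flank3"]

# Equal crossover: both products still carry exactly one copy.
equal_products = crossover(chromosome, chromosome, 1, 1)

# Unequal crossover: the homologs are misaligned, so one breakpoint lies
# after the gene and the other before it.
duplicated, deleted = crossover(chromosome, chromosome, 2, 1)

print(equal_products)
print(duplicated)  # carries a tandem duplication of the gene
print(deleted)     # has lost the gene
```

In this sketch one recombination product gains a tandem copy of the gene while the reciprocal product loses it, which is why unequal crossing-over produces duplicated segments that lie next to one another.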
Duplication of an entire chromosome is possible, but it seems unlikely to have played a major role in genome evolution. Duplication of an individual human chromosome gives a cell with three copies of one chromosome and two of all the others (trisomy), which is either lethal or results in a disorder such as Down syndrome. It seems that an increased dose of some genes but not others leads to an imbalance of gene products and disruption of the cell's biochemistry.

The entire set of chromosomes can also be duplicated, and this is common in plants. Autopolyploidy can result from aberrant meiosis. Tetraploids are stable and can reproduce, whereas triploids cannot. Wheat (Triticum aestivum) is hexaploid, while cotton (Gossypium hirsutum) is tetraploid. Polyploidy is less common in animals, especially those with distinct sex chromosomes; nevertheless, the red viscacha rat of Argentina has a tetraploid genome.

Autopolyploidy does not lead directly to an increase in gene number, because the initial product is an organism that simply has extra copies of every gene rather than any new genes. But it provides the potential for new genes to arise by mutation of the copies that are not essential to the organism.

Detecting these past events by simple sequence comparison is difficult, because many of the duplicated genes may since have been deleted and many others will have diverged so much that they appear to be entirely unrelated sequences. Instead we need to look for sets of duplicated genes that have the same order along the DNA molecule, which will be detectable provided there has not been too much rearrangement. A search of the Saccharomyces cerevisiae genome revealed many such sets, indicating that this genome underwent a duplication just under 100 million years ago. Sequence comparison showed about 800 gene pairs whose protein products have more than 25% sequence identity.
Of these genes, 376 could be placed in 55 duplicated sets, each set containing at least three genes in the same order. Together these sets cover half the genome. The fact that there were just two copies of each gene, not three or four, supports the idea that the copies arose by whole-genome duplication. Comparison of S. cerevisiae with the other yeast species Kluyveromyces lactis and Ashbya gossypii showed that the three species share a common ancestor that lived over 100 million years ago, before the genome duplication. The duplication is also supported by the fact that S. cerevisiae contains duplicated copies of many genes that are present as single copies in the other yeast species.

Equivalent work with other genomes has shown that whole-genome duplication has been a relatively frequent event in the evolution of many groups of organisms. Comparison of the Arabidopsis thaliana genome sequence with those of other plants showed that its ancestors underwent four rounds of genome duplication between 100 and 200 million years ago. Human and other mammalian genomes also contain so many duplicated genes that at least one genome duplication is thought to have occurred in this lineage between 350 and 600 million years ago.

Smaller duplications have occurred in the human genome in the more recent past. Analysis of the long arm of human chromosome 22 showed that, within this region of 35 Mb, more than 200 segments of DNA longer than 1 kb have 90% or greater sequence similarity to another segment in the same region or on another chromosome, these duplications having arisen over a period of about 34 million years. There is other evidence for duplications of 1 to 400 kb in length throughout the genome.

Because segments of DNA of almost any size can be duplicated, it is also possible for the functional units within a gene, i.e. the segments coding for the domains of a protein, to be duplicated and recombined with segments from other genes to make novel genes.
Most of the domains in a protein are encoded by a contiguous sequence in the DNA. Rearrangement of domain-encoding gene segments could therefore result in novel protein functions. Domain duplication occurs when the DNA segment coding for a domain is duplicated by any of the mechanisms described so far, so that the domain is repeated in the protein. The additional domain may itself confer novel properties on the protein, or it may accumulate mutations and give rise to a different structure or function. Domain duplication results in elongation of the gene, which is a characteristic of higher organisms. Domain shuffling occurs when domains from different genes are recombined in new ways, giving new combinations, or mosaic proteins, with entirely new biochemical functions.

Duplication or shuffling of a domain requires that the domain be coded by a continuous stretch of DNA that is not interrupted by introns. Interestingly, a domain is often coded by a single exon, so the physical separation of exons facilitates the movement of complete domains. An excellent example is the α2 Type I collagen gene, which codes for one of the three polypeptide chains of collagen, made up of repeats of the tripeptide glycine-X-Y. The chicken α2 Type I collagen gene is split into 52 exons, 42 of which cover the part of the gene coding for the glycine-X-Y repeats. Within this region each exon encodes a set of complete tripeptide repeats. The number of repeats per exon varies: 5 (5 exons), 6 (23 exons), 11 (5 exons), 12 (8 exons) or 18 (1 exon). This gene could have evolved by duplication of exons, leading to repetition of the structural domain. There are similar examples among the many proteins involved in blood clotting in humans.

New genes can also be acquired from other species, by a process known as lateral gene transfer. Lateral gene transfer has played a major role in the genome evolution of prokaryotes and is very common among them.
Lateral gene transfer can occur in several ways, e.g. conjugation, composite transposons and direct uptake of DNA from the environment. Among higher organisms, plants are well known for acquiring genes from other species by allopolyploidy; wheat and cotton are well-characterized examples. In animals the species barrier is not easily broken, so acquisition of genes by lateral transfer is thought to be uncommon. The most likely vehicles are transposons and retroviruses, which can carry genes along with their own genomes from one host to another. The P element transposon, for example, has been shown to transfer between Drosophila species.

Coding regions make up only 1.5% of the human genome, so the evolution of noncoding DNA is also an important part of genome evolution. The large amount of noncoding DNA has always puzzled scientists. One view is that it performs some unknown but important function. It may, for example, play a role in genome organization and the control of gene expression, as suggested by the influence of chromatin structure on gene expression. Another view is that it is still there simply because there has been no selective pressure to get rid of it. Most noncoding DNA appears to evolve in a random fashion, free of selective constraint, except for the regions immediately upstream of coding sequences and those involved in the structural features of chromosomes. Nevertheless, two components of the noncoding DNA, transposable elements and introns, have interesting evolutionary histories and are of general importance in genome evolution.

Transposable elements have a number of effects on the evolution of the genome as a whole. They can initiate recombination events by providing identical sequences at different positions. Such recombination within and between chromosomes can delete the intervening DNA. These deletions are usually harmful because genes are lost from the genome, but occasionally the outcome of such recombination may be beneficial.
For example, recombination between a pair of LINE-1 elements approximately 35 million years ago is thought to have caused the β-globin gene duplication that gave rise to the Gγ and Aγ members of this gene family.

The movement of transposable elements within the genome also has important consequences. They may affect transcription by inserting into, or excising from, promoter regions, and they can alter splicing by inactivating or activating splice sites.

Although human evolutionary history is controversial, it is generally accepted that our closest relative among the primates is the chimpanzee and that the most recent common ancestor of the two species lived 4.6-5.0 million years ago. Since that split the human lineage has embraced two genera, Australopithecus and Homo. The result is us, a species with what are, at least to our eyes, important biological attributes that make us different from all other animals. So how different are we from chimpanzees? There is only a 1.73% nucleotide difference between the human and chimpanzee genomes. Identity in the coding regions is greater than 98.5%, with 29% of genes specifying identical amino acid sequences, and even the noncoding regions are 97% identical. The gene order is almost the same and the chromosomes are very similar in appearance. The only significant difference is that chimpanzees have 24 pairs of chromosomes whereas humans have 23. Even so, the gene content is the same.
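The human-chimpanzee figures above can be combined in a rough molecular clock calculation of the kind mentioned earlier for the globin genes. The sketch below is a simplification under stated assumptions: the substitution rate is taken to be constant, differences are assumed to accumulate independently along both lineages since the split (hence the factor of 2), and no correction is made for multiple substitutions at the same site. The function names are hypothetical and are introduced only for illustration.

```python
def substitution_rate(divergence, split_time_years):
    """Substitutions per site per year along ONE lineage.

    Assumes a constant rate; the observed divergence is shared between
    the two lineages that separated split_time_years ago.
    """
    return divergence / (2 * split_time_years)


def divergence_time(divergence, rate_per_site_per_year):
    """Invert the clock: time since two sequences shared an ancestor."""
    return divergence / (2 * rate_per_site_per_year)


# Figures quoted in the text: 1.73% nucleotide difference and a split
# 4.6-5.0 million years ago.
human_chimp_divergence = 0.0173

for split_time in (4.6e6, 5.0e6):
    rate = substitution_rate(human_chimp_divergence, split_time)
    print(f"split {split_time / 1e6:.1f} million years ago: "
          f"{rate:.2e} substitutions per site per year")
```

Under these assumptions the quoted figures imply a rate of a little under two substitutions per billion sites per year along each lineage. The same relationship, rearranged as in `divergence_time`, is what allows the date of a gene duplication to be estimated from the sequence divergence between the two copies, provided a calibrated rate is available.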