Topic 8. Lecture 13. Generalizations emerging from past evolution

History unfolds in time, which makes the chronology of past events crucial. However, any history also has a crucial timeless aspect, which can be described by generalizations.

An example of a generalization: complexity is rapidly lost if selection stops maintaining it.
Mycobacterium leprae is in the middle of massive genome degeneration.
A parasitic plant, Epifagus virginiana, lost many key genes in its chloroplast genome.
Astyanax mexicanus, like many other cave animals, has degenerated eyes.
A crustacean parasite of fish, Lernaea carassii, has profoundly simplified morphology.

In a sense, every feature of living beings, both modern and ancient, is a generalization about the evolution of their ancestors.

Let us view evolutionary generalizations from three complementary perspectives:
1. Generalizations concerned with evolution at a particular level of organization of life - i. e., with sequences, molecules, cells, organisms, populations, and ecosystems.
2. Generalizations concerned with evolution of the diversity of life. Such generalizations describe patterns in the diversity of life at one moment of time, as well as processes that generate such diversity, i. e., evolution of individual lineages, birth and death of lineages, independent and dependent evolution of different lineages, and evolution in space.
3. Generalizations concerned with evolution of complex adaptations, the most enigmatic aspects of evolution. These generalizations describe genotypical and phenotypical mechanisms of adaptive evolution, origin of novel adaptations, and dynamics of complexity.

Because we still lack a comprehensive theory of macroevolution, generalizations about past evolution are often all that we have.

Level-specific generalizations:
1. Sequences
a) Mutation strongly affects sequence evolution, and selfish segments are common
b) Functionally important segments and sites of genomes usually evolve slower
c) Complex organisms have larger genomes, mostly due to noncoding sequences
2. Molecules
a) Life possesses fundamental unity
b) A particular function can be performed by very dissimilar molecules
c) Rates of evolution vary across sites of a molecule and often change with time
3. Cells
a) Networks within a cell are modular
b) Networks within a cell consist of a small number of common motifs
4. Multicellular organisms
a) Cell differentiation involves combinatorial regulation of gene expression
b) In the development of vertebrates one stage is particularly conservative
c) Body size often increases, but declines on islands
5. Populations
a) Reproduction almost always involves unicellular channels
b) Amphimixis is pervasive
6. Ecosystems
a) Natural ecosystems can be successfully invaded

Generalizations concerned with diversity of life:
1. Diversity of life at a particular moment of time
a) Every individual belongs to a population of at least ~1000 individuals
b) At any moment, life mostly consists of compact, disconnected forms
c) Genotypes are incompatible if the distance between them exceeds ~1-5%
2. Evolution of a lineage
a) Changes of a lineage are continuous, with some caveats
b) Genomes evolve at much more uniform rates than phenotypes
3. Birth and death of lineages
a) Cladogenesis is often, but not always, triggered by geographic isolation
b) Cladogenesis and extinction are extremely unfair processes
c) Overall diversity of life fluctuates, with the long-term tendency to increase
4. Independent evolution in multiple lineages
a) Evolution is predominantly divergent, but homoplasy is common in simple traits
b) Independent evolution eventually leads to speciation
5. Coevolution
a) Lineages often coevolve for a long time
b) Organisms often imitate each other to avoid being eaten
6. Diversity in space
a) Distributions of ranges of species are strongly affected by limited dispersal
b) Independent evolution at different localities is often parallel

Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
a) Evolution of both coding and non-coding sequences is important for adaptation
b) The target for strong positive selection is narrow at each moment
c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
a) Adaptations can be very general and very specific
b) Evolution is irreversible
c) Perhaps, all adaptations are imperfect
3. Origin of novelties
a) New non-coding regulatory sites, but not new genes, often appear from scratch
b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
a) Complex phenotypes evolve through adaptive intermediate stages
b) Complexity is rapidly lost, if selection stops maintaining it
c) The overall trend is for complexity to increase

Level-specific generalizations: 1. Sequences

The level of sequences is the simplest of all levels of organization of life.

ACGATCGACGACGATCGATCGACGATCGA
Green, blue, red: targets of no, negative, and positive selection.

Evolution of sequences is understood relatively well. The two key factors of Darwinian evolution, mutation and selection, are its main forces. However, this is of little help for understanding evolution at higher levels. Whether genotypes drive evolution of phenotypes or it is the other way around is a classical chicken-and-egg problem.

1a) Mutation strongly affects sequence evolution, and selfish segments are common

This sweeping generalization has many facets. The three most important of them are:

i) Evolution of sequences proceeds through individual changes that are supplied by the mutation process, first of all by point mutations - single nucleotide substitutions, and short deletions and insertions.
Sister 1: caagccag---cgtctatcatatacgcagactcggctatttacgccacgatcagcat
Sister 2: catgccagcatcgtctagcatatacacagactc-gctatttacgtcacga-cagcat
Outgroup: catgccagcatcgtgtagcatataggcagactc-gctaattacgtcacgatcagtat
The three gapped positions mark, from left to right, a deletion (in Sister 1), an insertion (in Sister 1), and a deletion (in Sister 2).

ii) Long new sequences have identifiable sources, instead of appearing from scratch.

acagcatcgtgactagctatcgagatca -> acagcatcgtgactagctatagctatcgagatca
Tandem duplication, the simplest manifestation of this pattern.

iii) Different genome regions evolve at similar overall rates. This is another piece of theory-based evidence for evolution.

Human-mouse divergence at synonymous sites of genes on chromosomes 4 (left) and 22 (right).

One important special case of this generalization is that transposable elements (TEs) accumulate in many genomes. A mammalian genome is ~50% TEs, a Drosophila genome is ~10% TEs, and bacterial genomes usually contain very few TEs and other junk.

In mammals, individual TEs are usually fixed, i. e. present in every genotype within a lineage. In Drosophila, an individual TE is usually rare. (Figure panels: Mammals; Drosophila.)

Often, TEs or their segments become domesticated, i. e. start performing some function for their host.

The distribution of ages of TEs in the human genome. Age is measured by divergence from the consensus sequences and grouped into bins that correspond to 25 My of divergence.

Simple explanation for: 1a) Mutation strongly affects sequence evolution, and selfish segments are common

Qualitatively, this pattern is unavoidable - the only feasible mode of genome evolution is fixation of an individual mutation. Selection is powerless without mutation.

IMPERFECT -> IMPENICESEGMENTRFECT
No selection can accomplish this!

Quantitatively, mutation dictates the course of evolution, but only as long as selection does not care. For example, in coding regions deletions and insertions of lengths 1 and 2 (but not 3) are rare.

Not so simple from here: Selection is more efficient in lineages represented, at any moment, by many individuals.
Thus, lineages with large populations are better protected against TEs and other junk. The relative roles of mutation and selection are the key issue of evolution at the sequence level. To some extent, it will be clarified by the next generalization.

1b) Functionally important segments and sites of genomes usually evolve slower

A nucleotide substitution can kill, but at another location a substitution or even a removal of 1 Mb of sequence has no evident impact on the phenotype.

Pathogenic (black) and benign (grey) nucleotide substitutions in the human mitochondrial gene for alanine tRNA.

A typical human is heterozygous for ~50 deletions larger than 5,000 nucleotides each. The detected deletions span a total of 267 genes.

This sweeping generalization has many facets. The three most important of them are:

i) Non-synonymous sites of coding genes evolve slower than synonymous sites

ATG TCT GGG CGA GGT AAA GGT GGC AAG GGG CTG GGT AAG GGA GGC GCC AAG CGC CAC CGG
||| ||| ||  ||| ||  ||| ||  ||| ||  ||| ||  ||  ||  ||  ||| ||  ||  ||| ||  ||
ATG TCT GGA CGA GGC AAA GGC GGC AAA GGG CTC GGA AAA GGT GGC GCT AAA CGC CAT CGT

This alignment of the first 20 codons of histone 4 genes from the human and zebrafish genomes is an extreme case: all differences are at third codon positions. On average, nonsynonymous substitutions accumulate ~10 times slower than synonymous ones.

ii) Functional non-coding segments evolve slower than junk segments

Alignment of four genome regions upstream of the transcription start of the apolipoprotein gene. The binding site of the key transcription factor (protein) is conserved (sequence motif) and highlighted. Conservation is represented by a motif logo.

Functional non-coding sequence segments can be detected using phylogenetic footprinting.

iii) Exons evolve slower than introns

Coding exons evolve much slower than introns, and this pattern can be used to determine exon locations by genome comparisons.

Alignment of human (top) and mouse (bottom) orthologous genes.
Lines connecting the genomes show segments where their similarity is moderate (blue) or high (red). Red boxes below the alignment show predicted exons.

Essential genes, which make up ~25% of all genes, evolve ~1.5 times slower than non-essential genes. Still, occasionally even once-essential genes are lost.

The estimated number of lost genes is shown next to each branch. Approximate divergence times are shown at the right.

Simple explanation for: 1b) Functionally important segments and sites of genomes usually evolve slower

Negative selection, which favors already-common variants and prevents changes, is much more common than positive (Darwinian) selection, which favors initially rare variants and promotes changes. One may wonder why beneficial mutations happen at all.

Still, positive selection does happen, and a particular site or even a segment where selection strongly promotes changes can evolve faster than selectively neutral sites.

Mutation reigns where selection does not care, but where it cares, selection makes a very strong impact on sequence evolution, although it can only reject or favor new mutations.

1c) Complex organisms have larger genomes, mostly due to noncoding sequences

Genomes of complex organisms carry only a slightly elevated number of protein-coding genes. In Drosophila, ~50% of its non-coding DNA is apparently doing something, and in mammals this fraction is ~10%.

Organisms               Minimal genome size          Number of genes   Maximal coding fraction
                        (millions of nucleotides)    (thousands)       (per cent)
parasitic bacteria      0.5-1.5                      0.5-1.5           85
free-living bacteria    2.5-7.5                      2.5-7.0           85
unicellular eukaryotes  10-30                        7-10              50-70
flowering plants        60-120                       20-30             25-40
most of animals         100-200                      15-25             15-20
fishes                  400-1000                     20-30             5-10
birds                   1000-1500                    20                2-3
mammals                 2500-3500                    20                1.5-2

Simple explanation: Complex organisms need more text to describe themselves, and the extra text comes in the form of functional non-coding sequences (we do not really understand why).
Also, complex organisms have "bloated", instead of "lean", genomes.

Level-specific generalizations: 2. Molecules

Molecules are the lowest functional level. A molecule is a (relatively) small but fully functional entity, and each one is incredibly complex (protein folding remains a mystery).

DNA as a functioning molecule.
tRNA, a non-coding RNA.
Hemoglobin, a protein.

Why do we ignore evolution of things like this? It is discontinuous, and its connection to the genotype is much more complex.

2a) Life possesses fundamental unity

This unity is most striking as far as the translation machinery is concerned - genetic code, components of ribosomes, tRNAs, aminoacyl tRNA synthetases, etc.

80S ribosome of Saccharomyces cerevisiae.
70S ribosome of Escherichia coli.

Many proteins not involved in translation are also universal - almost 50% of E. coli proteins have homologs among human proteins.

Simple explanation: Many key features of life are probably frozen accidents, impossible to modify. This allows us to learn something about LUCA.

2b) A particular function can be performed by very dissimilar molecules

Despite the fundamental unity of life, there are some cases when the same function, either simple or complex, is performed by clearly non-homologous molecules, similar only to the extent dictated by this function.

Inorganic pyrophosphatases comprise two non-homologous families, I and II.

Archaeal and eukaryotic replicative DNA polymerases (families A and B) and bacterial replicative DNA polymerases (family C) are perhaps non-homologous.

Simple explanation: Apparently, this is a general law of nature: every complex task can be performed more or less equally well in many rather different ways. Without common ancestry, each species would probably use its own way to hydrolyze pyrophosphate.
2c) Rates of evolution vary across sites of a molecule and often change with time

In almost every RNA or protein molecule there are sites that evolve very conservatively and sites that evolve as fast as junk DNA (i. e., at the mutation rate) or even faster.

A typical segment of an alignment of several orthologous proteins from different species.

Distribution of amino acid replacements along the Neisseria gonorrhoeae transmembrane porin sequence. Each dot represents one replacement. Obviously, sequence segments exposed outside the cell evolve much faster, probably due to positive selection.

The rate of evolution within a molecule is not only heterogeneous across sites at any moment of time, but it can also change at a particular site while the molecule evolves. This occasionally includes the most drastic, qualitative changes - a nucleotide or amino acid replacement which was forbidden by selection may become permitted, or the other way around.

Hs   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK  60
Ag   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK  60
Hs  61 EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP 120
Ag  61 EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP 120
Hs 121 DNEAYEMPSEEGYQDYEPEA 140
Ag 121 DNEAYEMPSEEGYQDYEPEA 140

In humans, T at the 53rd site of the protein alpha-synuclein is pathogenic. However, in spider monkey (Ag) normal alpha-synuclein contains this T. Probably, some other deviation of spider monkey alpha-synuclein from its human ortholog (Hs) renders T at the 53rd site harmless. Thus, we can call this T a CPD (compensated pathogenic deviation).

As many as 10% of deviations of a non-human protein from its human ortholog would be deleterious, if placed into the human molecule individually.

CPDs are very common in tRNAs. Three of them are present in mitochondrial tRNA(Ser) of Ursus maritimus (polar bear).
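The alpha-synuclein alignment above can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper name `differing_sites` is hypothetical, and the two sequences are copied from the Hs (human) and Ag (spider monkey) rows of the alignment. It lists every site where the two orthologs differ, using 1-based positions and the usual "human residue, position, non-human residue" notation.

```python
# Sequences copied from the Hs / Ag alignment in the text (140 residues each).
HS = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK"
    "EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)
AG = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK"
    "EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)


def differing_sites(a, b):
    """Return (position, residue_in_a, residue_in_b) for every mismatch.

    Positions are 1-based. Both sequences must be gap-free and of equal
    length (hypothetical helper, written for this illustration only).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


diffs = differing_sites(HS, AG)
for pos, human, monkey in diffs:
    print(f"{human}{pos}{monkey}")
```

The first difference reported is at site 53, where the human sequence has A and the spider monkey sequence has T. The remaining differences are the candidates among which a compensating deviation would have to be sought; the code does not by itself say which one compensates.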
Nucleotides corresponding to human pathogenic mutations are shown in red; predicted compensatory substitutions are shown in blue; and other deviations from the human ortholog, those unrelated to the pathogenic mutations or their compensations, are shown in green. Nucleotides found in healthy humans are shown in orange alongside the nonhuman sequence.

At least five mechanisms of compensation are known for pathogenic mutations that destroy a Watson-Crick pair in one of the four tRNA stems.

Simple explanation for: 2c) Rates of evolution vary across sites of a molecule and often change with time

Variation of the rate of evolution across sites is not surprising.

Proportions of amino acid replacements in human proteins that are opposed by selection with coefficients s > 10^-2, 10^-2 to 10^-4, 10^-4 to 10^-5, or < 10^-5.

Variation of the rate of evolution at a site may be unexpected. However, molecules are very complex things, and their parts interact with each other in complex ways. Thus, evolution of a molecule changes the rules of the game for each of its sites.

Level-specific generalizations: 3. Cells

Cells are not the lowest functional level - molecules are - but they are the first living level. Thus, in cells we encounter a staggering degree of complexity.

Unicellular green alga Acetabularia is ~5 cm tall.
A ciliate, Stentor.
Human hippocampal neuron.

A cell contains a large number of functional units - promoters, mRNAs, ribosomes, and proteins. These units interact with each other, forming networks. Networks that describe the following 3 processes are particularly important:
transcription of genes
physical interactions of proteins
functional interactions of proteins

Transcriptional regulatory network of Saccharomyces cerevisiae. Transcription factor genes are green, regulated genes are brown, and those with both functions are red.

Network of protein complexes in S. cerevisiae. Different functions are shown by colors. The gray edges connect complexes that share protein components.
Exemplar complexes from each function are expanded to show individual proteins.

A standard map of biochemical pathways, representing the metabolic network of the cell.

Networks of interacting units (produced by evolution!) are the essence of cells. But can we formulate any useful generalizations about them?

3a) Networks within a cell are modular

Modularity of a network simply means that interactions between some components are tight and other interactions are loose. Complexes of physically interacting proteins are modules.

The genome of yeast Saccharomyces cerevisiae encodes ~7,000 proteins. Within the cell, they form ~700 different complexes of physically interacting proteins.

Such complexes are modules, but this is not the whole story. Often, a protein can participate in several complexes. Some proteins always stick together and form "cores". Other proteins form "submodules" that can attach to different cores.

A protein complex, consisting of the core, 3 submodules, and other attachments.

Modularity is also pervasive in transcription and metabolic networks.

Transcription factors (boxes) separately regulate genes involved in different processes.

Modules, associated with different functions, in the metabolic network of Escherichia coli.

Hierarchical organization of modularity in metabolic networks.

Simple explanation for: 3a) Networks within a cell are modular

Well, nothing is going to be simple here! We might only assume that, perhaps, networks within cells are modular because such networks are evolvable and designable, and not because they are optimal.

3b) Networks within a cell consist of a small number of common motifs

Network motifs are patterns of interconnections that recur in many different parts of a network. All networks within cells consist mostly of a small number of motifs that evolved independently.

Much of the network of transcriptional interactions in Escherichia coli is composed of repeated appearances of three motifs.
Each motif has a specific function in determining gene expression.

Feedforward loop: a transcription factor X regulates a second transcription factor Y, and both jointly regulate one or more operons Z1...Zn.
Example of a feedforward loop (L-arabinose utilization).

SIM motif: a single transcription factor, X, regulates a set of operons Z1...Zn. X is usually autoregulatory. All regulations are of the same sign. No other transcription factor regulates the operons.
Example of a SIM system (arginine biosynthesis).

DOR motif: a set of operons Z1...Zm are each regulated by a combination of a set of input transcription factors, X1...Xn. DORs are detected as dense regions of connections.
Example of a DOR (stationary phase response).

The most common motif in metabolic networks that regulate enzyme activity: negative feedback loop. Again, such loops evolved independently very many times in different metabolic pathways.

Simple explanation for: 3b) Networks within a cell consist of a small number of common motifs

Apparently, there are not too many feasible solutions for each of the simple regulatory tasks that a part of the network has to perform. We may be dealing with unique optimality here, as far as the overall structure of regulatory interactions is concerned.

Level-specific generalizations: 4. Multicellular organisms

A cell is alive, but often cells are not independent. Multicellular organisms evolved from unicellular ones five times. Multicellular organisms are as complex as their constituent cells, if not more. Obviously, cell differentiation, pattern formation, and overall properties of organisms are all essential.

(Figure panels: cell differentiation; pattern formation; overall phenotype.)

4a) Cell differentiation involves combinatorial regulation of gene expression

The genome of a multicellular organism programs development of many different cell types, although it contains only slightly more genes than the genome of a unicellular organism.
Greater complexity of multicellular organisms appears because, on average, their genes are regulated by a much larger number of transcription factors.

a, Simple eukaryotic transcriptional unit. A simple core promoter (TATA), upstream activator sequence (UAS) and silencer element. b, Complex metazoan transcriptional control modules consisting of multiple clustered enhancer modules interspersed with silencer and insulator elements.

Moreover, the total number of transcription factors encoded by any genome is not large. For example, the genome of Drosophila encodes only ~800 transcription factors. Thus, combinatorics of transcription factors and their binding sites is essential both for gene-specific and tissue-specific patterns in gene expression.

An array of binding sites for 4 transcription factors in a controlling region of a typical gene of a multicellular organism. Each of these factors regulates many other genes - but in different combinations.

Simple explanation: This generalization is not fully understood. Perhaps, combinatorial regulation evolves because novel transcription factors are more difficult to acquire than novel binding sites for them. Or, alternatively, such regulation may be the most efficient one feasible. Opportunism or optimality? - we do not know.

4b) In the development of vertebrates one stage is particularly conservative

The embryonic development of all vertebrates shows remarkable similarities at the early - but not the earliest - stage called the pharyngula. At this stage all vertebrates have a notochord, a dorsal hollow nerve cord, a post-anal tail, and a series of paired branchial grooves, matched on the inside by a series of paired gill pouches.

The pattern has been known since the 19th century, as Von Baer's law.

Simple explanation: Perhaps, early stages of development are less evolvable, because their changes affect all subsequent stages.
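As a back-of-the-envelope illustration of the combinatorial argument in 4a), the sketch below counts how many distinct sets of 4 transcription factors can be drawn from ~800. The numbers 800 and 4 come from the lecture; everything else is an assumption made for illustration only: a regulatory "code" is treated as an unordered set of distinct factors (ignoring order, spacing, and copy number of binding sites), and 25,000 genes is taken as the upper bound for animals from the genome-size table.

```python
import math

N_FACTORS = 800      # ~800 transcription factors in Drosophila (from the lecture)
SITES_PER_GENE = 4   # 4 factors per control region (the lecture's illustration)
N_GENES = 25_000     # upper bound for animals in the table (assumed for comparison)

# Number of unordered sets of 4 distinct factors chosen out of 800.
n_combinations = math.comb(N_FACTORS, SITES_PER_GENE)

print(f"distinct 4-factor sets: {n_combinations:.2e}")
print(f"sets available per gene: {n_combinations / N_GENES:.2e}")
```

Under these simplifying assumptions, the number of possible factor combinations exceeds the number of genes by several orders of magnitude, which is why a small repertoire of factors can still specify gene-specific and tissue-specific expression.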
4c) Body size often increases, but declines on islands

This pattern is known as Cope's rule, and has been observed repeatedly. Larger animals are apparently more prone to extinction.

Body size is plotted against time, for species of Borophaginae (a clade of extinct carnivores).

Clearly, this pattern cannot be universal! Indeed, there are many exceptions. In fact, on islands the body size of many - but again not all - organisms declines, a pattern known as Foster's rule.

The Pygmy Mammoth (Mammuthus exilis) was a dwarfed descendant of full-sized mammoths that lived on an island known as Santa Rosae.

Wrangel Island - the range of another dwarfed mammoth, extinct only ~3,500 years ago.

Simple explanation: Clearly, this is a mess!

Skeleton of a Cretan Dwarf Elephant.

Level-specific generalizations: 5. Populations

Here the complexity of our object drops again - a part may be more complex than the whole, if we can view parts as black boxes. Populations are sets of similar individuals, and we care only about those properties of individuals that describe them as members of such sets, without looking under the hood.

(Figure labels: organism; individual; population of individuals.)

Complexity of life peaks at cells and organisms - lower and upper levels are simpler.

5a) Reproduction almost always involves unicellular channels

Why recreate big organisms from single cells, every generation? Indeed, vegetative reproduction, i. e. reproduction by many cells, is perfectly feasible even in humans - but it almost never replaces single-cell reproduction completely.

Some other examples of vegetative reproduction.

Even when reproduction is nominally vegetative - for example, a branch of a moss becomes an independent organism - all the cells of this branch may originate from a single apical meristem cell.

Even mitochondria in the female germline of mammals go through drastic bottlenecks - all mitochondria of a newborn are descendants of just 3-4 stem maternal mitochondria. Why?
Simple explanation: Probably, single-cell (and single-genotype) channels make selection more efficient. In fact, this is not that simple, and other explanations are feasible. We will return to this issue later.

5b) Amphimixis is pervasive

Why is such a crazy process - alternation of syngamy and meiosis - ubiquitous? Indeed, apomixis (asexual reproduction) is very common, but almost never represents the only mode of reproduction. The only known exception is bdelloid rotifers.

An obligately apomictic bdelloid rotifer.

Simple explanation: There is no definite explanation for the ubiquity of amphimixis. Almost 20 hypotheses have been proposed, and 3 or 4 among them make sense. We will consider this issue later.

Level-specific generalizations: 6. Ecosystems

As you know, ecosystems consist of interacting populations.

6a) Natural ecosystems can be successfully invaded

Purple loosestrife, Lythrum salicaria, a very successful invader of European origin in North America.

Elodea canadensis, a very successful invader of North American origin in Eurasia.

An invasion presents a paradox: why should an invader be successful within the new environment, to which it never had a chance to adapt? Apparently, natural ecosystems have a lot of empty niches.

Simple explanation: There are several hypotheses but none is universally accepted. Still, it is clear that the problem is an evolutionary one.

Quiz: Formulate your own generalization about evolution at any level of organization of life. This generalization does not need to be sweeping and very important - just make sure that it makes sense.

Generalizations concerned with diversity of life:

As long as we are ready to ignore the complexity of life, evolution of its diversity is understood reasonably well.

1. Diversity of life at a particular moment of time

1a) Every individual belongs to a population of at least ~1000 individuals

This fundamental fact is so familiar that it is often taken for granted - although it should not be.

The Loch Ness monster does not exist - there would have to be at least 1000 of them. The same is probably true for the yeti.

Mating ball of garter snakes.

Simple explanation: Population genetic theory demonstrates that a small population will soon become extinct due to inefficient selection against new deleterious mutations. We will consider this theory.

1b) At any moment, life mostly consists of compact, disconnected forms

Indeed, at least among multicellular eukaryotes, we often encounter "good species", i. e. compact sets of similar and compatible organisms.

Often, a form of life is not very compact phenotypically, but it is still compatible and connected within itself, and disconnected from other forms.
Aquilegia formosa.
Aquilegia pubescens.

Sometimes, two compatible phenotypes are connected by only a relatively small number of hybrids, so it is not clear whether to treat them all as one form of life or not.

Occasionally, connection exists even between incompatible genotypes.

Of course, according to the Strong Claim, every two organisms are connected, if we take into account all organisms, present and past. Still, "intermediate" genotypes and phenotypes have a tendency to disappear. Among modern organisms, continuous paths within the space of genotypes are no longer than 0.01-0.1 of DNA-level differences.

Simple explanation: There are probably several reasons behind this generalization: (i) species might be adapted to discontinuous ecological niches, (ii) reproductive isolation (which can arise only in sexual taxa) might create gaps between taxa by allowing them to evolve independently, (iii) anagenesis is only rarely coupled with continuous range expansion, and such expansion cannot go on for too long - because the Earth is too small.

1c) Genotypes are incompatible if the distance between them exceeds ~1-5%

Very often, incompatible genotypes are also disconnected (again, only within modern organisms). There are no living, fit intermediates between dog and cat, or horse and donkey.

Two incompatible, disconnected genotypes - the most common situation.

Two partially compatible, disconnected genotypes (mules are viable, but sterile).

Still, compatible genotypes (left and right) may be disconnected, due to geographical isolation (the hybrid in the center was produced artificially).

Occasionally, even incompatible genotypes remain connected.

Despite this variation, there is a strong correlation between incompatibility and dissimilarity. Incompatibility appears when the genetic distance between two genotypes exceeds 0.01-0.05.

Each point represents a pair of species of Drosophila.

No wonder that genotypes that are very dissimilar are also incompatible.
However, it seems that incompatibility kicks in surprisingly abruptly.

Mitochondrial genetic distances between the most distant hybridizable species do not differ between birds and mammals. Such distances correspond to nuclear DNA genetic distances of 0.01-0.03 (mitochondria evolve faster).

Curiously, in mammals, within clades with invasive placenta hybridization is possible between more dissimilar species.

Simple explanation: There is no need to explain, really, why incompatibility generally increases with dissimilarity. However, the likely reason for a rapid transition from compatibility to incompatibility is nontrivial and is known as Orr's snowball effect - we will consider it later.

Generalizations concerned with diversity of life: 2. Evolution of a lineage

2a) Changes of a lineage are continuous, with some caveats

Children are similar to parents.
A rare exception: WGD (whole-genome duplication).
A rare exception: symbiogenesis.

Simple explanation: With some exceptions, long parent-offspring leaps within the space of genotypes are just impossible: most potential genotypes are junk, and a long leap will land you in junk.

2b) Genomes evolve at much more uniform rates than phenotypes

At the level of sequences, different lineages can easily accumulate changes at rates that vary within a factor of 1.5-2.0, but variation of rates is rarely large.

Lengths of dog, mouse, and human branches of the unrooted phylogenetic tree, in numbers of nucleotide substitutions per synonymous site (Ks) and per nonsynonymous site (Ka). Sequences evolved almost 3 times faster on the mouse branch than on the human branch - because the number of generations was much higher on the mouse branch.

In contrast, phenotypes occasionally evolve at very different rates along different branches.

Simple explanation: No law of nature prescribes a constant rate of genome evolution. Thus, its approximate uniformity is something of a mystery. We will consider this issue later.
Heterogeneity in rates of phenotypical evolution must be, at least partially, due to heterogeneity in the strength of Darwinian natural selection.
Generalizations concerned with diversity of life: 3. Birth and death of lineages
3a) Cladogenesis is often, but not always, triggered by geographic isolation
Geographical isolation always leads to unlimited divergence. However, this is not the whole story. A lineage can also split into two even without geographic subdivision. This process is called sympatric speciation. For example, in the crater lake Apoyo in Nicaragua a new species of cichlids evolved sympatrically in the course of ~10,000 years.
Amphilophus citrinellus (left) is the ancestral form and A. zaliosus (right) is a new species. By now, these two species are quite different morphologically, occupy substantially different ecological niches, and do not hybridize in nature.
Simple explanation: If a lineage is subdivided into two isolated parts, these parts are bound to evolve independently and, eventually, will become very dissimilar, disconnected, and incompatible, because evolution is primarily divergent. This is trivial. In contrast, cladogenesis without prior geographical isolation is a complex and fascinating subject, to be considered later.
3b) Cladogenesis and extinction are extremely unfair processes
We have already seen this many times.
Simple explanation: Why should they be fair? Is life fair? Specific reasons for unfairness, however, are not clear, and may be 1) "key innovations", 2) ecological opportunities, 3) chance.
Questions to think about: Can we say that a clade which diversifies faster has a selective advantage over a clade which diversifies slower? Is it true that species from a more diverse clade are more advanced (derived)? Is Amborella a living fossil?
3c) Overall diversity of life fluctuates, with the long-term tendency to increase
Reliable data exist only for times since the Cambrian, but the tendency is clear.
Simple explanation: Initially, the diversity of life was low, so it could only grow from there. However, it is not clear why an equilibrium has not yet been reached.
Generalizations concerned with diversity of life: 4. Independent evolution in multiple lineages
4a) Evolution is predominantly divergent, but homoplasy is common in simple traits
When a complex enough genotype or phenotype is considered, divergence always dominates. Divergence of sequences eventually reaches saturation at ~75% (the difference expected between two random sequences of four equally frequent nucleotides), but divergence of phenotypes is unlimited.
However, homoplasy is also common, as long as we consider simple traits that can only accept a small number of states. In proteins, the per-site rate of parallel amino acid replacements is above the average.
Homo    fkVmnasdfrtshnmcvadnmd
Macaca  fklmnasdfrtshnmcvqdnmd
Rattus  fklmnatdfrtshnmcvadnmd
Mus     fkvmnasdfrtshnicvadnmd
At sites (painted red) where an amino acid replacement occurred between rat and mouse, the same replacement occurs between human and monkey with a probability that is ~5 times higher than the probability of replacement at other sites.
In contrast, at the level of complex phenotypes, homoplasy, although widely known, is always superficial.
Simple explanation: When we consider the whole multidimensional space of possibilities, homoplasy is very improbable. Imagine two hikers wandering in a 41,000,000-dimensional forest, starting from the same location: they are very unlikely ever to meet again. Thus, at the level of complex genotypes and phenotypes, homoplasy must be forced by similar selection operating on different lineages. In contrast, when we consider a 1-dimensional subspace with only 4 or 20 states, random homoplasy is possible, and is quite common, because if an event occurred in one lineage it must be harmless.
4b) Independent evolution eventually leads to speciation
This is always the case, qualitatively. Quantitatively, however, the rates of speciation can vary.
On the one hand, lineages which became geographically isolated over 40 Mya may still hybridize and produce fertile offspring.
Platanus orientalis from Asia. Their hybrid, the "London plane". Platanus occidentalis from North America.
On the other hand, host races of insects with a high degree of reproductive isolation can appear after ~100 years of different selection.
Alfalfa (left) and clover (right) races of the pea aphid Acyrthosiphon pisum are reproductively isolated to a large degree.
Simple explanation: Because independent evolution is mostly divergent and incompatibility increases with dissimilarity, this pattern is unavoidable.
Generalizations concerned with diversity of life: 5. Coevolution
5a) Lineages often coevolve for a long time
Often, a host and its symbiont or parasite have congruent phylogenies, suggesting their co-divergence (cospeciation). An example of this is provided by mealybugs and their bacterial symbiont Tremblaya.
A spectacular example of a long-term coevolving association is figs and fig wasps, which have been cospeciating for over 60 My.
Simple explanation: This is not surprising when the host and the symbiont totally depend on each other.
5b) Organisms often imitate each other to avoid being eaten
Mimicry is a spectacular phenomenon. There are two kinds of mimicry.
Batesian mimicry: the mimic resembles the successful species but does not share the attribute that discourages predation.
The palatable viceroy Limenitis archippus (top) mimics the bitter monarch Danaus plexippus.
The non-venomous scarlet kingsnake Lampropeltis triangulum (top) mimics the deadly coral snake Micruroides euryxanthus.
Müllerian mimicry: the mimic resembles the successful species and shares the antipredation attribute (being dangerous or unpalatable).
Heliconius erato (above) and H. melpomene (below), a pair of unpalatable Müllerian mimics from different areas of Ecuador and Northern Peru.
Within any area, the two species are extremely accurate mimics of one another, but major geographic differences in colour pattern have evolved within each species.
Simple explanation: This is natural selection!
A question to think about: Does mimicry constitute evidence for evolution? (Do not tell anybody, but here I disagree with Darwin.)
Generalizations concerned with diversity of life: 6. Diversity in space
6a) Distributions of ranges of species are strongly affected by limited dispersal
This is a truly pervasive pattern. For example, there are 13 species of finches on the Galapagos islands, occupying a wide variety of ecological niches.
Simple explanation: Limited dispersal can strongly affect the outcomes of even slow evolution.
6b) Independent evolution at different localities is often parallel
Simple explanation: This is natural selection! The arrays of ecological niches available for similar organisms at different places tend to be similar.
Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
a) Evolution of both coding and non-coding sequences is important for adaptation
b) The target for strong positive selection is narrow at each moment
c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
a) Adaptations can be very general and very specific
b) Evolution is irreversible
c) Perhaps, all adaptations are imperfect
3. Origin of novelties
a) New non-coding regulatory sites, but not new genes, often appear from scratch
b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
a) Complex phenotypes evolve through adaptive intermediate stages
b) Complexity is rapidly lost, if selection stops maintaining it
c) The overall trend is for complexity to increase
Generalizations concerned with adaptation and complexity: Of the two main assignments of evolutionary biology, to understand the origin of diversity and of complexity and adaptation of life, the second one is by far the more difficult.
1. Genetical aspects of adaptive evolution
The genetics of adaptive evolution is understood better than its other aspects.
1a) Evolution of both coding and non-coding sequences is important for adaptation
........................................Allele.C...................Lys.........
........................................Allele.S...................Val.........
transcription.start..............................MetValHisLeuThrProGluGluLysSer...
catttgcttctgacacaactgtgttcactagcaacctcaaacagacaccATGGTGCACCTGACTCCTGAGGAGAAGTCT...
........................................Allele.S....................T..........
........................................Allele.C...................A...........
A recent and imperfect adaptation in humans. Alleles S and C of beta-hemoglobin, both causing a single amino acid replacement (of the same glutamate residue), protect against malaria in the heterozygous state. Allele S replaces this residue with valine (A -> T at the second position of the codon) and allele C replaces it with lysine (G -> A at the first position).
In our ancestors and in many modern human populations adults are lactose-intolerant. The ability of adults to produce lactase is due to a C -> T substitution at site -13910 upstream of the start codon of the LCT locus (in Europeans) and due to a G -> C substitution at site -14010 (in Africans).
Simple explanation: It is only natural that adaptation may involve changes both in proteins and in the regulation of their synthesis.
1b) The target for strong positive selection is narrow at each moment
In a typical protein, sites that are currently under positive selection are rare and interspersed among numerous sites under negative selection.
Sites that were under recent positive selection are painted red in a primate seminal protein, Kallikrein 2, and in the HIV-1 protein gp120. Usually, the fraction of such sites is much lower.
Positive selection in non-coding segments also appears to be relatively rare.
Simple explanation: Natural selection is, above all, a conservative force. Also, a large target for positive selection would imply substantial suboptimality of the phenotype, which may be lethal.
1c) Tightly related genes can perform rather different functions
Occasionally, a protein completely changes its function after only a few changes.

MATEGDKLLGGRFVGSTDPIMEILSSSISTEQRLTEVDIQASMAYAKALEKASILTKTELEKILSGLEKISEESSKGVLV
MA+EGDKL.GGRF.GSTDPIME+L+SSI+.+QRL+EVDIQ.SMAYAKALEKA.ILTKTELEKILSGLEKISEE..SKGVV
MASEGDKLWGGRFSGSTDPIMEMLNSSIACDQRLSEVDIQGSMAYAKALEKAGILTKTELEKILSGLEKISEEWSKGVFV

MTQSDEDIQTAIERRLKELIGDIAGKLQTGRSRNEQVLTDLKLLLKSSTSVISTHLLQLIKTLVERAAIEIDIIMPGYTH
+.QSDEDI.TA.ERRLKELIGDIAGKL.TGRSRN+QV+TDLKLLLKSS.SVISTHLLQLIKTLVERAA.EID+IMPGYTH
VKQSDEDIHTANERRLKELIGDIAGKLHTGRSRNDQVVTDLKLLLKSSISVISTHLLQLIKTLVERAATEIDVIMPGYTH

LQKALPIRWSQFLLSHAVALTRDSERLGEVKKRITVLPLGSGALAGNPLEIDRELLRSELDMTSITLNSIDAISERDFVV
LQKALPIRWSQFLLSHAVAL.RDSERLGEVKKR++VLPLGSGALAGNPLEIDRELLRSELD..SI+LNS+DAISERDFVV
LQKALPIRWSQFLLSHAVALIRDSERLGEVKKRMSVLPLGSGALAGNPLEIDRELLRSELDFASISLNSMDAISERDFVV

ELISVATLLMIHLSKLAEDLIIFSTTEFGFVTLFDAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAAILMVLKGIPST
EL+SVATLLMIHLSKLAEDLIIFSTTEFGFVTL.DAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAA+LMVLKG+PST
ELLSVATLLMIHLSKLAEDLIIFSTTEFGFVTLSDAYSTGSSLLPQKKNPDSLELIRSKAGRVFGRLAAVLMVLKGLPST

FSKDLQEDKEAVLDVVDTLTAVLQAATEVISTLQVNKENMEKALTPELLSTDLALYLVRKGMPIRQAQTASGKAVHLAET
++KDLQEDKEAV.DVVDTLTAVLQ.AT.VISTLQVNKENMEKALTPELLSTDLALYLVRKGMP.RQA..ASGKAVHLAET
YNKDLQEDKEAVFDVVDTLTAVLQVATGVISTLQVNKENMEKALTPELLSTDLALYLVRKGMPFRQAHVASGKAVHLAET

KGITINNLTLEDLKSISPLFASDVSQVFSVVNSVEQYTAVGGTAKAA
KGI.IN.LTLEDLKSISPLFASDVSQVF++VNSVEQYTAVGGTAK++
KGIAINKLTLEDLKSISPLFASDVSQVFNIVNSVEQYTAVGGTAKSS

Delta-crystallin (top) and argininosuccinate lyase (bottom), both of chicken, Gallus gallus.
Simple explanation: There are often many peaks on the fitness landscape, and the closest peak may not be far away from any point in phase space. A little editing replaces function A with function B.
Generalizations concerned with adaptation and complexity: 2. Phenotypic aspects of adaptive evolution
This is the most difficult, and the least understood, facet of evolutionary biology.
2a) Adaptations can be very general and very specific
A heart is a general adaptation, necessary for any large active organism.
The coloration of the viceroy would not be adaptive without the monarch.
Simple explanation: Why not?
2b) Evolution is irreversible
Reversals at individual simple traits are common, but a reversal of substantial evolution has never been observed.
Can a hermit crab abandon its dependence on gastropod shells? Yes, it can! King crabs originated from hermit crabs. Still, king crabs retained the asymmetrical abdomens of their hermit crab ancestors.
Another example: aquatic tetrapods still need air for breathing.
Simple explanation: Perhaps, fitness landscapes are too complex and variable to allow an exact return to the starting point of an evolutionary trajectory.
2c) Perhaps, all adaptations are imperfect
We have no good data to directly support this hypothesis, because measuring adaptation precisely is currently impossible. However, it seems very likely because:
i) Most fitness landscapes probably have multiple peaks, and the probability of reaching the highest peak, after climbing strictly up from a random starting point, is very low.
ii) Often, an apparently good adaptation is achieved by a slight modification of a phenotype that performed an unrelated function.
Generalizations concerned with adaptation and complexity: 3. Origin of novelties
The origin of new functions is a particularly intriguing aspect of adaptive evolution.
3a) New non-coding regulatory sites, but not new genes, often appear from scratch
Binding sites of the transcription factor Zeste are shown by boxes. The sites in the top two and bottom two species are located differently, indicating their gains and losses.
In contrast, the origin of a protein-coding gene from scratch (i. e., entirely from a non-coding sequence) is a very rare event - so far, only ~10 such occasions have been documented.
Simple explanation: A typical transcription factor-binding site is small enough to "condense from chaos". In contrast, a random segment very rarely encodes even a slightly useful protein.
3b) Origin of phenotypic novelties is usually opportunistic and can happen fast
Even a novel function tends to appear on the basis of a pre-existing adaptation.
MATEGDKLLGGRFVGSTDPIMEILSS
MA+EGDKL.GGRF.GSTDPIME+L+S
MASEGDKLWGGRFSGSTDPIMEMLNS
Origin of a crystallin from an enzyme.
Males of a toothed whale, the narwhal (Monodon monoceros), have a 2-3 m long tusk, which is an incisor tooth on the left side of the upper jaw.
Feathers of flightless dinosaurs allowed birds to evolve flight.
Skeletons of semiaquatic mammals transitional from land to sea in the origin of whales.
Making a whale from a "protohippo" took ~15 My. Making a human from an ape took ~5 My.
Simple explanation: Evolution is not apt to perform long jumps in the space of phenotypes, and works with preexisting material. Apparently, under strong positive selection evolution can be fast.
Generalizations concerned with adaptation and complexity: 4. Dynamics of complexity
Here, generalizations are almost all we have.
4a) Complex phenotypes evolve through adaptive intermediate stages
Simple explanation: Evolution cannot do it any other way, but we still do not really know how this happens.
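Returning to the "condense from chaos" argument of 3a, a back-of-envelope calculation can make the contrast between a binding site and a protein more tangible. The sketch below is only an illustration under loudly stated assumptions that are not part of the lecture: all four nucleotides and all twenty amino acids are equally frequent, a binding site is 8 nucleotides long, a regulatory region is 1000 nucleotides long, and a small protein is 100 amino acids long. The function names are hypothetical. Note also that matching one fixed protein sequence is a far stricter requirement than being "slightly useful", so the last number is an extreme bound, not an estimate of how often useful proteins arise.

```python
# Back-of-envelope sketch; every numeric input below is an illustrative assumption.


def expected_exact_sites(site_len: int, region_len: int) -> float:
    """Expected number of exact copies of one fixed site in a random DNA region."""
    positions = max(region_len - site_len + 1, 0)  # number of windows of length site_len
    return positions * 0.25 ** site_len


def expected_near_sites(site_len: int, region_len: int) -> float:
    """Expected number of windows that differ from the site at no more than one position.

    There are 1 + 3 * site_len sequences within one mismatch of a fixed site.
    """
    positions = max(region_len - site_len + 1, 0)
    return positions * (1 + 3 * site_len) * 0.25 ** site_len


def prob_exact_protein(protein_len: int) -> float:
    """Probability that a random peptide coincides with one fixed protein sequence."""
    return (1 / 20) ** protein_len


SITE_LEN = 8        # assumed length of a binding site, nucleotides
REGION_LEN = 1000   # assumed length of a regulatory region, nucleotides
PROTEIN_LEN = 100   # assumed length of a small protein, amino acids

print("exact sites per region:", expected_exact_sites(SITE_LEN, REGION_LEN))
print("sites one mutation away:", expected_near_sites(SITE_LEN, REGION_LEN))
print("chance of one fixed protein:", prob_exact_protein(PROTEIN_LEN))
```

Under these assumptions, roughly one random region in 65 already carries an exact site, and more than a third carry a sequence that a single point mutation could turn into the site, whereas the chance of hitting one particular 100-residue protein is vanishingly small.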
4b) Complexity is rapidly lost, if selection stops maintaining it
Mycobacterium leprae is in the middle of massive genome degeneration.
A parasitic plant, Epifagus virginiana, lost many key genes in its chloroplast genome.
Astyanax mexicana, like many other cave animals, has degenerated eyes.
A crustacean parasite of fish, Lernaea carassii, has profoundly simplified morphology.
Simple explanation: To demolish is easier than to build.
4c) The overall trend is for complexity to increase
There is no law of nature that would force complexity to always increase - because it can easily decline. Still, the overall trend is up.
Simple explanation: Initially, the complexity was low, so the only direction for it to change was up. This is not the whole story, of course.
Epilogue for generalizations regarding past evolution: was evolution a purely natural phenomenon and does it matter?
We know that modern life is a product of evolution - simple creationism is refuted by evidence for past evolution. However, such evidence does not necessarily refute the claim that a Supernatural Power somehow guided evolution. Do we have any reasons to think this was the case (of course, here we need strong reasons, due to Occam's razor)?
One such reason may be an apparent inability of evolution to produce complex adaptations - an argument going back to Darwin. However, here we are on shaky ground: our theoretical understanding of phenotypic evolution remains so poor that we simply cannot say a priori what is impossible and what is possible, or discern Supernatural guidance in the course of past evolution of complex phenotypes.
It is better to address this issue at the much better understood level of sequences. Because phenotype is mostly determined by genotype (we just do not know how exactly), in order to guide evolution a Supernatural Power would have to guide the evolution of genomes. Do we see any traces of this? The answer (so far) is "No".
Let us consider the last 5 My of human evolution, because this short episode in the history of life is of special interest for any anthropocentric religion. We see no traces of supernatural intervention in changes accepted by our lineage after human-chimpanzee divergence - only substitutions, deletions, and insertions (duplications), all suppliable by natural mutation - and no new pieces of DNA that look as if they came from Heaven. Thus, the null hypothesis of purely natural evolution of humans from apes must be kept.
Still, to prove that something is absent is essentially impossible. Thus, if you believe (for any reason) that evolution of humans from apes was supernaturally guided, study human-chimpanzee-orangutan alignments. I am not optimistic - but if you find a change in the human lineage that cannot be explained naturally, this would be the most important discovery in the history of all natural (or supernatural?) sciences.
What to search for in such an analysis? An obvious trace of a supernatural intervention would be a long meaningful sequence that appeared without a plausible source in the human lineage, something like this:
Homo sapiens     nlpirqrtgillygppsinnersrepentgtgktllagviaresrmnfi
Pan troglodytes  nipirqrtgillygpp-------------gtgktllagvivresrmnfi
Pongo pygmaeus   nlpirqrtgillygpp-------------gtgktllagviaresrmnyi
If no such sequences are found, weaker evidence would be unexplained differences between overall patterns of sequence evolution in the human and chimpanzee lineages. However, as long as no overt traces of supernatural intervention are evident, we have to assume that such intervention did not happen.
Does this all really matter? I am (to the best of my knowledge) 1/2 Jewish, 23/64 Russian, 1/8 Latvian, and 1/64 French.
OK, the Jewish and Russian components (my father and mother) are to some extent important - but do I really care that my great-great-great-great-grandfather (named Laurent) was a soldier of Napoleon, captured (according to our family legend) by my great-great-great-great-grandmother after Kutuzov destroyed the invading Great Army? And this happened less than 200 years ago - so why do people care about their great-great-...-great-grandparents being apes (or worms, or protozoans)?
A human being does not "evolve from apes", but develops from the zygote, and the slow evolution of our remote ancestors is only marginally relevant to the mystery of the origin of the newborn (or, occasionally, of identical twins) 40 weeks after a fusion of two gametes. Human nature emerges, again and again, in the course of individual ontogeneses, instead of appearing just once during the evolution of Homo sapiens. The lack of traces of any overt, proximal Supernatural involvement in human evolution does not imply that an individual human being, with their mind, consciousness, and, according to the views of some, immortal soul and free will, is a purely natural phenomenon.
Also, instead of proximal causes, one can contemplate philosophical issues. If the Material Universe is inherently stochastic, we are free to attribute an apparently random mutation, or any other event, to the will of Providence. Moreover, the evolutionary origin of humans required a lot of conditions, from the right values of physical constants to the suitable distance between the Earth and the Sun and the timely meteorite strike which cleared the Earth of dinosaurs. Can we interpret these conditions as a work of Providence? This is a possibility, although an alternative, called the Anthropic Principle, also exists - the conditions were right for our origin because otherwise we would not be here to ponder such questions.
However, philosophical questions, such that neither answer can ever be proven on the basis of laws of nature, do not belong to the domain of natural sciences.
Quiz: Imagine that for every amino acid sequence we completely know the properties of the corresponding protein. Consider new insights that can be provided by this knowledge for any three evolutionary generalizations.
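For readers who want to try the alignment search suggested in the epilogue, here is a minimal sketch of its first step: finding segments that are present in one lineage and absent (gapped) in all the others. It assumes a toy in-memory alignment with "-" as the gap symbol; the function name and the minimum-length threshold are hypothetical choices made for illustration, and the example data are the invented sequences from the epilogue, not real genomic data. Deciding whether such an insertion has a plausible natural source (a duplication, a transposable element, and so on) is a separate and much harder step that this sketch does not attempt.

```python
from typing import Dict, List, Tuple


def lineage_specific_insertions(
    alignment: Dict[str, str], focal: str, min_len: int = 5
) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) for runs present in `focal` and gapped in all others.

    Coordinates are 0-based alignment columns, with `end` exclusive.
    """
    focal_seq = alignment[focal]
    others = [seq for name, seq in alignment.items() if name != focal]
    if not others:
        raise ValueError("need at least one other sequence to compare with")
    if any(len(seq) != len(focal_seq) for seq in others):
        raise ValueError("all aligned sequences must have the same length")

    hits: List[Tuple[int, int, str]] = []
    start = None  # column where the current candidate run began, if any
    for i, residue in enumerate(focal_seq):
        inserted = residue != "-" and all(seq[i] == "-" for seq in others)
        if inserted and start is None:
            start = i
        elif not inserted and start is not None:
            if i - start >= min_len:
                hits.append((start, i, focal_seq[start:i]))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(focal_seq) - start >= min_len:
        hits.append((start, len(focal_seq), focal_seq[start:]))
    return hits


# The invented example from the epilogue (not real data).
EXAMPLE = {
    "Homo sapiens": "nlpirqrtgillygpp" + "sinnersrepent" + "gtgktllagviaresrmnfi",
    "Pan troglodytes": "nipirqrtgillygpp" + "-" * 13 + "gtgktllagvivresrmnfi",
    "Pongo pygmaeus": "nlpirqrtgillygpp" + "-" * 13 + "gtgktllagviaresrmnyi",
}

for start, end, text in lineage_specific_insertions(EXAMPLE, "Homo sapiens"):
    print(start, end, text)
```

Using the orangutan as an outgroup is what lets a gap shared by chimpanzee and orangutan be read as an insertion in the human lineage rather than as two independent deletions.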