Topic 2. Lectures 3-4: Evidence for Past Evolution - Reasoning

So, you are asked to believe in the Weak Claim (anagenesis, profound changes of ancestors of modern species) and in the Strong Claim (cladogenesis, common ancestry of modern species). No one should accept such weird claims without good reasons. Indeed, how did modern species come into being? What is the truth: a) no evolution, or b) Weak Claim only, or c) both Weak and Strong Claims?

Three log tips, protruding from a muddy, swollen river - what is below the surface?

Philosophical prologue I: Can we have any evidence for past evolution?

We cannot directly observe past events, let alone affect them by experiments. So, are cosmology, geology, evolutionary biology, or history sciences? Did Napoleon Bonaparte or Ronald Reagan really exist? Because this last question is stupid, there must be some ways to study the past.

To study the past, we must:
First, believe that the Universe is not entirely chaotic, or free to do whatever it wishes. Instead, there are laws of nature, restrictions on what could happen.
Second, believe in "Uniformitarianism", a premise that in the past the laws of nature were more or less the same as today. If so, "the present is the key to the past".

If we accept these two postulates, we are in business. For example, a thing we call "T. rex skeleton" must really be a remnant of a huge beast - it could not be produced abiogenically. Analogously, we must conclude that the countless pieces of evidence for the existence of Napoleon Bonaparte could not all have been faked in one global conspiracy. Why "historical revisionism", applied both to the history of life and of humanity, attracts so many people is an interesting question which we will ignore.

Evidence for past objects and events (and persons) can be of 3 general kinds:

1) Direct messages from the past - if, at some moment, an object "froze" and stopped changing, it can give us a direct window into the past.

A message from the past: Tyrannosaurus rex skeleton.
Such structures do not naturally condense from clay these days, and Uniformitarianism implies that this was never possible. Thus, a T. rex skeleton tells us a lot about a living being that lived in the past. Of course, a T. rex skeleton is not in itself evidence for past evolution.

Another message from the past: Nag Hammadi library. In 1945, twelve papyrus codices buried in a sealed jar were found near Nag Hammadi by a peasant named Mohammed Ali. The codices include the Gospel of Thomas and are believed to be a library hidden by monks from the nearby monastery of St. Pachomius around year 390.

2) Sometimes, we know the laws of change precisely and can "play time back". For example, knowing the concentrations of 40K (P - parent) and 40Ar (D - daughter) isotopes within a crystal, we can play back the dynamics of radioactive decay of 40K into 40Ar (because the decay constant, λ, can be measured today) and, thus, calculate the age of the crystal t:

$$\frac{dP}{dt} = -\lambda P$$

$$P(t) = P_0 e^{-\lambda t}$$

$$D(t) = P_0 \left(1 - e^{-\lambda t}\right)$$

$$\frac{D(t)}{P(t)} = \frac{P_0 \left(1 - e^{-\lambda t}\right)}{P_0 e^{-\lambda t}} = e^{\lambda t} - 1$$

$$t = \frac{1}{\lambda} \ln\left(1 + \frac{D}{P}\right)$$

$$t_{1/2} = \frac{\ln 2}{\lambda}$$

In particular, if the number of radioactive parent atoms is equal to the number of stable daughter atoms (D/P = 1), the age of the crystal is t_{1/2} years, where t_{1/2} is the half-decay time. Of course, life is too complex and too stochastic for this approach.

3) In more complex cases, we have to resort to the "hypothetico-deductive" method: i) observe something, ii) try to imagine all possible scenarios of how this something could appear, iii) compare their implications to what you see.

Application of the hypothetico-deductive method by our ancestors: these footprints were left by a mammoth!

A more complex application of the hypothetico-deductive method: there is now a general agreement that big craters on Mars, Moon, and Earth (and in many other places) were left by impacts of celestial bodies.

The same hypothetico-deductive method is the only way to learn something about the core of the Earth.
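The "play time back" calculation for the 40K/40Ar pair described under 2) can be sketched numerically. This is only an illustration of the simplified formula t = (1/λ) ln(1 + D/P) from the text: the half-life value of about 1.25 billion years and the function names are assumptions made for this sketch, and real potassium-argon dating also has to account for the fact that only a fraction of 40K decays produce 40Ar.

```python
import math

# Assumed value for illustration only: half-life of 40K, roughly 1.25 billion years.
HALF_LIFE_40K_YEARS = 1.25e9


def decay_constant(half_life):
    """Decay constant lambda = ln 2 / t_half."""
    return math.log(2) / half_life


def age_from_ratio(daughter_to_parent, half_life=HALF_LIFE_40K_YEARS):
    """Age t = (1 / lambda) * ln(1 + D / P), the simplified formula from the text."""
    if daughter_to_parent < 0:
        raise ValueError("the D/P ratio cannot be negative")
    return math.log1p(daughter_to_parent) / decay_constant(half_life)


if __name__ == "__main__":
    for ratio in (0.1, 1.0, 3.0):
        print(f"D/P = {ratio}: age = {age_from_ratio(ratio):.3e} years")
```

With D/P = 1 the function returns one half-life, and with D/P = 3 it returns two half-lives, as the formula requires.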
The very boundary between past and present can be blurred: tonight, we observe a ten-billion-year-old galaxy. Thus, we can study the past without making any apologies.

Philosophical prologue II: Do we need any evidence for past evolution?

Evolution is the only feasible natural explanation for the very existence of modern life. Can we reject supernatural alternatives to evolution without consideration and, thus, accept past natural evolution "by default"? Indeed, in natural sciences, as well as in criminal justice, the existence of a natural explanation for any phenomenon is always accepted as the null hypothesis.

The birth of Aphrodite from sea waves - the only alternative to evolution.

A hopeless criminal defense.

When DNA evidence exonerates a person, found guilty "beyond reasonable doubt", after a "fair trial", this innocent person is released, without any opposition.

Kirk Noble Bloodsworth, the first U.S. death row prisoner exonerated by DNA, in 1993.

Occam's razor: use only the minimal set of fundamental principles to explain your facts, and do not invoke unnecessary principles.

However, the case of past evolution of life is more difficult. Indeed, we know that a mismatch between DNA from the crime scene and from a convicted person can appear naturally, if the conviction was wrong. In contrast, we are not sure a priori that evolutionary origin of modern life is possible. Remember - "To suppose that the eye ... could have been formed by natural selection, seems, I freely confess, absurd in the highest possible degree" (Darwin). Thus, it is prudent not to accept past natural evolution by default, but to seek evidence for - or against - it.

Here we encounter another unusual situation: normally, natural sciences compare alternative hypotheses.

Geocentric or Heliocentric?

However, there is no natural alternative to evolution - and natural sciences do not tell us what to expect from a supernatural origin of life.
Thus, the only thing we can really do is to compare the implications of a hypothesis of past natural evolution to the data. So, what do we and do we not expect to see if modern life is a product of past evolution?

Direct and Indirect evidence for past evolution

Direct evidence is provided by fossils - messages from the past. Indirect evidence is provided by features of modern organisms - after application of the hypothetico-deductive method. Direct and indirect evidence complement each other nicely.

Fossils inform us about:
1) some morphological traits of past organisms - but usually not about their colorations or genotypes.
2) existence of organisms that left no living descendants, such as mammoths or Paranthropus.

Modern organisms inform us about:
1) genomes of their ancestors.
2) branching order in the Tree of Life.

Indirect evidence for past evolution stems from transmission of genetic information from parents to offspring, a process unique to life. I prefer to start from indirect evidence, because considering it is the best way to introduce evolutionary thinking. So, for now we will ignore fossils.

Indirect evidence for past evolution

Connectedness and designability of modern species: great indirect evidence we do not have. Indeed, if modern species are products of evolution from one common ancestor (LUCA, last universal common ancestor), they all must be connected in the space of genotypes by continuous series of fit genotypes. Also, each species must be connected by such a series to something VERY SIMPLE. Trouble is, we do not know if this is the case. So, we have to seek less ambitious evidence, based only on "local" properties of modern species.

Let us start from a paradox and two examples.

Paradox: perfect adaptations of modern organisms are not evidence of past evolution of their ancestral lineages. Darwin proposed that adaptations are the result of evolution driven by natural selection.
Thus, one may think that perfect adaptations of modern species are the best indirect evidence for Darwinian evolution of their ancestors. NO!

Bodies of sharks are very close to perfection.

A perfect adaptation of a modern organism does not give us any reason to think that its ancestors were different from it - thus, no evidence for evolution emerges. Instead, past evolution is evident primarily from those features of modern organisms that do not contribute to their fitness.

Examples:

1) Real evidence for past evolution: suboptimal phenotype of a modern species.

Blind cave fish Astyanax mexicanus with vestigial, useless eyes.

2) Real evidence for past evolution: similarity between different modern species not forced by similarity of their adaptations.

H.s. cagctcaccatggatgatgatatcaccgcgctcgtcattgacaacggctc
     |||||||||||||||||||||||||||||||||||||| |||||||||||
P.t. cagctcaccatggatgatgatatcaccgcgctcgtcatcgacaacggctc

H.s. cggcatgtgcaaggccagcttcacgggcgacaatgccgcccgggcagtct
     |||||||||||||||| |||||||||||||| ||||| ||||||||||||
P.t. cggcatgtgcaaggccggcttcacgggcgacgatgccacccgggcagtct

H.s. tcccctccatcgttgggcaccccaggcaccagggcgtgatggtgggcatg
     |||||||||| |||||||||||||||||||||||||||||||||||||||
P.t. tcccctccattgttgggcaccccaggcaccagggcgtgatggtgggcatg

H.s. ggtcagaaggattcctatgtgggcgacgaggcccagagcaagagaggcat
     ||||||||||||||||||||||||||||||||||||||||||||||||||
P.t. ggtcagaaggattcctatgtgggcgacgaggcccagagcaagagaggcat

Partial alignment of human (Homo sapiens) and chimpanzee (Pan troglodytes) useless genome segments called beta actin processed pseudogenes. The two pseudogenes are 98.8% identical and are flanked by the same genes.

So, why does evidence for evolution emerge from bad, or at least harmless but useless, phenotypes - and not from perfect adaptations? Because we have no natural alternative to evolution. Thus, no evidence for evolution may emerge from what evolution CAN do, but its alternative (absent!) CANNOT.
Instead, evidence for evolution primarily emerges from what evolution CANNOT do. Fortunately, evolution is very far from being omnipotent. Slow, gradual evolution (if it occurred!) CANNOT always produce optimal phenotypes, and CANNOT completely erase traces of history from evolving phenotypes. Thus, we expect evolution to produce suboptimal phenotypes and useless similarities between phenotypes (and some other useless patterns) - and if reality matches these expectations, we have indirect evidence for past evolution.

In other words, as far as evidence for past evolution is concerned, the slow, gradual nature of evolution - the only natural possibility - is more important than its mechanism, natural selection that increases adaptation. Darwin was fully aware of this paradox.

Evolution naturally leads to suboptimality (purple) if 1) a lineage is still climbing or 2) a lineage is trapped on a low peak.

Evolution naturally leads to unforced similarity, or homology (shades of green), if multiple lineages are all initially located in the domain of attraction of the same peak.

Fortunately, we often can say something, with reasonable confidence, about fitnesses of currently existing species, and this may be enough.

3 degrees of optimality of a phenotype: suboptimal (a), non-uniquely optimal (b), and uniquely optimal (c) phenotypes.

1. Slow, gradual, and greedy evolution must be prone to produce species with suboptimal phenotypes. Thus, when we see suboptimality in a modern species, this is evidence for the Weak Claim for it.

2. Slow, gradual, and greedy evolution must be prone to retaining similar non-uniquely optimal (or even suboptimal) phenotypes in all species which originated from the common ancestor. Thus, when we see similar non-uniquely optimal (or even suboptimal) phenotypes in several modern species, this is evidence for the Strong Claim for them.

Unforced similarity (similarity too deep to be explained by common adaptations) is called homology. In particular, all similarities of functionless features, such as junk DNA, are homologous.

In contrast, no evidence for evolution emerges from unique optimality - in one or even in many species. The shared optimal body shape of sharks, ichthyosaurs, and dolphins - or the shared optimal hexagonal honeycombs of different bees and wasps - does not imply that their ancestors were different, or that they shared a common ancestor.

In addition to 1) suboptimality and 2) homology, indirect evidence for past evolution can emerge from: 3) hierarchical distributions of trait states, 4) patterns in ranges of modern species not explainable by their adaptations, 5) agreement between what we see and some simple evolutionary scenario, 6) agreement between what we see and a partial theory of Macroevolution. Let us analyze these 6 possible kinds of evidence in some detail.

1) Suboptimality

Suboptimality can be of two kinds:
easy-to-improve, if given more time
hard-to-improve - trapped on a low peak

To imply past evolution, a suboptimality must be unconditional: a polar bear suffering from heat in a zoo is not evidence for evolution.

Vestigial eyes probably are an easy-to-improve suboptimality: some cave animals do not have any eyes.

The contorted morphology of adult flatfishes is a suboptimal, and perhaps hard-to-improve, adaptation to bottom-dwelling. Young flatfishes, which are pelagic, have bilateral symmetry. Stingrays, which adapted to bottom-dwelling differently, are always bilateral.

2) Unforced similarity, or homology

"Simia quam similis turpissima bestia nobis" (The monkey, how similar that most ugly beast is to us!). Quintus Ennius (239–169 BCE)

However, evolution was really discovered only 2000 years later - why?
Because similarity between human and chimpanzee eyes, or between genes, does not necessarily provide any evidence for our common ancestry. In contrast, similarity between human and chimpanzee PSEUDOgenes is evidence for our common ancestry, because it is obviously unforced.

Even similarity between functional phenotypes can be homologous, if we believe that the same function can be performed in many different ways. Homologous phenotypes can also be analogous, that is, perform the same function. To claim homology, we must show that there are many other ways of performing this function.

Two enzymes of totally different overall structures that perform exactly the same function - they both are inorganic pyrophosphatases - only their active centers are (necessarily) similar.

Still, we may have a stronger case for homology if similar phenotypes are not analogous, i.e. perform different functions.

Limbs of vertebrates are only partially analogous, and similarity of their functions can hardly explain similarity in arrangements of bones inside them.

Two kinds of non-unique optimality (figure labels: continuous set of "optimal" phenotypes; isolated "optimal" phenotypes; junk sequences). In both cases, possession of the same phenotype by many species is evidence for the Strong Claim for them.

Shared suboptimality is also a homology:

Three species from over 400 flatfishes (order Pleuronectiformes): Pleuronectes platessa (left), Psetta maxima (center), Citharichthys sordidus (right).

Homology is pervasive not only between species, but also between different parts of the same genotype or phenotype.
cagctcaccatggatgatgatatcaccgcgctcgtcattgacaacggctc
|||||||||||||||||||||||| ||||||||||| | |||||||||||
cagctcaccatggatgatgatatcgccgcgctcgtcgtcgacaacggctc

cggcatgtgcaaggccagcttcacgggcgacaatgccgcccgggcagtct
|||||||||||||||| ||||| |||||||| ||||| ||||||| ||||
cggcatgtgcaaggccggcttcgcgggcgacgatgccccccgggccgtct

tcccctccatcgttgggcaccccaggcaccag------------------
||||||||||||| |||| |||||||||||||
tcccctccatcgtggggcgccccaggcaccaggtaggggagctggctggg

--------------------------------------------------

tggggcagccccgggagcgggcgggaggcaagggcgctttctctgcacag

--------------------------------------------------

gagcctcccggtttccggggtgggggctgcgcccgtgctcagggcttctt

----------------ggcgtgatggtgggcatgggtcagaaggattcct
                ||||||||||||||||||||||||||||||||||
gtcctttccttcccagggcgtgatggtgggcatgggtcagaaggattcct

atgtgggcgacgaggcccagagcaagagaggcat
||||||||||||||||||||||||||||||||||
atgtgggcgacgaggcccagagcaagagaggcat

A fragment of human beta actin processed pseudogene (top), aligned with the corresponding region of human beta actin gene (bottom). The pseudogene misses all introns (one of them is shown in green), indicating its origin through insertion of the DNA sequence produced by reverse transcription of mature mRNA.

Homology of different parts of the genome - such as alpha- and beta-hemoglobin genes, or a gene and the corresponding pseudogene(s) - is evidence for their common ancestry, and, thus, for the Weak Claim for the whole species. In contrast, homology of parts of the phenotype at the above-genome level does not immediately produce any evidence for past evolution.

Why such a contrast? Parts of the genome are transmitted from generation to generation, and their homology suggests common ancestry. In contrast, above-sequence phenotypes develop every generation, and there is no common ancestry of legs and arms.

Homology is the most important and pervasive kind of indirect evidence for past evolution. Still, this is not the end of the story - but enough for today.
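The sequence similarity behind the homology argument can be quantified as percent identity over aligned columns. The sketch below does this for the 200-column fragment of the human (H.s.) and chimpanzee (P.t.) beta actin pseudogene alignment shown earlier in this lecture; the function names are hypothetical. Note that this short fragment happens to contain 5 mismatches, so its identity (97.5%) differs somewhat from the 98.8% quoted in the text for the two pseudogenes as a whole.

```python
# Rows of the H.s. / P.t. pseudogene alignment fragment, copied from the lecture.
HS = (
    "cagctcaccatggatgatgatatcaccgcgctcgtcattgacaacggctc"
    "cggcatgtgcaaggccagcttcacgggcgacaatgccgcccgggcagtct"
    "tcccctccatcgttgggcaccccaggcaccagggcgtgatggtgggcatg"
    "ggtcagaaggattcctatgtgggcgacgaggcccagagcaagagaggcat"
)
PT = (
    "cagctcaccatggatgatgatatcaccgcgctcgtcatcgacaacggctc"
    "cggcatgtgcaaggccggcttcacgggcgacgatgccacccgggcagtct"
    "tcccctccattgttgggcaccccaggcaccagggcgtgatggtgggcatg"
    "ggtcagaaggattcctatgtgggcgacgaggcccagagcaagagaggcat"
)


def aligned_columns(seq_a, seq_b):
    """Return pairs of aligned characters, skipping columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]


def mismatches(seq_a, seq_b):
    """Number of gap-free columns where the two sequences differ."""
    return sum(1 for a, b in aligned_columns(seq_a, seq_b) if a != b)


def percent_identity(seq_a, seq_b):
    """Percentage of gap-free columns where the two sequences agree."""
    columns = aligned_columns(seq_a, seq_b)
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(mismatches(HS, PT), "mismatches,", percent_identity(HS, PT), "% identity")
```

Gap columns are ignored here; whether gaps should count as differences is a design choice that depends on the question being asked.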
Quiz: Explain, in terms of fitness landscapes, why one can expect slow, greedy evolution to lead to suboptimal phenotypes and to preserve homologies. Hints: 1) Before you start answering, make sure you understand this difficult question, 2) Discuss your possible answer with your peers, 3) Seek advice if needed!

3) Hierarchical distributions of trait states

Let us now move beyond homology, and consider only traits that are not invariant within a set of modern species. A joint distribution of such traits can be evidence for the Strong Claim for the set, if this distribution is hierarchical. This idea, like most of what is covered in this lecture, goes back to Darwin.

Let us explain what a hierarchical distribution is. This is the most difficult part of the story of indirect evidence for past evolution. Informally, multiple traits are distributed hierarchically if some states of one trait only occur together with some states of other traits.

Among vertebrates, placenta is "nested" within bearing live young.

Among insects, complete metamorphosis is nested within having wings.

A more complex example of a joint distribution of many variable traits:

Species \ Traits         7   9   12  33  34  42  57  79  116
Homo sapiens             E   K   V   L   V   F   G   L   A
Monodelphis domestica    E   K   I   L   V   F   G   L   G
Gallus gallus            E   K   I   L   I   F   G   L   A
Rana catesbeiana         E   K   I   F   I   Y   G   L   G
Hynobius retardatus      E   K   I   L   I   Y   A   L   A
Salmo salar              A   K   I   L   I   Y   G   M   A
Danio rerio              A   R   I   L   I   Y   G   M   A

A matrix of traits presenting phenotypes of 7 species, each consisting of 9 traits. Each trait characterizes a position in the alignment of beta globins, and the trait state is the amino acid that occupies this position. Only binary traits, with just two states within the set of species, were chosen. The species are human, gray short-tailed opossum, chicken, North American bullfrog, Hokkaido salamander, Atlantic salmon, and zebrafish.

Venn diagram for the same data.
For each trait, species with the trait state which was shown in red in the previous picture are enclosed by the colored line. The joint distribution would be hierarchical, if not for two conflicts, between traits 116 and 34, and between traits 116 and 42. In both these pairs of traits, all 4 possible combinations of their states are present. All other pairs of traits are "nested".

Now, we are ready for a more formal treatment:

Definition 1: Two binary traits, each with states 0 and 1, are said to be in conflict, within a set of species, if and only if each of the 4 possible combinations (00, 01, 10, and 11) of the trait states is present in at least one species.

Definition 2: A joint distribution of two or more binary traits is called hierarchical if and only if no pair of these traits is in conflict, i.e. no more than 3 combinations of states of the two traits are present within the set of species.

Two "poor" hierarchies (really the same). Four "rich" hierarchies (really the same). The only possible conflict.

OK, but why do we care about hierarchical distributions of traits? Because we expect evolution, if it occurred, to produce hierarchical distributions of traits within a set of species which evolved from the common ancestor.

Let us call (hypothetical) evolution divergent, if every evolutionary event (a change of the state of a trait) produces a new trait state, never present before in any other lineage derived from the common ancestor. All similarities between the species produced by divergent evolution must be inherited from their common ancestor. If evolution was sufficiently slow, we can expect it to also be divergent.

Theorem: divergent evolution of a set of species from the common ancestor can only lead to a hierarchical distribution of binary traits within the set.

This is the so-called Pairwise Compatibility Theorem, which is intuitively obvious and can be easily proven.
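Definitions 1 and 2 translate directly into a small check. The sketch below applies them to the beta globin matrix given above (7 species, 9 binary traits); the variable and function names are hypothetical, and each species is written as a string of its 9 trait states in the order of the listed alignment positions.

```python
from itertools import combinations

# Alignment positions (traits) and trait states, copied from the matrix in the lecture.
POSITIONS = [7, 9, 12, 33, 34, 42, 57, 79, 116]
SPECIES = {
    "Homo sapiens":          "EKVLVFGLA",
    "Monodelphis domestica": "EKILVFGLG",
    "Gallus gallus":         "EKILIFGLA",
    "Rana catesbeiana":      "EKIFIYGLG",
    "Hynobius retardatus":   "EKILIYALA",
    "Salmo salar":           "AKILIYGMA",
    "Danio rerio":           "ARILIYGMA",
}


def trait_columns(species):
    """Map each alignment position to the list of its states across species."""
    rows = list(species.values())
    return {pos: [row[i] for row in rows] for i, pos in enumerate(POSITIONS)}


def in_conflict(states_a, states_b):
    """Definition 1: two binary traits conflict if all 4 state combinations occur."""
    return len(set(zip(states_a, states_b))) == 4


def find_conflicts(species):
    """Return all conflicting pairs of traits as (position, position) tuples."""
    cols = trait_columns(species)
    return [(a, b) for a, b in combinations(POSITIONS, 2)
            if in_conflict(cols[a], cols[b])]


def is_hierarchical(species):
    """Definition 2: a distribution is hierarchical if no pair of traits conflicts."""
    return not find_conflicts(species)


if __name__ == "__main__":
    print(find_conflicts(SPECIES))   # pairs of positions that are in conflict
    print(is_hierarchical(SPECIES))
```

Run on this matrix, the check reports exactly the two conflicts named in the text, between positions 34 and 116 and between positions 42 and 116.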
Try to invent a course of exclusively divergent evolution from a common ancestor that would lead to a conflict between two binary traits - this will not work, but can hint at the idea for a formal proof. We must assume that the same event happened twice, in order to obtain a conflict!

In contrast, any hierarchical, conflictless distribution of traits can be obtained after divergent evolution from a common ancestor. For example:

Origin of a poor hierarchy, 01 and 10.

Origin of a rich hierarchy, 00, 10, and 11.

3 feasible modes of non-divergent evolution: reversal (left), parallel (center), and convergent (right) evolution. The common ancestor of the 4 species always had state A of the only trait under consideration. Evolutionary events are shown as short horizontal lines on the phylogenetic trees. Collectively, independent origins of the same trait state are known as homoplasy. This figure only illustrates these concepts - we do not address the issue of how we know that the set of 4 species originated from the common ancestor or how their exact phylogeny was ascertained. Obviously, convergence is impossible for a binary trait, and requires at least 3 possible states.

Thus, a hierarchical distribution of multiple variable traits within a set of modern species provides indirect evidence for their common ancestry because this is what we EXPECT to see after divergent evolution.

Of course, we do not always observe hierarchical distributions, and do not expect evolution to be exclusively divergent. In particular, many conflicts are observed at the level of nucleotide sites, which is not surprising, because there are only 4 nucleotides (A, T, G, C), and homoplasy must be common, if evolution occurred.

Moreover, to be evidence for past evolution, a hierarchical distribution must not be forced by low fitness of absent combinations of traits.

Not evidence for evolution. Evidence for evolution.
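The claim that exclusively divergent evolution cannot produce a conflict between two binary traits, while homoplasy can, may be explored on a small example. The sketch below is an illustration, not a proof: the five-species tree, the species labels, and all function names are hypothetical. An evolutionary event is identified with the branch on which it occurs (given as the set of species below that branch) and flips the trait state between 0 and 1, so a binary trait that evolved divergently has exactly one event on the whole tree.

```python
from itertools import combinations

# A hypothetical rooted tree of five species, written as nested tuples.
TREE = ((("A", "B"), "C"), ("D", "E"))


def leaves(tree):
    """List the species (leaves) below a node."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for subtree in tree for leaf in leaves(subtree)]


def branches(tree):
    """Every branch of the tree, identified by the set of species below it."""
    result = [frozenset(leaves(tree))]
    if not isinstance(tree, str):
        for subtree in tree:
            result.extend(branches(subtree))
    return result


def trait_states(tree, events):
    """Ancestral state is 0; each event flips the state in all species below it."""
    return {leaf: sum(leaf in branch for branch in events) % 2
            for leaf in leaves(tree)}


def in_conflict(trait_a, trait_b):
    """Two binary traits conflict if all 4 combinations of states are present."""
    return len({(trait_a[s], trait_b[s]) for s in trait_a}) == 4


def divergent_pairs_conflict(tree):
    """Check every pair of single-event (divergent) traits for a conflict."""
    single_event_traits = [trait_states(tree, [b]) for b in branches(tree)]
    return any(in_conflict(x, y) for x, y in combinations(single_event_traits, 2))


if __name__ == "__main__":
    print("conflict after divergent evolution:", divergent_pairs_conflict(TREE))
    trait_1 = trait_states(TREE, [frozenset("AB")])                    # one event
    parallel = trait_states(TREE, [frozenset("A"), frozenset("D")])    # same state arises twice
    reversal = trait_states(TREE, [frozenset("ABC"), frozenset("A")])  # state arises, then reverts
    print("conflict with a parallel origin:", in_conflict(trait_1, parallel))
    print("conflict with a reversal:", in_conflict(trait_1, reversal))
```

On this tree no pair of single-event traits conflicts, because the sets of species carrying the derived state are always nested or disjoint, whereas a parallel origin or a reversal is enough to produce all four combinations.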
In addition to providing evidence for past evolution, hierarchical distributions are also essential for reconstructing the course of past evolution. Thus, we will revisit them soon, when phylogenetic trees are considered.

4) Patterns in ranges of modern species, not explainable by their adaptations

This is another kind of evidence for past evolution. Here, Darwin's treatment is sufficient ("Origin of Species", Chapter 11):

In considering the distribution of organic beings over the face of the globe, the first great fact which strikes us is, that neither the similarity nor the dissimilarity of the inhabitants of various regions can be accounted for by their climatal and other physical conditions.

A second great fact which strikes us in our general review is, that barriers of any kind, or obstacles to free migration, are related in a close and important manner to the differences between the productions of various regions.

A third great fact, partly included in the foregoing statements, is the affinity of the productions of the same continent or sea, though the species themselves are distinct at different points and stations.

Summary: similarity of species from two points in space depends not on how similar the two environments are, but on how easy it is to migrate between them.

These patterns are evidence for past evolution, because this is what one can expect as a result of local, divergent evolution, with limited dispersal. Of course, these patterns are to some extent related to what we considered before.

The simplest geographical pattern which implies evolution is unforced similarity (homology) of ranges of similar species (Darwin's great fact 1).

Example: absence of placentals in Australia (until very recently). When introduced by humans, many placental mammals do very well in Australia - thus, the place is not inherently "placental-unfriendly".
Shared absence of placentals in Australia, not forced by their low fitness there, is evidence of their evolution from a common ancestor outside Australia. Shared presence of marsupials in Australia is also consistent with an evolutionary scenario of their origin, from the common ancestor, in Australia.

External similarity between the placental mole and the "marsupial mole" is not evidence for evolution - one has to be a "mole" to live underground. In contrast, similarity between the placental mole and the camel (both have a placenta and a lot of other traits, defining them as placentals), together with their shared absence in Australia, is evidence for their common ancestry.

Suboptimality of the range of just one species, although consistent with its localized origin, does not directly suggest past evolution of its ancestors.

5) Agreement between what we see and a simple evolutionary scenario

Let us move beyond the basic notion that past evolution, if any, was slow and gradual, with limited dispersal. When what we observe can be easily explained by a simple scenario that involves evolution, we have scenario-based evidence for past evolution.

Example: ancient whole-genome duplication (autopolyploidization). We know that WGD can happen, as there are very many polyploids, such as Sequoia sempervirens (probably, a hexaploid of AAAABB type). What can we expect if a WGD occurred a long time ago?

A scenario of evolution following whole-genome duplication (WGD). The box at the top shows a hypothetical genome region containing ten genes numbered 1-10. After WGD, the whole region is briefly present in two copies. However, many genes subsequently return to single-copy state. In this example, only genes 1, 6 and 10 remain duplicated. So, when we see something like the bottom box, we have scenario-based evidence for evolution.
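The WGD scenario above can be sketched as a toy simulation: duplicate a region of ten genes, then let most genes return to single-copy state by losing one of the two copies. Everything in the sketch is an assumption made for illustration: the retention probability, the function names, and the example arrangement of single-copy genes in the two regions (the text only states that genes 1, 6 and 10 remain duplicated).

```python
import random


def simulate_wgd(n_genes=10, keep_both_prob=0.3, seed=0):
    """Duplicate a region of n_genes, then lose one copy of most genes at random."""
    rng = random.Random(seed)
    region_a, region_b = [], []
    for gene in range(1, n_genes + 1):
        if rng.random() < keep_both_prob:
            region_a.append(gene)      # gene stays duplicated
            region_b.append(gene)
        elif rng.random() < 0.5:
            region_a.append(gene)      # copy in region B is lost
        else:
            region_b.append(gene)      # copy in region A is lost
    return region_a, region_b


def duplicated_genes(region_a, region_b):
    """Genes still present in both regions."""
    return sorted(set(region_a) & set(region_b))


def consistent_with_wgd(region_a, region_b):
    """Both regions keep the ancestral gene order and share at least one gene."""
    ordered = region_a == sorted(region_a) and region_b == sorted(region_b)
    return ordered and len(duplicated_genes(region_a, region_b)) > 0


# A hypothetical arrangement matching the text: only genes 1, 6 and 10 are duplicated.
EXAMPLE_A = [1, 2, 4, 6, 7, 10]
EXAMPLE_B = [1, 3, 5, 6, 8, 9, 10]

if __name__ == "__main__":
    print(duplicated_genes(EXAMPLE_A, EXAMPLE_B), consistent_with_wgd(EXAMPLE_A, EXAMPLE_B))
    print(simulate_wgd(seed=1))
```

The signature that such a scenario predicts is two regions that together contain every ancestral gene, in the same order, with a few genes present in both.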
6) Agreement between what we see and a partial theory of Macroevolution

If biology were a more advanced science, most evidence for past evolution would be of this kind. However, we have only an incomplete and fragmented theory of Macroevolution, so our ability to expect something specific (beyond simple scenarios) is rather limited. Still, we DO already have some useful theories.

Example: the Neutral theory of sequence evolution predicts that, within a functionless segment of DNA, changes of different kinds accumulate at rates proportional to the corresponding mutation rates.

Kind of change          Rates of human mutations,      Levels of human-chimpanzee
                        observed in patients today     divergence of pseudogenes
nonCpG transversions    0.53                           0.46
CpG transitions         15.4                           13.3
CpG transversions       1.5                            3.7
indels                  0.10                           0.19

A good agreement between patterns in human mutation and in human-chimpanzee divergence of junk DNA provides theory-based evidence for the Strong Claim for these species. All numbers are relative to nonCpG transitions.

The final question: How do we recognize indirect evidence for evolution in nature?

Natural phenotypes, unfortunately, do not come together with nice pictures of fitness landscapes. Indeed, poor understanding of function and adaptation is the Achilles' heel of any indirect evidence for evolution. In particular, it is hard to be 100% sure that a phenotype is suboptimal. Perhaps the most uncontroversial evidence is based on shared junk DNA - we believe that we understand genomes well enough to claim that (most of) what we regard as junk is, indeed, junk. We will address this issue when specific examples are considered.
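The comparison in section 6 between human mutation rates and human-chimpanzee pseudogene divergence can be made explicit with a few lines of arithmetic. The sketch below uses only the numbers from the table (all relative to nonCpG transitions, which therefore enter as 1.0); the function names and the two summary measures, rank order and fold difference, are choices made for this illustration rather than the method behind the original data.

```python
# (relative mutation rate, relative divergence level), copied from the table.
# nonCpG transitions are the reference class, so both of their values are 1.0.
RATES = {
    "nonCpG transitions":   (1.0, 1.0),
    "nonCpG transversions": (0.53, 0.46),
    "CpG transitions":      (15.4, 13.3),
    "CpG transversions":    (1.5, 3.7),
    "indels":               (0.10, 0.19),
}


def rank_order(index):
    """Kinds of change sorted from most to least frequent (0: mutation, 1: divergence)."""
    return sorted(RATES, key=lambda kind: RATES[kind][index], reverse=True)


def fold_differences():
    """For each kind of change, how many-fold mutation and divergence disagree."""
    result = {}
    for kind, (mutation, divergence) in RATES.items():
        ratio = divergence / mutation
        result[kind] = max(ratio, 1.0 / ratio)
    return result


if __name__ == "__main__":
    print("same rank order:", rank_order(0) == rank_order(1))
    for kind, fold in fold_differences().items():
        print(f"{kind}: {fold:.2f}-fold")
```

For these numbers the two rank orders coincide and the largest disagreement, for CpG transversions, is about 2.5-fold, while the rates themselves span more than two orders of magnitude.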
So, we introduced indirect evidence for past evolution of these 6 kinds: 1) suboptimality of the phenotype of one or many modern species, 2) homology, or unforced similarity, between different modern species, or between parts of their genomes, 3) (unforced) hierarchical distributions of states of variable traits within sets of modern species, 4) patterns in geographical distributions of modern species, not explainable by their adaptations, 5) agreement between what we see and a simple evolutionary scenario, 6) agreement between what we see and a partial theory of Macroevolution. Next time, we will consider examples of such evidence. Quiz: What are hierarchical distributions of traits and why are they viewed as indirect evidence for common ancestry? Present your own example of a hierarchical distribution. Seek advice if necessary - this is not an easy question!