Using DNA sequences
• Obtain sequence
• Align sequences, number of parsimony-informative sites
• Gap handling
• Picking sequences (order)
• Analyze sequences (similarity/parsimony/exhaustive/Bayesian)
• Analyze output: CI, HI, bootstrap/decay indices

Good chromatogram!

Bad chromatogram…
Reverse reaction suffers same problems in opposite direction

Pull-up (too much signal)

Loss of fidelity leads to slips, skips and mixed signals

Alignments (Se-Al)

Using DNA sequences
• Testing alternative trees: Kishino-Hasegawa
• Molecular clock
• Outgroup
• Spatial correlation (Mantel)
• Networks and coalescence approaches

Using DNA sequences
• Bootstrap: a branch separating two groups of microbial strains could be real, or it could be just one of several possible ways of depicting the microbial populations. Bootstrap tests how well the data support the branch: the dataset is resampled with replacement many times, and the bootstrap value is the percentage of resampled datasets in which the same branch is recovered
• BS value over 65 is OK, over 80 is good, under 60 is bad

From Garbelotto and Chapela, Evolution and biogeography of matsutakes

Biodiversity within species as significant as between species

Genetic analysis requires variation at loci, variation of markers (polymorphisms)
• How the variation is structured will tell us
– Does the microbe reproduce sexually or clonally
– Is infection primary or secondary
– Is contagion caused by local infectious spreaders or by long-distance moving spreaders
– How far can individuals move: how large are populations
– Is there inbreeding or are individuals freely outcrossing

CASE STUDY
• A stand of adjacent trees is infected by a disease: how can we determine the way trees are infected?
BY ANALYSING THE GENOTYPE OF THE MICROBES: if the genotype is the same then we have local secondary tree-to-tree contagion.
If all genotypes are different then primary infection caused by airborne spores is the likely cause of contagion.

CASE STUDY
• WE HAVE DETERMINED AIRBORNE SPORES (PRIMARY INFECTION) IS THE MOST COMMON FORM OF INFECTION
QUESTION: Are the infectious spores produced by a local spreader, or is there a general airborne population of spores that may come from far away?
HOW CAN WE ANSWER THIS QUESTION?

If spores are produced by a local spreader..
• Even if each tree is infected by different genotypes (each representing the result of meiosis, like us here in this class), these genotypes will be related
• HOW CAN WE DETERMINE IF THEY ARE RELATED?

HOW CAN WE DETERMINE IF THEY ARE RELATED?
• By using random genetic markers: if the spores come from a local spreader, the genetic similarity among the genotypes infecting adjacent trees will be high
• If all spores are generated by one individual
– They should have the same mitochondrial genome
– They should have one of two mating alleles

WE DETERMINE INFECTIOUS SPORES ARE NOT RELATED
• QUESTION: HOW FAR ARE THEY COMING FROM? ….or……
• HOW LARGE IS A POPULATION?
Very important question: if we decide to wipe out an infectious disease, we need to treat at least the area corresponding to the size of the population; otherwise we will achieve no result.

HOW TO DETERMINE WHETHER DIFFERENT SITES BELONG TO THE SAME POP OR NOT?
• Sample the sites and run the genetic markers
• If sites are very different:
– All individuals from each site will be in their own exclusive clade; if two sites are in the same clade, maybe those two populations actually are linked (within reach)
– In AMOVA analysis, the amount of genetic variance among populations will be significant (if the organism is sexual, the portion of variance among individuals will also be significant)
– F statistics: Fst will be over 0.10 (suggesting strong structuring)
– There will be isolation by distance

Levels of Analyses
Individual
• Identifying parents & offspring: very important in zoological circles
– Identify patterns of mating between individuals (polyandry, etc.)
In fungi, it is important to identify the "individual": distinguishing clonal individuals from unique individuals that resulted from a single mating event.

Levels of Analyses cont…
• Families: looking at relatedness within colonies (ants, bees, etc.)
• Population: level of variation within a population
– Dispersal = indirectly estimated by calculating migration
– Conservation & Management = looking for founder effects (little allelic variation), bottlenecks (reduction in population size leads to little allelic variation)
• Species: variation among species = what are the relationships between species
• Family, Order, etc. = higher-level phylogenies

What is Population Genetics?
About microevolution (evolution within species)
The study of the change of allele frequencies, genotype frequencies, and phenotype frequencies

Goals of population genetics
To provide an explanatory framework to describe the evolution of species, organisms, and their genomes, due to:
• Natural selection (adaptation)
• Chance (random events)
• Mutations
• Climatic changes (population expansions and contractions)
• …
Assumes that:
• the same evolutionary forces acting within species (populations) should enable us to explain the differences we see between species
• evolution leads to change in gene frequencies within populations

Pathogen Population Genetics
• Pathogens must constantly adapt to changing environmental conditions to survive
– High genetic diversity = easily adapted
– Low genetic diversity = difficult to adapt to changing environmental conditions
– Important for determining evolutionary potential of a pathogen
• If we are to control a disease, we must target a population rather than an individual
• Pathogens exhibit a diverse array of reproductive strategies that impact population biology

Analytical Techniques
– Hardy-Weinberg Equilibrium
• p² + 2pq + q² = 1
• Departures from random mating
– F-Statistics
• Measures of genetic differentiation in populations
– Genetic Distances: degree of similarity between OTUs
• Nei's
• Reynolds
• Jaccard's
• Cavalli-Sforza
– Tree Algorithms: visualization of similarity
• UPGMA
• Neighbor Joining

Allele Frequencies
• Allele frequencies (gene frequencies) = proportion of all alleles, across all individuals in the group in question, which are of a particular type
• Allele frequencies: p + q = 1
• Expected genotype frequencies: p² + 2pq + q²

Evolutionary principles: Factors causing changes in genotype frequency
• Selection = variation in fitness; heritable
• Mutation = change in DNA of genes
• Migration = movement of genes across populations
– Vectors = pollen, spores
• Recombination = exchange of gene segments
• Non-random mating = mating between neighbors rather than by chance
• Random genetic drift = if populations are small enough, by chance, sampling will result in a different allele frequency from one generation to the next. The smaller the sample, the greater the chance of deviation from an ideal population. Genetic drift at small population sizes often occurs as a result of two situations: the bottleneck effect or the founder effect.

Founder Effects: typical of exotic diseases
• Establishment of a population by a few individuals can profoundly affect genetic variation
– Consequences of founder effects
• Fewer alleles
• Fixed alleles
• Modified allele frequencies compared to source pop
• GREATER THAN EXPECTED DIFFERENCES AMONG POPULATIONS BECAUSE POPULATIONS NOT IN EQUILIBRIUM (IF A BLONDE FOUNDS TOWN A AND A BRUNETTE FOUNDS TOWN B AND THERE IS NO MOVEMENT BETWEEN TOWNS, WE WILL INSTANTANEOUSLY OBSERVE POPULATION DIFFERENTIATION)

Bottleneck Effect
• The bottleneck effect occurs when the number of individuals in a larger population is drastically reduced
• By chance, some alleles may be overrepresented and others underrepresented among the survivors
• Some alleles may be eliminated altogether
• Genetic drift will continue to impact the gene pool until the population is large enough

Founder vs Bottleneck

Northern Elephant Seal: Example of Bottleneck
• Hunted down to 20 individuals in the 1890s
• Population has recovered to over 30,000
• No genetic diversity at 20 loci

Hardy-Weinberg Equilibrium and F-Stats
• In general, requires a co-dominant marker system
• Codominant = expression of heterozygote phenotypes that differ from either homozygote phenotype
• AA, Aa, aa

Hardy-Weinberg Equilibrium
• Null model = population is in HW equilibrium
– Useful
– Often predicts genotype frequencies well

Hardy-Weinberg Theorem
If only random mating occurs, then allele frequencies remain unchanged over time.
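The theorem can be checked numerically. The following is a minimal Python sketch, not part of the original slides: it starts from genotype frequencies that are not in Hardy-Weinberg proportions, applies one round of random mating, and confirms that the allele frequency is unchanged while the genotype frequencies settle at p², 2pq, q². The function names and the starting frequencies are illustrative assumptions.

```python
def allele_freq(geno):
    """Frequency of allele A given genotype frequencies (AA, Aa, aa)."""
    f_AA, f_Aa, f_aa = geno
    # Each AA individual carries two A copies, each Aa individual carries one.
    return f_AA + 0.5 * f_Aa


def random_mating(geno):
    """Genotype frequencies after one generation of random mating.

    Assumes the ideal conditions of the theorem: diploid, sexual,
    infinite population, no selection, migration or mutation.
    """
    p = allele_freq(geno)
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)


# Hypothetical starting population, deliberately NOT in HW proportions.
start = (0.50, 0.20, 0.30)      # (AA, Aa, aa)
gen1 = random_mating(start)     # one generation of random mating
gen2 = random_mating(gen1)      # a second generation changes nothing

print("p before:", allele_freq(start))
print("p after :", allele_freq(gen1))
print("generation 1 genotypes:", gen1)
print("generation 2 genotypes:", gen2)
```

In this example the heterozygote frequency moves from 0.20 to 0.48 in a single generation, while p stays at 0.60 throughout; further generations of random mating leave both unchanged.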
After one generation of random mating, genotype frequencies are given by

AA    Aa    aa
p²    2pq   q²

p = freq(A), q = freq(a)

Expected Genotype Frequencies
• The possible range for an allele frequency or genotype frequency lies between 0 and 1
• 0 means complete absence of that allele or genotype from the population (no individual in the population carries that allele or genotype)
• 1 means complete fixation of the allele or genotype (fixation means that every individual in the population is homozygous for the allele, i.e., has the same genotype at that locus)

ASSUMPTIONS
1) Diploid organism
2) Sexual reproduction
3) Discrete generations (no overlap)
4) Mating occurs at random
5) Large population size (infinite)
6) No migration (closed population)
7) Mutations can be ignored
8) No selection on alleles

IMPORTANCE OF HW THEOREM
If the only force acting on the population is random mating, allele frequencies remain unchanged and genotypic frequencies are constant. Mendelian genetics implies that genetic variability can persist indefinitely, unless other evolutionary forces act to remove it.

Departures from HW Equilibrium
• Check gene diversity = heterozygosity
– If high gene diversity = different genetic sources due to high levels of migration
• Inbreeding: mating system "leaky" or breaks down, allowing mating between siblings
• Asexual reproduction = check for clones
– Risk of overemphasizing particular individuals
• Restricted dispersal = local differentiation leads to non-random mating

[Figure: Pop 3 and Pop 4, FST = 0.30; Pop 2 and Pop 1, FST = 0.02]

Genotype counts
              Pop1   Pop2   Pop3
Sample size     20     20     20
AA              10      5      0
Aa               4     10      8
aa               6      5     12

Allele frequencies (each of the 20 diploid individuals carries 2 alleles, so 40 alleles per population)
        Pop1                     Pop2                      Pop3
Freq p  (2*10 + 4)/40 = 0.60     (2*5 + 10)/40 = 0.50      (2*0 + 8)/40 = 0.20
Freq q  (2*6 + 4)/40 = 0.40      (2*5 + 10)/40 = 0.50      (2*12 + 8)/40 = 0.80

Local Inbreeding Coefficient
• Calculate HOBS (observed proportion of heterozygotes)
– Pop1: 4/20 = 0.20
– Pop2: 10/20 = 0.50
– Pop3: 8/20 = 0.40
• Calculate HEXP (2pq)
– Pop1: 2*0.60*0.40 = 0.48
– Pop2: 2*0.50*0.50 = 0.50
– Pop3: 2*0.20*0.80 = 0.32
• Calculate F = (HEXP – HOBS)/HEXP
– Pop1 = (0.48 – 0.20)/(0.48) = 0.583
– Pop2 = (0.50 – 0.50)/(0.50) = 0.000
– Pop3 = (0.32 – 0.40)/(0.32) = -0.250

F Stats: Proportions of Variance
• FIS = (HS – HI)/(HS)
• FST = (HT – HS)/(HT)
• FIT = (HT – HI)/(HT)

Pop     HS     HI     p      q
1       0.48   0.20   0.60   0.40
2       0.50   0.50   0.50   0.50
3       0.32   0.40   0.20   0.80
Mean    0.43   0.37   0.43   0.57

HT = 2 * (mean p) * (mean q) = 2*0.43*0.57 = 0.49
FIS = (0.43 – 0.37)/0.43 = 0.14
FST = (0.49 – 0.43)/0.49 = 0.12
FIT = (0.49 – 0.37)/0.49 = 0.24

Important point
• Whether an Fst value is significant depends on the organism you are studying or reading about:
– Fst = 0.10 would be outrageous for humans; for fungi it means modest substructuring

Microsatellites or SSRs
• AGTTTCATGCGTAGGT CG CG CG CG CG AAAATTTTAGGTAAATTT
• Number of CG repeats is variable
• Design primers on FLANKING region, amplify DNA
• Electrophoresis on gel, or capillary
• Size the allele (alleles differ by one or more repeats; if the size difference does not match a whole number of repeats, there may be polymorphisms in the flanking region)
• Stepwise mutational process (2 to 3 to 4 to 3 to 2 repeats)

Host islands within the California Northern Channel Islands create fine-scale genetic structure in two sympatric species of the symbiotic ectomycorrhizal fungus Rhizopogon
Rhizopogon occidentalis
Rhizopogon vulgaris

Rhizopogon sampling & study area
• Santa Rosa, Santa Cruz
– R. occidentalis
– R. vulgaris
• Overlapping ranges
– Sympatric
– Independent evolutionary histories

Sampling
Bioassay: mycorrhizal pine roots

Local Scale Population Structure: Rhizopogon occidentalis
[Map figure: pairwise FST among sampling sites labelled N, E, W, T, B; values shown: 0.26, 0.24, 0.33, 0.17]
• Sites 5 km apart: populations are similar
• Sites 8-19 km apart: populations are different
Grubisha LC, Bergemann SE, Bruns TD. Molecular Ecology, in press.

Local Scale Population Structure: Rhizopogon vulgaris
[Map figure: pairwise FST among sampling sites labelled N, E, W; values shown: 0.21, 0.20, 0.25]
• Populations are different
Grubisha LC, Bergemann SE, Bruns TD. Molecular Ecology, in press.
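The FST values reported on these maps are the same kind of statistic as in the three-population worked example earlier in this section (Pop1 to Pop3). As a minimal Python sketch, not part of the original slides, the whole calculation can be reproduced from the genotype counts. It assumes equal sample sizes, so population means are unweighted; the variable and function names are illustrative.

```python
# Genotype counts (AA, Aa, aa) from the worked example; 20 individuals each.
counts = {
    "Pop1": (10, 4, 6),
    "Pop2": (5, 10, 5),
    "Pop3": (0, 8, 12),
}


def pop_summary(n_AA, n_Aa, n_aa):
    """Return (p, H_obs, H_exp) for one population."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # allele counting: 2N alleles in total
    h_obs = n_Aa / n                  # observed heterozygosity
    h_exp = 2 * p * (1 - p)           # Hardy-Weinberg expectation, 2pq
    return p, h_obs, h_exp


summaries = {name: pop_summary(*c) for name, c in counts.items()}

# Local inbreeding coefficient F = (H_exp - H_obs) / H_exp per population.
local_F = {name: (h_exp - h_obs) / h_exp
           for name, (p, h_obs, h_exp) in summaries.items()}

k = len(summaries)
H_I = sum(h_obs for _, h_obs, _ in summaries.values()) / k   # mean observed
H_S = sum(h_exp for _, _, h_exp in summaries.values()) / k   # mean expected
p_bar = sum(p for p, _, _ in summaries.values()) / k
H_T = 2 * p_bar * (1 - p_bar)        # expected heterozygosity of the pooled total

F_IS = (H_S - H_I) / H_S
F_ST = (H_T - H_S) / H_T
F_IT = (H_T - H_I) / H_T

for name, f in local_F.items():
    print(f"{name}: F = {f:.3f}")
print(f"H_T = {H_T:.2f}, F_IS = {F_IS:.2f}, F_ST = {F_ST:.2f}, F_IT = {F_IT:.2f}")
```

Without intermediate rounding this gives FIS ≈ 0.15, FST ≈ 0.12 and FIT ≈ 0.25. The small differences from the slide figures arise because the slide rounds the means to two decimals before dividing. With the formula FIS = (HS – HI)/HS the result is positive here, since mean observed heterozygosity is below the mean expected value.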
[Table B: allele frequencies at six microsatellite loci (Rvu24.9, Rvu20.80, Rvu20.46, Rvu21.83, Rvu19.80, Rvu21.13) for Santa Cruz Island (SCI East, SCI North, SCI West) and Santa Rosa Island (SRI)]

How do we know that we are sampling a population?
• We actually do not know
• Mostly we tend to treat samples from a discrete location as a population; obviously that is tautological
• Assignment tests use the data to define populations; that is what Grubisha et al. did using the program STRUCTURE

Four phases of INVASION
• TRANSPORT
• SURVIVAL AND ESTABLISHMENT (LAG PHASE)
• INVASION
• POST-INVASION

TRANSPORT
• Biology will determine how
• Normally very few organisms will make it
• Use phylogeographic approach to determine origin (Armillaria, Heterobasidion)
• Use population genetic approach (Cryphonectria, Ceratocystis fimbriata)

TRANSPORT-2
• Need to sample source pop or a pop that is close enough
• Need markers that are polymorphic and will differentiate genotypes/haplotypes
• Need analysis that will discriminate amongst individuals and identify relationships (similarity clustering, parsimony, Fst & N, coalescent)

ESTABLISHMENT
• LAG PHASE: normally effects are not noticed because mortality is masked by normal background mortality
• By the time the introduction is discovered, it is normally too late to eradicate
• Short lag phase = aggressive pathogen
• Long lag phase = less aggressive pathogen

ESTABLISHMENT
• NORMALLY REDUCED GENETIC VARIABILITY

INVASION
• Because of lack of equilibrium, high Fst values, i.e. strong genetic structuring among populations
• Normally dominance of a few genotypes
• Spatial autocorrelation analyses to tell us extent of spread

INVASION-2
• Later phase: genetic differentiation
• Higher genetic difference in areas of older establishment
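The reduced variability at establishment is the founder effect at work: a handful of introduced individuals carry only a sample of the source population's alleles. The following is a minimal Python sketch, not part of the original slides, that draws a small founding group from a source population and compares allele numbers and frequencies. The allele names, source frequencies, founder number and random seed are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def found_population(source_freqs, n_founders, rng):
    """Allele frequencies in a population founded by n diploid individuals.

    Each founder contributes two allele copies drawn at random from the
    source population's allele frequencies.
    """
    alleles = list(source_freqs)
    weights = [source_freqs[a] for a in alleles]
    copies = rng.choices(alleles, weights=weights, k=2 * n_founders)
    counted = Counter(copies)
    return {a: counted[a] / (2 * n_founders) for a in counted}


def expected_heterozygosity(freqs):
    """Gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(f * f for f in freqs.values())


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical source population with five alleles at one locus.
source = {"a1": 0.40, "a2": 0.30, "a3": 0.15, "a4": 0.10, "a5": 0.05}

# Hypothetical introduction event: three founding individuals.
founders = found_population(source, n_founders=3, rng=rng)

print("alleles in source  :", len(source))
print("alleles in founders:", len(founders))
print("founder frequencies:", founders)
print("source gene diversity :", round(expected_heterozygosity(source), 3))
print("founder gene diversity:", round(expected_heterozygosity(founders), 3))
```

Repeating the draw with different seeds shows how often rare alleles are lost and how far founder frequencies drift from the source, which is why two independently founded populations can look strongly differentiated from the start.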