Lecture 17: Phylogenetics and Phylogeography
October 22, 2012
Announcements
• Exam next Wednesday (Oct 31)
• Review on Monday
  • Bring questions
• Covers material from genetic drift (Sept 28) through coalescence (Friday)
• I will be gone Monday, Oct 29 (after office hours) through Oct 31
• Bring questions on Monday!
Last Time
• Using FST to estimate migration
• Direct estimates of migration: parentage analysis
• Introduction to phylogenetic analysis
Today
• Phylogeography
• Limitations of phylogenetic analysis
• Coalescence introduction
• Influence of demography on coalescence time
UPGMA Method
• Use all pairwise comparisons to make a dendrogram
• UPGMA: Unweighted Pair Group Method with Arithmetic Mean
• Hierarchically link the most closely related individuals
Read the Lab 9 Introduction!
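The hierarchical linking described above can be sketched in a few lines. This is a minimal illustration, not the Lab 9 procedure: the function name `upgma`, the nested-tuple tree format, and the toy distances are all assumptions made for the example.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a dendrogram (nested tuples) from pairwise distances.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance between leaves a and b.
    """
    # Map each current cluster (a leaf label or a nested tuple) to its leaves.
    clusters = {name: [name] for name in names}

    def average_distance(c1, c2):
        # Unweighted arithmetic mean over every leaf pair between two clusters.
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Link the two closest clusters, then treat them as one unit.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))


# Hypothetical distances among four individuals (illustrative values only).
example = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
           ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(pair): d for pair, d in example.items()}
tree = upgma(["A", "B", "C", "D"], dist)
print(tree)
```

In this toy data set A and B are joined first, then C, then D, which mirrors the "most closely related first" rule on the slide.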
Phenetics (distance) vs Cladistics (character state based)
Lowe, Harris, and Ashton 2004
Parsimony Methods
• Based on underlying genealogical relationships among alleles
• Occam’s Razor: simplest scenario is the most likely
• Useful for depicting evolutionary relationships among taxa or populations
• Choose the tree that requires the smallest number of steps (mutations) to produce the observed relationships
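To make "smallest number of steps" concrete, the sketch below counts the minimum number of state changes one character needs on a given tree, using Fitch's algorithm. The lecture does not name a specific counting method, so the choice of Fitch's algorithm, the function name `fitch_score`, and the toy character states are assumptions for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree:   nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D")).
    states: dict mapping leaf name -> observed character state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}          # a leaf carries its observed state
        left, right = visit(node[0]), visit(node[1])
        shared = left & right
        if shared:
            return shared                  # children agree: no change needed
        changes += 1                       # children disagree: one mutation
        return left | right

    visit(tree)
    return changes


# Hypothetical nucleotide states at one site for four taxa.
states = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(fitch_score(tree_1, states), fitch_score(tree_2, states))
```

Under the parsimony criterion, the tree with the lower score for this character would be preferred; a full analysis would sum scores over all characters.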
Choosing Phylogenetic Trees
• MANY possible trees can be built for a given set of taxa
• Very computationally intensive to choose among these
Lowe, Harris, and Ashton 2004
Number of unrooted (U_n) and rooted (R_n) trees for n taxa:
$$U_n = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$
$$R_n = \frac{(2n-3)!}{2^{n-2}\,(n-2)!} = (2n-3)\,U_n$$
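A short calculation shows how quickly these counts grow. The function names below are hypothetical; the formulas are the ones on the slide, evaluated with exact integer arithmetic.

```python
from math import factorial


def unrooted_trees(n):
    """Number of unrooted bifurcating trees for n taxa (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def rooted_trees(n):
    """Number of rooted bifurcating trees for n taxa (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (4, 5, 10, 20):
    print(n, unrooted_trees(n), rooted_trees(n))
```

Ten taxa already give more than two million unrooted trees, which is why exhaustive evaluation is rarely feasible.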
Choosing Phylogenetic Trees
• Many algorithms exist for searching tree space
• Local optima are a problem: need to traverse valleys to get to other peaks
• Heuristic search: cut trees up systematically and reassemble
• Branch and bound: search for optimal path through tree space
Felsenstein 2004
Choosing Phylogenetic Trees
• If multiple trees are equally likely, build a majority-rule or strict consensus tree
• Strict consensus is the most conservative approach
• Bootstrap the data matrix (sample with replacement) to determine robustness of nodes
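The resampling step of the bootstrap can be sketched as follows. This covers only the creation of one pseudo-replicate by sampling alignment columns with replacement; rebuilding a tree from each replicate and tallying how often each node reappears is left out. The function name `bootstrap_alignment` and the toy sequences are assumptions.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment: dict mapping taxon name -> sequence (all the same length).
    rng:       a random.Random instance, passed in for reproducibility.
    """
    length = len(next(iter(alignment.values())))
    # Draw column indices with replacement; every taxon uses the same draw,
    # so whole columns (sites) are resampled together.
    columns = [rng.randrange(length) for _ in range(length)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}


# Hypothetical four-site alignment for two taxa.
alignment = {"A": "ACGT", "B": "ACGA"}
replicate = bootstrap_alignment(alignment, random.Random(1))
print(replicate)
```

Node support is then typically reported as the percentage of replicates whose trees contain that node.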
[Figure: trees for taxa A-F with node values of 60]
Lowe, Harris, and Ashton 2004; Felsenstein 2004
Phylogeography
• The study of evolutionary relationships among individuals based on phylogenetic analysis of DNA sequences in geographic context
• Can be used to infer evolutionary history of populations
  • Migrations
  • Population subdivisions
  • Bottlenecks/Founder Effects
• Can provide insights on current relationships among populations
  • Connectedness of populations
  • Effects of landscape features on gene flow
Phylogeography
• Topology of tree provides clues about evolutionary and ecological history of a set of populations
• Dispersal creates poor correspondence between geography and tree topology
• Vicariance (division of populations preventing gene flow among subpopulations) results in neat mapping of geography onto haplotypes
Example: Pocket gophers (Geomys pinetis)
• Fossorial rodent that inhabits a 3-state area in the U.S.
• RFLP for mtDNA of 87 individuals revealed 23 haplotypes
• Parsimony network reveals geographic relationships among haplotypes
• Haplotypes generally confined to single populations
• Major east-west split in distribution revealed
Avise 2004
Problems with using Phylogenetics for Inferring Evolution
• It’s a black box: starting from the end point, reconstructing the past based on an assumed evolutionary model
• Orthologs versus paralogs
• Hybridization
• Differential evolutionary rates
• Assumes coalescence
Gene Orthology
• Phylogenetics requires unambiguous identification of orthologous genes
• Paralogous genes are copies that arose by duplication, so their divergence does not track the evolutionary history of the species
• Difficult to determine orthology relationships
Lowe, Harris, Ashton 2004
Gene Trees vs Species Trees
• Genes (or loci) evolve at different rates
  • Why?
• Topology derived from a single gene may not match topology based on whole genome, or morphological traits
[Figure: gene tree with tips B, C, A]
Gene Trees vs Species Trees
• Failure to coalesce within species lineages drives divergence of relationships between gene and species trees
[Figure: concordant and divergent gene trees for lineages a, b, c. In one, b is closer to a than to c; in the other, b is closer to c than to a]
Coalescence
• Retrospective tracing of ancestry of individual alleles
• Allows explicit simulation of sequence evolution
• Incorporation of factors that cause deviation from neutrality: selection, drift, and gene flow
[Figure: 9 generations in the history of a population of 14 gene copies. Axes: time (to present), individual alleles. Slide courtesy of Yoav Gilad]
How to model this process?
Modeling from Theoretical Ancestors: Forward Evolution
• Can model populations in a forward direction, starting with theoretical past
• Fisher-Wright model of neutral evolution
• Very computationally intensive for large populations
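A forward simulation under the neutral Fisher-Wright model can be sketched as below. It tracks the count of one allele across generations, with every gene copy in the next generation drawing its parent at random. The function name `wright_fisher`, the starting frequency, and the use of 14 gene copies over 9 generations (borrowed from the earlier figure caption) are illustrative assumptions.

```python
import random


def wright_fisher(n_copies, p0, generations, rng):
    """Simulate neutral drift of one allele forward in time.

    n_copies:    number of gene copies in the population (constant).
    p0:          starting frequency of the tracked allele.
    generations: number of generations to simulate.
    rng:         a random.Random instance.
    Returns the allele count in each generation, starting with generation 0.
    """
    count = round(p0 * n_copies)
    trajectory = [count]
    for _ in range(generations):
        p = count / n_copies
        # Each copy in the next generation inherits the allele with
        # probability equal to its current frequency.
        count = sum(rng.random() < p for _ in range(n_copies))
        trajectory.append(count)
    return trajectory


trajectory = wright_fisher(14, 0.5, 9, random.Random(42))
print(trajectory)
```

Every copy in every generation is simulated, including lineages that leave no descendants, which is the cost the slide refers to for large populations.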
Alternative: Start at the end and work your way back
[Figure: lineages traced back from the present to the most recent common ancestor (MRCA). Axes: time, individual alleles. Slide courtesy of Yoav Gilad]
The genealogy of a sample of 5 gene copies
[Figure: genealogy of the sample traced back to the MRCA. Axes: time (to present), individual alleles. Slide courtesy of Yoav Gilad]
Examples of coalescent trees for a sample of 6
[Figure: example coalescent trees. Axes: time, individual alleles. Slide courtesy of Yoav Gilad]
Coalescence Advantages
• Don’t have to model dead ends
• Only consider lineages that survive to modern day: computationally efficient
• Based on actual observations
• Can simulate different evolutionary scenarios to see what best fits the observed data
Coalescent Tree Example
• Coalescence: merging of two lineages in the Most Recent Common Ancestor (MRCA)
• Waiting Time: time to coalescence for two lineages
  • Increases with each coalescent event
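The lengthening of waiting times as lineages merge can be illustrated with a small simulation. This sketch assumes the standard continuous-time approximation, in which the wait while k lineages remain is exponentially distributed with rate k(k-1)/2 × 1/(2Ne); the function names and the value Ne = 1000 are hypothetical.

```python
import random


def coalescent_waiting_times(k, ne, rng):
    """Simulate successive waiting times (in generations) until the MRCA.

    k:   number of sampled gene copies.
    ne:  effective population size.
    rng: a random.Random instance.
    Returns one waiting time per coalescent event (k - 1 in total).
    """
    times = []
    while k > 1:
        rate = k * (k - 1) / 2 / (2 * ne)   # per-generation coalescence rate
        times.append(rng.expovariate(rate))
        k -= 1                              # two lineages merge into one
    return times


def expected_waiting_time(k, ne):
    """Expected wait while k lineages remain: 4*Ne / (k*(k-1))."""
    return 4 * ne / (k * (k - 1))


ne = 1000  # assumed effective population size
for k in range(5, 1, -1):
    print(k, expected_waiting_time(k, ne))
print(coalescent_waiting_times(5, ne, random.Random(3)))
```

The expected waits grow as k falls, so the final merger of the last two lineages accounts for the largest share of the tree's depth.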
Probability of Coalescence
• For any two lineages, a function of population size:
  $$P_{\text{coalescence}} = \frac{1}{2N_e}$$
• Also a function of the number of lineages:
  $$P_{\text{coalescence}} = \frac{k(k-1)}{2} \cdot \frac{1}{2N_e}$$
  where k is the number of lineages
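As a worked example, take an assumed effective population size of N_e = 1000 (an illustrative value, not from the lecture). The two formulas then give:

```latex
\begin{align*}
k = 2:&\quad P_{\text{coalescence}} = \frac{1}{2 \times 1000} = 0.0005 \\
k = 5:&\quad P_{\text{coalescence}} = \frac{5 \times 4}{2} \cdot \frac{1}{2 \times 1000}
      = 10 \times 0.0005 = 0.005
\end{align*}
```

With five lineages there are ten possible pairs, so a coalescent event somewhere in the sample is ten times as likely per generation as it is for a single pair.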
Probability of Coalescence
• Probability declines over time
  • Lineages decrease in number
• Can be estimated based on a negative exponential:
  $$P_{\text{coalescence}} \approx e^{-t \, \frac{k(k-1)}{2} \cdot \frac{1}{2N_e}}$$
  where k is the number of lineages
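One common reading of the exponential term is as the probability that k lineages have not yet coalesced after t generations. The slide labels it P_coalescence, so treat that interpretation as an assumption here. Under it, the sketch below compares the exponential with the exact discrete-generation value (1 - p)^t, where p is the per-generation probability of coalescence. Function names and parameter values are hypothetical.

```python
import math


def p_no_coalescence_exact(k, ne, t):
    """Probability of no coalescence in t discrete generations."""
    p = k * (k - 1) / 2 / (2 * ne)   # per-generation coalescence probability
    return (1 - p) ** t


def p_no_coalescence_approx(k, ne, t):
    """Negative-exponential approximation of the same quantity."""
    return math.exp(-t * k * (k - 1) / 2 / (2 * ne))


ne = 1000  # assumed effective population size
for t in (0, 500, 2000, 8000):
    print(t, p_no_coalescence_exact(2, ne, t), p_no_coalescence_approx(2, ne, t))
```

The two columns agree closely when 1/(2Ne) is small, which is the condition under which the exponential approximation is normally used.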
Time to Coalescence Affected by Population History: Bottleneck
Time to Coalescence Affected by Population History: Population Growth
Time to Coalescence Affected by Population Structure
Applications of the Coalescent Approach
• Framework for efficiently testing alternative models for evolution
• Inferences about effective population size
• Detection of population structure
• Signatures of selection (coming attraction)