Random Allelic Variation
AKA
Genetic Drift
Genetic Drift
a non-adaptive mechanism of evolution (therefore, a theory of
evolution) that sometimes operates simultaneously with others,
such as natural selection
the frequency of gene copies (i.e., alleles) in any generation of
adult organisms represents only a sample of the gene copies
carried by gametes of the previous generation, and the sample
is subject to random variation, i.e., “sampling error”
Beginning with the Hardy-Weinberg model
no mutation
no selection
no gene flow
But with one wrinkle in drift
finite population size
Result
random changes in allele frequency
(there is never a change in allele frequency in Hardy-Weinberg
equilibrium)
Definitions
monomorphy – no allelic variation at a locus in a population
polymorphy – multiple alleles at a locus in a population
fixation, fixed – describes an allele whose frequency has reached 100%, making
the locus monomorphic
N = censused population size
Definitions
private allele – an allele unique to only one population, but not
necessarily fixed within it
cline – continuous change in allele frequency along a geographic
transect, the hallmark of gene flow
metapopulation – a collection of conspecific demes
deme – a reproductively isolated or semi-isolated subpopulation,
i.e., reduced or no gene flow among demes
Two models of drift
Random Walk – prospective, looking forward in time
Coalescence – retrospective, looking backward in time
Random Walk Model – Monte Carlo Markov Chain
prospective, looking forward in time
The state at time t is determined only by the state at time t-1 plus a
random event
example
stand at sundial on Horseshoe in front of Hadley Hall facing West
take one step forward, flip a coin, move one step to the right if heads
or one step to the left if tails
repeat the process until you either reach Espina St or run into North
Horseshoe Rd or South Horseshoe Rd
[Figure: map of the Horseshoe showing Espina St, South Horseshoe Rd, and North Horseshoe Rd]
the horseshoe as a graph of allele frequency
y-axis (vertical) = allele frequency
x-axis (horizontal) = time in generations
p = 0 at North Horseshoe Rd (extinction of A1, fixation of A2)
p = 1.0 at South Horseshoe Rd (fixation of A1, extinction of A2)
the sundial is halfway between, p = 0.5 at time t = 0
the outcome will differ every time because of the
random component
Unlike the coin-toss exercise, in which the
probability of heads and tails remains equal,
the probability of an allele being represented in a
gamete changes with each generation
the probability of an allele being represented in a
gamete is equal to the new allele frequency
this will tend to ensure allele fixation or loss
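The walk described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something taken from the slides: a diploid population of constant size, non-overlapping generations, and 2N gene copies drawn at random each generation (the names `next_generation` and `drift_walk` are made up for this sketch). The key difference from the coin toss is visible in the code: the probability used for each draw is the current allele frequency, not a fixed 0.5.

```python
import random

def next_generation(p, n_individuals, rng):
    """Draw 2N gene copies; each is A1 with probability p (the current frequency)."""
    copies = 2 * n_individuals
    a1_copies = sum(1 for _ in range(copies) if rng.random() < p)
    return a1_copies / copies

def drift_walk(p0, n_individuals, rng, max_generations=100_000):
    """Follow one population until A1 is fixed (1.0) or lost (0.0)."""
    trajectory = [p0]
    while 0.0 < trajectory[-1] < 1.0 and len(trajectory) <= max_generations:
        trajectory.append(next_generation(trajectory[-1], n_individuals, rng))
    return trajectory

rng = random.Random(1)  # seeded only so the sketch is repeatable
for n in (10, 100):
    walk = drift_walk(0.5, n, rng)
    print(f"N={n}: ended at p={walk[-1]} after {len(walk) - 1} generations")
```

Changing the seed gives a different path each time, which is the point of the sundial exercise.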
the width of the Horseshoe (i.e., North-South) is
analogous to population size (N)
the smaller the population, the narrower the width
the smaller the population, the greater the sampling error of gametes,
and the more likely and more rapidly an allele will become
fixed (100%, monomorphic) or go extinct (0%)
variance is higher in small samples
$$V = \frac{p(1-p)}{2N}$$
(N = population size)
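As a rough check on the variance formula, the sketch below evaluates p(1 − p)/2N directly and compares it with the variance of allele frequencies after one simulated generation of random sampling. The function names and the number of replicates are assumptions made for this illustration. With p = 0.5 the formula gives 0.0125 for N = 10 and 0.00125 for N = 100, a tenfold drop for a tenfold larger population.

```python
import random
import statistics

def drift_variance(p, n_individuals):
    """Variance in allele frequency after one generation: p(1 - p) / 2N."""
    return p * (1 - p) / (2 * n_individuals)

def simulated_variance(p, n_individuals, replicates, rng):
    """Sample 2N gene copies many times and measure the spread of the new frequency."""
    copies = 2 * n_individuals
    freqs = [
        sum(rng.random() < p for _ in range(copies)) / copies
        for _ in range(replicates)
    ]
    return statistics.pvariance(freqs)

rng = random.Random(2)  # seeded only for repeatability
for n in (10, 100):
    print(n, drift_variance(0.5, n), simulated_variance(0.5, n, 5000, rng))
```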
If Drift is random (by definition, it is), then how can
you predict change in allele frequency or which
allele will become fixed?
Probability of fixation = p
Probability of extinction = 1 – p
for any new mutant, probability of fixation = initial frequency
$$p = \frac{1}{2N}$$
intuitively, probability of fixation of a new mutant by chance alone is
greater in a small rather than large population
average time to fixation of a new allele that does become fixed (without
selection) = 4N generations (in diploid species)
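The claim that the probability of fixation equals the starting frequency can be checked by repeating the random walk many times and counting how often A1 wins. The sketch below does this under the same assumptions as the drift model above (constant N, random sampling of 2N gene copies, no selection); the function names, seed, and replicate count are hypothetical choices for illustration.

```python
import random

def a1_fixes(p0, n_individuals, rng):
    """Run one drift trajectory; return True if A1 reaches a frequency of 1.0."""
    copies = 2 * n_individuals
    count = round(p0 * copies)  # starting number of A1 gene copies
    while 0 < count < copies:
        p = count / copies
        count = sum(1 for _ in range(copies) if rng.random() < p)
    return count == copies

def fixation_fraction(p0, n_individuals, replicates, rng):
    """Fraction of replicate populations in which A1 becomes fixed."""
    fixed = sum(a1_fixes(p0, n_individuals, rng) for _ in range(replicates))
    return fixed / replicates

rng = random.Random(7)
n = 10
for p0 in (1 / (2 * n), 0.3, 0.5):  # 1/(2N) is a single new mutant copy
    estimate = fixation_fraction(p0, n, 2000, rng)
    print(f"p0={p0:.2f}: fraction of runs fixing A1 = {estimate:.3f}")
```

Each printed fraction should land near its starting frequency, with some scatter because the number of replicates is finite.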
Coalescent Model – Coalescent Theory
any two lineages can be traced backward in time to a common ancestor
alleles, haplotypes, or lineages are said to “coalesce” at that
generation of common ancestry
Coalescent Model – Coalescent Theory
Example – a haploid non-recombining bacterium
in each generation, the bacterium may die, survive, or survive and
reproduce
thus, in a population of finite size, if some lineages leave no
descendants while others reproduce, eventually all individuals will
be descendants of just one single lineage
barring consideration of new mutants, initially polymorphic
populations become increasingly closely related as allelic variation
is lost by fixation and extinction
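The bacterial example can be sketched as a small simulation. The assumptions here are mine, not the slides': a haploid population of constant size N in which every individual in the next generation picks its parent uniformly at random, so some parents leave several offspring and others leave none. Each founder starts with its own lineage label, and the loop runs until a single label remains.

```python
import random

def generations_until_one_lineage(n_individuals, rng):
    """Return (generations elapsed, surviving founder) for one haploid population."""
    lineages = list(range(n_individuals))  # each founder starts its own lineage
    generations = 0
    while len(set(lineages)) > 1:
        # every offspring copies the lineage label of a randomly chosen parent
        lineages = [rng.choice(lineages) for _ in range(n_individuals)]
        generations += 1
    return generations, lineages[0]

rng = random.Random(3)
n = 20
times = [generations_until_one_lineage(n, rng)[0] for _ in range(200)]
print(f"N={n}: mean generations until one lineage remains = {sum(times) / len(times):.1f}")
```

Individual runs vary widely, so it is the average over many runs that is worth comparing with a population-size-based expectation.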
[Figure: the same process in color, with a histogram of generations to coalescence of lineages. Source: http://www.csbio.unc.edu/mcmillan/index.py?run=Courses.Comp790S09]
time to coalescence of alleles =
4N generations (diploids)
2N generations (haploids)
1N generations in maternally inherited haploid organellar DNA
(i.e., mitochondria, chloroplast) because the paternal lineage
ends in every generation
Conclusion – coalescence is fast in small populations
Drift is greatest in small populations
Coalescent Theory Predicts (in the absence of gene flow,
mutation, selection)
Allele or haplotype frequencies fluctuate at random but, in finite
populations, one will become fixed
Individual populations lose their genetic variation
Initially similar populations diverge in allele frequencies by chance
alone because they become fixed for different alleles or different
combinations of alleles at unlinked loci
The probability that an allele will ultimately become fixed is equal to
its frequency in the population in any given generation
Rate of fixation (or loss) is greater in small populations
Distinct evolutionary histories of species and their genes
[Figure: polymorphism arises before speciation; modified from Ebersberger et al. Mol Biol Evol 2007]
Lineage Sorting – the time-dependent process by which species lose their
ancestral polymorphism through the process of genetic drift
Hemiplasy – genes or characters with different evolutionary histories
than the species that possess them, most often due to incomplete
lineage sorting (ILS)
the shorter the time between speciations, the more ILS and hemiplasy
[Figure: ancestral polymorphism and lineage sorting, with panels labeled “complete”; Robinson et al 2008 PNAS 105:14477-14481]
How is hemiplasy manifested?
Mosaic Genomes with discordant gene trees among three or
more species that diverged in rapid succession
[Figure: percentage of 25,000 genes most closely related between human-chimp, chimp-gorilla, and human-gorilla; Ebersberger et al Mol Biol Evol 2007]
Heterozygosity (H)
single locus H – the proportion of individuals in a population that are
heterozygous at a given locus
multilocus H – the proportion of loci that are heterozygous in an average
individual
H is highest when alleles are equally frequent, i.e., in a population with
equal numbers of each homozygote
Within demes, drift fixes alleles
Across the metapopulation, allele frequencies remain unchanged, but
genotype frequencies deviate from Hardy-Weinberg equilibrium,
i.e., heterozygosity (H) decreases
panmictic with gene flow, high H
demic with genetic structure, low H
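A small worked example may help show how heterozygosity can fall across a metapopulation while the overall allele frequency stays the same. The sketch assumes equally sized demes with random mating inside each deme, so that H within a deme is 2p(1 − p); the function names and the example frequencies are illustrative, not from the slides.

```python
def expected_heterozygosity(p):
    """Hardy-Weinberg heterozygosity for two alleles at frequencies p and 1 - p."""
    return 2 * p * (1 - p)

def metapopulation_summary(deme_freqs):
    """deme_freqs: frequency of A1 in each equally sized, random-mating deme.

    Returns (overall allele frequency, observed H averaged over demes,
    H expected if the whole metapopulation were panmictic).
    """
    mean_p = sum(deme_freqs) / len(deme_freqs)
    observed_h = sum(expected_heterozygosity(p) for p in deme_freqs) / len(deme_freqs)
    return mean_p, observed_h, expected_heterozygosity(mean_p)

print(metapopulation_summary([0.5, 0.5, 0.5, 0.5]))  # before drift: (0.5, 0.5, 0.5)
print(metapopulation_summary([1.0, 1.0, 0.0, 0.0]))  # demes fixed:  (0.5, 0.0, 0.5)
```

In the second case the overall frequency of A1 is still 0.5, yet no heterozygotes remain because each deme is fixed for one allele or the other.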
Effective Population Size (NE)
the NE of an actual population is equal to the censused population
size (N) of an “ideal” population (i.e., in which all individuals breed
and contribute equally to the gene pool) that would show the
amount of drift actually observed and measured by heterozygosity
(H)
typically, NE < N because of:
sex bias (the less numerous sex limits NE)
reproductive variance of the sexes (the polygamous sex limits NE)
overlapping generations
fluctuations in population size, e.g., past bottlenecks
ploidy
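Two of the factors listed above have standard textbook approximations that are not given in the slides and are added here only as an illustration: for an unequal sex ratio, NE ≈ 4·Nmales·Nfemales / (Nmales + Nfemales), and for a population whose size fluctuates across generations, NE is approximately the harmonic mean of the per-generation sizes. The function names and the example numbers below are hypothetical.

```python
def ne_sex_ratio(n_males, n_females):
    """Approximate NE when the numbers of breeding males and females differ."""
    return 4 * n_males * n_females / (n_males + n_females)

def ne_fluctuating(sizes):
    """Approximate NE over several generations: harmonic mean of the sizes."""
    return len(sizes) / sum(1 / n for n in sizes)

# 100 breeders, but only 10 are male: NE is well below the census size
print(ne_sex_ratio(10, 90))   # 36.0
print(ne_sex_ratio(50, 50))   # 100.0

# one bottleneck generation dominates the long-term NE
print(round(ne_fluctuating([1000, 10, 1000]), 1))  # 29.4
```

Both results show why NE is typically below N: the rarer sex and the smallest generation carry most of the weight.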
Founder Effect
the principle that the founders of a new colony carry only a
fraction of the total genetic variation of the source
population
genetic drift will have a strong effect on small founding
populations
most rare alleles will not be represented, a few will be overrepresented
Founder Effect
initially, H tends to be similar in source and founder populations
because H is most influenced by common alleles
but H decreases rapidly in founder populations, more so in small
populations, less so in populations with high intrinsic growth rate (r)
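The fate of rare alleles in a founding event can be worked out with simple arithmetic. The sketch assumes the founders' 2n gene copies are a random draw from a large source population (an assumption made for this illustration; the function names are hypothetical). An allele at frequency p is then absent from the founders with probability (1 − p)^(2n), and any allele that does get in must start at a frequency of at least 1/(2n).

```python
def probability_allele_lost(p, n_founders):
    """Chance that none of the 2n founder gene copies carries the allele."""
    return (1 - p) ** (2 * n_founders)

def minimum_founder_frequency(n_founders):
    """Lowest possible frequency of an allele that is present among the founders."""
    return 1 / (2 * n_founders)

p_source = 0.01
for n in (10, 100):
    lost = probability_allele_lost(p_source, n)
    floor = minimum_founder_frequency(n)
    print(f"{n} founders: P(lost) = {lost:.3f}, frequency if present >= {floor}")
```

With 10 founders, an allele at p = 0.01 is lost about 82% of the time, but when it is carried over it starts at 0.05 or higher, five times its source frequency.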
examples of drift – Buri 1956
Fixation of eye color allele from initial freq = 0.5 in 107 populations
of Drosophila in 19 generations
examples of drift – Baker and Moeed 1987
Mynah birds are indigenous to India
Mynahs were introduced by humans to
Australia, New Zealand, Fiji, and Hawaii in the 1800s
among natural populations of Mynahs, Nei’s D = 0.001 (a genetic
distance that describes the inverse correlation coefficient of shared
alleles)
among naturalized populations, Nei’s D = 0.006 – equivalent to
sub-species differences in about a century
also, most rare alleles lost, but some increased from p = 0.01 to p = 0.08
Inbreeding (Assortative Mating, compared to drift)
the antithesis of panmixia, panmixis, random mating
Inbreeding Coefficient (F) – the frequency of autozygous individuals in
a population
Autozygous (“identical by descent”) - both alleles in a homozygous
individual were inherited directly from a single haploid allele in an
ancestor (e.g., grandparent)
Allozygous – not identical by descent; either homozygous or
heterozygous
in an inbred population, H is low
Pedigree with Identity by Descent
[Figure: pedigree across the Parental, F1, and F2 generations with genotypes ♂ A1A2*, ♀ A1A2, and ♀ A2A2*; inbreeding produces the autozygous genotype A2*A2*]
Genotype frequencies with inbreeding
Genotype   allozygous       autozygous
A1A1       p^2 (1 - F)    + pF
A1A2       2pq (1 - F)
A2A2       q^2 (1 - F)    + qF
as F → 1.0, the frequency of autozygous homozygotes
increases at the expense of all allozygous genotypes
the greater F, the faster H decreases
Selfing (self-fertilization) – H is halved in each generation
in an inbred population, F is greater than the frequency of autozygous
individuals in a panmictic population
F = 1 fully inbred, F = 0 no inbreeding
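The genotype frequencies with inbreeding can be written as a short function, which makes it easy to check the two extremes (F = 0 and F = 1) and the halving of H under selfing. This is a minimal sketch; the function names are chosen for illustration only.

```python
def genotype_frequencies(p, f):
    """Return (A1A1, A1A2, A2A2) frequencies for allele frequency p and inbreeding F."""
    q = 1 - p
    a1a1 = p * p * (1 - f) + p * f   # allozygous + autozygous homozygotes
    a1a2 = 2 * p * q * (1 - f)       # heterozygotes are always allozygous
    a2a2 = q * q * (1 - f) + q * f
    return a1a1, a1a2, a2a2

def heterozygosity_under_selfing(h0, generations):
    """H is halved in each generation of self-fertilization."""
    return h0 * 0.5 ** generations

print(genotype_frequencies(0.5, 0.0))  # Hardy-Weinberg: (0.25, 0.5, 0.25)
print(genotype_frequencies(0.5, 1.0))  # fully inbred:   (0.5, 0.0, 0.5)
print([heterozygosity_under_selfing(0.5, t) for t in range(4)])
```

In both cases the frequency of A1, computed as A1A1 + ½·A1A2, is still 0.5: inbreeding rearranges genotypes without changing allele frequencies.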
comparison of inbreeding to Hardy-Weinberg equilibrium
allele frequencies do not change (necessarily)
genotype frequencies do change
phenotypic variance usually increases due to loss of heterozygotes
inbreeding depression (reduction in mean of phenotype due to
increased expression of recessive alleles in homozygous genotype)
number of homozygous recessive genotypes increases
mean fitness of population decreases
which, when coupled with natural selection, can then change allele
frequency
can promote linkage disequilibrium due to lack of
heterozygotes, even if loci are not physically linked
comparison of Inbreeding to Drift
both genetic drift and inbreeding lead to deviation from Hardy-Weinberg equilibrium
heterozygosity decreases in small demes
genetic drift causes change in allele frequency (and consequently
genotype frequency)
inbreeding causes change in genotype frequency (but not allele
frequency in the absence of selection)
both cause a decrease in heterozygosity
Neutral Theory
Motoo Kimura 1968
Original thesis: there are too many genes for selection to act in any
significant way on all simultaneously, no population is sufficiently
large to bear the reduction in fitness
Now better understood as a balance between new mutation and
genetic drift; the genome is in a constant state of flux
Consistent with
Molecular clock
Large percentage of non-coding and non-conserved DNA and
redundancy of genetic code
Gene Flow
homogenization of metapopulations
addition of alleles/genotypes to demes
opposite effect of drift
complete gene flow = panmixia
recall models of gene flow: island, stepping stone, isolation by
distance, extinction and recolonization
Gene Flow
m = rate of gene flow = % of gene copies carried into a population
from outside per generation
Nm = number of immigrants per generation, a measure of gene flow
FST = fixation index ≈ % of genetic variation of a total population that
is due to differences among sub-populations
genetic structure – a structured population is one with high FST, that
is, the subpopulation is not representative of the total
As Nm ↑, equilibrium FST ↓, genetic structure ↓
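The relationship between Nm and equilibrium FST is often summarized with Wright's island-model approximation, FST ≈ 1 / (4Nm + 1). That formula is not stated in the slides; it is added here as a hedged illustration and assumes neutral diploid nuclear loci, many demes of equal size, and a low migration rate. The function name is hypothetical.

```python
def equilibrium_fst(nm):
    """Island-model approximation: FST ~ 1 / (4 * Nm + 1), Nm = immigrants per generation."""
    return 1 / (4 * nm + 1)

for nm in (0.1, 1, 10):
    print(f"Nm = {nm}: equilibrium FST ~ {equilibrium_fst(nm):.3f}")
```

Under this approximation a single immigrant per generation already brings FST down to 0.2, and ten bring it to about 0.024, regardless of deme size.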
Gene Flow
It takes very few immigrants to homogenize populations
Yet, typically populations are very structured, gene flow is
surprisingly low
Direct estimates of gene flow (mark and recapture studies) suggest
more gene flow than is typically inferred from measured FST