Download statgen4

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Gene nomenclature wikipedia , lookup

Gene expression profiling wikipedia , lookup

Mutation wikipedia , lookup

Pharmacogenomics wikipedia , lookup

Quantitative trait locus wikipedia , lookup

Heritability of IQ wikipedia , lookup

Genome evolution wikipedia , lookup

Gene wikipedia , lookup

Point mutation wikipedia , lookup

Public health genomics wikipedia , lookup

Group selection wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Genetic engineering wikipedia , lookup

Epistasis wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

History of genetic engineering wikipedia , lookup

Inbreeding wikipedia , lookup

Dominance (genetics) wikipedia , lookup

Gene expression programming wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Genome (book) wikipedia , lookup

Designer baby wikipedia , lookup

Polymorphism (biology) wikipedia , lookup

Human genetic variation wikipedia , lookup

Koinophilia wikipedia , lookup

Genetic drift wikipedia , lookup

Population genetics wikipedia , lookup

Microevolution wikipedia , lookup

Transcript
1) When the Hardy-Weinberg Law Fails to
Apply
 Mutation
 The frequency of gene B and its allele b will not remain in Hardy-Weinberg

equilibrium if the rate of mutation of B -> b (or vice versa) changes.
By itself, mutation probably plays only a minor role in evolution; the rates are
simply too low. But evolution absolutely depends on mutations because this is
the only way that new alleles are created. After being shuffled in various
combinations with the rest of the gene pool, these provide the raw material on
which natural selection can act.
 Gene Migration
 Many species are made up of local populations whose members tend to breed

within the group. Each local population can develop a gene pool distinct from
that of other local populations.
However, members of one population may breed with occasional immigrants
from an adjacent population of the same species. This can introduce new genes
or alter existing gene frequencies in the residents. In many plants and some
animals, gene migration can occur not only between subpopulations of the same
species but also between different (but still related) species. This is called
hybridization. If the hybrids later breed with one of the parental types, new genes
are passed into the gene pool of that parent population. This process, is called
introgression.
 Genetic Drift
 As we have seen, interbreeding often is limited to the members of local
populations. If the population is small, hardy-Weinberg may be violated. Chance
alone may eliminate certain members out of proportion to their numbers in the
population. In such cases, the frequency of an allele may begin to drift toward
higher or lower values. Ultimately, the allele may represent 100% of the gene
pool or, just as likely, disappear from it. Drift produces evolutionary change, but
there is no guarantee that the new population will be more fit than the original
one. Evolution by drift is aimless, not adaptive.
Controversial when first proposed (Kimura 1968) , Incontrovertible in
2001:
DNA sequence polymorphisms are abundant
In Eukarya, most of the genome is noncoding
most sequence polymorphism lies in noncoding regions
Most sequence polymorphisms appear selectively neutral
 Nonrandom Mating
 One of the cornerstones of the Hardy-Weinberg equilibrium is that mating in the





population must be random. If individuals (usually females) are choosy in their
selection of mates the gene frequencies may become altered. Darwin called this
sexual selection.
Assortative mating -Humans seldom mate at random preferring phenotypes like
themselves (e.g., size, age, ethnicity). This is called assortative mating.
Isolate communities
Worldwide, 1/3 of all marriages are between people born within 10 miles of each
other
Cultures in which consanguinity is more prominent
Consanguinity is marriage between relatives e.g. second or third cousins
 Natural Selection
 If individuals having certain genes are better able to produce mature offspring
than those without them, the frequency of those genes will increase. This is
simple expressing Darwin's natural selection in terms of alterations in the gene
pool. (Darwin knew nothing of genes.) Natural selection results from
differential mortality and/or
differential fecundity.


 Mortality Selection
 Certain genotypes are less successful than others in surviving through to the end

of their reproductive period.
The evolutionary impact of mortality selection can be felt anytime from the
formation of a new zygote to the end (if there is one) of the organism's period of
fertility. Mortality selection is simply another way of describing Darwin's criteria
of fitness: survival.
 Fecundity Selection
 Certain phenotypes (thus genotypes) may make a disproportionate contribution
to the gene pool of the next generation by producing a disproportionate number
of young. Such fecundity selection is another way of describing another criterion
of fitness described by Darwin: family size. In each of these examples of natural
selection certain phenotypes are better able than others to contribute their genes
to the next generation. Thus, by Darwin's standards, they are more fit. The
outcome is a gradual change in the gene frequencies in that population.
2) Effect of Natural Selection on Gene
Frequencies.
 Let us define the frequency of each genotype in the population
as: w, and the initial allele distribution as p and q for A1 and A2.
wA1/ A1  1- r
wA1/ A 2  1
wA 2 / A 2  1- s
3)
Fitness
The average fitness is:
W  (1- r ) p 2  2 pq  (1- s)q 2  1- rp 2 - sq 2
P  ([(1- r ) p 2  pq]/ W ) - p  pq[ s - (r  s) p]/ W
0,1


[ p, q ]  
1, 0
 s /(r  s), r /(r  s)

4)
Hetrozygote Advantage
s (1-r) p 2n +p n q n
s
p n+1 =
r+s
W
r+s
s
=[(1-r)p 2n +p n q n ]W]/W=
r+s
s
=[(1-r)p 2n +p n q n ](1-rp n2 -sq n2 )]/W
r+s
1-rp n -sqn
s
=
[p n ]
W
r+s
The difference decreases to zero only for positive r and s. Thus the
scenario in which both alleles can survive is Hetrozygote
Advantage
5)
Recessive diseases
If r>0, and s=0, the disadvantage appears only homozygotic A1.
In this case:
p n+1 =p n (1-rp n )/(1-rp n2 )
1/p n+1 -1/p n =1/p n [(1-rp n2 )/(1-rp n )-1]=
[r(1-p 2n )/(1-rp n )]
1/p n -1/p 0 =nr
6)
Fitness Summary
 Third fix point is in the range [0,1] only if r and s have the same
sign.
 It is stable only of both r and s are positive
 In all other cases one allele is extinct.
 If r>0 and s=0 then the steady state is still p=0, but is is obtained
with a rate pn=1/(nr+1/p0)
7)
Summary
 In the absence of selection an allele concentration equilibrium is
obtained after one generation.
 In the presence of selection, usually a single allele survives.
 There are many mechanisms which can lead to the failure of the
hardy weinberg equilibrium.
8)
Genetic Variation
 Three fundamental levels and each is a genetic resource of
potential importance to conservation:
1. Genetic variation within individuals (heterozygosity)
2. Genetic differences among individuals within a population
3. Genetic differences among populations
 Species rarely exist as panmictic population = single, randomly
interbreeding population
 Typically, genetic differences exist among populations—this
geographic genetic differences=Crucial component of overall
genetic diversity
9)
heterozygosity
 Several measures of heterozygosity exist. The value of these
measures will range from zero (no heterozygosity) to nearly 1.0
(for a system with a large number of equally frequent alleles).
We will focus primarily on expected heterozygosity (HE, or
gene diversity, D). The simplest way to calculate it for a single
locus is as:
H  1   pi2
where pi is the frequency of the ith of k alleles
 . If we want the gene diversity over several loci we need double
summation and subscripting as follows
H  1   pij2
i
j
 In H.W heterozygosity is given by 2pq. The rest of the
2
2
expression ( p + q ) is the homozygosity.
 The heterozgosity for a two-allele system is described by a
concave down parabola that starts at zero (when p = 0) goes to a
maximum at p = 0.5 and goes back to zero when p = 1. In fact
for any multi-allelic system, heterozygosity is greatest when
 p1 = p2 = p3 = ….pk
 The maximum heterozygosity for a 10-allele system comes when
each allele has a frequency of 0.1 H then equals 0.9.
10) Genetic Variation
 HI, HS, HT refer to the average heterozygosity within individuals,
subpopulations and the total population, respectively
 HT = HP + DPT
 where HT = total genetic variation (heterozygosity) in the
species;
 HP = average diversity within populations (average
heterozygosity)
 DPT = average divergence among populations across total species
range
 Divergence arise among populations from random processes
(founder effects, genetic drift, bottlenecks, mutations) and from
local selection).
 Drop in heterozygosity defined as Wright’s F statistics:
FIS 
HS  HI
H  HS
H  HI
, FIT  T
, FST  T
HS
H TS
HT
 2 subpopulations of equal size, gene frequencies p1 = 0.8, p2 =
0.3.
 Gene frequency in total population midway between them pt =
0.55
 HS1 = 2p1q1 = 2 x 0.8 x (1-0.8) = 0.32
 HS2 = 2p2q2 = 2 x 0.3 x (1-0.3) = 0.42
 HS = average(HS1, HS2) = (0.32 + 0.42)/2 = 0.37
 HT = 2 x 0.55 x (1 - 0.55) = 0.495
 FST=(0.495-0.37)/0.495=0.252
11) Indentity
 G1 and G2 are identical by descent (i.b.d) if they are physical
copies of the same ancestor, or one of the other.
 G1 and G2 are identical by state (i.b.s) if they represent the same
allele.
 The kinship between two relatives fij is the probability that
random gene from autosomal loci in I and j are i.b.d.
 The interbreeding coefficient is the probability that his or her
two genes from autosomal loci are i.b.d
 Every mutation creates a new allele
 Identity in state = identity by descent (IBD)
A1A1
A1A1
A1A2
A1A2
A1A2
A2A2
 F=1-H (inbreeding coefficient) is probability of IBD = 1/2.
 fAC is the coancestry of A with C etc., i.e. the probability of 2
gametes taken at random, 1 from A and one from C, being IBD.
 The inbreeding is thus ,- fAA be the probability of 2 gametes
taken at random from A being IBD.

A-B C- D
|
|
P - Q
|
X

FX  f PQ 
1
1
1
1
f AD  f AC  f BC  f BD
4
4
4
4
A- B
|
|
P - Q
|
X
1
1
1
1
f AD  f AC  f BC  f BD
4
4
4
4
1
1
1 1 1
  2 f AB  f AA  f BB    0    
4
4
2 2 4
FX  f PQ 

 Note that if mutations occurred twice indpendently
A1A1
A1A2
A1A2
A1A1
A1A1
A1A2
A1A2
A2A2
A1A2
A1A1
A1A2
A2A2
A2 A2 IBD
A2 A2 IBD
A2 A2
alike in state (AIS)
not identical by descent
12) Back to genetic drift
 Assume a population size of N, therefore 2N alleles in
population. Imagine eggs and sperm released randomly into
environment (e.g. sea)
 What is the probability of 2 gametes drawn randomly having the
same allele?
gen 0
gen 1
 Therefore, after 1 generation the
level of inbreeding is F1 = 1/2N
2N
alleles
probability = 1/(2N)
 After t generations the probability is

1 

Ft  1  1 

 2N 
Ft 
1 
1 
 1 
 Ft 1
2N  2N 
t
 Genetic drift will make initially identical population different
 Eventually, each population will be fixed for a different allele
 If there are very many populations, the proportion of
populations fixed for each allele will correspond to the initial
frequency of the allele
 Small populations will get different more rapidly
 The effective population size is determined by
 Large variation in the number of offspring
 Overlapping generation
1
1
= n
Ne
 Fluctuations in population size
Σ
1
Ni
(Harmonic mean)
 Unequal numbers of males and females contributing to
reproduction
4NfNm
Ne =
(Nf + Nm)
13) Founders effect
Population
Costa Rica
Finland
Hutterites
Japan
Iceland
Newfoundland
Quebec
Sardinia
# of
founders
4,000
500
80
1,000
25,000
25,000
2,500
500
# of
generations
12
80-100
14
80-100
40
16
12-16
400
Current size
2,500,000
5,000,000
36,000
120,000,000
300,000
500,000
6,000,000
1,660,000
14) Coalesence








Simplification: 0, 1 or 2 offspring
Coalesce: have the same parent
Probability to coalesce: 1/N
Probability Not to coalesce: 1 – 1/N
t generations:
(1-1/N)t
Average time to coalesce for 2 genes: N
For the whole population: 2N
15) Genetic drift and mutation
Ft 
1
1
2
2
1     1   1    Ft 1
2N
 2N 

 Probability of neither of 2 alleles being mutated is (1-)2

1
1
2
2
1     1   1    Ft 1
2N
 2N 
1
Fˆ  Ft  Ft 1 
1 4N 
Ft 
 If one also includes gene flow
 FT = [1/2N + (1 - 1/2N) * FT-1] * (1 – m- μ)2
16) Balance between Mutation and selection.
 Mutations can provide a balancing force to selection.
 Let us assume a mutation rate of from A2 to A1. The dynamics
equation is:
(1-r)p2 +pq
P=
-p+(1- )*q 
W
 An equilibrium is obtained when
pq+(1-s)q 2
q=
 1+ (1-s)/s 
W(1- )