Gene Flow
Up to now, we have dealt with local populations in which all individuals can be
viewed as sharing a common system of mating. But in many species, the species is
broken up into many local populations with restricted amounts of interbreeding.
Therefore, the system of mating within differs from the system of mating between.
The system of mating between local populations determines the amount of GENE
FLOW.
We will start with a simple model in which two infinitely large local populations
experience gene flow by exchanging a portion m of their populations each
generation. Consider one locus with two alleles, A and a, as follows:
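[Diagram: Population 1 carries alleles A and a at frequencies p1 and q1; Population 2
carries A and a at frequencies p2 and q2. Each generation, each population draws a
fraction 1-m of its genes from itself and a fraction m from the other population,
giving frequencies p1', q1' in Population 1 and p2', q2' in Population 2.]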
With neutrality, one has that
$$
\begin{aligned}
p_1' &= (1-m)p_1 + m p_2, & \Delta p_1 &= p_1' - p_1 = -m(p_1 - p_2) \\
p_2' &= (1-m)p_2 + m p_1, & \Delta p_2 &= p_2' - p_2 = -m(p_2 - p_1)
\end{aligned}
$$
These equations show that gene flow acts as an evolutionary force (i.e., alters allele
frequencies) if 1) m > 0 (the local populations are not completely reproductively
isolated) and 2) p1 ≠ p2 (the local populations are genetically distinct to some degree;
this will always be true if the local populations are finite in size and have been
isolated for some time, because isolation ensures divergence by genetic drift).
In the special case where p1 = 0 and p2 > 0, gene flow introduces new genetic
variability into the population. In this sense, gene flow acts like mutation.
However, unlike mutation, gene flow can alter frequencies at many loci
simultaneously and can cause radical and extremely rapid shifts in allele frequency.
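The one-generation recursion can be checked numerically. The sketch below is only an illustration: the function name `gene_flow_step` and the starting frequencies are hypothetical, chosen to mimic the special case in which allele A is absent from population 1.

```python
def gene_flow_step(p1, p2, m):
    """One generation of symmetric gene flow between two populations.

    p1, p2: frequency of allele A in populations 1 and 2
    m:      fraction of each population drawn from the other one
    """
    p1_next = (1 - m) * p1 + m * p2
    p2_next = (1 - m) * p2 + m * p1
    return p1_next, p2_next


# Hypothetical values: A is absent from population 1 but present in population 2.
p1, p2, m = 0.0, 0.5, 0.1
p1_next, p2_next = gene_flow_step(p1, p2, m)

# Gene flow introduces A into population 1, much as mutation would.
print("p1' =", p1_next, " p2' =", p2_next)

# The change in p1 agrees with the closed form -m(p1 - p2).
print("delta p1 =", p1_next - p1, " -m(p1 - p2) =", -m * (p1 - p2))
```

With m = 0.1, a single generation moves A from absent to a frequency of about 0.05 in population 1, a far larger shift than any realistic mutation rate could produce.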
Conditions causing m>0. Although this appears simple, m in reality represents a
complex interaction between the pattern of dispersal and the mating system. For
example, inbreeding (in the pedigree sense) can greatly reduce the opportunity for
gene flow, even if the individuals are in physical proximity. E.g., the Tuareg (a
Berber people of the Sahara) mate almost exclusively with cousins. As a result, this
tribe shows almost no gene flow with other tribes with which they are physically intermingled.
Assortative mating can also greatly reduce the amount of gene flow. E.g., Western
Grebes had apparently split and differentiated into two color morphs (light and
dark) in the past. These two morphs now occur together in certain parts of the West.
In one population, 33% of the mating pairs would be expected to be mixed under random
mating; but due to strong positive assortative mating, only 1.2% of pairs were mixed.
This greatly reduces gene flow and allows the maintenance of all the genetic
differences between the color morphs, and not just the loci that determine the color.
The reason is simple: although there may be linkage equilibrium within each color
phase, with respect to the global population all loci that are differentiated between
the phases will show disequilibrium with the color loci. Another example of this is
the European corn borer, which has two pheromone races that are now broadly
sympatric. There is strong assortative mating for pheromone phenotype, and hence
despite sympatry, the races have maintained much differentiation at isozyme loci
that have no impact on the pheromone phenotype directly. In contrast,
disassortative mating enhances m for all loci. E.g., D. melanogaster has a strong
disassortative mating pheromone system, shows much less differentiation than
corn borers, and is effectively a single, cosmopolitan species showing little
geographical differentiation (except for a handful of selected loci and inversions) on
even a continental basis.
It is also important to note that the assortative or disassortative mating that
determines m for all loci can be based on a non-genetic phenotype. This was already
noted for the Amish who have assortative mating based on religion and who, as a
consequence, maintain extreme genetic distinctiveness from surrounding
populations. Likewise, social castes in Chile are strong determinants of assortative
mating, and for historical reasons are correlated with the amount of Indian ancestry.
As a consequence, the Indian and Spanish gene pools are still quite distinct despite
400 years of socially limited gene flow. Another example is provided by whites and
blacks in the U.S. vs. N.E. Brazil. In North America, European settlers imported black
slaves mainly from 1700 to 1808, with 98% of them coming from West and West-Central
Africa. There is a strong tendency for assortative mating by racial category,
but when hybrids are formed, they are socially classified as blacks. (Genetically and
phenotypically, the hybrids are intermediate and are no more "black" than they are
"white".) This social definition of hybrids as "black", when coupled with assortative
mating by racial category and the numerical predominance of "whites", results in a
very asymmetrical gene flow pattern. Effectively, almost all gene flow is from whites
into blacks, with almost none going in the other direction. Let M = the effective
amount of gene flow over the entire relevant period of North American history (in
contrast to m, which was a per generation gene flow parameter). Then we can
model the North American situation as follows:
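[Diagram: the European gene pool (allele A at frequency pw) contributes all of the genes
(proportion 1) of the present-day "white" gene pool, which therefore still has A at
frequency pw. The present-day "black" gene pool draws a proportion M of its genes from
the European pool and a proportion 1-M from the West African pool (allele A at
frequency pa), so that its frequency of A is]
$$p_b = M p_w + (1-M) p_a$$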
Given the allele frequencies, you can estimate M by
$$M = \frac{p_b - p_a}{p_w - p_a}$$
E.g., for the Rh + allele, pb = .4381, pa = .5512, and pw = .0279, so M = .216. In the U.S.,
estimates of M range from 3% (S.C.) to 27% (Detroit).
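The estimate of M can be reproduced with a few lines of code. This is a minimal sketch: the function name `admixture_proportion` is hypothetical, and the only data used are the three allele frequencies quoted above.

```python
def admixture_proportion(p_b, p_a, p_w):
    """Estimate M from pb = M*pw + (1 - M)*pa.

    p_b: allele frequency in the admixed ("black") gene pool
    p_a: allele frequency in the West African source gene pool
    p_w: allele frequency in the European source gene pool
    """
    if p_w == p_a:
        # The allele carries no information about M if the sources do not differ.
        raise ValueError("source populations must differ in allele frequency")
    return (p_b - p_a) / (p_w - p_a)


# Allele frequencies quoted in the text.
p_b, p_a, p_w = 0.4381, 0.5512, 0.0279
M = admixture_proportion(p_b, p_a, p_w)
print(round(M, 3))  # 0.216

# Consistency check: plugging M back into the mixing equation recovers pb.
p_b_check = M * p_w + (1 - M) * p_a
print(round(p_b_check, 4))  # 0.4381
```

The estimate is only as good as the difference pw - pa. Alleles with similar frequencies in the two source populations give a small denominator and hence a noisy estimate of M.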
In the northeast of Brazil, the social definition of hybrids is "white". Hence, NE
Brazil had the opposite pattern of gene flow:
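[Diagram: the West African gene pool (allele A at frequency pa) contributes all of the
genes (proportion 1) of the present-day "black" gene pool, which therefore still has A
at frequency pa. The present-day "white" gene pool draws a proportion M of its genes
from the West African pool and a proportion 1-M from the European pool (allele A at
frequency pw), so that its frequency of A is]
$$p_w' = M p_a + (1-M) p_w$$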
Because the blacks were a minority in Brazil as well, this pattern of gene flow means
that most N.E. Brazilians are of mixed European/African ancestry. E.g., in
N.E. Brazil, the "white" gene pool is 59% European, 30% African, and 11%
Indian. Similarly, one can characterize people on the basis of skin color from "most
Caucasoid" to "most Negroid". In the U.S., the "most Caucasoid" group is about 100%
European in origin, and the average "black" about 20% European. In N.E. Brazil, the
"most Caucasoid" group is 71% European, and the "most Negroid" 28%. Thus, the social
definitions used in the system of mating in the two countries have had a major genetic
impact on the composition of their present-day populations despite similar initial
conditions.
The Genetic Impact of Gene Flow
We have already seen that allele frequencies are altered when gene flow occurs
between genetically distinct populations. But the alterations are in a specific
direction. Let d = p1 - p2. Recall that
$$p_1' = (1-m)p_1 + m p_2 = p_1 - m(p_1 - p_2) = p_1 - md, \qquad p_2' = p_2 + md$$
Hence
$$d' = p_1' - p_2' = (p_1 - md) - (p_2 + md) = d(1-2m)$$
so that |d'| < |d| for all 0 < m < 1. After t generations:
$$d_t = d(1-2m)^t \to 0 \quad \text{as } t \to \infty$$
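The geometric decay of the difference between the two populations can be confirmed by iterating the recursion directly. The sketch below uses hypothetical starting frequencies and a hypothetical helper name, `frequency_difference`. It compares the simulated difference with d(1-2m)^t at every generation.

```python
def frequency_difference(d0, m, t):
    """Closed-form difference p1 - p2 after t generations: d0 * (1 - 2m)^t."""
    return d0 * (1 - 2 * m) ** t


# Hypothetical starting point: two strongly differentiated populations.
p1, p2, m = 0.9, 0.1, 0.05
d0 = p1 - p2
generations = 50

for t in range(1, generations + 1):
    # Both updates use the previous generation's values (tuple assignment).
    p1, p2 = (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1
    assert abs((p1 - p2) - frequency_difference(d0, m, t)) < 1e-12

print("difference after", generations, "generations:", p1 - p2)
print("mean frequency (unchanged by symmetric gene flow):", (p1 + p2) / 2)
```

Note that the mean of the two frequencies does not change. Symmetric gene flow only redistributes variation from between populations to within them.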
Therefore, GENE FLOW DECREASES GENETIC VARIABILITY BETWEEN
POPULATIONS. However, recall that, like mutation, GENE FLOW CAN
INTRODUCE NEW ALLELES INTO A POPULATION, AND THEREFORE GENE
FLOW INCREASES GENETIC VARIABILITY WITHIN A POPULATION.
VERY IMPORTANT -- THE EFFECTS OF GENE FLOW ON WITHIN AND
BETWEEN POPULATION GENETIC VARIABILITY ARE THE OPPOSITE OF
THOSE OF GENETIC DRIFT. THEREFORE, THE BALANCE BETWEEN DRIFT
AND GENE FLOW IS THE PRIMARY DETERMINANT OF THE GENETIC
POPULATION STRUCTURE OF A SPECIES. Genetic population structure refers to
1) how genetic variability is distributed within a species (within and between local
populations), and 2) how genetic variability in gene pools is related to individual
level genotypic variability (this is also highly dependent upon the system of
mating). Genetic and genotypic variability provide the raw material for all
evolutionary change, including that caused by natural selection. As will be seen
later, natural selection operates within the constraints imposed by the genetic
structure.
The Balance of Gene Flow and Drift
Recall that to measure the impact of genetic drift upon identity by descent, we
started with the equation:
$$F_t = \frac{1}{2N_{ef}} + \left[1 - \frac{1}{2N_{ef}}\right]F_{t-1}$$
To examine the balance between drift and mutation, we modified the above
equation as follows:
$$F_t = \left\{\frac{1}{2N_{ef}} + \left[1 - \frac{1}{2N_{ef}}\right]F_{t-1}\right\}(1-\mu)^2$$
A similar modification can be used to address the following question: suppose two
populations of inbreeding effective size Nef are experiencing gene flow at a rate of m
per generation. Then, what is the probability that two randomly drawn genes from
the same subpopulation are identical by descent AND from the same population?
That is, if one of the genes came from the other gene pool, we no longer regard it as
identical. The equation for this probability is then:
$$F_t = \left\{\frac{1}{2N_{ef}} + \left[1 - \frac{1}{2N_{ef}}\right]F_{t-1}\right\}(1-m)^2$$
which at equilibrium yields
$$F_{eq} \approx \frac{1}{4N_{ef}m + 1}$$
if m is small. These results emphasize the similar impact of gene flow and
mutation, as discussed above. This can also be interpreted in the coalescent sense as
the probability that two genes randomly drawn from the same subpopulation
coalesce back to a common ancestor before either lineage experienced a gene flow
event, given that either coalescence or gene flow has occurred. This equilibrium
equation reflects the balance of gene flow (proportional to m) vs. drift (proportional
to 1/Nef), as measured by their ratio [m/(1/Nef) = Nefm], upon identity by descent
within a subpopulation when alleles drawn from outside are regarded as
nonidentical. Wright therefore defined this F as "Fst", where the "st" designates this as
identity by descent in the subpopulation with regard to the total population. (Note,
just as drift influences both ibd and variances of allele frequencies, resulting in more
than one effective size, there is also an alternative definition of Fst in terms of
variances of allele frequencies, as will be given in the next handout.) As expected,
as m goes up, Fst goes down; as 1/Nef goes up (drift goes up), Fst goes up.
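To see how good the small-m approximation is, one can iterate the recursion for Ft until it stops changing and compare the result with 1/(4Nefm + 1). This is a rough sketch under assumed values (Nef = 1000, m = 0.001). The function name `ibd_equilibrium`, the tolerance, and the generation cap are all illustrative choices.

```python
def ibd_equilibrium(n_ef, m, tol=1e-12, max_gen=1_000_000):
    """Iterate F_t = {1/(2N) + [1 - 1/(2N)] F_{t-1}} (1 - m)^2 to convergence."""
    drift = 1.0 / (2.0 * n_ef)      # chance two genes coalesce in one generation
    no_flow = (1.0 - m) ** 2        # chance neither gene arrived by gene flow
    f = 0.0                         # start with no identity by descent
    for _ in range(max_gen):
        f_next = (drift + (1.0 - drift) * f) * no_flow
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    return f


# Hypothetical parameter values giving Nef*m = 1.
n_ef, m = 1000, 0.001
f_iterated = ibd_equilibrium(n_ef, m)
f_approx = 1.0 / (4.0 * n_ef * m + 1.0)

print("iterated equilibrium:", f_iterated)
print("approximation 1/(4Nm + 1):", f_approx)
```

For these values the two numbers agree to about three decimal places. The approximation degrades as m grows, since it drops terms of order m^2 and m/Nef.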
What is surprising is how little gene flow is needed to cause two populations to
behave effectively as a single evolutionary lineage. E.g., let Nefm = 1, that is, one
"effective" migrant per generation. Then, Fst = 1/5 = .20. That is, 80% of the gene
pairs drawn from the same subpopulation will show gene flow before coalescence
(that is, the genes travelled through different geographical areas before they
coalesced). The other thing that is surprising is that the proportion of genes shared
by two populations as measured by 1-Fst depends only upon the effective number of
migrants (Nefm) and not the rate of gene flow (m). For example, two subpopulations
of a billion each would share 80% of their genes by exchanging only 1 individual
per generation, as would two subpopulations of size 100. The reason why the same
number of migrants is needed for a particular level of Fst and not the same rate of
gene flow is that Fst represents a balance between the rate at which drift causes
subpopulations to diverge vs. the rate at which gene flow makes them more similar.
In large populations, divergence is slow, so small amounts of gene flow are effective
in counterbalancing divergence; as populations become smaller, larger and larger
rates of gene flow are needed to counterbalance the increasing rate of divergence.
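The claim that only the product Nefm matters can be illustrated by holding Nefm = 1 while varying population size over seven orders of magnitude. In the sketch below, `fst_approx` is the equilibrium formula from the text, and `fst_exact` is the algebraic fixed point of the recursion Ft = {1/(2Nef) + [1-1/(2Nef)]Ft-1}(1-m)^2. Both function names are hypothetical.

```python
def fst_approx(n_ef, m):
    """Small-m approximation: 1 / (4*Nef*m + 1)."""
    return 1.0 / (4.0 * n_ef * m + 1.0)


def fst_exact(n_ef, m):
    """Fixed point of F = {a + (1 - a) F} s, with a = 1/(2 Nef), s = (1 - m)^2."""
    a = 1.0 / (2.0 * n_ef)
    s = (1.0 - m) ** 2
    return a * s / (1.0 - (1.0 - a) * s)


# One effective migrant per generation (Nef * m = 1) at very different sizes.
results = {}
for n_ef in (100, 10_000, 1_000_000_000):
    m = 1.0 / n_ef
    results[n_ef] = (fst_approx(n_ef, m), fst_exact(n_ef, m))
    print(f"Nef = {n_ef:>13,}  m = {m:.0e}  "
          f"approx = {results[n_ef][0]:.4f}  exact = {results[n_ef][1]:.4f}")
```

The rate m differs by a factor of ten million between the smallest and largest populations, yet Fst stays close to .20 in every case.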
Similarly, it is the product Nefm (which reflects the balance of drift vs. gene flow)
and not m that determines the relative coalescence times of genes within and
among local populations. If there is restricted gene flow among demes, it makes
sense that the average time to coalescence (a common DNA molecule) for two genes
sampled within a deme will be less than that for two genes sampled at random from
the entire species. In particular, Slatkin (Genet. Res. 58: 167-175, 1991) has shown
that these relative times are determined by Nefm. The exact relationship depends
upon the pattern of gene flow, but consider the simple “island model” case of a
species subdivided into a large number of local demes each of size Nef and each
receiving m of its genes per generation from the species at large. Then,
$$N_{ef}m = \frac{t_0}{4(t - t_0)}$$
where t0 = the average time to coalescence of two genes sampled from the same deme
and t = the average time to coalescence of two genes sampled from the entire species.
Hence, the ratio of within-deme coalescence time to entire-species coalescence time
is:
$$\frac{t_0}{t} = \frac{4N_{ef}m}{1 + 4N_{ef}m}$$
For example, the ratio of coalescence times of Y chromosomes in East Anglia, UK to
humans globally has been estimated to be between 0.56 and 0.71 (Cooper et al.
Human Molecular Genetics 5, 1759-1766, 1996). This yields an Nefm for Y
chromosomes of between 0.64 and 1.22 for humans (note, 2Nefm appears in the ratio
equation in this case and not 4Nefm because Y-DNA is haploid).
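The Y-chromosome figures follow from inverting the ratio equation. Writing the ratio as cNefm/(1 + cNefm), with c = 4 for diploid autosomal genes and c = 2 for haploid Y-DNA, gives Nefm = ratio/[c(1 - ratio)]. The function name `nefm_from_ratio` and the parameter name `c` are hypothetical labels for this sketch.

```python
def nefm_from_ratio(ratio, c=4):
    """Invert t0/t = c*Nm / (1 + c*Nm) to get Nm.

    ratio: within-deme coalescence time divided by species-wide coalescence time
    c:     4 for diploid autosomal loci, 2 for haploid Y-DNA (as in the text)
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must lie in [0, 1)")
    return ratio / (c * (1.0 - ratio))


# Estimated range of the East Anglia / global ratio quoted in the text.
low = nefm_from_ratio(0.56, c=2)
high = nefm_from_ratio(0.71, c=2)
print(round(low, 2), round(high, 2))  # 0.64 1.22
```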
Likewise, Fst now has a simple interpretation in terms of coalescence times:
$$F_{st} = \frac{t - t_0}{t} \qquad \text{or} \qquad \frac{t_0}{t} = 1 - F_{st}$$
In general, there is a lack of appreciation over just how little gene flow is needed to
keep populations evolving together as a single unit. For example, Fst ≈ .15 when the
major racial groups of humans are regarded as the subpopulations. This could be
explained by only a little more than one effective migrant per generation (1.42)
among the races over the recent evolutionary history of humans (a result consistent
with coalescent times of Y-DNA -- recall that Nef for Y DNA refers only to males).
Likewise, one can convert this Fst into the relative coalescence times of genes within
races versus humans as an entire species such that the average coalescence time of
two genes drawn from within a race is 85% of that for 2 genes randomly drawn from
humanity as a whole. Thus, it does not take a lot of exchange between the races to
ensure that humans evolve as a single evolutionary lineage. This fact is not widely
appreciated, as evidenced by the debates over the “out-of-Africa replacement” vs. the
multiregional hypotheses concerning the origins of the modern races.
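The two numbers quoted for human races (1.42 effective migrants and 85% relative coalescence time) both come from Fst ≈ .15. Solving Fst = 1/(4Nefm + 1) for Nefm gives Nefm = (1/Fst - 1)/4, and the relative coalescence time is 1 - Fst. The function names below are hypothetical.

```python
def nefm_from_fst(fst):
    """Invert Fst = 1 / (4*Nm + 1) to get the effective number of migrants."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


def relative_coalescence_time(fst):
    """t0/t = 1 - Fst: within-subpopulation time relative to the whole species."""
    return 1.0 - fst


fst_humans = 0.15  # value quoted in the text
migrants = nefm_from_fst(fst_humans)
rel_time = relative_coalescence_time(fst_humans)
print(round(migrants, 2))   # 1.42
print(round(rel_time, 2))   # 0.85
```

As a cross-check, an Fst of .20 returns exactly one effective migrant per generation, matching the earlier example.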
The Balance of Gene Flow, Mutation, and Drift
If we regard that ibd can be destroyed by both mutation and gene flow, then the
appropriate balance equation is:
$$F_t = \left\{\frac{1}{2N_{ef}} + \left[1 - \frac{1}{2N_{ef}}\right]F_{t-1}\right\}\left[(1-\mu)(1-m)\right]^2$$
If both µ and m are small, then using a Taylor's series, (1-µ)(1-m) ≈ 1-µ-m. Hence,
$$F_{eq} \approx \frac{1}{4N_{ef}(\mu + m) + 1}$$
The above equation emphasizes the similar role that the disparate forces of
mutation and gene flow have upon genetic variation and identity by descent.
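Because µ and m enter the equilibrium only through their sum, whichever is larger dominates. The sketch below compares the approximation with the algebraic fixed point of the three-force recursion, using hypothetical values (Nef = 1000, µ = 10^-5, m = 10^-3). The function names are illustrative.

```python
def f_eq_approx(n_ef, mu, m):
    """Approximate equilibrium: 1 / [4*Nef*(mu + m) + 1]."""
    return 1.0 / (4.0 * n_ef * (mu + m) + 1.0)


def f_eq_exact(n_ef, mu, m):
    """Fixed point of F = {a + (1 - a) F} s with s = [(1 - mu)(1 - m)]^2."""
    a = 1.0 / (2.0 * n_ef)
    s = ((1.0 - mu) * (1.0 - m)) ** 2
    return a * s / (1.0 - (1.0 - a) * s)


# Hypothetical values: gene flow is 100 times more frequent than mutation.
n_ef, mu, m = 1000, 1e-5, 1e-3

with_both = f_eq_approx(n_ef, mu, m)
flow_only = f_eq_approx(n_ef, 0.0, m)
exact_both = f_eq_exact(n_ef, mu, m)

print("gene flow only:       ", round(flow_only, 4))
print("gene flow + mutation: ", round(with_both, 4))
print("exact fixed point:    ", round(exact_both, 4))
```

With these values, adding mutation lowers the equilibrium F only slightly, since µ is a small part of the sum µ + m.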