Download Effects of the Ordering of Natural Selection and Population

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Koinophilia wikipedia , lookup

Gene expression programming wikipedia , lookup

Human genetic variation wikipedia , lookup

Dual inheritance theory wikipedia , lookup

Viral phylodynamics wikipedia , lookup

Selective breeding wikipedia , lookup

Deoxyribozyme wikipedia , lookup

Tag SNP wikipedia , lookup

The Selfish Gene wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Polymorphism (biology) wikipedia , lookup

Genetic drift wikipedia , lookup

Sexual selection wikipedia , lookup

Natural selection wikipedia , lookup

Microevolution wikipedia , lookup

Population genetics wikipedia , lookup

Group selection wikipedia , lookup

Transcript
G3: Genes|Genomes|Genetics Early Online, published on May 12, 2017 as doi:10.1534/g3.117.041038
Effects of the ordering of natural selection and population regulation
mechanisms on Wright-Fisher models
Zhangyi Hea,∗, Mark Beaumontb , Feng Yua
a
b
School of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom
School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
Abstract
We explore the effect of different mechanisms of natural selection on the evolution of populations for one- and two-locus systems. We compare the effect of viability and fecundity selection
in the context of the Wright-Fisher model with selection under the assumption of multiplicative
fitness. We show that these two modes of natural selection correspond to different orderings of
the processes of population regulation and natural selection in the Wright-Fisher model. We
find that under the Wright-Fisher model these two different orderings can affect the distribution of trajectories of haplotype frequencies evolving with genetic recombination. However, the
difference in the distribution of trajectories is only appreciable when the population is in significant linkage disequilibrium. We find that as linkage disequilibrium decays the trajectories for
the two different models rapidly become indistinguishable. We discuss the significance of these
findings in terms of biological examples of viability and fecundity selection, and speculate that
the effect may be significant when factors such as gene migration maintain a degree of linkage
disequilibrium.
Keywords: Wright-Fisher model, Viability selection, Fecundity selection, Linkage
disequilibrium
1
1. Introduction
2
In population genetics, one studies the genetic composition of biological populations and
3
the changes in genetic composition that result from the operation of various factors, including
4
natural selection. Bürger (2000), Ewens (2004) and Durrett (2008) provided an excellent theo-
5
retical introduction to this field. The most basic, but also most important, model in population
∗
Corresponding author at: School of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom
Email address: [email protected] (Zhangyi He)
Preprint submitted to G3: Genes, Genomes, Genetics
© The Author(s) 2013. Published by the Genetics Society of America.
May 12, 2017
6
genetics is the Wright-Fisher model, developed by Fisher (1922) and Wright (1931), which
7
forms the basis for most theoretical and applied research in population genetics to date, includ-
8
ing Kingman’s coalescent (Kingman, 1982), Ewens’ sampling formula (Ewens, 1972), Kimura’s
9
work on fixation probabilities (Kimura, 1955) and techniques for inferring demographic and
10
genetic properties of biological populations (see Tataru et al., 2016, and references therein).
11
Such widespread application is mainly due to not only the strong universality results for the
12
Wright-Fisher model, e.g., Möhle’s work on the Cannings model (Möhle, 2001), but also the
13
fact that the Wright-Fisher model captures the essence of the biology involved and provides an
14
elegant mathematical framework for characterizing the dynamics of gene frequencies, even in
15
complex evolutionary scenarios.
16
The simplest version of the Wright-Fisher model (Fisher, 1922; Wright, 1931) is concerned
17
with a finite random-mating population of fixed population size evolving in discrete and nonover-
18
lapping generations at a single biallelic locus, which can be regarded as a simplified version of the
19
life cycle where the next generation is randomly sampled with replacement from an effectively
20
infinite gene pool built from equal contributions of all individuals in the current generation.
21
The Wright-Fisher model can be generalized to incorporate other evolutionary forces such as
22
natural selection (see, for example, Etheridge, 2011).
23
Natural selection is the differential survival and reproduction of individuals due to differences
24
in phenotype, which has long been a topic of interest in population genetics. According to
25
Christiansen (1984), natural selection can be classified according to the stage of an organism’s
26
life cycle at which it acts: viability selection (or survival selection) which acts to improve the
27
rate of zygote survival and fecundity selection (or reproduction selection) which acts to improve
28
the rate of gamete reproduction, as shown in Figure 1.
29
Nagylaki (1997) provided different derivations of multinomial-sampling models for genetic
30
drift at a single multiallelic locus in a monoecious or dioecious diploid population for different
31
orders of the evolutionary forces (i.e., mutation, natural selection and genetic drift) in the life
32
cycle. Prugnolle et al. (2005) found that gene migration occurring before or after asexual repro-
33
duction in the life cycle of monoecious trematodes can have different effects on a finite island
34
model, depending on values of the other genetic parameters. A natural question that arises
35
here is whether different stages of an organism’s life cycle at which natural selection acts can
2
Adults (N < ∞)
population regulation
meiosis
viability selection
Gametes (∞)
Zygotes (∞)
mating
(a) The life cycle stage at which viability selection acts.
Adults (N < ∞)
fecundity selection
population regulation
meiosis
Gametes (∞)
Zygotes (∞)
mating
(b) The life cycle stage at which fecundity selection acts.
Figure 1: Life cycles of a diploid population incorporated with different types of natural selection.
36
cause different population behaviors under the Wright-Fisher model. It is an inevitable choice
37
that we have to make in current statistical inferential frameworks based on the Wright-Fisher
38
model for inferring demographic and genetic properties of biological populations, especially for
39
natural selection, which also affects the performance of these statistical inference methods.
3
40
In the present work, we are concerned with a finite random-mating diploid population of
41
fixed population size N (i.e., a population of 2N chromosomes) evolving with discrete and
42
nonoverlapping generations under natural selection within the framework of the Wright-Fisher
43
model, especially for the evolution of one- and two-locus systems under natural selection. We
44
carry out diffusion analysis of Wright-Fisher models with selection and use extensive Monte
45
Carlo simulation studies to address the question of whether different types of natural selection
46
can cause different population behaviors under the Wright-Fisher model, especially when nat-
47
ural selection takes the form of viability or fecundity selection. Our main finding is that the
48
distribution of the trajectories of haplotype frequencies for two recombining loci depends on
49
whether viability or fecundity selection is operating. However, this difference is only appre-
50
ciable when the haplotype frequencies are in significant linkage disequilibrium, and arises as a
51
consequence of the interplay between genetic recombination and natural selection. Once linkage
52
disequilibrium disappears, the distributions of trajectories under the Wright-Fisher model with
53
either viability or fecundity selection become almost identical fairly quickly.
54
2. Materials and Methods
55
In this section, we provide detailed formulations for a finite random-mating diploid pop-
56
ulation of fixed population size evolving with discrete and nonoverlapping generations under
57
natural selection (i.e., viability and fecundity selection, respectively) within the framework of
58
the Wright-Fisher model and their diffusion approximations. We also introduce the Hellinger
59
distance to measure the difference in the behavior of the Wright-Fisher model between viability
60
and fecundity selection.
61
2.1. Wright-Fisher models with selection
62
Consider a monoecious population of N randomly mating diploid individuals evolving with
63
discrete and nonoverlapping generations under natural selection. We assume that there are two
64
alleles segregating at each autosomal locus, and the population size N is fixed. Let Xi
65
the frequency of haplotype i in N adults of generation k ∈ N and X (N ) (k) denote the vector
66
with frequencies of all possible haplotypes. We can study the population evolving under natural
67
selection in terms of the changes in haplotype frequencies from generation to generation.
(N )
4
(k) be
68
To determine the transition of haplotype frequencies from one generation to the next, we
69
need to investigate how the mechanisms of evolution (e.g., natural selection) alter the genotype
70
frequencies at intermediate stages of the life cycle. Let Yij (k) be the frequency of the ordered
71
genotype made up of haplotypes i and j in N adults of generation k and Y (N ) (k) designate the
72
vector with frequencies of all possible genotypes. Note that genotypes are regarded as ordered
73
here only for simplicity of notation. Under the assumption of random mating, the genotype
74
frequency is equal to the product of the corresponding haplotype frequencies (Edwards, 2000),
(N )
(N )
(N )
Yij (k) = Xi
(N )
(k)Xj
(N )
(k) = Xj
(N )
(k)Xi
(N )
(k) = Yji (k).
(1)
75
As illustrated in Figure 1, natural selection takes the form of viability selection and the
76
life cycle moves through a loop of population regulation, meiosis, random mating, viability
77
selection, population regulation, and so forth (see Figure 1a), or natural selection takes the
78
form of fecundity selection and the life cycle moves through a loop of population regulation,
79
fecundity selection, meiosis, random mating, population regulation, and so forth (see Figure 1b).
80
The entrance of the life cycle here is assumed to be with the population reduced to N adults
81
right after a round of population regulation. In the life cycles shown in Figure 1, there are
82
four potential mechanisms of evolutionary change, natural selection, meiosis, random mating
83
and population regulation. We assume that natural selection, meiosis and random mating
84
occur in an effectively infinite population so can be treated deterministically (Hamilton, 2011).
85
Suppose that population regulation (i.e., genetic drift) acts in a similar manner to the Wright-
86
Fisher sampling introduced by Fisher (1922) and Wright (1931). In other words, population
87
regulation corresponds to randomly drawing N zygotes with replacement from an effectively
88
infinite population to become new adults in the next generation, consequently completing the
89
life cycle shown in Figure 1. Thus, given the genotype frequencies Y (N ) (k) = y, the genotype
90
frequencies in the next generation satisfy
Y (N ) (k + 1) | Y (N ) (k) = y ∼
1
Multinomial (N, q) ,
N
(2)
91
where q is a function of the genotype frequencies y denoting the vector with frequencies of
92
all possible genotypes of an effectively infinite population after the possible mechanisms of
5
93
evolutionary change (except population regulation) at intermediate stages of the life cycle such
94
as natural selection, meiosis and random mating within generation k. The explicit expression
95
of the sampling probabilities q will be given in the following two sections for the evolution of
96
one- and two-locus systems under natural selection, respectively.
97
To simplify notation, we introduce a function ρiuv of three variables u, v and i, defined as
1
ρiuv = (δui + δvi ),
2
(3)
98
where δui and δvi denote the Kronecker delta functions. We can then express the frequency of
99
haplotype i in terms of the corresponding genotype frequencies as
(N )
Xi
(k) =
X
(N )
ρiuv Yuv
(k),
(4)
u,v
and the transition probabilities of the haplotype frequencies from one generation to the next
can be easily obtained from Eqs. (2)-(4). We refer to the process X (N ) = {X (N ) (k) : k ∈ N} as
the Wright-Fisher model with selection, whose first two conditional moments satisfy
(N )
E (k,x) Xi (k + 1) = pi
p (δ − p )
i ij
j
(N )
(N )
E (k,x) Xi (k + 1)Xj (k + 1) =
+ pi pj ,
2N
(5)
(6)
where
E (k,x) ( · ) = E · | X (N ) (k) = x
is the short-hand notation for the conditional expectation of a random variable given the population of the haplotype frequencies x in generation k and
pi =
X
ρiuv quv
u,v
100
is the frequency of haplotype i of an effectively infinite population after the possible mechanisms
101
of evolutionary change (except population regulation) at intermediate stages of the life cycle
102
such as natural selection, meiosis and random mating within generation k, which obviously can
103
be expressed in terms of the haplotype frequencies x.
6
104
2.1.1. One-locus Wright-Fisher models with selection
105
Let us consider a monoecious population of N randomly mating diploid individuals at a
106
single autosomal locus A, segregating into two alleles, A1 and A2 , evolving under natural
107
selection in discrete and nonoverlapping generations. We call the two possible haplotypes A1
108
and A2 haplotypes 1 and 2, respectively. As we have stated above, we need to formulate the
109
sampling probabilities q for the evolution of one-locus systems under natural selection, which
110
is the vector of frequencies of all possible genotypes of an effectively infinite population after
111
a single generation of natural selection, meiosis (without genetic recombination) and random
112
mating due to the absence of genetic recombination at meiosis in the evolution of one-locus
113
systems.
114
With a single biallelic locus, there are four possible zygotes (i.e., four ordered genotypes)
115
that result from the random union of two gametes. We denote the fitness of the genotype formed
116
by haplotypes i and j by wij for i, j = 1, 2. In the life cycles illustrated in Figure 1, the number
117
of adults is regulated to be of size N and the gene frequencies in the subsequent generation can
118
be described by multinomial sampling of the normalized frequencies in an effectively infinite pool
119
of zygotes. When natural selection takes different forms, genotype frequencies in the zygotes
120
can be modeled in different ways.
121
In the case of what can be called ‘viability selection’ shown in Figure 1a, the N adults have
122
an equal chance of forming gametes, which unite at random to form zygotes. This is followed
123
by viability selection on genotypes of zygotes, leading to modified genotype frequencies. The
124
subsequent genotype frequencies are obtained by multinomial sampling. We can express the
125
frequency of the genotype formed by haplotypes i and j of an effectively infinite population
126
after a single generation of meiosis, random mating and viability selection as

(v)
qij =
wij 
w̄
2
X

ρiuv yuv  
u,v=1
127
where
w̄ =
2
X
i,j=1

wij 
2
X
2
X

ρjuv yuv  ,
(7)
u,v=1

ρiuv yuv  
u,v=1
2
X

ρjuv yuv  .
(8)
u,v=1
128
Alternatively, in what can be termed ‘fecundity selection’ illustrated in Figure 1b, the N
129
adults have an unequal chance of forming gametes, depending on their genotypes; the gametes
7
130
unite at random and the subsequent genotype frequencies remain unchanged until multinomial
131
sampling. We can express the frequency of the genotype formed by haplotypes i and j of
132
an effectively infinite population after a single generation of fecundity selection, meiosis and
133
random mating as

(f )
qij = 
2
X
u,v=1
134


2
X
w
w
uv
uv
ρjuv
ρiuv
yuv  
yuv  ,
w̄
w̄
(9)
u,v=1
where
w̄ =
2
X
wuv yuv .
(10)
u,v=1
From Eqs. (7)-(8) and (9)-(10), we see that the transition probabilities of the genotype frequencies from one generation to the next depend only on the genotype frequencies in the current
generation for both viability and fecundity selection, but take on different forms. Combining
with Eqs. (2)-(4), we find that the process X (N ) is a time-homogeneous Markov process with
(N )
respect to the filtration F (N ) = {Fk
: k ∈ N} generated by the process Y (N ) evolving in the
state space
(
ΩX (N ) =
x∈
1
0,
,...,1
2N
)
2 X
2
:
xi = 1 ,
i=1
135
which we call the one-locus Wright-Fisher model with selection.
136
2.1.2. Two-locus Wright-Fisher models with selection
137
Now we turn to the study of a monoecious population of N randomly mating diploid in-
138
dividuals at two linked autosomal loci named A and B, each segregating into two alleles, A1 ,
139
A2 and B1 , B2 , evolving under natural selection with discrete and nonoverlapping generations.
140
We call the four possible haplotypes A1 B1 , A1 B2 , A2 B1 and A2 B2 haplotypes 1, 2, 3 and 4,
141
respectively. As we have stated above, we need to formulate the sampling probabilities q for
142
the evolution of two-locus systems under natural selection, which is the vector with frequencies
143
of all possible genotypes of an effectively infinite population after a single generation of natu-
144
ral selection, meiosis (with genetic recombination) and random mating due to the presence of
145
genetic recombination at meiosis in the evolution of two-locus systems.
146
With two biallelic loci, there are 16 possible zygotes (i.e., 16 ordered genotypes) that result
147
from the random union of four gametes. We denote the fitness of the genotype formed by
148
haplotypes i and j by wij for i, j = 1, 2, 3, 4, and designate the recombination rate between the
8
149
two loci by r (i.e., the rate at which a recombinant gamete is produced at meiosis). To simplify
150
notation, we introduce a vector of auxiliary variables η = (η1 , η2 , η3 , η4 ), where η1 = η4 = −1
151
and η2 = η3 = 1. In the life cycles shown in Figure 1, the number of adults is regulated to be of
152
size N and the gene frequencies in the subsequent generation can be described by multinomial
153
sampling of the normalized frequencies in an effectively infinite pool of zygotes. When natural
154
selection takes different forms, genotype frequencies in the zygotes can be modeled in different
155
ways.
156
Following similar reasoning as in the one-locus case, we can express the frequency of the
157
genotype formed by haplotypes i and j of an effectively infinite population after a single gener-
158
ation of meiosis, random mating and viability selection as

(v)
qij =
wij 
w̄

4
X
ρiuv yuv + ηi rD 
u,v=1
4
X

ρjuv yuv + ηj rD ,
(11)
u,v=1
where
w̄ =
4
X

wij 
D=

ρiuv yuv + ηi rD 
u,v=1
i,j=1

4
X
4
X

ρjuv yuv + ηj rD
(12)
u,v=1

ρ1uv yuv  
u,v=1
4
X
4
X


ρ4uv yuv  − 
u,v=1

4
X
ρ2uv yuv  
u,v=1

4
X
ρ3uv yuv  .
(13)
u,v=1
159
Similarly, the frequency of the genotype formed by haplotypes i and j of an effectively
160
infinite population after a single generation of fecundity selection, meiosis and random mating
161
is

(f )
qij = 
4
X
u,v=1

ρiuv
wuv
yuv + ηi rD 
w̄
4
X

ρjuv
u,v=1
wuv
yuv + ηj rD ,
w̄
(14)
where
w̄ =
4
X
wuv yuv
(15)
u,v=1

D=
4
X
u,v=1

 


4
4
4
X
X
X
w
w
w
w
uv
uv
uv
uv
ρ1uv
yuv  
ρ4uv
yuv  − 
ρ2uv
yuv  
ρ3uv
yuv  .
w̄
w̄
w̄
w̄
u,v=1
u,v=1
u,v=1
(16)
9
From Eqs. (11)-(13) and (14)-(16), we see that the transition probabilities of the genotype
frequencies from one generation to the next depend only on the genotype frequencies in the
current generation for both viability and fecundity selection, but take on different forms. Combining with Eqs. (2)-(4), we show that the process X (N ) is a time-homogeneous Markov process
(N )
with respect to the filtration F (N ) = {Fk
: k ∈ N} generated by the process Y (N ) evolving in
the state space
(
ΩX (N ) =
x∈
1
0,
,...,1
2N
)
4 X
4
:
xi = 1 ,
i=1
162
which we call the two-locus Wright-Fisher model with selection.
163
2.2. Diffusion approximations
164
Due to the interplay of stochastic and deterministic forces, the Wright-Fisher model with
165
selection presents analytical challenges beyond the standard Wright-Fisher model for neutral
166
populations. The analysis of the Wright-Fisher model with selection today is greatly facilitated
167
by its diffusion approximation, commonly known as the Wright-Fisher diffusion with selection,
168
which can be traced back to Kimura (1964) and has already been successfully applied in the
169
statistical inference of natural selection (see Malaspinas, 2016, and references therein). Here we
170
only present the formulation of the Wright-Fisher diffusion with selection and refer to Durrett
171
(2008) for a rigorous proof, especially for one- and two-locus systems.
172
The Wright-Fisher diffusion with selection is a limiting process of the Wright-Fisher model
173
with selection characterizing the changes in haplotype frequencies over time in an extremely
174
large population evolving under extremely weak natural selection. More specifically, the se-
175
lection coefficients (and the recombination rate if there is more than one locus) are assumed
176
to be of order 1/(2N ) and the process runs time at rate 2N , i.e., t = k/(2N ). The selection
177
coefficient mentioned here represents the difference in fitness between a given genotype and
178
the genotype with the highest fitness. For example, a common category of fitness values for a
179
diploid population at a single locus can be presented as follows: genotypes A1 A1 , A1 A2 and
180
A2 A2 at a given locus A have fitness values 1, 1 − hA sA and 1 − sA , respectively, where sA
181
is the selection coefficient and hA is the dominance parameter (see Hamilton, 2011, for other
182
categories of fitness values presented in terms of selection coefficients).
(N )
Let ∆Xi
(k) denote the change in the frequency of haplotype i from generation k to the
10
next. Using Eqs. (5) and (6), we can obtain its first two conditional moments
(N )
E (k,x) ∆Xi (k) = pi − xi
p (δ − p )
i ij
j
(N )
(N )
E (k,x) ∆Xi (k)∆Xj (k) =
+ (pi − xi )(pj − xj ).
2N
Considering the limits as the population size N goes to infinity, we can formulate the infinitesimal mean vector µ(t, x) as
(N )
µi (t, x) = lim 2N E ([2N t],x) ∆Xi ([2N t])
N →∞
= lim 2N (pi − xi )
N →∞
(17)
and the infinitesimal covariance matrix Σ(t, x) as
(N )
(N )
Σij (t, x) = lim 2N E ([2N t],x) ∆Xi ([2N t])∆Xj ([2N t])
N →∞
= lim pi (δij − pj ) + 2N (pi − xi )(pj − xj ),
N →∞
(18)
183
where [ · ] is used to denote the integer part of the value in the brackets, according to standard
184
techniques of diffusion theory (see, for example, Karlin & Taylor, 1981).
The process X (N ) thereby converges to a diffusion process, denoted by X = {X(t), t ≥ 0},
satisfying the stochastic differential equation of the form
dX(t) = µ (t, X(t)) dt + σ (t, X(t)) dW (t),
where the diffusion coefficient matrix σ(x) satisfies the relation that
σ(t, x)σ T (t, x) = Σ(t, x)
185
and W (t) is a multi-dimensional standard Brownian motion. We refer to the process X as
186
the Wright-Fisher diffusion with selection, which we will use to investigate the difference in the
187
behavior of the Wright-Fisher model between viability and fecundity selection in Section 3.1.
11
188
2.3. Statistical distances
189
Given that the Wright-Fisher model with selection is a Markov process, the evolution is
190
completely determined by its transition probabilities. We can, therefore, study the difference
191
in the behavior of the Wright-Fisher model between viability and fecundity selection in terms
192
of their transition probabilities.
193
We define the conditional probability distribution function of the Wright-Fisher model with
194
selection X (N ) evolving from the population of the initial haplotype frequencies x0 over k
195
generations (i.e., k-step transition probabilities) as
π(x0 , xk ) = P X (N ) (k) = xk | X (N ) (0) = x0 .
(19)
196
We use a statistical distance to quantify the difference between two probability distributions for
197
the Wright-Fisher model with viability and fecundity selection (i.e., the difference in the behav-
198
ior of the Wright-Fisher model between viability and fecundity selection). Rachev et al. (2013)
199
provided an excellent introduction of statistical distances, and here we employ the Hellinger dis-
200
tance, introduced by Hellinger (1909), to quantify the difference in the behavior of the Wright-
201
Fisher model between viability and fecundity selection, defined as
1
H(π (v) , π (f ) )(x0 , k) = √
2
sX
2
π (v) (x0 , xk ) − π (f ) (x0 , xk ) ,
(20)
xk
202
where π (v) is the probability distribution for the Wright-Fisher model with viability selection and
203
π (f ) is the probability distribution for the Wright-Fisher model with fecundity selection, both
204
of which are given by Eq. (19) combined with the Wright-Fisher model with the corresponding
205
type of natural selection.
206
Given the difficulties in analytically formulating the probability distribution π, especially for
207
the population evolving over a long-time period, we resort to Monte Carlo simulation here, which
208
enables us to get an empirical probability distribution function associated to the probability
209
distribution π, defined as
π̂(x0 , xk ) =
M
1 X
I{ξ(m) =x } ,
k
M
k
(21)
m=1
where I is the indicator function, namely I{ξk =xk } is one if ξ k = xk is true and zero otherwise,
12
(m)
ξk
is the m-th realization of the haplotype frequencies simulated under the Wright-Fisher
model with selection from the population of the initial haplotype frequencies x0 over k generations, and M is the total number of independent realizations in Monte Carlo simulation.
Combing with Eqs. (20) and (21), we can formulate the Monte Carlo approximation for the
Hellinger distance H(π (v) , π (f ) ) as
Ĥ(π (v) , π (f ) )(x0 , k) = H(π̂ (v) , π̂ (f ) )(x0 , k).
210
According to Van der Vaart (2000), the rate of convergence for the empirical probability dis-
211
tribution π̂ to the probability distribution π with respect to the Hellinger distance is of order
212
C1 (|ΩX | − 1)1/2 /M 1/2 , where C1 is a constant and |ΩX | is total number of possible states in
213
the state space ΩX . Combining with Le Cam & Yang (2000), we find that the rate of con-
214
vergence for the Hellinger distance approximated by Monte Carlo simulation, Ĥ(π (v) , π (f ) ), to
215
the Hellinger distance H(π (v) , π (f ) ) is of order C2 (|ΩX | − 1)1/2 /M 1/2 , where C2 is a constant.
216
Therefore, in theory, the Monte Carlo approximation for the Hellinger distance Ĥ(π (v) , π (f ) ) can
217
be used instead of the Hellinger distance H(π (v) , π (f ) ) as long as we increase the total number
218
of independent realizations M , which we will apply to measure the difference in the behavior
219
of the Wright-Fisher model between viability and fecundity selection in Section 3.2.
220
2.4. Data availability
221
The authors state that all data necessary for confirming the conclusions presented in the
222
article are represented fully within the article. Code used to simulate the Wright-Fisher model
223
with selection, including both viability and fecundity selection, and compute the results is
224
provided in Supplemental Material, File S1.
225
3. Results
226
In this section, we use diffusion analysis of Wright-Fisher models with selection and extensive
227
Monte Carlo simulation studies to investigate whether different types of natural selection can
228
cause different population behaviors under the Wright-Fisher model, especially when natural
229
selection takes the form of viability or fecundity selection.
13
230
We employ the category of fitness values presented in terms of selection coefficients given
231
in Section 2.2 and consider the simple case of directional selection with 0 ≤ sA ≤ 1, which
232
implies that the A1 allele is the type favored by natural selection. The dominance parameter
233
is assumed to be in the range 0 ≤ hA ≤ 1, i.e., general dominance. Suppose that fitness values
234
of two-locus genotypes are determined multiplicatively from fitness values at individual loci,
235
e.g., the fitness value of genotype A1 B2 /A2 B2 is (1 − hA sA )(1 − sB ), which means that there
236
is no position effect, i.e., coupling and repulsion double heterozygotes have the same fitness,
237
w14 = w23 = (1 − hA sA )(1 − hB sB ). Moreover, the recombination rate defined in Section 2.1.2
238
is in the range 0 ≤ r ≤ 0.5.
239
3.1. Diffusion analysis of the Wright-Fisher model
240
Now we use diffusion analysis of Wright-Fisher models with selection to address the question
241
of whether viability and fecundity selection can cause different population behavior under the
242
Wright-Fisher model, especially for the evolution of one- and two-locus systems under natural
243
selection. From Section 2.2, we see that the diffusion approximation for the Wright-Fisher
244
model with selection is fully determined from its infinitesimal mean vector µ(t, x) in Eq. (17)
245
and the infinitesimal covariance matrix Σ(t, x) in Eq. (18), which implies that we can carry
246
out diffusion analysis of Wright-Fisher models with selection to investigate the difference in the
247
behavior of the Wright-Fisher model between viability and fecundity selection by comparing
248
the difference in the haplotype frequencies of an effectively infinite population after a single
249
generation of natural selection, meiosis and random mating mentioned in Eqs. (17) and (18)
250
between viability and fecundity selection.
Under the set of assumptions on fitness values given above and using Taylor expansions with
respect to the selection coefficient sA , we find that for one-locus systems there is no difference
in the haplotype frequencies of an effectively infinite population after a single generation of
natural selection, meiosis and random mating between viability and fecundity selection, i.e.,
(v)
(f ) p
−
p
i
i = 0,
for i = 1, 2. For two-locus systems with the selection coefficients sA and sB and the recombina-
14
tion rate r, Supplemental Material, File S2, shows that
1
(v)
(f ) ,
pi − pi = O
N2
for i = 1, 2, 3, 4. Combining with Eqs. (17) and (18), we have
µ(v) (t, x) = µ(f ) (t, x)
Σ(v) (t, x) = Σ(f ) (t, x),
251
which leads to the same stochastic differential equation representation of the Wright-Fisher
252
diffusion with viability and fecundity selection for the evolution of one- and two-locus systems
253
under natural selection, respectively (see Supplemental Material, File S2, for detailed calcula-
254
tions).
255
Therefore, we can conclude that viability and fecundity selection bring about the same
256
population behavior under the Wright-Fisher diffusion for the evolution of one- and two-locus
257
systems under natural selection, which implies that there is almost no difference in popula-
258
tion behaviors under the Wright-Fisher model between viability and fecundity selection for the
259
evolutionary scenario of an extremely large population evolving under extremely weak natural
260
selection (and genetic recombination if there is more than one locus).
261
3.2. Simulation analysis of the Wright-Fisher model
262
Diffusion analysis of Wright-Fisher models with selection require assumptions on genetic pa-
263
rameters for tractability, i.e., it is only guaranteed to be a good approximation of the underlying
264
Wright-Fisher model in the case of an extremely large population evolving under extremely weak
265
natural selection (and genetic recombination if there is more than one locus). In this section, we
266
use extensive Monte Carlo simulation studies to investigate the difference in the behavior of the
267
Wright-Fisher model between viability and fecundity selection for other evolutionary scenarios
268
such as a small population evolving under strong natural selection. We illustrate how different
269
types of natural selection affect the behavior of the Wright-Fisher model with haplotype 1 in
270
detail in the following and expect other haplotypes to behave in a similar manner.
271
Let us designate the marginal probability distribution of the frequency of haplotype 1 by
272
π1 and simulate the dynamics of the Hellinger distance H(π 1 , π 1 ) over time for different
(v)
15
(f )
273
evolutionary scenarios to investigate whether viability and fecundity selection can cause different
274
population behaviors under the Wright-Fisher model using Eqs. (20) and (21) in Eq. (19).
1.0
1.0
1.0
1.0
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
selection coefficient
M=100000
0.2
M=10000
0
100 200 300 400 500
0.0
0
generation
100 200 300 400 500
generation
(a) We generate M independent realizations from simulating the one-locus Wright-Fisher model with
selection, where we adopt N = 500, hA = 0.5 and x0 = (0.3, 0.7).
1.0
1.0
1.0
1.0
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
selection coefficient
M=100000
0.2
M=10000
0
100 200 300 400 500
0.0
0
generation
100 200 300 400 500
generation
(b) We generate M independent realizations from simulating the two-locus Wright-Fisher model with
selection, where we adopt N = 500, sB = 0.05, hA = 0.5, hB = 0.5, r = 0.45 and x0 = (0.3, 0.4, 0.2, 0.1).
(v)
(f )
Figure 2: Dynamics of the Hellinger distance H(π1 , π1 ) simulated with the varying selection coefficient sA
over 500 generations under the one- and two-locus Wright-Fisher models with selection.
(v)
(f )
275
In Figure 2, we show the dynamics of the Hellinger distance H(π 1 , π 1 ) with the varying
276
selection coefficient sA over 500 generations under the one- and two-locus Wright-Fisher models
277
with selection, respectively, in which the Hellinger distance H(π 1 , π 1 ) is clearly not always
278
close to zero. This may result from the different forms natural selection takes or the statistical
279
noise produced in Monte Carlo simulation. Comparing the left columns of Figures 2a and 2b
(v)
16
(f )
(v)
(f )
280
with their right columns, we find that the Hellinger distance H(π 1 , π 1 ) decreases as the
281
total number of independent realizations M in Monte Carlo simulation increases, especially
282
when the selection coefficient sA is close to zero. We, therefore, believe that the discrepancy
283
should be caused mainly by the statistical noise produced in Monte Carlo simulation rather
284
than the different forms natural selection takes when the selection coefficient is close to zero,
285
which otherwise leads to a contradiction to what we have already achieved in Section 3.1. On
286
the contrary, when the selection coefficient is not close to zero, the discrepancy should be indeed
287
caused by the different forms natural selection takes rather than the statistical noise in Monte
288
Carlo simulation. We will confirm and discuss this point in more detail in Section 4.1.
289
Figure 2 shows that the difference in the behavior of the Wright-Fisher model between
290
viability and fecundity selection does exist, which becomes more significant when the effect of
291
natural selection on the population evolving over time increases. For the population evolving
292
under natural selection at a single locus, the difference in the behavior of the Wright-Fisher
293
model between viability and fecundity selection is almost negligible. However, for the population
294
evolving under natural selection at two linked loci, the difference in the behavior of the Wright-
295
Fisher model between viability and fecundity selection is no longer negligible, especially when
296
natural selection is not extremely weak. Therefore, we assert that viability and fecundity
297
selection can cause different population behaviors under the Wright-Fisher model, especially for
298
the population evolving under natural selection at two linked loci.
For further investigation into the difference in the population evolving over time at two linked
loci under the Wright-Fisher model between viability and fecundity selection, we introduce the
coefficient of linkage disequilibrium, proposed by Lewontin & Kojima (1960), to quantify the
level of linkage disequilibrium between the two loci, which is defined as
(N )
D(N ) (k) = X1
(N )
(k)X4
(N )
(k) − X2
(N )
(k)X3
(v)
(k).
(f )
299
We simulate the dynamics of the Hellinger distance H(π 1 , π 1 ) with the varying selection
300
coefficient sA and recombination rate r over 500 generations under the two-locus Wright-Fisher
301
model with selection for the population initially at different levels of linkage disequilibrium in
302
Figure 3.
303
Comparing the middle column of Figure 3 with its left and right columns, we find that the
17
300
400
500 0.0
1.0
1.0
0.8
0.8
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0
300
400
500 0.0
200 300
generation
400
500 0.0
400
500 0.0
1.0
1.0
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0
100
200
300
400
500 0.0
0
100
200
300
400
500 0.0
1.0
1.0
1.0
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
r=0.45 and D=0.05
1.0
0.4
0.2
300
0.2
0.4
200
0.8
0.2
0.0
1.0
selection coefficient
0.2
0.4
0.6
0.8
0.6
0.0
0.8
100
r=0.25 and D=0.05
r=0.45 and D=0
1.0
100
0
1.0
r=0.45 and D=−0.05
0
500 0.0
0.4
0.2
200
400
0.2
0.4
100
300
1.0
0.0
1.0
0.6
0.0
0.8
0
200
r=0.25 and D=0
1.0
selection coefficient
0.2
0.4
0.6
0.8
r=0.25 and D=−0.05
100
0.0
200
0.6
0.0
100
0.6
0
100
200 300
generation
(v)
400
500 0.0
0.0
0
0.8
0.6
0.2
0.8
0.4
0.4
1.0
0.2
0.6
r=0.05 and D=0.05
1.0
0.0
0.8
0.0
1.0
r=0.05 and D=0
1.0
selection coefficient
0.2
0.4
0.6
0.8
r=0.05 and D=−0.05
0
100
200 300
generation
400
500 0.0
(f )
Figure 3: Dynamics of the Hellinger distance H(π1 , π1 ) simulated with the varying selection coefficient sA
and recombination rate r over 500 generations under the two-locus Wright-Fisher model with selection for the
population initially at different levels of linkage disequilibrium. We generate M = 105 independent realizations
in Monte Carlo simulation and adopt N = 500, sB = 0.05, hA = 0.5 and hB = 0.5.
304
difference in the behavior of the two-locus Wright-Fisher model between viability and fecundity
305
selection becomes negligible when the population is initially in linkage equilibrium (see the
306
middle column of Figure 3), which implies that whether different types of natural selection can
307
cause different population behaviors under the Wright-Fisher model at two linked loci depends
308
significantly on the level of linkage disequilibrium. We thereby consider the dynamics of linkage
309
disequilibrium over time under the two-locus Wright-Fisher model with viability and fecundity
18
selection, respectively, for the population initially at different levels of linkage disequilibrium.
0.8
0.8
−0.04
0.00
0.04
0.00
0.04
0.0
−0.04
k=9 and D=0.05
0.00
0.04
k=12 and D=0.05
(v)
−0.04
0.00
0.04
k=15 and D=0.05
0.4
0.4
−0.04
0.00
0.04
linkage disequilibrium
0.0
0.2
0.0
−0.04
0.00
0.04
linkage disequilibrium
0.04
0.4
−0.04
0.08
−0.04
0.00
0.04
linkage disequilibrium
0.00
k=15 and D=0
0.8
k=6 and D=0.05
0.4
0.2
0.04
0.04
0.02
0.00
−0.04
0.00
0.04
linkage disequilibrium
0.00
−0.04
0.8
0.8
k=12 and D=0
0.8
k=3 and D=0.05
−0.04
0.0
0.04
0.4
0.00
0.04
k=0 and D=0.05
−0.04
0.00
0.04
0.12
0.00
0.04
0.02
−0.04
0.0
0.04
0.0
0.00
k=9 and D=0
0.4
0.4
−0.04
k=15 and D=−0.05
−0.04
0.00
0.04
linkage disequilibrium
0.0
0.04
0.0
0.2
0.00
k=6 and D=0
0.08
0.04
−0.04
0.0
0.04
0.4
0.00
k=3 and D=0
0.00
−0.04
0.12
0.04
k=12 and D=−0.05
0.4
0.10
0.04
0.00
0.00
probability distribution
0.0
0.4
0.8
k=9 and D=−0.05
0.02
−0.04
k=0 and D=0
probability distribution
0.0
0.4
0.8
k=6 and D=−0.05
0.6
k=3 and D=−0.05
viability
fecundity
0.00
probability distribution
0.0
0.4
0.8
k=0 and D=−0.05
0.00
310
−0.04
0.00
0.04
linkage disequilibrium
(f )
Figure 4: Dynamics of the probability distributions πD and πD simulated over 15 generations under the twolocus Wright-Fisher model with selection for the population initially at different levels of linkage disequilibrium.
We generate M = 105 independent realizations in Monte Carlo simulation and adopt N = 500, sA = 0.95,
sB = 0.05, hA = 0.5, hB = 0.5 and r = 0.45.
We define the conditional probability distribution function of linkage disequilibrium D(N )
evolving under the Wright-Fisher model from the population of the initial haplotype frequencies
x0 over k generations as
πD (D0 , Dk ) = P D(N ) (k) = Dk | D(N ) (0) = D0 ,
where
D0 = x0,1 x0,4 − x0,2 x0,3 .
311
We simulate the dynamics of the probability distribution πD over time for the population ini-
312
tially at different levels of linkage disequilibrium in Figure 4. We observe that the probability
313
distribution πD becomes concentrated on a narrower and narrower range of possible values
314
centered around 0 from generation to generation until fixing at 0. The dynamics of the proba-
315
bility distribution πD over time seems not to be associated with whether viability or fecundity
316
selection is occurring.
317
From Ridley (2004), there are three potential mechanisms of evolutionary change, genetic
318
recombination, natural selection and population regulation (genetic drift), in the life cycle,
19
319
which may affect the dynamics of linkage disequilibrium over time. Genetic recombination
320
always works towards linkage equilibrium due to genetic recombination generating new gamete
321
types to break down non-random genetic associations (Ridley, 2004). Natural selection alone
322
can not move the population far away from linkage equilibrium under the assumption that
323
fitness values of two-locus genotypes are multiplicative (Slatkin, 2008). Genetic drift can destroy
324
linkage equilibrium and create many genetic associations since genetic drift leads to the random
325
change in haplotype frequencies, which, however, can cause persistent linkage disequilibrium
326
only in small enough populations (Ridley, 2004). So once linkage equilibrium has been reached,
327
populations usually will not move far away from linkage equilibrium.
328
From Eqs. (11)-(13) and (14)-(16), provided that the population evolving over time is always
329
close to linkage equilibrium, the amount of the change in haplotype frequencies caused by genetic
330
recombination is negligible. The two-locus Wright-Fisher model where each locus segregates
331
into two alleles thereby becomes similar to the one-locus Wright-Fisher model for a single locus
332
segregating four alleles, where each gamete is analogous to a single allele. Therefore, once linkage
333
equilibrium has been reached, populations evolving under natural selection at two linked loci
334
within the framework of the Wright-Fisher model would no longer depend on whether viability
335
or fecundity selection is occurring. That is, viability and fecundity selection can cause different
336
population behaviors under the Wright-Fisher model only when the population is far away from
337
linkage equilibrium, which is confirmed by Figure 3.
338
Now we discuss why viability and fecundity selection lead to different population behaviors
339
under the two-locus Wright-Fisher model with selection. We simulate the dynamics of the
340
probability distributions π1
341
of linkage disequilibrium in Figure 5, from which it is clear that the Wright-Fisher models with
342
different types of natural selection have different rates of change in the frequency of haplotype 1
343
approaching fixation when the population is initially far away from linkage equilibrium, which
344
leads to the difference in the behavior of the Wright-Fisher model between viability and fecundity
345
selection, as illustrated in the left and right columns of Figure 3.
(v)
(f )
and π1
over time for the population initially at different levels
346
More specifically, as shown in Figure 5a, when the population is in negative linkage dise-
347
quilibrium, the Wright-Fisher model with selection of two linked loci drives haplotype 1 more
348
rapidly toward fixation than the Wright-Fisher model with selection of two completely linked
20
0.0
k=6
0.0
0.004
0.4
0.8
0.0
k=12
0.4
0.8
k=15
k=6
0.0
k=9
0.004
0.4
0.8
0.0
0.4
0.8
haplotype frequency
0.000
0.004
0.8
0.0
k=12
0.4
0.8
k=15
0.006
0.0
0.4
0.8
haplotype frequency
0.003
0.004
0.004
0.010
0.000
0.0
0.4
0.8
haplotype frequency
0.4
0.0
0.4
0.8
haplotype frequency
0.000
0.0
0.000
0.004
0.8
0.008
k=3
0.4
0.000
0.0
0.000
0.8
fecundity with r=0.45
fecundity with r=0
0.0
0.4
0.8
haplotype frequency
0.8
0.000
probability distribution
0.0
0.4
0.8
k=0
0.4
0.000
0.0
0.006
0.010
0.8
0.4
k=9
viability with r=0.45
fecundity with r=0.45
0.4
0.008
0.004
0.8
0.008
k=3
0.4
k=15
0.000
0.0
0.000
0.8
k=12
0.004
0.008
0.4
0.000
0.0
0.000
0.8
0.000
probability distribution
0.0
0.4
0.8
k=9
0.006
0.010
0.4
k=0
0.0
k=6
0.000 0.004 0.008
0.0
k=3
viability with r=0.45
viability with r=0
0.000
probability distribution
0.0
0.4
0.8
k=0
0.0
0.4
0.8
haplotype frequency
(a) The population is initially in negative linkage disequilibrium (D0 = −0.05).
0.0
k=6
0.004
0.8
0.0
k=12
0.0
0.4
0.8
haplotype frequency
0.4
0.8
k=15
0.004
0.0
0.4
0.8
haplotype frequency
0.8
0.004
0.4
0.000
0.0
0.4
0.8
haplotype frequency
0.0
0.000
0.008
0.8
0.4
k=15
0.004
0.0
0.4
0.8
haplotype frequency
0.006
0.010
0.000
0.0
0.4
0.8
haplotype frequency
0.4
k=9
fecundity with r=0.45
fecundity with r=0
k=12
0.004
0.004
0.8
0.008
k=3
0.4
0.0
0.000
0.0
0.8
0.000
0.8
0.000
0.4
0.4
0.004
0.0
0.000
0.8
0.0
0.008
k=9
0.006
0.010
0.4
0.8
0.000
k=6
0.4
0.008
0.0
0.000
0.004
0.8
0.008
k=3
0.4
k=15
0.004
0.008
0.0
0.000
0.8
viability with r=0.45
fecundity with r=0.45
0.000
probability distribution
0.0
0.4
0.8
0.4
k=12
0.000
0.0
0.000
0.8
k=0
probability distribution
0.0
0.4
0.8
k=9
0.006
0.010
0.4
k=0
0.0
k=6
0.000
0.0
k=3
viability with r=0.45
viability with r=0
0.000
probability distribution
0.0
0.4
0.8
k=0
0.0
0.4
0.8
haplotype frequency
(b) The population is initially in positive linkage disequilibrium (D0 = 0.05).
(v)
(f )
Figure 5: Dynamics of the probability distributions π1 and π1 simulated over 15 generations under the twolocus Wright-Fisher model with selection for the population initially at different levels of linkage disequilibrium.
We generate M = 105 independent realizations in Monte Carlo simulation and adopt N = 500, sA = 0.95,
sB = 0.05, hA = 0.5, hB = 0.5 and r = 0.45.
349
loci (i.e., r = 0). This is due to the fact that genetic recombination reinforces the change
350
in haplotype frequencies caused by natural selection for negative linkage disequilibrium (see
351
Hamilton, 2011). The middle row of Figure 5a shows that the Wright-Fisher model with via-
352
bility selection drives haplotype 1 more rapidly toward fixation than the Wright-Fisher model
353
with fecundity selection when the population is in negative linkage disequilibrium, which im21
354
plies that genetic recombination affecting natural selection at two linked loci is more significant
355
in the Wright-Fisher model with viability selection than that in the Wright-Fisher model with
356
fecundity selection for negative linkage disequilibrium. These results are also confirmed with
357
Figure 5b for the population in positive linkage disequilibrium. Therefore, the difference in the
358
behavior of the two-locus Wright-Fisher model between viability and fecundity selection results
359
from the different effect of genetic recombination on different types of natural selection within
360
the framework of the Wright-Fisher model, i.e., genetic recombination has a more significant
361
effect on natural selection at two linked loci when natural selection takes the form of viability
362
selection than fecundity selection.
363
Notice that here we employ the Hellinger distance to measure the difference in the behavior
364
of the Wright-Fisher model between viability and fecundity selection, and only simulate the
365
Hellinger distance between two probability distributions for the Wright-Fisher model with via-
366
bility and fecundity selection in limited evolutionary scenarios such as completely additive gene
367
action (h = 0.5). In Supplemental Material, we simulate the Hellinger distance between two
368
probability distributions for the Wright-Fisher model with viability and fecundity selection for
369
completely dominant gene action (h = 0) and completely recessive gene action (h = 1), recep-
370
tively (see Supplemental Material, File S3). We also provide the dynamics of the difference in
371
the behavior of the Wright-Fisher model between viability and fecundity selection for different
372
population sizes (see Supplemental Material, File S4). Moreover, we simulate the total variation
373
distance between two probability distributions for the Wright-Fisher model with viability and
374
fecundity selection for different evolutionary scenarios (see Supplemental Material, File S5). All
375
these simulated results given in Supplemental Material confirm what we have achieved above.
376
4. Discussion
377
In this section, we discuss the robustness of Monte Carlo simulation studies carried out
378
in Section 3. Moreover, we summarize our main results and discuss the significance of these
379
findings in terms of biological examples of fecundity and viability selection.
380
4.1. Robustness of Monte Carlo simulation studies
381
We perform a robustness analysis of Monte Carlo simulation studies carried out in Section 3.
382
This is mainly designed to demonstrate that the discrepancy in Monte Carlo simulation studies
22
383
results from whether viability selection or fecundity selection is occurring is not due to the
384
statistical noise produced in Monte Carlo simulation and is indeed caused by the different forms
385
natural selection takes.
1.0
1.0
0.8
0.8
0.8
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
1.0
N=500 and M=100000
1.0
selection coefficient
0.2 0.4 0.6 0.8
N=500 and M=10000
0 100
300
500
300
500
0.0
1.0
1.0
0.8
0.8
0.8
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
1.0
N=5000 and M=100000
1.0
selection coefficient
0.2 0.4 0.6 0.8
N=5000 and M=10000
0 100
0 100
300
generation
500
0 100
300
generation
500
0.0
(a) Natural selection takes the form of viability selection.
1.0
1.0
0.8
0.8
0.8
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
1.0
N=500 and M=100000
1.0
selection coefficient
0.2 0.4 0.6 0.8
N=500 and M=10000
0 100
300
500
300
500
0.0
1.0
1.0
0.8
0.8
0.8
0.6
0.6
0.6
0.4
0.4
0.4
0.2
0.2
0.2
0.0
0.0
0.0
1.0
N=5000 and M=100000
1.0
selection coefficient
0.2 0.4 0.6 0.8
N=5000 and M=10000
0 100
0 100
300
generation
500
0 100
300
generation
500
0.0
(b) Natural selection takes the form of fecundity selection.
(·)
(·)
Figure 6: Dynamics of the Hellinger distance H(π 1 , π 1 ) simulated with the varying selection coefficient sA
over 500 generations under the two-locus Wright-Fisher model with selection for natural selection taking different
forms. We adopt sB = 0.05, hA = 0.5, hB = 0.5, r = 0.45 and x0 = (0.3, 0.4, 0.2, 0.1).
23
386
We simulate the dynamics of the Hellinger distance of the two empirical probability distri-
387
butions for the two runs of the two-locus Wright-Fisher model with the same type of natural
388
selection here as an illustration, the two-locus Wright-Fisher model with viability selection in
389
Figure 6a and the two-locus Wright-Fisher model with fecundity selection in Figure 6b, respec-
390
tively. Figure 6 shows that the statistical noise in Monte Carlo simulation is increasing in the
391
population size N but is deceasing in the total number of independent realizations M . This is
392
confirmed by the rate of convergence for the Hellinger distance approximated by Monte Carlo
393
simulation we have stated in Section 2.3. For the population of N = 500 individuals in Monte
394
Carlo simulation studies, generating M = 105 independent realizations in Monte Carlo simula-
395
tion is large enough to remove most of the statistical noise. In comparison with Figures 6a and
396
6b, the discrepancy in Figure 2b shares the same pattern except for large selection coefficients,
397
which implies that the difference in the behavior of the Wright-Fisher model between viability
398
and fecundity selection indeed results from the different forms natural selection rather than the
399
statistical noise produced in Monte Carlo simulation when natural selection is not extremely
400
weak.
401
4.2. Summary and further perspectives
402
Our diffusion analysis of Wright-Fisher models with selection and extensive Monte Carlo
403
simulation studies show that with genetic recombination, the population evolving under natu-
404
ral selection within the framework of the Wright-Fisher model depends significantly on whether
405
viability or fecundity selection is occurring when the population is far away from linkage equi-
406
librium. This is mainly caused by different effects of genetic recombination on different types of
407
natural selection, i.e., genetic recombination has a more significant effect on natural selection at
408
linked loci when natural selection takes the form of viability selection than fecundity selection.
409
The difference in the behavior of the Wright-Fisher model becomes more significant as the effect
410
of natural selection and genetic recombination increases. However, after the population reaches
411
linkage equilibrium, population behaviors under the Wright-Fisher model with viability and
412
fecundity selection become almost identical fairly quickly.
413
We have shown that with genetic recombination, the population evolving under natural selec-
414
tion within the framework of the Wright-Fisher model depends significantly on whether viability
415
or fecundity selection is occurring when the population is far away from linkage equilibrium. Ac24
416
cording to Ridley (2004), most natural populations are probably near linkage equilibrium, which
417
implies that the different behavior of the Wright-Fisher model caused by viability or fecundity
418
selection may not be significant. However, one evolutionary scenario in which it may have signif-
419
icant consequences is that of admixture between two populations. In this case, there is likely to
420
be significant linkage disequilibrium due to the allele frequency differences between populations
421
(Pritchard & Rosenberg, 1999), and also, there is likely to be local natural selection in which
422
alleles that are favored in one population are selected against in the other (Charlesworth et al.,
423
1997), and vice-versa. Therefore, the results presented here may lead to testable predictions
424
about the role of life-history on the natural selection dynamics in populations. For example,
425
we predict that genetic recombination will have a less significant effect on the natural selection
426
dynamics when natural selection acts through fecundity difference between genotypes. It should
427
be noted that in our formulation the fecundity selection is associated with a difference among
428
monoecious diploid genotypes in their overall gamete production, and should not be confused
429
with gametic selection, for example natural selection on sperm binding alleles (Palumbi, 1999).
430
Moreover, we do not distinguish between sperm and egg production (i.e., the two haplotype
431
frequencies Xi and Xj in Eq. (1) are exchangeable). The most relevant organisms that may be
432
expected to conform to this assumption may be monoecious plant species, where correlations
433
between pollen and seed production may be induced through genetic variation in flower size
434
(e.g., Galen, 2000).
435
Given that we have developed a Monte Carlo framework for simulating the Wright-Fisher
436
model under either scenario (i.e., viability or fecundity selection), future developments could
437
include the addition of population structure, and it should then be straightforward to de-
438
velop a statistical inferential framework to allow us to distinguish between viability selection
439
or fecundity selection at candidate loci such as the approximate Bayesian computation (ABC)
440
framework (Liepe et al., 2014), which enables us to compute the posterior probabilities of the
441
Wright-Fisher models with viability and fecundity selection. Once the posterior probabilities
442
of candidate models have been estimated, we can make full use of the techniques of Bayesian
443
model comparison. Furthermore, in the present work, we investigate how different types of
444
natural selection affect the population evolving under the Wright-Fisher model mainly through
445
Monte Carlo simulation studies, which can only demonstrate the qualitative difference in the
25
446
behavior of the Wright-Fisher model. It would be far more challenging to carry out the analysis
447
of the effect of different types of natural selection on the Wright-Fisher model that can lead to
448
more quantitative comparisons.
449
Acknowledgments
450
We would like to thank Dr. Daniel Lawson for critical reading and suggestion and the
451
two anonymous reviewers for their helpful comments. This work was funded in part by the
452
Engineering and Physical Sciences Research Council (EPSRC) Grant EP/I028498/1 to F.Y.
453
References
454
Bürger, R. (2000). The Mathematical Theory of Selection, Recombination, and Mutation. Chich-
455
ester: Wiley.
456
Charlesworth, B., Nordborg, M., & Charlesworth, D. (1997). The effects of local selection,
457
balanced polymorphism and background selection on equilibrium patterns of genetic diversity
458
in subdivided populations. Genetical Research, 70 , 155–174.
459
Christiansen, F. (1984). Definition and measurement of fitness. In B. Shorrocks (Ed.), Evolu-
460
tionary ecology: the 23rd Symposium of the British Ecological Society, Leeds, 1982 . Oxford:
461
Blackwell Science Ltd.
462
463
464
465
466
467
468
469
470
471
Durrett, R. (2008). Probability Models for DNA Sequence Evolution. New York: SpringerVerlag.
Edwards, A. W. (2000). Foundations of Mathematical Genetics. Cambridge: Cambridge University Press.
Etheridge, A. (2011). Some Mathematical Models from Population Genetics: École D’Été de
Probabilités de Saint-Flour XXXIX-2009 . Berlin: Springer-Verlag.
Ewens, W. J. (1972). The sampling theory of selectively neutral alleles. Theoretical Population
Biology, 3 , 87–112.
Ewens, W. J. (2004). Mathematical Population Genetics 1: Theoretical Introduction. New York:
Springer-Verlag.
26
472
473
Fisher, R. A. (1922). On the dominance ratio. Proceedings of the Royal Society of Edinburgh,
42 , 321–341.
474
Galen, C. (2000). High and dry: drought stress, sex-allocation trade-offs, and selection on
475
flower size in the alpine wildflower Polemonium viscosum (Polemoniaceae). The American
476
Naturalist, 156 , 72–83.
477
Hamilton, M. (2011). Population Genetics. Chichester: Wiley-Blackwell.
478
Hellinger, E. (1909). Neue Begrndung der Theorie quadratischer Formen von unendlichvielen
479
480
481
Vernderlichen. Journal fr die reine und angewandte Mathematik , 136 , 210–271.
Karlin, S., & Taylor, H. E. (1981). A Second Course in Stochastic Processes. New York:
Academic Press.
482
Kimura, M. (1955). Solution of a process of random genetic drift with a continuous model.
483
Proceedings of the National Academy of Sciences of the United States of America, 41 , 144–
484
150.
485
486
487
488
489
490
491
492
Kimura, M. (1964). Diffusion models in population genetics. Journal of Applied Probability, 1 ,
177–232.
Kingman, J. F. C. (1982). The coalescent. Stochastic Processes and their Applications, 13 ,
235–248.
Le Cam, L., & Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts. New York:
Springer-Verlag.
Lewontin, R., & Kojima, K.-i. (1960). The evolutionary dynamics of complex polymorphisms.
Evolution, 14 , 458–472.
493
Liepe, J., Kirk, P., Filippi, S., Toni, T., Barnes, C. P., & Stumpf, M. P. (2014). A framework for
494
parameter estimation and model selection from experimental data in systems biology using
495
approximate Bayesian computation. Nature protocols, 9 , 439–456.
496
497
Malaspinas, A.-S. (2016). Methods to characterize selective sweeps using time serial samples:
an ancient DNA perspective. Molecular Ecology, 25 , 24–41.
27
498
499
500
501
Möhle, M. (2001). Forward and backward diffusion approximations for haploid exchangeable
population models. Stochastic processes and their applications, 95 , 133–149.
Nagylaki, T. (1997). Multinomial-sampling models for random genetic drift. Genetics, 145 ,
485–491.
502
Palumbi, S. R. (1999). All males are not created equal: fertility differences depend on gamete
503
recognition polymorphisms in sea urchins. Proceedings of the National Academy of Sciences,
504
96 , 12632–12637.
505
Pritchard, J. K., & Rosenberg, N. A. (1999). Use of unlinked genetic markers to detect pop-
506
ulation stratification in association studies. The American Journal of Human Genetics, 65 ,
507
220–228.
508
Prugnolle, F., Liu, H., De Meeûs, T., & Balloux, F. (2005). Population genetics of complex
509
life-cycle parasites: an illustration with trematodes. International Journal for Parasitology,
510
35 , 255–263.
511
512
Rachev, S. T., Klebanov, L., Stoyanov, S. V., & Fabozzi, F. (2013). The Methods of Distances
in the Theory of Probability and Statistics. New York: Springer-Verlag.
513
Ridley, M. (2004). Evolution. Oxford: Oxford University Press.
514
Slatkin, M. (2008). Linkage disequilibriumunderstanding the evolutionary past and mapping
515
516
517
the medical future. Nature Reviews Genetics, 9 , 477–485.
Tataru, P., Simonsen, M., Bataillon, T., & Hobolth, A. (2016). Statistical inference in the
Wright-Fisher model using allele frequency data. Systematic Biology, (p. syw056).
518
Van der Vaart, A. W. (2000). Asymptotic Statistics. Cambridge: Cambridge University Press.
519
Wright, S. (1931). Evolution in Mendelian populations. Genetics, 16 , 97–159.
28