Supporting Information (SI) for
“Theoretical models of the influence of genomic architecture on the dynamics of
speciation”
by S. M. Flaxman, A. C. Wacholder, J. L. Feder, and P. Nosil
SI comprises:
Figures S1 – S10
Descriptions of Videos S1 and S2 with URLs for streaming or download
Fig. S1. A schematic representation of the three scenarios of genomic architecture from which
results were generated. Squares represent demes, open circles are individual organisms, filled
gray circles and lower case italicized letters are alleles at different loci, and solid lines are
chromosomes. Solid, double-headed black arrows represent migration between demes. Dashed,
blue, curved arrows represent indirect effects of alleles of different genes on one another (note
these are absent in the “beanbag” scenario, where dynamics of alleles at different loci are
completely independent of one another because offspring are produced from a population
“beanbag”). The “genome only” scenario has genes organized in genomes and gametes are
produced from individuals by meiosis. As a consequence of this basic genomic architecture,
selection on one locus can have indirect effects on other loci (dashed arrows). However, this
“genome only” scenario assumes independent assortment of all loci (i.e., there is no
chromosomal linkage). The “linkage” scenario is identical to the “genome only” scenario, with
the exception that it includes chromosomes and physical linkage, adding a second general type of
genomic architecture.
Fig. S2. An example graphically illustrating the “time of speciation” as defined in the Methods
of the main text. Three quantities are used in this determination: (1) the average fitness of
residents born in a deme, w_res, (2) the average fitness of immigrants to a deme, w_imm, and (3) the
maximum fitness that would be possible in the deme, given the alleles that are segregating in the
whole population. Prior to genome wide congealing, there is minimal local adaptation: resident
fitness is not much higher than that of immigrants. However, after genome wide congealing
takes hold, there is strong local adaptation, so the magnitude of resident fitness compared to
immigrant fitness grows greatly, as residents increasingly have genotypes having nearly optimal
combinations of alleles. Hence, the time of speciation was defined, operationally, as the last
time when the magnitude of resident fitness was closer to the magnitude of immigrant fitness
than to that of the maximum possible fitness (relative magnitudes are revealed by taking the
ratios, as shown in the figure). This time is shown by the vertical dashed line in the figure. The
y-axis is logarithmically scaled to give maximum visual resolution (but this does not affect the
speciation time metric).
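The operational definition above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name, the argument names, and the toy fitness values are hypothetical, and "closer in magnitude" is read here as a comparison of the two ratios w_res / w_imm and w_max / w_res, which is our interpretation of the caption.

```python
def time_of_speciation(generations, w_res, w_imm, w_max):
    """Return the last sampled generation at which resident fitness is
    closer (as a ratio) to immigrant fitness than to the maximum possible
    fitness, or None if that condition never holds.

    Assumption: "closer" means w_res / w_imm < w_max / w_res.
    """
    last = None
    for t, res, imm, mx in zip(generations, w_res, w_imm, w_max):
        if res / imm < mx / res:
            last = t
    return last


# Toy, made-up time series (not data from the paper).
gens = [0, 1000, 2000, 3000, 4000]
w_res = [1.0, 1.1, 1.3, 8.0, 9.5]
w_imm = [1.0, 1.0, 1.0, 1.0, 1.0]
w_max = [10.0, 10.0, 10.0, 10.0, 10.0]

speciation_time = time_of_speciation(gens, w_res, w_imm, w_max)
```

Because the test compares ratios, rescaling all three fitness series by a common factor leaves the result unchanged, in line with the caption's remark that the logarithmic y-axis does not affect the metric.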
Fig. S3. Effective migration rates over time for nine combinations of gross migration rates (m)
and the average, per-locus strength of divergent selection (s) for large populations (N = 20,000).
For reference, data from Fig. 1A,B,C in the main text are shown here as panels C, E, and I,
respectively. Each panel shows the three genomic architecture scenarios (Fig. S1). For the
linkage runs, there are C = 4 chromosomes in the genome, each of which is l = 25 cM long.
Each line is the median of 50 independent simulation runs.
Fig. S4. Effective migration rates over time for the same nine combinations of m and s as in Fig.
S3, but for small populations (N = 1000). Each line is the median of 50 independent simulation
runs. Interpretations and all parameters (except N) are the same as Fig. S3, but note that the time
scale (x-axis) is longer here, owing to the larger effects of drift in slowing down divergence in
these smaller populations.
Fig. S5. The dynamics of population divergence for four combinations of m and s and the three
genomic architecture scenarios. Interpretation of lines and points in all panels is the same as in
Fig. 2A-C in the main text. Values of s and m are constant within a row; y-axis scaling differs
between rows to maximize visual clarity. Moving down the rows, the magnitude of m relative to
s increases as follows: (A-C) m = s, (D-F) m = 4s, (G-I) m = 5s, and (J-L) m = 8s. All examples
were run to an (a priori) operational divergence point or 1,200,000 generations, whichever came
first (Methods). In the models with genomes (middle and right columns), as the m/s ratio
increases, the proportion of time spent in the non-diverged phase becomes larger, but speciation
events become more sharply defined. (Additional parameters: (A-C, G-I) N = 4000, M = 100
cM, C = 4 chromosomes; (D-F, J-L) N = 5000, M = 400 cM, C = 4 chromosomes.)
Fig. S6. The effects of initial divergence via a few strongly selected mutations on the subsequent
dynamics of speciation. These three simulation runs began with (A,D) two, (B,E) four, or (C,F)
eight strongly divergently selected mutations established in the population having selection
coefficients S = 0.2. Selection coefficients of all subsequent mutations were then drawn, as
usual, from an exponential distribution with mean s (= 0.01 in these cases; other parameters: m =
0.1, N = 5000, and all three runs are from the “genome only” model). Meanings of lines and
points are the same as in Fig. 2 in the main text. (A,B) Even with strong divergence at a small
number of loci, gene flow is extensive over much of the genome and populations are not
reproductively isolated until many more mutations (of mean s) establish and GWC drives the
populations apart. (C,F) With a greater number of strongly selected mutations, the transition to
species is less sharply defined: LD builds much more gradually than in the cases with fewer
large-effect mutations. See also Fig. 3A in the main text for a summary of how large-effect
mutations affect waiting times to speciation.
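As a rough illustration of the mutation setup described in this caption, the sketch below seeds a run with k strongly selected mutations (S = 0.2) and then draws later selection coefficients from an exponential distribution with mean s = 0.01. The function and variable names are hypothetical, and the BU2S simulation itself may implement this differently.

```python
import random


def initial_and_subsequent_coefficients(k_strong, n_subsequent,
                                        strong_s=0.2, mean_s=0.01, seed=None):
    """Return (strong, subsequent) lists of selection coefficients.

    strong: k_strong copies of the large coefficient S.
    subsequent: n_subsequent draws from an exponential distribution whose
    mean is mean_s (rate parameter = 1 / mean_s).
    """
    rng = random.Random(seed)
    strong = [strong_s] * k_strong
    subsequent = [rng.expovariate(1.0 / mean_s) for _ in range(n_subsequent)]
    return strong, subsequent


# Example: the "two strongly selected mutations" case, with a fixed seed.
strong, subsequent = initial_and_subsequent_coefficients(2, 100_000, seed=1)
sample_mean = sum(subsequent) / len(subsequent)
```

With many draws the sample mean should sit close to 0.01, which shows how much weaker a typical later mutation is than the initial S = 0.2 mutations.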
Fig. S7. Effects of allopatry on speciation dynamics. Each column gives results from one
simulation run. In all panels, m = 0.1 when migration is occurring, and s = 0.01. See Fig. 2 for
interpretations of lines and points. Examples are shown for (A-C, E-G) the “genome only”
model and (D,H) the beanbag model. (A,E) In the “genome only” model, an initial period of
50,000 generations of allopatry allowed sufficient divergence to accumulate such that the
populations did not fuse when secondary contact occurred, but (B,F) allopatric divergence for
10,000 generations was followed by fusion of the populations, and then divergence much later.
(C,G) We “replayed the tape” of the same simulation run as shown in Fig. 2B, but caused
allopatry from generations 150,000-151,000 (arrow). To demonstrate the effects of GWC per se,
during only this 1000-generation period, we prevented any new mutations from entering. The
increase in LD resulting from allopatry was sufficient to create barriers to gene flow, and in spite
of some hybridization, the populations remained distinct after migration began again (purple and
orange lines do not come back toward the center). Note that in Fig. 2B, with the same
conditions but continuous migration, it took more than 80,000 generations longer before the split
occurred. (D,H) In the beanbag model, having no genomic architecture, there is no way to
preserve LD in the face of gene flow. Thus, with m >> s, populations fuse upon secondary
contact, in spite of the accumulation of high levels of divergence and LD prior to contact. Note
that up to the point of secondary contact, the beanbag model followed essentially the same
trajectory as the genome model (compare A and D; see also Fig. 3B in the main text for more on
allopatry).
Fig. S8. The effect of the mutation rate on the time required to reach a given level of divergence.
Here, w_r is the average fitness of residents born in a deme, w_i is the average fitness of recent
immigrants to a deme, and the barrier strength b = 500. With more divergently selected
mutations per generation, (A) the total number of generations required to reach a given barrier
strength is reduced, but (B) the cumulative number of mutations that arise is greater. The
mutation rate was varied from 1 to 100 per generation in steps of 1. Other parameters: N = 5000,
m = 0.05, s = 0.01, C = 4, M = 100.
Fig. S9. Increasingly asymmetric selection makes speciation more difficult. The difference on
the x-axis is the mean strength of selection, per locus, in deme 2 compared to that in deme 1.
Hence, in the simulations shown here, per-locus selection was consistently stronger in deme 2.
This gave alleles that were favored in deme 2 a net selective advantage globally, increasing the
likelihood that they would reach global fixation (eliminating divergence). (A) Linkage model.
(B) Genome only model. Parameter values used here were m = 0.05, N = 5000, C = 4
chromosomes, M = 100 cM. s_deme2 was varied from 0.02 to 0.039 in steps of 0.001, while s_deme1
was simultaneously varied from 0.02 to 0.001 in steps of -0.001. 10 replicates were performed
for each difference, and all mutations within a deme were given a constant value of s_j = s_deme_i to
ensure the same asymmetry at every locus. Values where no points are shown indicate that
speciation was not observed within the allotted time (1,200,000 generations).
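The parameter sweep described in this caption can be enumerated as follows. This is a minimal sketch with hypothetical names, not the authors' code. It shows that the two series give 20 paired values, with differences running from 0 to 0.038 while the mean of the two per-deme coefficients stays at 0.02.

```python
def asymmetric_selection_grid(n_steps=20, base=0.020, step=0.001):
    """Return a list of (s_deme1, s_deme2, difference) tuples.

    s_deme2 rises from base in steps of +step while s_deme1 falls from
    base in steps of -step; values are rounded to 3 decimals to avoid
    floating-point noise.
    """
    grid = []
    for k in range(n_steps):
        s1 = round(base - k * step, 3)
        s2 = round(base + k * step, 3)
        grid.append((s1, s2, round(s2 - s1, 3)))
    return grid


grid = asymmetric_selection_grid()
replicates_per_difference = 10  # as stated in the caption
total_runs_per_model = len(grid) * replicates_per_difference
```

Under these assumptions each model (linkage, genome only) corresponds to 20 x 10 = 200 simulation runs.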
Fig. S10. Times to speciation in the BU2S model when all selection coefficients are a constant
value (rather than being drawn from an exponential distribution) and selection is symmetric in
(A) the linkage model and (B) the genome only model. For the cases shown here, all s_j = s =
0.02 (other parameters: N = 5000, C = 4 chromosomes, M = 100 cM). 10 replicates were
performed for each value of m, which varied from 0.01 to 0.1 in steps of 0.01. In the genome
only model, speciation did not occur within the allotted time for m > 0.06.
Descriptions of SI Videos and Links for Download or Streaming
Note: downloading and playing videos will produce higher viewing resolution than streaming
directly from the Dropbox.com sites.
Video S1
Technical details:
Filename: Supplemental_Movie_S1.avi
File format: AVI movie
File size: 86 MB
Duration: 40 seconds
URL for streaming or download (file size was too large for submission):
https://www.dropbox.com/s/q9kmoraqoqtj3qu/Supplemental_Movie_S1.avi
Description and brief interpretation: Video S1 shows a broad time-lapse view of the
process of the buildup of de novo genome-wide divergence. FST values are shown for all
divergent loci at their respective locations on chromosomes (C = 4) in the genome. Ripley’s K
function, with null expectations, approximate 95% quantiles around the null (blue lines), and
BU2S simulation results (red lines) are shown for each chromosome at each time step as well.
To aid comparisons, the data for the video are from the same simulation run as the one used to
construct Fig. 6 in the main text, and interpretations of all lines and points are the same as in that
figure. Note that the vast majority of alleles come and go quickly due to mutation and drift, so in
the time series of FST, loci appear to flicker in and out (since FST is undefined when there is only
one allele fixed at a locus). Fig. 6 in the text shows a snapshot at a time of 630,000 generations
after the start of the simulation. The video spans from 150,000 generations to 750,000
generations. A nonrandom cluster of diverged loci becomes persistently statistically significant
on chromosome #3 at around 600,000 generations, and with the benefit of hindsight, one can see
that loci contributing to this cluster accumulated over hundreds of thousands of generations
(starting from about 162,000 generations). Later on, once population divergence is well under
way, other significant clusters can also be seen to develop and persist. The rapid transition to
species can be seen occurring in a span of time centered around 670,000 generations (see Video
S2 for a much more detailed view of the transition).
Video S2
Technical details:
Filename: Supplemental_Movie_S2.avi
File format: AVI movie
File size: 126 MB
Duration: 53 seconds
URL for streaming or download (file size was too large for submission):
https://www.dropbox.com/s/pgeoksecdc2lhfr/Supplemental_Movie_S2.avi
Description and brief interpretation: Video S2 shows a “slowed down” time-lapse of the
period from 650,000 to 690,000 generations in the same simulation run as Video S1. Whereas
each sequential frame in Video S1 advances by 500 generations, each frame in Video S2
advances by only 25 generations. Hence, Video S2 provides a much more detailed look at the
period of time surrounding the transition from one species to two.
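The stated time spans, generations per frame, and durations can be cross-checked with a little arithmetic. The sketch below uses hypothetical names. It assumes the frame count is simply the span divided by the step, since the descriptions do not say whether both endpoints are included, and the playback rates it derives are inferred rather than stated in the text.

```python
def frame_count(start_gen, end_gen, gens_per_frame):
    """Number of frame-to-frame steps needed to cover the span."""
    return (end_gen - start_gen) // gens_per_frame


# Video S1: 150,000 to 750,000 generations, 500 generations per frame, 40 s.
s1_frames = frame_count(150_000, 750_000, 500)
s1_fps = s1_frames / 40

# Video S2: 650,000 to 690,000 generations, 25 generations per frame, 53 s.
s2_frames = frame_count(650_000, 690_000, 25)
s2_fps = s2_frames / 53

# Temporal resolution of Video S2 relative to Video S1.
zoom_factor = 500 / 25
```

Under these assumptions both videos play at roughly 30 frames per second, and Video S2 resolves the transition at 20 times the temporal resolution of Video S1.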