Supporting Information (SI) for “Theoretical models of the influence of genomic architecture on the dynamics of speciation” by S. M. Flaxman, A. C. Wacholder, J. L. Feder, and P. Nosil

SI comprises:
Figures S1 – S10
Descriptions of Videos S1 and S2 with URLs for streaming or download

Fig. S1. A schematic representation of the three scenarios of genomic architecture from which results were generated. Squares represent demes, open circles are individual organisms, filled gray circles and lower case italicized letters are alleles at different loci, and solid lines are chromosomes. Solid, double-headed black arrows represent migration between demes. Dashed, blue, curved arrows represent indirect effects of alleles of different genes on one another (note these are absent in the “beanbag” scenario, where dynamics of alleles at different loci are completely independent of one another because offspring are produced from a population “beanbag”). The “genome only” scenario has genes organized in genomes and gametes are produced from individuals by meiosis. As a consequence of this basic genomic architecture, selection on one locus can have indirect effects on other loci (dashed arrows). However, this “genome only” scenario assumes independent assortment of all loci (i.e., there is no chromosomal linkage). The “linkage” scenario is identical to the “genome only” scenario, with the exception that it includes chromosomes and physical linkage, adding a second general type of genomic architecture.

Fig. S2. An example graphically illustrating the “time of speciation” as defined in the Methods of the main text. Three quantities are used in this determination: (1) the average fitness of residents born in a deme, w_res, (2) the average fitness of immigrants to a deme, w_imm, and (3) the maximum fitness that would be possible in the deme, given the alleles that are segregating in the whole population.
Prior to genome-wide congealing, there is minimal local adaptation: resident fitness is not much higher than that of immigrants. However, after genome-wide congealing takes hold, there is strong local adaptation, so the magnitude of resident fitness compared to immigrant fitness grows greatly, as residents increasingly have genotypes with nearly optimal combinations of alleles. Hence, the time of speciation was defined, operationally, as the last time when the magnitude of resident fitness was closer to the magnitude of immigrant fitness than to that of the maximum possible fitness (relative magnitudes are revealed by taking the ratios, as shown in the figure). This time is shown by the vertical dashed line in the figure. The y-axis is logarithmically scaled to give maximum visual resolution (but this does not affect the speciation time metric).

Fig. S3. Effective migration rates over time for nine combinations of gross migration rates (m) and the average, per-locus strength of divergent selection (s) for large populations (N = 20,000). For reference, data from Fig. 1A,B,C in the main text are shown here as panels C, E, and I, respectively. Each panel shows the three genomic architecture scenarios (Fig. S1). For the linkage runs, there are C = 4 chromosomes in the genome, each of which is l = 25 cM long. Each line is the median of 50 independent simulation runs.

Fig. S4. Effective migration rates over time for the same nine combinations of m and s as in Fig. S3, but for small populations (N = 1000). Each line is the median of 50 independent simulation runs. Interpretations and all parameters (except N) are the same as Fig. S3, but note that the time scale (x-axis) is longer here, owing to the larger effects of drift in slowing down divergence in these smaller populations.

Fig. S5. The dynamics of population divergence for four combinations of m and s and the three genomic architecture scenarios.
Interpretation of lines and points in all panels is the same as in Fig. 2A-C in the main text. Values of s and m are constant within a row; y-axis scaling differs between rows to maximize visual clarity. Moving down the rows, the magnitude of m relative to s increases as follows: (A-C) m = s, (D-F) m = 4s, (G-I) m = 5s, and (J-L) m = 8s. All examples were run to an (a priori) operational divergence point or 1,200,000 generations, whichever came first (Methods). In the models with genomes (middle and right columns), as the m/s ratio increases, the proportion of time spent in the non-diverged phase becomes larger, but speciation events become more sharply defined. (Additional parameters: (A-C, G-I) N = 4000, M = 100 cM, C = 4 chromosomes; (D-F, J-L) N = 5000, M = 400 cM, C = 4 chromosomes.)

Fig. S6. The effects of initial divergence via a few strongly selected mutations on the subsequent dynamics of speciation. These three simulation runs began with (A,D) two, (B,E) four, or (C,F) eight strongly divergently selected mutations established in the population having selection coefficients S = 0.2. Selection coefficients of all subsequent mutations were then drawn, as usual, from an exponential distribution with mean s (= 0.01 in these cases; other parameters: m = 0.1, N = 5000, and all three runs are from the “genome only” model). Meanings of lines and points are the same as in Fig. 2 in the main text. (A,B) Even with strong divergence at a small number of loci, gene flow is extensive over much of the genome and populations are not reproductively isolated until many more mutations (of mean s) establish and GWC drives the populations apart. (C,F) With a greater number of strongly selected mutations, the transition to species is less sharply defined: LD builds much more gradually than in the cases with fewer large-effect mutations. See also Fig. 3A in the main text for a summary of how large-effect mutations affect waiting times to speciation.

Fig. S7.
Effects of allopatry on speciation dynamics. Each column gives results from one simulation run. In all panels, m = 0.1 when migration is occurring, and s = 0.01. See Fig. 2 for interpretations of lines and points. Examples are shown for (A-C, E-G) the “genome only” model and (D,H) the beanbag model. (A,E) In the “genome only” model, an initial period of 50,000 generations of allopatry allowed sufficient divergence to accumulate such that the populations did not fuse when secondary contact occurred, but (B,F) allopatric divergence for 10,000 generations was followed by fusion of the populations, and then divergence much later. (C,G) We “replayed the tape” of the same simulation run as shown in Fig. 2B, but caused allopatry from generations 150,000-151,000 (arrow). To demonstrate the effects of GWC per se, during only this 1000-generation period, we prevented any new mutations from entering. The increase in LD resulting from allopatry was sufficient to create barriers to gene flow, and in spite of some hybridization, the populations remained distinct after migration began again (purple and orange lines do not come back toward the center). Note that in Fig. 2B, with the same conditions but continuous migration, it took more than 80,000 generations longer before the split occurred. (D,H) In the beanbag model, having no genomic architecture, there is no way to preserve LD in the face of gene flow. Thus, with m >> s, populations fuse upon secondary contact, in spite of the accumulation of high levels of divergence and LD prior to contact. Note that up to the point of secondary contact, the beanbag model followed essentially the same trajectory as the genome model (compare A and D; see also Fig. 3B in the main text for more on allopatry).

Fig. S8. The effect of the mutation rate on the time required to reach a given level of divergence.
Here, w_r is the average fitness of residents born in a deme, w_i is the average fitness of recent immigrants to a deme, and the barrier strength b = 500. With more divergently selected mutations per generation, (A) the total number of generations required to reach a given barrier strength is reduced, but (B) the cumulative number of mutations that arise is greater. The mutation rate was varied from 1 to 100 per generation in steps of 1. Other parameters: N = 5000, m = 0.05, s = 0.01, C = 4, M = 100.

Fig. S9. Increasingly asymmetric selection makes speciation more difficult. The difference on the x-axis is the mean strength of selection, per locus, in deme 2 compared to that in deme 1. Hence, in the simulations shown here, per-locus selection was consistently stronger in deme 2. This gave alleles that were favored in deme 2 a net selective advantage globally, increasing the likelihood that they would reach global fixation (eliminating divergence). (A) Linkage model. (B) Genome only model. Parameter values used here were m = 0.05, N = 5000, C = 4 chromosomes, M = 100 cM. s_{deme 2} was varied from 0.02 to 0.039 in steps of 0.001, while s_{deme 1} was simultaneously varied from 0.02 to 0.001 in steps of -0.001. 10 replicates were performed for each difference, and all mutations within a deme were given a constant value of s_j = s_{deme i} to ensure the same asymmetry at every locus. Values where no points are shown indicate that speciation was not observed within the allotted time (1,200,000 generations).

Fig. S10. Times to speciation in the BU2S model when all selection coefficients are a constant value (rather than being drawn from an exponential distribution) and selection is symmetric in (A) the linkage model and (B) the genome only model. For the cases shown here, all s_j = s = 0.02 (other parameters: N = 5000, C = 4 chromosomes, M = 100 cM). 10 replicates were performed for each value of m, which varied from 0.01 to 0.1 in steps of 0.01.
In the genome only model, speciation did not occur within the allotted time for m > 0.06.

Descriptions of SI Videos and Links for Download or Streaming

Note: downloading and playing videos will produce higher viewing resolution than streaming directly from the Dropbox.com sites.

Video S1

Technical details:
Filename: Supplemental_Movie_S1.avi
File format: AVI movie
File size: 86 MB
Duration: 40 seconds
URL for streaming or download (file size was too large for submission): https://www.dropbox.com/s/q9kmoraqoqtj3qu/Supplemental_Movie_S1.avi

Description and brief interpretation:
Video S1 shows a broad time-lapse view of the process of the buildup of de novo genome-wide divergence. FST values are shown for all divergent loci at their respective locations on chromosomes (C = 4) in the genome. Ripley’s K function, with null expectations, approximate 95% quantiles around the null (blue lines), and BU2S simulation results (red lines) are shown for each chromosome at each time step as well. To aid comparisons, the data for the video are from the same simulation run as the one used to construct Fig. 6 in the main text, and interpretations of all lines and points are the same as in that figure. Note that the vast majority of alleles come and go quickly due to mutation and drift, so in the time series of FST, loci appear to flicker in and out (since FST is undefined when there is only one allele fixed at a locus). Fig. 6 in the text shows a snapshot at a time of 630,000 generations after the start of the simulation. The video spans from 150,000 generations to 750,000 generations. A nonrandom cluster of diverged loci becomes persistently statistically significant on chromosome #3 at around 600,000 generations, and with the benefit of hindsight, one can see that loci contributing to this cluster accumulated over hundreds of thousands of generations (starting from about 162,000 generations).
Later on, once population divergence is well under way, other significant clusters can also be seen to develop and persist. The rapid transition to species can be seen occurring in a span of time centered around 670,000 generations (see Video S2 for a much more detailed view of the transition).

Video S2

Technical details:
Filename: Supplemental_Movie_S2.avi
File format: AVI movie
File size: 126 MB
Duration: 53 seconds
URL for streaming or download (file size was too large for submission): https://www.dropbox.com/s/pgeoksecdc2lhfr/Supplemental_Movie_S2.avi

Description and brief interpretation:
Video S2 shows a “slowed down” time-lapse of the period from 650,000 to 690,000 generations in the same simulation run as Video S1. Whereas each sequential frame in Video S1 advances by 500 generations, each frame in Video S2 advances by only 25 generations. Hence, Video S2 provides a much more detailed look at the period of time surrounding the transition from one species to two.
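
Illustrative sketch of the “time of speciation” metric (Fig. S2). The operational definition in the Fig. S2 caption can be expressed as a short calculation. The Python sketch below is not the authors' code; it is one possible reading of the caption, under two assumptions: that “closer” is judged on a ratio scale (resident/immigrant versus maximum/resident), and that all fitness values are strictly positive. The function name, argument names, and toy numbers are hypothetical.

```python
from typing import Optional, Sequence


def time_of_speciation(
    generations: Sequence[int],
    w_res: Sequence[float],
    w_imm: Sequence[float],
    w_max: Sequence[float],
) -> Optional[int]:
    """Return the last sampled generation at which resident fitness is
    closer to immigrant fitness than to the maximum possible fitness.

    Assumption (not confirmed by the source): closeness is compared on a
    ratio scale, i.e. w_res / w_imm < w_max / w_res.
    Returns None if the condition is never met.
    """
    last_time = None
    for t, res, imm, mx in zip(generations, w_res, w_imm, w_max):
        # Residents are "closer to immigrants" when the resident/immigrant
        # ratio is smaller than the maximum/resident ratio.
        if res / imm < mx / res:
            last_time = t
    return last_time


# Hypothetical toy series: residents start near immigrant fitness and
# later approach the maximum possible fitness.
generations = [0, 1000, 2000, 3000, 4000]
w_res = [1.0, 1.1, 1.5, 8.0, 9.5]
w_imm = [1.0, 1.0, 1.0, 1.0, 1.0]
w_max = [10.0, 10.0, 10.0, 10.0, 10.0]

print(time_of_speciation(generations, w_res, w_imm, w_max))  # prints 2000
```

Because the comparison uses ratios, the result is the same whether the fitness values are plotted on a linear or a logarithmic axis, in line with the caption's remark that the log-scaled y-axis does not affect the metric.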