Supplementary figures
Mitochondrial-nuclear interactions maintain geographic separation of
deeply diverged mitochondrial lineages in the face of nuclear gene flow
Authors: Hernán E. Morales, Alexandra Pavlova, Nevil Amos, Richard Major, Andrzej
Kilian, Chris Greening and Paul Sunnucks
Contents:
Figure S1 Eastern Yellow Robin (EYR) hypothesized evolutionary history.
Figure S2 FST histograms at fine spatial scales.
Figure S3 Principal Component Analysis of genome-wide nuclear variation.
Figure S4 Allelic frequency correlations between north and south transects.
Figure S5 Manhattan plot of FST analyses at fine spatial scales
Figure S6 Manhattan plot of BayeScEnv analyses
Figure S7 Manhattan plot of PCAdapt analyses.
Figure S8 Counts of nuclear-encoded genes with mitochondrial functions (N-mt genes).
Figure S9 Candidate genes for mitochondrial-nuclear interactions mapped to a three-dimensional model of OXPHOS complex I.
Figure S10 Shape and location of transects for cline analyses
Figure S11A PCA analysis of neutral and outlier loci for northern transect.
Figure S11B PCA analysis of neutral and outlier loci for southern transect.
Figure S12 Spatial population structure in the northern transect.
Figure S13 Spatial population structure in the southern transect.
Figure S14A Linkage disequilibrium (LD, r2) decay chromosomes 1-6.
Figure S14B Linkage disequilibrium (LD, r2) decay chromosomes 7-14.
Figure S14C Linkage disequilibrium (LD, r2) decay chromosomes 15-Z.
Figure S15 DArT-tags mapping comparison against two reference genomes.
Figure S16 Correlation of chromosome size and number of SNPs obtained
Figure S17 Results summary of outlier detection methods for loci that were not mapped to a reference genome.
Figure S18 Populations arbitrarily defined across each transect for the BayeScEnv analyses and the Hardy-Weinberg equilibrium tests.
Figure S19 Summary of results for PCAdapt analyses.
References
Figure S1 Eastern Yellow Robin (EYR) hypothesized evolutionary history.
The background colour of the boxes represents birds’ nuclear genomic background
(red- north, blue- south). Inside of the bird symbols, the circles represent mitochondrial
DNA (white- mito-A, black- mito-B) and the chromosome symbols (crosses) represent
nuclear-encoded genes with mitochondrial function (N-mt genes). Matching colours
(white-white or black-black) represent functional mitonuclear interactions, mismatching
colours (white-black) represent mitonuclear incompatibilities. Each panel represents a
stage in EYR evolutionary history. (A) Initial differentiation with gene flow between
northern and southern populations as described in Morales et al. (2017). (B) Two
independent events of mitonuclear co-introgression resulting in inland-coastal
divergence (shades of blue and shades of red in the third and fourth panels). (C)
Selection for mitonuclear interactions despite ongoing nuclear gene flow (shades of
red and blue differing horizontally highlight the subtle genome-wide differentiation of
each genomic background; see genome-wide PCAs in Fig. 1 of main text). (D)
Mitonuclear divergence in the face of gene flow: selection against the formation of
mitonuclear genetic incompatibilities in hybrids (intrinsic barriers); selection for locally
adapted mitonuclear interactions to maintain geographic separation at the climatic
divide (extrinsic barriers); and/or evolution of genomic architecture for reduced
recombination. The processes described by the last two panels likely occurred
simultaneously.
Figure S2 FST histograms at fine spatial scales.
Inland-coastal FST differentiation between different mitolineage-bearing populations in
each transect. This analysis includes only individuals sampled within 40 km of the
centre of the contact zone. The insets show the right tails of the distributions (FST >
0.4), revealing high peaks that are barely visible on the plots with all markers.
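As a minimal sketch of how a per-SNP FST between two groups can be obtained from allele frequencies, the following uses Wright's definition, FST = (HT - HS) / HT. The estimator actually used for these histograms is described in the main text and may differ; the function name and the equal weighting of the two populations are assumptions made only for illustration.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Per-locus FST for a biallelic SNP from allele frequencies in two populations.

    Assumption: both populations are weighted equally (no sample-size correction).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if the two populations were pooled.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        return 0.0  # monomorphic locus: FST is undefined, reported here as 0
    return (h_t - h_s) / h_t
```

Applying such a function to every SNP and keeping the values above 0.4 would give the kind of right tail shown in the insets.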
[Figure S3 panels: Non-outlier loci (59,879 SNPs); Random 1 to Random 5 (565 SNPs each). Axes: PCA 1 and PCA 2.]
Figure S3 Principal Component Analysis of genome-wide nuclear variation.
For Fig. 1C in the main text we depicted a PCA with genome-wide variation (all loci).
Here we show that the same general pattern is recovered regardless of the number
and type of loci used. Top-left panel shows a PCA with non-outlier loci only. The
remaining panels show PCAs with five different random sets of non-outlier loci of the
same size (565 SNPs) as the outlier analyses reported in the main text.
Figure S4 Allelic frequency correlations between north and south transects.
(A) Allelic frequency correlation between north mito-A-bearing and south mito-A-bearing populations. (B) Allelic frequency correlation between north mito-B-bearing
and south mito-B-bearing populations. Correlations of outlier loci are significantly
higher than expected at random (P<0.001) showing that alleles within mitolineage-bearing populations segregate in the same direction in both nuclear backgrounds.
The genomic random distribution was obtained by calculating allelic frequency
correlations of 140 sets of non-outlier loci.
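The comparison of an observed correlation against correlations from random sets of loci can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names, the use of Pearson's correlation, and the way random sets are drawn are all hypothetical, and how the reported P-value was derived is not specified in this caption.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of allele frequencies."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def null_correlations(freq_north, freq_south, set_size, n_sets, seed=0):
    """Correlations for n_sets random sets of loci, each of size set_size."""
    rng = random.Random(seed)
    indices = range(len(freq_north))
    values = []
    for _ in range(n_sets):
        pick = rng.sample(indices, set_size)
        values.append(pearson_r([freq_north[i] for i in pick],
                                [freq_south[i] for i in pick]))
    return values


def tail_fraction(observed, null_values):
    """Fraction of random sets whose correlation is at least as high as observed."""
    return sum(v >= observed for v in null_values) / len(null_values)
```

Here freq_north and freq_south would hold the per-locus allele frequencies of non-outlier loci in the two transects, and n_sets=140 would mirror the number of random sets mentioned above.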
Figure S5 Manhattan plot of FST analyses at fine spatial scales
Nuclear DNA differentiation between mitolineage-bearing populations mapped onto the zebra finch reference genome for individuals located
within a 40-km radius from the midpoint of the contact zone between the two mitolineages in each transect. (A) Northern transect; (B) southern
transect; genomic position of each Single Nucleotide Polymorphism as mapped to the zebra finch genome is shown on the x-axis; FST is
presented on the y-axis.
Figure S6 Manhattan plot of BayeScEnv analyses
Nuclear DNA differentiation between populations correlated with a binomial variable of mitolineage membership mapped onto the reference
genome for all individuals across each transect. (A) Northern transect; (B) southern transect. The genomic position of each Single Nucleotide
Polymorphism as mapped to the zebra finch genome is shown on the x-axis. The -log10 posterior error probability (see Methods in main text),
depicted on the y-axis, indicates the probability of each SNP being an outlier, with the False Discovery Rate threshold of 5% shown with a
horizontal red line.
Figure S7 Manhattan plot of PCAdapt analyses.
Nuclear DNA differentiation along the first axis of differentiation (K = 1, see Methods) for all individuals across each transect. (A) Northern
transect; (B) southern transect. The genomic position of each Single Nucleotide Polymorphism as mapped to the zebra finch genome is shown
on the x-axis. The -log10 posterior error probability (see Methods in main text), depicted on the y-axis, indicates the probability of each SNP
being an outlier, with the False Discovery Rate threshold of 5% shown with a horizontal red line.
Figure S8 Counts of nuclear-encoded genes with mitochondrial functions (N-mt genes).
Histograms of counts of nuclear-encoded genes with mitochondrial function (N-mt
genes) in random genomic regions of the same size as the mtDNA-linked island of
divergence represent the random expectation. The black arrows indicate the number
of N-mt genes found within the chromosome 1A mtDNA-linked island of divergence.
Two different categories of N-mt genes were counted: (A) N-mt genes (GO term:
0005739) and (B) a subset of N-mt genes, protein-coding OXPHOS genes (GO term:
0006119). The mtDNA-linked island of divergence on chromosome 1A is significantly
enriched for N-mt genes overall (P<0.001), and also for the subset of OXPHOS genes
(P<0.01).
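The random expectation described above can be sketched by counting genes in randomly placed windows of the same size as the region of interest. This is a simplified, single-chromosome illustration: the function names, the uniform placement of windows, and the input format (a sorted list of gene start positions) are assumptions, not details taken from the study.

```python
import bisect
import random


def count_in_window(sorted_positions, start, size):
    """Number of gene positions falling in the half-open window [start, start + size)."""
    lo = bisect.bisect_left(sorted_positions, start)
    hi = bisect.bisect_left(sorted_positions, start + size)
    return hi - lo


def random_window_counts(sorted_positions, chrom_length, size, n_windows, seed=0):
    """Gene counts for n_windows windows placed uniformly at random on one chromosome."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_windows):
        start = rng.randrange(0, chrom_length - size + 1)
        counts.append(count_in_window(sorted_positions, start, size))
    return counts


def fraction_at_least(observed, counts):
    """Fraction of random windows containing at least as many genes as observed."""
    return sum(c >= observed for c in counts) / len(counts)
```

A histogram of the values returned by random_window_counts corresponds to the random expectation, and the observed count in the island of divergence plays the role of the black arrow.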
Figure S9 Candidate genes for mitochondrial-nuclear interactions mapped to a
three-dimensional model of OXPHOS complex I.
Mitonuclear interactions in the OXPHOS enzymatic complexes are formed by
mitochondrial-encoded genes (mtDNA genes) and nuclear-encoded genes with
mitochondrial function (N-mt OXPHOS genes). The underlying structure represents
the complete structure of the bovine mitochondrial complex I (Zhu et al. 2016). The
modules responsible for NADH oxidation (E module), ubiquinone reduction (Q
module), and proton-pumping (ND1, ND2, ND4, and ND5 modules) are represented
with differentially shaded backgrounds. Arrows show the routes of electron transfer
(thin, dotted, maroon), proton translocation (medium, dashed, navy), and the
conformational changes that couple them (thick, solid, grey). EYR N-mt genes in the
genomic island of divergence of chromosome 1A (NDUFA6, NDUFA12, NDUFB2; see
Fig. 3 of main text) are mapped onto the protein structure (in yellow). EYR
mitochondrially-encoded core subunits that show evidence of positive selection
between mito-A and mito-B (ND4, ND4L and ND5) are highlighted in purple.
Figure S10 Shape and location of transects for cline analyses
For each transect, we projected the location of each individual sample along a
unidimensional transect (dashed line) and calculated the distance of each sample to
a common geographic point (triangle). The geographic locations from which samples
were taken did not comprise ideal transects running perpendicular to the
mitochondrial variation (black = mito-A and red = mito-B), particularly in the southern
transect, which is likely the reason why the confidence intervals of nuclear clines were so
wide (see Results).
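Projecting sample locations onto a one-dimensional transect can be sketched as below. The sketch assumes planar coordinates (for example, kilometres in a projected coordinate system rather than raw latitude and longitude) and treats the start of the transect line as the common reference point; both choices, and the function name, are illustrative assumptions.

```python
import math


def project_onto_transect(point, start, end):
    """Project a sample onto the line start -> end.

    Returns (along, offset): `along` is the distance from `start` measured along
    the transect; `offset` is the signed perpendicular distance from the line
    (positive when the sample lies to the left of the direction start -> end).
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("transect start and end must differ")
    ux, uy = dx / length, dy / length            # unit vector along the transect
    px, py = point[0] - start[0], point[1] - start[1]
    along = px * ux + py * uy                    # dot product
    offset = ux * py - uy * px                   # 2-D cross product
    return along, offset
```

Large absolute offsets indicate samples lying far from the idealised line, which is the situation the caption describes for the southern transect.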
Figure S11A PCA analysis of neutral and outlier loci for northern transect.
A PCA analysis was performed using two datasets: (A) 6,947 non-outliers located at
least 100,000 bases away from any outliers and (B) 292 outliers located within the
chromosome 1A island of differentiation. Each panel shows on the left side a biplot of
PCA scores for the first two axes (mitolineages: blue- mito-B, red- mito-A); ellipses
capture 80% of the variation for each group. The right side of each panel
shows the spatial location of individuals with PCA-axis-1 scores ranging from low
negative values (dark blue) through zero (white) to large positive values (dark red).
The outline of the squares represents the mitolineage membership of each individual.
Individual locations are jittered in the latitudinal direction to allow visualization of
overlapping individuals.
Figure S11B PCA analysis of neutral and outlier loci for southern transect.
A PCA analysis was performed using two datasets: (A) 6,947 non-outliers located at
least 100,000 bases away from any outliers and (B) 292 outliers located within the
chromosome 1A island of differentiation. Each panel shows on the left side a biplot of
PCA scores for the first two axes (mitolineages: blue- mito-B, red- mito-A); ellipses
capture 80% of the variation for each group. The right side of each panel
shows the spatial location of individuals with PCA-axis-1 scores ranging from low
negative values (dark blue) through zero (white) to large positive values (dark red).
The outline of the circles represents the mitolineage membership of each individual.
Individual locations are jittered in the longitudinal direction to allow visualization of
overlapping individuals.
Figure S12 Spatial population structure in the northern transect.
(A) Actual spatial location of each individual across the transect; samples
in the remaining panels are jittered on the latitudinal axis to facilitate visualization of
overlapping samples. (B) Mitolineage membership for each individual (red- mito-A,
blue- mito-B). (C-E) Pie-charts show individual probability of assignment to two genetic
clusters from the STRUCTURE (Pritchard et al. 2000) analysis; the circle outline shows
mitolineage membership (red- mito-A, blue- mito-B). (C) Structure recovered from
12/20 replicates with genome-wide neutral loci (6,947 non-outliers located at least
100,000 bases away from any outliers); (D) Structure recovered from 8/20 replicates
with genome-wide neutral loci; (E) Structure recovered from 20/20 replicates with 292
chromosome 1A outlier loci.
Figure S13 Spatial population structure in the southern transect.
(A) Actual spatial location of each individual across the transect; samples
in the remaining panels are jittered on the longitudinal axis to facilitate visualization of
overlapping samples. (B) Mitolineage membership for each individual (red- mito-A,
blue- mito-B). (C-D) Pie-charts show individual probability of assignment to two genetic
clusters from the STRUCTURE (Pritchard et al. 2000) analysis; the circle outline shows
mitolineage membership (red- mito-A, blue- mito-B). (C) Structure recovered from
20/20 replicates with genome-wide neutral loci (6,947 non-outliers located at least
100,000 bases away from any outliers); (D) Structure recovered from 20/20 replicates
with 292 chromosome 1A outlier loci.
Figure S14A Linkage disequilibrium (LD, r2) decay chromosomes 1-6.
The plots show the decay of LD (y-axis) with physical distance between pairs of loci
relative to their position in the zebra finch genome (x-axis). Black dots indicate
observed pairwise LD. Green lines show the expected average decay of LD per
chromosome with confidence intervals in red, estimated with a non-linear regression.
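To make the quantities in these plots concrete, the sketch below computes pairwise r2 from haplotype and allele frequencies and fits a one-parameter decay curve to r2 against distance. The caption does not state which regression function was used; the form here, E[r2] = 1 / (1 + rho * d) (Sved 1971), is a simple stand-in chosen for illustration, and the grid search replaces a proper non-linear least-squares optimiser. All function names are hypothetical.

```python
def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """LD (r2) between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at locus 1 and
    allele B at locus 2; p_a and p_b are the corresponding allele frequencies.
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        return 0.0  # at least one locus is monomorphic; r2 is undefined
    return d * d / denom


def expected_r2(distance_bp: float, rho_per_bp: float) -> float:
    """Expected r2 at a given physical distance under the assumed decay model."""
    return 1.0 / (1.0 + rho_per_bp * distance_bp)


def fit_rho(distances, r2_values, candidates):
    """Candidate rho with the smallest sum of squared errors (crude grid search)."""
    def sse(rho):
        return sum((expected_r2(d, rho) - r) ** 2
                   for d, r in zip(distances, r2_values))
    return min(candidates, key=sse)
```

With observed pairwise r2 values and distances for one chromosome, fit_rho would give the parameter of a fitted curve analogous to the green lines; confidence intervals would require an additional step such as bootstrapping.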
Figure S14B Linkage disequilibrium (LD, r2) decay chromosomes 7-14.
The plots show the decay of LD (y-axis) with physical distance between pairs of loci
relative to their position in the zebra finch genome (x-axis). Black dots indicate
observed pairwise LD. Green lines show the expected average decay of LD per
chromosome with confidence intervals in red, estimated with a non-linear regression.
Figure S14C Linkage disequilibrium (LD, r2) decay chromosomes 15-Z.
The plots show the decay of LD (y-axis) with physical distance between pairs of loci
relative to their position in the zebra finch genome (x-axis). Black dots indicate
observed pairwise LD. Green lines show the expected average decay of LD per
chromosome with confidence intervals in red, estimated with a non-linear regression.
Figure S15 DArT-tags mapping comparison against two reference genomes.
The plots show the mapping position in base pairs to the zebra finch reference genome
(Warren et al. 2010) on the x-axis against the difference in mapping position compared
to the collared flycatcher reference genome (Ellegren et al. 2012) on the y-axis. DArT-tags that mapped to identical positions in both reference genomes are located along
zero values on the y-axis (e.g. chromosome 6). DArT-tags that mapped to different
positions in both reference genomes depart from zero on the y-axis (e.g. chromosome
Z). Only macro-chromosomes are shown for simplicity.
Figure S16 Correlation of chromosome size and number of SNPs obtained
Figure S17 Results summary of outlier detection methods for loci that were not
mapped to a reference genome.
Histograms of distributions of various summary statistics for the different outlier
detection methods employed: (A) fine-scale FST estimates (including only individuals
sampled within 40 km of the centre of the contact zone), (B) BayeScEnv significance
levels and (C) PCAdapt significance levels. The inset in each histogram shows the
tail of the distribution. (D) Venn diagrams show the overlap of significant outliers
detected between methods (A, B and C) in each transect.
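The region counts of a three-set Venn diagram such as panel (D) can be derived from the sets of significant loci reported by each method. The sketch below assumes each method's outliers are available as a collection of locus identifiers; the function name and the labels are illustrative only.

```python
def venn3_counts(a, b, c):
    """Counts for the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }
```

The seven counts always sum to the size of the union of the three sets, which is a convenient sanity check.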
Figure S18 Populations arbitrarily defined across each transect for the
BayeScEnv analyses and the Hardy-Weinberg equilibrium tests.
Samples were organized in (A) 6 populations in the north transect (25 mito-A
individuals, 27 mito-B), and (B) 11 populations in the south transect (71 mito-A, 32
mito-B). Each colour represents a different population. Populations contain at least
two individuals that belong to the same mitogroup and are in close proximity.
Figure S19 Summary of results for PCAdapt analyses.
(A) Scree plot of the amount of genetic variation explained by the first 76 K clusters
(see Methods). (B) PCA analysis of both transects (green- north mito-A-bearing
population, red- north mito-B-bearing population, purple- south mito-A-bearing
population, blue- south mito-B-bearing population). (C) PCA analysis of north
transect (blue- mito-A, red- mito-B). (D) PCA analysis of south transect (red- mito-A,
blue- mito-B).
References
Ellegren H, Smeds L, Burri R, et al. (2012) The genomic landscape of species
divergence in Ficedula flycatchers. Nature 491, 756-760.
Morales HE, Sunnucks P, Joseph L, Pavlova A (2017) Perpendicular axes of
differentiation generated by mitochondrial introgression. Molecular
Ecology.
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure
using multilocus genotype data. Genetics 155, 945-959.
Warren WC, Clayton DF, Ellegren H, et al. (2010) The genome of a songbird.
Nature 464, 757-762.
Zhu J, Vinothkumar KR, Hirst J (2016) Structure of mammalian respiratory
complex I. Nature 536, 354-358.