SUPPORTING INFORMATION
Complete mitochondrial genomes and a novel spatial genetic method reveal cryptic
phylogeographical structure and migration patterns among brown bears in northwestern Eurasia
Marju Keis, Jaanus Remm, Simon Y. W. Ho, John Davison, Egle Tammeleht, Igor L.
Tumanov, Alexander P. Saveljev, Peep Männil, Ilpo Kojola, Alexei V. Abramov, Tõnu Marg
and Urmas Saarma
Journal of Biogeography
Appendix S1: Supporting text, tables and figures
Appendix S1 provides:
• additional text information for the nucleotide diversity in different regions of the
mitochondrial genome, and phylogenetic resolution through comparative analysis of
five mtDNA datasets;
• data for samples and mitogenomes used in different analyses (Table S1);
• primers for complete mtDNA sequencing (Table S2);
• population genetic statistics of different mtDNA datasets (Table S3);
• molecular diversity indices for different regions of brown bear mtDNA (Table S4);
• relative contribution of different mitochondrial genes to the NCmtDNA network
(Table S5);
• relative contribution of different mitochondrial genes to the NCmtDNA network
normalized to gene length (Table S6);
• phylogeographic relationships of brown bears based on two mtDNA datasets: (a) 257
bp, (b) 1942 bp (Figure S1);
• relationships between sequence length and the number of parsimony-informative sites
(PIS), the average number of nucleotide differences (k) and the haplotype diversity
(Hd) (Figure S2);
• NCmtDNA network with defining mutations (Figure S3);
• Bayesian phylogenetic tree based on NCmtDNA data (Figure S4);
• network based on complete protein-coding sequences (Figure S5).
Nucleotide diversity in different regions of the mitochondrial genome
The most variable section was the control region, but when homoplasious sections were
excluded, its nucleotide diversity (π) dropped by almost an order of magnitude (Table S4),
rendering it less diverse than several protein-coding genes. Of the coding sequences, the most
variable genes were ND4, ND4L and ND5. ND4 and ND5 also had the highest average
number of nucleotide differences per gene. Among tRNAs the most variable were tRNA-Tyr
and tRNA-Ser, while many tRNAs were completely conserved (Table S4). To evaluate the
relative contribution of each mitochondrial gene (and the control region) to the NCmtDNA
network, we expressed the number of mutations in a particular gene as a proportion of the total
number of mutations in the whole NCmtDNA network and in particular haplogroups (A–E)
(Table S5). On this basis, the highest contributors to the NCmtDNA network were ND5, ND4,
CYTB and the control region, and these same regions were also the highest contributors to
individual haplogroups, though COX3 also contributed importantly to haplogroup A.
However, when contribution was normalized in relation to gene length (Table S6), a different
set of regions contributed most (tRNAs, ATP6, ND6 and COX2).
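The two contribution measures described above can be illustrated with a minimal sketch. The mutation counts, gene lengths and network total are taken from Tables S4 and S5; the function names are hypothetical, and the per-base-pair value is left unscaled because the exact scaling factor used for Table S6 is an assumption we do not make here.

```python
# Minimal sketch: raw and length-normalized contribution of a gene to the
# NCmtDNA network. Counts and lengths are transcribed from Tables S4 and S5.
TOTAL_MUTATIONS = 120  # total mutations in the whole NCmtDNA network (Table S5)

genes = {
    # name: (mutations in the NCmtDNA network, gene length in bp)
    "ND5": (19, 1821),
    "ND4": (13, 1378),
    "CYTB": (10, 1140),
    "ND6": (7, 528),
}


def percent_contribution(mutations, total=TOTAL_MUTATIONS):
    """Mutations in one gene as a percentage of all mutations in the network."""
    return 100.0 * mutations / total


def per_bp_contribution(mutations, length_bp, total=TOTAL_MUTATIONS):
    """Percentage contribution divided by gene length (unscaled; assumption)."""
    return percent_contribution(mutations, total) / length_bp


for name, (mutations, length_bp) in genes.items():
    pct = percent_contribution(mutations)
    per_bp = per_bp_contribution(mutations, length_bp)
    print(f"{name}: {pct:.1f}% of network mutations, {per_bp:.5f}% per bp")
```

With these inputs ND5 is the largest raw contributor, whereas ND6, with fewer mutations but a much shorter sequence, has the higher per-base-pair value. This is the kind of re-ranking that normalization by gene length can produce.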
Different regions of mtDNA evolve at different rates, leading to considerable variation in
nucleotide diversity along the genome. Among the 95 mitogenome sequences in our study,
mutation hotspots were located in the control region, which is non-coding and thus able to
accumulate frequent mutations without a significant cost to fitness. However, after removing
indels and tandem repeats that are potentially homoplasious, the nucleotide diversity of the
control region dropped considerably, even below the level of protein-coding genes ND4,
ND4L and ND5. If haplotype diversity values are also considered, it would appear that the
control region does not hold any advantage over coding sequences with regard to information
content. ND4 and ND5 have previously been used to study the phylogeny of Ursidae, though
ND4L exhibited poor phylogenetic value (Yu et al., 2007). We found that ND4L exhibited the
highest nucleotide diversity among protein-coding genes, but its overall contribution was
small (only four variable positions) because it comprises fewer than 300 bp and is the second-shortest gene (after ATP8). However, the highest contributors to the NCmtDNA network were
ND5, ND4, CYTB and the control region, and the same regions were among the highest
contributors to individual haplogroups (Table S5). On the other hand, when contribution was
normalized in relation to gene length (Table S6), the highest contributing regions were
different (tRNAs, ATP6, ND6 and COX2). It should be noted that the importance of the
control region was also low in this respect, as the network based only on protein-coding genes
(Fig. S5) was highly similar to the NCmtDNA network.
Phylogenetic resolution through comparative analysis of five mtDNA datasets
Five datasets comprising the same bear individuals but different sections of mtDNA (Table 1)
were analysed to compare their power to resolve phylogeographic structure: (1) 257 bp; (2)
1942 bp; (3) NCmtDNA, 16686–16689 bp; (4) protein-coding, 11406 bp; and (5) complete
mtDNA, 16760–16793 bp. The coding of haplotypes is such that the identities of individual
haplotypes are traceable between different sets (e.g. haplotypes 1a–c in NCmtDNA set,
shown in Fig. 1, are all derived from individuals with haplotype 1 in Fig. S1b).
(1) Based on the most commonly used section of the control region (257 bp), we identified six
closely related haplotypes. Six variable sites were found in this short fragment of the control
region, with the central haplotype separated from the others by only one or two mutations.
Almost 90% of the samples (85 out of 95) represented a single and widespread haplotype
(marked in orange in Fig. S1a, Table S1). The other five haplotypes comprised only ten
samples and were found mostly at the periphery of the study area.
(2) Analysis of 1942 bp revealed 18 closely related haplotypes that also formed a star-like
network (Fig. S1b), albeit with more internal structure than the 257 bp network. The central
haplotype of the 257 bp dataset was divided into 13 different haplotypes in this analysis (Fig.
S1b, Table S1). In common with the shorter sequence set, the central haplotype of this set was
one of the most numerous (comprising 27% of all analysed samples) and was found
throughout the study area, except in Pskov and Vologda oblasts and Estonia. Most other
haplotypes were more or less geographically restricted and comprised a small number of
samples. From this perspective, haplotypes 2, 5 and 13 were exceptional: haplotype 2 was as
numerous as haplotype 1, but was specific to brown bears in Estonia (n = 26; 72% of all
analysed bear samples from Estonia carried this haplotype); haplotypes 5 and 13 were less
numerous, but covered much larger areas and were found from Finland to Arkhangelsk oblast.
(3) The third network, based on nearly complete mtDNA (NCmtDNA), contained 37 different
haplotypes, with 117 variable sites, of which 85 were parsimony-informative (Fig. 1, Table
1). Compared with the previous networks, the much larger number of characters in this
dataset (16686–16689 bp) meant that phylogeographically distinct clusters could be
identified, although the centre of the network consisted of several median vectors (i.e.
unsampled or extinct haplotypes). The most significant change compared with the previous
networks was that the haplotypes were divided among five divergent, geographically confined
and partially overlapping haplogroups (defined as A, B, C, D and E; Fig. 1). Haplotype 1
from the 1942 bp dataset was transformed most dramatically in this dataset: it was divided
into 13 different haplotypes (1a1–1a5, 1b, 1c1–1c3, 1d, and 1e1–1e3) that contributed to all
haplogroups except for D (specific to Finland). Haplotypes 2 and 3 were both further divided
into three haplotypes (2a–c, 3a–c), all belonging to haplogroup B. Haplotypes 2a–c were
specific to Estonia and haplotypes 3a–c to Estonia and adjacent areas of Russia (Leningrad
and Pskov oblasts). Haplotype 5 was also further divided into three closely related haplotypes
5a–c (most found in Arkhangelsk oblast, but a single sample was also detected in eastern
Finland), positioned in haplogroup C. One notable change concerned a brown bear from
Arkhangelsk oblast: in the 1942 bp dataset it carried haplotype 13, which is generally found in
Finland, but in the NCmtDNA dataset it clustered with samples from Kirov oblast (1c1–1c3).
(4) A median-joining network based on the full protein-coding dataset (11406 bp) showed the
same topology (Fig. S5) as that based on the NCmtDNA dataset (Fig. 1), except that several
branches were shorter and three haplotypes present in the NCmtDNA network were not
evident here (see also Table S1).
(5) In comparison with NCmtDNA, complete genomes (16760–16793 bp) yielded a messy
network with many reticulations, presumably due to homoplastic repeat sequences (data not
shown).
The assignment of short sequences did not always correspond, even coarsely, to the results
from the nearly complete mtDNA (Dataset 3, Table 1). For example, one bear from
Arkhangelsk oblast grouped together with two Finnish bears in the 1942 bp dataset (Fig. S1b,
haplotype 13), whereas the same bears were placed into different haplogroups based on the
NCmtDNA dataset: the bear from Arkhangelsk oblast was placed into haplogroup C (Fig. 1,
haplotype 13b).
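The haplotype frequencies reported for dataset (1) can be turned into a haplotype diversity value with a short sketch. The counts below were tallied from the 257 bp column of Table S1 (85 orange, 4 light blue, 2 black, 2 blue, 1 red, 1 green). The estimator is the standard unbiased form Hd = n/(n − 1) × (1 − Σp²); that this is the estimator behind Table S3 is our assumption, and the function name is hypothetical.

```python
# Minimal sketch: haplotype diversity from haplotype counts.
# Assumes the unbiased estimator Hd = n/(n-1) * (1 - sum(p_i^2)).

def haplotype_diversity(counts):
    """Return haplotype diversity for a list of per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two samples are required")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Counts for the 257 bp dataset, tallied from Table S1.
counts_257bp = [85, 4, 2, 2, 1, 1]

hd_257bp = haplotype_diversity(counts_257bp)
central_share = counts_257bp[0] / sum(counts_257bp)

print(round(hd_257bp, 3))       # haplotype diversity of the 257 bp dataset
print(round(central_share, 3))  # share of samples in the central haplotype
```

Under this assumption the result rounds to 0.199, which agrees with the Hd listed for the 257 bp dataset in Table S3, and the central haplotype accounts for just under 90% of samples.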
REFERENCE
Yu, L., Li, Y.W., Ryder, O.A. & Zhang, Y.P. (2007) Analysis of complete mitochondrial
genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian
family that experienced rapid speciation. BMC Evolutionary Biology, 7, 198.
Table S1 Data for 95 brown bear samples and mitogenomes used in different analyses
                      Haplotype designations for four different datasets
No.  GenBank no.¹     257 bp²     1942 bp³    NCmtDNA⁴    Coding⁵     Groups⁶
1    1 (HQ685901)     Orange      2           2a          2a          NEst
2    2 (HQ685902)     Orange      2           2a          2a          NEst
3    2 (HQ685902)     Orange      2           2a          2a          NEst
4    2 (HQ685902)     Orange      2           2a          2a          NEst
5    1 (HQ685901)     Orange      2           2a          2a          NEst
6    3 (HQ685903)     Light blue  Light blue  Light blue  Light blue  NEst
7    4 (HQ685904)     Orange      2           2a          2a          NEst
8    2 (HQ685902)     Orange      2           2a          2a          SEst
9    5 (HQ685905)     Orange      2           2a          2a          NEst
10   6 (HQ685906)     Orange      2           2a          2a          SEst
11   7 (HQ685907)     Orange      2           2b          2b          NEst
12   8 (HQ685908)     Orange      2           2a          2a          NEst
13   2 (HQ685902)     Orange      2           2a          2a          NEst
14   9 (HQ685909)     Orange      2           3a          3a+3c       SEst
15   9 (HQ685909)     Orange      2           3a          3a+3c       SEst
16   2 (HQ685902)     Orange      2           2a          2a          NEst
17   10 (HQ685910)    Orange      2           2a          2a          NEst
18   11 (HQ685911)    Orange      2           2a          2a          NEst
19   2 (HQ685902)     Orange      2           2a          2a          NEst
20   2 (HQ685902)     Orange      2           2a          2a          NEst
21   2 (HQ685902)     Orange      2           2a          2a          NEst
22   12 (HQ685912)    Light blue  Light blue  Light blue  Light blue  NEst
23   2 (HQ685902)     Orange      2           2a          2a          NEst
24   12 (HQ685912)    Light blue  Light blue  Light blue  Light blue  NEst
25   9 (HQ685909)     Orange      3           3a          3a+3c       SEst
26   9 (HQ685909)     Orange      3           3a          3a+3c       SEst
27   13 (HQ685913)    Orange      3           3a          3a+3c       SEst
28   4 (HQ685904)     Orange      2           2a          2a          SEst
29   6 (HQ685906)     Orange      2           2a          2a          NEst
30   14 (HQ685914)    Orange      2           2c          2c          NEst
31   2 (HQ685902)     Orange      2           2a          2a          NEst
32   12 (HQ685912)    Light blue  Light blue  Light blue  Light blue  SEst
33   5 (HQ685905)     Orange      2           2a          2a          NEst
34   1 (HQ685901)     Orange      2           2a          2a          NEst
35   1 (HQ685901)     Orange      2           2a          2a          NEst
36   15 (HQ685915)    Orange      3           3c          3a+3c       NEst
37   16 (HQ685916)    Orange      1           1e2         1e2         WRus
38   17 (HQ685917)    Orange      12          12          12          WRus
39   18 (HQ685918)    Orange      12          12          12          WRus
40   19 (HQ685919)    Orange      1           1e1         1e1         WRus
41   20 (HQ685920)    Orange      1           1e1         1e1         WRus
42   21 (HQ685921)    Orange      12          12          12          WRus
43   22 (HQ685922)    Orange      1           1e1         1e1         WRus
44   22 (HQ685922)    Orange      1           1e1         1e1         WRus
45   23 (HQ685923)    Orange      3           3b          3b          WRus
46   18 (HQ685918)    Orange      12          12          12          WRus
47   24 (HQ685924)    Orange      12          12          12          WRus
48   22 (HQ685922)    Orange      1           1e1         1e1         WRus
49   25 (HQ685925)    Orange      1           1b          1b          WRus
50   26 (HQ685926)    Orange      1           1a2         1a2         WRus
51   27 (HQ685927)    Orange      4           4           4           NRus
52   54 (HQ685954)    Orange      3           3a          3a+3c       WRus
53   28 (HQ685928)    Orange      13          13b         13b+1c1     NRus
54   29 (HQ685929)    Black       Black       Black       Black       NRus
55   30 (HQ685930)    Orange      5           5b          5b          NRus
56   31 (HQ685931)    Black       Black       Black       Black       NRus
57   30 (HQ685930)    Orange      5           5b          5b          NRus
58   32 (HQ685932)    Orange      5           5a          5a          NRus
59   32 (HQ685932)    Orange      5           5a          5a          NRus
60   33 (HQ685933)    Orange      1           1e3         1e3         NRus
61   34 (HQ685934)    Orange      1           1d          1d          NRus
62   33 (HQ685933)    Orange      1           1e3         1e3         NRus
63   35 (HQ685935)    Orange      11          11          11          NRus
64   36 (HQ685936)    Orange      11          11          11          NRus
65   37 (HQ685937)    Orange      1           1c1         13b+1c1     ERus
66   38 (HQ685938)    Orange      6           6           6           ERus
67   39 (HQ685939)    Orange      1           1c1         13b+1c1     ERus
68   40 (HQ685940)    Orange      7           7           7           ERus
69   41 (HQ685941)    Orange      9           9           9           ERus
70   42 (HQ685942)    Red         Red         Red         Red         ERus
71   43 (HQ685943)    Orange      1           1c3         1c3         ERus
72   44 (HQ685944)    Orange      1           1c2         1c2         ERus
73   45 (HQ685945)    Orange      7           7           7           ERus
74   46 (HQ685946)    Orange      10          10          10          ERus
75   39 (HQ685939)    Orange      1           1c1         13b+1c1     ERus
76   47 (HQ685947)    Orange      1           1a5         1a5         WRus
77   48 (HQ685948)    Orange      1           1a4         1a4         WRus
78   49 (HQ685949)    Orange      1           1e1         1e1         WRus
79   50 (HQ685950)    Orange      1           1a1         1a1         CRus
80   51 (HQ685951)    Orange      1           1a4         1a4         CRus
81   52 (HQ685952)    Orange      1           1a4         1a4         CRus
82   53 (HQ685953)    Orange      1           1a1         1a1         CRus
83   55 (HQ685955)    Orange      8           8           8           NFin
84   56 (HQ685956)    Orange      8           8           8           SFin
85   57 (HQ685957)    Blue        Blue        Blue        Blue        NFin
86   56 (HQ685956)    Orange      8           8           8           SFin
87   57 (HQ685957)    Blue        Blue        Blue        Blue        SFin
88   58 (HQ685958)    Orange      1           1a3         1a3         SFin
89   59 (HQ685959)    Orange      8           8           8           SFin
90   60 (HQ685960)    Green       Green       Green       13+green    NFin
91   61 (HQ685961)    Orange      1           1a3         1a3         SFin
92   62 (HQ685962)    Orange      13          13          13+green    NFin
93   61 (HQ685961)    Orange      1           1a3         1a3         SFin
94   63 (HQ685963)    Orange      13          13          13+green    NFin
95   64 (HQ685964)    Orange      5           5c          5c          SFin
¹ Haplotype numbers in GenBank annotation derived from complete mtDNA (Dataset 5, see
Table S3).
² Control region (see also Table S3 and Fig. S1a).
³ Control region and CYTB (see also Table S3 and Fig. S1b).
⁴ Nearly complete mtDNA (16686–16689 bp): complete mtDNA with homoplastic sequences
removed, i.e. Dataset 3 (see Table S3, Fig. 1, Materials and Methods).
⁵ Complete protein-coding dataset (11406 bp, see Table S3 and Fig. S5).
⁶ Names of regions used in spatial diffusion analysis; Estonia = North-Estonia (NEst, n = 27,
Lääne-Virumaa, Ida-Virumaa, Raplamaa, Järvamaa, Jõgevamaa) and South-Estonia (SEst,
n = 9, Pärnumaa, Viljandimaa, Tartumaa, Põlvamaa), Finland = North-Finland (NFin, n =
5, Oulu and Lapland provinces) and South-Finland (SFin, n = 8, Southern Finland,
Western Finland and Eastern Finland provinces). European Russia: Northern (NRus, n =
13), Western (WRus, n = 18), Central (CRus, n = 4) and Eastern Russia (ERus, n = 11).
See also Fig. 3.
Table S2 Primers for PCR and sequencing of complete mtDNA of brown bears
Primer pair no.  Primer sequence (5' to 3')     Position*      Size of PCR product
1                F - ACAACCAGTAGAACATCCCTT      15188–15208    1800 bp
                 R - TGGCACGAAATTAACCAA         300–317
2                F - CATAAAGGTTTGGTCCTAG        67–85          1598 bp
                 R - CAAGTTGATATGGGTGTTG        1646–1664
3                F - GCCTGGTGATAGCTGGTT         1439–1456      1467 bp
                 R - CTGCGATAGGTTGTAGGAGAC      2885–2905
4                F - AGCCCGGTGATTGCATAA         2691–2708      1674 bp
                 R - TGGACGGTGAGATTTGGT         4347–4364
5                F - CGCATCCATGCTCCTGAT         4114–4131      1555 bp
                 R - ATGGTGGCAGCAACCAGA         5651–5668
6                F - ATTGGCACTCTTTACCTTCT       5389–5408      1557 bp
                 R - GTGGTTATGATATTGGCTTG       6926–6945
7                F - ATCCAGATGCCTATACAACA       6674–6693      1702 bp
                 R - TTACGAGTATCGGGATTAGA       8356–8375
8                F - AACTAATCTTCTGGGTCTATTA     8191–8212      1533 bp
                 R - GTTTGTGATGCTCAGGGA         9706–9723
9                F - TCTATTGATGAGGGTCCTG        9395–9413      1744 bp
                 R - GATTATTAAGGCTGTTGCTC       11119–11138
10               F - ACAAGTTCTATCTGCCTACG       11001–11020    1548 bp
                 R - CGGATTAGAAGAAATACCC        12530–12548
11               F - AACATGAAAACCTTAACA         12377–12394    1500 bp
                 R - ATGGCCACTGAGCAGTAT         13859–13876
12               F - ACTGTATAAAGCCGCAATC        13652–13670    1726 bp
                 R - GTCTTAGGGAGGGTAGGTA        15359–15377
13**             F - TTCCAATCCTACTAACCCTTC      15823–15843    812 bp
                 R - CATGCCTGTGATTACAGTGAT      16614–16634
*Positions are according to HQ685901 (Table S1); **This is an additional primer pair used
occasionally to double-check part of the control region if an equivocal sequence was obtained
using the original primers; F – forward primer; R – reverse primer
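The relationship between the primer positions and the listed product sizes in Table S2 can be checked with a small sketch. For pairs 2 to 13 the size equals the last base of the reverse primer minus the first base of the forward primer, plus one. Pair 1 spans the origin of the circular genome, so its size also depends on the genome length, which is passed in as a parameter here. The function name and the `genome_length` parameter are hypothetical.

```python
# Minimal sketch: expected amplicon size from primer coordinates.
# Coordinates are 1-based and inclusive, as in Table S2.

def amplicon_size(fwd_start, rev_end, genome_length=None):
    """Size of the PCR product spanning fwd_start..rev_end.

    If rev_end < fwd_start, the product is assumed to wrap around the origin
    of a circular genome, and genome_length (in bp) must be supplied.
    """
    if rev_end >= fwd_start:
        return rev_end - fwd_start + 1
    if genome_length is None:
        raise ValueError("genome_length is required for an origin-spanning product")
    return (genome_length - fwd_start + 1) + rev_end


# (forward primer start, reverse primer end) for a few pairs from Table S2.
primer_pairs = {
    2: (67, 1664),
    3: (1439, 2905),
    12: (13652, 15377),
    13: (15823, 16634),
}

sizes = {pair: amplicon_size(start, end) for pair, (start, end) in primer_pairs.items()}
print(sizes)
```

For the pairs shown, the computed sizes match the values listed in the table (1598, 1467, 1726 and 812 bp).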
Table S3 Population genetic statistics of different mtDNA datasets (n = 95)
Dataset        Length (bp)   h    Hd (SD)        π (SD)             V    S   PI   k
1. 257 bp      257           6    0.199 (0.055)  0.00097 (0.0003)   6    2   4    0.249
2. 1942 bp     1942          18   0.839 (0.024)  0.00094 (0.0001)   21   6   15   1.816
3. NCmtDNA*    16686–16689   37   0.923 (0.02)   0.00075 (0.00003)  117  32  85   12.501
4. Coding**    11406         34   0.92 (0.02)    0.00092 (0.00004)  90   25  65   10.511
5. CmtDNA***   16760–16793   64   0.969 (0.01)   0.00097 (0.00004)  132  32  100  16.213
h, number of haplotypes; Hd, haplotype diversity; SD, standard deviation; π, nucleotide
diversity; V, total number of variable sites; S, singletons; PI, parsimony-informative sites; k,
average number of nucleotide differences.
*NCmtDNA denotes “nearly complete mtDNA”, with TC-tracks, TAC-CGT replacements
and tandem repeats removed from the complete genome due to homoplasy (see Materials and
Methods);
**Coding – all protein-coding genes concatenated;
***Complete mtDNA – full-length mtDNA sequences. See Materials and Methods for
detailed information about datasets.
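Two internal relationships among the statistics in Table S3 can be checked with a brief sketch: the variable sites split into singletons and parsimony-informative sites (V = S + PI), and per-site nucleotide diversity equals the average number of nucleotide differences divided by sequence length (π = k / L). Only the fixed-length datasets are used, because the NCmtDNA and complete mtDNA alignments vary in length. The dictionary layout and function names are hypothetical.

```python
# Minimal sketch: consistency checks on values transcribed from Table S3.
table_s3 = {
    "257 bp": {"length": 257, "V": 6, "S": 2, "PI": 4, "k": 0.249, "pi": 0.00097},
    "1942 bp": {"length": 1942, "V": 21, "S": 6, "PI": 15, "k": 1.816, "pi": 0.00094},
    "Coding": {"length": 11406, "V": 90, "S": 25, "PI": 65, "k": 10.511, "pi": 0.00092},
}


def pi_from_k(k, length_bp):
    """Per-site nucleotide diversity from the mean number of pairwise differences."""
    return k / length_bp


def check_row(row, tol=5e-6):
    """True if V = S + PI and k / length agrees with the reported pi (5 decimals)."""
    sites_ok = row["V"] == row["S"] + row["PI"]
    pi_ok = abs(pi_from_k(row["k"], row["length"]) - row["pi"]) < tol
    return sites_ok and pi_ok


for name, row in table_s3.items():
    print(name, round(pi_from_k(row["k"], row["length"]), 5), check_row(row))
```

All three rows pass both checks, with agreement to the five decimal places reported in the table.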
Table S4 Molecular diversity indices for different regions of brown bear mtDNA (n = 95
mitogenomes)
Sequence*           Length (bp)  Hd (SD)          π (SD)              V    S   PI   I
tRNA-Phe            68           0                0                   0    0   0    0
12S rRNA            962–963      0.196 (0.053)    0.00021 (0.00006)   3    1   2    1
tRNA-Val            66           0.082 (0.038)    0.00124 (0.00057)   1    0   1    0
16S rRNA            1579         0.618 (0.040)    0.00047 (0.00004)   5    0   5    0
tRNA-Leu            75           0.042 (0.028)    0.00056 (0.00038)   1    0   1    0
ND1                 957          0.387 (0.063)    0.00045 (0.00008)   6    1   5    0
tRNA-Ile            70           0.021 (0.021)    0.0003 (0.00029)    1    1   0    0
tRNA-Gln**          73           0                0                   0    0   0    0
tRNA-Met            69           0                0                   0    0   0    0
ND2                 1044         0.264 (0.056)    0.00031 (0.00008)   5    3   2    0
tRNA-Trp            68           0                0                   0    0   0    0
tRNA-Ala**          69           0                0                   0    0   0    0
tRNA-Asn**          73           0                0                   0    0   0    0
tRNA-Cys**          67           0.042 (0.028)    0.00062 (0.00042)   1    0   1    0
tRNA-Tyr**          67           0.12 (0.044)     0.0018 (0.00065)    1    0   1    0
COX1                1545         0.784 (0.026)    0.00082 (0.00007)   8    3   5    0
tRNA-Ser**          72–74        0                0                   0    0   0    2
tRNA-Asp            67           0                0                   0    0   0    0
COX2                684          0.598 (0.037)    0.001 (0.0001)      6    4   2    0
tRNA-Lys            68           0                0                   0    0   0    0
ATP8                204          0.121 (0.045)    0.0006 (0.0002)     2    1   1    0
ATP6                681          0.542 (0.044)    0.00095 (0.0001)    4    0   4    0
COX3                784          0.379 (0.06)     0.001 (0.0002)      5    0   5    0
tRNA-Gly            69           0                0                   0    0   0    0
ND3                 346          0.042 (0.028)    0.00012 (0.00008)   1    0   1    0
tRNA-Arg            69           0                0                   0    0   0    0
ND4L                297          0.671 (0.021)    0.003 (0.00021)     4    1   3    0
ND4                 1378         0.739 (0.043)    0.001 (0.0001)      13   3   10   0
tRNA-His            69           0.021 (0.021)    0.00031 (0.0003)    1    1   0    0
tRNA-Ser            59           0.062 (0.034)    0.001 (0.00059)     2    1   1    0
tRNA-Leu            70           0                0                   0    0   0    0
ND5                 1821         0.812 (0.026)    0.0012 (0.0001)     19   5   14   0
ND6**               528          0.452 (0.058)    0.001 (0.00018)     7    1   6    0
tRNA-Glu**          69           0.042 (0.028)    0.0006 (0.0004)     1    0   1    0
CYTB                1140         0.735 (0.031)    0.001 (0.00009)     10   3   7    0
tRNA-Thr            70           0                0                   0    0   0    0
tRNA-Pro**          65           0                0                   0    0   0    0
Control region      1317–1350    0.835 (0.022)    0.00329 (0.00025)   25   3   22   35
Control region***   1244         0.358 (0.00401)  0.0005 (0.00011)    10   3   7    0
*for position numbers, see the reference sequence HQ685901 in GenBank (Table S1); **genes coded
by complementary strand; ***shorter control region (NCmtDNA); Hd, haplotype diversity; SD,
standard deviation; π, nucleotide diversity; V, total number of variable sites; S, singletons; PI,
parsimony-informative sites; I, indels
Table S5 Relative contribution of different mitochondrial genes (and the shortened control
region) to the NCmtDNA network (Fig. 1). In the NCmtDNA and haplogroup columns the
first value represents the number of mutations in a particular gene, and the second value is the
number of mutations in a particular gene expressed as a percentage of the number of
mutations in the whole network (NCmtDNA) or in a particular haplogroup (A-E). The
mitochondrial genes contributing most to the NCmtDNA network are shown in bold red.
Total = number of mutations for the whole NCmtDNA network and for different haplogroups.
Sequence*
tRNA-Phe
12S rRNA
tRNA-Val
16S rRNA
tRNA-Leu
ND1
tRNA-Ile
tRNA-Gln**
tRNA-Met
ND2
tRNA-Trp
tRNA-Ala**
tRNA-Asn**
tRNA-Cys**
tRNA-Tyr**
COX1
tRNA-Ser**
tRNA-Asp
COX2
tRNA-Lys
ATP8
ATP6
COX3
tRNA-Gly
ND3
tRNA-Arg
ND4L
ND4
tRNA-His
tRNA-Ser
tRNA-Leu
ND5
ND6**
tRNA-Glu**
CYTB
tRNA-Thr
tRNA-Pro**
Control region (shorter)
Total
Length (bp) NCmtDNA
68
962-963
66
1579
75
957
70
73
69
1044
68
69
73
67
67
1545
72-74
67
684
68
204
681
784
69
346
69
297
1378
69
59
70
1821
528
69
1140
70
65
1244
4/3.3
1/0.8
5/4.2
1/0.8
6/5
1/0.8
5/4.2
1/0.8
1/0.8
8/6.7
2/1.7
6/5
2/1.7
4/3.3
5/4.2
1/0.8
4/3.3
13/10.8
1/0.8
2/1.7
19/15.8
7/5.8
1/0.8
10/8.3
10/8.3
120
A
1/3.6
2/7.1
2/7.1
2/7.1
2/7.1
1/3.6
5/17.9
3/10.7
2/7.1
2/7.1
2/7.1
4/14.3
28
Haplogroups
B
C
D
E
3/5
1/8.3
1/1.7
1/8.3
3/5
1/1.7
2/3.3 2/22.2
1/1.7
1/8.3
2/16.7
1/1.7
1/1.7
1/8.3
3/5
1/8.3
1/8.3
3/5
2/16.7
2/3.3
2/16.7 2/3.3 1/11.1
1/1.7
2/3.3
2/16.7
6/10
1/8.3
1/1.7
2/3.3
2/16.7 11/18.3 1/11.1 3/25
3/5
2/22.2
1/1.7
2/16.7 5/8.3
1/8.3
5/8.3 3/33.3 1/8.3
12
60
9
12
Table S6 Relative contribution of different mitochondrial genes (and the control region) to
the NCmtDNA network (Fig. 1) normalized in relation to gene length. Contribution values
were calculated as follows: the number of mutations in a particular gene was divided by the
number of mutations in the whole network (NCmtDNA) or in a particular haplogroup (A–E),
and the result was then divided by gene length (and multiplied by 100). The
mitochondrial genes contributing most to the NCmtDNA network are shown in bold red.
Total = number of mutations for the whole NCmtDNA network and for different haplogroups
Sequence*
tRNA-Phe
12S rRNA
tRNA-Val
16S rRNA
tRNA-Leu
ND1
tRNA-Ile
tRNA-Gln**
tRNA-Met
ND2
tRNA-Trp
tRNA-Ala**
tRNA-Asn**
tRNA-Cys**
tRNA-Tyr**
COX1
tRNA-Ser**
tRNA-Asp
COX2
tRNA-Lys
ATP8
ATP6
COX3
tRNA-Gly
ND3
tRNA-Arg
ND4L
ND4
tRNA-His
tRNA-Ser
tRNA-Leu
ND5
ND6**
tRNA-Glu**
CYTB
tRNA-Thr
tRNA-Pro**
Control region
(shorter)
Total
Haplogroups
B
C
D
8.7
42.1
44
5.3
37.0
5.8
258
39.7
66.5
41.5
41.5
44.9
5.4
101.5
12.2
27.2
203.9
8.2
181.3
8.0
18.7
100.8
12.1
40.3
94.2
76.3
16.8
67.8
15.8
467.6
40.3
121.8
12.2
-
Length (bp)
NCmtDNA
68
962-963
66
1579
75
957
70
73
69
1044
68
69
73
67
67
1545
72-74
67
684
68
204
681
784
69
346
69
297
1378
69
59
70
1821
528
69
1140
70
65
3.5
12.6
2.6
11.1
5.2
11.9
3.9
12.4
12.4
4.3
22.8
7.9
8.2
4.9
5.3
2.4
11.2
7.9
12.1
28.2
8.7
11.1
12.1
7.3
-
A
8.1
26.7
24.4
16.5
349.5
18.6
81.3
27.8
14.0
48.3
22.4
-
1244
6.7
41.0
-
11.2
297.7
55.8
120
28
12
60
9
12
E
72.1
133
44.9
203.1
50.4
114.4
60.9
-
Figure S1 Phylogeographic relationships between brown bears in north-western Eurasia (n =
95) based on two mtDNA datasets: (a) 257 bp; (b) 1942 bp (see Table S3). A separate colour
is assigned to each haplotype in (b), with the colours corresponding to those in (a); e.g. an
orange haplotype in (b) represents an individual that was assigned to the large central
haplotype (also orange) in (a). Note that: (1) on networks the sample sizes for haplotypes are
presented inside the symbols (small numbers in italic) and the haplotype numbers are beside
the symbols (large numbers in bold); (2) on maps the sample sizes are not presented, and the
haplotype numbers are inside the points.
Figure S2 Relationships between sequence length and the number of parsimony-informative
sites (PIS), the average number of nucleotide differences (k) and the haplotype diversity (Hd)
(see also Table 1). Asterisks denote parameters multiplied by a factor of ten.
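For readers unfamiliar with the quantities plotted in Figure S2, the following sketch shows how variable sites, singletons, parsimony-informative sites and the average number of nucleotide differences (k) can be computed from an alignment. The toy alignment is invented for illustration and is not taken from the bear data. The definitions are the common simplified ones (a site is parsimony-informative if at least two states each occur in at least two sequences; other variable sites are counted as singletons), gaps and ambiguous bases are not handled, and the function names are hypothetical.

```python
# Minimal sketch: site classification and mean pairwise differences (k)
# for a gap-free alignment of equal-length sequences.
from collections import Counter
from itertools import combinations


def classify_sites(seqs):
    """Return (variable, singleton, parsimony_informative) site counts."""
    variable = singleton = informative = 0
    for column in zip(*seqs):
        state_counts = Counter(column)
        if len(state_counts) < 2:
            continue  # invariant site
        variable += 1
        # Informative: at least two states, each seen in at least two sequences.
        if sum(1 for c in state_counts.values() if c >= 2) >= 2:
            informative += 1
        else:
            singleton += 1
    return variable, singleton, informative


def mean_pairwise_differences(seqs):
    """Average number of differing positions over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


# Hypothetical toy alignment (not real data).
toy_alignment = ["ACGTAC", "ACGTAT", "ATGTAT", "ATGAAC"]

site_counts = classify_sites(toy_alignment)
k_toy = mean_pairwise_differences(toy_alignment)
print(site_counts, round(k_toy, 3))
```

In the toy alignment, three of six sites are variable, of which one is a singleton and two are parsimony-informative. Longer alignments generally offer more such sites, which is the relationship summarized in the figure.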
Figure S3 NCmtDNA network with defining mutations (i.e. Dataset 3, Table S3). Five
haplogroups (A–E) are shown. The colours and numbering of haplotypes follow that in Fig.
1.
Figure S4 Bayesian phylogenetic tree of brown bear haplotypes based on 95 nearly complete
mitochondrial sequences (Dataset 3, see Tables S1 & S3). Haplogroup names (A–E) and
colours correspond to those in Fig. 1.
Figure S5 Median-joining network of brown bears in north-western Eurasia (n = 95) based on
complete protein-coding sequences (i.e. Dataset 4, Table S3). Compared to the nearly
complete mtDNA (i.e. Dataset 3 in Table S3; Fig. 1), the same five haplogroups (A–E) are
defined, although the lengths of several branches are shorter here and three haplotypes are
missing (not diverged owing to a smaller number of characters; shown with red arrows). The
colours and numbering of haplotypes follow that in Fig. 1.