Genetic characterization of commercially available outbred
mice and an assessment of their utility for QTL mapping
Introduction
Genetic dissection of complex disease and quantitative phenotypes in the
mouse is limited by a lack of resources for gene identification. By contrast to
human genome wide association studies (GWAS), which exploit accumulated
historical recombinations to map susceptibility loci with a resolution measured
in tens of kilobases, genetic mapping in mice typically uses crosses between
inbred strains that deliver a resolution measured in tens of megabases. To
improve mapping resolution significantly we need a population of mice with
similar population genetics to humans: a large effective population size and
dense in independent recombination events. It might appear that these criteria
could be met by completely outbred wild mice, but their use for mapping would
encounter the same drawbacks that afflict human GWAS: (i) tens of thousands
of subjects are needed for robust detection of common causal variants and
(ii) the majority of the genetic variance remains unexplained, even using
these large sample sizes.
One potential solution is to map in a population in which susceptibility
loci consist entirely of known higher-frequency alleles. We have previously
demonstrated the potential of this approach using genetically heterogeneous
stock (HS) for high-resolution genetic mapping. The HS is descended from
eight known inbred strains, subjected to approximately 50 generations of
pseudo-random breeding, thereby introducing multiple recombinants. Each
animal is a fine-grained mosaic of the progenitors making high resolution
mapping possible, and the known ancestry means that, in contrast to wild
derived populations, which are likely to contain many rare variants, every
allele can be tracked back to the founders. In consequence loci detected in
the HS explained on average three quarters of the phenotypic variance of
each mapped phenotype (over 100 analysed to date).
Quantitative trait loci (QTLs) that contribute to variation in common
complex phenotypes in an HS can be mapped into intervals of about 3 Mb, a
substantial improvement over mapping in inbred strain crosses, but still too
large for gene-level resolution. Previously, we showed that mapping in a
commercially available stock, HsdOla:MF1 UK mice, could identify genes.
Even though the MF1’s origins are unclear, sequence analysis indicated that it
could be modeled as if animals were descended from inbred strains. We
exploited this feature of the MF1 to identify Rgs2 as a gene underlying a
quantitative trait locus for an anxiety-related phenotype. Subsequently, a US
colony of MF1 has been used to obtain sub-megabase mapping resolution on
a genome-wide level, for QTLs influencing transcript abundance.
Success with the MF1 suggests that commercially available outbred
stocks may be a potentially important resource for gene identification. Not
only could they deliver genome-wide gene-level mapping resolution, but they
could be cheaper to use for mapping than traditional laboratory strains that
have to be maintained and crossed within the user’s laboratory. Outbred mice
are simply imported, phenotyped and then genotyped.
There has been no systematic examination of the genetic architecture
of commercially available mice. To date only about half a dozen colonies have
been examined, from different and often unrelated perspectives: investigations
of eight colonies of outbred Swiss mice, using assays of protein variation,
indicated that the colonies had the same amount of variation found in fully
outbred mouse or human populations {Rice, 1980 #263; Cui, 1993 #1591};
examination of outbred CD-1 mice found high levels of population
substructure {Aldinger, 2009 #8005}; and genetic drift has been documented
in a colony of CFLP mice {Papaioannou, 1980 #8002}.
Important gaps in our current knowledge need to be filled if we are to
determine the suitability of commercial stocks for gene identification. First, we
lack linkage-disequilibrium (LD) maps: low LD will favour high-resolution
mapping. Second, we do not know to what extent colonies are genetically
related. We do not know to what extent the frequency of alleles varies
between colonies, nor what fraction of variants is rare or private to specific
colonies. Stocks with different names are assumed to be genetically different,
but we do not know the extent of that differentiation nor the extent to which
colonies with the same name but sold by different suppliers are genetically
similar. Mapping in colonies that consist primarily of high frequency variants
will require fewer animals. Furthermore colonies that contain alleles common
to laboratory strains would enable loci already detected in inbred strain
crosses to be mapped and, potentially, the genes identified.
Results
Colony breeding protocols, size, age and health status
We contacted commercial providers of outbred stocks throughout the world,
requesting details on colony sizes, colony history and protocols for
maintaining stocks. Table 1 summarizes results from the XX companies that
agreed to provide this information. We estimate that this represents XX of
global colonies of outbred mice.
There is considerable variation in the way animals are maintained and
Table 1 documents practices that rule out colonies for genetic mapping.
Since unintended directional selection (for example culling small mice) and
genetic drift alter genetic diversity, some breeders maintain heterozygosity
by periodically crossing the stock to animals taken from a much smaller
population (the protocol is called IGS, which stands for ….). In consequence,
a small number of chromosomes are distributed widely throughout the
population, introducing large regions of linkage disequilibrium, which
significantly reduces mapping resolution. With the exception of YY
colonies, which we examined to confirm this prediction, we did not genetically
characterize colonies using the IGS breeding scheme.
Colonies also vary considerably in size, age and health status. Larger
colonies (such as XX ) maintain heterozygosity better than smaller colonies.
This is because mouse colonies behave very much like finite island
populations, except for imposed bottlenecks or forcible introduction of new
alleles. The time required for a neutral allele to go to fixation in a population,
and hence to reduce heterozygosity, is approximately 4Ne generations, where
Ne is the effective population size. The age of a colony determines mapping
resolution: older colonies accumulate more recombinations and mapping
resolution depends primarily on the number of generations since the colony
was founded. Finally health status will determine a colony’s suitability in
academic laboratories that impose strict health criteria for allowing animals
into their facilities. For example only XX colonies had sufficiently clean health
reports to be considered for admission into the Mary Lyon Centre, MRC,
Harwell UK.
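The dependence of heterozygosity on colony size can be illustrated with the textbook expectation for an idealized closed, randomly mating population, H_t = H_0 (1 - 1/(2Ne))^t. The sketch below is not taken from the analyses reported here; the function names and the example values of Ne are illustrative assumptions, and real colonies (with bottlenecks, selection or introduced animals) will deviate from it.

```python
import math


def expected_heterozygosity(h0, ne, generations):
    """Expected heterozygosity after `generations` of drift.

    Assumes an idealized closed, randomly mating population of constant
    effective size `ne` (an assumption, not a property of any real colony).
    """
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def generations_to_halve(ne):
    """Generations needed for expected heterozygosity to fall by half."""
    return math.log(0.5) / math.log(1.0 - 1.0 / (2.0 * ne))


if __name__ == "__main__":
    # Hypothetical colony sizes, chosen only to show the trend.
    for ne in (25, 50, 500):
        h = expected_heterozygosity(0.30, ne, 100)
        print(f"Ne={ne}: H after 100 generations = {h:.3f}, "
              f"half-life = {generations_to_halve(ne):.0f} generations")
```

Under these assumptions the half-life of heterozygosity scales roughly linearly with Ne, which is the sense in which larger colonies retain diversity for longer.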
Genetic structure: inbreeding and population stratification
We started by comparing measures of inbreeding and genetic relatedness
within each colony. High rates of inbreeding make colonies less suitable for
mapping because they contain fewer (if any) segregating QTLs. Colonies that
consist of a mixture of relatives (such as siblings, half siblings, cousins,
second degree and third degree relatives) will be difficult to use for mapping
because the differing degrees of genetic relatedness introduce population
structure.
We screened all populations with 351 markers at four loci chosen so
that they could also be used to map QTLs and assess linkage disequilibrium
(Table 2). SNPs were spaced so as to allow us to make inferences about both
long and short range LD. Each of the four regions extends for approximately 4
megabases (Mb) with a mean intermarker distance of 47 Kb. The regions cover
four large effect QTLs detected in the HS that are easy and inexpensive to
phenotype (large effect QTLs can be detected with relatively few animals).
The region on chromosome 17 includes the MHC, highly polymorphic in wild
populations and a sensitive indicator therefore of any loss of heterozygosity.
However it should be noted that the LD structure of the MHC is atypical of the
genome. While these four loci constitute less than 1% of the genome, it is
unlikely that they are unrepresentative; if QTLs cannot be mapped at high
resolution here, it is unlikely that colonies will be suitable for genome-wide
mapping.
Our aim was to compare and rank colonies, which could be done with
genotypes from the four loci. We included four control populations, with
known genetic characteristics: 8 HS mice, 109 collaborative cross mice (a set
of XX recombinant inbred lines being created from eight inbred strains and at
generation XX of inbreeding when analysed), 94 inbred lines and a population
of wild mice caught from multiple sites in Arizona, that consists of unrelated
individuals and is more likely to represent a fully outbred population, similar to
that used in a human GWAS.
Table 3 gives three measures of inbreeding: heterozygosity (inbred
colonies will score low on this measure); the percentage of markers that failed
a test of Hardy Weinberg equilibrium (HWE) (colonies that consist of inbred
but unrelated individuals will have high scores); and a coefficient of inbreeding
that compares the observed versus expected number of homozygous
genotypes {Purcell, 2007 #8008}. A measure of relatedness is given in Figure
1: the pairwise extent of similarity between individuals, using the identity by
state (IBS) of markers (IBS distance = (IBS2 + 0.5 × IBS1) / (number of SNP
pairs), where IBS2 and IBS1 count the SNPs at which two individuals share
two alleles or one allele) {Purcell, 2007 #8008}.
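As a minimal sketch of how these measures can be computed from genotypes, assume each genotype is coded as the count of one allele (0, 1 or 2, with None for a missing call). The function names are hypothetical, and the inbreeding coefficient uses the simple form F = (O - E) / (N - E), with O the observed and E the expected number of homozygous genotypes among N called markers; the software cited above may apply additional corrections that are not reproduced here.

```python
def observed_heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous.

    `genotypes` is a list of individuals, each a list of 0/1/2 or None.
    """
    called = [g for individual in genotypes for g in individual if g is not None]
    return sum(1 for g in called if g == 1) / len(called)


def inbreeding_coefficient(individual, allele_freqs):
    """F = (O - E) / (N - E) for one individual.

    `allele_freqs[i]` is the population frequency of the counted allele at
    marker i. Missing calls and monomorphic markers are skipped.
    """
    observed = expected = n = 0.0
    for g, p in zip(individual, allele_freqs):
        if g is None or p <= 0.0 or p >= 1.0:
            continue
        n += 1
        observed += 1 if g != 1 else 0
        expected += 1.0 - 2.0 * p * (1.0 - p)  # P(homozygote) under HWE
    if n == 0:
        return float("nan")
    return (observed - expected) / (n - expected)


def ibs_dst(ind_a, ind_b):
    """(IBS2 + 0.5 * IBS1) / number of SNP pairs, as defined in the text."""
    ibs2 = ibs1 = pairs = 0
    for a, b in zip(ind_a, ind_b):
        if a is None or b is None:
            continue
        pairs += 1
        diff = abs(a - b)
        if diff == 0:
            ibs2 += 1
        elif diff == 1:
            ibs1 += 1
    return (ibs2 + 0.5 * ibs1) / pairs
```

With this coding, two identical individuals score 1 and two individuals that are opposite homozygotes at every marker score 0.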
The measures detect different features of the genetic structure of the
colonies. While low heterozygosity, high HWE failure and high inbreeding
coefficient correctly identify the inbred strains, the collaborative cross, which is
still not completely inbred, scores relatively well on heterozygosity (19%), but
is identified as inbred by its high inbreeding coefficient (Table 3). The IBS
distance correctly identifies the CC, inbred strains, HS, and the wild-Lausanne
mice as containing more highly related individuals than expected by chance.
There are some surprising findings for the commercial outbreds. Four
colonies are almost inbred: NTac:NIHBS-US, ClrHli:CD1-IL, Hsd:NIHSBC-IL,
BK:W-UK. With heterozygosities < 5%, almost all the markers we genotyped
were monomorphic. A further five colonies have heterozygosities less than
10% and so are unlikely to be useful for mapping (nor indeed for most of the
purposes for which outbred stocks are intended). Three colonies have
inbreeding coefficients greater than 20% (HsdHu:SABRA-IL,
Sca:NMRI-SE_10an, HsdOla:MF1-IL) and a further seven have values greater
than 10%.
Heterozygosity across all populations (including wild mice, HS, CC and
inbred strains) is just over 25%, with about 80% of the total genetic variation
attributable to variation within colonies. However restricting attention to
commercial stocks gives a weighted mean Fst of 0.108. This contrasts with
human populations where estimates of Fst are typically less than 5% (Reich,
etc).
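The partition of variation within and between colonies can be checked with a short calculation. The sketch below uses Nei's Gst = (Ht - Hs) / Ht, with Ht = 0.252 and Hs = 0.200 taken from the summary values listed in the Discussion notes of this draft; the function name is illustrative, and the small difference from the quoted Gst of 0.207 is presumably due to rounding of the inputs.

```python
def gst(h_total, h_within):
    """Nei's Gst: proportion of total heterozygosity due to differences
    between populations."""
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    ht, hs = 0.252, 0.200  # values quoted in the draft's Discussion notes
    between = gst(ht, hs)
    print(f"Gst = {between:.3f}")               # between-colony fraction
    print(f"within colonies = {1 - between:.3f}")  # roughly 80%
```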
Genetic relatedness
We carried out a PCA using genotypes from all populations to investigate the
genetic relationship between colonies. The first two principal components
explain 52% of the variation, but neither component is easy to interpret.
Figure 3 plots the two components, superimposing the stock name (figure 3a),
the producer of each colony (figure 3b) and the country of origin (figure 3c).
Stocks that we obtained from a single producer (for example SABRA) form
relatively discrete groups, but it was not possible to differentiate stocks that
originate from different producers and countries (such as CD1 and NMRI). A
similar picture was found using multi-dimensional scaling of an IBS pairwise
distance matrix (PLINK) (Figure 4a, 4b, 4c).
This result suggested that many of the commercial stocks are derived
from a common set of founders. We attempted to identify this set by
modelling each genome as originating from K ancestral populations, which
determines genetic ancestry regardless of population identity (). We looked at
values of K from 3 to 12. Figure X shows results for K = 9, and plots of the
results are given in Supp Fig X. Across the top of each plot we show the names
of the outbred stocks, and on the bottom the name of each colony.
The proportions of shared ancestry vary considerably between stocks.
MF1 and TO stocks consist largely of a single and unique component. CFW
divides into two: the Crl derived animals have the same ancestry, different
from the Hsd CFW. The latter is indistinguishable from one stock of NMRI
(Hsd:NMRI-DE). The large groups of CD1 and NMRI mice are
heterogeneous, with large variation between suppliers.
We evaluated the relationships between populations using Wright’s
fixation indices (Fst), calculated using population allele frequencies
(Beaumont). Figure XX(a) shows agglomerative clustering of the Fst
distances (without an outgroup it is not possible to root the tree). MF1 stocks
cluster together, as do NIHS, but there is less consistency for CD1 and
NMRI: while stocks from the same supplier aggregate (e.g. CD1 from Crl on
the left of the figure) there is no clear partitioning of the two stocks. As in the
ancestry analysis the Hsd CFW stock clusters with NMRI from the same
supplier (Hsd:NMRI-DE). However we find that the CFW from Crl clusters with
other CD1 stocks.
We assessed population structure within each colony using multidimensional scaling of the IBS pairwise distance matrices. Supp Figure X
shows results for all populations; representative examples are shown in
Figure X. We found two or more clusters in eighteen populations. As expected,
populations maintained by the IGS system showed population structure,
but we also found evidence of structure in XX populations that were
maintained by a random breeding method.
Mapping resolution
We assessed mapping resolution using three measures: (i) haplotypic
diversity across each region; (ii) genetic diversity, measured by the SNPs’
average minor allele frequency (MAF); and (iii) mean LD decay radius (defined
here to be the mean physical separation in bp between SNPs at which the
squared correlation coefficient R2 drops below 0.5). We estimated the
variance of mean MAF and LD decay radius by resampling 80% of the data
500 times and re-calculating both measures.
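One way such an LD decay radius and its resampling variance could be computed is sketched below. This is an assumption-laden illustration, not the procedure used for the results reported here: it takes R2 as the squared Pearson correlation of 0/1/2 genotype codes, bins SNP pairs by physical distance (the 100 Kb bin size is arbitrary), reports the midpoint of the first bin whose mean R2 is below 0.5, and resamples 80% of individuals without replacement. All function and parameter names are hypothetical.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (None if either
    marker is monomorphic in this sample)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_decay_radius(positions, genotypes, bin_size=100_000, threshold=0.5):
    """Midpoint (bp) of the first distance bin with mean R2 below threshold.

    positions[s] is the bp position of SNP s; genotypes[s][i] is the
    genotype of individual i at SNP s. Returns None if R2 never drops.
    """
    bins = {}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = r_squared(genotypes[i], genotypes[j])
            if r2 is None:
                continue
            b = abs(positions[j] - positions[i]) // bin_size
            bins.setdefault(b, []).append(r2)
    for b in sorted(bins):
        if mean(bins[b]) < threshold:
            return (b + 0.5) * bin_size
    return None


def resampled_radii(positions, genotypes, fraction=0.8, n_resamples=500, seed=0):
    """LD decay radius recomputed on random subsets of individuals."""
    rng = random.Random(seed)
    n = len(genotypes[0])
    k = max(2, int(fraction * n))
    radii = []
    for _ in range(n_resamples):
        keep = rng.sample(range(n), k)  # sampling without replacement
        subset = [[snp[i] for i in keep] for snp in genotypes]
        radius = ld_decay_radius(positions, subset)
        if radius is not None:
            radii.append(radius)
    return radii
```

The spread of the values returned by `resampled_radii` gives an empirical interval for the radius; the same loop can recompute mean MAF on each subset.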
We phased haplotypes across the four regions using fastPHASE,
following the procedure described in Conrad 2006. We expect the proportion
of shared haplotypes between colonies to reflect the genetic relationships.
Comparison with the clustering based on Fst shows good agreement (figure
X) validating the haplotype reconstruction. The total number of haplotypes
Figure we
Figure xx shows the results for all populations analysed, sorted by the
LD decay radius (there were insufficient genotypes to calculate an LD decay
radius for NTac:NIHBS-US and ClrHli:CD1-IL). The mean is shown as a
black bar and the 95% confidence intervals as a grey box, with outliers
displayed at both extremities.
Many colonies have a mean LD decay radius comparable to that found
in the wild Arizona mice (0.8 Mb), which we can use as a benchmark for a stock
appropriate for gene-level mapping resolution. By contrast the HS has a
value of 2.9 Mb. There are 29 populations with mean MAF less than 0.05 or a
mean LD decay radius greater than 2 Mb. Combined with exclusions on the basis of
poor genetic structure, there are 35 populations that have properties
conducive to high resolution mapping; ten of these have LD decay radii of less
than 1 Mb.
Populations vary considerably in the extent of LD at the
different loci. Figure xx shows LD plots (from Haploview) on chromosome 17
for six populations. Variation is such that although HsdWin:NMRI-NL has a
mean LD decay radius of just over 1 Mb, it will be of little use for mapping
the MHC region.
We compared our findings from 351 markers with those obtained from
whole genome analyses. We used whole genome mouse SNP arrays to
interrogate six colonies, chosen to cover a range of LD decay measures:
Crl:CFW(SW)-US_P08, HsdWin:CFW1-NL, HsdWin:NMRI-NL, Hsd:ICR(CD-1)-FR,
RjHan:NMRI-FR, Crl:NMRI(Han)-FR. Figure 4 shows good agreement
between the decay of LD with distance averaged across the genome (in 100
Kb windows) and the LD decay detected by the 351 SNPs.
Comparable measures were found for genetic structure (table 3).
Temporal variation
The genetic characteristics of colonies will vary over time because unintended
directional selection and genetic drift alter genetic diversity. We assessed XX
colonies on XX occasions.
Sequence analysis and novel variants
We used two methods to determine the extent and nature of sequence
variation. First we used PCR to amplify 22 fragments of about 1.2 Kb (see
Supp Table xxx for primer information). We randomly selected eight regions
from a 5 Mb QTL region we previously mapped on mouse chromosome 1
(REF), four regions from three loci involved in HDL, CD4 and MCV traits
(REF) and two regions from the AKP2 locus. We sequenced 12 animals from
each of the three pilot populations (HsdWin:CFW-1 NL HNL1, Crl:CFW US
K71 and HsdWin:NMRI NL HNL1), 12 wild mice (DNA provided to us
by Alexandre Reymond, University of Lausanne) and 10 classical inbred
strains (A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6J, CBA/J, DBA/2J, LP/J,
I/LnJ and RIII/DmMobJ).
We discovered 120 SNPs (see Supp Table xx for detailed information).
Wild mice have an average of one SNP every 200 bp, but this rate varies
between strains: HsdWin:CFW-1 and Crl:CFW have a frequency of one SNP
every 350 bp, whereas HsdWin:NMRI has on average one SNP every
520 bp. Nine of the SNPs are coding variants (table ).
We found 3 novel variants (giving a rate of 2.5%) in Crl:CFW
(positioned on chr1:173306046, chr1:173368101 and chr17:34785468) and
only one (rate 0.8%) in each of HsdWin:CFW-1 and HsdWin:NMRI
(chr17:34785468).
Our locus-specific sequencing data suggest that HsdWin:CFW-1 is closely
related to wild-derived inbred strains (PWK) whereas Crl:CFW and
HsdWin:NMRI are related to Swiss-derived inbred strains (eg NOD and FVB).
Genome sequencing
Genome wide analysis of sequence variants
CNVs?
QTL mapping
A critical determinant of the usefulness of the stock is whether it can be used
to replicate and fine-map QTLs detected in other populations {Yalcin, 2004
#32}. We analysed 200 animals from three colonies: Crl:CFW (USA),
HsdWin:CFW (Netherlands) and HsdWin:NMRI (Netherlands). Blood samples
were taken from a tail vein and we performed assays for serum alkaline
phosphatase (ALP), the ratio of CD4+ to CD8+ T-cells, concentration of
high-density lipoproteins (HDL) in serum and mean red cell volume.
We found significant association results for three phenotypes (ALP,
CD4/CD8 ratio and HDL). Applying a conservative Bonferroni correction for
testing 351 markers for four phenotypes in three populations gives a logP
threshold of 4.93, which, as figure XX shows, is exceeded over a 1 Mb interval
on chromosome 4 for ALP, a 0.5 Mb region on chromosome 1 for HDL and a 2 Mb
region on chromosome 17 for CD4/CD8 ratio. The QTLs are
detected in different populations: ALP in Crl:CFW (with less
significant evidence for association in HsdWin:NMRI); HDL in HsdWin:CFW;
CD4/CD8 in Crl:CFW and HsdWin:CFW.
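The threshold quoted above can be reproduced with a short calculation, assuming a nominal significance level of 0.05 (the level is not stated in the text, but it recovers the quoted value). The function name is illustrative.

```python
import math


def bonferroni_logp_threshold(n_markers, n_phenotypes, n_populations, alpha=0.05):
    """-log10 of the Bonferroni-corrected P-value threshold.

    alpha=0.05 is an assumed nominal level, not a value given in the text.
    """
    n_tests = n_markers * n_phenotypes * n_populations
    return -math.log10(alpha / n_tests)


if __name__ == "__main__":
    # 351 markers x 4 phenotypes x 3 populations = 4212 tests
    print(round(bonferroni_logp_threshold(351, 4, 3), 2))
```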
The extent of the association signal seen in Fig X could be due to
linkage disequilibrium between markers, to the presence of multiple
independent effects within the same region or due to undetected population
structure. To distinguish between these alternatives we used a resample
model averaging (RMA) procedure developed in our analysis of the HS (). The
data were re-sampled (without replacement) 2,000 times, and in each resample
forward selection was used to determine which markers to keep in a model
explaining phenotypic variation.
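The idea behind the resampling can be illustrated with a deliberately simplified sketch. It is not the procedure used here: instead of full forward selection of a multi-marker model, each resample keeps only the single marker most strongly correlated with the phenotype (the first step of forward selection), and the subsample fraction of 0.8 is an assumption. Function and parameter names are hypothetical.

```python
import random
from statistics import mean


def squared_correlation(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def inclusion_frequencies(genotypes, phenotype, n_resamples=2000,
                          fraction=0.8, seed=0):
    """Proportion of resamples in which each marker is the top-ranked one.

    genotypes[m][i] is the genotype (0/1/2) of animal i at marker m.
    Animals are subsampled without replacement in every resample.
    """
    rng = random.Random(seed)
    n = len(phenotype)
    k = int(fraction * n)
    counts = [0] * len(genotypes)
    for _ in range(n_resamples):
        keep = rng.sample(range(n), k)
        y = [phenotype[i] for i in keep]
        scores = [squared_correlation([g[i] for i in keep], y)
                  for g in genotypes]
        best = max(range(len(scores)), key=scores.__getitem__)
        counts[best] += 1
    return [c / n_resamples for c in counts]
```

A marker that is selected in most resamples is robust to the composition of the sample; a signal spread thinly over many correlated markers suggests that LD, rather than a single well-localized effect, is driving the association.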
We determined the performance and resolution of the method by
simulating a QTL at each polymorphic marker in the three regions and in all
populations. As expected, confidence intervals depended on the location of
the QTL within a region of high LD, and varied from less than 100 Kb to more
than 2 Mb (Fig).
Results of RMA mapping of the three phenotypes are shown in Figure X
with the strength of pairwise LD indicated by a grayscale above the plots
(where black circles are R2 of 1). We found no evidence of multiple effects at
these loci (as indicated by the logP of second and subsequent rounds of
forward selection falling below significance thresholds). The ALP locus
remains diffusely spread over a 1 megabase region in both the Crl:CFW and
HsdWin:NMRI populations. However much higher resolution is seen for
mapping CD4/CD8 ratio and HDL, where the 95% confidence intervals (from
simulation) are less than 200 Kb in the vicinity of the QTL (?high resolution
figure??)
Characterization of the molecular basis of CD4/CD8 – h2ealpha is
within the location we have identified chr17:34,421,575-34,579,223
Characterization of the molecular basis of HDL. The locus is chr1:173.6-173.7 Mb;
this excludes apoa2 (chr1:173,155,220-173,156,501). It includes Cd48,
SlamF1, CD84 and SlamF6. NOTE A DUPLICATION IN LOOKSEQ at
173,759,999-173,775,001
Deletion at 173735500 - 173745500
Discussion
We have characterized XX commercially available mouse colonies, from YY
breeders in ZZ locations across the world. We document considerable
variation in genetic diversity between colonies, estimate inbreeding,
population structure and linkage disequilibrium for each colony, catalogue
sequence variation and show that colonies can be used to map QTLs. We show
that a number of outbred colonies have linkage disequilibrium properties
suitable for high resolution mapping, and that about 80% of the variation
lies within colonies.
Overall, on the basis of low heterozygosity, evidence for unexpectedly high
genetic relatedness and evidence of population structure, 38 colonies
can be excluded.
Gst: Ht = 0.252, Hs = 0.200, Gst = 0.207, r = 0.343
Hs and Ht are the mean heterozygosity within populations and in the entire
population, respectively. Some colonies are appropriate for high resolution
mapping.
Variation between colonies: no single colony is ideal. Needs larger survey of
the genomes of all colonies. Colonies fluctuate, partly the breeders' fault as
they move stock around or introduce large chunks of the genome.
Companies do not
Greater awareness of genetic variation
METHODS
Sequencing
PCR
LD
Genetic mapping
Where necessary, phenotypes are transformed into Gaussian deviates.
Covariates (such as gender, age, experimenter, time) that explain a significant
fraction of each phenotype’s variance with ANOVA P-value<0.01 are included
in subsequent statistical analyses. We use two mapping methods: a single
point analysis of variance of each marker and a multi-point method.
Haplotypes are reconstructed as mosaics of known inbred strains using a
dynamic programming algorithm that minimises the number of breakpoints
required {Yalcin, 2004 #32}. These strains are used as progenitors for the
multipoint analysis (probabilistic ancestral haplotype reconstruction, in the
HAPPY package) {Mott, 2000 #96}. Region-wide significance levels are
estimated by permuting the transformed phenotype values 1,000 times.
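A permutation threshold of this kind can be sketched as follows. The sketch is only an illustration under stated assumptions: it uses the squared correlation between genotype and phenotype as a stand-in for the single-point and multipoint test statistics described above, takes the maximum statistic across all markers in the region for each permutation, and reports the 95th percentile of those maxima. The function names and the alpha level are hypothetical.

```python
import random
from statistics import mean


def squared_correlation(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Region-wide threshold from the null distribution of the maximum
    statistic across markers.

    genotypes[m][i] is the genotype of animal i at marker m; the phenotype
    is shuffled relative to the genotypes in every permutation.
    """
    rng = random.Random(seed)
    y = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(y)
        maxima.append(max(squared_correlation(g, y) for g in genotypes))
    maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return maxima[index]
```

Because the maximum is taken over the whole region in each permutation, the resulting threshold accounts for the correlation between markers without assuming that the tests are independent.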
TABLES
Table 1 – Mouse providers, location, breeding protocols, health status
Table 2 – QTLs and SNPs used to assess colonies
Phenotype      Chr  Start (Mb)  End (Mb)  No. of markers
Red cells MCV  1    131.6       134.5     42
CD4/CD8        17   32.6        38.9      112
ALP            4    136.2       139       72
HDL            1    172.6       177.2     125
Table 3 – Genetic characteristics of outbred mouse colonies
Population                No.  % genotyped  % homozygote  Het.  Pct MAF < 5%  Pct fail HWE  Mean inbreeding coef
Aai:ICR-US                24   98.83        75.92         0.08  6.80          2.27          2.76
BK:W_UK                   48   92.17        87.25         0.04  3.12          2.27          8.78
BomTac:NMRI-DK-151        23   91.98        65.16         0.16  2.83          1.70          -5.68
BomTac:NMRI-DK-160        24   93.11        65.72         0.15  3.97          1.98          4.57
CC                        109  89.17        5.38          0.19  2.83          89.24         67.28
ClrHli:CD1_IL             20   94.65        93.20         0.01  2.83          0.57          -16.50
Crl:CD1(ICR)-DE           48   94.07        40.51         0.19  18.98         7.08          10.26
Crl:CD1(ICR)-FR           48   94.26        32.01         0.28  15.01         4.53          6.00
Crl:CD1(ICR)-IT           48   95.15        33.71         0.31  13.31         5.38          4.70
Crl:CD1.ICR_UK            48   93.20        30.88         0.27  13.88         3.97          4.40
Crl:CD1(ICR)-US_C61       24   96.81        31.44         0.30  12.75         2.27          0.68
Crl:CD1(ICR)-US_H43       24   96.07        36.54         0.29  9.92          3.97          6.00
Crl:CD1(ICR)-US_H48       24   95.88        37.68         0.30  1.70          2.55          -4.18
Crl:CD1(ICR)-US_K64       48   93.91        29.46         0.30  14.16         5.38          -1.41
Crl:CD1(ICR)-US_K95       24   97.14        44.19         0.28  3.12          2.27          -10.45
Crl:CD1(ICR)-US_P10       24   96.41        42.21         0.22  15.58         1.98          1.56
Crl:CD1(ICR)-US_R16       24   96.86        38.24         0.35  3.40          2.83          -12.10
Crl:CD1.ICR-US_iso        30   97.37        37.96         0.24  11.90         4.25          13.73
Crl:CF1-US                48   94.92        25.50         0.35  4.82          6.80          10.04
Crl:CFW(SW)-US_K71        48   94.25        41.36         0.26  4.25          4.53          6.28
Crl:CFW(SW)-US_P08        48   97.27        29.18         0.22  24.36         0.00          4.65
Crl:MF1_UK                47   93.04        64.87         0.13  1.13          1.13          -2.06
Crl:NMRI(Han)-DE          48   94.74        39.94         0.27  11.61         4.82          1.93
Crl:NMRI(Han)-FR          48   85.44        37.39         0.26  5.67          6.23          12.01
Crl:NMRI(Han)-HU          48   90.37        39.66         0.26  8.22          6.52          0.43
Crl:OF1-FR_B22            24   91.89        26.63         0.35  6.80          6.80          -5.27
Crl:OF1-FR_B41            24   93.77        27.76         0.35  9.07          6.80          -7.98
Crl:OF1-HU                48   92.54        28.05         0.35  5.10          6.80          -1.35
Crlj:CD1(ICR)-JP          48   94.79        41.93         0.21  8.22          7.08          4.61
HS                        48   98.44        21.81         0.43  0.57          2.83          -3.88
HanRcc:NMRI-CH            48   94.17        66.29         0.20  1.98          1.98          -11.67
Hla:(ICR)CVF-US           48   83.42        49.29         0.21  12.46         4.82          -3.13
Hsd:ICR(CD-1)-DE          48   89.89        47.03         0.29  4.25          5.10          2.13
Hsd:ICR(CD-1)-ES          48   88.56        46.46         0.26  7.37          5.38          3.49
Hsd:ICR(CD-1)-FR          48   93.52        45.04         0.28  5.10          5.38          5.60
Hsd:ICR(CD-1)-IL          48   86.08        43.91         0.29  6.23          3.68          -6.55
Hsd:ICR(CD-1)-IT          48   88.94        47.03         0.28  2.83          4.82          7.52
Hsd:ICR(CD-1)-MX          48   91.28        47.88         0.30  5.10          13.60         -11.34
Hsd:ICR(CD-1)-UK          48   92.96        46.18         0.28  5.95          3.97          -0.34
Hsd:ICR(CD-1)-US          48   95.99        48.16         0.28  6.80          5.38          4.36
Hsd:ND4-US                48   93.68        69.97         0.07  17.00         2.27          4.89
Hsd:NIHSBC_IL             12   91.64        90.93         0.02  1.42          0.57          3.11
Hsd:NIHS_UK_C             15   93.75        68.56         0.11  10.48         1.70          6.36
Hsd:NIHS_UK_G             33   92.63        75.07         0.11  3.40          3.12          -5.09
Hsd:NIHS-US               48   92.11        54.67         0.19  6.52          9.92          -18.01
Hsd:NSA(CF1)-US           48   93.30        30.88         0.34  12.18         11.61         1.90
HsdHu:SABRA_IL            48   91.97        45.04         0.22  5.67          22.38         25.44
HsdIco:OF1-IT             48   90.48        30.31         0.34  2.27          13.60         5.22
HsdOla:MF1_IL             8    90.51        50.42         0.21  0.00          1.70          21.38
HsdOla:MF1-UK_C           48   72.71        26.06         0.21  10.20         4.25          5.31
HsdOla:MF1_UK_G           48   93.90        41.08         0.28  7.37          3.40          -0.65
HsdOla:MF1-US_202A_iso    24   93.87        75.35         0.13  1.70          0.85          -6.90
HsdOla:MF1-US_202A_prod   24   94.76        75.35         0.13  1.13          0.85          -9.21
HsdOla:TO_UK              48   93.63        71.10         0.10  4.25          3.68          9.47
HsdWin:CFW1-DE            48   87.64        49.01         0.24  9.92          7.93          -0.88
HsdWin:CFW1-NL            48   82.99        51.84         0.21  7.93          4.82          3.62
HsdWin:NMRI-DE            48   90.78        58.07         0.20  6.80          2.27          -8.87
HsdWin:NMRI-NL            64   93.96        57.79         0.19  5.95          3.12          2.11
HsdWin:NMRI_UK            32   93.92        62.89         0.12  15.58         1.70          -4.89
IcrTac:ICR-US             36   89.28        69.69         0.06  13.31         2.55          5.40
Inbreds_94_strains        94   91.25        0.00          0.00  2.83          98.58         100.00
NTac:NIHBS-US             36   91.71        93.77         0.01  1.98          0.57          -53.44
RjHan:NMRI-FR             48   92.58        31.16         0.28  14.45         13.60         17.80
RjOrl:Swiss-FR            48   91.68        64.87         0.17  1.70          3.40          -9.22
Sca:NMRI_SE-10an          24   75.51        70.82         0.09  5.38          5.38          22.31
Sca:NMRI_SE_22            24   80.63        75.07         0.09  3.97          3.12          15.16
Sim:(SW)fBR-US_A1         48   94.56        74.50         0.10  5.67          3.68          12.43
Sim:(SW)fBR-US_B1         24   95.82        79.60         0.11  1.42          1.13          -7.87
Tac:SW-US                 36   92.67        46.18         0.33  1.98          3.97          -2.00
Wild_Arizona              96   85.77        17.85         0.26  13.31         38.81         27.86
Table : Whole genome analyses
Population                No.  Markers  Genos.  Hom.   Het.  MAF    HWE   Inbreed coef
Crl:CFW(SW)-US_P08        22   169,333  97.30   71.06  0.19  8.00   6.36  -20.86
HsdWin:CFW1-NL            22   152,716  97.17   74.55  0.18  4.98   7.15  -20.70
HsdWin:NMRI-NL            26   164,287  97.41   73.02  0.13  4.51   7.23  -18.33
Hsd:ICR(CD-1)-FR          20   623,124  87.24   45.19  0.22  10.50  1.53  -11.82
RjHan:NMRI-FR             13   171,198  96.49   63.33  0.18  4.69   7.59  -10.62
Crl:NMRI(Han)-FR          20   623,124  87.04   38.09  0.24  11.09  4.55  3.14
Figure 1: Linkage disequilibrium decay radius and minor allele frequencies in
outbred mice. The figure shows the distribution of 250 analyses of resampled
data
Figure 2
Figure 3
Figure 4: Linkage disequilibrium in six colonies at the MHC locus on mouse
chromosome 17
Figure 4 : Decay of linkage disequilibrium with distance; whole genome
compared to locus specific analyses
Figure 5: QTL mapping of three phenotypes in three colonies
Figure 6: Simulation of resample model averaging
Performance of the RMA method depends on the position of the QTL and the
population analysed. Here the resolution of the RMA (indicated by the
distribution of the black dots) varies according to the position of a simulated
QTL (indicated by dotted red lines).
Figure 7: Resample model averaging and linkage disequilibrium
SUPPLEMENTAL
MDS FIGURE OF ALL POPULATIONS