Genetic Interactions With the
Laboratory Environment
Elissa J. Chesler, Ph.D.
University of Tennessee
Health Science Center
Studying Individual Differences
in the Mouse
Individual differences are due to both
environmental and genetic effects.
Evidence for a strong role of the laboratory
environment comes from multiple sources:
• experimentalist woe
• direct examination
• heritability estimates
Experimentalist Woe:
Now you see it, now you don’t!
• Anecdotal evidence of failures to replicate
• A file-drawer problem
• Data driven evaluation of the laboratory
environment must be performed
Trading Spaces
• Genetic Architecture of Selected Lines:
– open field activity test
– High and low activity lines bred selectively (Flint et al, 1995)
– Two replicates to determine whether the same loci are selected
(Turri et al, 2001).
– The 2001 lines had the same selected loci.
– Only two highly significant loci were replicated across 1995 and
2001 experiments.
A Direct Examination:
Three labs, same mice
• Crabbe, Wahlsten and Dudek (1999)
– 8 behavioral traits studied in Portland,
Edmonton and Albany laboratories.
– Strains had similar relative phenotypes
– Magnitude of effects varied by lab
– What were the relevant environmental factors?
Heritability Estimation:
The Tail Withdrawal Test of
Thermal Nociception
[Figure: tail-withdrawal test, labeled 49°C]
Estimating Heritability
• Heritability is the proportion of trait variance
accounted for by genetic factors.

$$h^2 = \frac{\sigma_G^2}{\sigma_G^2 + \sigma_E^2}$$
Inbred Mice—A diverse genetic resource
Beck et al, 2000
Estimating Heritability
Table 2. One-way ANOVA table used to estimate heritability of tail withdrawal baselines.

Source of Variance   d.f. (a)       Sums of Squares     Mean Squares           Expected Mean Squares (b)
Strain               S-1 = 28       SSbs = 198.89       SSbs/(S-1) = 7.10      σ²ws + kσ²bs = σ²ws + 186.32 σ²bs
Error                N-S = 5543     SSws = 647.10       SSws/(N-S) = 0.12      σ²ws (σ²ws = .11674)
Total                N-1 = 5571     SStotal = 845.99

(a) S is the number of strains and N is the total number of individuals.
(b) The coefficient, k, is the number of individuals in each strain in a balanced design. In an unbalanced design, k = (1/(S-1)) * {N – (Σ ni² / N)}, where ni is the number of individuals in the i-th strain.
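The arithmetic behind the table can be sketched as follows. This is a minimal sketch, assuming the between-strain variance component stands in for genetic variance and the within-strain component for environmental variance; the function names are hypothetical and do not come from the slides.

```python
def k_coefficient(group_sizes):
    """Group-size coefficient k for an unbalanced one-way design (footnote b)."""
    s = len(group_sizes)
    n = sum(group_sizes)
    return (n - sum(ni * ni for ni in group_sizes) / n) / (s - 1)


def heritability(ms_between, ms_within, k):
    """Estimate h^2 from one-way ANOVA mean squares.

    Assumption: between-strain variance is treated as genetic variance,
    within-strain variance as environmental variance.
    """
    var_between = (ms_between - ms_within) / k  # sigma^2_bs
    var_within = ms_within                      # sigma^2_ws
    return var_between / (var_between + var_within)


# Values taken from the ANOVA table.
ms_between = 198.89 / 28    # SSbs / (S - 1)
ms_within = 647.10 / 5543   # SSws / (N - S)
h2 = heritability(ms_between, ms_within, k=186.32)
print(round(h2, 2))  # 0.24
```

With the tabled values this gives roughly 0.24, in line with the 24% heritability reported for the tail-withdrawal data.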
Organismic Influences on
Tail-Withdrawal Latency: Genotype
[Figure: TW latency (s) by genotype]
Variability in Tail-Withdrawal Latency:
Something in the Air?
[Figure: histogram of tail-withdrawal latency (s), 0 to 10 s; axes: count and proportion per bar. h2 = 24%, n = 8034, mean = 3.1 s, SD = 1.3 s]
Construction of the TW Data Archive
• Data Sheet Records
– 11 Experimenters
– 40 Genotypes including RI, Mutant, Selected, Inbred, Outbred
– 4 Seasons
– 9:30 – 17:00 h
– Both Sexes
– Cage Populations
– Order of testing within cage
• Merged by date with animal colony records
– Temperature
– Humidity
– Cage changes
– Food lots
Organismic Influences on
Tail-Withdrawal Latency: Sex
[Figure: distribution of TW latency (s) by sex (Male, Female); axes: count and proportion per bar]
Organismic Influences on
Tail-Withdrawal Latency: Weight
[Figure: TWBL (y-axis, 0 to 10) plotted against WT (x-axis, 0 to 60)]
Environmental Influences on
Tail-Withdrawal Latency:
Experimenter
[Figure: TW latency (s) by experimenter (KM, HH, BM, JH, SW, JM)]
Environmental Influences on
Tail-Withdrawal Latency: Season
[Figure: TW latency (s) by season (Winter, Spring, Summer, Fall)]
Environmental Influences on
Tail-Withdrawal Latency: Cage Density
[Figure: TW latency (s) by cage density (1 to 6), in separate panels for males and females]
Environmental Influences on
Tail-Withdrawal Latency: Time of Day
[Figure: TW latency (s) by time of day (1000 to 1600 h) for albino and pigmented mice]
Environmental Influences on
Tail-Withdrawal Latency: Order of Testing
[Figure: TW latency (s) by order of testing (1 to 6)]
Which of these factors actually matter?
A “Messy Data” Problem
• Large sample sizes preclude meaningful planned
comparisons—everything is “significant”!
• Data are unbalanced with respect to the many
predictors.
• Some observations are missing.
• Insufficient data for comparing variable
importance through hierarchically related models.
• Linear modeling fits a single structure to data,
when many complex structures may exist.
"To consult a statistician
after an experiment is
finished is often merely to
ask him to conduct a postmortem examination.
He can perhaps say what
the experiment died of."
- R. A. Fisher, 1938
Which factors actually matter?
• Archive analysis
– Data Mining
– Modeling
• Planned Experimentation
Data Mining the G×E Interaction
• Classification And Regression Trees (CART)
• Develops rules for splitting data into groups
using the many predictors.
• Partitions are chosen that maximally reduce
the variability in the resulting subsets.
• Variables are ranked based on the degree to
which they reduce variability.
• This method allows for many complex data
structures to co-exist.
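The splitting rule described above can be sketched in a few lines. This is a simplified illustration, not the CART implementation used for the archive: it only tries "one level versus the rest" splits, the helper names are hypothetical, and the toy data are invented.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (the variability of a subset)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, predictors, target):
    """Return (reduction, predictor, level) for the best one-level-vs-rest split."""
    total = sum_sq([r[target] for r in rows])
    best = (0.0, None, None)
    for p in predictors:
        for level in sorted({r[p] for r in rows}):
            left = [r[target] for r in rows if r[p] == level]
            right = [r[target] for r in rows if r[p] != level]
            if not left or not right:
                continue
            # Reduction in variability achieved by this partition.
            reduction = total - sum_sq(left) - sum_sq(right)
            if reduction > best[0]:
                best = (reduction, p, level)
    return best


# Toy data, invented for illustration only.
rows = [
    {"tester": "A", "sex": "F", "latency": 2.0},
    {"tester": "A", "sex": "M", "latency": 2.2},
    {"tester": "B", "sex": "F", "latency": 3.9},
    {"tester": "B", "sex": "M", "latency": 4.1},
]
reduction, predictor, level = best_split(rows, ["tester", "sex"], "latency")
print(predictor, round(reduction, 2))
```

A full tree applies the same search recursively to each resulting subset, and a predictor's importance can then be scored by the total reduction it contributes across splits.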
Detail of the regression tree
[Figure: detail of the regression tree. Legend: Experimenter, Genotype, Season, Cage Density, Time of Day, Sex, Humidity, Order]
Entire tree is available online at:
http://www.nature.com/neuro/journal/v5/n11/extref/nn1102-1101-S1.pdf
The resulting regression tree accounts for
42% of the variance in trait data
[Figure: relative error (y-axis, 0.5 to 0.9) against number of nodes (x-axis, 0 to 600); marked value: 0.584]
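The 42% figure appears to follow from the relative error of 0.584 on the plot, assuming relative error is the tree's residual sum of squares divided by the total sum of squares:

```latex
\[
  \text{variance explained} = 1 - \text{relative error} = 1 - 0.584 = 0.416 \approx 42\%
\]
```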
Assessing the Environmental Influence
Table 2. Factor importance rankings computed by CART.

Factor             Number of Levels   Score
Experimenter       11                 100.0
Genotype           40                 78.0
Season             4                  35.8
Cage Density       7                  20.4
Time of Day        3 (a)              17.4
Sex                2                  14.6
Humidity           4 (b)              12.0
Order of Testing   7                  8.7

(a) Time of day levels were: early (09:30-10:55 h), midday (11:00-13:55 h), and late (14:00-17:00 h).
(b) Humidity levels were: high (≥60%), medium-high (40-59%), medium-low (20-39%), and low (<20%).
• In the presence of sex differences, females were more sensitive than males.
• The first mouse from each cage has a higher latency than other mice.
• Lower latencies
– late in the day
– in the spring
– in higher humidity
Humidity and Season
• Humidity fluctuates with season.
• This is true even in a “climate controlled” environment.
[Figure: % humidity (y-axis, 0 to 80) over an x-axis of 0 to 350, with Winter, Spring, Summer and Fall labeled]
• TW baselines drop with increasing humidity within spring, summer and fall.
[Figure: TW latency (s) by humidity level (<20%, 20-39%, 40-59%, >60%), one panel each for Winter, Spring, Summer and Fall]
Which factors actually matter?
• Archive analysis
– Data Mining
– Modeling
• Planned Experimentation
Modeling of Fixed-Effects
Table 3. The tail-withdrawal variability model (a)

Source               df (c)   F (b)    P-value
STRAIN               10       7.19     0.0001
SEX                  1        20.12    0.0001
SEASON               3        0.82     0.4823
TIME                 2        4.51     0.0111
CAGEPOP              1        3.82     0.0509
HUMIDITY             3        0.44     0.7268
ORDER                5        27.84    0.0001
PERSON               4        33.99    0.0001
STRAIN x SEX         10       4.18     0.0001
STRAIN x SEASON      30       3.46     0.0001
STRAIN x TIME        19       1.80     0.0181
STRAIN x CAGEPOP     10       2.09     0.0224
STRAIN x HUMIDITY    30       1.64     0.0163
STRAIN x PERSON      35       3.25     0.0001
TIME x SEASON        4        3.10     0.0149
SEASON x HUMIDITY    6        3.23     0.0037
SEX x CAGEPOP        1        4.08     0.0436
PERSON x TIME        4        3.16     0.0135
POPCAT x SEASON      3        5.37     0.0011
TIME x HUMIDITY      4        7.93     0.0001
CAGEPOP x HUMIDITY   3        3.15     0.0241

(a) Fixed effects remaining in the final reduced model of tail-withdrawal variability, based on 1772 subjects.
(b) The denominator df = 1580.
(c) Note that some numerator df's are lower than expected due to empty cells.
• All factors interact with genotype
except for within cage order of
testing.
Strain Differences in Tester Effects
[Figure: tester effects (y-axis, 0 to 2) by strain for testers JM and SW; strains: RIIIS, DBA/2, CBA, C58, AKR, BALB/c, C3H/He, C57BL/10, C57BL/6, A, 129/P3]
Which factors actually matter?
• Archive analysis
– Data Mining
– Modeling
• Planned Experimentation
Experimenter
[Figure: TW latency (s), 0 to 5, by experimenter (BM, JH, JM, KM, SW); legend: LS Means, Planned Experiment; P < .05]
Genotype
[Figure: TW latency (s), 0 to 5, by genotype (129/P3, A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6J, C57BL/10J, C58/J, CBA/J, DBA/2J, RIIIS/J); legend: LS Means (P < .05), Planned Experiment (P < .05)]
Time of Day
[Figure: TW latency (s), 0 to 5, by time of day (08:00-10:55, 11:00-13:55, 14:00-17:00); legend: LS Means, Planned Experiment; P < .05]
Cage Density
[Figure: TW latency (s), 0 to 5, by cage density (1-3, Low; 4-6, High); legend: LS Means]
Sex
[Figure: TW latency (s), 0 to 5, by sex (Female, Male); legend: LS Means, Planned Experiment]
Order of Testing
[Figure: TW latency (s), 0 to 5, by order of testing (First, Second, Third, Fourth); legend: LS Means, Planned Experiment]
Planned Experiments:
Order of Testing
[Figure: TW latency (s) by order of testing (1st to 4th); legend: Home Cage, Holding Cage]
[Figure: % analgesia against morphine dose (5, 10, 20, 40 mg/kg) by order of testing. AD50: 1st, 14.2 mg/kg; 2nd, 16.6 mg/kg; 3rd, 17.2 mg/kg; 4th, 22.0 mg/kg *]
• Within-cage order of testing is a main effect.
• The order influence can be eliminated.
• The order influence is even greater in studies of analgesia than in studies of nociception.
Nature, Nurture or Both?
[Figure: partition of trait variance. Genotype: 27%; Environment: 45%; Genotype by Environment: 15%; Residual: 13%. Segment labels: STRAIN, TESTER, TIME, ORDER, SEX, ENVxENV, SEXxENV, STRAINxSEX, STRAINxENV, STRAINxSEXxENV, ERROR]
• Genotype accounts for less than 1/3 of the trait variance.
• Two-thirds of the variance is accounted for by environmental effects and their interactions with genotype.
Why is the laboratory environment
more important than ever?
• Expansion of the scope of projects
• Multiple staff turnovers – transience of undergraduates,
graduate students, and post-docs
• Long-term Experiments (mapping studies, special
breeding)
• Multi-lab, multi-site collaboration (TMGC)
• Data sharing projects (e.g. WebQTL, MPD)
• Distributed Mouse Reagents (TMGC)
• Later addition of data (fickle dissertation committee, pilots
of costly studies)
• Small sample studies (microarray)
Laboratory influence on
gene expression?
• Many factors can vary systematically with a
grouping variable (Confounds)
• Unplanned is not the same as random.
• Careful balancing of important factors is the best
approach.
• Small samples can easily become confounded.
[Figure: strains (B6, D2, C3H) arranged by time of day (Morning, Afternoon)]
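One way to balance a factor such as time of day across strains, rather than leaving it unplanned, is sketched below. This is a minimal sketch under assumed conditions: the function name, the animal IDs and the group sizes are hypothetical, and the strain and session labels are borrowed from the slide only for illustration.

```python
import random
from collections import Counter


def balanced_assignment(animals, sessions, seed=0):
    """Assign animals to sessions so each strain is spread evenly across sessions."""
    rng = random.Random(seed)
    by_strain = {}
    for animal_id, strain in animals:
        by_strain.setdefault(strain, []).append(animal_id)
    plan = {}
    for strain, ids in by_strain.items():
        rng.shuffle(ids)                       # randomize order within the strain
        offset = rng.randrange(len(sessions))  # vary which session is filled first
        for i, animal_id in enumerate(ids):
            plan[animal_id] = sessions[(i + offset) % len(sessions)]
    return plan


# Hypothetical animals: four per strain.
animals = [(f"{strain}-{i}", strain) for strain in ("B6", "D2", "C3H") for i in range(4)]
plan = balanced_assignment(animals, ["Morning", "Afternoon"])
counts = Counter((strain, plan[animal_id]) for animal_id, strain in animals)
print(counts)
```

Each strain ends up with two animals in each session, so strain is not confounded with time of day even in a small sample.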
Integrating Data Across Laboratories
www.webQTL.org
High Correlation Across
Laboratories for this Trait
A highly heritable behavioral trait
Chromosome 18
Locomotor Activity
Standardization vs. Systematic Variation
Standardization:
• Fix laboratory conditions for the entire study
• Cost effective for high throughput studies
• Results may only apply to a specific environment
Systematic Variation:
• Perform experiment across a limited set of known conditions
• Cost increase or power decrease
• Increases ability to generalize findings to multiple environments
Acknowledgements
Data Archive and Analysis
Dr. Jeffrey S. Mogil
Dr. Sandra L. Rodriguez-Zas
Dr. Lawrence Hubert
Dr. William R. Lariviere
Dr. Sonya G. Wilson
…and the Mogil Lab
Dr. John C. Crabbe
Dr. Robert W. Williams
Dr. Daniel Goldowitz