Epistasis Analysis Using Microarrays
Chris Workman
Experiments with Microarrays

Cool technology, but how do we use it? How is it useful?

Identify “marker genes” in disease tissues


Toxicology, stress response



Classification, diagnostics
Drug candidate screens, basic science
Genetic factors
Measuring interactions (ChIP-on-chip)
Overview




Expression profiling in single-deletions
Epistasis analysis using single- and double-deletions
Epistasis analysis, genetic and environmental factors
Reconstructing pathways that explain the genetic relationships between genes
Expression Profiling in 276 Yeast Single-Gene Deletion Strains (“The Rosetta Compendium”)

Only 19% of yeast genes are essential in rich media

Giaever et al. Nature (2002)
Clustered Rosetta Compendium Data
Gene Deletion Profiles Identify Gene Function and Pathways
Principle of Epistasis Analysis
Experimental Design


Compare single-gene deletions to wild type
Compare the double knockout to wild type
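The comparison above can be sketched in code. This is a minimal illustration, not the speaker's method: it assumes each strain is summarized as a list of log-ratios versus wild type, and it calls a gene "epistatic" when the double-deletion profile correlates clearly better with that gene's single-deletion profile. The function names, the `margin` parameter, and the example numbers are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def epistatic_gene(single_a, single_b, double_ab, margin=0.2):
    """Return "A" or "B" if the double-deletion profile resembles one single
    deletion clearly more than the other, else "undetermined".
    The margin is an arbitrary illustrative threshold (an assumption)."""
    ra = pearson(single_a, double_ab)
    rb = pearson(single_b, double_ab)
    if ra - rb > margin:
        return "A"
    if rb - ra > margin:
        return "B"
    return "undetermined"

# Hypothetical log-ratios (strain vs wild type) for four genes.
single_a = [2.0, -1.5, 0.1, 1.0]
single_b = [-0.5, 0.2, 1.8, -1.0]
double_ab = [1.9, -1.4, 0.0, 1.1]
call = epistatic_gene(single_a, single_b, double_ab)
```

Here the double mutant looks like the A single mutant, so A would be called epistatic to B under this toy rule.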
Experimental Design: Single vs Double-Gene Deletions
Classical Epistasis Analysis Using Microarrays to Determine the Molecular Phenotypes
Time-series expression (0-24 hrs, every 2 hrs)
Mixing Genetic and Environmental Factors
Expression in Single-Gene Deletions
(yeast mec1 and dun1 deletion strains)
Chen-Hsiang Yeang, PhD
MIT
UC Santa Cruz
Craig Mak
UCSD
Yeang, Jaakkola, Ideker. J Comp Bio (2004)
Yeang, Mak, et al. Genome Res (2005)
Measurements
“Systems level” understanding
Treat disease
Networks
Synthetic biology
In silico cells
Test & Refine
Displaying deletion effects
Published work: “Epistasis analysis using expression profiling” (2005)
Relevant Interactions

Subset of Rosetta compendium used

28 deletions were TFs (red circles)

355 differentially expressed genes (white boxes)

P < 0.005

755 TF-deletion effects (grey squiggles)
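As a small illustration of how a list of TF-deletion effects could be selected at the stated cutoff, the sketch below keeps only (deleted TF, affected gene) pairs with P < 0.005. The data structure, the function name, and the example p-values are assumptions made for illustration; the source does not describe how the p-values were computed.

```python
P_CUTOFF = 0.005  # significance cutoff quoted on the slide

def significant_effects(pvalues, cutoff=P_CUTOFF):
    """pvalues maps (deleted_tf, gene) -> p-value of differential expression.
    Returns the sorted pairs that pass the cutoff."""
    return sorted(pair for pair, p in pvalues.items() if p < cutoff)

# Hypothetical p-values for three deletion/gene pairs.
example = {
    ("TF_A", "gene1"): 0.0004,
    ("TF_A", "gene2"): 0.03,
    ("TF_B", "gene1"): 0.0049,
}
effects = significant_effects(example)
```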
Network Measurements

Yeast under normal growth conditions

Promoter binding


ChIP-chip / location analysis
Lee et al. Science (2002)
Protein-protein interaction

Yeast 2-hybrid
Database of Interacting Proteins (DIP)
Deane et al. Mol Cell Proteomics (2002)
ChIP Measurement of Protein-DNA Interactions (Chromatin Immunoprecipitation)
Step 1: Network connectivity (ChIP-chip analysis)
~ 5k genes (white boxes)
~ 20k interactions (green lines)
Step 2: Network annotation
(gene expression analysis)
Measure variables that are a function of the network (gene expression).
Monitor these effects after perturbing the network (TF knockouts).
What parts are wired together
How and why the parts are wired together the way they are
Inferring regulatory paths
[Figure: direct vs. indirect regulatory paths; each interaction annotated as inducer OR repressor]
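To make the idea of direct and indirect paths with inducer/repressor annotations concrete, here is a minimal sketch. It assumes a directed network stored as a nested dict whose edge values are +1 (inducer) or -1 (repressor), and it assumes a path "explains" a deletion effect when the product of its edge signs is the opposite of the observed change (deleting an activator lowers its target). The names `simple_paths`, `path_sign`, `explaining_paths` and the example network are hypothetical.

```python
def simple_paths(edges, source, target, max_len=3):
    """Yield simple directed paths (lists of nodes) with at most max_len edges."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target and len(path) > 1:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue
        for nxt in edges.get(node, {}):
            if nxt not in path:
                stack.append(path + [nxt])

def path_sign(edges, path):
    """Product of the +1/-1 annotations along a path."""
    sign = 1
    for u, v in zip(path, path[1:]):
        sign *= edges[u][v]
    return sign

def explaining_paths(edges, deleted, affected, effect_sign, max_len=3):
    """Paths from the deleted gene to the affected gene whose aggregate sign
    is opposite to the observed expression change (illustrative assumption)."""
    return [p for p in simple_paths(edges, deleted, affected, max_len)
            if path_sign(edges, p) == -effect_sign]

# Hypothetical network: TF1 induces G directly, and represses TF2, which represses G.
edges = {"TF1": {"G": +1, "TF2": -1}, "TF2": {"G": -1}}
# Deleting TF1 lowers G (effect_sign = -1): both paths are consistent.
paths = sorted(explaining_paths(edges, "TF1", "G", effect_sign=-1))
```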
Computational methods

Problem Statement:


Find regulatory paths consisting of physical interactions that “explain” functional relationships
Method:

A probabilistic inference approach




Yeang, Ideker et al. J Comp Bio (2004)
To assign annotations
Formalize problem using a factor graph
Solve using max product algorithm

Kschischang et al. IEEE Trans. Information Theory (2001)

Mathematically similar to Bayesian inference, Markov random fields, belief propagation
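The max-product step can be illustrated on a toy problem. The sketch below is not the factor graph from the cited work: it assumes a simple chain of three sign variables (+1 inducer, -1 repressor) with made-up unary and pairwise potentials, runs max-product along the chain, and checks the result against brute-force enumeration. All names and numbers are hypothetical.

```python
from itertools import product

STATES = (+1, -1)

def max_product_chain(unary, pairwise):
    """MAP assignment on a chain. unary[i][s] scores x_i = s;
    pairwise[i][(s, t)] scores (x_i = s, x_{i+1} = t)."""
    msg = [dict(unary[0])]   # msg[i][s]: best score of a prefix ending in x_i = s
    back = []                # back-pointers for recovering the assignment
    for i in range(1, len(unary)):
        cur, ptr = {}, {}
        for t in STATES:
            best_s = max(STATES, key=lambda s: msg[i - 1][s] * pairwise[i - 1][(s, t)])
            cur[t] = msg[i - 1][best_s] * pairwise[i - 1][(best_s, t)] * unary[i][t]
            ptr[t] = best_s
        msg.append(cur)
        back.append(ptr)
    last = max(STATES, key=lambda s: msg[-1][s])
    assignment = [last]
    for ptr in reversed(back):
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return assignment, msg[-1][last]

def brute_force(unary, pairwise):
    """Enumerate every assignment; only feasible for tiny examples."""
    best = None
    for xs in product(STATES, repeat=len(unary)):
        score = 1.0
        for i, x in enumerate(xs):
            score *= unary[i][x]
        for i in range(len(xs) - 1):
            score *= pairwise[i][(xs[i], xs[i + 1])]
        if best is None or score > best[1]:
            best = (list(xs), score)
    return best

# Made-up potentials: x0 is probably +1, x0 and x1 tend to agree, x1 and x2 to differ.
unary = [{1: 0.9, -1: 0.1}, {1: 0.5, -1: 0.5}, {1: 0.5, -1: 0.5}]
agree = {(1, 1): 0.8, (-1, -1): 0.8, (1, -1): 0.2, (-1, 1): 0.2}
differ = {(1, 1): 0.1, (-1, -1): 0.1, (1, -1): 0.9, (-1, 1): 0.9}
map_assignment, map_score = max_product_chain(unary, [agree, differ])
```

On a chain, max-product is exact and costs time linear in the number of variables, whereas enumeration grows exponentially; on graphs with cycles it is generally an approximation.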
Inferred Network Annotations
A network with ambiguous annotation
Test
&
Refine
Which deletion experiments should we do first?

A mutual information based score

For each candidate experiment (gene deletion)

Variability of predicted expression profiles



Predict profile for each possible set of annotations
More variable = more information from experiment
Reuse network inference algorithm to compute effect of deletion!
I M;Y e   H(M ) H M | Y e 
 H M    PM  mPY e  ylog 2 PM  m | Y e  y
m, y
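The score can be sketched numerically. The example below assumes a finite set of candidate models (annotation sets) with prior probabilities, and assumes each model predicts one expression profile per candidate deletion; it then computes I(M; Y_e) = H(M) - H(M | Y_e) for each experiment. The function names, the uniform prior, and the three toy experiments are hypothetical.

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(prior, predict):
    """prior: model -> P(M = m). predict: model -> predicted profile (hashable).
    Returns I(M; Y) in bits for one candidate experiment."""
    joint = defaultdict(float)   # P(M = m, Y = y)
    p_y = defaultdict(float)     # P(Y = y)
    for m, pm in prior.items():
        y = predict[m]
        joint[(m, y)] += pm
        p_y[y] += pm
    h_m = entropy(prior.values())
    h_m_given_y = -sum(p * log2(p / p_y[y])
                       for (m, y), p in joint.items() if p > 0)
    return h_m - h_m_given_y

# Four equally likely models and three hypothetical candidate deletions.
prior = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}
predictions = {
    "geneA": {"m1": ("up",), "m2": ("down",), "m3": ("none",), "m4": ("up", "up")},
    "geneB": {"m1": ("up",), "m2": ("up",), "m3": ("up",), "m4": ("up",)},
    "geneC": {"m1": ("up",), "m2": ("up",), "m3": ("down",), "m4": ("down",)},
}
scores = {g: mutual_information(prior, pred) for g, pred in predictions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

An experiment whose predicted profile differs across all models separates them completely, while one that every model predicts identically yields no information.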
Ranking candidate experiments
Gene    Function                                        Score     Downstream genes   Rank   Model
HHF1    Histone                                         52.1429   74                 1      2
SOK2*   regulator for meiosis and PKA pathway           45.0279   64                 2      1
CKA1    protein kinase of cell cycle                    45.0075   64                 3      5
A2      mating response                                 40.9023   58                 4      4
YAP6*   stress response regulator                       35.1652   50                 5      1, 3
NRG1    regulator of glucose dependent genes            31.6501   45                 6      3
FKH1    regulator of cell cycle                         29.1194   41                 7      2
FKH2    regulator of cell cycle                         26.7131   38                 8      7
SLT2    protein kinase of cell wall integrity pathway   23.4727   31                 9      8
MSN4*   regulator of stress response                    21.8224   31                 10     1
HAP4*   regulator of cellular respiration               6.3310    9                  34     1
We target experiments to one region of the network
Expression for: SOK2Δ, HAP4Δ, MSN4Δ, YAP6Δ
Expression of Msn4 targets
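Average signed z-score
[Equation: Z_e, an average (1/N, summed over i = 1..N) of signed z-scores built from z_ie and sgn(r_ie); the exact form is not recoverable from the source]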
Expression of Hap4 targets
Yap6 targets are unaffected
Refined Network Model

Caveats




Assumes target genes are correct
Only models linear paths
Combinatorial effects missed
Measurements are for rich media growth
Using this method of choosing the next experiment

Is it better than other methods?

How many experiments?

Run simulations vs:


Random
Hubs
Simulation results
Number of simulated deletion profiles used to learn a “true” network
Current Work
Measurements
“Systems level” understanding
Treat disease
Networks
Test & Refine
Transcriptional response to DNA damage
Acknowledgments
Trey Ideker
Craig Mak
Chen-Hsiang Yeang
Tommi Jaakkola
Scott McCuine
Maya Agarwal
Mike Daly
Ideker lab members
Tom Begley
Leona Samson
Funding grants from NIGMS, NSF, and NIH