Epistasis Analysis Using Microarrays
Chris Workman

Experiments with Microarrays
Cool technology, but how do we use it? How is it useful?
* Identify "marker genes" in disease tissues
* Toxicology, stress response
* Classification, diagnostics
* Drug candidate screens, basic science
* Genetic factors
* Measuring interactions (ChIP-on-chip)

Overview
* Expression profiling in single-deletions
* Epistasis analysis using single- and double-deletions
* Epistasis analysis, genetic and environmental factors
* Reconstructing pathways that explain the genetic relationships between genes

Expression Profiling in 276 Yeast Single-Gene Deletion Strains ("The Rosetta Compendium")
* Only 19% of yeast genes are essential in rich media
* Giaever, et al. Nature (2002)

Clustered Rosetta Compendium Data

Gene Deletion Profiles Identify Gene Function and Pathways

Principle of Epistasis Analysis

Experimental Design
* Compare single-gene deletions to wild type
* Compare the double knockout to wild type

Experimental Design: Single vs Double-Gene Deletions

Classical Epistasis Analysis Using Microarrays to Determine the Molecular Phenotypes
* Time series expression (0-24 hrs), every 2 hrs

Mixing Genetic and Environmental Factors

Expression in Single-Gene Deletions (yeast mec1 and dun1 deletion strains)

Collaborators
* Chen-Hsiang Yeang, PhD (MIT, UC Santa Cruz)
* Craig Mak (UCSD)
* Yeang, Jaakkola, Ideker. J Comp Bio (2004)
* Yeang, Mak, et al. Genome Res (2005)

[Diagram labels: Measurements; Networks; "Systems level" understanding; Treat disease; Synthetic biology; In silico cells; Test & Refine]

Displaying deletion effects
* Published work: "Epistasis analysis using expression profiling" (2005)

Relevant Interactions
* Subset of Rosetta compendium used
* 28 deletions were TFs (red circles)
* 355 differentially expressed genes (white boxes), P < 0.005
* 755 TF-deletion effects (grey squiggles)

Network Measurements
Yeast under normal growth conditions
* Promoter binding: ChIP-chip / location analysis. Lee, et al. Science (2002)
* Protein-protein interaction: yeast 2-hybrid, Database of Interacting Proteins (DIP). Deane, et al. Mol Cell Proteomics (2002)

ChIP Measurement of Protein-DNA Interactions (Chromatin Immunoprecipitation)

Step 1: Network connectivity (ChIP-chip analysis)
* ~5k genes (white boxes)
* ~20k interactions (green lines)

Step 2: Network annotation (gene expression analysis)
* Measure variables that are a function of the network (gene expression).
* Monitor these effects after perturbing the network (TF knockouts).

* What parts are wired together
* How and why the parts are wired together the way they are

Inferring regulatory paths
* A deletion effect can be explained by a direct path or by an indirect path
* Annotate each interaction on the path: inducer or repressor

Computational methods
* Problem statement: find regulatory paths consisting of physical interactions that "explain" a functional relationship
* Method: a probabilistic inference approach. Yeang, Ideker et al. J Comp Bio (2004)
* To assign annotations:
  * Formalize the problem using a factor graph
  * Solve using the max-product algorithm. Kschischang. IEEE Trans. Information Theory (2001)
  * Mathematically similar to Bayesian inference, Markov random fields, belief propagation

Inferred Network Annotations

A network with ambiguous annotation

Test & Refine
Which deletion experiments should we do first?
* A mutual-information-based score
* For each candidate experiment (gene e):
  * Predict the expression profile for each possible set of annotations
  * Score the variability of the predicted expression profiles
  * More variable = more information from the experiment
* Reuse the network inference algorithm to compute the effect of a deletion!
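The scoring idea on the Test & Refine slide can be sketched in a few lines of Python. This is a minimal illustration, not the published implementation. It assumes a small, explicitly enumerated set of candidate annotation models with a uniform prior, and it assumes each model deterministically predicts one discretized expression profile per deletion. The model names (m1..m4), the deletion names (delA, delB, delC) and the profile labels are hypothetical. The score is I(M; Y_e) = H(M) - H(M | Y_e), in bits.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(prior, likelihood):
    """Return I(M; Y) = H(M) - H(M | Y) in bits.

    prior:      dict mapping model m -> P(M = m)
    likelihood: dict mapping model m -> {outcome y: P(Y = y | M = m)}
    """
    joint = {}
    p_y = defaultdict(float)
    for m, p_m in prior.items():
        for y, p_y_given_m in likelihood[m].items():
            joint[(m, y)] = p_m * p_y_given_m
            p_y[y] += p_m * p_y_given_m
    # H(M | Y) = -sum_{m,y} P(m, y) * log2 P(m | y), with P(m | y) = P(m, y) / P(y)
    h_m_given_y = -sum(
        p * math.log2(p / p_y[y]) for (m, y), p in joint.items() if p > 0
    )
    return entropy(prior.values()) - h_m_given_y


def deterministic(predictions):
    """Turn model -> predicted profile into a 0/1 likelihood table."""
    return {m: {profile: 1.0} for m, profile in predictions.items()}


# Hypothetical toy setup: four equally likely annotation models.
prior = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}

# Hypothetical predicted profiles (one label per model) for three candidate deletions.
candidates = {
    "delA": {"m1": "up", "m2": "up", "m3": "down", "m4": "down"},
    "delB": {"m1": "none", "m2": "none", "m3": "none", "m4": "none"},
    "delC": {"m1": "p1", "m2": "p2", "m3": "p3", "m4": "p4"},
}

scores = {
    e: mutual_information(prior, deterministic(pred))
    for e, pred in candidates.items()
}
# Most informative experiment first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In this toy case the deletion whose predicted profile differs under every model scores highest, and the deletion on which all models agree scores zero, which matches the slide's "more variable = more information".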
$$I(M; Y_e) = H(M) - H(M \mid Y_e) = H(M) + \sum_{m, y} P(M = m)\, P(Y_e = y \mid M = m)\, \log_2 P(M = m \mid Y_e = y)$$

Ranking candidate experiments

Rank  Gene    Function                                       Score    Downstream genes  Model
1     HHF1    Histone                                        52.1429  74                2
2     SOK2*   Regulator for meiosis and PKA pathway          45.0279  64                1
3     CKA1    Protein kinase of cell cycle                   45.0075  64                5
4     A2      Mating response                                40.9023  58                4
5     YAP6*   Stress response regulator                      35.1652  50                1, 3
6     NRG1    Regulator of glucose dependent genes           31.6501  45                3
7     FKH1    Regulator of cell cycle                        29.1194  41                2
8     FKH2    Regulator of cell cycle                        26.7131  38                7
9     SLT2    Protein kinase of cell wall integrity pathway  23.4727  31                8
10    MSN4*   Regulator of stress response                   21.8224  31                1
34    HAP4*   Regulator of cellular respiration               6.3310   9                1

We target experiments to one region of network
* Expression for: SOK2, HAP4, MSN4, YAP6

Expression of Msn4 targets
* Average signed z-score:

$$Z_e = \frac{1}{N} \sum_{i=1}^{N} \operatorname{sgn}(r_{ie})\, z_{ie}$$

Expression of Hap4 targets

Yap6 targets are unaffected

Refined Network Model

Caveats
* Assumes target genes are correct
* Only models linear paths
* Combinatorial effects missed
* Measurements are for rich media growth

Using this method of choosing the next experiment
* Is it better than other methods?
* How many experiments?
* Run simulations vs:
  * Random
  * Hubs

Simulation results
* Number of simulated deletion profiles used to learn a "true" network

Current Work
[Diagram labels: Measurements; Networks; "Systems level" understanding; Treat disease; Test & Refine]
* Transcriptional response to DNA damage

Acknowledgments
* Trey Ideker
* Craig Mak
* Chen-Hsiang Yeang
* Tommi Jaakkola
* Scott McCuine
* Maya Agarwal
* Mike Daly
* Ideker lab members
* Tom Begley
* Leona Samson
* Funding: grants from NIGMS, NSF, and NIH
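Note on the average signed z-score used on the validation slides: the statistic averages the observed z-scores of a regulator's N target genes after flipping each one by a sign term. The slides do not define the symbols, so the sketch below rests on an assumption: r_ie is taken to be the model's predicted direction of change for target i under deletion e (+1 or -1), and z_ie the observed expression z-score. All numbers are made up for illustration.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def average_signed_z(predicted, observed_z):
    """Average of sgn(r_i) * z_i over the N target genes of one deletion.

    predicted:  assumed predicted directions r_i (positive = up, negative = down)
    observed_z: observed expression z-scores z_i for the same targets
    """
    if len(predicted) != len(observed_z):
        raise ValueError("predicted and observed_z must have the same length")
    if not observed_z:
        raise ValueError("at least one target gene is required")
    total = sum(sign(r) * z for r, z in zip(predicted, observed_z))
    return total / len(observed_z)


# Hypothetical targets whose observed changes agree with the predictions.
agree = average_signed_z([+1, +1, -1, -1], [2.0, 1.0, -3.0, -2.0])

# Hypothetical targets with no consistent response to the deletion.
unaffected = average_signed_z([+1, +1, -1, -1], [0.5, -0.5, 0.5, -0.5])

print(agree, unaffected)
```

Under this reading, a clearly positive value means the targets moved in the predicted directions, while a value near zero corresponds to the "targets are unaffected" outcome reported for Yap6.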