Canadian Bioinformatics Workshops
www.bioinformatics.ca

Module 4: Analyzing gene list function and associations
Quaid Morris
http://morrislab.med.utoronto.ca

Overview
• Extending gene lists using functional associations
• Sources of functional association
• GeneMANIA
Module 4 bioinformatics.ca

Extending Gene Lists
• Given a gene list, find other similar genes
  – Gene list defines the query and the “function” of interest
• Query: complex or pathway components
  – Result: additional members
• Query: kinases
  – Result: other kinases and related genes
• Query: genes affected in RNAi screen
  – Result: other genes that may affect phenotype

Network-Based Gene Function Prediction
• Genes of similar sequence often have similar function
• Unknown gene similar to known gene likely to have similar function (annotation transfer)
• Guilt-by-association principle
• Many other similarity measures for genes (e.g. colocalization)
Fraser AG, Marcotte EM. A probabilistic view of gene function. Nat Genet. 2004 Jun;36(6):559-64

Functional association networks to predict gene function
[Figure: microarray expression data (Eisen et al, PNAS 1998) converted into a co-expression network. Cell cycle genes: CDC3, CLB4, CDC16. Protein degradation genes: RPT1, RPN3, RPT6. Unknown genes: UNK1, UNK2.]
Fraser AG, Marcotte EM. A probabilistic view of gene function. Nat Genet. 2004 Jun;36(6):559-64

Predicting Gene Function Using a Network
• Is gene X involved in cell cycle regulation?
• Labelled examples: genes in the network marked + or - (CDC3, CDC16 and CLB4 are marked +); genes to be classified are marked ?
• Network: e.g. co-expression
• Classification algorithm: kNN, SVM, LabelProp
• Output: a discriminant value per gene (UNK1: 0.9, UNK2: 0.1, UNK3: 0.05)
• Discriminant value: a value you can a) use to rank the genes according to certainty and b) threshold to classify genes
[Figure: co-expression network containing CDC3, CDC16, CLB4, RPT1, RPN3, RPT6, UNK1, UNK2 and UNK3.]

Label propagation vs guilt-by-association
[Figure: a network of CDC48, MCA1, CPR3 and TDH2 scored once by guilt-by-association and once by the label propagation algorithm, with a discriminant value scale running from -1 to +1.]

Types of functional associations
• Molecular interactions (i.e. physical interactions)
• Regulatory interactions (e.g. ChIP-chip binding)
• Genetic interactions (e.g. synthetic lethality)
• Similarity relationships
  – Co-expression
  – Protein sequence (e.g. BLAST -log(E-value))
  – Domain architecture
  – Phylogenetic profiles
  – Gene neighborhood**
  – Gene fusion**
  – …
** most useful for bacterial genes

Problem: genes are multi-function
• Gene function could be a/the:
  – Biological process,
  – Biochemical/molecular function,
  – Subcellular/cellular localization,
  – Regulatory targets,
  – Temporal expression pattern,
  – Phenotypic effect of deletion.
• Some networks may be better for some types of gene function than others

Query-specific weights for multifaceted functional queries
[Figure: three networks, each multiplied by its own weight (w1, w2, w3), are summed into one combined network. Networks: genetic interactions (Tong et al. 2001), co-complexed (Jeong et al 2002), co-expression. The weights shown are the cell cycle weights. Cell cycle genes: CDC27, CDC23, APC11, UNK1. DNA repair genes: RAD54, XRS2, MRE11, UNK2.]
Pavlidis et al, 2002; Lanckriet et al, 2004; Mostafavi et al, 2008
June 24, 2009, The GeneMANIA project

GeneMANIA in the MouseFunc contest
• “Test” benchmark: predicting held-out genes
• One of GeneMANIA’s two entries had the best area under the ROC curve in every category
Sara Mostafavi

GeneMANIA performance on yeast
[Figure: methods compared on error (axis direction: "More error") and running time (axis direction: "Slower"). Methods shown: GeneMANIA on 15 networks; GeneMANIA label propagation on bioPIXIE*; probabilistic graph search* on bioPIXIE*; GeneMANIA on 5 networks; TSS** on 5 networks.]
Mostafavi et al, 2008
* Myers et al, 2005
** Tsuda et al, 2005

GeneMANIA Prediction Server
http://www.genemania.org or http://qa.genemania.org

GeneMANIA network data sources

GeneMANIA Cytoscape Plugin

Other prediction servers
• STRING (http://string-db.org/)
• Funcoup (http://funcoup.sbc.su.se/)
• FunctionalNet (http://www.functionalnet.org)
• bioPIXIE (http://pixie.princeton.edu)
• MouseNet (http://mousenet.princeton.edu/)

Chemogenomics
• STITCH: Chemical-Protein Interactions
• http://stitch.embl.de/

What Have We Learned?
• Network-based gene function prediction
  – Guilt-by-association principle
    • Used to predict gene function using functional association networks
  – Many types of functional associations exist
    • Can be combined intelligently to optimize prediction accuracy
  – Convenient software available: GeneMANIA
  – Emerging area: chemical genomics gene function prediction

Please follow along with the lab displayed on the wiki
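
Supplementary sketch: label propagation vs guilt-by-association
The slide "Label propagation vs guilt-by-association" contrasts two ways of turning labelled genes into discriminant values. The Python sketch below is one possible illustration, not the GeneMANIA implementation. Guilt-by-association is taken here to mean "sum the labels of direct neighbours", and label propagation is taken to mean solving (I + L) f = y with the graph Laplacian L = D - W; both formulations, the edge list, and the gene UNK3's placement are assumptions made for the example. Gene names come from the slides; edge weights are invented.

```python
import numpy as np

# Hypothetical toy network: gene names are from the slides, edges are invented.
genes = ["CDC3", "CDC16", "CLB4", "RPT1", "UNK1", "UNK2", "UNK3"]
idx = {g: i for i, g in enumerate(genes)}
edges = [
    ("CDC3", "CDC16"), ("CDC16", "CLB4"),
    ("CDC3", "UNK1"), ("CLB4", "UNK1"),
    ("RPT1", "UNK2"),
    ("UNK1", "UNK3"),  # UNK3 touches no labelled gene directly
]

# Symmetric weighted adjacency matrix (all weights 1.0 in this toy case).
W = np.zeros((len(genes), len(genes)))
for a, b in edges:
    W[idx[a], idx[b]] = W[idx[b], idx[a]] = 1.0

# Labels: +1 positive example, -1 negative example, 0 unlabelled.
y = np.array([1, 1, 1, -1, 0, 0, 0], dtype=float)


def guilt_by_association(W, y):
    """Score each gene by the summed labels of its direct neighbours."""
    return W @ y


def label_propagation(W, y):
    """Solve (I + L) f = y, with L = D - W, so labels spread along paths."""
    laplacian = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(y)) + laplacian, y)


f_gba = guilt_by_association(W, y)
f_lp = label_propagation(W, y)

# a) rank unlabelled genes by discriminant value, b) threshold at 0 to classify.
unlabelled = [g for g in genes if y[idx[g]] == 0]
ranked = sorted(unlabelled, key=lambda g: -f_lp[idx[g]])
predicted_positive = [g for g in ranked if f_lp[idx[g]] > 0]

for g in ranked:
    print(f"{g}: GBA={f_gba[idx[g]]:+.2f}  LabelProp={f_lp[idx[g]]:+.3f}")
```

In this toy network UNK3 has no labelled neighbour, so the guilt-by-association score is 0, while label propagation still assigns it a positive value through UNK1. That is the kind of difference the slide's two panels are meant to show.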
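
Supplementary sketch: query-specific network weights
The slide "Query-specific weights for multifaceted functional queries" shows a combined network built as a weighted sum of individual networks, with weights that depend on the query. The Python sketch below shows one simple way such weights could be chosen; it is a simplification for illustration and is not claimed to be the procedure used by GeneMANIA. The assumption is that a good combined network should have high weight between genes with the same label and low weight between genes with different labels, so the weights are fitted by least squares against the target y_i * y_j and negative weights are clipped to zero. The function name `fit_network_weights` and both toy networks are hypothetical.

```python
import numpy as np


def fit_network_weights(networks, y):
    """Fit one non-negative weight per network from the labelled genes only.

    networks: list of symmetric (n x n) adjacency matrices.
    y: labels, +1 / -1 for labelled genes and 0 for unlabelled genes.
    """
    labelled = np.flatnonzero(y != 0)
    rows, cols = np.triu_indices(len(labelled), k=1)
    gi, gj = labelled[rows], labelled[cols]

    # Target is +1 for same-label pairs and -1 for different-label pairs.
    target = y[gi] * y[gj]
    # One column per network: that network's edge weight for each gene pair.
    design = np.column_stack([W[gi, gj] for W in networks])

    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    # Simplification: drop networks that would get a negative weight.
    return np.clip(weights, 0.0, None)


def make_network(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    return W


# Five hypothetical genes: two positives, two negatives, one unlabelled.
y = np.array([1, 1, -1, -1, 0], dtype=float)
informative = make_network(5, [(0, 1), (2, 3), (0, 4)])  # links same-label genes
misleading = make_network(5, [(0, 2), (1, 3), (3, 4)])   # links opposite labels

weights = fit_network_weights([informative, misleading], y)
combined = sum(w * W for w, W in zip(weights, [informative, misleading]))
print("weights:", weights)
```

Here the network that links same-label genes keeps its weight and the one that links opposite-label genes is dropped, so a different query (different labels in `y`) would produce different weights over the same networks.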
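
Supplementary sketch: area under the ROC curve
The MouseFunc slide reports results as area under the ROC curve on held-out genes. As a reminder of what that number measures, the Python sketch below computes it directly from discriminant values using the pairwise-ranking definition: the fraction of (positive, negative) gene pairs in which the positive gene scores higher, counting ties as one half. The function name `roc_auc` and the held-out labels are invented for illustration; the three scores reuse the discriminant values shown for UNK1, UNK2 and UNK3.

```python
import numpy as np


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: discriminant values, higher means more confident positive.
    labels: True for held-out positives, False for held-out negatives.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative gene")

    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


# Discriminant values from the slides; the true labels here are hypothetical.
scores = [0.9, 0.1, 0.05]
perfect = roc_auc(scores, [True, False, False])
mixed = roc_auc(scores, [True, False, True])
print(perfect, mixed)
```

An AUC of 1.0 means every held-out positive is ranked above every held-out negative, and 0.5 is what a random ranking would give on average.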