Canadian Bioinformatics Workshops
www.bioinformatics.ca
Module 4
Analyzing gene list function and associations
Quaid Morris
http://morrislab.med.utoronto.ca
Overview
• Extending gene lists using functional associations
• Sources of functional association
• GeneMANIA
Module 4
bioinformatics.ca
Extending Gene Lists
• Given a gene list, find other similar genes
– Gene list defines the query and the “function” of interest
• Query: complex or pathway components
– Result: additional members
• Query: kinases
– Result: other kinases and related genes
• Query: genes affected in RNAi screen
– Result: other genes that may affect phenotype
Network-Based Gene Function Prediction
• Genes of similar sequence often have similar function
• An unknown gene that is similar to a known gene is likely to have a similar function (annotation transfer)
• Guilt-by-association principle
• Many other similarity measures for genes (e.g. colocalization)
Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
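The guilt-by-association principle above can be sketched as a neighbour vote: an unlabelled gene is scored by the weighted fraction of its network neighbours already known to have the function. This is only a minimal illustration, not the method of any particular tool. The edge weights and the helper `gba_score` are hypothetical.

```python
# Minimal guilt-by-association sketch. All weights are made up for illustration.
# network maps each query gene to its neighbours and association weights.
network = {
    "UNK1": {"CDC3": 0.8, "CLB4": 0.6, "RPT1": 0.1},
    "UNK2": {"RPT1": 0.9, "RPN3": 0.7, "CDC16": 0.2},
}
known_positive = {"CDC3", "CLB4", "CDC16"}  # genes annotated with the function


def gba_score(gene, network, positives):
    """Weighted fraction of a gene's neighbours that carry the annotation."""
    neighbours = network.get(gene, {})
    total = sum(neighbours.values())
    if total == 0:
        return 0.0  # no neighbours: no evidence either way
    hit = sum(w for n, w in neighbours.items() if n in positives)
    return hit / total


scores = {gene: gba_score(gene, network, known_positive) for gene in network}
```

With these made-up weights, UNK1 scores higher than UNK2 because most of its association weight goes to annotated genes.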
Functional association networks to predict gene function
[Figure: microarray expression data (Eisen et al, PNAS 1998) are converted into a co-expression network. Genes shown: cell cycle genes CDC3, CLB4, CDC16; protein degradation genes RPT1, RPN3, RPT6; uncharacterized genes UNK1, UNK2.]
Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
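One common way to turn expression data into a co-expression network is to link gene pairs whose expression profiles are strongly correlated. The sketch below assumes Pearson correlation and a cutoff of 0.8. Both are illustrative choices, not taken from the slide, and the expression values are invented.

```python
import math

# Hypothetical expression profiles (one value per condition); not real data.
expression = {
    "CDC3": [1.0, 2.0, 3.0, 4.0],
    "CLB4": [1.1, 2.1, 2.9, 4.2],
    "RPT1": [4.0, 3.0, 2.0, 1.0],
    "UNK1": [0.9, 2.2, 3.1, 3.9],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_network(expression, threshold=0.8):
    """Link gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(expression)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expression[g], expression[h])
            if r >= threshold:
                edges[(g, h)] = r
    return edges


edges = coexpression_network(expression)
```

In this toy data UNK1 ends up linked to CDC3 and CLB4 but not to RPT1, whose profile runs in the opposite direction.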
Predicting Gene Function Using a Network
Is gene X involved in cell cycle regulation?
[Figure: a network (e.g. co-expression) of the genes CDC3, CDC16, CLB4, RPT1, RPN3, RPT6, UNK1, UNK2 and UNK3. Labelled examples are marked + or -; genes to be predicted are marked ?. A classification algorithm assigns a discriminant value to each unlabelled gene: UNK1 0.9, UNK2 0.1, UNK3 0.05.]
Discriminant value: a value you can use to rank the genes according to certainty or threshold to classify genes
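The two uses of a discriminant value can be shown in a few lines. This sketch reuses the three values from the figure, assuming they are listed in the same order as UNK1, UNK2 and UNK3. The cutoff of 0.5 is an arbitrary assumption.

```python
# Discriminant values from the slide's example; the 0.5 cutoff is an assumption.
discriminant = {"UNK1": 0.9, "UNK2": 0.1, "UNK3": 0.05}

# (a) Rank genes from most to least confident.
ranked = sorted(discriminant, key=discriminant.get, reverse=True)

# (b) Threshold the values to obtain a yes/no classification.
cutoff = 0.5
predicted = {gene: value >= cutoff for gene, value in discriminant.items()}
```

Ranking keeps all the information and leaves the cutoff to the user; thresholding commits to a decision and so depends on how the cutoff is chosen.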
Predicting Gene Function Using a Network (continued)
[Figure: the same network and discriminant values, with the classification algorithm named: kNN, SVM or label propagation (LabelProp).]
Discriminant value: a value you can a) use to rank the genes according to certainty and b) threshold to classify genes
Label propagation vs guilt-by-association
[Figure: the same network of CDC48, MCA1, CPR3 and TDH2 scored in two ways, by guilt-by-association and by the label propagation algorithm. Discriminant values are shown on a scale from -1 to +1.]
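A generic form of label propagation can be sketched as follows: labelled genes are held at +1 or -1, and every other gene is repeatedly set to the weighted average of its neighbours, so information also reaches genes that have no labelled direct neighbour. This is a simplified textbook variant, not necessarily the algorithm GeneMANIA uses. The chain-shaped network, the edge weights and the labels are hypothetical; only the gene names come from the slide.

```python
# Hypothetical undirected network and labels; gene names reused from the slide.
edges = {
    ("CDC48", "MCA1"): 1.0,
    ("MCA1", "CPR3"): 1.0,
    ("CPR3", "TDH2"): 1.0,
}
labels = {"CDC48": 1.0, "TDH2": -1.0}  # +1 = has the function, -1 = does not


def propagate(edges, labels, iterations=200):
    """Clamp labelled genes; set the rest to the weighted mean of neighbours.

    Assumes every edge weight is positive.
    """
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    values = {gene: labels.get(gene, 0.0) for gene in neighbours}
    for _ in range(iterations):
        for gene, nbrs in neighbours.items():
            if gene in labels:
                continue  # known labels stay fixed
            total = sum(nbrs.values())
            values[gene] = sum(w * values[n] for n, w in nbrs.items()) / total
    return values


values = propagate(edges, labels)
```

In this chain MCA1 settles at a positive value and CPR3 at a negative one, each reflecting its distance from the two labelled genes.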
Types of functional associations
• Molecular Interactions (i.e. physical interactions)
• Regulatory Interactions (e.g. ChIP-chip binding)
• Genetic Interactions (e.g. synthetic lethality)
• Similarity relationships
– Co-expression
– Protein sequence (e.g. BLAST –log(E-value))
– Domain architecture
– Phylogenetic profiles
– Gene neighborhood**
– Gene fusion**
– …
** most useful for bacterial genes
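As an example of a similarity relationship, a BLAST E-value can be converted into an edge weight by taking its negative logarithm, so that stronger hits receive larger scores. The sketch assumes base 10, clips scores at zero, and caps the score for E-values reported as 0. These three choices, the function name and the hits are all hypothetical.

```python
import math


def evalue_to_similarity(evalue, cap=200.0):
    """Convert a BLAST E-value to a -log10(E) similarity score.

    The base, the zero floor and the cap are illustrative assumptions.
    """
    if evalue <= 0:
        return cap  # E-values of 0.0 are reported for extremely strong hits
    return min(cap, max(0.0, -math.log10(evalue)))


# Hypothetical hits: (query gene, subject gene, E-value).
hits = [
    ("geneA", "geneB", 1e-50),
    ("geneA", "geneC", 1e-3),
    ("geneA", "geneD", 5.0),
]
similarity = {(q, s): evalue_to_similarity(e) for q, s, e in hits}
```

Hits with an E-value above 1 would give a negative score, so they are clipped to zero here and contribute no association.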
Problem: genes are multi-function
• Gene function could be a/the:
– Biological process,
– Biochemical/molecular function,
– Subcellular/Cellular localization,
– Regulatory targets,
– Temporal expression pattern,
– Phenotypic effect of deletion.
Some networks may be better for some types of gene function than others
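Query-specific weights for multifaceted functional queries
[Figure: three networks are multiplied by query-specific weights (w1, w2, w3) and summed into one combined network. Networks shown: genetic interactions (Tong et al. 2001), co-complexed proteins (Jeong et al 2002) and co-expression. Genes shown: CDC27, CDC23, APC11, RAD54, XRS2, MRE11, UNK1, UNK2. Functions shown: cell cycle, DNA repair.]
Pavlidis et al, 2002; Lanckriet et al, 2004; Mostafavi et al, 2008
The GeneMANIA project, June 24, 2009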
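The weighted sum in the figure (w1 x network 1 + w2 x network 2 + w3 x network 3) can be sketched as below. The edges and weights are invented, and the weights are simply given here; how they would be chosen for a particular query is not shown.

```python
# Hypothetical networks (edge -> strength) and hypothetical query-specific weights.
networks = {
    "genetic": {("CDC27", "CDC23"): 1.0, ("RAD54", "XRS2"): 1.0},
    "co-complexed": {("CDC27", "CDC23"): 1.0, ("CDC23", "APC11"): 1.0},
    "co-expression": {("CDC27", "UNK1"): 0.8, ("RAD54", "MRE11"): 0.6},
}
weights = {"genetic": 0.2, "co-complexed": 0.5, "co-expression": 0.3}


def combine(networks, weights):
    """Weighted sum of networks: combined edge = sum over k of w_k * edge_k."""
    combined = {}
    for name, net in networks.items():
        for edge, value in net.items():
            combined[edge] = combined.get(edge, 0.0) + weights[name] * value
    return combined


composite = combine(networks, weights)
```

An edge supported by several networks, such as CDC27-CDC23 here, accumulates a contribution from each, scaled by that network's weight.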
GeneMANIA in the MouseFunc contest
“Test” benchmark: Predicting held-out genes
One of GeneMANIA’s two entries had the best area under the ROC curve in every category
Sara Mostafavi
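The area under the ROC curve used in such benchmarks equals the probability that a randomly chosen positive gene receives a higher discriminant value than a randomly chosen negative gene. The sketch below computes it by direct pairwise comparison. The scores and labels are invented and are not MouseFunc results.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise positive/negative comparison."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical discriminant values for held-out genes and their true labels.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, False, True, False, False]
auc = roc_auc(scores, labels)
```

A value of 1.0 means every positive gene is ranked above every negative gene, and 0.5 corresponds to a random ranking.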
GeneMANIA performance on yeast
[Figure: methods compared by error and running time (axis labels: "More error", "Slower"). Methods shown: GeneMANIA on 15 networks; GeneMANIA label propagation on bioPIXIE*; probabilistic graph search* on bioPIXIE*; GeneMANIA on 5 networks; TSS** on 5 networks.]
Mostafavi et al, 2008
* Myers et al, 2005
** Tsuda et al, 2005
GeneMANIA Prediction Server
http://www.genemania.org or http://qa.genemania.org
GeneMANIA network data sources
GeneMANIA Cytoscape Plugin
Other prediction servers
• STRING (http://string-db.org/)
• Funcoup (http://funcoup.sbc.su.se/)
• FunctionalNet (http://www.functionalnet.org)
• bioPIXIE (http://pixie.princeton.edu)
• MouseNet (http://mousenet.princeton.edu/)
Chemogenomics
• STITCH: Chemical-Protein Interactions
• http://stitch.embl.de/
What Have We Learned?
• Network-based gene function prediction
– Guilt-by-association principle
• used to predict gene function using functional association networks
– Many types of functional associations exist
• Can be combined intelligently to optimize prediction accuracy
– Convenient software available: GeneMANIA
– Emerging area: chemical genomics gene function prediction
Module 4
bioinformatics.ca
Please follow along with the lab displayed on the wiki