STRING
Modeling of biological systems through cross-species data integration
Lars Juhl Jensen
promoter analysis
Jensen et al., Bioinformatics, 2000
genome visualization
Pedersen et al., Journal of Molecular Biology, 2000
protein function prediction
STRING
integrate diverse evidence
functional interactions
Bork et al., Current Opinion in Structural Biology, 2005
179 proteomes
genomic context methods
phylogenetic profiles
Cell
Cellulosomes
Cellulose
anti-correlated profiles
analogous enzymes
Morett et al., Nature Biotechnology, 2003
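
A phylogenetic profile can be read as a presence/absence vector over genomes. The sketch below is one simple way to compare two such profiles, assuming Pearson correlation as the similarity measure (the slides do not name the measure used). The gene profiles are hypothetical. A value near +1 suggests genes that are gained and lost together; a value near -1 is the anti-correlated pattern expected for analogous enzymes.

```python
from math import sqrt

def profile_correlation(a, b):
    """Pearson correlation between two presence/absence profiles (lists of 0/1)."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A gene present in every genome (or in none) carries no signal.
        return 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical profiles over six genomes (1 = gene present, 0 = absent).
gene_a = [1, 1, 0, 0, 1, 0]
gene_b = [1, 1, 0, 0, 1, 0]  # same pattern as gene_a
gene_c = [0, 0, 1, 1, 0, 1]  # exact complement of gene_a

print(profile_correlation(gene_a, gene_b))
print(profile_correlation(gene_a, gene_c))
```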
gene neighborhood
bidirectional promoters
Korbel et al., Nature Biotechnology, 2004
gene fusion
evolution
statistics
(the original sin)
scoring and benchmarking
raw quality scores
gene neighborhood
sum of intergenic distances
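
As a rough illustration of this raw score, the sketch below sums the gaps between consecutive genes in a run. It assumes genes are given as hypothetical (start, end) coordinates on one strand and that overlapping genes contribute a distance of zero; the exact definition behind the slide may differ.

```python
def sum_intergenic_distances(genes):
    """Sum the gaps (in bp) between consecutive genes in a run.

    genes: list of (start, end) coordinate tuples, inclusive, on one strand.
    Overlapping or abutting genes contribute zero.
    """
    ordered = sorted(genes)
    total = 0
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        total += max(0, next_start - prev_end - 1)
    return total

# Hypothetical three-gene run: gaps of 19 bp and 4 bp.
run = [(100, 400), (420, 900), (905, 1500)]
print(sum_intergenic_distances(run))
```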
many types of evidence
not directly comparable
calibrate vs. gold standard
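
One way to make raw scores comparable is to rank the predicted pairs, cut the ranking into bins, and record for each bin the fraction of pairs confirmed by the gold standard. The sketch below assumes that a higher raw score means stronger evidence and that each pair carries a boolean label (for example, whether both proteins share a pathway in the gold standard). The function name and the toy data are hypothetical.

```python
def calibrate(scored_pairs, bin_size):
    """Map raw scores to the fraction of gold-standard-confirmed pairs.

    scored_pairs: list of (raw_score, is_confirmed) tuples.
    Returns a list of (mean_raw_score, fraction_confirmed), one per bin,
    ordered from the highest-scoring bin to the lowest.
    """
    ranked = sorted(scored_pairs, key=lambda p: p[0], reverse=True)
    curve = []
    for i in range(0, len(ranked), bin_size):
        chunk = ranked[i:i + bin_size]
        mean_raw = sum(score for score, _ in chunk) / len(chunk)
        fraction = sum(1 for _, ok in chunk if ok) / len(chunk)
        curve.append((mean_raw, fraction))
    return curve

# Toy data: higher raw scores are confirmed more often.
pairs = [(0.9, True), (0.8, True), (0.5, True),
         (0.4, False), (0.2, False), (0.1, False)]
curve = calibrate(pairs, bin_size=2)
for mean_raw, fraction in curve:
    print(round(mean_raw, 2), fraction)
```

The resulting curve can then be interpolated so that any raw score from any evidence type is translated to the same scale.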
curated knowledge
KEGG
Kyoto Encyclopedia of Genes and Genomes
STKE
Signal Transduction Knowledge Environment
Reactome
MIPS
Munich Information Center for Protein Sequences
primary experimental data
Jensen et al., Drug Discovery Today: Targets, 2004
microarray expression data
GEO
Gene Expression Omnibus
physical protein interactions
BIND
Biomolecular Interaction Network Database
MINT
Molecular Interactions Database
GRID
General Repository for Interaction Datasets
DIP
Database of Interacting Proteins
HPRD
Human Protein Reference Database
von Mering et al., Nucleic Acids Research, 2005
literature mining
MEDLINE
SGD
Saccharomyces Genome Database
The Interactive Fly
OMIM
Online Mendelian Inheritance in Man
co-mentioning
different gene names
curated synonyms lists
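
Co-mentioning can be sketched as counting how often two genes appear in the same abstract, after mapping each name variant to one canonical gene. The synonym lists and abstracts below are illustrative only, not taken from a curated resource, and the matching is a plain case-insensitive whole-word search.

```python
import re
from collections import Counter
from itertools import combinations

def count_comentions(abstracts, synonyms):
    """Count abstracts in which each pair of genes is mentioned together.

    synonyms: dict mapping a canonical gene name to a list of name variants.
    Returns a Counter keyed by alphabetically ordered (gene, gene) tuples.
    """
    patterns = {
        gene: re.compile(
            r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
            re.IGNORECASE,
        )
        for gene, names in synonyms.items()
    }
    counts = Counter()
    for text in abstracts:
        found = sorted(g for g, p in patterns.items() if p.search(text))
        counts.update(combinations(found, 2))
    return counts

# Illustrative synonym lists and abstracts (assumptions for this example).
synonyms = {
    "HAP1": ["HAP1", "CYP1"],
    "CYC1": ["CYC1", "iso-1-cytochrome c"],
    "CYC7": ["CYC7"],
}
abstracts = [
    "CYP1 regulates expression of CYC1.",
    "HAP1 binds upstream of CYC1 and CYC7.",
    "GAL4 activates the galactose genes.",
]
counts = count_comentions(abstracts, synonyms)
print(counts[("CYC1", "HAP1")])
```

Without the synonym list, the first abstract would not count toward the HAP1 and CYC1 pair.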
NLP
Natural Language Processing
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxgene The GAL4 gene]
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
Jensen et al., Nature Reviews Genetics, 2006
combine all evidence
naïve Bayesian scheme
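
A minimal sketch of a naïve Bayesian combination, assuming each evidence type has already been calibrated to a probability and that the evidence types are independent: the combined score is the probability that at least one of them is right. The scheme actually used may include further corrections, such as for prior probability, that are not shown here.

```python
import math

def combine_scores(scores):
    """Combine independent per-evidence probabilities into one score.

    Returns 1 - prod(1 - s_i): the probability that at least one
    evidence type correctly supports the interaction.
    """
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each score must be a probability in [0, 1]")
    return 1.0 - math.prod(1.0 - s for s in scores)

# Two moderately confident evidence types reinforce each other.
print(combine_scores([0.5, 0.5]))
```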
spread over many species
transfer based on orthology
Target species
?
Source species
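
The transfer step can be illustrated as mapping each scored pair in a source species onto the corresponding pair in the target species. This sketch assumes a simple one-to-one ortholog table and keeps the best score per target pair. The protein identifiers are hypothetical, and a real scheme would also need to handle paralogs and down-weight distant orthologs.

```python
def transfer_interactions(source_interactions, orthologs):
    """Transfer scored protein pairs from a source to a target species.

    source_interactions: dict mapping (protein_a, protein_b) to a score.
    orthologs: dict mapping a source protein to its target ortholog.
    Returns a dict mapping sorted target pairs to the best transferred score.
    """
    transferred = {}
    for (a, b), score in source_interactions.items():
        if a not in orthologs or b not in orthologs:
            continue  # no ortholog in the target species, nothing to transfer
        pair = tuple(sorted((orthologs[a], orthologs[b])))
        if pair[0] == pair[1]:
            continue  # both proteins map to the same target protein
        transferred[pair] = max(score, transferred.get(pair, 0.0))
    return transferred

# Hypothetical identifiers: src* in the source species, tgt* in the target.
source = {("srcA", "srcB"): 0.8, ("srcB", "srcC"): 0.6, ("srcA", "srcD"): 0.9}
orthologs = {"srcA": "tgt1", "srcB": "tgt2", "srcC": "tgt3"}
result = transfer_interactions(source, orthologs)
print(result)
```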
defining functional modules
qualitative modeling
the mitochondrial system
RCCs
predicting “mode of action”
Jensen et al., Drug Discovery Today: Targets, 2004
Acknowledgments
• The STRING team (EMBL)
– Christian von Mering
– Berend Snel
– Martijn Huynen
– Sean Hooper
– Mathilde Foglierini
– Julien Lagarde
– Peer Bork
• Literature mining project (EML Research)
– Jasmin Saric
– Rossitza Ouzounova
– Isabel Rojas
• New genomic context methods (EMBL)
– Jan Korbel
– Peer Bork
• Modeling of yeast mitochondria (EMBL)
– Fabiana Perocchi
– Lars Steinmetz
• Inspiration for presentation
– Dick Clarence Hardt
– Anders Gorm Pedersen
Thank you!