Gene expression analysis and network discovery: Genevestigator
Philip Zimmermann, Genevestigator Team, ETH Zurich
© ETH Zürich | Genevestigator
6 November 2007
Presentation flow
• Gene networks – biological context
• Microarray compendium: how, and what for?
• Meta-profile analysis: concepts and validation
• Genevestigator® V3
• Data integration
• Summary & conclusion
6 November 2007
P. Zimmermann /
ETH Zurich / [email protected]
Gene networks - biological context
• What is the interpretational value of a gene network derived by graphical modeling or correlation analysis?
  • a snapshot in time?
  • a snapshot in space?
  • an average trend?
Gene networks - biological context
• From what experiment(s) was this network derived?
  • time-course?
  • cell culture, whole organism?
  • stimulus, drug response?
  • anatomy part?
  • stage of development?
  • genetic modification?
Context and dynamics of networks
• Hypothesis: networks are dynamic and context-dependent
  • => networks evolve!
  • => networks may have different functions in different contexts!
• Question: how can we quantify the role of the context in shaping the network?
Context: the time-space-response dimensions
• Time => time-course, development
• Space => anatomy parts, intracellular localization
• Response => response to external perturbations
           => response to modifications in the genome
Context and dynamics of networks
• Modeling the time, space and response dimensions requires:
  • experiments testing time, space and response variables
  • storage of measurement data and its meta-data
  • developing analysis methods that incorporate these dimensions (→ meta-profiles)
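One way to picture "measurement data and its meta-data" is a record per array that carries the signal values together with its time, space and response annotations. The sketch below is only an illustration: the class name, field names and example values are hypothetical and do not describe the actual Genevestigator schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AnnotatedArray:
    """One microarray hybridisation plus its context annotations (hypothetical layout)."""
    array_id: str
    signals: Dict[str, float]                 # probe set id -> log2 signal
    development_stage: Optional[str] = None   # time dimension
    anatomy_part: Optional[str] = None        # space dimension
    stimulus: Optional[str] = None            # response: external perturbation
    mutation: Optional[str] = None            # response: modification in the genome

    def context(self) -> Dict[str, str]:
        """Return only the context dimensions that were actually annotated."""
        dims = {
            "time": self.development_stage,
            "space": self.anatomy_part,
            "response:stimulus": self.stimulus,
            "response:mutation": self.mutation,
        }
        return {name: value for name, value in dims.items() if value is not None}


example = AnnotatedArray(
    array_id="array_001",                      # made-up identifier
    signals={"probe_A": 9.4, "probe_B": 6.1},  # made-up values
    anatomy_part="root",
    stimulus="cold",
)
print(example.context())
```

Keeping the annotations next to the signals is what would later allow arrays to be grouped by any one of the three dimensions.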
Analysis versus meta-analysis
• Microarray experiment => data analysis => 100 genes: what to do next?
• Microarray experiment => data storage => 10 billion data points: what to do next?
Data repositories
• Data: heterogeneous datasets
• Annotations: unsystematic or poor annotation
• heterogeneous datasets + unsystematic or poor annotation = meta-analysis impossible!
Data warehouses
• Data: quality control => ordered datasets
• Annotations: expert annotation with systematic ontologies (anatomy, development, stimulus, mutation) => systematic annotation
• ordered datasets + systematic annotation = meta-analysis possible!
Data quality control
RLE
NUSE
Unprocessed values
Affy QC metrics
Border elements
Correlation matrix
RNA degradation
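As an illustration of the first metric in this list, the sketch below computes relative log expression (RLE) in its usual form: each log2 signal minus the median of that probe set across all arrays. It is a minimal pure-Python sketch on made-up values, not the pipeline used for the compendium; the function name and the data layout are assumptions.

```python
from statistics import median


def rle(log_expr):
    """Relative log expression.

    log_expr maps array name -> list of log2 signals, one per probe set,
    in the same probe set order for every array.
    Returns array name -> list of deviations from the per-probe-set median.
    """
    arrays = list(log_expr)
    n_probes = len(log_expr[arrays[0]])
    probe_medians = [
        median(log_expr[a][i] for a in arrays) for i in range(n_probes)
    ]
    return {
        a: [log_expr[a][i] - probe_medians[i] for i in range(n_probes)]
        for a in arrays
    }


data = {
    "array_1": [8.0, 10.0, 6.0],
    "array_2": [8.2, 10.1, 6.1],
    "array_3": [9.5, 11.6, 7.4],   # shifted upwards on purpose
}
values = rle(data)
array_medians = {a: median(v) for a, v in values.items()}
print(array_medians)
```

An array whose RLE values are centred far from zero, or spread much more widely than the others, would be a candidate for closer inspection; here that is `array_3`.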
Ontologies – example of Anatomy
• Mouse / Rat:
  • Edinburgh Mouse Atlas
• Human:
  • mapping to Mouse and Rat anatomy tree
• Arabidopsis / Barley:
  • terms from Plant Ontology
  • tree created by Genevestigator
Ontologies – example of Development
• Mouse: Theiler stages
• Rat: Witschi stages
• Human: Carnegie table
• Arabidopsis: Boyes key
Meta-analysis tools
• Who is most interested in mining this data?
• Who can best interpret the results?
THE BIOLOGIST!
Genevestigator® – a tool for biologists
Expression meta-profiles
[Figure: expression meta-profile panels labeled space, time and response]
Data validation
[Figure: meta-profile matrices for the space, time and response dimensions. Axes: category type (e.g. heart ventricle) and probe set (e.g. Mm.23432)]
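The category-by-probe-set layout above can be sketched in a few lines. The slides do not state how a meta-profile value is summarised, so this sketch assumes it is the mean log2 signal over all arrays annotated with the same category; the function name, the data layout and all signal values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def meta_profile(arrays, dimension):
    """Average each probe set over all arrays that share a category.

    arrays: list of dicts with keys "annotations" (dimension -> category)
            and "signals" (probe set -> log2 signal).
    dimension: which annotation to group by, e.g. "space".
    Returns {probe set: {category: mean signal}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for array in arrays:
        category = array["annotations"].get(dimension)
        if category is None:
            continue  # array not annotated for this dimension
        for probe_set, signal in array["signals"].items():
            grouped[probe_set][category].append(signal)
    return {
        probe_set: {cat: mean(vals) for cat, vals in by_cat.items()}
        for probe_set, by_cat in grouped.items()
    }


# Made-up signals; only the category and probe set names come from the slide.
arrays = [
    {"annotations": {"space": "heart ventricle"}, "signals": {"Mm.23432": 12.0}},
    {"annotations": {"space": "heart ventricle"}, "signals": {"Mm.23432": 11.0}},
    {"annotations": {"space": "liver"}, "signals": {"Mm.23432": 6.0}},
    {"annotations": {"time": "adult"}, "signals": {"Mm.23432": 9.0}},
]
profile = meta_profile(arrays, "space")
print(profile["Mm.23432"])
```

The fourth array carries no "space" annotation, so it does not contribute to the anatomy meta-profile; it would only appear when grouping by "time".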
Mouse anatomy meta-profiles [space]
Rnf33
Transcription of Rnf33 has been shown to occur already in the mouse oocyte, but not beyond the eight-cell stage nor in adult tissues.
Hoxa1
Hoxa1 expression starts at E7.5 and begins to retreat caudally by E8.5.
hemopexin
Hemopexin (hx) is known to be expressed only at low levels in embryos and newborn mice, and does not reach its highest expression level until the first year of age.
a – f: pre-natal
g – l: post-natal
light-harvesting chlorophyll a/b binding protein (AT4G14690)
protochlorophyllide reductase A (AT5G54190)
Development of Genevestigator®
• 14'500 Affymetrix arrays (Nov 2007)
• Human, mouse, rat, Arabidopsis, barley
• Metabolic and regulatory pathway maps for mouse and Arabidopsis
• > 10'000 registered users
• > 500 citations in peer-reviewed journals
[Diagram: biological experiments (anatomy, development, stimulus, mutation) => microarray data => public repositories => curation & quality control => Genevestigator database => application server => client Java application]
Genevestigator® V3
Website
Java Client Application
Database and Application Server Cluster
Toolsets and tools
Biomarker Search toolset
Abiotic stresses and hormonal responses
salt (+)
osmotic (+)
salt (-)
osmotic (-)
salt (+)
osmotic (+)
cold (+)
salt (+)
drought (+)
anoxia (-)
hypoxia (-)
hypoxia (-)
ABA (+)
---
ABA (+)
MeJA (+)
BL / H3BO3(+)
ethylene (+)
norflurazon (-)
mycorrhiza (-)
ozone (-)
genotoxic (-)
2,4-D
glucose
syringolin (-)
P. syringae (+)
ozone (+)
B. cinerea (+)
syringolin (-)
cycloheximide (-)
H2O2 (-)
AVG (+)
chitin (+)
Biclustering
• Searches subsets of genes coexpressed across subsets of conditions
• BiMax algorithm
  • Finds all maximal bicliques
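To make "maximal bicliques" concrete: on a binarised expression matrix (1 = gene responds in that condition), a bicluster is a set of genes and a set of conditions whose submatrix contains only 1s, and it is maximal if no gene or condition can be added. The sketch below enumerates these by brute force over gene subsets on a made-up matrix. It is not the divide-and-conquer procedure of the BiMax algorithm itself, only an illustration of what its output looks like, and it assumes the binarisation has already been done.

```python
from itertools import combinations


def maximal_biclusters(matrix, genes, conditions):
    """Enumerate inclusion-maximal all-ones submatrices of a binary matrix.

    Brute force over gene subsets, so only suitable for tiny examples.
    Returns a set of (frozenset of genes, frozenset of conditions).
    """
    on = {
        g: frozenset(c for c, v in zip(conditions, row) if v)
        for g, row in zip(genes, matrix)
    }
    found = set()
    for size in range(1, len(genes) + 1):
        for subset in combinations(genes, size):
            shared = frozenset.intersection(*(on[g] for g in subset))
            if not shared:
                continue
            # Close the gene set: every gene that is "on" in all shared conditions.
            closed = frozenset(g for g in genes if shared <= on[g])
            found.add((closed, shared))
    return found


genes = ["g1", "g2", "g3"]                 # hypothetical gene names
conditions = ["cold", "salt", "ABA"]
matrix = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
biclusters = maximal_biclusters(matrix, genes, conditions)
for gene_set, condition_set in sorted(biclusters, key=lambda b: sorted(b[0])):
    print(sorted(gene_set), sorted(condition_set))
```

The cost grows exponentially with the number of genes, which is why a dedicated algorithm is needed for real compendia.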
Example of a bicluster
[Figure: space, time and response panels with pathway labels: proline; phenylalanine / tyrosine; starch / sucrose; ABA biosynthesis; cold response; inositol phosphate; ABA response; beta-alanine]
Biomarker search [time]
• Genes expressed specifically in seeds and germinating seedlings
• De-novo identification of cis-regulatory elements
Biomarker search [space]
z = 18.2
z = 5.8
z = 5.4
Biomarker search [response]
• "Supervised biclustering"
  • isoxaben (+)
  • norflurazon (-)
  • light (+)
  • nitrate_low (-)
Anatomy clustering and promoter analysis
• Clusters of genes expressed specifically in:
  • cell suspension
  • petals
  • roots
  • seeds
  • stamen
  • xylem
z > 5.0
Development clustering and promoter analysis
• Clusters of Arabidopsis genes expressed specifically at:
  • dev. stage 1
  • dev. stage 3
  • dev. stage 9
z > 5.0
Stimulus clustering and promoter analysis
• "Supervised biclustering" of stimulus meta-profiles:
  • cluster 1
  • cluster 2
  • cluster 4
  • cluster 5
  • cluster 7
z > 5.0
[Figure: proteins and transcripts compared across cell suspension, cotyledons, flowers, leaves, roots and seeds]
Data integration: transcriptome - proteome
[Figure: Arabidopsis leaf transcripts and proteins. Axes: transcript quantification measure, protein quantification measure, frequency. Series: proteins detected in leaves; proteins not detected in leaves but for which there is a probe set on the ATH1 array. Annotation: general background range for the transcript quantification measure]
Protein detection and transcript abundance
[Figure: number of transcripts/proteins (0 to 4000) versus transcript abundance measure (log2 signal, 1 to 15). Series: probe sets called "absent" on ATH1 (p >= 0.05); probe sets called "present" on ATH1 (p < 0.05); leaf proteins detected by peptide identification. Inset: fraction of "present" transcripts that were detected at the protein level (0 to 1.2) versus log2 signal (6 to 15)]
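The inset quantity, the fraction of "present" transcripts that were also detected at the protein level, can be computed per signal bin as sketched below. The p < 0.05 cutoff for a "present" call is taken from the figure legend; the integer binning of the log2 signal, the function name and the toy records are assumptions for illustration only.

```python
import math
from collections import defaultdict


def fraction_detected_by_bin(transcripts, p_cutoff=0.05):
    """Per integer log2-signal bin, the share of "present" probe sets
    whose protein was also detected.

    transcripts: iterable of (log2_signal, detection_p_value, protein_detected).
    """
    present = defaultdict(int)
    detected = defaultdict(int)
    for log2_signal, p_value, protein_detected in transcripts:
        if p_value >= p_cutoff:
            continue  # called "absent" (p >= 0.05), not counted
        bin_index = math.floor(log2_signal)
        present[bin_index] += 1
        if protein_detected:
            detected[bin_index] += 1
    return {b: detected[b] / present[b] for b in sorted(present)}


# Made-up records: (log2 signal, detection p-value, protein detected?)
toy = [
    (7.2, 0.01, False),
    (7.8, 0.02, False),
    (7.5, 0.20, True),     # "absent" call, ignored
    (12.1, 0.001, True),
    (12.9, 0.001, False),
    (14.3, 0.001, True),
]
fractions = fraction_detected_by_bin(toy)
print(fractions)
```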
GO analysis
[Figure: GO Cellular Component; n = 221 specific probesets with average signal in leaves > 13. Series: ATH1 array (control); proteins not detected but transcripts have high abundance (> 13). Categories: cell wall, chloroplast, cytosol, ER, extracellular, Golgi apparatus, mitochondria, nucleus, other cellular components, other cytoplasmic components, other intracellular components, other membranes, plasma membrane, plastid, ribosome]
GO analysis
[Figure: GO Biological Process; n = 221 specific probesets with average signal in leaves > 13. Series: ATH1 array (control); proteins not detected but transcripts have high abundance (> 13). Categories: cell organization and biogenesis, developmental processes, DNA or RNA metabolism, electron transport or energy pathways, other biological processes, other cellular processes, other metabolic processes, protein metabolism, response to abiotic or biotic stimulus, response to stress, signal transduction, transcription, transport]
GO analysis
[Figure: GO Molecular Function; n = 221 specific probesets with average signal in leaves > 13. Series: ATH1 array (control); proteins not detected but transcripts have high abundance (> 13). Categories: DNA or RNA binding, hydrolase activity, kinase activity, nucleic acid binding, nucleotide binding, other binding, other enzyme activity, other molecular functions, protein binding, receptor binding or activity, structural molecule activity, transcription factor activity, transferase activity, transporter activity]
Data integration – pathway analysis
[Figure: transcript abundance and protein abundance mapped onto pathways: mevalonate biosynthesis; riboflavin metabolism; chlorophyll / porphyrin metabolism; phenylpropanoid metabolism; carotenoid biosynthesis]
Relative protein-to-transcript ratio
[Figure: pathways labeled: serine, glycine, cysteine; Calvin cycle; starch and sucrose metabolism; fatty acid biosynthesis]
Relative protein-to-transcript ratio
[Figure: pathways labeled: chlorophyll / porphyrin metabolism; purine metabolism; pyrimidine metabolism; fatty acid biosynthesis; glycolysis / gluconeogenesis]
Proteomic and transcriptomic biomarkers
"Root-specific" expression
• Search by scoring the proteomic dataset
• Search by scoring the Genevestigator dataset
Summary and conclusions
• Biological networks: importance of the biological context
• Meta-profiles: context-driven analysis
• Biological validation of meta-profiles and clusters
• Genevestigator – a tool for biologists!
• Data integration: challenging biological complexity
Data type?
Experimental context?
Modes of interactions?
Organism?
Network dynamics?
Reproducibility?
Acknowledgements
• ETH Zurich
• Prof. Gruissem
• Developer Team:
  • Tomas Hruz, Oliver Laule, Stefan Bleuler, Philip Zimmermann
  • Gabor Szabo, Frans Wessendorp, Lukas Oertle, Dominique Dümmler, Matthias Hirsch-Hoffmann
Thanks for your attention!