Comprehensive pathway and network analysis of complex 'omics data
Biological Analysis and Interpretation in IPA®
October 2013
Gene Chen 陳冠文
Senior Specialist of GGA & IPA Certified Analyst
How can I analyze existing data …
Proprietary and Confidential
How Researchers Ask Questions Now
Search multiple Websites
Read multiple articles
Spend time in the lab
Mine Internal Databases
Wrangle multiple Excel sheets
Agenda
• Introduction to Ingenuity Pathways Analysis (IPA)
• Introduction to Ingenuity Knowledge Base
• Questions Arise During Experimental Process :
– How Can IPA Help You?
• Data Analysis & Interpretation in IPA
– Case study for Cross Platform Integration of Metabolomics and
Transcriptomics from a Diabetic Mouse Model
• Q&A
• Ingenuity Systems is a pioneer and leading provider in
capturing information, structuring information, and building tools
that turn information into knowledge
IPA
IPA is an All-in-one, web-based software application
Enables researchers to model, analyze, and understand the complex
biological and chemical systems at the core of life science research
IPA
Applications:
• Disease Mechanisms
• Target Identification and Validation
• Biomarker Discovery
• Drug Mechanism of Action
• Drug Mechanism of Toxicity
Experimental Platforms Supported:
• Gene Expression (mRNA, miRNA, microarray platforms, next-gen sequencing, qPCR)
• Proteomics
• Genotyping
• Metabolomics Identifiers
Peer-Reviewed Publications Citing IPA
[Figure: bar chart of the total number of peer-reviewed research articles citing IPA, by year from 2003 through September 2013 (y-axis 0 to 10,000), reaching 9,483. Callouts on the chart: data types (Expression, Proteomics, SNP, Copy Number, RNAi, miRNA); disease areas (Oncology, Cardiovascular Disease, Neuroscience, Metabolic Disease, Inflammation/Immunology, Infectious Disease); research types (Basic, Translational, Drug Discovery & Development Research).]
Full bibliography at www.ingenuity.com
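Research Institutions Using IPA
Massachusetts Institute of Technology
Boston University
German Cancer Research Center
Cancer Research Center
National Institutes of Health (USA)
Harvard Medical School
Duke University
University of Minnesota
University of California, San Francisco
University of Pittsburgh
Stanford University Medical Library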
Companies Using IPA
Sanofi-Aventis
GlaxoSmithKline
The Jackson Laboratory
Wyeth
Merck Chemicals
Amgen
Pfizer
Johnson & Johnson
Merck Serono
AstraZeneca
Bristol-Myers Squibb
Translational Genomics Research Institute
Introduction to Ingenuity Knowledge Base
Ingenuity Expert Findings
From full text, contextual detail, experimentally demonstrated
Example 1
– Original sentence from publication: "nNOS overexpression mice showed reduced myocardial contractility."
– Ingenuity Expert Finding: Transgenic nNOS in myocardium from mouse heart decreases the contractility of myocardium in left ventricle from mouse heart.
Example 2
– Original sentence from publication: "Francisella organisms efficiently induce IL-1beta processing and release."
– Ingenuity Expert Finding: Francisella tularensis subsp. novicida U112 increases (in a time-dependent manner) release of human IL1B protein from human monocytes.
► Contextual details: Manual curation process captures relevant details
► Experimentally demonstrated: Findings are from full text articles, including tables and figures
► Structured: Supports computation and answering in-depth biological questions in the relevant context
► High quality: QC’d to ensure accuracy
► Timely information: Weekly updates so up-to-date information is captured
Ingenuity Content
Expert Extraction: Full text from top journals
• Coverage of top journals, plus review articles and
textbooks
• Manually extracted by Ph.D. scientists
Import Annotations, Findings:
• OMIM, GO, Entrez Gene
• Tissue and Fluid Expression Location
• Molecular Interactions (e.g. BIND, TarBase)
Internally curated knowledge:
• Signaling & Metabolic Pathways
• Drug/Target/Disease relationships
• Toxicity Lists
All findings structured for computation and
updated weekly
Ingenuity® Supported Third Party Information
• Synonyms, Protein Family, Domains
– GO, Entrez Gene, Pfam
• Tissue and Biofluid Expression & Location
– GNF, Plasma Proteome
• Molecular Interactions
– BIND, DIP, MIPS, IntAct, Biogrid, MINT, Cognia, etc.
• miRNA/mRNA target databases
– TargetScan, TarBase, miRBase
• Gene to Disease Associations
– OMIM, GWAS
• Exploratory Clinical Biomarkers
• Clinical Trial and drug information
– ClinicalTrials.gov, Drugs@FDA, Mosby’s Drug Consult, etc.
The Ingenuity Ontology
Structures, translates, and integrates information
▶ Helps you to find highly relevant and contextual information, e.g. direction of change
▶ Makes information computationally accessible and available for queries, e.g.:
• Query over any type of connections (molecular, cellular, organism)
• Make leaps from one concept to another and ask “Is there a path that might lead from A to B?”
▶ Ensures we are all talking about the same concept, regardless of your preferred nomenclature (semantically consistent)
e.g. IL-1 beta increases regulation of COX1:
Which COX1? cyclooxygenase or cytochrome c oxidase – both are enzymes
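The "Is there a path that might lead from A to B?" style of query can be pictured as a graph search. The following is only a minimal sketch: the edge list, the node names and the `find_path` helper are hypothetical and do not reflect how the Ingenuity Knowledge Base is actually stored or queried.

```python
from collections import deque

def find_path(edges, start, goal):
    """Breadth-first search for a shortest directed path from start to goal.

    edges is a list of (source, target) pairs; returns a list of nodes,
    or None when no path exists.
    """
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical relationships between abstract concepts (not real findings).
edges = [("A", "X"), ("X", "Y"), ("Y", "B"), ("A", "Z")]
print(find_path(edges, "A", "B"))  # ['A', 'X', 'Y', 'B']
```

Breadth-first search is used here because it returns a shortest chain of relationships first, which is usually the easiest hypothesis to inspect.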
Explore the Ingenuity Knowledge Base
Content sources:
Ingenuity Expert Findings
• From the full text
• Contextual details
• Timely
• High-quality
Ingenuity ExpertAssist Findings
• High coverage (abstracts)
• Timely
• High-quality
Ingenuity Expert Knowledge
Ingenuity Supported Third Party Information
THE INGENUITY KNOWLEDGE BASE
• Extensive: Leverages knowledge in one place
- Largest scientific knowledge base of its kind with modeled relationships between proteins, genes, complexes, cells, tissues, drugs, pathways and diseases
• Structured: Captures relevant details
- Scientific statements are modeled into Findings (often causal) using the Ingenuity Ontology
• Expert Review Process: Checked for accuracy
- Findings go through extensive QC process
• Timely: Frequent updates and up-to-date knowledge
- Findings are added weekly
The Challenge
Integrate – Interpret – Gain Therapeutic Insight from Experimental Data
• Disease Processes: disease phenotype, physiological response (e.g. Metastasis)
• Cellular Processes: cellular phenotypes, pathways (e.g. Apoptosis, Angiogenesis)
• Molecules: molecular modules (e.g. Fas, Vegf)
• Experimental Platforms: molecular “fingerprint” – cancer vs. normal cells
The Challenge
Rapid understanding and interpretation of experimental systems
• Disease Processes: search for genes implicated in disease (e.g. Cancer)
• Cellular Processes: identify related cellular processes, pathways (e.g. Apoptosis, Angiogenesis)
• Molecules: generate hypothesis of molecular mechanism (e.g. Fas, VEGFA, bevacizumab)
• Experimental Platforms: educate in vivo, in vitro assays
Questions Arise During Experimental Process :
How Can IPA Help You?
IPA Allows Scientists to Explore Biological Findings
• Browse and Search the comprehensive Ingenuity Knowledge Base
– Gene/Chemical Search
– Functions Search
– Pathway Search
• Build Pathways; Build Hypotheses
– Use Build Tools to explore which molecules have molecular interactions with
molecule(s) of interest
– Use the Overlay tools to layer additional functional, drug and biomarker information
• Analyze Data; Interpret Cause and Effect; Discover the Biology
– Gain insights into the Biological Functions, Canonical Pathways and Molecular
Networks that involve dataset molecules
– Predict Transcription Factors & Upstream Regulators involved in transcriptional
changes and connect Regulators into Mechanistic and Causal Networks
– Explore the Causal Effects of network changes
• Filter Datasets
– Biomarkers and Biofluid expression
– microRNA Target Filter for miRNA-mRNA relationships
Search for Genes, Chemicals, Diseases, Functions, or Pathways
Build Pathways; Build Hypotheses
Search and Explore Examples
• Tell me about my gene of interest – Insulin / INS
– What canonical signalling pathways does it appear in?
– What are the transcriptional regulators of this gene?
– What Ligand-Dependent Nuclear Receptors are regulated by these Transcription
Factors?
• What GPCRs are involved in diabetes?
– How do they interconnect?
– What other biological processes (functions) are these genes involved in?
– What are the molecular connections that link these genes to cytokines involved in
obesity?
– What drugs target these genes?
• Tell me about rosiglitazone
– What clinical trials are running with rosiglitazone?
– How does rosiglitazone treatment affect the gene expression of these diabetes
GPCRs and obesity cytokines?
• What are the upstream regulators of the gene expression changes
induced by rosiglitazone treatment?
Build and Grow Networks of Molecules
Grow Upstream from AKT1 to
kinases and phosphatases
View Canonical Pathways and Link Additional Molecules
Cause and Effect Analytics
• Upstream Regulator Analysis (including Transcription Factors)
– Predicts which Transcriptional Regulators and other upstream molecules
are driving gene expression changes and predicts which are activated /
inhibited to explain gene expression observed in a dataset
• Create Mechanistic and Causal Networks
– Connect upstream regulators into networks to help understand the
regulatory control of the gene expression seen
– Use the Ingenuity Knowledge Base of causal relationships to predict
regulators that can be causally linked to the dataset molecules for
unprecedented understanding of biological regulation
• Downstream Effects Analysis
– Predict increases or decreases in downstream biological processes
(functions) and diseases using the direction of change in your gene
expression data
• Molecule Activity Predictor
– Visualize the predicted activity of causally connected molecules in Networks
and Pathways
Upstream Regulators and Mechanistic Networks
• Upstream Regulator: Regulator → Dataset Molecules
• Mechanistic Network: Upstream Regulator → Additional Upstream Regulators → Dataset Molecules
Algorithm chains interacting regulators together to create a “Mechanistic Network”
Upstream Regulator Analysis
Identify important signaling molecules for a more complete regulatory picture
• Quickly filter by molecule type
• Filter by biological context
• Generate regulators-targets network to identify key relationships
Mechanistic Networks
How might the upstream molecule drive the observed expression changes?
• Hypothesis generation and visualization
– Each hypothesis generated indicates the molecules
predicted to be in the signaling cascade
Interpret Downstream Biological Functions
Identify over-represented biological functions and predict how
those functions are increased or decreased in the experiment
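As a rough illustration of what "over-represented" can mean, the sketch below computes a right-tailed hypergeometric p-value (the tail used in a one-sided Fisher's exact test). The slide does not say which test IPA applies here, so the choice of test, the function name `overrepresentation_p_value` and all counts are assumptions for illustration only.

```python
from math import comb

def overrepresentation_p_value(population, annotated, sample, overlap):
    """Probability of seeing at least `overlap` annotated molecules in a
    random sample, under the hypergeometric distribution.

    population: molecules in the reference set
    annotated:  molecules in the reference set linked to the function
    sample:     molecules in the uploaded dataset
    overlap:    dataset molecules linked to the function
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 15 of 500 dataset genes fall in a 200-gene function
# drawn from a 20,000-gene reference set.
p = overrepresentation_p_value(20000, 200, 500, 15)
print(f"p = {p:.3g}")
```

A small p-value only says the overlap is larger than expected by chance; predicting whether the function is increased or decreased additionally needs the direction of change, as the slide notes.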
Compare Canonical Pathways across analyses as a heatmap
BioProfiler*: Find, Filter and Explore
• Find molecules causally relevant to a disease, phenotype, or
function
• Filter by specific genetic evidence or species
• Explore association with other similar diseases or
phenotypes/symptoms leveraging the depth of the Ingenuity
Ontology and the Human Phenotype Ontology
*Available for additional cost
Filter Datasets for Biomarkers or miRNA Targets
[Figure: miRNA Target Filter funnel]
• miRNA data: 88 data points
• miRNA Target Filter: 13,690 targets
• Filter by molecule type: 1,090 targets
• Filter by pathways (Cancer/Growth): 333 targets
• Overlay mRNA data: 39 targets
• Keep opposite expression pairing (↑↓ or ↓↑): 32 targets
Use Pathway tools to build hypothesis for microRNA to mRNA target association
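The last step of the funnel, keeping miRNA-mRNA pairs whose expression moves in opposite directions, can be sketched as follows. This is an illustrative sketch only: the miRNA and gene names, the signed log2 fold changes and the `anticorrelated_pairs` helper are hypothetical, not IPA output or IPA's implementation.

```python
def anticorrelated_pairs(targets, mirna_fc, mrna_fc):
    """Keep (miRNA, gene) pairs where both are measured and their signed
    fold changes have opposite directions (one up, one down)."""
    pairs = []
    for mirna, gene in targets:
        if mirna not in mirna_fc or gene not in mrna_fc:
            continue  # one side was not measured in the dataset
        if mirna_fc[mirna] * mrna_fc[gene] < 0:
            pairs.append((mirna, gene))
    return pairs

# Hypothetical signed log2 fold changes and predicted target pairs.
mirna_fold_change = {"miR-A": 2.5, "miR-B": -1.8}
mrna_fold_change = {"GENE1": -2.0, "GENE2": 1.6, "GENE3": 3.1}
predicted_targets = [
    ("miR-A", "GENE1"),
    ("miR-A", "GENE2"),
    ("miR-B", "GENE3"),
    ("miR-B", "GENE1"),
    ("miR-A", "GENE4"),
]

result = anticorrelated_pairs(predicted_targets, mirna_fold_change, mrna_fold_change)
print(result)  # [('miR-A', 'GENE1'), ('miR-B', 'GENE3')]
```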
Summary
IPA is a powerful data analysis and reference tool used by
thousands of scientists worldwide
• Browse and search the comprehensive Ingenuity
Knowledge Base
• Build pathways
• Build hypotheses
• Analyze and filter data
• Discover and interpret cause and effect
• Build enterprise knowledge base and results repository
IPA: Unique Tools for Biological Analysis and Interpretation
Gene View & Chem View
Summaries
Human Isoform Views
Interaction Networks
Biological Functions
Canonical Pathways
Upstream Regulators/
Causal Networks
BioProfiler
Build & Overlay Tools
Data Analysis & Interpretation in IPA
- Case study for Cross Platform Integration of
Metabolomics and Transcriptomics from a
Diabetic Mouse Model
Background
• T2DM is one of the most common diseases of the western world
• 150 million afflicted worldwide.
• Animal models can aid discovery of biomarkers and clinical
compounds.
• However, no animal model reflects all aspects of the human form of the
disease.
• Omics analysis across model systems could provide supporting
evidence of the value of those animal models.
• Metabolic manifestations of diabetes associated with insensitivity to
insulin include:
– Uncontrolled lipogenesis
– Hepatic glucose production
– Mitochondrial dysfunction
– Altered protein turnover
Dataset used
• Integration of Metabolomics and Transcriptomics Data to
Aid Biomarker Discovery in Type 2 Diabetes. Connor S et
al. 2009
• Metabolites identified from Urine of dB/dB mice compared
to dB/+ controls using a Non Targeted NMR based
approach.
• Transcriptomic analysis performed on tissue from liver,
adipose and muscle using Affymetrix arrays
dB/dB mouse model
• Lack functional Leptin receptor (LEPR-)
• Leads to defective leptin-mediated signal transduction.
• Results in:
– Chronic overeating
– Obesity
– Severe hyperinsulinaemia
– Hyperglycaemia and dyslipidaemia
Aim of case study
• What diabetes-aligned phenotypes are highlighted by the
IPA Metabolite analysis, e.g. changes in lipid, glucose and
protein metabolism, mitochondrial dysfunction and
oxidative stress?
• Can we integrate metabolite and transcript data into a
concerted analysis of a dB/dB model?
• Are there differences in transcript and metabolite levels
relevant to gluconeogenesis (Hepatic Glucose
Production)?
• Can we identify putative serum/tissue biomarkers relevant
to a dB/dB model?
Metabolite upload and mapping
• Includes some phase-1 and phase-2 type transformed metabolites.
• 68 out of 74 metabolites mapped.
Summary of metabolic analysis
Networks built around the metabolites also include key protein regulators of relevant functions and pathways:
• T2DM and Insulin receptor signalling.
• Hyperglycemia, hyperinsulinemia and quantity of lipid
Network 1
Dysregulated metabolites and network associated proteins highlight:
• Lipid metabolism
• Carbohydrate metabolism
• Branched Chain Amino Acid metabolism
Key regulators of diabetes: ADIPOR2
Aim of case study
• What diabetes-aligned phenotypes are highlighted by the
IPA Metabolite analysis, e.g. changes in lipid, glucose and
protein metabolism, mitochondrial dysfunction and
oxidative stress?
• Can we integrate metabolite and transcript data into a
concerted analysis of a dB/dB model?
• Are there differences in transcript and metabolite levels
relevant to glucose metabolism?
• Can we identify putative serum/tissue biomarkers relevant
to a dB/dB model?
Network 1 - metabolite and transcript data
Inclusion of Liver, Muscle and Adipose transcript data
[Figure panels: Liver, Muscle, Adipose]
Carbohydrate metabolism
TCA Cycle
Upregulated Citrate Cycle feeding Pyruvate into Gluconeogenesis
Gluconeogenesis
Gluconeogenesis upregulated at transcript level in Liver, but not Muscle or Adipose
[Figure panels: Liver, Muscle, Adipose]
Pre-established Clinical Biomarkers
Putative novel biomarkers
• Hyperglycaemia
– Glucose
– Creatinine
• Hyperinsulinaemia
– Glucose
• Diabetes
– Creatine
• All detectable in Serum/Urine and phenotypically tagged to
Diabetes and/or co-morbidities
Summary
• What diabetes-aligned phenotypes are highlighted by the
IPA Metabolite analysis, e.g. changes in lipid, glucose and
protein metabolism, mitochondrial dysfunction and
oxidative stress?
– Molecular networks and functions from metabolite data align to a
range of carbohydrate, lipid and protein metabolism functions and
pathways
• Can we integrate metabolite and transcript data into a
concerted analysis of a dB/dB model?
– Ready alignment of array data and metabolite data identifies key
metabolic pathways perturbed across gluconeogenesis, citrate cycle
and branched chain amino acid metabolism
Summary
• Are there differences in transcript and metabolite levels
relevant to glucose metabolism?
– Gluconeogenesis upregulated in liver compared with adipose and muscle
tissue.
– Citrate cycle and Valine, Leucine and Isoleucine degradation
support this hypothesis
• Can we identify putative serum/tissue biomarkers relevant
to a dB/dB model?
– Established diagnosis (INS) and efficacy (Glucose) biomarkers for
T2DM, obesity & dyslipidaemia
– Putative biomarkers in metabolite profile (glucose, creatine &
creatinine) for diabetes, hyperglycaemia and hyperinsulinaemia
Thank you
Feel free to contact us
Office: +886-2-2795-1777#1169
Fax: +886-2-2793-8009
My E-mail: [email protected]
MSC Support: [email protected]
Predicting upstream regulators of a dataset
• Every possible TF & Upstream Regulator (UR) in the Ingenuity Knowledge Base is analyzed
• Worked example for one UR with 8 downstream genes:
Literature-based effect TF/UR has on downstream genes:   -   -   +   +   +   +   +   +
Differential Gene Expression (Uploaded Data):            ↓   ↓   ↑   ↓   ↑   ↑   ↑   ↑
Predicted activation state of TF/UR:                     1   1   1  -1   1   1   1   1
1 = Consistent with activation of UR
-1 = Consistent with inhibition of UR
z-score = (7-1)/√8 = 2.12 (= predicted activation)
• z-score is a statistical measure of the match between expected relationship direction and observed gene expression
• z-score > 2 or < -2 is considered significant
Note that the actual z-score is weighted by the underlying findings, the relationship bias, and dataset bias
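The arithmetic of the worked example can be reproduced with a few lines of code. This is a minimal sketch of the simplified, unweighted form shown on the slide, (consistent with activation minus consistent with inhibition) divided by the square root of the number of genes; the function name `activation_z_score` and the +1/-1 encoding are assumptions, and the weighting by findings, relationship bias and dataset bias mentioned in the note is deliberately not modelled.

```python
import math

def activation_z_score(literature_effects, observed_changes):
    """Unweighted activation z-score for one upstream regulator.

    literature_effects: +1 if the regulator increases the gene, -1 if it decreases it
    observed_changes:   +1 if the gene is up in the dataset, -1 if it is down
    Each product is +1 (consistent with activation) or -1 (consistent with inhibition).
    """
    if len(literature_effects) != len(observed_changes):
        raise ValueError("both lists must describe the same genes")
    states = [e * o for e, o in zip(literature_effects, observed_changes)]
    if not states:
        return 0.0
    return sum(states) / math.sqrt(len(states))

# The 8 downstream genes from the slide example.
effects = [-1, -1, +1, +1, +1, +1, +1, +1]
changes = [-1, -1, +1, -1, +1, +1, +1, +1]

z = activation_z_score(effects, changes)
print(round(z, 2))  # 2.12
```

Seven genes are consistent with activation and one with inhibition, so the sum is 6 and the score is 6/√8, which exceeds the significance threshold of 2 quoted on the slide.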
Single- vs. Mechanistic- vs. Causal Networks
Leveraging the network to create more upstream regulators
• Single Upstream Regulator: Regulator → Dataset Molecules
• Mechanistic Network of Upstream Regulators: Upstream Regulator → Transcription Factors → Dataset Molecules
• Causal Network: Scoring causal connection to disease, function, or gene of interest
Peer-Reviewed Publications Citing IPA
Sorted by Research Area, Through September 2013
• Renal Disorders: 1%
• Eye Disease: 2%
• Wound Healing/Injury: 3%
• Stem Cell Biology: 4%
• Developmental Biology: 3%
• Aging: 1%
• Bone Development: 1%
• Liver Disease: 1%
• Nutritional Sciences: 2%
• Oncology: 27%
• Cardiovascular Disease: 4%
• Metabolic Disease: 3%
• Genetics: 4%
• Inflammation/Immunology: 12%
• Hematology: 4%
• Reproductive Biology: 4%
• Toxicity/Safety Assessment: 5%
• Bioinformatics/General Methods: 5%
• Infectious Disease: 7%
• Neuroscience: 9%
Peer-Reviewed Publications Citing IPA
Sorted by Experimental Platforms, Through September 2013
• ChIP-on-Chip: 1%
• Metabolomics: 1%
• Next-Gen Sequencing: 2%
• Chromatin Immunoprecipitation: 1%
• Methylation Profiling: 3%
• Genotyping: 4%
• miRNA: 6%
• RNAi: 4%
• Proteomics Profiling: 13%
• Gene Expression Profiling: 72%