Overview of Weighted Gene Co-Expression Network Analysis (WGCNA)
Steve Horvath & Brian Chen
Human Genetics and Biostatistics
UCLA
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/
Networks are particularly valuable for data integration
•  Resulting analysis is known as
   –  “systems biology”
   –  “systems genetics”
   –  “integromics”
•  WGCNA useful for correlating disparate data sets:
   –  SNPs
   –  Gene expression
   –  DNA methylation
   –  Clinical outcomes
Nature (2011) 25;474(7351):380-4
PNAS (2010) 107(28):12698-703
PloS Genetics (2009) ;5(9):e1000642
Nature (2008) Mar 27;452(7186):429-35
Standard analyses identify “differentially expressed” or “differentially methylated” genes
[Figure: list of Gene 1 through Gene n, with labels Control and Experimental]
•  Each gene is treated as an individual entity.
•  Misses the forest for the trees
•  Ignores the strong correlations between genes
•  Plagued by false positives due to multiple comparisons
Comparison of gene-centric vs. network approaches

                        Gene-centric                  Network
Number of comparisons   ~10^5                         ~10^1
Reproducibility         Results are sensitive to      Robust statistical framework;
                        analytic decisions            focus on pathways
Pathway information     Databases                     Data-driven + databases
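The difference in the number of comparisons translates directly into the multiple-testing burden. As a rough illustration only, the sketch below assumes a Bonferroni correction at a family-wise error rate of 0.05; neither the correction method nor the 0.05 level is stated on the slide.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


ALPHA = 0.05  # assumed family-wise error rate (not from the slide)

gene_centric = bonferroni_threshold(ALPHA, 10**5)  # roughly one test per gene
network = bonferroni_threshold(ALPHA, 10**1)       # roughly one test per module

print(f"gene-centric per-test threshold: {gene_centric:.1e}")
print(f"network per-test threshold:      {network:.1e}")
```

Under these assumptions a single gene must reach roughly p < 5e-7, whereas a module needs only p < 5e-3.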
Typical analysis steps of weighted correlation network analysis (WGCNA)
1. Construct a network
   Rationale: make use of interaction patterns between genes
2. Identify modules
   Rationale: module (pathway) based analysis
3. Relate modules to external information
   Array Information: clinical traits, SNPs, proteomics
   Gene Information: gene ontology, pathways, enrichment
   Rationale: find biologically interesting modules
4. Study module preservation across data
   Rationale:
   •  Same data: to check robustness of module definition
   •  Different data: to find interesting modules
5. Find the key drivers in interesting modules
   Tools: intramodular connectivity, causality testing
   Rationale: experimental validation, therapeutics, biomarkers
Constructing co-expression networks
High dimensional data (e.g. expression, methylation)
Measure of co-expression
Hierarchical clustering to identify modules (clusters)
Network
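The construction steps above can be sketched in Python with NumPy and SciPy. This is a simplified illustration, not the WGCNA implementation: the soft-threshold power `beta = 6`, the use of `1 - adjacency` as the dissimilarity, the fixed number of modules, and the toy data are all assumptions. The slides do not specify the dissimilarity or the tree-cut rule.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def soft_threshold_adjacency(expr: np.ndarray, beta: float = 6.0) -> np.ndarray:
    """Weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta.

    expr has shape (n_samples, n_genes); beta is an assumed soft-threshold power.
    """
    cor = np.corrcoef(expr, rowvar=False)  # gene-by-gene Pearson correlation
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 1.0)
    return adj


def cluster_modules(adj: np.ndarray, n_modules: int) -> np.ndarray:
    """Average-linkage hierarchical clustering on the dissimilarity 1 - adjacency."""
    dissim = 1.0 - adj
    dissim = (dissim + dissim.T) / 2.0  # enforce exact symmetry
    np.fill_diagonal(dissim, 0.0)
    tree = linkage(squareform(dissim, checks=False), method="average")
    # Cut the tree into a fixed number of clusters (a simplification).
    return fcluster(tree, t=n_modules, criterion="maxclust")


# Toy data: 40 samples, two groups of 5 genes, each driven by its own latent signal.
rng = np.random.default_rng(0)
n_samples = 40
signal_a = rng.normal(size=(n_samples, 1))
signal_b = rng.normal(size=(n_samples, 1))
expr = np.hstack([
    signal_a + 0.3 * rng.normal(size=(n_samples, 5)),
    signal_b + 0.3 * rng.normal(size=(n_samples, 5)),
])

adjacency = soft_threshold_adjacency(expr, beta=6.0)
labels = cluster_modules(adjacency, n_modules=2)
print(labels)  # one module label per gene
```

Raising the absolute correlation to a power keeps every gene pair in the network while shrinking weak correlations toward zero, which is what makes the network "weighted" rather than thresholded.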
Data reduction with networks
Hub gene = most highly connected gene
Module Eigengene = weighted average
[Figure labels: Relative risk for CVD; Connectivity]
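Both summaries can be computed directly from a module's expression matrix. In the sketch below the eigengene is taken as the first principal component of the standardized expression values, which is a weighted combination of the module's genes. The toy data, the power of 6, and the function names are illustrative assumptions.

```python
import numpy as np


def intramodular_connectivity(adj: np.ndarray) -> np.ndarray:
    """Connectivity k_i = sum of adjacencies between gene i and all other genes."""
    return adj.sum(axis=1) - np.diag(adj)


def module_eigengene(expr: np.ndarray) -> np.ndarray:
    """First principal component of the standardized module expression matrix.

    expr has shape (n_samples, n_genes). Returns one value per sample.
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a principal component is arbitrary; align it with mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


# Toy module: 6 genes share one latent signal; gene 0 has the least noise.
rng = np.random.default_rng(1)
n_samples = 100
signal = rng.normal(size=(n_samples, 1))
noise_sd = np.array([0.1, 0.5, 0.6, 0.7, 0.8, 0.9])
expr = signal + noise_sd * rng.normal(size=(n_samples, 6))

adj = np.abs(np.corrcoef(expr, rowvar=False)) ** 6  # assumed soft-threshold power
k = intramodular_connectivity(adj)
hub = int(np.argmax(k))  # hub gene = most highly connected gene
eigengene = module_eigengene(expr)

print("connectivity per gene:", np.round(k, 2))
print("hub gene index:", hub)
```

The eigengene reduces a whole module to a single value per sample, so downstream tests run once per module instead of once per gene.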
Relating modules to clinical traits: table with correlations and p-values
[Figure: table of modules against clinical traits related to metabolic syndrome (data by AJ Lusis)]
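A table of this kind can be built by correlating each module eigengene with each trait. The following is a minimal sketch using `scipy.stats.pearsonr`; the module names, trait names, and simulated values are hypothetical and are not the Lusis data.

```python
import numpy as np
from scipy.stats import pearsonr


def module_trait_table(eigengenes: dict, traits: dict) -> dict:
    """Pearson correlation and p-value for every (module, trait) pair.

    eigengenes: module name -> 1-D array with one value per sample
    traits:     trait name  -> 1-D array with one value per sample
    Returns a dict keyed by (module, trait) holding (r, p) tuples.
    """
    table = {}
    for module, eigengene in eigengenes.items():
        for trait, values in traits.items():
            r, p = pearsonr(eigengene, values)
            table[(module, trait)] = (float(r), float(p))
    return table


# Simulated example: the "blue" eigengene drives "glucose"; everything else is noise.
rng = np.random.default_rng(2)
n_samples = 50
eigengenes = {
    "blue": rng.normal(size=n_samples),
    "brown": rng.normal(size=n_samples),
}
traits = {
    "glucose": eigengenes["blue"] + 0.3 * rng.normal(size=n_samples),
    "height": rng.normal(size=n_samples),
}

table = module_trait_table(eigengenes, traits)
for (module, trait), (r, p) in table.items():
    print(f"{module:>6} vs {trait:<8} r = {r:+.2f}, p = {p:.2g}")
```

Because there are only a handful of modules, the number of tests in this table stays small compared with a gene-by-gene analysis.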
WGCNA in WHI Long Life Study
High-dimensional Data
• RNA
• methylation
• metabolites
WGCNA data integration & reduction
Modules
Clinical Traits
•  CVD
•  Blood counts
•  Glucose
•  Insulin
•  CRP
•  Creatinine
•  Triglycerides
•  Cholesterol
•  Blood pressure
•  Height/weight
•  Waist
•  Physical performance
•  Physical activity
Gene Ontology, Pathways, Enrichment
Ingenuity Pathway Analysis (IPA)
Gene-Set Enrichment Analysis (GSEA)
Gene Ontology (DAVID/EASE)
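Each of the tools listed uses its own statistic. As a generic illustration of module enrichment, the sketch below computes a one-sided hypergeometric over-representation p-value. The choice of this test and all the counts in the example are assumptions, not taken from the slides.

```python
from math import comb


def hypergeom_enrichment_p(n_universe: int, n_annotated: int,
                           n_module: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) when n_module genes are drawn at random
    from n_universe genes, of which n_annotated carry the annotation."""
    total = comb(n_universe, n_module)
    upper = min(n_annotated, n_module)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes measured, 50 annotated to a pathway,
# a module of 100 genes containing 20 of the annotated genes.
p_value = hypergeom_enrichment_p(n_universe=1000, n_annotated=50,
                                 n_module=100, n_overlap=20)
print(f"enrichment p-value: {p_value:.3g}")
```

With these counts only about 5 annotated genes would be expected in the module by chance, so an overlap of 20 yields a very small p-value.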
Systems Genetics Integration (“Integromics”)
SNP
•  eQTL
•  Causal pathway testing
•  SNP-set enrichment
Gene expression
•  GWAS
•  Module eigengenes
•  SNP-set enrichment
Clinical traits
Conclusions
•  WGCNA is a highly robust, systems approach for:
   •  Integrating high-dimensional, multi-scale data
      –  Microarray
      –  RNA-seq
      –  Methylation
      –  Proteomics
      –  MRI
   •  Identifying modules and key driver genes that relate to disease outcomes