Genetics meets Genomics: Genetic Variation and Regulatory Networks

Genetic Regulatory Complexity:
Lessons from yeast to cancer
Dana Pe’er
Dept. of Biological Sciences
C2B2 Center for Computational Biology
Columbia University
How does sequence variation affect the
function of molecular networks?
Genetic Variation and Regulation:
Variation as perturbations to the regulatory network
“Genetics Genomics” Data
Strains: Lab (BY), Wine (RM)
 Profile gene expression
 Determine genome segregation
 Correlate genotype with transcript abundance
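Correlating genotype with transcript abundance can be sketched as a two-group comparison at each marker. This is a minimal illustration and not the analysis used in the talk: the function names `linkage_score` and `best_marker`, the 0/1 genotype coding (0 = BY, 1 = RM), and the choice of a Welch t statistic are all assumptions.

```python
import math
from statistics import mean, variance


def linkage_score(genotype, expression):
    """Welch t statistic for expression in RM (1) versus BY (0) segregants."""
    by = [e for g, e in zip(genotype, expression) if g == 0]
    rm = [e for g, e in zip(genotype, expression) if g == 1]
    if len(by) < 2 or len(rm) < 2:
        raise ValueError("need at least two segregants per allele")
    diff = mean(rm) - mean(by)
    se = math.sqrt(variance(by) / len(by) + variance(rm) / len(rm))
    if se == 0.0:
        # No within-group spread: the statistic is 0 or unbounded.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def best_marker(genotypes, expression):
    """Index of the marker whose genotype best separates the expression values."""
    scores = [abs(linkage_score(g, expression)) for g in genotypes]
    return max(range(len(scores)), key=scores.__getitem__)
```

Here `genotypes` is a list of markers, each a list of 0/1 calls across segregants, and `expression` is one transcript measured in the same segregants. A real analysis would also correct for testing many markers and transcripts.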
Modularity
 Module - a set of biological entities that act
collectively to perform an identifiable and distinct
function
[Figure: an activator and its targets, with target expression grouped by activator genotype (RM, BY)]
 Power of modules:
multiple co-regulated
genes provide statistical
power for linkage.
 Enables combinatorial
regulation
Utilizing Gene Expression
DNA variation can change
abundance of a regulator
which in turn changes
expression of many targets
 Can “explain” the linkage
 Can uncover novel
regulatory mechanisms
Causality versus Co-regulation
[Figure: Cause and Effect, compared with two genes that Share a Common Cause]
 The key is to go beyond pair-wise correlation and test multivariate statistical dependencies
 Statistical test: Is regulator gene expression significantly more predictive of the trait than genotype?
 Permutation testing fixing genotype
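One way to read "permutation testing fixing genotype" is to shuffle regulator expression only among individuals that carry the same allele, so whatever genotype explains is preserved in every permutation. The sketch below follows that reading. It is an assumption about the procedure, not the talk's implementation; the function names and the use of absolute Pearson correlation as the test statistic are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permute_within_genotype(genotype, expression, rng):
    """Shuffle expression values among individuals sharing a genotype."""
    groups = {}
    for i, g in enumerate(genotype):
        groups.setdefault(g, []).append(i)
    shuffled = list(expression)
    for indices in groups.values():
        values = [expression[i] for i in indices]
        rng.shuffle(values)
        for i, v in zip(indices, values):
            shuffled[i] = v
    return shuffled


def conditional_permutation_pvalue(genotype, expression, trait,
                                   n_perm=1000, seed=0):
    """P-value for expression predicting the trait beyond genotype."""
    rng = random.Random(seed)
    observed = abs(pearson(expression, trait))
    hits = 0
    for _ in range(n_perm):
        permuted = permute_within_genotype(genotype, expression, rng)
        if abs(pearson(permuted, trait)) >= observed - 1e-12:
            hits += 1
    return (1 + hits) / (1 + n_perm)
```

If expression is fully determined by genotype, every permutation reproduces the observed statistic and the p-value is 1, which is the intended behavior: expression then adds nothing beyond the genotype itself.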
Zooming Into the Linked Region
 A Bayesian score that integrates gene expression and SNP
structure to help identify causal gene
 Prioritizing genes within a region
 Increasing confidence in a weak linkage
Network Learning Engine
Geronemo: Module Networks Algorithm
Inputs: Genotype Data, Expression Data
Candidate regulators:
 ~600 candidate regulators
 ~500 genotype regions
Steps: clustering (gene partition), regulation program learning, gene reassignment to modules
Output: functional modules
Automatically identify modules of co-regulated genes & their regulatory program
Puf3 Module
Dhh1: Part of P-body complex that stimulates mRNA decapping, coordinates distinct steps in mRNA function and decay.
Module enrichment:
 Mitochondria: 139/153 genes, p < 10^-92
 Puf3 (3’UTR): p < 5.8 × 10^-131
Hypothesis: Puf3 “marks” mRNAs by binding, then the Dhh1 P-body degrades them?
[Figure legend: expression, genotype]
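Enrichment p-values such as the one reported for 139 of 153 module genes are commonly computed as a hypergeometric tail probability. The slide does not say which test was used, so the sketch below is an assumption, and the function name `hypergeom_tail` is hypothetical.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are annotated."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    upper = min(K, n)
    # Exact integer arithmetic avoids underflow for very small tails.
    numerator = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(max(k, 0), upper + 1))
    return numerator / comb(N, n)
```

A call for the module above would look like `hypergeom_tail(139, K, 153, N)`, where `N` (genes in the background) and `K` (background genes carrying the annotation) are not given on the slide and would have to be supplied.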
Prediction Validated
ΔDhh1 leads to overexpression of Puf3 targets, of similar magnitude to ΔPuf3
 Approach discovered a novel regulatory mechanism which
we validated
 Required using gene expression as an intermediary
 Treating the module as an entity aided interpretation
Our detected gene expression regulator is causal
Linkage Analysis Result
Linkage analysis [Brem & Kruglyak]: 500 genes linked to ChrXIV locus
 Large set of genes linked to ChrXIV
region
 Highly heterogeneous
 No hypothesis suggested for linkage
Identifying the Causal SNP
Chromosome XIV region contains 33 genes
Use gene expression with a Bayesian prior to identify the gene: Dhh1
 Binds at 3’ UTR of mRNAs
 Regulates translation of Puf5-dependent mRNA (HO)
 Significant SNP in highly conserved residue
Our gene expression regulator aids in identifying causal SNP
The Full Ribosome Module
 Hundreds of ribosomal genes have a clear coexpression pattern, but only 4 link to the primary locus with any significance; no loci are associated with the others, even as we lower the p-value threshold.
 When we use modularity to include interacting loci…
Can this approach scale to human?
Challenges
 Network complexity
 Multiple tissues
 100x genotypes
 No breeding
 No perturbation
Genetic Genomics of Cancer
 Coordinated genomics study of tumor samples: Copy number, LOH, SNPs + Gene Expression
Melanoma
 Problem!
 Yeast: 500 genotypes and 600 regulators
 Cancer: Tens of thousands of genotypes and
regulators.
Limiting to only those regions with copy number change:
 Almost every region of the genome is altered in at least one tumor
 Which are the drivers?
Solution: Use evolutionary principles
GISTIC: Significantly recurring changes (Beroukhim et al., PNAS 2007)
AMPLOTYPE: Integrating SNPs
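The idea of "significantly recurring changes" can be illustrated with a simple permutation test: a region is interesting if it is altered in more tumors than chance placement would allow. The sketch below is not GISTIC. It is a simplified stand-in that assumes a binary altered/unaltered call per region and treats regions as exchangeable within a tumor; the function name `recurrence_pvalues` is hypothetical.

```python
import random


def recurrence_pvalues(altered, n_perm=1000, seed=0):
    """altered[s][r] is True if region r is altered in tumor sample s.

    Returns one p-value per region, using the maximum null count across
    regions so the p-values account for testing every region at once.
    """
    rng = random.Random(seed)
    n_regions = len(altered[0])
    observed = [sum(row[r] for row in altered) for r in range(n_regions)]
    exceed = [0] * n_regions
    for _ in range(n_perm):
        null_counts = [0] * n_regions
        for row in altered:
            # Shuffle positions within a sample: keeps its total burden.
            shuffled = list(row)
            rng.shuffle(shuffled)
            for r, is_altered in enumerate(shuffled):
                null_counts[r] += is_altered
        null_max = max(null_counts)
        for r in range(n_regions):
            if null_max >= observed[r]:
                exceed[r] += 1
    return [(1 + e) / (1 + n_perm) for e in exceed]
```

Shuffling within each sample keeps tumors with many alterations from inflating the evidence for any single region, which is the point of testing recurrence rather than raw frequency.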
Conexic: Module Network Algorithm
Preliminary step, prioritize a smaller set of potential causative genotypes (GISTIC, 2008 PNAS)
Inputs: Gene Expression, Copy Number
 Integrating genotype to expression
 Who are the driving mutations?
 What genes and processes do they affect?
 How do they interact together?
Capturing Regulation
3p14 - MITF
TF which regulates the differentiation and development of melanocytes and retinal pigment epithelium, and is also responsible for pigment cell-specific transcription of the melanogenesis enzyme genes
 Module enriched for pigment metabolism and creation
A Key Melanoma Oncogene
[Figure: Ras, Raf, Mek, MapK, MITF signaling cascade; 3p14 MITF module; Pigmentation, BCL2, SILV, Pax6]
 Anti-proliferation factor
that is a crucial event
for the progression of
melanomas that harbor
oncogenic B-RAF.
 MITF chosen as key
regulator for 14
modules (different
combinatorial
regulation)
 All known MITF targets
detected
Beyond Correlation
Is simple correlation enough?
Chromosome 13
 Correlation alone does not
identify any candidate gene
in deleted region on
chromosome 13.
 Perhaps not a real driver mutation?
Combinatorial Regulation
13q12.11 - TBC1D4
13q12.11 - EDNRB
 Module significantly enriched for apoptosis and AKT
 EDNRB is needed for melanocyte proliferation. Inhibiting its action in melanoma leads to apoptosis.
 TBC1D4 connected to EDNRB via the AKT pathway.
 Dramatically different gene expression of genes in the same deleted region
Conexic picked two distinct genes in the same deleted segment, which combinatorially influence a set of apoptosis genes
Discovering Additional Driver Mutations
Problem: Many known oncogenes are missed by GISTIC, high statistical burden.
Solution: Lower the threshold.
3q21.3 - RAB7A
15q21 - RAB27A
A significantly recurring copy number change in a gene, coinciding with its ability to predict the expression patterns varying across tumors, strengthens the evidence of its causative role in cancer.
Targeting the same pathway
 Rab27a and Rab7a regulate melanosome maturation.
 A region containing either of these genes was amplified in
23 samples
[Figure: 3q21.3 - RAB7A and 15q21 - RAB27A modules; samples marked Rab7a amp and Rab27a amp]
Our approach successfully scales to better understand driving mutations in cancer
How does gene expression affect growth?
Complex Phenotype: Cell Growth
 Does the cell care that the expression of ~4000 genes changes significantly? How?
 Growth under 40 physiologically relevant conditions:
 Carbon source (8 sugars), environment stress (e.g. osmolarity, heat), starvation (nitrogen, phosphate)
 Robust highly quantitative protocol
 OD sampled every 10 minutes
 Spearman correlation 97-99%
[Figure: growth curves for BY and RM]
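A growth phenotype can be derived from OD readings taken every 10 minutes by fitting a line to ln(OD) against time, and reproducibility can then be summarized with a Spearman rank correlation. The sketch below is an illustration only: it assumes the whole series is in exponential phase, and the function names `growth_rate`, `ranks`, and `spearman` are hypothetical rather than part of the protocol in the talk.

```python
import math
from statistics import mean


def growth_rate(od, minutes_per_sample=10.0):
    """Least-squares slope of ln(OD) against time in hours."""
    t = [i * minutes_per_sample / 60.0 for i in range(len(od))]
    y = [math.log(v) for v in od]
    mt, my = mean(t), mean(y)
    sxx = sum((a - mt) ** 2 for a in t)
    return sum((a - mt) * (b - my) for a, b in zip(t, y)) / sxx


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx)
                    * sum((b - my) ** 2 for b in ry))
    return num / den
```

For example, `spearman(rates_replicate_1, rates_replicate_2)` compares growth rates from two replicate runs across the same strains, where both lists are hypothetical inputs produced by `growth_rate`.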
Can we explain our growth phenotypes?
 Same as before, try to use gene expression as an
intermediary to explain cellular phenotype
 Note: gene expression measured in Glucose and growth
measured in many other conditions.
Does regulation affect growth rates?
[Figure: Test MSE (mean squared error) across conditions for three models: Single Genotype, RegNet G, RegNet RG]
Conclusion: Genetic variation in the regulatory network
has a significant effect on growth.
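Comparing models by test MSE means fitting each model on some segregants and scoring it on held-out ones. The sketch below shows that comparison with ordinary least squares and k-fold cross-validation. The regression model, the fold count, and the function name `cv_test_mse` are assumptions for illustration, not the models behind the slide, and the code requires NumPy.

```python
import numpy as np


def cv_test_mse(X, y, n_folds=5, seed=0):
    """Mean squared error on held-out samples for a linear model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    design = np.column_stack([np.ones(n), X])  # intercept plus features
    squared_errors = []
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        residual = design[test_idx] @ coef - y[test_idx]
        squared_errors.extend(residual ** 2)
    return float(np.mean(squared_errors))
```

Calling it once with genotype features only and once with genotype plus regulator expression gives two test MSE values for the same growth condition; a lower value for the second call would indicate that expression carries information beyond genotype.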
Growth Under Oxidative Stress
[Figure: regressors Chr13:227254, Chr13:245457, SUT2, DHH1 (values shown: 0.31, 0.21, -0.291, -0.41); predictions from R+G and from G (3 markers) compared with measured growth in H2O2]
 MSE: 10.50, compared with 0.81 of using only genotype
Integrating Genotype and Gene Expression
Aids in Interpreting Growth Phenotypes
Intermediate regulator: DHH1, growth (H2O2)
Oxidative stress causes P-bodies to increase
P-bodies degrade mitochondrial ribosome, critical under oxidative stress
Zooming In the Region
[Figure: YAP1; M13 locus; growth (H2O2)]
Yap1p activates
transcription of genes in
response to oxidative stress
Summary
 Combining genotype and gene expression:
 Helps better explain observed variation
 Uncovers regulatory network
 Approach scales to discovering driver genes in
cancer and the pathways they alter
 Towards a complex phenotype, using the network to
understand growth in different biological conditions
What Next:
Understanding Drug Resistance in Cancer
[Figure: growth curves for plate 101006_plate1, wells D1 to D12; OD [600nm] against Time [hrs]]
 600 cell lines, derived from different cancers
 Affy 500K SNP chip
 Gene expression
 Growth under 100 drugs, 3 doses each
 What mutations drive drug resistance and how?
With Jeff Settleman, MGH, Harvard
What Next:
How is Signal Processing Altered in Cancer?
 70 Melanoma samples
 SNP chip and Gene expression
 Reverse Phase Protein Array, 300 antibodies
 Growth and response under Mek inhibitor
With Levi Garraway, Dana Farber
Acknowledgements
Geronemo: Su-in Lee (Stanford), Daphne Koller (Stanford)
Yeast: Bo-Juen Chen (Pe’er lab), Oren Litvin (Pe’er lab), Noel Goddard (Hunter College)
Cancer: Uri-David Akavia (Pe’er lab), Oren Litvin (Pe’er lab), Levi Garraway (Dana Farber)
Funding
Positions available contact:
[email protected]