Location Analysis
of Transcription Factor Binding
Tommy
Computational Biology Seminar
Nov. 2005
2
Background
• Immuno Precipitation
• ChIP - Chromatin Immuno Precipitation
• Microarray evolution
(from promoter arrays to tiling arrays)
• ChIP-chip (ChIP followed by microarray hybridization)
3
Things to do with ChIP-chip…
General method for identification of
– Target genes of transcription factors
– Transcribed genes (Pol II)
– Transcribed miRNAs (Pol II)
– Chromatin states (ABs for modified histones)
– etc. – (any protein (mod AB) that binds DNA)
4
Outline
• Kim, Ren et al. Nature (2005)
A high-resolution map of active promoters in the
human genome.
• Boyer, Young et al. Cell (2005)
Core transcriptional regulatory circuitry in human
embryonic stem cells.
• Odom, Young et al. Science (2004)
Control of pancreas and liver gene expression
by HNF transcription factors.
5
General Transcription Factors (GTFs)
• TFIIA: 2-3 subunits
• TFIIB: 1 subunit
• TFIID: 15 subunits
• TFIIE: 2 subunits
• TFIIF: 2 subunits
• TFIIH: 9 subunits
• Pol II: 12 subunits
6
Formation of Pre-Initiation Complex
1. Localization at the promoter
2. DNA melting, initiation and elongation
[Figure: PIC assembled at the core promoter. Labels: TFIIA, TFIIB, TFIID (TBP and TAFs), TFIIE, TFIIF, TFIIH, Pol II; promoter elements BRE, TATA, TSS]
7
Kim, Barrera, Ren et al. Nature (2005)
A high-resolution map of active promoters
in the human genome
• Accurate mapping of active promoters in
human fibroblast cells (IMR90)
– Active genes
– Identify transcription start sites
• DNA microarray of Human genome
NimbleGen 50bp probe every 100bp
• ABs for Pol II preinitiation complex (PIC)
• Computational aspects
deconvolution of semi-continuous signal
8
Kim et al. Map of active promoters
Beware, spoiler!
The Titanic drowns and Leo DiCaprio dies
9
Kim, Barrera, Ren et al. Nature (2005)
A high-resolution map of active promoters
in the human genome
• Found 12,150 bound regions (promoters)
– 10,576 belong to 6,763 known genes
– 1,196 un-annotated transcriptional units
• Many genes with multiple promoters
• Clusters of active promoters
• Four classes of promoters
• Many novel genes (RNA genes?)
10
Kim et al. Map of active promoters
Technicalities
• Follows similar work on ENCODE
regions
Kim et al, Gen. Res. (2005); ENCODE project, Science (2004)
• Chip design: series of DNA microarrays
covering 14.5 million (!) 50bp probes,
covering all the human genome*
• IP design: Monoclonal AB to TAF1
(TAFII250) of TFIID
* Except for genomic repeats
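The probe count can be sanity-checked against the tiling design quoted earlier (one 50bp probe every 100bp). The sketch below is back-of-the-envelope arithmetic only; the function names are illustrative and not taken from the paper.

```python
# Back-of-the-envelope tiling arithmetic using the numbers from the slides.
PROBE_LENGTH_BP = 50      # probe length
PROBE_SPACING_BP = 100    # one probe start every 100 bp
N_PROBES = 14_500_000     # total probes across the array series


def tiled_span_bp(n_probes: int, spacing_bp: int) -> int:
    """Genomic span covered when one probe starts every `spacing_bp` bases."""
    return n_probes * spacing_bp


def fraction_of_span_probed(probe_length_bp: int, spacing_bp: int) -> float:
    """Fraction of bases inside the tiled span that lie under a probe."""
    return probe_length_bp / spacing_bp


span = tiled_span_bp(N_PROBES, PROBE_SPACING_BP)
print(f"tiled span: {span / 1e9:.2f} Gb")  # tiled span: 1.45 Gb
print(f"bases under a probe: "
      f"{fraction_of_span_probed(PROBE_LENGTH_BP, PROBE_SPACING_BP):.0%}")  # 50%
```

At this spacing the probes span roughly 1.45 Gb, which fits an array set that skips genomic repeats.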
11
Kim et al. Map of active promoters
Method
• Compare IP to control DNA
• Identify stretches of 4 bound probes
• Re-check using a new array
• Computational detection of 12,150
peaks (Mpeak)
• Compare to known genes
(DBTSS, RefSeq, GenBank, EnsEMBL)
• 87% matched 5’ ends of known mRNAs
(up to 2.5Kb)
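The "stretches of 4 bound probes" step can be sketched as a run-length scan over per-probe calls. This is a minimal illustration, not the authors' pipeline: it assumes each probe has already been called bound or unbound (for example by thresholding the IP-to-control ratio, a criterion the slides do not specify), and the function name is hypothetical.

```python
from typing import List, Tuple


def bound_stretches(bound: List[bool], min_run: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) probe-index pairs, end exclusive, for every run of
    at least `min_run` consecutive bound probes."""
    stretches = []
    run_start = None
    for i, is_bound in enumerate(bound):
        if is_bound and run_start is None:
            run_start = i                      # a new run begins
        elif not is_bound and run_start is not None:
            if i - run_start >= min_run:       # run just ended; keep if long enough
                stretches.append((run_start, i))
            run_start = None
    # Handle a run that extends to the last probe.
    if run_start is not None and len(bound) - run_start >= min_run:
        stretches.append((run_start, len(bound)))
    return stretches


# Hypothetical probe calls along one chromosome, in genomic order.
calls = [False, True, True, True, True, True, False,
         True, True, False, True, True, True, True]
print(bound_stretches(calls))  # [(1, 6), (10, 14)]
```

Requiring several adjacent probes to agree is a simple guard against single-probe noise; the two-probe run in the example is discarded.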
12
Kim et al. Map of active promoters
13
Kim et al. Map of active promoters
Validation of results
• Anti-RNAP AB re-found 97% of bound
promoters
• Standard ChIP found 27/28 of randomly
selected bound promoters
• Bound promoters are enriched for
known TSS elements
• 97% of promoters had chromatin state of
active genes – H3Ac, H3K4Me
14
Kim et al. Map of active promoters
Un-annotated promoters
• 1,597 promoters are ≥ 2.5Kb from 5’ of
known genes
• 607 of them match ESTs → possible genes
• 632 of them are also bound by RNAP and in
the “right” chromatin state
– Measure mRNA expression of 567 promoters
(50bp probes at 28Kb around each gene)
– 35 new transcription units. Rest unstable?
– One located 250bp upstream of a predicted miRNA
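Splitting peaks into annotated and un-annotated promoters by the 2.5Kb rule can be sketched as a nearest-neighbour lookup. This is an assumption-laden sketch: coordinates are for a single chromosome, strand is ignored, and the function names are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

MAX_DISTANCE_BP = 2500  # matching window quoted on the slides


def nearest_tss_distance(peak: int, sorted_tss: List[int]) -> Optional[int]:
    """Distance from a peak to the closest known 5' end (None if no TSS given)."""
    if not sorted_tss:
        return None
    i = bisect_left(sorted_tss, peak)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(peak - tss) for tss in candidates)


def split_peaks(peaks: List[int], tss_positions: List[int],
                max_distance: int = MAX_DISTANCE_BP) -> Tuple[List[int], List[int]]:
    """Separate peaks lying within `max_distance` of a known 5' end from the rest."""
    sorted_tss = sorted(tss_positions)
    annotated, unannotated = [], []
    for peak in peaks:
        distance = nearest_tss_distance(peak, sorted_tss)
        if distance is not None and distance < max_distance:
            annotated.append(peak)
        else:
            unannotated.append(peak)   # at least 2.5 kb from every known 5' end
    return annotated, unannotated


# Hypothetical coordinates on one chromosome.
known_tss = [1000, 20000, 50000]
peaks = [1200, 23000, 47501, 60000]
print(split_peaks(peaks, known_tss))  # ([1200, 47501], [23000, 60000])
```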
15
Kim et al. Map of active promoters
16
Kim et al. Map of active promoters
Un-annotated promoters
• 1,239 putative promoters correspond to
novel transcription units.
– Evolutionarily conserved
– Enriched with core promoter motifs
• 1,196 outside current gene annotation
(13% of promoters)
17
Kim et al. Map of active promoters
Clusters of active genes
• 256 clusters of ≥4 active genes
(1,668 EnsEMBL genes)
• 1609 genes had multiple promoters
– Most have the same gene product
– Some have different 1st exon
– Some undergo different splicing
• All at a single cell type!
[Figure: chart with a logarithmic axis from 1 to 10,000 and categories 1 to 5]
18
Kim et al. Map of active promoters
Transcription machinery
vs.
Gene Expression
• 14,437 genes
• IMR90 human fibroblast cells
• Compare PIC occupancy to expression
19
• Classes I and IV are consistent (75% of genes)
• Class II - PIC is bound, no expression
– PIC is assembled but not sufficient for TXN
• Contain immediate response genes (stress)
– mRNA is transcribed but degraded
(miRNA targets?)
• Class III - Expressed with no bound PIC
– Test 10 random genes with ChIP (TFIID, RNAP)
– Nearly 60% were weakly bound
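The four classes amount to a 2×2 table of PIC occupancy against expression. The slide defines Classes II and III explicitly; reading Class I as "bound and expressed" and Class IV as "neither" is an assumption consistent with "Classes I and IV are consistent". A minimal sketch with hypothetical gene names:

```python
from collections import Counter
from typing import Dict, Tuple


def promoter_class(pic_bound: bool, expressed: bool) -> str:
    """Map (PIC occupancy, expression) to one of the four classes.
    Classes I and IV are assumed labels; II and III follow the slide."""
    if pic_bound and expressed:
        return "I"     # assumed: bound and expressed
    if pic_bound:
        return "II"    # PIC bound, no expression
    if expressed:
        return "III"   # expressed, no bound PIC detected
    return "IV"        # assumed: neither bound nor expressed


# Hypothetical genes: name -> (PIC bound?, expressed?)
genes: Dict[str, Tuple[bool, bool]] = {
    "geneA": (True, True),
    "geneB": (True, False),
    "geneC": (False, True),
    "geneD": (False, False),
    "geneE": (True, True),
}
counts = Counter(promoter_class(bound, expr) for bound, expr in genes.values())
print(sorted(counts.items()))  # [('I', 2), ('II', 1), ('III', 1), ('IV', 1)]
```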
20
Kim, Barrera, Ren et al. Nature (2005)
A high-resolution map of active promoters
in the human genome
• Found 12,150 bound regions (promoters)
• Many genes with multiple promoters
• 1,239 novel genes (RNA genes?)
• Clusters of active promoters (chromatin)
• Four classes of promoters
21
Kim, Barrera, Ren et al. Nature (2005)
A high-resolution map of active promoters
in the human genome
• So what have we learned?
22
Odom, Young et al. Science (2004)
Control of pancreas and liver gene
expression by HNF transcription factors
• Diabetes is bad.
• Uncover the transcriptional regulatory
network that controls insulin secretion.
• Human liver and pancreatic islets
• Use ChIP for Pol II and 3 TFs
• Measure expression of genes
23
Odom et al. HNF regulation in pancreas and liver
Background
• Transcriptional regulation in the liver
– HNF1α (homeodomain)
– HNF4α (nuclear receptor)
– HNF6 (onecut)
• Same with the pancreatic islets?
– All three are required for normal function
– Mutations → maturity-onset diabetes of the
young (MODY3, MODY1)
• Understand normal to explain abnormal
24
Odom et al. HNF regulation in pancreas and liver
MODY
• maturity-onset diabetes of the young
• Genetic disorder of the insulin-secreting
pancreatic β cells
• Onset of diabetes mellitus before 25
• Autosomal dominant pattern of inheritance
• Not to be confused with type 2 (late-onset)
diabetes
– early-onset insulin resistance
– functional defects in insulin secretion
25
Pancreas β cell
26
Hepatocyte
27
Odom et al. HNF regulation in pancreas and liver
Method
• Identify targets of three TFs in two tissues
• Identify transcribed genes (using Pol II)
• Promoter array (13K genes)
• -700bp to +200bp relative to TSS
28
Odom et al. HNF regulation in pancreas and liver
Hepatocyte targets of HNF1α
• 222 genes that represent a substantial
section of hepatocyte biochemistry
– gluconeogenesis and associated pathways
– carbohydrate synthesis and storage
– Lipid metabolism
(synthesis of cholesterol and apolipoproteins)
– Detoxification
(synthesis of cytochrome P450 monooxygenases)
– Serum proteins
(synthesis of albumin and coagulation factors).
29
Odom et al. HNF regulation in pancreas and liver
Pancreas targets of HNF1α
• 106 genes, 30% of which bound in liver
• Fewer chaperones and enzymes
• Receptors and signal transduction genes vary
• Many known targets are missing…
– Stringent criteria
– Short promoters
30
Odom et al. HNF regulation in pancreas and liver
Targets
• HNF6 binds 227 (1.3%) and 189 (1.45%),
incl. important cell-cycle regulators
• HNF4α 1575 (12%) and 1423 (11%)
– Two different ABs
– Western blots
– Standard ChIP (50)
– Other tissues (17)
– Preimmune ABs do not bind
– 80% (73%) also bound by PolII.
31
Odom et al. HNF regulation in pancreas and liver
The transcriptome
• “It is difficult to determine the transcriptome of these
tissues accurately by profiling transcript levels with
DNA microarrays.”
• What is the appropriate reference RNA?
• 2,984 (23%) are bound by Pol II in hepatocytes
• 2,426 (19%) in islets, 81% of which by both
• 80% (73%) of HNF4α targets are bound by Pol II
• The three HNFs cover many of the transcribed genes
32
Odom et al. HNF regulation in pancreas and liver
Regulatory network
• Some differences between regulation in
the two tissues
33
Regulatory network motifs
34
Odom et al. HNF regulation in pancreas and liver
Multi-component loop
• Provides capacity for feedback control and can
produce bistable systems that switch between
two alternate states [Milo et al, 2002]
• The multi-component loop of HNF1α and
HNF4α is responsible for stabilization of
the terminal phenotype in pancreatic beta
cells [Ferrer 2002]
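Finding the simplest multi-component loop (two regulators that bind each other's promoters) can be sketched as a scan over a regulator-to-targets map. In the toy network below only the HNF1A/HNF4A mutual edge reflects the slide; every other node and edge is hypothetical, and longer loops are not detected.

```python
from typing import Dict, List, Set, Tuple


def two_node_loops(network: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return sorted pairs (a, b) where a regulates b and b regulates a."""
    loops = []
    for regulator, targets in network.items():
        for target in targets:
            # `regulator < target` reports each pair once and skips self-loops.
            if regulator < target and regulator in network.get(target, set()):
                loops.append((regulator, target))
    return sorted(loops)


# Toy network: regulator -> set of bound target genes.
toy_network = {
    "HNF1A": {"HNF4A", "geneX"},   # geneX is a hypothetical target
    "HNF4A": {"HNF1A", "geneY"},   # geneY is a hypothetical target
    "TF_C": {"HNF4A"},             # TF_C is a hypothetical regulator
}
print(two_node_loops(toy_network))  # [('HNF1A', 'HNF4A')]
```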
35
Odom et al. HNF regulation in pancreas and liver
Feed-forward loop
• A feedforward loop acts as a switch, sensitive to
sustained inputs (rather than transient)
• HNF6 serves as a master regulator for feedforward motifs in hepatocytes and pancreatic
islets
• Involves >80 genes in each tissue
36
Odom et al. HNF regulation in pancreas and liver
Regulator chain motifs
• Regulator chain motifs represent the
simplest circuit logic for ordering
transcriptional events in a temporal
sequence
37
Odom et al. HNF regulation in pancreas and liver
Summary
• HNF4α binds almost half of active genes
in the liver and pancreatic islets
• Crucial for development and function of
these tissues
• Might explain why mutations can increase
the risk of type 2 diabetes
38
Boyer, Young et al. Cell (2005)
Core transcriptional regulatory circuitry
in human embryonic stem cells
• Embryonic stem cells are important
– Can be propagated in undifferentiated state
– Can differentiate into >200 unique cell types
– Great promise for regenerative medicine
• Reveal transcriptional regulatory circuitry
controlling pluripotency and self-renewal.
• Early development and cell identity are
controlled by several homeodomain TFs
39
Boyer et al. Regulation in embryonic stem cells
Background
• Early development and cell identity are
controlled by several homeodomain TFs
• OCT4, SOX2, NANOG have central roles in
maintaining the pluripotency of stem cells
• KO of each results in differentiation
• Over-expression of OCT4 ~ NANOG KO
• Why? Identify targets of each and see…
40
Boyer et al. Regulation in embryonic stem cells
Method
• Human H9 embryonic stem cells
• Agilent promoter arrays
– 60-mer probes
– Spaced at ~300bp
– Covering -8Kb to +2Kb relative to TSS
• Including 98% of TRANSFAC binding sites
(Wow!!)
– 17,917 genes
• Replicate set of ChIP assay
41
Boyer et al. Regulation in embryonic stem cells
OCT4
Analysis of peaks found:
• 623 genes (3%)
• 5 miRNAs (3%)
Many known targets:
• Mouse ES cells
• Expressed in ES
Improved protocol
• Better than Odom et al
• <1% FPR, 20% FNR
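For reference, the two error rates follow the standard definitions FPR = FP / (FP + TN) and FNR = FN / (FN + TP). The counts below are invented purely to illustrate the arithmetic; the slides do not say how the authors estimated their rates.

```python
from typing import Tuple


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) from confusion counts."""
    fpr = fp / (fp + tn)   # unbound regions wrongly called bound
    fnr = fn / (fn + tp)   # truly bound regions that were missed
    return fpr, fnr


# Hypothetical validation set: 100 true targets, 1000 true non-targets.
fpr, fnr = error_rates(tp=80, fp=5, tn=995, fn=20)
print(f"FPR = {fpr:.1%}, FNR = {fnr:.0%}")  # FPR = 0.5%, FNR = 20%
```

A low FPR paired with a higher FNR suggests a stringent threshold: few false calls, at the cost of missing some real targets.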
42
Boyer et al. Regulation in embryonic stem cells
SOX2
1271 genes (7%)
NANOG
1687 genes (9%)
43
Boyer et al. Regulation in embryonic stem cells
Binding in proximity
• Co-binding suggests that OCT4, SOX2 &
NANOG function together
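Binding "in proximity" can be sketched as a windowed lookup: for each site of one factor, check whether every other factor has a site within some distance. The window size, the coordinates and the function names below are all assumptions for illustration; the slides do not state the distance used.

```python
from bisect import bisect_left
from typing import Dict, List


def has_site_within(position: int, sorted_sites: List[int], window_bp: int) -> bool:
    """True if any site lies in [position - window_bp, position + window_bp]."""
    i = bisect_left(sorted_sites, position - window_bp)
    return i < len(sorted_sites) and sorted_sites[i] <= position + window_bp


def co_bound_sites(sites_by_tf: Dict[str, List[int]], anchor_tf: str,
                   window_bp: int) -> List[int]:
    """Sites of `anchor_tf` that have a site of every other TF within the window."""
    others = {tf: sorted(sites) for tf, sites in sites_by_tf.items()
              if tf != anchor_tf}
    return [pos for pos in sorted(sites_by_tf[anchor_tf])
            if all(has_site_within(pos, sites, window_bp)
                   for sites in others.values())]


# Hypothetical binding-site coordinates on one chromosome.
sites = {
    "OCT4": [1000, 9000, 20000],
    "SOX2": [1100, 9800, 30000],
    "NANOG": [900, 9100, 20100],
}
print(co_bound_sites(sites, anchor_tf="OCT4", window_bp=500))  # [1000]
```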
44
Boyer et al. Regulation in embryonic stem cells
Function of TFs
• Checked expression of these genes in ES cells
(published data)
• 1,303/2,260 genes are active, 957 inactive
• Of the 353 tri-bound genes, half active
• Active include TFs (OCT4, SOX2, NANOG, STAT3, ZIC3),
components of TGF-β and Wnt pathways
• Inactive genes include developmental TFs
(important for differentiation)
• Many other homeodomain TFs
45
Boyer et al. Regulation in embryonic stem cells
Putative regulatory circuitry
46
Boyer et al. Regulation in embryonic stem cells
47
Boyer, Young et al. Cell (2005)
Core transcriptional regulatory circuitry
in human embryonic stem cells
• So what have we learned?
48