Supplementary material
Wilczynski & Furlong
Supplementary Figures:
Supple. Figure 1: Spatio-temporal expression patterns of Twist, Tinman, Mef2 and
Biniou transcription factors
Embryo immunostains showing the protein expression of the four transcription factors under
study. The ChIP time-course used for Twist and Tinman contained three developmental time
windows (indicated by the blue rectangle): 2-4 hrs (stages 5-7), 4-6 hrs (stages 8-9) and 6-8 hrs
(stages 10-11). The ChIP time-course examined for Mef2 and Biniou (indicated by the
blue rectangle) consisted of: 6-8 hrs (stages 10-11), 8-10 hrs (stage 12) and 10-12 hrs (stage
13). A representative stage is shown for each time-window. For a given TF, the confocal
pictures of all stages were imaged with the same laser and gain settings. Note that although
the expression pattern of a TF will naturally change as development progresses, there are no
dramatic decreases or increases in the overall level of the TF’s expression during the length
of the time-course under study.
Supple. Figure 2: Dynamic CRM occupancy of four TFs during embryonic
development
The percentage of CRMs exhibiting early (green), continuous (yellow) and late (red) binding
patterns is very similar for different factors, with approximately 50% of CRMs being
transiently bound at either early or late stages of development. As Mef2 binding was
measured over five consecutive time-points, the data was divided into two sets, with three
time-points each to more readily compare to the time-courses of the other TFs. Temporal
occupancy does not show any significant skew towards a specific time-point for different
factors, ruling out a general bias caused by a weaker binding signal at one time-point (this is
further confirmed in Supple. Fig. 3 and 5).
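
The early, continuous and late labels used above can be illustrated with a small sketch. It assumes each CRM already has a bound/not-bound call per time window; the classification rule, the function names and the example calls are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter

def temporal_class(bound):
    """Classify a CRM from per-time-point binding calls (hypothetical rule).

    bound: sequence of booleans, ordered from earliest to latest time window.
    """
    if not any(bound):
        return "not bound"
    if all(bound):
        return "continuous"
    if bound[0] and not bound[-1]:
        return "early"
    if bound[-1] and not bound[0]:
        return "late"
    # Bound only in a middle window, or at both ends but not in between.
    return "other"

def class_percentages(calls):
    """Percentage of bound CRMs that fall into each temporal class."""
    classes = [temporal_class(c) for c in calls]
    bound = [c for c in classes if c != "not bound"]
    if not bound:
        return {}
    counts = Counter(bound)
    return {name: 100.0 * n / len(bound) for name, n in counts.items()}

# Made-up binding calls for four CRMs over three time windows.
example_calls = [
    (True, True, False),   # early
    (True, True, True),    # continuous
    (False, True, True),   # late
    (True, True, True),    # continuous
]
percentages = class_percentages(example_calls)
```

With real data, the binding calls would come from thresholding the ChIP signal at each time-point; that threshold is not specified here and would need to be chosen separately.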
Supple. Figure 3: Quantitative ChIP signal for all TFs in temporally bound CRM
classes. Y-axis corresponds to the log2 ChIP signal for different TFs (quantile normalized).
X-axis represents developmental time in hours. Box-plot colors indicate temporal TF-binding
classes (green: early, yellow: continuous, red: late). Temporal classes show very
different distributions of signals between early and late time-points. For example, there is
very little TF occupancy (ChIP signal) on early Twist-bound CRMs (green) at 6-8h
compared to 2-4h, while the converse is true for late bound CRMs (red). Continuously
bound CRMs (yellow) exhibit high levels of TF binding (ChIP signal) at all time-points.
Data from Supple Table 1. A similar trend is shown at an individual CRM level in the scatter
plots in Supple Fig. 5. Here, the distribution of TF binding signal of green dots (early bound
CRMs) is shifted from high in early time-points to low in the late time-points. The converse
trend is observed for the red dots (late bound CRMs), which have low binding signal in early
time-points, and high binding signal at the late time-points.
Supple. Figure 4: There are no significant differences in the enrichment of the TF’s
motif between the temporally bound classes of CRMs
(a) Motif enrichment using optimized motifs from the entire data set (Zinzen, Girardot,
Gagneur et al) and (b) newly determined motifs from each temporal class of CRMs.
Y-Axis: percentage of CRMs containing a motif above the PATSER threshold. X-Axis:
CRM occupancy by the TF indicated. Colors correspond to different binding classes (not
bound: grey, early: green, continuous: yellow, late: red). Approximately 85% of CRMs
bound by Twist, for example, contain a Twist motif, regardless of the timing of Twist
binding.
(b) Novel motifs derived (using the NestedMICA tool) from the early-bound (green rectangle) and
late-bound (red rectangle) CRM sets show equal preference for early and late CRMs,
indicated by the green and red bars, respectively. Grey bars correspond to the percentage of
background sequences containing the respective motifs. Note that the main variant of the
Twist motif identified in the early Twist-bound CRMs is equally enriched in early- and
late-bound regions.
Supple. Figure 5: Correlation between strength of TF binding and motif quality
Scatter plots showing dependence of ChIP signal for different TFs (Twist, Tinman, Biniou, and
Mef2) at different time points. Y-axis represents the strength of the ChIP signal (log2) for the
indicated TF (Supple Table 1), X-axis represents the motif score using the best motif (log-odds).
Each circle corresponds to a single CRM. CRMs bound by the corresponding TF are labeled green,
yellow or red depending on the temporal binding class (early, continuous or late respectively). Blue
indicates CRMs that are clearly not bound by the TF (the best signal below 0 and therefore depleted
in the IP). Interestingly, while bound regions are depleted for low-quality motifs, there is a portion of
CRMs with no enrichment for TF binding although they contain a strong motif(s) for their respective
TF (blue dots).
Pearson correlation coefficients for bound (all) regions: Twist 2-4h R=0.172 (0.33); Twist 4-6h
R=0.192 (0.36); Twist 6-8h R=0.243 (0.35); Tinman 2-4h R=0.381 (0.44); Tinman 4-6h R=0.419
(0.44); Tinman 6-8h R=0.346 (0.41); Biniou 6-8h R=0.198 (0.39); Biniou 8-10h R=0.081 (0.34);
Biniou 10-12h R=0.188 (0.38); Mef2 6-8h R=0.155 (0.29); Mef2 8-10h R=0.178 (0.37); Mef2 10-12h R=0.112 (0.37)
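
As an illustration of how coefficients of this kind are obtained, the sketch below computes a Pearson correlation between motif scores and ChIP signals, once over all CRMs and once over bound CRMs only. The values, the `is_bound` flags and the function name are made up for the example and do not reproduce the reported numbers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)

# Made-up values: motif log-odds scores and log2 ChIP signals for five CRMs.
motif_scores = [2.0, 4.5, 6.0, 7.5, 9.0]
chip_signals = [0.3, 1.1, 0.8, 1.9, 1.5]
is_bound = [False, True, True, True, True]

# Correlation over all CRMs, then restricted to bound CRMs.
r_all = pearson_r(motif_scores, chip_signals)
r_bound = pearson_r(
    [m for m, b in zip(motif_scores, is_bound) if b],
    [c for c, b in zip(chip_signals, is_bound) if b],
)
```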
Supple. Figure 6: Correlation between strength of TF binding and motif quality, using a
thermodynamic model of the CRM’s overall motif(s) quality
Scatter plots showing dependence of ChIP signal for different TFs (Twist, Tinman, Biniou, and
Mef2) at different time points. Y-axis represents the strength of the ChIP signal (log2) for the
indicated TF (Supple Table 1), X-axis represents the TRAP score using a quasi-thermodynamic
model (log2). Each circle corresponds to a single CRM. CRMs bound by the corresponding TF are
labeled green, yellow or red depending on the temporal binding class (early, continuous or late
respectively). Blue indicates CRMs that are clearly not bound by the TF (the best signal below 0 and
therefore depleted in the IP), yet many contain a strong motif or collection of motifs giving rise to a
strong TRAP score for their respective TF (blue dots).
Pearson correlation coefficients for bound (all) regions: Twist 2-4h R=0.075 (0.39); Twist 4-6h
R=0.111 (0.39); Twist 6-8h R=0.162 (0.43); Tinman 2-4h R=0.268 (0.45); Tinman 4-6h R=0.297
(0.45); Tinman 6-8h R=0.240 (0.42); Biniou 6-8h R=0.332 (0.56); Biniou 8-10h R=0.189 (0.51);
Biniou 10-12h R=0.245 (0.54); Mef2 6-8h R=0.215 (0.30); Mef2 8-10h R=0.336 (0.40); Mef2 10-12h R=0.221 (0.37)
Supple. Figure 7: Comparison of the percentage of CRMs co-bound by other TFs in
different temporal classes
The bar height corresponds to the percentage of CRMs in a given category co-bound by
another TF at the time the TF in question is not bound. For example, looking at the
occupancy of Twist-early bound CRMs at a late time point (6-8 h), the first blue bar
indicates that less than 25% of CRMs bound by Twist at early stages (Twi E) are co-bound
by other TFs at 6-8h (the time that Twist “unbinds”), while almost 70% of CRMs bound by
Twist continuously (orange bar) are co-bound by other TFs at 6-8 hrs. Similarly, the second
pair of bar graphs indicates the percentage of CRMs bound by Twist at late stages (Twi L)
that are also bound by other TFs at 2-4h. Transiently bound CRMs show >2-fold depletion in
co-binding (or un-binding) of other TFs, in comparison to continuously bound CRMs,
indicating coordinated temporal occupancy.
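
The bar heights described above are percentages of co-bound CRMs within a class. A minimal sketch of that calculation follows, assuming binding calls are stored per CRM as a mapping from (TF, time-point) to a boolean; the data layout, function name and example calls are hypothetical.

```python
def cobound_percentage(crms, tf, timepoint):
    """Percentage of CRMs bound by at least one TF other than `tf` at `timepoint`.

    crms: list of dicts mapping (tf_name, timepoint) -> bool binding call.
    """
    if not crms:
        return 0.0
    cobound = sum(
        1
        for crm in crms
        if any(
            bound and name != tf and tp == timepoint
            for (name, tp), bound in crm.items()
        )
    )
    return 100.0 * cobound / len(crms)

# Made-up binding calls for four CRMs that Twist binds early but not at 6-8h.
twist_early_crms = [
    {("Twist", "6-8h"): False, ("Tinman", "6-8h"): True},
    {("Twist", "6-8h"): False, ("Tinman", "6-8h"): False, ("Mef2", "6-8h"): False},
    {("Twist", "6-8h"): False, ("Mef2", "6-8h"): False},
    {("Twist", "6-8h"): False, ("Biniou", "6-8h"): False},
]
pct = cobound_percentage(twist_early_crms, "Twist", "6-8h")
```

Running the same function on the continuously bound class and comparing the two percentages gives the fold depletion quoted in the legend.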
Supple. Figure 8: CRM occupancy correlates with the timing of target gene expression
Graphs display percentages of associated target genes that are expressed at specific times in
development and have a CRM bound at either early (green) or late (red) stages of
development. (a) The correlation, for each TF separately, between the timing of TF binding and
the timing of its target genes’ expression: the target genes for each TF were divided into early
or late based on the timing of TF occupancy at the associated CRM. The graph shows the first time point
of expression for the nearest target gene using the BDGP in-situ database. The green line
represents genes with early binding, red line indicates the expression timing of genes with
late CRM binding. Approximately 35% of Twist early-bound CRMs are expressed at early
stages of development, for example, while ~45% of late Twist-bound CRMs drive
expression at the equivalent stages (stages 10-11). Interestingly, most Twist target genes are
not expressed at late developmental stages (stages 13-16), when Twist is no
longer expressed. (b) A microarray-based time-course of gene expression from the BDGP
database was used as a second independent source of expression timing for developmental
genes. For each temporal binding class, the percentage of genes with an expression peak
earlier than the mid time-point measured for TF binding is displayed (Twist, Tinman: 4h; Mef2, Biniou:
8h). All TFs, with the exception of Tinman, exhibit significant enrichment of target genes
with early expression that have early-bound CRMs as compared to the control late class
(Fisher exact test, p<0.05). The fact that the timing of Tinman binding is not highly
correlated with the timing of its target genes’ expression most likely reflects the importance
of co-operativity with other TFs for Tinman function, as previously reported (e.g. Hsiu-Hsiang
Lee and Manfred Frasch, 2005).
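
For readers unfamiliar with the test used in panel (b), the sketch below computes a one-sided Fisher exact p-value for a 2x2 table of early-bound versus late-bound genes against early versus late expression. The legend does not state whether a one- or two-sided test was used, so the one-sided form is an assumption, and the counts are made up.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of at least `a` counts in the top-left cell.
    """
    row1 = a + b          # e.g. genes with early-bound CRMs
    col1 = a + c          # e.g. genes with an early expression peak
    n = a + b + c + d     # all genes in the table
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)

# Made-up counts (rows: early-bound, late-bound; columns: early peak, late peak).
p_value = fisher_exact_greater(30, 20, 15, 35)
significant = p_value < 0.05
```

The function uses only exact integer arithmetic until the final division, which keeps it reliable for the moderate gene counts typical of such comparisons.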
Supple. Figure 9: Transcription factor binding is highly correlated with the timing of
target gene expression, even for CRMs only bound at a single time-point
CRMs bound only at a single time-point were linked to the closest transcriptional start site.
To avoid confounding effects of genes expressed in multiple tissues, only genes that are
expressed in mesoderm and/or muscle in the BDGP in-situ database were included. The
target genes were divided into three groups (2-4h, 6-8h, 10-12h) based on the timing of their
CRM occupancy. The green line in the graph shows the percentage of genes with CRMs
bound at 2-4h, indicating that 40% initiate expression at 2-3.5h, while only ~10% are
expressed at 9.5-16h. Similarly, ~12% of genes with CRMs bound at 10-12h initiate
expression at 2-3.5h (red line), while ~30% of genes with CRMs bound at 10-12h initiate
expression at 9.5-16h. Yellow line: ~75% of genes with CRMs bound at mid embryogenesis
(6-8hr) initiate expression at either 3.5-6h or 6-9.5hr of development.