Supplementary material Wilczynski & Furlong

Supplementary Figures:

Supple. Figure 1: Spatio-temporal expression patterns of Twist, Tinman, Mef2 and Biniou transcription factors
Embryo immunostains showing the protein expression of the four transcription factors under study. The ChIP time-course used for Twist and Tinman contained three developmental time windows (indicated by the blue rectangle): 2-4 hrs (stages 5-7), 4-6 hrs (stages 8-9) and 6-8 hrs (stages 10-11). The ChIP time-course examined for Mef2 and Biniou (indicated by the blue rectangle) consisted of: 6-8 hrs (stages 10-11), 8-10 hrs (stage 12) and 10-12 hrs (stage 13). A representative stage is shown for each time-window. For a given TF, the confocal pictures of all stages were imaged with the same laser and gain settings. Note that although the expression pattern of a TF will naturally change as development progresses, there are no dramatic decreases or increases in the overall level of the TF’s expression during the length of the time-course under study.

Supple. Figure 2: Dynamic CRM occupancy of four TFs during embryonic development
The percentage of CRMs exhibiting early (green), continuous (yellow) and late (red) binding patterns is very similar for different factors, with approximately 50% of CRMs being transiently bound at either early or late stages of development. As Mef2 binding was measured over five consecutive time-points, the data were divided into two sets of three time-points each, to compare more readily with the time-courses of the other TFs. Temporal occupancy does not show any significant skew towards a specific time-point for different factors, ruling out a general bias caused by a weaker binding signal at one time-point (this is further confirmed in Supple. Fig. 3 and 5).

Supple. Figure 3: Quantitative ChIP signal for all TFs in temporally bound CRM classes.
Y-axis corresponds to the log2 ChIP signal for different TFs (quantile normalized). X-axis represents developmental time in hours. Box-plot colors indicate temporal TF-binding classes (green: early, yellow: continuous, red: late). Temporal classes show very different distributions of signals between early and late time-points. For example, there is very little TF occupancy (ChIP signal) on early Twist-bound CRMs (green) at 6-8h compared to 2-4h, while the converse is true for late-bound CRMs (red). Continuously bound CRMs (yellow) exhibit high levels of TF binding (ChIP signal) at all time-points. Data from Supple Table 1. A similar trend is shown at an individual CRM level in the scatter plots in Supple Fig. 5. Here, the distribution of TF binding signal of green dots (early-bound CRMs) is shifted from high in early time-points to low in the late time-points. The converse trend is observed for the red dots (late-bound CRMs), which have low binding signal in early time-points, and high binding signal at the late time-points.

Supple. Figure 4: There are no significant differences in the enrichment of the TF’s motif between the temporally bound classes of CRMs
(a) Motif enrichment using optimized motifs from the entire data set (Zinzen, Girardot, Gagneur et al) and (b) newly determined motifs from each temporal class of CRMs. Y-axis: percentage of CRMs containing a motif above the PATSER threshold. X-axis: CRM occupancy by the TF indicated. Colors correspond to different binding classes (not bound: grey, early: green, continuous: yellow, late: red). Approximately 85% of CRMs bound by Twist, for example, contain a Twist motif, regardless of the timing of Twist binding. (b) Novel motifs derived (using the NestedMICA tool) from early-bound (green rectangle) and late-bound (red rectangle) CRM sets show equal preference for early and late CRMs, indicated by the green and red bars, respectively.
Grey bars correspond to the percentage of background sequences containing the respective motifs. Note that the main variant of the Twist motif identified in the early Twist-bound CRMs is equally enriched in early- and late-bound regions.

Supple. Figure 5: Correlation between strength of TF binding and motif quality
Scatter plots showing dependence of ChIP signal on motif score for different TFs (Twist, Tinman, Biniou, and Mef2) at different time points. Y-axis represents the strength of the ChIP signal (log2) for the indicated TF (Supple Table 1), X-axis represents the motif score using the best motif (log-odds). Each circle corresponds to a single CRM. CRMs bound by the corresponding TF are labeled green, yellow or red depending on the temporal binding class (early, continuous or late respectively). Blue indicates CRMs that are clearly not bound by the TF (the best signal below 0 and therefore depleted in the IP). Interestingly, while bound regions are depleted for low-quality motifs, there is a portion of CRMs with no enrichment for TF binding although they contain a strong motif(s) for their respective TF (blue dots). Pearson correlation coefficients for bound (all) regions: Twist 2-4h R=0.172 (0.33); Twist 4-6h R=0.192 (0.36); Twist 6-8h R=0.243 (0.35); Tinman 2-4h R=0.381 (0.44); Tinman 4-6h R=0.419 (0.44); Tinman 6-8h R=0.346 (0.41); Biniou 6-8h R=0.198 (0.39); Biniou 8-10h R=0.081 (0.34); Biniou 10-12h R=0.188 (0.38); Mef2 6-8h R=0.155 (0.29); Mef2 8-10h R=0.178 (0.37); Mef2 10-12h R=0.112 (0.37)

Supple. Figure 6: Correlation between strength of TF binding and motif quality, using a thermodynamic model of the CRM’s overall motif(s) quality
Scatter plots showing dependence of ChIP signal on TRAP score for different TFs (Twist, Tinman, Biniou, and Mef2) at different time points.
Y-axis represents the strength of the ChIP signal (log2) for the indicated TF (Supple Table 1), X-axis represents the TRAP score using a quasi-thermodynamic model (log2). Each circle corresponds to a single CRM. CRMs bound by the corresponding TF are labeled green, yellow or red depending on the temporal binding class (early, continuous or late respectively). Blue indicates CRMs that are clearly not bound by the TF (the best signal below 0 and therefore depleted in the IP), yet many contain a strong motif or collection of motifs giving rise to a strong TRAP score for their respective TF (blue dots). Pearson correlation coefficients for bound (all) regions: Twist 2-4h R=0.075 (0.39); Twist 4-6h R=0.111 (0.39); Twist 6-8h R=0.162 (0.43); Tinman 2-4h R=0.268 (0.45); Tinman 4-6h R=0.297 (0.45); Tinman 6-8h R=0.240 (0.42); Biniou 6-8h R=0.332 (0.56); Biniou 8-10h R=0.189 (0.51); Biniou 10-12h R=0.245 (0.54); Mef2 6-8h R=0.215 (0.30); Mef2 8-10h R=0.336 (0.40); Mef2 10-12h R=0.221 (0.37)

Supple. Figure 7: Comparison of the percentage of CRMs co-bound by other TFs in different temporal classes
The bar height corresponds to the percentage of CRMs in a given category co-bound by another TF at the time the TF in question is not bound. For example, looking at the occupancy of Twist early-bound CRMs at a late time point (6-8h), the first blue bar indicates that less than 25% of CRMs bound by Twist at early stages (Twi E) are co-bound by other TFs at 6-8h (the time that Twist “unbinds”), while almost 70% of CRMs bound by Twist continuously (orange bar) are co-bound by other TFs at 6-8h. Similarly, the second pair of bars indicates the percentage of CRMs bound by Twist at late stages (Twi L) that are also bound by other TFs at 2-4h. Transiently bound CRMs show >2-fold depletion in co-binding (or un-binding) of other TFs, in comparison to continuously bound CRMs, indicating coordinated temporal occupancy.
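
The Pearson coefficients listed for Supple. Figures 5 and 6 are reported twice per TF and time point: once for bound CRMs and once, in parentheses, for all CRMs. The following is a minimal sketch of how such a pair of values could be computed. It is not the authors' code: the function names, the class label `not_bound` and the example values are all hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_bound_and_all(crms):
    """Return (R for bound CRMs, R for all CRMs).

    Each CRM is a tuple (motif_score, log2_chip_signal, temporal_class).
    The label "not_bound" is an assumption made for this sketch.
    """
    bound = [(score, chip) for score, chip, cls in crms if cls != "not_bound"]
    r_bound = pearson_r([s for s, _ in bound], [c for _, c in bound])
    r_all = pearson_r([s for s, _, _ in crms], [c for _, c, _ in crms])
    return r_bound, r_all


# Hypothetical values, NOT taken from Supple Table 1.
example_crms = [
    (4.1, 2.3, "early"),
    (6.0, 2.9, "continuous"),
    (3.2, 1.1, "late"),
    (5.5, 1.8, "continuous"),
    (2.0, -0.6, "not_bound"),
    (6.3, -0.4, "not_bound"),
    (1.1, -1.2, "not_bound"),
]

r_bound, r_all = correlation_bound_and_all(example_crms)
print(f"R bound = {r_bound:.3f}, R all = {r_all:.3f}")
```

Computing both values from the same table makes it easy to see how the unbound CRMs with strong motifs (the blue dots) affect the coefficient.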
Supple. Figure 8: CRM occupancy correlates with the timing of target gene expression
Graphs display percentages of associated target genes that are expressed at specific times in development and have a CRM bound at either early (green) or late (red) stages of development. (a) The correlation between the binding timing of each TF, considered separately, and the timing of its target genes’ expression: the target genes for each TF were divided into early or late based on the timing of TF occupancy of the associated CRM. The graph shows the first time point of expression for the nearest target gene using the BDGP in-situ database. The green line represents genes with early binding; the red line indicates the expression timing of genes with late CRM binding. Approximately 35% of Twist early-bound CRMs are expressed at early stages of development, for example, while ~45% of late Twist-bound CRMs drive expression at the equivalent stages (stages 10-11). Interestingly, most Twist target genes are not expressed at late developmental stages (stages 13-16), which are stages where Twist is no longer expressed. (b) A microarray-based time-course of gene expression from the BDGP database was used as a second independent source of expression timing for developmental genes. For each temporal binding class, the percentage of genes with an expression peak earlier than the mid time-point measured for TF binding is displayed (Twi, Tin: 4h; Mef2, Bin: 8h). All TFs, with the exception of Tinman, exhibit significant enrichment of target genes with early expression that have early-bound CRMs as compared to the control late class (Fisher exact test, p<0.05). The fact that the timing of Tinman binding is not highly correlated with the timing of its target genes’ expression most likely reflects the importance of co-operativity with other TFs for Tinman function, as previously reported (e.g. Hsiu-Hsiang Lee and Manfred Frasch, 2005).
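
The enrichment test mentioned for panel (b) compares two temporal binding classes on a 2x2 table (early-expressed versus not, early-bound versus late-bound). The sketch below shows one way a Fisher exact test on such a table can be computed with the standard library only. The one-sided ("greater") alternative is an assumption, since the caption does not state which alternative was used, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test p-value for the 2x2 table

                        early expression   other
        early-bound            a             b
        late-bound             c             d

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` early-expressed genes in the early-bound class
    when the row and column totals are fixed.
    """
    row1 = a + b          # genes with early-bound CRMs
    row2 = c + d          # genes with late-bound CRMs
    col1 = a + c          # genes with early expression
    total_tables = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    # math.comb(n, k) returns 0 when k > n, so impossible tables add nothing.
    favourable = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1)
    )
    return favourable / total_tables


# Hypothetical counts, NOT taken from the supplement.
p_value = fisher_exact_greater(30, 20, 15, 35)
print(f"one-sided p = {p_value:.4g}")
```

For real analyses an established implementation such as `scipy.stats.fisher_exact` is usually preferable. The hand-written version is shown here only to make the underlying hypergeometric sum explicit.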
Supple. Figure 9: Transcription factor binding is highly correlated with the timing of target gene expression, even for CRMs only bound at a single time-point
CRMs bound only at a single time-point were linked to the closest transcriptional start site. To avoid confounding effects of genes expressed in multiple tissues, only genes that are expressed in mesoderm and/or muscle in the BDGP in-situ database were included. The target genes were divided into three groups (2-4h, 6-8h, 10-12h) based on the timing of their CRM occupancy. The green line in the graph shows the percentage of genes with CRMs bound at 2-4h, indicating that 40% initiate expression at 2-3.5h, while only ~10% are expressed at 9.5-16h. Similarly, ~12% of genes with CRMs bound at 10-12h initiate expression at 2-3.5h (red line), while ~30% of genes with CRMs bound at 10-12h initiate expression at 9.5-16h. Yellow line: ~75% of genes with CRMs bound at mid embryogenesis (6-8h) initiate expression at either 3.5-6h or 6-9.5h of development.
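
The percentages plotted in Supple. Figure 9 amount to a per-group tabulation: for each CRM-binding group, the share of genes that initiate expression in each time window. A minimal sketch of that tabulation follows. The window labels are taken from the caption, but the function name, the input layout and the example records are hypothetical.

```python
from collections import Counter, defaultdict

# Expression-onset windows named in the caption above.
EXPRESSION_WINDOWS = ["2-3.5h", "3.5-6h", "6-9.5h", "9.5-16h"]


def onset_percentages(genes):
    """Percentage of genes per expression-onset window, for each binding group.

    `genes` is an iterable of (binding_group, onset_window) pairs, one per
    target gene, e.g. ("2-4h", "2-3.5h").
    """
    counts = defaultdict(Counter)
    for binding_group, onset_window in genes:
        if onset_window not in EXPRESSION_WINDOWS:
            raise ValueError(f"unknown expression window: {onset_window!r}")
        counts[binding_group][onset_window] += 1

    percentages = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        percentages[group] = {
            window: 100.0 * counter[window] / total
            for window in EXPRESSION_WINDOWS
        }
    return percentages


# Hypothetical records, NOT taken from the supplement.
example_genes = [
    ("2-4h", "2-3.5h"),
    ("2-4h", "2-3.5h"),
    ("2-4h", "3.5-6h"),
    ("2-4h", "9.5-16h"),
    ("10-12h", "6-9.5h"),
    ("10-12h", "9.5-16h"),
]

for group, shares in onset_percentages(example_genes).items():
    print(group, shares)
```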