* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download ChromatinDB: a database of genome-wide
Metagenomics wikipedia , lookup
Behavioral epigenetics wikipedia , lookup
Primary transcript wikipedia , lookup
Neocentromere wikipedia , lookup
Epitranscriptome wikipedia , lookup
Long non-coding RNA wikipedia , lookup
Designer baby wikipedia , lookup
Epigenetics of depression wikipedia , lookup
Epigenetics of diabetes Type 2 wikipedia , lookup
Epigenetics wikipedia , lookup
Cancer epigenetics wikipedia , lookup
Epigenetics in stem-cell differentiation wikipedia , lookup
Epigenomics wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
BIOINFORMATICS APPLICATIONS NOTE Vol. 23 no. 14 2007, pages 1828–1830 doi:10.1093/bioinformatics/btm236 Genome analysis ChromatinDB: a database of genome-wide histone modification patterns for Saccharomyces cerevisiae Timothy R. O’Connor and John J. Wyrick* School of Molecular Biosciences and Center for Reproductive Biology, Washington State University, Pullman, WA, USA Received on February 22, 2007; revised on April 11, 2007; accepted on April 26, 2007 Advance Access publication May 7, 2007 Associate Editor: Chris Stoeckert ABSTRACT Summary: Covalent modifications to histone proteins play a critical role in regulating gene transcription. Previous studies have used chromatin immunoprecipitation (ChIP) based microarray assays to profile genomic regions that are enriched or depleted for a particular histone modification. Such studies have been conducted extensively in the yeast Saccharomyces cerevisiae, but currently no comprehensive data repositories or analysis tools are available for these data sets. For this reason, we have developed the ChromatinDB database, which contains genome-wide ChIP data for 22 different histones or histone modifications in S.cerevisiae. ChromatinDB provides novel tools to facilitate the visualization and statistical analysis of chromatin features for user-selected gene sets. Availability: http://www.bioinformatics2.wsu.edu/ChromatinDB Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. websites, and few software tools are available to visualize and analyze these data. A number of databases have been previously developed to curate information about chromatin factors. For example, the Histone Database contains extensive sequence information for histone proteins and proteins containing histone-fold domains (Marino-Ramirez et al., 2006). At least two databases (CREMOFAC and ChromDB) contain protein sequences and information about chromatin remodeling factors (Shipra et al., 2006; and www.chromdb.org). However, no comparable resource exists for the analysis of ChIP-microarray data for histones or histone modifications. To remedy this deficiency, we have developed a database called ChromatinDB that provides a centralized repository for these data sets. In addition, ChromatinDB contains a number of tools to facilitate the visualization and statistical analysis of these data. 2 1 INTRODUCTION Complex patterns of gene expression are observed in developing eukaryotic organisms. Recent studies have shown that post-translational modifications to histone proteins (e.g. histone acetylation, methylation or phosphorylation) play a critical role in regulating gene expression patterns (Millar and Grunstein, 2006). Histone modifications are directed to specific regions of the genome by histone modifying enzymes, and can act to both repress and activate gene transcription. Recently, a new experimental methodology coupling chromatin immunoprecipitation (ChIP) assays with DNA microarray technology has been used to profile the DNA regions associated with distinct types of modified histone proteins. Almost 20 distinct histone modifications have been profiled in this manner in yeast alone (Millar and Grunstein, 2006). These data provide a genome-wide map of genomic regions that are enriched or depleted of each type of histone modification. Also available are data sets measuring the relative association of the histone proteins themselves (histone H2A, H2A.Z, H2B, H3 and H4) with DNA. Unfortunately, all of these valuable data sets are fragmented among a variety of different databases and *To whom correspondence should be addressed. 1828 DATABASE IMPLEMENTATION The ChromatinDB database and accompanying web portal were implemented using the MySQL DBMS, custom Perl scripts and dynamically generated web pages. ChromatinDB contains genome-wide ChIP data for 22 different histones or histone modifications, which were obtained from various published studies (see Supplementary Materials section for more details). These data were first filtered and normalized prior to uploading the data into ChromatinDB (see Supplementary Materials). 2.1 Statistical analysis A custom Perl module was written to identify histones or histone modifications that are enriched or depleted in a userselected set of promoter regions or open reading frames (ORFs). This Perl module employed a Wilcoxon rank sum test to determine if there was a bias in the distribution of the ranks of the user-selected genes for any of the histone or histone modification data sets. The significance of this bias (P-value) was estimated using the normal approximation for the Wilcoxon rank sum test. Significance values are only reported by ChromatinDB when five or more genes with data are selected, as at least this number of data points is needed for an accurate significance estimate using the normal distribution. ß The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected] ChromatinDB Fig. 1. (A) Graphical display of histone and histone modification patterns for promoter regions bound by the Gcn5 histone acetyltransferase. ChromatinDB was used to display ChIP-microarray data for 135 selected promoter regions. The log base-2 of the average enrichment ratio for each of 22 histone modification data sets was plotted. Nucleosome normalized data is shown. Acetylated or methylated lysine residues are indicated; occupancy refers to the association of the histone protein with the DNA region. (B) Summary of enriched and depleted histone modifications in the promoter regions bound by Gcn5. ChromatinDB was used to identify histone data sets that were significantly enriched or depleted for the 135 selected promoter regions. The average rank percentile, the data trend (e.g. enriched or depleted) and the corresponding significance (P) values are displayed for each histone modification. 3 RESULTS AND FUNCTIONALITY ChromatinDB provides the user with easy access to ChIP-microarray data for a large set of histones or histone modifications in Saccharomyces cerevisiae. The principal means for accessing these data is through the chromatin visualization tool. In the visualization tool’s gene selection page the user can enter any number of gene names or accessions (e.g. TAH11 or YJR046W), and specify which type of chromatin data to display (e.g. histone acetylation). Additional selection options allow the user to specify whether to display standard or nucleosome-normalized data; whether to display data for promoter regions or ORFs; and the significance criteria used to determine whether histone modifications are enriched or depleted in the user selected gene set. 1829 T.R.O’Connor and J.J.Wyrick ChromatinDB was used to visualize histone modification data for 135 yeast promoters that are targeted by the Gcn5 histone acetyltransferase. These genes were chosen as each were significantly bound by Gcn5 (P50.01) based on a previous ChIP microarray study (Robert et al., 2004). A portion of the resulting web page is shown in Figure 1. The graph shown in Figure 1A displays, for these 135 promoter regions, the average nucleosome-normalized log2 binding ratio for each histone or histone modification data set. As shown in Figure 1B, the web page also identifies which histones or histone modifications are significantly enriched or depleted in these promoter regions. This analysis was performed using P-value threshold of 1 10 6, and a Bonferroni correction for multiple hypothesis testing. In addition, the resulting web page shows the average mRNA levels and transcription frequency of the selected genes (Supplementary Fig. S1). For the Gcn5-bound promoter regions, we find significant enrichment for acetylation of each of the lysine residues in histone H3. These results indicate that Gcn5 binding is correlated with histone H3 acetylation, presumably because these residues are acetylated by Gcn5 or because Gcn5 binds to these acetylated lysine residues through its bromodomain. This result is in accord with previous studies of Gcn5 substrate specificity (Roth et al., 2001; Suka et al., 2001). In contrast, selection of a random set of 135 yeast promoter regions did not reveal a significant enrichment or depletion of any of the histone modification data sets (Supplementary Fig. S2). Intriguingly, we observe that acetylation of H2AZ K14 is strongly enriched in the Gcn5-bound promoters (Fig. 1). This result is in accordance with a previous study, which indicated that Gcn5 might acetylate H2AZ K14 (Babiarz et al., 2006). It is not clear why Gcn5 binding is significantly associated with histone H4 N-terminal acetylation, as Gcn5 has not been previously reported to be associated with this histone 1830 modification. Finally, the occupancy of all the core histone proteins is significantly depleted in these promoter regions (Fig. 1), suggesting that Gcn5 binding is correlated with nucleosome depletion at promoter regions. ChromatinDB also provides a Genome Display tool, which enables the user to analyze histone modification patterns based on their location relative to chromosome features, such as a telomere or centromere, or based on their chromosome coordinates. Supplementary Figure S3 shows the genome display of histone modification patterns located from 0 to 10 kb of a telomere. ACKNOWLEDGEMENT This work was supported, in part, by American Cancer Society grant RSG-03-181-01-GMC. Conflict of Interest: none declared. REFERENCES Babiarz,J.E. et al. (2006) Telomeric heterochromatin boundaries require NuA4dependent acetylation of histone variant H2A.Z in Saccharomyces cerevisiae. Genes Dev., 20, 700–710. Marino-Ramirez,L. et al. (2006) The Histone Database: a comprehensive resource for histones and histone fold-containing proteins. Proteins, 62, 838–842. Millar,C.B. and Grunstein,M. (2006) Genome-wide patterns of histone modifications in yeast. Nat. Rev. Mol. Cell Biol., 7, 657–666. Robert,F. et al. (2004) Global position and recruitment of HATs and HDACs in the yeast genome. Mol. Cell, 16, 199–209. Roth,S.Y. et al. (2001) Histone acetyltransferases. Annu. Rev. Biochem., 70, 81–120. Shipra,A. et al. (2006) CREMOFAC – a database of chromatin remodeling factors. Bioinformatics, 22, 2940–2944. Suka,N. et al. (2001) Highly specific antibodies determine histone acetylation site usage in yeast heterochromatin and euchromatin. Mol. Cell, 8, 473–479.