BIOINFORMATICS APPLICATIONS NOTE
Vol. 23 no. 14 2007, pages 1828–1830
doi:10.1093/bioinformatics/btm236
Genome analysis
ChromatinDB: a database of genome-wide histone modification
patterns for Saccharomyces cerevisiae
Timothy R. O’Connor and John J. Wyrick*
School of Molecular Biosciences and Center for Reproductive Biology, Washington State University, Pullman,
WA, USA
Received on February 22, 2007; revised on April 11, 2007; accepted on April 26, 2007
Advance Access publication May 7, 2007
Associate Editor: Chris Stoeckert
ABSTRACT
Summary: Covalent modifications to histone proteins play a critical
role in regulating gene transcription. Previous studies have used
chromatin immunoprecipitation (ChIP) based microarray assays to
profile genomic regions that are enriched or depleted for a particular
histone modification. Such studies have been conducted extensively
in the yeast Saccharomyces cerevisiae, but currently no comprehensive data repositories or analysis tools are available for these
data sets. For this reason, we have developed the ChromatinDB
database, which contains genome-wide ChIP data for 22 different
histones or histone modifications in S.cerevisiae. ChromatinDB
provides novel tools to facilitate the visualization and statistical
analysis of chromatin features for user-selected gene sets.
Availability: http://www.bioinformatics2.wsu.edu/ChromatinDB
Contact: [email protected]
Supplementary information: Supplementary data are available at
Bioinformatics online.
*To whom correspondence should be addressed.
1 INTRODUCTION
Complex patterns of gene expression are observed in developing eukaryotic organisms. Recent studies have shown that
post-translational modifications to histone proteins (e.g.
histone acetylation, methylation or phosphorylation) play a
critical role in regulating gene expression patterns (Millar and
Grunstein, 2006). Histone modifications are directed to specific
regions of the genome by histone modifying enzymes, and can
act to both repress and activate gene transcription.
Recently, a new experimental methodology coupling chromatin immunoprecipitation (ChIP) assays with DNA microarray technology has been used to profile the DNA regions
associated with distinct types of modified histone proteins.
Almost 20 distinct histone modifications have been profiled in
this manner in yeast alone (Millar and Grunstein, 2006). These
data provide a genome-wide map of genomic regions that are
enriched or depleted of each type of histone modification. Also
available are data sets measuring the relative association of the
histone proteins themselves (histone H2A, H2A.Z, H2B, H3
and H4) with DNA. Unfortunately, all of these valuable data
sets are fragmented among a variety of different databases and
websites, and few software tools are available to visualize and
analyze these data.
A number of databases have been previously developed to
curate information about chromatin factors. For example, the
Histone Database contains extensive sequence information
for histone proteins and proteins containing histone-fold
domains (Marino-Ramirez et al., 2006). At least two databases
(CREMOFAC and ChromDB) contain protein sequences and
information about chromatin remodeling factors (Shipra et al.,
2006; and www.chromdb.org). However, no comparable
resource exists for the analysis of ChIP-microarray data
for histones or histone modifications. To remedy this deficiency, we have developed a database called ChromatinDB
that provides a centralized repository for these data sets.
In addition, ChromatinDB contains a number of tools to
facilitate the visualization and statistical analysis of these data.
2 DATABASE IMPLEMENTATION
The ChromatinDB database and accompanying web portal
were implemented using the MySQL DBMS, custom Perl
scripts and dynamically generated web pages. ChromatinDB
contains genome-wide ChIP data for 22 different histones or
histone modifications, which were obtained from various
published studies (see Supplementary Materials section for
more details). These data were first filtered and normalized
prior to uploading the data into ChromatinDB (see
Supplementary Materials).
2.1 Statistical analysis
A custom Perl module was written to identify histones or
histone modifications that are enriched or depleted in a user-selected set of promoter regions or open reading frames
(ORFs). This Perl module employed a Wilcoxon rank sum
test to determine if there was a bias in the distribution of the
ranks of the user-selected genes for any of the histone or
histone modification data sets. The significance of this bias
(P-value) was estimated using the normal approximation for
the Wilcoxon rank sum test. Significance values are only
reported by ChromatinDB when five or more genes with
data are selected, as at least this number of data points is
needed for an accurate significance estimate using the normal
distribution.
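The test described above can be illustrated with a minimal Python sketch. This is not the authors' Perl module: the function names, the input layout (a mapping from gene name to value for one data set) and the choice of a two-sided P-value are assumptions made for illustration, and the sketch omits tie and continuity corrections to the variance, which the text does not discuss.

```python
import math

MIN_GENES = 5  # the text reports P-values only for five or more genes with data


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(scores, selected):
    """Wilcoxon rank sum test of the selected genes against all other genes.

    scores:   dict mapping gene name -> value in one data set (hypothetical layout)
    selected: iterable of user-selected gene names
    Returns (average rank percentile, z, two-sided P), or None when fewer
    than MIN_GENES selected genes have data.
    """
    genes = list(scores)
    ranks = dict(zip(genes, average_ranks([scores[g] for g in genes])))
    chosen = [g for g in set(selected) if g in scores]
    n1 = len(chosen)
    n2 = len(genes) - n1
    if n1 < MIN_GENES or n2 == 0:
        return None
    n = n1 + n2
    w = sum(ranks[g] for g in chosen)          # rank sum of the selected genes
    mean_w = n1 * (n + 1) / 2                  # expected rank sum under no bias
    sd_w = math.sqrt(n1 * n2 * (n + 1) / 12)   # standard deviation, no tie correction
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal tail probability
    percentile = 100 * (w / n1) / n            # mean rank as a percentage of n
    return percentile, z, p
```

In this sketch a positive z means that the selected genes rank higher than expected (enrichment) and a negative z means that they rank lower (depletion).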
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]
Fig. 1. (A) Graphical display of histone and histone modification patterns for promoter regions bound by the Gcn5 histone acetyltransferase.
ChromatinDB was used to display ChIP-microarray data for 135 selected promoter regions. The log base-2 of the average enrichment ratio for each
of 22 histone modification data sets was plotted. Nucleosome-normalized data are shown. Acetylated or methylated lysine residues are indicated;
occupancy refers to the association of the histone protein with the DNA region. (B) Summary of enriched and depleted histone modifications in the
promoter regions bound by Gcn5. ChromatinDB was used to identify histone data sets that were significantly enriched or depleted for the 135
selected promoter regions. The average rank percentile, the data trend (e.g. enriched or depleted) and the corresponding significance (P) values are
displayed for each histone modification.
3 RESULTS AND FUNCTIONALITY
ChromatinDB provides the user with easy access to
ChIP-microarray data for a large set of histones or
histone modifications in Saccharomyces cerevisiae. The
principal means for accessing these data is through the
chromatin visualization tool. In the visualization tool’s gene
selection page the user can enter any number of gene names
or accessions (e.g. TAH11 or YJR046W), and specify which
type of chromatin data to display (e.g. histone acetylation).
Additional selection options allow the user to specify
whether to display standard or nucleosome-normalized data;
whether to display data for promoter regions or ORFs;
and the significance criteria used to determine whether histone
modifications are enriched or depleted in the user selected
gene set.
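For a selected gene set, the per-data-set summary that the visualization tool plots (an average enrichment ratio on a log2 scale) could be computed along the following lines. This is a sketch under assumptions: the function name and the nested-dictionary input are hypothetical, and ratios are assumed to be positive values on a linear scale. Because a log2 of the average ratio and an average of log2 ratios are different quantities, the sketch returns both rather than guessing which one ChromatinDB uses.

```python
import math


def summarize_gene_set(ratios_by_data_set, selected):
    """Summarize enrichment ratios of the selected genes in each data set.

    ratios_by_data_set: dict mapping data set name -> dict of gene -> ratio
                        (hypothetical layout; ratios are linear-scale, > 0)
    selected:           list of user-selected gene names
    Returns a dict mapping data set name -> (log2 of mean ratio,
    mean of log2 ratios, number of genes with data), or None for a
    data set in which no selected gene has data.
    """
    summary = {}
    for name, ratios in ratios_by_data_set.items():
        values = [ratios[g] for g in selected if g in ratios]
        if not values:
            summary[name] = None
            continue
        log2_of_mean = math.log2(sum(values) / len(values))
        mean_of_log2 = sum(math.log2(v) for v in values) / len(values)
        summary[name] = (log2_of_mean, mean_of_log2, len(values))
    return summary
```

Genes without data in a given data set are skipped, so the third element of each result records how many selected genes actually contributed to the average.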
ChromatinDB was used to visualize histone modification
data for 135 yeast promoters that are targeted by the Gcn5
histone acetyltransferase. These genes were chosen because each was
significantly bound by Gcn5 (P < 0.01) based on a previous
ChIP microarray study (Robert et al., 2004). A portion of the
resulting web page is shown in Figure 1. The graph shown in
Figure 1A displays, for these 135 promoter regions, the average
nucleosome-normalized log2 binding ratio for each histone or
histone modification data set. As shown in Figure 1B, the web
page also identifies which histones or histone modifications are
significantly enriched or depleted in these promoter regions.
This analysis was performed using a P-value threshold of
1 × 10⁻⁶ and a Bonferroni correction for multiple hypothesis
testing. In addition, the resulting web page shows the average
mRNA levels and transcription frequency of the selected genes
(Supplementary Fig. S1).
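How a significance threshold and a Bonferroni correction combine to label each data set can be sketched as follows. The text does not state whether the threshold applies before or after correction; this sketch assumes it is compared with the Bonferroni-adjusted P-value (the raw P-value multiplied by the number of data sets tested, capped at 1). The function name and the input layout are hypothetical.

```python
def classify_data_sets(results, alpha=1e-6):
    """Label each data set as enriched, depleted or not significant.

    results: dict mapping data set name -> (z, p), where z is the signed
             test statistic and p the uncorrected P-value (hypothetical layout)
    alpha:   significance threshold applied to the adjusted P-value (assumption)
    """
    n_tests = len(results)  # one hypothesis test per data set
    calls = {}
    for name, (z, p) in results.items():
        adjusted = min(1.0, p * n_tests)  # Bonferroni adjustment
        if adjusted <= alpha:
            calls[name] = "enriched" if z > 0 else "depleted"
        else:
            calls[name] = "not significant"
    return calls
```

With 22 data sets, an uncorrected P-value would need to be at most about 4.5 × 10⁻⁸ to pass a threshold of 1 × 10⁻⁶ under this reading.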
For the Gcn5-bound promoter regions, we find significant
enrichment for acetylation of each of the lysine residues in
histone H3. These results indicate that Gcn5 binding is
correlated with histone H3 acetylation, presumably because
these residues are acetylated by Gcn5 or because Gcn5 binds to
these acetylated lysine residues through its bromodomain. This
result is in accord with previous studies of Gcn5 substrate
specificity (Roth et al., 2001; Suka et al., 2001). In contrast,
selection of a random set of 135 yeast promoter regions did not
reveal a significant enrichment or depletion of any of the
histone modification data sets (Supplementary Fig. S2).
Intriguingly, we observe that acetylation of H2AZ K14 is
strongly enriched in the Gcn5-bound promoters (Fig. 1). This
result is in accordance with a previous study, which indicated
that Gcn5 might acetylate H2AZ K14 (Babiarz et al., 2006).
It is not clear why Gcn5 binding is significantly associated
with histone H4 N-terminal acetylation, as Gcn5 has not
been previously reported to be associated with this histone
modification. Finally, the occupancy of all the core histone
proteins is significantly depleted in these promoter regions
(Fig. 1), suggesting that Gcn5 binding is correlated with
nucleosome depletion at promoter regions.
ChromatinDB also provides a Genome Display tool, which
enables the user to analyze histone modification patterns based
on their location relative to chromosome features, such as a
telomere or centromere, or based on their chromosome
coordinates. Supplementary Figure S3 shows the genome
display of histone modification patterns located from 0 to
10 kb of a telomere.
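Selecting features by their distance from a telomere, as in the 0 to 10 kb example above, can be sketched as follows. The sketch makes simplifying assumptions that are not taken from the text: telomeres are approximated by the two chromosome ends, coordinates are 1-based, and each feature is represented by a single midpoint position. All names are hypothetical.

```python
def distance_to_nearest_telomere(position, chromosome_length):
    """Distance in bp from a 1-based position to the nearer chromosome end."""
    return min(position - 1, chromosome_length - position)


def features_near_telomere(features, chromosome_lengths, min_bp=0, max_bp=10_000):
    """Return features lying between min_bp and max_bp of a chromosome end.

    features:           iterable of (name, chromosome, midpoint) tuples
    chromosome_lengths: dict mapping chromosome name -> length in bp
    Returns (name, chromosome, distance) tuples sorted by distance.
    """
    selected = []
    for name, chrom, position in features:
        distance = distance_to_nearest_telomere(position, chromosome_lengths[chrom])
        if min_bp <= distance <= max_bp:
            selected.append((name, chrom, distance))
    return sorted(selected, key=lambda item: item[2])
```

The same filter extends to other chromosome features, such as a centromere, by replacing the distance function with one that measures from that feature's coordinate.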
ACKNOWLEDGEMENT
This work was supported, in part, by American Cancer Society
grant RSG-03-181-01-GMC.
Conflict of Interest: none declared.
REFERENCES
Babiarz,J.E. et al. (2006) Telomeric heterochromatin boundaries require NuA4-dependent acetylation of histone variant H2A.Z in Saccharomyces cerevisiae.
Genes Dev., 20, 700–710.
Marino-Ramirez,L. et al. (2006) The Histone Database: a comprehensive
resource for histones and histone fold-containing proteins. Proteins, 62,
838–842.
Millar,C.B. and Grunstein,M. (2006) Genome-wide patterns of histone modifications in yeast. Nat. Rev. Mol. Cell Biol., 7, 657–666.
Robert,F. et al. (2004) Global position and recruitment of HATs and HDACs in
the yeast genome. Mol. Cell, 16, 199–209.
Roth,S.Y. et al. (2001) Histone acetyltransferases. Annu. Rev. Biochem., 70,
81–120.
Shipra,A. et al. (2006) CREMOFAC – a database of chromatin remodeling
factors. Bioinformatics, 22, 2940–2944.
Suka,N. et al. (2001) Highly specific antibodies determine histone acetylation site
usage in yeast heterochromatin and euchromatin. Mol. Cell, 8, 473–479.