GENOMIC DISTRIBUTION AND FUNCTIONAL SPECIFICITY OF
HUMAN HISTONE H1 SUBTYPES
Lluís Millán Ariño
DOCTORAL THESIS · 2013
Thesis supervisor:
Dr. Albert Jordan Vallès
Molecular Genomics Department
Institut de Biologia Molecular de Barcelona (IBMB) - CSIC
DEPARTMENT OF EXPERIMENTAL AND HEALTH SCIENCES – UPF
RESULTS
RESULTS - CHAPTER I:
GENOMIC DISTRIBUTION OF
HUMAN HISTONE H1 SUBTYPES
Results
The main goal of this thesis project was to determine whether the genomic distribution of
human histone H1 differs among variants. To this end, we used chromatin
immunoprecipitation combined with quantitative PCR (ChIP-qPCR), promoter array
hybridization (ChIP-on-chip) and massive sequencing (ChIP-seq). By combining these
complementary strategies, we expected to gain deeper insight into the
heterogeneity of the histone H1 family.
1. DEVELOPMENT AND CHARACTERIZATION OF STABLE HA-TAGGED T47D-MTVL
CELL LINES
Because only a limited number of H1 variant-specific ChIP-grade antibodies exist (only H1.2 and
H1X in our hands), we developed T47D-MTVL-derived cell lines stably expressing
hemagglutinin (HA)-tagged versions of each of the five somatic H1 variants expressed in most
cell types (H1.0, H1.2, H1.3, H1.4 and H1.5) (see Materials and Methods). These cell lines
proliferated similarly to parental cells (Figure R.1A), and no significant differences in cell cycle
profile were observed compared to the original T47D cell line (Figure R.1B). Moreover, HA-tagged
H1 variants (H1-HA) were expressed at levels below or similar to those of the corresponding
endogenous histones, and at comparable levels across the different H1-HA-expressing cell lines
(Figure R.1C). Finally, these exogenous H1 forms were appropriately incorporated into
chromatin (Figure R.1D).
Figure R.1. Expression of HA-tagged somatic histone H1 variants in breast cancer cells. (A) Proliferation assay to
measure the effect of exogenous H1 variant expression. H1-HA expressing cell lines (GFP positive) were mixed 1:1
with parental T47D cells (GFP negative). Cells were split at the indicated times and the percentage of GFP-positive
cells was measured by FACS. Data are expressed as the percentage variation over time of the proportion of
GFP-positive versus GFP-negative cells with respect to the initial seeding proportion. (B) Cell cycle profile after
propidium iodide (PI) staining. Data are represented as the percentage of cells in the G1, S and G2/M cell cycle
phases. (C and D) H1 extract (C) or chromatin (D) was prepared from T47D-derived cells stably expressing HA-tagged
H1 variants, wild-type or a K26A mutant of H1.4, and loaded onto a 10% SDS-PAGE gel. Western blot hybridization was
performed with H1 variant-specific antibodies (C), or with anti-HA antibody (D).
2. SPECIFICITY OF THE H1 AND HA ANTIBODIES USED IN ChIP EXPERIMENTS
During our chromatin immunoprecipitation (ChIP) assays, we used a specific HA antibody to
immunoprecipitate the exogenous H1 variants (H1-HA), and also two commercial antibodies
specifically recognizing the H1.2 and H1X subtypes. In order to test the specificity of the HA
antibody, we performed ChIP in HA-expressing cell lines and in cells infected with the empty
vector, not containing HA (Figure R.2). Immunoprecipitation with the HA antibody was only
achieved in cells expressing H1.2-HA or H1.4-HA, but not in mock cells, where the
immunoprecipitated material was comparable to the material IPed with an unrelated
immunoglobulin (IgG). These assays proved the specificity of the HA antibody used in further
ChIP experiments for exogenously expressed H1-HA variants.
Figure R.2. Specificity of the anti-HA antibody. ChIP was performed in cells expressing H1-HA or mock infected with
the empty retroviral expression vector, with anti-HA antibody or unrelated immunoglobulins (IgG), and the
abundance of IPed material was quantified by qPCR with β-actin promoter oligonucleotides.
In addition, the ChIP specificity of the antibodies against endogenous H1.2 and H1X was confirmed in
H1.2 and H1X inducible knock-down cells (Figure R.3). ChIP experiments were performed after
6 days of doxycycline treatment, when H1 depletion is achieved, as shown by western
blot hybridization. The amount of H1.2 and H1X immunoprecipitated material in the corresponding
H1.2- and H1X-depleted cell lines was much lower than in untreated control cells for all the genomic
loci interrogated, although the signal was not completely abrogated. This could be due to
incomplete knock-down of the H1 variants.
Figure R.3. Specificity of H1.2 and H1X antibodies in ChIP experiments determined in H1 variant-specific
knock-down cells. T47D-derived cells stably harboring an inducible system for shRNA expression against H1.2 (A) or
H1X (B) were treated with doxycycline for 6 days or left untreated. Then, ChIP was performed with H1
variant-specific antibodies against H1.2 and H1X (or control IgG), and the IPed material was quantified by qPCR
with oligonucleotides for the indicated promoters (-10 kb distal promoter or TSS), and corrected by input DNA
amplification. Western blot hybridizations showing the extent of H1 depletion upon doxycycline treatment are shown.
3. STUDYING H1 VARIANT DISTRIBUTION BY ChIP-qPCR
3.1. All H1 variants are non-specifically present at gene promoters and are depleted from
transcription start sites (TSS) in active genes
In our first attempt to elucidate H1 variant distribution in the genome, we tested occupancy at
specific loci corresponding to diverse genomic features. In ChIP-qPCR experiments
using the anti-HA antibody in H1-HA expressing cells, H1-associated chromatin included gene
promoters, coding regions and repetitive DNA, irrespective of which H1-HA variant was
immunoprecipitated (Figure R.4). The abundance of the different H1 variants in those regions
was comparable, although a few differences among them were observed, e.g. H1.3 was reduced
at alphoid repeats while H1.4 and H1.5 were relatively enriched. Moreover, promoter regions
seemed to have reduced H1 levels compared to coding regions or repetitive heterochromatic
regions.
Figure R.4. HA-tagged H1 variants are associated with gene promoters, coding regions and repetitive DNA.
ChIP was performed in cells expressing H1-HA with anti-HA antibody and the abundance of IPed material was
quantified by qPCR with oligonucleotides for the indicated promoters, coding or heterochromatic regions, and
corrected by input DNA amplification with the same primer pair.
Next, the specificity of H1 variant distribution was investigated in more detail at gene
promoters previously shown to contain H1 at distal regions located 10 kbp upstream of their
transcription start site (TSS) and to be depleted of H1 at the TSS (“H1 valley”) [170]. In those
selected gene promoters, all H1 variants were detected at all distal promoter regions tested, in
equivalent proportions, and a similar H1 depletion was observed at the TSS of all genes for all
H1 variants (Figure R.5A). The distribution of all H1 variants in those selected promoters was
also comparable to that of an H1.4 mutant (K26A) altered at a residue targeted by acetyl- and
methyltransferases and reported to be involved in recruiting chromatin proteins (Figure R.5A) [149,
150, 164, 165]. Focusing on endogenous histones, a local depletion of H1 at the TSS was also
observed by immunoprecipitating H1s with specific H1.2 and H1X antibodies (Figure R.5B).
Interestingly, the TSS-associated H1 valley was not observed at genes inactive in these cells,
i.e. OCT4 and NANOG (Figure R.5C). Thus, we confirmed in our breast cancer cell line that H1 is
depleted near the TSS of active genes, but not of repressed genes.
Moreover, this “H1 valley” does not seem to be generally specific to any of the H1 subtypes
tested, including a mutant form of H1.4.
Figure R.5. All H1 variants are present at gene promoters and depleted from transcription start sites.
(A, C) ChIP experiments were performed in T47D-derived cells stably expressing HA-tagged H1 variants,
wild-type or a K26A mutant of H1.4, with anti-HA antibody, and the abundance of IPed material was
quantified by qPCR with oligonucleotides for the indicated promoters (-10 kb distal promoter or TSS),
and corrected by input DNA amplification with the same primer pair. (B) ChIP experiments were
performed in parental T47D cells with H1 variant-specific antibodies against H1.2 and H1X and the IPed
material was quantified as in (A).
To further confirm that the H1 valley was dependent on gene expression, we performed
several assays relating the transcriptional status of genes to the presence of H1 depletion at
the TSS. The H1.0-HA and H1.2 valley was evident at genes being expressed, as shown by
mRNA accumulation measured by RT-qPCR and by gene expression microarray data. Moreover,
the presence of an H1 valley positively correlated with enrichment of H3K4me3, a modification
associated with active transcription, at the TSS compared to a region 10 kbp upstream.
Finally, an open chromatin state at the TSS, measured by formaldehyde-assisted isolation of
regulatory elements (FAIRE)-qPCR, and nucleosome depletion at the TSS
(H3 ChIP) were observed in genes presenting a clear H1 valley, but not in genes whose
promoter was fully occupied by H1 (Figure R.6). In conclusion, an H1 valley was found in the
promoter of transcribed genes, but not in repressed genes.
Figure R.6. An H1 valley at TSS is found at genes being expressed that show FAIRE-measured open chromatin and
increased H3K4me3 at TSS. Several genes representing different levels of expression according to
microarray data were chosen to analyze mRNA abundance by RT-qPCR, chromatin accessibility at TSS by
FAIRE-qPCR, and distribution of H1, H3 and H3K4me3 by ChIP at distal (upstream) promoter compared to TSS.
ND, not determined. RT-PCR values for each gene were normalized to GAPDH expression and genomic DNA
amplification with the same set of primers. ChIP and FAIRE PCRs were normalized to input DNA.
3.2. H1 variants are further depleted from TSS upon induced gene activation
Another approach to relate H1 promoter abundance to gene expression is to study the H1
content of inducible promoters under stimulating conditions. To do so, we treated T47D cells
either with a progesterone analog (R5020) or with a mitogenic drug (PMA). H1 variant
depletion at the TSS was evident at nucleosome B (nucB) of the inducible MMTV promoter
after R5020 treatment at different time points (Figure R.7A). Moreover, treating cells
with PMA after serum starvation increased the depletion of total H1, H1.2 and H1X at the TSS around
2-fold or more in the inducible Jun and Fos promoters, while in active and repressed
non-responsive control genes (PSMB4 and OCT4, respectively) the extent of the H1 valley did not change
(Figure R.7B and C).
Figure R.7. H1 depletion at TSS of inducible promoters. (A) H1 depletion at TSS of hormone-responsive promoters
upon stimulation with ligand. T47D-derived cells harboring an MMTV-luciferase construct and expressing different
H1-HAs were treated with a progestin analog (R5020 10nM) for the indicated time and ChIP was performed with
anti-HA antibody. The abundance of IPed material was quantified by qPCR with specific oligonucleotides for the
MMTV promoter (nucleosome B). (B and C) The H1 valley at TSS of JUN and FOS genes is preformed and increases
upon mitogenic stimulation. T47D cells were treated with PMA 100nM for 60 min or left untreated, and ChIP was
performed with H1, H1.2, H1X and H3 antibodies. The abundance of IPed material was quantified by qPCR with
oligonucleotides for the indicated promoters (-10 kb distal promoter or TSS), and corrected by input DNA
amplification (B). Jun and Fos were responsive to PMA as shown in the RT-qPCR experiment (right panels). PSMB4
and OCT4 are non-responsive control genes, active and repressed, respectively, in T47D cells. The table below (C)
shows the H1 valley ratio (calculated as distal/TSS) at 0 and 60 min, or the relative H1 valley ratio at 60 min
compared to 0 min.
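The valley ratio described in the legend above (distal/TSS, and its value at 60 min relative to 0 min) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the numeric values are hypothetical and are not taken from the thesis data.

```python
def valley_ratio(distal: float, tss: float) -> float:
    """H1 valley ratio as described in the Figure R.7 legend: distal / TSS signal."""
    if tss <= 0:
        raise ValueError("TSS signal must be positive")
    return distal / tss


def relative_valley_ratio(distal_0: float, tss_0: float,
                          distal_60: float, tss_60: float) -> float:
    """Valley ratio at 60 min expressed relative to the valley ratio at 0 min."""
    return valley_ratio(distal_60, tss_60) / valley_ratio(distal_0, tss_0)


# Hypothetical input-corrected ChIP signals, for illustration only.
untreated = {"distal": 2.0, "tss": 1.0}
treated = {"distal": 2.0, "tss": 0.5}

print(valley_ratio(**untreated))                   # 2.0
print(valley_ratio(**treated))                     # 4.0
print(relative_valley_ratio(2.0, 1.0, 2.0, 0.5))   # 2.0
```

A relative ratio above 1 corresponds to a valley that deepens upon stimulation, while a value close to 1 corresponds to an unchanged valley, as reported for the non-responsive control genes.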
Interestingly, these immediate-early response genes presented an H1 and H3 valley that was already
preformed under non-inducing conditions and became deeper upon stimulation. This suggests
that their promoter is kept in an “open” state to allow a rapid response to stimulation, both by
preserving the nucleosome-free region (NFR) at the TSS and by keeping the promoter
free of H1.
4. ChIP-chip EXPERIMENTS ON PROMOTER ARRAYS
After validating our ChIP method for studying H1 variant localization, and in order to explore
the genome-wide distribution of the different H1 variants along gene promoters, we
hybridized ChIP material, obtained with variant-specific antibodies or corresponding to
HA-tagged H1 variant-associated chromatin, on a Nimblegen promoter tiling array (Nimblegen
HG18 RefSeq Promoter 3x720K) containing probes for 30,893 transcripts (-3,200 to +800 bp relative to
the TSS) arising from 22,542 human promoters.
4.1. Extended depletion of H1 at promoters is dependent on the transcriptional status of
the gene and shows differences between variants
For each variant, the average log2 ratio of probe intensity for all transcripts was plotted
against the distance relative to the TSS, and an H1 valley close to the TSS was apparent in all cases.
Interestingly, in the two H1.2 samples (endogenous H1.2 and H1.2-HA), the valley was more
pronounced and slightly shifted towards the TSS, compared to the rest of the H1 variants
(endogenous H1X and H1.0/3/4/5-HA) (Figure R.8).
Figure R.8. Extension of H1 depletion at promoters. Average log2 enrichment ratio of ChIP-chip probe intensity
for all transcripts plotted against the distance relative to the TSS for each variant.
Thereafter, these ChIP-chip data were combined with gene expression data for ca. 20,000 of the
transcripts, obtained with the parental cell line grown under the same conditions, using a human
expression array from Agilent (Human v1 Sureprint G3 Human GE 8x60k Microarray). For further
analysis, we therefore worked with the 20,338 promoters whose expression data were included in
this Agilent expression array, identified by overlapping transcript IDs from the ChIP-chip
promoter array with genes in the expression microarray (Figure R.9).
Figure R.9. Global gene expression profile in T47D cells. (A) Overlap between unique transcript IDs at
the Nimblegen promoter array and Agilent expression microarray. (B) Gene expression profile
in T47D cells of 62,976 transcripts, from highest to lowest expression, obtained by hybridization with
an Agilent microarray.
Next, using these expression data, heat maps representing H1 binding intensity around
promoter regions were constructed for each variant, ranking promoters from highest to lowest
gene expression (Figure R.10A). An H1 valley was clearly seen for at least the top 50-60% most highly
expressed transcripts for all variants. Notably, for both H1.2 samples the valley extended
towards the lowest expressed genes. Then, the total number of considered transcripts was
divided into 10 groups, from high to low expression, and the average log2 ratio of ChIP-chip probe
intensity was plotted against the distance relative to the TSS for each expression group
and each variant (Figure R.10B). These graphs confirmed that H1 depletion at promoters
depends on the transcriptional status of the gene, as the H1 valley progressively weakened
as genes became less expressed. Interestingly, the H1 valley around the TSS was deeper and
wider for H1.2 than for the other variants, irrespective of whether the endogenous or the HA-tagged
histone was measured, and repressed genes contained less H1.2 compared with other
subtypes. These observations point to a specific behavior of the H1.2 variant at promoter regions.
Figure R.10. The extension of H1 depletion at promoters is transcription status-dependent and variant-specific. (A)
Heat maps of ChIP-chip probe intensity around TSS (-3,200 to +800bp) for 20,338 transcripts for which the
expression rate was determined. Genes are ordered from highest to lowest gene expression. (B) Average log2 ratio
of ChIP-chip probe intensity plotted against the distance relative to the TSS for all transcripts classified
according to expression in ten groups containing the same number of transcripts, from highest (EG1) to lowest (EG10)
expression. Representative ChIP-chip experiments are shown.
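The grouping described for panel (B), in which transcripts are ranked by expression, split into equally sized groups (EG1 to EG10) and averaged position by position, can be sketched as below. This is a minimal illustration, not the analysis pipeline used in the thesis: the function names, the dictionary-based data layout and the toy values are assumptions.

```python
from statistics import mean


def split_into_groups(items, n_groups):
    """Split an ordered list into n_groups contiguous groups of near-equal size."""
    size, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(items[start:end])
        start = end
    return groups


def average_profiles_by_expression(expression, profiles, n_groups=10):
    """Rank transcripts from highest to lowest expression, split them into
    groups (first group = highest expression) and average the log2 ratio
    at each position relative to the TSS within every group."""
    ranked = sorted(expression, key=expression.get, reverse=True)
    averaged = []
    for group in split_into_groups(ranked, n_groups):
        columns = zip(*(profiles[t] for t in group))
        averaged.append([mean(col) for col in columns])
    return averaged


# Toy data (hypothetical): 4 transcripts, 3 probe positions, 2 groups.
expression = {"t1": 9.0, "t2": 7.0, "t3": 2.0, "t4": 1.0}
profiles = {
    "t1": [0.0, -2.0, 0.0],
    "t2": [0.0, -1.0, 0.0],
    "t3": [1.0, 0.0, 1.0],
    "t4": [1.0, 1.0, 1.0],
}
groups = average_profiles_by_expression(expression, profiles, n_groups=2)
print(groups[0])  # [0.0, -1.5, 0.0]  highly expressed group: valley at the middle position
print(groups[1])  # [1.0, 0.5, 1.0]   lowly expressed group: no valley
```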
In general, it is worth noting that H1 depletion extended to some degree at least 1 kbp
upstream of the TSS of active genes for all H1 variants, further than the predicted extent of the
reported nucleosome-free region (NFR) that precedes the TSS. In order to confirm this result,
ChIP-chip for the core histone H3, used to monitor nucleosome occupancy, was also
performed and showed that H3 was also depleted at active genes, but more locally than H1
(Figure R.10A and B). Moreover, H3 and all H1s, except H1.2, presented a marked enrichment
peak immediately downstream of the TSS that may correspond to a positioned nucleosome, as
previously reported [98, 100]. Additionally, ChIP-qPCR on selected promoters confirmed some
of these observations, i.e. some repressed promoters presented high H1.0 content around the TSS,
but low H1.2 (Figure R.11).
Figure R.11. ChIP-qPCR validation of ChIP-chip observations. Some repressed genes show an H1.2 valley at the TSS.
ChIPed material from H1.0-HA cells or with the specific H1.2 antibody was quantified by qPCR
with oligonucleotides for the indicated promoter regions and corrected by input DNA
amplification. Selected genes belong to the indicated expression profile percentiles in
Figure R.10B.
4.2. An H1 valley at non-protein coding promoter transcripts is seen for H1.2
In addition to protein-coding genes, the promoter array contained 1,145 non-coding promoter
transcripts (NRs), including structural RNAs and transcribed pseudogenes, that overall
presented a low expression rate compared to the total transcriptome. An H1 valley at the TSS was
only apparent in the ChIP-chip heat maps for endogenous and HA-tagged H1.2, in agreement
with the previous observation that an H1.2 valley occurs even at lowly expressed promoters
(Figure R.12).
Figure R.12. Non protein-coding transcripts show an H1.2 valley at TSS. (A) Heat maps of ChIP-chip probe intensity
around TSS (-3200 to +800 bp) for 1,145 non protein-coding transcripts (NRs) from which the expression rate was
determined (B). NRs are ordered from highest to lowest gene expression. (B) Expression levels of NRs are shown as a
boxplot compared to total transcriptome included in the Agilent expression microarray.
4.3. H1.2 abundance at distal promoter is a mark of transcriptional inactivity
Notably, the previous heat map and expression-dependent plot analyses indicated that H1.2
abundance at distal promoter regions is inversely related to gene expression, being more
abundant at repressed promoters (Figure R.10A and B). This was also observed to some extent
for the other H1 variants and H3, with the exception of the top and bottom ca. 10% expressed
genes that showed an inverse correlation. In agreement with this, when gene promoters were
ranked from lowest to highest H1 enrichment at the distal promoter region (-3,200 to -2,000
bp from TSS), a negative correlation with gene expression was clearly seen mainly for H1.2
(Figure R.13A). At the same time, we compared the expression rate of highly (top 10%) or
lowly (bottom 10%) enriched promoters for each variant with the total transcriptome included in
the expression array (Figure R.13B). In agreement with the previous analysis, genes with the
highest distal promoter H1.2 content (top 10%) mainly fell within the lowest expressed genes,
whereas genes with the lowest H1.2 content (bottom 10%) fell within the highly expressed
genes. This was partially true also for H1X but less evident for the H1-HAs. In conclusion, H1.2
abundance at distal promoter regions is inversely related to the expression of the associated
gene, and this relationship is weaker for the other variants.
Figure R.13. H1.2 abundance at distal promoter regions negatively correlates with gene expression. (A) Heat maps
of gene expression data for 20,338 transcripts ordered from lowest to highest H1 content at distal promoter regions
(-3200 to -2000 bp relative to TSS), for each of the H1 variants indicated. (B) Expression levels of genes presenting
the highest or lowest H1 variant content at distal promoter are shown as a boxplot, and compared with total
transcriptome.
4.4. H1.2 abundance at distal promoter negatively correlates with the presence of other
H1 variants
To further characterize the relationship between the abundances of H1 variants in promoter regions,
we generated heat maps for all H1 variants ranking genes from low to high H1.2 content at the
distal promoter. Interestingly, H1.2 abundance at distal promoter regions inversely correlated
with H3, H1X and H1-HA abundance, except for H1.2-HA, which showed an intermediate pattern
(Figure R.14A). This inverse relationship was also observed by performing correlation analysis
comparing mean probe intensity at distal promoters for H1.2 with other variants (Figure
R.14B). H1.2 negatively correlated with H1.0-HA, H1X and H3, and presented almost no
correlation with H1.2-HA, as expected from Figure R.14A. On the other hand, H1.0-HA and H3
presented a strong positive correlation, suggesting that they mainly bind the same promoters.
Globally, this indicates that there is a preferential binding of H1.2 to some promoters (mostly
repressed genes, according to previous data in Figure R.13) compared to the rest of the variants
and, conversely, that many promoters are devoid of H1.2 but contain other H1 variants.
Figure R.14. H1.2 abundance at distal promoter regions negatively correlates with abundance of other variants.
(A) Heat maps of H1 ChIP-chip probe intensity around TSS (-3,200 to +800bp) for 20,338 transcripts from which the
expression rate was determined. Genes are ordered from lowest to highest H1.2 content at distal promoter regions.
Genes with the highest or lowest distal H1.2 content are indicated. These genes for each H1 variant (2050 genes in
each group, 10% of the total) were used to determine the number of coinciding genes as shown in Figure R.15 and
R.16. (B) Correlation scatter plots between abundance of different H1 variants at distal promoter regions. X and Y
axis represent mean probe intensity at distal promoter regions (-3200 to -2000 bp relative to TSS), for the indicated
H1 variants. R: Pearson’s correlation coefficient.
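The Pearson correlation coefficient reported in panel (B) can be computed from paired mean probe intensities as sketched below. The function name and the toy values are hypothetical; they only illustrate how a perfectly opposed distribution of two variants gives R = -1.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical mean probe intensities at the distal promoter of four genes.
h1_2 = [1.0, 2.0, 3.0, 4.0]
h1x = [4.0, 3.0, 2.0, 1.0]
print(pearson_r(h1_2, h1x))  # -1.0 (perfect negative correlation in this toy case)
```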
Taking the previous observations into account, we next asked whether differential H1 variant
content at the distal promoter was related to the regulation of distinct biological processes. For
this purpose, we compared gene ontology (GO) analyses of promoters enriched in (top 10%) or
deprived of (bottom 10%) the endogenous H1.2 and H1X variants, as these endogenous H1 variants
presented an opposite distribution at distal promoter regions (Figure R.14). Analysis using
DAVID software indicated that different biological processes were related to differential H1
variant abundance at distal promoter regions in T47D cells. For example, promoters with the
lowest content of H1X included genes involved in chromatin organization. Promoters with the
lowest H1.2 content included genes involved in cell-cell signaling or regionalization. Genes with
the highest H1X content at the promoter included genes involved in pattern formation, and genes
with high H1.2 included repressed genes involved in sensory perception (Table R.1). This
observation suggests that different
biological processes are regulated by different H1 variants, supporting the view that H1 variants
regulate different subsets of genes and, hence, different functions.
Table R.1. Gene ontology of genes presenting the highest (top 10%) or lowest (bottom 10%) H1.2 or H1X content at
distal promoter (-3200 to -2000 bp relative to TSS) according to ChIP-chip data shown in Figure R.13. P-value
(adjusted for multiple testing by Benjamini method) and false discovery rate are shown.
Afterwards, in order to corroborate the preferential binding of H1.2 to certain promoters
compared with other variants, Venn diagrams of the top 10% genes having high or low H1.2 at the
distal promoter and high or low H1X were generated to identify genes presenting high2/lowX
and low2/highX (Figure R.15A and B). Those genes are expected to present high levels of
H1.2 and low levels of H1X at their distal promoter, and vice versa. The largest
overlap was between low2/highX promoters (553 genes), which mainly correspond to
expressed genes compared with the total transcriptome in the expression array (Figure R.15C).
On the other hand, genes containing high H1.2 were mainly repressed, and genes depleted in
both H1.2 and H1X were the most active ones.
Figure R.15. Coincidence between genes presenting the highest or lowest H1.2 or H1X content at distal promoter.
(A) Heat maps of H1 ChIP-chip probe intensity around TSS. Genes are ordered from lowest to highest H1.2 (left) or
H1X (right) content at distal promoter regions. Genes with the highest or lowest distal H1 content are indicated.
These genes (2050 genes in each group, 10% of the total) were used to determine the number of coinciding genes as
shown in Venn diagrams (B). A striking coincidence exists between genes presenting low H1.2 but high H1X. (C)
Expression levels of coinciding genes in the four comparisons depicted in (B). Genes were classified into five expression
groups (EG) similar to Figure R.10B, and the percentage of coinciding genes belonging to each of these groups was
determined. (Right panel) Expression levels of coinciding genes are also shown as a boxplot.
Similarly, we also compared H1.2 with H1.0-HA abundance, since H1.0-HA, like H1X, presented a
distribution at the distal promoter region opposite to that of H1.2. Venn diagram comparison
of the top 10% genes having high or low H1.2 versus H1.0-HA showed that the largest
overlaps were low2/high0, with 716 genes, and high2/low0, with 276
genes (Figure R.16). Again, H1.2-enriched genes were mainly repressed.
Figure R.16. Coincidence between genes presenting the highest or lowest H1.2 or H1.0-HA content at distal
promoter. See Figure R.15 legend.
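The overlaps counted in the Venn diagrams of Figures R.15 and R.16 amount to intersecting the sets of genes with the highest or lowest distal promoter content of two variants. The sketch below illustrates the idea with set operations. The function names, the generic "A"/"B" labels and the toy values are assumptions, and the fraction is a parameter rather than the exact group size used in the thesis.

```python
def extreme_sets(content, fraction=0.10):
    """Return (lowest, highest) sets of genes ranked by distal promoter content."""
    ranked = sorted(content, key=content.get)
    k = max(1, int(len(ranked) * fraction))
    return set(ranked[:k]), set(ranked[-k:])


def overlaps(content_a, content_b, fraction=0.10):
    """Intersect the extreme gene sets of two variants (A and B)."""
    low_a, high_a = extreme_sets(content_a, fraction)
    low_b, high_b = extreme_sets(content_b, fraction)
    return {
        "lowA/highB": low_a & high_b,
        "highA/lowB": high_a & low_b,
        "lowA/lowB": low_a & low_b,
        "highA/highB": high_a & high_b,
    }


# Toy data (hypothetical): ten genes with perfectly opposed content of A and B.
genes = [f"g{i}" for i in range(10)]
variant_a = {g: float(i) for i, g in enumerate(genes)}
variant_b = {g: float(9 - i) for i, g in enumerate(genes)}

result = overlaps(variant_a, variant_b, fraction=0.2)
for label, shared in result.items():
    print(label, sorted(shared))
```

With opposed distributions, as in this toy case, the low/high and high/low intersections contain genes while the low/low and high/high intersections are empty.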
Altogether, our data indicate that promoters with little H1.2 are loaded with high amounts
of other variants, not only exogenously expressed H1.0-HA but also endogenous
H1X. Expression analysis of these groups of genes showed that genes with low levels of H1 variants
at the distal promoter are highly expressed, and vice versa, but H1.2 content is the strongest
predictor of gene expression (Figures R.15C and R.16C).
Finally, we focused on the differential promoter binding of endogenous H1.2 versus H1X (Figure
R.15) in order to confirm experimentally that some promoters preferentially bind
particular variants. Representative genes from the low2/highX (TMEM204
and TUBGCP5) and high2/lowX (COL4A3 and CUGBP2) intersections were randomly selected and
interrogated by ChIP-qPCR (Figure R.17A). When we plotted the ratio between
H1.2 and H1X abundance (H1.2/H1X) at those selected promoters, both TMEM204 and
TUBGCP5 presented a lower relative H1.2/H1X ratio than COL4A3 and CUGBP2, as
expected, indicating that H1.2 and H1X are differentially enriched within these two groups of
genes. These results thus confirmed experimentally that some genes are enriched in H1.2 or
H1X at distal promoter regions.
The universality of the relative H1.2/H1X abundance at representative genes was also tested in
two additional cell lines by ChIP-qPCR (Figure R.17A and B). HeLa cells showed results similar
to those of T47D, i.e. COL4A3 and CUGBP2 genes presented a higher H1.2/H1X ratio than TMEM204
and TUBGCP5. Interestingly, ratios were even higher than in T47D, reflecting a higher relative
abundance of H1.2 compared with H1X in HeLa cells (Figure R.17C and D). On the other hand,
MCF7 cells presented similar H1.2/H1X ratios in all four genes, due to the increased H1X signal
in COL4A3 and CUGBP2 genes (Figure R.17A, B and C). Note that the H1X content of chromatin in
MCF7 cells is higher than in T47D cells, which in turn present more H1X than HeLa cells
(Figure R.17C). This result indicates that the relative abundance of variants at promoters is
not fully conserved between cell types, probably due to differences in their relative content
within the nucleus.
Figure R.17. The ratio between H1.2 and H1X abundance at selected genes is conserved among T47D and HeLa
cells, but not in MCF7, in relation to the cell abundance of each variant. (A) ChIP-qPCR confirmed that some genes
are enriched in H1.2 or H1X at distal promoter. The differential ratio between H1.2 and H1X abundance at selected
genes observed in T47D cells is conserved in HeLa cells but not in MCF7. TMEM204 and TUBGCP5 genes were
randomly chosen among the group of genes presenting low H1.2 and high H1X (553 genes), and COL4A3 and
CUGBP2 genes among the genes presenting high H1.2 and low H1X (189 genes) (see Figure R.15). After ChIP-qPCR of
H1.2 and H1X abundance at distal promoter regions of these genes in T47D, HeLa and MCF7 cells, the relative ratio
H1.2/H1X was calculated. (B) ChIP-qPCR of H1.2 and H1X abundance at TMEM204, TUBGCP5, COL4A3 and CUGBP2
distal promoter regions in T47D, HeLa and MCF7 cells. IPed material was corrected by input DNA amplification. (C)
Abundance of H1 variants in T47D, HeLa and MCF7 chromatin determined by immunoblot with specific antibodies.
(D) Expression of H1 variants in T47D, HeLa and MCF7 cells determined by RT-qPCR. cDNA levels for the indicated H1
variants were corrected by GAPDH expression and amplification of genomic DNA with the same PCR primers.
In relation to this, ChIP-chip experiments for endogenous H1.2 and H1X were extended to HeLa
cells, confirming that these two variants do not fully coexist at the same promoters
(Figure R.18). When H1.2 and H1X promoter occupancy in both T47D and HeLa cells was
represented with genes ranked from lowest to highest H1.2 abundance at the distal promoter
region in T47D cells, the H1.2 distribution in HeLa cells resembled that in T47D cells, whereas
H1X presented a distribution different from that of H1.2 in both cell types.
Figure R.18. Comparison of H1.2 and H1X promoter content among T47D and HeLa cell lines. Heat maps of H1.2
and H1X ChIP-chip probe intensity around TSS (-3200 to +800 bp) for 20,338 transcripts. Shown promoters in all
heat maps are ordered from lowest to highest H1.2 content at distal promoter regions in T47D cells.
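The ordering used for these heat maps can be sketched in Python as follows. This is a minimal illustration, assuming each heat map is a genes x probe-positions matrix with rows in the same gene order; the function names and toy values are hypothetical, and the -3200 to -2000 bp window is the distal promoter definition given in the legend of Figure R.20.

```python
import numpy as np


def distal_promoter_score(matrix, positions, start=-3200, end=-2000):
    """Mean probe intensity per gene within the distal promoter window."""
    positions = np.asarray(positions)
    mask = (positions >= start) & (positions <= end)
    return np.asarray(matrix)[:, mask].mean(axis=1)


def order_by_reference(reference_scores, *matrices):
    """Sort the rows of every matrix by ascending reference score."""
    order = np.argsort(reference_scores, kind="stable")
    return order, [np.asarray(m)[order] for m in matrices]


# Toy data: 3 genes, 6 probe positions relative to the TSS.
positions = np.array([-3200, -2600, -2000, -1000, 0, 800])
h12_t47d = np.array([
    [0.9, 0.8, 0.7, 0.1, -0.5, 0.0],
    [-0.3, -0.2, -0.1, 0.0, -0.9, 0.2],
    [0.2, 0.3, 0.4, 0.5, -0.2, 0.1],
])
h1x_hela = np.array([[1.0] * 6, [2.0] * 6, [3.0] * 6])

scores = distal_promoter_score(h12_t47d, positions)
order, (h12_sorted, h1x_sorted) = order_by_reference(scores, h12_t47d, h1x_hela)
```

Applying one ordering, derived from a single reference (here H1.2 in T47D), to every matrix keeps each row aligned to the same gene, so the heat maps can be compared row by row.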
4.5. H1.2 abundance correlates with clusters of differential gene expression along
chromosomes
Next, we represented heat maps of H1.2 abundance at the promoter ordering genes according
to their position along several human chromosomes (Figure R.19A). Interestingly, several
domains of H1.2 abundance became apparent along chromosomes, correlating with clusters of
differential gene expression. Thus, clustered genes presenting high amounts of H1.2 along the
promoter, including the TSS, are transcriptionally repressed, while genomic domains with little
H1.2 are active. This supports the observation that the amount of H1.2 at the distal promoter is
related to the transcriptional status of the associated gene, as active and inactive genes
cluster together in broad genomic regions depleted of or enriched in H1.2, respectively.
Notably, after calculating a gene-richness coefficient for each individual chromosome
(Figure R.19B), chromosomes 17 and 19, the most gene-rich chromosomes, showed overall
high gene expression and low H1.2 content at promoters. In contrast, chromosome 13, the most
gene-poor chromosome, presented low gene expression and high H1.2 content (Figure R.19A and C).
These observations are in agreement with previously reported associations of active and
inactive gene-rich domains in the nuclear space, as well as with the distinct radial organization
of chromosomes in the nucleus regarding gene density and transcriptional activity [104, 250].
Moreover, they favor again a strong relation between H1.2 content and gene expression.
Additionally, the observed clustered distribution of H1 was well conserved for each variant
across different cell lines, but differed among H1 variants (Figure R.19D). H1.2 and H1X content
in chromosome 1 of HeLa cells resembled the distribution of their counterparts in T47D cells.
Interestingly, however, H1X and H1.0-HA abundance was not clustered in a way that matched gene
expression in either T47D or HeLa cells, as H1.2 abundance was.
In summary, H1.2 content at promoters is the best H1 reporter of gene expression. Moreover,
its organization in chromosomal clusters correlating with differential gene expression points
to a role of this variant in chromatin organization within the nucleus in breast cancer cells.
Figure R.19. H1 variant content at gene promoters along human chromosomes. (A) Heat maps of H1.2 ChIP-chip
probe intensity around TSS (-3,200 to +800bp) for genes ordered according to their position along several human
chromosomes. Gene expression levels for each gene in T47D cells are represented on the left in two different ways
(as a heat map and as a graphical representation of log2 ratios). A gene-richness coefficient (GRC) for each
chromosome is indicated. The centromere location is marked with a triangle. A region of interest is marked with an
asterisk, viewed in the UCSC genome browser in Figure R.23. (B) Gene-richness of human chromosomes. The
relative size (base pairs) of each chromosome compared to the total human genome is represented as percentage,
as well as the relative content of coding genes compared to the total number of genes in the genome. The
gene-richness coefficient for each chromosome, calculated as the ratio between the percentage of genes present in each
chromosome and the percentage of base pairs of each chromosome to the total human genome, is shown in the
graph below. (C) Expression levels of genes present in several chromosomes. Boxplots show the distribution of
expression (log2 ratio) in T47D cells for the genes present in each indicated chromosome. (D) Heat maps of H1.2
and H1X ChIP-chip probe intensity around TSS (-3.2 to +0.8 kbp) for genes ordered according to their position along
chromosome 1 in T47D and HeLa cells. Heat map of H1.0-HA in T47D is also included.
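The gene-richness coefficient described in panel (B) can be written out as a short Python sketch. The function name, the chromosome labels and the gene and base-pair counts below are invented for illustration and are not values from this work.

```python
def gene_richness_coefficients(chromosomes):
    """Map chromosome name -> gene-richness coefficient (GRC).

    `chromosomes` maps a name to (number of genes, length in bp).
    GRC = (% of all genes on the chromosome) / (% of all bp on it).
    """
    total_genes = sum(genes for genes, _ in chromosomes.values())
    total_bp = sum(bp for _, bp in chromosomes.values())
    grc = {}
    for name, (genes, bp) in chromosomes.items():
        pct_genes = 100.0 * genes / total_genes
        pct_bp = 100.0 * bp / total_bp
        grc[name] = pct_genes / pct_bp
    return grc


# Toy genome with three hypothetical chromosomes (invented numbers).
toy_genome = {
    "chrA": (200, 100_000),   # gene-rich: 20% of genes in 10% of bp
    "chrB": (50, 100_000),    # gene-poor: 5% of genes in 10% of bp
    "chrC": (750, 800_000),
}
grc = gene_richness_coefficients(toy_genome)
```

A coefficient above 1 indicates that a chromosome carries more genes than expected from its size, and a coefficient below 1 indicates fewer.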
5. GENES SPECIFICALLY DEREGULATED BY KNOCK-DOWN OF A PARTICULAR H1
VARIANT ARE NOT ENRICHED IN SUCH VARIANT AT THE PROMOTER
We have previously shown that inducible knock-downs (KD) of individual H1 variants
deregulate a reduced subset of genes (≤2%), specific for each variant, and including up- or
down-regulated genes in similar proportions [97]. One hypothesis would be that these subsets
contain genes specifically targeted or with prevalence for some of the H1 variants. If a
promoter was heavily or uniquely loaded by a particular variant and H1 was playing a role in its
repression or activation, its depletion could cause gene deregulation. We explored H1 variant
occupancy at the promoters of genes specifically deregulated by the corresponding variant KD and
found no differences in abundance at distal promoter regions compared to total
genes (Figure R.20C). For example, the H1.2 content at the distal promoter of genes up- or
down-regulated by inducible H1.2 KD was similar to the H1.2 content distribution of total genes,
and only slightly lower for the up-regulated genes, contrary to what the hypothesis predicted.
Results were similar in all H1 KD cell lines. On the other hand, genes deregulated in H1 KD cells
were mostly within the top 50% of basally expressed genes, particularly the down-regulated genes
(Figure R.20A and B). Given this high basal expression, a low H1.2 content at the distal promoter
would be expected for genes deregulated by H1.2 KD according to the data in Figure R.10B and R.13,
and it would be expected to be lower at down-regulated than at up-regulated genes.
Strikingly, we observed that H1.2 content at down-regulated genes was higher than
predicted, suggesting that H1.2 may play an important role at those genes, which
became down-regulated upon its depletion from the promoter.
Figure R.20. Gene expression response to specific H1 variant knock-down is not related with the abundance of
such variant at the promoter. (A) Gene expression profile in T47D cells harboring inducible shRNAs against the
indicated H1 variant, in the presence or absence of Doxycycline as inducer (6 days treatment at 2.5 mg/ml),
represented from highest to lowest basal (-Dox) expression. Data was obtained by hybridization with an Illumina
microarray and reported elsewhere [97]. (B) Basal gene expression of the subsets of genes up- or down-regulated in
the different H1 KD cells, compared to expression of total transcriptome, represented as boxplots. (C) Boxplots of
H1.0-HA, H1.2 and H1X abundance (ChIP-chip probe intensity) at distal promoter regions (-3200 to -2000 bp
relative to TSS) for the genes up- or down-regulated upon H1.0, H1.2 or H1X knock-down, respectively, compared to
the H1 abundance of total genes.
6. ChIP-seq EXPERIMENTS TO FURTHER STUDY H1 VARIANT DISTRIBUTION IN
THE GENOME
To further determine whether the genomic distribution of H1 variants is heterogeneous, we
extended our study to high-throughput sequencing analysis in order to get the complete
genomic map for several H1 variants in T47D cells. For this purpose, we combined ChIP of
endogenous H1.2, H1X, and H3, and of HA-tagged H1.0, H1.2, and H1.4 with high-resolution
sequencing (ChIP-seq) up to 50 million reads per sample (Table R.2).
Table R.2. Summary of samples analyzed by ChIP-seq in three independent experiments (r1, r2, r3; replicates 1, 2 and
3). Read length, number of total reads obtained, number of reads mappable to the human genome version
hg18, and mapping rate are shown.
6.1. Genome-wide H1 variant distribution around gene bodies
To confirm the results obtained by ChIP-chip, we focused first on the input-subtracted
normalized average ChIP signal obtained around coding regions of genes, grouped according to
basal expression as before (Figure R.21). Again, the H1 valley at the TSS depended on
expression rates, and differences were seen between H1 variants. Mainly, H1.2 was less
abundant at the TSS of non-expressed genes than the other subtypes, which showed higher
levels towards nucleosome +1. Note that H1.2-HA and endogenous H1.2 behaved
similarly to each other and differently from the other variants. Transcription termination sites
(TTS) also showed differences between variants, being depleted of H1 subtypes, except for H1.2.
Interestingly, the H1 content of gene bodies increased towards the end and also
depended on gene expression rates. Thus, while H3 levels were uniform along gene
bodies, H1 variants such as H1.2 were reduced at the 5’ moiety of highly active genes (Figure
R.21).
Figure R.21. H1 distribution in gene bodies. Average, input-subtracted ChIP-seq signal of H1 variants around gene
bodies flanked by transcription start site (TSS) and transcription termination site (TTS), grouped according to basal
expression (10% of total genes in each group). EG1 represents top expressed genes, and EG10 genes with the lowest
expression. Average for all genes is shown in black. Genic regions are represented as a 3 kb-long meta-gene
surrounded by 1 kb region upstream TSS and 1 kb downstream TTS.
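The meta-gene representation described in this legend can be sketched in Python as below. This is only an illustration under stated assumptions: `signal` is taken to be a per-base, input-subtracted coverage vector, the gene body is rescaled by linear interpolation, and all function and parameter names are hypothetical. The normalization and binning actually used for Figure R.21 are not specified here.

```python
import numpy as np


def metagene_row(signal, start, end, strand="+", flank=1000, body_points=3000):
    """Rescale one gene to a fixed-length row: flank + body + flank.

    `start` < `end` are genomic coordinates of the gene; the window
    start - flank .. end + flank must lie inside `signal`.
    """
    region = np.asarray(signal[start - flank:end + flank], dtype=float)
    if strand == "-":
        region = region[::-1]  # read the gene from 5' to 3'
    upstream, body, downstream = region[:flank], region[flank:-flank], region[-flank:]
    # Linearly resample the gene body onto a fixed number of points.
    old_x = np.linspace(0.0, 1.0, num=body.size)
    new_x = np.linspace(0.0, 1.0, num=body_points)
    return np.concatenate([upstream, np.interp(new_x, old_x, body), downstream])


def expression_groups(expression, n_groups=10):
    """Group index per gene: 0 = top expressed, n_groups - 1 = lowest."""
    ranks = np.argsort(np.argsort(-np.asarray(expression), kind="stable"), kind="stable")
    return ranks * n_groups // len(expression)


def group_profiles(rows, groups, n_groups=10):
    """Average meta-gene profile for each expression group."""
    rows = np.asarray(rows)
    return np.array([rows[groups == g].mean(axis=0) for g in range(n_groups)])


# Toy example: a linear signal and a 40 bp gene with 10 bp flanks.
signal = np.arange(100, dtype=float)
plus = metagene_row(signal, 20, 60, "+", flank=10, body_points=5)
minus = metagene_row(signal, 20, 60, "-", flank=10, body_points=5)
groups = expression_groups([5.0, 1.0, 3.0, 2.0], n_groups=2)
```

Rescaling every gene body to the same number of points is what allows genes of different lengths to be averaged position by position within each expression group.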
In addition, as in the ChIP-chip analysis, the H1 depletion in the promoter of active
genes was wider than the predicted nucleosome-free region (NFR). Indeed, H3 depletion,
reporting nucleosome occupancy, was more restricted to the TSS than H1 depletion, which
extended up to 1 kbp upstream of the TSS, confirming again the broader H1 depletion in active
promoters compared to the NFR.
6.2. H1 variants are differentially depleted from regulatory regions and enriched at CpG
sites
ChIP-seq experiments enabled us to explore H1 variant distribution outside the promoter context.
In addition to the local displacement of H1 from active promoters, H1 variants were
also depleted from other regulatory regions along the genome, i.e. CCCTC-binding factor
(CTCF) binding sites corresponding to insulators, and p300 binding sites associated with
enhancers, but, interestingly, not as much from DNase hypersensitivity sites and FAIRE
regions, which represent open chromatin (Figure R.22A and D).
Moreover, we also studied H1 variant distribution at regions enriched for several histone
modifications. When the input-subtracted coverage of H1 variants across the peaks of
selected core histone modifications was calculated, depletion of H1.0 and H1.2, and to some
extent H1.4 but not H1X, was associated with positive histone marks linked to strong
enhancers such as H3K4me1, H3K4me2 and H3K27ac (Figure R.22B). H1 abundance at
H3K4me3 or H3K9ac sites, mainly enriched at the TSS of active promoters, differed between
variants, reflecting H1.2 depletion at the TSS of most genes but local enrichment of the other
variants immediately after the TSS. On the other hand, no significant enrichment of H1 was found
at negative histone marks such as H3K9me3 or H3K27me3. Notably, H1.2 abundance was
lower at active marks than at marks related to repression and chromatin compaction, in
agreement with the observed correlation between H1.2 content and gene repression. Note
that the data for histone PTMs were obtained in HeLa cells, so better correlations would be
expected with T47D data, which are not available.
Next, we investigated whether H1 variants coincided with CpG regions across the genome. As
seen in Figure R.22C and D, H1.0, H1X and H1.4 were clearly overrepresented at CpG regions
compared with H1.2. Because CpGs are mostly localized at gene promoters, this coincidence
may reflect the overall higher abundance of those variants compared to H1.2 around the TSS,
considering the lowly expressed genes. Alternatively, a major relationship between H1.0 (and
other variants other than H1.2) and CpG or DNA methylation cannot be ruled out. Further
analyses on this issue are presented below.
Figure R.22. H1 is depleted from regulatory regions but present at CpG sites in a variant-specific manner. (A)
ChIP-seq signal of H1 variants at regions enriched for the indicated genomic features shown as a boxplot: DNase
hypersensitivity sites (data from T47D cells), FAIRE regions (HeLa data), CTCF and p300 binding sites (T47D data). (B)
ChIP-seq signal of H1 variants at regions enriched for the indicated histone marks (data from HeLa cells) shown as a
boxplot. (C) ChIP-seq signal of the indicated H1 variants at CpG islands (as defined in UCSC database) shown as a
boxplot. (D) Average input-subtracted ChIP-seq signal of H1 variants around the center of genomic CTCF and p300
binding sites (T47D data) and CpG islands (as defined in UCSC database).
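An average profile around site centers, as in panel (D), could be computed along the following lines. This Python sketch assumes a per-base, input-subtracted coverage vector and a list of site-center coordinates on the same chromosome; the function name, the window size and the toy values are hypothetical.

```python
import numpy as np


def average_profile(signal, centers, half_window):
    """Mean signal in a window of +/- half_window bp around each center.

    Sites whose window would run past either end of `signal` are skipped.
    """
    signal = np.asarray(signal, dtype=float)
    windows = [
        signal[c - half_window:c + half_window + 1]
        for c in centers
        if c - half_window >= 0 and c + half_window < signal.size
    ]
    if not windows:
        raise ValueError("no site has a complete window inside the signal")
    return np.mean(windows, axis=0)


# Toy signal with local depletion at two sites; the site at position 1
# is too close to the edge and is ignored.
signal = np.zeros(50)
signal[10] = -2.0
signal[30] = -4.0
profile = average_profile(signal, centers=[1, 10, 30], half_window=2)
```

A dip at the middle of the resulting profile corresponds to depletion of the variant at the site center relative to the flanking sequence.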