Title: Role of CG Context and Content in Evolutionary Signatures of Brain DNA
Methylation
Running Head: CG Context and Content in Brain DNA Methylation
Authors: Yurong Xin1*, Anne H. O’Donnell1,2*, Yongchao Ge3, Benjamin Chanrion1, Maria
Milekic1, Gorazd Rosoklija1,4,5, Aleksander Stankov5, Victoria Arango1, Andrew J.
Dwork1,5,6, J. John Mann1, Jay Gingrich1, and Fatemeh G. Haghighi1**
* Authors contributed equally to this work
** Corresponding author: [email protected]
1 Department of Psychiatry, Columbia University and The New York State Psychiatric Institute, New York, NY
2 Integrated Graduate Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY
3 Department of Neurology, Mount Sinai School of Medicine, New York, NY
4 Macedonian Academy of Sciences and Arts, Skopje, Macedonia
5 School of Medicine, University “Ss. Cyril & Methodius,” Skopje, Macedonia
6 Department of Pathology and Cell Biology, Columbia University, New York, NY
ABSTRACT
DNA methylation is essential in brain function and behavior, yet understanding the role
of DNA methylation in brain-based disorders begins with the study of DNA methylation
profiles in normal brain. We apply an enzymatic-based approach, Methylation Mapping
Analysis by Paired-end Sequencing (Methyl-MAPS) that utilizes second-generation
sequencing technology to provide an unbiased representation of genome-wide DNA
methylation profiles of human and mouse brains. In this large-scale study, we assayed
CG methylation in cerebral cortex of neurologically and psychiatrically normal human
postmortem specimens, as well as mouse forebrain specimens. Cross-species human-mouse DNA methylation conservation analysis shows that DNA methylation is not
correlated with sequence conservation. Instead, greater DNA methylation conservation
is correlated with increasing CG density. Genomic regions with significant human-mouse
DNA methylation conservation (correlation >80%) typically have greater than 5 CG
dinucleotides within a 100bp window. In addition to CG density, these data show that
genomic context is a critical factor in DNA methylation conservation and alteration
signatures throughout mammalian brain evolution. We identify key genomic features that
can be targeted for identification of epigenetic loci that may be developmentally and
evolutionarily conserved and wherein aberrations in DNA methylation patterns can
confer risk for disease.
INTRODUCTION
DNA methylation is an evolutionarily ancient epigenetic mark found within most
eukaryotic organisms including fungi, plants, and animals [1, 2]. DNA methylation
modification involves addition of a methyl-group to cytosine bases in a heritable fashion.
Cytosine methylation can be categorized into CG, CHG, and CHH methylation (where H
refers to either A, C, or T nucleotides). In animals, methylation primarily occurs at CG
dinucleotides, whereas in plants, methylation of cytosines is observed in all contexts.
Cytosine methylation is catalyzed by highly conserved DNA methyltransferases (Dnmts)
[3, 4]. As such, DNA methylation is an essential evolutionary process involved in many
gene regulatory systems, including genomic imprinting, X-chromosome inactivation,
transposon silencing, and expression of endogenous genes.
Cross species comparative epigenomic analyses have revealed intriguing trends
in both the conserved and divergent features of DNA methylation in eukaryotic evolution.
Cytosine methylation has been detected throughout the majority of repetitive sequences
and transposons, but also within the body of protein coding genes. Within gene bodies,
CG methylation appears to be favored in exons over introns. Although the biological
function of gene body methylation or mechanisms by which gene bodies are targeted by
the methylation machinery are not well understood, the preferential methylation of exons
in plant and animal species appears to be an evolutionarily conserved phenomenon [5]. In
mammalian genomes, DNA methylation occurs almost exclusively at CG
dinucleotides, of which 70-80% are methylated, though a small amount of non-CG methylation is found within
embryonic stem cells [6-9]. The remaining unmethylated CG sites mostly occur in dense
clusters referred to as CG islands [10, 11]. Mammalian genomes with widespread DNA
methylation have undergone CG dinucleotide depletion over evolutionary time as a
consequence of cytosine methylation that leads to increased rate of C-to-T transitions
occurring after deamination of methylated cytosines [12-14]. Such depletions are not
ubiquitous since some genomic regions, like CG islands, remain enriched in CG
dinucleotides [15, 16]. CG islands represent a large fraction of cis-regulatory sequences
because they occupy the majority of gene promoters [12, 17, 18], with non-promoter CG
islands functioning as distal regulators (e.g., insulators and enhancers) [19]. The
localized genomic depletion of CG dinucleotides in vertebrates results in a markedly
different CG distribution as compared to genomes of invertebrates. Invertebrate
genomes show homogeneous distribution and no depletion of CG dinucleotides,
whereas vertebrate genomes are globally depleted, except at CG islands [20].
Establishment of DNA methylation patterns is dependent on CG content within
specific genomic compartments. For example, vertebrate promoters comprise
two major classes based on CG content (i.e., CG-poor and CG-rich promoters). CG-poor
promoters are largely methylated and show tissue-specific patterns of methylation and
gene expression. In contrast, CG-rich promoters are typically unmethylated. Although
CG-rich promoters containing CG islands are generally thought to regulate expression of
housekeeping genes, they also show tissue-specific patterns of gene expression [20].
Furthermore, regions of intermediate CG density that lie in close proximity, but often not
within CG islands (generally referred to as island shores), show tissue-specific
differential patterns of methylation [21]. These regions appear to be sufficient for
distinguishing liver, spleen, and brain tissues, and also appear to be conserved between
human and mouse. In the present study, we show that in distinct genomic contexts, the
conservation in CG methylation is highly correlated with CG content or density. While
genomic regions, such as CG islands with high CG density, show greater methylation
conservation, the CG island shores with lower CG density show substantially less DNA
methylation conservation. These observations are not limited to CG island promoters. In
our cross species study, we have profiled genome-wide DNA methylation patterns in
human and mouse cortex and show that DNA methylation conservation is only weakly
correlated with sequence conservation. Rather, DNA methylation conservation is
strongly correlated with CG content. This is consistent with the finding that genomic
regions with relatively low CG density, such as CG island shores, exhibit greater DNA
methylation variability across multiple tissue types [21]. The functional implications of
these results are dependent on genomic context. In the context of promoter CG islands,
the CG depleted shores may contain cis-acting regulatory elements with important roles
in transcriptional regulation. We present results from our studies investigating the pattern
of DNA methylation conservation and alteration in different genomic contexts with an
evolutionary perspective.
RESULTS
In this large-scale DNA methylation profiling study, we generated 657,424,756
sequence reads, mapping DNA methylation for greater than 80% of CG sites
within human and mouse brains. These data are from 10 human cortical specimens (6
ventral prefrontal and 4 auditory) and 5 mouse (129S6/SvEv)
forebrain specimens. Human samples were selected from our postmortem brain
collection with complete neuropathological and psychopathological data, as well as brain
toxicology reports, all confirming the absence of neuropsychiatric disorders, pathological
lesions, and psychoactive substances [22]. DNA methylation patterns did not differ
significantly by age, sex, pH, or postmortem interval within specimens examined (data
not shown). DNA methylation data were generated using Methylation Mapping
Analysis by Paired-end Sequencing (Methyl-MAPS). The Methyl-MAPS method is based
on our previously reported enzymatic assay [23], which has been extended to take
advantage of vast improvements in sequencing throughput of second-generation
sequencing technology [24] (see supplementary materials for details of experimental and
analytical procedures). Methyl-MAPS data were independently validated by Illumina
Infinium HumanMethylation27 BeadChip technology [25], showing strong correlation of
results (Pearson correlation coefficient r=0.81) between the two experimental methods
across 19,627 overlapping CG sites, thus demonstrating that the Methyl-MAPS method
produces robust and reliable DNA methylation data genome-wide. Although our
methylation data provide >80% coverage of CG sites in the genome, examination of
methylation levels from biological replicates revealed that a genomic coverage of 8 or
greater was needed to produce robust estimates of methylation states (shown in supplementary
materials). Hence, all analyses were based on these high-coverage CG sites. Using this
coverage constraint, we were able to map 36%, i.e., 10,262,160 CG sites in common
across all human cortical samples, representing the largest whole-genome methylation
profiling effort in primary tissue to date.
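
The coverage constraint described above can be sketched as a simple set intersection. This is a minimal illustration, not the authors' pipeline: it assumes per-site coverage is available as an integer read depth per sample, and the names `Site`, `high_coverage_sites`, and `common_sites` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical site key: (chromosome, position of the C in the CG dinucleotide).
Site = Tuple[str, int]

MIN_COVERAGE = 8  # threshold stated in the text


def high_coverage_sites(coverage: Dict[Site, int],
                        min_cov: int = MIN_COVERAGE) -> Set[Site]:
    """Return the CG sites of one sample whose coverage meets the threshold."""
    return {site for site, depth in coverage.items() if depth >= min_cov}


def common_sites(samples: List[Dict[Site, int]],
                 min_cov: int = MIN_COVERAGE) -> Set[Site]:
    """Return the CG sites that pass the coverage threshold in every sample."""
    if not samples:
        return set()
    shared = high_coverage_sites(samples[0], min_cov)
    for coverage in samples[1:]:
        shared &= high_coverage_sites(coverage, min_cov)
    return shared
```

Under this reading, a site is retained only if it passes the threshold in every sample, which is why the common set (36% of CG sites) is much smaller than the >80% of sites covered at any depth.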
Cross-species DNA methylation conservation
Data from whole genome DNA methylation profiling of human and mouse brains
allowed us to determine whether there is a relationship between sequence conservation
and DNA methylation. We examined methylation patterns within evolutionarily conserved
regions [26], positing that conservation at these regions likely reflects a biologically
significant function. Human-mouse cross-species methylation comparison of such
regions showed that methylation correlates weakly with sequence conservation (r=0.25;
Figure S1). This was comparable with correlations from random selections of syntenic
regions from human-mouse pair-wise alignments (r=0.15). Additionally, DNA methylation
patterns were compared within promoters of human-mouse orthologous genes since
promoters are regions where DNA methylation confers a functional effect in the
repression of gene expression. Here also, we observed weak correlation in human and
mouse DNA methylation patterns within 8,956 orthologous gene promoters (r=0.16;
Figure S2).
If sequence conservation is not the driving force underlying DNA methylation
conservation, then what is? This prompted us to ask whether evolutionary conservation
of DNA methylation is driven by CG density rather than sequence homology. Since CG
dinucleotides are the targets of DNA methyltransferase enzymes with the propensity to
methylate CG clusters [27, 28], it is highly plausible that CG density may be the key
factor. We examined human-mouse DNA methylation profiles within distinct ranges of
CG density and found that indeed, DNA methylation conservation increases with
increasing CG content (Figure 1). These data show that evolutionarily, DNA methylation
patterns are conserved in regions of the genome with CG enrichment. Data from human-mouse pair-wise alignments were used to identify genomic regions with comparable
ranges of CG density in both human and mouse. Human and mouse DNA methylation levels were
highly correlated (r=0.80) in regions with a CG density of 5 CG
dinucleotides or greater per 100 base pairs; by comparison, the genome average is approximately 1
CG per 100 base pairs. Furthermore, the observed patterns of DNA methylation were
quite striking, becoming increasingly polarized towards extreme hypo- and hypermethylated states with increased CG density (Figure 2-top panel). The density ranges
represented in Figure 2 as poor, intermediate, and rich were selected for illustrative
purposes based on our observed correlation trends (Figure 1). While we find that there is
a greater accumulation of internal introns in the CG-poor compared to CG-rich
compartments, gene promoters and exons appear to be depleted in CG-poor
compartments and overrepresented in CG-rich compartments (Figure 2-bottom panel).
These observations prompted us to re-examine the promoters of orthologous genes
(Figure S2) in context of CG density.
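
The density-stratified comparison described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the analysis code used in the study: each aligned region is assumed to carry a CG density (CG sites per 100bp) and one methylation level per species, and `correlation_by_density` and its bin width are hypothetical choices.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for constant input
    return sxy / math.sqrt(sxx * syy)


def correlation_by_density(
    regions: Iterable[Tuple[float, float, float]],
    bin_width: float = 1.0,
) -> Dict[float, float]:
    """Group regions by CG density and correlate human vs. mouse methylation.

    Each region is (cg_per_100bp, human_methylation, mouse_methylation).
    Returns {lower edge of density bin: Pearson r}; bins with fewer than
    two regions are skipped.
    """
    bins = defaultdict(lambda: ([], []))
    for density, human, mouse in regions:
        index = int(density // bin_width)
        bins[index][0].append(human)
        bins[index][1].append(mouse)
    return {
        index * bin_width: pearson(human, mouse)
        for index, (human, mouse) in sorted(bins.items())
        if len(human) >= 2
    }
```

Plotting the returned coefficients against the bin edges would give a curve of the kind summarized in Figure 1, with the correlation rising as CG density increases.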
Within CG-rich gene promoters, DNA methylation patterns are highly conserved
proximal to transcription start sites (TSS) (Figure 3). The extent of DNA methylation
conservation appears to be confined to [-200bp, +500bp] around the TSS. This pattern
tracks with CG density, revealing that as expected, these CG-rich promoters are hypomethylated and are ultra-conserved across species (Figure 3). Remarkably, the shores
of these CG-rich regions show elevated DNA methylation variability that extends up to
2KB upstream and downstream of the TSS (Figure 3-bottom panel). This is consistent
with previous findings, reporting tissue-specific DNA methylation variation in CG island
shores within gene promoters [21]. These shores demarcate transitions in CG density. A
salient characteristic of these shores is that they retain an intermediate CG density
cross-species (Figure 3-middle panel), which might be important for establishment of
tissue-specific and perhaps species-specific methylation profiles. Beyond these shores,
DNA methylation patterns resume basal levels (Figure 3-top panel) and do not appear to
vary substantially cross-species (Figure 3-bottom panel). Taken together, these results
underscore the significance of CG density in sustained DNA methylation conservation
and alteration throughout evolution as a fundamental component of mammalian
epigenetic programming.
Intra-species DNA methylation conservation and alteration in cortex
In human cortex, patterns of DNA methylation conservation and alteration are
also highly dependent on CG density. In comparing DNA methylation signatures within
prefrontal cortex (PFC) and auditory cortex (AC) across individuals, we found that DNA
methylation patterns are highly conserved among 26% of CG sites and only significantly
altered in 1% of CG sites, with the remaining having variable CG methylation. These
conserved, altered, and variable methylated domains were represented within all
genomic features (Figure S3). Further examination of the conserved and altered
methylation distributions relative to CG density revealed that in human cortex, DNA
methylation is conserved in CG-rich regions and altered in CG-depleted regions of the
genome (Figure 4). The CG density transition associated with conserved methylation is
similar to what we observed cross-species (approximately 5 CG sites within a 100bp
region). This might demarcate a potential genomic signature for evolutionary
conservation of DNA methylation and maintenance.
Investigation of conserved and altered CG sites within different genomic
compartments showed that, in addition to CG density, genomic context determines DNA
methylation profiles. Adopting a gene-centric view, we found that across different gene
features, DNA methylation patterns vary markedly relative to CG density (Figure 5).
Promoters and 1st exons demonstrate the expected pattern of methylation, showing less
methylation in CG-dense regions associated with CG islands and more methylation in
CG-poor regions. The internal exons and introns tend to be more methylated with
greater CG depletion as compared to promoters and 1st exons. Thus, the difference in
methylation pattern between promoters/1st exons and internal exons/introns correlates
with different patterns of CG-density distributions. Intriguingly, the transition between
unmethylated and methylated domains appears to have similar CG transition density
observed previously, when comparing conserved and altered genomic DNA methylation
profiles. These results are also evident in the mouse genome (Figure S4).
We next focused our attention on the promoter and TSS regions, considering
whether differences in DNA methylation can be detected when comparing two human
cortical regions for two distinct classes of CG-rich and -poor promoters. We addressed
this question in the context of DNA methylation conservation. We found that for CG-rich
promoters, DNA methylation patterns are highly conserved within approximately [-200bp,
+500bp] of the TSS (Figure 6). Beyond this range, we detected elevated methylation
differences for up to ±2KB akin to our previous observations with the “shores” (Figure 3).
Although the magnitude of DNA methylation differences is relatively small, the patterns
are highly consistent. This indicates that these shores may harbor regulatory elements
that, in addition to determining tissue- and species-specific methylation profiles, may
reveal DNA methylation differences specific to human cortical specialization and function.
It is possible that the extent of DNA methylation differences at the shores might be a
reflection of the samples' functional and evolutionary divergence.
We also examined the DNA methylation patterns at shores of CG islands. In the
two cortical regions, we examined gene promoter associated CG island shores, as well
as CG island shores contained within gene bodies. We found that promoter-associated
islands show differential methylation patterns at the shores (Figure 7), consistent with
our observations at the promoter TSS region (Figure 6), and previous reports [21]. We
also examined the pattern of CG island shore methylation within gene bodies, and found
that the two cortical regions did not differ in their methylation patterns at these shores
(Figure 7). This is perhaps not unexpected, since gene body CG islands tend to be
highly methylated as part of the evolutionarily conserved feature of gene body
methylation [29]. Indeed, the average length of gene body islands is half the length of
those in gene promoters [24]. This is attributed to cytosine methylation resulting in
increased C-to-T transitions after deamination of methylated cytosines, and hence
depletion of CG sites from gene body CG islands. However, a small fraction of gene
body CG islands remain unmethylated within each cortical region (Figure S5). We also
found that a small proportion of promoter-associated CG islands are methylated.
Examination of these CG island shores revealed that unmethylated CG islands within
the promoters and gene bodies show striking similarity in their shores, in that they exhibit
increased methylation differences between PFC and AC in the immediate shores of the
CG islands (Figure S6-top panel). In contrast, the methylated CG islands show no such
differences at the shores (Figure S6-middle panel). These findings indicate that although
CG islands within gene bodies may be few in number, they still may harbor regulatory
regions with potentially tissue-specific gene expression [30]. Taken together, these
results underscore the importance of CG content and context in patterns of DNA
methylation conservation and alteration in mammalian brain evolution.
DISCUSSION
To our knowledge, the data presented here represent the first comprehensive map of the DNA methylation
landscape of the human cortex. In this study, we have demonstrated
that CG methylation patterns have a distinct evolutionary signature that is both content- and context-dependent. We find that in the brain, CG methylation is conserved in CG-dense
regions of the genome, independent of sequence conservation. The
extent of DNA methylation conservation extends to both unmethylated and methylated
compartments of the genome. Also, the genomic environment plays a critical role in
defining DNA methylation patterning throughout the mammalian genome. These
methylation signatures provide important insight into the role of DNA methylation in brain
development and disease.
Genomic regions, like CG islands with conserved DNA methylation signatures
cross-species, may be important in basic cellular function and development of brain
neural circuitry. Although conserved in the adult brain, DNA methylation patterns at
these regions may be highly dynamic during neurodevelopment. Differential DNA
methylation patterns have been well established in imprinted loci during early
development [31, 32], and it is likely that methylation changes at non-imprinted loci can
also play an important role in neuronal development and differentiation. One can
speculate that aberrant changes in DNA methylation in epigenetically conserved regions
would result in drastic phenotypic abnormalities during early stages of development.
Hence, it is plausible to posit that DNA methylation changes during key
neurodevelopmental trajectories may lead to such severe early-onset
neurodevelopmental disorders as autism and childhood schizophrenia. Autism is an
early-onset disorder typically diagnosed before the age of 3 and is accompanied by
significantly altered neurodevelopment. Individuals with the disorder exhibit global
cerebral gray matter hyperplasia in the first 2 years of life [33] and larger frontal and
temporal gray matter volumes by 4 years, followed by a slower rate of growth in these
regions by 7 years [34, 35]. Childhood-onset schizophrenia, with a mean age of onset
around 10, is associated with striking parietal gray matter loss, which progresses
anteriorly during adolescence [36]. Alterations in DNA methylation patterns that can
influence the degree or timing of basic brain maturational patterns may at least partially
underlie these neurodevelopmental disorders and are active areas of current research.
Therefore, in DNA methylation studies of neurodevelopmental disorders, evolutionarily
conserved CG-rich genomic regions are important targets in investigations involving the
developmental origins of such disorders.
While CG-rich genomic regions with evolutionarily conserved DNA methylation
patterns may be essential in normal brain development and function, they might not be
the likely foci for DNA methylation abnormalities associated with adult-onset
neuropsychiatric disorders, which may result from accumulating environmental insults
during the lifespan. Unlike dramatic methylation changes that are typically observed in
most cancers, DNA methylation changes for such disorders are likely to be more subtle.
Thus, changes in genomic regions with intermediate or poor CG density, as in CG island
shores, are likely the targets. In contrast to CG islands that are generally protected [37,
38], CG dinucleotides at these regions are most likely targets of epigenetic modifications
that are induced by accumulating exposures to environmental stressors. DNA
methylation modifications at these regions may be the consequence of the
neuropathology associated with the disease, or the consequence of environmental
stressors that increase disease risk. For instance, adult-onset schizophrenia (the more
typical form) is more strongly associated with deficits in later-maturing temporal and
frontal regions [39-41], and is associated with selective abnormalities of the heteromodal
regions [42]. Also, in major depression, decreased volumes of cortical and subcortical
regions have been reported. Patients with major depressive disorder show reduced gray
matter concentration in the left inferior temporal cortex, the right orbitofrontal and the
dorsolateral prefrontal cortex [43]. Such brain abnormalities in schizophrenia and
depression may, in addition to genetic factors, be due to epigenetic factors that undergo
alterations in response to the changes in the environment. In disease studies of
schizophrenia and depression in postmortem samples, it is not possible to separate the
influence of genetics and environment on disease neuropathology and associated
epigenetic alterations. However, epigenetic studies provide crucially important insight into
regulatory signatures that may be aberrantly regulated in these disorders.
Discovery of DNA methylation alterations associated with neurodevelopmental
and neuropsychiatric disorders will depend on CG content and context in the genome. It
is critical to recognize the potential role that these factors play in brain DNA methylation
signatures, and to profile these methylation patterns within relevant brain regions. In the
brain, DNA methylation patterns are region-specific [44, 45] and cell-specific. Although
the present study examines DNA methylation patterns within specific cortical regions, it
does not capture the DNA methylation complexity within specific cell populations. This is
partly due to limitations in current approaches, which require relatively large amounts of
starting DNA to perform such whole-genome DNA methylation profiling. However, with
improvements in methylation sequencing technology, interrogation of cell-specific
epigenetic profiles will be possible in the near future. Hence, depending on the nature of
the brain-based disorder, specific brain regions and cell populations, and genomic
regions (i.e., conserved or altered regions) may be targeted for epigenetic studies. This
work highlights the evolutionary signatures of DNA methylation patterning in the brain,
which will both inform our understanding of DNA methylation programming in
mammalian genomes and offer insight into study-design considerations for future
epigenetic studies of neurodevelopmental and neuropsychiatric disorders.
MATERIALS AND METHODS
Ethics Statement: All human and rodent procedures were approved by the Institutional
Review Boards of the New York State Psychiatric Institute / Columbia University
Department of Psychiatry and the School of Medicine, University “Ss Cyril & Methodius,”
or by the Institutional Animal Care and Use Committee, respectively.
Samples and Subjects: Human brain specimens were obtained from the New York
State Psychiatric Institute (NYSPI) and the Macedonia brain collection. Normal human
tissue specimens were obtained with institutional review board approval and anonymous
individual identifiers. A total of 10 cortical specimens were dissected frozen, including six
specimens from the rostral portion of the right orbital gyrus, and four specimens from the
right primary auditory cortex. For three subjects, tissue specimens from both prefrontal
and auditory cortices were collected. Samples consisted of 4 males, average age 51±5
years, and 3 females, average age 45±5. Brain pH ranged from 6.3 to 6.7 with average
postmortem interval (PMI) of 9.4±5.1 hours. The subjects had died suddenly and were
autopsied in the Institute for Forensic Medicine at the School of Medicine, University “Ss.
Cyril & Methodius,” Skopje, Macedonia. Cases were chosen that were without
psychopathology or history of psychoactive drugs (as determined by psychological
autopsy interviews with their survivors), without significant abnormalities on
neuropathological examination, and with negative screening of brain and body fluids for
psychoactive drugs, including therapeutic levels in brain. The right cerebral hemisphere
was sliced coronally at intervals of 2-4 cm. The slices were rapidly frozen in Freon 134a
(1,1,1,2 tetrafluoroethane) and stored at -80ºC until used. For the current study, slices
were warmed to -20ºC, and cortical samples of ~200-500 mg were cut, with a scalpel
chilled in dry ice, from the rostral portion of the orbital gyrus (BA47), taking care to avoid
visible white matter. Primary auditory cortex (BA41) was similarly dissected from the
caudal portion of Heschl’s gyrus [46]. Mouse brain (strain 129 SvEv, from Taconic) DNA
was collected from the entire left cerebral hemisphere. Samples were kept at -80ºC until
further processing.
Methyl-MAPS Procedure: Brain DNA from human and mouse was fractionated into
methylated and unmethylated compartments. Paired-end libraries were constructed,
subsequently sequenced, and then mapped onto the human genome. DNA fractionation
and library preparation methods have been previously described [23] (and detailed in
supplementary materials). Briefly, the Methyl-MAPS procedure is as follows.
Seven micrograms of DNA is digested with McrBC and RE in parallel. McrBC
endonuclease generates the unmethylated compartment and is able to interrogate the
methylation state of more than 74% of the CG sites in the genome. The methylated
compartment is generated by digestion with a panel of all known methylation-sensitive
tetranucleotide restriction enzymes termed RE (HpaII, HhaI, AciI, BstUI, and HpyCH4IV),
each of which cuts at a specific 4bp sequence only if the CG in the recognition site is
unmethylated. By using such a cocktail, sequence-specific biases of the enzyme
recognition site are minimized, and we are able to interrogate the methylation state of
38% of CG sites genome-wide. The strength of this approach is that each strategy
augments the other and combined, they provide greater overall coverage, permitting the
assessment of >80% of all CG sites genome-wide. Jumping libraries of fragments
greater than 700bp in size were constructed utilizing EcoP15’s unique digestion
properties. Deep sequencing of digested sequence fragments was performed on the
SOLiD sequencing platform from Applied Biosystems [47]. Mapping of paired-end
sequenced fragments was performed with the SOLiD software analysis package.
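
To make the role of the RE panel concrete, the sketch below estimates what fraction of CG sites in a sequence fall inside a recognition site of at least one of the five enzymes. It is an illustration only: the recognition sequences are taken from standard enzyme catalogs rather than from this manuscript, the function names are hypothetical, and McrBC (whose recognition requirements differ) is not modeled.

```python
from typing import List, Set

# Recognition sequences for the methylation-sensitive panel (an assumption
# drawn from standard enzyme catalogs, not stated in the manuscript).
RE_SITES = {
    "HpaII": "CCGG",
    "HhaI": "GCGC",
    "AciI": "CCGC",      # non-palindromic: GCGG on the opposite strand
    "BstUI": "CGCG",
    "HpyCH4IV": "ACGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cg_positions(seq: str) -> List[int]:
    """Return the 0-based index of the C in every CG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def re_interrogated_cgs(seq: str) -> Set[int]:
    """Return positions of CG sites lying inside any RE recognition site."""
    seq = seq.upper()
    motifs = set()
    for site in RE_SITES.values():
        motifs.update((site, reverse_complement(site)))
    covered: Set[int] = set()
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            # Record every CG contained in this motif occurrence.
            for offset in range(len(motif) - 1):
                if motif[offset:offset + 2] == "CG":
                    covered.add(start + offset)
            start = seq.find(motif, start + 1)  # allow overlapping hits
    return covered


def re_interrogated_fraction(seq: str) -> float:
    """Fraction of CG sites in seq that the RE panel could interrogate."""
    all_cg = cg_positions(seq)
    if not all_cg:
        return 0.0
    return len(re_interrogated_cgs(seq)) / len(all_cg)
```

Run over a whole genome, a calculation of this kind would be expected to land near the 38% figure quoted above, though the exact value depends on the genome build and on how ambiguous bases are handled.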
Methyl-MAPS Estimation of Methylation State: The methylation probability of a CG
dinucleotide was estimated by the genomic coverage of the methylated and
unmethylated sequence fragments from the RE and McrBC library, respectively. The RE
fragments contribute to the methylation coverage and the McrBC fragments contribute to
the unmethylated coverage. For each CG, we defined two variables to represent
coverage corresponding to methylated coverage ($n_1$ for RE fragments) and unmethylated
coverage ($n_2$ for McrBC fragments). To correct intra-individual sampling bias between
the methylated and unmethylated compartments, we estimated the ratio ($\lambda$) of sampling
probabilities for the McrBC and RE library with

$$\lambda \, \frac{\sum n_1}{\sum n_2} = \frac{p}{1 - p},$$

where $p$ is the global methylation level estimated experimentally using a highly reliable
method, known as the LUminometric Methylation Assay (LUMA) [48]. Therefore, the
methylation probability $\hat{p}$ of each CG was computed using the following equation

$$\hat{p} = \frac{\lambda n_1}{\lambda n_1 + n_2}.$$
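
The estimation step can be illustrated with a short sketch. It rests on one reading of the description above, which should be treated as an assumption: the correction factor (called `lam` here) is the McrBC-to-RE ratio of sampling probabilities, chosen so that the pooled counts reproduce the global methylation level p, and each site's probability is then lam*n1 / (lam*n1 + n2). The function names and the dictionary layout are hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple

# counts maps a CG site identifier to (n1, n2):
#   n1 = methylated coverage (RE fragments)
#   n2 = unmethylated coverage (McrBC fragments)
Counts = Dict[Hashable, Tuple[int, int]]


def sampling_ratio(counts: Counts, p_global: float) -> float:
    """Solve lam * sum(n1) / sum(n2) = p / (1 - p) for lam.

    p_global is the experimentally measured global methylation level
    (from LUMA in the text); it must lie strictly between 0 and 1.
    """
    if not 0.0 < p_global < 1.0:
        raise ValueError("p_global must be strictly between 0 and 1")
    total_n1 = sum(n1 for n1, _ in counts.values())
    total_n2 = sum(n2 for _, n2 in counts.values())
    if total_n1 == 0:
        raise ValueError("no methylated coverage; ratio is undefined")
    return (p_global * total_n2) / ((1.0 - p_global) * total_n1)


def methylation_probability(n1: int, n2: int, lam: float) -> Optional[float]:
    """Bias-corrected methylation probability of one CG site.

    Returns None when the site has no coverage in either library.
    """
    denominator = lam * n1 + n2
    if denominator == 0:
        return None
    return lam * n1 / denominator
```

A useful sanity check on this reading is that applying the same formula to the pooled counts returns the global level p exactly, so the correction rescales the two libraries without changing the genome-wide average.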
Methyl-MAPS Data Analysis: Genomic feature annotations were downloaded from the
UCSC Bioinformatics website (http://genome.ucsc.edu/), including CG islands and
RefSeq gene annotations. All annotations and methylation data were indexed by CG site
and were stored in a MySQL database for use in subsequent analyses. CG
dinucleotides were overlapped with the following mutually exclusive genomic features,
i.e., promoters, first exons, first introns, internal exons, internal introns, and so on.
RefSeq gene annotations [49] were based on the Human Genome NCBI Build 36 and
the Mouse Genome NCBI Build 37. Only genes with complete start and end coding
sequences were used in our analysis. Orthologous gene annotation for human and
mouse was obtained from the HomoloGene database
(http://www.ncbi.nlm.nih.gov/homologene) and formed a subset of the RefSeq genes.
For promoter analyses, promoter regions were defined as the 1KB upstream of the TSS.
Classification of CG-poor and CG-rich promoters was determined by observed CG
density within ±500bp centered at TSS of RefSeq genes. Both human and mouse
exhibited expected bimodal distributions for CG density. These data were used to
determine cutoff values for CG-poor and CG-rich promoters with <0.07 and >0.07 CG
densities, respectively. Here CG density is defined as (2  #CG sites)/fragment length.
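As a minimal illustration of the density definition and the 0.07 cutoff, the following Python sketch counts CG dinucleotides on one strand of a sequence and classifies it. The function names are hypothetical, and the handling of a density exactly equal to the cutoff is an assumption, since the text only specifies <0.07 and >0.07.

```python
def cg_density(sequence):
    """CG density = (2 x number of CG sites) / fragment length."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # "CG" cannot overlap itself, so str.count gives the number of CG sites.
    return 2 * seq.count("CG") / len(seq)


def classify_promoter(sequence, cutoff=0.07):
    """Label a promoter window as CG-poor or CG-rich by its CG density."""
    density = cg_density(sequence)
    if density > cutoff:
        return "CG-rich"
    if density < cutoff:
        return "CG-poor"
    return "unclassified"  # assumption: the text leaves density == cutoff open


# Hypothetical 100 bp fragments with 3 and 4 CG sites.
poor_example = "CG" * 3 + "A" * 94   # density 0.06
rich_example = "CG" * 4 + "A" * 92   # density 0.08
labels = (classify_promoter(poor_example), classify_promoter(rich_example))
```

Under this definition, a 100 bp fragment crosses the cutoff between 3 and 4 CG sites, which links the density scale to the "#CG per 100 bp" scale used in the figures.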
For ease of representation, we also report the number of CG sites within a 100 bp
sequence fragment to describe CG density. Finally, evolutionarily conserved regions
(ECRs) [26] were used to estimate correlations between DNA methylation conservation
and sequence similarity. ECRs are sequence fragments greater than 100 bp in
length with greater than 70% sequence identity between human and mouse. For
genome-wide sequence alignments, we used pair-wise human-mouse alignments [50].
ACKNOWLEDGEMENTS
We would like to thank Dr. Smiley for his efforts in the dissection of auditory cortex
specimens.
References
1.
Chan, S.W.L., I.R. Henderson, and S.E. Jacobsen, Gardening the genome: DNA
methylation in Arabidopsis thaliana. Nature Reviews Genetics, 2005. 6(5): p.
351-360.
2.
Law, J.A. and S.E. Jacobsen, Establishing, maintaining and modifying DNA
methylation patterns in plants and animals. Nature Reviews Genetics. 11(3): p.
204-220.
3.
Cheng, X.D. and R.M. Blumenthal, Mammalian DNA methyltransferases: A
structural perspective. Structure, 2008. 16(3): p. 341-350.
4.
Goll, M.G. and T.H. Bestor, Eukaryotic cytosine methyltransferases. Annual
Review of Biochemistry, 2005. 74: p. 481-514.
5.
Feng, S.H., et al., Conservation and divergence of methylation patterning in
plants and animals. Proceedings of the National Academy of Sciences of the
United States of America. 107(19): p. 8689-8694.
6.
Ehrlich, M., et al., Amount and distribution of 5-methylcytosine in human DNA
from different types of tissues of cells. Nucleic Acids Res, 1982. 10(8): p. 2709-21.
7.
Bird, A., DNA methylation patterns and epigenetic memory. Genes Dev, 2002.
16(1): p. 6-21.
8.
Lister, R., et al., Human DNA methylomes at base resolution show widespread
epigenomic differences. Nature, 2009. 462(7271): p. 315-22.
9.
Ramsahoye, B.H., et al., Non-CpG methylation is prevalent in embryonic stem
cells and may be mediated by DNA methyltransferase 3a. Proc Natl Acad Sci U
S A, 2000. 97(10): p. 5237-42.
10.
Cedar, H. and Y. Bergman, Linking DNA methylation and histone modification:
patterns and paradigms. Nat Rev Genet, 2009. 10(5): p. 295-304.
11.
Suzuki, M.M. and A. Bird, DNA methylation landscapes: provocative insights
from epigenomics. Nature Reviews Genetics, 2008. 9(6): p. 465-476.
12.
Antequera, F. and A. Bird, Number of CpG islands and genes in human and
mouse. Proc Natl Acad Sci U S A, 1993. 90(24): p. 11995-9.
13.
Bestor, T.H. and A. Coxon, Cytosine methylation - the pros and
cons of DNA methylation. Current Biology, 1993. 3(6): p. 384-386.
14.
Weber, M., et al., Distribution, silencing potential and evolutionary impact of
promoter DNA methylation in the human genome. Nature Genetics, 2007. 39(4):
p. 457-466.
15.
Gardiner-Garden, M. and M. Frommer, CpG islands in vertebrate
genomes. Journal of Molecular Biology, 1987. 196(2): p. 261-282.
16.
Takai, D. and P.A. Jones, Comprehensive analysis of CpG islands in human
chromosomes 21 and 22. Proceedings of the National Academy of Sciences of
the United States of America, 2002. 99(6): p. 3740-3745.
17.
Bajic, V.B., et al., Mice and men: Their promoter properties. Plos Genetics, 2006.
2(4): p. 614-626.
18.
Ioshikhes, I.P. and M.Q. Zhang, Large-scale human promoter mapping using
CpG islands. Nature Genetics, 2000. 26(1): p. 61-63.
19.
Tanay, A., et al., Hyperconserved CpG domains underlie Polycomb-binding sites.
Proceedings of the National Academy of Sciences of the United States of
America, 2007. 104(13): p. 5521-5526.
20.
Mohn, F., et al., Lineage-specific polycomb targets and de novo DNA methylation
define restriction and potential of neuronal progenitors. Mol Cell, 2008. 30(6): p.
755-66.
21.
Irizarry, R.A., et al., The human colon cancer methylome shows similar hypo- and
hypermethylation at conserved tissue-specific CpG island shores. Nat Genet,
2009. 41(2): p. 178-86.
22.
Kelly, M.P., C.T. Johnson, and J.M. Govern, Recognition memory test: validity in
diffuse traumatic brain injury. Appl Neuropsychol, 1996. 3(3-4): p. 147-54.
23.
Rollins, R.A., et al., Large-scale structure of genomic methylation patterns.
Genome Res, 2006. 16(2): p. 157-63.
24.
Edwards, J.R., et al., Chromatin and sequence features that define the fine and
gross structure of genomic methylation patterns. Genome Res.
25.
Bibikova, M. and L. Zhou, Genome-wide DNA methylation profiling using
Infinium(R) assay. Epigenomics, 2009. 1(1): p. 177-200.
26.
Loots, G. and I. Ovcharenko, ECRbase: database of evolutionary conserved
regions, promoters, and transcription factor binding sites in vertebrate genomes.
Bioinformatics, 2007. 23(1): p. 122-4.
27.
Jia, D., et al., Structure of Dnmt3a bound to Dnmt3L suggests a model for de
novo DNA methylation. Nature, 2007. 449(7159): p. 248-U13.
28.
Glass, J.L., et al., CG dinucleotide periodicities recognized by the Dnmt3a-Dnmt3L
complex are distinctive at retroelements and imprinted domains. Mamm
Genome, 2009. 20(9-10): p. 633-43.
29.
Feng, S., et al., Conservation and divergence of methylation patterning in plants
and animals. Proc Natl Acad Sci U S A. 107(19): p. 8689-94.
30.
Maunakea, A.K., et al., Conserved role of intragenic DNA methylation in
regulating alternative promoters. Nature, 2010. 466(7303): p. 253-7.
31.
Dulac, C., Brain function and chromatin plasticity. Nature, 2010. 465(7299): p.
728-35.
32.
Wilkinson, L.S., W. Davies, and A.R. Isles, Genomic imprinting effects on brain
development and function. Nat Rev Neurosci, 2007. 8(11): p. 832-43.
33.
Courchesne, E., R. Carper, and N. Akshoomoff, Evidence of brain overgrowth in
the first year of life in autism. Jama, 2003. 290(3): p. 337-44.
34.
Carper, R.A., et al., Cerebral lobes in autism: early hyperplasia and abnormal
age effects. Neuroimage, 2002. 16(4): p. 1038-51.
35.
Saitoh, O. and E. Courchesne, Magnetic resonance imaging study of the brain in
autism. Psychiatry Clin Neurosci, 1998. 52 Suppl: p. S219-22.
36.
Thompson, P.M., et al., Mapping adolescent brain change reveals dynamic wave
of accelerated gray matter loss in very early-onset schizophrenia. Proc Natl Acad
Sci U S A, 2001. 98(20): p. 11650-5.
37.
Thomson, J.P., et al., CpG islands influence chromatin structure via the CpG-binding
protein Cfp1. Nature, 2010. 464(7291): p. 1082-6.
38.
Zhang, Y., et al., Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is
guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids
Res, 2010. 38(13): p. 4246-53.
39.
DeLisi, L.E., et al., The timing of brain morphological changes in schizophrenia
and their relationship to clinical outcome. Biol Psychiatry, 1992. 31(3): p. 241-54.
40.
Gur, R.E., et al., A follow-up magnetic resonance imaging study of schizophrenia.
Relationship of neuroanatomical changes to clinical and neurobehavioral
measures. Arch Gen Psychiatry, 1998. 55(2): p. 145-52.
41.
Shenton, M.E., et al., A review of MRI findings in schizophrenia. Schizophr Res,
2001. 49(1-2): p. 1-52.
42.
Buchanan, R.W., et al., Morphometric assessment of the heteromodal
association cortex in schizophrenia. Am J Psychiatry, 2004. 161(2): p. 322-31.
43.
Vasic, N., et al., Gray matter reduction associated with psychopathology and
cognitive dysfunction in unipolar depression: a voxel-based morphometry study.
J Affect Disord, 2008. 109(1-2): p. 107-16.
44.
Ladd-Acosta, C., et al., DNA methylation signatures within the human brain. Am
J Hum Genet, 2007. 81(6): p. 1304-15.
45.
Xin, Y., et al., Genome-wide divergence of DNA methylation marks in cerebral
and cerebellar cortices. PLoS One, 2010. 5(6): p. e11357.
46.
Dwork, A., et al., Postmortem and in vivo structural pathology in schizophrenia, in
Neurobiology of Mental Illness, E. Nestler and D. Charney, Editors. 2008, Oxford
University Press. p. 201-320.
47.
Shendure, J., et al., Accurate multiplex polony sequencing of an evolved
bacterial genome. Science, 2005. 309(5741): p. 1728-32.
48.
Karimi, M., et al., LUMA (LUminometric Methylation Assay)--a high throughput
method to the analysis of genomic DNA methylation. Exp Cell Res, 2006.
312(11): p. 1989-95.
49.
Pruitt, K.D., T. Tatusova, and D.R. Maglott, NCBI Reference Sequence (RefSeq):
a curated non-redundant sequence database of genomes, transcripts and
proteins. Nucleic Acids Res, 2005. 33(Database issue): p. D501-4.
50.
Schwartz, S., et al., Human-mouse alignments with BLASTZ. Genome Res, 2003.
13(1): p. 103-7.
Funding: This work was supported by grants from the National Institute of Mental Health
(MH048514) and the National Human Genome Research Institute (HG002915). AHO
was supported by an NRSA F30 fellowship from the NIMH (F30MH085471). The funders
had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Figure Legends
Figure 1. Heatmap of human-mouse DNA methylation conservation with respect to
CG density. Human-mouse pairwise alignments were obtained from UCSC Genome
Bioinformatics Browser, with DNA methylation correlation computed for ranges of CG
density depicted as number of CG sites within 100bp windows of aligned sequences.
The observed correlation coefficients varied in range from 0.02 to 0.88, showing regions
of ≥5 CG sites with high methylation correlation.
Figure 2. Conservation of DNA methylation is correlated with CG dinucleotide
density. Human–mouse pairwise sequence alignments were split into 100bp
non-overlapping windows and average methylation levels computed. The three panels
correspond to poor, intermediate, and rich CG densities, defined as Poor: ≤3,
Intermediate: >3 to 5, and Rich: >5 CG sites. Top panel: human-mouse DNA
methylation patterns for poor (r=0.11), intermediate (r=0.43), and rich (r=0.80) CG
densities. Bottom panel: percentage of gene features in both human and mouse within
the three compartments.
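The per-compartment correlation described in this legend can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: each 100bp window is represented by a hypothetical tuple (number of CG sites, mean human methylation, mean mouse methylation), the "poor" class is taken to mean at most 3 CG sites, and the correlation is assumed to be Pearson's r. The function names are not from the manuscript.

```python
from math import sqrt


def density_class(n_cg):
    """Assign a window to a CG-density compartment by its CG count."""
    if n_cg <= 3:          # assumption: poor means at most 3 CG sites
        return "poor"
    if n_cg <= 5:
        return "intermediate"
    return "rich"


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_by_class(windows):
    """Human-mouse methylation correlation within each density compartment."""
    groups = {}
    for n_cg, human_meth, mouse_meth in windows:
        human, mouse = groups.setdefault(density_class(n_cg), ([], []))
        human.append(human_meth)
        mouse.append(mouse_meth)
    # Compartments with fewer than two windows are skipped.
    return {name: pearson_r(h, m)
            for name, (h, m) in groups.items() if len(h) >= 2}
```

Calling `correlation_by_class` on a list of such window tuples returns one coefficient per compartment, which is the quantity reported as r in the top panel.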
Figure 3. Methylation patterns at promoter regions for human/mouse orthologous
genes. A total of 9,926 orthologous genes with CG-rich promoters were examined in the
analysis. Regions of 5kb-upstream and 5kb-downstream from TSS were split into 100bp
non-overlapping windows, where each window is represented by average methylation
across all genes. (Top panel) CG methylation level, (Middle panel) CG density, and
(Bottom panel) CG methylation difference in human and mouse.
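The windowed profile around the TSS can be sketched as follows. This is a hedged illustration rather than the authors' code: the input format (one list of (offset from TSS in bp, methylation) pairs per gene) and the function name `window_profile` are hypothetical, and the sketch pools all CG sites that fall in a window across genes, which is one possible reading of "average methylation across all genes".

```python
from collections import defaultdict


def window_profile(gene_sites, window=100, flank=5000):
    """Average methylation in non-overlapping windows around the TSS.

    gene_sites: iterable of per-gene lists of (offset_bp, methylation),
    where offset_bp is the strand-aware distance from the TSS.
    Returns {window_start_bp: mean methylation}, pooled across genes.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sites in gene_sites:
        for offset, methylation in sites:
            if -flank <= offset < flank:
                # Floor division keeps negative offsets in the right window,
                # e.g. offset -1 falls in the window starting at -100.
                start = (offset // window) * window
                sums[start] += methylation
                counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical toy input: two genes, offsets in bp relative to the TSS.
toy = [[(-50, 0.2), (10, 0.0)], [(-1, 0.4), (99, 0.2), (6000, 1.0)]]
profile = window_profile(toy)
```

Averaging per-gene window means instead of pooling sites would weight genes equally rather than CG sites; the legend does not say which was used.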
Figure 4. Distribution of conserved and altered DNA methylation in prefrontal and
auditory cortices. DNA methylation conservation is positively correlated, and
alteration negatively correlated, with CG density: conservation increases with
increasing CG density, whereas alteration increases with decreasing
CG density.
Figure 5. Human cortex, patterns of DNA methylation and CG density for CG sites
within genic compartments. Left panel shows that the extent of CG methylation
depends on CG density and genomic context. Right panel shows the shift in transition
from unmethylated to methylated states relative to CG density. CG sites with ≤0.2 and
≥0.8 methylation were assigned to the unmethylated and methylated groups,
respectively.
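The grouping rule in this legend amounts to a simple threshold on the per-site methylation value. A minimal Python sketch, with a hypothetical function name and with sites between the two thresholds returned as None because the legend assigns them to neither group:

```python
def methylation_state(p_hat, low=0.2, high=0.8):
    """Group a CG site by methylation level using the legend's thresholds."""
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("methylation level must be between 0 and 1")
    if p_hat <= low:
        return "unmethylated"
    if p_hat >= high:
        return "methylated"
    return None  # intermediate sites belong to neither group


# Hypothetical per-site values, for illustration only.
states = [methylation_state(v) for v in (0.05, 0.2, 0.5, 0.8, 0.97)]
```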
Figure 6. Comparative analysis of patterns of DNA methylation conservation at the
TSS of RefSeq genes within human prefrontal and auditory cortices. The top panel
shows the percentage of conserved CGs in CG-poor (black) and CG-rich (red)
promoters; the middle panel shows methylation difference between prefrontal and
auditory cortices for CG-poor and -rich promoters; the bottom panel illustrates CG
density in the two types of promoters of RefSeq genes.
Figure 7. Methylation differences between human prefrontal and auditory cortices
for CG islands in promoters and gene bodies. The top panel shows methylation
differences in promoter (black) and gene-body (green) CG islands; the bottom panel
shows CG density for all annotated islands in promoters and gene bodies.
Figure 1. [Image not reproduced. Axes: Human #CG/100bp and Mouse #CG/100bp; color scale: correlation coefficient between human and mouse.]
Figure 2. [Image not reproduced.]
Figure 3. [Image not reproduced. Axes: Methylation, #CG/100bp, and Methylation difference against distance from TSS; series: Human, Mouse.]
Figure 4. [Image not reproduced. Axes: Percentage of conserved DNA methylation and Percentage of altered DNA methylation against #CG/100bp.]
Figure 5. [Image not reproduced. Panels: Promoter/first exon, Internal exon, Internal intron. Axes: Methylation probability, #CG/100bp, Density; groups: Unmethylated, Methylated.]
Figure 6. [Image not reproduced. Axes: Percentage of conserved CGs, Methylation difference, and #CG/100bp against distance from TSS (bp); series: CG poor, CG rich.]
Figure 7. [Image not reproduced. Axes: Average methylation difference and #CG/100bp against distance from CG island (Start, End); series: Promoter associated, Gene body.]
Supporting Information: supplementary_materials.doc