Title: Role of CG Context and Content in Evolutionary Signatures of Brain DNA Methylation
Running Head: CG Context and Content in Brain DNA Methylation
Authors: Yurong Xin1*, Anne H. O’Donnell1,2*, Yongchao Ge3, Benjamin Chanrion1, Maria Milekic1, Gorazd Rosoklija1,4,5, Aleksander Stankov5, Victoria Arango1, Andrew J. Dwork1,5,6, J. John Mann1, Jay Gingrich1, and Fatemeh G. Haghighi1**
* Authors contributed equally to this work
** Corresponding author: [email protected]
1 Department of Psychiatry, Columbia University and The New York State Psychiatric Institute, New York, NY
2 Integrated Graduate Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY
3 Department of Neurology, Mount Sinai School of Medicine, New York, NY
4 Macedonian Academy of Sciences and Arts, Skopje, Macedonia
5 School of Medicine, University “Ss. Cyril & Methodius,” Skopje, Macedonia
6 Department of Pathology and Cell Biology, Columbia University, New York, NY
ABSTRACT
DNA methylation is essential in brain function and behavior, yet understanding the role of DNA methylation in brain-based disorders begins with the study of DNA methylation profiles in normal brain. We apply an enzymatic-based approach, Methylation Mapping Analysis by Paired-end Sequencing (Methyl-MAPS), which uses second-generation sequencing technology to provide an unbiased representation of genome-wide DNA methylation profiles of human and mouse brains. In this large-scale study, we assayed CG methylation in cerebral cortex of neurologically and psychiatrically normal human postmortem specimens, as well as mouse forebrain specimens. Cross-species human-mouse DNA methylation conservation analysis shows that DNA methylation is not correlated with sequence conservation. Instead, greater DNA methylation conservation is correlated with increasing CG density.
Genomic regions with significant human-mouse DNA methylation conservation (correlation >80%) typically have greater than 5 CG dinucleotides within a 100bp window. In addition to CG density, these data show that genomic context is a critical factor in DNA methylation conservation and alteration signatures throughout mammalian brain evolution. We identify key genomic features that can be targeted for identification of epigenetic loci that may be developmentally and evolutionarily conserved and wherein aberrations in DNA methylation patterns can confer risk for disease.
INTRODUCTION
DNA methylation is an evolutionarily ancient epigenetic mark found within most eukaryotic organisms including fungi, plants, and animals [1, 2]. DNA methylation involves the heritable addition of a methyl group to cytosine bases. Cytosine methylation can be categorized into CG, CHG, and CHH methylation (where H refers to either A, C, or T nucleotides). In animals, methylation primarily occurs at CG dinucleotides, whereas in plants, methylation of cytosines is observed in all contexts. Cytosine methylation is catalyzed by highly conserved DNA methyltransferases (Dnmts) [3, 4]. As such, DNA methylation is an essential evolutionary process involved in many gene regulatory systems, including genomic imprinting, X-chromosome inactivation, transposon silencing, and expression of endogenous genes. Cross-species comparative epigenomic analyses have revealed intriguing trends in both the conserved and divergent features of DNA methylation in eukaryotic evolution. Cytosine methylation has been detected throughout the majority of repetitive sequences and transposons, but also within the body of protein coding genes. Within gene bodies, CG methylation appears to be favored in exons over introns.
Although the biological function of gene body methylation and the mechanisms by which gene bodies are targeted by the methylation machinery are not well understood, the preferential methylation of exons in plant and animal species appears to be an evolutionarily conserved phenomenon [5]. In mammalian genomes, DNA methylation occurs almost exclusively within CG dinucleotides, of which 70-80% are methylated, though a small amount of non-CG methylation is found within embryonic stem cells [6-9]. The remaining unmethylated CG sites mostly occur in dense clusters referred to as CG islands [10, 11]. Mammalian genomes with widespread DNA methylation have undergone CG dinucleotide depletion over evolutionary time as a consequence of cytosine methylation, which leads to an increased rate of C-to-T transitions occurring after deamination of methylated cytosines [12-14]. Such depletions are not ubiquitous since some genomic regions, like CG islands, remain enriched in CG dinucleotides [15, 16]. CG islands represent a large fraction of cis-regulatory sequences because they occupy the majority of gene promoters [12, 17, 18], with non-promoter CG islands functioning as distal regulators (e.g., insulators and enhancers) [19]. The localized genomic depletion of CG dinucleotides in vertebrates results in a markedly different CG distribution as compared to genomes of invertebrates. Invertebrate genomes show homogeneous distribution and no depletion of CG dinucleotides, whereas vertebrate genomes are globally depleted, except at CG islands [20]. Establishment of DNA methylation patterns is dependent on CG content within specific genomic compartments. For example, vertebrate promoters fall into two major classes based on CG content (i.e., CG-poor and CG-rich promoters). CG-poor promoters are largely methylated and show tissue-specific patterns of methylation and gene expression. In contrast, CG-rich promoters are typically unmethylated.
Although CG-rich promoters containing CG islands are generally thought to regulate expression of housekeeping genes, they also show tissue-specific patterns of gene expression [20]. Furthermore, regions of intermediate CG density that lie in close proximity, but often not within CG islands (generally referred to as island shores), show tissue-specific differential patterns of methylation [21]. These regions appear to be sufficient for distinguishing liver, spleen, and brain tissues, and also appear to be conserved between human and mouse. In the present study, we show that in distinct genomic contexts, the conservation in CG methylation is highly correlated with CG content or density. While genomic regions with high CG density, such as CG islands, show greater methylation conservation, the CG island shores with lower CG density show substantially less DNA methylation conservation. These observations are not limited to CG island promoters. In our cross-species study, we have profiled genome-wide DNA methylation patterns in human and mouse cortex and show that DNA methylation conservation is only weakly correlated with sequence conservation. Rather, DNA methylation conservation is strongly correlated with CG content. This is consistent with the finding that genomic regions with relatively low CG density, such as CG island shores, exhibit greater DNA methylation variability across multiple tissue types [21]. The functional implications of these results are dependent on genomic context. In the context of promoter CG islands, the CG-depleted shores may contain cis-acting regulatory elements with important roles in transcriptional regulation. We present results from our studies investigating the pattern of DNA methylation conservation and alteration in different genomic contexts with an evolutionary perspective.
RESULTS
In this large-scale DNA methylation profiling study, we generated 657,424,756 sequence reads, mapping DNA methylation for greater than 80% of CG sites within human and mouse brains. These data are from 10 human cortical specimens (6 ventral prefrontal and 4 auditory) and 5 mouse (129S6/SvEv) forebrain specimens. Human samples were selected from our postmortem brain collection with complete neuropathological and psychopathological data, as well as brain toxicology reports, all confirming the absence of neuropsychiatric disorders, pathological lesions, and psychoactive substances [22]. DNA methylation patterns did not differ significantly by age, sex, pH, or postmortem interval within specimens examined (data not shown). DNA methylation data were generated using Methylation Mapping Analysis by Paired-end Sequencing (Methyl-MAPS). The Methyl-MAPS method is based on our previously reported enzymatic assay [23], which has been extended to take advantage of vast improvements in the throughput of second-generation sequencing technology [24] (see supplementary materials for details of experimental and analytical procedures). Methyl-MAPS data were independently validated by Illumina Infinium HumanMethylation27 BeadChip technology [25], showing strong correlation of results (Pearson correlation coefficient r=0.81) between the two experimental methods across 19,627 overlapping CG sites, thus demonstrating that the Methyl-MAPS method produces highly robust and reliable DNA methylation data genome-wide. Although our methylation data provide >80% coverage of CG sites in the genome, examination of methylation levels from biological replicates revealed that only sites with genomic coverage of 8 or greater produced robust estimates of methylation states (shown in supplementary materials). Hence, all analyses were based on these high coverage CG sites.
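The coverage constraint can be illustrated with a short sketch. It assumes that "coverage" means the total number of fragments overlapping a CG site and that sites are keyed by a position string; the function names, the site keys, and the toy data are hypothetical and are not taken from the Methyl-MAPS pipeline.

```python
from typing import Dict, Set

MIN_COVERAGE = 8  # threshold stated in the text


def high_coverage_sites(coverage: Dict[str, int],
                        min_cov: int = MIN_COVERAGE) -> Set[str]:
    """Return the CG sites whose coverage is at least min_cov."""
    return {site for site, cov in coverage.items() if cov >= min_cov}


def common_sites(samples: Dict[str, Dict[str, int]],
                 min_cov: int = MIN_COVERAGE) -> Set[str]:
    """Return the CG sites that pass the coverage filter in every sample."""
    per_sample = [high_coverage_sites(cov, min_cov) for cov in samples.values()]
    return set.intersection(*per_sample) if per_sample else set()


# Toy example (hypothetical coverage values for two specimens).
samples = {
    "PFC_1": {"chr1:100": 12, "chr1:250": 7, "chr1:900": 30},
    "AC_1": {"chr1:100": 9, "chr1:250": 15, "chr1:900": 8},
}
shared = common_sites(samples)  # chr1:250 fails the filter in PFC_1
```

Intersecting the per-sample sets is what restricts downstream comparisons to sites that are reliably measured in all specimens.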
Using this coverage constraint, we were able to map 10,262,160 CG sites (36%) in common across all human cortical samples, representing the largest whole-genome methylation profiling effort in primary tissue to date.
Cross-species DNA methylation conservation
Data from whole genome DNA methylation profiling of human and mouse brains allowed us to determine whether there is a relationship between sequence conservation and DNA methylation. We examined methylation patterns within evolutionarily conserved regions [26], positing that conservation at these regions likely reflects a biologically significant function. Human-mouse cross-species methylation comparison of such regions showed that methylation correlates weakly with sequence conservation (r=0.25; Figure S1). This was comparable with correlations from random selections of syntenic regions from human-mouse pair-wise alignments (r=0.15). Additionally, DNA methylation patterns were compared within promoters of human-mouse orthologous genes since promoters are regions where DNA methylation confers a functional effect in the repression of gene expression. Here also, we observed weak correlation in human and mouse DNA methylation patterns within 8,956 orthologous gene promoters (r=0.16; Figure S2). If sequence conservation is not the driving force underlying DNA methylation conservation, then what is? This prompted us to ask whether evolutionary conservation of DNA methylation is driven by CG density rather than sequence homology. Since CG dinucleotides are the targets of DNA methyltransferase enzymes with the propensity to methylate CG clusters [27, 28], it is highly plausible that CG density may be the key factor. We examined human-mouse DNA methylation profiles within distinct ranges of CG density and found that indeed, DNA methylation conservation increases with increasing CG content (Figure 1).
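One way to picture this analysis is to group aligned regions by CG density and compute a human-mouse Pearson correlation of methylation within each group. The sketch below is an illustration under assumptions, not the authors' procedure: the bin width, the tuple layout `(cg_per_100bp, human_methylation, mouse_methylation)`, and the toy values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_by_density(
    regions: Iterable[Tuple[float, float, float]], bin_width: int = 1
) -> Dict[int, float]:
    """Correlate human and mouse methylation within each CG-density bin.

    Each region is (cg_per_100bp, human_methylation, mouse_methylation).
    Bins are labelled by their lower edge; bins with fewer than two
    regions are skipped.
    """
    bins: Dict[int, Tuple[List[float], List[float]]] = defaultdict(lambda: ([], []))
    for density, human, mouse in regions:
        key = int(density // bin_width) * bin_width
        bins[key][0].append(human)
        bins[key][1].append(mouse)
    return {k: pearson(h, m) for k, (h, m) in sorted(bins.items()) if len(h) >= 2}


# Toy regions (hypothetical values): a CG-poor bin and a CG-rich bin.
regions = [
    (1.0, 0.9, 0.2), (1.5, 0.8, 0.7), (1.2, 0.7, 0.9),
    (6.0, 0.05, 0.04), (6.5, 0.9, 0.95), (6.9, 0.1, 0.08),
]
by_bin = correlation_by_density(regions)
```

Plotting the per-bin coefficients against the bin label would give a curve of the kind summarized in Figure 1.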
These data show that evolutionarily, DNA methylation patterns are conserved in regions of the genome with CG enrichment. Data from human-mouse pair-wise alignments were used to identify genomic regions with comparable ranges of CG density in both human and mouse. DNA methylation conservation was highly correlated with CG content (r=0.80) for regions with a CG density of 5 or more CG dinucleotides per 100 base pairs, while by comparison, the genome average is approximately 1 CG per 100 base pairs. Furthermore, the observed patterns of DNA methylation were quite striking, becoming increasingly polarized towards extreme hypo- and hypermethylated states with increased CG density (Figure 2-top panel). The density ranges represented in Figure 2 as poor, intermediate, and rich were selected for illustrative purposes based on our observed correlation trends (Figure 1). While we find that there is a greater accumulation of internal introns in the CG-poor compared to CG-rich compartments, gene promoters and exons appear to be depleted in CG-poor compartments and overrepresented in CG-rich compartments (Figure 2-bottom panel). These observations prompted us to re-examine the promoters of orthologous genes (Figure S2) in the context of CG density. Within CG-rich gene promoters, DNA methylation patterns are highly conserved proximal to transcription start sites (TSS) (Figure 3). The extent of DNA methylation conservation appears to be confined to [-200bp, +500bp] around the TSS. This pattern tracks with CG density, revealing that, as expected, these CG-rich promoters are hypomethylated and are ultra-conserved across species (Figure 3). Remarkably, the shores of these CG-rich regions show elevated DNA methylation variability that extends up to 2KB upstream and downstream of the TSS (Figure 3-bottom panel). This is consistent with previous findings reporting tissue-specific DNA methylation variation in CG island shores within gene promoters [21]. These shores demarcate transitions in CG density.
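The CG density measures used in these results can be made concrete with a small sketch. It follows the definition given in Materials and Methods, CG density = (2 × number of CG sites) / fragment length, and the 0.07 promoter cutoff stated there. The function names are hypothetical, and assigning a density of exactly 0.07 to the CG-poor class is an assumption, since the text only specifies <0.07 and >0.07.

```python
def count_cg(seq: str) -> int:
    """Count CG dinucleotides on the given strand (case-insensitive)."""
    return seq.upper().count("CG")


def cg_density(seq: str) -> float:
    """CG density as (2 * number of CG sites) / fragment length."""
    if not seq:
        raise ValueError("empty sequence")
    return 2 * count_cg(seq) / len(seq)


def cg_per_100bp(seq: str) -> float:
    """Number of CG sites scaled to a 100bp fragment."""
    if not seq:
        raise ValueError("empty sequence")
    return 100 * count_cg(seq) / len(seq)


def classify_promoter(seq: str, cutoff: float = 0.07) -> str:
    """Label a promoter fragment as CG-rich or CG-poor by CG density.

    A density exactly equal to the cutoff is treated as CG-poor here,
    which is an assumption of this sketch.
    """
    return "CG-rich" if cg_density(seq) > cutoff else "CG-poor"


# Toy 100bp fragments (hypothetical sequences).
rich = "CG" * 5 + "A" * 90      # 5 CG sites in 100bp
poor = "ACGT" + "A" * 96        # 1 CG site in 100bp
```

Under this definition, 5 CG sites in a 100bp fragment correspond to a density of 0.10, and the 0.07 cutoff corresponds to 3.5 CG sites per 100bp.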
A salient characteristic of these shores is that they retain an intermediate CG density cross-species (Figure 3-middle panel), which might be important for establishment of tissue-specific and perhaps species-specific methylation profiles. Beyond these shores, DNA methylation patterns resume basal levels (Figure 3-top panel) and do not appear to vary substantially cross-species (Figure 3-bottom panel). Taken together, these results underscore the significance of CG density in sustained DNA methylation conservation and alteration throughout evolution as a fundamental component of mammalian epigenetic programming.
Intra-species DNA methylation conservation and alteration in cortex
In human cortex, patterns of DNA methylation conservation and alteration are also highly dependent on CG density. In comparing DNA methylation signatures within prefrontal cortex (PFC) and auditory cortex (AC) across individuals, we found that DNA methylation patterns are highly conserved among 26% of CG sites and significantly altered in only 1% of CG sites, with the remaining sites having variable CG methylation. These conserved, altered, and variable methylated domains were represented within all genomic features (Figure S3). Further examination of the conserved and altered methylation distributions relative to CG density revealed that in human cortex, DNA methylation is conserved in CG-rich regions and altered in CG-depleted regions of the genome (Figure 4). The CG density transition associated with conserved methylation is similar to what we observed cross-species (approximately 5 CG sites within a 100bp region). This might demarcate a potential genomic signature for evolutionary conservation of DNA methylation and maintenance. Investigation of conserved and altered CG sites within different genomic compartments showed that, in addition to CG density, genomic context determines DNA methylation profiles.
Adopting a gene-centric view, we found that across different gene features, DNA methylation patterns vary markedly relative to CG density (Figure 5). Promoters and 1st exons demonstrate the expected pattern of methylation, showing less methylation in CG-dense regions associated with CG islands and more methylation in CG-poor regions. The internal exons and introns tend to be more methylated, with greater CG depletion, as compared to promoters and 1st exons. Thus, the difference in methylation pattern between promoters/1st exons and internal exons/introns correlates with different patterns of CG-density distributions. Intriguingly, the transition between unmethylated and methylated domains appears to occur at a CG density similar to the transition density observed previously when comparing conserved and altered genomic DNA methylation profiles. These results are also evident in the mouse genome (Figure S4). We next focused our attention on the promoter and TSS regions, considering whether differences in DNA methylation can be detected when comparing two human cortical regions for two distinct classes of promoters, CG-rich and CG-poor. We addressed this question in the context of DNA methylation conservation. We found that for CG-rich promoters, DNA methylation patterns are highly conserved within approximately [-200bp, +500bp] of the TSS (Figure 6). Beyond this range, we detected elevated methylation differences for up to ±2KB, akin to our previous observations with the “shores” (Figure 3). Although the magnitude of DNA methylation differences is relatively small, the patterns are highly consistent. This indicates that these shores may harbor regulatory elements that, in addition to determining tissue- and species-specific methylation profiles, may reveal DNA methylation differences specific to human cortical specialization and function.
It is possible that the extent of DNA methylation differences at the shores might be a reflection of the samples' functional and evolutionary divergence. We also examined the DNA methylation patterns at shores of CG islands. In the two cortical regions, we examined gene promoter-associated CG island shores, as well as CG island shores contained within gene bodies. We found that promoter-associated islands show differential methylation patterns at the shores (Figure 7), consistent with our observations at the promoter TSS region (Figure 6) and with previous reports [21]. We also examined the pattern of CG island shore methylation within gene bodies, and found that the two cortical regions did not differ in their methylation patterns at these shores (Figure 7). This is perhaps not unexpected, since gene body CG islands tend to be highly methylated as part of the evolutionarily conserved feature of gene body methylation [29]. Indeed, the average length of gene body islands is half the length of those in gene promoters [24]. This is attributed to cytosine methylation resulting in increased C-to-T transitions after deamination of methylated cytosines, and hence depletion of CG sites from gene body CG islands. However, a small fraction of gene body CG islands remain unmethylated within each cortical region (Figure S5). We also found that a small proportion of promoter-associated CG islands are methylated. Examination of these CG island shores revealed that unmethylated CG islands within the promoters and gene bodies show striking similarity in their shores, in that they exhibit increased methylation differences between PFC and AC in the immediate shores of the CG islands (Figure S6-top panel). In contrast, the methylated CG islands show no such differences at the shores (Figure S6-middle panel).
These findings indicate that although CG islands within gene bodies may be few in number, they still may harbor regulatory regions with potentially tissue-specific gene expression [30]. Taken together, these results underscore the importance of CG content and context in patterns of DNA methylation conservation and alteration in mammalian brain evolution.
DISCUSSION
The data presented represent, to our knowledge, the first comprehensive map of the DNA methylation landscape of the human cortex. In this study, we have demonstrated that CG methylation patterns have a distinct evolutionary signature that is both content- and context-dependent. We find that in the brain, CG methylation is conserved in CG-dense regions of the genome, independent of sequence conservation. DNA methylation conservation extends to both unmethylated and methylated compartments of the genome. Also, the genomic environment plays a critical role in defining DNA methylation patterning throughout the mammalian genome. These methylation signatures provide important insight into the role of DNA methylation in brain development and disease. Genomic regions with conserved DNA methylation signatures cross-species, like CG islands, may be important in basic cellular function and development of brain neural circuitry. Although conserved in the adult brain, DNA methylation patterns at these regions may be highly dynamic during neurodevelopment. Differential DNA methylation patterns have been well established in imprinted loci during early development [31, 32], and it is likely that methylation changes at non-imprinted loci can also play an important role in neuronal development and differentiation. One can speculate that aberrant changes in DNA methylation in epigenetically conserved regions would result in drastic phenotypic abnormalities during early stages of development.
Hence, it is plausible to posit that DNA methylation changes during key neurodevelopmental trajectories may lead to such severe early-onset neurodevelopmental disorders as autism and childhood schizophrenia. Autism is an early-onset disorder typically diagnosed before the age of 3 and is accompanied by significantly altered neurodevelopment. Individuals with the disorder exhibit global cerebral gray matter hyperplasia in the first 2 years of life [33] and larger frontal and temporal gray matter volumes by 4 years, followed by a slower rate of growth in these regions by 7 years [34, 35]. Childhood-onset schizophrenia, with a mean age of onset around 10, is associated with striking parietal gray matter loss, which progresses anteriorly during adolescence [36]. Alterations in DNA methylation patterns that can influence the degree or timing of basic brain maturational patterns may at least partially underlie these neurodevelopmental disorders and are active areas of current research. Therefore, in DNA methylation studies of neurodevelopmental disorders, evolutionarily conserved CG-rich genomic regions are important targets in investigations involving the developmental origins of such disorders. While CG-rich genomic regions with evolutionarily conserved DNA methylation patterns may be essential in normal brain development and function, they might not be the likely foci for DNA methylation abnormalities associated with adult-onset neuropsychiatric disorders, which may result from accumulating environmental insults during the lifespan. Unlike the dramatic methylation changes that are typically observed in most cancers, DNA methylation changes for such disorders are likely to be more subtle. Thus, changes in genomic regions with intermediate or poor CG density, as in CG island shores, are likely the targets.
In contrast to CG islands that are generally protected [37, 38], CG dinucleotides at these regions are the most likely targets of epigenetic modifications that are induced by accumulating exposures to environmental stressors. DNA methylation modifications at these regions may be the consequence of the neuropathology associated with the disease, or the consequence of environmental stressors that increase disease risk. For instance, adult-onset schizophrenia (the more typical form) is more strongly associated with deficits in later-maturing temporal and frontal regions [39-41], and is associated with selective abnormalities of the heteromodal regions [42]. Also, in major depression, decreased volumes of cortical and subcortical regions have been reported. Patients with major depressive disorder show reduced gray matter concentration in the left inferior temporal cortex, the right orbitofrontal cortex, and the dorsolateral prefrontal cortex [43]. Such brain abnormalities in schizophrenia and depression may, in addition to genetic factors, be due to epigenetic factors that undergo alterations in response to changes in the environment. In disease studies of schizophrenia and depression in postmortem samples, it is not possible to separate the influence of genetics and environment on disease neuropathology and associated epigenetic alterations. However, epigenetic studies provide crucially important insight into regulatory signatures that may be aberrantly regulated in these disorders. Discovery of DNA methylation alterations associated with neurodevelopmental and neuropsychiatric disorders will depend on CG content and context in the genome. It is critical to recognize the potential role that these factors play in brain DNA methylation signatures, and to profile these methylation patterns within relevant brain regions. In the brain, DNA methylation patterns are region-specific [44, 45] and cell-specific.
Although the present study examines DNA methylation patterns within specific cortical regions, it does not capture the DNA methylation complexity within specific cell populations. This is partly due to limitations in current approaches, which require relatively large amounts of starting DNA to perform such whole-genome DNA methylation profiling. However, with improvements in methylation sequencing technology, interrogation of cell-specific epigenetic profiles will be possible in the near future. Hence, depending on the nature of the brain-based disorder, specific brain regions, cell populations, and genomic regions (i.e., conserved or altered regions) may be targeted for epigenetic studies. This work highlights the evolutionary signatures of DNA methylation patterning in the brain, which will both inform our understanding of DNA methylation programming in mammalian genomes and offer insight into study-design considerations for future epigenetic studies of neurodevelopmental and neuropsychiatric disorders.
MATERIALS AND METHODS
Ethics Statement: All human and rodent procedures were approved by the Institutional Review Boards of the New York State Psychiatric Institute / Columbia University Department of Psychiatry and the School of Medicine, University “Ss. Cyril & Methodius,” or by the Institutional Animal Care and Use Committee, respectively.
Samples and Subjects: Human brain specimens were obtained from the New York State Psychiatric Institute (NYSPI) and the Macedonia brain collection. Normal human tissue specimens were obtained with institutional review board approval and anonymous individual identifiers. A total of 10 cortical specimens were dissected frozen, including six specimens from the rostral portion of the right orbital gyrus, and four specimens from the right primary auditory cortex. For three subjects, tissue specimens from both prefrontal and auditory cortices were collected.
The subjects were 4 males, average age 51±5 years, and 3 females, average age 45±5 years. Brain pH ranged from 6.3 to 6.7, with an average postmortem interval (PMI) of 9.4±5.1 hours. The subjects had died suddenly and were autopsied in the Institute for Forensic Medicine at the School of Medicine, University “Ss. Cyril & Methodius,” Skopje, Macedonia. Cases were chosen that were without psychopathology or history of psychoactive drugs (as determined by psychological autopsy interviews with their survivors), without significant abnormalities on neuropathological examination, and with negative screening of brain and body fluids for psychoactive drugs, including therapeutic levels in brain. The right cerebral hemisphere was sliced coronally at intervals of 2-4 cm. The slices were rapidly frozen in Freon 134a (1,1,1,2-tetrafluoroethane) and stored at -80ºC until used. For the current study, slices were warmed to -20ºC, and cortical samples of ~200-500 mg were cut, with a scalpel chilled in dry ice, from the rostral portion of the orbital gyrus (BA47), taking care to avoid visible white matter. Primary auditory cortex (BA41) was similarly dissected from the caudal portion of Heschl’s gyrus [46]. Mouse brain (strain 129 SvEv, from Taconic) DNA was collected from the entire left cerebral hemisphere. Samples were kept at -80ºC until further processing.
Methyl-MAPS Procedure: Brain DNA from human and mouse was fractionated into methylated and unmethylated compartments. Paired-end libraries were constructed, subsequently sequenced, and then mapped onto the human genome. DNA fractionation and library preparation methods have been previously described [23] (and are detailed in supplementary materials). Briefly, the Methyl-MAPS procedure is as follows. Seven micrograms of DNA are digested with McrBC and RE in parallel. McrBC endonuclease generates the unmethylated compartment and is able to interrogate the methylation state of more than 74% of the CG sites in the genome.
The methylated compartment is generated by digestion with a panel of all known methylation-sensitive tetranucleotide restriction enzymes, termed RE (HpaII, HhaI, AciI, BstUI, and HpyCH4IV), each of which cuts at a specific 4bp sequence only if the CG in the recognition site is unmethylated. By using such a cocktail, sequence-specific biases of the enzyme recognition site are minimized, and we are able to interrogate the methylation state of 38% of CG sites genome-wide. The strength of this approach is that each strategy augments the other and, combined, they provide greater overall coverage, permitting the assessment of >80% of all CG sites genome-wide. Jumping libraries of fragments greater than 700bp in size were constructed utilizing EcoP15’s unique digestion properties. Deep sequencing of digested sequence fragments was performed on the SOLiD sequencing platform from Applied Biosystems [47]. Mapping of paired-end sequenced fragments was performed with the SOLiD software analysis package.
Methyl-MAPS Estimation of Methylation State: The methylation probability of a CG dinucleotide was estimated from the genomic coverage of the methylated and unmethylated sequence fragments from the RE and McrBC libraries, respectively. The RE fragments contribute to the methylated coverage and the McrBC fragments contribute to the unmethylated coverage. For each CG, we defined two variables to represent methylated coverage ($n_1$, for RE fragments) and unmethylated coverage ($n_2$, for McrBC fragments). To correct intra-individual sampling bias between the methylated and unmethylated compartments, we estimated the ratio ($\lambda$) of sampling probabilities for the McrBC and RE libraries with
$$\frac{\sum n_1}{\lambda \sum n_2} = \frac{p}{1 - p},$$
where the sums run over all CG sites and $p$ is the global methylation level estimated experimentally using a highly reliable method known as the LUminometric Methylation Assay (LUMA) [48]. Therefore, the methylation probability $\hat{p}$ of each CG was computed using the following equation:
$$\hat{p} = \frac{n_1}{n_1 + \lambda n_2}.$$
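The estimation step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the bias ratio λ is chosen so that the genome-wide ratio Σn1 / (λ Σn2) equals p / (1 − p), with p the LUMA global methylation level, and that each site's estimate is n1 / (n1 + λ n2). The function names and the toy coverage values are hypothetical.

```python
from typing import List, Tuple


def estimate_lambda(sites: List[Tuple[int, int]], p: float) -> float:
    """Estimate the sampling-bias ratio lambda from per-site coverage.

    sites holds (n1, n2) pairs: methylated (RE) and unmethylated (McrBC)
    coverage. Lambda solves sum(n1) / (lambda * sum(n2)) = p / (1 - p),
    which is the assumed form of the correction.
    """
    if not 0 < p < 1:
        raise ValueError("global methylation level p must be in (0, 1)")
    total_n1 = sum(n1 for n1, _ in sites)
    total_n2 = sum(n2 for _, n2 in sites)
    if total_n1 == 0 or total_n2 == 0:
        raise ValueError("both libraries need non-zero total coverage")
    return (total_n1 / total_n2) * (1 - p) / p


def methylation_probability(n1: int, n2: int, lam: float) -> float:
    """Bias-corrected methylation estimate n1 / (n1 + lambda * n2)."""
    denom = n1 + lam * n2
    if denom == 0:
        raise ValueError("site has no coverage")
    return n1 / denom


# Toy data (hypothetical): four CG sites and a global level of 0.75.
sites = [(10, 0), (0, 10), (6, 4), (8, 2)]
lam = estimate_lambda(sites, p=0.75)
estimates = [methylation_probability(n1, n2, lam) for n1, n2 in sites]
```

With this choice of λ, the pooled estimate Σn1 / (Σn1 + λ Σn2) reproduces the global level p, which is the property the correction is meant to enforce.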
Methyl-MAPS Data Analysis: Genomic feature annotations were downloaded from the UCSC Bioinformatics website (http://genome.ucsc.edu/), including CG islands and RefSeq gene annotations. All annotations and methylation data were indexed by CG site and were stored in a MySQL database for use in subsequent analyses. CG dinucleotides were overlapped with the following mutually exclusive genomic features, i.e., promoters, first exons, first introns, internal exons, internal introns, and so on. RefSeq gene annotations [49] were based on the Human Genome NCBI Build 36 and the Mouse Genome NCBI Build 37. Only genes with complete start and end coding sequences were used in our analysis. Orthologous gene annotation for human and mouse was obtained from the HomoloGene database (http://www.ncbi.nlm.nih.gov/homologene) and formed a subset of the RefSeq genes. For promoter analyses, promoter regions were defined as the 1KB upstream of the TSS. Classification of CG-poor and CG-rich promoters was determined by observed CG density within ±500bp centered at the TSS of RefSeq genes. Both human and mouse exhibited the expected bimodal distributions for CG density. These data were used to determine cutoff values for CG-poor and CG-rich promoters, with <0.07 and >0.07 CG densities, respectively. Here CG density is defined as (2 × #CG sites)/fragment length. For ease of representation, we also report the number of CG sites within a 100bp sequence fragment to describe CG density. Finally, evolutionarily conserved regions (ECRs) [26] were used to estimate correlations between DNA methylation conservation and sequence similarity. ECRs are sequence fragments greater than 100bp in length with greater than 70% sequence identity between human and mouse. For genome-wide sequence alignments, we used pair-wise human-mouse alignments [50].
ACKNOWLEDGEMENTS
We would like to thank Dr. Smiley for his efforts in dissection of auditory cortex specimens.
References
1. Chan, S.W.L., I.R. Henderson, and S.E.
Jacobsen, Gardening the genome: DNA methylation in Arabidopsis thaliana. Nature Reviews Genetics, 2005. 6(5): p. 351-360.
2. Law, J.A. and S.E. Jacobsen, Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nature Reviews Genetics. 11(3): p. 204-220.
3. Cheng, X.D. and R.M. Blumenthal, Mammalian DNA methyltransferases: A structural perspective. Structure, 2008. 16(3): p. 341-350.
4. Goll, M.G. and T.H. Bestor, Eukaryotic cytosine methyltransferases. Annual Review of Biochemistry, 2005. 74: p. 481-514.
5. Feng, S.H., et al., Conservation and divergence of methylation patterning in plants and animals. Proceedings of the National Academy of Sciences of the United States of America. 107(19): p. 8689-8694.
6. Ehrlich, M., et al., Amount and distribution of 5-methylcytosine in human DNA from different types of tissues of cells. Nucleic Acids Res, 1982. 10(8): p. 2709-21.
7. Bird, A., DNA methylation patterns and epigenetic memory. Genes Dev, 2002. 16(1): p. 6-21.
8. Lister, R., et al., Human DNA methylomes at base resolution show widespread epigenomic differences. Nature, 2009. 462(7271): p. 315-22.
9. Ramsahoye, B.H., et al., Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc Natl Acad Sci U S A, 2000. 97(10): p. 5237-42.
10. Cedar, H. and Y. Bergman, Linking DNA methylation and histone modification: patterns and paradigms. Nat Rev Genet, 2009. 10(5): p. 295-304.
11. Suzuki, M.M. and A. Bird, DNA methylation landscapes: provocative insights from epigenomics. Nature Reviews Genetics, 2008. 9(6): p. 465-476.
12. Antequera, F. and A. Bird, Number of CpG islands and genes in human and mouse. Proc Natl Acad Sci U S A, 1993. 90(24): p. 11995-9.
13. Bestor, T.H. and A. Coxon, Cytosine methylation - the pros and cons of DNA methylation. Current Biology, 1993. 3(6): p. 384-386.
14.
Weber, M., et al., Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nature Genetics, 2007. 39(4): p. 457-466.
15. Gardiner-Garden, M. and M. Frommer, CpG islands in vertebrate genomes. Journal of Molecular Biology, 1987. 196(2): p. 261-282.
16. Takai, D. and P.A. Jones, Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proceedings of the National Academy of Sciences of the United States of America, 2002. 99(6): p. 3740-3745.
17. Bajic, V.B., et al., Mice and men: Their promoter properties. Plos Genetics, 2006. 2(4): p. 614-626.
18. Ioshikhes, I.P. and M.Q. Zhang, Large-scale human promoter mapping using CpG islands. Nature Genetics, 2000. 26(1): p. 61-63.
19. Tanay, A., et al., Hyperconserved CpG domains underlie Polycomb-binding sites. Proceedings of the National Academy of Sciences of the United States of America, 2007. 104(13): p. 5521-5526.
20. Mohn, F., et al., Lineage-specific polycomb targets and de novo DNA methylation define restriction and potential of neuronal progenitors. Mol Cell, 2008. 30(6): p. 755-66.
21. Irizarry, R.A., et al., The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat Genet, 2009. 41(2): p. 178-86.
22. Kelly, M.P., C.T. Johnson, and J.M. Govern, Recognition memory test: validity in diffuse traumatic brain injury. Appl Neuropsychol, 1996. 3(3-4): p. 147-54.
23. Rollins, R.A., et al., Large-scale structure of genomic methylation patterns. Genome Res, 2006. 16(2): p. 157-63.
24. Edwards, J.R., et al., Chromatin and sequence features that define the fine and gross structure of genomic methylation patterns. Genome Res.
25. Bibikova, M. and L. Zhou, Genome-wide DNA methylation profiling using Infinium(R) assay. Epigenomics, 2009. 1(1): p. 177-200.
26. Loots, G. and I.
Ovcharenko, ECRbase: database of evolutionary conserved regions, promoters, and transcription factor binding sites in vertebrate genomes. Bioinformatics, 2007. 23(1): p. 122-4.
27. Jia, D., et al., Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature, 2007. 449(7159): p. 248-U13.
28. Glass, J.L., et al., CG dinucleotide periodicities recognized by the Dnmt3a-Dnmt3L complex are distinctive at retroelements and imprinted domains. Mamm Genome, 2009. 20(9-10): p. 633-43.
29. Feng, S., et al., Conservation and divergence of methylation patterning in plants and animals. Proc Natl Acad Sci U S A. 107(19): p. 8689-94.
30. Maunakea, A.K., et al., Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature, 2010. 466(7303): p. 253-7.
31. Dulac, C., Brain function and chromatin plasticity. Nature, 2010. 465(7299): p. 728-35.
32. Wilkinson, L.S., W. Davies, and A.R. Isles, Genomic imprinting effects on brain development and function. Nat Rev Neurosci, 2007. 8(11): p. 832-43.
33. Courchesne, E., R. Carper, and N. Akshoomoff, Evidence of brain overgrowth in the first year of life in autism. Jama, 2003. 290(3): p. 337-44.
34. Carper, R.A., et al., Cerebral lobes in autism: early hyperplasia and abnormal age effects. Neuroimage, 2002. 16(4): p. 1038-51.
35. Saitoh, O. and E. Courchesne, Magnetic resonance imaging study of the brain in autism. Psychiatry Clin Neurosci, 1998. 52 Suppl: p. S219-22.
36. Thompson, P.M., et al., Mapping adolescent brain change reveals dynamic wave of accelerated gray matter loss in very early-onset schizophrenia. Proc Natl Acad Sci U S A, 2001. 98(20): p. 11650-5.
37. Thomson, J.P., et al., CpG islands influence chromatin structure via the CpG-binding protein Cfp1. Nature, 2010. 464(7291): p. 1082-6.
38. Zhang, Y., et al., Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids Res, 2010. 38(13): p. 4246-53.
39. DeLisi, L.E., et al., The timing of brain morphological changes in schizophrenia and their relationship to clinical outcome. Biol Psychiatry, 1992. 31(3): p. 241-54.
40. Gur, R.E., et al., A follow-up magnetic resonance imaging study of schizophrenia. Relationship of neuroanatomical changes to clinical and neurobehavioral measures. Arch Gen Psychiatry, 1998. 55(2): p. 145-52.
41. Shenton, M.E., et al., A review of MRI findings in schizophrenia. Schizophr Res, 2001. 49(1-2): p. 1-52.
42. Buchanan, R.W., et al., Morphometric assessment of the heteromodal association cortex in schizophrenia. Am J Psychiatry, 2004. 161(2): p. 322-31.
43. Vasic, N., et al., Gray matter reduction associated with psychopathology and cognitive dysfunction in unipolar depression: a voxel-based morphometry study. J Affect Disord, 2008. 109(1-2): p. 107-16.
44. Ladd-Acosta, C., et al., DNA methylation signatures within the human brain. Am J Hum Genet, 2007. 81(6): p. 1304-15.
45. Xin, Y., et al., Genome-wide divergence of DNA methylation marks in cerebral and cerebellar cortices. PLoS One, 2010. 5(6): p. e11357.
46. Dwork, A., et al., Postmortem and in vivo structural pathology in schizophrenia, in Neurobiology of Mental Illness, E. Nestler and D. Charney, Editors. 2008, Oxford University Press. p. 201-320.
47. Shendure, J., et al., Accurate multiplex polony sequencing of an evolved bacterial genome. Science, 2005. 309(5741): p. 1728-32.
48. Karimi, M., et al., LUMA (LUminometric Methylation Assay)--a high throughput method to the analysis of genomic DNA methylation. Exp Cell Res, 2006. 312(11): p. 1989-95.
49. Pruitt, K.D., T. Tatusova, and D.R. Maglott, NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res, 2005. 33(Database issue): p. D501-4.
50. Schwartz, S., et al., Human-mouse alignments with BLASTZ. Genome Res, 2003. 13(1): p. 103-7.
Funding: This work was supported by grants from the National Institute of Mental Health (MH048514) and the National Human Genome Research Institute (HG002915). AHO was supported by an NRSA F30 fellowship from the NIMH (F30MH085471). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Figure Legends

Figure 1. Heatmap of human-mouse DNA methylation conservation with respect to CG density. Human-mouse pairwise alignments were obtained from the UCSC Genome Bioinformatics Browser, with DNA methylation correlation computed for ranges of CG density depicted as the number of CG sites within 100bp windows of aligned sequences. The observed correlation coefficients ranged from 0.02 to 0.88, with regions of ≥5 CG sites showing high methylation correlation.

Figure 2. Conservation of DNA methylation is correlated with CG dinucleotide density. Human–mouse pairwise sequence alignments were split into 100bp non-overlapping windows and average methylation levels were computed. The three panels correspond to poor, intermediate, and rich CG densities, defined as Poor: ≤3, Intermediate: >3 to 5, and Rich: >5 CG sites. Top panel: human-mouse DNA methylation patterns for poor (r=0.11), intermediate (r=0.43), and rich (r=0.80) CG densities. Bottom panel: percentage of gene features in both human and mouse within the three compartments.

Figure 3. Methylation patterns at promoter regions for human/mouse orthologous genes. A total of 9,926 orthologous genes with CG-rich promoters were examined in the analysis. Regions 5kb upstream and 5kb downstream of the TSS were split into 100bp non-overlapping windows, where each window is represented by the average methylation across all genes. (Top panel) CG methylation level, (Middle panel) CG density, and (Bottom panel) CG methylation difference in human and mouse.

Figure 4.
Distribution of conserved and altered DNA methylation in prefrontal and auditory cortices. DNA methylation conservation is correlated, and alteration is anticorrelated, with CG density. DNA methylation conservation increases with increasing CG density, whereas DNA methylation alteration increases with decreasing CG density.

Figure 5. Human cortex, patterns of DNA methylation and CG density for CG sites within genic compartments. Left panel shows that the extent of CG methylation depends on CG density and genomic context. Right panel shows the shift in transition from unmethylated to methylated states relative to CG density. CG sites with ≤0.2 and ≥0.8 methylation were assigned to the unmethylated and methylated groups, respectively.

Figure 6. Comparative analysis of patterns of DNA methylation conservation at the TSS of RefSeq genes within human prefrontal and auditory cortices. The top panel shows the percentage of conserved CGs in CG-poor (black) and CG-rich (red) promoters; the middle panel shows the methylation difference between prefrontal and auditory cortices for CG-poor and CG-rich promoters; the bottom panel illustrates CG density in the two types of promoters of RefSeq genes.

Figure 7. Methylation differences between human prefrontal and auditory cortices for CG islands in promoters and gene bodies. The top panel shows methylation differences in promoter (black) and gene-body (green) CG islands; the bottom panel shows CG density for all annotated islands in promoters and gene bodies.
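The density and methylation-state groupings used in these legends can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the CG density formula follows the definition (2 × #CG sites)/fragment length given in the methods, and the poor class is taken to mean at most 3 CG sites per 100bp, which is an assumed reading of the Figure 2 legend.

```python
from typing import Optional


def cg_density(sequence: str) -> float:
    """CG density as (2 x number of CG sites) / fragment length."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 2 * seq.count("CG") / len(seq)


def cg_sites_per_100bp(sequence: str) -> float:
    """Number of CG sites scaled to a 100bp fragment."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100 * seq.count("CG") / len(seq)


def density_class(sites_per_100bp: float) -> str:
    """Poor (<=3, assumed), intermediate (>3 to 5), or rich (>5) CG density."""
    if sites_per_100bp <= 3:
        return "poor"
    if sites_per_100bp <= 5:
        return "intermediate"
    return "rich"


def methylation_state(probability: float) -> Optional[str]:
    """Group a CG site by methylation probability; None for intermediate values."""
    if probability <= 0.2:
        return "unmethylated"
    if probability >= 0.8:
        return "methylated"
    return None


if __name__ == "__main__":
    window = "ACGT" * 25  # synthetic 100bp window containing 25 CG sites
    print(cg_density(window), density_class(cg_sites_per_100bp(window)))
    print(methylation_state(0.1), methylation_state(0.5), methylation_state(0.9))
```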
[Figure 1 (Figure1.eps): heatmap; axes: Human #CG/100bp and Mouse #CG/100bp; color scale: correlation coefficient between human and mouse.]
[Figure 2: high-resolution image not reproduced.]
[Figure 3 (Figure3.eps): panels: Methylation, #CG/100bp, Methylation difference; x-axis: Distance from TSS; series: Human, Mouse.]
[Figure 4 (Figure4.eps): x-axis: #CG/100bp; y-axes: Percentage of conserved DNA methylation, Percentage of altered DNA methylation.]
[Figure 5 (Figure5.eps): panels: Promoter/first exon, Internal exon, Internal intron; axes: Methylation probability, #CG/100bp, Density; series: Unmethylated, Methylated.]
[Figure 6 (Figure6.eps): panels: Percentage of conserved CGs, Methylation difference, #CG/100bp; x-axis: Distance from TSS (bp); series: CG poor, CG rich.]
[Figure 7 (Figure7.eps): panels: Average methylation difference, #CG/100bp; x-axis: Distance from CG island (Start, End); series: Promoter associated, Gene body.]
Supporting Information: supplementary_materials.doc