The evolution of genomic imprinting and X
CHAPTER 6: DISCUSSION AND CONCLUSION

The four individual publications that form the core of my thesis are connected by a common goal: to understand how and why genomic imprinting and X inactivation evolved in mammals. In the following discussion, I review these chapters and relate their findings to the aim of this thesis, concentrating specifically on points not discussed in the final section of each publication.

6.1 The evolution of imprinted loci

6.1.1 Evolution of the PWS-AS locus

In pursuing the core aim of my project, I first examined the evolution of imprinted loci in a single domain that was newly characterized in humans and mice, which contained regions responsible for Prader Willi Syndrome (PWS) and Angelman Syndrome (AS). In the publication that forms Chapter 2, Robert Rapkins and I, with other co-authors, presented data demonstrating that the PWS-AS imprinted domain was constructed recently by major genome rearrangement. This was an entirely unexpected result. UBE3A, the gene thought to be responsible for AS, was not located next to SNRPN (PWS) and other genes of human chromosome 15q11-13 in platypus, chicken and zebrafish, as we had expected. Instead, UBE3A was found next to CNGA3, a gene located on chromosome 2 in humans. This unexpected UBE3A-CNGA3 conformation proved to be shared with the wallaby and opossum genomes, implying that it was the ancestral vertebrate arrangement. A rearrangement that brought UBE3A and SNRPN together must have occurred in a eutherian ancestor after the divergence of eutherians from marsupials.

The only non-eutherian group from which we could isolate a SNRPN orthologue was the marsupials, where SNRPN is situated adjacent to SNRPB. The position of SNRPN in marsupials and its absence from the genomes of monotremes and non-mammalian vertebrates imply that SNRPN arose from SNRPB by tandem duplication. There are four other imprinted protein coding genes upstream of SNRPN in the eutherian PWS-AS domain of humans and mice.
All appeared to be missing from the genomes of marsupials, monotremes and non-mammalian vertebrates. The intronless MKRN3, MAGEL2 and NDN genes all have closely related intron-containing genes in the non-eutherian species, so they were likely to have been added to the PWS-AS domain by retrotransposition after the divergence of marsupials from eutherians. The fourth gene, SNURF, is of particular importance because it is implicated in the regulation of expression at the PWS-AS domain in humans and mice. We could find no marsupial progenitor of the SNURF protein encoding gene, or of the upstream regulatory elements, including the ICR, which controls imprinting of this entire region. Considering this, it was perhaps not surprising that we found non-imprinted, biallelic expression of SNRPN in wallaby, and of UBE3A in wallaby and platypus (Chapter 2, Figure 2.3).

Thus, the PWS-AS region was assembled from a range of components found in disparate regions of the genome, and acquired imprinting only in the eutherian lineage. This must have occurred relatively recently, since the divergence of marsupials and eutherians 180 MYA. Prior to the publication of these data, work on the IGF2 and IGF2R imprinted genes had led most to expect that all large imprinted domains evolved imprinting in the common ancestor of therian mammals in response to the evolution of placentation and viviparity. Our results showed for the first time that imprinting of a large domain can occur much later, within eutherian mammals alone.

6.1.1.1 Evolution of the snoRNAs in the PWS-AS domain

One striking feature of the PWS-AS domain is that it contains arrays of small nucleolar RNAs (snoRNAs). SnoRNAs are best known for their ability to guide posttranscriptional modification of target RNAs, the vast majority of which are non-coding structural RNAs, such as ribosomal RNA. Interestingly, at the PWS-AS domain there are two major arrays of snoRNAs, at least one of which does not target structural RNAs.
The HBII-52 array modifies alternative exons of the serotonin receptor (Kishore and Stamm, 2006b), while the HBII-85 array acts upon as yet unknown targets. Disruption of these arrays, in particular HBII-85, is thought to contribute largely to the phenotype of PWS (Sahoo et al., 2008). Significantly, our studies showed that marsupials do not possess these large arrays of the HBII-52 and HBII-85 snoRNAs, implying that their genesis and expansion coincided with the acquisition of imprinting at this region. However, we were not able to determine their evolutionary origin, something of extreme interest given the importance of these arrays in the pathology of PWS.

One very recent publication has provided an explanation for the origins of at least one of these arrays (Nahkuri et al., 2008). In this analysis, Nahkuri et al. identified a single snoRNA within the human SNRPB gene, named SNORD119. SNORD119 was also found within the SNRPB gene of all tetrapods, as well as in the SNRPN orthologue we discovered in marsupials. When the authors performed a phylogenetic analysis of SNORD119 and the eutherian HBII-52 snoRNA cluster, they found that the SNORD119 copy located within opossum SNRPN is more similar to the eutherian HBII-52 snoRNA repeats than any other SNORD119 homologue. From this they proposed that this SNORD119 homologue provided the initial snoRNA ‘seed’ for this locus in a eutherian ancestor, which was subsequently duplicated many times over to give rise to the extensive HBII-52 array (>40 copies in humans).

As for most other snoRNAs, the human SNORD119 snoRNA is thought to modify rRNA (Yang et al., 2006). Nahkuri et al. did not comment on the likelihood that either of the marsupial SNORD119 orthologues targets rRNA, the serotonin receptor (as HBII-52 does) or something else. It would be interesting to determine these targets and to understand why the SNORD119 ancestor multiplied to such a great extent only within eutherian mammals.
Like the many other unique characteristics of this locus, it may well correlate with the evolution of imprinting within this region.

6.1.1.2 Did PWS-AS acquire its imprinting from the X chromosome?

One interesting outcome of our studies on the PWS-AS locus was the finding that UBE3A and some of its neighbouring chromosome 15 genes are located on chromosome 5p in wallaby (Figure 2.2). This region of wallaby 5p is homologous to parts of the short arm of the human X chromosome, and the Y chromosome, as well as to 15q11-13 in which the PWS-AS region resides. As proposed in Chapter 2, this suggests that parts of the PWS-AS region may once have belonged to an ancestral eutherian X chromosome. We speculated that during its tenure on the X, this locus acquired monoallelic or imprinted expression as a part of a dosage compensation system, then retained it even after further genome rearrangements deposited it upon an autosome.

Although there may be more parsimonious explanations for the evolution of imprinting at this PWS-AS locus, this is an idea worth examining, particularly considering the several suggestions that imprinting evolved from X chromosome inactivation. Walter and Paulsen (2003) hypothesised that all imprinted genes evolved their unique expression in one region (such as the X chromosome, or a large autosome) at some stage during mammalian history. They proposed that these genes were then moved to several disparate genomic regions by duplication and translocation, leading to the current distribution of imprinted genes throughout the genome (Walter and Paulsen, 2003). This hypothesis was tested recently by Edwards et al. (2007), who localised the orthologues of many imprinted genes in wallaby and platypus. None of these localisations correlated with the sex chromosomes of either species, and generally these orthologous gene clusters mapped to disparate chromosomes.
Moreover, localisations of imprinted gene orthologues in chicken are similarly disparate (Dunzinger et al., 2005). Thus, Edwards et al. reject the hypothesis that all imprinted genes came from a single ancestral imprinted locus, or a sex chromosome, and then were translocated to other regions of the genome. As the ‘translocation’ of monoallelic expression does not appear to be a general phenomenon contributing to the evolution of imprinted domains, it is unlikely to have contributed to the evolution of the PWS-AS domain. I made a minor contribution to the Edwards et al. (2007) publication, so I have included a reprint of it as Appendix 2.

6.1.2. Evolution of the XIC

Many insights gained about the evolution of imprinted loci from the PWS-AS region were reiterated when evolution of the XIC was examined (Chapter 3). Firstly, as at the PWS-AS region, a host of non-coding RNAs apparently arose within the region homologous to the XIC in an early eutherian ancestor, the most significant of which was the XIST gene. I discovered that the A-repeat and exon IV regions of XIST are highly conserved in all eutherian mammals (Figures 3.2 and 3.S1). I predicted that if these sequences were functionally similar in marsupials or monotremes, then I should have been able to detect them easily. However, I could find no trace of these sequences in the full genome sequences of platypus and opossum (6x genome coverage) or the low coverage wallaby genome (2x coverage) using standard similarity searches. Equally, FTX, JPX, TSIX, Tsx and Xite orthologues, and a small cluster of extremely highly conserved miRNAs within the FTX transcript, could not be found outside the eutherian mammals. Thus, I concluded that these non-coding RNAs evolved only within the eutherian mammals, after their divergence from marsupials.

My conclusion that the non-coding RNAs of the XIC are eutherian-specific was independently demonstrated in a paper by Duret et al. (2006).
In this publication, the authors reported that of the five protein coding genes in chicken and Xenopus which are located in the region homologous to the XIC, two genes (LNX3 and FIP1L2) showed similarities to parts of XIST and the mouse-specific gene Tsx respectively. Although this similarity was near background, Duret et al. claimed homology between these two regions by demonstrating that it was unlikely the LNX3-XIST alignments would overlap exonic sequence if they were not related by common descent. As marsupials possess an intact coding version of the LNX3 gene, this would imply that marsupials do not have the non-coding XIST alternative. The finding that marsupials do not possess an XIST orthologue, yet still perform X inactivation, has been regarded as very surprising. I discuss this result with respect to proposed theories regarding the evolution of X inactivation in Section 6.3.2.

Although my studies and those of Duret et al. produced new insight into the evolution of the XIC, there were still questions left unanswered. For instance, if pseudogenisation of LNX3 gave rise to two regions of XIST, it is curious that only one (exon IV) is predicted to have any function, albeit non-essential (Caparros et al., 2002; Hore et al., 2007a). Yet other extremely important regions of XIST showed apparently no similarity to LNX3. The most significant of these is the A-repeat, which recruits genes to the centre of the XIST repressive domain (Chaumeil et al., 2006), and performs gene silencing through the formation of 8-10 double hairpin loops (Wutz et al., 2002). I discovered that the A-repeat was conserved, both in sequence and in the capability to form these 8-10 hairpin loops, in all four superordinal groups of eutherian mammals (Figure 3.S1). This conservation in structure implies that the crucial function of the A-repeat is conserved throughout all eutherian mammals. Yet, Duret et al.
could provide no explanation for the A-repeat origin except for the speculation that it was derived from inserted sequences such as transposable elements.

This prediction has turned out to be rather insightful, as a recent publication showed that although many parts of XIST are homologous to the LNX3 protein coding gene, the critical A-repeat and other tandem repeats of XIST are transposon derived (Elisaphenko et al., 2008). Thus, not only can an imprinted domain be assembled from multiple unexpected components, it appears that imprinted genes such as XIST can also have a mixed heritage from an eclectic set of genomic elements.

Another similarity between the PWS-AS domain and the XIC is that both their evolutionary histories were demarcated by genomic rearrangement. I discovered that the protein coding genes in the region homologous to the XIC of marsupials and monotremes independently underwent fission (Figures 3.3 and 3.4), a finding that was independently corroborated for marsupials by two other laboratories (Davidow et al., 2007; Shevchenko et al., 2007). Rearrangement of the region homologous to the XIC differed from the PWS-AS rearrangement of eutherian mammals in that, instead of bringing the region together in eutherian mammals, it broke it apart in marsupials and monotremes. However, there could be features shared between these loci that may have rendered them susceptible to genomic rearrangement. Both the XIC and the UBE3A-CNGA3 region underwent major expansion (~200% and 1000% respectively) prior to their translocation. It could be that this expansion event, presumably through insertion of repeats, ‘softened’ the surrounding loci and rendered them unstable. However, this hypothesis needs to be interpreted carefully, as the entire genome of mammals appears to have expanded by 150-300% from that found in birds and reptiles (http://www.genomesize.com).

6.1.3. Evolution of other imprinted loci

6.1.3.1 Genome protection and the evolution of imprinted loci

In Chapter 4 it was discussed how the discoveries of rearrangements and insertions at the XIC and PWS-AS locus related to what was known about the evolution of other imprinted loci (Hore et al., 2007b). It was emphasized that the evolutionary trajectories of the Callipyge and PEG10 imprinted loci were strikingly similar to those we had previously observed at the loci of interest. All shared the gain of multiple non-coding RNAs (both large and small) on acquisition of imprinting, along with the coincident expansion of their loci and the generation of new coding sequences by retroposition and translocation (Figure 4.2).

One aspect of PEG10 domain evolution that sets it apart from the other loci was its stepwise evolution. Only PEG10 from this locus was imprinted in marsupials, while the two neighbouring genes from this region, which are imprinted in humans and mice, appeared to be bi-allelically expressed (Suzuki et al., 2007). Although other scenarios are possible, it would appear that PEG10 imprinting occurred first (prior to therian radiation), then other genes from this domain succumbed to imprinted gene regulation during eutherian evolution only.

Could the ‘intermediate’ state of marsupials give further insight into how imprinted loci evolved? Suzuki et al. proposed that PEG10 retroposition was the initial cause of imprinted evolution at this locus (Suzuki et al., 2007). It is thought that genome protection mechanisms that silence inserted elements such as the PEG10 retrotransposon could be modified to give rise to parent-specific expression. Indeed, many transgenes inserted into mice develop expression that is biased toward one parent (Chaillet, 1994; Preis et al., 2003).
Despite these correlations, this concept is difficult to prove because although genome protection may explain silencing of retroposed genes, it does not explain why or how this would turn into parent-specific gene silencing. Moreover, the intermediate states required to properly resolve the evolutionary construction of imprinted loci are missing from extant species. Despite this, it is interesting to note that a very recent and thorough publication analysing the evolution of the Callipyge domain established that the first event occurring in the evolution of this locus was retroposition of the RTL1 gene in a therian ancestor, much like PEG10 (Edwards et al., 2008) [1]. Thus, it remains possible that protection against inserted elements may be a driving force of the evolution of genomic imprinting, or at least a mechanism by which genomic imprinting evolves at some loci.

[1] Note: this early retroposition event was not detected by Weidman et al. (2006) in their analysis of the marsupial Callipyge region, an error which was unintentionally propagated in our review in Chapter 4.

6.1.3.2 Non-coding RNAs, differential methylation and the evolution of imprinted loci

Non-coding RNAs and differential methylation are found at most imprinted loci in humans and mice, suggesting they are important for the regulation of imprinted genes throughout the genome. However, as discussed in Chapter 4 (Box 1), marsupial examples of non-coding RNAs and differential methylation have so far been hard to find. The IGF2R locus of marsupials apparently lacks both differential methylation and the non-coding RNA AIR (Weidman et al., 2006a). Moreover, the presence of small RNAs (miRNAs and snoRNAs) associated with the imprinted loci of marsupials has also been doubted, as these RNAs are usually well conserved between species and should be easily identified by similarity if they existed (Royo et al., 2006).
Finally, although XIST is not considered a classic imprinted locus, it is significant that the marsupial X lacks XIST and apparently does not feature differential methylation (Cooper et al., 1993; Kaslow and Migeon, 1987). If it is found that marsupial imprinted loci do, indeed, lack differential methylation and non-coding RNAs, then it would seem likely that they are not as essential for imprinted gene regulation as has been perceived. Recent data may put to rest some of these doubts. As discussed in Chapter 4, the discovery of differential methylation at the PEG10 locus was the first example of such regulation at an imprinted locus in a marsupial (Suzuki et al., 2005). This discovery has since been reiterated at the IGF2 locus – independent studies showed that differential methylation is a common feature of this locus in both American and Australian marsupials (Smits et al., 2008; Lawton et al., 2008). Perhaps this is not surprising given that marsupials also possess all members of the DNA methyltransferase family required for imprinting, [including the therian specific DNMT3L orthologue (Yokomine et al., 2006), as well as other regulators of imprinting - discussed in Section 6.2]. In their study of the IGF2 locus, Smits et al. (2008) also showed that marsupials possess a paternally-expressed orthologue of the non-coding RNA H19. This demonstration proves for the first time that, despite the absence of AIR and XIST, marsupials do possess imprinted macro RNAs. Of equal novelty was the finding that nestled within marsupial H19 is miR-675 – a miRNA also found within H19 of humans, mice and other eutherian mammals. Smits et al. noted that the marsupial miR-675 is not perfectly conserved in sequence with eutherian miR-675. As they suggest, this could mean that either miR-675 has a different target in these mammalian groups, or miR-675 has the same target in these groups, but that the target sequence has changed between the two. 
One example of a rapidly evolving target mRNA could be imprinted large non-coding RNAs themselves. In either case, the poor conservation of miR-675 implies that other small RNAs associated with imprinted loci in marsupials might be discovered by using low stringency searches.

6.1.4 Kinship and the evolution of imprinting

Perhaps one of the most fascinating intersections between theoretical genetics and molecular biology in recent times has been that of the kinship hypothesis and genomic imprinting. The kinship hypothesis predicts that parent-specific expression of genes evolved in mammals due to fitness conflicts between maternally- and paternally-derived genes over the exploitation of maternal resources during placentation and early development. Our comparative studies of imprinted loci in mammals and their closest relatives encompass the spectrum of viviparity and commitment to imprinting, from the egg-laying birds and reptiles capable of parthenogenesis (and therefore probably not imprinted) to the eutherian mammals with extended, complex gestation and around 100 imprinted genes. In this next section I examine novel insights this study provides into the evolution of viviparity, conflict and genomic imprinting, then make predictions about the nature of mammalian imprinting with regard to the kinship theory.

6.1.4.1 Imprinting and perceived conflict during gestation

Despite the initial stunning correlation between placentation and genomic imprinting in mammals, it is now clear that these unique traits are not inseparable within vertebrates and other animals. For example, our results from studies of orthologues of the PWS-AS domain (Chapter 2, Rapkins et al., 2006) were the first to demonstrate that imprinting of large domains could occur considerably after the arrival of viviparity in early therian mammals.
I confirmed this conclusion for the X-inactivation centre (Chapter 3, Hore et al., 2007a) and others have found similar eutherian-specific imprinting at the Callipyge locus (Weidman et al., 2006b; Edwards et al., 2008; reviewed Chapter 4, Hore et al., 2007b). I conclude, therefore, that imprinting did not arise as a genome-wide response to parental conflict, but was selected for locus-by-locus.

Should we be surprised that imprinted genes did not all immediately evolve parent-specific expression in response to the evolution of viviparity in therians 180-210 MYA? As discussed in Chapter 4, although marsupials undoubtedly possess placentation and viviparity, marsupial gestation is brief, generally non-invasive and gives rise to very underdeveloped young. The kinship hypothesis would therefore predict that marsupials have lower selection pressure on imprinted genes acting at the fetal-maternal interface than do eutherian mammals. This potentially explains why the XIC, Callipyge and PWS-AS loci all possess eutherian-specific imprinting.

From the discussions above, it seems entirely possible that the evolution of imprinted genes is responsive to changes in reproductive lifestyle and perceived levels of parental conflict. Thus, it may be instructive to make further predictions about the levels of imprinting in a given animal group on the basis of its reproductive lifestyle and administration of maternal care. There are a number of novel predictions made by the kinship hypothesis that are explored in the next sections.

6.1.4.2 Pre-natal conflict in marsupials and monotremes and the potential for novel imprinted genes

Marsupial gestation presents an opportunity for paternally-derived genes to sequester maternal resources. Although the marsupial yolk-sac placenta is not typically invasive, marsupial females do respond to hormones produced by the offspring.
Thus, expression of paternally-derived growth enhancing genes in an offspring might adversely affect the reproductive fitness of the mother, presumably by decreasing the growth of half-sibs found within multiple pregnancies. It seems possible that polyandrous marsupials with large litter sizes would have a greater potential for conflict during gestation, and may therefore possess more imprinted genes.

This argument seems to be applicable for eutherian mammals at least. Human pregnancies are predominantly singleton and therefore presumably experience less parental conflict over maternal resources during pregnancy relative to mice, which produce large litters. In line with this idea, IGF2R imprinting is found within all therian mammals, but is absent, or at least relaxed, in humans (Monk et al., 2006) and other Euarchonta (Dermoptera, Scandentia and Primates) (Killian et al., 2001b). Likewise, many genes from the Kcnq1 imprinted domain of mice show no evidence for imprinting in the human placenta (Monk et al., 2006).

Perhaps, then, the most interesting marsupials are the carnivorous dasyurids of Australia. Of these, the broad-footed marsupial mouse (Antechinus spp.) is particularly fascinating as males are semelparous (they experience only one breeding season, and then die) and only ~7% of females make it to two breeding seasons (Holleley et al., 2006). Antechinus make up for their ‘once in a lifetime’ breeding season by polygamy and high litter numbers. Females visit congregations of males in communal nests during the mating season, copulating with many suitable partners, a behavioural trait known as lekking (Naylor et al., 2008). Offspring from each litter may comprise animals with up to four different fathers (Fisher et al., 2006), and usually outnumber the teats the female possesses (Selwood, 1980). Fierce selection between half-sibs might drive the evolution of imprinting in these species.
This unexpected reproductive lifestyle is not restricted to Antechinus. Even the relatively large dasyurid, the Tasmanian devil (Sarcophilus harrisii), produces up to 50 offspring in one litter, despite possessing only four teats (Tyndale-Biscoe, 2005). Devils also possess a short reproductive lifespan that borders on semelparity when pressured by disease (Jones et al., 2008). It would be very interesting to examine whether or not these unique reproductive lifestyles result in increased parental conflict and increased numbers of imprinted genes in dasyurids and other small marsupials.

Over the years, there has been some debate over whether or not monotremes would be expected to possess imprinted genes as a consequence of conflict during gestation. Monotreme eggs expand considerably immediately prior to parturition, presumably due to uptake of endometrial nutrients secreted by the mother (Hughes and Hall, 1998). Some have argued that this makes monotreme development similar to that of marsupials, thereby providing an opportunity for paternally-derived genes to sequester maternal resources (John and Surani, 2000). However, freshly laid monotreme eggs are extremely underdeveloped, consisting of an embryo no larger than 18 mm and possessing only 19-20 somites (Hughes and Hall, 1998) (Figure 6.1). If it is even possible for paternally derived genes to influence endometrial secretion through the monotreme egg membrane, it would seem unlikely that they could gain much benefit from doing so. Thus, if some genes are imprinted in a monotreme, they probably do not act prior to hatching.

6.1.4.3 Imprinting, conflict and lactation

Without doubt a marsupial or monotreme mother makes her greatest maternal contribution to the development of her offspring during lactation (Figure 6.1).
The possibility that imprinting evolved in response to conflicts during lactation has long been proposed (Wilkins and Haig, 2003; Reik and Lewis, 2005), but is perhaps not as well discussed as conflict during gestation. Is it even possible for paternally-derived genes to enhance nutrient transfer from the mother during lactation?

[Figure 6.1 panels: Monotremes; Marsupials (~3 kg); Eutherians (~3 kg)]

Figure 6.1. Maternal investment during gestation and lactation. The total mass of monotreme litters at weaning relative to the mother's mass is around 50% (calculated from echidna data reported in Griffiths, 1978). This proportion is similar for marsupial (55%) and eutherian mammals (59%) of comparable size (Hayssen et al., 1985). However, marsupial and monotreme gestation (red) contributes relatively little (<0.1%) to this mass compared with lactation (cream). Similarly, the developmental stage of eutherians at birth is far more advanced than the altricial young of marsupials and the extremely underdeveloped (19-20 somites) monotreme embryo post-partum (depicted below, not to scale).

In humans, rodents and well characterised domesticates, milk let-down by the mother requires mechanical stimulation of the areola, followed by release of the nona-peptide oxytocin from the hypothalamus (reviewed Uvnas-Moberg and Eriksson, 1996). This response appears to be conserved in all mammals, as even marsupials and monotremes require hypothalamic secretion of oxytocin to let down milk (Chauvet et al., 1985). Thus, it seems likely that even in monotremes (which lack teats), if infants do not adequately stimulate the mother, they will not be provided with milk. The possibility that the desire to suckle is enhanced by paternally-derived genes seems plausible.
PWS (and AS in mice) are characterised by abnormal feeding behaviours (Haig and Wharton, 2003; Cattanach et al., 1997) and knockout of the paternally expressed GNASXL is responsible for a failure to suckle in human and mouse neonates (Plagge et al., 2004; Genevieve et al., 2005). It would be extremely interesting to see if GNASXL is also imprinted in marsupials and monotremes (Reik and Lewis, 2005).

Potential conflict during lactation may not be restricted to behavioural traits. It is now becoming clear that imprinted genes have a role in fine-tuning postnatal energy homeostasis systems by altering metabolic rate, appetite, and fat deposition (Haig, 2004; Smith et al., 2006; Charalambous et al., 2007). Imprinted genes apparently achieve this through modification of a variety of control systems, including the Leptin, β-adrenergic, thyroid, and insulin pathways as well as thermogenesis. Generally, paternally expressed imprinted genes promote metabolic rate, appetite and fat deposition, whereas maternally expressed genes act oppositely to repress these energy consuming processes (Haig, 2004; Smith et al., 2006; Charalambous et al., 2007). Thus, it may be that parental conflict acts during early lactation to fine-tune not only the amount of milk received by the young (i.e. through behavioural adaptations), but also how sparingly the young use the energy derived from it.

Further investigation of post-natal imprinting, particularly in the marsupials and monotremes, will be required to clarify this possibility. It could be that marsupials and monotremes possess a whole suite of imprinted genes acting at lactation and early post-natal development which have been lost in eutherian mammals due to decreased selection. In light of recent technological advances, high-throughput cDNA sequencing of hybrid species (which should have widespread single nucleotide polymorphisms) might be an efficient way to identify potential candidates.
6.1.4.4 Parental conflict in non-mammalian vertebrates

Viviparity and placenta-like structures have evolved independently in multiple vertebrate lineages, including mammals, cartilaginous and teleost fish, and reptiles. If genomic imprinting were driven by parental conflict, then perhaps there should be evidence for imprinting within these viviparous species. However, of the live-bearing killifish that have so far been tested (Heterandria formosa and Poeciliopsis prolifica), there is no evidence of imprinting at the IGF2 gene (Lawton et al., 2005). Furthermore, the live-bearing Amazon molly (P. formosa) (Hubbs and Hubbs, 1932) and hammerhead shark (S. tiburo) (Chapman et al., 2007) are both capable of parthenogenesis, which, as discussed earlier, is incompatible with imprinted expression of essential genes (Table 1.1). However, despite the absence of evidence for imprinting in viviparous non-mammalian vertebrates, there is evidence for parental conflict. Accelerated positive evolution of IGF2 has been discovered in killifish species which have evolved viviparity, whereas their egg-laying relatives have normal rates of IGF2 evolution (O'Neill et al., 2007). It is thought that this accelerated evolution has been caused by successive selective sweeps of antagonistic mutations which favour either the maternal lineage or the paternal lineage. Thus, it would appear that killifish use alternative non-imprinted mechanisms to wage the parental wars they experience due to viviparity.

6.2 The evolution of regulators of imprinting and X inactivation

6.2.1 The evolution of CTCF, BORIS and imprinting

A major approach I used to understand the evolution of imprinting and X inactivation was to examine the evolution of two closely related genes, CTCF and BORIS, that regulate these epigenetic mechanisms.
The CTCF protein contributes to the regulation of random X inactivation by mediating the transient co-localisation of the two X inactivation centres (XICs) prior to establishment of monoallelic expression of Xist (Xu et al., 2007). CTCF also contributes to the regulation of imprinted expression of Igf2/H19 by binding to the ICR on the maternally-derived chromosome, protecting it from methylation (Schoenherr et al., 2003; Pant et al., 2003; Engel et al., 2006), upregulating H19 (Engel et al., 2006) and simultaneously silencing Igf2 by blocking its access to downstream enhancers (Hark et al., 2000; Bell and Felsenfeld, 2000). In a similar fashion, CTCF may also assist in regulating imprinted expression of Xist in the extraembryonic tissues of mice. It binds to multiple regions of differential methylation near Xite and Tsix, which together regulate expression of Xist (Boumil et al., 2006). BORIS, on the other hand, has not yet been found to affect the regulation of X inactivation, but appears to have a role in establishing germline imprinting at the Igf2/H19 locus. Using an artificial transgene system in the oocytes of Xenopus (which do not possess imprinting), the Igf2/H19 ICR of mouse was methylated only when co-injected with BORIS, a histone modifier called PRMT7 and members of the DNA methyltransferase 3 family (Jelinic et al., 2006). Of the DNA methyltransferase 3 members essential for the establishment of methylation at Igf2/H19, Dnmt3L is the most interesting, as it is found in species with imprinting (marsupials and eutherians) but not in species without imprinting (chicken and other non-mammalian vertebrates) (Yokomine et al., 2006). In a similar vein, Loukinov et al. proposed that BORIS might not be required in platypus, chicken or other non-mammalian vertebrates that do not possess imprinting of IGF2 or other genes (Loukinov et al., 2002).
They based this hypothesis on their inability to find a BORIS orthologue in the chicken genome, although they could find human and mouse orthologues of BORIS. In contradiction to this hypothesis, I discovered BORIS orthologues in all mammal groups and in reptiles, and found traces of BORIS sequence in chicken. This implies that BORIS arose sometime during early amniote evolution, much earlier than the evolution of imprinting and the date predicted by Loukinov et al. (2002). Significantly, I found that BORIS was widely expressed in the monotreme (platypus) and reptile (bearded dragon) I examined, involving multiple somatic and reproductive tissues. Yet, BORIS expression was restricted to the germline of the marsupial (wallaby) and eutherian (cattle) mammals I tested, as it is in humans and mice. Thus, although BORIS predates imprinting by over 100 million years, the restriction of BORIS expression to the germline coincides with the evolution of imprinting, suggesting that this gene was recruited to the regulation of imprinting in therians relatively recently.

6.2.2 BORIS expression in ovary and oocytes

One of the interesting discoveries I made regarding CTCF and BORIS in species other than humans and mice was that wallaby and cattle expressed BORIS in the ovary as well as the testis. This expression pattern contradicted other reports that in non-pathological situations BORIS is restricted to the developing and adult testis of humans and mice (Loukinov et al., 2002; Jelinic et al., 2006; Kholmanskikh et al., 2008). It was not then clear whether this represented a real species difference or incomplete data in humans and mice, so in our paper we did not emphasize the importance of the finding of BORIS expression in ovary. However, while our manuscript was in press, a publication appeared that provides strong support for my findings (Monk et al., 2008).
The authors of this paper report BORIS expression in human ovary at a level around 10% of that found in human testes. BORIS expression in individual oocytes was higher, in many cases exceeding that of BORIS expression in testes. This implies that expression of BORIS in both the male and female germlines is a trait conserved in all therian mammals. This finding is not easy to reconcile with the proposed role of BORIS in establishing paternal-specific methylation at imprinted genes. If BORIS is expressed in both parental germlines, it could not be the sole factor which differentiates maternal imprints from paternal imprints. However, it could be that BORIS has a broader function within the cell, resetting epigenetic marks during male and female gametogenesis (discussed further below). Clearly, more research into the role of BORIS during gametogenesis and imprint establishment is required.

6.2.3 CTCF, BORIS, cancer and reprogramming

Although not specifically related to the evolution of genomic imprinting and X inactivation, there are some wider implications of my research into the evolution of CTCF and BORIS which I will discuss briefly. As introduced in Chapter 5, the mutually exclusive expression of CTCF and BORIS in humans and mice, along with the observation that they bind common sites (e.g. the Igf2/H19 ICR) but have opposite regulatory effects upon them, suggests that CTCF and BORIS are antagonistic epigenetic regulators. This proposed ‘sibling rivalry’ is perhaps best demonstrated by the finding that expression of BORIS can displace CTCF binding upstream of the anti-apoptotic gene BAG1 and some members of the cancer-testis antigen family (Vatolin et al., 2005; Hong et al., 2005; Kang et al., 2007; Sun et al., 2008). Displacement of CTCF by BORIS results in an alteration of the epigenetic status of these target genes, leading to their upregulation and potentially the induction of carcinogenesis.
Do my results from species outside human and mouse support the current view of CTCF and BORIS antagonism, and its potential role in carcinogenesis? At least in monotremes and reptiles, where I found co-expression of these genes in multiple normal somatic tissues, one could assume that these genes are not antagonistic in a deleterious way. I speculate that if these genes are antagonistic in human and mouse, this must represent a recent change in function. It probably did not evolve until the restriction of BORIS to the germline in early therian mammals, perhaps as a result of its recruitment to the regulation of imprinting (Figure 5.5). In order to test this hypothesis sufficiently, some crucial experiments are required. Firstly, I need to date the evolution of mutually exclusive expression of CTCF and BORIS. Presumably it was not prior to the divergence of monotremes from the mammalian tree, as they show co-expression of CTCF and BORIS in multiple somatic tissues, including liver and other organs with essentially homogeneous cell type. It seems possible that in cattle and wallaby gonads CTCF and BORIS are transcribed in different cell types, as this is what occurs in the testes of humans and mice (Loukinov et al., 2002; Jelinic et al., 2006). This could be tested by in situ expression analysis or immunohistochemistry to localise transcripts or products of the two genes. To further examine when CTCF and BORIS evolved mutually exclusive expression, it may also be instructive to determine if CTCF represses BORIS expression in other species in the same way that it does in human, by binding to its promoter (Renaud et al., 2007). Preliminary bioinformatic analysis indicates that CTCF binding sites are common in the BORIS promoter of eutherians and some marsupials but not in the green anole lizard (T.H. unpublished data).
In line with these findings, one might predict that in monotremes and reptiles, in which CTCF and BORIS expression overlap, CTCF would not be a negative regulator of BORIS and therefore the BORIS promoter would not require CTCF binding sites. It would be interesting to confirm these results experimentally. If it can be confirmed that BORIS is required for the epigenetic resetting of CTCF sites during imprint establishment at gametogenesis, it seems distinctly possible that BORIS functions more widely in epigenetic reprogramming of the cell during development. This possibility finds some support in the observation that CTCF binds to over 20,000 different sites within the human genome, the vast majority of which are not imprinted. Presumably, BORIS binds to at least a subset of these non-imprinted CTCF sites, including, of course, the previously discovered non-imprinted sites upstream of BAG1 and the CT-genes (Vatolin et al., 2005; Hong et al., 2005). Considering these possibilities, it is of great interest to understand exactly where BORIS binds within the genome and what effect it would have on mammalian development if BORIS were knocked out.

6.3 The evolution of X inactivation

The majority of discussion so far in this thesis regarding the evolution of X inactivation has centred upon the XIC and XIST (Chapters 3 and 4). XIST is the key regulator of X inactivation in humans and mice, but I showed that, despite its high conservation between eutherian mammals, it is absent from the marsupial genome. However, the absence of XIST and the XIC from marsupials shows that there is much more to mammalian X inactivation than XIST and the XIC. In this next section I will first discuss the long neglected topic of dosage compensation in monotremes. Understanding monotreme dosage compensation is crucial to understanding the evolution of mammalian X inactivation, and is therefore of great importance to this study.
Moreover, I contributed to a recent paper describing its characteristics for the first time (Appendix 3). Following this, I will further examine marsupial X inactivation and attempt to predict the extent to which it is related to other mammalian forms of dosage compensation.

6.3.1 Dosage compensation in monotremes

6.3.1.1 Monotreme sex chromosomes are bird-like and share no homology with therian sex chromosomes

The evolution of dosage compensation systems is inextricably linked to the evolution of the sex chromosomes. It is only through degradation of the sex-determining chromosome (the Y in therian mammals) that an imbalance in gene dosage occurs between the sexes, thus requiring dosage compensation. Ohno predicted over 40 years ago that dosage compensation mechanisms might, in turn, constrain the evolution of sex chromosomes, repressing genomic rearrangements that disrupt dosage compensation systems (Ohno, 1967). Thus, it is useful first to discuss the evolution of mammalian sex chromosomes, before examining the evolution of their dosage compensation systems. The inspiration for “Ohno’s Law” came from the observation that the X chromosome of mammals appeared to represent a similar proportion of the total genome size in a wide range of species. Comparative gene mapping and painting vindicated this rather wild guess. A modification of Ohno’s Law to include the marsupials had to be made when it was discovered that the marsupial X is homologous to about 60% of the eutherian X, but that the rest (corresponding to most of the short arm of the human X) is autosomal in marsupials and belongs to a separate evolutionary block in chicken, defining conserved and added regions of the eutherian X (Graves, 1995). Early gene mapping suggested that Ohno’s Law also extended to the X chromosome of the platypus.
However, early this year, it was unequivocally reported that the multiple sex chromosomes of monotremes do not possess any homology to the therian X chromosome (Veyrunes et al., 2008). Instead, genes from the therian X are located almost exclusively on platypus chromosome 6. This result came as a big surprise, because it contradicted not only Ohno’s Law, but also the body of early mapping data (Watson et al., 1990; Wilcox et al., 1996; Mitchell et al., 1998). In hindsight, there were indications that the initial localisations by radioactive in situ hybridization with human cDNA probes may have been unreliable, as several therian X genes were subsequently mapped to platypus chromosome 6 (Waters et al., 2005). An example of this comes from Chapter 3, where I demonstrate that the region homologous to the XIC of eutherian mammals is located in two separate regions of platypus chromosome 6 (Figure 3.3). Intensive comparative mapping of sequenced platypus contigs showed that the five platypus X chromosomes had considerable homology to the bird Z chromosome (Veyrunes et al., 2008). Thus, it seems likely that the monotreme sex chromosomes represent an ancient reptilian ZW or XY pair, and that the therian XY system evolved from a different autosome much more recently than first suspected, 180-210 MYA. This hypothesis is supported by the finding that retroposition of housekeeping genes off the X chromosome is more recent than would be expected if the therian sex chromosomes were older (Potrzebowski et al., 2008).

6.3.1.2 Dosage compensation in monotremes occurs by incomplete and variable X inactivation

Given that monotreme sex chromosomes apparently evolved independently of therian sex chromosomes, it is of particular interest to establish whether or not monotremes possess a dosage compensation system that equalizes expression from their sex chromosomes in males and females.
If so, it is of further interest to determine if this dosage compensation is in any way related to the therian X inactivation system. A recent study on which I was a minor author (Appendix 3) examined this very question (Deakin et al., 2008). Quantitative real-time PCR experiments comparing male and female platypus showed that at least some X-linked genes have approximately equal levels of expression, despite the fact that females possess a double dosage of these genes relative to males. This demonstrates for the first time that some form of dosage compensation exists in monotreme mammals. However, many other X-linked genes from the same experiment showed higher expression in females relative to males. This indicates that the dosage compensation system of platypus is incomplete, and varies between genes. This situation parallels that in birds, which also show incomplete and locus-specific dosage compensation (Itoh et al., 2007). To determine the nature of the dosage compensation occurring in monotremes, Deakin et al. performed RNA-FISH on platypus cultured cells and counted the number of actively transcribing alleles. A proportion of female cells displayed monoallelic expression of X-linked genes, indicative of X inactivation. Multicolour FISH showed that genes on the same chromosome were co-ordinately repressed or expressed. Single nucleotide polymorphisms were identified within these inactivated genes and, when amplified from cDNA, showed no allelic bias. Thus, the dosage compensation system of monotremes appears to be an incomplete form of random X inactivation that is different for different genes or domains. Is dosage compensation in birds and monotremes related to X inactivation in therian mammals? It seems more likely that monotreme X inactivation evolved independently, as the therian X shares no homology with the platypus sex chromosomes.
Like the evolution of imprinted genes and other epigenetic traits, the evolution of dosage compensation systems in mammals is probably labile and responsive to changes in selection pressure.

6.3.2 Marsupial X inactivation

Given that the marsupial and eutherian X chromosomes share a common evolutionary history, it would seem likely that their dosage compensation systems are also identical by descent (monophyletic). If so, we might expect that they would share similar molecular regulatory mechanisms. Indeed, the marsupial inactive X shows late replication (Graves, 1967) and hypoacetylation (Wakefield et al., 1997), just as the inactive X of eutherians does. Furthermore, like that of eutherians, the marsupial inactive X forms a repressive domain from which RNA polymerase II and active histone marks are excluded (Chaumeil et al., 2008, in preparation). However, the hypothesis that the X inactivation systems of eutherians and marsupials are evolutionarily related is not supported by observations that the marsupial inactive X differs from that of eutherians: it does not form a Barr body in adults, apparently lacks DNA methylation, and is considered to be leaky and tissue-specific. Perhaps the most significant of all these differences is the fact that marsupials lack the XIST gene (Chapter 2) (Hore et al., 2007a; Duret et al., 2006). Further evidence for dissimilar forms of X inactivation regulation between marsupials and eutherians was provided by the recent analysis of opossum genome sequence, in which I played a minor role (Mikkelsen et al., 2007, Appendix 4). Firstly, the opossum X lacks enrichment of LINE1 elements compared with its homologous region on the eutherian X. LINE1 elements are proposed to be an essential part of eutherian X inactivation, providing boosters or ‘way-stations’ assisting in the spread of XIST along the X chromosome (reviewed in Lyon, 2006).
Secondly, the opossum X chromosome has suffered rearrangement at a rate similar to autosomal loci. In contrast, the eutherian X chromosome has experienced significantly less rearrangement, presumably because rearrangement would disrupt interactions between XIST and regions of the X intended for inactivation. Considering the major differences between marsupial and eutherian X inactivation, is it still possible that these systems are related by common evolutionary descent? The view that marsupial X inactivation represents an ‘ancestral’ form of X inactivation, from which eutherians developed a tighter regulatory system, was proposed over 35 years ago (Cooper, 1971). This hypothesis is yet to be disproved. The advent of the XIST noncoding RNA, and recruitment of methylation and LINE1 elements, could all be mechanisms by which X inactivation was stabilised and randomised during early eutherian evolution. An alternative hypothesis is that the marsupial and eutherian forms of X inactivation evolved independently. It is possible that when eutherians and marsupials diverged ~180 MYA, their sex chromosomes would not have become sufficiently degraded to require a chromosome-wide dosage compensation system (Duret et al., 2006). After separation, the marsupial and eutherian sex chromosomes would have degraded independently, thus requiring independent genesis of dosage compensation systems. When this hypothesis was first proposed, it was supported only by the evidence that marsupial and eutherian X inactivation show many differences at the phenotypic and molecular levels. However, additional support for this hypothesis came recently from the finding that monotreme sex chromosomes do not possess any homology to therian sex chromosomes (Veyrunes et al., 2008).
Thus, the predicted age of the X and Y in marsupials and eutherians is 180-210 million years (between the divergences of monotremes and marsupials from eutherians), much less than the 210-310 million years previously accepted from the time between the divergence of mammals from birds and the divergence of monotremes from therians. If the rate of Y chromosome degradation has been constant throughout therian history and current dates of mammalian divergence are correct, one would predict that a maximum of ~14% (30/210 million years) of therian Y chromosome genes would have been lost by the time that marsupials and eutherians diverged. Indeed, this figure may be even lower, as degradation might be expected to be initially slower since the target region starts off small (Graves, 2006). Although other scenarios are possible, this crude analysis would lend support to the hypothesis that marsupial and eutherian forms of X inactivation could have evolved independently. Additional support for this hypothesis comes from a bioinformatic analysis examining the effects of MSCI upon gene movements in marsupials and eutherians (Potrzebowski et al., 2008). As previously discussed, the likely function of MSCI is to balance the dosage of sex-linked genes within haploid germ cells of the heterogametic sex. Thus the onset of MSCI in therians can act as a proxy for increased levels of X and Y divergence. One consequence of MSCI is that it presumably switches off the expression of X-linked housekeeping genes, which may in turn be required for survival of the haploid cell. Therefore, one might expect that retroposition of X-linked housekeeping genes would be enriched upon the onset of MSCI in therians. Indeed, Potrzebowski et al. reported that both marsupials and eutherians have significantly enriched rates of gene retroposition off the X chromosome to autosomes relative to the expected retroposition rate (p < 0.01).
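The crude estimate of Y chromosome gene loss made above can be written out explicitly. The sketch below assumes, as the text does, a constant rate of degradation; the symbols f_max, t_origin and t_split are introduced here purely for illustration, taking the upper bound of the 180-210 MYA age range as the origin of the therian XY pair and ~180 MYA as the marsupial-eutherian divergence.

```latex
% Illustrative only: f_max, t_origin and t_split are assumed symbols, not notation from the thesis.
% Assumes constant-rate (linear) Y degradation since the origin of the therian XY pair.
\[
f_{\max} \;=\; \frac{t_{\mathrm{origin}} - t_{\mathrm{split}}}{t_{\mathrm{origin}}}
         \;=\; \frac{210 - 180}{210}
         \;=\; \frac{30}{210} \;\approx\; 0.14
\]
```

Here f_max is the proportion of the total elapsed time that had passed by the divergence, which under a constant rate corresponds to the proportion of degradation that had occurred by then. Taking the lower bound of the age range instead (an origin at 180 MYA) would give a proportion of zero, which is why ~14% is a maximum.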
Importantly, not one of the 46 X to autosome retroposition events identified by Potrzebowski et al. occurred in the ancestor of all therian mammals; instead, they occurred in the marsupial and eutherian lineages independently. This implies that the majority of X and Y divergence (and therefore selection for X inactivation) occurred after the divergence of marsupials and eutherians. The above discussion suggests that X inactivation evolved independently in marsupials and eutherians, and that both evolved independently of monotreme X inactivation. Thus, it would appear that dosage compensation systems can efficiently evolve in response to changes in the gene dosage upon sex chromosomes if necessary. As previously discussed, responsiveness to selection is also a characteristic of imprinted genes - different imprinted domains have acquired imprinting at different times, presumably in response to changing selective pressures.

6.4 Conclusion

In this thesis I have examined the evolution of genomic imprinting and X inactivation, two overlapping epigenetic traits with considerable impact upon medicine, genetics and molecular biology. I extended comparisons of these traits across a wide range of mammals and other vertebrates, discovering that the evolution of imprinting at the domain responsible for Prader-Willi and Angelman syndromes occurred recently within the eutherian lineage only. Coincident with the fusion of regions containing the putative AS gene UBE3A and a putative PWS gene SNRPN was considerable genomic rearrangement, duplication, retroposition, and the genesis of novel non-coding RNA genes, both large and small. I also discovered that the X-inactivation centre showed a similar evolutionary trajectory, being eutherian-specific and put together in a similar manner by rearrangement, retroposition and the genesis of regulatory RNAs such as the critical XIST gene.
When reviewing related work, I found similar trends of complex and unexpected genomic events coinciding with the evolution of imprinting at given loci. Lastly, I discovered that the evolution of function, but not sequence, of epigenetic regulators also correlates with the onset of imprinting. I demonstrated that, although BORIS evolved by an ancient duplication of CTCF, its specific expression in the germline occurs only in species that possess imprinting. In species without imprinting, however, it shows much wider expression, much like that of CTCF, the gene from which it arose by duplication. Throughout my studies I have been struck by the close similarities between the evolution of different imprinted domains and of X inactivation, but also by some striking differences. For instance, all appear to use similar regulatory mechanisms such as DNA methylation, histone modification, CTCF insulation and the use of regulatory non-coding RNA. Yet, my discovery that not all large imprinted loci evolved at the same time, a discordance that is now confirmed at many imprinted sites, implies that the evolution of epigenetic traits occurs on a case-by-case basis, in response to changes in selection over relatively small evolutionary timescales. Furthermore, as I proposed in section 6.3.2 above, the very similar X inactivation systems of marsupial, eutherian and monotreme mammals may have evolved independently in each of the three lineages. Perhaps this has been achieved by dipping into a common “toolbox” of epigenetic modulators, which are used by the cell in a very general way to heritably modify gene expression patterns during development. This is suggested by the use of DNA methylation to silence genes in plants, and histone hyperacetylation to upregulate the single X chromosome in Drosophila males.
Yet, despite convergent recruitment of common epigenetic modifiers, it also appears that there has been significant innovation during the evolution of imprinting and X inactivation to tailor these run-of-the-mill epigenetic modifiers into enigmatic non-Mendelian traits with unexpected consequences for biology and medicine. Examples of this innovation examined in my studies include the genesis of XIST, the key controller of many general epigenetic processes essential to X inactivation in eutherians, and potentially the establishment of BORIS as the marker of the paternally derived IGF2/H19 ICR, around which imprinting is controlled at this locus. In the future, it will be extremely exciting to uncover more of the common themes and unexpected innovations surrounding these and other epigenetic processes.