Small, K, Wagener, M and Warren, ST: Isolation and characterization of the complete mouse emerin gene. Mammalian Genome 8:337-341 (1997).
Mammalian Genome 8, 337–341 (1997). © Springer-Verlag New York Inc. 1997

Isolation and characterization of the complete mouse emerin gene

Kersten Small, Maylene Wagener, Stephen T. Warren

Howard Hughes Medical Institute and Departments of Biochemistry and Pediatrics, Emory University School of Medicine, Room 4035 Rollins Research Center, 1510 Clifton Road, Atlanta, Georgia 30322, USA

Received: 1 December 1996 / Accepted: 12 January 1997

Abstract. Emery-Dreifuss muscular dystrophy (EMD) is an X-linked recessive disorder associated with muscle wasting, contractures, and cardiomyopathy. The responsible emerin gene has recently been identified and found to encode a serine-rich protein similar to lamina-associated protein 2 (LAP2), although the disease mechanism remains obscure. In order to pursue the pathophysiology of this disorder, we report here the isolation and characterization of the complete mouse emerin gene. The emerin cDNA was isolated from murine strain BALB/c, and the emerin gene was isolated from strain 129. The 2.9-kb mouse emerin gene was completely sequenced and found to be composed of 6 exons and to encode a protein 73% identical to the human protein. Key similarities with LAP2 were found to be conserved, including critical LAP2 phosphorylation sites. Examination of the murine promoter revealed three previously unrecognized cAMP response elements (CRE) conserved between human and mouse. While Northern analysis shows emerin to be widely expressed in the mouse, as it is in humans, these promoter elements may indicate cAMP responsiveness. These data provide the necessary elements to further investigate EMD in a murine system.

Introduction

Emery-Dreifuss muscular dystrophy (EMD) is an X-linked recessive disorder characterized by progressive muscle wasting and weakness, contractures of the elbows, Achilles tendons, and postcervical muscles, and cardiomyopathy (McKusick 1994; Hopkins and Warren 1993). This disorder maps to human Xq28 (Consalez et al. 1991), and the responsible gene has recently been identified (Bione et al. 1994). A number of unique and typically null mutations have been documented within the emerin gene among patients with the classic phenotype and X-linked inheritance (Bione et al. 1995; Nigro et al. 1995; Klauck et al. 1995). The human emerin gene is ubiquitously expressed and encodes a serine-rich protein that localizes to the nuclear membrane in both skeletal muscle and heart (Nagano et al. 1996; Manilal et al. 1996). The predicted emerin protein shows structural similarity with two regions of the nuclear protein thymopoietin (Harris et al. 1994).

Recently, the rat homolog of β-thymopoietin has been determined to be LAP2, an integral protein of the inner nuclear membrane (Furukawa et al. 1995). LAP2 has been shown to bind directly to both lamin B1 and chromosomes in a mitotic phosphorylation-regulated manner (Foisner and Gerace 1993). As LAP2 is thought to be involved in nuclear envelope assembly and/or anchoring, it is unclear how a deficiency of emerin, which displays both structural and cellular similarities with LAP2, leads to a form of muscular dystrophy. In order to pursue the pathophysiology of emerin deficiency and to develop a mouse model for EMD, we report here the isolation and characterization of the complete mouse emerin gene.

Correspondence to: S.T. Warren

The nucleotide sequence data reported in this paper have been submitted to GenBank and have been assigned the accession number U79753.

Materials and methods

Isolation of cDNA and genomic clones. A mouse skeletal muscle λgt10 cDNA library (Clontech) was screened as described by Price et al. (1996). The probe used in this experiment was the human emerin cDNA generated by RT-PCR (Bione et al. 1994). Forty-three cDNA clones were identified in this screen, and the most strongly hybridizing clone, 3-1, was isolated and sequenced. RT-PCR was performed with the GeneAmp RNA PCR kit (Perkin Elmer). Each RT-PCR reaction contained ∼500 ng mouse RNA isolated from mouse embryonic stem cells (Trizol Reagent, Gibco) as template, with random hexamers supplied in the kit used for first-strand synthesis. A full-length cDNA representing the entire coding region of mouse emerin was generated by RT-PCR with primers R30: 5′ CGGTTGGTTTCTTGGGCCCTGTCTG and 3R: 5′ TCCATGAAAGCAAAGCCAGGGTG. A mouse strain 129 genomic P1 library (Genome Systems, Inc.) was screened with primers R2: 5′ GGGTATTGGTTTTAGAGC and 3R. All PCR amplifications were performed in a Perkin-Elmer 9600 thermal cycler for 35 cycles (denaturation, 45 s, 95°C; annealing, 45 s, 60°C or 65°C; extension, 1 min, 72°C).

Nucleotide sequencing. Double-stranded sequencing of plasmid subclones was performed on an ABI 373A automated DNA sequencer with the Taq DyeDeoxy Terminator Cycle Sequencing Kit (ABI) as described by the manufacturer. Primers used to obtain the initial sequence from the ends of the inserts were pBluescript vector primers SK and KS, and the remainder of the inserts were sequenced by primer walking. DNA and protein sequences were analyzed with GeneWorks Version 2.3.1 software (IntelliGenetics, Inc.). Nucleotide and protein databanks were searched with the Blast computer program, PROSITE was used to search for motifs, and transcription factor binding sites were identified with the WWW Signal Scan service.

Northern blot. A mouse multiple tissue Northern blot was purchased from Clontech. Hybridizations were performed at 50°C according to the manufacturer's recommendations. Purified clone 3-1 cDNA insert was labeled by random priming (Megaprime, Amersham) and used as the probe.

Results

In order to identify the mouse emerin gene, we screened a mouse skeletal muscle cDNA library (Clontech), using the human emerin cDNA as a probe. In this screen, one strongly hybridizing clone (3-1) was isolated and sequenced and found to contain a sequence homologous to both human and rat emerin cDNAs (Bione et al. 1994; unpublished data, GenBank accession number X98377). Clone 3-1 was 1160 bp in length and had a large open reading frame (ORF) encoding a polypeptide of 257 amino acids; however, absence of a start codon and comparison of this sequence with both rat and human cDNAs suggested that this clone lacked the 5′-UTR and the first 5 bp of coding sequence. To generate a full-length cDNA, RT-PCR was subsequently performed on RNA isolated from mouse embryonic stem cells with primer R30, derived from mouse genomic sequence (described below) that had homology with the 5′ end of the rat cDNA, and primer 3R, derived from sequence corresponding to the 3′ end of cDNA clone 3-1 (Fig. 1).

Fig. 1. Nucleotide sequence of the mouse emerin gene. The exonic sequence is in capital letters. Primers R30 and 3R, used to amplify the emerin cDNA by RT-PCR, are indicated with arrows. ATG start, TAA stop, and the polyadenylation signal are in bold and underlined.

Sequence analysis of this RT-PCR product revealed a 1270-bp cDNA containing a 106-bp 5′-UTR (beginning within the interval that defines the start of transcription for both rat and human cDNAs), a single large ORF encoding a 259-amino acid polypeptide, and a 387-bp 3′-UTR. Within the coding region the mouse emerin nucleotide sequence was 95% and 78% identical to rat and human emerin genes, respectively. The high degree of sequence similarity between mouse and rat emerin was also observed within the untranslated regions of these cDNAs, with 92% identity in the 5′-UTR and 89% identity in the 3′-UTR. Mouse and human 5′- and 3′-UTRs were less similar, showing 43% and 40% identity, respectively.

Northern blot analysis with clone 3-1 as a probe revealed an mRNA expression pattern similar to human emerin, with an approximately 1.3-kb transcript present in all tissues analyzed, including skeletal muscle and heart (Fig. 2).
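The percent-identity figures quoted above are the kind of value that can be computed from a pairwise alignment. The following is a minimal sketch only: the function name `percent_identity`, the toy sequences, and the choice to exclude gap columns from the denominator are all assumptions made for illustration. The paper reports using GeneWorks software and does not state how gaps were treated, so this is not a reconstruction of the authors' calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both sequences match.

    Assumption: columns containing a gap ('-') in either sequence are
    skipped, so they count neither as matches nor as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: excluded from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, hypothetical sequences (not emerin): 9 ungapped columns, 8 matches.
toy_identity = percent_identity("ATGGAC-TAC", "ATGGATCTAC")
print(f"{toy_identity:.1f}% identical")
```

Counting gap columns as mismatches instead would lower the reported value, which is one reason identity figures from different programs are not always directly comparable.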
Genomic emerin clones were also isolated by screening a mouse 129 genomic P1 library with primers derived from clone 3-1. Southern blot hybridization with cDNA clone 3-1 as a probe localized the emerin gene to a 4.0-kb HindIII fragment in mouse 129 genomic DNA and in P1 clones (data not shown), and this fragment was subsequently subcloned into pBluescript (Stratagene) and sequenced (Fig. 1). In addition, X Chromosome (Chr) localization of P1 clones was confirmed by Southern blot hybridization of a P1 DNA fragment to mouse genomic DNA that showed the appropriate hybridization intensities in both male (that is, 1X) and female (that is, 2X) mouse DNAs (data not shown). Mouse cDNA (of strain BALB/c) and exonic genomic sequences (of strain 129) were identical except for a single base (G or A) in the wobble position of codon 11 that did not change the amino acid sequence. All splice sites contained the canonical GT and AG dinucleotides at the intron borders and matched consensus splice site sequences to varying extents.

The 3938 bp of genomic sequence showed that the mouse emerin gene spanned approximately 2900 bp and was organized into six small exons interrupted by five introns, similar to the structure of the 2100-bp human gene (Fig. 3A). The difference in size between these genes is primarily owing to the size of the fourth intron, which is 385 bp in human and 1001 bp in mouse. The coding regions of exons 1, 3, and 4 are identical in size in both human and mouse genes. The remaining exons differ in size by 3 bp (exons 2 and 5) or 15 bp (exon 6), and these differences account for the gaps seen in the mouse/human emerin amino acid alignment (Fig. 4).

The 5′ promoter region of the emerin gene displayed a number of potential transcription factor binding sites (WWW Signal Scan). When compared with the human emerin promoter region, both genes have one CAAT box and three cAMP responsive elements (CRE) in the first 200 bp of 5′ flanking sequence, and both genes lack TATA boxes.
Three CRE sites, with the consensus binding sequence of TGACG, were found within the first 100 bp of emerin 5′ flanking sequences, and two of these sites are separated by exactly 17 bp in both human and mouse sequences (Fig. 3B).

The nucleotide sequence of the mouse emerin cDNA predicts a polypeptide of 259 amino acids with a molecular weight of 29.4 kDa that is 93% and 73% identical (95% and 79% similar) to rat and human emerin proteins, respectively (Fig. 4). Both mouse and rat emerin proteins are slightly larger than human emerin, with four additional amino acids at the C-terminus of the protein. Consistent with the high degree of similarity between emerin homologs, mouse emerin is serine rich and shares regions of structural similarity with thymopoietin proteins, including the highly hydrophobic putative transmembrane domain at the C-terminus (Fig. 4). Compared with mouse thymopoietins, amino acids 3–44 of mouse emerin are 50% identical (64% similar) to amino acids 110–151 within the common N-terminal portion of all mouse thymopoietin isoforms, and amino acids 222–255 are 26% identical (44% similar) to a C-terminal region, encoded by exon 10, within the β-, γ-, δ-, and ε-isoforms (Berger et al. 1996).

Several potential protein phosphorylation sites were also identified after a search of the PROSITE databank. Three tyrosine kinase phosphorylation sites as well as five sites each for protein kinase C and casein kinase II were found to be conserved among all three emerin homologs (Fig. 4). Furthermore, the three most N-terminal phosphorylation sites predicted for emerin are also present in thymopoietins (Fig. 3). Two N-glycosylation and two N-myristoylation sites were also identified; however, when compared with human emerin, these sites were not conserved.

Fig. 2. Northern analysis of the mouse emerin gene. Clone 3-1 DNA was used to probe a multiple tissue Northern blot (Clontech) containing ∼2 μg each polyA+ RNA from various tissues.

Fig. 3. (A) Comparison of human and mouse emerin genes. The mouse gene is shown on top, and human emerin is shown below. Boxes are exons with the start (ATG) and stop (TAA/TAG) codons for translation shown. The numbers above and below indicate the sizes of each intron and exon, respectively. (B) Comparison of mouse and human emerin promoter regions. The 200 bp of 5′ flanking sequences for both mouse and human emerin genes are shown with the 3′ end of each sequence corresponding to a region that encompasses the start of transcription for each gene. Potential transcription factor binding sites are also indicated: CAAT boxes are underlined, and CREs are boxed.

Fig. 4. Amino acid alignment of mouse, rat, and human emerin proteins. Identical amino acids are boxed. The regions of similarity with thymopoietins/LAP2 are indicated by lines, and the hydrophobic domain is indicated by a dashed line. Conserved potential phosphorylation sites are also indicated: arrows, protein kinase C; asterisks, casein kinase II; and plus signs, tyrosine kinase.

Discussion

In this report we describe the complete sequence of the mouse emerin gene. Nucleotide sequence analysis revealed a high degree of similarity between mouse, rat, and human cDNAs. Exon/intron organization as well as potential transcription factor binding sites in the promoter regions of each gene were found to be highly conserved between mouse and human emerin. We uncovered three cAMP response elements (CREs) within the murine promoter and showed these to be conserved with three previously unnoticed CREs within the human promoter. Interestingly, an interval of 17 bp is conserved between two CREs of both species, as is the general location of the elements relative to the start of transcription.
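A search for CRE core sites of the kind described above can be sketched as a simple exact-match scan for the reported consensus TGACG on both strands. This is an illustration under stated assumptions, not the authors' method: the paper used the WWW Signal Scan service, whereas the helper names `reverse_complement` and `find_motif`, the toy promoter string, and the convention of measuring spacing from the end of one site to the start of the next are all hypothetical choices made here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "TGACG"):
    """List (start, strand) for exact motif matches on either strand.

    Starts are 0-based positions on the forward strand; '-' marks a site
    whose motif reads on the reverse strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == motif_rc:
            hits.append((i, "-"))
    return hits


# Toy, hypothetical promoter fragment (not the emerin sequence): two
# forward-strand sites separated by a 17-bp spacer, plus one reverse site.
toy_promoter = "AA" + "TGACG" + "ACCTTAGGCATTAGCAA" + "TGACG" + "TT" + "CGTCA" + "GG"
hits = find_motif(toy_promoter)
spacer = hits[1][0] - (hits[0][0] + len("TGACG"))
print(hits, spacer)
```

An exact-match scan like this will miss degenerate sites and will report chance occurrences of a 5-bp motif, so conservation of position and spacing between species, as noted in the text, carries more weight than a single hit.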
Although we confirmed by Northern analysis that the mouse emerin gene is widely expressed throughout the body, the potential now exists for cyclic AMP (cAMP) modulation of emerin expression (Faisst and Meyer 1992; Borrelli et al. 1992).

Mouse emerin encodes a highly conserved, serine-rich 259-amino acid protein that shows structural similarity with thymopoietin/LAP2. The rat homolog of β-thymopoietin, LAP2, is an integral nuclear membrane protein that binds directly to both lamin B1 and chromosomes in a mitotic phosphorylation-regulated manner (Furukawa et al. 1995; Foisner and Gerace 1993). Localization studies of LAP2 deletion mutants have provided evidence that the hydrophobic C-terminus of LAP2 is a transmembrane-spanning domain that localizes the hydrophilic N-terminal portion of the protein to the nucleoplasm (Furukawa et al. 1995). Emerin is a primarily hydrophilic protein with a hydrophobic C-terminal domain, similar to that of LAP2. Moreover, immunofluorescence microscopy showed that emerin also localizes to the nuclear membrane (Manilal et al. 1996; Nagano et al. 1996). Taken together, these data suggest a similar association of LAP2 and emerin with the nuclear membrane.

The presence of several potential phosphorylation sites also supports a role for emerin as a phosphoprotein. This role is consistent with both the documented phosphorylation of LAP2 and the 34-kDa mass of human emerin protein in Western blots, as opposed to the predicted 29-kDa protein, a discrepancy that could result from emerin phosphorylation (Manilal et al. 1996; Nagano et al. 1996). Indeed, three predicted phosphorylation sites present in the N-terminal region of emerin are conserved and also found in LAP2, suggesting that the N-terminal domains of these proteins also share a common function.

Unlike dystrophin and the dystrophin-related glycoproteins, emerin appears not to be a cytoskeletal protein (Ozawa et al. 1995).
Rather, emerin is associated with the nuclear membrane and shares attributes with LAP2, which appears to link lamin B1 and mitotic chromosomes (Foisner and Gerace 1993). If emerin functions similarly, a novel mechanism of neuromuscular disease will be uncovered. Toward this end, the complete murine emerin gene sequence is described here. The conservation between the mouse and human emerin proteins and the precise conservation of key regions of LAP2 similarity support a LAP2-like function for emerin. Further, the characterization of the murine promoter region of emerin identified three CREs, suggesting the potential for cAMP modulation of emerin expression. Moreover, the characterization of the complete emerin gene from the mouse strain 129 will lead to the development of a mouse emerin knock-out, a potential animal model for EMD that should greatly assist further study into EMD and emerin function.

Acknowledgments. We thank Lisa Lakkis for review of the manuscript. This work was supported, in part, by a grant from the Muscular Dystrophy Association. S.T. Warren is an investigator, and K. Small is an associate of the Howard Hughes Medical Institute.

References

Berger R, Theodor L, Shoham J, Gokkel E, Brok-Simoni F, Avraham KB, Copeland NG, Jenkins NA, Rechavi G, Simon AJ (1996) The characterization and localization of the mouse thymopoietin/lamina-associated polypeptide 2 gene and its alternatively spliced products. Genome Res 6, 361–370

Bione S, Maestrini E, Rivella S, Mancini M, Regis S, Romeo G, Toniolo D (1994) Identification of a novel X-linked gene responsible for Emery-Dreifuss muscular dystrophy. Nature Genet 8, 323–327

Bione S, Small K, Aksmanovic VMA, D'Urso M, Ciccodicola A, Merlini L, Morandi L, Kress W, Yates JRW, Warren ST, Toniolo D (1995) Identification of new mutations in the Emery-Dreifuss muscular dystrophy gene and evidence for genetic heterogeneity of the disease. Hum Mol Genet 4, 1859–1863

Borrelli E, Montmayeur JP, Foulkes NS, Paulo SC (1992) Signal transduction and gene control: the cAMP pathway. Crit Rev Oncog 3, 321–338

Consalez GG, Thomas NST, Stayton C, Knight SJL, Johnson M, Hopkins LC, Harper PS, Elsas LJ, Warren ST (1991) Assignment of Emery-Dreifuss muscular dystrophy to the distal region of Xq28: the results of a collaborative study. Am J Hum Genet 48, 468–480

Faisst S, Meyer S (1992) Compilation of vertebrate-encoded transcription factors. Nucleic Acids Res 20, 3–26

Foisner R, Gerace L (1993) Integral membrane proteins of the nuclear envelope interact with lamins and chromosomes, and binding is modulated by mitotic phosphorylation. Cell 73, 1267–1279

Furukawa K, Pante N, Aebi U, Gerace L (1995) Cloning of a cDNA for lamina-associated polypeptide 2 (LAP2) and identification of regions that specify targeting to the nuclear envelope. EMBO J 14, 1626–1636

Harris CA, Andryuk PJ, Cline S, Chan HK, Natarajan A, Siekierka JJ, Goldstein G (1994) Three distinct human thymopoietins are derived from alternatively spliced mRNAs. Proc Natl Acad Sci USA 91, 6283–6287

Hopkins LC, Warren ST (1993) In Handbook of Clinical Neurobiology, LP Rowland, S DiMauro (eds) (Amsterdam: Elsevier Publishers) pp 145–160

Klauck SM, Wilgenbus P, Yates JRW, Muller C, Poustka A (1995) Identification of novel mutations in three families with Emery-Dreifuss muscular dystrophy. Hum Mol Genet 4, 1853–1857

Manilal S, thi Man N, Sewry CA, Morris GE (1996) The Emery-Dreifuss muscular dystrophy protein, emerin, is a nuclear membrane protein. Hum Mol Genet 5, 801–808

McKusick VA (1994) Mendelian Inheritance in Man (Baltimore, Md.: Johns Hopkins University Press)

Nagano A, Ritsuko K, Ogawa M, Kurano Y, Kawada J, Okada R, Hayashi YK, Tsukahara T, Arahata K (1996) Emerin deficiency at the nuclear membrane in patients with Emery-Dreifuss muscular dystrophy. Nature Genet 12, 254–259

Nigro V, Bruni P, Ciccodicola A, Politano L, Nigro G, Puliso G, Cappa V, Covone AE, Romeo G, D'Urso M (1995) SSCP detection of novel mutations in patients with Emery-Dreifuss muscular dystrophy: definition of a small C-terminal region required for emerin function. Hum Mol Genet 4, 2003–2004

Ozawa E, Yoshida M, Suzuki A, Mizuno Y, Hagiwara Y, Noguchi S (1995) Dystrophin-associated proteins in muscular dystrophy. Hum Mol Genet 4, 1711–1716

Price DK, Zhang F, Ashley CT, Warren ST (1996) The chicken FMR1 gene is highly conserved with a 5′-untranslated repeat and encodes an RNA-binding protein. Genomics 31, 3–12