* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Structure, Expression and Duplication of Genes Which Encode
Frameshift mutation wikipedia , lookup
Copy-number variation wikipedia , lookup
Gene therapy wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
Genetic engineering wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Expanded genetic code wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Non-coding RNA wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Long non-coding RNA wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
Biology and consumer behaviour wikipedia , lookup
Gene nomenclature wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
Ridge (biology) wikipedia , lookup
Gene desert wikipedia , lookup
Gene expression programming wikipedia , lookup
Minimal genome wikipedia , lookup
Genomic imprinting wikipedia , lookup
Transposable element wikipedia , lookup
Human genome wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Genome (book) wikipedia , lookup
Genetic code wikipedia , lookup
Pathogenomics wikipedia , lookup
Non-coding DNA wikipedia , lookup
Epigenetics of human development wikipedia , lookup
History of genetic engineering wikipedia , lookup
Metagenomics wikipedia , lookup
Genomic library wikipedia , lookup
Gene expression profiling wikipedia , lookup
Point mutation wikipedia , lookup
Genome editing wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Genome evolution wikipedia , lookup
Designer baby wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Primary transcript wikipedia , lookup
Microevolution wikipedia , lookup
Copyright 0 1994 by the Genetics Society of America Structure, Expression and Duplication of Genes Which Encode Phosphoglyceromutase of Drosophila melanogaster Peter D. Currie' and David T.Sullivan2 Department of Biology, Syracuse University, Syracuse, New York 13244-1270 Manuscript received February 5, 1994 Accepted for publication June 8, 1994 ABSTRACT We report here the isolation and characterization of genes from Drosophila that encode the glycolytic enzyme phosphoglyceromutase (PGLYM). Two genomic regions have been isolated that have potential to encode PGLYM. Their cytogenetic localizations have been determined by in situ hybridization to salivary gland chromosomes. One gene,Pglym78, is found at 78A/B and the other, Pglym87, at 87B4,5 of the Drosophila polytene map. Pglym78 transcription follows a developmental pattern similarto other glycolytic genes in Drosophila,Le., substantial maternal transcript deposited during oogenesis; a decline in abundancein the first half-of embryogenesis; a subsequent increase in the second half of embryogenesis which continues throughoutlarval life; a decline in pupae and a second increase to a plateau in adults. This transcript has been mapped by cDNA and genomic sequence comparison, RNase protection, and primer extension.Using similar analyses transcripts of Pglym87 could not be detected.Pglym 78 has two introns which interrupt the coding region, while the Pglym87 gene lacks introns. This and other features Pglym87. The apparent support a model of retrotransposition mediated gene duplication for the oforigin absence of a complete, intact coding frame and transcript suggest Pglym87 thatis a pseudogene.However, retention of reading frame and codon bias suggests that Pglym87 may retain coding function, may or have been inactivated recently, substantially after the time of duplication, or that the molecular evolution of Pglym87 is unusual. Similaritiesof the unusual molecular evolution ofPglym87 and other proposed pseudogenes are discussed. A S part of an ongoing study to characterizethe control of expression of glycolyticgenes of Drosophila melanogaster, a significant subset of these genes has been isolated and characterized (ROSEUI-REHFUS et al., 1992; SHAW-LEE et al. 1991, 1992; VON KALM et al. 1989; Suurvm et al. 1985). This study details the molecular characterization of an additional member of this set of genes, the gene that encodes phosphoglyceromutase (PGLYM). PGLYM catalyzes the interconversion of 2-phosphoglycerate and %phosphoglycerate. PGLYM from insects has not been well characterized. To our knowledge, there have been no reports concerningthis enzyme or its gene from Drosophila. In mammals, two genes encode two isozyme subunits, each subunit is 30 kD. The enzyme is a dimer so three isoforms of PGLYM may be found in mammalian tissues. There is a muscle specific subunit, M, and a non-muscle-specific,or brain, subunit B, which is synthesized in many tissues (OMENN and CHEUNG 1974). The homodimer, MM form, is found mainly in mature muscle; the BB form, is found mainly in liver, kidney and brain; and the heterodimer MB form, alongwith varyingproportions of the MM and BB forms, is found mainly in the heart (OMENN CHEUNC and 1974). In addition, PGLYM in humans is developmentally regulated. Early in development fetal muscle con- ' Present address: ICRF Developmental Biology Unit. Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, U.K. * To whom correspondence should be addressed. Genetics 138 353-363 (October, 1994) tains mainly the nonspecific or brain form. At approximately 80-100 days of gestation, the muscle specific form first appearsand becomes the predominant form thereafter (OMENN and CHEUNG 1974; MIRANDAet a2. 1988). Genes which encode human, rat, rabbit and mouse isoforms have been characterized and comparisons reveal a substantial conservation in nucleotide and amino acid sequence (URENA et al. 1992; CASTELLA-ESCOLA et al. 1990; Y ~ A G A WetA al. 1986; LEBOULCH et al. 1988). In humans there is a single copy of Pglym-m per haploid genome. Thisis in contrast to a large family ofPglym-b genes, many of which have been demonstrated to be pseudogenes (CASTELLA-ESCOLA et al. 1990). It has been proposed that multiple functional copies of Pglym-b once existed within the Pglym-b gene family and one evolved to form Pglym-m. This hypothesis is supported by phylogenetic comparisons which indicate that Pglym-m has diverged from this multigene family relaet al. 1988; LEBOULCH et al. tively recently (SAKODA 1988). Transcriptional regulationof muscle specific Pglym-m has been investigated in some detail. Promoter deletion studies have defined a single binding site in the Pglym-m promoter for the MEF-2 protein which is involved in tissue and temporal expression in muscle (NAKATSUJI et al. 1992). Band shift assays using promoter fragments demonstrated that protein binding tothis site was only P. 354 Currie and D. T. Sullivan detected using nuclear proteins derived from muscle Sephadex G50 column by a method specified by the manufacturer (Boehringer Mannheim). extracts. Thus, the mammalian muscle specific isoform Screening of bacteriophage libraries:Genomic clones were is thought to be under direct control of genes stimu-that isolated from an adult Oregon-R D.melanogasterstrain library late muscle differentiation (NAKATSUJI et al. 1992). constructed in the A vector EMBL4 (SHAW-LEE et al, 1991). Saccharomyces cerevisiae PGLYM has also been well cDNA clones were isolated from an adult Oregon-R strain D. characterized. Both the nucleotide and amino acid semelanogasterlibraryconstructed in the vector Agtll (VON KALM quence have been independently determined (HEINISCH et al. 1989). Then 20,000 pfu of the respective libraries were plated with 400 pl of an overnight culture of the appropriate et al. 1991; FOTHERGILL-GILMORE and WATSON 1989).The host strain per large NZCYM plate, and plates were allowed to yeast enzyme is composed of four identical 27-kD s u b grow to near confluence. Plates were incubated at 4" for 1 hr units encoded by the G p m l gene. The crystal structure to harden top agarose. The libraries were then screened acof the enzyme has been solved (FOTHERGILL-GILMORE and cording to the method of BENTON and DAWS(1977). Southern and Northern blot analysis: Southern blot transWATSON 1989). One important feature learned from fer was carried out following the method of SOUTHERN (1975) these studies has been the demonstration that two hiswith the adaptations described by the membrane manufactidine residues are involved in catalysis at the active site turer (Schleicher 8c Schuell).Northern blot analysiswas of this enzyme. The essential role of these histidines for et al. (1987) using Genescreen memadapted from LISSEMORE catalytic activityhas been demonstrated further by sitebranes according to the manufacturers instructions (New England Nuclear). Modifications included soaking the electrodirected mutagenesis (WHITEand FOTHERGILL-GILMORE phoresis rig, tray, comb and peristaltic pump recirculation 1992). tubing in 0.1% diethyl pyrocarbonate overnight before use. The biochemical properties of PGLYM from the bacWe used 200 pl/cm of prehybridization buffer (50% deionized teria, Streptomyces coelicolor, the only prokaryote from formamide, 5 X SSC, 5 X Denhardt's solution, 100 pg/ml which the PGLYM encoding genehas been cloned and sonicated denatured salmon sperm DNA, 40mM sodium phosphate, pH 6.8, 0.5% bovine serum albumin, 1% sodium dccharacterized, indicate that this enzyme has properties decyl sulfate (SDS),and 10%dextran sulfate) and 100 pl/cm2 similar to the yeast PGLYM (WHITEet al. 1992). The of hybridization buffer (50% deionized formamide, 5 X SSC, subunit molecular mass is 28 kD and the native mass 100 pg/ml sonicated denatured salmon sperm DNA, 40 mM determined by gel filtration was estimated at 120 kD. It sodium phosphate, pH 6.8, and 10% dextran sulfate). Prehywas assumed therefore that theS. coelicolor enzyme is a bridization was for 4-6 hr at42" and hybridization was at 42" for 12-16 hr. tetramer, similar in form to the yeast enzyme. The bacRNase protection: One microgram of linearized SK+ plasterial enzymepossesses the greatest amino acid semid DNA was used as a template for transcript synthesis using quence similarity to the yeast enzymebut also has a high T3 or T7 polymerase and [cx-~~PJCTP (NEN) utilizing the Rilevel of amino acid similarity to mammalian PGLYM, bobasica I1 transcription kit and a method supplied by the reinforcing the view that glycolytic genes have been manufacturer (IBI). Two micrograms of IBI RQ1 DNase I were added to the labeled transcript and incubated at 37" for 30 highly conserved during evolution (WHITE et al. 1992). min. Twenty microliters of 50 mM EDTA were added, and the Results reported here provide the initial characterizatranscriptwas purified over a Sephadex G50 RNA spin column tion of an insect Pglym gene. In D. melanogaster it ap(Boehringer Mannheim). Labeled transcript (20-50,000 pears there is one functionalgene and a second cpm) was added to an appropriate amount of total RNA that sequence which may be a pseudogene. had been dried under vacuum and resuspended in 30 pl of MATERIALSANDMETHODS DNA isolation and preparation: Plasmid DNA, propagated in Escherichia coli strain DH5a, was isolated in smallquantities by the method of MORELLE(1988). Large scale preparation of plasmid DNA wasperformed by as described in MANIATIS et al. (1989). Single-stranded DNAfrom M13mp18 or M13mp19 sequencing vectors, large scale bacteriophage A DNA using a plate lysate method or lysis in liquid culture were isolated using methods described in MANIATIS et al. (1989). Drosophila genomic DNA was prepared in small scale via the method of LISet al. (1983). Large scale isolation of Drosophila genomic DNA was performed using a modified method of R. LIFTON et al. 1983). (BENDER Isolation and radiolabeling of DNA fragments: Agarose gel slices containing DNA restriction endonuclease fragments were visualized and excised from ethidium bromide containing 1% agarose gels and electroeluted according to the dialysis tube method of MANUTIS et al. (1989). A125-500-ng sample of isolated restriction fragment DNA was random primed utilizing a randompriming labeling kit as specified by the manufacturer (Boehringer Mannheim).Unincorporated nucleotides were separated from labeled DNAvia passage over a hybridization buffer (80% formamide, 40 mM PIPES, pH 6.4, 0.4 M NaCl, 1 mM EDTA) and heated to 85" for 10 min. The sample tubes were then kept at 55" overnight, 300 pl of digestion buffer (0.1 M Tris-Cl pH 8.0, 0.5 M EDTA, 0.3 M NaCl, 40 pg/ml heat-treated RNase A (Sigma) and 2.87 units/ml RNase T1 (BRL)) were added, and the reaction mixture was incubated at 30" for 30 min. Twenty microliters of 10% SDS and 2.5 pl of 20mg/ml proteinase Kwere then added,and the reaction mixture was incubated at37" for 30 min. Extractions were performed with the addition of an equal volume of phenol and 0.5 pl of 10 mg/ml yeast tRNA was added to the extracted aqueous phase. Reaction productswere precipitated by the addition of an equal volume of ethanol, rinsed with 70% ethanol and the pellets dried under vacuum. Pellets were resuspended in loading dye (U. S. Biochemical Corp.) and heated to 85" for 5 min. Protection products were visualized following electrophoresis on a 1 X TBE, 6% acrylamide gelby standard autoradiographic techniques. Primer extension: Primer extension was performed essentially as described by MANIATIS et al. (1989) with the following modifications: 5-30 pg of poly(A+)RNA wereused depending on the stage and primer used in each individual experiment; 105-106 cpm of the labeled primer were added to the specified amount at poly(A+)RNA; annealing temperatureswere either Drosophila Pglym Genes at 25" or at 30" as described. Primers were extended using 40 units of avian myeloblastosis virus reverse transcriptase (Life Sciences, St. Petersburg, Florida). Extension products were separated on 6% acrylamide 3/4-3 X TBE buffer gradient gels (BICCINet al. 1983) and visualized by standard autoradiographic techniques. PUriFcation of synthetic oligonucleotide primers: Oligonucleotides designed for DNA sequencing were synthesized (Syracuse University DNA and protein core facility) and purified via passage overa NENSORB PREPcartridge (DuPont). Sequencing strategies:To sequence genomic regions spanning each gene, subclones were isolated that contained regions of bacteriophage clones that were homologous to the cDNA clone used to isolatethem. The sequencing strategy for these subclones involved sequencing from either end and using oligonucleotide primers to bridge these sequences. Isolated cDNA clones were subcloned either into pBluescript I1 SK+ or mp18/19 by subcloning liberated EcoRI fragments of hgtl 1 inserts or performing BamHI, Hind111 directional subcloning. Genomic fragments were cloned into the appropriate polyiinker sites of pBluescript I1 SK'. Single-stranded DNA was sequenced by the dideoxy chain termination method incorporating [a-35S]dATP (SANCER et al. 1977; HONC1982) and visualized on 6% acrylamide 3/4-3 X TBE buffered gradient gels ( BICGINet al. 1983). Sequencing of double stranded plasmid templates utilized the following protocol. Two micrograms of plasmid DNAin 20 pl ofdH,O weredenatured by the addition of 2 pl of 2 M NaOH, 2 mM EDTA (pH 8.0), and incubation was at 85" for 5 min. The mixture was neutralized by the addition of 3 pl of 3 M sodium acetate (pH 5.2) and 7 pl of dH,O. Absolute ethanol (75 pl) was added to the DNA and placed at -70" for 5 min. DNA was recovered by centrifugation for 10 min at 4". The DNA pellet was washed in 200 pl of 70% ethanol and dried under vacuum. The pellet was resuspended in 7 pl of dH,O and annealed to a primer at 37" for 30 min. All sequencing utilized the Sequenaseversion 2.0DNA sequencing kit as per the manufacturer's direction (U. S. Biochemical Corp.). Autoradiograms of sequencing gelswere read and sequences determined using a digitizer and computer programs from DNASTAR (Madison, Wisconsin). Both Pglym sequences have been submitted to the GenBank data base. Pglym78 has been assignedaccession no. L27654. Pglym87 has been assigned accession no. L27656. In situ hybridization to polytene chromosomes: Polytene chromosomes were dissected from late 3rd instar larvae of Oregon-R D . melanogaster that had been raised at 18" on banana media. Dissection of salivary glands, separation and fixation and pretreatment of polytene chromosomes was performed essentiallyas described in ENCELSet al. (1986). Chromosomes were examined under phase contrast optics and the best chromosomes selected for hybridization. From 250 ng to 1 pg of unrestricted bacteriophage DNA containing the genomic DNAof interest werebiotinylatedutilizing Bio-1 1-dUTP (ENZO diagnostics) ina random priming reaction (Boehnnger Mannheim), and unincorporated nucleotides were separated by passage overa Sephadex G50 spin column (Boehringer Mannheim). Twenty micrograms of salmon sperm DNAwere added and precipitated by the addition of 2.5 volumes of ethanol. The probe was recovered by centrifugation, resuspended in 75 p1 of hybridization buffer (0.6 M NaCl, 50 mM sodium phosphate buffer, pH 6.8, 1 X Denhardt's solution, 5 mM MgCl,), denatured by boiling for 3 min, and chilled on ice. The probe solution was added to the slides, sealed with a coverslip, and incubated in a moist chamber at 58" for 12-18 hr. The slides were washed three times for 15min each in 2 X SSC, and the signal was detected using the Detek I-hrp kit as specified by the manufacturer (ENZO Diagnostics). 355 Assays of PGLYM activiq The assays for PGLYM activity was modified from the methods of GMSOLIA and CARRERAS (1975).This assay measures formation of phosphoenolpyruvic acid (PEP) by increase in optical density at 240 nm by rate limiting amounts of mutase in an excess of enolase. One gram of adult flies was ground in 10 ml of 50 mM Tris, pH 7.4, 2% of saturation with respect to phenylthiourea, using a motorized mortar and pestle. The homogenate was spun at 10,000 rpm for 15 min at 4". The supernatant was recovered by filtration through small gauge nylon mesh. Fifty microliters of homogenate were added to 3 pl of 50 mM %phosphoglyceric acid (PGA), trisodium salt (Sigma), 10 pmol ofMgSO,, 100 pmol of Tris, pH 7.0, and 10 units of enolase in a I-cm light path quartz cell. The rate of increase of absorbance at 240 nm was measured spectrophotometricallyusing a Gilford 240 recording spectrophotometer, and the amount of enzyme activity in each sample was determined utilizing a molar extinction coefficient of 1.75 X lo3 for PEP and the fact that 1.5 mmol of 2-PGA formed corresponds to 1 mmol of PEP measured. Protein concentrations were determined using the BioRad microassay with bovine serum albumin as a standard, utilizing the reagent and instructions supplied by the manufacturer. Lethal mutations in the 87B region were obtained from the Umea stockcenter, Sweden,and fromJ. GAUSZ at the Hungarian Academy of Sciences. RESULTS Isolation of the D . melanogaster PGLYM encoding genes: To isolate D . melanogaster Pglym, a full length cDNA which encodes the human brain PGLYM isoform (TSUJIMO et al. 1989) was used to screen a Drosophila cDNA library prepared using adult mRNA and the vector, hgtl1 (VON KALM et aZ. 1989).Five cDNAclones were isolated. All contained anidentically sized insert of 1 kb. The clones were sequenced and found to contain a PGLYM encoding open reading frame by comparison to the human PGLYM amino acid sequence. ADrosophila cDNA clone was used to probe a Southernblot of Drosophila genomic DNA. This analysis (Figure 1) revealed multiple regions within the Drosophila genome having various intensities of hybridization. This suggests the likelihood that more than one gene may be present in the genome. A cDNA clone was used to screen a Drosophila genomic library and threegenomic clones were isolated. Partial restriction maps of the inserts in these phage were assembled and identify two non-overlapping clones (Figure 2). By comparison to the genomic Southern blot, each clonecould be characterized as containing restriction fragments relating toone or the other of the hybridizing regions. This confirmed existence of two PgZym regions within the Drosophila genome. The nucleotide sequenceof each was determined according to the strategy indicated in Figure 2. The cytogenetic localization of each gene on the Drosophila polytene map was determined by in situhybridization to D . melanogasterpolytene chromosomes. Using the entireinsert of each lambda clone as a probe for hybidizationn to polytene salivary gland chromosomes, a single site for each was identified (data not shown). One gene is located at87B4,5 on the rightarm of chromosome 3 and 356 P. D. Currie and D. T. Sullivan FIGURE I.-Cenomic Southern restrictedwithvarious enzymes and the resultant blot probed with the isolated Pglym cDNA. Lane 1, EcoRI; lane 2, BamHI; lane 3, HindIII; lane 4, SalI; lane 5, BglII; lane 6, PsA; and lane 7, SphI. The arrows represent positions of DNA standards of 12,5,4,3,2 and 1 kb, respectively. the other atregion 78 A/B on the left arm of chromosome 3. Thesegenes will be called Pglym87 and Pglym 78, respectively. Sequencing of Pglym78 The PGLYM encoding cDNA used to isolate the genomic clones was demonstrated, by sequencing, to correspond to thetranscribed region of Pglym 78. The sequence of 2290 nucleotides of the Pglym 78 genomic region beginning 632 nucleotides upstream of the translation start ATG and extending 381 nucleotides beyond the translation stop codon and the conceptual translation of the coding region is shown in Figure 3. Pglym78 contains two introns of 469 and 58 nucleotides, respectively, and three exons of 102, 498 and 166 nucleotides. The promoter region of Pglym78 does not contain a sequence which is an identifiable TATA box sequence. Exon one, as defined by the 5' extent of the cDNA, wasconfirmed by primer extension and RNase protection assays (see below). Nucleotide sequencing of the 3' end of the cDNA clone revealed that polyadenylation site occurs 20 bp downstream from the sequence, AATAAA, which matches the consensus polyadenylation signal. Transcript analysis of Pglym78 To confirm the transcription unit suggested by the comparison of cDNAand genomic sequences, RNase protection and primer extension analyses were performed on the Pglym 78 transcript (Figure 4). The5' end of the Pglym78 transcript was defined using RNase protection and transcripts from the 1.3-kb BglII subclone. The transcripts pro- FIGURE2.-Sequencing strategy of the isolated Pglym regions from the genomic A clones. Stipled bars represent the BglII bands that hybridized ingenomic Southern blot analysis (see Figure 1, lane 5). Enzymes used to construct the maps were, E, EcoRI; Bg, BglII;B, BamHI. Sequencing from restriction sites is denoted with a plain arrow while the oligonucleotide primers used to bridge these sequences are shown as +m. Sequencing of the cDNA and genomic sequence covered both strands of each gene. tected by this probe confirmed the boundaries of intron 1 and revealed possible multiple transcription start sites. This transcript protected four RNA fragments, 240,188, 180 and 170 nucleotides long, respectively (Figure 4, panel A). The 240-bp fragment represents that portion of the Pglym transcript that extends from the 3' end of the probe to the 5' end of exon two. The remaining bands presumably represent transcript fragments extending from the 3' end of exon one to multiple transcription start sites at thebeginning of the Pglym 78 transcript. The 180-bp band corresponds insize tothe putative start site defined by the 5' end of the cDNA. To confirm that themultiple bands protected by transcripts from the 1.3kb subclone represent multiple start sites, the probe was shortened by restriction at a PvuII site. This probe extends 427 nucleotides 5' to the ATG and protects bands identical in sizeto those protected by the full length probe. Since the 240 nucleotide fragment is known to derive from exon two, the other threemust be derived from the region between the PvuII site and the 3' end of exon one. This result requires that these fragments overlap, have a common 3' end andthat theirsize differences is at their 5' ends,(Figure 4, panel B) . This is most simplyinterpreted as multiple transcription start sites. This interpretation is confirmed by primer extension analysis of the Pglym 78 transcript (Figure 4). A 22base primer correspondingto nucleotide positions 7-28 3' to the ATG was annealed to 5 pg of poly(A+) RNA Drosophila Pglym Genes 357 ATACAGCCGACTCGATTGAAAACAGCACGGCGCGATTCTCGTCGATCACGTCGTCCAAGTCGGGCCTGCGGACAAAGGAGGTTTCGAAATTACTTGATAA-~OO AAACCAGTTTCGGGGCGTGGAATCCTTTGTCTCACTTCACAAAGACGTTCTTCGTCGGACAAGCGGGTGGCCACCAGTTGGTCCAGCCCGTTGCCCGACA-200 GCTGGGGCCAAGTGCAGCGACGCGGTGACCTTTTCGCACATGTTTAAGTGACAAAAAGTTGGAATTTATGGGCGCAAAAACAGCCAAATTTAAAACACGA-300 ATATAGTGATGACAAACGCGAGCGATAACATCTGCGCGGGGGATCCAACCTGTTGCACACGACAAACTAATTGGACATTTGAAGGGGTGTAGCTTGACGC-400 GCTATTCCAATAAGACCATACTCCGGAAATCCAATCAGAGAAGGCACCGCTCTGCGCACTACGTTTTTCCAACAGAACGGCAATATCGCGGCCTGACAGC-500 * * * T G C G A C C T T T G G A C G G C T C C C T T G T A G G T C A A A C T T C A ~ G T C G C ~ T - 6 o Q G C A G C WATGGGCGGCAAGTACAAGATTGTGATGGTGCGCCACGGCGAG~CCGAGTGGAACCAGGAGAATCAGTTCTGCGGC-700 M G G K Y K I V M V R H G E S E W N Q E N Q F C G TGGTACGACGCCAACCTCAGCGAAAAGG~GAGTGTTACATCAATTTGAAATTCAAACACATATGTGGATTAAAATCTGATGTATGTGTGTACATCTAGT-~OO W y D A N L S E K G------------------------------------------------------------------------TCCAATCGATGATCCTTTGCTTTGATCAGTTTCACAATAGTTTTTCATTTCATTTACCCAAATTAATCTTAATTGACCTACAACAACCTGCAACATTATC-900 - .................................................... TCATTCAAACTTTTCTGTGATAAAAATAGGTCTCTACAAACAAAAGCTTTCCAGTGCAATTTTCAGAAGCATCAAAATTGTAGTTTCTTACACTTTAAAG-1000 " " " " " " " " " " " " " " " " " " " " " " " " ~ " " " " " " " " " " " " " " " " " " " " ~ " " " " " ~ TGCAAAAATTTGCTAACTCATATGTTTCTAAAGTTCAGATTATGCTTATGAGCATAATTTGATAGGTTATCGCCTTTATTTAACCCTAGTAATCAAATTA 1 1 0 0 """"""""""""~""""~"""""""""""""~"""""""""""""""~""" AAATTAAACGCATGCCTTTCAATTAGATTAAATTGGCTTTAAGTTCGTTACCCAACACGGCGTATACTTAATGTTCGCTAAACATTTGCCCAAAC~GTC-1200 Q " ~ ~ " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " ~ AGGAGGAGGCTCTAGCCGCGCGCAAGGCCGTCAAGGATGCCGGCCTGGAGTTCGATGTGGCCCACACCTCGGTGCTAACCC~CGCCCAGGTGACGCTGGC-1300 E E A L A A R K A V K D A G L E F D V A H T S V L T R A Q V T L A CAGCATCCTGAAGCCAGTGGCCACAAGGAGATCCCCAATCCAGAAGACCTGGCGCCTGAACGAGCGCCACTACGGTGGACTCACTGG~CTGAACAAGGCC-1400 S I L K P V A T R R S P I Q K T W R L N E R H Y G G L T G L N K A GAGACCGCCGCCAAGTACGGCGAGGCCCAGGTGCAGATCTGGCGTCGCAGCTTCGACACCCCGCCACCACCGATGGAGCCGGGCCATCCGTACTACGAGA-1500 E T A A K Y G E A Q V Q I W R R S F D T P P P P M E P G H P Y Y E ACATCGTCAAGGATCCCCGCTACGCCGAGGGTCCCAAGCCCGAGGAGTTCCCCCAGTTCGAGTCCCTCAAGCTGACCATCGAGCGCACACTGCCCTACTG-1600 I V K D P R Y A E G P K P E E F P Q F E S L K L T I E R T L P Y W GAACGACGTCATCATTCCCCAGATGAAGGAGGGCAAGCGCATCCTGATCGCTGCCCACGGCAACAGCCTCCGTGGCATCGTCAAGCATTTAGACA~GAG-1 7 0 0 N D V I I P Q M K E G K R I L I A A H G N S L R G I V K H L D N BOO CATGGCAGATATATCTAACATATCATCTTTTTACCATATCCTTGTCAATC~ACCTTTCTGAGGACGCCATCATGGCCCTAAATCTGCCCACGGGCATTC-1 L S E " " " " " " " ~ " " " " " " " " " " " " " " " " " " " ~ D A I M A L N L P T G I P 900 CGTTCGTCTACGAGCTGGACGAGAACTTCAAGCCGGTGGTGTCCATGCAGTTCCTTGGCGACGAGGAGACCGTGAAGAAGGCCATCGAGGCTGTGGCCGC-1 F V Y E L D E N F K P V V S M Q F L G D E E T V K K A I E A V A A CCAGGGCAAGGCCAAG~AGCACGCAGGCCACACCTCATGAGAATCCACTCAAATACGGCACAAGTCAGAGCATTCTGATTACGCTTCTGTTCAAGATGT-20~0 - Q G K A K AGCCA~CGTTGTTGTTGTTCGCGTAGCAGTAATTGGTTCATCCCCACTACAGT~AATAAATAATTTATTATTACACTCAAGAGGACAGGCGGA-2100 GGAGGCTCATTTATTGTCGGCTACTTCTTCTTATTCTTACTACGTCTTGGCATTCAGGCCAGGATGCACTTGGCGCACCTTGTACGTTCGTGAGAGCGCG-2200 TGTGCACAATCGCACCGTGTCCTGATTGCAGTGCTCTCGTGCTCGTGGTGCAGCTTCACACACAGTGTCTGGCGCGCGCTCCGTGCTAGCC -2300 FIGURE 3,"Seqvence and conceptual translationof Pglym78. Nucleotides are numbered with reference to thefirst nucleotide of the sequenced region and the introns are indicated via a dashed line which breaks the derived amino acid sequence. Splice site donor and acceptorsequences are in underlined type. Multiple transcription start sites denoted are by asterisks. Untranslated underlined.The translation start codon, stop codon and the polyadenylation consensus signal is double regionsof the transcript are underlined type. Pglym78 has accession no. L27654 in the GenBank database. from embryos and larvae. Three major extension products of 96, 106 and 114 nucleotides were detected. The 5' endsof these extension products corresponddirectly to the positions of transcript start sites predicted by the RNase protection studies. The 3' end of the Pglym78 gene was analyzed by RNase protection experiments using transcripts from the 3.0-kb BglII subclone used in sequencing of the 3' half of this gene. RNase protection using a transcript of the entire 3.0-kb insert protects three products. These are 334,259 and246 nucleotides long, respectively. The 334nucleotide fragment corresponds in size to a fragment which extends from the 5' end of exon 3 to the polyadenylation site. The 259-nucleotide band defines a fragment from the 5' end of the probe to the3' end of exon 2. The origin of the 246nucleotide fragmentwas unclear and was further investigated by using a probe shortened at BspHI a restriction site (Figure 4, panel D). This transcript protected fragmentsof 246,150 and 144 nucleotides. The 246nucleotide fragment is the same 246nucleotide fragment protected by the longer transcript (above) indicating that this fragment is from the region 3' to Pglym78. The origin of this fragment is likely from a downstream gene since in the Northern blots described below a Pglym78 transcript that is 246 nucleotides longer at the 3' end would have been resolved in addition to the1-kb PgZym78 transcript. If this fragment corresponded to a duplicated, alternatively spliced Pglym exon it would have been discovered by inspection of the sequence. The bands of 150 and 144 nucleotides represent shorteningof the 334nucleotide protected fragment corresponding to exon 3 and the 3'-untranslated region of the transcript as expected. The two bands could reflect alternative polyadenylation sites or some nuclease digestion at the ends of the protected molecules in A rich regions. Itis unclear if the original 334-nucleotide fragment might bea doublet since a 6-nucleotide difference probably would not be 358 D. P. Currie Sullivan and D. T. Pglym 78A/B 1 KB 5' Probes A PRIMER EXTENSION 3 Probes B C D . ~.. ._- ss4bp, " . 146bO-D 259bw 246b- 1l 4 b d 240bH IO6bH 96bp-1 FIGURE 4.-Transcript mapping of the Pglym78 transcript. The Pglym78 exons are solid rectangles. Expected protection products for the individual probes, as derived from genomic and cDNA sequence comparisons, are shown as thin lines labeled with their predicted sizes. Four probes A, B, C and D are indicated as stippled lines and the results using each is shown in panels A, B, C and D, respectively. Primer extension analysis of the 5' end of Pglym78 mRNA is shown inthe right most panel. Lane R, ribonuclease treated probe; L, third instar larva] R N A s ; A, adult RNAq and E, embryo RNAs. 150be 147bH resolved in the higher molecular weight 334 nucleotide band. Northern blot analysis using RNA prepared fromdifferent developmental stages of the Drosophila life cycle detects a single 1.0-kb transcript which variesin relative amounts throughout development (Figure 5 ) . Pglym78 transcripts are very abundant at the earliest embryonic stages, probably as a result of transcription during oogenesis. They become undetectable as embryogenesis proceeds and an increase in abundance is evident by hour 12 of development. During larval, pupal and adult stages, the amountof Pglym78 transcript rises, fallsand rises again in a pattern identical to transcripts from other glycolytic genes of Drosophila (SHAW-LEE et al. 1991, 1992; ROSELLI-REHFUS et al. 1992). Nucleotide sequencingof Pglym87: The sequencing strategy for Pglym87is illustrated in Figure 2. The nucleotide sequence of 1838 nucleotides from Pglym87 and its conceptual translation is given in Figure 6. A putative open reading frame, but lacking the first two codons, which might encode a PGLYM subunit is present and intact through an appropriately positioned translation stop codon. A methionine codon which might serve as FIGURE 5.-Northern blot analysis ofthe Pglym 7 8 transcript. Samples of 6.2 pg of RNA from various developmental stages of Drosophila total were electrophoresed, blotted and hybridized with a cDNA probe representing the full length Pglym78 transcript. Time points are in hours (hr) from egg deposition until the l&hr time point where time is then in days (dy). an initiating codon does exist, in frame, eight amino acid residues into this reading frame. Translation of an mRNA of this type would yielda PGLYM like molecule lacking the first nine amino acids but otherwise intact. Pglym87lacks the introns present in Pglym78which suggests that the Pglym gene duplication occurred by retrotransposition and raises the possibility (discussed below) that Pglym87 is a pseudogene. Transcript analysisof Pglym87: RNase protection experiments were performed in parallel with and using the same RNA samples as those of PgZym78 (Figure 4). Drosophila PgZym Genes 359 CCGGACTAGACTATCCGATCTTCGCCTGAATACGAGGCAGCGCTATCATTACTTCGAGGAAGAACGTCTATGTGGACTTCTATATTCCATTACACTATAT-~OO ATATATTATGATACTAGTGAACTCTCGCTGATATCTCGCAAGCTAGCTCTAGCTAAATATTACGAATTATATGCTGAGTACCACCTTTGTACAAGACCTT-ZOO AGGCTTATAGACAATACCACTATGAAGCTGGCTGAGTCAACCATTACTAGCAGTGAACGTAGAATATAATCATTAAAGAATATTTCCATACTGTTAGAAT-300 GCATGAAATAAAAACCTCAAAGTAACCTTATTTTCAACGAACTTTTCTTACTGTCATAGGTTCAACCCCGGCAGGTTTTAAATTAAGGAAAAAATTCCAA-400 TTTCCGGAAAATTGGAAACCAACCGATATAGACTGCAAAACACTTTCCAAATTCAAAGTGTCCCAAAATGTTGCTCTGCCAGAATTCGTGGTGTGGCCAA-500 TTGCTGTGTCGCGTGGCGAACATCGAAGCCCGCTGCCCGTTCTCGAAGTCATCATCCGGCGGAAAGCAGGCCGAGAAGAAGGGCAAGTACAGGATTGTGA-600 G K Y R I V M TGGTTCGTCATGGAGAGTCCGAATGGAATCAGAAGAACCTATTCTGCGGATGGTTCGATGCCAAGCTATCGGAGAAGGGGCAGCAGGAGGCCTGTGCCGC-700 V R H G E S E W N Q K N L F C G W F D A K L S E K G Q Q E A C A A TGGCAAAGCGCTCAAAGATGCCAAAATTGAGTTCGATGTGGCCCACACCTCGGTGCTGACCAGAGCCCAGGAGACACTCAGGGCCGCCCTAAAGTCCAGC-800 G K A L K D A K I E F D V A H T S V L T R A Q E T L R A A L K S S GAGCACAAGAAGATTCCGGTGTGCACCACGTGGCGCCTCAACGAGCGCCACTACGGCGGACTGACGGGACTGAATAAGGCCGAGACCGCCAAGAAGTTCG-~~O E H K K I P V C T T W R L N E R H Y G G L T G L N K A E T A K K F G GCGAGGAGAAGGTTAAGATCTGGCGCCGCAGCTTTGACACCCCACCGCCGCCAATGGAGAAGGATCACGAGTACTACACCTGCATTGTTGAGGATCCGCG-1000 E E K V K I W R R S F D T P P P P M E K D H E Y Y T C I V E D P R CTACAAGGAGCCAACTGAAGCCGGAGAGTTCCCCAAATCCGAGTCCCTCAAGCTCACCATCGAGCGCACCTTGCCCTACTGGAACGAGGTCATAGTGCCC-1100 Y K E P T E A G E F P K S E S L K L T I E R T L P Y W N E V I V P CAGATCAAGGATGGAATGCGTGTCCTGATCGCCGCCCACGGCAACAGCCTGCGGGGCGTGGTCAAGCACTTGGAATGCATCTCTGACAAGGACATTATGA-1200 Q I K D G M R V L I A A H G N S L R G V V K H L E C I S D K D I M S GCCTCAACCTGCCCACTGGCATTCCTTTCGTCTACGAACTAGACGAGAGCCTCAAGCCTTTGGCCACGTTGAAGTTCCTCGGCGATCCGGAGACGGTGAA-1300 L N L P T G I P F V Y E L D E S L K P L A T L K F L G D P E T V K GAAGGCAATGGAGTCGGTGGCCAACCAGGGCAAAGCCAAG~ATCCGTGATTTTGGGTATACCCAGCAGCCTCCGAAGAAGGTCAGAGCAATCAGCTCAA-1400 K A M E S V A N Q G K A K TGCACTTGTATGCCAAACAAAAGCGGCGAATCGAAAGTGGTGCTCACTGTGGCCAGGCGAATGTCAAGAAGCAAATAAA~TT~T~TGArT~TAATTGACA-1500 ATGAAGAGTCAACAACGATGGGCTGGCTGGCGTTGGTCTCTTGGTCTTGGTCTTGGTCTTGGTCTCTGGAACTCAGGCGACG~TTCTGGGTCCAGGGTCT-1600 GGGTCTGGATGTGGGTCTGGGTCTGGGTCTGAGCCAAAGTCAAGCCGATTATATGTGTTCCATTGGCGCTGCTTAAGAATCAGGAGTCGAGTGTGTGTCC-1700 TCAATTTAGTGCTTSGAATGAGTCTGAGAGCGCCTGTATTGCACAGGTCATTATTGTTAATGAATCTCATTAGGCTATGGAAC~CCAAGTATAGTGACTA-1800 AGTATAATCTGAGGCCGGTCCAGGCGAACTGTATCTCGA-1839 FIGURE 6."Sequence and conceptual translation of Pglym87. Nucleotides are numbered with reference to the first nucleotide of the sequencedregion. Translation from codon threeyields a PGLYM similar sequence. A putative stop codon and the consensus polyadenylation signal are double underlined. The 3' direct repeat sequences 3' to the putative coding region are underlined. Pgbrn87 has accession no. L27656 in the GenBank database. These experiments failed to detect any transcript from Pglym87 in the RNA from larval and adult stages. To determine if Pglym87 transcripts could be detected at any developmental stage, RNase protections (internally controlled with probes to Pglym78 and ribosomal protein RP49) were performed using RNA isolated from animals which span the complete developmental spectrum, i. e . , the same RNA samples as used for the northern blots. Theseexperiments(data not shown) also failed to find any developmental stage at which a Pglym87 transcript could be detected.Attempts to identify a Pglym87 transcript using primer extension analyses were also unsuccessful. Analysis of the 87B region: Prior to completion of the above analysis whichfailed to finda Pglym87 transcript, we theorized that a mutation in a PGLYM encoding gene that reduced the total level of PGLYM enzyme significantly might be lethal. GAUSZ et al. (1981) analyzed the 87B cytogenetic region and identified 15 complementation groups represented in a set of 88 alleles. This is an average of 5.9 alleles per complementation group. This suggests that the region may be close to mutationally saturated and raises the possibility that one of the lethal complementationgroupsmightrepresent Pglym87. Representatives of eachcomplementation group were assayed,as heterozygotes, for significant reduction of PGLYMactivity. Resultsfrom this analysis are presented in Figure 7. Reduction of PGLYM activity Complementation Group FIGURE7,"Measurement of the PGLYM activity in animals heterozygous for a representative allele of each of the lethal complementation groups of the 87B region. Each allele was assayed intriplicate. The standard error is indicated by vertical bars. could not be associated with any lethal complementation group. Therefore if Pglym87 has a function, it is likely not a vital function so that mutations in Pglym87 were not detected in this set of complementation groups. In addition, the amount of enzyme encoded by Pglym87 is likely a small fraction of the total phosphoglycerate mutase activity since heterozygous deletions also showed no reduction in activity. P. D. Currie and D. T. Sullivan 360 Alignment of DrosophilaandHuman PGLYM s 8 7 -g k y r i u H U RHGESEUNQK NLFCGUFDAK LSEKGQOEAC ARGKRLKDAK IEFDUAHTSU78-HGGKYKIUHURHGESEUNQENQFCGUYDANLSEKGQEEALARRKAUKDAGLEFDUAHTSU HB-HRA-YKLULIRHGESAUNLENRFSGUYDADLSPRGHEEAKRGGQALRDRGYEFDICFTSU HH-HAT-HRLUHURHGESTUNQENRFCGUFDAELSEKGTEERKRGAKAIKDAKHEFDICYTSU 60 LTRAQETLRAALKSSEHKKIPUCTTWRLNERHYGGLTGLNKAETAKKFGEEKUKIURRSF-120 LTRRQUTLRSILKPUATRRSPIQKTURLNERHYGGLTGLNKAETARKYGERQUQIURRSF QKRRIRTLUTULDAlDQflULPUURTURLNERHYGGLTGLNKRETRAKHGEAQUKIURRSV LKRAIRTLURILDGTDQflULPUURTURLNERHYGGLTGLNKAETAAKHGEEQUKIURRSF FIGURE 8.-Alignment of the conceptually translatedproteinsequence of the pglrm genes afDrosophi~a and humans. Dashes indicate gaps introduced DTPPPPflEKDHEYYTCIUEDPRYKEPTERGEFPKSESLKLTIERTLPYUNEUIUPQ I KDG-180 DTPPPPHEPGHPYYEN I UKDPRYAEGPKPEEFPQFESLKLTIERTLPYUNDUI I PQnKEG DUPPPPtlEPD HPFYSHISKD RRVRDLTEDQLPSCESLKD TIARRLPFWN EEIUPQIKEG DlPPPPflDEKHPYYNSISKE RRYRGL-KPGELPTCESLKDTIRRRLPFUNEEIUPQIKRG to maintain alignment. IIRULIRRHGNSLRGUUKHLECISDKDIHSLNLPTGIPFUYELOESLKPLATLKFLGDPET-2+0 KRlLlAAHGNSLRGIUKHLDNLSEDAIHALNLPTGIPFUYELDENFKPUUSHQFLGDEET KRULIAAHGNSLRGIUKHLEGLSEEAIHELNLPTGlPlUYELDKNLKPIKPflQFLGDEET KRULIRRHGNSLRGIUKHLEGIISDORIHELNLPTGIPIUYELNKELKPTKPIIQFLGDEET UKKAflESURN UKKAIEAUAR URKAIIEAURA URKAIIERURA QGKRKQGKRKQGKAKK OGKFIK- Evolutionarycomparison of the Drosophila Pglym genes: Alignment of the putative proteins encoded by the Drosophila genes and the human muscle and brain isoforms, PGA"M and PGAM-B, respectively (Figure 8 ) reveals that the potentialprimary amino acid sequence is highly conserved. The putative Drosophila PGLYM proteins are 73% identical. The first eight amino acids of PGLYM87 are shown in small letters because no putative initiator methionine precedes them, yet they are conserved. Comparison of these amino acid sequences was used to determine whether the Drosophila Pglym duplication represented a duplication having functional simiIarity to the mammalian Pglym duplication which resulted in genesfordifferent isoforms. PGLYM78 seems to beequally similar to the PGA"B isoform, 68% and the PGA"M isoform, 69%. At positions of amino acid substitutions where there can either be a PGA"M or PGA"B type amino acid within the PGLYM78 protein, 12 amino acids are identical to the PGA"B gene and a further 12 substitutions are PGA"M type. Putative PGLYM87has 64% amino acid identity with the human muscle isoform PGA"M protein and 68% amino acid identity with general isoform PGAM-B. Analysis of the substitutions of PGLYM87 reveals that eight are of the PGA"B type and 18are ofthe PGM-M type. As the PGA"B isoform is the mammalian general isoform and the PGA"M form is considered tissue restricted isoform this difference may indicate that Pglym8 7 at some point encodeda protein that was restricted to a specific cell type, e.g., muscle. Analysis of Drosophila Pglym sequences reveals that each shows a nonrandom distribution of nucleotides in the third codon position (Table 1) with an abundance of C and G and scarcity ofA typical ofother Drosophila TABLE 1 Codon usage and bias of the Drosophila Pglyrn genes Bases in the third position of codons Gene U C A G Total Pglym87 % of total Pgly m 78 % of total 29 11.4 22 8.6 103 40.6 123 48.0 27 10.6 14 5.5 95 37.9 97 37.9 254 256 genes (STARMER and SULLIVAN 1989). This nonrandom distribution is reflective of codon utilization bias. Pglym87 has a somewhat less extreme codon bias than does Pglym78 indicating some loss of bias since the duplication. DISCUSSION Molecular genetics of Pglym78 Pglym 78 has properties similar to other Drosophila genes which encode enzymes of the glycolytic pathway.These includea promoter region which has no TATA box but has multiple, closely spaced transcription initiation sites and a transcript expression pattern similar to all of the otherDrosophila glycolytic genes tested so far (SHAW-LEE et al. 1991, 1992; ROSELLI-REHFUS et al. 1992). This similarity in temporal regulation of glycolyticgenes reinforces the notion that there may be common regulatory control mechanisms exercised onthe transcription of these genes. The Drosophila Pglym 78 gene and the human muscle Pglym gene (TSUJIMO et al. 1989) are the only Pglym genes forwhich intron positions have been determined. Each of these genes has two introns. The first intron of Drosophila Pglym Genes Pglym78 interrupts the gene at codon35 and does not have a counterpart in the human gene. Thesecond intron of Pglym78 interrupts codon 201. The second intron of the humanmuscle gene interrupts codon197 of the PGLYM open reading frame. This may represent an intron position conserved in the two genes. It has been suggested that intron positions may “slide” around a conserved position and observations consistent with this interpretation have been made when comparisons between homologous genes that have been separated for a substantial evolutionary period have been made (MCKNIGHT et al. 1986; WCHIONNI and GILBERT 1986). Origin of the duplication: Since Pglym87 lacks the introns present in Pglym78 and appears at least two codons shortat theN terminus seems it highly likelythat Pglym87 arose through retrotransposition from a Pglym78 transcript. Additional support of this interpretation comes from several features of the Pglym87 region. The intron-less Pglym87 gene is on a separate chromosome arm from the intron containingPglym78 gene. Astretch of 20 nucleotides, 19 nucleotides 3’ to a putative polyadenylation signal (underlined in Figure 6), contains 53% adenosineresidues. This could represent the remnant of a poly(A) tail from the transcript. A similar percentage of adenosine residues 3’ to the consensus polyadenylation signal in human Pgk, another retrotransposed gene, has been used to argue that this region represents the corrupted remainder of the reverse transcribed transcript’s poly(A) tail (MCCARREY and THOMAS1987). While evidence presented here indicates that Pglym87 is almost certainly the result of a retrotransposition mediated gene duplicationevent, examination of the nucleotide sequence around Pglym87 suggests that additionalor more complex mechanisms may have played some role in the origin of Pglym87. Sequences immediately 3’ to the putative poly A remnant contain four perfect copies and two corrupted copies of the sequence GGTCTT in adjacentdirect repeats. These could encode a poly prolineglutamic acid peptide on the opposite strand.Immediately downstream from this repeat arefive perfect copies and one corrupted copy of therepeat GGGTCT which, on the opposite strand would encode a poly proline-arginine peptide. These repeats are similar to the S repeats found within the transposable element Hobo, which is composed of the sequence TGAGGTCTT (STRECKet al. 1986). Also, in one openreading frameof the Ac transposable element of maize there is a region of repeated sequence which encodes 10 tandem prolineglutamicacid repeats reminiscent of the hobo S repeats (MULLER-NEUMANN et al. 1984). Thepresence of these repeat sequences, immediately three primeto the Pglym87 gene that are similar to those found in transposons of the hobo/Activator class, suggests an interesting mechanism for this gene duplication. This class of transposable elements trans- 361 pose through a DNA intermediate (&VI et al. 1991). If reverse transcript DNA were to be incorporated into an actively transposing hobo/Ac type element, this could provide a mechanism for integration into different regions of the genome using the usual pathway of the parent transposon. While retrotranspositions of genes and pseudogenes is found in the genomesof most species and seems particularly frequent in mammalian genomes, only a few examples have been noted in the Drosophila genome. This is despite demonstration of the presence in Drosophila cells of retroviral-like reverse transcriptase enzyme (HEINEet al. 1980) and an abundance of transposable elements that are of the retrotransposon class. On the basis of different chromosomal location of the Rh4 opsin gene in Drosophila virilis and D. melanogaster and the absence of introns in the D.virilis gene (NEUFELD et al. 1991) have suggested that the Rh4 opsin gene ofD. virilis has undergone retrotransposition during its evolution in the D . virilis lineage. In addition, it is likely that retrotransposition has played a significant role during theevolution of the Gapdh gene duplication in the genus Drosophila (WOJTASet al. 1992). Another example is that of the retrotransposedcopies of the alcohol dehydrogenase ( A d h ) genes of Drosophila yakubaand Drosophila teissieri UEFFSand ASHBURNER 1991; JEFF-S et al. 1994). In many respects these retrotransposed Adh genes aresimilar to Pglym87. Their Adh open readingframes remain intact except for a deletion of the Adh start codon. Each has lost the introns typical of an Adh gene andthey occupy chromosomal positions which are not homologous to the Adh position in D. melanogaster. These putative pseudogenes aremore similar to one another than to their respective functional Adh genes, even containing similar deletions in 5’ non codingDNA. The chromosome positions of the two genes in D. teissieri and D. yakuba is conserved which indicates that the two genes have a common origin and that the duplication arose before the species diverged. Is Pglym87apseudogene?: Since Pglym87 has its origin through retrotransposition the question that naturally arisesis: Isit a pseudogene?Consistent with the view that itis a pseudogeneis its origin through retrotranposition and the lack of detectable transcripts. However, we recognize the difficulty in proving that there areno Pglym87 transcripts. These could exist in very low levels, at restricted periods of development or in a very fewcells and have gone undetected in our assays. Also consistent with the hypothesis that Pglym87 is a pseudogene is a consideration of its 5’end. If the methionine encoded by the codon homologous to codon nine of Pglym78 is in fact a translation start in Pglym87, then this would result in the loss of eight relatively conserved amino acids from the N terminus. Furthermore, it is not clear why the sequence encoding the six amino acids preceding the methionine would be conserved as compared to 362 P. and D. Currie Pglym78. Another reason to think that Pglym87 might be a a pseudogene is that retrotransposedgenes have the diffkulty of not carrying sufflcient information for their correct transcription. In fact, two of the better analyzed retrotransposed genes, human testes specific Pgk (McCARREY 1994) and D. uirilis Rh4 opsin (NEUFELD et al. 1991) both of which are known to function, seem to be reverse transcriptase copies of a transcript substantially longer at the 5’ end than the mature mRNA. In each case copies of sequences required fortranscription apparently were included during retrotransposition. Inspection of the sequence 5’ to the PGLYM reading frame of Pglym8 7 reveals no significant similarity to the sequence 5’ to Pglym78. In support of Pglym87 being functional are several aspects of itsmolecular evolution including observations that the PGLYM reading frame is intact and that the gene shows at least some retention of codon utilization bias. In addition, the crystal structure of PGLYM from yeast has been solved and specific amino acids have been implicated in catalysis. These are histidine 8, histidine 181 and serine 11. Histidine 181 has been conserved in all mutases. Serine 11 has been implicated in the catalytic mechanism as aphospho ligand (WHITEand FOTHERGILL-GILMORE 1992). All these amino acids are conserved in the PGLYM78 protein with histidine 8 and 181 corresponding to histidine 12 and 188 respectively and serine11is conserved as serine 15. In ahypothetical PGLYM87 protein, the two histidine residues and the serine are also conserved and are located in identical positions as in PGLYM78. Thus, both Drosophila genes possess the capability to encode those amino acid residues shown to be important for PGLYM catalytic activity. If the PGLYM 87 is nonfunctional, then it should not, by the conventional view ofpseudogenes, be subject to constraint on its sequence divergence. Conservation of coding frame, codon bias or catalytically relevant specific amino acids would not be expected in a pseudogene. This is clearly the case for mammalian pseudogenes, where the biased base composition seen in the third codon position of functional genes is almost absent and the loss of reading frameis commonly observed (KIMURA 1983). Therefore we are left to chose between two hypotheses regarding the natureof Pglym87, neither of which is in total accord with present views regarding the molecular evolution of products of retrotransposition and pseudogenes. The structural properties suggest Pglym87 is a pseudogene with a peculiar rate and pattern of evolution, while aspects of its molecular evolution suggest it might be a functional gene having puzzling aspects in its structure. Inthis regard it seemsworth noting that in Drosophila species, two other candidates for pseudogenes have been characterized and in each, peculiar evolutionary properties have been noted. The similarity between Pglym87 and the retrotransposed Adh sequences in D. teissieri and D.yakuba sequences D. T. Sullivan are particularly striking. When these sequences were identified, theirpeculiar pattern of evolution, including retention of reading frame, codonbias and higher rate of substitution at synonymous nucleotide positions was noted UEFFSand ASHBURNER 1991). Recently an alternative interpretation forthese sequences has been offered in which it is suggested that the retrotransposed Adh sequences captured upstream exons with the resultant evolution of a new gene (LONGand LANGLEY 1993). However, one of the principle reasons offered to support the hypothesis that the new gene, called jingwei, is functional is its pattern of molecular evolution. The ultimate answer with respect to jingwei function remains to be verified by molecular analysis of its putative transcrip tion and translation products,butit seemshighly improbablethatan analogous situation applies to Pglym87. Inspection of the sequence makes no suggestion of captured exons in the 5’ sequence. In another example of a pseudogene in Drosophila, we have recently analyzed the molecular evolution of an Adh sequence that is present in species of the repleta group (SULLIVAN et al. 1994). This sequenceis found in all species ofthe repleta groupand is located immediately u p stream of the functional Adh genes. It is not a product of retrotransposition since it retains introns homologous to the functional Adh genes. Analysis of the evolution of this sequence reveals it is diverging more slowly than the neighboring intergenic regions, some codon bias is retained, and the ratio of K, to K , ranges from 10 to 14, in painvise interspecific comparisons using the sequences from seven species. These ratios are only slightly lowerthan those obtained from equivalent comparisons of the functional Adh genes and clearly not unity, as wouldbe expectedfor a pseudogene.Yet, in this set of Adh homologous sequences there is no openreading frame of appreciable length common to this set of seven species. Since there are only the two Adh pseudogenes and this report of Pglym, it is unclear at present what the peculiar patterns of molecular evolution of these putative pseudogenes might mean. Itseems likely that more extensive analysis of these examples and the discovery of others may reveal features of genome evolution in Drosophila not yet appreciated or understood. LITERATURE CITED BENDER, W., P. SPIERER and D. S. HOGNESS, 1983 Chromosomal walking and jumping to isolate DNA from the Ace and rosy loci and the bithorax complex in Drosophila melanogaster. J. Mol. Biol. 168: 17-33. BENTON, W. D., and R. W. DAVIS, 1977 Screening lambda gt recombinant clones by hybridization to single plaques in situ. Science. 1 9 6 180-182. BIGGIN, M. D., T. S. GIBSON and G. G . HONG,1983 Buffer gradientgels and [95S] label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80: 3963-3965. C a w , B. R.,T. J. HONG,S. D. FINDLEY and W. M. GELBART, 1991 Evidence for a common evolutionary originof inverted repeat transposons in Drosophila and plants:hobo, Activator, and Tam3. Cell. 66: 465-471. CASTELLA-ESCOLA, J., D. M. OJCIUS, P. LEBOWLCH, V. JOULIN, Y.BLOWQUIT et al., 1990 Isolation and characterizationof the gene encoding Drosophila Pglym Genes the musclespecific isozyme ofhuman phosphoglycerate mutase. Gene. 91: 225-232. ENGELS, W.R.,C.R.PRESTON,P. THOMPSON and W.B. EGGLESTON, 1986 In situ hybridization to Drosophila salivary chromosomes with biotinylated DNAprobes and alkaline phosphatase. Focus. 8: 6-8. FOTHERGILL-GILMORE,L.A., and H. c . WATSON, 1989 Phosphoglyceratemutase. Adv. Enzymol. 62: 227-313. GAUSZ, J. H., G.GWRKOVICS, A. A. M. AWAD,J. J. HOLDEN and D. ISH-HOROWICZ, 1981 Genetic characterization of the region between 86F1,2 and 87B15 on chromosome three of Drosophila melanogaster. Genetics 98: 775-7 GILBERT, W., 1986 On the antiquity of introns. Cell. 46: 151-153. GRISOLIA, S., and J. CARRERAS, 1975 Phosphoglycerate mutase from yeast, chicken breast muscle, and kidney (2,SPGAilependent). Methods Enzymol. 42: 435-450. HEINE, C.W.,D.C. KELLY and R. J. AVERY, 1980 The detection of intracellular retroviral like entities in Drosophilamelanogaster cell culture. J. Gen. Virol. 4 9 385-395. HEINISCH,J., R.C. VON BORSTEL and R.RoDtcro, 1991 Sequence and localization of the geneencoding yeast phosphoglycerate mutase. Curr. Genet. 20: 167-171. HONG, G. F., 1982 Asystematic DNAsequencing method. J. Mol. Biol. 158: 539-549. JEFFS,P., and M. ASHBURNER, 1991 Processed pseudogenes in Drosophila. Proc. R. SOC.Lond. Ser. B 244: 151-159. JEFFS, P., E. C. HOLMES and M. ASHBURNER, 1994 The molecular evolution of the alcohol dehydrogenase and alcohol dehydrogenaserelated genes in Drosophilamelanogaster species subgroup. J. Mol. EVO~11: 287-304. M. 1983 The Neutral Theory of Molecular Evolution. CamKIMURA, bridge University Press. LEBOULCH, P., V. JOULIN, M. C. GAREL, J. ROSAand M. COHENSOLAL, 1988 Molecular cloning and nucleotide sequence ofmurine 2 , s bisphosphoglycerate mutase cDNA. Biochem. Biophys. Res. Commun. 156: 874-881. LIS, J. T., J. A. SIMON and C. A. SUTTON,1983 New heat shock puffs and Kalactosidase activity resulting from transformation of Drosophila with an hsp70-lac2 hybrid gene. Cell 35: 403-410. LISSEMORE, J. L., J. T. COLBERT and P. H.QUAIL, 1987 Cloning of cDNA for phytochrome from etiolated Cucurbita and coordinate photoregulation of two distinct phytochrome transcripts. Plant Mol. Biol. 8: 485-496. LONG, M. Y., and C. H. LANGLEY, 1993 Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260: 91-95. MANIATIS, T., E. F. FRJTSCH and J. SAMBROOK, 1989 Molecular Cloning: A Laboratory Manual, Ed. 2. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. MCCARREY, J. R., 1994 Evolution of tissue-specific gene expression in mammals. Bioscience 44: 20-27. McCAWY, J. R., and K. THOMAS, 1987 Human testis-specific Pgkgene lacks introns and possesses characteristics of a processed gene. Nature 3 2 6 501-505. McKnight, G.L.,P. J. O’Hara and M. L. PARKER, 1986 Nucleotide sequence of the triosephosphate isomerase gene from Aspergillus nidulans, implications for differential loss of introns. Cell 46: 143-147. MIRANDA, A. F.,E.R.PETERSON and E. B. MASUROVSKY, 1988 Differential expression of creatine kinase and phosphoglycerate mutase isozymes during development in aneural and innervated human muscle cultures. Tissue Cell 20: 179-191. MORELLE, G., 1988 A plasmid extraction procedure on a miniprep scale. Focus 11: 7-8. MULLER-NEUMANN, M., J. L. YODERand P. STARLINGER, 1984 The DNA sequence of the transposable element Ac of Zea mays. Mol. Gen. Genet. 198: 19-24. NAKATSUJI, Y., K. HIDAKA, S. TSUJIMO, Y. YAMAMOTO, T. MUKAIet al., 1992 A single MEF-2 site is a major positive regulatory element required for transcription of the muscle-specific subunit of the 363 human phosphoglycerate mutase gene inskeletal and cardiac muscle cells. Mol. Cell. Biol. 12: 4384-4390. NEUFELD, T. P., R.W. CARTHEW and G. M. RUBIN, 1991 Evolution of gene position-chromosomal arrangement and sequence comparison of the Drosophila melanogaster and Drosophila virilisSina and Rh4 Genes. Proc. Natl. Acad. Sci.USA 88: 10203-10207. OMENN, G. S., and S. C. Y. CHEUNG, 1974 Phosphoglycerate mutase isozyme marker for tissue differentiation in man. Am. J. Hum. Genet. 26: 393-399. ROSELLI-REHFUS, L.,F. YE, J. L. LISSEMORE and D. T. SULLIVAN, 1992 Structure and expression of the phosphoglycerate kinase gene ( P g k ) of Drosophila melanogaster. Mol. Gen. Genet. 235: 213-220 SAKODA, S., S. SHANSKE, S. DI MAURO and E.A. SCHON, 1988 Isolation of a cDNA encoding the B isozyme of human phosphoglycerate mutase (PGAM) and characterization of the PGAM gene family. J. Biol. Chem. 263 16899-16905. SANGER, F., S. NICKLENand A.R. COULSON, 1977 DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci.USA 74: 5463-5467. SHAW-LEE, R. L., J. L. LISSEMORE and D. T. SULLIVAN, 1991 Structure and expression of the triose phosphate isomerase ( T p i ) gene of Drosophila melanogaster. Mol. Gen. Genetics. 230: 225-229. SHAW-LEE,J.R., L. LISSEMORE, D. T. SULLIVAN and D. R. TOM, 1992 Alternative splicing of fructose 1,Gbisphosphatealdolase transcripts in Drosophila melanogaster predicts 3 isozymes. J. Biol. Chem. 267: 3959-3967. SOUTHERN, E. M., 1975 Detection of specific sequences among DNA fragments separated by gel electrophoresis.J. Mol. Biol. 98: 503-517. STARMER, W. T., AND D. T. SULLIVAN 1989 A shift in the third-codonposition nucleotide frequency in alcohol dehydrogenase genes in the genus Drosophila. Mol. Biol. Evol. 6: 546-552 STRECK, R. D., J. E. MAcGmuand S. K BECKENDORF, 1986 The structure of hobo transposable elements and their insertion sites. EMBO J. 5: 3615-3623. SULLIVAN, D. T., W. T. CARROLL, C.L. KANIK-ENNUIAT, Y. HIITI,J. A. LOVEIT et al., 1985 Glyceraldehyde-%phosphatedehydrogenase from Drosophilamelanogaster: identification oftwoisozymes forms encoded by separate genes.J. Biol. Chem. 260: 4345-4350. SULLIVAN, D.T., W. T. STARMER, S. W. CURTIS,M. MENOTTI-RAYMOND and JUNGSUN YUM, 1994 Unusual molecular evolution of an Adh pseudogene in Drosophila. Mol. Biol. Evol. 11: 443-458. TSUJIMO, S. SAKADO, S. MIZUNO,R. KOBAYASHI, T. SUZUKI et al., 1989 Structure of the gene ecoding the muscle-specific subunit of human phosphoglycerate mutase. J. Biol. Chem. 264: 15334-15337 URENA, J. M., X. GRANA, L. DELECEA, P. R u I ~ J.,CASTELLA et al., 1992 Isolation and characterization of a cDNA encoding the B-isozyme of rat phosphoglycerate mutase. Gene. 113: 281-282. VON KALM, L., J. WEAVER, J. D E ~ ~ A RR.CJ.OMACINTIW , and D. T. SULLIVAN, 1989 Structural characterization of the aglycerol-%phosphate dehydrogenaseencoding geneof Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86: 5020-5024. WHITE, M. F., and L.A. FOTHERGILL-GILMORE, 1992 Development of a mutagenesis, expression and purification system for yeast phosphoglycerate mutase. Investigation of the role ofactive-site Hisl8l. Eur. J. Biochem. 207: 709-714. WHITE, P. J., J. NAIRN, N. C. PRICE, H. G. NIMMO, J. R. COGGINS et al., 1992 Phosphoglycerate mutase from Streptomyces coelicolor A3(2): purification and characterization of the enzyme and cloning and sequence analysis ofthe gene. J. Bacteriol. 174: 434-440. WOJTAS,K. M., L. VON KALM, J. R. WEAVER and D. T. SULLIVAN 1992 The evolution of duplicate glyceraldehyde-%phosphate dehydrogenase genes in Drosophila. Genetics. 132: 789-797. YANAGAWA, S.-I., K. HITOME, R. SASAKI and H. CHIEA, 1986 Isolation and characterization of cDNA encoding rabbit reticulocyte 2,3bisphosphoglycerate synthetase. Gene 44: 185-191. Communicating editor: V. G. FINNERTY