Download Structure, Expression and Duplication of Genes Which Encode

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Frameshift mutation wikipedia , lookup

Copy-number variation wikipedia , lookup

Gene therapy wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

Genetic engineering wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Expanded genetic code wikipedia , lookup

Bisulfite sequencing wikipedia , lookup

Non-coding RNA wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Long non-coding RNA wikipedia , lookup

Polycomb Group Proteins and Cancer wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Gene nomenclature wikipedia , lookup

Nucleic acid analogue wikipedia , lookup

Ridge (biology) wikipedia , lookup

Gene desert wikipedia , lookup

Gene expression programming wikipedia , lookup

Minimal genome wikipedia , lookup

Genomic imprinting wikipedia , lookup

Transposable element wikipedia , lookup

Human genome wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Genome (book) wikipedia , lookup

Genetic code wikipedia , lookup

Pathogenomics wikipedia , lookup

Non-coding DNA wikipedia , lookup

Epigenetics of human development wikipedia , lookup

History of genetic engineering wikipedia , lookup

Metagenomics wikipedia , lookup

Genomic library wikipedia , lookup

Gene expression profiling wikipedia , lookup

Point mutation wikipedia , lookup

Genome editing wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Genomics wikipedia , lookup

Genome evolution wikipedia , lookup

Gene wikipedia , lookup

Designer baby wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Primary transcript wikipedia , lookup

Microevolution wikipedia , lookup

Helitron (biology) wikipedia , lookup

RNA-Seq wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Transcript
Copyright 0 1994 by the Genetics Society of America
Structure, Expression and Duplication of Genes Which Encode
Phosphoglyceromutase of Drosophila melanogaster
Peter D. Currie' and David T.Sullivan2
Department of Biology, Syracuse University, Syracuse, New York 13244-1270
Manuscript received February 5, 1994
Accepted for publication June 8, 1994
ABSTRACT
We report here the isolation and characterization
of genes from Drosophila that encode the glycolytic
enzyme phosphoglyceromutase (PGLYM). Two genomic regions have been isolated that have potential
to encode PGLYM. Their cytogenetic localizations have been determined by in situ hybridization to
salivary gland chromosomes. One gene,Pglym78, is found at 78A/B and the other, Pglym87, at 87B4,5
of the Drosophila polytene map.
Pglym78 transcription follows a developmental pattern similarto other
glycolytic genes in Drosophila,Le., substantial maternal transcript deposited during oogenesis;
a decline
in abundancein the first half-of embryogenesis;
a subsequent increase
in the second half
of embryogenesis
which continues throughoutlarval life; a decline in pupae and a second increase to a plateau in adults.
This transcript has been mapped by cDNA and genomic sequence comparison, RNase protection, and
primer extension.Using similar analyses transcripts
of Pglym87 could not be detected.Pglym 78 has two
introns which interrupt the coding region,
while the Pglym87 gene lacks introns. This and
other features
Pglym87. The apparent
support a model of retrotransposition mediated gene duplication for the oforigin
absence of a complete, intact coding frame and transcript suggest
Pglym87
thatis a pseudogene.However,
retention of reading frame and codon
bias suggests that
Pglym87 may retain coding function, may
or have
been inactivated recently, substantially after the time of duplication, or that the molecular evolution
of Pglym87 is unusual. Similaritiesof the unusual molecular evolution ofPglym87 and other proposed
pseudogenes are discussed.
A
S part of an ongoing study to characterizethe control of expression of glycolyticgenes of Drosophila
melanogaster, a significant subset of these genes has
been isolated and characterized (ROSEUI-REHFUS
et al.,
1992; SHAW-LEE
et al. 1991, 1992; VON KALM et al. 1989;
Suurvm et al. 1985). This study details the molecular
characterization of an additional member of this set of
genes, the gene that encodes
phosphoglyceromutase
(PGLYM). PGLYM catalyzes the interconversion of
2-phosphoglycerate and %phosphoglycerate. PGLYM
from insects has not been well characterized. To our
knowledge, there have been no reports concerningthis
enzyme or its gene from Drosophila. In mammals, two
genes encode two isozyme subunits, each subunit is 30
kD. The enzyme is a dimer so three isoforms of PGLYM
may be found in mammalian tissues. There is a muscle
specific subunit, M, and a non-muscle-specific,or brain,
subunit B, which is synthesized in many tissues (OMENN
and CHEUNG
1974). The homodimer, MM form, is found
mainly in mature muscle; the BB form, is found mainly
in liver, kidney and brain; and the heterodimer MB
form, alongwith varyingproportions of the MM and BB
forms, is found mainly in the heart (OMENN CHEUNC
and
1974). In addition, PGLYM in humans is developmentally regulated. Early in development fetal muscle con-
'
Present address: ICRF Developmental Biology Unit. Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, U.K.
* To whom correspondence should be addressed.
Genetics 138 353-363 (October, 1994)
tains mainly the nonspecific or brain form. At approximately 80-100 days of gestation, the muscle specific
form first appearsand becomes the predominant form
thereafter (OMENN and CHEUNG
1974; MIRANDAet a2.
1988).
Genes which encode human, rat, rabbit and mouse
isoforms have been characterized and comparisons reveal a substantial conservation in nucleotide and amino
acid sequence (URENA
et al. 1992; CASTELLA-ESCOLA
et al.
1990; Y ~ A G A WetA al. 1986; LEBOULCH
et al. 1988). In
humans there is a single copy of Pglym-m per haploid
genome. Thisis in contrast to a large family ofPglym-b
genes, many of which have been demonstrated to be
pseudogenes (CASTELLA-ESCOLA
et al. 1990). It has been
proposed that multiple functional copies of Pglym-b
once existed within the Pglym-b gene family and one
evolved to form Pglym-m. This hypothesis is supported
by phylogenetic comparisons which indicate
that
Pglym-m has diverged from this multigene family relaet al. 1988; LEBOULCH
et al.
tively recently (SAKODA
1988).
Transcriptional regulationof muscle specific Pglym-m
has been investigated in some detail. Promoter deletion
studies have defined a single binding site in the Pglym-m
promoter for the MEF-2 protein which is involved in
tissue and temporal expression in muscle (NAKATSUJI
et al. 1992). Band shift assays using promoter fragments
demonstrated that protein binding tothis site was only
P.
354
Currie and D. T. Sullivan
detected using nuclear proteins derived from muscle
Sephadex G50 column by a method specified by the manufacturer (Boehringer Mannheim).
extracts. Thus, the mammalian muscle specific isoform
Screening of bacteriophage libraries:Genomic clones were
is thought to be under direct control of genes
stimu-that
isolated from an adult Oregon-R D.melanogasterstrain library
late muscle differentiation (NAKATSUJI
et al. 1992).
constructed in the A vector EMBL4 (SHAW-LEE
et al, 1991).
Saccharomyces cerevisiae PGLYM has also been well
cDNA clones were isolated from an adult Oregon-R strain D.
characterized. Both the nucleotide and amino acid semelanogasterlibraryconstructed in the vector Agtll (VON KALM
quence have been independently determined (HEINISCH et al. 1989). Then 20,000 pfu of the respective libraries were
plated with 400 pl of an overnight culture of the appropriate
et al. 1991; FOTHERGILL-GILMORE
and WATSON
1989).The
host strain per large NZCYM plate, and plates were allowed to
yeast enzyme is composed of four identical 27-kD s u b
grow to near confluence. Plates were incubated at 4" for 1 hr
units encoded by the G p m l gene. The crystal structure
to harden top agarose. The libraries were then screened acof the enzyme has been solved (FOTHERGILL-GILMORE
and
cording to the method of BENTON
and DAWS(1977).
Southern and Northern blot analysis: Southern blot transWATSON
1989). One important feature learned from
fer was carried out following the method of SOUTHERN
(1975)
these studies has been the demonstration that two hiswith the adaptations described by the membrane manufactidine residues are involved in catalysis at the active site
turer (Schleicher 8c Schuell).Northern blot analysiswas
of this enzyme. The essential role of these histidines for
et al. (1987) using Genescreen memadapted from LISSEMORE
catalytic activityhas been demonstrated further by sitebranes according to the manufacturers instructions (New England Nuclear). Modifications included soaking the electrodirected mutagenesis (WHITEand FOTHERGILL-GILMORE
phoresis rig, tray, comb and peristaltic pump recirculation
1992).
tubing in 0.1% diethyl pyrocarbonate overnight before use.
The biochemical properties of PGLYM from the bacWe used 200 pl/cm of prehybridization buffer (50% deionized
teria, Streptomyces coelicolor, the only prokaryote from
formamide, 5 X SSC, 5 X Denhardt's solution, 100 pg/ml
which the PGLYM encoding genehas been cloned and
sonicated denatured salmon sperm DNA, 40mM sodium phosphate, pH 6.8, 0.5% bovine serum albumin, 1% sodium dccharacterized, indicate that this enzyme has properties
decyl sulfate (SDS),and 10%dextran sulfate) and 100 pl/cm2
similar to the yeast PGLYM (WHITEet al. 1992). The
of hybridization buffer (50% deionized formamide, 5 X SSC,
subunit molecular mass is 28 kD and the native mass
100 pg/ml sonicated denatured salmon sperm DNA, 40 mM
determined by gel filtration was estimated at 120 kD. It
sodium phosphate, pH 6.8, and 10% dextran sulfate). Prehywas assumed therefore that theS. coelicolor enzyme is a
bridization was for 4-6 hr at42" and hybridization was at 42"
for 12-16 hr.
tetramer, similar in form to the yeast enzyme. The bacRNase protection: One microgram of linearized SK+ plasterial enzymepossesses the greatest amino acid semid DNA was used as a template for transcript synthesis using
quence similarity to the yeast enzymebut also has a high
T3 or T7 polymerase and [cx-~~PJCTP
(NEN) utilizing the Rilevel of amino acid similarity to mammalian PGLYM,
bobasica I1 transcription kit and a method supplied by the
reinforcing the view that glycolytic genes have been
manufacturer (IBI). Two micrograms of IBI RQ1 DNase
I were
added to the labeled transcript and incubated at 37" for 30
highly conserved during evolution (WHITE
et al. 1992).
min. Twenty microliters of 50 mM EDTA were added, and the
Results reported here provide the initial characterizatranscriptwas purified over a Sephadex G50 RNA spin column
tion of an insect Pglym gene. In D. melanogaster it ap(Boehringer
Mannheim).
Labeled transcript (20-50,000
pears there is one functionalgene
and a second
cpm) was added to an appropriate amount of total RNA that
sequence which may be a pseudogene.
had been dried under vacuum and resuspended in 30 pl of
MATERIALSANDMETHODS
DNA isolation and preparation: Plasmid DNA, propagated
in Escherichia coli strain DH5a, was isolated in smallquantities
by the method of MORELLE(1988). Large scale preparation of
plasmid DNA wasperformed by as described in MANIATIS et al.
(1989). Single-stranded DNAfrom M13mp18 or M13mp19 sequencing vectors, large scale bacteriophage A DNA using a
plate lysate method or lysis in liquid culture were isolated using
methods described in MANIATIS et al. (1989). Drosophila
genomic DNA was prepared in small scale via the method of
LISet al. (1983). Large scale isolation of Drosophila genomic
DNA was performed using a modified method of R. LIFTON
et al. 1983).
(BENDER
Isolation and radiolabeling of DNA fragments: Agarose gel
slices containing DNA restriction endonuclease fragments
were visualized and excised from ethidium bromide containing 1% agarose gels and electroeluted according to the dialysis
tube method of MANUTIS et al. (1989). A125-500-ng sample
of isolated restriction fragment DNA was random primed utilizing a randompriming labeling kit as specified by the manufacturer (Boehringer Mannheim).Unincorporated nucleotides were separated from labeled DNAvia passage over a
hybridization buffer (80% formamide, 40 mM PIPES, pH 6.4,
0.4 M NaCl, 1 mM EDTA) and heated to 85" for 10 min. The
sample tubes were then kept at 55" overnight, 300 pl of digestion buffer (0.1 M Tris-Cl pH 8.0, 0.5 M EDTA, 0.3 M NaCl,
40 pg/ml heat-treated RNase A (Sigma) and 2.87 units/ml
RNase T1 (BRL)) were added, and the reaction mixture was
incubated at 30" for 30 min. Twenty microliters of 10% SDS
and 2.5 pl of 20mg/ml proteinase Kwere then added,and the
reaction mixture was incubated at37" for 30 min. Extractions
were performed with the addition of an equal volume of phenol and 0.5 pl of 10 mg/ml yeast tRNA was added to the extracted aqueous phase. Reaction productswere precipitated by
the addition of an equal volume of ethanol, rinsed with 70%
ethanol and the pellets dried under vacuum. Pellets were resuspended in loading dye (U. S. Biochemical Corp.) and
heated to 85" for 5 min. Protection products were visualized
following electrophoresis on a 1 X TBE, 6% acrylamide gelby
standard autoradiographic techniques.
Primer extension: Primer extension was performed essentially as described by MANIATIS et al. (1989) with the following
modifications: 5-30 pg of poly(A+)RNA wereused depending
on the stage and primer used in each individual experiment;
105-106 cpm of the labeled primer were added to the specified
amount at poly(A+)RNA; annealing temperatureswere either
Drosophila Pglym Genes
at 25" or at 30" as described. Primers were extended using 40
units of avian myeloblastosis virus reverse transcriptase (Life
Sciences, St. Petersburg, Florida). Extension products were
separated on 6% acrylamide 3/4-3 X TBE buffer gradient gels
(BICCINet al. 1983) and visualized by standard autoradiographic techniques.
PUriFcation of synthetic oligonucleotide primers: Oligonucleotides designed for DNA sequencing were synthesized
(Syracuse University DNA and protein core facility) and purified via passage overa NENSORB PREPcartridge (DuPont).
Sequencing strategies:To sequence genomic regions spanning each gene, subclones were isolated that contained regions of bacteriophage clones that were homologous to the
cDNA clone used to isolatethem. The sequencing strategy for
these subclones involved sequencing from either end and
using oligonucleotide primers to bridge these sequences. Isolated cDNA clones were subcloned either into pBluescript I1
SK+ or mp18/19 by subcloning liberated EcoRI fragments of
hgtl 1 inserts or performing BamHI, Hind111 directional subcloning. Genomic fragments were cloned into the appropriate
polyiinker sites
of pBluescript I1 SK'. Single-stranded DNA was
sequenced by the dideoxy chain termination method incorporating [a-35S]dATP (SANCER
et al. 1977; HONC1982) and
visualized on 6% acrylamide 3/4-3 X TBE buffered gradient
gels ( BICGINet al. 1983). Sequencing of double stranded plasmid templates utilized the following protocol. Two micrograms of plasmid DNAin 20 pl ofdH,O weredenatured by the
addition of 2 pl of 2 M NaOH, 2 mM EDTA (pH 8.0), and
incubation was at 85" for 5 min. The mixture was neutralized
by the addition of 3 pl of 3 M sodium acetate (pH 5.2) and 7
pl of dH,O. Absolute ethanol (75 pl) was added to the DNA
and placed at -70" for 5 min. DNA was recovered by centrifugation for 10 min at 4". The DNA pellet was washed in 200 pl
of 70% ethanol and dried under vacuum. The pellet was resuspended in 7 pl of dH,O and annealed to a primer at 37" for
30 min. All sequencing utilized the Sequenaseversion 2.0DNA
sequencing kit as per the manufacturer's direction (U. S. Biochemical Corp.). Autoradiograms of sequencing gelswere
read and sequences determined using a digitizer and computer programs from DNASTAR (Madison, Wisconsin). Both
Pglym sequences have been submitted to the GenBank data
base. Pglym78 has been assignedaccession
no. L27654.
Pglym87 has been assigned accession no. L27656.
In situ hybridization to polytene chromosomes: Polytene
chromosomes were dissected from late 3rd instar larvae of
Oregon-R D . melanogaster that had been raised at 18" on banana media. Dissection of salivary glands, separation and fixation and pretreatment of polytene chromosomes was performed essentiallyas described in ENCELSet al. (1986).
Chromosomes were examined under phase contrast optics
and the best chromosomes selected for hybridization. From
250 ng to 1 pg of unrestricted bacteriophage DNA containing
the genomic DNAof
interest werebiotinylatedutilizing
Bio-1 1-dUTP (ENZO diagnostics) ina random priming reaction (Boehnnger Mannheim), and unincorporated nucleotides were separated by passage overa Sephadex G50 spin column (Boehringer Mannheim). Twenty micrograms of salmon
sperm DNAwere added and precipitated by the addition of 2.5
volumes of ethanol. The probe was recovered by centrifugation, resuspended in 75 p1 of hybridization buffer (0.6 M NaCl,
50 mM sodium phosphate buffer, pH 6.8, 1 X Denhardt's solution, 5 mM MgCl,), denatured by boiling for 3 min, and
chilled on ice. The probe solution was added to the slides,
sealed with a coverslip, and incubated in a moist chamber at
58" for 12-18 hr. The slides were washed
three times for 15min
each in 2 X SSC, and the signal was detected using the Detek
I-hrp kit as specified
by the manufacturer (ENZO Diagnostics).
355
Assays of PGLYM activiq The assays for PGLYM activity
was modified from the methods of GMSOLIA
and CARRERAS
(1975).This assay measures formation of phosphoenolpyruvic
acid (PEP) by increase in optical density at 240 nm by rate
limiting amounts of mutase in an excess of enolase. One gram
of adult flies was ground in 10 ml of 50 mM Tris, pH 7.4, 2%
of saturation with respect to phenylthiourea, using a motorized mortar and pestle. The homogenate was spun at 10,000
rpm for 15 min at 4". The supernatant was recovered by filtration through small gauge nylon mesh. Fifty microliters of
homogenate were added to 3 pl of 50 mM %phosphoglyceric
acid (PGA), trisodium salt (Sigma), 10 pmol ofMgSO,, 100
pmol of Tris, pH 7.0, and 10 units of enolase in a I-cm light
path quartz cell. The rate of increase of absorbance at 240 nm
was measured spectrophotometricallyusing a Gilford 240 recording spectrophotometer, and the amount of enzyme activity in each sample was determined utilizing a molar extinction coefficient of 1.75 X lo3 for PEP and the fact that 1.5
mmol of 2-PGA formed corresponds to 1 mmol of PEP measured. Protein concentrations were determined using the BioRad microassay with bovine serum albumin as a standard, utilizing the reagent and instructions supplied by the
manufacturer. Lethal mutations in the 87B region were obtained from the Umea stockcenter, Sweden,and fromJ. GAUSZ
at the Hungarian Academy of Sciences.
RESULTS
Isolation of the D . melanogaster PGLYM encoding
genes: To isolate D . melanogaster Pglym, a full length
cDNA which encodes the human brain
PGLYM isoform
(TSUJIMO
et al. 1989) was used to screen a Drosophila
cDNA library prepared using adult mRNA and the vector, hgtl1 (VON KALM et aZ. 1989).Five cDNAclones were
isolated. All contained anidentically sized insert of 1 kb.
The clones were sequenced and found to contain a
PGLYM encoding open reading frame
by comparison to
the human PGLYM amino acid sequence. ADrosophila
cDNA clone was used to probe a Southernblot of Drosophila genomic DNA. This analysis (Figure 1) revealed
multiple regions within the Drosophila genome having
various intensities of hybridization. This suggests the
likelihood that more than one gene may be present in
the genome. A cDNA clone was used to screen a Drosophila genomic library and threegenomic clones were
isolated. Partial restriction maps of the inserts in these
phage were assembled and identify two non-overlapping
clones (Figure 2). By comparison to the genomic Southern blot, each clonecould be characterized as containing restriction fragments relating toone or the other
of
the hybridizing regions. This confirmed existence of two
PgZym regions within the Drosophila genome. The
nucleotide sequenceof each was determined according
to the strategy indicated in Figure 2. The cytogenetic
localization of each gene on the Drosophila polytene
map was determined by in situhybridization to D . melanogasterpolytene chromosomes. Using the entireinsert
of each lambda clone as a probe for hybidizationn to
polytene salivary gland chromosomes, a single site for
each was identified (data not shown). One gene is located at87B4,5 on the rightarm of chromosome 3 and
356
P. D. Currie and D. T. Sullivan
FIGURE
I.-Cenomic Southern restrictedwithvarious enzymes and the resultant blot probed with the isolated Pglym
cDNA. Lane 1, EcoRI; lane 2, BamHI; lane 3, HindIII; lane 4,
SalI; lane 5, BglII; lane 6, PsA; and lane 7, SphI. The arrows
represent positions of DNA standards of 12,5,4,3,2
and 1 kb,
respectively.
the other atregion 78 A/B on the left arm of chromosome 3. Thesegenes
will be called Pglym87 and
Pglym 78, respectively.
Sequencing of Pglym78 The PGLYM encoding
cDNA used to isolate the genomic clones was demonstrated, by sequencing, to correspond to thetranscribed
region of Pglym 78. The sequence of 2290 nucleotides of
the Pglym 78 genomic region beginning 632 nucleotides
upstream of the translation start ATG and extending 381
nucleotides beyond the translation stop codon and the
conceptual translation of the coding region is shown in
Figure 3. Pglym78 contains two introns of 469 and 58
nucleotides, respectively, and three exons of 102, 498
and 166 nucleotides. The promoter region of Pglym78
does not contain a sequence which is an identifiable
TATA box sequence. Exon one, as defined by the 5'
extent of the cDNA, wasconfirmed by primer extension
and RNase protection assays (see below). Nucleotide sequencing of the 3' end of the cDNA clone revealed that
polyadenylation site occurs 20 bp downstream from the
sequence, AATAAA, which matches the consensus
polyadenylation signal.
Transcript analysis of Pglym78 To confirm the transcription unit suggested by the comparison of cDNAand
genomic sequences, RNase protection and primer extension analyses were performed on the Pglym 78 transcript (Figure 4). The5' end of the Pglym78 transcript
was defined using RNase protection and transcripts
from the 1.3-kb BglII subclone. The transcripts pro-
FIGURE2.-Sequencing strategy of the isolated Pglym regions from the genomic A clones. Stipled bars represent the
BglII bands that hybridized ingenomic Southern blot analysis
(see Figure 1, lane 5). Enzymes used to construct the maps
were, E, EcoRI; Bg, BglII;B, BamHI. Sequencing from restriction sites is denoted with a plain arrow while the oligonucleotide primers used to bridge these sequences are shown as +m.
Sequencing of the cDNA and genomic sequence covered both
strands of each gene.
tected by this probe confirmed the boundaries of intron
1 and revealed possible multiple transcription start sites.
This transcript protected four RNA fragments, 240,188,
180 and 170 nucleotides long, respectively (Figure 4,
panel A). The 240-bp fragment represents that portion
of the Pglym transcript that extends from the 3' end of
the probe to the 5' end of exon two. The remaining
bands presumably represent transcript fragments extending from the 3' end of exon one to multiple transcription start sites at thebeginning of the Pglym 78 transcript. The 180-bp band corresponds insize tothe
putative start site defined by the 5' end of the cDNA. To
confirm that themultiple bands protected by transcripts
from the 1.3kb subclone represent multiple start sites,
the probe was shortened by restriction at a PvuII site.
This probe extends 427 nucleotides 5' to the ATG and
protects bands identical in sizeto those protected by the
full length probe. Since the 240 nucleotide fragment is
known to derive from exon two, the other threemust be
derived from the region between the PvuII site and the
3' end of exon one. This result requires that these fragments overlap, have a common 3' end andthat theirsize
differences is at their 5' ends,(Figure 4, panel B) . This
is most simplyinterpreted as multiple transcription start
sites. This interpretation is confirmed by primer extension analysis of the Pglym 78 transcript (Figure 4). A 22base primer correspondingto nucleotide positions 7-28
3' to the ATG was annealed to 5 pg of poly(A+) RNA
Drosophila Pglym Genes
357
ATACAGCCGACTCGATTGAAAACAGCACGGCGCGATTCTCGTCGATCACGTCGTCCAAGTCGGGCCTGCGGACAAAGGAGGTTTCGAAATTACTTGATAA-~OO
AAACCAGTTTCGGGGCGTGGAATCCTTTGTCTCACTTCACAAAGACGTTCTTCGTCGGACAAGCGGGTGGCCACCAGTTGGTCCAGCCCGTTGCCCGACA-200
GCTGGGGCCAAGTGCAGCGACGCGGTGACCTTTTCGCACATGTTTAAGTGACAAAAAGTTGGAATTTATGGGCGCAAAAACAGCCAAATTTAAAACACGA-300
ATATAGTGATGACAAACGCGAGCGATAACATCTGCGCGGGGGATCCAACCTGTTGCACACGACAAACTAATTGGACATTTGAAGGGGTGTAGCTTGACGC-400
GCTATTCCAATAAGACCATACTCCGGAAATCCAATCAGAGAAGGCACCGCTCTGCGCACTACGTTTTTCCAACAGAACGGCAATATCGCGGCCTGACAGC-500
*
*
*
T G C G A C C T T T G G A C G G C T C C C T T G T A G G T C A A A C T T C A ~ G T C G C ~ T - 6 o Q
G C A G C WATGGGCGGCAAGTACAAGATTGTGATGGTGCGCCACGGCGAG~CCGAGTGGAACCAGGAGAATCAGTTCTGCGGC-700
M G G K Y K I V M V R H G E S E W N Q E N Q F C G
TGGTACGACGCCAACCTCAGCGAAAAGG~GAGTGTTACATCAATTTGAAATTCAAACACATATGTGGATTAAAATCTGATGTATGTGTGTACATCTAGT-~OO
W y D A N L S E K G------------------------------------------------------------------------TCCAATCGATGATCCTTTGCTTTGATCAGTTTCACAATAGTTTTTCATTTCATTTACCCAAATTAATCTTAATTGACCTACAACAACCTGCAACATTATC-900
-
....................................................
TCATTCAAACTTTTCTGTGATAAAAATAGGTCTCTACAAACAAAAGCTTTCCAGTGCAATTTTCAGAAGCATCAAAATTGTAGTTTCTTACACTTTAAAG-1000
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
~
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
~
"
"
"
"
"
~
TGCAAAAATTTGCTAACTCATATGTTTCTAAAGTTCAGATTATGCTTATGAGCATAATTTGATAGGTTATCGCCTTTATTTAACCCTAGTAATCAAATTA 1 1 0 0
""""""""""""~""""~"""""""""""""~"""""""""""""""~"""
AAATTAAACGCATGCCTTTCAATTAGATTAAATTGGCTTTAAGTTCGTTACCCAACACGGCGTATACTTAATGTTCGCTAAACATTTGCCCAAAC~GTC-1200
Q
"
~
~
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
~
AGGAGGAGGCTCTAGCCGCGCGCAAGGCCGTCAAGGATGCCGGCCTGGAGTTCGATGTGGCCCACACCTCGGTGCTAACCC~CGCCCAGGTGACGCTGGC-1300
E E A L A A R K A V K D A G L E F D V A H T S V L T R A Q V T L A
CAGCATCCTGAAGCCAGTGGCCACAAGGAGATCCCCAATCCAGAAGACCTGGCGCCTGAACGAGCGCCACTACGGTGGACTCACTGG~CTGAACAAGGCC-1400
S I L K P V A T R R S P I Q K T W R L N E R H Y G G L T G L N K A
GAGACCGCCGCCAAGTACGGCGAGGCCCAGGTGCAGATCTGGCGTCGCAGCTTCGACACCCCGCCACCACCGATGGAGCCGGGCCATCCGTACTACGAGA-1500
E T A A K Y G E A Q V Q I W R R S F D T P P P P M E P G H P Y Y E ACATCGTCAAGGATCCCCGCTACGCCGAGGGTCCCAAGCCCGAGGAGTTCCCCCAGTTCGAGTCCCTCAAGCTGACCATCGAGCGCACACTGCCCTACTG-1600
I V K D P R Y A E G P K P E E F P Q F E S L K L T I E R T L P Y W
GAACGACGTCATCATTCCCCAGATGAAGGAGGGCAAGCGCATCCTGATCGCTGCCCACGGCAACAGCCTCCGTGGCATCGTCAAGCATTTAGACA~GAG-1 7 0 0
N D V I I P Q M K E G K R I L I A A H G N S L R G I V K H L D N
BOO
CATGGCAGATATATCTAACATATCATCTTTTTACCATATCCTTGTCAATC~ACCTTTCTGAGGACGCCATCATGGCCCTAAATCTGCCCACGGGCATTC-1
L
S
E
"
"
"
"
"
"
"
~
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
"
~
D
A
I
M
A
L
N
L
P
T
G
I
P
900
CGTTCGTCTACGAGCTGGACGAGAACTTCAAGCCGGTGGTGTCCATGCAGTTCCTTGGCGACGAGGAGACCGTGAAGAAGGCCATCGAGGCTGTGGCCGC-1
F V Y E L D E N F K P V V S M Q F L G D E E T V K K A I E A V A A
CCAGGGCAAGGCCAAG~AGCACGCAGGCCACACCTCATGAGAATCCACTCAAATACGGCACAAGTCAGAGCATTCTGATTACGCTTCTGTTCAAGATGT-20~0
-
Q
G
K
A
K
AGCCA~CGTTGTTGTTGTTCGCGTAGCAGTAATTGGTTCATCCCCACTACAGT~AATAAATAATTTATTATTACACTCAAGAGGACAGGCGGA-2100
GGAGGCTCATTTATTGTCGGCTACTTCTTCTTATTCTTACTACGTCTTGGCATTCAGGCCAGGATGCACTTGGCGCACCTTGTACGTTCGTGAGAGCGCG-2200
TGTGCACAATCGCACCGTGTCCTGATTGCAGTGCTCTCGTGCTCGTGGTGCAGCTTCACACACAGTGTCTGGCGCGCGCTCCGTGCTAGCC
-2300
FIGURE
3,"Seqvence and conceptual translationof Pglym78. Nucleotides are numbered with reference to thefirst nucleotide
of the sequenced region and the introns are indicated via a dashed line which breaks the derived amino acid sequence. Splice
site donor and acceptorsequences are in underlined type. Multiple transcription start sites denoted
are
by asterisks. Untranslated
underlined.The translation start
codon, stop codon and the polyadenylation consensus signal
is double
regionsof the transcript are
underlined type. Pglym78 has accession no. L27654 in the GenBank database.
from embryos and larvae. Three major extension products of 96, 106 and 114 nucleotides were detected. The
5' endsof these extension products corresponddirectly
to the positions of transcript start sites predicted by the
RNase protection studies.
The 3' end of the Pglym78 gene was analyzed by
RNase protection experiments using transcripts from
the 3.0-kb BglII subclone used in sequencing of the 3'
half of this gene. RNase protection using a transcript of
the entire 3.0-kb insert protects three products. These
are 334,259 and246 nucleotides long, respectively. The
334nucleotide fragment corresponds in size to a fragment which extends from the 5' end of exon 3 to the
polyadenylation site. The 259-nucleotide band defines a
fragment from the 5' end of the probe to the3' end of
exon 2. The origin of the 246nucleotide fragmentwas
unclear and was further investigated by using a probe
shortened at BspHI
a
restriction site (Figure 4, panel D).
This transcript protected fragmentsof 246,150 and 144
nucleotides. The 246nucleotide fragment is the same
246nucleotide fragment protected by the longer transcript (above) indicating that this fragment is from the
region 3' to Pglym78. The origin of this fragment is
likely from a downstream gene since in the Northern
blots described below a Pglym78 transcript that is 246
nucleotides longer at the 3' end would have been resolved in addition to the1-kb PgZym78 transcript. If this
fragment corresponded to a duplicated, alternatively
spliced Pglym exon it would have been discovered by
inspection of the sequence. The bands of 150 and 144
nucleotides represent shorteningof the 334nucleotide
protected fragment corresponding to exon 3 and the
3'-untranslated region of the transcript as expected. The
two bands could reflect alternative polyadenylation sites
or some nuclease digestion at the ends of the protected molecules in A rich regions. Itis unclear if the
original 334-nucleotide fragment might bea doublet
since a 6-nucleotide difference probably
would not be
358
D.
P.
Currie
Sullivan
and D. T.
Pglym 78A/B
1 KB
5' Probes
A
PRIMER
EXTENSION
3 Probes
B
C
D
. ~.. ._-
ss4bp,
" .
146bO-D
259bw
246b-
1l 4 b d
240bH
IO6bH
96bp-1
FIGURE
4.-Transcript
mapping of
the Pglym78 transcript. The Pglym78
exons are solid rectangles. Expected
protection products for the individual
probes, as derived from genomic and
cDNA sequence comparisons, are
shown as thin lines labeled with their
predicted sizes. Four probes A, B, C and
D are indicated as stippled lines and the
results using each is shown in panels A,
B, C and D, respectively. Primer extension analysis of the 5' end of Pglym78
mRNA is shown inthe right most panel.
Lane R, ribonuclease treated probe; L,
third instar larva] R N A s ; A, adult RNAq
and E, embryo RNAs.
150be
147bH
resolved in the
higher
molecular
weight 334
nucleotide band.
Northern blot analysis using RNA prepared fromdifferent developmental stages of the Drosophila life cycle
detects a single 1.0-kb transcript which variesin relative
amounts throughout development (Figure
5 ) . Pglym78
transcripts are very abundant at the earliest embryonic stages, probably as a result of transcription during
oogenesis. They become undetectable as embryogenesis
proceeds and an increase in abundance is evident by
hour 12 of development. During larval, pupal and adult
stages, the amountof Pglym78 transcript rises, fallsand
rises again in a pattern identical to transcripts from
other glycolytic genes of Drosophila (SHAW-LEE
et al.
1991, 1992; ROSELLI-REHFUS
et al. 1992).
Nucleotide sequencingof Pglym87: The sequencing
strategy for Pglym87is illustrated in Figure 2. The nucleotide sequence of 1838 nucleotides from Pglym87 and
its conceptual translation is given in Figure 6. A putative
open reading frame, but lacking the first two codons,
which might encode a PGLYM subunit is present and
intact through an appropriately positioned translation
stop codon. A methionine codon which might serve as
FIGURE
5.-Northern blot analysis ofthe Pglym 7 8 transcript.
Samples of 6.2 pg of RNA from various developmental stages
of Drosophila total were electrophoresed, blotted and hybridized with a cDNA probe representing the full length Pglym78
transcript. Time points are in hours (hr) from egg deposition
until the l&hr time point where time is then in days (dy).
an initiating codon does exist, in frame, eight amino
acid residues into this reading frame. Translation of an
mRNA of this type would yielda PGLYM like molecule
lacking the first nine amino acids but otherwise intact.
Pglym87lacks the introns present
in Pglym78which suggests that the Pglym gene duplication occurred by retrotransposition and raises the possibility (discussed below) that Pglym87 is a pseudogene.
Transcript analysisof Pglym87: RNase protection experiments were performed in parallel with and using the
same RNA samples as those of PgZym78 (Figure 4).
Drosophila PgZym Genes
359
CCGGACTAGACTATCCGATCTTCGCCTGAATACGAGGCAGCGCTATCATTACTTCGAGGAAGAACGTCTATGTGGACTTCTATATTCCATTACACTATAT-~OO
ATATATTATGATACTAGTGAACTCTCGCTGATATCTCGCAAGCTAGCTCTAGCTAAATATTACGAATTATATGCTGAGTACCACCTTTGTACAAGACCTT-ZOO
AGGCTTATAGACAATACCACTATGAAGCTGGCTGAGTCAACCATTACTAGCAGTGAACGTAGAATATAATCATTAAAGAATATTTCCATACTGTTAGAAT-300
GCATGAAATAAAAACCTCAAAGTAACCTTATTTTCAACGAACTTTTCTTACTGTCATAGGTTCAACCCCGGCAGGTTTTAAATTAAGGAAAAAATTCCAA-400
TTTCCGGAAAATTGGAAACCAACCGATATAGACTGCAAAACACTTTCCAAATTCAAAGTGTCCCAAAATGTTGCTCTGCCAGAATTCGTGGTGTGGCCAA-500
TTGCTGTGTCGCGTGGCGAACATCGAAGCCCGCTGCCCGTTCTCGAAGTCATCATCCGGCGGAAAGCAGGCCGAGAAGAAGGGCAAGTACAGGATTGTGA-600
G K Y R I V M
TGGTTCGTCATGGAGAGTCCGAATGGAATCAGAAGAACCTATTCTGCGGATGGTTCGATGCCAAGCTATCGGAGAAGGGGCAGCAGGAGGCCTGTGCCGC-700
V R H G E S E W N Q K N L F C G W F D A K L S E K G Q Q E A C A A
TGGCAAAGCGCTCAAAGATGCCAAAATTGAGTTCGATGTGGCCCACACCTCGGTGCTGACCAGAGCCCAGGAGACACTCAGGGCCGCCCTAAAGTCCAGC-800
G K A L K D A K I E F D V A H T S V L T R A Q E T L R A A L K S S
GAGCACAAGAAGATTCCGGTGTGCACCACGTGGCGCCTCAACGAGCGCCACTACGGCGGACTGACGGGACTGAATAAGGCCGAGACCGCCAAGAAGTTCG-~~O
E H K K I P V C T T W R L N E R H Y G G L T G L N K A E T A K K F G
GCGAGGAGAAGGTTAAGATCTGGCGCCGCAGCTTTGACACCCCACCGCCGCCAATGGAGAAGGATCACGAGTACTACACCTGCATTGTTGAGGATCCGCG-1000
E E K V K I W R R S F D T P P P P M E K D H E Y Y T C I V E D P R CTACAAGGAGCCAACTGAAGCCGGAGAGTTCCCCAAATCCGAGTCCCTCAAGCTCACCATCGAGCGCACCTTGCCCTACTGGAACGAGGTCATAGTGCCC-1100
Y
K
E
P
T
E
A
G
E
F
P
K
S
E
S
L
K
L
T
I
E
R
T
L
P
Y
W
N
E
V
I
V
P
CAGATCAAGGATGGAATGCGTGTCCTGATCGCCGCCCACGGCAACAGCCTGCGGGGCGTGGTCAAGCACTTGGAATGCATCTCTGACAAGGACATTATGA-1200
Q I K D G M R V L I A A H G N S L R G V V K H L E C I S D K D I M S
GCCTCAACCTGCCCACTGGCATTCCTTTCGTCTACGAACTAGACGAGAGCCTCAAGCCTTTGGCCACGTTGAAGTTCCTCGGCGATCCGGAGACGGTGAA-1300
L N L P T G I P F V Y E L D E S L K
P L A T L K F L G D P E T V K GAAGGCAATGGAGTCGGTGGCCAACCAGGGCAAAGCCAAG~ATCCGTGATTTTGGGTATACCCAGCAGCCTCCGAAGAAGGTCAGAGCAATCAGCTCAA-1400
K A M E S V A N Q G K A K
TGCACTTGTATGCCAAACAAAAGCGGCGAATCGAAAGTGGTGCTCACTGTGGCCAGGCGAATGTCAAGAAGCAAATAAA~TT~T~TGArT~TAATTGACA-1500
ATGAAGAGTCAACAACGATGGGCTGGCTGGCGTTGGTCTCTTGGTCTTGGTCTTGGTCTTGGTCTCTGGAACTCAGGCGACG~TTCTGGGTCCAGGGTCT-1600
GGGTCTGGATGTGGGTCTGGGTCTGGGTCTGAGCCAAAGTCAAGCCGATTATATGTGTTCCATTGGCGCTGCTTAAGAATCAGGAGTCGAGTGTGTGTCC-1700
TCAATTTAGTGCTTSGAATGAGTCTGAGAGCGCCTGTATTGCACAGGTCATTATTGTTAATGAATCTCATTAGGCTATGGAAC~CCAAGTATAGTGACTA-1800
AGTATAATCTGAGGCCGGTCCAGGCGAACTGTATCTCGA-1839
FIGURE
6."Sequence and conceptual translation of Pglym87. Nucleotides are numbered with reference to the first nucleotide
of the sequencedregion. Translation from codon threeyields a PGLYM similar sequence. A putative stop codon and the consensus
polyadenylation signal are double underlined. The 3' direct repeat sequences 3' to the putative coding region are underlined.
Pgbrn87 has accession no. L27656 in the GenBank database.
These experiments failed to detect any transcript from
Pglym87 in the RNA from larval and adult stages. To
determine if Pglym87 transcripts could be detected at
any developmental stage, RNase protections (internally
controlled with probes to Pglym78 and ribosomal protein RP49) were performed using RNA isolated from
animals which span the complete developmental spectrum, i. e . , the same RNA samples as used for the northern blots. Theseexperiments(data
not shown) also
failed to find any developmental stage at which a
Pglym87 transcript could be detected.Attempts to identify a Pglym87 transcript using primer extension analyses were also unsuccessful.
Analysis of the 87B region: Prior to completion of the
above analysis whichfailed to finda Pglym87 transcript,
we theorized that a mutation in a PGLYM encoding gene
that reduced the total level of PGLYM enzyme significantly might be lethal. GAUSZ
et al. (1981) analyzed the
87B cytogenetic region and identified 15 complementation groups represented in a set of 88 alleles. This is
an average of 5.9 alleles per complementation group.
This suggests that the region may be close to mutationally saturated and raises the possibility that one of
the lethal complementationgroupsmightrepresent
Pglym87. Representatives of eachcomplementation
group were assayed,as heterozygotes, for significant reduction of PGLYMactivity. Resultsfrom this analysis are
presented in Figure 7. Reduction of PGLYM activity
Complementation Group
FIGURE7,"Measurement of the PGLYM activity in animals
heterozygous for a representative allele of each of the lethal
complementation groups of the 87B region. Each allele was
assayed intriplicate. The standard error is indicated by vertical
bars.
could not be associated with any lethal complementation group. Therefore if Pglym87 has a function, it is
likely not a vital function so that mutations in Pglym87
were not detected in this set of complementation
groups. In addition, the amount
of enzyme encoded by
Pglym87 is likely a small fraction of the total phosphoglycerate mutase activity since heterozygous deletions
also showed no reduction in activity.
P. D. Currie and D. T. Sullivan
360
Alignment of DrosophilaandHuman
PGLYM s
8 7 -g k y r i u H U
RHGESEUNQK
NLFCGUFDAK
LSEKGQOEAC
ARGKRLKDAK
IEFDUAHTSU78-HGGKYKIUHURHGESEUNQENQFCGUYDANLSEKGQEEALARRKAUKDAGLEFDUAHTSU
HB-HRA-YKLULIRHGESAUNLENRFSGUYDADLSPRGHEEAKRGGQALRDRGYEFDICFTSU
HH-HAT-HRLUHURHGESTUNQENRFCGUFDAELSEKGTEERKRGAKAIKDAKHEFDICYTSU
60
LTRAQETLRAALKSSEHKKIPUCTTWRLNERHYGGLTGLNKAETAKKFGEEKUKIURRSF-120
LTRRQUTLRSILKPUATRRSPIQKTURLNERHYGGLTGLNKAETARKYGERQUQIURRSF
QKRRIRTLUTULDAlDQflULPUURTURLNERHYGGLTGLNKRETRAKHGEAQUKIURRSV
LKRAIRTLURILDGTDQflULPUURTURLNERHYGGLTGLNKAETAAKHGEEQUKIURRSF
FIGURE
8.-Alignment of the conceptually
translatedproteinsequence
of
the pglrm genes afDrosophi~a and humans. Dashes indicate gaps introduced
DTPPPPflEKDHEYYTCIUEDPRYKEPTERGEFPKSESLKLTIERTLPYUNEUIUPQ
I KDG-180
DTPPPPHEPGHPYYEN I UKDPRYAEGPKPEEFPQFESLKLTIERTLPYUNDUI
I PQnKEG
DUPPPPtlEPD
HPFYSHISKD
RRVRDLTEDQLPSCESLKD
TIARRLPFWN
EEIUPQIKEG
DlPPPPflDEKHPYYNSISKE RRYRGL-KPGELPTCESLKDTIRRRLPFUNEEIUPQIKRG
to maintain alignment.
IIRULIRRHGNSLRGUUKHLECISDKDIHSLNLPTGIPFUYELOESLKPLATLKFLGDPET-2+0
KRlLlAAHGNSLRGIUKHLDNLSEDAIHALNLPTGIPFUYELDENFKPUUSHQFLGDEET
KRULIAAHGNSLRGIUKHLEGLSEEAIHELNLPTGlPlUYELDKNLKPIKPflQFLGDEET
KRULIRRHGNSLRGIUKHLEGIISDORIHELNLPTGIPIUYELNKELKPTKPIIQFLGDEET
UKKAflESURN
UKKAIEAUAR
URKAIIEAURA
URKAIIERURA
QGKRKQGKRKQGKAKK
OGKFIK-
Evolutionarycomparison of the Drosophila Pglym
genes: Alignment of the putative proteins encoded by
the Drosophila genes and the human muscle and brain
isoforms, PGA"M and PGAM-B, respectively (Figure 8 )
reveals that the potentialprimary amino acid sequence
is highly conserved. The putative Drosophila PGLYM
proteins are 73% identical. The first eight amino acids
of PGLYM87 are shown in small letters because no putative initiator methionine precedes them, yet they are
conserved. Comparison of these amino acid sequences
was used to determine whether the Drosophila Pglym
duplication represented a duplication having functional
simiIarity to the mammalian Pglym duplication which
resulted in genesfordifferent
isoforms. PGLYM78
seems to beequally similar to the PGA"B isoform, 68%
and the PGA"M isoform, 69%. At positions of amino
acid substitutions where there can either be
a PGA"M
or PGA"B type amino acid within the PGLYM78 protein, 12 amino acids are identical to the PGA"B gene
and a further 12 substitutions are PGA"M type. Putative PGLYM87has 64% amino acid identity with the human muscle isoform PGA"M protein and 68% amino
acid identity with general isoform PGAM-B. Analysis of
the substitutions of PGLYM87 reveals that eight are of
the PGA"B type and 18are ofthe PGM-M type. As the
PGA"B isoform is the mammalian general isoform and
the PGA"M form is considered tissue restricted isoform this difference may indicate that Pglym8 7 at some
point encodeda protein that was restricted to a specific
cell type, e.g., muscle.
Analysis of Drosophila Pglym sequences reveals that
each shows a nonrandom distribution of nucleotides in
the third codon position (Table 1) with an abundance
of C and G and scarcity ofA typical ofother Drosophila
TABLE 1
Codon usage and bias of the Drosophila Pglyrn genes
Bases in the third position of codons
Gene
U
C
A
G
Total
Pglym87
% of total
Pgly m 78
% of total
29
11.4
22
8.6
103
40.6
123
48.0
27
10.6
14
5.5
95
37.9
97
37.9
254
256
genes (STARMER
and SULLIVAN
1989). This nonrandom
distribution is reflective of codon utilization bias.
Pglym87 has a somewhat less extreme codon bias than
does Pglym78 indicating some loss of bias since the
duplication.
DISCUSSION
Molecular genetics of Pglym78 Pglym 78 has properties similar to other Drosophila genes which encode
enzymes of the glycolytic pathway.These includea promoter region which has no TATA box but has multiple,
closely spaced transcription initiation sites and a transcript expression pattern similar to all of the otherDrosophila glycolytic genes tested so far (SHAW-LEE
et al.
1991, 1992; ROSELLI-REHFUS
et al. 1992). This similarity
in temporal regulation of glycolyticgenes reinforces the
notion that there may be common regulatory control
mechanisms exercised onthe transcription of these
genes.
The Drosophila Pglym 78 gene and the human
muscle
Pglym gene (TSUJIMO
et al. 1989) are the only Pglym
genes forwhich intron positions have been determined.
Each of these genes has two introns. The first intron of
Drosophila Pglym Genes
Pglym78 interrupts the gene at codon35 and does not
have a counterpart in the human gene. Thesecond intron of Pglym78 interrupts codon 201. The second intron of the humanmuscle gene interrupts codon197 of
the PGLYM open reading frame. This may represent an
intron position conserved in the two genes. It has been
suggested that intron positions may “slide” around a
conserved position and observations consistent with
this interpretation have been made when comparisons
between homologous genes that have been separated
for a substantial evolutionary period have been made
(MCKNIGHT
et al. 1986; WCHIONNI
and GILBERT
1986).
Origin of the duplication: Since Pglym87 lacks the
introns present in Pglym78 and appears at least two
codons shortat theN terminus seems
it
highly likelythat
Pglym87 arose through retrotransposition from a
Pglym78 transcript. Additional support of this interpretation comes from several features of the Pglym87 region. The intron-less Pglym87 gene is on a separate
chromosome arm from the intron containingPglym78
gene. Astretch of 20 nucleotides, 19 nucleotides 3’ to a
putative polyadenylation signal (underlined in Figure
6), contains 53% adenosineresidues. This could represent the remnant of a poly(A) tail from the transcript.
A similar percentage of adenosine residues 3’ to the consensus polyadenylation signal in human Pgk, another
retrotransposed gene, has been used to argue that this
region represents the corrupted remainder of the reverse transcribed transcript’s poly(A) tail (MCCARREY
and THOMAS1987).
While evidence presented
here
indicates that
Pglym87 is almost certainly the result of a retrotransposition mediated gene duplicationevent, examination
of the nucleotide sequence around Pglym87 suggests
that additionalor more complex mechanisms may have
played some role in the origin of Pglym87. Sequences
immediately 3’ to the putative poly A remnant contain
four perfect copies and two corrupted copies of the sequence GGTCTT in adjacentdirect repeats. These
could encode a poly prolineglutamic acid peptide on
the opposite strand.Immediately downstream from this
repeat arefive perfect copies and one corrupted
copy of
therepeat GGGTCT which, on the opposite strand
would encode a poly proline-arginine peptide. These
repeats are similar to the S repeats found within the
transposable element Hobo, which is composed of the
sequence TGAGGTCTT (STRECKet al. 1986). Also, in
one openreading frameof the Ac transposable element
of maize there is a region of repeated sequence which
encodes 10 tandem prolineglutamicacid repeats reminiscent of the hobo S repeats (MULLER-NEUMANN
et al.
1984). Thepresence of these repeat sequences, immediately three primeto the Pglym87 gene that are
similar
to those found in transposons of the hobo/Activator
class, suggests an interesting mechanism for this gene
duplication. This class of transposable elements trans-
361
pose through a DNA intermediate (&VI et al. 1991).
If reverse transcript DNA were to be incorporated into
an actively transposing hobo/Ac type element, this
could provide a mechanism for integration into different regions of the genome using the usual pathway of
the parent transposon.
While retrotranspositions of genes and pseudogenes
is found in the genomesof most species and seems particularly frequent in mammalian genomes, only a few
examples have been noted in the Drosophila genome.
This is despite demonstration of the presence in Drosophila cells of retroviral-like reverse transcriptase enzyme (HEINEet al. 1980) and an abundance of transposable elements that are of the retrotransposon class.
On the basis of different chromosomal location of the
Rh4 opsin gene in Drosophila virilis and D. melanogaster and the absence of introns in the D.virilis gene
(NEUFELD
et al. 1991) have suggested that the Rh4 opsin
gene ofD. virilis has undergone retrotransposition during its evolution in the D . virilis lineage. In addition, it
is likely that retrotransposition has played a significant
role during theevolution of the Gapdh gene duplication
in the genus Drosophila (WOJTASet al. 1992).
Another example is that of the retrotransposedcopies
of the alcohol dehydrogenase ( A d h ) genes of Drosophila yakubaand Drosophila teissieri UEFFSand ASHBURNER
1991; JEFF-S et al. 1994). In many respects these retrotransposed Adh genes aresimilar to Pglym87. Their Adh
open readingframes remain intact except for a deletion
of the Adh start codon. Each has lost the introns typical
of an Adh gene andthey occupy chromosomal positions
which are not homologous to the Adh position in D.
melanogaster. These putative pseudogenes aremore
similar to one another than to their respective functional Adh genes, even containing similar deletions in 5’
non codingDNA. The chromosome positions of the two
genes in D. teissieri and D. yakuba is conserved which
indicates that the two genes have a common origin and
that the duplication arose before the species diverged.
Is Pglym87apseudogene?: Since Pglym87 has its origin through retrotransposition the question that naturally arisesis: Isit a pseudogene?Consistent with the view
that itis a pseudogeneis its origin through retrotranposition and the lack of detectable transcripts. However,
we recognize the difficulty in proving that there areno
Pglym87 transcripts. These could exist in very low levels,
at restricted periods of development or in a very fewcells
and have gone undetected in our assays. Also consistent
with the hypothesis that Pglym87 is a pseudogene is a
consideration of its 5’end. If the methionine encoded
by
the codon homologous to codon nine of Pglym78 is in
fact a translation start in Pglym87, then this would result
in the loss of eight relatively conserved amino acids from
the N terminus. Furthermore, it is not clear why the
sequence encoding the six amino acids preceding the
methionine would be conserved as compared to
362
P.
and
D. Currie
Pglym78. Another reason to think that Pglym87 might
be a a pseudogene
is that retrotransposedgenes have the
diffkulty of not carrying sufflcient information for their
correct transcription. In fact, two of the better analyzed
retrotransposed
genes,
human
testes
specific
Pgk
(McCARREY 1994) and D. uirilis Rh4 opsin (NEUFELD
et al. 1991) both of which are known to function, seem
to be reverse transcriptase copies of a transcript substantially longer at the 5’ end than the mature mRNA.
In each case copies of sequences required fortranscription apparently were included during retrotransposition. Inspection of the sequence 5’ to the PGLYM reading frame of Pglym8 7 reveals no significant similarity to
the sequence 5’ to Pglym78.
In support of Pglym87 being functional are several
aspects of itsmolecular evolution including observations
that the PGLYM reading frame is intact and that the
gene shows at least some retention of codon utilization
bias. In addition, the crystal structure of PGLYM from
yeast has been solved and specific amino acids have been
implicated in catalysis. These are histidine 8, histidine
181 and serine 11. Histidine 181 has been conserved in
all mutases. Serine 11 has been implicated in the catalytic mechanism as aphospho
ligand (WHITEand
FOTHERGILL-GILMORE
1992). All these amino acids are
conserved in the PGLYM78 protein with histidine 8 and
181 corresponding to histidine 12 and 188 respectively
and serine11is conserved as serine 15. In ahypothetical
PGLYM87 protein, the two histidine residues and the
serine are also conserved and are located in identical
positions as in PGLYM78. Thus, both Drosophila genes
possess the capability to encode those amino acid residues shown to be important
for PGLYM catalytic activity.
If the PGLYM 87 is nonfunctional, then it should not,
by
the conventional view ofpseudogenes, be subject to constraint on its sequence divergence. Conservation of coding frame, codon bias or catalytically relevant specific
amino acids would not be expected in a pseudogene.
This is clearly the case for mammalian pseudogenes,
where the biased base composition seen in the third
codon position of functional genes is almost absent and
the loss of reading frameis commonly observed (KIMURA
1983).
Therefore we are left to chose between two hypotheses regarding the natureof Pglym87, neither of which
is in total accord with present views regarding the molecular evolution of products of retrotransposition and
pseudogenes. The structural properties suggest
Pglym87 is a pseudogene with a peculiar rate and pattern of evolution, while aspects of its molecular evolution suggest it might be a functional gene having puzzling aspects in its structure. Inthis regard it seemsworth
noting that in Drosophila species, two other candidates
for pseudogenes have been characterized and in each,
peculiar evolutionary properties have been noted. The
similarity between Pglym87 and the retrotransposed
Adh sequences in D. teissieri and D.yakuba sequences
D. T. Sullivan
are particularly striking. When these sequences were
identified, theirpeculiar pattern of evolution, including
retention of reading frame, codonbias and higher rate
of substitution at synonymous nucleotide positions was
noted UEFFSand ASHBURNER 1991). Recently an alternative interpretation forthese sequences has been offered
in which it is suggested that the retrotransposed Adh
sequences captured upstream exons with the resultant
evolution of a new gene (LONGand LANGLEY 1993). However, one of the principle reasons offered to support the
hypothesis that the new gene, called jingwei, is functional is its pattern of molecular evolution. The ultimate
answer with respect to jingwei function remains to be
verified by molecular analysis of its putative transcrip
tion and translation products,butit
seemshighly
improbablethatan
analogous situation applies to
Pglym87. Inspection of the sequence makes no suggestion of captured exons in the 5’ sequence. In another
example of a pseudogene in Drosophila, we have recently analyzed the molecular evolution of an Adh sequence that is present in species of the repleta group
(SULLIVAN
et al. 1994). This sequenceis found in all species ofthe repleta groupand is located immediately u p
stream of the functional Adh genes. It is not a product
of retrotransposition since it retains introns homologous to the functional Adh genes. Analysis of the evolution of this sequence reveals it is diverging more slowly
than the neighboring intergenic regions, some codon
bias is retained, and the ratio of K, to K , ranges from 10
to 14, in painvise interspecific comparisons using the
sequences from seven species. These ratios are only
slightly lowerthan those obtained from equivalent comparisons of the functional Adh genes and clearly not
unity, as wouldbe expectedfor a pseudogene.Yet, in this
set of Adh homologous sequences there is no openreading frame of appreciable length common to this set of
seven species. Since there are only the two Adh pseudogenes and this report of Pglym, it is unclear at present
what the peculiar patterns of molecular evolution of
these putative pseudogenes might mean. Itseems likely
that more extensive analysis of these examples and the
discovery of others may reveal features of genome evolution in Drosophila not yet appreciated or understood.
LITERATURE CITED
BENDER,
W., P. SPIERER
and D. S. HOGNESS,
1983 Chromosomal walking and jumping to isolate DNA from the Ace and rosy loci and
the bithorax complex in Drosophila melanogaster. J. Mol. Biol.
168: 17-33.
BENTON,
W. D., and R. W. DAVIS,
1977 Screening lambda gt recombinant clones by hybridization to single plaques in situ. Science.
1 9 6 180-182.
BIGGIN,
M. D., T. S. GIBSON
and G. G . HONG,1983 Buffer gradientgels
and [95S] label as an aid to rapid DNA sequence determination.
Proc. Natl. Acad. Sci. USA 80: 3963-3965.
C a w , B. R.,T. J. HONG,S. D. FINDLEY
and W. M. GELBART,
1991 Evidence for a common evolutionary originof inverted repeat transposons in Drosophila and plants:hobo, Activator, and Tam3.
Cell.
66: 465-471.
CASTELLA-ESCOLA,
J., D. M. OJCIUS,
P. LEBOWLCH,
V. JOULIN,
Y.BLOWQUIT
et al., 1990 Isolation and characterizationof the gene encoding
Drosophila Pglym Genes
the musclespecific isozyme ofhuman phosphoglycerate mutase.
Gene. 91: 225-232.
ENGELS,
W.R.,C.R.PRESTON,P.
THOMPSON
and W.B. EGGLESTON,
1986 In situ hybridization to Drosophila salivary chromosomes
with biotinylated DNAprobes and alkaline phosphatase. Focus. 8:
6-8.
FOTHERGILL-GILMORE,L.A., and H. c . WATSON, 1989 Phosphoglyceratemutase. Adv. Enzymol. 62: 227-313.
GAUSZ,
J. H., G.GWRKOVICS,
A.
A.
M. AWAD,J. J. HOLDEN
and
D. ISH-HOROWICZ,
1981 Genetic characterization of the region
between 86F1,2 and 87B15 on chromosome three of Drosophila
melanogaster. Genetics 98: 775-7
GILBERT,
W., 1986 On the antiquity of introns. Cell. 46: 151-153.
GRISOLIA,
S., and J. CARRERAS,
1975 Phosphoglycerate mutase from
yeast, chicken breast muscle, and kidney (2,SPGAilependent).
Methods Enzymol. 42: 435-450.
HEINE,
C.W.,D.C.
KELLY and R. J. AVERY,
1980 The detection of
intracellular retroviral like entities in Drosophilamelanogaster
cell culture. J. Gen. Virol. 4 9 385-395.
HEINISCH,J., R.C. VON BORSTEL
and R.RoDtcro, 1991 Sequence and
localization of the geneencoding yeast phosphoglycerate mutase.
Curr. Genet. 20: 167-171.
HONG,
G. F., 1982 Asystematic DNAsequencing method. J. Mol. Biol.
158: 539-549.
JEFFS,P., and M. ASHBURNER,
1991 Processed pseudogenes in Drosophila. Proc. R. SOC.Lond. Ser. B 244: 151-159.
JEFFS, P., E. C. HOLMES
and M. ASHBURNER,
1994 The molecular evolution of the alcohol dehydrogenase and alcohol dehydrogenaserelated genes in Drosophilamelanogaster
species subgroup.
J. Mol. EVO~11: 287-304.
M. 1983 The Neutral Theory of Molecular Evolution. CamKIMURA,
bridge University Press.
LEBOULCH,
P., V. JOULIN, M. C. GAREL,
J. ROSAand M. COHENSOLAL,
1988 Molecular cloning and nucleotide sequence ofmurine 2 , s
bisphosphoglycerate mutase cDNA. Biochem.
Biophys.
Res.
Commun. 156: 874-881.
LIS, J. T., J. A. SIMON
and C. A. SUTTON,1983 New heat shock puffs
and Kalactosidase activity resulting from transformation of Drosophila with an hsp70-lac2 hybrid gene. Cell 35: 403-410.
LISSEMORE,
J. L., J. T. COLBERT
and P. H.QUAIL,
1987 Cloning of cDNA
for phytochrome from etiolated Cucurbita and coordinate photoregulation of two distinct phytochrome transcripts. Plant Mol.
Biol. 8: 485-496.
LONG,
M. Y., and C. H. LANGLEY,
1993 Natural selection and the origin
of jingwei, a chimeric processed functional gene in Drosophila.
Science 260: 91-95.
MANIATIS, T., E. F. FRJTSCH
and J. SAMBROOK,
1989 Molecular Cloning:
A Laboratory Manual, Ed. 2. Cold Spring Harbor Laboratory,
Cold Spring Harbor, N.Y.
MCCARREY,
J. R., 1994 Evolution of tissue-specific gene expression in
mammals. Bioscience 44: 20-27.
McCAWY, J. R., and K. THOMAS,
1987 Human testis-specific Pgkgene
lacks introns and possesses characteristics of a processed gene.
Nature 3 2 6 501-505.
McKnight, G.L.,P. J. O’Hara and M. L. PARKER,
1986 Nucleotide
sequence of the triosephosphate isomerase gene from Aspergillus
nidulans, implications for differential loss of introns. Cell 46:
143-147.
MIRANDA,
A. F.,E.R.PETERSON and E. B. MASUROVSKY, 1988 Differential expression of creatine kinase and phosphoglycerate mutase
isozymes during development in aneural and innervated human
muscle cultures. Tissue Cell 20: 179-191.
MORELLE,
G., 1988 A plasmid extraction procedure on a miniprep
scale. Focus 11: 7-8.
MULLER-NEUMANN,
M., J. L. YODERand P. STARLINGER,
1984 The DNA
sequence of the transposable element Ac of Zea mays. Mol. Gen.
Genet. 198: 19-24.
NAKATSUJI, Y., K. HIDAKA,
S. TSUJIMO,
Y. YAMAMOTO,
T. MUKAIet al.,
1992 A single MEF-2 site is a major positive regulatory element
required for transcription of the muscle-specific subunit of the
363
human phosphoglycerate mutase gene inskeletal and cardiac
muscle cells. Mol. Cell. Biol. 12: 4384-4390.
NEUFELD, T. P.,
R.W. CARTHEW and G. M. RUBIN,
1991 Evolution of
gene position-chromosomal arrangement and sequence comparison of the Drosophila melanogaster and Drosophila virilisSina and Rh4 Genes. Proc. Natl. Acad. Sci.USA 88: 10203-10207.
OMENN,
G. S., and S. C. Y. CHEUNG,
1974 Phosphoglycerate mutase
isozyme marker for tissue differentiation in man. Am. J. Hum.
Genet. 26: 393-399.
ROSELLI-REHFUS,
L.,F. YE, J. L. LISSEMORE
and D. T. SULLIVAN,
1992 Structure and expression of the phosphoglycerate kinase
gene ( P g k ) of Drosophila melanogaster. Mol. Gen. Genet. 235:
213-220
SAKODA,
S., S. SHANSKE,
S. DI MAURO and E.A. SCHON,
1988 Isolation
of a cDNA encoding the B isozyme of human phosphoglycerate
mutase (PGAM) and characterization of the PGAM gene family.
J. Biol. Chem. 263 16899-16905.
SANGER,
F., S. NICKLENand A.R. COULSON,
1977 DNA sequencing
with chain terminating inhibitors. Proc. Natl. Acad. Sci.USA 74:
5463-5467.
SHAW-LEE,
R. L., J. L. LISSEMORE
and D. T. SULLIVAN,
1991 Structure
and expression of the triose phosphate isomerase ( T p i ) gene of
Drosophila melanogaster. Mol. Gen. Genetics. 230: 225-229.
SHAW-LEE,J.R.,
L. LISSEMORE,
D. T. SULLIVAN
and D. R. TOM, 1992 Alternative splicing of fructose 1,Gbisphosphatealdolase transcripts
in Drosophila melanogaster predicts 3 isozymes. J. Biol. Chem.
267: 3959-3967.
SOUTHERN,
E. M., 1975 Detection of specific sequences among
DNA fragments separated by gel electrophoresis.J. Mol. Biol. 98:
503-517.
STARMER,
W. T., AND D. T. SULLIVAN
1989 A shift in the third-codonposition nucleotide frequency in alcohol dehydrogenase genes in
the genus Drosophila. Mol. Biol. Evol. 6: 546-552
STRECK,
R. D., J. E. MAcGmuand S. K BECKENDORF,
1986 The structure of hobo transposable elements and their insertion sites.
EMBO J. 5: 3615-3623.
SULLIVAN,
D. T., W. T. CARROLL,
C.L. KANIK-ENNUIAT,
Y. HIITI,J. A.
LOVEIT
et al., 1985 Glyceraldehyde-%phosphatedehydrogenase
from Drosophilamelanogaster: identification oftwoisozymes
forms encoded by separate genes.J. Biol. Chem. 260: 4345-4350.
SULLIVAN, D.T., W. T. STARMER,
S. W. CURTIS,M. MENOTTI-RAYMOND and
JUNGSUN YUM, 1994 Unusual molecular evolution of an Adh
pseudogene in Drosophila. Mol. Biol. Evol. 11: 443-458.
TSUJIMO,
S. SAKADO,
S. MIZUNO,R.
KOBAYASHI, T. SUZUKI
et al.,
1989 Structure of the gene ecoding the muscle-specific subunit of human phosphoglycerate mutase. J. Biol. Chem. 264:
15334-15337
URENA,
J. M., X. GRANA,
L. DELECEA,
P. R u I ~
J.,CASTELLA et al., 1992 Isolation and characterization of a cDNA encoding the B-isozyme of
rat phosphoglycerate mutase. Gene. 113: 281-282.
VON KALM, L., J. WEAVER,
J. D E ~ ~ A RR.CJ.OMACINTIW
,
and D. T. SULLIVAN,
1989 Structural characterization of the aglycerol-%phosphate
dehydrogenaseencoding geneof Drosophila melanogaster. Proc.
Natl. Acad. Sci. USA 86: 5020-5024.
WHITE,
M. F., and L.A. FOTHERGILL-GILMORE,
1992 Development of a
mutagenesis, expression and purification system for yeast phosphoglycerate mutase. Investigation of the role ofactive-site
Hisl8l. Eur. J. Biochem. 207: 709-714.
WHITE, P.
J., J. NAIRN,
N. C. PRICE,
H. G. NIMMO,
J. R. COGGINS
et al.,
1992 Phosphoglycerate mutase from Streptomyces
coelicolor
A3(2): purification and characterization of the enzyme and cloning and sequence analysis ofthe gene. J. Bacteriol. 174: 434-440.
WOJTAS,K. M., L. VON KALM, J. R. WEAVER
and D. T. SULLIVAN
1992 The
evolution of duplicate glyceraldehyde-%phosphate dehydrogenase genes in Drosophila. Genetics. 132: 789-797.
YANAGAWA,
S.-I., K. HITOME,
R. SASAKI
and H. CHIEA,
1986 Isolation and
characterization of
cDNA
encoding rabbit reticulocyte 2,3bisphosphoglycerate synthetase. Gene 44: 185-191.
Communicating editor: V. G. FINNERTY