Volume 9 Number 9 1981
Nucleic Acids Research
DNA sequence of the rat growth hormone gene: location of the 5' terminus of the growth hormone
mRNA and identification of an internal transposon-like element
Guy S.Page, Susan Smith and Howard M.Goodman
Howard Hughes Medical Institute Laboratories, Department of Biochemistry and Biophysics,
University of California, San Francisco, CA 94143, USA
Received 29 January 1981
ABSTRACT
The present communication describes the molecular cloning and DNA
sequence determination of the rat growth hormone (rGH) gene. The rGH gene
was cloned on an 11 kilobase EcoRI fragment of total rat DNA; it has four
intervening sequences which correspond in position to those of the human
growth hormone (hGH) gene. One of the intervening sequences in the rGH
gene contains a possible transposable element: a 200 base pair direct
repeat that is itself flanked by an exact 15 base pair direct repeat.
The DNA sequence was used to estimate the location of the 5' end of
the mature growth hormone mRNA. By S1 nuclease mapping it was located
approximately 25 bases "downstream" from a TATAAA sequence presumed to play
a role in initiation of transcription of the rGH gene.
INTRODUCTION
Many of the processes of development, tissue differentiation, and the
responses of an organism to environmental changes occur at the primary
genetic level, that is, at the level of genetic transcription. However,
while we know that gene transcription can show tissue specificity, we have
as yet very little direct information about the way (or ways) in which such
specificity is conferred, and to what extent it depends upon the primary
structures (i.e., DNA sequences) of the genes involved.
As part of our approach to these general questions we have been studying
the rat growth hormone (rGH) gene, whose tissue-specific expression can
be examined both in vivo and in cultured cell lines. Growth hormone (GH)
is produced by a specialized subset of the cells of the anterior pituitary
in response to specific signals from the hypothalamus. In addition, GH is
produced and secreted by cells of the closely related rat pituitary tumor
lines GH1, GH3, and GC (1). In these cells there is preliminary evidence
that the level of GH mRNA can be controlled by both glucocorticoid and
thyroid hormones (2). Thus an excellent system is available for an
examination of hormone and tissue-specific control of genetic transcription
and its relation to gene structure.
© IRL Press Limited, 1 Falconberg Court, London W1V 5FG, U.K.
In this paper we report the molecular cloning and complete DNA
sequence determination of the rGH gene. We also report the localization of
the 5' end of the mature rGH mRNA, i.e. the probable site of initiation of
transcription, and the finding of a possible transposon-like element located
in one of the intervening sequences.
MATERIALS AND METHODS
Restriction Enzyme Analyses
All digestions were done with enzymes purchased from either New England
Biolabs, Bethesda Research Laboratories, or Boehringer Mannheim.
Digestions were usually done with a substantial excess of enzyme and
approximately in accordance with the conditions provided by the
manufacturer. Gel electrophoretic separations, unless specified otherwise,
were performed either on 1% agarose gels or 6% acrylamide gels in TBE buffer
(0.089M Tris, 0.089M Boric acid, FJ.miM EDTA, pH 8.3). Gels were stained
with ethidium bromide and visualized by illumination with UV light.
RPC-5 Fractionation of DNA Fragments
A 0.9 x 22 cm column of RPC-5 (3) was packed under pressure (400
p.s.i.) and equilibrated with 1.25M NaOAc, 10mM Tris.HCl pH 7.5, 1mM EDTA.
Three milligrams of EcoRI-cleaved Hooded rat DNA (gift of A. Ullrich) was
loaded onto the column in the same buffer, and the fragments eluted with a
200 x 200 ml gradient of 1.45 to 1.55M NaOAc (10mM Tris.HCl pH 7.5, 1mM
EDTA) at a flow rate of 1.5 ml/min (~400 p.s.i.). Fractions were assayed
for absorbance at 260 nm and aliquots of selected fractions electrophoresed
on an agarose gel.
Five micrograms of DNA from each DNA-containing fraction were assayed
by Southern hybridization for rat growth hormone (rGH) sequences. The
elution peak was spread broadly across 30 fractions. The rGH sequences eluted
in four fractions, which were pooled and the DNA precipitated from them.
Sucrose Gradient Fractionation of Restriction Fragments
Sedimentation in sucrose gradients was used as a purification step
both for the "arms" of the cloning vector λCharon-4A and for the 11 kb rGH
EcoRI fragment. The gradients were run essentially as described in Lawn,
et al., (4). Gradients of 10% to 40% sucrose (10 x 16 ml) in 1M NaCl,
0.01M Tris (pH 7.4), 0.001M EDTA, 0.5 µg/ml ethidium bromide were
centrifuged at 27,000 rpm in a Beckman SW27 rotor. Time of centrifugation
depended on the size of the fragment to be purified. Specific bands
visualized in the centrifuge tube by UV illumination were collected from the
side of the tube with a syringe. More disperse fragment populations were
fractionated through a hole punched in the bottom of the tube.
Molecular Cloning of the rGH Gene
Ligation of purified λCharon-4A "arms" to the 11kb EcoRI fragments
containing rGH, in vitro packaging of the recombinant molecules,
transfection, and identification of products were all carried out exactly
as described elsewhere (5, 6).
A description of the methods used for subcloning the 11kb rGH EcoRI
fragment and specific parts of it into pBR322 may be found in Ullrich, et
al. (7).
Subcloning of rGH Sequences into M13-derived Vectors
The 1.5 kb PvuII-D fragment containing most of the rGH gene was cloned
from the 11kb rGH EcoRI fragment in pBR322 (p.gRGH) into the vector M13mp5
as described by Cordell, et al., (6). Recombinant phage were propagated in
the bacterial strain 79.02 (gift of B. Gronenborg), and single stranded
template DNA for sequencing prepared from them as described by Winter and
Fields (8). The clones were designated MP5.gRGH.1 (+) and MP5.gRGH.2 (-)
depending upon the orientation of the PvuII-D fragment in the
single-stranded vector.
DNA Sequencing
Most of the DNA sequencing was carried out by the chain-termination
method of Sanger (9) using as templates MP5.gRGH.1 and MP5.gRGH.2 (see
above). Specific DNA primers were prepared from the rGH cDNA clone (10)
and from the gene PvuII-D fragment by digestion with selected restriction
endonucleases and purified by polyacrylamide gel electrophoresis. The DNA
sequencing reactions were carried out as described by Cordell, et al., (6).
The Maxam and Gilbert procedure (11, 12) was used to sequence regions
flanking the PvuII-D fragment and to confirm selected sequences obtained by
the chain-termination method within the PvuII-D fragment. All
manipulations closely followed the published protocols.
Growth of GH Cells and RNA Extraction
GH1 cells (1) were grown either in suspension or in monolayers in
Dulbecco's Modified Eagle's medium supplemented with 10% Fetal Calf Serum
and 1 nM triiodothyronine and 50 yf-1 dexamethasone (13). In order to obtain
cytoplasmic RNA, cells (either trypsinized from monolayers or directly from
suspension culture) were pelleted, washed once with 10 mM Tris.HCl pH 7.4,
5 mM NaCl, 1 mM MgCl2 (RSB), and suspended in cold RSB containing 0.4%
NP-40 and 0.06% sodium deoxycholate. The suspension was kept on ice for
five minutes, after which the lysed cells were removed by centrifugation.
The supernatant was brought to 0.25M NaCl, 0.05 M Tris.HCl pH 7.4, 0.05 M
EDTA and extracted once each with phenol and chloroform. RNA was
precipitated from the final supernatant with ethanol.
RNA-DNA Hybridizations and S1 Nuclease Mapping
The RNA and DNA fragments to be hybridized together (see Results for
details) were first mixed and coprecipitated with ethanol. The pellet was
dried under vacuum and dissolved in 20 µl of a buffer containing 80%
HCONH2, 0.2 M NaCl, 20 mM PIPES pH 6.5, and 0.5 mM EDTA. The solution was
boiled for three minutes, then placed at 50° for three hours. The resultant
hybrids were diluted to 300 µl in S1 buffer (0.25M NaCl, 0.03M NaOAc
pH 4.6, 0.001M ZnSO4) and 300 units of S1 nuclease added. Unhybridized
nucleic acids were digested at room temperature for 30 minutes. The
products of the digestion were analyzed on an 8% polyacrylamide DNA
sequencing gel (see description of sequencing protocols, above).
RESULTS
Molecular Cloning and Restriction Map of the Rat Growth Hormone Gene
In order to facilitate identification of a rat growth hormone (rGH)
gene clone, total rat DNA was enriched for rGH gene sequences prior to
cloning using a two-step process. DNA was extracted from total tissue of
Hooded rats and digested to completion with EcoRI. The resultant fragments
were fractionated on an RPC-5 column (14, 15), and aliquots of the column
fractions were assayed by hybridization with a labeled rGH cDNA probe (10);
a single peak of hybridization was observed. Appropriate column fractions
were pooled and the restriction fragments separated by sedimentation
through a sucrose gradient (4). The rGH sequences were again located by
hybridization with the cDNA probe, and the peak fractions were used for
cloning. Enrichment for rGH-specific sequences by these procedures was
approximately 50- to 100-fold.
The rat genomic EcoRI fragments enriched for rGH gene sequences were
then ligated to purified λ Charon 4A "vector arms", packaged in vitro and
plaques screened as described elsewhere (16, 5, 6). Several putative
rGH-containing clones were isolated. The rGH isolate chosen for detailed
study, designated λ.gRGH, contained an 11 kilobase (kb) EcoRI restriction
fragment that hybridized to the rGH cDNA probe. A restriction map of this
fragment is shown in Fig. 1. Map positions for XbaI, SstI, BglII, HindIII,
SalI, and BamHI sites were obtained from λ.gRGH by standard methods. To
simplify more detailed restriction mapping, the approximately 6kb
EcoRI-to-HindIII fragment (HindIII-A in Fig. 1) in λ.gRGH was transferred
to the plasmid vector pBR322. This sub-clone, designated p.gRGH, was used to
reconfirm the map positions of those enzymes listed above and to position
the PstI and PvuII sites.
[Figure 1: restriction map with a kilobase scale (1 to 11 kb) and rows for XbaI, BglII, BamHI, SalI, HindIII, PstI, and SstI. Legend: mRNA coding sequence; intervening sequence; repeated sequence; direction of transcription.]
Figure 1. Restriction Map of the rGH Gene: The locations of the restriction
sites indicated were determined as described in Materials and Methods.
An expansion of the rGH section of the 11 kb EcoRI fragment is shown at the
top with the location of the rGH gene, its intervening sequences, and the
200 bp repeated sequences. The number under each restriction fragment
indicates its size in kilobase pairs. The left-most and right-most (except
for PstI and PvuII) vertical bars are always the EcoRI sites.
Each of the enzymes BglII, XbaI, and SstI was found to separate portions
of the rGH protein-coding sequence, i.e., to cleave within the gene.
As sites for these enzymes are not found in the cDNA sequence, it seemed
likely that they indicated the presence of intervening sequences.
Furthermore, the cDNA hybridized to regions between the XbaI site and the
distal BglII site, and between this BglII site and the SstI site. This
observation demonstrated the presence of protein-coding sequence in the
XbaI-BglII and BglII-SstI intervals. As such, these three sites were taken
to indicate the presence of at least three distinct intervening sequences.
Location of the rGH Gene and Determination of the Orientation of Transcription
The approximate location of the rGH gene on the cloned 11 kb EcoRI
fragment was determined by hybridization of selected digests of the clone
with nick-translated rGH cDNA probe (17). For example, the probe was found
to hybridize to BglII fragments C and D, but not to BglII-B (see Fig. 1 for
fragment nomenclature). It hybridized to HindIII-A but not to HindIII-B.
These data limit the location of the rGH gene to between the
BglII-B/BglII-C junction and the HindIII-A/HindIII-B junction. By similar
hybridization analyses the location of the rGH gene was determined (Fig. 1).
The orientation of transcription of the rGH gene was determined by
hybridization analysis with 5'- and 3'-specific probes prepared from the
rGH cDNA clone. Briefly, the 800-base pair HindIII fragment containing the
cloned rGH cDNA was purified from the plasmid p.cRGH (pRGH-1 of Seeburg, et
al., (11)) and cleaved with HhaI. The 220-base pair 5'-end fragment and the
275-base pair 3'-end fragment were isolated by preparative gel
electrophoresis. Each of these fragments was labeled by nick-translation and
hybridized to BglII, SstI and double digests of the genomic clone (Fig. 2).
The 5'-specific cDNA fragment hybridizes to BglII-C, SstI-A, and to a
fragment the same size as BglII-C in the double digest (Fig. 2,b). The
3'-specific cDNA fragment hybridizes to BglII-D, SstI-B, and to a 3.8 kb
BglII-SstI fragment in the double digest (Fig. 2,c). These results specify
the orientation of transcription as shown in Fig. 1. The genomic clone
therefore contains about ? kb of DNA sequence "upstream" from the putative
5' end of the gene.
Determination of the Number of rGH Genes
There are multiple growth hormone genes in the human genome (10). A
comparable multiple gene structure does not appear to be found for the
growth hormone gene in rat. Digestion of rat DNA with EcoRI and
hybridization with the rGH cDNA probe yields a single band corresponding to
the cloned 11 kb restriction fragment described above. However, to determine
whether the hybridization seen with the rGH cDNA probe is in fact due to
the presence of a single rGH gene, a direct comparison was made between
restriction digests of rat genomic DNA and the cloned genomic rGH sequence.
Rat genomic DNA and DNA from the rGH clone were digested with PvuII and
PstI, each of which cleaves the cloned rGH gene several times (Fig. 1).
Figure 2. Orientation of Transcription of the rGH Gene: In each lane 2 µg
of DNA from the plasmid clone p.gRGH was digested with the following
enzymes: the first lane, with BglII; the second, with SstI; and the third,
with both enzymes. The three sets are (a) ethidium bromide staining pattern;
(b) autoradiograph of hybridization with 5'-specific probe; and (c)
autoradiograph of hybridization with 3'-specific probe. Numbers at the left
refer to restriction fragment sizes in kilobase pairs. The conditions of
hybridization were those described by Cordell, et al., (6).
The genomic digests and appropriate amounts of the digests of the cloned
DNA were electrophoresed on the same gel, transferred to nitrocellulose,
and hybridized with nick-translated probe prepared from the cloned rGH cDNA
sequence. The results of this comparison are shown in Fig. 3. As there
are no restriction fragments in the digests of the rat genomic DNA that
cannot be accounted for by fragments from the rGH clone, it seems most
plausible to conclude that there is only one growth hormone gene in the rat
genome. This conclusion is supported by more detailed restriction mapping
data (10). The data do not, however, rule out the possibility of several
identical genes in identical sequence environments.
[Figure 3: autoradiogram; marked fragment sizes 11, 1.5/1.6, 1.1, and 0.85 kb.]
Figure 3. Hybridization Analysis of Rat Genomic DNA: A comparison was made
between selected restriction digests of the 11 kb λ.gRGH clone (vis. Fig.
1) and total rat genomic DNA. Each hybridization is to either 10 µg of
genomic DNA or 39 pg of the cloned DNA. Particular digests were as
indicated in the figure. All digests were electrophoresed together. After
electrophoresis the gel was treated for 20 min. in 50 mM HCl (this may
explain the poor recovery of the smaller restriction fragments). All digests
were transferred to nitrocellulose and hybridized together. Hybridization
was under described conditions (6), for seven days at a probe concentration
of 5 x 10 cpm/ml (specific activity 1-2 x 10 cpm/µg). The autoradiogram was
exposed for five days at -70° with a Dupont Cronex Lightning Plus
intensifying screen. The numbers refer to size in kilobase pairs of
indicated restriction fragments. Hybridization to the 0.6 kb genomic PstI
fragment was visible in the original autoradiogram, although at a reduced
intensity. This hybridization band did not reproduce.
DNA Sequence of the Rat Growth Hormone Gene
Both the chain-termination method of DNA sequence determination (9)
and the chemical degradation method developed by Maxam & Gilbert (11, 12)
were used to obtain the complete sequence of the rGH gene. A diagram of
the complete sequencing strategy is given in Fig. 4.
[Figure 4: diagram of the sequencing strategy.]
Figure 4. DNA Sequencing Scheme for the rGH Gene: The bottom line shows the
location of restriction sites used in the DNA sequence determination (N.B.
This is not a complete restriction map. Such a map is available on request).
The numbers refer to the distance in base pairs from the BglII site and
correspond to those used in Fig. 5. Above the restriction map is shown a
representation of the rGH gene. Protein-coding portions of the gene are
shown as open boxes; 5'- and 3'-untranslated regions of the mRNA are shown
as cross-hatched boxes; and the intervening sequences are the single line
regions designated A through D. The direction of transcription of the rGH
gene is left to right. The uppermost portion of the figure represents the
DNA sequencing scheme. Each arrow corresponds in position, direction, and
length to one sequence determination. Thin arrows represent sequence
obtained by the chain-termination method, and the thick arrows, sequence
obtained by the Maxam and Gilbert method.
The chain-termination method of DNA sequencing relies on the
availability of a single-stranded template for the DNA synthesis reaction.
Such a template was obtained for the PvuII-D fragment (Fig. 1), which spans
most of the rGH gene, by transferring this fragment to the single-stranded
cloning vector M13mp5 (20). A decanucleotide "linker" (CCAAGCTTGG)
containing the HindIII restriction site was ligated to a PvuII digest of
p.gRGH. The products were cloned into M13mp5 and isolates containing the
PvuII-D fragment identified by hybridization and restriction analyses. This
approach yielded recombinant phage clones containing each of the strands of
the PvuII-D fragment for use as sequencing templates.
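As a small illustration of why the decanucleotide linker suits this purpose, the sketch below checks two properties of the sequence CCAAGCTTGG quoted above: that it contains the HindIII recognition sequence AAGCTT, and that it is self-complementary, so that a single oligonucleotide can anneal to itself to form a duplex linker. The helper name `reverse_complement` is introduced here for illustration only; nothing in this sketch is taken from the authors' protocol.

```python
# Illustrative check of the linker sequence quoted in the text.
# The function and constant names are ours, not the authors'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


LINKER = "CCAAGCTTGG"     # decanucleotide linker from the text
HINDIII_SITE = "AAGCTT"   # HindIII recognition sequence

# 0-based offset of the HindIII site inside the linker (-1 if absent)
site_offset = LINKER.find(HINDIII_SITE)

# A self-complementary oligonucleotide equals its own reverse complement
is_self_complementary = reverse_complement(LINKER) == LINKER

print(site_offset, is_self_complementary)
```

Because the linker reads the same on both strands, the HindIII site is regenerated whichever way the linker joins a fragment end.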
Primers for the chain-termination sequencing reactions were prepared
from two sources: selected fragments from the cDNA clone were used to prime
reactions from protein-coding portions of the gene; fragments from p.gRGH
were used to obtain sequence within the intervening sequences (IVS).
The Maxam-Gilbert method was used to sequence regions 5' and 3' to the
PvuII-D fragment, as well as to clarify any ambiguous sequence within this
fragment.
The entire sequence of the rGH gene is shown in Fig. 5. Approximately
half of the gene was sequenced on both strands. Those portions that were
not sequenced on both strands were either sequenced on the same strand by
both methods (Fig. 4; circa nucleotide 850) or on the same strand with
different primers (Fig. 4; circa nucleotide 1150). The data for those few
portions for which a single sequence determination was made were wholly
unambiguous in interpretation.
Localization of the "CAP" Site of Mature rGH mRNA
From the DNA sequence presented in Fig. 5 we were able to make a
prediction of the location of the rGH mRNA "CAP" site and to test this
prediction. At position 200-205 is the sequence TATAAA, which has been
identified as part of the signal for initiation of transcription by RNA
Polymerase II (21). The initiation site is usually located at an A residue
25 ± 1 bases from the TATAAA sequence (22).
In order to experimentally determine the position of the "CAP" site of
the rGH mRNA we extracted cytoplasmic RNA from the rat pituitary tumor line
GH1 after the cells had been grown in triiodothyronine and dexamethasone
for three days (23). Induced GH1 mRNA was expected to contain about 1%-5%
rGH message (2). PstI fragments of p.gRGH were end-labeled with γ-[32P]ATP
and T4 polynucleotide kinase and the 0.6 kb PstI-B fragment (Fig. 1)
purified by polyacrylamide gel electrophoresis. This PstI fragment overlaps
the region where the "CAP" site should occur (i.e., ca. nucleotide 231 in
Fig. 5). The labeled PstI-B fragment was hybridized to GH1 mRNA under high
cgtaccattqoocataaacttggcaaaggogqcsggtggaaaggtaagatcaqggaogtgaccgcaggagag
1
30
60
cagtqgaqaogcgatgtqtgggaggagcttctaaattatcx»tcagcacaagctgtcagtggctocagcca
90
120
tgaataaatgtataqggaaaaaqqcaggagocttggggtcgaggaaaacaggtagggtataaaaagggcat
150
180
210
geaacy^accaaatccagcacxxit^agoccagattccaaactactcaggtoctgtggacagatcactgag
240"
270
-26
Met Ala Ala A
tggcg ATG GCT OCA G gtaagcatgogcagatcocqctgggtgtggtttggaccaaagagccttgaa
300
330
gatggatctgagacttctagtqtgacjagcatcccaacttcoaoccatgttggqaacattctgggaocctat
360
390
420
gqggattgggagagattggtecttgctcccagcctcctcctgtcctectgtctctctttctag
450
480
-20
-10
Gin Thr Pro Trp Leu Leu Thr
CAG AC 1 CCC TOG CIC CTG ACC
510
-1 1
Ala Gly Ala Phe Pro Ala Met
GCT GGT GCT TTC CCT GCC ATG
20
sp Ser
AC TCT
Phe Ser Leu Leu Cys Leu Leu Trp Pro Gin Glu
TIC AGC CTG CTC TOC CTGCTGTOG OCT CAA GAG
540
10
Pro Leu Ser Ser Leu Phe Ala Asn Ala Val Leu
CCC TTG TOC W7T CTG TTT GCC AAT GCT GTG CTC
570
30
Arg Ala Gin His Leu His Gin Leu Ala Ala Asp Thr Tyr Lys Glu Phe
CGA GCC CflG CAC CTG CAC CflG CTG GCT GCT GAC ACC IK: AAA GAG TTC gtaagt
600
630
tcctoqqtqttqggtgcxstgactgtggaagcaggaaaggggcaogatoccaccctcgooccgaatccctgc
660
690
720
ooocaqqaagteataggaggaaactatgocgttagatgagcagaaaaagaatgggtogtocataagcagta
750
'
780
atgacagaqagggctgqagagatggctcagtggttaagagcacoogactgctcttccaaaggtoctgagtt
8iO
840
caattoecagcaaccacatqgtggctcacaaccatctgtaaagagatoogatgasctcttctggtgtgtct
870
900
930
gaagacagctacaqtgtacttatataataaacaaataaatctttaaaaaaaaaaacaaaaaoggggctgga
960
990
gagatggctcagoggttaagagogcocgactgctcttocagaggtcatgagttcaattoscagcaaccaca
1020
1050
1
tggtqgctcacaat3catctgtaaaqagatctgatgocctcttctggtgtatctgaagacagctacagtgta
080
1110
1140
cttatatataataaataaataaatctttaaaaaaaacaaaacaaaaacaaaaacaaaacagtaatgacaga
1170
1200
_^
Glu Arg Ala Tyr lie Pro Glu
gagtcacaagctggtccctcagtgactacctttcctccag
GAG OGT GOC TAG ATT OOC GAG
1230
1260
40
50
Gly Gin Arg Tyr Ser l i e Gin Asn Ala Gin Ala Ala Phe Cys
QGA CAG CGC TAT TOC ATT CAG AAT GOC CAG GOT GOG TTC TOO
1290
1320
60
70
l i e Pro Ala Pro Thr Gly Lys Glu Glu Ala Gin Gin Arg Thr
ATC CCA GOC CCC ACC GGC AAG GAG GAG GOC CAG CAG AGA ACT
1350
Phe Ser Glu Thr
TTC TCA GAG ACC
gtgagtaggcccag
1380
qccttgtctqtacagatcctcttttcttcxx:aagcaqccctaactgcagtccaggcx:agggaccagctctt
1410
1440
cxx:tgaggctgaggtaacctgggagtoccaggcagaggtcactagctaatgcacagcxxx:ttttttccx:te
1470
1500
1530
Asp Met Glu Leu Leu Arg Phe
aq GAC ATG GAA TTG CTT CGC TTC
1560
90
Pro Val Gin Phe Leu Ser Arg H e
OOC GTG CAG TTT CTC AGC AGG ATC
1590
110
Asp Arg Val Tvr Glu Lys Leu Lys
GAC CGC GTC TAT GAG AAA CTG AAG
1650
80
Ser Leu Leu Leu H e Gin Ser Trp Leu Gly
TOG CTG CTG CTC ATC CAG TCA TOG CTG GGG
100
Phe Thr Asn Ser Leu Met
TTT ACC AAC AGC CTG ATG
1620
120
Asp Leu Glu Glu Gly H e
GAC CTG GAA GAG GGC ATC
1680
Phe Gly Thr Ser
TTT GGT ACC TOG
Gin Ala Leu Met
CAG GOT CTG ATG
Gin
CAG gtcaggatqgaoogggggcgctagoctgaggttatactgaoctttgcctctgcttggagcctagct
1710 " '
'
1740
qggqggctcactgagctctgtttacoggtcagacx:ttaaaccttgagaaggcttcctactcactttccctt
1770
'
1800
1830
atqaagcx:tccaggcctttctctaggttctggagttggggagggcaoggctctgagttcttctttcxx:aca
1860
"
1890
130
140
Glu Leu Glu Asp Gly Ser Pro Arg H e Gly Gin H e Leu Lys Gin Thr
acaq
GAG CTG GAA GAC GGC AGC CCC OGT ATT GGG CAG ATC CTC AAG CAA ACC
1920
1950
150
160
Tyr Asp Lys Phe Asp A3 a Asn Met Arq Ser
TAT GAC AAG TTT GAC GOC AAC ATG CGC AGC
1980
170
G l v Leu Leu S e r Cys Phe Lys Lys Asp Leu
GGG CTG CTC TOO TQC TTC AAG AAG GAC CTG
2040
Asp Asp Ala Leu Leu Lys .Asn Tyr
GAT GAC GOT CTG CTC AAA AAC TAT
2010
H i s Lys Ala Glu Thr Tyr Leu Arg
CAC AAG GCA GAG ACC TAC CTG COG
180
19"!
192
Val Ket Lys Cys Arg Arg Phe Ala Glu Ser Ser Cys Ala Phe AM
GTC ATG AAG TGT CGC CGC TIT GCG GAA AGC AGC TGT GCT TIC TAG
2100
2070
gcacacactq
gtgtrtctgcggcactcx:cxx^tacccx:cctqtactctggcaactgccacccctacactttqtcctaata
2130
2160
2190
aaattaagatqcatcatatcactctgctagacatcttttttttttttgaaggc
222"!
2243
Figure 5. DNA Sequence of the rGH Gene: The region sequenced was from the 5' end
of BglII-C to the 3' end of PvuII-E. The sequence in the figure is differentiated
as follows: Protein-Coding Sequence, upper case letters with amino acid
designations above them (numbers above the line in these regions refer to amino
acids -26 through 190); intervening sequences, lower case letters; and 5'- and
3'-untranslated regions, lower case, underlined. The two diamonds at positions
804 and 999 mark the beginnings of the two 200 bp direct repeat sequences found
in IVS-B. The 15 bp direct repeats are indicated by horizontal arrows. All
numbers below the lines refer to distance in base pairs from the first base pair
presented in the sequence.
formamide conditions that favor DNA-RNA hybridization over reannealing of
the labeled probe (24), and the hybrids were digested with S1 nuclease. The
S1-resistant material was sized on an 8% polyacrylamide DNA sequencing gel.
A DNA sequence ladder prepared from the rGH XhoI site was used as a size
marker. The results of the experiment are shown in Fig. 6. The length of
the DNA strand protected from digestion with S1 by the rGH mRNA is about 65
bases. Thus the "CAP" site of rGH mRNA appears to be approximately 65 bases
from the PstI site, i.e., at position 230, an A residue 25 base pairs
downstream from the TATAAA sequence.
The variation in length of the S1-resistant DNA fragments probably
results from variation in the extent of digestion with S1. Any fragments
not digested to completion will be slightly longer than the correct length
and any overdigestion will slightly shorten the fragments. Most of the
label is found in the 65-base fragment; however, we cannot, by these data,
rule out the possibility of a slight variation in the rGH mRNA "CAP" site,
e.g., initiation at the A residue at position 233, 28 bases from the end of
the TATAAA sequence.
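The spacing argument above is simple arithmetic, sketched below. The sketch takes the TATAAA sequence to span positions 200-205 of Fig. 5, an assumption chosen because it reproduces the 25- and 28-base spacings quoted for positions 230 and 233. The variable and function names are illustrative only.

```python
# Spacing between the TATAAA sequence and candidate "CAP" sites,
# in the 1-based numbering of Fig. 5.
TATA_START, TATA_END = 200, 205   # assumed span of TATAAA


def bases_downstream_of_tata(position: int) -> int:
    """Distance from the last base of TATAAA to the given position."""
    return position - TATA_END


candidates = {"major S1 signal": 230, "possible alternative": 233}

spacings = {name: bases_downstream_of_tata(pos)
            for name, pos in candidates.items()}

# Sites that satisfy the usual 25 +/- 1 base spacing cited in the text
within_consensus = [name for name, d in spacings.items() if abs(d - 25) <= 1]

print(spacings)
print(within_consensus)
```

Only position 230 falls within the 25 ± 1 base window; position 233 lies slightly outside it, in line with its treatment in the text as a possible minor variant.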
[Figure 6: autoradiogram; size markers at 70, 65, 60, 50, and 40 bases.]
Figure 6. Location of the rGH mRNA "CAP" Site: The end-labeled 0.6 kb PstI
fragment (0.1 µg, ~10 cpm) from p.gRGH was hybridized to li*1 nq of
cytoplasmic RNA extracted from GH1 cells. The hybrids were digested with S1
nuclease as described in Materials and Methods. The S1-resistant material
was electrophoresed on an 8% polyacrylamide DNA sequencing gel at 30 mA for
2 hours. A DNA sequencing ladder prepared from the XhoI site of p.gRGH was
used as a size standard. (So as to equalize autoradiographic intensities,
the S1-resistant band and the sequencing ladder were photographed
separately and the photos reassembled.) All numbers in the figure refer to
lengths in bases of the indicated DNA fragments.
DISCUSSION
Comparison of the rGH Gene and rGH cDNA Sequences
Comparison of the cloned rGH cDNA sequence with the protein sequence
showed two discrepancies (9). One occurred at the amino-terminal amino
acid. Protein sequences for rat, as well as human, bovine, equine, and
ovine growth hormones, placed a Phe residue at this position (25). The
reported cDNA sequence predicted a Leu residue (codon UUA) as the
amino-terminal amino acid. The DNA sequence of the cloned gene is
consistent with the protein sequence, rather than the cDNA sequence, in
that it specifies a Phe residue (codon UUC) at the amino-terminal position
of the mature hormone. We have repeated the DNA sequence of the original
cDNA clone and have confirmed the initial characterization. It seems most
likely, therefore, that the discrepancy between the cDNA and the protein
sequence at this position resulted from an error in reverse transcriptase
copying of the rGH mRNA during the cDNA preparation. This is consistent
with the high error frequency that has been observed for reverse
transcriptase (26).
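To make the size of this discrepancy concrete, the sketch below compares the two codons named in the text using the relevant corner of the standard genetic code. It shows that a single base change in the third codon position is enough to turn Phe (UUC) into Leu (UUA). The table is deliberately limited to the four UUN codons, and the names are illustrative.

```python
# The four UUN codons of the standard genetic code.
CODON_TABLE = {"UUU": "Phe", "UUC": "Phe", "UUA": "Leu", "UUG": "Leu"}


def differing_positions(codon_a: str, codon_b: str) -> list:
    """Return the 1-based codon positions at which two codons differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(codon_a, codon_b)) if a != b]


gene_codon = "UUC"   # specified by the cloned gene
cdna_codon = "UUA"   # reported for the cDNA clone

changed = differing_positions(gene_codon, cdna_codon)

print(CODON_TABLE[gene_codon], CODON_TABLE[cdna_codon], changed)
```

A single misincorporated base during reverse transcription would therefore be sufficient to account for the Leu predicted by the cDNA.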
Wallis and Davies (25) placed a Gly residue at position 8 of the
mature hormone sequence, while the cDNA sequence predicts a Ser. The
sequence obtained here for the rGH gene is in agreement with the cDNA
sequence, placing a Ser at position 8. This same amino acid has been seen
in the equivalent position of other GH's (25).
Finally, at position 749 in the cDNA (9) a T residue is found that is
not found at the corresponding position in the gene sequence (between bases
2196-2197). This portion of the original cDNA clone was resequenced and was
also found to lack this base. The discrepancy thus appears to be
attributable to an error in the original sequence determination.
Interveninq Sequences in the Growth Hormone Gene
Comparison of the rGH cDNA sequence with that of the cloned gene shows
that in the gene the protein-coding portion is divided by four intervening
sequences, designated A through D. They are located within amino acid -23
of the pre-hormone (A), and between amino acids 31 and 32 (B), 70 and 71
(C), and 124 and 125 (D). The location of intervening sequence A is unambiguous.
Each of the other three could conceivably be placed one or more
bases removed from its given position. However, those locations shown are
the only ones that place a GT dinucleotide at the 5' junction of each
intervening sequence and an AG dinucleotide at each 3' junction. Since
these two dinucleotides seem to be a general feature of intervening
sequence junctions (27), we feel the positions given are reliable. Our
observations are consistent with those of Chien and Thompson (28), who
described by heteroduplex analysis intervening sequences in an independently obtained clone of the rat growth hormone gene.
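The reasoning about junction placement can be illustrated with a toy example. The Python sketch below uses an invented sequence (not the rGH sequence) and hypothetical helper names. It shows that several placements of an intervening sequence can yield the same spliced product when bases at the junctions are repeated, and that requiring GT at the 5' junction and AG at the 3' junction picks out one of them.

```python
def spliced(gene, start, end):
    """Remove gene[start:end] (0-based, end-exclusive) and join the flanks."""
    return gene[:start] + gene[end:]


def obeys_gt_ag(gene, start, end):
    """True if the removed segment begins with GT and ends with AG."""
    segment = gene[start:end]
    return segment.startswith("GT") and segment.endswith("AG")


def consistent_placements(gene, mrna, length):
    """All placements of a segment of the given length whose removal
    reproduces the observed spliced sequence."""
    return [
        (s, s + length)
        for s in range(len(gene) - length + 1)
        if spliced(gene, s, s + length) == mrna
    ]


# Invented toy sequences, for illustration only.
gene = "AAACAG" + "GTAAGTCCCTTTCAG" + "GCTTT"
mrna = "AAACAG" + "GCTTT"

placements = consistent_placements(gene, mrna, 15)
reliable = [p for p in placements if obeys_gt_ag(gene, *p)]

print("placements consistent with the spliced sequence:", placements)
print("placements that also obey the GT/AG rule:", reliable)
```

In this toy case more than one placement is compatible with the spliced sequence alone, but only the placement at (6, 21) satisfies the GT/AG rule.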
The four intervening sequences in rGH are in approximately the same
locations and are approximately the same sizes (with the exception of
intervening sequence B) as in human growth hormone (18). The entire
difference in size between the human and rat B intervening sequences may be
accounted for by the presence of a 200-base pair direct repeat that is
found in the rat gene.
The first unit of the repeat is located between
bases 794 and 996, and the second, between bases 999 and 1206. The difference in size of the repeat units results from a repetition of the sequence
CAAAA at the end of the second repeat unit.
Other than this, there are
only 8 base-pair differences between the two repeat units.
Furthermore,
there is an identical 15-base pair sequence (CACTAATGACAGAGA) located just
before the first repeat unit (bases 779 to 793) and just after the second
repeat unit (bases 1207 to 1221), i.e., the 200-base pair direct repeat is
itself flanked by a 15-base pair direct repeat.
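The sizes of the two repeat units follow directly from their coordinates. The Python sketch below works through the arithmetic. It assumes that the coordinates are inclusive and that the first unit spans bases 794 to 996, the start position that gives a unit of about 200 bases; the helper name `inclusive_length` is hypothetical.

```python
def inclusive_length(first_base, last_base):
    """Number of bases from first_base to last_base, counting both ends."""
    return last_base - first_base + 1


# Assumed inclusive coordinates of the two repeat units (see lead-in).
unit_1 = (794, 996)
unit_2 = (999, 1206)

len_1 = inclusive_length(*unit_1)
len_2 = inclusive_length(*unit_2)
extra = len_2 - len_1

print("first repeat unit: ", len_1, "bases")
print("second repeat unit:", len_2, "bases")
print("difference:", extra, "bases; length of CAAAA:", len("CAAAA"))
```

Under these assumptions the second unit is longer than the first by exactly the length of one extra copy of CAAAA.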
These observations strongly suggest that the large size of the rat B
intervening sequence is due to the transposition of a 200-base pair direct
repeat into that sequence.
Whether this repeat was once present in the
ancestral human gene and has been lost in the course of evolution, or
represents a more recent event occurring uniquely in the rat gene cannot be
determined from the observations made here. However, that its presence is
the result of a transposition event seems very likely (29).
It is worth noting that a sequence similar to the repeat units identified above, and inverted with respect to them, is found just 3' to the rGH
gene (see Figure 2).
This repeat unit has not been sequenced, but has
been identified and mapped by electron microscopy (30).
From the data given in the DNA sequence and the S1 nuclease mapping
data we can make an initial estimate of the size of the primary transcript
of the rGH gene. The mRNA "CAP" site, identified for other eucaryotic transcriptional units as being close to if not coincident with the site of initiation of transcription (31, 32, 33), is located at position 230. The
poly-A addition site was located by comparison with the rGH cDNA at position 2210.
The distance between the two locations, which is our estimate of the
size of the rGH primary transcript, is 1980 bases.
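The estimate is a simple subtraction of the two positions. The Python sketch below reproduces it and compares the result with a 2.3 kb hybridization-based size; the variable names are hypothetical, and the comparison is only a rough illustration of how far apart the two figures are.

```python
# Positions taken from the text.
cap_site = 230          # mRNA "CAP" site
poly_a_site = 2210      # poly-A addition site

# Distance between the two positions, as used in the text.
transcript_estimate = poly_a_site - cap_site

# Size of a nuclear RNA species estimated by hybridization, in bases.
hybridization_estimate = 2300

difference = hybridization_estimate - transcript_estimate
relative = difference / hybridization_estimate

print("estimated primary transcript:", transcript_estimate, "bases")
print("difference from 2.3 kb species:", difference, "bases")
print("relative difference: {:.1%}".format(relative))
```

The two values differ by a few hundred bases, which is a modest fraction of the total length.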
On the basis of hybridization data, Maurer et al. (34) have identified a 2.3 kb nuclear RNA species as a potential precursor to the rGH message. Considering the experimental errors involved, this estimate is in
reasonable agreement with our own.
However, they also identified a 5.6 kb
and a 6.7 kb species as potential precursors.
In consideration of the data
we have presented above, and in the absence of any direct structural characterization
of the putative precursors, we feel that the identification of the
larger RNA species as rGH precursors is premature.
REFERENCES
1. Tashjian, Jr., A.H., Yasumura, Y., Levine, L., Sato, G.H., and Parker,
M.L. (1968) Endocrinology 82, 342-352.
2. Martial, J., Baxter, J., Goodman, H.M., and Seeburg, P. (1977) Proc.
Natl. Acad. Sci. 74, 1816.
3. Pearson, R.L., Weiss, J.F., and Kelmers, A.D. (1971) Biochim. Biophys.
Acta 228, 770-774.
4. Lawn, R.M., Fritsch, E.F., Parker, R.C., Blake, G., and Maniatis, T.
(1978) Cell 15, 1157-1174.
5. Fiddes, J.C., Seeburg, P.H., DeNoto, F.M., Hallewell, R.A., Baxter,
J.D., and Goodman, H.M. (1979) Proc. Natl. Acad. Sci. USA 76, 4294-4298.
6. Cordell, B., Bell, G., Tischer, E., DeNoto, F.M., Ullrich, A., Pictet,
R., Rutter, W.J., and Goodman, H.M. (1979) Cell 18, 533-543.
7. Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Rutter, W.J., and
Goodman, H.M. (1977) Science 196, 1313-1319.
8. Winters, G., and Fields, S. (1980) Nucleic Acids Research 8, 1965.
9. Sanger, F., Nicklen, S., and Coulson, A.R. (1977) Proc. Natl. Acad. Sci.
USA 74, 5463-5467.
10. Seeburg, P.H., Shine, J., Martial, J.A., Baxter, J.D., and Goodman,
H.M. (1977) Nature 270, 486-494. 61-70.
11. Maxam, A., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
12. Maxam, A., and Gilbert, W. (1980) in Methods in Enzymology (L. Grossman
and K. Moldave, eds.) Vol. 65, pp. 499-560 Academic Press, New York.
13. Samuels, H.H., Klein, D., Stanley, F., and Casanova, J. (1978) J. Biol.
Chem. 253, 5895.
14. Hardies, S.C. and Wells, R.D. (1976) Proc. Natl. Acad. Sci. USA 73,
3117-3121.
15. Leder, P., Tiemeier, D., and Enquist, L. (1977) Science 196, 175-177.
16. Blattner, F.R., Blechl, A.E., Denniston-Thompson, K., Faber, H.E.,
Richards, J.E., Slightom, J.L., Tucker, P.W., and Smithies, O. (1978) Science 202, 1279-1284.
17. Southern, E.M. (1975) J. Mol. Biol. 98, 503.
18. Moore, D., manuscript in preparation.
19. Diamond, D.J., and Goodman, H.M., unpublished information.
20. Gronenborn, B., and Messing, J. (1978) Nature 272, 375-377.
21. Gannon, F., O'Hare, K.O., Perrin, F., Le Pennec, J.P., Benoist, C.,
Cochet, M., Breathnach, R., Royal, A., Garapin, A., Cami, B., and Chambon,
P. (1979) Nature 278, 428-434.
22. Goldberg, M. (1979) Ph.D. Thesis, Stanford University.
23. Yu, L.Y., Tushinski, R.J., and Bancroft, F.C. (1977) J. Biol. Chem. 252
24. Weaver, R.F., and Weissman, C. (1979) Nuc. Acid. Res. 7, 1175-1193.
25. Wallis, M., and Davies, R.V., in Growth Hormone and Related Peptides
(eds. Pecile, A., and Muller, E.E.) 1-14 (Elsevier, New York, 1976).
26. Gopinathan, K.P., Weymouth, L.A., Kunkel, T.A., and Loeb, L.A. (1979)
Nature 278, 857.
27. Seif, I., Khoury, G., and Dhar, R. (1979) Nucleic Acids Research 6,
3387-3398.
28. Chien, Y-H. and Thompson, E.B. (1980) Proc. Natl. Acad. Sci. 77, 4583.
29. Potter, S., Truett, M., Phillips, M., and Maher, A. (1980) Cell 20,
639-647.
30. Goodman, H.M. et al., manuscript in preparation.
31. Ziff, E.B., and Evans, R.M. (1978) Cell 15 1463-1475.
32. Baker, C.C., and Ziff, E.B. (1980) Cold Spring Harbor Symp. Quant.
Biol. Vol. XLIV, 415-428.
33. Luse, D.S., and Roeder, R.G. (1980) Cell 20, 691-699.
34. Maurer, R.A., Gubbins, E.J., Erwin, C.R., and Donelson, J.E. (1980) J.
Biol. Chem. 255, 2243-2246.