The Sequence of the Gorilla Fetal Globin Genes: Evidence for Multiple Gene Conversions in Human Evolution¹
Alan F. Scott, Peter Heath, Stephen Trusko, and Samuel H. Boyer*
Department of Medicine and *Howard Hughes Medical Institute, Johns Hopkins University
William Prass, Morris Goodman, and John Czelusniak
Department of Anatomy, Wayne State University
L.-Y. Edward Chang and Jerry L. Slightom
Department of Genetics, University of Wisconsin-Madison
Two fetal globin genes (Gγ and Aγ) from one chromosome of a lowland gorilla (Gorilla gorilla gorilla) have been sequenced and compared to three human loci (a Gγ-gene and two Aγ-alleles). A comparison of regions of local homology among these five sequences indicates that long after the duplication that produced the two nonallelic γ-globin loci of catarrhine primates, about 35 million years (Myr) ago, at least one gene conversion event occurred between these loci. This conversion occurred not long before the ancestral divergence (about 6 Myr ago) of Homo and Gorilla. After this ancestral divergence, a minimum of three more gene conversion events occurred in the human lineage. Each human Aγ-allele shares specific sequence features with the gorilla Aγ-gene; one such distinctive allelic feature involves the simple repeated sequence in IVS 2. This suggests that early in the human lineage the γ-genes may have undergone a crossing-over event mediated by this simple repeated sequence. The DNA sequences from coding regions of both Gγ- and Aγ-loci, a comparison of 292 codons in the corresponding gorilla and human genes, show an unusually low evolutionary rate, with only two nonsilent differences and, surprisingly, not even one silent substitution. The two nonsynonymous substitutions observed predict a glycine at codon 73 and an arginine at codon 104 in the gorilla Aγ-sequence rather than aspartic acid and lysine, respectively, in human Aγ. Because only arginine has been found at position 104 in γ-chains of Old World monkeys, it may represent the ancestral residue lost in gorilla and human Gγ-chains and in the human Aγ-chain. Possibly the arginine codon (AGG) was replaced by the lysine codon (AAG) in the Gγ-gene of a common ancestor of Homo and Gorilla and then was transferred to the Aγ-gene by subsequent conversions in the human lineage. DNA sequence conversions, similar to that attributed to the fetal γ-globin genes, appear to be relatively frequent phenomena and, if widespread throughout the genome, may have profound evolutionary consequences.
1. Keywords: gene conversion, fetal globin genes, gene evolution, gorilla, human. Address for correspondence and reprints: Alan F. Scott, Department of Medicine, Johns Hopkins University, Baltimore, Maryland 21205.
Mol. Biol. Evol. 1(5):371-389. 1984.
© 1984 by The University of Chicago. All rights reserved.
0737-4038/84/0105-0001$02.00
Introduction
The β-globin genes of humans occur within a chromosomal region of about 50 kilobase pairs (kbp) of DNA and consist of an embryonic locus (ε), two fetal loci (γ), and two adult loci (δ and β), as well as a nontranscribed locus or pseudogene (ψβ1), with the order of these genes from 5' to 3' being ε, Gγ, Aγ, ψβ1, δ, β (Fritsch et al. 1980; Shen and Smithies 1982). The same sequence organization of β-globin genes, including the presence of two adjacent γ-loci, has also been found in other members of the primate infraorder Catarrhini (e.g., in the Old World monkey Papio anubis and in the African ape Gorilla gorilla) (Barrie et al. 1981). However, only one γ-locus has been found in the New World monkey Aotes trivirgatus (representing Platyrrhini, the sister taxon of Catarrhini) as well as in a prosimian primate, Lemur fulvus. The conclusion has been drawn that, during evolution of the Catarrhini, the β-globin gene cluster became more complex as a result of a duplication of the fetal γ-locus in the basal catarrhines approximately 35 Myr ago after their separation from progenitors of New World monkeys (Barrie et al. 1981; Shen et al. 1981). The products of the two human γ-loci can be distinguished at residue position 136 by the presence of glycine (Gly) in one chain and alanine (Ala) in the other. Fetal γ-chains with Gly and Ala at position 136 have also been found in chimpanzees (Pan troglodytes) (De Jong 1971) and in gorillas (G. gorilla) (Huisman et al. 1973), but only Gly has been detected at this position in orangutans (Pongo pygmaeus) (Schroeder et al. 1978) and in Old World monkeys (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980). Other than the Gly/Ala difference at position 136, the two human γ-chains have an identical amino acid sequence, which differs from those of Old World monkeys by three to four amino acid residues (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980).
It has been observed from amino acid sequence data on both the α- and γ-globins from several organisms that paralogous genes within a species are more alike than are the orthologous genes in different species. Thus intraspecific duplicates appear to have evolved in parallel (e.g., Snyder 1980). This observation has been confirmed by DNA studies and is true not only for the globin genes (Zimmer et al. 1980; Liebhaber et al. 1981) but for other repeated sequences as well (e.g., the ribosomal genes; Arnheim and Southern [1977]) and has been referred to as "concerted evolution" (Zimmer et al. 1980).
The genetic process underlying concerted evolution for the human γ-genes was initially studied by Slightom et al. (1980) and Shen et al. (1981), who sequenced both loci and their flanking regions. Two alleles of the Aγ-gene, from the same individual, were identified on the basis of their sequence and location on separate chromosomal homologs designated A and B. A portion of the Aγ-locus from chromosome A was shown to resemble more closely the 5' Gγ-locus from that chromosome than it did its allele on chromosome B. However, the 3' regions of the two Aγ-loci are truly allelic in that each codes for a protein specifying Ala at position 136. It was concluded that sequences from the AGγ-gene (the Gγ-gene of chromosome A) had been superimposed on the AAγ-locus (the Aγ-locus of chromosome A) by a mechanism involving gene conversion. The allele from chromosome B was thought to represent an unconverted gene. The 3' boundary of the conversion appeared to coincide with an unusual simple sequence consisting primarily of (TG)n, where n ranged from 12 (human AAγ) to 22 (human AGγ). This region is about 600 base pairs (bp) downstream from the beginning of IVS 2 and was described as a "hot spot" because it differed so significantly among the three sequences and because of its apparent role in the conversion of AAγ. The converted portion of the AAγ-allele extends from the hot spot in a 5' direction for about 1,500 bp (Shen et al. 1981). In this study we examine the fetal globin genes of the gorilla because this species is sufficiently close to man that extensive similarity would be expected, yet small changes of the sort seen between human alleles still might be detected and serve to explain further the process by which conversion is mediated.
Material and Methods
Material
Restriction endonucleases EcoRI, BamHI, PstI, AvaI, BglI, BglII, HindIII, HincII, SacI, SmaI, and XbaI were from Promega Biotec (Madison, Wis.) or Bethesda Research Laboratories (Gaithersburg, Md.). Polynucleotide kinase, M13 phage DNA, and sequencing primer were from P-L Biochemicals (Milwaukee), and bovine intestinal alkaline phosphatase was from Boehringer-Mannheim (Indianapolis). DNA polymerase large fragment and Bal31 nuclease were from Bethesda Research Laboratories. Proteinase K was obtained from EM Reagents. The [α-32P]dATP (800 Ci/mM), [γ-32P]ATP (2,000-3,000 Ci/mM; 1 Ci = 3.7 × 10^10 Bq), and T4 ligase were from New England Nuclear (Boston). Chemicals used for Maxam and Gilbert sequencing were obtained from the recommended vendors (Maxam and Gilbert 1980). X-ray film, X-omat AR-5, was from Kodak.
DNA Cloning and Isolation
DNA was prepared by the method of Blin and Stafford (1976) from blood from a male lowland gorilla (Gorilla gorilla gorilla) (Tomoka, who lives at the National Zoological Park in Washington, D.C.) and partially digested with EcoRI. Fragments of 12-20 kbp were selected from sucrose gradients and cloned into the EcoRI "arms" of the lambda phage Charon 4A (Maniatis et al. 1978; Williams and Blattner 1979). The resulting phage library was screened with the γ-globin cDNA probe pJW151 (Wilson et al. 1978). Several phage containing γ-globin genes were identified, and DNA was prepared from the clone designated GγG1. Restriction enzyme site mapping and blot hybridization (Southern 1975) with the human γ-globin cDNA probe (fig. 1) showed that this phage contained the complete gene of both γ-loci (fig. 2). Various regions of the lambda insert DNA were subcloned into the plasmid pBR322 (Fritsch et al. 1980) or M13 phage vectors (Messing et al. 1981).
DNA Sequencing
Cloning into M13 vectors was accomplished by using specific restriction digestions of plasmid or lambda phage DNA and the appropriate vector DNA and by generating randomly terminated fragments with Bal31 nuclease digestion followed by blunt end ligation into the SmaI site of M13mp9 (Messing and Vieira 1982). Enzymatic sequencing was done as described by Sanger et al. (1977). Chemical sequencing of clone GγG1 was done by end labeling the fragments obtained by enzymatic digestion of lambda DNA (20-50 μg), isolating these labeled fragments, and sequencing them as described by Maxam and Gilbert (1980). DNA sequences were analyzed on 60- or 85-cm-long and 0.4-mm-thick gels (enzymatic reactions), or on 104-cm-long and 0.2-mm-thick water-jacketed gels (chemical reactions). Long gel plates were treated as described by Garoff and Ansorge (1981) so that acrylamide used to form the gel matrix was bonded directly to the plate. The times employed for chemical sequencing reactions and the procedure for pouring gels followed the directions of Slightom et al. (1983). Most of the gorilla sequence reported here was obtained independently by both procedures and confirmed in more than one laboratory.
Evolutionary Reconstruction: Parsimony Procedure
The main principle used in aligning the gorilla Gγ- and Aγ-sequences against each other as well as against the human sequences (Gγ- and the two Aγ-alleles) and [...] used sparingly in the gene sequences and were placed so as to maximize the number of matching bases. This helped ensure that for each homologous region we could find the order of evolutionary branching that involved the fewest possible genetic events, that is, maximized genetic likenesses while minimizing parallel and back substitutions. Each nucleotide substitution was counted as a single event. Gaps (i.e., insertions or deletions), regardless of the number of nucleotides involved, were also counted as single events. Inversions involving double base pairs were counted as two independent events, even though we cannot exclude the possibility that they may occur by a single process. Because of the very extensive similarity among the five genes it was not necessary to weight gaps more heavily than nucleotide substitutions in aligning these sequences. In fact, in our alignment, no gaps were employed in any sequences at 97% of the positions (1,779 of the 1,837 positions of the full alignment shown in fig. 3). Nucleotide substitutions and gaps occurred at only 110 sites among the five genes.
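The counting rules just described (one event per substitution, one event per gap regardless of its length) can be illustrated with a short sketch. This is not the authors' program: the function name `count_events`, the `-` gap character, and the toy sequences are assumptions made for illustration, and the special handling of inversions is not modeled.

```python
def count_events(seq_a: str, seq_b: str, gap: str = "-") -> dict:
    """Count events between two aligned sequences.

    Each mismatched base is one substitution; each contiguous run of
    gap characters in one sequence is one gap event, whatever its length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    substitutions = 0
    gap_events = 0
    open_gap_side = None  # "a" or "b" while a gap run is in progress

    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column gapped in both sequences carries no event
        if a == gap or b == gap:
            side = "a" if a == gap else "b"
            if side != open_gap_side:
                gap_events += 1  # a new gap run starts here
                open_gap_side = side
        else:
            open_gap_side = None
            if a != b:
                substitutions += 1

    return {
        "substitutions": substitutions,
        "gaps": gap_events,
        "total": substitutions + gap_events,
    }


# Toy alignment (hypothetical, not taken from fig. 3)
print(count_events("ACGT--TTGA", "ACCTGGTT-A"))
```

In the toy alignment the two-base gap counts once, so the pair differs by one substitution and two gap events.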
In constructing the order of ancestral branching with the fewest events (i.e., maximum parsimony), we determined the number of nucleotide substitutions in branches of particular trees using the algorithm of Fitch (1971). For the orthologous coding regions, we included amino acid sequence data for the Old World monkeys (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980), chimpanzee (De Jong 1971), and orangutan (Huisman et al. 1973), as well as the human and gorilla sequences. The parsimony procedure used for these sequences considered, in addition to the amino acid sequences, the actual nucleotide sequences when known (Moore et al. 1973; Czelusniak et al. 1982).
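As an illustration of how substitutions are counted on a given tree, the sketch below implements the bottom-up pass of Fitch-style parsimony for a single aligned site. It is a minimal sketch under stated assumptions: the nested-tuple tree encoding and the function name `fitch` are hypothetical, and the example topology is chosen only for illustration. The leaf states are the middle base of codon 104 as implied by the abstract (AGG in gorilla Aγ, AAG in the other four genes).

```python
def fitch(tree):
    """Return (candidate ancestral states, minimum changes) for one site.

    A tree is either a leaf (a one-letter string) or a pair
    (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {tree}, 0

    left, right = tree
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)

    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change.
        return shared, left_changes + right_changes
    # The children disagree: one substitution is required on this node.
    return left_states | right_states, left_changes + right_changes + 1


# Leaves grouped by locus (illustrative topology only):
# ((gorilla G, human G), (gorilla A, (human AA, human BA)))
site = (("A", "A"), ("G", ("A", "A")))
states, changes = fitch(site)
print(states, changes)
```

Summing the per-site minimum over all aligned positions gives the length of a tree; the tree with the smallest total is the most parsimonious.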
Identifying Gene Conversion Regions
Our approach to the documentation of gene conversion events is to identify sequence regions that diverged much less than would be expected based on the amount of divergence of surrounding regions of the duplicated genes involved. This approach is illustrated by a simple hypothetical situation in figure 4. Because the Gγ- and Aγ-sequences are descendants of a γ-globin gene which apparently duplicated in the ancestors of Old World monkeys and hominoids about 35 Myr ago, we see significant sequence divergence between the two loci. We can test whether the sequence differences are uniformly distributed throughout the genes by parsimony analysis where sequences are clustered on the basis of the number of substitutions required to generate one sequence from the other. The shortest route between two sequences corresponds to the most parsimonious solution. Therefore, for regions of the genes which have not undergone conversion, the Gγ-sequences should form a distinct group from the Aγ-sequences. However, if sequences were exchanged between the tandem loci, then the most parsimonious groupings would not distinguish 5' and 3' loci as separate groups, and instead Gγ- and Aγ-sequences in one species would cluster in a group separately from those in another species. Both sorts of arrangements can be seen in the five γ-gene sequences (see fig. 3). By preparing parsimony trees for various regions of homology within the γ-sequences, we could provide evidence for the conversion events described below (see fig. 5).
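The grouping test described above can be approximated with simple pairwise distances. The sketch below is a simplification, not the parsimony-tree procedure used in the paper: it asks, for one aligned region and two species, whether the two loci within a species are closer to each other than either is to its ortholog. The function names, the dictionary layout, and the toy sequences are all assumptions.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def flag_conversion(region: dict) -> dict:
    """Flag species whose G and A loci cluster together in this region.

    `region` maps (species, locus) -> aligned sequence, with exactly two
    species and the loci "G" and "A".
    """
    species = sorted({sp for sp, _ in region})
    flags = {}
    for sp in species:
        other = next(s for s in species if s != sp)
        paralog = p_distance(region[(sp, "G")], region[(sp, "A")])
        ortholog = min(
            p_distance(region[(sp, locus)], region[(other, locus)])
            for locus in ("G", "A")
        )
        # Paralogs more alike than orthologs suggests a conversion.
        flags[sp] = paralog < ortholog
    return flags


# Toy region in which the human A locus carries a copy of the human G locus
converted = {
    ("gorilla", "G"): "ACGTACGTAC",
    ("gorilla", "A"): "TCGAACCTAC",
    ("human", "G"): "ACGTACGTAT",
    ("human", "A"): "ACGTACGTAT",
}
print(flag_conversion(converted))
```

Running the same check region by region along an alignment would show where the grouping switches from "by locus" to "by species".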
Results and Discussion
Restriction Enzyme Site Mapping
We have isolated the fetal globin gene region from a lowland gorilla in the recombinant Ch 4A phage clone GγG1. Restriction mapping of this clone with EcoRI, BamHI, and HindIII shows that it contains most of the sites found in the human γ-gene clone 165.24 (Slightom et al. 1980). Because these two recombinants contain almost identical γ-gene regions (the human clone has an extra 1.5-kbp EcoRI fragment on its 5' end) and are in the same 5' to 3' orientation with respect to the Ch 4A vector, a direct comparison of their enzyme sites can easily be made (see fig. 1). Blot hybridization of these DNA fragments to 32P-labeled γ-globin cDNA, isolated from the plasmid pJW151 (Wilson et al. 1978), clearly shows that the gorilla clone does contain both fetal globin genes in EcoRI fragments of sizes characteristic of this region in humans (Gγ- and Aγ-globin genes in 6.9-kbp and 2.64-kbp fragments, respectively).
FIG. 3.-Nucleotide sequence comparison of Gγ- and Aγ-globin genes from human and gorilla. Human Gγ- and Aγ-globin gene nucleotide sequences are from Slightom et al. (1980) and Shen et al. (1981) with the genes in Ch4A 165.24 (Hsa AG and Hsa AA) from chromosome A and the gene in Ch3A 51.1 (Hsa BA) from chromosome B of a single individual. Gorilla Gγ- and Aγ-globin gene nucleotide sequences are from clone Ch4A GγG1 and are referred to as Ggo G and Ggo A. The numbering system is set by the overall alignment. The complete nucleotide sequence for Hsa B? has also been obtained from another clone from the same individual no. 563 (J. L. Slightom, unpublished data). Asterisks indicate the presence [of gaps] which are used to maximize identities. The complete nucleotide sequence for the gorilla Gγ-globin gene is written on the top line. If a nucleotide difference is noted in any of the sequences, the nucleotide for that position is written for each of the genes. Nucleotides that may have biological importance are noted: single overline (TATA box) and double overline (poly[A] addition signal). The amino acid sequence is printed below the dashed counting line, and amino acid replacements are printed below the appropriate positions at residues 73, 104, and 136. The initiator codon is the first Met and the terminator codon is designated TER. Arrows indicate exon-intron boundaries which conform to the GT/AG rule (Breathnach et al. 1978).
FIG. 4.-Hypothetical scheme for identifying multiple gene conversions. The key to the diagram distinguishes regions related by duplication or speciation, the converting region, regions related by conversion 1, and regions related by conversion 2. d1 is the distance between homologous regions related paralogously by duplication; it is the distance between regions, one from locus A and one from locus B. d2 is the distance between homologous regions related orthologously by speciation; it is the distance between regions, both from locus A or locus B. d3 is the distance between homologous regions related by conversion 1. d4 is the distance between homologous regions related by conversion 2. d1 > d2 signifies that duplication preceded speciation; d1 > d3 identifies gene conversion 1; d3 > d2 signifies that gene conversion 1 preceded speciation; d2 > d4 identifies gene conversion 2. Identification of the A gene as the converting gene in conversion 2 requires not only that A and B genes of the right-hand species diverge less from each other than each gene diverges from either A or B gene of the left-hand species (as well as from either A or B gene of the ancestral species of right- and left-hand species) but also that each gene of the right-hand species diverges less from A than from B gene of the left-hand species (as well as diverges less from A than from B gene of the ancestral species of right- and left-hand species).
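The inequalities in the figure 4 legend translate directly into a small decision helper. The sketch below is only an illustration of that logic: the function name `interpret_distances` and the numeric distances in the example are hypothetical, not values from the paper.

```python
def interpret_distances(d1: float, d2: float, d3: float, d4: float) -> dict:
    """Apply the four inequalities from the fig. 4 legend.

    d1: paralogous distance in a region related only by duplication
    d2: orthologous distance in a region related by speciation
    d3: distance in a region related by conversion 1
    d4: distance in a region related by conversion 2
    """
    return {
        "duplication_preceded_speciation": d1 > d2,
        "conversion_1_identified": d1 > d3,
        "conversion_1_preceded_speciation": d3 > d2,
        "conversion_2_identified": d2 > d4,
    }


# Hypothetical distances (fractions of differing positions)
print(interpret_distances(d1=0.13, d2=0.02, d3=0.04, d4=0.0))
```

With these example values all four conditions hold, which corresponds to one conversion before speciation and a second, more recent conversion in one lineage.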
The only difference noted in the EcoRI fragment pattern was in the 3' region of the Gγ-genes, where humans have a 1.58-kbp fragment, whereas the absence of an EcoRI site in the gorilla clone results in a larger 2.3-kbp fragment containing the 3' third exon of the Gγ-gene (see figs. 1 and 2). We also found that neither of these gorilla genes contains a HindIII site in IVS 2 which is polymorphic in the human genes (Jeffreys 1979). In contrast, the human clone 165.24 has a HindIII site in IVS 2 of Gγ (see figs. 1 and 2).
Structural Comparison of the Gorilla and Human Fetal Globin Genes
We have obtained the nucleotide sequence for identical regions of the gorilla Gγ- and Aγ-globin genes, starting 55 bp 5' of the expected capped nucleotide of the mRNA and extending 171 bp 3' of the expected poly(A) addition for a total of 1,837 bases (see fig. 3). For comparison, the human Gγ- and Aγ-globin genes of chromosome A and the Aγ-globin gene of chromosome B are also shown in figure 3.
The 5' nontranslated regions for these gorilla and human genes show very few differences and none that would be expected to alter their expression. The gorilla genes share the identical promoter sequence (AATAAA) at the same position, 31 bp before the expected capped site (Efstratiadis et al. 1980). Because the gorilla and human promoter sequences are located in identical positions, we expect that the 5' nontranslated region will also be of the same length (53 bp).
FIG. 5.-History of gamma gene evolution and conversion events. A duplication of a γ-globin gene encoding glycine at position 136 occurred in the early catarrhine primates about 35 Myr ago. Because a glycine is coded at this position in both γ-genes of the orangutan and Old World monkeys (table 2) whereas a replacement with alanine in the 3' gene is found in Homo and Gorilla, we conclude that this change occurred after the divergence of Pongo (about 14-18 Myr ago) but before the Homo and Gorilla branching (about 5-6 Myr ago). This replacement may have occurred before the first conversion (C1), or it may have happened afterward, but still before the separation of Homo and Gorilla. If the glycine → alanine replacement occurred before C1, the 3' boundary of C1 can be placed in exon 3 at codon position 135 or nucleotide position 1543 (fig. 3), but if the replacement occurred after C1, the 3' boundary can be placed at the start of the 3' untranslated region. C1 is common to both humans and gorillas and is estimated to have occurred about 10 Myr ago, using the replacement rate change of 1% for every 10 Myr calculated by Efstratiadis et al. (1980). No further conversions have been identified in the gorilla lineage, but three have been identified in the human lineage. C2 and C3 are estimated to have occurred about 2-3 Myr ago either in a common ancestor of human chromosome types A and B or in an early version of chromosome B itself. Conversion C2 is evident in the BAγ-gene from positions 901 to 1128 and extending into the "hot spot," and C3 is also evident in the BAγ-gene, but from positions 42 to 777 (fig. 3). C4, extending over some 1,500 bp, is estimated to have occurred no earlier than 1 Myr ago on human chromosome type A (Shen et al. 1981). Its effects are evident in the AAγ-gene from positions 42 to 1128, with its 3' boundary being located in the hot spot region. Whereas in the case of C4 the converting sequences clearly come from the AGγ-gene, in each of the earlier events we have yet to establish the origin of the converting sequence.
The gorilla γ-globin genes contain two intervening sequences at the expected positions. IVS 1 has the same length (122 bp) in all five γ-globin genes, and the sequences are virtually identical. As found in the human γ-genes, both the length and sequence of IVS 2 differ between the gorilla Gγ- and Aγ-genes. IVS 2 is 906 bp long in gorilla Gγ and 872 bp in gorilla Aγ. In the human genes, IVS 2 of AGγ is 886 bp long, whereas that of AAγ is 866 bp and that of BAγ is 876 bp (Slightom et al. 1980). In both species this difference in length of IVS 2 is due to the presence of a simple sequence DNA region (fig. 3, positions 1126-1181) which consists of (TG)n. The value of n ranges from 13 in human AAγ to 24 for gorilla Gγ. Also, both gorilla γ-genes are lengthened by an extra 13 bp (at positions 625-637) that are not found in any of the human genes.
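A value of n for a (TG)n tract can be read off a sequence with a regular expression. The sketch below is a hedged illustration: the function name `longest_tg_run` and the toy sequence are assumptions, and because the hot spot consists only "primarily" of TG repeats, the longest uninterrupted run found this way need not equal the n quoted in the text.

```python
import re


def longest_tg_run(sequence: str) -> int:
    """Return the repeat count n of the longest uninterrupted (TG)n run."""
    runs = re.findall(r"(?:TG)+", sequence.upper())
    # Each run is a string such as "TGTGTG"; two bases per repeat unit.
    return max((len(run) // 2 for run in runs), default=0)


# Toy sequence (hypothetical, not taken from fig. 3)
print(longest_tg_run("CCTGTGTGTGAATGTG"))
```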
The gorilla fetal globin genes both use the same terminator codon and have their poly(A) addition signal (AATAAA) located at the same position as the human genes. We assume that poly(A) for these gorilla γ-gene transcripts would be added about 21 bp downstream from these poly(A) signal sites as is true for the human transcripts (Poon et al. 1978; Forget et al. 1979). If so, the 3'-untranslated region would be some 90 bp for the gorilla Aγ-gene and some 89 bp for the Gγ-gene.
A Slower Evolutionary Rate for the Primate γ-Globin Genes
In a previous comparative restriction mapping study of the β-globin gene cluster, Barrie et al. (1981) estimated that human and gorilla are about 0.9% different at the DNA level (most of the sites examined were regions of flanking or intervening sequence). According to the data in figure 3, human and gorilla Gγ-sequences are about 2.1% different, whereas the gorilla Aγ-gene differs from the human AAγ- and BAγ-alleles by 2.3% and 2.2%, respectively. These differences are not spread randomly between the γ-sequences. Instead, as can be seen in figure 3, human and gorilla γ-genes share certain regions (e.g., the exons) that are almost identical, whereas other regions (in particular IVS 2) diverge significantly.
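Percent differences of this kind can be computed from counts of replacements and gaps. The sketch below assumes the difference is (nucleotide replacements + gaps) divided by the number of shared positions; that formula is an inference that reproduces most rows of table 1 (for example, 4 replacements and 2 gaps over 259 shared positions give 2.3%), not a definition stated in the text. The function name is hypothetical.

```python
def percent_difference(replacements: int, gaps: int, shared_positions: int) -> float:
    """Percent difference, counting each replacement and each gap as one event."""
    if shared_positions <= 0:
        raise ValueError("shared_positions must be positive")
    return 100.0 * (replacements + gaps) / shared_positions


# Gorilla G vs. human G in the 3' untranslated and flanking region (table 1)
print(round(percent_difference(4, 2, 259), 1))
```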
It has been argued from amino acid sequence data that hominoid (human, chimpanzee, gorilla, and orangutan) hemoglobins have accumulated fewer amino acid replacements than expected when compared to the replacement rates seen in other mammalian hemoglobins (Goodman 1981; Goodman et al. 1983). The present study confirms this unusually low evolutionary rate and extends the observation to silent or synonymous substitutions in the γ-coding regions as well. In the entire coding sequence of both Gγ- and Aγ-genes of the gorilla (a total of 876 bases), we detect only two substitutions in comparison with the human genes, each of which results in an amino acid replacement (with the gorilla having Gly at Aγ-codon 73 and Arg at Aγ-codon 104 compared to Asp and Lys, respectively, in human Aγ). No silent substitutions were detected in the coding sequences reported here. This finding does not conform to the "neutralist" paradigm (e.g., Li et al. 1981; Li 1983; Kimura 1983), which predicts that many more silent than amino acid-changing substitutions should accumulate in exons of active genes during evolution.
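The distinction between silent and amino acid-changing substitutions can be made concrete with the standard genetic code. The sketch below is illustrative only: the table is built from the standard code in TCAG order, the function name `classify_substitution` is an assumption, and the only codon pair taken from the paper is AGG (Arg) versus AAG (Lys) at codon 104.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon_a: str, codon_b: str) -> str:
    """Classify the difference between two codons."""
    codon_a, codon_b = codon_a.upper(), codon_b.upper()
    if codon_a == codon_b:
        return "identical"
    if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
        return "silent"
    return "replacement"


# Codon 104: AGG (Arg) in gorilla versus AAG (Lys) in human
print(classify_substitution("AGG", "AAG"))
```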
The separation of human and gorilla is thought to have occurred between 5 and 6 Myr ago on the basis of various interpretations of the fossil record (Johanson and White 1979; Pilbeam 1979; McHenry and Corruccini 1980; Lovejoy 1981) and from the application of DNA hybridization data as an evolutionary clock (C. Sibley and J. Ahlquist, personal communication). It is surprising that no silent changes have accumulated in that period. Calculations based on substitution rates found for coding sequences (exons) of globin genes in human/rabbit, human/mouse, and rabbit/mouse comparisons (Efstratiadis et al. 1980) show a 1% change at silent sites for every 1.4 Myr. This would predict that, over the 876 bp of coding DNA in Gγ- and Aγ-genes, at least nine silent changes should have accumulated in the 220 silent sites. Thus, our data point to an unexpectedly low silent substitution rate in the coding sequences of hominoid γ-globin genes. Because the number of detected substitutions is still relatively small, we cannot be sure whether this is a statistical anomaly or represents the true condition. But if the observed number of substitutions is indeed lower, then we interpret this to mean that either selection has acted throughout the coding sequences to minimize both silent and nonsilent substitutions, or else other mechanisms are operating to maintain the slower relative substitution rate in these regions of the γ-genes. Because the number of silent substitutions is also much lower than expected, we conclude that the conservation of these sequences is not necessarily related to the encoded protein product. Although this conservation might reflect selection for RNA secondary structure, we believe that a more reasonable alternative is that there has been an overall reduction in the apparent substitution rate. This could be due to a general mechanism such as enhanced DNA repair, longer generation times in primates (Goodman 1976), or restrictions in the types of allowable substitutions as a consequence of the base composition of the region. Mammalian DNAs have fewer CG doublets than would be predicted, perhaps as a consequence of methylation of the 5' cytosine (Razin and Riggs 1980), and, because the γ-genes are relatively GC rich, they may tolerate fewer substitutions (Smithies et al. 1981). However, we would expect these mechanisms to apply to IVS and flanking region sequences as well, so they do not specifically explain the disproportionately lower silent substitution rate in coding regions. Two other mechanisms that could account for the reduced coding region rates are (1) gene conversions confined to the coding regions, or (2) cDNA-mediated conversion events which, by definition, would involve only coding sequences. This last mechanism has recently been proposed to account for peculiarities in the evolution of various families of repeated genes and processed genes that lack introns and, often, the normally present signal sequences in adjacent DNA (Jagadeeswaran et al. 1981; Lewin 1983). This conversion process might result from hybridization and strand exchange with either fragments of full-size γ-cDNAs or conversion of a part of the cellular gene with a complete cDNA. Only with the accumulation of additional coding region sequences can these speculations be tested.
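The estimate of at least nine expected silent changes quoted earlier in this section follows from simple arithmetic. The sketch below reproduces it under the assumption that the 1%-per-1.4-Myr figure is applied directly to the time since the human-gorilla split, taken here as 6 Myr; the function and parameter names are hypothetical.

```python
def expected_silent_changes(
    silent_sites: int, divergence_myr: float, myr_per_percent: float
) -> float:
    """Expected number of silent changes for a given rate and divergence time."""
    percent_changed = divergence_myr / myr_per_percent
    return silent_sites * percent_changed / 100.0


# 220 silent sites, 6 Myr since divergence, 1% change per 1.4 Myr
print(round(expected_silent_changes(220, 6.0, 1.4), 1))
```

Under these assumptions the expectation is a little over nine changes, against the zero silent substitutions actually observed.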
The 3' Regions of the γ-Genes Have Not Been Converted
Although the three exons and IVS 1 are extremely similar in sequence among the five γ-genes (showing only four substitutions over 560 aligned positions), other regions are less similar. From the time of the ancestral divergence of Homo and Gorilla to the present, the fastest rates of substitution occurred in IVS 2. (The difference in substitution and gaps between orthologous human and gorilla Gγ-genes in IVS 2 is 3.5% compared to 2.3% in the 3' untranslated and flanking region, 1.8% in 5' flanking region, and no differences in exons and IVS 1.) However, the largest difference between Gγ- and Aγ-genes is not found in IVS 2 but in the 3' untranslated and flanking region which differs by an average of 13% (see table 1), whereas this region of the orthologous human and gorilla Gγ- and Aγ-genes differs by only 2.3% and 1.4%, respectively. Such a relatively large difference strongly suggests that this 3' untranslated and flanking region of the tandem γ-loci has not undergone conversion, perhaps since the original duplication. At 29 of the 41 substitution sites in the 3' flanking region (fig. 3 and table 1, positions 1577-1837), both of the Gγ-sequences are distinct from
Table 1
Pairwise Comparisons of Nucleotide Sequences from Gorilla and Human Gγ- and Aγ-Genes
                                 NRs   Gaps   Positions Shared   Difference (%)
In the 3' Untranslated and Flanking Region (Pos. 1577-1837) Not Subjected to Conversions since the Time of the Tandem γ-Duplication in Early Catarrhines
Gorilla G vs. human G              4      2         259               2.3
Gorilla G vs. human AAγ           28      4         256              12.5
Gorilla G vs. human BAγ           29      4         256              13.0
Gorilla G vs. gorilla A           29      5         253              13.4
Human G vs. human AAγ             32      2         258              13.0
Human G vs. human BAγ             33      2         258              13.6
Human G vs. gorilla A             33      3         255              14.0
Human AAγ vs. human BAγ            1      0         258               0.4
Human AAγ vs. gorilla A            2      1         255               1.2
Human BAγ vs. gorilla A            3      1         255               1.6
In the Region Subjected Only to Conversion C1 (Pos. 1129-1576 Spanning the 3' Third of IVS 2 and Exon 3)
Gorilla G vs. human G              5      2         441               1.6
Gorilla G vs. human AAγ           12      3         421               3.6
Gorilla G vs. human BAγ           16      3         439               4.3
Gorilla G vs. gorilla A            9      3         421               2.9
Human G vs. human AAγ             13      1         421               3.3
Human G vs. human BAγ             15      1         439               3.6
Human G vs. gorilla A             10      1         421               2.6
Human AAγ vs. human BAγ            2      1         421               0.7
Human AAγ vs. gorilla A            7      0         421               1.7
Human BAγ vs. gorilla A            7      1         421               1.9
In the Region of IVS 2 Subjected to Conversions C1 and C4 (Pos. 812-900)
Gorilla G vs. human G              4      0          89               4.5
Gorilla G vs. human AAγ            4      0          89               4.5
Gorilla G vs. human BAγ            5      1          89               6.7
Gorilla G vs. gorilla A            5      1          89               6.7
Human G vs. human AAγ              0      0          89               0
Human G vs. human BAγ              5      1          85               7.1
Human G vs. gorilla A              5      1          85               7.1
Human AAγ vs. human BAγ            5      1          85               7.1
Human AAγ vs. gorilla A            5      1          85               7.1
Human BAγ vs. gorilla A            0      0          85               0
In the Region of IVS 2 Subjected to Conversions C1, C2, and C4 (Pos. 901-1128)
Gorilla G vs. human G             11      0         228               4.8
Gorilla G vs. human AAγ           11      0         228               4.8
Gorilla G vs. human BAγ           11      0         228               4.8
Gorilla G vs. gorilla A            8      0         228               3.5
Human G vs. human AAγ              0      0         228               0
Human G vs. human BAγ              0      0         228               0
Human G vs. gorilla A              7      0         228               3.1
Human AAγ vs. human BAγ            0      0         228               0
Human AAγ vs. gorilla A            7      0         228               3.1
Human BAγ vs. gorilla A            7      0         228               3.1
In the Region Subjected to Conversions C, , C3, and C4 (Pos. l8 11 Spanning 5’ Untranslated and Flanking, Exon 1,
IVS 1, Exon 2, and a 5’ Portion of IVS 2)
human G . . . . .
human AAy . . .
human BAy . . .
gorilla A . . . . . .
11
12
14
18
Human G vs. human AAy . .
Human G vs. human BAy . .
Human G vs. gorilla A . . . . .
7
17
Human AAy vs. human BAy
Human AA y vs. gorilla A . . .
Human BAy vs. gorilla A . . .
Gorilla
Gorilla
Gorilla
Gorilla
G
G
G
G
vs.
vs.
vs.
vs.
1
1
2
1
798
798
794
808
1.5
1.6
2.0
2.4
0
1
2
798
794
795
0.1
1.0
2.4
7
18
1
2
794
795
1.0
2.5
18
3
791
2.7
all three Aγ-sequences, but within each group these positions are identical. In the 12
remaining sites, six are the same within the Aγ-group, whereas six are identical within
the Gγ-group. Therefore, parsimony analysis supports joining the gorilla Gγ- and
human Gγ-genes into one branch and the gorilla Aγ- and human Aγ-sequences into
another.
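The percentage column of table 1 can be approximated from the other three columns. The sketch below assumes that the difference is the sum of nucleotide replacements (NRs) and gaps divided by the number of positions shared; this formula is an assumption, not one stated in the text, and the function name is hypothetical. It agrees with the four 3’-region rows used here, but other rows of the table may differ from it slightly.

```python
def percent_difference(nrs: int, gaps: int, positions_shared: int) -> float:
    """Percent difference between two aligned sequences.

    ASSUMPTION: difference = (NRs + gaps) / positions shared * 100.
    This is inferred from the table values, not given by the authors.
    """
    if positions_shared <= 0:
        raise ValueError("positions_shared must be positive")
    return 100.0 * (nrs + gaps) / positions_shared


# (NRs, gaps, positions shared) for four rows of the 3' untranslated
# and flanking region (positions 1577-1837) in table 1.
rows = {
    "Gorilla G vs. human G": (4, 2, 259),
    "Gorilla G vs. human AAγ": (28, 4, 256),
    "Gorilla G vs. gorilla A": (29, 5, 253),
    "Human AAγ vs. human BAγ": (1, 0, 258),
}

for label, (nrs, gaps, shared) in rows.items():
    print(f"{label}: {percent_difference(nrs, gaps, shared):.1f}%")
```

Under this assumption the orthologous Gγ comparison comes out near 2.3%, while the Gγ versus Aγ comparisons come out near 13%, which is the contrast the argument above relies on.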
An Ancestral Hominoid Conversion
In contrast to the 3’ gene region, the remaining portions of the Gγ- and Aγ-genes
(namely, coding and IVS regions) are considerably more similar, which indicates a
conversion of one locus by another and can be seen by analyzing the sequences in
IVS 2. The minimum difference between human and gorilla throughout this region
is 3.4%, a value obtained by comparing the Gγ-sequences of each species. Yet comparison
of the two most divergent sequences among the five, IVS 2 from the gorilla
Gγ- and Aγ-genes, yields a value of only 5.1%, roughly one and a half times as large
as the difference for this region between species and much less than the value expected
if these sequences had accumulated substitutions for the full 35 Myr since the original
duplication. On the basis of this similarity for the gorilla Gγ- and Aγ-genes, we conclude
that long after the original duplication, but before the separation of human and gorilla,
a gene conversion occurred whereby DNA from one of the two loci was replaced by
sequence from the other locus. This first or earlier conversion event in the hominoid
ancestor of both humans and gorillas is labeled C1 in figure 5.
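The size of the shortfall can be illustrated with a rough extrapolation. The sketch below is not the authors' calculation: it assumes that differences accumulate linearly with time, applies no correction for multiple substitutions at one site, and takes the human-gorilla divergence as about 6 Myr and the duplication as about 35 Myr, the figures quoted in this paper. The function names are hypothetical.

```python
def expected_divergence(observed_pct: float, observed_myr: float,
                        target_myr: float) -> float:
    """Linearly scale an observed percent difference to another time span.

    ASSUMPTION: constant rate, no multiple-hit correction.
    """
    return observed_pct * target_myr / observed_myr


def implied_time(divergence_pct: float, reference_pct: float,
                 reference_myr: float) -> float:
    """Time span (Myr) matching a percent difference on the same linear scale."""
    return divergence_pct / reference_pct * reference_myr


# Values taken from the text: 3.4% between human and gorilla Gγ IVS 2
# (about 6 Myr), 5.1% between gorilla Gγ and Aγ IVS 2, duplication
# about 35 Myr ago.
no_conversion = expected_divergence(3.4, 6.0, 35.0)
print(f"Expected Gγ/Aγ difference without conversion: {no_conversion:.1f}%")
print(f"Time matching the observed 5.1%: {implied_time(5.1, 3.4, 6.0):.1f} Myr")
```

On this crude scale, unconverted loci would differ by roughly 20% in IVS 2, about four times the observed 5.1%, which is the sense in which the observed value is "much less than the value expected."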
In the region starting at nucleotide position 1182 (just 3’ to the hot spot in IVS
2) and progressing downstream, no further conversions after C1 are evident. From
position 1182 to 1450, at the 3’ end of IVS 2, human Aγ- and Gγ-genes have accumulated
13 substitutions over 269 nucleotides compared to only seven substitutions between
either human Aγ-allele and the gorilla Aγ-gene, and only two substitutions between
human Gγ and gorilla Gγ (see fig. 3). Furthermore, in six of the 16 substitutions, all
Gγ-sites are alike between human and gorilla but differ from all Aγ-sites which, in
turn, are alike between human and gorilla. There are four sites (positions 1280, 1281,
1285, and 1286) that specifically group AAγ and BAγ together. Thus, by parsimony
analysis, both Gγ-sequences for this region fall into one group and all three Aγ-sequences
into another.
Evidence for C1 is strengthened by pairwise comparisons of the Gγ- and
Aγ-exons, either between or within species. If no conversion had occurred in the
descent of gorilla Gγ- and Aγ-sequences for the full 35 Myr since the duplication,
about 28 silent substitutions would be expected between the two γ-genes, using the
silent substitution rate cited above. Yet, as already emphasized, no silent substitutions
were found. Amino acid sequence data for other primate γ-chains (table 2) also
provide evidence for C1. Both Gγ- and Aγ-chains of humans, chimpanzees, and gorillas
have His at codon 77 and Thr at codon 135, whereas Old World monkeys have Asn
at position 77 and Ala at position 135. One orangutan γ-chain has Thr and the other
Ala at position 135. This suggests that in one of the duplicated loci an Ala to Thr
replacement occurred at position 135 in early hominoids before the ancestral separation
of orangutan from African apes and humans, and that after the orangutan separation
a gene conversion (C1) replaced the sequence encoding 135 Ala by that encoding 135
Thr in the common ancestor of African apes and humans. Although the original
duplication clearly preceded the ancestral divergence of Old World monkeys and the
hominoids, present-day African ape and human Gγ-sequences are closer to African
ape and human Aγ-sequences than are either set of sequences to those of Old World
monkeys.
Three Additional Conversions in Human γ-Genes
From the differences in table 1 and parsimony analysis of substitution sites, we
find evidence for three additional conversions, all within the human genes and none
extending beyond the 3’ end of the hot spot sequence in IVS 2. We also observe a
small stretch of sequence in IVS 2, from positions 812 to 900, where gorilla Aγ- and
human BAγ-genes are identical (see table 1). Within this region there is a 4-bp deletion
(858-861), an inversion to TC from CT (847-848), and five point substitutions which
are shared in common (fig. 3). This small sequence represents DNA in the human
BAγ-allele, which has remained unconverted since the species diverged and which is
flanked on each side by sequences that have undergone conversion. The fact that
this segment of BAγ is unconverted is also supported by the parsimony criterion, since
a minimum of four additional genic events (three more substitutions at 822 and 847-848
and one 4-base deletion or insertion at 858-861) would be required if human
Table 2
Amino Acid Sequence Differences among Primate γ-Chains
RESIDUE NUMBER
SPECIES
Homo sapiens
.......
Pan troglodytes.
.....
.......
Gorilla gorilla
73
75
77
104
117
135
136
139
REFERENCES
Asp
Ile
His
Lys
His
Thr
Gly/Ala
Ser
Schroeder et al. 1963
Slightom et al. 1980
DeJong 1971
Asp
Ile
His
Thr
Gly/Ala
Ile
His
Lys
Lys/Arg
His
Asp/Gly
His
Thr
Gly/Ala
Ser
Ser
?
Ile/Val
?
?
?
Thr/ Ala
Gly
Ser
Ile
Asn
Arg
His
Ala
Gly
Ser
Mahoney and Nute 1980
Ile
Asn
Arg
Asn
Arg
Ala
Ala
Gly
Gly
Ser
Ile/Val
Arg
His
Ser
Nute and Mahoney 1979a
Nute and Mahoney 1979b
?
?
?
?
Ala
Gly
Ala
Huisman et al. 1973
Huisman et al. 1973
This report
Pongo pygmaeus
.....
Huisman et al. 1973
Schroeder et al. 1978
Macaca mulatta
.. ..,
Macaca nemestrina
..
Papio cynocephalus . . .
Saguinus fuscicollis . . .
NOTE.-The amino acid sequence differences among the γ-chains of the apes and Old World monkeys determined by protein sequencing or inferred from the DNA sequence. Note that the
gorilla shares an Arg with the Old World monkeys at position 104.
Sequences of Aγ and Gγ chains of P. troglodytes were inferred from the amino acid compositions of small peptides.
BAγ were not joined first to gorilla Aγ. Thus, the best arrangement would have human
BAγ and gorilla Aγ grouped closest together.
We hypothesize that three conversion events occurred between Aγ- and
Gγ-sequences in the human line after the separation of humans and gorillas. One of
these, designated C2 in figure 5, is evident in the BAγ-allele and extends from immediately
beyond the unconverted region into the 5’ end of the hot spot sequence
(i.e., from position 901 to 1128). Among 13 substitution sites found in this region,
the three human sequences are identical, whereas the gorilla genes differ considerably
from human, 11 differences for Gγ and seven differences for Aγ (see fig. 3 and
table 1).
Evidence for another conversion, C3 in figure 5, can also be seen in the human
BAγ-allele (table 1, positions 1-811), that is, a conversion 5’ to the stretch of sequence
(positions 812-900) where human BAγ- and gorilla Aγ-genes are identical. We are
uncertain about the 5’ and 3’ ends of C3, although it apparently terminates at its 3’
end prior to position 820 (fig. 3). Up to position 811 there are 29 substitution sites,
of which only two (568 and 715) support joining the human BAγ-allele to gorilla Aγ,
whereas six sites (42, 99, 627-637 gap, 661, 742, and 777) would join human
BAγ-, AAγ-, and AGγ-sequences together as a branch distinct from the two gorilla
genes. Moreover, among the 23 remaining sites, there are 10 at which gorilla Gγ and
the three human sequences are identical and differ from gorilla Aγ, compared to only
four sites in which gorilla Aγ and the three human sequences are identical and differ
from gorilla Gγ. Thus, in these regions, as a result of C2 and C3, the most parsimonious
grouping has human BAγ joined to the branch of human AAγ and AGγ.
The last conversion event, C4, is seen when the human AAγ is compared to
human AGγ. These sequences are nearly identical from the 5’ end up to the hot spot.
This event was described by Slightom et al. (1980) and, based on the near identity
of sequences between AAγ and AGγ in this region, must have occurred recently,
perhaps within the last million years (Shen et al. 1981). Furthermore, AAγ-sequence
must have been replaced by AGγ-sequence. If the reverse had occurred in C4, AGγ
would be virtually identical not only to the AAγ-allele but also the BAγ-allele. Instead,
the BAγ-allele diverges significantly from both AAγ and AGγ. There is much weaker
but suggestive evidence that in the case of C3, the converting sequence also originated
from Gγ. In this case the three human sequences diverge much less from gorilla Gγ
than from gorilla Aγ. The direction of other conversions (i.e., C1 and C2) cannot be
determined at present but should be determinable once other hominoid and Old
World monkey γ-genes are sequenced.
Conclusions
The data presented here extend the arguments of Slightom et al. (1980) that the
hot spot sequence is important in mediating conversion events between fetal γ-globin
genes. Both C2 and C4 extend up to this region but not beyond it. The finding that
the hot spot regions in gorilla Aγ and human AAγ are nearly identical, whereas 5’ to
this region the gorilla gene shares unique substitutions with the human
BAγ-allele but not with AAγ, strongly suggests that one of the two human Aγ-alleles
may have resulted from a recombination event at this location. When additional
alleles in human and other closely related species are sequenced, it should be possible
to determine whether the hot spot has been involved with both conversion and
crossing-over events. Indeed, both events are believed to be related by the same
underlying mechanism (Radding 1978).
As already noted, the coding sequences of the gorilla and human Gγ-genes are
identical, whereas the Aγ-genes differ by two substitutions resulting in an Asp to Gly
change at codon 73 and an Arg to Lys change at codon 104. Because codon 104 is
also Arg in two species of macaques (Macaca mulatta and M. nemestrina) and a
baboon (Papio cynocephalus), whereas in human and chimpanzee it is Lys (table 2),
we suggest not only that the gorilla γ-gene retains an “ancestral” feature shared
with Old World monkeys but also that humans and chimpanzees share the derived feature
and thereby a more recent common ancestor. We can more thoroughly test this
possibility once nucleotide sequences are obtained from the paired γ-genes of chimpanzees and other hominoids.
Gene conversions appear to be common events in the evolution of the
human γ-genes. What is the consequence of such a process? Possibly, as noted by
Dover (1982), conversion will slow the effective evolutionary rate. As new substitutions
occur in a sequence they will tend to be lost because they are likely to be converted
back to the sequence of the more common allele in the species population. The faster
the conversion rate, the slower the effective substitution rate.
Finally, an important consequence for the reconstruction of phylogenetic relationships has emerged from this study. Conversion between related genes in one
species may result in the transfer of stretches of sequence quite different from that
found in what appears to be the orthologous gene in another species. In the case of
γ-genes our present results suggest that this problem may be less acute when
Gγ-sequences are compared to one another in different species than when Aγ-sequences
are compared, perhaps because Gγ is more likely to be the “donor” sequence. To see
if such interpretations are warranted and to better reconstruct the history of these
two nonallelic loci, we must again stress the need to sequence γ-genes from additional
species of higher primates.
Acknowledgments
This study was supported by the following grants: NIH GM 28931 (A. F. S.),
NSF DEB 7810717 (M. G.), and NIH HD 16595 (J. L. S.); J. C. was supported by
a National Science Foundation predoctoral fellowship. We wish to thank Drs. Haig
Kazazian, Nobuyo Maeda, Barbara Schmeckpeper, and Oliver Smithies for helpful
discussions; Timothy W. Theisen for technical assistance; and Dr. M. Bush at the
National Zoological Park for providing blood samples. J. L. S. and
L.-Y. E. C. are also indebted to Drs. Frederick Blattner and Oliver Smithies for the
shared use of laboratory space and equipment. This article is paper no. 2692 from
the Laboratory of Genetics, University of Wisconsin-Madison.
LITERATURE CITED
ARNHEIM, N., and E. M. SOUTHERN. 1977. Heterogeneity of the ribosomal genes in mice and men. Cell 11:363-370.
BARRIE, P. A., A. J. JEFFREYS, and A. F. SCOTT. 1981. Evolution of the β-globin gene cluster in man and the primates. J. Mol. Biol. 149:319-336.
BLIN, N., and D. W. STAFFORD. 1976. A general method for isolation of high molecular weight DNA from eukaryotes. Nucleic Acids Res. 3:2303-2308.
BREATHNACH, R., C. BENOIST, K. O’HARE, F. GANNON, and P. CHAMBON. 1978. Ovalbumin gene: evidence for a leader sequence in mRNA and DNA sequences at the exon-intron boundaries. Proc. Natl. Acad. Sci. USA 75:4853-4857.
CZELUSNIAK, J., M. GOODMAN, D. HEWETT-EMMETT, M. L. WEISS, P. J. VENTA, and R. E. TASHIAN. 1982. Phylogenetic origins and adaptive evolution of avian and mammalian haemoglobin genes. Nature 298:297-300.
DEJONG, W. W. W. 1971. Chimpanzee foetal haemoglobin: structure heterogeneity of the γ chain. Biochim. Biophys. Acta 251:217-226.
DOVER, G. 1982. Molecular drive: a cohesive mode of species evolution. Nature 299:564-572.
EDWARDS, A. F. W., and L. L. CAVALLI-SFORZA. 1963. The reconstruction of evolution. Ann. Hum. Genet. 27:104-105.
EFSTRATIADIS, A., J. W. POSAKONY, T. MANIATIS, R. M. LAWN, C. O’CONNELL, R. A. SPRITZ, J. K. DERIEL, B. G. FORGET, S. M. WEISSMAN, J. L. SLIGHTOM, A. E. BLECHL, O. SMITHIES, F. E. BARALLE, C. C. SHOULDERS, and N. J. PROUDFOOT. 1980. The structure and evolution of the human β-globin gene family. Cell 21:653-668.
FARRIS, J. S. 1970. Methods for computing Wagner Trees. Syst. Zool. 19:83-92.
FITCH, W. M. 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20:406-416.
FORGET, B. G., C. CAVALLESEO, J. K. DERIEL, R. A. SPRITZ, P. V. CHOUDARY, J. T. WILSON, L. B. WILSON, V. B. REDDY, and S. M. WEISSMAN. 1979. Structure of the human globin genes. Pp. 367-381 in R. AXEL, T. MANIATIS, and C. F. FOX, eds. Eucaryotic gene regulation. ICN-UCLA Symposium on Molecular and Cellular Biology, 14. Academic Press, New York.
FRITSCH, E. F., R. M. LAWN, and T. MANIATIS. 1980. Molecular cloning and characterization of the human β-like globin gene cluster. Cell 19:959-972.
GAROFF, H., and W. ANSORGE. 1981. Improvements of DNA sequencing gels. Anal. Biochem. 115:450-457.
GOODMAN, M. 1976. Towards a genealogical description of the primates. Pp. 321-353 in M. GOODMAN and R. E. TASHIAN, eds. Molecular anthropology. Plenum, New York.
GOODMAN, M. 1981. Decoding the pattern of protein evolution. Prog. Biophys. Mol. Biol. 37:105-164.
GOODMAN, M., G. BRAUNITZER, A. STANGL, and B. SCHRANK. 1983. Evidence on human origins from haemoglobins of African apes. Nature 303:546-548.
HUISMAN, T. H. J., W. A. SCHROEDER, M. E. KEELING, W. GENGOZIAN, A. MILLER, A. R. BRODIE, J. R. SHELTON, and G. APELL. 1973. Search for non-allelic structural genes for γ-chains of fetal hemoglobins in some primates. Biochem. Genet. 10:309-318.
JAGADEESWARAN, P., B. G. FORGET, and S. M. WEISSMAN. 1981. Short interspersed repetitive DNA elements in eukaryotes: transposable DNA elements generated by reverse transcription of RNA Pol III transcripts? Cell 26:141-142.
JEFFREYS, A. J. 1979. DNA sequence variants in Gγ-, Aγ-, δ- and β-globin genes in man. Cell 18:1-10.
JOHANSON, D. C., and T. D. WHITE. 1979. A systematic assessment of early African hominids. Science 203:321-330.
KIMURA, M. 1983. The neutral theory of molecular evolution. Pp. 208-233 in M. NEI and R. K. KOEHN, eds. Evolution of genes and proteins. Sinauer, Sunderland, Mass.
LEWIN, R. 1983. How mammalian RNA returns to its genome. Science 219:1052-1054.
LI, W.-H. 1983. Evolution of duplicate genes and pseudogenes. Pp. 14-37 in M. NEI and R. K. KOEHN, eds. Evolution of genes and proteins. Sinauer, Sunderland, Mass.
LI, W.-H., T. GOJOBORI, and M. NEI. 1981. Pseudogenes as a paradigm of neutral evolution. Nature 292:237-239.
LIEBHABER, S. A., M. GOOSSENS, and Y. W. KAN. 1981. Homology and concerted evolution at the α1 and α2 loci of human α-globin. Nature 270:26-29.
LOVEJOY, L. O. 1981. The origin of man. Science 211:341-350.
MCHENRY, H. M., and R. S. CORRUCCINI. 1980. Late tertiary hominoids and human origins. Nature 285:397-398.
MAHONEY, W. C., and P. E. NUTE. 1980. Fetal hemoglobin of the rhesus monkey, Macaca mulatta: complete primary structure of the γ-chains. Biochemistry 19:4436-4442.
MANIATIS, T., R. C. HARDISON, E. LACY, J. LAUER, C. O’CONNELL, D. QUON, G. K. SIM, and A. EFSTRATIADIS. 1978. The isolation of structural genes from libraries of eucaryotic DNA. Cell 15:687-701.
MAXAM, A. M., and W. GILBERT. 1980. Sequencing end-labeled DNA with base specific chemical cleavages. Methods Enzymol. 65:499-560.
MESSING, J., R. CREA, and P. H. SEEBURG. 1981. A system for shotgun DNA sequencing. Nucleic Acids Res. 9:309-321.
MESSING, J., and J. VIEIRA. 1982. A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19:269-276.
MOORE, G. W., J. BARNABAS, and M. GOODMAN. 1973. A method for constructing maximum parsimony ancestral amino acid sequences on a given network. J. Theor. Biol. 38:459-485.
NUTE, P. E., and W. C. MAHONEY. 1979a. Complete amino acid sequence of the γ-chain from the major fetal hemoglobin of the pigtailed macaque, Macaca nemestrina. Biochemistry 18:467-472.
NUTE, P. E., and W. C. MAHONEY. 1979b. Complete sequence of the γ-chain from the fetal hemoglobin of the baboon, Papio cynocephalus. Hemoglobin 3:399-410.
PILBEAM, D. 1979. Recent finds and interpretations of Miocene hominoids. Annu. Rev. Anthropol. 8:333-352.
POON, R., Y. W. KAN, and H. W. BOYER. 1978. Sequence of the 3’ noncoding and adjacent coding regions of human γ-globin mRNA. Nucleic Acids Res. 5:4625-4630.
RADDING, C. M. 1978. Genetic recombination: strand transfer and mismatch repair. Annu. Rev. Biochem. 47:847-880.
RAZIN, A., and A. D. RIGGS. 1980. DNA methylation and gene function. Science 210:604-610.
SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
SCHROEDER, W. A., J. R. SHELTON, J. B. SHELTON, and T. H. J. HUISMAN. 1963. The amino acid sequence of the γ-chain of human fetal hemoglobin. Biochemistry 2:992-1008.
SCHROEDER, W. A., J. R. SHELTON, J. B. SHELTON, and T. H. J. HUISMAN. 1978. The Vγ-chain of fetal hemoglobin of the orangutan. Biochem. Genet. 16:1203-1205.
SHEN, S., J. L. SLIGHTOM, and O. SMITHIES. 1981. A history of the human fetal globin gene duplication. Cell 26:191-203.
SHEN, S.-H., and O. SMITHIES. 1982. Human globin ψβ2 is not a globin-related sequence. Nucleic Acids Res. 10:7809-7818.
SLIGHTOM, J. L., A. E. BLECHL, and O. SMITHIES. 1980. Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequence suggests that the DNA can be exchanged between these duplicated genes. Cell 21:627-638.
SLIGHTOM, J. L., S. M. SUN, and T. C. HALL. 1983. Complete nucleotide sequence of a French bean storage protein gene: phaseolin. Proc. Natl. Acad. Sci. USA 80:1897-1901.
SMITHIES, O., W. R. ENGELS, J. R. DEVEREUX, J. L. SLIGHTOM, and S.-H. SHEN. 1981. Base substitutions, length differences and DNA strand asymmetries in the human Gγ- and Aγ-fetal globin gene region. Cell 26:345-353.
SNYDER, L. R. G. 1980. Closely-linked alpha-chain hemoglobin loci in Peromyscus and other animals: speculations on the evolution of duplicate loci. Evolution 34:1077-1098.
SOUTHERN, E. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.
WILLIAMS, B. G., and F. R. BLATTNER. 1979. Construction and characterization of the hybrid bacteriophage lambda Charon vectors for DNA cloning. J. Virol. 29:555-575.
WILSON, J. T., L. B. WILSON, J. K. DERIEL, L. VILLA-KOMAROFF, A. EFSTRATIADIS, B. G. FORGET, and S. M. WEISSMAN. 1978. Insertion of synthetic copies of human globin genes into bacterial plasmids. Nucleic Acids Res. 5:563-581.
ZIMMER, E. A., S. L. MARTIN, S. M. BEVERLEY, Y. W. KAN, and A. C. WILSON. 1980. Rapid duplication and loss of genes coding for the α-chains of hemoglobin. Proc. Natl. Acad. Sci. USA 77:2158-2162.
ZUCKERKANDL, E. 1964. Further principles of chemical paleogenetics as applied to the evolution of hemoglobin. Protides Biol. Fluids 12:102-109.
WALTER M. FITCH, reviewing editor
Received October 25, 1983; revision received March 9, 1984.