The Sequence of the Gorilla Fetal Globin Genes: Evidence for Multiple Gene Conversions in Human Evolution

Alan F. Scott, Peter Heath, Stephen Trusko, and Samuel H. Boyer*
Department of Medicine and *Howard Hughes Medical Institute, Johns Hopkins University

William Prass, Morris Goodman, and John Czelusniak
Department of Anatomy, Wayne State University

L.-Y. Edward Chang and Jerry L. Slightom
Department of Genetics, University of Wisconsin-Madison

Two fetal globin genes (Gγ and Aγ) from one chromosome of a lowland gorilla (Gorilla gorilla gorilla) have been sequenced and compared to three human loci (a Gγ-gene and two Aγ-alleles). A comparison of regions of local homology among these five sequences indicates that long after the duplication that produced the two nonallelic γ-globin loci of catarrhine primates, about 35 million years (Myr) ago, at least one gene conversion event occurred between these loci. This conversion occurred not long before the ancestral divergence (about 6 Myr ago) of Homo and Gorilla. After this ancestral divergence, a minimum of three more gene conversion events occurred in the human lineage. Each human Aγ-allele shares specific sequence features with the gorilla Aγ-gene; one such distinctive allelic feature involves the simple repeated sequence in IVS 2. This suggests that early in the human lineage the γ-genes may have undergone a crossing-over event mediated by this simple repeated sequence. The DNA sequences from coding regions of both Gγ- and Aγ-loci, a comparison of 292 codons in the corresponding gorilla and human genes, show an unusually low evolutionary rate, with only two nonsilent differences and, surprisingly, not even one silent substitution. The two nonsynonymous substitutions observed predict a glycine at codon 73 and an arginine at codon 104 in the gorilla Aγ-sequence rather than aspartic acid and lysine, respectively, in human Aγ.
Because only arginine has been found at position 104 in γ-chains of Old World monkeys, it may represent the ancestral residue lost in gorilla and human Gγ-chains and in the human Aγ-chain. Possibly the arginine codon (AGG) was replaced by the lysine codon (AAG) in the Gγ-gene of a common ancestor of Homo and Gorilla and then was transferred to the Aγ-gene by subsequent conversions in the human lineage. DNA sequence conversions, similar to that attributed to the fetal γ-globin genes, appear to be relatively frequent phenomena and, if widespread throughout the genome, may have profound evolutionary consequences.

Keywords: gene conversion, fetal globin genes, gene evolution, gorilla, human.
Address for correspondence and reprints: Alan F. Scott, Department of Medicine, Johns Hopkins University, Baltimore, Maryland 21205.
Mol. Biol. Evol. 1(5):371-389. 1984. © 1984 by The University of Chicago. All rights reserved. 0737-4038/84/0105-0001$02.00

Introduction

The β-globin genes of humans occur within a chromosomal region of about 50 kilobase pairs (kbp) of DNA and consist of an embryonic locus (ε), two fetal loci (γ), and two adult loci (δ and β), as well as a nontranscribed locus or pseudogene (ψβ1), with the order of these genes from 5' to 3' being ε, Gγ, Aγ, ψβ1, δ, β (Fritsch et al. 1980; Shen and Smithies 1982). The same sequence organization of β-globin genes, including the presence of two adjacent γ loci, has also been found in other members of the primate infraorder Catarrhini (e.g., in the Old World monkey Papio anubis and in the African ape Gorilla gorilla) (Barrie et al. 1981). However, only one γ-locus has been found in the New World monkey Aotes trivirgatus (representing Platyrrhini, the sister taxon of Catarrhini) as well as in a prosimian primate Lemur fulvus.
The conclusion has been drawn that, during evolution of the Catarrhini, the β-globin gene cluster became more complex as a result of a duplication of the fetal γ-locus in the basal catarrhines approximately 35 Myr ago after their separation from progenitors of New World monkeys (Barrie et al. 1981; Shen et al. 1981). The products of the two human γ-loci can be distinguished at residue position 136 by the presence of glycine (Gly) in one chain and alanine (Ala) in the other. Fetal γ-chains with Gly and Ala at position 136 have also been found in chimpanzees (Pan troglodytes) (De Jong 1971) and in gorillas (G. gorilla) (Huisman et al. 1973), but only Gly has been detected at this position in orangutans (Pongo pygmaeus) (Schroeder et al. 1978) and in Old World monkeys (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980). Other than the Gly/Ala difference at position 136, the two human γ-chains have an identical amino acid sequence, which differs from those of Old World monkeys by three to four amino acid residues (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980). It has been observed from amino acid sequence data on both the α- and γ-globins from several organisms that paralogous genes within a species are more alike than are the orthologous genes in different species. Thus intraspecific duplicates appear to have evolved in parallel (e.g., Snyder 1980). This observation has been confirmed by DNA studies and is true not only for the globin genes (Zimmer et al. 1980; Liebhaber et al. 1981) but for other repeated sequences as well (e.g., the ribosomal genes; Arnheim and Southern [1977]) and has been referred to as "concerted evolution" (Zimmer et al. 1980). The genetic process underlying concerted evolution for the human γ genes was initially studied by Slightom et al. (1980) and Shen et al. (1981), who sequenced both loci and their flanking regions.
Two alleles of the Aγ-gene, from the same individual, were identified on the basis of their sequence and location on separate chromosomal homologs designated A and B. A portion of the Aγ-locus from chromosome A was shown to resemble more closely the 5' Gγ-locus from that chromosome than it did its allele on chromosome B. However, the 3' regions of the two Aγ-loci are truly allelic in that each codes for a protein specifying Ala at position 136. It was concluded that sequences from the AGγ-gene had been superimposed on the AAγ-locus of chromosome A by a mechanism involving gene conversion. The allele from chromosome B was thought to represent an unconverted gene. The 3' boundary of the conversion appeared to coincide with an unusual simple sequence consisting primarily of (TG)n, where n ranged from 12 (human AAγ) to 22 (human AGγ). This region is about 600 base pairs (bp) downstream from the beginning of IVS 2 and was described as a "hot spot" because it differed so significantly among the three sequences and because of its apparent role in the conversion of AAγ. The converted portion of the AAγ-allele extends from the hot spot in a 5' direction for about 1,500 bp (Shen et al. 1981). In this study we examine the fetal globin genes of the gorilla because this species is sufficiently close to man that extensive similarity would be expected, yet small changes of the sort seen between human alleles still might be detected and serve to explain further the process by which conversion is mediated.

Material and Methods

Material

Restriction endonucleases EcoRI, BamHI, PstI, AvaI, BglI, BglII, HindIII, HincII, SacI, SmaI, and XbaI were from Promega Biotec (Madison, Wis.) or Bethesda Research Laboratories (Gaithersburg, Md.). Polynucleotide kinase, M13 phage DNA, and sequencing primer were from P-L Biochemicals (Milwaukee), and bovine intestinal alkaline phosphatase was from Boehringer-Mannheim (Indianapolis).
DNA polymerase large fragment and Bal31 nuclease were from Bethesda Research Laboratories. Proteinase K was obtained from EM Reagents. The [α-32P]dATP (800 Ci/mM) and [γ-32P]ATP (2,000-3,000 Ci/mM; 1 Ci = 3.7 × 10^10 Bq) and T4 ligase were from New England Nuclear (Boston). Chemicals used for Maxam and Gilbert sequencing were obtained from the recommended vendors (Maxam and Gilbert 1980). X-ray film (X-Omat AR-5) was from Kodak.

DNA Cloning and Isolation

DNA was prepared by the method of Blin and Stafford (1976) from blood from a male lowland gorilla (Gorilla gorilla gorilla) (Tomoka, who lives at the National Zoological Park in Washington, D.C.) and partially digested with EcoRI. Fragments of 12-20 kbp were selected from sucrose gradients and cloned into the EcoRI "arms" of the lambda phage Charon 4A (Maniatis et al. 1978; Williams and Blattner 1979). The resulting phage library was screened with the γ-globin cDNA probe pJW151 (Wilson et al. 1978). Several phage containing γ-globin genes were identified, and DNA was prepared from the clone designated GyG1. Restriction enzyme site mapping and blot hybridization (Southern 1975) with the human γ-globin cDNA probe (fig. 1) showed that this phage contained the complete gene of both γ loci (fig. 2). Various regions of the lambda insert DNA were subcloned into the plasmid pBR322 (Fritsch et al. 1980) or M13 phage vectors (Messing et al. 1981).

DNA Sequencing

Cloning into M13 vectors was accomplished by using specific restriction digestions of plasmid or lambda phage DNA and the appropriate vector DNA and by generating randomly terminated fragments with Bal31 nuclease digestion followed by blunt end ligation into the SmaI site of M13mp9 (Messing and Vieira 1982). Enzymatic sequencing was done as described by Sanger et al. (1977).
Chemical sequencing of clone GyG1 was done by end labeling the fragments obtained by enzymatic digestion of lambda DNA (20-50 µg), isolating these labeled fragments, and sequencing them as described by Maxam and Gilbert (1980). DNA sequences were analyzed on 60- or 85-cm-long and 0.4-mm-thick gels (enzymatic reactions), or on 104-cm-long and 0.2-mm-thick water-jacketed gels (chemical reactions). Long gel plates were treated as described by Garoff and Ansorge (1981) so that acrylamide used to form the gel matrix was bonded directly to the plate. The times employed for chemical sequencing reactions and the procedure for pouring gels followed the directions of Slightom et al. (1983). Most of the gorilla sequence reported here was obtained independently by both procedures and confirmed in more than one laboratory.

Evolutionary Reconstruction: Parsimony Procedure

The main principle used in aligning the gorilla Gγ- and Aγ-sequences against each other as well as against the human sequences (Gγ and the two Aγ-alleles) and [...] used sparingly in the gene sequences and were placed so as to maximize the number of matching bases. This helped ensure that for each homologous region we could find the order of evolutionary branching that involved the fewest possible genetic events, that is, maximized genetic likenesses while minimizing parallel and back substitutions. Each nucleotide substitution was counted as a single event. Gaps (i.e., insertions or deletions), regardless of the number of nucleotides involved, were also counted as single events. Inversions involving double base pairs were counted as two independent events, even though we cannot exclude the possibility that they may occur by a single process. Because of the very extensive similarity among the five genes it was not necessary to weight gaps more than nucleotide substitutions in aligning these sequences.
In fact, in our alignment, no gaps were employed in any sequences at 97% of the positions (1,779 of the 1,837 positions of the full alignment shown in fig. 3). Nucleotide substitutions and gaps occurred at only 110 sites among the five genes. In constructing the order of ancestral branching with the fewest events (i.e., maximum parsimony), we determined the number of nucleotide substitutions in branches of particular trees using the algorithm of Fitch (1971). For the orthologous coding regions, we included amino acid sequence data for the Old World monkeys (Nute and Mahoney 1979a, 1979b; Mahoney and Nute 1980), chimpanzee (De Jong 1971), and orangutan (Huisman et al. 1973), as well as the human and gorilla sequences. The parsimony procedure used for these sequences considered, in addition to the amino acid sequences, the actual nucleotide sequences when known (Moore et al. 1973; Czelusniak et al. 1982).

Identifying Gene Conversion Regions

Our approach to the documentation of gene conversion events is to identify sequence regions that diverged much less than would be expected based on the amount of divergence of surrounding regions of the duplicated genes involved. This approach is illustrated by a simple hypothetical situation in figure 4. Because the Gγ- and Aγ-sequences are descendants of a γ-globin gene which apparently duplicated in the ancestors of Old World monkeys and hominoids about 35 Myr ago, we see significant sequence divergence between the two loci. We can test whether the sequence differences are uniformly distributed throughout the genes by parsimony analysis, where sequences are clustered on the basis of the number of substitutions required to generate one sequence from the other. The shortest route between two sequences corresponds to the most parsimonious solution. Therefore, for regions of the genes which have not undergone conversion, the Gγ-sequences should form a distinct group from the Aγ-sequences.
However, if sequences were exchanged between the tandem loci, then the most parsimonious groupings would not distinguish 5' and 3' loci as separate groups, and instead Gγ- and Aγ-sequences in one species would cluster in a group separately from those in another species. Both sorts of arrangements can be seen in the five γ gene sequences (see fig. 3). By preparing parsimony trees for various regions of homology within the γ-sequences, we could provide evidence for the conversion events described below (see fig. 5).

Results and Discussion

Restriction Enzyme Site Mapping

We have isolated the fetal globin gene region from a lowland gorilla in the recombinant Ch 4A phage clone GyG1. Restriction mapping of this clone with EcoRI, BamHI, and HindIII shows that it contains most of the sites found in the human γ-gene clone 165.24 (Slightom et al. 1980). Because these two recombinants contain almost identical γ-gene regions (the human clone has an extra 1.5-kbp EcoRI fragment on its 5' end) and are in the same 5' to 3' orientation with respect to the Ch 4A vector, a direct comparison of their enzyme sites can easily be made (see fig. 1). Blot hybridization of these DNA fragments to 32P-labeled γ-globin cDNA, isolated from the plasmid pJW151 (Wilson et al. 1978), clearly shows that the gorilla clone does contain both fetal globin genes in EcoRI fragments of sizes characteristic of this region in humans (Gγ- and Aγ-globin genes in 6.9-kbp and 2.64-kbp fragments, respectively). The only difference noted in the EcoRI fragment pattern was in the 3' region of the Gγ-genes, where humans have a fragment of about 1.5 kbp, whereas the absence of an EcoRI site in the gorilla clone results in a larger 2.3-kbp fragment containing the 3' third exon of the Gγ-gene (see figs. 1 and 2). We also found that neither of these gorilla genes contains a HindIII site in IVS 2 which is polymorphic in the human genes (Jeffreys 1979). In contrast, the human clone 165.24 has a HindIII site in IVS 2 of Gγ (see figs. 1 and 2).

FIG. 3. Nucleotide sequence comparison of Gγ- and Aγ-globin genes from human and gorilla. Human Gγ- and Aγ-globin gene nucleotide sequences are from Slightom et al. (1980) and Shen et al. (1981) with the genes in Ch4A 165.24 (Hsa AG and Hsa AA) from chromosome A and the gene in Ch3A 51.1 (Hsa BA) from chromosome B of a single individual. Gorilla Gγ- and Aγ-globin gene nucleotide sequences are from clone Ch4A GyG1 and are referred to as Ggo G and Ggo A. The numbering system is set by the overall alignment. The complete nucleotide sequence for Hsa BAγ has also been obtained from another clone from the same individual no. 563 (J. L. Slightom, unpublished data). Asterisks indicate the presence of gaps, which are used to maximize identities. The complete nucleotide sequence for the gorilla Gγ-globin gene is written on the top line. If a nucleotide difference is noted in any of the sequences, the nucleotide for that position is written for each of the genes. Nucleotides that may have biological importance are noted: single overline (TATA box) and double overline (poly[A] addition signal). The amino acid sequence is printed below the dashed counting line, and amino acid replacements are printed below the appropriate positions at residues 73, 104, and 136. The initiator codon is the first Met and the terminator codon is designated TER. Arrows indicate exon-intron boundaries which conform to the GT/AG rule (Breathnach et al. 1978).

FIG. 4. Hypothetical scheme for identifying multiple gene conversions. The shading key distinguishes regions related by duplication or speciation, the converting region, regions related by conversion 1, and regions related by conversion 2. d1 is the distance between homologous regions related paralogously by duplication; it is the distance between regions, one from locus A and one from locus B. d2 is the distance between homologous regions related orthologously by speciation; it is the distance between regions, both from locus A or both from locus B. d3 is the distance between homologous regions related by conversion 1. d4 is the distance between homologous regions related by conversion 2. d1 > d2 signifies that duplication preceded speciation; d1 > d3 identifies gene conversion 1; d3 > d2 signifies that gene conversion 1 preceded speciation; d2 > d4 identifies gene conversion 2. Identification of the A gene as the converting gene in conversion 2 requires not only that A and B genes of the right-hand species diverge less from each other than each gene diverges from either A or B gene of the left-hand species (as well as from either A or B gene of the ancestral species of right- and left-hand species) but also that each gene of the right-hand species diverges less from A than from B gene of the left-hand species (as well as diverges less from A than from B gene of the ancestral species of right- and left-hand species).

FIG. 5. History of gamma gene evolution and conversion events. A duplication of a γ-globin gene encoding glycine at position 136 occurred in the early catarrhine primates about 35 Myr ago. Because a glycine is coded at this position in both γ-genes of the orangutan and Old World monkeys (table 2) whereas a replacement with alanine in the 3' gene is found in Homo and Gorilla, we conclude that this change occurred after the divergence of Pongo (about 14-18 Myr ago) but before the Homo and Gorilla branching (about 5-6 Myr ago). This replacement may have occurred before the first conversion (C1), or it may have happened afterward, but still before the separation of Homo and Gorilla. If the glycine-to-alanine replacement occurred before C1, the 3' boundary of C1 can be placed in exon 3 at codon position 135 or nucleotide position 1543 (fig. 3), but if the replacement occurred after C1 the 3' boundary can be placed at the start of the 3' untranslated region. C1 is common to both humans and gorillas and is estimated to have occurred about 10 Myr ago, using the replacement rate change of 1% for every 10 Myr calculated by Efstratiadis et al. (1980). No further conversions have been identified in the gorilla lineage, but three have been identified in the human lineage. C2 and C3 are estimated to have occurred about 2-3 Myr ago either in a common ancestor of human chromosome types A and B or in an early version of chromosome B itself. Conversion C2 is evident in the BAγ-gene from positions 901 to 1128 and extending into the "hot spot," and C3 is also evident in the BAγ-gene, but from positions 42 to 777 (fig. 3). C4, extending over some 1,500 bp, is estimated to have occurred no earlier than 1 Myr ago on human chromosome type A (Shen et al. 1981). Its effects are evident in the AAγ gene from positions 42 to 1128, with its 3' boundary being located in the hot spot region. Whereas in the case of C4 the converting sequences clearly come from the AGγ gene, in each of the earlier events we have yet to establish the origin of the converting sequence.

Structural Comparison of the Gorilla and Human Fetal Globin Genes

We have obtained the nucleotide sequence for identical regions of the gorilla Gγ- and Aγ-globin genes, starting 55 bp 5' of the expected capped nucleotide of the mRNA and extending 171 bp 3' of the expected poly(A) addition for a total of 1,837 bases (see fig. 3). For comparison, the human Gγ- and Aγ-globin genes of chromosome A and the Aγ-globin gene of chromosome B are also shown in figure 3. The 5' nontranslated regions for these gorilla and human genes show very few differences and none that would be expected to alter their expression.
The gorilla genes share the identical promoter sequence (AATAAA) at the same position, 31 bp before the expected capped site (Efstratiadis et al. 1980). Because the gorilla and human promoter sequences are located in identical positions, we expect that the 5' nontranslated region will also be of the same length (53 bp). The gorilla γ-globin genes contain two intervening sequences at the expected positions. IVS 1 has the same length (122 bp) in all five γ-globin genes, and the sequences are virtually identical. As found in the human γ genes, both the length and sequence of IVS 2 differ between the gorilla Gγ- and Aγ-genes. IVS 2 is 906 bp long in gorilla Gγ and 872 bp in gorilla Aγ. In the human genes, IVS 2 of AGγ is 886 bp long, whereas that of AAγ is 866 bp and that of BAγ is 876 bp (Slightom et al. 1980). In both species this difference in length of IVS 2 is due to the presence of a simple sequence DNA region (fig. 3, positions 1126-1181) which consists of (TG)n. The value of n ranges from 13 in human AAγ to 24 for gorilla Gγ. Also, both gorilla γ-genes are lengthened by an extra 13 bp (at positions 625-637) that are not found in any of the human genes. The gorilla fetal globin genes both use the same terminator codon and have their poly(A) addition signal (AATAAA) located at the same position as the human genes. We assume that poly(A) for these gorilla γ-gene transcripts would be added about 21 bp downstream from these poly(A) signal sites, as is true for the human transcripts (Poon et al. 1978; Forget et al. 1979). If so, the 3'-untranslated region would be some 90 bp for the gorilla Aγ-gene and some 89 bp for the Gγ-gene.

A Slower Evolutionary Rate for the Primate γ Globin Genes

In a previous comparative restriction mapping study of the β-globin gene cluster, Barrie et al. (1981) estimated that human and gorilla are about 0.9% different at the DNA level (most of the sites examined were regions of flanking or intervening sequence).
According to the data in figure 3, human and gorilla Gγ-sequences are about 2.1% different, whereas the gorilla Aγ-gene differs from the human AAγ- and BAγ-alleles by 2.3% and 2.2%, respectively. These differences are not distributed uniformly along the γ-sequences. Instead, as can be seen in figure 3, human and gorilla γ-genes share certain regions (e.g., the exons) that are almost identical, whereas other regions (in particular IVS 2) diverge significantly. It has been argued from amino acid sequence data that hominoid (human, chimpanzee, gorilla, and orangutan) hemoglobins have accumulated fewer amino acid replacements than expected when compared to the replacement rates seen in other mammalian hemoglobins (Goodman 1981; Goodman et al. 1983). The present study confirms this unusually low evolutionary rate and extends the observation to silent or synonymous substitutions in the γ-coding regions as well. In the entire coding sequence of both Gγ- and Aγ-genes of the gorilla (a total of 876 bases), we detect only two substitutions in comparison with the human genes, each of which results in an amino acid replacement (with the gorilla having Gly at γ-codon 73 and Arg at Aγ-codon 104 compared to Asp and Lys, respectively, in human Aγ). No silent substitutions were detected in the coding sequences reported here. This finding does not conform to the "neutralist" paradigm (e.g., Li et al. 1981; Li 1983; Kimura 1983), which predicts that many more silent than amino acid-changing substitutions should accumulate in exons of active genes during evolution. The separation of human and gorilla is thought to have occurred between 5 and 6 Myr ago on the basis of various interpretations of the fossil record (Johanson and White 1979; Pilbeam 1979; McHenry and Corruccini 1980; Lovejoy 1981) and from the application of DNA hybridization data as an evolutionary clock (C. Sibley and J. Ahlquist, personal communication). It is surprising that no silent changes have accumulated in that period.
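As a rough check on why the absence of silent changes is surprising, the sketch below computes how many silent substitutions a simple clock would predict. It uses the figures cited in this section: a 1% change at silent sites per 1.4 Myr (Efstratiadis et al. 1980), about 220 silent sites in the 876 bp of coding DNA, and a human-gorilla separation of 5-6 Myr. The Poisson model for the chance of observing zero changes is an added illustrative assumption, not part of the analysis reported here, and the function names are hypothetical.

```python
import math

SILENT_SITES = 220      # silent sites in the 876 bp of coding DNA (from the text)
MYR_PER_PERCENT = 1.4   # 1% silent-site change per 1.4 Myr (Efstratiadis et al. 1980)


def expected_silent_changes(divergence_myr, sites=SILENT_SITES,
                            myr_per_percent=MYR_PER_PERCENT):
    """Expected number of silent changes under a constant-rate clock."""
    percent_changed = divergence_myr / myr_per_percent
    return sites * percent_changed / 100.0


def prob_zero_changes(expected):
    """Poisson probability of observing no changes (illustrative assumption)."""
    return math.exp(-expected)


for t in (5.0, 6.0):
    n = expected_silent_changes(t)
    print(f"{t:.0f} Myr: expected {n:.1f} silent changes, "
          f"P(none observed) = {prob_zero_changes(n):.1e}")
```

Under these assumptions the clock predicts roughly eight to nine silent changes, so observing none would be improbable if the substitutions were independent and the rate applied to these genes.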
Calculations based on substitution rates found for coding sequences (exons) of globin genes in human/rabbit, human/mouse, and rabbit/mouse comparisons (Efstratiadis et al. 1980) show a 1% change at silent sites for every 1.4 Myr. This would predict that, over the 876 bp of coding DNA in Gγ- and Aγ-genes, at least nine silent changes should have accumulated in the 220 silent sites. Thus, our data point to an unexpectedly low silent substitution rate in the coding sequences of hominoid γ-globin genes. Because the number of detected substitutions is still relatively small, we cannot be sure whether this is a statistical anomaly or represents the true condition. But if the observed number of substitutions is indeed lower, then we interpret this to mean that either selection has acted throughout the coding sequences to minimize both silent and nonsilent substitutions, or else other mechanisms are operating to maintain the slower relative substitution rate in these regions of the γ-genes. Because the number of silent substitutions is also far lower than expected, we conclude that the conservation of these sequences is not necessarily related to the encoded protein product. Although this conservation might reflect selection for RNA secondary structure, we believe that a more reasonable alternative is that there has been an overall reduction in the apparent substitution rate. This could be due to a general mechanism such as enhanced DNA repair, longer generation times in primates (Goodman 1976), or restrictions in the types of allowable substitutions as a consequence of the base composition of the region. Mammalian DNAs have fewer CG doublets than would be predicted, perhaps as a consequence of methylation of the 5' cytosine (Razin and Riggs 1980), and, because the γ-genes are relatively GC rich, they may tolerate fewer substitutions (Smithies et al. 1981).
However, we would expect that both of these mechanisms might apply to IVS and flanking region sequences as well, so they do not specifically explain the disproportionately lower silent substitution rate in coding regions. Two other mechanisms that could account for the reduced coding region rates are (1) gene conversions confined to the coding regions, or (2) cDNA-mediated conversion events which, by definition, would involve only coding sequences. This last mechanism has recently been proposed to account for peculiarities in the evolution of various families of repeated genes and processed genes that lack introns and, often, the normally present signal sequences in adjacent DNA (Jagadeeswaran et al. 1981; Lewin 1983). This conversion process might result from hybridization and strand exchange with fragments of full-size γ-cDNAs, or from conversion of a part of the cellular gene with a complete cDNA. Only with the accumulation of additional coding region sequences can these speculations be tested.

The 3' Regions of the γ-Genes Have Not Been Converted

Although the three exons and IVS 1 are extremely similar in sequence among the five γ-genes (showing only four substitutions over 560 aligned positions), other regions are less similar. From the time of the ancestral divergence of Homo and Gorilla to the present, the fastest rates of substitution occurred in IVS 2. (The difference in substitutions and gaps between orthologous human and gorilla Gγ-genes in IVS 2 is 3.5% compared to 2.3% in the 3' untranslated and flanking region, 1.8% in the 5' flanking region, and no differences in exons and IVS 1.)
However, the largest difference between Gγ- and Aγ-genes is not found in IVS 2 but in the 3' untranslated and flanking region, which differs by an average of 13% (see table 1), whereas this region of the orthologous human and gorilla Gγ- and Aγ-genes differs by only 2.3% and 1.4%, respectively. Such a relatively large difference strongly suggests that this 3' untranslated and flanking region of the tandem γ-loci has not undergone conversion, perhaps since the original duplication. At 29 of the 41 substitution sites in the 3' flanking region (fig. 3 and table 1, positions 1577-1837), both of the Gγ-sequences are distinct from all three Aγ-sequences, but within each group these positions are identical. In the 12 remaining sites, six are the same within the Aγ-group, whereas six are identical within the Gγ-group. Therefore, parsimony analysis supports joining the gorilla Gγ- and human Gγ-genes into one branch and the gorilla Aγ- and human Aγ-sequences into another.

Table 1
Pairwise Comparisons of Nucleotide Sequences from Gorilla and Human Gγ- and Aγ-Genes

Comparison                  NRs  Gaps  Positions Shared  Difference (%)

In the 3' Untranslated and Flanking Region (Pos. 1577-1837) Not Subjected to Conversions since the Time of the Tandem γ-Duplication in Early Catarrhines
Gorilla G vs. human G         4     2               259             2.3
Gorilla G vs. human AAγ      28     4               256            12.5
Gorilla G vs. human BAγ      29     4               256            13.0
Gorilla G vs. gorilla A      29     5               253            13.4
Human G vs. human AAγ        32     2               258            13.0
Human G vs. human BAγ        33     2               258            13.6
Human G vs. gorilla A        33     3               255            14.0
Human AAγ vs. human BAγ       1     0               258             0.4
Human AAγ vs. gorilla A       2     1               255             1.2
Human BAγ vs. gorilla A       3     1               255             1.6

In the Region Subjected Only to Conversion C1 (Pos. 1129-1576 Spanning the 3' Third of IVS 2 and Exon 3)
Gorilla G vs. human G         5     2               441             1.6
Gorilla G vs. human AAγ      12     3               421             3.6
Gorilla G vs. human BAγ      16     3               439             4.3
Gorilla G vs. gorilla A       9     3               421             2.9
Human G vs. human AAγ        13     1               421             3.3
Human G vs. human BAγ        15     1               439             3.6
Human G vs. gorilla A        10     1               421             2.6
Human AAγ vs. human BAγ       2     1               421             0.7
Human AAγ vs. gorilla A       7     0               421             1.7
Human BAγ vs. gorilla A       7     1               421             1.9

In the Region of IVS 2 Subjected to Conversions C1 and C4 (Pos. 812-900)
Gorilla G vs. human G         4     0                89             4.5
Gorilla G vs. human AAγ       4     0                89             4.5
Gorilla G vs. human BAγ       5     1                89             6.7
Gorilla G vs. gorilla A       5     1                89             6.7
Human G vs. human AAγ         0     0                89               0
Human G vs. human BAγ         5     1                85             7.1
Human G vs. gorilla A         5     1                85             7.1
Human AAγ vs. human BAγ       5     1                85             7.1
Human AAγ vs. gorilla A       5     1                85             7.1
Human BAγ vs. gorilla A       0     0                85               0

In the Region of IVS 2 Subjected to Conversions C1, C2, and C4 (Pos. 901-1128)
Gorilla G vs. human G        11     0               228             4.8
Gorilla G vs. human AAγ      11     0               228             4.8
Gorilla G vs. human BAγ      11     0               228             4.8
Gorilla G vs. gorilla A       8     0               228             3.5
Human G vs. human AAγ         0     0               228               0
Human G vs. human BAγ         0     0               228               0
Human G vs. gorilla A         7     0               228             3.1
Human AAγ vs. human BAγ       0     0               228               0
Human AAγ vs. gorilla A       7     0               228             3.1
Human BAγ vs. gorilla A       7     0               228             3.1

In the Region Subjected to Conversions C1, C3, and C4 (Pos. 1-811 Spanning 5' Untranslated and Flanking, Exon 1, IVS 1, Exon 2, and a 5' Portion of IVS 2)
Gorilla G vs. human G        11     1               798             1.5
Gorilla G vs. human AAγ      12     1               798             1.6
Gorilla G vs. human BAγ      14     2               794             2.0
Gorilla G vs. gorilla A      18     1               808             2.4
Human G vs. human AAγ         1     0               798             0.1
Human G vs. human BAγ         7     1               794             1.0
Human G vs. gorilla A        17     2               795             2.4
Human AAγ vs. human BAγ       7     1               794             1.0
Human AAγ vs. gorilla A      18     2               795             2.5
Human BAγ vs. gorilla A      18     3               791             2.7
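The percentage differences in table 1 appear to follow from counting each nucleotide replacement (NR) and each gap as one difference and dividing by the number of positions shared. The short Python sketch below rechecks a few rows of the 3' region (positions 1577-1837) under that assumption. The function name and the formula are an illustrative reading of the table, not a procedure stated in the text.

```python
def percent_difference(nrs, gaps, positions_shared):
    """Percent difference, ASSUMING each replacement and each gap counts once."""
    return 100.0 * (nrs + gaps) / positions_shared


# (comparison, NRs, gaps, positions shared, tabulated %) taken from table 1
rows = [
    ("gorilla Gγ vs. human Gγ", 4, 2, 259, 2.3),
    ("gorilla Gγ vs. human AAγ", 28, 4, 256, 12.5),
    ("gorilla Gγ vs. gorilla Aγ", 29, 5, 253, 13.4),
    ("human AAγ vs. human BAγ", 1, 0, 258, 0.4),
]

for name, nrs, gaps, shared, tabulated in rows:
    computed = percent_difference(nrs, gaps, shared)
    print(f"{name}: computed {computed:.1f}%, tabulated {tabulated}%")
```

On these rows the orthologous comparisons fall near 2% or below, while the comparison between loci exceeds 12%, which is the contrast the text uses to argue that the 3' region has escaped conversion.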
An Ancestral Hominoid Conversion

In contrast to the 3' gene region, the remaining portions of the Gγ- and Aγ-genes (namely, coding and IVS regions) are considerably more similar, which indicates a conversion of one locus by another and can be seen by analyzing the sequences in IVS 2. The minimum difference between human and gorilla throughout this region is 3.4%, a value obtained by comparing the Gγ-sequences of each species. Yet comparison of the two most divergent sequences among the five, IVS 2 from the gorilla Gγ- and Aγ-genes, yields a value of only 5.1%, roughly one and a half times as large as the difference for this region between species and much less than the value expected if these sequences had accumulated substitutions for the full 35 Myr since the original duplication. On the basis of this similarity for the gorilla Gγ- and Aγ-genes, we conclude that long after the original duplication, but before the separation of human and gorilla, a gene conversion occurred whereby DNA from one of the two loci was replaced by sequence from the other locus. This first or earlier conversion event in the hominoid ancestor of both humans and gorillas is labeled C1 in figure 5.

In the region starting at nucleotide position 1182 (just 3' to the hot spot in IVS 2) and progressing downstream, no further conversions after C1 are evident. From position 1182 to 1450, at the 3' end of IVS 2, human Aγ- and Gγ-genes have accumulated 13 substitutions over 269 nucleotides compared to only seven substitutions between either human Aγ-allele and the gorilla Aγ-gene, and only two substitutions between human Gγ and gorilla Gγ (see fig. 3). Furthermore, in six of the 16 substitutions, all Gγ-sites are alike between human and gorilla but differ from all Aγ-sites which, in turn, are alike between human and gorilla. There are four sites (positions 1280, 1281, 1285, and 1286) that specifically group AAγ and BAγ together.
Thus, by parsimony analysis, both Gγ-sequences for this region fall into one group and all three Aγ-sequences into another.

Evidence for C1 is strengthened by pairwise comparisons of the Gγ- and Aγ-exons, either between or within species. If no conversion had occurred in the descent of gorilla Gγ- and Aγ-sequences for the full 35 Myr since the duplication, about 28 silent substitutions would be expected between the two γ-genes, using the silent substitution rate cited above. Yet, as already emphasized, no silent substitutions were found.

Amino acid sequence data for other primate γ-chains (table 2) also provide evidence for C1. Both Gγ- and Aγ-chains of humans, chimpanzees, and gorillas have His at codon 77 and Thr at codon 135, whereas Old World monkeys have Asn at position 77 and Ala at position 135. One orangutan γ-chain has Thr and the other Ala at position 135. This suggests that in one of the duplicated loci an Ala to Thr replacement occurred at position 135 in early hominoids before the ancestral separation of orangutan from African apes and humans, and that after the orangutan separation a gene conversion (C1) replaced the sequence encoding 135 Ala by that encoding 135 Thr in the common ancestor of African apes and humans. Although the original duplication clearly preceded the ancestral divergence of Old World monkeys and the hominoids, present-day African ape and human Gγ-sequences are closer to African ape and human Aγ-sequences than either set of sequences is to those of Old World monkeys.

Three Additional Conversions in Human γ-Genes

From the differences in table 1 and parsimony analysis of substitution sites, we find evidence for three additional conversions, all within the human genes and none extending beyond the 3' end of the hot spot sequence in IVS 2. We also observe a small stretch of sequence in IVS 2, from positions 812 to 900, where gorilla Aγ- and human BAγ-genes are identical (see table 1).
Within this region there is a 4-bp deletion (858-861), an inversion to TC from CT (847-848), and five point substitutions which are shared (fig. 3). This small sequence represents DNA in the human BAγ-allele, which has remained unconverted since the species diverged and which is flanked on each side by sequences that have undergone conversion. The fact that this segment of BAγ is unconverted is also supported by the parsimony criterion, since a minimum of four additional genic events (three more substitutions at 822 and 847-848 and one 4-base deletion or insertion at 858-861) would be required if human BAγ were not joined first to gorilla Aγ. Thus, the best arrangement would have human BAγ and gorilla Aγ grouped closest together.

Table 2
Amino Acid Sequence Differences among Primate γ-Chains

                        Residue Number
Species                 73       75       77   104      117  135      136      139  References
Homo sapiens            Asp      Ile      His  Lys      His  Thr      Gly/Ala  Ser  Schroeder et al. 1963; Slightom et al. 1980
Pan troglodytes (a)     Asp      Ile      His  Lys      His  Thr      Gly/Ala  Ser  DeJong 1971
Gorilla gorilla         Asp/Gly  Ile      His  Lys/Arg  His  Thr      Gly/Ala  Ser  Huisman et al. 1973; this report
Pongo pygmaeus          ?        Ile/Val  ?    ?        ?    Thr/Ala  Gly      Ser  Huisman et al. 1973; Schroeder et al. 1978
Macaca mulatta                   Ile      Asn  Arg      His  Ala      Gly      Ser  Mahoney and Nute 1980
Macaca nemestrina                Ile      Asn  Arg      Arg  Ala      Gly      Ser  Nute and Mahoney 1979a
Papio cynocephalus               Ile/Val  Asn  Arg      His  Ala      Gly      Ser  Nute and Mahoney 1979b
Saguinus fuscicollis             ?        ?    ?        ?    Ala      Gly      Ala  Huisman et al. 1973

NOTE. The amino acid sequence differences among the γ-chains of the apes and Old World monkeys determined by protein sequencing or inferred from the DNA sequence. Note that the gorilla shares an Arg with the Old World monkeys at position 104.
(a) Sequences of Aγ and Gγ chains of P. troglodytes were inferred from the amino acid compositions of small peptides.
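The site-by-site parsimony reasoning used in this section can be sketched in a few lines of Python. The block below tallies, for a set of aligned sites, which two-way grouping of the five sequences each informative site supports. The sequence labels, the toy columns, and the function names are hypothetical illustrations; they are not the substitution sites of fig. 3 and not the authors' own program.

```python
from collections import Counter

# Hypothetical labels for the five sequences compared in the text.
TAXA = ("gorilla_Gγ", "human_AGγ", "human_AAγ", "human_BAγ", "gorilla_Aγ")

# Hypothetical toy columns: one string per site, one base per sequence in
# TAXA order. These are NOT the real sites of fig. 3.
TOY_SITES = ["CCTTT", "AAGGG", "GGGAA", "TTTTC"]


def split_supported(column):
    """Return the two-way grouping a site supports, or None if uninformative."""
    groups = {}
    for taxon, base in zip(TAXA, column):
        groups.setdefault(base, set()).add(taxon)
    sides = [frozenset(g) for g in groups.values()]
    # Informative here means exactly two states, each shared by >= 2 sequences.
    if len(sides) != 2 or min(len(s) for s in sides) < 2:
        return None
    return frozenset(sides)


def count_support(sites):
    """Tally how many sites support each two-way grouping."""
    return Counter(s for s in map(split_supported, sites) if s is not None)


support = count_support(TOY_SITES)
for split, n_sites in support.items():
    smaller_side = min(split, key=len)
    print(sorted(smaller_side), "grouped together at", n_sites, "site(s)")
```

In this toy input two sites group the two Gγ-sequences apart from the three Aγ-sequences, one site groups human BAγ with gorilla Aγ, and the last site is a singleton change that supports no grouping. The grouping backed by more sites requires fewer extra events, which is the criterion applied in the text.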
We hypothesize that three conversion events occurred between Aγ- and Gγ-sequences in the human line after the separation of humans and gorillas. One of these, designated C2 in figure 5, is evident in the BAγ-allele and extends from immediately beyond the unconverted region into the 5' end of the hot spot sequence (i.e., from position 901 to 1128). Among 13 substitution sites found in this region, the three human sequences are identical, whereas the gorilla genes differ considerably from human, 11 differences for Gγ and seven differences for Aγ (see fig. 3 and table 1). Evidence for another conversion, C3 in figure 5, can also be seen in the human BAγ-allele (table 1, positions 1-811), that is, a conversion 5' to the stretch of sequence (positions 812-900) where human BAγ- and gorilla Aγ-genes are identical. We are uncertain about the 5' and 3' ends of C3, although it apparently terminates at its 3' end prior to position 820 (fig. 3). Up to position 811 there are 29 substitution sites, of which only two (568 and 715) support joining the human BAγ-allele to gorilla Aγ, whereas six sites (42, 99, 627-637 gap, 661, 742, and 777) would join human BAγ-, AAγ-, and AGγ-sequences together as a branch distinct from the two gorilla genes. Moreover, among the 23 remaining sites, there are 10 at which gorilla Gγ and the three human sequences are identical and differ from gorilla Aγ, compared to only four sites in which gorilla Aγ and the three human sequences are identical and differ from gorilla Gγ. Thus, in these regions, as a result of C2 and C3, the most parsimonious grouping has human BAγ joined to the branch of human AAγ and AGγ.

The last conversion event, C4, is seen when the human AAγ is compared to human AGγ. These sequences are nearly identical from the 5' end up to the hot spot. This event was described by Slightom et al.
(1980) and, based on the near identity of sequences between AAγ and AGγ in this region, must have occurred recently, perhaps within the last million years (Shen et al. 1981). Furthermore, AAγ-sequence must have been replaced by AGγ-sequence. If the reverse had occurred in C4, AGγ would be virtually identical not only to the AAγ-allele but also to the BAγ-allele. Instead, the BAγ-allele diverges significantly from both AAγ and AGγ. There is much weaker but suggestive evidence that in the case of C3, the converting sequence also originated from Gγ. In this case the three human sequences diverge much less from gorilla Gγ than from gorilla Aγ. The direction of the other conversions (i.e., C1 and C2) cannot be determined at present but should be determinable once other hominoid and Old World monkey γ-genes are sequenced.

Conclusions

The data presented here extend the arguments of Slightom et al. (1980) that the hot spot sequence is important in mediating conversion events between fetal γ-globin genes. Both C2 and C4 extend up to this region but not beyond it. The finding that the hot spot regions in gorilla Aγ and human AAγ are nearly identical, whereas 5' to this region the gorilla gene shares unique substitutions with the human BAγ-allele but not with AAγ, strongly suggests that one of the two human Aγ-alleles may have resulted from a recombination event at this location. When additional alleles in human and other closely related species are sequenced, it should be possible to determine whether the hot spot has been involved with both conversion and crossing-over events. Indeed, both events are believed to be related by the same underlying mechanism (Radding 1978).

As already noted, the coding sequences of the gorilla and human Gγ-genes are identical, whereas the Aγ-genes differ by two substitutions: the gorilla sequence encodes Gly rather than Asp at codon 73 and Arg rather than Lys at codon 104.
Because codon 104 is also Arg in two species of macaques (Macaca mulatta and M. nemestrina) and a baboon (Papio cynocephalus), whereas in human and chimpanzee it is Lys (table 2), we suggest that not only does the gorilla Aγ-gene retain an “ancestral” feature shared with Old World monkeys but that humans and chimpanzees share the derived feature and thereby a more recent common ancestor. We can more thoroughly test this possibility once nucleotide sequences are obtained from the paired γ-genes of chimpanzees and other hominoids.

Gene conversions appear to be common events in the evolution of the human γ-genes. What is the consequence of such a process? Possibly, as noted by Dover (1982), conversion will slow the effective evolutionary rate. As new substitutions occur in a sequence they will tend to be lost because they are likely to be converted back to the sequence of the more common allele in the species population. The faster the conversion rate, the slower the effective substitution rate.

Finally, an important consequence for the reconstruction of phylogenetic relationships has emerged from this study. Conversion between related genes in one species may result in the transfer of stretches of sequence quite different from that found in what appears to be the orthologous gene in another species. In the case of γ-genes our present results suggest that this problem may be less acute when Gγ-sequences are compared to one another in different species than when Aγ-sequences are compared, perhaps because Gγ is more likely to be the “donor” sequence. To see if such interpretations are warranted and to better reconstruct the history of these two nonallelic loci, we must again stress the need to sequence γ-genes from additional species of higher primates.

Acknowledgments

This study was supported by the following grants: NIH GM 28931 (A. F. S.), NSF DEB 7810717 (M. G.), and NIH HD 16595 (J. L. S.); J. C.
was supported by a National Science Foundation predoctoral fellowship. We wish to thank Drs. Haig Kazazian, Nobuyo Maeda, Barbara Schmeckpeper, and Oliver Smithies for helpful discussions; Timothy W. Theisen for technical assistance; and Dr. M. Bush at the National Zoological Park for providing blood samples. J. L. S. and L.-Y. E. C. are also indebted to Drs. Frederick Blattner and Oliver Smithies for the shared use of laboratory space and equipment. This article is paper no. 2692 from the Laboratory of Genetics, University of Wisconsin-Madison.

LITERATURE CITED

ARNHEIM, N., and E. M. SOUTHERN. 1977. Heterogeneity of the ribosomal genes in mice and men. Cell 11:363-370.
BARRIE, P. A., A. J. JEFFREYS, and A. F. SCOTT. 1981. Evolution of the β-globin gene cluster in man and the primates. J. Mol. Biol. 149:319-336.
BLIN, N., and D. W. STAFFORD. 1976. A general method for isolation of high molecular weight DNA from eukaryotes. Nucleic Acids Res. 3:2303-2308.
BREATHNACH, R., C. BENOIST, K. O'HARE, F. GANNON, and P. CHAMBON. 1978. Ovalbumin gene: evidence for a leader sequence in mRNA and DNA sequences at the exon-intron boundaries. Proc. Natl. Acad. Sci. USA 75:4853-4857.
CZELUSNIAK, J., M. GOODMAN, D. HEWETT-EMMETT, M. L. WEISS, P. J. VENTA, and R. E. TASHIAN. 1982. Phylogenetic origins and adaptive evolution of avian and mammalian haemoglobin genes. Nature 298:297-300.
DEJONG, W. W. W. 1971. Chimpanzee foetal haemoglobin: structure heterogeneity of the γ chain. Biochim. Biophys. Acta 251:217-226.
DOVER, G. 1982. Molecular drive: a cohesive mode of species evolution. Nature 299:564-572.
EDWARDS, A. F. W., and L. L. CAVALLI-SFORZA. 1963. The reconstruction of evolution. Ann. Hum. Genet. 27:104-105.
EFSTRATIADIS, A., J. W. POSAKONY, T. MANIATIS, R. M. LAWN, C. O'CONNELL, R. A. SPRITZ, J. K. DERIEL, B. G. FORGET, S. M. WEISSMAN, J. L. SLIGHTOM, A. E. BLECHL, O. SMITHIES, F. E. BARALLE, C. C. SHOULDERS, and N. J. PROUDFOOT. 1980.
The structure and evolution of the human β-globin gene family. Cell 21:653-668.
FARRIS, J. S. 1970. Methods for computing Wagner Trees. Syst. Zool. 19:83-92.
FITCH, W. M. 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20:406-416.
FORGET, B. G., C. CAVALLESCO, J. K. DERIEL, R. A. SPRITZ, P. V. CHOUDARY, J. T. WILSON, L. B. WILSON, V. B. REDDY, and S. M. WEISSMAN. 1979. Structure of the human globin genes. Pp. 367-381 in R. AXEL, T. MANIATIS, and C. F. FOX, eds. Eucaryotic gene regulation. ICN-UCLA Symposium on Molecular and Cellular Biology, 14. Academic Press, New York.
FRITSCH, E. F., R. M. LAWN, and T. MANIATIS. 1980. Molecular cloning and characterization of the human β-like globin gene cluster. Cell 19:959-972.
GAROFF, H., and W. ANSORGE. 1981. Improvements of DNA sequencing gels. Anal. Biochem. 115:450-457.
GOODMAN, M. 1976. Towards a genealogical description of the primates. Pp. 321-353 in M. GOODMAN and R. E. TASHIAN, eds. Molecular anthropology. Plenum, New York.
GOODMAN, M. 1981. Decoding the pattern of protein evolution. Prog. Biophys. Mol. Biol. 37:105-164.
GOODMAN, M., G. BRAUNITZER, A. STANGL, and B. SCHRANK. 1983. Evidence on human origins from haemoglobins of African apes. Nature 303:546-548.
HUISMAN, T. H. J., W. A. SCHROEDER, M. E. KEELING, W. GENGOZIAN, A. MILLER, A. R. BRODIE, J. R. SHELTON, and G. APELL. 1973. Search for non-allelic structural genes for γ-chains of fetal hemoglobins in some primates. Biochem. Genet. 10:309-318.
JAGADEESWARAN, P., B. G. FORGET, and S. M. WEISSMAN. 1981. Short interspersed repetitive DNA elements in eukaryotes: transposable DNA elements generated by reverse transcription of RNA Pol III transcripts? Cell 26:141-142.
JEFFREYS, A. J. 1979. DNA sequence variants in Gγ-, Aγ-, δ- and β-globin genes in man. Cell 18:1-10.
JOHANSON, D. C., and T. D. WHITE. 1979. A systematic assessment of early African hominids. Science 203:321-330.
KIMURA, M. 1983.
The neutral theory of molecular evolution. Pp. 208-233 in M. NEI and R. K. KOEHN, eds. Evolution of genes and proteins. Sinauer, Sunderland, Mass.
LEWIN, R. 1983. How mammalian RNA returns to its genome. Science 219:1052-1054.
LI, W.-H. 1983. Evolution of duplicate genes and pseudogenes. Pp. 14-37 in M. NEI and R. K. KOEHN, eds. Evolution of genes and proteins. Sinauer, Sunderland, Mass.
LI, W.-H., T. GOJOBORI, and M. NEI. 1981. Pseudogenes as a paradigm of neutral evolution. Nature 292:237-239.
LIEBHABER, S. A., M. GOOSSENS, and Y. W. KAN. 1981. Homology and concerted evolution at the α1 and α2 loci of human α-globin. Nature 290:26-29.
LOVEJOY, C. O. 1981. The origin of man. Science 211:341-350.
MCHENRY, H. M., and R. S. CORRUCCINI. 1980. Late tertiary hominoids and human origins. Nature 285:397-398.
MAHONEY, W. C., and P. E. NUTE. 1980. Fetal hemoglobin of the rhesus monkey, Macaca mulatta: complete primary structure of the γ-chains. Biochemistry 19:4436-4442.
MANIATIS, T., R. C. HARDISON, E. LACY, J. LAUER, C. O'CONNELL, D. QUON, G. K. SIM, and A. EFSTRATIADIS. 1978. The isolation of structural genes from libraries of eucaryotic DNA. Cell 15:687-701.
MAXAM, A. M., and W. GILBERT. 1980. Sequencing end-labeled DNA with base specific chemical cleavages. Methods Enzymol. 65:499-560.
MESSING, J., R. CREA, and P. H. SEEBURG. 1981. A system for shotgun DNA sequencing. Nucleic Acids Res. 9:309-321.
MESSING, J., and J. VIEIRA. 1982. A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19:269-276.
MOORE, G. W., J. BARNABAS, and M. GOODMAN. 1973. A method for constructing maximum parsimony ancestral amino acid sequences on a given network. J. Theor. Biol. 38:459-485.
NUTE, P. E., and W. C. MAHONEY. 1979a. Complete amino acid sequence of the γ-chain from the major fetal hemoglobin of the pigtailed macaque, Macaca nemestrina. Biochemistry 18:467-472.
NUTE, P. E., and W. C. MAHONEY. 1979b.
Complete sequence of the γ-chain from the fetal hemoglobin of the baboon, Papio cynocephalus. Hemoglobin 3:399-410.
PILBEAM, D. 1979. Recent finds and interpretations of Miocene hominoids. Annu. Rev. Anthropol. 8:333-352.
POON, R., Y. W. KAN, and H. W. BOYER. 1978. Sequence of the 3' noncoding and adjacent coding regions of human γ-globin mRNA. Nucleic Acids Res. 5:4625-4630.
RADDING, C. M. 1978. Genetic recombination: strand transfer and mismatch repair. Annu. Rev. Biochem. 47:847-880.
RAZIN, A., and A. D. RIGGS. 1980. DNA methylation and gene function. Science 210:604-610.
SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
SCHROEDER, W. A., J. R. SHELTON, J. B. SHELTON, and T. H. J. HUISMAN. 1963. The amino acid sequence of the γ-chain of human fetal hemoglobin. Biochemistry 2:992-1008.
SCHROEDER, W. A., J. R. SHELTON, J. B. SHELTON, and T. H. J. HUISMAN. 1978. The γ-chain of fetal hemoglobin of the orangutan. Biochem. Genet. 16:1203-1205.
SHEN, S., J. L. SLIGHTOM, and O. SMITHIES. 1981. A history of the human fetal globin gene duplication. Cell 26:191-203.
SHEN, S.-H., and O. SMITHIES. 1982. Human globin ψβ2 is not a globin-related sequence. Nucleic Acids Res. 10:7809-7818.
SLIGHTOM, J. L., A. E. BLECHL, and O. SMITHIES. 1980. Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequence suggests that the DNA can be exchanged between these duplicated genes. Cell 21:627-638.
SLIGHTOM, J. L., S. M. SUN, and T. C. HALL. 1983. Complete nucleotide sequence of a French bean storage protein gene: phaseolin. Proc. Natl. Acad. Sci. USA 80:1897-1901.
SMITHIES, O., W. R. ENGELS, J. R. DEVEREUX, J. L. SLIGHTOM, and S.-H. SHEN. 1981. Base substitutions, length differences and DNA strand asymmetries in the human Gγ- and Aγ-fetal globin gene region. Cell 26:345-353.
SNYDER, L. R. G. 1980. Closely-linked alpha-chain hemoglobin loci in Peromyscus and other animals: speculations on the evolution of duplicate loci. Evolution 34:1077-1098.
SOUTHERN, E. 1975.
Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.
WILLIAMS, B. G., and F. R. BLATTNER. 1979. Construction and characterization of the hybrid bacteriophage lambda Charon vectors for DNA cloning. J. Virol. 29:555-575.
WILSON, J. T., L. B. WILSON, J. K. DERIEL, L. VILLA-KOMAROFF, A. EFSTRATIADIS, B. G. FORGET, and S. M. WEISSMAN. 1978. Insertion of synthetic copies of human globin genes into bacterial plasmids. Nucleic Acids Res. 5:563-581.
ZIMMER, E. A., S. L. MARTIN, S. M. BEVERLEY, Y. W. KAN, and A. C. WILSON. 1980. Rapid duplication and loss of genes coding for the α-chains of hemoglobin. Proc. Natl. Acad. Sci. USA 77:2158-2162.
ZUCKERKANDL, E. 1964. Further principles of chemical paleogenetics as applied to the evolution of hemoglobin. Protides Biol. Fluids 12:102-109.

WALTER M. FITCH, reviewing editor

Received October 25, 1983; revision received March 9, 1984.