Complex History of a Chromosomal Paralogy Region: Insights from Amphioxus Aromatic Amino Acid Hydroxylase Genes and Insulin-Related Genes

Simon J. Patton, Graham N. Luke, and Peter W. H. Holland
School of Animal and Microbial Sciences, University of Reading, Whiteknights, U.K.

Aromatic amino acid hydroxylase (AAAH) genes and insulin-like genes form part of an extensive paralogy region shared by human chromosomes 11 and 12, thought to have arisen by tetraploidy in early vertebrate evolution. Cloning of a complementary DNA (cDNA) for an amphioxus (Branchiostoma floridae) hydroxylase gene (AmphiPAH) allowed us to investigate the ancestry of the human chromosome 11/12 paralogy region. Molecular phylogenetic evidence reveals that AmphiPAH is orthologous to vertebrate phenylalanine hydroxylase (PAH) genes; the implication is that all three vertebrate AAAH genes arose early in metazoan evolution, predating vertebrates. In contrast, our phylogenetic analysis of amphioxus and vertebrate insulin-related gene sequences is consistent with duplication of these genes during early chordate ancestry. The conclusion is that two tightly linked gene families on human chromosomes 11 and 12 were not duplicated coincidentally. We rationalize this paradox by invoking gene loss in the AAAH gene family and conclude that paralogous genes shared by paralogous chromosomes need not have identical evolutionary histories.

Introduction

The past decade has witnessed a rapid expansion in the number of mammalian genes mapped to chromosomal locations. As data have accumulated, patterns have become apparent. Chromosomal paralogy regions are sets of linked genes, each having homologues linked on another chromosome (Lundin 1993). For example, the genes for phenylalanine hydroxylase, FGF6, insulin-like growth factor 1 (IGF-1), lactate dehydrogenase B, two myogenic bHLH proteins, K-ras, and parathyroid hormone-like hormone map to human chromosome 12; all these genes have close homologues on human chromosome 11.
The simplest explanation for this pattern is common ancestry: human chromosomes 11 and 12 probably arose by duplication of an ancestral chromosome, perhaps by tetraploidy of the genome (Lundin 1993). There is now considerable evidence that tetraploidy did occur during early vertebrate evolution. Susumu Ohno was the first to seriously suggest this notion in his seminal book (Ohno 1970). More recently, analysis of gene family complexity in vertebrates and in the closely related invertebrate, amphioxus, has given strong support to the notion that the genome underwent at least one, and possibly two, rounds of tetraploidy in early vertebrate evolution (Holland et al. 1994; Sharman and Holland 1996; Sharman, Hay-Schmidt, and Holland 1997).

If chromosomal paralogy regions arose by tetraploidy during early vertebrate evolution, it seems reasonable to assume that each pair of related genes arose simultaneously at this time. This assumption has never been adequately tested, since it requires DNA sequence information from a close outgroup for each of two or more linked gene families within a paralogy group. Amphioxus may be the ideal outgroup for such analyses since it is the sister group of the vertebrates (defined here as synonymous with craniates) and is thought to have branched from the chordate lineage just before the putative tetraploidy events.

Key words: amphioxus, paralogy region, chromosome evolution, molecular phylogeny, phenylalanine hydroxylase, insulin.
Address for correspondence and reprints: Peter W. H. Holland, School of Animal and Microbial Sciences, University of Reading, Whiteknights, U.K. E-mail: [email protected].
Mol. Biol. Evol. 15(11):1373–1380. 1998
© 1998 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Within the chromosome 11/12 paralogy group, the physical linkage between aromatic amino acid hydroxylase (AAAH) genes and insulin-like genes is particularly close.
Chromosome 11 has tryptophan hydroxylase (TPH) and tyrosine hydroxylase (TH), while chromosome 12 has phenylalanine hydroxylase (PAH). Chromosome 11 has insulin and IGF-2, while chromosome 12 has IGF-1 (fig. 1). An amphioxus homologue of the vertebrate insulin-like genes (insulin, IGF-1, and IGF-2) has already been cloned (Chan, Cao, and Steiner 1990) and named insulin-like peptide (ILP). The physically closest genes to IGF-1 on human chromosome 12 and to IGF-2 and to insulin on human chromosome 11 are members of the AAAH gene family. We therefore elected to clone an amphioxus member of this gene family. Molecular phylogenetic analyses of the amphioxus AAAH gene and ILP gene sequences, in relation to vertebrate and other metazoan homologues, revealed the dates of gene duplication in each gene family. We find that the gene duplication yielding AAAH genes on chromosomes 11 and 12 occurred at a vastly different time from the duplication that yielded IGF genes on chromosomes 11 and 12.

FIG. 1. Chromosomal paralogy region shared by human chromosomes 11 and 12. Examples of genes mapping to 11p15 are shown; the dotted lines indicate gene family relationships to genes mapping to 12p12 or to 12q21–22. Chromosome 12 is drawn upside down to facilitate comparison. Map positions are from Online Mendelian Inheritance in Man at http://www.ncbi.nlm.nih.gov/Omim/.

Materials and Methods

Isolation and Characterization of cDNA Clones for AAAH Genes

An amphioxus cDNA library in λ-ZAPII (made from 5- to 24-h embryos and a gift from J. Langeland, University of Kalamazoo, Mich.) was screened with a probe (p299) containing exons 6 to 11 of a genomic cosmid clone for amphioxus PAH (Patton and Holland, unpublished data). The genomic clone was originally found by library screening with a polymerase chain reaction fragment amplified with primers designed from vertebrate AAAH genes.
Hybridizations were carried out in 7% SDS and 250 mM sodium phosphate buffer pH 7.2 at 65°C, and hybridizing clones were plaque purified. cDNA was rescued into pBSII SK+ (Stratagene), restriction mapped, and sequenced in both directions by a combination of subcloning and walking with specific oligonucleotide primers.

Molecular Phylogenetic Analysis of AAAH Genes and Insulin-Related Genes

Aromatic amino acid hydroxylase gene or cDNA sequences were obtained through text and sequence similarity searches against GenBank databases including data from the Nematode Genome Sequencing Project (Sanger Centre, Hinxton, Cambridge, U.K.). The sequences used were the following: human TH type 1 (Grima et al. 1987; GenBank X05290); human TPH (Boularand et al. 1990; GenBank X52836); human PAH (Kwok et al. 1985; GenBank K03020); bovine TH (D’Mello et al. 1988; SwissProt P17289); rabbit TPH (Grenett et al. 1987; GenBank M17250); mouse TH (Ichikawa, Sasaoka, and Nagatsu 1991; GenBank M69200); mouse TPH (Stoll, Kozak, and Goldman 1990; GenBank J04758); mouse PAH (Ledley et al. 1990; GenBank X51942); quail TH (Fauquet et al. 1988; GenBank M24778); chick TPH (Florez et al. 1996; GenBank U26428); frog TPH (Green and Besharse 1994; GenBank L20679); fly TH (Neckameyer and Quinn 1989; SwissProt P18459); fly PAH (Neckameyer and White 1992; SwissProt P17276); and nematode TH, TPH, and PAH (unpublished sequences citing Wilson et al. 1994; WEBACE B0432.5, C36H8.3, and K08F8.4).

Vertebrate, amphioxus, Drosophila, and Caenorhabditis AAAH protein sequences were aligned using the CLUSTAL V program (Higgins, Bleasby, and Fuchs 1991). A distance matrix was constructed from the confidently alignable regions of all sequences (323 sites), excluding regions of dubious alignment, and the phylogeny was deduced by the neighbor-joining method of the PHYLIP package, version 3.572c (Felsenstein 1993). Confidence in each node of the phylogeny was assessed by performing 100 bootstrap resamplings of the data.
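The distance-matrix and bootstrap steps described above can be illustrated with a minimal sketch. It is not the PHYLIP implementation used in the study: the uncorrected p-distance is an assumption (the text does not state which distance model was applied), and the three toy sequences and all function names are hypothetical. Each bootstrap replicate resamples alignment columns with replacement and recomputes the pairwise distances; a tree would then be built from every replicate matrix.

```python
import random


def p_distance(a, b):
    """Fraction of compared sites at which two aligned sequences differ.

    Sites where either sequence has a gap ('-') are skipped.
    The p-distance is an assumed, simplest-case measure.
    """
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def distance_matrix(alignment):
    """Pairwise distances for a dict of name -> aligned sequence."""
    names = list(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m in names for n in names}


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def bootstrap_matrices(alignment, replicates=100, seed=0):
    """Distance matrices for `replicates` resampled alignments."""
    rng = random.Random(seed)
    return [distance_matrix(bootstrap_alignment(alignment, rng))
            for _ in range(replicates)]


# Hypothetical toy alignment, not real AAAH data.
toy = {
    "seqA": "MKTLLVAGSW",
    "seqB": "MKTLIVAGSW",
    "seqC": "MRSLIVCGTW",
}
```

Calling `bootstrap_matrices(toy)` returns 100 matrices; the percentage of replicate trees that recover a given grouping is the bootstrap support reported on a node.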
Neighbor joining does not assume a molecular clock or a root position.

An identical strategy was undertaken for the insulin-related genes, except that a greater diversity of species could be included. We included human INS (Bell et al. 1980; GenBank J00265), human IGF-1 (Rotwein et al. 1986; SwissProt P05019), human IGF-2 (Dull et al. 1984; SwissProt P01344), rat insulin (Lomedico et al. 1979; SwissProt P01323), rat IGF-1 (Shimatsu and Rotwein 1987; SwissProt P08025), mouse IGF-2 (Rotwein and Hall 1990; GenBank U71085), rabbit insulin (Devaskar et al. 1994; GenBank U03610), chick insulin (Perler et al. 1980; SwissProt P01332), chick IGF-1 (Kajimoto and Rotwein 1989; SwissProt P18254), frog insulin (Shuldiner et al. 1989; SwissProt P12707), frog IGF-1 (Kajimoto and Rotwein 1990; SwissProt P16501), salmon insulin (Koval, Petrenko, and Kavsan 1989; SwissProt P04667), trout IGF-1 and IGF-2 (Shamblott and Chen 1992; SwissProt Q02815 and Q02816), dogfish insulin (Bajaj et al. 1983; SwissProt P12704), dogfish IGF-1 and IGF-2 (Duguay et al. 1995; GenBank Z50081 and Z50082), hagfish insulin (Chan et al. 1981; SwissProt P01342), hagfish IGF (Nagamatsu et al. 1991; SwissProt P22618), amphioxus ILP (Chan, Cao, and Steiner 1990; SwissProt P22334), and ascidian insulin and IGF (McRory and Sherwood 1997). Analysis was restricted to the confidently alignable A and B domains (50 sites); large regions of extreme length variability outside these domains (or inside for ascidian IGF) were deliberately excluded to avoid problems with gap weighting.

Both analyses were performed without outgroups, since alignments suggested that prokaryotic AAAH genes (Zhao et al. 1994) and relaxin-like genes (Hudson et al. 1983) were too distantly related to provide reliable root positions.

Results

Cloning of an Amphioxus AAAH

Screening an amphioxus cDNA library with a genomic AAAH probe identified six hybridizing clones.
Sequencing revealed that all clones derived from the same gene: all six clones started from the same 5′ EcoRI site. There was variation in the 3′ untranslated region (UTR) between clones, as assessed by length and restriction endonuclease mapping; we conclude that there is polymorphism within the 3′ UTR. The longest clone was fully sequenced and was found to include a 438-amino acid open reading frame (fig. 2); this remains in frame at the 5′ end. Alignment to known AAAH protein sequences suggests that the clone is missing approximately 14 codons (3%) of the protein. Since the extreme N-terminus of AAAH proteins is highly variable, it would be uninformative for our phylogenetic analyses in any case, and the near-full-length clone obtained is likely to contain ample phylogenetic information. A putative polyadenylation signal was found starting at nucleotide 2837, followed 17 bp later by the poly(A)10 tail. Database searches using the deduced protein sequence revealed close similarity to known eukaryote AAAH genes, with a particularly strong similarity to PAH genes. On the basis of similarity, we designated this cDNA AmphiPAH; molecular phylogenetics provide a test of this hypothetical assignment.

FIG. 2. Nucleotide and deduced amino acid sequence of amphioxus aromatic amino acid hydroxylase (AAAH) cDNA, AmphiPAH. Amino acid residues shown in bold cover the region used in phylogenetic analyses, after optimal alignment to 16 AAAH sequences. Asterisk, stop codon; underline, polyadenylation signal. GenBank/EMBL AJ001677.

Molecular Phylogeny of the AAAH Gene Family

To determine the phylogenetic relationship of AmphiPAH to other aromatic amino acid hydroxylase genes, we first aligned it with the deduced protein sequences of all eukaryote AAAH genes. This included three unpublished nematode AAAH sequences deposited in GenBank by the Caenorhabditis elegans genome-sequencing project. All sequences aligned extremely well across the C-terminal two-thirds of each enzyme. Seventy residues were absolutely conserved in all eukaryote hydroxylase proteins.

A phylogenetic tree constructed by neighbor joining is shown in figure 3 with confidence assessed by bootstrapping. The results show that each of the three hydroxylase gene types (TH, TPH, and PAH) forms a monophyletic group supported by very high bootstrap values. The position of the amphioxus hydroxylase gene confirms that it is directly orthologous to human PAH; the same conclusion applies to the Drosophila gene previously described as a possible precursor of PAH and TPH (Neckameyer and White 1992). The three nematode AAAH genes sequenced by the nematode genome project are clearly orthologous to TH, TPH, and PAH. The tree is consistent with the hypothesis of two separate phases of gene duplications in this gene family giving rise to three genes (TH, TPH, and PAH), but we cannot firmly deduce the relative order of the two duplications because of the absence of an outgroup. The clear conclusion is that all three gene types (TH, TPH, and PAH) originated by duplication prior to the divergence of nematodes, arthropods, and chordates and hence well before the origin of vertebrates. Although amphioxus TPH, Drosophila TPH, and amphioxus TH have not yet been cloned, their existence is strongly predicted by the topology of the phylogenetic tree.

FIG. 3. Phylogenetic relationships of eukaryote aromatic amino acid hydroxylase (AAAH) proteins, inferred by the neighbor-joining method. Figures on branches signify the percentage of times the sequences to the right of that node were found together following bootstrap resampling of the data. Nodes receiving below 60% bootstrap support were collapsed, leaving only well-supported aspects of the tree. Branch lengths are to scale; the scale bar denotes 0.1 inferred mutations per site. The tree is not rooted because of the absence of a suitable outgroup; nonetheless, TH, TPH, and PAH clearly fall as distinct genes across all metazoan taxa analyzed.

FIG. 4. Phylogenetic relationships of insulin-related proteins, inferred by the neighbor-joining method. Percentage bootstrap support is shown only where this reaches 60% or above; other nodes are collapsed. Scale denotes 0.1 mutations per site.

Molecular Phylogeny of the Insulin Gene Family

An amphioxus gene for insulin-like peptide (ILP) was cloned by Chan, Cao, and Steiner (1990); its phylogenetic relationships were analyzed by Ellsworth, Hewett-Emmett, and Li (1994). We have refined this analysis by incorporating the recently published sequences of insulin gene family members from the elasmobranch Squalus acanthias (Duguay et al. 1995) and the ascidian Chelyosoma productum (McRory and Sherwood 1997), since these give further insight into the timing of gene duplication in the insulin gene family.

Unlike the case with the AAAH genes, we find that the metazoan insulin-related genes do not all fall into the three groups characteristic of mammals (fig. 4). For example, the hagfish IGF gene is found to be a possible outgroup to the IGF-1 and IGF-2 groups of genes. Furthermore, our phylogenetic analysis conclusively demonstrates that neither amphioxus ILP nor ascidian IGF genes are direct orthologues of either IGF-1 or IGF-2 genes; they are most probably direct descendants of an ancestral IGF. This implies that the gene duplication that gave rise to IGF-1 and IGF-2 occurred after the origin of vertebrates. The phylogenetic position of ascidian insulin is more confusing; our analyses suggest it could be a descendant of the precursor to insulin and both IGF genes.

Further insight into the relationships between these genes comes from examination of their gene structures, including presence of particular protein domains.
The deduced protein products of amphioxus ILP and ascidian IGF each have a C-terminal extension (domains D and E) characteristic of vertebrate IGF proteins, not present in insulin. This argues that amphioxus ILP and ascidian IGF are true IGF genes, not insulin genes. Both lines of evidence point to one conclusion: IGF-1 and IGF-2 arose within the vertebrates.

Discussion

We chose to analyze AAAH genes and insulin-related genes as test cases to investigate the hypothesis that paralogy regions are reflections of tetraploidy in early vertebrate evolution. When a set of linked genes on one chromosome has linked relatives on another, it seems reasonable to assume that each paralogous pair arose simultaneously, by chromosome duplication. However, our characterization of an amphioxus AAAH gene and our molecular phylogenetic analyses of AAAH genes and insulin-related genes have yielded an unexpected result. We find that the AAAH genes on both human chromosomes 11 and 12 arose very early in metazoan evolution. The gene duplication events yielding TH, PAH, and TPH predate the divergence of nematodes, arthropods, and chordates, well back into the Precambrian era. In contrast, the neighboring IGF genes on the same chromosomes arose much later, during chordate radiation (postdating the hagfish/gnathostome divergence but predating the chondrichthyan/ray-finned fish/tetrapod divergences). The IGF duplication is dated to the time of the putative tetraploidy events in early vertebrate evolution, for which much supporting data have now accumulated (Holland et al. 1994; Holland 1996; Sharman and Holland 1996).

This apparent paradox raises important questions concerning the nature and origin of chromosomal paralogy regions. The resolution that we propose below suggests that paralogy regions have a more complex history than previously recognized and also highlights the factors that affect the rates of gene loss in evolution.
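The dating logic used above, in which a duplication is bracketed by the lineages that do or do not carry distinct orthologues of both duplicates, can be sketched as follows. This is an illustrative simplification rather than the authors' procedure: the lineage labels, their branching order (taken from the paper's framing of amphioxus as the sister group of vertebrates), and the function name are assumptions, and the true/false entries merely restate the conclusions reported in the Results.

```python
# Lineages in the order they are assumed to split from the line leading to
# jawed vertebrates, earliest split first (illustrative labels only).
SPLIT_ORDER = ["protostomes", "ascidian", "amphioxus", "hagfish", "dogfish"]


def bracket_duplication(distinct_orthologues):
    """Bracket a gene duplication between two lineage splits.

    `distinct_orthologues` maps a lineage to True if it carries separate
    orthologues of both duplicates, or False if it carries only a single
    gene related equally to both. Lineages without data are omitted.

    Returns (after, before): the duplication postdates the split of
    `after` and predates the split of `before`. Either may be None.
    """
    after = None
    for lineage in SPLIT_ORDER:
        state = distinct_orthologues.get(lineage)
        if state is None:
            continue  # no sequence data for this lineage
        if state:
            return after, lineage
        after = lineage
    return after, None


# AAAH genes: nematode (and fly) sequences group with specific vertebrate types.
aaah = {"protostomes": True}

# IGF-1/IGF-2: ascidian IGF, amphioxus ILP and hagfish IGF are not direct
# orthologues of either gene; dogfish has both IGF-1 and IGF-2.
igf = {"ascidian": False, "amphioxus": False, "hagfish": False, "dogfish": True}
```

With these inputs, `bracket_duplication(aaah)` gives `(None, "protostomes")`, meaning the AAAH duplications predate the protostome/chordate split, whereas `bracket_duplication(igf)` gives `("hagfish", "dogfish")`, matching the window stated in the text.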
The PAH gene is on chromosome 12 at 12q21.1, whereas TH and TPH map to 11p15.5. The IGF-1 gene maps adjacent to PAH, whereas IGF-2 is close to TH and TPH. The genes lie within a well-characterized and much-cited chromosomal paralogy region, with over 10 different gene families mapping to both chromosome regions (Lundin 1993; fig. 1). Considering the large number of gene families involved, it seems an inescapable conclusion that a substantial part of human chromosome 11 is evolutionarily related to part of chromosome 12, probably through chromosome duplication due to tetraploidy (Brissenden, Ullrich, and Francke 1984; Craig et al. 1986; Morton et al. 1986; Ledley et al. 1987; Lundin 1993). The discrepancies in the locations of some genes are likely to be due to pericentric inversion and/or translocations altering gene order on chromosome 12 compared with that on chromosome 11 (Brissenden, Ullrich, and Francke 1984; fig. 1).

If we accept that human chromosomes 11 and 12 are evolutionarily related, how can the different duplication dates of AAAH and IGF genes be explained? We propose that in the genome of an early chordate, a linked array of genes existed comprising TH, PAH, TPH, insulin, and IGF (fig. 5). Of these, at least TH, PAH, and TPH can be traced further back into early metazoan evolution, although when they became physically linked to insulin-related gene(s) cannot yet be deduced. During the early stages of vertebrate evolution, the chromosome containing this cluster of genes duplicated to yield two paralogous chromosomes, each containing linked copies of TH, PAH, TPH, insulin, and IGF.

FIG. 5. Proposed model for the evolution of the aromatic amino acid hydroxylase (AAAH) genes and insulin-related genes within a chromosomal paralogy group on human chromosomes (HSA) 11 and 12. Each gene is depicted by a box (hatched boxes, AAAH genes; dotted boxes, insulin-related genes).
We propose that this occurred during tetraploidy of the whole genome, and we date this event to soon after the divergence of jawed vertebrates from the ancestral jawless vertebrates. This equates to the second of the two bouts of gene duplication during vertebrate evolution proposed by Holland et al. (1994) and Sharman and Holland (1996). After tetraploidy, the two IGF genes diverged in function, yielding IGF-1 and IGF-2. By contrast, the duplicate copies of TH, PAH, TPH, and insulin were all purged from the genome. Presumably, deletions, frameshifts, and nonsense mutations corroded duplicate copies to pseudogenes and ultimately deleted them. If loss of function occurred relatively soon after tetraploidy, as predicted by theoretical considerations (Marshall, Raff, and Raff 1994), it is not surprising that all traces of the duplicate genes have been removed from the genome during the 450 million years since tetraploidization.

An interesting corollary of our proposed model is that some of the duplicate genes must have been deleted from each of the two daughter chromosomes. Thus, the genomic region that is now human chromosome 11 has lost a duplicate copy of the PAH gene; the loss has been more extensive on chromosome 12, which has been purged of duplicate TH, TPH, and insulin genes (fig. 5).

These conclusions have several implications for genome evolution. Most important, this example demonstrates that the net result of tetraploidy is not always gene duplication. Instead, as in the case of the AAAH genes, the net result can be transfer of a gene from one chromosome to another. Hence, arrays of paralogous genes on two chromosomes are likely to include some genes related by duplication during tetraploidy and others related by gene transfer during tetraploidy. This implies that attempts to date chromosome or genome duplications using dates of gene duplications must proceed with caution.
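The proposed model (duplication of a five-gene array followed by differential loss from each daughter chromosome) can be traced step by step in a short sketch. The gene names and the losses are those given in the text; the function names, the use of plain lists, and the label "IGF" for the ancestral gene are illustrative assumptions, and gene order on the chromosomes is not modeled.

```python
# Linked array proposed for the early chordate genome.
ANCESTRAL_CLUSTER = ["TH", "PAH", "TPH", "INS", "IGF"]


def tetraploidize(cluster):
    """Duplicate a linked gene array into two paralogous copies."""
    return list(cluster), list(cluster)


def apply_losses(chromosome, lost):
    """Remove genes that decayed to pseudogenes and were deleted."""
    return [gene for gene in chromosome if gene not in lost]


def evolve():
    """Return (hsa11, hsa12, retained) under the proposed model."""
    copy_a, copy_b = tetraploidize(ANCESTRAL_CLUSTER)

    # Losses stated in the text: the future chromosome 11 loses PAH;
    # the future chromosome 12 loses TH, TPH, and insulin.
    copy_a = apply_losses(copy_a, {"PAH"})
    copy_b = apply_losses(copy_b, {"TH", "TPH", "INS"})

    # Genes still present on both copies were retained in duplicate.
    retained = set(copy_a) & set(copy_b)

    # The two surviving IGF copies diverge into IGF-2 and IGF-1.
    hsa11 = ["IGF-2" if gene == "IGF" else gene for gene in copy_a]
    hsa12 = ["IGF-1" if gene == "IGF" else gene for gene in copy_b]
    return hsa11, hsa12, retained
```

Running `evolve()` leaves TH, TPH, INS, and IGF-2 on the chromosome 11 copy and PAH and IGF-1 on the chromosome 12 copy, with IGF as the only gene retained in duplicate. The AAAH genes therefore end up on both chromosomes without any AAAH duplication having occurred at tetraploidy.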
Certainly, it is not valid to use congruence of gene duplication dates within paralogous chromosomes as a test of the tetraploidy hypothesis. Likewise, caution must be exercised in extrapolating gene duplication dates from one gene family to its neighboring gene family (Bailey et al. 1997). More generally, our findings highlight differential gene loss as another factor contributing to the complex evolution of multigene families, alongside gene duplication, gene conversion, rate differences (Williams and Holland 1998), and convergent or parallel evolution (Stewart, Schilling, and Wilson 1987).

Finally, we note that there is a wide variance in the degree of gene loss following tetraploidy. All duplicated AAAH genes have been lost, in stark contrast to a large number of developmentally expressed genes in which duplicate genes have been retained (e.g., Hox gene clusters, Msx homeobox genes, myogenic genes, and IGF; Holland 1996). We speculate that the difference in extent of gene loss is related to the ease with which genes may acquire novel, selectively advantageous roles after duplication. Ubiquitous or broadly expressed genes, particularly those with enzymatic functions, may require rare and multiple substitutions in the protein-coding sequence to alter function. These may be so infrequent that genes are likely to be lost before novel functions arise. In contrast, genes with very localized sites of expression, particularly developmentally expressed genes, may be functionally adapted more simply, by loss or gain of enhancer sequences, creating novel sites of expression. We suggest that these genes are more likely to acquire new roles and to be retained after duplication.

Acknowledgments

P.W.H.H. thanks a critical seminar audience at Cambridge University, U.K., for help in rationalizing apparently contradictory results; Richard Sandford for advice at the earliest stages of this project; Seb Shimeld and Hiroshi Wada for critical comment; and Lars-G.
Lundin for sparking an interest in paralogy. This work was funded by the Medical Research Council, U.K., and the University of Reading.

LITERATURE CITED

BAILEY, W. J., J. KIM, G. P. WAGNER, and F. H. RUDDLE. 1997. Phylogenetic reconstruction of vertebrate Hox cluster duplications. Mol. Biol. Evol. 14:843–853.
BAJAJ, M., T. L. BLUNDELL, J. E. PITTS et al. (12 co-authors). 1983. Dogfish insulin. Primary structure, conformation and biological properties of an elasmobranch insulin. Eur. J. Biochem. 135:535–542.
BELL, G. I., R. L. PICTET, W. J. RUTTER, B. CORDELL, E. TISCHER, and H. M. GOODMAN. 1980. Sequence of the human insulin gene. Nature 284:26–32.
BOULARAND, S., M. C. DARMON, Y. GANEM, J. M. LAUNAY, and J. MALLET. 1990. Complete coding sequence of human tryptophan hydroxylase. Nucleic Acids Res. 18:4257.
BRISSENDEN, J. E., A. ULLRICH, and U. FRANCKE. 1984. Human chromosomal mapping of genes for insulin-like growth factors I and II and epidermal growth factor. Nature 310:781–784.
CHAN, S. J., Q. P. CAO, and D. F. STEINER. 1990. Evolution of the insulin superfamily: cloning of a hybrid insulin/insulin-like growth factor cDNA from amphioxus. Proc. Natl. Acad. Sci. USA 87:9319–9323.
CHAN, S. J., S. O. EMDIN, S. C. KWOK, J. M. KRAMER, S. FALKMER, and D. F. STEINER. 1981. Messenger RNA sequence and primary structure of preproinsulin in a primitive vertebrate, the Atlantic hagfish. J. Biol. Chem. 256:7595–7602.
CRAIG, S. P., V. J. BUCKLE, A. LAMOUROUX, J. MALLET, and I. CRAIG. 1986. Localisation of the human tyrosine hydroxylase gene to 11p15: gene duplication and evolution of metabolic pathways. Cytogenet. Cell Genet. 42:29–32.
DEVASKAR, S. U., S. J. GIDDINGS, P. A. RAJAKUMAR, L. R. CARNAGHI, R. K. MENON, and D. S. ZAHM. 1994. Insulin gene expression and insulin synthesis in mammalian neuronal cells. J. Biol. Chem. 269:8445–845.
D’MELLO, S. R., E. P. WEISBERG, M. K. STACHOWIAK, L. M. TURZAI, A. E. GIOIO, and B. B. KAPLAN. 1988. Isolation and nucleotide sequence of a cDNA clone encoding bovine adrenal tyrosine hydroxylase: comparative analysis of tyrosine hydroxylase gene products. J. Neurosci. Res. 19:440–449.
DUGUAY, S. J., S. J. CHAN, T. P. MOMMSEN, and D. F. STEINER. 1995. Divergence of insulin-like growth factors 1 and 2 in the elasmobranch, Squalus acanthias. FEBS Lett. 371:69–72.
DULL, T. J., A. GRAY, J. S. HAYFLICK, and A. ULLRICH. 1984. Insulin-like growth factor II precursor gene organization in relation to insulin gene family. Nature 310:777–781.
ELLSWORTH, D. L., D. HEWETT-EMMETT, and W.-H. LI. 1994. Evolution of base composition in the insulin and insulin-like growth factor genes. Mol. Biol. Evol. 11:875–885.
FAUQUET, M., B. GRIMA, A. LAMOUROUX, and J. MALLET. 1988. Cloning of quail tyrosine hydroxylase: amino acid homology with other hydroxylases discloses functional domains. J. Neurochem. 50:142–148.
FELSENSTEIN, J. 1993. PHYLIP. Version 3.572c. University of Washington, Seattle.
FLOREZ, J. C., K. J. SEIDENMAN, R. K. BARRETT, A. M. SANGORAM, and J. S. TAKAHASHI. 1996. Molecular cloning of chick pineal tryptophan-hydroxylase and circadian oscillation of its messenger-RNA levels. Mol. Brain Res. 42:25–30.
GREEN, C. B., and J. C. BESHARSE. 1994. Tryptophan hydroxylase expression is regulated by a circadian clock in Xenopus laevis retina. J. Neurochem. 62:2420–2428.
GRENETT, H. E., F. D. LEDLEY, L. L. REED, and S. L. WOO. 1987. Full-length cDNA for rabbit tryptophan hydroxylase: functional domains and evolution of aromatic amino acid hydroxylases. Proc. Natl. Acad. Sci. USA 84:5530–5534.
GRIMA, B., A. LAMOUROUX, C. BONI, J.-F. JULIEN, F. JAVOY-AGID, and J. MALLET. 1987. A single human gene encoding multiple tyrosine hydroxylases with different predicted functional characteristics. Nature 326:707–711.
HIGGINS, D. G., A. J. BLEASBY, and R. FUCHS. 1991. CLUSTAL V: improved software for multiple sequence alignment. CABIOS 8:189–191.
HOLLAND, P. W. H. 1996. Molecular biology of lancelets: insights into development and evolution. Isr. J. Zool. 42:S247–S272.
HOLLAND, P. W. H., J. GARCIA-FERNÀNDEZ, N. A. WILLIAMS, and A. SIDOW. 1994. Gene duplications and the origin of vertebrate development. Dev. Suppl. 1994:125–133.
HUDSON, P., J. HALEY, M. CRONK, R. CRAWFORD, J. HARALAMBIDIS, G. TREGEAR, J. SHINE, and H. NIALL. 1983. Structure of a genomic clone encoding biologically active human relaxin. Nature 301:628–631.
ICHIKAWA, S., T. SASAOKA, and T. NAGATSU. 1991. Primary structure of mouse tyrosine hydroxylase deduced from its cDNA. Biochem. Biophys. Res. Commun. 176:1610–1616.
KAJIMOTO, Y., and P. ROTWEIN. 1989. Structure and expression of a chicken insulin-like growth factor 1 precursor. Mol. Endocrinol. 3:1907–1913.
KAJIMOTO, Y., and P. ROTWEIN. 1990. Evolution of insulin-like growth factor 1 (IGF-1): structure and expression of an IGF-1 precursor from Xenopus laevis. Mol. Endocrinol. 4:217–226.
KOVAL, A. P., A. I. PETRENKO, and V. M. KAVSAN. 1989. Sequence of the salmon (Oncorhynchus keta) [corrected] preproinsulin gene. Nucleic Acids Res. 17:1758.
KWOK, S. C., F. D. LEDLEY, A. G. DILELLA, K. J. ROBSON, and S. L. WOO. 1985. Nucleotide sequence of a full-length complementary DNA clone and amino acid sequence of human phenylalanine hydroxylase. Biochemistry 24:556–561.
LEDLEY, F. D., H. E. GRENETT, D. P. BARTOS, P. VAN TUINEN, D. H. LEDBETTER, and S. L. C. WOO. 1987. Assignment of human tryptophan hydroxylase locus to chromosome 11: gene duplication and translocation in evolution of aromatic amino acid hydroxylases. Somat. Cell Mol. Genet. 13:575–580.
LEDLEY, F. D., H. E. GRENETT, B. S. DUNBAR, and S. L. WOO. 1990. Mouse phenylalanine hydroxylase. Homology and divergence from human phenylalanine hydroxylase. Biochem. J. 267:399–405.
LOMEDICO, P., N. ROSENTHAL, A. EFSTRATIADIS, W. GILBERT, R. KOLODNER, and R. TIZARD. 1979. The structure and evolution of the two nonallelic rat preproinsulin genes. Cell 18:545–558.
LUNDIN, L. G. 1993. Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in man and the house mouse. Genomics 16:1–19.
MARSHALL, C. R., E. C. RAFF, and R. A. RAFF. 1994. Dollo’s Law and the death and resurrection of genes. Proc. Natl. Acad. Sci. USA 91:12283–12287.
MCRORY, J. E., and N. M. SHERWOOD. 1997. Ancient divergence of insulin and insulin-like growth factor. DNA Cell Biol. 16:939–949.
MORTON, C. C., M. G. BYERS, H. NAKAI, G. I. BELL, and T. B. SHOWS. 1986. Human genes for insulin-like growth factors I and II and epidermal growth factor are located on 12q22-q24.1, 11p15, and 4q25-q27, respectively. Cytogenet. Cell Genet. 41:245–249.
NAGAMATSU, S., S. J. CHAN, S. FALKMER, and D. F. STEINER. 1991. Evolution of the insulin gene superfamily. Sequence of a preproinsulin-like growth factor cDNA from the Atlantic hagfish. J. Biol. Chem. 266:2397–2402.
NECKAMEYER, W. S., and W. G. QUINN. 1989. Isolation and characterization of the gene for Drosophila tyrosine hydroxylase. Neuron 2:1167–1175.
NECKAMEYER, W. S., and K. WHITE. 1992. A single locus encodes both phenylalanine hydroxylase and tryptophan hydroxylase activities in Drosophila. J. Biol. Chem. 267:4199–4206.
OHNO, S. 1970. Evolution by gene duplication. Springer-Verlag, Heidelberg, Germany.
PERLER, F., A. EFSTRATIADIS, P. LOMEDICO, W. GILBERT, R. KOLODNER, and J. DODGSON. 1980. The evolution of genes: the chicken preproinsulin gene. Cell 20:555–566.
ROTWEIN, P., and L. J. HALL. 1990. Evolution of insulin-like growth factor 2: characterization of the mouse IGF-2 gene and identification of two pseudo-exons. DNA Cell Biol. 9:725–735.
ROTWEIN, P., K. M. POLLOCK, D. K. DIDIER, and G. G. KRIVI. 1986. Organization and sequence of the human insulin-like growth factor 1 gene. Alternative RNA processing produces two insulin-like growth factor 1 precursor peptides. J. Biol. Chem. 261:4828–4832.
SHAMBLOTT, M. J., and T. T. CHEN. 1992. Identification of a second insulin-like growth factor in a fish species. Proc. Natl. Acad. Sci. USA 89:8913–8917.
SHARMAN, A. C., A. HAY-SCHMIDT, and P. W. H. HOLLAND. 1997. Cloning and analysis of an HMG gene from the lamprey, Lampetra fluviatilis: gene duplication in vertebrate evolution. Gene 184:99–105.
SHARMAN, A. C., and P. W. H. HOLLAND. 1996. Conservation, duplication and divergence of developmental genes during chordate evolution. Neth. J. Zool. 46:47–67.
SHIMATSU, A., and P. ROTWEIN. 1987. Mosaic evolution of the insulin-like growth factors. Organization, sequence, and expression of the rat insulin-like growth factor 1 gene. J. Biol. Chem. 262:7894–7900.
SHULDINER, A. R., S. PHILLIPS, C. T. ROBERTS JR., D. LEROITH, and J. ROTH. 1989. Xenopus laevis contains two nonallelic preproinsulin genes. cDNA cloning and evolutionary perspective. J. Biol. Chem. 264:9428–9432.
STEWART, C.-B., J. W. SCHILLING, and A. C. WILSON. 1987. Adaptive evolution in the lysozymes of foregut fermenters. Nature 330:401–404.
STOLL, J., C. A. KOZAK, and D. GOLDMAN. 1990. Characterization and chromosomal mapping of a cDNA encoding tryptophan hydroxylase from a mouse mastocytoma cell line. Genomics 7:88–96.
WILLIAMS, N. A., and P. W. H. HOLLAND. 1998. Gene and domain duplication in the chordate Otx gene family: insights from amphioxus Otx. Mol. Biol. Evol. 15:600–607.
WILSON, R., R. AINSCOUGH, K. ANDERSON et al. (53 co-authors). 1994. 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature 368:32–38.
ZHAO, G., T. XIA, J. SONG, and R. A. JENSEN. 1994. Pseudomonas aeruginosa possesses homologues of mammalian phenylalanine hydroxylase and 4α-carbinolamine dehydratase/DCoH as part of a three-component gene cluster. Proc. Natl. Acad. Sci. USA 91:1366–1370.

CLAUDIA KAPPEN, reviewing editor
Accepted July 23, 1998