JOURNAL OF EXPERIMENTAL ZOOLOGY (MOL DEV EVOL) 294:47–62 (2002)
Genomic Analysis of Hox Clusters in the Sea Lamprey Petromyzon marinus
STEVEN Q. IRVINE1, JANET L. CARR2†, WENDY J. BAILEY3, KAZUHIKO KAWASAKI4, NOBUYOSHI SHIMIZU5, CHRIS T. AMEMIYA6, AND FRANK H. RUDDLE1*
1Yale University, Department of Molecular, Cellular and Developmental Biology, New Haven, Connecticut
2Genaissance Pharmaceuticals, New Haven, Connecticut
3Merck and Co., Inc., Department of Bioinformatics, West Point, Pennsylvania
4Department of Anthropology, Pennsylvania State University, University Park, Pennsylvania
5Keio University School of Medicine, Department of Molecular Biology, Tokyo, Japan
6Virginia Mason Research Center, Seattle, Washington
ABSTRACT
The sea lamprey Petromyzon marinus is among the most primitive of extant vertebrates. We are interested in the organization of its Hox gene clusters, because, as a close relative of the gnathostomes, this information would help to infer Hox cluster organization at the base
of the gnathostome radiation. We have partially mapped the P. marinus Hox clusters using phage,
cosmid, and P1 artificial chromosome libraries. Complete homeobox sequences were obtained for
the 22 Hox genes recovered in the genomic library screens and analyzed for cognate group identity. We estimate that the clusters are somewhat larger than those of mammals (roughly 140 kbp
vs. 105 kbp) but much smaller than the single Hox cluster of the cephalochordate amphioxus (at
more than 260 kb). We never obtained more than three genes from any single cognate group from
the genomic library screens, although it is unlikely that our screen was exhaustive, and therefore
conclude that P. marinus has a total of either three or four Hox clusters. We also identify four
highly conserved non-coding sequence motifs shared with higher vertebrates in a genomic comparison of Hox 10 genes. J. Exp. Zool. (Mol. Dev. Evol.) 294:47–62, 2002. © 2002 Wiley-Liss, Inc.
The Hox classes of homeobox transcription factor genes are conserved in all animals and have
critical roles in developmental patterning (DeRobertis, ’94). The discovery that these genes are
arrayed in conserved genomic clusters and that
their position in the cluster is related to their expression patterns along the body axis is one of
the most striking findings of modern biology
(Gehring, ’94). All invertebrates have a single Hox
cluster. All vertebrates examined to date, on the
other hand, have multiple Hox clusters, with as
many as seven found, for example, in the zebrafish
(Amores et al., ’98). The ubiquitous presence of
multiple Hox clusters in vertebrates, given the
single cluster in invertebrates, has led to the idea
that cluster multiplicity is of great importance in
the evolution of vertebrate lineages and in the control of developmental patterning (Holland and
Garcia-Fernandez, ’96a; Ruddle et al., ’99).
A history of cluster duplications in the vertebrates would help explain the significance of different cluster numbers in various lineages. If the
history of the clusters varies between lineages it
would have profound implications for the evolution of the gene orthologs and their associated
regulatory sequences. For example, if there have
been independent cluster duplications within
separate vertebrate lineages, such as occurred in
teleost fish, it means that these independently derived clusters are not directly homologous to clusters in other vertebrates, and have their own evolutionary histories.
†Co-first author.
Grant sponsor: National Science Foundation; Grant numbers: IBN9630567 (FHR), DBI-9803937 (SQI); Grant sponsor: National Institutes of Health; Grant number: GM-09966 (FHR); Grant sponsor: National Science Foundation/Alfred P. Sloan Postdoctoral Research Fellowship in Molecular Evolution; Grant number: DBI-9803937
*Correspondence to: Dr. Frank H. Ruddle, Department of Molecular, Cellular and Developmental Biology, Yale University, P.O. Box 208103, New Haven, CT 06520–8103. Email: [email protected]
Received 14 August 2001; Accepted 10 January 2002
Published online in Wiley InterScience (www.interscience.wiley.com).
DOI: 10.1002/jez.10090
The agnathan vertebrates are the most primitive true vertebrate taxa extant (Forey and
Janvier, ’93). Thus they occupy a pivotal phylogenetic position between the invertebrate cephalochordates such as amphioxus, which has one
cluster (Garcia-Fernandez and Holland, ’94), and
the gnathostomes, or all other living vertebrates,
which have multiple clusters. While most agnathan groups are extinct, two still survive, the lampreys and hagfishes, forming a group called the
cyclostomes. A traditional view based on morphological study holds that the lampreys are more
closely related to gnathostomes than hagfishes
(Forey and Janvier, ’93; Rasmussen et al., ’98).
Molecular studies have found either a monophyletic grouping of lampreys and hagfishes (Mallatt
and Sullivan, ’98; Stock and Whitt, ’92) or support for the traditional view (Rasmussen et al.,
’98). Because the lamprey is far more amenable
to laboratory culture and embryology than the
hagfish, we have chosen to assess Hox cluster organization in the sea lamprey Petromyzon marinus. Regardless of the phylogenetic relationships
within the cyclostomes, examination of the Hox
cluster in the lamprey will help to reconstruct the
ancestral state of the Hox clusters at the initial
radiation of the vertebrates.
Three previous studies (Pendleton et al., ’93;
Sharman and Holland, ’98; W. J. Bailey, unpublished) have used polymerase chain reaction (PCR)
screening to estimate gene composition and cluster number in lampreys. These data, in combination with the present mapping study in the sea
lamprey Petromyzon marinus, indicate that either three or four Hox clusters exist in this species. We discuss the implications of this prediction
for the evolution of vertebrate Hox clusters.
MATERIALS AND METHODS
P. marinus libraries
Three P. marinus genomic libraries were used
in this study. The first was a cosmid library constructed by a commercial laboratory (Stratagene,
La Jolla, CA) (Pendleton et al., ’93). The second
was a lambda phage library made as outlined below. Finally, we screened a P1 Artificial Chromosome (PAC) library (Amemiya et al., ’96).
Construction and plating of phage library
Lamprey genomic DNA prepared from embryos
from a single-pair mating was provided by J. W.
Pendleton. DNA was partially digested with
Sau3AI using 0.4–0.6 units for 30 µg of DNA in 1
ml, 37°C for 1 hour. Size-fractionated DNA was
ligated with LambdaGEM-11 BamHI arms (Promega, Madison, WI), packaged in GigaPack Gold
(Epicentre Technologies, Madison, WI) and titered
using KW251 cells.
For the primary screen of the library, a total of 8×10^5 plaque-forming units (pfu) were screened by hybridization to a Hox cognate group 11 probe (see below).
Analysis of genomic library clones
Primary picks were amplified by PCR with a 3′
modification of the HoxE and HoxF primers
(Bartels et al., ’93) that allows them to be used
with the CloneAMP pAMP1 kit (GibcoBRL, Rockville, MD). PCR was performed as in Bartels et
al. (’93). PCR products were cloned into the
pAMP1 cloning vector following manufacturer’s
guidelines. Five clones from each phage isolate
were sequenced, and this sequence was used to
give the clone a tentative identity.
Identity of the clones was confirmed by direct
sequencing from phage clones using 32P-end-labeled oligonucleotide primers and the Thermosequenase (Amersham, Piscataway, NJ) cycle
sequencing kit.
Phage, cosmid, and PAC clones were restriction
mapped for the following enzymes: EcoRI, XhoI,
NotI, SfiI, and SpeI, using end probes, oligonucleotides, homeobox probes, and other probes specific to certain
phage clones. All phage clones were cut with a
series of three restriction enzymes: AvaII, BanI,
and HincII. These digests were run on an agarose gel, blotted, and the filters were probed with
dig-labeled oligos to the T7 or SP6 promoter sites.
For each “end,” a restriction site was chosen,
which cut the DNA between 500 bp and 1 kbp
from the promoter site. The phage DNA was then
digested with that enzyme and used as a template for synthesis of riboprobe.
DNA probes were labeled by PCR using digoxigenin-11-dUTP (10X DNA labeling mix, Roche, Indianapolis, IN). Primers 5E5 and HoxF were used
and lamprey group 11 homeoboxes were used as
template.
Oligonucleotide probes were end-labeled with
digoxigenin-11-ddUTP (Roche) and terminal transferase (Roche) following the manufacturer’s instructions with the following modifications: The
reaction mixture was incubated at 37°C for 30
minutes and no further purification was performed before hybridization.
Riboprobes were prepared by incubating linearized DNA with dig RNA labeling mix (Roche), RNA
polymerase buffer, and either the T7 or SP6 RNA
polymerase in the presence of an RNAse inhibitor (RNAsin; Promega) at 40°C for 30–60 minutes.
Phylogenetic and genomic analysis
Homeobox nucleotide and conceptual amino acid
sequences from mouse and amphioxus Hox cognate groups 1 to 11 were obtained from Genbank.
Homeodomain amino acid sequences were analyzed by neighbor-joining trees constructed using
the Neighbor program, based on protein distance
matrices derived using the PAM-Dayhoff option
of ProtDist, both programs from the Phylip 3.5
package (Felsenstein, ’95).
Genomic sequences were aligned and compared
using Pipmaker (Schwartz et al., 2000) available
at http://nog.cse.psu.edu/pipmaker. Low complexity and repeat sequences were masked using
RepeatMasker (Smith and Green, unpublished
data) available at http://ftp.genome.washington.edu/cgi-bin/RepeatMasker.
RESULTS
Three separate P. marinus genomic libraries were
screened for Hox genes by filter hybridization. Three
positive clones from a cosmid library were obtained
and restriction mapped. CosA2, of approximately 34
kb, contains a Hox1 gene. Two overlapping clones,
Cos4A4 and Cos2B, comprise 68 kb of total sequence
and contain Hox5, Hox6, Hox7, Hox8, and Hox9
genes. Other contigs were constructed using a
lambda phage library for genomic walking. Two additional large contigs were obtained in this manner. One from nine overlapping clones of 38 kb total
contains Hox2 and Hox3 genes. The other, also from
nine overlapping clones, is 52 kb in length and contains Hox5/6, Hox6/7, and Hox8 genes. A remaining seven contigs and five individual phage clones
contained single Hox genes, which we were unable
to link with other clones. These results are summarized in Figure 1. Detailed restriction maps are
available on request.
Estimated cluster size
The average spacing of the 12 linked Hox genes
is 15.6 kb—slightly more than the average spacing of genes in the mouse Hox clusters, which is
13.8 kb. Assuming a similar average number of
genes per cluster (8.75 in the mouse), that would
give an estimate of average cluster size of about
136 kb. This size is much smaller than that of
the amphioxus cluster, which is 260 kb in length
from the Hox1 to the Hox10 genes.
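The size estimate above is a single multiplication and can be checked with a short sketch. The spacing, gene-count, and amphioxus values are taken from the text; the function and constant names are illustrative assumptions, not part of the original analysis.

```python
def estimated_cluster_size_kb(mean_spacing_kb: float, genes_per_cluster: float) -> float:
    """Estimate cluster size as mean gene spacing times the number of genes per cluster."""
    return mean_spacing_kb * genes_per_cluster


# Values quoted in the text.
LAMPREY_SPACING_KB = 15.6        # average spacing of the 12 linked lamprey Hox genes
MOUSE_GENES_PER_CLUSTER = 8.75   # average number of genes per mouse Hox cluster
AMPHIOXUS_SPAN_KB = 260.0        # amphioxus cluster, Hox1 to Hox10

lamprey_kb = estimated_cluster_size_kb(LAMPREY_SPACING_KB, MOUSE_GENES_PER_CLUSTER)
ratio_to_amphioxus = lamprey_kb / AMPHIOXUS_SPAN_KB

print(f"Estimated lamprey cluster size: {lamprey_kb:.1f} kb")
print(f"Relative to the amphioxus Hox1-Hox10 span: {ratio_to_amphioxus:.2f}")
```

The estimate inherits the assumption stated in the text, namely that lamprey clusters carry about as many genes as mouse clusters.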
Gene orthology
The entire homeobox was sequenced from each
clone that had a unique restriction map. Nucleotide sequences are available in GenBank, accession numbers AF410908–AF410925. The deduced
amino acid sequences are shown aligned to mouse
and amphioxus sequences in Figure 2. To assign
genes to Hox cognate groups, the complete homeodomain amino acid sequences from P. marinus
were aligned with those from mouse and amphioxus. This alignment was then used in a neighbor-joining analysis resulting in the bootstrapped
tree of Figure 3.
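As an illustration of what bootstrapping an alignment involves, the sketch below resamples alignment columns with replacement to build pseudo-replicate alignments, each of which would then be passed to a distance and tree-building step. It is a minimal sketch, not the Phylip implementation used in the study; the function name and the toy sequences are hypothetical and are not real homeodomain data.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement; the same indices are
    # applied to every sequence so that columns stay intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Hypothetical toy alignment (illustrative only).
toy = {"seqA": "RKRGRQTYTR", "seqB": "RKRGRTAYTR", "seqC": "GRRGRQTYSR"}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(len(replicates), "replicates of length", len(replicates[0]["seqA"]))
```

The bootstrap percentage at a node is then the share of replicate trees in which that grouping appears.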
The known mouse Hox gene cognates cluster
together on the neighbor-joining tree in all cases.
Note that for the posterior (i.e., 5′) cognate groups
the amphioxus genes do not cluster with those of
mouse and lamprey, a finding termed “posterior
flexibility” by Ferrier et al. (2000). Excluding the
amphioxus genes, one or more lamprey homeodomains clustered with the complete complement
of mouse cognates for groups 1, 2, 3, 8, and 11
with bootstrap confidence levels of 97% or greater.
Groups 4, 9, and 10 are recovered at lower bootstrap proportions.
Because of the high degree of amino acid similarity between homeodomains of groups 5, 6, and 7,
these groups were not resolved in the tree. We attempted to resolve the relationships between these
groups by including 3 residues N-terminal and 6
residues C-terminal of the homeodomain for each
gene in a neighbor-joining analysis. These additional
data, which include some of the conserved residues outside the homeodomain (Sharkey et al., ’97),
failed to resolve the relationships with significant
bootstrap proportions. (For example, Amphi-Hox5,
the three mouse Hox5 genes, and PmHoxN5 and
PmHoxJ5/6/7 genes form a clade, but the bootstrap proportion is only 55%; data not shown).
Where we have linkage data for the P. marinus
genes (HoxK6/7, HoxL5/6, HoxN5, HoxN6, HoxN7,
HoxQ8, HoxQ8a, HoxV9), orthology was assigned
based on cluster position to the extent possible.
Names incorporating more than one paralogy group
number, such as HoxK6/7, reflect ambiguous assignments.
It is possible that these assignments based on
cluster position could be incorrect if there have
been tandem duplications of individual genes after cluster duplication. The most likely candidates
for this situation are PmHoxN6 and PmHoxN7.
These genes have identical nucleotide sequences
over the entire homeobox, although they begin to
diverge immediately outside (Fig. 4). This conservation of even synonymous sites suggests that the
two genes were tandemly duplicated relatively recently, possibly with an extreme level of codon bias
preserving third positions, or that gene conversion within the homeobox has occurred.
Fig. 1. Contig maps of lamprey Hox genes. Contigs are designated by lines, with relative positions of homeoboxes shown by thick bars. Constituent clones are shown below each contig. Linkage relationships between individual contigs are unknown, and relative positions of unlinked clones are arbitrary.
Fig. 2. Homeodomain amino acid sequences. Conceptual translations based on DNA sequences. Lamprey homeodomains (bold type) shown aligned with amphioxus and mouse sequences obtained from GenBank. The sequence of Drosophila melanogaster Antennapedia is shown at the top of the figure for reference.
Fig. 3. Gene orthology estimation. Neighbor-joining tree using complete homeodomain amino acid sequences. Amino acid distances calculated using a PAM/Dayhoff matrix, consensus tree of 100 bootstrap replications. Internal branch lengths are proportional to bootstrap support, with bootstrap percentages shown at nodes. Where bootstrap support is lower than 50%, branches were collapsed. Cognate group members shown by brackets.
Fig. 4. Illustration of instances where nucleotide sequences of different gene cognates are identical within the region of the homeobox amplified using common degenerate PCR primers. The PCR-amplified region is shown shaded. Each of the sequences shown is known by restriction maps of the phage clones to be derived from a separate gene. However, sequence based on the PCR-generated fragment alone is insufficient to distinguish the cognates.
Comparison of PCR survey and genomic mapping data
In order to relate the different lamprey Hox
gene sequences to each other, the identifiers of
PCR sequences from two surveys and of sequences
from genomic clones found in this study and in a
similar study by another group (A. Force and J.
Postlethwait, personal communication) are listed
in Table 1. Sequence names at the same horizontal position in the table are the same sequence.
Some homeobox nucleotide sequences showed
little or no variation within certain regions, as
mentioned for PmHoxN6 and N7. There are four
different sets of sequences, each of which could
TABLE 1. Clone summary
Putative
cognate group
1
2
Unique
PCR sequence1
a
b
c4
d4
e
Matching
Irvine, et al.
genomic clone2
Matching
Force, et al.
genomic clone3
1B
pethox1w
E2
3
–
3
4
g4
h
(l)
(n)
G4
5/6/7
l4
j
k
l
f
m4
n
o4
p4
pethox3y5
pethox4y
Total unique
clones assignable to
cognate group
Genomic clones
assignable to
cognate group
4
1
1
1
6
2
26
4
3
13
8
2
3
4
3
3
3
4
3
pethox4x
pethox4w
J5/6/7
K6/7
L5/6
F5/6/7
pethox5w
pethox83
pethox51
pethox31
N5
N6
N7
pethox6w
pethox5x
#139
8
q
r4
9
s4
t
u
v
10
w
11
11.1
11.6
11.8
x
1
Q8
Q8a
R8
T9
V9
W10a
W10b
X10
PCRHx13(9)
Y11
pethox11w
Z11a
Z11b
1 Pendleton et al. PCR survey and W. Bailey, unpublished.
2 This study.
3 Force and Pendleton, personal communication.
4 6 nt or fewer differences with another PCR clone.
5 May be the same as genomic clone 3.
6 This number will be 1 if genomic clone 3 and pethox3y are the same.
pethox9w
pethox9x
pethox9y
pethox10w
only be detected as one sequence in the PCR survey. These sequences are shown in Figure 4. The
region that is amplified by PCR is shown shaded.
In each of the four sets, the sequence is identical
in the region amplified by PCR, but there are mismatches outside that region. Mapping information
confirms that these sequences belong to different
genes. Each of the sequences is surrounded by a
unique restriction map. The uniqueness of the
HoxN5, -6, and -7 and the HoxQ8 and -Q8a sequences is further supported by linkage analysis.
The three HoxN sequences are linked to each
other in one cluster, and their positions allow us
to assign each to a distinct cognate group, with
the reservations noted above. The two “Q” sequences are in different linkage groups. HoxQ8 is
linked to the three HoxN genes; whereas, HoxQ8a
is linked to HoxK6/7 and HoxL5/6.
In addition to these genes, which share the same
sequence within a portion of the homeobox, we
also found a number of PCR fragments that differed by six or fewer nucleotides from another sequence (marked with footnote 4 in Table 1). According to the
calculations of Misof and Wagner (’96), these sequences may represent allelic variants of the same
gene. Of the sequences that differed by seven or
more nucleotides, only four were not identified in
our library screens. The medial sequences #139
and “h” were only found in one survey each and
may be PCR artifacts or possibly contamination
from another organism. In cognate groups 1 and
9, the sequences “a” and “u” were found in both
PCR surveys, but not in our libraries (although
another group has found this sequence in their
cosmid library screen; A. Force and J. Postlethwait, personal communication). These two sequences are probably genuinely missing from our
genomic clones. Though the phage library screen
produced most of the Hox genes identified by PCR,
it is possible that some genes were missed. From
a total of 22 different genes found, there were six
genes (27%) for which only one phage clone was
isolated. That suggests that Hox-containing clones
were not isolated to saturation from the library.
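The six-nucleotide criterion used above to flag possible allelic variants can be expressed as a simple pairwise comparison. The following is a minimal sketch under the assumption that the fragments are already aligned and of equal length; the function names and the example fragments are hypothetical and are not lamprey sequences.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def possibly_allelic(seq_a: str, seq_b: str, max_differences: int = 6) -> bool:
    """Flag a pair as possible alleles when it differs at six or fewer sites."""
    return count_differences(seq_a, seq_b) <= max_differences


# Hypothetical 20-nt fragments (illustrative only).
frag_1 = "ACGTACGTACGTACGTACGT"
frag_2 = "ACGTACGAACGTACGTACGA"   # differs from frag_1 at 2 positions
frag_3 = "TTTTTTTTTTTTACGTACGT"   # differs from frag_1 at 9 positions

print(count_differences(frag_1, frag_2), possibly_allelic(frag_1, frag_2))
print(count_differences(frag_1, frag_3), possibly_allelic(frag_1, frag_3))
```

A pair flagged this way is only a candidate for allelism; as the text notes for PmHoxN6 and PmHoxN7, mapping data can still show that identical or near-identical fragments come from separate genes.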
Number of Hox clusters in P. marinus
According to the PCR data summarized in
Tables 1 and 2 one might conclude that there are
four Hox clusters in the lamprey P. marinus. However, if clones with fewer than seven nucleotide
differences as compared with another cognate
group member are excluded, assuming they are
alleles of another gene, and the remaining unique
sequences are combined with the total number of
genomic clones recovered, the resulting numbers
are consistent with a total of three Hox clusters.
This is also the case if the PCR data for the lamprey Lampetra planeri (Sharman and Holland,
’98) is considered (Table 2). On the other hand,
the Hox clusters of teleosts and mammals have
experienced many losses of genes within the various cognate groups. Only groups Hox4 and Hox9
in mammals, for example, have a cognate representative in all four clusters. Our lamprey genomic
survey, in combination with data from the laboratory of J. Postlethwait (personal communication), found three members for cognate groups 4,
8, 9, 10, and 11, and found 8 members total for
TABLE 2. Numbers of genes recovered from surveys sorted by cognate group
Reference
PCR surveys
Cognate group
Genomic clones
Pendleton et al., 1993
Bailey, unpublished
Sharman and Holland, 1998
This study
1
2
3
4*
1
0
4*
1
1
3
1
1
1
1
1
4
2*
1
5
1
5
6
7
9*
7*
8
9
10
11
12
13
2*
4*
2
–
–
–
2*
4*
2
3
–
–
7
2
3
2
–
–
1
3
2
3
3
–
–
Data based on Petromyzon marinus, except Sharman and Holland, 1998, based on Lampetra planeri.
*Indicates that one or more clones had less than seven nucleotide differences from others over the homeobox and may be allelic variants
leading to possible overestimation of gene number.
groups 5, 6, and 7 (Table 1). If the lamprey clusters have a similar level of gene loss as those of
the mammals, there would be more than three
clusters. Therefore, either three or four Hox clusters are likely to exist in P. marinus.
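The counting argument behind the three-cluster lower bound can be sketched as follows, assuming that each cluster carries at most one gene per cognate group (that is, ignoring tandem duplications, which the text notes are possible). The member counts are those quoted above for the combined genomic surveys; the function and variable names are illustrative assumptions.

```python
import math

# Members recovered per cognate group, as stated in the text.
members_per_group = {"4": 3, "8": 3, "9": 3, "10": 3, "11": 3}

# Groups 5, 6 and 7 could not be resolved individually: 8 genes over 3 groups.
POOLED_GENES = 8
POOLED_GROUPS = 3


def lower_bound_clusters(per_group, pooled_genes, pooled_groups):
    """Minimum cluster number if each cluster holds at most one gene per cognate group."""
    bound_from_single_groups = max(per_group.values())
    bound_from_pooled_groups = math.ceil(pooled_genes / pooled_groups)
    return max(bound_from_single_groups, bound_from_pooled_groups)


minimum_clusters = lower_bound_clusters(members_per_group, POOLED_GENES, POOLED_GROUPS)
print("Lower bound on cluster number:", minimum_clusters)
```

This gives only a lower bound: gene loss of the kind seen in mammalian clusters would hide clusters from such a count, which is why four clusters remain possible.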
The Hox9 cognate group can be useful for determining cluster identity because it has representatives in all four mammalian clusters. We
tested whether the two lamprey Hox9 genes found
in our genomic screen could be assigned to particular clusters based on phylogenetic analysis
with the mouse orthologs and using amphioxus
Hox9 as an outgroup. We used nucleotide sequences for the entire homeodomain in maximum
parsimony analysis and excluded third codon positions on the assumption that these were saturated for nucleotide substitutions. We found that
in the best-supported tree, the two lamprey genes
clustered together as a sister group to the mammalian Hox9 genes. The next best-supported tree
grouped both genes with Hoxc-9 (data not shown).
However, neither tree had significant bootstrap
values (54% and 40%, respectively) probably because there were only 16 non-synonymous character state changes in the ingroup. In addition,
given the low number of informative characters,
it is possible that the lamprey genes are clustering together due to differences in codon bias between lampreys and mammals. In short, based
on the data we currently have, we cannot rule for
or against duplication of one or more of the lamprey clusters independent of the duplications leading to the gnathostome clusters.
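Excluding third codon positions, as done for the parsimony analysis above, amounts to keeping positions 1 and 2 of every codon. The sketch below assumes an in-frame coding sequence; the function name and the 9-nt example are hypothetical.

```python
def drop_third_positions(cds: str) -> str:
    """Keep codon positions 1 and 2, discarding position 3 of every codon."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)


# Hypothetical in-frame fragment: codons ATG, GCC, AAA.
example = "ATGGCCAAA"
reduced = drop_third_positions(example)
print(reduced)

# A 180-bp homeobox would be reduced to 120 characters.
reduced_homeobox_length = len(drop_third_positions("A" * 180))
print(reduced_homeobox_length)
```

Because most synonymous changes fall at third positions, the remaining characters are dominated by non-synonymous changes, which is consistent with the small number of informative characters reported above.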
Genomic analysis of the lamprey HoxW10a region
In order to examine a portion of the lamprey
Hox clusters in greater detail, we sequenced the
entire 30 kb PAC clone Pm18, which contains the
gene HoxW10a (GenBank accession no. AF464190).
A diagram of the genomic organization of this
clone is shown in Figure 5, along with a series of
percent identity plot (PIP) alignments to Hox clusters from other vertebrate species. A first observation on examination of Figure 5 is that in
comparison to the sequences from other vertebrates, the lamprey has a higher presence of
simple repeat sequences and a greater proportion
of CpG islands. In addition, as compared with the
other vertebrates, the lamprey sequence around
HoxW10a is expanded in several respects. First,
the inferred intron of HoxW10a is larger than
those of the other sequences (approximately 7.5
kb as compared with a maximum of 3 kb for the
other Hox10 genes). Secondly, there is a transposable element (Tc1-like transposase sequence)
inserted downstream of HoxW10a. Transposable
elements are rarely, if ever, found in gnathostome
Hox clusters. In addition, the PIP analysis identifies four regions of high sequence conservation
in non-coding DNA. These conserved elements are
located in the same relative positions with respect
to the Hox10 exons in all the sequences compared,
but in the lamprey are spread out in a manner
consistent with a general expansion of the genome
around HoxW10a. Furthermore, despite the 20 kb
of sequence we obtained 3′ of HoxW10a exon 2,
we failed to encounter an adjacent Hox gene on
this clone. This is in contrast to the average 5.4
and 9.4 kb distances to the first exon of Hox9 for
the other Hoxa and Hoxc sequences, respectively.
The sequence alignments for the four conserved
elements are shown in Figure 6. These sequences
are conserved over total divergence times ranging
from more than 8×10^8 years for C1 and C2 to at
least 1.5×10^9 years for the A2 sequence. Interestingly, the A2 and C2 elements are found in both
the Hoxa and Hoxc clusters of other vertebrates.
This presence in other clusters is reminiscent of the
HB-1 element found in several locations in the
Drosophila Hox cluster and in multiple clusters in
vertebrates, and which has been shown to be responsive to Hox proteins (Haerry and Gehring, ’96;
Haerry and Gehring, ’97). Another possibility is that
this sequence is a basic cluster control element, as
has been proposed for a conserved sequence found
in all four vertebrate Hox clusters downstream of
HOXA7, HOXB7, HOXC8, and HOXD8 (termed H8/
7-6 FCS; Kim et al., 2000).
The C1 element is also found as an inverted duplication in the lamprey, termed C3 (Figs. 5 and
6), but not in the mammalian sequences examined.
Because the sequence is largely preserved in its
new location, this duplication must either be recent or it must retain some functional significance.
Fig. 5. Percent identity plots (PIPs) of sequence alignments between lamprey PAC clone Pm18 (HoxW10a) and the corresponding genomic regions of various Hoxa-10 and Hoxc-10 genes available in GenBank. The species and cluster names listed first to the left of the PIPs are the reference sequences for which the exon locations and repeated elements are depicted above the PIP; e.g., Fugu Hoxa vs. lamprey Pm18 shows the genomic organization around the Fugu Hoxa-10 gene, with that sequence locally aligned with Pm18 by the PipMaker program (Schwartz et al., 2000). Conserved regions are circled and the corresponding sequences linked by solid lines between PIPs. Refs. and GenBank acc. nos.: Fugu, Aparicio et al. (’97), U92573; Striped bass, Snell et al. (’99), AF089743; Horned shark, Kim et al. (2000), AF224262; Mouse Hoxa, AC015583; Mouse Hoxc, AC021667; Human HOXA, AC004080; Human HOXC, NT_009563.
Fig. 6. Sequence alignments for conserved regions identified in the PIPs of Figure 5. The inverted C3 element is shown aligned with the C1 sequences. Abbreviations: Fr, Fugu rubripes; Hf, Heterodontus francisci; Hs, Homo sapiens; Mm, Mus musculus; Ms, Morone saxatilis; Pm, Petromyzon marinus. Refs. as in Figure 5.
DISCUSSION
In these studies we have isolated phage, cosmid, and PAC clones that contain most of the lamprey Hox genes previously identified by PCR (Pendleton et al., ’93), along with several new genes. We have determined the restriction maps of all of these clones, including linking some of them into clusters. We have also determined the nucleotide sequence of the homeobox (180 bp) for all of these
sequences. Though the cluster map is incomplete,
it does show that there are at least three clusters
of Hox genes in the lamprey, for two of which we
have shown linkage data. Thus, the lamprey is
the most primitive chordate described to have
multiple Hox clusters.
We have also identified non-coding sequences
shared between P. marinus and higher vertebrates, and conserved for more than one billion
years of divergence time. These regions are likely
to be regulatory elements, as shown by studies of
sequence conservation combined with experimental assessment of function (Hardison, 2000; Loots
et al., 2000). The extreme level of conservation of
these elements suggests that they perform some
basic function in regulating nearby Hox genes or
that they are general cluster control or insulation
elements.
Lamprey clusters and hypotheses of duplication history
As of yet, we have insufficient data to determine
with confidence the identities and evolutionary histories of the lamprey Hox clusters. However, based
on certain assumptions, evolutionary hypotheses
can be proposed for testing as more data becomes
available. We propose two initial assumptions in
constructing these hypotheses. First, we assume
a three-step duplication scenario based on the
analysis of Bailey et al. (’97). This study used extensive sequence both from Hox4 and Hox9
paralogy group genes combined with sequence
from the linked collagen genes to reconstruct the
most likely duplication scenario based on several
phylogenetic reconstruction methods. The model
proposes that the ancestral Hox cluster was similar to the tetrapod D cluster, based on outgroup
analysis; this cluster duplicated to create an A-like cluster, which in turn produced the B and C clusters,
i.e., (D(A(B,C))).
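The nested notation (D(A(B,C))) can be read as a rooted tree in which every bifurcation is one cluster duplication. The sketch below encodes it as nested Python tuples and counts the duplication steps; the tuple representation and function names are illustrative assumptions.

```python
def leaves(tree):
    """List the cluster names at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def count_duplications(tree):
    """Count internal (bifurcating) nodes; each one represents a cluster duplication."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_duplications(child) for child in tree)


def to_newick(tree):
    """Render the tree in comma-separated parenthetical form."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# (D(A(B,C))): D splits off first, then A, then B and C separate.
three_step_model = ("D", ("A", ("B", "C")))

print(leaves(three_step_model))
print(count_duplications(three_step_model))
print(to_newick(three_step_model))
```

Four clusters produced by three successive duplications is what distinguishes this model from a two-step scenario, in which both clusters of a two-cluster ancestor duplicate together.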
We also assume a four cluster organization at
the base of the gnathostome radiation. This is a
parsimonious assumption given that all tetrapods
examined have four Hox clusters. In addition, library and PCR screening data (Kim et al., 2000;
C.-B. Kim, personal communication) suggests that
sharks, the most primitive gnathostomes, have
four or fewer Hox clusters. In this view, the seven
or more Hox clusters existing in ray-finned fishes
are the result of one or more duplication events
in that lineage after the divergence of the lobefinned fishes (Amores et al., ’98).
We propose four alternate scenarios, based on
the preceding assumptions (Fig. 7). In Case A, the
common ancestor of lampreys and gnathostomes
had one Hox cluster, with two or three independent cluster duplications occurring within the lamprey lineage. Three cluster duplications then
occurred in the basal gnathostome lineage to create the four clusters retained in tetrapods. In this
case, true vertebrates evolved without multiple
Hox clusters. However, if the suggestion that one
or more lamprey Hox clusters is not directly related to those of tetrapods is correct, it would lend
weight to this hypothesis.
In Case B, two Hox clusters were present in a
primitive agnathan, and one or two independent
cluster duplications occurred in the lamprey lineage to give three or four clusters. Two cluster duplications then occurred at the base of the
gnathostome clade to complete the four ancestral
tetrapod Hox clusters. This scenario is consistent
with the proposal of Amores et al. (’98), who argue for a
two-step duplication scenario, with a duplication
of both ancestral agnathan clusters, possibly simultaneously by genome duplication, to produce
the four-cluster ancestral gnathostome arrangement. This view would still be, in part, consistent
with the possibility presented above that one or
more lamprey Hox clusters is a sister to all the
tetrapod clusters.
In Case C, three Hox clusters were present in
the ancestral agnathan, the result of two cluster
duplications, with a possible independent duplication in the lineage of the modern lamprey. An
additional duplication occurred in the basal gnathostome lineage, producing the final tetrapod cluster complement.
Finally, in Case D, all the duplications leading to the gnathostome Hox clusters occurred before the divergence of lampreys. In this
case, all the lamprey clusters would be directly
homologous to those of mammals. This scenario
is the most parsimonious of the four in the absence of significant evidence of independent duplications along the lamprey lineage. We propose
that this case is the best working hypothesis,
because we believe there are most likely four
lamprey clusters, as judged from gene numbers
in each paralogy group, and allowing for levels
of gene loss consistent with those of other vertebrate groups.
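The relative parsimony of the four cases can be illustrated by tallying cluster duplications. The following Python sketch is only a bookkeeping aid under stated assumptions: each duplication adds exactly one cluster (so a genome duplication that doubles two clusters is scored as two duplications), gnathostomes ancestrally had four clusters, each lineage reaches its cluster number by the fewest possible changes, and a lamprey count below the ancestral count is scored as a loss. The names `ANCESTRAL_CLUSTERS`, `count_events`, and `tally` are hypothetical.

```python
# Hypothetical bookkeeping for Cases A-D; not part of the original analysis.
GNATHOSTOME_CLUSTERS = 4  # four-cluster assumption at the gnathostome base

# Hox clusters in the common ancestor of lampreys and gnathostomes, per case.
ANCESTRAL_CLUSTERS = {"A": 1, "B": 2, "C": 3, "D": 4}


def count_events(ancestral, lamprey_clusters):
    """Return (duplications, losses), assuming one cluster per duplication."""
    shared = ancestral - 1                          # before the lineages split
    gnathostome = GNATHOSTOME_CLUSTERS - ancestral  # basal gnathostome lineage
    change = lamprey_clusters - ancestral           # lamprey lineage
    lamprey_dups = max(change, 0)
    lamprey_losses = max(-change, 0)
    return shared + gnathostome + lamprey_dups, lamprey_losses


def tally(lamprey_clusters):
    """Map each case to its (duplications, losses) for a given lamprey count."""
    return {
        case: count_events(n, lamprey_clusters)
        for case, n in ANCESTRAL_CLUSTERS.items()
    }


for n_lamprey in (3, 4):
    print(n_lamprey, tally(n_lamprey))
```

With four lamprey clusters, this tally gives 6, 5, 4, and 3 duplications for Cases A through D, respectively. With three lamprey clusters, Case D requires three duplications plus one cluster loss in the lamprey lineage.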
Note that numerous other possibilities also exist if cluster losses have occurred in one or more
lineages. In fact, preliminary PCR survey data
suggest that four clusters may exist in hagfishes
(Bailey and Wagner, unpublished data), supporting the notion that the ancestral agnathan had
four clusters. If P. marinus has only three clusters, Case D could still be the correct duplication
history, with a cluster loss in the lamprey lineage
(Sharman and Holland, ’98). The difficulty of distinguishing between these hypotheses illustrates
the need for both complete mapping and
extensive sequence data for use in phylogenetic
analysis. As these data become available, a clearer
picture of the lamprey Hox clusters will emerge,
and tests of the above hypotheses will be possible.
This work, in turn, will enable the reconstruction
of the Hox cluster complement at the origin of
the vertebrate radiation.
Fig. 7. Four hypotheses of vertebrate Hox cluster duplication. Numbers of Hox clusters at ancestral nodes are shown
in ovals. Bars represent cluster duplication events, with
dashed bars representing an additional duplication for the
case of four Hox clusters in the lamprey. Hox cluster complements are indicated at termini of branches, with Hox clusters independently duplicated within a branch shown dashed.
See text for further base assumptions.
Hox cluster number and morphological complexity
It has been proposed that the increasing complexity of vertebrate body plans over evolutionary time might be related to expansions in the
number of Hox clusters (Kappen et al., ’89). The
fact that an agnathan vertebrate has at least
three clusters suggests that the cluster duplications occurred at the very base of the vertebrate
radiation and long preceded increases in axial
complexity (Holland and Garcia-Fernandez, ’96b;
Ruddle et al., ’99). However, although lampreys
and hagfishes have relatively simple axial body
plans, extinct groups of agnathans, such as
cephalaspids, had considerable axial complexity
(Forey and Janvier, ’93). If three or even four Hox
clusters existed in agnathan groups, Hox gene and
cluster number may not have increased gradually alongside cladogenesis in the vertebrates. Instead, the cluster duplications
may have occurred early in the chordate radiation,
providing the permissive condition for evolution of the basic vertebrate body plan and sufficing for the evolution of morphological complexity within
agnathans.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the technical assistance of Kimberly Hartwell and Stephanie
Atiyeh.
LITERATURE CITED
Amemiya CT, Ota T, Litman GW. 1996. Construction of P1
artificial chromosome (PAC) libraries from lower vertebrates. Nonmammalian genomic analysis: a practical guide.
San Diego: Academic Press.
Amores A, Force A, Yan YL, Joly L, Amemiya C, Fritz A, Ho
RK, Langeland J, Prince V, Wang YL, Westerfield M, Ekker
M, Postlethwait JH. 1998. Zebrafish hox clusters and vertebrate genome evolution. Science 282:1711–1714.
Aparicio S, Hawker K, Cottage A, Mikawa Y, Zuo L,
Venkatesh B, Chen E, Krumlauf R, Brenner S. 1997. Organization of the Fugu rubripes Hox clusters: evidence for
continuing evolution of vertebrate Hox complexes. Nature
Genet 16:79–83.
Bailey WJ, Kim J, Wagner GP, Ruddle FH. 1997. Phylogenetic reconstruction of vertebrate Hox cluster duplications.
Mol Biol Evol 14:843–853.
Bartels JL, Murtha M, Ruddle FH. 1993. Multiple Hox/
HOM-class homeoboxes in Platyhelminthes. Mol Phyl
Evol 2:143–151.
DeRobertis EM. 1994. The homeobox in cell differentiation
and evolution. In: Duboule D, editor. Guidebook to the
homeobox genes. Oxford: Sambrook & Tooze. p 13–23.
Felsenstein J. 1995. PHYLIP (Phylogeny Inference Package),
computer program distributed by the author. Version 3.57c.
Seattle: Department of Genetics, Univ. of Washington.
Ferrier DEK, Minguillon C, Holland PWH, Garcia-Fernandez
J. 2000. The amphioxus Hox cluster: deuterostome posterior flexibility and Hox14. Evol Dev 2:284–293.
Forey P, Janvier P. 1993. Agnathans and the origin of jawed
vertebrates. Nature 361:129–134.
Garcia-Fernandez J, Holland PWH. 1994. Archetypal organization of the amphioxus Hox gene cluster. Nature 370:
563–566.
Gehring WJ. 1994. A history of the homeobox. In: Duboule
D, editor. Guidebook to the homeobox genes. Oxford:
Sambrook & Tooze. p 3–10.
Haerry TE, Gehring WJ. 1996. Intron of the mouse Hoxa-7
gene contains conserved homeodomain binding sites that
can function as an enhancer element in Drosophila. Proc
Natl Acad Sci USA 93:13884–13889.
Haerry TE, Gehring WJ. 1997. A conserved cluster of
homeodomain binding sites in the mouse Hoxa-4 intron
functions in Drosophila embryos as an enhancer that is
directly regulated by Ultrabithorax. Dev Biol 186:1–15.
Hardison RC. 2000. Conserved noncoding sequences are reliable guides to regulatory elements. Tr Genet 16:369–372.
Holland PW, Garcia-Fernandez J. 1996a. Hox genes and
chordate evolution. Dev Biol 173:382–395.
Holland PWH, Garcia-Fernandez J. 1996b. Hox genes and
chordate evolution. Dev Biol 173:382–395.
Kappen C, Schughart K, Ruddle FH. 1989. Two steps in the
evolution of antennapedia-class vertebrate homeobox
genes. Proc Natl Acad Sci USA 86:5459–5463.
Kim C, Amemiya C, Bailey W, Kawasaki K, Mezey J, Miller
W, Minoshima S, Shimizu N, Wagner G, Ruddle F. 2000.
Hox cluster genomics in the horn shark, Heterodontus
francisci. Proc Natl Acad Sci USA 97:1655–1660.
Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller
W, Rubin EM, Frazer KA. 2000. Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science 288:136–140.
Mallatt J, Sullivan J. 1998. 28S and 18S rDNA sequences
support the monophyly of lampreys and hagfishes. Mol Biol
Evol 15:1706–1718.
Misof BY, Wagner GP. 1996. Evidence for four Hox clusters
in the killifish Fundulus heteroclitus (Teleostei). Mol Phyl
Evol 5:309–322.
Pendleton JW, Nagai BK, Murtha MT, Ruddle FH. 1993.
Expansion of the Hox gene family and the evolution of
chordates. Proc Natl Acad Sci USA 90:6300–6304.
Rasmussen A-S, Janke A, Arnason U. 1998. The mitochondrial DNA molecule of the hagfish (Myxine glutinosa) and
vertebrate phylogeny. J Mol Evol 46:382–388.
Ruddle FH, Carr JL, Kim C-B, Ledje C, Shashikant CS,
Wagner G. 1999. Evolution of chordate Hox gene clusters.
Annals NYAS 870:238–248.
Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck
J, Gibbs R, Hardison R, Miller W. 2000. PipMaker — a
web server for aligning two genomic DNA sequences. Genome Res 10:577–586.
Sharkey M, Graba Y, Scott MP. 1997. Hox genes in evolution: protein surfaces and paralog groups. Tr Genet 13:
145–151.
Sharman AC, Holland PWH. 1998. Estimation of Hox gene
cluster number in lampreys. Int J Dev Biol 42:617–620.
Snell EA, Scemama J-L, Stellwag EJ. 1999. Genomic organization of the Hoxa4-Hoxa10 region from Morone saxatilis:
implications for Hox gene evolution among vertebrates. J
Exp Zool (Mol Dev Evol) 285:41–49.
Stock DW, Whitt GS. 1992. Evidence from 18S ribosomal
RNA sequences that lampreys and hagfishes form a natural group. Science 257:787–789.