Comparison of Bombyx mori and Helicoverpa armigera Cytoplasmic Actin
Genes Provides Clues to the Evolution of Actin Genes in Insects
Alain Mangé and Jean-Claude Prudhomme
Centre de Génétique Moléculaire et Cellulaire, Université Claude Bernard Lyon I, Centre National de la Recherche
Scientifique, Villeurbanne, France
The cytoplasmic actin genes BmA3 and BmA4 of Bombyx mori were found clustered in a single genomic clone in
the same orientation. As a similar clustering of the two cytoplasmic actin genes HaA3a and HaA3b also occurs in
another lepidopteran, Helicoverpa armigera, we analyzed the sequence of the pair of genes from each species. Due
to the high conservation of cytoplasmic actins, the coding sequence of the four genes was easily aligned, allowing
the detection of similarities in noncoding exon and intron sequences as well as in flanking sequences. All four genes
exhibited a conserved intron inserted in codon 117, an original position not encountered in other species. It can
thus be postulated that all of these genes derived from a common ancestral gene carrying this intron after a single
event of insertion. The comparison of the four genes revealed that the genes of B. mori and H. armigera are related
in two different ways: the coding sequence and the intron that interrupts it are more similar between paralogous
genes within each species than between orthologous genes of the two species. In contrast, the other (noncoding)
regions exhibited the greatest similarity between a gene of one species and a gene of the other species, defining
two pairs of orthologous genes, BmA3 and HaA3a on one hand and BmA4 and HaA3b on the other. However, in
each species, the very high similarities of the coding sequence and of the single intron that interrupts it strongly
suggest that gene conversion events have homogenized this part of the sequence. As the divergence of the B. mori
genes was higher than that of the H. armigera genes, we postulated that the gene conversion occurred earlier in
the B. mori lineage. This leads us to hypothesize that gene conversion could also be responsible for the original
transfer of the common intron to the second gene copy before the divergence of the B. mori and H. armigera
lineages.
Introduction
The actin genes constitute a family that differentiated during evolution into genes coding for muscle actins and genes coding for cytoplasmic (or cytoskeletal)
actins. All of these genes are highly similar and are supposed to derive from an ancestral gene through rounds
of duplication. In all Metazoa studied so far, at least two
genes encoding cytoplasmic actin are observed, suggesting the occurrence of a very ancient duplication, as
well as the involvement of some unknown mechanism
to maintain multicopy genes. In insects, unlike in some
other invertebrates such as echinoderms that have several cytoplasmic actin genes (Kissinger, Hahn, and Raff
1997), only two cytoplasmic actin genes have been detected (Fyrberg et al. 1981; Mounier and Prudhomme
1986; Mangé, Couble, and Prudhomme 1996; Rourke
and East 1997). These cytoplasmic actin genes are clearly distinct from the muscular actin genes, which have
diverged further from the putative ancestral gene (Vandekerckhove and Weber 1984; Mounier et al. 1992). In
fact, the coding sequence of cytoplasmic actin genes is
highly conserved among all species and thus is of little
help in deducing phylogenetic relationships among arthropods.
Key words: actin genes, gene conversion, gene evolution, insect, Bombyx, Helicoverpa.
Address for correspondence and reprints: Jean-Claude Prudhomme, Centre de Génétique Moléculaire et Cellulaire (UMR 5534), Université Claude Bernard Lyon I, Centre National de la Recherche Scientifique, 43, boulevard du 11 Novembre 1918, F. 69622 Villeurbanne cedex, France.
Mol. Biol. Evol. 16(2):165–172. 1999
© 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
The availability of the sequences of the two tandemly repeated cytoplasmic actin genes of the lepidopterans Bombyx mori (Mounier and Prudhomme 1986; Mangé, Couble, and Prudhomme 1996) and Helicoverpa armigera (Rourke and East 1997) prompted us to compare coding and noncoding sequences of the four genes.
In this paper, we report that B. mori and H. armigera
exhibit two pairs of orthologous genes. However, the
detailed comparison of the four genes shows that the
sequence of the coding region and that of the intron that
interrupts it are more similar between paralogous genes
than between orthologous genes. This strongly suggests
that a conversion event occurred not only in H. armigera, as previously proposed (Rourke and East 1997),
but in B. mori as well. Since these sequences are more
similar in H. armigera than in B. mori, one may suppose
that the conversion occurred earlier in the latter species.
Thus, gene conversion could be a frequent event, and
this leads us to hypothesize that it may be responsible
for the transfer of the common intron after a first single
insertion into one copy in an ancestor of both species.
Materials and Methods
Molecular Biology
DNA of a lambda genomic clone carrying BmA3
(Mounier and Prudhomme 1986) was digested with
EcoRI, electrophoresed on 0.8% agarose gel, blotted
onto nylon membranes (Amersham), and hybridized
overnight to a radiolabeled BmA4 cDNA clone (Mangé,
Couble, and Prudhomme 1996) at 65°C in 5× SSC, 1× Denhardt’s solution, 0.1% SDS, and 0.5 mg/ml of
denatured salmon sperm DNA. Blots were washed at
65°C in 0.2% SSC and 0.1% SDS and exposed to Kodak
XAR films at −80°C. DNA probes were 32P-labeled by
using a random-primer DNA-labeling kit from Boehringer Mannheim. Sequencing was carried out using the
T7 sequencing kit of Sigma.
FIG. 1.—Schematic diagram of the orthologous BmA4/HaA3b and BmA3/HaA3a cytoplasmic actin genes. Exons and introns are numbered
using BmA4 for reference. The filled boxes in E3 and E4 represent the coding sequences. For HaA3b, dashed boxes or lines indicate putative
exons and introns. Numbers in brackets indicate the lengths of sequences (not drawn to scale). Vertical bars show the limits of the sequenced
regions. The position of the compared promoters is indicated by p.
Sequence Analysis
The BmA3 and BmA4 cytoplasmic actin genes of
B. mori have been isolated and sequenced in our laboratory (Mounier and Prudhomme 1986; Mangé, Couble,
and Prudhomme 1996). The sequences of HaA3a and
HaA3b cytoplasmic actin genes of H. armigera (Rourke
and East 1997) were obtained from EMBL/GenBank
(accession numbers X97614 and X97615). Nucleotide
and amino acid sequences were analyzed with the
CLUSTAL W program (Thompson, Higgins, and Gibson 1994), with manual improvement.
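As an illustration of how pairwise identities of the kind reported in table 1 can be derived from an alignment, the following minimal Python sketch counts identical positions between two aligned sequences. The function name, the decision to skip gap-containing columns, and the toy sequences are assumptions made for illustration; the text does not state how gapped columns were treated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an existing alignment.

    Assumption: columns with a gap in either sequence are skipped,
    so the denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore indel columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from the actin genes).
seq1 = "ATGTGCGACGA-GAA"
seq2 = "ATGTGTGACGATGAA"
print(f"{percent_identity(seq1, seq2):.1f}% identity")
```

Counting gapped columns as mismatches instead would lower the reported identity, which matters most for the intron and UTR comparisons where indels are frequent.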
Results
Southern blots of the genomic clone containing BmA3 digested with EcoRI were probed with the BmA4 cDNA. Due to the high similarity of coding sequences, the two genes cross-hybridized efficiently and revealed the known 8.1-kb-long EcoRI fragment which contains BmA3. Interestingly, the probe also hybridized to the 1.3-kb-long distal EcoRI fragment. This suggested that the BmA3 genomic clone could contain another actin-coding sequence. Restriction enzyme mapping established that the two clones share common restriction sites and overlapped for more than 10 kb. To identify the hybridizing sequence, the 1.3-kb-long fragment from the BmA3 genomic clone was subcloned and sequenced. The 5′ nontranslated sequence, exon 2, and intron 2 of BmA4 were all identified in this fragment, showing that the two cytoplasmic actin genes are clustered in the same orientation and are separated by around 11 kb.
Due to the high stability of actin sequences, it was easy to unambiguously align the coding sequences of the four genes BmA3, BmA4, HaA3a, and HaA3b and to identify similarities in introns and flanking sequences (fig. 1). This showed that the genes of B. mori and H. armigera are related in two different ways: the coding sequence and the intron that interrupts it are more similar between paralogous genes within each species than between orthologous genes of the two species. In contrast, the other (noncoding) regions exhibited the greatest similarity between a gene of one species and a gene of the other species, defining two pairs of orthologous genes, BmA3 and HaA3a on one hand and BmA4 and HaA3b on the other. This observation suggests that a gene conversion event occurred not only in the H. armigera lineage, as previously proposed (Rourke and East 1997), but also in the B. mori lineage. The analysis described below aimed to compare in detail the nucleotide sequences of the four genes under consideration.
The Coding Sequence
In all four of the genes, the coding sequence is
entirely contained in exons E3 and E4, which additionally carry noncoding 3′ and 5′ sequences. These exons
are separated by the intron I3 inserted into codon 117
(fig. 1). Table 1 shows pairwise comparisons of the coding sequences of the BmA3, BmA4, HaA3a, and HaA3b
genes. These data confirm that cytoplasmic actins are
extremely conserved within Lepidoptera, with more
than 90% similarity at the nucleotide level and more
than 99% similarity at the amino acid level (table 1, see
total coding region).
Exon E3 contains 116 codons, of which 71 (61.2%)
are identical in the four genes. The 45 other codons
exhibited only silent changes in the third positions (and
a single silent substitution in the first position). Intraspecies comparison shows no difference between the
two H. armigera genes and only five nucleotide substitutions (1.4%) between the two B. mori genes. However,
interspecies pairwise comparisons show more than 40
nucleotide substitutions (11.4% and 12.3%; table 1, see
E3). None of these substitutions alter the protein sequence.
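The relation between substitution counts and the percentages quoted above is simple arithmetic over the 350-nt E3 coding segment. The short Python sketch below reproduces it; the function names are hypothetical and only the figures stated in the text are used.

```python
E3_LENGTH_NT = 350  # length of the E3 coding segment, as given in table 1


def percent_divergence(substitutions: int, length_nt: int) -> float:
    """Percentage of positions that differ, given a substitution count."""
    return 100.0 * substitutions / length_nt


def substitutions_from_divergence(divergence_percent: float, length_nt: int) -> int:
    """Nearest whole number of substitutions implied by a divergence percentage."""
    return round(divergence_percent * length_nt / 100.0)


# Five substitutions between the two B. mori genes in E3.
print(f"{percent_divergence(5, E3_LENGTH_NT):.1f}%")

# Interspecies divergences of 11.4% and 12.3% expressed as substitution counts.
for divergence in (11.4, 12.3):
    print(divergence, substitutions_from_divergence(divergence, E3_LENGTH_NT))
```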
Table 1
Pairwise Comparison of Bombyx mori and Helicoverpa armigera Cytoplasmic Actin Genes
PROMOTER (a)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       46.3    72.0    ND
BmA4     36.8    —       41.2    ND
HaA3a    43.9    35.9    —       ND
HaA3b    ND      ND      ND      —
5′ UTR (E2) (b)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       53.3    63.4    44.9
BmA4     —       —       50.9    83.7
HaA3a    —       —       —       43.7
HaA3b    —       —       —       —
E3 (350 nt) (c)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       98.6    88.6    88.6
BmA4     100     —       87.7    87.7
HaA3a    100     100     —       100
HaA3b    100     100     100     —
E4 (781 nt) (c)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       97.4    91.4    91.8
BmA4     99.2    —       91.2    92.3
HaA3a    99.6    99.6    —       97.7
HaA3b    99.2    100     99.6    —
TOTAL CODING REGION (c)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       97.8    90.5    90.8
BmA4     99.5    —       90.1    90.9
HaA3a    99.7    99.7    —       98.4
HaA3b    99.5    100     99.7    —
I3 (d)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       90.2    53.7    51.6
BmA4     —       —       52.6    50.5
HaA3a    —       —       —       96.2
HaA3b    —       —       —       —
3′ UTR (e)
         BmA3    BmA4    HaA3a   HaA3b
BmA3     —       43.1    77.4    42.1
BmA4     —       —       43.5    81.0
HaA3a    —       —       —       41.6
HaA3b    —       —       —       —
(a) Percentages of nucleotide identity of the promoter region. Above diagonal: nucleotide identity in the proximal 211 nt. Below diagonal: nucleotide identity in the more upstream 331 nt.
(b) Percentages of nucleotide identity of the 5′ UTR (115, 86, 80, 77 nt).
(c) Percentages of nucleotide (above the diagonal) and of amino acid identity (below the diagonal) of the coding region.
(d) Percentages of identity of the intron I3 (92 nt).
(e) Percentages of identity of the 3′ UTR (148 nt).
Exon E4 contains 260 codons, of which 184
(70.8%) are identical in the four genes. Two codons differed in the three bases, resulting in the presence of
Arg320 in BmA3 compared with Ala320 in the three
other genes, and of Asn273 in the orthologous HaA3a
and BmA3 genes compared with Cys273 in HaA3b and
BmA4. All the other differences concerned silent first
codon positions (codons 255 and 300) and silent third
positions. For exon E4, as for exon E3, there are more differences between orthologous genes, with more than 60 nucleotide substitutions (7.7%–8.8%; table 1, see E4), than between paralogous genes, with 20 and 18 nucleotide substitutions in B. mori and H. armigera genes, respectively (2.6% and 2.3%; table 1, see E4).
The differences between paralogous genes were not uniformly distributed along the sequence. Thus, in H. armigera, there was not a single nucleotide replacement within the first 252 codons, whereas many differences were located in the 3′ part of the sequence (8 of the 18 nucleotide substitutions occur between codons 253 and 311, and the other 10 occur between codons 343 and 371). In B. mori, the distribution of variable codons was more symmetrical, but it is noticeable that there were only 5 nucleotide substitutions among the 180 first codons and 20 among the other 195. For comparison, Drosophila melanogaster has 159 codons (42.4%) that are different between the two cytoplasmic actin genes Dm42A and Dm5C, and their distribution along the sequence is homogeneous (Fyrberg et al. 1981). This suggests that the postulated gene conversion events were basically 5′-polar phenomena.
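The uneven distribution described above can be made explicit by normalizing the substitution counts to the length of each segment. The following Python sketch does this with the counts and codon ranges quoted in the text; the helper name and the "per 100 codons" unit are choices made here for illustration.

```python
def per_100_codons(substitutions: int, n_codons: int) -> float:
    """Nucleotide substitutions normalized to a window of 100 codons."""
    return 100.0 * substitutions / n_codons


# (substitutions, number of codons) for each segment quoted in the text.
segments = {
    "B. mori, first 180 codons": (5, 180),
    "B. mori, other 195 codons": (20, 195),
    "H. armigera, codons 1-252": (0, 252),
    "H. armigera, codons 253-311": (8, 311 - 253 + 1),
    "H. armigera, codons 343-371": (10, 371 - 343 + 1),
}

for label, (subs, n_codons) in segments.items():
    density = per_100_codons(subs, n_codons)
    print(f"{label}: {density:.1f} substitutions per 100 codons")
```

In both species the density rises toward the 3′ end of the coding sequence, which is the pattern interpreted in the text as a polarity of the conversion events.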
Intron I3
The four cytoplasmic actin genes of B. mori and
H. armigera are interrupted by a conserved intron within
codon 117 (I3, fig. 1), an original position not encountered in other species. Interestingly, this intron is remarkably similar between paralogous genes, which exhibited 90% and 96% identity in B. mori and H. armigera, respectively (table 1). It exhibits five short insertions/deletions (indels), accounting for a total of 11
bases. All of these indels (three of one base, two of two
bases, and one of five bases) characterize the orthologous genes, but none occur in paralogous genes. These
indels add eight bases (three A’s, three T’s, one G, and
one C) to the B. mori intron and four bases (three C’s
and one G) to H. armigera, increasing the difference of
the GC content in H. armigera versus B. mori. In fact,
the GC content was high, around 50% in the B. mori
introns and 61% in the H. armigera introns. In D. melanogaster, for comparison, the mean GC content of a
large sample of introns is 37% (Shields et al. 1988).
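GC content, as compared above between the B. mori and H. armigera introns, is the fraction of G and C among the bases of a sequence. A minimal Python sketch follows; the function name and the toy sequence are hypothetical, and ignoring characters other than A, C, G and T (such as alignment gaps) is an assumption.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence.

    Assumption: characters other than A, C, G, T (gaps, N) are ignored.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence (hypothetical, not one of the I3 introns).
print(f"{gc_content('GCGCATGCCGTA'):.1f}% GC")
```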
5′ Untranslated Region
In the four genes, the 5′ untranslated exon region (UTR) is composed of the exon E2 and of an 11-base-long sequence at the 5′ end of E3 (fig. 1). BmA4 contains another, more distal, exon (E1), but the corresponding part in H. armigera has not been sequenced. The comparison of the 5′ UTR sequences showed a significant similarity between orthologous genes, with 63.4%
and 83.7% identical nucleotides for HaA3a/BmA3 and
HaA3b/BmA4, respectively (table 1).
Intron I2
The 5′ UTR is interrupted by an intron (I2), like
many other actin genes. The I2 sequences of the four
genes were compared, but it was not possible to align
the corresponding sequences of either the paralogous
genes or the orthologous BmA3/HaA3a. In contrast, intron I2 of the orthologous BmA4 and HaA3b were
aligned and showed 53.3% identity. Interestingly, they
exhibited conserved blocks of sequences located in
identical positions (fig. 2). It is thus highly probable that
this intron was present in the ancestral species. The
functional meaning of the conserved sequences is unknown, but one of them contains the motif
CCTTATTTGG, which fits the CArG consensus of the
vertebrate serum response element (Treisman 1985).
This CArG box is located within an intron, in a manner similar to that of the vertebrate beta-actin gene, for which the intronic element has been proven to be functional (Kawamoto et al. 1988; Ng et al. 1989).
FIG. 2.—Alignment of intron I2 sequences of BmA4 and HaA3b. Vertical bars show identical nucleotides; blocks of homologies are boxed in; gaps are indicated by dashes. The sequence of a putative CArG element is indicated.
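The statement that CCTTATTTGG fits the CArG consensus can be checked with a simple pattern match. The Python sketch below assumes the commonly cited form of the consensus, CC followed by six A or T bases and then GG; the function name and the flanking bases in the second example are hypothetical.

```python
import re

# Assumed consensus for the CArG box: CC, six A/T bases, GG.
CARG_PATTERN = re.compile(r"CC[AT]{6}GG")


def find_carg_boxes(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched motif) for each CArG-like site."""
    return [
        (match.start(), match.group())
        for match in CARG_PATTERN.finditer(sequence.upper())
    ]


motif = "CCTTATTTGG"  # motif reported in intron I2
print(find_carg_boxes(motif))

# The same motif embedded in hypothetical flanking bases.
flanked = "gatc" + motif.lower() + "ttag"
print(find_carg_boxes(flanked))
```

A match of this kind only shows agreement with the consensus; it says nothing about whether the site is functional.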
3′ Untranslated Region
Alignment of the 3′ UTR sequences showed that the high similarity between the paralogous genes decreases immediately downstream of the stop codon (fig. 3), suggesting that the 3′ UTR was not involved in the gene conversion postulated above. Pairwise comparisons showed that BmA3 and HaA3a on one hand and BmA4 and HaA3b on the other are very similar until the end of the available sequences, in fact defining the orthologous gene pairs (fig. 3). In the region that is sequenced in the four genes (fig. 3 and table 1), the orthologous BmA4 and HaA3b are 81% identical with seven indels (19 nt), whereas BmA3 and HaA3a show 77.4% identical nucleotides and 12 indels (24 nt).
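The way the noncoding similarities define two orthologous pairs can be illustrated with a reciprocal-best-match rule applied to the interspecies 3′ UTR identities listed in table 1. This rule and the function name are assumptions introduced here for illustration; the text describes the pairs by inspection of the alignments, not by this procedure.

```python
# Interspecies 3' UTR identities (%) as listed in table 1.
utr3_identity = {
    ("BmA3", "HaA3a"): 77.4,
    ("BmA3", "HaA3b"): 42.1,
    ("BmA4", "HaA3a"): 43.5,
    ("BmA4", "HaA3b"): 81.0,
}


def reciprocal_best_pairs(identity: dict[tuple[str, str], float]) -> list[tuple[str, str]]:
    """Pairs in which each gene is the other's most similar partner."""
    best_for_bm: dict[str, tuple[str, float]] = {}
    best_for_ha: dict[str, tuple[str, float]] = {}
    for (bm_gene, ha_gene), value in identity.items():
        if value > best_for_bm.get(bm_gene, ("", -1.0))[1]:
            best_for_bm[bm_gene] = (ha_gene, value)
        if value > best_for_ha.get(ha_gene, ("", -1.0))[1]:
            best_for_ha[ha_gene] = (bm_gene, value)
    return sorted(
        (bm_gene, ha_gene)
        for bm_gene, (ha_gene, _) in best_for_bm.items()
        if best_for_ha[ha_gene][0] == bm_gene
    )


print(reciprocal_best_pairs(utr3_identity))
```

Applying the same rule to the coding-region identities would not be informative, since the text attributes the paralog similarity there to gene conversion.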
The 3′ UTR sequence can be divided into two
parts: in the upstream area, the four genes could be
aligned and exhibited common sequences, whereas more
downstream, only orthologous genes were alignable,
since BmA4 and HaA3b are very GC-rich and BmA3 and
HaA3a are very AT-rich (fig. 3). The boundary between
the two parts may be the trace of the initial duplication
or that of an ancient gene conversion which occurred
before the splitting of the two lineages.
Promoter Region
The promoter of BmA3 has been extensively studied, and positive (TATA box, ActE1, CArG elements)
and negative (RA3 element) regulatory sequences have
been described (Mangé et al. 1997). The putative promoter region of HaA3a was aligned with the promoter
region of BmA3 and with the I1 of BmA4, which contains the proximal promoter (Mangé, Couble, and Prudhomme 1996). The corresponding sequences of HaA3b
are not available.
The alignment of the 5′ regions flanking BmA3 and HaA3a shows a similarity as high as 72% for the first 211 nucleotides (fig. 4A and table 1). This extends the region of orthology between BmA3 and HaA3a far upstream of the coding sequence. An alignment was found between the BmA4 proximal promoter and both the orthologous BmA3 and HaA3a promoters, but with a lower similarity (fig. 4A and table 1). No alignment was possible with the distal promoter of BmA4, which thus seems unrelated. Since the 5′ cloned region of HaA3b is truncated inside exon 2, the promoter region is not available, but the overall homology between HaA3b and BmA4 led us to predict their similarity in the flanking 5′ DNA (see fig. 1).
The functional regulatory elements of the BmA4
and HaA3a promoters are not known. However, sequences similar to several elements which regulate
BmA3 transcription were found upstream of these promoters in the same relative positions. Both BmA4 and
HaA3a carry the CArG motif (fig. 4A), which is also
involved in the regulation of vertebrate cytoplasmic actin genes (Mohun, Garrett, and Treisman 1987; Danilition et al. 1991). In BmA3, the CArG box enhances a
low transcription level that depends on the ActE1 element and TATA box. Sequences very closely related to
ActE1 and TATA box are also found upstream of BmA4
and HaA3a. In the BmA3 promoter, all these positive elements are downregulated by a strong inhibitory domain
called RA3 which contains three subparts, each of which
contributes to the negative effect. Two of them are found
within the HaA3a promoter in the same relative position
as that of the BmA3 promoter (fig. 4A), but the BmA4
proximal promoter appears completely devoid of RA3.
More upstream of these sequences, the similarity between the three promoters decreases (table 1). However, the orthologous genes remain easily alignable and contain a highly conserved 56-base-long sequence (fig. 4B) that could activate the BmA3 promoter (unpublished data).
These observations show that the positive functional elements that have been inherited in orthologous genes are also present in paralogous genes, suggesting that they were present in the ancestral gene and have been targets of selection. The presence of the negative element RA3 in both B. mori and H. armigera predicts that it should regulate cytoplasmic actin gene promoters in other species.
FIG. 3.—Comparison of the 3′ UTR of B. mori and H. armigera cytoplasmic actin genes. Alignment of the sequences starts at the translation termination codon (TAA). Gaps are indicated by dashes; vertical bars show identical nucleotides between the four genes; small letters indicate the nucleotides which differ between the orthologous genes; the polyadenylation signal of BmA3 is underlined; arrows show the limit between the two parts of the 3′ UTR (see text).
FIG. 4.—Comparison of the 5′ flanking sequence of B. mori and H. armigera cytoplasmic actin genes. Gaps are indicated by dashes; vertical bars show identical nucleotides in the three genes; small letters indicate nucleotides which differ between the two orthologous genes. A, Alignment of the sequence flanking the three genes (until the transcription start site). Motifs with homology to the TATA, ActE1, CArG, and RA3 sites of BmA3 are indicated. B, Upstream conserved sequence of BmA3 and HaA3a between positions −415 and −360 and between positions −460 and −404, respectively.
Discussion
The cytoplasmic actin genes have been fully described for only three insects, D. melanogaster, B. mori,
and H. armigera. In each species, two genes are present,
but these genes encode almost identical proteins. Moreover, it is highly probable that in each species one gene
has two promoters (the comparison between BmA4 and
HaA3b suggests this conclusion for HaA3b) and the second gene has only a single promoter. The biological significance of this observation remains to be deciphered,
but it suggests that the regulation of transcription is an
important factor favoring the persistence of duplicate
genes. A comparison with cytoplasmic actin genes of
other orders of insects is needed to specify how ancestral
this organization is.
Our data show that BmA3 and BmA4 are clustered,
as are the two genes HaA3a and HaA3b of H. armigera
(Rourke and East 1997). Our sequence analysis shows
that BmA3 and HaA3a are orthologous, as are BmA4 and
HaA3b. This conclusion is based on the high similarity
of the noncoding sequences present in the 5′ and 3′ parts
of the genes. In both species, one gene copy is followed
in the 3′ part by GC-rich DNA, the other by AT-rich
DNA. Since in both species the two genes are linked in
the same orientation and distant by only a few kilobases,
it is highly probable that we are dealing with the same
single chromosomal locus that most probably was the
site of the initial duplication of the ancestral gene. In D.
melanogaster, the two cytoplasmic genes Dm5C and
Dm42A are not linked and show a higher divergence
(Fyrberg et al. 1981). Their relationships to the lepidopteran genes remain unknown, since only the highly conserved actin-coding sequences can be aligned. Moreover, the comparison of the noncoding regions of the
cytoplasmic actin gene Agact1D of the mosquito Anopheles gambiae with those of DmAct5C shows a highly
divergent evolution of these sequences within Diptera
(Salazar et al. 1994).
It is noteworthy that the four genes of the two lepidopterans exhibit a similar intron interrupting the coding sequence in codon 117, a position not encountered
in any other species (Weber and Kabsch 1994). This
suggests that the common ancestral species carried two
genes built on patterns similar to those of BmA3/HaA3a
and BmA4/HaA3b, both bearing an intron in codon 117.
Otherwise, it must be accepted that a similar intron inserted independently in the same codon in both B. mori
and H. armigera lineages, which is highly improbable.
In D. melanogaster, no intron occupies this position, which suggests that the insertion occurred in the lepidopteran lineage after the divergence between Lepidoptera and Diptera. Alternatively, the intron may have been lost independently twice in the D. melanogaster lineage.
To explain the very high similarity of the sequences
of this intron and of the coding sequences in the paralogous
genes of each species, we suppose that gene conversion
events occurred and homogenized these sequences in
both species. Under this assumption, the genes BmA4
and HaA3b on one hand and BmA3 and HaA3a on the
other have diverged since the speciation event that separated the B. mori and H. armigera lineages, but the
converted parts (the coding region and intron I3) of
the paralogous genes have diverged in each species only
since the last conversion event. It is thus possible to
compare the extent of nucleotide replacements in the
different parts of the converted region. In intron I3,
9.8% of the nucleotides of BmA3 and BmA4 show a
difference, but only 3.6% of them vary between HaA3a
and HaA3b. Accepting that the rate of nucleotide replacement is the same in both species would lead to the
conclusion that the homogenization of the sequence occurred 2.7 times earlier in B. mori than in H. armigera.
The coding sequence within exon E3 exhibits no difference between the two genes of H. armigera and a level
of replacement of silent third position of codons in B.
mori of only 4.6%. This is consistent with the conversion event being more ancient in the B. mori lineage
and, moreover, suggests that silent nucleotide replacement is less frequent in coding sequences than in introns.
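The factor of 2.7 quoted above follows directly from the two intron I3 divergences once equal replacement rates are assumed in both lineages. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the proportionality between divergence and elapsed time is the assumption stated in the text.

```python
def relative_age(divergence_a: float, divergence_b: float) -> float:
    """Ratio of elapsed times since homogenization, assuming equal rates.

    With the same replacement rate in both lineages, divergence is taken
    to be proportional to the time since the last conversion event.
    """
    if divergence_b <= 0:
        raise ValueError("ratio is undefined when the second divergence is zero")
    return divergence_a / divergence_b


bm_i3_divergence = 9.8  # % differences between BmA3 and BmA4 in intron I3
ha_i3_divergence = 3.6  # % differences between HaA3a and HaA3b in intron I3

print(f"{relative_age(bm_i3_divergence, ha_i3_divergence):.1f}")
```

The guard against a zero denominator matters for exon E3, where the two H. armigera genes show no difference and no ratio can be formed.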
In H. armigera, nucleotide replacements are obviously not homogeneously distributed over the sequence:
none occurred until codon 252, 8 are present between
codons 253 and 311, and 10 affect the codons between
codon 343 and the last codon, codon 376. This polarity
suggests that the last conversion event was limited to
the 5′ part of the sequence and did not extend beyond
codon 252. It also suggests that more ancient conversion
events extended on the 3′ side. Similar deductions can
be made from the sequences of B. mori genes. Such a
polarity suggests that a hot spot of conversion is present
at the 5′ end of exon E2, but we were not able to find
any special DNA structure in this region. It is apparent
that homogenization of the sequences by gene conversion is thus restricted to the coding sequences and the
interrupting intron and does not extend into intron I2 on
the 5′ side nor into the 3′ UTR. This situation, which
has been described for other tandemly repeated genes
(Popadic and Anderson 1995), suggests that the high
sequence similarity required for gene conversion is provided here by the conserved actin-coding sequence.
Gene conversion between duplicated copies has
been described for actin genes of various other species
(Crain et al. 1987; Moniz de Sa and Drouin 1996; Kissinger, Hahn, and Raff 1997), as well as for other gene
families (Shyue et al. 1994; Popadic and Anderson
1995; Zhou and Li 1996), suggesting that concerted evolution is a common phenomenon. Thus, it seems more
parsimonious to suppose that the intron inserted once in
codon 117 in a common ancestor of B. mori and H.
armigera and was then transmitted to the paralogous
gene by gene conversion. Otherwise, we have to admit
that an intron-carrying ancestral gene was duplicated,
giving rise, at least transitorily, to three copies of cytoplasmic actin genes. However, no pseudogene- or actin-related sequence sustaining this hypothesis has been observed. Comparative analysis of lepidopterans from different families is needed to estimate when this insertion occurred during evolution.
FIG. 5.—Model for the evolution of the cytoplasmic actin genes in insects. A pathway is presented for the origin of existing genes from a single putative ancestral gene. Structural homologies between genes of B. mori and H. armigera are indicated.
Figure 5 summarizes our observations and suggests
a testable hypothesis about the events leading to the present cytoplasmic actin genes of B. mori and H. armigera. As no metazoan has a single cytoplasmic actin
gene, we first postulate that the ancestor of Diptera and
Lepidoptera had two gene copies. We also assume that
intron I3 was inserted after the divergence between Lepidoptera and Diptera and was transmitted to the second
copy by gene conversion before the divergence between
Bombyx and Helicoverpa. Then, in both lineages, multiple conversion events homogenized the homologous
coding and intronic sequences. This model can be tested
by comparing the cytoplasmic actin genes of primitive
lepidopterans with those of insects belonging to orders
related to Lepidoptera and Diptera.
Acknowledgments
We thank Pierre Couble for his critical reading of
the manuscript. This work was supported by the Centre
National de la Recherche Scientifique and by the French
Ministère de l’Education Nationale. It was also supported by the European Union (contract CI 1*CT 94
0092).
LITERATURE CITED
CRAIN, W. JR., M. F. BOSHAR, A. D. COOPER, D. S. DURICA,
A. NAGY, and D. STEFFEN. 1987. The sequence of a sea
urchin muscle actin gene suggests conversion with a cytoskeletal actin gene. J. Mol. Evol. 25:37–45.
DANILITION, S. L., R. M. FREDERICKSON, C. Y. TAYLOR, and
N. G. MIYAMOTO. 1991. Transcription factor binding and
spacing constraints in the human beta-actin proximal promoter. Nucleic Acids Res. 19:6913–6922.
FYRBERG, E. A., B. J. BOND, N. D. HERSHEY, K. S. MIXTER,
and N. DAVIDSON. 1981. The actin genes of Drosophila:
protein coding regions are highly conserved but intron positions are not. Cell 24:107–116.
KAWAMOTO, T., K. MAKINO, H. NIWA, H. SUJIYAMA, S. KIMURA, M. AMEMURA, A. NAKATA, and T. KAKUNAGA.
1988. Identification of the human beta actin enhancer and
its binding factor. Mol. Cell. Biol. 8:267–272.
KISSINGER, J. C., J. H. HAHN, and R. A. RAFF. 1997. Rapid
evolution in a conserved gene family. Evolution of the actin
family in the sea urchin genus Heliocidaris and related genera. Mol. Biol. Evol. 14:654–665.
MANGÉ, A., P. COUBLE, and J. C. PRUDHOMME. 1996. Two
alternative promoters drive the expression of the cytoplasmic actin A4 gene of Bombyx mori. Gene 183:191–199.
MANGÉ, A., E. JULIEN, J. C. PRUDHOMME, and P. COUBLE.
1997. A strong inhibitory element down-regulates SRE-stimulated transcription of the A3 cytoplasmic actin gene of
Bombyx mori. J. Mol. Biol. 265:266–274.
MOHUN, T., N. GARRETT, and R. TREISMAN. 1987. Xenopus
cytoskeletal actin and human c-fos gene promoters share
conserved protein-binding site. EMBO J. 6:667–673.
MONIZ DE SA, M., and G. DROUIN. 1996. Phylogeny and substitution rates of angiosperm actin genes. Mol. Biol. Evol.
13:1198–1212.
MOUNIER, N., M. GOUY, D. MOUCHIROUD, and J. C. PRUDHOMME. 1992. Insect muscle actins differ distinctly from
invertebrate and vertebrate cytoplasmic actins. J. Mol. Evol.
34:406–415.
MOUNIER, N., and J. C. PRUDHOMME. 1986. Isolation of actin
genes in Bombyx mori: the coding sequence of a cytoplasmic actin gene expressed in the silk gland is interrupted by
a single intron in an unusual position. Biochimie 68:1053–
1061.
NG, S. Y., P. GUNNING, S. H. LIU, J. LEAVITT, and L. KEDES.
1989. Regulation of the human beta actin promoter by upstream and intron domains. Nucleic Acids Res. 17:601–615.
POPADIC, A., and W. W. ANDERSON. 1995. Evidence for gene
conversion in the amylase multigene family of Drosophila
pseudoobscura. Mol. Biol. Evol. 12:564–572.
ROURKE, I. J., and P. D. EAST. 1997. Evidence for gene conversion between tandemly duplicated cytoplasmic actin
genes of Helicoverpa armigera (Lepidoptera: Noctuidae).
J. Mol. Evol. 44:169–177.
SALAZAR, C. E., D. MILLS HAMM, D. M. WESSON, C. B.
BEARD, V. KUMAR, and F. H. COLLINS. 1994. A cytoskeletal
actin gene in the mosquito Anopheles gambiae. Insect Mol.
Biol. 3:1–13.
SHIELDS, D. C., P. M. SHARP, D. G. HIGGINS, and I. WRIGHT.
1988. Silent sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol.
Evol. 5:704–716.
SHYUE, S. K., L. LI, B. H.-J. CHANG, and W. H. LI. 1994.
Intronic gene conversion in the evolution of human X-linked color vision genes. Mol. Biol. Evol. 11:548–551.
THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON. 1994.
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
TREISMAN, R. 1985. Transient accumulation of c-fos RNA following serum stimulation requires a conserved 5′-end element and c-fos 3′ sequences. Cell 42:889–902.
VANDEKERCKHOVE, J., and K. WEBER. 1984. Chordate muscle
actins differ distinctly from invertebrate muscle actins. The
evolution of the different vertebrate muscle actins. J. Mol.
Biol. 179:391–413.
WEBER, K., and W. KABSCH. 1994. Intron positions in actin
genes seem unrelated to the secondary structure of the protein. EMBO J. 13:1280–1286.
ZHOU, Y. H., and W. H. LI. 1996. Gene conversion and natural
selection in the evolution of X-linked color vision genes in
higher primates. Mol. Biol. Evol. 13:780–783.
STEVE PALUMBI, reviewing editor
Accepted October 6, 1998