Mechanisms Underlying the Evolution and Maintenance of Functionally
Heterogeneous 18S rRNA Genes in Apicomplexans
Alejandro P. Rooney
Microbial Genomics & Bioprocessing Research Unit, National Center for Agricultural Utilization Research, Agricultural Research
Service, U.S. Department of Agriculture, Peoria, Illinois
In many species of the protist phylum Apicomplexa, ribosomal RNA (rRNA) gene copies are structurally and
functionally heterogeneous, owing to distinct requirements for rRNA-expression patterns at different developmental
stages. The genomic mechanisms underlying the maintenance of this system over long-term evolutionary history are
unclear. Therefore, the aim of this study was to investigate what processes underlie the long-term evolution of
apicomplexan 18S genes in representative species. The results show that these genes evolve according to a birth-and-death model under strong purifying selection, thereby explaining how divergent 18S genes are generated over time while
continuing to maintain their ability to produce fully functional rRNAs. In addition, it was found that Cryptosporidium
parvum undergoes a rapid form of birth-and-death evolution that may facilitate host-specific adaptation, including that of
type I and II strains found in humans. This represents the first case in which an rRNA gene family has been found to
evolve under the birth-and-death model.
Introduction
In order to synthesize protein, both eukaryotic and
prokaryotic cells possess ribosomes that consist of protein
as well as RNA molecules. These RNA molecules are
encoded by three to four different ribosomal RNA (rRNA)
genes that are present in variable copy numbers in a given
genome. In bacteria, there are three rRNA genes
(23S, 16S, and 5S), which are arranged in units that are
usually present in 10 copies or fewer and dispersed
throughout the genome (reviewed in Liao 2000). Previous
studies have shown the copies to be highly homogenized at
the nucleotide level, although some exceptions do exist
(Wang, Zhang, and Ramanan 1997; Yap, Zhang, and
Wang 1999). In eukaryotes, three nuclear rRNA genes
(28S, 18S, and 5.8S) are arranged in a unit that occurs as
a tandem repeat, which in most species also contains
a fourth rRNA gene, the 5S gene. However, in some
species, 5S gene copies are dispersed throughout the
genome, as in Schizosaccharomyces pombe (Wood et al.
2002); are organized in one or more tandem arrays separate from the other nuclear rRNA genes, as in soybeans
(Gottlob-McHugh et al. 1990); or show a combination of
both arrangements, as in humans (Little and Braaten 1989).
Regardless of the organizational pattern, rRNA genes
are usually repeated a few to several hundred times and are
presumed by most researchers to be highly homogenized,
owing to their concerted evolution. Under the concerted
evolution model, mutations that arise in one member of
a multigene family can spread to all other members
through some sort of homogenization process other than
purifying selection, such as gene conversion or unequal
crossover (Brown, Wensink, and Jordan 1972; Zimmer
et al. 1980; Dover and Coen 1981; Arnheim 1983; Li 1997;
Ohta 1989, 2000). As a result, the members of the
multigene family do not evolve independently. The notion
that all rRNA genes (with the exception of those in
organellar genomes) evolve in a concerted manner has in
effect become dogma (Dover and Coen 1981; Hillis and
Dixon 1991; Ohta 2000). Nevertheless, there are cases in
which departures from the rRNA concerted evolution
model have been identified. For example, there are two
distinct repeat families of 18S rRNA genes in dugesiid
flatworms (Carranza et al. 1996; Carranza, Baguñà, and
Riutort 1999) and two distinct repeat families of 16S
rRNA genes in certain actinomycetous bacteria (Wang,
Zhang, and Ramanan 1997; Ueda et al. 1999). Similarly,
many species of the microbial eukaryotic phylum
Apicomplexa possess distinct rRNA “types.”
Key words: birth-and-death, purifying selection, multigene family, rRNA.
E-mail: [email protected].
Mol. Biol. Evol. 21(9):1704–1711. 2004
doi:10.1093/molbev/msh178
Advance Access publication June 2, 2004
In apicomplexans, the structure and function of rRNA
genes has been best studied in Plasmodium. Species within
this genus possess functionally distinct rRNA “types”
believed to be maintained in response to developmental
constraints imposed by a multihost life cycle (Gunderson et
al. 1987; McCutchan et al. 1988; Zhu et al. 1990; Rogers et
al. 1996; Li et al. 1997; Gardner et al. 2002; Mercereau-Puijalon, Barale, and Bischoff 2002). Specifically, a certain
rRNA type is expressed only during a particular growth
stage of the organism, whereas another type is expressed at
a different stage (Gunderson et al. 1987; Le Blancq et al.
1997; Mercereau-Puijalon, Barale, and Bischoff 2002). An
important question that remains unanswered is how rRNA
functional heterogeneity evolved and subsequently has
been maintained over millions of years of apicomplexan
evolutionary history. This is an important question to
answer, because the “rule” in all other eukaryotes is that
rRNA genes are homogeneous (Coen, Strachan, and Dover
1982; Nei 1987; Li 1997; Graur and Li 2000). As such, the
exception of apicomplexan rRNA genes to this “rule”
indicates that they must undergo unique mechanisms of
multigene family evolution. The purpose of this study is to
examine these mechanisms in more detail by studying the
evolutionary patterns of 18S rRNA gene diversification in
representative apicomplexan species.
Molecular Biology and Evolution vol. 21 no. 9 © Society for Molecular Biology and Evolution 2004; all rights reserved.
Molecular Evolution of Apicomplexan 18S rRNA Genes 1705
Materials and Methods
Data Analysis
Nucleotide sequences for all five 18S rRNA genes of
Plasmodium falciparum and their 5′ and 3′ flanking
regions were extracted from this species’ complete genome
(Gardner et al. 2002). The 18S gene sequences from other
Plasmodium species (Order Hemosporida) as well as
species in the Order Eimeriida (represented by the genera
Toxoplasma, Cryptosporidium, Neospora, Besnoitia, and
Hammondia) were extracted from GenBank. Sequences
from the Cryptosporidium parvum whole genome shotgun
(Abrahamsen et al. 2004) were also used. The species
names and sequence accession numbers are given in the
resultant trees. Sequences were aligned using the computer
program ClustalX (Thompson et al. 1997) and checked for
errors by visual inspection. The computer program PAUP*
4.0 beta, version 10 (Swofford 2002) was used to reconstruct phylogenetic trees by using the neighbor-joining
(NJ) (Saitou and Nei 1987) and maximum likelihood
(ML) methods. The distances and models used included the Kimura (1980) two-parameter; Hasegawa,
Kishino, and Yano (1985) (HKY); and Tamura and Nei
(1993) models. Maximum likelihood trees were reconstructed using tree-bisection-reconnection (TBR) branch swapping under a full heuristic search in which the starting
tree was obtained by using the NJ method. Statistical
reliability of internal branches was assessed using 1,500
bootstrap replicates with the NJ method and 100 bootstrap
replicates with the ML method. The computer program
TOPALi v 0.18 was used to assess the probability that
nucleotide sequences are subject to recombination. This
program implements the graphical and Bayesian approaches developed by McGuire, Wright, and Prentice
(1997), McGuire and Wright (2000), and Husmeier and
McGuire (2003).
FIG 1.—Expected gene relationships under (A) the concerted
evolution model and (B) the birth-and-death evolution model. Species
are denoted by Greek letters and genes are denoted by numbers. Under
the concerted evolution model, genes cluster according to species;
however, they do not do so under the birth-and-death model except in
cases of recent gene duplication [e.g., the α3 sequences in (B)].
Multigene Family Models
In this study, patterns of apicomplexan 18S gene
evolution are reconciled with either the concerted
evolution (Brown, Wensink, and Jordan 1972; Zimmer
et al. 1980; Arnheim 1983) or birth-and-death evolution
models (Hughes and Nei 1989; Ota and Nei 1994; Nei,
Gu, and Sitnikova 1997; Gu and Nei 1999; Rooney,
Piontkivska, and Nei 2002). In the latter model, gene
duplication gives rise to new genes, some of which persist
in the genome for long periods, whereas others are lost
through deletion events or degenerate into pseudogenes.
Accordingly, multigene family members evolve more or
less independently and do not show high levels of nucleotide sequence homogeneity under this model, except
in the case of recently duplicated genes. Thus, in a
phylogenetic analysis of genes from several closely related
taxa, sequences will not show a within-species clustering
pattern, except in the case of recent gene duplicates (fig. 1).
In contrast, concerted evolution is a form of nonindependent evolution through which a mutation that
arises in one member of a multigene family spreads to all
other members through a process of recombination such as
gene conversion or unequal crossover (Brown, Wensink,
and Jordan 1972; Zimmer et al. 1980; Arnheim 1983). As
a result of concerted evolution, a multigene family will
show the following: (1) high levels of nucleotide sequence
identity between gene copies will be retained within
species; (2) gene copies will diverge in sequence similarity
between species, the degree of which depends on the time
since the species last shared a common ancestor; and (3)
all gene copies from the same species will cluster together
in a phylogenetic tree to the exclusion of genes from
other species (fig. 1). These patterns in part distinguish the
effects of concerted evolution from those of purifying
selection, which also constrains sequence evolution and,
therefore, can serve as a homogenizing force. The primary
distinction between the two is that concerted evolution is
a rapid process that creates species-specific gene clusters,
whereas purifying selection results in the maintenance of
sequence identity beyond speciation events and, therefore,
creates multispecies gene groups. As an aside, purifying
selection typically exerts distinct constraints on synonymous versus nonsynonymous nucleotide sites in protein-coding genes, whereas concerted evolution makes no such
distinction (Rooney, Piontkivska, and Nei 2002).
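The clustering criterion that separates the two models can be sketched in a few lines of Python. This is a minimal illustration and not part of the original analysis: it assumes a rooted gene tree written as nested tuples (the trees in this study are unrooted), and a hypothetical leaf-label convention of the form `species_gene`. The function names and both toy trees are invented for the example. Under concerted evolution every species' gene copies should form an exclusive clade; under birth-and-death evolution they generally should not.

```python
from collections import defaultdict

def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree node."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result

def clades(tree):
    """Return the leaf set of every node, including leaves and the root."""
    found = [frozenset(leaves(tree))]
    if not isinstance(tree, str):
        for child in tree:
            found.extend(clades(child))
    return found

def species_of(label):
    # Hypothetical label convention: "<species>_<gene number>", e.g. "alpha_1".
    return label.rsplit("_", 1)[0]

def clusters_by_species(tree):
    """Map each species to True if all of its gene copies form an exclusive clade."""
    by_species = defaultdict(set)
    for label in leaves(tree):
        by_species[species_of(label)].add(label)
    all_clades = set(clades(tree))
    return {sp: frozenset(genes) in all_clades for sp, genes in by_species.items()}

# Toy trees (illustrative only, not a transcription of figure 1).
concerted = ((("alpha_1", "alpha_2"), "alpha_3"), (("beta_1", "beta_2"), "beta_3"))
birth_death = (("alpha_1", "beta_1"), (("alpha_2", "beta_2"), ("alpha_3", "beta_3")))

print(clusters_by_species(concerted))    # every species forms its own clade
print(clusters_by_species(birth_death))  # no species forms its own clade
```

A within-species cluster alone does not prove concerted evolution, because recently duplicated genes also cluster by species under the birth-and-death model; the check is therefore a first screen, not a test.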
Results
Mode of Multigene Family Evolution
The pattern of 18S gene evolution in apicomplexans
is marked by varying degrees of differentiation between
sequences, which points to a dynamic level of gene duplication, turnover, and maintenance. These patterns are consistent with the birth-and-death model of multigene family
evolution described previously. For example, evidence of
between-species clustering was found for many different
species. In Plasmodium, this pattern is readily apparent
with respect to Plasmodium vivax, Plasmodium cynomolgi, Plasmodium knowlesi, Plasmodium malariae, and
Plasmodium fragile (fig. 2). Furthermore, in many instances the 18S sequences from the same species were
observed to be quite divergent, such as the Plasmodium
berghei sequences that differ from each other by 76 to 154
nucleotides (nucleotide distance [d] = 0.038 to 0.08) (fig.
2). Likewise, some C. parvum sequences are highly divergent, in some cases by as many as 126 nucleotides (d =
0.081) (fig. 3). Unsurprisingly then, the Cryptosporidium
18S genes show a clearly discernible pattern of between-species clustering, as does the sarcocystid apicomplexan
Toxoplasma gondii (fig. 3).
However, there are instances in which sequences
from the same species cluster together or are identical
(figs. 2 and 3). There are three explanations for this: (1) the
sequences represent different genes that are homogenized
through gene conversion; (2) the sequences represent
different genes derived from a recent duplication event, in
which case not enough time has elapsed for nucleotide
differences to have occurred; or (3) the sequences represent allelic copies of the same gene. With regard to
some species, the sequences analyzed may be a mixture of
paralogs and alleles. For example, in the case of P. vivax,
the three large gene clusters correspond to at least three
different paralogs. Yet, there may also be allelic sequences
within some clusters (e.g., multiple “El Salvador” or
sequences within two of the clusters). Thus, it is difficult to
choose any one of the aforementioned explanations to
the complete exclusion of the others without the aid of
genome mapping or sequence data. For this reason, an
analysis of the 18S genes from the completed genome
sequence of P. falciparum is invaluable because we know
the exact number of 18S genes in that genome and their
precise chromosomal locations, thus removing any doubt
that the sequences are distinct genes and not allelic copies
of the same gene. This allows for a thorough examination
of gene divergence and phylogeny for 18S coding regions
and their 5′ and 3′ flanking regions in this species and will
also facilitate analyses designed to detect recombination
owing to gene conversion or unequal crossover.
An examination of Kimura (1980) nucleotide distances
(d) between P. falciparum genes shows that the 18S coding
regions of the chromosome 11 and 13 genes differ by only one
nucleotide (d = 0.001) and that these coding regions in turn
differ from the chromosome 1 coding region by only five
(d = 0.002) and six nucleotides (d = 0.003), respectively.
All three of these coding regions are highly divergent
(between 199 [d = 0.104] and 130 [d = 0.105] nucleotide
differences) from the coding regions of the chromosome
5 and 7 genes, which show 25 nucleotide sequence
differences (d = 0.018) from each other. Thus, certain pairs
of 18S genes appear to be homogenized. The question raised
by this information is whether the apparent homogenization is the result of recombination or if it actually represents
sequence conservation resulting from purifying selection.
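For readers who wish to see how such distances arise, the Kimura (1980) two-parameter distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitional and transversional differences, respectively. The sketch below is only an illustration under stated assumptions: the function name and the example sequences are invented, and sites with gaps or ambiguous bases are simply skipped, which is not necessarily how PAUP* handles them.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Return (number of differences, Kimura two-parameter distance)
    for two aligned DNA sequences of equal length.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, not necessarily what PAUP* does).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent (saturated).
    d = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return transitions + transversions, d

# Invented 10-site example with a single C/T transition at the last site.
n_diff, d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(n_diff, round(d, 4))  # prints: 1 0.1116
```

Because the correction grows with divergence, a small count of differences yields a distance close to the raw proportion of differing sites, whereas highly divergent pairs yield distances noticeably larger than that proportion.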
FIG 2.—Phylogeny of Plasmodium 18S genes. Because all ML and
NJ trees were highly similar, only the NJ tree reconstructed from HKY
distances is shown. Values along branches represent bootstrap percentage
values from 1,500 pseudoreplicates. The geographic origin for the P.
vivax sequences is shown in parentheses. The tree is unrooted.
To answer this question, the 5′ and 3′ flanking
regions from each P. falciparum gene were analyzed. The latter
corresponds to the internal transcribed spacer 1 (ITS1)
region, which lies between the 18S and 5.8S genes. The
level of divergence displayed between the 5′ and 3′
flanking sequences greatly exceeds what is observed
between the 18S coding sequences. The chromosome 11
and 13 sequences differ by 60 nucleotides (d = 0.109) in
the 5′ flanking region and by 3 nucleotides (d = 0.008)
in the 3′ flanking region. These genes differ from the
chromosome 1 gene by 185 (d = 0.427) and 191 (d =
0.438) nucleotides, respectively, in the 5′ flanking region
and by 24 (d = 0.072) and 26 (d = 0.079) nucleotides,
respectively, in the 3′ flanking region. The three genes
differ by an even greater nucleotide distance compared to
those on chromosomes 5 and 7, in which the ranges lie
between 234 (d = 0.624) and 268 (d = 0.712) differences in
the 5′ flanking region and between 53 (d = 0.186) and 62
(d = 0.224) in the 3′ flanking region. The chromosome 5
and 7 genes show the smallest level of divergence in the
coding region, yet they differ by 30 nucleotides (d =
0.052) in the 5′ flanking region and by 16 nucleotides (d =
0.046) in the 3′ flanking region. As a whole, these
divergence data strongly suggest that recombination does
not influence 18S genes in P. falciparum. Otherwise, the
topologies of the trees reconstructed from the flanking and
coding regions (fig. 4) would have been different from one
another owing to the creation of a mosaic of distinct
phylogenetic histories in these different regions. Yet, this
did not occur. Thus, it is no surprise that Bayesian analyses
of recombination conducted using the computer program
TOPALi failed to find evidence in support of recombination on a concatenation of all three regions or when each
region was analyzed separately. In summary, these results
suggest that recombination does not act on P. falciparum
18S genes, or, if it does, the degree of recombination is so
negligible that it has little impact.
FIG 3.—Phylogeny of eimeriid apicomplexan 18S genes. Because all ML and NJ trees were highly similar, only the NJ tree reconstructed from
HKY distances is shown. Values along branches represent bootstrap percentage values from 1,500 pseudoreplicates. The host animal is listed in
parentheses next to each C. parvum sequence for which this information was available. The tree is unrooted.
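The topology argument made for the P. falciparum coding and flanking regions can be illustrated by comparing the bipartitions (splits) of unrooted trees: had recombination produced mosaic histories, the trees for different regions would disagree in at least one split. The following sketch uses a Robinson-Foulds style comparison, which is a swapped-in illustration and not the Bayesian method implemented in TOPALi. The nested-tuple encoding, the function names, and the example topologies are assumptions; in particular, the grouping ((chr11, chr13), chr1, (chr5, chr7)) is inferred from the distances reported in the text and is not read from figure 4.

```python
def leaf_set(tree):
    """Collect the leaf labels under a nested-tuple node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Return the nontrivial bipartitions of an unrooted tree.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        if reference in side:
            side = all_leaves - side
        # Skip trivial splits (one side empty or a single leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits present in only one tree."""
    return len(splits(tree_a) ^ splits(tree_b))

# Hypothetical topologies for the five P. falciparum 18S genes.
coding = (("chr11", "chr13"), "chr1", ("chr5", "chr7"))
flanking = (("chr5", "chr7"), ("chr13", "chr11"), "chr1")  # same tree, different order
mosaic = (("chr11", "chr5"), "chr1", ("chr13", "chr7"))    # what recombination might yield

print(rf_distance(coding, flanking))  # prints: 0
print(rf_distance(coding, mosaic))    # prints: 4
```

A distance of zero between the coding and flanking trees corresponds to the congruence described above; any nonzero value would flag a region whose history differs from the others.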
Collectively, these results indicate (1) that the high
level of nucleotide sequence identity observed between the
coding regions of P. falciparum genes is due to purifying
selection, (2) that gene conversion (for which no evidence
was found here) is not important over the evolutionary
history of the P. falciparum 18S genes, and (3) that the
duplication events giving rise to these genes are not recent.
Thus, evidence was not found in this study for extensive
homogenization owing to concerted evolution of 18S
genes in this species (Enea and Corredor 1991; Corredor
and Enea 1994) or in any other examined here. Instead, the
results support a model of birth-and-death evolution under
strong purifying selection.
Rapid Locus Turnover in C. parvum
In most cases, genes are shared for prolonged periods
between species under the birth-and-death model (Nei,
Gu, and Sitnikova 1997; Takahashi, Rooney, and Nei
2000). Yet, there are some exceptional cases in which
rapid locus turnover occurs. The MHC class I genes of
callitrichine New World monkeys (Cadavid et al. 1997)
and the eosinophil-associated RNase genes of rodents
(Zhang, Dyer, and Rosenberg 2000) are good examples of
multigene families that experience rapid gene turnover due
to birth-and-death evolution. In these cases, rapid gene
turnover has led to the creation of species-specific gene
clusters as a result of frequent gene duplication and loss.
Consequently, few or no genes are shared between species.
The results from this study suggest that 18S genes in
some apicomplexans may also undergo rapid birth-and-death evolution. This is best shown through an analysis of
C. parvum sequences. It was possible to determine in
several cases the host-species for the C. parvum strains
that produced the 18S sequences (fig. 3). The sequences
that correspond to the human host-species are separable
into two classes known as type I and type II. These types
produce distinct schizonts that are distinguishable on the
basis of how they function in reproduction. Essentially,
type I schizonts are involved in asexual reproduction,
whereas type II schizonts are involved in sexual reproduction (reviewed in Laurent et al. 1999). In addition,
the former have been found only in human hosts, whereas
the latter have been found in humans as well as in other
animals (Gibbons et al. 1998).
The phylogeny in figure 3 shows that types I and II
18S sequences form two distinct clusters that in turn are
distinct from the other C. parvum 18S sequences. These
latter sequences come from strains that possess nonhuman
host-species (listed in parentheses next to each taxon in fig.
3). This pattern of clustering on the basis of host-species is
explainable by host-specific adaptation, which has been
described for other kinds of C. parvum molecular and
biochemical data (Xiao et al. 2002; Gibbons-Matthews and
Prescott 2003). Accordingly, a specific molecular or
biochemical genotype is associated solely with a unique
host-species. The reasons for host-specific adaptation are
unclear, but they may have to do with antigenic variation
and evasion of the host immune system. At any rate, rapid
birth-and-death evolution would certainly facilitate, if not
enhance, the process of host-specific adaptation, regardless
of the causes. Accordingly, rapid gene duplication and loss
would aid the acquisition of a set of genes optimally
adapted to a particular host-species and would eventually
guarantee that genes particular to one C. parvum strain are
not shared with other strains.
This hypothesis can be tested through a phylogenetic
analysis of 18S genes from the recently completed genome
of a C. parvum type II strain (Abrahamsen et al. 2004) in
conjunction with the sequences of other strains. There are
five 18S rRNA genes in C. parvum (Le Blancq et al. 1997),
and five genes were reported in the published type II
genome. Unfortunately, only a small fragment (approximately 125 bp) of one of those genes (cgd7_5535) was
sequenced, while another (cgd2_1375) appears to have
been misidentified as an 18S gene. Regarding the latter,
there are short stretches of sequence similarity between
that gene and the other C. parvum sequences analyzed in
this study, but the sequence is so divergent from the other
18S sequences that it cannot be aligned with any
reliability. Furthermore, when the sequence was searched
against GenBank using BLAST, the closest matches were
human oncogenes. Thus, only three of the 18S genes from
the complete type II genome are useful for our purposes.
FIG 4.—Phylogeny and sequence divergence of (A) the 18S gene
coding sequences and their respective (B) 5′ and (C) 3′ flanking
sequences from the complete genome sequence of P. falciparum. The
trees are unrooted and were constructed from Kimura (1980) distances,
which are shown below each tree. Values along branches represent
bootstrap percentage values from 1,500 pseudoreplicates. Note that the 3′
sequences correspond to the ITS1 region.
These three C. parvum genome
sequences cluster with other type II sequences (fig. 3) but
separately from the type I sequences, indicating that the
genes originated subsequent to the divergence of the
respective strains from which they came. These results
indicate that types I and II genes undergo rapid turnover
(i.e., rapid duplication and loss). It should be noted,
however, that the support for distinct type I and type II
clades is not very high (bootstrap values of 66% and 60%,
respectively). Yet, this is not unexpected if the two groups
were recently separated. It should also be pointed out that,
until the remaining two genes from the type II genome
become available for analysis, it cannot be said that all
genes from the type II genome cluster apart from the type I
sequences. Clearly, more studies are needed to investigate
this problem. Nevertheless, the previously described
results clearly indicate that 18S genes in C. parvum
undergo rapid duplication and loss.
Discussion
The “rule” regarding rRNA gene copy diversification
is that it does not occur within a species because all copies
of a given rRNA gene must remain interchangeable in
order to maintain the structural and functional homogeneity of the rRNA products that they encode (Coen,
Strachan, and Dover 1982; Nei 1987; Li 1997; Graur and
Li 2000). Consequently, it is commonly held that the
existence of divergent gene copies would skew the rate of
ribosome synthesis from its optimum and result in
a negative impact on fitness. Nevertheless, Plasmodium,
Cryptosporidium, and other apicomplexans produce
distinct rRNA “types” that differ on the basis of their
expression pattern as well as in possessing different
regions that control mRNA decoding and translational
termination (Gunderson et al. 1987; Le Blancq et al. 1997;
Rogers et al. 1996; Li et al. 1997). It has been shown that
differences between rRNA genes result in changes to
the biology of P. falciparum at different stages of its
development in response to the need for unique adaptation
and immune-evasion strategies in different hosts (Velichutina et al. 1998). However, what remain to be elucidated
are the evolutionary mechanisms that influence these
genes, which clearly do not evolve in concert (Rogers et al.
1995) unlike the rRNA genes of virtually all other
eukaryotes.
This study shows that apicomplexan 18S rRNA genes
evolve according to a birth-and-death model under strong
purifying selection. The action of purifying selection
guarantees that rRNA gene copies maintain their functional integrity in spite of their independent evolution from
one another. The differential rates of duplication and loss
produced under the birth-and-death model explain why
some rRNA genes are shared between species and are
maintained for long periods of evolutionary time, while
others appear to be recent gene duplicates or to have been
lost from the genomes of other species (figs. 2 and 3). This
explains why any one 18S gene copy does not exist in all
species of apicomplexans, as there is a distinct process
of gene turnover owing to repeated duplication and loss.
Yet, how rapid is the process of gene turnover among
apicomplexan 18S rRNA genes?
The results presented here indicate that there is a fairly
rapid degree of 18S gene turnover in C. parvum and that
this is probably tied to host-specific adaptation. It has been
known for some time that there must be an underlying
reason for the production of distinct C. parvum molecular
and biochemical genotypes that have led to their
classification as type I or type II (Gibbons-Matthews and
Prescott 2003). What is unclear is whether type I and type
II genotypes reflect that the cells from which the sequences
come are distinct strains. The results from this study
indicate that they indeed represent distinct strains, on the
basis of our phylogenetic analyses (fig. 3). Given that type
I has been found only in humans, whereas type II has been
found in both humans and nonhuman species, it is likely that
type I represents a uniquely human-adapted strain. This
information should prove useful in epidemiological studies
of both human and veterinary cryptosporidiosis.
The results concerning rapid gene turnover in C.
parvum are particularly interesting in light of the
observation that T. gondii 18S rRNA genes do not form
species-specific clusters (fig. 3) despite the fact that they
are organized in a tandem array (Gagnon, Bourbeau, and
Levesque 1996). This presents itself as an unusual
situation that warrants further study, because concerted
evolution is supposed to be the rule among multigene
family members arranged in tandem arrays (Nei 1987; Li
1997; Graur and Li 2000). Although it cannot be shown
with the currently available data, perhaps this result is also
indicative of rapid locus turnover in different strains of this
species. Thus, it will be interesting to examine T. gondii
18S genes more thoroughly after the completion of this
species’ genome sequencing project, provided that the
rRNA gene cluster is sequenced.
Unfortunately, rRNA genes are not sequenced in their
entirety in most genome projects if they are organized in
large clusters. One reason for this is the practical difficulty
of sequencing through a cluster of highly similar genes
without having inadvertently sequenced any individual
gene repeatedly. Another reason is the general presumption that all rRNA genes are identical (or nearly so) in
a given genome, which leads many to believe that there is
no need to sequence through rRNA clusters because they
are uninteresting in terms of their molecular or genomic
evolution. Thus, a large effort need not be expended upon
them. Nevertheless, this study and others (Carranza et al.
1996; Carranza, Baguñà, and Riutort 1999) show that the
evolutionary genomics of rRNA genes is more complex
than what we currently assume in many different species.
For instance, the 5S rRNA genes of many eukaryotes are
organized in a tandem array, but in some species of fungi
they are dispersed across the genome (e.g., Wood et al.
2002). In many of these species, the 5S genes appear to be
divergent from one another (unpublished data), suggesting
that they might also undergo birth-and-death evolution.
Clearly, these observations and the findings of this study
indicate that the evolutionary genomics of rRNA genes
deserves a higher level of scrutiny than what it is currently
afforded.
Acknowledgments
I thank L. Katz, C. P. Kurtzman, M. Nei, T. J. Ward,
J. Zhang, and two anonymous reviewers for comments on
the manuscript. The mention of firm names or trade
products does not imply that they are endorsed or
recommended by the U.S. Department of Agriculture over
other firms or similar products not mentioned.
Literature Cited
Abrahamsen, M. S., T. J. Templeton, S. Enomoto, et al. (17 co-authors). 2004. Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 304:441–445.
Arnheim, N. 1983. Concerted evolution of multigene families.
Pp. 38–61 in M. Nei and R. K. Koehn, eds. Evolution of
Genes and Proteins. Sinauer Associates, Sunderland, Mass.
Brown, D. D., P. C. Wensink, and E. Jordan. 1972. A comparison
of the ribosomal DNAs of Xenopus laevis and Xenopus
mulleri: the evolution of tandem genes. J. Mol. Biol. 63:
57–73.
Cadavid, L. F., C. Shufflebotham, F. J. Ruiz, M. Yeager, A. L.
Hughes, and D. I. Watkins. 1997. Evolutionary instability of
the major histocompatibility complex class I loci in New
World primates. Proc. Natl. Acad. Sci. U S A 94:14536–
14541.
Carranza, S., J. Baguñà, and M. Riutort. 1999. Origin and
evolution of paralogous rRNA gene clusters within the
flatworm family Dugesiidae (Platyhelminthes, Tricladida). J.
Mol. Evol. 49:250–259.
Carranza, S., G. Giribet, C. Ribera, J. Baguñà, and M. Riutort.
1996. Evidence that two types of 18S rDNA coexist in the
genome of Dugesia (Schmidtea) mediterranea (Platyhelminthes, Turbellaria, Tricladida). Mol. Biol. Evol. 13:824–832.
Coen, E., T. Strachan, and G. Dover. 1982. Dynamics of
concerted evolution of ribosomal DNA and histone gene
families in the melanogaster species subgroup of Drosophila.
J. Mol. Biol. 158:17–35.
Corredor, V., and V. Enea. 1994. The small ribosomal subunit
RNA isoforms in Plasmodium cynomolgi. Genetics 136:
857–865.
Dover, G., and E. Coen. 1981. Springcleaning ribosomal DNA:
a model for multigene evolution? Nature 290:731–732.
Enea, V., and V. Corredor. 1991. The evolution of plasmodial
stage-specific rRNA genes is dominated by gene conversion.
J. Mol. Evol. 32:183–186.
Gagnon, S., D. Bourbeau, and R. C. Levesque. 1996. Secondary
structures and features of the 18S, 5.8S and 26S RNAs from
the apicomplexan parasite Toxoplasma gondii. Gene
173:129–135.
Gardner, M. J., N. Hall, E. Fung, et al. (45 coauthors). 2002.
Genome sequence of the human malaria parasite Plasmodium
falciparum. Nature 419:498–511.
Gibbons, C. L., B. G. Gazzard, M. A. A. Ibrahim, S. Morris-Jones, C. S. L. Ong, and F. M. Awad-El-Kariem. 1998.
Correlation between markers of strain variation in Cryptosporidium parvum: evidence of clonality. Parasitol. Int.
47:139–147.
Gibbons-Matthews, C. L., and A. M. Prescott. 2003. Intra-isolate
variation of Cryptosporidium parvum small subunit ribosomal
RNA genes from human hosts in England. Parasitol. Res.
90:439–444.
Gottlob-McHugh, S. G., M. Levesque, K. MacKenzie, M. Olson,
O. Yarosh, and D. A. Johnson. 1990. Organization of the 5S
rRNA genes in the soybean Glycine max (L.) Merrill and
conservation of the 5S rDNA repeat structure in higher plants.
Genome 33:486–494.
Graur, D., and W.-H. Li. 2000. Fundamentals of molecular
evolution. Sinauer Associates, Sunderland, Mass.
Gu, X., and M. Nei. 1999. Locus specificity of polymorphic
alleles and evolution by a birth-and-death process in
mammalian MHC genes. Mol. Biol. Evol. 16:147–156.
Gunderson, J. H., M. L. Sogin, G. Wollett, M. Hollingdale, V. F.
de la Cruz, A. P. Waters, and T. F. McCutchan. 1987.
Structurally distinct, stage-specific ribosomes occur in
Plasmodium. Science 238:933–937.
Hasegawa, M., H. Kishino, and T. Yano. 1985. Dating of the
human-ape splitting by a molecular clock of mitochondrial
DNA. J. Mol. Evol. 22:160–174.
Hillis, D. M., and M. T. Dixon. 1991. Ribosomal DNA:
molecular evolution and phylogenetic inference. Q. Rev.
Biol. 66:411–453.
Hughes, A. L., and M. Nei. 1989. Evolution of the major
histocompatibility complex: independent origin of nonclassical class I genes in different groups of mammals. Mol. Biol.
Evol. 6:559–579.
Husmeier, D., and G. McGuire. 2003. Detecting recombination in
4-taxa DNA sequence alignments with Bayesian hidden
Markov models and Markov chain Monte Carlo. Mol. Biol.
Evol. 20:315–337.
Kapitonov, V. V., and J. Jurka. 2003. A novel class of SINE
elements derived from 5S rRNA. Mol. Biol. Evol. 20:694–702.
Kimura, M. 1980. A simple method for estimating evolutionary
rates of base substitutions through comparative studies of
nucleotide sequences. J. Mol. Evol. 16:111–120.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001.
MEGA2: molecular evolutionary genetics analysis software.
Bioinformatics 17:1244–1245.
Laurent, F., D. McCole, L. Eckmann, and M. F. Kagnoff. 1999.
Pathogenesis of Cryptosporidium parvum infection. Microbes
Infect. 1:141–148.
Le Blancq, S. M., N. V. Khramtsov, F. Zamani, S. J. Upton, and T.
W. Wu. 1997. Ribosomal RNA gene organization in Cryptosporidium parvum. Mol. Biochem. Parasitol. 90:463–478.
Li, J., R. R. Gutell, S. H. Damberger, R. A. Wirtz, J. C.
Kissinger, M. J. Rogers, J. Sattabongkot, and T. F.
McCutchan. 1997. Regulation and trafficking of three distinct
18S ribosomal RNAs during development of the malaria
parasite. J. Mol. Biol. 269:203–213.
Li, W.-H. 1997. Molecular evolution. Sinauer Associates,
Sunderland, Mass.
Liao, D. 2000. Gene conversion drives within genic sequences:
concerted evolution of ribosomal RNA genes in Bacteria and
Archaea. J. Mol. Evol. 51:305–317.
Little, R. D., and B. C. Braaten. 1989. Genomic organization of
human 5 S rDNA and sequence of one tandem repeat.
Genomics 4:376–383.
McCutchan, T. F., V. F. de la Cruz, A. A. Lal, J. H. Gunderson,
H. J. Elwood, and M. L. Sogin. 1988. Primary sequences of
two small subunit ribosomal RNA genes from Plasmodium
falciparum. Mol. Biochem. Parasitol. 28:63–68.
McGuire, G., and F. Wright. 2000. TOPAL 2.0: improved
detection of mosaic sequences within multiple alignments.
Bioinformatics 16:130–134.
McGuire, G., F. Wright, and M. J. Prentice. 1997. A graphical
method for detecting recombination in phylogenetic data sets.
Mol. Biol. Evol. 14:1125–1131.
Mercereau-Puijalon, O., J.-C. Barale, and E. Bischoff. 2002.
Three multigene families in Plasmodium parasites: facts and
questions. Int. J. Parasitol. 32:1323–1344.
Nei, M. 1987. Molecular evolutionary genetics. Columbia
University Press, New York.
Nei, M., X. Gu, and T. Sitnikova. 1997. Evolution by the birth-and-death process in multigene families of the vertebrate
immune system. Proc. Natl. Acad. Sci. U S A 94:7799–7806.
Ohta, T. 1989. Role of gene duplication in evolution. Genetics
31:304–310.
Ohta, T. 2000. Evolution of gene families. Gene 259:45–52.
Ota, T., and M. Nei. 1994. Divergent evolution and evolution by
the birth-and-death process in the immunoglobulin VH gene
family. Mol. Biol. Evol. 11:469–482.
Rogers, M. J., G. A. McConkey, J. Li, and T. F. McCutchan.
1995. The ribosomal DNA loci in Plasmodium falciparum
accumulate mutations independently. J. Mol. Biol. 254:
881–891.
Rogers, M. J., R. R. Gutell, S. H. Damberger, J. Li, G. A.
McConkey, A. P. Waters, and T. F. McCutchan. 1996.
Structural features of the large subunit rRNA expressed in
Plasmodium falciparum sporozoites that distinguish it from
the asexually expressed subunit rRNA. RNA 2:134–145.
Rooney, A. P., H. Piontkivska, and M. Nei. 2002. Molecular
evolution of the nontandemly repeated genes of the histone 3
multigene family. Mol. Biol. Evol. 19:68–75.
Saitou, N., and M. Nei. 1987. The neighbor-joining method:
a new method for reconstructing phylogenetic trees. Mol.
Biol. Evol. 4:406–425.
Swofford, D. L. 2002. PAUP*: phylogenetic analysis using
parsimony (and other methods). 4.0 Beta. Sinauer Associates,
Sunderland, Mass.
Takahashi, K., A. P. Rooney, and M. Nei. 2000. Origins and
divergence times of mammalian class II MHC gene clusters.
J. Hered. 91:198–204.
Tamura, K., and M. Nei. 1993. Estimation of the number of
nucleotide substitutions in the control region of mitochondrial
DNA in humans and chimpanzees. Mol. Biol. Evol. 10:
512–526.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and
D. G. Higgins. 1997. The ClustalX windows interface:
flexible strategies for multiple sequence alignment aided by
quality analysis tools. Nuc. Acids Res. 25:4876–4882.
Ueda, K., T. Seki, T. Kudo, T. Yoshida, and M. Kataoka. 1999.
Two distinct mechanisms cause heterogeneity of 16S rRNA.
J. Bacteriol. 181:78–82.
Velichutina, I. V., M. J. Rogers, T. F. McCutchan, and S. W.
Liebman. 1998. Chimeric rRNAs containing the GTPase
centers of the developmentally regulated ribosomal rRNAs of
Plasmodium falciparum are functionally distinct. RNA 4:
594–602.
Wang, Y., Z. Zhang, and N. Ramanan. 1997. The actinomycete
Thermobispora bispora contains two distinct types of
transcriptionally active 16S rRNA genes. J. Bacteriol.
179:3270–3276.
Wood, V., R. Gwilliam, M. A. Rajandream, et al. (132 coauthors). 2002. The genome sequence of Schizosaccharomyces pombe. Nature 415:871–880.
Xiao, L., I. M. Sulaiman, U. M. Ryan, L. Zhou, E. R. Atwill, M.
L. Tischler, X. Zhang, R. Fayer, and A. A. Lal. 2002. Host
adaptation and host-parasite co-evolution in Cryptosporidium:
implications for taxonomy and public health. Int. J. Parasitol.
32:1773–1785.
Yap, W. H., Z. Zhang, and Y. Wang. 1999. Distinct types of
rRNA operons exist in the genome of the actinomycete
Thermomonospora chromogena and evidence for horizontal
transfer of an entire rRNA operon. J. Bacteriol. 181:5201–
5209.
Zhang, J., K. D. Dyer, and H. F. Rosenberg. 2000. Evolution of
the rodent eosinophil-associated RNase gene family by rapid
gene sorting and positive selection. Proc. Natl. Acad. Sci.
U S A 97:4701–4706.
Zhu, J. D., A. P. Waters, A. Appiah, T. F. McCutchan, A. A. Lal,
and M. R. Hollingdale. 1990. Stage-specific ribosomal RNA
expression switches during sporozoite invasion of hepatocytes. J. Biol. Chem. 265:12740–12744.
Zimmer, E. A., S. L. Martin, S. M. Beverley, Y. W. Kan, and A.
C. Wilson. 1980. Rapid duplication and loss of genes coding
for the α chains of hemoglobin. Proc. Natl. Acad. Sci. U S A
77:2158–2162.
Laura Katz, Associate Editor
Accepted May 26, 2004