Selection-Driven Evolution of Emergent Dengue Virus
Shannon N. Bennett,* Edward C. Holmes,† Maritza Chirivella,‡ Dania M. Rodriguez,*
Manuela Beltran,§ Vance Vorndam,§ Duane J. Gubler,‖ and W. Owen McMillan*
*Department of Biology, University of Puerto Rico–Rio Piedras, San Juan, Puerto Rico; †Department of Zoology, University of
Oxford, Oxford, England; ‡Department of Microbiology and Medical Zoology, University of Puerto Rico–Ciencias Medicas,
San Juan, Puerto Rico; §Centers for Disease Control and Prevention, San Juan Branch, San Juan, Puerto Rico; ‖Centers for
Disease Control and Prevention, Fort Collins, Colorado
In the last four decades the incidence of dengue fever has increased 30-fold worldwide, and over half the world’s
population is now threatened with infection from one or more of four co-circulating viral serotypes (DEN-1 through
DEN-4). To determine the role of viral molecular evolution in emergent disease dynamics, we sequenced 40% of the
genome of 82 DEN-4 isolates collected from Puerto Rico over the 20 years since the onset of endemic dengue on the
island. Isolates were derived from years with varying levels of DEN-4 prevalence. Over our sampling period there were
marked evolutionary shifts in DEN-4 viral populations circulating in Puerto Rico; viral lineages were temporally
clustered and the most common genotype at a particular sampling time often arose from a previously rare lineage.
Expressed changes in structural genes did not appear to drive this lineage turnover, even though these regions include
primary determinants of viral antigenic properties. Instead, recent dengue evolution can be attributed in part to positive
selection on the nonstructural gene 2A (NS2A), whose functions may include replication efficiency and antigenicity.
During the latest and most severe DEN-4 epidemic in Puerto Rico, in 1998, viruses were distinguished by three amino
acid changes in NS2A that were fixed far faster than expected by drift alone. Our study therefore demonstrates viral
genetic turnover within a focal population and the potential importance of adaptive evolution in viral epidemic
expansion.
Introduction
RNA viruses comprise one of the fastest growing
categories of emergent diseases (Domingo and Holland
1997). Although they exhibit remarkable genetic diversity,
attributable to intrinsically high rates of mutation and
replication as well as large population sizes (Domingo and
Holland 1997; Drake and Holland 1999), the role of viral
evolution in determining disease dynamics has only been
described in a few cases (for example, Bush et al. 1999;
Zanotto et al. 1999; Manzin et al. 2000; Hatta et al. 2001).
We examine evolutionary change in dengue (DEN), an
acute mosquito-borne RNA virus (genus Flavivirus), over
a 20-year period that has marked the emergence of dengue
in Puerto Rico (PR), a dense urban population whose
growth rate rivals Asian population centers. The virus,
which causes dengue fever (DF), and the more severe
dengue hemorrhagic fever (DHF) and dengue shock
syndrome (DSS), consists of four antigenically distinct
serotypes, DEN-1 through DEN-4, that are evolutionarily
derived from at least three independent introductions into
humans from wild primates in Africa and Southeast Asia
(Wang et al. 2000). There is also abundant genetic
diversity within each serotype, in the guise of phylogenetically distinct clusters of sequences often referred to as
“genotypes” (reviewed in Holmes and Burch 2000).
The ongoing expansion of dengue throughout Asia
and the South Pacific is being recapitulated in the
Americas (Gubler 1998). Before the 1950s, people were
typically exposed to a single strain (hypoendemicity), and
epidemics were rare and self-limiting (Gubler 1998).
However, geographic expansion of the primary mosquito
vector (Aedes aegypti), increasing host densities, particularly in urban centers, and global travel have substantially
altered dengue’s epidemiologic landscape (Gubler 1998).
Now dengue annually infects an estimated 50 million to
100 million people worldwide (WHO 1999), many of
whom are exposed to two or more co-circulating DEN
serotypes (hyperendemicity), resulting in frequent large-scale epidemics and more frequent severe disease (Gubler
1998).
Key words: dengue virus, positive selection, epidemiology, phylogeny, maximum likelihood.
E-mail: [email protected].
Mol. Biol. Evol. 20(10):1650–1658. 2003
DOI: 10.1093/molbev/msg182
Molecular Biology and Evolution, Vol. 20, No. 10,
© Society for Molecular Biology and Evolution 2003; all rights reserved.
Determining the contributing factors to the emergence of dengue as a global pandemic, particularly the
increasing incidence of DHF and DSS, has proven difficult
both because there are no satisfactory models or in vitro
correlates with which to study disease transmissibility or
pathogenicity directly (Rothman and Ennis 1999), and
because most molecular epidemiologic studies to date have
had limited scope (Holmes 1998). Associations have been
demonstrated between severe manifestations of dengue
(DHF/DSS) and both host infection history and viral
genotype. Most notably, secondary infections with heterologous serotypes are more likely to develop into DHF/
DSS than primary infections (Halstead 1988; Thein et al.
1997; Gubler 1998) so that increasing hyperendemicity
could account in part for the rise of DHF/DSS. However,
there is also evidence that viral genotype may be
a contributing factor in determining dengue disease. For
example, attenuated and virulent strains of DEN-2 were
first observed simultaneously in the Tonga epidemics of
1974/75 (Gubler et al. 1978), and the introduction of
a genetically distinct Asian DEN-2 strain into the
Americas has been associated with an increase in DHF/
DSS (Rico-Hesse et al. 1997; Leitmeyer et al. 1999). More
tentatively, an analysis of selection pressures acting on
dengue virus genomes suggested that genotypes of DEN-2
have selectively determined differences in transmissibility,
in turn determining their ability to cause epidemics on
a global scale (Twiddy et al. 2002).
FIG. 1.—Incidence of dengue virus in Puerto Rico since 1981. Years included in this study are marked on the x-axis with a black bar: * denotes the
sample from Dominica. The rise of dengue in Puerto Rico mirrors the onset of the dengue pandemic in the New World. Before WWII, epidemics in
Puerto Rico were rare, but subsequent years have been marked by frequent epidemics and, since the 1980s, continuous hyperendemic transmission
(Dietz et al. 1996; Gubler 1998). Of the dengue cases reported annually (solid black line, right axis), a subset is submitted to the CDC, isolated, and
identified to serotype (hatched area, left axis, plotted against month/year of isolation). The proportions that were DEN-4 are shaded solid gray. Because
of dengue’s variable etiology, it often goes unreported, and thus the number of recorded cases underrepresents the true number of dengue infections by
up to an estimated factor of 50 to 100 (WHO 1999).
Puerto Rico provides an ideal natural laboratory to
gather a detailed record of viral evolutionary change during
disease expansion. The island has a large urban population
with high mosquito vector densities and, like many tropical
regions, has experienced nearly 20 years of dengue
epidemics that are becoming increasingly severe (Gubler
1998). Although dengue fever was recorded in Puerto Rico
as early as 1915 (Dietz et al. 1996), continuous transmission of all four serotypes has only occurred since the
1980s (Dietz et al. 1996; Gubler 1998). The first epidemic
in Puerto Rico, consisting primarily of DEN-4, was
reported in 1981/82, followed by another DEN-4–dominated outbreak in 1986, this one marked by high
incidences of DHF/DSS (Dietz et al. 1996; fig. 1). DHF/
DSS cases have occurred periodically since the 1980s,
reaching record levels in the latest DEN-4 epidemic in
Puerto Rico in 1998.
Taking advantage of Puerto Rico’s turbulent epidemiological record, we use a longitudinal phylogenetic
approach to recover the history of evolutionary change in
DEN-4 during disease emergence. In the absence of
experimental models, phylogenetic analyses within a focal
population provide the only method with which to correlate
viral genetic change with epidemic behavior. Herein we
examine viral evolution in DEN-4 isolates collected from
Puerto Rico since the onset of epidemic dengue on the
island. We sample nearly 40% of the viral genome,
including all the structural genes known to be important in
viral packaging and host cell entry, as well as a subset of
nonstructural genes, from 82 viral isolates collected over
a 20-year period. Thus we expand current knowledge of
dengue molecular evolution to include genes never before
systematically surveyed on this scale (Holmes 1998). We
assess the role of viral molecular evolution in disease
dynamics by testing for a viral adaptive basis to the
changing patterns of DEN-4 incidence in Puerto Rico.
Hence, we ascertain for the first time the role of natural
selection in DEN-4 evolution in the context of a well-characterized pattern of epidemic outbreaks.
Materials and Methods
We examined substitution patterns in 82 DEN-4
isolates from Puerto Rico and surrounding regions since
the disease was established in 1981/82 (Dietz et al. 1996).
We subsampled viral isolates from the U.S. Centers for
Disease Control and Prevention (CDC) sample bank that
had been collected in Puerto Rico during the years 1982 (n =
14), 1986/87 (n = 19), 1992 (n = 15), 1994 (n = 14), and
1998 (n = 13) to represent both endemic and epidemic
disease conditions (fig. 1). With 13 to 19 isolates per
year-group, we have a 75%–86% chance of sampling rare
alleles (defined as existing at a frequency of 10% in the
population) at least once. In addition to 75 Puerto Rican
isolates, we included seven isolates sampled from outside
Puerto Rico during the same period: three originating
within the Caribbean basin, two from Central America,
and one from Ecuador. Caribbean basin samples included
a 1981 sample from Dominica, Lesser Antilles, believed
to represent the introduction of Asian DEN-4 into the
Caribbean basin.
Table 1
Primers Designed to Amplify and Sequence DEN-4 Gene Regions
Labelᵃ   Sequence                      Function                    Gene/Gene Fragment
90U      5′ATCTCTGGAAAAATGAACCAACGAA   Amplification               Capsid/prMem
842L     5′ATAAGCCATAAATCCTGCCAAGAGC   Amplification               Capsid/prMem
138U     5′AATATGCTGAAACGCGAGAGAAACC   Sequencing                  Capsid/prMem
410L     5′TACGGTGGGAATCAAGCACAGCAA    Sequencing                  Capsid/prMem
518U     5′GACAACAGAGGGGATCAACAAATGC   Sequencing                  Capsid/prMem
736L     5′GCTCTTGTTTCCAATCCCATTCCTG   Sequencing                  Capsid/prMem
616U     5′CCGAACCTGAAGACATTGATTGCTG   Amplification               EnvA
1676L    5′TTCCAGCACTGTCACATCCTGTCTC   Amplification               EnvA
486U     5′CACGTATAAATGCCCCCTACTGGTC   Amplification               EnvA
1786L    5′GCTGTGTTTCTGCCATCTCTTTGTC   Amplification               EnvA
686U     5′GAGCGGAGAACGGAGACGAGAGAAG   Sequencing                  EnvA
1142U    5′AACTACGGCAACAAGATGTCCAACG   Sequencing                  EnvA
1181L    5′CTGTTGGTCCTGTTCCTCTTTCAGA   Sequencing                  EnvA
1603L    5′TGAACCTCTGATGTGTCTGCTCCTG   Sequencing                  EnvA
580U     5′ACCCAGAGCGGAGAACGGAGACGAG   Sequencing                  EnvA
803L     5′GGGGCGACCAGCATCATTAGGACAA   Sequencing                  EnvA
967U     5′GAACTGACTAAGACAACAGCCAAGG   Sequencing                  EnvA
1136L    5′AACAAGCCACAGCCATTGCCCCACC   Sequencing                  EnvA
1363U    5′CCGGACTATGGAGAACTAACACTCG   Sequencing                  EnvA
1658L    5′TGTCCTGCAAACATGTGATTTCCAT   Sequencing                  EnvA
1568U    5′GCAATGGTTTTTGAATCTGCCTCTT   Amplification               EnvB/NS1
2679L    5′CCTTCACATCCCCAGCCACTACAGT   Amplification               EnvB/NS1
1602U    5′GCAGGAGCAGACACATCAGAGGTTC   Sequencing                  EnvB/NS1
2114U    5′GAAAGGGAGTTCCATTGGCAAGATG   Sequencing                  EnvB/NS1
1985L    5′CAAAGGGGTGGATGAGATGATACGC   Sequencing                  EnvB/NS1
2519L    5′TCTCGCTGGGGACTCTGGTTGAAAT   Sequencing                  EnvB/NS1
3528U    5′TTTGTGGAAGAATGCTTGAGGAGAA   Amplification               NS2A
4225L    5′GCCAGAAGTAAGCCTCCTGCCACCA   Amplification               NS2A
3592U    5′CTCTTTGTGCTATCATCTTGGGAGG   Sequencing                  NS2A
4143L    5′AACCCACAGCCATTATGCCCTCGTT   Sequencing                  NS2A
7038U    5′CTAATGGGGCTTGGAAAAGGATGGC   Amplification               NS4B
7769L    5′TACAACTTCCCCTTTTGGCTTTACC   Amplification               NS4B
7106U    5′ATGCTATTCTCAAGTGAACCCAACA   Sequencing                  NS4B
7674L    5′CTTTCAGGGCAGACTTGGCTTCAGT   Sequencing                  NS4B
10133U   5′CACCTGGGCGAAGAACATTCACACG   Amplification / sequencing  3′NTR
10600L   5′CACCAATCCATCTTGCGGCGCTCTG   Amplification / sequencing  3′NTR
10612L   5′TTGGATCAACAACACCAATCCATCT   Amplification / sequencing  3′NTR
10620L   5′AGAACCTGTTGGATCAACAACACCA   Amplification / sequencing  3′NTR
ᵃ Number indicates genome nucleotide position according to Zhao et al. (1986): U for forward and L for reverse.
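The 75%–86% sampling figure quoted for rare alleles can be checked with a short calculation. The sketch below assumes isolates are independent draws from the viral population, so that the chance of seeing an allele at frequency f at least once among n isolates is 1 − (1 − f)^n; the function name is illustrative and the independence assumption is ours, not a formula stated in the text.

```python
def prob_sampled_at_least_once(n_isolates: int, allele_freq: float = 0.10) -> float:
    """Chance that an allele at frequency `allele_freq` appears at least once
    among `n_isolates` isolates, assuming independent draws (our assumption)."""
    return 1.0 - (1.0 - allele_freq) ** n_isolates


# Smallest (13) and largest (19) year-group sizes reported in the sampling scheme.
for n in (13, 19):
    print(n, round(prob_sampled_at_least_once(n), 3))
```

Under this assumption the two extremes come out at roughly 0.75 and 0.86, which matches the range given for the year-groups.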
All samples have low passage histories, reducing the
risk of artificial selection in vitro: only those samples
derived from chronic (generally low) infections were first
cultured in C6/36 mosquito cells, for one or, at most, two
passages, prior to RNA extraction. To further eliminate
potential biases due to artificial selection, samples were not
processed in temporal (year) order. We extracted sample
RNA using QIAamp Viral RNA Mini kits (Qiagen
GmbH). For each isolate we amplified, using reverse-transcriptase polymerase chain reaction (RT-PCR), gene
regions amounting to 40% of the viral genome (4,016 bp
of an 11 kbp genome) and including both 5′ and 3′ ends
(see table 1 for primer sequences). Amplified regions
included all the structural genes (capsid: C; membrane: M;
and envelope: E), a subset of nonstructural genes (NS1,
NS2A, and NS4B), and the noncoding 3′ NTR region.
Amplifications were divided into separate reactions
according to length of the target. Before sequencing, RT-PCR products were purified using Qiagen PCR purification kits (Qiagen GmbH). We sequenced both strands of
the amplified products using forward and reverse primers
(table 1) in standard dye-labeling reactions. Sequence data
were collected on an ABI 377 slab-gel automated
sequencer (Applied Biosystems), edited, and compiled
with Sequencher 3.1.1 (Gene Codes) and aligned against
reference sequences (GenBank number M14931; Zhao
et al. 1986, Mackow et al. 1987) using Megalign’s clustal
algorithm (version 3.1.7, Lasergene). We imported aligned
sequences into PAUP* (Swofford 2001) for phylogenetic
analysis.
Recombination, reported in all DEN serotypes
(Worobey, Rambaut, and Holmes 1999; Tolou et al.
2001; Uzcategui et al. 2001; Twiddy and Holmes 2003),
can lead to conflicts in phylogenetic trees. We searched for
potential recombinants across the entire phylogeny by
testing for topological incongruity among Neighbor-Joining (NJ) trees generated using a 500-base sliding
window. The statistical support for recombination in these
sequences, as well as the locations of the breakpoints, was
determined using a maximum likelihood method (program
LARD; Holmes, Worobey, and Rambaut 1999), and then
maximum likelihood (ML) trees were constructed on
either side of the breakpoints identified.
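As a rough illustration of the sliding-window scan described above, the following sketch enumerates the alignment column ranges that successive 500-base windows would cover. Only the window width comes from the text; the step size of 100 bases and the function name are hypothetical choices, and building and comparing the NJ trees for each window is left to the phylogenetic software.

```python
def sliding_windows(alignment_length: int, window: int = 500, step: int = 100):
    """Return half-open (start, end) column ranges for each window.

    `window` follows the 500-base width given in the text; `step` is an
    assumed value, since the step size is not reported.
    """
    if window >= alignment_length:
        return [(0, alignment_length)]
    return [(start, start + window)
            for start in range(0, alignment_length - window + 1, step)]


# 4,016 aligned bases per isolate, as reported for the sequenced regions.
windows = sliding_windows(4016)
print(len(windows), windows[0], windows[-1])
```

Each range would then be used to slice the alignment before tree building, with topologies compared across windows to flag candidate recombinants.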
The evolutionary relationships among DEN-4 isolates
were inferred using a ML method (PAUP* package,
Swofford [2001]). In all cases trees were estimated using
the best fitting model of nucleotide substitution identified
by Modeltest 3.06 (Posada and Crandall 1998). The model
of DNA substitution that best described DEN-4 evolution
in Puerto Rico (including the outgroup and six other
foreign samples) was the general time-reversible model
that includes six substitution rate parameters (A↔C =
2.0346, A↔G = 12.5935, A↔T = 1.7144, C↔G =
2.0608, C↔T = 31.0427, G↔T = 1), with 41.5% of sites
variable and a gamma distribution of among-site rate
variation (4 categories) with a shape parameter (α) of
1.020 (substitution model GTR + I + Γ). Phylogenies were
generated under successive rounds of tree-bisection/
reconnection (TBR) branch swapping, updating parameter
estimates at each round. To assess the support for the
phylogenetic groupings observed we undertook a bootstrap
resampling analysis using 1,000 replicate Neighbor-Joining trees estimated under the ML substitution model
determined above. Trees were rooted with the 1981 isolate
from Dominica, the oldest sequence available.
We used two methods to assess the extent of adaptive
evolution in DEN-4. First, we examined the relative rates
of nonsynonymous (dN) and synonymous (dS) substitution
across coding portions of the viral genome. To do this, we
employed a ML approach to compare models of evolution
that allow dN/dS to vary within genes or among lineages of
the ML tree of the PR sequences (Yang et al. 2000). In
particular, we compared models that allow for positive
selection because they incorporate a class of codons where
dN/dS can be greater than 1 (models M2, M3, M8) with
those that specify neutral evolution because dN is
constrained to be less than dS (models M0, M1, and
M7). We also used the free ratio (FR) model that allows
each branch of the tree to have a different dN/dS ratio.
Models were compared using standard likelihood ratio
tests. A Bayesian approach was used to identify those
individual codons most likely subject to positive selection.
This approach calculates the posterior probabilities of dN/
dS categories for each amino acid site so that sites with the
highest probabilities of falling into dN/dS category > 1 are
most likely to have been under positive selection. All these
analyses were undertaken using the CODEML program
from the PAML package (Yang 1997).
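The likelihood ratio tests used to compare nested codon models can be sketched as follows. The test statistic is twice the difference in log-likelihood, referred to a chi-square distribution. The log-likelihood values below are hypothetical placeholders, not results from this study, and the 4 degrees of freedom for an M1 versus M3 comparison is an assumption based on the usual parameter counts for those models. The closed-form tail probability used here holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability, exact closed form for even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2*delta lnL, P value) for a nested model comparison."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf_even_df(stat, df)


# Hypothetical log-likelihoods for a neutral (null) and a selection (alternative) model.
stat, p_value = likelihood_ratio_test(lnl_null=-1500.0, lnl_alt=-1497.0, df=4)
print(f"2*dlnL = {stat:.2f}, P = {p_value:.3f}")
```

A large P value, as in this made-up case, would mean the model allowing positive selection does not fit significantly better than the neutral one.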
We also employed a population genetic approach to
test for adaptive evolution in dengue virus. According to
standard theory, the average time to fixation of neutral
mutations in a haploid population is ~2Ne generations.
Consequently, if mutations have been fixed much faster
than this, we can conclude that their substitution dynamics
are dominated by positive selection rather than drift. To
calculate 2Ne generations for DEN-4 in Puerto Rico, we
estimated the parameter θ (= 2Neμ), the neutral mutation
rate per site per generation (μ), and the viral generation
time (g). θ was estimated from given sampling years (1994
and 1998) using a coalescent method (program Fluctuate;
Kuhner, Yamato, and Felsenstein [1998]); the generation
time of dengue virus was taken as 14 days comprising
intrinsic (within human) and extrinsic (within mosquito)
replication times of 7 days duration each (Holmes, Bartley,
and Garnett 1998). Although direct estimates of μ are not
available for dengue virus, a synonymous rate of 6.89 ×
10⁻⁴ substitutions/site/year was recently estimated for
DEN-4 (Twiddy, Holmes, and Rambaut 2003). Given
a generation time of 14 days, this is equivalent to a μ of
2.64 × 10⁻⁵ mutations per site, per generation. Putatively
positively selected amino acid changes were identified as
those that fall on the internal branches of the tree that
separate sampling times (for example, on the branch
leading to the viral isolates sampled in 1998); at the
population genetic level, mutations that are absent from an
early time-point yet present in all sequences from a later
time-point can be assumed to have gone to fixation over
the course of the sampling period.
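The conversion from a yearly synonymous rate to a per-generation mutation rate can be reproduced in a few lines. This is a minimal sketch assuming a 365-day year, which the text does not state explicitly; the function and constant names are illustrative.

```python
DAYS_PER_YEAR = 365.0  # assumed year length


def per_generation_rate(rate_per_site_per_year: float, generation_days: float) -> float:
    """Scale a per-year substitution rate to a per-generation rate."""
    return rate_per_site_per_year * generation_days / DAYS_PER_YEAR


# 6.89e-4 substitutions/site/year and a 14-day generation, both taken from the text.
mu = per_generation_rate(6.89e-4, 14)
print(f"{mu:.3g}")  # 2.64e-05
```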
Sequences generated by this study can be accessed
on GenBank according to accession numbers AY152036
through AY152363.
Results
We examined over 4,000 nucleotides from each of
82 DEN-4 isolates collected in Puerto Rico and surrounding regions over a 20-year period. This included isolates
from (1) 1982, representing the first major outbreak of
DEN-4 in Puerto Rico; (2) the second major dengue
epidemic on the island in 1986 to 1987, marked by
hyperendemic transmission, and 29 DHF/DSS cases
including 3 deaths (Dietz et al. 1996); (3) two years—
1992, 1994—during which DEN-4 occurred at relatively
low prevalence; and (4) the most recent DEN-4 epidemic
in 1998 (fig. 1). The 1998 epidemic marked the first time
in 12 years that DEN-4 again dominated the epidemiological landscape in PR (44% of all positively diagnosed
dengue cases) and was one of the most severe on the
island (396,000–792,000 infections estimated and a record
59 DHF cases reported, 2.5 standard deviations above the
mean of 16.5; CDC data not shown). DEN-4 viruses
circulating in the 1998 outbreak shared on average 98.5%
sequence similarity with those from 1981/82. Over the
entire study period, only 14% of all nucleotide sites
experienced substitutions, of which 26% from translated
regions (or 3.6% of all coding sites) resulted in amino acid
substitutions.
Our ML phylogenetic analysis of DEN-4 in Puerto
Rico revealed a pattern of evolution marked by strong
temporal clustering of isolates by year of sampling (fig. 2).
All early (1982) isolates from Puerto Rico were associated
with the 1981 isolate from Dominica, and were 0.7%
(range: 0.5% to 1%) different from the closest group of
subsequent PR isolates from 1986/87. Three of six other
foreign isolates shared ancestors with this early introduction group as opposed to later PR isolates (El Salvador
1993, Ecuador 1994, and Mexico 1995, data not shown),
reflecting the widespread distribution of the introduced
Asian DEN-4 variant from 1981 (Gubler 1998; Foster et al.
2003). Since the initial epidemic in 1982, DEN-4 was
virtually absent from Puerto Rico until the 1986 epidemic
(fig. 1; Dietz et al. 1996). All viruses sampled in Puerto
Rico during and after this re-emergence (1986 onward) fell
into a single lineage defined by four silent nucleotide
substitutions and one amino acid substitution in the
envelope (E) gene (methionine to threonine, aa position
163; fig. 2). With the exception of a single 1994 isolate,
two additional silent and two conservative amino acid
substitutions (isoleucine to valine, envelope aa position
351; lysine to arginine, NS1 aa position 51) occurred in the
formation of the re-emergent PR lineage. Within this re-emergent lineage, sublineages were largely temporally
ordered. For example, most of the 1987 isolates fell into
a well-defined temporal cluster, distinguished by five silent
changes across coding regions examined (gold in fig. 2).
Similarly, major temporal clusters were formed by all 1992
(green in fig. 2), most 1994 (blue in fig. 2), and all 1998
isolates (red in fig. 2), respectively. The 1998 year group
was defined by several silent changes concentrated in the E
gene, and more notably three amino acid replacements in
the nonstructural NS2A protein (isoleucine to valine, aa
position 14; valine to threonine, aa position 54; and proline
to serine, aa position 101).
Although DEN-4 isolates grouped into temporal
clusters, the dominant clade (that which included most of
the isolates) from a particular year descended from older
isolates that represented minor variants in the previous
sampling period. For example, the 1992 cluster did not
descend from the major 1987 cluster, but from contemporaneous (1987) variants representing only 17% (3 out of
19) of the isolates sampled in 1987. Similarly, the 1998
cluster descended from a rare 1994 lineage represented by
only 8% (2 out of 27) of the isolates sampled between
1992 and 1994. Indeed, only the major 1994 lineage was
nested within the dominant lineage of the previous
sampling period, 1992. This pattern of sequence differences among DEN-4 isolates from Puerto Rico indicates
phylogenetic shifts in the population of variants between
sampling periods. We refer to this numerical shift in the
population of genotypes present in a given year away from
the dominant genotypes of an earlier time as “lineage
turnover,” since it implies a replacement of the most
successful lineage from one sampling period to the next.
Finally, we found no evidence for major shifts in
topological position among gene regions indicative of
recombination.
To determine whether positive selection has played
a significant role in DEN-4 evolution and lineage turnover,
we examined rates of nonsynonymous (dN) and synonymous (dS) substitution in individual viral genes using a ML
method. Although eight potentially positively selected
sites were identified (posterior probability P > 0.99) in the
E, NS1, NS4B, and most notably the NS2A genes, where
a small class of codons (0.9%) had a mean dN/dS ratio of
~4.6, in no case could a model of codon evolution
allowing positive selection conclusively reject all competing neutral models (table 2; the results for all model
comparisons are available from the authors on request).
However, the evolution of the nonstructural gene NS2A
was striking in that the branch leading to the 1998 cluster
of sequences was distinguished exclusively by three
nonconservative amino acid replacements in NS2A (14Ileu
to Thr, 54Val to Thr, 101Pro to Ser; fig. 2), in the absence of
any synonymous nucleotide changes. This results in an
infinitely large dN/dS ratio along this branch, suggestive of
positive selection: the mean dN/dS for all other internal
branches of our phylogeny in NS2A was significantly
lower (mean dN/dS = 0.038, P = 0.001, using absolute
number of nucleotide changes for observed and expected
values). Moreover, these mutations appear to have been
fixed far more quickly than if they were subject to genetic
drift alone. Estimated values of θ (2Neμ) are 0.024 (range
0.014 to 0.045) and 0.027 (range 0.016 to 0.052) for the
viruses sampled from years 1994 and 1998, respectively.
Assuming a neutral mutation rate of 2.64 × 10⁻⁵ mutations
per site, per generation, effective population sizes (Ne)
were only 454 and 511 for 1994 and 1998, respectively.
Taking the mean Ne value across these two sampling times
(482), we obtain an expected fixation time under genetic
drift of 13,496 days (482 × 14 days/generation × 2) or
~37 years. However, the observed fixation time for these
mutants is a maximum of 6 years; as these mutations were
first detected in 1994, we assume that they appeared
sometime between the 1992 and 1994 epidemics, giving
a maximum of 6 years time difference to the 1998 strains.
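The arithmetic behind the expected fixation time under drift can be retraced step by step. The sketch below takes the reported point estimates of the diversity parameter (0.024 and 0.027), the per-generation mutation rate, and the 14-day generation time from the text. Truncating the effective sizes to whole numbers and using a 365-day year are our assumptions, chosen because they reproduce the reported figures; the names are illustrative.

```python
MU = 2.64e-5          # neutral mutations per site per generation (from the text)
GENERATION_DAYS = 14  # viral generation time in days (from the text)


def effective_size(theta: float, mu: float = MU) -> float:
    """Haploid effective population size from theta = 2 * Ne * mu."""
    return theta / (2.0 * mu)


def neutral_fixation_days(ne: float, generation_days: float = GENERATION_DAYS) -> float:
    """Expected neutral fixation time of about 2 * Ne generations, in days."""
    return 2.0 * ne * generation_days


ne_1994 = int(effective_size(0.024))   # truncated to a whole number (assumption)
ne_1998 = int(effective_size(0.027))
mean_ne = (ne_1994 + ne_1998) // 2     # integer mean, as in the reported value
days = neutral_fixation_days(mean_ne)
years = days / 365.0                   # assumed 365-day year

print(ne_1994, ne_1998, mean_ne, int(days), round(years, 1))
```

Set against the observed maximum of 6 years, the neutral expectation of about 37 years is roughly six times longer, which is the gap the drift argument rests on.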
Discussion
This longitudinal phylogenetic study of DEN-4 in
a focal host population examined evolutionary changes
during viral epidemic expansion. Our most striking
observation was that the evolutionary history of DEN-4
in Puerto Rico was characterized by the replacement of
lineages between epidemic years: most isolates from
a given year were closely related, but turnover of the
common variant occurred between sampling periods. This
pattern of lineage turnover is similar to that seen in some
other acute RNA viruses. For example, in coxsackie-A
virus temporally organized lineages, regardless of geographic origin, are equally unrelated to each other (Santti
et al. 2000; Ishiko et al. 2002) suggestive of lineage
turnover fueled by virus exchange between spatially
distinct populations. Phylogenetic evidence also suggests
that vesicular stomatitis virus in the United States and
Mexico has experienced lineage shifts since the early
1980s, following its geographical spread in the Americas
(Nichol, Rowe, and Fitch 1993). Finally, there is some
evidence for lineage turnover in human influenza A virus,
although phylogenetic trees from this virus tend to have
a more regular temporal structure, most likely reflecting
the continual selection pressure exerted by neutralizing
antibodies (Bush et al. 1999).
FIG. 2.—Maximum likelihood tree based on 3,543 bp sequences (coding regions) from 75 isolates of DEN-4 from Puerto Rico and one from
Dominica (the outgroup sequence). Six other foreign isolates have been omitted from the figure for simplicity. The same topology was obtained when
phylogenies were constructed including non-coding sequence data (4,016 bp per isolate). Branches are color-coded by year of sample isolation.
Bootstrap support values, shown at nodes, were generated by using 1,000 replicate Neighbor-Joining trees reconstructed under the best-fit model of
nucleotide evolution. Three amino acid changes, in envelope (E) and NS1 (N1) genes, that define the post-introduction Puerto Rican lineage, and three
amino acid changes in the positively selected NS2A (2A) gene that define the 1998 clade, are marked with black bars and the amino acid position within
their respective genes.
Table 2
Maximum Ratio of Nonsynonymous to Synonymous
Substitutions for Each DEN-4 Gene Region Examined in
This Study
                     dN/dSᵃ
Gene                 Max. dN/dSᵇ   Proportion of Codonsᶜ   Pᵈ
Capsid / membrane    0.822         0.167                   0.997
Envelope / NS1       2.110         0.017                   0.157
NS2A                 4.574         0.009                   0.725
NS4B                 1.851         0.014                   0.937
ᵃ Values given for the M3 model of codon evolution that allows three classes
of dN/dS per gene sequence alignment, all of which are estimated from the data.
ᵇ Highest dN/dS for a set of codons estimated under the M3 model.
ᶜ Proportion of codons with the maximum dN/dS value.
ᵈ Significance value obtained from a likelihood ratio test involving M3 and
the neutral codon model M1 (which allows two classes of dN/dS, 0 and 1).
There are several non-mutually exclusive explanations for the lineage turnover observed in DEN-4 evolution
in Puerto Rico over the last 20 years, aside from incomplete
sampling. Novel lineages could arise and proliferate in
a population through multiple re-introductions, genetic
drift, and/or selection. However, although introductions
from other DEN-4 populations may provide a source of
variation, evidence suggests that DEN-4 in the Caribbean is
characterized by local evolution interrupted occasionally
by gene flow (Foster et al. 2003). There was also no
evidence that microgeographic population structure within
Puerto Rico generated the observed pattern, as virus
samples were obtained from similar geographic regions
in all cases (data not shown). In addition, the distinct and
persistent pattern of lineage turnover is difficult to explain
by random sampling processes alone, because we would
expect common genotypes to become fixed by genetic drift
more often than rare ones. Indeed, the stochastic nature of
the dengue virus life-cycle should favor common variants:
genetic bottlenecks occur at every mosquito feeding event,
along with seasonal reductions in vector populations
(Gubler 1987), and annual variation in the abundance of
susceptible human hosts. Instead, the dominant Puerto
Rican lineage of a given year twice descended from earlier
rare genotypes, a pattern that suggests that much of the
lineage turnover is driven by selection on viral genotype. In
support of this hypothesis, there was an increase in the rate
of nonsynonymous substitution (in the absence of any
silent changes in NS2A) on the lineage leading to the 1998
epidemic, and these amino acid changes were fixed far
more quickly than expected by genetic drift. Moreover, our
population genetic estimations for the fixation time of the
NS2A mutants are conservative in that these changes may
have been fixed much faster than the 6 years separating the
1992 and 1998 samples, and our estimates of Ne may be
artificially low if positive selection has purged genetic
diversity. Consequently, adaptive evolution in the NS2A
gene may have triggered the 1998 epidemic in Puerto Rico,
and DEN-4 genotypes bearing these NS2A modifications
were also associated with contemporaneous epidemics
throughout the Greater and Lesser Antilles (Foster et al.
2003). Conversely, a similar association between amino
acid changes and lineage turnover was not observed
between 1987 and 1992, where neither clade was defined
by amino acid substitutions. In this case, lineage turnover
may have resulted from drift-sensitive population bottlenecks, inter-island extinction/recolonization, or selection
on other parts of the genome not examined in this study.
Although we examined many more nucleotides than
previous studies (e.g., Rico-Hesse 1990; Lewis et al.
1993; Lanciotti et al. 1994; Lanciotti, Gubler, and Trent
1997; Rico-Hesse et al. 1997, 1998; Singh et al. 1999;
Uzcategui et al. 2001; Twiddy et al. 2002), 60% of the
dengue genome was not surveyed, including genes known
to be important in virus replication, such as NS5 (Leitmeyer
et al. 1999), and virus antigenicity, such as NS1 (Mathew
et al. 1998; Jacobs et al. 2000). Because selection is
apparently restricted to very few sites, a complete appreciation of the forces driving genetic change in DEN-4 will
ultimately require the analysis of full genome sequences.
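The drift expectation invoked above, that common genotypes should reach fixation more often than rare ones, can be illustrated with a minimal neutral Wright-Fisher simulation. This is a sketch under stated assumptions, not the population genetic analysis used in this study: the haploid population size, the starting frequencies, and the function name `fixation_probability` are all hypothetical choices made for illustration.

```python
import random


def fixation_probability(pop_size, start_freq, trials=1000, seed=0):
    """Fraction of neutral Wright-Fisher runs in which a variant reaches fixation.

    pop_size and start_freq are illustrative assumptions, not estimates
    taken from the DEN-4 data.
    """
    rng = random.Random(seed)
    start_count = round(pop_size * start_freq)
    fixed = 0
    for _ in range(trials):
        count = start_count
        # Resample the whole population each generation until the variant
        # is either lost (count == 0) or fixed (count == pop_size).
        while 0 < count < pop_size:
            freq = count / pop_size
            count = sum(rng.random() < freq for _ in range(pop_size))
        if count == pop_size:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    rare = fixation_probability(30, 0.1)
    common = fixation_probability(30, 0.9)
    print(f"rare variant fixed in {rare:.2f} of runs")
    print(f"common variant fixed in {common:.2f} of runs")
```

Under pure drift the fixation probability of a neutral variant is expected to approximate its starting frequency, so a pattern in which rare lineages repeatedly replace common ones is not what this null model predicts.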
The apparent positive selection on the NS2A gene is
even more anomalous given the relatively strong constraints acting on other regions of the viral genome. In
particular, there was no convincing evidence that changes
in structural genes, the primary targets of specific
immunity, underlie the evolutionary shifts we observed
in DEN-4 after its re-emergence in 1986. Most positions
within the structural genes were invariant (table 2), and
very few of the nonsynonymous substitutions in these
regions occurred at internal nodes. Two amino acid
changes in E (positions 163 and 351; see fig. 2) defined
the DEN-4 that re-emerged in the late 1980s after 3 years
of undetectable transmission, both occurring within well-characterized structural epitope domains (summarized in
Roehrig [1997]). The E protein, which enables host cell
binding and entry and provides a target for the host immune
response (Roehrig 1997), is the functional analog of
influenza A’s hemagglutinin (HA), which, in contrast,
appears to be under strong antigenic selection (Bush et al.
1999). In dengue virus, constraints on the E gene may be
attributable to its two-host life cycle and resultant multi-cell type tropism (Beaty, Trent, and Roehrig 1988; Strauss
and Strauss 1988), so that rates of nucleotide substitution
are lower than those seen in many other RNA viruses
(Weaver, Rico-Hesse, and Scott 1992; Jenkins et al. 2002).
In addition, positive selection is less likely to occur
because of intrinsic negative fitness trade-offs (Woelk and
Holmes 2002). Indeed, substitution patterns across the four
gene regions examined here are consistent with a genome
under stabilizing selection, with synonymous changes
greatly outnumbering nonsynonymous changes. Against
this conservative background, the amino acid changes in
NS2A that distinguish the 1998 virus samples appear even
more conspicuous, and natural selection on nonstructural
genes has been described for other viruses and correlated
with epidemic outbreaks (Knowles et al. 2001).
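The contrast between synonymous and nonsynonymous changes described above can be made concrete with a short sketch that classifies codon differences between two aligned, in-frame coding sequences using the standard genetic code. It is only a raw count of differing codons and is not the likelihood-based codon analysis used in this study. The function name `count_changes` and the toy sequences are hypothetical, and sequences are assumed to be written in the cDNA alphabet (T rather than U).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be aligned, in frame, and free of gaps.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        # Same amino acid means a silent change; otherwise a replacement.
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # Hypothetical toy sequences, not DEN-4 data.
    print(count_changes("ATGCTTGGA", "ATGCTCGCA"))
```

In the toy pair, CTT to CTC leaves leucine unchanged (synonymous) while GGA to GCA replaces glycine with alanine (nonsynonymous). A formal test would also need to account for the number of synonymous and nonsynonymous sites and for multiple changes within a codon.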
Aside from epidemiologic evidence, the phenotypic
traits targeted by natural selection involving NS2A are
unclear because we know so little about the gene’s
function. Dengue viruses in Puerto Rico may be under
particularly intense selection to improve replication rate,
survival, and, ultimately, transmission rate, because the
only vector present, urban-specialist A. aegypti, is
relatively inefficient and requires high viral titers to
acquire infection (up to 10^6 infectious units/ml blood in
laboratory studies [Gubler 1987; Kuno 1997]). Puerto Rico
also lacks potential reservoir (primate) hosts, and its vector
exhibits extremely low levels of vertical transmission, such
that the disease must cycle directly between mosquitoes
and humans to persist (Gubler 1987, 1998). Alternatively,
the selection could reflect pressure exerted by the
human immune system through
cytotoxic T-lymphocytes (CTLs). Epitopes that elicit
human T-cell responses ranging from serotype-specific to
cross-reactive have been identified throughout the nonstructural regions of the dengue genome (Loke et al.
2001), and phylogenetic evidence for positive selection at
or near T-cell epitopes has been noted previously (Twiddy,
Woelk, and Holmes 2002). The function of NS2A has been
associated with viral replication (Falgout and Markoff
1995; Mackenzie et al. 1998) and the mediation of host
immune interactions via NS1 (Rothman et al. 1993;
Mathew et al. 1998; Jacobs et al. 2000). As the three
amino acid substitutions in NS2A that define the 1998
cluster were all highly nonconservative changes from
hydrophobic, non-polar residues to polar, uncharged
amino acids, they would at the very least change the 3-dimensional structure of the NS2A protein. To fully
determine the repercussions of observed NS2A modifications on viral extended phenotype, future studies must
endeavor to characterize the NS2A protein’s structure and
function, and to survey this gene in phylogenetic studies of
epidemic dengue.
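The notion of a nonconservative substitution used above (a change from a hydrophobic, non-polar residue to a polar, uncharged one) can be sketched as a simple lookup by physicochemical class. The grouping below is one common textbook classification and is an assumption here, since the placement of residues such as glycine, cysteine, and tyrosine varies between schemes. The example residues are hypothetical and are not the NS2A substitutions observed in this study.

```python
# One common physicochemical grouping of the 20 amino acids (an assumption;
# other schemes place G, C, and Y differently).
RESIDUE_CLASSES = {
    "nonpolar": set("AVLIMFWPG"),
    "polar_uncharged": set("STNQYC"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def residue_class(amino_acid):
    """Return the class name for a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if amino_acid.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {amino_acid!r}")


def is_nonconservative(old, new):
    """True when a substitution moves a residue between classes."""
    return residue_class(old) != residue_class(new)


if __name__ == "__main__":
    # Hypothetical substitutions for illustration only.
    for old, new in [("L", "S"), ("L", "I")]:
        label = "nonconservative" if is_nonconservative(old, new) else "conservative"
        print(f"{old} -> {new}: {label}")
```

A class change of this kind is only a coarse indicator. Whether a given substitution alters protein structure or function depends on its position and context, which is why structural and functional characterization of NS2A is needed.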
Acknowledgments
We thank M. Worobey for assistance with recombination analyses, and J. J. Bull, K. A. Hanley, and D. D.
Kapan for invaluable comments on the manuscript. Some
of the data were acquired in partial fulfillment of M.C.’s
Master’s degree at the Department of Microbiology and
Medical Zoology, University of Puerto Rico, and thanks
are therefore due to her advisory committee. This research
was supported by the National Institutes of Health (USA)
through a research project grant and the Research Centers
in Minority Institutions program, and by The Royal
Society (UK).
Literature Cited
Beaty, B. J., D. W. Trent, and J. T. Roehrig. 1988. Virus
variation and evolution. Pp. 59–85 in T. P. Monath, ed. The
arboviruses: epidemiology and ecology, Vol. 1. CRC Press,
Boca Raton, Fla.
Bush, R. M., C. A. Bender, K. Subbarao, N. J. Cox, and W. M.
Fitch. 1999. Predicting the evolution of human influenza A.
Science 286:1921–1925.
Dietz, V., D. J. Gubler, S. Ortiz, G. Kuno, A. Casta-Velez, G. E.
Sather, and I. Gomez. 1996. The 1986 dengue and dengue
hemorrhagic fever epidemic in Puerto Rico: epidemiologic
and clinical observations. P. R. Health Sci. J. 15:201–210.
Domingo, E., and J. J. Holland. 1997. RNA virus mutations for
fitness and survival. Annu. Rev. Microbiol. 51:151–178.
Drake, J. W., and J. J. Holland. 1999. Mutation rates among
RNA viruses. Proc. Natl. Acad. Sci. USA 96:13910–13913.
Falgout, B., and L. Markoff. 1995. Evidence that Flavivirus NS1-NS2A cleavage is mediated by a membrane-bound host
protease in the endoplasmic reticulum. J. Virol. 69:7232–
7243.
Foster, J. E., S. N. Bennett, H. Vaughan, V. Vorndam, W. O.
McMillan, and C. V. F. Carrington. 2003. Molecular
evolution and phylogeny of dengue type 4 virus in the
Caribbean. Virology 306:126–134.
Gubler, D. J. 1987. Current research on dengue. Pp. 37–56 in
K. F. Harris, ed. Current topics in vector research, Vol. 3.
Springer-Verlag, New York.
———. 1998. Dengue and dengue hemorrhagic fever. Clin.
Microbiol. Rev. 11:480–496.
Gubler, D. J., D. Reed, L. Rosen, and J. R. Hitchcock, Jr. 1978.
Epidemiologic, clinical, and virologic observations on dengue
in the Kingdom of Tonga. Am. J. Trop. Med. Hyg. 27:581–
589.
Halstead, S. B. 1988. Pathogenesis of dengue: challenges to
molecular biology. Science 239:476–481.
Hatta, M., P. Gao, P. Halfmann, and Y. Kawaoka. 2001.
Molecular basis for high virulence of Hong Kong H5N1
influenza A viruses. Science 293:1840–1842.
Holmes, E. C. 1998. Molecular epidemiology of dengue virus—
the time for big science. Trop. Med. Int. Health 3:855–856.
Holmes, E. C., L. M. Bartley, and G. P. Garnett. 1998. The
emergence of dengue: past, present and future. Pp. 301–325
in R. M. Krause, ed. Emerging infections, Academic Press,
New York.
Holmes, E. C., and S. S. Burch. 2000. The causes and
consequences of genetic variation in dengue virus. Trends
Microbiol. 8:74–77.
Holmes, E. C., M. Worobey, and A. Rambaut. 1999. Phylogenetic evidence for recombination in dengue virus. Mol. Biol.
Evol. 16:405–409.
Ishiko, H., Y. Shimada, M. Yonaha, O. Hashimoto, A. Hayashi,
K. Sakae, and N. Takeda. 2002. Molecular diagnosis of
human enteroviruses by phylogeny-based classification by use
of the VP4 sequence. J. Infect. Dis. 185:744–754.
Jacobs, M. G., P. J. Robinson, C. Bletchly, J. M. Mackenzie, and
P. R. Young. 2000. Dengue virus nonstructural protein 1 is
expressed in a glycosyl-phosphatidylinositol-linked form that
is capable of signal transduction. FASEB J. 14:1603–1610.
Jenkins, G. M., A. Rambaut, O. G. Pybus, and E. C. Holmes.
2002. Rates of molecular evolution in RNA viruses:
a quantitative phylogenetic analysis. J. Mol. Evol. 54:156–
165.
Knowles, N., P. Davies, T. Henry, V. O’Donnell, J. M. Pacheco,
and P. Mason. 2001. Emergence in Asia of foot and mouth
disease viruses with altered host range: characterization of
alteration in the 3A protein. J. Virol. 75:1551–1556.
Kuhner, M. K., J. Yamato, and J. Felsenstein. 1998. Maximum
likelihood estimation of population growth rates based on the
coalescent. Genetics 149:429–434.
Kuno, G. 1997. Factors influencing the transmission of dengue
viruses. Pp. 61–88 in D. J. Gubler, and G. Kuno, eds. Dengue
and dengue hemorrhagic fever. CAB International, New York.
Lanciotti, R. S., D. J. Gubler, and D. W. Trent. 1997. Molecular
evolution and phylogeny of dengue-4 viruses. J. Gen. Virol.
78:2279–2286.
Lanciotti, R. S., J. G. Lewis, D. J. Gubler, and D. W. Trent. 1994.
Molecular evolution and epidemiology of dengue-3 viruses.
J. Gen. Virol. 75:65–75.
Leitmeyer, K. C., D. W. Vaughn, D. M. Watts, R. Salas, I.
Villalobos de Chacon, C. Ramos, and R. Rico-Hesse. 1999.
Dengue virus structural differences that correlate with
pathogenesis. J. Virol. 73:4738–4747.
Lewis, J. A., G-J. Chang, R. S. Lanciotti, R. M. Kinney, L. W.
Mayer, and D. W. Trent. 1993. Phylogenetic relationships of
dengue-2 viruses. Virology 197:216–224.
Loke, H., D. B. Bethell, C. X. T. Phuong, M. Dung, J. Schneider,
N. J. White, N. P. Day, J. Farrar, and A. V. S. Hill. 2001.
Strong HLA class-I restricted T cell responses in dengue
hemorrhagic fever: a double-edged sword. J. Infect. Dis.
184:1369–1373.
Mackenzie, J. M., A. A. Khromykh, M. K. Jones, and E. G.
Westaway. 1998. Subcellular localization and some biochemical properties of the flavivirus Kunjin nonstructural
proteins NS2A and NS4A. Virology 245:203–215.
Mackow, E., Y. Makino, B. T. Zhao, Y. M. Zhang, L. Markoff,
A. Buckler-White, M. Guiler, R. Chanock, and C. J. Lai.
1987. The nucleotide sequence of dengue type 4 virus:
analysis of genes coding for nonstructural proteins. Virology
159:217–228.
Manzin, A., L. Solforosi, M. Debiaggi, F. Zara, E. Tanzi, L.
Romano, A. R. Zanetti, and M. Clementi. 2000. Dominant
role of host selective pressure in driving hepatitis C virus
evolution in perinatal infection. J. Virol. 74:4327–4334.
Mathew, A., I. Kurane, S. Green, H. A. F. Stephens, D. W.
Vaughn, S. Kalayanarooj, S. Suntayakorn, F. A. Ennis, and
A. L. Rothman. 1998. Predominance of HLA-restricted CTL
responses to serotype crossreactive epitopes on nonstructural
proteins after natural dengue virus infections. J. Virol.
72:3999–4004.
Nichol, S. T., J. E. Rowe, and W. M. Fitch. 1993. Punctuated
equilibrium and positive Darwinian evolution in vesicular
stomatitis virus. Proc. Natl. Acad. Sci. USA 90:10424–
10428.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the
model of DNA substitution. Bioinformatics 14:817–818.
Rico-Hesse, R. 1990. Molecular evolution and distribution of
dengue viruses type 1 and 2 in nature. Virology 174:479–493.
Rico-Hesse, R., L. M. Harrison, A. Nisalak, D. W. Vaughn, S.
Kalayanarooj, S. Greene, A. L. Rothman, and F. A. Ennis.
1998. Molecular evolution of Dengue type 2 virus in
Thailand. Am. J. Trop. Med. Hyg. 58:96–101.
Rico-Hesse, R., L. M. Harrison, R. A. Salas, D. Tovar, A.
Nisalak, C. Ramos, J. Boshell, M. T. de Mesa, R. M.
Nogueira, and A. T. da Rosa. 1997. Origins of dengue type 2
viruses associated with increased pathogenicity in the
Americas. Virology 230:244–251.
Roehrig, J. T. 1997. Immunochemistry of dengue viruses. Pp.
199–219 in D. J. Gubler, and G. Kuno, eds. Dengue and
dengue hemorrhagic fever. CAB International, New York.
Rothman, A. L., and F. A. Ennis. 1999. Immunopathogenesis of
dengue hemorrhagic fever. Virology 257:1–6.
Rothman, A. L., I. Kurane, C. J. Lai, M. Bray, B. Falgout, R.
Men, and F. A. Ennis. 1993. Dengue virus protein recognition
by virus-specific murine CD8+ cytotoxic T lymphocytes.
J. Virol. 67:801–806.
Santti, J., H. Harvala, L. Kinnunen, and T. Hyypiä. 2000.
Molecular epidemiology and evolution of coxsackievirus A9.
J. Gen. Virol. 81:1361–1372.
Singh, U. B., A. Maitra, S. Broor, A. Rai, S. T. Pasha, and P.
Seth. 1999. Partial nucleotide sequencing and molecular
evolution of epidemic causing dengue 2 strains. J. Infect. Dis.
180:959–965.
Strauss, J. H., and E. G. Strauss. 1988. Evolution of RNA
viruses. Annu. Rev. Microbiol. 42:657–683.
Swofford, D. L. 2001. PAUP*: phylogenetic analysis using
parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Thein, S., M. M. Aung, T. N. Shwe, M. Aye, Z. Aung, K. Aye,
K. M. Aye, and J. Aaskov. 1997. Risk factors in dengue shock
syndrome. Am. J. Trop. Med. Hyg. 56:566–572.
Tolou, H. J. G., P. Couissinier-Paris, J.-P. Durand, V. Mercier,
J.-J. de Pina, P. de Micco, F. Billoir, R. N. Charrel, and
X. de Lamballerie. 2001. Evidence for recombination in
natural populations of dengue virus type 1 based on the
analysis of complete genome sequences. J. Gen. Virol.
82:1283–1290.
Twiddy, S. S., J. F. Farrar, N. V. Chau, B. Wills, E. A. Gould, T.
Gritsun, G. Lloyd, and E. C. Holmes. 2002. Phylogenetic
relationships and differential selection pressures among
genotypes of dengue-2 virus. Virology 298:63–72.
Twiddy, S. S., and E. C. Holmes. 2003. The extent of
homologous recombination in the genus Flavivirus. J. Gen.
Virol. 84:429–440.
Twiddy, S. S., E. C. Holmes, and A. Rambaut. 2003. Inferring
the rate and time-scale of dengue virus evolution. Mol. Biol.
Evol. 20:122–129.
Twiddy, S. S., C. H. Woelk, and E. C. Holmes. 2002.
Phylogenetic evidence for adaptive evolution of dengue
viruses in nature. J. Gen. Virol. 83:1679–1689.
Uzcategui, N. Y., D. Camacho, G. Comach, E. C. Holmes, and
E. A. Gould. 2001. The molecular epidemiology of Dengue-2
virus in Venezuela: evidence for in situ viral evolution and
recombination. J. Gen. Virol. 82:2945–2953.
Wang, E., H. Ni, X. Renling, A. D. T. Barrett, S. J. Watowich,
D. J. Gubler, and S. C. Weaver. 2000. Evolutionary relationships of endemic/epidemic and sylvatic dengue viruses. J.
Virol. 74:3227–3234.
Weaver, S. C., R. Rico-Hesse, and T. W. Scott. 1992. Genetic
diversity and slow rates of evolution in new-world alphaviruses. Curr. Top. Microbiol. Immunol. 176:99–117.
Woelk, C. H., and E. C. Holmes. 2002. Reduced positive
selection in vector-borne RNA viruses. Mol. Biol. Evol.
19:2333–2336.
World Health Organization (WHO). 1999. Strengthening implementation of the global strategy for dengue fever/dengue
haemorrhagic fever prevention and control: report of the
informal consultation, WHO, Geneva, 18–20 October 1999
(WHO Report WHO/CDS/(DEN)/IC/2000.1;
www.who.int/emc-documents/dengue/whocdsdenic20001c.html).
Worobey, M., A. Rambaut, and E. C. Holmes. 1999. Widespread
intra-serotype recombination in natural populations of dengue
virus. Proc. Natl. Acad. Sci. USA 96:7352–7357.
Yang, Z. 1997. PAML, a program package for phylogenetic
analysis by maximum likelihood. Comput. Appl. Biosci.
13:555–556.
Yang, Z., R. Nielsen, N. Goldman, and A.-M. K. Pedersen. 2000.
Codon substitution models for heterogeneous selection
pressure at amino acid sites. Genetics 155:431–449.
Zanotto, P. M. de A., E. G. Kallas, R. F. de Souza, and E. C.
Holmes. 1999. Genealogical evidence for positive selection in
the nef gene of HIV-1. Genetics 153:1077–1089.
Zhao, B., E. Mackow, A. Buckler-White, L. Markoff, R. M.
Chanock, C. J. Lai, and Y. Makino. 1986. Cloning full-length
dengue type 4 viral DNA sequences: analysis of genes coding
for structural proteins. Virology 155:77–88.
Keith Crandall, Associate Editor
Accepted May 25, 2003