Selection-Driven Evolution of Emergent Dengue Virus

Shannon N. Bennett,* Edward C. Holmes,† Maritza Chirivella,‡§ Dania M. Rodriguez,* Manuela Beltran,§ Vance Vorndam,§ Duane J. Gubler,‖ and W. Owen McMillan*

*Department of Biology, University of Puerto Rico–Rio Piedras, San Juan, Puerto Rico; †Department of Zoology, University of Oxford, Oxford, England; ‡Department of Microbiology and Medical Zoology, University of Puerto Rico–Ciencias Medicas, San Juan, Puerto Rico; §Centers for Disease Control and Prevention, San Juan Branch, San Juan, Puerto Rico; ‖Centers for Disease Control and Prevention, Fort Collins, Colorado

In the last four decades the incidence of dengue fever has increased 30-fold worldwide, and over half the world’s population is now threatened with infection from one or more of four co-circulating viral serotypes (DEN-1 through DEN-4). To determine the role of viral molecular evolution in emergent disease dynamics, we sequenced 40% of the genome of 82 DEN-4 isolates collected from Puerto Rico over the 20 years since the onset of endemic dengue on the island. Isolates were derived from years with varying levels of DEN-4 prevalence. Over our sampling period there were marked evolutionary shifts in DEN-4 viral populations circulating in Puerto Rico; viral lineages were temporally clustered and the most common genotype at a particular sampling time often arose from a previously rare lineage. Expressed changes in structural genes did not appear to drive this lineage turnover, even though these regions include primary determinants of viral antigenic properties. Instead, recent dengue evolution can be attributed in part to positive selection on the nonstructural gene 2A (NS2A), whose functions may include replication efficiency and antigenicity. During the latest and most severe DEN-4 epidemic in Puerto Rico, in 1998, viruses were distinguished by three amino acid changes in NS2A that were fixed far faster than expected by drift alone.
Our study therefore demonstrates viral genetic turnover within a focal population and the potential importance of adaptive evolution in viral epidemic expansion.

Key words: dengue virus, positive selection, epidemiology, phylogeny, maximum likelihood.
E-mail: [email protected].
Mol. Biol. Evol. 20(10):1650–1658. 2003
DOI: 10.1093/molbev/msg182
Molecular Biology and Evolution, Vol. 20, No. 10, © Society for Molecular Biology and Evolution 2003; all rights reserved.

Introduction

RNA viruses comprise one of the fastest growing categories of emergent diseases (Domingo and Holland 1997). Although they exhibit remarkable genetic diversity, attributable to intrinsically high rates of mutation and replication as well as large population sizes (Domingo and Holland 1997; Drake and Holland 1999), the role of viral evolution in determining disease dynamics has only been described in a few cases (for example, Bush et al. 1999; Zanotto et al. 1999; Manzin et al. 2000; Hatta et al. 2001). We examine evolutionary change in dengue (DEN), an acute mosquito-borne RNA virus (genus Flavivirus), over a 20-year period that has marked the emergence of dengue in Puerto Rico (PR), a dense urban population whose growth rate rivals that of Asian population centers. The virus, which causes dengue fever (DF), and the more severe dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS), consists of four antigenically distinct serotypes, DEN-1 through DEN-4, that are evolutionarily derived from at least three independent introductions into humans from wild primates in Africa and Southeast Asia (Wang et al. 2000). There is also abundant genetic diversity within each serotype, in the guise of phylogenetically distinct clusters of sequences often referred to as “genotypes” (reviewed in Holmes and Burch 2000).

The ongoing expansion of dengue throughout Asia and the South Pacific is being recapitulated in the Americas (Gubler 1998). Before the 1950s, people were typically exposed to a single strain (hypoendemicity), and epidemics were rare and self-limiting (Gubler 1998). However, geographic expansion of the primary mosquito vector (Aedes aegypti), increasing host densities, particularly in urban centers, and global travel have substantially altered dengue’s epidemiologic landscape (Gubler 1998). Now dengue annually infects an estimated 50 million to 100 million people worldwide (WHO 1999), many of whom are exposed to two or more co-circulating DEN serotypes (hyperendemicity), resulting in frequent large-scale epidemics and more frequent severe disease (Gubler 1998).

Determining the contributing factors to the emergence of dengue as a global pandemic, particularly the increasing incidence of DHF and DSS, has proven difficult both because there are no satisfactory models or in vitro correlates with which to study disease transmissibility or pathogenicity directly (Rothman and Ennis 1999), and because most molecular epidemiologic studies to date have had limited scope (Holmes 1998). Associations have been demonstrated between severe manifestations of dengue (DHF/DSS) and both host infection history and viral genotype. Most notably, secondary infections with heterologous serotypes are more likely to develop into DHF/DSS than primary infections (Halstead 1988; Thein et al. 1997; Gubler 1998) so that increasing hyperendemicity could account in part for the rise of DHF/DSS. However, there is also evidence that viral genotype may be a contributing factor in determining dengue disease. For example, attenuated and virulent strains of DEN-2 were first observed simultaneously in the Tonga epidemics of 1974/75 (Gubler et al. 1978), and the introduction of a genetically distinct Asian DEN-2 strain into the Americas has been associated with an increase in DHF/DSS (Rico-Hesse et al. 1997; Leitmeyer et al. 1999).
More tentatively, an analysis of selection pressures acting on dengue virus genomes suggested that genotypes of DEN-2 have selectively determined differences in transmissibility, in turn determining their ability to cause epidemics on a global scale (Twiddy et al. 2002).

FIG. 1. Incidence of dengue virus in Puerto Rico since 1981. Years included in this study are marked on the x-axis with a black bar; * denotes the sample from Dominica. The rise of dengue in Puerto Rico mirrors the onset of the dengue pandemic in the New World. Before WWII, epidemics in Puerto Rico were rare, but subsequent years have been marked by frequent epidemics and, since the 1980s, continuous hyperendemic transmission (Dietz et al. 1996; Gubler 1998). Of the dengue cases reported annually (solid black line, right axis), a subset is submitted to the CDC, isolated, and identified to serotype (hatched area, left axis, plotted against month/year of isolation). The proportions that were DEN-4 are shaded solid gray. Because of dengue’s variable etiology, it often goes unreported, and thus the number of recorded cases underrepresents the true number of dengue infections by up to an estimated factor of 50 to 100 (WHO 1999).

Puerto Rico provides an ideal natural laboratory to gather a detailed record of viral evolutionary change during disease expansion. The island has a large urban population with high mosquito vector densities and, like many tropical regions, has experienced nearly 20 years of dengue epidemics that are becoming increasingly severe (Gubler 1998). Although dengue fever was recorded in Puerto Rico as early as 1915 (Dietz et al. 1996), continuous transmission of all four serotypes has only occurred since the 1980s (Dietz et al. 1996; Gubler 1998).
The first epidemic in Puerto Rico, consisting primarily of DEN-4, was reported in 1981/82, followed by another DEN-4–dominated outbreak in 1986, this one marked by high incidences of DHF/DSS (Dietz et al. 1996; fig. 1). DHF/DSS cases have occurred periodically since the 1980s, reaching record levels in the latest DEN-4 epidemic in Puerto Rico in 1998.

Taking advantage of Puerto Rico’s turbulent epidemiological record, we use a longitudinal phylogenetic approach to recover the history of evolutionary change in DEN-4 during disease emergence. In the absence of experimental models, phylogenetic analyses within a focal population provide the only method with which to correlate viral genetic change with epidemic behavior. Herein we examine viral evolution in DEN-4 isolates collected from Puerto Rico since the onset of epidemic dengue on the island. We sample nearly 40% of the viral genome, including all the structural genes known to be important in viral packaging and host cell entry, as well as a subset of nonstructural genes, from 82 viral isolates collected over a 20-year period. Thus we expand current knowledge of dengue molecular evolution to include genes never before systematically surveyed on this scale (Holmes 1998). We assess the role of viral molecular evolution in disease dynamics by testing for a viral adaptive basis to the changing patterns of DEN-4 incidence in Puerto Rico. Hence, we ascertain for the first time the role of natural selection in DEN-4 evolution in the context of a well-characterized pattern of epidemic outbreaks.

Materials and Methods

We examined substitution patterns in 82 DEN-4 isolates from Puerto Rico and surrounding regions since the disease was established in 1981/82 (Dietz et al. 1996). We subsampled viral isolates from the U.S.
Centers for Disease Control and Prevention (CDC) sample bank that had been collected in Puerto Rico during the years 1982 (n = 14), 1986/87 (n = 19), 1992 (n = 15), 1994 (n = 14), and 1998 (n = 13) to represent both endemic and epidemic disease conditions (fig. 1). With 13 to 19 isolates per year-group, we have a 75%–86% chance of sampling rare alleles (defined as existing at a frequency of 10% in the population) at least once. In addition to 75 Puerto Rican isolates, we included seven isolates sampled from outside Puerto Rico during the same period: three originating within the Caribbean basin, two from Central America, and one from Ecuador. Caribbean basin samples included a 1981 sample from Dominica, Lesser Antilles, believed to represent the introduction of Asian DEN-4 into the Caribbean basin. All samples have low passage histories, reducing the risk of artificial selection in vitro: only those samples derived from chronic (generally low) infections were first cultured in A6/C36 mosquito cells, for one or, at most, two passages, prior to RNA extraction. To further eliminate potential biases due to artificial selection, samples were not processed in temporal (year) order. We extracted sample RNA using QIAamp Viral RNA Mini kits (Qiagen GmbH). For each isolate we amplified, using reverse-transcriptase polymerase chain reaction (RT-PCR), gene regions amounting to 40% of the viral genome (4,016 bp of an 11 kbp genome) and including both 5′ and 3′ ends (see table 1 for primer sequences).

Table 1
Primers Designed to Amplify and Sequence DEN-4 Gene Regions

Label(a)  Sequence                      Function                    Gene/Gene Fragment
90U       5′ATCTCTGGAAAAATGAACCAACGAA   Amplification               Capsid/prMem
842L      5′ATAAGCCATAAATCCTGCCAAGAGC   Amplification               Capsid/prMem
138U      5′AATATGCTGAAACGCGAGAGAAACC   Sequencing                  Capsid/prMem
410L      5′TACGGTGGGAATCAAGCACAGCAA    Sequencing                  Capsid/prMem
518U      5′GACAACAGAGGGGATCAACAAATGC   Sequencing                  Capsid/prMem
736L      5′GCTCTTGTTTCCAATCCCATTCCTG   Sequencing                  Capsid/prMem
616U      5′CCGAACCTGAAGACATTGATTGCTG   Amplification               EnvA
1676L     5′TTCCAGCACTGTCACATCCTGTCTC   Amplification               EnvA
486U      5′CACGTATAAATGCCCCCTACTGGTC   Amplification               EnvA
1786L     5′GCTGTGTTTCTGCCATCTCTTTGTC   Amplification               EnvA
686U      5′GAGCGGAGAACGGAGACGAGAGAAG   Sequencing                  EnvA
1142U     5′AACTACGGCAACAAGATGTCCAACG   Sequencing                  EnvA
1181L     5′CTGTTGGTCCTGTTCCTCTTTCAGA   Sequencing                  EnvA
1603L     5′TGAACCTCTGATGTGTCTGCTCCTG   Sequencing                  EnvA
580U      5′ACCCAGAGCGGAGAACGGAGACGAG   Sequencing                  EnvA
803L      5′GGGGCGACCAGCATCATTAGGACAA   Sequencing                  EnvA
967U      5′GAACTGACTAAGACAACAGCCAAGG   Sequencing                  EnvA
1136L     5′AACAAGCCACAGCCATTGCCCCACC   Sequencing                  EnvA
1363U     5′CCGGACTATGGAGAACTAACACTCG   Sequencing                  EnvA
1658L     5′TGTCCTGCAAACATGTGATTTCCAT   Sequencing                  EnvA
1568U     5′GCAATGGTTTTTGAATCTGCCTCTT   Amplification               EnvB/NS1
2679L     5′CCTTCACATCCCCAGCCACTACAGT   Amplification               EnvB/NS1
1602U     5′GCAGGAGCAGACACATCAGAGGTTC   Sequencing                  EnvB/NS1
2114U     5′GAAAGGGAGTTCCATTGGCAAGATG   Sequencing                  EnvB/NS1
1985L     5′CAAAGGGGTGGATGAGATGATACGC   Sequencing                  EnvB/NS1
2519L     5′TCTCGCTGGGGACTCTGGTTGAAAT   Sequencing                  EnvB/NS1
3528U     5′TTTGTGGAAGAATGCTTGAGGAGAA   Amplification               NS2A
4225L     5′GCCAGAAGTAAGCCTCCTGCCACCA   Amplification               NS2A
3592U     5′CTCTTTGTGCTATCATCTTGGGAGG   Sequencing                  NS2A
4143L     5′AACCCACAGCCATTATGCCCTCGTT   Sequencing                  NS2A
7038U     5′CTAATGGGGCTTGGAAAAGGATGGC   Amplification               NS4B
7769L     5′TACAACTTCCCCTTTTGGCTTTACC   Amplification               NS4B
7106U     5′ATGCTATTCTCAAGTGAACCCAACA   Sequencing                  NS4B
7674L     5′CTTTCAGGGCAGACTTGGCTTCAGT   Sequencing                  NS4B
10133U    5′CACCTGGGCGAAGAACATTCACACG   Amplification / sequencing  3′NTR
10600L    5′CACCAATCCATCTTGCGGCGCTCTG   Amplification / sequencing  3′NTR
10612L    5′TTGGATCAACAACACCAATCCATCT   Amplification / sequencing  3′NTR
10620L    5′AGAACCTGTTGGATCAACAACACCA   Amplification / sequencing  3′NTR

(a) Number indicates genome nucleotide position according to Zhao et al. (1986): U for forward and L for reverse.
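The 75%–86% chance of detecting a rare allele quoted above is consistent with a simple binomial calculation, sketched here in Python. The sketch assumes that isolates are independent draws from the viral population and that a rare allele sits at exactly 10% frequency. The function name is hypothetical, and this is not necessarily the calculation the authors performed.

```python
def prob_sampled_at_least_once(n_isolates: int, allele_freq: float = 0.10) -> float:
    """Chance that an allele at `allele_freq` appears in at least one of
    `n_isolates` independently sampled isolates (assumed binomial sampling)."""
    if n_isolates < 0:
        raise ValueError("n_isolates must be non-negative")
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie in [0, 1]")
    # Complement of the probability that every isolate lacks the allele.
    return 1.0 - (1.0 - allele_freq) ** n_isolates


if __name__ == "__main__":
    # Smallest and largest year-group sample sizes given in the text.
    for n in (13, 19):
        print(f"n = {n}: {prob_sampled_at_least_once(n):.1%}")
```

Under these assumptions the smallest (n = 13) and largest (n = 19) year-groups give roughly 75% and 86%, matching the range stated in the text.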
Amplified regions included all the structural genes (capsid: C; membrane: M; and envelope: E), a subset of nonstructural genes (NS1, NS2A, and NS4B), and the noncoding 3′ NTR region. Amplifications were divided into separate reactions according to length of the target. Before sequencing, RT-PCR products were purified using Qiagen PCR purification kits (Qiagen GmbH). We sequenced both strands of the amplified products using forward and reverse primers (table 1) in standard dye-labeling reactions. Sequence data were collected on an ABI 377 slab-gel automated sequencer (Applied Biosystems), edited, and compiled with Sequencher 3.1.1 (Gene Codes) and aligned against reference sequences (GenBank number M14931; Zhao et al. 1986; Mackow et al. 1987) using Megalign’s clustal algorithm (version 3.1.7, Lasergene). We imported aligned sequences into PAUP* (Swofford 2001) for phylogenetic analysis.

Recombination, reported in all DEN serotypes (Worobey, Rambaut, and Holmes 1999; Tolou et al. 2001; Uzcategui et al. 2001; Twiddy and Holmes 2003), can lead to conflicts in phylogenetic trees. We searched for potential recombinants across the entire phylogeny by testing for topological incongruity among Neighbor-Joining (NJ) trees generated using a 500-base sliding window. The statistical support for recombination in these sequences, as well as the locations of the breakpoints, was determined using a maximum likelihood method (program LARD; Holmes, Worobey, and Rambaut 1999), and then maximum likelihood (ML) trees were constructed on either side of the breakpoints identified.

The evolutionary relationships among DEN-4 isolates were inferred using a ML method (PAUP* package, Swofford [2001]). In all cases trees were estimated using the best fitting model of nucleotide substitution identified by Modeltest 3.06 (Posada and Crandall 1998).
The model of DNA substitution that best described DEN-4 evolution in Puerto Rico (including the outgroup and six other foreign samples) was the general time-reversible model that includes six substitution rate parameters (A↔C = 2.0346, A↔G = 12.5935, A↔T = 1.7144, C↔G = 2.0608, C↔T = 31.0427, G↔T = 1), with 41.5% of sites variable and a gamma distribution of among-site rate variation (4 categories) with a shape parameter (α) of 1.020 (substitution model GTR + I + Γ). Phylogenies were generated under successive rounds of tree-bisection/reconnection (TBR) branch swapping, updating parameter estimates at each round. To assess the support for the phylogenetic groupings observed we undertook a bootstrap resampling analysis using 1,000 replicate Neighbor-Joining trees estimated under the ML substitution model determined above. Trees were rooted with the 1981 isolate from Dominica, the oldest sequence available.

We used two methods to assess the extent of adaptive evolution in DEN-4. First, we examined the relative rates of nonsynonymous (dN) and synonymous (dS) substitution across coding portions of the viral genome. To do this, we employed a ML approach to compare models of evolution that allow dN/dS to vary within genes or among lineages of the ML tree of the PR sequences (Yang et al. 2000). In particular, we compared models that allow for positive selection because they incorporate a class of codons where dN/dS can be greater than 1 (models M2, M3, M8) with those that specify neutral evolution because dN is constrained to be less than dS (models M0, M1, and M7). We also used the free ratio (FR) model that allows each branch of the tree to have a different dN/dS ratio. Models were compared using standard likelihood ratio tests. A Bayesian approach was used to identify those individual codons most likely subject to positive selection.
This approach calculates the posterior probabilities of dN/dS categories for each amino acid site so that sites with the highest probabilities of falling into dN/dS categories > 1 are most likely to have been under positive selection. All these analyses were undertaken using the CODEML program from the PAML package (Yang 1997).

We also employed a population genetic approach to test for adaptive evolution in dengue virus. According to standard theory, the average time to fixation of neutral mutations in a haploid population is ~2Ne generations. Consequently, if mutations have been fixed much faster than this, we can conclude that their substitution dynamics are dominated by positive selection rather than drift. To calculate 2Ne generations for DEN-4 in Puerto Rico, we estimated the parameter θ (= 2Neμ), the neutral mutation rate per site per generation (μ), and the viral generation time (g). θ was estimated from given sampling years (1994 and 1998) using a coalescent method (program Fluctuate; Kuhner, Yamato, and Felsenstein [1998]); the generation time of dengue virus was taken as 14 days comprising intrinsic (within human) and extrinsic (within mosquito) replication times of 7 days duration each (Holmes, Bartley, and Garnett 1998). Although direct estimates of μ are not available for dengue virus, a synonymous rate of 6.89 × 10⁻⁴ substitutions/site/year was recently estimated for DEN-4 (Twiddy, Holmes, and Rambaut 2003). Given a generation time of 14 days, this is equivalent to a μ of 2.64 × 10⁻⁵ mutations per site, per generation. Putatively positively selected amino acid changes were identified as those that fall on the internal branches of the tree that separate sampling times (for example, on the branch leading to the viral isolates sampled in 1998); at the population genetic level, mutations that are absent from an early time-point yet present in all sequences from a later time-point can be assumed to have gone to fixation over the course of the sampling period.
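The population genetic test described above reduces to a few lines of arithmetic, sketched here in Python. The yearly synonymous rate is converted to a per-generation μ, Ne is recovered from θ = 2Neμ, and the expected neutral fixation time is taken as 2Ne generations. The 365-day year is an assumption, and the function names are hypothetical. The θ inputs are the point estimates for 1994 and 1998 reported in the Results.

```python
DAYS_PER_YEAR = 365.0  # assumption: the text does not state the year length used


def per_generation_rate(rate_per_year: float, generation_days: float) -> float:
    """Convert a per-site yearly substitution rate into a per-generation rate."""
    return rate_per_year * generation_days / DAYS_PER_YEAR


def effective_population_size(theta: float, mu: float) -> float:
    """Solve theta = 2 * Ne * mu for Ne (haploid population)."""
    return theta / (2.0 * mu)


def neutral_fixation_time_days(ne: float, generation_days: float) -> float:
    """Expected time to fixation of a neutral mutation: about 2 * Ne generations."""
    return 2.0 * ne * generation_days


if __name__ == "__main__":
    generation_days = 14.0  # 7 days in the human host plus 7 days in the mosquito
    mu = per_generation_rate(6.89e-4, generation_days)
    print(f"mu per site per generation: {mu:.3e}")

    # Theta point estimates for the 1994 and 1998 samples.
    ne_values = [effective_population_size(theta, mu) for theta in (0.024, 0.027)]
    mean_ne = sum(ne_values) / len(ne_values)
    days = neutral_fixation_time_days(mean_ne, generation_days)
    print(f"mean Ne: {mean_ne:.0f}")
    print(f"neutral fixation time: {days:.0f} days (~{days / DAYS_PER_YEAR:.0f} years)")
```

Using the rounded mean Ne of 482 in `neutral_fixation_time_days` gives 13,496 days, about 37 years, the expected fixation time reported in the Results. Carrying unrounded intermediate values shifts the result only slightly.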
Sequences generated by this study can be accessed on GenBank according to accession numbers AY152036 through AY152363.

Results

We examined over 4,000 nucleotides from each of 82 DEN-4 isolates collected in Puerto Rico and surrounding regions over a 20-year period. This included isolates from (1) 1982, representing the first major outbreak of DEN-4 in Puerto Rico; (2) the second major dengue epidemic on the island in 1986 to 1987, marked by hyperendemic transmission, and 29 DHF/DSS cases including 3 deaths (Dietz et al. 1996); (3) two years (1992 and 1994) during which DEN-4 occurred at relatively low prevalence; and (4) the most recent DEN-4 epidemic in 1998 (fig. 1). The 1998 epidemic marked the first time in 12 years that DEN-4 again dominated the epidemiological landscape in PR (44% of all positively diagnosed dengue cases) and was one of the most severe on the island (396,000–792,000 infections estimated and a record 59 DHF cases reported, 2.5 standard deviations above the mean of 16.5; CDC data not shown). DEN-4 viruses circulating in the 1998 outbreak shared on average 98.5% sequence similarity with those from 1981/82. Over the entire study period, only 14% of all nucleotide sites experienced substitutions, of which 26% from translated regions (or 3.6% of all coding sites) resulted in amino acid substitutions.

Our ML phylogenetic analysis of DEN-4 in Puerto Rico revealed a pattern of evolution marked by strong temporal clustering of isolates by year of sampling (fig. 2). All early (1982) isolates from Puerto Rico were associated with the 1981 isolate from Dominica, and were 0.7% (range: 0.5% to 1%) different from the closest group of subsequent PR isolates from 1986/87.
Three of six other foreign isolates shared ancestors with this early introduction group as opposed to later PR isolates (El Salvador 1993, Ecuador 1994, and Mexico 1995, data not shown), reflecting the widespread distribution of the introduced Asian DEN-4 variant from 1981 (Gubler 1998; Foster et al. 2003). Since the initial epidemic in 1982, DEN-4 was virtually absent from Puerto Rico until the 1986 epidemic (fig. 1; Dietz et al. 1996). All viruses sampled in Puerto Rico during and after this re-emergence (1986 onward) fell into a single lineage defined by four silent nucleotide substitutions and one amino acid substitution in the envelope (E) gene (methionine to threonine, aa position 163; fig. 2). With the exception of a single 1994 isolate, two additional silent and two conservative amino acid substitutions (isoleucine to valine, envelope aa position 351; lysine to arginine, NS1 aa position 51) occurred in the formation of the re-emergent PR lineage.

Within this re-emergent lineage, sublineages were largely temporally ordered. For example, most of the 1987 isolates fell into a well-defined temporal cluster, distinguished by five silent changes across coding regions examined (gold in fig. 2). Similarly, major temporal clusters were formed by all 1992 (green in fig. 2), most 1994 (blue in fig. 2), and all 1998 isolates (red in fig. 2), respectively. The 1998 year group was defined by several silent changes concentrated in the E gene, and more notably three amino acid replacements in the nonstructural NS2A protein (isoleucine to valine, aa position 14; valine to threonine, aa position 54; and proline to serine, aa position 101). Although DEN-4 isolates grouped into temporal clusters, the dominant clade (that which included most of the isolates) from a particular year descended from older isolates that represented minor variants in the previous sampling period.
For example, the 1992 cluster did not descend from the major 1987 cluster, but from contemporaneous (1987) variants representing only 17% (3 out of 19) of the isolates sampled in 1987. Similarly, the 1998 cluster descended from a rare 1994 lineage represented by only 8% (2 out of 27) of the isolates sampled between 1992 and 1994. Indeed, only the major 1994 lineage was nested within the dominant lineage of the previous sampling period, 1992. This pattern of sequence differences among DEN-4 isolates from Puerto Rico indicates phylogenetic shifts in the population of variants between sampling periods. We refer to this numerical shift in the population of genotypes present in a given year away from the dominant genotypes of an earlier time as “lineage turnover,” since it implies a replacement of the most successful lineage from one sampling period to the next. Finally, we found no evidence for major shifts in topological position among gene regions indicative of recombination.

To determine whether positive selection has played a significant role in DEN-4 evolution and lineage turnover, we examined rates of nonsynonymous (dN) and synonymous (dS) substitution in individual viral genes using a ML method. Although eight potentially positively selected sites were identified (posterior probability P > 0.99) in the E, NS1, NS4B, and most notably the NS2A genes, where a small class of codons (0.9%) had a mean dN/dS ratio of ~4.6, in no case could a model of codon evolution allowing positive selection conclusively reject all competing neutral models (table 2; the results for all model comparisons are available from the authors on request). However, the evolution of the nonstructural gene NS2A was striking in that the branch leading to the 1998 cluster of sequences was distinguished exclusively by three nonconservative amino acid replacements in NS2A (14Ileu to Thr, 54Val to Thr, 101Pro to Ser; fig. 2), in the absence of any synonymous nucleotide changes.
This results in an infinitely large dN/dS ratio along this branch, suggestive of positive selection: mean dN/dS for all other internal branches of our phylogeny in NS2A was significantly lower (mean dN/dS = 0.038, P = 0.001, using absolute number nucleotide changes for observed and expected values). Moreover, these mutations appear to have been fixed far more quickly than if they were subject to genetic drift alone. Estimated values of θ (2Neμ) are 0.024 (range 0.014 to 0.045) and 0.027 (range 0.016 to 0.052) for the viruses sampled from years 1994 and 1998, respectively. Assuming a neutral mutation rate of 2.64 × 10⁻⁵ mutations per site, per generation, effective population sizes (Ne) were only 454 and 511 for 1994 and 1998, respectively. Taking the mean Ne value across these two sampling times (482), we obtain an expected fixation time under genetic drift of 13,496 days (482 × 14 days/generation × 2) or ~37 years. However, the observed fixation time for these mutants is a maximum of 6 years; as these mutations were first detected in 1994, we assume that they appeared sometime between the 1992 and 1994 epidemics, giving a maximum of 6 years time difference to the 1998 strains.

Discussion

This longitudinal phylogenetic study of DEN-4 in a focal host population examined evolutionary changes during viral epidemic expansion. Our most striking observation was that the evolutionary history of DEN-4 in Puerto Rico was characterized by the replacement of lineages between epidemic years: most isolates from a given year were closely related, but turnover of the common variant occurred between sampling periods. This pattern of lineage turnover is similar to that seen in some other acute RNA viruses. For example, in coxsackie-A virus temporally organized lineages, regardless of geographic origin, are equally unrelated to each other (Santti et al. 2000; Ishiko et al. 2002) suggestive of lineage turnover fueled by virus exchange between spatially distinct populations.
FIG. 2. Maximum likelihood tree based on 3,543 bp sequences (coding regions) from 75 isolates of DEN-4 from Puerto Rico and one from Dominica (the outgroup sequence). Six other foreign isolates have been omitted from the figure for simplicity. The same topology was obtained when phylogenies were constructed including non-coding sequence data (4,016 bp per isolate). Branches are color-coded by year of sample isolation. Bootstrap support values, shown at nodes, were generated by using 1,000 replicate Neighbor-Joining trees reconstructed under the best-fit model of nucleotide evolution. Three amino acid changes, in envelope (E) and NS1 (N1) genes, that define the post-introduction Puerto Rican lineage, and three amino acid changes in the positively selected NS2A (2A) gene that define the 1998 clade, are marked with black bars and the amino acid position within their respective genes.

Table 2
Maximum Ratio of Nonsynonymous to Synonymous Substitutions for Each DEN-4 Gene Region Examined in This Study

                    dN/dS(a)
Gene                Max. dN/dS(b)  Proportion of Codons(c)  P(d)
Capsid / membrane   0.822          0.167                    0.997
Envelope / NS1      2.110          0.017                    0.157
NS2A                4.574          0.009                    0.725
NS4B                1.851          0.014                    0.937

(a) Values given for the M3 model of codon evolution that allows three classes of dN/dS per gene sequence alignment, all of which are estimated from the data.
(b) Highest dN/dS for a set of codons estimated under the M3 model.
(c) Proportion of codons with the maximum dN/dS value.
(d) Significance value obtained from a likelihood ratio test involving M3 and the neutral codon model M1 (which allows two classes of dN/dS, 0 and 1).

Phylogenetic evidence also suggests that vesicular stomatitis virus in the United States and Mexico has experienced lineage shifts since the early 1980s, following its geographical spread in the Americas (Nichol, Rowe, and Fitch 1993).
Finally, there is some evidence for lineage turnover in human influenza A virus, although phylogenetic trees from this virus tend to have a more regular temporal structure, most likely reflecting the continual selection pressure exerted by neutralizing antibodies (Bush et al. 1999). There are several non-mutually exclusive explanations for the lineage turnover observed in DEN-4 evolution in Puerto Rico over the last 20 years, aside from incomplete sampling. Novel lineages could arise and proliferate in a population through multiple re-introductions, genetic drift, and/or selection. However, although introductions from other DEN-4 populations may provide a source of variation, evidence suggests that DEN-4 in the Caribbean is characterized by local evolution interrupted occasionally by gene flow (Foster et al. 2003). There was also no evidence that microgeographic population structure within Puerto Rico generated the observed pattern, as virus samples were obtained from similar geographic regions in all cases (data not shown). In addition, the distinct and persistent pattern of lineage turnover is difficult to explain by random sampling processes alone, because we would expect common genotypes to become fixed by genetic drift more often than rare ones. Indeed, the stochastic nature of the dengue virus life-cycle should favor common variants: genetic bottlenecks occur at every mosquito feeding event, along with seasonal reductions in vector populations (Gubler 1987), and annual variation in the abundance of susceptible human hosts. Instead, the dominant Puerto Rican lineage of a given year twice descended from earlier rare genotypes, a pattern that suggests that much of the lineage turnover is driven by selection on viral genotype. 
In support of this hypothesis, there was an increase in the rate of nonsynonymous substitution (in the absence of any silent changes in NS2A) on the lineage leading to the 1998 epidemic, and these amino acid changes were fixed far more quickly than expected by genetic drift. Moreover, our population genetic estimations for the fixation time of the NS2A mutants are conservative in that these changes may have been fixed much faster than the 6 years separating the 1992 and 1998 samples, and our estimates of Ne may be artificially low if positive selection has purged genetic diversity. Consequently, adaptive evolution in the NS2A gene may have triggered the 1998 epidemic in Puerto Rico, and DEN-4 genotypes bearing these NS2A modifications were also associated with contemporaneous epidemics throughout the Greater and Lesser Antilles (Foster et al. 2003). Conversely, a similar association between amino acid changes and lineage turnover was not observed between 1987 and 1992, where neither clade was defined by amino acid substitutions. In this case, lineage turnover may have resulted from drift-sensitive population bottlenecks, inter-island extinction/recolonization, or selection on other parts of the genome not examined in this study. Although we examined many more nucleotides than previous studies (e.g., Rico-Hesse 1990; Lewis et al. 1993; Lanciotti et al. 1994; Lanciotti, Gubler, and Trent 1997; Rico-Hesse et al. 1997, 1998; Singh et al. 1999; Uzcategui et al. 2001; Twiddy et al. 2002), 60% of the dengue genome was not surveyed, including genes known to be important in virus replication, such as NS5 (Leitmeyer et al. 1999), and virus antigenicity, such as NS1 (Mathew et al. 1998; Jacobs et al. 2000). Because selection is apparently restricted to very few sites, a complete appreciation of the forces driving genetic change in DEN-4 will ultimately require the analysis of full genome sequences. 
The apparent positive selection on the NS2A gene is even more anomalous given the relatively strong constraints acting on other regions of the viral genome. In particular, there was no convincing evidence that changes in structural genes, the primary targets of specific immunity, underlie the evolutionary shifts we observed in DEN-4 after its re-emergence in 1986. Most positions within the structural genes were invariant (table 2), and very few of the nonsynonymous substitutions in these regions occurred at internal nodes. Two amino acid changes in E (positions 163 and 351; see fig. 2) defined the DEN-4 that re-emerged in the late 1980s after 3 years of undetectable transmission, both occurring within well-characterized structural epitope domains (summarized in Roehrig [1997]). The E protein, which enables host cell binding and entry, providing a target for the host immune response (Roehrig 1997), is the functional analog of influenza A’s hemagglutinin (HA) gene, which, in contrast, appears to be under strong antigenic selection (Bush et al. 1999). In dengue virus, constraints on the E gene may be attributable to its two-host life cycle and resultant multicell type tropism (Beaty, Trent, and Roehrig 1988; Strauss and Strauss 1988), so that rates of nucleotide substitution are lower than those seen in many other RNA viruses (Weaver, Rico-Hesse, and Scott 1992; Jenkins et al. 2002). In addition, positive selection is less likely to occur because of intrinsic negative fitness trade-offs (Woelk and Holmes 2002). Indeed, substitution patterns across the four gene regions examined here are consistent with a genome under stabilizing selection, with synonymous changes greatly outnumbering nonsynonymous changes. 
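The distinction between synonymous and nonsynonymous changes can be made concrete with a short sketch that classifies codon differences between two aligned coding sequences using the standard genetic code. It is a simple codon-level tally, not the likelihood-based codon models used for formal selection analyses, and the two sequences are invented for illustration rather than taken from DEN-4.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_changes(ancestor, descendant):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    if len(ancestor) != len(descendant) or len(ancestor) % 3 != 0:
        raise ValueError("sequences must be aligned and in frame")
    syn = nonsyn = 0
    for i in range(0, len(ancestor), 3):
        a, d = ancestor[i:i + 3], descendant[i:i + 3]
        if a == d:
            continue  # identical codon, nothing to classify
        if CODON_TABLE[a] == CODON_TABLE[d]:
            syn += 1      # same amino acid: silent change
        else:
            nonsyn += 1   # different amino acid: expressed change
    return syn, nonsyn


# Hypothetical sequences: Leu-Ala-Ile versus Leu-Ala-Thr.
print(count_changes("CTGGCAATT", "CTAGCAACT"))  # -> (1, 1)
```

In the toy example the CTG to CTA change is silent, whereas ATT to ACT replaces a hydrophobic isoleucine with a polar threonine.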
Against this conservative background, the amino acid changes in NS2A that distinguish the 1998 virus samples appear even more conspicuous, and natural selection on nonstructural genes has been described for other viruses and correlated with epidemic outbreaks (Knowles et al. 2001). Aside from epidemiologic evidence, the phenotypic traits targeted by natural selection involving NS2A are unclear because we know so little about the gene’s function. Dengue viruses in Puerto Rico may be under particularly intense selection to improve replication rate, survival, and, ultimately, transmission rate, because the only vector present, urban-specialist A. aegypti, is relatively inefficient and requires high viral titers to acquire infection (up to 10^6 infectious units/ml blood in laboratory studies [Gubler 1987; Kuno 1997]). Puerto Rico also lacks potential reservoir (primate) hosts, and its vector exhibits extremely low levels of vertical transmission, such that the disease must cycle directly between mosquitoes and humans to persist (Gubler 1987, 1998). Alternatively, the selection pressure could relate to survival pressure exerted by the human immune system in the guise of cytotoxic T-lymphocytes (CTLs). Epitopes that elicit human T-cell responses ranging from serotype-specific to cross-reactive have been identified throughout the nonstructural regions of the dengue genome (Loke et al. 2001), and phylogenetic evidence for positive selection at or near T-cell epitopes has been noted previously (Twiddy, Woelk, and Holmes 2002). The function of NS2A has been associated with viral replication (Falgout and Markoff 1995; Mackenzie et al. 1998) and the mediation of host immune interactions via NS1 (Rothman et al. 1993; Mathew et al. 1998; Jacobs et al. 2000). 
As the three amino acid substitutions in NS2A that define the 1998 cluster were all highly nonconservative changes from hydrophobic, non-polar residues to polar, uncharged amino acids, they would at the very least change the three-dimensional structure of the NS2A protein. To fully determine the repercussions of observed NS2A modifications on viral extended phenotype, future studies must endeavor to characterize the NS2A protein’s structure and function, and to survey this gene in phylogenetic studies of epidemic dengue.
Acknowledgments
We thank M. Worobey for assistance with recombination analyses, and J. J. Bull, K. A. Hanley, and D. D. Kapan for invaluable comments on the manuscript. Some of the data were acquired in partial fulfillment of M.C.’s Master’s degree at the Department of Microbiology and Medical Zoology, University of Puerto Rico, and thanks are therefore due to her advisory committee. This research was supported by the National Institutes of Health (USA) through a research project grant and the Research Centers in Minority Institutions program, and by The Royal Society (UK).
Literature Cited
Beaty, B. J., D. W. Trent, and J. T. Roehrig. 1988. Virus variation and evolution. Pp. 59–85 in T. P. Monath, ed. The arboviruses: epidemiology and ecology, Vol. 1. CRC Press, Boca Raton, Fla.
Bush, R. M., C. A. Bender, K. Subbarao, N. J. Cox, and W. M. Fitch. 1999. Predicting the evolution of human influenza A. Science 286:1921–1925.
Dietz, V., D. J. Gubler, S. Ortiz, G. Kuno, A. Casta-Velez, G. E. Sather, and I. Gomez. 1996. The 1986 dengue and dengue hemorrhagic fever epidemic in Puerto Rico: epidemiologic and clinical observations. P. R. Health Sci. J. 15:201–210.
Domingo, E., and J. J. Holland. 1997. RNA virus mutations for fitness and survival. Annu. Rev. Microbiol. 51:151–178.
Drake, J. W., and J. J. Holland. 1999. Mutation rates among RNA viruses. Proc. Natl. Acad. Sci. USA 96:13910–13913.
Falgout, B., and L. Markoff. 1995. 
Evidence that Flavivirus NS1-NS2A cleavage is mediated by a membrane-bound host protease in the endoplasmic reticulum. J. Virol. 69:7232–7243.
Foster, J. E., S. N. Bennett, H. Vaughan, V. Vorndam, W. O. McMillan, and C. V. F. Carrington. 2003. Molecular evolution and phylogeny of dengue type 4 virus in the Caribbean. Virology 306:126–134.
Gubler, D. J. 1987. Current research on dengue. Pp. 37–56 in K. F. Harris, ed. Current topics in vector research, Vol. 3. Springer-Verlag, New York.
Gubler, D. J. 1998. Dengue and dengue hemorrhagic fever. Clin. Microbiol. Rev. 11:480–496.
Gubler, D. J., D. Reed, L. Rosen, and J. R. Hitchcock, Jr. 1978. Epidemiologic, clinical, and virologic observations on dengue in the Kingdom of Tonga. Am. J. Trop. Med. Hyg. 27:581–589.
Halstead, S. B. 1988. Pathogenesis of dengue: challenges to molecular biology. Science 239:476–481.
Hatta, M., P. Gao, P. Halfmann, and Y. Kawaoka. 2001. Molecular basis for high virulence of Hong Kong H5N1 influenza A viruses. Science 293:1840–1842.
Holmes, E. C. 1998. Molecular epidemiology of dengue virus—the time for big science. Trop. Med. Int. Health 3:855–856.
Holmes, E. C., L. M. Bartley, and G. P. Garnett. 1998. The emergence of dengue: past, present and future. Pp. 301–325 in R. M. Krause, ed. Emerging infections. Academic Press, New York.
Holmes, E. C., and S. S. Burch. 2000. The causes and consequences of genetic variation in dengue virus. Trends Microbiol. 8:74–77.
Holmes, E. C., M. Worobey, and A. Rambaut. 1999. Phylogenetic evidence for recombination in dengue virus. Mol. Biol. Evol. 16:405–409.
Ishiko, H., Y. Shimada, M. Yonaha, O. Hashimoto, A. Hayashi, K. Sakae, and N. Takeda. 2002. Molecular diagnosis of human enteroviruses by phylogeny-based classification by use of the VP4 sequence. J. Infect. Dis. 185:744–754.
Jacobs, M. G., P. J. Robinson, C. Bletchly, J. M. Mackenzie, and P. R. Young. 2000. 
Dengue virus nonstructural protein 1 is expressed in a glycosyl-phosphatidylinositol-linked form that is capable of signal transduction. FASEB J. 14:1603–1610.
Jenkins, G. M., A. Rambaut, O. G. Pybus, and E. C. Holmes. 2002. Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J. Mol. Evol. 54:156–165.
Knowles, N., P. Davies, T. Henry, V. O’Donnell, J. M. Pacheco, and P. Mason. 2001. Emergence in Asia of foot and mouth disease viruses with altered host range: characterization of alteration in the 3A protein. J. Virol. 75:1551–1556.
Kuhner, M. K., J. Yamato, and J. Felsenstein. 1998. Maximum likelihood estimation of population growth rates based on the coalescent. Genetics 149:429–434.
Kuno, G. 1997. Factors influencing the transmission of dengue viruses. Pp. 61–88 in D. J. Gubler, and G. Kuno, eds. Dengue and dengue hemorrhagic fever. CAB International, New York.
Lanciotti, R. S., D. J. Gubler, and D. W. Trent. 1997. Molecular evolution and phylogeny of dengue-4 viruses. J. Gen. Virol. 78:2279–2286.
Lanciotti, R. S., J. G. Lewis, D. J. Gubler, and D. W. Trent. 1994. Molecular evolution and epidemiology of dengue-3 viruses. J. Gen. Virol. 75:65–75.
Leitmeyer, K. C., D. W. Vaughn, D. M. Watts, R. Salas, I. Villalobos de Chacon, C. Ramos, and R. Rico-Hesse. 1999. Dengue virus structural differences that correlate with pathogenesis. J. Virol. 73:4738–4747.
Lewis, J. A., G-J. Chang, R. S. Lanciotti, R. M. Kinney, L. W. Mayer, and D. W. Trent. 1993. Phylogenetic relationships of dengue-2 viruses. Virology 197:216–224.
Loke, H., D. B. Bethell, C. X. T. Phuong, M. Dung, J. Schneider, N. J. White, N. P. Day, J. Farrar, and A. V. S. Hill. 2001. Strong HLA class-I restricted T cell responses in dengue hemorrhagic fever: a double-edged sword. J. Infect. Dis. 184:1369–1373.
Mackenzie, J. M., A. A. Khromykh, M. K. Jones, and E. G. Westaway. 1998. 
Subcellular localization and some biochemical properties of the flavivirus Kunjin nonstructural proteins NS2A and NS4A. Virology 245:203–215.
Mackow, E., Y. Makino, B. T. Zhao, Y. M. Zhang, L. Markoff, A. Buckler-White, M. Guiler, R. Chanock, and C. J. Lai. 1987. The nucleotide sequence of dengue type 4 virus: analysis of genes coding for nonstructural proteins. Virology 159:217–228.
Manzin, A., L. Solforosi, M. Debiaggi, F. Zara, E. Tanzi, L. Romano, A. R. Zanetti, and M. Clementi. 2000. Dominant role of host selective pressure in driving hepatitis C virus evolution in perinatal infection. J. Virol. 74:4327–4334.
Mathew, A., I. Kurane, S. Green, H. A. F. Stephens, D. W. Vaughn, S. Kalayanarooj, S. Suntayakorn, F. A. Ennis, and A. L. Rothman. 1998. Predominance of HLA-restricted CTL responses to serotype cross-reactive epitopes on nonstructural proteins after natural dengue virus infections. J. Virol. 72:3999–4004.
Nichol, S. T., J. E. Rowe, and W. M. Fitch. 1993. Punctuated equilibrium and positive Darwinian evolution in vesicular stomatitis virus. Proc. Natl. Acad. Sci. USA 90:10424–10428.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.
Rico-Hesse, R. 1990. Molecular evolution and distribution of dengue viruses type 1 and 2 in nature. Virology 174:479–493.
Rico-Hesse, R., L. M. Harrison, A. Nisalak, D. W. Vaughn, S. Kalayanarooj, S. Greene, A. L. Rothman, and F. A. Ennis. 1998. Molecular evolution of Dengue type 2 virus in Thailand. Am. J. Trop. Med. Hyg. 58:96–101.
Rico-Hesse, R., L. M. Harrison, R. A. Salas, D. Tovar, A. Nisalak, C. Ramos, J. Boshell, M. T. de Mesa, R. M. Nogueira, and A. T. da Rosa. 1997. Origins of dengue type 2 viruses associated with increased pathogenicity in the Americas. Virology 230:244–251.
Roehrig, J. T. 1997. Immunochemistry of dengue viruses. Pp. 199–219 in D. J. Gubler, and G. Kuno, eds. Dengue and dengue hemorrhagic fever. CAB International, New York.
Rothman, A. 
L., and F. A. Ennis. 1999. Immunopathogenesis of dengue hemorrhagic fever. Virology 257:1–6.
Rothman, A. L., I. Kurane, C. J. Lai, M. Bray, B. Falgout, R. Men, and F. A. Ennis. 1993. Dengue virus protein recognition by virus-specific murine CD8+ cytotoxic T lymphocytes. J. Virol. 67:801–806.
Santti, J., H. Harvala, L. Kinnunen, and T. Hyypiä. 2000. Molecular epidemiology and evolution of coxsackievirus A9. J. Gen. Virol. 81:1361–1372.
Singh, U. B., A. Maitra, S. Broor, A. Rai, S. T. Pasha, and P. Seth. 1999. Partial nucleotide sequencing and molecular evolution of epidemic causing dengue 2 strains. J. Infect. Dis. 180:959–965.
Strauss, J. H., and E. G. Strauss. 1988. Evolution of RNA viruses. Annu. Rev. Microbiol. 42:657–683.
Swofford, D. L. 2001. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Thein, S., M. M. Aung, T. N. Shwe, M. Aye, Z. Aung, K. Aye, K. M. Aye, and J. Aaskov. 1997. Risk factors in dengue shock syndrome. Am. J. Trop. Med. Hyg. 56:566–572.
Tolou, H. J. G., P. Couissinier-Paris, J.-P. Durand, V. Mercier, J.-J. de Pina, P. de Micco, F. Billoir, R. N. Charrel, and X. de Lamballerie. 2001. Evidence for recombination in natural populations of dengue virus type 1 based on the analysis of complete genome sequences. J. Gen. Virol. 82:1283–1290.
Twiddy, S. S., J. F. Farrar, N. V. Chau, B. Wills, E. A. Gould, T. Gritsun, G. Lloyd, and E. C. Holmes. 2002. Phylogenetic relationships and differential selection pressures among genotypes of dengue-2 virus. Virology 298:63–72.
Twiddy, S. S., and E. C. Holmes. 2003. The extent of homologous recombination in the genus Flavivirus. J. Gen. Virol. 84:429–440.
Twiddy, S. S., E. C. Holmes, and A. Rambaut. 2003. Inferring the rate and time-scale of dengue virus evolution. Mol. Biol. Evol. 20:122–129.
Twiddy, S. S., C. H. Woelk, and E. C. Holmes. 2002. Phylogenetic evidence for adaptive evolution of dengue viruses in nature. J. Gen. Virol. 83:1679–1689. 
Uzcategui, N. Y., D. Camacho, G. Comach, E. C. Holmes, and E. A. Gould. 2001. The molecular epidemiology of Dengue-2 virus in Venezuela: evidence for in situ viral evolution and recombination. J. Gen. Virol. 82:2945–2953.
Wang, E., H. Ni, X. Renling, A. D. T. Barrett, S. J. Watowich, D. J. Gubler, and S. C. Weaver. 2000. Evolutionary relationships of endemic/epidemic and sylvatic dengue viruses. J. Virol. 74:3227–3234.
Weaver, S. C., R. Rico-Hesse, and T. W. Scott. 1992. Genetic diversity and slow rates of evolution in new-world alphaviruses. Curr. Top. Microbiol. Immunol. 176:99–117.
Woelk, C. H., and E. C. Holmes. 2002. Reduced positive selection in vector-borne RNA viruses. Mol. Biol. Evol. 19:2333–2336.
World Health Organization (WHO). 1999. Strengthening implementation of the global strategy for dengue fever/dengue haemorrhagic fever prevention and control: report of the informal consultation, WHO, Geneva, 18–20 October 1999 (WHO Report WHO/CDS/(DEN)/IC/2000.1; www.who.int/emc-documents/dengue/whocdsdenic20001c.html).
Worobey, M., A. Rambaut, and E. C. Holmes. 1999. Widespread intra-serotype recombination in natural populations of dengue virus. Proc. Natl. Acad. Sci. USA 96:7352–7357.
Yang, Z. 1997. PAML, a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555–556.
Yang, Z., R. Nielsen, N. Goldman, and A.-M. K. Pedersen. 2000. Codon substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
Zanotto, P. M. de A., E. G. Kallas, R. F. de Souza, and E. C. Holmes. 1999. Genealogical evidence for positive selection in the nef gene of HIV-1. Genetics 153:1077–1089.
Zhao, B., E. Mackow, A. Buckler-White, L. Markoff, R. M. Chanock, C. J. Lai, and Y. Makino. 1986. Cloning full-length dengue type 4 viral DNA sequences: analysis of genes coding for structural proteins. Virology 155:77–88.
Keith Crandall, Associate Editor
Accepted May 25, 2003