Selection on Rev during persistent equine infectious anemia virus infection

Wendy O. Sparks1, Karin Dorman2,3, Sijun Liu1, and Susan Carpenter*1,4

1Department of Veterinary Microbiology and Preventive Medicine, 2Department of Genetics, Development and Cell Biology, 3Department of Statistics, Iowa State University, Ames, IA 50011 and 4Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, WA 99164

*Corresponding author
Susan L. Carpenter, Dept. of Veterinary Microbiology & Pathology, Washington State University, Pullman, WA 99164-7040
E-mail: [email protected]
Telephone: 509-335-6043
FAX: 509-335-8529

Running title: EIAV Rev selection in vivo
Total # words in summary: 240
Total # words in main text and summary: 5397
Total # of figures and tables: 5

SUMMARY

Longitudinal analyses of Rev variation in horses infected with equine infectious anemia virus (EIAV) have revealed the presence of two subpopulations of Rev that co-existed and differed in genotype and phenotype. To better understand the role of Rev variation in EIAV persistence, computational and genetic analyses were used to examine Rev selection and fitness in vivo. Analysis of Rev evolution is complicated because rev overlaps with the transmembrane protein coding region, so we developed a novel technique for quantitating selection in both reading frames. Overall, the Rev protein was highly conserved, with purifying selection dominating evolution of Rev in a sample of over 300 clones. However, mutations nonsynonymous in both reading frames were surprisingly well tolerated, especially among the most frequently sampled mutations. To investigate whether the most common nonsynonymous mutations could modulate Rev phenotype, we studied the phenotypic effect of ten amino acid mutations observed at a frequency greater than 10% in the sample population. Nine of the ten mutations were found to significantly alter Rev nuclear export activity, either as single mutations or in the context of cumulatively fixed mutations. These results indicate that limited genetic variation in Rev can result in significant phenotype changes that may confer a selective advantage in vivo. Indeed, special sites, nonessential in both reading frames, may be especially tolerant of genetic variation. The resulting phenotypic variation in Rev may be an important mechanism of immune evasion and lentivirus persistence in vivo.

INTRODUCTION

Equine infectious anemia virus (EIAV) is a member of the lentivirus genus within the family of retroviruses. EIAV has the characteristic features shared by all lentiviruses, such as a complex genome, tropism for cells of the monocyte/macrophage lineage, and lifelong persistent infection. However, it is unique among lentiviruses in that it results in a dynamic disease course characterized by recurrent cycles of fever, viremia and thrombocytopenia. Most animals eventually gain control of viral replication, progressing to a clinically inapparent stage of disease, yet remaining inapparent carriers of the virus for life. The dynamics of clinical disease and immune control make EIAV a good model to study the role of both host and viral mechanisms contributing to lentiviral persistence and pathogenesis.

High genetic variation has been observed in the overlapping rev/tm reading frames of the EIAV genome, which encode the regulatory protein Rev and the cytoplasmic tail of the transmembrane (TM) protein (Alexandersen and Carpenter, 1991; Leroux et al., 1997; Belshan et al., 1998). Rev is an essential regulatory protein that acts to transport partially spliced and unspliced RNA into the cytoplasm. These RNAs are translated into the structural proteins necessary for replication, and also provide full-length viral genomes for encapsidation. Variation in HIV-1 Rev has been shown to down-regulate the expression of viral late genes and alter sensitivity to Gag-specific cytotoxic T lymphocytes (CTL) (Bobbitt et al., 2003). In addition, CTL epitopes have been identified within HIV-1 Rev (Aldo et al., 2001), as well as within EIAV Rev (Mealey et al., 2003). In EIAV-infected horses, nonprogressors exhibited a strong-avidity CTL response to epitopes within Rev, while progressors did not (Mealey et al., 2003). Genetic changes within rev may facilitate immune evasion directly by altering CTL epitopes in Rev, and/or indirectly through altering Rev nuclear export activity and decreasing expression of structural proteins and virion production.

Previously, we undertook longitudinal analyses of EIAV Rev variation throughout a clinically dynamic disease course in one pony experimentally infected with the virulent EIAVWYO2078 (Belshan et al., 2001; Baccam et al., 2003). This pony exhibited a classical disease course, with an acute stage of disease followed by a chronic stage of recurrent febrile episodes, which decreased in frequency and severity over time. Concurrent with maturation of the humoral immune response, this pony entered the inapparent stage of disease, which was then followed by two late febrile episodes. Phylogenetic and partition analyses identified multiple subpopulations of EIAV Rev that were independently evolving, coexisted throughout disease, and exhibited different phenotypes (Baccam et al., 2003). Moreover, these phenotypically distinct subpopulations fluctuated in dominance in a manner coincident with disease state, such that the subpopulation with high Rev phenotype was dominant during the chronic and late chronic stages of disease, whereas a subpopulation with lower Rev phenotype was dominant during the inapparent stage of disease. These studies indicated that in vivo selection on EIAV may drive genetic and phenotypic variation in Rev. In the present study, we used genetic and biological analyses to characterize the evolution and selection of Rev variants in vivo. We conclude that although Rev is under largely purifying selection, there is enough genetic flexibility, possibly in nonessential regions of both Rev and TM, to substantially modulate Rev phenotype. Selection at these sites could contribute to virus escape in the face of a maturing immune response.

MATERIALS AND METHODS

Experimental infections and identification of Rev variants
The virulent Wyoming strain of EIAV was used to infect pony 524, and sequential serum samples were collected from different stages of clinical disease as previously described (Belshan et al., 2001). The inoculum has been maintained by serial in vivo passage and contains a heterogeneous population of EIAV, similar to a natural infection. Virion RNA was isolated from the inoculum and from serum samples collected post infection. The rev exon 2/tm overlapping region of EIAV was amplified via RT-PCR. A total of 61 rev clones were obtained from the inoculum and 23-25 clones from each of 11 serum samples taken at 12, 35, 67, 89, 118, 201, 289, 385, 437, 754, and 800 days post infection (dpi), for a total sample population of 320 clones. All nucleotide sequences were translated in the Rev open reading frame. Amino acid variants were named in the order they were identified, with identical variants given the same name, e.g. R1. Nucleotide variants were named based on the corresponding amino acid variant name, e.g. R1A, R1B, etc.

Genetic analysis of Rev variation and evolution
The consensus rev sequence from the inoculum was calculated using BioEdit 5.0.9 (Hall, 1999), and corresponded to the variant R1A. The amino acid consensus sequence of the inoculum corresponded to the Rev amino acid variant R1. This variant was used for comparison in both computational and biological analyses.

To test for evidence of selection on the Rev sequences, we developed a novel technique inspired by the Nei & Gojobori (1986) method (NG method) for non-overlapping reading frames (manuscript in preparation). Because rev overlaps with the tm reading frame, we classified mutations into four distinct selection classes: double synonymous (SS), double nonsynonymous (NN), synonymous in rev and nonsynonymous in tm (SN), or nonsynonymous in rev and synonymous in tm (NS). Transitions are much more common than transversions; therefore, we also distinguished two subclasses, transitions and transversions, within each of the four selection classes. In the NG method, the number of observed mutations in each class is computed and compared to the number of opportunities for each type of mutation.
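
The four selection classes and the transition/transversion subclasses can be illustrated with a minimal Python sketch. It is not the authors' implementation (their method is described as a manuscript in preparation). The function names, the toy sequence, and the frame offsets below are hypothetical, and the standard genetic code is assumed. Mutations to stop codons, which the authors handled separately, would simply be reported as nonsynonymous here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def is_transition(old, new):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (old in PURINES) == (new in PURINES)


def codon_effect(seq, pos, new_base, frame):
    """Return 'S' (synonymous) or 'N' (nonsynonymous) in one reading frame.

    `frame` is the 0-based offset (0, 1 or 2) of the first complete codon.
    """
    start = pos - ((pos - frame) % 3)
    if start < 0 or start + 3 > len(seq):
        raise ValueError("position is not covered by a complete codon")
    codon = seq[start:start + 3]
    i = pos - start
    mutant = codon[:i] + new_base + codon[i + 1:]
    return "S" if CODON_TABLE[codon] == CODON_TABLE[mutant] else "N"


def classify(seq, pos, new_base, rev_frame, tm_frame):
    """Return (class, subclass): first letter is rev, second letter is tm."""
    if new_base == seq[pos]:
        raise ValueError("new base must differ from the original base")
    cls = (codon_effect(seq, pos, new_base, rev_frame)
           + codon_effect(seq, pos, new_base, tm_frame))
    subclass = "ts" if is_transition(seq[pos], new_base) else "tv"
    return cls, subclass


# Toy sequence (hypothetical): frame 0 reads GCT GAA, frame 1 reads CTG AAG.
toy = "GCTGAAG"
print(classify(toy, 2, "C", rev_frame=0, tm_frame=1))  # ('SN', 'ts')
print(classify(toy, 3, "A", rev_frame=0, tm_frame=1))  # ('NS', 'ts')
```

Counting opportunities would amount to calling `classify` for each of the three possible substitutions at every site and tallying the results per class.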

The connection between observed mutations and opportunities for mutations was made through various plausible models. The neutral model was a fully specified model with no free parameters. It assumed that each type of mutation occurs in direct proportion to the number of opportunities it was given. As a very simple example, suppose you observe a single nucleotide A. It has two opportunities to experience a transversion (to C or T) and one opportunity to experience a transition (to G); thus we expect 2/3 of the observed mutations at this site to be transversions. We considered more complex models with up to four parameters $s_t$, $s_r$, $s_{tr}$, $s_v$, where $s_t$ is the selection coefficient acting against SN type changes (nonsynonymous in TM), $s_r$ against NS changes (nonsynonymous in Rev), $s_{tr}$ against NN changes, and $s_v$ against transversions. Transversion selection was presumed multiplicative with protein selection, justified by the fact that selection against transversions is actually a mechanistic bias occurring at the level of replication and independent of protein selection. Again, as a simple example, suppose the transversion A to C is a type SN mutation. Then the fitness of the C mutant relative to any double synonymous (SS) transition is $(1 - s_t)(1 - s_v)$. Fitness effects of mutations at multiple sites were assumed independent.
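
The two worked examples above (the 2/3 transversion expectation and the relative fitness of an SN transversion) can be reproduced with a short Python sketch. The relative fitness follows the multiplicative form stated in the text. The step that turns fitness into expected class proportions, weighting each class's opportunities by its relative fitness and renormalizing, is an assumption made for illustration and may differ from the authors' likelihood. Function and argument names are hypothetical.

```python
def relative_fitness(cls, is_transversion, s_t=0.0, s_r=0.0, s_tr=0.0, s_v=0.0):
    """Fitness relative to a double synonymous (SS) transition."""
    protein = {"SS": 0.0, "SN": s_t, "NS": s_r, "NN": s_tr}[cls]
    w = 1.0 - protein
    if is_transversion:
        w *= 1.0 - s_v  # transversion bias multiplies protein selection
    return w


def expected_proportions(opportunities, **coeffs):
    """Expected share of mutations per (class, is_transversion) key.

    Assumption: share is proportional to opportunities times relative fitness.
    """
    weights = {key: n * relative_fitness(key[0], key[1], **coeffs)
               for key, n in opportunities.items()}
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}


# Neutral model for a single nucleotide A whose changes are all SS:
# one transition (to G) and two transversions (to C or T).
single_a = {("SS", False): 1, ("SS", True): 2}
print(expected_proportions(single_a)[("SS", True)])  # 2/3 under neutrality

# SN transversion under hypothetical coefficients s_t = 0.5, s_v = 0.9:
print(relative_fitness("SN", True, s_t=0.5, s_v=0.9))  # (1 - 0.5) * (1 - 0.9)
```

Setting all coefficients to zero recovers the neutral model, which is why the neutral model has no free parameters.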

We describe two methods for computing observed mutation counts. The original NG method compares pairs of sequences and records the observed number of mutations in each mutation class by averaging over all possible mutation pathways; counts may be fractional because of averaging. Because there was insufficient variation in any pair of sequences to perform statistical tests, we summed pairwise observed counts over all pairs of sequences, and divided by the total number of mutations across all pairwise comparisons to obtain observed mutation probabilities, $p_c$ for class $c$. We used the expected number of unique mutations in each class as the observed data. Since the total number of unique mutations in the data set was 126 (after removing 4 mutations to stop codons in rev and 2 in tm), we rounded $126\,p_c$ to the nearest integer to obtain the observed number of mutations in class $c$. By limiting the total number of mutations to 126, we neglected the possibility of parallel or back substitutions and therefore obtained a minimum estimate of the observed number of mutations. We call this method for obtaining observed mutation counts the pairwise comparison method.
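
As a rough illustration of the rescaling step, the sketch below converts summed pairwise counts into class probabilities $p_c$ and then into integer counts by rounding $126\,p_c$. The input numbers are invented for the example and are not the study's data, and the function name is hypothetical. Note that independently rounded counts need not sum to exactly 126, and that Python's `round` rounds exact halves to the nearest even integer.

```python
def observed_counts(pairwise_totals, n_unique=126):
    """Rescale summed (possibly fractional) pairwise counts to integer counts."""
    total = sum(pairwise_totals.values())
    probs = {c: v / total for c, v in pairwise_totals.items()}  # p_c
    return {c: round(n_unique * p) for c, p in probs.items()}


# Hypothetical summed pairwise counts per selection class.
toy_totals = {"SS": 10.5, "SN": 20.0, "NS": 7.5, "NN": 12.0}
print(observed_counts(toy_totals))
# {'SS': 26, 'SN': 50, 'NS': 19, 'NN': 30}
```

In this toy case the rounded counts sum to 125 rather than 126, which shows why the rounded totals should be treated as approximate.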

The above method may give excess importance to mutations appearing deep in the evolutionary tree, because many pairs of sequences differ by these mutations. Given the low diversity of this data set, it was possible to identify 60 non-overlapping blocks of mutations. Within blocks, mutations are dependent because of codons and overlapping reading frames, but between blocks, mutations were assumed independent. Of these 60 blocks, only six blocks had variants with two or three simultaneous mutations different from the block consensus; some blocks had more than one distinct multiply mutated variant. In all cases, it was possible to identify a most parsimonious pathway for accumulation of these mutations because intermediate mutants were always available in our sample. For the seven cases where there were multiple intermediates and thus multiple equally parsimonious pathways to a multi-mutant, six cases had one intermediate far more prevalent than the other, and we assumed the mutation pathway with the most prevalent intermediates. In the last case, where one intermediate was present in one copy and the other in two, we gave each pathway equal probability and averaged over pathways, resulting in ½ count each in the NS and SN categories. In the end, we identified 137 mutational events, a few more than the number of unique mutations (126) because seven mutations happened in two contexts and one mutation happened in three contexts. We also fit these 137 mutational events to the selection models described above in what we call the most parsimonious reconstruction method.

Given observed mutation counts, we then maximized the likelihood of observing the expected mutations over all or some of the parameters and compared nested models using the likelihood ratio test (LRT) statistic (Casella & Berger, 2001). For comparing non-nested models, say M1 and M2, of equal complexity, we performed a parametric bootstrap to obtain an empirical approximation of the sampling distribution of the likelihood ratio under M1. We then computed the p-value to reject M1 as the proportion of times the simulated likelihood ratio fell below the observed likelihood ratio (Coulibaly and Brorsen, 1999). We caution that the maximum likelihood estimates of selection coefficients may be biased (Ina, 1995; Yang, 2000), but the biases tend to be small for sequences with low diversity (Nei & Gojobori, 1986). While parameter estimates varied somewhat (within the reported confidence intervals), the results of model comparisons were consistent for several different methods of computing observed and opportunity distributions. For example, in the pairwise comparison method when codons are mutated at two positions, either mutation may have occurred first and can impact how the mutations are classified. We obtained the same results when assuming one mutation order (selected at random) vs. averaging over all orders.
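
For the nested-model comparisons, a likelihood ratio test on class counts can be sketched as follows. This is an illustrative sketch rather than the authors' code. It assumes a multinomial likelihood over mutation classes and a chi-square reference distribution with one degree of freedom, which is appropriate when the larger model adds a single free parameter. The counts are invented, and the function names are hypothetical.

```python
import math


def multinomial_loglik(counts, probs):
    """Log-likelihood of class counts, omitting the constant multinomial term."""
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


def lrt_1df(loglik_null, loglik_alt):
    """Return (LRT statistic, p-value) for one extra free parameter.

    For one degree of freedom the chi-square tail is erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_alt - loglik_null)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical counts of transitions and transversions.
counts = [90, 36]
neutral = [1 / 3, 2 / 3]                   # neutral expectation per site
fitted = [n / sum(counts) for n in counts]  # maximum likelihood under the larger model
stat, p = lrt_1df(multinomial_loglik(counts, neutral),
                  multinomial_loglik(counts, fitted))
print(f"LRT statistic = {stat:.1f}, p-value = {p:.2g}")
```

The non-nested comparisons described above would instead simulate counts under M1, refit both models to each simulated data set, and compare the simulated likelihood ratios with the observed one.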

Assays of Rev nuclear export activity
A Rev expression vector was constructed by replacing the second exon of pRevWT (Belshan et al., 1998) with R1, the consensus sequence in the inoculum. Specific mutations were introduced into the R1 background using PCR-based mutagenesis, and all mutations were confirmed by sequencing. The Rev nuclear export activity of each of the mutants was determined in transient transfection assays using the pDM138-based CAT reporter plasmid pERRE-All, which contains the EIAV RRE (nt 5280-7534) and the chloramphenicol acetyltransferase (CAT) gene as previously described (Belshan et al., 1998; Harris et al., 1998). Briefly, 293 cells were seeded in triplicate at 1-5 × 10^5 cells/well in 6-well tissue culture dishes, and transfected with 0.2 μg of pERRE-All, 0.2 μg of β-galactosidase reporter plasmid pCH110 (Pharmacia, Uppsala, Sweden), 1 μg of Rev expression plasmid or empty vector along with 0.60 μg pUC19 for a total of 2 μg DNA per reaction. Each experiment included a sham group that contained no reporter plasmid, but an additional 0.2 μg of pUC19. At 48 hours post transfection, cells were harvested in phosphate-buffered saline (PBS) containing 0.5 mM EDTA, pelleted, resuspended in 500 μl 0.25 M Tris, pH 7.5, and lysed by three rounds of freeze-thawing. Cell lysates were assayed for β-galactosidase activity and these values were used to normalize for transfection efficiency. Cell lysates were assayed for CAT enzyme using a commercial CAT ELISA kit (Roche Molecular Biochemicals, Indianapolis, IN). Experiments were performed in triplicate, and results represent at least 6 independent transfections. Statistical analysis was performed using analysis of variance (ANOVA) and Student's t-test assuming unequal variance among groups.
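
The normalization and the unequal-variance t-test can be sketched in a few lines of Python. The text states only that β-galactosidase values were used to normalize for transfection efficiency. Dividing each CAT reading by its paired β-galactosidase reading is a common convention and is assumed here. The function names are hypothetical, and the sketch returns the Welch statistic and its Welch-Satterthwaite degrees of freedom without computing a p-value.

```python
import math
from statistics import mean, variance


def normalize(cat_values, bgal_values):
    """Divide each CAT reading by its paired beta-galactosidase reading."""
    return [c / b for c, b in zip(cat_values, bgal_values)]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical readings for a mutant and for R1 (three wells each).
mutant = normalize([4.2, 3.9, 4.5], [1.0, 0.9, 1.1])
r1 = normalize([2.1, 2.0, 2.4], [1.0, 1.0, 1.2])
t, df = welch_t(mutant, r1)
print(f"t = {t:.2f}, df = {df:.2f}")
```

The p-value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package.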

Nucleotide sequence accession numbers
GenBank accession numbers are AF314257 to AF314404.

RESULTS

EIAV Rev variation in vivo
To accurately reflect the genetic diversity of an in vivo infection, pony 524 was inoculated with the highly virulent Wyoming strain of EIAV (Belshan et al., 2000), which has been maintained by serial in vivo passage (Oaks et al., 1998). Following experimental infection, pony 524 experienced a variable clinical disease course characterized by recurring fever cycles interspersed with afebrile periods ranging from days to months. In the data set of 320 rev clones, there were 146 unique nucleotide variants and 99 unique amino acid variants. This included 61 clones from the inoculum, with 39 unique nucleotide variants and 25 unique amino acid variants. The amino acid variant R1 was the consensus sequence of the inoculum and the most frequently observed variant overall. All genotypic and phenotypic analyses were performed relative to R1A, which was the dominant nucleotide sequence encoding the amino acid variant R1.

Purifying selection dominates evolution of both Rev and TM
Analyses of Rev evolution are complicated by the fact that the second exon of Rev, which contains the functional domains (Fridell et al., 1993; Lee et al., 2006), overlaps with the cytoplasmic tail of the transmembrane protein coding region. We tested various models of dual-protein selection to explain all non-stop codon mutations observed in the sample of 320 nucleotide sequences. Tables 1 and 2 display the models in order of increasing complexity or degrees of freedom for pairwise comparison of sequences (Table 1) and the most parsimonious reconstruction of mutations (Table 2); only the best-fitting model at each level of complexity is shown. The parameters that can be included in each model are the selection coefficient against transversions $s_v$, the selection coefficient against nonsynonymous changes in TM $s_t$, the selection coefficient against nonsynonymous changes in Rev $s_r$, and the selection coefficient against double nonsynonymous changes in both reading frames $s_{tr}$. These parameters generally range from zero, indicating no selection, to one, indicating maximal negative selection against change; negative values indicate positive selection. NE means the parameter was not estimated in this model, i.e. the corresponding selection coefficient was set to zero. NA means the parameter does not exist in this model. For each level of complexity, i = 0, 1, 2, 3, and 4, we fit all possible models with i parameters. We also tested multiplicative selection models where $s_{tr} = s_r s_t$ wherever applicable.
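
The model space described here, all subsets of the four selection coefficients grouped by the number of free parameters, can be enumerated with a few lines of Python. This is only a sketch of the bookkeeping. The parameter labels are hypothetical identifiers for $s_v$, $s_t$, $s_r$ and $s_{tr}$, and the multiplicative and seven-parameter models mentioned in the text are not covered.

```python
from itertools import combinations

PARAMS = ("s_v", "s_t", "s_r", "s_tr")


def candidate_models(params=PARAMS):
    """Map each complexity level i to the models with exactly i free parameters.

    Parameters left out of a model are fixed at zero ("NE" in the tables).
    """
    return {i: list(combinations(params, i)) for i in range(len(params) + 1)}


models = candidate_models()
for i, subsets in models.items():
    print(i, len(subsets), subsets)
```

With four coefficients there are 1, 4, 6, 4 and 1 models at levels 0 through 4, so 16 models in total, of which only the best-fitting one per level is reported.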

The neutral model assumed no mutation bias or selection. Under the pairwise sequence comparison method, Table 1 shows that the neutral model fit significantly worse than the one-parameter model where transitions were highly favored over transversions (p-value < 0.0001). The transition model, in turn, fit significantly worse than the best two-parameter selection model, which selected against single frame nonsynonymous mutations, but treated double nonsynonymous mutations as neutral (p-value < 0.0001). Finally, the best three-parameter model implied strong selection in the Rev reading frame but left double nonsynonymous mutations essentially “neutral” (p-value 0.03). An alternative three-parameter model that could not be rejected using the bootstrap test set $s_t = 0$ and estimated $s_r = 0.53$, $s_{tr} = -1.01$, $s_v = 0.99$, emphasizing again the strong selection against change in Rev and the apparent selection for double nonsynonymous mutations, especially relative to single nonsynonymous changes. The best fitting three-parameter model fit no worse than the full four-parameter model or the most general seven-parameter model (p-values > 0.4).

The above analysis generated information about the observed mutations by comparing pairs of sequences. The approach gave excess weight to mutations occurring deep in the phylogenetic tree, since many pairs of sequences differed by these early mutational events. Assuming no temporal change in selection, old mutations should display the same patterns as recent mutations, but the presence of relatively few old mutations means random variation in patterns can be accentuated by the pairwise analysis. To overcome this bias, we reconstructed the most parsimonious series of mutational events by examination of the data. Most mutations occurred in a single local context in the reconstruction. Eight mutations occurred in multiple contexts and each occurrence counted as one observed mutation. Although this approach avoids excess weighting of deep branch mutations, the results are conditional on the hypothesized pathways for accumulation of mutations. In addition, the separation of mutations into independent blocks could under-count mutations that appeared in the same local sequence background but distinct global backgrounds. As we observed for the pairwise comparison method, the results in Table 2 indicated that the best fitting one-parameter model involved a significant selection coefficient against transversions (p-value < 0.0001). The best two-parameter selection model equally disfavored Rev nonsynonymous and double nonsynonymous mutations, leaving changes in TM effectively neutral (p-value 0.03). Several nearly equivalent fits at this level suggested that the protein selection coefficients $s_t$, $s_r$, $s_{tr}$ were statistically indistinguishable and substantially greater than $s_v$. Trends in the relationship between these parameters are shown by the fit of the four-parameter model, but again, these differences were not statistically supported by the data. No three-parameter or greater complexity model fit these data better than the best-fitting two-parameter model.

Both tables include bootstrap-derived confidence intervals for each estimated parameter. “Selection” (selection coefficient 0.93 to 0.99) against transversions was powerful and consistently estimated across all models and methods. Selection against Rev (selection coefficient 0.32 or 0.77) tended to be stronger than selection against TM (selection coefficient 0.0 or 0.6), though the confidence intervals did not rule out equal effects. In both the two- and three-parameter models based on the pairwise analysis, double nonsynonymous mutations were effectively neutral and favored over single-frame nonsynonymous mutations. In the parsimonious reconstruction, double nonsynonymous mutations were about as disfavored as any single nonsynonymous mutation. Comparison of these analyses suggested a difference between high and low frequency mutations. A statistical test for equal distributions of mutation type between high frequency (>0.10) and low frequency (<0.10) mutations indicated that double nonsynonymous mutations were highly over-represented among high frequency mutations (p-value: 0.001252). Together, the tests showed that selection acts to suppress amino acid change in both reading frames, and especially in Rev, but that double nonsynonymous mutations were particularly well tolerated, becoming the most frequent mutation type observed.

Single amino acid substitutions significantly alter Rev phenotype
Ten nucleotide mutations were present at a frequency greater than 0.10 of the total population (Fig. 1A). Surprisingly, given the selection against amino acid change in Rev, nine of these mutations were nonsynonymous in Rev. All but one were also nonsynonymous in TM, and this Rev codon experienced a second high frequency mutation that was nonsynonymous in both reading frames. Except for this case, only one amino acid variant dominated at all the sites with frequent nonsynonymous mutations (Fig. 1B). Therefore, we identified nine highly variable amino acid positions in Rev, which resulted in ten different amino acid variants.

To examine the effect of amino acid changes on Rev phenotype, each of the ten single amino acid mutations was introduced in the backbone of R1, the consensus of the inoculum (Fig. 2A). Rev nuclear export activity was quantified using transient transfection assays and activity was normalized relative to R1. Seven of the ten amino acid mutations significantly altered Rev phenotype: six increased Rev activity and one decreased Rev activity (Fig. 2B). The only mutations located within a known functional domain of Rev (Fridell et al., 1993; Mancuso et al., 1994; Harris et al., 1998; Lee et al., 2006) were the changes at position 55, at the C-terminal end of the nuclear export signal. Both S55L and S55P significantly increased Rev activity as compared to R1. Interestingly, three of the four mutations located within a non-essential region of Rev, amino acids 131-143 (Lee et al., 2006), resulted in significant changes in activity: D135G and Q138R increased activity whereas G134D decreased activity. These findings indicate that EIAV Rev nuclear export activity is highly sensitive to point mutations, even those that occur outside known functional domains. Further, the majority of mutations observed at a high frequency in vivo were sufficient to cause significant changes in Rev nuclear export activity.

Fixation of preexisting and in vivo mutations in EIAV Rev
To gain further insight into how the genetic mutations were related to the temporal evolution and selection of EIAV Rev, we examined the mutations over time. R1 was the dominant variant in the inoculum, as well as during the acute and inapparent stages of disease. Four of the ten high frequency amino acid mutations observed in vivo were preexisting in the inoculum and persisted throughout the course of disease. These included S55L, G134D, D135G and Q138R. The V112A mutation was observed near the end of the acute stage of disease in the background of the G134D mutation. Although no further high frequency mutations were observed in this background, G134D/V112A variants persisted through the last time point of the inapparent stage of disease. Variants containing the S55L mutation were observed in the inoculum, as well as the acute and inapparent stages of disease. The persistence of S55L was due primarily to the recurrence of variants that had been observed previously, and not to new variants which had the S55L mutation. The remaining five mutations arose during the course of infection and were fixed in the background of the D135G/Q138R mutations. The mutation R127K appeared at dpi 118 in the chronic stage of disease, followed by G110D and V105A at dpi 201 in the inapparent stage of disease, and culminating with the simultaneous appearance of S55P and R143H at dpi 754, during the late chronic stage of disease. At dpi 800, 91% of the variants sampled contained these seven amino acid changes, which resulted in a significant increase in Rev nuclear export activity (Belshan et al., 2001).

It was of interest to determine if the cumulative fixation of mutations in the D135G/Q138R background conferred greater fitness, as indicated by higher Rev activity. A series of constructs was created that reflected the appearance and fixation of the high frequency mutations through 800 days post infection (Fig. 3A). Rev phenotype was quantified in transient expression assays and results were expressed as activity relative to R1 (Fig. 3B). Evo-1 contains the Q138R mutation in the backbone of R1 and showed a significant increase in Rev activity, to 183. Evo-2 adds the D135G mutation, while constructs Evo-3 through Evo-6 represent the cumulative fixation of the five remaining mutations in the backbone of Evo-2. The cumulative fixation of high frequency mutations resulted in Rev activity significantly greater than the variant R1; however, there did not appear to be selection for ever-increasing relative Rev activity.

In several instances, the effect of specific mutations on Rev phenotype was dependent on the sequence context of the mutation. R127K and V105A showed no effect on Rev activity when introduced singly in the backbone of R1 (Fig. 2); however, R127K significantly decreased activity in the context of the cumulative mutations Q138R and D135G (Fig. 3B) and V105A significantly increased activity when added to the Evo-4 background. The G134D mutation significantly decreased activity of R1 (Fig. 2), but resulted in a significant increase in activity in the background of Evo-1 (Fig. 3). These results suggest that specific sites in EIAV Rev can accommodate genetic variation, and that such variation can have a positive, negative or neutral effect on Rev phenotype, depending on the sequence context of the change. These phenotypic assays provide experimental support for our hypothesis that special sites in constrained regions of the virus genome may be permissive for genetic and phenotypic variation.

DISCUSSION

Lentiviruses are characterized by high rates of mutation, recombination, and replication, resulting in diverse populations of viral variants that rapidly adapt to changes in the host environment (Coffin, 1995). Understanding the virus and host factors that shape the evolution and selection of viral variants in vivo is an essential component of preventive and therapeutic strategies to control lentivirus infections. Previously, we identified genetic and phenotypic variation in Rev coincident with changes in clinical stages of EIA, and suggested that Rev phenotype contributes to variant selection in vivo (Belshan et al., 2001; Baccam et al., 2003). Here, we examined in more detail the genetic variation in rev and its effect on Rev phenotype in order to further understand the evolution and selection of Rev during disease progression. Within the population of 320 Rev clones, 121 of 135 amino acid positions varied in less than 2% of sequences, and 70 amino acid positions were 100% conserved. However, ten amino acid positions varied in more than 10% of the sequences, and changes at nine positions resulted in significant changes in Rev nuclear export activity. Both Rev and the overlapping region of TM were overall subject to purifying selection, with Rev somewhat more highly selected than TM. Interestingly, despite widespread purifying selection, mutations that were nonsynonymous in both reading frames were, on average, highly tolerated. Especially among the common mutations, double nonsynonymous mutations appeared to be effectively neutral, like double synonymous mutations. We hypothesize that there are specialized sites that can mutate without severe consequence in either reading frame, and that these sites may be selected in vivo. If these sites also modulate activity of one encoded protein without disrupting function in the second reading frame, they provide a mechanism for the virus to diverge functionally, despite heavy selective constraints in regions of overlapping reading frames.

The variation of Rev in pony 524 was dominated by the presence of four mutations that
pre-existed in the inoculum, and six mutations that arose throughout the course of disease in
vivo. Nine of these ten mutations, including the six novel mutations, were specific to the
previously described subpopulation A, which had significantly higher Rev activity and was the
dominant population during recurrent febrile episodes of EIA (Belshan et al., 2001; Baccam et
al., 2003). Evolution of subpopulation A during disease was best characterized by two mutations,
Q138R and D135G, present in the inoculum and five mutations that occurred in vivo during
subsequent febrile episodes; the two other mutations seemed to be evolutionary dead-ends.
Pre-existing mutations, Q138R and D135G, both alone and together conferred a dramatic increase in
Rev activity relative to R1 (Figs. 2 and 3). Many of the five novel mutations that arose during
infection substantially altered Rev phenotype, but none significantly decreased Rev activity
below that of Q138R or altered phenotype more than Q138R alone in the backbone of R1. The
maintenance of high Rev activity despite continued mutation suggests that high Rev activity is
important for the virus, especially during febrile stages of disease. In pony 524, pre-existing
mutant Q138R was critical for achieving high Rev activity, but it is clear from our single mutant
analyses (Fig. 2) that an arginine at position 138 is not necessary for the high Rev phenotype.
Indeed, there are a number of variable positions where a single amino acid change was found to
significantly alter Rev phenotype. The presence of multiple mutational pathways to high Rev
activity confers flexibility on a protein whose evolution is constrained by an overlapping reading
frame and occasional immune epitopes.

Rev nuclear-export activity is dependent on several defined functional domains that
mediate protein-protein or protein-RNA interactions essential for nuclear import, RNA-binding
and interaction with Crm-1. The highly variable amino acid positions observed in vivo were
found outside the known functional domains of EIAV Rev, which varied in less than 2% of the
sequences. In fact, four of the 10 variable positions were located in a region found to be
non-essential for Rev nuclear export activity (Lee et al., 2006). Nonetheless, nine of the 10 amino
acid changes that occurred at a high frequency in vivo were found to significantly alter Rev
nuclear export activity, either as single mutations or in the context of cumulatively fixed
mutations. Further, three of the four changes within the non-essential region significantly
increased, or decreased, nuclear export activity. The non-essential region may function as a
regulatory domain, allowing a high rate of genetic variation that modulates, but does not
eliminate, an activity essential for virus replication.

Rev overlaps the intracytoplasmic tail (ICT) of TM, and selection may act on
nonsynonymous changes in TM. The ICT of lentiviruses is unusually long and analyses of
primate lentiviruses indicate the ICT affects multiple steps in virus replication, including
infectivity, cytopathicity, and assembly (Lee et al., 1989; Gabuzda et al., 1992; Dubay et al.,
1992; Kalia et al., 2003; Freed and Martin, 1996; Cosson, 1996). In addition, the ICT has been
shown to be a locus for SIV attenuation in vivo (Shacklett et al., 2000; Fultz et al., 2001). The
domains of ICT that mediate these various functions are not well defined. Functional motifs
identified in the ICT include the endocytotic sequence motifs, YXXL and di-leucine sequences
(Boge et al., 1998; Wyss et al., 2001). In addition, amphipathic α-helical domains designated as
lentivirus lytic peptides (LLP-1 and LLP-2) play distinct roles in lentivirus infectivity and
fusogenicity (Kalia et al., 2003). Little information is available regarding functional domains of
the EIAV ICT or the role of the ICT in virus replication. The EIAV TM contains a proteolytic
cleavage site, and viruses producing a truncated TM were found to be more infectious in vitro
than wild-type viruses (Rice et al., 1990). Limited analyses of rev/tm variants in the context of
infectious molecular clones correlated Rev activity in transient expression assays with replication
phenotype in vitro (Baccam et al., 2003). However, it is likely that at least some of the variants
would alter phenotype due to nonsynonymous changes in TM. Further characterization of the
replication and antigenic phenotype of Rev/TM variants will provide further insight into the
virologic and immunologic factors important in lentivirus selection and persistence in vivo.

The results of the phenotype analyses provide experimental support for our hypothesis that
specialized sites in constrained regions of the viral genome allow limited genetic variability that
can alter phenotype and confer a selective advantage in vivo. Importantly, the fact that so many
of the high frequency mutations induced measurable phenotypic differences suggests that their
abundance may be at least partly explained by selection. Evaluation of the phenotypic effects of
minor variants would help test this hypothesis. Further support for a role of selection is found
in the observations that phenotypes with high Rev activity were dominant during febrile periods,
while Rev variants with lower activity were predominant during the inapparent stages of
infection (Belshan et al., 2001; Baccam et al., 2003). Although new mutations continued to
accumulate during febrile episodes, they did not progressively increase Rev activity, but rather
maintained a consistent, high level of Rev activity relative to the R1 variant that dominated the
inoculum and inapparent disease. The selective advantage of the observed variation in Rev is not
clear, but may include direct immune evasion resulting from down-regulation of structural gene
expression. If so, inapparent disease may indicate a healthy immune response that requires
evasion through decreased Rev activity, while febrile episodes mark immune escape allowing
vigorous virus production supported by increased Rev activity.

ACKNOWLEDGEMENTS
The authors thank Yvonne Wannemuehler and Susan Vleck for excellent technical assistance.
This work was supported in part by funding from the National Institutes of Health grant
CA97936 and the National Research Initiative of the USDA Cooperative State Research,
Education and Extension Service grant number 2002-35204-12699. WOS was partially
supported by USDA HEP National Needs Fellowship 2000-3842-8824.

REFERENCES

Addo, M.M., Altfeld, M., Rosenberg, E.S., Eldridge, R.L., Phillips, M.N., Habeeb, K.,
Khatri, A., Brander, C., Robbins, G.K., Mazzara, G.P., Goulder, P.J.R., Walker, B.D. and
the HIV Controller Study Collaboration. (2001). The HIV-1 regulatory proteins Tat and Rev
are frequently targeted by cytotoxic T lymphocytes derived from HIV-1-infected individuals.
Proc. Natl. Acad. Sci. USA 98, 1781-1786.

Alexandersen, S., and Carpenter, S. (1991). Characterization of variable regions in the
envelope and S3 open reading frame of equine infectious anemia virus. J. Virol. 65, 4255-4262.

Baccam, P., Thompson, R.J., Li, Y., Sparks, W. O., Belshan, M., Dorman, K. S.,
Wannemuehler, Y., Oaks, J. L., Cornette, J. L. and Carpenter, S. (2003). Subpopulations of
equine infectious anemia virus Rev coexist in vivo and differ in phenotype. J. Virol. 77,
12122-12131.

Belshan, M., Baccam, P., Oaks, J. L., Sponseller, B. A., Murphy, S. C., Cornette, J. and
Carpenter, S. (2001). Genetic and biological variation in equine infectious anemia virus Rev
correlates with variable stages of clinical disease in an experimentally infected pony. Virology
279, 185-200.

Belshan, M., Harris, M. E., Shoemaker, A. E., Hope, T. J. and Carpenter, S. (1998).
Biological characterization of Rev variation in equine infectious anemia virus. J. Virol. 72,
4421-4426.

Bobbitt, K. R., Addo, M. M., Altfeld, M., Filzen, T., Onafuwa, A. A., Walker, B. D. and
Collins, K. L. (2003). Rev activity determines sensitivity of HIV-1-infected primary T cells to
CTL killing. Immunity 18, 289-299.

Casella, G. and Berger, R. L. (2001). Statistical Inference. Duxbury Press, Belmont, CA.

Coffin, J. M. (1995). HIV population dynamics in vivo: implications for genetic variation,
pathogenesis, and therapy. Science 267, 483-489.

Cosson, P. (1996). Direct interaction between the envelope and matrix proteins of HIV-1.
EMBO J. 15, 5783-5788.

Coulibaly, N., and Brorsen, B.W. (1999). Monte Carlo sampling approach to testing nonnested
hypotheses: Monte Carlo results. Econometric Reviews 18, 195-209.

de Oliveira, T., Salemi, M., Gordon, M., Vandamme, A.-M., van Rensburg, E. J.,
Engelbrecht, S., Coovadia, H. M. and Cassol, S. (2004). Mapping sites of positive selection
and amino acid diversification in the HIV genome: an alternative approach to vaccine design?
Genetics 167, 1047-1058.

Dubay, J.W., Roberts, S.J., Hahn, B.H. and Hunter, E. (1992). Truncation of the human
immunodeficiency virus type 1 transmembrane glycoprotein cytoplasmic domain blocks virus
infectivity. J. Virol. 66, 6616-6625.

Freed, E.O. and Martin, M.A. (1996). Domains of the human immunodeficiency virus type 1
matrix and gp41 cytoplasmic tail required for envelope incorporation into virions. J. Virol. 70,
341-351.

Fridell, R. A., Partin, K. M., Carpenter, S. and Cullen, B. R. (1993). Identification of the
activation domain of equine infectious anemia virus rev. J. Virol. 67, 7317-7323.

Fultz, P.N., Vance, P.J., Endres, M.J., Tao, B., Dvorin, J.D., Davis, I.C., Lifson, J.D.,
Montefiori, D.C., Marsh, M., Malim, M.H. and Hoxie, J.A. (2001). In vivo attenuation of
simian immunodeficiency virus by disruption of a tyrosine-dependent sorting signal in the
envelope glycoprotein cytoplasmic tail. J. Virol. 75, 278-291.

Gabuzda, D.H., Lever, A., Terwilliger, E. and Sodroski, J. (1992). Effects of deletions in the
cytoplasmic domain on biological functions of human immunodeficiency virus type 1 envelope
glycoproteins. J. Virol. 66, 3306-3315.

Hall, T. A. (1999). BioEdit: a user-friendly biological sequence alignment editor and analysis
program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 45, 95-98.

Harris, M. E., Gontarek, R. R., Derse, D. and Hope, T. J. (1998). Differential requirements
for alternative splicing and nuclear export functions of equine infectious anemia virus Rev
protein. Mol. Cell. Biol. 18, 3889-3899.

Ina, Y. (1995). New methods for estimating the numbers of synonymous and nonsynonymous
substitutions. J. Mol. Evol. 40, 190-226.

Kalia, V., Sarkar, S., Gupta, P. and Montelaro, R.C. (2003). Rational site-directed mutations
of LLP-1 and LLP-2 lentivirus lytic peptide domains in the intracytoplasmic tail of human
immunodeficiency virus type 1 gp41 indicate common functions in cell-cell fusion but distinct
roles in virion envelope incorporation. J. Virol. 77, 3634-3646.

Kumar, S., Tamura, K., Jakobsen, I. B. and Nei, M. (2001). MEGA2: molecular evolutionary
genetics analysis software. Bioinformatics 17, 1244-1245.

Lee, J.-H., Murphy, S.C., Belshan, M., Sparks, W.O., Wannemuehler, Y., Liu, S., Hope,
T.J., Dobbs, D. and Carpenter, S. (2006). Characterization of functional domains of equine
infectious anemia virus Rev suggests a bipartite RNA-binding domain. J. Virol. 80, 3844-3852.

Lee, S.J., Hu, W., Fisher, A.G., Looney, D.J., Kao, V.F., Mitsuya, H., Ratner, L. and
Wong-Staal, F. (1989). Role of the carboxy-terminal portion of the HIV-1 transmembrane protein in
viral transmission and cytopathogenicity. AIDS Res. Hum. Retrovir. 5, 441-449.

Leroux, C., Issel, C. J. and Montelaro, R. C. (1997). Novel and dynamic evolution of equine
infectious anemia virus genomic quasispecies associated with sequential disease cycles in an
experimentally infected pony. J. Virol. 71, 9627-9639.

Mancuso, V. A., Hope, T. J., Zhu, L., Derse, D., Phillips, T. and Parslow, T. G. (1994).
Posttranscriptional effector domains in the rev proteins of feline immunodeficiency virus and
equine infectious anemia virus. J. Virol. 68, 1998-2001.

Mealey, R. H., Zhang, B., Leib, S. R., Littke, M. H. and McGuire, T. C. (2003). Epitope
specificity is critical for high and moderate avidity cytotoxic T lymphocytes associated with
control of viral load and clinical disease in horses with equine infectious anemia virus. Virology
313, 537-552.

Moya, A., Holmes, E. C. and Gonzalez-Candelas, F. (2004). The population genetics and
evolutionary epidemiology of RNA viruses. Nat. Rev. Microbiol. 2, 279-288.

Nei, M. and Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous
and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418-426.

Oaks, J.L., McGuire, T.C., Ulibarri, C. and Crawford, T.B. (1998). Equine infectious
anemia virus is found in tissue macrophages during subclinical infection. J. Virol. 72,
7263-7269.

Rice, N.R., Henderson, L. E., Sowder, R.C., Copeland, T.D., Oroszlan, S. and Edwards,
J.F. (1990). Synthesis and processing of the transmembrane envelope protein of equine
infectious anemia virus. J. Virol. 64, 3770-3778.

Shacklett, B.L., Weber, C.J., Shaw, K.E.S., Keddie, E.M., Gardner, M.B., Sonigo, P. and
Luciw, P.A. (2000). The intracytoplasmic domain of the Env transmembrane protein is a locus
for attenuation of simian immunodeficiency virus SIVmac in rhesus macaques. J. Virol. 74,
5836-5844.

Yang, Z. and Nielsen, R. (2000). Estimating synonymous and nonsynonymous substitution
rates under realistic evolutionary models. Mol. Biol. Evol. 17, 32-43.

Table 1. Fit of various rev/tm selection models for pairwise sequence comparisons^a.

Model         sv                st                sr                str     Log likelihood  AIC
Neutral       NE                NE                NE                NE      -282.15         564.30
1-parameter   0.98 (0.96,1.00)  NE                NE                NE      -165.57**       333.14
2-parameter   0.99 (0.97,1.00)          0.66 (0.49,0.77)            NE      -147.85**       299.70
3-parameter   0.99 (0.97,1.00)  0.56 (0.34,0.73)  0.77 (0.61,0.88)  NE      -145.62*        297.24
4-parameter   0.99              0.41              0.69              -0.37   -145.41         298.82
Unrestricted  NA                NA                NA                NA      -143.93         301.86

^a Various models in order of increasing complexity are compared. Estimates of selection coefficients sv, st, sr, and str
are shown where applicable, along with bootstrapped 95% confidence intervals. Each model is compared to the
nested model in the preceding line.
** implies a p-value <0.001, * implies a p-value of 0.03, and no star implies a non-significant result.
NE means the selection coefficient was set to 0.
NA means the selection coefficient does not exist in the specified model.
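
The model comparison in Table 1 can be checked with a short calculation. This is a minimal sketch under stated assumptions, not the authors' procedure. The parameter counts (0 to 4) are read from the model names. The AIC is taken as 2k - 2 ln L. The p-value uses a chi-square approximation with one degree of freedom for each added parameter, which the source does not confirm as the test actually applied. All function and variable names are hypothetical.

```python
import math

# (model name, number of free parameters, maximized log likelihood) as listed in Table 1.
MODELS = [
    ("Neutral", 0, -282.15),
    ("1-parameter", 1, -165.57),
    ("2-parameter", 2, -147.85),
    ("3-parameter", 3, -145.62),
    ("4-parameter", 4, -145.41),
]


def aic(k, log_lik):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def lrt_pvalue_1df(log_lik_null, log_lik_alt):
    """Likelihood ratio test p-value for one extra parameter (chi-square, 1 d.f.).

    Assumption: the asymptotic chi-square approximation is adequate here.
    """
    stat = 2 * (log_lik_alt - log_lik_null)
    # For 1 d.f., the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(max(stat, 0.0) / 2))


def compare_nested(models):
    """Return (name, AIC, p-value versus the preceding model) for each model."""
    rows = []
    for i, (name, k, log_lik) in enumerate(models):
        p = lrt_pvalue_1df(models[i - 1][2], log_lik) if i > 0 else None
        rows.append((name, round(aic(k, log_lik), 2), p))
    return rows
```

Under these assumptions the AIC column of Table 1 is reproduced for the neutral through 4-parameter models, and the 3-parameter versus 2-parameter comparison gives a p-value between 0.03 and 0.04, in line with the single star in the table. The 3-parameter model has the lowest AIC of the models listed.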

Table 2. Fit of various rev/tm selection models for parsimonious reconstruction of mutations^a.

Model         sv                st                sr                str     Log likelihood  AIC
Neutral       NE                NE                NE                NE      -305.85         611.70
1-parameter   0.94 (0.90,0.96)  NE                NE                NE      -212.10**       426.20
2-parameter   0.93 (0.89,0.97)          0.32 (0.04,0.52)            NE      -209.69*        423.38
4-parameter   0.93              0.26              0.36              0.49    -209.27         426.54
Unrestricted  NA                NA                NA                NA      -208.94         431.88

^a See notes for Table 1.

FIGURE LEGENDS

Figure 1. Genetic variation in EIAV rev in vivo. (A) Frequency of non-consensus amino acids
in Rev exon 2, relative to the founder variant, R1. (B) Frequency of individual amino acids
observed at the nine positions with frequency of non-consensus amino acids greater than 0.10.
The first amino acid shown is the consensus of the inoculum.
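
The quantities plotted in Figure 1 can be computed from an amino acid alignment along the following lines. This is an illustrative sketch only: the function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and the 0.10 threshold is the one quoted in the legend.

```python
from collections import Counter


def nonconsensus_frequency(reference, sequences):
    """Fraction of sequences that differ from `reference` at each aligned position."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    if any(len(s) != len(reference) for s in sequences):
        raise ValueError("all sequences must be aligned to the reference length")
    n = len(sequences)
    return [sum(s[i] != ref_aa for s in sequences) / n
            for i, ref_aa in enumerate(reference)]


def residue_counts(sequences, position):
    """Count the amino acids observed at one aligned position (0-based)."""
    return Counter(s[position] for s in sequences)


def variable_positions(reference, sequences, threshold=0.10):
    """0-based positions whose non-consensus frequency exceeds `threshold`."""
    freqs = nonconsensus_frequency(reference, sequences)
    return [i for i, f in enumerate(freqs) if f > threshold]
```

Here `nonconsensus_frequency` corresponds to the per-position values of panel A, and `residue_counts` applied to each position returned by `variable_positions` corresponds to the per-residue breakdown of panel B.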

Figure 2. The effect of high frequency mutations on Rev nuclear export activity. (A) Amino acid
sequence of Rev exon 2 showing the location of single high frequency amino acid changes introduced
into the backbone of R1 cDNA. The functional domains required for Rev activity are boxed and
include the nuclear export signal (a.a. 31-55) and the RNA binding/nuclear localization signal
(RRDRW and KRRRK). The shaded area indicates a region not essential for Rev nuclear export
activity (Lee et al., 2006). (B) Rev nuclear export activity of single amino acid mutants. Results
are expressed relative to the consensus of the inoculum, R1, and represent the mean activity of at
least six independent transfections, ± standard error. Variants that differed significantly from the
activity of R1 are indicated by asterisks, *p<0.05; **p<0.005; ***p<0.0005.
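
The summary statistics named in this legend (activity relative to R1, mean ± standard error, and asterisk codes) can be sketched as follows. The normalization shown, dividing each replicate by the mean of the R1 replicates, is an assumption, and the statistical test that produced the p-values is not specified here, so the sketch only maps a given p-value to its asterisk code. All names are hypothetical.

```python
import math
import statistics


def mean_and_se(values):
    """Mean and standard error of the mean for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def relative_activity(variant, reference):
    """Express variant replicates relative to the mean of the reference replicates."""
    ref_mean = statistics.mean(reference)
    return [v / ref_mean for v in variant]


def stars(p):
    """Map a p-value to the asterisk code used in the legend."""
    for cutoff, label in ((0.0005, "***"), (0.005, "**"), (0.05, "*")):
        if p < cutoff:
            return label
    return ""
```

For a variant measured in six or more transfections, `mean_and_se(relative_activity(variant, r1))` would give the bar height and error bar of panel B under the stated normalization assumption.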

Figure 3. Genetic and phenotypic variation in Rev over time. (A) Cumulative fixation of high
frequency Rev mutations based on the inferred ancestry of Rev variants observed in vivo. (B)
Phenotype of Rev evolution mutants relative to R1. Variants that differed significantly from the
activity of R1 are indicated, with p values represented by (*) p < 0.05, (**) p < 0.005, (***) p <
0.0005.