Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Viral phylodynamics wikipedia , lookup
Public health genomics wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Human genome wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Computational phylogenetics wikipedia , lookup
Point mutation wikipedia , lookup
JOurnal o f General Virology (1996), 77, 783-792. 783 Printed in Great Britain Consistent risk group-associated differences in human immunodeficiency virus type 1 vpr, vpu and V3 sequences despite independent evolution Carla L. Kuiken, 1. Marion T. E. Cornelissen, ~ Fokla Zorgdrager, 1 Susan Hartman, 1 Adrian J. Gibbs 2 and Jaap Goudsmit I 1 Human Retrovirus Laboratory, Department o f Human Retrovirology, University o f Amsterdam, Meibergdreef 15, 1105 A Z Amsterdam, The Netherlands and ~Research School o f Biological Sciences, Australian National University, Canberra, Australia Human immunodeficiency virus type 1 vpr, vpu and V3 sequences from 15 homosexual men and 19 intravenous drug users in the Amsterdam Cohort studies were analysed. Previously, we reported that V3 domains of viruses from drug users are distinguishable from those of homosexual men on the basis of two silent mutations. Phylogenetic analysis of vpr, vpu and V3 shows that differences in all three regions correlate with risk group. Two positions in both vpr and vpu were found to differ significantly between the risk groups. The distinguishing positions were confirmed for sequences from 11 Scottish and four German samples. The three regions show relatively independent evolution patterns; they resulted in different phylogenies, the only stable clustering being that based on the risk group distinction. Pairwise differences between sequences of the genes were moderately correlated (around 0.30). Surprisingly, when only silent changes were counted, the correlations dropped almost to zero, indicating that the evolution towards independence was more advanced in the silent than in the non-silent positions. This suggests that selection at the amino acid level is not the primary driving force for the independent evolutionary behaviour of the genes. Recombination, combined with restrictions on certain amino acids because of epistatic interactions between the genes, could be an alternative explanation of this phenomenon. Introduction arisen from a duplication of the vpr gene. Large quantities of Vpr protein are packaged in the virion, suggesting that it may have a role in the early stages of infection, before integration of the DNA (Yuan et al., 1990). Several studies have shown that the protein is involved in virus production in macrophages, but not (or less pronounced) in T cells (Balliet et al., 1994; Ogawa et al., 1989; Cohen et al., 1990; Yuan et aI., 1990; Connor et al., 1995; Balotta et al., 1994). Vpr has the same function in HIV-2 (Hattori et al., 1990). Recent evidence indicates that it has a role in the regulation of latent infection (Levy et al., 1994) through the control of activation of both CD4 T cells and macrophages (Levy et al., 1993). The vpu gene has no known homologue in HIV-2 and SIV. Vpu protein is synthesized in infected cells, but not packaged in the virion (Strebel et at., 1989). Available evidence indicates that Vpu is mainly associated with particle maturation and budding (Strebel et al., 1989; Klimkait et al., 1990; Terwilliger et al., 1989). It has also been shown that independent of this mechanism, Vpu is involved in CD4 degradation (Willey et at., 1994; Chen et al., 1993; Raja et at., 1994; Buonocore et al., 1994), which in turn affects HIV-1 particle production because vpr and ~7~u are two of the accessory genes of human imnmnodeficiency virus type 1 (HIV-1). They are located just 5' of the envelope gene, with vpu partially overlapping it. The proteins are dispensable in vitro, since laboratory strains can be defective for either gene (Dedera et al., 1989; Cohen et al., 1988). However, in vivo both are usually intact, and probably have a role in the virus life cycle (Westervelt et aL, 1992; Klimkait et al., 1990). The vpr gene encodes a 96 amino acid protein. It is also found in HIV-2 and some of the simian immunodeficiency virus (SIV) isolates known. Moreover, similarities found between the vpr and vpx genes in the HIV-SIV group suggest that Wx may actually have * Author for correspondence. Fax +31 20 691 6531. e-mail [email protected] The sequence data reported in this paper have been deposited in GenBank under accession numbers Z29280, Z29296, Z29298, Z29299, Z29302-Z29304, Z29306, Z29307, Z29310-Z29312, Z29315, Z29319-Z29324, Z68033, Z68047, Z68062, Z68070, Z68074, Z68086, Z68508-Z68616 and Z68687-Z68693. 0001-3630 © 1996 SGM Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 784 C. L. Kuiken and others CD4 molecules capture gp160 envelope precursors on their way to the cell surface. A study of V3 sequences from a group of HIV-1 infected people in the Amsterdam Cohorts of homosexuals and intravenous drug users has revealed that the virus variants that were found in the homosexual men and haemophiliacs could be distinguished by two mutations from variants found in drug users (Kuiken et al., 1993). These mutations were both silent, and one of them was found with very high consistency. Subsequently, in a collection of over 100 homosexual seroconversion sequences the 'drug user variant' was found only once (Kuiken et aI., 1996). In a second study, using only published V3 sequences from homosexuals and drug users (and their sexual partners) from other countries in Europe, and judging by the most conserved distinctive mutation drug users and homosexuals from Germany (Hamburg) and the United Kingdom (Scotland: Edinburgh and surrounding regions) could also be reliably distinguished on the basis of the V3 sequences. Sequences obtained from samples from Italian drug users indicated that the epidemic there consists of a mixture of the two virus variants (Kuiken & Goudsmit, 1994). As the most consistent mutations were silent, we assumed that the distinction was epidemiological in origin, and represented a founder effect: the drug user population appears to have become infected by one accidental variant, which differs from the other variants by a number of mutations. To further investigate this hypothesis, we have sequenced other regions of the HIV1 genome. In this paper we report the results of an analysis of the vpr and vpu sequences fi'om 34 HIV-1 infected participants in the Amsterdam Cohorts of homosexuals and drug users, supplemented with those from 11 Scottish and four German samples. The data show that the distinction between the risk groups is present in both vpr and vpu. Methods Origins ofsera. For sequencing of vpr, vpu and V3, serum samples were obtained from 15 homosexual men and 19 intravenous drug users, all participants in the Amsterdam Cohorts (Van den Hock et al., 1988; De Wolf et al., 1987). All drug user samples were obtained 3 months to 1 year after seroconversion, and form a random selection from all participants who seroconverted between 1986 and 1991, Samples from homosexuals were either seroconversion samples (seroconversion dates between 1985 and 1988) or the first seropositive sample for those who were already HIV infected when they entered the Cohort study (sampling date 1984 or 1985). In addition, four German samples [two from homosexuals, two from drug users; a subset of those previously discussed in Kuiken & Goudsmit (1994)] and 11 Scottish samples [five from homosexuals and six drug users and sexual contacts of these; unrelated to the samples discussed in Kuiken & Goudsmit (1994)] were used. R N A isolation and sequencing. Genomic RNA was isolated from 100 gl of sermn or plasma according to Boom et aL (1990). The RNA was converted to cDNA, which was subjected to a nested PCR. For V3, the method was as described previously (De Wolf et al., 1994). Primers used for vpr were 5'VPR-1 (5' GATCTCTACAITACTTGGCACT 3') (I = inosine) and YVPR-4 (5' CTTCTTCCTGCCATAGGAGATGCC Y), final MgCI~ concentration 2-4 m~, for the first PCR, and 5'VPR-2-Sp6 (5" ATTTAGGTGACACTATAGCACCTTTGCCTAGTGTTAIGAA 3') and YVPR-3-T7 (51 TAATACGACTCACTATAGGGAAAGCAACACTTTTTACAATAGCA 3'), both at a final MgClz concentration of 2.4 mM, for the second PCR. For vpu the primer sets were 5'VPU-1 (5' GCATCTCCTATGGCAGGAAGAAG 3') and YVPU-4 (5' ATATGCTTTAGCATCTGATGCACAAAATA 3') for the first PCR (final MgCI~ concentration 3.0 mM) and 5'VPU-2Sp6 (5' ATTTAGGTGACACTATAGTCATCAAGITICTCTAICAAAGC Y) and YVPU-3-T7 (51 TAATACGACTCACTATAGGGCCATAAT1ACIGTGACCCACA Y) (final MgCI 2 concentration 3-5 mM) for the second PCR. In all cases, sequences were rejected if they contained more than 1% heterogeneous or illegible positions after two sequencing attempts. A minimum of six clones was then made by cloning the PCR product in an AT cloning system. In three Scottish samples, two subpopulations were clearly present; in these three cases, one subpopulation was randomly selected after it was established that the relevant nucleotide positions were the same in both populations. Sequencing was done using an automated sequencer (.Applied Bitsystems). Sequence analysis. Alignment was done by eye, keeping codons intact. Phylogenetic analysis was done using the program MEGA (Kumar et al., 1994). The neighbour-joining method (Saitou & Nei, 1987) was used to construct phylogenetic trees based on distances calculated using the Kimura 2-parameter estimation method (Kimura, 1980). The variability within sets of sequences was calculated using Hamming distances between all pairs of sequences (Hamming, 1986), expressed as percentages of the total sequence length (including gaps). These were averaged over all sequences in a set to calculate the variation. Pearson correlation coefficients between the Hamming distances on two genes/regions were used as a measure of co-variation. Positions containing gaps or ambiguities were excluded from all analyses. Additional multivariate analysis of the sequences was done using the program PCOORD (Higgins, 1992). Numbers of silent and non-silent changes were calculated according to Nei & Gojobori (1986) using MEGA. For these calculations, the vpr and vpu sequences were truncated to span only the open reading frames of the genes. Assessing the strength o f risk group association. To evaluate the strength of the association between the positions and the risk groups, the following method was used. For each position in the sequences, a pairwise comparison matrix was constructed consisting of zeros and ones, indicating for each pair of sequences whether they contained identical or different nucleotides at that position (position matrix). A second matrix contained a similar same/different index indicating whether each pair of sequences came from the same or a different risk group (risk group matrix). A Pearson correlation coefficient was then calculated between corresponding cells of the position matrix and the risk group matrix, yielding a position-wise indication of the association between sequence differences and group membership. The correlation coefficient, in the present case, was calculated on the pairwise differences, which are not statistically independent. Although this does not alter its interpretation, the associated significance test loses its vMidity. Instead, a simulation method was employed to assess Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 Risk group differences in HIV-1 vpr and vpu the correlation that one can expect to find accidentally, when no real association is present. The sequences were randomly divided into two groups 100 times, and for all positions in all three regions the correlation was calculated; these never rose above 0.33. Therefore actual correlations larger than 0.33 were considered significant. The descriptive level for the resulting criterion is difficult to calculate due to dependencies in the data, but the criterion is considerably stricter than 1%. Because the criterion is so conservative, difficulties due to multiple testing are avoided. the groups was seen (drug user: mean 6.4%, s o 1.7; homosexuals: mean 6.0%, so 1.5). For the drug user group the V3 variation was 8.6 % (SD 2"3) versus 6"5 % (SD 1"9) in the homosexual group. In the drug user group, there were two samples that contained very similar virus variants (I3072 and I5038 : < 1% difference in all genes). An epidemiological link between these persons could not be established, but the samples may have been a few transmissions apart. Ratios of silent to non-silent changes (calculated per number of silent/nonsilent sites) were 1.12 for V3, 3"3 for vpr and 1"8 for vpu. N o n e of the sequences contained any mutations that generated stop codons. To investigate the evolutionary differences between the genes, we looked at the extent to which differences between viruses in one gene or region were associated Results Independent variation o f the genes In both vpr and vpu, variation was lower than in V3. In vpr, the variation in the drug user group (mean 3"5 %, SD 1"3) was slightly lower than that in the homosexual group (mean 4.3 %, SD 1"3). In vpu, no difference between 3< (a) 17 16 .I0033 ,I3077 .Ii164 ,-I0142 ,I0090 .I5032 Ii035 • Ii035 .Ii083 3 39= -I3008 27 14 -" 15038 ,I3007 .I7025 ,I3064 --Jl ,I5037 I0221 H0617 92[ H0569 H0157 H0337 H0583 9911 .......13057-09 ,I0070-07 H0172 H0230 H0186 H0211 H0537 36[ HI145 19 H0008 H0411 18~ ' 785 H0140 471 H0709 Fig. 1 (a). For legend see page 787. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 786 C. L. Kuiken and others (b) .13072 38 15038 . . . . •I 0 2 2 1 17025 ~ I 0 1 4 1 ,I009( 3 I0070-07 2O ,'~I0142 .I3008 1 . 3] . . . 15032 . , , ,I3007 IA ~ 1 5 0 3 7 34 ¸ ',I0033 ,11035 ,II083 ,Ii164 ,I3064 ]J 2vii ,':,I3077 HI145 .I3057-09 H0537 H0230 j H0008 3,5 H0617 H0411 H0140 H0186 18 l[ ...... H 0 5 8 3 .... 3 , H0569 H0157 52[ H0172 H0709 HO21i 581 H0337 F i g . 1 (b). F o r l e g e n d s e e f a c i n g p a g e , with differences in another. This co-variation was calculated by correlating the numbers of pairwise differences between the sequences of one gene with those between the corresponding sequences of the other. The genes do not co-vary strongly: the Pearson correlation coefficient between the pairwise distances in vpr and vpu was 0-36; between vpu and V3, 0'31 ; and between vpr and V3, 0.32. Correlations between the numbers of silent changes were 0.05 (vpr and vpu), 0.20 (vpr and V3) and 0-07 (vpu and V3). These numbers are much lower, which is surprising because the numbers of silent changes presumably reflect the evolutionary distances more accurately, being less subject to various pressures. [However, silent changes are not always selectively neutral (e.g. Coffin, 1992).] Absolute numbers of silent changes were large enough for meaningful comparison (vpr: median 4, range 0-11 ; vpu: median 3, range 0-9; V3: median4.5, range 0--13.5). The significance of the Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 Risk group differences in HIV-1 vpr and vpu (c) 787 .I3008 .I0142 ,I5037 .I5032 ~17025 -I3007 Ii035 ,I0221 ,I0141 05 ]~'~I5038 I00]L---------~13072 I0033 ....I3064 ,I3077 10090 .Ii164 .Ii083 96]. ~I 301' ' ,I3057 ....~I0070 . H0211 H0569 2~ L~39 H0157 H0583 H0008 ,,, H0617 02 H0186 ,' HI145 H0230 H0172 H0337 H0140 H0411 H0537 H0709 Fig. 1. Phylogenetic trees based on the vpr (a), ~7'u (b) and V3 (e) regions. Double lines indicate branches leading to HIV-1 sequences from intravenous drug users, single lines those leading to HIV-1 sequences from homosexual men. difference between the correlations of the numbers of silent changes could not be directly tested, however, because again they are based on pairwise comparisons. A very conservative estimate (using Fisher's r-to-Z transformation with N set equal to the number of sequences) indicates that the 95 % confidence interval around the 0-20 correlation has a width of 0-24, making the difference between 0-20 and 0-05 non-significant. This phenomenon of relatively independent variation is also reflected in the phylogenetic analyses (Fig. 1). Analyses based on the three genes yielded different phylogenies. Only the I3072/I5038 pair consistently clustered together. Several small clusters with moderate to high bootstrap values (> 0.60) formed in phylogenetic analyses based on all three genes [e.g. H0008/H0617 (0-66) in ~Tu, and H0569/H0617 (0.92) and 10090/15032 (0-96) in vpr]. Some of these clustered pairs appeared in more than one of the trees, but none was present in all (except the related viruses mentioned above). Bootstrap values in the analyses based on individual genes were generally low. This instability was reflected in an analysis based on all three genes together, which yielded a tree with three stable groupings (not shown): one with the related drug user sequences (100%), one with the two drug user sequences with homosexual characteristics (see below, 100 %), and the separation between the two risk groups (86 %). Two drug users who had sequences with 'homosexual' characteristics formed a separate subcluster within the homosexual cluster in all trees except the one based on vpu sequences. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 788 C. L. Kuiken and others relative position ......... i0 [ ......... 20 [ ......... 30 [ ......... 40 [ ......... 50 [ ......... 60 [ ......... 70 [ ......... 80 ~ ......... 90 [ vpr vpr homo IVDU v p r -> <- v l f ACTGACAGAGGATAGATGGAACAAGCCCCAGAAGACCAAGGGCCACAGAGGGAGCCATACA%ATGAATGGACACTAGAG~Fz-i-i~AGAGGAG ............................................... A ..................... G .................... T/A vpr vpr homo IVDU CTTAAGAATGAAG CTG~AGACA%~FF~CCTAGGATATGGCTCCATAGCTTAGGGCAACATATCTATGAAACTT~ATACTTGGGCA .......................................................................................... IS0 vpr vpr homo IVDU GGAGTGGAAG C CATAATAAGAATTCTG CAACAACTG CTGTTTATTCA TTTCAGAATTGGGTGTCGACATAGCAG~TAGGCATTACTCAA ................................................................ A ......................... 270 vpr vpr homo IVDU t a t -> <- v p r CAGAGGAGAG CAAGAAATGGAGCCAGTAGATCCTAGACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAGACTGCTT~TACCAC -G ..................................................................................... R/e 357 Q/R relative positio. . . . . . . . . . i0 I ......... 20 t ......... 30 I ......... 40 1 ......... 50 I ......... 60 r ......... 70 t ......... 80 L ......... 90 I vpu vpu homo IVDU vpu - > TAAGTAGTACATGTAATGCAAT~-i-i~i~%CAAATATTAGCAATAGTAGCATTAGTAGTAGCAACAATAATAGCAATAGTTGTGTGGTCCATA .......................................................................................... vpu vpu homo IVDU env G T A T T C A ~ A G A C A G G T T A A T T G A T A G A A T A A G A G A A A G A G C A G A A G A C A G T G G C A A T .......................................................................................... vpu vpu homo IVDU <- V p U GAGAGTGAAGGAGATCAGGAAGAATTATCAGCACTTGTGGAGA~CA~CATGCTCCTTGGGATATTGATGATCTGTAGTGCTGCAGAAAAATT ........ C ................................ A ........................... A .......... A ............ E/D ......... i0 I ......... 180 R/K 20 [ ......... 30 ~ ......... * 40 I ......... S/N* 50 { ......... 60 I ......... 70 [ ......... 80 [ ......... K/N* 90 ] V3 h o m o V3 IVDU GTAGTAATTAGATCTGACAATTTCA CGGA CAATG CTAAAA CCATAATAGTACAG CTGAATCAATCTGTAGAAATTAATTGTACAAGACCC .............. C ......................... T .................. C .............................. r/l K/N V3 homo V3 IVDU A~CAACAATACAAGAAAAAGTATACATATAGGACCAGGGAGAGCA~F~-~~GGAGAAATAATAGGAGATATAAGACAAGCACAT ...................................... C ................................................... V3 V3 TGTAAC CTTAGTAGAG CAAAATGGAATAACACTTTAAAACAGATAGTTAAAAAATTAAGAGAACAA~AATAAAACAATAGTCTTTAATCAA ........................... G ..................... T .............................................. N/D K/I homo IVDU 276 T-- DIN K/T* relative position -> 180 276 Fig. 2. vpr, ~Tu and V3 regions (based on the consensus sequence) depicting the mutations characteristic of homosexual and drug user virus populations. Letters underneath the alignments indicate amino acid differences that result from the mutations: in X/Y, X indicates the amino acid associatedwith the homosexualvariant, Y the one in the drug user group. If no translation is given, the change is silent. In the vpu sequence, translations based on both Epuand env reading frames are shown. The numberingof the regions is relative to the starting point of the sequences used in this study. * Amino acid changes in the Env reading frame. Risk-group related differences between the sequences Phylogenetic analysis clearly separated the two risk groups in trees based on each of three regions (Fig. 1). The drug user group contained two samples that yielded virus which did not conform to the risk group classification (I3057 and I0070). These two samples mostly clustered together and were grouped with the homosexual samples. The two groups could also be distinguished on the basis o f each region using principal coordinate analysis (Higgins, 1992), again suggesting that some positions differed consistently between the two groups. To evaluate the strength of the association for each position, Pearson correlations were used (see Methods). As noted previously for V3, associations varied from almost perfect to very weak. In Table 1, all correlations over 0-2 are included. In the Dutch group, correlations that are significant according to the criterion defined above are marked with an asterisk. In both vpr, vpu and V3, two positions were found to be significantly associated with risk group. In V3, both changes were silent. In ~7~r, one of the two changes was non-silent (Q/R). In vpu, both significant changes were silent in the vpu reading frame (one was located in the stop codon). They are both located in the section that also encodes Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 Risk group differences in HIV-1 vpr and vpu Table 1. Pearson correlations between nucleotides present at each position and risk group membership for selected positions in vpr, vpu and V3 Analysis of Dutch vpr and vpu genes was based on a smaller sample of 19 intravenous drug users and 15 homosexuals. Foreign samples consist of four German (two homosexual, two drug users) and 11 Scottish samples (fivehomosexuals and six drug users and sexual contacts of these). Position Dutch samples (n = 34) Foreignsamples (n = 15) V3-15 V3-41 V3-60 V3-129 V3-208 V3-230 vpr-48 vpr-70 vpr-245 vpr-272 vpu-189 vpu-222 vpu-250 vpu-261 vpu-274 0.36* 0.33 0.33 0.73* 0.33 0.27 0.77* 0.20 0"33 0'77* 0"30 0"71* 0"33 0"48* 0"30 0.36 0.30 0.22 0.74 0"15 0.22 0.69 0"33 0-35 0.76 0"37 0"48 O"17 0"39 0"35 * Significant correlation (see Results). Env; in the env reading frame, both changes were nonsilent (vpu-222, R / K ; vpu-261, S/N). Other northern European samples Previously, we reported that samples from G e r m a n y and Scotland showed the same risk group-related distinction as the Dutch ones. vpr and vpu sequences from 11 Scottish samples (five homosexuals and six drug users) and four G e r m a n samples (two homosexuals, two drug users) were analysed to assess whether the same pattern was also found in these genes. One sample from a Scottish homosexual was negative for vpr despite repeated isolation attempts, so for vpr only four Scottish homosexual samples were available. Two Scottish samples from heterosexual women were assigned to the drug user group, because the Scottish epidemic is most severe a m o n g drug users (Brettle et al., 1987; Peutherer et al., 1985), and because in northern Europe the largest risk factor for women who do not use intravenous drugs is sexual contact with a drug user. Both samples yielded a V3 sequence that was unmistakably of the drug user type. Judging by the two significant positions in the V3 region, one Scottish drug user sample (IUK60), from a male drug user, yielded a sequence with clear homosexual characteristics. All other samples fell into the correct group based on V3 position 129 and vpr position 272. For vpr-48, one homosexual sample deviated from the expected pattern; for V3-15, two drug user samples had a different nucleotide than 789 expected. Consistency in vpu was lower: for both positions (222 and 261), three of seven homosexual samples adhered to the Dutch pattern. All drug user samples except I U K 6 0 showed the expected pattern in vpu-222, and five of these did so in vpu-261. Discussion The finding that the intravenous drug user population in The Netherlands appeared to be infected with an H I V variant that is distinct from the one found in homosexuals and haemophilia patients presumably reflects an epidemiological founder effect: the ancestral virus infecting the first drug user victim of the H I V epidemic had several unique mutations which, being selectively neutral or nearly so, remained present in the quasispecies descending from it. The two mutations found in V3 were both silent, and therefore presumably have little or no effect on the fitness of the virus. Furthermore, haemophilia patients without exception carried the ' h o m o sexual' variant, which, for the Netherlands (where blood donations are given voluntarily and without financial compensation) is compatible with this epidemiological model. Later on, the same distinction was found to be present in other European countries as well, namely G e r m a n y and Scotland (with high consistency) and Italy (where four out of nine drug user samples contained the ' h o m o s e x u a l ' variant) (Kuiken & Goudsmit, 1994). It is possible that all these drug user infections are epidemiologically related. The V3-129 mutation is found in only four out of 47 sequences listed in the Los Alamos compendium alignment (Myers et al., 1994), all of these from the United States: HIV-1 BAL, a macrophage-tropic isolate from a New Y o r k patient, and three from intravenous drug users from Baltimore (HIVUS712, -715 and -716). These four exclusively also have vpu-261, and three have V3-15. The mutation in vpu-222, however, is found in 13 of the sequences listed in the alignment, vpr sequences are not available for these isolates. In a set of 30 V3 sequences from the United States (NIAID), three out of four drug user sequences had the V3-129 mutation, as well as two heterosexual transmission cases from Puerto Rico. Other anecdotal evidence suggests that the Dutch drug user variant is also associated with drug use in the United States, though less consistently than in mid-Europe: based on V3-129, it is present in three of 29 sequences from control subjects in the Florida dentist case, one of these from a drug user (Ciesielsky et al., 1992); it is found in the only drug user in a set of three index persons from New Y o r k City and their partners (Zhu et al., 1993), and again in a drug user in a set of four United States patients studied by Haigwood et al. (1993). In a set of 11 samples Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 790 C. L. Kuiken and others from Baltimore drug users, five adhered to this pattern (V. V. Lukashov and others, unpublished). The present study was undertaken to investigate whether a risk group-related distinction found in V3 could also be found in vpr and vpu. The results show that this distinction is clearly present in both these genes. This is consistent with a founder effect: accidental differences present in the variant that was the founder o f the epidemic in drug users would be scattered throughout the genome. Most of the positions found to discriminate between the risk groups in the Dutch dataset could be confirmed independently using sequences from a Scottish and a German sample set. A distinction between viruses obtained from drug users and homosexuals has been found in other studies as well. A recent study by Albert et al. (1994) showed a clear and consistent distinction in a set of reverse transcriptase sequences. Our analysis of these sequences showed that three positions in these reverse transcriptases are consistently different between the risk groups (not shown). Analysis of a small set of 10 Swedish V3 sequences suggests that, although again the sequences differ between risk groups, this distinction is not the same one that is found in the Dutch, Scottish and German viruses (not shown). A Madrid study showed a distinction to be present in gag as well as env (fusion domain) sequences (Rojas et al., 1994), but again, based on V3 sequences, this distinction seems to be different from the D u t c h / German/Scottish distinction (C. Ldpez-Galindez, personal communication). Combining all these results, we can conclude that it is likely that several different founder effects are present in the European HIV epidemic. The fact that the ' D u t c h distinction' is also present in part of the United States drug user population suggests that this particular variant may have originated from the United States, and then having crossed the Atlantic subsequently spread among Dutch, German and Scottish, and some Italian drug users. A sizeable proportion of Amsterdam drug users are German or Italian; a study among Edinburgh drug users showed that several of them had shared needles in Amsterdam (Brettle et at., 1987). The source of the homosexual infection could be a different one. At present, data are too scarce to draw more inferences about the origin of the other founder effects, or about the ways in which the virus has spread through Europe. The hypothesis that the northern European drug user variant is derived from a United States variant is being investigated at present. The three regions investigated here appear to have evolved more or less independently. The correlations between pairwise differences were small, and the only clustering that was reinforced when the sequences were analysed together was the homosexual/drug user split. This instability may be caused by the fact that the sequences are rather short. Also, phylogenetic trees in populations of closely related sequences typically have a star shape, with short internal branches and long external ones (Holmes et al., 1995); this results in very unstable trees, where removal of one or two sequences often results in a different structure. Although the data have to be interpreted with care due to the above considerations, alternative explanations can be suggested. One is that the genes are evolving in different directions because of different selection pressures. However, this model is hard to reconcile with the fact that the correlations between the pairwise distances drop almost to zero when they are based on only silent mutations. Although silent positions may also be under some selective pressure in coding regions (Coffin, 1992), it is hard to explain why they would be m o r e diverged than non-silent ones. This finding may reflect some form of epistatic interaction between the genes, whereby certain amino acids encoded by one gene necessitate or facilitate the presence of specific amino acids encoded by the other. Independent evolution in different genes has been reported before in sequences from one patient (Pedroza Martins et al., 1991 ; Delassus et al., 1991 ; Vartanian et al., 1991; Howell et al., 1991), and frequent recombination events have been suggested as an explanation. Increasingly, reports suggest that recombination occurs frequently in HIV-1 (Monken et al., 1995; Diaz et al,, 1995; Zhu et al., 1995; Robertson et aI., 1995). It is possible that the independent evolution found in the present study is a reflection of recombination, where the risk group distinction is preserved because the recombining viruses generally are from the same risk group and share the same genotype. In this light, one might expect that correlations between distances would be inversely proportional to the distance between the genes (and thus, to the probability of recombination). This does not appear to be the case in the present dataset, but more data are needed to draw this conclusion more firmly. We would like to express our gratitude to Dr Michael Schreiberof the BernhardNocht Institut ftir Tropenmedizinin Hamburg, Gemaany, and to Dr Andrew Leigh Brown of the University of F~nburgh, Scotland, for generouslydonating serum samples that were used for this study; to Dr Edward Holmes of the University of Oxford, for thoughtful comments; to Drs Jan Albert, Thomas Leitner and Cecilio Ldpez-Galindez, for sharing unpublished results; and to Drs Tuofu Zhu (Aaron Diamond Institute, New York) and Carol Ciesielsky (Centers for Disease Control, Atlanta) for unearthing the risk groups associated with published sequences. References ALBERT,J., WAHLBERG,J., LE1TNER,T., ESCANILLA,D. & UHLEN, M. (1994), Analysis of a rape case by direct sequencing of the human immunodeficiencyvirus type 1 pol and gag genes. Journal of Virology 68, 5918-5924, Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 R i s k group differences in H I V - 1 vpr and W u BALLIET,J. W., KOLSON,D. L., EIGER, G., KIM, F. M., MCGANN, K. A., SRIN1VASAN,A. & COLLMAN,R. (1994). Distinct effect in primary macrophages and lymphocytes of the human immunodeficiency virus type l accessory genes vpr, vpu and nef: mutational analysis of a primary HIV-I isolate. Virology 200, 623-631. BALOTTA,C., LuSSO, P., CROWLEY,R., GALLO,R. C. & FRANCHINI,G. (1994). Antisense phosphorothioate oligodeoxynucleotides targeted to the vpr gene inhibit human immunodeficiency virus type 1 replication in primary human macrophages. Journal of Virology 67, 4409-4414. BOOM, R., SOL, C.J.A., SALIMANS, M . M . M . , JANSEN, C. L., WERTHEIM-VANDILLEN, P. M. E. & VAN DER NOORDAA,J. (1990). Rapid and simple method for purification of nucleic acids. Journal of Clinical Microbiology" 28, 495-503. BRETTLE, R. P., BtSSET,K., BURNS, S., DAVIDSON,J., DAVIDSON,S. J., GRAY, J. M. N., INCLIS,J. M., LEES,J. S. & MOK, J. (1987). Human immunodeficiency virus and drug misuse: the Edinburgh experience. British Medical Journal 295, 421-424. BUONOCORE, L., TURI, T.G., CRISE, B. & ROSE, J.K. (1994). Stimulation of heterologous protein degradation by the Vpu protein of HIV-1 requires the transmembrane and cytoplasmic domains of CD4. Virology 204, 482-486. CHEN, M. Y., MALDARELLI,F., KARCZEWSKI,M. K., WILLEY,R. L. & STREBEL, K. (1993). Human immunodeficiency virus type 1 Vpu protein induces degradation of CD4 in vitro: the cytoplasmic domain of CD4 contributes to Vpu sensitivity. Journal of Virology 67, 3877-3884. CIESIELSKY, C . A , MARIANOS, D., OU, C . Y , DUMBAUGH, R., BERKELMAN,R., GOOCH,B., MYERS,G., LUO, C. C., SCHOCHETMAN, G., HOWELL, J., LASCH, A., BELL, K., ECONOMOU,N., SCOTT, B., FURMAN,L , CtmRAN, J. & JArEE, H. (1992). Transmission of human immunodeficiency virus in a dental practice. Annals of Internal Medicine 166, 798-805. COEEIN, J. M. (1992). Genetic diversity and evolution of retroviruses. Current Topics in Microbiology and Immunology 176, 143-164. COHEN, E. A., TERWILLIGER,E. F., SODROSKI,J, G. & HASELTINE,W. A. (1988). Identification of a protein encoded by the vpu gene of HIV-1. Nature 334, 532-534. COHEN,E. A., TERWILLIGER,E. F., JALINOOS,Y., PROULX,J., SODROSKI, J.G. 8¢ HASELTINq3, W.A. (1990). Identification of HIV-1 vpr product and function. Journal of Acquired Immune Deficiency Syndrome 3, 11-18. CONNOR, R. 1., KUANCHEN,B., CHOE,S. t~ LANDAU,N. R. (1995). Vpr is required for efficient regulation of human immunodeficiency virus type 1 in mononuclear phagocytes. Virology 206, 935-944. DEDERA,D., HU, W., VAN DER HEYDEN,N. & RATNER,L. (1989). Viral protein R of human immunodeficiency virus types 1 and 2 is dispensable for replication and cytopathogenicity in lymphoid cells. Journal of Virology 63, 3205-3208. DELASSUS,S., CHEYNIER,R. & WAIN-HoBsoN, S. (1991). Evolution of human immunodeficiency virus type 1 nef and long terminal repeat sequences over 4 years in vivo and in vitro. Journal of Virology 65, 1, 225-231. DE WOLF, F., GOLq)SMIT,J., PAUL,D. A., LANGE,J. M. A., HOOYKAAS, C., SCHELLEKENS,P. T. m., COUT1NHO,R. m. & VAN DER NOORDAA, J. (1987). Risk of AIDS related complex and AIDS in homosexual men with persistent HIV antigenaemia. British Medical Journal 295, 569-572. DE WOLF, F., HOGERVORST, E., GOUDSMIT, J., FENY6, E.-M., R/.)BSAMEN-WAIGMANN, H., HOLMES, H., GALVAO-CASTRO, B., KARITA, E., WASI, C., SEMPALA,S. D. K., BAAN, E., ZORGDRAGER, F., LUKASHOV,V. V., OSMANOV,S., KUIKEN,C. L., CORNELISSEN,M. T, E. 8¢ THE WHO NETWORK ON H1V-1 ISOLATION& CHARACTERIZATION (1994). Syncytium-inducing (SI) and non-syncytiuminducing (NSI) capacity of human immunodeficiency virus type 1 (HIV-1) subtypes other than B: phenotypic and genotypic characteristics. AIDS Research and Human Retroviruses 10, 1387-1400. DIAZ, R. S, SAmNO,E. C., MAYER,A., MOSL~Y,J. W. & BUSCH,M. P. (1995). Dual human immunodeficiency virus type 1 infection and recombination in a dually exposed transfusion recipient. Journal of Virology 69, 3273-3281. HAIGWOOD, N. L., MASON, O. B., YORK-HIGGINS,D , KORBER, B.T. 791 M., WE1SER, B. & BURGER, H. (1993). HIV-1 envelope sequence diversity in long-term infected asymptomatic patients. Vaccines" 233-242. HAMMING, R.W. (1986). Coding & h~brmation Theory, 2nd edn. Englewood Cliffs: Prentice Hall. HATTORI,N., MICHAELS,F., FARGNOLI,K., MARCON,L., GALLO,R. C. & FRANCHINI,G. (1990). The human immunodeficiency virus type 2 vpr gene is essential for productive infection of human macrophages. Proceedings of the National Academy of Sciences, USA 87, 8080-8084. HI,GINS, D. G. (1992). Sequence ordinations: a multivariate analysis approach to analysing large sequence data sets. CABIOS 8, 15-22. HOLMES, E. C., NEE, S., RAMBAUT,A. R., GARNE1~, G. P. & HARX,q~Y, P. H. (1995). Revealing the history of infectious disease epidemics through phylogenetic trees. Philosophical Trtmsactions of the Royal Society of London 349, 33~40. HOWELL, R. M., FITZGIBBON,J. E., NOE, M., REN, Z., GOCKE, D. J., SCHWARTZER, T.A. t~; DUBIN, D.T. (1991). In vivo sequence variation of the human immunodeficiency virus type 1 env gene: evidence for recombination among variants found in a single individual. AIDS Research and Human Retroviruses 7, 869-876. KIMUgA,M. (1980). A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16, 111-120. KLIMKAIT, T., STREBEL, K., HOGGAN, M.D., MARTIN, M.A. & ORENSTEIN,J. M. (1990). The human immunodeficiency virus type 1specific protein vpu is reqlaired for efficient virus maturation and release. ,Journal of Virology 64, 621 629. KU1KEN, C. L. • GOUDSMIT,J. (1994). Silent mutation pattern in V3 sequences distinguishes virus according to risk group in Europe. AIDS Research and Human Retroviruses 10, 31%320. KUIKEN, C.L., ZWART, G., BAAN, E., COUTINHO, R.A., VAN DEN HOEH, J. A. R. & GOUDSMIT, J. (1993). Increasing antigenic and genetic diversity of the HIV-1 V3 domain in the course of the AIDS epidemic. Proceedingsof the National Academy of Sciences, USA 90, 9061-906L KUIKEN, C. L., LUKASHOV,V. V., LEUNISSEN,J. A. M. & GOUDSMIT,J. (1996). Restricted intra-patient evolution of genomic RNA encoding the third variable (V3) domain of the HIV-1 envelope. AIDS 10, 31-37. KUMAR, S., TAMLq/A, K. & NEI, M. (1994). MEGA: molecular evolutionary genetics analysis software for microcomputers. CABIOS 10, 189-192. LEVY, D.N., FERNANDES,L.S., WILLIAMS,W.V. 8¢ WEINER, D.B. (1993). Induction of cell differentiation by human immunodeficiency type l vpr. Cell 72, 541-550. LEVY,D. N., REFAELI,Y., MACGREGOR,R. R. & WEINER,D. B. (1994). Serum vpr regulates productive infection and latency of human immunodeficiency virus type 1. Proceedingsof the National Academy of Sciences, USA 91, 10873-10877. MONKEN, C.E., WU, B. & SRINIVASAN,A. (1995). High resolution analysis of the HIV-1 quasispecies in the brain. AIDS 9, 345-349. MYERS, G., WAIN-HOBSON,S., HENDERSON,L. E., KORBER,B., JEANG, K.T. 8¢ PAVLAKIS, G. N. (1994). Human Retroviruses and AIDS 1994. Los Alamos, N. Mex.: Theoretical Biology and Biophysics Group. NEI, M. & GOJOBORI,T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution 3, 418-426. OGAWA, K., SHIBATA,R., KIYOMASU,T., HIGUCHI, I., KISHIDA, Y., ISHIMOTO, A. & ADACHI, A. (1989). Mutational analysis of the human immunodeficiency virus vpr open reading frame. Journal of Virology 63, 4110-4114. PEDROZA MARTINS,L., CHENCINER,N., fikSJO,B., MEYERHANS,A. & WAIN-HOBSON,S. (1991). Independent fluctuation of human immunodeficiency virus type 1 rev and gp41 quasispecies in FIFO. Journal of Virology 65, 4502-4507. PEUTHERER,J. F., EDMOND,E., SIMMONDS,P., DICKSON,J. D. & BATH, G. E. (1985). HTLV-III antibody in Edinburgh drug addicts. Lancet ii, 1129-1130. RmA, N. U., VINCENT,M. J. & JABBAR,M. A. (1994). Vpu-mediated proteolysis of gpl60/CD4 chimeric envelope glycoproteins in the Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11 792 C. L. Kuiken and others endoplasmic reticulum: requirement of both the anchor and cytoplasmic domain of CD4. Virology 204, 357-366. ROBERTSON, D. L , HAHN, B.H. & SHARP, P.M. (1995). Recombination in AIDS viruses. Journal of Molecular Evolution 40, 24%259. RO~AS, J.M., DOPAZO, J., NA3ERA, L, SANCHEZ-PALOMINO, S., OLIVARES,L, MARTIN,M. J., BERNAL,A., GARCIASAIZ, A., NAJERA, R. & LOPEZ-GALINDEZ,C. (1994). Molecular epidemiology of HIV1 in Madrid. Virus Research 31, 331-.34_?. SA:TOU, N. & Nt~I, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. MolecularBiology and Evolution 4, 406~25. STREBEL,K., KLIMKAtT,T., MALDARELLI,F. & MARTIN,M. A. (1989). Molecular and biochemical analyses of human immunodeficiency virus type 1 vpu protein. Journal of Virology 63, 3784-3791. TERWILrlGER, E.F., COHEN, E.A., LU, Y, SODROSKI, J.G. & HASELTIN~0 W.A. (1989). Functional role of human immunodeficiency virus type 1 vpu. Proceedings of the National Academy of Sciences, USA 86, 5163 5167. VANDENHOEK,J. A. R., COUTINHO,R. A., VAN HAASTRECHT,H. J. A., VAN ZADELHOFF,A. W. ~¢ GOUDSMIT,J. (1988), Prevalence and risk factors of HIV infections among drug users and drug-using prostitutes in Amsterdam. AIDS 2, 55-60. VARTANIAN, J.P., MEYERHANS, A., ]kSJ6, B. & WAIN-HOBSON, S. (1991). Selection, recombination and G + A hypermutation of human immunodeficiency virus type 1 genomes. Journal of Virology 65, 177%1788. VqESTERVELT,P., HENKEL, T , TROWBRIDGE,D. B., ORENSTEIN,J. M., HEUSER,J., GENDEL~aN,H. E. & RATNER,L. (1992). Dual regulation of silent and productive infection in monocytes by distinct hmnan immunodeficiency virus type 1 determinants. Journal of Virology 66, 39257931. WILLEY, R. L , BUCKLER-WroTE,A. & STREBEL,K. (1994). Sequences present in the cytoplasmic domain of CD4 are necessary and sufficient to confer sensitivity to the human immunodeficiency virus type 1 Vpu protein. Journal of Virology 68, 1207-1212. YUAN,X., MATSUDA,Z., MATSUDA,M., ESSEX,M. & LEE,T.-H. (1990). Human immunodeficiency virus vpr gene encodes a virion-associated protein. AIDS Research and Human Retroviruses 6, 1265-1271. ZHU,T., Mo, H., WANG, N , NAM, D. S., CAO, Y., KouP, R. A. & Ho, D. D. (1993). Genotypic and phenotypic characterization of HIV-1 in patients with primary infection. Science 261, 1179-1181. ZHU, T., WANG, N., CARR, A., WOLINSKY, S. & Ho, D. D. (1995). Evidence for coinfection by multiple strains of human immunodeficiency virus type 1 subtype B in an acute seroconvertor. Journal of Virology 69, 1324-1327. (Received 14 September 1995; Accepted 14 December 1995) Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Thu, 11 May 2017 04:30:11