Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Journal of General Virology (1999), 80, 717–725. Printed in Great Britain ................................................................................................................................................................................................................................................................................... Variation of hepatitis C virus following serial transmission : multiple mechanisms of diversification of the hypervariable region and evidence for convergent genome evolution Carmela Casino,1 Jane McAllister,1 Fiona Davidson,2 Joan Power,3 Emer Lawlor,3 Peng Lee Yap,2 Peter Simmonds1 and Donald B. Smith1 1 Department of Medical Microbiology, University of Edinburgh, Medical School, Teviot Place, Edinburgh EH8 9AG, UK Edinburgh and South East Scotland Blood Transfusion Service, Lauriston Place, Edinburgh EH3 9HB, UK 3 Irish Blood Transfusion Board, St Finbarr’s Hospital, Cork and Pelican House, Dublin, Ireland 2 We have studied the evolution of hepatitis C virus (HCV) from a common source following serial transmission from contaminated batches of anti-D immunoglobulin. Six secondary recipients were each infected with virus from identifiable primary recipients of HCV-contaminated anti-D immunoglobulin. Phylogenetic analysis of virus E1/E2 gene sequences [including the hypervariable region (HVR)] and part of NS5B confirmed their common origin, but failed to reproduce the known epidemiological relationships between pairs of viruses, probably because of the frequent occurrence of convergent substitutions at both synonymous and nonsynonymous sites. There was no evidence that the rate at which the HCV genome evolves is affected by transmission events. Three different mechanisms appear to have been involved in generating variation of the hypervariable region ; nucleotide substitution, insertion/deletion of nucleotide triplets at the E1/E2 boundary and insertion of a duplicated segment replacing almost the entire HVR. These observations have important implications for the phylogenetic analysis of HCV sequences from epidemiologically linked isolates. Introduction The ability of hepatitis C virus to establish a persistent infection following transmission to a new host is generally thought to be related to continued evolution of the virus envelope glycoproteins E1 and E2. Attention has focused particularly on a 25–30 amino acid hypervariable region (HVR) at the NH terminus of the E2 protein that induces the # production of specific antibodies that may be neutralizing in vivo (Farci et al., 1996) and may be capable of blocking virus attachment to cells in vitro (Zibert et al., 1995). Previous work has identified both invariant and variable positions within the HVR ; amino acid replacements at some variable sites are limited to a small number of residues with similar chemical Author for correspondence : Donald Smith. Fax j44 131 650 6531. e-mail Donald.B.Smith!ed.ac.uk The GenBank accession numbers for the nucleotide sequences reported here are AF119717–AF119768, AF056764–AF056785 and AF056795–AF056812. 0001-5972 # 1999 SGM properties, while at other sites almost every residue can occur (Sekiya et al., 1994 ; Sherman et al., 1996 ; McAllister et al., 1998). Detailed comparisons of the HVR sequences in different individuals infected from a common infectious source has provided evidence for the operation of selection against amino acid substitution at some positions, and positive selection for amino acid replacement at others (McAllister et al., 1998). However, most previous studies of the evolution of the HVR have been confined to longitudinal studies of acutely or persistently infected individuals, or in a few cases to individuals infected from an identifiable source (Weiner et al., 1993 ; Hohne et al., 1994 ; Esteban et al., 1996). To date, there is no information about the effect of multiple transmission events on virus evolution in humans, and it is possible that very different patterns of variation might be observed under these circumstances. For example, continued selection of antigenically distinct variants of the HVR might drive the virus population away from a fitness optimum, and subsequent transmission to a new host might result in selection for less divergent variants, as observed previously for human immunodeficiency virus (Zhang et al., 1993) and during experimental transmission of Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HBH C. Casino and others HCV in chimpanzees (Kojima et al., 1994). Transmission of virus to an immunologically naı$ ve host might also result in a slowing in the rate of virus evolution if immunological selection is a major force in driving virus evolution. These types of effect have implications for the study of the rate of virus evolution, the extrapolation of short-term rates of sequence change as a way of reconstructing the history of virus infections, and in the interpretation of phylogenetic relationships between viruses in different infected individuals where transmission is suspected. We have investigated these questions by studying virus E1\E2 sequences from six HCV-infected individuals for whom the source of infection could be traced to a different individual each of whom had received anti-D immunoglobulin in Ireland in 1977. One of the donors to the plasma pools used to manufacture batches of anti-D used in 1977 was acutely infected with HCV genotype 1b (Power et al., 1994). Retrospective studies have established that batches of this antiD were PCR-positive and infectious (Lawlor et al., 1999), and that virus from different anti-D recipients was of genotype 1b and closely related to that present in archived batches of antiD immunoglobulin and the single implicated donor (Power et al., 1995 ; Smith et al., 1997 b ; McAllister et al., 1998). By studying virus sequences in secondary recipients of these HCV-infected anti-D recipients we have therefore been able to investigate the effect of virus transmission on the rate and characteristics of virus evolution. Methods Sera. Plasma and serum samples were obtained in 1994 from HCVinfected individuals identified through targeted lookback by the Irish Blood Transfusion Service Board as having received blood from or being a member of the family of a woman who had received an infectious batch of anti-D immunoglobulin in 1977. All individuals were PCR-positive for the virus 5h non-coding region (5hNCR) and were infected with HCV subtype 1b as assessed by RFLP analysis of the 5hNCR (Davidson et al., 1995) and\or by phylogenetic analysis of the E1 and NS5B regions (Smith et al., 1997 b). PCR. Virus RNA was extracted from 0n1 ml of plasma or serum by incubation with proteinase K–polyadenylic acid–SDS as reported previously (Jarvis et al., 1994). Synthesis of cDNA was carried out using 5 µl extracted RNA and 10 units of avian myeloblastosis virus reverse transcriptase (Promega) in 20 µl buffer containing 50 mM Tris–HCl pH 8n0, 5 mM MgCl , 5 mM DTT, 50 mM KCl, 0n05 µg\µl BSA, 15 % # DMSO, 600 µM each of dGTP, dATP, dTTP and dCTP, 1n5 µM primer and 10 units RNasin (Promega). A fragment containing the COOH terminus of E1 and an NH terminal region of E2 was amplified using the # primers 2174 (5h TTCATCCAYGTRCASCCRAACCA 3h, antisense, positions 1986–2006 numbered from the AUG initiation codon) and 2173 (5h CAYCGNATGGCNTGGGAYATGATG 3h, sense, positions 1287–1310) for the first round of PCR, and primers 8914 (5h CGGGATCCGGGTGCTCACTGGGGAGTCCTGGCGGGC 3h, sense, positions 1389–1415 incorporating a BamHI site) and 2070 (5h GGAATTCGTGAARCARTACACYGGRCCRCANAC 3h, antisense, positions 1845–1870 incorporating an EcoRI site) for the second round of HBI PCR. Optimized conditions for PCR amplification of the HVR utilized 5 µl cDNA and 30 cycles, each consisting of 0n6 min at 94 mC, 0n7 min at 50 mC, 1n5 min at 72 mC and a final extension period of 6n5 min at 72 mC. Reactions were carried out with 0n4 units Taq DNA polymerase (Promega) in 50 µl buffer containing 50 mM KCl, 10 mM Tris–HCl pH 9n0, Triton X-100, 1n5 mM MgCl , 30 µM each of dGTP, dATP, # dTTP, dCTP and 0n25 µM of each of the outer primers. For the second round of PCR, 1 µl of PCR product was transferred to a fresh tube containing 100 µl reaction mix and subjected to 30 cycles of PCR as before. Sequence analysis. PCR products and the plasmid pUC18 were cleaved with both BamHI and EcoRI (Promega) and purified from 0n8 % low melting point agarose (Gibco BRL). Ligations were performed using 50 ng vector, 300 ng of PCR insert and 5 units of T4 DNA ligase (Promega), and were incubated at room temperature overnight. Ligation reactions were transformed into competent TG1 cells and plated on Luria agar plates containing 200 µg\ml ampicillin, 10 µg\ml X-Gal and 200 µM IPTG. Plasmid DNA was extracted from transformants producing white colonies by alkali denaturation and sequenced by standard methods using the Sequenase 2.0 enzyme (Amersham) and the primers DBS6 (5h CACTGGGGAGTCCTGGCGGGC 3h, positions 1395–1415), S6645 (5h TGCCARCTNCCRTTGGTRTT 3h, positions 1583–1603) and pUC reverse primer (5h CAGGAAACAGCTATGAC 3h). Nucleotide sequences were entered, aligned and checked using Simmonic 2000 (P. Simmonds). Phylogenetic analysis. Phylogenetic analysis was carried out using the Molecular Evolutionary Phylogenetic Analysis (MEGA) version 1.02 package (Kumar et al., 1993). Evolutionary distances were calculated using the Kimura 2-parameter method (all sites) or the Jukes–Cantor correction (synonymous sites). Mean distances between groups of sequences were computed from distance matrices using the program Dst strp (P. Simmonds). Results Study group Six different HCV-infected individuals (SR8, SR7, SR11, SR14, SR13 and SR6) were identified whose likely source of infection was from an individual (R1, R5, R10, R12, R15 and R21 respectively) who had received an infectious batch of antiD immunoglobulin in 1977 (Table 1). Three of the secondary recipients (SR6, SR7 and SR8) were infected following blood transfusion of blood from individuals subsequently identified as HCV-infected recipients of 1977 anti-D immunoglobulin. Two secondary recipients (SR11 and SR14) are children of HCV-infected recipients of 1977 anti-D who may have been infected before or shortly after birth. Some studies of vertical transmission of HCV have first detected HCV RNA 2–6 months after birth (Ohto et al., 1994 ; Zanetti et al., 1995), but there is also evidence that transmission can occur before birth (Weiner et al., 1993). Alternatively, it is possible that these transmissions occurred through parenteral exposure later in life. The remaining secondary recipient (SR13) is a relative of an anti-D recipient, and the timing and route of transmission are unknown. Serum samples obtained in 1994 were analysed for each of these individuals. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCV hypervariable region evolution Table 1. Primary and secondary recipients of anti-D immunoglobulin Secondary recipient Age in 1994 Sex Anti-D recipient SR7 SR8 SR11 SR13 SR14 82 70 11 42 17 M F M F F R5 R1 R10 R15 R12 SR6 42 F R21 Year of anti-D exposure 1977 1977 1977 1977 1977, 1978 1977 Fig. 1. Evolutionary distances between epidemiologically linked pairs of sequences. The mean Jukes–Cantor distance between virus E1/E2 sequences from secondary recipients and their respective implicated primary anti-D recipients (A) and between sequences from different primary anti-D recipients (B) are plotted on the y-axis. Convergent evolution of the E1/E2 genes Phylogenetic analysis of virus E1\E2 sequences revealed that all six secondary recipients were infected with an HCV of subtype 1b virus which was closely related to the virus present in infectious batches of anti-D immunoglobulin manufactured in 1977 and in primary recipients of these batches. However, a surprising finding was that virus sequences in secondary recipients were no more closely related to virus sequences from the anti-D recipient implicated as their source of infection than were two different recipients of anti-D immunoglobulin (Fig. 1). This finding is unexpected since virus sequences present in the infectious batch were relatively homogeneous (Smith et al., 1997 b ; McAllister et al., 1998) and so virus from a secondary recipient and an implicated primary recipient should share a more recent common ancestor than that shared by different primary recipients. Although the pair R5 and SR7 had a relatively low pairwise distance, consistent with the relatively recent time of transmission (1988), the pair R1 and SR8 with the most extreme pairwise distance had Means of transmission Transfusion Transfusion Vertical? Horizontal? Vertical? Time of transmission 1988 1988 1984? 1977 1978? Transfusion 1984 exactly the same time of transmission. Similarly, except for the pair R5 and SR7, phylogenetic analysis of sequences revealed no evidence for groupings within the anti-D recipient cohort between secondary recipients and their respective implicated anti-D recipients (data not shown). This was true even if the HVR was excluded from the analysis, or if comparisons were limited to nonsynonymous or synonymous sites (Fig. 2). This lack of grouping of epidemiologically linked sequences may be due to the frequent occurrence of convergent substitutions within the region studied. Within the 18 amino acid COOH-terminal region of E1 studied here there were four polymorphic sites where the same substitution was observed in different primary or secondary anti-D recipients. However, each of these substitutions sometimes occurred in a secondary recipient without being observed in the corresponding implicated primary anti-D recipient. Another three substitutions occurred in a single sequence from the primary and secondary recipient pairs described here, but occurred amongst published subtype 1b sequences, again consistent with convergent substitution. Finally, four substitutions were unique to the sequences reported here and may represent artefacts introduced during PCR amplification since this number of sporadic amino acid substitutions would be expected amongst cloned PCR products if the error rate of Taq DNA polymerase was 2i10−& per nucleotide copied, at the lower end of published estimates (Smith et al., 1997 a). There was also evidence for convergent evolution in the region of E2 following the HVR, since the majority of substitutions that segregated within an infected individual or that differed between an anti-D recipient and their associated secondary recipient also occurred amongst other anti-D recipients or amongst epidemiologically unrelated subtype 1b sequences. Of the 27 positions within this region that were variable amongst unrelated subtype 1b sequences, 23 were also polymorphic amongst the anti-D cohort, usually with the same amino acid replacements. Fifteen of these positions were discordant between the clones of a secondary recipient and those of the implicated primary anti-D recipient, but there was no consistent association between these polymorphisms in Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HBJ C. Casino and others Fig. 2. For legend see facing page. HCA Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCV hypervariable region evolution different recipients. None of these positions were polymorphic amongst 24 clones derived from virus isolated from archived batches of 1977 anti-D immunoglobulin (McAllister et al., 1998). Surprisingly, there was also evidence that many synonymous substitutions were convergent. In the region of E1\E2 flanking the HVR, six synonymous substitutions were confined to a particular anti-D recipient\secondary recipient pair, but at 51 positions the same synonymous substitution was observed in different anti-D recipients. Several of these substitutions occurred in virus sequences from a secondary recipient and one or more primary anti-D recipients, but not in virus from their implicated primary anti-D recipient. This segregation of substitutions is unlikely to be due to heterogeneity in the infectious source of anti-D since only three of the polymorphic positions were variable amongst cloned sequences from archived batches of 1977 anti-D immunoglobulin, and there was no association between polymorphisms in different anti-D recipients or secondary recipients. Variation of the HVR The consensus HVR sequences of the six secondary recipients differed from each other at an average of 11n9 amino acid positions, similar to the difference between different antiD recipients (mean 9n8) or between epidemiologically unrelated subtype 1b sequences (mean 13n2) (McAllister et al., 1998). Variation between the five pairs of primary and secondary recipients (Fig. 3) was less than this for all pairs (3, 6, 7 and 9 differences, mean 6n25) except for sequences from R1 and SR8 that differed at 17 out of 27 positions because of a genome rearrangement (see below). The extent of variability between different virus sequences within individual secondary recipients was limited to five or fewer positions in three individuals, but 14 or more positions were variable amongst the viruses cocirculating in the remaining three individuals (Fig. 3). This compares with the previous observation that only three out of seventeen primary anti-D recipients were infected with more than one HVR variant (McAllister et al., 1998). This difference is surprising given the shorter duration of infection in secondary recipients (mean of less than 10n1 years compared with 17 years for anti-D recipients). There was no simple relationship between the route of infection and the degree of variability within individual secondary recipients since only a single variant was observed in one (SR7) of the three individuals who became infected following a blood transfusion from an infected anti-D recipient and would have acquired a large quantity of virus at transmission (Fig. 3). No evidence for resetting In order to test the possibility that there is selection for reversion of substitutions upon transmission to a naı$ ve host, we compared the distance between virus sequences from secondary recipients with those infecting anti-D recipients. Viruses in these two groups are equally separated from their common ancestor present in infectious batches of anti-D immunoglobulin manufactured in 1977, but virus from secondary recipients has undergone an additional transmission event. There was no significant difference between primary and secondary recipients in the distance to virus sequences from archived batches of anti-D, regardless of whether or not the HVR was included in the analysis or whether amino acid, nucleotide or only synonymous differences were considered (data not shown). Similarly, for individual donor–recipient pairs, virus from secondary recipients was just as divergent from the 1977 sequence as was virus from the implicated antiD recipient. No difference in the extent of virus diversity within an individual was observed between primary and secondary recipients as groups or as individual pairs, or between the secondary recipients infected as a result of blood transfusion compared with those infected by other routes. Hence, there is no discernible effect of secondary transmission on the rate of virus evolution. Generation of diversity in the HVR through duplication Comparison of HVR sequences between anti-D recipients, their secondary recipients and sequences detected in infectious batches of anti-D immunoglobulin revealed evidence for three different types of mechanism by which diversity of the HVR was generated. First, individual anti-D recipients and secondary recipients each contained different HVR sequences (Fig. 3) but these sequences conformed to the restricted amino acid diversity previously documented for different HVR positions (Sekiya et al., 1994 ; Sherman et al., 1996 ; McAllister et al., 1998). Within infected individuals HVR amino acid sequences also differed between different clones at a small number of positions, or in some cases more drastically although this variation was generally less than that between variants infecting different individuals. The majority of these differences occurred in more than one clone, and so are likely to represent polymorphisms in the virus population. Other substitutions observed in only a single cloned sequence are likely to represent artefacts introduced during PCR amplification since their frequency was equivalent to an error rate of Taq DNA polymerase of 5i10−&, within the range of published estimates (Smith et al., 1997 a). Fig. 2. Phylogenetic tree of synonymous sites in the region flanking the HVR. Nucleotide distances at synonymous sites between E1/E2 sequences excluding the HVR were calculated using the Jukes–Cantor correction, and used to construct a neighbour-joining tree. Numbers indicate the bootstrap support (100 replicates) for particular groupings. Sequences from individuals with a known risk of transmission are linked by double lines. Sequences of four epidemiologically unrelated subtype 1b isolates were used as outgroups. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCB C. Casino and others Fig. 3. HVR amino acid sequences in anti-D recipients and paired secondary recipients. HVR sequences of individual clones from different individuals are shown with ‘ . ’, indicating identity to the consensus sequence found in infectious batches of anti-D immunoglobulin manufactured in 1977 (top line). Unique substitutions not otherwise observed amongst these or other members of the anti-D recipient cohort (McAllister et al., 1998) and likely to represent misincorporation by Taq DNA HCC Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCV hypervariable region evolution Fig. 4. Structure of the HVR region in virus from SR8. The nucleotide sequence and predicted HVR amino acid sequence of clone 5 from SR8. Amino acid positions that are strongly conserved amongst anti-D recipients and epidemiologically unrelated subtype 1b sequences are indicated by asterisks. Below are shown the nucleotide differences and amino acid substitutions observed amongst eight other clones from SR8. Nucleotides in bold are those found in both copies of the 33 nucleotide repeat amongst the variants detected in SR8. The bottom line gives the consensus nucleotide and amino acid sequence of the HVR from R1, the implicated blood transfusion donor to SR8. A second type of variation was observed in a variant from SR6 in which an additional codon (GGG) was present at the extreme 5h end of the HVR (Fig. 3). Another variant contained a two nucleotide insertion, but translation of the remainder of the polyprotein would have become out of frame. Both of these variants shared several substitutions with other variants detected in SR6, consistent with a relatively recent origin, although it was not possible to investigate the diversity in the implicated donor. Single HVR sequences from SR6 and R10 contained nucleotide deletions that affected the reading frame downstream of the HVR. These may be derived from defective viruses or may represent deletions introduced during PCR amplification. Finally, variants present in SR8 also contained an additional codon at the NH terminus of the HVR and had HVR amino # acid sequences that were strikingly divergent from those from the implicated transfusion donor R1 (Fig. 3). One of the sequences from SR8 (clone 5) had a distinct HVR sequence that contained an imperfect direct repeat of 33 nucleotides (18 matches) (Fig. 4). The first six nucleotides of the repeat encode the amino acids Trp.Gly which are uncommon amongst published HCV sequences (McAllister et al., 1998) either at position 1 or at positions 11 and 12 where they are specified by the second copy. The repeated segment may derive from the HVR of an ancestor of the SR8 variants since there is a strong similarity between codons 13–20 of SR8 variants (excluding clone 5) and codons 12–20 of R1 variants (excluding codon 15). This suggests a process in which a duplication event occurred in a variant containing Trp.Gly at positions 11\12 and with a deletion at codon 15, followed by deletion of codons 1–10 of the HVR. Subsequent evolution of the HVR might then have led to the accumulation of substitutions in the COOH terminus of the HVR in the lineage represented by clone 5, and by the accumulation of substitutions in the NH # terminus of the HVR and in the flanking regions of the other main lineage. Discussion This study of serial transmission of HCV from a defined source to primary and secondary recipients has failed to uncover any effect of multiple transmission events on the rate or characteristics of virus evolution. Three of the secondary transmissions studied here involved blood transfusion and the secondary recipient would have been infected with large quantities of infectious virus. For the other three transmission events the route and timing is uncertain, but the dose is likely to have been much smaller than following transfusion of infected blood. The lack of any consistent differences between the extent of virus evolution in these different secondary recipients argues that it may be valid to extrapolate the rate of evolution observed over single transmission events (Smith et al., 1997 b) to longer periods of time during which multiple transmissions to different individuals will occur. Reconstruction of epidemiological relationships within a cohort An important finding of this study is that the detailed epidemiological relationships between viruses from anti-D recipients and secondary recipients could not be adequately reproduced by phylogenetic analysis of the E1\E2 region (Fig. 2). Although we were able to confirm that all primary and polymerase during PCR amplification are indicated in bold type. Single nucleotide deletions are indicated by ‘ F ’, ambiguities by ‘ ? ’ and gaps introduced to maintain alignments by ‘ - ’. The number of clones with each sequence is indicated together with the number of amino positions that are variable within virus sequences from an individual and the number that are variable between viruses from the primary (R) and secondary recipient (SR). Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCD C. Casino and others secondary anti-D recipients were infected with closely related viruses of subtype 1b, the only consistent sub-grouping observed within this cohort was for R5 and SR7 who became infected following a blood transfusion in 1988. Even this grouping was poorly supported if analysis was confined to synonymous sites in the region flanking the HVR. This deficiency may be partly due to the relative similarity (mean evolutionary distance of 0n121) of virus sequences infecting different anti-D recipients compared with that between epidemiologically unrelated subtype 1b isolates (mean evolutionary distance of 0n23). However, in addition many of the changes observed in virus sequences from both primary and secondary recipients of virus from 1977 anti-D immunoglobulin were convergent. This was true not only for amino acid replacements within the HVR as previously described (McAllister et al., 1998), but also for both nonsynonymous and synonymous substitutions in E1 and E2 flanking the HVR. Similar observations were made when a less variable 222 nucleotide sequence from the virus NS5B region was analysed (data not shown). All of the sequences shared a common branch with the sequence of virus present in an infectious batch of anti-D immunoglobulin, but there was no subgrouping of sequences from donor–recipient pairs. Amongst the five pairs of NS5B sequences, there were one nonsynonymous and eleven synonymous substitutions, none of which were shared by a donor–recipient pair. Two of the synonymous substitutions were observed in two or more different pairs of sequences but not in ten independent sequences obtained from an infectious batch (Smith et al., 1997 b). Since both these substitutions and all those represented by a single sequence also occurred in one or more epidemiologically unlinked HCV subtype 1b sequences, these observations are consistent with the occurrence of convergent synonymous substitution. These findings imply that the investigation of epidemiological relationships within a cohort of individuals infected directly or indirectly from a common source may require the phylogenetic analysis of much longer genomic regions than the variable (NS5B, 222 nucleotides) and hypervariable (E2, 363 nucleotides) regions considered here. This may also be the case even when comparisons are made immediately after a transmission event, since groupings of multiple E1\E2 virus sequences from different anti-D recipients were not always supported by bootstrap analysis despite their having had 17 years of independent evolution (Fig. 2), (McAllister et al., 1998). Multiple mechanisms for the generation of diversity in the HVR This detailed study of the evolution of the HVR of E2 from a defined source reveals that diversification of the HVR can occur through at least three different mechanisms. The most familiar of these is nucleotide substitution at those positions in HCE the HVR at which amino acid replacements are tolerated. The second mechanism is through the insertion of additional nucleotides at the junction of E1 and E2, as observed in two variants in SR6. Variation in this region has occasionally been reported with insertion or deletion of 1–4 codons (Abe et al., 1992 ; Kato et al., 1992 ; Hohne et al., 1994). In only one of these cases is it clear that variants have been produced by insertion of nucleotides into an otherwise unaltered HVR sequence (Hohne et al., 1994), and a similar process seems to have generated the variants observed in SR6 since additional G residues occur within a sequence otherwise very similar to that present in 1977 anti-D immunoglobulin. There are only 12 sites of insertion or deletion when the coding regions of 37 complete genomes of HCV are compared, most of which have strong associations with particular virus types or subtypes. Since there is evidence that virus types and subtypes diverged at least several hundred years ago (Smith et al., 1997 b), the repeated observation of insertion and deletion of nucleotides at the junction of E1 and E2 during relatively short-term longitudinal studies suggests that nucleotide rearrangements may be relatively common in this part of the virus genome. One way in which this could occur might be through stalling of the RNA polymerase because of a conserved RNA secondary structure, but we have been unable to detect such a structure in the regions of the genome flanking the HVR. The nucleotide sequence of virus from SR8 provides evidence for a previously undescribed third mechanism through which diversity of the HVR can be generated. Comparison of virus sequences from SR8 and R1, the implicated donor, suggests a COOH-terminal region of the HVR was duplicated and replaced the NH -terminal region # resulting in an unrelated HVR sequence with one extra amino acid residue. Duplication events have not previously been described for HCV, although they have been implicated in the pathogenesis of a pestivirus, bovine viral diarrhoea virus (Tautz et al., 1996). Although the HVR of HCV is the most variable part of the virus genome, it is also striking that this region is subject to strong constraints on the ways in which it can change. Some positions within the HVR are confined to particular amino acid residues, or to a small selection of residues that share particular chemical characteristics (McAllister et al., 1998). In addition, there appears to be strong selection against changes in the size of the HVR since the HVR is the same length in all six genotypes of HCV with the exception of type 6 which for both sequences currently available has a deletion of one residue (Chamberlain et al., 1997). Such selection might explain why virus variants with lengthened HVR sequences have often been found to co-circulate with variants with various types of HVR deletion (Kato et al., 1992 ; Hohne et al., 1994). Together these features suggest that the HVR has an important function in the replication of HCV, and that this function depends on something more than its ability to be structurally flexible (Taniguchi et al., 1993). Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCV hypervariable region evolution C. C. was supported by a study fellowship from the University of Rome ‘ La Sapienza’, J. McA. and D. B. S. are supported by a grant from the Wellcome Trust and P. S. is a Darwin Fellow. Ohto, H., Terazawa, S., Sasaki, N., Hino, K., Ishiwata, C., Kako, M., Ujiie, N., Endo, C., Matsui, A., Okamoto, H., Mishiro, S., Kojima, M., Aikawa, T., Shimoda, K., Sakamoto, M., Akahane, Y., Yoshizawa, H., Tanaka, T., Tokita, H. & Tsuda, F. (1994). Transmission of hepatitis C virus from mothers to infants. New England Journal of Medicine 330, 744–750. References Abe, K., Inchauspe, G. & Fujisawa, K. (1992). Genomic characterization and mutation rate of hepatitis C virus isolated from a patient who contracted hepatitis during an epidemic of non-A, non-B hepatitis in Japan. Journal of General Virology 73, 2725–2729. Chamberlain, R. W., Adams, N. J., Taylor, L. A., Simmonds, P. & Elliott, R. M. (1997). The complete coding sequence of hepatitis C virus genotype 5a, the predominant genotype in South Africa. Biochemical and Biophysical Research Communications 236, 44–49. Davidson, F., Simmonds, P., Ferguson, J. C., Jarvis, L. M., Dow, B. C., Follett, E. A. C., Seed, C. R. G., Krusius, T., Lin, C., Medgyesi, G. A., Kiyokawa, H., Olim, G., Duraisamy, G., Cuypers, T., Saeed, A. A., Teo, D., Conradie, J., Kew, M. C., Lin, M., Nuchaprayoon, C., Ndimbie, O. K. & Yap, P. L. (1995). Survey of major genotypes and subtypes of hepatitis C virus using RFLP of sequences amplified from the 5h noncoding region. Journal of General Virology 76, 1197–1204. Esteban, J. I., Gomez, J., Martell, M., Cabot, B., Quer, J., Camps, J., Gonzalez, A., Otero, T., Moya, A., Esteban, R. & Guardia, J. (1996). Transmission of hepatitis C virus by a cardiac surgeon. New England Journal of Medicine 334, 555–560. Farci, P., Shimoda, A., Wong, D., Cabezon, T., De Gioannis, D., Strazzera, A., Shimizu, Y., Shapiro, M., Alter, H. J. & Purcell, R. H. (1996). Prevention of hepatitis C virus infection in chimpanzees by Power, J. P., Lawlor, E., Davidson, F., Yap, P. L., Kenny-Walsh, E., Whelton, M. J. & Walsh, T. J. (1994). Hepatitis C viraemia in recipients of Irish intravenous anti-D immunoglobulin. Lancet 344, 1166–1167. Power, J. P., Lawlor, E., Davidson, F., Holmes, E. C., Yap, P. L. & Simmonds, P. (1995). Molecular epidemiology of an outbreak of infection with hepatitis C virus in recipients of anti-D immunoglobulin. Lancet 345, 1211–1213. Sekiya, H., Kato, N., Ootsuyama, Y., Nakazawa, T., Yamauchi, K. & Shimotohno, K. (1994). Genetic alterations of the putative envelope proteins encoding region of the hepatitis C virus in the progression to relapsed phase from acute hepatitis : humoral immune response to hypervariable region 1. International Journal of Cancer 57, 664–670. Sherman, K. E., Andreatta, C., O’Brien, J., Gutierrez, A. & Harris, R. (1996). Hepatitis C in human immunodeficiency virus-coinfected patients : increased variability in the hypervariable envelope coding domain. Hepatology 23, 688–694. Smith, D. B., McAllister, J., Casino, C. & Simmonds, P. (1997 a). Virus ‘ quasispecies’ : making a mountain out of a molehill? Journal of General Virology 78, 1511–1519. Smith, D. B., Pathirana, S., Davidson, F., Lawlor, E., Power, J., Yap, P. L. & Simmonds, P. (1997 b). The origin of hepatitis C virus hyperimmune serum against the hypervariable region 1 of the envelope 2 protein. Proceedings of the National Academy of Sciences, USA 93, 15394–15399. Hohne, M., Schreier, E. & Roggendorf, M. (1994). Sequence variability in the env-coding region of hepatitis C virus isolated from patients infected during a single source outbreak. Archives of Virology 137, 25–34. genotypes. Journal of General Virology 78, 321–328. Jarvis, L. M., Watson, H. G., McOmish, F., Peutherer, J. F., Ludlam, C. A. & Simmonds, P. (1994). Frequent reinfection and reactivation of Tautz, N., Meyers, G., Stark, R., Dubovi, E. J. & Thiel, H. J. (1996). hepatitis C virus genotypes in multitransfused hemophiliacs. Journal of Infectious Diseases 170, 1018–1022. Kato, N., Ootsuyama, Y., Tanaka, T., Nakagawa, M., Nakazawa, T., Muraiso, K., Ohkoshi, S., Hijikata, M. & Shimotohno, K. (1992). Marked sequence diversity in the putative envelope proteins of hepatitis C viruses. Virus Research 22, 107–123. Kojima, M., Osuga, T., Tsuda, F., Tanaka, T. & Okamoto, H. (1994). Influence of antibodies to the hypervariable region of e2\NS1 glycoprotein on the selective replication of hepatitis C virus in chimpanzees. Virology 204, 665–672. Kumar, S., Tamura, K. & Nei, M. (1993). MEGA : Molecular Evolutionary Genetics Analysis, version 1.02. Pennsylvania : Pennsylvania State University. Lawlor, E., Power, J., Garson, J. A., Yap, P. L., Davidson, F., Columb, G., Smith, D. B., Pomeroy, L., O’Riordan, J., Simmonds, P. & Tedder, R. S. (1999). Transmission rates of HCV by different batches of a contaminated anti-D immunoglobulin preparation. Vox Sanguinis (in press). McAllister, J., Casino, C., Davidson, F., Power, J., Lawlor, E., Yap, P. L., Simmonds, P. & Smith, D. B. (1998). Long-term evolution of the hypervariable region of hepatitis C virus in a common-source-infected cohort. Journal of Virology 72, 4893–4905. Taniguchi, S., Okamoto, H., Sakamoto, M., Kojima, M., Tsuda, F., Tanaka, T., Munekata, E., Muchmore, E. E., Peterson, D. A. & Mishiro, S. (1993). A structurally flexible and antigenically variable N-terminal domain of the hepatitis C virus E2\NS1 protein – implication for an escape from antibody. Virology 195, 297–301. Cytopathogenicity of a pestivirus correlates with a 27-nucleotide insertion. Journal of Virology 70, 7851–7858. Weiner, A. J., Thaler, M. M., Crawford, K., Ching, K., Kansopon, J., Chien, D. Y., Hall, J. E., Hu, F. & Houghton, M. (1993). A unique, predominant hepatitis C virus variant found in an infant born to a mother with multiple variants. Journal of Virology 67, 4365–4368. Zanetti, A. R., Tanzi, E., Paccagnini, S., Principi, N., Pizzocolo, G., Caccamo, M. L., Damico, E., Cambie, G., Vecchi, L., Bresciani, S., Marin, M. G., Padula, D., Rodella, A., Bulgarelli, I., Chiodo, F., Magliano, E., Miotto, G., Muggiasca, M. L., Pilloton, E., Pozzoli, R., Pregliasco, F., Romano, L., Stringhi, C., Dagostino, F., Paolillo, F. & Zapparoli, B. (1995). Mother-to-infant transmission of hepatitis C virus. Lancet 345, 289–291. Zhang, L. Q., Mackenzie, P., Cleland, A., Holmes, E. C., Leigh Brown, A. J. & Simmonds, P. (1993). Selection for specific sequences in the external envelope protein of human immunodeficiency virus type 1 upon primary infection. Journal of Virology 67, 3345–3356. Zibert, A., Schreier, E. & Roggendorf, M. (1995). Antibodies in human sera specific to hypervariable region 1 of hepatitis C virus can block viral attachment. Virology 208, 653–661. Received 21 September 1998 ; Accepted 18 November 1998 Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 16 Jun 2017 08:44:41 HCF