Sources of HIV infection among men having sex with men and
implications for prevention *
O. Ratmann1, A. van Sighem2, D. Bezemer2, A. Gavryushkina3, S. Jurriaans4, A. Wensing5, F.
de Wolf1, P. Reiss2, 6, C. Fraser1
1 Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom.
2 Stichting HIV Monitoring, Amsterdam, the Netherlands.
3 Department of Computer Science, University of Auckland, New Zealand.
4 Department of Medical Microbiology, Academic Medical Center, Amsterdam, the Netherlands.
5 Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, the Netherlands.
6 Department of Global Health, Academic Medical Center, Amsterdam, the Netherlands.
Corresponding author:
Oliver Ratmann
[email protected]
* This manuscript has been accepted for publication in Science Translational Medicine. This version
has not undergone final editing. Please refer to the complete version of record at
www.sciencetranslationalmedicine.org/. The manuscript may not be reproduced or used in any
manner that does not fall within the fair use provisions of the Copyright Act without the prior,
written permission of AAAS.
ONE SENTENCE SUMMARY
To tailor HIV prevention strategies amongst men having sex with men, we characterized the sources
of ~600 transmission events in the Netherlands. More than half of these infections could have been
averted with available antiretrovirals, but only if considerably more men had tested annually.
ABSTRACT
New HIV diagnoses among men having sex with men (MSM) have not decreased appreciably in
most countries, even though care and prevention services have been scaled up substantially in the
past twenty years. To maximize the impact of prevention strategies, it is crucial to quantify the
sources of transmission at the population level. We used viral sequence and clinical patient data from
one of Europe’s nation-wide cohort studies to estimate probable sources of transmission for 617
recently infected MSM. 71% of transmissions were from undiagnosed men, 6% from men who had
initiated antiretroviral therapy (ART), 1% from men with no contact to care for at least 18 months,
and 43% from those in their first year of infection. The lack of substantial reductions in incidence
amongst Dutch MSM is not a result of ineffective ART provision or inadequate retention in care. In
counterfactual modeling scenarios, 19% of these past cases could have been averted with current
annual testing coverage and immediate ART to those testing positive. 66% of these cases could have
been averted with available antiretrovirals (immediate ART provided to all MSM testing positive,
and pre-exposure antiretroviral prophylaxis taken by half of all who test negative for HIV), but only
if half of all men at risk of transmission had tested annually. With increasing sequence coverage,
molecular epidemiological analyses can be a key tool to direct HIV prevention strategies to the
predominant sources of infection, and help send HIV epidemics amongst MSM into a decisive
decline.
INTRODUCTION
Combination antiretroviral therapy (ART) transformed HIV from a deadly to a life-long disease, and
is also one of the most effective strategies for preventing onward infections (1, 2). However, among
men having sex with men (MSM), the substantial scale-up of ART in the past twenty years has not
resulted in appreciable reductions of new HIV infections and diagnoses (table 1) (3). Building on
successful behavioural and biomedical HIV prevention strategies (4), further interventions exist that
could be used to reduce the number of HIV infections amongst MSM. The 2016 WHO guidelines
now recommend initiation regardless of CD4 cell count after diagnosis (immediate ART), as well as
provision of antiretrovirals as pre-exposure prophylaxis (PrEP) to those at substantial risk of
infection (5). Future prevention programmes could focus on one or both recommended interventions,
as well as on increased routine HIV testing and diagnosis (6); RNA testing to detect MSM in early
acute infection when they are thought to be the most infectious (7); and improved adherence and
linkage support to assist patients with attaining and sustaining undetectable viral loads whilst on
ART (8). The potential impact of any of these interventions, and specifically those recommended by
the WHO, relies crucially on how many HIV transmissions originate from different stages in the
entire HIV infection and care continuum, ranging from undiagnosed acute infection through treated
infection and loss to follow-up. This has been challenging to measure directly through classical
epidemiological approaches.
In this study, we use the viral phylogenetic relationship between partial HIV-1 subtype B polymerase
sequences to reconstruct past, probable transmission events in the Netherlands (figure 1). These
sequences were routinely collected for drug resistance testing of HIV-infected patients that are in
care (9). Amongst sampled MSM, 94% were of subtype B. Then, we use clinical records to
determine the staging of probable transmission events within the infection and care continuum
(figure 2A and table 2). This enabled us to estimate the population-level proportion of transmissions
amongst the reconstructed transmission events that are attributable to fourteen stages of the infection
and care continuum in figure 2A. Transmissions could be attributed to stages before diagnosis
because HIV sequences, always collected after diagnosis, diverge fast enough to indicate past
transmission events (10). Similarly, transmissions could also be attributed to men with no contact to
care for at least 18 months. Finally, using these estimates, we quantified the potential impact of
available, but currently not implemented prevention programmes in the Dutch MSM population, had
they been used in the last three years. In particular, we evaluate if the revised 2016 WHO guidelines
on immediate ART and PrEP could have substantially altered the course of the Dutch HIV epidemic
amongst MSM.
Understanding which interventions should be prioritized for the Dutch MSM epidemic is an
important case study. First, the number of new MSM infections in the Netherlands has not decreased
appreciably (9) despite comprehensive linkage and retention in care, substantial ART scale up free of
charge, and frequent follow up to maintain viral control of the vast majority of those on ART (table
1). Second, similar epidemic trends are reported from other countries with an overall equally
comprehensive cascade of care (table 1), casting more general doubts on the population-level impact
of current prevention strategies targeting MSM epidemics (11). Third, nearly all HIV-infected MSM
in care have been enrolled in the clinical, national opt-out ATHENA cohort since early 1996 (12). HIV care
is monitored comprehensively at high frequency (clinic visits, treatment histories, co-morbidities
recorded; ~3 viral load/CD4 measurements per year per individual) (12), which allowed us to
characterize phylogenetically reconstructed transmission events in detail.
RESULTS
Potential transmissions to MSM in confirmed recent infection at time of diagnosis
By 2013, 11,863 HIV-infected MSM were registered and still in care in the Netherlands. To estimate
their sources of transmission and then the impact of prevention programmes, we focussed on
transmissions to MSM that were recently infected at time of diagnosis (stage A in figure 1). Between
July 1996 and December 2010, 1,794 MSM had been infected at most 12 months prior to diagnosis.
Types of evidence were a previous negative HIV test (76%), laboratory diagnosis (7%), or clinical
diagnosis of acute infection (17%). For 1,045 (58%) of these, a sequence was available. To these
recipient MSM, we considered as potential transmitters all HIV-infected men whose course of
infection overlapped with the infection window of the recipient (stage A in figure 1). With this
approach, we could resolve the timing and direction of potential transmission events (13). Out of all
12,207 potential transmitters, 5,593 (46%) had a viral sequence and formed ~ 4.4 million potential
transmission pairs with sequences available for both individuals (stage B in figure 1).
Phylogenetically probable transmission events
Genetic sequences of the virus alone cannot prove epidemiological linkage (14). However, most of
the potential transmission pairs could be ruled out as implausible, based on the phylogenetic
relationship of the viral sequences. The viral phylogeny among the Dutch sequences and their closest
matches in the Los Alamos HIV sequence database (http://www.hiv.lanl.gov/) was reconstructed
with maximum-likelihood methods, and reliable subtrees were identified (see Materials and
Methods). Potential transmitters whose sequences did not occur in the same reliable subtree as those
of the recipient MSM were excluded (stage C in figure 1) (14), as were potential transmitters whose
sequences were incompatible with a direct HIV transmission event (stage D in figure 1) (15). Direct
transmission could be excluded in 99.96% of all potential transmission pairs. We identified 903
phylogenetically probable transmitters to 617 recipient MSM in 2,343 pairs. Our analyses are based
on this open observational cohort of past, phylogenetically reconstructed transmission events.
To guide and interpret this exclusion analysis, we evaluated patterns of viral divergence between
sequences isolated from epidemiologically confirmed transmission pairs (16), and pairings of Dutch
MSM that could not have infected each other (see Materials and Methods). Based on these pairs, the
above exclusion criteria were highly specific (true transmitters to recipients are not excluded, >90%),
whilst sensitivity was low (incorrect transmission pairs could not always be excluded, ~60%). This
indicates that the actual transmitter is almost certainly among the phylogenetically reconstructed,
probable transmitters, provided he was sequenced. From the known sequence coverage alone, we
expected that approximately half of all 1,045 recipient MSM with a sequence had their actual
transmitter sampled—suggesting further that the actual transmitter is among the phylogenetically
reconstructed, probable transmitters for the large majority of the reconstructed 617 transmission
events.
Clinical and demographic characteristics of the selected 617 recipient MSM were typical of all 1,794
MSM that were in confirmed recent infection at time of diagnosis (table 3). This indicates that the
probable transmitters in the cohort are also typical of the transmitters to recently infected MSM.
Characterization of individual transmission events by stage in the HIV infection and care continuum
Using clinical records, we then enumerated all stages in the HIV infection and care continuum
during which the 617 transmission events could have occurred. Probable transmitters progressed in
stage over time, and overlapped with infection windows in 13,169 time-resolved, six week long
transmission intervals (figure 2B). Censoring and sequence sampling biases were identified for each
stage by comparing men with and without a sequence, and were adjusted in line with previous work
(17). Reflecting targeted sequence collection, intervals were not missing at random (figures 2C and
S9). Each interval was associated with a phylogenetic transmission probability, based on the genetic
distance between sequences from the transmitter and recipient and the time elapsed between the putative
transmission interval and the sampling dates of both individuals (see Materials and Methods and
figure S10). For each recipient, the probability that transmission occurred from one of the fourteen
stages then depends on the number of his probable transmitters in that stage, and the transmission
probabilities associated with each of the corresponding transmission intervals (see Materials and
Methods).
Sources of HIV transmission
The population-level proportions of HIV transmissions attributable to the fourteen infection/care
stages were obtained by summing individual-level transmission probabilities by stage across all
recipients, and are shown in table 4. Figure 3 compares the proportion of transmissions from each
stage to the population-level proportion of infected men in these stages. Between July 1996 and
December 2010, an estimated 71% [66%-73%] of all 617 transmission events originated from
undiagnosed men, 22% [21%-26%] from diagnosed but not yet treated men, 6% [5%-8%] from men
who initiated ART and 1% [0.7-1.6%] from men with no contact to care for at least 18 months. An
estimated 43% [37%-46%] of the 617 recipient MSM were infected by men undergoing their first
year of infection.
Impact of prevention strategies
Figure 4 describes the counterfactual prevention scenarios for which we calculated the proportion of
transmissions in the cohort that could have been averted between mid-2008 and December 2010, had
we intervened to re-distribute the identified, probable transmitters to less infectious infection/care
stages. Young MSM are at particularly high risk of infection (18, 19). We therefore considered—
along the revised 2016 WHO guidelines (5)— roll-out of immediate ART to all infected MSM and
PrEP to half of all MSM aged 30 or less that test negative: at most 30% [22%-39%] of infections
could have been averted without increased annual testing. Immediate ART alone could have averted
19% [13%-26%] of these cases at current testing levels. In practice, low adherence is associated with
decreasing effectiveness of PrEP (20). We assumed an 86% efficacy of PrEP as reported in the
recent Ipergay and PROUD trials (21, 22). Figure S12 reports the impact of lower efficacy values.
Figure S13 reports the impact of lower or higher PrEP coverage. Next, we considered increased
annual testing. Only 17% of identified probable transmitters had a last negative test in the year
before diagnosis, compared to 27% of diagnosed MSM between mid-2008 and December 2010 and
38% of uninfected MSM in 2013 (table 1). If half of all transmitters had tested annually, immediate
ART and PrEP to half of all MSM aged 30 or less that test negative could have averted 45% [34%-56%]
of infections. Additional roll-out of PrEP to half of all men testing negative would have
substantially boosted the combination intervention: 66% [50%-78%] of infections could have been
averted.
DISCUSSION
HIV epidemics amongst MSM have—unlike other settings (23)— not declined appreciably with
substantial improvements to care and ART scale-up (table 1). We characterized 617 past
transmission events amongst MSM in the Netherlands based on phylogenetic and clinical data,
estimated their sources throughout the infection and care continuum, and quantified the impact that
biomedical prevention programmes could have had in averting the reconstructed transmission events.
Analysing this transmission cohort, we aim to inform the design of future prevention interventions
beyond high levels of ART coverage and the numerous successful behavioural interventions that are
already in place (9).
A potential limitation of this study is that transmitters to MSM in recent infection at diagnosis may
differ from typical transmitters. On average, fewer men diagnosed late with a CD4 count below 350
cells/µl occurred in phylogenetic transmission clusters with a recipient MSM, compared to those
without (figure S23). This may imply that overall, the proportion of transmissions from undiagnosed
men in chronic infection is higher, and consequently that the impact that immediate ART could have
had is lower than our estimates. Conversely, the impact of increased annual testing and PrEP could
be larger than reported, if men diagnosed late are not more difficult to reach than the average
transmitter in our cohort. Further, this study focuses on the sources and prevention of in-country
transmissions: 97% of the recipient MSM reported that infection was likely acquired in the
Netherlands compared to 86% of diagnosed MSM. The contribution of cross-border transmissions
may increase as the response is strengthened (24), an effect which we did not consider. Phylogenetic
uncertainty and the phylogenetic exclusion criteria had little impact on our findings (figures S14-S22).
A further potential caveat to the robustness of our findings is that only half of all potential
transmitters had a viral sequence sampled. Although population-level sampling biases were adjusted,
we must acknowledge that the actual transmitter may not have been sampled for all recipients.
Improving sequence sampling coverage at time of diagnosis is needed to facilitate phylogenetic
prevention analyses (25).
The identified sources of transmission imply, first, that viral suppression induced by ART is highly
effective in preventing transmissions in this population (figure 3). The relative risk of HIV
transmission from men after ART initiation varies by stage but is always estimated well below one
when compared to diagnosed, untreated men with a CD4 count above 500 cells/µl, and is in
particular 0.04 [0.02-0.1] for men with viral suppression (figure S11).
Second, very few transmissions are attributable to temporary or permanent loss to follow up, which
must be considered in the context of high linkage and retention to care in the Netherlands: few
diagnosed MSM subsequently had no contact to care for at least 18 months (8.2%) and most re-entered
care within five years (69%) (9). In contrast, several studies indicate that more than half of
all transmissions amongst MSM in the United States originate from men that were not retained in
care (26-28). The estimated impact of particular prevention strategies in figure 4 is limited to settings
with a similar epidemic profile and care cascade as the Netherlands (table 1).
Third, not more than an estimated 20% of infections in the cohort could have been averted between
mid 2008 and December 2010 with immediate ART after diagnosis. Given the remarkable expansion
of ART coverage in the Netherlands in the past (9), the prevention potential of immediate ART is
now limited. Nonetheless, starting ART at a CD4 cell count above 500 cells/µl leads to improved clinical
outcomes and remains a priority (29).
Fourth, and similar to other locations (25, 30), almost half of all infections in our transmission cohort
originated from men in their first year of infection. Frequent early transmission limits the overall
impact of annual testing plus immediate ART to those testing positive (figure 4), and implies that
prevention services to uninfected MSM must be strengthened. The substantial, estimated impact that
PrEP would have had in averting transmissions in our cohort (figure 4) supports making PrEP
available to MSM testing negative as in the United States (31). Recent PrEP demonstration projects
(32, 33) indicate that existing barriers such as low awareness (34) and a lack of experience amongst
providers (35) can be addressed. Concerns regarding the toxicity of PrEP, increasing sexual risk
behaviour and emerging drug resistance have to date not been substantiated since PrEP was made
available in the United States (36). In the context of PrEP-experienced prevention services, high
discontinuation rates after PrEP initiation appear to be the greatest challenge to maintain protection
from infection (32).
Fifth, without substantial increases to current annual testing coverage, ART and PrEP offered along
the revised 2016 WHO guidelines could not have prevented more than a quarter of all infections in
our transmission cohort. Since phylogenetically probable transmitters tend to test much less
frequently than the average diagnosed MSM, substantial barriers likely exist in reaching men at high
risk of onward transmission, and further work is needed to characterize these (37). Strategies such as
self-testing (38), community-based testing (39), and more provider-initiated routine testing in general
practices and at medical admissions raised annual testing coverage in pilot projects (12), and need to
be expanded alongside biomedical interventions.
Sixth, this study indicates that substantial reductions in HIV incidence amongst MSM could have
been realized with a combination approach that includes—critically—increased annual testing, with
uptake of PrEP by young MSM testing negative and provision of immediate ART to those testing
positive. This finding is primarily based on the impact of increased annual testing and the higher
efficacy of PrEP reported in two recent randomized controlled trials (21, 22), and updates previous
studies that estimate more limited benefits (4, 40, 41). Beyond age at testing, other characteristics not
available to this study may also indicate high infection risk (42), and thereby identify groups of
MSM to which PrEP should be made available as a priority. Provision of PrEP to all men testing
negative is not affordable at current drug prices in high-income countries (40). The magnitude of the
predicted impact of test-and-PrEP-and-treat for all (figure 4) could set an aspirational target for the
fight against HIV amongst MSM.
The lack of substantial reductions in incidence amongst Dutch MSM is not a result of ineffective
ART provision or inadequate retention in care. New HIV infections amongst MSM are challenging
to prevent due to frequent early transmission and continued low testing uptake of men at risk of
transmission. Counterfactual prevention scenarios on phylogenetically reconstructed, past
transmission events to MSM in recent infection at diagnosis predict that increased annual testing and
uptake of PrEP by men at high risk of infection have a key role to send the HIV epidemic amongst
MSM into a decisive decline.
MATERIALS AND METHODS
Study design
We conducted a retrospective viral phylogenetic transmission and prevention study that focuses on
transmissions to MSM in confirmed recent HIV infection at time of diagnosis in the Netherlands
(figure 1). The pre-specified objectives were to, first, reconstruct past, phylogenetically probable
transmission events to these recipient MSM; second, to estimate the proportion of transmissions
originating throughout the infection and care continuum based on the reconstructed transmission
events; and, third, to estimate the proportion of infections that could have been averted through reallocating past, probable transmitters to less infectious stages in counterfactual modeling scenarios.
The ATHENA national observational HIV cohort includes anonymized data of all HIV-infected
patients followed longitudinally in the 27 HIV treatment centres in the Netherlands since 1996,
except 1.5% who opt-out (9). ATHENA patients are informed of data collection by their treating
physician and can refuse further collection of clinical data according to an opt-out procedure.
Patients who were diagnosed between 1981 and 1995 were included in the cohort if they were
still alive in 1996 (9). Demographic, clinical, and viral sequence data were collected at entry and
follow-up visits as described previously (9). By March 2013, viral sequence data had been
systematically entered until December 2010. Therefore, recipients were enrolled between early 1996
and December 2010. Potential transmitters were enrolled until database closure in March 2013.
Table S1 characterizes the demographic, clinical, and viral sequence data that were used in this
study. The resolution of the infection/care stages in table 2 was adjusted to ensure adequate sample
sizes. The number of probable transmission intervals after first viral suppression was too small to
enable further stratification by treatment class. This study was reviewed and approved by the HIV
Monitoring Institutional Data Access and Ethics Committee, and reported along STROME-ID
guidelines.
Viral sequences of different subtypes (n=355 from MSM), with less than 250 nucleotides (n=368) or
indication for intra-subtype recombination (n=52) were removed prior to analysis. Primary drug
resistance mutations were masked in each sequence (43). Demographic and clinical data were
checked for consistency along patient timelines, and to lie within appropriate ranges. Outliers were
reported to the ATHENA quality control team, and manually updated.
Recently infected, recipient MSM and infection windows
We enrolled as recipients all MSM for whom a narrow infection window could be identified. MSM
had evidence for infection within 12 months prior to diagnosis if either a last negative HIV-1
antibody test in the 12 months preceding diagnosis, an indeterminate HIV-1 western blot, or clinical
diagnosis of acute infection were reported. Figure S1 shows enrollment progress over time. Infection
windows were at most 12 months, or shorter if indicated by a last negative HIV antibody test (figure
S2).
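
The infection-window rule described above can be sketched in a few lines. This is a minimal illustration rather than the study's own code: the function name `infection_window` and the choice of 365 days for "12 months" are assumptions made here.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

MAX_WINDOW = timedelta(days=365)  # assumption: "12 months" taken as 365 days


def infection_window(diagnosis: date,
                     last_negative: Optional[date] = None) -> Tuple[date, date]:
    """Return (start, end) of the period in which infection could have occurred.

    The window ends at diagnosis and starts at most 12 months earlier,
    or at the last negative antibody test if that test is more recent.
    """
    if last_negative is not None and last_negative > diagnosis:
        raise ValueError("last negative test cannot be after diagnosis")
    start = diagnosis - MAX_WINDOW
    if last_negative is not None and last_negative > start:
        start = last_negative
    return start, diagnosis
```

A negative test five months before diagnosis shortens the window to five months; a negative test older than 12 months leaves the default window unchanged.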
Potential transmitters to recipient MSM
We enrolled as potential transmitters all registered infected men that overlapped with infection
windows of recipients, and thus could have in principle infected a recipient. This definition required
estimation of putative infection times. Calculations are based on a method by Rice and colleagues
(44); see the online supplementary material. Estimated infection times are associated with substantial
uncertainty, and sensitivity analyses were conducted for lower and upper 95% estimates. Table S2
characterizes the potential transmitters to all recipients. Further analysis was restricted to potential
transmission pairs with sequences from both individuals (stage B in figure 1).
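
As an illustration of the overlap criterion, the sketch below selects potential transmitters for one recipient by testing whether each man's course of infection intersects the recipient's infection window. The record type `InfectedMan`, its field names, and the use of a single point estimate for the infection start are assumptions for this example; the study additionally propagated the uncertainty in estimated infection times.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class InfectedMan(NamedTuple):
    """Hypothetical record; field names are illustrative only."""
    man_id: str
    infection_start: date  # estimated, and uncertain in practice
    follow_up_end: date    # e.g. death or database closure


def intervals_overlap(a_start: date, a_end: date,
                      b_start: date, b_end: date) -> bool:
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] share a day."""
    return a_start <= b_end and b_start <= a_end


def potential_transmitters(recipient_id: str,
                           window: Tuple[date, date],
                           men: List[InfectedMan]) -> List[str]:
    """IDs of men whose course of infection overlaps the recipient's infection window."""
    w_start, w_end = window
    return [m.man_id for m in men
            if m.man_id != recipient_id
            and intervals_overlap(m.infection_start, m.follow_up_end, w_start, w_end)]
```

Men infected only after the window closed, or no longer followed before it opened, are excluded at this step, before any sequence data are considered.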
Viral phylogenetic exclusion analysis to construct the transmission cohort
The viral phylogeny was reconstructed under the GTR nucleotide substitution model with maximum-likelihood
methods (45) and is shown in figure S3. 500 bootstrap trees were created to quantify
uncertainty in tree reconstruction (14). Genetic distances between sequences from transmitter-recipient
pairs were highly variable (figure S4), which was accounted for in all analyses. To guide
our choice of exclusion criteria, we considered, first, epidemiologically confirmed transmission pairs
from previously published transmission chains in Belgium and Sweden (16, 46). The Belgian
transmission chain was subsequently oversampled (15), providing 2,807 sequence pairs from
confirmed transmitters and recipients without multi-drug resistance. Further, we considered 4,117
pairs of sequences from the same Dutch patient and 201,605 pairs between Dutch patients that died
before the last negative antibody test of another patient. These pairs were used to quantify patterns of
viral evolutionary diversification that can be expected among confirmed linked and unlinked pairs,
and to develop exclusion criteria with high specificity; see online supplementary material. The
Swedish pairs were used for validation purposes. All potential transmitters that were not excluded
were considered phylogenetically probable, and are characterized in table S4.
Relative pairwise transmission probabilities
Among the 2,807 confirmed transmission pairs (15), the genetic distance between sequences from
the transmitter and the recipient was strongly associated with the time elapsed between both
sampling dates and the midpoint of the established infection window (figure S5). We fitted a
probabilistic molecular clock model to these data to describe the relative probability of observing a
given genetic distance between sequences from a transmission pair that diverged for a specified
amount of time from each other. The fitted model was then used to express the relative probability
that a phylogenetically identified transmitter was the actual transmitter to a recipient (figure S5).
Matching of clinical data to associate infection/care stages with transmission intervals
Sources of transmission were not defined in terms of individuals, but the fourteen stages in the
infection and care continuum in table 2 (stage E in figure 1). Stages were allocated to transmission
intervals based on available clinical data (table S1). The duration of transmission intervals was set to
six weeks to accommodate abrupt changes in infection/care stages.
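
The following sketch shows one way to cut an infection window into six-week transmission intervals and to look up a transmitter's stage for each interval. The stage labels are simplified placeholders rather than the fourteen stages of table 2, and assigning the stage in force at the start of each interval is an assumption of this example, not a rule stated in the text.

```python
from bisect import bisect_right
from datetime import date, timedelta
from typing import List, Tuple

INTERVAL = timedelta(weeks=6)


def six_week_intervals(start: date, end: date) -> List[Tuple[date, date]]:
    """Cut [start, end) into consecutive six-week intervals; the last may be shorter."""
    out = []
    t = start
    while t < end:
        out.append((t, min(t + INTERVAL, end)))
        t += INTERVAL
    return out


def stage_at(t: date, changes: List[Tuple[date, str]]) -> str:
    """Stage in force at time t, given (date, stage) change points sorted by date."""
    dates = [d for d, _ in changes]
    idx = bisect_right(dates, t) - 1
    if idx < 0:
        raise ValueError("t precedes the first recorded stage")
    return changes[idx][1]
```

With short intervals, a transmitter who is diagnosed and starts ART within a few months contributes separate intervals to each stage instead of being assigned to one stage for the whole window.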
Adjusting for censoring and sequence sampling biases
Towards the present, an increasing fraction of potential transmitters may not have been diagnosed by
the time of database closure. Potential transmitters in recent infection at time of diagnosis must, by
definition, have been diagnosed within 12 months after the putative transmission interval. Therefore,
the extent of right censoring differs between stages. To adjust for right censoring, we counted when
potential transmitters in a particular infection/care stage became diagnosed in relation to the time of
diagnosis of their recipient (figure S6). This enabled us to estimate the proportion of censored
intervals for a hypothetical database closure time in the past (figure S6). We then extrapolated these
estimates to the actual database closure time with a bootstrap algorithm; see the online
supplementary material. To quantify sequence sampling biases, we compared men with and without
a sequence in the near complete population cohort (figure S7). A negative Binomial missing data
model was then used to adjust for the number of missing transmission intervals (17). Adjustments
accounted for censoring; increasing sampling frequency with duration in care; high sampling
frequency of men returning to care, men participating in particular sub-studies, and men with
indication of drug-resistance; as well as increasing sampling frequency with calendar time (figure
S7).
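
To give some intuition for why a negative binomial distribution appears here, consider a deliberately simplified setting, which is an assumption of this sketch and not the fitted model of the study: every interval in a stage is observed independently with a known probability s. Treating the unobserved intervals as the failures that occur before the n-th observed interval, their number follows a negative binomial distribution with mean n(1-s)/s. The function names below are hypothetical.

```python
from math import comb


def expected_missing(n_observed: int, sampling_prob: float) -> float:
    """Mean number of missing intervals, n (1 - s) / s, under the simplified model."""
    if not 0.0 < sampling_prob <= 1.0:
        raise ValueError("sampling_prob must lie in (0, 1]")
    return n_observed * (1.0 - sampling_prob) / sampling_prob


def missing_pmf(m: int, n_observed: int, sampling_prob: float) -> float:
    """P(m missing | n observed) = C(m + n - 1, m) s^n (1 - s)^m, for n >= 1."""
    if n_observed < 1:
        raise ValueError("n_observed must be at least 1")
    return (comb(m + n_observed - 1, m)
            * sampling_prob ** n_observed
            * (1.0 - sampling_prob) ** m)
```

In the study, the sampling probabilities were stage-specific and further adjusted for censoring and the covariates listed above, so the numbers from this sketch are not comparable to the reported estimates.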
Epidemiological transmission analysis
Each interval was associated with a phylogenetic transmission probability (stage F in figure 1). The
relative pairwise transmission probabilities (figure S5) were equally apportioned to all observed
intervals of the same transmitter-recipient pair. Stage-specific data such as viral load was not used to
determine these probabilities, to avoid circularity in the attribution of transmissions to infection/care
stages. Then, the transmission probability in an observed interval 𝜏 from transmitter 𝑖 to recipient 𝑗
was calculated by
$$p_{ij\tau} = \frac{\omega_{ij\tau}}{\sum_{k,s} \omega_{kjs} + \sum_{z} m_j(z)\,\omega(z)},$$
where 𝜔𝑖𝑗𝜏 is the relative transmission probability in interval 𝜏, and the denominator sums over all
observed, competing intervals as well as expected missing intervals 𝑚𝑗 (𝑧) in stage 𝑧 to recipient 𝑗.
For missing intervals, relative transmission probabilities were imputed and set to the median 𝜔𝑖𝑗𝑠 of
all observed intervals 𝑠 in stage 𝑧, denoted by 𝜔(𝑧). For a missing transmission interval 𝜐 in stage 𝑥
to recipient 𝑗, we calculated
$$p_{j\upsilon} = \frac{\omega(x)}{\sum_{k,s} \omega_{kjs} + \sum_{z} m_j(z)\,\omega(z)}.$$
In 24 cases, two recipients were each other’s phylogenetically probable transmitter. We considered
transmission in each direction equally likely. The relative transmission probabilities 𝜔𝑖𝑗𝜏 were
calculated by
$$\omega_{ij\tau} = \omega_{ij}\,\varphi_{ij} / \tau_{ij},$$
where 𝜑𝑖𝑗 equals 0.5 if 𝑖 and 𝑗 are each other’s phylogenetically probable transmitters and otherwise
one, 𝜔𝑖𝑗 are the relative pairwise probabilities shown in figure S5, and 𝜏𝑖𝑗 is the number of
transmission intervals between transmitter 𝑖 and recipient 𝑗.
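The apportioning rule above can be sketched in a few lines. The function name and argument names are hypothetical; only the formula itself (pairwise probability, times 0.5 for mutually probable pairs, divided by the number of intervals) comes from the text.

```python
def relative_interval_probability(omega_pair: float,
                                  n_intervals: int,
                                  mutual: bool) -> float:
    """Relative transmission probability of one interval of a pair.

    omega_pair  : relative pairwise probability of the transmitter-recipient pair
    n_intervals : number of transmission intervals between the two men
    mutual      : True if both men are each other's probable transmitter
    """
    if n_intervals < 1:
        raise ValueError("a pair needs at least one transmission interval")
    phi = 0.5 if mutual else 1.0  # both directions considered equally likely
    return omega_pair * phi / n_intervals
```

For example, a pair with pairwise probability 0.6 and three intervals contributes 0.2 per interval, or 0.1 per interval if the direction of transmission is ambiguous.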
These probabilities sum to one per recipient. If all transmitters are sampled, we obtain 𝑝𝑖𝑗𝜏 =
𝜔𝑖𝑗𝜏 ⁄∑𝑘,𝑠 𝜔𝑘𝑗𝑠 . If some transmitters are not sampled, the first part of the denominator, ∑𝑘,𝑠 𝜔𝑘𝑗𝑠 , is
smaller and adjusted by the second part of the denominator. The number of expected missing
intervals 𝑚𝑗 (𝑧) differs by stage, and adjusts for stage-specific censoring and sampling biases.
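A minimal sketch of this normalisation for a single recipient is given below. The data layout (tuples and dictionaries), the function name and the stage labels in the usage note are assumptions made for illustration; expected missing counts are treated as real numbers, so the contribution of missing intervals in a stage is the count multiplied by the per-interval probability.

```python
def interval_probabilities(observed, missing_counts, median_omega):
    """Normalise relative probabilities for one recipient.

    observed       : list of (transmitter, stage, omega) for observed intervals
    missing_counts : dict stage -> expected number of missing intervals
    median_omega   : dict stage -> median omega of observed intervals in that stage
    Returns the probabilities of the observed intervals and the probability
    assigned to one missing interval in each stage.
    """
    denominator = sum(w for _, _, w in observed)
    denominator += sum(m * median_omega[z] for z, m in missing_counts.items())
    p_observed = [(i, z, w / denominator) for i, z, w in observed]
    p_missing = {z: median_omega[z] / denominator for z in missing_counts}
    return p_observed, p_missing
```

With observed intervals of weight 0.2 and 0.1 and two expected missing intervals of median weight 0.1, the denominator is 0.5, and the observed and missing contributions together sum to one.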
The proportions of transmissions originating from the fourteen infection/care stages were obtained by
summing the corresponding individual-level transmission probabilities (figure S8). Precisely, the
proportion of transmissions from stage 𝑥 to recipients diagnosed in [𝑡1 , 𝑡2 ] was calculated by
$$P_T(x, t_1, t_2) = \frac{\sum_{j \in R(t_1,t_2)} p_j(x)}{\sum_{z} \sum_{j \in R(t_1,t_2)} p_j(z)} = \frac{1}{J} \sum_{j \in R(t_1,t_2)} p_j(x),$$
where 𝑅(𝑡1 , 𝑡2 ) is the set of recipients with date of diagnosis in [𝑡1 , 𝑡2 ], 𝐽 is the number of recipients
with date of diagnosis in [𝑡1 , 𝑡2 ], and 𝑝𝑗 (𝑥) is the probability that recipient 𝑗 was infected by a
transmitter in stage 𝑥. The probability 𝑝𝑗 (𝑥) is the sum
$$p_j(x) = \sum_{i \in I_j} \sum_{\tau \in V_{ij}(x)} p_{ij\tau} + \sum_{\upsilon=1}^{m_j(x)} p_{j\upsilon},$$
where 𝐼𝑗 are the observed, phylogenetically probable transmitters to recipient 𝑗, 𝑉𝑖𝑗(𝑥) is the set of
observed transmission intervals between 𝑖 and 𝑗 in stage 𝑥, and all other quantities as defined above.
The formula for 𝑃𝑇 (𝑥, 𝑡1 , 𝑡2 ) can be intuitively interpreted as the average probability that a recipient
was infected by a transmitter in stage 𝑥. Thus, the precision in the estimated 𝑃𝑇 (𝑥, 𝑡1 , 𝑡2 ) depends
primarily on the number of available recipients. We identified substantial individual-level variation
in the transmission probabilities 𝑝𝑗 (𝑥) (figure S8), suggesting that a relatively large number of past
transmission events is needed to reliably quantify sources of transmission.
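The averaging interpretation can be made concrete with a short sketch. The representation of each recipient as a dictionary from stage to probability, and the function name, are assumptions for illustration; each recipient's probabilities are taken to sum to one, as stated above.

```python
def proportion_by_stage(recipient_probs):
    """Average stage-specific infection probabilities over recipients.

    recipient_probs : list with one dict per recipient, mapping
                      stage -> probability of infection from that stage.
    Returns a dict stage -> proportion of transmissions from that stage.
    """
    n_recipients = len(recipient_probs)
    if n_recipients == 0:
        raise ValueError("at least one recipient is required")
    stages = sorted({x for probs in recipient_probs for x in probs})
    return {
        x: sum(probs.get(x, 0.0) for probs in recipient_probs) / n_recipients
        for x in stages
    }
```

Because the estimate is a mean over recipients, its sampling variability shrinks with the number of recipients, and large between-recipient variation slows that shrinkage.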
To obtain a central estimate of 𝑃𝑇 (𝑥, 𝑡1 , 𝑡2 ), we used the central estimates of the 𝜔𝑖𝑗𝜏 and the
expected number of missing transmission intervals. To quantify uncertainty in 𝑃𝑇 (𝑥, 𝑡1 , 𝑡2 ), we
propagated uncertainty in the genetic distances and the number of missing transmission intervals
with a bootstrap algorithm.
Epidemiological prevention analysis
With the sources of transmission estimated, we compared the impact of prevention strategies in
counterfactual scenarios that modelled the re-distribution of phylogenetically identified transmitters
to less infectious stages in the HIV infection and care continuum. This reduced the overall
probability that any of the recipients would have been infected to less than one. The proportion of
infections that could have been averted in the period [𝑡1 , 𝑡2 ] with a counterfactual prevention
scenario 𝐻 is
$$a(H) = 1 - \sum_{j \in R(t_1,t_2)} \sum_{x} p_j^H(x),$$
where 𝑝𝑗𝐻 (𝑥) is the probability that recipient 𝑗 is infected by someone in stage 𝑥 under the
counterfactual prevention scenario 𝐻. The individual-level prevention models are described in the
supplementary online material.
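A minimal sketch of this calculation follows. It reads the double sum as being divided by the number of recipients, by analogy with the averaging used for the proportion of transmissions by stage, so that the result lies between zero and one; this normalisation, the function name and the data layout are assumptions of the sketch and are not stated in the formula above.

```python
def proportion_averted(counterfactual_probs):
    """Proportion of infections averted under a counterfactual scenario.

    counterfactual_probs : list with one dict per recipient, mapping
                           stage -> probability of infection from that stage
                           under the scenario (each dict sums to at most one).
    Assumption: the summed probabilities are averaged over recipients.
    """
    n_recipients = len(counterfactual_probs)
    if n_recipients == 0:
        raise ValueError("at least one recipient is required")
    remaining = sum(sum(probs.values()) for probs in counterfactual_probs)
    return 1.0 - remaining / n_recipients
```

If the scenario leaves every recipient's total probability at one, nothing is averted; if it halves every total, half of the infections are averted.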
Statistical uncertainty
Central estimates of 𝑃𝑇 (𝑥, 𝑡1 , 𝑡2 ) and 𝑎(𝐻) were obtained under central estimates of the genetic
distances in figure S4, the resulting phylogenetic transmission probabilities 𝜔𝑖𝑗𝜏 , and the expected
number of missing transmission intervals (figure 2C). Bootstrap sampling of the recipients, the
empirical distribution of genetic distances, the number of missing transmission intervals under a
Negative Binomial missing data model, and the counterfactual re-allocation procedure of probable
transmitters to less infectious infection/care stages was conducted to obtain non-parametric 95%
confidence intervals. Confidence intervals are based on 1,000 bootstrap replicates.
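The percentile bootstrap underlying such intervals can be sketched as follows. This simplified version resamples recipients only; the study additionally resampled genetic distances, the number of missing intervals and the counterfactual re-allocation, which are not reproduced here. The function names and defaults are hypothetical.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def bootstrap_ci(values, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values).

    values : one entry per recipient (for example, p_j(x) for a fixed stage x)
    stat   : function mapping a list of values to a number
    """
    rng = random.Random(seed)
    n = len(values)
    replicates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])  # resample with replacement
        for _ in range(n_boot)
    )
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Calling `bootstrap_ci(p_x, mean)` on the per-recipient probabilities for one stage gives an approximate 95% interval for the proportion of transmissions from that stage.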
Supplementary Materials
Word document Online Materials and Methods
Fig. S1 Number of identified recipient MSM by 3-month intervals.
Fig. S2 Duration of infection windows of recipient MSM.
Fig. S3 Snapshot of the reconstructed viral phylogeny.
Fig. S4 Uncertainty in the estimated genetic distance between sequences from the transmitter and
recipient of potential transmission pairs.
Fig. S5 Genetic distance between sequence pairs from previously published, epidemiologically
confirmed transmitter-recipient pairs, and sequence pairs from the phylogenetically probable
transmission pairs in this study.
Fig. S6 Right censoring at past, hypothetical database closure times.
Fig. S7 Sequence sampling probabilities by stage in the infection and care continuum.
Fig. S8 Individual-level variation in phylogenetically derived transmission probabilities by
infection/care stages.
Fig. S9 Frequency of infection/care stages among phylogenetically probable transmitters.
Fig. S10 Phylogenetically derived transmission probabilities of observed transmission intervals.
Fig. S11 Transmission risk ratio from men after ART start, compared to diagnosed untreated men
with CD4 > 500 cells/ml.
Fig. S12 Sensitivity analysis on the impact of PrEP with lower efficacy.
Fig. S13 Sensitivity analysis on the impact of lower or higher PrEP coverage.
Fig. S14 Impact of sampling and censoring adjustments on the estimated proportion of transmissions
from stages in the infection and care continuum.
Fig. S15 Impact of phylogenetic transmission probabilities on the estimated proportion of
transmissions from stages in the infection and care continuum.
Fig. S16 Impact of infection time estimates on the estimated proportion of transmissions from stages
in the infection and care continuum.
Fig. S17 Impact of phylogenetic clustering criteria on the estimated proportion of transmissions from
stages in the infection and care continuum.
Fig. S18 Impact of additional genetic distance criteria on the estimated proportion of transmissions
from stages in the infection and care continuum.
Fig. S19 Impact of sequence sampling and censoring adjustments on the estimated proportion of
averted infections.
Fig. S20 Impact of phylogenetic transmission probabilities on the estimated proportion of averted
infections.
Fig. S21 Impact of infection time estimates and phylogenetic exclusion criteria on the estimated
proportion of averted infections.
Fig. S22 Impact of additional genetic distance criteria on the estimated proportion of averted
infections per biomedical intervention.
Fig. S23 Differences in transmission networks with and without a recipient MSM.
Table S1. Clinical and viral sequence data used in this study.
Table S2 Potential transmitters and potential transmission pairs to the recipient MSM.
Table S3 Identified phylogenetically probable transmitters and phylogenetically probable
transmission pairs to the recipient MSM in the ATHENA cohort.
References and Notes
1.
M. S. Cohen, Y. Q. Chen, M. McCauley, T. Gamble, M. C. Hosseinipour, N. Kumarasamy, J.
G. Hakim, J. Kumwenda, B. Grinsztejn, J. H. Pilotto, S. V. Godbole, S. Mehendale, S.
Chariyalertsak, B. R. Santos, K. H. Mayer, I. F. Hoffman, S. H. Eshleman, E. Piwowar-Manning, L.
Wang, J. Makhema, L. A. Mills, G. de Bruyn, I. Sanne, J. Eron, J. Gallant, D. Havlir, S. Swindells,
H. Ribaudo, V. Elharrar, D. Burns, T. E. Taha, K. Nielsen-Saines, D. Celentano, M. Essex, T. R.
Fleming, Prevention of HIV-1 infection with early antiretroviral therapy. N Engl J Med 365, 493-505
(2011).
2.
A. Rodger, T. Bruun, V. Cambiano, P. Vernazza, V. Estrada, J. Van Lunzen, S. Collins, A.
M. Geretti, A. Phillips, J. Lundgren, HIV transmission risk through condomless sex if HIV+ partner
on suppressive ART: PARTNER Study, 21st Conference on Retroviruses and Opportunistic
Infections, Boston, MA, USA, 2014.
3.
C. Beyrer, S. D. Baral, F. van Griensven, S. M. Goodreau, S. Chariyalertsak, A. L. Wirtz, R.
Brookmeyer, Global epidemiology of HIV infection in men who have sex with men. Lancet 380,
367-377 (2012).
4.
P. S. Sullivan, A. Carballo-Dieguez, T. Coates, S. M. Goodreau, I. McGowan, E. J. Sanders,
A. Smith, P. Goswami, J. Sanchez, Successes and challenges of HIV prevention in men who have
sex with men. Lancet 380, 388-399 (2012).
5.
World Health Organization, Guideline on when to start antiretroviral therapy and on preexposure prophylaxis for HIV, No. September 2015 Geneva, 2015.
6.
A. Fogarty, L. Mao, Z. M. I, H. Santana, G. Prestage, J. Rule, P. Canavan, D. Murphy, M. D,
The Health in Men and Positive Health cohorts: A comparison of trends in the health and sexual
behaviour of HIV-negative and HIV-positive gay men, 2002-2005, National Centre in HIV Social
Research, Sydney, 2006.
7.
C. D. Pilcher, S. A. Fiscus, T. Q. Nguyen, E. Foust, L. Wolf, D. Williams, R. Ashby, J. O.
O'Dowd, J. T. McPherson, B. Stalzer, L. Hightow, W. C. Miller, J. J. Eron, Jr., M. S. Cohen, P. A.
Leone, Detection of acute infections during HIV testing in North Carolina. N Engl J Med 352, 1873-1883 (2005).
8.
H. A. Weiss, J. N. Wasserheit, R. V. Barnabas, R. J. Hayes, L. J. Abu-Raddad, Persisting
with prevention: the importance of adherence for HIV prevention. Emerg Themes Epidemiol 5, 8
(2008).
9.
A. van Sighem, L. Gras, A. Kesselring, C. Smit, I. Engelhard, I. Stolte, P. Reiss, Monitoring
of human immunodeficiency virus infection in the Netherlands. Report 2013, Amsterdam, 2013.
10.
T. T. Lam, C. C. Hon, J. W. Tang, Use of phylogenetics in the molecular epidemiology and
evolutionary studies of viral infections. Crit Rev Clin Lab Sci 47, 5-49 (2010).
11.
D. P. Wilson, HIV treatment as prevention: natural experiments highlight limits of
antiretroviral treatment as HIV prevention. PLoS Med 9, e1001231 (2012).
12.
Public Health England, Time to test for HIV: Expanding HIV testing in
healthcare and community services in England, 2011.
13.
E. Romero-Severson, H. Skar, I. Bulla, J. Albert, T. Leitner, Timing and order of
transmission events is not directly reflected in a pathogen phylogeny. Mol Biol Evol 31, 2472-2482
(2014).
14.
D. Pillay, A. Rambaut, A. M. Geretti, A. J. Brown, HIV phylogenetics. BMJ 335, 460-461
(2007).
15.
B. Vrancken, A. Rambaut, M. A. Suchard, A. Drummond, G. Baele, I. Derdelinckx, E. Van
Wijngaerden, A. M. Vandamme, K. Van Laethem, P. Lemey, The genealogical population dynamics
of HIV-1 in a large transmission chain: bridging within and among host evolutionary rates. PLoS
Comput Biol 10, e1003505 (2014).
16.
P. Lemey, I. Derdelinckx, A. Rambaut, K. Van Laethem, S. Dumont, S. Vermeulen, E. Van
Wijngaerden, A. M. Vandamme, Molecular footprint of drug-selective pressure in a human
immunodeficiency virus transmission chain. J Virol 79, 11981-11989 (2005).
17.
R. J. A. Little, D. B. Rubin, Statistical analysis with missing data. (Wiley, New York ;
Chichester, 1987).
18.
F. van Griensven, T. H. Holtz, W. Thienkrua, W. Chonwattana, W. Wimonsate, S.
Chaikummao, A. Varangrat, T. Chemnasiri, W. Sukwicha, M. E. Curlin, T. Samandari, A.
Chitwarakorn, P. A. Mock, Temporal trends in HIV-1 incidence and risk behaviours in men who
have sex with men in Bangkok, Thailand, 2006-13: an observational study. Lancet HIV 2, e64-70
(2015).
19.
F. D. Koedijk, B. H. van Benthem, E. M. Vrolings, W. Zuilhof, M. A. van der Sande,
Increasing sexually transmitted infection rates in young men having sex with men in the Netherlands,
2006-2012. Emerg Themes Epidemiol 11, 12 (2014).
20.
R. M. Grant, J. R. Lama, P. L. Anderson, V. McMahan, A. Y. Liu, L. Vargas, P. Goicochea,
M. Casapia, J. V. Guanira-Carranza, M. E. Ramirez-Cardich, O. Montoya-Herrera, T. Fernandez, V.
G. Veloso, S. P. Buchbinder, S. Chariyalertsak, M. Schechter, L. G. Bekker, K. H. Mayer, E. G.
Kallas, K. R. Amico, K. Mulligan, L. R. Bushman, R. J. Hance, C. Ganoza, P. Defechereux, B.
Postle, F. Wang, J. J. McConnell, J. H. Zheng, J. Lee, J. F. Rooney, H. S. Jaffe, A. I. Martinez, D. N.
Burns, D. V. Glidden, T. iPrEx Study, Preexposure chemoprophylaxis for HIV prevention in men
who have sex with men. N Engl J Med 363, 2587-2599 (2010).
21.
S. McCormack, D. T. Dunn, M. Desai, D. I. Dolling, M. Gafos, R. Gilson, A. K. Sullivan, A.
Clarke, I. Reeves, G. Schembri, N. Mackie, C. Bowman, C. J. Lacey, V. Apea, M. Brady, J. Fox, S.
Taylor, S. Antonucci, S. H. Khoo, J. Rooney, A. Nardone, M. Fisher, A. McOwan, A. N. Phillips, A.
M. Johnson, B. Gazzard, O. N. Gill, Pre-exposure prophylaxis to prevent the acquisition of HIV-1
infection (PROUD): effectiveness results from the pilot phase of a pragmatic open-label randomised
trial. Lancet, (2015).
22.
J. M. Molina, C. Capitant, B. Spire, G. Pialoux, C. Chidiac, I. Charreau, C. Tremblay, L.
Meyer, J. F. Delfraissy, in CROI. (Seattle, 2015).
23.
F. Tanser, T. Barnighausen, E. Grapsa, J. Zaidi, M. L. Newell, High coverage of ART
associated with decline in risk of HIV acquisition in rural KwaZulu-Natal, South Africa. Science
339, 966-971 (2013).
24.
D. Frentz, A. M. Wensing, J. Albert, D. Paraskevis, A. B. Abecasis, O. Hamouda, L. B.
Jorgensen, C. Kucherer, D. Struck, J. C. Schmit, B. Asjo, C. Balotta, D. Beshkov, R. J. Camacho, B.
Clotet, S. Coughlan, S. De Wit, A. Griskevicius, Z. Grossman, A. Horban, T. Kolupajeva, K. Korn,
L. G. Kostrikis, K. Liitsola, M. Linka, C. Nielsen, D. Otelea, R. Paredes, M. Poljak, E. Puchhammer-Stockl, A. Sonnerborg, D. Stanekova, M. Stanojevic, A. M. Vandamme, C. A. Boucher, D. A. Van
de Vijver, S. Programme, Limited cross-border infections in patients newly diagnosed with HIV in
Europe. Retrovirology 10, 36 (2013).
25.
B. G. Brenner, M. A. Wainberg, Future of phylogeny in HIV prevention. J Acquir Immune
Defic Syndr 63 Suppl 2, S248-254 (2013).
26.
A. B. Cope, K. A. Powers, J. D. Kuruc, P. A. Leone, J. A. Anderson, L. H. Ping, L. P. Kincer,
R. Swanstrom, V. L. Mobley, E. Foust, C. L. Gay, J. J. Eron, M. S. Cohen, W. C. Miller, Ongoing
HIV Transmission and the HIV Care Continuum in North Carolina. PLoS One 10, e0127950 (2015).
27.
J. Skarbinski, E. Rosenberg, G. Paz-Bailey, H. I. Hall, C. E. Rose, A. H. Viall, J. L. Fagan, A.
Lansky, J. H. Mermin, Human immunodeficiency virus transmission at each step of the care
continuum in the United States. JAMA Intern Med 175, 588-596 (2015).
28.
E. S. Rosenberg, G. A. Millett, P. S. Sullivan, C. Del Rio, J. W. Curran, Understanding the
HIV disparities between black and white men who have sex with men in the USA using the HIV care
continuum: a modeling study. Lancet HIV 1, e112-e118 (2014).
29.
I. S. S. Group, J. D. Lundgren, A. G. Babiker, F. Gordin, S. Emery, B. Grund, S. Sharma, A.
Avihingsanon, D. A. Cooper, G. Fatkenheuer, J. M. Llibre, J. M. Molina, P. Munderi, M. Schechter,
R. Wood, K. L. Klingman, S. Collins, H. C. Lane, A. N. Phillips, J. D. Neaton, Initiation of
Antiretroviral Therapy in Early Asymptomatic HIV Infection. N Engl J Med 373, 795-807 (2015).
30.
E. Volz, E. Ionides, E. Romero-Severson, M. G. Brandt, E. Mokotoff, J. Koopman, HIV-1
Transmission During Early Infection in Men Who Have Sex with Men: A Phylodynamic Analysis.
PLoS Med 10, e1001568 (2013).
31.
U. S. F. a. D. Administration, Truvada approved to reduce the risk of sexually transmitted
HIV in people who are not infected with the virus., (2012).
32.
R. M. Grant, P. L. Anderson, V. McMahan, A. Liu, K. R. Amico, M. Mehrotra, S. Hosek, C.
Mosquera, M. Casapia, O. Montoya, S. Buchbinder, V. G. Veloso, K. Mayer, S. Chariyalertsak, L.
G. Bekker, E. G. Kallas, M. Schechter, J. Guanira, L. Bushman, D. N. Burns, J. F. Rooney, D. V.
Glidden, t. iPrEx study, Uptake of pre-exposure prophylaxis, sexual practices, and HIV incidence in
men and transgender women who have sex with men: a cohort study. Lancet Infect Dis 14, 820-829
(2014).
33.
A. Liu, S. Cohen, S. Follansbee, D. Cohan, S. Weber, D. Sachdev, S. Buchbinder, Early
experiences implementing pre-exposure prophylaxis (PrEP) for HIV prevention in San Francisco.
PLoS Med 11, e1001613 (2014).
34.
J. P. Bil, U. Davidovich, W. M. van der Veldt, M. Prins, H. J. de Vries, G. J. Sonder, I. G.
Stolte, What do Dutch MSM think of preexposure prophylaxis to prevent HIV-infection? A cross-sectional study. AIDS 29, 955-964 (2015).
35.
M. J. Mimiaga, J. M. White, D. S. Krakower, K. B. Biello, K. H. Mayer, Suboptimal
awareness and comprehension of published preexposure prophylaxis efficacy results among
physicians in Massachusetts. AIDS Care 26, 684-693 (2014).
36.
K. H. Mayer, S. Hosek, S. Cohen, A. Liu, J. Pickett, M. Warren, D. Krakower, R. Grant,
Antiretroviral pre-exposure prophylaxis implementation in the United States: a work in progress. J
Int AIDS Soc 18, 19980 (2015).
37.
D. Pao, M. Fisher, S. Hue, G. Dean, G. Murphy, P. A. Cane, C. A. Sabin, D. Pillay,
Transmission of HIV-1 during primary infection: relationship to sexual risk and sexually transmitted
infections. AIDS 19, 85-90 (2005).
38.
N. Pant Pai, J. Sharma, S. Shivkumar, S. Pillay, C. Vadnais, L. Joseph, K. Dheda, R. W.
Peeling, Supervised and unsupervised self-testing for HIV in high- and low-risk populations: a
systematic review. PLoS Med 10, e1001414 (2013).
39.
N. Lorente, M. Preau, C. Vernay-Vaisse, M. Mora, J. Blanche, J. Otis, A. Passeron, J. M. Le
Gall, P. Dhotte, M. P. Carrieri, M. Suzan-Monti, B. Spire, A.-D. S. Group, Expanding access to non-medicalized community-based rapid testing to men who have sex with men: an urgent HIV
prevention intervention (the ANRS-DRAG study). PLoS One 8, e61225 (2013).
40.
G. B. Gomez, A. Borquez, K. K. Case, A. Wheelock, A. Vassall, C. Hankins, The cost and
impact of scaling up pre-exposure prophylaxis for HIV prevention: a systematic review of cost-effectiveness modelling studies. PLoS Med 10, e1001401 (2013).
41.
R. B. Birger, T. B. Hallett, A. Sinha, B. T. Grenfell, S. L. Hodder, Modeling the impact of
interventions along the HIV continuum of care in Newark, New Jersey. Clin Infect Dis 58, 274-284
(2014).
42.
J. Heuker, G. J. Sonder, I. Stolte, R. Geskus, A. van den Hoek, High HIV incidence among
MSM prescribed postexposure prophylaxis, 2000-2009: indications for ongoing sexual risk
behaviour. AIDS 26, 505-512 (2012).
43.
V. A. Johnson, V. Calvez, H. F. Gunthard, R. Paredes, D. Pillay, R. W. Shafer, A. M.
Wensing, D. D. Richman, Update of the drug resistance mutations in HIV-1: March 2013. Top
Antivir Med 21, 6-14 (2013).
44.
B. D. Rice, J. Elford, Z. Yin, V. C. Delpech, A new method to assign country of HIV
infection among heterosexuals born abroad and diagnosed with HIV. AIDS 26, 1961-1966 (2012).
45.
A. M. Kozlov, A. J. Aberer, A. Stamatakis, ExaML version 3: a tool for phylogenomic
analyses on supercomputers. Bioinformatics 31, 2577-2579 (2015).
46.
T. Leitner, D. Escanilla, C. Franzen, M. Uhlen, J. Albert, Accurate reconstruction of a known
HIV-1 transmission history by phylogenetic tree analysis. Proc Natl Acad Sci U S A 93, 10864-10869 (1996).
47.
A. van Sighem, F. Nakagawa, D. De Angelis, C. Quinten, D. Bezemer, E. O. de Coul, M.
Egger, F. de Wolf, C. Fraser, A. Phillips, Estimating HIV Incidence, Time to Diagnosis, and the
Undiagnosed HIV Epidemic Using Routine Surveillance Data. Epidemiology 26, 653-660 (2015).
48.
Health Protection Agency, Longitudinal analysis of the trajectories of CD4 cell counts, 2011.
49.
D. Bezemer, F. de Wolf, M. C. Boerlijst, A. van Sighem, T. D. Hollingsworth, C. Fraser, 27
years of the HIV epidemic amongst men having sex with men in the Netherlands: an in depth
mathematical model-based analysis. Epidemics 2, 66-79 (2010).
50.
S. H. Eshleman, S. E. Hudelson, A. D. Redd, L. Wang, R. Debes, Y. Q. Chen, C. A. Martens,
S. M. Ricklefs, E. J. Selig, S. F. Porcella, S. Munshaw, S. C. Ray, E. Piwowar-Manning, M.
McCauley, M. C. Hosseinipour, J. Kumwenda, J. G. Hakim, S. Chariyalertsak, G. de Bruyn, B.
Grinsztejn, N. Kumarasamy, J. Makhema, K. H. Mayer, J. Pilotto, B. R. Santos, T. C. Quinn, M. S.
Cohen, J. P. Hughes, Analysis of genetic linkage of HIV from couples enrolled in the HIV
Prevention Trials Network 052 trial. J Infect Dis 204, 1918-1926 (2011).
51.
A. Gavryushkina, D. Welch, T. Stadler, A. J. Drummond, Bayesian inference of sampled
ancestor trees for epidemiology and fossil calibration. PLoS Comput Biol 10, e1003919 (2014).
52.
S. B. McCombs, E. McCray, D. A. Wendell, P. A. Sweeney, I. M. Onorato, Epidemiology of
HIV-1 infection in bisexual women. J Acquir Immune Defic Syndr 5, 850-852 (1992).
53.
B. Efron, R. J. Tibshirani, An introduction to the bootstrap. (Chapman and Hall, Boca Raton
u. a., ed. [Reprint], 1998), pp. XVI, 436 S.
54.
M. S. Cohen, G. M. Shaw, A. J. McMichael, B. F. Haynes, Acute HIV-1 Infection. N Engl J
Med 364, 1943-1954 (2011).
55.
Associated Partners of EMIS, EMIS 2010: The European Men-Who-Have-Sex-With-Men
Internet Survey. Findings from 38 countries., Stockholm, 2013.
Acknowledgments: We thank the Imperial College High Performance Computing Service
(http://www3.imperial.ac.uk/ict/services/hpc), three anonymous referees, the HIV treating
physicians, HIV nurse consultants and staff of the diagnostic laboratories and facilities in the HIV
treatment centres, along with the data collecting and monitoring staff both within and outside the
Stichting HIV Monitoring Foundation for their contributions to make this work possible. Funding:
OR is supported by the Wellcome Trust (fellowship WR092311MF); CF by the European Research
Council (Advanced Grant PBDR-339251) and the Bill & Melinda Gates Foundation (PANGEA-HIV
consortium). PR through his institution received independent scientific grant support from Bristol-
Myers Squibb, ViiV Healthcare, Gilead Sciences, Janssen Pharmaceuticals Inc., Merck&Co, served
on a scientific advisory board for Gilead Sciences and serves on a data safety monitoring committee
for Janssen Pharmaceuticals Inc., for which his institution has received remuneration. The Aids
Therapy Evaluation in the Netherlands (ATHENA) observational cohort study is part of Stichting
HIV Monitoring and supported by a grant from the Netherlands Ministry of Health, Welfare and
Sport through its Centre for Infectious Disease Control-National Institute for Public Health and the
Environment. The funders had no role in study design, data collection and analysis, decision to
publish, or preparation of the manuscript. Author contributions: OR, FW, PR, CF conceived the
study. OR, CF developed the methods, did the analysis, and reviewed all statistical aspects of the
analysis. AS, DB, SJ, AW provided data used to conduct the analysis. AG assisted in estimating the
viral phylogeny. AS, DB, FW, PR advised on analysis and interpretation. OR, CF wrote the first
draft. All authors reviewed and approved the final version. Competing interests: None declared.
Data and materials availability: Data are available from the HIV Monitoring Institutional Data
Access / Ethics Committee for researchers who meet the criteria for access to confidential
data. Contact email: [email protected].
Fig. 1. Study design. Nationwide sources of transmission were identified for MSM with evidence
for recent infection in the first year prior to diagnosis (recipient MSM). (A) Out of all patients in the
ATHENA cohort, men whose course of infection overlapped with the infection window were
considered as potential transmitters. (B) Only those pairs with sequences from both individuals were
considered for further analysis. (C-D) Using viral phylogenetic analyses, the vast majority of pairs
could be ruled out. All remaining pairs were considered phylogenetically probable. (E) Based on
detailed clinical records, probable transmission events were characterized by stage in the HIV
infection and care continuum. Because transmitters progressed in stage over time, we considered
time-resolved transmission intervals. (F) Independent viral phylogenetic data from epidemiologically
confirmed pairs was used to determine the phylogenetic probability of direct transmission during
each interval. Statistical analyses adjusted for extensive sampling and censoring biases.
Fig. 2. Phylogenetically probable transmission intervals, linked to stages in the infection and
care continuum. (A) Left: Each recipient could have been infected during his infection window
from multiple probable transmitters. For each transmitter, the transmission window was split into
six-week long probable transmission intervals. Infection/care stages were assigned to these intervals
based on clinical data to reflect progression of the transmitters through the infection/care continuum.
Right: Relationship between the fourteen infection/care stages as defined in table 2. Transmitters
progress uni-directionally, except for stages after first viral suppression, or when individuals re-enter
care (as indicated by arrows). (B) For each stage, the total number of observed transmission intervals
to recipient MSM during their infection windows is shown. Overall, the number of transmission
intervals per recipient increases with time, reflecting the increasing number of infected men in care.
Transmitters are increasingly less likely to have been diagnosed by 2013, resulting in a decreasing
number of undiagnosed transmission intervals towards the present. (C) In addition to censoring,
diagnosed transmitters may not have a sequence sampled. Comparing men with and without a
sequence in the near complete population cohort, we could adjust for these biases. The total number
of expected missing transmission intervals to recipients diagnosed in one of four observation periods
is shown, along with 95% bootstrap confidence intervals. Observed and expected missing
transmission intervals were associated with phylogenetic transmission probabilities, which sum to
one per recipient.
Fig. 3. Proportion of transmissions by stage in the infection and care continuum, versus
proportion of these stages amongst infected men. (A) Relative frequency of infection/care stages
in the population, among potential transmitters that overlap with the infection windows of recipient
MSM and could have in principle transmitted to one of the recipient MSM (stage A in figure 1,
colour codes as in figure 2). (B) Proportion of the 617 transmission events attributable to each
infection/care stage (bar: 95% bootstrap confidence interval).
Fig. 4. Impact of biomedical interventions amongst MSM in the Netherlands. Estimated
proportion of transmissions that could have been averted in the period 2008/07-2010/12 if the
corresponding additional prevention strategies had been implemented by 2008/07 (line: median, box:
bootstrap interquartile range, whiskers: 95% bootstrap confidence interval). Scenarios were varied by
annual testing coverage of phylogenetically identified, probable transmitters. Current testing
coverage was 17%, corresponding to the proportion of probable transmitters that had a negative test
in the twelve months prior to diagnosis.
Table 1. HIV incidence trends and care for infected MSM in the Netherlands and other countries.

| Country | Annual testing of uninfected MSM: year | Annual testing of uninfected MSM: % | Diagnosed MSM receiving ART: year | Diagnosed MSM receiving ART: % | Median CD4 count at ART initiation (cells/ml) | Treated MSM with suppressed viral load: year | Treated MSM with suppressed viral load: % | Viral load threshold (cps/ml) | Retention of MSM in care: year | Retention of MSM in care: % | HIV incidence among MSM: year | HIV incidence among MSM: trend |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Netherlands | 2003 | ?? | 2003 | 79 | 202 | 2003 | 80 | <100 | 2003 | 92 | 2003 | Increasing n |
| Netherlands | 2013 | 38.4 a | 2013 | 90 | 382 | 2013 | 91 | <100 | 2012 | 95 | 2013 | Stable n |
| Australia | 2013 | 61.1 b | 2013 | 75 b | 379 h | 2013 | 88 k, | <50 | 2013 | 96 h, | 2013 | Increasing, stable in Western Australia and Queensland o |
| British Columbia | 2009 | 51 c | 2014 | 85 e | 411 e | 2014 | 84 e | <50 | 2011 | 86 d, | 2013 | Stable p |
| Switzerland | 2010 | 39.3 a | 2014 | 86 f | 402 f | 2012 | 96 l, | <200 | 2012 | 97 i, | 2014 | Decreasing new diagnoses and recent infections q |
| United Kingdom | 2010 | 36.4 a | 2013 | 86 g | 420 j | 2013 | 91 m | <200 | 2013 | 95 m | 2013 | Stable r |

a From The EMIS Network. EMIS 2010: The European Men-Who-Have-Sex-With-Men Internet Survey. Findings from 38 countries. Stockholm: European Centre for Disease Prevention and Control, 2013.
b From Gay Community Periodic Surveys, https://kirby.unsw.edu.au/projects/gay-community-periodic-surveys, reported in HIV, hepatitis, and sexually transmissible infections in Australia, Annual surveillance report 2014.
c From Mancount, prospective cross-sectional survey in Vancouver, http://www.mancount.ca/files/ManCount_Report2010.pdf.
d From Nosyk B, Montaner JS, Colley G, et al. The cascade of HIV care in British Columbia, Canada, 1996-2011: a population-based retrospective cohort study. The Lancet infectious diseases 2014;14:40-9.
e From HIV monitoring quarterly report for British Columbia, Fourth quarter 2014.
f 2621 out of 3081 MSM on ART and registered in the Swiss HIV Cohort Study, personal communication with the Datacenter of the Swiss HIV Cohort Study.
g From https://www.gov.uk/government/statistics/hiv-data-tables.
h From Australian HIV Observational Database Annual Report 2014, reporting care indicators in a closed observational cohort.
i From http://www.shcs.ch/155-shcs-key-data-figures update June 2014.
j Within 9 months prior to ART initiation, personal communication PHE.
k From the Australian HIV Observational Database, reported in HIV, hepatitis, and sexually transmissible infections in Australia, Annual surveillance report 2014.
l Kohler P, Schmidt AJ, Ledergerber B, Vernazza P, CROI2015, http://www.croiconference.org/sites/default/files/posters-2015/1008.pdf.
m From HIV in the United Kingdom: 2014 Report.
n From reference (44).
o From fact sheet HIV and AIDS in Australia, 20th International AIDS conference.
p From http://www.phac-aspc.gc.ca/aids-sida/publication/epi/2010/index-eng.php
q From HIV- und STI-Fallzahlen 2014: Berichterstattung, Analysen und Trends, in comparison with numbers for 2008 in the 2012 report, http://www.bag.admin.ch/hiv_aids/12472/12480/12481/12484/index.html?lang=de.
r From Birrell PJ, Gill ON, Delpech VC, et al. HIV incidence in men who have sex with men in England and Wales 2001-10: a nationwide population study. The Lancet infectious diseases 2013;13:313-8.
Estimate not specific to MSM.
Table 2. Stages in the HIV infection and care continuum

Infection/care stage of transmitter: Definition

Undiagnosed: Transmission intervals whose midpoint is before diagnosis:
  Confirmed recent infection at diagnosis: All transmission intervals of transmitters that were in
    laboratory confirmed recent infection at time of diagnosis.
  Estimated to be in recent infection: Considering transmitters that had no evidence for recent
    infection at time of diagnosis, all transmission intervals whose midpoint is less than 12 months
    after the estimated infection date.
  Estimated to be in chronic infection: Considering transmitters that had no evidence for recent
    infection at time of diagnosis, all transmission intervals whose midpoint is more than 12 months
    after the estimated infection date.

Diagnosed: Transmission intervals whose midpoint is after diagnosis and before ART start (only of
transmitters that are in contact with care services):
  Diagnosed < 3mo, Recent infection at diagnosis: Considering potential or probable transmitters
    that were in laboratory confirmed recent infection at time of diagnosis, all transmission
    intervals whose midpoint is within the first three months after diagnosis.
  No CD4 measured: No available CD4 count since diagnosis up to the midpoint of the interval.
  CD4 > 500: CD4 counts remained above 500 cells/ml between the first CD4 count up to the midpoint
    of the interval.
  CD4 in [350-500]: CD4 counts decreased to 350-500 cells/ml between the first CD4 count up to the
    midpoint of the interval.
  CD4 < 350: CD4 counts decreased to below 350 cells/ml between the first CD4 count up to the
    midpoint of the interval.

ART initiated: Transmission intervals whose midpoint is after ART start (only of transmitters that
are in contact with care services):
  Before first viral suppression: No first viral load measurement below 100 copies/ml in any
    transmission interval of the transmitter after ART start.
  After first viral suppression¶:
    No viral load measured¶: No viral load measurement in any transmission interval of the
      transmitter after ART start.
    No viral suppression¶: At least one viral load measurement at or above 100 copies/ml in any
      transmission interval of the transmitter after ART start.
    Viral suppression, one observation¶: One viral load measurement in any transmission interval of
      the transmitter after ART start, which is below 100 copies/ml.
    Viral suppression, >1 observations¶: Several viral load measurements in any transmission
      interval of the transmitter after ART start, all of which are below 100 copies/ml.

Not in contact: No patient record (last contact, clinic visit, CD4 measurement, viral load
measurement) in the past and future 9 months from the midpoint of the transmission interval.

¶ While flow through the stages is typically unidirectional, men could move freely between these stages.
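The definitions in Table 2 can be read as a decision rule applied to the midpoint of each transmission interval. The following Python sketch illustrates this for the undiagnosed and diagnosed stages only. The function name, its arguments, and the use of decimal years are illustrative assumptions; reading "decreased to" as the lowest CD4 count observed up to the midpoint is also an assumption, and the sub-stages after ART start and the "not in contact" stage are not resolved.

```python
from typing import Optional, Sequence


def classify_stage(midpoint: float,
                   infection: float,
                   diagnosis: float,
                   art_start: Optional[float],
                   recent_at_diagnosis: bool,
                   cd4_counts: Sequence[float]) -> str:
    """Assign a Table 2 stage to one transmission interval.

    All times are decimal years (an assumption of this sketch).
    cd4_counts holds the CD4 counts measured between diagnosis and the midpoint.
    """
    if midpoint < diagnosis:
        # Undiagnosed stages
        if recent_at_diagnosis:
            return "Confirmed recent infection at diagnosis"
        if midpoint - infection < 1.0:
            return "Estimated to be in recent infection"
        return "Estimated to be in chronic infection"
    if art_start is not None and midpoint >= art_start:
        # Sub-stages after ART start need viral load data; not resolved here.
        return "ART initiated"
    # Diagnosed, before ART start
    if recent_at_diagnosis and midpoint - diagnosis <= 0.25:
        return "Diagnosed < 3mo, recent infection at diagnosis"
    if not cd4_counts:
        return "No CD4 measured"
    lowest = min(cd4_counts)  # assumption: lowest count observed so far
    if lowest > 500:
        return "CD4 > 500"
    if lowest >= 350:
        return "CD4 in [350-500]"
    return "CD4 < 350"
```

For example, an interval with midpoint 2007.0 of a transmitter diagnosed in 2006.0, not yet on ART, and with CD4 counts of 600 and 420 would be placed in the stage "CD4 in [350-500]" under these assumptions.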
Table 3. Characteristics of the recipient MSM with identified sources of transmission

Characteristic | Recipient MSM with a phylogenetically probable transmitter (n= 617) | Recipient MSM with or without a sequence (n= 1,794) | Diagnosed MSM (n= 7,978)
Evidence for infection in the past year
  Previous negative test in the past year (%) | 77 | 76 | 17
  Laboratory diagnosis (%) | 8 | 7 | 2
  Clinical diagnosis of acute infection (%) | 15 | 17 | 4
Age at diagnosis (years; mean and IQR) | 36.8 (29.5-42.9) | 37.2 (29.9-43.5) | 38.7 (31.3-45.1)
First CD4 count within 12 months of diagnosis and before ART start (cells/ml; mean and IQR) | 505 (350-630) | 534 (360-670) | 402 (200-560)
Viral load count within 12 months of diagnosis (log10 RNA; mean and IQR) | 4.9 (4.4-5.5) | 4.8 (4.3-5.4) | 4.7 (4.3-5.3)
In care in the Amsterdam metropolitan area (%) | 45.1 | 43.5 | 43.6
Last negative test within 12 months prior to diagnosis (%) | 77.0 | 76.1 | 17.1
Self-reported in country infection¶ (%) | 96.9 | 91.9 | 88.5

¶ Of those self-reporting a country of origin.
Table 4. Proportion of transmissions by stage in the HIV infection and care continuum.

% of transmissions by time of diagnosis of recipient MSM (95% confidence interval)

Infection/care stage of transmitter | Overall (n=617) | 96/07-06/04 (n=165) | 06/05-07/12 (n=145) | 08/01-09/06 (n=151) | 09/07-10/12 (n=156)
Undiagnosed (total) | 70.9 (65.8-72.5) | 67.6 (59.3-72.7) | 72.3 (64.2-76.9) | 71.8 (63.4-76.3) | 72.2 (63.3-76.3)
  Confirmed recent infection at diagnosis | 15.5 (11.9-17.4) | 15 (7.6-19.4) | 21.7 (15-26.5) | 16.4 (11-20.8) | 9.4 (5.6-14.1)
  Estimated to be in recent infection | 25.1 (19.4-28.1) | 17.3 (11.7-22.7) | 23 (15.1-30.1) | 25.9 (15.4-33.6) | 34.6 (19.4-43.4)
  Estimated to be in chronic infection | 30.3 (28-34) | 35.2 (30.2-42) | 27.6 (22.4-34) | 29.5 (24.2-36.1) | 28.2 (23-35.7)
Diagnosed (total) | 22.4 (20.7-26.2) | 23.6 (18.5-29.7) | 22.9 (18.6-29.1) | 22.8 (18.3-29.4) | 20.7 (17.4-27.3)
  Diagnosed < 3mo, Recent infection at diagnosis | 2.9 (2.2-4.1) | 2.5 (1-4.9) | 3.2 (1.7-5.5) | 3 (1.9-5.4) | 2.8 (1.8-4.4)
  No CD4 measured | 1.6 (1.2-2.4) | 2.9 (1.6-4.8) | 0.8 (0.4-1.8) | 1.5 (0.6-3) | 1 (0.6-2.1)
  CD4 >500 | 8.3 (7-10.3) | 10.2 (6.7-14.2) | 7 (4.5-10.8) | 8.7 (5.9-12.5) | 7.1 (5.4-10.1)
  CD4 in [350-500] | 6.4 (5.4-7.9) | 4.8 (2.6-7.8) | 7.3 (5.1-10.5) | 5.9 (4.2-8.3) | 7.7 (5.7-11)
  CD4 < 350 | 3.4 (2.5-4.3) | 3.2 (1.2-5.5) | 4.6 (2.6-6.6) | 3.7 (2.2-5.6) | 2.1 (1.3-3.3)
ART initiated (total) | 5.7 (5.2-7.8) | 7 (4.8-11.7) | 3.7 (2.2-6.5) | 4.9 (3.7-8.1) | 6.7 (5.4-10.2)
  Before first viral suppression | 1.8 (1.6-2.7) | 2.2 (1.2-4.4) | 0.7 (0.4-1.5) | 1.3 (0.9-2.6) | 2.8 (2.1-4.6)
  After first viral suppression
    No viral load measured | 0.5 (0.3-1) | 0.9 (0.1-2.4) | 0.1 (0-0.3) | 0.3 (0.1-0.9) | 0.8 (0.4-1.8)
    No viral suppression | 1.4 (0.9-2.1) | 2.8 (1.2-5.2) | 1.2 (0.4-2.6) | 0.9 (0.4-1.9) | 0.5 (0.1-1)
    Viral suppression, one observation | 0.4 (0.3-0.8) | 0.1 (0-0.5) | 0.2 (0-0.8) | 0.5 (0.2-1.7) | 0.6 (0.3-1.5)
    Viral suppression, >1 observations | 1.6 (1.1-2.5) | 1 (0.1-2.6) | 1.5 (0.6-3.1) | 1.9 (0.9-3.6) | 2 (1.1-3.6)
Not in contact | 1 (0.7-1.6) | 1.8 (0.8-3.4) | 1.1 (0.4-2.3) | 0.5 (0.2-1.4) | 0.4 (0.2-0.8)
Recent infection (total) | 43.5 (36.6-46) | 34.9 (25.4-40.6) | 47.9 (36.9-54.8) | 45.3 (33.3-54.1) | 47.7 (32.8-53.8)
Extended acknowledgements
The ATHENA database is maintained by Stichting HIV Monitoring and supported by a grant from the Dutch
Ministry of Health, Welfare and Sport through the Centre for Infectious Disease Control of the National Institute
for Public Health and the Environment.
CLINICAL CENTRES
* denotes site coordinating physician
Academic Medical Centre of the University of Amsterdam: HIV treating physicians: J.M. Prins*, T.W.
Kuijpers, H.J. Scherpbier, J.T.M. van der Meer, F.W.M.N. Wit, M.H. Godfried, P. Reiss, T. van der Poll, F.J.B.
Nellen, S.E. Geerlings, M. van Vugt, D. Pajkrt, J.C. Bos, W.J. Wiersinga, M. van der Valk, A. Goorhuis, J.W.
Hovius, A.M. Weijsenfeld. HIV nurse consultants: J. van Eden, A. Henderiks, A.M.H. van Hes, M.
Mutschelknauss, H.E. Nobel, F.J.J. Pijnappel. HIV clinical virologists/chemists: S. Jurriaans, N.K.T. Back, H.L.
Zaaijer, B. Berkhout, M.T.E. Cornelissen, C.J. Schinkel, X.V. Thomas. Admiraal De Ruyter Ziekenhuis, Goes:
HIV treating physicians: M. van den Berge, A. Stegeman. HIV nurse consultants: S. Baas, L. Hage de Looff. HIV
clinical virologists/chemists: D. Versteeg. Catharina Ziekenhuis, Eindhoven: HIV treating physicians: M.J.H.
Pronk*, H.S.M. Ammerlaan. HIV nurse consultants: E.S. de Munnik. HIV clinical virologists/chemists: A.R. Jansz,
J. Tjhie, M.C.A. Wegdam, B. Deiman, V. Scharnhorst. Emma Kinderziekenhuis: HIV nurse consultants: A. van
der Plas, A.M. Weijsenfeld. Erasmus Medisch Centrum, Rotterdam: HIV treating physicians: M.E. van der
Ende*, T.E.M.S. de Vries-Sluijs, E.C.M. van Gorp, C.A.M. Schurink, J.L. Nouwen, A. Verbon, B.J.A. Rijnders,
H.I. Bax, M. van der Feltz. HIV nurse consultants: N. Bassant, J.E.A. van Beek, M. Vriesde, L.M. van Zonneveld.
Data collection: A. de Oude-Lubbers, H.J. van den Berg-Cameron, F.B. Bruinsma-Broekman, J. de Groot, M. de
Zeeuw- de Man. HIV clinical virologists/chemists: C.A.B. Boucher, M.P.G Koopmans, J.J.A van Kampen.
Erasmus Medisch Centrum–Sophia, Rotterdam: HIV treating physicians: G.J.A. Driessen, A.M.C. van
Rossum. HIV nurse consultants: L.C. van der Knaap, E. Visser. Flevoziekenhuis, Almere: HIV treating
physicians: J. Branger*, A. Rijkeboer-Mes. HIV nurse consultant and data collection: C.J.H.M. Duijf-van de Ven.
HagaZiekenhuis, Den Haag: HIV treating physicians: E.F. Schippers*, C. van Nieuwkoop. HIV nurse
consultants: J.M. van IJperen, J. Geilings. Data collection: G. van der Hut. HIV clinical virologist/chemist: P.F.H.
Franck. HIV Focus Centrum (DC Klinieken): HIV treating physicians: A. van Eeden*. HIV nurse consultants:
W. Brokking, M. Groot, L.J.M. Elsenburg. HIV clinical virologists/chemists: M. Damen, I.S. Kwa. Isala, Zwolle:
HIV treating physicians: P.H.P. Groeneveld*, J.W. Bouwhuis. HIV nurse consultants: J.F. van den Berg, A.G.W.
van Hulzen. Data collection: G.L. van der Bliek, P.C.J. Bor. HIV clinical virologists/chemists: P. Bloembergen,
M.J.H.M. Wolfhagen, G.J.H.M. Ruijs. Leids Universitair Medisch Centrum, Leiden: HIV treating physicians:
F.P. Kroon*, M.G.J. de Boer, M.P. Bauer, H. Jolink, A.M. Vollaard. HIV nurse consultants: W. Dorama, N. van
Holten. HIV clinical virologists/chemists: E.C.J. Claas, E. Wessels. Maasstad Ziekenhuis, Rotterdam: HIV
treating physicians: J.G. den Hollander*, K. Pogany, A. Roukens. HIV nurse consultants: M. Kastelijns, J.V. Smit,
E. Smit, D. Struik-Kalkman, C. Tearno. Data collection: M. Bezemer, T. van Niekerk. HIV clinical
virologists/chemists: O. Pontesilli.
Maastricht UMC+, Maastricht: HIV treating physicians: S.H. Lowe*, A.M.L. Oude Lashof, D. Posthouwer. HIV
nurse consultants: R.P. Ackens, J. Schippers, R. Vergoossen. Data collection: B. Weijenberg-Maes. HIV clinical
virologists/chemists: I.H.M. van Loo, T.R.A. Havenith. MC Slotervaart, Amsterdam: HIV treating physicians:
J.W. Mulder, S.M.E. Vrouenraets, F.N. Lauw. HIV nurse consultants: M.C. van Broekhuizen, H. Paap, D.J.
Vlasblom. HIV clinical virologists/chemists: P.H.M. Smits. MC Zuiderzee, Lelystad: HIV treating physicians: S.
Weijer*, R. El Moussaoui. HIV nurse consultant: A.S. Bosma. Medisch Centrum Alkmaar: HIV treating
physicians: W. Kortmann*, G. van Twillert*, J.W.T. Cohen Stuart, B.M.W. Diederen. HIV nurse consultant and
data collection: D. Pronk, F.A. van Truijen-Oud. HIV clinical virologists/chemists: W. A. van der Reijden, R.
Jansen. Medisch Centrum Haaglanden, Den Haag: HIV treating physicians: E.M.S. Leyten*, L.B.S. Gelinck.
HIV nurse consultants: A. van Hartingsveld, C. Meerkerk, G.S. Wildenbeest. HIV clinical virologists/chemists:
J.A.E.M. Mutsaers, C.L. Jansen. Medisch Centrum Leeuwarden, Leeuwarden: HIV treating physicians:
M.G.A.van Vonderen*, D.P.F. van Houte, L.M. Kampschreur. HIV nurse consultants: K. Dijkstra, S. Faber. HIV
clinical virologists/chemists: J Weel. Medisch Spectrum Twente, Enschede: HIV treating physicians: G.J.
Kootstra*, C.E. Delsing. HIV nurse consultants: M. van der Burg-van de Plas, H. Heins. Data collection: E. Lucas.
OLVG Amsterdam: HIV treating physicians: K. Brinkman*, G.E.L. van den Berk, W.L. Blok, P.H.J. Frissen,
K.D. Lettinga, W.E.M. Schouten, J. Veenstra. HIV nurse consultants: C.J. Brouwer, G.F. Geerders, K. Hoeksema,
M.J. Kleene, I.B. van der Meché, M. Spelbrink, H. Sulman, A.J.M. Toonen, S. Wijnands. HIV clinical virologists:
M. Damen, D. Kwa. Data collection: E. Witte. Radboudumc, Nijmegen: HIV treating physicians: P.P.
Koopmans, M. Keuter, A.J.A.M. van der Ven, H.J.M. ter Hofstede, A.S.M. Dofferhoff, R. van Crevel. HIV nurse
consultants: M. Albers, M.E.W. Bosch, K.J.T. Grintjes-Huisman, B.J. Zomer. HIV clinical virologists/chemists:
F.F. Stelma, J. Rahamat-Langendoen. HIV clinical pharmacology consultant: D. Burger. Rijnstate, Arnhem: HIV
treating physicians: C. Richter*, E.H. Gisolf, R.J. Hassing. HIV nurse consultants: G. ter Beest, P.H.M. van
Bentum, N. Langebeek. HIV clinical virologists/chemists: R. Tiemessen, C.M.A. Swanink. Spaarne Gasthuis,
Haarlem: HIV treating physicians: S.F.L. van Lelyveld*, R. Soetekouw. HIV nurse consultants: N. Hulshoff,
L.M.M. van der Prijt, J. van der Swaluw. Data collection: N. Bermon. HIV clinical virologists/chemists: W.A. van
der Reijden, R. Jansen, B.L. Herpers, D.Veenendaal. Stichting Medisch Centrum Jan van Goyen, Amsterdam:
HIV treating physicians: D.W.M. Verhagen. HIV nurse consultants: M. van Wijk. St Elisabeth Ziekenhuis,
Tilburg: HIV treating physicians: M.E.E. van Kasteren*, A.E. Brouwer. HIV nurse consultants and data
collection: B.A.F.M. de Kruijf-van de Wiel, M. Kuipers, R.M.W.J. Santegoets, B. van der Ven. HIV clinical
virologists/chemists: J.H. Marcelis, A.G.M. Buiting, P.J. Kabel. Universitair Medisch Centrum Groningen,
Groningen: HIV treating physicians: W.F.W. Bierman*, H. Scholvinck, K.R. Wilting, Y. Stienstra. HIV nurse
consultants: H. de Groot-de Jonge, P.A. van der Meulen, D.A. de Weerd, J. Ludwig-Roukema. HIV clinical
virologists/chemists: H.G.M. Niesters, A. Riezebos-Brilman, C.C. van Leer-Buter, M. Knoester. Universitair
Medisch Centrum Utrecht, Utrecht: HIV treating physicians: A.I.M. Hoepelman*, T. Mudrikova, P.M.
Ellerbroek, J.J. Oosterheert, J.E. Arends, R.E. Barth, M.W.M. Wassenberg, E.M. Schadd. HIV nurse consultants:
D.H.M. van Elst-Laurijssen, E.E.B. van Oers-Hazelzet, S. Vervoort. Data collection: M. van Berkel. HIV clinical
virologists/chemists: R. Schuurman, F. Verduyn-Lunel, A.M.J. Wensing. VU medisch centrum, Amsterdam:
HIV treating physicians: E.J.G. Peters*, M.A. van Agtmael, M. Bomers, J. de Vocht. HIV nurse consultants: M.
Heitmuller, L.M. Laan. HIV clinical virologists/chemists: A.M. Pettersson, C.M.J.E. Vandenbroucke-Grauls, C.W.
Ang. Wilhelmina Kinderziekenhuis, UMCU, Utrecht: HIV treating physicians: S.P.M. Geelen, T.F.W. Wolfs,
L.J. Bont. HIV nurse consultants: N. Nauta.
COORDINATING CENTRE
Director: P. Reiss. Data analysis: D.O. Bezemer, A.I. van Sighem, C. Smit, F.W.M.N. Wit. Data management and
quality control: S. Zaheri, M. Hillebregt, A. de Jong. Data monitoring: D. Bergsma, P. Hoekstra, A. de Lang, S.
Grivell, A. Jansen, M.J. Rademaker, M. Raethke. Data collection: L. de Groot, M. van den Akker, Y. Bakker, M.
Broekhoven, E. Claessen, A. El Berkaoui, J. Koops, E. Kruijne, C. Lodewijk, R. Meijering, L. Munjishvili, B.
Peeck, C. Ree, R. Regtop, Y. Ruijs, T. Rutkens, L. van de Sande, M. Schoorl, S. Schnörr, E. Tuijn, L. Veenenberg,
S. van der Vliet, T. Woudstra. Patient registration: B. Tuk.
Supplementary Online Material and Methods
SOM 1 Procedure to identify potential transmitters of recipient MSM
To reconstruct an evidence base of past transmission events amongst MSM in the Netherlands
between July 1996 and December 2010, we first identified MSM for whom a narrow infection
window could be defined (see Materials and Methods in the main text). Next, we considered as
potential transmitters all registered infected men that could in principle have infected a recipient.
Potential transmitters were defined as infected men in the ATHENA cohort that overlap with the
infection window of a recipient MSM. To determine whether an infected individual overlapped with an
infection window, we need to estimate when the individual in question became infected.
Equivalently, we here estimate the time from infection to diagnosis, which we denote by
$T_i^{I \to D}$ for individual $i$. This section describes how individual-level time to diagnosis
estimates were obtained. We denote the estimated time to diagnosis for individual $i$ by
$\hat{T}_i^{I \to D}$. Estimated infection times are associated with substantial uncertainty, and
sensitivity analyses were conducted for lower and upper 95% estimates. Findings did not depend
substantially on these infection time estimates (figures S16 and S21).
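The overlap check described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name, the arguments, the use of decimal years, and the optional end of the infectious period (for example death) are assumptions.

```python
import math
from typing import Optional


def overlaps_infection_window(transmitter_diagnosis: float,
                              time_to_diagnosis: float,
                              window_start: float,
                              window_end: float,
                              infectious_until: Optional[float] = None) -> bool:
    """Check whether an infected man overlaps with a recipient's infection window.

    The estimated infection date is the diagnosis date minus the estimated time
    from infection to diagnosis. All times are decimal years (an assumption).
    """
    infected_from = transmitter_diagnosis - time_to_diagnosis
    # Hypothetical end of the period at risk of transmitting, e.g. date of death.
    infected_until = math.inf if infectious_until is None else infectious_until
    return infected_from <= window_end and infected_until >= window_start
```

A man diagnosed in 2008.0 with an estimated time to diagnosis of 2.5 years would overlap with an infection window of 2006.0 to 2006.5, whereas the same man with an estimated time to diagnosis of 1 year would not. This is why the lower and upper estimates of the time to diagnosis matter for the set of potential transmitters.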
We adapted a previously published method that estimates an individual’s time to diagnosis based on
certain risk variables at time of diagnosis (41, 45). This approach proceeds in two steps. First, HIV
surveillance data from an MSM cohort of drug naïve HIV seroconverters are used to estimate the
association between the time to diagnosis since the midpoint of the seroconversion interval and risk
variables at diagnosis. This association is described with a suitable regression model. Next, the fitted
regression model is used to predict the expected time to diagnosis for all infected individuals.
Previous work found that CD4 cell count at diagnosis, age at diagnosis, infection route and ethnicity
are significantly associated with the time to diagnosis since the midpoint of the seroconversion
interval (45). Here, ethnicity was not available and infection route was always MSM, so neither
variable was considered in this analysis.
The previous method to estimate an individual’s time of infection assumes, first, that the time
between the midpoint of the seroconversion interval and diagnosis is representative of the unknown
time to diagnosis among seroconverting MSM. We denote the time to diagnosis from the midpoint
by $\tilde{T}_i^{I \to D}$ for seroconverter $i$. Second, the previous approach assumes that the
approximated time to diagnosis among seroconverting MSM is representative of the time to diagnosis
among all infected MSM. Here, we adapt this approach in order to relax both assumptions, using the
$\tilde{T}_i^{I \to D}$ as an intermediate step to obtain the final estimate $\hat{T}_i^{I \to D}$.
In the ATHENA cohort, data on 3,025 MSM with a last negative test and date of diagnosis between
2003/01-2010/12 were available to estimate the association between $\tilde{T}_i^{I \to D}$ and risk
variables at time of diagnosis. Table S4 characterizes these MSM with a last negative test.
We conducted an exploratory data analysis, shown in figure S24, which indicated that infection
status at time of diagnosis (evidence for infection within 12 months prior to diagnosis), age at
diagnosis, status of HIV infection at diagnosis and (to a lesser extent) the first CD4 count within 12
months of diagnosis are associated with $\tilde{T}_i^{I \to D}$ among drug naïve MSM with a last
negative test.
For individuals in confirmed recent infection at time of diagnosis, we set
$\hat{T}_i^{I \to D} = 1$ year.
For all other individuals, we estimated first $\tilde{T}_i^{I \to D}$ from age and first CD4 count at
time of diagnosis. Based on the exploratory data analysis shown in figure S24, we fitted the
regression model

$$
\tilde{T}_i^{I \to D} \sim \mathrm{Gamma}(\mu_i, \phi_i),
$$

$$
\begin{aligned}
\log \mu_i = \beta_0
 &+ R_i^{Nind} \left( \beta_1^{Nind} C_i^{850} A_i + \beta_2^{Nind} C_i^{250-850} A_i + \beta_3^{Nind} C_i^{250} A_i + \beta_4^{Nind} C_i^{NA} A_i \right) \\
 &+ R_i^{miss} \left( \beta_1^{miss} C_i^{850} A_i + \beta_2^{miss} C_i^{250-850} A_i + \beta_3^{miss} C_i^{250} A_i + \beta_4^{miss} C_i^{NA} A_i \right), \\
\log \phi_i = \gamma_0
 &+ R_i^{Nind} \left( \gamma_1^{Nind} C_i^{850} A_i + \gamma_2^{Nind} C_i^{250-850} A_i + \gamma_3^{Nind} C_i^{250} A_i + \gamma_4^{Nind} C_i^{NA} A_i \right) \\
 &+ R_i^{miss} \left( \gamma_1^{miss} C_i^{850} A_i + \gamma_2^{miss} C_i^{250-850} A_i + \gamma_3^{miss} C_i^{250} A_i + \gamma_4^{miss} C_i^{NA} A_i \right)
\end{aligned}
$$

among MSM with a last negative test, where

$\tilde{T}_i^{I \to D}$: time between the midpoint of the seroconversion interval and diagnosis
$\mu_i$: mean of the Gamma distribution for the $i$th seroconverter
$\phi_i$: dispersion of the Gamma distribution for the $i$th seroconverter
$\beta_k$: location parameters
$\gamma_k$: shape parameters
$R_i^{Nind}$: $\mathbf{1}$($i$th seroconverter with recent infection at diagnosis not indicated)
$R_i^{miss}$: $\mathbf{1}$($i$th seroconverter with missing infection status at diagnosis)
$C_i^{850}$: $\mathbf{1}$($i$th seroconverter with first CD4 count > 850 cells/ml within 12 months
  after diagnosis and before ART start)
$C_i^{250-850}$: $\mathbf{1}$($i$th seroconverter with first CD4 count in [250-850] cells/ml within 12
  months after diagnosis and before ART start)
$C_i^{250}$: $\mathbf{1}$($i$th seroconverter with first CD4 count < 250 cells/ml within 12 months
  after diagnosis and before ART start)
$C_i^{NA}$: $\mathbf{1}$($i$th seroconverter with no first CD4 count within 12 months after
  diagnosis and before ART start)
$A_i$: min(age at diagnosis of $i$th seroconverter, 45).
All regression coefficients were significant. Figure S25 illustrates the predictions obtained with the
fitted multivariable regression model. The fitted regression model explained 53% of the variance in
the observed $\tilde{T}_i^{I \to D}$ among MSM with a last negative test.
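To make the structure of the linear predictor concrete: for each seroconverter exactly one CD4 indicator is active, and at most one infection status indicator, so the log mean reduces to the intercept plus a single age slope applied to age capped at 45. The Python sketch below illustrates this. All names and coefficient values are hypothetical placeholders chosen for illustration; the fitted coefficients are not reported in this section.

```python
from typing import Dict, Optional, Tuple


def cd4_category(cd4: Optional[float]) -> str:
    """Map the first CD4 count to the category labels used in the model."""
    if cd4 is None:
        return "NA"
    if cd4 > 850:
        return "850"
    if cd4 < 250:
        return "250"
    return "250-850"


def log_mean_time_to_diagnosis(age: float,
                               cd4: Optional[float],
                               status: str,
                               beta0: float,
                               beta: Dict[Tuple[str, str], float]) -> float:
    """Evaluate log(mu_i) for one individual.

    status is "Nind" (recent infection not indicated), "miss" (infection
    status missing) or "recent" (both indicators zero, intercept only).
    """
    if status == "recent":
        return beta0
    capped_age = min(age, 45.0)  # A_i = min(age at diagnosis, 45)
    return beta0 + beta[(status, cd4_category(cd4))] * capped_age


# Hypothetical coefficients, for illustration only.
HYPOTHETICAL_BETA = {
    ("Nind", "850"): 0.010, ("Nind", "250-850"): 0.020,
    ("Nind", "250"): 0.030, ("Nind", "NA"): 0.020,
    ("miss", "850"): 0.005, ("miss", "250-850"): 0.015,
    ("miss", "250"): 0.025, ("miss", "NA"): 0.015,
}
```

With these placeholder values, a 50 year old with a first CD4 count of 900 and no indication of recent infection has log mean 0.1 + 0.010 x 45 when the intercept is 0.1, because age enters the model capped at 45 years. The dispersion is modelled in the same way with the gamma coefficients.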
Rice et al. (41) used the expected $\tilde{T}_i^{I \to D}$ as an estimate of the unknown time to
diagnosis. To relax the two underlying assumptions noted earlier, we instead used a particular upper
quantile $\alpha$ of the estimated probability density function of $\tilde{T}_i^{I \to D}$. Figure S26
illustrates the probability density function of $\tilde{T}_i^{I \to D}$, which was obtained from the
parameters of the fitted regression model. The upper quantile parameter $\alpha$ was estimated so that
the average $\hat{T}_i^{I \to D}$ is consistent with the average time to diagnosis derived from two
mathematical modelling studies for the period between 1996 and 1999. We chose this period in
order to validate whether the predictive model can reproduce previously estimated reductions in
average time to diagnosis in subsequent years. For this period, Bezemer et al. (46) estimated an
average time to diagnosis amongst MSM of 3.16 years (95% confidence interval 3.00-3.41 years).
van Sighem et al. (7) estimated a mean time to diagnosis of 4.34 years (3.87-5.11 years) by the end
of 1999. We chose the quantile parameters $\alpha$ = 0.109, 0.148, 0.194 in correspondence to these
estimates (figure S27-A). The fact that the chosen quantile parameters are substantially lower than
0.5 indicates that the expected time to diagnosis since the midpoint of the seroconversion interval
cannot be considered representative of the time to diagnosis among infected MSM. We used these
quantile parameters to obtain central, lower, and upper individual-level time to diagnosis estimates,
as shown in Figure S27-B.
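The step from the fitted Gamma distribution to an upper quantile can be sketched as follows. This Python example approximates the quantile by simulation with the standard library. It assumes that the dispersion relates to the variance as variance = dispersion x mean squared, so that shape = 1/dispersion and scale = mean x dispersion, and it reads "upper quantile alpha" as the value exceeded with probability alpha. Both readings are assumptions of this sketch, as are the function name and the example parameter values.

```python
import random


def upper_quantile_gamma(mu: float, phi: float, alpha: float,
                         n_draws: int = 50_000, seed: int = 1) -> float:
    """Approximate the value exceeded with probability alpha under Gamma(mu, phi).

    Assumed parameterisation: shape = 1 / phi, scale = mu * phi.
    """
    shape = 1.0 / phi
    scale = mu * phi
    rng = random.Random(seed)
    draws = sorted(rng.gammavariate(shape, scale) for _ in range(n_draws))
    # Empirical (1 - alpha) quantile of the simulated draws.
    index = min(n_draws - 1, int((1.0 - alpha) * n_draws))
    return draws[index]
```

Under this reading, smaller values of alpha give longer estimated times to diagnosis, and alpha = 0.5 returns the median. This is consistent with the statement that quantile parameters well below 0.5 lead to larger estimates than the expected time from the midpoint of the seroconversion interval.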
Overall, the individual-level predictive model is able to reproduce previously estimated reductions in
average time to diagnosis without the addition of time-dependent variables (black lines in Figure
S27-B) (7, 46). The linear drop in time to diagnosis after 2005 may in part be explained by right
censoring in the cohort: as the study endpoint was 2010/12 and the maximum estimated time to
diagnosis is around 7 years, we expect that an increasing fraction of men infected since 2004-2005 is
not yet diagnosed. In comparison to Rice et al. (41), our approach results in larger estimates of time
to diagnosis. If the 50% quantile had been used to estimate times to diagnosis, the average time to
diagnosis for MSM infected in the period 1996-1999 would have been slightly less than 2 years
(figure S27-A).
SOM 2 Procedure to declare potential transmission pairs phylogenetically implausible
HIV sequences can prove neither epidemiological linkage nor the direction of HIV infection (11, 12).
However, viral sequences can be used to exclude potential transmission events between individuals
whose viral sequences are phylogenetically unrelated. There is currently no broad consensus
on viral phylogenetic exclusion criteria (8).
To guide the viral phylogenetic exclusion criteria adopted in this study, we conducted an
evolutionary analysis of sequences from transmitters and recipients in confirmed transmission pairs.
This analysis is described in figure S5. In addition, we considered 4,117 pairs of sequences from the
same Dutch patient and 201,605 pairs of Dutch patients in which one patient died before the last
negative antibody test of the other patient. These analyses are described below, and were used to
develop exclusion criteria with high specificity (i.e. small type-I error of falsely excluding true
transmission pairs). We chose central exclusion criteria for the main transmission analysis and varied
lower and upper criteria over the identified range. Sensitivity analyses demonstrate that these
criteria did not substantially affect the reported transmission and prevention analyses.
Figure S5 shows the genetic distance between sequences from confirmed pairs in the Belgian
transmission chain as a function of time elapsed. It is clear that the genetic distance between
sequences from confirmed pairs can exceed typical phylogenetic clustering thresholds, provided the
time elapsed is sufficiently large. This analysis indicates that genetic distances of not more than 2%
between partial HIV pol sequences from true transmission pairs are only expected when the total
time elapsed is small. This is typically the case when individuals are frequently followed up as in a
controlled, randomized trial (47). Among the phylogenetically probable transmission pairs in this
study, the maximum time elapsed was 10.87 years. Considering figure S28, the corresponding upper
97.5% quantile of the genetic distances between sequences from true transmission pairs is ~ 7%. To
validate the analysis in figure S5, we estimated the genetic distance between sequences from
confirmed transmission pairs in the Swedish transmission chain in the same manner. Figure S28
shows that these genetic distances fall into the 80% probability range estimated from the Belgian
transmission pairs. This argues against tight genetic distance thresholds to declare transmission pairs
phylogenetically implausible in this study.
To exclude potential transmission pairs, we used the following two criteria:
- Bootstrap clade support. If the potential transmitter did not occur in the same clade as the
  recipient MSM in sufficiently many bootstrap phylogenetic trees, the pair was excluded.
  Such bootstrap criteria are frequently used (8).
- Phylogenetic incompatibility with direct transmission. We found that within phylogenetic
  clades with high bootstrap support, branches between the remaining potential transmitters
  and the recipient MSM were often relatively long (figure S3). With approximately half of all
  potential transmitters sampled, one explanation is that the actual transmitter did not have a
  sequence sampled or was not diagnosed by March 2013. Unobserved intermediate
  transmitters were detected with a coalescent compatibility test that was recently introduced
  by Vrancken et al. (13). The idea behind this test is that viral lineages of a true transmission
  pair must coalesce at a time when the transmitter was already infected. The test assumes that
  transmitters are infected with a single virus. The test calculates the probability that the viral
  lineages from the potential transmitter and the recipient coalesce after the transmitter was
  infected and before the recipient was diagnosed. The test excludes the potential pair if this
  coalescent compatibility probability is below a certain threshold. To apply this test, we dated
  coalescent events within phylogenetic clusters. Specifically, the sampled ancestor birth-death
  model was used in order to allow for the possibility that transmission might have occurred
  after the time of sequence sampling. To accommodate temporal variation in model
  parameters, we implemented a skyline version of the sampled ancestor birth-death model
  following previous work (48).
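The coalescent compatibility probability described above can be illustrated with a short Python sketch. It assumes that a set of sampled coalescent times for a pair is available from the dated phylogenies, and it estimates the probability as the fraction of samples that fall between the transmitter's infection and the recipient's diagnosis. The function name, the inputs, and this Monte Carlo estimate are assumptions made for illustration, not a description of the test by Vrancken et al. (13).

```python
from typing import Sequence


def coalescent_compatibility(coalescent_times: Sequence[float],
                             transmitter_infection: float,
                             recipient_diagnosis: float) -> float:
    """Fraction of sampled coalescent times compatible with direct transmission.

    A sampled coalescent time is counted as compatible if it lies after the
    transmitter was infected and before the recipient was diagnosed.
    """
    if not coalescent_times:
        raise ValueError("at least one sampled coalescent time is required")
    compatible = sum(1 for t in coalescent_times
                     if transmitter_infection <= t <= recipient_diagnosis)
    return compatible / len(coalescent_times)
```

For instance, with sampled coalescent times of 2004.0, 2005.0, 2006.0 and 2007.0, a transmitter infected in 2004.5 and a recipient diagnosed in 2006.5, the compatibility probability is 0.5, and the pair would be retained for any exclusion threshold at or below 50%.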
We then sought to determine thresholds so that potential transmitters are excluded with high
specificity (a large proportion of true transmitters to recipients is not excluded). Typically, viral
phylogenetic studies aim to identify transmission chains (23). This leads to relatively strict
thresholds. Here, we aim to exclude pairs of individuals that did not infect each other. This different
objective leads to relatively large thresholds.
For the clade frequency criterion, the type-I error is the probability that sequence pairs of a true
transmission pair do not co-cluster. As a proxy, we calculated the probability that sequence pairs
from the same individual do not co-cluster. Figure S29 shows this probability as a function of the
clade frequency threshold. The approximate type-I error is more than 10% for clade frequency
thresholds above 85%. To limit this error, we settled on 80% as the central clade frequency
threshold, and considered 70% and 85% as the upper and lower thresholds respectively.
To determine the threshold of the coalescent compatibility test below which potential transmission
pairs are excluded, we proceeded as for the phylogenetic clustering test.
We approximated the type-I error with the probability that co-clustering sequence pairs from the
same individual were excluded by the coalescent compatibility test. Figure S30-A shows this
probability as a function of the coalescent compatibility threshold. The approximate type-I error is
around 5% for thresholds in the range of 10% to 30%. We chose 20% as the central threshold and
considered 10% and 30% in sensitivity analyses.
We also evaluated the power of the test in excluding co-clustering female-female pairs. All
female-female pairs were considered as incorrect transmission pairs (49). Figure S30-B shows that the
coalescent compatibility test excludes more than half of all co-clustering female-female pairs if the
compatibility threshold is at least 10%.
To summarize, we adopted the following exclusion criteria:

Exclusion criteria | Clade frequency in bootstrap viral phylogenies | Coalescent compatibility with direct transmission
Central | 80% | 20%
Lower-I | 80% | 30%
Lower-II | 85% | 20%
Upper-I | 80% | 10%
Upper-II | 70% | 20%
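The five sets of exclusion criteria can be applied to a potential transmission pair as in the Python sketch below. The dictionary and function names are illustrative, and treating values exactly equal to a threshold as not excluded is an assumption, since the text only states that pairs below the thresholds are excluded.

```python
# (minimum clade frequency, minimum coalescent compatibility) per criteria set
EXCLUSION_CRITERIA = {
    "Central": (0.80, 0.20),
    "Lower-I": (0.80, 0.30),
    "Lower-II": (0.85, 0.20),
    "Upper-I": (0.80, 0.10),
    "Upper-II": (0.70, 0.20),
}


def is_phylogenetically_probable(clade_frequency: float,
                                 compatibility: float,
                                 criteria: str = "Central") -> bool:
    """Return True if a potential pair is not excluded under the chosen criteria.

    clade_frequency: fraction of bootstrap trees in which the pair co-clusters.
    compatibility: coalescent compatibility probability with direct transmission.
    """
    min_clade, min_compatibility = EXCLUSION_CRITERIA[criteria]
    return clade_frequency >= min_clade and compatibility >= min_compatibility
```

A pair with clade frequency 82% and compatibility 25% would be retained under the central criteria but excluded under Lower-I and Lower-II, which shows how the sensitivity analyses vary one threshold at a time.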
Viral phylogenetic analyses were remarkably successful in excluding potential transmission events.
Across the above exclusion criteria, between 99.94% and 99.96% of all potential transmission pairs
with sequences available for both individuals could be excluded. Table S3 characterizes the
phylogenetically probable transmitters. The difference between using a 7% genetic distance threshold
and no threshold at all was minimal: only 3 more recipients would have been excluded. Sensitivity
analyses demonstrate that the findings reported in this study did not vary substantially across these
exclusion criteria and additional genetic distance criteria (figures S14-S22).
SOM 3 Procedure to quantify censoring bias
The observed, probable transmission intervals reported in figure 2 are subject to two main sources of
bias. Below, we describe the technical bootstrap procedure to quantify the extent of censoring bias.
The idea behind this procedure is described in the Materials and Methods of the main text and figure
S6.
Bootstrap techniques proceed by constructing sub-samples from observed data to estimate properties
of the observed data that is sampled from the population (50). Here, we implemented a bootstrap
technique that sub-censors the observed data to estimate the extent of censoring of the observed data.
Censoring describes the proportion of infected individuals that have not yet been registered in the
ATHENA cohort, irrespective of whether a sequence was sampled or not. To quantify censoring, we
considered all potential transmitters (stage A in figure 1) and their "overlap" intervals, during which
the potential transmitters overlapped with infection windows of recipients. The probable transmitters
and their transmission intervals do not enter the calculations below. We adopt the following notation:
$t_E$: end of the observation period
$t_C$: censoring time of potential transmitters
$[t_1, t_2]$: observation period of recipients
$t_C^* = t_C - \delta$: bootstrap censoring time, where $\delta > 0$
$[t_1^*, t_2^*]$: bootstrap observation period, where $t_1^* = t_1 - \delta$ and $t_2^* = t_2 - \delta$.
Here, we set $t_E$ = 2013/03, the time of database closure; $t_C$ = 2010/12, the end of the study
period; and $[t_1, t_2]$ to one of the six time intervals 1996/07-2006/06, 2006/07-2007/12,
2008/01-2009/06, 2009/07-2009/12, 2010/01-2010/06, 2010/07-2010/12. The fourth period in table 4 was
split into three intervals because of the rapidly increasing impact of censoring towards the present.
For a bootstrap censoring time $t_C^*$, we can calculate the proportion of non-censored intervals in
infection/care stage $x$ to recipients that are diagnosed during the bootstrap observation period
$[t_1^*, t_2^*]$,

$$
c_E(x, t_1^*, t_2^*, t_C^*) =
\frac{\sum_{j \in R(t_1^*, t_2^*)} \sum_{\tau \in V_j(x)} \mathbf{1}\{ \tau \text{ observed before } t_C^* \}}
     {\sum_{j \in R(t_1^*, t_2^*)} \sum_{\tau \in V_j(x)} \mathbf{1}\{ \tau \text{ observed before } t_E \}},
$$

where

$R(t_1^*, t_2^*)$: set of recipient MSM diagnosed in the period $[t_1^*, t_2^*]$,
$V_j(x)$: set of overlap intervals to recipient $j$ that are in stage $x$.
If the corresponding potential transmitter is not diagnosed before $t_C^*$, then
$\mathbf{1}\{ \tau \text{ observed before } t_C^* \}$ equals zero and otherwise one. This is
illustrated in figure S6, where $t_1^*$ = 2006/06, $t_2^*$ = 2007/12, $t_E$ = 2013/03 and $t_C^*$
could be any time between 2008/01 and 2013/03.
We aim to estimate, for the actual censoring time $t_C$, the proportion of non-censored overlap
intervals in stage $x$ to recipients that are diagnosed during the period $[t_1, t_2]$. This can be
written as

$$
c_\infty(x, t_1, t_2, t_C) =
\frac{\sum_{j \in R(t_1, t_2)} \sum_{\tau \in V_j(x)} \mathbf{1}\{ \tau \text{ observed before } t_C \}}
     {\sum_{j \in R(t_1, t_2)} \sum_{\tau \in V_j(x)} \mathbf{1}\{ \tau \text{ observed before } \infty \}}.
$$
We need to assume that the censoring process has not changed within the last $\Delta_{max}$ years
from $t_2$. In this case,

$$
c_\infty(x, t_1, t_2, t_C) = c_\infty(x, t_1 - \delta, t_2 - \delta, t_C - \delta)
$$

for all $\delta < \Delta_{max}$. We need to assume further that all overlap intervals have been
observed by $t_E$. This is only the case when the bootstrap observation period lies sufficiently far
back in time, that is $\delta > \Delta_{min}$. In this case,

$$
c_\infty(x, t_1 - \delta, t_2 - \delta, t_C - \delta) = c_E(x, t_1 - \delta, t_2 - \delta, t_C - \delta)
$$

for all $\delta > \Delta_{min}$. Under these assumptions on $\delta$, the following bootstrap
algorithm provides an estimate of the proportion of overlap intervals that are not censored,
$c_\infty(x, t_1, t_2, t_C)$.
Bootstrap algorithm
Let $B$ be the number of bootstrap iterations.
For $b = 1, \dots, B$ do
1. Draw $\delta_b$ from a uniform distribution with minimum $\Delta_{min}$ and maximum $\Delta_{max}$.
2. Compute $\hat{c}_b = c_E(x, t_1 - \delta_b, t_2 - \delta_b, t_C - \delta_b)$.
Estimate $c_\infty(x, t_1, t_2, t_C)$ with the bootstrap average $\hat{c} = \frac{1}{B} \sum_{b=1}^{B} \hat{c}_b$.
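The bootstrap algorithm can be sketched in Python as follows. The record layout, the function names, and the choice to skip bootstrap replicates in which no overlap interval falls into the shifted observation period are assumptions of this sketch. We also read the final step as an average over the bootstrap replicates, since the estimate is a proportion. Times are decimal years.

```python
import random
from typing import NamedTuple, Sequence


class OverlapInterval(NamedTuple):
    recipient_diagnosis: float    # diagnosis date of the recipient
    transmitter_diagnosis: float  # date at which the interval becomes observed
    stage: str                    # infection/care stage x of the interval


def c_e(intervals: Sequence[OverlapInterval], stage: str,
        t1: float, t2: float, t_c: float, t_e: float) -> float:
    """Proportion of intervals observed before t_c among those observed before t_e."""
    selected = [iv for iv in intervals
                if iv.stage == stage and t1 <= iv.recipient_diagnosis <= t2]
    observed_by_end = sum(iv.transmitter_diagnosis < t_e for iv in selected)
    if observed_by_end == 0:
        return float("nan")
    observed_by_censoring = sum(iv.transmitter_diagnosis < t_c for iv in selected)
    return observed_by_censoring / observed_by_end


def bootstrap_censoring(intervals: Sequence[OverlapInterval], stage: str,
                        t1: float, t2: float, t_c: float, t_e: float,
                        delta_min: float = 2.0, delta_max: float = 3.0,
                        n_boot: int = 1000, seed: int = 1) -> float:
    """Average of c_E over observation periods shifted back by random delta."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        delta = rng.uniform(delta_min, delta_max)
        c_b = c_e(intervals, stage, t1 - delta, t2 - delta, t_c - delta, t_e)
        if c_b == c_b:  # skip replicates without any observed interval (NaN)
            estimates.append(c_b)
    if not estimates:
        raise ValueError("no bootstrap replicate contained observed intervals")
    return sum(estimates) / len(estimates)
```

The same shift is applied to the start and end of the observation period and to the censoring time in each replicate, so that the length of the observation period and its distance to the censoring time are preserved.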
We chose $\Delta_{min}$ and $\Delta_{max}$ as follows. Mathematical modelling indicates that the
average time to diagnosis amongst MSM in the Netherlands is ~ 2-3 years in recent years (7, 46). For
some individuals, time to diagnosis may be substantially longer and we allowed for up to 4 years. In
this case, any $\delta$ such that $t_2 - \delta \le$ 2009/03, i.e. four years before database closure
at $t_E$ = 2013/03, should be sufficiently large. Since the most recent $t_2$ is 2010/12, we have
$\delta \ge \Delta_{min}$ = 2 years (21 months, rounded up). Further, we assumed that $\Delta_{max}$
can be set to 3 years.
Figure S31 shows that estimated censoring bias is extensive: for recipients diagnosed between
2010/07 and 2010/12, only an estimated 20% of overlap intervals from potential transmitters estimated
to be in chronic infection are observed. As expected from figure S6, the estimated censoring bias was
substantially smaller for overlap intervals of potential transmitters in recent infection at time of
diagnosis.
SOM 4 Modelling counterfactual prevention scenarios
We formulated prevention models that moved probable transmitters to less infectious infection/care
stages, thereby changing the overall probability that any of the recipient MSM would have been
infected to less than one. This section describes these individual-level prevention models and how
they were parameterized.
SOM 4.1 Improved testing with conventional assays
Counterfactual testing scenarios re-allocated undiagnosed men to less infectious infection/care stages
between diagnosis and ART start. The individual-level testing for prevention model has three
parameters
𝜃1𝑇𝑒𝑠𝑡: duration between consecutive HIV tests in months
𝜃2𝑇𝑒𝑠𝑡: additional fraction of probable transmitters that are tested with frequency 𝜃1𝑇𝑒𝑠𝑡
𝜃3𝑇𝑒𝑠𝑡: window period of HIV testing assay,
and proceeds as follows to simulate a counterfactual scenario.
A fraction 𝜃2𝑇𝑒𝑠𝑡 of randomly chosen, undiagnosed probable transmitters are tested in 𝜃1𝑇𝑒𝑠𝑡 intervals.
The first test date was randomly allocated so that the average first test was in mid-2008. We assumed
that the window period 𝜃3𝑇𝑒𝑠𝑡 of conventional assays is exactly 1 month (51). Before this window
period, all tests were assumed to be negative. After this window period, all tests were assumed to
correctly identify HIV status. After a counterfactual positive test, probable transmission intervals
before diagnosis were randomly re-allocated to one of the stages between diagnosis and ART start.
The re-allocation stage was drawn in proportion to the adjusted number of probable transmitters in
that stage. Each re-allocated probable transmission interval was associated with a randomly chosen
transmission probability from the new stage. Thus, the testing for prevention model changes the
probability of secondary infections from undiagnosed men to a lower probability of secondary
infections from diagnosed, untreated men.
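The timing rule of the testing model can be illustrated with a short Python sketch. It only shows how the first counterfactual positive test follows from the test schedule and the window period; the re-allocation of transmission intervals to new stages is not shown. All times are in months on an arbitrary scale, a test taken exactly at the end of the window period is assumed to be positive, and the function name `first_positive_test` is hypothetical.

```python
import math


def first_positive_test(infection_time, first_test, interval=12.0, window=1.0):
    """Time of the first scheduled test that returns a positive result.

    Tests occur at first_test, first_test + interval, ... (months).
    Tests taken less than `window` months after infection are negative.
    """
    detectable_from = infection_time + window
    if first_test >= detectable_from:
        return first_test
    # Number of further test intervals needed to pass the window period.
    k = math.ceil((detectable_from - first_test) / interval)
    return first_test + k * interval


# Infected at month 0, first test two weeks later, annual testing:
print(first_positive_test(0.0, 0.5))              # conventional assay, 1-month window
print(first_positive_test(0.0, 0.5, window=0.0))  # assay with no window period
```

In the first call the early test falls inside the window period and is negative, so the counterfactual diagnosis is delayed by a full testing interval; with a zero window period the same test is positive.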
To parameterize this model, we reviewed testing behaviour amongst uninfected MSM in the
Netherlands, recipient MSM, and probable transmitters to the recipient MSM in this study. The
duration between consecutive tests, 𝜃1𝑇𝑒𝑠𝑡 , was set to 12 months throughout. 38% of uninfected MSM
in the Netherlands reported testing annually in EMIS 2010, the European Men-Who-Have-Sex-With-Men Internet Survey (52). Amongst MSM diagnosed between 2009/07-2010/12, 26.8% had a
last negative test within 12 months prior to diagnosis. Amongst probable transmitters to recipient
MSM diagnosed between 2009/07-2010/12, 17.3% had a last negative test within 12 months prior to
diagnosis. Figure S32 shows that this low proportion of probable transmitters with a last negative test
was not sensitive to the choice of infection time estimates or phylogenetic exclusion criteria. Figure
4 reports estimates of the proportion of transmissions that could have been averted by overall testing
coverage $\gamma^{Test}_{target}$. Given a proportion $\gamma^{Test}_{current}$ of probable transmitters that are already testing
annually, we determined $\theta^{Test}_2$ through the relationship
$$\gamma^{Test}_{target} = \gamma^{Test}_{current} + (1 - \gamma^{Test}_{current})\,\theta^{Test}_2 .$$
Based on figure S32, we set $\gamma^{Test}_{current} = 0.17$.
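Solving the coverage relationship for the additional tested fraction gives 𝜃2𝑇𝑒𝑠𝑡 = (𝛾𝑡𝑎𝑟𝑔𝑒𝑡 − 𝛾𝑐𝑢𝑟𝑟𝑒𝑛𝑡) / (1 − 𝛾𝑐𝑢𝑟𝑟𝑒𝑛𝑡). The small Python sketch below evaluates this for a few target coverages; the function name `additional_test_fraction` is hypothetical, and the target values are chosen only for illustration.

```python
def additional_test_fraction(gamma_target, gamma_current=0.17):
    """Fraction of not-yet-testing probable transmitters that must start
    annual testing to reach the overall coverage gamma_target."""
    if not 0.0 <= gamma_current < 1.0:
        raise ValueError("gamma_current must lie in [0, 1)")
    if not gamma_current <= gamma_target <= 1.0:
        raise ValueError("gamma_target must lie in [gamma_current, 1]")
    return (gamma_target - gamma_current) / (1.0 - gamma_current)


for target in (0.30, 0.50, 0.70):
    print(target, round(additional_test_fraction(target), 3))
```

For example, reaching 50% overall annual testing coverage from a current coverage of 17% requires that about 40% of the probable transmitters who do not yet test annually start doing so.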
SOM 4.2 Improved testing with specialized assays that detect early
infection before the presence of HIV antibodies
Counterfactual testing scenarios with specialized assays that can detect early infection before the
presence of HIV antibodies were simulated in the same way as the scenarios based on conventional assays, except
that the window period was set to zero (51).
SOM 4.3 Antiretroviral pre-exposure prophylaxis
Counterfactual PrEP scenarios prevented randomly chosen, uninfected men from becoming infected.
The individual-level PrEP prevention model has two parameters,
𝜃1𝑃𝑟𝐸𝑃: fraction of individuals that take PrEP
𝜃2𝑃𝑟𝐸𝑃: probability that an individual taking PrEP is not infected,
and proceeds as follows to simulate a counterfactual scenario.
A fraction 𝜃1𝑃𝑟𝐸𝑃 of recipients that test negative is randomly chosen to take PrEP by mid 2008. The
intervention was assumed to be efficacious on a randomly chosen fraction 𝜃2𝑃𝑟𝐸𝑃 of those. This
fraction was removed from the newly infected recipients (infection probability set from 1 to 0). In
addition, a fraction 𝜃1𝑃𝑟𝐸𝑃 of probable transmitters was randomly chosen to take PrEP since they first
tested negative. Test dates were simulated as in SOM 4.1. The intervention was also assumed to be
efficacious on a randomly chosen fraction 𝜃2𝑃𝑟𝐸𝑃 of those. This fraction was removed from the
infected probable transmitters (lowering infection probabilities of the corresponding recipients to
below 1). The PrEP prevention model averts secondary infections amongst recipients as well as
primary infections of probable transmitters that were uninfected at time of testing.
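The recipient part of the PrEP model described above can be sketched as follows. This is an illustrative simplification rather than the original implementation: it covers only the random selection of recipients, ignores test dates and the corresponding step for probable transmitters, and rounds fractions to whole individuals. The function name `apply_prep_to_recipients` and the toy cohort are hypothetical.

```python
import random


def apply_prep_to_recipients(infection_prob, theta1, theta2, rng):
    """Return updated infection probabilities after a counterfactual PrEP roll-out.

    infection_prob: dict mapping recipient id -> infection probability (1.0 as observed)
    theta1: fraction of recipients that take PrEP
    theta2: fraction of those on PrEP for whom the intervention is efficacious
    """
    ids = sorted(infection_prob)
    n_prep = round(theta1 * len(ids))
    on_prep = rng.sample(ids, n_prep)            # randomly chosen to take PrEP
    n_protected = round(theta2 * n_prep)
    protected = set(rng.sample(on_prep, n_protected))
    # Protected recipients have their infection probability set from 1 to 0.
    return {i: (0.0 if i in protected else p) for i, p in infection_prob.items()}


cohort = {i: 1.0 for i in range(100)}            # toy cohort of 100 recipients
after = apply_prep_to_recipients(cohort, theta1=0.5, theta2=0.86,
                                 rng=random.Random(7))
print(sum(after.values()))                       # expected infections remaining
```

In a full simulation 𝜃2𝑃𝑟𝐸𝑃 would itself be redrawn for every counterfactual scenario, so that uncertainty in PrEP efficacy propagates to the estimated proportion of averted infections.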
We parameterized 𝜃2𝑃𝑟𝐸𝑃 based on findings from the iPrEX, PROUD and ANRS Ipergay studies (18,
19, 37). The iPrEX trial demonstrated an overall reduction in HIV incidence of 44% (95%
confidence interval 15-63%) of daily oral tenofovir-based PrEP amongst MSM from diverse settings
(18). The PROUD study demonstrated a reduction in HIV incidence of 86% (58%-96%) of daily oral
single-pill PrEP amongst predominantly white, high risk MSM recruited from sexual health clinics in
the United Kingdom (19). Reports from the ANRS Ipergay study indicate a reduction in HIV
incidence of 86% (40%-99%) amongst MSM in France and Canada who follow an on demand
dosing scheme 2-24 hours before sex (37). Reflecting the more recent PROUD and Ipergay trials,
𝜃2𝑃𝑟𝐸𝑃 was drawn once for each simulated counterfactual scenario from a Beta distribution with mean
of 86% and 95% interquantile range 40%-99%. Uncertainty in this parameter is the main reason why
confidence intervals associated with prevention strategies that include PrEP are larger than those
without in figure 4. For the sensitivity analysis reported in figure S12, reflecting the iPrEX trial,
𝜃2𝑃𝑟𝐸𝑃 was drawn once for each simulated counterfactual scenario from a Beta distribution with mean
of 44% and 95% interquantile range 20%-70%.
SOM 4.4 Treatment as prevention
Counterfactual treatment as prevention (TasP) scenarios re-allocated diagnosed, untreated men to
less infectious infection/care stages after ART start. The individual-level TasP prevention model has
one parameter
𝜃1𝑇𝑎𝑠𝑃: time to first viral suppression,
and proceeds as follows.
In case of immediate ART provision, all diagnosed but untreated probable transmitters started ART.
Corresponding probable transmission intervals were randomly re-allocated to stages after ART start,
with the exception of the intervals between diagnosis and time to first viral suppression 𝜃1𝑇𝑎𝑠𝑃 . These
intervals were always re-allocated to be ‘before first viral suppression’. Each re-allocated probable
transmission interval was associated with a randomly chosen transmission probability from the new
stage. Thus, the TasP prevention model changes the probability of secondary infections from
diagnosed, untreated men to a lower probability of secondary infections from treated men.
In case of ART provision when CD4 counts fall below 500 cells/µl, only the probable transmission
intervals after diagnosis with CD4 progression to below 500 cells/µl were randomly re-allocated.
To parameterize this model, available Kaplan-Meier estimates of the percentage of patients with
initial suppression to below 100 copies/ml were used (7). An estimated 50% of all patients diagnosed
between 2007/01-2010/12 reached first viral suppression in 3.6 months, and 𝜃1𝑇𝑎𝑠𝑃 was set to this
value.
SOM 4.5 Combinations
Counterfactual combination prevention scenarios were evaluated through combination of the single
intervention models. To evaluate test-and-treat prevention interventions, we first applied the testing
for prevention model, followed by the treatment as prevention model. The PrEP prevention model
was always linked to an HIV testing component. To evaluate PrEP in combination with test-and-treat
interventions, we first applied the PrEP+test prevention model, followed by the treatment as
prevention model.
[Figure S1 plot. Vertical axis: infection status at diagnosis (%). Horizontal axis: calendar year, 1997 to 2011. Legend: confirmed recent HIV infection; recent HIV infection not indicated; missing.]
Figure S 1 Number of identified recipient MSM by 3-month intervals. MSM were confirmed to be in recent HIV infection at time
of diagnosis if one of the following were reported: a last negative HIV-1 antibody test in the 12 months preceding diagnosis, an
indeterminate HIV-1 western blot, or clinical diagnosis of acute infection. MSM with confirmed recent infection were considered as
recipient in the viral phylogenetic transmission and prevention study. To evaluate trends over time, recipient MSM were stratified into
four time periods as illustrated by the four blocks in the figure.
[Figure S2 plot. Horizontal axis: duration of putative infection window (months). Panels by time of diagnosis of recipient MSM: 96/07-06/06, 06/07-07/12, 08/01-09/06, 09/07-10/12.]
Figure S 2 Duration of infection windows of recipient MSM. Infection windows were at most 12 months long, reflecting the
definition of recency of HIV infection. Where available, last negative HIV antibody tests were used to shorten infection windows. We
assumed that the window period of HIV antibody tests is approximately 4 weeks, so that the last negative test had to be within 11
months preceding diagnosis in order to reduce the duration of the infection window.
Figure S 3 Snapshot of the reconstructed viral phylogeny. Dutch sequences were enriched with subtype B sequences from the Los
Alamos HIV sequence database because multiple subtype B lineages were likely imported into the Netherlands (7). Sequences were
aligned with ClustalX v2.1 (http://www.clustal.org/clustal2/) using default parameters, and the alignment was manually curated.
Primary drug resistance mutations listed in the IAS-USA March 2013 update were masked in each sequence. The viral phylogeny of
the enriched ATHENA sequences was reconstructed under the GTR nucleotide substitution model with the ExaML maximum-likelihood method (42). Each clade in the viral phylogeny was annotated with the frequency with which it occurred among all
bootstrap trees. Sequences from the Los Alamos sequence database are shown in grey. Sequences from men in recent infection at
diagnosis are shown in dark red. Sequences from men for whom recent infection at diagnosis was not indicated are shown in orange.
Sequences from men with unknown infection status at diagnosis are shown in yellow. Sub-clades that occurred in 400 out of 500
bootstrap trees are shown with thicker branches. Estimated branch lengths are in units of substitutions per site.
[Figure S4 plot. Vertical axis: patristic distance (subst/site), from 0.00 to 0.08. Horizontal axis: sequence pairs, with the first sequence from the recipient MSM and the second sequence from the potential transmitter.]
Figure S 4 Uncertainty in the estimated genetic distance between sequences from the transmitter and recipient of potential
transmission pairs. For illustration purposes, 100 pairs with a median genetic distance below 2% were selected. Genetic distances
(sum of the average number of nucleotide substitutions per site) were calculated from the reconstructed viral phylogeny on the
sequence alignment (red dot), and from reconstructed viral phylogenies on bootstrap sequence alignments (boxplot, bar: median, box:
interquartile range; whiskers: 95% quantile range). The genetic distance calculated on the tree with overall highest likelihood is shown
as a blue dot. Uncertainty in genetic distances was accounted for in transmission analyses through bootstrap resampling.
Figure S 5 Genetic distance between sequence pairs from previously published, epidemiologically confirmed transmitter-recipient pairs, and sequence pairs from the phylogenetically probable transmission pairs in this study. (A) Aligned sequences
from the Belgium transmission chain were obtained from the authors (13). Drug-resistance sites were masked in each sequence.
Patient A developed multi-drug resistance (13), and sequences from patient A were not considered. The viral phylogeny among all
sequences was constructed with the maximum-likelihood methods (42) under the GTR nucleotide substitution model. Genetic
distances between pairs of sequences from the confirmed transmitter and the confirmed recipient were calculated from the
reconstructed viral phylogeny. Infection windows were determined through in-depth patient interviews, and made available by the
authors (13). The time elapsed between sequences from a transmission pair was calculated as the time from the midpoint of the
established infection window of the recipient to the sampling date of the transmitter, plus the time from the midpoint to the sampling
date of the recipient. Genetic distances between confirmed pairs were strongly correlated with the time elapsed (Spearman correlation
𝜌 = 0.84, n = 2,807). We fitted the probabilistic molecular clock model
$$d_i \sim \mathrm{Gamma}(\mu_i, \phi_i), \qquad \mu_i = \beta t, \qquad \phi_i = \gamma_0 + \gamma_1 t,$$
where
$d_i$: genetic distance between sequence pair $i$
$\mu_i$: mean of the Gamma distribution for the $i$th pair
$\phi_i$: dispersion of the Gamma distribution for the $i$th pair
$\beta$: evolutionary rate
$\gamma_k$: dispersion parameters
with regression techniques. The estimated model parameters were
$\hat{\beta} = 0.00416$, $\hat{\gamma}_0 = 1.008$, $\hat{\gamma}_1 = -0.0523$.
The fitted model explained 28% of the variance in the genetic distances between sequences from confirmed transmission pairs.
Quantile ranges of the probabilistic molecular clock model are shown in red. (B) The fitted model was then applied to the 2,343
phylogenetically probable transmission pairs in this study to express the relative probability that a phylogenetically identified
transmitter was the actual transmitter to a recipient. To reflect uncertainty in the genetic distance between probable transmission pairs,
calculations were repeated on genetic distance values sampled from the distributions shown in figure S4. The time elapsed between
sequences from phylogenetically probable pairs was calculated as the time from the midpoint of the infection window of the recipient
to the sampling date of the transmitter, plus the time from the midpoint to the sampling date of the recipient. Transmission
probabilities clearly varied between probable transmitters.
Figure S 6 Right censoring at past, hypothetical database closure times. (A) Distribution of time of diagnosis of potential
transmitters to recipients that are diagnosed between 𝑡1∗ = 2006/06 and 𝑡2∗ = 2007/12. (Left) Histogram of the time of diagnosis of
potential transmitters with confirmed recent infection at diagnosis. Infection windows of the recipients start the earliest in June 2005,
and so do the putative transmission intervals between potential transmitters and their recipients ("overlap intervals"). Therefore, all
potential transmitters with an overlap interval before diagnosis must be diagnosed after June 2005. This explains the abrupt start of the
histogram after June 2005. (Right) Histogram of the time of diagnosis of potential transmitters estimated to be in chronic infection.
Potential transmitters in undiagnosed, chronic infection at the putative transmission time may be diagnosed several years after their
recipient. (B) Estimated proportion of censored overlap intervals at hypothetical database closure times after 𝑡2∗ = 2007/12.
Considering a hypothetical closure time, say 𝑡𝐶∗ = 2008/12, we considered potential transmitters with date of diagnosis after 𝑡𝐶∗ .
Next, we counted the overlap intervals of the hypothetically censored potential transmitters in each stage. Then we determined the
proportion of these intervals among all intervals by stage. This proportion is plotted against hypothetical closure times, and quantifies
the proportion of intervals that would have been censored, had the database been closed at the hypothetical closure time. A bootstrap
algorithm described in the supplementary online material was used to extrapolate these estimates to the actual database closure time.
[Figure S7 plot. Horizontal axis: overlap intervals of a potential transmitter with a sequence (%), from 0 to 100. Vertical axis: stage in the infection and care continuum (Undiagnosed, confirmed recent infection at diagnosis; Undiagnosed, unconfirmed recent infection; Undiagnosed, unconfirmed chronic infection; Diagnosed < 3mo, recent infection at diagnosis; Diagnosed, CD4 progression to >500; Diagnosed, CD4 progression to [350-500]; Diagnosed, CD4 progression to <350; Diagnosed, no CD4 measured; ART initiated, before first viral suppression; ART initiated, after first viral suppression, no viral load measured; ART initiated, after first viral suppression, no viral suppression; ART initiated, after first viral suppression, viral suppression, 1 observation; ART initiated, after first viral suppression, viral suppression, >1 observations; Not in contact). Legend: time of diagnosis of recipient MSM, 96/07-06/06, 06/07-07/12, 08/01-09/06, 09/07-10/12.]
Figure S 7 Sequence sampling probabilities by stage in the infection and care continuum. To characterize sequence coverage by
stage in the infection/care continuum, we considered potential transmitters with and without a sequence, and their "overlap" intervals
during which they overlapped with infection windows of recipients. Then, the proportion of overlap intervals whose potential
transmitter had a viral sequence sampled was calculated, and plotted by stage and time of diagnosis of the recipient. Colour codes are
as in figure 2 in the main text. Typically, sampling probabilities increased with calendar time. Reflecting preferential sequencing for
drug resistance testing, intervals with viral load measurements below 100 copies/ml were sampled least frequently, while those above
100 copies/ml were sampled twice as often. Intervals with a lower CD4 count were more likely to be sampled than those with a higher
CD4 count. Intervals of transmitters in confirmed recent infection at diagnosis were also more likely to be sampled than those without,
reflecting participation of the former in sub-studies of the ATHENA cohort (7).
[Figure S8 plot. Vertical axis: transmission probability (%). Horizontal axis: transmission intervals.]
Figure S 8 Individual-level variation in phylogenetically derived transmission probabilities by infection/care stages.
Transmission probabilities for observed transmission intervals 𝝉 were calculated as described in Materials and Methods, and are shown
for a random sample of 40 observed transmission intervals for four infection/care stages. Colour codes match those in figure 2 in the
main text. Uncertainty in the estimated phylogenetic transmission probabilities is indicated with boxplots (black bar: median, box:
50% interquantile range, whiskers: 95% interquantile range). Substantial individual-level variation in transmission probabilities
indicates that a relatively large number of past transmission events are needed in order to reliably quantify sources of transmission.
[Figure S9 plot. Vertical axis: transmission intervals (%). Horizontal axis: stage in HIV infection and care continuum. Panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; lower estimate of HIV infection times; upper estimate of HIV infection times; phylogenetic exclusion criteria with coalescent compatibility < 30% or < 10% and clade frequency < 80%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 85%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 70%. Asterisks mark stages in each panel.]
Figure S 9 Frequency of infection/care stages among phylogenetically probable transmitters. Phylogenetic exclusion criteria and
infection time estimates were varied as described in the panels. Colour codes are as in figure 2 in the main text. Overall, infection/care
stages before ART start were overrepresented amongst phylogenetically probable transmitters (marked by an asterisk), while all stages
after ART start were underrepresented amongst phylogenetically probable transmitters (marked by an asterisk). Periods with no contact
for at least 18 months to HIV care services were also underrepresented amongst phylogenetically probable transmitters, likely
reflecting that a large proportion of potential transmitters that are listed in the ATHENA cohort but had no contact for 18 months
moved abroad or died.
[Figure S10 plot. Vertical axis: phylogenetic transmission probability per observed, probable interval (%), from 0 to 20. Horizontal axis: stage in HIV infection and care continuum. Panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; lower estimate of HIV infection times; upper estimate of HIV infection times; phylogenetic exclusion criteria with coalescent compatibility < 30% or < 10% and clade frequency < 80%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 85%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 70%.]
Figure S 10 Phylogenetically derived transmission probabilities of observed transmission intervals. Phylogenetic exclusion
criteria and infection time estimates were varied as described in the panels. Colour codes are as in figure 2 in the main text. Overall,
transmission probabilities were small, with a mean of 2.1% (25% quantile: 0.9%, 75% quantile: 2.5%, 97.5% quantile: 11.6%).
However, when grouped by infection/care stage, the phylogenetic transmission probabilities were highly informative as to how
transmission rates change with progression through the infection and care continuum.
[Figure S11 plot. Vertical axis: transmission risk ratio compared to diagnosed, untreated men with CD4>500 (%), from 0 to 100. Horizontal axis: time of diagnosis of recipient MSM (96/07-06/06, 06/07-07/12, 08/01-09/06, 09/07-10/12). Panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; lower estimate of HIV infection times; upper estimate of HIV infection times; phylogenetic exclusion criteria with coalescent compatibility < 30% or < 10% and clade frequency < 80%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 85%; coalescent compatibility < 30%, < 20% or < 10% and clade frequency < 70%.]
Figure S 11 Transmission risk ratio from men after ART start, compared to diagnosed untreated men with CD4 > 500 cells/µl.
Colour codes are as in figure 2 in the main text.
[Figure S12 plot. Horizontal axis: HIV infections amongst MSM in the transmission cohort that could have been averted in 08/07-10/12 (%), from 0 to 100. Columns: no improvements to annual testing coverage; annual testing coverage of probable transmitters 30%, 50%, 70%. Rows: PrEP reduction in incidence 44% and 86%, each with the interventions test-PrEP (<30 yrs), test-PrEP (<30 yrs)-treat (CD4<500), test-PrEP (<30 yrs)-treat (Immediate), test-PrEP (all), test-PrEP (all)-treat (CD4<500), test-PrEP (all)-treat (Immediate).]
Figure S 12 Sensitivity analysis on the impact of PrEP with lower efficacy. Estimated proportion of infections between mid-2008
and 2011 that could have been averted through the listed interventions, assuming a 44% efficacy of PrEP as reported in the iPrEX trial,
and an 86% efficacy of PrEP as reported in the more recent Ipergay and PROUD trials.
[Figure S13 plot. Horizontal axis: HIV infections amongst MSM in the transmission cohort that could have been averted in 08/07-10/12 (%), from 0 to 100. Columns: no improvements to annual testing coverage; annual testing coverage of probable transmitters 30%, 50%, 70%. Rows: PrEP coverage of probable transmitters that test negative 33%, 50%, 66%, each with the interventions test, test (RNA), test-treat (CD4<500), test-treat (Immediate), test-PrEP (<30 yrs), test-PrEP (<30 yrs)-treat (CD4<500), test-PrEP (<30 yrs)-treat (Immediate), test-PrEP (all), test-PrEP (all)-treat (CD4<500), test-PrEP (all)-treat (Immediate).]
Figure S 13 Sensitivity analysis on the impact of lower or higher PrEP coverage.
[Figure S14. Column panels: 96/07−06/04, 06/05−07/12, 08/01−09/06, 09/07−10/12. Row panels: with adjustment for censoring and sampling biases; with adjustment for sampling biases; no adjustment for sampling and censoring biases. y-axis: Proportion of transmissions (%), 0 to 40. x-axis: stage in HIV infection and care continuum.]
Figure S14 Impact of sampling and censoring adjustments on the estimated proportion of transmissions from stages in the
infection and care continuum. The proportion of transmissions attributable to infection/care stages was calculated as described in the
Materials and Methods (top row), with equal sequence sampling probabilities in the missing data model (middle row), and with $m_j(z) = 0$
(bottom row, see Materials and Methods). Colour codes are as in figure 2 in the main text. With no corrections to censoring and
sampling bias, the proportion of transmissions attributable to undiagnosed men declines to less than 40% by 2009/07-2010/12. With no
corrections to censoring bias but corrections for sequence sampling bias, the proportions of stages with large per capita transmission
probabilities increase most. Those stages with a large number of missing intervals are not necessarily amplified most, because each of
these intervals may be associated with a relatively small transmission probability. In particular, the estimated proportion of
transmissions from undiagnosed men in recent infection at diagnosis is large, even though the total number of added, missing intervals
in this stage is comparatively small. This comparison confirms, first, that sequence sampling and censoring bias can have an extensive
impact on population-based phylogenetic analyses. Second, this comparison demonstrates that it is not intuitive how corrections to
sequence sampling and censoring biases impact on such analyses when relative transmission rates also vary across risk groups.
[Figure S15. Column panels: 96/07−06/04, 06/05−07/12, 08/01−09/06, 09/07−10/12. Row panels: with phylogenetic transmission probability per interval; with phylogenetic transmission probability per probable transmitter; every interval equally likely; every probable transmitter equally likely. y-axis: Proportion of transmissions (%), 0 to 40. x-axis: stage in HIV infection and care continuum.]
Figure S15 Impact of phylogenetic transmission probabilities on the estimated proportion of transmissions from stages in the
infection and care continuum. The proportion of transmissions attributable to infection/care stages was calculated as described in the
Materials and Methods (top row), without adjusting for differences in the number of intervals per pair ($\omega_{ij\tau} = \varphi_{ij}\,\omega_{ij}$, second row),
with transmission from every probable transmitter equally likely ($\omega_{ij\tau} = 1/\tau_{ij}$, third row), and transmission from every interval
equally likely ($\omega_{ij\tau} = 1$, bottom row). Colour codes are as in figure 2 in the main text. Setting $\omega_{ij\tau} = \varphi_{ij}\,\omega_{ij}$ (second row) had no
substantial impact on the estimated proportions. In the last two cases, the proportion of transmissions from undiagnosed men in recent
infection at time of diagnosis is very small, because the high infectiousness during this short stage in the infection and care continuum
is ignored. In addition, the proportion of transmissions from men after ART start is much higher because the low infectiousness during
these stages in the infection and care continuum is ignored. Ignoring differential transmission probabilities among probable
transmitters may complicate interpretation of viral phylogenetic cluster association studies.
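The effect described in this caption can be illustrated with a small toy calculation. The sketch below is not the estimation procedure of the Materials and Methods: the intervals, stage labels and relative transmission probabilities are hypothetical, the weights are simply pooled and normalised across all intervals, and the number of intervals of a pair is assumed to play the role of $\tau_{ij}$. It only shows how replacing phylogenetic transmission probabilities with equal weights shifts the attributed proportions away from a short, highly infectious stage and towards a long stage with low infectiousness.

```python
from collections import defaultdict

# Hypothetical toy data: (transmitter-recipient pair, stage, relative transmission probability).
# One short, highly infectious interval versus several long, weakly infectious intervals.
INTERVALS = [
    ("pair A", "undiagnosed, recent infection", 0.60),
    ("pair B", "after ART start", 0.05),
    ("pair B", "after ART start", 0.05),
    ("pair B", "after ART start", 0.05),
    ("pair B", "after ART start", 0.05),
    ("pair C", "diagnosed, before ART start", 0.10),
    ("pair C", "diagnosed, before ART start", 0.10),
]


def stage_proportions(intervals, scheme):
    """Return the share of total weight attributed to each stage under a weighting scheme."""
    intervals_per_pair = defaultdict(int)
    for pair, _stage, _prob in intervals:
        intervals_per_pair[pair] += 1

    totals = defaultdict(float)
    for pair, stage, prob in intervals:
        if scheme == "phylogenetic":
            weight = prob  # use the (hypothetical) transmission probability
        elif scheme == "equal_per_transmitter":
            weight = 1.0 / intervals_per_pair[pair]  # every pair carries total weight 1
        elif scheme == "equal_per_interval":
            weight = 1.0  # every interval carries weight 1
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        totals[stage] += weight

    grand_total = sum(totals.values())
    return {stage: total / grand_total for stage, total in totals.items()}


for scheme in ("phylogenetic", "equal_per_transmitter", "equal_per_interval"):
    print(scheme, stage_proportions(INTERVALS, scheme))
```

In this toy example the recent-infection stage receives most of the weight only under the probability-based scheme, while the stage after ART start gains weight as soon as intervals or transmitters are treated as equally likely.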
[Figure S16. Column panels: 96/07−06/04, 06/05−07/12, 08/01−09/06, 09/07−10/12. Row panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; lower estimate of HIV infection times; upper estimate of HIV infection times. y-axis: Proportion of transmissions (%), 0 to 40. x-axis: stage in HIV infection and care continuum.]
Figure S16 Impact of infection time estimates on the estimated proportion of transmissions from stages in the infection and
care continuum. The proportion of transmissions attributable to infection/care stages was calculated as described in the Materials and
Methods (top row), based on the potential transmitters identified with lower 95% estimates of HIV infection times in table S2 (middle
row), and upper 95% estimates of HIV infection times (bottom row). Colour codes are as in figure 2 in the main text. The estimated
proportions did not vary substantially.
[Figure S17. Column panels: 96/07−06/04, 06/05−07/12, 08/01−09/06, 09/07−10/12. Row panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; phylogenetic exclusion criteria with coalescent compatibility < 10% and clade frequency < 80%; coalescent compatibility < 30% and clade frequency < 80%; coalescent compatibility < 10% and clade frequency < 85%; coalescent compatibility < 20% and clade frequency < 85%; coalescent compatibility < 30% and clade frequency < 85%; coalescent compatibility < 30% and clade frequency < 70%. y-axis: Proportion of transmissions (%), 0 to 40.]
Figure S17 Impact of phylogenetic clustering criteria on the estimated proportion of transmissions from stages in the infection
and care continuum. The proportion of transmissions attributable to infection/care stages was calculated as described in the Materials
and Methods (top row), and then using alternative upper and lower phylogenetic exclusion criteria as described in the row panels.
Colour codes are as in figure 2 in the main text. The estimated proportions did not vary substantially.
Figure S18 Impact of additional genetic distance criteria on the estimated proportion of transmissions from stages in the
infection and care continuum. The proportion of transmissions attributable to infection/care stages was calculated as described in the
Materials and Methods, but potential transmitters were also excluded with additional genetic distance criteria described in the row
panels. Colour codes are as in figure 2 in the main text. Before censoring and sampling biases were adjusted, an additional 2% genetic
distance criterion led to a slight increase in the proportion of transmissions attributable to men in their first year of infection, while
the additional 4% genetic distance criterion led to estimates that are comparable to those obtained without the genetic distance
criterion. After censoring and sampling biases were adjusted, the estimated proportions did not differ substantially from those obtained
without an additional genetic distance criterion.
[Figure S19. Column panels: no improvements to annual testing coverage; annual testing coverage of probable transmitters 30%, 50%, 70%. Row panels: with adjustment for censoring and sampling biases; with adjustment for sampling biases; no adjustment for sampling and censoring biases. Rows within each panel: test; test (RNA); test−treat (CD4<500); test−treat (Immediate); test−PrEP (<30 yrs) and test−PrEP (all), each alone and combined with treat (CD4<500) or treat (Immediate). x-axis: HIV infections amongst MSM in the transmission cohort that could have been averted in 08/07 − 10/12 (%), 0 to 100.]
Figure S19 Impact of sequence sampling and censoring adjustments on the estimated proportion of averted infections. The
proportion of transmissions that could have been averted was calculated as described in the Materials and Methods (top row), with
equal sequence sampling probabilities in the missing data model (middle row), and with $m_j(z) = 0$ (bottom row). With no corrections
to censoring and sampling bias, the estimated proportion of transmissions from undiagnosed men is less than 40% in 2008/07-2010/12.
Correspondingly, the estimated impact of test-and-treat is higher when compared to the central estimate. However, even under this
extreme case of model misspecification, interventions including test-and-PrEP are associated with the largest reductions in HIV
incidence. The estimated $a(H)$ differ from the central estimate by at most 10%. The case where sampling bias is adjusted but censoring
bias is ignored is overall similar to the central estimates. This comparison indicates that the evaluation of the short-term impact of
prevention strategies is robust to extensive differences in how sequence sampling and right censoring biases are adjusted for.
[Figure S20. Column panels: no improvements to annual testing coverage; annual testing coverage of probable transmitters 30%, 50%, 70%. Row panels: with phylogenetic transmission probability per interval; every interval equally likely; every probable transmitter equally likely. Rows within each panel: test; test (RNA); test−treat (CD4<500); test−treat (Immediate); test−PrEP (<30 yrs) and test−PrEP (all), each alone and combined with treat (CD4<500) or treat (Immediate). x-axis: HIV infections amongst MSM in the transmission cohort that could have been averted in 08/07 − 10/12 (%), 0 to 100.]
Figure S20 Impact of phylogenetic transmission probabilities on the estimated proportion of averted infections. The proportion
of transmissions that could have been averted was calculated as described in the Materials and Methods without adjusting for
differences in the number of intervals per pair ($\omega_{ij\tau} = \varphi_{ij}\,\omega_{ij}$, top row), with transmission from every probable transmitter equally
likely ($\omega_{ij\tau} = 1/\tau_{ij}$, middle row), and transmission from every interval equally likely ($\omega_{ij\tau} = 1$, bottom row). It is clear that if men
after diagnosis are considered as infectious as undiagnosed men, then there is no secondary benefit in moving these individuals to
stages further down in the HIV infection and care continuum.
Figure S21 Impact of infection time estimates and phylogenetic exclusion criteria on the estimated proportion of averted
infections. The proportion of transmissions that could have been averted was calculated as described in the Materials and Methods,
but using alternative upper and lower phylogenetic exclusion criteria as described in the row panels. The estimated proportions averted
did not vary substantially.
[Figure S22. Column panels: no improvements to annual testing coverage; annual testing coverage of probable transmitters 30%, 50%, 70%. Row panels: central estimate of HIV infection times, central phylogenetic exclusion criteria and genetic distance <2%; central estimate of HIV infection times, central phylogenetic exclusion criteria and genetic distance <4%. Rows within each panel: test; test (RNA); test−treat (CD4<500); test−treat (Immediate); test−PrEP (<30 yrs) and test−PrEP (all), each alone and combined with treat (CD4<500) or treat (Immediate). x-axis: HIV infections amongst MSM in the transmission cohort that could have been averted in 08/07 − 10/12 (%), 0 to 100.]
Figure S22 Impact of additional genetic distance criteria on the estimated proportion of averted infections per biomedical
intervention. The proportion of transmissions that could have been averted was calculated as described in the Materials and Methods,
but potential transmitters were also excluded with additional genetic distance criteria described in the row panels. The estimated
short-term impact of prevention strategies is insensitive to an additional 4% genetic distance criterion. With an additional 2% genetic
distance criterion, the predicted impact that test-and-treat strategies could have had on reducing incidence is lower than without a
genetic distance criterion.
Figure S23 Differences in transmission networks with and without a recipient MSM. To investigate if transmitters to MSM in
recent infection at diagnosis might differ from typical transmitters, we considered all identified viral phylogenetic clusters with and
without a recipient MSM. We then sought to evaluate if the number of late presenters (defined as men with a first CD4 count below
350 cells/µl within 12 months after diagnosis and before ART start) was enriched amongst clusters with no recipient MSM. (A) The
number of late presenters increases linearly with cluster size (blue: loess smooth, black: regression model with linear dependence on
cluster size). Adjusting for cluster size, we then fitted the following Poisson model with identity link
$$ n_i \sim \mathrm{Poi}(\mu_i), \qquad \mu_i = \beta_0 + \beta_1 r_i + \beta_2 s_i, $$
where
$n_i$    number of late presenters in cluster $i$
$\mu_i$  mean number of late presenters in cluster $i$
$r_i$    1 if cluster $i$ has a recipient MSM and 0 otherwise
$s_i$    size of cluster $i$,
to estimate the contrast $\beta_1$ between the average number of late presenters in clusters with and without a recipient. $\beta_1$ was
significantly smaller than zero after differences in cluster size were adjusted for (p=2e-14). (B) To visualize, we calculated the
adjusted number of late presenters in a cluster as the number of late presenters minus the expected number of late presenters in
clusters with a recipient MSM ($r_i = 1$) under the fitted Poisson model. Dots indicate the adjusted number of late presenters for all
viral phylogenetic transmission clusters. Boxplots indicate the mean and two standard deviations from the mean. Viral phylogenetic
transmission clusters without a recipient MSM are clearly enriched in late presenters.
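A Poisson model with identity link of this form can be fitted by Fisher scoring, which for the identity link reduces to repeated weighted least squares with weights $1/\mu_i$. The following is a minimal sketch under stated assumptions, not the code used for the analysis: the function names are hypothetical, and the cluster data are synthetic and noise-free, generated from $\beta_0 = 1$, $\beta_1 = -1$, $\beta_2 = 0.5$ so that recovery of the coefficients can be checked.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_poisson_identity(counts, recipient, size, n_iter=50):
    """Fit n_i ~ Poi(mu_i), mu_i = b0 + b1 * r_i + b2 * s_i by Fisher scoring."""
    rows = [[1.0, float(r), float(s)] for r, s in zip(recipient, size)]
    mu = [max(float(n), 0.5) for n in counts]  # crude, strictly positive starting means
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        w = [1.0 / m for m in mu]  # Fisher weights for the identity link
        xtwx = [[sum(wi * x[a] * x[b] for wi, x in zip(w, rows)) for b in range(3)]
                for a in range(3)]
        xtwy = [sum(wi * x[a] * n for wi, x, n in zip(w, rows, counts)) for a in range(3)]
        beta = solve(xtwx, xtwy)
        # Keep fitted means positive; the identity link does not guarantee this by itself.
        mu = [max(sum(bk * xk for bk, xk in zip(beta, x)), 1e-6) for x in rows]
    return beta


# Synthetic, noise-free clusters: counts = 1 - 1 * recipient + 0.5 * size.
size = [2, 4, 6, 8, 10] * 2
recipient = [0] * 5 + [1] * 5
counts = [2, 3, 4, 5, 6, 1, 2, 3, 4, 5]

beta0, beta1, beta2 = fit_poisson_identity(counts, recipient, size)
print(beta0, beta1, beta2)
```

With real count data the fitted means can approach zero, in which case a constrained optimiser or a dedicated GLM routine would be a more robust choice than this simple clamp.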
Figure S24 Exploratory local polynomial regression fits to the time to diagnosis of MSM with a last negative test in the
ATHENA cohort. Data from MSM with a last negative test (dots) are shown on top of the mean local polynomial regression fit (blue
line) and the corresponding 95% quantile ranges (light blue region).
Figure S25 Multivariable Gamma regression model fitted to the time between the midpoint of the seroconversion interval and
diagnosis of MSM with a last negative test in the ATHENA cohort. Data from MSM with a last negative test (dots) are shown on
top of the mean Gamma regression fit (blue line) and the corresponding 95% quantile ranges (light blue region).
[Figure S26. y-axis: Probability that time between midpoint of seroconversion interval and diagnosis is > t, 0.00 to 1.00. x-axis: t (years), 0 to 8. Curves: seroconverting MSM for whom recent HIV infection was not indicated at diagnosis and CD4 < 250, 25 years at diagnosis; CD4 < 250, 35 years at diagnosis; CD4 in [250−850], 35 years at diagnosis; CD4 > 850, 35 years at diagnosis.]
Figure S26 Estimated probability that the time between the midpoint of the seroconversion interval and diagnosis among
MSM with a last negative test is larger than t years.
Figure S27 Time to diagnosis estimates. (A) Average estimated time to diagnosis for MSM infected between 1996-1999 under the
regression model as a function of the quantile parameter. The maximum-likelihood estimate and the 95% confidence interval derived
from two mathematical modelling studies are highlighted in black/grey. (B) Predicted time to diagnosis for all MSM in the ATHENA
cohort with estimated date of infection between 1990-2011. The three subplots show the higher, central and lower estimates of time to
diagnosis that correspond to the calibrated quantile parameters 0.109, 0.148, 0.194.
[Figure S28. Data sets: confirmed pairs, Belgium; confirmed pairs, Sweden. y-axis: evolutionary divergence (nucleotide substitutions / site), 0.00 to 0.10. x-axis: time elapsed (years), 0 to 12.]
Figure S28 Genetic distance among sequence pairs from transmitter-recipient pairs in the Belgian and Swedish transmission
chains. The 80% and 95% quantile ranges under the fitted probabilistic molecular clock model are shown with dashed and dotted
lines.
[Figure S29. y-axis: probability that sequences from the same individual do not co−cluster (%), 0 to 60. x-axis: clade frequency threshold, 0.7 to 0.95. Distance threshold: None; 2% subst/site; 4% subst/site.]
Figure S29 Approximate type-I error of the phylogenetic clustering criterion as a function of the clade frequency threshold.
Approximate type-I error in excluding sequences from the same individual if no genetic distance threshold is used (green), a 2%
substitutions / site distance threshold is used (orange), and a 4% substitutions / site genetic distance threshold is used (purple).
Figure S30 Type-I error and power of the coalescence compatibility test. (A) Approximate type-I error in excluding co-clustering
sequences from the same individual. (B) Power in excluding co-clustering female-female pairs.
Figure S31 Estimated fraction of non-censored potential transmission intervals. The fraction of non-censored potential
transmission intervals was estimated for six time intervals, $[t_1, t_2]$ = 1996/07-2006/06, 2006/07-2007/12, 2008/01-2009/06,
2009/07-2009/12, 2010/01-2010/06, 2010/07-2010/12. The fraction is plotted at the end time of each time interval.
[Figure S32. Panels: central estimate of HIV infection times with central phylogenetic exclusion criteria; lower estimate of HIV infection times; upper estimate of HIV infection times; phylogenetic exclusion criteria with coalescent compatibility < 10%, < 20% or < 30% combined with clade frequency < 70%, < 80% or < 85%. Groups: diagnosed in 09/07−10/12; probable transmitters to recipients diagnosed in 09/07−10/12. y-axis: Proportion with a last negative test, 0 to 60. x-axis: time between last negative HIV test and diagnosis (years), 0 to 3.]
Figure S32 Time between last negative test and diagnosis amongst MSM diagnosed in 2009/07-2010/12 and probable
transmitters of recipients diagnosed in 2009/07-2010/12.
Table S1 Clinical and viral sequence data used in this study

Sample sizes
  Registered MSM by March 2013              11,863
  MSM with confirmed recent infection        1,794
  Viral load measurements                  265,853
  CD4 measurements                         284,151

Coverage of linked clinical data of potential transmitters
  Time of diagnosis    Unknown recency status    No CD4 measurement between     No viral load measurement
  of recipient MSM     at diagnosis (%)          diagnosis and ART start (%)    after ART start (%)
  96/07-06/06          51                        8.6                            8.4
  06/07-07/12          48                        8.0                            8.2
  08/01-09/06          46                        7.8                            8.0
  09/07-10/12          45                        7.6                            8.0

Frequency of linked clinical data

CD4 measurements between diagnosis and ART start of potential transmitters (number/year)
  Time of diagnosis
  of recipient MSM     2.5%    25%     Median  75%     97.5%
  96/07-06/06          0.16    1.38    2.75    3.93    6.71
  06/07-07/12          0.18    1.61    2.92    3.99    6.73
  08/01-09/06          0.19    1.67    2.97    4.01    6.74
  09/07-10/12          0.19    1.68    2.97    4.01    6.74

Viral load measurements after ART start of potential transmitters (number/year)
  Time of diagnosis
  of recipient MSM     2.5%    25%     Median  75%     97.5%
  96/07-06/06          0.72    2.13    2.70    3.29    5.00
  06/07-07/12          0.76    2.11    2.66    3.23    4.54
  08/01-09/06          0.75    2.10    2.64    3.21    4.50
  09/07-10/12          0.76    2.09    2.62    3.19    4.41

Partial polymerase HIV-1 subtype B sequences (n)
  Available Dutch sequences                                                    8,748
  Enriched with 10 most similar sequences in the Los Alamos
  Sequence Database (http://www.hiv.lanl.gov/)                                 9,474
  Data set used in analysis, after exclusion of potentially recombinant
  sequences (identified with 3SEQ), and exclusion of sequences with
  length <= 250 nucleotides                                                    9,054

Characteristics of Dutch HIV-1 subtype B sequences used in analysis (n)
  Dutch sequences                            8,328
  Sequences sampled after ART start          3,693
  Patients sampled                           6,231

                                                       5%      25%     Mean    75%     95%
  Length of ATHENA sequences (nt)                      1175    1235    1256    1274    1600
  Time from diagnosis to sampling of an
  individual’s first sequence (years)                  0       0.03    2.8     4.1     13.5

Individuals with at least one sequence
                       Male    Female
  MSM                  4767    0
  Drug user            123     57
  Other                82      42
  Unknown              174     14
  Heterosexual         518     455
Table S2 Potential transmitters and potential transmission pairs to the recipient MSM

                Total
Time of         Central estimate of         Lower estimate of           Upper estimate of
diagnosis of    infection time ¶            infection time ¶            infection time ¶
recipient MSM   All (n)       With a        All (n)       With a        All (n)       With a
                              sequence (n)                sequence (n)                sequence (n)

Recipient MSM
  Overall       1794          1045          1794          1045          1794          1045
  96/07-06/06   695           368           695           368           695           368
  06/07-07/12   323           216           323           216           323           216
  08/01-09/06   405           233           405           233           405           233
  09/07-10/12   371           228           371           228           371           228

Potential transmitters*
  Overall       12193         5585          12189         5585          12193         5585
  96/07-06/06   9687          4322          9537          4239          9816          4376
  06/07-07/12   10179         4750          10093         4718          10272         4779
  08/01-09/06   10962         5148          10935         5142          10989         5161
  09/07-10/12   11419         5329          11415         5329          11419         5329

Potential transmission pairs*
  Overall       9722349       4428060       9617162       4378332       9824133       4475601
  96/07-06/06   2712072       1158948       2649412       1125357       2777347       1193496
  06/07-07/12   2075669       961286        2052721       951409        2096432       969131
  08/01-09/06   2427787       1134094       2412127       1128919       2440400       1138425
  09/07-10/12   2506821       1173732       2502902       1172647       2509954       1174549

* Potential transmitters and potential transmission pairs were counted for recipient MSM with a sequence for
computational reasons. ¶ The infection time of potential transmitters was estimated from their age at diagnosis,
recency of HIV infection at diagnosis, and first CD4 count within 12 months of diagnosis. The central estimate is
based on $\alpha = 0.148$ in the estimation procedure described in the supplementary online material; the lower estimate
is based on $\alpha = 0.194$, and the upper estimate is based on $\alpha = 0.109$.
Table S3 Identified phylogenetically probable transmitters and phylogenetically probable transmission pairs to the recipient MSM.

                Total
Time of         Central             Lower-I             Upper-I             Lower-II            Upper-II
diagnosis of    exclusion criteria  exclusion criteria  exclusion criteria  exclusion criteria  exclusion criteria
recipient MSM   Clade freq <80%     Clade freq <80%     Clade freq <80%     Clade freq <85%     Clade freq <70%
                Coal comp <20%      Coal comp <30%      Coal comp <10%      Coal comp <20%      Coal comp <20%
                (n)     (%)         (n)     (%)         (n)     (%)         (n)     (%)         (n)     (%)

Recipient MSM with a phylogenetically probable transmitter (percentages: *)
  Overall       617     59.14       594     56.84       638     61.05       564     53.97       656     62.78
  96/07-06/06   165     44.84       150     40.76       172     46.74       146     39.67       179     48.64
  06/07-07/12   144     66.67       143     66.2        146     67.59       134     62.04       150     69.44
  08/01-09/06   152     65.24       149     63.95       159     68.24       138     59.23       160     68.67
  09/07-10/12   157     68.86       152     66.67       161     70.61       146     64.04       167     73.25

Phylogenetically probable transmitters (percentages: §)
  Overall       903     16.17       841     15.06       981     17.56       823     14.74       975     17.46
  96/07-06/06   268     6.32        237     5.59        308     7.27        240     5.55        308     7.13
  06/07-07/12   331     7.02        302     6.4         362     7.67        307     6.46        359     7.56
  08/01-09/06   348     6.77        322     6.26        400     7.78        323     6.27        391     7.6
  09/07-10/12   407     7.64        370     6.94        448     8.41        367     6.89        442     8.29

Phylogenetically probable transmission pairs (percentages: ¶)
  Overall       2343    0.05        2059    0.05        2698    0.06        2097    0.05        2718    0.06
  96/07-06/06   401     0.04        353     0.03        477     0.04        380     0.03        498     0.04
  06/07-07/12   506     0.05        446     0.05        569     0.06        488     0.05        579     0.06
  08/01-09/06   731     0.06        636     0.06        842     0.07        642     0.06        819     0.07
  09/07-10/12   705     0.06        624     0.05        810     0.07        587     0.05        822     0.07

See supplementary online material for a description of sensitivity analyses. * Proportion among all recipient MSM with a sequence, § proportion among potential
transmitters with a sequence, ¶ proportion among potential transmission pairs with sequences from both individuals based on central estimates of infection times.
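As a worked check of how the starred percentages relate to the counts, the short sketch below recomputes them for the central exclusion criteria, dividing the number of recipient MSM with a phylogenetically probable transmitter by the number of recipient MSM with a sequence reported in Table S2 for the same period. The variable and function names are illustrative only.

```python
# Recipient MSM with a sequence, by time of diagnosis (Table S2).
recipients_with_sequence = {
    "96/07-06/06": 368,
    "06/07-07/12": 216,
    "08/01-09/06": 233,
    "09/07-10/12": 228,
}

# Recipient MSM with a phylogenetically probable transmitter,
# central exclusion criteria (Table S3).
recipients_with_probable_transmitter = {
    "96/07-06/06": 165,
    "06/07-07/12": 144,
    "08/01-09/06": 152,
    "09/07-10/12": 157,
}


def percent(numerator, denominator):
    """Return numerator / denominator as a percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)


starred = {
    period: percent(recipients_with_probable_transmitter[period], total)
    for period, total in recipients_with_sequence.items()
}

for period, value in starred.items():
    print(period, value)
```

The same division applies to the other exclusion criteria, with the denominators for the § and ¶ columns taken from the groups named in the footnote.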
Table S4 Demographic and clinical characteristics of the 3,025 MSM with a last negative test that were used to fit the
multivariable regression model
Time of
diagnosis
Age at
diagnosis
CD4 count
at diagnosis
Infection
status at
diagnosis
<=06/06
06/07-07/12
08/01-09/06
09/07-10/12
931
609
721
764
<=25
26-35
36-45
46-55
>55
242
944
1114
545
180
No CD4
measurement
to date
CD4
measurement
after 1 year
of diagnosis
93
Recent HIV
infection not
indicated
722
First CD4
measurement
after ART
start
52
Missing
≤250
251-850
>850
471
2169
230
10
Confirmed
recent HIV
infection
1369
934