The Journal of Pain, Vol 16, No 2 (February), 2015: pp 116-123
Available online at www.jpain.org and www.sciencedirect.com
A Review of the Use of the Number Needed to Treat to Evaluate
the Efficacy of Analgesics
Nathaniel Katz, Florence C. Paillard, and Richard Van Inwegen
Analgesic Solutions, Natick, Massachusetts.
Abstract: Standardized measures of efficacy are needed to compare analgesic efficacy across trials.
The number needed to treat (NNT) is considered a statistically robust and readily interpretable measure to rank the efficacy of treatments, including analgesics. The NNT has become widely utilized to
compare the efficacy of chronic pain treatments, helping physicians make treatment decisions and
informing decisions for market access, reimbursement, and treatment guidelines. However, the
NNT is associated with specific weaknesses in calculation and interpretation not associated with
other methods for integrating trial data. These weaknesses include distortions in calculation as
placebo effects approach treatment effects, with the possibility of infinite values; difficulties in estimating the NNT’s confidence interval; and difficulties in interpretation. The NNT also requires selecting cutoffs of the original variable for dichotomization, with the NNT often changing depending on
the cutoff. The NNT also suffers from problems common to other placebo-adjusted endpoints,
including being sensitive to study-related and external factors (eg, year of publication). Therefore,
clinicians and other stakeholders need to be aware of these issues to correctly calculate, use, and
interpret the NNT. Nevertheless, efficacy, as measured by any variable, is only one aspect of a treatment to be considered in determining its place in therapy.
Perspective: The NNT has become widely utilized to compare the efficacy of chronic pain
treatments. This article reviews the uses of the NNT and the potential problems associated with its
calculation, use, and interpretation. Clinicians should be aware of these issues when interpreting
clinical trial data based on the NNT.
© 2015 by the American Pain Society
Key words: Number needed to treat, analgesics, efficacy, clinical trials, chronic pain treatment.
The number needed to treat (NNT) was devised in
1988 by Laupacis et al29 as a single unitary measure
of a drug’s efficacy that was meant to provide an
intuitive means for evaluating the relative efficacy of
different drugs in order to rank them as to their efficacy.33 Like any other method that evaluates the relative
efficacy of treatments, the NNT is dependent on
comparing data across randomized double-blind
controlled trials, the gold standard in clinical research.
The efficacy of the drug being studied is measured as incremental benefit above that in the placebo group and is
typically quantified by the difference between groups in
mean value for the primary endpoint. The group mean
difference is often difficult to interpret in a clinically
intuitive manner. Thus, to allow easier interpretation of
clinical data, the NNT defines each patient as either a
responder or a nonresponder (based on some predefined
definition of response) and compares the proportion of
responders in each group.
The authors received a grant from Janssen Pharmaceuticals to support
independent writing of this review.
The authors declare having no financial relationship to the work. N.K.
and R.V.I. are employees and F.C.P. is a contractor of Analgesic Solutions.
Address reprint requests to Florence C. Paillard, PhD, Analgesic Solutions,
232 Pond St, Natick, MA 01760. E-mail: [email protected]
1526-5900/$36.00
http://dx.doi.org/10.1016/j.jpain.2014.08.005
Definition and Methods of Calculation of
the NNT
The NNT is interpreted as the number of patients one
would need to treat in order to get one more responder
on the active treatment than one would have gotten had
they been treated with control. In technical terms, the
NNT is the inverse of the absolute risk difference (ARD):
\[ \mathrm{NNT} = \frac{1}{\mathrm{ARD}} \]
ARD is the difference in proportion of patients who
manifest a response to a treatment and the proportion
of patients who manifest a response to control. It is
possible to use placebo, nontreatment, or another active
treatment as the control to calculate the NNT. The choice
of controls has a huge impact on the NNT values and
their interpretations. Thus, any comparison of NNT
values must use the same controls for calculations;
comparing NNT values calculated with different control
groups would not be valid. Therefore, the discussions
in this review are limited to comparisons with placebo controls, unless otherwise stated. Most often, the
response used to calculate the ARD is an improvement
(eg, reduction in pain), in which case ARD is calculated
as "response with drug minus response with placebo."
Responders, patients who manifest a response, are
defined as patients who meet a predefined criterion of
response in an all-or-nothing fashion (eg, death/survival). However, for most uses, a nondichotomous endpoint is used
(eg, pain intensity score), and a predefined response criterion is created (eg, having a ≥30% pain reduction or not).
The following is an example of calculating the NNT. If
half (50%) the patients on active treatment respond
(response rate = .5), and one quarter (25%) of the
placebo-treated patients respond (response rate = .25),
then the ARD is .5 − .25 = .25. The NNT is 1/.25 = 4. This
means that 4 patients would have to be treated
with the active treatment to get 1 more responder than with
placebo. In other words, treating 4 patients with active treatment would yield 2 responders, whereas treating 4
patients with placebo would yield only 1 responder.
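The arithmetic of this example can be sketched in a few lines of Python. The function names are illustrative choices rather than part of any published method, and the sketch assumes the two response rates differ.

```python
def absolute_risk_difference(rate_active: float, rate_control: float) -> float:
    """ARD: responder proportion on active treatment minus that on control."""
    return rate_active - rate_control


def number_needed_to_treat(rate_active: float, rate_control: float) -> float:
    """NNT = 1 / ARD. Assumes the two rates differ (ARD != 0)."""
    return 1.0 / absolute_risk_difference(rate_active, rate_control)


# Worked example from the text: 50% respond on active treatment, 25% on placebo.
ard = absolute_risk_difference(0.50, 0.25)
nnt = number_needed_to_treat(0.50, 0.25)

# Expected responders if `nnt` patients are treated in each arm.
responders_active = nnt * 0.50
responders_placebo = nnt * 0.25

print(f"ARD = {ard:.2f}, NNT = {nnt:.0f}")
print(f"Per {nnt:.0f} patients: {responders_active:.0f} responders on active, "
      f"{responders_placebo:.0f} on placebo")
```

The difference between the two expected responder counts is exactly 1, which is the defining property of the NNT.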
The NNT can be also calculated using the odds ratio
(OR) or the relative risk reduction (RRR) (reviewed in reference 33).
The optimal NNT value is 1, whereby every patient has
a positive response to treatment and no patient responds
to placebo. When a drug produces an identical effect to
placebo, the NNT would have a value of infinity (ARD = 0
and NNT = 1/0). Thus, in theory, the NNT can vary from 1
to infinity. It is also possible for the NNT to have a negative value when the response rate for placebo is greater
than that for the drug. In practice, the NNT is usually used
to compare effective drugs, so negative and infinity
values are not usually reported because the drug is
considered ineffective. Thus, the lower (ie, the closer to
1) the NNT, the more efficacious the drug.
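The boundary cases described above (an NNT of 1, an infinite NNT, and a negative NNT) can be made explicit in code. The following is a minimal sketch; the helper names and the wording of the labels are assumptions introduced for illustration, and mapping a zero difference to `math.inf` is a convention of this sketch.

```python
import math


def nnt_from_rates(rate_active: float, rate_control: float) -> float:
    """Return 1/ARD, mapping a zero difference to infinity.

    A negative result means the control arm had the higher response rate.
    Exact comparison with zero is adequate for this illustration; rates
    estimated from data might call for a tolerance instead.
    """
    ard = rate_active - rate_control
    if ard == 0:
        return math.inf
    return 1.0 / ard


def describe(nnt: float) -> str:
    """Label an NNT value using the cases discussed in the text."""
    if math.isinf(nnt):
        return "no difference from control (infinite NNT)"
    if nnt < 0:
        return "control outperformed the drug (negative NNT)"
    if nnt == 1:
        return "optimal: every treated patient responds, no control patient does"
    return "drug outperformed control"


for active, control in [(1.0, 0.0), (0.5, 0.25), (0.4, 0.4), (0.5, 0.75)]:
    value = nnt_from_rates(active, control)
    print(f"active={active:.2f} control={control:.2f} -> NNT={value:g}: {describe(value)}")
```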
Uses of the NNT
The NNT is usually used to compare the efficacy of
different drugs for the same indication to enable physicians to make informed choices in clinical practice. To
allow comparison and ranking of the efficacy of different
treatments, many systematic reviews and meta-analyses
of randomized controlled trials provide average NNT
values for various treatments across an indication (the
NNT being treatment- and dose-specific).33 For instance,
a meta-analysis evaluating the efficacy of analgesics
ranked the drugs’ efficacy by NNT value (the lower the
NNT, the higher the efficacy) as follows: efficacy of
ibuprofen 400 mg (NNT = 2.8) > paracetamol (600/
650 mg) (NNT = 5.0) > codeine 60 mg (NNT = 18).34
The NNT has become widely utilized for comparing
treatment efficacy for chronic pain and other conditions,
which informs decisions for market access, reimbursement, and position in treatment guidelines, which affect
the lives of countless patients. Reliance on the NNT for
this purpose is based on presumptions of methodologic
robustness and straightforward interpretability.
Herein, we conducted a qualitative review of the
advantages and limitations of the NNT to assess whether
the NNT is suitable for evaluating the comparative efficacy of analgesics for chronic pain.
Critiques of the NNT
Critiques of NNT have been grouped into 6 major
categories (Table 1). A number of these critiques (both
positive and negative) were not specifically inherent to
the NNT, but could be applied to any comparative measure of efficacy such as the ARD and OR. The fact that
the NNT can be impacted by study design (eg, study
size, number of arms, type of comparator) and subject’s
characteristics (eg, indication, severity, and duration of
disease; reviewed in references14,31,32,48,51) is not too
surprising as these factors influence the response rate,
which forms the basis of the NNT definition. These
variables can also influence other comparative
measures. Thus, critiques of the NNT were categorized
based on whether they are specific to the NNT or
nonspecifically applicable to any placebo-adjusted
efficacy measure used in meta-analysis.
Critiques Specific to the NNT
Issues Associated With Calculating the NNT
Calculating the Combined NNT From Various Trials Can
Be Subject to Bias. Two methods have been proposed
and used to calculate the NNT from several clinical trials.
One method uses a meta-analytic method wherein each
trial is handled separately, and the other treats the data
as if arising from a single trial.2,6 The latter method is
prone to bias, especially when there is an imbalance in
sample size between treatment and placebo arms, a
phenomenon called Simpson’s paradox. The meta-analytic method is not prone to this issue. In meta-analyses, the use of a relative measure such as OR or
risk ratio (RR) is advocated because event rates vary
considerably from study to study even for the same
drug.6 The method used to calculate the combined NNT
from various trials should be clearly stated.
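The contrast between the two pooling approaches can be illustrated with invented numbers. In the sketch below, both trials and all counts are hypothetical and were chosen so that the arm sizes are imbalanced in opposite directions. For brevity, the trial-by-trial route pools risk differences with fixed-effect inverse-variance weights; this is a simplifying assumption of the sketch, whereas the cited literature advocates pooling a relative measure such as the OR or RR.

```python
# Hypothetical trials: (responders_active, n_active, responders_placebo, n_placebo).
TRIALS = [
    (90, 300, 20, 100),   # 30% vs 20% response, large active arm
    (60, 100, 150, 300),  # 60% vs 50% response, large placebo arm
]


def risk_difference(ra, na, rp, n_p):
    """Absolute risk difference for one trial."""
    return ra / na - rp / n_p


def risk_difference_variance(ra, na, rp, n_p):
    """Binomial variance of the risk difference for one trial."""
    pa, pp = ra / na, rp / n_p
    return pa * (1 - pa) / na + pp * (1 - pp) / n_p


# Approach 1: pool raw counts as if all patients came from a single trial.
total_ra = sum(t[0] for t in TRIALS)
total_na = sum(t[1] for t in TRIALS)
total_rp = sum(t[2] for t in TRIALS)
total_np = sum(t[3] for t in TRIALS)
naive_ard = risk_difference(total_ra, total_na, total_rp, total_np)

# Approach 2: estimate each trial separately, then combine the estimates
# with inverse-variance weights (fixed-effect pooling).
weights = [1.0 / risk_difference_variance(*t) for t in TRIALS]
pooled_ard = sum(w * risk_difference(*t) for w, t in zip(weights, TRIALS)) / sum(weights)

print(f"Single-trial pooling: ARD = {naive_ard:+.3f}, NNT = {1 / naive_ard:.1f}")
print(f"Trial-by-trial pooling: ARD = {pooled_ard:+.3f}, NNT = {1 / pooled_ard:.1f}")
```

With these invented counts, each trial on its own favors the drug by 10 percentage points, yet treating the data as a single trial yields a negative ARD, which is the reversal that Simpson's paradox describes.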
The NNT Can Have an Infinite Value. Unlike other
measures of efficacy (eg, ARD), the NNT is an inversion
of a response (ie, 1/ARD) so that a zero difference
between active and placebo groups results in an undefined number (ie, 1/0).18,26,43 This could be further
complicated if the treatment in one study is less
efficacious than placebo, resulting in a negative number.
Having an infinite or negative value renders
comparing studies and calculating the combined NNT
for several trials difficult.2,8,22,30,50 As stated above,
using a meta-analytic methodology based on the OR or
RR is recommended.6
In some systematic reviews, the NNT is considered
"nonsignificant" when it is high or infinite,16 which
means that the treatment has no therapeutic value.
However, the cutoff value for "high" is arbitrary; the
therapeutic value or lack thereof of a therapy with a high
NNT would be dependent on the therapeutic area and
availability of successful treatments with lower NNT
values. In an indication with limited treatments (eg,
visceral pain), a "high" number would be more acceptable than in a pain indication with multiple options.

Table 1. Critiques of the NNT
TYPES OF CRITIQUES
Critiques specific to the NNT
  Issues associated with calculating the NNT:
    The NNT can have an infinite value, creating problems for meta-analyses and causing a disproportionate impact of failed versus successful studies
    The precision of the NNT is difficult to estimate
    The NNT has a skewed CI
  The NNT is difficult to interpret
Critiques of the NNT applicable to other endpoints used in meta-analyses
  The NNT is sensitive to the level of efficacy in the placebo group
  The NNT depends on the endpoint:
    The NNT is dependent on the selection of outcome measured
    The NNT is dependent on the cutoff chosen for the dichotomous outcome
    The NNT is dependent on the time point of outcome
  The NNT depends on factors internal and external to the study aside from study drug
  NNT does not reflect effectiveness of treatment in clinical practice
STUDIES
Suissa,49 Stang,48 Hutton,22 Hutton,23 Thabane,50 Ebrahim,13 Hutton,24 Lesaffre,30 Newcombe,37 Newcombe,38 Smeeth,47 Altman1,2
Christensen,8 Ebrahim,13 Smeeth47
Moore33
Suissa,49 Stang,48 Citrome,9 McAlister,32 Moore,35 Christensen,8 Mayne,31 Tramer,51 Wu,53 Ebrahim13
Suissa,49 Stang,48 Citrome,9 McAlister,32 Moore,35 Christensen,8 Mayne,31 Tramer,51 Wu,54 Ebrahim13
Tramer,51 Smeeth,47 Black,4 Dowie10

The Precision of the NNT Is Difficult to Estimate. A
measure of drug effect size should have a definable
precision, such as a 95% confidence interval (CI). When
the NNT is very high or infinite, its 95% CI cannot be accurately calculated.1,2,8,22,30,50 For instance, if the ARD is .01
and its 95% CI is [−.01 to +.03], the NNT is 100 (1/.01) but
its 95% CI is not [−100 to +33.3]. In this example, the
95% CI would be mathematically 2 intervals, one at
[−∞, −100] and another at [+33.3, +∞], because the
ARD 95% CI contains zero, which is very difficult to
interpret.8 Therefore, in such circumstances, it has been
recommended that the NNT be given only as a point
estimate (ie, 100 in this example) without the 95% CI,
which avoids the mathematical problem but does not
fulfil the desire for an estimate of precision.
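The way an ARD interval that contains zero turns into two disjoint NNT intervals can be sketched as follows. The function name and its return format are assumptions of this illustration, and the sketch simply rejects the degenerate case in which a limit is exactly zero.

```python
import math


def nnt_confidence_region(ard_low: float, ard_high: float):
    """Invert an ARD confidence interval into the matching NNT region.

    Returns a list of (low, high) tuples: one interval when the ARD interval
    excludes zero, two disjoint intervals when it spans zero.
    """
    if ard_low > ard_high:
        raise ValueError("ard_low must not exceed ard_high")
    if ard_low == 0 or ard_high == 0:
        raise ValueError("a limit of exactly zero is not handled in this sketch")
    if ard_low > 0 or ard_high < 0:
        # Both limits share a sign: inversion simply swaps their order.
        return [(1.0 / ard_high, 1.0 / ard_low)]
    # The interval spans zero: the NNT region extends to infinity on both sides.
    return [(-math.inf, 1.0 / ard_low), (1.0 / ard_high, math.inf)]


# ARD interval from the text that contains zero.
print(nnt_confidence_region(-0.01, 0.03))
# An ARD interval that excludes zero gives a single, ordinary interval.
print(nnt_confidence_region(0.05, 0.15))
```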
The NNT Has a Skewed CI. The NNT is calculated by
inverting the ARD and its CI by inverting the ARD’s CI
limits. This leads to an uneven distribution of the NNT’s
CI, whereby the upper bound of the CI is artificially "inflated."2,50 For instance, if the ARD is .1 (95%
CI: [.05–.15]), the NNT is 10 (95% CI: [6.7–20]).
Therefore, the NNT value is further from the upper
bound of the CI than from the lower bound of the CI.2
To provide a real example, the NNT for 60 mg etoricoxib
to produce at least 50% pain relief at 6 weeks compared
with placebo was 2.6 [95% CI: 2.0–3.9], showing an
"inflation" of the upper bound of the 95% CI.35 However,
this anomaly is not unique to the NNT and has been
observed with the OR, albeit for different reasons.
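The asymmetry in the theoretical example follows directly from taking reciprocals of symmetric limits. Using only the numbers already given (an ARD of .10 with limits .05 and .15), a worked version reads as follows; the subscripts "lower" and "upper" are labels introduced here for convenience.

```latex
\[
\mathrm{NNT} = \frac{1}{0.10} = 10, \qquad
\mathrm{NNT}_{\text{lower}} = \frac{1}{0.15} \approx 6.7, \qquad
\mathrm{NNT}_{\text{upper}} = \frac{1}{0.05} = 20
\]
\[
\mathrm{NNT}_{\text{upper}} - \mathrm{NNT} = 10
\qquad \text{versus} \qquad
\mathrm{NNT} - \mathrm{NNT}_{\text{lower}} \approx 3.3
\]
```

The ARD limits lie equally far (.05) from the point estimate, whereas the upper NNT limit lies about 3 times further from the point estimate than the lower one.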
The NNT Is Difficult to Interpret
Since its inception in 1988, the NNT has quickly become
a widely used measure of a drug’s efficacy because it was
thought to be a more intuitive measure of a drug’s efficacy than other measures of efficacy such as group
mean differences, RRR, OR, and absolute risk reduction
(ARR).33
However, several studies using a randomized design
to survey the understanding of NNT among doctors,21,39
medical students,43 patients,44 and laypersons20,28 have
shown that they have difficulty understanding the
meaning of the NNT. One survey of first-year medical
students showed that treatment benefit was correctly
interpreted by ≥75% of the students when presented
as RRR or ARR, but only by 30% of students when presented as the NNT.43 A survey of Norwegian medical
doctors showed that many believed that only 1 of NNT
patients would benefit from treatment (eg, if the NNT
were 4, then 1 of 4 patients would benefit).21 Patients
who were presented with clinical data in terms of
RRR, ARR, and NNT were best able to interpret the benefits of treatment when it was presented as RRR or
ARR.44 A study showed that a large majority of people
would accept using the hypothetical treatment presented to them regardless of its NNT, and that those
who declined using the medication misinterpreted the
NNT.28 In another study, laypersons presented with the
benefit of a hypothetical osteoporosis intervention
and offered the therapy were sensitive to the magnitude of treatment benefit when it was presented in
terms of postponement of hip fracture, but not in terms
of NNT.7,20
A search for definitions of the NNT on the Internet
shows that the NNT is commonly misinterpreted as the
number of patients one needs to treat to get one
responder.
Critiques of the NNT Applicable to Other
Endpoints Used in Meta-Analyses
The NNT Is Sensitive to the Level of "Efficacy"
in the Placebo Group
An important asset of the NNT is that it takes into
account the response observed in the placebo group,
thereby allowing for the adjustment for nonspecific
factors that may be associated with non–treatment-specific benefits in a clinical trial, such as nonspecific
response, spontaneous improvements/worsening of
the disease, natural history, and regression to the
mean. To this end, the NNT is intended to serve as a
measure of the pharmacologic benefit of a drug over
and above nonspecific factors that may be associated
with benefits in a clinical trial. Other placebo-adjusted
measures such as group mean differences, ORs, and
RRs are also commonly used to "normalize" the incremental pharmacologic benefit of a drug in the context
of a clinical trial.

Table 2. Impact of the Placebo Response on NNT: Theoretical Example No. 1
% RESPONDERS IN          % RESPONDERS IN                     % MORE RESPONDERS
ACTIVE TREATMENT GROUP   PLACEBO GROUP     DELTA    NNT      IN ACTIVE GROUP*
10                       0                 10       10.0     100.0
20                       10                10       10.0     50.0
30                       20                10       10.0     33.3
40                       30                10       10.0     25.0
50                       40                10       10.0     20.0
60                       50                10       10.0     16.7
70                       60                10       10.0     14.3
80                       70                10       10.0     12.5
*Calculated as (% responders in active group − % responders in placebo group) / % responders in active group.

As with other comparative measures of efficacy that
rely upon the difference between drug and placebo,
the absolute level of placebo response can have an
impact on interpretation. For example, a treatment
with a high absolute benefit, but a nearly as high
benefit in the placebo group, will appear highly effective in clinical practice but will have an unfavorable
NNT. A treatment with a more modest absolute benefit,
but a much lower response in the placebo group, will
appear less effective in clinical practice despite a more
favorable NNT. This is shown in the theoretical
example in Table 2, where the NNT of
different treatments remains constant, but the likely
clinical interpretation and desirability of treatment
does not.
Another theoretical example to consider is where the
response to drug remains constant but the response to
placebo varies, as is often observed between similar
studies of the same treatment (see examples below
and reference12). In Table 3, the drug’s response was
held constant at 40% (a typical response rate observed
for analgesics) and the placebo response varied from
5% to 40%. This example shows that the NNT dramatically increases with increasing placebo response,
reaching infinity when the response in the placebo
group is equal to that in the active group (Table 3).
Although the NNT for these theoretical examples varied from 2.9 to infinity, the ARD only varied from .35
to 0, illustrating how the NNT appears to be more
unstable than the ARD in the face of variable placebo
responses.
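The greater instability of the NNT can be read off from its definition as a reciprocal. In the sketch below, $p_T$ and $p_P$ denote the response rates in the active and placebo groups; these symbols are shorthand introduced here and are not used elsewhere in the article.

```latex
\[
\mathrm{ARD} = p_T - p_P, \qquad
\mathrm{NNT} = \frac{1}{p_T - p_P}
\]
\[
\frac{\partial\,\mathrm{ARD}}{\partial p_P} = -1, \qquad
\frac{\partial\,\mathrm{NNT}}{\partial p_P} = \frac{1}{(p_T - p_P)^2} = \mathrm{NNT}^2
\]
```

A 1-percentage-point rise in placebo response therefore always lowers the ARD by .01, but it raises the NNT by roughly $\mathrm{NNT}^2 \times .01$: about 0.08 when the NNT is 2.9, and about 1 when the NNT is 10. The effect grows without bound as the placebo response approaches the drug response.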
Table 3. Impact of the Placebo Response on NNT: Theoretical Example No. 2
% RESPONDERS IN          % RESPONDERS IN
ACTIVE TREATMENT GROUP   PLACEBO GROUP     DELTA    NNT    ARD
40                       5                 35       2.9    .35
40                       10                30       3.3    .30
40                       20                20       5      .20
40                       30                10       10     .10
40                       40                0        ∞      0

The fact that the NNT varies more widely than the
ARD is illustrated in examples of clinical trials noted in
Table 4. This table shows trials from anticonvulsant
analgesics with very different NNTs (presented in order
of lowest to highest NNT) and presents the ARD. In
these examples, the NNT varied from 1.4 to infinity,
whereas the ARD only varied from .704 to 0. This is
again a consequence of the difference between the
active and placebo groups in the proportion of patients
improving.
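The ARD and NNT columns of Table 4 can be recomputed from the responder counts it reports. The sketch below copies those counts from the table; the data structure and function name are illustrative assumptions, and the recomputed ARD values may differ from the printed ones by about .001.

```python
import math

# (reference, drug, responders_active, n_active, responders_placebo, n_placebo)
# Counts are copied from Table 4.
TABLE_4 = [
    ("Killian", "Carbamazepine", 19, 27, 0, 27),
    ("Zakrzewska", "Lamotrigine", 7, 13, 1, 14),
    ("Gilron", "Gabapentin", 27, 44, 13, 42),
    ("Simpson (gabapentin)", "Gabapentin", 15, 30, 7, 30),
    ("Simpson (lamotrigine)", "Lamotrigine", 86, 150, 30, 77),
    ("Otto", "Valproate", 8, 31, 3, 31),
    ("Raskin", "Topiramate", 74, 214, 23, 109),
    ("Drewes", "Valproate", 6, 20, 4, 20),
    ("Serpell", "Gabapentin", 32, 153, 21, 152),
    ("Finnerup", "Lamotrigine", 4, 21, 4, 21),
]


def ard_and_nnt(resp_a, n_a, resp_p, n_p):
    """Return (ARD, NNT); an ARD of zero is reported as an infinite NNT."""
    ard = resp_a / n_a - resp_p / n_p
    nnt = math.inf if ard == 0 else 1.0 / ard
    return ard, nnt


results = [(ref, drug, *ard_and_nnt(ra, na, rp, n_p))
           for ref, drug, ra, na, rp, n_p in TABLE_4]

for ref, drug, ard, nnt in results:
    print(f"{ref:<22} {drug:<14} ARD = {ard:.3f}  NNT = {nnt:.1f}")

ards = [r[2] for r in results]
nnts = [r[3] for r in results]
print(f"ARD range: {min(ards):.3f} to {max(ards):.3f}")
print(f"NNT range: {min(nnts):.1f} to {max(nnts):.1f}")
```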
The NNT Depends on the Endpoint
To calculate the NNT, continuous data need to be
converted into dichotomous data; for instance, pain
scores over time need to be converted into change in
pain score from baseline to endpoint, and then converted to a response rate using a predefined criterion;
for example, proportion of subjects having an X% pain
reduction from baseline to endpoint. Therefore, to calculate the NNT of an active drug in a trial, one needs to
select 1) the derived outcome measure (eg, % responders); 2) the cutoff for the definition of response
(eg, 30% reduction in pain); and 3) the time at which
the observation of response/nonresponse is made (time
point). Several articles have shown that the NNT is
dependent on the outcome measured,11,28 the time
point of measurement,6,11,44,45 and the cutoff value of
response.31
The NNT Is Dependent on the Selection of Endpoint
Measured. Comparison of the NNT for total mortality,
mortality, and all cardiovascular events in various statin
trials showed that the NNT value was different for each
of these outcome measures.11 In an analgesic trial, it
seems obvious that the NNT for reduction in pain will not
be the same as the NNT for improvement in sleep.
Moore’s review36 of pregabalin efficacy presented the
NNT for 2 outcomes: the proportion of subjects with
≥30% or ≥50% pain reduction (% responders) and the
level of improvement in Patient’s Global Impression of
Change (Fig 1). The NNT for the level of improvement in
Patient’s Global Impression of Change was systematically
higher than the NNT for the proportion of responders
(regardless of the cutoff), although all 3 sets of NNT
values yield the same conclusion.36
The NNT Is Dependent on the Cutoff Chosen for the
Dichotomous Outcome. Transformation of continuous
data into dichotomous data, a process necessary to calculate the NNT, reduces the statistical estimation of outcomes and results in an important loss of information.

Table 4. NNT Varies More Widely Than ARD: Examples of Clinical Trials of Anticonvulsants in Neuropathic Pain
                                  NO. (%) OF PATIENTS WHO   NO. (%) OF PATIENTS WHO
                                  IMPROVED* WITH ACTIVE     IMPROVED* WITH
REFERENCE       ANALGESIC DRUG    TREATMENT                 PLACEBO                   NNT     ARD
Killian27       Carbamazepine     19/27 (70.4)              0/27 (0.0)                1.4     .704
Zakrzewska55    Lamotrigine       7/13 (53.8)               1/14 (7.1)                2.1     .467
Gilron19        Gabapentin        27/44 (61.4)              13/42 (31.0)              3.3     .304
Simpson45       Gabapentin        15/30 (50.0)              7/30 (23.3)               3.8     .267
Simpson46       Lamotrigine       86/150 (57.3)             30/77 (39.0)              5.4     .183
Otto40          Valproate         8/31 (25.8)               3/31 (9.7)                6.2     .161
Raskin41        Topiramate        74/214 (34.6)             23/109 (21.1)             7.4     .135
Drewes11        Valproate         6/20 (30.0)               4/20 (20.0)               10.0    .10
Serpell42       Gabapentin        32/153 (20.9)             21/152 (13.8)             14.1    .072
Finnerup17      Lamotrigine       4/21 (19.0)               4/21 (19.0)               ∞       .0
NOTE. Data are from Finnerup et al.16
*Response was based on 50% pain relief; if 50% pain relief could not be obtained directly from the publication, then the number of patients reporting at least good
pain relief or reporting improvement was used to calculate the NNT.

Thus, if the NNT were to become the measure of choice in
clinical trials, study design may evolve toward the use of
dichotomous measures (such as response rate) as primary
endpoints, which would reduce studies’ power (or lead
to increased sample size).15
Dichotomizing the data requires that the analyst
decide on a cutoff, which can be somewhat arbitrary;
not all studies use the same definition for the cutoff
point to determine the number of responders (most
studies use the 30% threshold, but others use a 20%
or 50% threshold). Dichotomization separates patients
into responders and nonresponders, losing the nuances of partial responders (those close to the
threshold). For instance, with a 30% cutoff, a subject with a 31% response
counts as a responder, whereas one with a
29% response does not, although the two outcomes are nearly identical.
The example illustrated in Table 5 shows that the NNT
is dependent on the cutoff of the outcome: the NNT for
the 30% pain reduction cutoff is systematically lower
than that for the 50% cutoff. In another example, the
NNT seems to increase with increasing cutoffs of percentages of response, especially for the lower doses
(Table 5)35; indeed, for the 5-mg dose, the NNT jumps
from 4.0 to 5.3 to 12.5 (roughly triples), and for the 10-mg
dose it rises from 3.7 to 5.9 to 7.1 (nearly doubles) for the
typical cutoffs of 20%, 30%, and 50% of pain relief,
respectively (Table 5). Thus, as with any comparative
measure, the selection of the cutoff chosen is critical
and the same exact cutoff needs to be used for
comparing data from different studies.
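The dependence of the NNT on the cutoff can be reproduced with a small dichotomization sketch. The patient-level values below are invented for illustration and do not come from any trial; the function names are likewise assumptions of this example.

```python
import math

# Hypothetical percent pain reduction from baseline, 10 patients per arm.
ACTIVE = [5, 18, 22, 29, 31, 35, 44, 52, 60, 75]
PLACEBO = [0, 3, 8, 12, 15, 19, 24, 36, 51, 58]


def response_rate(reductions, cutoff):
    """Share of patients whose pain reduction is at least `cutoff` percent."""
    return sum(r >= cutoff for r in reductions) / len(reductions)


def nnt_at_cutoff(active, placebo, cutoff):
    """Dichotomize both arms at `cutoff`, then return 1/ARD (infinite if ARD is 0)."""
    ard = response_rate(active, cutoff) - response_rate(placebo, cutoff)
    return math.inf if ard == 0 else 1.0 / ard


for cutoff in (20, 30, 50):
    rate_a = response_rate(ACTIVE, cutoff)
    rate_p = response_rate(PLACEBO, cutoff)
    nnt = nnt_at_cutoff(ACTIVE, PLACEBO, cutoff)
    print(f"cutoff {cutoff}%: active {rate_a:.0%}, placebo {rate_p:.0%}, NNT = {nnt:.1f}")
```

The same 20 hypothetical patients yield three different NNT values depending only on where the line is drawn. Note also that the active-arm patients with 29% and 31% reductions fall on opposite sides of the 30% cutoff despite nearly identical outcomes.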
The NNT Is Dependent on the Time Point of
Outcome. The NNT depends on when the outcome is
measured. First, it is expected that the response to treatment will vary over time: the response rate may be very
different at 1 week and at 4 weeks of treatment, and
so will the NNT calculated at these 2 time points. This
makes it difficult to compare NNT values from
studies of different durations, which is typical
in chronic indications.31
The 3 issues mentioned above are not unique to the
NNT, and any comparative measure of efficacy is likely
to be affected by the type of outcome considered, the
time point of the outcome, and the cutoff chosen.
However, what is unique to the NNT is the mechanism
by which one has to dichotomize the selected
endpoint. Some endpoints do not lend themselves
easily to simple dichotomization into responder or
nonresponder.
The NNT Depends on Internal and External
Factors Aside From Study Drug
Like all other measures of efficacy in clinical trials, the
NNT is affected by factors internal to the study (study
design and conduct) and factors external to the study
(real-world factors).

Table 5. NNT Varies With the Cutoff for the Dichotomous Variable (% Pain Relief)
ETORICOXIB    NNT FOR 20%    NNT FOR 30%    NNT FOR 50%
DOSE (MG)     PAIN RELIEF    PAIN RELIEF    PAIN RELIEF
5             4.0            5.3            12.5
10            3.7            5.9            7.1
30            3.6            4.0            4.3
60            2.3            2.4            2.6
NOTE. Data are from Moore et al.35

Figure 1. NNT for pregabalin calculated from various outcome measures (data from Moore et al36).

Figure 2. NNT for analgesics by year of publication (data from Finnerup et al16). Left panel: tricyclic antidepressants; right panel: anticonvulsants. Each point of data represents 1 clinical trial.

The NNT Depends on Factors Internal to the Study. A
few articles have reviewed the study factors that can
impact the NNT12,14,25,26,31,32,48,51 and we refer the
reader to these articles for further explanations. Briefly,
these factors include
1. The study population, as the type of pain indication
and severity of condition at baseline influence the
response to active treatment12
2. The study design, for example, number of sites12
3. The comparator: controls used as comparators have
huge impact on numbers generated and interpretations; possible controls include a true placebo, an
active placebo, no treatment, or another active
treatment3,53
4. The time period (as discussed in the previous
section)
5. The type of outcome (as discussed in the previous
section)
The heterogeneity of trials’ data makes pooling NNT
data across studies clinically irrelevant.14 Thus, if the
NNT is to be used to compare treatments, the therapies
must have been tested in similar populations with the
same condition at the same stage, using the same
comparator, time period, and outcomes.28 Therefore,
comparisons of analgesics using the NNT that fail to
follow these guidelines may not be valid. Approaches
to calculation of the NNT that adjust for such confounders have been described; using an NNT model to
account for differences in study design and conduct
allows for more meaningful comparisons.5
The NNT Depends on Factors External to the Study.
External factors have been cited as potential influencers
of observed net treatment effects. The year of publication is one such factor that may stand as a proxy for
multiple other influences (eg, availability of other treatments, approval status of the treatment, expectation of
treatment benefit, and shifts in sources of patients). A
valid measure of treatment effects should be robust to
factors other than the treatment itself. The NNT for antidepressant analgesics as well as for anticonvulsant analgesics has steadily increased (ie, apparent efficacy has
worsened) over time (Fig 2; reference 18). This could be
attributed to the fact that in antidepressant trials,
placebo response increased more over time (year of
publication) than the response to active treatment.52
Increasing placebo response over time relative to the
drug will distort any measure of efficacy, including the
group mean differences. However, this phenomenon
results in more distortion of the NNT over time compared
to alternative measures of efficacy (as discussed earlier).
Another, perhaps more obvious, reason for the increase
in NNT over time is that the experimental conditions
(study design and patient population) have evolved over
time, rendering the comparison of older and recent
studies inappropriate. Indeed, more recent trials, for
reasons that are not entirely clear, have greatly increased
placebo responses compared to older trials.10
Use of the NNT in Clinical Practice
The NNT has been used as a yardstick to compare the
efficacy of various drugs, to help clinicians choose
between active treatments, inform treatment guidelines, determine market access and reimbursement, and
support risk-benefit analyses.
But as placebo is not given in clinical practice, placebo-adjusted metrics like the NNT have limited usefulness in
clinical practice and can be irrelevant to real-world efficacy3 and misleading.43 For example, in the clinic, a physician facing the choice between 1) Treatment A with an
NNT of 10 that produces a response in 80% of patients
(but a response in 70% of placebo patients) or 2) Treatment B with an NNT of 2.9 that yields a response in 40%
of active-treated patients (but 5% in the placebo group)
would likely choose Treatment A because it shows greater
efficacy in a large proportion of patients—which is what
the clinician is looking for—even though it has a higher
NNT. Thus, if one wants to measure efficacy in a clinical
trial (as defined by the incremental benefit of the drug’s
pharmacology over the nonspecific effects of being in the
trial), one needs a placebo-adjusted robust measure like
the NNT. However, to measure efficacy in the real world,
the absolute response rate and similar unadjusted measures may provide a more relevant perspective.
Moreover, the NNT taken alone does not summarize all
necessary information for the clinician to make informed
decisions regarding treatment.10,51 Indeed, a physician
not only considers the efficacy of the drug but also takes
into account the rate and seriousness of adverse effects
of each treatment, the patient’s preference, cost, and
clinical experience (if any) with the treatments. Efforts
are underway to create quantitative metrics that
integrate these factors. Thus, decisions for use in the real
world need to be based on multiple pieces of
information, not just a solitary measure of efficacy
compared to placebo.
122
The Journal of Pain
Conclusions
Standardized measures of efficacy are needed to
compare analgesic efficacy across trials. The NNT was
developed to be a statistically robust and readily interpretable measure to rank the efficacy of treatments,
including analgesics. However, the NNT is associated
with specific weaknesses in calculation and interpretation that are not associated with other methods for integrating data from multiple trials. These weaknesses
include distortions in its calculation as placebo effects
approach treatment effects, with the possibility of infinite values; difficulties in estimating the precision of the
NNT particularly for CI calculation; and, contrary to the
original intent, difficulties in interpretation. The NNT
requires manipulation of the original variable by
selecting cutoff points for dichotomization, with the
NNT often changing depending on the cutoff. The NNT
suffers from problems common to other placebo-adjusted endpoints, including being sensitive to study-related and external factors (eg, year of publication)
that, when not considered, undermine the validity of
the NNT as a tool for comparing different treatments in
different studies. Approaches to improving the validity
of the NNT (and other summary measures) by adjusting
for such factors deserve further exploration. Finally, efficacy alone, as measured by any measure, is only one part
of the profile of a treatment to be considered in determining its place in therapy. Therefore, clinicians and
other stakeholders need to be aware of these issues to
correctly calculate, use, and interpret the NNT.
References
1. Altman DG: Confidence intervals for the number needed
to treat. Br Med J 317:1309-1312, 1998
2. Altman DG, Deeks JJ: Meta-analysis, Simpson’s paradox,
and the number needed to treat. BMC Med Res Methodol
2:3, 2002
3. Backonja M, Wallace MS, Blonsky ER, Cutler BJ, Malan P Jr,
Rauck R, Tobias J: NGX-4010 C116 Study Group: NGX-4010, a
high-concentration capsaicin patch, for the treatment of
postherpetic neuralgia: a randomised, double-blind study.
Lancet Neurol 7:1106-1112, 2008
4. Black HR, Crocitto MT: Number needed to treat: Solid
science or a path to pernicious rationing? Am J Hypertens
11:128S-134S, 1998. discussion 135S-137S
5. Caro JJ, Ishak KJ, Caro I, Migliaccio-Walle K, Klittich WS:
Comparing medications in a therapeutic area using an
NNT model. Value Health 7:585-594, 2004
6. Cates CJ: Simpson’s paradox and calculation of number
needed to treat from meta-analysis. BMC Med Res Methodol
2:1, 2002
7. Christensen PM, Brøsen K, Brixen K, Andersen M,
Kristiansen IS: A randomized trial of laypersons’ perception
of benefit of osteoporosis therapy: Number needed to treat
versus postponement of hip fracture. Clin Ther 25:
2575-2585, 2003
8. Christensen PM, Kristiansen IS: Number-needed-to-treat
(NNT)—Needs treatment with care. Basic Clin Pharmacol
Toxicol 99:12-16, 2006
9. Citrome L: Compelling or irrelevant? Using number
needed to treat can help decide. Acta Psychiatr Scand 117:
412-419, 2008
10. Dowie J: The ‘‘number needed to treat’’ and the
‘‘adjusted NNT’’ in health care decision-making. J Health
Serv Res Policy 3:44-49, 1998
11. Drewes AM, Andreasen A, Poulsen LH: Valproate for
treatment of chronic central pain after spinal cord injury. A
double-blind cross-over study. Paraplegia 32:565-569, 1994
12. Dworkin RH, Turk DC, Peirce-Sandner S, Burke LB,
Farrar JT, Gilron I, Jensen MP, Katz NP, Raja SN,
Rappaport BA, Rowbotham MC, Backonja MM, Baron R,
Bellamy N, Bhagwagar Z, Costello A, Cowan P, Fang WC,
Hertz S, Jay GW, Junor R, Kerns RD, Kerwin R, Kopecky EA,
Lissin D, Malamut R, Markman JD, McDermott MP,
Munera C, Porter L, Rauschkolb C, Rice AS, Sampaio C,
Skljarevski V, Sommerville K, Stacey BR, Steigerwald I,
Tobias J, Trentacosti AM, Wasan AD, Wells GA, Williams J,
Witter J, Ziegler D: Considerations for improving assay sensitivity in chronic pain clinical trials: IMMPACT recommendations. Pain 153:1148-1158, 2012
13. Ebrahim S: The use of numbers needed to treat derived
from systematic reviews and meta-analysis. Caveats and
pitfalls. Eval Health Prof 24:152-164, 2001
14. Edelsberg J, Oster G: Summary measures of number
needed to treat: How much clinical guidance do they
provide in neuropathic pain? Eur J Pain 13:11-16, 2009
15. Fedorov V, Mannino F, Zhang R: Consequences of dichotomization. Pharm Stat 8:50-61, 2009
16. Finnerup NB, Otto M, McQuay HJ, Jensen TS, Sindrup SH:
Algorithm for neuropathic pain treatment: An evidence
based proposal. Pain 118:289-305, 2005
17. Finnerup NB, Sindrup SH, Bach FW, Johannesen IL,
Jensen TS: Lamotrigine in spinal cord injury pain: A randomized controlled trial. Pain 96:375-383, 2002
18. Finnerup NB, Sindrup SH, Jensen TS: The evidence for
pharmacological treatment of neuropathic pain. Pain 150:
573-581, 2010
19. Gilron I, Bailey JM, Tu D, Holden RR, Weaver DF,
Houlden RL: Morphine, gabapentin, or their combination
for neuropathic pain. N Engl J Med 352:1324-1334, 2005
20. Halvorsen PA, Kristiansen IS: Decisions on drug therapies
by numbers needed to treat: A randomized trial. Arch Intern
Med 165:1140-1146, 2005
21. Halvorsen PA, Kristiansen IS, Aasland OG, Førde OH:
Medical doctors’ perception of the ‘‘number needed to
treat’’ (NNT). Scand J Prim Health Care 21:162-166, 2003
22. Hutton JL: Misleading statistics: The problems surrounding number needed to treat and number needed to harm.
Pharm Med 24:145-149, 2010
23. Hutton JL: Number needed to treat and number needed
to harm are not the best way to report and assess the results
of randomised clinical trials. Br J Haematol 146:27-30, 2009
24. Hutton JL: Number needed to treat: Properties and
problems. J R Statist Soc A 163:403-419, 2000
25. Katz N: Methodological issues in clinical trials of opioids
for chronic pain. Neurology 65:S32-49, 2005
26. Katz J, Finnerup NB, Dworkin RH: Clinical trial outcome
in neuropathic pain: Relationship to study characteristics.
Neurology 70:263-272, 2008
27. Killian JM, Fromm GH: Carbamazepine in the treatment of
neuralgia. Use of side effects. Arch Neurol 19:129-136, 1968
28. Kristiansen IS, Gyrd-Hansen D, Nexø J, Nielsen JB:
Number needed to treat: Easily understood and intuitively
meaningful? Theoretical considerations and a randomised
trial. J Clin Epidemiol 55:888-892, 2002
29. Laupacis A, Sackett DL, Roberts RS: An assessment of
clinically useful measures of the consequences of treatment.
N Engl J Med 318:1728-1733, 1988
30. Lesaffre E, Pledger G: A note on the number needed to
treat. Control Clin Trials 20:439-447, 1999
31. Mayne TJ, Whalen E, Vu A: Annualized was found better
than absolute risk reduction in the calculation of number
needed to treat in chronic conditions. J Clin Epidemiol 59:
217-223, 2006
32. McAlister FA: The ‘‘number needed to treat’’ turns 20—
and continues to be used and misused. CMAJ 179:549-553,
2008
33. Moore RA: What Is an NNT? In ‘‘What Is. Series.’’ April
2009. Available at: http://www.whatisseries.co.uk/whatis/pdfs/What_is_an_NNT.pdf
34. Moore RA: Pain and systematic reviews. Acta Anaesthesiol Scand 45:1136-1139, 2001
35. Moore RA, Moore OA, Derry S, McQuay HJ: Numbers
needed to treat calculated from responder rates give a better indication of efficacy in osteoarthritis trials than mean
pain scores. Arthritis Res Ther 10:R39, 2008
36. Moore RA, Straube S, Wiffen PJ, Derry S, McQuay HJ:
Pregabalin for acute and chronic pain in adults. Cochrane
Database Syst Rev 3:CD007076, 2009
37. Newcombe RG: Know your limitations: Not just for clinicians: Estimation of confidence intervals is not straightforward. J Public Health Med 21:481-482, 1999
38. Newcombe RG: Confidence intervals for the number
needed to treat. Absolute risk reduction is less likely to be
misunderstood. BMJ 318:1765-1767, 1999
39. Nexø J, Gyrd-Hansen D, Kragstrup J, Kristiansen IS,
Nielsen JB: Danish GPs’ perception of disease risk and
benefit of prevention. Fam Pract 19:3-6, 2002
40. Otto M, Bach FW, Jensen TS, Sindrup SH: Valproic acid
has no effect on pain in polyneuropathy: A randomized
controlled trial. Neurology 62:285-288, 2004
41. Raskin P, Donofrio PD, Rosenthal NR, Hewitt DJ,
Jordan DM, Xiang J, Vinik AI: Topiramate vs placebo in painful diabetic neuropathy: Analgesic and metabolic effects.
Neurology 63:865-873, 2004
42. Serpell MG: Gabapentin in neuropathic pain syndromes:
A randomised, double-blind, placebo-controlled trial. Pain
99:557-566, 2002
43. Sheridan SL, Pignone MP: Numeracy and the medical
student’s ability to interpret data. Eff Clin Pract 5:35-40, 2002
44. Sheridan SL, Pignone MP, Lewis CL: A randomized comparison of patient’s understanding of number needed to
treat and other risk reduction formats. J Gen Intern Med
18:884-892, 2003
45. Simpson DA: Gabapentin and venlafaxine for the treatment of painful diabetic neuropathy. J Clin Neuromuscul Dis
3:53-62, 2001
46. Simpson DM, McArthur JC, Olney R, Clifford D, So Y,
Ross D, Baird BJ, Barrett P, Hammer AE: Lamotrigine for
HIV-associated painful sensory neuropathies: A placebo-controlled trial. Neurology 60:1508-1514, 2003
47. Smeeth L, Haines A, Ebrahim S: Numbers needed to treat
derived from meta-analyses sometimes informative, usually
misleading. BMJ 318:1548-1551, 1999
48. Stang A, Poole C, Bender R: Common problems related
to the use of number needed to treat. J Clin Epidemiol 63:
820-825, 2010
49. Suissa D, Brassard P, Smiechowski B, Suissa S: Number
needed to treat is incorrect without proper time-related
considerations. J Clin Epidemiol 65:42-46, 2012
50. Thabane L: A closer look at the distribution of number
needed to treat (NNT): a Bayesian approach. Biostatistics 4:
365-370, 2003
51. Tramèr MR, Walder B: Number needed to treat (or
harm). World J Surg 29:576-581, 2005
52. Walsh BT, Seidman SN, Sysko R, Gould M: Placebo
response in studies of major depression: Variable, substantial, and growing. JAMA 287:1840-1847, 2002
53. Webster LR, Malan TP, Tuchman MM, Mollen MD,
Tobias JK, Vanhove GF: A multicenter, randomized,
double-blind, controlled dose finding study of NGX-4010,
a high-concentration capsaicin patch, for the treatment of
postherpetic neuralgia. J Pain 11:972-982, 2010
54. Wu LA, Kottke TE: Number needed to treat: Caveat
emptor. J Clin Epidemiol 54:111-116, 2001
55. Zakrzewska JM, Chaudhry Z, Nurmikko TJ, Patton DW,
Mullens EL: Lamotrigine (Lamictal) in refractory trigeminal
neuralgia: Results from a double-blind placebo controlled
crossover trial. Pain 73:223-230, 1997