CLINICAL
RESEARCH
DCTH - 4•2014 -196-204
How to Conduce Critical Appraisal
for Clinical Trials: a Detailed Analysis
of Population, Intervention and Outcome
Lucia Manfredi1, Laura Postacchini1, Silvia Tedesco1, Giovanni Pomponio2
1 Clinica Medica, Dipartimento di Scienze Cliniche e Molecolari, Università Politecnica delle Marche Ancona;
2 Clinica Medica, Dipartimento di Medicina Interna, Ospedali Riuniti Ancona
SUMMARY
The international medical literature constantly produces a large number of clinical studies, but only a small percentage of them meet the minimum standard of quality when subjected to systematic critical appraisal. The objective of this paper is therefore to provide the tools needed for the critical analysis of a study.
The three phases of the critical appraisal process are: examination of internal validity (detailed analysis of the basic elements composing a trial), relevance of the results (analysis of effect size and accuracy of results), and applicability or external validity (comparison of the basic elements of a study with the real-life setting).
The first step is to critically review the fundamental elements of any clinical research: Population, Intervention, Comparison and Outcome.
To ensure good methodological quality, with results transferable to clinical practice, the following criteria have to be fulfilled: the population must be representative and described in detail (inclusion and exclusion criteria); the interventions (experimental and concomitant) must be carefully described so that the trial is reproducible; the outcome must be as meaningful as possible from a clinical and organizational standpoint and in terms of benefit (clinically relevant and/or surrogate endpoints, and their specific characteristics).
Key words: evidence-based medicine, critical appraisal, clinical trial methods.
Correspondence:
Lucia Manfredi
Clinica Medica - Dipartimento di Scienze Cliniche e Molecolari
Università Politecnica delle Marche Ancona
Via Conca, 71 - 60126 Ancona, Italy
E-mail: [email protected]
◗◗◗ INTRODUCTION
In 1993 Haynes published a controversial article in a major American scientific journal (1): using a table of standardized criteria for the assessment of clinical studies, he stated that a surprisingly high percentage of scientific papers that appeared in major international journals did not meet the minimum level of quality.
In fact, 254 papers published from 1994 to 1998 in the New England Journal of Medicine underwent systematic critical appraisal: only 16.9% met the minimum standard; the quality was even lower for articles published in many other relevant journals (Table 1).
About ten years later, despite these data, JAMA published an article entitled “Poor-quality medical research: what can journals do?” (2), and the British Medical Journal published an editorial entitled “The scandal of poor epidemiological research” (3).
TABLE 1 • Number of articles from major scientific journals that were reviewed and subjected to critical appraisal.
Journal | Articles reviewed for publication in ACP Journal Club between 1994 and 1998 (n) | Articles that passed the critical appraisal (%)
N Engl J Med | 254 | 16.9
JAMA | 303 | 12.2
Lancet | 410 | 7.3
Ann Intern Med | 246 | 13.3
BMJ | 283 | 8.5
Arch Intern Med | 262 | 10.3
Circulation | 541 | 2.8
Am J Med | 298 | 3.4
J Intern Med | 157 | 10.8
Neurology | 445 | 1.3
Chest | 780 | 1.7
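The percentages in Table 1 can be turned back into approximate article counts and pooled across journals. The following is a minimal sketch in Python: the counts are reconstructed from rounded percentages and are therefore approximations, and the helper name approx_passed is introduced here only for illustration.

```python
# Table 1 as reported: journal -> (articles reviewed, % passing critical appraisal)
TABLE_1 = {
    "N Engl J Med": (254, 16.9),
    "JAMA": (303, 12.2),
    "Lancet": (410, 7.3),
    "Ann Intern Med": (246, 13.3),
    "BMJ": (283, 8.5),
    "Arch Intern Med": (262, 10.3),
    "Circulation": (541, 2.8),
    "Am J Med": (298, 3.4),
    "J Intern Med": (157, 10.8),
    "Neurology": (445, 1.3),
    "Chest": (780, 1.7),
}


def approx_passed(n_reviewed: int, pct_passed: float) -> int:
    """Approximate number of articles that passed, rebuilt from the rounded percentage."""
    return round(n_reviewed * pct_passed / 100)


total_reviewed = sum(n for n, _ in TABLE_1.values())
total_passed = sum(approx_passed(n, pct) for n, pct in TABLE_1.values())
pooled_pct = 100 * total_passed / total_reviewed

for journal, (n, pct) in TABLE_1.items():
    print(f"{journal:15s} {n:4d} reviewed, about {approx_passed(n, pct):3d} passed ({pct}%)")
print(f"Pooled: about {total_passed} of {total_reviewed} articles ({pooled_pct:.1f}%)")
```

Pooled over the eleven journals, the pass rate is roughly 6%, lower than the figure for any of the general medical journals at the top of the table, because the high-volume specialty journals have the lowest pass rates.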
The critical points are:
• The standard article review process (peer review) can no longer be considered sufficiently effective.
• Knowledge of how to conduct an experimental study is not sufficiently widespread.
• Published clinical trials are described unclearly and incompletely.
• Given the large number of biomedical journals in circulation, an article rejected by one journal may still be accepted by another, which lowers the average quality of published articles.
• If the results of a clinical trial are not what was expected, they may later be manipulated with suitable statistical methods to appear more favorable.
A study published in 2005 showed that the results of nearly a third of 45 clinical studies published between 1990 and 2003 in three major medical journals (New England Journal of Medicine, JAMA and Lancet) had either never been confirmed or had been completely refuted by subsequent clinical studies. Nonetheless, these studies had been cited many times (>1000) in the international literature (4). The problem of critical appraisal is compounded by the large number of published studies.
For example, more than 150 clinical trials on the treatment of refractory chronic graft versus host disease (cGVHD) have been published in the last 15 years (Figure 1). Every reader approaching a clinical study for the first time has to answer these three questions (Table 2):
1. Can I trust the authors’ conclusions?
2. What are the results of the study?
3. Can I transfer these results to my clinical practice?
L. Manfredi, et al.
◗◗◗ CRITICAL APPRAISAL OF A CLINICAL TRIAL
The first step in the critical analysis of a clinical trial is to examine the fundamental elements of any clinical research: Population, Intervention, Comparison and Outcome.
These elements must be described in detail in the report so that every reader can complete the process of critical appraisal.
FIGURE 1 • Clinical studies published between 1998 and 2013 about treatment of cGVHD.
TABLE 2 • Stages of the critical appraisal process.
Stage of the critical appraisal process | Question to be answered | Elements useful to answer the question
1. Examination of internal validity | Can I trust the authors’ conclusions? | Critical analysis of the basic elements of a clinical study (population, intervention, comparison, outcome and power of the trial)
2. Relevance of the results | What are the results of the study? | Analysis of effect size (relevance); analysis of accuracy of the results; evaluation of clinical and statistical significance
3. Applicability (external validity) | Can I transfer these results to my clinical practice? | Comparison of the basic elements of a study (population, intervention, comparison, outcome) with our real-life settings; critical examination of the experimental study with evaluation of quality criteria; knowledge about population characteristics and setting of interest
To guarantee the quality of reports, guidelines for reporting clinical trials, developed by editors and scientific communities, are accessible through the EQUATOR project (www.equator-network.org).
CONSORT, for reporting randomized clinical trials, and STROBE, for observational research, are the best-known examples.
◗◗◗ POPULATION
The term “population” refers to the set of subjects exposed to the experimental intervention and to any comparison intervention. For a study to be of good quality, the population has to be representative and described in detail, so that readers can compare it with their own clinical practice.
To assess whether the population can be considered representative, it is necessary to examine the selection process. The inclusion criteria, which describe the main characteristics of the enrolled population, and the exclusion criteria, which describe the main characteristics of the population excluded from the trial, must be clearly stated, defined at the outset, and validated (Box 1).
If the criteria are overly restrictive, transferring the results to the general population becomes more difficult.
Using the inclusion and exclusion criteria, a sample is selected and often divided into two or more groups, to which an experimental or a comparison intervention is applied.
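The effect of restrictive selection criteria can be made concrete with a small sketch. The Python example below encodes the age, BMI and comorbidity criteria listed in Box 2 and applies them to a ward case mix; the six patients are invented purely for illustration, and the remaining Box 2 criteria (fracture within 48 hours, informed consent) are assumed to be met by everyone.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    bmi: float
    uncontrolled_diabetes: bool = False
    heart_failure: bool = False
    previous_prosthesis: bool = False


def is_eligible(p: Patient) -> bool:
    """Apply the Box 2 criteria: age 18-75, BMI not above 30, no listed comorbidity."""
    meets_inclusion = 18 <= p.age <= 75
    meets_exclusion = (
        p.uncontrolled_diabetes
        or p.heart_failure
        or p.previous_prosthesis
        or p.bmi > 30
    )
    return meets_inclusion and not meets_exclusion


# Hypothetical hip-fracture ward (invented data, not taken from any study)
ward = [
    Patient(age=82, bmi=24.0),
    Patient(age=79, bmi=27.5, heart_failure=True),
    Patient(age=68, bmi=31.2),
    Patient(age=71, bmi=26.0),
    Patient(age=88, bmi=22.0, uncontrolled_diabetes=True),
    Patient(age=64, bmi=25.0),
]

eligible = [p for p in ward if is_eligible(p)]
eligible_fraction = len(eligible) / len(ward)
print(f"{len(eligible)} of {len(ward)} ward patients would have been eligible "
      f"({eligible_fraction:.0%})")
```

When only a small fraction of the patients seen in practice would have qualified for enrollment, the reader has a first quantitative signal that the trial results may not transfer to that setting.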
Homogeneity across the groups (before the application of any intervention) ensures a similar baseline risk of developing the outcome of interest.
A balanced distribution among groups of the risk factors (known and unknown) for developing the outcome is therefore of the utmost importance (Box 2).
BOX 1 • Analysis of the population enrolled in a clinical study.
Title: Treatment of refractory chronic GVHD with rituximab: a GITMO study. Zaja F. et al., BMT 2007 Aug; 40(3):273-7
Text: This study was conducted retrospectively to establish GITMO transplant centre experience in the use of Rituximab for the treatment of refractory cGVHD, that is, cGVHD already treated and not responsive to at least one prior treatment and/or necessitating chronic administrations of medium to high-dose steroids.
Patient details, time and type of transplant, time of cGVHD onset, organ involvement, previous treatments, therapeutic schedule, safety, response rate and response duration were investigated.
Flaws: In this report, the meaning of “patient with refractory chronic GVHD” is not clearly stated: the steroid and immunosuppressive dosage used to define refractoriness is not specified. Even if rituximab efficacy is demonstrated, because the enrolled population is not well described it is impossible to say whether the drug could be useful in real-life clinical settings.
Possible consequences:
• If the population enrolled in a trial is not correctly described, the results are not transferable to actual clinical practice.
• If the results are applied to a type of patient not comparable to the enrolled population, the consequences may be unpredictable.
• If the authors do not explain what “refractoriness” means (e.g. dosage of previous or concomitant therapies), we cannot know whether a patient in our own clinical practice has baseline characteristics similar to those of the enrolled population.
BOX 2 • Analysis of baseline risk in the population enrolled for a trial.
Title: Effectiveness of an air mattress for pressure ulcers prevention
Text: In an orthopedic unit a consecutive series of patients with hip fracture was enrolled.
Inclusion criteria: age 18-75 years, hip fracture < 48 hours awaiting prosthesis, ability to provide written informed consent.
Exclusion criteria: uncontrolled diabetes, heart failure, previous prosthesis surgery, BMI > 30
Flaws: The enrolled population has a low risk profile for a hip fracture population: only younger patients (none older than 75 years) without relevant comorbidities, and consequently at low risk of pressure ulcers.
The results of this study are not applicable in a real-life setting where the baseline risk is very different from that of the population enrolled in the trial.
Possible consequences: When analyzing the population characteristics of a clinical trial, it is very important to verify whether the enrolled population has the same “baseline risk” as the patients you meet in real life.
Only in that case can you apply the results of a study to your patients in actual clinical practice.
A direct comparison of baseline risk between the study population and real practice is possible if:
• the incidence of the outcome is known for the population seen in clinical practice;
• the study is randomized and includes a control arm receiving either no treatment or placebo.
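When both conditions hold, the comparison reduces to simple arithmetic: the event rate in the untreated (or placebo) control arm is set against the incidence observed locally. The Python sketch below illustrates this; all numbers are hypothetical and the function names are introduced here only for illustration.

```python
def event_rate(events: int, patients: int) -> float:
    """Proportion of patients who developed the outcome of interest."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return events / patients


def baseline_risk_ratio(trial_control_rate: float, practice_rate: float) -> float:
    """How many times higher (or lower) the local baseline risk is than in the trial."""
    if trial_control_rate <= 0:
        raise ValueError("trial control rate must be positive")
    return practice_rate / trial_control_rate


# Hypothetical figures: 6 events among 120 controls in the trial,
# 45 events among 300 comparable patients in local practice.
trial_control = event_rate(events=6, patients=120)
local_practice = event_rate(events=45, patients=300)
ratio = baseline_risk_ratio(trial_control, local_practice)

print(f"Trial control arm: {trial_control:.1%}")
print(f"Local practice:    {local_practice:.1%}")
print(f"Local baseline risk is {ratio:.1f} times that of the trial population")
```

A ratio far from 1 indicates that the trial population and the patients seen in practice differ in baseline risk, so the size of the benefit reported in the trial should not be assumed to carry over unchanged.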
When a subgroup analysis is provided by the authors, readers should check whether:
a) the technique used to create the groups (for example, the randomization technique) is appropriate and well described;
b) group characteristics have been clearly described and are balanced.
A study with a relevant gap from real practice, or with an imbalance between the experimental and control groups, can still be considered of acceptable methodological quality if the authors have carried out a sensitivity analysis.
◗◗◗ INTERVENTION
The main aim of a clinical trial is to test either the efficacy (studies in the earliest stages) or the effectiveness of a specific intervention (for example pharmacological, organizational, diagnostic or educational).
In randomized controlled trials the intervention under evaluation is defined as “experimental”.
It is always compared with one of the following:
• Non-active intervention (placebo).
• Usual clinical practice.
• One or more active interventions of known efficacy (active comparators).
• No intervention.
During the study, the enrolled patients may receive concomitant treatments other than the experimental one. It is important to note that the outcomes analyzed in the trial could be influenced by these treatments.
For a clinical study to be of good methodological quality, a detailed description of the interventions has to be available so that the trial can be reproduced. In the final full report, the “materials and methods” section must provide details about the characteristics of every intervention.
In a randomized controlled trial testing a drug, the pharmacological features, method and frequency of administration, length of the trial, and criteria for dose adjustment or discontinuation have to be defined a priori and clearly described in the report.
If instead the authors have tested a diagnostic or surgical procedure, a behavioral treatment, or an organizational or educational model, they must report sufficient detail to allow readers to replicate the experiment in their own clinical practice.
For example, the type of professional involved and his/her level of expertise, the process of investigator training and any learning curve must be reported. If any of this information is missing or not described in sufficient detail, any positive result of the trial will not be useful for clinical decisions in actual practice (Box 3).
BOX 3 • Analysis of experimental and comparison interventions in a clinical trial.
Title: Efficacy of Mycophenolate Mofetil in the Treatment of Chronic Graft-versus-Host Disease. Lopez F. et al., BBMT 2005; 11:307-313
Text: This retrospective review was approved by the Institutional Review Board of the City of Hope National Medical Center. The review included all patients at our institution who filled prescriptions for Mycophenolate Mofetil (MMF) for treatment of cGVHD between March 1999 and January 2001… A total of 34 patients were identified who were treated with MMF for cGVHD… All patients initiated treatment with MMF no earlier than day 80 after transplantation and continued prior therapy with PSE, CSA, or FK506 when MMF was started. MMF was started in most adult patients at 500 mg twice daily (BID) and then escalated if tolerated to 1000 mg BID. Data recorded at the initiation of MMF and at a minimum follow-up of 6 months included the date of diagnosis of cGVHD and the starting date of treatment with MMF; the type of onset of cGVHD (progressive, de novo, or quiescent); and the sites of cGVHD-related organ involvement. Clinical organ involvement (skin, mouth, and eyes) was also described as mild, moderate, or severe by the primary physician. In addition, the extent of skin involvement, platelet count, and liver function tests were quantified. The immunosuppressive therapy previously used for the prevention and treatment of GVHD, the current immunosuppressive medications, and PSE doses at the start of MMF therapy and at last follow-up were recorded…
Flaws: In this retrospective review, MMF, used as first- or second-line therapy, was added to standard care in 34 patients to assess its efficacy in the treatment of chronic GVHD. From the “patients and methods” section we understand that concomitant immunosuppressive medications were administered; however, detailed information is not provided.
The lack of this information makes it difficult to isolate the true effect of the experimental drug.
Possible consequences: All interventions (experimental, comparison, concomitant interventions and supportive care) must be described in detail (dosage, methods and timing of administration); otherwise we are not able to reproduce the same treatment schedule in clinical practice and the trial becomes irrelevant.
◗◗◗ OUTCOME
The outcome of a clinical study is the aim (clinical, organizational, economic, etc.) to be achieved with the experimental intervention. It can also be referred to as the endpoint.
For the study to be considered applicable, the outcome has to be as meaningful as possible from a clinical and organizational standpoint and in terms of benefit.
Classification of Outcomes
The outcome can be dichotomous (for example death or survival), discrete (for example improved, unchanged, worsened), or continuous (for example extent of skin thickening). Endpoints can be divided into two subgroups:
• Clinically relevant: as defined by the Food and Drug Administration, it evaluates how a patient feels, functions or survives; it is in turn divided into “hard” outcomes (death, autonomy, end-stage organ failure) and “soft” outcomes (for example quality of life, clinical symptoms).
• Surrogate: a laboratory value or a physical sign used as a substitute for a clinically relevant outcome (Box 4).
BOX 4 • Analysis of clinically relevant and surrogate outcomes in a clinical trial.
Title: Weekly rituximab followed by monthly rituximab treatment for steroid-refractory chronic graft-versus-host disease: results from a prospective, multicenter, phase II study. Kim J.S. et al., Haematologica. 2010 Nov;95(11):1935-42.
Text: This study was an open-label, multicenter, prospective, phase II study to evaluate the efficacy of rituximab in terms of response to treatment, changes in QOL and discontinuation of steroids. Eligible subjects were patients with steroid-refractory chronic GVHD who required treatment…
Flaws: In this study of Rituximab for the treatment of refractory chronic GVHD, the authors chose three different types of outcome: the first is the clinical response (hard outcome) and the second is the change in quality of life (soft outcome). Both are clinically relevant. The third is the discontinuation of steroids, which is a surrogate outcome: it is not always true that steroid discontinuation leads to a clinical improvement of the patient.
Possible consequences: Trials evaluating only surrogate outcomes may not add relevant information about patients’ health.
If a clinical study considers clinically relevant outcomes, especially “hard” endpoints, a large number of enrolled patients, a long observation period and substantial financial and human resources are required to increase the chance of observing the predetermined endpoints.
Because such endpoints are difficult to detect, surrogate outcomes are often preferred in clinical trials. Indeed, this type of endpoint is measured more easily and requires a smaller sample, a shorter observation period and fewer human and financial resources to detect a difference in the incidence of a single endpoint between the experimental and comparison groups.
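The difference in required sample size between a rare hard endpoint and a frequent surrogate endpoint can be illustrated numerically. The Python sketch below uses a standard normal-approximation formula for comparing two proportions, which is not given in this article and is assumed here for illustration; the event rates, the two-sided alpha of 0.05 and the power of 80% are likewise hypothetical choices.

```python
import math
from statistics import NormalDist


def n_per_group(p_control: float, p_experimental: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm needed to detect p_control vs p_experimental
    (two-sided test, normal approximation for two independent proportions)."""
    if p_control == p_experimental:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    effect = (p_control - p_experimental) ** 2
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect)


# Hypothetical hard endpoint: mortality reduced from 10% to 8%
n_hard = n_per_group(0.10, 0.08)
# Hypothetical surrogate endpoint: abnormal laboratory value reduced from 50% to 35%
n_surrogate = n_per_group(0.50, 0.35)

print(f"Hard endpoint:      {n_hard} patients per arm")
print(f"Surrogate endpoint: {n_surrogate} patients per arm")
```

Under these assumptions the hard endpoint needs thousands of patients per arm while the surrogate needs a few hundred, which is the practical reason surrogate endpoints are attractive and also why their validity has to be checked carefully.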
Characteristics of a Good Surrogate Outcome
A surrogate outcome can be considered equivalent to a clinically relevant outcome if all the following conditions are fulfilled (5):
a) There is a strong, independent, consistent association between the surrogate outcome and the clinical endpoint.
b) There is evidence from at least one randomized controlled trial in other drug classes that improvement in the surrogate endpoint has consistently led to improvement in the target endpoint.
c) There is evidence from at least one randomized controlled trial in the same drug class that improvement in the surrogate endpoint has consistently led to improvement in the target endpoint.
Composite Endpoints
In a clinical trial it is also possible to find composite endpoints (artificial outcomes), which combine elementary outcomes in order to increase the rate at which events accrue and to reduce the sample size and the economic resources required to complete the study. Many complex rating scales (for example “PSST: Pressure Sore Status Tool” and “DAS28: Disease Activity Score”) can be considered composite outcomes. Their clinical relevance and meaningfulness depend on:
• The characteristics of each single element of the scale.
• How the different components behave during the trial.
• The different sensitivity of each element to any systematic errors affecting the trial.
Example: a combined assessment tool (for example DAS28, used in the management of rheumatoid arthritis, which considers swollen and tender joints, erythrocyte sedimentation rate or C-reactive protein, and a visual analogue scale) can be particularly sensitive to a change in a biological parameter (for example C-reactive protein, which can be considered a surrogate outcome), independently of the performance of clinically more relevant components such as the count of painful or swollen joints.
In general, a clinical study takes into account a primary outcome and one or more secondary outcomes; this distinction does not necessarily imply a difference in their nature or biological importance, but it has fundamental consequences for data analysis.
Indeed, the definition of the primary outcome drives the sample size estimation and the statistical power calculation.
Other Factors Relevant for Endpoint Significance
A number of further factors can influence the significance of a particular endpoint (Figure 2).
Among them:
• Stakeholders’ values (patient-centered, expert-centered or organisation-centered outcome).
• The intrinsic characteristics and limits of the measure and of the evaluation instrument used (example: itch, skin thickening).
• Confounding factors affecting the outcome attribution (example: the rules followed to decide admission to or discharge from hospital, which influence the length of hospitalization).
• Time to target and duration (example: length of remission or time to healing).
FIGURE 2 • How to define an outcome as significant.
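The DAS28 example given under Composite Endpoints can be made concrete. The Python sketch below uses the commonly cited DAS28-CRP weighting, which is not reported in this article and should be treated as an assumption to be verified against the original scale documentation; the patient values are hypothetical. It shows how a fall in C-reactive protein alone lowers the composite score while the joint counts stay unchanged.

```python
import math


def das28_crp(tender_joints: int, swollen_joints: int,
              crp_mg_l: float, global_health_vas: float) -> float:
    """DAS28-CRP using the commonly cited coefficients (assumed, not from this article).

    tender_joints, swollen_joints: counts out of 28 joints
    crp_mg_l: C-reactive protein in mg/L
    global_health_vas: patient global health on a 0-100 mm visual analogue scale
    """
    return (0.56 * math.sqrt(tender_joints)
            + 0.28 * math.sqrt(swollen_joints)
            + 0.36 * math.log(crp_mg_l + 1)
            + 0.014 * global_health_vas
            + 0.96)


# Hypothetical patient: joint counts and global health unchanged, only CRP falls
baseline = das28_crp(tender_joints=8, swollen_joints=6,
                     crp_mg_l=40.0, global_health_vas=50)
crp_only = das28_crp(tender_joints=8, swollen_joints=6,
                     crp_mg_l=5.0, global_health_vas=50)

print(f"Baseline score:               {baseline:.2f}")
print(f"Score after CRP falls alone:  {crp_only:.2f}")
print(f"Change driven by CRP only:    {baseline - crp_only:.2f}")
```

The composite score improves although no clinically more relevant component has changed, which is the situation the reader should look for when a composite endpoint is reported without the behavior of its individual components.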
◗◗◗ REFERENCES
1. Haynes RB. Where’s the meat in clinical journals. ACP Journal Club. 1993; 119: A23-A24.
2. Altman DG. Poor-quality medical research: what can journals do? JAMA. 2002; 287: 2765-7.
3. Von Elm E, Egger M. The scandal of poor epidemiological research. BMJ. 2004; 329: 868-9.
4. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005; 294: 218-28.
5. Guyatt G. Users’ guides to the medical literature. McGraw-Hill Ed. 2008; 329.