Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
What is wrong with What is said? Hasan Yazici, MD University of Istanbul Vitamin-E in gout the Hypothesis • Several recent reports hint that vitamin E can lower serum uric acid levels (1-3). • These are either single case reports or uncontrolled, open studies in small groups of patients. • Thus we tested the hypothesis to see whether vitamin E was effective treatment for gout in a 6 month, double blind, placebo controlled study. What is wrong? A hypothesis should not be formulated as a question. Methods - 1 • The patients were randomized into the vitamin E (n = 60) and placebo (n= 60) groups. • After randomization each patient signed a written informed consent. • The patients were assessed monthly with serum uric acid, hematocrit, serum creatinine, cholesterol and testosterone levels. • The number of gout attacks were also noted. What is wrong? • The randomization follows, not precedes, the informed consent. Methods -2 • 30 patients had to leave the study for various reasons. – 20 patients from the active drug group and 10 from the placebo left the study before its completion (ns). – Thus the final analysis for drug efficiency was made among 90 patients. • Among the 20 patients who dropped out from the treatment group 3 moved to another city. • 15 complained of severe indigestion and no information was available for the remaining 2. • Dyspepsia was the reason for withdrawal in 7 patients in the placebo group while headaches were the main reason for discontinuation of treatment in the remaining 3. What is wrong? • The intention to treat principle is neglected. Intention to treat another study • 90 patients were initially randomized into 30 patients each of placebo, QXY 2mg qd and QXY 5mg qd groups. • One patient in the QXY 5mg group developed pneumonia and had to be hospitalized 10 days after treatment started. • The patient was withdrawn and, according to the protocol, another patient was recruited. • Thus, the intention to treat analysis brought the number analyzed to 31 in the QXY 5mg group and the total number to 91. What is wrong? • How was randomization achieved in the newcomer? • The newcomer inflates the denominator when looking at harm. Methods-3 • The allocation of 60 patients to the active drug and the placeo arms ensured a 86% power to detect a difference rate of 25% in the uric acid levels between the 2 groups using a 2sided Fisher’s exact test with α set at 0.05 What is wrong? • In a power calculation it is not enough to state the magnitude of change, the anticapated event rates should also be given. • Ensured 86% power? Sample size calculations 4 components I. Type I (alpha) error II. Type II (beta) error III. Event rate in the control group IV. Event rate in the treatment group Power calculations for efficacy and harm • The primary end point was the achievement of at least 25% improvement according to the ACR criteria (ACR20) at week 20. • The sample size of 180 patients per group was chosen to ensure an adequate safety evaluation. • The sample size also ensured that there was 90% power to detect a significant difference in the proportion of ACR20 responders between the treatment groups using a significance level of 0.05, assuming 20% and 42% of the patients in the control and the treatment groups respectively achieved ACR 20 responses. What is wrong? • No basis for the assumptions related to harm • Post hoc power calculations Powering for safety • A sample size of 300 patients each in the study drug and the control groups was determined to demonstrate a specific adverse event rate of 1% or less with 95% confidence. What is wrong? The probability of an event not happening does not give us information about the likelihood of events happening in the different arms of the study. The zero patient method • If one screens n individuals and does not find an attribute y among this group; • Then one can conclude that the frequency of y is less than 1/0.33n among the n individuals surveyed, with 95% confidence. • This approximation is true for prevalances < 0.02 of the attribute surveyed. H Yazici et al Rheumatology Oxford (2001) Efficacy a subgroup analysis • The uric acid levels decreased to 5.2 ± 1.5mg/dL from a baseline of 9.3 ± 2.3 mg/dL in the treatment and from 9.7 ± 1.9 mg/dL in the placebo groups. • There were no differences in the lowering of uric acid between the 2 groups (p>0.05). • There were also no differences in the number of gout attacks between the 2 groups of patients. However a subgroup analysis was also done. • Among those patients in the treatment group who gave a history of more than 5 gouty attacks per year, as compared to those who had less than 3, the uric acid levels were significantly lower after treatment (p< 0.05). Reviewer comments • The findings in the subgroup analysis, conducted among a small number of patients, should be interpreted with caution. • Even though the authors found a statistically significant difference in efficacy the study was not primarily planned to assess this difference. • This lessens the external validity of these results. What is wrong? The main problem with this subgroup analysis is not that the subgroup is small. It is an effort in the direction of proving that the new drug at hand is effecacious. Harm another subgroup analysis • Table 3 gives the rate of observed adverse events. Indigestion was rather frequent in either group; 15/45 patients in the treatment and 14/45 in the placebo groups. • A subgroup analysis was also done after the patients who participated in the trial were questioned about a pre-trial history of indigestion. It turned out that 20/45 patients in the treatment and 18/45 in the placebo group had pre-trial indigestion. • A further analysis among these patients revealed that 14/20 in the treatment and 2/18 in the placebo group (p< 0.03) with a history of pre-trial indigeston also reported indigestion during the trial. Reviewer comments • The findings in the subgroup analysis, conducted among a small number of patients, should be interpreted with caution. • Even though the authors found a statistically significant difference in harm the study was not primarily planned to assess this difference. This lessens the external validity of these results. What is wrong? Again the main issue is not the size of the subgroup. This subgroup analysis looks at the possibility that this new drug might not be all that harmless. Thus the exercise is in the direction of falsification and therefore is justified. Dr. Pincus study • Recently, in a placebo controlled withdrawal study, Dr. Pincus showed that small doses of prednisone (1- 4 mg/day) were significantly effective in the management of rheumatoid arthritis. • The study was conducted among 31 patients Reviewer comments This beneficial effect of prednisone described among a small number of patients should be interpreted with extreme caution even if the authors found a statistically significant difference. The number of patients studied, thus the study power, was simply too small. What is wrong? • The reviewer is partially right. • The issue of power applies only if we are missing a more important outcome that would have been more evident in a larger group. • External validity? The small number of patients might be quite different from the real life RA patients. • However (!) this is not simply an issue of numbers but of patient selection. The coxib study • 6000 patients with osteoarthritis of the knee were randomized to recieve either the new coxib (NCB) or the traditional (TCB). • 40% of the NCB and 35% of the TCB patients found total pain relief. • Hypertension was a problem in 2.5 % of the NCB and 4.5 % of the TCB patients. • It was concluded: a. NCB decreased pain by 13% b. There was 45% less hypertension with NCB What is wrong? • Nothing. Needs more interpretation. • A NNT analysis will tell you that you have to treat 10 patients to see the superiority of NCB over TCB in 1 patient. • A NNH analysis will say that you have to treat 50 patients with TCB to harm 1 more patient with hypertension as compared to using NCB. NNT & NNH • Relative risk (RR): outcome ratio in the control group/ outcome ratio in the control group • Absolute risk reduction (ARR): difference in the ratios • Relative risk reduction (RRR): ARR/outcome in the control group OR 1-RR • NNH: 1/ARR (frequency of hypertension in the control group 0.045, frequency of hypertension in the study group 0.025, ARR= 0.020; NNT = 50.) A withdrawal study • Two previous double blind studies of colchicine in BS had shown no superiroity of this agent over placebo in treating the oral ulcers in this condition. • Recently a withdrawal study was done among those patients who claimed benefit from colchicine. They were randomized to contiune to receive the active drug or placebo. • After 3 months those who stopped taking colchicine had significantly more ulcers (sign test, p = 0.02). Reviewer comments • This study is interesting but is of limited use. A problem with all withdrawal trials, they seldom represent real life use. • The authors used the “sign test” to analyze the differences in oral ulcers between the 2 groups. Is this a new test? I suspect it is not quite powerful. Why not use the more powerful tests of significance to show the real differences, if any? What is wrong? (I) • The main problem with withdrawal studies is that they are not done often enough. • They provide excellent information about possible a type II error in previous, traditional RCT’s. • An important issue is that they do not give a fair picture of drug associated harm. What is wrong? (II) • The sign test is a time honored, conservative tool. • Significance in a conservative test gives more validity to the significant differences observed. The mighty “p” Among the 96 patients allocated to the new medication there were 3 cases of myocardial infarction while the same was true for 2/94 patients allocated to placebo (p=0.86). What is wrong? • Better give it with the statistic used (sign test, p = 0.02). • Do not use it to mesmerize the reader. The Commandments • • • • • • • • • • Hypothesis as an affirmative statement Intention to treat Components of the power calculation Post hoc power analyses Powering for harm, no short cuts Justifications for subgroup analyses Small but significant Meaning of treatment effects Withdrawal studies to be cherished The “mighty p” Summary • In a RCT, like in all scientific endeavor, all efforts aimed at proving the hypotheses are wrong. • The aim is falcification. Vitamin-E in gout the Conclusion Vitamin E is an innocuous agent and with the possible additional benefit of an antihypertensive effect, as was also serendipitously noted in this study, such studies are surely warranted. We are about to start a study along those lines at our institution. What is wrong? Do not preempt!