How to find your way through the jungle of statistics ...
Carl-Olav Stiller, associate professor
Clinical pharmacology
Karolinska University Hospital - Solna
17176 Stockholm
Tel: 08-5177 3261
[email protected]

The jungle of statistics - basics
- Why do we need statistics
- Hypothesis testing
- Hypothesis generating
- Common pitfalls
- Parametric or non-parametric statistics
- Independent or dependent observations
- Planning of research

Why do we need statistics in research?
- To test hypotheses
- To show similarities or differences
- To analyse correlations
- To describe findings / data

Differences or similarities?
- Do I want to show differences?
- Power analysis: which difference do I want to be able to detect?
- Sensitivity and specificity

What can go wrong?
- We find a difference which is not true: alpha problem (type I error)
- We find similarity, but the groups are different: beta problem (type II error)

Hypothesis - endpoint
- Primary hypothesis: the aim of the study. Highest evidence.
- Secondary hypothesis / endpoint: all other tests. Lower evidence. Hypothesis generating.

Lack of difference is not the same as similarity!
- If you compare small groups it is hard to detect any difference.
- In order to show similarity, the groups have to have a certain size.
- Before the study starts you have to define the interval for similarity.

Sensitivity or specificity
- Sensitivity: how likely is a truly positive case to give a positive outcome? High sensitivity means few false negatives.
- Specificity: how likely is a truly negative case to give a negative outcome? High specificity means few false positives.

Parametric or non-parametric statistics?
- Non-parametric statistics: rank order (same, smaller, bigger)
- Parametric statistics: based on the normal distribution (Gauss curve)

What kind of data do I have?
- Normal distribution: parametric or non-parametric statistics
- Normal distribution with cut-off: non-parametric statistics preferred. If you use parametric statistics, the SD gets too low and your precision seems to be higher than it is.
- Rank order scale: non-parametric statistics should be used (is it?)
- Assessment scale: non-parametric statistics should be used (is it?)

Rank order scales: examples
- Borg scale for exertion: 1-5, from no exertion to maximal exertion
- Cardiac failure according to NYHA (New York Heart Association): 1-4
- Pain intensity, for example migraine headache: 0 - no pain, 1 - some pain, 2 - moderate pain, 3 - severe pain, 4 - very severe pain

Visual analog scale
- Pain intensity: 0 = no pain, 100 = worst imaginable pain
- Problem: subjective assessment. Everyone has a different frame of reference; 40 for one individual is not the same as 40 for another (inter-individual variation).
- VAS data are often calculated with parametric statistics. "Appropriate or not?"

Combined assessment scales
- Depression scale - Montgomery-Åsberg: different variables are combined into one value
- Alzheimer scale - ADAS-cog: different abilities affected in Alzheimer's disease are rated and combined
- Intelligence quotient: performances in different tests are weighed together

Parametric or non-parametric statistics?
Common pitfalls:
- Non-parametric data are calculated with parametric statistics
- But parametric data may also be calculated with non-parametric statistics

Control group or test before treatment and after treatment?
- Testing before and after may be useful as a pilot study to generate a hypothesis ("hypothesis-generating").
- A control group is the "gold standard": better data and lower risk of a false positive outcome.

Treatment of severe headache with opioids or NSAIDs i.m. at the emergency department
Harden RN, Gracely RH, Carter T, Warner G. The placebo effect in acute headache management: ketorolac, meperidine, and saline in the emergency department.
Headache 1996 Jun;36(6):352-6

Dependent or independent observations
- Dependent observations: control before or after treatment; tissue from different regions of the same individual
- Independent observations: observations in separate individuals

Common statistical tests comparing two or more groups

                        Parametric statistics      Non-parametric statistics
Two groups
  Independent obs.      Unpaired t-test            Mann-Whitney test
  Dependent obs.        Paired t-test              Wilcoxon signed-rank test
Three or more groups
  Independent obs.      One-way ANOVA              Kruskal-Wallis test
  Dependent obs.        Repeated measures ANOVA    Friedman test

ANOVA = analysis of variance. Follow-up tests after ANOVA:
+ Tukey - all pairs
+ Newman-Keuls - all pairs
+ Bonferroni - all pairs
+ Bonferroni - selected pairs
+ Dunnett - versus control
Follow-up test for the non-parametric tests:
+ Dunn's test

Standard deviation (SD)
[Figure: Effect (axis 0 to 80) for Control, Drug A and Drug B, shown with SD]

Standard error of the mean (SEM)
SEM = SD / √n
[Figure: Effect (axis 0 to 80) for Control, Drug A and Drug B, shown with SEM]

Confidence interval
Correct illustration of the effect range: 95 % confidence interval
[Figure: Effect (axis 0 to 80) for Control, Drug A and Drug B, shown with 95 % confidence interval]

What is a good clinical study?
- Relevant patient population
- Sufficient size / power
- Clinically relevant effect outcome
- Reference treatment using relevant doses
- Double blind / randomised
- Sufficient follow-up time
- Few withdrawals

Common pitfalls
- Preliminary data
- Limited number of participants
- No control group
- Open trial or single-blind trials

Beware ...
- Control group with inadequate treatment
- Second-best alternative
- Gold standard
- Dose selection of drug and comparator?

Beware ...
- No randomisation: it is not the treatment but the group selection which explains the outcome

Beware ...
Selection criteria
- Hard selection: results may not be generalisable.
- No selection: the treatment effect can be blurred by other aspects.

Beware ...
- Subgroup analysis not planned in advance
- Large number of subgroups analysed
- Risk of a difference just by chance

Beware ...
- Outcome was analysed with unproven methods
- Surrogate outcome
- Short follow-up
- Drop-out

Beware ...
- Differences in adverse events were not analysed
- Rare adverse events are not detected in RCTs

Beware ...
- Results are only presented as percent change and not as absolute difference
- A large relative change, for example a 50 % decrease, may sound impressive but may not be important if the risk is low

Summary
- Select statistics before you start your experiment
- Analyse your data
- Mind the pitfalls

Good luck
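
Worked example (not part of the original slides)
The advice to select statistics before the experiment starts can be sketched as a small lookup that mirrors the slide table of common tests, together with the SEM formula and the relative versus absolute change pitfall. This is only a minimal sketch under stated assumptions: the names `choose_test`, `sd_and_sem` and `risk_change` and all sample numbers are hypothetical illustrations, and the lookup covers only the eight tests named in the slides. Whether data count as parametric remains a judgement the researcher has to make.

```python
import math
import statistics

# Lookup mirroring the slide table of common tests.
# Key: (parametric data?, dependent observations?, three or more groups?)
TESTS = {
    (True, False, False): "Unpaired t-test",
    (True, True, False): "Paired t-test",
    (False, False, False): "Mann-Whitney test",
    (False, True, False): "Wilcoxon test",
    (True, False, True): "One-way ANOVA",
    (True, True, True): "Repeated measures ANOVA",
    (False, False, True): "Kruskal-Wallis test",
    (False, True, True): "Friedman test",
}


def choose_test(parametric, dependent, n_groups):
    """Return the test named in the slide table (hypothetical helper)."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    return TESTS[(parametric, dependent, n_groups >= 3)]


def sd_and_sem(values):
    """Sample standard deviation and SEM = SD / sqrt(n)."""
    sd = statistics.stdev(values)  # sample SD, divides by n - 1
    return sd, sd / math.sqrt(len(values))


def risk_change(control_risk, treated_risk):
    """Absolute and relative risk reduction for two event risks."""
    absolute = control_risk - treated_risk
    relative = absolute / control_risk
    return absolute, relative


if __name__ == "__main__":
    # Rank-order data measured before and after treatment in the same patients
    print(choose_test(parametric=False, dependent=True, n_groups=2))

    # Hypothetical effect measurements for one group
    sd, sem = sd_and_sem([1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"SD = {sd:.2f}, SEM = {sem:.2f}")

    # Hypothetical risks: 2 % in the control group, 1 % in the treated group
    absolute, relative = risk_change(0.02, 0.01)
    print(f"absolute = {absolute:.1%}, relative = {relative:.0%}")
```

The last call illustrates the final pitfall: halving a 2 % risk is a 50 % relative reduction but only one percentage point in absolute terms. Because the SEM shrinks as n grows, error bars based on SEM look tighter than SD bars for the same data.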