Summary Statistics & Confidence Intervals

Annie Herbert
Medical Statistician
Research & Development Support Unit
Salford Royal NHS Foundation Trust
[email protected]
0161 2064567

Timetable

Time      Task
60 mins   Presentation
20 mins   Coffee Break
90 mins   Practical Tasks in IT Room

Outline
• Sampling
• Summary statistics
• Confidence intervals
• Statistics Packages

‘Population’ and ‘Sample’
• Studying population of interest. Usually would like to know typical value and spread of outcome measure in population.
• Data from entire population usually impossible or inefficient/expensive so take a sample (even census data can have missing values).
• Sample must be representative of population.
• Randomise!

E.g. Randomised Controlled Trial (RCT)
[Diagram: a SAMPLE is drawn from the POPULATION, then RANDOMISATION into GROUP 1 and GROUP 2, each with an OUTCOME]

Types of Data

           Categorical        Numerical/Continuous
Example:   • Yes/No           • Weight
           • Blood Group      • Pain Score
Graphs:    • Bar Chart        • Histogram
           • Pie Chart        • Box and Whisker Plot
Summary:   • Frequency (n)    • Mean & Standard Deviation (SD)
           • Proportion (%)   • Median & Inter-quartile range (IQR)

Types of Average
(‘Average’ - a number which typifies a set of numbers)
• Mean = Total divided by n
• Median = Middle value
• Mode = Most common value/group (rarely used)

Types of Average - Example
Pain score data: 10, 8, 7, 7, 1, 7, 6, 5, 3, 4
Ordered: 1, 3, 4, 5, 6, 7, 7, 7, 8, 10
Mean = (1 + 3 + 4 + … + 10) ÷ 10 = 5.8
Median = (6 + 7) ÷ 2 = 6.5
Mode = 7

Mean or Median?
Roughly Normally distributed:
• Mean or median
• Mean by convention
[Figure: histogram of roughly Normally distributed data]
Skewed:
• Median
• Less affected by extreme values
[Figure: histogram of skewed data]

Variation and Spread
• Standard Deviation (‘SD’)
  - Average distance from mean
  - Use alongside mean
• Inter-Quartile Range (‘IQR’)
  - Range in which middle 50% of the data lie (middle 50% when ordered)
  - Use alongside median
• Range
  - Highest and lowest value
  - Possibly quote in addition to SD/IQR

Types of Variation - Example
Pain score data: 10, 8, 7, 7, 1, 7, 6, 5, 3, 4
Ordered: 1, 3, 4, 5, 6, 7, 7, 7, 8, 10
SD = 2.6
IQR = (3.75, 7.25)
Range = (1, 10)

Standard Error
• Not the same as standard deviation.
• Calculated using a measure of variability and sample size.
• Used to construct confidence intervals.
• Not very informative when given alongside statistics or as error bars on a plot.

Sample statistic is the best guess of the (true) population value
• E.g. Sample mean is the best estimate of mean in population.
• Mean likely to be different if take a new sample from the population.
• Know that estimate not likely to be exactly right.

Confidence Intervals (CIs)
• Confidence interval = “range of values that we can be confident will contain the true value of the population”.
• The “give or take a bit” for best estimate.
• Convention is to use a 95% confidence interval (‘95% CI’).
• But also leaves 5% confidence that this interval does not contain the true value.
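
The pain score summaries quoted in the worked examples above can be checked with a few lines of Python. This is a minimal sketch using only the standard library's statistics module; the variable names are illustrative. Two choices are assumptions rather than anything the slides state: the sample SD (n - 1 denominator) and the module's default "exclusive" quartile method, both picked because they reproduce the quoted figures. The standard error line uses the usual SD ÷ √n formula for a mean.

```python
import math
import statistics

# Pain score data from the worked example.
pain_scores = [10, 8, 7, 7, 1, 7, 6, 5, 3, 4]
n = len(pain_scores)

# Types of average.
mean = statistics.mean(pain_scores)      # 5.8
median = statistics.median(pain_scores)  # 6.5
mode = statistics.mode(pain_scores)      # 7

# Spread. stdev() is the sample SD (n - 1 denominator), an assumption
# that matches the quoted SD of 2.6.
sd = statistics.stdev(pain_scores)

# Quartiles with the default "exclusive" method (assumed; other methods
# give slightly different cut points). Returns [Q1, Q2, Q3].
q1, _, q3 = statistics.quantiles(pain_scores, n=4)

# Standard error of the mean: variability scaled by sample size.
se = sd / math.sqrt(n)

print(f"Mean = {mean}, Median = {median}, Mode = {mode}")
print(f"SD = {sd:.1f}, IQR = ({q1}, {q3}), "
      f"Range = ({min(pain_scores)}, {max(pain_scores)})")
print(f"Standard error of the mean = {se:.2f}")
```

Note that the SD (about 2.6) describes the spread of individual pain scores, while the standard error (about 0.83) describes the uncertainty in the mean itself, which is why the two should not be confused.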
Example: Legislation for smoke-free workplaces and health of bar workers in Ireland: before and after study (Allwright et al; BMJ Oct 2005)

                                     Before (N=138)   After (N=138)   Difference (95% CI)
Salivary cotinine (nmol/l), median   29.0             5.1             -22.7 (-26.7 to -19.0)
Any respiratory symptoms, n (%)      90 (65%)         67 (49%)        -16.7 (-26.1 to -7.3)
Runny nose/sneezing, n (%)           61 (44%)         48 (35%)        -9.4 (-19.8 to 0.9)

Example: Supplementary feeding with either ready-to-use fortified spread or corn-soy blend in wasted adults starting antiretroviral therapy in Malawi (MacDonald et al; BMJ May 2009)

“After 14 weeks, patients receiving fortified spread had a greater increase in BMI and fat-free body mass than those receiving corn-soy blend: 2.2 (SD 1.9) v 1.7 (SD 1.6) (difference 0.5, 95% confidence interval 0.2 to 0.8), and 2.9 (SD 3.2) v 2.2 (SD 3.0) kg (difference 0.7 kg, 0.2 to 1.2 kg), respectively.”

Example: Sample size matters
What proportion of patients attending clinic are satisfied?

Sample size   Number satisfied   Proportion satisfied   95% CI for proportion
10            7                  70%                    35% to 93%
25            18                 72%                    50% to 88%
50            35                 70%                    55% to 82%
100           70                 70%                    60% to 79%
1000          700                70%                    67% to 73%

Example: % confidence matters
What proportion of patients attending clinic are satisfied?
Sample size = 50
No. satisfied = 35
Proportion satisfied = 70%

90% CI   58% to 81%
95% CI   55% to 82%
99% CI   51% to 85%

p-values vs. Confidence Intervals
• p-value:
  - Weight of evidence to reject null hypothesis
  - No clinical interpretation
• Confidence Interval:
  - Can be used to reject null hypothesis
  - Clinical interpretation
  - Effect size
  - Direction of effect
  - Precision of population estimate

So… it’s not all about p-values!
• For some hypotheses p-value and CI will both indicate whether to reject it or not.
• A CI will also provide an estimate, as well as a range for that estimate.
• General medical journals prefer CI.
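
The effect of sample size and confidence level on the satisfaction example can be explored with a short Python sketch. The slides do not say how their intervals were calculated; the quoted limits look consistent with the exact (Clopper-Pearson) binomial method, so that method is assumed here. The function names (binom_pmf, clopper_pearson and so on) are hypothetical helpers written for this illustration, not part of any statistics package, and the limits are found by simple bisection rather than a library routine.

```python
from math import exp, lgamma, log


def binom_pmf(i, n, p):
    """Probability of exactly i successes in n trials (log scale for stability)."""
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))


def upper_tail(k, n, p):
    """P(X >= k); increases as p increases."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k); decreases as p increases."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def solve(f, target, increasing):
    """Bisection on (0, 1) for the p at which f(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, level=0.95):
    """Exact (Clopper-Pearson) CI for a proportion: k successes out of n."""
    alpha = 1 - level
    lower = 0.0 if k == 0 else solve(lambda p: upper_tail(k, n, p), alpha / 2, True)
    upper = 1.0 if k == n else solve(lambda p: lower_tail(k, n, p), alpha / 2, False)
    return lower, upper


# Sample size matters: a similar proportion, with a narrower interval as n grows.
for n, k in [(10, 7), (25, 18), (50, 35), (100, 70), (1000, 700)]:
    lo, hi = clopper_pearson(k, n)
    print(f"n={n:5d}  satisfied={k:4d}  {k / n:.0%}  95% CI {lo:.0%} to {hi:.0%}")

# % confidence matters: the same data, with a wider interval as confidence rises.
for level in (0.90, 0.95, 0.99):
    lo, hi = clopper_pearson(35, 50, level)
    print(f"{level:.0%} CI {lo:.0%} to {hi:.0%}")
```

Other methods for a proportion (for example Wald or Wilson intervals) give somewhat different limits, particularly for small samples, so small discrepancies from a given statistics package are to be expected.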
Statistical Packages

SPSS
• Summary Statistics:
  - Not user-friendly
  - Gives a large choice of statistics to calculate
• Confidence Intervals:
  - Doesn’t provide a CI for some key comparative statistics: e.g. simple percentage

Stats Direct
• Summary Statistics:
  - One right-click
  - Will produce a set of 20 or so of the most commonly used statistics
• Confidence Intervals:
  - Provides a CI for most statistics

Thanks for listening!