P Values - part 3
The P value as a 'statistic'
Robin Beaumont 1/03/2012
With much help from Professor Geoff Cumming

P values - Putting it all together

Summary so far
• A P value is a conditional probability which considers a range of outcomes, shown as an 'area' in a graph.
• The SEM formula allows us to predict the accuracy of our estimate (i.e. the mean value of our sample) across an infinite number of samples.

What is a statistic?
• A statistic is just a summary measure; technically we have reduced a set of data to one or two values:
• Range (smallest - largest)
• Mean, median etc.
• Inter-quartile range, SD, variance
• Z score, T value, chi square value, F value etc.
• P value

T value
• T statistic: there are different types; the simplest is the 1 sample version:

$$t = \frac{\text{observed difference between estimated mean and population value}}{\text{sampling variability in means}}$$

$$t = \frac{\text{observed difference between estimated mean and population value}}{\text{SEM}}$$

$$t = \frac{\text{observed difference between estimated mean and population value}}{\text{expected variability in means due to random sampling}} = \frac{\text{Signal}}{\text{Noise}}$$

So when t = 0: 0/anything = 0, so the estimated and hypothesised population means are equal
So when t = 1: the observed difference is the same as the SEM
So when t = 10: the observed difference is much greater than the SEM

T statistic example
Serum amylase values from a random sample of 15 apparently healthy subjects. The mean = 96, SD = 35 units/100 ml. How likely is it that such an 'unusual' sample would be obtained from a population of serum amylase determinations with a mean of 120? (taken from Daniel 1991 p.202 adapted)
The population value = the null hypothesis

$$t = \frac{96 - 120}{35 / \sqrt{15}} = \frac{-24}{9.037} = -2.656$$

This looks like a rare occurrence?

[Figure: t density, $s_{\bar{x}} = 9.037$, n = 15; the original-units axis marks 96 and 120; shaded area = 0.0188]

Given that the sample was obtained from a population with a mean of 120, a sample with a t statistic (n = 15, so df = 14) of -2.656 or 2.656, or one more extreme, will occur 1.88% of the time = just under two samples per hundred on average.
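The serum amylase arithmetic above can be checked with a short script. This is a minimal sketch, not the method used to produce the slides: the helper names `t_pdf` and `t_upper_tail` are made up for illustration, and the tail area is obtained by numerically integrating the t density with Simpson's rule so that only the Python standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the area from 0 to |t| (Simpson's rule, steps must be even)."""
    b = abs(t)
    if b == 0:
        return 0.5
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)

# Values from the slide: n = 15, mean = 96, SD = 35, null hypothesis mean = 120
n, sample_mean, sample_sd, null_mean = 15, 96.0, 35.0, 120.0

sem = sample_sd / math.sqrt(n)              # noise: standard error of the mean
t_stat = (sample_mean - null_mean) / sem    # signal / noise
p_value = 2 * t_upper_tail(t_stat, n - 1)   # both tails, df = n - 1

print(f"SEM = {sem:.3f}, t = {t_stat:.3f}, two-sided p = {p_value:.4f}")
```

The factor of 2 reflects that a sample mean as far above 120 as 96 is below it counts as equally extreme.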
What does the shaded area mean?
Given that the sample was obtained from a population with a mean of 120, a sample of 15 producing a mean of 96 (120 - x where x = 24) or 144 (120 + x where x = 24), or one more extreme, will occur 1.88% of the time, that is just under two samples per hundred on average. = P value

[Figure: t density with both tails shaded, beyond t = -2.656 and t = 2.656; a second axis shows the original units]

$$P\ \text{value} = 2 \cdot P\left(t_{(n-1)} \le -|t_{\text{obtained}}| \;\middle|\; H_0 \text{ is true}\right)$$

= 2 · [area to the left of $-|t_{\text{obtained}}|$ under a t distribution with df = n - 1]

P value and probability for the one sample t statistic
p value = 2 x P(t(n-1) values more extreme than the obtained t(n-1) | H0 is true)
= 2 x [area in one tail beyond the obtained t under a t distribution with n - 1 degrees of freedom, which set its shape]

Statistic -> sampling distribution -> PDF -> p value
No sampling distribution! Create a virtual one

P Value Variability
Taking another random sample, the P value will be different
How different? It does not follow a normal distribution
Depends upon the probability of the null hypothesis being true!
Remember we have assumed so far that the null hypothesis is true.

Dance of the p values - Geoff Cumming
Simplified dance of the p values when the null hypothesis is true
Example from Geoff Cumming's dance of the p values
The take home message is that we can obtain very small p values even when the null hypothesis is true.
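The dance of the p values under a true null hypothesis can be imitated with a small simulation. This is an illustrative sketch only, not Cumming's own demonstration: it assumes a normal population with mean 120 and SD 35 (borrowed from the serum amylase example), samples of 15, an arbitrary random seed, and hypothetical helper names. When the null hypothesis is true and the test assumptions hold, p values are uniformly distributed between 0 and 1, so roughly 5% of them should fall below 0.05.

```python
import math
import random
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=200):
    """P(T > |t|): 0.5 minus the area from 0 to |t| (Simpson's rule, steps must be even)."""
    b = abs(t)
    if b == 0:
        return 0.5
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)

def two_sided_p(sample, null_mean):
    """Two-sided one sample t test p value for a list of observations."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - null_mean) / sem
    return 2 * t_upper_tail(t_stat, n - 1)

rng = random.Random(2012)  # fixed seed (arbitrary) so the run is repeatable
NULL_MEAN, POP_SD, N, TRIALS = 120.0, 35.0, 15, 2000

# Every sample really does come from the null population (mean 120)
p_values = [
    two_sided_p([rng.gauss(NULL_MEAN, POP_SD) for _ in range(N)], NULL_MEAN)
    for _ in range(TRIALS)
]

share_below_05 = sum(p < 0.05 for p in p_values) / TRIALS
print("first ten p values:", [round(p, 3) for p in p_values[:10]])
print(f"smallest p = {min(p_values):.4f}, share below 0.05 = {share_below_05:.3f}")
```

Printing the first few p values shows how much they jump around from sample to sample even though nothing about the population has changed.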
Why no CI for the P Value if it varies across trials?
• P value -> statistic but
• Not all statistics represent values that are reflected in a population value
• Other ways of getting an idea of variability across trials:
• Reproducibility Probability Value (RP)
Goodman 1992 and also 2001 journal articles
Hung, O'Neill, Bauer & Köhne 1997 - Biometrics journal
Shao & Chow 2002 - Statistics in Medicine journal
Boos & Stefanski 2011 - Journal of the American statistical association
Cumming 2008 + and book

Cumming's Reproducibility (replication) Probability Value
Given P obtained = 0.05, what is the interval in which we are likely to see 80% of subsequent P values?
Answer: We have an 80% chance of seeing subsequent p values fall within the interval (0, 0.22) [one sided]
This means that there is a 20% chance of a subsequent p value being > 0.22

What about when the null hypothesis is not true?
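The (0, 0.22) interval quoted above for P obtained = 0.05 can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions that the slide does not state and that are only one reading of the approach in Cumming 2008: a z test with known population SD, a replication z score that is normally distributed around the obtained z with standard deviation √2, an obtained p that is two-tailed, and a replication p that is one-tailed. The function name is hypothetical.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution, mean 0 and SD 1

def one_sided_p_interval_upper(p_obtained_two_tailed, coverage=0.80):
    """Upper limit U such that a replication p value falls in (0, U) with the given coverage.

    Assumes (not from the slide): replication z ~ Normal(z_obtained, sqrt(2)),
    with the replication p value taken as one-tailed.
    """
    # z score corresponding to the obtained two-tailed p value
    z_obtained = std_normal.inv_cdf(1 - p_obtained_two_tailed / 2)
    # z value that the replication exceeds with probability `coverage`
    z_lower = z_obtained - math.sqrt(2) * std_normal.inv_cdf(coverage)
    # one-tailed p value at that z
    return 1 - std_normal.cdf(z_lower)

upper = one_sided_p_interval_upper(0.05)
print(f"80% one-sided p interval: (0, {upper:.2f})")
```

Under these assumptions the upper limit depends only on the obtained p value, not on the sample size, which is why a single figure can be quoted for P obtained = 0.05.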
