Hypothesis Testing

Hypothesis
• An educated opinion
• What you think will happen, based on
  • previous research
  • anecdotal evidence
  • reading the literature

Body fat level of 8th graders
• National Norm:
  • Mean = 23%, SD = 7%
  • postulated parameters (μ and σ)
• Your 8th grade PE program (N = 200)
How does my program compare?

Your gut feeling
• You expect to find, you want to find, your instincts tell you that your students are better.
But are they?

Question
• Is any observed difference between your sample mean (representative of your 8th grade population mean) and the National Norm (population of all 8th graders) attributable to random sampling error, or is there a real difference?
• Is the mean of your class REALLY the same as the National Norm?

How to determine this
• Research Question
  • is my POPULATION mean really 23%?
• Statistical Question
  • μ = 23%
  • set the Null Hypothesis that the mean of YOUR group is 23% (equal to the National Norm)
  • assume that your group is NOT REALLY different

Null Hypothesis
• Ho: μ = 23%
• The true difference between your sample and the population mean is 0.
• There is NO real difference between your sample mean and the population mean.
• The performance of your students is not really different from the national norm.

Null Hypothesis
• In inferential statistics, we usually want to reject the Null hypothesis
  • to say that the differences are more than what would be expected by random sampling error
  • this was our initial gut feeling
  • our program is better

3 Possible Outcomes
• No difference between groups
  • do not reject the null hypothesis
• One specific group is higher than the other
  • directional hypothesis
  • What you EXPECT to happen when planning the experiment/measurement
• Either group mean is higher
  • non-directional hypothesis
  • The possible outcome of the experiment/measurement

Alternative Hypothesis (non-directional)
• Our research hypothesis (what we expect to see)
• HA: μ ≠ 23%
  • non-directional hypothesis
  • interested to see if my grade body composition is better than or worse than the national norm

Alternative Hypothesis (directional)
• Our research hypothesis (what we expect to see)
• HA: μ < 23% (HA: μ > 23%)
  • directional hypothesis
  • expect to see my grade mean less than (better than) the national norm
  • expect to see my grade mean greater than (worse than) that of the national norm

Comparing My Class to the National Norm
• My 8th grade PE program (N = 200)
• National Norm = 23%
  • postulated parameter
• At the end of the semester, calculate the mean % body fat
  • Using a random sample (n = 25)
  • mean % body fat of 20%
Is my sample mean different from the National Norm?

Need to Test Ho
• Determine whether the observed difference in means is attributable to random sampling error rather than a true difference between the groups (my class and the national norm)
  • treatment effect

Hypothesis Testing

Null Hypothesis
• No true difference between two means (sample mean and national norm)
• Implies: my sample is drawn from the identified population
• Nothing more than random sampling error accounts for any observed difference between the means.

An element of uncertainty is inherent in any act of observation (Menard’s Philosophy)

Alternative Hypothesis
• A true difference does exist between two means
• Implies: my sample is not drawn from the identified population
• Observed difference between the means is larger than what we are willing to attribute to random sampling error

Testing Ho
• Test the probability that the observed difference between means is attributable to random sampling error alone
• Evaluate the probability that Ho is not to be rejected
  • reject or do not reject Ho
What amount of risk are you willing to take?

Weatherman Example
• 85% chance of rain
  • put up the sunroof
• 5% chance of rain
  • it may happen, but the chance is slight
  • not very likely to rain
  • willing to risk being wrong to avoid the inconvenience of having to put up the sunroof

If we do not put up the sunroof:
• We reject the hypothesis that it will rain
• We could be right, or we could be wrong

Waiting for certainty means waiting forever

What risk are YOU willing to take?
• 1%? 5%? 10%?
• Applied Research: α = 0.10, α = 0.05, α = 0.01

α = 0.05
• With these observed conditions
  • 5 times in 100 it will rain
    • 5 times in 100 it will rain when we have kept the sunroof down
  • 95 times in 100 it will not rain
    • 95 times in 100 it will not rain when we have kept the sunroof down

α = 0.05
• Reject Ho if the observed mean difference is greater than what we would expect to occur by chance (random sampling error) fewer than 5 times in 100 instances
  • reported in research as a statistically significant difference

Testing Ho at α = 0.05
• If p > 0.05: do not reject Ho
  • difference is attributable to random sampling error (expected variability in means drawn from a population)
• If p ≤ 0.05: reject Ho
  • difference is attributable to something other than random sampling error

Decision Table

                     DECISION: Ho TRUE                      DECISION: Ho FALSE
REALITY: Ho TRUE     Correct                                Incorrect (RT1: reject a true Ho, Type I error)
REALITY: Ho FALSE    Incorrect (AFII: accept a false Ho,    Correct
                     Type II error)

Belief in God as Decision Table
Ho: God does not exist

                     DECISION: Ho TRUE           DECISION: Ho FALSE
REALITY: Ho TRUE     Life, no hope               Lived life of hope
REALITY: Ho FALSE    Lost out on eternal life    Eternal life

To this juncture
• Sampling involves error
• Expect differences between samples
• We expect a difference between treatments/conditions, BUT we also expect a difference because of random sampling error
• HOW do we determine if a difference is statistically significant (greater than random sampling error, RSE)?

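The idea that sample means differ from one another through random sampling error alone can be illustrated with a small simulation. This is a sketch only and not part of the original slides: it assumes body fat in the population is normally distributed with the national norm's mean of 23% and SD of 7%, and `sample_means` is a hypothetical helper name.

```python
import random
import statistics


def sample_means(pop_mean, pop_sd, n, draws, seed=0):
    """Draw `draws` random samples of size n from a normal population
    (an assumption of this sketch) and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(draws)
    ]


# National norm from the slides: mean 23%, SD 7%; samples of n = 25.
means = sample_means(pop_mean=23.0, pop_sd=7.0, n=25, draws=10_000)
spread = statistics.stdev(means)

print(f"mean of the sample means: {statistics.fmean(means):.2f}")
print(f"SD of the sample means:   {spread:.2f}")
```

Every sample here comes from the same population, yet the sample means still scatter around 23%. That scatter is the random sampling error an observed difference has to exceed.
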
Testing Ho requires
• Mean value
  • measure of typical performance level
• Standard deviation
  • measure of the variability
• n of cases
  • known to affect the variability expected with the estimate of the population mean

z test for one sample
• Our beginning point
  • National Norm BF = 23% (SD = 7%)
• Our sample performance
  • n = 25
  • Mean = 20%
  • SD = 6%
Do my students differ from the National Norm?

Our hypotheses
• Research Hypothesis
  • Do my students differ from the national norm?
  • want to know if better OR worse
• Ho
  • There is no real difference in the BF% of my students and the national norm
• α = 0.05

Recall
• A z-score > 1.96 or < -1.96 occurs only about 5% of the time
  • see table of the Normal Curve
• That is, the probability of obtaining a z-score value this extreme purely by chance is 5% (only 5 times in 100) (explain).

Relevance to Hypothesis Testing
• Use the same general idea to evaluate the probability of obtaining a sample mean score of 20% with n = 25 if the true population mean is 23%
• Recall the concept of the distribution of sampling means

Recall: Z score equation
  $Z = \frac{X - \bar{X}}{SD}$

Introduce: Z test equation
  $Z = \frac{\bar{X} - \mu}{SE_m}$
• Numerator ($\bar{X} - \mu$): mean difference
• Denominator ($SE_m$): expected variability in sample means

Standard Error of the Mean
  $SE_m = \frac{SD}{\sqrt{n}}$

Our given & required data
• $\bar{X}$ = 20%
• SD = 6%
• n = 25
• μ = 23%
• σ = 7%
• $SE_m$ = 7/√25 = 7/5 = 1.4 (use the population standard deviation, SDp)
• $\bar{X} - \mu$ = 20% - 23% = -3%
• Z = -3 / 1.4 = -2.14

Decision Making
• What is the probability of obtaining a Z = -2.14 IF the difference is attributable only to random sampling error?
• Is the observed probability (p) LESS THAN or EQUAL TO the α level set?
• Is p ≤ α?

From the tables
• Z > 1.96 or Z < -1.96 has a 5% chance of occurring purely by chance (explain).
• Since Zobserved = -2.14, our statistical conclusion is to reject Ho
  • a Z of -2.14 is not likely to have occurred by chance
• The data indicate/suggest (not prove) that our class HAS less body fat than the norm.

Graphically (α = 0.05)
[Figure: normal curve with Zcritical = ±1.96. The Region of Non-Rejection lies between -1.96 and 1.96; the Regions of Rejection lie beyond -1.96 and 1.96. Zobserved = -2.14 falls in the lower Region of Rejection.]

Reporting the Results (α = 0.05)
The observed mean of our treatment group of 25 students was 20% (± 6%) body fat. The z-test for one sample indicates that the difference between the observed mean of 20% and the National Norm of 23% was statistically significant (Zobs = -2.14, p ≤ 0.05). These data suggest that our measured percent body fat was less than the national norm.

Reporting the Results (α = 0.01)
The observed mean of our treatment group was 20% (± 6%) body fat. The z-test for one sample indicates that the difference between the observed mean of 20% and the National Norm of 23% was not statistically significant (Zobs = -2.14, p > 0.01). Our measured percent body fat was not significantly different from the national norm.

Reporting the Results, you set α = 0.01
The observed mean of our treatment group was 20% (± 6%) body fat. With α = 0.01, the z-test for one sample indicates that the difference between the observed mean of 20% and the National Norm of 23% was not statistically significant (Zobs = -2.14, two-tailed p ≈ 0.032). Our measured percent body fat was not significantly different from the national norm.

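The one-sample z test worked through above can be reproduced in a few lines. This is a minimal sketch, assuming a non-directional (two-tailed) test as in the slides; the function name `one_sample_z` is hypothetical, and the p value is computed from the standard normal distribution via `math.erfc` rather than read from a table.

```python
import math


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return (SEm, z, two-tailed p) for a one-sample z test."""
    se_m = pop_sd / math.sqrt(n)          # standard error of the mean
    z = (sample_mean - pop_mean) / se_m   # mean difference / expected variability
    # For a standard normal Z: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_m, z, p


# Values from the slides: sample mean 20%, national norm 23% (SD 7%), n = 25.
se_m, z, p = one_sample_z(sample_mean=20.0, pop_mean=23.0, pop_sd=7.0, n=25)
print(f"SEm = {se_m:.2f}, z = {z:.2f}, two-tailed p = {p:.3f}")

# Same data, two different risk levels.
for alpha in (0.05, 0.01):
    decision = "reject Ho" if p <= alpha else "do not reject Ho"
    print(f"alpha = {alpha}: {decision}")
```

The same observed z leads to opposite statistical conclusions at α = 0.05 and α = 0.01, which is why the α level has to be set before looking at the data.
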
Consider all possible reasons for your outcome

Statistics humour
What does a statistician call it when the heads of 10 rats are cut off and 1 survives?
Non-significant.

Do not reject Ho vs Accept Ho
• Accept implies that we are sure Ho is valid
• Do not reject reflects that this time we are unable to say with a high enough degree of confidence that the difference observed is attributable to something other than sampling error.

Examples
• Zobs = -3.45, α = 0.05: Decision (statistical conclusion) = ???
• Zobs = 1.45, α = 0.01: Decision (statistical conclusion) = ???
• Zobs = 1.96, α = 0.05: Decision (statistical conclusion) = ???
• Zobs = -1.96, α = 0.01: Decision (statistical conclusion) = ???
• Zobs = 1.96, α = 0.01: Decision (statistical conclusion) = ???
• Zobs = -1.95, α = 0.05: Decision (statistical conclusion) = ???

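One way to check decisions like these is to compare the observed z with the critical z for the chosen α. The sketch below assumes a non-directional (two-tailed) test, which the examples do not state explicitly; `critical_z` and `z_decision` are hypothetical helper names, and the critical value comes from Python's `statistics.NormalDist` rather than a printed table.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Two-tailed critical z: alpha is split equally between both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_decision(z_obs, alpha):
    """Reject Ho when the observed z is at least as extreme as the critical z."""
    if abs(z_obs) >= critical_z(alpha):
        return "reject Ho"
    return "do not reject Ho"


print(f"critical z at alpha = 0.05: {critical_z(0.05):.3f}")
print(f"critical z at alpha = 0.01: {critical_z(0.01):.3f}")
print(z_decision(-3.45, 0.05))
print(z_decision(1.45, 0.01))
```

Printed tables round the α = 0.05 critical value to 1.96, so an observed z of exactly ±1.96 sits on the boundary and the decision depends on that rounding.
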
Z-test vs t-test
• SPSS does not provide the z-test
  • Can only use z-test if you know population SD
• Typically, all population parameter values are estimated from sample statistics
  • Mean
  • Standard deviation
  • Standard error
• SPSS uses t-test
  • Same concept, different assumptions
  • t-test more robust against departures from normality (doesn’t affect the accuracy of the p-estimate as much)

When population SD is not known…changing distributions
• The Z-test uses one sample statistic to estimate population parameters
  • sample mean → population mean
  • Population standard deviation is known
• The t-test uses two sample statistics to estimate population parameters
  • sample mean → population mean
  • sample standard error → population SD

t-test equation
• So the test statistic now becomes
  $t = \frac{\bar{X} - \mu_0}{s_{\bar{X}}}$

Estimated population SD
• To estimate pop SD from sample SD, the sample SD is inflated a little…
  $s_{est} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• You may have noticed this modification earlier

SEm from estimated population SD
• To estimate standard error from sample SD, use the estimated SD again ($s = s_{est}$), thus…
  $s_{\bar{X}} = \frac{s}{\sqrt{n}}$

Recall factors affecting $s_{\bar{X}}$
• Size of estimated SE obviously depends on both SD of sample, and sample size
  $s_{\bar{X}} = \frac{s}{\sqrt{n}}$

When population SD is not known…changing distributions
• The distribution used to evaluate the calculated ratio switches from the normal distribution to the t-distribution
• Sampling variation in Z-distribution reflected variability with respect to sample mean
• BUT sampling variation in t-distribution reflects variability with respect to sample mean and standard error of the mean
• So…as the sample gets smaller (and the standard error of the mean increases) the sampling distribution of t differs from that of Z
• The good old 1.96 for 95% is toast

Concept of Degrees of Freedom (df)
• The number of independent pieces of information a sample of observations can provide for purposes of statistical inference
• E.g. 3 numbers in a sample: 2, 2, 5
  • Sample mean = 3; deviations are -1, -1, 2
  • Are these independent?
  • No: when you know two, you’ll know the other because $\sum (X - \bar{X}) = 0$
• For any sample of size “n” you have “n-1” values that are free to vary; the last value is fixed

Sampling distribution of t
• Large n: t-dist pretty much like the z-dist (because sample SD is a good estimate of pop SD, & sample SE is a good estimate of pop SE)
• Because the distribution gets flatter as n gets smaller, the t needed for significance gets bigger as n gets smaller
http://duke.usask.ca/~rbaker/Tables.html

Work an example with SPSS
• Heart Rate (bpm) following aerobic activity: 147, 155, 132, 165, 133
• National standard: 158
• Group Mean: 146.4 (± 14.21)
Atble351.sav

SPSS Output
[SPSS output table not recoverable from the source]

Statistics and beer
Time Out

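The heart-rate example can also be checked by hand, without SPSS. This is a sketch under stated assumptions rather than a reproduction of the SPSS output: it applies the one-sample t equation from the slides, assumes a two-tailed test at α = 0.05, and takes the critical t for df = 4 (2.776) from a standard t table, not from the original slides.

```python
import math
import statistics

heart_rates = [147, 155, 132, 165, 133]  # bpm, from the slide
national_standard = 158                  # hypothesised population mean (mu_0)

n = len(heart_rates)
mean = statistics.fmean(heart_rates)
s_est = statistics.stdev(heart_rates)    # estimated SD, divides by n - 1
se = s_est / math.sqrt(n)                # estimated standard error of the mean
t = (mean - national_standard) / se
df = n - 1

# Assumption: two-tailed critical t at alpha = 0.05 for df = 4,
# taken from a standard t table.
T_CRIT_DF4_ALPHA05 = 2.776
decision = "reject Ho" if abs(t) >= T_CRIT_DF4_ALPHA05 else "do not reject Ho"

print(f"mean = {mean:.1f}, s_est = {s_est:.2f}, SE = {se:.2f}")
print(f"t({df}) = {t:.2f}, critical t = {T_CRIT_DF4_ALPHA05}: {decision}")
```

With only five observations the critical value is 2.776 rather than 1.96, which illustrates the point that the t needed for significance gets bigger as n gets smaller.
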