Download Ch 9A Hypothesis Testing

Chapter 9: Tests of Hypothesis, Single Sample Tests
The Beginnings: concepts and techniques
Chapter 9A

9-1.1 Statistical Hypotheses: Some Definitions
• Statistical hypothesis: an assertion about a population parameter or distribution.
• Test of hypothesis: arriving at a decision to reject or not reject a hypothesis based upon a sample from the population.
• Null hypothesis: usually the hypothesis of no difference; the assertion that the researcher usually wants to reject. $H_0: \mu = \mu_0$
• Alternate hypothesis: the assertion that is accepted if the null hypothesis is rejected; the assertion that the researcher generally wants to prove. $H_1: \mu \ne \mu_0$

Hypothesis Test on a Population Mean
• Two-sided test: $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$
• One-sided tests: $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$, or $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$

Test of Hypothesis
• If the information in a sample is consistent with the null hypothesis, then we conclude that the null hypothesis cannot be rejected.
• If this information is inconsistent with the null hypothesis, we conclude that the hypothesis is false and reject the null hypothesis in favor of the alternate hypothesis.
• Critical region: the set of values of the test statistic that results in rejecting the null hypothesis.
• The test statistic is calculated from the sample; i.e., it is a sample statistic.
• If $\hat{\theta} \in$ critical region, then reject $H_0$.

What can go wrong?
              Do not reject H0     Reject H0 (accept H1)
H0 true       correct decision     Type I error
H0 false      Type II error        correct decision

How Likely are the errors?
• Type I error: incorrectly rejecting a true hypothesis. $\alpha = P(\text{Type I error})$
• $(1 - \alpha)$ = probability of not rejecting a true hypothesis
• Type II error: incorrectly accepting a false hypothesis. $\beta = P(\text{Type II error})$
• Power of test $(1 - \beta)$: probability of correctly rejecting the null when the alternative is true.
• The probability of a Type I error is called the significance level of the test.
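The statement that the significance level is the probability of a Type I error can be checked by simulation. The sketch below is only an illustration: it assumes an upper-tailed z-test with known standard deviation, and the function name `type_i_error_rate` and the default values (mean 0, standard deviation 1, n = 10, 20,000 trials) are hypothetical choices, not taken from the slides. Samples are drawn with the null hypothesis true, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(mu0=0.0, sigma=1.0, n=10, alpha=0.05, trials=20000, seed=1):
    """Estimate P(reject H0 | H0 true) for an upper-tailed z-test by simulation."""
    rng = random.Random(seed)                  # fixed seed so the run is repeatable
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    se = sigma / n ** 0.5                      # standard deviation of the sample mean
    rejections = 0
    for _ in range(trials):
        # Draw a sample with H0 true (true mean equals mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z0 = (fmean(sample) - mu0) / se
        if z0 > z_alpha:                       # falls in the critical region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level, up to simulation noise.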
Our Very First Hypothesis Test
• Professor Notso Brite believes that his mean driving time to the campus from his home is 50 minutes, while Dean Nowet Ah disagrees, believing that it takes him, on average, more than 50 minutes.
• It is known that the standard deviation of his driving time is 2.5 minutes and that driving time is normally distributed (humor me here).
• For the next 10 days, Professor Brite records his driving time, with $\bar{X} = 51.7$ minutes.
• Can we accept Dean Nowet Ah's assertion that the mean driving time must be greater than 50 minutes?

The Hypothesis
$H_0: \mu = 50$ minutes
$H_1: \mu > 50$ minutes (one-tailed test)
Given: $\bar{X} = 51.7$ minutes, $\sigma = 2.5$ minutes, $n = 10$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = .79$

More Probability of a Type I Error
Let's set the probability of a Type I error = .05. Then
$P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is correct}) = \alpha = .05$
$P(\bar{X} > \bar{X}_c \mid \mu = 50) = .05$
$P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} > \frac{\bar{X}_c - 50}{.79}\right) = .05$
$P(Z > z_{.05}) = .05$; $P(Z > 1.6449) = .05$
$\frac{\bar{X}_c - 50}{.79} = 1.6449$; $\bar{X}_c = 50 + 1.6449(.79) = 51.3$
If $\bar{X} > 51.3$, reject $H_0$.
Since $\bar{X} = 51.7 > 51.3$, reject $H_0$.

What about the Type II Error?
Incorrectly accepting a false hypothesis:
$P(\text{Type II error}) = P(\text{not rejecting } H_0 \mid H_1 \text{ is correct}) = \beta$
$P(\bar{X} < \bar{X}_c \mid \mu > 50) = \beta$
But professor, that probability depends upon the true value of the population mean under the alternate hypothesis.
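The decision rule for the driving-time example, and the way the Type II error probability changes with the true mean, can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `upper_critical_value` and `beta_upper` are hypothetical, and the code keeps the unrounded standard error rather than the rounded .79 used on the slides, so results can differ in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def upper_critical_value(mu0, sigma, n, alpha):
    """Critical sample mean for the upper-tailed test H0: mu = mu0, H1: mu > mu0."""
    se = sigma / sqrt(n)                      # standard deviation of the sample mean
    return mu0 + Z.inv_cdf(1 - alpha) * se


def beta_upper(mu_true, mu0, sigma, n, alpha):
    """P(fail to reject H0) when the true mean is mu_true (Type II error probability)."""
    se = sigma / sqrt(n)
    x_c = upper_critical_value(mu0, sigma, n, alpha)
    return Z.cdf((x_c - mu_true) / se)


# Values from the driving-time example.
mu0, sigma, n, alpha, x_bar = 50.0, 2.5, 10, 0.05, 51.7

x_c = upper_critical_value(mu0, sigma, n, alpha)
print(f"critical value: {x_c:.2f}")           # 51.30
print("reject H0" if x_bar > x_c else "fail to reject H0")

# Beta is not a single number: it depends on the assumed true mean.
for mu_true in (51.0, 52.0, 53.0):
    print(f"true mean {mu_true}: beta = {beta_upper(mu_true, mu0, sigma, n, alpha):.4f}")
```

The printed Type II error probabilities shrink quickly as the assumed true mean moves away from 50, which is the point of the closing remark above.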
More about Type II Errors
Say the true mean is 51:
$P(\bar{X} < \bar{X}_c \mid \mu = 51) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 51}{.79}\right) = P(z < .3797) = .6479$
$P(\bar{X} < \bar{X}_c \mid \mu = 52) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 52}{.79}\right) = P(z < -.8861) = .1878$
$P(\bar{X} < \bar{X}_c \mid \mu = 53) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 53}{.79}\right) = P(z < -2.1519) = .0157$

The Situation Graphically Displayed
[Figure: probability density functions of $\bar{X}$ for $\mu_0 = 50$ and $\mu_1 = 51$, with $\bar{X}_c = 51.3$ marked; shaded areas Prob = .6479 and Prob = .05.]

More Graphical Display
[Figure: probability density functions of $\bar{X}$ for $\mu_0 = 50$ and $\mu_1 = 52$, with $\bar{X}_c = 51.3$ marked; shaded areas Prob = .1878 and Prob = .05.]

The Operating Characteristic (OC) Curve
[Figure: probability of accepting the null hypothesis versus the true mean, for true means from 49.5 to 53.5.]

The Power of the Test
• The power is computed as $1 - \beta$, and power can be interpreted as the probability of correctly rejecting a false null hypothesis.
• We often compare statistical tests by comparing their power properties.
• Power of test $(1 - \beta)$: probability of correctly rejecting the null hypothesis when the alternative is true.

Power Curve
[Figure: probability of rejecting the null hypothesis versus the true mean, for true means from 49.5 to 53.5.]

The Prob-Value
$H_0: \mu = 50$ minutes
$H_1: \mu > 50$ minutes (one-tailed test)
Given: $\bar{X} = 51.7$, $\sigma = 2.5$, $n = 10$
$P\text{-value} = P(\bar{X} > 51.7 \mid \mu = 50) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} > \frac{51.7 - 50}{.79}\right) = P(z > 2.1519) = .0157$

The Prob-Value
[Figure: PDF of $\bar{X}$ under $H_0$, with $\bar{X}_c = 51.3$ ($\alpha = .05$) and $\bar{X} = 51.7$ (P-value = .0157) marked.]

Sample Size Determination
$P(\bar{X} > \bar{X}_c \mid \mu_0) = P\left(z > \frac{\bar{X}_c - \mu_0}{\sigma/\sqrt{n}}\right) = \alpha$
$P(\bar{X} < \bar{X}_c \mid \mu) = P\left(z < \frac{\bar{X}_c - \mu}{\sigma/\sqrt{n}}\right) = \beta$
$\frac{\bar{X}_c - \mu_0}{\sigma/\sqrt{n}} = z_\alpha$; $\frac{\bar{X}_c - \mu}{\sigma/\sqrt{n}} = -z_\beta$
For a 2-tailed test, use $z_{\alpha/2}$ in place of $z_\alpha$.

Sample Size in Action
What sample size is needed if the level of significance is one percent and the probability of rejecting the null hypothesis if the true mean is 52 is 95 percent?
$\alpha = .01 \Rightarrow z_{.01} = 2.33$
$1 - \beta = .95 \Rightarrow z_{.05} = 1.645$
$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu - \mu_0)^2} = \frac{(2.33 + 1.645)^2 (2.5)^2}{(52 - 50)^2} = 24.69$, so take $n = 25$.

A Two-Tailed Test
$H_0: \mu = 50$ minutes
$H_1: \mu \ne 50$ minutes (two-tailed test)
Given: $\bar{X} = 51.7$ minutes, $\sigma = 2.5$ minutes, $n = 10$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = .79$

More Probability of a Type I Error
Let's set the probability of a Type I error = .05. Then
$P(\bar{X}_{c1} < \bar{X} < \bar{X}_{c2} \mid \mu = 50) = 1 - \alpha = .95$
$P\left(\frac{\bar{X}_{c1} - 50}{.79} < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{\bar{X}_{c2} - 50}{.79}\right) = .95$
$P(-z_{.025} < Z < z_{.025}) = .95$; $P(-1.96 < Z < 1.96) = .95$
$\frac{\bar{X}_{c1} - 50}{.79} = -1.96$; $\bar{X}_{c1} = 50 - 1.96(.79) = 48.4516$
$\frac{\bar{X}_{c2} - 50}{.79} = 1.96$; $\bar{X}_{c2} = 50 + 1.96(.79) = 51.5484$
If $\bar{X} < 48.4516$ or $\bar{X} > 51.5484$, reject $H_0$.

Probability of a Type II Error
$\beta = P(\bar{X}_{c1} < \bar{X} < \bar{X}_{c2} \mid \mu = \mu_1) = P(48.4516 < \bar{X} < 51.5484 \mid \mu \ne 50)$
$= P\left(\frac{48.4516 - \mu_1}{.79} < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.5484 - \mu_1}{.79}\right)$
$\mu_1 = 49$: $\beta = P\left(\frac{48.4516 - 49}{.79} < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.5484 - 49}{.79}\right) = P(-0.6941 < z < 3.2258) = .7556$
$\mu_1 = 52$: $\beta = P\left(\frac{48.4516 - 52}{.79} < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.5484 - 52}{.79}\right) = P(-4.4916 < z < -.5716) = .2838$

A Two-Tailed Prob-Value
$P\text{-value} = 2\,P(\bar{X} > 51.7 \mid \mu = 50) = 2\,P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} > \frac{51.7 - 50}{.79}\right) = 2\,P(z > 2.1519) = 2(.0157) = .0314$
Reject $H_0$ if $\alpha \ge .0314$.
$z_0 = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

A two-sided confidence interval: a study in comparison
Given: $\bar{X} = 51.7$ minutes, $\sigma = 2.5$ minutes, and $n = 10$
95% confidence interval: $z_{.025} = 1.96$
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 51.7 \pm 1.96(.79) = (50.1516, 53.2484)$
A 95% confidence interval identifies a set of acceptable hypotheses at the 5% level of significance. A mean of 50 lies outside the interval and is therefore rejected.
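The sample-size, two-tailed, and confidence-interval calculations above can be reproduced with a short Python sketch. The helper names (`sample_size_upper`, `two_sided_limits`, `beta_two_sided`, `p_value_two_sided`) are hypothetical, and the code uses unrounded normal quantiles and standard error, so its results differ slightly from the slide figures, which round $\sigma_{\bar{X}}$ to .79 and $z_{.01}$ to 2.33.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def sample_size_upper(mu0, mu1, sigma, alpha, beta):
    """n = (z_alpha + z_beta)^2 sigma^2 / (mu1 - mu0)^2 for a one-tailed test."""
    z_a = Z.inv_cdf(1 - alpha)
    z_b = Z.inv_cdf(1 - beta)
    return ((z_a + z_b) * sigma / (mu1 - mu0)) ** 2


def two_sided_limits(center, sigma, n, alpha):
    """center -/+ z_{alpha/2} * sigma / sqrt(n).

    Centered at mu0 this gives the critical values; centered at the
    sample mean it gives the confidence interval.
    """
    half_width = Z.inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return center - half_width, center + half_width


def beta_two_sided(mu_true, mu0, sigma, n, alpha):
    """P(sample mean falls between the critical values | true mean = mu_true)."""
    se = sigma / sqrt(n)
    lo, hi = two_sided_limits(mu0, sigma, n, alpha)
    return Z.cdf((hi - mu_true) / se) - Z.cdf((lo - mu_true) / se)


def p_value_two_sided(x_bar, mu0, sigma, n):
    """Two-tailed P-value: twice the tail area beyond |z0|."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - Z.cdf(abs(z0)))


n_required = ceil(sample_size_upper(50.0, 52.0, 2.5, alpha=0.01, beta=0.05))
print("required n:", n_required)

x_c1, x_c2 = two_sided_limits(50.0, 2.5, 10, 0.05)
print(f"reject H0 if sample mean < {x_c1:.4f} or > {x_c2:.4f}")

for mu1 in (49.0, 52.0):
    print(f"true mean {mu1}: beta = {beta_two_sided(mu1, 50.0, 2.5, 10, 0.05):.4f}")

p_two = p_value_two_sided(51.7, 50.0, 2.5, 10)
print(f"two-tailed P-value: {p_two:.4f}")

ci_lo, ci_hi = two_sided_limits(51.7, 2.5, 10, 0.05)
print(f"95% CI: ({ci_lo:.4f}, {ci_hi:.4f}); contains 50: {ci_lo <= 50.0 <= ci_hi}")
```

Using one helper for both the critical values and the confidence interval makes the comparison explicit: the two intervals have the same half-width and differ only in where they are centered.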
Confidence Intervals and Hypothesis Tests: together at last
[Figure: number line comparing the interval $\mu_0 \pm z_{\alpha/2}\sigma_{\bar{x}}$ centered at $\mu_0$ with the interval $\bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}}$ centered at $\bar{x}$.]

9-2 Tests on the Mean of a Normal Distribution, Variance Known
We wish to test: $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$
The test statistic is: $z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Reject $H_0$ if the observed value of the test statistic $z_0$ is either $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$.
Fail to reject $H_0$ if $-z_{\alpha/2} < z_0 < z_{\alpha/2}$.

9-2 Tests on the Mean of a Normal Distribution, Variance Known
Alternately:
$H_0: \mu = \mu_0$
$H_1: \mu \ne \mu_0$
$\bar{X}_{c1} = \mu_0 - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$; $\bar{X}_{c2} = \mu_0 + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
If $\bar{X} < \bar{X}_{c1}$ or $\bar{X} > \bar{X}_{c2}$, reject $H_0$.

Points to Ponder
• When we are considering Type II errors (beta), we use the distribution of the test statistic under the alternative hypothesis.
• When we are considering Type I errors (alpha), we use the distribution of the test statistic under the null hypothesis.

Statistical versus Practical Significance
• Statistical significance says nothing about the importance of the difference.
• There may be a statistically significant difference between two values with no practical difference:
  - mean of 50.4 driving minutes versus 49.7 driving minutes
  - large sample sizes will identify a difference
• There may be no statistically significant difference between two values even though there is a significant practical difference:
  - mean of .20 mm in the diameter of a ball-bearing versus .18 mm

Interactions: alpha, beta, sample size
• N fixed and $\alpha$ lowered: then $\beta$ rises.
• N fixed and $\alpha$ raised: then $\beta$ falls.
• N increased: then $\alpha$ and $\beta$ can both be lowered.
• Alpha/beta tradeoffs: a lower alpha value means a larger beta value.
• Power of a test is (1 - beta).
• Lower alpha implies we are reluctant to risk rejecting a true hypothesis, but it means we must risk accepting a false one.
• The only way to improve both is to increase the sample size.
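The alpha/beta/sample-size interaction can be tabulated directly. The sketch below is an illustration only: it reuses the driving-time setting (null mean 50, standard deviation 2.5) with an assumed true mean of 51, and the sample sizes 10 and 40 and the function name `beta_upper` are choices made here for the example, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta_upper(mu_true, mu0, sigma, n, alpha):
    """Type II error probability of the upper-tailed z-test at a given true mean."""
    se = sigma / sqrt(n)
    x_c = mu0 + Z.inv_cdf(1 - alpha) * se     # critical sample mean
    return Z.cdf((x_c - mu_true) / se)


mu0, mu_true, sigma = 50.0, 51.0, 2.5         # assumed true mean of 51

print(f"{'n':>4} {'alpha':>6} {'beta':>7} {'power':>7}")
for n in (10, 40):
    for alpha in (0.10, 0.05, 0.01):
        b = beta_upper(mu_true, mu0, sigma, n, alpha)
        print(f"{n:>4} {alpha:>6.2f} {b:>7.4f} {1 - b:>7.4f}")
```

Reading down a block with n fixed, beta grows as alpha shrinks; comparing the two blocks at the same alpha, the larger sample gives the smaller beta.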
On the selection of the level of significance
• Convention is to use .01 or .05.
• Consider the practical consequences of making a Type I or II error.
• Consider the power of the test and the sample size:
  - Large N: a small difference will be statistically significant; use a small $\alpha$ (.01 - .001)
  - Small N: large differences may not be detected; use a large $\alpha$ (.05 - .10)
• Consider the "true" difference.
• Type I versus Type II errors.
• Use the P-value and let the reader decide: "I'm just reporting the facts; you decide."

General Procedures for Hypothesis Tests
1. Identify the parameter of interest.
2. State the null hypothesis, $H_0$.
3. Specify the alternative, $H_1$.
4. Choose the significance level, alpha (the risk of a Type I error).
5. Determine the appropriate test statistic.
6. State the rejection region for the statistic.
7. Compute the sample quantities (i.e., from the experiment or measurement) and substitute into the equation for the test statistic.
8. Decide whether to reject $H_0$.

A Little Philosophy
Consider the following: $H_0: \mu = 10$, $H_1: \mu > 10$, $\bar{X}_c = 11.7$; $\alpha = .05$
• If we reject the null hypothesis:
  - either $H_1$ is true
  - or we were extremely unlucky and hit on the 5 percent of the samples that fall in the critical region
  - we go with the odds and reject the null
• If we fail to reject the null (assume x-bar = 11.1):
  - $H_0$ is still left standing at the end of the test
  - the alternative hypothesis is what we wish to prove and believe to be correct
  - the sample supports $H_1$ but the test does not allow us to reject $H_0$
  - therefore, we conclude that the evidence does not allow us to reject $H_0$, stopping short of saying we accept $H_0$
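The fail-to-reject case can be made concrete by reporting a P-value, in the spirit of "let the reader decide." The slides give the critical value 11.7 and the significance level .05 but not the standard deviation or sample size, so the sketch below backs the standard error out of those two numbers; that inferred value is an assumption of this example, not something stated in the source.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

mu0, x_c, alpha = 10.0, 11.7, 0.05
x_bar = 11.1  # the sample mean assumed in the fail-to-reject case

# Assumption: for an upper-tailed test, x_c = mu0 + z_alpha * se,
# so the standard error implied by the stated critical value is:
se = (x_c - mu0) / Z.inv_cdf(1 - alpha)

z0 = (x_bar - mu0) / se
p_value = 1 - Z.cdf(z0)  # upper-tail area beyond the observed statistic

print(f"implied standard error: {se:.3f}")
print(f"z0 = {z0:.3f}, P-value = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The sample mean sits above 10, so it leans toward the alternative, but the P-value is well above .05; this is the situation the slide describes as supporting $H_1$ without allowing rejection of $H_0$.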
Large Sample Test
• In most situations, the population variance is unknown and the population may not be well modeled as a normal distribution.
• If n is large (n > 40), the sample standard deviation, s, can be substituted for $\sigma$ with little effect, appealing to the central limit theorem.
• Exact tests where the population is normal, $\sigma^2$ is unknown, and n is small result in the t-distribution.

A Little Recap
Tests on a mean, variance known, normal population or large sample size (CLT):
• $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$:
  - $\bar{X}_c = \mu_0 - z_{\alpha/2}\frac{(\sigma \text{ or } s)}{\sqrt{n}}$; reject if $\bar{X} < \bar{X}_c$
  - $\bar{X}_c = \mu_0 + z_{\alpha/2}\frac{(\sigma \text{ or } s)}{\sqrt{n}}$; reject if $\bar{X} > \bar{X}_c$
• $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$: $\bar{X}_c = \mu_0 + z_\alpha\frac{(\sigma \text{ or } s)}{\sqrt{n}}$; reject if $\bar{X} > \bar{X}_c$
• $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$: $\bar{X}_c = \mu_0 - z_\alpha\frac{(\sigma \text{ or } s)}{\sqrt{n}}$; reject if $\bar{X} < \bar{X}_c$

Next Time
Time Permitting
