Introduction to Hypothesis Testing

1 Introduction

The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.

Examples
• Is there statistical evidence, in a random sample of potential customers, to support the hypothesis that more than 10% of the potential customers will purchase a new product?
• Is a new drug effective in curing a certain disease? A sample of patients is randomly selected. Half of them are given the drug while the other half are given a placebo. The improvement in the patients' conditions is then measured and compared.

2 Concepts of Hypothesis Testing

The critical concepts of hypothesis testing.

Example:
• An operations manager needs to determine if the mean demand during lead time is greater than 350.
• If so, changes in the ordering policy are needed.

There are two hypotheses about a population mean:
• H0: The null hypothesis, \(\mu = 350\)
• H1: The alternative hypothesis, \(\mu > 350\)

• Assume the null hypothesis is true (\(\mu = 350\)).
– Sample from the demand population, and build a statistic related to the parameter hypothesized (the sample mean \(\bar{x}\)).
– Pose the question: How probable is it to obtain a sample mean at least as extreme as the one observed from the sample, if H0 is correct?

• Two possible sample outcomes illustrate the reasoning:
– \(\bar{x} = 450\): Since \(\bar{x}\) is much larger than 350, the mean \(\mu\) is likely to be greater than 350. Reject the null hypothesis.
– \(\bar{x} = 355\): Since \(\bar{x}\) is only slightly larger than 350, there is not enough evidence that the mean \(\mu\) is greater than 350. Do not reject the null hypothesis.

Types of Errors
• Two types of errors may occur when deciding whether to reject H0 based on the statistic value.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
• Example continued
– Type I error: Reject H0 (\(\mu = 350\)) in favor of H1 (\(\mu > 350\)) when the real value of \(\mu\) is 350.
– Type II error: Do not reject H0 (\(\mu = 350\)) when the real value of \(\mu\) is greater than 350.

Controlling the probability of committing a Type I error
• Recall: H0: \(\mu = 350\) and H1: \(\mu > 350\).
• H0 is rejected if \(\bar{x}\) is sufficiently large.
• Thus, a Type I error is made if \(\bar{x} \ge\) the critical value when \(\mu = 350\).
• By properly selecting the critical value we can limit the probability of committing a Type I error to an acceptable level.

3 Testing the Population Mean When the Population Standard Deviation is Known

• Example 1
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 400 accounts has a mean of $178.
– If accounts are approximately normally distributed with \(\sigma\) = $65, can we conclude that the new system will be cost-effective?

Example 1 – Solution
• The population of interest is the credit accounts at the store.
• We want to know whether the mean account for all customers is greater than $170, so H1: \(\mu > 170\).
• The null hypothesis must specify a single value of the parameter \(\mu\), so H0: \(\mu = 170\).

Approaches to Testing
There are two approaches to testing whether the sample mean supports the alternative hypothesis (H1):
• The rejection region method, which is required for manual testing (but can also be used when testing is supported by statistical software).
• The p-value method, which is mostly used when statistical software is available.

The Rejection Region Method
The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected in favor of the alternative hypothesis.

The Rejection Region Method for a Right-Tail Test
Example 1 – solution continued
• Recall: H0: \(\mu = 170\), H1: \(\mu > 170\).
• It seems reasonable to reject the null hypothesis and believe that \(\mu > 170\) if the sample mean is sufficiently large.
[Figure: sampling distribution of the sample mean; the region labeled "Reject H0 here" lies to the right of the critical value of the sample mean.]

• Define a critical value \(\bar{x}_L\) for \(\bar{x}\) that is just large enough to reject the null hypothesis.
• Reject the null hypothesis if \(\bar{x} \ge \bar{x}_L\).

Determining the Critical Value for the Rejection Region
• Let the probability of committing a Type I error be \(\alpha\) (also called the significance level).
• Find the value of the sample mean that is just large enough so that the actual probability of committing a Type I error does not exceed \(\alpha\).

Determining the Critical Value for a Right-Tail Test
Example 1 – solution continued

P(commit a Type I error) = P(reject H0 given that H0 is true) = P(\(\bar{x} \ge \bar{x}_L\) given that H0 is true), which is allowed to be \(\alpha\).

Since \(P(Z \ge z_\alpha) = \alpha\), we have:
\[ \frac{\bar{x}_L - 170}{65/\sqrt{400}} = z_\alpha, \qquad \text{so} \qquad \bar{x}_L = 170 + z_\alpha \frac{65}{\sqrt{400}}. \]
If we select \(\alpha = 0.05\), then \(z_{.05} = 1.645\) and
\[ \bar{x}_L = 170 + 1.645 \cdot \frac{65}{\sqrt{400}} = 175.34. \]

Reject the null hypothesis if \(\bar{x} \ge 175.34\).

Conclusion: Since the sample mean (178) is greater than the critical value of 175.34, there is sufficient evidence to infer that the mean monthly balance is greater than $170 at the 5% significance level.

The standardized test statistic
Instead of using the statistic \(\bar{x}\), we can use the standardized value z:
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \]
Then the rejection region for a one-tail test becomes \(z \ge z_\alpha\).

Example 1 – continued
We redo this example using the standardized test statistic.
• Recall: H0: \(\mu = 170\), H1: \(\mu > 170\).
• Test statistic:
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{178 - 170}{65/\sqrt{400}} = 2.46 \]
• Rejection region: \(z \ge z_{.05} = 1.645\).

Conclusion: Since z = 2.46 > 1.645, reject the null hypothesis in favor of the alternative hypothesis.
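The two versions of the rejection region method for Example 1 can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `right_tail_z_test` and its parameters are illustrative choices and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Right-tail z test of H0: mu = mu0 against H1: mu > mu0 (sigma known).

    Returns the critical sample mean, the z statistic, and the decision.
    Hypothetical helper written for illustration only.
    """
    std_err = sigma / sqrt(n)                  # standard deviation of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value z_alpha
    x_crit = mu0 + z_alpha * std_err           # critical value of the sample mean
    z = (x_bar - mu0) / std_err                # standardized test statistic
    return x_crit, z, z >= z_alpha


# Example 1: n = 400 accounts, sample mean 178, sigma = 65, alpha = 0.05
x_crit, z, reject = right_tail_z_test(x_bar=178, mu0=170, sigma=65, n=400, alpha=0.05)
print(f"critical value = {x_crit:.2f}, z = {z:.2f}, reject H0: {reject}")
```

Because the sketch uses the unrounded value of \(z_{.05}\), the computed critical value can differ from the slide's 175.34 in the second decimal place; the decision is unaffected.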
P-value Method
The p-value provides information about the amount of statistical evidence that supports the alternative hypothesis.
– The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
– Let us demonstrate the concept on Example 1.

The probability of observing a sample mean at least as extreme as 178, given that \(\mu = 170\), is
\[ P(\bar{x} \ge 178 \text{ when } \mu = 170) = P\left(Z \ge \frac{178 - 170}{65/\sqrt{400}}\right) = P(Z \ge 2.4615) = .0069. \]
This probability is the p-value.

Interpreting the p-value
• Because the probability that the sample mean will assume a value of at least 178 when \(\mu = 170\) is so small (.0069), there are reasons to believe that \(\mu > 170\).
• Note how the event \(\bar{x} \ge 178\) is rare under H0, when \(\mu_{\bar{x}} = 170\), but becomes more probable under H1, when \(\mu_{\bar{x}} > 170\).
• We can conclude that the smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.

Describing the p-value
– If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
– If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
– If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
– If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.

The p-value and the Rejection Region Methods
The p-value can be used when making decisions based on rejection region methods as follows:
• Define the hypotheses to test, and the required significance level \(\alpha\).
• Perform the sampling procedure, and calculate the test statistic and the p-value associated with it.
• Compare the p-value to \(\alpha\). Reject the null hypothesis only if p-value < \(\alpha\); otherwise, do not reject the null hypothesis.
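The p-value procedure above can be sketched in a few lines of Python. This is an illustration under the assumptions of Example 1 (normal population, known standard deviation); the helper name `right_tail_p_value` is hypothetical and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_p_value(x_bar, mu0, sigma, n):
    """P(sample mean >= x_bar) when H0: mu = mu0 is true (sigma known).

    Hypothetical helper written for illustration only.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))  # standardized test statistic
    return 1 - NormalDist().cdf(z)         # upper-tail area beyond z


# Example 1: sample mean 178, H0: mu = 170, sigma = 65, n = 400
alpha = 0.05
p_value = right_tail_p_value(x_bar=178, mu0=170, sigma=65, n=400)

# Decision rule from the slides: reject H0 only if the p-value is below alpha.
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

Comparing the p-value with the significance level gives the same decision as comparing z with the critical value, since both describe the same tail area.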
[Figure: for Example 1 with \(\alpha = 0.05\), the p-value is the area to the right of \(\bar{x} = 178\), which lies beyond the critical value \(\bar{x}_L = 175.34\).]

Conclusions of a Test of Hypothesis
• If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
• If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.
• The alternative hypothesis is the more important one. It represents what we are investigating.

A Two-Tail Test
Example 2
• AT&T has been challenged by competitors who argued that their own rates resulted in lower bills.
• A statistics practitioner determines that the mean and standard deviation of monthly long-distance bills for all AT&T residential customers are $17.09 and $3.87, respectively.
• A random sample of 100 customers is selected and the customers' bills are recalculated using a leading competitor's rates (see Xm11-02).
• Assuming the standard deviation is the same ($3.87), can we infer that there is a difference between AT&T's bills and the competitor's bills (on average)?

Solution
• Is the mean different from 17.09? H0: \(\mu = 17.09\), H1: \(\mu \ne 17.09\).
• Define the rejection region: \(z \ge z_{\alpha/2}\) or \(z \le -z_{\alpha/2}\).
• If H0 is true (\(\mu = 17.09\)), \(\bar{x}\) can still fall far above or far below 17.09, in which case we erroneously reject H0 in favor of H1 (\(\mu \ne 17.09\)). We want this erroneous rejection of H0 to be a rare event, say a 5% chance, so \(\alpha/2 = 0.025\) in each tail.
• From the sample we have \(\bar{x} = 17.55\), so
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19. \]
• With \(z_{\alpha/2} = 1.96\), the rejection region is \(z \le -1.96\) or \(z \ge 1.96\). Since \(-1.96 < 1.19 < 1.96\), the test statistic does not fall in the rejection region.

There is insufficient evidence to infer that there is a difference between the bills of AT&T and the competitor.
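The rejection-region decision for Example 2 can be reproduced with a short Python sketch. It assumes, as the example does, a known standard deviation of 3.87; the function name `two_tail_z_test` is a hypothetical label used only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Two-tail z test of H0: mu = mu0 against H1: mu != mu0 (sigma known).

    Returns the z statistic, the critical value z_{alpha/2}, and the decision.
    Hypothetical helper written for illustration only.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha is split between both tails
    return z, z_crit, abs(z) >= z_crit


# Example 2: sample mean 17.55 from n = 100 bills, H0: mu = 17.09, sigma = 3.87
z, z_crit, reject = two_tail_z_test(x_bar=17.55, mu0=17.09, sigma=3.87, n=100, alpha=0.05)
print(f"z = {z:.2f}, rejection region: |z| >= {z_crit:.2f}, reject H0: {reject}")
```

Testing `abs(z)` against a single critical value is equivalent to checking both tails, because the standard normal distribution is symmetric about zero.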
Also, by the p-value approach:
\[ \text{p-value} = P(Z < -1.19) + P(Z > 1.19) = 2(.1173) = .2346 > .05 \]

4 Calculating the Probability of a Type II Error
To properly interpret the results of a test of hypothesis, we need to
• specify an appropriate significance level or judge the p-value of a test;
• understand the relationship between Type I and Type II errors.
How do we compute the probability of a Type II error?

Calculation of the Probability of a Type II Error
To calculate the probability of a Type II error we need to
• express the rejection region directly, in terms of the parameter hypothesized (not standardized);
• specify the alternative value under H1.

Let us revisit Example 1:
– The rejection region was \(\bar{x} \ge 175.34\) with \(\alpha = .05\).
– Let the alternative value be \(\mu = 180\) (rather than just \(\mu > 170\)), so H0: \(\mu = 170\) and H1: \(\mu = 180\).
– A Type II error occurs when a false H0 is not rejected, that is, when \(\bar{x} < 175.34\) although \(\mu = 180\).

\[ \beta = P(\bar{x} < 175.34 \text{ given that H0 is false}) = P(\bar{x} < 175.34 \text{ given that } \mu = 180) = P\left(Z < \frac{175.34 - 180}{65/\sqrt{400}}\right) = .0764 \]

Effects on \(\beta\) of changing \(\alpha\)
Decreasing the significance level \(\alpha\) increases the value of \(\beta\), and vice versa: if \(\alpha_2 > \alpha_1\), then \(\beta_2 < \beta_1\).

Judging the Test
• A hypothesis test is effectively defined by the significance level \(\alpha\) and by the sample size n.
• If the probability of a Type II error is judged to be too large, we can reduce it by increasing \(\alpha\) and/or increasing the sample size.
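The Type II error calculation for Example 1, and the trade-off with the significance level, can be sketched in Python as follows. The function name `type_ii_error` is an assumption made for this illustration, and the second significance level (0.01) is an arbitrary value chosen to show the direction of the effect; neither comes from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Probability of not rejecting H0: mu = mu0 when the true mean is mu1 > mu0.

    Right-tail test with sigma known. Hypothetical helper for illustration only.
    """
    std_err = sigma / sqrt(n)
    # Rejection region expressed directly in terms of the sample mean.
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * std_err
    # Under mu = mu1 the sample mean is normal with mean mu1 and sd std_err;
    # a Type II error occurs when it falls below the critical value.
    return NormalDist(mu1, std_err).cdf(x_crit)


# Example 1 with the alternative value mu = 180
beta_05 = type_ii_error(mu0=170, mu1=180, sigma=65, n=400, alpha=0.05)
# Same test with a smaller significance level (illustrative value only)
beta_01 = type_ii_error(mu0=170, mu1=180, sigma=65, n=400, alpha=0.01)

print(f"alpha = 0.05: beta = {beta_05:.4f}")
print(f"alpha = 0.01: beta = {beta_01:.4f}")
```

The first result is close to the .0764 on the slide (small differences come from rounding the critical value and z), and the second is larger, consistent with the statement that decreasing the significance level increases the probability of a Type II error.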
Increasing the sample size reduces \(\beta\)
Recall:
\[ z_\alpha = \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}, \qquad \text{thus} \qquad \bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}}. \]
By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, \(\bar{x}_L\) decreases.

Note what happens when n increases: \(\alpha\) does not change, but \(\beta\) becomes smaller.

In Example 1, suppose n increases from 400 to 1000.
\[ \bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}} = 170 + 1.645 \cdot \frac{65}{\sqrt{1000}} = 173.38 \]
\[ \beta = P\left(Z < \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z < -3.22) \approx 0 \]
• \(\alpha\) remains 5%, but the probability of a Type II error drops dramatically.

Power of a test
The power of a test is defined as \(1 - \beta\). It represents the probability of rejecting the null hypothesis when it is false.
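To see how the power changes with the sample size, the calculation can be repeated for n = 400 and n = 1000. This is a minimal Python sketch under the assumptions of Example 1 with the alternative value 180; the function name `power` is a hypothetical label and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def power(mu0, mu1, sigma, n, alpha):
    """Power (1 - beta) of the right-tail z test of H0: mu = mu0 when mu = mu1.

    Hypothetical helper written for illustration only.
    """
    std_err = sigma / sqrt(n)
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * std_err  # critical sample mean
    beta = NormalDist(mu1, std_err).cdf(x_crit)               # P(do not reject | mu = mu1)
    return 1 - beta


# Example 1 with alpha fixed at 5% and two sample sizes
power_400 = power(mu0=170, mu1=180, sigma=65, n=400, alpha=0.05)
power_1000 = power(mu0=170, mu1=180, sigma=65, n=1000, alpha=0.05)

print(f"n = 400:  power = {power_400:.4f}")
print(f"n = 1000: power = {power_1000:.4f}")
```

With the significance level held fixed, the larger sample moves the critical value closer to 170 and narrows the sampling distribution, so the power rises toward 1.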