STAT 215 Fall 2006 L20

TESTING HYPOTHESIS

* A statistical hypothesis is a claim about the value of a single population characteristic.
* Usually in any hypothesis testing problem, there are two contradictory hypotheses under consideration.
* The claim or research hypothesis that we wish to establish is called the alternative hypothesis H1.
* The opposite statement, one that nullifies the research hypothesis H1, is called the null hypothesis H0.
* A test procedure is a rule, based on the sample data, for deciding whether to reject H0.

The test procedure is specified by

1). a test statistic, a function of the sample data on which the decision (reject H0 or do not reject H0) is to be based.
2). a rejection region, the set of all test statistic values for which H0 will be rejected.

* The null hypothesis H0 will be rejected if and only if the observed or computed test statistic value falls in the rejection region.

Example 1. Suppose a cigarette manufacturer claims that the average nicotine content µ of brand B cigarettes is at most 1.5 mg. The claim to be established gives the alternative hypothesis H1 : µ < 1.5, and the null hypothesis is H0 : µ = 1.5.

3). Errors in hypothesis testing

* A Type I error consists of rejecting the null hypothesis H0 when it is true.
* A Type II error involves not rejecting the null hypothesis H0 when it is false.

4). Level of significance

The probability of making a Type I error, α = P (Type I error), is called the level of significance. Usually, the specified level of significance is α = .05 or .01.

Power of a Test: 1 - P (Type II error) = 1 - β

Problems:

1. Stated here are some claims or research hypotheses that are to be substantiated by sample data. In each case, identify the null hypothesis H0 and the alternative hypothesis H1 in terms of the population mean µ.

Examples:

(a) The average mathematics score of the college-bound students in Milwaukee who participated in the American College Testing (ACT) program in 1996 was higher than 17.2.
(b) The mean time for an airline passenger to obtain his or her luggage, once luggage starts coming out on the conveyor belt, is less than 210 seconds.
(c) The fat content of a name-brand chocolate ice cream is more than 4%, the amount printed on the label.
(d) The average weight of a brand of motors is different from the manufacturer’s target of 6 pounds.

LARGE SAMPLE CASE

Assume that we have some population with unknown mean µ and standard deviation σ. Suppose X1, ..., Xn is a random sample from this population, where n is the sample size. Let us consider the case when n ≥ 30. To test the hypothesis H0 : µ = µ0 versus the alternative H1 : µ < µ0 (or µ > µ0, or µ ≠ µ0), consider the standardized random variable

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}},$$

where, as before,

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}.$$

Definitions and statistical concepts:

1. Null hypothesis H0 : µ = µ0; the alternative hypothesis H1 : µ < µ0 (or H1 : µ > µ0 or, for example, H1 : µ ≠ µ0).
2. Type I and Type II errors:
   Type I error: H0 is true and we reject H0.
   Type II error: H1 is true and we do not reject H0.
3. The level of significance: α = P (Type I error).
4. The power of a test: 1 - P (Type II error) = 1 - β.
5. Rejection region: Z ≤ −zα for H1 : µ < µ0 (or Z ≥ zα for H1 : µ > µ0), where Z is the test statistic

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

and zα is the upper α point of the standard normal distribution.

Our general aim in hypothesis testing is to use statistics (tests) that make α and β as small as possible. These two goals conflict: for a fixed sample size, making α smaller generally makes β larger. Instead, our general strategy is to fix α at some specific level, for example α = .05 or .01, and to use the test that maximizes the power of the test.

When H0 is true and n ≥ 30, the statistic Z has approximately the standard normal distribution (z-distribution). From the corresponding table we can determine the critical value zα : P (Z ≥ zα ) = α.

The following sequence of steps is recommended in the hypothesis testing analysis:

1. Identify the parameter of interest and describe it in the context of the problem situation;
2. Determine the null value and state the null hypothesis H0;
3. State the appropriate alternative hypothesis H1;
4. Give the formula for computing the value of the test statistic (T or Z statistic);
5. State the rejection region R for the specified significance level α (with specified values zα or tα );
6. Compute any necessary sample quantities (sample mean, sample standard deviation), substitute them into the formula for the test statistic and compute that value;
7. Decide whether H0 should be rejected and state this conclusion in the problem context.

Testing of statistical hypotheses about population mean µ. Small samples

Assume that the population distribution is normal with unknown mean µ and standard deviation σ. Suppose X1, ..., Xn is a random sample from this population with n < 30. To test the hypothesis H0 : µ = µ0 versus the alternative H1 : µ > µ0, consider the standardized random variable

$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}},$$

where, as before,

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}.$$

When H0 is true, the T statistic has a t-distribution (Student’s distribution) with degrees of freedom (d.f.) = n - 1. This knowledge allows us to construct the rejection region R : T ≥ tα , such that

P (Type I error) = P (H0 is rejected when it is true) = P (T ≥ tα ) = α.

Here α is the specified level of significance (usually .05 or .01).

Remark. The test statistic is the same as in the large sample case but is labeled T to emphasize that its distribution is the t-distribution with d.f. n − 1 rather than the standard normal z-distribution.

Examples:

1. A random sample of size 20 from a normal population has x̄ = 182 and s = 2.3. Test H0 : µ = 181 against H1 : µ > 181 with α = .05.

The null hypothesis is H0 : µ = 181 and the alternative hypothesis is H1 : µ > 181. The test statistic is

$$T = \frac{\bar{X} - 181}{S/\sqrt{20}}.$$

Since the sample size is n = 20, the statistic T has a t-distribution with d.f. = 19.
The specified level of significance is α = .05. The upper .05 point of the t-distribution with d.f. = 19 is t.05 = 1.729, so the rejection region is R : T ≥ 1.729. Comparing the calculated value of T with t.05 = 1.729, we get

$$T = \frac{\bar{X} - 181}{S/\sqrt{20}} = \frac{182 - 181}{2.3/\sqrt{20}} = 1.94 > 1.729,$$

i.e. the value of T lies in the rejection region R. Therefore we reject the null hypothesis H0 in favor of the alternative hypothesis H1 : µ > 181.
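
The arithmetic in this example can be checked with a short Python sketch. It is only an illustration, not part of the lecture: the function names `t_statistic` and `reject_upper_tail` and the constant `T_CRITICAL` are hypothetical, and the critical value 1.729 is copied from the t table as quoted in the example rather than computed.

```python
import math


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample statistic T = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


def reject_upper_tail(statistic, critical_value):
    """Upper-tailed rule: reject H0 when the statistic is >= the critical value."""
    return statistic >= critical_value


# Numbers from the example: x_bar = 182, mu0 = 181, s = 2.3, n = 20.
t_value = t_statistic(182, 181, 2.3, 20)

# Upper .05 point of the t-distribution with 19 d.f., taken from the table
# value quoted in the example (an input here, not computed by this code).
T_CRITICAL = 1.729

print(f"T = {t_value:.2f}")
print("reject H0" if reject_upper_tail(t_value, T_CRITICAL) else "do not reject H0")
```

The same two functions apply to the large sample case if the critical value tα is replaced by zα from the standard normal table.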