Chapter 9
Hypothesis Testing

Null Hypothesis
The null hypothesis is a statement about the population value that will be tested. The null hypothesis is held true unless sufficient evidence to the contrary is obtained.

Alternative Hypothesis
The alternative hypothesis is the hypothesis that includes all population values not covered by the null hypothesis. The alternative hypothesis is held true if the null hypothesis is rejected or held false.

Simple and Composite Hypotheses
A simple hypothesis is one that specifies a single value for the population parameter of interest. A composite hypothesis is one that specifies a range of values for the population parameter.

One-Sided and Two-Sided Alternatives
A one-sided alternative is an alternative hypothesis involving all possible values of a population parameter on either one side or the other of the value specified by the null hypothesis. A two-sided alternative is an alternative hypothesis involving all possible values of a population parameter other than the value specified by a simple null hypothesis.

States of Nature and Decisions on Null Hypothesis (Table 9.1)

Decision on null hypothesis    Null hypothesis is true                    Null hypothesis is false
Accept (fail to reject)        Correct decision, probability $1-\alpha$   Type II error, probability $\beta$
Reject                         Type I error, probability $\alpha$         Correct decision, probability $1-\beta$

Here $\alpha$ is called the significance level and $1-\beta$ is called the power.

Type I and Type II Errors
A Type I error is the rejection of a true null hypothesis. A Type II error is the failure to reject a false null hypothesis.

Significance Level
The significance level is the probability of rejecting a null hypothesis that is true. This is sometimes expressed as a percentage, so a test of significance level $\alpha$ is referred to as a $100\alpha\%$-level test.

Power
The power of a test is the probability of rejecting a null hypothesis that is false.

Consequences of Fixing the Significance Level of a Test (Figure 9.1)
1. The investigator chooses the significance level (the probability of a Type I error).
2. A decision rule is established.
3. The probability of a Type II error follows.

A Test of the Mean of a Normal Population: Population Variance Known
Suppose we have a random sample of $n$ observations from a normal population with mean $\mu$ and known variance $\sigma^2$. If the observed sample mean is $\bar{X}$, the test with significance level $\alpha$ of the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
is obtained from the decision rule
$$\text{Reject } H_0 \text{ if } Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > Z_\alpha$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} > \mu_0 + Z_\alpha\,\sigma/\sqrt{n}$$
where $Z_\alpha$ is the number for which $P(Z > Z_\alpha) = \alpha$ and $Z$ is the standard normal random variable.

Interpretation of the Probability Value or p-value
The probability value or p-value is the smallest significance level at which the null hypothesis can be rejected. Consider a random sample of $n$ observations from a population that has a normal distribution with mean $\mu$ and standard deviation $\sigma$, and the resulting computed sample mean $\bar{X}$. We are asked to test the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative hypothesis
$$H_1: \mu > \mu_0$$
The p-value for the test is
$$\text{p-value} = P\!\left(\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \ge Z_p \;\middle|\; H_0: \mu = \mu_0\right)$$
where $Z_p$ is the standard normal value associated with the smallest significance level at which the null hypothesis can be rejected. The p-value is regularly computed by most statistical computer programs and provides more information about the test, based on the observed sample mean.

A Test of the Mean of a Normal Population (Variance Known): Composite Null and Alternative Hypothesis
The appropriate procedure for testing, at significance level $\alpha$, the null hypothesis
$$H_0: \mu \le \mu_0$$
against the alternative hypothesis
$$H_1: \mu > \mu_0$$
is precisely the same as when the null hypothesis is $H_0: \mu = \mu_0$. In addition, the p-values are also computed in exactly the same way.
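
The upper-tail decision rule and its p-value can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original slides: the function name `upper_tail_z_test` and the sample numbers are hypothetical, and the standard normal quantile comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 (or mu <= mu0) against H1: mu > mu0 with sigma known."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    std_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / std_error
    z_alpha = std_normal.inv_cdf(1 - alpha)   # number with P(Z > z_alpha) = alpha
    critical_mean = mu0 + z_alpha * std_error  # equivalent rule on the sample mean
    p_value = 1 - std_normal.cdf(z)            # upper-tail probability beyond z
    return {
        "z": z,
        "z_alpha": z_alpha,
        "critical_mean": critical_mean,
        "p_value": p_value,
        "reject": z > z_alpha,
    }


# Hypothetical numbers, chosen only for illustration.
result = upper_tail_z_test(sample_mean=5.05, mu0=5.0, sigma=0.1, n=16)
print(result)
```

Rejecting when `z > z_alpha` and rejecting when the p-value is below `alpha` are the same decision, which is why the p-value is described as the smallest significance level at which the null hypothesis can be rejected.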

A Test of the Mean of a Normal Distribution (Variance Known): Composite or Simple Null and Alternative Hypothesis
The appropriate procedure for testing, at significance level $\alpha$, the null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \ge \mu_0$$
against the alternative
$$H_1: \mu < \mu_0$$
uses the decision rule
$$\text{Reject } H_0 \text{ if } Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} < -Z_\alpha$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} < \bar{X}_c = \mu_0 - Z_\alpha\,\sigma/\sqrt{n}$$
where $-Z_\alpha$ is the number for which $P(Z < -Z_\alpha) = \alpha$ and $Z$ is the standard normal random variable. In addition, the p-values can also be computed by using the lower tail probabilities.

A Test of the Mean of a Normal Distribution Against Two-Sided Alternative: Variance Known
The appropriate procedure for testing, at significance level $\alpha$, the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative hypothesis
$$H_1: \mu \ne \mu_0$$
is obtained from the decision rule
$$\text{Reject } H_0 \text{ if } Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} < -Z_{\alpha/2} \quad \text{or} \quad Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > Z_{\alpha/2}$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} < \mu_0 - Z_{\alpha/2}\,\sigma/\sqrt{n} \quad \text{or} \quad \bar{X} > \mu_0 + Z_{\alpha/2}\,\sigma/\sqrt{n}$$
In addition, the p-value can be computed by doubling the corresponding tail probability, so that it refers to the sum of the upper and lower tail probabilities for the positive and negative values of $Z$. The p-value for the two-tailed test is
$$\text{p-value} = 2\,P\!\left(\left|\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\right| > Z_{p/2} \;\middle|\; H_0: \mu = \mu_0\right)$$
where $Z_{p/2}$ is the standard normal value associated with the smallest probability of rejecting the null hypothesis at either tail of the probability distribution.

A Test of the Mean of a Normal Distribution: Population Variance Unknown
Suppose we have a random sample of $n$ observations from a normal population with mean $\mu$.
Using the sample mean $\bar{X}$ and sample standard deviation $s$, we can use the following tests with significance level $\alpha$.

(i) To test either null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \le \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} > t_{n-1,\alpha}$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} > \bar{X}_c = \mu_0 + t_{n-1,\alpha}\,s/\sqrt{n}$$

(ii) To test either null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \ge \mu_0$$
against the alternative
$$H_1: \mu < \mu_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha}$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} < \bar{X}_c = \mu_0 - t_{n-1,\alpha}\,s/\sqrt{n}$$

(iii) To test the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu \ne \mu_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha/2} \quad \text{or} \quad t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} > t_{n-1,\alpha/2}$$
or, equivalently,
$$\text{Reject } H_0 \text{ if } \bar{X} < \mu_0 - t_{n-1,\alpha/2}\,s/\sqrt{n} \quad \text{or} \quad \bar{X} > \mu_0 + t_{n-1,\alpha/2}\,s/\sqrt{n}$$
where $t_{n-1,\alpha/2}$ is the Student t-value for $n-1$ degrees of freedom and upper tail probability $\alpha/2$. The p-values for these tests are computed in the same way as for tests with known variance, except that the Student t value is substituted for the normal $Z$ value.

Tests of the Population Proportion (Large Sample Size)
We begin by assuming a random sample of $n$ observations from a population in which a proportion $\pi$ of the members possess a particular attribute.
If $n\pi(1-\pi) > 9$ and the sample proportion is $p$, the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \pi = \pi_0 \quad \text{or} \quad H_0: \pi \le \pi_0$$
against the alternative
$$H_1: \pi > \pi_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } Z = \frac{p-\pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} > Z_\alpha$$

(ii) To test either null hypothesis
$$H_0: \pi = \pi_0 \quad \text{or} \quad H_0: \pi \ge \pi_0$$
against the alternative
$$H_1: \pi < \pi_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } Z = \frac{p-\pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} < -Z_\alpha$$

(iii) To test the null hypothesis
$$H_0: \pi = \pi_0$$
against the two-sided alternative
$$H_1: \pi \ne \pi_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } Z = \frac{p-\pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} < -Z_{\alpha/2} \quad \text{or} \quad Z = \frac{p-\pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} > Z_{\alpha/2}$$
For all of these tests the p-value is the smallest significance level at which the null hypothesis can be rejected.

Tests of Variance of a Normal Population
Suppose we have a random sample of $n$ observations from a normally distributed population with variance $\sigma^2$. If we observe the sample variance $s_x^2$, then the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \sigma^2 = \sigma_0^2 \quad \text{or} \quad H_0: \sigma^2 \le \sigma_0^2$$
against the alternative
$$H_1: \sigma^2 > \sigma_0^2$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{(n-1)s_x^2}{\sigma_0^2} > \chi^2_{n-1,\alpha}$$

(ii) To test either null hypothesis
$$H_0: \sigma^2 = \sigma_0^2 \quad \text{or} \quad H_0: \sigma^2 \ge \sigma_0^2$$
against the alternative
$$H_1: \sigma^2 < \sigma_0^2$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{(n-1)s_x^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha}$$

(iii) To test the null hypothesis
$$H_0: \sigma^2 = \sigma_0^2$$
against the alternative
$$H_1: \sigma^2 \ne \sigma_0^2$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{(n-1)s_x^2}{\sigma_0^2} > \chi^2_{n-1,\alpha/2} \quad \text{or} \quad \frac{(n-1)s_x^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha/2}$$
where $\chi^2_{n-1}$ is a chi-square random variable and $P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$. The p-value for these tests is the smallest significance level at which the null hypothesis can be rejected given the sample variance.
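
The large-sample proportion tests described earlier in this section can be sketched as follows. This is a hedged illustration rather than anything from the slides: the function name `proportion_z_test`, the `alternative` argument, and the sample numbers are hypothetical, and the large-sample condition is checked at the null value $\pi_0$, which is an assumption about how the rule of thumb is applied.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, pi0, n, alpha=0.05, alternative="greater"):
    """Large-sample z test for a population proportion.

    alternative is "greater" (H1: pi > pi0), "less" (H1: pi < pi0),
    or "two-sided" (H1: pi != pi0).
    """
    # Assumption: the rule of thumb n*pi*(1 - pi) > 9 is checked at pi0.
    if n * pi0 * (1 - pi0) <= 9:
        raise ValueError("large-sample condition n*pi0*(1 - pi0) > 9 is not met")
    std_normal = NormalDist()
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)             # upper tail
    elif alternative == "less":
        p_value = std_normal.cdf(z)                 # lower tail
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # Rejecting when p_value < alpha matches the critical-value rules.
    return z, p_value, p_value < alpha


# Hypothetical data: 224 of 400 sampled members have the attribute.
z, p_value, reject = proportion_z_test(0.56, 0.5, 400, alternative="two-sided")
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The variance tests follow the same pattern with the statistic $(n-1)s_x^2/\sigma_0^2$, but the chi-square critical values are not available in the Python standard library, so they would have to come from a table or a statistics package.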

Some Probabilities for the Chi-Square Distribution (Figure 9.5)
[Figure 9.5: density $f(\chi^2_v)$ with lower-tail area $\alpha/2$ below $\chi^2_{v,1-\alpha/2}$, central area $1-\alpha$, and upper-tail area $\alpha/2$ above $\chi^2_{v,\alpha/2}$.]

Tests of the Difference Between Population Means: Matched Pairs
Suppose that we have a random sample of $n$ matched pairs of observations from distributions with means $\mu_X$ and $\mu_Y$. Let $\bar{D}$ and $s_D$ denote the observed sample mean and standard deviation for the $n$ differences $D_i = (x_i - y_i)$. If the population distribution of the differences is a normal distribution, then the following tests have significance level $\alpha$.

(i) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \le D_0$$
against the alternative
$$H_1: \mu_x - \mu_y > D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{D}-D_0}{s_D/\sqrt{n}} > t_{n-1,\alpha}$$

(ii) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \ge D_0$$
against the alternative
$$H_1: \mu_x - \mu_y < D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{D}-D_0}{s_D/\sqrt{n}} < -t_{n-1,\alpha}$$

(iii) To test the null hypothesis
$$H_0: \mu_x - \mu_y = D_0$$
against the two-sided alternative
$$H_1: \mu_x - \mu_y \ne D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{D}-D_0}{s_D/\sqrt{n}} < -t_{n-1,\alpha/2} \quad \text{or} \quad \frac{\bar{D}-D_0}{s_D/\sqrt{n}} > t_{n-1,\alpha/2}$$
Here $t_{n-1,\alpha}$ is the number for which $P(t_{n-1} > t_{n-1,\alpha}) = \alpha$, where the random variable $t_{n-1}$ follows a Student's t distribution with $(n-1)$ degrees of freedom. When we want to test the null hypothesis that the two population means are equal, we set $D_0 = 0$ in the formulas. P-values for all of these tests are interpreted as the smallest significance level at which the null hypothesis can be rejected given the test statistic.

Tests of the Difference Between Population Means: Independent Samples (Known Variances)
Suppose that we have two independent random samples of $n_x$ and $n_y$ observations from normal distributions with means $\mu_X$ and $\mu_Y$ and variances $\sigma_x^2$ and $\sigma_y^2$. If the observed sample means are $\bar{X}$ and $\bar{Y}$, then the following tests have significance level $\alpha$.

(i) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \le D_0$$
against the alternative
$$H_1: \mu_x - \mu_y > D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x}+\dfrac{\sigma_y^2}{n_y}}} > Z_\alpha$$

(ii) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \ge D_0$$
against the alternative
$$H_1: \mu_x - \mu_y < D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x}+\dfrac{\sigma_y^2}{n_y}}} < -Z_\alpha$$

(iii) To test the null hypothesis
$$H_0: \mu_x - \mu_y = D_0$$
against the alternative
$$H_1: \mu_x - \mu_y \ne D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x}+\dfrac{\sigma_y^2}{n_y}}} < -Z_{\alpha/2} \quad \text{or} \quad \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x}+\dfrac{\sigma_y^2}{n_y}}} > Z_{\alpha/2}$$
If the sample sizes are large ($n > 100$), then a good approximation at significance level $\alpha$ can be made if the population variances are replaced by the sample variances. In addition, the central limit theorem leads to good approximations even if the populations are not normally distributed. P-values for all these tests are interpreted as the smallest significance level at which the null hypothesis can be rejected given the test statistic.

Tests of the Difference Between Population Means: Population Variances Unknown and Equal
These tests assume that we have two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and a common variance.
The sample variances $s_x^2$ and $s_y^2$ are used to compute a pooled variance estimator
$$s_p^2 = \frac{(n_x-1)s_x^2 + (n_y-1)s_y^2}{n_x+n_y-2}$$
Then, using the observed sample means $\bar{X}$ and $\bar{Y}$, the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \le D_0$$
against the alternative
$$H_1: \mu_x - \mu_y > D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_p^2}{n_x}+\dfrac{s_p^2}{n_y}}} > t_{n_x+n_y-2,\alpha}$$

(ii) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \ge D_0$$
against the alternative
$$H_1: \mu_x - \mu_y < D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_p^2}{n_x}+\dfrac{s_p^2}{n_y}}} < -t_{n_x+n_y-2,\alpha}$$

(iii) To test the null hypothesis
$$H_0: \mu_x - \mu_y = D_0$$
against the alternative
$$H_1: \mu_x - \mu_y \ne D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_p^2}{n_x}+\dfrac{s_p^2}{n_y}}} < -t_{n_x+n_y-2,\alpha/2} \quad \text{or} \quad \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_p^2}{n_x}+\dfrac{s_p^2}{n_y}}} > t_{n_x+n_y-2,\alpha/2}$$
Here $t_{n_x+n_y-2,\alpha}$ is the number for which $P(t_{n_x+n_y-2} > t_{n_x+n_y-2,\alpha}) = \alpha$. P-values for all these tests are interpreted as the smallest significance level at which the null hypothesis can be rejected given the test statistic.

Tests of the Difference Between Population Means: Population Variances Unknown and Not Equal
These tests assume that we have two independent random samples of $n_x$ and $n_y$ observations from normal populations with means $\mu_X$ and $\mu_Y$ and unequal variances. The sample variances $s_x^2$ and $s_y^2$ are used.
The degrees of freedom, $v$, for the Student t statistic are given by
$$v = \frac{\left[\dfrac{s_x^2}{n_x}+\dfrac{s_y^2}{n_y}\right]^2}{\left(\dfrac{s_x^2}{n_x}\right)^2\Big/(n_x-1) + \left(\dfrac{s_y^2}{n_y}\right)^2\Big/(n_y-1)}$$
Then, using the observed sample means $\bar{X}$ and $\bar{Y}$, the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \le D_0$$
against the alternative
$$H_1: \mu_x - \mu_y > D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_x^2}{n_x}+\dfrac{s_y^2}{n_y}}} > t_{v,\alpha}$$

(ii) To test either null hypothesis
$$H_0: \mu_x - \mu_y = D_0 \quad \text{or} \quad H_0: \mu_x - \mu_y \ge D_0$$
against the alternative
$$H_1: \mu_x - \mu_y < D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_x^2}{n_x}+\dfrac{s_y^2}{n_y}}} < -t_{v,\alpha}$$

(iii) To test the null hypothesis
$$H_0: \mu_x - \mu_y = D_0$$
against the alternative
$$H_1: \mu_x - \mu_y \ne D_0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_x^2}{n_x}+\dfrac{s_y^2}{n_y}}} < -t_{v,\alpha/2} \quad \text{or} \quad \frac{\bar{X}-\bar{Y}-D_0}{\sqrt{\dfrac{s_x^2}{n_x}+\dfrac{s_y^2}{n_y}}} > t_{v,\alpha/2}$$
Here $t_{v,\alpha}$ is the number for which $P(t_v > t_{v,\alpha}) = \alpha$. P-values for all these tests are interpreted as the smallest significance level at which the null hypothesis can be rejected given the test statistic.

Testing the Equality of Population Proportions (Large Samples)
Suppose we have independent random samples of size $n_x$ and $n_y$ with proportions of successes $p_x$ and $p_y$.
When we assume that the population proportions are equal, an estimate of the common proportion is
$$p_0 = \frac{n_x p_x + n_y p_y}{n_x + n_y}$$
For large sample sizes, that is, $n\pi(1-\pi) > 9$, the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \pi_x - \pi_y = 0 \quad \text{or} \quad H_0: \pi_x - \pi_y \le 0$$
against the alternative
$$H_1: \pi_x - \pi_y > 0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{p_x - p_y}{\sqrt{\dfrac{p_0(1-p_0)}{n_x}+\dfrac{p_0(1-p_0)}{n_y}}} > Z_\alpha$$

(ii) To test either null hypothesis
$$H_0: \pi_x - \pi_y = 0 \quad \text{or} \quad H_0: \pi_x - \pi_y \ge 0$$
against the alternative
$$H_1: \pi_x - \pi_y < 0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{p_x - p_y}{\sqrt{\dfrac{p_0(1-p_0)}{n_x}+\dfrac{p_0(1-p_0)}{n_y}}} < -Z_\alpha$$

(iii) To test the null hypothesis
$$H_0: \pi_x - \pi_y = 0$$
against the alternative
$$H_1: \pi_x - \pi_y \ne 0$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{p_x - p_y}{\sqrt{\dfrac{p_0(1-p_0)}{n_x}+\dfrac{p_0(1-p_0)}{n_y}}} < -Z_{\alpha/2} \quad \text{or} \quad \frac{p_x - p_y}{\sqrt{\dfrac{p_0(1-p_0)}{n_x}+\dfrac{p_0(1-p_0)}{n_y}}} > Z_{\alpha/2}$$
It is also possible to compute and interpret the p-values for these tests by calculating the minimum significance level at which the null hypothesis can be rejected.

The F Distribution
Suppose we have two independent random samples of $n_x$ and $n_y$ observations from two normal populations with variances $\sigma_x^2$ and $\sigma_y^2$. If the sample variances are $s_x^2$ and $s_y^2$, then the random variable
$$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$$
has an F distribution with numerator degrees of freedom $(n_x - 1)$ and denominator degrees of freedom $(n_y - 1)$. An F distribution with numerator degrees of freedom $v_1$ and denominator degrees of freedom $v_2$ will be denoted $F_{v_1,v_2}$. We denote by $F_{v_1,v_2,\alpha}$ the number for which
$$P(F_{v_1,v_2} > F_{v_1,v_2,\alpha}) = \alpha$$
We need to emphasize that this test is quite sensitive to the assumption of normality.

Tests for Equality of Variances from Two Normal Populations
Let $s_x^2$ and $s_y^2$ be observed sample variances from independent random samples of size $n_x$ and $n_y$ from normally distributed populations with variances $\sigma_x^2$ and $\sigma_y^2$. Use $s_x^2$ to denote the larger variance.
Then the following tests have significance level $\alpha$:

(i) To test either null hypothesis
$$H_0: \sigma_x^2 = \sigma_y^2 \quad \text{or} \quad H_0: \sigma_x^2 \le \sigma_y^2$$
against the alternative
$$H_1: \sigma_x^2 > \sigma_y^2$$
the decision rule is
$$\text{Reject } H_0 \text{ if } F = \frac{s_x^2}{s_y^2} > F_{n_x-1,n_y-1,\alpha}$$

(ii) To test the null hypothesis
$$H_0: \sigma_x^2 = \sigma_y^2$$
against the alternative
$$H_1: \sigma_x^2 \ne \sigma_y^2$$
the decision rule is
$$\text{Reject } H_0 \text{ if } F = \frac{s_x^2}{s_y^2} > F_{n_x-1,n_y-1,\alpha/2}$$
where $s_x^2$ is the larger of the two sample variances. Since either sample variance could be larger, this rule is actually based on a two-tailed test, and hence we use $\alpha/2$ as the upper tail probability. Here $F_{n_x-1,n_y-1,\alpha}$ is the number for which
$$P(F_{n_x-1,n_y-1} > F_{n_x-1,n_y-1,\alpha}) = \alpha$$
where $F_{n_x-1,n_y-1}$ has an F distribution with $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.

Determining the Probability of a Type II Error
Consider the test
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
using the decision rule
$$\text{Reject } H_0 \text{ if } \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > Z_\alpha \quad \text{or} \quad \bar{X} > \bar{X}_c = \mu_0 + Z_\alpha\,\sigma/\sqrt{n}$$
Using the decision rule, determine the values of the sample mean that result in accepting the null hypothesis. Now, for any value of the population mean defined by the alternative hypothesis $H_1$, find the probability that the sample mean will be in the acceptance region for the null hypothesis. This is the probability of a Type II error. Thus we consider $\mu = \mu^*$ such that $\mu^* > \mu_0$.
Then for $\mu^*$ the probability of a Type II error is
$$\beta = P(\bar{X} < \bar{X}_c \mid \mu = \mu^*) = P\!\left[Z < \frac{\bar{X}_c - \mu^*}{\sigma/\sqrt{n}}\right]$$
and
$$\text{Power} = 1 - \beta$$

Power Function for Test $H_0: \mu = 5$ against $H_1: \mu > 5$ ($\alpha = 0.05$, $\sigma = 0.1$, $n = 16$) (Figure 9.13)
[Figure 9.13: power $(1-\beta)$ on the vertical axis (ticks at 0, .05, .5, 1) plotted against the population mean on the horizontal axis (ticks at 5.00, 5.05, 5.10).]

Key Words
Alternative Hypothesis
Determining the Probability of Type II Error
Equality of Population Proportions
F Distribution
Hypothesis Testing Methodology
Interpretation of the Probability Value or p-value
Null Hypothesis
Power Function
States of Nature and Decisions on Null Hypothesis
Test of Mean of a Normal Distribution (Variance Known)
    Composite Null and Alternative
    Composite or Simple Null and Alternative Hypothesis
Testing the Equality of Two Population Proportions (Large Samples)
Tests for Difference Between Population Means: Independent Samples
Tests for Equality of Variances from Two Normal Populations
Tests for the Difference Between Sample Means: Population Variances Unknown and Equal
Tests for Differences Between Population Means: Matched Pairs
Tests of the Mean of a Normal Distribution: Population Variance Unknown
Tests of the Population Proportion (Large Sample Sizes)
Tests of Variance of a Normal Population
Type I Error
Type II Error
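
The Type II error and power calculation for the Figure 9.13 setting ($H_0: \mu = 5$ against $H_1: \mu > 5$, $\alpha = 0.05$, $\sigma = 0.1$, $n = 16$) can be sketched as below. The function name `power_upper_tail` is hypothetical and the code is only an illustration of the formula $\beta = P[Z < (\bar{X}_c - \mu^*)/(\sigma/\sqrt{n})]$, not a reproduction of the figure.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu_star, mu0, sigma, n, alpha):
    """Power of the upper-tail z test of H0: mu = mu0 when the true mean is mu_star."""
    std_normal = NormalDist()
    std_error = sigma / sqrt(n)
    # Critical sample mean: H0 is rejected when the sample mean exceeds it.
    critical_mean = mu0 + std_normal.inv_cdf(1 - alpha) * std_error
    # beta: probability the sample mean falls in the acceptance region.
    beta = std_normal.cdf((critical_mean - mu_star) / std_error)
    return 1 - beta


# Evaluate the power at the tick values shown on the figure's horizontal axis.
for mu_star in (5.00, 5.05, 5.10):
    power = power_upper_tail(mu_star, mu0=5.0, sigma=0.1, n=16, alpha=0.05)
    print(f"mu* = {mu_star:.2f}  power = {power:.3f}")
```

At $\mu^* = \mu_0$ the computed value equals the significance level $\alpha$, and it rises toward 1 as $\mu^*$ moves further above $\mu_0$, which is the shape a power function for an upper-tail test takes.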