Probability and Statistics

Case Ⅱ: Large-Sample Tests

When the sample size is large, we can use the test statistic

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}},$$

which has approximately a standard normal distribution when $H_0$ is true. Using the rejection regions given previously for Case Ⅰ then results in test procedures for which the significance level is approximately (rather than exactly) $\alpha$. The rule of thumb $n > 40$ will again be used to characterize a large sample size.

β and Sample Size Determination

Determination of $\beta$ and the necessary sample size for these large-sample tests can be based either on specifying a plausible value of $\sigma$ and using the Case Ⅰ formulas or on using the curves to be introduced shortly in connection with Case Ⅲ.

Case Ⅲ: A Normal Population Distribution

When the sample size $n$ is small, the Central Limit Theorem (CLT) can no longer be invoked to justify the use of the large-sample test. But when $\bar{X}$ is the mean of a random sample of size $n$ from a normal distribution with mean $\mu$, the rv

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with $n-1$ degrees of freedom. So we can use the test statistic

$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}.$$

The One-Sample t Test

Null hypothesis: $H_0: \mu = \mu_0$
Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative Hypothesis      Rejection Region for Level α Test
$H_a: \mu > \mu_0$          $t \ge t_{\alpha,n-1}$ (upper-tailed test)
$H_a: \mu < \mu_0$          $t \le -t_{\alpha,n-1}$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        either $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$ (two-tailed test)

β and Sample Size Determination

The calculation of $\beta(\mu')$ for the t test is much less straightforward. This is because the distribution of the test statistic is quite complicated when $H_0$ is false and $H_a$ is true. The calculation must be done numerically, but fortunately it has been done by research statisticians for both one- and two-tailed t tests. The results are summarized in the graphs of $\beta$ that appear in Appendix Table A.17 [Probability and Statistics for Engineering and the Sciences, Jay L. Devore].

3. Tests Concerning a Population Proportion

Large-Sample Tests

Large-sample tests concerning $p$ are a special case of the more general large-sample procedures for a parameter $\theta$. Let $\hat{\theta}$ be an estimator of $\theta$ that is unbiased and has approximately a normal distribution. The null hypothesis is $H_0: \theta = \theta_0$. Suppose that when $H_0$ is true, the standard deviation of $\hat{\theta}$, $\sigma_{\hat{\theta}}$, involves no unknown parameters. Then the test statistic is

$$Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}.$$

2016/11/3

The estimator $\hat{p} = X/n$ is unbiased, has approximately a normal distribution, and has standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. When $H_0$ is true, $\sigma_{\hat{p}} = \sqrt{p_0(1-p_0)/n}$, which involves no unknown parameters. So the test statistic is

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}.$$

Null hypothesis: $H_0: p = p_0$
Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$

Alternative Hypothesis      Rejection Region for Level α Test
$H_a: p > p_0$              $z \ge z_{\alpha}$ (upper-tailed test)
$H_a: p < p_0$              $z \le -z_{\alpha}$ (lower-tailed test)
$H_a: p \ne p_0$            either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)

β and Sample Size Determination

Let $p'$ denote an alternative value of $p$ and let $\Phi$ denote the standard normal cdf. The type Ⅱ error probability $\beta(p')$ for each alternative hypothesis is as follows.

$H_a: p > p_0$:

$$\beta(p') = \Phi\left[\frac{p_0 - p' + z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$

$H_a: p < p_0$:

$$\beta(p') = 1 - \Phi\left[\frac{p_0 - p' - z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$

$H_a: p \ne p_0$:

$$\beta(p') = \Phi\left[\frac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right] - \Phi\left[\frac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right]$$

The sample size $n$ for which the level $\alpha$ test also satisfies $\beta(p') = \beta$ is

$$n = \begin{cases} \left[\dfrac{z_{\alpha}\sqrt{p_0(1-p_0)} + z_{\beta}\sqrt{p'(1-p')}}{p' - p_0}\right]^2 & \text{one-tailed test} \\[2ex] \left[\dfrac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_{\beta}\sqrt{p'(1-p')}}{p' - p_0}\right]^2 & \text{two-tailed test (an approximate solution)} \end{cases}$$

Small-Sample Tests

Test procedures when the sample size $n$ is small are based directly on the binomial distribution rather than the normal approximation. Consider the alternative hypothesis $H_a: p > p_0$ and again let $X$ be the number of successes in the sample. Then $X$ is the test statistic, and $H_0$ is rejected when $x \ge c$ for some critical value $c$. When $H_0$ is true, $X$ has a binomial distribution with parameters $n$ and $p_0$, so

$$P(\text{type Ⅰ error}) = P(X \ge c \text{ when } p = p_0) = 1 - B(c-1;\, n, p_0),$$

where $B(x;\, n, p)$ denotes the binomial cdf.

Because $X$ has a discrete probability distribution, it is usually not possible to find a value $c$ for which $P(\text{type Ⅰ error})$ is exactly the desired significance level $\alpha$. Instead, the largest rejection region of the form $\{c, c+1, \dots, n\}$ satisfying $1 - B(c-1;\, n, p_0) \le \alpha$ is used. The procedures for $H_a: p < p_0$ and $H_a: p \ne p_0$ are constructed in a similar manner.
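The large-sample proportion test and its β and sample-size formulas can be illustrated with a short sketch. The function names below (`proportion_z`, `beta_upper_tailed`, `sample_size_one_tailed`) are hypothetical helpers introduced only for this illustration, and only the upper-tailed case $H_a: p > p_0$ is covered; the other cases would follow the same pattern.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def proportion_z(x, n, p0):
    """Test statistic z = (p_hat - p0) / sqrt(p0 (1 - p0) / n), with p_hat = x / n."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def beta_upper_tailed(p0, p_alt, n, alpha):
    """Approximate type II error probability at p_alt for Ha: p > p0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)  # upper-tail critical value
    numerator = p0 - p_alt + z_alpha * sqrt(p0 * (1 - p0) / n)
    return STD_NORMAL.cdf(numerator / sqrt(p_alt * (1 - p_alt) / n))


def sample_size_one_tailed(p0, p_alt, alpha, beta):
    """Smallest integer n meeting the one-tailed sample size formula."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    root = (z_alpha * sqrt(p0 * (1 - p0))
            + z_beta * sqrt(p_alt * (1 - p_alt))) / (p_alt - p0)
    return ceil(root ** 2)  # round up so that beta(p_alt) does not exceed beta


if __name__ == "__main__":
    z = proportion_z(x=60, n=100, p0=0.5)
    n_needed = sample_size_one_tailed(p0=0.5, p_alt=0.6, alpha=0.05, beta=0.10)
    print(z, n_needed, beta_upper_tailed(0.5, 0.6, n_needed, 0.05))
```

Rounding the sample size up is a design choice made here so that the achieved β at $p'$ is no larger than the target value.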
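The small-sample procedure for $H_a: p > p_0$ can be sketched in the same way. The helper names (`binom_cdf`, `upper_tailed_critical_value`, `type_ii_error`) are assumptions made for this example; the search simply tries each candidate $c$ in increasing order and returns the first one whose type Ⅰ error probability does not exceed $\alpha$.

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    if x < 0:
        return 0.0
    upper = min(x, n)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(upper + 1))


def upper_tailed_critical_value(n, p0, alpha):
    """Smallest c with 1 - B(c - 1; n, p0) <= alpha.

    This gives the largest rejection region {c, c + 1, ..., n} whose
    type I error probability is at most alpha. Returns n + 1 when no
    such region exists, meaning H0 is never rejected.
    """
    for c in range(n + 1):
        if 1 - binom_cdf(c - 1, n, p0) <= alpha:
            return c
    return n + 1


def type_ii_error(c, n, p_alt):
    """beta(p') = P(X <= c - 1 when p = p'), the chance of not rejecting H0."""
    return binom_cdf(c - 1, n, p_alt)


if __name__ == "__main__":
    c = upper_tailed_critical_value(n=10, p0=0.5, alpha=0.05)
    print(c, 1 - binom_cdf(c - 1, 10, 0.5), type_ii_error(c, 10, 0.8))
```

With $n = 10$ and $p_0 = 0.5$, the region $\{8, 9, 10\}$ has type Ⅰ error probability $56/1024 \approx 0.055$, slightly above $0.05$, so the search moves on to $\{9, 10\}$, whose actual level is $11/1024 \approx 0.011$.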
For the small-sample tests, $\beta(p')$ is the result of a straightforward binomial probability calculation.

4. P-Values

A P-value conveys much information about the strength of evidence against $H_0$ and allows an individual decision maker to draw a conclusion at any specified level $\alpha$.

The P-value is the smallest level of significance at which $H_0$ would be rejected when a specified test procedure is used on a given data set.

Once the P-value has been determined, the conclusion at any particular level $\alpha$ results from comparing the P-value to $\alpha$:
1. P-value ≤ α → reject $H_0$ at level $\alpha$.
2. P-value > α → do not reject $H_0$ at level $\alpha$.

It is customary to call the data significant when $H_0$ is rejected and not significant otherwise. The P-value is then the smallest level at which the data is significant. An easy way to visualize the comparison of the P-value with the chosen $\alpha$ is to draw a picture like that of Figure 8.6 (page 347).

DEFINITION

The P-value is the probability, calculated assuming $H_0$ is true, of obtaining a test statistic value at least as contradictory to $H_0$ as the value that actually resulted. The smaller the P-value, the more contradictory the data is to $H_0$.

P-values for z Tests

Let $z$ be the test statistic value for a z test. Then

$$P\text{-value} = \begin{cases} 1 - \Phi(z) & \text{for an upper-tailed test} \\ \Phi(z) & \text{for a lower-tailed test} \\ 2[1 - \Phi(|z|)] & \text{for a two-tailed test} \end{cases}$$

Each of these is the probability of getting a value at least as extreme as what was obtained (assuming $H_0$ is true). The three cases are illustrated in Figure 8.7.

P-values for t Tests

Let $t$ be the test statistic value for a t test. Then the P-value for a t test will be a t curve area (Figure 8.8):

$$P\text{-value} = \begin{cases} \text{area in upper tail} & \text{for an upper-tailed test} \\ \text{area in lower tail} & \text{for a lower-tailed test} \\ \text{sum of areas in two tails} & \text{for a two-tailed test} \end{cases}$$

5. Some Comments on Selecting a Test Procedure

Once the experimenter has decided on the question of interest and the method for gathering data, construction of an appropriate test procedure consists of three distinct steps:
1. Specify a test statistic.
2. Decide on the general form of the rejection region.
3. Select the specific numerical critical value or values that will separate the rejection region from the acceptance region.

Issues to be considered in carrying out Steps 1-3 encompass the following questions:
1. What are the practical implications and consequences of choosing a particular level of significance once the other aspects of a test procedure have been determined?
2. Does there exist a general principle, not dependent just on intuition, that can be used to obtain best or good test procedures?
3. When two or more tests are appropriate in a given situation, how can the tests be compared to decide which should be used?
4. If a test is derived under specific assumptions about the distribution or population being sampled, how well will the test procedure work when the assumptions are violated?
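As an illustration of the P-value rules for z tests in Section 4, the following minimal sketch computes the three P-values and applies the comparison with $\alpha$. The function names `z_p_value` and `decide` and the tail labels `"upper"`, `"lower"`, `"two"` are assumptions made for this example.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """P-value for a z test; tail is 'upper', 'lower', or 'two'."""
    phi = NormalDist().cdf  # standard normal cdf
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


def decide(p_value, alpha):
    """Reject H0 at level alpha exactly when P-value <= alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


if __name__ == "__main__":
    p = z_p_value(2.10, "upper")
    for alpha in (0.05, 0.01):
        print(alpha, round(p, 4), decide(p, alpha))
```

Because the P-value is computed once and then compared with any chosen $\alpha$, the same data can lead to rejection at one level and non-rejection at a stricter one.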
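The one-sample t test and its P-value as a t curve area can be sketched without statistical tables. In this example the tail area is approximated by composite Simpson integration of the t density, which is a technique chosen here for self-containment and is not taken from the slides; the names `t_pdf`, `t_upper_tail_area`, `one_sample_t`, and `t_p_value` are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    # math.gamma overflows for very large df; this sketch targets small samples.
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail_area(t, df, steps=2000):
    """Area under the t curve to the right of t (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # Simpson estimate of the area between 0 and |t|
    upper = 0.5 - central    # by symmetry, the area to the right of |t|
    return upper if t >= 0 else 1 - upper


def one_sample_t(sample, mu0):
    """Return (t, df) with t = (xbar - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1


def t_p_value(t, df, tail):
    """P-value as a t curve area; tail is 'upper', 'lower', or 'two'."""
    if tail == "upper":
        return t_upper_tail_area(t, df)
    if tail == "lower":
        return t_upper_tail_area(-t, df)  # lower tail equals mirrored upper tail
    if tail == "two":
        return 2 * t_upper_tail_area(abs(t), df)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


if __name__ == "__main__":
    t, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)
    print(t, df, t_p_value(t, df, "upper"))
```

In practice a statistics library or a t table would normally supply these areas; the numerical integration is shown only to make the "t curve area" interpretation concrete.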