STA301 – Statistics and Probability

Lecture No 38:
• Hypothesis-Testing regarding \(\mu_1 - \mu_2\) (based on Z-statistic)
• Hypothesis Testing regarding p (based on Z-statistic)

In the last lecture, we discussed the basic concepts involved in hypothesis-testing. We also applied these concepts to a few examples regarding the testing of the population mean \(\mu\). These examples pointed to the six main steps involved in any hypothesis-testing procedure.

General Procedure for Testing Hypotheses:

Testing a hypothesis about a population parameter involves the following six steps:

i) State your problem and formulate an appropriate null hypothesis H0 with an alternative hypothesis H1, which is to be accepted when H0 is rejected.
ii) Decide upon a significance level of the test, \(\alpha\), which is the probability of rejecting the null hypothesis if it is true.
iii) Choose a test statistic (e.g. one following the normal distribution, the t-distribution, etc.) to test H0.
iv) Determine the rejection or critical region in such a way that the probability of rejecting the null hypothesis H0, if it is true, is equal to the significance level, \(\alpha\). The location of the critical region depends upon the form of H1 (i.e. whether we are carrying out a one-tailed test or a two-tailed test). The critical value(s) will separate the acceptance region from the rejection region.
v) Compute the value of the test statistic from the sample data in order to decide whether to accept or reject the null hypothesis H0.
vi) Formulate the decision rule (i.e. draw a conclusion) as follows:
   a) Reject the null hypothesis H0, if the computed value of the test statistic falls in the rejection region.
   b) Accept the null hypothesis H0, otherwise.

Important Note:

It is very important to realize that when applying a hypothesis-testing procedure of the type explained above, we always begin by assuming that the null hypothesis is true.
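Steps iv to vi of the general procedure can be sketched in Python for a Z-statistic. This is a minimal illustration and not part of the lecture: the function names `critical_values` and `decide` and the tail labels 'two', 'left' and 'right' are hypothetical, and the value z = 2.10 in the last line is made up purely to show the call.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Step iv: return (lower, upper) critical values of a Z-test.

    tail is 'two', 'left' or 'right' (labels chosen for this sketch).
    A bound of None means there is no rejection region on that side.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-z, z)
    if tail == "right":
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "left":
        return (std_normal.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'left' or 'right'")


def decide(z, alpha, tail):
    """Step vi: reject H0 if z falls in the rejection region."""
    lower, upper = critical_values(alpha, tail)
    in_lower = lower is not None and z < lower
    in_upper = upper is not None and z > upper
    return "reject H0" if (in_lower or in_upper) else "accept H0"


# Right-tailed test at the 5% level: upper critical value is about 1.645.
print(critical_values(0.05, "right"))

# Made-up test statistic, two-tailed test at the 5% level.
print(decide(2.10, 0.05, "two"))
```

The critical values come from the inverse of the standard normal distribution function, which plays the role of the printed normal table.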
Important Note:

As \(s^2\) is an unbiased estimator of \(\sigma^2\) whereas \(S^2\) is a biased estimator, we would prefer to use \(s^2\) whenever \(\sigma^2\) is unknown. However, when n is large, \(s^2\) is approximately equal to \(S^2\), as explained below.

We know that

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \quad\Rightarrow\quad \sum (x - \bar{x})^2 = (n - 1)\, s^2, \]

whereas

\[ S^2 = \frac{\sum (x - \bar{x})^2}{n} \quad\Rightarrow\quad \sum (x - \bar{x})^2 = n\, S^2. \]

Hence

\[ (n - 1)\, s^2 = n\, S^2 \quad\Rightarrow\quad S^2 = \frac{n - 1}{n}\, s^2 = \left(1 - \frac{1}{n}\right) s^2. \]

Now, as \(n \to \infty\), \(1/n \to 0\). Hence, if n is large, \(S^2 \approx s^2\). Hence, in the case of a large sample drawn from a population with unknown variance \(\sigma^2\), we may replace \(\sigma^2\) by \(S^2\).

We now consider the case when we are interested in testing the equality of two population means. We illustrate this situation with the help of the following example.

Virtual University of Pakistan Page 299

EXAMPLE:

A survey conducted by a market-research organization five years ago showed that the estimated hourly wage for temporary computer analysts was essentially the same as the hourly wage for registered nurses. This year, a random sample of 32 temporary computer analysts from across the country is taken. The analysts are contacted by telephone and asked what rates they are currently able to obtain in the market-place. A similar random sample of 34 registered nurses is taken. The resulting wage figures are listed in the following table:

Computer Analysts (n = 32), hourly wage in $:
24.10 23.75 24.25 22.00 23.50 22.80 24.00 23.85
24.20 22.90 23.20 23.55 25.00 22.70 21.30 22.55
23.25 22.10 24.25 23.50 22.75 23.80 24.25 21.75
22.00 18.00 23.50 22.70 21.50 23.80 25.60 24.10

Registered Nurses (n = 34), hourly wage in $:
20.75 23.80 22.00 21.85 24.16 21.10 23.75 22.50
25.00 22.70 23.25 21.90 23.30 24.00 21.75 21.50
20.40 23.25 19.50 21.75 20.80 20.25 22.45 19.10
22.75 23.00 21.25 20.00 21.75 20.50 22.60 21.70
20.75 22.50

Conduct a hypothesis test at the 2% level of significance to determine whether the hourly wages of the computer analysts are still the same as those of registered nurses.
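The relation \(S^2 = (1 - 1/n)\, s^2\) from the note above can be checked numerically on the wage data, which also gives the sample sizes and sample means needed for the test. This is a minimal sketch using Python's standard `statistics` module. The split of the 66 listed values into 32 analysts and 34 nurses is an assumption inferred from the stated sample sizes, and the variances it prints need not match the rounded summary figures quoted in the solution exactly.

```python
from statistics import mean, pvariance, variance

# Hourly wages ($) from the table above; the assignment of values to the
# two groups is an assumption based on the stated sample sizes (32 and 34).
analysts = [
    24.10, 23.75, 24.25, 22.00, 23.50, 22.80, 24.00, 23.85,
    24.20, 22.90, 23.20, 23.55, 25.00, 22.70, 21.30, 22.55,
    23.25, 22.10, 24.25, 23.50, 22.75, 23.80, 24.25, 21.75,
    22.00, 18.00, 23.50, 22.70, 21.50, 23.80, 25.60, 24.10,
]
nurses = [
    20.75, 23.80, 22.00, 21.85, 24.16, 21.10, 23.75, 22.50,
    25.00, 22.70, 23.25, 21.90, 23.30, 24.00, 21.75, 21.50,
    20.40, 23.25, 19.50, 21.75, 20.80, 20.25, 22.45, 19.10,
    22.75, 23.00, 21.25, 20.00, 21.75, 20.50, 22.60, 21.70,
    20.75, 22.50,
]

for name, sample in (("analysts", analysts), ("nurses", nurses)):
    n = len(sample)
    s2 = variance(sample)       # divisor n - 1 (unbiased estimator s^2)
    big_s2 = pvariance(sample)  # divisor n (biased estimator S^2)
    print(name, "n =", n, "mean =", round(mean(sample), 2))
    print("  s^2 =", round(s2, 3), " S^2 =", round(big_s2, 3))
    print("  S^2 / s^2 =", round(big_s2 / s2, 4),
          " 1 - 1/n =", round(1 - 1 / n, 4))
```

For samples of this size the two estimators differ by a factor of about 0.97, which is why the lecture treats them as interchangeable when n is large.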
SOLUTION:

Hypothesis Testing Procedure:

Step-1: Formulation of the Null and Alternative Hypotheses:

H0 : \(\mu_1 - \mu_2 = 0\)
HA : \(\mu_1 - \mu_2 \neq 0\) (Two-tailed test)

Step-2: Level of Significance:

\(\alpha = 0.02\)

Step-3: Test Statistic:

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Step-4: Calculations:

The sample size, sample mean and sample variance for each of the two samples are given below:

Computer Analysts: \(n_1 = 32\), \(\bar{X}_1\) = $23.14, \(S_1^2 = 1.854\)
Registered Nurses: \(n_2 = 34\), \(\bar{X}_2\) = $21.99, \(S_2^2 = 1.845\)

Since both sample sizes are larger than 30, the unknown population variances \(\sigma_1^2\) and \(\sigma_2^2\) can be replaced by \(S_1^2\) and \(S_2^2\). Hence, our formula becomes:

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]

Hence, the computed value of Z comes out to be:

\[ Z = \frac{(23.14 - 21.99) - 0}{\sqrt{\dfrac{1.854}{32} + \dfrac{1.845}{34}}} = \frac{1.15}{0.335} = 3.43 \]

Step-5: Critical Region:

As the level of significance is 2%, and this is a two-tailed test, we have the following situation:

[Figure: standard normal curve with a rejection region of area \(\alpha/2 = 0.01\) in each tail, an area of 0.49 on either side of 0 between the critical values, and critical values Z = -2.33 and Z = +2.33.]

Hence, the critical region is given by | Z | > 2.33.

Step-6: Conclusion:

As the computed value, i.e. 3.43, is greater than the tabulated value 2.33, we reject H0.

[Figure: the same curve with the calculated value Z = 3.43 marked beyond the upper critical value +2.33; on the scale of \(\bar{X}_1 - \bar{X}_2\), the centre \(\mu_1 - \mu_2 = 0\) corresponds to Z = 0 and the observed difference \(\bar{X}_1 - \bar{X}_2 = 1.15\) corresponds to Z = 3.43.]

The researcher can say that there is a significant difference between the average hourly wage of a temporary computer analyst and the average hourly wage of a registered nurse. The researcher then examines the sample means and uses common sense to conclude that, on the average, temporary computer analysts earn more than registered nurses.

Let us consolidate the above concept by considering another example:

EXAMPLE:

Suppose that the workers of factory B believe that the average income of the workers of factory A exceeds their average income.
A random sample of workers is drawn from each of the two factories, and the two samples yield the following information:

Factory   Sample Size   Mean    Variance
A         160           12.80   64
B         220           11.25   47

Test the above hypothesis.

SOLUTION:

Let subscript 1 denote values pertaining to Factory A, and let subscript 2 denote values pertaining to Factory B. Then, we proceed as follows:

Hypothesis-testing Procedure:

Step 1:

H0 : \(\mu_1 \le \mu_2\) (or \(\mu_1 - \mu_2 \le 0\))
HA : \(\mu_1 > \mu_2\) (or \(\mu_1 - \mu_2 > 0\))

Step 2: Level of significance \(\alpha\) = 5%.

Steps 3 & 4:

\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{12.80 - 11.25}{\sqrt{\dfrac{64}{160} + \dfrac{47}{220}}} = \frac{1.55}{\sqrt{0.61}} = \frac{1.55}{0.78} = 1.99 \]

Step 5: Critical Region:

Since it is a right-tailed test, the critical region is given by \(Z > Z_{0.05}\), i.e. Z > 1.645.

Step 6: Conclusion:

Since 1.99 is greater than 1.645, H0 should be rejected in favour of HA. The sample evidence supports the belief of the workers of factory B.

Next, we consider the case when we are interested in conducting a test regarding p, the proportion of successes in the population. We illustrate this situation with the help of the following example:

EXAMPLE:

A sociologist has a hunch that not more than 50% of the children who appear in a particular juvenile court three times or more are orphans. To test this hypothesis, a sample of 634 such children is taken and it is found that 341 of these children are orphans (one or both parents dead). Test the above hypothesis using the 1% level of significance.
SOLUTION:

Hypothesis-testing Procedure:

Step 1:

H0 : \(p \le 0.50\)
HA : \(p > 0.50\) (one-tailed test)

Step 2: Level of significance: \(\alpha\) = 1%

Step 3: Test statistic:

\[ Z = \frac{\left(X \pm \tfrac{1}{2}\right) - n p_0}{\sqrt{n p_0 (1 - p_0)}} \]

(where \(\pm \tfrac{1}{2}\) denotes the continuity correction)

Step 4: Computation:

Here \(n p_0 = 634\,(0.50) = 317\) and X = 341. Hence \(X > n p_0\), so we use \(X - \tfrac{1}{2}\):

\[ Z = \frac{\left(341 - \tfrac{1}{2}\right) - 317}{\sqrt{634\,(0.50)(0.50)}} = \frac{23.5}{12.59} = 1.87 \]

Step 5: Critical region:

Since \(\alpha = 0.01\) and the test is right-tailed, the critical region is given by Z > 2.33.

Step 6: Conclusion:

Since 1.87 < 2.33, the computed Z does not fall in the critical region. Hence, H0 is not rejected, and we conclude that the sociologist's hunch is acceptable.
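The arithmetic of the three worked examples in this lecture can be re-checked with a short Python sketch. It is an illustration under stated assumptions rather than part of the lecture: the function names `two_sample_z`, `one_proportion_z` and `upper_critical` are hypothetical, and the inputs are the summary figures quoted in the solutions. Without intermediate rounding the factory statistic comes out as about 1.98 rather than 1.99, because the lecture rounds the denominator to 0.78; the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, var1, n1, mean2, var2, n2, delta0=0.0):
    """Z for H0: mu1 - mu2 = delta0, large samples, sample variances used."""
    return (mean1 - mean2 - delta0) / sqrt(var1 / n1 + var2 / n2)


def one_proportion_z(x, n, p0):
    """Z for a proportion, with the +/- 1/2 continuity correction."""
    expected = n * p0
    if x > expected:
        x_corrected = x - 0.5   # move X half a unit towards n * p0
    elif x < expected:
        x_corrected = x + 0.5
    else:
        x_corrected = x
    return (x_corrected - expected) / sqrt(n * p0 * (1 - p0))


def upper_critical(tail_area):
    """Z value that leaves tail_area in the upper tail of N(0, 1)."""
    return NormalDist().inv_cdf(1 - tail_area)


# Wage example: two-tailed test, alpha = 0.02 (0.01 in each tail).
z_wage = two_sample_z(23.14, 1.854, 32, 21.99, 1.845, 34)
print(round(z_wage, 2), abs(z_wage) > upper_critical(0.02 / 2))   # 3.43 True

# Factory example: right-tailed test, alpha = 0.05.
z_factory = two_sample_z(12.80, 64, 160, 11.25, 47, 220)
print(round(z_factory, 2), z_factory > upper_critical(0.05))      # 1.98 True

# Orphans example: right-tailed test on a proportion, alpha = 0.01.
z_orphans = one_proportion_z(341, 634, 0.50)
print(round(z_orphans, 2), z_orphans > upper_critical(0.01))      # 1.87 False
```

In each printed pair, the second value states whether the computed Z falls in the critical region, i.e. whether H0 is rejected.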