STA301 – Statistics and Probability
Lecture No 38:
• Hypothesis-Testing regarding μ1 - μ2 (based on Z-statistic)
• Hypothesis-Testing regarding p (based on Z-statistic)
In the last lecture, we discussed the basic concepts involved in hypothesis-testing. Also, we applied this concept to a
few examples regarding the testing of the population mean μ.
These examples pointed to the six main steps involved in any hypothesis-testing procedure.
General Procedure for Testing Hypotheses:
Testing a hypothesis about a population parameter involves the following six steps:
i) State your problem and formulate an appropriate null hypothesis H0 with an alternative hypothesis H1, which is to be
accepted when H0 is rejected.
ii) Decide upon a significance level of the test, α, which is the probability of rejecting the null hypothesis if it is true.
iii) Choose a test-statistic, such as one that follows the normal distribution, the t-distribution, etc., to test H0.
iv) Determine the rejection or critical region in such a way that the probability of rejecting the null hypothesis H0, if it
is true, is equal to the significance level, α. The location of the critical region depends upon the form of H1 (i.e.
whether we are carrying out a one-tailed test or a two-tailed test). The critical value(s) will separate the acceptance
region from the rejection region.
v) Compute the value of the test-statistic from the sample data in order to decide whether to accept or reject the null
hypothesis H0.
vi) Formulate the decision rule (i.e. draw a conclusion) as follows:
a) Reject the null hypothesis H0, if the computed value of the test statistic falls in the rejection region.
b) Accept the null hypothesis H0, otherwise.
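Steps (iv) and (vi) can be sketched in Python for the case where the test statistic follows the standard normal distribution. This is only an illustrative sketch: the function names and the tail labels "two", "right" and "left" are assumptions made for this example and do not come from the lecture.

```python
from statistics import NormalDist


def critical_region_contains(z, alpha, tail):
    """Return True if the computed Z falls in the rejection region.

    tail is "two" for a two-tailed test, "right" for a right-tailed test
    and "left" for a left-tailed test (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        # Split alpha equally between the two tails.
        critical = std_normal.inv_cdf(1 - alpha / 2)
        return abs(z) > critical
    if tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        return z > critical
    if tail == "left":
        critical = std_normal.inv_cdf(alpha)
        return z < critical
    raise ValueError("tail must be 'two', 'right' or 'left'")


def decide(z, alpha, tail):
    """Step (vi): reject H0 if Z lies in the critical region, accept H0 otherwise."""
    return "reject H0" if critical_region_contains(z, alpha, tail) else "accept H0"


print(decide(2.50, 0.05, "two"))    # 2.50 lies beyond the two-tailed critical value
print(decide(1.20, 0.05, "right"))  # 1.20 lies below the right-tailed critical value
```

The form of H1 decides which branch is used, which mirrors the remark in step (iv) that the location of the critical region depends on whether the test is one-tailed or two-tailed.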
Important Note:
It is very important to realize that when applying a hypothesis-testing procedure of the type explained above,
we always begin by assuming that the null hypothesis is true.
Important Note:
As s² is an unbiased estimator of σ² whereas S² is a biased estimator, we would like to use s² whenever σ² is
unknown.
However, when n is large, s² is approximately equal to S², as explained below:
We know that
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \;\Rightarrow\; \sum (x - \bar{x})^2 = (n - 1)\, s^2 \]
whereas
\[ S^2 = \frac{\sum (x - \bar{x})^2}{n} \;\Rightarrow\; \sum (x - \bar{x})^2 = n S^2 . \]
Hence
\[ (n - 1)\, s^2 = n S^2 \;\Rightarrow\; S^2 = \frac{n - 1}{n}\, s^2 = \left( 1 - \frac{1}{n} \right) s^2 \]
Now, as \( n \to \infty \), \( \frac{1}{n} \to 0 \).
Hence, if n is large,
\[ S^2 \approx s^2 . \]
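The relationship between the two estimators can be checked numerically. The short Python sketch below uses a small made-up sample (the data values are hypothetical and serve only as an illustration) together with the standard-library functions `variance` (divisor n - 1) and `pvariance` (divisor n).

```python
from statistics import pvariance, variance

# Hypothetical sample values, used only to illustrate the identity.
sample = [4.0, 7.0, 1.0, 9.0, 6.0, 3.0]
n = len(sample)

s2 = variance(sample)    # divisor n - 1: the unbiased estimator s^2
S2 = pvariance(sample)   # divisor n: the biased estimator S^2

# S^2 = ((n - 1) / n) * s^2 holds up to floating-point rounding.
assert abs(S2 - (n - 1) / n * s2) < 1e-12

# The factor 1 - 1/n approaches 1 as n grows, so S^2 and s^2 converge.
for size in (5, 30, 100, 1000):
    print(size, 1 - 1 / size)
```

For n = 30 the factor is already about 0.97, which is in line with the usual practice of treating samples larger than 30 as large.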
Hence, in case of a large sample drawn from a population with unknown variance σ², we may replace σ² by S².
We now consider the case when we are interested in testing the equality of two population means.
We illustrate this situation with the help of the following example.
Virtual University of Pakistan
EXAMPLE:
A survey conducted by a market-research organization five years ago showed that the estimated hourly wage
for temporary computer analysts was essentially the same as the hourly wage for registered nurses. This year, a random
sample of 32 temporary computer analysts from across the country is taken. The analysts are contacted by telephone
and asked what rates they are currently able to obtain in the market-place. A similar random sample of 34 registered
nurses is taken. The resulting wage figures are listed in the following table:
Computer Analysts              Registered Nurses
$24.10   $25.00   $24.25       $20.75   $23.30   $22.75
 23.75    22.70    21.75        23.80    24.00    23.00
 24.25    21.30    22.00        22.00    21.75    21.25
 22.00    22.55    18.00        21.85    21.50    20.00
 23.50    23.25    23.50        24.16    20.40    21.75
 22.80    22.10    22.70        21.10    23.25    20.50
 24.00    24.25    21.50        23.75    19.50    22.60
 23.85    23.50    23.80        22.50    21.75    21.70
 24.20    22.75    25.60        25.00    20.80    20.75
 22.90    23.80    24.10        22.70    20.25    22.50
 23.20                          23.25    22.45
 23.55                          21.90    19.10
Conduct a hypothesis test at the 2% level of significance to determine whether the hourly wages of the computer
analysts are still the same as those of registered nurses.
SOLUTION:
Hypothesis Testing Procedure:
Step-1:
Formulation of the Null and Alternative Hypotheses:
\[ H_0 : \mu_1 - \mu_2 = 0 \]
\[ H_A : \mu_1 - \mu_2 \neq 0 \]
(Two-tailed test)
Step-2:
Level of Significance:
α = 0.02
Step-3:
Test Statistic:
\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
Step-4:
Calculations:
The sample size, sample mean and sample variance for each of the two samples are given below:
                     Sample size (n)   Sample mean   Sample variance (S²)
Computer Analysts    32                $23.14        1.854
Registered Nurses    34                $21.99        1.845
Since the sample sizes are larger than 30, the unknown population variances \( \sigma_1^2 \) and \( \sigma_2^2 \) can be replaced by
\( S_1^2 \) and \( S_2^2 \). Hence, our formula becomes:
\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]
Hence, the computed value of Z comes out to be:
\[ Z = \frac{(23.14 - 21.99) - 0}{\sqrt{\dfrac{1.854}{32} + \dfrac{1.845}{34}}} = \frac{1.15}{0.335} = 3.43 \]
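The arithmetic of this step can be reproduced with a few lines of Python. The sketch below works from the summary statistics quoted above rather than from the raw wage data, and the function name `two_sample_z` is a hypothetical helper introduced only for this illustration.

```python
from math import sqrt


def two_sample_z(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Z statistic for the difference between two means (large samples)."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


# Summary statistics quoted in the lecture for the wage example.
z = two_sample_z(mean1=23.14, var1=1.854, n1=32,
                 mean2=21.99, var2=1.845, n2=34)
print(round(z, 2))  # 3.43
```

The `hypothesized_diff` argument plays the role of μ1 - μ2 under H0, which is zero in this example.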
Step-5:
Critical Region:
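As the level of significance is 2%, and this is a two-tailed test, we have the following situation:
[Figure: standard normal curve centred at 0, with an area of α/2 = 0.01 in each tail beyond Z.01 = -2.33 and Z.01 = +2.33, and an area of 0.49 on each side of 0 between the two critical values.]
Hence, the critical region is given by
| Z | > 2.33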
Step-6:
Conclusion:
As the computed value, i.e. 3.43, is greater than the tabulated value 2.33, we reject H0.
[Figure: sampling distribution of X̄1 - X̄2 centred at μ1 - μ2 = 0 (Z = 0), with the critical values Z.01 = -2.33 and Z.01 = +2.33 marked; the calculated Z = 3.43, corresponding to X̄1 - X̄2 = 1.15, falls in the right-hand rejection region.]
The researcher can say that there is a significant difference between the average hourly wage of a temporary computer
analyst and the average hourly wage of a temporary registered nurse. The researcher then examines the sample means
and uses common sense to conclude that, on the average, temporary computer analysts earn more than temporary
registered nurses.
Let us consolidate the above concept by considering another example:
EXAMPLE:
Suppose that the workers of factory B believe that the average income of the workers of factory A exceeds
their average income. A random sample of workers is drawn from each of the two factories, and the two samples yield
the following information:
Factory   Sample Size   Mean    Variance
A         160           12.80   64
B         220           11.25   47
Test the above hypothesis.
SOLUTION
Let subscript 1 denote values pertaining to Factory A, and let subscript 2 denote values pertaining to Factory
B. Then, we proceed as follows:
Hypothesis-testing Procedure:
Step 1:
\[ H_0 : \mu_1 \le \mu_2 \quad (\text{or } \mu_1 - \mu_2 \le 0) \]
\[ H_A : \mu_1 > \mu_2 \quad (\text{or } \mu_1 - \mu_2 > 0) \]
Step 2:
Level of significance: α = 5%.
Steps 3 & 4:
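\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{12.80 - 11.25}{\sqrt{\dfrac{64}{160} + \dfrac{47}{220}}} = \frac{1.55}{\sqrt{0.61}} = \frac{1.55}{0.78} = 1.99 \]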
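As a rough check on the hand calculation, the same quantities can be computed without intermediate rounding. The variable names in this Python sketch are illustrative only.

```python
from math import sqrt

# Sample information from the table: factory A (subscript 1), factory B (subscript 2).
n1, mean1, var1 = 160, 12.80, 64
n2, mean2, var2 = 220, 11.25, 47

variance_of_difference = var1 / n1 + var2 / n2   # about 0.61
standard_error = sqrt(variance_of_difference)    # about 0.78
z = (mean1 - mean2) / standard_error

print(round(variance_of_difference, 4), round(standard_error, 4), round(z, 2))
```

Carried through without rounding, Z comes out at about 1.98. The value 1.99 in the hand calculation arises from rounding the standard error to 0.78 before dividing, and the small difference does not affect the comparison with the critical value in the next step.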
Step 5:
Critical Region:
Since it is a right-tailed test, the critical region is given by
Z > Z0.05
i.e. Z > 1.645
Step 6:
Conclusion:
Since 1.99 is greater than 1.645, H0 should be rejected in favour of HA. The sample evidence has
consolidated the belief of the workers of factory B.
Next, we consider the case when we are interested in conducting a test regarding p, the proportion of successes in the
population. We illustrate this situation with the help of the following example:
EXAMPLE:
A sociologist has a hunch that not more than 50% of the children who appear in a particular juvenile court
three times or more are orphans.
To test this hypothesis, a sample of 634 such children is taken and it is found that 341 of these children are
orphans (one or both parents dead). Test the above hypothesis using a 1% level of significance.
SOLUTION:
Hypothesis-testing Procedure:
Step 1:
\[ H_0 : p \le 0.50 \]
\[ H_A : p > 0.50 \]
(one-tailed test)
Step 2:
Level of significance: α = 1%
Step 3:
Test statistic:
\[ Z = \frac{X \pm \frac{1}{2} - n p_0}{\sqrt{n p_0 (1 - p_0)}} \]
(where ± ½ denotes the continuity correction)
Step 4:
Computation:
Here \( n p_0 = 634 \times 0.50 = 317 \) and X = 341.
Hence X > np0, so we use X - ½:
\[ Z = \frac{341 - \frac{1}{2} - 317}{\sqrt{634 (0.50)(0.50)}} = \frac{23.5}{12.59} = 1.87 \]
Step 5:
Critical region:
Since α = 0.01, the critical region is given by
Z > 2.33
Step 6:
Conclusion:
Since 1.87 < 2.33, the computed Z does not fall in the critical region. Hence, we conclude that the sociologist’s hunch is
acceptable.
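The computation in this example can be sketched in Python as follows. The helper name `proportion_z` is hypothetical, and leaving X unchanged when it equals np0 exactly is an assumption of this sketch, since the lecture does not cover that case.

```python
from math import sqrt


def proportion_z(successes, n, p0):
    """Z statistic for a proportion, with the continuity correction of 1/2."""
    expected = n * p0
    if successes > expected:
        corrected = successes - 0.5   # move X half a unit towards n * p0
    elif successes < expected:
        corrected = successes + 0.5
    else:
        corrected = successes         # assumption: no correction when X equals n * p0
    return (corrected - expected) / sqrt(n * p0 * (1 - p0))


z = proportion_z(successes=341, n=634, p0=0.50)
critical = 2.33  # tabulated Z for a right-tail area of 0.01, as used in the lecture
print(round(z, 2), "reject H0" if z > critical else "accept H0")
```

The sign of the correction is chosen so that X always moves towards np0, which matches the rule "X > np0, so use X - ½" applied in Step 4.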