252chisq 2/29/08 (Open this document in 'Outline' view!)
E. CHI-SQUARED AND RELATED TESTS.
These tests are generalizations of the one-sample and two-sample tests of proportions. A test of
Goodness of Fit is necessary when a single sample is to be divided into more than two categories. A Test of
Homogeneity is needed when one wants to compare more than one sample. A test of Independence is used
to see if two variables or categorizations are related, but is formally identical to a test of homogeneity.
1. Tests of Homogeneity and Independence
Two possible null hypotheses apply here. The observed data is indicated by O, the expected data by E.
H0: Cities are Homogeneous by income groups

O               City 1  City 2  City 3  City 4   Total       pr
Upper Income       10      15      15      10      50   50/150 = 1/3
Middle Income       5      10      15      10      40   40/150 = 4/15
Lower Income       15      15      10      20      60   60/150 = 2/5
Sample Size        30      40      40      40     150        1
pc             30/150  40/150  40/150  40/150      1
               = 1/5   = 4/15  = 4/15  = 4/15
H0: Sick Days are Independent of Age

O            Days 0  Days 1  Days 2  Days 3   Total       pr
Age 15 - 25     10      15      15      10      50   50/150 = 1/3
Age 26 - 49      5      10      15      10      40   40/150 = 4/15
Age 50 up       15      15      10      20      60   60/150 = 2/5
Total           30      40      40      40     150        1
pc          30/150  40/150  40/150  40/150      1
            = 1/5   = 4/15  = 4/15  = 4/15
The numbers are obviously identical in these two cases. In each case the expected values are computed the same way. There are r = 3 rows, c = 4 columns and rc = 12 cells. n = ΣO = 150. Each cell gets E = pc pr n = pr (Column total). For example, for the upper left corner the expected value is (1/3)(1/5)(150) = (1/3)(30) = 10.
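If a computer is handy, the expected values can all be generated at once. The following Python sketch (illustrative only; it is not part of the original course materials) applies the rule E = pc pr n = (row total)(column total)/n to the table above:

```python
# Observed 3x4 table from the homogeneity example above.
O = [[10, 15, 15, 10],
     [ 5, 10, 15, 10],
     [15, 15, 10, 20]]

n = sum(sum(row) for row in O)              # 150
row_totals = [sum(row) for row in O]        # [50, 40, 60]
col_totals = [sum(col) for col in zip(*O)]  # [30, 40, 40, 40]

# E = pc * pr * n reduces to (row total)(column total)/n for every cell.
E = [[rt * ct / n for ct in col_totals] for rt in row_totals]

print(E[0][0])   # upper left corner: (1/3)(30) = 10.0
```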
E         Column 1  Column 2  Column 3  Column 4   Total       pr
Row 1        10      13 1/3    13 1/3    13 1/3      50   50/150 = 1/3
Row 2         8      10 2/3    10 2/3    10 2/3      40   40/150 = 4/15
Row 3        12      16        16        16          60   60/150 = 2/5
Total        30      40        40        40         150        1
pc        30/150    40/150    40/150    40/150       1
          = 1/5    = 4/15    = 4/15    = 4/15

The formula for the chi-squared statistic is χ² = Σ(O − E)²/E or χ² = ΣO²/E − n. The first of these two formulas is shown below. For an explanation of the equivalence of these two formulas, the reason why the degrees of freedom are as given below, and the relation of the chi-squared test to a z test of proportions, see 252chisqnote.
    E        O     E − O    (E − O)²   (E − O)²/E
 10.0000    10    0.0000     0.0000     0.00000
  8.0000     5    3.0000     9.0000     1.12500
 12.0000    15   -3.0000     9.0000     0.75000
 13.3333    15   -1.6667     2.7778     0.20833
 10.6667    10    0.6667     0.4445     0.04167
 16.0000    15    1.0000     1.0000     0.06250
 13.3333    15   -1.6667     2.7779     0.20834
 10.6667    15   -4.3333    18.7775     1.76038
 16.0000    10    6.0000    36.0000     2.25000
 13.3333    10    3.3333    11.1109     0.83332
 10.6667    10    0.6667     0.4445     0.04167
 16.0000    20   -4.0000    16.0000     1.00000
150.0000   150    0.0000                8.28121
The degrees of freedom for this application are (r − 1)(c − 1) = (3 − 1)(4 − 1) = (2)(3) = 6. The most common test is a one-tailed test, on the grounds that the larger the discrepancy between O and E, the larger Σ(O − E)²/E will be. If our significance level is 5%, compare Σ(O − E)²/E to χ².05(6) = 12.5916. Since our value of this sum, 8.28121, is less than the table chi-squared, do not reject the null hypothesis.
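The whole statistic can likewise be checked by machine. This short Python sketch (again illustrative, not the course's own software) recomputes Σ(O − E)²/E for the table above:

```python
# Chi-squared statistic, computed cell by cell, for the 3x4 table above.
O = [[10, 15, 15, 10],
     [ 5, 10, 15, 10],
     [15, 15, 10, 20]]

n = sum(map(sum, O))
row_totals = [sum(row) for row in O]
col_totals = [sum(col) for col in zip(*O)]

chi_sq = 0.0
for i, rt in enumerate(row_totals):
    for j, ct in enumerate(col_totals):
        e = rt * ct / n                       # expected count for this cell
        chi_sq += (O[i][j] - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)   # (3-1)(4-1) = 6
print(round(chi_sq, 5), df)                  # 8.28125 6
# 8.28125 < 12.5916, the 5% table value for 6 d.f., so do not reject H0.
```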
Note: Rule of thumb for E.
All values of E should be above 5, and we generally combine cells to make this so. However, a value of E below 5 is acceptable if (i) our computed χ² turns out to be less than the table χ², or (ii) the particular value of E makes a very small contribution to Σ(O − E)²/E relative to the value of the total.
Note: Marascuilo Procedure.
The Marascuilo procedure says that, for 2 by c tests, if (i) equality is rejected and (ii) |pa − pb| > √(χ²) sp, where a and b represent two groups, the chi-squared has c − 1 degrees of freedom, and the standard deviation is sp = √(pa qa/na + pb qb/nb), then you can say that you have a significant difference between pa and pb. This is equivalent to using a confidence interval of pa − pb = (pa − pb) ± √(χ²(c − 1) (pa qa/na + pb qb/nb)).
Example: Pelosi and Sandifer give the data below for satisfaction with local phone service classified by
type of company providing the service. 1) Is the proportion of people who rate the phone service as
excellent independent of the type of company? 2) If it is not, test for a difference in the proportion who rate
their service as excellent against the best-rated provider. Remember that this is a test of equality of
proportions.
Service 1 is a long distance company, Service 2 is a local phone company, Service 3 is a power company,
Service 4 is CATV (cable) and Service 5 is Cellular. p1 is thus the proportion of long distance company
customers that rate their service as excellent.
H0: p1 = p2 = p3 = p4 = p5   H1: Not all ps equal.
p1 = .1592   p2 = .2520   p3 = .2127   p4 = .3328   p5 = .2571
n1 = 1658   n2 = 1762   n3 = 616   n4 = 646   n5 = 770
Solution: Set up the O table. To get the number that rate service as excellent for long distance, note that p1 n1 = .1592(1658) = 263.95. But this must be a whole number, so round it to 264. The number that do not rate it as excellent is 1658 − 264 = 1394. This gives us our first column. The quantity pq/n is also computed for use later.
O                     Long Dist  Local Ph   Power    CATV   Cellular   Total      pr
Excellent                264        444       131      215      198     1252    .2296
Not                     1394       1318       485      431      572     4200    .7704
Sum                     1658       1762       616      646      770     5452   1.0000
Proportion Excellent   .1592      .2520     .2127    .3328    .2571
pq/n                  .0000807   .0001070  .0002718 .0003437 .0002481
Note that in addition to computing the overall proportion of excellent and not excellent service (.2296 and .7704), the 'proportion excellent' has been computed for each type of service, as well as the variance pa qa/na used in the confidence interval formula. If we apply the proportions in each row to the column sums we get the following expected values.
E             Long Dist  Local Ph   Power    CATV   Cellular   Total      pr
Excellent       380.68    404.56   141.43   148.32   176.79     1252    .2296
Not            1277.32   1357.44   474.57   497.68   593.21     4200    .7704
Sum               1658      1762      616      646      770     5452   1.0000
The chi-squared test follows.
Row      E         O       E − O      (E − O)²   (E − O)²/E
 1     380.68     264     116.677     13613.5     35.7612
 2    1277.32    1394    -116.677     13613.5     10.6578
 3     404.56     444     -39.445      1555.9      3.8459
 4    1357.44    1318      39.445      1555.9      1.1462
 5     141.43     131      10.434       108.9      0.7697
 6     474.57     485     -10.434       108.9      0.2294
 7     148.32     215     -66.678      4446.0     29.9755
 8     497.68     431      66.678      4446.0      8.9335
 9     176.79     198     -21.208       449.8      2.5441
10     593.21     572      21.208       449.8      0.7582
      5452.00    5452       0.000                 94.622
The degrees of freedom are (r − 1)(c − 1) = (2 − 1)(5 − 1) = (1)(4) = 4 and χ².05(4) = 9.488, so we reject the null hypothesis and say that there is a difference between the proportions that rate their service as excellent. Since the highest proportion satisfied was with CATV, we compare each proportion with the proportion calculated for CATV using the confidence interval formula above.

Long distance  .1592 − .3328 ± √(9.488(.0000807 + .0003437)) = −.1736 ± .0635
Local Phone    .2520 − .3328 ± √(9.488(.0001070 + .0003437)) = −.0808 ± .0654
Power          .2127 − .3328 ± √(9.488(.0002718 + .0003437)) = −.1201 ± .0764
Cellular       .2571 − .3328 ± √(9.488(.0002481 + .0003437)) = −.0757 ± .0749

Notice that the absolute size of the ± error term is always smaller than the absolute size of the difference in proportions, so we can say that all of these differences are significant. Though I have not checked it, I doubt that, if we compared all the other proportions with the proportion saying cellular service is excellent, we would get such strong results.
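As a rough check on the arithmetic, the Marascuilo comparisons can be sketched in Python. The proportions, sample sizes, and the table value 9.488 for χ² with 4 degrees of freedom are taken from the example above; the code itself is only an illustration:

```python
import math

# chi-squared table value for c - 1 = 4 d.f. at the 5% level.
CHI_SQ_4 = 9.488

p = {"Long Dist": .1592, "Local Ph": .2520, "Power": .2127,
     "CATV": .3328, "Cellular": .2571}
n = {"Long Dist": 1658, "Local Ph": 1762, "Power": 616,
     "CATV": 646, "Cellular": 770}

best = "CATV"   # highest 'excellent' proportion
for svc in p:
    if svc == best:
        continue
    diff = p[svc] - p[best]
    # Half-width of the Marascuilo interval for this pair of groups.
    half = math.sqrt(CHI_SQ_4 * (p[svc] * (1 - p[svc]) / n[svc]
                                 + p[best] * (1 - p[best]) / n[best]))
    verdict = "significant" if abs(diff) > half else "not significant"
    print(f"{svc}: {diff:+.4f} +/- {half:.4f}  {verdict}")
```

Each printed line mirrors one row of the hand computation above; a difference is significant when its absolute size exceeds the half-width.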
2. Tests of Goodness of Fit
a. Uniform Distribution
Let us pool the data above, that is, treat it all as if it were one sample, and ask if it is uniformly distributed.
H0: Uniform distribution

  E      O     E − O   (E − O)²   (E − O)²/E
 50     50       0        0           0
 50     40      10      100           2
 50     60     -10      100           2
150    150       0                    4

Since there are 3 numbers here, there are 2 degrees of freedom. Since 4 is less than χ².05(2) = 5.9915, we cannot reject the null hypothesis. An easier way to do this is to compute χ² = ΣO²/E − n.
  E      O     O²/E
 50     50      50
 50     40      32
 50     60      72
150    150     154

ΣO²/E − n = 154 − 150 = 4, as before.
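Both versions of the formula are easy to verify by machine. This short Python sketch (illustrative only) computes the statistic both ways for the pooled data:

```python
# Pooled observed counts and the uniform expected counts from above.
O = [50, 40, 60]
E = [50, 50, 50]

chi_sq_1 = sum((o - e) ** 2 / e for o, e in zip(O, E))   # sum (O-E)^2/E
chi_sq_2 = sum(o * o / e for o, e in zip(O, E)) - sum(O)  # sum O^2/E - n

print(chi_sq_1, chi_sq_2)   # 4.0 4.0 -- the same answer either way
# 4.0 < 5.9915, the 5% table value for 2 d.f., so do not reject H0.
```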
For a combined Chi-square test of both uniformity and homogeneity, see 252chisqx1.
b. Poisson Distribution
Example: I believe that there is almost a daily accident on my corner. To make this into a testable hypothesis, let us say that I believe that the distribution is Poisson with a parameter of 0.8 and that I observe the numbers of accidents shown below over 200 days. For example, there are 100 days with no accidents, 60 days with 1, etc.

H0: Poisson(0.8)
H1: Not Poisson(0.8)

To get f, I look up f on the Poisson table and multiply by n = 200, using the formula E = fn. Unfortunately, I cannot work with E as it appears here: I must have each E at least 5. To fix the problem, I add the smallest cells together to increase E to 5 or more.

 x       O       f       E = fn
 0     100    .4493      89.86
 1      60    .3595      71.90
 2      30    .1438      28.76
 3       6    .0383       7.66
 4       0    .0077       1.54 (<5)
 5       4    .0012       0.24 (<5)
 6       0    .0002       0.04 (<5)
 7+      0    .0000       0.00 (<5)
       200   1.0000      200.00

After combining the cells for x = 3 and above:

 x       O       E        O²/E
 0     100     89.86     111.28
 1      60     71.90      50.07
 2      30     28.76      31.29
 3+     10      9.48      10.56
       200    200.00     203.20

χ² = ΣO²/E − n = 203.20 − 200 = 3.20. Since I did not estimate the mean of 0.8 from the data, I have 3 degrees of freedom. χ².05(3) = 7.815, so I do not reject the null hypothesis.
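If a computer is available, the Poisson probabilities need not come from a table. This Python sketch (illustrative only; because it recomputes f from the Poisson mass function instead of the rounded table values, its total differs from 3.20 in the last decimal) repeats the combined-cell calculation:

```python
import math

m, n = 0.8, 200
# Poisson(0.8) probabilities for x = 0, 1, 2, then everything else lumped
# into a single x >= 3 cell so that every E is at least 5.
f = [math.exp(-m) * m ** x / math.factorial(x) for x in range(3)]
f.append(1 - sum(f))                     # P(x >= 3)

O = [100, 60, 30, 10]                    # observed, with x >= 3 combined
E = [p * n for p in f]                   # approx [89.87, 71.89, 28.76, 9.48]

chi_sq = sum(o * o / e for o, e in zip(O, E)) - n
print(round(chi_sq, 2))                  # about 3.2; 3.2 < 7.815, do not reject
```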
But what if my hypotheses are simply H0: Poisson, H1: Not Poisson? Then I would have to estimate the mean from the data. Looking at the x and O columns I calculate x̄ = [0(100) + 1(60) + 2(30) + 3(6) + 5(4)]/200 = 158/200 = 0.79. I would still use the Poisson distribution with a parameter of 0.8 unless I had a computer handy to compute it with a parameter of 0.79, but my degrees of freedom are now 3 − 1 = 2, because I used the data to estimate a parameter.
c. Normal Distribution
A common way to set up a  2 test of normality is to group data starting at the
mean and ending each group one-half of a standard deviation from the mean. One can
proceed outward from the mean until four or five groups have been sectioned off in each
direction. For example, if our null hypothesis is that x ~ N 100 ,10  , we can start at 100 and
let the width of each group be one half of   10 or 5. The groups would be 100-105,
105-110, etc. going up, and 95-100, 90-95, etc. going down. Then for the highest number in
x
each interval, compute z 
. For example, for the interval 90-95 compute

95  100
z
 0.5 . Then use the normal distribution to compute F z  .
10
. For example F (0.5)  Pz  0.5  .5  .1915  .3085 . Then, to find the frequency of
the interval, subtract this F z  from the F z  for the previous interval. An example of
calculating E this way is shown below.
H0: x ~ N(100, 10) and n = 1000

x interval      z       F(z)       f      E = fn
-∞ - 80       -2.0     .0228     .0228     22.8
80 - 85       -1.5     .0668     .0440     44.0
85 - 90       -1.0     .1587     .0919     91.9
90 - 95       -0.5     .3085     .1498    149.8
95 - 100       0.0     .5000     .1915    191.5
100 - 105      0.5     .6915     .1915    191.5
105 - 110      1.0     .8413     .1498    149.8
110 - 115      1.5     .9332     .0919     91.9
115 - 120      2.0     .9772     .0440     44.0
120 - ∞        ∞      1.0000     .0228     22.8
For smaller values of n we may find that some numbers in E are less than 5, so that we have to combine some intervals. In the above example the degrees of freedom for χ² are 10 − 1 = 9 if the mean and variance are known. If they both had to be computed from the data, the degrees of freedom would be reduced by 2, to 7.
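A machine can replace the Normal table here as well. The following Python sketch (illustrative only) computes f = F(upper) − F(lower) and E = fn for the intervals above, using the exact Normal cdf rather than four-place table values:

```python
import math

mu, sigma, n = 100, 10, 1000

def F(x):
    # Cumulative distribution of N(mu, sigma), via the error function.
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Interval boundaries: half-sigma steps out to two sigma on each side.
edges = [-math.inf, 80, 85, 90, 95, 100, 105, 110, 115, 120, math.inf]
E = [(F(b) - F(a)) * n for a, b in zip(edges, edges[1:])]

for (a, b), e in zip(zip(edges, edges[1:]), E):
    print(f"{a} - {b}: E = {e:.1f}")
# e.g. the 90-95 interval gives E close to the 149.8 in the table above
```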
3. Kolmogorov-Smirnov Test
a. Kolmogorov-Smirnov One-Sample Test
This is a more powerful test of goodness of fit than the Chi-Squared test. Unfortunately, it can only be
used when the distribution in the null hypothesis is totally specified. For example, if we wanted to do the
test for Poisson(0.8) above, we would look up the cumulative distribution Fe for Poisson(0.8) and proceed
as below. Note that this would not work if our hypothesis was that the distribution was Poisson without the
mean specified.
x       O      O/n     Fo       Fe      D = |Fo − Fe|
0      100    .50     .50     .4493       .0507
1       60    .30     .80     .8088       .0088
2       30    .15     .95     .9526       .0026
3        6    .03     .98     .9909       .0109
4        0    .00     .98     .9986       .0186
5        4    .02    1.00     .9998       .0002
6        0    .00    1.00    1.0000       .0000
7+       0    .00    1.00    1.0000       .0000
       200   1.00
The maximum difference is MaxD = .0507, which must be checked against the Kolmogorov-Smirnov table for n = 200. According to the table, for α = .05, the critical value is 1.36/√200 = .0962. Since MaxD is less than .0962, accept the null hypothesis.
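The whole comparison can be sketched in Python (illustrative only; here Fe comes from the exact Poisson(0.8) cumulative distribution rather than the four-place table):

```python
import math

m, n = 0.8, 200
O = [100, 60, 30, 6, 0, 4, 0, 0]         # observed counts for x = 0, ..., 7+

# Observed cumulative distribution Fo.
Fo, cum = [], 0
for o in O:
    cum += o
    Fo.append(cum / n)

# Expected cumulative distribution Fe from the Poisson(0.8) mass function.
Fe, cum = [], 0.0
for x in range(len(O)):
    cum += math.exp(-m) * m ** x / math.factorial(x)
    Fe.append(min(cum, 1.0))

max_D = max(abs(fo - fe) for fo, fe in zip(Fo, Fe))
crit = 1.36 / math.sqrt(n)               # K-S critical value for alpha = .05
print(round(max_D, 4), round(crit, 4))   # max D about .0507 < .0962
```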
b. Lilliefors Test.
Because the Kolmogorov-Smirnov Test is so limited in application, it proved advantageous to
develop a special version of that test to use to test for a normal distribution when the mean and variance are
unknown. Once a sample mean and variance are found, this test is identical to the K-S Test except for the
use of a special table.
Problem E9: Is the following data normal?
420, 440, 445, 450, 460, 475, 480, 500, 520, 530
Solution: Assume α = .05. H0: Normal. The only practical method is the Lilliefors method. Question: why is Chi-squared impractical and Kolmogorov-Smirnov impossible?
The numbers must be in order before we begin computing cumulative probabilities! Checking the data we find that x̄ = 472 and s = 35.92. We compute z = (x − x̄)/s. (This is really a t.) Fe is the cumulative distribution, gotten from the Normal table by adding or subtracting 0.5. Fo comes from the fact that there are 10 numbers, so that each number is one-tenth of the distribution.
For α = .05 and n = 10 the critical value from the Lilliefors table is 0.2616. Since the largest deviation here is .1293, we do not reject H0.
Remember that the Lilliefors method is a specialized version of the KS method used only in situations
where you are testing for a Normal distribution and using a sample mean and standard deviation estimated
from the data. The KS method can only be used in situations where the null hypothesis including parameters
is specified in advance. A Chi-squared test of goodness of fit is usually considered a large sample test, but
can be adjusted for estimation of parameters.
 x       z       Fo       Fe       D
420    -1.45   0.1000   .0735   .0265
440    -0.89   0.2000   .1867   .0133
445    -0.75   0.3000   .2266   .0734
450    -0.61   0.4000   .2709   .1291
460    -0.33   0.5000   .3707   .1293
475     0.08   0.6000   .5319   .0681
480     0.22   0.7000   .5871   .1129
500     0.78   0.8000   .7823   .0177
520     1.34   0.9000   .9099   .0099
530     1.61   1.0000   .9463   .0537
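For completeness, the Lilliefors computation can be sketched in Python as well (illustrative only; because the code uses the exact Normal cdf instead of two-decimal z values, its maximum deviation differs slightly from the .1293 in the table):

```python
import math

data = [420, 440, 445, 450, 460, 475, 480, 500, 520, 530]
n = len(data)
mean = sum(data) / n                                         # 472.0
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # about 35.92

def F(z):
    # Standard normal cumulative distribution.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Standardize with the SAMPLE mean and s (this is what makes it Lilliefors),
# then compare Fo = (i+1)/n with Fe = F(z) as in the K-S test.
D = [abs((i + 1) / n - F((x - mean) / s)) for i, x in enumerate(sorted(data))]
max_D = max(D)
print(round(max_D, 4))   # about .13, well below the 0.2616 Lilliefors value
```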
©2002 Roger Even Bove