Yr      Yes                          No                        Total
Fr      15 (15%)  E: 30  Chi: 7.5    85  E: 70  Chi: 3.2       100
So      25 (25%)  E:     Chi:        75  E:     Chi:           100
Jr      30 (30%)  E:     Chi:        70  E:     Chi:           100
Se      50 (50%)  E:     Chi:        50  E:     Chi:           100
Total   120                          280                       400
Expected count = (row total × column total) / overall total
Chi-square contribution = (observed − expected)² / expected
Get the chi-square statistic by summing the contributions from all the cells.
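The cell-by-cell calculation can be sketched in a few lines of Python. This is only an illustration using the class-year counts from the slide; the helper names `expected_count` and `contribution` are made up for this sketch and do not come from the lecture.

```python
# Observed counts from the class-year table: [Yes, No] for each class.
observed = {
    "Fr": [15, 85],
    "So": [25, 75],
    "Jr": [30, 70],
    "Se": [50, 50],
}

row_totals = {yr: sum(counts) for yr, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())


def expected_count(row_total, col_total, total):
    """Hypothetical helper: (row total x column total) / overall total."""
    return row_total * col_total / total


def contribution(obs, exp):
    """Hypothetical helper: (observed - expected)^2 / expected."""
    return (obs - exp) ** 2 / exp


# Sum the contributions from every cell to get the chi-square statistic.
chi_square = 0.0
for yr, counts in observed.items():
    for j, obs in enumerate(counts):
        exp = expected_count(row_totals[yr], col_totals[j], grand_total)
        chi_square += contribution(obs, exp)

print(round(chi_square, 3))
```

For the Fr/Yes cell this reproduces the values shown on the slide: an expected count of 30 and a contribution of 7.5.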
How many df for the chi-square statistic?
The chi-square statistic follows a chi-square distribution with a specific number of degrees of freedom (df).
Yr      Yes         No     Total
Fr      15 (15%)    85     100
So      25 (25%)    75     100
Jr      30 (30%)    70     100
Se      50 (50%)    50     100
Total   120         280    400
df = (r – 1)(c – 1)
r = # row variable categories
c = # column variable categories
For our example, df =
A. 1 B. 2
C. 3 D. 4
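One way to check an answer is to count the categories directly from the table. The sketch below assumes the table is stored as a list of rows with the Total row and Total column left out; the function name `degrees_of_freedom` is hypothetical.

```python
def degrees_of_freedom(table):
    """Hypothetical helper: df = (r - 1)(c - 1).

    `table` is a list of rows of observed counts.
    The Total row and Total column must NOT be included.
    """
    r = len(table)     # number of row variable categories
    c = len(table[0])  # number of column variable categories
    return (r - 1) * (c - 1)


# Class-year example: rows Fr, So, Jr, Se; columns Yes, No.
class_year_table = [[15, 85], [25, 75], [30, 70], [50, 50]]
df = degrees_of_freedom(class_year_table)
print(df)
```

Including the totals as an extra row or column is a common mistake, because it inflates both r and c by one.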
Example 2: Two Different Categorical Variables

PSU First Choice   (1 – 2)   (3 – 5)   (6 – 8)   (≥ 9)   Total
No                      12        60        29      19     120
Yes                    100       128        41      11     280
Total                  112       188        70      30     400
Double all counts to get:

PSU First Choice   (1 – 2)   (3 – 5)   (6 – 8)   (≥ 9)   Total
No                      24       120        58      38     240
Yes                    200       256        82      22     560
Total                  224       376       140      60     800
Original counts: Pearson chi-square = 40.392
Doubled counts: Pearson chi-square = 80.784, which is exactly doubled!
For fixed proportions, increasing the sample size strengthens the evidence against H0.
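The doubling effect can be checked numerically. The sketch below recomputes the Pearson statistic for the Example 2 counts and for the same counts multiplied by two; `pearson_chi_square` is a hypothetical helper written for this illustration, not a library function.

```python
def pearson_chi_square(table):
    """Hypothetical helper: sum of (observed - expected)^2 / expected over all cells.

    `table` is a list of rows of observed counts, totals excluded.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            stat += (obs - exp) ** 2 / exp
    return stat


# Example 2 counts: rows are PSU First Choice = No, Yes.
original = [[12, 60, 29, 19], [100, 128, 41, 11]]
doubled = [[2 * n for n in row] for row in original]

chi_original = pearson_chi_square(original)
chi_doubled = pearson_chi_square(doubled)
print(round(chi_original, 3), round(chi_doubled, 3))
```

The reason is visible in the formula: doubling every count doubles both the observed and the expected value in each cell, so (observed − expected)² grows by a factor of 4 while the denominator grows by a factor of 2, and every contribution doubles.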
Hypotheses for ANOVA:
Remember: There are groups within the population, as
defined by their values of the categorical variable.
H0: Population means are the same for each group.
Ha: Not all population group means are the same.
H0: µ1 = µ2 = … = µk
Ha: Not all µ’s are the same.
In our particular situation…
H0: Each class (F, So, J, Se) has the same mean study time.
Ha: The mean study times are not the same for each class.
ANOVA output is summarized in a single table:

Analysis of Variance
Source      DF    Adj SS   Adj MS   F-Value   P-Value
CollYear     3     186.7    62.23      0.70     0.552
Error      283   25087.6    88.65
Total      286   25274.2

The second row gives the “within” variation.
The F statistic is merely the
ratio of the MS between to the
MS within. It is the test
statistic we use for ANOVA!
One more ANOVA table fact: The MS error, or MS
within, is also called the pooled sample variance. You
can take its square root to get the pooled standard
deviation.
The p-value is based on
the F statistic and the two
DF values for between
and within.
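The relationships among the ANOVA table entries can be verified with a little arithmetic. This sketch uses only the DF and Adj SS values from the output above; the variable names are chosen for this illustration.

```python
import math

# DF and Adj SS taken from the ANOVA output (CollYear = between, Error = within).
df_between, df_within = 3, 283
ss_between, ss_within = 186.7, 25087.6

# Each mean square is its sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F statistic is the ratio of MS between to MS within.
f_statistic = ms_between / ms_within

# MS within is the pooled sample variance; its square root is the pooled SD.
pooled_sd = math.sqrt(ms_within)

print(round(ms_between, 2), round(ms_within, 2))
print(round(f_statistic, 2), round(pooled_sd, 2))
```

The p-value itself requires the F distribution with 3 and 283 degrees of freedom, which the standard library does not provide. If SciPy is available, `scipy.stats.f.sf(f_statistic, df_between, df_within)` is one way to obtain the upper-tail probability.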