Cross-Tabulations

We have been looking at these for some time already.
An arrangement of two categorical variables into rows and columns
  Row variable
  Column variable
Tells about relationships between two categorical variables

Depression and a new baby, Fathers

           |         baby
   depress |         0          1 |     Total
-----------+----------------------+----------
         0 |        92         59 |       151
           |     60.93      39.07 |    100.00
           |     75.41      71.95 |     74.02
-----------+----------------------+----------
         1 |        30         23 |        53
           |     56.60      43.40 |    100.00
           |     24.59      28.05 |     25.98
-----------+----------------------+----------
     Total |       122         82 |       204
           |     59.80      40.20 |    100.00
           |    100.00     100.00 |    100.00

Stress and social class

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.92      23.02      14.07 |    100.00
           |     59.42      64.75      80.88 |     62.96
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     73.04      21.30       5.65 |    100.00
           |     40.58      35.25      19.12 |     37.04
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |     66.67      22.38      10.95 |    100.00
           |    100.00     100.00     100.00 |    100.00

What goes into the cells?
Frequencies
  Cell
  Margin
  Total
Row percentages
Column percentages
Total percentages

Percentages

Independent variable: suspected cause
Dependent variable: suspected effect
Percentages should be based on the independent or causal variable

Stress and social class

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.92      23.02      14.07 |    100.00
           |     59.42      64.75      80.88 |     62.96
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     73.04      21.30       5.65 |    100.00
           |     40.58      35.25      19.12 |     37.04
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |     66.67      22.38      10.95 |    100.00
           |    100.00     100.00     100.00 |    100.00

Make comparisons

Compare categories of the independent variable
To see the effect on the proportion in one category of the dependent variable
To make comparisons we must be sure the comparisons make sense, that is, that they are of the same thing: not apples with oranges!

Independence

Two variables, A and B, are independent if p(A) = p(A|B)
p(Stress) = .37, p(Stress|Hi class) = .19
Also, note
  p(s|low) = .41
  p(s|mid) = .35
  p(s|hi) = .19
Also note, these are from the appropriate percentages (the column percentages), since class is the suspected cause of stress.

Independence

If there is independence, then
  p(s) = p(s|lo) = p(s|mid) = p(s|hi)
What would the frequencies be if there were independence?
  p(s) = .37 = p(s|lo) = p(s|mid) = p(s|hi)
This .37 is taken from the margin (the unconditional probability of stress)

Apply this

Each cell shows the observed frequency, the column percentage under independence, and the expected frequency.

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.96      62.96      62.96 |     62.96
           |    260.65      87.52      42.81 |
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     37.04      37.04      37.04 |     37.04
           |    153.35      51.48      25.19 |
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |    100.00     100.00     100.00 |    100.00

Observed and Expected

Are they the same?
Then p(s) = p(s|class): independence
Are they different?
Then p(s) ≠ p(s|class): relationship
How can we tell?

$$ \sum \frac{(\mathrm{Obs} - \mathrm{Exp})^2}{\mathrm{Exp}} $$

Look at parts of formula

$$ \sum \frac{(\mathrm{Obs} - \mathrm{Exp})^2}{\mathrm{Exp}} $$

What if we just sum the differences without squaring?
How big is a difference of 5 points?
What happens when there are lots of cells in the table we are looking at?
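
The observed-versus-expected comparison can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the slides. It uses the observed frequencies from the stress and social class table, and the function and variable names (`expected_counts`, `sum_sq_diff_over_expected`, `raw_diff_sum`) are made up for this example. The quantity it computes, the sum over cells of (Obs - Exp)^2 / Exp, is usually called the Pearson chi-square statistic, a name the slides do not use. Expected counts here come from unrounded proportions, so some values may differ in the second decimal from the table under "Apply this" (for example 260.67 instead of 260.65).

```python
# Observed frequencies from the stress-by-class table
# (outer keys: stress level, inner keys: social class).
observed = {
    "Low":  {"Low": 246, "Middle": 90, "Upper": 55},
    "High": {"Low": 168, "Middle": 49, "Upper": 13},
}


def expected_counts(obs):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = {r: sum(cells.values()) for r, cells in obs.items()}
    col_totals = {}
    for cells in obs.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    total = sum(row_totals.values())
    # Under independence every column shares the marginal row proportion,
    # so expected = column total * (row total / grand total).
    return {
        r: {c: col_totals[c] * row_totals[r] / total for c in col_totals}
        for r in obs
    }


def sum_sq_diff_over_expected(obs, exp):
    """Sum over all cells of (Obs - Exp)^2 / Exp."""
    return sum(
        (obs[r][c] - exp[r][c]) ** 2 / exp[r][c]
        for r in obs
        for c in obs[r]
    )


expected = expected_counts(observed)
statistic = sum_sq_diff_over_expected(observed, expected)

# Summing the raw differences, without squaring, for comparison.
raw_diff_sum = sum(
    observed[r][c] - expected[r][c] for r in observed for c in observed[r]
)

for r in observed:
    print(r, [round(expected[r][c], 2) for c in observed[r]])
print("sum of (Obs - Exp)^2 / Exp:", round(statistic, 2))
print("sum of raw differences:", round(raw_diff_sum, 6))
```

The last line shows why the differences are squared: the raw differences cancel to zero (up to floating-point error), because the observed and expected counts share the same margins. Dividing by Exp puts each difference on the scale of its cell, so a 5-point gap counts for more in a cell expecting 25 cases than in one expecting 260.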