Statistics 400 - Lecture 23

Last Day: Regression
Today: Finish Regression, Test for Independence (Section 13.4)
Suggested problems: 13.21, 13.23

Computer Output
Will not normally compute regression line, standard errors, … by hand
Key will be identifying what computer is giving you

SPSS Example
[SPSS output: Model Summary, ANOVA, and Coefficients tables]
What is the Coefficients Table?
What is the Model Summary?
What is the ANOVA Table?

Back to Probability
The probability of an event, A, occurring can often be modified after observing whether or not another event, B, has taken place
Example: An urn contains 2 green balls and 3 red balls. Suppose 2 balls are selected at random one after another without replacement from the urn.
Find P(Green ball appears on the first draw)
Find P(Green ball appears on the second draw)

Conditional Probability
The Conditional Probability of A given B:
$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$
Example: An urn contains 2 green balls and 3 red balls. Suppose 2 balls are selected at random one after another without replacement from the urn.
A = {Green ball appears on the second draw}
B = {Green ball appears on the first draw}
Find P(A|B) and P(A^c|B)

Example: Records of student patients at a dentist’s office concerning fear of visiting the dentist suggest the following proportions

                       School Level
                       Elementary   Middle   High
Fear Dentist           0.12         0.08     0.05
Do Not Fear Dentist    0.28         0.25     0.22

Let A = {Fears Dentist}; B = {Middle School}
Find P(A|B)

Conditional Probability and Independence
If fearing the dentist does not depend on age or school level, what would we expect the probability distribution in the previous example to look like?
What does this imply about P(A|B)?
If A and B are independent, what form should the conditional probability take?
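
The two conditional-probability examples above can be checked numerically. What follows is a minimal sketch, assuming Python rather than the SPSS used in the lecture; the helper `prob` and all variable names are hypothetical and chosen only for illustration. The urn example is handled by listing every equally likely ordered pair of draws, and the dentist example by applying the definition P(A|B) = P(A and B) / P(B) to the table of proportions.

```python
from fractions import Fraction
from itertools import permutations

# Urn with 2 green (G) and 3 red (R) balls; balls are tracked by index so
# that the two green balls count as distinct, equally likely outcomes.
urn = ["G", "G", "R", "R", "R"]
draws = list(permutations(range(len(urn)), 2))  # ordered pairs, no replacement


def prob(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    favourable = sum(1 for d in draws if event(d))
    return Fraction(favourable, len(draws))


def first_green(d):
    return urn[d[0]] == "G"


def second_green(d):
    return urn[d[1]] == "G"


p_b = prob(first_green)                       # P(B): green on first draw
p_a = prob(second_green)                      # P(A): green on second draw
p_a_and_b = prob(lambda d: first_green(d) and second_green(d))
p_a_given_b = p_a_and_b / p_b                 # definition of P(A | B)
p_not_a_given_b = 1 - p_a_given_b             # complement rule for P(A^c | B)

# Dentist example: joint proportions taken from the table in the lecture.
joint = {
    ("Fear", "Elementary"): 0.12, ("Fear", "Middle"): 0.08, ("Fear", "High"): 0.05,
    ("NoFear", "Elementary"): 0.28, ("NoFear", "Middle"): 0.25, ("NoFear", "High"): 0.22,
}
p_middle = joint[("Fear", "Middle")] + joint[("NoFear", "Middle")]  # P(B)
p_fear_given_middle = joint[("Fear", "Middle")] / p_middle          # P(A | B)
p_fear = sum(p for (fear, _), p in joint.items() if fear == "Fear")  # P(A)

print(p_a, p_a_given_b, p_not_a_given_b)
print(round(p_fear_given_middle, 4), round(p_fear, 4))
```

In the urn example P(A) and P(A|B) differ, so the two draws are not independent. In the dentist example P(A|B) is close to, but not equal to, P(A); under exact independence the two would coincide.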

Summarizing Bivariate Categorical Data
Have studied bivariate continuous data (regression)
Often have two (or more) categorical measurements taken on the same sampling unit
Data usually summarized in 2-way tables
Often called contingency tables

Test for Independence
Situation: We draw ONE random sample of predetermined size and record 2 categorical measurements
Because we do not know in advance how many sampled units will fall into each category, neither the column totals nor the row totals are fixed
Example: Survey conducted by sampling 400 people who were questioned regarding union membership and attitude towards decreased spending on social programs

               Union   Non-Union   Total
Support        112     84          196
Indifferent    36      68          104
Opposed        28      72          100
Total          176     224         400

Would like to see if the distribution of union membership is independent of support for social programs
If the two distributions are independent, what does that say about the probability of a randomly selected individual falling into a particular category?
What would the expected count be for each cell?
What test statistic could we use?

Formal Test
Hypotheses:
Test Statistic:
P-Value:

Spurious Dependence
Consider admissions from a fictional university by gender

         Male   Female   Total
Admit    490    280      770
Deny     210    220      430

         Male   Female
Admit    0.70   0.56
Deny     0.30   0.44

Is there evidence of discrimination?
Consider same data, separated by schools applied to:

Business School:
         Male   Female
Admit    480    180
Deny     120    20

         Male   Female
Admit    0.80   0.90
Deny     0.20   0.10

Law School:
         Male   Female
Admit    10     100
Deny     90     200

         Male   Female
Admit    0.10   0.33
Deny     0.90   0.67

Simpson’s Paradox:
Reversal of comparison due to aggregation
Contradiction of initial finding because of presence of a lurking variable
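
Two of the calculations in this lecture can be sketched in code: the expected counts and chi-square statistic for the union membership table, and the reversal in admission rates in the Simpson's Paradox example. This is a minimal sketch, assuming Python rather than SPSS; the function names `expected_counts`, `chi_square_statistic`, and `admit_rate` are hypothetical. It uses the usual statistic, the sum over cells of (observed - expected)^2 / expected, with expected count = (row total)(column total) / n. The p-value line relies on the fact that, for 2 degrees of freedom only, the chi-square upper-tail probability equals exp(-x/2); other table sizes would need a chi-square table or a statistics library.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: Support, Indifferent, Opposed. Columns: Union, Non-Union.
union_table = [[112, 84], [36, 68], [28, 72]]
expected = expected_counts(union_table)
chi2 = chi_square_statistic(union_table)
df = (len(union_table) - 1) * (len(union_table[0]) - 1)  # (rows-1)(cols-1) = 2
# Valid for df == 2 only: upper-tail chi-square probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)


def admit_rate(admit, deny):
    """Proportion admitted among all applicants in a group."""
    return admit / (admit + deny)


# (admit, deny) counts by school and gender, from the lecture tables.
schools = {
    "Business": {"Male": (480, 120), "Female": (180, 20)},
    "Law": {"Male": (10, 90), "Female": (100, 200)},
}
rates_by_school = {
    school: {g: admit_rate(*counts) for g, counts in groups.items()}
    for school, groups in schools.items()
}
# Aggregate over schools to recover the combined table.
overall = {}
for g in ("Male", "Female"):
    admit = sum(schools[s][g][0] for s in schools)
    deny = sum(schools[s][g][1] for s in schools)
    overall[g] = admit_rate(admit, deny)

print(round(chi2, 2), df, p_value)
print(rates_by_school, overall)
```

For the union table the statistic comes out at roughly 27.85 on 2 degrees of freedom, which gives a very small p-value. For the admissions data, women have the higher admission rate within each school while men have the higher rate in the combined table, which is the reversal the slide describes.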