Section 4.2: How to Look for Differences

Cross-Tabulations

College student binge drinkers experienced many personal and social problems, the researchers said. For instance, 90% of those who binged more than once a week said they experienced a hangover and 63% said they did something they regretted.
--- Newsday, 12/94

Making Sense of the Data

Were binge drinkers more likely to do things they regretted than students who did not binge drink?

Defn: Binge drinking is consuming five or more drinks in a row for men, or four or more in a row for women.

Wechsler Study on American College Students

                            Nonbinge      Infrequent        Frequent
                            drinkers      binge drinkers    binge drinkers
                            (n=6894)      (n=4090)          (n=3291)
Did something you regret    14%           37%               63%
Missed a class              8%            30%               61%
Got behind in schoolwork    6%            21%               46%

Cross-Tab Tables

Categories of one measurement are listed across the top and categories of the other measurement are listed down the side. Each cell lists the number of people who fall into it.

Wechsler Cross Tab

                              Nonbinge    Infrequent Binge    Frequent Binge
Kept up with schoolwork       6,480       3,231               1,777
Got behind with schoolwork    414         859                 1,514

Marginal Totals and Proportions

Cross-tabs often show the total (or proportion) for each column in the bottom margin under the column, and the total (or proportion) for each row in the margin to the right of the row.

Cross-Tab with Column Marginals

                              Nonbinge    Infrequent Binge    Frequent Binge
Kept up with schoolwork       6,480       3,231               1,777
Got behind with schoolwork    414         859                 1,514
Column Count                  6,894       4,090               3,291
Column %                      48%         29%                 23%

Correlation

A cross-tab is a useful way of checking for correlation between two categorical variables. Two variables are correlated if knowing an observation's value on one measurement helps you better estimate the likelihood of its values on the other measurement.

Correlation: Another View

If the counts you observed match the counts you would expect if there were no correlation, then there is no correlation.
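The column marginals in the Wechsler cross-tab can be reproduced with a short script. The following is a minimal Python sketch, not part of the original slides; the variable names (`groups`, `observed`, `column_counts`, `column_pct`, `behind_pct`) are illustrative assumptions. It also recomputes the share of each group that got behind with schoolwork, which is the comparison the correlation slides rely on.

```python
# Observed counts from the Wechsler cross-tab.
# Rows are outcomes, columns are drinking groups (names below are illustrative).
groups = ["Nonbinge", "Infrequent Binge", "Frequent Binge"]
observed = {
    "Kept up with schoolwork": [6480, 3231, 1777],
    "Got behind with schoolwork": [414, 859, 1514],
}

# Column Count: add the cells in each column.
column_counts = [sum(col) for col in zip(*observed.values())]
grand_total = sum(column_counts)

# Column %: each group's share of all students in the table.
column_pct = [round(100 * count / grand_total) for count in column_counts]

# Share of each group that got behind with schoolwork.
behind_pct = [
    round(100 * behind / total)
    for behind, total in zip(observed["Got behind with schoolwork"], column_counts)
]

for name, count, pct, behind in zip(groups, column_counts, column_pct, behind_pct):
    print(f"{name}: n={count}, column %={pct}, got behind={behind}%")
```

If the share that got behind were about the same in every column, knowing a student's drinking group would not help predict whether they got behind; here the shares differ noticeably from column to column.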
Computing Expected Counts

An expected count is the count a cell would have if the overall row proportion applied equally to every column; that is, if there were no correlation between the two variables in the cross-tab.

Correlation in the Sample vs. Population

Even though comparing the observed and expected counts may show a correlation in the sample data, how does one show there is correlation in the population? Perform a hypothesis test.

Step 1: State the Null Hypothesis

H0: No correlation in the population data.

Remember: H0 is always the hypothesis of "no difference". In this instance "no difference" means no difference between expected and observed counts, or no correlation.

Step 2: Collect Sample Data

Use the Wechsler data. Is the sample data statistically significant? That is, assuming the null hypothesis is correct (i.e. there is no correlation between the two variables), how likely is it to get the sample data that we observed?

Step 3: Compute the Appropriate Test Statistic

In testing a claim about one sample proportion, the test statistic is the z-test statistic. Since we have several sample proportions we'll need a new test statistic: the chi-squared test statistic.

To compute the test statistic: Calculate the expected counts for each cell. We'll actually calculate the chi-squared test statistic tomorrow.

Step 4: Find the p-value

A p-value means just what it did earlier. Assuming the null hypothesis is true (that is, there is no correlation between the two variables), it is the probability of getting sample data as extreme or more extreme than we observed.

Or more simply: if there is no correlation (association) between the two variables, could the differences in the observed and expected cell counts be attributed to chance?

With this test we'll use the TI-83 to find the p-value.

Step 5: Reject or retain the null hypothesis

Step 6: State Your Conclusions

There is statistical evidence to suggest that there is (is not) correlation between the two variables…..
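The steps above can be sketched in Python for the Wechsler schoolwork table. This is a hedged illustration rather than the course's method, which uses the TI-83. It assumes the standard Pearson chi-squared statistic (the sum of (observed - expected)^2 / expected over all cells), which the slides defer to a later session, and it uses the fact that for exactly 2 degrees of freedom the chi-squared upper-tail probability is exp(-x/2). All variable names are illustrative.

```python
import math

# Observed counts from the Wechsler cross-tab.
# Columns: Nonbinge, Infrequent Binge, Frequent Binge.
observed = [
    [6480, 3231, 1777],  # kept up with schoolwork
    [414, 859, 1514],    # got behind with schoolwork
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under "no correlation":
# the overall row proportion applied to each column total.
expected = [
    [row_total / grand_total * col_total for col_total in col_totals]
    for row_total in row_totals
]

# Pearson chi-squared statistic (assumed form; not given in the slides).
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)

# The closed form below is valid only when df == 2.
if df != 2:
    raise ValueError("exp(-x/2) shortcut applies only to 2 degrees of freedom")
p_value = math.exp(-chi_squared / 2)

print("expected counts:", [[round(e, 1) for e in row] for row in expected])
print("chi-squared:", round(chi_squared, 1), "df:", df, "p-value:", p_value)
```

For this table the statistic is very large, so the p-value is far below any conventional significance level and the null hypothesis of no correlation would be rejected. For tables of other sizes, the tail probability would need a general chi-squared distribution routine instead of the 2-degrees-of-freedom shortcut.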