Download Chapter 26: Comparing Counts - Masin

Chapter 26: Comparing Counts
AP Statistics

Comparing Counts
• In this chapter, we will be performing hypothesis tests on categorical data.
• In previous chapters, the data have been quantitative (Normal distributions and Student's t-distributions).

Chi-Square Tests
We will be looking at three different types of hypothesis tests that deal with categorical data.
1. Chi-square Test for Goodness of Fit
2. Chi-square Test for Homogeneity
3. Chi-square Test for Independence/Association

χ² Test for Goodness of Fit
Compares the observed sample distribution with the population distribution. Generally, we are testing how well the observations "fit" what we expect.
df = n − 1, where n is the number of categories

χ² Test for Homogeneity
Compares the distribution of one categorical variable across two or more groups. The data are arranged in a two-way table, also called a contingency table.
df = (r − 1)(c − 1), where r is the number of rows and c is the number of columns

χ² Test for Independence/Association
Tests the null hypothesis that there is no relationship between two categorical variables from a simple random sample, with each individual classified according to both of the categorical variables.
df = (r − 1)(c − 1), where r is the number of rows and c is the number of columns of the two-way table

Chi-Square Model
The Chi-Square Model is a family of distributions, like the Normal and Student's t models. Its shape is determined by one parameter, the degrees of freedom (like the t-model).

Chi-square Model
• With only a few degrees of freedom, the model is strongly skewed right.
• As the degrees of freedom increase, the model gets closer to being symmetric, but will never get there.
• Will ALWAYS have a longer tail on the right.
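The degrees-of-freedom rules and the "skewed right, but less so as df grows" behavior can be checked with a short Python sketch. The function names here are hypothetical helpers, not part of the chapter. The skewness formula sqrt(8/df) is a standard property of the chi-square distribution and is not stated in the slides. Both two-way-table tests (homogeneity and independence) are assumed to use (r − 1)(c − 1).

```python
import math


def gof_df(num_categories: int) -> int:
    """Degrees of freedom for a goodness-of-fit test: n - 1."""
    return num_categories - 1


def two_way_df(rows: int, cols: int) -> int:
    """Degrees of freedom for a two-way table: (r - 1)(c - 1)."""
    return (rows - 1) * (cols - 1)


def chi_square_skewness(df: int) -> float:
    """Skewness of the chi-square model, sqrt(8 / df) (standard result)."""
    return math.sqrt(8 / df)


if __name__ == "__main__":
    print("goodness of fit, 4 categories: df =", gof_df(4))
    print("two-way table, 3 rows x 4 columns: df =", two_way_df(3, 4))
    # Skewness shrinks as df grows but stays positive, so the right
    # tail is always the longer one.
    for df in (1, 5, 11, 100):
        print(f"df = {df:3d}  skewness = {chi_square_skewness(df):.3f}")
```

The skewness gets smaller with every increase in df but never reaches zero, which matches the claim that the model approaches symmetry without ever getting there.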
Chi-square Model
• The mode is at df − 2 (for df ≥ 2)
• The mean is at df
• Always "starts" at zero
• Only takes on positive values
• Only used for tests (no confidence intervals here)

Chi-square statistic
• Functions just like a z-score or a t-score
• You will calculate a P-value based on the chi-square statistic
• Will always represent the value in a one-tailed test

χ² = Σ (observed − expected)² / expected

Chi-square model/statistic
[Figure: chi-square models for df = 5 and df = 11, with the χ² statistic and P-value marked]

Hypotheses
• Null: Usually written in words
  – H0: Ages are uniformly distributed in the school
  – HA: Ages are not uniformly distributed in the school
• Even though the alternative appears to be two-tailed, chi-square tests are always one-sided
  – Only testing if the statistic is "too large".
• Alternative has no direction. All we know is that it doesn't fit. Hence, the "goodness of fit".

Assumptions/Conditions
• Counted Data Assumption: make sure all data are "counts", not percents or proportions.
• Independence Assumption: are the individuals counted in the cells sampled independently from some population?
  – Randomization Condition
• Sample Size Assumption
  – Expected Cell Frequency Condition: the expected count is at least 5 in each cell

Logic
• We assume the probability (expected) model is correct. Our test will assess whether the observed results (statistics) are consistent with that model.
  – Ask: Are the differences between the observed and hypothesized values just natural sampling variability, or is it something else (a significant difference)?
• To assess our model, we compute the chi-square statistic and find the corresponding P-value from the chi-square model (with the appropriate degrees of freedom).
• We interpret the P-value: if it is greater than our α, we fail to reject the null; if it is less than our α, we reject the null in favor of the alternative.

Goodness of Fit Example
The other day I purchased a 40-pound bag of mixed nuts for a math department party.
The bag said that the mix consisted of, by weight, 40% cashews, 15% Brazil nuts, 20% almonds and 25% peanuts. When the department came over, we randomly picked out 40 nuts to test the company's claim. In our sample, we found 10 cashews, 8 Brazil nuts, 10 almonds and 12 peanuts. Based on this sample, do you think the company is being misleading?
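One way to work this example is sketched below in Python, using only the standard library. Several things are assumptions rather than part of the slides: the four listed counts total 40 nuts, so n = 40 is used; the by-weight percentages are treated as proportions of nut counts; α = 0.05 is chosen for illustration; and `chi_square_sf` is a hypothetical helper that computes the upper-tail area for integer degrees of freedom.

```python
import math


def chi_square_sf(x: float, df: int) -> float:
    """Upper-tail area P(X >= x) of a chi-square model with integer df.

    Starts from the closed forms for df = 1 (erfc) and df = 2 (exp) and
    steps up two degrees of freedom at a time with the incomplete-gamma
    recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / gamma(a + 1).
    """
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    while a < df / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


# Claimed mix from the bag and the counts found in the sample.
claimed = {"cashews": 0.40, "Brazil nuts": 0.15, "almonds": 0.20, "peanuts": 0.25}
observed = {"cashews": 10, "Brazil nuts": 8, "almonds": 10, "peanuts": 12}

n = sum(observed.values())  # total nuts counted
expected = {nut: n * share for nut, share in claimed.items()}

# Expected Cell Frequency Condition: every expected count at least 5.
condition_met = all(count >= 5 for count in expected.values())

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((observed[nut] - expected[nut]) ** 2 / expected[nut] for nut in claimed)
df = len(claimed) - 1  # number of categories minus 1
p_value = chi_square_sf(chi_sq, df)

alpha = 0.05  # assumed significance level, not given in the slides
decision = "reject H0" if p_value < alpha else "fail to reject H0"

if __name__ == "__main__":
    print("expected counts:", expected)
    print("expected counts all at least 5:", condition_met)
    print(f"chi-square = {chi_sq:.3f}, df = {df}, P-value = {p_value:.3f}")
    print("decision at alpha = 0.05:", decision)
```

Under these assumptions the expected counts are 16, 6, 8 and 10, the statistic comes out to about 3.82 with 3 degrees of freedom, and the P-value is roughly 0.28. That is well above 0.05, so this sample would not give convincing evidence that the company's stated mix is wrong.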
