The Practice of Statistics
Third Edition
Chapter 14: Inference for Distributions of Categorical Variables: Chi-Square Procedures
Copyright © 2008 by W. H. Freeman & Company

Inference for Two-Way Tables
• We have learned how to compare proportions between two groups.
– Two-sample z-test.
• What if we want to compare more than two groups?
• We can use a two-way (rows & columns) table.

Does Background Music Influence Wine Purchases?
• Consider the following table:

               No Music   French Music   Italian Music   Total
French Wine       30          39              30           99
Italian Wine      11           1              19           31
Other Wine        43          35              35          113
Total             84          75              84          243

Conditional Probability (%)

                French Wine       Italian Wine      Other Wine         Total
No Music        35.7% (30/84)     13.1% (11/84)     51.2% (43/84)      100% (84/84)
French Music    52.0% (39/75)      1.3% (1/75)      46.7% (35/75)      100% (75/75)
Italian Music   35.7% (30/84)     22.6% (19/84)     41.7% (35/84)      100% (84/84)
Total           40.7% (99/243)    12.8% (31/243)    46.5% (113/243)    100% (243/243)

[Bar chart: Comparison of percents of different types of wine sold for different music conditions. The bar labels repeat the percents in the table above.]

Here are the percents of different types of wine sold for different music conditions. Each percent is a share of that wine type's total sales.

               No Music          French Music      Italian Music
French Wine    30/99 = 30.3%     39/99 = 39.4%     30/99 = 30.3%
Italian Wine   11/31 = 35.5%      1/31 = 3.2%      19/31 = 61.3%
Other Wine     43/113 = 38%      35/113 = 31%      35/113 = 31%

Conclusions
• There appears to be an association between music played and the type of wine that customers buy.
• Sales of Italian wine are low when French music is playing, but higher when Italian or no music is playing.
• More French wine is sold when French music is playing.

The Problems of Multiple Comparisons
• We would expect music would influence sales, so music type is the explanatory variable (x) and type of wine purchased is the response variable (y).
• We will compare the column percents that give the conditional distributions for each type of music played.

How to Compare
• We could do 3 chi-square goodness of fit procedures.
• H0: The distribution of wine types for no music is the same as the distribution of wine types for French music.
• H0: The distribution of wine types for no music is the same as the distribution of wine types for Italian music.
• H0: The distribution of wine types for French music is the same as the distribution of wine types for Italian music.

Weaknesses
• We get three results.
• We can’t safely compare many parameters by doing tests or confidence intervals for two parameters at a time.
• This is the problem of MULTIPLE COMPARISONS.

Dealing with Multiple Comparisons
• Two parts
– An overall test to see if there is good evidence of differences among the parameters that we want to compare.
– A detailed follow-up analysis to decide which of the parameters differ and to estimate how large the differences are.
• The overall test is a chi-square test, but it will be used for comparing several population proportions.

Two-Way Tables
• The tables we have been using are 3x3 tables because there are three types of wine (rows) and three types of music (columns).
• The explanatory variable (x) is the type of music.
• The response variable (y) is the type of wine purchased.

Hypothesis
• Each column represents a sample from one music condition, and each row a type of wine.
• These are separate and independent random samples from the column populations.
– The columns represent the populations.
– The rows represent the response variable.
• H0: The distribution of the response variable (type of wine purchased) is the same in all column populations.

Music and Wine
• We have 3 populations
– Bottles of wine sold with no music
– Bottles of wine sold with French music playing
– Bottles of wine sold with Italian music playing.
• We have three independent samples of 84, 75, and 84 bottles.
• H0: The proportions of each wine sold are the same in all 3 populations.

The null hypothesis is that the distribution of wine selected is the same for all three populations of music types.
The alternative is that the distributions of wine types are not all the same.

If we have n independent trials and the probability of success on each trial is p, we expect np successes. If we draw an SRS of n individuals from a population in which the proportion of successes is p, we expect np successes in the sample.

Two-Way Table

               No Music   French Music   Italian Music   Total
French Wine       30          39              30           99
Italian Wine      11           1              19           31
Other Wine        43          35              35          113
Total             84          75              84          243

Finding the Expected Count of Row 1 and Column 1
• The proportion of no music among all 243 subjects: (Column 1 total)/(Table total)
• 84/243
– Think of this as p, the overall proportion of no music.
• If H0 is true, we would expect this same proportion of no music in all three groups.
• So the expected count of no music among the 99 subjects who ordered French wine: np = (99)(84/243) = 34.222
• From the definition: (row total)(column total)/(table total) = (99)(84)/243

Expected Counts for Music and Wine

                French Wine   Italian Wine   Other Wine   Total
No Music          34.222        10.716         39.062      84.000
French Music      30.556         9.568         34.877      75.001
Italian Music     34.222        10.716         39.062      84.000
Total             99.000        31.00         113.001     243.001

Assignment
• Construct the Expected Counts for Music and Wine Table.
• Exercises 14.3, 14.11
• Read pages 858 – 865
• Exam on Wednesday, April 14th
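
One way to check a hand-built expected counts table is to apply the rule (row total)(column total)/(table total) to every cell by machine. The sketch below is only an illustration in plain Python: the names `observed`, `music`, and `expected_counts` are made up for this example and do not come from the textbook.

```python
# Observed counts from the music and wine two-way table.
# Rows are wine types; columns follow the order in `music`.
music = ["No Music", "French Music", "Italian Music"]
observed = {
    "French Wine": [30, 39, 30],
    "Italian Wine": [11, 1, 19],
    "Other Wine": [43, 35, 35],
}


def expected_counts(table):
    """Expected count per cell: (row total * column total) / table total."""
    row_totals = {wine: sum(counts) for wine, counts in table.items()}
    col_totals = [sum(col) for col in zip(*table.values())]
    table_total = sum(row_totals.values())
    return {
        wine: [row_totals[wine] * col / table_total for col in col_totals]
        for wine in table
    }


expected = expected_counts(observed)

# Print one line per wine type, rounded to three decimals like the slides.
for wine, counts in expected.items():
    cells = ", ".join(f"{m}: {c:.3f}" for m, c in zip(music, counts))
    print(f"{wine} -> {cells}")
```

The French Wine / No Music entry is (99)(84)/243, which rounds to the 34.222 worked out above. Unrounded row and column sums reproduce the observed totals exactly, so the 75.001 and 113.001 in the slide table are only rounding effects.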