Name ____________  Period ______  Date __________  Number ______

Chi Square Test
m&m Statistical Analysis

Have you ever wondered why the package of m&m’s you just bought never seems to have enough of your favorite color? Or why you always seem to get the package of mostly brown m&m’s? What’s going on at the Mars Company? Is the number of the different colors of m&m’s in a package really different from one package to the next, or does the Mars Company do something to ensure that each package gets the correct number of each color of m&m? I’ll bet you’ve lost a lot of sleep worrying about this. Do not fear: today you will learn a statistical test that allows you to answer these important questions.

This test is the Chi Square statistical test. It measures the difference between observed and expected data. If the difference between observed (collected) data and expected data is large, a large chi square value results, while a small difference results in a small chi square value. The formula for Chi Square is

    Χ² = Σ [ (O - E)² / E ]

O represents the observed data, E represents the expected data, and Σ represents the sum over all categories (here, the six colors).

The Mars Company web page states that each package of m&m’s should have specific percentages of each color, representative of the amount of each color the factory puts out.

m&m’s Color:
    Brown   13%
    Green   16%
    Orange  20%
    Blue    24%
    Red     13%
    Yellow  14%

Peanut m&m’s Color:
    Brown   20%
    Green   10%
    Orange  10%
    Blue    20%
    Red     20%
    Yellow  20%

One way that we could determine if the Mars Co. is true to its word is to sample a package of m&m’s and do a chi squared test on the data. This will allow us to determine whether any differences between our observed (counted) data and our expected data are simply due to chance sampling error or to some other reason. We will begin with the null hypothesis, which states that any deviation between observed and expected data is due to random chance alone.
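As a rough illustration of the formula, the Python sketch below computes Χ² for a single bag of plain m&m’s. The observed counts are made up for the example (they are not data from this lab), and the function name chi_square is just a label chosen here. The percentages are the published ones listed above.

```python
# Published color percentages for plain m&m's, written as fractions.
EXPECTED_FRACTIONS = {
    "Brown": 0.13, "Green": 0.16, "Orange": 0.20,
    "Blue": 0.24, "Red": 0.13, "Yellow": 0.14,
}

# Hypothetical counts for one bag (illustration only, not lab data).
observed = {
    "Brown": 8, "Green": 7, "Orange": 11,
    "Blue": 10, "Red": 6, "Yellow": 8,
}


def chi_square(observed, fractions):
    """Sum (O - E)^2 / E over every color category."""
    total = sum(observed.values())
    chi_sq = 0.0
    for color, o in observed.items():
        e = total * fractions[color]  # expected count for this color
        chi_sq += (o - e) ** 2 / e
    return chi_sq


print(round(chi_square(observed, EXPECTED_FRACTIONS), 2))
```

Note that the expected counts do not have to be whole numbers, even though the observed counts always are.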
We will accept this null hypothesis as long as the deviation is not too great. This lab will allow you to learn how to run a chi square analysis and when to accept or reject the null hypothesis.

What would the null hypothesis be with regard to m&m’s?

To test this hypothesis we will need to collect data and calculate the Χ² statistic.

Procedure:
1. Wash your hands, as you will be handling food that you may want to munch on later.
2. Separate the m&m’s on your team’s plate by color and count the number of each color of m&m you have.
3. Record your observed counts under each category on the master candy list on the chalkboard, and in your individual data table.
4. Sum the observed counts for each color category.
5. Determine the total number of m&m’s in your group’s data set.
6. Calculate the expected number of m&m’s in each color category using the information published by the Mars Company.
7. Record the total expected counts in the individual data table.
8. Use this data to determine the probability that the difference between observed and expected values occurred by chance.

To do this, statisticians have constructed tables of chi-square values with their associated probabilities. The probabilities also depend on the “degrees of freedom,” which is always one less than the number of classes in the experiment (phenotypic classes in a genetics experiment; color categories here). Statisticians have generally agreed upon a probability value of 0.05 as the lowest acceptable value derived from the Χ² test. This number indicates that if the experiment were repeated many times, the deviations from expected due to chance alone would be as large as or larger than those observed about 5% of the time. Probabilities equal to or greater than 0.05 are considered to support the hypothesis, while probabilities lower than 0.05 do not support the hypothesis. If the value is unacceptable, the experimental design (including the method of data collection), the hypothesis, or both need to be reexamined.
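The table lookup in step 8 can be sketched in Python as follows. This is only an illustration: the critical values are the 5-degrees-of-freedom entries (six colors minus one) from the distribution table used in this lab, the two Χ² values tried at the end are hypothetical, and the function names are invented for the example.

```python
# Probability columns of the distribution table, left to right.
PROBABILITIES = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]

# Table entries for 5 degrees of freedom, in the same column order.
CRITICAL_DF5 = [1.14, 1.61, 2.34, 3.00, 4.35, 6.06, 7.29, 9.24, 11.07, 15.09, 20.52]


def p_value_bracket(chi_sq, critical=CRITICAL_DF5, probabilities=PROBABILITIES):
    """Return (p_low, p_high), the two table probabilities that bracket p.

    None marks an open end: p_high is None when p is above 0.95,
    and p_low is None when p is below 0.001.
    """
    p_high = None
    for p, crit in zip(probabilities, critical):
        if chi_sq <= crit:
            return (p, p_high)
        p_high = p
    return (None, p_high)


def supports_null(bracket):
    """Apply the 0.05 rule: p of at least 0.05 supports the null hypothesis."""
    p_low, _ = bracket
    return p_low is not None and p_low >= 0.05


for chi_sq in (5.0, 12.0):  # hypothetical chi-square values
    bracket = p_value_bracket(chi_sq)
    verdict = "accept" if supports_null(bracket) else "reject"
    print(chi_sq, bracket, verdict)
```

A printed table only gives a range for p, which is why the function returns a pair of probabilities instead of a single number.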
For the Chi square tests you will conduct, use the following distribution table.

CHI-SQUARE DISTRIBUTION TABLE

Accept hypothesis: p of 0.05 or greater. Reject hypothesis: p less than 0.05.

Degrees of                                Probability (p)
Freedom     0.95   0.90   0.80   0.70   0.50   0.30   0.20   0.10   0.05   0.01   0.001
 1          0.004  0.02   0.06   0.15   0.46   1.07   1.64   2.71   3.84   6.64   10.83
 2          0.10   0.21   0.45   0.71   1.39   2.41   3.22   4.60   5.99   9.21   13.82
 3          0.35   0.58   1.01   1.42   2.37   3.66   4.64   6.25   7.82   11.34  16.27
 4          0.71   1.06   1.65   2.20   3.36   4.88   5.99   7.78   9.49   13.28  18.47
 5          1.14   1.61   2.34   3.00   4.35   6.06   7.29   9.24   11.07  15.09  20.52
 6          1.63   2.20   3.07   3.83   5.35   7.23   8.56   10.64  12.59  16.81  22.46
 7          2.17   2.83   3.82   4.67   6.35   8.38   9.80   12.02  14.07  18.48  24.32
 8          2.73   3.49   4.59   5.53   7.34   9.52   11.03  13.36  15.51  20.09  26.12
 9          3.32   4.17   5.38   6.39   8.34   10.66  12.24  14.68  16.92  21.67  27.88
10          3.94   4.86   6.18   7.27   9.34   11.78  13.44  15.99  18.31  23.21  29.59

Individual Data Table

Colors    Observed (O)   Expected (E)   O - E   (O - E)²   (O - E)² / E
Brown
Green
Orange
Blue
Red
Yellow
Totals                                                     Χ² =

1. This sentence represents a summary of your results.
With ____ degrees of freedom I have a Χ² value of ______, which has a p value between ______ and ______. I therefore __________ the null hypothesis.

2. Use this data to explain possible reasons for accepting/rejecting the null hypothesis. Recall in your explanation that you only have a single bag of m&m’s, which is essentially a random sample of what is being pumped out of the m&m factory.

On the Smartboard, consolidate your individual data into data for the whole class, and conduct another chi square test.

Class Data Table

Colors    Observed (O)   Expected (E)   O - E   (O - E)²   (O - E)² / E
Brown
Green
Orange
Blue
Red
Yellow
Totals                                                     Χ² =

Use this data to answer the following questions.

3. This sentence represents a summary of your results.
With ____ degrees of freedom I have a Χ² value of ______, which has a p value between ______ and ______. I therefore __________ the null hypothesis.

4. After completing two chi squared tests, do you believe the individual bags of m&m’s have an accurate representation of colors as purported by the Mars Co.
and what they are producing in their factory? Explain how your chi squared data might help support your claim.

Another group of high school students accumulated the following data for peanut m&m’s. Use the peanut m&m’s percentages listed in the introduction to calculate the expected counts.

Class Data Table for Peanut m&m’s

Colors    Observed (O)   Expected (E)   O - E   (O - E)²   (O - E)² / E
Brown     1520
Green      774
Orange     815
Blue      1620
Red       1603
Yellow    1550
Totals                                                     Χ² =

5. This sentence represents a summary of your results.
With ____ degrees of freedom I have a Χ² value of ______, which has a p value between ______ and ______. I therefore __________ the null hypothesis.
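One way to check a hand-filled table is a short script that builds the same columns. The Python sketch below does this for the peanut m&m’s class data above, using the published peanut percentages. The function name chi_square_rows and the print layout are choices made for this example only.

```python
# Published color percentages for peanut m&m's, written as fractions.
PEANUT_FRACTIONS = {
    "Brown": 0.20, "Green": 0.10, "Orange": 0.10,
    "Blue": 0.20, "Red": 0.20, "Yellow": 0.20,
}

# Observed class counts from the peanut m&m's data table.
observed = {
    "Brown": 1520, "Green": 774, "Orange": 815,
    "Blue": 1620, "Red": 1603, "Yellow": 1550,
}


def chi_square_rows(observed, fractions):
    """Build one row per color: (color, O, E, O - E, (O - E)^2, (O - E)^2 / E)."""
    total = sum(observed.values())
    rows = []
    for color, o in observed.items():
        e = total * fractions[color]  # expected count for this color
        diff = o - e
        rows.append((color, o, e, diff, diff ** 2, diff ** 2 / e))
    return rows


rows = chi_square_rows(observed, PEANUT_FRACTIONS)
chi_sq = sum(row[5] for row in rows)   # total of the last column
degrees_of_freedom = len(rows) - 1     # one less than the number of colors

for color, o, e, diff, diff_sq, term in rows:
    print(f"{color:<8}{o:>6}{e:>10.1f}{diff:>9.1f}{diff_sq:>11.2f}{term:>9.3f}")
print(f"chi-square = {chi_sq:.2f} with {degrees_of_freedom} degrees of freedom")
```

The resulting Χ² value is then looked up in the distribution table row for the matching degrees of freedom to find the p value range.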