Mathematics 244 Studio Lab 8: ANALYSIS OF VARIANCE

Please answer questions in the answer template (Lab8ans.doc).

Introduction

The analysis of variance (ANOVA) procedure enables us to perform a test of the research hypothesis, Ha, of whether there are any differences in the means of several populations, provided that certain assumptions hold. Using data obtained by taking simple random samples from each population, the test statistic is based on a comparison of the “average” variation between samples with the “average” variation within samples. The larger the variation between samples relative to the variation within, the greater the evidence against the null hypothesis, H0, that the population means are equal.

As a consequence of this approach to testing the null and alternative hypotheses above, for a given difference between population means (Ha is true), the greater the variation within the populations, the less likely we are to detect this difference (that is, correctly reject H0).

We will explore this idea further in the lab by comparing the sodium levels of three different breakfast cereals: “Froot Loops” (manufactured by Kellogg’s), “Fruity Pebbles” (manufactured by Post), and “Trix” (manufactured by General Mills). Based on the nutrition information labels, for 1-ounce (approximately 30-g) servings, the sugar content per serving of these three cereals is about the same. However, the per-serving sodium content is different: 125 mg for “Froot Loops”, 135 mg for “Fruity Pebbles”, and 140 mg for “Trix”.¹

Now it will not be the case that every single serving of these cereals will contain exactly these stated amounts; there will be variation in the sodium content of single servings of any cereal. This is the variation within the population of all possible single servings of each cereal produced by the manufacturing process.
To get an idea of the magnitude of this variation for all three cereals, we consulted a food scientist, who indicated that the standard deviation of sodium content for each is approximately 12 mg per serving.

¹ 1993 figures. Data obtained from the StatLib archive of the American Statistical Association at http://lib.stat.cmu.edu/datasets/1993.expo/cereal .

Part I: The Mechanics Behind ANOVA

Using the nominal values of the manufacturers’ stated sodium contents as population means and the standard deviation of 12 mg, we created normal populations of the sodium content in the servings for each of the three cereals. Using these populations, we will model the sampling of servings from the three manufacturers’ processes. We will draw random samples of size 20 from each cereal type and perform an ANOVA to test the null hypothesis that the mean per-serving sodium contents are the same for all three cereals vs. the alternative hypothesis that some of the mean per-serving sodium contents for the three cereals differ.

i) What assumptions must be met in order to perform an ANOVA? (You may wish to refer to the “Procedures” handout.)

ii) Specify the hypotheses to be tested.
H0:
Ha:

iii) Is H0 or Ha true here?

Even though we know which of the two hypotheses is true, we will nevertheless go through the process of taking independent samples of cereal servings using the three populations of poker chips to demonstrate the necessary sampling protocol for an analysis of variance, and also to determine if our test makes the correct decision. We will sample n1, n2, and n3 poker chips one-at-a-time with replacement from “Froot Loops”, “Fruity Pebbles”, and “Trix”, respectively.

iv) In a blank MINITAB worksheet, label columns C1-C3 with the cereal names. Type all of the collected sodium-content observations into the appropriate columns.

v) What is the number of observations in each sample?
n1 =     n2 =     n3 =
What is the total number of observations?
N =

Now we will test the hypotheses, using an ANOVA, at the 0.05 level of significance.

vii) Calculate the numbers of degrees of freedom for this test. Be sure to indicate your calculations. Determine the critical value, Fα, for this test. Use α = 0.05.
df = ( , )     Fα =

viii) Perform an ANOVA to test the hypotheses as follows. MINITAB has an “unstacked” one-way analysis-of-variance procedure which also produces simultaneous boxplots of the data.

TYPING COMMANDS
MTB > aovoneway C1-C3 ; (may be abbreviated “aovo”)
SUBS> gbox .

USING THE MOUSE
Stat > ANOVA > Oneway (Unstacked)... (not just “Oneway”)
Type C1-C3 in the box for the Responses.
Click on Graphs...
Select Boxplots of data.
Click OK. Click OK.

Copy both the ANOVA output (from the session window) and the boxplots here.

ix) Compare the value of the test statistic, F0, to Fα. What decision do you make?

x) Use the P-value from the ANOVA output to conduct the test and write down your decision.

xi) Does the information provided by the boxplots confirm your decision? Explain.

xii) Write a concluding sentence to interpret the results of the test in the context of the mean sodium content of these three cereals.

xiii) Is the test decision correct?

xiv) Note that the MINITAB output includes a graphical display of confidence intervals for each of the three population means. Although these confidence intervals do not provide a complete picture of the differences in means, they provide a rough idea of when to reject H0, namely, when two or more intervals don’t overlap. Based on this notion, suggest why the confidence intervals support the test results.
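For readers who want to see the arithmetic behind the MINITAB output, the comparison of between-sample and within-sample variation in Part I can be sketched in Python. This is an illustration only and not part of the lab procedure: the function names and the seed are hypothetical, and the samples are drawn directly from normal distributions rather than from poker chips. The critical value and P-value use a closed form of the F distribution that holds only when the numerator degrees of freedom equal 2, that is, for exactly three groups.

```python
import random
import statistics


def one_way_anova(samples):
    """Return (F0, df_between, df_within) for a list of unstacked samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [statistics.fmean(s) for s in samples]
    # Between-samples sum of squares: sample means vs. the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-samples sum of squares: observations vs. their own sample mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between, df_within = k - 1, n_total - k
    f0 = (ss_between / df_between) / (ss_within / df_within)
    return f0, df_between, df_within


def f_critical_three_groups(alpha, df_within):
    """Critical value of F(2, df_within); closed form valid only for numerator df = 2."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


def p_value_three_groups(f0, df_within):
    """Upper-tail probability of F(2, df_within); again only for numerator df = 2."""
    return (1 + 2 * f0 / df_within) ** (-df_within / 2)


rng = random.Random(244)  # arbitrary seed, chosen only for repeatability
population_means = (125, 135, 140)  # Froot Loops, Fruity Pebbles, Trix (mg per serving)
# Assumption: normal populations with standard deviation 12 mg, samples of size 20.
samples = [[rng.gauss(mu, 12) for _ in range(20)] for mu in population_means]

f0, df1, df2 = one_way_anova(samples)
f_alpha = f_critical_three_groups(0.05, df2)
print(f"df = ({df1}, {df2}), F0 = {f0:.2f}, F_alpha = {f_alpha:.2f}")
print(f"P-value = {p_value_three_groups(f0, df2):.4f}")
print("reject H0" if f0 > f_alpha else "do not reject H0")
```

With a different number of groups the closed form no longer applies, and the critical value would have to come from an F table or a statistics package.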
Part II: Effect of Within-Population Variation on the Power of the Test

In this part of the lab, we will sample from large populations of the sodium content of 5000 single servings of these three cereals in order to examine how the level of within-population variation influences our ability to detect a difference in population means when one exists.

Task A: Small Within-Population Variation

Using the manufacturers’ nominal per-serving sodium content values as the population means and the quoted standard deviation of 12 mg per serving as the population standard deviations, we created a normal population of the sodium content of 5000 servings for each of the three cereals. Columns C1-C3 of the MINITAB worksheet lab8dat.mtw contain these populations. We will take a random sample of 20 servings from each of the three populations. We want to know, based on our samples, whether we can detect any difference in the mean sodium content per serving of these three cereals.

i) Specify the hypotheses to be tested. Which is true?
H0:
Ha:

ii) Use MINITAB to select a sample of size 20 without replacement from each population, placing the samples in columns C4-C6, respectively. (Recall the SAMPLE command: sample 20 c1 c4, then repeat for the others.)

iii) Stack your three samples, storing the per-serving sodium levels in column C7 and the subscripts in column C8.

iv) Perform an ANOVA to test the hypotheses using the following procedure:

TYPING COMMANDS
MTB > oneway C7 C8

USING THE MOUSE
Stat > ANOVA > Oneway... (not “Oneway Unstacked”)
Type C7 for the Response variable and C8 for the Factor.
Click OK.

Copy the ANOVA output from the session window here.

v) What decision do you make? Use α = 0.05.

vi) Write a concluding sentence to interpret the results of the test in the context of the mean sodium content of these three cereals.

Task B: Large Within-Population Variation

What if the quoted standard deviation of 12 mg per serving were incorrect?
For example, what if the standard deviation of the sodium content per serving were 24 mg instead of 12? Using the manufacturers’ nominal per-serving sodium content values as population means and 24 mg per serving as the population standard deviations, we created a normal population of the sodium content of 5000 servings for each of the three cereals. Columns C11-C13 of the MINITAB worksheet lab8dat.mtw contain these populations.

i) Specify the hypotheses to be tested. Which is true?
H0:
Ha:

ii) Use MINITAB to select a sample of size 20 without replacement from each population, placing the samples in columns C14-C16, respectively.

iii) Stack your three samples, storing the per-serving sodium levels in column C17 and the subscripts in column C18.

iv) Perform an ANOVA to test the hypotheses.

v) What decision do you make? Use α = 0.05.

vi) Write a concluding sentence to interpret the results of the test in the context of the mean sodium content of these three cereals.

Report your results to the Lab Instructor, who will collect and display them.

vii) (a) For what proportion of the replications in the lab was the null hypothesis rejected this time? (b) What quantity does this proportion estimate?
(a)
(b)

Task C: Comparison of the Results from Tasks A and B

i) Plot the data for both sets of stacked samples obtained in Tasks A and B simultaneously, using the following procedure:

USING THE MOUSE
Graph > Plot...
Type C8 for the Y variable and C7 for the X variable for Graph 1.
Type C18 for the Y variable and C17 for the X variable for Graph 2.
Select Display > Symbol and For each > Group and type C8 for the Group variables. (Do this only once, not twice.)
Select Frame > Multiple Graphs...
Select Each graph on a separate page.
Select Same X and same Y.
Click OK. Click OK.

<Note: There is a page-break here for a reason. If your boxes on the next page cross a page boundary, you may not be able to print your lab.>

Copy the graphs on the left and fill in the corresponding values of si on the right. Identify the values of x̄1, x̄2, and x̄3 for each graph, writing them in by hand, once you have printed your lab. Draw a vertical line (by hand, once you have printed your lab) from each x̄i to its associated “dotplot.” Note that the MINITAB output from the ONEWAY command provides the sample means and standard deviations.

Graph 1: s3 =     s2 =     s1 =
Graph 2: s3 =     s2 =     s1 =

ii) By comparing the three “dotplots” obtained for the samples in Task A to those for the samples in Task B, explain how the spread in the plot affects your ability to detect differences in the means of the three populations.

iii) Compare the sample standard deviations from the samples taken in Task A with the sample standard deviations from the samples taken in Task B. Which group of values is larger? Is this surprising? Explain.

iv) What happens to the power of the test when the within-population variation increases? Explain what this implies in terms of how the within-population variation affects our ability to detect a difference between population means when one exists.
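The class-wide tally of rejections in Part II can also be imitated by simulation. The following Python sketch is a hedged illustration, not the lab's MINITAB procedure: it repeats the "sample 20 servings from each cereal, run the ANOVA" experiment many times for each standard deviation and reports the proportion of replications in which H0 is rejected. It assumes draws taken directly from normal distributions (the lab instead samples without replacement from stored populations of 5000 servings), and the function names, seed, and number of replications are hypothetical choices. The critical value uses a closed form of the F distribution that is valid only for three groups.

```python
import random
import statistics

MEANS = (125, 135, 140)  # nominal sodium contents from the lab, mg per serving
SAMPLE_SIZE = 20
ALPHA = 0.05


def f_statistic(samples):
    """One-way ANOVA F statistic: between-samples over within-samples mean square."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [statistics.fmean(s) for s in samples]
    ms_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means)) / (k - 1)
    ms_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s) / (n_total - k)
    return ms_between / ms_within


def rejection_proportion(sigma, replications=2000, seed=0):
    """Proportion of simulated replications in which H0 is rejected at level ALPHA."""
    rng = random.Random(seed)
    df_within = len(MEANS) * SAMPLE_SIZE - len(MEANS)
    # Closed-form critical value of F(2, df_within); valid only for three groups.
    f_alpha = (df_within / 2) * (ALPHA ** (-2 / df_within) - 1)
    rejections = 0
    for _ in range(replications):
        # Assumption: each sample is drawn from a normal distribution with
        # the stated mean and the given within-population standard deviation.
        samples = [[rng.gauss(mu, sigma) for _ in range(SAMPLE_SIZE)] for mu in MEANS]
        if f_statistic(samples) > f_alpha:
            rejections += 1
    return rejections / replications


for sigma in (12, 24):
    share = rejection_proportion(sigma)
    print(f"sigma = {sigma} mg: H0 rejected in {share:.1%} of replications")
```

Because the population means really do differ, each rejection is a correct decision, so the printed proportions play the same role as the class results collected by the Lab Instructor.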