MATH 117 - Probability and Statistics
S. Janke (Block 1 - 2008)

Data Set Worksheet

The data set is in a Minitab file called BFAT.MTW in the course folder. The data come from Canadian records of pure-bred dairy cows. The first column gives the butterfat percentages for the cows tested in the sample. In the second column, a 0 denotes Guernsey, a 1 denotes Holstein, and a 2 denotes Jersey. The third column gives a 0 if the cow is two years old (young) and a 1 if the cow is older than 5 years (mature). We are interested in whether breed or maturity makes a difference in the butterfat content of the milk.

Complete the following tasks and turn in your work by Thursday, Sept. 18, at 3:00 p.m. The tasks ask you to produce graphs, Minitab calculations, and your own observations. Copy and paste information from Minitab into a Word document for turning in. Clearly number each task in your write-up.

1. Produce a histogram and a dotplot for the butterfat percentages in the first column. Make one or two observations about the shape of the data. (Is it bell-shaped? Skewed? Unimodal? Any outliers?)

2. Produce a single graph displaying three boxplots, one for each breed. Make one or two observations comparing the three boxplots.

3. Produce a single graph displaying two boxplots, one for each age group. Again make some observations comparing the two boxplots.

4. Let Minitab calculate descriptive statistics for the butterfat column. Choose 1-sample t and let Minitab calculate a 95% confidence interval for the mean.

5. Unstack the butterfat column by breed (i.e., the subscripts for the unstack command are in column 2). Once the breeds are unstacked, produce descriptive statistics for each one. Compare the means, medians, standard deviations, and interquartile ranges across the breeds. Make two or three observations about your comparisons.

6. Produce histograms for each breed's butterfat percentages. Comment on the shape of each histogram.

7. Produce confidence intervals for each breed's mean butterfat percentage. Compare the confidence intervals. (Does one breed produce significantly more butterfat than another?)

8. Use the 2-sample t test to compare each pair of breeds to see if their means are significantly different. (There will be three tests to perform.) If the p-value reported for a test is less than 0.05, the means of the two samples are significantly different. Record and interpret the test results. (Does one breed have a significantly higher butterfat percentage than the others? Significantly lower?)

9. Go back to the complete butterfat data in column 1. Unstack it by age (i.e., the subscripts are in column 3). Once again calculate descriptive statistics and produce histograms for the unstacked data. Make one or two comparisons.

10. Use the 2-sample t test to compare the means of the two age groups. Report and interpret the result of the test.

11. The assumptions necessary to draw statistical inferences from these data are primarily these:
    a. The data form a random sample from the population of dairy cows. Comment on whether you think the sample is random. How do you think the data were collected?
    b. The sample data must come from a normal population in order to apply the t tests. From the histograms, do the data look approximately normal? Run normality tests in Minitab and interpret the results.

12. Give a one-paragraph summary of your findings in this analysis.
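
Tasks 8 and 10 rely on the 2-sample t test. The Python sketch below is not part of the assignment and is only meant to illustrate the arithmetic behind such a test. It assumes the unequal-variance (Welch) form of the test; whether your Minitab run uses that form depends on the options you select. The function name welch_t and the two short lists are hypothetical, and the numbers are made-up placeholders, not values from BFAT.MTW.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample: sample variance / n.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # Difference in sample means divided by its estimated standard error.
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up placeholder percentages, NOT taken from BFAT.MTW.
breed_x = [4.9, 5.1, 4.7, 5.3, 5.0]
breed_y = [3.6, 3.8, 3.5, 3.9, 3.7]

t_stat, df = welch_t(breed_x, breed_y)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A large absolute t statistic relative to its degrees of freedom corresponds to a small p-value. The sketch stops short of computing the p-value itself, which requires the t distribution; Minitab reports it directly.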