MATH 117 - Probability and Statistics
S. Janke (Block 1 - 2008)
Data Set Worksheet
The data set is in a MINITAB file called BFAT.MTW in the course folder. The data come
from Canadian records of pure-bred dairy cows. The first column gives the butterfat
percentages for cows tested in the sample. In the second column, a 0 indicates
Guernsey, a 1 denotes Holstein, and a 2 means Jersey. The third column gives a 0 if the
cow is two years old (young) and a 1 if the cow is more than 5 years old (mature). We
are interested in whether breed or maturity makes a difference in butterfat content of the
milk.
Complete the following tasks and turn in your work by Thursday, Sept. 18, at 3:00 p.m.
The tasks ask you to produce graphs, Minitab calculations, and your own observations.
Copy and paste information from MINITAB to a Word document for turning in. Clearly
number each task in your write-up.
1. Produce a histogram and dotplot for the butterfat percentages in the first column.
Make one or two observations about the shape of the data. (Is it bell-shaped? Skewed?
Unimodal? Any outliers?)
2. Produce a single graph displaying three boxplots, one for each breed.
Make one or two observations comparing the three boxplots.
3. Produce a graph with one boxplot for each of the two age groups. Again make some
observations comparing the two boxplots.
4. Let Minitab calculate descriptive statistics for the butterfat column. Choose 1-sample t
and let Minitab calculate a 95% confidence interval for the mean.
5. Unstack the butterfat column by breed (i.e., the subscripts for the unstack command are
in column 2). Once the breeds are unstacked, produce descriptive statistics for each one.
Compare means, medians, standard deviations, and interquartile ranges across the
breeds. Make two or three observations about your comparisons.
6. Produce histograms for each breed’s butterfat percentages. Comment on the shape
of each histogram.
7. Produce confidence intervals for each breed’s mean butterfat percentages. Compare
the confidence intervals. (Does one breed produce significantly more butterfat than
another?)
8. Use the 2-sample t test to compare each pair of breeds to see if their means are
significantly different. (There will be three tests to perform.) If the p-value reported for a
test is less than 0.05, the difference between the two means is statistically significant at the 5% level. Record
and interpret the test results. (Does one breed have a significantly higher butterfat
percentage than the others? Significantly lower?)
9. Go back to the complete butterfat data in column 1. Unstack it by age (i.e. subscripts
in column 3). Once again calculate descriptive statistics and produce histograms for the
unstacked data. Make one or two comparisons.
10. Use the 2-sample t test to compare the means of the two age groups. Report and
interpret the result of the test.
11. The main assumptions necessary to draw statistical inferences from these data are:
a. The data form a random sample from the population of dairy cows. Comment on
whether you think the sample is random. How do you think the data were collected?
b. The sample data must come from a normal population in order to apply the t-tests.
From the histograms, do the data look approximately normal? Run normality tests in
MINITAB and interpret the results.
12. Give a one-paragraph summary of your findings in this analysis.
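
The calculations behind the 1-sample t interval (tasks 4 and 7) and the 2-sample t test (tasks 8 and 10) can be sketched in Python as a cross-check on the Minitab output. This is a minimal sketch under stated assumptions: it uses Welch's form of the 2-sample test (no equal-variance assumption), all function names are hypothetical, the t distribution is handled by simple numerical integration rather than a statistics library, and the sample values are made-up placeholders, not values from BFAT.MTW.

```python
import math
from statistics import mean, stdev, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, 1 - P(-|t| < T < |t|), by Simpson's rule on [0, |t|].

    steps must be even. Accuracy is adequate for comparing against 0.05,
    not for reporting extremely small p-values.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # probability between -|t| and |t|
    return max(0.0, 1.0 - central)


def t_critical(df, conf=0.95):
    """Critical value t* with P(-t* < T < t*) = conf, found by bisection.

    Assumes t* lies below 100, which holds for conf <= 0.99 and df >= 1.
    """
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_sided_p(mid, df) > 1 - conf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_confidence_interval(x, conf=0.95):
    """1-sample t interval for the mean: xbar +/- t* * s / sqrt(n)."""
    n = len(x)
    half_width = t_critical(n - 1, conf) * stdev(x) / math.sqrt(n)
    return mean(x) - half_width, mean(x) + half_width


def welch_t_test(x, y):
    """Welch 2-sample t test; returns (t statistic, df, two-sided p-value)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df, two_sided_p(t, df)


# Placeholder values for illustration only (NOT taken from BFAT.MTW).
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

print("95% CI for mean of group_a:", mean_confidence_interval(group_a))
print("Welch t, df, p:", welch_t_test(group_a, group_b))
```

To use it on the real data, replace the placeholder lists with the unstacked butterfat columns for two breeds or two age groups. A p-value below 0.05 from welch_t_test corresponds to the decision rule in task 8. If Minitab is run with equal variances assumed, its degrees of freedom and p-value will differ slightly from this sketch.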