Homework #4 STAT651

1) Open up the DNA adduct level data. Note and write down the variable names.
- diet, proximal, and distal

2) Construct summary statistics for each of the two diets.

Descriptives: DNA adduct level in distal colon, by Diet Group

                                  Corn Oil Diet              Fish Oil Diet
                                  Statistic    Std. Error    Statistic    Std. Error
Mean                              29.5971      3.92816       16.7633      2.02564
95% CI for Mean, Lower Bound      21.1720                    12.4187
95% CI for Mean, Upper Bound      38.0222                    21.1079
5% Trimmed Mean                   28.6555                    16.2711
Median                            26.3683                    16.2445
Variance                          231.457                    61.548
Std. Deviation                    15.21371                   7.84528
Minimum                           11.63                      4.82
Maximum                           64.52                      37.57
Range                             52.89                      32.75
Interquartile Range               22.2349                    9.3444
Skewness                          .956         .580          1.068        .580
Kurtosis                          .638         1.121         2.717        1.121

a) Do you see any indication that there might be a massive outlier?
- Not at all. The corn oil group had a higher mean, median, standard deviation, and interquartile range than the fish oil group. This general consistency is something that one might not see if there were a massive outlier.

3) Now construct the boxplot for the two populations.

[Boxplot: DNA adduct level in distal colon by Diet Group (Corn Oil Diet, N = 15; Fish Oil Diet, N = 15); vertical axis from 0 to 70; one point in the plot is labeled 28.]

a) Compare the two populations in terms of their measures of central tendency, i.e., their sample medians.
- The sample median of the 'Corn Oil Diet' group is 26.37 and that of the 'Fish Oil Diet' group is 16.24. It appears that the fish oil enhanced diet group had less damage on average (smaller adduct levels) than the corn oil enhanced diet group.

b) Compare the two populations in terms of their measures of variability, i.e., the interquartile ranges (IQR).
- The IQR of the 'Corn Oil Diet' group is 22.23 and that of the 'Fish Oil Diet' group is 9.34. The 'Corn Oil Diet' group has larger variability than the 'Fish Oil Diet' group.

4) (Fish Oil diet) the population standard deviation σ = 8.
Use the empirical rule to narrow down with 95% confidence where the population mean for this diet group is.
- The general rule is that 95% of the population values are within 1.96 standard deviations = 1.96 * 8 = 15.68 of the mean, estimated here by the sample mean (16.76). Hence, about 95% of the population values should lie between 16.76 - 15.68 = 1.08 and 16.76 + 15.68 = 32.44.
- On the other hand, the chance is 95% that the population mean is within 1.96 standard errors of the sample mean.
- Since the population standard deviation is known, the standard error is σ / (square root of n) = 8 / (square root of 15) = 8 / 3.873 = 2.07, so 1.96 standard errors = 1.96 * 8 / (square root of 15) = 15.68 / 3.873 = 4.05.
- Hence, with 95% confidence, the population mean is between 16.76 - 4.05 = 12.71 and 16.76 + 4.05 = 20.81.
- By the way, the SPSS output gives the confidence interval for the population mean as 12.42 to 21.11, once one takes into account the fact that the population standard deviation is unknown. This interval is slightly longer because we have to estimate the population standard deviation by the sample standard deviation.

5) The population mean μ = 17. What percentage of the population of rats who are fed a fish-oil enhanced diet will have a DNA adduct level exceeding 33?
- Pr(X > 33) = Pr[ (X - 17)/8 > (33 - 17)/8 ] = Pr[ Z > 2 ] = 0.0228, i.e., about 2.28% of the population.

6) Let X stand for the number of red petals of bluebonnets that are near the highway. Suppose that these bluebonnets have a population mean μ = 2.8 and a population standard deviation σ = 2. Compute the following probabilities. Round the z-score to 2 digits. If the z-score is greater than 3.4, set it equal to 3.4. If the z-score is less than -3.4, set it equal to -3.4.
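The rounding and cutoff rule for z-scores can be sketched as a small helper. This is only an illustration: the function name `table_z` and the `cap` parameter are hypothetical, and the lower cutoff is assumed to be -3.4, as problem 7 states it.

```python
def table_z(x: float, mu: float, sigma: float, cap: float = 3.4) -> float:
    """Standardize x, round to 2 digits, and clamp to [-cap, cap].

    Hypothetical helper mirroring the homework's rule for reading a
    z-table that only covers -3.4 to 3.4.
    """
    z = round((x - mu) / sigma, 2)   # z = (x - mu) / sigma, 2 decimal places
    return max(-cap, min(cap, z))    # values beyond the table are set to +/- cap


if __name__ == "__main__":
    print(table_z(4.8, 2.8, 2))    # ordinary case, inside the table
    print(table_z(12.0, 2.8, 2))   # raw z-score above 3.4, clamped
    print(table_z(-5.0, 2.8, 2))   # raw z-score below -3.4, clamped
```

The clamp is applied after rounding, so a raw z-score just beyond the cutoff is still reported as exactly 3.4 or -3.4.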
- Pr(X > 4.8)
(Solution) Pr( X > 4.8 ) = Pr[ (X - μ) / σ > (4.8 - μ) / σ ] = Pr[ Z > ( 4.8 - 2.8 ) / 2 ] = Pr[ Z > 1.0 ] = 1 - Pr[ Z ≤ 1.0 ] = 1 - 0.8413 = 0.1587

- Pr(X < 4.8)
(Solution) Pr( X < 4.8 ) = Pr[ (X - μ) / σ < (4.8 - μ) / σ ] = Pr[ Z < ( 4.8 - 2.8 ) / 2 ] = Pr[ Z < 1.0 ] = 0.8413

- Pr(X < 2.8)
(Solution) Pr( X < 2.8 ) = Pr[ (X - μ) / σ < (2.8 - μ) / σ ] = Pr[ Z < ( 2.8 - 2.8 ) / 2 ] = Pr[ Z < 0.0 ] = 0.5

7) Let X stand for the number of red petals of bluebonnets that are far from the highway. Suppose that these bluebonnets have a population mean μ = 4.8 and a population standard deviation σ = 2. Compute the following probabilities. Round the z-score to 2 digits. If the z-score is greater than 3.4, set it equal to 3.4. If the z-score is less than -3.4, set it equal to -3.4.

- Pr(X > 4.8)
(Solution) Pr( X > 4.8 ) = Pr[ (X - μ) / σ > (4.8 - μ) / σ ] = Pr[ Z > ( 4.8 - 4.8 ) / 2 ] = Pr[ Z > 0.0 ] = 0.5

- Pr(X < 4.8)
(Solution) Pr( X < 4.8 ) = Pr[ (X - μ) / σ < (4.8 - μ) / σ ] = Pr[ Z < ( 4.8 - 4.8 ) / 2 ] = Pr[ Z < 0.0 ] = 0.5

- Pr(X < 2.8)
(Solution) Pr( X < 2.8 ) = Pr[ (X - μ) / σ < (2.8 - μ) / σ ] = Pr[ Z < ( 2.8 - 4.8 ) / 2 ] = Pr[ Z < -1.0 ] = 0.1587
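As a cross-check on the normal-distribution arithmetic in problems 4 through 7, the table lookups can be reproduced with Python's standard library. This is a minimal sketch, not part of the original SPSS workflow: the function names (`normal_cdf`, `prob_below`, `prob_above`, `z_margin`) are hypothetical, and X is assumed to be exactly normal with the stated mean and standard deviation.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, Pr(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_below(x: float, mu: float, sigma: float) -> float:
    """Pr(X < x) for X ~ Normal(mu, sigma), via z = (x - mu) / sigma."""
    return normal_cdf((x - mu) / sigma)


def prob_above(x: float, mu: float, sigma: float) -> float:
    """Pr(X > x) = 1 - Pr(X <= x)."""
    return 1.0 - prob_below(x, mu, sigma)


def z_margin(sigma: float, n: int, z_crit: float = 1.96) -> float:
    """Half-width of a z-based interval when sigma is known:
    z_crit * sigma / sqrt(n)."""
    return z_crit * sigma / math.sqrt(n)


if __name__ == "__main__":
    # Problem 4: known sigma = 8, n = 15, sample mean 16.76.
    margin = z_margin(8, 15)
    print("margin:", round(margin, 2))
    print("interval:", round(16.76 - margin, 2), round(16.76 + margin, 2))

    # Problem 5: mean 17, sd 8.
    print("Pr(X > 33):", round(prob_above(33, 17, 8), 4))

    # Problem 6: mean 2.8, sd 2 (near the highway).
    print("Pr(X > 4.8):", round(prob_above(4.8, 2.8, 2), 4))
    print("Pr(X < 4.8):", round(prob_below(4.8, 2.8, 2), 4))
    print("Pr(X < 2.8):", round(prob_below(2.8, 2.8, 2), 4))

    # Problem 7: mean 4.8, sd 2 (far from the highway).
    print("Pr(X > 4.8):", round(prob_above(4.8, 4.8, 2), 4))
    print("Pr(X < 2.8):", round(prob_below(2.8, 4.8, 2), 4))
```

Because every z-score in these problems is already a round number well inside the -3.4 to 3.4 range, the computed values should agree with a four-digit z-table.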