Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Partial SOLUTIONS 4.14 a b c M - the sample median p - the sample proportion y - the sample mean 4.15 a b c p1 - p2 -the difference between two sample proportions is a statistic The figure $1.22 is a statistic because it was found from a sample of 6000 service stations. The figure 50% is a parameter because it pertains to the population of all Americans 4.16 a 25, 8/√10 b 25, 8/√16 c 25, 8/√30 d 25, 8/√100 Only in parts a and b when the sample size is 30 or more can we be assured that the sampling distribution of y will be approximately normally distributed. 4.17 72 ± 3σ /√n = 66 to 78 4.18 y is approximately normal and Z = ( y - 100)/(16/√36) P( y > 105.6) = P(Z > 2.1) = .0178, so it is somewhat unusual. > 1-pnorm(105.6,100,16/sqrt(36)) [1] 0.01786442 4.19 Z = (1900 - 1800)/(400/√90) = 2.37 P( y > 1900) = P(Z > 2.37) = 1 - .9911 = .0089 > 1-pnorm(1900,1800,400/sqrt(90)) [1] 0.008853033 4.20 Z = (75 - 72)/(10/√50) = 2.12 P( y > 75) = P(Z > 2.12) = 1 - .9830 = .0170 > 1-pnorm(75,72,10/sqrt(50)) [1] 0.01694743 4.21 a Z = (34 - 31)/2 = 1.5, Z = (28 - 31)/2 = -1.5 probability = .9332 - .0668 = .8664 > pnorm(34,31,2)-pnorm(28,31,2) [1] 0.8663856 b Z = (34 - 31)/(2/√25) = 7.5, Z = (34 - 31)/(2/√25) = 7.5 probability = 1- - 0 = 1-, almost certain. > pnorm(34,31,2/sqrt(25))-pnorm(28,31,2/sqrt(25)) [1] 1 4.22 a 50/√64 = 6.25 b 50/√400 = 2.5 4.23 12.63 ± 1.96(5 /√236) = 12.02 to 13.23 > 12.63-qnorm(.975)*5/sqrt(263) [1] 12.02572 > 12.63+qnorm(.975)*5/sqrt(263) [1] 13.23428 4.24 Z = (99.5 - 99)/(2/√50) = 1.77, Z = (98.5 - 99)/(2/√50) = -1.77 Probability of a false alarm = 2(1 - .9616) = .0768. > pnorm(98.5,99,2/sqrt(50))+(1-pnorm(99.5,99,2/sqrt(50))) [1] 0.07709987 4.25 The sampling distribution appears to be normally distributed. The value of µ is 160. The standard deviation appears to be 2 units. Approximately 95% of the distribution lies between 156 and 164. 4.26 > > > > > m <- 2000 n <- 16 mu <- 30 sigma <- 8 MEANS <- apply(matrix(rnorm(m*n,mu,sigma),nrow=m),1,mean) a > mean(MEANS) [1] 29.95246 b > hist(MEANS,main="Part b",col="orange") 200 0 100 Frequency 300 400 Part b 24 26 28 30 MEANS c > qqnorm(MEANS,col="blue",pch=19) > qqline(MEANS,col="red") 32 34 36 32 30 28 24 26 Sample Quantiles 34 36 Normal Q-Q Plot -3 -2 -1 0 1 2 3 Theoretical Quantiles The histogram and the normal probability plot indicate that the distribution of sample means is very close to that of a normal distribution. Even though the sample size is only 16 the sampling distribution of y will be normally distributed because the population itself is normally distributed. d > EDA(MEANS) Loading required package: e1071 [1] "MEANS" Size (n) Missing Minimum 1st Qu Mean 2000.000 0.000 23.334 28.685 29.952 TrMean 3rd Qu Max. Stdev. Var. 29.960 31.266 36.684 1.975 3.902 I.Q.R. Range Kurtosis Skewness SW p-val 2.581 13.350 0.156 -0.036 0.443 Median 29.935 SE Mean 0.044 e The mean of the distribution of sample means is 29.952 (your answer will be different but close as you are performing a simulation) which is very close to the population mean of 30. This is because the theoretical mean of the sampling distribution of y is the same as the mean of the population µ. f The standard deviation of the sample means is 1.975 which is not at all close to the population standard deviation of 8. The standard deviation of the sampling distribution of y is not equal to the population standard deviation σ, but rather is σ/√n. In this case σ/√n = 8/√16 = 2. Notice that the standard deviation from the simulation (1.975), is very close to 2. 4.27 > > > > > > > > > > m <- 2000 n <- 64 mu <- 30 sigma <- 8 MEANS <- apply(matrix(rnorm(m*n,mu,sigma),nrow=m),1,mean) par(mfrow=c(1,2)) hist(MEANS,col="pink",main="Problem 4.27 \n Part b",breaks="Scott") qqnorm(MEANS,col="blue",main="Problem 4.27 \n Part c") qqline(MEANS,col="red") par(mfrow=c(1,1)) Problem 4.27 Part c 28 30 Sample Quantiles 400 0 26 200 Frequency 600 32 800 34 Problem 4.27 Part b 26 28 30 32 34 -4 MEANS -2 0 2 4 Theoretical Quantiles The histogram in Exercise 4.26 is close to being normally distributed, centered at 30, and ranges from about 25 to 35. The histogram in this exercise is also close to normal, centered at 30, but ranges from about 27.5 to 32.5. The normal probability plot here also support the conjecture that the sampling distribution of y is normally distributed when the population is also normally distributed. The only difference between the two exercises is that this distribution is less variable. This is a result of the fact that the standard deviation of the sampling distribution of y is σ/√n = 8/√64 = 1, whereas for the previous exercise, when n = 16, the standard deviation of the sampling distribution of y was 8/√16 = 2. > EDA(MEANS) [1] "MEANS" Size (n) Missing Minimum 1st Qu Mean 2000.000 0.000 27.008 29.350 30.018 TrMean 3rd Qu Max. Stdev. Var. 30.014 30.680 33.016 0.995 0.989 I.Q.R. Range Kurtosis Skewness SW p-val 1.330 6.008 -0.121 0.041 0.235 Median 30.019 SE Mean 0.022 Notice that the mean of the distribution of sample means is 30.018, very close to 30, and the standard deviation is 0.995, very close to 1. 4.28 > > > > > > > > > > m <- 2000 n <- 16 mu <- 30 sigma <- 8 MEANS <- apply(matrix(rexp(m*n,.2),nrow=m),1,mean) par(mfrow=c(1,2)) hist(MEANS,col="pink",main="Problem 4.28 \n Part b",breaks="Scott") qqnorm(MEANS,col="blue",main="Problem 4.28 \n Part c") qqline(MEANS,col="red") par(mfrow=c(1,1)) Problem 4.28 Part c 0 2 50 4 6 Sample Quantiles 150 100 Frequency 200 8 250 300 10 Problem 4.28 Part b 2 4 6 8 10 -3 MEANS -2 -1 0 1 2 3 Theoretical Quantiles The histogram is skewed right and the normal probability plot does not support normality. Because the population is not normally distributed (exponentially distributed) and the sample size is only 16, the Central Limit Theorem does not guarantee that the sampling distribution of y will be approximately normally distributed. d > EDA(MEANS) [1] "MEANS" Size (n) Missing Minimum 1st Qu Mean 2000.000 0.000 1.681 4.093 5.006 TrMean 3rd Qu Max. Stdev. Var. 4.964 5.762 9.869 1.262 1.591 I.Q.R. Range Kurtosis Skewness SW p-val 1.669 8.188 0.344 0.498 0.000 e Median 4.939 SE Mean 0.028 The mean of the distribution of sample means is 5.006 which is very close, as it should be, to the population mean of 5. 4.29 > > > > > > > > > > m <- 2000 n <- 64 mu <- 30 sigma <- 8 MEANS <- apply(matrix(rexp(m*n,.2),nrow=m),1,mean) par(mfrow=c(1,2)) hist(MEANS,col="pink",main="Problem 4.29 \n Part b",breaks="Scott") qqnorm(MEANS,col="blue",main="Problem 4.29 \n Part c") qqline(MEANS,col="red") par(mfrow=c(1,1)) Problem 4.29 Part c 6 4 5 Sample Quantiles 150 100 3 0 50 Frequency 200 7 250 Problem 4.29 Part b 3 4 5 6 7 8 -3 MEANS -2 -1 0 1 2 3 Theoretical Quantiles The histogram appears much closer to a normal distribution than in Exercise 4.28. The normal probability plot does not support the conjecture that the sampling distribution of y is closely approximated by a normally distribution. It does, however, appear more normally distributed than in Exercise 4.28. It seems, in the case of the exponential distribution, that the sample size should be larger than 64 in order for the Central Limit Theorem to guarantee that the sampling distribution of y is normally distributed. d > EDA(MEANS) [1] "MEANS" Size (n) Missing Minimum 1st Qu Mean 2000.000 0.000 3.001 4.549 4.980 TrMean 3rd Qu Max. Stdev. Var. 4.968 5.366 7.806 0.627 0.393 I.Q.R. Range Kurtosis Skewness SW p-val 0.817 4.805 0.310 0.313 0.000 Median 4.961 SE Mean 0.014 The mean of the distribution of sample means is 4.980 which is very close, as it should be, to the population mean of 5. The standard deviation of the distribution of sample means, 0.627, is smaller than it was in Exercise 4.28, 1.262, when the sample size was only 16. e In comparing the results of this exercise and the results found in Exercise 4.28 we see that when the sample size is only 16, the sampling distribution of y is not approximately normally distributed. When the sample size is 64, however, the sampling distribution of y is closer to being normally distributed. Recall that the Central Limit Theorem says that it will be approximately normally distributed when the sample size is relatively large. For the heavily skewed exponential distribution the sample size should be larger than 64.