Chapter 4 Section 2 Problems

Partial SOLUTIONS
4.14
a M - the sample median
b p - the sample proportion
c y - the sample mean
4.15
a p1 - p2 - the difference between two sample proportions is a statistic.
b The figure $1.22 is a statistic because it was found from a sample of 6000 service stations.
c The figure 50% is a parameter because it pertains to the population of all Americans.
4.16
a 25, 8/√10
b 25, 8/√16
c 25, 8/√30
d 25, 8/√100
Only in parts c and d, where the sample size is 30 or more, can we be assured that the sampling
distribution of y will be approximately normally distributed.
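The four standard deviations above all come from σ/√n with σ = 8. A minimal R sketch (the names `n` and `se` are illustrative, not from the original solutions) evaluates them and flags which parts meet the n ≥ 30 rule of thumb:

```r
# Exercise 4.16: standard error of the sample mean, sigma/sqrt(n)
sigma <- 8
n <- c(a = 10, b = 16, c = 30, d = 100)   # sample sizes for parts a-d
se <- sigma / sqrt(n)
round(se, 3)

# Parts whose sample size meets the n >= 30 rule of thumb
names(n)[n >= 30]
```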
4.17
72 ± 3σ /√n = 66 to 78
4.18
y is approximately normal and Z = ( y - 100)/(16/√36)
P( y > 105.6) = P(Z > 2.1) = 1 - .9821 = .0179, so it is somewhat unusual.
> 1-pnorm(105.6,100,16/sqrt(36))
[1] 0.01786442
4.19
Z = (1900 - 1800)/(400/√90) = 2.37
P( y > 1900) = P(Z > 2.37) = 1 - .9911 = .0089
> 1-pnorm(1900,1800,400/sqrt(90))
[1] 0.008853033
4.20
Z = (75 - 72)/(10/√50) = 2.12
P( y > 75) = P(Z > 2.12) = 1 - .9830 = .0170
> 1-pnorm(75,72,10/sqrt(50))
[1] 0.01694743
4.21
a
Z = (34 - 31)/2 = 1.5, Z = (28 - 31)/2 = -1.5
probability = .9332 - .0668 = .8664
> pnorm(34,31,2)-pnorm(28,31,2)
[1] 0.8663856
b
Z = (34 - 31)/(2/√25) = 7.5, Z = (28 - 31)/(2/√25) = -7.5
probability ≈ 1 - 0 = 1, almost certain.
> pnorm(34,31,2/sqrt(25))-pnorm(28,31,2/sqrt(25))
[1] 1
4.22
a 50/√64 = 6.25
b 50/√400 = 2.5
4.23
12.63 ± 1.96(5 /√263) = 12.02 to 13.23
> 12.63-qnorm(.975)*5/sqrt(263)
[1] 12.02572
> 12.63+qnorm(.975)*5/sqrt(263)
[1] 13.23428
4.24
Z = (99.5 - 99)/(2/√50) = 1.77, Z = (98.5 - 99)/(2/√50) = -1.77
Probability of a false alarm = 2(1 - .9616) = .0768.
> pnorm(98.5,99,2/sqrt(50))+(1-pnorm(99.5,99,2/sqrt(50)))
[1] 0.07709987
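The table-based answer and the R answer differ slightly because Z is rounded to 1.77 (and the table entry to four decimals) before the lookup. A short sketch using only base R's `pnorm` makes the comparison explicit; the variable names are illustrative.

```r
# Exercise 4.24: false-alarm probability, rounded Z versus exact Z
mu <- 99; sigma <- 2; n <- 50
se <- sigma / sqrt(n)
z_exact <- (99.5 - mu) / se           # about 1.768
p_table <- 2 * (1 - pnorm(1.77))      # Z rounded to two decimals, as in a table lookup
p_exact <- 2 * (1 - pnorm(z_exact))   # no rounding; same as the pnorm() call above
round(c(z = z_exact, rounded_z = p_table, exact = p_exact), 4)
```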
4.25
The sampling distribution appears to be normally distributed. The value of µ is 160. The standard
deviation appears to be 2 units. Approximately 95% of the distribution lies between 156 and 164.
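The 95% figure is the usual two-standard-deviation rule. Assuming the sampling distribution is normal with mean 160 and standard deviation 2, as read from the figure, a quick check in R:

```r
# Exercise 4.25: proportion of a Normal(160, 2) distribution between 156 and 164
p <- pnorm(164, mean = 160, sd = 2) - pnorm(156, mean = 160, sd = 2)
round(p, 4)   # 0.9545
```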
4.26
> m <- 2000
> n <- 16
> mu <- 30
> sigma <- 8
> MEANS <- apply(matrix(rnorm(m*n,mu,sigma),nrow=m),1,mean)
a
> mean(MEANS)
[1] 29.95246
b
> hist(MEANS,main="Part b",col="orange")
[Figure: histogram titled "Part b"; x-axis MEANS, y-axis Frequency]
c
> qqnorm(MEANS,col="blue",pch=19)
> qqline(MEANS,col="red")
[Figure: "Normal Q-Q Plot"; x-axis Theoretical Quantiles, y-axis Sample Quantiles]
The histogram and the normal probability plot indicate that the distribution of sample means is very
close to that of a normal distribution. Even though the sample size is only 16 the sampling
distribution of y will be normally distributed because the population itself is normally distributed.
d
> EDA(MEANS)
Loading required package: e1071
[1] "MEANS"
Size (n)  Missing  Minimum   1st Qu     Mean   Median   TrMean   3rd Qu
2000.000    0.000   23.334   28.685   29.952   29.935   29.960   31.266
    Max.   Stdev.     Var.  SE Mean   I.Q.R.    Range Kurtosis Skewness
  36.684    1.975    3.902    0.044    2.581   13.350    0.156   -0.036
SW p-val
   0.443
e
The mean of the distribution of sample means is 29.952 (your answer will be different but close as
you are performing a simulation) which is very close to the population mean of 30. This is
because the theoretical mean of the sampling distribution of y is the same as the mean of the
population µ.
f
The standard deviation of the sample means is 1.975 which is not at all close to the population
standard deviation of 8. The standard deviation of the sampling distribution of y is not equal to the
population standard deviation σ, but rather is σ/√n. In this case σ/√n = 8/√16 = 2. Notice that the
standard deviation from the simulation (1.975), is very close to 2.
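Because the numbers change on every run, a seeded version of the simulation can make the comparison with σ/√n repeatable. The sketch below is one way to do that; the seed value is an arbitrary assumption, and `rowMeans` is used as an equivalent of `apply(..., 1, mean)`.

```r
# Exercise 4.26, reproducible sketch: m samples of size n from Normal(mu, sigma)
set.seed(1)                        # arbitrary seed, chosen only for repeatability
m <- 2000; n <- 16; mu <- 30; sigma <- 8
MEANS <- rowMeans(matrix(rnorm(m * n, mu, sigma), nrow = m))

# Simulated mean and sd of the sample means next to the theoretical sd
c(mean = mean(MEANS), sd = sd(MEANS), theoretical_sd = sigma / sqrt(n))
```

The simulated values will differ from 29.952 and 1.975 above, since those came from a different random draw, but they should land close to 30 and 2.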
4.27
> m <- 2000
> n <- 64
> mu <- 30
> sigma <- 8
> MEANS <- apply(matrix(rnorm(m*n,mu,sigma),nrow=m),1,mean)
> par(mfrow=c(1,2))
> hist(MEANS,col="pink",main="Problem 4.27 \n Part b",breaks="Scott")
> qqnorm(MEANS,col="blue",main="Problem 4.27 \n Part c")
> qqline(MEANS,col="red")
> par(mfrow=c(1,1))
[Figure: two panels. Left: histogram "Problem 4.27 Part b" (x-axis MEANS, y-axis Frequency). Right: normal Q-Q plot "Problem 4.27 Part c" (x-axis Theoretical Quantiles, y-axis Sample Quantiles).]
The histogram in Exercise 4.26 is close to being normally distributed, centered at 30, and ranges
from about 25 to 35. The histogram in this exercise is also close to normal, centered at 30, but
ranges from about 27.5 to 32.5. The normal probability plot here also supports the conjecture that
the sampling distribution of y is normally distributed when the population is also normally
distributed. The only difference between the two exercises is that this distribution is less variable.
This is a result of the fact that the standard deviation of the sampling distribution of y is σ/√n =
8/√64 = 1, whereas for the previous exercise, when n = 16, the standard deviation of the sampling
distribution of y was 8/√16 = 2.
> EDA(MEANS)
[1] "MEANS"
Size (n)  Missing  Minimum   1st Qu     Mean   Median   TrMean   3rd Qu
2000.000    0.000   27.008   29.350   30.018   30.019   30.014   30.680
    Max.   Stdev.     Var.  SE Mean   I.Q.R.    Range Kurtosis Skewness
  33.016    0.995    0.989    0.022    1.330    6.008   -0.121    0.041
SW p-val
   0.235
Notice that the mean of the distribution of sample means is 30.018, very close to 30, and the
standard deviation is 0.995, very close to 1.
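The effect of quadrupling the sample size can be seen side by side. The following is a sketch only: `sim_sd` is a hypothetical helper introduced here, and the seed is arbitrary.

```r
# Simulated sd of the sample mean for n = 16 and n = 64, Normal(30, 8) population
set.seed(2)   # arbitrary seed, chosen only for repeatability

# Hypothetical helper: sd of m simulated sample means, each from a sample of size n
sim_sd <- function(n, m = 2000, mu = 30, sigma = 8) {
  sd(rowMeans(matrix(rnorm(m * n, mu, sigma), nrow = m)))
}

sizes <- c(16, 64)
result <- data.frame(n = sizes,
                     simulated = sapply(sizes, sim_sd),
                     theoretical = 8 / sqrt(sizes))
result
```

Quadrupling n from 16 to 64 halves the theoretical standard deviation, from 2 to 1, and the simulated column should track that.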
4.28
> m <- 2000
> n <- 16
> mu <- 30
> sigma <- 8
> MEANS <- apply(matrix(rexp(m*n,.2),nrow=m),1,mean)
> par(mfrow=c(1,2))
> hist(MEANS,col="pink",main="Problem 4.28 \n Part b",breaks="Scott")
> qqnorm(MEANS,col="blue",main="Problem 4.28 \n Part c")
> qqline(MEANS,col="red")
> par(mfrow=c(1,1))
[Figure: two panels. Left: histogram "Problem 4.28 Part b" (x-axis MEANS, y-axis Frequency). Right: normal Q-Q plot "Problem 4.28 Part c" (x-axis Theoretical Quantiles, y-axis Sample Quantiles).]
The histogram is skewed right and the normal probability plot does not support normality. Because the
population is not normally distributed (exponentially distributed) and the sample size is only 16, the Central
Limit Theorem does not guarantee that the sampling distribution of y will be approximately normally
distributed.
d
> EDA(MEANS)
[1] "MEANS"
Size (n)  Missing  Minimum   1st Qu     Mean   Median   TrMean   3rd Qu
2000.000    0.000    1.681    4.093    5.006    4.939    4.964    5.762
    Max.   Stdev.     Var.  SE Mean   I.Q.R.    Range Kurtosis Skewness
   9.869    1.262    1.591    0.028    1.669    8.188    0.344    0.498
SW p-val
   0.000
e
The mean of the distribution of sample means is 5.006 which is very close, as it should be, to the
population mean of 5.
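The simulation draws from `rexp` with rate 0.2, so the population mean and standard deviation are both 1/0.2 = 5 (the `mu` and `sigma` assignments are not used by `rexp`). Two standard results that the original solutions do not state give reference values for the simulated output: an exponential population has skewness 2, and the mean of n independent observations has skewness equal to the population skewness divided by √n. A sketch of those reference values:

```r
# Theoretical properties of the mean of n = 16 Exponential(rate = 0.2) observations
rate <- 0.2; n <- 16
pop_mean  <- 1 / rate             # 5
pop_sd    <- 1 / rate             # 5; mean and sd coincide for the exponential
se_mean   <- pop_sd / sqrt(n)     # 1.25, compare with the simulated Stdev. 1.262
skew_mean <- 2 / sqrt(n)          # 0.5, compare with the simulated Skewness 0.498
c(mean = pop_mean, se = se_mean, skewness = skew_mean)
```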
4.29
> m <- 2000
> n <- 64
> mu <- 30
> sigma <- 8
> MEANS <- apply(matrix(rexp(m*n,.2),nrow=m),1,mean)
> par(mfrow=c(1,2))
> hist(MEANS,col="pink",main="Problem 4.29 \n Part b",breaks="Scott")
> qqnorm(MEANS,col="blue",main="Problem 4.29 \n Part c")
> qqline(MEANS,col="red")
> par(mfrow=c(1,1))
[Figure: two panels. Left: histogram "Problem 4.29 Part b" (x-axis MEANS, y-axis Frequency). Right: normal Q-Q plot "Problem 4.29 Part c" (x-axis Theoretical Quantiles, y-axis Sample Quantiles).]
The histogram appears much closer to a normal distribution than in Exercise 4.28. The normal
probability plot still does not support the conjecture that the sampling distribution of y is closely
approximated by a normal distribution. It does, however, appear more nearly normal than
in Exercise 4.28. It seems, in the case of the exponential distribution, that the sample size should be
larger than 64 for the normal approximation described by the Central Limit Theorem to be close.
d
> EDA(MEANS)
[1] "MEANS"
Size (n)  Missing  Minimum   1st Qu     Mean   Median   TrMean   3rd Qu
2000.000    0.000    3.001    4.549    4.980    4.961    4.968    5.366
    Max.   Stdev.     Var.  SE Mean   I.Q.R.    Range Kurtosis Skewness
   7.806    0.627    0.393    0.014    0.817    4.805    0.310    0.313
SW p-val
   0.000
The mean of the distribution of sample means is 4.980 which is very close, as it should be, to
the population mean of 5. The standard deviation of the distribution of sample means, 0.627,
is smaller than it was in Exercise 4.28, 1.262, when the sample size was only 16.
e In comparing the results of this exercise and the results found in Exercise 4.28 we see that when
the sample size is only 16, the sampling distribution of y is not approximately normally
distributed. When the sample size is 64, however, the sampling distribution of y is closer to being
normally distributed. Recall that the Central Limit Theorem says that it will be approximately
normally distributed when the sample size is relatively large. For the heavily skewed exponential
distribution the sample size should be larger than 64.
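One way to put numbers on "larger than 64" is to tabulate the theoretical standard error and skewness of the sample mean as n grows. The sketch relies on two standard results not given in the original solutions (the standard error is 5/√n and the skewness of the mean is 2/√n for this exponential population); the sample sizes beyond 64 are arbitrary illustrations.

```r
# Exponential(rate = 0.2) population: how se and skewness of the mean shrink with n
n <- c(16, 64, 256, 1024)
se   <- 5 / sqrt(n)    # 1.25, 0.625, 0.3125, 0.15625
skew <- 2 / sqrt(n)    # 0.5, 0.25, 0.125, 0.0625
data.frame(n = n, se_of_mean = se, skewness_of_mean = skew)
```

Each fourfold increase in n only halves the skewness, which is why the Q-Q plot for n = 64 still shows visible departure from normality.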