Solutions of Quiz 3 (Take-Home)
1. The following table contains the distribution of the number of attempts people make to pass the driving
test in Neverdriveland (denoted by X). (18 pts total)

x       10    30    70
p(x)    .2    .6    .2
(i) Find the mean and standard deviation of X. (You do not need to show any work). (2 pts)
µ = 34
σ = √384 ≈ 19.5959
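The values µ = 34 and σ = √384 can be checked with a few lines of Python. This is only a sketch using the standard library; the names `pmf`, `mu`, `var` and `sigma` are illustrative and not part of the quiz.

```python
from math import sqrt

# Distribution of X from the table: value -> probability.
pmf = {10: 0.2, 30: 0.6, 70: 0.2}

# Mean: sum of x * p(x).
mu = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mu)^2 * p(x); the standard deviation is its square root.
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = sqrt(var)

print(round(mu, 4), round(var, 4), round(sigma, 4))
```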
You randomly select 2 people from this population and write down the number of times they attempted the
driving test. For the remaining questions, show all your work by filling in the tables on the answer pages.
(ii) What are all the different possible samples ? (2 pts)
Table of samples (columns: 1st value; rows: 2nd value)
        10          30          70
10      (10, 10)    (30, 10)    (70, 10)
30      (10, 30)    (30, 30)    (70, 30)
70      (10, 70)    (30, 70)    (70, 70)
(iii) Calculate the mean of each different sample. (2 pts)
Table of means (columns: 1st value; rows: 2nd value)
        10    30    70
10      10    20    40
30      20    30    50
70      40    50    70
(iv) Find the sampling distribution of the sample mean X̄. (4 pts)
Table of probabilities (columns: 1st value; rows: 2nd value)
        10     30     70
10      .04    .12    .04
30      .12    .36    .12
70      .04    .12    .04
Sampling distribution of X̄
x̄       10     20     30     40     50     70
p(x̄)    .04    .24    .36    .08    .24    .04
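The three tables in parts (ii) to (iv) can be reproduced by enumerating all ordered pairs of values. The sketch below assumes, as the probability table does, that the two draws are independent, so the probability of a pair is the product of the individual probabilities. Exact fractions are used to avoid rounding noise; all names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Distribution of X, stored as exact fractions.
pmf = {10: Fraction(2, 10), 30: Fraction(6, 10), 70: Fraction(2, 10)}

# Accumulate P(sample mean = xbar) over all 9 ordered samples of size 2.
sampling_dist = defaultdict(Fraction)
for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
    xbar = Fraction(x1 + x2, 2)
    sampling_dist[xbar] += p1 * p2  # independent draws: multiply probabilities

for xbar in sorted(sampling_dist):
    print(int(xbar), float(sampling_dist[xbar]))
```

Samples such as (10, 30) and (30, 10) have the same mean, which is why their probabilities are added together in the sampling distribution.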
(v) Find the expected value of the sample mean. How is it related to the population mean ? (4 pts)
x̄     p(x̄)    x̄p(x̄)
10    .04     .4
20    .24     4.8
30    .36     10.8
40    .08     3.2
50    .24     12
70    .04     2.8
µX̄ = Σx̄p(x̄) = .4 + 4.8 + 10.8 + 3.2 + 12 + 2.8 = 34 = µ
The expected value of the sample mean is equal to the population mean.
(vi) Find the standard deviation of the sample mean.
How is it related to the population standard deviation ? (4 pts)
x̄     p(x̄)    x̄ − µX̄    (x̄ − µX̄)²    (x̄ − µX̄)² p(x̄)
10    .04     −24        576           23.04
20    .24     −14        196           47.04
30    .36     −4         16            5.76
40    .08     6          36            2.88
50    .24     16         256           61.44
70    .04     36         1296          51.84
σX̄² = Σ(x̄ − µX̄)² p(x̄) = 23.04 + 47.04 + 5.76 + 2.88 + 61.44 + 51.84 = 192 = 384/2 = σ²/2
σX̄ = √192 ≈ 13.8564; also σX̄ = √192 = √(384/2) = √384/√2 = σ/√2
The standard deviation of the sample mean is equal to the population standard deviation divided by the
square root of the sample size.
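Both relationships, µX̄ = µ and σX̄ = σ/√n with n = 2, can be confirmed numerically from the sampling distribution found in part (iv). This is a minimal sketch; the dictionary `xbar_pmf` simply restates that distribution and the other names are illustrative.

```python
from math import isclose, sqrt

# Sampling distribution of the sample mean from part (iv).
xbar_pmf = {10: 0.04, 20: 0.24, 30: 0.36, 40: 0.08, 50: 0.24, 70: 0.04}

# Population parameters from part (i) and the sample size.
mu, sigma, n = 34, sqrt(384), 2

mu_xbar = sum(x * p for x, p in xbar_pmf.items())
var_xbar = sum((x - mu_xbar) ** 2 * p for x, p in xbar_pmf.items())
sigma_xbar = sqrt(var_xbar)

print(isclose(mu_xbar, mu))                    # mean of the sample mean equals mu
print(isclose(sigma_xbar, sigma / sqrt(n)))    # sd of the sample mean equals sigma / sqrt(n)
```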
2. Researchers at the Terry College of Business at the University of Georgia sampled 344 business students and asked them this question: “Over the course of your lifetime, what is the maximum number of
years you expect to work for any one employer?” Assume that the sample of students was randomly selected
from the 6,000 undergraduate students at the Terry College and that σ = 6 years. (16 pts total)
(i) If the mean for the 6,000 undergraduate students is µ = 18.5 years, find P (X̄ > 19.1).
Here we have µ = 18.5, σ = 6 and n = 344 (not to be confused with the population size, which is 6,000).
So µX̄ = µ = 18.5 and σX̄ = σ/√n = 6/√344 ≈ 0.3235.
Also n = 344 > 30 is a relatively large sample size. So by the Central Limit Theorem, X̄ is N(18.5, 0.3235).
Thus P (X̄ > 19.1) = normalcdf(19.1, 999 , 18.5, 0.3235) ≈ 0.0318
(ii) If the mean for the 6,000 undergraduate students is µ = 19.5 years, find P (X̄ > 19.1). (4 pts)
Here we have µ = 19.5, σ = 6 and n = 344.
So µX̄ = µ = 19.5 and σX̄ = σ/√n = 6/√344 ≈ 0.3235.
Also n = 344 > 30 is a relatively large sample size. So by the Central Limit Theorem, X̄ is N(19.5, 0.3235).
Thus P (X̄ > 19.1) = normalcdf(19.1, 999 , 19.5, 0.3235) ≈ 0.8919
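The calculator function normalcdf used in parts (i) and (ii) has a close analogue in Python's `statistics.NormalDist`. The sketch below is one way to reproduce both answers; the helper `prob_mean_above` is a hypothetical name introduced here, not something from the quiz.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_above(threshold, mu, sigma, n):
    """P(sample mean > threshold) under the normal model N(mu, sigma / sqrt(n))."""
    xbar = NormalDist(mu, sigma / sqrt(n))
    return 1 - xbar.cdf(threshold)


p_i = prob_mean_above(19.1, mu=18.5, sigma=6, n=344)   # part (i)
p_ii = prob_mean_above(19.1, mu=19.5, sigma=6, n=344)  # part (ii)

print(round(p_i, 4), round(p_ii, 4))
```

Using `1 - cdf(threshold)` avoids having to pick a large upper bound such as 999 for the right tail.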
For the next two questions, DO NOT use the calculator at all.
Simply write two or three sentences or draw a graph to explain your answer.
(iii) If P (X̄ > 19.1) = .5, what is µ ? (4 pts)
If P (X̄ > 19.1) = .5, it means that 50% of the values of X̄ are above 19.1 and 50% below 19.1.
We know that the distribution of X̄ is symmetric, so for the previous statement to be true,
19.1 must be the expected value of X̄. So µX̄ = 19.1
But we also know that µX̄ = µ. So here we have µ = 19.1
(iv) If P (X̄ > 19.1) = .2, is µ greater than or less than 19.1 years ? (4 pts)
Simply draw a graph for this question. The area under the graph to the right of 19.1 is .2.
For this to be possible, 19.1 has to fall to the right of the mean of the distribution.
(Any value to the left of the mean would have an area of at least .5 to its right).
So 19.1 is to the right of µX̄ , that is µX̄ < 19.1.
But we also know that µX̄ = µ. So here we have µ < 19.1. µ is less than 19.1 years.
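The quiz asks for a graphical argument in parts (iii) and (iv), but the conclusions can also be checked numerically. The sketch below assumes the same σ = 6 and n = 344 as in parts (i) and (ii), and solves P (X̄ > 19.1) = q for µ using µ = 19.1 − z·σX̄, where z is the standard normal quantile with area 1 − q to its left. The function name `mu_for_tail` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma_xbar = 6 / sqrt(344)  # standard deviation of the sample mean


def mu_for_tail(threshold, right_tail):
    """Population mean for which P(sample mean > threshold) equals right_tail."""
    z = NormalDist().inv_cdf(1 - right_tail)  # quantile with area 1 - right_tail to its left
    return threshold - z * sigma_xbar


mu_half = mu_for_tail(19.1, 0.5)   # part (iii): z = 0, so mu is exactly the threshold
mu_fifth = mu_for_tail(19.1, 0.2)  # part (iv): z > 0, so mu falls below the threshold

print(mu_half, round(mu_fifth, 2))
```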
3. The weight X of corn chips dispensed into a bag labeled 14 ounces at the factory of the company Golden
Chips is modeled as a normally distributed random variable with mean 14.1 ounces and an adjustable standard deviation. (20 pts total)
For the first three questions, assume that the standard deviation is set at σ = .4 ounces.
(i) What is the probability of an underfill, that is, determine P (X < 14) ? (2 pts)
We know that X is N(14.1, 0.4).
So P (X < 14) = normalcdf(−999 , 14, 14.1, 0.4) ≈ 0.4013
(ii) For a random sample of 16 Golden Chips bags, what is the probability that the sample average fill
is less than 14 ounces ? (4 pts)
Notice that 16 < 30 is a relatively small sample size. However, since X has a Normal distribution,
X̄ will also have a Normal distribution no matter what the sample size is.
Here we have µ = 14.1, σ = 0.4 and n = 16.
So µX̄ = µ = 14.1 and σX̄ = σ/√n = 0.4/√16 = 0.4/4 = 0.1.
X̄ is N(14.1, 0.1).
Thus P (X̄ < 14) = normalcdf(−999 , 14, 14.1, 0.1) ≈ 0.1587
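Parts (i) and (ii) differ only in the standard deviation used: σ for a single bag and σ/√n for the average of n bags. A minimal Python sketch of the comparison, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 14.1, 0.4, 16

# Part (i): one bag, X ~ N(14.1, 0.4).
p_single = NormalDist(mu, sigma).cdf(14)

# Part (ii): average of 16 bags, sample mean ~ N(14.1, 0.4 / sqrt(16)) = N(14.1, 0.1).
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(14)

print(round(p_single, 4), round(p_mean, 4))
```

The average of 16 bags is much less likely to fall below 14 ounces than a single bag, because averaging shrinks the spread by a factor of √16 = 4.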
(iii) Suppose that we draw a random sample of 16 bags of chips of a certain brand and observe that
x̄ = 13.75. Is it likely that the sample came from Golden Chips ? (6 pts)
Remember that the average weight of a sample of 16 bags from Golden Chips, X̄,
has a Normal distribution with µX̄ = 14.1 and σX̄ = 0.1.
IF the sample came from Golden Chips, the probability of observing an average weight as low as 13.75 oz is:
P (X̄ ≤ 13.75) = normalcdf(−999 , 13.75, 14.1, 0.1) ≈ 0.0002
This probability is extremely low, which indicates that our assumption was most likely wrong.
Thus it is not very likely that the sample came from Golden Chips.
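The same reasoning can be expressed through a z-score: 13.75 lies 3.5 standard deviations of X̄ below 14.1. The sketch below computes that z-score and the corresponding left-tail probability; the variable names are illustrative.

```python
from statistics import NormalDist

mu_xbar, sigma_xbar = 14.1, 0.1  # distribution of the sample mean for n = 16
observed = 13.75

# Number of standard deviations the observed average lies from the mean.
z = (observed - mu_xbar) / sigma_xbar

# Probability of an average this low or lower if the bags came from Golden Chips.
p_low = NormalDist(mu_xbar, sigma_xbar).cdf(observed)

print(round(z, 2), round(p_low, 4))
```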
(iv) The owners of the company would like to adjust the standard deviation to make sure that 98% of
all bags have a weight above the weight on the label. What should σ be ? (8 pts)
“98% of all bags have a weight above the weight on the label” means we want P (X > 14) = 0.98.
We know that X is N(14.1, σ). So Z = (X − 14.1)/σ is N(0, 1) (a standard normal variable).
X > 14 ⇒ X − 14.1 > 14 − 14.1 ⇒ X − 14.1 > −0.1 ⇒ (X − 14.1)/σ > −0.1/σ ⇒ Z > −0.1/σ
P (X > 14) = 0.98 ⇒ P (Z > −0.1/σ) = 0.98
So we want to find a value such that the area to the right of it is 0.98.
Thus the area to the left of the value that we are looking for is 0.02.
−0.1/σ = invNorm(0.02, 0, 1) ≈ −2.0537 ⇒ σ = −0.1/−2.0537 ≈ 0.0487
In order to make sure that 98% of all bags have a weight above the weight on the label, the standard
deviation should be set to σ = 0.0487 oz.
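The invNorm step corresponds to `NormalDist().inv_cdf` in Python. The sketch below solves for σ and then plugs the result back in to confirm that about 98% of bags exceed the label weight; the names `label`, `target` and `check` are illustrative.

```python
from statistics import NormalDist

mu, label, target = 14.1, 14.0, 0.98

# Standard normal value with area 1 - 0.98 = 0.02 to its left.
z = NormalDist().inv_cdf(1 - target)

# From (label - mu) / sigma = z, solve for sigma.
sigma = (label - mu) / z

# Check: with this sigma, P(X > 14) should come back as 0.98.
check = 1 - NormalDist(mu, sigma).cdf(label)

print(round(z, 4), round(sigma, 4), round(check, 4))
```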
4. Circle the correct answer choice that completes the following sentence (no need to show any work):
(4 pts)
The standard deviation of the sampling distribution for a sample mean based on samples of size 100 is
twice as large as
the standard deviation of the sampling distribution of a sample mean based on samples of size 400.
Explanation: Let σ be the population standard deviation.
Let σA be the standard deviation of the sample mean based on samples of size 100
and σB be the standard deviation of the sample mean based on samples of size 400.
Then σA = σ/√100 = σ/10, so σ = 10σA,
and σB = σ/√400 = σ/20, so σ = 20σB.
Hence 10σA = 20σB, so σA = 2σB.
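The factor of 2 does not depend on the value of σ, which a quick numerical check makes visible. In the sketch below the σ values are arbitrary illustrations, not data from the quiz, and `sd_of_mean` is a hypothetical helper name.

```python
from math import isclose, sqrt


def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)


# Arbitrary population standard deviations, chosen only for illustration.
ratios = [sd_of_mean(s, 100) / sd_of_mean(s, 400) for s in (1.0, 6.0, 19.6)]

print(all(isclose(r, 2.0) for r in ratios))
```

Quadrupling the sample size halves the standard deviation of the sample mean, since √400/√100 = 2.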
5. A poll by the Gallup Organization sponsored by Philadelphia-based CIGNA Integrated Care found that
about 40% of employees have missed work due to a musculoskeletal (back) injury of some kind (National
Underwriter, Apr. 5, 1999). A random sample of 100 workers is to be drawn from a particular manufacturing
plant. Let X be the number of sampled workers who have missed work due to a back injury. (12 pts total)
Let p̂ be the proportion of workers who missed work due to back injuries in the sample.
Here we have: p = 40% = 0.4 and n = 100.
(i) Find the mean and standard deviation of p̂. (3 pts)
µp̂ = p = 0.4
σp̂ = √(p(1−p)/n) = √((0.4)(1−0.4)/100) = √((0.4)(0.6)/100) ≈ 0.0490
(ii) Find the probability that the sample proportion is less than .38. (4 pts)
np = (100)(0.4) = 40 > 15
n(1 − p) = (100)(1 − 0.4) = (100)(0.6) = 60 > 15
Thus p̂ has an approximately Normal distribution with µp̂ = 0.4 and σp̂ ≈ 0.0490.
Thus P (p̂ < 0.38) = normalcdf(−999 , 0.38, 0.4, 0.0490) ≈ 0.3416
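Parts (i) and (ii) can be sketched in Python as follows. The code keeps the unrounded standard deviation, so the probability may differ from the calculator answer in the fourth decimal place, since the solution rounds σp̂ to 0.0490 first. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.4, 100

# Conditions for the normal approximation used in the solution.
assert n * p > 15 and n * (1 - p) > 15

# Mean and standard deviation of the sample proportion.
mu_phat = p
sigma_phat = sqrt(p * (1 - p) / n)

# Normal approximation to P(sample proportion < 0.38).
prob = NormalDist(mu_phat, sigma_phat).cdf(0.38)

print(round(sigma_phat, 4), round(prob, 3))
```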
(iii) Find the probability that fewer than 38 workers in the sample have missed work due to a back injury, without using a normal model for the sample proportion. (5 pts)
Hint: Consider the random variable representing the number of workers who missed work due to a back
injury. What kind of random variable would that be ?
Let N be the number of people in the sample of 100 who have missed work due to a back injury.
We have an experiment with:
1. 100 identical independent trials (selecting a worker from the population, observing whether the person
has missed work due to a back injury, repeating this 100 times)
2. each trial has 2 outcomes:
success: the person missed work due to a back injury (with probability .4)
failure: the person did not miss work due to a back injury (with probability .6)
3. the probability of success/failure is the same at each trial
Conditions 1 and 3 hold only because we are sampling from a large population.
N is the number of successes in such an experiment.
Thus N is a binomial random variable with n = 100 and p = 0.4.
P (N < 38) = P (N ≤ 37) = binomcdf(100, 0.4, 37) ≈ 0.3068
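The calculator function binomcdf sums binomial probabilities from 0 up to the given count. A minimal Python equivalent, written with `math.comb` from the standard library, is sketched below; `binom_cdf` is a name introduced here for illustration.

```python
from math import comb


def binom_cdf(n, p, k):
    """P(N <= k) for a binomial random variable N with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# P(N < 38) = P(N <= 37) for n = 100, p = 0.4.
prob = binom_cdf(100, 0.4, 37)

print(round(prob, 3))
```

The exact binomial answer is smaller than the normal approximation from part (ii) because p̂ < 0.38 corresponds to N ≤ 37, and the normal model applies no continuity correction.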