Download 4 - NYU Stern

STAT-UB.0103 SPRING 2012
Homework Set 4
Solutions
1. If Z is a standard normal random variable, find the probability that
a. 0 ≤ Z ≤ 1.24
b. Z > -0.45
c. | Z | ≥ 0.80
d. -0.88 < Z < 0.14
e. -1.13 < Z ≤ -0.92
f. | Z | ≤ 0.70
SOLUTION:
In all problems involving the normal table, we treat < and ≤ as equivalent symbols.
Similarly > and ≥ are treated as equivalent.
a. 0.3925
b. 0.6736
c. 0.4238
d. 0.3663
e. 0.0496
f. 0.5160
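The six table lookups above can be cross-checked in software. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the helper `prob_between` and the dictionary `answers` are illustrative names, not part of the course materials. Results may differ from the table answers in the fourth decimal place, because the table rounds each entry.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(lo, hi):
    """Return P[lo <= Z <= hi] for the standard normal variable Z."""
    return Z.cdf(hi) - Z.cdf(lo)


answers = {
    "a": prob_between(0.0, 1.24),      # P[0 <= Z <= 1.24]
    "b": 1.0 - Z.cdf(-0.45),           # P[Z > -0.45]
    "c": 2.0 * (1.0 - Z.cdf(0.80)),    # P[|Z| >= 0.80], both tails
    "d": prob_between(-0.88, 0.14),    # P[-0.88 < Z < 0.14]
    "e": prob_between(-1.13, -0.92),   # P[-1.13 < Z <= -0.92]
    "f": prob_between(-0.70, 0.70),    # P[|Z| <= 0.70]
}

for part, p in answers.items():
    print(f"{part}. {p:.4f}")
```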
2. If Z is a standard normal random variable, find the values of w that satisfy the
following. These need not be done by interpolation; just use the closest table value.
a. P[ Z > w ] = 0.32
b. P[ Z ≤ w ] = 0.71
c. P[ Z ≤ w ] = 0.14
d. P[ -w ≤ Z ≤ w ] = 0.60
You can use Minitab as well. You’ll need Calc ⇒ Probability Distributions ⇒
Normal and the Inverse cumulative probability feature.
SOLUTION:
a. Convert this to P[ 0 ≤ Z ≤ w ] = 0.18. Then find
P[ 0 ≤ Z ≤ 0.46 ] = 0.1772
P[ 0 ≤ Z ≤ 0.47 ] = 0.1808
The closer value corresponds to w = 0.47. (The exact answer, found from
Minitab, is w = 0.467699.)
b. Convert this to P[ 0 ≤ Z ≤ w ] = 0.21. Then find
P[ 0 ≤ Z ≤ 0.55 ] = 0.2088
P[ 0 ≤ Z ≤ 0.56 ] = 0.2123
The closer value corresponds to w = 0.55. (The exact answer, found from
Minitab, is w = 0.553385.)
c. The value of w is certainly negative. Convert this to P[ 0 ≤ Z ≤ -w ] = 0.36. Then
find
P[ 0 ≤ Z ≤ 1.08 ] = 0.3599
P[ 0 ≤ Z ≤ 1.09 ] = 0.3621
The closer value corresponds to -w = 1.08. Thus w = -1.08. (The exact answer,
found from Minitab, is w = -1.08032.)
© gs2012
d. Convert this to P[ 0 ≤ Z ≤ w ] = 0.30. Then find
P[ 0 ≤ Z ≤ 0.84 ] = 0.2995
P[ 0 ≤ Z ≤ 0.85 ] = 0.3023
The closer value corresponds to w = 0.84. (The exact answer, found from
Minitab, is w = 0.841621.)
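The exact values quoted from Minitab can also be reproduced with an inverse cumulative probability call in other software. This is a minimal sketch, assuming Python 3.8 or later; `NormalDist.inv_cdf` plays the role of Minitab's inverse cumulative probability feature, and the names `w_a` through `w_d` are illustrative. Each part is first rewritten as a statement of the form P[ Z ≤ w ] = p.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# a. P[Z > w] = 0.32 is the same as P[Z <= w] = 0.68
w_a = Z.inv_cdf(1 - 0.32)

# b. and c. are already in the form P[Z <= w] = p
w_b = Z.inv_cdf(0.71)
w_c = Z.inv_cdf(0.14)

# d. P[-w <= Z <= w] = 0.60 leaves 0.20 in each tail, so P[Z <= w] = 0.80
w_d = Z.inv_cdf(0.5 + 0.60 / 2)

for label, w in (("a", w_a), ("b", w_b), ("c", w_c), ("d", w_d)):
    print(f"{label}. w = {w:.6f}")
```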
3. Apples from Happy Mac Orchard have diameters which are
approximately normally distributed with mean μ = 2.8 inches and standard deviation
σ = 0.3 inch. Apples can be size-sorted by being made to roll over a mesh screen. At
this farm, the steps are done sequentially.
First, the apples are rolled over a screen with mesh size 2.5 inches. This separates
out all apples with diameters < 2.5 inches.
Second, the remaining apples are rolled over a screen with mesh size 3.3 inches.
This separates out all apples with diameters between 2.5 and 3.3 inches.
All the apples will now be separated into three groups.
a. Find the proportion of apples with diameter < 2.5 inches.
b. Find the proportion of apples with diameters between 2.5 and 3.3 inches.
c. Find the proportion of apples with diameters greater than 3.3 inches.
HINT: If X represents the diameter of a random apple, and if the screen has a mesh
size m, then P[X < m] represents the proportion of apples which will fall through.
SOLUTION: You are given the facts that μ = population mean diameter = 2.8 inches
and σ = population standard deviation = 0.3 inch. Let X be the random variable that
gives the diameter of an apple.
For part a, if a batch of apples is rolled over a screen with mesh size 2.5 inches, then
P[X < 2.5] represents the proportion of apples which will fall through. (NOTE: You can
think of P[X < 2.5] as the probability that one randomly selected apple will fall through
the screen; this also represents the proportion of all apples which will fall through.)
\[ P[\,X < 2.5\,] = P\left[ \frac{X - 2.8}{0.3} < \frac{2.5 - 2.8}{0.3} \right] \approx P[\,Z < -1.00\,] = 0.1587 \]
About 16% of the apples will fall through.
b. This is handled in exactly the same form as a.
\[ P[\,X < 3.3\,] = P\left[ \frac{X - 2.8}{0.3} < \frac{3.3 - 2.8}{0.3} \right] \approx P[\,Z < 1.67\,] = 0.9525 \]
About 95% of the apples would fall through a 3.3-inch screen. This calculation includes
those that would have been separated by the first step. Therefore, the proportion of apples
with diameters between 2.5 inches and 3.3 inches is 0.9525 - 0.1587 = 0.7938. This is about 79%.
c. The proportion of apples that do not fall through the larger screen is 1 - 0.9525
= 0.0475. That is, about 5% of the apples are in the largest category.
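The three proportions can be checked without standardizing by hand. The sketch below assumes Python 3.8 or later and treats the diameter as exactly normal with the stated μ and σ; the names `diameter`, `small`, `medium`, and `large` are illustrative. Because it does not round z to two decimals (1.67), its results can differ from the table-based answers in the third or fourth decimal place.

```python
from statistics import NormalDist

# Diameter model from the problem statement: mean 2.8 in, standard deviation 0.3 in
diameter = NormalDist(mu=2.8, sigma=0.3)

small = diameter.cdf(2.5)                       # falls through the 2.5-inch screen
medium = diameter.cdf(3.3) - diameter.cdf(2.5)  # falls through only the 3.3-inch screen
large = 1.0 - diameter.cdf(3.3)                 # falls through neither screen

print(f"a. under 2.5 in:        {small:.4f}")
print(f"b. between 2.5 and 3.3: {medium:.4f}")
print(f"c. over 3.3 in:         {large:.4f}")
```

The three groups partition the apples, so the proportions should sum to 1.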
4. The chocolate chip cookies that are produced at Perry’s Cookie Emporium have
weights which are approximately normally distributed with mean weight 180 grams and
with standard deviation 20 grams. The cookies, however, are sold by count, not by
weight. This is a high-markup business, and Perry wants to improve his image. He
decides to set aside the lightest 20% of the cookies to be packaged and sold separately. What
cookie weight will divide the lightest 20% from the heaviest 80%?
SOLUTION: Let X be the random variable which gives the cookie weights.
Apparently this has μ = 180 grams and σ = 20 grams. We seek the weight w so that
P[ X < w ] = 0.20. But
\[ P[\,X < w\,] = P\left[ \frac{X - 180}{20} < \frac{w - 180}{20} \right] = P\left[ Z < \frac{w - 180}{20} \right] \overset{\text{want}}{=} 0.20 \]
The normal table tells us that P[Z < -0.84] ≈ 0.20.
How do we get this value? If we seek c for which P[Z < c] = 0.20, then we note
that c must be negative and we apportion the probability as
0.20 = P[Z < c]
0.30 = P[c < Z < 0]
0.30 = P[0 < Z < -c]
0.20 = P[-c < Z ]
The third of these four facts corresponds to what we can look up in the normal
table. Thus we seek a value for -c so that, as closely as possible, P[0 < Z < -c] is
0.30. That happens for -c = 0.84, and thus c = -0.84. We then solve
-0.84 = (w − 180)/20, getting w = 163.2.
If the cookies are divided at weight 163.2 grams, then the lightweight group will have
(about) 20% of the cookies.
You can get Minitab to do all the heavy lifting. Use Calc ⇒ Probability Distributions
⇒ Normal. Set up the information panel as follows:
The output is this:
Inverse Cumulative Distribution Function
Normal with mean = 180 and standard deviation = 20
P( X <= x )        x
        0.2  163.168
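The same inverse cumulative calculation can be sketched outside Minitab. This assumes Python 3.8 or later and an exactly normal weight distribution; the names `weight` and `cutoff` are illustrative.

```python
from statistics import NormalDist

# Cookie weight model from the problem: mean 180 g, standard deviation 20 g
weight = NormalDist(mu=180, sigma=20)

# Weight below which the lightest 20% of cookies fall
cutoff = weight.inv_cdf(0.20)

print(f"cutoff weight = {cutoff:.3f} grams")
```

The small gap between this value and the hand answer of 163.2 comes from rounding z to -0.84 in the table lookup.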
5. Clyde’s Deli is situated inside a large industrial park. The weekday gross sales at
Clyde’s average $2,480, with a standard deviation of $360. Find the probability that the
average over the next 50 weekdays will exceed $2,400. Please note the assumptions that
are used in making the calculation.
SOLUTION: Let X1, X2, …, X50 be the random amounts for these 50 weekdays. We
must assume that these random variables are statistically independent, each with the
mean $2,480 and each with the standard deviation $360. Let X̄ be the average of these
50 values. We have no need to assume that the distribution is normal, as the Central
Limit theorem will assure us that X̄ is sufficiently close to normal. The mean of the
distribution of X̄ is $2,480 and the standard deviation of this distribution is
$360/√50 ≈ $50.91. We then proceed as follows:
\[ P[\,\bar{X} > \$2{,}400\,] = P\left[ \frac{\bar{X} - \$2{,}480}{\$50.91} > \frac{\$2{,}400 - \$2{,}480}{\$50.91} \right] \approx P[\,Z > -1.57\,] \]
\[ = P[\,Z < 1.57\,] = 0.50 + P[\,0 \le Z < 1.57\,] = 0.50 + 0.4418 = 0.9418 \]
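The calculation above can be sketched numerically. This is a minimal check, assuming Python 3.8 or later and relying on the same assumptions as the solution (independent days with a common mean and standard deviation, and a sample mean that is close to normal by the Central Limit theorem). The names `se`, `xbar`, and `p_exceed` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2480.0, 360.0, 50

# Standard deviation of the sample mean: sigma / sqrt(n)
se = sigma / sqrt(n)

# Approximate distribution of the 50-day average (Central Limit theorem)
xbar = NormalDist(mu=mu, sigma=se)

# Probability that the average exceeds $2,400
p_exceed = 1.0 - xbar.cdf(2400.0)

print(f"SD of the average = {se:.2f}")
print(f"P[average > 2400] = {p_exceed:.4f}")
```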
6. Examine the file CHS\GEYSER1.MTP. You can get this from the Web at
www.stern.nyu.edu/~gsimon/statdata. Look for the CHS folder.
Column C2 gives the duration, in minutes, of eruptions of the Old Faithful Geyser in
Yellowstone National Park. Column C3 gives the interval, in minutes, until the following
eruption. The concern here is whether the data in these columns follow a normal
distribution. Here we’ll just examine C2. (Column C3 is qualitatively very similar to
C2.)
Use Graph ⇒ Probability Plot to decide whether C2 follows a normal distribution.
What conclusion do you reach? You might also try Graph ⇒ Histogram to check
whether there is a simple description.
SOLUTION: Here’s the probability plot for C2:
[Figure: Probability Plot of Duration, Normal - 95% CI. Vertical axis: Percent; horizontal axis: Duration. Legend: Mean 3.576, StDev 1.084, N 222, AD 12.714, P-Value <0.005.]
This is outrageously non-normal. But what exactly does it mean to see the string of dots
snaking around the center zone?
The histogram will clear that up:
[Figure: Histogram of Duration. Vertical axis: Frequency; horizontal axis: Duration.]
NOTE: The default Minitab output was edited so that the
horizontal scale is the “cutpoint” type, rather than
“midpoint.”
This is a pattern called bimodal, meaning that there are two peaks. Apparently there are
many eruptions of long duration and also quite a few of short duration. It’s somewhat
unusual to get an eruption of 2½ to 3½ minutes.
7. You are about to take a sample from a population in order to use X̄ to estimate μ.
Here the population consists of items with monetary values. You would like the error
of estimate to be governed by the condition P[ | X̄ - μ | ≤ $5 ] ≥ 0.80. If you think that
σ, the population standard deviation, could be as large as $40, find the smallest sample
size n which will allow you to satisfy the condition.
HINT: Assume that σ = $40. If σ is really smaller, then you will still satisfy
the condition.
HINT: You will need to find a normal table point w so that P[ | Z | ≤ w ] = 0.80.
HINT: The formula \( n \ge \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 \) will be useful.
SOLUTION: Use the result \( n \ge \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 \). In this formula,
E is the limit on the magnitude of the error, meaning | X̄ - μ |, which is $5
in this example
σ is the maximum believable standard deviation, which is $40 in this
example
1 - α is the desired probability of achieving error E, which is 80% in this
example
In this example, we’ll note that α = 0.20 and then that α/2 = 0.10, so that
z_{α/2} = z_{0.10} = 1.28.
Then the formula asks for sample size
\[ n \ge \left( \frac{1.28 \times \$40}{\$5} \right)^2 = 10.24^2 = 104.8576 \]
As n must be an integer, we raise this to the next value 105.
The value of z_{0.10} asks for the solution of P[ 0 ≤ Z ≤ z_{0.10} ] = 0.40. The closest value is
1.28, so we use z_{0.10} = 1.28. (The exact value from Minitab is 1.28155.)
We were trying to get X̄ within σ/8 of μ (since $5 is one-eighth of $40), and the sample
size requirement was 105 or more.
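The sample-size formula is easy to sketch in code. The function below is a hypothetical helper (the name `sample_size` and its parameters are not from the course), assuming Python 3.8 or later. It accepts either a table value of z or, if none is given, the unrounded quantile. Note that the result is sensitive to rounding of z: with the table value 1.28 the formula gives 105, as in the solution, while the unrounded value 1.28155 pushes the bound just above 105 and gives 106.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, error, probability, z=None):
    """Smallest integer n with n >= (z * sigma / error)**2.

    If z is not supplied, use the unrounded normal quantile for which
    P[|Z| <= z] equals the requested probability.
    """
    if z is None:
        z = NormalDist().inv_cdf(0.5 + probability / 2)
    return ceil((z * sigma / error) ** 2)


n_table = sample_size(sigma=40, error=5, probability=0.80, z=1.28)
n_exact = sample_size(sigma=40, error=5, probability=0.80)

print(f"n using table z = 1.28:  {n_table}")
print(f"n using unrounded z:     {n_exact}")
```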
8. A population of daily sales figures is approximately normally distributed with a mean
of $14,000 and a standard deviation of $3,000.
(a) You’d like to predict tomorrow’s sales. (Yes, this is an inference
question.) Give an interval (a, b) for which you are willing to say that the
probability is about 95% that tomorrow’s sales will be between a and b.
(b) You’d like to predict the average sales over the next 15 business days.
(This is another inference question.) Give an interval (c, d) for which you
are willing to say that the probability is about 95% that the average sales over
the next 15 business days will be between c and d.
SOLUTION: Since about 95% of a distribution will be between μ - 2σ and μ + 2σ,
you’d give $14,000 ± 2 × $3,000, meaning $8,000 to $20,000, as an interval which
should contain about 95% of the probability. This takes care of (a). Since you’ve
assumed a normally distributed population, this 95% figure is very close to perfect.
As for (b), you only need to refine the argument by noting that
SD( X̄ ) = σ/√n = $3,000/√15 ≈ $775. An interval for this X̄ is $14,000 ± 2 × $775,
or $12,450 to $15,550.
You’ve assumed a normal population, so the 95% figure is pretty good. With a sample
of size n = 15, we are not quite allowed to use the Central Limit theorem, but we could
probably do it anyhow.
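Both intervals follow the same "mean ± 2 standard deviations" pattern, which the following minimal sketch makes explicit. It assumes Python 3.8 or later; the names `single_day`, `average_15`, and `coverage` are illustrative. The sketch does not round the standard deviation of the average to $775, so the endpoints of the second interval differ from the hand answer by about a dollar.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 14000.0, 3000.0, 15

# (a) interval for one day's sales: mu +/- 2 sigma
single_day = (mu - 2 * sigma, mu + 2 * sigma)

# (b) interval for the 15-day average: mu +/- 2 sigma / sqrt(n)
se = sigma / sqrt(n)
average_15 = (mu - 2 * se, mu + 2 * se)

# Probability a normal variable lands within 2 standard deviations of its mean
Z = NormalDist()
coverage = Z.cdf(2) - Z.cdf(-2)

print(f"(a) {single_day[0]:,.0f} to {single_day[1]:,.0f}")
print(f"(b) {average_15[0]:,.0f} to {average_15[1]:,.0f}  (SD of average = {se:,.0f})")
print(f"coverage of +/- 2 SD under normality = {coverage:.4f}")
```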
9. You have been tracking the “cash reserve” of a very large mutual fund, hoping to find
clues about future behavior. The cash reserve is currently at 4.32. The units here are
millions of dollars. You believe that the daily changes to this reserve are normally
distributed, with mean -0.02 and with standard deviation 0.06. Find the probability that
the reserves will be below 3.90 after 25 days.
SOLUTION: This is a very simple random walk. Let T be the total of the 25 daily
changes. The condition “reserve < 3.90” is exactly the same as {T < -0.42}, since
3.90 - 4.32 = -0.42. However T has a mean of 25 × (-0.02) = -0.50 and has
SD(T) = 0.06 × √25 = 0.30. Then
\[ P[\,T < -0.42\,] = P\left[ \frac{T - (-0.50)}{0.30} < \frac{(-0.42) - (-0.50)}{0.30} \right] \approx P[\,Z < 0.27\,] = 0.6064 \]
With a span of n = 25 days, we have the Central Limit theorem, so it was not really
critical to assume the normal distribution for the daily changes.
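The random-walk calculation can be sketched as follows, assuming Python 3.8 or later and independent, normally distributed daily changes as stated in the problem. The names `total` and `p_below` are illustrative. The sketch does not round z to 0.27, so its answer is slightly below the table-based 0.6064.

```python
from math import sqrt
from statistics import NormalDist

start, threshold, days = 4.32, 3.90, 25
mu_step, sd_step = -0.02, 0.06

# Total change over 25 days: mean adds, standard deviation scales with sqrt(days)
total = NormalDist(mu=days * mu_step, sigma=sd_step * sqrt(days))

# Reserve < 3.90 is the same event as total change < 3.90 - 4.32
p_below = total.cdf(threshold - start)

print(f"mean of T = {total.mean:.2f}, SD of T = {total.stdev:.2f}")
print(f"P[reserve < 3.90 after 25 days] = {p_below:.4f}")
```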
10. The Vindicator Mutual Fund family has a fund that tracks industrial metals. This is
currently trading at $14.20 per share. It is believed that the daily price follows a
lognormal random walk with drift μ = +0.02 and with volatility expressed through
σ = 0.15. Find the probability that the price after thirty trading days will exceed $16.00.
HINT: Recall that the lognormal random walk is governed through
\( P_n = P_0\, e^{X_1 + X_2 + \cdots + X_n} \), where the X_i’s are normal with mean μ and standard
deviation σ.
SOLUTION: We can rewrite the HINT as \( P_n = P_0\, e^{T_n} \), where T_n = X_1 + X_2 + … + X_n.
Note that E(T_n) = nμ = 30 × 0.02 = 0.60 and SD(T_n) = σ√n = 0.15 × √30 ≈ 0.8216.
Then
\[ P[\,P_n > \$16.00\,] = P\left[ P_0\, e^{T_n} > \$16.00 \right] = P\left[ \$14.20\, e^{T_{30}} > \$16.00 \right] \approx P\left[ e^{T_{30}} > 1.126761 \right] \approx P[\,T_{30} > 0.119347\,] \]
\[ = P\left[ \frac{T_{30} - 0.60}{0.8216} > \frac{0.119347 - 0.60}{0.8216} \right] \approx P[\,Z > -0.5850\,] = P[\,Z < 0.5850\,] \]
\[ = 0.50 + P[\,0 \le Z < 0.5850\,] = 0.50 + 0.2207 = 0.7207. \]
The value 0.2207 was obtained by an easy interpolation.
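The lognormal random walk reduces to a normal calculation on the log scale, which the following minimal sketch carries out. It assumes Python 3.8 or later and independent normal daily log-changes as in the HINT; the names `log_growth` and `p_exceed` are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

p0, target, n = 14.20, 16.00, 30
mu, sigma = 0.02, 0.15

# T_n = X_1 + ... + X_n is normal with mean n*mu and SD sigma*sqrt(n)
log_growth = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))

# P_n > target  <=>  T_n > ln(target / p0)
p_exceed = 1.0 - log_growth.cdf(log(target / p0))

print(f"ln(16.00 / 14.20) = {log(target / p0):.6f}")
print(f"SD of T_30 = {log_growth.stdev:.4f}")
print(f"P[price > 16.00 after 30 days] = {p_exceed:.4f}")
```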