Download stdin (ditroff) - Purdue Engineering

Homework #8. Due: Friday, October 24, 2003.
<KEY> IE 230
Textbook: D.C. Montgomery and G.C. Runger, Applied Statistics and Probability for
Engineers, John Wiley & Sons, New York, 2003. Sections 4.6–4.12.
1. Exam 1 scores had a minimum of 31, a maximum of 104, a mean of 73.3 and a standard
deviation of 15.8. Of 143 students, 28 had scores lying between 71 and 80. Twelve had
scores fifty or less.
(a) Let X denote the Exam 1 score of a randomly selected student. Assume that X has a
normal distribution with the given mean and standard deviation. Sketch the pdf.
Label and scale the axes.
____________________________________________________________
Draw a horizontal and a vertical axis.
Label the horizontal axis with x and the vertical axis with $f_X(x)$.
Sketch a bell curve.
Scale the horizontal axis.
Under the middle of the bell curve, place the mean 73.3.
Under the points of inflection, place 73.3 − 15.8 = 57.5 and 73.3 + 15.8 = 89.1.
Scale the vertical axis.
Place zero at the bottom of the bell curve.
Place $(2\pi)^{-1/2} / 15.8 \approx 0.4 / 15.8 \approx 0.025$ at the top of the bell curve.
Comment: The vertical scaling is seldom used, but is determined by the
requirement that the area is one.
____________________________________________________________
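Comment: The scale values for the sketch can be checked numerically. The following is a minimal Python sketch, not part of the original key (which uses MSexcel). It assumes Python 3.8 or later for statistics.NormalDist, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Normal model for Exam 1 scores, using the mean and standard deviation from the problem.
exam = NormalDist(mu=73.3, sigma=15.8)

# Points of inflection lie one standard deviation on either side of the mean.
left_inflection = exam.mean - exam.stdev
right_inflection = exam.mean + exam.stdev

# Height of the pdf at the mean, which equals (2*pi)**-0.5 / sigma.
peak_height = exam.pdf(exam.mean)

print(round(left_inflection, 1), round(right_inflection, 1), round(peak_height, 4))
```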
(b) The Central Limit Theorem says that the normal distribution is a good model for
sums of random variables. Explain how an exam score is a sum of random variables
(and that therefore exam scores are likely to be approximately normally
distributed).
____________________________________________________________
Let $X_i$ denote the points scored on question i, for i = 1, 2, ..., n.
The test score is then $X = X_1 + X_2 + \cdots + X_n$.
If n is large, then the Central Limit Theorem implies that
test scores are approximately normally distributed.
____________________________________________________________
(c) In fact, X takes only integer values, so the normal distribution is an approximation.
Explain why a reasonable approximation to P(X = 75) is P(74.5 ≤ X ≤ 75.5). (This is
called the continuity correction.)
____________________________________________________________
Probabilities associated with discrete random variables are sums.
Probabilities associated with continuous random variables are integrals.
The probability of getting a 75 on Exam 1 is positive, because X is discrete.
For any continuous random variable, say Y, P(Y = 75) = 0.
If X is discrete with positive probabilities on only the integers, then
P(X = 75) = P(74.5 ≤ X ≤ 75.5) exactly.
Therefore, approximate P(X = 75) by P(74.5 ≤ Y ≤ 75.5),
where Y is normal with the same mean and standard deviation as the
distribution of X.
____________________________________________________________
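Comment: As an illustration of the continuity correction, the following Python sketch (an addition, not part of the original key) evaluates P(74.5 ≤ Y ≤ 75.5) for the normal Y of Part (a). The helper name approx_point_probability is hypothetical.

```python
from statistics import NormalDist

# Normal approximation Y with the same mean and standard deviation as the scores X.
y = NormalDist(mu=73.3, sigma=15.8)

def approx_point_probability(k, dist=y):
    """Continuity-corrected approximation to P(X = k) for an integer-valued X."""
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)

p_75 = approx_point_probability(75)
print(round(p_75, 4))
```

The result is close to the pdf height near the mean times the interval width of one, which is what the sketch of Part (a) suggests.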
Schmeiser
(d) Let Y denote the number of scores in [71, 80]. If scores are independent, then Y is
binomial with n = 143 and (approximately) p = P(70.5 ≤ X ≤ 80.5). Determine the
value of p by
(i) shading the area in the sketch of Part (a) and approximating that area.
____________________________________________________________
Shade the area between 70.5 and 80.5.
State the approximate area, which should be roughly 20%.
____________________________________________________________
(ii) converting to standard normal and using Table II.
____________________________________________________________
\begin{align*}
p &= P(70.5 \le X \le 80.5) && \text{definition of } p \\
&= P\left( \frac{70.5 - 73.3}{15.8} \le \frac{X - 73.3}{15.8} \le \frac{80.5 - 73.3}{15.8} \right) && \text{same event} \\
&= P\left( \frac{70.5 - 73.3}{15.8} \le Z \le \frac{80.5 - 73.3}{15.8} \right) && \text{standard normal} \\
&= P(-0.178 \le Z \le 0.456) && \text{simplify} \\
&= P(Z \le 0.456) - P(Z \le -0.178) && \text{Axiom 2} \\
&= \Phi(0.456) - \Phi(-0.178) && \text{standard-normal cdf notation} \\
&= 0.676 - 0.430 && \text{interpolating from Table II} \\
&= 0.246 && \text{simplify}
\end{align*}
____________________________________________________________
(iii) using "normdist" in MSexcel.
____________________________________________________________
\begin{align*}
p &= P(70.5 \le X \le 80.5) && \text{definition of } p \\
&= P\left( \frac{70.5 - 73.3}{15.8} \le \frac{X - 73.3}{15.8} \le \frac{80.5 - 73.3}{15.8} \right) && \text{same event} \\
&= P\left( \frac{70.5 - 73.3}{15.8} \le Z \le \frac{80.5 - 73.3}{15.8} \right) && \text{standard normal} \\
&= P(-0.178 \le Z \le 0.456) && \text{simplify} \\
&= P(Z \le 0.456) - P(Z \le -0.178) && \text{Axiom 2} \\
&= \text{normsdist}(0.456) - \text{normsdist}(-0.178) && \text{standard-normal cdf in MSexcel} \\
&= 0.676 - 0.430 && \text{from MSexcel} \\
&= 0.246 && \text{simplify}
\end{align*}
____________________________________________________________
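Comment: The same calculation can be sketched in Python instead of MSexcel. This is an illustrative addition, not the key's method; it assumes Python 3.8 or later, where NormalDist().cdf plays the role of Table II or "normsdist".

```python
from statistics import NormalDist

mu, sigma = 73.3, 15.8

# Standardize the continuity-corrected endpoints of [71, 80].
z_low = (70.5 - mu) / sigma
z_high = (80.5 - mu) / sigma

# Standard-normal cdf, playing the role of Table II or "normsdist".
phi = NormalDist().cdf
p = phi(z_high) - phi(z_low)

print(round(z_low, 3), round(z_high, 3), round(p, 3))
```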
(e) Determine the value of E(Y). Discuss whether the 28 observed scores in [71, 80] is
reasonable if exam scores are normally distributed.
____________________________________________________________
E(Y ) = np = 143 × 0.246 = 35.2
The observed 28 scores in the interval [71, 80] is less than E(Y ).
The standard deviation of Y is $\sqrt{n p (1 - p)} = \sqrt{143 \times 0.246 \times 0.754} = 5.15$,
so the observed value is a bit more than one standard deviation to the left of the mean.
Using the normal distribution as an approximation to this binomial distribution,
observing a value just to the left of the left point of inflection is not unusual.
____________________________________________________________
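Comment: The binomial mean, the standard deviation, and the distance of the observed count from the mean can be reproduced with a few lines of Python. This is a sketch added for illustration; the variable names are not from the original key.

```python
from math import sqrt

n = 143
p = 0.246          # P(70.5 <= X <= 80.5) from Part (d)
observed = 28      # scores actually observed in [71, 80]

mean_y = n * p                    # binomial mean
sd_y = sqrt(n * p * (1 - p))      # binomial standard deviation
z = (observed - mean_y) / sd_y    # distance from the mean in standard deviations

print(round(mean_y, 1), round(sd_y, 2), round(z, 2))
```

A negative z between one and two in magnitude matches the remark that the observed value lies a bit more than one standard deviation to the left of the mean.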
2. (Montgomery and Runger, Problem 4–46) If X is normally distributed with mean 5 and
standard deviation 4, determine the value of x that satisfies P(−x < X < x ) = 0.99.
____________________________________________________________
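\begin{align*}
& P(-x < X < x) = 0.99 \\
\text{implies} \quad & P\left( \frac{-x - 5}{4} < \frac{X - 5}{4} < \frac{x - 5}{4} \right) = 0.99 \\
\text{implies} \quad & P\left( \frac{-x - 5}{4} < Z < \frac{x - 5}{4} \right) = 0.99 \\
\text{implies} \quad & P\left( Z < \frac{x - 5}{4} \right) - P\left( Z < \frac{-x - 5}{4} \right) = 0.99 \\
\text{implies} \quad & \text{normsdist}\left( \frac{x - 5}{4} \right) - \text{normsdist}\left( \frac{-x - 5}{4} \right) = 0.99 \\
\text{implies} \quad & x = 14.31 \leftarrow
\end{align*}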
Comment: I solved for x using MSexcel by dragging many values of x.
Alternatively, sketch the bell curve and notice that only the upper value matters.
That is, P(−x < X < x) ≈ P(X ≤ x) for x near the solution, because P(X ≤ −x) is negligible there.
Then P(Z ≤ (x − 5) / 4) = 0.99 implies that (x − 5) / 4 = 2.33,
which implies that x = 14.32.
____________________________________________________________
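Comment: Instead of trying many values of x by hand, the equation can be solved by bisection. The following Python sketch is an addition under stated assumptions: the function names, the search interval [0, 50], and the tolerance are illustrative choices, not part of the original key.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=5, sigma=4)

def coverage(x):
    """P(-x < X < x) for the normal X above."""
    return x_dist.cdf(x) - x_dist.cdf(-x)

def solve_symmetric_bound(target=0.99, lo=0.0, hi=50.0, tol=1e-8):
    """Bisection on x; coverage(x) increases with x for x >= 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if coverage(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_exact = solve_symmetric_bound()
x_upper_only = x_dist.inv_cdf(0.99)   # shortcut that ignores the lower tail

print(round(x_exact, 2), round(x_upper_only, 2))
```

The two values agree closely, which supports the shortcut of looking only at the upper tail for this problem.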
3. (Montgomery and Runger, Problem 4–68) Hits to a Web site are assumed to follow a
Poisson distribution with a mean of 10,000 per day. Let X denote the number of hits in
a day.
(a) Compute a normal approximation for the probability of more than 10,500 hits per
day.
____________________________________________________________
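The normal approximation uses mean 10,000 and standard deviation $\sqrt{10{,}000} = 100$.
\begin{align*}
P(X > 10{,}500) &= P(X > 10{,}500.5) && \text{(unneeded continuity correction)} \\
&= P\left( \frac{X - 10{,}000}{100} > \frac{10{,}500.5 - 10{,}000}{100} \right) && \text{same event} \\
&= P\left( Z > \frac{10{,}500.5 - 10{,}000}{100} \right) && \text{standard normal approximation} \\
&\approx P(Z > 5) && \text{simplify} \\
&\approx 0 && \text{Table II or normsdist}
\end{align*}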
Comment: Is this answer surprisingly small?
____________________________________________________________
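Comment: To see how small the tail probability is, the normal upper tail can be evaluated directly. This Python sketch is an illustrative addition; it uses math.erfc rather than one minus the cdf so that the far-tail value is not lost to rounding, and the function name normal_upper_tail is hypothetical.

```python
from math import erfc, sqrt

lam = 10_000                 # Poisson mean (hits per day)
mu, sigma = lam, sqrt(lam)   # normal approximation: mean lam, variance lam

def normal_upper_tail(x, mu, sigma):
    """P(X > x) for a normal X, via erfc for accuracy in the far tail."""
    z = (x - mu) / sigma
    return 0.5 * erfc(z / sqrt(2))

z = (10_500.5 - mu) / sigma
tail = normal_upper_tail(10_500.5, mu, sigma)
print(round(z, 3), tail)
```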
(b) Compute a normal approximation for x , where P(X > x ) = 0.01.
____________________________________________________________
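\begin{align*}
& P(X > x) = 0.01 \\
\text{implies} \quad & P\left( \frac{X - 10{,}000}{100} > \frac{x - 10{,}000}{100} \right) = 0.01 \\
\text{implies} \quad & P\left( Z > \frac{x - 10{,}000}{100} \right) = 0.01 \\
\text{implies} \quad & P\left( Z \le \frac{x - 10{,}000}{100} \right) = 0.99 \\
\text{implies} \quad & \Phi\left( \frac{x - 10{,}000}{100} \right) = 0.99 \\
\text{implies} \quad & \frac{x - 10{,}000}{100} = 2.33 \\
\text{implies} \quad & x = 10{,}000 + 2.33 \times 100 \\
\text{implies} \quad & x = 10{,}233
\end{align*}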
Comment: Is this answer consistent with the answer to Part (a)?
Comment: Can you explain each step?
____________________________________________________________
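Comment: The percentile can also be obtained from an inverse normal cdf. The Python sketch below is an addition, not the key's method; it assumes Python 3.8 or later for NormalDist.inv_cdf.

```python
from math import sqrt
from statistics import NormalDist

lam = 10_000
hits = NormalDist(mu=lam, sigma=sqrt(lam))   # normal approximation to Poisson(lam)

# P(X > x) = 0.01 is the same as P(X <= x) = 0.99.
x = hits.inv_cdf(0.99)
z = (x - lam) / sqrt(lam)

print(round(z, 2), round(x))
```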
4. Let X denote your instructor’s commute time. Suppose that X has a gamma distribution
with mean µX = 9 minutes and standard deviation σX = 2 minutes.
(a) Determine the parameter values λ and r that yield the desired mean and variance.
____________________________________________________________
The gamma mean is E(X ) = r / λ.
The gamma variance is $V(X) = r / \lambda^2$.
Setting these to the desired values yields
\[ \frac{r}{\lambda} = 9 \quad \text{and} \quad \frac{r}{\lambda^2} = 2^2 . \]
Solving for the parameter values r and λ yields
\[ r = \frac{E^2(X)}{V(X)} = \frac{9^2}{2^2} = 20.25 \leftarrow \]
and
\[ \lambda = \frac{E(X)}{V(X)} = \frac{9}{2^2} = 2.25 \leftarrow \]
____________________________________________________________
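Comment: The moment-matching step can be written as a small function. This Python sketch is an illustrative addition; the name gamma_params is hypothetical.

```python
def gamma_params(mean, sd):
    """Return (r, lam) for a gamma distribution with E(X) = r/lam and V(X) = r/lam**2."""
    variance = sd ** 2
    lam = mean / variance
    r = mean ** 2 / variance
    return r, lam

r, lam = gamma_params(mean=9, sd=2)
print(r, lam)
```

Substituting the returned values back into r / lam and r / lam**2 recovers the mean 9 and the variance 4.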
(b) Sketch the resulting pdf. (You can use MSexcel’s function "gammadist" and print, or
just trace from the monitor. Notice that help for "gammadist" mistakenly refers to
the probability mass function rather than the pdf.)
____________________________________________________________
The key to using "gammadist" is that MSexcel refers to parameters
α and β rather than r and λ.
Reading help indicates that α = r and that β = 1 / λ.
(Unfortunately, most distribution families have alternate parameterizations.)
I dragged x values 0, 0.5, 1.0, ..., 18 and for each computed "gammadist(x,20.25,1/2.25,0)".
The pdf has a mode at about x = 8.5 and visually disappears below 5 and above 13.5.
____________________________________________________________
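Comment: The pdf values can also be tabulated without MSexcel. The Python sketch below is an addition under stated assumptions: it uses the rate parameterization (r, λ) with r = 20.25 and λ = 2.25 from Part (a), evaluates the pdf on the log scale with math.lgamma, and uses the same grid 0, 0.5, ..., 18. The function name gamma_pdf is hypothetical.

```python
from math import exp, lgamma, log

r, lam = 20.25, 2.25   # parameter values from Part (a)

def gamma_pdf(x, r, lam):
    """Gamma pdf with shape r and rate lam, computed on the log scale to avoid overflow."""
    if x <= 0:
        return 0.0
    log_pdf = r * log(lam) + (r - 1) * log(x) - lam * x - lgamma(r)
    return exp(log_pdf)

xs = [0.5 * i for i in range(37)]          # 0, 0.5, ..., 18
ys = [gamma_pdf(x, r, lam) for x in xs]
grid_mode = xs[ys.index(max(ys))]          # grid point with the largest pdf value

print(grid_mode, round(max(ys), 3))
```

The grid point with the largest pdf value can be compared with the analytical mode (r − 1) / λ.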
5. (Montgomery and Runger, Problem 4–79) Let T denote the time to failure (in hours) of
fans in a personal computer. Assume that T has an exponential distribution with rate
λ = 0.00003.
(a) What are the units of λ?
____________________________________________________________
λ is in units of failures per hour.
____________________________________________________________
(b) Sketch the pdf of T . Label and scale the axes.
____________________________________________________________
Sketch a horizontal axis and a vertical axis.
Label the horizontal axis with t and the vertical axis with $f_T(t)$.
Scale the horizontal axis with zero and the mean E(T) = 1 / λ = 33,333.3 (hours).
Scale the vertical axis with zero and the mode value λ = 0.00003.
Sketch the exponential pdf. The center of gravity should be at E(T).
The pdf value at the mean should be about one-third ($e^{-1} \approx 0.37$) of the height at t = 0.
____________________________________________________________
(c) What proportion of the fans will last at least 10,000 hours?
____________________________________________________________
\[ P(T > 10{,}000) = 1 - P(T \le 10{,}000) = 1 - [1 - e^{-\lambda \cdot 10000}] = e^{-0.3} \approx 0.741 \leftarrow \]
____________________________________________________________
(d) Show the answer to Part (c) in your sketch of Part (b).
____________________________________________________________
Shade the area to the right of t = 10,000.
(It should be about 3/4 of the area under the pdf.)
____________________________________________________________
(e) If a particular fan has lasted 20,000 hours, what is the probability that it will last
beyond hour 30,000?
____________________________________________________________
P(T > 30,000 | T > 20,000) = P(T > 10,000) from the memoryless property.
Therefore, the answer is the same as from Part (c).
____________________________________________________________
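Comment: Parts (c) and (e) can be checked together. In the Python sketch below (an illustrative addition, with a hypothetical helper named survival), the conditional probability is computed from its definition rather than by quoting the memoryless property, so the agreement of the two numbers is a check on that property.

```python
from math import exp

lam = 0.00003   # failures per hour

def survival(t, lam=lam):
    """P(T > t) for an exponential time to failure with rate lam."""
    return exp(-lam * t)

p_part_c = survival(10_000)
# Conditional probability P(T > 30,000 | T > 20,000) computed from the definition.
p_part_e = survival(30_000) / survival(20_000)

print(round(p_part_c, 3), round(p_part_e, 3))
```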
6. (Montgomery and Runger, Problems 4–96 and 4–97) Calls to a telephone system arrive
according to a Poisson Process at a rate of five calls per minute.
(a) What is the name applied to the distribution and the parameter values of the time
until the tenth call?
____________________________________________________________
The Erlang family of distributions.
r = 10 (calls) and λ = 5 (calls per minute).
____________________________________________________________
(b) What is the mean time until the tenth call?
____________________________________________________________
Let T denote the time until the tenth call.
E(T ) = r / λ = 10 / 5 = 2 (minutes)
Comment. Does this answer seem intuitive?
____________________________________________________________
(c) What is the mean time between the ninth and tenth calls?
____________________________________________________________
Because the calls follow a Poisson process, the
times between calls are independent and exponential.
If $T_i$ denotes the time between the (i − 1)th and ith calls,
then $E(T_{10}) = 1 / \lambda = 1 / 5$ (minute).
Comment. Does this answer seem intuitive?
____________________________________________________________
(d) What is the probability that exactly four calls occur within one (randomly selected)
minute?
____________________________________________________________
Let N denote the number of calls in one minute.
Then N is Poisson with mean µ = λ t = 5 × 1 = 5 calls.
Therefore,
\[ P(N = 4) = f_N(4) = \frac{e^{-5} \, 5^4}{4!} = 0.175 \leftarrow \]
Comment: In MSexcel, you can use "poisson(4,5,0)".
____________________________________________________________
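Comment: The Poisson probability can also be computed outside MSexcel. This Python sketch is an illustrative addition; the function name poisson_pmf is hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(N = k) for a Poisson count with mean mu."""
    return exp(-mu) * mu ** k / factorial(k)

rate = 5       # calls per minute
minutes = 1
p_four = poisson_pmf(4, rate * minutes)

print(round(p_four, 4))
```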
7. (Montgomery and Runger, Problem 4–111) Assume that the life of a roller bearing
follows a Weibull distribution with parameters β = 2 and δ = 10,000 hours.
(a) Determine the probability that a bearing lasts at least 8000 hours. (If you use the
"weibull" function in MSexcel, you need to convert to its parameters α and β.)
____________________________________________________________
Let X denote the life of a (randomly selected) roller bearing.
Then the cdf is $F_X(x) = P(X \le x) = 1 - \exp(-(x / \delta)^\beta)$.
Therefore,
\[ P(X > 8000) = 1 - P(X \le 8000) = 1 - [1 - \exp(-(8000 / \delta)^\beta)] = \exp(-0.8^2) = e^{-0.64} \approx 0.53 \leftarrow \]
____________________________________________________________
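Comment: The Weibull survival probability follows directly from the cdf. The Python sketch below is an illustrative addition using the (β, δ) parameterization of the problem statement; the function name weibull_survival is hypothetical.

```python
from math import exp

beta, delta = 2, 10_000   # Weibull shape and scale (hours)

def weibull_survival(x, beta=beta, delta=delta):
    """P(X > x) = exp(-(x/delta)**beta) for x >= 0."""
    return exp(-((x / delta) ** beta))

p_8000 = weibull_survival(8000)
print(round(p_8000, 3))
```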
(b) Determine the mean time to failure of a bearing using
(i) the result that Γ(r ) = (r − 1)! when r is a positive integer.
____________________________________________________________
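\[ E(X) = \delta \, \Gamma\left(1 + \frac{1}{\beta}\right) = (10{,}000) \, \Gamma(1.5) = (10{,}000) (1.5 - 1) \, \Gamma(1.5 - 1) \]
\[ = (10{,}000) (0.5) \sqrt{\pi} = (5000) (1.772) = 8{,}862.3 \text{ (hours)}. \]
Comment: The second step uses the recursion Γ(r) = (r − 1) Γ(r − 1) and the value Γ(0.5) = √π.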
____________________________________________________________
(ii) using MSexcel’s function "gammaln", which calculates the natural logarithm of
the gamma function. (Notice that the gamma function can yield huge values,
since it is a generalization of the factorial function. MSexcel returns the
logarithm, which is more tractable. You need to then use "exp(gammaln(r ))".)
____________________________________________________________
\[ E(X) = \delta \, \Gamma\left(1 + \frac{1}{\beta}\right) = (10{,}000) \, \Gamma(1.5) = (10{,}000) \exp[\text{gammaln}(1.5)] \]
\[ = (10{,}000) \exp(-0.1208) = (10{,}000) (0.88623) = 8{,}862.3 \text{ (hours)}. \]
____________________________________________________________
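Comment: Both routes to the Weibull mean can be compared numerically. This Python sketch is an illustrative addition: math.lgamma is used here as the counterpart of MSexcel's "gammaln", and math.gamma evaluates the gamma function directly.

```python
from math import exp, gamma, lgamma, pi, sqrt

beta, delta = 2, 10_000

arg = 1 + 1 / beta                         # 1.5
mean_direct = delta * gamma(arg)           # gamma function evaluated directly
mean_via_log = delta * exp(lgamma(arg))    # same idea as exp(gammaln(r)) in MSexcel
mean_closed = delta * 0.5 * sqrt(pi)       # uses Gamma(1.5) = 0.5 * sqrt(pi)

print(round(mean_direct, 1), round(mean_via_log, 1), round(mean_closed, 1))
```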
8. Let X denote your instructor’s commute time. Suppose that X has a triangular distribution
with minimum at 6 minutes, mode at 8.5 minutes, and maximum at 15 minutes.
(a) Sketch the resulting pdf.
____________________________________________________________
Sketch a horizontal axis and a vertical axis.
Label the horizontal axis with x and the vertical axis with $f_X(x)$.
Scale the horizontal axis with 6 and 8.5 and 15.
Scale the vertical axis with zero and the mode value 2 / (15 − 6) ≈ 0.22
Sketch the triangular pdf, on the left increasing linearly from 6 to 8.5
and decreasing linearly from 8.5 to 15.
____________________________________________________________
(b) Determine the mean commute time.
____________________________________________________________
From the Concise Notes, the mean is
\[ E(X) = \frac{a + m + b}{3} = \frac{6 + 8.5 + 15}{3} = \frac{29.5}{3} = 9\tfrac{5}{6} \approx 9.83 \text{ (minutes)} \leftarrow \]
Comment: Do you know how to obtain this formula for the triangular mean?
____________________________________________________________
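Comment: One way to check the mean formula numerically (this does not derive it) is to integrate x f(x) over [a, b]. The Python sketch below is an addition under stated assumptions: the triangular pdf is written from the standard piecewise-linear definition, and the midpoint rule with 100,000 steps is an arbitrary choice. The function names are hypothetical.

```python
a, m, b = 6.0, 8.5, 15.0   # minimum, mode, maximum (minutes)

def triangular_pdf(x):
    """Triangular pdf: rises linearly on [a, m], falls linearly on [m, b]."""
    if a <= x <= m:
        return 2 * (x - a) / ((b - a) * (m - a))
    if m < x <= b:
        return 2 * (b - x) / ((b - a) * (b - m))
    return 0.0

def midpoint_integral(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

mean_formula = (a + m + b) / 3
mean_numeric = midpoint_integral(lambda x: x * triangular_pdf(x), a, b)

print(round(mean_formula, 4), round(mean_numeric, 4))
```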
(c) Determine P(X > 10).
____________________________________________________________
From the Concise Notes, to the right of the mode m = 8.5, the triangular cdf is
\[ F_X(x) = 1 - \frac{(b - x)(x - b)}{(b - a)(m - b)} = 1 - \frac{(b - x)^2}{(b - a)(b - m)} . \]
For our parameter values,
\[ F_X(x) = 1 - \frac{(15 - x)^2}{(15 - 6)(15 - 8.5)} = 1 - \frac{(15 - x)^2}{58.5} . \]
Therefore,
\[ P(X > 10) = 1 - P(X \le 10) = 1 - \left[ 1 - \frac{(15 - 10)^2}{58.5} \right] = \frac{25}{58.5} \approx 0.427 \leftarrow \]
____________________________________________________________
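Comment: The tail probability can be checked with a short cdf function. This Python sketch is an illustrative addition. The branch to the right of the mode is the formula quoted from the Concise Notes; the branch to the left of the mode is the standard triangular cdf and is an assumption here, since the key does not state it. The function name triangular_cdf is hypothetical.

```python
a, m, b = 6.0, 8.5, 15.0   # minimum, mode, maximum (minutes)

def triangular_cdf(x):
    """Triangular cdf, using separate formulas on each side of the mode m."""
    if x <= a:
        return 0.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    if x < b:
        return 1 - (b - x) ** 2 / ((b - a) * (b - m))
    return 1.0

p_over_10 = 1 - triangular_cdf(10)
print(round(p_over_10, 3))
```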
(d) Shade in the sketch of Part (a) your answer to Part (c).
____________________________________________________________
Shade the area to the right of x = 10.
Is the area approximately correct?
____________________________________________________________