Download Example

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Statistics wikipedia , lookup

History of statistics wikipedia , lookup

Ars Conjectandi wikipedia , lookup

Probability interpretations wikipedia , lookup

Probability wikipedia , lookup

Transcript
Navidi-38615
book
December 4, 2008
Chapter
15:53
Confirming Pages
4
Commonly Used
Distributions
Introduction
Statistical inference involves drawing a sample from a population and analyzing the
sample data to learn about the population. In many situations, one has some knowledge
about the probability mass function or probability density function of the population.
In particular, the probability mass or density function can often be well approximated
by one of several standard families of curves, or functions. In this chapter, we describe
some of these standard families, and for each one we describe some conditions under
which it is an appropriate model.
4.1 The Binomial Distribution
Imagine an experiment that can result in one of two outcomes. One outcome is labeled
“success” and the other outcome is labeled “failure.” The probability of success is denoted
by p. The probability of failure is therefore 1 − p. Such a trial is called a Bernoulli trial
with success probability p. The simplest Bernoulli trial is the toss of a coin. The two
outcomes are heads and tails. If we define heads to be the success outcome, then p is the
probability that the coin comes up heads. For a fair coin, p = 1/2. Another example of
a Bernoulli trial is the selection of a component from a population of components, some
of which are defective. If we define “success” as a defective component, then p is the
proportion of defective components in the population.
Many experiments consist of performing a sequence of Bernoulli trials. For example,
we might sample several components from a very large lot and count the number of
defectives among them. This amounts to conducting several independent Bernoulli trials
and counting the number of successes. The number of successes is then a random variable,
which is said to have a binomial distribution.
119
Navidi-38615
120
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
We now present a formal description of the binomial distribution. Assume that a
series of n Bernoulli trials is conducted, each with the same success probability p.
Assume further that the trials are independent—that is, that the outcome of one trial
does not influence the outcomes of any of the other trials. Let the random variable X
equal the number of successes in these n trials. Then X is said to have the binomial
distribution with parameters n and p. The notation is X ∼ Bin(n, p). X is a discrete
random variable, and its possible values are 0, 1, . . . , n.
Summary
If a total of n Bernoulli trials are conducted, and
■
■
■
The trials are independent
Each trial has the same success probability p
X is the number of successes in the n trials
then X has the binomial distribution with parameters n and p, denoted
X ∼ Bin(n, p).
Example
4.1
A fair coin is tossed 10 times. Let X be the number of heads that appear. What is the
distribution of X ?
Solution
There are 10 independent Bernoulli trials, each with success probability p = 0.5.
The random variable X is equal to the number of successes in the 10 trials. Therefore
X ∼ Bin(10, 0.5).
Probability Mass Function of a Binomial Random Variable
We now derive the probability mass function of a binomial random variable by considering an example. A biased coin has probability 0.6 of coming up heads. The coin is tossed
three times. Let X be the number of heads. Then X ∼ Bin(3, 0.6). We will compute
P(X = 2).
There are three arrangements of two heads in three tosses of a coin, HHT, HTH,
and THH. We first compute the probability of HHT. The event HHT is a sequence of
independent events: H on the first toss, H on the second toss, T on the third toss. We
know the probabilities of each of these events separately:
P(H on the first toss) = 0.6, P(H on the second toss) = 0.6, P(T on the third toss) = 0.4
Since the events are independent, the probability that they all occur is equal to the
product of their probabilities (Equation 3.11 in Section 3.2). Thus
P(HHT) = (0.6)(0.6)(0.4) = (0.6)2 (0.4)1
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.1
The Binomial Distribution
121
Similarly, P(HTH) = (0.6)(0.4)(0.6) = (0.6)2 (0.4)1 , and P(THH) = (0.4)(0.6)(0.6) =
(0.6)2 (0.4)1 . It is easy to see that all the different arrangements of two heads and one
tail have the same probability. Now
P(X = 2) = P(HHT or HTH or THH)
= P(HHT) + P(HTH) + P(THH)
= (0.6)2 (0.4)1 + (0.6)2 (0.4)1 + (0.6)2 (0.4)1
= 3(0.6)2 (0.4)1
Examining this result, we see that the number 3 represents the number of arrangements of two successes (heads) and one failure (tails), 0.6 is the success probability p,
the exponent 2 is the number of successes, 0.4 is the failure probability 1 − p, and the
exponent 1 is the number of failures.
We can now generalize this result to produce a formula for the probability of x
successes in n independent Bernoulli trials with success probability p, in terms of x, n,
and p. In other words, we can compute P(X = x) where X ∼ Bin(n, p). We can
see that
P(X = x) = (number of arrangements of x successes in n trials) · p x (1 − p)n−x
(4.1)
All we need to do now is to provide an expression for the number of arrangements
of x successes in n trials. To describe this number, we need factorial notation. For any
positive integer n, the quantity n! (read “n factorial”) is the number
n! = (n)(n − 1)(n − 2) · · · (3)(2)(1)
We also define 0! = 1. The number of arrangements of x successes in n trials is
n!/[x!(n − x)!]. We can now define the probability mass function for a binomial random
variable.
If X ∼ Bin(n, p), the probability mass function of X is
p(x) = P(X = x) =





n!
p x (1 − p)n−x
x!(n − x)!
x = 0, 1, . . . , n
0
otherwise
(4.2)
Figure 4.1 (page 122) presents probability histograms for the Bin(10, 0.4) and
Bin(20, 0.1) probability mass functions.
Navidi-38615
book
122
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
0.3
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0
0
1
2
3
4
5
(a)
6
7
8
9
10
0
0
2
4
6
8
10
12
14
16
18
20
(b)
FIGURE 4.1 (a) The Bin(10, 0.4) probability histogram. (b) The Bin(20, 0.1) probability histogram.
Example
4.2
The probability that a newborn baby is a girl is approximately 0.49. Find the probability
that of the next five single births in a certain hospital, no more than two are girls.
Solution
Each birth is a Bernoulli trial with success probability 0.49. Let X denote the number
of girls in 5 births. Then X ∼ Bin(5, 0.49). We need to find P(X ≤ 2). Using the
probability mass function,
P(X ≤ 2) = P(X = 0 or X = 1 or X = 2)
= P(X = 0) + P(X = 1) + P(X = 2)
5!
5!
(0.49)0 (0.51)5−0 +
(0.49)1 (0.51)5−1
=
0!(5 − 0)!
1!(5 − 1)!
5!
+
(0.49)2 (0.51)5−2
2!(5 − 2)!
= 0.0345 + 0.1657 + 0.3185
= 0.5187
Sampling from a Finite Population
A manufacturing process produces 10,000 stainless steel rivets during the course of a
day. It is known that 8% of the rivets manufactured by this process fail to meet a strength
specification. A simple random sample of 20 rivets will be packaged for shipment to
a customer. A quality engineer wants to know the probability that the shipment will
contain exactly three defective rivets.
To compute this probability, we can use the binomial distribution. Each of the 20
rivets in the shipment is a trial. A “success” occurs if the rivet is defective. It is clear
that each trial has a success probability of 0.08. We want to know the probability that we
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.1
The Binomial Distribution
123
observe three successes in 20 trials. In order to use the binomial distribution, we must
be sure that the trials are independent. To see that this is so, recall from the discussion
of independence in Section 1.1 that when drawing a sample from a finite population, the
sample items may be treated as independent if the sample comprises less than 5% of the
population. Therefore, if X represents the number of defective rivets in the sample of 20,
P(X = 3) =
20!
(0.08)3 (0.92)17 = 0.1414
3!17!
Summary
Assume that a finite population contains items of two types, successes and failures, and that a simple random sample is drawn from the population. Then if the
sample size is no more than 5% of the population, the binomial distribution may
be used to model the number of successes.
Example
4.3
A lot contains several thousand components, 10% of which are defective. Nine components are sampled from the lot. Let X represent the number of defective components in
the sample. Find the probability that exactly two are defective.
Solution
Since the sample size is small compared to the population (i.e., less than 5%), the number
of successes in the sample approximately follows a binomial distribution. Therefore
we model X with the Bin(9, 0.1) distribution. The probability that exactly two are
defective is
9!
P(X = 2) =
(0.1)2 (0.9)7 = 0.1722
2!(9 − 2)!
Using the Binomial Table
Table A.1 (in Appendix A) presents binomial probabilities of the form P(X ≤ x) for
n ≤ 20 and selected values of p. Examples 4.4 and 4.5 illustrate the use of this table.
Example
4.4
Of all the new vehicles of a certain model that are sold, 20% require repairs to be
done under warranty during the first year of service. A particular dealership sells 14
such vehicles. What is the probability that fewer than five of them require warranty
repairs?
Solution
Let X represent the number of vehicles that require warranty repairs. Then
X ∼ Bin(14, 0.2). The probability that fewer than five vehicles require warranty repairs is P(X ≤ 4). We consult Table A.1 with n = 14, p = 0.2, and x = 4. We find that
P(X ≤ 4) = 0.870.
Navidi-38615
book
124
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Sometimes the best way to compute the probability of an event is to compute the
probability that the event does not occur, and then subtract from 1. Example 4.5 provides
an illustration.
Example
4.5
Refer to Example 4.4. What is the probability that more than 2 of the 14 vehicles require
warranty repairs?
Solution
Let X represent the number of invoices in the sample that receive discounts. We wish
to compute the probability P(X > 2). Table A.1 presents probabilities of the form
P(X ≤ x). We note that P(X > 2) = 1 − P(X ≤ 2). Consulting the table with n = 14,
p = 0.2, x = 2, we find that P(X ≤ 2) = 0.448. Therefore P(X > 2) = 1 − 0.448 =
0.552.
The Mean and Variance of a Binomial Random Variable
With a little thought, it is easy to see how to compute the mean of a binomial random
variable. For example, if a fair coin is tossed 10 times, we expect on the average to see
five heads. The number 5 comes from multiplying the success probability (0.5) by the
number of trials (10). This method works in general. If we perform n Bernoulli trials,
each with success probability p, the mean number of successes is np. Therefore, if
X ∼ Bin(n, p), then µ X = np. The variance of a binomial random variable is not so
intuitive; it turns out to be np(1 − p).
Summary
If X ∼ Bin(n, p), then the mean and variance of X are given by
µ X = np
(4.3)
σ X2 = np(1 − p)
(4.4)
Exercises for Section 4.1
1. Let X ∼ Bin(10, 0.6). Find
a. P(X = 3)
b. P(X = 6)
c. P(X ≤ 4)
d. P(X > 8)
e. µ X
f. σ X2
2. Let X ∼ Bin(5, 0.3). Find
a. P(X < 3)
b. P(X ≥ 1)
c. P(1 ≤ X ≤ 3)
d. P(2 < X < 5)
e. P(X = 0)
f. P(X = 3)
g. µ X
h. σ X2
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
4.1
3. Find the following probabilities:
a.
b.
c.
d.
P(X = 7) when X ∼ Bin(13, 0.4)
P(X ≥ 2) when X ∼ Bin(8, 0.4)
P(X < 5) when X ∼ Bin(6, 0.7)
P(2 ≤ X ≤ 4) when X ∼ Bin(7, 0.1)
4. Ten percent of the items in a large lot are defective. A sample of six items is drawn from this
lot.
a. Find the probability that none of the sampled
items are defective.
b. Find the probability that one or more of the sampled items is defective.
c. Find the probability that exactly one of the sampled items is defective.
d. Find the probability that fewer than two of the
sampled items are defective.
5. Of all the weld failures in a certain assembly, 85%
of them occur in the weld metal itself, and the remaining 15% occur in the base metal. A sample of
20 weld failures is examined.
a. What is the probability that exactly five of them
are base metal failures?
b. What is the probability that fewer than four of
them are base metal failures?
c. What is the probability that none of them are base
metal failures?
d. Find the mean number of base metal failures.
e. Find the standard deviation of the number of base
metal failures.
6. A large industrial firm allows a discount on any invoice that is paid within 30 days. Of all invoices, 10%
receive the discount. In a company audit, 12 invoices
are sampled at random.
a. What is the probability that exactly four of them
receive the discount?
b. What is the probability that fewer than three of
them receive the discount?
c. What is the probability that none of them receive
the discount?
d. Find the mean number that receive the discount.
e. Find the standard deviation of the number that
receive the discount.
7. A fair coin is tossed eight times.
The Binomial Distribution
125
a. What is the probability of obtaining exactly five
heads?
b. Find the mean number of heads obtained.
c. Find the variance of the number of heads obtained.
d. Find the standard deviation of the number of
heads obtained.
8. In a large shipment of automobile tires, 10% have a
flaw. Four tires are chosen at random to be installed
on a car.
a. What is the probability that none of the tires have
a flaw?
b. What is the probability that exactly one of the
tires has a flaw?
c. What is the probability that one or more of the
tires has a flaw?
9. Of the bolts manufactured for a certain application,
85% meet the length specification and can be used
immediately, 10% are too long and can be used after
being cut, and 5% are too short and must be scrapped.
a. Find the probability that a randomly selected bolt
can be used (either immediately or after being
cut).
b. Find the probability that fewer than 9 out of a sample of 10 bolts can be used (either immediately or
after being cut).
10. A distributor receives a large shipment of components. The distributor would like to accept the shipment if 10% or fewer of the components are defective
and to return it if more than 10% of the components
are defective. She decides to sample 10 components
and to return the shipment if more than 1 of the 10 is
defective. If the proportion of defectives in the batch
is in fact 10%, what is the probability that she will
return the shipment?
11. A k out of n system is one in which there is a group
of n components, and the system will function if at
least k of the components function. Assume the components function independently of one another.
a. In a 3 out of 5 system, each component has probability 0.9 of functioning. What is the probability
that the system will function?
b. In a 3 out of n system, in which each component has probability 0.9 of functioning, what is
the smallest value of n needed so that the probability that the system functions is at least 0.90?
Navidi-38615
book
126
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
12. Refer to Exercise 11 for the definition of a k out of
n system. For a certain 4 out of 6 system, assume
that on a rainy day each component has probability
0.7 of functioning and that on a nonrainy day each
component has probability 0.9 of functioning.
a. What is the probability that the system functions
on a rainy day?
b. What is the probability that the system functions
on a nonrainy day?
c. Assume that the probability of rain tomorrow is
0.20. What is the probability that the system will
function tomorrow?
13. A certain large shipment comes with a guarantee
that it contains no more than 15% defective items.
If the proportion of defective items in the shipment
is greater than 15%, the shipment may be returned.
You draw a random sample of 10 items. Let X be the
number of defective items in the sample.
a. If in fact 15% of the items in the shipment are
defective (so that the shipment is good, but just
barely), what is P(X ≥ 7)?
b. Based on the answer to part (a), if 15% of the items
in the shipment are defective, would 7 defectives in
a sample of size 10 be an unusually large number?
c. If you found that 7 of the 10 sample items were
defective, would this be convincing evidence that
the shipment should be returned? Explain.
d. If in fact 15% of the items in the shipment are
defective, what is P(X ≥ 2)?
e. Based on the answer to part (d), if 15% of the items
in the shipment are defective, would 2 defectives in
a sample of size 10 be an unusually large number?
f. If you found that 2 of the 10 sample items were
defective, would this be convincing evidence that
the shipment should be returned? Explain.
14. An insurance company offers a discount to homeowners who install smoke detectors in their homes.
A company representative claims that 80% or more
of policyholders have smoke detectors. You draw
a random sample of eight policyholders. Let X be
the number of policyholders in the sample who have
smoke detectors.
a. If exactly 80% of the policyholders have smoke
detectors (so the representative’s claim is true, but
just barely), what is P(X ≤ 1)?
b. Based on the answer to part (a), if 80% of the
policyholders have smoke detectors, would one
policyholder with a smoke detector in a sample of
size 8 be an unusually small number?
c. If you found that exactly one of the eight sample
policyholders had a smoke detector, would this
be convincing evidence that the claim is false?
Explain.
d. If exactly 80% of the policyholders have smoke
detectors, what is P(X ≤ 6)?
e. Based on the answer to part (d), if 80% of the policyholders have smoke detectors, would six policyholders with smoke detectors in a sample of
size 8 be an unusually small number?
f. If you found that exactly six of the eight sample policyholders had smoke detectors, would this
be convincing evidence that the claim is false?
Explain.
15. A message consists of a string of bits (0s and 1s).
Due to noise in the communications channel, each
bit has probability 0.3 of being reversed (i.e., a 1 will
be changed to a 0 or a 0 to a 1). To improve the accuracy of the communication, each bit is sent five
times, so, for example, 0 is sent as 00000. The receiver assigns the value 0 if three or more of the
bits are decoded as 0, and 1 if three or more of
the bits are decoded as 1. Assume that errors occur
independently.
a. A 0 is sent (as 00000). What is the probability
that the receiver assigns the correct value
of 0?
b. Assume that each bit is sent n times, where n is
an odd number, and that the receiver assigns the
value decoded in the majority of the bits. What
is the minimum value of n necessary so that the
probability that the correct value is assigned is at
least 0.90?
16. One design for a system requires the installation of
two identical components. The system will work if
at least one of the components works. An alternative design requires four of these components, and
the system will work if at least two of the four components work. If the probability that a component
works is 0.9, and if the components function independently, which design has the greater probability
of functioning?
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.2
The Poisson Distribution
127
4.2 The Poisson Distribution
The Poisson distribution arises frequently in scientific work. We will introduce this
distribution by describing a classic application in physics: the number of particles emitted
by a radioactive mass.
A mass contains 10,000 atoms of a radioactive substance. The probability that a given
atom will decay in a one-minute time period is 0.0002. Let X represent the number of
atoms that decay in one minute. Now each atom can be thought of as a Bernoulli trial,
where success occurs if the atom decays. Thus X is the number of successes in 10,000
independent Bernoulli trials, each with success probability 0.0002, so the distribution of
X is Bin(10,000, 0.0002).
Now assume that we want to compute the probability that exactly three atoms from
this mass decay in one minute. Using the binomial probability mass function, we would
compute as follows:
P(X = 3) =
10,000!
(0.0002)3 (0.9998)9997 = 0.18047
3! 9997!
Computing the binomial probability involves some large factorials and large exponents. There is a simpler calculation that yields almost exactly the same result. Specifically, if n is large and p is small, and we let λ = np, it can be shown by advanced
methods that for any non-negative integer x,
n!
λx
p x (1 − p)n−x ≈ e−λ
x!(n − x)!
x!
(4.5)
We are led to define a new probability mass function, called the Poisson probability
mass function. The Poisson probability mass function is defined by
p(x) = P(X = x) =


e−λ

0
λx
x!
if x is a non-negative integer
(4.6)
otherwise
If X is a random variable whose probability mass function is given by Equation (4.6),
then X is said to have the Poisson distribution with parameter λ. The notation is
X ∼ Poisson(λ).
For the radioactive mass just described, we would use the Poisson mass function to
approximate P(X = x) by substituting λ = (10, 000)(0.0002) = 2 into Equation (4.6).
The result is
P(X = 3) = e−2
23
= 0.18045
3!
The Poisson probability agrees with the binomial probability to four decimal
places.
Navidi-38615
128
book
December 4, 2008
CHAPTER 4
Example
4.6
Confirming Pages
15:53
Commonly Used Distributions
If X ∼ Poisson(5), compute P(X = 3), P(X = 8), P(X = 0), P(X = −1), and
P(X = 0.5).
Solution
Using the probability mass function (4.6), with λ = 5, we obtain
Example
4.7
P(X = 3) = e−5
53
= 0.1404
3!
P(X = 8) = e−5
58
= 0.0653
8!
P(X = 0) = e−5
50
= 0.0067
0!
P(X = −1) = 0
because −1 is not a non-negative integer
P(X = 0.5) = 0
because 0.5 is not a non-negative integer
If X ∼ Poisson(4), compute P(X ≤ 2) and P(X > 1).
Solution
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
= e−4
40
41
42
+ e−4 + e−4
0!
1!
2!
= 0.0183 + 0.0733 + 0.1465
= 0.2381
To find P(X > 1), we might try to start by writing
P(X > 1) = P(X = 2) + P(X = 3) + · · ·
This leads to an infinite sum that is difficult to compute. Instead, we write
P(X > 1) = 1 − P(X ≤ 1)
= 1 − [P(X = 0) + P(X = 1)]
0
1
−4 4
−4 4
= 1− e
+e
0!
1!
= 1 − (0.0183 + 0.0733)
= 0.9084
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.2
The Poisson Distribution
129
Summary
If X ∼ Poisson(λ), then
■
■
■
■
X is a discrete random variable whose possible values are the non-negative
integers.
The parameter λ is a positive constant.
The probability mass function of X is

x
 e−λ λ
if x is a non-negative integer
x!
p(x) = P(X = x) =

0
otherwise
The Poisson probability mass function is very close to the binomial
probability mass function when n is large, p is small, and λ = np.
The Mean and Variance of a Poisson Random Variable
The mean and variance of a Poisson random variable can be computed by using the
probability mass function along with the definitions given by Equations (3.13) and (3.14)
(in Section 3.3). It turns out that the mean and variance are both equal to λ.
Mean and Variance of a Poisson Random Variable
If X ∼ Poisson(λ), then the mean and variance of X are given by
µX = λ
(4.7)
σ X2 = λ
(4.8)
Figure 4.2 presents probability histograms for the Poisson(1) and Poisson(10)
probability mass functions.
0.35
0.35
0.3
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0
0
0
2
4
6
8 10 12 14 16 18 20 22 24
(a)
0 2 4
6 8 10 12 14 16 18 20 22 24
(b)
FIGURE 4.2 (a) The Poisson(1) probability histogram. (b) The Poisson(10) probability histogram.
Navidi-38615
130
book
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
One of the earliest industrial uses of the Poisson distribution involved an application
to the brewing of beer. A crucial step in the brewing process is the addition of yeast culture
to prepare mash for fermentation. The living yeast cells are kept suspended in a liquid
medium. Because the cells are alive, their concentration in the medium changes over time.
Therefore, just before the yeast is added, it is necessary to estimate the concentration of
yeast cells per unit volume of suspension, so as to be sure to add the right amount.
Up until the early part of the twentieth century, this posed a problem for brewers.
They estimated the concentration in the obvious way, by withdrawing a small volume
of the suspension and counting the yeast cells in it under a microscope. Of course, the
estimates determined this way were subject to random variation, but no one knew how
to measure the likely size of this variation. Therefore no one knew by how much the
concentration in the sample was likely to differ from the actual concentration.
William Sealy Gosset, a young man in his mid-twenties who was employed by the
Guinness Brewing Company of Dublin, Ireland, discovered in 1904 that the number of
yeast cells in a sampled volume of suspension follows a Poisson distribution. He was
then able to develop methods to measure the concentration of yeast cells with greater
precision. Gosset’s discovery not only enabled Guinness to produce a more consistent
product, it showed that the Poisson distribution could have important applications in many
situations. Gosset wanted to publish his result, but his managers at Guinness considered
his discovery to be proprietary information. They required him to use a pseudonym, so
that competing breweries would not realize how useful the results could be. Gosset chose
the pseudonym “Student.”
In Example 4.8, we will follow a train of thought that leads to Student’s result. Before
we get to it though, we will mention that shortly after publishing this result, Student
made another discovery that solved one of the most important outstanding problems in
statistics, which has profoundly influenced work in virtually all fields of science ever
since. We will discuss this result in Section 5.4.
Example
4.8
Particles (e.g., yeast cells) are suspended in a liquid medium at a concentration of 10
particles per mL. A large volume of the suspension is thoroughly agitated, and then 1 mL
is withdrawn. What is the probability that exactly eight particles are withdrawn?
Solution
So long as the volume withdrawn is a small fraction of the total, the solution to this
problem does not depend on the total volume of the suspension but only on the concentration of particles in it. Let V be the total volume of the suspension, in mL. Then the
total number of particles in the suspension is 10V . Think of each of the 10V particles
as a Bernoulli trial. A particle “succeeds” if it is withdrawn. Now 1 mL out of the total
of V mL is to be withdrawn. Therefore, the amount to be withdrawn is 1/V of the total,
so it follows that each particle has probability 1/V of being withdrawn. Let X denote
the number of particles withdrawn. Then X represents the number of successes in 10V
Bernoulli trials, each with probability 1/V of success. Therefore X ∼ Bin(10V, 1/V ).
Since V is large, 10V is large and 1/V is small. Thus, to a very close approximation,
X ∼ Poisson(10). We compute P(X = 8) with the Poisson probability mass function:
P(X = 8) = e−10 108 /8! = 0.1126.
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
4.2
The Poisson Distribution
131
In Example 4.8, λ had the value 10 because the mean number of particles in 1 mL
of suspension (the volume withdrawn) was 10.
Example
4.9
Particles are suspended in a liquid medium at a concentration of 6 particles per mL. A
large volume of the suspension is thoroughly agitated, and then 3 mL are withdrawn.
What is the probability that exactly 15 particles are withdrawn?
Solution
Let X represent the number of particles withdrawn. The mean number of particles in a
3 mL volume is 18. Therefore X ∼ Poisson(18). The probability that exactly 15 particles
are withdrawn is
1815
P(X = 15) = e−18
15!
= 0.0786
Note that for the solutions to Examples 4.8 and 4.9 to be correct, it is important
that the amount of suspension withdrawn not be too large a fraction of the total. For
example, if the total volume in Example 4.9 was 3 mL, so that the entire amount was
withdrawn, it would be certain that all 18 particles would be withdrawn, so the probability
of withdrawing 15 particles would be zero.
Example
4.10
Grandma bakes chocolate chip cookies in batches of 100. She puts 300 chips into the
dough. When the cookies are done, she gives you one. What is the probability that your
cookie contains no chocolate chips?
Solution
This is another instance of particles in a suspension. Let X represent the number of chips
in your cookie. The mean number of chips is 3 per cookie, so X ∼ Poisson(3). It follows
that P(X = 0) = e−3 30 /0! = 0.0498.
Examples 4.8 and 4.9 show that for particles distributed uniformly at random
throughout a medium, the number of particles that happen to fall in a small portion
of the medium follows a Poisson distribution. In these examples, the particles were
actual particles, and the medium was spatial in nature. In many cases, however, the
“particles” represent events, and the medium is time. We saw such an example previously, where the number of radioactive decay events in a fixed time interval turned out
to follow a Poisson distribution. Example 4.11 presents another.
Example
4.11
The number of email messages received by a computer server follows a Poisson distribution with a mean of 6 per minute. Find the probability that exactly 20 messages will
be received in the next 3 minutes.
Navidi-38615
book
132
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Solution
Let X be the number of messages received in 3 minutes. The mean number of messages
received in 3 minutes is (6)(3) = 18, so X ∼ Poisson(18). Using the Poisson(18)
probability mass function, we find that
P(X = 20) = e−18
1820
20!
= 0.0798
Exercises for Section 4.2
1. Let X ∼ Poisson(3). Find
a.
b.
c.
d.
e.
f.
P(X
P(X
P(X
P(X
µX
σX
= 2)
= 0)
< 3)
> 2)
2. The concentration of particles in a suspension is 4
per mL. The suspension is thoroughly agitated, and
then 2 mL is withdrawn. Let X represent the number
of particles that are withdrawn. Find
a.
b.
c.
d.
e.
P(X = 6)
P(X ≤ 3)
P(X > 2)
µX
σX
3. Suppose that 0.2% of diodes in a certain application
fail within the first month of use. Let X represent the
number of diodes in a random sample of 1000 that
fail within the first month. Find
a.
b.
c.
d.
e.
P(X = 4)
P(X ≤ 1)
P(1 ≤ X < 4)
µX
σX
4. The number of flaws in a given area of aluminum foil
follows a Poisson distribution with a mean of 3 per
m2 . Let X represent the number of flaws in a 1 m2
sample of foil.
a.
b.
c.
d.
e.
f.
P(X
P(X
P(X
P(X
µX
σX
= 5)
= 0)
< 2)
> 1)
5. The number of hits on a certain website follows a
Poisson distribution with a mean rate of 4 per minute.
a. What is the probability that five messages are
received in a given minute?
b. What is the probability that 9 messages are
received in 1.5 minutes?
c. What is the probability that fewer than three
messages are received in a period of 30 seconds?
6. One out of every 5000 individuals in a population
carries a certain defective gene. A random sample of
1000 individuals is studied.
a. What is the probability that exactly one of the
sample individuals carries the gene?
b. What is the probability that none of the sample
individuals carries the gene?
c. What is the probability that more than two of the
sample individuals carry the gene?
d. What is the mean of the number of sample individuals that carry the gene?
e. What is the standard deviation of the number of
sample individuals that carry the gene?
7. A sensor network consists of a large number of
microprocessors spread out over an area, in
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
4.2
communication with each other and with a base
station. In a certain network, the probability that a
message will fail to reach the base station is 0.005.
Assume that during a particular day, 1000 messages
are sent.
a. What is the probability that exactly 3 of the messages fail to reach the base station?
b. What is the probability that fewer than 994 of the
messages reach the base station?
c. What is the mean of the number of messages that
fail to reach the base station?
d. What is the standard deviation of the number
of messages that fail to reach the base
station?
8. Geologists estimate the time since the most recent
cooling of a mineral by counting the number of uranium fission tracks on the surface of the mineral.
A certain mineral specimen is of such an age that
there should be an average of 6 tracks per cm2 of
surface area. Assume the number of tracks in an area
follows a Poisson distribution. Let X represent the
number of tracks counted in 1 cm2 of surface area.
Find
a.
b.
c.
d.
e.
P(X = 7)
P(X ≥ 3)
P(2 < X < 7)
µX
σX
9. A random variable X has a binomial distribution,
and a random variable Y has a Poisson distribution. Both X and Y have means equal to 3. Is it
possible to determine which random variable has
the larger variance? Choose one of the following
answers:
i. Yes, X has the larger variance.
ii. Yes, Y has the larger variance.
iii. No, we need to know the number of trials, n,
for X .
iv. No, we need to know the success probability,
p, for X .
v. No, we need to know the value of λ for Y .
10. You have received a radioactive mass that is claimed
to have a mean decay rate of at least 1 particle per
The Poisson Distribution
133
second. If the mean decay rate is less than 1 per
second, you may return the product for a refund.
Let X be the number of decay events counted in
10 seconds.
a. If the mean decay rate is exactly 1 per second
(so that the claim is true, but just barely), what is
P(X ≤ 1)?
b. Based on the answer to part (a), if the mean decay
rate is 1 particle per second, would one event in
10 seconds be an unusually small number?
c. If you counted one decay event in 10 seconds,
would this be convincing evidence that the product should be returned? Explain.
d. If the mean decay rate is exactly 1 per second,
what is P(X ≤ 8)?
e. Based on the answer to part (d), if the mean
decay rate is 1 particle per second, would eight
events in 10 seconds be an unusually small
number?
f. If you counted eight decay events in 10 seconds,
would this be convincing evidence that the product should be returned? Explain.
11. Someone claims that a certain suspension contains
at least seven particles per mL. You sample 1 mL
of solution. Let X be the number of particles in the
sample.
a. If the mean number of particles is exactly seven
per mL (so that the claim is true, but just barely),
what is P(X ≤ 1)?
b. Based on the answer to part (a), if the suspension contains seven particles per mL, would one
particle in a 1 mL sample be an unusually small
number?
c. If you counted one particle in the sample, would
this be convincing evidence that the claim is false?
Explain.
d. If the mean number of particles is exactly 7 per
mL, what is P(X ≤ 6)?
e. Based on the answer to part (d), if the suspension contains seven particles per mL, would six
particles in a 1 mL sample be an unusually small
number?
f. If you counted six particles in the sample, would
this be convincing evidence that the claim is false?
Explain.
Navidi-38615
134
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
4.3 The Normal Distribution
The normal distribution (also called the Gaussian distribution) is by far the most
commonly used distribution in statistics. This distribution provides a good model for
many, although not all, continuous populations. Part of the reason for this is the Central
Limit Theorem, which we shall discuss in Section 4.8.
The normal distribution is continuous rather than discrete. The mean of a normal
random variable may have any value, and the variance may have any positive value. The
probability density function of a normal random variable with mean µ and variance σ 2
is given by
1
2
2
f (x) = √ e−(x−µ) /(2σ )
(4.9)
σ 2π
If X is a random variable whose probability density function is normal with mean
µ and variance σ 2 , we write X ∼ N (µ, σ 2 ).
Summary
If X ∼ N (µ, σ 2 ), then the mean and variance of X are given by
µX = µ
σ X2 = σ 2
Figure 4.3 presents a plot of the normal probability density function with mean µ
and standard deviation σ . The normal probability density function is sometimes called
the normal curve. Note that the normal curve is symmetric around µ, so that µ is the
median as well as the mean. It is also the case that for any normal population
■
■
■
About 68% of the population is in the interval µ ± σ .
About 95% of the population is in the interval µ ± 2σ .
About 99.7% of the population is in the interval µ ± 3σ .
≈ 99.7%
≈ 95%
≈ 68%
m ⫺ 3s
m ⫺ 2s
m ⫺ 1s
m
m ⫹ 1s
m ⫹ 2s
m ⫹ 3s
FIGURE 4.3 Probability density function of a normal population with mean µ and
variance σ 2 .
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
4.3
The Normal Distribution
135
The proportion of a normal population that is within a given number of standard
deviations of the mean is the same for any normal population. For this reason, when
dealing with normal populations, we often convert from the units in which the population
items were originally measured to standard units. Standard units tell how many standard
deviations an observation is from the population mean.
Example
4.12
Resistances in a population of wires are normally distributed with mean 20 m and
standard deviation 3 m. The resistance of two randomly chosen wires are 23 m and
16 m. Convert these amounts to standard units.
Solution
A resistance of 23 m is 3 m more than the mean of 20, and 3 m is equal to one
standard deviation. So 23 m is one standard deviation above the mean and is thus
equivalent to one standard unit. A resistance of 16 m is 1.33 standard deviations below
the mean, so 16 m is equivalent to −1.33 standard units.
In general, we convert to standard units by subtracting the mean and dividing by the
standard deviation. Thus, if x is an item sampled from a normal population with mean
µ and variance σ 2 , the standard unit equivalent of x is the number z, where
x −µ
(4.10)
z=
σ
The number z is sometimes called the “z-score” of x. The z-score is an item sampled
from a normal population with mean 0 and standard deviation 1. This normal population
is called the standard normal population.
Example
4.13
The yield, in grams, of a dye manufacturing process is normally distributed with mean
1500 and standard deviation 50. The yield of a particular run is 1568 grams. Find the
z-score.
Solution
The quantity 1568 is an observation from a normal population with mean µ = 1500 and
standard deviation σ = 50. Therefore
1568 − 1500
z=
50
= 1.36
Example
4.14
Refer to Example 4.13. The yield of a certain run has a z-score of −1.4. Find the yield
in the original units of grams.
Solution
We use Equation (4.10), substituting −1.4 for z and solving for x. We obtain
x − 1500
−1.4 =
50
Solving for x yields x = 1430. The yield is 1430 grams.
Navidi-38615
136
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
The proportion of a normal population that lies within a given interval is equal to
the area under the normal probability density above that interval. This would suggest
that we compute these proportions by integrating the normal probability density given
in Equation (4.9). Interestingly enough, areas under this curve cannot be found by the
method, taught in elementary calculus, of computing the antiderivative of the function
and plugging in the limits of integration. This is because the antiderivative of this function
is an infinite series and cannot be written down exactly. Instead, areas under this curve
must be approximated numerically.
Areas under the standard normal curve (mean 0, variance 1) have been extensively
tabulated. A typical such table, called a standard normal table, or z table, is given as
Table A.2 (in Appendix A). To find areas under a normal curve with a different mean and
variance, we convert to standard units and use the z table. Table A.2 provides areas in the
left-hand tail of the curve for values of z. Other areas can be calculated by subtraction or
by using the fact that the total area under the curve is equal to 1. We now present several
examples to illustrate the use of the z table.
Example
4.15
Find the area under the normal curve to the left of z = 0.47.
Solution
From the z table, the area is 0.6808. See Figure 4.4.
0.6808
0 0.47
FIGURE 4.4 Solution to Example 4.15.
Example
4.16
Find the area under the normal curve to the right of z = 1.38.
Solution
From the z table, the area to the left of z = 1.38 is 0.9162. Therefore the area to the right
is 1 − 0.9162 = 0.0838. See Figure 4.5.
0.9162
0.0838
0
1.38
FIGURE 4.5 Solution to Example 4.16.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.3
Example
4.17
The Normal Distribution
137
Find the area under the normal curve between z = 0.71 and z = 1.28.
Solution
From the z table, the area to the left of z = 1.28 is 0.8997. The area to the left of z = 0.71 is
0.7611. The area between z = 0.71 and z = 1.28 is therefore 0.8997−0.7611 = 0.1386.
See Figure 4.6.
0.8997
0.7611
0.1386
–
0
=
1.28
0 0.71
0 0.71 1.28
FIGURE 4.6 Solution to Example 4.17.
Recall that the pth percentile of a sample is the value that divides the sample so
that p percent of the sample values are less than the pth percentile (see Section 1.2).
This idea extends to populations as well. The pth percentile of a population is the value
that divides the sample so that p percent of the population values are less than the pth
percentile. For a normal population, therefore, the pth percentile is the value for which
p percent of the area under the normal curve is to the left.
Example
4.18
What z-score corresponds to the 75th percentile of a normal distribution? The 25th
percentile? The median?
Solution
To answer this question, we use the z table in reverse. We need to find the z-score for
which 75% of the area of the curve is to the left. From the body of the table, the closest
area to 75% is 0.7486, corresponding to a z-score of 0.67. Therefore the 75th percentile
is approximately 0.67. By the symmetry of the curve, the 25th percentile is z = −0.67
(this can also be looked up in the table directly). See Figure 4.7. The median is z = 0.
≈ 25%
≈ 75%
0
0.67
⫺0.67
0
FIGURE 4.7 Solution to Example 4.18.
Example
4.19
Lifetimes of batteries in a certain application are normally distributed with mean 50 hours
and standard deviation 5 hours. Find the probability that a randomly chosen battery lasts
between 42 and 52 hours.
Navidi-38615
138
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Solution
Let X represent the lifetime of a randomly chosen battery. Then X ∼ N (50, 52 ). Figure 4.8 presents the probability density function of the N (50, 52 ) population. The shaded
area represents P(42 < X < 52), the probability that a randomly chosen battery has a
lifetime between 42 and 52 hours. To compute the area, we will use the z table. First we
need to convert the quantities 42 and 52 to standard units. We have
z=
42 − 50
= −1.60
5
z=
52 − 50
= 0.40
5
From the z table, the area to the left of z = −1.60 is 0.0548, and the area to the left of
z = 0.40 is 0.6554. The probability that a battery has a lifetime between 42 and 52 hours
is 0.6554 − 0.0548 = 0.6006.
0.6006
42
50 52
z = ⫺1.6
z = 0.4
FIGURE 4.8 Solution to Example 4.19.
Example
4.20
Refer to Example 4.19. Find the 40th percentile of battery lifetimes.
Solution
From the z table, the closest area to 0.4000 is 0.4013, corresponding to a z-score of
−0.25. The population of lifetimes has mean 50 and standard deviation 5. The 40th
percentile is the point 0.25 standard deviations below the mean. We find this value by
converting the z-score to a raw score, using Equation (4.10):
−0.25 =
x − 50
5
Solving for x yields x = 48.75. The 40th percentile of battery lifetimes is 48.75 hours.
See Figure 4.9.
≈ 40%
48.75 50
z = ⫺0.25
FIGURE 4.9 Solution to Example 4.20.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.3
Example
4.21
The Normal Distribution
139
A process manufactures ball bearings whose diameters are normally distributed with
mean 2.505 cm and standard deviation 0.008 cm. Specifications call for the diameter
to be in the interval 2.5 ± 0.01 cm. What proportion of the ball bearings will meet the
specification?
Solution
Let X represent the diameter of a randomly chosen ball bearing. Then X ∼ N (2.505,
0.0082 ). Figure 4.10 presents the probability density function of the N (2.505, 0.0082 )
population. The shaded area represents P(2.49 < X < 2.51), which is the proportion
of ball bearings that meet the specification.
We compute the z-scores of 2.49 and 2.51:
z=
2.49 − 2.505
= −1.88
0.008
z=
2.51 − 2.505
= 0.63
0.008
The area to the left of z = −1.88 is 0.0301. The area to the left of z = 0.63 is 0.7357. The
area between z = 0.63 and z = −1.88 is 0.7357 − 0.0301 = 0.7056. Approximately
70.56% of the diameters will meet the specification.
0.7056
2.49
z = ⫺1.88
2.505 2.51
z = 0.63
FIGURE 4.10 Solution to Example 4.21.
Linear Functions of Normal Random Variables
If a normal random variable is multiplied by a nonzero constant or has a constant added
to it, the resulting random variable is also normal, with a mean and variance that are
determined by the original mean and variance and the constants. Specifically,
Summary
Let X ∼ N (µ, σ 2 ), and let a =
0 and b be constants. Then
a X + b ∼ N (aµ + b, a 2 σ 2 ).
Example
4.22
(4.11)
A chemist measures the temperature of a solution in ◦ C. The measurement is denoted
C and is normally distributed with mean 40◦ C and standard deviation 1◦ C. The measurement is converted to ◦ F by the equation F = 1.8C + 32. What is the distribution
of F?
Navidi-38615
140
book
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
Solution
Since C is normally distributed, so is F. Now µC = 40, so µ F = 1.8(40) + 32 = 104,
and σC2 = 1, so σ F2 = 1.82 (1) = 3.24. Therefore F ∼ N (104, 3.24).
Linear Combinations of Independent Normal Random
Variables
One of the remarkable features of the normal distribution is that linear combinations
of independent normal random variables are themselves normal random variables. To
be specific, suppose that X 1 ∼ N (µ1 , σ12 ), X 2 ∼ N (µ2 , σ22 ), . . . , X n ∼ N (µn , σn2 )
are independent normal random variables. Note that the means and variances of these
random variables can differ from one another. Let c1 , c2 , . . . , cn be constants. Then the
linear combination c1 X 1 + c2 X 2 + · · · + cn X n is a normally distributed random variable.
The mean and variance of the linear combination are c1 µ1 + c2 µ2 + · · · + cn µn and
c12 σ12 + c22 σ22 + · · · + cn2 σn2 , respectively (see Equations 3.32 and 3.36 in Section 3.4).
Summary
Let X 1 , X 2 , . . . , X n be independent and normally distributed with means
µ1 , µ2 , . . . , µn and variances σ12 , σ22 , . . . , σn2 . Let c1 , c2 , . . . , cn be constants, and
c1 X 1 + c2 X 2 + · · · + cn X n be a linear combination. Then
c1 X 1 +c2 X 2 +· · ·+cn X n ∼ N (c1 µ1 +c2 µ2 +· · ·+cn µn , c12 σ12 +c22 σ22 +· · ·+cn2 σn2 )
(4.12)
Example
4.23
In the article “Advances in Oxygen Equivalent Equations for Predicting the Properties
of Titanium Welds” (D. Harwig, W. Ittiwattana, and H. Castner, The Welding Journal,
2001:126s–136s), the authors propose an oxygen equivalence equation to predict the
strength, ductility, and hardness of welds made from nearly pure titanium. The equation
is E = 2C + 3.5N + O, where E is the oxygen equivalence, and C, N , and O are the
proportions by weight, in parts per million, of carbon, nitrogen, and oxygen, respectively
(a constant term involving iron content has been omitted). Assume that for a particular
grade of commercially pure titanium, the quantities C, N , and O are approximately
independent and normally distributed with means µC = 150, µ N = 200, µ O = 1500,
and standard deviations σC = 30, σ N = 60, σ O = 100. Find the distribution of E. Find
P(E > 3000).
Solution
Since E is a linear combination of independent normal random variables, its distribution
is normal. We must now find the mean and variance of E. Using Equation (4.12),
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.3
The Normal Distribution
141
we compute
µ E = 2µC + 3.5µ N + 1µ O
= 2(150) + 3.5(200) + 1(1500)
= 2500
σ E2
= 22 σC2 + 3.52 σ N2 + 12 σ O2
= 22 (302 ) + 3.52 (602 ) + 12 (1002 )
= 57,700
We conclude that E ∼ N (2500, 57,700).
√
To find P(E > 3000), we compute the z-score: z = (3000 − 2500)/ 57,700 =
2.08. The area to the right of z = 2.08 under the standard normal curve is 0.0188. So
P(E > 3000) = 0.0188.
If X 1 , . . . , X n is a random sample from any population with mean µ and variance σ 2 ,
then the sample mean X has mean µ X = µ and variance σ X2 = σ 2 /n. If the population
is normal, then X is normal as well, because it is a linear combination of X 1 , . . . , X n
with coefficients c1 = · · · = cn = 1/n.
Summary
Let X 1 , . . . , X n be independent and normally distributed with mean µ and variance σ 2 . Then
σ2
X ∼ N µ,
(4.13)
n
Other important linear combinations are the sum and difference of two random
variables. If X and Y are independent normal random variables, the sum X + Y and the
difference X − Y are linear combinations. The distributions of X + Y and X − Y can be
determined by using Equation (4.12) with c1 = 1, c2 = 1 for X +Y and c1 = 1, c2 = −1
for X − Y .
Summary
Let X and Y be independent, with X ∼ N (µ X , σ X2 ) and Y ∼ N (µY , σY2 ). Then
X + Y ∼ N (µ X + µY , σ X2 + σY2 )
(4.14)
X − Y ∼ N (µ X − µY , σ X2 + σY2 )
(4.15)
How Can I Tell Whether My Data Come
from a Normal Population?
In practice, we often have a sample from some population, and we must use the sample
to decide whether the population distribution is approximately normal. If the sample
Navidi-38615
book
142
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
is reasonably large, the sample histogram may give a good indication. Large samples
from normal populations have histograms that look something like the normal density
function—peaked in the center, and decreasing more or less symmetrically on either
side. Probability plots, which will be discussed in Section 4.7, provide another good
way of determining whether a reasonably large sample comes from a population that is
approximately normal. For small samples, it can be difficult to tell whether the normal
distribution is appropriate. One important fact is this: Samples from normal populations
rarely contain outliers. Therefore the normal distribution should generally not be used
for data sets that contain outliers. This is especially true when the sample size is small.
Unfortunately, for small data sets that do not contain outliers, it is difficult to determine
whether the population is approximately normal. In general, some knowledge of the
process that generated the data is needed.
Exercises for Section 4.3
1. Find the area under the normal curve
a.
b.
c.
d.
To the right of z = −0.75.
Between z = 0.40 and z = 1.15.
Between z = −0.60 and z = 0.70.
Outside z = 0.75 to z = 1.30.
2. Find the area under the normal curve
a.
b.
c.
d.
To the right of z = 0.73.
Between z = 2.12 and z = 2.57.
Outside z = 0.96 to z = 1.62.
Between z = −0.13 and z = 0.70.
3. Let Z ∼ N (0, 1). Find a constant c for which
a.
b.
c.
d.
e.
P(Z ≤ c) = 0.8413
P(0 ≤ Z ≤ c) = 0.3051
P(−c ≤ Z ≤ c) = 0.8664
P(c ≤ Z ≤ 0) = 0.4554
P(|Z | ≥ c) = 0.1470
4. If X ∼ N (3, 4), compute
a.
b.
c.
d.
P(X ≥ 3)
P(1 ≤ X < 8)
P(−1.5 ≤ X < 1)
P(−2 ≤ X − 3 < 4)
5. Scores on a standardized test are approximately normally distributed with a mean of 460 and a standard
deviation of 80.
a. What proportion of the scores are above 550?
b. What is the 35th percentile of the scores?
c. If someone’s score is 600, what percentile is she
on?
d. What proportion of the scores are between 420 and
520?
6. Weights of female cats of a certain breed are normally
distributed with mean 4.1 kg and standard deviation
0.6 kg.
a. What proportion of female cats have weights
between 3.7 and 4.4 kg?
b. A certain female cat has a weight that is 0.5 standard deviations above the mean. What proportion
of female cats are heavier than this one?
c. How heavy is a female cat whose weight is on the
80th percentile?
d. A female cat is chosen at random. What is the
probability that she weighs more than 4.5 kg?
e. Six female cats are chosen at random. What is the
probability that exactly one of them weighs more
than 4.5 kg?
7. The lifetime of a light bulb in a certain application is
normally distributed with mean µ = 1400 hours and
standard deviation σ = 200 hours.
a. What is the probability that a light bulb will last
more than 1800 hours?
b. Find the 10th percentile of the lifetimes.
c. A particular battery lasts 1645 hours. What percentile is its lifetime on?
d. What is the probability that the lifetime of a battery
is between 1350 and 1550 hours?
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.4
8. At a certain university, math SAT scores for the entering freshman class averaged 650 and had a standard
deviation of 100. The maximum possible score is 800.
Is it possible that the scores of these freshmen are normally distributed? Explain.
9. The strength of an aluminum alloy is normally distributed with mean 10 gigapascals (GPa) and standard
deviation 1.4 GPa.
a. What is the probability that a specimen of this alloy
will have a strength greater than 12 GPa?
b. Find the first quartile of the strengths of this alloy.
c. Find the 95th percentile of the strengths of this
alloy.
10. The temperature recorded by a certain thermometer
when placed in boiling water (true temperature 100◦ C)
is normally distributed with mean µ = 99.8◦ C and
standard deviation 0.1◦ C.
a. What is the probability that the thermometer reading is greater than 100◦ C?
b. What is the probability that the thermometer reading is within ±0.05◦ C of the true temperature?
11. Penicillin is produced by the Penicillium fungus,
which is grown in a broth whose sugar content must
be carefully controlled. The optimum sugar concentration is 4.9 mg/mL. If the concentration exceeds
6.0 mg/mL, the fungus dies and the process must be
shut down for the day.
a. If sugar concentration in batches of broth is normally distributed with mean 4.9 mg/mL and standard deviation 0.6 mg/mL, on what proportion of
days will the process shut down?
b. The supplier offers to sell broth with a sugar
content that is normally distributed with mean
5.2 mg/mL and standard deviation 0.4 mg/mL.
Will this broth result in fewer days of production
lost? Explain.
The Lognormal Distribution
143
12. The quality assurance program for a certain adhesive
formulation process involves measuring how well the
adhesive sticks a piece of plastic to a glass surface.
When the process is functioning correctly, the adhesive strength X is normally distributed with a mean of
200 N and a standard deviation of 10 N. Each hour,
you make one measurement of the adhesive strength.
You are supposed to inform your supervisor if your
measurement indicates that the process has strayed
from its target distribution.
a. Find P(X ≤ 160), under the assumption that the
process is functioning correctly.
b. Based on your answer to part (a), if the process is
functioning correctly, would a strength of 160 N
be unusually small? Explain.
c. If you observed an adhesive strength of 160 N,
would this be convincing evidence that the process
was no longer functioning correctly? Explain.
d. Find P(X ≥ 203), under the assumption that the
process is functioning correctly.
e. Based on your answer to part (d), if the process is
functioning correctly, would a strength of 203 N
be unusually large? Explain.
f. If you observed an adhesive strength of
203 N, would this be convincing evidence that
the process was no longer functioning correctly?
Explain.
g. Find P(X ≤ 195), under the assumption that the
process is functioning correctly.
h. Based on your answer to part (g), if the process is
functioning correctly, would a strength of 195 N
be unusually small? Explain.
i. If you observed an adhesive strength of 195 N,
would this be convincing evidence that the
process was no longer functioning correctly?
Explain.
4.4 The Lognormal Distribution
For data that are highly skewed or that contain outliers, the normal distribution is generally
not appropriate. The lognormal distribution, which is related to the normal distribution,
is often a good choice for these data sets. The lognormal distribution is derived from
the normal distribution as follows: If X is a normal random variable with mean µ and
Navidi-38615
144
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
variance σ 2 , then the random variable Y = e X is said to have the lognormal distribution
with parameters µ and σ 2 . Note that if Y has the lognormal distribution with parameters
µ and σ 2 , then X = ln Y has the normal distribution with mean µ and variance σ 2 .
Summary
■
■
If X ∼ N (µ, σ 2 ), then the random variable Y = e X has the lognormal
distribution with parameters µ and σ 2 .
If Y has the lognormal distribution with parameters µ and σ 2 , then the
random variable X = ln Y has the N (µ, σ 2 ) distribution.
The probability density function of a lognormal random variable with parameters
µ and σ is
f (x) =





1
√
1
exp − 2 (ln x − µ)2
2σ
σ x 2π
if x > 0
(4.16)
if x ≤ 0
0
Figure 4.11 presents a graph of the lognormal density function with parameters µ = 0
and σ = 1.
Note that the density function is highly skewed. This is the reason that the lognormal
distribution is often used to model processes that tend to produce occasional large values,
or outliers.
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
0
5
10
15
FIGURE 4.11 The probability density function of the lognormal distribution
with parameters µ = 0 and σ = 1.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.4
The Lognormal Distribution
145
It can be shown by advanced methods that if Y is a lognormal random variable with
parameters µ and σ 2 , then the mean E(Y ) and variance V (Y ) are given by
E(Y ) = eµ+σ
2
/2
2
V (Y ) = e2µ+2σ − e2µ+σ
2
(4.17)
Note that if Y has the lognormal distribution, the parameters µ and σ 2 do not refer
to the mean and variance of Y . They refer instead to the mean and variance of the normal
random variable ln Y . In Equation (4.17) we used the notation E(Y ) instead of µY and
V (Y ) instead of σY2 , in order to avoid confusion with µ and σ .
To compute probabilities involving lognormal random variables, take logs and use
the z table (Table A.2). Example 4.24 illustrates the method.
Example
4.24
When a pesticide comes into contact with skin, a certain percentage of it is absorbed. The
percentage that is absorbed during a given time period is often modeled with a lognormal
distribution. Assume that for a given pesticide, the amount that is absorbed (in percent)
within two hours of application is lognormally distributed with µ = 1.5 and σ = 0.5.
Find the probability that more than 5% of the pesticide is absorbed within two hours.
Solution
Let Y represent the percent of pesticide that is absorbed. We need to find P(Y > 5).
We cannot use the z table for Y , because Y is not normally distributed. However, ln Y
is normally distributed; specifically, ln Y ∼ N (1.5, 0.52 ). We express P(Y > 5) as a
probability involving ln Y :
P(Y > 5) = P(ln Y > ln 5) = P(ln Y > 1.609)
The z-score of 1.609 is
1.609 − 1.500
0.5
= 0.22
z=
From the z table, we find that P(ln Y > 1.609) = 0.4129. We conclude that the
probability that more than 5% of the pesticide is absorbed is approximately 0.41.
How Can I Tell Whether My Data Come
from a Lognormal Population?
As stated previously, samples from normal populations rarely contain outliers. In contrast,
samples from lognormal populations often contain outliers in the right-hand tail. That
is, the samples often contain a few values that are much larger than the rest of the
data. This of course reflects the long right-hand tail in the lognormal density function
(Figure 4.11). For samples with outliers on the right, we transform the data, by taking
the natural logarithm (or any logarithm) of each value. We then try to determine whether
these logs come from a normal population, by plotting them on a histogram or on a
probability plot. Probability plots will be discussed in Section 4.7.
Navidi-38615
book
146
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Exercises for Section 4.4
1. The lifetime (in days) of a certain electronic component that operates in a high-temperature environment
is lognormally distributed with µ = 1.2 and σ = 0.4.
a. Find the mean lifetime.
b. Find the probability that a component lasts between
three and six days.
c. Find the median lifetime.
d. Find the 90th percentile of the lifetimes.
2. The article “Assessment of Dermopharmacokinetic
Approach in the Bioequivalence Determination
of Topical Tretinoin Gel Products” (L. Pershing,
J. Nelson, et al., Journal of The American Academy of
Dermatology, 2003:740–751) reports that the amount
of a certain antifungal ointment that is absorbed into
the skin can be modeled with a lognormal distribution. Assume that the amount (in ng/cm2 ) of active
ingredient in the skin two hours after application is
lognormally distributed with µ = 2.2 and σ = 2.1.
a. Find the mean amount absorbed.
b. Find the median amount absorbed.
c. Find the probability that the amount absorbed is
more than 100 ng/cm2 .
d. Find the probability that the amount absorbed is
less than 50 ng/cm2 .
e. Find the 80th percentile of the amount absorbed.
f. Find the standard deviation of the amount
absorbed.
3. The body mass index (BMI) of a person is defined to
be the person’s body mass divided by the square of
the person’s height. The article “Influences of Parameter Uncertainties within the ICRP 66 Respiratory Tract Model: Particle Deposition” (W. Bolch, E.
Farfan, et al., Health Physics, 2001:378–394) states
that body mass index (in kg/m2 ) in men aged 25–34 is
lognormally distributed with parameters µ = 3.215
and σ = 0.157.
a. Find the mean BMI for men aged 25–34.
b. Find the standard deviation of BMI for men aged
25–34.
c. Find the median BMI for men aged 25–34.
d. What proportion of men aged 25–34 have a BMI
less than 22?
e. Find the 75th percentile of BMI for men aged
25–34.
4. The article “Stochastic Estimates of Exposure and
Cancer Risk from Carbon Tetrachloride Released to
the Air from the Rocky Flats Plant” (A. Rood, P. McGavran, et al., Risk Analysis, 2001:675–695) models
the increase in the risk of cancer due to exposure to
carbon tetrachloride as lognormal with µ = −15.65
and σ = 0.79.
a.
b.
c.
d.
e.
Find the mean risk.
Find the median risk.
Find the standard deviation of the risk.
Find the 5th percentile.
Find the 95th percentile.
5. The prices of stocks or other financial instruments are
often modeled with a lognormal distribution. An investor is considering purchasing stock in one of two
companies, A or B. The price of a share of stock today
is $1 for both companies. For company A, the value
of the stock one year from now is modeled as lognormal with parameters µ = 0.05 and σ = 0.1. For
company B, the value of the stock one year from now
is modeled as lognormal with parameters µ = 0.02
and σ = 0.2.
a. Find the mean of the price of one share of company
A one year from now.
b. Find the probability that the price of one share of
company A one year from now will be greater than
$1.20.
c. Find the mean of the price of one share of company
B one year from now.
d. Find the probability that the price of one share of
company B one year from now will be greater than
$1.20.
6. A manufacturer claims that the tensile strength of a
certain composite (in MPa) has the lognormal distribution with µ = 5 and σ = 0.5. Let X be the
strength of a randomly sampled specimen of this
composite.
a. If the claim is true, what is P(X < 20)?
b. Based on the answer to part (a), if the claim is
true, would a strength of 20 MPa be unusually
small?
c. If you observed a tensile strength of 20 MPa, would
this be convincing evidence that the claim is false?
Explain.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.5
d. If the claim is true, what is P(X < 130)?
e. Based on the answer to part (d), if the claim is true,
would a strength of 130 MPa be unusually small?
The Exponential Distribution
147
f. If you observed a tensile strength of 130 MPa,
would this be convincing evidence that the claim
is false? Explain.
4.5 The Exponential Distribution
The exponential distribution is a continuous distribution that is sometimes used to
model the time that elapses before an event occurs. Such a time is often called a waiting time. A common example is the lifetime of a component, which is the time that
elapses before the component fails. In addition, there is a close connection between the
exponential distribution and the Poisson distribution.
The probability density function of the exponential distribution involves a parameter,
which is a positive constant λ whose value determines the density function’s location
and shape.
Definition
The probability density function of the exponential distribution with parameter
λ > 0 is
λe−λx
x >0
f (x) =
(4.18)
0
x ≤0
Figure 4.12 presents the probability density function of the exponential distribution
for various values of λ. If X is a random variable whose distribution is exponential with
parameter λ, we write X ∼ Exp(λ).
4
␭=4
3
2
␭=2
1
␭=1
0
0
0.5
1
1.5
2
2.5
3
FIGURE 4.12 Plots of the exponential probability density function for various values of λ.
Navidi-38615
148
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
The cumulative distribution function of the exponential distribution is easy to
compute. For x ≤ 0, F(x) = P(X ≤ x) = 0. For x > 0, the cumulative distribution
function is
x
λe−λt dt = 1 − e−λx
F(x) = P(X ≤ x) =
0
Summary
If X ∼ Exp(λ), the cumulative distribution function of X is
1 − e−λx
x >0
F(x) = P(X ≤ x) =
0
x ≤0
(4.19)
The mean and variance of an exponential random variable can be computed by using
integration by parts. The results follow.
If X ∼ Exp(λ), then
Example
4.25
µX =
1
λ
(4.20)
σ X2 =
1
λ2
(4.21)
If X ∼ Exp(4), find µ X , σ X2 , and P(X ≤ 0.2).
Solution
We compute µ X and σ X2 from Equations (4.20) and (4.21), substituting λ = 2. We obtain
µ X = 0.25, σ X2 = 0.0625. Using Equation (4.19), we find that
P(X ≤ 0.2) = 1 − e−4(0.2) = 0.5507
The Poisson Process
Recall from Section 4.2 that the Poisson distribution is often used to model the number of
events that occur in a given region of time or space. We say that events follow a Poisson
process with rate parameter λ when the numbers of events in disjoint intervals of time
or space are independent, and the number X of events that occur in any interval of length
t has a Poisson distribution with mean λt—in other words, when X ∼ Poisson(λt).
We mentioned that the exponential distribution is sometimes used to model the
waiting time to an event. It turns out that the exponential distribution is the correct model
for waiting times whenever the events follow a Poisson process. The connection between
the exponential distribution and the Poisson process is as follows:
If events follow a Poisson process with rate parameter λ, and if T represents
the waiting time from any starting point until the next event, then T ∼ Exp(λ).
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.5
Example
4.26
The Exponential Distribution
149
A radioactive mass emits particles according to a Poisson process at a mean rate of 15
particles per minute. At some point, a clock is started. What is the probability that more
than 5 seconds will elapse before the next emission? What is the mean waiting time until
the next particle is emitted?
Solution
We will measure time in seconds. Let T denote the time in seconds that elapses before
the next particle is emitted. The mean rate of emissions is 0.25 per second, so the rate
parameter is λ = 0.25, and T ∼ Exp(0.25). The probability that more than 5 seconds
will elapse before the next emission is equal to
P(T > 5) = 1 − P(T ≤ 5)
= 1 − (1 − e−0.25(5) )
= e−1.25
= 0.2865
The mean waiting time is µT =
1
= 4 seconds.
0.25
Lack of Memory Property
The exponential distribution has a property known as the lack of memory property, which
we illustrate with Examples 4.27 and 4.28.
Example
4.27
The lifetime of a transistor in a particular circuit has an exponential distribution with
mean 1.25 years. Find the probability that the circuit lasts longer than two years.
Solution
Let T represent the lifetime of the circuit. Since µT = 1.25, λ = 1/1.25 = 0.8. We
need to find P(T > 2).
P(T > 2) = 1 − P(T ≤ 2)
= 1 − (1 − e−0.8(2) )
= e−1.6
= 0.202
Example
4.28
Refer to Example 4.27. Assume the transistor is now three years old and is still functioning. Find the probability that it functions for more than two additional years. Compare
this probability with the probability that a new transistor functions for more than two
years, which was calculated in Example 4.27.
Navidi-38615
book
150
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Solution
We are given that the lifetime of the transistor will be more than three years, and we
must compute the probability that the lifetime will be more than 3 + 2 = 5 years. The
probability is given by
P(T > 5 and T > 3)
P(T > 3)
If T > 5, then T > 3 as well. Therefore P(T > 5 and T > 3) = P(T > 5). It follows that
P(T > 5)
P(T > 5|T > 3) =
P(T > 3)
P(T > 5|T > 3) =
e−0.8(5)
e−0.8(3)
= e−0.8(2)
=
= e−1.6
= 0.202
The probability that a 3-year-old transistor lasts 2 additional years is exactly the
same as the probability that a new transistor lasts 2 years.
Examples 4.27 and 4.28 illustrate the lack of memory property. The probability that
we must wait an additional t units, given that we have already waited s units, is the same
as the probability that we must wait t units from the start. The exponential distribution
does not “remember” how long we have been waiting. In particular, if the lifetime of a
component follows the exponential distribution, then the probability that a component
that is s time units old will last an additional t time units is the same as the probability
that a new component will last t time units. In other words, a component whose lifetime
follows an exponential distribution does not show any effects of age or wear.
The calculations in Examples 4.27 and 4.28 could be repeated for any values s and
t in place of 3 and 2, and for any value of λ in place of 0.8. We now state the lack of
memory property in its general form:
Lack of Memory Property
If T ∼ Exp(λ), and t and s are positive numbers, then
P(T > t + s | T > s) = P(T > t)
Exercises for Section 4.5
1. Let T ∼ Exp(0.5). Find
a.
b.
c.
d.
µT
σT2
P(T > 5)
The median of T
2. The time between requests to a web server is exponentially distributed with mean 0.5 seconds.
a. What is the value of the parameter λ?
b. What is the median time between requests?
c. What is the standard deviation?
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.6
a. What is the mean pore diameter?
b. What is the standard deviation of the pore diameters?
c. What proportion of the pores are less than 3 microns in diameter?
d. What proportion of the pores are greater than 11 microns in diameter?
e. What is the median pore diameter?
f. What is the third quartile of the pore diameters?
g. What is the 99th percentile of the pore diameters?
151
b. Find the probability that the distance between two
flaws is between 8 and 20 m.
c. Find the median distance.
d. Find the standard deviation of the distances.
e. Find the 65th percentile of the distances.
d. What is the 80th percentile?
e. Find the probability that more than one second
elapses between requests.
f. If there have been no requests for the past two seconds, what is the probability that more than one additional second will elapse before the next request?
3. A catalyst researcher states that the diameters, in microns, of the pores in a new product she has made follow
the exponential distribution with parameter λ = 0.25.
Some Other Continuous Distributions
5. A certain type of component can be purchased new or
used. Fifty percent of all new components last more
than five years, but only 30% of used components
last more than five years. Is it possible that the lifetimes of new components are exponentially distributed?
Explain.
6. A radioactive mass emits particles according to a
Poisson process at a mean rate of 2 per second. Let
T be the waiting time, in seconds, between emissions.
4. The distance between flaws on a long cable is exponentially distributed with mean 12 m.
a. Find the probability that the distance between two
flaws is greater than 15 m.
a.
b.
c.
d.
e.
f.
What is the mean waiting time?
What is the median waiting time?
Find P(T > 2).
Find P(T < 0.1).
Find P(0.3 < T < 1.5).
If 3 seconds have elapsed with no emission, what
is the probability that there will be an emission
within the next second?
4.6 Some Other Continuous Distributions
The Uniform Distribution
The continuous uniform distribution, which we will sometimes refer to just as the uniform
distribution, is the simplest of the continuous distributions. It often plays an important
role in computer simulation studies. The uniform distribution has two parameters, a and
b, with a < b.
Definition
The probability density function of the continuous uniform distribution with
parameters a and b is

 1
a<x <b
(4.22)
f (x) = b − a

0
otherwise
If X is a random variable with probability density function f (x), we say that X
is uniformly distributed on the interval (a, b).
Navidi-38615
152
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Since the probability density function is constant on the interval (a, b), we can
think of the probability as being distributed “uniformly” on the interval. If X is a random
variable whose distribution is uniform on the interval (a, b), we write X ∼ U (a, b).
The mean and variance of a uniform random variable can easily be computed from
the definitions (Equations 3.19, 3.20, and 3.21 in Section 3.3).
Let X ∼ U (a, b). Then
a+b
2
(4.23)
(b − a)2
12
(4.24)
µX =
σ X2 =
The Gamma Distribution
The gamma distribution is a continuous distribution, one of whose purposes is to extend
the usefulness of the exponential distribution in modeling waiting times. It involves a
certain integral known as the gamma function. We define the gamma function and state
some of its properties.
Definition
For r > 0, the gamma function is defined by
∞
t r −1 e−t dt
(r ) =
(4.25)
0
The gamma function has the following properties:
1.
2.
3.
If r is an integer, then (r ) = (r − 1)!.
For any r , (r + 1) = r (r ).
√
(1/2) = π.
The gamma function is used to define the probability density function of the gamma
distribution. The gamma probability density function has two parameters, r and λ, both
of which are positive constants.
Definition
The probability density function of the gamma distribution with parameters r > 0
and λ > 0 is
 r r −1 −λx
λ x e


x >0
(r )
(4.26)
f (x) =


0
x ≤0
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.6
Some Other Continuous Distributions
153
1
0.8
r = 1, ␭ = 1
0.6
r = 3, ␭ = 2
0.4
r = 5, ␭ = 1
0.2
0
0
2
4
6
8
10
12
FIGURE 4.13 The gamma probability density function for various values of r and λ.
If X is a random variable whose probability density function is gamma with parameters r and λ, we write X ∼ (r, λ). Note that when r = 1, the gamma distribution is
the same as the exponential. In symbols, (1, λ) = Exp(λ). Figure 4.13 presents plots
of the gamma probability density function for various values of r and λ.
The mean and variance of a gamma random variable can be computed from the
probability density function. The results are as follows.
If X ∼ (r, λ), then
µX =
r
λ
(4.27)
σ X2 =
r
λ2
(4.28)
A gamma distribution for which the parameter r is a positive integer is sometimes
called an Erlang distribution. If r = k/2 where k is a positive integer, the (r, 1/2)
distribution is called a chi-square distribution with k degrees of freedom. The chisquare distribution is widely used in statistical inference. We will discuss some of its
uses in Section 6.5.
The Weibull Distribution
The Weibull distribution is a continuous distribution that is used in a variety of situations.
A common application of the Weibull distribution is to model the lifetimes of components
such as bearings, ceramics, capacitors, and dielectrics. The Weibull probability density
function has two parameters, both positive constants, that determine its location and
Navidi-38615
154
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
2
␣ = 5, ␤ = 1
1.5
1
␣ = 1, ␤ = 0.5
0.5
0
0
2
␣ = 5, ␤ = 0.2
4
6
8
10
FIGURE 4.14 The Weibull probability density function for various choices of α and β.
shape. We denote these parameters α and β. The probability density function of the
Weibull distribution is
α α−1 −(βx)α
x >0
αβ x e
(4.29)
f (x) =
0
x ≤0
If X is a random variable whose probability density function is Weibull with parameters α and β, we write X ∼ Weibull(α, β). Note that when α = 1, the Weibull
distribution is the same as the exponential distribution with parameter λ = β. In symbols,
Weibull(1, β) = Exp(β).
Figure 4.14 presents plots of the Weibull(α, β) probability density function for
several choices of the parameters α and β. By varying the values of α and β, a wide
variety of curves can be generated. Because of this, the Weibull distribution can be made
to fit a wide variety of data sets. This is the main reason for the usefulness of the Weibull
distribution.
The Weibull cumulative distribution function can be computed by integrating the
probability density function:
F(x) = P(X ≤ x) =





x
α
αβ α t α−1 e−(βt) dt = 1 − e−(βx)
α
x >0
0
0
(4.30)
x ≤0
This integral is not as hard as it looks. Just substitute u = (βt)α , and du = αβ α t α−1 dt.
The mean and variance of the Weibull distribution are expressed in terms of the
gamma function.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.6
Some Other Continuous Distributions
If X ∼ Weibull(α, β), then
1
1
µX = 1 +
β
α
2
1
1 2
2
σX = 2 1 +
− 1+
β
α
α
155
(4.31)
(4.32)
In the special case that 1/α is an integer, then
2 2
1
1
1
1
2
µX =
!
σX = 2
!−
!
β
α
β
α
α
If the quantity 1/α is an integer, then 1 + 1/α and 1 + 2/α are integers, so Property 1
of the gamma function can be used to compute µ X and σ X2 exactly. If the quantity 1/α
is of the form n/2, where n is an integer, then in principle µ X and σ X2 can be computed
exactly through repeated applications of Properties 2 and 3 of the gamma function. For
other values of α, µ X and σ X2 must be approximated. Many computer packages can
do this.
Example
4.29
In the article “Snapshot: A Plot Showing Program through a Device Development Laboratory” (D. Lambert, J. Landwehr, and M. Shyu, Statistical Case Studies for Industrial
Process Improvement, ASA-SIAM 1997), the authors suggest using a Weibull distribution to model the duration of a bake step in the manufacture of a semiconductor.
Let T represent the duration in hours of the bake step for a randomly chosen lot. If
T ∼ Weibull(0.3, 0.1), what is the probability that the bake step takes longer than four
hours? What is the probability that it takes between two and seven hours?
Solution
We use the cumulative distribution function, Equation (4.30). Substituting 0.3 for α and
0.1 for β, we have
P(T ≤ t) = 1 − e−(0.1t)
0.3
Therefore
P(T > 4) = 1 − P(T ≤ 4)
0.3 = 1 − 1 − e−[(0.1)(4)]
= e−(0.4)
0.3
= e−0.7597
= 0.468
Navidi-38615
book
156
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
The probability that the step takes between two and seven hours is
P(2 < T < 7) = P(T ≤ 7) − P(T ≤ 2)
0.3 0.3 = 1 − e−[(0.1)(7)] − 1 − e−[(0.1)(2)]
= e−[(0.1)(2)] − e−[(0.1)(7)]
0.3
= e−(0.2) − e−(0.7)
0.3
0.3
0.3
= e−0.6170 − e−0.8985
= 0.132
Exercises for Section 4.6
1. A person arrives at a certain bus stop each morning.
The waiting time, in minutes, for a bus to arrive is
uniformly distributed on the interval (0, 10).
a. Find the mean waiting time.
b. Find the standard deviation of the waiting times.
2. Resistors are labeled 100 . In fact, the actual resistances are uniformly distributed on the interval
(95, 103).
a. Find the mean resistance.
b. Find the standard deviation of the resistances.
3. Let T ∼ (6, 2).
a. Find µT .
b. Find σT .
4. Let T ∼ (r, λ). If µT = 8 and σT2 = 16, find r and λ.
5. Let T ∼ Weibull(0.5, 2).
a.
b.
c.
d.
e.
Find µT .
Find σT2 .
Find P(T ≤ 2).
Find P(T > 3).
Find P(1 < T ≤ 2).
6. If T is a continuous random variable that is always
positive (such as a waiting time), with probability density function f (t) and cumulative distribution function
F(t), then the hazard function is defined to be the
function
f (t)
h(t) =
1 − F(t)
The hazard function is the rate of failure per unit time,
expressed as a proportion of the items that have not
failed.
a. If T ∼ Weibull(α, β), find h(t).
b. For what values of α is the hazard rate increasing with time? For what values of α is it
decreasing?
7. In the article “Parameter Estimation with Only One
Complete Failure Observation” (W. Pang, P. Leung,
et al., International Journal of Reliability, Quality, and
Safety Engineering, 2001:109–122), the lifetime, in
hours, of a certain type of bearing is modeled with the
Weibull distribution with parameters α = 2.25 and
β = 4.474 × 10−4 .
a. Find the probability that a bearing lasts more than
1000 hours.
b. Find the probability that a bearing lasts less than
2000 hours.
c. Find the median lifetime of a bearing.
d. The hazard function is defined in Exercise 6. What
is the hazard at t = 2000 hours?
8. The lifetime of a certain battery is modeled with the
Weibull distribution with α = 2 and β = 0.1.
a. What proportion of batteries will last longer than
10 hours?
b. What proportion of batteries will last less than 5
hours?
c. What proportion of batteries will last longer than
20 hours?
d. The hazard function is defined in Exercise 6. What
is the hazard at t = 10 hours?
9. The lifetime of a cooling fan, in hours, that is used in
a computer system has the Weibull distribution with
α = 1.5 and β = 0.0001.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.7
Probability Plots
157
b. Based on the answer to part (a), if the model is correct, would one day be an unusually short lifetime?
Explain.
c. If you observed a component that lasted one day,
would you find this model to be plausible? Explain.
d. If this model is correct, what is P(T ≤ 90)?
e. Based on the answer to part (d), if the model is correct, would 90 days be an unusually short lifetime?
An unusually long lifetime? Explain.
f. If you observed a component that lasted 90 days,
would you find this model to be plausible? Explain.
a. What is the probability that a fan lasts more than
10,000 hours?
b. What is the probability that a fan lasts less than
5000 hours?
c. What is the probability that a fan lasts between
3000 and 9000 hours?
10. Someone suggests that the lifetime T (in days) of a
certain component can be modeled with the Weibull
distribution with parameters α = 3 and β = 0.01.
a. If this model is correct, what is P(T ≤ 1)?
4.7 Probability Plots
Scientists and engineers frequently work with data that can be thought of as a random sample from some population. In many such cases, it is important to determine
a probability distribution that approximately describes that population. In some cases,
knowledge of the process that generated the data can guide the decision. More often,
though, the only way to determine an appropriate distribution is to examine the sample
to find a probability distribution that fits.
Probability plots provide a good way to do this. Given a random sample X 1 , . . . , X n ,
a probability plot can determine whether the sample might plausibly have come from
some specified population. We will present the idea behind probability plots with a
simple example. A random sample of size 5 is drawn, and we want to determine whether
the population from which it came might have been normal. The sample, arranged in
increasing order, is
3.01, 3.35, 4.79, 5.96, 7.89
Denote the values, in increasing order, by X 1 , . . . , X n (n = 5 in this case). The
first thing to do is to assign increasing, evenly spaced values between 0 and 1 to the X i .
There are several acceptable ways to do this; perhaps the simplest is to assign the value
(i − 0.5)/n to X i . The following table shows the assignment for the given sample.
i
Xi
(i − 0.5)/5
1
2
3
4
5
3.01
3.35
4.79
5.96
7.89
0.1
0.3
0.5
0.7
0.9
The value (i − 0.5)/n is chosen to reflect the position of X i in the ordered sample.
There are i − 1 values less than X i , and i values less than or equal to X i . The quantity
(i − 0.5)/n is a compromise between the proportions (i − 1)/n and i/n.
Navidi-38615
book
158
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
The goal is to determine whether the sample might have come from a normal population. The most plausible normal distribution is the one whose mean and standard
deviation are the same as the sample mean and standard deviation. The sample mean
is X = 5.00, and the sample standard deviation is s = 2.00. We will therefore determine whether this sample might have come from a N (5, 22 ) distribution. The first step
is to compute the 100(i − 0.5)/5 percentiles (i.e., the 10th, 30th, 50th, 70th, and 90th
percentiles) of the N (5, 22 ) distribution. We could approximate these values by looking
up the z-scores corresponding to these percentiles and then converting to raw scores. In
practice, the Q i are invariably calculated by a computer software package. The following
table presents the X i and the Q i for this example.
i
Xi
Qi
1
2
3
4
5
3.01
3.35
4.79
5.96
7.89
2.44
3.95
5.00
6.05
7.56
The probability plot consists of the points (X i , Q i ). Since the distribution that
generated the Q i was a normal distribution, this is called a normal probability plot.
If X 1 , . . . , X n do in fact come from the distribution that generated the Q i , the points
should lie close to a straight line.
Figure 4.15 presents a normal probability plot for the sample X 1 , . . . , X 5 . A straight
line is superimposed on the plot, to make it easier to judge whether or not the points lie
12
0.999
10
0.99
8
0.95
0.9
0.75
6
0.5
4
0.25
2
0.1
0.05
0.01
0
0.001
⫺2
3
4
5
6
7
8
3
4
5
6
7
8
FIGURE 4.15 Normal probability plots for the sample X 1 , . . . , X 5 . The plots are identical, except for the scaling
on the vertical axis. The sample points lie approximately on a straight line, so it is plausible that they came from a
normal population.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.7
Probability Plots
159
close to a straight line. Two versions of the plot are presented; they are identical except
for the scaling on the vertical axis. In the plot on the left, the values on the vertical axis
represent the Q i . In the plot on the right, the values on the vertical axis represent the
percentiles (as decimals, so 0.1 is the 10th percentile) of the Q i . For example, the 10th
percentile of N (5, 22 ) is 2.44, so the value 0.1 on the right-hand plot corresponds to
the value 2.44 on the left-hand plot. The 50th percentile, or median, is 5, so the value
0.5 on the right-hand plot corresponds to the value 5 on the left-hand plot. Computer
packages often scale the vertical axis like the plot on the right. In Figure 4.15, the sample
points are close to the line, so it is quite plausible that the sample came from a normal
distribution.
We remark that the points Q 1 , . . . , Q n are called quantiles of the distribution from
which they are generated. Sometimes the sample points X 1 , . . . , X n are called empirical
quantiles. For this reason the probability plot is sometimes called a quantile–quantile
plot, or QQ plot.
In this example, we used a sample of only five points to make the calculations clear.
In practice, probability plots work better with larger samples. A good rule of thumb is
to require at least 30 points before relying on a probability plot. Probability plots can
still be used for smaller samples, but they will detect only fairly large departures from
normality.
Figure 4.16 shows two normal probability plots. The plot in Figure 4.16a is of the
monthly productions of 255 gas wells. These data do not lie close to a straight line and
thus do not come from a population that is close to normal. The plot in Figure 4.16b is
of the natural logs of the monthly productions. These data lie much closer to a straight
line, although some departure from normality can be detected.
0.999
0.999
0.99
0.99
0.95
0.9
0.95
0.9
0.75
0.75
0.5
0.5
0.25
0.25
0.1
0.05
0.1
0.05
0.01
0.01
0.001
0.001
0
500
1000
(a)
1500
3
4
5
(b)
6
7
FIGURE 4.16 Two normal probability plots. (a) A plot of the monthly productions of 255 gas wells. These data do not
lie close to a straight line, and thus do not come from a population that is close to normal. (b) A plot of the natural logs
of the monthly productions. These data lie much closer to a straight line, although some departure from normality can be
detected.
Navidi-38615
book
160
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
Interpreting Probability Plots
It’s best not to use hard-and-fast rules when interpreting a probability plot. Judge the
straightness of the plot by eye. When deciding whether the points on a probability plot
lie close to a straight line, do not pay too much attention to the points at the very
ends (high or low) of the sample, unless they are quite far from the line. It is common
for a few points at either end to stray from the line somewhat. However, a point that
is very far from the line when most other points are close is an outlier, and deserves
attention.
Exercises for Section 4.7
1. Each of three samples has been plotted on a normal probability plot. For each, say whether the sample appears to
have come from an approximately normal population.
0.999
0.999
0.99
0.95
0.9
0.75
0.5
0.25
0.1
0.05
0.01
0.99
0.95
0.9
0.75
0.5
0.25
0.1
0.05
0.01
0.001
0.001
(a)
(b)
0.999
0.99
0.95
0.9
0.75
0.5
0.25
0.1
0.05
0.01
0.001
(c)
2. Construct a normal probability plot for the soapbars
data in Exercise 1 in Section 1.3 (page 30). Do these
data appear to come from an approximately normal
distribution?
4.1
3.8
3.7
3.7
1.8
1.9
3.7
3.8
3.2
4.6
4.3
3.4
1.9
1.8
3.6
4.0
4.6
4.7
3.8
2.3
2.0
1.8
3.8
4.4
4.5
4.6
3.8
4.1
3.9
1.9
2.5
4.3
4.3
3.5
4.5
3.3
2.3
4.0
4.1
2.0
3. Below are the durations (in minutes) of 40 eruptions
of the geyser Old Faithful in Yellowstone National
Park.
Construct a normal probability plot for these data.
Do the data appear to come from an approximately
normal distribution?
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.8
4. Below are the durations (in minutes) of 40 time intervals between eruptions of the geyser Old Faithful in
Yellowstone National Park.
91
86
73
79
51
51
67
60
79
85
68
86
53
45
86
71
82
88
72
67
51
51
75
81
76
80
75
76
82
49
66
83
84
82
84
76
53
75
70
55
Construct a normal probability plot for these data. Do
they appear to come from an approximately normal
distribution?
The Central Limit Theorem
161
5. Construct a normal probability plot for the PM data in
Table 1.2 (page 17). Do the PM data appear to come
from a normal population?
6. Construct a normal probability plot for the logs of the
PM data in Table 1.2. Do the logs of the PM data
appear to come from a normal population?
7. Can the plot in Exercise 6 be used to determine
whether the PM data appear to come from a lognormal
population? Explain.
4.8 The Central Limit Theorem
The Central Limit Theorem is by far the most important result in statistics. Many
commonly used statistical methods rely on this theorem for their validity. The Central
Limit Theorem says that if we draw a large enough sample from a population, then the
distribution of the sample mean is approximately normal, no matter what population the
sample was drawn from. This allows us to compute probabilities for sample means using
the z table, even though the population from which the sample was drawn is not normal.
We now explain this more fully.
Let X 1 , . . . , X n be a simple random sample from a population with mean µ and
variance σ 2 . Let X = (X 1 + · · · + X n )/n be the sample mean. Now imagine drawing
many such samples and computing their sample means. If one could draw every possible
sample of size n from the original population and compute the sample mean for each one,
the resulting collection would be the population of sample means. One could construct
the probability density function of this population. One might think that the shape of this
probability density function would depend on the shape of the population from which
the sample was drawn. The surprising thing is that if the sample size is sufficiently large,
this is not so. If the sample size is large enough, the distribution of the sample mean is
approximately normal, no matter what the distribution of the population from which the
sample was drawn.
The Central Limit Theorem
Let X 1 , . . . , X n be a simple random sample from a population with mean µ
and variance σ 2 .
X1 + · · · + Xn
be the sample mean.
Let X =
n
Let Sn = X 1 + · · · + X n be the sum of the sample observations.
Then if n is sufficiently large,
σ2
X ∼ N µ,
n
and
Sn ∼ N (nµ, nσ 2 )
approximately
(4.33)
approximately
(4.34)
Navidi-38615
162
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
The Central Limit Theorem says that X and Sn are approximately normally distributed, if the sample size n is large enough. The natural question to ask is: How large
is large enough? The answer depends on the shape of the underlying population. If the
sample is drawn from a nearly symmetric distribution, the normal approximation can be
good even for a fairly small value of n. However, if the population is heavily skewed,
a fairly large n may be necessary. Empirical evidence suggests that for most populations, a sample size of 30 or more is large enough for the normal approximation to be
adequate.
For most populations, if the sample size is greater than 30, the Central Limit
Theorem approximation is good.
Example
4.30
Let X denote the number of flaws in a 1 in. length of copper wire. The probability mass
function of X is presented in the following table.
x
P (X = x)
0
1
2
3
0.48
0.39
0.12
0.01
One hundred wires are sampled from this population. What is the probability that the
average number of flaws per wire in this sample is less than 0.5?
Solution
The population mean number of flaws is µ = 0.66, and the population variance is
σ 2 = 0.5244. (These quantities can be calculated using Equations 3.13 and 3.14 in
Section 3.3.) Let X 1 , . . . , X 100 denote the number of flaws in the 100 wires sampled
from this population. We need to find P(X < 0.5). Now the sample size is n = 100,
which is a large sample. It follows from the Central Limit Theorem (expression 4.33)
that X ∼ N (0.66, 0.005244). The z-score of 0.5 is therefore
0.5 − 0.66
= −2.21
z=√
0.005244
From the z table, the area to the left of −2.21 is 0.0136. Therefore P(X < 0.5) = 0.0136,
so only 1.36% of samples of size 100 will have fewer than 0.5 flaws per wire. See
Figure 4.17.
Note that in Example 4.30 we needed to know only the mean and variance of the
population, not the probability mass function.
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.8
The Central Limit Theorem
163
0.0136
0.50
z = ⫺2.21
0.66
FIGURE 4.17 Solution to Example 4.30
Example
4.31
At a large university, the mean age of the students is 22.3 years, and the standard deviation
is 4 years. A random sample of 64 students is drawn. What is the probability that the
average age of these students is greater than 23 years?
Solution
Let X 1 , . . . , X 64 be the ages of the 64 students in the sample. We wish to find P(X > 23).
Now the population from which the sample was drawn has mean µ = 22.3 and variance
σ 2 = 16. The sample size is n = 64. It follows from the Central Limit Theorem
(expression 4.33) that X ∼ N (22.3, 0.25). The z-score for 23 is
z=
23 − 22.3
√
= 1.40
0.25
From the z table, the area to the right of 1.40 is 0.0808. Therefore P(X > 23) = 0.0808.
See Figure 4.18.
0.0808
22.3
23.0
z = 1.4
FIGURE 4.18 Solution to Example 4.31
Normal Approximation to the Binomial
Recall from Section 4.1 that if X ∼ Bin(n, p) then X represents the number of successes
in n independent trials, each of which has success probability p. Furthermore, X has
mean np and variance np(1 − p). If the number of trials is large enough, the Central
Limit Theorem says that X ∼ N (np, np(1 − p)), approximately. Thus we can use the
normal distribution to compute probabilities concerning X .
Again the question arises: How large a sample is large enough? In the binomial case,
the accuracy of the normal approximation depends on the mean number of successes np
and on the mean number of failures n(1− p). The larger the values of np and n(1− p), the
better the approximation. A common rule of thumb is to use the normal approximation
Navidi-38615
164
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
whenever np > 5 and n(1 − p) > 5. A better and more conservative rule is to use the
normal approximation whenever np > 10 and n(1 − p) > 10.
Summary
If X ∼ Bin(n, p), and if np > 10 and n(1 − p) > 10, then
X ∼ N (np, np(1 − p))
approximately
(4.35)
To illustrate the accuracy of the normal approximation to the binomial, Figure 4.19
presents the Bin(100, 0.2) probability histogram with the N (20, 16) probability density
function superimposed. While a slight degree of skewness can be detected in the binomial
distribution, the normal approximation is quite close.
0.1
0.08
0.06
0.04
0.02
0
0
2
4
6
8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40
FIGURE 4.19 The Bin(100, 0.2) probability histogram, with the N (20, 16) probability
density function superimposed.
The Continuity Correction
The binomial distribution is discrete, while the normal distribution is continuous. The
continuity correction is an adjustment, made when approximating a discrete distribution
with a continuous one, that can improve the accuracy of the approximation. To see how
it works, imagine that a fair coin is tossed 100 times. Let X represent the number
of heads. Then X ∼ Bin(100, 0.5). Imagine that we wish to compute the probability
that X is between 45 and 55. This probability will differ depending on whether the
endpoints, 45 and 55, are included or excluded. Figure 4.20 illustrates the case where
the endpoints are included—that is, where we wish to compute P(45 ≤ X ≤ 55). The
exact probability is given by the total area of the rectangles of the binomial probability
histogram corresponding to the integers 45 to 55 inclusive. The approximating normal
curve is superimposed. To get the best approximation, we should compute the area
under the normal curve between 44.5 and 55.5. In contrast, Figure 4.21 illustrates the
case where we wish to compute P(45 < X < 55). Here the endpoints are excluded. The
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.8
The Central Limit Theorem
165
0.08
0.07
0.06
0.05
0.04
0.03
0.02
0.01
0
40
45
50
55
60
FIGURE 4.20 To compute P(45 ≤ X ≤ 55), the areas of the rectangles corresponding
to 45 and to 55 should be included. To approximate this probability with the normal
curve, compute the area under the curve between 44.5 and 55.5.
0.08
0.07
0.06
0.05
0.04
0.03
0.02
0.01
0
40
45
50
55
60
FIGURE 4.21 To compute P(45 < X < 55), the areas of the rectangles corresponding
to 45 and to 55 should be excluded. To approximate this probability with the normal
curve, compute the area under the curve between 45.5 and 54.5.
exact probability is given by the total area of the rectangles of the binomial probability
histogram corresponding to the integers 46 to 54. The best normal approximation is
found by computing the area under the normal curve between 45.5 and 54.5.
In summary, to apply the continuity correction, determine precisely which rectangles
of the discrete probability histogram you wish to include, and then compute the area under
the normal curve corresponding to those rectangles.
Example
4.32
If a fair coin is tossed 100 times, use the normal curve to approximate the probability
that the number of heads is between 45 and 55 inclusive.
Solution
This situation is illustrated in Figure 4.20. Let X be the number of heads obtained. Then
X ∼ Bin(100, 0.5). Substituting n = 100 and p = 0.5 into Equation (4.35), we obtain
the normal approximation X ∼ N (50, 25). Since the endpoints 45 and 55 are to be
Navidi-38615
166
book
December 4, 2008
CHAPTER 4
Confirming Pages
15:53
Commonly Used Distributions
included, we should compute the area under the normal curve between 44.5 and 55.5.
The z-scores for 44.5 and 55.5 are
44.5 − 50
55.5 − 50
z=
= −1.1,
z=
= 1.1
5
5
From the z table we find that the probability is 0.7286. See Figure 4.22.
0.7286
44.5
z = ⫺1.1
50
55.5
z = 1.1
FIGURE 4.22 Solution to Example 4.32.
Example
4.33
If a fair coin is tossed 100 times, use the normal curve to approximate the probability
that the number of heads is between 45 and 55 exclusive.
Solution
This situation is illustrated in Figure 4.21. Let X be the number of heads obtained. As
in Example 4.32, X ∼ Bin(100, 0.5), and the normal approximation is X ∼ N (50, 25).
Since the endpoints 45 and 55 are to be excluded, we should compute the area under the
normal curve between 45.5 and 54.5. The z-scores for 45.5 and 54.5 are
54.5 − 50
45.5 − 50
= −0.9,
z=
= 0.9
z=
5
5
From the z table we find that the probability is 0.6318. See Figure 4.23.
0.6318
45.5
z = ⫺0.9
50
54.5
z = 0.9
FIGURE 4.23 Solution to Example 4.33.
Accuracy of the Continuity Correction
The continuity correction improves the accuracy of the normal approximation to the
binomial distribution in most cases. For binomial distributions with large n and small
p, however, when computing a probability that corresponds to an area in the tail of
the distribution, use of the continuity correction can in some cases reduce the accuracy of the normal approximation somewhat. This results from the fact that the normal
Navidi-38615
book
December 4, 2008
Confirming Pages
15:53
4.8
The Central Limit Theorem
167
approximation is not perfect; it fails to account for a small degree of skewness in these
distributions. In summary, use of the continuity correction makes the normal approximation to the binomial distribution better in most cases, but not all.
Normal Approximation to the Poisson
Recall that if X ∼ Poisson(λ), then X is approximately binomial with n large and
np = λ. Recall also that µ X = λ and σ X2 = λ. It follows that if λ is sufficiently large, i.e.,
λ > 10, then X is approximately binomial, with np > 10. It follows from the Central
Limit Theorem that X is also approximately normal, with mean and variance both equal
to λ. Thus we can use the normal distribution to approximate the Poisson.
Summary
If X ∼ Poisson(λ), where λ > 10, then
X ∼ N (λ, λ)
approximately
(4.36)
Continuity Correction for the Poisson Distribution
Since the Poisson distribution is discrete, the continuity correction can in principle be
applied when using the normal approximation. For areas that include the central part
of the curve, the continuity correction generally improves the normal approximation,
but for areas in the tails the continuity correction sometimes makes the approximation
worse. We will not use the continuity correction for the Poisson distribution.
Example
4.34
The number of hits on a website follows a Poisson distribution, with a mean of 27 hits
per hour. Find the probability that there will be 90 or more hits in three hours.
Solution
Let X denote the number of hits on the website in three hours. The mean number of hits
in three hours is 81, so X ∼ Poisson(81). Using the normal approximation, we find that
X ∼ N (81, 81). We wish to find P(X ≥ 90). We compute the z-score of 90, which is
z=
90 − 81
√
= 1.00
81
Using the z table, we find that P(X ≥ 90) = 0.1587. See Figure 4.24.
0.1587
81
90
z=1
FIGURE 4.24 Solution to Example 4.34.
Navidi-38615
book
168
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
Exercises for Section 4.8
1. Bottles filled by a certain machine are supposed to
contain 12 oz of liquid. In fact, the fill volume is
random, with mean 12.01 oz and standard deviation
0.2 oz.
a. What is the probability that the mean volume of a
random sample of 144 bottles is less than 12 oz?
b. If the population mean fill volume is increased to
12.03 oz what is the probability that the mean volume of a sample of size 144 will be less than 12 oz?
2. Among all the income tax forms filed in a certain year,
the mean tax paid was $2000, and the standard deviation was $500. In addition, for 10% of the forms, the
tax paid was greater than $3000. A random sample of
625 tax forms is drawn.
a. What is the probability that the average tax paid
on the sample forms is greater than $1980?
b. What is the probability that more than 60 of the
sampled forms have a tax of greater than $3000?
3. In a galvanized coating process for pipes, the mean
coating weight is 125 lb, with a standard deviation of
10 lb. A random sample of 50 pipes is drawn.
a. What is the probability that the sample mean coating weight is greater than 128 lb?
b. Find the 90th percentile of the sample mean coating weight.
c. How large a sample size is needed so that the probability is 0.02 that the sample mean is less than
123?
4. A 500-page book contains 250 sheets of paper. The
thickness of the paper used to manufacture the book
has mean 0.08 mm and standard deviation 0.01 mm.
a. What is the probability that a randomly chosen
book is more than 20.2 mm thick (not including
the covers)?
b. What is the 10th percentile of book thicknesses?
c. Someone wants to know the probability that a randomly chosen page is more than 0.1 mm thick. Is
enough information given to compute this probability? If so, compute the probability. If not, explain
why not.
5. In a process that manufactures bearings, 90% of the
bearings meet a thickness specification. In a sample
of 500 bearings, what is the probability that more than
440 meet the specification?
6. Among the adults in a large city, 30% have a college
degree. A simple random sample of 100 adults is chosen. What is the probability that more than 35 of them
have a college degree?
7. The breaking strength (in kg/mm) for a certain type
of fabric has mean 1.86 and standard deviation 0.27.
A random sample of 80 pieces of fabric is drawn.
a. What is the probability that the sample mean breaking strength is less than 1.8 kg/mm?
b. Find the 80th percentile of the sample mean breaking strength.
c. How large a sample size is needed so that the
probability is 0.01 that the sample mean is less
than 1.8?
8. The amount of warpage in a type of wafer used in the
manufacture of integrated circuits has mean 1.3 mm
and standard deviation 0.1 mm. A random sample of
200 wafers is drawn.
a. What is the probability that the sample mean
warpage exceeds 1.305 mm?
b. Find the 25th percentile of the sample mean.
c. How many wafers must be sampled so that the
probability is 0.05 that the sample mean exceeds
1.305?
9. A sample of 225 wires is drawn from the population of
wires described in Example 4.30. Find the probability
that fewer than 110 of these wires have no flaws.
10. A battery manufacturer claims that the lifetime of
a certain type of battery has a population mean of
40 hours and a standard deviation of 5 hours. Let X
represent the mean lifetime of the batteries in a simple
random sample of size 100.
a. If the claim is true, what is P(X ≤ 36.7)?
b. Based on the answer to part (a), if the claim is true,
is a sample mean lifetime of 36.7 hours unusually
short?
c. If the sample mean lifetime of the 100 batteries
were 36.7 hours, would you find the manufacturer’s claim to be plausible? Explain.
d. If the claim is true, what is P(X ≤ 39.8)?
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
Supplementary Exercises for Chapter 4
e. Based on the answer to part (d), if the claim is true,
is a sample mean lifetime of 39.8 hours unusually
short?
f. If the sample mean lifetime of the 100 batteries
were 39.8 hours, would you find the manufacturer’s claim to be plausible? Explain.
11. A new process has been designed to make ceramic
tiles. The goal is to have no more than 5% of the tiles
be nonconforming due to surface defects. A random
sample of 1000 tiles is inspected. Let X be the number
of nonconforming tiles in the sample.
a. If 5% of the tiles produced are nonconforming,
what is P(X ≥ 75)?
169
b. Based on the answer to part (a), if 5% of the tiles
are nonconforming, is 75 nonconforming tiles out
of 1000 an unusually large number?
c. If 75 of the sample tiles were nonconforming,
would it be plausible that the goal had been
reached? Explain.
d. If 5% of the tiles produced are nonconforming,
what is P(X ≥ 53)?
e. Based on the answer to part (d), if 5% of the tiles
are nonconforming, is 53 nonconforming tiles out
of 1000 an unusually large number?
f. If 53 of the sample tiles were nonconforming,
would it be plausible that the goal had been
reached? Explain.
Supplementary Exercises for Chapter 4
1. An airplane has 100 seats for passengers. Assume that
the probability that a person holding a ticket appears
for the flight is 0.90. If the airline sells 105 tickets,
what is the probability that everyone who appears for
the flight will get a seat?
2. The number of large cracks in a length of pavement
along a certain street has a Poisson distribution with
a mean of 1 crack per 100 m.
a. What is the probability that there will be exactly
8 cracks in a 500 m length of pavement?
b. What is the probability that there will be no cracks
in a 100 m length of pavement?
c. Let T be the distance in meters between two successive cracks. What is the probability density function of T ?
d. What is the probability that the distance between
two successive cracks will be more than 50 m?
3. Pea plants contain two genes for seed color, each of
which may be Y (for yellow seeds) or G (for green
seeds). Plants that contain one of each type of gene are
called heterozygous. According to the Mendelian theory of genetics, if two heterozygous plants are crossed,
each of their offspring will have probability 0.75 of
having yellow seeds and probability 0.25 of having
green seeds.
a. Out of 10 offspring of heterozygous plants, what
is the probability that exactly 3 have green seeds?
b. Out of 10 offspring of heterozygous plants, what is
the probability that more than 2 have green seeds?
c. Out of 100 offspring of heterozygous plants, what
is the probability that more than 30 have green
seeds?
d. Out of 100 offspring of heterozygous plants, what
is the probability that between 30 and 35 inclusive
have green seeds?
e. Out of 100 offspring of heterozygous plants, what
is the probability that fewer than 80 have yellow
seeds?
4. A simple random sample X 1 , . . . , X n is drawn from
a population, and the quantities ln X 1 , . . . , ln X n are
plotted on a normal probability plot. The points
approximately follow a straight line. True or false:
a. X 1 , . . . , X n come from a population that is approximately lognormal.
b. X 1 , . . . , X n come from a population that is approximately normal.
c. ln X 1 , . . . , ln X n come from a population that is
approximately lognormal.
d. ln X 1 , . . . , ln X n come from a population that is
approximately normal.
5. The Environmental Protection Agency (EPA) has contracted with your company for equipment to monitor
water quality for several lakes in your water district.
A total of 10 devices will be used. Assume that each
Navidi-38615
book
170
December 4, 2008
CHAPTER 4
15:53
Confirming Pages
Commonly Used Distributions
device has a probability of 0.01 of failure during the
course of the monitoring period.
pled resistors will have resistances between 9.3 and
10.7 ?
a. What is the probability that none of the devices
fail?
b. What is the probability that two or more devices
fail?
c. If the EPA requires the probability that none of the
devices fail to be at least 0.95, what is the largest
individual failure probability allowable?
9. The intake valve clearances on new engines of a certain type are normally distributed with mean 200 µm
and standard deviation 10 µm.
6. In the article “Occurrence and Distribution of Ammonium in Iowa Groundwater” (K. Schilling, Water Environment Research, 2002:177–186), ammonium concentrations (in mg/L) were measured at a large number
of wells in Iowa. The mean concentration was 0.71,
the median was 0.22, and the standard deviation was
1.09. Is it possible to determine whether these concentrations are approximately normally distributed? If so,
say whether they are normally distributed, and explain
how you know. If not, describe the additional information you would need to determine whether they are
normally distributed.
7. Medication used to treat a certain condition is administered by syringe. The target dose in a particular application is µ. Because of the variations in the syringe, in
reading the scale, and in mixing the fluid suspension,
the actual dose administered is normally distributed
with mean µ and variance σ 2 .
a. What is the probability that the dose administered
differs from the mean µ by less than σ ?
b. If X represents the dose administered, find the
value of z so that P(X < µ + zσ ) = 0.90.
c. If the mean dose is 10 mg, the variance is 2.6 mg2 ,
and a clinical overdose is defined as a dose larger
than 15 mg, what is the probability that a patient
will receive an overdose?
a. What is the probability that the clearance is greater
than 215 µm?
b. What is the probability that the clearance is
between 180 and 205 µm?
c. An engine has six intake valves. What is the probability that exactly two of them have clearances
greater than 215 µm?
10. The stiffness of a certain type of steel beam used in
building construction has mean 30 kN/mm and standard deviation 2 kN/mm.
a. Is it possible to compute the probability that the
stiffness of a randomly chosen beam is greater than
32 kN/mm? If so, compute the probability. If not,
explain why not.
b. In a sample of 100 beams, is it possible to compute the probability that the sample mean stiffness
of the beams is greater than 30.2 kN/mm? If so,
compute the probability. If not, explain why not.
11. In a certain process, the probability of producing an
oversize assembly is 0.05.
a. In a sample of 300 randomly chosen assemblies,
what is the probability that fewer than 20 of them
are oversize?
b. In a sample of 10 randomly chosen assemblies,
what is the probability that one or more of them is
oversize?
c. To what value must the probability of an oversize
assembly be reduced so that only 1% of lots of 300
assemblies contain 20 or more that are oversize?
8. You have a large box of resistors whose resistances
are normally distributed with mean 10 and standard
deviation 1 .
12. A process for manufacturing plate glass leaves an average of three small bubbles per 10 m2 of glass. The
number of bubbles on a piece of plate glass follows a
Poisson distribution.
a. What proportion of the resistors have resistances
between 9.3 and 10.7 ?
b. If you sample 100 resistors, what is the probability that 50 or more of them will have resistances
between 9.3 and 10.7 ?
c. How many resistors must you sample so that the
probability is 0.99 that 50 or more of the sam-
a. What is the probability that a piece of glass 3 × 5 m
will contain more than two bubbles?
b. What is the probability that a piece of glass 4 × 6 m
will contain no bubbles?
c. What is the probability that 50 pieces of glass, each
3 × 6 m, will contain more than 300 bubbles in
total?
Navidi-38615
book
December 4, 2008
15:53
Confirming Pages
Supplementary Exercises for Chapter 4
13. The lifetime of a bearing (in years) follows the Weibull
distribution with parameters α = 1.5 and β = 0.8.
a. What is the probability that a bearing lasts more
than 1 year?
b. What is the probability that a bearing lasts less than
2 years?
14. The length of time to perform an oil change at a certain
shop is normally distributed with mean 29.5 minutes
and standard deviation 3 minutes. What is the probability that a mechanic can complete 16 oil changes in
an eight-hour day?
15. A cereal manufacturer claims that the gross weight
(including packaging) of a box of cereal labeled as
weighing 12 oz has a mean of 12.2 oz and a standard
deviation of 0.1 oz. You gather 75 boxes and weigh
them all together. Let S denote the total weight of the
75 boxes of cereal.
a. If the claim is true, what is P(S ≤ 914.8)?
b. Based on the answer to part (a), if the claim is true,
is 914.8 oz an unusually small total weight for a
sample of 75 boxes?
c. If the total weight of the boxes were 914.8 oz,
would you be convinced that the claim was false?
Explain.
171
d. If the claim is true, what is P(S ≤ 910.3)?
e. Based on the answer to part (d), if the claim is true,
is 910.3 oz an unusually small total weight for a
sample of 75 boxes?
f. If the total weight of the boxes were 910.3 oz,
would you be convinced that the claim was false?
Explain.
16. Someone claims that the number of hits on his website
has a Poisson distribution with mean 20 per hour. Let
X be the number of hits in five hours.
a. If the claim is true, what is P(X ≤ 95)?
b. Based on the answer to part (a), if the claim is true,
is 95 hits in a five-hour time period an unusually
small number?
c. If you observed 95 hits in a five-hour time period,
would this be convincing evidence that the claim
is false? Explain.
d. If the claim is true, what is P(X ≤ 65)?
e. Based on the answer to part (d), if the claim is true,
is 65 hits in a five-hour time period an unusually
small number?
f. If you observed 65 hits in a five-hour time period,
would this be convincing evidence that the claim
is false? Explain.