Download Probability and Statistics for Engineering and the Sciences, 8th ed.

3.1. Random Variables
Two Types of Random Variables
In Section 1.2, we distinguished between data resulting from observations on a counting variable and data obtained by observing values of a measurement variable. A
slightly more formal distinction characterizes two different types of random variables.
DEFINITION
A discrete random variable is an rv whose possible values either constitute a
finite set or else can be listed in an infinite sequence in which there is a first
element, a second element, and so on (“countably” infinite).
A random variable is continuous if both of the following apply:
1. Its set of possible values consists either of all numbers in a single interval
on the number line (possibly infinite in extent, e.g., from −∞ to ∞) or all
numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).
2. No possible value of the variable has positive probability, that is,
P(X = c) = 0 for any possible value c.
Although any interval on the number line contains an infinite number of numbers, it
can be shown that there is no way to create an infinite listing of all these values—
there are just too many of them. The second condition describing a continuous random variable is perhaps counterintuitive, since it would seem to imply a total
probability of zero for all possible values. But we shall see in Chapter 4 that intervals of values have positive probability; the probability of an interval will decrease
to zero as the width of the interval shrinks to zero.
Example 3.6
All random variables in Examples 3.1–3.4 are discrete. As another example, suppose
we select married couples at random and do a blood test on each person until we find
a husband and wife who both have the same Rh factor. With X = the number of
blood tests to be performed, possible values of X are D = {2, 4, 6, 8, . . .}. Since the
possible values have been listed in sequence, X is a discrete rv.
■
To study basic properties of discrete rv’s, only the tools of discrete mathematics—
summation and differences—are required. The study of continuous variables requires
the continuous mathematics of the calculus—integrals and derivatives.
EXERCISES
Section 3.1 (1–10)
1. A concrete beam may fail either by shear (S) or flexure (F).
Suppose that three failed beams are randomly selected and
the type of failure is determined for each one. Let
X = the number of beams among the three selected that
failed by shear. List each outcome in the sample space along
with the associated value of X.
2. Give three examples of Bernoulli rv’s (other than those in the
text).
3. Using the experiment in Example 3.3, define two more
random variables and list the possible values of each.
4. Let X = the number of nonzero digits in a randomly selected
zip code. What are the possible values of X? Give three possible outcomes and their associated X values.
5. If the sample space S is an infinite set, does this necessarily imply that any rv X defined from S will have an infinite
set of possible values? If yes, say why. If no, give an
example.
6. Starting at a fixed time, each car entering an intersection is
observed to see whether it turns left (L), right (R), or goes
straight ahead (A). The experiment terminates as soon as a car
is observed to turn left. Let X = the number of cars
observed. What are possible X values? List five outcomes
and their associated X values.
7. For each random variable defined here, describe the set of
possible values for the variable, and state whether the variable is discrete.
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
CHAPTER 3
Discrete Random Variables and Probability Distributions
a. X = the number of unbroken eggs in a randomly chosen
standard egg carton
b. Y = the number of students on a class list for a particular
course who are absent on the first day of classes
c. U = the number of times a duffer has to swing at a golf
ball before hitting it
d. X = the length of a randomly selected rattlesnake
e. Z = the amount of royalties earned from the sale of a first
edition of 10,000 textbooks
f. Y = the pH of a randomly chosen soil sample
g. X = the tension (psi) at which a randomly selected tennis
racket has been strung
h. X = the total number of coin tosses required for three
individuals to obtain a match (HHH or TTT)
8. Each time a component is tested, the trial is a success (S) or
failure (F). Suppose the component is tested repeatedly until
a success occurs on three consecutive trials. Let Y denote the
number of trials necessary to achieve this. List all outcomes
corresponding to the five smallest possible values of Y, and
state which Y value is associated with each one.
9. An individual named Claudius is located at the point 0 in the
accompanying diagram.
[Diagram not reproduced; labeled points: 0, A1, A2, A3, A4, B1, B2, B3, B4]
Using an appropriate randomization device (such as a
tetrahedral die, one having four sides), Claudius first
moves to one of the four locations B1, B2, B3, B4. Once at
one of these locations, another randomization device is
used to decide whether Claudius next returns to 0 or next
visits one of the other two adjacent points. This process
then continues; after each move, another move to one of
the (new) adjacent points is determined by tossing an
appropriate die or coin.
a. Let X = the number of moves that Claudius makes
before first returning to 0. What are possible values of X?
Is X discrete or continuous?
b. If moves are allowed also along the diagonal paths connecting 0 to A1, A2, A3, and A4, respectively, answer the
questions in part (a).
10. The number of pumps in use at both a six-pump station and
a four-pump station will be determined. Give the possible
values for each of the following random variables:
a. T = the total number of pumps in use
b. X = the difference between the numbers in use at stations
1 and 2
c. U = the maximum number of pumps in use at either
station
d. Z = the number of stations having exactly two pumps
in use
3.2 Probability Distributions for Discrete Random Variables
Probabilities assigned to various outcomes in S in turn determine probabilities associated with the values of any particular rv X. The probability distribution of X says
how the total probability of 1 is distributed among (allocated to) the various possible X values. Suppose, for example, that a business has just purchased four laser
printers, and let X be the number among these that require service during the warranty period. Possible X values are then 0, 1, 2, 3, and 4. The probability distribution
will tell us how the probability of 1 is subdivided among these five possible values—
how much probability is associated with the X value 0, how much is apportioned to
the X value 1, and so on. We will use the following notation for the probabilities in
the distribution:
p(0) = the probability of the X value 0 = P(X = 0)
p(1) = the probability of the X value 1 = P(X = 1)
and so on. In general, p(x) will denote the probability assigned to the value x.
Example 3.7
The Cal Poly Department of Statistics has a lab with six computers reserved for statistics majors. Let X denote the number of these computers that are in use at a particular time of day. Suppose that the probability distribution of X is as given in the
PROPOSITION
For any two numbers a and b with a ≤ b,
P(a ≤ X ≤ b) = F(b) − F(a−)
where “a−” represents the largest possible X value that is strictly less than a.
In particular, if the only possible values are integers and if a and b are
integers, then
P(a ≤ X ≤ b) = P(X = a or a + 1 or . . . or b)
= F(b) − F(a − 1)
Taking a = b yields P(X = a) = F(a) − F(a − 1) in this case.
The reason for subtracting F(a−) rather than F(a) is that we want to include
P(X = a); F(b) − F(a) gives P(a < X ≤ b). This proposition will be used extensively when computing binomial and Poisson probabilities in Sections 3.4 and 3.6.
Example 3.15
Let X = the number of days of sick leave taken by a randomly selected employee of
a large company during a particular year. If the maximum number of allowable sick
days per year is 14, possible values of X are 0, 1, . . . , 14. With F(0) = .58,
F(1) = .72, F(2) = .76, F(3) = .81, F(4) = .88, and F(5) = .94,
P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = F(5) − F(1) = .22
and
P(X = 3) = F(3) − F(2) = .05
■
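The arithmetic of Example 3.15 can be checked with a short sketch. It uses only the six cdf values stated in the example; the function name interval_prob is an illustrative choice, not notation from the text, and the code assumes X takes only integer values so that the largest value below a is a − 1.

```python
# Cumulative probabilities F(x) stated in Example 3.15 (only F(0)..F(5) are given).
F = {0: 0.58, 1: 0.72, 2: 0.76, 3: 0.81, 4: 0.88, 5: 0.94}


def interval_prob(cdf, a, b):
    """P(a <= X <= b) for an integer-valued X, computed as F(b) - F(a - 1).

    F(a - 1) is taken to be 0 when a - 1 lies below the smallest tabulated value.
    """
    lower = 0.0 if a - 1 < min(cdf) else cdf[a - 1]
    return cdf[b] - lower


p_2_to_5 = round(interval_prob(F, 2, 5), 2)     # F(5) - F(1)
p_exactly_3 = round(interval_prob(F, 3, 3), 2)  # F(3) - F(2)
print(p_2_to_5, p_exactly_3)
```

Rounding to two decimals only removes floating-point noise; the tabulated cdf values themselves are given to two decimals.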
EXERCISES
Section 3.2 (11–28)
11. An automobile service facility specializing in engine
tune-ups knows that 45% of all tune-ups are done on four-cylinder automobiles, 40% on six-cylinder automobiles,
and 15% on eight-cylinder automobiles. Let X = the
number of cylinders on the next car to be tuned.
a. What is the pmf of X?
b. Draw both a line graph and a probability histogram for
the pmf of part (a).
c. What is the probability that the next car tuned has at
least six cylinders? More than six cylinders?
12. Airlines sometimes overbook flights. Suppose that for a
plane with 50 seats, 55 passengers have tickets. Define the
random variable Y as the number of ticketed passengers who
actually show up for the flight. The probability mass function of Y appears in the accompanying table.
y    | 45   46   47   48   49   50   51   52   53   54   55
p(y) | .05  .10  .12  .14  .25  .17  .06  .05  .03  .02  .01
a. What is the probability that the flight will accommodate
all ticketed passengers who show up?
b. What is the probability that not all ticketed passengers
who show up can be accommodated?
c. If you are the first person on the standby list (which
means you will be the first one to get on the plane if there
are any seats available after all ticketed passengers have
been accommodated), what is the probability that you
will be able to take the flight? What is this probability if
you are the third person on the standby list?
13. A mail-order computer business has six telephone lines. Let
X denote the number of lines in use at a specified time.
Suppose the pmf of X is as given in the accompanying table.
x    | 0    1    2    3    4    5    6
p(x) | .10  .15  .20  .25  .20  .06  .04
Calculate the probability of each of the following events.
a. {at most three lines are in use}
b. {fewer than three lines are in use}
c. {at least three lines are in use}
d. {between two and five lines, inclusive, are in use}
e. {between two and four lines, inclusive, are not in use}
f. {at least four lines are not in use}
14. A contractor is required by a county planning department to
submit one, two, three, four, or five forms (depending on the
nature of the project) in applying for a building permit. Let
Y = the number of forms required of the next applicant.
The probability that y forms are required is known to be proportional to y; that is, p(y) = ky for y = 1, . . . , 5.
a. What is the value of k? [Hint: Σ p(y) = 1, where the sum runs over y = 1, . . . , 5.]
b. What is the probability that at most three forms are
required?
c. What is the probability that between two and four forms
(inclusive) are required?
d. Could p(y) = y²/50 for y = 1, . . . , 5 be the pmf of Y?
15. Many manufacturers have quality control programs that include inspection of incoming materials for defects. Suppose a computer manufacturer receives computer boards in
lots of five. Two boards are selected from each lot for
inspection. We can represent possible outcomes of the selection process by pairs. For example, the pair (1, 2) represents
the selection of boards 1 and 2 for inspection.
a. List the ten different possible outcomes.
b. Suppose that boards 1 and 2 are the only defective
boards in a lot of five. Two boards are to be chosen at
random. Define X to be the number of defective boards
observed among those inspected. Find the probability
distribution of X.
c. Let F(x) denote the cdf of X. First determine F(0) =
P(X ≤ 0), F(1), and F(2); then obtain F(x) for all other x.
16. Some parts of California are particularly earthquake-prone.
Suppose that in one metropolitan area, 25% of all homeowners are insured against earthquake damage. Four homeowners are to be selected at random; let X denote the
number among the four who have earthquake insurance.
a. Find the probability distribution of X. [Hint: Let S denote
a homeowner who has insurance and F one who does
not. Then one possible outcome is SFSS, with probability
(.25)(.75)(.25)(.25) and associated X value 3. There are
15 other outcomes.]
b. Draw the corresponding probability histogram.
c. What is the most likely value for X?
d. What is the probability that at least two of the four
selected have earthquake insurance?
17. A new battery’s voltage may be acceptable (A) or unacceptable (U). A certain flashlight requires two batteries, so batteries will be independently selected and tested until two
acceptable ones have been found. Suppose that 90% of all
batteries have acceptable voltages. Let Y denote the number
of batteries that must be tested.
a. What is p(2), that is, P(Y = 2)?
b. What is p(3)? [Hint: There are two different outcomes
that result in Y = 3.]
c. To have Y = 5, what must be true of the fifth battery
selected? List the four outcomes for which Y = 5 and
then determine p(5).
d. Use the pattern in your answers for parts (a)–(c) to obtain
a general formula for p(y).
18. Two fair six-sided dice are tossed independently. Let
M = the maximum of the two tosses (so M(1,5) = 5,
M(3,3) = 3, etc.).
a. What is the pmf of M? [Hint: First determine p(1), then
p(2), and so on.]
b. Determine the cdf of M and graph it.
19. A library subscribes to two different weekly news magazines, each of which is supposed to arrive in Wednesday’s
mail. In actuality, each one may arrive on Wednesday,
Thursday, Friday, or Saturday. Suppose the two arrive independently of one another, and for each one P(Wed.) = .3,
P(Thurs.) = .4, P(Fri.) = .2, and P(Sat.) = .1. Let
Y = the number of days beyond Wednesday that it takes for
both magazines to arrive (so possible Y values are 0, 1, 2, or
3). Compute the pmf of Y. [Hint: There are 16 possible
outcomes; Y(W,W) = 0, Y(F,Th) = 2, and so on.]
20. Three couples and two single individuals have been invited
to an investment seminar and have agreed to attend.
Suppose the probability that any particular couple or individual arrives late is .4 (a couple will travel together in the
same vehicle, so either both people will be on time or else
both will arrive late). Assume that different couples and
individuals are on time or late independently of one
another. Let X = the number of people who arrive late for
the seminar.
a. Determine the probability mass function of X. [Hint:
label the three couples #1, #2, and #3 and the two individuals #4 and #5.]
b. Obtain the cumulative distribution function of X, and use
it to calculate P(2 ≤ X ≤ 6).
21. Suppose that you read through this year’s issues of the New
York Times and record each number that appears in a news
article—the income of a CEO, the number of cases of wine
produced by a winery, the total charitable contribution of a
politician during the previous tax year, the age of a
celebrity, and so on. Now focus on the leading digit of each
number, which could be 1, 2, . . . , 8, or 9. Your first thought
might be that the leading digit X of a randomly selected
number would be equally likely to be one of the nine possibilities (a discrete uniform distribution). However, much
empirical evidence as well as some theoretical arguments
suggest an alternative probability distribution called
Benford’s law:
p(x) = P(1st digit is x) = log₁₀((x + 1)/x)    x = 1, 2, . . . , 9
a. Without computing individual probabilities from this
formula, show that it specifies a legitimate pmf.
b. Now compute the individual probabilities and compare
to the corresponding discrete uniform distribution.
c. Obtain the cdf of X.
d. Using the cdf, what is the probability that the leading
digit is at most 3? At least 5?
[Note: Benford’s law is the basis for some auditing procedures used to detect fraud in financial reporting—for
example, by the Internal Revenue Service.]
22. Refer to Exercise 13, and calculate and graph the cdf F(x).
Then use it to calculate the probabilities of the events given
in parts (a)–(d) of that problem.
23. A consumer organization that evaluates new automobiles
customarily reports the number of major defects in each car
examined. Let X denote the number of major defects in a
randomly selected car of a certain type. The cdf of X is as
follows:
F(x) = 0      x < 0
       .06    0 ≤ x < 1
       .19    1 ≤ x < 2
       .39    2 ≤ x < 3
       .67    3 ≤ x < 4
       .92    4 ≤ x < 5
       .97    5 ≤ x < 6
       1      6 ≤ x
Calculate the following probabilities directly from the cdf:
a. p(2), that is, P(X = 2)
b. P(X > 3)
c. P(2 ≤ X ≤ 5)
d. P(2 < X < 5)
24. An insurance company offers its policyholders a number of
different premium payment options. For a randomly
selected policyholder, let X = the number of months
between successive payments. The cdf of X is as follows:
F(x) = 0      x < 1
       .30    1 ≤ x < 3
       .40    3 ≤ x < 4
       .45    4 ≤ x < 6
       .60    6 ≤ x < 12
       1      12 ≤ x
a. What is the pmf of X?
b. Using just the cdf, compute P(3 ≤ X ≤ 6) and P(4 ≤ X).
25. In Example 3.12, let Y = the number of girls born before
the experiment terminates. With p = P(B) and
1 − p = P(G), what is the pmf of Y? [Hint: First list the
possible values of Y, starting with the smallest, and proceed
until you see a general formula.]
26. Alvie Singer lives at 0 in the accompanying diagram and
has four friends who live at A, B, C, and D. One day Alvie
decides to go visiting, so he tosses a fair coin twice to
decide which of the four to visit. Once at a friend’s house,
he will either return home or else proceed to one of the
two adjacent houses (such as 0, A, or C when at B), with
each of the three possibilities having probability 1/3. In
this way, Alvie continues to visit friends until he
returns home.
[Diagram not reproduced; labeled points: 0, A, B, C, D]
a. Let X = the number of times that Alvie visits a friend.
Derive the pmf of X.
b. Let Y = the number of straight-line segments that Alvie
traverses (including those leading to and from 0). What
is the pmf of Y?
c. Suppose that female friends live at A and C and male
friends at B and D. If Z = the number of visits to female
friends, what is the pmf of Z?
27. After all students have left the classroom, a statistics professor notices that four copies of the text were left under
desks. At the beginning of the next lecture, the professor
distributes the four books in a completely random fashion
to each of the four students (1, 2, 3, and 4) who claim to
have left books. One possible outcome is that 1 receives 2’s
book, 2 receives 4’s book, 3 receives his or her own book,
and 4 receives 1’s book. This outcome can be abbreviated
as (2, 4, 3, 1).
a. List the other 23 possible outcomes.
b. Let X denote the number of students who receive their
own book. Determine the pmf of X.
28. Show that the cdf F(x) is a nondecreasing function; that is,
x₁ < x₂ implies that F(x₁) ≤ F(x₂). Under what condition
will F(x₁) = F(x₂)?
3.3 Expected Values
Consider a university having 15,000 students and let X = the number of courses for
which a randomly selected student is registered. The pmf of X follows. Since
p(1) = .01, we know that (.01) · (15,000) = 150 of the students are registered for
one course, and similarly for the other x values.
x                 | 1     2     3     4     5     6     7
p(x)              | .01   .03   .13   .25   .39   .17   .02
Number registered | 150   450   1950  3750  5850  2550  300
(3.6)
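As a rough illustration of how the table above leads to an average, the following sketch computes the mean number of courses per student in two ways: from the head counts, and directly from the pmf as a probability-weighted sum of the x values. The variable names are illustrative and are not notation from the text.

```python
# pmf of X = number of courses, and the student population, as given in the table.
pmf = {1: 0.01, 2: 0.03, 3: 0.13, 4: 0.25, 5: 0.39, 6: 0.17, 7: 0.02}
n_students = 15_000

# Number of students registered for x courses: p(x) * 15,000 (rounded to whole students).
counts = {x: round(p * n_students) for x, p in pmf.items()}

# Average computed from the head counts: total course registrations / number of students.
total_registrations = sum(x * c for x, c in counts.items())
avg_from_counts = total_registrations / n_students

# The same average computed from the pmf alone: the sum of x * p(x).
avg_from_pmf = sum(x * p for x, p in pmf.items())

print(counts)
print(avg_from_counts, avg_from_pmf)
```

The two averages agree because dividing each head count by 15,000 recovers p(x); the population size cancels out.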
The absolute value is necessary because a might be negative, yet a standard
deviation cannot be. Usually multiplication by a corresponds to a change in the unit
of measurement (e.g., kg to lb or dollars to euros). According to the first relation in
(3.14), the sd in the new unit is the original sd multiplied by the conversion factor.
The second relation says that adding or subtracting a constant does not impact variability; it just rigidly shifts the distribution to the right or left.
Example 3.26
In the computer sales scenario of Example 3.23, E(X) = 2 and
E(X²) = (0)²(.1) + (1)²(.2) + (2)²(.3) + (3)²(.4) = 5
so V(X) = 5 − (2)² = 1. The profit function h(X) = 800X − 900 then has variance
(800)² · V(X) = (640,000)(1) = 640,000 and standard deviation 800.
■
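The numbers in Example 3.26 can be reproduced directly from the definition of variance, without the shortcut rule for a linear function. The sketch below is one way to do that; the pmf values are the ones appearing in the E(X²) computation of the example, while the helper names expect, variance, and profit are illustrative assumptions.

```python
# pmf of X taken from the E(X^2) computation in Example 3.26.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}


def expect(pmf, h=lambda x: x):
    """E[h(X)] as the sum of h(x) * p(x) over the possible values."""
    return sum(h(x) * p for x, p in pmf.items())


def variance(pmf, h=lambda x: x):
    """V[h(X)] from the definition: E[(h(X) - E[h(X)])^2]."""
    mean = expect(pmf, h)
    return sum((h(x) - mean) ** 2 * p for x, p in pmf.items())


def profit(x):
    """Profit function h(X) = 800X - 900 from the example."""
    return 800 * x - 900


v_x = variance(pmf)               # variance of X itself
v_profit = variance(pmf, profit)  # variance of the profit, computed from the definition
sd_profit = v_profit ** 0.5

print(v_x, v_profit, sd_profit)
```

Up to floating-point rounding, v_profit equals 800² times v_x, and the shift by 900 has no effect on either the variance or the standard deviation.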
EXERCISES
Section 3.3 (29–45)
29. The pmf of the amount of memory X (GB) in a purchased
flash drive was given in Example 3.13 as
x    | 1    2    4    8    16
p(x) | .05  .10  .35  .40  .10
Compute the following:
a. E(X)
b. V(X) directly from the definition
c. The standard deviation of X
d. V(X) using the shortcut formula
30. An individual who has automobile insurance from a certain
company is randomly selected. Let Y be the number of moving violations for which the individual was cited during the
last 3 years. The pmf of Y is
y    | 0    1    2    3
p(y) | .60  .25  .10  .05
a. Compute E(Y).
b. Suppose an individual with Y violations incurs a surcharge of $100Y². Calculate the expected amount of the
surcharge.
31. Refer to Exercise 12 and calculate V(Y) and σ_Y. Then determine the probability that Y is within 1 standard deviation of
its mean value.
32. An appliance dealer sells three different models of upright
freezers having 13.5, 15.9, and 19.1 cubic feet of storage
space, respectively. Let X = the amount of storage space
purchased by the next customer to buy a freezer. Suppose
that X has pmf
x    | 13.5  15.9  19.1
p(x) | .2    .5    .3
a. Compute E(X), E(X²), and V(X).
b. If the price of a freezer having capacity X cubic feet is
25X − 8.5, what is the expected price paid by the next
customer to buy a freezer?
c. What is the variance of the price 25X − 8.5 paid by the
next customer?
d. Suppose that although the rated capacity of a freezer is
X, the actual capacity is h(X) = X − .01X². What is the
expected actual capacity of the freezer purchased by the
next customer?
33. Let X be a Bernoulli rv with pmf as in Example 3.18.
a. Compute E(X²).
b. Show that V(X) = p(1 − p).
c. Compute E(X⁷⁹).
34. Suppose that the number of plants of a particular type found
in a rectangular sampling region (called a quadrat by ecologists) in a certain geographic area is an rv X with pmf
p(x) = c/x³    x = 1, 2, 3, . . .
       0       otherwise
Is E(X) finite? Justify your answer (this is another distribution that statisticians would call heavy-tailed).
35. A small market orders copies of a certain magazine for its
magazine rack each week. Let X = demand for the magazine, with pmf
x    | 1     2     3     4     5     6
p(x) | 1/15  2/15  3/15  4/15  3/15  2/15
Suppose the store owner actually pays $2.00 for each copy of
the magazine and the price to customers is $4.00. If magazines
left at the end of the week have no salvage value, is it better to
order three or four copies of the magazine? [Hint: For both
three and four copies ordered, express net revenue as a function of demand X, and then compute the expected revenue.]
36. Let X be the damage incurred (in $) in a certain type of accident during a given year. Possible X values are 0, 1000,
5000, and 10000, with probabilities .8, .1, .08, and .02,
respectively. A particular company offers a $500 deductible
policy. If the company wishes its expected profit to be $100,
what premium amount should it charge?
37. The n candidates for a job have been ranked 1, 2, 3, . . . , n.
Let X = the rank of a randomly selected candidate, so that
X has pmf
p(x) = 1/n    x = 1, 2, 3, . . . , n
       0      otherwise
(this is called the discrete uniform distribution). Compute
E(X) and V(X) using the shortcut formula. [Hint: The sum
of the first n positive integers is n(n + 1)/2, whereas the
sum of their squares is n(n + 1)(2n + 1)/6.]
38. Let X = the outcome when a fair die is rolled once. If
before the die is rolled you are offered either (1/3.5) dollars
or h(X) = 1/X dollars, would you accept the guaranteed
amount or would you gamble? [Note: It is not generally true
that 1/E(X) = E(1/X).]
39. A chemical supply company currently has in stock 100 lb of
a certain chemical, which it sells to customers in 5-lb
batches. Let X = the number of batches ordered by a randomly chosen customer, and suppose that X has pmf
x    | 1   2   3   4
p(x) | .2  .4  .3  .1
Compute E(X) and V(X). Then compute the expected number of pounds left after the next customer’s order is shipped
and the variance of the number of pounds left. [Hint: The
number of pounds left is a linear function of X.]
40. a. Draw a line graph of the pmf of X in Exercise 35. Then
determine the pmf of −X and draw its line graph. From
these two pictures, what can you say about V(X) and
V(−X)?
b. Use the proposition involving V(aX + b) to establish a
general relationship between V(X) and V(−X).
41. Use the definition in Expression (3.13) to prove that
V(aX + b) = a² · σ²_X. [Hint: With h(X) = aX + b,
E[h(X)] = aμ + b where μ = E(X).]
42. Suppose E(X) = 5 and E[X(X − 1)] = 27.5. What is
a. E(X²)? [Hint: E[X(X − 1)] = E[X² − X] =
E(X²) − E(X).]
b. V(X)?
c. The general relationship among the quantities E(X),
E[X(X − 1)], and V(X)?
43. Write a general rule for E(X − c) where c is a constant.
What happens when you let c = μ, the expected value of X?
44. A result called Chebyshev’s inequality states that for any
probability distribution of an rv X and any number k that is
at least 1, P(|X − μ| ≥ kσ) ≤ 1/k². In words, the probability that the value of X lies at least k standard deviations
from its mean is at most 1/k².
a. What is the value of the upper bound for k = 2? k = 3?
k = 4? k = 5? k = 10?
b. Compute μ and σ for the distribution of Exercise 13.
Then evaluate P(|X − μ| ≥ kσ) for the values of k
given in part (a). What does this suggest about the upper
bound relative to the corresponding probability?
c. Let X have possible values −1, 0, and 1, with probabilities
1/18, 8/9, and 1/18, respectively. What is P(|X − μ| ≥ 3σ),
and how does it compare to the corresponding bound?
d. Give a distribution for which P(|X − μ| ≥ 5σ) = .04.
45. If a ≤ X ≤ b, show that a ≤ E(X) ≤ b.
3.4 The Binomial Probability Distribution
There are many experiments that conform either exactly or approximately to the following list of requirements:
1. The experiment consists of a sequence of n smaller experiments called trials,
where n is fixed in advance of the experiment.
2. Each trial can result in one of the same two possible outcomes (dichotomous
trials), which we generically denote by success (S) and failure (F).
3. The trials are independent, so that the outcome on any particular trial does not
influence the outcome on any other trial.
4. The probability of success P(S) is constant from trial to trial; we denote this
probability by p.
DEFINITION
An experiment for which Conditions 1–4 are satisfied is called a binomial
experiment.
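To make Conditions 1–4 concrete, here is a minimal simulation sketch of one binomial experiment. The values n = 10 and p = .75, the seed, and the function name run_binomial_experiment are all illustrative assumptions rather than anything prescribed by the text.

```python
import random


def run_binomial_experiment(n, p, rng):
    """Simulate one binomial experiment and return its outcome as a string of S's and F's.

    Condition 1: n is fixed before any trial is run.
    Condition 2: each trial results in either S or F.
    Condition 3: each trial uses a fresh, independent random draw.
    Condition 4: the same success probability p is used on every trial.
    """
    return "".join("S" if rng.random() < p else "F" for _ in range(n))


rng = random.Random(0)  # fixed seed so that the run is reproducible
outcome = run_binomial_experiment(10, 0.75, rng)
num_successes = outcome.count("S")
print(outcome, num_successes)
```

Sampling without replacement from a small population would violate Conditions 3 and 4, which is why the sketch draws a new random number for every trial.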
PROPOSITION
If X ~ Bin(n, p), then E(X) = np, V(X) = np(1 − p) = npq, and
σ_X = √(npq) (where q = 1 − p).
Thus, calculating the mean and variance of a binomial rv does not necessitate evaluating summations. The proof of the result for E(X) is sketched in Exercise 64.
Example 3.34
If 75% of all purchases at a certain store are made with a credit card and X is the number
among ten randomly selected purchases made with a credit card, then X ~ Bin(10, .75).
Thus E(X) = np = (10)(.75) = 7.5, V(X) = npq = 10(.75)(.25) = 1.875, and
σ = √1.875 = 1.37. Again, even though X can take on only integer values, E(X) need
not be an integer. If we perform a large number of independent binomial experiments,
each with n = 10 trials and p = .75, then the average number of S’s per experiment will
be close to 7.5.
The probability that X is within 1 standard deviation of its mean value is
P(7.5 − 1.37 ≤ X ≤ 7.5 + 1.37) = P(6.13 ≤ X ≤ 8.87) = P(X = 7 or 8) = .532.
■
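The figures in Example 3.34 can be checked numerically. The sketch below assumes the standard binomial pmf, b(x; n, p) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ, which is the formula that Exercise 46 refers to as b(x; n, p); the function and variable names are illustrative.

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.75
mean = n * p           # E(X) = np
var = n * p * (1 - p)  # V(X) = npq
sd = sqrt(var)         # standard deviation, the square root of npq

# P(mean - sd <= X <= mean + sd): only the integers 7 and 8 fall in [6.13, 8.87].
within_one_sd = sum(
    binom_pmf(x, n, p) for x in range(n + 1) if mean - sd <= x <= mean + sd
)

print(mean, var, round(sd, 2), round(within_one_sd, 3))
```

Summing over every x from 0 to n and filtering by the interval avoids hard-coding which integers lie within one standard deviation of the mean.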
EXERCISES
Section 3.4 (46–67)
46. Compute the following binomial probabilities directly from
the formula for b(x; n, p):
a. b(3; 8, .35)
b. b(5; 8, .6)
c. P(3 ≤ X ≤ 5) when n = 7 and p = .6
d. P(1 ≤ X) when n = 9 and p = .1
47. Use Appendix Table A.1 to obtain the following
probabilities:
a. B(4; 15, .3)
b. b(4; 15, .3)
c. b(6; 15, .7)
d. P(2 ≤ X ≤ 4) when X ~ Bin(15, .3)
e. P(2 ≤ X) when X ~ Bin(15, .3)
f. P(X ≤ 1) when X ~ Bin(15, .7)
g. P(2 < X < 6) when X ~ Bin(15, .3)
48. When circuit boards used in the manufacture of compact
disc players are tested, the long-run percentage of defectives
is 5%. Let X = the number of defective boards in a random
sample of size n = 25, so X ~ Bin(25, .05).
a. Determine P(X ≤ 2).
b. Determine P(X ≥ 5).
c. Determine P(1 ≤ X ≤ 4).
d. What is the probability that none of the 25 boards is
defective?
e. Calculate the expected value and standard deviation of X.
49. A company that produces fine crystal knows from experience that 10% of its goblets have cosmetic flaws and must
be classified as “seconds.”
a. Among six randomly selected goblets, how likely is it
that only one is a second?
b. Among six randomly selected goblets, what is the probability that at least two are seconds?
c. If goblets are examined one by one, what is the probability that at most five must be selected to find four that
are not seconds?
50. A particular telephone number is used to receive both voice
calls and fax messages. Suppose that 25% of the incoming
calls involve fax messages, and consider a sample of 25
incoming calls. What is the probability that
a. At most 6 of the calls involve a fax message?
b. Exactly 6 of the calls involve a fax message?
c. At least 6 of the calls involve a fax message?
d. More than 6 of the calls involve a fax message?
51. Refer to the previous exercise.
a. What is the expected number of calls among the 25 that
involve a fax message?
b. What is the standard deviation of the number among the
25 calls that involve a fax message?
c. What is the probability that the number of calls among
the 25 that involve a fax transmission exceeds the
expected number by more than 2 standard deviations?
52. Suppose that 30% of all students who have to buy a text for
a particular course want a new copy (the successes!),
whereas the other 70% want a used copy. Consider randomly selecting 25 purchasers.
a. What are the mean value and standard deviation of the
number who want a new copy of the book?
b. What is the probability that the number who want new
copies is more than two standard deviations away from
the mean value?
c. The bookstore has 15 new copies and 15 used copies in
stock. If 25 people come in one by one to purchase this
text, what is the probability that all 25 will get the type
of book they want from current stock? [Hint: Let
X = the number who want a new copy. For what values
of X will all 25 get what they want?]
d. Suppose that new copies cost $100 and used copies cost
$70. Assume the bookstore currently has 50 new copies
and 50 used copies. What is the expected value of total revenue from the sale of the next 25 copies purchased? Be sure
to indicate what rule of expected value you are using.
[Hint: Let h(X) = the revenue when X of the 25 purchasers want new copies. Express this as a linear function.]
53. Exercise 30 (Section 3.3) gave the pmf of Y, the number of
traffic citations for a randomly selected individual insured
by a particular company. What is the probability that among
15 randomly chosen such individuals
a. At least 10 have no citations?
b. Fewer than half have at least one citation?
c. The number that have at least one citation is between 5
and 10, inclusive?*
54. A particular type of tennis racket comes in a midsize version
and an oversize version. Sixty percent of all customers at a
certain store want the oversize version.
a. Among ten randomly selected customers who want this
type of racket, what is the probability that at least six
want the oversize version?
b. Among ten randomly selected customers, what is the
probability that the number who want the oversize version
is within 1 standard deviation of the mean value?
c. The store currently has seven rackets of each version.
What is the probability that all of the next ten customers
who want this racket can get the version they want from
current stock?
55. Twenty percent of all telephones of a certain type are submitted for service while under warranty. Of these, 60% can
be repaired, whereas the other 40% must be replaced with
new units. If a company purchases ten of these telephones,
what is the probability that exactly two will end up being
replaced under warranty?
56. The College Board reports that 2% of the 2 million high
school students who take the SAT each year receive special
accommodations because of documented disabilities (Los
Angeles Times, July 16, 2002). Consider a random sample
of 25 students who have recently taken the test.
a. What is the probability that exactly 1 received a special
accommodation?
b. What is the probability that at least 1 received a special
accommodation?
c. What is the probability that at least 2 received a special
accommodation?
d. What is the probability that the number among the 25
who received a special accommodation is within 2
standard deviations of the number you would expect to
be accommodated?
* “Between a and b, inclusive” is equivalent to (a ≤ X ≤ b).
e. Suppose that a student who does not receive a special
accommodation is allowed 3 hours for the exam,
whereas an accommodated student is allowed 4.5 hours.
What would you expect the average time allowed the 25
selected students to be?
57. Suppose that 90% of all batteries from a certain supplier
have acceptable voltages. A certain type of flashlight
requires two type-D batteries, and the flashlight will work
only if both its batteries have acceptable voltages. Among
ten randomly selected flashlights, what is the probability
that at least nine will work? What assumptions did you
make in the course of answering the question posed?
58. A very large batch of components has arrived at a distributor. The batch can be characterized as acceptable only if the
proportion of defective components is at most .10. The
distributor decides to randomly select 10 components and to
accept the batch only if the number of defective components
in the sample is at most 2.
a. What is the probability that the batch will be accepted
when the actual proportion of defectives is .01? .05? .10?
.20? .25?
b. Let p denote the actual proportion of defectives in the
batch. A graph of P(batch is accepted) as a function of p,
with p on the horizontal axis and P(batch is accepted) on
the vertical axis, is called the operating characteristic
curve for the acceptance sampling plan. Use the results
of part (a) to sketch this curve for 0 # p # 1.
c. Repeat parts (a) and (b) with “1” replacing “2” in the
acceptance sampling plan.
d. Repeat parts (a) and (b) with “15” replacing “10” in the
acceptance sampling plan.
e. Which of the three sampling plans, that of part (a), (c), or
(d), appears most satisfactory, and why?
59. An ordinance requiring that a smoke detector be installed in
all previously constructed houses has been in effect in a particular city for 1 year. The fire department is concerned that
many houses remain without detectors. Let p = the true
proportion of such houses having detectors, and suppose
that a random sample of 25 homes is inspected. If the
sample strongly indicates that fewer than 80% of all houses
have a detector, the fire department will campaign for a
mandatory inspection program. Because of the costliness of
the program, the department prefers not to call for such
inspections unless sample evidence strongly argues for their
necessity. Let X denote the number of homes with detectors
among the 25 sampled. Consider rejecting the claim that
p ≥ .8 if x ≤ 15.
a. What is the probability that the claim is rejected when
the actual value of p is .8?
b. What is the probability of not rejecting the claim when
p = .7? When p = .6?
c. How do the “error probabilities” of parts (a) and (b) change
if the value 15 in the decision rule is replaced by 14?
60. A toll bridge charges $1.00 for passenger cars and $2.50
for other vehicles. Suppose that during daytime hours, 60%
of all vehicles are passenger cars. If 25 vehicles cross the
bridge during a particular daytime period, what is the
resulting expected toll revenue? [Hint: Let X = the number
of passenger cars; then the toll revenue h(X) is a linear
function of X.]
61. A student who is trying to write a paper for a course has a
choice of two topics, A and B. If topic A is chosen, the
student will order two books through interlibrary loan,
whereas if topic B is chosen, the student will order four
books. The student believes that a good paper necessitates
receiving and using at least half the books ordered for either
topic chosen. If the probability that a book ordered through
interlibrary loan actually arrives in time is .9 and books
arrive independently of one another, which topic should the
student choose to maximize the probability of writing a
good paper? What if the arrival probability is only .5 instead
of .9?
62. a. For fixed n, are there values of p (0 ≤ p ≤ 1) for which
V(X) = 0? Explain why this is so.
b. For what value of p is V(X) maximized? [Hint: Either
graph V(X) as a function of p or else take a derivative.]
63. a. Show that b(x; n, 1 − p) = b(n − x; n, p).
b. Show that B(x; n, 1 − p) = 1 − B(n − x − 1; n, p).
[Hint: At most x S’s is equivalent to at least (n − x) F’s.]
c. What do parts (a) and (b) imply about the necessity of
including values of p greater than .5 in Appendix Table A.1?
64. Show that E(X) = np when X is a binomial random
variable. [Hint: First express E(X) as a sum with lower limit
x = 1. Then factor out np, let y = x − 1 so that the sum is
from y = 0 to y = n − 1, and show that the sum equals 1.]
65. Customers at a gas station pay with a credit card (A), debit
card (B), or cash (C). Assume that successive customers
make independent choices, with P(A) = .5, P(B) = .2, and
P(C) = .3.
a. Among the next 100 customers, what are the mean and
variance of the number who pay with a debit card?
Explain your reasoning.
b. Answer part (a) for the number among the 100 who don’t
pay with cash.
66. An airport limousine can accommodate up to four passengers
on any one trip. The company will accept a maximum of six
reservations for a trip, and a passenger must have a reservation. From previous records, 20% of all those making
reservations do not appear for the trip. Answer the following
questions, assuming independence wherever appropriate.
a. If six reservations are made, what is the probability that
at least one individual with a reservation cannot be
accommodated on the trip?
b. If six reservations are made, what is the expected number of available places when the limousine departs?
c. Suppose the probability distribution of the number of
reservations made is given in the accompanying table.
Number of reservations    3     4     5     6
Probability               .1    .2    .3    .4
Let X denote the number of passengers on a randomly
selected trip. Obtain the probability mass function of X.
67. Refer to Chebyshev’s inequality given in Exercise 44.
Calculate P(|X − μ| ≥ kσ) for k = 2 and k = 3 when
X ~ Bin(20, .5), and compare to the corresponding upper
bound. Repeat for X ~ Bin(20, .75).
3.5 Hypergeometric and Negative Binomial Distributions
The hypergeometric and negative binomial distributions are both related to the
binomial distribution. The binomial distribution is the approximate probability
model for sampling without replacement from a finite dichotomous (S–F) population provided the sample size n is small relative to the population size N; the
hypergeometric distribution is the exact probability model for the number of S’s in
the sample. The binomial rv X is the number of S’s when the number n of trials is
fixed, whereas the negative binomial distribution arises from fixing the number of
S’s desired and letting the number of trials be random.
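The claim that the binomial model approximates sampling without replacement when n is small relative to N can be explored numerically. The Python sketch below assumes the standard hypergeometric pmf, C(M, x)·C(N − M, n − x)/C(N, n), where M denotes the number of S’s in the population; the function names and the population sizes are hypothetical choices made for illustration.

```python
from math import comb


def hypergeom_pmf(x: int, n: int, M: int, N: int) -> float:
    """P(X = x) when n items are drawn without replacement from N items, M of them S's."""
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_gap(N: int, M: int, n: int) -> float:
    """Largest pointwise difference between the exact pmf and the binomial with p = M/N."""
    p = M / N
    return max(
        abs(hypergeom_pmf(x, n, M, N) - binom_pmf(x, n, p)) for x in range(n + 1)
    )


# Hypothetical populations with the same proportion of S's (30%) and the same n.
gap_large = max_gap(N=1000, M=300, n=10)  # sample is 1% of the population
gap_small = max_gap(N=20, M=6, n=10)      # sample is half of the population

print(f"n/N = 0.01: max difference {gap_large:.4f}")
print(f"n/N = 0.50: max difference {gap_small:.4f}")
```

With the same proportion of S’s, the discrepancy should be much smaller when the sample is a small fraction of the population, which is the situation in which the binomial approximation is recommended.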
The Hypergeometric Distribution
The assumptions leading to the hypergeometric distribution are as follows:
1. The population or set to be sampled consists of N individuals, objects, or
elements (a finite population).
EXERCISES
Section 3.5 (68–78)
68. An electronics store has received a shipment of 20 table
radios that have connections for an iPod or iPhone. Twelve
of these have two slots (so they can accommodate both
devices), and the other eight have a single slot. Suppose that
six of the 20 radios are randomly selected to be stored under
a shelf where the radios are displayed, and the remaining
ones are placed in a storeroom. Let X = the number among
the radios stored under the display shelf that have two slots.
a. What kind of a distribution does X have (name and values of all parameters)?
b. Compute P(X = 2), P(X ≤ 2), and P(X ≥ 2).
c. Calculate the mean value and standard deviation of X.
69. Each of 12 refrigerators of a certain type has been returned
to a distributor because of an audible, high-pitched, oscillating noise when the refrigerators are running. Suppose that
7 of these refrigerators have a defective compressor and the
other 5 have less serious problems. If the refrigerators
are examined in random order, let X be the number among
the first 6 examined that have a defective compressor.
Compute the following:
a. P(X = 5)
b. P(X ≤ 4)
c. The probability that X exceeds its mean value by more
than 1 standard deviation.
d. Consider a large shipment of 400 refrigerators, of which
40 have defective compressors. If X is the number among
15 randomly selected refrigerators that have defective
compressors, describe a less tedious way to calculate (at
least approximately) P(X ≤ 5) than to use the hypergeometric pmf.
70. An instructor who taught two sections of engineering statistics last term, the first with 20 students and the second with
30, decided to assign a term project. After all projects had
been turned in, the instructor randomly ordered them before
grading. Consider the first 15 graded projects.
a. What is the probability that exactly 10 of these are from
the second section?
b. What is the probability that at least 10 of these are from
the second section?
c. What is the probability that at least 10 of these are from
the same section?
d. What are the mean value and standard deviation of the
number among these 15 that are from the second section?
e. What are the mean value and standard deviation of the
number of projects not among these first 15 that are from
the second section?
71. A geologist has collected 10 specimens of basaltic rock and
10 specimens of granite. The geologist instructs a laboratory assistant to randomly select 15 of the specimens for
analysis.
a. What is the pmf of the number of granite specimens
selected for analysis?
b. What is the probability that all specimens of one of the
two types of rock are selected for analysis?
c. What is the probability that the number of granite specimens selected for analysis is within 1 standard deviation
of its mean value?
72. A personnel director interviewing 11 senior engineers for
four job openings has scheduled six interviews for the first
day and five for the second day of interviewing. Assume
that the candidates are interviewed in random order.
a. What is the probability that x of the top four candidates
are interviewed on the first day?
b. How many of the top four candidates can be expected to
be interviewed on the first day?
73. Twenty pairs of individuals playing in a bridge tournament
have been seeded 1, . . . , 20. In the first part of the tournament, the 20 are randomly divided into 10 east–west pairs
and 10 north–south pairs.
a. What is the probability that x of the top 10 pairs end up
playing east–west?
b. What is the probability that all of the top five pairs end
up playing the same direction?
c. If there are 2n pairs, what is the pmf of X = the number
among the top n pairs who end up playing east–west?
What are E(X) and V(X)?
74. A second-stage smog alert has been called in a certain area
of Los Angeles County in which there are 50 industrial
firms. An inspector will visit 10 randomly selected firms to
check for violations of regulations.
a. If 15 of the firms are actually violating at least one
regulation, what is the pmf of the number of firms visited
by the inspector that are in violation of at least one
regulation?
b. If there are 500 firms in the area, of which 150 are in violation, approximate the pmf of part (a) by a simpler pmf.
c. For X = the number among the 10 visited that are in violation, compute E(X) and V(X) both for the exact pmf and
the approximating pmf in part (b).
75. Suppose that p = P(male birth) = .5. A couple wishes to
have exactly two female children in their family. They will
have children until this condition is fulfilled.
a. What is the probability that the family has x male
children?
b. What is the probability that the family has four children?
c. What is the probability that the family has at most four
children?
d. How many male children would you expect this family
to have? How many children would you expect this
family to have?
76. A family decides to have children until it has three children
of the same gender. Assuming P(B) = P(G) = .5, what is
the pmf of X = the number of children in the family?
77. Three brothers and their wives decide to have children until
each family has two female children. What is the pmf of
X = the total number of male children born to the brothers?
What is E(X), and how does it compare to the expected
number of male children born to each brother?
78. According to the article “Characterizing the Severity and
Risk of Drought in the Poudre River, Colorado” (J. of Water
Res. Planning and Mgmnt., 2005: 383–393), the drought
length Y is the number of consecutive time intervals in
which the water supply remains below a critical value y_0 (a
deficit), preceded by and followed by periods in which the
supply exceeds this critical value (a surplus). The cited
paper proposes a geometric distribution with p = .409 for
this random variable.
a. What is the probability that a drought lasts exactly 3
intervals? At most 3 intervals?
b. What is the probability that the length of a drought
exceeds its mean value by at least one standard
deviation?
3.6 The Poisson Probability Distribution
The binomial, hypergeometric, and negative binomial distributions were all derived
by starting with an experiment consisting of trials or draws and applying the laws of
probability to various outcomes of the experiment. There is no simple experiment on
which the Poisson distribution is based, though we will shortly describe how it can
be obtained by certain limiting operations.
DEFINITION
A discrete random variable X is said to have a Poisson distribution with
parameter μ (μ > 0) if the pmf of X is
\[ p(x; \mu) = \frac{e^{-\mu} \cdot \mu^{x}}{x!}, \qquad x = 0, 1, 2, 3, \ldots \]
It is no accident that we are using the symbol μ for the Poisson parameter; we shall
see shortly that μ is in fact the expected value of X. The letter e in the pmf represents
the base of the natural logarithm system; its numerical value is approximately
2.71828. In contrast to the binomial and hypergeometric distributions, the Poisson
distribution spreads probability over all non-negative integers, an infinite number of
possibilities.
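The Poisson pmf can be evaluated directly from its definition. The Python sketch below is illustrative only: the function name poisson_pmf, the parameter value 2.0, and the cutoff of 50 terms are assumptions made here. It adds up the first terms of the pmf to suggest that the probabilities total 1 and that the mean equals the parameter.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu) = e^(-mu) * mu^x / x! for x = 0, 1, 2, ..."""
    return exp(-mu) * mu**x / factorial(x)


mu = 2.0  # hypothetical parameter value, chosen only for illustration

# The distribution has infinitely many possible values, so truncate the sums;
# for mu = 2 the terms beyond x = 50 are far below floating-point precision.
terms = [poisson_pmf(x, mu) for x in range(51)]
total = sum(terms)
mean = sum(x * px for x, px in enumerate(terms))

print(f"sum of p(x; mu) for x = 0..50: {total:.12f}")
print(f"sum of x * p(x; mu) for x = 0..50: {mean:.12f}")
```

A truncated sum is only a numerical check, not a proof; the exact statements follow from the series expansion of the exponential function.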
It is not obvious by inspection that p(x; μ) specifies a legitimate pmf, let alone
that this distribution is useful. First of all, p(x; μ) > 0 for every possible x value
because of the requirement that μ > 0. The fact that Σ p(x; μ) = 1 is a consequence
of the Maclaurin series expansion of e^μ (check your calculus book for this result):
\[ e^{\mu} = 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots = \sum_{x=0}^{\infty} \frac{\mu^x}{x!} \tag{3.18} \]
If the two extreme terms in (3.18) are multiplied by e^(−μ) and then this quantity is
moved inside the summation on the far right, the result is
\[ 1 = \sum_{x=0}^{\infty} \frac{e^{-\mu} \cdot \mu^x}{x!} \]
Example 3.39
Let X denote the number of creatures of a particular type captured in a trap during a
given time period. Suppose that X has a Poisson distribution with μ = 4.5, so on
average traps will contain 4.5 creatures. [The article “Dispersal Dynamics of the
EXERCISES
Section 3.6 (79–93)
79. Let X, the number of flaws on the surface of a randomly
selected boiler of a certain type, have a Poisson distribution
with parameter μ = 5. Use Appendix Table A.2 to compute
the following probabilities:
a. P(X ≤ 8)
b. P(X = 8)
c. P(9 ≤ X)
d. P(5 ≤ X ≤ 8)
e. P(5 < X < 8)
80. Let X be the number of material anomalies occurring in a
particular region of an aircraft gas-turbine disk. The article
“Methodology for Probabilistic Life Prediction of Multiple-Anomaly Materials” (Amer. Inst. of Aeronautics and
Astronautics J., 2006: 787–793) proposes a Poisson distribution for X. Suppose that μ = 4.
a. Compute both P(X ≤ 4) and P(X < 4).
b. Compute P(4 ≤ X ≤ 8).
c. Compute P(8 ≤ X).
d. What is the probability that the number of anomalies
exceeds its mean value by no more than one standard
deviation?
81. Suppose that the number of drivers who travel between a
particular origin and destination during a designated time
period has a Poisson distribution with parameter μ = 20
(suggested in the article “Dynamic Ride Sharing: Theory
and Practice,” J. of Transp. Engr., 1997: 308–312). What is
the probability that the number of drivers will
a. Be at most 10?
b. Exceed 20?
c. Be between 10 and 20, inclusive? Be strictly between 10
and 20?
d. Be within 2 standard deviations of the mean value?
82. Consider writing onto a computer disk and then sending it
through a certifier that counts the number of missing pulses.
Suppose this number X has a Poisson distribution with
parameter μ = .2. (Suggested in “Average Sample Number
for Semi-Curtailed Sampling Using the Poisson Distribution,” J. Quality Technology, 1983: 126–129.)
a. What is the probability that a disk has exactly one missing pulse?
b. What is the probability that a disk has at least two missing pulses?
c. If two disks are independently selected, what is the probability that neither contains a missing pulse?
83. An article in the Los Angeles Times (Dec. 3, 1993) reports
that 1 in 200 people carry the defective gene that causes
inherited colon cancer. In a sample of 1000 individuals,
what is the approximate distribution of the number who
carry this gene? Use this distribution to calculate the
approximate probability that
a. Between 5 and 8 (inclusive) carry the gene.
b. At least 8 carry the gene.
84. Suppose that only .10% of all computers of a certain type
experience CPU failure during the warranty period. Consider a sample of 10,000 computers.
a. What are the expected value and standard deviation of
the number of computers in the sample that have the
defect?
b. What is the (approximate) probability that more than 10
sampled computers have the defect?
c. What is the (approximate) probability that no sampled
computers have the defect?
85. Suppose small aircraft arrive at a certain airport according
to a Poisson process with rate α = 8 per hour, so that the
number of arrivals during a time period of t hours is a
Poisson rv with parameter μ = 8t.
a. What is the probability that exactly 6 small aircraft arrive
during a 1-hour period? At least 6? At least 10?
b. What are the expected value and standard deviation of
the number of small aircraft that arrive during a 90-min
period?
c. What is the probability that at least 20 small aircraft
arrive during a 2.5-hour period? That at most 10
arrive during this period?
86. The number of people arriving for treatment at an emergency room can be modeled by a Poisson process with a rate
parameter of five per hour.
a. What is the probability that exactly four arrivals occur
during a particular hour?
b. What is the probability that at least four people arrive
during a particular hour?
c. How many people do you expect to arrive during a 45-min period?
87. The number of requests for assistance received by a towing
service is a Poisson process with rate α = 4 per hour.
a. Compute the probability that exactly ten requests are
received during a particular 2-hour period.
b. If the operators of the towing service take a 30-min break
for lunch, what is the probability that they do not miss
any calls for assistance?
c. How many calls would you expect during their break?
88. In proof testing of circuit boards, the probability that any
particular diode will fail is .01. Suppose a circuit board contains 200 diodes.
a. How many diodes would you expect to fail, and what is
the standard deviation of the number that are expected to
fail?
b. What is the (approximate) probability that at least four
diodes will fail on a randomly selected board?
c. If five boards are shipped to a particular customer, how
likely is it that at least four of them will work properly?
(A board works properly only if all its diodes work.)
89. The article “Reliability-Based Service-Life Assessment of
Aging Concrete Structures” (J. Structural Engr., 1993:
1600–1621) suggests that a Poisson process can be used to
represent the occurrence of structural loads over time. Suppose
the mean time between occurrences of loads is .5 year.
a. How many loads can be expected to occur during a 2-year period?
b. What is the probability that more than five loads occur
during a 2-year period?
c. How long must a time period be so that the probability of
no loads occurring during that period is at most .1?
90. Let X have a Poisson distribution with parameter μ. Show that
E(X) = μ directly from the definition of expected value.
[Hint: The first term in the sum equals 0, and then x can be canceled. Now factor out μ and show that what is left sums to 1.]
91. Suppose that trees are distributed in a forest according to a
two-dimensional Poisson process with parameter α, the
expected number of trees per acre, equal to 80.
a. What is the probability that in a certain quarter-acre plot,
there will be at most 16 trees?
b. If the forest covers 85,000 acres, what is the expected
number of trees in the forest?
c. Suppose you select a point in the forest and construct a
circle of radius .1 mile. Let X = the number of trees
within that circular region. What is the pmf of X? [Hint:
1 sq mile = 640 acres.]
92. Automobiles arrive at a vehicle equipment inspection station according to a Poisson process with rate α = 10 per
hour. Suppose that with probability .5 an arriving vehicle
will have no equipment violations.
a. What is the probability that exactly ten arrive during the
hour and all ten have no violations?
b. For any fixed y ≥ 10, what is the probability that y arrive
during the hour, of which ten have no violations?
c. What is the probability that ten “no-violation” cars arrive
during the next hour? [Hint: Sum the probabilities in part
(b) from y = 10 to ∞.]
93. a. In a Poisson process, what has to happen in both the time
interval (0, t) and the interval (t, t + Δt) so that no
events occur in the entire interval (0, t + Δt)? Use this
and Assumptions 1–3 to write a relationship between
P_0(t + Δt) and P_0(t).
b. Use the result of part (a) to write an expression for the
difference P_0(t + Δt) − P_0(t). Then divide by Δt and let
Δt → 0 to obtain an equation involving (d/dt)P_0(t), the
derivative of P_0(t) with respect to t.
c. Verify that P_0(t) = e^(−αt) satisfies the equation of part (b).
d. It can be shown in a manner similar to parts (a) and (b) that
the P_k(t)s must satisfy the system of differential equations
\[ \frac{d}{dt} P_k(t) = \alpha P_{k-1}(t) - \alpha P_k(t), \qquad k = 1, 2, 3, \ldots \]
Verify that P_k(t) = e^(−αt)(αt)^k/k! satisfies the system. (This
is actually the only solution.)
SUPPLEMENTARY EXERCISES
(94–122)
94. Consider a deck consisting of seven cards, marked 1, 2, . . . ,
7. Three of these cards are selected at random. Define an rv
W by W = the sum of the resulting numbers, and compute
the pmf of W. Then compute μ and σ². [Hint: Consider outcomes as unordered, so that (1, 3, 7) and (3, 1, 7) are not
different outcomes. Then there are 35 outcomes, and they
can be listed. (This type of rv actually arises in connection
with a statistical procedure called Wilcoxon’s rank-sum test,
in which there is an x sample and a y sample and W is the
sum of the ranks of the x’s in the combined sample; see
Section 15.2.)]
95. After shuffling a deck of 52 cards, a dealer deals out 5. Let
X = the number of suits represented in the five-card hand.
a. Show that the pmf of X is
x       1      2      3      4
p(x)    .002   .146   .588   .264
[Hint: p(1) = 4P(all are spades), p(2) = 6P(only spades
and hearts with at least one of each suit), and p(4)
= 4P(2 spades ∩ one of each other suit).]
b. Compute μ, σ², and σ.
96. The negative binomial rv X was defined as the number of
F’s preceding the rth S. Let Y = the number of trials necessary to obtain the rth S. In the same manner in which the
pmf of X was derived, derive the pmf of Y.
97. Of all customers purchasing automatic garage-door openers,
75% purchase a chain-driven model. Let X = the number
among the next 15 purchasers who select the chain-driven
model.
a. What is the pmf of X?
b. Compute P(X > 10).
c. Compute P(6 ≤ X ≤ 10).
d. Compute μ and σ².
e. If the store currently has in stock 10 chain-driven models
and 8 shaft-driven models, what is the probability that
the requests of these 15 customers can all be met from
existing stock?
98. A friend recently planned a camping trip. He had two flashlights, one that required a single 6-V battery and another
that used two size-D batteries. He had previously packed
two 6-V and four size-D batteries in his camper. Suppose
the probability that any particular battery works is p and that
batteries work or fail independently of one another. Our
friend wants to take just one flashlight. For what values of p
should he take the 6-V flashlight?
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
99. A k-out-of-n system is one that will function if and only if
at least k of the n individual components in the system
function. If individual components function independently
of one another, each with probability .9, what is the probability that a 3-out-of-5 system functions?
100. A manufacturer of integrated circuit chips wishes to control the quality of its product by rejecting any batch in
which the proportion of defective chips is too high. To this
end, out of each batch (10,000 chips), 25 will be selected
and tested. If at least 5 of these 25 are defective, the entire
batch will be rejected.
a. What is the probability that a batch will be rejected if
5% of the chips in the batch are in fact defective?
b. Answer the question posed in (a) if the percentage of
defective chips in the batch is 10%.
c. Answer the question posed in (a) if the percentage of
defective chips in the batch is 20%.
d. What happens to the probabilities in (a)–(c) if the critical rejection number is increased from 5 to 6?
101. Of the people passing through an airport metal detector,
.5% activate it; let X = the number among a randomly
selected group of 500 who activate the detector.
a. What is the (approximate) pmf of X?
b. Compute $P(X = 5)$.
c. Compute $P(5 \le X)$.
102. An educational consulting firm is trying to decide whether
high school students who have never before used a handheld calculator can solve a certain type of problem more
easily with a calculator that uses reverse Polish logic or
one that does not use this logic. A sample of 25 students is
selected and allowed to practice on both calculators. Then
each student is asked to work one problem on the reverse
Polish calculator and a similar problem on the other. Let
p = P(S), where S indicates that a student worked the
problem more quickly using reverse Polish logic than without, and let X = number of S’s.
a. If p = .5, what is $P(7 \le X \le 18)$?
b. If p = .8, what is $P(7 \le X \le 18)$?
c. If the claim that p = .5 is to be rejected when either
$x \le 7$ or $x \ge 18$, what is the probability of rejecting the
claim when it is actually correct?
d. If the decision to reject the claim p = .5 is made as in
part (c), what is the probability that the claim is not
rejected when p = .6? When p = .8?
e. What decision rule would you choose for rejecting the
claim p = .5 if you wanted the probability in part (c) to
be at most .01?
103. Consider a disease whose presence can be identified by
carrying out a blood test. Let p denote the probability that
a randomly selected individual has the disease. Suppose n
individuals are independently selected for testing. One way
to proceed is to carry out a separate test on each of the n
blood samples. A potentially more economical approach,
group testing, was introduced during World War II to identify syphilitic men among army inductees. First, take a part
of each blood sample, combine these specimens, and carry
out a single test. If no one has the disease, the result will be
negative, and only the one test is required. If at least one
individual is diseased, the test on the combined sample will
yield a positive result, in which case the n individual tests
are then carried out. If p = .1 and n = 3, what is the
expected number of tests using this procedure? What is the
expected number when n = 5? [The article “Random
Multiple-Access Communication and Group Testing”
(IEEE Trans. on Commun., 1984: 769–774) applied these
ideas to a communication system in which the dichotomy
was active/idle user rather than diseased/nondiseased.]
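The group-testing procedure described in Exercise 103 can be expressed as a short function. The sketch below is illustrative only: the function name `expected_tests` is an assumption, and the values p = .05 and n = 4 are hypothetical rather than those of the exercise. It assumes, as the exercise does, that individuals are independent and that the pooled test is positive exactly when at least one individual is diseased.

```python
def expected_tests(p: float, n: int) -> float:
    """Expected number of tests under the pooled-then-individual scheme."""
    prob_pool_negative = (1 - p) ** n      # nobody in the group has the disease
    # One test suffices if the pool is negative; otherwise the pooled test
    # is followed by n individual tests, for n + 1 tests in total.
    return 1 * prob_pool_negative + (n + 1) * (1 - prob_pool_negative)


# Hypothetical inputs chosen for illustration
print(expected_tests(0.05, 4))
```

Comparing the returned value with n, the number of tests needed when every sample is tested separately, shows whether pooling is worthwhile for a given p.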
104. Let $p_1$ denote the probability that any particular code symbol is erroneously transmitted through a communication
system. Assume that on different symbols, errors occur
independently of one another. Suppose also that with probability $p_2$ an erroneous symbol is corrected upon receipt.
Let X denote the number of correct symbols in a message
block consisting of n symbols (after the correction process
has ended). What is the probability distribution of X?
105. The purchaser of a power-generating unit requires c consecutive successful start-ups before the unit will be
accepted. Assume that the outcomes of individual start-ups
are independent of one another. Let p denote the probability that any particular start-up is successful. The random
variable of interest is X = the number of start-ups that
must be made prior to acceptance. Give the pmf of X for
the case c = 2. If p = .9, what is $P(X \le 8)$? [Hint: For
$x \ge 5$, express p(x) “recursively” in terms of the pmf evaluated at the smaller values $x - 3, x - 4, \ldots, 2$.] (This
problem was suggested by the article “Evaluation of a
Start-Up Demonstration Test,” J. Quality Technology,
1983: 103–106.)
106. A plan for an executive travelers’ club has been developed
by an airline on the premise that 10% of its current customers would qualify for membership.
a. Assuming the validity of this premise, among 25 randomly selected current customers, what is the probability that between 2 and 6 (inclusive) qualify for
membership?
b. Again assuming the validity of the premise, what are
the expected number of customers who qualify and the
standard deviation of the number who qualify in a random sample of 100 current customers?
c. Let X denote the number in a random sample of 25 current customers who qualify for membership. Consider
rejecting the company’s premise in favor of the claim
that $p > .10$ if $x \ge 7$. What is the probability that the
company’s premise is rejected when it is actually valid?
d. Refer to the decision rule introduced in part (c). What is
the probability that the company’s premise is not
rejected even though p = .20 (i.e., 20% qualify)?
107. Forty percent of seeds from maize (modern-day corn) ears
carry single spikelets, and the other 60% carry paired
spikelets. A seed with single spikelets will produce an ear
with single spikelets 29% of the time, whereas a seed with
paired spikelets will produce an ear with single spikelets
26% of the time. Consider randomly selecting ten seeds.
a. What is the probability that exactly five of these seeds
carry a single spikelet and produce an ear with a single
spikelet?
b. What is the probability that exactly five of the ears produced by these seeds have single spikelets? What is the
probability that at most five ears have single spikelets?
108. A trial has just resulted in a hung jury because eight members of the jury were in favor of a guilty verdict and the
other four were for acquittal. If the jurors leave the jury
room in random order and each of the first four leaving the
room is accosted by a reporter in quest of an interview,
what is the pmf of X = the number of jurors favoring
acquittal among those interviewed? How many of those
favoring acquittal do you expect to be interviewed?
109. A reservation service employs five information operators
who receive requests for information independently of one
another, each according to a Poisson process with rate
$\alpha = 2$ per minute.
a. What is the probability that during a given 1-min
period, the first operator receives no requests?
b. What is the probability that during a given 1-min
period, exactly four of the five operators receive no
requests?
c. Write an expression for the probability that during a
given 1-min period, all of the operators receive exactly
the same number of requests.
110. Grasshoppers are distributed at random in a large field
according to a Poisson process with parameter $\alpha = 2$ per
square yard. How large should the radius R of a circular
sampling region be taken so that the probability of finding
at least one in the region equals .99?
111. A newsstand has ordered five copies of a certain issue of a
photography magazine. Let X = the number of individuals
who come in to purchase this magazine. If X has a Poisson
distribution with parameter $\mu = 4$, what is the expected
number of copies that are sold?
112. Individuals A and B begin to play a sequence of chess
games. Let S = {A wins a game}, and suppose that outcomes of successive games are independent with $P(S) = p$
and $P(F) = 1 - p$ (they never draw). They will play until
one of them wins ten games. Let X = the number of
games played (with possible values 10, 11, . . . , 19).
a. For x = 10, 11, . . . , 19, obtain an expression for
$p(x) = P(X = x)$.
b. If a draw is possible, with $p = P(S)$, $q = P(F)$,
$1 - p - q = P(\text{draw})$, what are the possible values
of X? What is $P(20 \le X)$? [Hint: $P(20 \le X) =
1 - P(X < 20)$.]
113. A test for the presence of a certain disease has probability
.20 of giving a false-positive reading (indicating that an
individual has the disease when this is not the case) and
probability .10 of giving a false-negative result. Suppose
that ten individuals are tested, five of whom have the disease and five of whom do not. Let X = the number of positive readings that result.
a. Does X have a binomial distribution? Explain your reasoning.
b. What is the probability that exactly three of the ten test
results are positive?
114. The generalized negative binomial pmf is given by
$$nb(x; r, p) = k(r, x) \cdot p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$
Let X, the number of plants of a certain species found in a
particular region, have this distribution with p = .3 and
r = 2.5. What is $P(X = 4)$? What is the probability that at
least one plant is found?
115. There are two Certified Public Accountants in a particular
office who prepare tax returns for clients. Suppose that for
a particular type of complex form, the number of errors
made by the first preparer has a Poisson distribution with
mean value $\mu_1$, the number of errors made by the second
preparer has a Poisson distribution with mean value $\mu_2$,
and that each CPA prepares the same number of forms of
this type. Then if a form of this type is randomly selected,
the function
$$p(x; \mu_1, \mu_2) = .5\,\frac{e^{-\mu_1}\mu_1^x}{x!} + .5\,\frac{e^{-\mu_2}\mu_2^x}{x!} \qquad x = 0, 1, 2, \ldots$$
gives the pmf of X = the number of errors on the selected
form.
a. Verify that $p(x; \mu_1, \mu_2)$ is in fact a legitimate pmf ($\ge 0$
and sums to 1).
b. What is the expected number of errors on the selected
form?
c. What is the variance of the number of errors on the
selected form?
d. How does the pmf change if the first CPA prepares 60%
of all such forms and the second prepares 40%?
116. The mode of a discrete random variable X with pmf p(x) is
that value x* for which p(x) is largest (the most probable
x value).
a. Let $X \sim \text{Bin}(n, p)$. By considering the ratio $b(x + 1; n, p)/b(x; n, p)$,
show that $b(x; n, p)$ increases with x as long
as $x < np - (1 - p)$. Conclude that the mode x* is the
integer satisfying $(n + 1)p - 1 \le x^* \le (n + 1)p$.
b. Show that if X has a Poisson distribution with parameter $\mu$, the mode is the largest integer less than $\mu$. If $\mu$ is
an integer, show that both $\mu - 1$ and $\mu$ are modes.
117. A computer disk storage device has ten concentric tracks,
numbered 1, 2, . . . , 10 from outermost to innermost, and a
single access arm. Let $p_i$ = the probability that any particular request for data will take the arm to track
i (i = 1, . . . , 10). Assume that the tracks accessed in successive seeks are independent. Let X = the number of
tracks over which the access arm passes during two successive requests (excluding the track that the arm has just left,
so possible X values are x = 0, 1, . . . , 9). Compute the
pmf of X. [Hint: P(the arm is now on track i and X = j) =
$P(X = j \mid \text{arm now on } i) \cdot p_i$. After the conditional probability
is written in terms of $p_1, \ldots, p_{10}$, by the law of total probability, the desired probability is obtained by summing over i.]
118. If X is a hypergeometric rv, show directly from the definition that $E(X) = nM/N$ (consider only the case $n < M$).
[Hint: Factor nM/N out of the sum for E(X), and show
that the terms inside the sum are of the form
$h(y; n - 1, M - 1, N - 1)$, where $y = x - 1$.]
119. Use the fact that
$$\sum_{\text{all } x} (x - \mu)^2 p(x) \ge \sum_{x:\,|x - \mu| \ge k\sigma} (x - \mu)^2 p(x)$$
to prove Chebyshev’s inequality given in Exercise 44.
120. The simple Poisson process of Section 3.6 is characterized
by a constant rate $\alpha$ at which events occur per unit time. A
generalization of this is to suppose that the probability of
exactly one event occurring in the interval $[t, t + \Delta t]$ is
$\alpha(t) \cdot \Delta t + o(\Delta t)$. It can then be shown that the number of
events occurring during an interval $[t_1, t_2]$ has a Poisson
distribution with parameter
$$\mu = \int_{t_1}^{t_2} \alpha(t)\,dt$$
The occurrence of events over time in this situation is
called a nonhomogeneous Poisson process. The article
“Inference Based on Retrospective Ascertainment,” J.
Amer. Stat. Assoc., 1989: 360–372, considers the intensity
function
$$\alpha(t) = e^{a + bt}$$
as appropriate for events involving transmission of HIV
(the AIDS virus) via blood transfusions. Suppose that
a = 2 and b = .6 (close to values suggested in the paper),
with time in years.
a. What is the expected number of events in the interval
[0, 4]? In [2, 6]?
b. What is the probability that at most 15 events occur in
the interval [0, .9907]?
121. Consider a collection $A_1, \ldots, A_k$ of mutually exclusive and
exhaustive events, and a random variable X whose distribution depends on which of the $A_i$’s occurs (e.g., a commuter might select one of three possible routes from home
to work, with X representing the commute time). Let
$E(X \mid A_i)$ denote the expected value of X given that the event
$A_i$ occurs. Then it can be shown that
$E(X) = \sum E(X \mid A_i) \cdot P(A_i)$, the weighted average of the individual “conditional expectations” where the weights are
the probabilities of the partitioning events.
a. The expected duration of a voice call to a particular
telephone number is 3 minutes, whereas the expected
duration of a data call to that same number is 1 minute.
If 75% of all calls are voice calls, what is the expected
duration of the next call?
b. A deli sells three different types of chocolate chip cookies. The number of chocolate chips in a type i cookie
has a Poisson distribution with parameter
$\mu_i = i + 1$ (i = 1, 2, 3). If 20% of all customers purchasing a chocolate chip cookie select the first type,
50% choose the second type, and the remaining 30%
opt for the third type, what is the expected number of
chips in a cookie purchased by the next customer?
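The weighted-average formula stated at the start of Exercise 121 can be illustrated with the commuter scenario mentioned there. The sketch below is not a solution to parts (a) or (b): the function name `total_expectation` and all route times and probabilities are hypothetical values chosen only to show the computation.

```python
def total_expectation(cond_means, probs):
    """Return sum of E(X | A_i) * P(A_i) over a partition A_1, ..., A_k."""
    if len(cond_means) != len(probs):
        raise ValueError("need one probability per conditional mean")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(m * p for m, p in zip(cond_means, probs))


# Hypothetical commute: expected minutes on each of three routes,
# and the probability that each route is chosen.
route_means = [20.0, 25.0, 30.0]
route_probs = [0.5, 0.3, 0.2]

expected_commute = total_expectation(route_means, route_probs)
print(expected_commute)
```

The check that the probabilities sum to 1 reflects the requirement that the events be mutually exclusive and exhaustive.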
122. Consider a communication source that transmits packets
containing digitized speech. After each transmission, the
receiver sends a message indicating whether the transmission was successful or unsuccessful. If a transmission is
unsuccessful, the packet is re-sent. Suppose a voice packet
can be transmitted a maximum of 10 times. Assuming that
the results of successive transmissions are independent of
one another and that the probability of any particular transmission being successful is p, determine the probability
mass function of the rv X = the number of times a packet
is transmitted. Then obtain an expression for the expected
number of times a packet is transmitted.
Bibliography
Johnson, Norman, Samuel Kotz, and Adrienne Kemp, Discrete
Univariate Distributions, Wiley, New York, 1992. An encyclopedia of information on discrete distributions.
Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability
Models and Applications (2nd ed.), Macmillan, New York,
1994. Contains an in-depth discussion of both general
properties of discrete and continuous distributions and results
for specific distributions.
Ross, Sheldon, Introduction to Probability Models (9th ed.),
Academic Press, New York, 2007. A good source of material
on the Poisson process and generalizations and a nice introduction to other topics in applied probability.
CHAPTER 4
Continuous Random Variables and Probability Distributions
The probability that headway time is at most 5 sec is
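$$P(X \le 5) = \int_{-\infty}^{5} f(x)\,dx = \int_{.5}^{5} .15e^{-.15(x - .5)}\,dx$$
$$= .15e^{.075} \int_{.5}^{5} e^{-.15x}\,dx = .15e^{.075} \cdot \left(-\frac{1}{.15}\,e^{-.15x}\right)\Bigg|_{x=.5}^{x=5}$$
$$= e^{.075}(-e^{-.75} + e^{-.075}) = 1.078(-.472 + .928) = .491$$
$$= P(\text{less than 5 sec}) = P(X < 5)$$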
■
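The headway calculation above can be checked numerically. The following sketch assumes the pdf is $.15e^{-.15(x - .5)}$ for $x \ge .5$ and 0 otherwise, as used in the integral; the helper names (`pdf`, `cdf_closed_form`, `integrate`) are introduced here for illustration, and the midpoint rule is just one convenient choice of quadrature.

```python
import math

LAM, SHIFT = 0.15, 0.5    # rate and lower limit appearing in the integral above


def pdf(x: float) -> float:
    """Headway density: .15 * exp(-.15 (x - .5)) for x >= .5, else 0."""
    return LAM * math.exp(-LAM * (x - SHIFT)) if x >= SHIFT else 0.0


def cdf_closed_form(x: float) -> float:
    """Antiderivative evaluated from .5 to x."""
    return 1.0 - math.exp(-LAM * (x - SHIFT)) if x >= SHIFT else 0.0


def integrate(f, a: float, b: float, n: int = 10_000) -> float:
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


numeric = integrate(pdf, SHIFT, 5.0)
exact = cdf_closed_form(5.0)
print(round(numeric, 3), round(exact, 3))
```

Because X is continuous, the same value serves for both $P(X \le 5)$ and $P(X < 5)$.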
Unlike discrete distributions such as the binomial, hypergeometric, and negative binomial, the distribution of any given continuous rv cannot usually be derived
using simple probabilistic arguments. Instead, one must make a judicious choice of
pdf based on prior knowledge and available data. Fortunately, there are some general
families of pdf’s that have been found to be sensible candidates in a wide variety of
experimental situations; several of these are discussed later in the chapter.
Just as in the discrete case, it is often helpful to think of the population of interest as consisting of X values rather than individuals or objects. The pdf is then a
model for the distribution of values in this numerical population, and from this
model various population characteristics (such as the mean) can be calculated.
EXERCISES
Section 4.1 (1–10)
1. The current in a certain circuit as measured by an ammeter is
a continuous random variable X with the following density
function:
$$f(x) = \begin{cases} .075x + .2 & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$
a. Graph the pdf and verify that the total area under the density curve is indeed 1.
b. Calculate $P(X \le 4)$. How does this probability compare
to $P(X < 4)$?
c. Calculate $P(3.5 \le X \le 4.5)$ and also $P(4.5 < X)$.
2. Suppose the reaction temperature X (in °C) in a certain
chemical process has a uniform distribution with $A = -5$
and $B = 5$.
a. Compute $P(X < 0)$.
b. Compute $P(-2.5 < X < 2.5)$.
c. Compute $P(-2 \le X \le 3)$.
d. For k satisfying $-5 < k < k + 4 < 5$, compute
$P(k < X < k + 4)$.
3. The error involved in making a certain measurement is a continuous rv X with pdf
$$f(x) = \begin{cases} .09375(4 - x^2) & -2 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
a. Sketch the graph of f(x).
b. Compute $P(X > 0)$.
c. Compute $P(-1 < X < 1)$.
d. Compute $P(X < -.5 \text{ or } X > .5)$.
4. Let X denote the vibratory stress (psi) on a wind turbine blade
at a particular wind speed in a wind tunnel. The article
“Blade Fatigue Life Assessment with Application to
VAWTS” (J. of Solar Energy Engr., 1982: 107–111) proposes
the Rayleigh distribution, with pdf
$$f(x; \theta) = \begin{cases} \dfrac{x}{\theta^2} \cdot e^{-x^2/(2\theta^2)} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$
as a model for the X distribution.
a. Verify that $f(x; \theta)$ is a legitimate pdf.
b. Suppose $\theta = 100$ (a value suggested by a graph in the
article). What is the probability that X is at most 200? Less
than 200? At least 200?
c. What is the probability that X is between 100 and 200
(again assuming $\theta = 100$)?
d. Give an expression for $P(X \le x)$.
5. A college professor never finishes his lecture before the end of
the hour and always finishes his lectures within 2 min after the
hour. Let X = the time that elapses between the end of the
hour and the end of the lecture and suppose the pdf of X is
$$f(x) = \begin{cases} kx^2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
a. Find the value of k and draw the corresponding density
curve. [Hint: Total area under the graph of f(x) is 1.]
b. What is the probability that the lecture ends within 1 min
of the end of the hour?
c. What is the probability that the lecture continues beyond
the hour for between 60 and 90 sec?
d. What is the probability that the lecture continues for at
least 90 sec beyond the end of the hour?
6. The actual tracking weight of a stereo cartridge that is set to
track at 3 g on a particular changer can be regarded as a continuous rv X with pdf
$$f(x) = \begin{cases} k[1 - (x - 3)^2] & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
a. Sketch the graph of f(x).
b. Find the value of k.
c. What is the probability that the actual tracking weight is
greater than the prescribed weight?
d. What is the probability that the actual weight is within
.25 g of the prescribed weight?
e. What is the probability that the actual weight differs from
the prescribed weight by more than .5 g?
7. The time X (min) for a lab assistant to prepare the equipment
for a certain experiment is believed to have a uniform distribution with A = 25 and B = 35.
a. Determine the pdf of X and sketch the corresponding
density curve.
b. What is the probability that preparation time exceeds
33 min?
c. What is the probability that preparation time is within
2 min of the mean time? [Hint: Identify $\mu$ from the graph
of f(x).]
d. For any a such that $25 < a < a + 2 < 35$, what is the
probability that preparation time is between a and
a + 2 min?
8. In commuting to work, a professor must first get on a bus
near her house and then transfer to a second bus. If the waiting time (in minutes) at each stop has a uniform distribution
with A = 0 and B = 5, then it can be shown that the total
waiting time Y has the pdf
$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\ 0 & y < 0 \text{ or } y > 10 \end{cases}$$
a. Sketch a graph of the pdf of Y.
b. Verify that $\int_{-\infty}^{\infty} f(y)\,dy = 1$.
c. What is the probability that total waiting time is at most
3 min?
d. What is the probability that total waiting time is at most
8 min?
e. What is the probability that total waiting time is between
3 and 8 min?
f. What is the probability that total waiting time is either
less than 2 min or more than 6 min?
9. Consider again the pdf of X = time headway given in
Example 4.5. What is the probability that time headway is
a. At most 6 sec?
b. More than 6 sec? At least 6 sec?
c. Between 5 and 6 sec?
10. A family of pdf’s that has been used to approximate the distribution of income, city population size, and size of firms is
the Pareto family. The family has two parameters, k and $\theta$,
both > 0, and the pdf is
$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases}$$
a. Sketch the graph of $f(x; k, \theta)$.
b. Verify that the total area under the graph equals 1.
c. If the rv X has pdf $f(x; k, \theta)$, for any fixed $b > \theta$, obtain
an expression for $P(X \le b)$.
d. For $\theta < a < b$, obtain an expression for the probability
$P(a \le X \le b)$.
4.2 Cumulative Distribution Functions
and Expected Values
Several of the most important concepts introduced in the study of discrete distributions also play an important role for continuous distributions. Definitions analogous
to those in Chapter 3 involve replacing summation by integration.
The Cumulative Distribution Function
The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any
specified number x, the probability $P(X \le x)$. It is obtained by summing the pmf
p(y) over all possible values y satisfying $y \le x$. The cdf of a continuous rv gives the
same probabilities $P(X \le x)$ and is obtained by integrating the pdf f(y) between the
limits $-\infty$ and x.
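As a rough illustration of obtaining a cdf by integrating a pdf, the sketch below builds F numerically. The density $f(y) = 2y$ on [0, 1] is a hypothetical example, not one taken from the text, and the helper `make_cdf` is an assumed name. Since this pdf is zero below 0, the integration can start at 0 rather than at $-\infty$.

```python
def make_cdf(pdf, lower: float, n: int = 20_000):
    """Return F with F(x) = integral of pdf from `lower` to x (midpoint rule).

    `lower` must be a point below which the pdf is identically zero.
    """
    def cdf(x: float) -> float:
        if x <= lower:
            return 0.0
        h = (x - lower) / n
        return h * sum(pdf(lower + (i + 0.5) * h) for i in range(n))
    return cdf


def pdf(y: float) -> float:
    """Hypothetical density: f(y) = 2y on [0, 1], 0 elsewhere."""
    return 2.0 * y if 0.0 <= y <= 1.0 else 0.0


F = make_cdf(pdf, lower=0.0)
print(F(0.5), F(1.0), F(2.0))
```

For this density the exact cdf is $F(x) = x^2$ on [0, 1], so the numerical values can be compared directly with the closed form.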
DEFINITION
The variance of a continuous random variable X with pdf f(x) and mean value
$\mu$ is
$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E[(X - \mu)^2]$$
The standard deviation (SD) of X is $\sigma_X = \sqrt{V(X)}$.
The variance and standard deviation give quantitative measures of how much spread
there is in the distribution or population of x values. Again $\sigma$ is roughly the size of
a typical deviation from $\mu$. Computation of $\sigma^2$ is facilitated by using the same shortcut formula employed in the discrete case.
PROPOSITION
$$V(X) = E(X^2) - [E(X)]^2$$
Example 4.12
(Example 4.10
continued)
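For X = weekly gravel sales, we computed $E(X) = \frac{3}{8}$. Since
$$E(X^2) = \int_{-\infty}^{\infty} x^2 \cdot f(x)\,dx = \int_0^1 x^2 \cdot \frac{3}{2}(1 - x^2)\,dx = \int_0^1 \frac{3}{2}(x^2 - x^4)\,dx = \frac{1}{5}$$
$$V(X) = \frac{1}{5} - \left(\frac{3}{8}\right)^2 = \frac{19}{320} = .059$$
and $\sigma_X = .244$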
■
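The arithmetic in Example 4.12 can be reproduced exactly with rational arithmetic. The sketch below assumes the gravel-sales pdf $f(x) = \frac{3}{2}(1 - x^2)$ on [0, 1] that appears in the integrand above; the helpers `integrate_poly` and `times_x_power` are illustrative names, and polynomials are stored as coefficient lists in increasing powers of x.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k)."""
    a, b = Fraction(a), Fraction(b)
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))


def times_x_power(coeffs, m):
    """Coefficients of x**m times the given polynomial."""
    return [0] * m + list(coeffs)


# f(x) = 3/2 - (3/2) x^2 on [0, 1]
f = [Fraction(3, 2), 0, Fraction(-3, 2)]

total = integrate_poly(f, 0, 1)                    # should equal 1 for a pdf
ex = integrate_poly(times_x_power(f, 1), 0, 1)     # E(X)
ex2 = integrate_poly(times_x_power(f, 2), 0, 1)    # E(X^2)
var = ex2 - ex ** 2                                # shortcut formula
sd = float(var) ** 0.5

print(total, ex, ex2, var, round(sd, 3))
```

Using `Fraction` avoids rounding until the final square root, so the intermediate values can be matched against the fractions in the example.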
When $h(X) = aX + b$, the expected value and variance of h(X) satisfy the same
properties as in the discrete case: $E[h(X)] = a\mu + b$ and $V[h(X)] = a^2 \cdot \sigma^2$.
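The linear-transformation rules can be sanity-checked on a small data set treated as an equally weighted population. The numbers below are hypothetical, and the choice a = 1.8, b = 32 simply mirrors a Celsius-to-Fahrenheit conversion; `linear_mean_var` is an assumed helper name.

```python
import statistics


def linear_mean_var(mu: float, var: float, a: float, b: float):
    """Mean and variance of aX + b given the mean and variance of X."""
    return a * mu + b, a ** 2 * var


xs = [10.0, 12.5, 15.0, 20.0]          # hypothetical values of X
a, b = 1.8, 32.0
ys = [a * x + b for x in xs]           # transformed values h(x) = ax + b

mu_x = statistics.mean(xs)
var_x = statistics.pvariance(xs)       # population variance of the xs
mu_y, var_y = linear_mean_var(mu_x, var_x, a, b)

print(mu_y, statistics.mean(ys))
print(var_y, statistics.pvariance(ys))
```

The shift b changes the mean but drops out of the variance, while the scale factor a enters the variance squared.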
EXERCISES
Section 4.2 (11–27)
11. Let X denote the amount of time a book on two-hour reserve
is actually checked out, and suppose the cdf is
$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^2}{4} & 0 \le x < 2 \\ 1 & 2 \le x \end{cases}$$
Use the cdf to obtain the following:
a. $P(X \le 1)$
b. $P(.5 \le X \le 1)$
c. $P(X > 1.5)$
d. The median checkout duration $\tilde{\mu}$ [solve $.5 = F(\tilde{\mu})$]
e. $F'(x)$ to obtain the density function f(x)
f. E(X)
g. V(X) and $\sigma_X$
h. If the borrower is charged an amount $h(X) = X^2$ when
checkout duration is X, compute the expected charge
E[h(X)].
12. The cdf for X (= measurement error) of Exercise 3 is
$$F(x) = \begin{cases} 0 & x < -2 \\ \dfrac{1}{2} + \dfrac{3}{32}\left(4x - \dfrac{x^3}{3}\right) & -2 \le x < 2 \\ 1 & 2 \le x \end{cases}$$
a. Compute $P(X < 0)$.
b. Compute $P(-1 < X < 1)$.
c. Compute $P(.5 < X)$.
d. Verify that f(x) is as given in Exercise 3 by obtaining
$F'(x)$.
e. Verify that $\tilde{\mu} = 0$.
13. Example 4.5 introduced the concept of time headway in
traffic flow and proposed a particular distribution for X =
the headway between two randomly selected consecutive
cars (sec). Suppose that in a different traffic environment,
the distribution of time headway has the form
$$f(x) = \begin{cases} \dfrac{k}{x^4} & x > 1 \\ 0 & x \le 1 \end{cases}$$
a. Determine the value of k for which f(x) is a legitimate pdf.
b. Obtain the cumulative distribution function.
c. Use the cdf from (b) to determine the probability that
headway exceeds 2 sec and also the probability that
headway is between 2 and 3 sec.
d. Obtain the mean value of headway and the standard
deviation of headway.
e. What is the probability that headway is within 1 standard
deviation of the mean value?
14. The article “Modeling Sediment and Water Column
Interactions for Hydrophobic Pollutants” (Water Research,
1984: 1169–1174) suggests the uniform distribution on the
interval (7.5, 20) as a model for depth (cm) of the bioturbation layer in sediment in a certain region.
a. What are the mean and variance of depth?
b. What is the cdf of depth?
c. What is the probability that observed depth is at most
10? Between 10 and 15?
d. What is the probability that the observed depth is within
1 standard deviation of the mean value? Within 2 standard deviations?
15. Let X denote the amount of space occupied by an article
placed in a 1-ft$^3$ packing container. The pdf of X is
$$f(x) = \begin{cases} 90x^8(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
a. Graph the pdf. Then obtain the cdf of X and graph it.
b. What is $P(X \le .5)$ [i.e., F(.5)]?
c. Using the cdf from (a), what is $P(.25 < X \le .5)$? What
is $P(.25 \le X \le .5)$?
d. What is the 75th percentile of the distribution?
e. Compute E(X) and $\sigma_X$.
f. What is the probability that X is more than 1 standard
deviation from its mean value?
16. Answer parts (a)–(f) of Exercise 15 with X = lecture time
past the hour given in Exercise 5.
17. Let X have a uniform distribution on the interval [A, B].
a. Obtain an expression for the (100p)th percentile.
b. Compute E(X), V(X), and $\sigma_X$.
c. For n, a positive integer, compute $E(X^n)$.
18. Let X denote the voltage at the output of a microphone, and
suppose that X has a uniform distribution on the interval
from $-1$ to 1. The voltage is processed by a “hard limiter”
with cutoff values $-.5$ and .5, so the limiter output is a random variable Y related to X by $Y = X$ if $|X| \le .5$, $Y = .5$ if
$X > .5$, and $Y = -.5$ if $X < -.5$.
a. What is $P(Y = .5)$?
b. Obtain the cumulative distribution function of Y and
graph it.
19. Let X be a continuous rv with cdf
$$F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x}{4}\left[1 + \ln\left(\dfrac{4}{x}\right)\right] & 0 < x \le 4 \\ 1 & x > 4 \end{cases}$$
[This type of cdf is suggested in the article “Variability in
Measured Bedload-Transport Rates” (Water Resources
Bull., 1985: 39–48) as a model for a certain hydrologic variable.] What is
a. $P(X \le 1)$?
b. $P(1 \le X \le 3)$?
c. The pdf of X?
20. Consider the pdf for total waiting time Y for two buses
$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\ 0 & \text{otherwise} \end{cases}$$
introduced in Exercise 8.
a. Compute and sketch the cdf of Y. [Hint: Consider separately $0 \le y < 5$ and $5 \le y \le 10$ in computing F(y). A
graph of the pdf should be helpful.]
b. Obtain an expression for the (100p)th percentile. [Hint:
Consider separately $0 < p < .5$ and $.5 < p < 1$.]
c. Compute E(Y ) and V(Y). How do these compare with the
expected waiting time and variance for a single bus when
the time is uniformly distributed on [0, 5]?
21. An ecologist wishes to mark off a circular sampling region
having radius 10 m. However, the radius of the resulting
region is actually a random variable R with pdf
$$f(r) = \begin{cases} \dfrac{3}{4}\,[1 - (10 - r)^2] & 9 \le r \le 11 \\ 0 & \text{otherwise} \end{cases}$$
What is the expected area of the resulting circular region?
22. The weekly demand for propane gas (in 1000s of gallons)
from a particular facility is an rv X with pdf
$$f(x) = \begin{cases} 2\left(1 - \dfrac{1}{x^2}\right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
a. Compute the cdf of X.
b. Obtain an expression for the (100p)th percentile. What is
the value of $\tilde{\mu}$?
c. Compute E(X) and V(X).
d. If 1.5 thousand gallons are in stock at the beginning of
the week and no new supply is due in during the week,
how much of the 1.5 thousand gallons is expected to be
left at the end of the week? [Hint: Let h(x) = amount
left when demand = x.]
23. If the temperature at which a certain compound melts is a
random variable with mean value 120°C and standard deviation 2°C, what are the mean temperature and standard
deviation measured in °F? [Hint: °F = 1.8°C + 32.]
24. Let X have the Pareto pdf
$$
f(x; k, \theta) = \begin{cases} \dfrac{k\,\theta^k}{x^{k+1}} & x \ge \theta \\[4pt] 0 & x < \theta \end{cases}
$$
introduced in Exercise 10.
a. If k > 1, compute E(X).
b. What can you say about E(X) if k = 1?
c. If k > 2, show that V(X) = kθ²(k − 1)⁻²(k − 2)⁻¹.
d. If k = 2, what can you say about V(X)?
e. What conditions on k are necessary to ensure that E(X^n)
is finite?
25. Let X be the temperature in °C at which a certain chemical
reaction takes place, and let Y be the temperature in °F (so
Y = 1.8X + 32).
a. If the median of the X distribution is $\tilde{\mu}$, show that
1.8$\tilde{\mu}$ + 32 is the median of the Y distribution.
b. How is the 90th percentile of the Y distribution related to
the 90th percentile of the X distribution? Verify your
conjecture.
c. More generally, if Y = aX + b, how is any particular
percentile of the Y distribution related to the corresponding percentile of the X distribution?
26. Let X be the total medical expenses (in 1000s of dollars)
incurred by a particular individual during a given year.
Although X is a discrete random variable, suppose its distribution is quite well approximated by a continuous distribution with pdf f(x) = k(1 + x/2.5)⁻⁷ for x ≥ 0.
a. What is the value of k?
b. Graph the pdf of X.
c. What are the expected value and standard deviation of
total medical expenses?
d. This individual is covered by an insurance plan that
entails a $500 deductible provision (so the first $500
worth of expenses are paid by the individual). Then the
plan will pay 80% of any additional expenses exceeding $500, and the maximum payment by the individual
(including the deductible amount) is $2500. Let Y
denote the amount of this individual’s medical
expenses paid by the insurance company. What is the
expected value of Y?
[Hint: First figure out what value of X corresponds to
the maximum out-of-pocket expense of $2500. Then
write an expression for Y as a function of X (which
involves several different pieces) and calculate the
expected value of this function.]
27. When a dart is thrown at a circular target, consider the location of the landing point relative to the bull’s eye. Let X be the
angle in degrees measured from the horizontal, and assume
that X is uniformly distributed on [0, 360]. Define Y to be the
transformed variable Y = h(X) = (2π/360)X − π, so Y is
the angle measured in radians and Y is between −π and π.
Obtain E(Y) and σ_Y by first obtaining E(X) and σ_X, and then
using the fact that h(X) is a linear function of X.
4.3 The Normal Distribution
The normal distribution is the most important one in all of probability and statistics.
Many numerical populations have distributions that can be fit very closely by an
appropriate normal curve. Examples include heights, weights, and other physical
characteristics (the famous 1903 Biometrika article “On the Laws of Inheritance in
Man” discussed many examples of this sort), measurement errors in scientific experiments, anthropometric measurements on fossils, reaction times in psychological
experiments, measurements of intelligence and aptitude, scores on various tests, and
numerous economic measures and indicators. In addition, even when individual variables themselves are not normally distributed, sums and averages of the variables
will under suitable conditions have approximately a normal distribution; this is the
content of the Central Limit Theorem discussed in the next chapter.
DEFINITION
A continuous rv X is said to have a normal distribution with parameters μ
and σ (or μ and σ²), where −∞ < μ < ∞ and 0 < σ, if the pdf of X is
$$
f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}, \qquad -\infty < x < \infty \tag{4.3}
$$
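As a rough numerical illustration of the definition, the following sketch evaluates the normal pdf directly from Expression (4.3) and checks that the area under the curve is essentially 1. The function names, the trapezoidal integrator, and the values μ = 100 and σ = 15 are illustrative assumptions, not part of the text.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density of Expression (4.3); sigma must be positive."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def trapezoid(f, a, b, n=10_000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Illustrative parameter values only (not taken from the text).
mu, sigma = 100.0, 15.0

# Integrate over mu +/- 8 sigma; the area outside this range is negligible.
area = trapezoid(lambda x: normal_pdf(x, mu, sigma), mu - 8 * sigma, mu + 8 * sigma)
peak = normal_pdf(mu, mu, sigma)  # the curve is highest at x = mu
print(area, peak)
```

The peak height equals 1/(√(2π)σ), so a larger σ gives a flatter, more spread-out curve.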
The exact probabilities are .2622 and .8348, respectively, so the approximations are
quite good. In the last calculation, the probability P(5 ≤ X ≤ 15) is being approximated by the area under the normal curve between 4.5 and 15.5; the continuity correction is used for both the upper and lower limits.
■
When the objective of our investigation is to make an inference about a population proportion p, interest will focus on the sample proportion of successes X/n
rather than on X itself. Because this proportion is just X multiplied by the constant
1/n, it will also have approximately a normal distribution (with mean μ = p and
standard deviation σ = √(pq/n)) provided that both np ≥ 10 and nq ≥ 10. This normal approximation is the basis for several inferential procedures to be discussed in
later chapters.
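The continuity correction described above can be sketched in a few lines. The values n = 50 and p = 0.25 below are hypothetical choices made only for illustration (this passage does not restate the example's n and p), and the helper names are likewise assumptions.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_prob(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Bin(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def normal_approx_prob(n, p, lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    upper = (hi + 0.5 - mu) / sigma  # upper limit moved up by 0.5
    lower = (lo - 0.5 - mu) / sigma  # lower limit moved down by 0.5
    return std_normal_cdf(upper) - std_normal_cdf(lower)


# Hypothetical values chosen so that np >= 10 and nq >= 10.
n, p = 50, 0.25
exact = binomial_prob(n, p, 5, 15)
approx = normal_approx_prob(n, p, 5, 15)
print(exact, approx)
```

With these assumed values the area is taken between 4.5 and 15.5, matching the limits quoted in the paragraph above.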
EXERCISES
Section 4.3 (28–58)
28. Let Z be a standard normal random variable and calculate
the following probabilities, drawing pictures wherever
appropriate.
a. P(0 ≤ Z ≤ 2.17)
b. P(0 ≤ Z ≤ 1)
c. P(−2.50 ≤ Z ≤ 0)
d. P(−2.50 ≤ Z ≤ 2.50)
e. P(Z ≤ 1.37)
f. P(−1.75 ≤ Z)
g. P(−1.50 ≤ Z ≤ 2.00)
h. P(1.37 ≤ Z ≤ 2.50)
i. P(1.50 ≤ Z)
j. P(|Z| ≤ 2.50)
29. In each case, determine the value of the constant c that
makes the probability statement correct.
a. Φ(c) = .9838
b. P(0 ≤ Z ≤ c) = .291
c. P(c ≤ Z) = .121
d. P(−c ≤ Z ≤ c) = .668
e. P(c ≤ |Z|) = .016
30. Find the following percentiles for the standard normal distribution. Interpolate where appropriate.
a. 91st
b. 9th
c. 75th
d. 25th
e. 6th
31. Determine z_α for the following:
a. α = .0055
b. α = .09
c. α = .663
32. Suppose the force acting on a column that helps to support
a building is a normally distributed random variable X with
mean value 15.0 kips and standard deviation 1.25 kips.
Compute the following probabilities by standardizing and
then using Table A.3.
a. P(X ≤ 15)
b. P(X ≤ 17.5)
c. P(X ≥ 10)
d. P(14 ≤ X ≤ 18)
e. P(|X − 15| ≤ 3)
33. Mopeds (small motorcycles with an engine capacity below
50 cm³) are very popular in Europe because of their mobility, ease of operation, and low cost. The article “Procedure
to Verify the Maximum Speed of Automatic Transmission
Mopeds in Periodic Motor Vehicle Inspections” (J. of
Automobile Engr., 2008: 1615–1623) described a rolling
bench test for determining maximum vehicle speed. A normal distribution with mean value 46.8 km/h and standard
deviation 1.75 km/h is postulated. Consider randomly
selecting a single such moped.
a. What is the probability that maximum speed is at most
50 km/h?
b. What is the probability that maximum speed is at least
48 km/h?
c. What is the probability that maximum speed differs from
the mean value by at most 1.5 standard deviations?
34. The article “Reliability of Domestic-Waste Biofilm
Reactors” (J. of Envir. Engr., 1995: 785–790) suggests that
substrate concentration (mg/cm³) of influent to a reactor is
normally distributed with μ = .30 and σ = .06.
a. What is the probability that the concentration exceeds .25?
b. What is the probability that the concentration is at
most .10?
c. How would you characterize the largest 5% of all concentration values?
35. Suppose the diameter at breast height (in.) of trees of a
certain type is normally distributed with μ = 8.8 and
σ = 2.8, as suggested in the article “Simulating a
Harvester-Forwarder Softwood Thinning” (Forest
Products J., May 1997: 36–41).
a. What is the probability that the diameter of a randomly selected tree will be at least 10 in.? Will exceed
10 in.?
b. What is the probability that the diameter of a randomly
selected tree will exceed 20 in.?
c. What is the probability that the diameter of a randomly
selected tree will be between 5 and 10 in.?
d. What value c is such that the interval (8.8 − c, 8.8 + c)
includes 98% of all diameter values?
e. If four trees are independently selected, what is the
probability that at least one has a diameter exceeding
10 in.?
36. Spray drift is a constant concern for pesticide applicators
and agricultural producers. The inverse relationship
between droplet size and drift potential is well known. The
paper “Effects of 2,4-D Formulation and Quinclorac on
Spray Droplet Size and Deposition” (Weed Technology,
2005: 1030–1036) investigated the effects of herbicide formulation on spray atomization. A figure in the paper suggested the normal distribution with mean 1050 μm and
standard deviation 150 μm was a reasonable model for
droplet size for water (the “control treatment”) sprayed
through a 760 ml/min nozzle.
a. What is the probability that the size of a single droplet is
less than 1500 μm? At least 1000 μm?
b. What is the probability that the size of a single droplet is
between 1000 and 1500 μm?
c. How would you characterize the smallest 2% of all
droplets?
d. If the sizes of five independently selected droplets are
measured, what is the probability that at least one exceeds
1500 μm?
37. Suppose that blood chloride concentration (mmol/L) has
a normal distribution with mean 104 and standard deviation 5 (information in the article “Mathematical Model of
Chloride Concentration in Human Blood,” J. of Med.
Engr. and Tech., 2006: 25–30, including a normal probability plot as described in Section 4.6, supports this
assumption).
a. What is the probability that chloride concentration
equals 105? Is less than 105? Is at most 105?
b. What is the probability that chloride concentration
differs from the mean by more than 1 standard deviation? Does this probability depend on the values of μ
and σ?
c. How would you characterize the most extreme .1% of
chloride concentration values?
38. There are two machines available for cutting corks intended
for use in wine bottles. The first produces corks with diameters that are normally distributed with mean 3 cm and standard deviation .1 cm. The second machine produces corks
with diameters that have a normal distribution with mean
3.04 cm and standard deviation .02 cm. Acceptable corks
have diameters between 2.9 cm and 3.1 cm. Which machine
is more likely to produce an acceptable cork?
39. a. If a normal distribution has μ = 30 and σ = 5, what is
the 91st percentile of the distribution?
b. What is the 6th percentile of the distribution?
c. The width of a line etched on an integrated circuit chip is
normally distributed with mean 3.000 μm and standard
deviation .140. What width value separates the widest
10% of all such lines from the other 90%?
40. The article “Monte Carlo Simulation—Tool for Better
Understanding of LRFD” (J. of Structural Engr., 1993:
1586–1599) suggests that yield strength (ksi) for A36 grade
steel is normally distributed with μ = 43 and σ = 4.5.
a. What is the probability that yield strength is at most 40?
Greater than 60?
b. What yield strength value separates the strongest 75%
from the others?
41. The automatic opening device of a military cargo parachute has been designed to open when the parachute is
200 m above the ground. Suppose opening altitude actually has a normal distribution with mean value 200 m and
standard deviation 30 m. Equipment damage will occur if
the parachute opens at an altitude of less than 100 m.
What is the probability that there is equipment damage to
the payload of at least one of five independently dropped
parachutes?
42. The temperature reading from a thermocouple placed in a
constant-temperature medium is normally distributed with
mean μ, the actual temperature of the medium, and standard
deviation σ. What would the value of σ have to be to ensure
that 95% of all readings are within .1° of μ?
43. The distribution of resistance for resistors of a certain
type is known to be normal, with 10% of all resistors
having a resistance exceeding 10.256 ohms and 5%
having a resistance smaller than 9.671 ohms. What are the
mean value and standard deviation of the resistance distribution?
44. If bolt thread length is normally distributed, what is the
probability that the thread length of a randomly selected
bolt is
a. Within 1.5 SDs of its mean value?
b. Farther than 2.5 SDs from its mean value?
c. Between 1 and 2 SDs from its mean value?
45. A machine that produces ball bearings has initially been
set so that the true average diameter of the bearings it produces is .500 in. A bearing is acceptable if its diameter is
within .004 in. of this target value. Suppose, however, that
the setting has changed during the course of production,
so that the bearings have normally distributed diameters
with mean value .499 in. and standard deviation .002 in.
What percentage of the bearings produced will not be
acceptable?
46. The Rockwell hardness of a metal is determined by
impressing a hardened point into the surface of the
metal and then measuring the depth of penetration of the
point. Suppose the Rockwell hardness of a particular
alloy is normally distributed with mean 70 and standard
deviation 3. (Rockwell hardness is measured on a continuous scale.)
a. If a specimen is acceptable only if its hardness is
between 67 and 75, what is the probability that a randomly chosen specimen has an acceptable hardness?
b. If the acceptable range of hardness is (70 − c, 70 + c),
for what value of c would 95% of all specimens have
acceptable hardness?
c. If the acceptable range is as in part (a) and the hardness
of each of ten randomly selected specimens is independently determined, what is the expected number of
acceptable specimens among the ten?
d. What is the probability that at most eight of ten independently selected specimens have a hardness of less than
73.84? [Hint: Y = the number among the ten specimens
with hardness less than 73.84 is a binomial variable; what
is p?]
47. The weight distribution of parcels sent in a certain manner
is normal with mean value 12 lb and standard deviation
3.5 lb. The parcel service wishes to establish a weight value
c beyond which there will be a surcharge. What value of c
is such that 99% of all parcels are at least 1 lb under the surcharge weight?
48. Suppose Appendix Table A.3 contained Φ(z) only for z ≥ 0.
Explain how you could still compute
a. P(−1.72 ≤ Z ≤ −.55)
b. P(−1.72 ≤ Z ≤ .55)
Is it necessary to tabulate Φ(z) for z negative? What property of the standard normal curve justifies your answer?
49. Consider babies born in the “normal” range of 37–43
weeks gestational age. Extensive data supports the
assumption that for such babies born in the United States,
birth weight is normally distributed with mean 3432 g and
standard deviation 482 g. [The article “Are Babies
Normal?” (The American Statistician, 1999: 298–302)
analyzed data from a particular year; for a sensible choice
of class intervals, a histogram did not look at all normal,
but after further investigations it was determined that this
was due to some hospitals measuring weight in grams and
others measuring to the nearest ounce and then converting
to grams. A modified choice of class intervals that
allowed for this gave a histogram that was well described
by a normal distribution.]
a. What is the probability that the birth weight of a randomly selected baby of this type exceeds 4000 g? Is
between 3000 and 4000 g?
b. What is the probability that the birth weight of a randomly selected baby of this type is either less than 2000 g
or greater than 5000 g?
c. What is the probability that the birth weight of a randomly
selected baby of this type exceeds 7 lb?
d. How would you characterize the most extreme .1% of all
birth weights?
e. If X is a random variable with a normal distribution and
a is a numerical constant (a ≠ 0), then Y = aX also has
a normal distribution. Use this to determine the distribution of birth weight expressed in pounds (shape,
mean, and standard deviation), and then recalculate the
probability from part (c). How does this compare to
your previous answer?
50. In response to concerns about nutritional contents of
fast foods, McDonald’s has announced that it will use a
new cooking oil for its french fries that will decrease substantially trans fatty acid levels and increase the amount
of more beneficial polyunsaturated fat. The company
claims that 97 out of 100 people cannot detect a difference in taste between the new and old oils. Assuming
that this figure is correct (as a long-run proportion),
what is the approximate probability that in a random
sample of 1000 individuals who have purchased fries at
McDonald’s,
a. At least 40 can taste the difference between the two oils?
b. At most 5% can taste the difference between the two
oils?
51. Chebyshev’s inequality (see Exercise 44, Chapter 3) is
valid for continuous as well as discrete distributions. It
states that for any number k satisfying k ≥ 1,
P(|X − μ| ≥ kσ) ≤ 1/k² (see Exercise 44 in Chapter 3 for
an interpretation). Obtain this probability in the case of a
normal distribution for k = 1, 2, and 3, and compare to the
upper bound.
52. Let X denote the number of flaws along a 100-m reel of
magnetic tape (an integer-valued variable). Suppose X has
approximately a normal distribution with μ = 25 and
σ = 5. Use the continuity correction to calculate the probability that the number of flaws is
a. Between 20 and 30, inclusive.
b. At most 30. Less than 30.
53. Let X have a binomial distribution with parameters
n = 25 and p. Calculate each of the following probabilities using the normal approximation (with the continuity
correction) for the cases p = .5, .6, and .8 and compare
to the exact probabilities calculated from Appendix
Table A.1.
a. P(15 ≤ X ≤ 20)
b. P(X ≤ 15)
c. P(20 ≤ X)
54. Suppose that 10% of all steel shafts produced by a certain
process are nonconforming but can be reworked (rather than
having to be scrapped). Consider a random sample of 200
shafts, and let X denote the number among these that are
nonconforming and can be reworked. What is the (approximate) probability that X is
a. At most 30?
b. Less than 30?
c. Between 15 and 25 (inclusive)?
55. Suppose only 75% of all drivers in a certain state regularly
wear a seat belt. A random sample of 500 drivers is selected.
What is the probability that
a. Between 360 and 400 (inclusive) of the drivers in the
sample regularly wear a seat belt?
b. Fewer than 400 of those in the sample regularly wear a
seat belt?
56. Show that the relationship between a general normal percentile and the corresponding z percentile is as stated in this
section.
57. a. Show that if X has a normal distribution with parameters μ and σ, then Y = aX + b (a linear function of X)
also has a normal distribution. What are the parameters
of the distribution of Y [i.e., E(Y) and V(Y)]? [Hint:
Write the cdf of Y, P(Y ≤ y), as an integral involving
the pdf of X, and then differentiate with respect to y to
get the pdf of Y.]
b. If, when measured in °C, temperature is normally distributed with mean 115 and standard deviation 2, what
can be said about the distribution of temperature measured in °F?
58. There is no nice formula for the standard normal cdf Φ(z),
but several good approximations have been published in articles. The following is from “Approximations for Hand
Calculators Using Small Integer Coefficients” (Mathematics
of Computation, 1977: 214–222). For 0 < z ≤ 5.5,
$$
P(Z \ge z) = 1 - \Phi(z) \approx .5\,\exp\left\{-\left[\frac{(83z + 351)z + 562}{703/z + 165}\right]\right\}
$$
The relative error of this approximation is less than .042%.
Use this to calculate approximations to the following probabilities, and compare whenever possible to the probabilities obtained from Appendix Table A.3.
a. P(Z ≥ 1)
b. P(Z < −3)
c. P(−4 < Z < 4)
d. P(Z > 5)
4.4 The Exponential and Gamma Distributions
The density curve corresponding to any normal distribution is bell-shaped and
therefore symmetric. There are many practical situations in which the variable of
interest to an investigator might have a skewed distribution. One family of distributions that has this property is the gamma family. We first consider a special case, the
exponential distribution, and then generalize later in the section.
The Exponential Distribution
The family of exponential distributions provides probability models that are very
widely used in engineering and science disciplines.
DEFINITION
X is said to have an exponential distribution with parameter λ (λ > 0) if the
pdf of X is
$$
f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\[4pt] 0 & \text{otherwise} \end{cases} \tag{4.5}
$$
Some sources write the exponential pdf in the form (1/β)e^(−x/β), so that β = 1/λ. The
expected value of an exponentially distributed random variable X is
$$
E(X) = \int_0^{\infty} x\lambda e^{-\lambda x}\,dx
$$
Obtaining this expected value necessitates doing an integration by parts. The variance of X can be computed using the fact that V(X) = E(X²) − [E(X)]². The determination of E(X²) requires integrating by parts twice in succession. The results of
these integrations are as follows:
$$
\mu = \frac{1}{\lambda} \qquad \sigma^2 = \frac{1}{\lambda^2}
$$
Both the mean and standard deviation of the exponential distribution equal 1/λ.
Graphs of several exponential pdf’s are illustrated in Figure 4.26.
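The closed-form results for the mean and variance can be checked numerically instead of by integration by parts. The sketch below is only an illustration: the value λ = 0.5, the truncation of the infinite upper limit, and the Simpson's-rule helper are all assumptions introduced here.

```python
import math


def exp_pdf(x, lam):
    """Exponential density of Expression (4.5) with rate parameter lam > 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


lam = 0.5  # illustrative value only
upper = 60.0 / lam  # stand-in for infinity; the tail beyond this point is negligible

mean = simpson(lambda x: x * exp_pdf(x, lam), 0.0, upper)
second_moment = simpson(lambda x: x * x * exp_pdf(x, lam), 0.0, upper)
variance = second_moment - mean**2  # shortcut formula V(X) = E(X^2) - [E(X)]^2
print(mean, variance)
```

For the assumed λ = 0.5, the numerical values should agree closely with 1/λ = 2 and 1/λ² = 4.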
DEFINITION
Let ν be a positive integer. Then a random variable X is said to have a chi-squared distribution with parameter ν if the pdf of X is the gamma density
with α = ν/2 and β = 2. The pdf of a chi-squared rv is thus
$$
f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{(\nu/2)-1} e^{-x/2} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases} \tag{4.10}
$$
The parameter ν is called the number of degrees of freedom (df) of X. The
symbol χ² is often used in place of “chi-squared.”
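To make the link between the two families concrete, the sketch below codes the chi-squared pdf of Expression (4.10) and compares it with a gamma density evaluated at α = ν/2 and β = 2. The gamma density is written in its usual shape/scale form, which is assumed here to match the text's Expression (4.8); the function names and test points are illustrative.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta (assumed standard form).

    Evaluated for x > 0; the single point x = 0 is returned as 0, which
    does not affect any probability.
    """
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))


def chi2_pdf(x, nu):
    """Chi-squared density of Expression (4.10) with nu degrees of freedom."""
    if x <= 0:
        return 0.0
    half = nu / 2.0
    return x ** (half - 1) * math.exp(-x / 2.0) / (2.0**half * math.gamma(half))


# Compare the two densities at a few arbitrary points.
for nu in (1, 2, 5, 10):
    for x in (0.5, 3.0, 7.5):
        print(nu, x, chi2_pdf(x, nu), gamma_pdf(x, nu / 2.0, 2.0))
```

Each printed pair should agree up to floating-point rounding, since the chi-squared density is the gamma density with those parameter choices.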
EXERCISES
Section 4.4 (59–71)
59. Let X = the time between two successive arrivals at the
drive-up window of a local bank. If X has an exponential
distribution with λ = 1 (which is identical to a standard
gamma distribution with α = 1), compute the following:
a. The expected time between two successive arrivals
b. The standard deviation of the time between successive
arrivals
c. P(X ≤ 4)
d. P(2 ≤ X ≤ 5)
60. Let X denote the distance (m) that an animal moves from its
birth site to the first territorial vacancy it encounters.
Suppose that for banner-tailed kangaroo rats, X has an exponential distribution with parameter λ = .01386 (as suggested in the article “Competition and Dispersal from
Multiple Nests,” Ecology, 1997: 873–883).
a. What is the probability that the distance is at most
100 m? At most 200 m? Between 100 and 200 m?
b. What is the probability that distance exceeds the mean
distance by more than 2 standard deviations?
c. What is the value of the median distance?
61. Data collected at Toronto Pearson International Airport suggests that an exponential distribution with mean value 2.725
hours is a good model for rainfall duration (Urban Stormwater
Management Planning with Analytical Probabilistic Models,
2000, p. 69).
a. What is the probability that the duration of a particular
rainfall event at this location is at least 2 hours? At most
3 hours? Between 2 and 3 hours?
b. What is the probability that rainfall duration exceeds the
mean value by more than 2 standard deviations? What is
the probability that it is less than the mean value by more
than one standard deviation?
62. The paper “Microwave Observations of Daily Antarctic
Sea-Ice Edge Expansion and Contribution Rates” (IEEE
Geosci. and Remote Sensing Letters, 2006: 54–58) states
that “The distribution of the daily sea-ice advance/retreat
from each sensor is similar and is approximately double
exponential.” The proposed double exponential distribution
has density function f(x) = .5λe^(−λ|x|) for −∞ < x < ∞. The
standard deviation is given as 40.9 km.
a. What is the value of the parameter λ?
b. What is the probability that the extent of daily sea-ice
change is within 1 standard deviation of the mean value?
63. A consumer is trying to decide between two long-distance
calling plans. The first one charges a flat rate of 10¢ per
minute, whereas the second charges a flat rate of 99¢ for
calls up to 20 minutes in duration and then 10¢ for each
additional minute exceeding 20 (assume that calls lasting
a noninteger number of minutes are charged proportionately to a whole-minute’s charge). Suppose the consumer’s distribution of call duration is exponential with
parameter λ.
a. Explain intuitively how the choice of calling plan should
depend on what the expected call duration is.
b. Which plan is better if expected call duration is 10 minutes? 15 minutes? [Hint: Let h₁(x) denote the cost for the
first plan when call duration is x minutes and let h₂(x) be
the cost function for the second plan. Give expressions
for these two cost functions, and then determine the
expected cost for each plan.]
64. Evaluate the following:
a. Γ(6)
b. Γ(5/2)
c. F(4; 5) (the incomplete gamma function)
d. F(5; 4)
e. F(0; 4)
65. Let X have a standard gamma distribution with α = 7.
Evaluate the following:
a. P(X ≤ 5)
b. P(X < 5)
c. P(X > 8)
d. P(3 ≤ X ≤ 8)
e. P(3 < X < 8)
f. P(X < 4 or X > 6)
66. Suppose the time spent by a randomly selected student who
uses a terminal connected to a local time-sharing computer
facility has a gamma distribution with mean 20 min and
variance 80 min².
a. What are the values of α and β?
b. What is the probability that a student uses the terminal
for at most 24 min?
c. What is the probability that a student spends between 20
and 40 min using the terminal?
67. Suppose that when a transistor of a certain type is subjected
to an accelerated life test, the lifetime X (in weeks) has a
gamma distribution with mean 24 weeks and standard deviation 12 weeks.
a. What is the probability that a transistor will last between
12 and 24 weeks?
b. What is the probability that a transistor will last at most
24 weeks? Is the median of the lifetime distribution less
than 24? Why or why not?
c. What is the 99th percentile of the lifetime distribution?
d. Suppose the test will actually be terminated after t
weeks. What value of t is such that only .5% of all transistors would still be operating at termination?
68. The special case of the gamma distribution in which α is a
positive integer n is called an Erlang distribution. If we
replace β by 1/λ in Expression (4.8), the Erlang pdf is
$$
f(x; \lambda, n) = \begin{cases} \dfrac{\lambda(\lambda x)^{n-1} e^{-\lambda x}}{(n-1)!} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases}
$$
It can be shown that if the times between successive events
are independent, each with an exponential distribution with
parameter λ, then the total time X that elapses before all of
the next n events occur has pdf f(x; λ, n).
a. What is the expected value of X? If the time (in minutes) between arrivals of successive customers is exponentially distributed with λ = .5, how much time can
be expected to elapse before the tenth customer
arrives?
b. If customer interarrival time is exponentially distributed
with λ = .5, what is the probability that the tenth customer (after the one who has just arrived) will arrive
within the next 30 min?
c. The event {X ≤ t} occurs iff at least n events occur in
the next t units of time. Use the fact that the number of
events occurring in an interval of length t has a Poisson
distribution with parameter λt to write an expression
(involving Poisson probabilities) for the Erlang cdf
F(t; λ, n) = P(X ≤ t).
69. A system consists of five identical components connected in
series as shown:
[Diagram: components 1, 2, 3, 4, and 5 connected in series]
As soon as one component fails, the entire system will fail.
Suppose each component has a lifetime that is exponentially
distributed with λ = .01 and that components fail independently of one another. Define events A_i = {ith component lasts at least t hours}, i = 1, …, 5, so that the A_i's are
independent events. Let X = the time at which the system
fails, that is, the shortest (minimum) lifetime among the
five components.
a. The event {X ≥ t} is equivalent to what event involving
A_1, …, A_5?
b. Using the independence of the A_i's, compute P(X ≥ t).
Then obtain F(t) = P(X ≤ t) and the pdf of X. What
type of distribution does X have?
c. Suppose there are n components, each having exponential lifetime with parameter λ. What type of distribution
does X have?
70. If X has an exponential distribution with parameter λ, derive
a general expression for the (100p)th percentile of the distribution. Then specialize to obtain the median.
71. a. The event {X² ≤ y} is equivalent to what event involving X itself?
b. If X has a standard normal distribution, use part (a) to
write the integral that equals P(X² ≤ y). Then differentiate this with respect to y to obtain the pdf of X² [the
square of a N(0, 1) variable]. Finally, show that X² has a
chi-squared distribution with ν = 1 df [see (4.10)].
[Hint: Use the following identity.]
$$
\frac{d}{dy}\left\{\int_{a(y)}^{b(y)} f(x)\,dx\right\} = f[b(y)]\cdot b'(y) - f[a(y)]\cdot a'(y)
$$
4.5 Other Continuous Distributions
The normal, gamma (including exponential), and uniform families of distributions
provide a wide variety of probability models for continuous variables, but there are
many practical situations in which no member of these families fits a set of observed
data very well. Statisticians and other investigators have developed other families of
distributions that are often appropriate in practice.
The Weibull Distribution
The family of Weibull distributions was introduced by the Swedish physicist
Waloddi Weibull in 1939; his 1951 article “A Statistical Distribution Function of
Wide Applicability” (J. of Applied Mechanics, vol. 18: 293–297) discusses a number of applications.
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Example 4.28
Project managers often use a method labeled PERT (program evaluation and
review technique) to coordinate the various activities making up a large project.
(One successful application was in the construction of the Apollo spacecraft.) A standard assumption in PERT analysis is that the time necessary to complete any particular activity once it has been started has a beta distribution with A = the optimistic
time (if everything goes well) and B = the pessimistic time (if everything goes
badly). Suppose that in constructing a single-family house, the time X (in days) necessary for laying the foundation has a beta distribution with A = 2, B = 5, α = 2,
and β = 3. Then α/(α + β) = .4, so E(X) = 2 + (3)(.4) = 3.2. For these values
of α and β, the pdf of X is a simple polynomial function. The probability that it takes
at most 3 days to lay the foundation is
\[ P(X \le 3) = \int_2^3 \frac{1}{3} \cdot \frac{4!}{1!\,2!} \left( \frac{x-2}{3} \right) \left( \frac{5-x}{3} \right)^2 dx = \frac{4}{27} \int_2^3 (x-2)(5-x)^2\,dx = \frac{4}{27} \cdot \frac{11}{4} = \frac{11}{27} = .407 \]
■
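As a rough check on the arithmetic in Example 4.28, the polynomial integral can be evaluated exactly with rational arithmetic. The sketch below is not part of the original example: the helper name integrate_poly is hypothetical, and the coefficients come from expanding (x − 2)(5 − x)² by hand.

```python
from fractions import Fraction


def integrate_poly(coeffs, lo, hi):
    """Exactly integrate sum(coeffs[k] * x**k) over [lo, hi] (hypothetical helper)."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(
        Fraction(c) / (k + 1) * (hi ** (k + 1) - lo ** (k + 1))
        for k, c in enumerate(coeffs)
    )


# (x - 2)(5 - x)^2 = x^3 - 12x^2 + 45x - 50, listed from the constant term up
coeffs = [-50, 45, -12, 1]

integral = integrate_poly(coeffs, 2, 3)   # the integral of (x - 2)(5 - x)^2 over [2, 3]
prob = Fraction(4, 27) * integral         # P(X <= 3) for A = 2, B = 5, alpha = 2, beta = 3

print(integral, prob, round(float(prob), 3))
```

Working with Fraction avoids rounding, so the result can be compared directly with the fractions in the displayed calculation.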
The standard beta distribution is commonly used to model variation in the proportion or percentage of a quantity occurring in different samples, such as the proportion of a 24-hour day that an individual is asleep or the proportion of a certain
element in a chemical compound.
EXERCISES
Section 4.5 (72–86)
72. The lifetime X (in hundreds of hours) of a certain type of
vacuum tube has a Weibull distribution with parameters
α = 2 and β = 3. Compute the following:
a. E(X) and V(X)
b. P(X ≤ 6)
c. P(1.5 ≤ X ≤ 6)
(This Weibull distribution is suggested as a model for time
in service in “On the Assessment of Equipment Reliability:
Trading Data Collection Costs for Precision,” J. of Engr.
Manuf., 1991: 105–109.)
73. The authors of the article “A Probabilistic Insulation Life
Model for Combined Thermal-Electrical Stresses” (IEEE
Trans. on Elect. Insulation, 1985: 519–522) state that “the
Weibull distribution is widely used in statistical problems
relating to aging of solid insulating materials subjected to
aging and stress.” They propose the use of the distribution
as a model for time (in hours) to failure of solid insulating
specimens subjected to AC voltage. The values of the
parameters depend on the voltage and temperature; suppose α = 2.5 and β = 200 (values suggested by data in the
article).
a. What is the probability that a specimen’s lifetime is at
most 250? Less than 250? More than 300?
b. What is the probability that a specimen’s lifetime is
between 100 and 250?
c. What value is such that exactly 50% of all specimens
have lifetimes exceeding that value?
74. Let X = the time (in 10⁻¹ weeks) from shipment of a defective product until the customer returns the product. Suppose
that the minimum return time is γ = 3.5 and that the excess
X − 3.5 over the minimum has a Weibull distribution with
parameters α = 2 and β = 1.5 (see “Practical Applications
of the Weibull Distribution,” Industrial Quality Control,
Aug. 1964: 71–78).
a. What is the cdf of X?
b. What are the expected return time and variance of return
time? [Hint: First obtain E(X − 3.5) and V(X − 3.5).]
c. Compute P(X > 5).
d. Compute P(5 ≤ X ≤ 8).
75. Let X have a Weibull distribution with the pdf from
Expression (4.11). Verify that μ = βΓ(1 + 1/α). [Hint: In
the integral for E(X), make the change of variable
y = (x/β)^α, so that x = βy^(1/α).]
76. a. In Exercise 72, what is the median lifetime of such
tubes? [Hint: Use Expression (4.12).]
b. In Exercise 74, what is the median return time?
c. If X has a Weibull distribution with the cdf from
Expression (4.12), obtain a general expression for the
(100p)th percentile of the distribution.
d. In Exercise 74, the company wants to refuse to accept
returns after t weeks. For what value of t will only 10%
of all returns be refused?
77. The authors of the paper from which the data in Exercise
1.27 was extracted suggested that a reasonable probability
model for drill lifetime was a lognormal distribution with
μ = 4.5 and σ = .8.
a. What are the mean value and standard deviation of lifetime?
b. What is the probability that lifetime is at most 100?
c. What is the probability that lifetime is at least 200?
Greater than 200?
78. The article “On Assessing the Accuracy of Offshore Wind
Turbine Reliability-Based Design Loads from the
Environmental Contour Method” (Intl. J. of Offshore and
Polar Engr., 2005: 132–140) proposes the Weibull distribution with α = 1.817 and β = .863 as a model for 1-hour
significant wave height (m) at a certain site.
a. What is the probability that wave height is at most .5 m?
b. What is the probability that wave height exceeds its
mean value by more than one standard deviation?
c. What is the median of the wave-height distribution?
d. For 0 < p < 1, give a general expression for the 100pth
percentile of the wave-height distribution.
79. Nonpoint source loads are chemical masses that travel to the
main stem of a river and its tributaries in flows that are distributed over relatively long stream reaches, in contrast to
those that enter at well-defined and regulated points. The
article “Assessing Uncertainty in Mass Balance Calculation
of River Nonpoint Source Loads” (J. of Envir. Engr., 2008:
247–258) suggested that for a certain time period and location, X 5 nonpoint source load of total dissolved solids
could be modeled with a lognormal distribution having
mean value 10,281 kg/day/km and a coefficient of variation
CV = .40 (CV = σ_X/μ_X).
a. What are the mean value and standard deviation of
ln(X)?
b. What is the probability that X is at most 15,000
kg/day/km?
c. What is the probability that X exceeds its mean value,
and why is this probability not .5?
d. Is 17,000 the 95th percentile of the distribution?
80. a. Use Equation (4.13) to write a formula for the median μ̃
of the lognormal distribution. What is the median for the
load distribution of Exercise 79?
b. Recalling that z_α is our notation for the 100(1 − α) percentile of the standard normal distribution, write an
expression for the 100(1 − α) percentile of the lognormal distribution. In Exercise 79, what value will load
exceed only 1% of the time?
81. A theoretical justification based on a certain material failure mechanism underlies the assumption that ductile
strength X of a material has a lognormal distribution.
Suppose the parameters are μ = 5 and σ = .1.
a. Compute E(X) and V(X).
b. Compute P(X > 125).
c. Compute P(110 ≤ X ≤ 125).
d. What is the value of median ductile strength?
e. If ten different samples of an alloy steel of this type were
subjected to a strength test, how many would you expect
to have strength of at least 125?
f. If the smallest 5% of strength values were unacceptable,
what would the minimum acceptable strength be?
82. The article “The Statistics of Phytotoxic Air Pollutants”
(J. of Royal Stat. Soc., 1989: 183–198) suggests the lognormal distribution as a model for SO2 concentration above a
certain forest. Suppose the parameter values are μ = 1.9
and σ = .9.
a. What are the mean value and standard deviation of concentration?
b. What is the probability that concentration is at most 10?
Between 5 and 10?
83. What condition on α and β is necessary for the standard
beta pdf to be symmetric?
84. Suppose the proportion X of surface area in a randomly
selected quadrat that is covered by a certain plant has a standard beta distribution with α = 5 and β = 2.
a. Compute E(X) and V(X).
b. Compute P(X ≤ .2).
c. Compute P(.2 ≤ X ≤ .4).
d. What is the expected proportion of the sampling region
not covered by the plant?
85. Let X have a standard beta density with parameters α and β.
a. Verify the formula for E(X) given in the section.
b. Compute E[(1 − X)^m]. If X represents the proportion of
a substance consisting of a particular ingredient, what is
the expected proportion that does not consist of this
ingredient?
86. Stress is applied to a 20-in. steel bar that is clamped in a
fixed position at each end. Let Y = the distance from the
left end at which the bar snaps. Suppose Y/20 has a standard
beta distribution with E(Y) = 10 and V(Y) = 100/7.
a. What are the parameters of the relevant standard beta
distribution?
b. Compute P(8 ≤ Y ≤ 12).
c. Compute the probability that the bar snaps more than
2 in. from where you expect it to.
4.6 Probability Plots
An investigator will often have obtained a numerical sample x1, x2, …, xn and wish
to know whether it is plausible that it came from a population distribution of some
particular type (e.g., from a normal distribution). For one thing, many formal procedures from statistical inference are based on the assumption that the population distribution is of a specified type. The use of such a procedure is inappropriate if
the actual underlying probability distribution differs greatly from the assumed type.
b. Construct a Weibull probability plot. Is the Weibull distribution family plausible?
93. Construct a probability plot that will allow you to assess the
plausibility of the lognormal distribution as a model for the
rainfall data of Exercise 83 in Chapter 1.
94. The accompanying observations are precipitation values during March over a 30-year period in Minneapolis-St. Paul.
.77 1.74 .81 1.20 1.95 1.20
.47 1.43 3.37 2.20 3.00 3.09
1.51 2.10 .52 1.62 1.31 .32
.59 .81 2.81 1.87 1.18 1.35
4.75 2.48 .96 1.89 .90 2.05
a. Construct and interpret a normal probability plot for this
data set.
b. Calculate the square root of each value and then construct a normal probability plot based on this transformed data. Does it seem plausible that the square root
of precipitation is normally distributed?
c. Repeat part (b) after transforming by cube roots.
95. Use a statistical software package to construct a normal
probability plot of the tensile ultimate-strength data given in
Exercise 13 of Chapter 1, and comment.
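Where no dedicated package is at hand, the plotted pairs can be computed directly. The sketch below pairs the ith smallest observation with the standard normal percentile at (i − .5)/n, the plotting position described in Exercise 96. The function name normal_plot_pairs and the small sample are hypothetical; the sample is not data from any exercise.

```python
from statistics import NormalDist


def normal_plot_pairs(sample):
    """Return (z percentile, ordered observation) pairs for a normal probability plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i - 0.5) / n), y)
        for i, y in enumerate(ordered, start=1)
    ]


# Hypothetical sample, used only to show the shape of the output
sample = [1.2, 0.7, 2.5, 1.9, 1.4]
pairs = normal_plot_pairs(sample)

for z, y in pairs:
    print(f"{z:8.3f} {y:6.2f}")
```

Plotting the second coordinate against the first and judging whether the points fall near a straight line is then the usual visual check.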
96. Let the ordered sample observations be denoted by
y1, y2, …, yn (y1 being the smallest and yn the largest). Our
suggested check for normality is to plot the
(Φ⁻¹((i − .5)/n), yi) pairs. Suppose we believe that the
observations come from a distribution with mean 0, and
let w1, …, wn be the ordered absolute values of the xi's.
A half-normal plot is a probability plot of the wi's.
More specifically, since P(|Z| ≤ w) = P(−w ≤ Z ≤ w) =
2Φ(w) − 1, a half-normal plot is a plot of the
(Φ⁻¹{[(i − .5)/n + 1]/2}, wi) pairs. The virtue of this plot is
that small or large outliers in the original sample will now
appear only at the upper end of the plot rather than at both
ends. Construct a half-normal plot for the following sample
of measurement errors, and comment: −3.78, −1.27, 1.44,
−.39, 12.38, −43.40, 1.15, −3.96, −2.34, 30.84.
97. The following failure time observations (1000s of hours)
resulted from accelerated life testing of 16 integrated circuit
chips of a certain type:
82.8 242.0 229.9 11.6 26.5 558.9 359.5 244.8
366.7 502.5 304.3 204.6 307.8 379.1 179.7 212.6
Use the corresponding percentiles of the exponential
distribution with λ = 1 to construct a probability plot.
Then explain why the plot assesses the plausibility of
the sample having been generated from any exponential
distribution.
SUPPLEMENTARY EXERCISES
(98–128)
98. Let X = the time it takes a read/write head to locate a
desired record on a computer disk memory device once the
head has been positioned over the correct track. If the disks
rotate once every 25 millisec, a reasonable assumption is
that X is uniformly distributed on the interval [0, 25].
a. Compute P(10 ≤ X ≤ 20).
b. Compute P(X ≥ 10).
c. Obtain the cdf F(X).
d. Compute E(X) and σ_X.
99. A 12-in. bar that is clamped at both ends is to be subjected
to an increasing amount of stress until it snaps. Let Y =
the distance from the left end at which the break occurs.
Suppose Y has pdf
\[ f(y) = \begin{cases} \dfrac{1}{24}\, y \left( 1 - \dfrac{y}{12} \right) & 0 \le y \le 12 \\ 0 & \text{otherwise} \end{cases} \]
Compute the following:
a. The cdf of Y, and graph it.
b. P(Y ≤ 4), P(Y > 6), and P(4 ≤ Y ≤ 6)
c. E(Y), E(Y²), and V(Y)
d. The probability that the break point occurs more than
2 in. from the expected break point.
e. The expected length of the shorter segment when the
break occurs.
100. Let X denote the time to failure (in years) of a certain
hydraulic component. Suppose the pdf of X is
f(x) = 32/(x + 4)³ for x > 0.
a. Verify that f(x) is a legitimate pdf.
b. Determine the cdf.
c. Use the result of part (b) to calculate the probability that
time to failure is between 2 and 5 years.
d. What is the expected time to failure?
e. If the component has a salvage value equal to
100/(4 + x) when its time to failure is x, what is the
expected salvage value?
101. The completion time X for a certain task has cdf F(x) given by
\[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^3}{3} & 0 \le x < 1 \\ 1 - \dfrac{1}{2} \left( \dfrac{7}{3} - x \right) \left( \dfrac{7}{4} - \dfrac{3}{4}x \right) & 1 \le x \le \dfrac{7}{3} \\ 1 & x > \dfrac{7}{3} \end{cases} \]
a. Obtain the pdf f(x) and sketch its graph.
b. Compute P(.5 ≤ X ≤ 2).
c. Compute E(X).
102. The breakdown voltage of a randomly chosen diode of a
certain type is known to be normally distributed with mean
value 40 V and standard deviation 1.5 V.
a. What is the probability that the voltage of a single diode
is between 39 and 42?
b. What value is such that only 15% of all diodes have
voltages exceeding that value?
c. If four diodes are independently selected, what is the
probability that at least one has a voltage exceeding 42?
103. The article “Computer Assisted Net Weight Control”
(Quality Progress, 1983: 22–25) suggests a normal distribution with mean 137.2 oz and standard deviation 1.6 oz
for the actual contents of jars of a certain type. The stated
contents was 135 oz.
a. What is the probability that a single jar contains more
than the stated contents?
b. Among ten randomly selected jars, what is the probability that at least eight contain more than the stated
contents?
c. Assuming that the mean remains at 137.2, to what value
would the standard deviation have to be changed so that
95% of all jars contain more than the stated contents?
104. When circuit boards used in the manufacture of compact
disc players are tested, the long-run percentage of defectives is 5%. Suppose that a batch of 250 boards has been
received and that the condition of any particular board is
independent of that of any other board.
a. What is the approximate probability that at least 10% of
the boards in the batch are defective?
b. What is the approximate probability that there are
exactly 10 defectives in the batch?
105. The article “Characterization of Room Temperature
Damping in Aluminum-Indium Alloys” (Metallurgical
Trans., 1993: 1611–1619) suggests that Al matrix grain
size (μm) for an alloy consisting of 2% indium could be
modeled with a normal distribution with a mean value
96 and standard deviation 14.
a. What is the probability that grain size exceeds 100?
b. What is the probability that grain size is between
50 and 80?
c. What interval (a, b) includes the central 90% of all grain
sizes (so that 5% are below a and 5% are above b)?
106. The reaction time (in seconds) to a certain stimulus is a
continuous random variable with pdf
\[ f(x) = \begin{cases} \dfrac{3}{2} \cdot \dfrac{1}{x^2} & 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases} \]
a. Obtain the cdf.
b. What is the probability that reaction time is at most 2.5
sec? Between 1.5 and 2.5 sec?
c. Compute the expected reaction time.
d. Compute the standard deviation of reaction time.
e. If an individual takes more than 1.5 sec to react, a light
comes on and stays on either until one further second
has elapsed or until the person reacts (whichever happens first). Determine the expected amount of time that
the light remains lit. [Hint: Let h(X) = the time that the
light is on as a function of reaction time X.]
107. Let X denote the temperature at which a certain chemical
reaction takes place. Suppose that X has pdf
\[ f(x) = \begin{cases} \dfrac{1}{9}(4 - x^2) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]
a. Sketch the graph of f(x).
b. Determine the cdf and sketch it.
c. Is 0 the median temperature at which the reaction takes
place? If not, is the median temperature smaller or
larger than 0?
d. Suppose this reaction is independently carried out once
in each of ten different labs and that the pdf of reaction
time in each lab is as given. Let Y 5 the number among
the ten labs at which the temperature exceeds 1. What
kind of distribution does Y have? (Give the names and
values of any parameters.)
108. The article “Determination of the MTF of Positive
Photoresists Using the Monte Carlo Method”
(Photographic Sci. and Engr., 1983: 254–260) proposes
the exponential distribution with parameter λ = .93 as a
model for the distribution of a photon’s free path length
(μm) under certain circumstances. Suppose this is the correct model.
a. What is the expected path length, and what is the standard deviation of path length?
b. What is the probability that path length exceeds 3.0?
What is the probability that path length is between 1.0
and 3.0?
c. What value is exceeded by only 10% of all path lengths?
109. The article “The Prediction of Corrosion by Statistical
Analysis of Corrosion Profiles” (Corrosion Science, 1985:
305–315) suggests the following cdf for the depth X of the
deepest pit in an experiment involving the exposure of
carbon manganese steel to acidified seawater.
\[ F(x; \alpha, \beta) = e^{-e^{-(x-\alpha)/\beta}}, \qquad -\infty < x < \infty \]
The authors propose the values α = 150 and β = 90.
Assume this to be the correct model.
a. What is the probability that the depth of the deepest pit
is at most 150? At most 300? Between 150 and 300?
b. Below what value will the depth of the maximum pit be
observed in 90% of all such experiments?
c. What is the density function of X?
d. The density function can be shown to be unimodal (a
single peak). Above what value on the measurement
axis does this peak occur? (This value is the mode.)
e. It can be shown that E(X) ≈ .5772β + α. What is the
mean for the given values of α and β, and how does it
compare to the median and mode? Sketch the graph of
the density function. [Note: This is called the largest
extreme value distribution.]
110. Let t = the amount of sales tax a retailer owes the government for a certain period. The article “Statistical Sampling
in Tax Audits” (Statistics and the Law, 2008: 320–343)
proposes modeling the uncertainty in t by regarding it as a
normally distributed random variable with mean value μ
and standard deviation σ (in the article, these two parameters are estimated from the results of a tax audit involving
n sampled transactions). If a represents the amount
the retailer is assessed, then an under-assessment results if
t > a and an over-assessment results if a > t. The proposed
penalty (i.e., loss) function for over- or under-assessment is
L(a, t) = t − a if t > a and = k(a − t) if t ≤ a (k > 1 is
suggested to incorporate the idea that over-assessment
is more serious than under-assessment).
a. Show that a* = μ + σΦ⁻¹(1/(k + 1)) is the value of a
that minimizes the expected loss, where Φ⁻¹ is the
inverse function of the standard normal cdf.
b. If k = 2 (suggested in the article), μ = $100,000, and σ =
$10,000, what is the optimal value of a, and what is the
resulting probability of over-assessment?
111. The mode of a continuous distribution is the value x* that
maximizes f(x).
a. What is the mode of a normal distribution with parameters μ and σ?
b. Does the uniform distribution with parameters A and B
have a single mode? Why or why not?
c. What is the mode of an exponential distribution with
parameter λ? (Draw a picture.)
d. If X has a gamma distribution with parameters α and β,
and α > 1, find the mode. [Hint: ln[f(x)] will be maximized iff f(x) is, and it may be simpler to take the derivative of ln[f(x)].]
e. What is the mode of a chi-squared distribution having ν
degrees of freedom?
112. The article “Error Distribution in Navigation” (J. of the
Institute of Navigation, 1971: 429–442) suggests that the
frequency distribution of positive errors (magnitudes of
errors) is well approximated by an exponential distribution.
Let X = the lateral position error (nautical miles), which
can be either negative or positive. Suppose the pdf of X is
f(x) = (.1)e^(−.2|x|), −∞ < x < ∞
a. Sketch a graph of f(x) and verify that f(x) is a legitimate
pdf (show that it integrates to 1).
b. Obtain the cdf of X and sketch it.
c. Compute P(X ≤ 0), P(X ≤ 2), P(−1 ≤ X ≤ 2), and the
probability that an error of more than 2 miles is made.
113. In some systems, a customer is allocated to one of two
service facilities. If the service time for a customer served
by facility i has an exponential distribution with parameter
λi (i = 1, 2) and p is the proportion of all customers served
by facility 1, then the pdf of X = the service time of a randomly selected customer is
\[ f(x; \lambda_1, \lambda_2, p) = \begin{cases} p\lambda_1 e^{-\lambda_1 x} + (1 - p)\lambda_2 e^{-\lambda_2 x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
This is often called the hyperexponential or mixed exponential distribution. This distribution is also proposed as a
model for rainfall amount in “Modeling Monsoon Affected
Rainfall of Pakistan by Point Processes” (J. of Water Resources Planning and Mgmnt., 1992: 671–688).
a. Verify that f(x; λ1, λ2, p) is indeed a pdf.
b. What is the cdf F(x; λ1, λ2, p)?
c. If X has f(x; λ1, λ2, p) as its pdf, what is E(X)?
d. Using the fact that E(X²) = 2/λ² when X has an exponential distribution with parameter λ, compute E(X²)
when X has pdf f(x; λ1, λ2, p). Then compute V(X).
e. The coefficient of variation of a random variable (or
distribution) is CV = σ/μ. What is CV for an exponential rv? What can you say about the value of CV when X
has a hyperexponential distribution?
f. What is CV for an Erlang distribution with parameters
λ and n as defined in Exercise 68? [Note: In applied
work, the sample CV is used to decide which of the
three distributions might be appropriate.]
114. Suppose a particular state allows individuals filing tax
returns to itemize deductions only if the total of all itemized deductions is at least $5000. Let X (in 1000s of dollars) be the total of itemized deductions on a randomly
chosen form. Assume that X has the pdf
\[ f(x; \alpha) = \begin{cases} \dfrac{k}{x^{\alpha}} & x \ge 5 \\ 0 & \text{otherwise} \end{cases} \]
a. Find the value of k. What restriction on α is necessary?
b. What is the cdf of X?
c. What is the expected total deduction on a randomly
chosen form? What restriction on α is necessary for
E(X) to be finite?
d. Show that ln(X/5) has an exponential distribution with
parameter α − 1.
115. Let Ii be the input current to a transistor and I0 be the output current. Then the current gain is proportional to
ln(I0/Ii). Suppose the constant of proportionality is 1
(which amounts to choosing a particular unit of measurement), so that current gain = X = ln(I0/Ii). Assume X is
normally distributed with μ = 1 and σ = .05.
a. What type of distribution does the ratio I0/Ii have?
b. What is the probability that the output current is more
than twice the input current?
c. What are the expected value and variance of the ratio of
output to input current?
116. The article “Response of SiCf/Si3N4 Composites Under
Static and Cyclic Loading—An Experimental and
Statistical Analysis” (J. of Engr. Materials and Technology,
1997: 186–193) suggests that tensile strength (MPa) of
composites under specified conditions can be modeled by
a Weibull distribution with α = 9 and β = 180.
a. Sketch a graph of the density function.
b. What is the probability that the strength of a randomly
selected specimen will exceed 175? Will be between
150 and 175?
c. If two randomly selected specimens are chosen and
their strengths are independent of one another, what is
the probability that at least one has a strength between
150 and 175?
d. What strength value separates the weakest 10% of all
specimens from the remaining 90%?
117. Let Z have a standard normal distribution and define a new
rv Y by Y = σZ + μ. Show that Y has a normal distribution with parameters μ and σ. [Hint: Y ≤ y iff Z ≤ ? Use
this to find the cdf of Y and then differentiate it with respect
to y.]
118. a. Suppose the lifetime X of a component, when measured
in hours, has a gamma distribution with parameters α
and β. Let Y = the lifetime measured in minutes.
Derive the pdf of Y. [Hint: Y ≤ y iff X ≤ y/60. Use this
to obtain the cdf of Y and then differentiate to obtain the
pdf.]
b. If X has a gamma distribution with parameters α and β,
what is the probability distribution of Y = cX?
119. In Exercises 117 and 118, as well as many other situations,
one has the pdf f(x) of X and wishes to know the pdf of
Y = h(X). Assume that h(·) is an invertible function, so
that y = h(x) can be solved for x to yield x = k(y). Then it
can be shown that the pdf of Y is
\[ g(y) = f[k(y)] \cdot |k'(y)| \]
a. If X has a uniform distribution with A = 0 and B = 1,
derive the pdf of Y = −ln(X).
b. Work Exercise 117, using this result.
c. Work Exercise 118(b), using this result.
120. Based on data from a dart-throwing experiment, the article
“Shooting Darts” (Chance, Summer 1997, 16–19) proposed
that the horizontal and vertical errors from aiming at a point
target should be independent of one another, each with a
normal distribution having mean 0 and variance σ². It can
then be shown that the pdf of the distance V from the target
to the landing point is
\[ f(v) = \frac{v}{\sigma^2} \cdot e^{-v^2/(2\sigma^2)}, \qquad v > 0 \]
a. This pdf is a member of what family introduced in this
chapter?
b. If σ = 20 mm (close to the value suggested in the
paper), what is the probability that a dart will land
within 25 mm (roughly 1 in.) of the target?
121. The article “Three Sisters Give Birth on the Same Day”
(Chance, Spring 2001, 23–25) used the fact that three Utah
sisters had all given birth on March 11, 1998 as a basis for
posing some interesting questions regarding birth coincidences.
a. Disregarding leap year and assuming that the other
365 days are equally likely, what is the probability that
three randomly selected births all occur on March 11?
Be sure to indicate what, if any, extra assumptions you
are making.
b. With the assumptions used in part (a), what is the probability that three randomly selected births all occur on
the same day?
c. The author suggested that, based on extensive data, the
length of gestation (time between conception and birth)
could be modeled as having a normal distribution with
mean value 280 days and standard deviation 19.88 days.
The due dates for the three Utah sisters were March 15,
April 1, and April 4, respectively. Assuming that all
three due dates are at the mean of the distribution, what
is the probability that all births occurred on March 11?
[Hint: The deviation of birth date from due date is normally distributed with mean 0.]
d. Explain how you would use the information in part (c)
to calculate the probability of a common birth date.
122. Let X denote the lifetime of a component, with f(x) and
F(x) the pdf and cdf of X. The probability that the component fails in the interval (x, x + Δx) is approximately
f(x) · Δx. The conditional probability that it fails in
(x, x + Δx) given that it has lasted at least x is
f(x) · Δx/[1 − F(x)]. Dividing this by Δx produces the
failure rate function:
\[ r(x) = \frac{f(x)}{1 - F(x)} \]
An increasing failure rate function indicates that older
components are increasingly likely to wear out, whereas a
decreasing failure rate is evidence of increasing reliability
with age. In practice, a “bathtub-shaped” failure is often
assumed.
a. If X is exponentially distributed, what is r(x)?
b. If X has a Weibull distribution with parameters α and β,
what is r(x)? For what parameter values will r(x) be
increasing? For what parameter values will r(x) decrease with x?
c. Since r(x) = −(d/dx) ln[1 − F(x)], ln[1 − F(x)] = −∫r(x) dx. Suppose
\[ r(x) = \begin{cases} \alpha \left( 1 - \dfrac{x}{\beta} \right) & 0 \le x \le \beta \\ 0 & \text{otherwise} \end{cases} \]
so that if a component lasts β hours, it will last forever
(while seemingly unreasonable, this model can be used to
study just “initial wearout”). What are the cdf and pdf of X?
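The failure rate function of Exercise 122 is easy to evaluate numerically once a pdf and cdf are available. The sketch below is only an illustration: the function name failure_rate is hypothetical, and the uniform lifetime on [0, 10] is an assumed example chosen because it is not one of the distributions the exercise asks about.

```python
def failure_rate(pdf, cdf, x):
    """Return r(x) = f(x) / (1 - F(x)); undefined once F(x) reaches 1."""
    survival = 1.0 - cdf(x)
    if survival <= 0.0:
        raise ValueError("failure rate is undefined once F(x) reaches 1")
    return pdf(x) / survival


# Assumed illustration: lifetime uniformly distributed on [0, 10]
def uniform_pdf(x):
    return 0.1 if 0.0 <= x <= 10.0 else 0.0


def uniform_cdf(x):
    return min(max(x / 10.0, 0.0), 1.0)


rates = [failure_rate(uniform_pdf, uniform_cdf, x) for x in (2.0, 5.0, 8.0)]
print(rates)
```

For this assumed lifetime, r(x) = 1/(10 − x) on 0 ≤ x < 10, so the computed rates grow with x, which is the "increasing failure rate" pattern described in the exercise.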
123. Let U have a uniform distribution on the interval [0, 1].
Then observed values having this distribution can be obtained from a computer’s random number generator. Let
X = −(1/λ)ln(1 − U).
a. Show that X has an exponential distribution with parameter λ. [Hint: The cdf of X is F(x) = P(X ≤ x); X ≤ x
is equivalent to U ≤ ?]
b. How would you use part (a) and a random number generator to obtain observed values from an exponential
distribution with parameter λ = 10?
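The transformation in Exercise 123 can be tried out on a computer. The sketch below applies X = −(1/λ)ln(1 − U) to uniform draws from Python's standard random module; the function name exponential_draw, the seed, and the number of draws are arbitrary assumptions made for illustration.

```python
import math
import random


def exponential_draw(lam, rng):
    """Transform one uniform draw U on [0, 1) into X = -(1/lam) * ln(1 - U)."""
    u = rng.random()  # 1 - u lies in (0, 1], so the logarithm is defined
    return -math.log(1.0 - u) / lam


rng = random.Random(12345)  # fixed seed so the run is repeatable
lam = 10
draws = [exponential_draw(lam, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.4f}, 1/lambda = {1 / lam:.4f}")
```

Because an exponential distribution with parameter λ has mean 1/λ, the sample mean of many such draws should land close to 1/λ, which gives a quick sanity check on the transformation.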
124. Consider an rv X with mean m and standard deviation s,
and let g(X) be a specified function of X. The first-order
Taylor series approximation to g(X) in the neighborhood of
m is
g(X) < g(m) 1 gr(m) # (X 2 m)
The right-hand side of this equation is a linear function of
X. If the distribution of X is concentrated in an interval over
which g( )is approximately linear [e.g., 1x is approximately linear in (1, 2)], then the equation yields approximations to E(g(X)) and V(g(X)).
a. Give expressions for these approximations. [Hint: Use
rules of expected value and variance for a linear function aX 1 b.]
b. If the voltage v across a medium is fixed but current I is
random, then resistance will also be a random variable
related to I by $R = v/I$. If $\mu_I = 20$ and $\sigma_I = .5$, calculate approximations to $\mu_R$ and $\sigma_R$.
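The quality of the first-order approximation in Exercise 124 can be checked by simulation. The sketch below applies the linear-function rules to $g(\mu) + g'(\mu)(X - \mu)$ and compares the result with simulated values of g(X). It uses $g(x) = \sqrt{x}$ on an X concentrated inside (1, 2); the normal distribution, its parameters, and the function names are assumptions made only for this illustration.

```python
import math
import random
import statistics

def linear_approx(g, g_prime, mu, sigma):
    """Mean and sd of the linear function g(mu) + g'(mu) * (X - mu)."""
    return g(mu), abs(g_prime(mu)) * sigma

def simulate(g, mu, sigma, n=200_000, seed=1):
    """Simulated mean and sd of g(X), with X assumed normal(mu, sigma)."""
    rng = random.Random(seed)
    values = [g(rng.gauss(mu, sigma)) for _ in range(n)]
    return statistics.fmean(values), statistics.pstdev(values)

def sqrt_prime(x):
    """Derivative of sqrt(x)."""
    return 0.5 / math.sqrt(x)

# Assumed illustration: X concentrated near 1.5, well inside (1, 2).
print("approximation:", linear_approx(math.sqrt, sqrt_prime, 1.5, 0.05))
print("simulation:   ", simulate(math.sqrt, 1.5, 0.05))
```

The two pairs agree closely here because the spread of X is small relative to the curvature of g. Widening `sigma` is a quick way to see the approximation degrade.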
125. A function g(x) is convex if the chord connecting any two
points on the function’s graph lies above the graph. When
g(x) is differentiable, an equivalent condition is that for
every x, the tangent line at x lies entirely on or below the
graph. (See the figure below.) How does $g(\mu) = g(E(X))$
compare to E(g(X))? [Hint: The equation of the tangent line
at $x = \mu$ is $y = g(\mu) + g'(\mu) \cdot (x - \mu)$. Use the condition
of convexity, substitute X for x, and take expected values.]
[Note: Unless g(x) is linear, the resulting inequality (usually
called Jensen’s inequality) is strict ($<$ rather than $\le$); it is
valid for both continuous and discrete rv’s.]
[Figure: graph of a convex function with its tangent line at x]
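A quick numerical experiment can make the comparison in Exercise 125 concrete before attempting the proof. The sketch below estimates $E(g(X)) - g(E(X))$ from simulated values for two convex functions and one linear function. The uniform distribution on [0, 2], the sample size, and the name `jensen_gap` are illustrative assumptions, not part of the exercise.

```python
import math
import random
import statistics

def jensen_gap(g, draws):
    """Estimate E(g(X)) - g(E(X)) from a list of observed values."""
    mean_of_g = statistics.fmean(g(x) for x in draws)
    g_of_mean = g(statistics.fmean(draws))
    return mean_of_g - g_of_mean

rng = random.Random(4)                                  # fixed seed
draws = [rng.uniform(0.0, 2.0) for _ in range(50_000)]  # assumed distribution

print("g(x) = x**2  :", jensen_gap(lambda x: x * x, draws))
print("g(x) = exp(x):", jensen_gap(math.exp, draws))
print("g(x) = 3x + 1:", jensen_gap(lambda x: 3 * x + 1, draws))
```

The sign of each printed gap indicates the direction of the inequality, and the linear case shows what happens when the tangent line coincides with the graph.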
126. Let X have a Weibull distribution with parameters $\alpha = 2$
and $\beta$. Show that $Y = 2X^2/\beta^2$ has a chi-squared distribution with $\nu = 2$. [Hint: The cdf of Y is $P(Y \le y)$; express
this probability in the form P(X # g(y)), use the fact that X
has a cdf of the form in Expression (4.12), and differentiate with respect to y to obtain the pdf of Y.]
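The claim in Exercise 126 can be sanity-checked by simulation before working through the algebra. The sketch below assumes the Weibull cdf has the form $F(x) = 1 - e^{-(x/\beta)^\alpha}$ for $x \ge 0$, draws X by inverting that cdf, and compares the empirical cdf of Y with the chi-squared cdf for $\nu = 2$, which is $1 - e^{-y/2}$. The scale value 3.0, the sample size, and the function names are arbitrary choices for the illustration.

```python
import math
import random

def weibull_draw(alpha, beta, rng):
    """Inverse-cdf draw, assuming F(x) = 1 - exp(-(x/beta)**alpha), x >= 0."""
    return beta * (-math.log(1.0 - rng.random())) ** (1.0 / alpha)

def empirical_cdf_of_y(y, beta, n=100_000, seed=2):
    """Fraction of simulated Y = 2 X^2 / beta^2 values that are <= y."""
    rng = random.Random(seed)
    hits = sum(
        2.0 * weibull_draw(2.0, beta, rng) ** 2 / beta ** 2 <= y
        for _ in range(n)
    )
    return hits / n

def chi2_two_df_cdf(y):
    """cdf of the chi-squared distribution with 2 degrees of freedom."""
    return 1.0 - math.exp(-y / 2.0) if y > 0 else 0.0

for y in (0.5, 2.0, 5.0):
    print(y, empirical_cdf_of_y(y, beta=3.0), chi2_two_df_cdf(y))
```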
127. An individual’s credit score is a number calculated based on
that person’s credit history that helps a lender determine
how much he/she should be loaned or what credit limit
should be established for a credit card. An article in the
Los Angeles Times gave data which suggested that a beta
distribution with parameters $A = 150$, $B = 850$, $\alpha = 8$,
$\beta = 2$ would provide a reasonable approximation to the
distribution of American credit scores. [Note: credit scores
are integer-valued].
a. Let X represent a randomly selected American credit
score. What are the mean value and standard deviation
of this random variable? What is the probability that X
is within 1 standard deviation of its mean value?
b. What is the approximate probability that a randomly
selected score will exceed 750 (which lenders consider
a very good score)?
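The quantities in Exercise 127 can be cross-checked numerically once they have been worked out by hand. The sketch below assumes the beta pdf on [A, B] is the standard beta density applied to $(x - A)/(B - A)$ and rescaled by $1/(B - A)$, and it integrates with a simple midpoint rule. The helper names and step count are choices made for this illustration. Because credit scores are integer-valued, the continuous model is itself only an approximation.

```python
import math

A, B, ALPHA, BETA = 150.0, 850.0, 8.0, 2.0

def beta_pdf(x):
    """Assumed pdf of the beta distribution on [A, B]."""
    if not A <= x <= B:
        return 0.0
    u = (x - A) / (B - A)                      # rescale to [0, 1]
    const = math.gamma(ALPHA + BETA) / (math.gamma(ALPHA) * math.gamma(BETA))
    return const * u ** (ALPHA - 1) * (1 - u) ** (BETA - 1) / (B - A)

def integrate(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

mean = integrate(lambda x: x * beta_pdf(x), A, B)
sd = math.sqrt(integrate(lambda x: (x - mean) ** 2 * beta_pdf(x), A, B))

print("mean:", mean)
print("sd:", sd)
print("P(within 1 sd):", integrate(beta_pdf, mean - sd, mean + sd))
print("P(X > 750):", integrate(beta_pdf, 750.0, B))
```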
128. Let V denote rainfall volume and W denote runoff volume
(both in mm). According to the article “Runoff Quality
Analysis of Urban Catchments with Analytical Probability
Models” (J. of Water Resource Planning and Management,
2006: 4–14), the runoff volume will be 0 if $V \le \nu_d$ and will
be $k(V - \nu_d)$ if $V > \nu_d$. Here $\nu_d$ is the volume of depression storage (a constant), and k (also a constant) is the
runoff coefficient. The cited article proposes an exponential distribution with parameter $\lambda$ for V.
a. Obtain an expression for the cdf of W. [Note: W is neither purely continuous nor purely discrete; instead it has
a “mixed” distribution with a discrete component at 0
and is continuous for values $w > 0$.]
b. What is the pdf of W for $w > 0$? Use this to obtain an
expression for the expected value of runoff volume.
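The "mixed" nature of W in Exercise 128 is easy to see in a simulation. The sketch below draws V from an exponential distribution with `random.expovariate`, applies the runoff rule, and compares the fraction of exact zeros with $P(V \le \nu_d) = 1 - e^{-\lambda \nu_d}$. The values chosen for `lam`, `v_d` (standing in for the depression storage volume), and `k` are made up for the illustration and are not from the cited article.

```python
import math
import random

def runoff(v, v_d, k):
    """W = 0 if V <= v_d, otherwise k * (V - v_d)."""
    return 0.0 if v <= v_d else k * (v - v_d)

def simulate_runoff(lam, v_d, k, n=100_000, seed=3):
    """Simulated runoff volumes for exponential rainfall with rate lam."""
    rng = random.Random(seed)
    return [runoff(rng.expovariate(lam), v_d, k) for _ in range(n)]

# Assumed illustrative parameter values.
lam, v_d, k = 0.1, 5.0, 0.5
w = simulate_runoff(lam, v_d, k)

print("fraction with W = 0:", sum(x == 0.0 for x in w) / len(w))
print("P(V <= v_d):        ", 1.0 - math.exp(-lam * v_d))
print("average runoff:     ", sum(w) / len(w))
```

The positive fraction of exact zeros is the discrete component at 0. The average runoff printed on the last line can be compared with the expression obtained in part (b).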
Bibliography
Bury, Karl, Statistical Distributions in Engineering, Cambridge
Univ. Press, Cambridge, England, 1999. A readable and
informative survey of distributions and their properties.
Johnson, Norman, Samuel Kotz, and N. Balakrishnan,
Continuous Univariate Distributions, vols. 1–2, Wiley, New
York, 1994. These two volumes together present an exhaustive survey of various continuous distributions.
Nelson, Wayne, Applied Life Data Analysis, Wiley, New York,
1982. Gives a comprehensive discussion of distributions and
methods that are used in the analysis of lifetime data.
Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability
Models and Applications (2nd ed.), Macmillan, New York,
1994. Good coverage of general properties and specific distributions.