Download Continuous Random Variables and the Normal

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
CHAPTER
6
Continuous Random Variables
and the Normal Distribution
CHAPTER OUTLINE
6.1 The Standard Normal Distribution
6.2 Standardizing a Normal Distribution
6.3 Applications of the Normal Distribution
6.4 Determining the z and x values when an Area
under the Normal Distribution Curve is Known
6.1 THE STANDARD NORMAL DISTRIBUTION
The standard normal distribution is a normal distribution that has a mean equal to 0 and a
standard deviation of 1. Probabilities and z-scores associated with this distribution can be
found with Excel’s NORMSDIST and NORMSINV functions, respectively. The “S” in
the middle of each function name is for “Standard.”
First, we’ll consider computing probabilities for the standard normal distribution. (In
Section 6.4, we’ll calculate z-scores.) Your textbook has a table of probabilities
associated with various z-scores that can be used, but Excel can calculate probabilities for
any z-score with its built-in NORMSDIST function. This function calculates the
cumulative probability for any z-score, i.e., the area under the curve to the left of that zscore. So, if you want the probability between two z-scores, you have to use the function
for each one and then subtract, greater minus smaller. (Your answer, which is a
probability or area, should always be non-negative.)
87
88 Chapter 6 Continuous Random Variables and the Normal Distribution
Example 6-1 Find the area under the standard normal curve between z = 0 and z = 1.95.
Solution: Click on a blank cell in an Excel worksheet. Type the following:
=NORMSDIST(1.95)-NORMSDIST(0).
Hit ENTER, and you should see the probability of 0.474412 appear.
The other option is to insert the NORMSDIST function, type 1.95 into the text
box, click OK, hit F2 to edit the cell, type a minus sign after the last parenthesis, insert
NORMSDIST again, type 0, and then click OK again.
Example 6-2 Use Excel to find the following areas under the standard normal curve.
a) Area to the right of z = 2.32
b) Area to the left of z = –1.54
Solution: For part a), you need to subtract the cumulative probability (area to the left)
from 1 to get the complementary probability (area to the right). Type the following into a
blank cell:
=1-NORMSDIST(2.32).
You could, instead, just type “=1-” and then insert the function and enter 2.32.
Or you could find the cumulative probability in one cell, and then in another cell type
“=1-” and then click on the cell with the cumulative probability, and hit ENTER. Either
way you do it, you should see the probability of 0.01017 appear.
For part b), just insert the NORMSDIST function (or type it in with “=” before it
and “(” after it) and enter -1.54. This should give you the desired cumulative probability
(area to the left) of 0.06178.
6.2 STANDARDIZING A NORMAL DISTRIBUTION
If a random variable x is normal with mean µ and standard deviation σ , then the zx−µ
, is normal with mean 0 and standard deviation 1. This is
score, defined by z =
σ
important, especially when comparing populations which have different means and/or
standard deviations.
One can use Excel’s STANDARDIZE function to find the z-score that corresponds to
some x-value from a nonstandard normal distribution (where the mean is not 0 or the
standard deviation is not 1). This function has three inputs:
1) the x-value to be standardized = X,
2) the mean of the normal distribution from which the x-value is taken = Mean,
3) the standard deviation of the normal distribution from which the x-value is taken
= Standard_dev.
Excel Manual 89
Example 6-3 Let x be a continuous random variable that has a normal distribution with a
mean of 50 and a standard deviation of 10. Convert the following x-values to z-scores.
b) x = 35
a) x = 55
Solution: Click on a blank cell in an Excel worksheet, and insert the STANDARDIZE
function. For part a), enter 55 into the first text box, 50 into the second, and 10 into the
third. (You can hit TAB to go from one text box to the next, if you wish.) Click OK or
hit ENTER, and you should see the z-score of 0.5 appear.
Figure 6.1 Use Excel’s STANDARDIZE function if you want to convert an x-value from a normal
distribution with given mean and standard deviation to a z-score.
For part b), you can simply edit the first input (double-click on the cell or click on it
and hit F2). Change the 55 to a 35 and hit ENTER. You should see a z-score of –1.5
appear.
Figure 6.2 Excel’s STANDARDIZE function can be typed in and/or edited instead of inserted.
In order to compute probabilities associated with a nonstandard normal distribution
(when the mean is not 0 or the standard deviation is not 1), you can either first
standardize and use the NORMSDIST function, or you can more simply use the
NORMDIST function which takes into account the different mean and standard
deviation. (Notice the absence of the “S” in the middle of the function name, since this is
used for a normal distribution that is not standard.) Specifically, Excel’s NORMDIST
function requires four inputs:
90 Chapter 6 Continuous Random Variables and the Normal Distribution
1)
2)
3)
4)
the x-value at the edge of the desired cumulative probability = X,
the mean of the distribution = Mean,
the standard deviation of the distribution = Standard_dev,
1 (for TRUE) since we always want the cumulative normal probability =
Cumulative.
Note: The last input above is different from the discrete distributions like the binomial
distribution, where the FALSE option is sometimes desirable. Keep in mind that the
FALSE option would always generate a probability of zero for a continuous random
variable.
Example 6-4 Let x be a continuous random variable that is normally distributed with a
mean of 25 and a standard deviation of 4. Find the area
b) between x = 18 and x = 34.
a) between x = 25 and x = 32
Solution:
a) Click on a blank cell in an Excel worksheet. Type the following:
=NORMDIST(32,25,4,1)-NORMDIST(25,25,4,1).
Hit ENTER and you should see the probability of 0.459941 appear.
The other option is to insert the NORMDIST function, type 32 into the first text
box, 25 into the second, 4 into the third, and 1 into the fourth text box, click OK, hit F2
to edit the cell, type a minus sign after the last parenthesis, insert NORMDIST again,
enter 25, 25, 4, and 1, and then click OK again. (You can hit TAB to go from one text
box to the next, if you wish.)
b) Double-click on the cell from part a) and edit the functions so that the first inputs
are 34 and 18 instead of 32 and 25, respectively. Hit ENTER and you should see the
probability of 0.947716 appear.
Figure 6.3 Using the difference of two applications of Excel’s NORMDIST function to find a
probability between two values.
Example 6-5 Let x be a continuous random variable that has a normal distribution with a
mean of 80 and a standard deviation of 12. Find the area under the normal distribution
curve to the left of 27.
Solution: This is a direct application of the NORMDIST function, since we want the
area to the left. Insert this function into a cell and enter the four inputs as shown in the
figure below. Click OK, and you should see a value of 5.0159E-06 appear. This is
Excel’s scientific notation, which means 5.0159 × 10-6 or 0.0000050159. Pretty small!
Excel Manual 91
That’s because 27 is more than 4 standard deviations (4×12 = 48) below the mean (80).
So, the area to the left of that is nearly 0.
Figure 6.4 Excel gives the result in scientific notation when the probability is extremely small.
6.3 APPLICATIONS OF THE NORMAL DISTRIBUTION
Now, we consider a couple of applications involving the use of Excel’s NORMDIST
function.
Example 6-6 According to Automotive Lease Guide, the Porsche 911 sports car is
among the vehicles that hold their value best. A Porsche 911 (with a price of $87,500 for
a new car) is expected to command a price of $48,125 after three years (The Wall Street
Journal, August 6, 2002). Suppose the prices of all three-year-old Porsche 911 sports
cars have a normal distribution with a mean price of $48,125 and a standard deviation of
$1600. Find the probability that a randomly selected three-year-old Porsche 911 will sell
for a price between $46,000 and $49,000.
Solution: Note that we have a normal distribution with a mean of 48,125 and a standard
deviation of 1600. So, in order to find the probability, we need to use the NORMDIST
function. The values 48125 and 1600 will be the 2nd and 3rd inputs, respectively. Since
we seek the probability for between these two values, we need to apply the function
twice, once with the upper value and once with the lower value as the first input, and then
subtract. This can be done simply by typing in the following:
=NORMDIST(49000,48125,1600,1)-NORMDIST(46000,48125,1600,1).
92 Chapter 6 Continuous Random Variables and the Normal Distribution
You should see the resulting probability of 0.615699, or approximately 61.6%.
Important Note: You must not type commas within a number. Commas must only be
used to separate the four distinct inputs.
Of course, an alternative option is to insert the NORMDIST function, type 49000
into the first text box, 48125 into the second, 1600 into the third, and 1 into the fourth
text box, click OK, hit F2 to edit the cell, type a minus sign after the last parenthesis,
insert NORMDIST again, enter 46000, 48125, 1600, and 1, and then click OK again.
The advantage of inserting the function is that you are prompted as to what the
input(s) represent, in case you don’t remember what goes where.
Figure 6.5 Subtract two applications of the NORMDIST function in order to find the probability
between two values.
Example 6-7 A racing car is one of the many toys manufactured by Mack Corporation.
The assembly times for this toy can be modeled using a normal distribution with a mean
of 55 minutes and a standard deviation of 4 minutes. The company closes at 5 P.M.
every day. If one worker starts to assemble a racing car at 4 P.M., what is the probability
that she will finish this job before the company closes for the day?
Solution: Identify the mean and standard deviation: 55 and 4 minutes, respectively.
These are the 2nd and 3rd inputs of Excel’s NORMDIST function. The 1st input is the
“cutoff” for the probability desired. In this case, we want the probability that she will
finish in 1 hour (60 minutes) or less. So, the first input is 60. Always enter 1 for the last
input. Hit ENTER or click on OK and you will get the cumulative probability of 60 or
less. This value, 0.89435 (approximately 89.4%), is exactly what we want.
Excel Manual 93
Figure 6.6 Using Excel’s NORMDIST function to find a cumulative probability.
6.4 DETERMINING THE Z AND X VALUES WHEN AN AREA UNDER
THE NORMAL DISTRIBUTION CURVE IS KNOWN
In order to go from a cumulative probability back to a z-score, use the function
NORMSINV (with an “S”). In order to go from a cumulative probability back to an xvalue in a nonstandard normal distribution, you need to use the function NORMINV
(with no “S”). (Note that the “INV” suffix is in reference to an “inverse” operation – i.e.,
going backwards.) The key is to make sure that you first know the cumulative area or
probability, i.e. the area of the region under the curve lying to the left of the desired
value.
Example 6-8 Find the value of z such that the area under the standard normal curve in
the right tail is .0050.
Solution: Since the area to the right of z is .0050, the area to the left of z, the cumulative
area, must be 1-.0050 = .9950. Insert the NORMSINV function and enter .9950 (or you
can simply enter 1-.0050). Click on OK and you should see the z-score of 2.575835
appear.
94 Chapter 6 Continuous Random Variables and the Normal Distribution
Figure 6.7 Use Excel’s NORMSINV function to go “backwards” from a cumulative area (to the left
of a z-score in the standard normal distribution) to the z-score.
The first input of the NORMINV function is also the cumulative probability (or area
to the left) of the desired value, but there are two more inputs required: the mean and
standard deviation of the nonstandard normal distribution. (These two inputs were
unnecessary for NORMSINV since by definition of the standard normal, the mean and
standard deviation are always 0 and 1, respectively. As such, it is really just a special
case of NORMINV.)
Example 6-9 It is known that the life of a calculator manufactured by Intal Corporation
has a normal distribution with a mean of 54 months and a standard deviation of 8 months.
What should the warranty period be to replace a malfunctioning calculator if the
company does not want to replace more than 1% of all the calculators sold?
Solution: Note that we have a normal distribution with a mean of 54 and standard
deviation of 8. We want to find a “cutoff” value for the 1% proportion (or probability).
To do so, we need to determine the cumulative probability – the probability to the left of
the cutoff that we’re looking for. If the 1% are being replaced, it’s because they are not
lasting long enough. They are the ones on the low end of the distribution. So, this is our
cumulative probability: 0.01.
Symbolically, we need to find the number of months x such that P ( X ≤ x) = 0.01 ,
where X is a normal random variable with µ = 54 and σ = 8 . To perform the necessary
computations on Excel, insert the NORMINV function and enter the three inputs 0.01,
54, and 8. Then, hit ENTER and you should see the value of 35.38926, standing for
approximately 35.4 months. This is what the warranty period should be.
Excel Manual 95
Figure 6.8 Use Excel’s NORMINV function to find a “cutoff” value for a known cumulative
proportion or probability in a nonstandard normal distribution.
Example 6-10 Nearly all high school students who intend to go to college take the SAT
test. In 2002, the mean SAT score (in verbal and mathematics) of all students was 1020.
Debbie is planning to take this test soon. Suppose the SAT scores of all students who
take this test with Debbie will have a normal distribution with a mean of 1020 and a
standard deviation of 153. What should her score be on this test so that only 10% of all
examinees score higher than she does?
Solution: In this case, we have a normal distribution with a mean of 1020 and standard
deviation of 153. Since we want to find a “cutoff” value for the 10% proportion (or
probability), we need to use the NORMINV function. But, this is not the cumulative
probability . Rather, we seek the percentage above the value we’re looking for. As such,
the cumulative probability is 100%-10% = 90% or 0.90. Symbolically, we need to find
the score x such that P ( X ≤ x) = 0.90 , where X is a normal random variable with
µ = 1020 and σ = 153 . (Notice that this is talking about the 90th percentile!)
To perform the necessary computations on Excel, insert the NORMINV function
and enter the three inputs 0.90, 1020, and 153. Hit ENTER and you should see the value
of 1216.077. So, she needs to earn a score of at least 1216.
Figure 6.9 Using Excel’s NORMINV function to find the 90th percentile in a normal distribution.
96 Chapter 6 Continuous Random Variables and the Normal Distribution
Exercises
6.1 Use Excel’s NORMSDIST function to find the area under the standard normal curve
a) between z = 0 and z = –1.85
b) between z = 1.15 and z = 2.37
c) from z = –2.88 to z = –1.53
d) from z = –1.67 to z = 2.44
e) to the right of z = 1.56
f) to the left of z = –1.97.
6.2 Use Excel’s STANDARDIZE function in order to find the z-score for each of the
following x-values for a normal distribution with µ = 30 and σ = 5.
a) x = 39 b) x = 17
c) x = 22
d) x = 42
6.3 Use Excel’s NORMDIST function to find the following areas under a normal
distribution curve with µ = 20 and σ = 4.
a) Area between x = 20 and x = 27.
b) Area from x = 23 to x = 25.
c) Area between x = 9.5 and x = 17.
6.4 Use Excel’s NORMDIST function in order to determine the area under a normal
distribution curve with µ = 55 and σ = 7
a) to the right of x = 58
b) to the left of x = 43.
6.5 The U.S. Bureau of Labor Statistics conducts periodic surveys to collect information
on the labor market. According to one such survey, the average earnings of workers
in retail trade were $10 per hour in August 2002 (Bureau of Labor Statistics News,
September 18,2002). Assume that the hourly earnings of such workers in August
2002 had a normal distrbution with a mean of $10 and a standard deviation of $1.10.
Use Excel’s NORMDIST function to find the probability that the hourly earnings of
a randomly selected retail trade worker in August 2002 were
a) more than $12
b) between $8.50 and $10.80.
6.6 The transmission on a model of a specific car has a warranty for 40,000 miles. It is
known that the life of such a transmission has a normal distribution with a mean of
72,000 miles and a standard deviation of 12,000 miles. Use Excel’s NORMDIST
function to find the following.
a) What percentage of the transmissions will fail before the end of the warranty
period?
b) What percentage of the transmissions will be good for more than 100,000 miles?
6.7 Use Excel’s NORMSINV function in order to determine the value of z so that the
area under the standard normal curve
a) in the right tail is .0500
b) in the left tail is .0250
c) in the left tail is .0100
d) in the right tail is .0050
6.8 Fast Auto Service provides oil and lube service for cars. It is known that the mean
time taken for oil and lube service at this garage is 15 minutes per car and the
Excel Manual 97
standard devation is 2.4 minutes. The management wants to promote the business by
guaranteeing a maximum waiting time for its customers. If a customer’s car is not
serviced within that period, the customer will receive a 50% discount on the charges.
The company wants to limit this discount to at most 5% of the customers. Assume
that the times taken for oil and lube service for all cars have a normal distribution.
Use Excel’s NORMINV function to find what the maximum guaranteed waiting time
should be.