Chapter 5
© 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a
publicly accessible website, in whole or in part.
BUSINESS ANALYTICS:
DATA ANALYSIS AND
DECISION MAKING
Normal, Binomial, Poisson, and Exponential Distributions
Introduction

Several specific distributions commonly occur in a
variety of business situations:
• Normal distribution: a continuous distribution characterized
by a symmetric bell-shaped curve
• Binomial distribution: a discrete distribution that is relevant
when we sample from a population with only two types of
members or when we perform a series of independent,
identical experiments with only two possible outcomes
• Poisson distribution: a discrete distribution that describes
the number of events in a given period of time
• Exponential distribution: a continuous distribution that
describes the times between events

The Normal Distribution

The single most important distribution in statistics is the
normal distribution.
• It is a continuous distribution and is the basis of the familiar
symmetric bell-shaped curve.
• Any particular normal distribution is specified by its mean
and standard deviation.
  • Changing the mean shifts the normal curve to the right or left.
  • Changing the standard deviation makes the curve more or
  less spread out.
• There are really many normal distributions, not just a single
one.
  • The normal distribution is a two-parameter family, where the two
  parameters are the mean and standard deviation.
Continuous Distributions and
Density Functions (slide 1 of 2)

For continuous distributions, instead of a list of
possible values, there is a continuum of possible
values, such as all values between 0 and 100 or all
values greater than 0.
• Instead of assigning probabilities to each individual
value in the continuum, the total probability of 1 is
spread over this continuum.
• The key to this spreading is called a density function,
which acts like a histogram.
• The higher the value of the density function, the more likely
this region of the continuum is.
Continuous Distributions and
Density Functions (slide 2 of 2)

A density function, usually denoted by f(x), specifies the
probability distribution of a continuous random variable X.
• The higher f(x) is, the more likely values near x are.
• The total area between the graph of f(x) and the horizontal axis,
which represents the total probability, is equal to 1.
• f(x) is nonnegative for all possible values of X.
• Probabilities are found from a density function as areas under the
curve.
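The "probability as area" idea can be checked numerically. The sketch below is only an illustration: the density used (a normal with mean 50 and standard deviation 10) and the helper name `area_under_density` are assumptions, not part of the slides. It approximates areas under f(x) with the trapezoid rule.

```python
from statistics import NormalDist

def area_under_density(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width

# Hypothetical example density: normal with mean 50, standard deviation 10.
dist = NormalDist(mu=50, sigma=10)

# Nearly all of the probability lies between 0 and 100 (five SDs each way).
total_area = area_under_density(dist.pdf, 0, 100)
# P(40 <= X <= 60) is the area under the curve between 40 and 60.
prob_40_to_60 = area_under_density(dist.pdf, 40, 60)

print(round(total_area, 4), round(prob_40_to_60, 4))
```

The second area should agree closely with `dist.cdf(60) - dist.cdf(40)`, which is how such a probability would normally be computed.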
The Normal Density

The normal distribution is a continuous distribution with
possible values ranging over the entire number line, from
“minus infinity” to “plus infinity.”
• Only a relatively small range has much chance of occurring.
• The normal density function is actually quite complex, in spite of
its “nice” bell-shaped appearance.
The formula for the normal density function, where μ and σ
are the mean and standard deviation, is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)} \quad \text{for } -\infty < x < +\infty$$
Standardizing: Z-Values

The standard normal distribution has mean 0 and
standard deviation 1, so it is denoted by N(0,1).
• It is also referred to as the Z distribution.
To standardize a variable, subtract its mean and
then divide the difference by the standard
deviation:
$$Z = \frac{X - \mu}{\sigma}$$
• A Z-value is the number of standard deviations to the
right or left of the mean.
• If Z is positive, the original value is to the right of the mean.
• If Z is negative, the original value is to the left of the mean.
Example 5.1:
Standardizing.xlsx




Objective: To use Excel® to standardize annual returns of various
mutual funds.
Solution: The data set includes the annual returns of 30 mutual funds.
Calculate the mean and standard deviation of the annual returns,
and then use the standardizing formula to calculate the Z-value
corresponding to each return.
Alternatively, calculate the Z-values directly, using Excel’s STANDARDIZE
function.
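For readers working outside Excel, the same standardizing calculation might look like the following Python sketch. The returns listed are made-up values, not the 30 funds in Standardizing.xlsx, and `standardize` is a hypothetical helper that mirrors what Excel's STANDARDIZE(x, mean, standard_dev) computes.

```python
from statistics import mean, stdev

def standardize(x, mu, sigma):
    """Return the Z-value of x: the number of standard deviations from mu."""
    return (x - mu) / sigma

# Hypothetical annual returns (NOT the data in Standardizing.xlsx).
returns = [0.12, 0.05, -0.03, 0.08, 0.20, 0.01]

mu = mean(returns)
sigma = stdev(returns)  # sample standard deviation

z_values = [standardize(r, mu, sigma) for r in returns]
for r, z in zip(returns, z_values):
    print(f"return {r:6.2f} -> Z = {z:6.2f}")
```

By construction, the standardized values have mean 0 and standard deviation 1, which is a quick way to check the calculation.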
Normal Tables and Z-Values

A common use for Z-values and the standard normal
distribution is in calculating probabilities and percentiles by
the traditional method.

This method is based on a table of the standard normal
distribution found in many statistics textbooks. An example of such
a table is given below.


The body of the table contains probabilities.
The left and top margins contain possible values.
Normal Calculations in Excel

Two types of calculations are typically made with normal
distributions: finding probabilities and finding percentiles.

• The functions used for normal probability calculations are
NORMDIST and NORMSDIST.
  • The main difference between these is that the one with the
  “S” (for standardized) applies only to N(0, 1) calculations,
  whereas NORMDIST applies to any normal distribution.
• Percentile calculations that take a probability and return a
value are often called inverse calculations.
  • The Excel functions for these are named NORMINV and
  NORMSINV.
  • Again, the “S” in the second of these indicates that it
  applies to the standard normal distribution.

Example 5.2:
Normal Calculations.xlsx (slide 1 of 2)

Objective: To calculate probabilities and percentiles for
standard normal and general normal distributions in
Excel.
Solution: For “less than” probabilities, use NORMDIST
or NORMSDIST directly.
For “greater than” probabilities, subtract the
NORMDIST or NORMSDIST function from 1.
For “between” probabilities, subtract the NORMDIST or
NORMSDIST value at the lower limit from the value at
the upper limit.
For percentile calculations, use the NORMINV or
NORMSINV function with the specified probability as
the first argument.
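The four kinds of calculation can be sketched in Python with the standard library's `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play roughly the roles of NORMDIST and NORMINV. The distribution N(75, 8) and the cutoff values are assumptions chosen only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()             # N(0, 1): the "S" functions in Excel
general = NormalDist(mu=75, sigma=8)  # hypothetical general normal

less_than = general.cdf(70)                  # P(X < 70)
greater_than = 1 - general.cdf(73)           # P(X > 73)
between = general.cdf(80) - general.cdf(70)  # P(70 < X < 80)

percentile_10 = general.inv_cdf(0.10)  # value with probability 0.10 to its left
z_95 = std_normal.inv_cdf(0.95)        # 95th percentile of N(0, 1)

print(less_than, greater_than, between, percentile_10, z_95)
```

A useful consistency check is that `general.cdf(percentile_10)` returns the probability that was passed to `inv_cdf`.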
Empirical Rules Revisited

Three empirical rules apply to many data sets:
• About 68% of the data fall within one standard
deviation of the mean.
• About 95% fall within two standard deviations of the
mean.
• Almost all fall within three standard deviations of the
mean.

For these rules to hold with real data, the
distribution of the data must be at least
approximately symmetric and bell-shaped.
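For an exactly normal distribution, these percentages can be computed rather than estimated. The short sketch below uses Python's `statistics.NormalDist` (an assumption of convenience; the slides themselves use Excel) to find the probability within k standard deviations of the mean.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, N(0, 1)

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} SD: {prob:.4f}")
```

The three values come out at roughly 0.68, 0.95, and 0.997, matching the empirical rules.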
Weighted Sums of Normal
Random Variables

One very attractive property of the normal distribution
is that if you create a weighted sum of normally
distributed random variables, the weighted sum is also
normally distributed.
• This is true even if the random variables are not
independent.
• If $X_1$ through $X_n$ are n independent and normally distributed
random variables with common mean μ and common
standard deviation σ, then the sum $X_1 + \dots + X_n$ is normally
distributed with mean $n\mu$, variance $n\sigma^2$, and standard
deviation $\sqrt{n}\,\sigma$.
• If $a_1$ through $a_n$ are any constants and each $X_i$ has mean $\mu_i$
and standard deviation $\sigma_i$, then the weighted sum
$a_1X_1 + \dots + a_nX_n$ is normally distributed with mean
$a_1\mu_1 + \dots + a_n\mu_n$ and, when the $X_i$ are independent,
variance $a_1^2\sigma_1^2 + \dots + a_n^2\sigma_n^2$.
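The mean and variance rules for a weighted sum are easy to apply in code. The sketch below assumes the variables are independent (the variance formula depends on that), and the function name and the numbers are hypothetical illustrations.

```python
from math import sqrt
from statistics import NormalDist

def weighted_sum_distribution(weights, means, sds):
    """Distribution of a1*X1 + ... + an*Xn for INDEPENDENT normal Xi."""
    mean = sum(a * m for a, m in zip(weights, means))
    variance = sum((a * s) ** 2 for a, s in zip(weights, sds))
    return NormalDist(mu=mean, sigma=sqrt(variance))

# Plain sum of four independent N(10, 3) variables (all weights equal to 1):
# mean n*mu = 40 and standard deviation sqrt(n)*sigma = 6.
total = weighted_sum_distribution([1, 1, 1, 1], [10] * 4, [3] * 4)

# Hypothetical two-asset portfolio with weights 0.6 and 0.4.
portfolio = weighted_sum_distribution([0.6, 0.4], [0.08, 0.05], [0.20, 0.10])

print(total.mean, total.stdev)
print(portfolio.mean, portfolio.stdev)
```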

Example 5.3:
Personnel Decisions.xlsx




Objective: To determine test scores that can be used to accept or reject job
applicants at ZTel.
Solution: Scores of all applicants are approximately normally distributed
with mean 525 and standard deviation 55.
Calculate the percentage of applicants who are automatic accepts or
rejects, given the current standards of 600 for automatic accept and 425
for automatic reject.
Find new cutoff values that reject 10% and accept 15% of applicants.
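The calculations in this example can be reproduced with a few lines of Python. This is a sketch using `statistics.NormalDist` in place of the Excel functions; the variable names are my own.

```python
from statistics import NormalDist

scores = NormalDist(mu=525, sigma=55)  # applicant test scores

# Current standards: automatic accept at 600 or above, reject at 425 or below.
pct_accept = 1 - scores.cdf(600)
pct_reject = scores.cdf(425)

# New cutoffs so that 10% are rejected and 15% are accepted.
reject_cutoff = scores.inv_cdf(0.10)      # 10% of scores fall below this
accept_cutoff = scores.inv_cdf(1 - 0.15)  # 15% of scores fall above this

print(f"automatic accepts: {pct_accept:.1%}, automatic rejects: {pct_reject:.1%}")
print(f"new reject cutoff: {reject_cutoff:.0f}, new accept cutoff: {accept_cutoff:.0f}")
```

Under these assumptions, roughly 8.6% of applicants are automatic accepts and 3.5% are automatic rejects, and the new cutoffs land near 455 and 582.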
Example 5.4:
Paper Machine Settings.xlsx




Objective: To determine the machine settings that result in paper of
acceptable quality at PaperStock Company.
Solution: A given roll of paper must be rejected if its actual fiber content is
less than 19.8 pounds or greater than 20.3 pounds.
The variability in fiber content is 0.10 pound when the process is “good,”
but increases to 0.15 pound when the machine goes “bad.”
Calculate the probability that a given roll is rejected, for a setting of μ =
20, when the machine is “good” and when it is “bad.”
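A sketch of the rejection-probability calculation follows. It assumes that the "variability" quoted above is the standard deviation of fiber content and that fiber content is normally distributed; the function name is hypothetical.

```python
from statistics import NormalDist

LOWER, UPPER = 19.8, 20.3  # acceptable fiber content, in pounds

def prob_rejected(mu, sigma):
    """P(fiber content < 19.8 or > 20.3) for normal fiber content."""
    fiber = NormalDist(mu=mu, sigma=sigma)
    return fiber.cdf(LOWER) + (1 - fiber.cdf(UPPER))

# Assumption: 0.10 and 0.15 are standard deviations of fiber content.
p_good = prob_rejected(mu=20, sigma=0.10)
p_bad = prob_rejected(mu=20, sigma=0.15)

print(f"good machine: {p_good:.4f}, bad machine: {p_bad:.4f}")
```

With a setting of μ = 20, the rejection probability rises from about 2.4% to about 11.4% when the machine goes "bad."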
Example 5.5:
Tax on Stock Returns.xlsx



Objective: To determine the after-tax profit Howard Davis can be
90% certain of earning.
Solution: Howard is in the 33% tax bracket, so his after-tax profit is
67% of his before-tax profit. He invests $10,000 in a certain stock,
whose annual return is normally distributed with mean 5% and
standard deviation 14%.
Calculate the dollar amount such that Howard’s after-tax profit is
90% certain to be less than this amount; that is, calculate the 90th
percentile of his after-tax profit.
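Because after-tax profit is a constant multiple of the normally distributed return, it is itself normally distributed, and its 90th percentile can be computed directly. The sketch below uses Python rather than Excel, with variable names of my own choosing.

```python
from statistics import NormalDist

investment = 10_000
keep_rate = 0.67                      # 33% tax bracket
return_mean, return_sd = 0.05, 0.14   # annual return parameters

# After-tax profit = keep_rate * investment * return, a scaled normal variable.
scale = keep_rate * investment
profit = NormalDist(mu=scale * return_mean, sigma=scale * return_sd)

percentile_90 = profit.inv_cdf(0.90)
print(f"mean {profit.mean:.0f}, sd {profit.stdev:.0f}, "
      f"90th percentile {percentile_90:.0f}")
```

The after-tax profit has mean $335 and standard deviation $938, so its 90th percentile is a little over $1,500.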
Example 5.6:
Oven Demand Simulation.xlsx (slide 1 of 3)



Objective: To construct and analyze a spreadsheet model
for microwave oven demand over the next 12 years using
Excel’s NORMINV function, and to show how models using
the normal distribution can lead to nonsensical outcomes
unless they are modified appropriately.
Solution: Using historical data, the company assumes that
demand in year 1 is normally distributed with mean 5000
and standard deviation 1500.
It also assumes that demand in each subsequent year is
normally distributed with mean equal to the actual
demand from the previous year and standard deviation
1500.
Example 5.6:
Oven Demand Simulation.xlsx (slide 2 of 3)

Using this model may lead to nonsensical results as shown
below:
Example 5.6:
Oven Demand Simulation.xlsx (slide 3 of 3)


One way to modify the model is to let the standard deviation and
mean move together. That is, if the mean is low, then the standard
deviation will also be low.
To be even safer, it is possible to truncate the demand distribution at
some nonnegative value such as 250, as shown below.
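The behavior of this model can be explored with a small simulation. The sketch below is not the workbook's implementation: it uses Python's `random.gauss` where the spreadsheet uses NORMINV, it treats "truncating" as raising any simulated demand below the floor up to the floor, and it leaves out the idea of tying the standard deviation to the mean because the slides give no formula for it.

```python
import random

def simulate_demand(years=12, first_mean=5000, sd=1500, floor=None, seed=None):
    """Simulate yearly demand; each year's mean is the previous year's demand.

    If floor is given, any simulated demand below it is raised to floor.
    """
    rng = random.Random(seed)
    demands = []
    mean = first_mean
    for _ in range(years):
        demand = rng.gauss(mean, sd)
        if floor is not None:
            demand = max(demand, floor)
        demands.append(demand)
        mean = demand  # next year's mean is this year's actual demand
    return demands

# Share of 2000 simulated 12-year paths containing a negative demand.
runs = 2000
negative_share = sum(
    min(simulate_demand(seed=s)) < 0 for s in range(runs)
) / runs

# The same paths with demand truncated at 250 never go below 250.
lowest_truncated = min(min(simulate_demand(floor=250, seed=s)) for s in range(runs))

print(negative_share, lowest_truncated)
```

Without the floor, a noticeable fraction of simulated paths include a negative demand, which is the nonsensical outcome the example warns about.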
The Binomial Distribution

The binomial distribution is a discrete distribution that can
occur in two situations:
• When sampling from a population with only two types of
members (males and females, for example)
• When performing a sequence of identical experiments, each of
which has only two possible outcomes
Consider a situation where there are n independent,
identical trials, where the probability of a success on each
trial is p and the probability of a failure is 1 – p.
• Define X to be the random number of successes in the n trials.
• Then X has a binomial distribution with parameters n and p.
In Excel, calculate binomial probabilities with the
BINOMDIST function.
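Outside Excel, binomial probabilities can be computed from the standard formula P(X = k) = C(n, k) p^k (1 − p)^(n − k). The helper names below are hypothetical; they correspond roughly to BINOMDIST with its cumulative argument set to FALSE and TRUE, and n = 10, p = 0.3 are made-up values.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X binomial with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X binomial with parameters n and p."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

# Hypothetical example: 10 trials with success probability 0.3.
n, p = 10, 0.3
print(binom_pmf(3, n, p))      # probability of exactly 3 successes
print(binom_cdf(3, n, p))      # probability of at most 3 successes
print(1 - binom_cdf(3, n, p))  # probability of more than 3 successes
```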
Example 5.7:
Binomial Calculations.xlsx




Objective: To use Excel’s BINOMDIST and CRITBINOM functions for calculating
binomial probabilities and percentiles in the context of flashlight batteries.
Solution: Let X be the number of successes in 100 trials of flashlight batteries,
where a success means that the battery is still functioning after eight hours.
Find the probabilities of various events, using the BINOMDIST function, as shown in
the spreadsheet below.
Find the 95th percentile of the distribution of X, using the CRITBINOM function.
Mean and Standard Deviation of the
Binomial Distribution


It can be shown that the mean and standard deviation
of a binomial distribution with parameters n and p are
given by the following equations.
$$\mu = np \qquad \sigma = \sqrt{np(1-p)}$$
The empirical rules discussed in Chapter 2 also apply,
at least approximately, to the binomial distribution.
• There is about a 95% chance that the actual number of
successes will be within two standard deviations of the
mean.
• There is almost no chance that the number of successes will
be more than three standard deviations from the mean.

The Binomial Distribution in the
Context of Sampling

If sampling is done without replacement, each member of
the population can be sampled only once.




That is, once a person is sampled, his or her name is struck from
the list and cannot be sampled again.
If sampling is done with replacement, then it is possible,
although maybe not likely, to select a given member of the
population more than once.
Most real-world sampling is performed without replacement.
The binomial model applies only to sampling with
replacement.

However, if no more than 10% of the population is sampled, the
binomial model can be used safely even if sampling is performed
without replacement.
The Normal Approximation
to the Binomial

If you graph the binomial probabilities, you will see an
interesting phenomenon: the graph begins to look symmetric
and bell-shaped when n is fairly large and p is not too close
to 0 or 1.


The normal distribution provides a very good approximation to
the binomial under these conditions.
One practical consequence of the normal approximation to the
binomial is that the empirical rules apply very well to binomial
distributions.
Example 5.8:
Beating the Market.xlsx



Objective: To determine the probability of a mutual fund outperforming a
standard market index at least 37 out of 52 weeks.
Solution: The number of weeks in which a given fund outperforms the market
index is binomially distributed with n = 52 and p = 0.5. The probability of
at least 37 such weeks is quite small (0.00159).
Now let Y be the number of the 400 best mutual funds that beat the market
at least 37 of 52 weeks. Y is also binomially distributed, with parameters n
= 400 and p = 0.00159. The probability that at least one of the 400 funds
does so, P(Y ≥ 1), is nearly 0.5.
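Both probabilities can be checked with a short calculation. The sketch below reads "the resulting probability" as the chance that at least one of the 400 funds beats the market in at least 37 weeks, which is an interpretation on my part; `binom_tail` is a hypothetical helper name.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X binomial with parameters n and p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# One fund: at least 37 winning weeks out of 52, each with probability 0.5.
p_one_fund = binom_tail(37, 52, 0.5)

# 400 funds: P(Y >= 1) = 1 - P(Y = 0), with success probability p_one_fund.
p_at_least_one = 1 - (1 - p_one_fund) ** 400

print(f"{p_one_fund:.5f}  {p_at_least_one:.3f}")
```

The point of the example is that an event that is very unlikely for any single fund becomes close to a coin flip when 400 funds are watched at once.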
Example 5.9:
Supermarket Spending.xlsx




Objective: To use the normal and binomial distributions to calculate the typical
number of customers who spend at least $100 per day and the probability that at
least 30% of all 500 daily customers spend at least $100.
Solution: Historical data indicate that the amount spent per customer is normally
distributed with mean $85 and standard deviation $30.
If 500 customers shop in a given day, calculate the mean and standard deviation of
the number who spend at least $100.
Then calculate the probability that at least 30% of the 500 customers spend at
least $100. This is the probability that a binomially distributed random variable,
with n = 500 and p = 0.309, is at least 150.
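The two-stage logic (a normal calculation to get p, then a binomial calculation on the count) can be sketched as follows. Python stands in for the Excel functions here, and the variable names are my own.

```python
from math import comb, sqrt
from statistics import NormalDist

# Stage 1: probability that a single customer spends at least $100.
spending = NormalDist(mu=85, sigma=30)
p = 1 - spending.cdf(100)

# Stage 2: the number of such customers among 500 is binomial(500, p).
n = 500
mean_count = n * p
sd_count = sqrt(n * p * (1 - p))

# P(at least 30% of 500 customers) = P(count >= 150).
prob_at_least_150 = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(150, n + 1)
)

print(f"p = {p:.3f}, mean = {mean_count:.1f}, sd = {sd_count:.2f}")
print(f"P(count >= 150) = {prob_at_least_150:.3f}")
```

Since 150 is slightly below the mean count of about 154, the probability of reaching it comes out somewhat above one half.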
Example 5.10:
Airline Overbooking.xlsx (slide 1 of 2)

Objective: To assess the benefits and drawbacks of
airline overbooking.
Solution: Assume that the no-show rate is 10%; that is,
each ticketed passenger shows up with probability
0.90.
For a flight with 200 seats, calculate the probability
that more than 205 passengers show up; that more than
200 passengers show up; that at least 195 seats are
filled; and that at least 190 seats are filled.
Use the BINOMDIST function and a data table to
determine the probabilities.
Example 5.10:
Airline Overbooking.xlsx (slide 2 of 2)

To see how sensitive these probabilities are to the
number of tickets issued, create a one-way data table,
as shown at the bottom of the spreadsheet below.
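A rough Python counterpart of the overbooking calculation and its one-way data table is sketched below. The slides do not state how many tickets are issued, so the ticket counts used here (215 and the range in the loop) are purely hypothetical, as are the function names.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X binomial(n, p); returns 0 when k > n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def overbooking_probs(tickets, show_prob=0.90):
    """Probabilities of interest for a 200-seat flight."""
    return {
        "more_than_205_show": prob_at_least(206, tickets, show_prob),
        "more_than_200_show": prob_at_least(201, tickets, show_prob),
        "at_least_195_filled": prob_at_least(195, tickets, show_prob),
        "at_least_190_filled": prob_at_least(190, tickets, show_prob),
    }

# One-way "data table": vary the (hypothetical) number of tickets issued.
for tickets in range(206, 234, 3):
    row = overbooking_probs(tickets)
    print(tickets, "  ".join(f"{v:.3f}" for v in row.values()))
```

Reading down the rows shows the tradeoff: issuing more tickets raises the chance of a nearly full flight but also the chance that more than 200 passengers show up.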
Example 5.11:
Election Returns.xlsx



Objective: To use a binomial model to determine whether early returns
reflect the eventual winner of an election between two candidates.
Solution: Suppose that a small percentage of the votes have been counted
and the Republican is currently ahead 540 to 460. On what basis can the
networks declare the Republican the winner, if there are millions of voters?
Use a binomial model to see how unlikely the event “at least 540 out of
1000” is, assuming that the Democrat will be the eventual winner.
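The tail probability in question can be computed directly. The sketch below treats the 1000 counted votes as independent draws and uses p = 0.5 for the Republican's share of all voters, the borderline value most favorable to the Republican that is still consistent with the Democrat not losing; the choice of p is my assumption, not a value given in the slides.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X binomial with parameters n and p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumed borderline case: exactly half of all voters favor the Republican.
p_republican = 0.5
prob_540_or_more = binom_tail(540, 1000, p_republican)

print(f"P(at least 540 of 1000 | p = {p_republican}) = {prob_540_or_more:.4f}")
```

A probability well under 1% suggests the observed lead would be very surprising if the Democrat were really going to win, and any smaller value of p makes it even less likely.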
Example 5.12:
Basketball Simulation.xlsx





Objective: To formulate a nonbinomial model of basketball shooting, and to
use it to find the probability of a “450 shooter” making at least 13 out of
25 shots.
Solution: Assume the shooter makes 45% of his shots in the long run.
Use simulation to create a model that implies that the shooter gets better
the more shots he makes and worse the more he misses.
Consider his nth shot. If he has made his last k shots, assume the
probability of making shot n is $0.45 + kd_1$.
If he has missed his last k shots, assume the probability of making shot n
is $0.45 - kd_2$.
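A minimal simulation of this streak model might look like the following. The slides do not give values for d1 and d2, so the defaults of 0.03 are hypothetical, as are the function names; the shooting probability is also clipped to stay between 0 and 1, which the slides do not mention.

```python
import random

def simulate_made_shots(rng, shots=25, base=0.45, d1=0.03, d2=0.03):
    """Simulate one set of shots; return the number made."""
    made = 0
    streak = 0  # > 0: consecutive makes so far; < 0: consecutive misses
    for _ in range(shots):
        if streak >= 0:
            p = base + streak * d1
        else:
            p = base - (-streak) * d2
        p = min(1.0, max(0.0, p))  # keep the probability valid
        if rng.random() < p:
            made += 1
            streak = streak + 1 if streak > 0 else 1
        else:
            streak = streak - 1 if streak < 0 else -1
    return made

def estimate_prob_at_least(k=13, reps=5000, seed=1, **model):
    """Estimate P(at least k made) by repeated simulation."""
    rng = random.Random(seed)
    hits = sum(simulate_made_shots(rng, **model) >= k for _ in range(reps))
    return hits / reps

streaky = estimate_prob_at_least()                      # hypothetical d1 = d2 = 0.03
independent = estimate_prob_at_least(d1=0.0, d2=0.0)    # reduces to the binomial model

print(streaky, independent)
```

Setting d1 = d2 = 0 recovers ordinary binomial shooting, which gives a baseline for judging how much the streak effect changes the answer.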
The Poisson and Exponential Distributions


In most statistical applications, the Poisson and
exponential distributions play a much less important
role than the normal and binomial distributions.
However, in many applied management science
models, the Poisson and exponential distributions
are key distributions.
• For example, much of the study of probabilistic
inventory models, queuing models, and reliability
models relies heavily on these two distributions.
The Poisson Distribution
(slide 1 of 3)

The Poisson distribution is a discrete distribution. It
usually applies to the number of events occurring within
a specified period of time or space.
• Its possible values are all of the nonnegative integers: 0, 1,
2, and so on, with no upper limit.
• Even though there is an infinite number of possible values,
this causes no real problems because the probabilities of all
sufficiently large values are essentially 0.
The Poisson distribution is characterized by a single
parameter, usually labeled λ (Greek lambda), which
must be positive.
• It is both the mean and the variance of the Poisson
distribution.
• It is often called a rate: arrivals per hour, for example.

The Poisson Distribution
(slide 2 of 3)

All Poisson distributions have the same basic shape as in the
figure below.

That is, they first increase and then decrease.
The Poisson Distribution
(slide 3 of 3)

Typical examples of the Poisson distribution:
• A bank manager is studying the arrival pattern to the
bank. The events are customer arrivals, the number of
arrivals in an hour is Poisson distributed, and λ
represents the expected number of arrivals per hour.
• A retailer is interested in the number of customers who
order a particular product in a week. Then the events
are customer orders for the product, the number of
customer orders in a week is Poisson distributed, and λ
is the expected number of orders per week.

In Excel, calculate Poisson probabilities with the
POISSON function.
Example 5.13:
Poisson Demand Distribution.xlsx (slide 1 of 2)




Objective: To model the probability distribution of
monthly demand for plasma screen TVs with a
particular Poisson distribution.
Solution: Because the histogram of demands from
previous months resembles a Poisson distribution, try
modeling the monthly demand with a Poisson
distribution.
The historical average demand per month is about
17, so let the mean demand per month λ = 17.
Now test the Poisson model by calculating the
probabilities of various events.
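Poisson probabilities for λ = 17 can be computed from the standard formula P(X = k) = e^(−λ) λ^k / k!. The sketch below is a Python stand-in for Excel's POISSON function; the helper names and the particular events evaluated are assumptions, since the slides do not list which events were checked.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X Poisson with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X Poisson with mean lam."""
    return sum(poisson_pmf(i, lam) for i in range(0, k + 1))

lam = 17.0  # mean monthly demand from the example

# Hypothetical events of interest:
p_exactly_17 = poisson_pmf(17, lam)
p_at_most_20 = poisson_cdf(20, lam)
p_at_least_25 = 1 - poisson_cdf(24, lam)
p_10_to_15 = poisson_cdf(15, lam) - poisson_cdf(9, lam)

print(p_exactly_17, p_at_most_20, p_at_least_25, p_10_to_15)
```

Comparing probabilities like these with the observed fraction of months in each range is one way to judge whether the Poisson model fits the historical demand.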
The Exponential Distribution
(slide 1 of 2)

The most common probability distribution used to
model the times between customer arrivals, often
called interarrival times, is the exponential
distribution.

In general, the continuous random variable X has an
exponential distribution with parameter λ (with λ > 0) if the
density function of X has the form:
$$f(x) = \lambda e^{-\lambda x} \quad \text{for } x > 0$$
• The mean and standard deviation of this distribution are
both equal to the reciprocal of the parameter λ.
• For any exponential distribution, the probability to the left
of a given value x > 0 can be calculated with Excel’s
EXPONDIST function.
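The probability to the left of x has the closed form P(X ≤ x) = 1 − e^(−λx), which is easy to code. In the sketch below the helper names and the rate λ = 0.5 arrivals per minute are hypothetical; `expon_cdf` corresponds roughly to EXPONDIST with its cumulative argument set to TRUE.

```python
from math import exp

def expon_pdf(x, lam):
    """Exponential density with parameter lam (zero for negative x)."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

def expon_cdf(x, lam):
    """P(X <= x) for X exponential with parameter lam."""
    return 1 - exp(-lam * x) if x > 0 else 0.0

# Hypothetical rate: 0.5 arrivals per minute, so the mean time is 2 minutes.
lam = 0.5
mean_time = 1 / lam

p_within_1_min = expon_cdf(1, lam)            # next arrival within 1 minute
p_within_mean = expon_cdf(mean_time, lam)     # within the mean time
p_more_than_5_min = 1 - expon_cdf(5, lam)     # wait longer than 5 minutes

print(p_within_1_min, p_within_mean, p_more_than_5_min)
```

Note that the probability of an interarrival time below the mean is about 0.63 rather than 0.5, a reflection of how skewed the exponential distribution is.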

The Exponential Distribution
(slide 2 of 2)


The exponential density function has the shape
shown below.
Because this density function decreases continuously
from left to right, its most likely value is x = 0.