Chapter 5:
Probability Distributions
Hildebrand, Ott and Gray
Basic Statistical Ideas for Managers
Second Edition
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Learning Objectives for Ch. 5
• Understanding the counting techniques needed
for sequences and combinations.
• Understanding that a binomial random variable
counts the number of successes in a fixed
number of trials with each trial being a success or
failure.
• Assumptions needed to use the binomial.
• Understanding that a Poisson random variable
counts the number of occurrences of an event in a
unit of time, area or volume.
• Assumptions needed to use the Poisson.
Learning Objectives for Ch. 5
• Understanding that a normal random variable
measures a characteristic of interest and has a
bell-shaped distribution.
• Learning how to calculate probabilities for
binomial, Poisson and normal random variables.
• Understanding how to use a normal probability
plot to determine if data is from a normal
distribution.
Section 5.1
Counting Possible Outcomes
5.1 Counting Possible Outcomes
• Under the classical interpretation of probability:
P(Event) = (Number of favorable outcomes) / (Total number of outcomes)
• We need ways to count the number of outcomes.
5.1 Counting Possible Outcomes
• Preliminary Concept – Factorials
• The factorial symbol is “!”
• Definition of n!
n! = n (n - 1) (n - 2)…1
Example:
3! = (3)(2)(1) = 6
• By definition, 0! = 1.
5.1 Counting Possible Outcomes
• One consideration in counting techniques
• Order matters ⇒ sequences
• Order doesn’t matter ⇒ subsets
Example:
Consider the letters a and b.
If order matters, there are 2 sequences:
(a,b) and (b,a)
If order does not matter, there is only 1 subset:
{a,b}
5.1 Counting Possible Outcomes
• Number of sequences
• Rule: The number of sequences of k objects that
can be formed from a set of r distinct
objects, denoted rPk, is:
rPk = (r) (r - 1)…(r – k + 1)
Example: The number of sequences of 2 letters
formed from the 4 letters a, b, c, d, is:
(4) (3) = 12
The sequences are:
(a,b) (a,c) (a,d) (b,c) (b,d) (c,d)
(b,a) (c,a) (d,a) (c,b) (d,b) (d,c)
5.1 Counting Possible Outcomes
• Number of subsets or combinations
• Rule: The number of subsets of k objects
that can be formed from a set of r distinct
objects, denoted rCk, is:
rCk = r! / [k! (r – k)!]
• Notation: Use rCk or the binomial coefficient symbol (r over k).
5.1 Counting Possible Outcomes
Example: The number of subsets of 2 letters formed
from the 4 letters a, b, c, d is:
4C2 = 4! / [2! (4 - 2)!] = 6
The subsets are:
{a,b} {a,c} {a,d} {b,c} {b,d} {c,d}
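These counts are easy to verify with software. Below is a minimal Python sketch (an illustration only; the text itself works by hand or with Minitab/Excel):

from math import factorial, perm, comb

# Number of sequences (order matters): rPk = r(r-1)...(r-k+1)
print(perm(4, 2))    # 12 sequences of 2 letters chosen from {a, b, c, d}

# Number of subsets (order doesn't matter): rCk = r!/(k!(r-k)!)
print(comb(4, 2))    # 6 subsets of 2 letters chosen from {a, b, c, d}

# The same results written directly from the factorial definitions
print(factorial(4) // factorial(4 - 2))                   # 12
print(factorial(4) // (factorial(2) * factorial(4 - 2)))  # 6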
5.1 Counting Possible Outcomes
Exercise 5.67:
Several states now have a Lotto game. A player chooses
6 distinct integers in the range 1 to 40. If exactly those 6
numbers are selected as the winning numbers, the player
receives a very large prize. What is the probability that a
particular set of 6 numbers will be drawn? You may wish
to think of the 6 numbers drawn as the “success” numbers.
First approach: Order matters (even though it doesn’t)
Total number of outcomes = 40P6 = (40)(39)(38)(37)(36)(35)
Number of favorable outcomes = 6P6 = (6)(5)(4)(3)(2)(1) = 6!
P(Winning) = 6!/ [(40)…(35)] = .00000026052657
5.1 Counting Possible Outcomes
• Another perspective of the first approach
P(Winning) = 6! / [(40) ··· (35)]
= [(6)(5)(4)(3)(2)(1)] / [(40)(39)(38)(37)(36)(35)]
= (6/40)(5/39) ··· (1/35)
= P(W1) P(W2/W1) ··· P(W6/W1 and ··· W5)
where Wi ≡ {The ith number is a winning number}
5.1 Counting Possible Outcomes
Second approach: Order doesn’t matter
(and it really doesn’t)
Total number of outcomes = 40C6 = 40! / (6! 34!)
Number of favorable outcomes = 6C6 · 34C0 = [6! / (6! 0!)] · [34! / (0! 34!)] = 1
P(Winning) = 1 / 40C6 = 6!/[(40)…(35)] = .00000026052657
Moral: You can’t lose if you don’t play!
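Both approaches can be checked numerically; the following short Python sketch (an illustration, not part of the text) reproduces the probability:

from math import comb, factorial, perm

# Second approach: one favorable subset out of 40C6 equally likely subsets
print(1 / comb(40, 6))              # 2.6052657...e-07

# First approach: 6! favorable orderings out of 40P6 equally likely orderings
print(factorial(6) / perm(40, 6))   # same value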
Section 5.2
The Binomial Distribution
5.2 The Binomial Distribution
• Examples of a Bernoulli Trial
1. A coin toss results in a head (H) or a tail (T).
2. A bit sent through a digital communications channel is
entered as either 0 or 1 and received either correctly or
incorrectly.
3. An audited account is either current (C) or delinquent (D).
4. A consumer is either aware (A) of a particular product or not
aware (N).
5. A flight reservation is either a show (S) or no-show (N).
5.2 The Binomial Distribution
• Features of a Bernoulli Trial:
• Only 2 possible outcomes for each trial,
characterized as:
Success (S) or Failure (F)
• π denotes P(S); (1 – π) denotes P(F).
5.2 The Binomial Distribution
• Bernoulli R.V. and Probability Distribution
Let Y = 1, if trial results in S
= 0, if trial results in F
y      PY(y)
0      1 − π
1      π
5.2 The Binomial Distribution
• Graphical representation of a Bernoulli
probability distribution
[Figure: bar chart of PY(y), with a bar of height 1 – π at y = 0 and a bar of height π at y = 1.]
The distribution is skewed when π ≠ .5
5.2 The Binomial Distribution
• E(Y) = 0(1 - π) + 1(π ) = π
• V(Y) = Σ (y - µ)² PY(y)
= [0 – π]² (1 – π) + [1 – π]² π
= π (1 – π) [π + (1 – π) ]
= π (1 – π)
5.2 The Binomial Distribution
• Examples of Binomial Random Variable:
1. Toss a coin 10 times. Let Y denote the number of heads in the 10
tosses.
2. For the next 3 bits transmitted through a digital communications
channel, let Y be the number of bits received that are in error.
3. 20 accounts are randomly selected from a population of several
thousand accounts and are audited. Let Y be the number of delinquent
accounts in the sample.
[The sampling has to be with replacement for the probability of
success to remain constant. In reality, the sampling is done without
replacement.]
4. 100 randomly selected consumers are surveyed as part of a market
research study. Let Y denote the number of these consumers who are
aware of a particular product.
5. Out of 50 flight reservations made, let Y be the number of passengers
who show.
5.2 The Binomial Distribution
• Features of a Binomial Experiment:
• There are n Bernoulli trials [each one results in
S or F].
• The probability of a success, π = P(S), remains
constant over the n trials; [P(F) = 1 - π ].
• The trials are independent.
• The binomial random variable is the total number
of successes in n trials, where the ordering is
unimportant.
5.2 The Binomial Distribution
• Binomial Probability Distribution
PY(y) = (n choose y) π^y (1 - π)^(n - y),  y = 0, 1, ..., n
• where (n choose y) = n! / [y! (n - y)!]
• The expression for PY(y) can be used to
calculate probabilities for a binomial random
variable.
• What is the basis for the expression for PY(y)?
5.2 The Binomial Distribution
Example: Y denotes the number of bits in error in the next 3 bits transmitted, where P(Error) = π.
Outcomes                       y    Probability      From PY(y)
E, E, E                        3    π^3              (3 choose 3) π^3 (1-π)^0 = π^3
E, E, O;  E, O, E;  O, E, E    2    3 π^2 (1-π)      (3 choose 2) π^2 (1-π)^1 = 3 π^2 (1-π)
E, O, O;  O, E, O;  O, O, E    1    3 π (1-π)^2      (3 choose 1) π^1 (1-π)^2 = 3 π (1-π)^2
O, O, O                        0    (1-π)^3          (3 choose 0) π^0 (1-π)^3 = (1-π)^3
(E = bit in error; O = bit received correctly.)
The Probability column is found by using the principles of Chapter 3.
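The basis for PY(y) can also be seen by brute-force enumeration of the 2^3 outcomes, which is where the binomial coefficient comes from. A minimal Python sketch (the value π = 0.1 is arbitrary, chosen only for illustration):

from itertools import product
from math import comb

pi = 0.1  # illustrative error probability; any value between 0 and 1 works

# Enumerate all 2**3 sequences of E (error) / O (ok) and accumulate probability by y
prob_by_y = {y: 0.0 for y in range(4)}
for outcome in product("EO", repeat=3):
    y = outcome.count("E")
    prob_by_y[y] += pi**y * (1 - pi)**(3 - y)

# Compare with the binomial formula PY(y) = C(3, y) * pi**y * (1 - pi)**(3 - y)
for y in range(4):
    formula = comb(3, y) * pi**y * (1 - pi)**(3 - y)
    print(y, round(prob_by_y[y], 6), round(formula, 6))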
5.2 The Binomial Distribution
• Calculation of Probabilities
• Use the binomial probability distribution formula
• Instead of actually calculating the probabilities, we can
look them up in a table. Table 1 at the end of Hildebrand,
Ott & Gray gives the probabilities for n = 2(1) 10 (2) 20,
50, 100 and π = .05(.05).50.
• We can also use software (MINITAB; EXCEL’s
BINOMDIST function)
• Two obvious cases
P[0 successes] = (n choose 0) π^0 (1 - π)^n = (1 - π)^n
P[n successes] = (n choose n) π^n (1 - π)^0 = π^n
5.2 The Binomial Distribution
• Mean and Variance of a Binomial Random Variable
E[Y] = nπ
V(Y) = σ² = nπ (1 - π)
5.2 The Binomial Distribution
• An easy way to find E(Y) and V(Y)
Y = total number of successes in n trials
= Number of successes on 1st trial
+ Number of successes on 2nd trial
+ …
+ Number of successes on nth trial
E(Y) = π + π + …. + π = nπ
V(Y) = π (1 – π) + π (1 – π)+ … + π (1 – π)
= nπ (1 – π)
5.2 The Binomial Distribution
Exercise 5.61 [Revised so that number of potential
customers is 50.]
Executives at a soft drink company wish to test a new
formulation of their chief product. The new drink is tested
in comparison to the current one. Each of 50 potential
customers is given a cup of the current formulation and a
cup of the new one. The cups are labeled H and K to
avoid bias. Each customer indicates a preference.
Assume that, in fact, the customers can't detect a
difference and are, in effect, guessing. Define Y to be the
number (out of 50) indicating preference for the new
formulation.
5.2 The Binomial Distribution
a. What probability distribution should apply to Y?
Do the assumptions underlying that distribution
seem plausible in this context?
• Each of the 50 customers is a Bernoulli trial (either
prefers new product or does not).
• If customers are guessing, the probability of preference
for new product is 0.5.
• Reasonable to assume trials are independent.
• Let Y be the number of customers who indicate a
preference for the new product.
Then Y is binomial with n = 50 and π = 0.5.
5.2 The Binomial Distribution
A graph of the probability distribution of Y follows.
[Figure: probability distribution of Y; vertical axis P(Y = y) from 0.00 to 0.12, horizontal axis y from 0 to 50.]
The graph is symmetric because π = 0.5
5.2 The Binomial Distribution
b. Find the mean and standard deviation of Y.
µy = nπ = (50)(0.5) = 25
σ² = nπ (1-π) = 12.5
σ = 3.54
5.2 The Binomial Distribution
c. (Cont’d)
Find the probability that the number of customers
preferring the new brand is within 2 standard deviations of
the mean.
P[µ – 2σ ≤ Y ≤ µ + 2σ ] = P[25 – 2(3.54) ≤ Y ≤ 25 + 2(3.54)]
= P[17.92 ≤ Y ≤ 32.08]
= P[ 18 ≤ Y ≤ 32]
= P[Y=18] + P[Y=19] + … + P[Y=32]
= .0160 + .0270 + … + .0160 (From Table 1)
= .9672
Most of the time (97%), we should observe between 18
and 32 customers indicating a preference for the new
product if, in fact, they are guessing.
5.2 The Binomial Distribution
d. (Cont’d)
In one such test, 12 people preferred the new formulation. Find the
probability that 12 or fewer would prefer the new formulation if the
customers can’t detect a difference. What, if anything, can you infer
about consumer preferences from the results of the taste test?
P(Y ≤ 12) = .0001 (from Table 1)
If the hypothesis that the people can’t detect a difference is correct,
P(Y ≤ 12) is very small [ <.05]. Since this probability is very small,
it implies the hypothesis that the people can’t detect a difference is
incorrect! Or, π ≠ .5
Why were the cups labeled H and K?
Studies have shown that people have no preference for either of
these letters, as opposed to the letters A and B.
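The Table 1 lookups in parts c and d can be reproduced with software. Here is a minimal Python sketch using scipy (the choice of scipy is an assumption; the text uses Table 1, Minitab, or Excel's BINOMDIST):

from scipy.stats import binom

n, pi = 50, 0.5

# Part c: P(18 <= Y <= 32) when customers are guessing
print(binom.cdf(32, n, pi) - binom.cdf(17, n, pi))   # about .9672

# Part d: P(Y <= 12) when customers are guessing
print(binom.cdf(12, n, pi))                          # about .0001, so guessing looks implausible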
Section 5.3
The Poisson Distribution
5.3 The Poisson Distribution
• Named for Simeon D. Poisson (1781-1840)
• Examples of a Poisson random variable
• The number of work-related injuries per month at a
manufacturing plant.
• The number of e-mail messages arriving at a personal
computer in one hour.
• The number of network errors per day on a local area
network.
• The Poisson random variable is the number of
occurrences in a given unit.
5.3 The Poisson Distribution
• Features of a Poisson Experiment
For a unit of time, area or volume
• Probability that an event occurs in a given unit is the
same for all units.
• Probability of two or more events occurring at same
time is 0.
• The occurrence of the event in one unit is independent
of the number that occur in other units.
• The expected number of occurrences in each unit
is denoted by µ.
5.3 The Poisson Distribution
• Poisson Probability Distribution
PY(y) = e^(−µ) µ^y / y!,  y = 0, 1, 2, ...
• Calculation of Probabilities
• Use formula for pY (y)
• Use Table 2 for µ = 0.1(0.1)5 and
5.5(0.5)10 and 11(1)20
• Use software
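The formula itself takes only a few lines to implement. A minimal Python sketch (the mean of 2.0 used below is arbitrary and only for illustration):

from math import exp, factorial

def poisson_pmf(y, mu):
    """PY(y) = e**(-mu) * mu**y / y!  for y = 0, 1, 2, ..."""
    return exp(-mu) * mu**y / factorial(y)

# Probabilities for y = 0, ..., 5 with an illustrative mean of 2.0 occurrences per unit
print([round(poisson_pmf(y, 2.0), 4) for y in range(6)])
# [0.1353, 0.2707, 0.2707, 0.1804, 0.0902, 0.0361]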
5.3 The Poisson Distribution
• Mean and Variance for a Poisson Random Variable
• E(Y) = µ
• Var(Y) = µ
5.3 The Poisson Distribution
Exercise 5.29:
Suppose that the number of defaults on home mortgage
loans at National Mortgage Company follows a Poisson
distribution with an average of 8.2 defaults per month.
a. Compute the probability of exactly 12 defaults at NMC
next month.
P(Y = 12) = PY(12) = e^(−8.2) (8.2)^12 / 12! = 0.0529925 (from Minitab)
5.3 The Poisson Distribution
A graph of the probability distribution of Y, the number of
defaults per month follows.
[Figure: probability distribution of Y; vertical axis P(Y = y) from 0.00 to 0.14, horizontal axis y from 0 to 90.]
The probability distribution quickly tapers off to .005 or
less for y ≥ 16.
5.3 The Poisson Distribution
b. What is the chance of at least one default next month?
P(Y ≥ 1) = 1 – P(Y = 0) = 1 - .00027
= 0.99973
c. Because of poor economic times, NMC believes that the
average number of defaults may have increased from 8.2
per month. Last month, there were 15 defaults. If the
average number of defaults has not changed from 8.2, find
P(Y ≥ 15).
P(Y ≥ 15) = 1 – P(Y ≤ 14) = 1 - .9791
= .0209
⇒ Since P(Y ≥ 15) is small, this implies µ has
changed from 8.2.
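All three answers can be checked with software. A minimal Python sketch using scipy (an assumption; the slides use Table 2 and Minitab):

from scipy.stats import poisson

mu = 8.2

print(poisson.pmf(12, mu))       # part a: P(Y = 12), about 0.0530
print(1 - poisson.pmf(0, mu))    # part b: P(Y >= 1), about 0.99973
print(1 - poisson.cdf(14, mu))   # part c: P(Y >= 15), about 0.0209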
Section 5.4
The Normal Distribution
5.4 The Normal Distribution
Continuous Random Variables in General
• Examples of continuous random variables:
• Stock market returns
• Quality characteristics of finished products
(such as net contents)
• Heights of males; heights of females
• Age at time of death
5.4 The Normal Distribution
Continuous Random Variables in General (Cont’d)
• Features of a continuous random variable:
• The possible values are uncountable.
• The probability that the random variable takes on
a specific value is 0.
• Only an interval of values has a nonzero
probability.
5.4 The Normal Distribution
Continuous Random Variables in General (Cont’d)
• The probability for an interval of values will be
shown as the area under the pdf.
[Figure: density curve fY(y) with the area between a and b shaded, representing P(a < Y < b).]
5.4 The Normal Distribution
Continuous Random Variables in General (Cont’d)
• Details:
• It doesn’t matter whether endpoints are included
in the interval:
P[a < Y < b] = P[a ≤ Y < b] = P[a < Y ≤ b]
= P[a ≤ Y ≤ b]
Why? P[Y = a] = P[Y = b] = 0.
• Data are never continuous!
5.4 The Normal Distribution
• The Standard Normal Random Variable
• The probability distribution of a standard normal
random variable Z is shown below:
[Figure: standard normal density fZ(z), a bell-shaped curve centered at 0.]
5.4 The Normal Distribution
• E(Z) = µz = 0
{The curve is symmetric around 0}
V(Z) = σz² = 1
• Other Properties:
Total area under the curve is 1.
The curve is symmetric around 0.
⇒ P(Z > 0) = 0.5
5.4 The Normal Distribution
• Determination of probabilities for a standard
normal random variable:
• Use Table 3 (area from 0 to a right-hand value z)
• Use software
5.4 The Normal Distribution
Exercise 5.30:
Suppose that Z represents a standard normal random variable.
i. Find P(Z ≤ -2.42).
P(Z ≤ -2.42) = 0.5 - P(0 ≤ Z ≤ 2.42)
= 0.5 - .4922 (from Table 3)
= 0.0078
[Figure: standard normal curve with the area to the left of -2.42 shaded.]
5.4 The Normal Distribution
g. Find P(-1.07 ≤ Z ≤ 2.33).
P(-1.07 ≤ Z ≤ 2.33) = P(-1.07 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 2.33)
= 0.3577 + 0.4901 (from Table 3)
= 0.8478
[Figure: standard normal curve with the area between -1.07 and 2.33 shaded.]
5.4 The Normal Distribution
Exercise 5.31:
For the standard normal random variable Z, solve the following equation for k.
a. P(Z ≥ k) = .01
From Table 3, P(0 ≤ Z ≤ 2.33) = 0.4901, so P(Z ≥ 2.33) = .01
⇒ k = 2.33
[Figure: standard normal curve with the upper-tail area of .01, to the right of k, shaded.]
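These Table 3 lookups can also be done with software. A minimal Python sketch using scipy (an assumption; note that norm.cdf gives the area to the left of z, whereas Table 3 tabulates the area from 0 to z):

from scipy.stats import norm

print(norm.cdf(-2.42))                    # Exercise 5.30 i:  P(Z <= -2.42), about 0.0078
print(norm.cdf(2.33) - norm.cdf(-1.07))   # Exercise 5.30 g:  P(-1.07 <= Z <= 2.33), about 0.8478
print(norm.ppf(0.99))                     # Exercise 5.31 a:  k with P(Z >= k) = .01, about 2.33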
5.4 The Normal Distribution
• Normal Random Variables in General
[Figure: normal density fY(y), a bell-shaped curve centered at µy with spread determined by σy.]
• The probability distribution is mound-shaped.
• µy is the expected value of the distribution.
• σy is the standard deviation of the distribution.
5.4 The Normal Distribution
• Standardize Y to find areas under the normal
curve of Y.
Z = (Y − µY) / σY   {Procedure for standardizing Y}
Now use Table 3.
• The standardized variable Z measures how many
standard deviations Y is above or below its mean.
5.4 The Normal Distribution
Exercise 5.41:
A potato chip packaging plant has a process line that fills
12 ounce bags of potato chips. At the current setting of the
machine, the quality control engineer knows that the
actual distribution of weights in the bags follows a normal
distribution with a mean of 12.0 ounces and a standard
deviation of 0.18 ounces.
a. What percentage of all bags filled contain exactly 12
ounces?
P(Y = 12) = 0, since the probability at a point is 0.
5.4 The Normal Distribution
b. What percentage of all bags filled contain more than 12.4 ounces?
P(Y > 12.4) = P(Z > (12.4 − 12)/0.18)
= P(Z > 2.22)
= 0.5 – 0.4868
= 0.0132
[Figure: normal curve centered at 12 with the area above 12.4 shaded.]
• 12.4 is 2.22 standard deviations from 12.0.
5.4 The Normal Distribution
c. Find the 60th percentile of the actual weights of 12-ounce bags of potato chips.
Find k so that P(Y < k) = .60.
Standardizing: P(Z < (k − 12)/0.18) = .60
From Table 3, P(Z < 0.253) = .60.
Set (k − 12)/0.18 = 0.253
k = 12 + (0.18)(0.253) = 12.046 ounces
[Figure: normal curve centered at 12 with the lower .60 of the area, up to k, shaded.]
5.4 The Normal Distribution
d. Management is concerned when 12-ounce bags of potato
chips contain less than 11.75 ounces. The quality control
engineer can set the filling machine so that actual mean
filling weight is whatever he chooses, but the standard
deviation always remains at 0.18 ounces. What mean
filling weight should he set the machine to if he wants only
1% of all bags to contain less than 11.75 ounces?
Find µ so that P(Y < 11.75) = .01.
.01 = P(Y < 11.75) = P(Z < (11.75 − µ)/0.18)
.01 = P(Z < -2.33) (from Table 3)
Set (11.75 − µ)/0.18 = -2.33
µ = 11.75 + (2.33)(0.18) = 12.17 ounces
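Parts b, c, and d can be verified in a few lines of Python (a sketch; scipy's exact quantiles differ slightly from the rounded Table 3 values used above):

from scipy.stats import norm

mu, sigma = 12.0, 0.18

print(1 - norm.cdf(12.4, mu, sigma))   # part b: P(Y > 12.4), about 0.0131
print(norm.ppf(0.60, mu, sigma))       # part c: 60th percentile, about 12.046 ounces
print(11.75 - norm.ppf(0.01) * 0.18)   # part d: required mean filling weight, about 12.17 ounces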
Section 5.5
Checking Normality
5.5 Checking Normality
• Many of the statistical techniques in later chapters
assume that the data is from a normal distribution.
• Chapter 2 presented several graphical techniques
that could be useful in assessing whether or not
the data is from a normal distribution.
• For example, is a histogram mound-shaped? The
answer to this question is facilitated by
superimposing a normal distribution over the
histogram.
5.5 Checking Normality
Example:
Consider the returns for ^DJI first presented in Chapter 1.
The histogram with a normal distribution superimposed
follows.
[Figure: Histogram for R^DJI with normal distribution superimposed; Mean = -0.3414, StDev = 5.287, N = 35; R^DJI ranges from about -10 to 10.]
5.5 Checking Normality
Conclusion:
At first glance, it appears that the normal distribution is not
a good fit. However, the shape of the histogram is
determined by the number of class intervals and their
width. So, this may not be the best approach.
5.5 Checking Normality
• Another approach for assessing normality is the
Normal Probability Plot.
• The data are arranged in ascending order.
• Each data value, y(i), is assigned a cumulative
relative frequency, pi:
pi = 100(i − 0.5) / n
• Think of 0.5 as a correction factor.
• Other correction factors are sometimes used.
5.5 Checking Normality
• For example, if the data set has 25 observations,
then
p1 = 2.00, p2 = 6.00,…, p25 = 98.00
• The percentage of the observations less than or
equal to y(1) is 2.00%.
• The percentage of the observations less than or
equal to y(2) is 6.00%.
• (y(i), pi) are plotted on a graph where the vertical
axis is scaled so that if the data is from a normal
distribution, the resulting plot should be
approximately linear.
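The construction just described can be carried out directly. A minimal Python sketch (for illustration; Minitab's probability plot uses the same idea but draws the vertical axis on a specially scaled percent scale):

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
y = np.sort(rng.normal(loc=100, scale=10, size=25))   # ordered data y(1) <= ... <= y(25)

n = len(y)
p = (np.arange(1, n + 1) - 0.5) / n   # cumulative relative frequencies (i - 0.5)/n
z = norm.ppf(p)                       # normal scores corresponding to those frequencies

plt.plot(y, z, "o")   # approximately a straight line if the data come from a normal distribution
plt.xlabel("ordered data y(i)")
plt.ylabel("normal score")
plt.show()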
5.5 Checking Normality
• Appearance of NPP’s for data from a distribution that is
not normal.
• Right-skewed data plot as a curve, with the slope
getting flatter as one moves to the right.
• Left-skewed data plot as a curve, with the slope getting
steeper as one moves to the right.
• Data from symmetric distributions with more tail area
than the normal plot as an S-shape, with the slope
steepest at both ends.
• The straight line drawn through the points can assist in
assessing linearity. It can also be misleading if a few of the
points are outliers.
• In the following examples, the sample size is fixed at 25.
This value for n was arbitrarily chosen.
5.5 Checking Normality
Example:
What does the NPP look like for data from a standard
normal distribution?
[Figure: Normal probability plot of z; Mean = 0.07199, StDev = 1.285, N = 25, AD = 0.196, P-Value = 0.878; percent scale from 1 to 99 on the vertical axis, z from -3 to 3 on the horizontal axis.]
5.5 Checking Normality
Conclusion:
Since the plotted points are nearly linear, conclude that the
data came from a normal distribution.
5.5 Checking Normality
Example:
What does the NPP look like for data from a normal
distribution with µ = 100 and σ = 10?
[Figure: Normal probability plot of y; Mean = 103.8, StDev = 10.11, N = 25, AD = 0.231, P-Value = 0.781; y from about 80 to 130.]
5.5 Checking Normality
Conclusion:
Since the plotted points are nearly linear, conclude that the
data came from a normal distribution.
5.5 Checking Normality
Example:
A uniform distribution is one that is of uniform or constant
height for the range of y values. For the interval from –3 to
+3, a uniform distribution has height of (1/6). What does
the NPP look like for data from a uniform distribution
that ranges from –3 to +3 ?
[Figure: Normal probability plot of y; Mean = 0.1511, StDev = 1.944, N = 25, AD = 0.844, P-Value = 0.025; y from -5.0 to 5.0.]
5.5 Checking Normality
Conclusion:
Because the plot is S-shaped with the slope steepest at
both ends, conclude that the data came from a symmetric
distribution with more probability in each tail than the
normal distribution.
5.5 Checking Normality
Example:
What does the NPP look like for data from a distribution
that is skewed to the right with E(Y) = 927 and σY = 871?
[Figure: Normal probability plot of y; Mean = 904.9, StDev = 800.1, N = 25, AD = 1.229, P-Value < 0.005; y from -1000 to 3000.]
5.5 Checking Normality
Conclusion:
Since the plot is curved with the slope getting flatter as
one moves to the right, conclude that the data came from
a right-skewed distribution.
5.5 Checking Normality
Example:
Consider the returns for R^DJI. What does the NPP tell us?
[Figure: NPP for R^DJI; Mean = -0.3414, StDev = 5.287, N = 35, AD = 0.240, P-Value = 0.760; R^DJI from -15 to 10.]
5.5 Checking Normality
Conclusion:
Because the NPP is linear, conclude that the R^DJI are
normally distributed. However, it’s a different story for the
RIBM data. The NPP for RIBM follows.
[Figure: NPP for RIBM; Mean = 0.4368, StDev = 12.30, N = 35, AD = 0.784, P-Value = 0.038; RIBM from -30 to 40.]
5.5 Checking Normality
• Procedure to obtain a Normal Probability Plot using Minitab:
→ Suppose the data to be analyzed are stored in C1
→ Click on Stat → Basic Statistics → Normality Test
→ Enter "C1" in box for "Variable"
→ Select "Percentile Lines" option. The default option is "None"
→ Select "Tests for Normality" option. The default option is "Anderson-Darling"
→ Enter "Title" for plot
→ Click on "OK"
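If Minitab is not available, scipy offers a similar plot; the sketch below is an alternative, not a description of the Minitab dialog above:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

data = np.random.default_rng(1).normal(size=35)   # stand-in for a column of returns such as R^DJI

fig, ax = plt.subplots()
stats.probplot(data, dist="norm", plot=ax)   # ordered data plotted against theoretical normal quantiles
ax.set_title("Normal probability plot")
plt.show()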
Keywords: Chapter 5
• Factorial
• Sequences
• Combinations
• Bernoulli trials
• Binomial random variable
• Binomial probability distribution
• Poisson random variable
• Poisson probability distribution
• Normal random variable
• Standard normal probability distribution
• Normal probability distribution
• Normal probability plot
Summary of Chapter 5
• The counting techniques needed for sequences
and combinations.
• A binomial random variable counts the number of
successes in n trials, with each trial being a
success or failure.
• A Poisson random variable counts the number of
occurrences of an event over a specified length of
time.
• A normal random variable measures the
characteristic of interest and the probability
distribution is bell-shaped.
Summary of Chapter 5
• Computing probabilities for the binomial, Poisson
and normal random variables.
• Assessing normality of data by the normal
probability plot.