Chapter 5: Probability Distributions
Hildebrand, Ott and Gray, Basic Statistical Ideas for Managers, Second Edition
Hildebrand, Ott & Gray, Basic Statistical Ideas for Managers, 2nd edition, Chapter 5. Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Learning Objectives for Ch. 5
• Understanding the counting techniques needed for sequences and combinations.
• Understanding that a binomial random variable counts the number of successes in a fixed number of trials, with each trial being a success or a failure.
• Assumptions needed to use the binomial.
• Understanding that a Poisson random variable counts the number of occurrences of an event in a unit of time, area or volume.
• Assumptions needed to use the Poisson.
• Understanding that a normal random variable measures a characteristic of interest and has a bell-shaped distribution.
• Learning how to calculate probabilities for binomial, Poisson and normal random variables.
• Understanding how to use a normal probability plot to determine whether data are from a normal distribution.

Section 5.1 Counting Possible Outcomes
• Under the classical interpretation of probability:
  $P(\text{Event}) = \dfrac{\text{Number of favorable outcomes}}{\text{Total number of outcomes}}$
• We need ways to count the number of outcomes.
• Preliminary concept: factorials
  • The factorial symbol is "!".
  • Definition of n!: $n! = n(n-1)(n-2)\cdots 1$. Example: $3! = (3)(2)(1) = 6$.
  • By definition, $0! = 1$.
• One consideration in counting techniques:
  • Order matters ⇒ sequences
  • Order doesn't matter ⇒ subsets
  Example: Consider the letters a and b. If order matters, there are 2 sequences: (a,b) and (b,a). If order does not matter, there is only 1 subset: {a,b}.
• Number of sequences
  • Rule: The number of sequences of k objects that can be formed from a set of r distinct objects, denoted ${}_rP_k$, is:
    ${}_rP_k = r(r-1)\cdots(r-k+1)$
  Example: The number of sequences of 2 letters formed from the 4 letters a, b, c, d is (4)(3) = 12. The sequences are:
    (a,b) (a,c) (a,d) (b,c) (b,d) (c,d)
    (b,a) (c,a) (d,a) (c,b) (d,b) (d,c)
• Number of subsets or combinations
  • Rule: The number of subsets of k objects that can be formed from a set of r distinct objects, denoted ${}_rC_k$, is:
    ${}_rC_k = \dfrac{r!}{k!\,(r-k)!}$
  • Notation: Use ${}_rC_k$ or $\binom{r}{k}$.
  Example: The number of subsets of 2 letters formed from the 4 letters a, b, c, d is:
    ${}_4C_2 = \binom{4}{2} = \dfrac{4!}{2!\,(4-2)!} = 6$
  The subsets are: {a,b} {a,c} {a,d} {b,c} {b,d} {c,d}

Exercise 5.67: Several states now have a Lotto game. A player chooses 6 distinct integers in the range 1 to 40. If exactly those 6 numbers are selected as the winning numbers, the player receives a very large prize. What is the probability that a particular set of 6 numbers will be drawn? You may wish to think of the 6 numbers drawn as the "success" numbers.

First approach: Order matters (even though it doesn't)
  Total number of outcomes = ${}_{40}P_6 = (40)(39)(38)(37)(36)(35)$
  Number of favorable outcomes = ${}_6P_6 = (6)(5)(4)(3)(2)(1) = 6!$
  P(Winning) = 6!/[(40)(39)(38)(37)(36)(35)] = .00000026052657
• Another perspective on the first approach:
  $P(\text{Winning}) = \dfrac{(6)(5)(4)(3)(2)(1)}{(40)(39)(38)(37)(36)(35)} = \left(\dfrac{6}{40}\right)\left(\dfrac{5}{39}\right)\cdots\left(\dfrac{1}{35}\right) = P(W_1)\,P(W_2 \mid W_1)\cdots P(W_6 \mid W_1 \text{ and } \cdots \text{ and } W_5)$
  where $W_i$ ≡ {the ith number is a winning number}.

Second approach: Order doesn't matter (and it really doesn't)
  Total number of outcomes = ${}_{40}C_6 = \dfrac{40!}{6!\,34!}$
  Number of favorable outcomes = ${}_6C_6 \cdot {}_{34}C_0 = \dfrac{6!}{6!\,0!}\cdot\dfrac{34!}{0!\,34!} = 1$
  P(Winning) = $1/{}_{40}C_6 = \dfrac{6!\,34!}{40!}$ = 6!/[(40)(39)(38)(37)(36)(35)] = .00000026052657
Moral: You can't lose if you don't play!
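The counts in this section can be checked with a few lines of Python. This is only a sketch using the standard library's `math.perm` and `math.comb`; the variable names are illustrative and not part of the slides. It reproduces the 2-of-4 letters example and both approaches to the Lotto exercise.

```python
import math

# 2 letters chosen from {a, b, c, d}
sequences_2_of_4 = math.perm(4, 2)   # order matters: 4 * 3
subsets_2_of_4 = math.comb(4, 2)     # order ignored: 4! / (2! * 2!)

# Lotto (Exercise 5.67): 6 winning numbers out of 1..40.
# First approach: count ordered sequences.
p_ordered = math.perm(6, 6) / math.perm(40, 6)
# Second approach: count unordered subsets.
p_unordered = (math.comb(6, 6) * math.comb(34, 0)) / math.comb(40, 6)

print(sequences_2_of_4, subsets_2_of_4)
print(f"{p_ordered:.14f}", f"{p_unordered:.14f}")
```

Both approaches give the same probability because each unordered set of 6 numbers corresponds to exactly 6! ordered sequences, so the factor 6! cancels.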
Section 5.2 The Binomial Distribution
• Examples of a Bernoulli trial:
  1. A coin toss results in a head (H) or a tail (T).
  2. A bit sent through a digital communications channel is entered as either 0 or 1 and received either correctly or incorrectly.
  3. An audited account is either current (C) or delinquent (D).
  4. A consumer is either aware (A) of a particular product or not aware (N).
  5. A flight reservation is either a show (S) or a no-show (N).
• Features of a Bernoulli trial:
  • Only 2 possible outcomes for each trial, characterized as Success (S) or Failure (F).
  • π denotes P(S); (1 − π) denotes P(F).
• Bernoulli random variable and probability distribution:
  Let Y = 1 if the trial results in S, and Y = 0 if the trial results in F.
    y         0        1
    P_Y(y)    1 − π    π
• Graphical representation of a Bernoulli probability distribution:
  [Figure: bar chart of P_Y(y) against y, with a bar of height 1 − π at y = 0 and a bar of height π at y = 1.]
  The distribution is skewed when π ≠ .5.
• $E(Y) = 0(1-\pi) + 1(\pi) = \pi$
• $V(Y) = \sum_y (y-\mu)^2 P_Y(y) = (0-\pi)^2(1-\pi) + (1-\pi)^2\pi = \pi(1-\pi)[\pi + (1-\pi)] = \pi(1-\pi)$
• Examples of a binomial random variable:
  1. Toss a coin 10 times. Let Y denote the number of heads in the 10 tosses.
  2. For the next 3 bits transmitted through a digital communications channel, let Y be the number of bits received that are in error.
  3. 20 accounts are randomly selected from a population of several thousand accounts and are audited. Let Y be the number of delinquent accounts in the sample. [The sampling has to be with replacement for the probability of success to remain constant. In reality, the sampling is done without replacement, so the binomial model is an approximation.]
  4. 100 randomly selected consumers are surveyed as part of a market research study. Let Y denote the number of these consumers who are aware of a particular product.
  5. Out of 50 flight reservations made, let Y be the number of passengers who show.
• Features of a binomial experiment:
  • There are n Bernoulli trials [each one results in S or F].
  • The probability of a success, π = P(S), remains constant over the n trials [P(F) = 1 − π].
  • The trials are independent.
  • The binomial random variable is the total number of successes in n trials, where the ordering is unimportant.
• Binomial probability distribution:
  $P_Y(y) = \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}, \quad y = 0, 1, \ldots, n, \qquad \text{where } \binom{n}{y} = \dfrac{n!}{y!\,(n-y)!}$
• The expression for P_Y(y) can be used to calculate probabilities for a binomial random variable.
• What is the basis for the expression for P_Y(y)?
Example: Y denotes the number of bits in error in the next 3 transmitted, where P(Error) = π. (E marks a bit in error; O marks a bit that is not in error.)
    Outcomes                       y    Probability           From P_Y(y)
    (E,E,E)                        3    $\pi^3$               $\binom{3}{3}\pi^3(1-\pi)^0 = \pi^3$
    (E,E,O), (E,O,E), (O,E,E)      2    $3\pi^2(1-\pi)$       $\binom{3}{2}\pi^2(1-\pi)^1 = 3\pi^2(1-\pi)$
    (E,O,O), (O,E,O), (O,O,E)      1    $3\pi(1-\pi)^2$       $\binom{3}{1}\pi^1(1-\pi)^{3-1} = 3\pi(1-\pi)^2$
    (O,O,O)                        0    $(1-\pi)^3$           $\binom{3}{0}\pi^0(1-\pi)^3 = (1-\pi)^3$
  The "Probability" column is found by using the principles of Chapter 3.
• Calculation of probabilities:
  • Use the binomial probability distribution formula.
  • Instead of actually calculating the probabilities, we can look them up in a table. Table 1 at the end of Hildebrand, Ott & Gray gives the probabilities for n = 2(1)10(2)20, 50, 100 and π = .05(.05).50, that is, n = 2, 3, …, 10, 12, …, 20, 50, 100 and π = .05, .10, …, .50.
  • We can also use software (MINITAB; Excel's BINOMDIST function).
  • Two obvious cases:
    $P[0 \text{ successes}] = \binom{n}{0}\pi^0(1-\pi)^n = (1-\pi)^n$
    $P[n \text{ successes}] = \binom{n}{n}\pi^n(1-\pi)^0 = \pi^n$
• Mean and variance of a binomial random variable:
  E(Y) = nπ
  V(Y) = σ² = nπ(1 − π)
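The binomial formula, the 3-bit example and the mean and variance results can be checked numerically. The sketch below is an illustration only: `binomial_pmf` is a hypothetical helper name, and the value π = 0.1 is an assumed error probability chosen for the demonstration, not a value from the slides.

```python
import math

def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for n independent trials, each with success probability pi."""
    return math.comb(n, y) * pi**y * (1 - pi) ** (n - y)

n, pi = 3, 0.1  # pi = 0.1 is an assumed, illustrative error probability
pmf = [binomial_pmf(y, n, pi) for y in range(n + 1)]

# Mean and variance computed directly from the distribution
mean = sum(y * p for y, p in enumerate(pmf))
variance = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))

for y, p in enumerate(pmf):
    print(y, round(p, 6))
print("mean:", mean, "n*pi:", n * pi)
print("variance:", variance, "n*pi*(1-pi):", n * pi * (1 - pi))
```

Computing the mean and variance from the probabilities, rather than from the closed-form expressions, makes the comparison with nπ and nπ(1 − π) a genuine check.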
• An easy way to find E(Y) and V(Y):
  Y = total number of successes in n trials
    = (number of successes on 1st trial) + (number of successes on 2nd trial) + … + (number of successes on nth trial)
  E(Y) = π + π + … + π = nπ
  V(Y) = π(1 − π) + π(1 − π) + … + π(1 − π) = nπ(1 − π)   (the variances add because the trials are independent)

Exercise 5.61 [Revised so that the number of potential customers is 50.] Executives at a soft drink company wish to test a new formulation of their chief product. The new drink is tested in comparison to the current one. Each of 50 potential customers is given a cup of the current formulation and a cup of the new one. The cups are labeled H and K to avoid bias. Each customer indicates a preference. Assume that, in fact, the customers can't detect a difference and are, in effect, guessing. Define Y to be the number (out of 50) indicating preference for the new formulation.

a. What probability distribution should apply to Y? Do the assumptions underlying that distribution seem plausible in this context?
  • Each of the 50 customers is a Bernoulli trial (either prefers the new product or does not).
  • If customers are guessing, the probability of preference for the new product is 0.5.
  • It is reasonable to assume the trials are independent.
  • Let Y be the number of customers who indicate a preference for the new product. Then Y is binomial with n = 50 and π = 0.5.
A graph of the probability distribution of Y follows.
[Figure: Probability Distribution of Y. Vertical axis: P(Y = y), from 0.00 to 0.12; horizontal axis: y, from 0 to 50.]
The graph is symmetric because π = 0.5.

b. Find the mean and standard deviation of Y.
  µ_Y = nπ = (50)(0.5) = 25
  σ² = nπ(1 − π) = 12.5
  σ = 3.54

c. Find the probability that the number of customers preferring the new brand is within 2 standard deviations of the mean.
  P[µ − 2σ ≤ Y ≤ µ + 2σ] = P[25 − 2(3.54) ≤ Y ≤ 25 + 2(3.54)]
    = P[17.92 ≤ Y ≤ 32.08]
    = P[18 ≤ Y ≤ 32]   (Y takes only integer values)
    = P[Y = 18] + P[Y = 19] + … + P[Y = 32]
    = .0160 + .0270 + … + .0160   (from Table 1)
    = .9672
  Most of the time (97%), we should observe between 18 and 32 customers indicating a preference for the new product if, in fact, they are guessing.

d. In one such test, 12 people preferred the new formulation. Find the probability that 12 or fewer would prefer the new formulation if the customers can't detect a difference. What, if anything, can you infer about consumer preferences from the results of the taste test?
  P(Y ≤ 12) = .0001 (from Table 1, whose entries are rounded to four decimal places; exact calculation gives about .00015)
  If the hypothesis that the people can't detect a difference is correct, P(Y ≤ 12) is very small [< .05]. Since this probability is very small, the result suggests that the hypothesis is incorrect, that is, π ≠ .5.

Why were the cups labeled H and K? Studies have shown that people have no preference for either of these letters, as opposed to the letters A and B.
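Parts b through d of Exercise 5.61 can be reproduced without Table 1. The following is a minimal sketch using only the Python standard library; the helper `binomial_pmf` and the variable names are illustrative assumptions, not part of the exercise.

```python
import math

def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for a binomial random variable."""
    return math.comb(n, y) * pi**y * (1 - pi) ** (n - y)

n, pi = 50, 0.5
mu = n * pi                            # part b: mean
sigma = math.sqrt(n * pi * (1 - pi))   # part b: standard deviation

# Part c: Y is an integer, so round the interval inward.
lo = math.ceil(mu - 2 * sigma)
hi = math.floor(mu + 2 * sigma)
p_within = sum(binomial_pmf(y, n, pi) for y in range(lo, hi + 1))

# Part d: probability of 12 or fewer preferring the new formulation.
p_12_or_fewer = sum(binomial_pmf(y, n, pi) for y in range(13))

print(mu, round(sigma, 2), lo, hi)
print(round(p_within, 4), f"{p_12_or_fewer:.6f}")
```

The exact tail probability in part d is slightly larger than a sum of four-decimal table entries, but the conclusion is unchanged: an outcome this extreme would be very unusual if customers were only guessing.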
Section 5.3 The Poisson Distribution
• Named for Simeon D. Poisson (1781-1840)
• Examples of a Poisson random variable:
  • The number of work-related injuries per month at a manufacturing plant.
  • The number of e-mail messages arriving at a personal computer in one hour.
  • The number of network errors per day on a local area network.
• The Poisson random variable is the number of occurrences in a given unit.
• Features of a Poisson experiment, for a unit of time, area or volume:
  • The probability that an event occurs in a given unit is the same for all units.
  • The probability of two or more events occurring at the same time is 0.
  • The occurrence of the event in one unit is independent of the number that occur in other units.
  • The expected number of occurrences in each unit is denoted by µ.
• Poisson probability distribution:
  $P_Y(y) = \dfrac{e^{-\mu}\,\mu^{y}}{y!}, \quad y = 0, 1, 2, \ldots$
• Calculation of probabilities:
  • Use the formula for P_Y(y).
  • Use Table 2 for µ = 0.1(0.1)5, 5.5(0.5)10 and 11(1)20.
  • Use software.
• Mean and variance for a Poisson random variable:
  • E(Y) = µ
  • Var(Y) = µ

Exercise 5.29: Suppose that the number of defaults on home mortgage loans at National Mortgage Company follows a Poisson distribution with an average of 8.2 defaults per month.

a. Compute the probability of exactly 12 defaults at NMC next month.
  $P(Y = 12) = P_Y(12) = \dfrac{e^{-8.2}\,(8.2)^{12}}{12!} = 0.0529925$   (from Minitab)

A graph of the probability distribution of Y, the number of defaults per month, follows.
[Figure: Probability Distribution of Y. Vertical axis: P(Y = y), from 0.00 to 0.14; horizontal axis: y, from 0 to 90.]
The probability distribution quickly tapers off to about .005 or less for y ≥ 16.

b. What is the chance of at least one default next month?
  $P(Y \ge 1) = 1 - P(Y = 0) = 1 - e^{-8.2} = 1 - .00027 = .99973$

c. Because of poor economic times, NMC believes that the average number of defaults may have increased from 8.2 per month. Last month, there were 15 defaults. If the average number of defaults has not changed from 8.2, find P(Y ≥ 15).
  P(Y ≥ 15) = 1 − P(Y ≤ 14) = 1 − .9791 = .0209
  ⇒ Since P(Y ≥ 15) is small, this suggests that µ has increased from 8.2.
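The Poisson calculations in Exercise 5.29 follow directly from the formula for P_Y(y). The sketch below uses only the Python standard library; `poisson_pmf` is a hypothetical helper name introduced for illustration, and the rate µ = 8.2 per month is taken from the exercise.

```python
import math

def poisson_pmf(y: int, mu: float) -> float:
    """P(Y = y) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu**y / math.factorial(y)

mu = 8.2  # average defaults per month, from the exercise

p_exactly_12 = poisson_pmf(12, mu)                               # part a
p_at_least_1 = 1 - poisson_pmf(0, mu)                            # part b
p_at_least_15 = 1 - sum(poisson_pmf(y, mu) for y in range(15))   # part c

print(round(p_exactly_12, 7))
print(round(p_at_least_1, 5))
print(round(p_at_least_15, 4))
```

Parts b and c use the complement rule, so only a finite number of terms has to be summed even though a Poisson random variable has no upper bound.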
Section 5.4 The Normal Distribution
Continuous Random Variables in General
• Examples of continuous random variables:
  • Stock market returns
  • Quality characteristics of finished products (such as net contents)
  • Heights of males; heights of females
  • Age at time of death
• Features of a continuous random variable:
  • The possible values are uncountable.
  • The probability that the random variable takes on a specific value is 0.
  • Only an interval of values has a nonzero probability.
• The probability for an interval of values will be shown as the area under the pdf.
  [Figure: a pdf f_Y(y) plotted against y, with the area under the curve between a and b labeled P(a < Y < b).]
• Details:
  • It doesn't matter whether endpoints are included in the interval:
    P[a < Y < b] = P[a ≤ Y < b] = P[a < Y ≤ b] = P[a ≤ Y ≤ b]
    Why? P[Y = a] = P[Y = b] = 0.
  • Data are never continuous!
• The standard normal random variable
  • The probability distribution of a standard normal random variable Z is shown below:
    [Figure: bell-shaped curve f_Z(z) plotted against z, centered at 0.]
  • E(Z) = µ_Z = 0 {the curve is symmetric around 0}
  • V(Z) = σ_Z² = 1
  • Other properties: The total area under the curve is 1. The curve is symmetric around 0 ⇒ P(Z > 0) = 0.5.
• Determination of probabilities for a standard normal random variable:
  • Use Table 3 (area from 0 to a right-hand value z).
  • Use software.

Exercise 5.30: Suppose that Z represents a standard normal random variable.
i. Find P(Z ≤ −2.42).
  P(Z ≤ −2.42) = 0.5 − P(0 ≤ Z ≤ 2.42) = 0.5 − .4922 (from Table 3) = 0.0078
  [Figure: f_Z(z) with the area to the left of z = −2.42 marked.]
g. Find P(−1.07 ≤ Z ≤ 2.33).
  P(−1.07 ≤ Z ≤ 2.33) = P(−1.07 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 2.33) = 0.3577 + 0.4901 (from Table 3) = 0.8478
  [Figure: f_Z(z) with the area between z = −1.07 and z = 2.33 marked.]

Exercise 5.31: For the standard normal random variable Z, solve the following equation for k.
a.
P(Z ≥ k) = .01
  From Table 3, P(0 ≤ Z ≤ 2.33) = 0.4901, so P(Z ≥ 2.33) = 0.5 − 0.4901 ≈ .01 ⇒ k = 2.33
  [Figure: f_Z(z) with the upper-tail area of .01 to the right of z = k marked.]

• Normal random variables in general
  [Figure: bell-shaped curve f_Y(y) plotted against y, centered at µ_Y with spread σ_Y.]
  • The probability distribution is mound-shaped.
  • µ_Y is the expected value of the distribution.
  • σ_Y is the standard deviation of the distribution.
• Standardize Y to find areas under the normal curve of Y:
  $Z = \dfrac{Y - \mu_Y}{\sigma_Y}$   {procedure for standardizing Y}
  Now use Table 3.
• The standardized variable Z measures how many standard deviations Y is above or below its mean.

Exercise 5.41: A potato chip packaging plant has a process line that fills 12-ounce bags of potato chips. At the current setting of the machine, the quality control engineer knows that the actual distribution of weights in the bags follows a normal distribution with a mean of 12.0 ounces and a standard deviation of 0.18 ounces.

a. What percentage of all bags filled contain exactly 12 ounces?
  P(Y = 12) = 0, since the probability at a point is 0.

b. What percentage of all bags filled contain more than 12.4 ounces?
  $P(Y > 12.4) = P\!\left(Z > \dfrac{12.4 - 12}{0.18}\right) = P(Z > 2.22) = 0.5 - 0.4868 = 0.0132$
  [Figure: f_Y(y) centered at 12 with the area to the right of y = 12.4 marked.]
  • 12.4 is 2.22 standard deviations from 12.0.
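The Table 3 lookups above can be cross-checked with software. One possible sketch uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative. Small differences from the table-based answers are expected, because the table rounds z to two decimal places.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Exercise 5.30
p_left = Z.cdf(-2.42)                     # P(Z <= -2.42)
p_between = Z.cdf(2.33) - Z.cdf(-1.07)    # P(-1.07 <= Z <= 2.33)

# Exercise 5.31: P(Z >= k) = .01 means k is the 99th percentile
k = Z.inv_cdf(1 - 0.01)

# Exercise 5.41 b: bag weights with mean 12.0 and sd 0.18
Y = NormalDist(mu=12.0, sigma=0.18)
z_score = (12.4 - Y.mean) / Y.stdev       # standardized value of 12.4
p_over = 1 - Y.cdf(12.4)                  # P(Y > 12.4)

print(round(p_left, 4), round(p_between, 4), round(k, 2))
print(round(z_score, 2), round(p_over, 4))
```

Note that `cdf` returns the area to the left of a value, while Table 3 gives the area between 0 and z, which is why the table-based solutions add or subtract from 0.5.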
c. Find the 60th percentile of the actual weights of 12-ounce bags of potato chips.
  Find k so that P(Y < k) = .60.
  Standardizing: $P\!\left(Z < \dfrac{k - 12}{0.18}\right) = .60$
  From Table 3, P(Z < 0.253) = .60.
  Set $\dfrac{k - 12}{0.18} = 0.253$
  k = 12 + (0.18)(0.253) = 12.046 ounces
  [Figure: f_Y(y) centered at 12 with the area of .60 to the left of y = k marked.]

d. Management is concerned when 12-ounce bags of potato chips contain less than 11.75 ounces. The quality control engineer can set the filling machine so that the actual mean filling weight is whatever he chooses, but the standard deviation always remains at 0.18 ounces. What mean filling weight should he set the machine to if he wants only 1% of all bags to contain less than 11.75 ounces?
  Find µ so that P(Y < 11.75) = .01.
  $.01 = P(Y < 11.75) = P\!\left(Z < \dfrac{11.75 - \mu}{0.18}\right)$
  .01 = P(Z < −2.33) from Table 3
  Set $\dfrac{11.75 - \mu}{0.18} = -2.33$
  µ = 11.75 + (2.33)(0.18) = 12.17 ounces

Section 5.5 Checking Normality
• Many of the statistical techniques in later chapters assume that the data are from a normal distribution.
• Chapter 2 presented several graphical techniques that could be useful in assessing whether or not the data are from a normal distribution.
• For example, is a histogram mound-shaped? The answer to this question is facilitated by superimposing a normal distribution over the histogram.
Example: Consider the returns for ^DJI first presented in Chapter 1. The histogram with a normal distribution superimposed follows.
[Figure: Histogram for R^DJI with Normal Distribution Superimposed. Fitted normal: Mean = −0.3414, StDev = 5.287, N = 35. Vertical axis: Frequency; horizontal axis: R^DJI.]
Conclusion: At first glance, it appears that the normal distribution is not a good fit. However, the shape of the histogram is determined by the number of class intervals and their width. So, this may not be the best approach.

• Another approach for assessing normality is the Normal Probability Plot (NPP).
  • The data are arranged in ascending order.
  • Each data value, y_(i), is assigned a cumulative relative frequency, p_i:
    $p_i = \dfrac{100\,(i - 0.5)}{n}$
  • Think of 0.5 as a correction factor.
  • Other correction factors are sometimes used.
• For example, if the data set has 25 observations, then p_1 = 2.00, p_2 = 6.00, …, p_25 = 98.00.
  • The percentage of the observations less than or equal to y_(1) is 2.00%.
  • The percentage of the observations less than or equal to y_(2) is 6.00%.
• The points (y_(i), p_i) are plotted on a graph where the vertical axis is scaled so that, if the data are from a normal distribution, the resulting plot should be approximately linear.
• Appearance of NPPs for data from a distribution that is not normal:
  • Right-skewed data plot as a curve, with the slope getting flatter as one moves to the right.
  • Left-skewed data plot as a curve, with the slope getting steeper as one moves to the right.
  • Data from symmetric distributions with more tail area than the normal plot as an S-shape, with the slope steepest at both ends.
• The straight line drawn through the points can assist in assessing linearity. It can also be misleading if a few of the points are outliers.
• In the following examples, the sample size is fixed at 25. This value for n was arbitrarily chosen.

Example: What does the NPP look like for data from a standard normal distribution?
[Figure: Probability Plot of z (Normal). Mean = 0.07199, StDev = 1.285, N = 25, AD = 0.196, P-Value = 0.878. Vertical axis: Percent; horizontal axis: z.]
Conclusion: Since the plotted points are nearly linear, conclude that the data came from a normal distribution.
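The plotting positions behind a normal probability plot can be sketched in a few lines. The code below is an illustration under stated assumptions, not Minitab's procedure: it uses the 0.5 correction factor given in these slides (software may use a different one), and it converts each percentage to a standard normal quantile, which is one common way to realize the special scaling of the percent axis. The function names are hypothetical.

```python
from statistics import NormalDist

def plotting_positions(n: int) -> list[float]:
    """Cumulative relative frequencies p_i = 100(i - 0.5)/n, for i = 1..n."""
    return [100 * (i - 0.5) / n for i in range(1, n + 1)]

def normal_scores(n: int) -> list[float]:
    """Standard normal quantiles matching each plotting position."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(p / 100) for p in plotting_positions(n)]

p = plotting_positions(25)
scores = normal_scores(25)

print(p[0], p[1], p[-1])   # first, second and last plotting positions
print(round(scores[0], 3), round(scores[12], 3), round(scores[-1], 3))
```

Pairing the sorted data values with these scores and plotting one against the other gives a plot that should look roughly like a straight line when the data are from a normal distribution.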
Example: What does the NPP look like for data from a normal distribution with µ = 100 and σ = 10?
[Figure: Probability Plot of y (Normal). Mean = 103.8, StDev = 10.11, N = 25, AD = 0.231, P-Value = 0.781. Vertical axis: Percent; horizontal axis: y.]
Conclusion: Since the plotted points are nearly linear, conclude that the data came from a normal distribution.

Example: A uniform distribution is one that is of uniform or constant height for the range of y values. For the interval from −3 to +3, a uniform distribution has height of (1/6). What does the NPP look like for data from a uniform distribution that ranges from −3 to +3?
[Figure: Probability Plot of y (Normal). Mean = 0.1511, StDev = 1.944, N = 25, AD = 0.844, P-Value = 0.025. Vertical axis: Percent; horizontal axis: y.]
Conclusion: Because the plot is S-shaped with the slope steepest at both ends, conclude that the data came from a symmetric distribution with more probability in each tail than the normal distribution.
Example: What does the NPP look like for data from a distribution that is skewed to the right with E(Y) = 927 and σY = 871?
[Figure: Probability Plot of y (Normal). Horizontal axis: y; vertical axis: Percent. Mean = 904.9, StDev = 800.1, N = 25, AD = 1.229, P-Value < 0.005.]
Conclusion: Since the plot is curved with the slope getting flatter as one moves to the right, conclude that the data came from a right-skewed distribution.

Example: Consider the returns for R^DJI. What does the NPP tell us?
[Figure: NPP for R^DJI (Normal). Horizontal axis: R^DJI; vertical axis: Percent. Mean = -0.3414, StDev = 5.287, N = 35, AD = 0.240, P-Value = 0.760.]
Conclusion: Because the NPP is linear, conclude that the R^DJI returns are normally distributed. However, it’s a different story for the RIBM data. The NPP for RIBM follows.
[Figure: NPP for RIBM (Normal). Horizontal axis: RIBM; vertical axis: Percent. Mean = 0.4368, StDev = 12.30, N = 35, AD = 0.784, P-Value = 0.038.]

• Procedure to obtain a Normal Probability Plot using Minitab:
→ Suppose the data to be analyzed are stored in C1
→ Click on Stat → Basic Statistics → Normality Test
→ Enter “C1” in box for “Variable”
→ Select “Percentile Lines” option. The default option is “None”
→ Select “Tests for Normality” option. The default option is “Anderson-Darling”
→ Enter “Title” for plot
→ Click on “OK”

Keywords: Chapter 5
• Factorial
• Sequences
• Combinations
• Bernoulli trials
• Binomial random variable
• Binomial probability distribution
• Poisson random variable
• Poisson probability distribution
• Normal random variable
• Standard normal probability distribution
• Normal probability distribution
• Normal probability plot

Summary of Chapter 5
• The counting techniques needed for sequences and combinations.
• A binomial random variable counts the number of successes in n trials, with each trial being a success or failure.
• A Poisson random variable counts the number of occurrences of an event in a unit of time, area or volume.
• A normal random variable measures the characteristic of interest and the probability distribution is bell-shaped.
• Computing probabilities for the binomial, Poisson and normal random variables.
• Assessing normality of data by the normal probability plot.
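Each probability plot in Section 5.5 reports an Anderson-Darling statistic (AD) together with a P-Value. As a hedged sketch of how the two are related, the Python function below applies a commonly cited small-sample adjustment and piecewise approximation (attributed to D'Agostino and Stephens). That Minitab computes its P-Value in exactly this way is an assumption, and the function name ad_p_value is hypothetical. For the AD and N values shown on the plots, the results come out close to the reported P-Values.

```python
import math


def ad_p_value(a_squared, n):
    """Approximate p-value for an Anderson-Darling test of normality.

    a_squared is the AD statistic and n is the sample size. The
    adjustment and coefficients are an assumed, commonly cited
    approximation, not a formula taken from the slides.
    """
    # Small-sample adjustment of the statistic.
    a_star = a_squared * (1 + 0.75 / n + 2.25 / n ** 2)
    if a_star >= 0.6:
        return math.exp(1.2937 - 5.709 * a_star + 0.0186 * a_star ** 2)
    if a_star > 0.34:
        return math.exp(0.9177 - 4.279 * a_star - 1.38 * a_star ** 2)
    if a_star > 0.2:
        return 1 - math.exp(-8.318 + 42.796 * a_star - 59.938 * a_star ** 2)
    return 1 - math.exp(-13.436 + 101.14 * a_star - 223.73 * a_star ** 2)


# (label, AD, N) as reported on the probability plots in Section 5.5.
reported = [
    ("standard normal", 0.196, 25),
    ("normal, mean 100", 0.231, 25),
    ("uniform", 0.844, 25),
    ("right-skewed", 1.229, 25),
    ("R^DJI", 0.240, 35),
    ("RIBM", 0.784, 35),
]
for label, ad, n in reported:
    print(label, round(ad_p_value(ad, n), 4))
```

A small P-Value, as for the uniform, right-skewed and RIBM data, signals that the points depart from the straight line more than would be expected for a normal sample of that size.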