Chapter 14
Random Variables
and Probability
Models
Copyright © 2014, 2012, 2009 Pearson Education, Inc.
Objectives:
The student will be able to:
47. Recognize random variables.
48. Find the probability model for a discrete random
variable.
49. Find and interpret in context the mean (expected
value) and the standard deviation of a random
variable.
50. Tell if a situation involves Bernoulli trials.
51. Know the appropriate conditions for using a Binomial
model.
52. Calculate binomial probabilities.
53. Find and interpret in context the mean and standard
deviation of a Binomial model.
14.1
Expected Value:
Center
Probability Model
• An insurance company pays $10,000 if you die or $5,000 for a disability.
• The amount the company pays is a random variable: a numeric value based on the outcome of a random event.
• It is a discrete random variable, since we can list all the possible outcomes.
• A continuous random variable is a random variable that is not discrete.
• The collection of all possible values and their probabilities is called a probability model.
Expected Value
• The expected value is the average amount that is likely to occur if there are many trials.
• Expected Value Formula: μ = E(X) = Σ x P(x)
• E(X) = 10,000 (1/1000) + 5,000 (2/1000) + 0 (997/1000) = $20
• The company expects to pay an average of about $20 per policy per year.
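The expected value formula can be checked numerically. The following is a minimal Python sketch; the function name `expected_value` and the dictionary layout are illustrative choices, not part of the slides.

```python
from fractions import Fraction

def expected_value(model):
    """Return E(X) = sum of x * P(x) for a {value: probability} model."""
    return sum(x * p for x, p in model.items())

# Insurance payout model from the slide:
# $10,000 (death), $5,000 (disability), $0 (neither).
insurance = {
    10_000: Fraction(1, 1000),
    5_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

# A valid probability model must sum to 1.
assert sum(insurance.values()) == 1

# Expected payout per policy per year, in dollars.
print(expected_value(insurance))
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the hand calculation.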
Valentine’s Day Discount Deal
• At the end of the meal, lovers can play for a discount. From 4 aces:
  • Black Ace: No discount
  • Ace of Hearts: $20 discount
  • Ace of Diamonds: Pick another ace
    – Black Ace: No discount
    – Ace of Hearts: $10 discount
• Find a probability model and interpret the expected value.
Valentine’s Day Discount Deal
• P(X = 20) = P(A♥) = 1/4
• P(X = 10) = P(A♦, then A♥) = P(A♦) × P(A♥ | A♦) = 1/4 × 1/3 = 1/12
• P(X = 0) = 1 – (1/4 + 1/12) = 2/3
Valentine’s Day Discount Deal
• Expected Value: E(X) = 20 (1/4) + 10 (1/12) + 0 (2/3) ≈ $5.83
• The restaurant should expect the average discount to be about $5.83 per couple.
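One way to double-check the discount model is to enumerate the draws instead of reasoning through the conditional probabilities. This is a sketch only; the suit labels and the function name `discount_model` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

ACES = ["spades", "clubs", "hearts", "diamonds"]

def discount_model():
    """Build {discount: probability} by enumerating the draws in the game."""
    model = {0: Fraction(0), 10: Fraction(0), 20: Fraction(0)}
    p_first = Fraction(1, len(ACES))
    for first in ACES:
        if first == "hearts":
            model[20] += p_first
        elif first == "diamonds":
            # Pick again from the three aces that remain.
            remaining = [a for a in ACES if a != "diamonds"]
            for second in remaining:
                p = p_first * Fraction(1, len(remaining))
                model[10 if second == "hearts" else 0] += p
        else:
            model[0] += p_first  # black ace: no discount
    return model

model = discount_model()
mean = sum(x * p for x, p in model.items())
print(model)        # probabilities for the $0, $10 and $20 discounts
print(float(mean))  # expected discount per couple, in dollars
```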
14.2
Standard Deviation
Standard Deviation of a Probability Model
• Consider data from many outcomes of a random variable.
• The variance and standard deviation of these outcomes will measure the spread of the data.
• Theoretically: σ² = Σ (x − μ)² P(x)
• The standard deviation is the square root of the variance: σ = √(σ²)
About the Standard Deviation
• Not just the standard deviation of the X values
• A weighted average
• Measures how outcomes will likely be spread out if many are selected
• Will be large if there is a high probability of both small values and large values
Using the TI-83 to find E(X) and SD(X) for
a probability model
Finding these values for a discrete random variable is much like getting descriptive statistics for data with frequencies, except that here we use probabilities (decimals) in place of the frequencies.
Example:
• A recent study of households and the number of children in each showed that 18% of the households had no children, 39% had one, 24% had two, 14% had three, 4% had four and 1% had five. Find the expected value (the mean number of children per household) and the standard deviation of this distribution.
Place the values of the variable in L1: 0, 1, 2, 3, 4, 5 and the associated probabilities in L2: .18, .39, .24, .14, .04 and .01.
Using the TI-83
Then do STAT -> CALC -> 1-Var Stats L1, L2
• The mean is 1.5, which is also the sum the calculator reports. The calculator is doing what we would do with the formula: multiply each value by its corresponding probability and then add the separate products.
• The standard deviation is 1.1180. Notice that n = 1. That just means that the individual probabilities add up to one, which they should if this is a proper distribution.
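For readers without a TI-83, the same calculation can be sketched in Python. The lists mirror L1 and L2 from the example; the function name `mean_and_sd` is an illustrative choice, not a calculator command.

```python
from math import isclose, sqrt

values = [0, 1, 2, 3, 4, 5]                    # L1: number of children
probs = [0.18, 0.39, 0.24, 0.14, 0.04, 0.01]   # L2: probabilities

def mean_and_sd(values, probs):
    """Return (mu, sigma) for a discrete probability model."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, sqrt(var)

# The probabilities should add up to one (the calculator shows this as n = 1).
assert isclose(sum(probs), 1.0)

mu, sigma = mean_and_sd(values, probs)
print(round(mu, 4), round(sigma, 4))
```

The variance is a probability-weighted average of squared deviations, which is why each squared deviation is multiplied by P(x) rather than divided by a count.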
Practice
4) A day trader buys an option on a stock that will return
$100 profit if it goes up today and loses $200 if it goes
down. If the trader thinks there is a 75% chance that the
stock will go up, what is his expected value of the option?
6,14) You roll a die. If it comes up a 6, you win $100. If not,
you get to roll again. If you get a 6 the second time you
win $50, if not, you lose.
• Create a probability model for this game
• Find the expected amount you’ll win
• How much would you be willing to pay to play this game?
• Find the standard deviation of the amount you might win
24) Your company bids for two contracts. You believe the
probability that you get contract #1 is 0.8. If you get
contract #1, the probability that you get contract #2 is 0.2,
and if you do not get contract #1, the probability that you
get #2 will be 0.3.
a) Are the two contracts independent? Explain
b) Find the probability that you get both contracts
c) Find the probability that you get no contract
d) Let X be the number of contracts you get. Find the
probability model for X
e) Find the expected value and standard deviation of X
Refurbished Computers, Expected Value,
and Standard Deviation
The problem
• You just shipped two computers.
• Out of the 15 in stock, 4 were refurbished.
• If one of the two is refurbished, you lose $100.
• If both are, you lose $1000 due to an angry customer.
• Find and interpret the expected value and standard deviation.
Refurbished Computers Accidentally
Shipped
Think→
• Plan: Find the expected loss and standard deviation
due to shipping refurbished computers.
• Variable: Let X = amount of loss.
• Plot: Make a picture.
• Model:
Refurbished Computers Accidentally
Shipped
Show→
• Mechanics: Use your calculator to find the
expected value and standard deviation.
Tell →
• Conclusion: I expect this mistake to cost the firm
$98.90 with a standard deviation of $226.74.
• The large standard deviation comes from the large
range of possible values.
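The model table for this example did not survive in the transcript, so the sketch below rebuilds it under one assumption: the two shipped computers were picked at random, without replacement, from the 15 in stock. All variable and function names are illustrative.

```python
from fractions import Fraction
from math import sqrt

STOCK, REFURBISHED = 15, 4
NEW = STOCK - REFURBISHED

# Two computers shipped without replacement (assumption stated above).
p_both = Fraction(REFURBISHED, STOCK) * Fraction(REFURBISHED - 1, STOCK - 1)
p_none = Fraction(NEW, STOCK) * Fraction(NEW - 1, STOCK - 1)
p_one = 1 - p_both - p_none

def mean_and_sd(model):
    """Return (mu, sigma) for a {loss: probability} model."""
    mu = sum(x * p for x, p in model.items())
    var = sum((x - mu) ** 2 * p for x, p in model.items())
    return mu, sqrt(var)

# Loss in dollars: 0 if neither is refurbished, 100 if one is, 1000 if both are.
exact = {0: float(p_none), 100: float(p_one), 1000: float(p_both)}
# Same model with probabilities rounded to three decimals.
rounded = {x: round(p, 3) for x, p in exact.items()}

print(mean_and_sd(exact))
print(mean_and_sd(rounded))
```

Under these assumptions, the rounded model gives roughly $98.90 and $226.74, which matches the conclusion on the slide, while the unrounded probabilities give roughly $99.05 and $226.99.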
14.4
The Binomial Model
Bernoulli Trials
The basis for the Binomial probability model we will examine in this
chapter is the Bernoulli trial.
We have Bernoulli trials if:
• there are two possible outcomes (success and failure).
• the probability of success, p, is constant.
• the trials are independent.
• Examples:
– Flipping a coin (where heads is success)
– rolling a die (where getting a “6” is success)
– throwing free throws in a basketball game
– drawing a card from a deck of cards with replacement (where
drawing an Ace is success)
Do we have Bernoulli Trials?
• You are rolling 5 dice and need to get at least two 6’s to win the game.
• We record the eye colors found in a group of 500 people.
• A city council of 11 Republicans and 8 Democrats picks a committee of 4 at random. What is the probability that they choose all Democrats?
• A 2002 Rutgers University study found that 74% of high school students have cheated on a test at least once. Your local high school principal conducts a survey and gets responses that admit to cheating from 322 of 481 students.
• How likely is it that in a group of 120 the majority may have type A blood, given that Type A is found in 43% of the population?
The Geometric Model – (just an
application of the multiplication rule)
A single Bernoulli trial is usually not all that interesting.
A Geometric probability model tells us the probability for a random
variable that counts the number of Bernoulli trials until the first
success.
• Example: let’s draw cards from a standard deck with replacement and consider drawing a heart a “success.”
– Do we have Bernoulli trials? Would we have Bernoulli trials if
we were drawing without replacement?
– What is the probability p of success? What is the probability q
of failure?
– What is the probability that the first heart is the 3rd card
drawn?
○ i.e. first success occurs on trial 3.
○ This is P(fail) × P(fail) × P(succeed) = 3/4 × 3/4 × 1/4 = 9/64 ≈ 0.14
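The multiplication-rule pattern generalizes to any trial number. A minimal sketch, where `geometric_pmf` is an illustrative name rather than a calculator or library function:

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(first success occurs on trial k) = q^(k-1) * p, with q = 1 - p."""
    q = 1 - p
    return q ** (k - 1) * p

# Drawing with replacement from a standard deck: 13 hearts out of 52 cards.
p_heart = Fraction(13, 52)

# Probability that the first heart is the 3rd card drawn: 3/4 * 3/4 * 1/4.
print(geometric_pmf(3, p_heart))
```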
The Binomial Model
A Binomial model tells us the probability of having x successes in
n Bernoulli trials.
• Example: If success is drawing a heart (drawing with replacement), what is the probability that we get exactly one heart when we draw and replace 3 cards?
Two parameters define the Binomial model: n, the number of trials; and p, the probability of success. We denote this Binom(n, p).
• Example: If we flip a coin 6 times, what is the probability of getting heads exactly 3 times?
The Binomial Model (cont.)
In n trials, there are
nCk = n! / (k!(n − k)!)
ways to have k successes.
• nCk is read as “n choose k” and is called a combination.
• Example: How many ways are there to roll a die five times and roll a 6 three of those times?
Note: n! = n × (n – 1) × … × 2 × 1, and n! is read as “n factorial.”
The Binomial Model (cont.)
μ = np
σ = √(npq)
Searching for Hope Solo’s Card
• 20% of the cereal boxes have Hope’s card.
• What is the expected number of boxes to open to get a Hope card?
Bernoulli Trial
• Two outcomes: success or failure
• The probability of success, p, is the same for
each trial.
• The trials are independent.
Probability of Getting 2 Hopes in 5 Trials
(by hand)
• Bernoulli trials: Millions of boxes, sample size 5.
• P(X = 2) from Binom(n, p), n = 5, p = 0.2, q = 0.8.
• 2 successes, 3 failures. Not quite 0.2² × 0.8³.
• Must consider all orders of 2 successes and 3 failures.
• Number of ways of picking k items from n: nCk = n! / (k!(n − k)!)
• 5C2 = 5! / (2!(5 − 2)!) = (5 × 4 × 3 × 2) / ((2)(3 × 2)) = 10
• P(X = 2) = 10 × 0.2² × 0.8³ = 0.2048
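The hand calculation can be mirrored step by step in Python. This sketch deliberately builds the count from factorials, as on the slide; the names `n_choose_k` and `binomial_pmf` are illustrative.

```python
from math import factorial

def n_choose_k(n, k):
    """Number of ways to pick k items from n: n! / (k! (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ Binom(n, p): nCk * p^k * q^(n - k)."""
    q = 1 - p
    return n_choose_k(n, k) * p ** k * q ** (n - k)

print(n_choose_k(5, 2))          # orders of 2 successes and 3 failures
print(binomial_pmf(5, 0.2, 2))   # probability of exactly 2 Hope cards in 5 boxes
```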
The Binomial Model Summary
• n = Number of trials
• p = Probability of success
• q = 1 – p = Probability of failure
• X = Number of successes
• P(X = x) = nCx p^x q^(n − x)
• Mean = np
• Standard Deviation = √(npq)
Using the TI-83/84
• Compute P(X = x) using: Distr → binompdf(n, p, x)
• Compute P(X ≤ x) using: Distr → binomcdf(n, p, x)
This is a cumulative probability and is the sum P(X = 0) + P(X = 1) + … + P(X = x).
Using the TI-83/84 -- Practice
Suppose a light bulb company has a 20% defective rate. Consider
taking a sample of 6 bulbs.
1) What is the probability of getting exactly 1 defective bulb in that
group of 6? (Even though a defect isn't pleasant at times, it is
considered a success in this experiment since that is where our focus
is!)
If we computed this probability by hand we would do 6C1 (.20)^1 (.80)^5.
On the calculator:
• 2nd Distr...(#0) for binompdf(6, .2, 1) ... enter to get .393216
binompdf gives you the probability at a particular x. binompdf must be followed by n (total number of trials), p (probability of a success), and x (number of successes you are interested in).
• binomcdf, which we use next, will compute cumulative probabilities.
Using the TI
2) What is the probability of getting at most 2 defective light bulbs?
This means P(0) + P(1) + P(2)
OR use the cumulative binomial button:
• 2nd Distr...#A for binomcdf(6, .2, 2) ... enter to get .90112
3) What is the probability of getting at least two defective light bulbs?
• At least two means two or more, which is the same as adding the probabilities of 2, 3, 4, and so on up to 6.
• OR use the complement: "at least two" is the complement of "at most one", so the probability is 1 minus the cdf up to 1.
• 1 - binomcdf(6, .2, 1) = .34464
4) What is the probability of getting from two to four defective light bulbs?
• You could do the pdf for 2 + pdf for 3 + pdf for 4 or be a little creative and do
• binomcdf(6,.2,4) - binomcdf(6, .2,1) = .34304
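The four light bulb answers can be reproduced away from the calculator. The Python functions below are stand-ins written for this sketch; they borrow the TI names `binompdf` and `binomcdf` and the same argument order for readability, but they are not a TI or library API. `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binompdf(n, p, x):
    """P(X = x) for X ~ Binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomcdf(n, p, x):
    """P(X <= x): the sum P(X = 0) + P(X = 1) + ... + P(X = x)."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

n, p = 6, 0.2  # 6 bulbs, 20% defective rate

exactly_one = binompdf(n, p, 1)
at_most_two = binomcdf(n, p, 2)
at_least_two = 1 - binomcdf(n, p, 1)                # complement of "at most one"
two_to_four = binomcdf(n, p, 4) - binomcdf(n, p, 1)

for label, value in [
    ("exactly 1", exactly_one),
    ("at most 2", at_most_two),
    ("at least 2", at_least_two),
    ("2 to 4", two_to_four),
]:
    print(f"{label}: {value:.6f}")
```

Subtracting the cdf at 1 from the cdf at 4 keeps exactly the terms for x = 2, 3 and 4, which is why the lower bound in the subtraction is one less than the smallest value wanted.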
The Binomial Model
• StatCrunch: Stat → Calculators → Binomial
• Fill in n, p, x.
• Decide on the inequality.
• Compute.
Using StatCrunch
To use StatCrunch to calculate Binomial Probabilities (or to
view the binomial probability histogram) go to
Stat -> Calculators -> Binomial
Enter the appropriate n, p, and Prob statement. Then click "Calculate".
– For example, if you want to compute the probability of observing at least one "6" in 5 rolls of the die:
– n = 5
– p = 0.1667
– Prob(X ≥ 1) = 0.598
Practice
20) An Olympic archer is able to hit the bull’s-eye 80% of the time. Assume each shot is independent of the others. If she shoots 6 arrows, what’s the probability of the following?
• Her first bull’s-eye comes on the 3rd arrow (note: this uses the Geometric, not the Binomial, distribution)
• She misses the bull’s-eye at least once
• Her first bull’s-eye comes on the fourth or fifth arrow (Geometric)
• She gets exactly 4 bull’s-eyes
• She gets at least 4 bull’s-eyes
• She gets at most 4 bull’s-eyes
Notice that when the Binomial Model
applies we are given E(X) and SD(X)…
• n = Number of trials
• p = Probability of success
• q = 1 – p = Probability of failure
• Mean = np
• Standard Deviation = √(npq)
17) If you flip a fair coin 100 times
• Intuitively how many heads do you expect?
• Use the formula for expected value to verify your
intuition
18) An American roulette wheel has 38 slots, of which 18
are red, 18 are black, and 2 are green. If you spin the
wheel 38 times
• Intuitively how many times do you expect the ball to
land in a green slot?
• Use the formula for expected value to verify your
intuition
Consider the same archer…
• How many bull’s-eyes do you expect her to get?
• With what standard deviation?
Suppose our archer shoots 10 arrows
• Find the mean and standard deviation of the number of bull’s-eyes she may get
• What’s the probability that she never misses?
• What’s the probability that there are no more than 8 bull’s-eyes?
• What’s the probability that there are exactly 8 bull’s-eyes?
• What’s the probability that she hits the bull’s-eye more often than she misses?
Practice
6) Suppose 75% of all drivers always wear their seatbelts. Let’s investigate how many of the drivers might be belted among six cars waiting at a traffic light.
• Describe how you’ll simulate the number of seatbelt-wearing drivers among the six cars
• Run 30+ trials
• Based on the simulation, estimate the probabilities that there are no belted drivers, exactly one, two, three, etc.
• Calculate the actual probability model
• Compare the distribution of outcomes in the simulation to the
actual model
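One possible way to carry out this simulation is sketched below in Python. It is only one of many valid designs (random digits or a calculator would work equally well); the function names, the default of 30 trials, and the fixed seed are all illustrative assumptions.

```python
import random
from collections import Counter
from math import comb

def simulate_belted(n_cars=6, p_belted=0.75, trials=30, seed=0):
    """Estimate P(X = k) from repeated trials of n_cars independent drivers."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter(
        sum(rng.random() < p_belted for _ in range(n_cars))
        for _ in range(trials)
    )
    return {k: counts[k] / trials for k in range(n_cars + 1)}

def exact_model(n_cars=6, p_belted=0.75):
    """Binom(n_cars, p_belted) probabilities for k = 0..n_cars."""
    q = 1 - p_belted
    return {
        k: comb(n_cars, k) * p_belted ** k * q ** (n_cars - k)
        for k in range(n_cars + 1)
    }

sim = simulate_belted()
exact = exact_model()
for k in range(7):
    print(k, round(sim[k], 3), round(exact[k], 3))
```

With only 30 trials the simulated proportions can differ noticeably from the model; raising `trials` should bring the two columns closer together.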
Universal Blood Donors
6% of all people are O−. If 20 donate at the blood drive,
find the mean and standard deviation and find the
probability that 2 or 3 are O−.
• Think → Plan: Either success (O−) or failure (not O−), p = 0.06, independent. Therefore binomial.
Variable: X = no. of O−, n = 20 people
Model: X is Binom(20, 0.06)
Universal Blood Donors
• Show → Mechanics:
Mean = np = (20)(0.06) = 1.2
SD = √(npq) = √((20)(0.06)(0.94)) ≈ 1.06
Use StatCrunch to find
P(X = 2 or 3) = P(X = 2) + P(X = 3)
= 0.2246 + 0.0860
= 0.3106
Universal Blood Donors
Tell →
• Conclusion: In groups of 20 randomly selected blood donors, I expect to find an average of 1.2 universal blood donors with a standard deviation of 1.06. There is about a 31% chance that 2 or 3 of the 20 donors are O−.
14.5
Modeling the
Binomial with the
Normal Model
The Trouble with Large Sample Sizes
6% of all people are O−.
The Tennessee Red Cross has 32,000 donors and needs
at least 1850 that are O−. Will they run out?
• The computations involve ridiculously large
numbers.
• “At least” requires P(X = 1850), P(X = 1851), all the way up to P(X = 32,000).
• Mean = np = 1920
• SD = √(npq) ≈ 42.48
The Solution for Large Sample Sizes
The Tennessee Red Cross has 32,000 donors and needs
at least 1850 that are O−. Will they run out (less than)?
• Mean = np = 1920, SD = √(npq) ≈ 42.48
• The normal model with the same mean and standard deviation is a very good approximation.
• P(X < 1850) ≈ P(z < (1850 − 1920)/42.48) ≈ P(z < −1.65) ≈ 0.05
• There is about a 5% chance that they will run out.
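To see how close the Normal approximation is here, the sketch below computes it alongside the Binomial sum itself. The Binomial terms are evaluated on a log scale with `math.lgamma` so the very large factorials never have to be formed; that is a choice made for this sketch, not the method used on the slides, and the function names are illustrative.

```python
from math import erf, exp, lgamma, log, sqrt

def normal_cdf(x, mu, sigma):
    """P(Y < x) for a Normal model with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def binomial_cdf(n, p, x):
    """P(X <= x) for X ~ Binom(n, p), summing terms computed on a log scale."""
    log_p, log_q = log(p), log(1 - p)
    total = 0.0
    for k in range(x + 1):
        log_term = (
            lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        total += exp(log_term)
    return total

n, p = 32_000, 0.06
mu = n * p
sigma = sqrt(n * p * (1 - p))

approx = normal_cdf(1850, mu, sigma)   # Normal model for P(X < 1850)
exact = binomial_cdf(n, p, 1849)       # Binomial P(X <= 1849) = P(X < 1850)

print(round(mu, 2), round(sigma, 2))
print(round(approx, 4), round(exact, 4))
```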
How Large is “Large Enough”
The Success/Failure Condition
• A Binomial is approximately Normal if we expect at least 10 successes and 10 failures: np ≥ 10, nq ≥ 10
• This comes from the binomial being skewed for a small number of successes or failures expected.
Example: Spam and the Normal
Approximation to the Binomial
Only 151 of 1422 emails got through your spam filter.
Might the filter be too aggressive?
• What is the probability that no more than 151 of the
emails are real messages?
• These emails represent less than 10% of all emails.
• np = (1422)(0.09) = 127.98 ≥ 10
• nq = (1422)(0.91) = 1294.02 ≥ 10
• Yes, the Normal model is a good approximation.
Example: Spam and the Normal
Approximation to the Binomial
• What is the probability that no more than 151 of the emails are real messages?
• μ = np = 127.98
• σ = √(npq) ≈ 10.79
• P(X ≤ 151) ≈ P(z ≤ (151 − 127.98)/10.79) ≈ P(z ≤ 2.13) ≈ 0.9834
• There is over a 98% chance that no more than 151 of them were real messages. The filter may be working.
If your candidate is favored by approximately 53% of the
population, but only 100 people vote, what is the
probability that your candidate wins?
• In other words, your candidate needs at least 51 of the 100 votes. Assume each voter is independent.
• Do we have Bernoulli trials?
• What is the probability of “success” for a trial?
• What is the probability that at least 51 of 100 voters vote for your candidate?
Approximating with the Normal Model
When we use the Normal model to approximate the Binomial
model, we are using a continuous random variable to
approximate a discrete random variable.
So, when we use the Normal model, we no longer calculate the
probability that the random variable equals a particular value,
but only that it lies between two values.
Ex. For our election example:
• μ = np = 100*.53 =53
• σ = √(npq) = √(100*.53*.47) =4.99
• P(at least 51 votes) ~ Normalcdf(51,100, 53, 4.99)
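Because n = 100 is small enough to sum the Binomial directly, the election example is a convenient place to compare the approximation with the model it approximates. In this sketch, `normalcdf` is a Python stand-in that copies the calculator's argument order (lower, upper, mean, SD); it is not the TI function itself.

```python
from math import comb, erf, sqrt

def normalcdf(lower, upper, mu, sigma):
    """Area under the Normal curve between lower and upper."""
    def phi(x):
        return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))
    return phi(upper) - phi(lower)

n, p = 100, 0.53
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Normal approximation, as set up on the slide.
approx = normalcdf(51, 100, mu, sigma)

# Binomial model: add P(X = k) for k = 51, 52, ..., 100.
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(51, n + 1))

print(round(mu, 2), round(sigma, 2))
print(round(approx, 4), round(exact, 4))
```

The two results differ by a few percentage points, a reminder that the Normal model treats a count as a continuous quantity and so only approximates the Binomial answer.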
What Can Go Wrong?
Probability models are still just models.
• They just approximate reality. Always question
whether the model is accurate.
If the model is wrong, so is everything else.
• Check the mean and standard deviation for
reasonableness. Be sure that the probabilities sum
to 1.
Don’t assume you have Bernoulli trials without checking.
• Check for only 2 possible outcomes, independence,
consistency of probabilities, and the 10% rule.
What Can Go Wrong?
Don’t assume everything is Normal.
• Check that the distribution is not skewed and that there is a bell shape.
Don’t use the normal approximation for small n.
• np ≥ 10, nq ≥ 10
Watch out for variables that are not independent.
• The arithmetic rules do not work for variables that
are not independent. Always check for
independence.