* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Download Monday, January 9: Chapter 14: Probability
Survey
Document related concepts
Transcript
Chapter 14: Probability.1
What is the probability that we spin a coin and it lands on heads?
If we were to keep spinning the coin forever, the relative frequency would stabilize at a particular value: the true
probability of heads.
The _______________ of an event is its long-run relative frequency.
For any random phenomenon, each attempt, or ________, generates an __________. An outcome is what
happened on a trial. For example, each spin of the coin was a ________ and the possible outcomes were
______________________.
Instead of individual outcomes, sometimes we consider ________, which are combinations of outcomes. For
example, if a trial consisted of rolling a die, the outcomes would be 1, 2, 3, 4, 5, 6. An event might be:
If we want to know about combinations of outcomes, it helps if the individual trials are independent. If trials
are ____________________, then the outcome of one trial doesn’t influence or change the likelihood of another
trial. For example, when we were spinning the coin, the trials are independent, since knowing the outcome of
one trial doesn’t affect the outcomes of subsequent trials.
But, if we are drawing cards without replacement from a deck, the trials are not independent. If we draw a club
on our first trial, then we know the next card is less likely to be a club and more likely to be a heart, spade, or
diamond.
The ______________________________ says that the long-run relative frequency of repeated independent
events get closer and closer to the true relative frequency (probability) as the number of trials increases.
Many people misunderstand this Law and apply it to situations with a small number of trials. For example, if
we flip a coin 5 times and it lands on heads 5 times, some people will believe that the 6th trial is more likely to
be tails, since “tails is due.” This is sometimes called the “Law of Averages.” Don’t believe it.
However, since the trials are independent, the probability that the coin lands on heads never changes, no matter
what has happened before. The Law of Large Numbers only guarantees that the relative frequency of an event
will approach the true probability after a __________ number of trials. The coin doesn’t purposefully
compensate for its run of heads by landing on tails, but the results of many trials will eventually overwhelm any
departures from what’s expected.
For example, we may start with 5 heads in a row (100% heads), but after 100 more flips, we should be close to
55 heads and 50 tails (52.4% heads), and after 900 more flips, we should be close to 505 heads and 500 tails
(50.2% heads).
Note: This is why casinos have boards which post the past 15 or 20 outcomes on a roulette wheel. The casino
knows that gamblers misunderstand the law of large numbers and will be encouraged to bet more since they
think they know what numbers are more likely to come up
HW #1: Read:, Problems: 14.1-6
Chapter 14: Basic Probability Rules.2
1. A probability is any number between 0 and 1. That is, for any event A, 0 P( A) 1 .
Remember that a probability is a relative frequency. We cannot have more heads than the number of
spins (so P(heads) 1) and we cannot have a negative number of heads (so, P(heads) 0).
If an event has probability = 0, then the event will never occur.
If an event has probability = 1, then the event must occur.
2. Something has to happen. That is, when we consider a single trial of a random phenomenon, one of possible
the outcomes must occur. For example, if we are spinning a coin, we must either get heads or tails.
All of the possible outcomes of a random phenomenon are called the __________________ (S), thus
P(S) = 1
If we add the probabilities of all the possible outcomes and it doesn’t equal 1, then the assignment of
probabilities is not legitimate (or there may be rounding error).
3. The probability of an event occurring is 1 minus the probability it doesn’t occur. That is, P(A) = 1 – P Ac .
For example, if the probability of rain today is .1, then the probability that it will not rain today is 1 - .1 = .9.
The event “not A” is called the complement of A and is denoted Ac .
Other books may use different notation, including: A’, ~A, etc.
4. For two disjoint events A and B, the probability that one or the other event occurs is the sum of the
probabilities of the two events. That is, P(A or B) = P A B = P(A) + P(B) if A and B are disjoint.
Two events are ____________ if they do not share any outcomes. For example, suppose you randomly
selected a registered voter and asked him which political party he belonged to. Since you can only
register as a member of one party, being a democrat and being a republican are disjoint events.
Disjoint events are also called ___________________________ events.
So, if the probability of being a Republican is .4 and the probability of being a Democrat is .45, the
probability of being a Republican or a Democrat =
This rule also works for more than 2 disjoint events. For example, if the probability of being a
Libertarian is 5%, then P(R or D or L) =
However, you cannot use this addition rule when the events are not disjoint. For example, for students
in statistics, P(senior) = .75 and P(own a TI calculator) = .9, but P(senior or own a TI) .75 + .9 = 1.65.
The events are not disjoint and we cannot have probabilities over 1!
5. For two independent events A and B, the probability that both A and B occur is the product of the
probabilities of the two events. That is, P(A and B) = P A B = P(A) x P(B) if A and B are independent.
Recall from yesterday that two events are _____________________ if knowing the outcome of one
event does not change the probability of the other event.
For example, if we flip a coin and roll a die, the events “heads” and “6” are independent, since knowing
the outcome of the coin flip does not change the probability that the die lands on 6.
Thus, P(heads and 6) =
This rule also works for more than 2 independent events. For example, the probability of rolling 3 sixes
in a row is: P(6 and 6 and 6) =
However, you cannot use this multiplication rule when the events are not independent. For example, if a
class of 30 students has 16 females, the probability of selecting (without replacement) 2 females is not
(16/30)(16/30), since these events are not independent. Knowing that the first person selected was
female changes the probability that the second person selected is female (from 16/30 to 15/29).
If the probability of rain in Hacienda Heights is 50% and the probability of rain in Rowland Heights is 50%,
what is the probability that it will rain in both places?
If the probability of rain in New York City is also 50%, what is the probability that is will rain in both HH and
NYC?
HW #2: Read 331-337:, Problems: 14.7-15 odd
Chapter 14: More Probability.3
Suppose that 50% of cars sold in this country are made in the USA, 30% in Japan, and 10% in Germany.
1. Is this a valid assignment of probabilities? How do you know?
2. Assume that the remaining 10% of cars are made in other countries.
a. What is the probability that a car selected at random is not made in the USA?
b. What is the probability that a car selected at random is made in Germany or Japan?
c. This rule used in (b) only works for disjoint events. Explain why these events are disjoint.
d. What is the probability that two cars selected at random are both from Japan?
Note: In cases where the population is very large compared to the sample (population size >
10 x sample size), it is OK to treat trials as independent even when sampling without
replacement since the change in probabilities will be very small.
e. What is the probability that none of the 3 cars came from Germany?
f. What is the probability that at least one of the three cars is made in the USA?
g. What is the probability that the first time you get a Japanese car is on your fourth selection?
h. What is the probability that two cars selected at random are both from the USA or both from
Japan?
HW #3: Read ________, Problems 14.8-16 even
Review chapter 14.4
HW #4: Problems 14.17-27 odd
Chapter 15: Probability Rules!.5
Review: For any random phenomenon, each _______ generated an __________. An _________ is any set or
collection of outcomes. The collection of all possible outcomes is called the ____________________ (S).
For example, if a family has 2 children, the sample space is {BB, BG, GB, GG}. In this case all four outcomes
are equally likely.
If you are dealt 5 cards from a deck and record the number of aces you get, what is the sample space?
Are these outcomes equally likely?
When all of the outcomes in the sample space are equally likely, the probability of an event A is:
count of outcomes in A
P(A) =
count of all possible outcomes
For example, the probability that a family with two children has two boys is 1/4. But, the probability of getting
4 aces in 5 cards is not 1/5 since the outcomes are not equally likely.
What is the probability of drawing a single card from a deck and getting a red card or an ace?
To answer this problem, we need to use the __________________________ which works when events are
disjoint and when events are not disjoint.
P(A or B) = P A B = P(A) + P(B) – P(A and B)
Note: When we use the word “or” we mean it in the inclusive sense: A or B means A occurs or B occurs or
both A and B occur.
Suppose that 75% of students in statistics are seniors, 90% have a TI-83, and 70% are seniors with TI-83’s.
What is the probability that a randomly selected statistics student is a senior or has a TI-83?
HW #5: Read _________, Problems 15.1-15.4
Chapter 15: Conditional Probability.6
Sometimes the knowledge that one event has occurred changes the probability that another event will occur.
The probability of being in a car accident increases if you know that it is raining outside
Suppose that the pass rate on the AP Statistics exam is 80%. That is, for a randomly selected student,
P(pass) = .80. However, if you know that the student got a B in the class, then the probability increases
95%. That is, for a randomly selected student, P(pass given that you have a B) = .95.
Notation: P(pass | B) = .95
Probabilities in the form P(A | B) are called __________________ probabilities and pronounced “the
probability of A occurring given that B has already occurred”
The following data is about the 2201 passengers on the Titanic. 367/1731 males survived and overall 711
survived. Express this in two-way table with gender and survival as the variables.
Note: A ___________________ is a way to summarize the relationship between two categorical variables. It
lists the outcomes of one variable down the left side, the outcomes of the other variable across the top, and the
frequencies (or relative frequencies) in each cell.
Suppose you randomly selected a name from the Titanic’s passenger list.
a.
P(survived) = ?
b.
P(male) = ?
c.
P(female survived) = ?
d.
P(survived | female) = ?
e.
P(male | survived) = ?
f.
P(died | female) = ?
g.
P(female | died) = ?
h
P(female | survived) = ?
Def: Let E and F be two events. The conditional probability of event E given that event F has occurred is:
PE F
PE | F
PF
The following table shows the relationship between the grade and driving status of WHS students.
9 10 11 12
Has License .01 .05 .11 .16
No License .26 .20 .14 .07
If you randomly selected a LSHS student, what is the probability that:
a. he or she is a freshman?
b. he or she has a drivers license?
c. he or she is a senior and has a DL?
d. a senior has a DL?
e. a legal driver is a senior?
f. a person has a DL or is a senior?
HW #6: Read _________ Problems 15.6-15.9
Chapter 15: The General Multiplication Rule and Independence.7
In the last chapter, we learned a rule for finding P(A and B) when A and B are independent:
P(A and B) = P(A) P(B)
However, now that we understand conditional probability, we can use the general multiplication rule:
P(A and B) = P(A) P(B|A)
This rule comes directly from the conditional probability formula we learned yesterday
What is the probability of drawing 2 aces in a row from a deck (without replacement)?
Suppose that 23% of LSHS students are seniors and that 65% of seniors are taking a math class. What is the
probability that a randomly selected LSHS student is a senior and is taking a math class?
Also, we can now formally define independence. Events A and B are ________________ whenever P(B | A) =
P(B | Ac) = P(B). That is, knowing the outcome of event A does not change the probability of event B.
Whenever you are asked about independence, this is the definition you should use to justify your answer.
Also, we can see that the original multiplication formula is just a special case of the general formula, since
P(A) P(B|A) simplifies to P(A) P(B) when A and B are independent.
Using the examples from yesterday:
Were being male and surviving independent on the Titanic?
Were being male and surviving disjoint on the Titanic?
Were grade and driving status independent at LSHS? Disjoint?
Suppose that 20% of students at a local school participate in athletics, 45% have GPA’s over 3.0 and 10% have
both characteristics.
What percentage of students have neither characteristic?
What is the probability that an athlete has a GPA over 3.0?
Are being an athlete and having a GPA over 3.0 disjoint?
Are being an athlete and having a GPA over 3.0 independent?
HW #7: Read ________ Problems 15.11-14, 17
Chapter 15: More about Conditional Probability.8
Suppose that there are 12 girls on a softball team. Six of them are freshmen, 4 are sophomores, and 2 are
juniors.
a. If two players are selected at random, what is the probability that they are both juniors?
b. Neither is a junior?
c. At least 1 is a junior?
d. They are of same grades?
Reminder: When we are sampling without replacement, successive selections are NOT independent. When the
population size is known, you should adjust the probabilities accordingly.
When airline passengers go through the security checkpoint, some are subject to additional searches. On a
recent afternoon, 30 of 145 passengers with a laptop were searched and 42 of 203 passengers without a laptop
were searched. Suppose you randomly selected a passenger that afternoon.
a. What is the probability he was searched?
b. What is the probability that he didn’t have a laptop and was searched?
c. What is the probability that he had a laptop or was searched?
d. What is the probability that a person without a laptop was searched?
e. What is the probability that a person who was searched didn’t have a laptop?
f. Are the events A = has a laptop and B = was searched independent? Disjoint?
Suppose that in a certain company, 5% of the employees use drugs. The company decides to test all of its
employees for drugs with a test that is 95% successful (it correctly identifies 95% of users as users and 95% of
non-users as non-users). Let D = event that the employee uses drugs and P = event that the person tests
positive.
a. Express the information given in terms of P and D.
b. Express the data in a tree diagram.
c. What is the probability that a person uses drugs and tests positive?
d. What is the probability that a person tests positive?
e. What is the probability that a person uses drugs or tests positive?
f. If a person tests positive, what is the probability that they use drugs? Does this result surprise you?
Note: the false positive rate of a test is P(positive | no drugs) and the false negative rate is P(negative | drugs)
Which of these errors is worse? For the employees? For the company?
How can we change these probabilities?
HW #8: Read _________, Problems
Chapter 15: Review.9
HW #9: Read ___________, Problems
6.7 Estimating Probabilities using Simulation.10
In many cases, it is very difficult (or impossible) to calculate a probability using the rules we have learned in
this chapter so far. In these cases, we must estimate the probabilities empirically (through repeated observation)
or by performing a simulation. Also, remember that the Law of Large Numbers says that the precision of our
estimates increases as the number of observations gets larger.
Suppose a cereal company places one of four toys in every box of cereal. Furthermore, the company claims that
each toy is produced in the same quantity so each of the toys is equally likely to show up in a randomly chosen
box. Suppose I want to get all 4 toys, but it takes me 15 boxes to find each of the four. Is this evidence that the
toys are NOT uniformly distributed? Estimate the probability that it takes 15 or more boxes to get all 4 toys.
To solve this problem empirically, what would you do?
-Count how many cereal boxes it takes to get all 4 toys.
-Repeat this process many times.
-Count the number of times it takes 15 or more boxes and divide by the total number of times you
repeated the experiment.
-For example, if you did this 100 times and 4 times it took 15 or more boxes, then the probability it takes
15 or more boxes to get 4 toys is approximately .04.
What is the problem with this method?
-It will take a lot of time and A LOT OF MONEY!
We can also solve this problem with a simulation.
-Instead of buying cereal boxes, we could use some other random device to mimic this process, such as
spinning a spinner divided into 4 equal parts. We could label each section on the spinner as
“toy 1”, “toy 2”, etc. and continue spinning until we get all 4 toys.
-What are the advantages to this method?
-faster and cheaper!
Distribute spinners and paper clips and conduct the simulation. Pool class results in a frequency table and
estimate the probability that it takes 15 or more boxes (should be between .02 and .10).
HW #10
Read _________
11 More Simulations.11
Simulations in General:
1. Designate a random mechanism (spinner, random digits, coin, dice) to imitate the outcomes in the real life
situation. Make sure the probabilities match!
2. Define what a run will consist of and what it represents. A run (also called a trial, repetition, observation, or
simulation) is a set of steps that mimic the real life situation. Give a brief example or two to convince me you
know what you are doing.
3. Conduct as many runs as possible, recording the results in a frequency table.
4. Answer the question using the results in the frequency table.
For example, in the cereal problem:
1. Use a spinner divided into 4 equal regions and let each region represent getting a particular toy.
2. Each run will consist of spinning the spinner n times until the spinner lands at least once in each region.
This represents buying cereal boxes until we have at least one of each toy. For example, consider the following
sequence of spins: 1, 1, 3, 4, 3, 1, 2. In this example, it took 7 boxes to get all 4 toys.
3. See yesterday’s notes.
4. See yesterday’s notes.
Which other random mechanisms could we use in step 1?
-dice: ignore 5’s and 6’s.
-random digit table: ignore 0, 5-9
-RandInt(1,4)
-Cards (with replacement)
According to GE, 70% of AAA light bulbs last at least 100 hours. However, in my last 4-pack, only one light
bulb lasted that long. Can this be reasonably attributed to chance, or is their claim exaggerated? Assuming
their claim is correct, estimate the probability of buying 4 light bulbs and having 1 or fewer last 100 hours.
How would you estimate this probability empirically?
1. Let ________ represent a good light bulb (70%) and __________ represent a bad light bulb (30%)
2. Each run will consist of generating four numbers from 0-9 and observing how many are between 0-6.
This will represent purchasing 4 light bulbs and determining how many will last 100 hours.
3.
4.
HW #11 Simulation Worksheet (1-3)
NOTE: Grade this HW myself!
11 More Simulations.12
Since only 90% of ticketed passengers actually show up for their flight (on average), it is the practice of airlines
to overbook their flights (sell more seats than they have). On a certain airline, each plane holds 100 passengers
and 108 tickets are sold for each flight. Unfortunately for travelers, if more than 100 people show up for a
particular flight, some of them are turned away. For example, if 103 people show up, then 3 people will be
pretty mad!
a. Design and conduct a simulation to estimate the probability that at least one passenger will be turned
away.
b. If plane tickets cost $400 and the airline loses $1000 for each overbooked passenger, how much extra
money does the airline make per flight with this practice?
1. Let 0 = ______________, 1-9 _______________
2. For each run, I will generate 108 integers from 0-9 to represent the 108 people with tickets. I will count the
number of 0’s in each run to count the number who don’t show up. For example, if there are 9 0’s in the 108
random integers, then on that particular flight there were 9 no-show’s. This means all of the remaining 99
passengers get a seat and no one is turned away.
3.
# of n o sh ows
# bump ed
0
8
1
7
2
6
3
5
4
4
5
3
6
2
7
1
•
8
0
fre quen cy
4a. P(at least one is turned away) =
4b. Profit =
HW #12 Simulation Worksheet: 4-5, Book:
6.7 More Simulations.13
The “Big Dan Teague” Problem (see worksheet)
1. Using the random digit table and taking two digits at a time (00-99), let numbers
00-89 represent cars going 0-10 MPH over the speed limit
90-96 represent cars going 11-15 MPH over the speed limit ($100 fine)
97-98 represent cars going 16-20 MPH over the speed limit ($200 fine)
99 represent cars going 20 or more MPH over the speed limit ($300 fine)
2. Each run will consist of selecting 2 digits at a time until 300 minutes have passed. For each pair of digits
(which represent one car), I will check if the car gets a fine. If it doesn’t, I will continue selecting pairs of digits
until I get a car with a fine. I will then record the number of cars that passed by, the amount of time since the
last fine, and the fine. I will also keep track of the total time and the total fines. For example, using line 101 of
the random digit table:
19, 22, 39, 50, 34, 05, 75, 67, 87, 13, 96.
40, 91.
3.
Fine = $100 ($100 total), Time = 40 minutes (40 total).
Fine = $100 ($200 total), Time = 31 minutes (71 total).
Distribution of Total Fines
0.35
0.30
0.20
0.15
Probabi li ty Axi s
0.25
0.10
0.05
200
400
600
800
1000
1200
1400
1600
1800
2000
Total Fines
4. P($600 or lower) = ________________. Thus, it appears that the supervisors claim is correct. It is very
unlikely that Big Dan got a total as low as $600 by random chance.
What if Big Dan came back with $2000 worth of tickets? Could this have occurred by random chance or did
Dan "pad" the readings on the radar gun?
HW #13 Review Worksheet (1-5)
Review Chapter 11.14
-Simulation: Pat leaves for work at a random time between 7:00 am and 8:00 am. Pat’s paper arrives at a
random time from 7:30 am to 8:30 am. Design and conduct a simulation (do 20 runs) to estimate the probability
that the paper arrives before Pat leaves for work.
HW #14:
Quiz Chapter 14, 15, 11
Optional Review Assignments for Final:
Chapters 1-6: page 105: 1, 5, 7, 11, 14, 15, 25, 29, 31
Chapters 7-9: page 204: 1, 3, 5, 7, 11, 17, 27, 29
Chapters 11-13: page 267: 1, 3, 5, 15, 23, 24, 25, 29, 31, 37, 41, 42