Probability definitions
1. Probability of an event = chance that the event will occur.
2. Experiment = any action or process that generates observations.
In some contexts, we speak of a “data-generating process”
Examples: toss a coin (one or more times), roll 2 dice, select 5 cards from a deck,
interview 100 people for market research, observe the reaction of 50 patients to a new drug.
3. Sample space = set of all possible outcomes of an experiment.
Example: if two dice (a red die and a green die) are tossed,
any outcome is described by the number on the red die and the number on the green die.
Note that the outcome is a “low level” description of what happened.
Let the number on the red die be written first, so that (3, 2) indicates a
3 on the red die and a 2 on the green die.
In the experiment of tossing a red die and a green die, we can list the sample space in a table:
1, 1    1, 2    1, 3    1, 4    1, 5    1, 6
2, 1    2, 2    2, 3    2, 4    2, 5    2, 6
3, 1    3, 2    3, 3    3, 4    3, 5    3, 6
4, 1    4, 2    4, 3    4, 4    4, 5    4, 6
5, 1    5, 2    5, 3    5, 4    5, 5    5, 6
6, 1    6, 2    6, 3    6, 4    6, 5    6, 6
4. Event = any collection or subset of outcomes in the sample space. Events are simple if they contain
exactly one outcome, and are compound if they contain more than one.
Example: We can define the event A = both dice show the same number, and calculate
P(A) = 6 / 36. (If you xerox or copy the sheet, shade in the appropriate areas)
Define event B = sum of two numbers at least 10. P (B) = 6 / 36
5. Random variables associate a number with each outcome of an experiment. If we define the random variable X as the sum
of the two numbers on the dice, we can say P (X = 6) = 5/36
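The definitions so far can be checked by listing the sample space and counting. The following is a minimal sketch in Python, assuming both dice are fair so that all 36 outcomes are equally likely; the names prob, is_A, is_B and X are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (red, green) pair for two six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on one outcome,
    assuming every outcome in the sample space is equally likely."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

def is_A(outcome):
    """Event A: both dice show the same number."""
    red, green = outcome
    return red == green

def is_B(outcome):
    """Event B: the sum of the two numbers is at least 10."""
    return sum(outcome) >= 10

def X(outcome):
    """Random variable X: the sum of the two numbers."""
    return sum(outcome)

p_A = prob(is_A)                      # 6 of 36 outcomes
p_B = prob(is_B)                      # 6 of 36 outcomes
p_X6 = prob(lambda o: X(o) == 6)      # 5 of 36 outcomes

print(len(sample_space), p_A, p_B, p_X6)   # 36 1/6 1/6 5/36
```

Fraction reduces 6/36 to 1/6 automatically, so the printed values match the hand counts above.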
6. Union of two events, denoted by ∪ as in A ∪ B, constructs a new event – both dice show the same
number, or the sum of the two numbers is at least 10. This should be read as “A or B”.
Find the probability of A or B in the example above. Does P(A or B) = P(A) + P(B) ? Why not?
7. Intersection of two events, denoted by ∩ as in A ∩ B, constructs a new event - both dice show the
same number, and the sum of the two numbers is at least 10. This should be read as “A and B”.
Find the probability of A and B in the example above. Does P (A and B) = P(A) times P(B) ? Why not?
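One way to check your answers to the two questions above is to treat A and B as sets of outcomes and count. This is a sketch under the same equally-likely assumption as before; the helper p is a name chosen here.

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))        # 36 (red, green) pairs
A = {o for o in outcomes if o[0] == o[1]}             # same number on both dice
B = {o for o in outcomes if o[0] + o[1] >= 10}        # sum at least 10

def p(event):
    """Probability of a set of equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

p_union = p(A | B)     # "A or B"
p_inter = p(A & B)     # "A and B"

# Outcomes in both A and B are counted twice by P(A) + P(B),
# so they must be subtracted once.
assert p_union == p(A) + p(B) - p_inter

print(p_union, p(A) + p(B))    # 5/18 1/3
print(p_inter, p(A) * p(B))    # 1/18 1/36
```

The outcomes (5, 5) and (6, 6) belong to both events, which is why simple addition overcounts and simple multiplication gives the wrong intersection.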
8. Complement of an event A, denoted by a prime or superscript c, as in A’ or A^c, indicates those
outcomes not in the event A. What is the probability of A’ in the example above?
Note that P(A’) = 1.0 - P(A). This relationship often makes calculations much simpler, especially when
the problem includes phrases such as “at least” or “at most”.
Example (Chevalier de Méré): what is the probability that at least one six turns up in 4 tosses of a die?
[Hint: it is a little more than half, .5177. De Méré found that out by extensive experiment.]
What is the probability that at least one double six turns up in 24 tosses of two dice?
[The chance of double six is 1/36, but we compensate by having 6 times as many tosses, so de Méré
thought the probability should be the same. Is it? Hint: de Méré lost a lot of money by believing this.]
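Both questions are "at least one" problems, so the complement rule applies: find the chance of no success at all and subtract from 1. A minimal sketch, assuming the tosses are independent; the function name p_at_least_one is chosen here.

```python
def p_at_least_one(p_single, n_trials):
    """P(at least one success in n independent trials)
    = 1 - P(no success in any trial)."""
    return 1 - (1 - p_single) ** n_trials

p_one_six = p_at_least_one(1 / 6, 4)        # one die, 4 tosses
p_double_six = p_at_least_one(1 / 36, 24)   # two dice, 24 tosses

print(round(p_one_six, 4))      # 0.5177
print(round(p_double_six, 4))   # 0.4914
```

The second probability falls just below one half, so a bet at even odds on the double six loses in the long run.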
Statistics 1040
Dr. McGahagan
Probability problems
Simple occurrences:
Event                                           Probability
Get a tail in a single toss of a fair coin
Roll a 3 on a normal, 6 sided die
Draw a heart from a deck of cards
Child born on a Saturday or Sunday
More complicated events, for which it will be helpful to use random variables to keep track of outcomes
(for example, one might want the random variable X = number of heads in 3 tosses of a fair coin)
Event                                                                          Probability
Toss 3 heads in 3 tosses of a coin
Roll boxcars (6 on each of 2 dice)
Be dealt a flush (5 cards of same suit) in a standard 5 card deal
Toss exactly 2 heads and two tails IN THAT ORDER in 4 tosses of a coin
Toss exactly 2 heads and two tails IN ANY ORDER in 4 tosses of a coin
Roll either boxcars or snake-eyes (2 sixes or 2 ones in a roll of 2 dice)
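For the coin problems, a random variable can be used to keep track of outcomes by listing every toss sequence and counting heads. The sketch below assumes a fair coin and reads "IN THAT ORDER" as the sequence heads, heads, tails, tails; both the reading and the names are assumptions made here.

```python
from fractions import Fraction
from itertools import product

# All 2**4 = 16 equally likely sequences of 4 tosses, e.g. ('H', 'T', 'T', 'H').
tosses = list(product("HT", repeat=4))

def heads(sequence):
    """Random variable: number of heads in one sequence of tosses."""
    return sequence.count("H")

def prob(event):
    """Probability of an event given as a true/false test on one sequence."""
    return Fraction(sum(1 for t in tosses if event(t)), len(tosses))

p_in_that_order = prob(lambda t: t == ("H", "H", "T", "T"))
p_in_any_order = prob(lambda t: heads(t) == 2)

print(p_in_that_order, p_in_any_order)   # 1/16 3/8
```

Six different sequences contain exactly two heads, so the "any order" event is six times as likely as one particular order.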
Addition and multiplication rules – the full story
In some of the above problems, we used the addition rule in the form
P(A or B) = P(A) + P(B)
and the multiplication rule in the form
P(A and B) = P(A) P(B)
We must extend the rules to take account of
1. Events that are not mutually exclusive – that is, events which can both happen at the same
time. Suppose you have been dealt a poker hand of Jack, Queen, King and Ace of hearts along with the 2
of clubs. You are interested in the chance of coming up with a winning hand if you discard the 2 and
draw another card. What are the chances of getting either a 10 or a heart?
We can assume that either a flush or a straight will win – but there is the chance here of getting a royal
flush with the 10 of hearts.
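The poker question can be worked by counting the cards you have not seen. This sketch assumes a standard 52-card deck and that the discarded 2 of clubs is not returned to it, leaving 47 possible cards to draw; the card representation is chosen here for illustration.

```python
from fractions import Fraction

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

deck = {(rank, suit) for rank in RANKS for suit in SUITS}
hand = {("J", "hearts"), ("Q", "hearts"), ("K", "hearts"),
        ("A", "hearts"), ("2", "clubs")}
unseen = deck - hand                     # 47 cards the next draw can come from

tens = {card for card in unseen if card[0] == "10"}
hearts = {card for card in unseen if card[1] == "hearts"}

def p(event):
    """Probability of drawing a card from the given set of unseen cards."""
    return Fraction(len(event), len(unseen))

p_ten_or_heart = p(tens | hearts)

# The 10 of hearts is both a ten and a heart, so it must not be counted twice.
assert p_ten_or_heart == p(tens) + p(hearts) - p(tens & hearts)

print(len(tens), len(hearts), len(tens & hearts), p_ten_or_heart)   # 4 9 1 12/47
```

Adding 4/47 and 9/47 would count the 10 of hearts twice; subtracting the overlap gives 12 winning cards out of 47.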
2. Events that are not independent – that is, events for which the occurrence of one changes the
probability of the other.
What is the chance of drawing two hearts in a row? Hint: NOT 13/52 times 13/52. Why?
Suppose that, for the US as a whole, the chance that a family’s first car is domestic is
.75, denoted as P(D1) = .75, and the chance that the second car is domestic is .4, denoted as P(D2) = .4.
What is the chance that both cars are domestic?
If the two events are independent, we could apply the multiplication rule
P(D1 and D2) = P(D1) x P(D2) = 0.75 x 0.4 = 0.30
But it may be that purchasers show buyer loyalty, that is, those who purchased a domestic car for their
first car are more likely than the average to buy a domestic car for their second car.
Assume that P(D2 given D1) = 0.6 – also written P(D2 | D1) = 0.6
and calculate P(D1 and D2). Hint: look back at the two hearts in a row problem.
P (D1 and D2) = P(D1) x P (D2 | D1) = .75 x 0.6 = 0.45
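The same rule, P(first and second) = P(first) x P(second | first), covers both the two-hearts problem and the car problem. A minimal sketch, assuming the two hearts are drawn without replacement from a full 52-card deck; the function name p_both is chosen here.

```python
from fractions import Fraction

def p_both(p_first, p_second_given_first):
    """General multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_first * p_second_given_first

# Two hearts in a row: after one heart is drawn, 12 hearts remain among 51 cards.
p_two_hearts = p_both(Fraction(13, 52), Fraction(12, 51))

# Cars, if the purchases were independent: P(D2 | D1) is just P(D2) = 0.4.
p_cars_independent = p_both(Fraction(75, 100), Fraction(40, 100))

# Cars, with buyer loyalty: P(D2 | D1) = 0.6.
p_cars_loyal = p_both(Fraction(75, 100), Fraction(60, 100))

print(p_two_hearts, p_cars_independent, p_cars_loyal)   # 1/17 3/10 9/20
```

When the events are independent, the conditional probability equals the unconditional one, and the rule reduces to the simple product used earlier.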
A visual presentation of a similar problem:
D1 = a family's first car is domestic; F1 = first car is foreign
D2 = a family's second car is domestic; F2 = second car is foreign
Given that P(D1) = 0.75 and P(D2) = 0.4,
and assuming there are 100 total cars and that P(D1 and D2) = 0.35 fill in the following table:
A few numbers have been filled in to get you started. Be sure you understand how they were
arrived at, and how they reflect the statements above.
                     Car 2 is domestic   Car 2 is foreign   ROW TOTALS
Car 1 is domestic           35                                  75
Car 1 is foreign
COLUMN totals               40                                 100
After doing so, calculate all the joint probabilities:
P(D1 and D2) = 0.35
P(D1 and F2) =
P(F1 and D2) =
P(F1 and F2) =
And all the conditional probabilities
P(D2 | D1) =
P(D2 | F1) =
P(F2 | D1) =
P(F2 | F1) =
What is the probability that (for two car families described in the table above):
a. Both a family's cars are domestic?
b. Both a family's cars are foreign?
c. A family has one domestic and one foreign car?
d. We know that the Smith family has at least one domestic car.
What is the chance that they also have a foreign car?
e. We know that the Jones family has at least one foreign car.
What is the chance that they also have a domestic car?
Answers:
                     Car 2 is domestic   Car 2 is foreign   ROW TOTALS
Car 1 is domestic           35                  40              75
Car 1 is foreign             5                  20              25
COLUMN totals               40                  60             100
The JOINT PROBABILITIES are easy:
The table gives the NUMBERS of families in each category: there are 35 families with both car 1 and car 2 being domestic,
so the probability that a two-car family chosen at random has two domestic cars
is 35 / 100 = 0.35 or 35 percent.
P (D1 and D2 ) = 0.35 (answer for [a] on last page)
P (D1 and F2) = 0.40
P (F1 and D2) = 0.05
P (F1 and F2) = 0.20 (answer for [b] on last page)
For [c] on the previous page, note that families will have one domestic and one foreign car if their two
cars are in the square D1 and F2 OR in the square F1 and D2. The OR is telling us to ADD the joint
probabilities P(D1 and F2) + P (F1 and D2) = 0.40 + 0.05 = 0.45
The ROW and COLUMN totals give the MARGINAL PROBABILITIES,
since they are written in the margin of the detailed table (don't think of marginal cost !).
P (D1) = 0.75
P (F1) = 0.25
P (D2) = 0.40
P (F2) = 0.60
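Every cell of the table follows from the three given numbers by subtraction. The sketch below shows one way to fill it in, using the 100 families assumed in the problem; the dictionary layout and variable names are chosen here.

```python
total = 100        # families, as assumed in the problem
n_d1 = 75          # first car domestic:  P(D1) = 0.75
n_d2 = 40          # second car domestic: P(D2) = 0.40
n_d1_d2 = 35       # both domestic:       P(D1 and D2) = 0.35

table = {
    ("D1", "D2"): n_d1_d2,
    ("D1", "F2"): n_d1 - n_d1_d2,    # rest of the "car 1 is domestic" row
    ("F1", "D2"): n_d2 - n_d1_d2,    # rest of the "car 2 is domestic" column
}
table[("F1", "F2")] = total - sum(table.values())   # whatever is left over

# Joint probabilities: one cell divided by the grand total.
joint = {cell: count / total for cell, count in table.items()}

# Marginal probabilities: a whole row or column divided by the grand total.
marginal = {
    label: sum(n for cell, n in table.items() if label in cell) / total
    for label in ("D1", "F1", "D2", "F2")
}

print(table)
print(marginal)
```

Each marginal probability is the sum of the joint probabilities in its row or column, which is why the row and column totals can be read as probabilities directly.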
CONDITIONAL PROBABILITIES can be read off the table:
If you are GIVEN D1 (the first car is domestic), you know you are in the first ROW of the table: the information given means the family is one of the 75 for whom the first car is domestic. You
can mentally reduce the entire table to:
                     Car 2 is domestic   Car 2 is foreign   ROW TOTALS
Car 1 is domestic           35                  40              75
Hence the probability that their second car is foreign is:
P (F2 | D1) = 40 / 75 = 0.5333
If we are GIVEN that the second car is domestic, the probability that the first car is foreign is
P (F1 | D2) = 5 / 40 = 0.1250 (mentally reduce the entire table to the first column)
If we are given (as in part [d]) that the Smith family has at least one domestic car, we delete the F1 - F2
square from the table, leaving 80 families, so chance of also having a foreign car is 45 / 80 = 0.5625
For the Jones family in part [e], with at least one foreign car, delete the D1-D2 square, leaving 65 families; of these,
40 + 5 = 45 also have a domestic car, so their chance of also having a domestic car is 45 / 65 = 0.6923.
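"Mentally reducing the table" amounts to keeping only the cells consistent with the given information and then counting within what is left. A minimal sketch using the completed table of counts; the function conditional and the helper names are chosen here.

```python
from fractions import Fraction

# Completed table: (car 1, car 2) -> number of families out of 100.
counts = {("D1", "D2"): 35, ("D1", "F2"): 40,
          ("F1", "D2"): 5, ("F1", "F2"): 20}

def conditional(event, given):
    """P(event | given): keep only the cells where `given` is true,
    then take the share of those families for which `event` is also true."""
    kept = {cell: n for cell, n in counts.items() if given(cell)}
    hits = sum(n for cell, n in kept.items() if event(cell))
    return Fraction(hits, sum(kept.values()))

def has_domestic(cell):
    return "D1" in cell or "D2" in cell

def has_foreign(cell):
    return "F1" in cell or "F2" in cell

p_f2_given_d1 = conditional(lambda c: c[1] == "F2", lambda c: c[0] == "D1")
p_f1_given_d2 = conditional(lambda c: c[0] == "F1", lambda c: c[1] == "D2")
p_part_d = conditional(has_foreign, has_domestic)   # Smith family
p_part_e = conditional(has_domestic, has_foreign)   # Jones family

print(p_f2_given_d1, p_f1_given_d2)   # 8/15 1/8
print(p_part_d, p_part_e)             # 9/16 9/13
```

In parts [d] and [e] the numerator is the same 45 one-of-each families; only the denominator changes with the information given.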
Bayes's Theorem
Suppose that 80 percent of the taxicabs in town are owned by the Yellow Cab Company and 20 percent
are owned by the Blue Cab Company, and that they are painted accordingly. A taxicab driver was arrested
in a bank robbery. A witness claims that a cab was used as the getaway car, and thinks that the cab was
blue, although he admits that the light was poor. As a result of repeated tests in similar lighting
conditions, the defense finds that the witness is 75 percent accurate -- that he correctly identifies a blue
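cab as blue 75 percent of the time, but incorrectly identifies a blue cab as yellow 25 percent of the time.
Assume he is equally accurate with yellow cabs: he calls a yellow cab yellow 75 percent of the time and blue 25 percent of the time.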
We should:
(a) reject the witness testimony as not perfectly accurate, and treat the probability of the cab
being blue as 20 percent.
(b) accept the witness as having a 75 percent chance of being right and the cab being blue
(c) treat the probability of the cab being blue as somewhere between 20 and 75, but closer to
20 (that is, more than 20 but less than 47.5)
(d) treat the probability of the cab being blue as somewhere between 20 and 75, but closer to
75 (that is, more than 47.5 but less than 75)
Answer:
The answer will be between the two extremes -- although not perfectly accurate, the witness is
right more often than not, and his claim that the cab is blue raises the chance to more than 20 percent.
But it is NOT 75 percent: the test establishes the chance the witness SAYS the cab is blue, GIVEN THAT
it is in fact blue at 75 percent, but we are interested in the "inverse probability" that the cab is REALLY
blue, GIVEN that the witness SAYS it is blue.
P (Says B | B) = 0.75
P (Says B | B) = P (B and SAYS B) / P (B) from the definition of conditional probability
P (B | Says B) = P (B and SAYS B) / P (Says B) has the same numerator, but a different
denominator.
It is a simple application of the multiplication rule for dependent events to compute the numerator:
P (B and Says B) = P (B) * P (Says B | B) = 0.20 * (.75) = 0.15
Make a table and fill in the other possibilities:
P (B and Says Y) = P (B) * P (Says Y | B) = 0.20 * (.25) = 0.05
P (Y and Says B) = P (Y) * P (Says B | Y) = 0.80 * (.25) = 0.20
P (Y and Says Y) = P (Y) * P (Says Y | Y) = 0.80 * (.75) = 0.60
              Says B   Says Y   Row totals
IS B           0.15     0.05      0.20
IS Y           0.20     0.60      0.80
Col. totals    0.35     0.65      1.00
[Grand total = 1.00 or 100 percent]
We know the witness said B, so we know that the first column is the only one that counts. Note that
despite his 75 percent accuracy, the fact that there are more yellow cabs means that when our witness
says "blue" he is mistaken more often (0.20) than he is correct (0.15).
P (B | Says B) = 15 / 35 = 3/7 = 0.4286 = 42.86 percent
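The whole calculation can be wrapped in one small function. This is a sketch of Bayes's theorem for two possibilities, assuming (as in the table) that the witness is wrong about yellow cabs 25 percent of the time; the function and parameter names are chosen here.

```python
from fractions import Fraction

def posterior(prior, p_says_if_true, p_says_if_false):
    """P(hypothesis | report) for a yes/no hypothesis.

    prior            P(hypothesis), here P(B)
    p_says_if_true   P(report | hypothesis), here P(Says B | B)
    p_says_if_false  P(report | not hypothesis), here P(Says B | Y)
    """
    joint_true = prior * p_says_if_true            # P(B and Says B)
    joint_false = (1 - prior) * p_says_if_false    # P(Y and Says B)
    return joint_true / (joint_true + joint_false)

p_blue_given_says_blue = posterior(
    prior=Fraction(20, 100),
    p_says_if_true=Fraction(75, 100),
    p_says_if_false=Fraction(25, 100),
)

print(p_blue_given_says_blue, round(float(p_blue_given_says_blue), 4))   # 3/7 0.4286
```

The denominator is the "Says B" column total, 0.15 + 0.20 = 0.35, so the function reproduces the first column of the table.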