CH4. Introduction to Probability
Random experiments & Sample space
•
A random experiment is an observational process whose results cannot be
known in advance.
The set of all outcomes (S) is the sample space for the experiment.
•
Discrete Sample Space
•
A sample space with a countable number of outcomes is discrete.
•
For a single roll of a die, the sample space is:
S = {1, 2, 3, 4, 5, 6}
•
When two dice are rolled, the sample space is the following pairs:
S=
{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
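The two listings above can be generated rather than typed out. The following is a minimal Python sketch, assuming ordered pairs (first die, second die) as in the listing; the variable names are illustrative only.

```python
from itertools import product

# One die: the six faces.
die = range(1, 7)

# Two dice: every ordered pair (first roll, second roll).
two_dice = list(product(die, repeat=2))

print(len(two_dice))   # number of simple events in S
print(two_dice[:6])    # the first row of the listing
```

Because the pairs are ordered, (1,2) and (2,1) count as different outcomes, which is what makes all 36 outcomes equally likely for fair dice.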
Continuous Sample Space
•
If the outcome is a continuous measurement, the sample space can be
described by a rule.
•
For example, the sample space describing a randomly chosen student's GPA
would be S = {X | 0.00 ≤ X ≤ 4.00}
Events
•
An event is any subset of outcomes in the sample space.
•
A simple event or elementary event, is a single outcome.
Ex 1: The event of getting a head when tossing a coin. A = {H}
Ex 2: The event of getting a 2 when rolling a die. A = {2}
•
A discrete sample space S consists of all the simple events (Ei):
S = {E1, E2,…, En}
•
A compound event consists of two or more simple events.
Ex 1: The event of rolling an even number on a die.
A = {2, 4, 6}: composed of three simple events
Ex 2: the compound event A = “rolling a seven” on a roll of two dice consists
of 6 simple events: A = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}
Equally Likely
•
Consider the random experiment of tossing a balanced coin.
What is the sample space? S = {H, T}
What are the chances of observing a H or T?
•
These two elementary events are equally likely.
•
When you buy a lottery ticket, the sample space
S = {win, lose} has only two events.
Are these two events equally likely to occur?
Probability
•
The probability of an event is a number that measures the relative likelihood
that the event will occur.
•
The probability of event A [denoted P (A)] must lie within the interval from
0 to 1:
0 ≤ P (A) ≤ 1
If P (A) = 0, then the event cannot occur.
If P (A) = 1, then the event is certain to occur.
•
In a discrete sample space, the probabilities of all simple events must sum to
unity: P (S) = P (E1) + P (E2) + … + P (En) = 1
What is Probability?
•
Three approaches to probability:
Approach      Example
Empirical     There is a 2 percent chance of twins in a randomly chosen birth.
Classical     There is a 50% probability of heads on a coin flip.
Subjective    There is a 75% chance that England will adopt the Euro currency by 2010.
Empirical Approach
•
Use the empirical or relative frequency approach to assign probabilities by
counting the frequency (f) of observed outcomes defined on the
experimental sample space.
•
For example, to estimate the default rate on student loans:
P (a student defaults) = f / n = (number of defaults) / (number of loans)
•
Necessary when there is no prior knowledge of events.
•
As the number of observations (n) increases, that is, as the experiment is
performed more times, the estimate becomes more accurate.
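The relative-frequency estimate f / n is a single division. The sketch below uses hypothetical loan counts chosen only for illustration; they do not come from the text.

```python
def relative_frequency(f, n):
    """Empirical probability f / n: occurrences divided by observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return f / n

# Hypothetical counts, for illustration only (not from the text).
defaults = 45
loans = 1000

p_default = relative_frequency(defaults, loans)
print(p_default)
```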
Classical Approach
•
Instead of performing the experiment, we can use deduction to determine
P (A).
•
a priori refers to the process of assigning probabilities before the event is
observed.
Ex) The probability of getting a head when tossing a fair coin
•
a priori probabilities are based on logic, not experience.
•
For example, the two-dice experiment has 36 equally likely simple events.
The probability of rolling a seven is
P (A) = (number of outcomes with 7 dots) / (number of outcomes in sample space) = 6/36 = 0.1667
•
The probability is obtained a priori using the classical approach as shown in
this Venn diagram for 2 dice:
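The classical calculation for two dice can be checked by counting outcomes. This is a sketch that assumes fair dice, so that all 36 ordered pairs are equally likely; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes for two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

# Compound event A: the two faces sum to seven.
seven = [pair for pair in sample_space if sum(pair) == 7]

# Classical probability: favorable outcomes / total outcomes.
p_seven = Fraction(len(seven), len(sample_space))

print(seven)
print(p_seven, round(float(p_seven), 4))
```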
Law of Large Numbers
•
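The law of large numbers is an important probability theorem that states
that as the number of trials increases, the empirical probability of an
event approaches its true probability. This is why a large sample is
preferred to a small one.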
•
Flip a coin 50 times. We would expect the proportion of heads to be near
1/2.
•
However, in a small finite sample, any ratio can be obtained (e.g., 1/3, 7/13,
10/22, 28/50, etc.).
•
A large n may be needed to get close to 1/2.
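A short simulation illustrates the point about sample size. The sketch below assumes a fair coin modeled with Python's pseudo-random generator; the seed and the sample sizes are arbitrary choices, and the exact proportions printed depend on them.

```python
import random

def proportion_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times; return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(12345)  # arbitrary seed so the run is repeatable
for n in (50, 500, 5000, 50000):
    print(n, proportion_heads(n, rng))
```

With small n the proportion can be noticeably off 1/2; as n grows, it tends to settle closer to 1/2.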
Subjective Approach
•
A subjective probability reflects someone's personal belief about the
likelihood of an event.
•
Used when there is no repeatable random experiment.
•
Ex: What is the probability that the price of GM stock will rise within the
next 30 days?
Rules of Probability
Union of Two Events
•
The union of two events consists of all outcomes in the sample space S that
are contained either in event A or in event B or both
(denoted A ∪ B or “A or B”).
•
∪ may be read as “or” since one or the other or both events may occur.
Intersection of Two Events
•
The intersection of two events A and B
(denoted A ∩ B or “A and B”) is the event consisting of all outcomes in the
sample space S that are contained in both event A and event B.
•
∩ may be read as “and” since both events occur.
General Law of Addition
•
The general law of addition states that the probability of the union of two
events A and B is:
P (A ∪ B) = P (A) + P (B) – P (A ∩ B)
When you add P (A) and P (B) together, you count P (A ∩ B) twice.
So, you have to subtract P (A ∩ B) to avoid overstating the probability.
•
For the card example:
P (Q) = 4/52 (4 queens in a deck)
P (R) = 26/52 (26 red cards in a deck)
P (Q ∩ R) = 2/52 (2 red queens in a deck)
P (Q ∪ R) = P (Q) + P (R) – P (Q ∩ R) = 4/52 + 26/52 – 2/52 = 28/52
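The card example can be verified by building a deck and counting. This sketch assumes a standard 52-card deck with hearts and diamonds as the red suits; the set operators `|` and `&` play the roles of union and intersection.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

queens = {card for card in deck if card[0] == "Q"}
reds = {card for card in deck if card[1] in ("hearts", "diamonds")}

def prob(event):
    """Classical probability of an event (a subset of the deck)."""
    return Fraction(len(event), len(deck))

# Left side: count the union directly. Right side: general law of addition.
lhs = prob(queens | reds)
rhs = prob(queens) + prob(reds) - prob(queens & reds)

print(lhs, rhs)  # Fraction reduces 28/52 to lowest terms
```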
Mutually Exclusive Events
•
Events A and B are mutually exclusive (or disjoint) if their intersection is the
null set (∅) that contains no elements.
•
In the case of mutually exclusive events, the addition law reduces to:
P (A ∪ B) = P (A) + P (B): Special Law of Addition
Complement of an Event
•
The complement of an event A is denoted by A′ (or A^c) and consists of
everything in the sample space S except event A.
•
Since A and A′ together comprise the entire sample space,
P (A) + P (A′ ) = 1
•
The probability of A′ is found by P (A′ ) = 1 – P (A)
•
For example, The Wall Street Journal reports that about 33% of all new
small businesses fail within the first 2 years. The probability that a new
small business will survive is:
P (survival) = 1 – P (failure) = 1 – .33 = .67 or 67%
Collectively Exhaustive Events
•
Events are collectively exhaustive if their union is the entire sample space S.
•
Two mutually exclusive, collectively exhaustive events are dichotomous (or
binary) events.
•
More than two mutually exclusive, collectively exhaustive events are
polytomous events.
Conditional Probability
•
Conditional probability is the probability of event A given that event B has
occurred.
•
Denoted P (A | B).
The vertical line “ | ” is read as “given.”
P (A | B) = P (A ∩ B) / P (B)    for P (B) > 0, and undefined otherwise
•
Question: P (A | B) + P (A^c | B) = ?
•
Consider the logic of this formula by looking at the Venn diagram.
The sample space is restricted to B, an event that has occurred.
A ∩ B is the part of B that is also in A.
The ratio of the relative size of A ∩ B to B is P (A | B).
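The idea of restricting the sample space to B can be sketched in code. The example below uses two fair dice with events chosen only for illustration (they are not from the text): A is "the sum is at least 10" and B is "the first die shows 6".

```python
from fractions import Fraction
from itertools import product

space = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(space))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if not b:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return prob(a & b) / prob(b)

# Illustrative events (assumptions, not from the text).
A = {o for o in space if sum(o) >= 10}
B = {o for o in space if o[0] == 6}

print(cond_prob(A, B))
# Same value obtained by counting inside the restricted sample space B.
print(Fraction(len(A & B), len(B)))
```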
Independent Events
•
Event A is independent of event B if the conditional probability P (A | B) is
the same as the marginal probability P (A).
•
To check for independence, apply this test:
If P (A | B) = P (A) then event A is independent of B.
•
Another way to check for independence:
If P (A ∩ B) = P (A) P (B) then event A is independent of event B since
P (A | B) = P (A ∩ B) / P (B) = P (A) P (B) / P (B) = P (A)
•
Ex) Out of a target audience of 2,000,000, ad A reaches 500,000 viewers, ad B
reaches 300,000 viewers and both ads reach 100,000 viewers.
P (A) = 500,000 / 2,000,000 = .25
P (B) = 300,000 / 2,000,000 = .15
P (A ∩ B) = 100,000 / 2,000,000 = .05
•
What is P (A | B)?
P (A | B) = P (A ∩ B) / P (B) = .05 / .15 = .3333
Since P (A | B) = .3333 differs from P (A) = .25, A and B are not independent.
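The ad-audience numbers can be run through both independence tests. This sketch uses exact fractions to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

target = 2_000_000
p_a = Fraction(500_000, target)        # ad A reaches 500,000 viewers
p_b = Fraction(300_000, target)        # ad B reaches 300,000 viewers
p_a_and_b = Fraction(100_000, target)  # both ads reach 100,000 viewers

p_a_given_b = p_a_and_b / p_b

# Test 1: is P(A | B) equal to P(A)?
independent = (p_a_given_b == p_a)
# Test 2: is P(A and B) equal to P(A) P(B)?
product_rule_holds = (p_a_and_b == p_a * p_b)

print(p_a_given_b, float(p_a_given_b))
print(independent, product_rule_holds)
```

The two tests always agree when P (B) > 0, since each is an algebraic rearrangement of the other.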
Dependent Events
•
When P (A) ≠ P (A | B), then events A and B are dependent.
•
For dependent events, knowing that event B has occurred will affect the
probability that event A will occur.
Multiplication Law for Independent Events
•
The probability of n independent events occurring simultaneously is:
P (A1 ∩ A2 ∩ ... ∩ An) = P (A1) P (A2) ... P (An)
if the events are (mutually) independent.
Note: In general, P (A1 ∩ A2 ∩ ... ∩ An) = P (A1) P (A2 | A1) … P (An | A1 ∩ A2 ∩ … ∩ An-1)
•
To illustrate system reliability, suppose a Web site has 2 independent file
servers. Each server has 99% reliability. What is the total system reliability?
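The answer to the reliability question depends on how "the system works" is defined, which the text does not state. The sketch below applies the multiplication law under two labeled assumptions: (a) both servers must work, and (b) the servers are redundant, so the system fails only if both fail.

```python
def all_work(reliabilities):
    """P(every component works), assuming independent components."""
    p = 1.0
    for r in reliabilities:
        p *= r
    return p

def at_least_one_works(reliabilities):
    """P(at least one works) = 1 - P(all fail), assuming independence."""
    p_all_fail = 1.0
    for r in reliabilities:
        p_all_fail *= (1.0 - r)
    return 1.0 - p_all_fail

servers = [0.99, 0.99]

# Assumption (a): the site needs both servers.
print(round(all_work(servers), 4))
# Assumption (b): either server alone can carry the site.
print(round(at_least_one_works(servers), 4))
```

Under (a) reliability drops below that of a single server; under (b) it rises above it, which is the usual argument for redundancy.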
Contingency Tables
•
A contingency table is a cross-tabulation of frequencies into rows and
columns.
•
A contingency table is like a frequency distribution for two variables.
•
Ex) Salary Gains and MBA Tuition. Consider the following cross-tabulation
table for n = 67 top-tier MBA programs:
Relative Frequencies
•
Calculate the relative frequencies below for each cell of the cross-tabulation
table to facilitate probability calculations.
Marginal Probabilities
•
The marginal probability of a single event is found by dividing a row or
column total by the total sample size.
•
For example, the marginal probability of a medium salary gain is P (S2) = 33/67.
Joint Probabilities
•
A joint probability represents the intersection of two events in a
cross-tabulation table.
•
Consider the joint event that the school has low tuition and large salary
gains (denoted as P (T1 ∩ S3) = 1/67).
•
Let X and Y be a pair of discrete random variables. Their joint probability
function expresses the probability that X takes the specific value x and
simultaneously Y takes the value y, as a function of x and y. The notation
used is P(x, y), so
P(x, y) = P(X = x ∩ Y = y)
•
Let X and Y be a pair of jointly distributed random variables. In this context
the probability function of the random variable X is called its marginal
probability function and is obtained by summing the joint probabilities over
all possible values; that is,
P(x) = Σ_y P(x, y)
•
Similarly, the marginal probability function of the random variable Y is
P(y) = Σ_x P(x, y)
•
Let X and Y be discrete random variables with joint probability function
P(x, y). Then
1) 0 ≤ P(x, y) ≤ 1 for any pair of values x and y
2) The sum of the joint probabilities P(x, y) over all possible values must be 1.
Conditional Probabilities
•
Find the probability that the salary gains are small (S1) given that the MBA
tuition is large (T3). P (S1 | T3) =5/32
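The marginal, joint, and conditional calculations for a contingency table can be sketched as follows. The cross-tabulation itself is not reproduced in this text, so the cell counts below are ASSUMED for illustration; they were chosen only so that n = 67 and the three quoted values (33/67, 1/67, and 5/32) come out as stated. T1 to T3 denote low to high tuition and S1 to S3 small to large salary gains.

```python
from fractions import Fraction

# ASSUMED cell counts (tuition level, salary gain) -> number of programs.
# Only n = 67, P(S2), P(T1 and S3) and P(S1 | T3) are confirmed by the text.
counts = {
    ("T1", "S1"): 5, ("T1", "S2"): 10, ("T1", "S3"): 1,
    ("T2", "S1"): 7, ("T2", "S2"): 11, ("T2", "S3"): 1,
    ("T3", "S1"): 5, ("T3", "S2"): 12, ("T3", "S3"): 15,
}
n = sum(counts.values())

# Relative frequencies: the joint probability of each cell.
joint = {cell: Fraction(c, n) for cell, c in counts.items()}

def marginal_tuition(t):
    """Sum the joint probabilities across one tuition row."""
    return sum(p for (ti, _), p in joint.items() if ti == t)

def marginal_salary(s):
    """Sum the joint probabilities down one salary column."""
    return sum(p for (_, si), p in joint.items() if si == s)

def salary_given_tuition(s, t):
    """P(S | T) = P(T and S) / P(T)."""
    return joint[(t, s)] / marginal_tuition(t)

print(marginal_salary("S2"))             # marginal probability
print(joint[("T1", "S3")])               # joint probability
print(salary_given_tuition("S1", "T3"))  # conditional probability
print(sum(joint.values()))               # joint probabilities sum to 1
```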
More about dependence/ independence
•
Two variables case
Ex 1)
X: Gender (events: M, F)
Y: Pregnant (events: Yes, No)
P(Yes|M) = 0, P(Yes|F) > 0
Knowing the gender affects the probability that the person is
pregnant: two variables are dependent
Ex 2) 52 Cards Example
X: value (events: 1,2,…,10,J,Q,K)
Y: color (events: Red, Black)
P(Q|Red)=P(Q), P(Q|Black)=P(Q)
Knowing the color of card does not affect the probability that the
Queen occurs.
* In order to check the independence of two variables, we need to check the
independence conditions of all the events between two variables.
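Checking every pair of events can be automated for the card example. This sketch assumes a standard 52-card deck with values written 1 to 10, J, Q, K as in Ex 2, and tests P (value ∩ color) = P (value) P (color) for all 26 value-color pairs.

```python
from fractions import Fraction
from itertools import product

values = [str(i) for i in range(1, 11)] + ["J", "Q", "K"]
suit_color = {"hearts": "Red", "diamonds": "Red",
              "clubs": "Black", "spades": "Black"}
deck = [(v, suit) for v in values for suit in suit_color]

def prob(pred):
    """Probability that a randomly drawn card satisfies pred."""
    return Fraction(sum(1 for card in deck if pred(card)), len(deck))

def variables_independent():
    """True only if every (value, color) pair passes the product test."""
    for v, color in product(values, ("Red", "Black")):
        p_v = prob(lambda c: c[0] == v)
        p_color = prob(lambda c: suit_color[c[1]] == color)
        p_both = prob(lambda c: c[0] == v and suit_color[c[1]] == color)
        if p_both != p_v * p_color:
            return False
    return True

print(variables_independent())
```

A single failing pair is enough to make the two variables dependent, which is why the loop returns as soon as one is found.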
Ex 3) To illustrate system reliability, suppose a Web site has 2 independent
file servers. Each server has 99% reliability. What is the total
system reliability?
X: Server A (events: survive, fail)
Y: Server B (events: survive, fail)
Under the independence assumption of X and Y, the following
conditions should be satisfied.
P(A fail ∩ B fail) = P(A fail) P(B fail)
P(A fail ∩ B survive) = P(A fail) P(B survive)
P(A survive ∩ B fail) = P(A survive) P(B fail)
P(A survive ∩ B survive) = P(A survive) P(B survive)
Question: If two events A and B are mutually exclusive (P(A ∩ B) = 0), could A and
B be independent of each other?
Bayes’ Theorem
•
The prior (marginal) probability of an event B is revised after event A has
been considered to yield a posterior (conditional) probability.
•
Bayes’ formula is:
P (B | A) = P (A | B) P (B) / P (A)
•
In some situations P (A) is not given. Therefore, the most useful and
common form of Bayes’s Theorem is:
P (B | A) = P (A | B) P (B) / [P (A | B) P (B) + P (A | B') P (B')]
•
Ex) A pregnancy test. First define
A = positive test
B = pregnant
A' = negative test
B' = not pregnant
Some information is given: P (A | B) = .96, P (A | B') = .01, P (B) = .60
or P (A' | B) = .04, P (A' | B') = .99, P (B') = .40
•
In a group of 1,000 women, 600 are pregnant and 576 of them test positive
(.96 × 600), while 4 of the 400 who are not pregnant also test positive
(.01 × 400). So of the 580 women who test positive, 576 will actually be pregnant.
•
So, the desired probability is:
P (Pregnant | Positive Test) = 576/580 = .9931
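The counts 576 and 580 in the pregnancy-test example can be reproduced from the three given probabilities. The sketch below assumes a cohort of 1,000 women, a size chosen because it yields those counts; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

n_women = 1000                                  # ASSUMED cohort size
p_pregnant = Fraction("0.60")                   # P(B)
p_pos_given_pregnant = Fraction("0.96")         # P(A | B)
p_pos_given_not_pregnant = Fraction("0.01")     # P(A | B')

pregnant = n_women * p_pregnant                 # women who are pregnant
not_pregnant = n_women - pregnant

true_pos = pregnant * p_pos_given_pregnant            # pregnant, test positive
false_pos = not_pregnant * p_pos_given_not_pregnant   # not pregnant, test positive
total_pos = true_pos + false_pos

posterior = true_pos / total_pos                # P(B | A)
print(true_pos, total_pos, round(float(posterior), 4))
```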
•
A generalization of Bayes’s Theorem allows event B to be polytomous (B1,
B2, … Bn) rather than dichotomous (B and B').
P (Bi | A) = P (A | Bi) P (Bi) / [P (A | B1) P (B1) + P (A | B2) P (B2) + ... + P (A | Bn) P (Bn)]
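The generalized formula translates directly into a small function. In this sketch the function name and the three-category numbers are hypothetical illustrations; the two-category call reuses the values given in the pregnancy-test example.

```python
def bayes_posterior(priors, likelihoods):
    """Return [P(Bi | A)] given priors [P(Bi)] and likelihoods [P(A | Bi)].

    The Bi are assumed mutually exclusive and collectively exhaustive.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A | Bi) P(Bi)
    total = sum(joint)                                    # the denominator, P(A)
    if total == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / total for j in joint]

# Dichotomous case with the values from the text: B and B'.
post = bayes_posterior([0.60, 0.40], [0.96, 0.01])
print(round(post[0], 4))

# Polytomous case with HYPOTHETICAL numbers (not from the text).
post3 = bayes_posterior([0.5, 0.3, 0.2], [0.02, 0.05, 0.10])
print([round(p, 4) for p in post3])
```

The posteriors always sum to 1 because each term is divided by the same total.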