CH4. Introduction to Probability

Random experiments & Sample space
• A random experiment is an observational process whose results cannot be known in advance. The set of all outcomes (S) is the sample space for the experiment.

Discrete Sample Space
• A sample space with a countable number of outcomes is discrete.
• For a single roll of a die, the sample space is: S = {1, 2, 3, 4, 5, 6}
• When two dice are rolled, the sample space is the following set of pairs:
  S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
       (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
       (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
       (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
       (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
       (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

Continuous Sample Space
• If the outcome is a continuous measurement, the sample space can be described by a rule.
• For example, the sample space to describe a randomly chosen student's GPA would be S = {X | 0.00 ≤ X ≤ 4.00}

Events
• An event is any subset of outcomes in the sample space.
• A simple event, or elementary event, is a single outcome.
  Ex 1: The event of getting a head when tossing a coin: A = {H}
  Ex 2: The event of getting a 2 when rolling a die: A = {2}
• A discrete sample space S consists of all the simple events (Ei): S = {E1, E2, …, En}
• A compound event consists of two or more simple events.
  Ex 1: The event of getting an even number when rolling a die: A = {2, 4, 6}, composed of three simple events.
  Ex 2: The compound event A = “rolling a seven” on a roll of two dice consists of 6 simple events: A = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}

Equally Likely
• Consider the random experiment of tossing a balanced coin. What is the sample space? S = {H, T}. What are the chances of observing H or T?
• These two elementary events are equally likely.
• When you buy a lottery ticket, the sample space S = {win, lose} has only two events. Are these two events equally likely to occur?
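
The two-dice sample space and the compound event “rolling a seven” described above can also be generated rather than written out by hand. The following is a minimal Python sketch; the names `sample_space` and `rolling_seven` are illustrative choices and do not come from the original notes.

```python
from itertools import product

# Sample space for two dice: all ordered pairs (first die, second die).
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

# Compound event A = "rolling a seven": the pairs whose faces sum to 7.
rolling_seven = [outcome for outcome in sample_space if sum(outcome) == 7]

print(len(sample_space))  # 36
print(rolling_seven)      # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
```

Each element of `rolling_seven` is one simple event, so the length of the list is the number of simple events that make up the compound event.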
Probability
• The probability of an event is a number that measures the relative likelihood that the event will occur.
• The probability of event A, denoted P(A), must lie within the interval from 0 to 1: 0 ≤ P(A) ≤ 1. If P(A) = 0, then the event cannot occur. If P(A) = 1, then the event is certain to occur.
• In a discrete sample space, the probabilities of all simple events must sum to unity: P(S) = P(E1) + P(E2) + … + P(En) = 1

What is Probability?
• Three approaches to probability:
  Approach     Example
  Empirical    There is a 2 percent chance of twins in a randomly chosen birth.
  Classical    There is a 50% probability of heads on a coin flip.
  Subjective   There is a 75% chance that England will adopt the Euro currency by 2010.

Empirical Approach
• Use the empirical or relative frequency approach to assign probabilities by counting the frequency (f) of observed outcomes defined on the experimental sample space.
• For example, to estimate the default rate on student loans: P(a student defaults) = f/n = (number of defaults)/(number of loans)
• Necessary when there is no prior knowledge of events.
• As the number of observations (n), that is, the number of times the experiment is performed, increases, the estimate will become more accurate.

Classical Approach
• Instead of performing the experiment, we can use deduction to determine P(A).
• A priori refers to the process of assigning probabilities before the event is observed. Ex: the probability of getting a head with a fair coin.
• A priori probabilities are based on logic, not experience.
• For example, the two-dice experiment has 36 equally likely simple events. The probability of rolling a seven is:
  P(A) = (number of outcomes with 7 dots) / (number of outcomes in sample space) = 6/36 = 0.1667
• The probability is obtained a priori using the classical approach, as shown in the Venn diagram for 2 dice (figure not included in this transcript).

Law of Large Numbers
• The law of large numbers is an important probability theorem. It states that as the number of trials grows, the observed relative frequency of an event tends toward its true probability, which is why a large sample is preferred to a small one.
• Flip a coin 50 times. We would expect the proportion of heads to be near 1/2.
• However, in a small finite sample, any ratio can be obtained (e.g., 1/3, 7/13, 10/22, 28/50, etc.).
• A large n may be needed to get close to 1/2.

Subjective Approach
• A subjective probability reflects someone's personal belief about the likelihood of an event.
• Used when there is no repeatable random experiment.
• Ex: What is the probability that the price of GM stock will rise within the next 30 days?

Rules of Probability

Union of Two Events
• The union of two events consists of all outcomes in the sample space S that are contained either in event A or in event B or both (denoted A ∪ B or “A or B”).
• ∪ may be read as “or” since one or the other or both events may occur.

Intersection of Two Events
• The intersection of two events A and B (denoted A ∩ B or “A and B”) is the event consisting of all outcomes in the sample space S that are contained in both event A and event B.
• ∩ may be read as “and” since both events occur.

General Law of Addition
• The general law of addition states that the probability of the union of two events A and B is:
  P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
• When you add P(A) and P(B) together, you count P(A and B) twice. So you have to subtract P(A ∩ B) to avoid overstating the probability.
• For the card example:
  P(Q) = 4/52 (4 queens in a deck)
  P(R) = 26/52 (26 red cards in a deck)
  P(Q ∩ R) = 2/52 (2 red queens in a deck)
  P(Q ∪ R) = P(Q) + P(R) - P(Q ∩ R) = 4/52 + 26/52 - 2/52 = 28/52

Mutually Exclusive Events
• Events A and B are mutually exclusive (or disjoint) if their intersection is the null set (∅) that contains no elements.
• In the case of mutually exclusive events, the addition law reduces to the Special Law of Addition: P(A ∪ B) = P(A) + P(B)

Complement of an Event
• The complement of an event A is denoted by A′ (or A^C) and consists of everything in the sample space S except event A.
• Since A and A′ together comprise the entire sample space, P(A) + P(A′) = 1
• The probability of A′ is found by P(A′) = 1 - P(A)
• For example, The Wall Street Journal reports that about 33% of all new small businesses fail within the first 2 years. The probability that a new small business will survive is: P(survival) = 1 - P(failure) = 1 - .33 = .67 or 67%

Collectively Exhaustive Events
• Events are collectively exhaustive if their union is the entire sample space S.
• Two mutually exclusive, collectively exhaustive events are dichotomous (or binary) events.
• More than two mutually exclusive, collectively exhaustive events are polytomous events.

Conditional Probability
• The conditional probability is the probability of event A given that event B has occurred.
• Denoted P(A | B). The vertical line “|” is read as “given.”
  P(A | B) = P(A ∩ B) / P(B) for P(B) > 0, and undefined otherwise
• Question: P(A | B) + P(A′ | B) = ?
• Consider the logic of the formula P(A | B) = P(A ∩ B) / P(B) in terms of a Venn diagram. The sample space is restricted to B, an event that has occurred. A ∩ B is the part of B that is also in A. The ratio of the relative size of A ∩ B to B is P(A | B).

Independent Events
• Event A is independent of event B if the conditional probability P(A | B) is the same as the marginal probability P(A).
• To check for independence, apply this test: if P(A | B) = P(A), then event A is independent of B.
• Another way to check for independence: if P(A ∩ B) = P(A) P(B), then event A is independent of event B, since
  P(A | B) = P(A ∩ B) / P(B) = P(A) P(B) / P(B) = P(A)
• Ex: Out of a target audience of 2,000,000, ad A reaches 500,000 viewers, ad B reaches 300,000 viewers, and both ads reach 100,000 viewers.
  P(A) = 500,000 / 2,000,000 = .25
  P(B) = 300,000 / 2,000,000 = .15
  P(A ∩ B) = 100,000 / 2,000,000 = .05
  What is P(A | B)?
  P(A | B) = P(A ∩ B) / P(B) = .05 / .15 = .3333
  Since P(A | B) = .3333 differs from P(A) = .25, the two ads are not independent.

Dependent Events
• When P(A) ≠ P(A | B), then events A and B are dependent.
• For dependent events, knowing that event B has occurred will affect the probability that event A will occur.

Multiplication Law for Independent Events
• The probability of n independent events occurring simultaneously is:
  P(A1 ∩ A2 ∩ ... ∩ An) = P(A1) P(A2) ... P(An) if the events are (mutually) independent
• Note: in general, P(A1 ∩ A2 ∩ ... ∩ An) = P(A1) P(A2 | A1) … P(An | A1 ∩ A2 ∩ … ∩ An-1)
• To illustrate system reliability, suppose a Web site has 2 independent file servers. Each server has 99% reliability. What is the total system reliability?

Contingency Tables
• A contingency table is a cross-tabulation of frequencies into rows and columns.
• A contingency table is like a frequency distribution for two variables.
• Ex: Salary Gains and MBA Tuition. Consider a cross-tabulation table for n = 67 top-tier MBA programs (table not included in this transcript).

Relative Frequencies
• Calculate the relative frequencies for each cell of the cross-tabulation table to facilitate probability calculations.

Marginal Probabilities
• The marginal probability of a single event is found by dividing a row or column total by the total sample size.
• For example, the marginal probability of a medium salary gain is P(S2) = 33/67.

Joint Probabilities
• A joint probability represents the intersection of two events in a cross-tabulation table.
• Consider the joint event that the school has low tuition and large salary gains: P(T1 ∩ S3) = 1/67.
• Let X and Y be a pair of discrete random variables. Their joint probability function expresses the probability that X takes the specific value x and simultaneously Y takes the value y, as a function of x and y. The notation used is P(x, y), so
  P(x, y) = P(X = x ∩ Y = y)
• Let X and Y be a pair of jointly distributed random variables.
In this context, the probability function of the random variable X is called its marginal probability function and is obtained by summing the joint probabilities over all possible values of Y; that is,
  P(x) = Σ_y P(x, y)
• Similarly, the marginal probability function of the random variable Y is
  P(y) = Σ_x P(x, y)
• Let X and Y be discrete random variables with joint probability function P(x, y). Then
  1) 0 ≤ P(x, y) ≤ 1 for any pair of values x and y
  2) The sum of the joint probabilities P(x, y) over all possible values must be 1.

Conditional Probabilities
• Find the probability that the salary gains are small (S1) given that the MBA tuition is large (T3): P(S1 | T3) = 5/32

More about dependence/independence
• Two variables case
  Ex 1) X: Gender (events: M, F), Y: Pregnant (events: Yes, No)
  P(Yes | M) = 0, P(Yes | F) > 0
  Knowing the gender affects the probability that the person is pregnant: the two variables are dependent.
  Ex 2) 52 Cards Example. X: value (events: 1, 2, …, 10, J, Q, K), Y: color (events: Red, Black)
  P(Q | Red) = P(Q), P(Q | Black) = P(Q)
  Knowing the color of the card does not affect the probability that a Queen occurs.
  * To check the independence of two variables, we need to check the independence conditions for all pairs of events between the two variables.
  Ex 3) To illustrate system reliability, suppose a Web site has 2 independent file servers. Each server has 99% reliability. What is the total system reliability?
  X: Server A (events: survive, fail), Y: Server B (events: survive, fail)
  Under the independence assumption for X and Y, the following conditions should be satisfied:
  P(A fail ∩ B fail) = P(A fail) P(B fail)
  P(A fail ∩ B survive) = P(A fail) P(B survive)
  P(A survive ∩ B fail) = P(A survive) P(B fail)
  P(A survive ∩ B survive) = P(A survive) P(B survive)
• Question: If two events A and B are mutually exclusive (P(A ∩ B) = 0), could A and B be independent of each other?
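
The independence checks in Ex 2 (cards) and Ex 3 (file servers) above can be worked through numerically. The sketch below is illustrative only and all variable names are hypothetical. For Ex 3 it also assumes that the Web site stays available as long as at least one server survives; the notes do not state this, so the reliability figure holds only under that redundancy assumption.

```python
from fractions import Fraction

# Ex 2: a standard 52-card deck has 4 queens, 26 red cards, 2 red queens.
p_queen = Fraction(4, 52)
p_red = Fraction(26, 52)
p_queen_and_red = Fraction(2, 52)

# Conditional probability: P(Q | Red) = P(Q and Red) / P(Red).
p_queen_given_red = p_queen_and_red / p_red

# Both independence tests from the notes give the same verdict.
print(p_queen_given_red == p_queen)        # True
print(p_queen_and_red == p_queen * p_red)  # True

# Ex 3: two independent servers, each with 99% reliability.
p_survive = Fraction(99, 100)
p_fail = 1 - p_survive

# Under independence, every joint probability is a product of marginals.
joint = {
    ("survive", "survive"): p_survive * p_survive,
    ("survive", "fail"): p_survive * p_fail,
    ("fail", "survive"): p_fail * p_survive,
    ("fail", "fail"): p_fail * p_fail,
}
print(sum(joint.values()) == 1)  # True: joint probabilities sum to 1

# ASSUMPTION (not in the notes): the site is up if at least one server survives,
# so the system fails only when both servers fail.
system_reliability = 1 - joint[("fail", "fail")]
print(float(system_reliability))  # 0.9999
```

If the site instead needed both servers at once, the relevant figure would be `joint[("survive", "survive")]`, which is 0.9801.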
Bayes' Theorem
• The prior (marginal) probability of an event B is revised after event A has been considered to yield a posterior (conditional) probability.
• Bayes' formula is:
  P(B | A) = P(A | B) P(B) / P(A)
• In some situations P(A) is not given. Therefore, the most useful and common form of Bayes' Theorem is:
  P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B′) P(B′)]
• Ex: a pregnancy test. First define:
  A = positive test, A′ = negative test
  B = pregnant, B′ = not pregnant
• Some information is given:
  P(A | B) = .96, P(A | B′) = .01, P(B) = .60
  or, equivalently, P(A′ | B) = .04, P(A′ | B′) = .99, P(B′) = .40
• Applying Bayes' formula:
  P(B | A) = (.96)(.60) / [(.96)(.60) + (.01)(.40)] = .576 / .580 = .9931
• In frequency terms, of the 580 women (per 1,000) who test positive, 576 will actually be pregnant. So the desired probability is:
  P(Pregnant | Positive Test) = 576/580 = .9931
• A generalization of Bayes' Theorem allows event B to be polytomous (B1, B2, … Bn) rather than dichotomous (B and B′):
  P(Bi | A) = P(A | Bi) P(Bi) / [P(A | B1) P(B1) + P(A | B2) P(B2) + ... + P(A | Bn) P(Bn)]
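
The generalized form of Bayes' Theorem above can be sketched in a few lines of Python. The function name `bayes_posterior` and its parameters are hypothetical, introduced here only for illustration; the numbers are those given in the pregnancy-test example.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods, i):
    """Return P(B_i | A) from priors P(B_k) and likelihoods P(A | B_k).

    The events B_1, ..., B_n are assumed to be mutually exclusive and
    collectively exhaustive, so the priors must sum to 1.
    """
    # Denominator: P(A) = sum over k of P(A | B_k) P(B_k).
    p_a = sum(prior * lik for prior, lik in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / p_a

# Dichotomous case from the notes: B = pregnant, B' = not pregnant.
priors = [Fraction(60, 100), Fraction(40, 100)]      # P(B), P(B')
likelihoods = [Fraction(96, 100), Fraction(1, 100)]  # P(A | B), P(A | B')

posterior = bayes_posterior(priors, likelihoods, 0)
print(posterior)                   # 144/145, i.e. 576/580 in lowest terms
print(round(float(posterior), 4))  # 0.9931
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to see that the result matches 576/580 from the example.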