Probability theory
Tron Anders Moger
September 5th 2007

Some definitions:
• Sample space S = the set of all possible outcomes of a random experiment
• Event A: a subset of outcomes in the sample space
• Venn diagram: a drawing of events as regions inside the sample space

Operations on events 1
• Complement: The complement of A consists of all outcomes that are in the sample space but not in A, denoted $\bar{A}$.
• Union: The union of two events A and B, $A \cup B$, consists of the outcomes included in A, in B, or in both.

Operations on events 2
• Intersection: The intersection of A and B, $A \cap B$, consists of the outcomes included in both A and B.
• Mutually exclusive: If A and B do not have any common outcomes, $A \cap B = \emptyset$, they are mutually exclusive.
• Collectively exhaustive: $A \cup B = S$

Probability
• Probability is defined as the relative frequency with which an event A occurs if an experiment is repeated many times: $p_A = \frac{n_A}{n}$, where $n_A$ is the number of times A occurs in $n$ repetitions.
• The probabilities of all outcomes in the sample space sum to 1.
• Probability 0: The event cannot occur
• Probabilities have to be between 0 and 1!

Probability postulates 1
• The complement rule: $P(A) + P(\bar{A}) = 1$
• Rule of addition for mutually exclusive events: $P(A \cup B) = P(A) + P(B)$

Probability postulates 2
• General rule of addition, for events that are not mutually exclusive: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Conditional probability
• If the event B has already occurred, the conditional probability of A given B is: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
• Can be interpreted as follows: The knowledge that B has occurred limits the sample space to B. The relative probabilities are the same, but they are scaled up so that they sum to 1.

Probability postulates 3
• Multiplication rule: For general events A and B: $P(A \cap B) = P(A \mid B)P(B) = P(B \mid A)P(A)$
• Independence: A and B are statistically independent if $P(A \cap B) = P(A)P(B)$
– Implies that $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A)$

Probability postulates 4
• Assume that the events $A_1, A_2, \ldots, A_n$ are independent. Then $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2) \cdots P(A_n)$
• This rule is very handy when all $P(A_i)$ are equal

Example: Doping tests
• Let's say a doping test has a 0.2% probability of being positive when the athlete is not using steroids
• The athlete is tested 50 times
• What is the probability that at least one test is positive, even though the athlete is clean?
• Define A = at least one test is positive
$P(A) = 1 - P(\bar{A})$   (complement rule)
$= 1 - (1 - 0.002) \cdots (1 - 0.002)$   (rule of independence, 50 terms)
$= 1 - (1 - 0.002)^{50} \approx 0.095 = 9.5\%$

Example: Andy's exams
• Define A = Andy passes math, B = Andy passes chemistry
• Let $P(A) = 0.4$, $P(B) = 0.35$, $P(A \cap B) = 0.12$
• Are A and B independent? $0.4 \cdot 0.35 = 0.14 \neq 0.12$, so no, they are not
• Probability that Andy fails both subjects?
$P(\bar{A} \cap \bar{B}) = 1 - P(A \cup B)$   (complement rule)
$= 1 - (P(A) + P(B) - P(A \cap B))$   (general rule of addition)
$= 1 - (0.4 + 0.35 - 0.12) = 0.37$

The law of total probability - twins
• A = Twins have the same gender
• B = Twins are monozygotic
• $\bar{B}$ = Twins are dizygotic
• What is P(A)?
• The law of total probability: $P(A) = P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B})$
• For twins: $P(B) = 1/3$, $P(\bar{B}) = 2/3$, $P(A \mid B) = 1$, $P(A \mid \bar{B}) = 1/2$
$P(A) = 1 \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{2}{3} = \frac{2}{3}$

Bayes' theorem
$P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B})}$
• Frequently used to estimate the probability that a patient is ill on the basis of a diagnostic test
• Incorrect diagnoses (false positives) are common for rare diseases

Example: Cervical cancer
• B = Cervical cancer
• A = Positive test
• $P(B) = 0.0001$, $P(A \mid B) = 0.9$, $P(A \mid \bar{B}) = 0.001$
$P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B})} = \frac{0.9 \cdot 0.0001}{0.9 \cdot 0.0001 + 0.001 \cdot 0.9999} \approx 0.08$
• Only 8% of women with positive tests are ill
• Usefulness of the test is highly dependent on disease prevalence and quality of the test (with $P(A \mid \bar{B})$ the false positive rate and $P(A \mid B) = 0.9$ throughout):

P(B)      P(A | not B)   P(B | A)
0.0001    0.001          0.08
0.0001    0.0001         0.47
0.001     0.001          0.47
0.001     0.0001         0.90
0.01      0.001          0.90
0.01      0.0001         0.99

Odds:
• The odds for an event is the probability of the event divided by the probability of its complement: $\text{Odds} = \frac{P(A)}{1 - P(A)} = \frac{P(A)}{P(\bar{A})}$
• From horse racing: Odds 1:9 means that the horse wins in 1 out of 10 races; $P(A) = 0.1$

Random variables
• A random variable takes on numerical values determined by the outcome of a random experiment.
• A discrete random variable takes on a countable number of values, with a certain probability attached to each specific value.
• A continuous random variable can take on any value in an interval; it is only meaningful to talk about the probability of intervals.

PDF and CDF
• For discrete random variables, the probability density function (PDF) is simply the probability function of each outcome, denoted $P(x)$.
• The cumulative distribution function (CDF) at a value $x_0$ is the cumulative sum of the PDF for values up to and including $x_0$: $F(x_0) = \sum_{x \le x_0} P(x)$
• The sum of $P(x)$ over all outcomes is always 1 (why?).
• For a single die throw, the CDF at 4 is $1/6 + 1/6 + 1/6 + 1/6 = 4/6 = 2/3$

Expected value
• The expected value of a discrete random variable is defined as the following sum: $E(X) = \sum_x x P(x)$
• The sum is over all possible values/outcomes of the variable
• For a single die throw, the expected value is $E(X) = 1 \cdot 1/6 + 2 \cdot 1/6 + \ldots + 6 \cdot 1/6 = 3.5$

Properties of the expected value
• We can construct a new random variable $Y = aX + b$ from a random variable X and numbers a and b. (When X has outcome x, Y has outcome $ax + b$, and the probabilities are the same.)
• We can then see that $E(Y) = aE(X) + b$
• We can also construct, for example, the random variable $X \cdot X = X^2$

Variance and standard deviation
• The variance of a stochastic variable X with $\mu = E(X)$ is $\sigma^2 = \mathrm{Var}(X) = E((X - \mu)^2) = E(X^2) - \mu^2$
• The standard deviation is the square root of the variance.
• We can show that $\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$
• Hence, constants do not have any variance

Example:
• Let $E(X) = \mu_X$ and $\mathrm{Var}(X) = \sigma_X^2$
• What are the expected value and variance of $Y = \frac{X - \mu_X}{\sigma_X}$?
$E(Y) = E\left(\frac{X - \mu_X}{\sigma_X}\right) = E\left(\frac{X}{\sigma_X}\right) - E\left(\frac{\mu_X}{\sigma_X}\right) = \frac{\mu_X}{\sigma_X} - \frac{\mu_X}{\sigma_X} = 0$
$\mathrm{Var}(Y) = \mathrm{Var}\left(\frac{X}{\sigma_X} - \frac{\mu_X}{\sigma_X}\right) = \mathrm{Var}\left(\frac{X}{\sigma_X}\right) = \frac{1}{\sigma_X^2} \mathrm{Var}(X) = 1$

Next week:
• So far: Only considered discrete random variables
• Next week: Continuous random variables
• Common probability distributions for random variables
• Normal distribution
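Checking the numbers: The worked examples in these slides (doping tests, Andy's exams, twins, cervical cancer, the die throw) can be reproduced numerically. The following is a minimal Python sketch, not part of the original lecture. The function names (at_least_one_positive, posterior, expected_value, variance) are hypothetical helpers introduced here for illustration, and the doping calculation assumes the 50 tests are independent, as the slides do.

```python
from fractions import Fraction


def at_least_one_positive(p_false_positive, n_tests):
    """Complement rule plus independence: 1 - (1 - p)^n."""
    return 1 - (1 - p_false_positive) ** n_tests


def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem: P(B|A) from P(B), P(A|B) and P(A|not B)."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)


def expected_value(pmf):
    """E(X) = sum of x * P(x) over all outcomes x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E(X^2) - E(X)^2."""
    mu = expected_value(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu ** 2


# Doping tests: 0.2% false positive rate, 50 independent tests.
p_doping = at_least_one_positive(0.002, 50)

# Andy's exams: complement rule and general rule of addition.
p_a, p_b, p_ab = 0.4, 0.35, 0.12
independent = abs(p_a * p_b - p_ab) < 1e-12
p_fail_both = 1 - (p_a + p_b - p_ab)

# Twins: law of total probability with exact fractions.
p_same_gender = 1 * Fraction(1, 3) + Fraction(1, 2) * Fraction(2, 3)

# Cervical cancer: P(B|A) for each prevalence and false positive rate.
for prior in (0.0001, 0.001, 0.01):
    for fpr in (0.001, 0.0001):
        print(f"P(B)={prior}  P(A|not B)={fpr}  "
              f"P(B|A)={posterior(prior, 0.9, fpr):.2f}")

# Single die throw: CDF at 4, expected value and variance.
die = {face: Fraction(1, 6) for face in range(1, 7)}
cdf_at_4 = sum(p for face, p in die.items() if face <= 4)
print(p_doping, p_fail_both, p_same_gender, cdf_at_4, expected_value(die))
```

Exact fractions are used for the twins and die examples so that results such as 2/3 and 7/2 are not blurred by floating-point rounding; the other examples use ordinary floats because the slides quote them as rounded decimals.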