Probability theory
Petter Mostad
2005.09.15

Sample space
• The set of possible outcomes you consider for the problem you look at
• You subdivide into different outcomes only as far as is relevant for your problem
• The sample space is the start of a simplified model for reality
• Events are subsets of the sample space

Set theory
• The complement of a subset A is denoted Aᶜ
• The intersection of A and B is denoted A ∩ B
• The union of A and B is denoted A ∪ B
• A and B are called mutually exclusive or disjoint if A ∩ B = ∅, where ∅ is the empty set
• If A1, A2, ..., An are subsets of a sample space S, then they are called collectively exhaustive if A1 ∪ A2 ∪ ... ∪ An = S

Venn diagrams
[Figure: Venn diagrams illustrating the complement Aᶜ, the intersection A ∩ B, the union A ∪ B, disjoint sets, and the identity (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)]

Computations with sets
Examples of rules you can prove:
• (A ∪ B) ∪ C = A ∪ (B ∪ C)
• (A ∩ B) ∩ C = A ∩ (B ∩ C)
• A = (A ∩ B) ∪ (A ∩ Bᶜ)
• If A1, A2, A3 are mutually exclusive and collectively exhaustive, then B = (B ∩ A1) ∪ (B ∩ A2) ∪ (B ∩ A3)

Events as subsets
• The union of events corresponds to "or", the intersection to "and", and the complement to "not". Examples:
– A: The patient is given drug X
– B: The patient dies
– A ∩ B: The patient is given drug X and dies
– A ∪ B: The patient is given drug X or s/he dies, or both
– Bᶜ: The patient does not die
– Aᶜ ∩ B: The patient is not given drug X, and dies

Definitions of probability
• If A is a set of outcomes of an experiment, and if, when repeating this experiment many times, the frequency of A approaches a limit p, then the probability of A is said to be p.
Or:
• The probability of A is your "belief" that A is true or will happen: it is an attribute of your knowledge about A. Over time, events given probability p should happen with frequency p.

Properties of probabilities
• When probabilities are assigned to all outcomes in a sample space, such that
– All probabilities are positive or zero.
– The probabilities add up to one.
then we say we have a probability model.
• The probability of an event A is the sum of the probabilities of the outcomes in A, and is written P(A).

Computations with probabilities
Some consequences of the set-theory results:
• P(Aᶜ) = 1 − P(A)
• When A and B are mutually exclusive, P(A ∪ B) = P(A) + P(B)
• In general, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Conditional probability
• If some information limits the sample space to a subset A, the relative probabilities for outcomes in A are the same, but they are scaled up so that they sum to 1.
• We write P(B|A) for the probability of event B given event A.
• In symbols: P(B|A) = P(A ∩ B) / P(A)

The law of total probability
• As A ∩ B and A ∩ Bᶜ are disjoint, we get P(A) = P(A ∩ B) + P(A ∩ Bᶜ)
• Together with the definition of conditional probability, this gives the law of total probability:
P(A) = P(A|B)P(B) + P(A|Bᶜ)P(Bᶜ)

Statistical independence
• If P(B|A) = P(B), we say that B is statistically independent of A.
• We can easily see that this happens if and only if P(A ∩ B) = P(A)P(B)
• Thus B is statistically independent of A if and only if A is statistically independent of B.

Bayes' theorem
• Bayes' theorem says that: P(B|A) = P(A|B)P(B) / P(A)
• This can be deduced simply from the definition of conditional probability:
P(B|A)P(A) = P(A ∩ B) = P(A|B)P(B)
• Together with the law of total probability:
P(B|A) = P(A|B)P(B) / (P(A|B)P(B) + P(A|Bᶜ)P(Bᶜ))

Example
• A disease X has a prevalence of 1%. A test for X exists, and
– If you are ill, the test is positive in 90% of cases
– If you are not ill, the test is positive in 10% of cases.
• You have a positive test: what is the probability that you have X?

Joint and marginal probabilities
Assume A1, A2, ..., An are mutually exclusive and collectively exhaustive. Assume the same for B1, B2, ..., Bm.
Then:
• P(Ai ∩ Bj) are called joint probabilities
• P(Ai) or P(Bj) are called marginal probabilities
• If every Ai is statistically independent of every Bj, then the two subdivisions are called independent attributes

Odds
• The odds for an event is its probability divided by the probability of its complement.
• What are the odds of A if P(A) = 0.8?
• What can you say about the probability of A if its odds are larger than 1?

Overinvolvement ratios
• If you want to see how A depends differently on B or C, you can compute the overinvolvement ratio: P(A|B) / P(A|C)
• Example: If the probability of getting lung cancer is 0.5% for smokers and 0.1% for non-smokers, what is the overinvolvement ratio?

Random variables
• A random variable is a probability model where each outcome is a number.
• For discrete random variables, it is meaningful to talk about the probability of each specific number.
• For continuous random variables, we only talk about the probability of intervals.

PDF and CDF
• For discrete random variables, the probability density function (PDF) is simply the probability function of each outcome.
• The cumulative distribution function (CDF) at a value x is the cumulative sum of the PDF for values up to and including x.
• Example: A die throw has outcomes 1, 2, 3, 4, 5, 6. What is the CDF at 4?

Expected value
• The expected value of a discrete random variable is the weighted average of its possible outcomes, with the probabilities as weights.
• For a random variable X with outcomes x1, x2, ..., xn with probabilities P(x1), P(x2), ..., P(xn), the expected value E(X) is
E(X) = P(x1)x1 + P(x2)x2 + ... + P(xn)xn
• Example: What is the expected value when throwing a die?

Properties of the expected value
• We can construct a new random variable Y = aX + b from a random variable X and numbers a and b. (When X has outcome x, Y has outcome ax + b, and the probabilities are the same.)
• We can then see that E(Y) = aE(X) + b
• We can also construct, for example, the random variable X·X = X²

Variance and standard deviation
• The variance of a random variable X is
σ² = Var(X) = E((X − μ)²)
where μ = E(X) is the expected value.
• The standard deviation is the square root of the variance.
• We can show that Var(aX + b) = a²Var(X)
• We can also show Var(X) = E(X²) − μ²

Example: Bernoulli random variable
• A Bernoulli random variable X takes the value 1 with probability p and the value 0 with probability 1 − p.
• E(X) = p
• Var(X) = p(1 − p)
• What is the variance for a single die throw?

Some combinatorics
• How many ways can you make ordered selections of s objects from n objects? Answer: n(n−1)(n−2)···(n−s+1)
• How many ways can you order n objects? Answer: n(n−1)···2·1 = n! ("n factorial")
• How many ways can you make unordered selections of s objects from n objects? Answer: the binomial coefficient
C(n, s) = n(n−1)···(n−s+1) / s! = n! / (s!(n−s)!)
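The disease-test example above can be checked numerically using the law of total probability and Bayes' theorem; this is a minimal sketch (the variable names are illustrative):

```python
# Bayes' theorem applied to the disease-test example:
# prevalence P(ill) = 0.01, P(positive | ill) = 0.9, P(positive | not ill) = 0.1
p_ill = 0.01
p_pos_given_ill = 0.90
p_pos_given_healthy = 0.10

# Law of total probability: P(positive)
p_pos = p_pos_given_ill * p_ill + p_pos_given_healthy * (1 - p_ill)

# Bayes' theorem: P(ill | positive)
p_ill_given_pos = p_pos_given_ill * p_ill / p_pos
print(round(p_ill_given_pos, 4))  # → 0.0833
```

Despite the positive test, the probability of having X is only about 8.3%, because the disease is rare and false positives are common.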
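The odds and overinvolvement-ratio questions on the slides are straightforward arithmetic; a quick check (the helper function is illustrative, not part of the slides):

```python
def odds(p):
    # Odds = probability divided by the probability of the complement.
    return p / (1 - p)

# Odds of A when P(A) = 0.8:
print(round(odds(0.8), 6))  # → 4.0  (odds > 1 exactly when P(A) > 0.5)

# Overinvolvement ratio P(A|B) / P(A|C) for the lung-cancer example:
# P(cancer | smoker) = 0.005, P(cancer | non-smoker) = 0.001
print(round(0.005 / 0.001, 6))  # → 5.0
```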
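The die-throw questions (the CDF at 4, the expected value, and the variance) can be answered directly from the definitions on the slides; a small sketch:

```python
# A fair die: outcomes 1..6, each with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6

# CDF at 4: sum of probabilities of outcomes up to and including 4.
cdf_4 = sum(p for x in outcomes if x <= 4)
print(round(cdf_4, 4))  # → 0.6667

# Expected value: weighted average of the outcomes.
ex = sum(p * x for x in outcomes)
print(round(ex, 4))  # → 3.5

# Variance via Var(X) = E(X²) − E(X)²
ex2 = sum(p * x * x for x in outcomes)
var = ex2 - ex ** 2
print(round(var, 4))  # → 2.9167
```

So the CDF at 4 is 2/3, the expected value is 3.5, and the variance is 35/12 ≈ 2.917.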
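The three counting formulas on the combinatorics slide correspond directly to Python's math.perm, math.factorial, and math.comb (available in Python 3.8+); a quick check of each:

```python
import math

# Ordered selections of s from n: n(n-1)···(n-s+1)
print(math.perm(5, 2))    # → 20  (5·4)

# Orderings of n objects: n!
print(math.factorial(4))  # → 24  (4·3·2·1)

# Unordered selections of s from n: n! / (s!(n-s)!)
print(math.comb(5, 2))    # → 10  (20 ordered selections / 2! orderings of each pair)
```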