Probability theory
Tron Anders Moger
September 5th 2007
Some definitions:
• Sample space S=The set of all possible
outcomes of a random experiment
• Event A: Subset of outcomes in the sample
space
• Venn diagram: events can be illustrated as regions within the sample space S
Operations on events 1
• Complement: The complement of A, denoted Ā, is the set of all outcomes included in the sample space, but not in A.
• Union: The union A∪B of two events A and B is the set of outcomes included in A or in B (or in both).
Operations on events 2
• Intersection: The intersection A∩B of A and B is the set of outcomes included in both A and B.
• Mutually exclusive: If A and B do not have any common outcomes, they are mutually exclusive.
• Collectively exhaustive: A and B are collectively exhaustive if A∪B = S
Probability
• Probability is defined as the relative frequency of times an event A will occur, if an experiment is repeated many times:
P(A) = nA/n
where nA is the number of times A occurred in n repetitions.
• The probabilities of all events in the sample space sum to 1.
• Probability 0: The event cannot occur
• Probabilities have to be between 0 and 1!
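The frequency definition above can be illustrated with a short simulation (a sketch; the fair-die event and the number of repetitions are illustrative choices, not from the slides):

```python
import random

# Relative-frequency view of probability: estimate P(A) for
# A = "a fair die shows a six" by repeating the experiment n times.
random.seed(0)
n = 100_000
n_A = sum(1 for _ in range(n) if random.randint(1, 6) == 6)
p_A = n_A / n
print(p_A)  # close to 1/6 ≈ 0.167
```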
Probability postulates 1
• The complement rule: P(A) + P(Ā) = 1
• Rule of addition for mutually exclusive events: P(A∪B) = P(A) + P(B)
Probability postulates 2
• General rule of addition, for events that are
not mutually exclusive:
P(A∪B) = P(A) + P(B) − P(A∩B)
Conditional probability
• If the event B has already occurred, the conditional probability of A given B is:
P(A|B) = P(A∩B) / P(B)
• This can be interpreted as follows: the knowledge that B has occurred limits the sample space to B. The relative probabilities are the same, but they are scaled up so that they sum to 1.
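As a sketch of the conditional-probability formula (the fair-die events A and B are illustrative choices, not from the slides):

```python
# Conditional probability on a fair die: B = "even", A = "six".
outcomes = {1, 2, 3, 4, 5, 6}
A = {6}
B = {2, 4, 6}

def p(event):
    # Equally likely outcomes: P(E) = |E| / |S|
    return len(event) / len(outcomes)

p_A_given_B = p(A & B) / p(B)  # P(A|B) = P(A ∩ B) / P(B)
print(p_A_given_B)  # 1/3: knowing B restricts the sample space to B
```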
Probability postulates 3
• Multiplication rule: For general outcomes A
and B:
P(A∩B) = P(A|B)·P(B) = P(B|A)·P(A)
• Independence: A and B are statistically independent if P(A∩B) = P(A)·P(B)
– This implies that
P(A|B) = P(A∩B)/P(B) = P(A)·P(B)/P(B) = P(A)
Probability postulates 4
• Assume that the events A1, A2, ..., An are independent. Then
P(A1∩A2∩...∩An) = P(A1)·P(A2)·...·P(An)
• This rule is very handy when all P(Ai) are equal.
Example: Doping tests
• Let’s say a doping test has 0.2% probability of
being positive when the athlete is not using
steroids
• The athlete is tested 50 times
• What is the probability that at least one test is
positive, even though the athlete is clean?
• Define A=at least one test is positive
• By the complement rule and the rule of independence (50 equal terms):
P(A) = 1 − P(Ā) = 1 − (1 − 0.002)·...·(1 − 0.002)
= 1 − (1 − 0.002)^50 ≈ 0.095 = 9.5%
Example: Andy’s exams
• Define A = Andy passes math, B = Andy passes chemistry
• Let P(A) = 0.4, P(B) = 0.35, P(A∩B) = 0.12
• Are A and B independent?
• 0.4·0.35 = 0.14 ≠ 0.12, so no, they are not
• Probability that Andy fails both subjects? By the complement rule and the general rule of addition:
P(Ā∩B̄) = 1 − P(A∪B) = 1 − (P(A) + P(B) − P(A∩B))
= 1 − (0.4 + 0.35 − 0.12) = 0.37
The law of total probability - twins
• A = twins have the same gender
• B = twins are monozygotic
• B̄ = twins are heterozygotic
• What is P(A)?
• The law of total probability:
P(A) = P(A|B)·P(B) + P(A|B̄)·P(B̄)
• For twins: P(B) = 1/3, P(B̄) = 2/3
P(A) = 1·1/3 + 1/2·2/3 = 2/3
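The twin calculation via the law of total probability (numbers from the slides; exact fractions keep the arithmetic clean):

```python
from fractions import Fraction

# P(same gender) = P(A|mono)P(mono) + P(A|hetero)P(hetero)
p_mono = Fraction(1, 3)
p_same_given_mono = Fraction(1)        # monozygotic twins share gender
p_same_given_hetero = Fraction(1, 2)
p_same = p_same_given_mono * p_mono + p_same_given_hetero * (1 - p_mono)
print(p_same)  # 2/3
```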
Bayes theorem
P(B|A) = P(B)·P(A|B) / [P(B)·P(A|B) + P(B̄)·P(A|B̄)]
• Frequently used to estimate the probability that a patient is ill on the basis of a diagnostic test
• Incorrect diagnoses are common for rare diseases
Example: Cervical cancer
• B=Cervical cancer
• A=Positive test
• P(B) = 0.0001, P(A|B) = 0.9, P(A|B̄) = 0.001
P(B|A) = P(A|B)·P(B) / [P(A|B)·P(B) + P(A|B̄)·P(B̄)]
= 0.9·0.0001 / (0.9·0.0001 + 0.001·0.9999) ≈ 0.08
• Only 8% of women with positive tests are ill
• Usefulness of the test is highly dependent on disease prevalence and the quality of the test:
P(B)     P(A|B̄)    P(B|A)
0.0001   0.001     0.08
0.0001   0.0001    0.47
0.001    0.001     0.47
0.001    0.0001    0.90
0.01     0.001     0.90
0.01     0.0001    0.99
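The values in the table follow directly from Bayes' theorem (a sketch; the helper-function name is my own):

```python
# P(B|A) = P(A|B)P(B) / (P(A|B)P(B) + P(A|~B)P(~B))
def p_ill_given_positive(p_B, p_A_given_notB, p_A_given_B=0.9):
    num = p_A_given_B * p_B
    return num / (num + p_A_given_notB * (1 - p_B))

# Sweep prevalence P(B) and false-positive rate P(A|~B)
for p_B in (0.0001, 0.001, 0.01):
    for p_fp in (0.001, 0.0001):
        print(p_B, p_fp, round(p_ill_given_positive(p_B, p_fp), 2))
```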
Odds:
• The odds for an event are the probability of the event divided by the probability of its complement:
Odds = P(A) / (1 − P(A)) = P(A) / P(Ā)
• From horse racing: Odds 1:9 means that the
horse wins in 1 out of 10 races; P(A)=0.1
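A small sketch of the odds/probability conversion (the function names are my own):

```python
def prob_to_odds(p):
    # Odds = P(A) / (1 - P(A))
    return p / (1 - p)

def odds_to_prob(odds):
    # Invert: P(A) = odds / (1 + odds)
    return odds / (1 + odds)

# Horse-racing example from the slides: odds 1:9 means P(A) = 0.1
print(prob_to_odds(0.1))    # 1/9 ≈ 0.111
print(odds_to_prob(1 / 9))  # 0.1
```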
Random variables
• A random variable takes on numerical
values determined by the outcome of a
random experiment.
• A discrete random variable takes on a
countable number of values, with a certain
probability attached to each specific value.
• Continuous random variables can take on any value in an interval; it is only meaningful to talk about probabilities of intervals.
PDF and CDF
• For discrete random variables, the probability
density function (PDF) is simply the same as the
probability function of each outcome, denoted
P(x).
• The cumulative distribution function (CDF) at a value x0 is the cumulative sum of the PDF for values up to and including x0:
F(x0) = Σ_{x ≤ x0} P(x)
• Sum over all outcomes is always 1 (why?).
• For a single dice throw, the CDF at 4 is
1/6+1/6+1/6+1/6=4/6=2/3
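The die example above can be written out directly (a sketch using exact fractions):

```python
from fractions import Fraction

# PDF for one throw of a fair die: P(x) = 1/6 for x = 1..6
pdf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x0):
    # F(x0) = sum of P(x) for all x <= x0
    return sum(p for x, p in pdf.items() if x <= x0)

print(cdf(4))  # 2/3
print(cdf(6))  # 1: the PDF sums to 1 over all outcomes
```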
Expected value
• The expected value of a discrete random variable
is defined as the following sum:
μ = E(X) = Σx x·P(x)
• The sum is over all possible values/outcomes of
the variable
• For a single dice throw, the expected value is
E(X)=1*1/6+2*1/6+...+6*1/6=3.5
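The die-throw expectation as code (exact fractions avoid rounding):

```python
from fractions import Fraction

# E(X) = sum over x of x * P(x) for a fair die
pdf = {x: Fraction(1, 6) for x in range(1, 7)}
expected = sum(x * p for x, p in pdf.items())
print(expected)  # 7/2, i.e. 3.5
```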
Properties of the expected value
• We can construct a new random variable
Y=aX+b from a random variable X and
numbers a and b. (When X has outcome x,
Y has outcome ax+b, and the probabilities
are the same).
• We can then see that E(Y) = aE(X)+b
• We can also construct, for example, the random variable X·X = X²
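The linearity property E(aX+b) = aE(X)+b can be checked for a die throw (a = 2, b = 3 are illustrative choices, not from the slides):

```python
from fractions import Fraction

# Y = 2X + 3 for a fair die X; verify E(Y) = 2E(X) + 3
pdf = {x: Fraction(1, 6) for x in range(1, 7)}
e_X = sum(x * p for x, p in pdf.items())
e_Y = sum((2 * x + 3) * p for x, p in pdf.items())
print(e_Y == 2 * e_X + 3)  # True
```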
Variance and standard deviation
• The variance of a stochastic variable X is
 2  Var( X )  E(( X   )2 )  E( X 2 )   2
• The standard deviation is the square root of
the variance.
• We can show that Var(aX + b) = a²·Var(X)
• Hence, constants do not have any variance
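Both variance identities can be checked for a die throw (the scaling a = 2, b = 3 is an illustrative choice):

```python
from fractions import Fraction

# Var(X) = E(X^2) - mu^2 for a fair die
pdf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pdf.items())                 # 7/2
var = sum(x**2 * p for x, p in pdf.items()) - mu**2     # 35/12
print(var)  # 35/12

# Var(aX + b) = a^2 Var(X): transform outcomes by y = 2x + 3
pdf2 = {2 * x + 3: p for x, p in pdf.items()}
mu2 = sum(y * p for y, p in pdf2.items())
var2 = sum(y**2 * p for y, p in pdf2.items()) - mu2**2
print(var2 == 4 * var)  # True
```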
Example:
• Let E(X) = μX and Var(X) = σX²
• What is the expected value and variance of
Y = (X − μX)/σX ?
E(Y) = E((X − μX)/σX) = (E(X) − μX)/σX = (μX − μX)/σX = 0
Var(Y) = Var((X − μX)/σX) = (1/σX²)·Var(X) = 1
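The standardization Y = (X − μX)/σX can be checked numerically for a fair die (the die example is my own; only the identities E(Y) = 0 and Var(Y) = 1 come from the slides):

```python
import math

# Standardize a fair die throw and verify E(Y) = 0, Var(Y) = 1
outcomes = range(1, 7)
p = 1 / 6
mu = sum(x * p for x in outcomes)
sigma = math.sqrt(sum((x - mu) ** 2 * p for x in outcomes))

e_Y = sum((x - mu) / sigma * p for x in outcomes)
var_Y = sum(((x - mu) / sigma - e_Y) ** 2 * p for x in outcomes)
print(abs(e_Y) < 1e-9, abs(var_Y - 1) < 1e-9)  # True True
```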
Next week:
• So far: Only considered discrete random
variables
• Next week: Continuous random variables
• Common probability distributions for
random variables
• Normal distribution