PHP 2510
Random variables; some discrete distributions
Random variables - what are they?
Probability mass function; cumulative distribution function
Some discrete random variable models:
• Bernoulli
• Binomial
• Geometric
• Negative binomial
PHP 2510 – Sept 18, 2008
Random variables
A random variable is essentially a random number.
Formally, a random variable maps elements of a sample space to
the set of real numbers.
Example. Toss a fair coin 3 times. The sample space of all
possible sequences is
Ω = {hhh, hht, hth, thh, htt, tht, tth, ttt}
Examples of random variables:
X = number of heads
Y = length of the longest run of consecutive heads
Z = 1 if three heads, 0 if not.
We denote random variables by italic uppercase letters.
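To make the "random variable as a map" idea concrete, here is a minimal sketch that enumerates Ω for three tosses and evaluates X, Y, and Z on every outcome. It assumes Y means the length of the longest run of consecutive heads (the reading that matches the PMF given for Y later in these notes); the function names are illustrative and not part of the course material.

```python
from itertools import product

# Sample space: every sequence of h/t of length 3 (8 outcomes)
omega = ["".join(seq) for seq in product("ht", repeat=3)]

def num_heads(outcome):
    """X: number of heads in the sequence."""
    return outcome.count("h")

def longest_head_run(outcome):
    """Y: length of the longest run of consecutive heads (assumed reading)."""
    return max(len(run) for run in outcome.split("t"))

def all_heads(outcome):
    """Z: 1 if all three tosses are heads, 0 otherwise."""
    return 1 if outcome == "hhh" else 0

# Each random variable assigns one real number to each outcome
for outcome in omega:
    print(outcome, num_heads(outcome), longest_head_run(outcome), all_heads(outcome))
```

Each function takes an element of Ω and returns a number, which is all the formal definition asks for.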
A discrete random variable takes on a finite (or countably infinite) number
of distinct values, such as the number of illnesses in a year.
Continuous random variables take on values along a continuum,
such as time until an event, or height of a randomly selected
person.
Our focus today is on discrete random variables.
Random variables and probability mass functions
A probability mass function (PMF) describes the frequency or
probability of each value of a random variable.
Example. Let X be the number of heads in three tosses of a fair
coin. The PMF of X is
P(X = 0) = 1/8
P(X = 1) = 3/8
P(X = 2) = 3/8
P(X = 3) = 1/8
Example. Let Y be the length of the longest run of consecutive heads
in three tosses of a fair coin. The PMF of Y is
P(Y = 0) = 1/8
P(Y = 1) = 4/8
P(Y = 2) = 2/8
P(Y = 3) = 1/8
Example. Let Z = 1 if 3 heads are tossed, and Z = 0 otherwise.
The PMF of Z is
P(Z = 0) = 7/8
P(Z = 1) = 1/8
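The three PMFs above can be checked by counting outcomes, since each of the 8 sequences has probability 1/8. The following is a sketch under that equal-likelihood assumption; `pmf_by_enumeration` is a hypothetical helper name, and Y is taken to be the longest run of consecutive heads.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_by_enumeration(random_variable, n_tosses=3):
    """Return {value: probability} for a function of n fair coin tosses."""
    outcomes = ["".join(seq) for seq in product("ht", repeat=n_tosses)]
    # Count how many equally likely outcomes map to each value
    counts = Counter(random_variable(o) for o in outcomes)
    return {value: Fraction(count, len(outcomes))
            for value, count in sorted(counts.items())}

pmf_x = pmf_by_enumeration(lambda o: o.count("h"))
pmf_y = pmf_by_enumeration(lambda o: max(len(run) for run in o.split("t")))
pmf_z = pmf_by_enumeration(lambda o: int(o == "hhh"))

for name, pmf in [("X", pmf_x), ("Y", pmf_y), ("Z", pmf_z)]:
    print(name, {value: str(prob) for value, prob in pmf.items()})
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the eighths on the slide (fractions print in lowest terms, e.g. 4/8 appears as 1/2).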
PMF and CDF of a random variable
The probability mass function (PMF) is usually denoted by
p(x) = P(X = x). For a discrete variable having outcomes
$x_1, x_2, \ldots$, the PMF sums to one:
$$\sum_i p(x_i) = 1$$
The cumulative distribution function (CDF) is defined as
F(x) = P(X ≤ x).
Example. Let X denote the number of heads in three tosses of a
fair coin. This table shows the PMF and CDF of X:
x    p(x)    F(x)
0    1/8     1/8
1    3/8     4/8
2    3/8     7/8
3    1/8     1
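For a discrete variable, F(x) is the running total of p over all values up to and including x. A minimal sketch of that cumulative sum for the three-toss example follows; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# F(x) = P(X <= x): add up p(x_i) for every x_i <= x
cdf = list(accumulate(pmf))

for x, p, big_f in zip(values, pmf, cdf):
    print(x, p, big_f)
```

The last CDF entry is always 1, because the PMF sums to one.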
Bernoulli distribution
A Bernoulli random variable takes on only two values: 0 (failure)
and 1 (success).
If the probability of success is π, then the probability of failure is
1 − π:
p(1) = π
p(0) = 1 − π,
or
$p(x) = \pi^x (1 - \pi)^{1-x}$, for x = 0 or 1.
Example: The prevalence of HIV infection is 11%. Let X be the
HIV status of a randomly chosen person: X = 1 if HIV+; X = 0 if
HIV-. Then X has a Bernoulli distribution with
P(X = 1) = 0.11,
P(X = 0) = 0.89.
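As a quick check of the single-formula form of the Bernoulli PMF, the sketch below evaluates π^x (1 − π)^(1 − x) at x = 1 and x = 0 for the prevalence example. The function name `bernoulli_pmf` is an illustrative choice.

```python
def bernoulli_pmf(x, pi):
    """p(x) = pi^x * (1 - pi)^(1 - x), defined only for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable takes only the values 0 and 1")
    return pi**x * (1 - pi)**(1 - x)

prevalence = 0.11
p_positive = bernoulli_pmf(1, prevalence)  # exponent on (1 - pi) is 0
p_negative = bernoulli_pmf(0, prevalence)  # exponent on pi is 0

print(p_positive, p_negative)
```

The exponents act as switches: exactly one of the two factors equals 1 for each value of x.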
Binomial distribution
The binomial model for a random variable X characterizes the number
of successes in n repeated trials of an experiment, each of which
results in either success or failure.
Example 1. X = number of heads on 10 tosses of a fair coin
Example 2. Y = number of winning lottery tickets out of 10
million purchased
Example 3. Z = number of patients, out of 100 in a clinical trial, who
have cancer remission following an experimental treatment
Example 4. W = number of the 3 transferred embryos that
implant in a woman’s uterus following in-vitro fertilization
Mass function for binomial distribution
When trials are independent and each has success probability π, every
sequence with x successes in n trials has the same probability,
regardless of the ordering of successes and failures.
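First, any particular sequence with x successes and n − x failures occurs with probability
$$\underbrace{\pi \times \pi \times \cdots \times \pi}_{x \text{ successes}} \times \underbrace{(1 - \pi) \times (1 - \pi) \times \cdots \times (1 - \pi)}_{n - x \text{ failures}} = \pi^x (1 - \pi)^{n-x}$$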
There are $\binom{n}{x}$ ways of assigning x successes in a sequence of n
trials. Then,
$$P(X = x) = (\text{number of ways to have } x \text{ successes}) \times \pi^x \times (1 - \pi)^{n-x} = \binom{n}{x} \pi^x (1 - \pi)^{n-x}.$$
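The binomial mass function translates directly into code. The sketch below uses Python's `math.comb` for the number of orderings and, as an illustration, evaluates the PMF for 10 tosses of a fair coin (Example 1 above); `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) = C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 10, 0.5  # 10 tosses of a fair coin
pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]
total = sum(pmf)  # should equal 1 up to floating-point error

for x, p in enumerate(pmf):
    print(x, round(p, 4))
print("sum:", total)
```

Summing over x = 0, ..., n is a useful sanity check that the function is a valid PMF.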
Example: Number of smokers in a sample of size n
29% of Americans are smokers. Suppose you select 3 people at
random from the population (i.e. n = 3). Let X denote the number
of smokers in the sample.
(1 = smoker, 0 = non-smoker)
1st person   2nd person   3rd person   x   P(sequence)
1            1            1            3   0.02
0            1            1            2   0.06
1            1            0            2   0.06
1            0            1            2   0.06
1            0            0            1   0.15
0            1            0            1   0.15
0            0            1            1   0.15
0            0            0            0   0.36
Construct mass function for X
$P(X = 0) = \binom{3}{0} \times .29^0 \times .71^3 = .36$
$P(X = 1) = \binom{3}{1} \times .29^1 \times .71^2 = .44$
P(X = 2) =
P(X = 3) =
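One way to check the mass function for the smoker example is to build it twice: once by adding up the probabilities of the eight individual sequences, and once from the binomial formula. The sketch below assumes the three people are sampled independently with π = 0.29; all variable names are illustrative.

```python
from itertools import product
from math import comb

pi = 0.29  # probability that a randomly chosen person smokes
n = 3      # sample size

# Route 1: add the probability of every 0/1 sequence to the bin for its x
pmf_enum = {x: 0.0 for x in range(n + 1)}
for seq in product([0, 1], repeat=n):
    x = sum(seq)  # number of smokers in this sequence
    pmf_enum[x] += pi**x * (1 - pi)**(n - x)

# Route 2: the binomial formula
pmf_formula = {x: comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)}

for x in range(n + 1):
    print(x, pmf_enum[x], pmf_formula[x])
```

The two routes agree because the binomial coefficient simply counts how many sequences land in each bin.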
Quick review
If the sample contains at least one smoker, what is the probability
it contains exactly one smoker?
Ans: P(X = 1 | X ≥ 1) = P(X = 1) / (1 − P(X = 0)) = .4386 / .6421 ≈ .68
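The review question is a conditional probability: P(X = 1 | X ≥ 1) = P(X = 1) / P(X ≥ 1). A minimal sketch of that calculation with unrounded binomial probabilities follows (n = 3, π = 0.29; names are illustrative). Rounding the intermediate probabilities to two decimals before dividing can shift the result by a couple of hundredths.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial(n, pi) variable."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 3, 0.29
p_at_least_one = 1 - binomial_pmf(0, n, pi)  # P(X >= 1)
p_one_given_some = binomial_pmf(1, n, pi) / p_at_least_one

print(round(p_one_given_some, 3))
```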
Example calculations with the binomial distribution
Example 1: Roll 5 fair dice. Let X = number of sixes. Find:
1. P (X = 0)
2. P (X > 0)
3. P (X = 2 | X > 0)
4. E(X)
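The four quantities in Example 1 can be computed as sketched below, treating X as binomial with n = 5 and π = 1/6. E(X) is computed here as the probability-weighted sum of the values of X, which is the standard definition for a discrete variable; names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial(n, pi) variable."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 5, 1 / 6  # five fair dice, success = rolling a six

p_zero = binomial_pmf(0, n, pi)                              # 1. P(X = 0)
p_positive = 1 - p_zero                                      # 2. P(X > 0)
p_two_given_positive = binomial_pmf(2, n, pi) / p_positive   # 3. P(X = 2 | X > 0)
expected = sum(x * binomial_pmf(x, n, pi) for x in range(n + 1))  # 4. E(X)

print(p_zero, p_positive, p_two_given_positive, expected)
```

For item 3, the event {X = 2} is contained in {X > 0}, so the numerator of the conditional probability is just P(X = 2).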
Example 2: Testing whether a die is fair.
1. A die is rolled 5 times, and a six does not come up. Is the die
fair?
(p(0) = .40)
2. A die is rolled 10 times, and a six does not come up. Is it fair?
(p(0) = .16)
3. A die is rolled 50 times, and six only comes up twice. Is it fair?
(p(2) = .005, p(1) = .001, p(0) = .0001).
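The probabilities quoted in Example 2 can be reproduced under the assumption that the die is fair (π = 1/6). The sketch below also adds up p(0) + p(1) + p(2) for the 50-roll case, which is one reasonable way to summarize how surprising "at most two sixes" would be; the variable names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial(n, pi) variable."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

fair = 1 / 6  # probability of a six if the die is fair

p0_n5 = binomial_pmf(0, 5, fair)    # no six in 5 rolls
p0_n10 = binomial_pmf(0, 10, fair)  # no six in 10 rolls
# two or fewer sixes in 50 rolls
p_le2_n50 = sum(binomial_pmf(x, 50, fair) for x in range(3))

print(round(p0_n5, 2), round(p0_n10, 2), round(p_le2_n50, 4))
```

The smaller these probabilities are, the harder it is to reconcile the observed rolls with a fair die.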
Geometric distribution
The geometric distribution is useful for modeling waiting times on
a discrete scale.
• Assume independent trials, each with success probability π
• A geometric variable X is the number of trials up to and including
the first success.
• To have the first success occur on trial k, need k − 1 failures
before the first success.
Probability mass function is
$$P(X = k) = (1 - \pi)^{k-1} \times \pi$$
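A minimal sketch of the geometric mass function follows. The value π = 0.25 is an arbitrary illustration, not taken from the notes, and the partial sum over the first 200 values of k is only a numerical check that the probabilities add up to (essentially) one.

```python
def geometric_pmf(k, pi):
    """P(X = k) = (1 - pi)^(k - 1) * pi: k - 1 failures, then one success."""
    if k < 1:
        raise ValueError("the first success occurs on trial 1 or later")
    return (1 - pi) ** (k - 1) * pi

pi = 0.25  # illustrative success probability (assumption)
first_few = [geometric_pmf(k, pi) for k in (1, 2, 3)]
partial_sum = sum(geometric_pmf(k, pi) for k in range(1, 201))

print(first_few, partial_sum)
```

Each successive probability is the previous one multiplied by (1 − π), so the PMF decays geometrically, which is where the distribution gets its name.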
Example. Probability of contracting HIV in a single sexual
encounter is 1 in 500. Let X denote the encounter during which a
person gets infected for the first time. Assume each encounter is
independent and carries the same risk.
The mass function is
$$P(X = k) = \left(\frac{499}{500}\right)^{k-1} \left(\frac{1}{500}\right)$$
Example. What is the probability of contracting HIV within the
first 3 encounters?
P(X = 1) = · · · = .002
P(X = 2) = · · · = .001996
P(X = 3) = · · · = .001992
P(X ≤ 3) = · · · = .006
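The probability of infection within the first three encounters can be computed two ways: by summing the geometric PMF at k = 1, 2, 3, or as the complement of three consecutive failures. The sketch below does both under the stated assumptions (independent encounters, risk 1/500 each); variable names are illustrative.

```python
pi = 1 / 500  # per-encounter probability of infection

# Route 1: sum P(X = k) for k = 1, 2, 3
pmf_first_three = [(1 - pi) ** (k - 1) * pi for k in (1, 2, 3)]
p_within_three = sum(pmf_first_three)

# Route 2: complement of "no infection in three encounters"
p_complement = 1 - (1 - pi) ** 3

print(pmf_first_three, p_within_three, p_complement)
```

The complement route generalizes neatly: P(X ≤ k) = 1 − (1 − π)^k for any k, which avoids summing many small terms.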