PHP 2510
Random variables; some discrete distributions
Random variables - what are they?
Probability mass function; cumulative distribution function
Some discrete random variable models:
• Bernoulli
• Binomial
• Geometric
• Negative binomial
PHP 2510 – Sept 18, 2008
Random variables
A random variable is essentially a random number.
Formally, a random variable maps elements of a sample space to
the set of real numbers.
Example. Toss a fair coin 3 times. The sample space of all
possible sequences is
Ω = {hhh, hht, hth, thh, htt, tht, tth, ttt}
Examples of random variables:
X = number of heads
Y = length of the longest run of consecutive heads
Z = 1 if three heads, 0 if not.
We denote random variables by italic uppercase letters.
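The idea that a random variable is a function on the sample space can be sketched in a few lines of Python. This is only an illustration: the helper name `longest_head_run` is hypothetical, and it assumes that "number of consecutive heads" means the length of the longest run of heads.

```python
from itertools import product

def longest_head_run(seq):
    """Length of the longest run of consecutive 'h' in seq (assumed meaning of Y)."""
    best = run = 0
    for toss in seq:
        run = run + 1 if toss == "h" else 0
        best = max(best, run)
    return best

# Sample space: all 8 sequences of three tosses.
omega = ["".join(s) for s in product("ht", repeat=3)]

# Each random variable assigns a real number to every outcome.
X = {w: w.count("h") for w in omega}
Y = {w: longest_head_run(w) for w in omega}
Z = {w: 1 if w == "hhh" else 0 for w in omega}

for w in omega:
    print(w, X[w], Y[w], Z[w])
```

Each dictionary plays the role of the map from Ω to the real numbers; the randomness comes only from which outcome of Ω occurs.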
A discrete random variable takes on a finite (or countable) number
of distinct values, such as the number of illnesses in a year.
Continuous random variables take on values along a continuum,
such as time until an event, or height of a randomly selected
person.
Our focus today is on discrete random variables.
Random variables and probability mass functions
A probability mass function (PMF) describes the frequency or
probability of each value of a random variable.
Example. Let X be the number of heads in three tosses of a fair
coin. The PMF of X is
P(X = 0) = 1/8
P(X = 1) = 3/8
P(X = 2) = 3/8
P(X = 3) = 1/8
Example. Let Y be the length of the longest run of consecutive
heads in three tosses of a fair coin. The PMF of Y is
P(Y = 0) = 1/8
P(Y = 1) = 4/8
P(Y = 2) = 2/8
P(Y = 3) = 1/8
Example. Let Z = 1 if 3 heads are tossed, and Z = 0 otherwise.
The PMF of Z is
P(Z = 0) = 7/8
P(Z = 1) = 1/8
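The three PMFs can be checked by counting outcomes, since all 8 sequences of a fair coin are equally likely. The sketch below is one way to do that; `pmf_by_enumeration` and `longest_head_run` are hypothetical helper names, and Y is assumed to be the length of the longest run of heads.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_by_enumeration(rv, n_tosses=3):
    """PMF of rv over equally likely sequences of fair-coin tosses."""
    outcomes = ["".join(s) for s in product("ht", repeat=n_tosses)]
    counts = Counter(rv(w) for w in outcomes)
    return {value: Fraction(c, len(outcomes)) for value, c in sorted(counts.items())}

def longest_head_run(seq):
    # Assumed reading of "number of consecutive heads": longest block of h's.
    return max(len(run) for run in seq.split("t"))

pmf_x = pmf_by_enumeration(lambda w: w.count("h"))
pmf_y = pmf_by_enumeration(longest_head_run)
pmf_z = pmf_by_enumeration(lambda w: int(w == "hhh"))

for name, pmf in [("X", pmf_x), ("Y", pmf_y), ("Z", pmf_z)]:
    print(name, {value: str(p) for value, p in pmf.items()})
```

`Fraction` reduces to lowest terms, so 4/8 prints as 1/2 and 2/8 as 1/4; the values agree with the slides.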
PMF and CDF of a random variable
The probability mass function (PMF) is usually denoted by
p(x) = P (X = x). For a discrete variable having outcomes
$x_1, x_2, \ldots$, the PMF sums to one:
$\sum_i p(x_i) = 1$
The cumulative distribution function (CDF) is defined as
F (x) = P (X ≤ x).
Example. Let X denote the number of heads in three tosses of a
fair coin. This table shows the PMF and CDF of X:
x    p(x)   F(x)
0    1/8    1/8
1    3/8    4/8
2    3/8    7/8
3    1/8    1
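Because F(x) = P(X ≤ x), the CDF column is just the running total of the PMF column. A minimal sketch of that relationship, using the values from the table:

```python
from fractions import Fraction
from itertools import accumulate

# PMF of X = number of heads in three tosses of a fair coin.
xs = [0, 1, 2, 3]
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# F(x) is the cumulative sum of p over all values up to and including x.
cdf = list(accumulate(pmf))

for x, p, F in zip(xs, pmf, cdf):
    print(x, p, F)
```

The last CDF value is always 1, which is the same statement as the PMF summing to one.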
Bernoulli distribution
A Bernoulli random variable takes on only two values: 0 (failure)
and 1 (success).
If the probability of success is π, then the probability of failure is
1 − π:
p(1) = π
p(0) = 1 − π,
or, equivalently,
$p(x) = \pi^x (1-\pi)^{1-x}$,
for x = 0 or 1.
Example: The prevalence of HIV infection is 11%. Let X be the
HIV status of a randomly chosen person: X = 1 if HIV+; X = 0 if
HIV-. Then X has a Bernoulli distribution with π = 0.11:
P(X = 1) = 0.11,
P(X = 0) = 0.89.
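The single-formula form of the Bernoulli mass function, with exponent x on π and exponent 1 − x on (1 − π), can be checked against the HIV example. The function name `bernoulli_pmf` is a hypothetical choice for this sketch.

```python
def bernoulli_pmf(x, pi):
    """p(x) = pi**x * (1 - pi)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable takes only the values 0 and 1")
    return pi**x * (1 - pi) ** (1 - x)

prevalence = 0.11  # HIV prevalence from the example
p_pos = bernoulli_pmf(1, prevalence)  # P(X = 1)
p_neg = bernoulli_pmf(0, prevalence)  # P(X = 0)
print(p_pos, p_neg)
```

Setting x = 1 leaves only the factor π, and x = 0 leaves only 1 − π, which is why the compact formula reproduces the two separate cases.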
Binomial distribution
The binomial model for a random variable X characterizes the number
of successes in n repeated trials of an experiment that can result
in either success or failure.
Example 1. X = number of heads on 10 tosses of a fair coin
Example 2. Y = number of winning lottery tickets out of 10
million purchased
Example 3. Z = number of patients, out of 100 in a clinical trial,
who have cancer remission following an experimental treatment
Example 4. W = number of the 3 transferred embryos that
implant in a woman’s uterus following in-vitro fertilization
Mass function for binomial distribution
When trials are independent, every sequence containing exactly x
successes in n trials has the same probability, regardless of the
ordering of successes and failures.
First, any particular sequence of x successes and n − x failures occurs with probability
$\underbrace{\pi \times \cdots \times \pi}_{x \text{ successes}} \times \underbrace{(1-\pi) \times \cdots \times (1-\pi)}_{n-x \text{ failures}} = \pi^x (1-\pi)^{n-x}$
There are $\binom{n}{x}$ ways of assigning x successes in a sequence of n
trials. Then,
$P(X = x) = (\text{number of ways to have } x \text{ successes}) \times \pi^x (1-\pi)^{n-x} = \binom{n}{x} \pi^x (1-\pi)^{n-x}$.
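The mass function translates almost directly into code. The sketch below uses `math.comb` (available from Python 3.8) for the binomial coefficient; the function name `binomial_pmf` is hypothetical, and the fair-coin case with n = 10 is borrowed from Example 1.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) = C(n, x) * pi**x * (1 - pi)**(n - x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# Number of heads in 10 tosses of a fair coin.
pmf = [binomial_pmf(x, 10, 0.5) for x in range(11)]

for x, p in enumerate(pmf):
    print(x, round(p, 4))
print("total:", sum(pmf))
```

Summing over x = 0, ..., n should give 1 up to floating-point rounding, which is a quick sanity check on any PMF implementation.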
Example: Number of smokers in a sample of size n
29% of Americans are smokers. Suppose you select 3 people at
random from the population (i.e. n = 3). Let X denote the number
of smokers in the sample.
(1 = smoker, 0 = non-smoker; the last column is the probability of that particular sequence)
1st person   2nd person   3rd person   x   P(sequence)
1            1            1            3   0.02
0            1            1            2   0.06
1            1            0            2   0.06
1            0            1            2   0.06
1            0            0            1   0.15
0            1            0            1   0.15
0            0            1            1   0.15
0            0            0            0   0.36
Construct mass function for X
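$P(X = 0) = \binom{3}{0} \times .29^0 \times .71^3 = .36$
$P(X = 1) = \binom{3}{1} \times .29^1 \times .71^2 = .44$
$P(X = 2) = \binom{3}{2} \times .29^2 \times .71^1 = .18$
$P(X = 3) = \binom{3}{3} \times .29^3 \times .71^0 = .02$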
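The mass function for the smoker example can be built two ways: by adding up the probabilities of the individual sequences, or from the binomial formula. The sketch below does both so they can be compared; the variable names are illustrative only.

```python
from itertools import product
from math import comb

pi, n = 0.29, 3  # smoking prevalence and sample size from the example

# Enumerate all 2**n smoker (1) / non-smoker (0) sequences and
# accumulate each sequence's probability under its value of x.
pmf_enum = [0.0] * (n + 1)
for seq in product([1, 0], repeat=n):
    x = sum(seq)
    pmf_enum[x] += pi**x * (1 - pi) ** (n - x)

# The same PMF from the closed-form binomial formula.
pmf_formula = [comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(n + 1)]

for x in range(n + 1):
    print(x, round(pmf_enum[x], 4), round(pmf_formula[x], 4))
```

Working from unrounded sequence probabilities matters here: three sequences at about .1462 each give P(X = 1) ≈ .4386, whereas adding the rounded table entries (3 × .15) overstates it slightly.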
Quick review
If the sample contains at least one smoker, what is the probability
it contains exactly one smoker?
Ans: P(X = 1 | X ≥ 1) = P(X = 1) / (1 − P(X = 0)) = .439/.642 ≈ .68
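The review question is a conditional probability, P(X = 1 | X ≥ 1) = P(X = 1) / P(X ≥ 1). A short sketch of the arithmetic with unrounded inputs, assuming the same π = 0.29 and n = 3 as in the example:

```python
from math import comb

pi, n = 0.29, 3
p0 = (1 - pi) ** n                          # P(X = 0)
p1 = comb(n, 1) * pi * (1 - pi) ** (n - 1)  # P(X = 1)

# The event {X = 1} lies inside {X >= 1}, so the joint probability is P(X = 1).
p_one_given_at_least_one = p1 / (1 - p0)
print(round(p_one_given_at_least_one, 3))
```

With unrounded inputs the ratio comes out near .68; dividing the two-decimal figures .45 and .64 gives about .70 instead, so the result is sensitive to rounding.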
Example calculations with the binomial distribution
Example 1: Roll 5 fair dice. Let X = number of sixes. Find:
1. P (X = 0)
2. P (X > 0)
3. P (X = 2 | X > 0)
4. E(X)
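One possible way to work the four parts numerically is sketched below. It treats each die as an independent trial with success probability 1/6, and it computes E(X) from the definition of expectation as the sum of x·p(x), which is a standard definition not stated on these slides.

```python
from math import comb

n, pi = 5, 1 / 6  # five fair dice; rolling a six counts as a success

def pmf(x):
    """Binomial probability of exactly x sixes among the n dice."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

p_zero = pmf(0)                                # 1. P(X = 0)
p_positive = 1 - p_zero                        # 2. P(X > 0), by the complement
p_two_given_positive = pmf(2) / p_positive     # 3. P(X = 2 | X > 0)
mean = sum(x * pmf(x) for x in range(n + 1))   # 4. E(X) = sum of x * p(x)

print(round(p_zero, 4), round(p_positive, 4),
      round(p_two_given_positive, 4), round(mean, 4))
```

Part 3 uses the fact that {X = 2} is contained in {X > 0}, so the numerator is simply P(X = 2). The mean works out to n·π = 5/6.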
Example 2: Testing whether a die is fair.
1. A die is rolled 5 times, and a six does not come up. Is the die
fair?
(p(0) = .40)
2. A die is rolled 10 times, and a six does not come up. Is it fair?
(p(0) = .16)
3. A die is rolled 50 times, and a six comes up only twice. Is it fair?
(p(2) = .005, p(1) = .001, p(0) = .0001).
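The probabilities quoted in parentheses are binomial probabilities computed under the assumption that the die is fair (π = 1/6). The sketch below reproduces them; adding p(0), p(1) and p(2) for the 50-roll case is an extra step not shown on the slide, included here as one way to measure how unusual "two or fewer sixes" would be.

```python
from math import comb

def binomial_pmf(x, n, pi=1 / 6):
    """Probability of exactly x sixes in n rolls of a fair die."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

print(round(binomial_pmf(0, 5), 2))    # no six in 5 rolls
print(round(binomial_pmf(0, 10), 2))   # no six in 10 rolls

# 50 rolls: probability of each count from 0 to 2, and of two or fewer sixes.
for x in range(3):
    print(x, round(binomial_pmf(x, 50), 4))
p_at_most_two = sum(binomial_pmf(x, 50) for x in range(3))
print(round(p_at_most_two, 4))
```

The smaller these probabilities are, the harder it is to reconcile the observed rolls with a fair die; where to draw the line is a judgment the slide leaves open.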
Geometric distribution
The geometric distribution is useful for modeling waiting times on
a discrete scale.
• Assume independent trials where the success probability is π
• Geometric variable X characterizes the number of trials until
the first success.
• To have the first success occur on trial k, need k − 1 failures
before the first success.
Probability mass function is
$P(X = k) = (1-\pi)^{k-1} \times \pi$
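The geometric mass function is k − 1 failures followed by one success. A minimal sketch follows; the name `geometric_pmf` and the value π = 0.5 are assumptions chosen only for illustration.

```python
def geometric_pmf(k, pi):
    """P(X = k) = (1 - pi)**(k - 1) * pi for k = 1, 2, 3, ..."""
    if k < 1:
        raise ValueError("k counts trials, so it must be at least 1")
    return (1 - pi) ** (k - 1) * pi

# Illustrative value (not from the slides): pi = 0.5, e.g. waiting for the
# first head when tossing a fair coin.
probs = [geometric_pmf(k, 0.5) for k in range(1, 6)]
print(probs)

# The probabilities over k = 1, 2, ... sum to 1; a long partial sum gets close.
partial_sum = sum(geometric_pmf(k, 0.5) for k in range(1, 61))
print(partial_sum)
```

Unlike the binomial, the geometric variable has no upper bound on its value, so the PMF has infinitely many terms that shrink by a factor of 1 − π each step.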
Example. Suppose the probability of contracting HIV in a single
sexual encounter is 1 in 500. Let X denote the encounter during
which a person gets infected for the first time. Assume each
encounter is independent and carries the same risk.
The mass function is
$P(X = k) = \left(\frac{499}{500}\right)^{k-1} \left(\frac{1}{500}\right)$
Example. What is the probability of contracting HIV within the
first 3 encounters?
P(X = 1) = · · · = .002
P(X = 2) = · · · = .001996
P(X = 3) = · · · = .001992
P(X ≤ 3) = · · · = .006
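The three terms and their sum can be reproduced as follows. The sketch also checks the result against the complement rule, 1 − (1 − π)^3, which is an alternative route not shown on the slide; it relies on the same independence assumption stated in the example.

```python
pi = 1 / 500  # per-encounter risk from the example

def geometric_pmf(k, pi):
    """P(X = k): k - 1 encounters without infection, then infection on encounter k."""
    return (1 - pi) ** (k - 1) * pi

terms = [geometric_pmf(k, pi) for k in (1, 2, 3)]
p_within_3 = sum(terms)  # P(X <= 3)

# Alternative: infection within 3 encounters is the complement of
# escaping infection on all 3.
p_complement = 1 - (1 - pi) ** 3

print([round(t, 6) for t in terms], round(p_within_3, 3))
```

The complement form generalizes directly: P(X ≤ k) = 1 − (1 − π)^k, which avoids summing k separate terms.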