Random Variables
An important concept in probability.
A random variable, X, is a numerical quantity whose value is determined by a random experiment.
Examples
1. Two dice are rolled and X is the sum of the two upward faces.
2. A coin is tossed n = 3 times and X is the number of times that a head occurs.
3. We count the number of earthquakes, X, that occur in the San Francisco region from 2000 A.D. to 2050 A.D.
4. Today the TSX composite index is 11,050.00; X is the value of the index in thirty days.
Examples – R.V.’s - continued
5. A point is selected at random from a square whose sides are of length 1. X is the distance of the point from the lower left-hand corner.
6. A chord is selected at random from a circle. X is the length of the chord.
Definition – The probability function, p(x), of
a random variable, X.
For any random variable, X, and any real
number, x, we define
$p(x) = P[X = x] = P[\{X = x\}]$
where {X = x} = the set of all outcomes (event)
with X = x.
Definition – The cumulative distribution
function, F(x), of a random variable, X.
For any random variable, X, and any real
number, x, we define
$F(x) = P[X \le x] = P[\{X \le x\}]$
where {X ≤ x} = the set of all outcomes (event)
with X ≤ x.
Examples
1. Two dice are rolled and X is the sum of the two upward faces. The sample space, S, is shown below with the value of X for each outcome.

Outcome X   Outcome X   Outcome X   Outcome X   Outcome X   Outcome X
(1,1)   2   (1,2)   3   (1,3)   4   (1,4)   5   (1,5)   6   (1,6)   7
(2,1)   3   (2,2)   4   (2,3)   5   (2,4)   6   (2,5)   7   (2,6)   8
(3,1)   4   (3,2)   5   (3,3)   6   (3,4)   7   (3,5)   8   (3,6)   9
(4,1)   5   (4,2)   6   (4,3)   7   (4,4)   8   (4,5)   9   (4,6)  10
(5,1)   6   (5,2)   7   (5,3)   8   (5,4)   9   (5,5)  10   (5,6)  11
(6,1)   7   (6,2)   8   (6,3)   9   (6,4)  10   (6,5)  11   (6,6)  12
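$p(2) = P[X = 2] = P[\{(1,1)\}] = \frac{1}{36}$
$p(3) = P[X = 3] = P[\{(1,2), (2,1)\}] = \frac{2}{36}$
$p(4) = P[X = 4] = P[\{(1,3), (2,2), (3,1)\}] = \frac{3}{36}$
$p(5) = \frac{4}{36},\ p(6) = \frac{5}{36},\ p(7) = \frac{6}{36},\ p(8) = \frac{5}{36},\ p(9) = \frac{4}{36}$
$p(10) = \frac{3}{36},\ p(11) = \frac{2}{36},\ p(12) = \frac{1}{36}$
and $p(x) = 0$ for all other x.
Note: $\{X = x\} = \emptyset$ for all other x.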
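The probability function for the two-dice example can be checked by brute-force enumeration. The following is a minimal sketch, assuming fair dice so that all 36 outcomes are equally likely; the name `dice_pmf` is illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes (i, j) of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes give each value of X = i + j.
counts = Counter(i + j for i, j in outcomes)

# p(x) = (number of outcomes with X = x) / 36, kept as an exact fraction.
# Fraction reduces automatically, so 2/36 is stored as 1/18.
dice_pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, prob in dice_pmf.items():
    print(x, prob)
```

Exact fractions are used instead of floats so that the values can be compared directly with the table entries such as 6/36.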
Graph: the probability function p(x) plotted against x for x = 2, 3, …, 12 (vertical axis from 0.00 to 0.18).
The cumulative distribution function, F(x)
For any random variable, X, and any real number, x, we
define
$F(x) = P[X \le x] = P[\{X \le x\}]$
where {X ≤ x} = the set of all outcomes (event) with X ≤ x.
Note: {X ≤ x} = ∅ if x < 2. Thus F(x) = 0.
{X ≤ x} = {(1,1)} if 2 ≤ x < 3. Thus F(x) = 1/36.
{X ≤ x} = {(1,1), (1,2), (2,1)} if 3 ≤ x < 4. Thus F(x) = 3/36.
Continuing we find
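$$F(x) = \begin{cases}
0 & x < 2 \\
\frac{1}{36} & 2 \le x < 3 \\
\frac{3}{36} & 3 \le x < 4 \\
\frac{6}{36} & 4 \le x < 5 \\
\frac{10}{36} & 5 \le x < 6 \\
\frac{15}{36} & 6 \le x < 7 \\
\frac{21}{36} & 7 \le x < 8 \\
\frac{26}{36} & 8 \le x < 9 \\
\frac{30}{36} & 9 \le x < 10 \\
\frac{33}{36} & 10 \le x < 11 \\
\frac{35}{36} & 11 \le x < 12 \\
1 & 12 \le x
\end{cases}$$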
Graph: F(x) plotted against x (vertical axis from 0 to 1.2).
F(x) is a step function
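The step-function behaviour can be made concrete with a short sketch that evaluates F(x) at any real x for the two-dice example. It assumes the probability function of the dice sum can be written as p(x) = (6 − |x − 7|)/36 for x = 2, …, 12, which is only a compact restatement of the values listed earlier; the function name `dice_cdf` is illustrative.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

# Support of X (sum of two dice) and its probability function.
values = list(range(2, 13))
probs = [Fraction(6 - abs(x - 7), 36) for x in values]

# Running totals: cumulative[k] = p(2) + ... + p(values[k]).
cumulative = list(accumulate(probs))

def dice_cdf(x):
    """Return F(x) = P[X <= x] for any real number x."""
    k = bisect_right(values, x)  # number of support points that are <= x
    return cumulative[k - 1] if k > 0 else Fraction(0)

for x in (1.9, 2, 3.5, 7, 11.99, 12):
    print(x, dice_cdf(x))
```

Between support points the function is flat, and it jumps only at x = 2, 3, …, 12, which is what makes it a step function.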
2. A coin is tossed n = 3 times and X is the number of times that a head occurs.
The sample space is S = {HHH (3), HHT (2), HTH (2), THH (2), HTT (1), THT (1), TTH (1), TTT (0)}, where the value of X for each outcome is shown in brackets.
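$p(0) = P[X = 0] = P[\{TTT\}] = \frac{1}{8}$
$p(1) = P[X = 1] = P[\{HTT, THT, TTH\}] = \frac{3}{8}$
$p(2) = P[X = 2] = P[\{HHT, HTH, THH\}] = \frac{3}{8}$
$p(3) = P[X = 3] = P[\{HHH\}] = \frac{1}{8}$
$p(x) = P[X = x] = P[\emptyset] = 0$ for other x.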
Graph: the probability function p(x) plotted against x = 0, 1, 2, 3.
Graph: the cumulative distribution function F(x) plotted against x, for x from -1 to 4.
Examples – R.V.’s - continued
5. A point is selected at random from a square, S, whose sides are of length 1. X is the distance of the point from the lower left-hand corner.
An event, E, is any subset of the square, S.
P[E] = (area of E)/(area of S) = area of E, since the area of S is 1.
The probability function
$p(x) = P[X = x] = P[\{\text{set of all points a distance } x \text{ from the lower left corner}\}] = 0$
This set is a circular arc inside S, which has zero area.
Thus p(x) = 0 for all values of x. The probability function for this example is not very informative.
The Cumulative distribution function
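$F(x) = P[X \le x] = P[\{\text{set of all points within a distance } x \text{ of the lower left corner}\}]$
(Figure: the part of S within distance x of the lower left corner, drawn for 0 ≤ x ≤ 1, 1 ≤ x ≤ √2 and √2 ≤ x.)
$$F(x) = P[X \le x] = \begin{cases}
0 & x < 0 \\
\frac{\pi x^2}{4} & 0 \le x \le 1 \\
\text{Area } A & 1 \le x \le \sqrt{2} \\
1 & \sqrt{2} \le x
\end{cases}$$
Here A is the part of S within distance x of the lower left corner when 1 ≤ x ≤ √2.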
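Computation of Area A for 1 ≤ x ≤ √2
Place the lower left corner at the origin. The circle of radius x meets the top side of the square at $(\sqrt{x^2-1}, 1)$ and the right side at $(1, \sqrt{x^2-1})$. The region A splits into two right triangles, each with legs 1 and $\sqrt{x^2-1}$, and a circular sector of radius x between them.
Each triangle has angle θ at the origin, where
$\tan\theta = \sqrt{x^2-1}$, so $\theta = \tan^{-1}\sqrt{x^2-1}$
and the sector has angle $\frac{\pi}{2} - 2\theta$. Hence
$A = 2 \cdot \frac{1}{2}\sqrt{x^2-1} + \frac{1}{2}x^2\left(\frac{\pi}{2} - 2\tan^{-1}\sqrt{x^2-1}\right)$
$= \sqrt{x^2-1} + \frac{\pi}{4}x^2 - x^2\tan^{-1}\sqrt{x^2-1}$
$= \sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2$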
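$$F(x) = P[X \le x] = \begin{cases}
0 & x < 0 \\
\frac{\pi x^2}{4} & 0 \le x \le 1 \\
\sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2 & 1 \le x \le \sqrt{2} \\
1 & \sqrt{2} \le x
\end{cases}$$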
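The closed form for F(x) can be sanity-checked by simulation. The sketch below assumes the point is uniformly distributed on the unit square with the lower left corner at the origin; `square_cdf` and `empirical_cdf` are illustrative names, and the trial count and seed are arbitrary choices.

```python
import math
import random

def square_cdf(x):
    """Closed-form F(x) for the distance from the lower left corner."""
    if x < 0:
        return 0.0
    if x <= 1:
        return math.pi * x * x / 4           # quarter disc of radius x
    if x <= math.sqrt(2):
        r = math.sqrt(x * x - 1)             # leg of each right triangle
        return r + (math.pi / 4 - math.atan(r)) * x * x
    return 1.0

def empirical_cdf(x, trials=200_000, seed=0):
    """Fraction of uniformly random points in the square within distance x."""
    rng = random.Random(seed)
    hits = sum(
        math.hypot(rng.random(), rng.random()) <= x for _ in range(trials)
    )
    return hits / trials

for x in (0.5, 1.0, 1.2, 1.4):
    print(x, round(square_cdf(x), 4), round(empirical_cdf(x), 4))
```

The two columns should agree to roughly two or three decimal places; the simulation is only a check and does not replace the geometric derivation.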
Graph: F(x) plotted against x, for x from -1 to 2 (vertical axis from 0 to 1).
The probability density function, f(x), of a
continuous random variable
Suppose that X is a random variable.
Let f(x) denote a function defined for -∞ < x < ∞ with the following properties:
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P[a \le X \le b] = \int_a^b f(x)\,dx$
Then f(x) is called the probability density function of X.
The random variable, X, is called continuous.
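Graph: probability density function, f(x). The total area under the curve is $\int_{-\infty}^{\infty} f(x)\,dx = 1$, and the area between a and b is $P[a \le X \le b] = \int_a^b f(x)\,dx$.
Graph: cumulative distribution function, F(x). The area under f to the left of x is $F(x) = P[X \le x] = \int_{-\infty}^{x} f(t)\,dt$.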
Thus if X is a continuous random variable with
probability density function, f(x) then the cumulative
distribution function of X is given by:
$F(x) = P[X \le x] = \int_{-\infty}^{x} f(t)\,dt$
Also, by the fundamental theorem of calculus,
$F'(x) = \frac{dF(x)}{dx} = f(x)$
Example
A point is selected at random from a square whose sides are of
length 1. X is the distance of the point from the lower left hand
corner.
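$$F(x) = P[X \le x] = \begin{cases}
0 & x < 0 \\
\frac{\pi x^2}{4} & 0 \le x \le 1 \\
\sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2 & 1 \le x \le \sqrt{2} \\
1 & \sqrt{2} \le x
\end{cases}$$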
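Now
$$f(x) = F'(x) = \begin{cases}
0 & x < 0 \text{ or } \sqrt{2} < x \\
\frac{\pi x}{2} & 0 \le x \le 1 \\
\frac{d}{dx}\left[\sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2\right] & 1 \le x \le \sqrt{2}
\end{cases}$$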
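Also, by the product rule,
$\frac{d}{dx}\left[\sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2\right] = \frac{1}{2}\left(x^2-1\right)^{-1/2}(2x) + \frac{\pi x}{2} - 2x\tan^{-1}\sqrt{x^2-1} - x^2\frac{d}{dx}\tan^{-1}\sqrt{x^2-1}$
Now
$\frac{d}{du}\tan^{-1}u = \frac{1}{1+u^2}$
and, by the chain rule with $u = \sqrt{x^2-1}$,
$\frac{d}{dx}\tan^{-1}\sqrt{x^2-1} = \frac{1}{1+\left(x^2-1\right)} \cdot \frac{1}{2}\left(x^2-1\right)^{-1/2}(2x) = \frac{1}{x\sqrt{x^2-1}}$
Substituting, the two terms $\frac{x}{\sqrt{x^2-1}}$ cancel:
$\frac{d}{dx}\left[\sqrt{x^2-1} + \left(\frac{\pi}{4} - \tan^{-1}\sqrt{x^2-1}\right)x^2\right] = \frac{x}{\sqrt{x^2-1}} + \frac{\pi x}{2} - 2x\tan^{-1}\sqrt{x^2-1} - \frac{x}{\sqrt{x^2-1}} = \frac{\pi x}{2} - 2x\tan^{-1}\sqrt{x^2-1}$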
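Finally
$$f(x) = F'(x) = \begin{cases}
0 & x < 0 \text{ or } \sqrt{2} < x \\
\frac{\pi x}{2} & 0 \le x \le 1 \\
\left(\frac{\pi}{2} - 2\tan^{-1}\sqrt{x^2-1}\right)x & 1 \le x \le \sqrt{2}
\end{cases}$$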
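Two quick numerical checks of this density are that it integrates to 1 and that it matches the slope of F. The sketch below assumes the formulas for F(x) and f(x) given in this example; the midpoint rule, the central difference, and the step sizes are illustrative choices rather than anything prescribed by the notes.

```python
import math

SQRT2 = math.sqrt(2)

def square_cdf(x):
    """F(x) for the distance of a random point from the lower left corner."""
    if x < 0:
        return 0.0
    if x <= 1:
        return math.pi * x * x / 4
    if x <= SQRT2:
        r = math.sqrt(x * x - 1)
        return r + (math.pi / 4 - math.atan(r)) * x * x
    return 1.0

def square_pdf(x):
    """f(x) = F'(x) as derived above."""
    if x < 0 or x > SQRT2:
        return 0.0
    if x <= 1:
        return math.pi * x / 2
    return (math.pi / 2 - 2 * math.atan(math.sqrt(x * x - 1))) * x

def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

total = midpoint_integral(square_pdf, 0.0, SQRT2)   # should be close to 1
h = 1e-6
slope = (square_cdf(1.2 + h) - square_cdf(1.2 - h)) / (2 * h)
print(total, slope, square_pdf(1.2))
```

If the derivation is right, `total` is close to 1 and `slope` is close to `square_pdf(1.2)`.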
Graph of f(x): the density plotted against x, for x from -1 to 2 (vertical axis from 0 to 2).
Discrete Random Variables
Recall
p(x) = P[X = x] = the probability function of X.
This can be defined for any random variable X.
For a continuous random variable
p(x) = 0 for all values of x.
Let SX = {x | p(x) > 0}. This set is countable (i.e. it is finite or can be put into a 1-1 correspondence with the integers):
SX = {x | p(x) > 0} = {x1, x2, x3, x4, …}
Thus let
$\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i)$
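Proof (that the set SX = {x | p(x) > 0} is countable, i.e. is finite or can be put into a 1-1 correspondence with the integers):
SX = S1 ∪ S2 ∪ S3 ∪ S4 ∪ …
where
$S_i = \left\{x \,\middle|\, \frac{1}{i+1} < p(x) \le \frac{1}{i}\right\}$
i.e.
$S_1 = \left\{x \,\middle|\, \frac{1}{2} < p(x) \le 1\right\}$. Note: n(S1) < 2.
$S_2 = \left\{x \,\middle|\, \frac{1}{3} < p(x) \le \frac{1}{2}\right\}$. Note: n(S2) < 3.
$S_3 = \left\{x \,\middle|\, \frac{1}{4} < p(x) \le \frac{1}{3}\right\}$. Note: n(S3) < 4.
Thus the number of elements of Si satisfies n(Si) < i + 1 (is finite), because each element of Si has probability greater than 1/(i + 1) and the probabilities sum to at most 1.
Thus the elements of SX = S1 ∪ S2 ∪ S3 ∪ S4 ∪ … can be arranged {x1, x2, x3, x4, …}
by choosing the first elements to be the elements of S1,
the next elements to be the elements of S2,
the next elements to be the elements of S3,
the next elements to be the elements of S4,
etc.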
This allows us to write $\sum_{i=1}^{\infty} p(x_i)$ for $\sum_{x} p(x)$.
A Discrete Random Variable
A random variable X is called discrete if
$\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$
That is all the probability is accounted for by values, x,
such that p(x) > 0.
Discrete Random Variables
For a discrete random variable X the probability
distribution is described by the probability
function p(x), which has the following properties
1. $0 \le p(x) \le 1$
2. $\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
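Graph: Discrete Random Variable. p(x) is drawn as vertical bars, and $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$ is the total height of the bars between a and b.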
Continuous random variables
For a continuous random variable X the probability
distribution is described by the probability density
function f(x), which has the following properties :
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P[a \le X \le b] = \int_a^b f(x)\,dx$
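Graph: Continuous Random Variable. The probability density function, f(x), has total area $\int_{-\infty}^{\infty} f(x)\,dx = 1$ under the curve, and $P[a \le X \le b] = \int_a^b f(x)\,dx$ is the area under the curve between a and b.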
A Probability distribution is similar to a distribution
of mass.
A discrete distribution is similar to a point distribution of mass.
Positive amounts of mass are put at discrete points.
(Figure: masses p(x1), p(x2), p(x3), p(x4) placed at the points x1, x2, x3, x4.)
A continuous distribution is similar to a continuous distribution of mass.
The total mass of 1 is spread over a continuum. The mass assigned to any single point is zero, but the mass has a non-zero density, f(x).
The distribution function F(x)
This is defined for any random variable, X.
F(x) = P[X ≤ x]
Properties
1. F(-∞) = 0 and F(∞) = 1.
Since {X ≤ -∞} = ∅ and {X ≤ ∞} = S, then F(-∞) = 0 and F(∞) = 1.
2. F(x) is non-decreasing (i.e. if x1 < x2 then F(x1) ≤ F(x2)).
If x1 < x2 then {X ≤ x2} = {X ≤ x1} ∪ {x1 < X ≤ x2}, and the two events on the right are disjoint.
Thus P[X ≤ x2] = P[X ≤ x1] + P[x1 < X ≤ x2]
or F(x2) = F(x1) + P[x1 < X ≤ x2].
Since P[x1 < X ≤ x2] ≥ 0 then F(x2) ≥ F(x1).
3. F(b) – F(a) = P[a < X ≤ b].
If a < b then using the argument above
F(b) = F(a) + P[a < X ≤ b].
Thus F(b) – F(a) = P[a < X ≤ b].
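4. p(x) = P[X = x] = F(x) – F(x-)
Here
$F(x-) = \lim_{u \to x^-} F(u)$
5. If p(x) = 0 for all x (i.e. X is continuous) then F(x) is continuous.
A function F is continuous if
$F(x-) = \lim_{u \to x^-} F(u) = F(x) = \lim_{u \to x^+} F(u) = F(x+)$
One can show that F(x) = F(x+), i.e. F is always continuous from the right.
Thus p(x) = 0 implies that F(x-) = F(x) = F(x+).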
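As a worked illustration of properties 3 and 4, take the two-dice example from earlier in these notes, where F(3) = 3/36, F(6) = 15/36 and F(7) = 21/36. The choice of the interval (3, 7] and of the point x = 7 is arbitrary.

```latex
\begin{align*}
P[3 < X \le 7] &= F(7) - F(3) = \tfrac{21}{36} - \tfrac{3}{36} = \tfrac{18}{36} = \tfrac{1}{2}, \\
p(7) &= F(7) - F(7-) = \tfrac{21}{36} - \tfrac{15}{36} = \tfrac{6}{36}.
\end{align*}
```

The first line agrees with p(4) + p(5) + p(6) + p(7) = (3 + 4 + 5 + 6)/36, and the second uses F(7-) = F(6) because F is constant on 6 ≤ x < 7.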
For Discrete Random Variables
$F(x) = P[X \le x] = \sum_{u \le x} p(u)$
F(x) is a non-decreasing step function with
F(-∞) = 0 and F(∞) = 1
$p(x) = F(x) - F(x-)$ = jump in F(x) at x.
Graph: F(x) drawn as a step function for x from -1 to 4; the height of the jump at x is p(x).
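The relation p(x) = F(x) – F(x-) can be illustrated by recovering a probability function from the jumps of a step-function CDF. The sketch below assumes the coin example (X = number of heads in 3 tosses), whose CDF takes the values 1/8, 4/8, 7/8, 1 at x = 0, 1, 2, 3; the helper names `F` and `jump` are illustrative.

```python
from fractions import Fraction

# Assumed example: CDF of X = number of heads in 3 tosses of a fair coin,
# stored by its values at the support points 0, 1, 2, 3.
support = [0, 1, 2, 3]
cdf_at = {0: Fraction(1, 8), 1: Fraction(4, 8), 2: Fraction(7, 8), 3: Fraction(1)}

def F(x):
    """Step-function CDF: value at the largest support point <= x."""
    at_or_below = [s for s in support if s <= x]
    return cdf_at[at_or_below[-1]] if at_or_below else Fraction(0)

def jump(x):
    """p(x) = F(x) - F(x-), where F(x-) is the limit of F from the left."""
    strictly_below = [s for s in support if s < x]
    f_minus = cdf_at[strictly_below[-1]] if strictly_below else Fraction(0)
    return F(x) - f_minus

for x in (0, 1, 1.5, 2, 3):
    print(x, jump(x))
```

The jump is zero at any x that is not a support point, such as x = 1.5, which matches p(x) = 0 there.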
For Continuous Random Variables
$F(x) = P[X \le x] = \int_{-\infty}^{x} f(u)\,du$
F(x) is a non-decreasing continuous function with
F(-∞) = 0 and F(∞) = 1
$f(x) = F'(x)$
Graph: F(x) plotted against x, for x from -1 to 2; f(x) is the slope of F(x).
Some Important Discrete
distributions
The Bernoulli distribution
Suppose that we have an experiment that has two outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing.
Suppose that p is the probability of success (S) and q = 1 – p is the probability of failure (F).
This experiment is sometimes called a Bernoulli trial.
Let
$X = \begin{cases} 0 & \text{if the outcome is F} \\ 1 & \text{if the outcome is S} \end{cases}$
Then
$p(x) = P[X = x] = \begin{cases} q & x = 0 \\ p & x = 1 \end{cases}$
The probability distribution with this probability function is called the Bernoulli distribution.
Graph: the Bernoulli probability function, with a bar of height q = 1 - p at x = 0 and a bar of height p at x = 1.
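A Bernoulli trial is easy to simulate, which gives a quick empirical check on the probability function. The sketch below is illustrative only: the value p = 0.3, the seed, and the names `bernoulli_pmf` and `bernoulli_trial` are assumptions, not part of the notes.

```python
import random

def bernoulli_pmf(x, p):
    """p(x) for a Bernoulli trial: q = 1 - p at x = 0, p at x = 1, else 0."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0

def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

rng = random.Random(1)          # fixed seed so the run is repeatable
p = 0.3                         # assumed success probability
sample = [bernoulli_trial(p, rng) for _ in range(100_000)]
freq = sum(sample) / len(sample)
print(freq, bernoulli_pmf(1, p))
```

The observed frequency of successes should be close to p, and the frequency of failures close to q = 1 - p.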
The Binomial distribution
Suppose that we have an experiment that has two outcomes (a Bernoulli trial):
1. Success (S)
2. Failure (F)
Suppose that p is the probability of success (S) and q = 1 – p is the probability of failure (F).
Now assume that the Bernoulli trial is repeated independently n times.
Let
X = the number of successes occurring in the n trials.
Note: the possible values of X are {0, 1, 2, …, n}.
For n = 5 the outcomes, together with the values of X and the probability of each outcome, are given in the table below:

X   Probability of each outcome   Outcomes
0   $q^5$                         FFFFF
1   $pq^4$                        SFFFF FSFFF FFSFF FFFSF FFFFS
2   $p^2q^3$                      SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS
3   $p^3q^2$                      SSSFF SSFSF SSFFS SFSSF SFSFS SFFSS FSSSF FSSFS FSFSS FFSSS
4   $p^4q$                        SSSSF SSSFS SSFSS SFSSS FSSSS
5   $p^5$                         SSSSS
For n = 5 the following table gives the different possible values of X, x, and p(x) = P[X = x]:

x                 0       1         2            3            4         5
p(x) = P[X = x]   $q^5$   $5pq^4$   $10p^2q^3$   $10p^3q^2$   $5p^4q$   $p^5$
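The coefficients 1, 5, 10, 10, 5, 1 in the n = 5 table can be confirmed by listing all 32 sequences and counting successes. This is a minimal sketch; the numerical value p = 0.3 used for the final check is an arbitrary assumption.

```python
from collections import Counter
from itertools import product
from math import comb

n = 5

# All 2**5 = 32 sequences of S's and F's, tallied by their number of S's.
counts = Counter("".join(seq).count("S") for seq in product("SF", repeat=n))

for k in range(n + 1):
    # Number of sequences with exactly k successes vs. the binomial coefficient.
    print(k, counts[k], comb(n, k))

# With an assumed p, the probabilities count * p^k * q^(n-k) add up to 1.
p = 0.3
q = 1 - p
total = sum(counts[k] * p**k * q**(n - k) for k in range(n + 1))
print(total)
```

Each sequence with k successes has the same probability, so multiplying that probability by the count gives p(k).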
For general n, the outcome of the sequence of n Bernoulli trials is a sequence of S’s and F’s of length n, for example
SSFSFFSFFF…FSSSFFSFSFFS
• The value of X for such a sequence is k = the number of S’s in the sequence.
• The probability of such a sequence is $p^k q^{n-k}$ (a p for each S and a q for each F).
• There are $\binom{n}{k}$ such sequences containing exactly k S’s.
• $\binom{n}{k}$ is the number of ways of selecting the k positions for the S’s (the remaining n – k positions are for the F’s).
Thus
$p(k) = P[X = k] = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, 3, \ldots, n-1, n$
These are the terms in the expansion of $(p + q)^n$ using the Binomial Theorem:
$(p + q)^n = \binom{n}{0} p^0 q^n + \binom{n}{1} p^1 q^{n-1} + \binom{n}{2} p^2 q^{n-2} + \cdots + \binom{n}{n} p^n q^0$
For this reason the probability function
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
is called the probability function for the Binomial distribution.
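The binomial probability function translates directly into code. The sketch below uses `math.comb` from the Python standard library for the binomial coefficient and assumes x is an integer; the function name `binomial_pmf` and the test values n = 5, p = 0.5 are illustrative.

```python
from math import comb, isclose

def binomial_pmf(x, n, p):
    """Return C(n, x) * p**x * q**(n - x) for integer x, and 0 outside 0..n."""
    if not 0 <= x <= n:
        return 0.0
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Because p + q = 1, the terms of (p + q)**n add up to 1.
n, p = 5, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probabilities, sum(probabilities))
```

The sum check mirrors the Binomial Theorem argument: the probabilities are exactly the terms of $(p + q)^n$ with p + q = 1.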
Summary
We observe a Bernoulli trial (S,F) n times.
Let X denote the number of successes in the n trials.
Then X has a binomial distribution, i.e.
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
where
1. p = the probability of success (S), and
2. q = 1 – p = the probability of failure (F)
Example
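A coin is tossed n = 7 times.
Let X denote the number of heads (H) in the n = 7 trials.
Then X has a binomial distribution, with p = ½ and n = 7.
Thus
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
$= \binom{7}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{7-x}, \quad x = 0, 1, 2, \ldots, 7$
$= \binom{7}{x} \left(\tfrac{1}{2}\right)^7, \quad x = 0, 1, 2, \ldots, 7$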
x      0       1       2        3        4        5        6       7
p(x)   1/128   7/128   21/128   35/128   35/128   21/128   7/128   1/128

Graph: p(x) plotted against x = 0, 1, …, 7 (vertical axis from 0 to 0.3).
Example
If a surgeon performs “eye surgery” the chance of “success” is 85%. Suppose that the surgery is performed n = 20 times.
Let X denote the number of successful surgeries in the n = 20 trials.
Then X has a binomial distribution, with p = 0.85 and n = 20.
Thus
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
$= \binom{20}{x} (0.85)^x (0.15)^{20-x}, \quad x = 0, 1, 2, \ldots, 20$
x    p(x)      x    p(x)      x    p(x)      x    p(x)
0    0.0000    6    0.0000    12   0.0046    18   0.2293
1    0.0000    7    0.0000    13   0.0160    19   0.1368
2    0.0000    8    0.0000    14   0.0454    20   0.0388
3    0.0000    9    0.0000    15   0.1028
4    0.0000    10   0.0002    16   0.1821
5    0.0000    11   0.0011    17   0.2428
Graph: p(x) plotted against x = 0, 1, …, 20 (vertical axis from 0.0500 to 0.3000).
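The table for the surgery example can be regenerated with a few lines of code. The sketch below assumes the 20 surgeries are independent with the same success probability 0.85, as the binomial model requires; the name `surgery_pmf` is illustrative.

```python
from math import comb

n, p = 20, 0.85
q = 1 - p

# p(x) = C(20, x) * 0.85**x * 0.15**(20 - x) for x = 0, 1, ..., 20.
surgery_pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Print the table to four decimal places, one row per value of x.
for x, prob in enumerate(surgery_pmf):
    print(f"{x:2d}  {prob:.4f}")

# The most likely number of successes is the x with the largest p(x).
most_likely = max(range(n + 1), key=lambda x: surgery_pmf[x])
print("most likely x:", most_likely)
```

Values below 0.00005 print as 0.0000, which is why the first ten rows of the table show zero even though those probabilities are strictly positive.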