CHAPTER 3
DISCRETE RANDOM
VARIABLES AND
PROBABILITY DISTRIBUTIONS
LEARNING OBJECTIVES
• Understand random variables
• Calculate means and variances
• Determine probabilities from cumulative
distribution functions
• Understand the assumptions of the discrete
probability distributions
• Calculate probabilities, determine means and
variances for each of the discrete probability
distributions
Concept of Random Variable
• Summarize the outcome from a random
experiment by a simple number
• Used to describe the possible outcomes
• Useful to associate a number with each
outcome in the sample space
• Variable that associates a number with
the outcome of a random experiment is
referred to as a random variable
Definition of Random Variables
• Function that assigns a real number to
each outcome in the sample space
• Denoted by an uppercase letter such as
X
• Value of the random variable is denoted
by a lowercase letter such as x
Two Types of Random Variables
• Discrete and Continuous
• Sometimes a measurement can assume any
value in an interval of real numbers
• Said to be a continuous random variable
– Examples: Electrical current, length, pressure,
temperature, time, voltage, weight
• Sometimes the measurement is limited to
integers
• Said to be a discrete random variable
– Examples: No. of scratches on a surface, No. of calls
received per day, and No. of transmitted bits received
in error
Probability Distribution
• Describes probabilities associated with
the possible values of X
• Discrete Case
– Specified by just a list of the possible
values along with the probability of each
• Continuous Case
– A formula is used to describe the probability distribution
• Convenient to express the probability in
terms of a formula
Discrete Probability
Distribution
• Definition
– For a discrete random variable X with possible values $x_1, x_2, x_3, \ldots, x_n$, the probability mass function (PMF) is
• $f(x_i) = P(X = x_i)$
• $f(x_i) \ge 0$ for all $x_i$
• $\sum_{i=1}^{n} f(x_i) = 1$
Discrete Probability Distribution
Examples
• To better understand the PMF, consider
• Example 1
– Tossing a 6-sided die
– f(x)=1/6 for X=1,2,3,4,5,6
• Example 2
– Check whether the following can serve as
probability distribution
– 1. $f(x) = (x-2)/2$ for x = 1, 2, 3, 4
– 2. $h(x) = x^2/25$ for x = 0, 1, 2, 3, 4
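The two PMF conditions (non-negative values that sum to 1) can be checked mechanically for both candidates in Example 2. The Python sketch below is illustrative only: the helper name `is_valid_pmf` is an assumption and not part of the slides, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

def is_valid_pmf(probs):
    """Return True if every value is non-negative and the values sum to 1."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

# Candidate 1: f(x) = (x - 2)/2 for x = 1, 2, 3, 4
f_values = [Fraction(x - 2, 2) for x in range(1, 5)]
# Candidate 2: h(x) = x^2/25 for x = 0, 1, 2, 3, 4
h_values = [Fraction(x * x, 25) for x in range(0, 5)]

print(is_valid_pmf(f_values))  # False: f(1) = -1/2 is negative
print(is_valid_pmf(h_values))  # False: the values sum to 30/25
```

Under these checks neither candidate can serve as a probability distribution, while the die PMF of Example 1 passes.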
Class Problem
• The sample space of a random experiment is {a, b, c, d, e, f}, and each outcome is equally likely. A random variable is defined as follows:

Outcome   a    b    c    d    e    f
x         0    0    1.5  1.5  2    3

• Determine the probability mass function of X
a) f(0) = P(X=0) = 1/6 + 1/6 = 1/3
b) f(1.5) = P(X=1.5) = 1/6 + 1/6 = 1/3
c) f(2) = P(X=2) = 1/6
d) f(3) = P(X=3) = 1/6
Cumulative Distribution Function
• Useful to provide cumulative probabilities such as $P(X \le x)$
• Cumulative distribution function (CDF) of a discrete random variable X, denoted as F(x), is
$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
• F(x) has the following properties:
– $F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
– $0 \le F(x) \le 1$
– If $x \le y$, then $F(x) \le F(y)$
Cumulative Distribution Function Example
• Determine the cumulative distribution of
the random variable for which each
outcome is equally likely in the following
sample space:
Outcome   a    b    c    d    e    f
x         0    0    1.5  1.5  2    3
Solution

Outcome   a    b    c    d    e    f
x         0    0    1.5  1.5  2    3
f(x)      1/6  1/6  1/6  1/6  1/6  1/6

$$
F(x) =
\begin{cases}
0, & x < 0 \\
1/3, & 0 \le x < 1.5 \\
2/3, & 1.5 \le x < 2 \\
5/6, & 2 \le x < 3 \\
1, & 3 \le x
\end{cases}
$$
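The step-function CDF of this example can be reproduced by summing f(x_i) over every support point x_i ≤ x. The following Python sketch is one possible way to do that; the names `pmf` and `cdf` are illustrative assumptions, not notation from the slides.

```python
from fractions import Fraction

# PMF of the example: outcomes a, b map to 0; c, d map to 1.5; e to 2; f to 3
pmf = {0: Fraction(1, 3), 1.5: Fraction(1, 3), 2: Fraction(1, 6), 3: Fraction(1, 6)}

def cdf(x):
    """F(x) = sum of f(x_i) over all support points x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

# Evaluate F at points inside each interval of the piecewise definition
for x in (-1, 0, 1, 1.5, 2, 2.5, 3, 10):
    print(x, cdf(x))
```

Evaluating between support points (for example at x = 1 or x = 2.5) shows that F stays constant until the next possible value of X is reached.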
Calculating Probabilities from
PMF and CDF
• We want to determine
– Probabilities from cumulative distribution functions
– Cumulative distribution functions from probability mass functions
– Vice versa
• Consider the following example
Example
• Consider the following PMF

y      1    2    3    4
f(y)   0.4  0.3  0.2  0.1

[Figure: plot of f(y) against y]

• Determine the CDF
$$
F(y) =
\begin{cases}
0, & y < 1 \\
0.4, & 1 \le y < 2 \\
0.7, & 2 \le y < 3 \\
0.9, & 3 \le y < 4 \\
1, & 4 \le y
\end{cases}
$$

[Figure: plot of F(y) against y]
Example
• Consider the following CDF
$$
F(y) =
\begin{cases}
0, & y < 1 \\
0.4, & 1 \le y < 2 \\
0.7, & 2 \le y < 3 \\
0.9, & 3 \le y < 4 \\
1, & 4 \le y
\end{cases}
$$
• Note that Y can take on 1, 2, 3, or 4.
• Determine the PMF
– P(Y=1)=0.4-0.0=0.4
– P(Y=2)=0.7-0.4=0.3, P(Y=3)=0.9-0.7=0.2,
P(Y=4)=1.0-0.9=0.1
– Hence,

y      1    2    3    4
f(y)   0.4  0.3  0.2  0.1
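Both directions of this conversion are short computations: the CDF at each possible value is a running sum of the PMF, and the PMF is the jump in the CDF at each possible value. A minimal Python sketch, with variable names chosen here for illustration:

```python
from itertools import accumulate

values = [1, 2, 3, 4]
pmf = [0.4, 0.3, 0.2, 0.1]

# PMF -> CDF: running sum of the probabilities
cdf = list(accumulate(pmf))

# CDF -> PMF: difference between consecutive CDF values
recovered = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]

# Rounding hides floating-point noise in the running sums
for y, F, f in zip(values, cdf, recovered):
    print(y, round(F, 10), round(f, 10))
```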
Mean of a Discrete Random Variable
• PMF provides complete information about the
properties of a random variable
• Useful to have some summary measures of these
properties
• Mean or expectation
– Denoted by E(X) or $\mu$, it represents an average value of the random variable
• If a random variable takes on the values $x_1, x_2, \ldots, x_k$ with the probabilities $f(x_1), f(x_2), \ldots, f(x_k)$, its mean is
$\mu = E(X) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_k f(x_k)$
– Or
$\mu = E(X) = \sum_{x} x f(x)$
Examples
• Find the mean of the probability distribution of the
number of heads obtained in three flips of a coin
– Probabilities of 0, 1, 2, and 3 heads
• 1/8, 3/8, 3/8, and 1/8
• $\mu = (0)(1/8) + (1)(3/8) + (2)(3/8) + (3)(1/8) = 3/2$
• Suppose Y is the total showing on a pair of dice,
find the mean of the probability distribution
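– Calculated as
$E[Y] = 2\left(\tfrac{1}{36}\right) + 3\left(\tfrac{2}{36}\right) + 4\left(\tfrac{3}{36}\right) + 5\left(\tfrac{4}{36}\right) + 6\left(\tfrac{5}{36}\right) + 7\left(\tfrac{6}{36}\right) + 8\left(\tfrac{5}{36}\right) + 9\left(\tfrac{4}{36}\right) + 10\left(\tfrac{3}{36}\right) + 11\left(\tfrac{2}{36}\right) + 12\left(\tfrac{1}{36}\right) = 7$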
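Both means can be verified by applying the definition E(X) = Σ x f(x) directly. The Python sketch below builds the dice PMF by enumerating all 36 equally likely rolls; the function name `mean` and the dictionary layout are illustrative choices, not part of the slides.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    """E(X) = sum of x * f(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

# Number of heads in three flips of a fair coin
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Total showing on a pair of dice: each of the 36 rolls has probability 1/36
dice = {}
for a, b in product(range(1, 7), repeat=2):
    dice[a + b] = dice.get(a + b, Fraction(0)) + Fraction(1, 36)

print(mean(heads))  # 3/2
print(mean(dice))   # 7
```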
Variance of a Discrete Random Variable
• Another important measure of the distribution of a
random variable
• Measures the spread or variability
• Whereas the mean measures the center of the distribution, the variance measures deviation from the mean
• Denoted as $\sigma^2$ or V(X)
$\sigma^2 = V(X) = E(X-\mu)^2 = \sum_{x} (x-\mu)^2 f(x)$
• Standard deviation of X is $\sigma = [V(X)]^{1/2}$
Examples
• Find the variance of the probability distribution of
the number of heads obtained in three flips of a
coin
– Variance
$\sigma^2 = (0-3/2)^2(1/8) + (1-3/2)^2(3/8) + (2-3/2)^2(3/8) + (3-3/2)^2(1/8) = 0.75$
• Find the variance of the total showing on a pair of dice
– Variance
$\sigma^2 = V(X) = (2-7)^2(1/36) + (3-7)^2(2/36) + \cdots + (12-7)^2(1/36) = 35/6 \approx 5.83$
Class Problem
• If the range of X is the set {0, 1, 2, 3, 4} and
P(X=x)= 0.2, determine the mean and
variance of the random variable
• Solution
– X can take values of 0,1, 2, 3 or 4
– $\mu = \sum_{x} x f(x) = (0 + 1 + 2 + 3 + 4)(0.2) = 2$
– Hence, mean = 2
– Variance can be calculated as
– $V(X) = (0-2)^2(0.2) + (1-2)^2(0.2) + (2-2)^2(0.2) + (3-2)^2(0.2) + (4-2)^2(0.2) = 2$
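The variance definition V(X) = Σ (x − μ)² f(x) can be applied to the coin, dice, and class-problem distributions in the same way. In this Python sketch the helper names are assumptions made for illustration, and the dice PMF uses the fact that a total t can be rolled in 6 − |t − 7| ways out of 36.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Number of heads in three flips of a fair coin
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
# Total on a pair of dice: 6 - |t - 7| of the 36 rolls give total t
dice = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}
# Class problem: X in {0, 1, 2, 3, 4}, each with probability 0.2
class_problem = {x: Fraction(1, 5) for x in range(5)}

print(variance(heads))                               # 3/4
print(variance(dice))                                # 35/6
print(mean(class_problem), variance(class_problem))  # 2 2
```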
Uniform Discrete Distribution
• If X assumes the values $x_1, x_2, \ldots, x_n$ with equal probability, then it has a discrete uniform distribution
• Probability mass function
$f(x_i) = 1/n$ for $x_i = x_1, x_2, \ldots, x_n$
• Mean
$E[X] = \mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
• Variance
$V(X) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2$
Examples
• When a light bulb is selected randomly from a
box containing a red, a blue, a white, and a
yellow bulb
– Sample space is {red, blue, white, yellow} with
probability 1/4
– Hence, f(x)=1/4
• When a die is tossed
– Sample space is {1,2,3,4,5,6}, each outcome with probability 1/6
– f(x) = 1/6
– E[X] = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 3.5
– $V[X] = (1-3.5)^2/6 + (2-3.5)^2/6 + \cdots + (6-3.5)^2/6 = 35/12$
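For a discrete uniform distribution the mean and variance reduce to plain averages over the n equally likely values. A minimal Python sketch for the die example, where the function name `uniform_mean_var` is a hypothetical helper:

```python
from fractions import Fraction

def uniform_mean_var(values):
    """Mean and variance when each value has probability 1/n."""
    n = len(values)
    mu = Fraction(sum(values), n)
    var = sum((Fraction(x) - mu) ** 2 for x in values) / n
    return mu, var

mu, var = uniform_mean_var([1, 2, 3, 4, 5, 6])
print(mu, var)  # 7/2 35/12
```

The bulb example has no mean or variance in this sense, since its outcomes (colors) are not numbers until a random variable assigns values to them.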
Binomial Distribution
• Two possible outcomes labeled success or
failure
• Referred to as a Bernoulli process
• Conditions
– Consists of n repeated trials
– Results in two possible outcomes (Success or
Failure)
– Probability of success, denoted by p, remains
constant
– Trials are independent
Examples
• Tossing a coin 10 times
1. There are 10 trials, and they are identical
2. Each trial has only two outcomes
3. Probability of getting a head (H) is 0.5 in each trial
4. Trials (tosses) are independent
• In an operation, 5% of all machined parts are defective. 3 parts are randomly selected from the production line to determine if each of them is defective or good
1. Three identical trials
2. Each trial has two outcomes
3. p = 0.05
4. Independent trials
PMF of the Binomial Distribution
• Binomial distribution with parameters p and
n=1,2,…
• PMF
– $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$
– Where x = 0, 1, 2, …, n
– $\binom{n}{x} = \frac{n!}{(n-x)!\,x!}$
• E[X] = np
• V(X) = np(1-p) = npq
Example
• In an operation, 5% of all parts machined by
a firm are defective
• If three parts are randomly selected from the
production line, what is the probability that
exactly one of them will be defective?
• Solution
– n=3, x=1, and p=0.05
– Substituting in the PMF
$P(X=1) = f(1) = \binom{3}{1}(0.05)^1(1-0.05)^{3-1}$
$P(X=1) = 3(0.05)^1(0.95)^2 = 0.1354$
Class Problem
• The random variable X has a binomial distribution with n=10 and p=0.01. Determine the following probabilities
• P(X=5)
$= \binom{10}{5}(0.01)^5(1-0.01)^{10-5} = 2.4 \times 10^{-8}$
• $P(X \le 2)$
= P(X=0) + P(X=1) + P(X=2) = 0.90438 + 0.09135 + 0.00415
= 0.9999
• $P(3 \le X < 5)$
= P(X=3) + P(X=4) = $1.14 \times 10^{-4}$
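The binomial PMF is easy to evaluate directly, which gives a check on the machined-parts example and the three class-problem probabilities. This Python sketch relies only on `math.comb` from the standard library; the function name `binom_pmf` is an illustrative choice.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Machined parts: n = 3, p = 0.05, exactly one defective
print(round(binom_pmf(1, 3, 0.05), 4))  # 0.1354

# Class problem: n = 10, p = 0.01
p_eq_5 = binom_pmf(5, 10, 0.01)
p_le_2 = sum(binom_pmf(x, 10, 0.01) for x in range(3))
p_3_to_4 = binom_pmf(3, 10, 0.01) + binom_pmf(4, 10, 0.01)
print(p_eq_5, p_le_2, p_3_to_4)
```

Direct evaluation gives P(X ≤ 2) ≈ 0.9999, with P(X = 2) ≈ 0.0042 as the smallest of its three terms.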
Geometric Distribution
• Closely related to the binomial experiment
• Trials are conducted until a success is obtained
• Let X denote the number of trials until the first success
• PMF
– $P(X = x) = f(x) = (1-p)^{x-1}p$, x = 1, 2, 3, …
• Mean and Variance
– $\mu = E(X) = 1/p$ and $\sigma^2 = V(X) = (1-p)/p^2$
Example
• The probability of a successful optical alignment in
the assembly of an optical data storage product is
0.8
• Assume the trials are independent
– What is the probability the first successful
alignment requires exactly four trials?
– What is the probability that the first successful
alignment requires at most four trials?
– What is the probability that the first successful
alignment requires at least four trials?
Example
• Let X denote the number of trials needed to obtain the first successful alignment
• Then X is a geometric random variable with p = 0.8
• Solution
$P(X = 4) = f(4) = (0.2)^3(0.8) = 0.0064$
$P(X \le 4) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = 0.9984$
$P(X \ge 4) = 1 - P(X < 4) = 1 - 0.992 = 0.008$
Class Problem
• Assume that each of your calls to a popular radio
station has a probability of 0.02 of connecting, that
is, of not obtaining a busy signal (calls are
independent)
– What is the probability that your first call that connects is
your tenth call?
– What is the probability that it requires more than five
calls for you to connect?
– What is the mean number of calls needed to connect?
Solutions
• Let X denote the number of calls needed to obtain a
connection
• Then, X is a geometric random variable with p=0.02
• $P(X = 10) = (1-0.02)^9(0.02) = (0.98)^9(0.02) = 0.0167$
• $P(X > 5) = 1 - P(X \le 5) = 1 - [P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5)]$
$= 1 - [0.02 + 0.98(0.02) + 0.98^2(0.02) + 0.98^3(0.02) + 0.98^4(0.02)]$
$= 1 - 0.0961 = 0.9039$
• E(X) =1/0.02 = 50
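The geometric probabilities in the alignment example and the radio-station problem can be checked numerically. The Python sketch below uses the closed form P(X ≤ x) = 1 − (1 − p)^x, which follows from summing the PMF but is not stated on the slides; the function names are assumptions. Note that "more than five calls" means the first five calls all fail, so P(X > 5) = (1 − p)^5.

```python
def geom_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

def geom_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)^x (closed form of the summed PMF)."""
    return 1 - (1 - p) ** x

# Optical alignment, p = 0.8
print(round(geom_pmf(4, 0.8), 4))      # exactly four trials: 0.0064
print(round(geom_cdf(4, 0.8), 4))      # at most four trials: 0.9984
print(round(1 - geom_cdf(3, 0.8), 4))  # at least four trials: 0.008

# Radio station calls, p = 0.02
print(round(geom_pmf(10, 0.02), 4))     # first connection on the tenth call: 0.0167
print(round(1 - geom_cdf(5, 0.02), 4))  # more than five calls: 0.9039
mean_calls = 1 / 0.02                   # E(X) = 1/p
```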
Negative Binomial Distribution
• Based on an experiment
– Consists of a sequence of independent trials
– Each trial results in either a success (S) or a failure (F)
– Probability of success is constant from trial to trial
– Continues until a total of r successes have been observed
• PMF
$P(X = x) = f(x) = \binom{x-1}{r-1}(1-p)^{x-r}p^r$, x = r, r+1, r+2, …
• Mean and Variance
$E[X] = r/p$ and $V(X) = r(1-p)/p^2$
Example
• Suppose that X is a negative binomial random
variable with p=0.2 and r=4. Determine the
following
E(X), P(X = 20), P(X = 19), P(X = 21)
• Solution
• E(X) = r/p = 4/0.2 = 20
• $P(X = 20) = \binom{x-1}{r-1}(1-p)^{x-r}p^r = \binom{19}{3}(0.80)^{16}(0.2)^4 = 0.0436$
• $P(X = 19) = \binom{18}{3}(0.80)^{15}(0.2)^4 = 0.0459$
• $P(X = 21) = \binom{20}{3}(0.80)^{17}(0.2)^4 = 0.0411$
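The three negative binomial probabilities above follow from one formula, so they are convenient to check in a loop. A short Python sketch, in which `negbin_pmf` is a name chosen here for illustration:

```python
from math import comb

def negbin_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x."""
    return comb(x - 1, r - 1) * (1 - p) ** (x - r) * p**r

r, p = 4, 0.2
print(r / p)  # E(X) = r/p
for x in (19, 20, 21):
    print(x, round(negbin_pmf(x, r, p), 4))
# 19 0.0459
# 20 0.0436
# 21 0.0411
```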
Hypergeometric Distribution
• Assumptions
– Population consists of N objects (finite)
– Each object is classified as a success (S) or a failure (F), and the population contains K successes
– A sample of n objects is selected without replacement
• Random variable of interest is X, the
number of successes in the sample
PMF of the Hypergeometric Distribution
• X=number of successes in a sample of size n
drawn from a population consisting of K
successes and N-K failures
• PMF is given by
$f(x) = \dfrac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}$
– for x satisfying
$\max(0,\, n-N+K) \le x \le \min(n, K)$
• Mean and Variance
$E[X] = n\left(\dfrac{K}{N}\right)$ and $V(X) = n\left(\dfrac{K}{N}\right)\left(\dfrac{N-K}{N}\right)\left(\dfrac{N-n}{N-1}\right)$
Example
• A shipment of 20 machined parts contains 5 that
are defective
• If 10 of them are randomly chosen for
inspection, what is the probability that 2 of the 10
will be defective?
• Solution
x = 2, n = 10, K = 5, and N = 20
$f(2) = \dfrac{\binom{5}{2}\binom{20-5}{10-2}}{\binom{20}{10}} = 0.348$
Class Problem
• A lot of 75 washers contains 5 in which the variability in
thickness around the circumference of the washer is
unacceptable
• A sample of 10 washers is selected at random, without
replacement.
– What is the probability that none of the unacceptable washers is in
the sample?
– What is the probability that at least one unacceptable washer is in
the sample?
– What is the probability that exactly one unacceptable washer is in
the sample?
– What is the mean number of unacceptable washers in the sample?
Solution
• Let X denote the number of unacceptable washers
in the sample of 10
• $P(X = 0) = \dfrac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}} = \dfrac{\binom{5}{0}\binom{75-5}{10-0}}{\binom{75}{10}} = 0.4786$
• $P(X \ge 1) = 1 - P(X = 0) = 0.5214$
• $P(X = 1) = \dfrac{\binom{5}{1}\binom{75-5}{10-1}}{\binom{75}{10}} = 0.3923$
• E(X) = 10(5/75) = 2/3
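The hypergeometric PMF is a ratio of binomial coefficients, so the shipment example and the washer problem can be evaluated with `math.comb`. The Python sketch below is illustrative; the function name and argument order `(x, N, K, n)` are assumptions made here.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """P(X = x) successes in a sample of n drawn without replacement
    from N objects, K of which are successes."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Shipment: N = 20 parts, K = 5 defective, sample n = 10, x = 2 defective
print(round(hypergeom_pmf(2, 20, 5, 10), 3))  # 0.348

# Washers: N = 75, K = 5 unacceptable, sample n = 10
p0 = hypergeom_pmf(0, 75, 5, 10)
p1 = hypergeom_pmf(1, 75, 5, 10)
print(round(p0, 4), round(1 - p0, 4), round(p1, 4))  # 0.4786 0.5214 0.3923
mean_washers = 10 * 5 / 75  # E(X) = n * K / N
```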
Poisson Distribution
• Binomial, hypergeometric, and negative binomial distributions start with an experiment consisting of trials
• The Poisson distribution is instead based on the number of outcomes occurring during a given time interval or in a specified region
• Examples
– # of accidents that occur on a given highway during a 1-week period
– # of customers coming to a bank during a 1-hour interval
– # of TVs sold at a department store during a given week
– # of breakdowns of a washing machine per month
Conditions
• Consider the # of breakdowns of a washing
machine per month example
– Each breakdown is called an occurrence
– Occurrences are random, in that they do not follow any pattern (unpredictable)
– Occurrence is always considered with respect
to an interval (one month)
The Probability Mass Distribution
• X = number of counts in the interval
• X is a Poisson random variable with parameter $\lambda > 0$
• PMF
$f(x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$, x = 0, 1, 2, …
• Mean and Variance
$E[X] = \lambda$, $V(X) = \lambda$
Example
• If a bank receives on average $\lambda = 6$ bad checks per day, what is the probability that it will receive four bad checks on any given day? What is the probability that it will receive 10 bad checks over any two consecutive days?
• Solution
x = 4 and $\lambda = 6$, then $f(4) = \dfrac{6^4 e^{-6}}{4!} = 0.134$
$\lambda = 12$ and x = 10, then $f(10) = \dfrac{12^{10} e^{-12}}{10!} = 0.105$
Class Problem
• The number of failures of a testing instrument from contamination particles on the product is a Poisson random variable with a mean of 0.02 failures per hour.
– What is the probability that the instrument does not fail in an 8-hour shift?
– What is the probability of at least one failure in one 24-hour day?
Solution
a) Let X denote the number of failures in 8 hours. Then X has a Poisson distribution with $\lambda = 0.02 \times 8 = 0.16$
$P(X = 0) = e^{-0.16} = 0.8521$
b) Let Y denote the number of failures in 24 hours. Then Y has a Poisson distribution with $\lambda = 0.02 \times 24 = 0.48$
$P(Y \ge 1) = 1 - P(Y = 0) = 1 - e^{-0.48} = 0.3812$
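The Poisson calculations for the bad-check example and the instrument-failure problem all use the same PMF with a rate scaled to the interval length. A minimal Python sketch, with `poisson_pmf` as an illustrative helper name; exact evaluation of the first case gives f(4) ≈ 0.1339.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """f(x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# Bad checks: 6 per day on average, so 12 per two days
print(round(poisson_pmf(4, 6), 3))    # 0.134
print(round(poisson_pmf(10, 12), 3))  # 0.105

# Instrument failures: 0.02 per hour, scaled to 8 and 24 hours
print(round(poisson_pmf(0, 0.02 * 8), 4))       # 0.8521
print(round(1 - poisson_pmf(0, 0.02 * 24), 4))  # 0.3812
```

Scaling the mean to the interval (6 per day becomes 12 per two days, 0.02 per hour becomes 0.16 per 8 hours) is the step that is easiest to miss.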
The Poisson Approximation To
The Binomial Distribution
• When n is large and p is small, binomial probabilities are often approximated by
$f(x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$
for x = 0, 1, 2, … with $\lambda = np$
Example
• Assume 5% of the books at a certain bindery
have defective bindings. Find the probability that
2 of 100 books bound by this bindery will have
defective bindings, using
– The binomial distribution
– The Poisson distribution
Solution
• Using Binomial
x = 2, n = 100, and p = 0.05
$f(2) = \binom{100}{2}(0.05)^2(0.95)^{98} = 0.081$
• Using Poisson
x = 2, $\lambda = np = 100(0.05) = 5$
$f(2) = \dfrac{e^{-5}\,5^{2}}{2!} = 0.084$
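The quality of the approximation in this example can be seen by computing both values side by side. The Python sketch below is a simple comparison for the bindery numbers only; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 100, 0.05, 2
lam = n * p  # Poisson parameter lambda = np

binomial = comb(n, x) * p**x * (1 - p) ** (n - x)
poisson = exp(-lam) * lam**x / factorial(x)

print(round(binomial, 3), round(poisson, 3))  # 0.081 0.084
print(round(abs(binomial - poisson), 4))      # absolute error of the approximation
```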
Next agenda
• Continue our development of probability
distributions with a discussion of several
important continuous distributions