Chapter 4
Continuous Random Variables
and Probability Distributions
Learning Objectives
• Determine probabilities from probability density
functions
• Determine probabilities from cumulative
distribution functions
• Calculate means and variances
• Standardize normal random variables
• Approximate probabilities for some binomial
and Poisson distributions
• Calculate probabilities, determine means and
variances for the continuous probability
distributions presented
Continuous Random
Variables
• So far we have discussed discrete random
variables
• Continuous random variable, X, has a
distinctly different distribution from the
discrete random variables
• Includes all values in an interval of real
numbers
• Thought of as a continuum
Probability Distributions
• Describe the probability distribution of a
continuous random variable X
• Probability that X is between a and b is
determined as the integral of f(x) from a
to b
Probability Density Function
• For a continuous random variable X, a probability density
function f(x) is a function such that
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le X \le b) = \int_a^b f(x)\,dx$ = area under f(x) from a to b
• Zero for x values that cannot occur
Important Point
• f(x) is used to calculate an area that represents
the probability that X assumes a value in [a, b]
• Probability at any point is zero, because every
point has zero width
• P(X=x)=0
• No need to distinguish between inequalities such as <
and $\le$ for continuous random variables
• For any a and b
$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
• Not true for a discrete random variable!
Example
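• If a random variable has the probability density
$f(x) = \begin{cases} 2e^{-2x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$
• Find the probability that it takes on a value
– between 1 and 3
– greater than 0.5
• Solution
$P(1 \le X \le 3) = \int_1^3 2e^{-2x}\,dx = e^{-2} - e^{-6} = 0.133$
$P(X \ge 0.5) = \int_{0.5}^{\infty} 2e^{-2x}\,dx = e^{-1} = 0.368$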
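The two probabilities in this example can be checked numerically. The sketch below is not part of the original slides. It assumes plain Python with only the standard library, and the helper name `integrate` is hypothetical. It compares a midpoint-rule approximation of the integral of f(x) with the closed-form values.

```python
import math

def f(x):
    """Density from the example: 2*exp(-2x) for x > 0, zero otherwise."""
    return 2.0 * math.exp(-2.0 * x) if x > 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# P(1 <= X <= 3): numeric integral versus e^-2 - e^-6
p_1_to_3 = integrate(f, 1.0, 3.0)
exact_1_to_3 = math.exp(-2) - math.exp(-6)

# P(X > 0.5): the upper limit 20 is an assumed cutoff; the tail beyond it
# is about e^-40 and does not affect the displayed digits
p_gt_half = integrate(f, 0.5, 20.0)
exact_gt_half = math.exp(-1)

print(round(p_1_to_3, 4), round(exact_1_to_3, 4))
print(round(p_gt_half, 4), round(exact_gt_half, 4))
```

Because P(X = x) = 0 for a continuous random variable, the same numbers apply whether the interval endpoints are included or not.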
Class Problem
• Suppose that $f(x) = e^{-(x-4)}$ for x > 4
• Determine the following
– $P(1 < X)$
– $P(2 \le X < 5)$
– x such that P(X < x) = 0.9
Solution
• Part a
$P(1 < X) = \int_4^{\infty} e^{-(x-4)}\,dx = \left[-e^{-(x-4)}\right]_4^{\infty} = 1$
because f(x) = 0 for x < 4
• Part b
$P(2 \le X < 5) = \int_4^5 e^{-(x-4)}\,dx = \left[-e^{-(x-4)}\right]_4^5 = 1 - e^{-1} = 0.6321$
• Part c
$P(X < x) = \int_4^x e^{-(u-4)}\,du = \left[-e^{-(u-4)}\right]_4^x = 1 - e^{-(x-4)} = 0.90$
• Then, $x = 4 - \ln(0.10) = 6.303$
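Part c amounts to inverting the cumulative distribution function. A minimal sketch of the three parts follows, assuming standard-library Python. The function names `cdf` and `quantile` are illustrative and do not come from the slides.

```python
import math

def cdf(x):
    """F(x) = 1 - exp(-(x - 4)) for x > 4, and 0 otherwise."""
    return 1.0 - math.exp(-(x - 4.0)) if x > 4 else 0.0

def quantile(p):
    """Solve F(x) = p for x; valid for 0 <= p < 1."""
    return 4.0 - math.log(1.0 - p)

part_a = 1.0 - cdf(1.0)        # P(1 < X): all probability lies above 4
part_b = cdf(5.0) - cdf(2.0)   # P(2 <= X < 5)
part_c = quantile(0.90)        # x with P(X < x) = 0.90

print(part_a, round(part_b, 4), round(part_c, 3))
```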
Cumulative Distribution
Function
• Defined in the same way as for a discrete
random variable
• F(x) is defined for every number x by
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$ for $-\infty < x < \infty$
F(x) and f(x)
• Probability that the random variable will take
on a value on the interval from a to b is F(b) –
F(a)
• Fundamental theorem of integral calculus
$\frac{dF(x)}{dx} = f(x)$
Example
• Find the cumulative distribution function of the
following pdf
$f(x) = \begin{cases} 2e^{-2x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$
• Solution
$F(x) = \begin{cases} \int_0^x 2e^{-2t}\,dt = 1 - e^{-2x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$
• Setting x = 1 yields
$F(1) = 1 - e^{-2} = 0.865$
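The relationship between F(x) and f(x) can be illustrated with a short numerical check. This is only a sketch in standard-library Python. The step size `h` and the evaluation point x = 1 are arbitrary choices made for illustration.

```python
import math

def F(x):
    """CDF from the example: 1 - exp(-2x) for x > 0, zero otherwise."""
    return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0

def f(x):
    """Matching density: 2*exp(-2x) for x > 0, zero otherwise."""
    return 2.0 * math.exp(-2.0 * x) if x > 0 else 0.0

# Interval probability from the CDF alone: P(1 < X < 3) = F(3) - F(1)
p_interval = F(3.0) - F(1.0)

# Central-difference estimate of dF/dx at x = 1, compared with f(1)
h = 1e-6
slope = (F(1.0 + h) - F(1.0 - h)) / (2.0 * h)

print(round(F(1.0), 3), round(p_interval, 4))
print(round(slope, 5), round(f(1.0), 5))
```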
Class Problem
• Suppose that $f(x) = 1.5x^2$ for $-1 < x < 1$
– Determine the cumulative distribution function
Solution
• The cumulative distribution function
$F(x) = \int_{-1}^{x} 1.5u^2\,du = \left[0.5u^3\right]_{-1}^{x} = 0.5x^3 + 0.5$ for $-1 < x < 1$
• Then
$F(x) = \begin{cases} 0, & x < -1 \\ 0.5x^3 + 0.5, & -1 \le x < 1 \\ 1, & 1 \le x \end{cases}$
Mean and Variance
• Defined similarly to a discrete random variable
• Mean or expected value of X
– $\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$
• Variance
– $\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
Example
• If a random variable has
$f(x) = \begin{cases} 2e^{-2x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$
– Find the mean and variance of the given
probability density function
• Solution
$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} x \cdot 2e^{-2x}\,dx = 0.5$
$V[X] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_0^{\infty} (x - 1/2)^2\, 2e^{-2x}\,dx = 1/4$
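The mean and variance integrals can also be approximated numerically. The following sketch uses a midpoint rule in standard-library Python. The helper `integrate` and the truncation point `UPPER` are assumptions made for illustration, not part of the slides.

```python
import math

def f(x):
    """Density from the example: 2*exp(-2x) for x > 0, zero otherwise."""
    return 2.0 * math.exp(-2.0 * x) if x > 0 else 0.0

def integrate(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

UPPER = 30.0  # assumed truncation point; the tail beyond it is about e^-60

mean = integrate(lambda x: x * f(x), 0.0, UPPER)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, UPPER)

print(round(mean, 4), round(variance, 4))
```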
Uniform Distribution
• Simplest continuous distribution
• Probability density
$f(x) = \frac{1}{b - a}$ for $a \le x \le b$
• Mean and variance
$\mu = \frac{a + b}{2}$ and $\sigma^2 = \frac{(b - a)^2}{12}$
Proof
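$E(X) = \int_a^b \frac{x}{b - a}\,dx = \left[\frac{0.5x^2}{b - a}\right]_a^b = \frac{a + b}{2}$
$V(X) = \int_a^b \frac{\left(x - \frac{a + b}{2}\right)^2}{b - a}\,dx = \left[\frac{\left(x - \frac{a + b}{2}\right)^3}{3(b - a)}\right]_a^b = \frac{(b - a)^2}{12}$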
Example
• Suppose X has a continuous uniform distribution
over the interval [1.5, 5.5]
– Determine the mean, variance, and standard deviation
of X
– What is P(X<2.5)?
• Solution
E(X) = (5.5 + 1.5)/2 = 3.5
$V(X) = \frac{(5.5 - 1.5)^2}{12} = \frac{4}{3} = 1.333$
$\sigma_X = \sqrt{4/3} = 1.155$
$P(X < 2.5) = \int_{1.5}^{2.5} 0.25\,dx = \left[0.25x\right]_{1.5}^{2.5} = 0.25$
Class Problem
• The thickness of a flange on an aircraft component
is uniformly distributed between 0.95 and 1.05
millimeters
– Determine the cumulative distribution function of flange
thickness
– Determine the proportion of flanges that exceeds 1.02
millimeters
– What thickness is exceeded by 90% of the flanges?
Solution
a) The distribution of X is f(x) = 10 for 0.95 < x < 1.05. Now
$F(x) = \begin{cases} 0, & x < 0.95 \\ 10x - 9.5, & 0.95 \le x < 1.05 \\ 1, & 1.05 \le x \end{cases}$
b) The probability
$P(X > 1.02) = 1 - P(X \le 1.02) = 1 - F(1.02) = 0.3$
c) If P(X > x) = 0.90, then 1 - F(x) = 0.90 and F(x) = 0.10.
Therefore, 10x - 9.5 = 0.10 and x = 0.96.
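The flange calculation follows directly from the uniform cumulative distribution function and its inverse. Below is a minimal sketch in standard-library Python. The function names `uniform_cdf` and `uniform_quantile` are hypothetical helpers introduced for this illustration.

```python
A, B = 0.95, 1.05  # flange thickness limits in millimeters

def uniform_cdf(x, a=A, b=B):
    """F(x) for a continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_quantile(p, a=A, b=B):
    """Value x with F(x) = p, for 0 <= p <= 1."""
    return a + p * (b - a)

mean = (A + B) / 2
variance = (B - A) ** 2 / 12
exceeds_102 = 1.0 - uniform_cdf(1.02)  # part b: proportion above 1.02 mm
x_90 = uniform_quantile(0.10)          # part c: F(x) = 0.10

print(round(exceeds_102, 4), round(x_90, 4), round(mean, 4))
```

Note that (x - a)/(b - a) with a = 0.95 and b = 1.05 is the same expression as 10x - 9.5 in part a.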
Normal Distribution
• Most widely used model for the distribution of a
random variable is a normal distribution
• Describes many random processes or continuous
phenomena
• Used to approximate discrete probability
distributions
• Basis for classical statistical inference
Mean and Variance
• Random variables with different means and
variances can be modeled by normal probability
density functions
• $E[X] = \mu$ determines the center of the probability
density function
• $V[X] = \sigma^2$ determines the width
• [Figure: several normal probability density functions with different means and variances]
Probability Density Function
• X with probability density function
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ for $-\infty < x < \infty$
• Normal random variable with parameters $-\infty < \mu < \infty$
and $\sigma > 0$
• Mean and variance
– $E[X] = \mu$, $V[X] = \sigma^2$
• $N(\mu, \sigma^2)$ used to denote the distribution
Useful Information
• Total area under the curve is 1.0
• Two tails of the curve extend indefinitely
• Useful results
– $P(\mu - \sigma < X < \mu + \sigma) = 0.6827$
– $P(\mu - 2\sigma < X < \mu + 2\sigma) = 0.9545$
– $P(\mu - 3\sigma < X < \mu + 3\sigma) = 0.9973$
Calculating the Probabilities
• Normal distributions differ by mean &
standard deviation
• Each distribution would require its own table
• Infinite number of tables!
Definition
• Normal random variable with $\mu = 0$ and $\sigma^2 = 1$
• Called a standard normal random variable
• Denoted as Z
• Cumulative distribution function of a standard
normal random variable is denoted as
$\Phi(z) = P(Z \le z)$
• Appendix Table II provides cumulative
probability values
Standardize the Normal Distribution
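• Use the following random variable Z to
standardize a normal distribution into the
standard normal distribution
$Z = \frac{X - \mu}{\sigma}$
• [Figure: a normal distribution of X with mean $\mu$ and standard deviation $\sigma$ is mapped to a distribution of Z with $\mu = 0$ and $\sigma = 1$]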
Standardized Normal Distribution
• Calculate the probabilities
$P(X \le x) = P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = P(Z \le z)$
Working with Table
• Table II provides values of $\Phi(z)$ for values of z
• Suppose z = 1.5; then $\Phi(1.5) = 0.93319$
• Note that P(a < X < b) = F(b) - F(a)
Example
• Find probabilities that a random variable having
the standard normal distribution will take on a
value
– between 0.87 and 1.28
– between –0.34 and 0.62
– greater than 0.85
– greater than –0.65
– less than –0.85
– less than –4.6
Solution
• $\Phi(1.28) - \Phi(0.87) = 0.8997 - 0.8078 = 0.0919$
• $\Phi(0.62) - \Phi(-0.34) = 0.7324 - (1 - 0.6331) = 0.3655$
• $P(Z > 0.85) = 1 - P(Z \le 0.85) = 1 - \Phi(0.85) = 1 - 0.8023 = 0.1977$
• $P(Z > -0.65) = 1 - \Phi(-0.65) = 1 - [1 - \Phi(0.65)] = \Phi(0.65) = 0.7422$
• $P(Z < -0.85) = 1 - \Phi(0.85) = 1 - 0.8023 = 0.1977$
• $P(Z < -4.6)$
– $P(Z < -3.99) = 1 - 0.999967 = 0.000033$
– $P(Z < -4.6) < P(Z < -3.99)$, so $P(Z < -4.6) \approx 0$
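As a cross-check on the table lookups, the six probabilities can be computed in software. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available in the standard library. The dictionary name `answers` is illustrative. Values computed this way can differ from table-based arithmetic in the last displayed digit, because the table entries are themselves rounded.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)
phi = Z.cdf  # phi(z) = P(Z <= z)

answers = {
    "between 0.87 and 1.28": phi(1.28) - phi(0.87),
    "between -0.34 and 0.62": phi(0.62) - phi(-0.34),
    "greater than 0.85": 1.0 - phi(0.85),
    "greater than -0.65": 1.0 - phi(-0.65),
    "less than -0.85": phi(-0.85),
    "less than -4.6": phi(-4.6),
}

for label, p in answers.items():
    print(f"{label}: {p:.4f}")
```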
Class Problem
• Assume X is normally distributed with a mean
of 5 and a standard deviation of 4
• Determine the following
– $P(X < 11)$
– $P(X > 0)$
– $P(3 < X < 7)$
– $P(-2 < X < 9)$
Solution
• $P(X < 11) = P\left(Z < \frac{11 - 5}{4}\right) = P(Z < 1.5) = 0.93319$
• $P(X > 0) = P\left(Z > \frac{0 - 5}{4}\right) = P(Z > -1.25) = 1 - P(Z < -1.25) = 0.89435$
• $P(3 < X < 7) = P\left(\frac{3 - 5}{4} < Z < \frac{7 - 5}{4}\right) = P(-0.5 < Z < 0.5) = P(Z < 0.5) - P(Z < -0.5) = 0.38292$
• $P(-2 < X < 9) = P\left(\frac{-2 - 5}{4} < Z < \frac{9 - 5}{4}\right) = P(-1.75 < Z < 1) = P(Z < 1) - P(Z < -1.75) = 0.80128$
Normal Approximation of
Binomial Distribution
• Difficult to calculate probabilities when n is
large
• Used to approximate binomial probabilities for
cases in which n is large
• Gives approximate probability only
Approximation
• If X is a binomial random variable
$Z = \frac{X - np}{\sqrt{np(1 - p)}}$
• is approximately a standard normal random
variable
• Good for np > 5 and n(1 - p) > 5
• Accuracy of approximation
– Calculate the interval: $\mu \pm 3\sigma = np \pm 3\sqrt{np(1 - p)}$
– If the interval lies in the range 0 to n, the normal
approximation can be used
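To see how close the approximation is, the sketch below compares an exact binomial probability with the normal approximation. The values n = 50, p = 0.3 and x = 12 are illustrative assumptions, not taken from the slides. The code applies the formula exactly as written above, without any continuity correction, so the two numbers agree only roughly.

```python
import math
from statistics import NormalDist

n, p = 50, 0.3   # illustrative values (assumption), chosen so np > 5 and n(1-p) > 5
x = 12           # illustrative cutoff (assumption)

mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Exact P(X <= x) from the binomial probability mass function
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

# Normal approximation: P(Z <= (x - np) / sqrt(np(1 - p)))
approx = NormalDist().cdf((x - mu) / sigma)

# The two checks described on the slide
rule_ok = n * p > 5 and n * (1 - p) > 5
low, high = mu - 3 * sigma, mu + 3 * sigma
interval_ok = 0 <= low and high <= n

print(round(exact, 4), round(approx, 4), rule_ok, interval_ok)
```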
Normal Approximation of
Poisson Distribution
• If X is a Poisson random variable with $E(X) = \lambda$ and
$V(X) = \lambda$
$Z = \frac{X - \lambda}{\sqrt{\lambda}}$
• is approximately a standard normal random
variable
• Good for $\lambda > 5$
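The same kind of comparison can be made for the Poisson case. In this sketch the mean of 10 and the cutoff x = 12 are illustrative assumptions chosen only so that the mean exceeds 5. As with the formula above, no continuity correction is applied.

```python
import math
from statistics import NormalDist

lam = 10.0   # illustrative Poisson mean (assumption), larger than 5
x = 12       # illustrative cutoff (assumption)

# Exact P(X <= x) from the Poisson probability mass function
exact = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(x + 1))

# Normal approximation: P(Z <= (x - lam) / sqrt(lam))
approx = NormalDist().cdf((x - lam) / math.sqrt(lam))

print(round(exact, 4), round(approx, 4))
```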
Exponential Distribution
• Poisson distribution defined a random variable to
be the number of events during a given time interval
or in a specified region
• Time or distance between the events is another
random variable that is often of interest
• Probability density function
$f(x) = \lambda e^{-\lambda x}$ for $0 \le x < \infty$
• Mean and variance
$\mu = E[X] = \frac{1}{\lambda}$, $\sigma^2 = V[X] = \frac{1}{\lambda^2}$
Example
• Suppose that the log-ons to a computer network
follow a Poisson process with an average of 3
counts per minute
a) What is the mean time between counts?
b) What is the standard deviation of the time
between counts?
c) Determine x such that the probability that at least
one count occurs before time x minutes is 0.95
Solution
a) $E(X) = 1/\lambda = 1/3 = 0.333$ minutes
b) $V(X) = 1/\lambda^2 = 1/3^2 = 0.111$, $\sigma = 0.333$ minutes
c) Value of x
$P(X < x) = \int_0^x 3e^{-3t}\,dt = \left[-e^{-3t}\right]_0^x = 1 - e^{-3x} = 0.95$
• Thus, x = 0.9986 minutes
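The log-on example reduces to a few lines of arithmetic. A minimal sketch in standard-library Python follows. The function name `exp_cdf` is illustrative, and part c is solved by rearranging 1 - exp(-3x) = 0.95 for x.

```python
import math

lam = 3.0  # log-ons per minute

mean_time = 1.0 / lam                # part a: mean time between counts
sd_time = math.sqrt(1.0 / lam**2)    # part b: standard deviation

def exp_cdf(x, rate=lam):
    """P(X <= x) for an exponential random variable with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

# part c: solve 1 - exp(-lam * x) = 0.95 for x
x_95 = -math.log(1.0 - 0.95) / lam

print(round(mean_time, 3), round(sd_time, 3), round(x_95, 4))
```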
Class Problem
• The time between the arrival of electronic
messages at your computer is exponentially
distributed with a mean of two hours
– (a) What is the probability that you do not
receive a message during a two-hour period?
– (b) What is the expected time between your fifth
and sixth messages?
Solution
• Let X denote the time until a message is
received. Then, X is an exponential random
variable and $\lambda = 1/E(X) = 1/2$
a) $P(X > 2) = \int_2^{\infty} \frac{1}{2} e^{-x/2}\,dx = \left[-e^{-x/2}\right]_2^{\infty} = e^{-1} = 0.3679$
b) E(X) = 2 hours.
Erlang Distribution
• The exponential distribution describes the length until
the first count is obtained in a Poisson process
• A generalization of the exponential distribution is
the length until r counts occur in a Poisson
process
• An Erlang random variable equals the interval length
until r counts occur in a Poisson process
• PDF
$f(x) = \frac{\lambda^r x^{r-1} e^{-\lambda x}}{(r - 1)!}$, for x > 0 and r = 1, 2, 3, ...
• Mean and Variance
– $E(X) = r/\lambda$ and $V(X) = r/\lambda^2$
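The link between the Erlang distribution and Poisson counts can be sketched in code: the time until the r-th count is at most x exactly when at least r counts occur in an interval of length x. The sketch below uses standard-library Python. The function names and the parameter values (a rate of 1e-5 per unit length, with r = 5 and r = 3) are illustrative assumptions.

```python
import math

def erlang_pdf(x, r, lam):
    """Erlang density for integer r >= 1 and rate lam > 0."""
    if x <= 0:
        return 0.0
    return lam**r * x ** (r - 1) * math.exp(-lam * x) / math.factorial(r - 1)

def erlang_cdf(x, r, lam):
    """P(X <= x): at least r Poisson counts occur in an interval of length x."""
    if x <= 0:
        return 0.0
    m = lam * x  # expected number of counts in the interval
    return 1.0 - sum(math.exp(-m) * m**k / math.factorial(k) for k in range(r))

lam = 1e-5  # illustrative rate per unit length (assumption)

mean_5 = 5 / lam                    # E(X) = r / lam with r = 5
sd_5 = math.sqrt(5) / lam           # sqrt(V(X)) = sqrt(r) / lam
p_third_by_1e5 = erlang_cdf(1e5, 3, lam)  # third count occurs within 1e5 units

print(round(mean_5), round(sd_5), round(p_third_by_1e5, 4))
```

With r = 1 the density reduces to the exponential density, which is one way to sanity-check the function.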
Example
• Errors caused by contamination on optical disks
occur at the rate of one error every $10^5$ bits. Assume the
errors follow a Poisson distribution.
a) What is the mean number of bits until five errors
occur?
b) What is the standard deviation of the number of bits
until five errors occur?
c) The error-correcting code might be ineffective if
there are three or more errors within $10^5$ bits. What
is the probability of this event?
Solution
a) b) Let X denote the number of bits until five errors occur.
Then, X has an Erlang distribution with r = 5 and $\lambda = 10^{-5}$
error per bit
– $E(X) = \frac{r}{\lambda} = 5 \times 10^5$ bits
– $V(X) = \frac{r}{\lambda^2} = 5 \times 10^{10}$, so $\sigma_X = \sqrt{5 \times 10^{10}} = 223607$ bits
c) Let Y denote the number of errors in $10^5$ bits. Then, Y is a
Poisson random variable with $10^{-5}$ error per bit, which
equals 1 error per $10^5$ bits
$P(Y \ge 3) = 1 - P(Y \le 2) = 1 - \left[\frac{e^{-1} 1^0}{0!} + \frac{e^{-1} 1^1}{1!} + \frac{e^{-1} 1^2}{2!}\right] = 0.0803$
Gamma Function
• A generalization of the factorial function leads
to the gamma function
• Definition of gamma function
$\Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\,dx$, for r > 0
• The integral $\Gamma(r)$ is finite
• Using integration by parts it can be shown that
– $\Gamma(r) = (r - 1)\,\Gamma(r - 1)$
– if r is a positive integer, $\Gamma(r) = (r - 1)!$
• Used in development of the gamma distribution
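Both properties of the gamma function are easy to spot-check with `math.gamma` from the Python standard library. The sketch below is illustrative only. The non-integer value r = 4.5 is an arbitrary choice, and the value at 1/2 uses the well-known identity that the gamma function at 1/2 equals the square root of pi.

```python
import math

# For positive integers, gamma(r) equals (r - 1)!
integer_checks = [
    math.isclose(math.gamma(r), math.factorial(r - 1)) for r in range(1, 8)
]

# The recursion gamma(r) = (r - 1) * gamma(r - 1) also holds for non-integer r
r = 4.5  # arbitrary non-integer value (assumption)
lhs = math.gamma(r)
rhs = (r - 1) * math.gamma(r - 1)

# A value with no factorial counterpart: gamma(1/2) = sqrt(pi)
half = math.gamma(0.5)

print(all(integer_checks), math.isclose(lhs, rhs), round(half, 4))
```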
Gamma Distribution
• X with the following PDF
$f(x) = \frac{\lambda^r x^{r-1} e^{-\lambda x}}{\Gamma(r)}$ for x > 0
• Gamma random variable with parameters $\lambda > 0$ and r > 0
• If r is an integer, X has an Erlang distribution
• Erlang is a special case of the gamma distribution
• Mean and Variance
– $E(X) = r/\lambda$ and $V(X) = r/\lambda^2$
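A numerical sketch can confirm that the gamma density integrates to 1, that its mean and variance match the formulas above, and that it coincides with the Erlang density when r is an integer. The code uses standard-library Python. The helper `integrate`, the cutoff `UPPER`, and the parameter values r = 2.5 and a rate of 2 are assumptions chosen for illustration.

```python
import math

def gamma_pdf(x, r, lam):
    """Gamma density with shape r > 0 and rate lam > 0."""
    if x <= 0:
        return 0.0
    return lam**r * x ** (r - 1) * math.exp(-lam * x) / math.gamma(r)

def erlang_pdf(x, r, lam):
    """Erlang density: the same formula with (r - 1)! for integer r."""
    if x <= 0:
        return 0.0
    return lam**r * x ** (r - 1) * math.exp(-lam * x) / math.factorial(r - 1)

def integrate(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

r, lam = 2.5, 2.0   # illustrative parameters (assumption)
UPPER = 40.0        # assumed cutoff; the tail beyond it is negligible

total = integrate(lambda x: gamma_pdf(x, r, lam), 0.0, UPPER)
mean = integrate(lambda x: x * gamma_pdf(x, r, lam), 0.0, UPPER)
variance = integrate(lambda x: (x - mean) ** 2 * gamma_pdf(x, r, lam), 0.0, UPPER)

print(round(total, 4), round(mean, 4), round(variance, 4))
print(math.isclose(gamma_pdf(1.3, 3, lam), erlang_pdf(1.3, 3, lam)))
```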