7. Continuous probability.
So far we have been considering examples where the outcomes form a finite or countably infinite set. In
many situations it is more natural to model a situation where the outcomes could be any real number or
vectors of real numbers. In these situations we usually model probabilities by means of integrals.
Example 1. A bank is doing a study of the amount of time, T, between arrivals of successive customers.
Assume we are measuring time in minutes. They are interested in the probability that T will have various
values. If we imagine that T can be any non-negative real number, then the sample space is the set of all
non-negative real numbers, i.e. S = { t : t ≥ 0 }. This is an uncountable set. In situations such as this, the
probability that T assumes any particular value may be 0, so we are interested in the probability that the
outcome lies in various intervals. For example, what is the probability that T will be between 2 and 3 minutes?
A common way to describe probabilities in situations such as this is by means of an integral. We try to find
a function f(t) such that the probability that T lies in any interval a ≤ t ≤ b is equal to ∫_a^b f(t) dt, i.e.

(1)    Pr{ a ≤ T ≤ b } = ∫_a^b f(t) dt.
A function f(t) with this property is called a probability density function for the outcomes of the experiment.
We can regard T as a random variable and f(t) is also called the probability density function for the random
variable T.
Since Pr{ T = a } = 0 and Pr{ T = b } = 0, one has Pr{ a < T ≤ b }, Pr{ a ≤ T < b }, and Pr{ a < T < b } all
equal to this integral. In order that the non-negativity property and normalization axiom hold (formulas
(1.10) and (1.11) in section 1), one should have

(2)    f(t) ≥ 0 for all t,

(3)    ∫_(-∞)^∞ f(t) dt = 1.
For example, suppose after some study the bank has come up with the following model. They feel that

(4)    Pr{ a ≤ T ≤ b } = ∫_a^b (e^(-t/2)/2) dt    for 0 ≤ a ≤ b.
[Figure: graph of the density f(t) = e^(-t/2)/2 for 0 ≤ t ≤ 10.]

This is an example of an exponential density. In this case f(t) = e^(-t/2)/2 for t ≥ 0 and f(t) = 0 for t < 0.
Note that f(t) satisfies (2) and (3). This integral can be evaluated in terms of elementary functions, so we
could just as well say
Pr{ a  T  b } = e-a/2 - e-b/2.
However, it is useful to keep in mind the integral representation (4).
For example, the probability that T would be between 2 and 3 minutes would be
Pr{ 2 ≤ T ≤ 3 } = e^(-2/2) - e^(-3/2) ≈ 0.145.
Cumulative distribution functions. Just as with a random variable that only takes on a finite set of values,
it is sometimes convenient to use its cumulative distribution function F(t). If the random variable is T then
(5)    F(t) = Pr{ T ≤ t } = ∫_(-∞)^t f(s) ds

Note that F′(t) = f(t). In the example above where f(t) = e^(-t/2)/2 for t ≥ 0 and f(t) = 0 for t < 0, one has
F(t) = 1 - e^(-t/2) for t ≥ 0 and F(t) = 0 for t < 0.
[Figure: graph of the cumulative distribution function F(t) = 1 - e^(-t/2) for 0 ≤ t ≤ 10.]
Exponential random variables. The example above is an example of an exponential random variable.
These are random variables whose density function has the form f(t) = λe^(-λt) for t ≥ 0 and f(t) = 0 for t < 0.
They are often used to model the time between successive events of a certain type, for example
1. the times between breakdowns of a computer,
2. the times between arrivals of customers at a store,
3. the times between sign-ons of users on a computer network,
4. the times between incoming phone calls,
5. the times it takes to serve customers at a store,
6. the times users stay connected to a computer network,
7. the lengths of phone calls,
8. the times it takes for parts to break, e.g. light bulbs to burn out.
Exponential random variables have an interesting property called the memoryless property. If T is an
exponential random variable, then the conditional probability that T is greater than some value t + s given
that T is greater than s is the same as the probability that T is greater than t. In symbols
Pr{ T > t + s | T > s } = Pr{ T > t }

This follows from the fact that Pr{ T > t } = e^(-λt), since
Pr{ T > t + s | T > s } = e^(-λ(t+s))/e^(-λs) = e^(-λt). Suppose, for example, the time between arrivals of bank
customers is an exponential random variable and it has been ten minutes since the arrival of the last
customer. Then the probability that the next customer will arrive in the next two minutes is the same as if a
customer had just arrived.
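The memoryless property can also be seen in simulation. This sketch (an illustration, not part of the original notes) estimates both probabilities from a large sample with λ = 1/2, t = 2, s = 10; the sample size and seed are arbitrary:

```python
import random
from math import exp

random.seed(1)
lam = 0.5  # rate parameter lambda

# Draw a large sample of exponential waiting times.
sample = [random.expovariate(lam) for _ in range(200_000)]

# Estimate Pr{ T > t + s | T > s } and Pr{ T > t } for t = 2, s = 10.
t, s = 2.0, 10.0
survivors = [x for x in sample if x > s]
cond = sum(x > t + s for x in survivors) / len(survivors)
uncond = sum(x > t for x in sample) / len(sample)

# Both estimates should be close to e^(-lam*t) = e^(-1), about 0.368.
print(round(cond, 2), round(uncond, 2))
```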
The parameter  in the exponential distribution has a natural interpretation as a rate. Suppose we have N0
identical systems each of which will undergo a transition in time Ti for i = 1, …, N. Suppose the Ti are
independent and all have exponential distribution with parameter . We are interested in the rate at which
transitions are occurring at time t. The probability that any one of the systems will not have undergone a

transition in the time interval 0  s  t is 
 e-s ds = e-t. So the expected number of systems that have not
t
yet undergone a transition at time t is N = N0e-t. So the rate at which they are undergoing transitions at
dN
1 dN
time t is = N0e-t = N. So  = is the specific rate at which the systems are undergoing
dt
N dt
transitions.
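A small simulation (again a sketch, not from the original notes) makes the decay law N = N0·e^(-λt) concrete: simulate N0 systems with λ = 1/2 and count how many have not yet undergone a transition by time t = 3. The counts and seed are arbitrary choices:

```python
import random
from math import exp

random.seed(2)
lam, n0, t = 0.5, 50_000, 3.0

# Simulate the transition time of each system and count the survivors at time t.
times = [random.expovariate(lam) for _ in range(n0)]
surviving = sum(x > t for x in times)
predicted = n0 * exp(-lam * t)  # N = N0 * e^(-lambda*t)

# The observed count should be close to the predicted value, about 11,157.
print(surviving, round(predicted))
```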
Uniform random variables. A random variable whose density function is constant on an interval and zero
elsewhere is said to be a uniformly distributed random variable.
Example 2. Suppose the time John arrives at work is uniformly distributed between 8:00 and 8:30. What is
the probability that John arrives no later than 8:10?
Suppose we let T be the time in minutes past 8:00 that John arrives. Then the probability density function is
given by f(t) = 1/30 for 0 ≤ t ≤ 30 and f(t) = 0 for other values of t. The probability that John arrives no
later than 8:10 is given by Pr{ T ≤ 10 } = ∫_0^10 (1/30) dt = 1/3.
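The same computation can be sketched numerically (an illustration, not part of the original notes): integrate the constant density over [0, 10].

```python
# Uniform density on [0, 30] minutes past 8:00.
def f(t):
    return 1 / 30 if 0 <= t <= 30 else 0.0

# Pr{ T <= 10 }: midpoint Riemann sum of f over [0, 10].
n = 30_000
dt = 10 / n
prob = sum(f((k + 0.5) * dt) for k in range(n)) * dt
print(round(prob, 4))  # 0.3333
```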
Means or expected values. For a random variable X that only takes on a discrete set of values x1, …, xn
with probabilities Pr{ X = xk } = f(xk), its mean or expected value is

(6)    μ = E(X) = x1·f(x1) + … + xn·f(xn)

For a continuous random variable T with density function f(t) we replace the sum by an integral to calculate
its mean, i.e.

(7)    μ = E(T) = ∫_(-∞)^∞ t·f(t) dt
Let's see why (7) is a reasonable generalization of (6) to the continuous case. Suppose for simplicity that T
only takes on values between two finite numbers a and b, i.e. f(t) = 0 for t < a and t > b. Divide the interval
a ≤ t ≤ b into m equal sized subintervals by points t0, t1, …, tm where tk = a + k·Δt with Δt = (b - a)/m.
Suppose we have a sequence T1, T2, …, Tn, … of repeated independent trials where each Tn has density
function f(t). Suppose s1, s2, …, sn, … are the values we actually observe for the random variables
T1, T2, …, Tn, …. In our computation of s̄ = (s1 + s2 + ⋯ + sn)/n let's approximate all the values of sj that are
between tk-1 and tk by tk and then group all the approximate values that equal tk together. Then we have

s̄ = (s1 + s2 + ⋯ + sn)/n
  ≈ [(t1 + t1 + ⋯ + t1) + (t2 + t2 + ⋯ + t2) + ⋯ + (tm + tm + ⋯ + tm)]/n
  = (g1·t1 + g2·t2 + ⋯ + gm·tm)/n
  = (g1/n)·t1 + (g2/n)·t2 + ⋯ + (gm/n)·tm

where gk is the number of times that some sj is between tk-1 and tk. As n → ∞ one has
gk/n → Pr{ tk-1 < T ≤ tk } = ∫_(tk-1)^(tk) f(t) dt ≈ f(tk)·Δt. So as n gets large it is not unreasonable to
suppose that s̄ ≈ Σ_(k=1)^m tk·f(tk)·Δt. If we let m → ∞ then all the above approximations get better and
Σ_(k=1)^m tk·f(tk)·Δt → ∫_(-∞)^∞ t·f(t) dt, which leads to (7).
For an exponential random variable we have

(8)    μ = ∫_0^∞ t·λe^(-λt) dt = -t·e^(-λt) |_0^∞ + ∫_0^∞ e^(-λt) dt = 0 + (-(1/λ)·e^(-λt)) |_0^∞ = 1/λ

We used integration by parts with u = t and dv = λe^(-λt) dt, so that du = dt and v = -e^(-λt).
In the example above where the time between arrivals of bank customers was an exponential random
variable with λ = ½, the average time between arrivals is 1/λ = 2 min.
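The computation μ = 1/λ can also be checked by simulation; this sketch (not part of the original notes) averages a large sample of interarrival times with λ = 1/2, using an arbitrary sample size and seed:

```python
import random

random.seed(0)
lam = 0.5  # arrival rate lambda = 1/2 per minute

# The sample mean of exponential(lambda) draws should approach 1/lambda = 2 minutes.
n = 100_000
sample_mean = sum(random.expovariate(lam) for _ in range(n)) / n
print(round(sample_mean, 2))  # close to 2.0
```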
Problem 1. The lifetime of a certain type of light bulb has density function f(t) given by f(t) = 0 for t < 100
days and f(t) = 20000/t^3 for t > 100 days.
a. Find the probability that the lifetime is more than 110 days.
b. Find the average lifetime.
Problem 2. Taxicabs pass by at an average rate of 20 per hour. Assume the time between taxicabs is an
exponential random variable. What is the probability that a taxicab will pass by in the next minute?
In section 5 we considered machine replacement problems using discrete probability. One can also use
continuous probability for these problems.
Example 3 (Machine replacement). Consider the hard drive on my office computer. It costs c1 = $300 to
replace if it is replaced before it fails. If it fails before it is replaced, it costs an additional c2 = $1000 in
terms of down time for my computer. This is in addition to the $300 replacement cost. Suppose that after it
has been installed it is equally likely to fail anytime in the next five years. Suppose every hard drive fails by
the end of the 5th year. Let
T = the time the hard drive fails
q = time at which you replace it if it hasn't already failed
C = Cq = the cost of a replacement
Tq = the actual replacement time, i.e. T if the drive fails before time q and q otherwise.
a. Find the probability density function f(t) and cumulative distribution function F(t) for T.
b. Find the expected cost E(C) of a replacement.
c. Find the expected time E(Tq) of a replacement.
d. Find the long run average cost z(q) = E(C)/E(Tq) of a replacement.
e. When should the hard drive be replaced so as to minimize the long run replacement cost?
Since the drive is equally likely to fail at any time in the next five years, f(t) should be constant for t
between 0 and 5 and f(t) should be 0 for t less than 0 and greater than 5. Since the integral of f(t) over all t
should be 1, we must have f(t) = 1/5 for 0 ≤ t ≤ 5 and f(t) = 0 for t < 0 and t > 5. This is another example of
a uniform probability distribution which was discussed earlier. It is constant in an interval and zero
elsewhere.
The cumulative distribution function is just the integral of the density function from -∞ to t. So F(t) = 0 for
t < 0, F(t) = t/5 for 0 ≤ t ≤ 5, and F(t) = 1 for t > 5.
E(C) = c1 + c2F(q) = 300 + 1000q/5 = 300 + 200q.
Tq is an example of a random variable that is a mixture of continuous and discrete. It is continuous for t < q
and it is discrete at t = q. Its density function is the same as that of T for t < q, i.e. f(t) = 1/5 for 0 ≤ t < q. Its
density function is 0 for t > q. It has a probability mass of 1 - F(q) at t = q. To compute E(Tq) we combine
the formulas for discrete and continuous random variables, i.e. we integrate t·f(t) over the region where Tq is
continuous and sum t·Pr{ Tq = t } over the points where Tq is discrete. So
E(Tq) = ∫_0^q t·f(t) dt + q·(1 - F(q)) = ∫_0^q (t/5) dt + q·(1 - q/5) = q^2/10 + q - q^2/5 = q - q^2/10.
z(q) = E(C)/E(Tq) = (c1 + c2·F(q)) / (∫_0^q t·f(t) dt + q·(1 - F(q))) = (300 + 200q)/(q - q^2/10)
     = (3000 + 2000q)/(10q - q^2).
To minimize z, we can minimize u = z/1000 = (3 + 2q)/(10q - q^2). We have
u′ = (2q^2 + 6q - 30)/(10q - q^2)^2. So u′ = 0 when q^2 + 3q - 15 = 0. The positive solution to this equation
is about q = 2.65. So we should replace the disk drive after 2.65 years.
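As a sanity check on this minimization, the following sketch (not part of the original notes) evaluates z(q) on a fine grid of replacement ages and compares the grid minimizer with the exact positive root of q^2 + 3q - 15 = 0:

```python
from math import sqrt

c1, c2 = 300.0, 1000.0  # replacement cost and additional failure cost

def F(q):
    # Cumulative distribution of the failure time, uniform on [0, 5].
    return q / 5.0

def z(q):
    # Long run average cost: E(C) / E(Tq), with E(Tq) = q - q^2/10.
    return (c1 + c2 * F(q)) / (q - q * q / 10.0)

# Grid search over replacement ages 0 < q < 5.
qs = [k / 1000 for k in range(1, 5000)]
q_best = min(qs, key=z)

q_exact = (-3 + sqrt(69)) / 2  # positive root of q^2 + 3q - 15 = 0
print(round(q_best, 2), round(q_exact, 2))  # both about 2.65
```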