Continuous Random Variables
Outline
1. Two kinds of random variables
   a. Discrete random variables
   b. Continuous random variables
2. Symmetric distributions
3. Normal distributions
4. The standard normal distribution
5. Using the standard normal distribution
6. The normal approximation to the binomial
Lecture 6
1. Two kinds of random variables
a. Discrete (DRV)
• Outcomes have countable values
• Possible values can be listed
• E.g., # of people in this room: the possible values can be listed (might be … 51 or 52 or 53 …)
b. Continuous (CRV)
• Outcomes are not countable
• Possible values consist of points in an interval
• E.g., time till coffee break
• The form of the probability distribution for a CRV is a smooth curve.
• Such a distribution may also be called a
  – Frequency Distribution
  – Probability Density Function
• In the graph of a CRV, the X axis is whatever you are measuring
  – E.g., exam scores, mood scores, # of widgets produced per hour.
• The Y axis measures the frequency of scores (strictly, the probability density). It is usually not shown.
[Figure: a smooth curve drawn over the X axis.]
2. Symmetric Distributions
• In a symmetric CRV, 50% of the area under the curve is in each half of the distribution:
  P(x ≤ µ) = P(x ≥ µ) = .5
• Note: Because points on a CRV are infinitely thin, we can only measure the probability of intervals of X values.
• We can’t usefully measure or compute the probability of an individual X value (it is zero).
[Figure: a symmetric distribution which is not mound-shaped, with 50% of the area on each side of µ. The two sections (above and below the mean) each contain 50% of the observations.]
[Figure: a mound-shaped symmetric distribution (the normal distribution), with 50% of the area on each side of µ.]
3. Normal Distributions
• A very important set of CRVs has mound-shaped and symmetric probability distributions. These are called “normal distributions”.
• Many naturally-occurring variables are normally distributed.
Normal distributions:
• Are perfectly symmetrical around their mean, µ.
• Have a standard deviation, σ, which measures the “spread” of a distribution.
• σ is an index of variability around the mean.
[Figure: a normal curve centred at µ, with spread σ.]
• There are an infinite number of normal distributions.
• They are distinguished on the basis of their mean (µ) and standard deviation (σ).
4. Standard Normal Distribution
• The standard normal distribution is a special one produced by converting raw scores into Z scores: Z = (X – µ) / σ.
• Thus, for the standard normal distribution, µ = 0 and σ = 1.
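The conversion from raw scores to Z scores can be sketched in a few lines of Python. The function name `z_score` and the sample values (µ = 50, σ = 10) are illustrative assumptions, not taken from the lecture.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Convert a raw score x into a Z score, given mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical raw scores from a distribution with mu = 50 and sigma = 10
scores = [30.0, 50.0, 65.0]
z_values = [z_score(x, mu=50.0, sigma=10.0) for x in scores]
print(z_values)  # [-2.0, 0.0, 1.5]
```

A score equal to the mean always maps to Z = 0, and a score one standard deviation above the mean maps to Z = 1.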
• The area under the curve between µ and some value X ≥ µ has been calculated for the standard normal distribution and is given in Table IV of the text.
• E.g., for Z = 1.62, area = .4474
• Because the distribution is symmetric, table values can also be used for scores below the mean (Z scores below 0).
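If no printed table is at hand, the same mean-to-Z areas can be reproduced with Python's standard library. This is only a sketch: the helper name `table_area` is an assumption, and it relies on `statistics.NormalDist` (Python 3.8+) rather than on Table IV itself.

```python
from statistics import NormalDist

def table_area(z: float) -> float:
    """Area under the standard normal curve between 0 and z (a mean-to-Z table value)."""
    # abs() uses the symmetry of the curve: the area for -z equals the area for +z
    return NormalDist().cdf(abs(z)) - 0.5

print(round(table_area(1.62), 4))   # 0.4474
print(round(table_area(-1.62), 4))  # 0.4474
```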
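[Figure: standard normal curve with the area .4474 shaded between Z = 0 and Z = 1.62. The area gives the probability of finding a score between the mean and X when you make an observation.]
[Figure: the same curve with the area .4474 shaded between Z = –1.62 and Z = 0. For Z < 0, the same values can be used as for Z > 0.]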
5. Using the Standard Normal Distribution
• Suppose the average height for Canadian women is µ = 160 cm, with σ = 15 cm.
• What is the probability that the next Canadian woman we meet is more than 175 cm tall?
• Note two things:
  1. this is a question about a single case
  2. it specifies an interval.
[Figure: normal curve of height in cm, with µ = 160 and X = 175 marked. The table gives the area between 160 and 175; we need the area above 175. Remember that the area above the mean, µ, is half (.5) of the distribution.]
Call the area between 160 and 175 P. We can get P from Table IV.
Z = (X – µ) / σ = (175 – 160) / 15 = 1.00
Now, look up Z = 1.00 in the table. The corresponding area is .3413.
• The value of Z marks one end of the interval
  – you want to find the probability that a randomly selected case has a score that falls in this interval
• The other end of the interval is at µ (Z = 0)
[Figure: the area between 160 cm and 175 cm (Z = 0 to Z = 1.0) is .3413, so the area above 175 cm must be .5 – .3413 = .1587.]
• What is the probability that the next Canadian woman we meet is more than 175 cm tall?
• Answer: .1587
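As a cross-check on the table lookup, the same upper-tail area can be computed directly. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and µ and σ are the values given in the example.

```python
from statistics import NormalDist

heights = NormalDist(mu=160, sigma=15)   # µ and σ from the example

z = (175 - heights.mean) / heights.stdev
p_taller = 1 - heights.cdf(175)          # upper-tail area above 175 cm

print(z)                   # 1.0
print(round(p_taller, 4))  # 0.1587
```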
Binomial Random Variable – Method #3
When n is large and p is not too close to 0 or 1, we can use
the normal approximation to the binomial probability
distribution to work out binomial probabilities.
How can you tell if the normal approximation is
appropriate in a given case?
Use the approximation if np ≥ 5 and nq ≥ 5
[Figure: histogram of the BRV X with a normal curve overlaid.]
The histogram shows the probabilities of different values of the BRV X. Because area gives probability, we can use the area under a section of the normal curve to approximate the area of some part of the histogram.
In order to use the normal approximation, we have to be able to compute the mean and standard deviation for the BRV (in order to work out Z).
μ = np (# of observations times P(Success))
σ = √(npq), where q = 1 – p
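The rule of thumb and the two formulas can be bundled into one small helper. This is a sketch only: the function name `normal_approx_params` and the sample values (n = 20, p = .3) are hypothetical and do not come from the lecture.

```python
import math

def normal_approx_params(n: int, p: float):
    """Return (ok, mu, sigma) for the normal approximation to a binomial with n trials and P(Success) = p."""
    q = 1 - p
    ok = n * p >= 5 and n * q >= 5   # rule of thumb: np >= 5 and nq >= 5
    mu = n * p                       # mean of the BRV
    sigma = math.sqrt(n * p * q)     # standard deviation of the BRV
    return ok, mu, sigma

# Hypothetical example: 20 observations with P(Success) = .3
ok, mu, sigma = normal_approx_params(20, 0.3)
print(ok, round(mu, 2), round(sigma, 3))
```

With n = 10 and p = .1 the check fails (np = 1), which signals that the binomial probabilities should be worked out another way.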
There’s just one other issue to deal with…
[Figure: histogram of X = 1 to 10 with a normal curve overlaid.]
• Notice how the normal curve misses one top corner of the rectangle for 7 and overstates the other top corner. These two errors approximately cancel each other.
• Notice how the rectangle for 7 runs from 6.5 to 7.5. The probability of X values up to and including 7 is given by the area to the left of 7.5.
BRV – Example 1 from last week
Air Canada keeps telling us that arrival and departure times at
Pearson International are improving. Right now, the statistics
show that 60% of the Air Canada planes coming into Pearson do
arrive on time. (This actually is an improvement over 10 years
ago when only 42% of the Air Canada planes arrived on time at
Pearson.) The problem is that when a plane arrives on time, it
often has to circle the airport because there’s still a plane in its
gate (a plane which didn’t leave on time). Statistics also show that
50% of the planes that arrive on time have to circle at least once,
while only 35% of the planes that arrive late have to circle at
least once.
BRV – Example 1 from last week
c) Of the next 80 Air Canada planes that arrive at Pearson,
what’s the probability that between 40 and 45 (inclusive)
have to circle at least once?
BRV – Example 1
First, we check to see whether we can use the normal approximation. To do this, we need to know the probability that a plane has to circle at least once (C = circles at least once, L = arrives late):
P(C) = P(C ∩ Late) + P(C ∩ Not Late)
= [P(L) * P(C│L)] + [P(Not L) * P(C│Not L)]
= .40 (.35) + .60 (.50)
= .44
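The same calculation, written out as a short Python sketch. The variable names are illustrative; the probabilities are the ones given in the example.

```python
# Values from the example: 60% of planes arrive on time, so 40% arrive late
p_on_time = 0.60
p_late = 1 - p_on_time
p_circle_given_on_time = 0.50
p_circle_given_late = 0.35

# Total probability of circling, summed over the two mutually exclusive cases
p_circle = p_late * p_circle_given_late + p_on_time * p_circle_given_on_time
print(round(p_circle, 2))  # 0.44
```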
BRV – Example 1
Now we can do the check:
n = 80, p = .44 and q = (1 – p) = .56
np = 80 (.44) = 35.2 > 5
nq = 80 (.56) = 44.8 > 5
Thus, it’s OK to use the normal approximation.
BRV – Example 1
μ = np = 80 (.44) = 35.2
σ = √(npq) = √(80 (.44)(.56)) = 4.44
Correction for continuity: to get the area for 40 and up, we use X = 39.5. To get the area for 45 and below, we use X = 45.5.
[Figure: normal curve centred at 35.2 with 40 and 45 marked. “40 and up” starts at 39.5; “45 and below” starts at 45.5.]
BRV – Example 1
Z39.5 = (39.5 – 35.2) / 4.44 = +.97
Z45.5 = (45.5 – 35.2) / 4.44 = +2.32
From the table, the area between 0 and Z = 2.32 is .4898, and the area between 0 and Z = .97 is .3340.
P(.97 ≤ Z ≤ 2.32) = .4898 – .3340 = .1558.
This is the probability that between 40 and 45 (inclusive) of the next 80 planes have to circle at least once.
[Figure: histogram with a normal curve centred at 35.2. The area from 45.5 down = .5 + .4898 = .9898; the area from 39.5 down = .5 + .3340 = .8340. Using the normal approximation, we estimate the combined area of the rectangles for 40 to 45 to be .9898 – .8340 = .1558.]
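To see how well the approximation does in this example, the corrected normal area can be compared with the exact binomial sum. This is a sketch using only the Python standard library; it uses unrounded values of σ and Z, so the result will differ slightly from the table-based figure.

```python
import math
from statistics import NormalDist

n, p = 80, 0.44                      # values from the example
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
circling = NormalDist(mu, sigma)

# Normal approximation with the correction for continuity (39.5 to 45.5)
approx = circling.cdf(45.5) - circling.cdf(39.5)

# Exact binomial probability P(40 <= X <= 45), for comparison
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(40, 46))

print(round(approx, 4), round(exact, 4))
```

Dropping the correction (using 40 and 45 as the limits) leaves out half a rectangle at each end, which is why the limits 39.5 and 45.5 are used.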
CRV Example 1
Wind speed in Windy City is normally-distributed and
the middle 40% of wind speeds is bounded by 23.9 and
29.3.
a. Wind speed would be expected to be lower than what
value only 5% of the time?
CRV Example 1a
[Figure: normal curve with the middle .40 of the area between 23.9 and 29.3, and .20 on each side of µ.]
Since the distribution is symmetric, µ = midpoint between 23.9 and 29.3. This is 26.6.
What value of Z is associated with p = .20? From Table, Z = 0.53.
Since Z = (X – µ) / σ, then σ = (X – µ) / Z
σ = (29.3 – 26.6) / 0.53 = 5.094
Z for p = .45 is 1.645 (from Table)
Thus, required X = 26.6 – 1.645 (5.094) = 18.22
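The reasoning in part a can also be run backwards in code, going from areas to Z values with an inverse lookup. This is a sketch using `statistics.NormalDist.inv_cdf`; it uses unrounded Z values (about 0.524 rather than the table's 0.53), so it gives a slightly different σ and X than the table-based working.

```python
from statistics import NormalDist

low, high = 23.9, 29.3               # bounds of the middle 40% (from the example)
std_normal = NormalDist()            # mean 0, standard deviation 1

mu = (low + high) / 2                # symmetry: the mean is the midpoint
z_upper = std_normal.inv_cdf(0.70)   # 50% below the mean plus 20% up to the upper bound
sigma = (high - mu) / z_upper        # rearranged from Z = (X - mu) / sigma

# Value that wind speed falls below only 5% of the time (the 5th percentile)
x_05 = NormalDist(mu, sigma).inv_cdf(0.05)
print(round(mu, 1), round(sigma, 3), round(x_05, 2))
```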
[Figure: normal curve with .05 of the area below X = 18.22 and .45 of the area between 18.22 and µ.]
CRV Example 1
Wind speed in Windy City is normally-distributed and
the middle 40% of wind speeds is bounded by 23.9 and
29.3.
b. UV radiation in Windy City is normally distributed with a
mean of 10.4 and a variance of 6.25. Wind speed and UV
radiation are independent of each other. A "bad day" in
Windy City is any day on which either wind speed
exceeds 35.0 or UV radiation exceeds 15. What is the
probability of a bad day in Windy City?
CRV Example 1b
UV radiation: σ = √6.25 = 2.5
[Figure: normal curve centred at 10.4 with the area above 15 shaded.]
Z = (15 – 10.4) / 2.5 = 1.84
P(0 ≤ Z ≤ 1.84) = .4671 (from Table)
P(X > 15) = P(Z > 1.84) = .5 – .4671 = .0329
Wind speed: σ = 5.094 (from part a)
[Figure: normal curve centred at 26.6 with the area above 35 shaded.]
Z = (35 – 26.6) / 5.094 = 1.65
P(0 ≤ Z ≤ 1.65) = .4505 (from Table)
P(X > 35) = P(Z > 1.65) = .5 – .4505 = .0495
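P(Bad Day) = P[(UV > 15) or (Wind > 35)]
= P(UV > 15) + P(Wind > 35) – P(UV > 15) P(Wind > 35)
= .0329 + .0495 – (.0329)(.0495)
= .0808
(From the Additive Rule of Probability. The joint probability is a product because wind speed and UV radiation are independent.)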
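A short Python sketch of the whole of part b, using `statistics.NormalDist` in place of the table. The wind-speed µ and σ are the table-based values found in part a; the variable names are illustrative.

```python
from statistics import NormalDist

uv = NormalDist(mu=10.4, sigma=6.25 ** 0.5)   # variance 6.25, so sigma = 2.5
wind = NormalDist(mu=26.6, sigma=5.094)       # mu and sigma from part a

p_uv = 1 - uv.cdf(15)        # P(UV > 15)
p_wind = 1 - wind.cdf(35.0)  # P(Wind > 35)

# Additive rule; independence lets the joint probability be a simple product
p_bad_day = p_uv + p_wind - p_uv * p_wind
print(round(p_uv, 4), round(p_wind, 4), round(p_bad_day, 4))
```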
CRV Example 2
Three truckers from different companies meet at a truck-stop and start
talking about how fast their companies are at getting freight to its destination.
Trucker #1 tells the others that Acme, his company, requires drivers to make
the Toronto-Quebec City run in 726 minutes, though the actual mean
duration for the trip is 735, with a standard deviation of 9 minutes. Trucker #2
reveals that Road Hog Transport, his company, wants drivers to make the
same run in 732 minutes, but the actual mean time is 740 minutes, with a
standard deviation of 12. Trucker #3 claims that his company, TakeAChance
Transport, gives drivers 754 minutes to make the run, and that the actual
mean trip length is 760 minutes with a standard deviation of 18. Assume that
trip times are normally distributed.
CRV Example 2
a. The three drivers each make the Toronto-Quebec City
run, leaving the truck-stop at the same moment. Acme’s
driver makes the run in the time specified by his company.
What is the probability that he beats the Road Hog
driver?
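CRV Example 2a
Acme’s driver arrives at 726 minutes, so he beats the Road Hog driver if Road Hog’s trip takes longer than 726 minutes.
Z = (726 – 740) / 12 = –14 / 12 = –1.167
P(–1.167 ≤ Z ≤ 0) = .3790 (from Table).
Thus, the probability that Acme beats Road Hog is .3790 + .5 = .8790.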
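The same probability can be checked directly. This is a sketch with `statistics.NormalDist`; it uses the unrounded Z (–1.1667) rather than the table entry for 1.17, so it comes out marginally below .8790.

```python
from statistics import NormalDist

road_hog = NormalDist(mu=740, sigma=12)   # actual Road Hog trip times (from the example)
acme_time = 726                           # Acme's driver makes the time his company specifies

# Acme wins if Road Hog's trip takes longer than 726 minutes
p_acme_wins = 1 - road_hog.cdf(acme_time)
print(round(p_acme_wins, 4))
```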
[Figure: Road Hog’s normal curve centred at 740, with the area .3790 between 726 and 740 and the area .5 above 740.]
CRV Example 2
b. What is the probability that, relative to the time specified
by their respective companies, all three drivers arrive at
least 20 minutes late?
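CRV Example 2b
“At least 20 minutes late” means a trip of at least 726 + 20 = 746 minutes for Acme, 732 + 20 = 752 for Road Hog, and 754 + 20 = 774 for TakeAChance.
Z(Acme) = (746 – 735) / 9 = 1.22; P(Z ≥ 1.22) = .5 – .3888 = .1112
Z(Road Hog) = (752 – 740) / 12 = 1.00; P(Z ≥ 1.00) = .5 – .3413 = .1587
Z(TakeAChance) = (774 – 760) / 18 = .777; P(Z ≥ .777) = .5 – .2823 = .2177
Assuming the three trips are independent, P(all three at least 20 minutes late) = (.1112)(.1587)(.2177) = .0038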
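A sketch of part b in Python, looping over the three companies. It assumes the three trip times are independent, so the probability that all three drivers are at least 20 minutes late is the product of the three tail areas. The dictionary layout and names are illustrative.

```python
from statistics import NormalDist

# (company deadline, actual mean, actual SD) in minutes, from the example
companies = {
    "Acme": (726, 735, 9),
    "Road Hog": (732, 740, 12),
    "TakeAChance": (754, 760, 18),
}

p_all_late = 1.0
for name, (deadline, mu, sigma) in companies.items():
    # Upper-tail area beyond "deadline + 20 minutes"
    p_late = 1 - NormalDist(mu, sigma).cdf(deadline + 20)
    print(name, round(p_late, 4))
    p_all_late *= p_late   # product rule, assuming independent trips

print(round(p_all_late, 4))
```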
CRV Example 2
c. Suppose Trucker #2 arrived 20 minutes late and Trucker #3
arrived 30 minutes early (relative to their respective
companies’ demands). What is the probability that Trucker
#1 was the second to arrive?
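CRV Example 2c
Trucker #2 arrived at 732 + 20 = 752 minutes and Trucker #3 at 754 – 30 = 724 minutes, so Trucker #1 is second if his time falls between 724 and 752.
Z1 = (752 – 735) / 9 = 17 / 9 = 1.889
Z2 = (724 – 735) / 9 = –11 / 9 = –1.222
P(0 ≤ Z ≤ 1.889) = .4706
P(–1.222 ≤ Z ≤ 0) = .3888
P(–1.222 ≤ Z ≤ 1.889) = .4706 + .3888 = .8594
Thus, P (Trucker #1 arrives second) = .8594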
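Part c is an interval probability on Trucker #1's trip time, which can be sketched as a difference of two cumulative areas. The variable names are illustrative, and the result uses unrounded Z values, so it may differ from the table-based answer in the last digit or two.

```python
from statistics import NormalDist

acme = NormalDist(mu=735, sigma=9)   # Trucker #1's actual trip times (from the example)
t2 = 732 + 20                        # Trucker #2 arrived 20 minutes late
t3 = 754 - 30                        # Trucker #3 arrived 30 minutes early

# Trucker #1 is second if his time falls between the other two arrival times
earlier, later = min(t2, t3), max(t2, t3)
p_second = acme.cdf(later) - acme.cdf(earlier)
print(round(p_second, 4))
```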
Review
• Area under the curve gives the probability of finding X in a given interval.
• Area under the curve for the Standard Normal Distribution is given in Table IV.
• For area under the curve for other normally-distributed variables, first compute:
  Z = (X – µ) / σ
  Then look up Z in Table IV.