Standard normal distribution

Lecture 6
Outline
1. Two kinds of random variables
a. Discrete random variables
b. Continuous random variables
2. Symmetric distributions
3. Normal distributions
4. The standard normal distribution
5. Using the standard normal distribution
6. The normal approximation to the binomial
1. Two kinds of random variables
a. Discrete random variable (DRV)
• Outcomes have countable values
• Possible values can be listed
• E.g., # of people in this room
• Possible values can be listed: might be … 51 or 52 or 53 …
1. Two kinds of random variables
a. Discrete RV
b. Continuous RV (CRV)
• Not countable
• Consists of points in an interval
• E.g., time till coffee break
1. Two kinds of random variables
• The form of the probability distribution for a CRV is a smooth curve.
• Such a distribution may also be called a
  • Frequency Distribution
  • Probability Density Function
1. Two kinds of random variables
• In the graph of a CRV, the X axis is whatever you are measuring
• E.g., exam scores, mood scores, # of widgets produced per hour.
• The Y axis measures the frequency of scores.
[Figure: a smooth curve over the X axis.] The Y-axis measures frequency. It is usually not shown.
2. Symmetric Distributions
• In a symmetric CRV, 50% of the area under the curve is in each half of the distribution.
P(x ≤ µ) = P(x ≥ µ) = .5
• Note: Because points on a CRV are infinitely thin, we can only measure the probability of intervals of X values
• We can’t measure or compute the probability of individual X values.
[Figure: 50% of area on each side of µ.] A symmetric distribution which is not mound-shaped. The two sections (above and below the mean) each contain 50% of the observations.
[Figure: 50% of area on each side of µ.] A mound-shaped symmetric distribution (the normal distribution).
3. Normal Distributions
• A very important set of CRVs has mound-shaped and symmetric probability distributions. These are called “normal distributions”
• Many naturally-occurring variables are normally distributed.
3. Normal Distributions
• Are perfectly symmetrical around their mean, µ.
• Have standard deviation, σ, which measures the “spread” of a distribution
• σ is an index of variability around the mean.
3. Normal distributions
• There are an infinite number of normal distributions
• They are distinguished on the basis of their mean (µ) and standard deviation (σ)
4. Standard Normal Distribution
• The standard normal distribution is a special one produced by converting raw scores into Z scores
• Thus, µ = 0 and σ = 1
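The conversion from raw scores to Z scores can be sketched in a few lines of Python. This is only an illustration: the list of heights and the helper name `to_z_scores` are made up for the example and do not come from the lecture. It uses the population mean and standard deviation, so the resulting Z scores should have mean 0 and standard deviation 1.

```python
from statistics import fmean, pstdev

def to_z_scores(raw_scores):
    """Convert raw scores to Z scores: Z = (X - mu) / sigma."""
    mu = fmean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation
    return [(x - mu) / sigma for x in raw_scores]

# Hypothetical raw scores, for illustration only
heights_cm = [145, 150, 160, 170, 175]
z_scores = to_z_scores(heights_cm)

print(round(fmean(z_scores), 6))   # mean of the Z scores
print(round(pstdev(z_scores), 6))  # standard deviation of the Z scores
```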
4. Standard Normal Distribution
• The area under the curve between µ and some value X ≥ µ has been calculated for the standard normal distribution and is given in Table IV of the text.
• Because the distribution is symmetric, table values can also be used for scores below the mean (Z scores below 0).
• E.g., for Z = 1.62, area = .4474
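If the printed table is not at hand, the same "mean to Z" area can be computed directly. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `area_mean_to_z` is chosen for this example only.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Area under the standard normal curve between 0 and |z|.

    This mirrors a table that lists the area from the mean out to Z;
    by symmetry the same value applies to negative Z.
    """
    return NormalDist().cdf(abs(z)) - 0.5

print(round(area_mean_to_z(1.62), 4))   # area between Z = 0 and Z = 1.62
print(round(area_mean_to_z(-1.62), 4))  # same area, by symmetry
```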
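[Figure: standard normal curve with the area .4474 shaded between Z = 0 and Z = 1.62.] Area gives the probability of finding a score between the mean and X when you make an observation
[Figure: standard normal curve with the area .4474 shaded between Z = -1.62 and Z = 0.] For Z < 0, same values can be used as for Z > 0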
5. Using the Standard Normal Distribution
• Suppose the average height for Canadian women is µ = 160 cm, with σ = 15 cm.
• What is the probability that the next Canadian woman we meet is more than 175 cm tall?
• Note two things:
1. this is a question about a single case
2. it specifies an interval.
Using the Standard Normal Distribution
[Figure: normal curve of height in cm, with 160 and 175 marked. The table gives the area between 160 and 175; we need the area above 175.]
Remember that the area above the mean, µ, is half (.5) of the distribution.
Using the Standard Normal Distribution
[Figure: the area between 160 and 175 is shaded.] Call this shaded area P. We can get P from Table IV
Using the Standard Normal Distribution
$Z = \frac{X - \mu}{\sigma} = \frac{175 - 160}{15} = 1.00$
Now, look up Z = 1.00 in the table.
Corresponding area is .3413.
• Value of Z that marks one end of the interval: you want to find the probability that a randomly selected case has a score that falls in this interval
• Other end of the interval is at µ (= 0)
Using the Standard Normal Distribution
[Figure: normal curve with 160 and 175 marked, equivalently Z = 0 and Z = 1.0.] The area between the mean and 175 is .3413, so the area above 175 must be .5 – .3413 = .1587
Using the Standard Normal Distribution
• What is the probability that the next Canadian woman we meet is more than 175 cm tall?
• Answer: .1587
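The height example can be checked numerically. This sketch follows the same steps as the slides (standardize, find the area from the mean to Z, subtract from .5), with `statistics.NormalDist` standing in for Table IV; the variable names are chosen for the example.

```python
from statistics import NormalDist

mu, sigma = 160, 15   # mean and standard deviation of height, in cm
x = 175               # height of interest, in cm

z = (x - mu) / sigma                        # Z = (X - mu) / sigma
area_mean_to_x = NormalDist().cdf(z) - 0.5  # what the table would give
p_taller = 0.5 - area_mean_to_x             # upper-tail area above 175 cm

print(z)
print(round(area_mean_to_x, 4))
print(round(p_taller, 4))
```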
Binomial Random Variable (BRV) – Method #3
When n is large and p is not too close to 0 or 1, we can use the normal approximation to the binomial probability distribution to work out binomial probabilities.
How can you tell if the normal approximation is appropriate in a given case?
Use the approximation if np ≥ 5 and nq ≥ 5, where q = 1 – p
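The rule of thumb is easy to express as a small check. The function name `normal_approx_ok` and the `threshold` parameter are illustrative choices, not part of the lecture; the second call uses made-up values to show a case that fails the check.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule-of-thumb check: use the normal approximation if np >= 5 and nq >= 5."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

print(normal_approx_ok(80, 0.44))  # np = 35.2, nq = 44.8
print(normal_approx_ok(10, 0.10))  # np = 1.0, too small
```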
BRV – Method #3
[Figure: histogram of the BRV X with a normal curve overlaid.]
The histogram shows the probabilities of different values of the BRV X. Because area gives probability, we can use the area under a section of the normal curve to approximate the area of some part of the histogram.
BRV – Method #3
In order to use the normal approximation, we have to be able to compute the mean and standard deviation for the BRV (in order to work out Z).
μ = np (# of observations times P(Success))
σ = √(npq)
There’s just one other issue to deal with…
BRV – Method #3
[Figure: histogram for X = 1 to 10 with a normal curve overlaid.] Notice how the normal curve misses one top corner of the rectangle for 7 and overstates the other top corner. These two errors roughly cancel each other.
BRV – Method #3
[Figure: the same histogram for X = 1 to 10.] Notice how the rectangle for 7 runs from 6.5 to 7.5. The probability of X values up to and including 7 is given by the area to the left of 7.5.
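To see what the half-unit adjustment buys, the sketch below compares the exact binomial probability P(X ≤ 7) with the normal-curve area to the left of 7.5 and to the left of 7. The slides do not state n and p for the histogram, so n = 10 and p = 0.5 are assumptions made purely for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.5   # assumed values; the slides do not give n and p here
q = 1 - p

# Exact binomial probability of 7 or fewer successes
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(0, 8))

# Normal curve with the same mean and standard deviation as the BRV
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
with_correction = normal.cdf(7.5)    # area to the left of 7.5
without_correction = normal.cdf(7)   # area to the left of 7

print(exact, with_correction, without_correction)
```

Under these assumed values, the area to the left of 7.5 lands much closer to the exact probability than the area to the left of 7.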
BRV – Example 1 from last week
Air Canada keeps telling us that arrival and departure times
at Pearson International are improving. Right now, the
statistics show that 60% of the Air Canada planes coming
into Pearson do arrive on time. (This actually is an
improvement over 10 years ago when only 42% of the Air
Canada planes arrived on time at Pearson.) The problem is
that when a plane arrives on time, it often has to circle the
airport because there’s still a plane in its gate (a plane
which didn’t leave on time). Statistics also show that 50%
of the planes that arrive on time have to circle at least
once, while only 35% of the planes that arrive late have to
circle at least once.
BRV – Example 1 from last week
c) Of the next 80 Air Canada planes that arrive at
Pearson, what’s the probability that between 40
and 45 (inclusive) have to circle at least once?
BRV – Example 1
First, we check to see whether we can use the normal approximation. To do this, we need to know the probability that a plane has to circle at least once (C = circles at least once, L = arrives late):
P(C) = P(C ∩ Late) + P(C ∩ Not Late)
= [P(L) * P(C│L)] + [P(not L) * P(C│not L)]
= .40 (.35) + .60 (.50)
= .44
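The same total-probability calculation, written out with named quantities. The variable names are chosen for this example; the numbers are the ones given in the problem statement.

```python
p_on_time = 0.60                # P(plane arrives on time)
p_late = 1 - p_on_time          # P(plane arrives late)
p_circle_given_on_time = 0.50   # P(circles at least once | on time)
p_circle_given_late = 0.35      # P(circles at least once | late)

# Total probability: add the two mutually exclusive ways a plane can circle
p_circle = (p_late * p_circle_given_late
            + p_on_time * p_circle_given_on_time)

print(round(p_circle, 2))
```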
BRV – Example 1
Now we can do the check:
n = 80, p = .44 and q = (1 – p) = .56
np = 80 (.44) = 35.2 > 5
nq = 80 (.56) = 44.8 > 5
Thus, it’s OK to use the normal approximation.
BRV – Example 1
μ = np = 80 (.44) = 35.2
σ = √(npq) = √(80(.44)(.56)) = 4.44
Correction for continuity: to get area for 40 and up, we use X = 39.5. To get area for 45 and below, we use X = 45.5
[Figure: normal curve centred at 35.2 with 40 and 45 marked. 40 and up starts at 39.5; 45 and below starts at 45.5.]
BRV – Example 1
$Z_{39.5} = \frac{39.5 - 35.2}{4.44} = +.97$
$Z_{45.5} = \frac{45.5 - 35.2}{4.44} = +2.32$
P(.97 ≤ Z ≤ 2.32) = .4898 – .3340 = .1558.
That is the probability that between 40 and 45 (inclusive) of the next 80 planes have to circle at least once.
[Figure: normal curve centred at 35.2 over the histogram rectangles for 40 to 45.]
From 45.5 down = .5 + .4898 = .9898
From 39.5 down = .5 + .3340 = .8340
Using the normal approximation, we estimate the combined area of all these rectangles to be .9898 – .8340 = .1558
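As a cross-check on this example, the sketch below computes the normal-approximation area between 39.5 and 45.5 and compares it with the exact binomial sum for 40 through 45. It uses unrounded values of σ and Z, so its result can differ from a table-based answer in the third decimal place. The variable names are chosen for the example.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 80, 0.44
q = 1 - p

mu = n * p               # mean of the BRV
sigma = sqrt(n * p * q)  # standard deviation of the BRV

# Normal approximation with the correction for continuity
normal = NormalDist(mu, sigma)
approx = normal.cdf(45.5) - normal.cdf(39.5)

# Exact binomial probability of 40, 41, ..., 45 planes circling
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(40, 46))

print(round(sigma, 2))
print(round(approx, 4), round(exact, 4))
```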
CRV Example 1
Wind speed in Windy City is normally-distributed
and the middle 40% of wind speeds is bounded
by 23.9 and 29.3.
a. Wind speed would be expected to be lower than
what value only 5% of the time?
[Figure: normal curve with the middle .40 of the area between 23.9 and 29.3, so .20 lies between µ and 29.3.]
Since the distribution is symmetric, µ = midpoint between 23.9 and 29.3. This is 26.6.
What value of Z is associated with p = .20? From the Table, Z = 0.53
CRV Example 1a
Since $Z = \frac{X - \mu}{\sigma}$, then $\sigma = \frac{X - \mu}{Z}$
$\sigma = \frac{29.3 - 26.6}{0.53} = 5.094$
Z for p = .45 is 1.645 (from Table)
Thus, required X = 26.6 – 1.645 (5.094) = 18.22
[Figure: normal curve with .05 of the area below 18.22 and .45 between 18.22 and µ.]
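The steps of Example 1a can be retraced in code. This sketch deliberately reuses the two table values from the slides (0.53 and 1.645) rather than computing quantiles, so that it reproduces the lecture's numbers; the variable names are chosen for the example.

```python
lower, upper = 23.9, 29.3   # bounds of the middle 40% of wind speeds

z_upper = 0.53    # table Z for an area of .20 between the mean and Z
z_tail = 1.645    # table Z for an area of .45 between the mean and Z

mu = (lower + upper) / 2         # symmetry: the mean is the midpoint
sigma = (upper - mu) / z_upper   # rearranged from Z = (X - mu) / sigma
x_5th = mu - z_tail * sigma      # value with only 5% of the area below it

print(round(mu, 1), round(sigma, 3), round(x_5th, 2))
```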
CRV Example 1
Wind speed in Windy City is normally-distributed
and the middle 40% of wind speeds is bounded
by 23.9 and 29.3.
b. UV radiation in Windy City is normally
distributed with a mean of 10.4 and a variance
of 6.25. Wind speed and UV radiation are
independent of each other. A "bad day" in Windy
City is any day on which either wind speed
exceeds 35.0 or UV radiation exceeds 15. What
is the probability of a bad day in Windy City?
CRV Example 1b
σ = √6.25 = 2.5
[Figure: normal curve for UV radiation centred at 10.4, with 15 marked.]
CRV Example 1b
$Z = \frac{15 - 10.4}{2.5} = 1.84$
P(0 < Z < 1.84) = .4671 (from Table)
P(X > 15) = P(Z > 1.84) = .5 – .4671 = .0329
CRV Example 1b
[Figure: normal curve for wind speed centred at 26.6 with σ = 5.094, with 35 marked.]
CRV Example 1b
$Z = \frac{35 - 26.6}{5.094} = 1.65$
P(0 < Z < 1.65) = .4505 (from Table)
P(X > 35) = P(Z > 1.65) = .5 – .4505 = .0495
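CRV Example 1b
P(Bad Day) = P[(UV > 15) or (Wind > 35)]
= .0329 + .0495 – (.0329)(.0495)
= .0808
(From Additive Rule of Probability. Because wind speed and UV radiation are independent, the probability of both happening is the product of the two probabilities.)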
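The whole of Example 1b can be sketched as below. It takes the wind-speed mean and standard deviation from part a (26.6 and 5.094) and uses `statistics.NormalDist` in place of Table IV, so intermediate values are unrounded and may differ from the table-based ones in the fourth decimal place. The variable names are chosen for the example.

```python
from statistics import NormalDist

uv = NormalDist(mu=10.4, sigma=6.25 ** 0.5)   # variance 6.25, so sigma = 2.5
wind = NormalDist(mu=26.6, sigma=5.094)       # values found in part a

p_uv = 1 - uv.cdf(15)        # P(UV > 15)
p_wind = 1 - wind.cdf(35.0)  # P(Wind > 35)

# Additive rule; independence gives P(both) = P(UV > 15) * P(Wind > 35)
p_bad_day = p_uv + p_wind - p_uv * p_wind

print(round(p_uv, 4), round(p_wind, 4), round(p_bad_day, 4))
```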
Review
• Area under curve gives probability of finding X in a given interval.
• Area under the curve for Standard Normal Distribution is given in Table IV.
• For area under the curve for other normally-distributed variables first compute:
$Z = \frac{X - \mu}{\sigma}$
Then look up Z in Table IV.