MTH/STA 561
NORMAL PROBABILITY DISTRIBUTION
The most important continuous probability distribution in the entire field of statistics
is the normal distribution. Originally, the mathematical equation of the normal distribution was developed by Abraham DeMoivre in 1733. However, it is often referred to as the
Gaussian distribution in honor of Karl Friedrich Gauss (1777-1855). The normal distribution is important both because it seems to provide an adequate model for various observed
measurements and, as we will see in the subsequent chapters, because it provides an accurate
approximation to a wide variety of probability distributions.
The graph of the normal distribution is a bell-shaped curve and is called the normal
curve or Gaussian curve. A continuous random variable Y that has the bell-shaped normal
distribution is called a normal random variable. The probability distribution of the normal
random variable is a function of the form
\[
f(y; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right]
\quad \text{for } -\infty < y < \infty .
\]
It depends upon the two parameters $\mu$ and $\sigma$, which will be shown later to be the mean and
standard deviation, respectively, of the normal random variable. As demonstrated in the
following theorem, this bell-shaped function is a legitimate probability distribution.
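Theorem 1. For any real number $\mu$ with $-\infty < \mu < \infty$ and any $\sigma > 0$,
\[
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] dy = 1 .
\]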
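Proof. Consider the integral
\[
I = \int_{-\infty}^{\infty} \exp\left( -\frac{x^{2}}{2} \right) dx .
\]
This integral exists because the integrand is a positive continuous function which is bounded
by an integrable function; that is,
\[
0 < \exp\left( -\frac{x^{2}}{2} \right) < \exp\left( -|x| + 1 \right)
\]
for $-\infty < x < \infty$, and
\[
\int_{-\infty}^{\infty} \exp\left( -|x| + 1 \right) dx
= 2 \int_{0}^{\infty} \exp\left( -x + 1 \right) dx
= 2 \lim_{b \to \infty} \Big[ -\exp\left( -x + 1 \right) \Big]_{0}^{b}
= 2e .
\]
To evaluate the integral $I$, we note that $I > 0$ and that $I^{2}$ may be written
\[
I^{2} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left( -\frac{x^{2} + w^{2}}{2} \right) dx\, dw .
\]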
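This double integral can be evaluated by changing to polar coordinates. If we let $x = \rho\cos\theta$
and $w = \rho\sin\theta$, then we have $dx\,dw = \rho\, d\rho\, d\theta$, $x^{2} + w^{2} = \rho^{2}$, and
\[
I^{2} = \int_{0}^{2\pi} \int_{0}^{\infty} \exp\left( -\frac{\rho^{2}}{2} \right) \rho\, d\rho\, d\theta
= \int_{0}^{2\pi} \lim_{t \to \infty} \left[ -\exp\left( -\frac{\rho^{2}}{2} \right) \right]_{\rho = 0}^{\rho = t} d\theta
= \int_{0}^{2\pi} d\theta = 2\pi .
\]
Accordingly, $I = \sqrt{2\pi}$ and thus
\[
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^{2}}{2} \right) dx = 1 .
\]
Now if we let $x = (y - \mu)/\sigma$, where $\sigma > 0$, then $dx = dy/\sigma$ and the preceding integral
becomes
\[
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] dy = 1 .
\]
Since $\sigma > 0$, the integrand of the preceding integral is positive and integrates to one, so it satisfies the requirements of being a
probability density function of a continuous type of random variable. As a result, we now
formally define the normal distribution as follows.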
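Definition 1. A continuous random variable $Y$ is said to have a normal probability
distribution (or Gaussian distribution) with parameters $\mu$ and $\sigma$ if its probability density
function is given by
\[
f(y; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right]
\]
for $-\infty < \mu < \infty$, $\sigma > 0$, and $-\infty < y < \infty$.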
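As a quick numerical sanity check on Theorem 1, the density of Definition 1 can be integrated numerically. The sketch below is illustrative only: the function names `normal_pdf` and `integrate_pdf`, the midpoint rule, and the truncation of the range to $\mu \pm 10\sigma$ are assumptions made here, not part of the notes.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density f(y; mu, sigma) of Definition 1; requires sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate_pdf(mu, sigma, width=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of the density over
    [mu - width*sigma, mu + width*sigma] (an assumed truncation; the
    tails outside this range are negligible)."""
    a = mu - width * sigma
    h = 2.0 * width * sigma / steps
    return h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))

if __name__ == "__main__":
    for mu, sigma in [(0.0, 1.0), (250.0, 40.0), (40.0, 6.0)]:
        print(f"mu={mu}, sigma={sigma}: integral ~ {integrate_pdf(mu, sigma):.10f}")
```

For each parameter pair the printed value should be very close to one, whatever the choice of $\mu$ and $\sigma > 0$.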
As will be shown later, $\mu$ and $\sigma^{2}$ can be verified by means of the moment-generating function
to be the mean and variance, respectively, of the random variable $Y$. The normal distribution occurs so frequently in certain parts of statistics that we denote it, for brevity, by
$Y \sim N(\mu, \sigma^{2})$. Thus, by writing $Y \sim N(\mu, \sigma^{2})$, we mean that the random variable $Y$ is
normally distributed with mean $\mu$ and variance $\sigma^{2}$.
The graph of the normal distribution $N(\mu, \sigma^{2})$, generally referred to as the normal curve or
Gaussian curve, is a bell-shaped curve as shown below. Intuitively, the normal curve is a natural
model in the sense that the density is highest at the mean (center) and decreases smoothly and
symmetrically toward both ends, so that values become less and less likely the farther they lie from the mean.
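We now derive the moment-generating function for a normal random variable as follows.
Theorem 2. The moment-generating function for a normal random variable $Y$ is given
by
\[
m_{Y}(t) = \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right) .
\]
Proof. By definition, the moment-generating function for a normal random variable $Y$
is given by
\[
m_{Y}(t) = E\left( e^{tY} \right)
= \int_{-\infty}^{\infty} \exp(ty)\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] dy .
\]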
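The two exponential factors of the integrand can be combined as
\[
\exp(ty) \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right]
= \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} + ty \right]
= \exp\left[ -\frac{y^{2} - 2\sigma^{2} t y - 2\mu y + \mu^{2}}{2\sigma^{2}} \right] .
\]
Thus,
\[
m_{Y}(t) = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{y^{2} - 2\sigma^{2} t y - 2\mu y + \mu^{2}}{2\sigma^{2}} \right] dy .
\]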
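Now let us complete the square in the numerator of the exponent; that is,
\[
\begin{aligned}
y^{2} - 2\sigma^{2} t y - 2\mu y + \mu^{2}
&= y^{2} - 2\left( \mu + \sigma^{2} t \right) y + \mu^{2} \\
&= \left[ y - \left( \mu + \sigma^{2} t \right) \right]^{2} - \left( \mu + \sigma^{2} t \right)^{2} + \mu^{2} \\
&= \left[ y - \left( \mu + \sigma^{2} t \right) \right]^{2} - \mu^{2} - 2\mu\sigma^{2} t - \sigma^{4} t^{2} + \mu^{2} \\
&= \left[ y - \left( \mu + \sigma^{2} t \right) \right]^{2} - 2\mu\sigma^{2} t - \sigma^{4} t^{2} \\
&= \left[ y - \left( \mu + \sigma^{2} t \right) \right]^{2} - 2\sigma^{2} \left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right) ,
\end{aligned}
\]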
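which implies that
\[
\begin{aligned}
\exp\left[ -\frac{y^{2} - 2\sigma^{2} t y - 2\mu y + \mu^{2}}{2\sigma^{2}} \right]
&= \exp\left\{ -\frac{\left[ y - \left( \mu + \sigma^{2} t \right) \right]^{2}}{2\sigma^{2}} + \mu t + \frac{1}{2} \sigma^{2} t^{2} \right\} \\
&= \exp\left[ -\frac{1}{2} \left( \frac{y - \left( \mu + \sigma^{2} t \right)}{\sigma} \right)^{2} \right]
\exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right) .
\end{aligned}
\]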
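Thus, $m_{Y}(t)$ becomes
\[
m_{Y}(t) = \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right)
\underbrace{\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \left( \mu + \sigma^{2} t \right)}{\sigma} \right)^{2} \right] dy}_{N(\mu + \sigma^{2} t,\ \sigma^{2})}
= \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right)
\]
because the integrand of the last integral is the density of a normal distribution with
parameters $\mu + \sigma^{2} t$ and $\sigma$, and hence the integral is equal to one.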
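The closed form in Theorem 2 can be compared against a direct numerical evaluation of $E(e^{tY})$. The following is a minimal sketch under stated assumptions: the helper names, the midpoint rule, and the integration window of $\pm 12\sigma$ around $\mu + \sigma^{2} t$ (where the combined integrand peaks, by the completed square) are choices made here for illustration.

```python
import math

def mgf_closed_form(t, mu, sigma):
    """Theorem 2: m_Y(t) = exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

def mgf_numeric(t, mu, sigma, width=12.0, steps=40_000):
    """Midpoint-rule approximation of E[exp(tY)] for Y ~ N(mu, sigma^2).

    The window is centered at mu + sigma^2 * t, the peak of the
    integrand exp(t*y) * f(y; mu, sigma).
    """
    center = mu + sigma**2 * t
    a = center - width * sigma
    h = 2.0 * width * sigma / steps
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(steps):
        y = a + (i + 0.5) * h
        total += math.exp(t * y - 0.5 * ((y - mu) / sigma) ** 2) / norm
    return h * total

if __name__ == "__main__":
    for t, mu, sigma in [(0.3, 1.0, 2.0), (-0.5, 40.0, 6.0)]:
        print(t, mu, sigma, mgf_numeric(t, mu, sigma), mgf_closed_form(t, mu, sigma))
```

The two printed values in each row should agree to many significant digits.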
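Thus, the mean and variance of a normal random variable may be calculated from the
moment-generating function $m_{Y}(t)$ as follows: By differentiating $m_{Y}(t)$ with respect to $t$,
we obtain
\[
E(Y) = \left. \frac{d}{dt} m_{Y}(t) \right|_{t=0}
= \left. \left( \mu + \sigma^{2} t \right) \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right) \right|_{t=0}
= \mu
\]
and
\[
E\left( Y^{2} \right) = \left. \frac{d^{2}}{dt^{2}} m_{Y}(t) \right|_{t=0}
= \left. \left[ \sigma^{2} \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right)
+ \left( \mu + \sigma^{2} t \right)^{2} \exp\left( \mu t + \frac{1}{2} \sigma^{2} t^{2} \right) \right] \right|_{t=0}
= \sigma^{2} + \mu^{2} .
\]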
Thus,
\[
\mathrm{Var}(Y) = E\left( Y^{2} \right) - \left[ E(Y) \right]^{2} = \sigma^{2} + \mu^{2} - \mu^{2} = \sigma^{2} .
\]
We summarize the preceding discussion in the following theorem.
Theorem 3. The mean and variance of a normal random variable $Y$ are $\mu$ and $\sigma^{2}$,
respectively; that is,
\[
E(Y) = \mu \quad \text{and} \quad \mathrm{Var}(Y) = \sigma^{2} .
\]
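The differentiation step behind Theorem 3 can be mimicked numerically by applying central differences to the moment-generating function at $t = 0$. This is only a sketch: the function names and the step size `h` are assumptions chosen for illustration, and finite differences give approximations rather than exact derivatives.

```python
import math

def mgf(t, mu, sigma):
    """Moment-generating function of N(mu, sigma^2) from Theorem 2."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

def mean_and_variance_from_mgf(mu, sigma, h=1e-3):
    """Approximate m'(0) and m''(0) by central differences and return
    (E(Y), Var(Y)), using Var(Y) = E(Y^2) - [E(Y)]^2."""
    m_plus = mgf(h, mu, sigma)
    m_zero = mgf(0.0, mu, sigma)
    m_minus = mgf(-h, mu, sigma)
    first = (m_plus - m_minus) / (2.0 * h)               # ~ E(Y)
    second = (m_plus - 2.0 * m_zero + m_minus) / h**2    # ~ E(Y^2)
    return first, second - first**2

if __name__ == "__main__":
    mean, var = mean_and_variance_from_mgf(1.5, 2.0)
    print(mean, var)   # should be close to mu = 1.5 and sigma^2 = 4.0
```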
Definition 2. A continuous random variable $Y$ that follows a normal probability distribution with mean $\mu$ and variance $\sigma^{2}$ is referred to as a normal random variable with mean
$\mu$ and variance $\sigma^{2}$.
Once the parameters $\mu$ and $\sigma$ are specified, the normal curve is completely determined.
In Figure (a) of the following set of figures, we have sketched two normal curves which have
the same standard deviation but different means. The two curves are identical in shape but
are centered at different positions along the horizontal axis. In Figure (b), we have sketched
two normal curves with the same mean but different standard deviations. In this case, the
two curves are centered at exactly the same position on the horizontal axis, but the curve
with the larger standard deviation is lower and spreads out farther. Remember that the area
under a probability curve and above the horizontal axis must be equal to one, and therefore
the more variable the set of observations, the lower and wider the corresponding curve will
be. Figure (c) shows the results of sketching two normal curves that have different means
and different standard deviations. Clearly, they are centered at different positions on the
horizontal axis and their shapes reflect the two different values of $\sigma$.
[Figures (a), (b), and (c): the pairs of normal curves described above.]
The following theorem gives a list of properties of the normal curve $N(\mu, \sigma^{2})$.
Theorem 4. Let $Y$ be normally distributed with mean $\mu$ and standard deviation $\sigma$.
Then we have the following properties:
(1). The mode, which is the point on the horizontal axis at which the curve is a maximum,
occurs at $y = \mu$.
(2). The curve is symmetric about a vertical axis through the mean $\mu$.
(3). The curve has its points of inflection at $y = \mu \pm \sigma$. It is concave downward if
$\mu - \sigma < y < \mu + \sigma$, and is concave upward elsewhere.
(4). The curve approaches the horizontal axis asymptotically as we proceed in either
direction away from the mean.
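Proof. (1) It is easy to show that
\[
f'(y; \mu, \sigma) = -\frac{1}{\sigma^{2}} \cdot \frac{1}{\sigma\sqrt{2\pi}} \, (y - \mu) \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right]
\]
and
\[
f''(y; \mu, \sigma) = \frac{1}{\sigma^{4}} \cdot \frac{1}{\sigma\sqrt{2\pi}} \left[ (y - \mu)^{2} - \sigma^{2} \right] \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] .
\]
Setting $f'(y; \mu, \sigma) = 0$ yields $y = \mu$. Since $f'(y; \mu, \sigma) > 0$ for $y < \mu$ and $f'(y; \mu, \sigma) < 0$ for
$y > \mu$, it follows that the mode occurs at $y = \mu$.
(2) It is also easy to verify that $f(\mu + y; \mu, \sigma) = f(\mu - y; \mu, \sigma)$ for all $-\infty < y < \infty$;
that is, the curve is symmetric about the vertical line $y = \mu$.
(3) If $f''(y; \mu, \sigma) = 0$, then $y = \mu - \sigma$ or $y = \mu + \sigma$. Clearly, $f''(y; \mu, \sigma) < 0$ for $|y - \mu| < \sigma$
and $f''(y; \mu, \sigma) > 0$ for $|y - \mu| > \sigma$. Hence, the curve has its points of inflection at $y = \mu \pm \sigma$.
(4) Now note that
\[
\lim_{y \to \pm\infty} f(y; \mu, \sigma)
= \frac{1}{\sigma\sqrt{2\pi}} \lim_{y \to \pm\infty} \exp\left[ -\frac{1}{2\sigma^{2}} (y - \mu)^{2} \right]
= \frac{1}{\sigma\sqrt{2\pi}} \cdot 0 = 0 .
\]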
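The four properties in Theorem 4 can be spot-checked numerically for a particular choice of parameters. The sketch below is an illustration, not a proof: the function names, the evaluation grid, and the test points are assumptions made here, and `second_derivative` simply codes the expression $f'' = \left[ (y - \mu)^{2} - \sigma^{2} \right] f / \sigma^{4}$.

```python
import math

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def second_derivative(y, mu, sigma):
    """f''(y) = ((y - mu)^2 - sigma^2) / sigma^4 * f(y)."""
    return ((y - mu) ** 2 - sigma**2) / sigma**4 * normal_pdf(y, mu, sigma)

def check_properties(mu, sigma):
    """Return spot checks for properties (1) to (4) of Theorem 4."""
    # (1) mode: maximize the density over a grid from mu - 5*sigma to mu + 5*sigma
    grid = [mu + sigma * k / 100.0 for k in range(-500, 501)]
    mode = max(grid, key=lambda y: normal_pdf(y, mu, sigma))
    # (2) symmetry about y = mu at a few offsets
    symmetric = all(
        math.isclose(normal_pdf(mu + d, mu, sigma), normal_pdf(mu - d, mu, sigma), rel_tol=1e-9)
        for d in (0.5, 1.0, 2.5)
    )
    # (3) concave down inside (mu - sigma, mu + sigma), concave up outside
    down_inside = second_derivative(mu + 0.5 * sigma, mu, sigma) < 0
    up_outside = second_derivative(mu + 1.5 * sigma, mu, sigma) > 0
    # (4) the density far from the mean is essentially zero
    far_tail = normal_pdf(mu + 40.0 * sigma, mu, sigma)
    return mode, symmetric, down_inside, up_outside, far_tail

if __name__ == "__main__":
    print(check_properties(40.0, 6.0))
```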
Just as the area under the curve of any continuous probability density function bounded
by the two ordinates $y = y_{1}$ and $y = y_{2}$ is equal to the probability that the random variable
$Y$ assumes a value between $y = y_{1}$ and $y = y_{2}$, for $Y \sim N(\mu, \sigma^{2})$, we have
\[
P\{ y_{1} < Y < y_{2} \} = \int_{y_{1}}^{y_{2}} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] dy ,
\]
represented by the area of the shaded region of the curve given below.
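Let
\[
z = \frac{y - \mu}{\sigma} .
\]
Then $dz = \frac{1}{\sigma}\, dy$ or $dy = \sigma\, dz$. Also, if $y$ falls between $y_{1}$ and $y_{2}$, $z$ will fall between
\[
z_{1} = \frac{y_{1} - \mu}{\sigma} \quad \text{and} \quad z_{2} = \frac{y_{2} - \mu}{\sigma} .
\]
Therefore, the above probability can be written as
\[
P\{ y_{1} < Y < y_{2} \}
= \int_{z_{1}}^{z_{2}} \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^{2} \right) \sigma\, dz
= \int_{z_{1}}^{z_{2}} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^{2} \right) dz .
\]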
It should be noticed that the integrand of the second integral turns out to be the normal
distribution with mean $\mu = 0$ and variance $\sigma^{2} = 1$. As will be shown later, the second
integral can be obtained by using the probability table that is designed specifically for this
kind of probability. Let us first define this special kind of normal distribution as follows.
Definition 3. The probability distribution of a normal random variable $Z$ with mean 0
and variance 1 is called a standard normal distribution. That is, the probability distribution
of the random variable $Z$ is given by
\[
f(z; 0, 1) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^{2} \right)
\quad \text{for } -\infty < z < \infty .
\]
Similar to the meaning of $Y \sim N(\mu, \sigma^{2})$, the expression $Z \sim N(0, 1)$ means that the
random variable $Z$ has a standard normal distribution. Below is the graph of the
standard normal distribution. The distribution function of the standard normal random
variable is given by
\[
F(z; 0, 1) = P\{ Z \le z \} = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} t^{2} \right) dt .
\]
The figures below graph the standard normal distribution and its distribution function.
The table for standard normal probability gives the area under the standard normal curve
corresponding to
\[
P\{ Z > z \} = \int_{z}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} t^{2} \right) dt
\]
for values of $z$ ranging from 0.00 to 5.00, as shown below.
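Entries of such an upper-tail table can be regenerated from the complementary error function, since $P\{Z > z\} = \frac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$. The sketch below is an assumption-laden illustration: the row layout (one row per tenth of $z$, one column per hundredth) and the four-decimal rounding are guesses at a typical table format, not a reproduction of the table used in these notes.

```python
import math

def upper_tail(z):
    """P{Z > z} for Z ~ N(0, 1), computed as erfc(z / sqrt(2)) / 2."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def table_row(z_tenths):
    """Upper-tail areas for z = z_tenths/10 + 0.00, 0.01, ..., 0.09,
    rounded to four decimals (assumed table layout)."""
    return [round(upper_tail((10 * z_tenths + j) / 100.0), 4) for j in range(10)]

if __name__ == "__main__":
    for z_tenths in (0, 7, 17):
        print(f"{z_tenths / 10:.1f}", table_row(z_tenths))
```

For instance, the fourth entry of `table_row(17)` corresponds to $z = 1.73$.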
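Example 1. Let $Z$ be a standard normal random variable. From the Standard Normal
Probability Table, we have
\[
P\{ Z > 1.73 \} = 0.0418
\]
and, since $P\{ Z < -1.85 \} = P\{ Z > 1.85 \}$ by symmetry,
\[
\begin{aligned}
P\{ -1.85 < Z < 0.78 \} &= 1 - P\{ Z > 1.85 \} - P\{ Z > 0.78 \} \\
&= 1 - 0.0322 - 0.2177 \\
&= 0.7501 .
\end{aligned}
\]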
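The table lookups in Example 1 can be cross-checked with Python's standard library, where `statistics.NormalDist()` with no arguments is the standard normal distribution. This is a minimal sketch; the helper names `upper_tail` and `prob_between_z` are introduced here for illustration only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def upper_tail(z):
    """P{Z > z}, the quantity tabulated in the notes."""
    return 1.0 - Z.cdf(z)

def prob_between_z(z1, z2):
    """P{z1 < Z < z2} = 1 - P{Z < z1} - P{Z > z2}."""
    return 1.0 - Z.cdf(z1) - upper_tail(z2)

if __name__ == "__main__":
    print(round(upper_tail(1.73), 4))
    print(round(prob_between_z(-1.85, 0.78), 4))
```

Both values should agree with the table-based answers in Example 1 to about four decimal places.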
Example 2. Given a standard normal random variable $Z$, find the value of $k$ such that
(a) $P\{ Z > k \} = 0.2236$;
(b) $P\{ k < Z < -0.15 \} = 0.4329$.
Solution. (a) From the Standard Normal Probability Table, we read that $P\{ Z > 0.76 \} = 0.2236$.
Hence, $k = 0.76$.
(b) From the Standard Normal Probability Table, we read that the total area to the left
of $-0.15$ is equal to 0.4404; that is, $P\{ Z < -0.15 \} = P\{ Z > 0.15 \} = 0.4404$. In the normal
curve below, the area between $k$ and $-0.15$ is 0.4329, so the area to the left of $k$
must be $0.4404 - 0.4329 = 0.0075$. From the Standard Normal Probability Table, we
have $P\{ Z > 2.43 \} = 0.0075$, and hence, by symmetry, $k = -2.43$.
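Reading the table "backwards", as in Example 2, corresponds to evaluating the inverse of the standard normal distribution function. The sketch below uses `NormalDist.inv_cdf` from the standard library; the helper names are hypothetical, and the results differ slightly from table-based answers because the table is rounded to four decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def k_for_upper_tail(p):
    """Solve P{Z > k} = p for k."""
    return Z.inv_cdf(1.0 - p)

def k_for_interval(p, upper):
    """Solve P{k < Z < upper} = p for k; requires p < P{Z < upper}."""
    area_left_of_k = Z.cdf(upper) - p
    return Z.inv_cdf(area_left_of_k)

if __name__ == "__main__":
    print(round(k_for_upper_tail(0.2236), 2))       # part (a)
    print(round(k_for_interval(0.4329, -0.15), 2))  # part (b)
```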
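Formalizing the change of variable used earlier, we now prove the following very useful theorem.
Theorem 5. If $Y \sim N(\mu, \sigma^{2})$, then
\[
Z = \frac{Y - \mu}{\sigma} \sim N(0, 1) .
\]
Proof. Since $\sigma > 0$, the distribution function of $Z$ is
\[
F(z) = P\left\{ \frac{Y - \mu}{\sigma} \le z \right\} = P\{ Y \le \sigma z + \mu \}
= \int_{-\infty}^{\sigma z + \mu} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^{2} \right] dy .
\]
If we change the variable of integration by letting $u = (y - \mu)/\sigma$, then $du = dy/\sigma$ and
\[
F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} u^{2} \right) du .
\]
Accordingly the density function of $Z$ is given by
\[
\frac{d}{dz} F(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^{2} \right)
\quad \text{for } -\infty < z < \infty ,
\]
which is the standard normal density function. Thus,
\[
Z = \frac{Y - \mu}{\sigma} \sim N(0, 1) .
\]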
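Theorem 5 can also be illustrated empirically by simulating draws of $Y$, standardizing them, and inspecting the result. The following sketch is a simulation, not a proof: the sample size, the seed, and the function names are arbitrary choices made here, and the summaries only approximate the theoretical values 0, 1, and roughly 0.95.

```python
import random
import statistics

def standardized_sample(mu, sigma, n=100_000, seed=561):
    """Draw n values of Y ~ N(mu, sigma^2) and return Z = (Y - mu) / sigma."""
    rng = random.Random(seed)
    return [(rng.gauss(mu, sigma) - mu) / sigma for _ in range(n)]

def summarize(zs):
    """Sample mean, sample standard deviation, and the fraction of values
    with |z| <= 1.96 (about 0.95 for a standard normal)."""
    mean = statistics.fmean(zs)
    sd = statistics.pstdev(zs)
    within = sum(1 for z in zs if abs(z) <= 1.96) / len(zs)
    return mean, sd, within

if __name__ == "__main__":
    print(summarize(standardized_sample(250.0, 40.0)))
```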
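The theorem considerably simplifies calculation of probabilities concerning normally distributed random variables, as will be seen below.
Standardized Transformation. Suppose that $Y \sim N(\mu, \sigma^{2})$ and let $Z = (Y - \mu)/\sigma$.
Then, for any $y_{1} < y_{2}$,
\[
\begin{aligned}
P\{ y_{1} < Y < y_{2} \} &= P\{ Y \le y_{2} \} - P\{ Y \le y_{1} \} \\
&= P\left\{ \frac{Y - \mu}{\sigma} \le \frac{y_{2} - \mu}{\sigma} \right\} - P\left\{ \frac{Y - \mu}{\sigma} \le \frac{y_{1} - \mu}{\sigma} \right\} \\
&= P\left\{ Z \le \frac{y_{2} - \mu}{\sigma} \right\} - P\left\{ Z \le \frac{y_{1} - \mu}{\sigma} \right\} \\
&= \int_{-\infty}^{(y_{2} - \mu)/\sigma} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^{2}}{2} \right) dz
 - \int_{-\infty}^{(y_{1} - \mu)/\sigma} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^{2}}{2} \right) dz ,
\end{aligned}
\]
since $P\{ Y = y_{2} \} = 0$ and $Z = (Y - \mu)/\sigma \sim N(0, 1)$.
Example 3. Let $Y \sim N(250, 1600)$. Then $\mu = 250$ and $\sigma^{2} = 1600$ (or $\sigma = 40$). Thus,
it follows from the Standard Normal Probability Table that
\[
P\{ Y \le 208 \} = P\left\{ Z \le \frac{208 - 250}{40} \right\} = P\{ Z \le -1.05 \} = P\{ Z \ge 1.05 \} = 0.1469 .
\]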
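Example 4. Let $Y \sim N(40, 64)$. Then $\mu = 40$ and $\sigma^{2} = 64$ (or $\sigma = 8$). Thus, it follows
from the Standard Normal Probability Table that
\[
\begin{aligned}
P\{ 36.8 < Y < 50 \} &= P\left\{ \frac{36.8 - 40}{8} < Z < \frac{50 - 40}{8} \right\} = P\{ -0.4 < Z < 1.25 \} \\
&= 1 - P\{ Z > 1.25 \} - P\{ Z > 0.4 \} = 1 - 0.1056 - 0.3446 \\
&= 0.5498 .
\end{aligned}
\]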
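The standardized transformation used in Examples 3 and 4 translates directly into code: convert each endpoint to a $z$-value and evaluate the standard normal distribution function. The sketch below is one possible arrangement; the function names are hypothetical, and `sigma` is the standard deviation (40 and 8 in the examples), not the variance.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # Z ~ N(0, 1)

def standardize(y, mu, sigma):
    """z = (y - mu) / sigma, with sigma the standard deviation."""
    return (y - mu) / sigma

def prob_at_most(y, mu, sigma):
    """P{Y <= y} for Y ~ N(mu, sigma^2), via the standard normal cdf."""
    return STANDARD_NORMAL.cdf(standardize(y, mu, sigma))

def prob_between(y1, y2, mu, sigma):
    """P{y1 < Y < y2} = P{Z <= z2} - P{Z <= z1}."""
    return prob_at_most(y2, mu, sigma) - prob_at_most(y1, mu, sigma)

if __name__ == "__main__":
    print(round(prob_at_most(208, 250, 40), 4))      # Example 3
    print(round(prob_between(36.8, 50, 40, 8), 4))   # Example 4
```

The results should match the table-based answers up to the rounding of the table.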
Example 5. Suppose that $Y \sim N(40, 36)$. Find the value of $c$ such that
(a) $P\{ Y > c \} = 0.1401$;
(b) $P\{ Y \le c \} = 0.4483$.
Solution. In (a), we see that
\[
P\{ Y > c \} = P\left\{ Z > \frac{c - 40}{6} \right\} = 0.1401 .
\]
But from the Standard Normal Probability Table, we have
\[
P\{ Z > 1.08 \} = 0.1401 .
\]
Thus,
\[
\frac{c - 40}{6} = 1.08
\]
and hence $c = 40 + (6)(1.08) = 46.48$.
In (b), note that
\[
P\{ Y \le c \} = P\left\{ Z \le \frac{c - 40}{6} \right\} = 0.4483 .
\]
By the symmetry property of the normal curve, it follows that
\[
P\left\{ Z > -\frac{c - 40}{6} \right\} = 0.4483 .
\]
From the Standard Normal Probability Table, we read
\[
P\{ Z > 0.13 \} = 0.4483 .
\]
Thus,
\[
-\frac{c - 40}{6} = 0.13
\]
and hence $c = 40 + (6)(-0.13) = 39.22$.
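The two parts of Example 5 follow one pattern: find the $z$-value with the required tail area, then undo the standardization with $c = \mu + \sigma z$. A minimal sketch of that pattern, assuming the standard library's `NormalDist.inv_cdf` in place of the printed table (the function names are hypothetical):

```python
from statistics import NormalDist

def cutoff_for_upper_tail(p, mu, sigma):
    """Solve P{Y > c} = p for c, where Y ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(1.0 - p)  # z with P{Z > z} = p
    return mu + sigma * z

def cutoff_for_lower_tail(p, mu, sigma):
    """Solve P{Y <= c} = p for c, where Y ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(p)        # z with P{Z <= z} = p
    return mu + sigma * z

if __name__ == "__main__":
    print(round(cutoff_for_upper_tail(0.1401, 40, 6), 2))  # part (a)
    print(round(cutoff_for_lower_tail(0.4483, 40, 6), 2))  # part (b)
```

Small differences from the table-based answers can arise because the table rounds probabilities to four decimals and $z$ to two.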