THE NORMAL DISTRIBUTION
(Gaussian Distribution)
Marquis de Laplace (1749-1827) and Carl Friedrich Gauss
(1777-1855) were jointly credited with the discovery of the
normal distribution.
However, in 1924 Karl Pearson discovered, and published in his
journal Biometrika, that Abraham De Moivre (1667-1754) had
developed the formula for the normal distribution.
The Normally Distributed Variable
A variable is said to be a normally distributed variable, or to have a
normal distribution, if its distribution has the shape of a normal
curve.
The Normal Curve
Bell shaped
Centered at µ
Close to the horizontal axis outside the interval from µ - 3σ to µ + 3σ
Example of Three Different Normal Distributions
The Normal Probability Distribution
A form of continuous probability distribution. That is, it is the
probability distribution of a continuous random variable.
P(X = x) = 0 if X is a continuous random variable. That is, a
single point has zero probability; probabilities are areas over intervals.
The shape of all normal densities is the same – a symmetric
bell shape. The density curve has one peak, approaches the
horizontal axis but never touches it, and extends indefinitely
in either direction.
There are infinitely many possible normal random variables, one for
each combination of mean and standard deviation.
The mean of the distribution is a measure of its location
Mean = Mode = Median
The standard deviation of the distribution is a measure of the
spread, or variability, of the distribution.
Notation: X ~ N(µ, σ²)
The total area under the curve (AUC) is 1
Chapter 6: Normal Distribution
Class Notes to accompany: Introductory Statistics, 9th Ed, By Neil A. Weiss
Prepared by: Nina Kajiji
The Probability Density of a Normal Random Variable
To draw normal curves with parameters µ and σ we employ the
following equation:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$
Where:
e ≈ 2.718
π ≈ 3.1416
µ = mean of the random variable
σ = standard deviation of the random variable
Example: To draw a normal curve with parameters µ = 5 and σ = 2
we first determine the extreme tail values. That is,
µ - 3σ = 5 - (3)(2) = -1
µ + 3σ = 5 + (3)(2) = 11
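The density formula and the tail values can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name normal_pdf and the choice of 13 evaluation points are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal curve with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 5.0, 2.0                       # parameters from the example
lo, hi = mu - 3 * sigma, mu + 3 * sigma    # extreme tail values: -1 and 11

# Evaluate the curve at 13 evenly spaced points (an arbitrary choice).
xs = [lo + i * (hi - lo) / 12 for i in range(13)]
for x in xs:
    print(f"x = {x:5.1f}   f(x) = {normal_pdf(x, mu, sigma):.4f}")
```

The printed heights peak at x = µ and are nearly zero at the two tail values, which is why plotting between µ - 3σ and µ + 3σ captures almost the whole curve.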
The Standard Normal Distribution
Often called the z-curve. The horizontal axis is labeled z for the
z-statistic. Notation: Z ~ N(0, 1²)
Additional Properties
1. The curve is symmetric about zero.
2. Most of the area under the standard normal curve (SNC) lies between -3 and 3.
3.
Why Do We Need a Standard Normal?
The z-Table
The following table was computed using the Excel function
NORMDIST with a mean of zero and a standard deviation of 1. For
example, to compute the table entry for z = 0, type the following
function: =NORMDIST(($A2+B$1),0,1,TRUE)-0.5
These values are for the right half of the standard normal curve.
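The same kind of table can be produced outside Excel. The sketch below is an illustration in Python (3.8 or later), using statistics.NormalDist from the standard library in place of NORMDIST; the layout of 31 rows by 10 columns is an assumption about how such a table is usually arranged.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_area(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return Z.cdf(z) - 0.5

# Column headers are the second decimal of z; row labels are z to one decimal.
print(f"{'z':>4} " + "".join(f" {j / 100:6.2f}" for j in range(10)))
for i in range(31):                       # rows z = 0.0, 0.1, ..., 3.0
    row = i / 10
    cells = "".join(f" {table_area(row + j / 100):.4f}" for j in range(10))
    print(f"{row:4.1f} {cells}")
```

Subtracting 0.5 mirrors the Excel formula: the cumulative probability to the left of z minus the left half of the curve leaves the area between 0 and z.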
Finding Probabilities of the Standard Normal Distribution
A number in the body of the z-table gives the area under the
SNC between 0 and a specified value of z.
To find the area under the SNC between 0 and a negative
value of z, apply the symmetry property: it equals the table
value for the corresponding positive z.
To find the area under the SNC to the right of a positive z-value
or to the left of a negative z-value, subtract the table
value from 0.5000.
To find the area between two positive z-values, determine the
table values of both z-values and subtract the lower value from
the higher value.
To find the area to the left of a positive z-value, obtain the
table value and add it to 0.5000.
To find the area between a negative z-value and a positive z-value,
obtain the table values for both z-values and add them
together.
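These table rules translate directly into a few small functions. The following Python sketch is illustrative only; the function names are assumptions, and statistics.NormalDist (Python 3.8 or later) stands in for looking up the printed table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def table_area(z):
    """Table value: area between 0 and z, using symmetry for negative z."""
    return Z.cdf(abs(z)) - 0.5

def tail_area(z):
    """Area to the right of a positive z, or to the left of a negative z."""
    return 0.5 - table_area(z)

def left_of_positive(z):
    """Area to the left of a positive z."""
    return 0.5 + table_area(z)

def area_between(z_low, z_high):
    """Area between two z-values."""
    if z_low < 0 < z_high:
        # Opposite sides of zero: add the two table values.
        return table_area(z_low) + table_area(z_high)
    # Same side of zero: subtract the smaller table value from the larger.
    return abs(table_area(z_high) - table_area(z_low))

print(round(tail_area(1.0), 4))
print(round(area_between(-1.0, 1.0), 4))
print(round(area_between(1.07, 2.50), 4))
```

Each function corresponds to one rule, so a result can be traced back to the table arithmetic it replaces.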
Some Important Areas Under SNC
Between z = -1 and z = 1 ==> 2*0.3413 = 68.26%
Between z = -2 and z = 2 ==> 2*0.4772 = 95.44%
Between z = -3 and z = 3 ==> 2*0.4987 = 99.74%
Finding Values of Z Given a Probability
To The Right
To find the z-value given the area to the right of a positive z-value.
That is, given that the area in the right tail is 0.025, find z.
Notation:
P(Z > z) = 0.025
Area to look up = 0.500 - 0.025 = 0.475 (for table above)
Area to look up = 1.0 - 0.025 = 0.975 (for table in the book)
Searching the body of the z-table above for 0.475 gives a
z-value of 1.96.
Searching the body of the z-table in the book for 0.975 also gives
1.96.
This is denoted as z_α, where α in this case is 0.025.
z_0.025 = 1.96
To The Left
Given: Find z if the area to the left of z is 0.025.
Notation: P(Z < z) = 0.025
Solution: Employ the property of symmetry and multiply by
negative 1.0. That is, z = -1.96.
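The reverse lookup can also be done with an inverse cumulative function instead of searching the body of a table. This is a minimal Python sketch (3.8 or later); the helper name z_alpha is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def z_alpha(alpha):
    """z-value that has area alpha to its right under the standard normal curve."""
    return Z.inv_cdf(1.0 - alpha)

right = z_alpha(0.025)       # area 0.025 in the right tail
left = Z.inv_cdf(0.025)      # area 0.025 in the left tail
print(round(right, 2), round(left, 2))   # prints: 1.96 -1.96
```

The two results differ only in sign, which is the symmetry property used in the solution for the left tail.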
Normally Distributed Random Variables
Definition
A random variable is said to be normally distributed if
probabilities for the random variable are equal to areas under a
normal curve.
If such a random variable has mean µ_x and standard deviation σ_x,
then the normal curve that is used is the one with parameters
µ_x and σ_x.
Empirical Rule
For any normally distributed random variable x:
The probability is 0.6826 that x will be within one standard
deviation to either side of its mean.
P(µ_x - σ_x < x < µ_x + σ_x) = 0.6826
The probability is 0.9544 that x will be within two standard
deviations to either side of its mean.
P(µ_x - 2σ_x < x < µ_x + 2σ_x) = 0.9544
The probability is 0.9974 that x will be within three standard
deviations to either side of its mean.
P(µ_x - 3σ_x < x < µ_x + 3σ_x) = 0.9974
In general, the probability is 1 - α that x will be within z_(α/2)
standard deviations to either side of its mean.
P(µ_x - z_(α/2) σ_x < x < µ_x + z_(α/2) σ_x) = 1 - α
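The empirical rule and its general form can be verified numerically. The sketch below is an illustration in Python (3.8 or later); the mean of 50 and standard deviation of 10 are arbitrary illustrative values, since the rule holds for any normal variable. Unrounded values (0.6827, 0.9545, 0.9973) differ from the table-based figures in the fourth decimal because the table entries are themselves rounded.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X with the given parameters."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# One, two, and three standard deviations (mu and sigma are illustrative).
for k in (1, 2, 3):
    print(k, round(prob_within(k, mu=50, sigma=10), 4))

# General form: within z_(alpha/2) standard deviations with probability 1 - alpha.
alpha = 0.05
z_half = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
print(round(z_half, 2), round(prob_within(z_half), 4))
```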
Finding The Probability For Given Values of X
Goal: To find the area to the right or left of a given value of x
for a normal curve with parameters µ and σ.
Solution: Convert x to its standardized value, that is, its z-score
or z-value, z = (x - µ)/σ. Then use the steps for finding the
probability for a given z-value as shown above.
Example 1: Let us consider a random variable X ~ N(50, 10²).
Find the probability of X greater than 60. That is, P(X > 60) = ?
Solution 1:
Using the table above
P(X > 60) = P((X - µ)/σ > (60 - µ)/σ)
= P(Z > (60 - µ)/σ)
= P(Z > (60 - 50)/10)
= P(Z > 1)
= 0.5000 - 0.3413
= 0.1587
Using the Book
P(X > 60) = P((X - µ)/σ > (60 - µ)/σ)
= P(Z > (60 - µ)/σ)
= P(Z > (60 - 50)/10)
= P(Z > 1)
= 1 - 0.8413
= 0.1587
Example 2: Given X ~ N(175, 14²). Find the probability that X
is between 190 and 210.
Solution 2:
Using the table above
P(190 < X < 210) = P((190 - µ)/σ < (X - µ)/σ < (210 - µ)/σ)
= P((190 - 175)/14 < Z < (210 - 175)/14)
= P(1.07 < Z < 2.50)
= 0.4938 - 0.3577
= 0.1361
Using the book
P(190 < X < 210) = P((190 - µ)/σ < (X - µ)/σ < (210 - µ)/σ)
= P((190 - 175)/14 < Z < (210 - 175)/14)
= P(1.07 < Z < 2.50)
= 0.9938 - 0.8577
= 0.1361
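Both worked examples follow the same pattern: standardize, then read an area from the standard normal curve. A minimal Python sketch (3.8 or later) is given below; the function names are illustrative assumptions. Because the code does not round z to two decimals, the second result comes out near 0.1358 rather than the table-based 0.1361.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def prob_greater(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), via the z-score."""
    z = (x - mu) / sigma
    return 1.0 - Z.cdf(z)

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), via the two z-scores."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return Z.cdf(z_b) - Z.cdf(z_a)

p1 = prob_greater(60, 50, 10)            # Example 1
p2 = prob_between(190, 210, 175, 14)     # Example 2
print(round(p1, 4), round(p2, 4))
```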
Finding The X-value for Given Probability
De-Standardization or Inverse Transformation
The process of converting z-scores to their x-values. That is,
x = µ + zσ.
Example 1: Given X ~ N(100, 16²). Find x such that P(X < x) =
0.04
Solution 1: (using table above or book)
The z-value from the table for area = 0.04 to the left is -1.75.
Therefore, x = 100 + (-1.75)(16) = 72
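The inverse transformation can be sketched in Python (3.8 or later) as follows. The helper name x_for_left_area is an illustrative assumption; inv_cdf plays the role of searching the table for the area.

```python
from statistics import NormalDist

def x_for_left_area(area, mu, sigma):
    """Return (x, z) such that P(X < x) = area for X ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(area)   # z-value with the given area to its left
    return mu + z * sigma, z         # de-standardize: x = mu + z*sigma

x, z = x_for_left_area(0.04, 100, 16)
print(round(z, 2), round(x))   # prints: -1.75 72
```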
Normal Distribution & Value At Risk
(Optional -- SKIP)
The following data are taken from Investment Analysis and
Portfolio Management, Fifth Edition, Chapter 8, p. 268, by
Frank Reilly and Keith Brown. The problem is modified to
present the VaR concept.
Example:
A three-asset class portfolio, as regularly contained in the Wall
Street Journal, is presented below. Find the value at risk (VaR)
at a 5% level for one week. The portfolio variance is 0.017 and
the total dollar investment is $20 million.
Asset Classes     E(Ri)    E(σi)    Wi
Stocks (S)        0.12     0.20     0.6
Bonds (B)         0.08     0.10     0.3
Near Cash (C)     0.04     0.03     0.1
Digression: “VaR is a dollar measure of the minimum loss that
would be expected over a period of time with a given
probability.” Don Chance.
Solution:
Given:
σp = Sqrt(0.017) = 0.1304
α = 0.05
Zα = 1.65
Find:
X or Portfolio Minimum Return = ?
The expected return (weighted mean) of the portfolio is:
E(Rp) = (0.6 * 0.12) + (0.3 * 0.08) + (0.1 * 0.04)
= 0.1
= 10%
Recall: Weighted Mean formula
x̄w = Σ wi xi / Σ wi
The next step is to compute the weekly portfolio return and
standard deviation.
Weekly Return: 0.1 / 52 = 0.0019
Weekly SD: 0.1304 / Sqrt(52) = 0.0181
Then under the normal distribution the return that is 1.65 standard
deviations below the expected return is:
X = µ - zσ
= 0.0019 - (1.65 * 0.0181)
= -0.02797
≈ -0.028, that is, a loss of about 2.8 percent
The portfolio would be expected to lose at least 2.8
percent of its value 5 percent of the time. Since VaR is always
expressed in dollars we have:
VaR = $20,000,000 * 0.028
= $560,000
In other words, the portfolio would be expected to lose at least
$560,000 in one week 5% of the time, that is, about once every
20 weeks. This is important for portfolio managers who
wish to diversify away their risk.
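The whole VaR calculation can be retraced in a few lines. The Python sketch below is illustrative only and follows the approach of the notes (annual figures scaled to one week by 52 and by Sqrt(52), and z = 1.65); the variable names are assumptions. Because it carries full precision instead of rounding at each step, it gives roughly $558,000 rather than $560,000.

```python
import math

weights = [0.6, 0.3, 0.1]            # Stocks, Bonds, Near Cash
exp_returns = [0.12, 0.08, 0.04]     # E(Ri) for each asset class
portfolio_variance = 0.017
investment = 20_000_000              # dollars
z = 1.65                             # z-value for a 5% left tail, as in the notes
weeks = 52

# Weighted mean of the expected returns.
annual_return = sum(w * r for w, r in zip(weights, exp_returns)) / sum(weights)
annual_sd = math.sqrt(portfolio_variance)

# Scale the annual figures to one week.
weekly_return = annual_return / weeks
weekly_sd = annual_sd / math.sqrt(weeks)

# Return that lies z standard deviations below the expected weekly return.
min_return = weekly_return - z * weekly_sd
var_dollars = -min_return * investment

print(f"weekly return cutoff: {min_return:.4f}")
print(f"VaR: ${var_dollars:,.0f}")
```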
NORMAL APPROXIMATION TO THE BINOMIAL
DISTRIBUTION
Steps
1. Determine n, the number of trials, and p, the success
probability.
2. Check that both np and n(1-p) are at least 5. If they are not,
DO NOT use the normal approximation.
3. Find the mean and standard deviation using the binomial
formulas:
µ_x = np; and σ_x = sqrt(np(1-p))
4. Make the corrections for continuity. That is, subtract 0.5
from the smaller integer and add 0.5 to the larger integer.
This ensures that the binomial probabilities for the
specified integers are included.
5. Find the area under the normal curve with parameters
µ_x and σ_x.
Example:
Given n = 60 and 0.60 as the probability that a player does not
complete (p = 0.60), find P(x = 48).
Solution:
µ_x = np = (60)(0.60) = 36
and,
σ_x = sqrt(36(0.4)) = 3.8
Then, P(x = 48) = ?
Using table above
= P((47.5 - 36)/3.8 < z < (48.5 - 36)/3.8)
= P(3.03 < z < 3.29)
= 0.4995 - 0.4988 (from Table above)
= 0.0007
Using the book
= P((47.5 - 36)/3.8 < z < (48.5 - 36)/3.8)
= P(3.03 < z < 3.29)
= 0.9995 - 0.9988
= 0.0007
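The steps of the approximation can be sketched in Python (3.8 or later) as below. This is an illustration, not part of the original notes; the function name and the choice to raise an error when the np and n(1-p) check fails are assumptions. The exact binomial probability is computed alongside for comparison.

```python
import math
from statistics import NormalDist

def normal_approx_binomial(n, p, low, high):
    """Approximate P(low <= X <= high) for a binomial X with n trials and success probability p."""
    # Check that the approximation is appropriate.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("np and n(1-p) must both be at least 5")
    # Binomial mean and standard deviation.
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Continuity correction: widen the interval by 0.5 on each side.
    dist = NormalDist(mu, sigma)
    return dist.cdf(high + 0.5) - dist.cdf(low - 0.5)

approx = normal_approx_binomial(60, 0.60, 48, 48)
exact = math.comb(60, 48) * 0.60 ** 48 * 0.40 ** 12
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Both values are well below 0.001, since 48 lies more than three standard deviations above the mean of 36.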
NORMAL PROBABILITY PLOTS
To assess the normality of a population, we construct a normal
probability plot for the sample data.
If the plot is roughly linear, then we accept as reasonable that the
population is approximately normally distributed.
If the plot shows systematic deviations from linearity, then we
conclude that the population is probably not approximately
normally distributed.
To construct the plot
1. Order the data and compute the z-score (normal score) expected
for each observation's rank.
2. Plot the z-scores on the y-axis and the x-values on the x-axis.
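A rough sketch of the construction in Python (3.8 or later) follows. The z-scores used here are approximate normal scores computed with Blom's plotting positions, (i - 0.375)/(n + 0.25); that formula is an assumption of this sketch, not something stated in the notes, and the sample data are hypothetical.

```python
from statistics import NormalDist

def normal_scores(n):
    """Approximate expected z-scores for the ordered observations 1..n.

    Uses Blom's plotting positions, an assumption of this sketch.
    """
    z = NormalDist()
    return [z.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

# Hypothetical sample, sorted so each value pairs with the score for its rank.
data = sorted([4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.3, 4.6])
scores = normal_scores(len(data))

# Each pair is one point of the plot: x-value on the x-axis, z-score on the y-axis.
for x, s in zip(data, scores):
    print(f"x = {x:4.1f}   z-score = {s:6.2f}")
```

Plotting these pairs with any charting tool gives the normal probability plot; a roughly straight pattern supports approximate normality.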
Interpretation
NORMALLY DISTRIBUTED POPULATIONS
Definition
A population is said to be normally distributed if percentages for
the population are equal to the areas under a normal curve. If
such a population has mean µ and standard deviation σ, then the
normal curve that is used is the one with parameters µ and σ.
Basically, if the relative frequency of values between any two values of x is
approximately equal to the AUC between those two values of x, we
conclude that the population is normally distributed.
Conversely, given that the population is normally distributed
with mean µ and standard deviation σ, we can compute the
percentage of the population between any two values of x.
Empirical Rule
For any normally distributed population:
1. About 68.26% of the population values lie within one standard
deviation to either side of the mean.
2. About 95.44% of the population values lie within two
standard deviations to either side of the mean.
3. About 99.74% of the population values lie within three
standard deviations to either side of the mean.
Quartiles & Percentiles
To find the quartiles:
Find the z-value for the appropriate area under the curve. That
is, for the quartiles Q1 and Q3, we need the z-value that
has an area of 0.25 to its left or to its right, respectively.
De-standardize the z-scores to obtain the x-value. This is the
value of Q1 or Q3.
To find percentiles:
Find the z-value for the appropriate area under the curve. For
example, for the 90th percentile the area under the
curve to the right of the z-value is 1 - 0.90 = 0.10.
De-standardize the z-score to obtain the x-value.
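Quartiles and percentiles combine the two steps already used: find z for a given area, then de-standardize. The Python sketch below (3.8 or later) is illustrative; the function name is an assumption, and the parameters µ = 100 and σ = 16 are borrowed from the earlier de-standardization example purely for demonstration.

```python
from statistics import NormalDist

def normal_percentile(pct, mu, sigma):
    """x-value with pct percent of a N(mu, sigma^2) population below it."""
    z = NormalDist().inv_cdf(pct / 100.0)   # z-value with area pct/100 to its left
    return mu + z * sigma                   # de-standardize: x = mu + z*sigma

mu, sigma = 100, 16                         # illustrative parameters
q1 = normal_percentile(25, mu, sigma)       # area 0.25 to the left
q3 = normal_percentile(75, mu, sigma)       # area 0.25 to the right
p90 = normal_percentile(90, mu, sigma)      # area 0.10 to the right
print(round(q1, 2), round(q3, 2), round(p90, 2))
```

Q1 and Q3 sit the same distance on either side of the mean, as the symmetry of the curve requires.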