Chapter 6: Normal Distribution
Class Notes to accompany: Introductory Statistics, 9th Ed, by Neil A. Weiss
Prepared by: Nina Kajiji

THE NORMAL DISTRIBUTION (Gaussian Distribution)

Marquis de Laplace (1749-1827) and Carl Friedrich Gauss (1777-1855) were jointly credited with the discovery of the normal distribution. However, in 1924 Karl Pearson discovered, and published in his journal Biometrika, that Abraham De Moivre (1667-1754) had already developed the formula for the normal distribution.

The Normally Distributed Variable

A variable is said to be normally distributed, or to have a normal distribution, if its distribution has the shape of a normal curve.

The Normal Curve

Bell shaped.
Centered at µ.
Close to the horizontal axis outside the range µ - 3σ to µ + 3σ.

[Figure: Example of three different normal distributions]

The Normal Probability Distribution

It is a form of continuous probability distribution. That is, it is the probability distribution of a continuous random variable.
P(X = x) = 0 if X is a continuous random variable. That is, a single point has no probability of its own; only intervals do.
The shape of all normal densities is the same: a symmetric bell shape.
The density curve has one peak, approaches the horizontal axis but never touches it, and extends indefinitely in either direction.
There are infinitely many possible normal random variables, depending on the values of the mean and standard deviation.
The mean of the distribution is a measure of its location. Mean = Mode = Median.
The standard deviation of the distribution is a measure of the spread, or variability, of the distribution.
Notation: X ~ N(µ, σ²)
The total area under the curve (AUC) is 1.

The Probability Density of a Normal Random Variable

To draw normal curves with parameters µ and σ we employ the following equation:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$

Where:
e ≈ 2.718
π ≈ 3.1416
µ = mean of the random variable
σ = standard deviation of the random variable

Example: To draw a normal curve with parameters µ = 5 and σ = 2 we first determine the extreme tail values.
That is,
µ - 3σ = 5 - (3)(2) = -1
µ + 3σ = 5 + (3)(2) = 11

The Standard Normal Distribution

Often called the z-curve. The horizontal axis is labeled z for the z-statistic.
Notation: Z ~ N(0, 1²)

Additional Properties
1. The curve is symmetric about zero.
2. Most of the area under the standard normal curve (SNC) lies between -3 and 3.

Why we Need a Standard Normal?

The z-Table

The z-table used in these notes was computed using the Excel function NORMDIST with a mean of zero and a standard deviation of 1. For example, to compute the table value for z = 0, type the following function:

=NORMDIST(($A2+B$1),0,1,TRUE)-0.5

These values are for the right half of the standard normal curve.

Finding Probabilities of the Standard Normal Distribution

A number in the body of the z-table gives the area under the SNC between 0 and a specified value of z.
To find the area under the SNC between 0 and a negative value of z, apply the symmetry property: it equals the area between 0 and the corresponding positive value of z.
To find the area under the SNC to the right of a positive z-value or to the left of a negative z-value, subtract the table value from 0.5000.
To find the area between two positive z-values, determine the table values of both z-values and subtract the smaller from the larger.
To find the area to the left of a positive z-value, obtain the table value and add it to 0.5000.
To find the area between a negative z-value and a positive z-value, obtain the table values for both z-values and add them together.
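The table look-up rules above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names are made up for this sketch, and the table value is computed from the error function in the standard library rather than read from a printed table, so results can differ from the table in the last decimal place.

```python
import math


def phi(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def table_value(z: float) -> float:
    """Area between 0 and z; by symmetry the sign of z does not matter."""
    return phi(abs(z)) - 0.5


def area_right_of(z: float) -> float:
    """Right-tail area: 0.5 minus the table value if z >= 0, else 0.5 plus it."""
    return 0.5 - table_value(z) if z >= 0 else 0.5 + table_value(z)


def area_left_of(z: float) -> float:
    """Left-tail area is the complement of the right-tail area."""
    return 1.0 - area_right_of(z)


def area_between(z_low: float, z_high: float) -> float:
    """Area between two z-values of any sign."""
    return area_left_of(z_high) - area_left_of(z_low)


print(f"table value for z = 1.00: {table_value(1.0):.4f}")
print(f"area right of z = 1.00:   {area_right_of(1.0):.4f}")
print(f"area left of z = 1.00:    {area_left_of(1.0):.4f}")
print(f"area between -1 and 1:    {area_between(-1.0, 1.0):.4f}")
```

Working from the left-tail area keeps `area_between` valid for any pair of z-values, so the separate cases in the rules (two positive values, one negative and one positive) do not need their own code paths.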
Some Important Areas Under SNC

Area between -z and z:
z = 1 ==> 2*0.3413 = 68.26%
z = 2 ==> 2*0.4772 = 95.44%
z = 3 ==> 2*0.4987 = 99.74%

Finding Values of Z Given a Probability

To The Right

To find the z-value given the area to the right of a positive z-value. That is, given that the area in the right tail is 0.025, find z.
Notation: P(Z > z) = 0.025
Area to look up = 0.500 - 0.025 = 0.475 (for the table above)
Area to look up = 1.0 - 0.025 = 0.975 (for the table in the book)
Searching the body of the z-table above for 0.475 gives a z-value of 1.96. Searching the body of the z-table in the book for 0.975 also gives 1.96.
This is denoted zα, where α in this case is 0.025: z0.025 = 1.96

To The Left

Given: Find z if the area to the left of z is 0.025.
Notation: P(Z < z) = 0.025
Solution: Employ the property of symmetry and multiply the right-tail value by negative 1.0. That is, z = -1.96.

Normally Distributed Random Variables

Definition
A random variable is said to be normally distributed if probabilities for the random variable are equal to areas under a normal curve. If such a random variable has mean µx and standard deviation σx, then the normal curve that is used is the one with parameters µx and σx.

Empirical Rule
For any normally distributed random variable x:
The probability is 0.6826 that x will be within one standard deviation to either side of its mean.
P(µx - σx < x < µx + σx) = 0.6826
The probability is 0.9544 that x will be within two standard deviations to either side of its mean.
P(µx - 2σx < x < µx + 2σx) = 0.9544
The probability is 0.9974 that x will be within three standard deviations to either side of its mean.
P(µx - 3σx < x < µx + 3σx) = 0.9974
In general, the probability is 1 - α that x will be within zα/2 standard deviations to either side of its mean.
P(µx - zα/2 σx < x < µx + zα/2 σx) = 1 - α

Finding The Probability For Given Values of X

Goal: To find the area to the right or left of a given value of x for a normal curve with parameters µ and σ.
Solution: Convert x to its standardized value, that is, its z-score or z-value, z = (x - µ)/σ. Then use the steps for finding the probability for a given z-value as shown above.

Example 1: Let us consider a random variable X ~ N(50, 10²). Find the probability of X greater than 60. That is, P(X > 60) = ?

Solution 1:
Using the table above
P(X > 60) = P((X - µ)/σ > (60 - µ)/σ)
= P(Z > (60 - 50)/10)
= P(Z > 1)
= 0.5000 - 0.3413
= 0.1587
Using the book
P(X > 60) = P((X - µ)/σ > (60 - µ)/σ)
= P(Z > (60 - 50)/10)
= P(Z > 1)
= 1 - 0.8413
= 0.1587

Example 2: Given X ~ N(175, 14²). Find the probability that x is between 190 and 210.

Solution 2:
Using the table above
P(190 < X < 210) = P((190 - µ)/σ < (X - µ)/σ < (210 - µ)/σ)
= P((190 - 175)/14 < Z < (210 - 175)/14)
= P(1.07 < Z < 2.50)
= 0.4938 - 0.3577
= 0.1361
Using the book
P(190 < X < 210) = P((190 - µ)/σ < (X - µ)/σ < (210 - µ)/σ)
= P((190 - 175)/14 < Z < (210 - 175)/14)
= P(1.07 < Z < 2.50)
= 0.9938 - 0.8577
= 0.1361

Finding The X-value for Given Probability

De-Standardization or Inverse Transformation
The process of converting z-scores to their x-values. That is, x = µ + zσ.

Example 1: Given X ~ N(100, 16²). Find x such that P(X < x) = 0.04.
Solution 1: (using the table above or the book) The z-value from the table for an area of 0.04 to the left is -1.75. Therefore,
x = 100 + (-1.75)(16) = 72

Normal Distribution & Value At Risk (Optional -- SKIP)

The following data is presented from Investment Analysis and Portfolio Management, Fifth Edition, Chapter 8, p. 268, by Frank Reilly and Keith Brown. The problem is modified to present the VaR concept.

Example: A three-asset class portfolio, as regularly contained in the Wall Street Journal, is presented below. Find the value at risk (VaR) at a 5% level for one week. The portfolio variance is 0.017 and the total dollar investment is $20 million.

Asset Class       E(Ri)   E(σi)   Wi
Stocks (S)        0.12    0.20    0.6
Bonds (B)         0.08    0.10    0.3
Near Cash (C)     0.04    0.03    0.1

Digression: “VaR is a dollar measure of the minimum loss that would be expected over a period of time with a given probability.” Don Chance.

Solution:
Given:
σp = Sqrt(0.017) = 0.1304
α = 0.05
zα = 1.65
Find: X, the portfolio minimum return.

The expected return (weighted mean) of the portfolio is:
E(Rp) = (0.6 * 0.12) + (0.3 * 0.08) + (0.1 * 0.04) = 0.1 = 10%

Recall the weighted mean formula:

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$$

The next step is to compute the weekly portfolio return and standard deviation.
Weekly Return: 0.1 / 52 = 0.0019
Weekly SD: 0.1304 / Sqrt(52) = 0.0181

Then under the normal distribution the return that is 1.65 standard deviations below the expected return is:
X = µ - zσ = 0.0019 - (1.65 * 0.0181) = -0.02797, or approximately -0.028

In 5 percent of weeks the portfolio would be expected to lose at least 2.8 percent. Since VaR is always expressed in dollars we have:
VaR = $20,000,000 * 0.028 = $560,000
In other words, the portfolio would expect to lose at least $560,000 in one week 5% of the time. That is, on average once every 20 weeks.
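As a rough check on the arithmetic in the VaR example, the steps can be written out in Python. This is a sketch under the same normal-distribution assumption; the function name and its parameters are illustrative and not taken from the notes or from Reilly and Brown. Because it carries unrounded intermediate values, its result differs slightly from the rounded hand calculation.

```python
import math


def weekly_value_at_risk(weights, expected_returns, annual_variance,
                         investment, z_alpha, periods_per_year=52):
    """Return (weekly cutoff return, dollar VaR) assuming normal returns."""
    # Weighted mean of the asset-class expected returns.
    annual_mean = (sum(w * r for w, r in zip(weights, expected_returns))
                   / sum(weights))
    annual_sd = math.sqrt(annual_variance)
    # Scale to one week: the mean divides by 52, the SD by sqrt(52).
    weekly_mean = annual_mean / periods_per_year
    weekly_sd = annual_sd / math.sqrt(periods_per_year)
    # Return that lies z_alpha standard deviations below the weekly mean.
    cutoff_return = weekly_mean - z_alpha * weekly_sd
    dollar_var = -cutoff_return * investment
    return cutoff_return, dollar_var


cutoff, var = weekly_value_at_risk(
    weights=[0.6, 0.3, 0.1],
    expected_returns=[0.12, 0.08, 0.04],
    annual_variance=0.017,
    investment=20_000_000,
    z_alpha=1.65,
)
print(f"weekly cutoff return: {cutoff:.4f}")
print(f"value at risk: ${var:,.0f}")
```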
This measure is important for portfolio managers who wish to diversify away their risk.

NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

Steps
1. Determine n, the number of trials, and p, the success probability.
2. Check that both np and n(1-p) are at least 5. If they are not, DO NOT use the normal approximation.
3. Find the mean and standard deviation using the binomial formulas: µx = np and σx = sqrt(np(1-p)).
4. Make the corrections for continuity. That is, subtract 0.5 from the smaller integer and add 0.5 to the larger integer. This ensures that the binomial probabilities for the specified integers are included in the area.
5. Find the area under the normal curve with parameters µx and σx.

Example: Given n = 60 and 0.60 as the probability that a player does not complete (p = 0.60), find P(x = 48).

Solution:
µx = np = (60)(0.60) = 36
σx = sqrt(36(0.4)) = 3.8
Then, P(x = 48) = ?
Using the table above
= P((47.5 - 36)/3.8 < z < (48.5 - 36)/3.8)
= P(3.03 < z < 3.29)
= 0.4995 - 0.4988
= 0.0007
Using the book
= P((47.5 - 36)/3.8 < z < (48.5 - 36)/3.8)
= P(3.03 < z < 3.29)
= 0.9995 - 0.9988
= 0.0007

NORMAL PROBABILITY PLOTS

To assess the normality of a population, we construct a normal probability plot for the sample data. If the plot is roughly linear, then we accept as reasonable that the population is approximately normally distributed. If the plot shows systematic deviations from linearity, then we conclude that the population is probably not approximately normally distributed.

To construct the plot
1. Compute the z-score.
2. Plot the z-score on the y-axis and the x-values on the x-axis.

Interpretation

NORMALLY DISTRIBUTED POPULATIONS

Definition
A population is said to be normally distributed if percentages for the population are equal to the areas under a normal curve. If such a population has mean µ and standard deviation σ, then the normal curve that is used is the one with parameters µ and σ.

Basically, if the relative frequency between two values of x is approximately equal to the AUC between those two values of x, we conclude that the population is normally distributed. Conversely, if we are given that the population is normally distributed with mean µ and standard deviation σ, we can compute the percentage of the population between any two values of x.

Empirical Rule
For any normally distributed population:
1. About 68.26% of the population values lie within one standard deviation to either side of the mean.
2. About 95.44% of the population values lie within two standard deviations to either side of the mean.
3. About 99.74% of the population values lie within three standard deviations to either side of the mean.

Quartiles & Percentiles

To find the quartiles:
Find the z-value for the appropriate area under the curve. That is, for the quartiles Q1 or Q3 we need to find the z-value that has an area of 0.25 to its left or right, respectively.
De-standardize the z-scores to obtain the x-value. This is the value of Q1 or Q3.

To find percentiles:
Find the z-value for the appropriate area under the curve. For example, for the 90th percentile the area to the right of the z-value is 1 - 0.90 = 0.10 (equivalently, the area to its left is 0.90).
De-standardize the z-scores to obtain the x-value.
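The two-step recipe for quartiles and percentiles (find z, then de-standardize with x = µ + zσ) can be sketched with Python's standard library. The helper name `normal_percentile` is an assumption made for this illustration, and `statistics.NormalDist` stands in for the reverse table look-up, so the z-values are not rounded to two decimals as they would be from a printed table. The population N(100, 16²) is reused from the de-standardization example in these notes.

```python
from statistics import NormalDist


def normal_percentile(mu: float, sigma: float, area_left: float) -> float:
    """x-value of N(mu, sigma^2) that has the given area to its left."""
    # Step 1: z-value with the required area to its left.
    z = NormalDist().inv_cdf(area_left)
    # Step 2: de-standardize, x = mu + z * sigma.
    return mu + z * sigma


mu, sigma = 100, 16
q1 = normal_percentile(mu, sigma, 0.25)   # area 0.25 to the left
q3 = normal_percentile(mu, sigma, 0.75)   # area 0.25 to the right
p90 = normal_percentile(mu, sigma, 0.90)  # area 0.10 to the right

print(f"Q1  = {q1:.2f}")
print(f"Q3  = {q3:.2f}")
print(f"P90 = {p90:.2f}")
```

Expressing every case as an area to the left avoids keeping separate rules for left and right tails: an area of 0.25 to the right is simply 0.75 to the left.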
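Returning to the normal approximation to the binomial distribution covered in these notes, the following Python sketch applies the listed steps to the n = 60, p = 0.60 example and compares the result with the exact binomial probability. The function names are invented for this illustration. The normal areas come from `statistics.NormalDist` with an unrounded standard deviation, so the figures will not match a table look-up digit for digit.

```python
import math
from statistics import NormalDist


def binomial_pmf(n: int, p: float, k: int) -> float:
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_approx_point(n: int, p: float, k: int) -> float:
    """Normal approximation to P(X = k) with the continuity correction."""
    # Step 2: both np and n(1-p) must be at least 5.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("np and n(1-p) must both be at least 5")
    # Step 3: binomial mean and standard deviation.
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Steps 4 and 5: area between k - 0.5 and k + 0.5 under N(mu, sigma^2).
    dist = NormalDist(mu, sigma)
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)


approx = normal_approx_point(60, 0.60, 48)
exact = binomial_pmf(60, 0.60, 48)
print(f"normal approximation: {approx:.5f}")
print(f"exact binomial:       {exact:.5f}")
```

Both values are small because 48 lies more than three standard deviations above the mean of 36, far out in the right tail.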