MTH/STA 561

NORMAL PROBABILITY DISTRIBUTION

The most important continuous probability distribution in the entire field of statistics is the normal distribution. The mathematical equation of the normal distribution was first developed by Abraham De Moivre in 1733. However, it is often referred to as the Gaussian distribution in honor of Karl Friedrich Gauss (1777-1855). The normal distribution is important both because it seems to provide an adequate model for various observed measurements and, as we will see in the subsequent chapters, because it provides an accurate approximation to a wide variety of probability distributions.

The graph of the normal distribution is a bell-shaped curve and is called the normal curve or Gaussian curve. A continuous random variable $Y$ that has the bell-shaped normal distribution is called a normal random variable. The probability distribution of the normal random variable is a function of the form

\[
f(y; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] \quad \text{for } -\infty < y < \infty.
\]

It depends upon the two parameters $\mu$ and $\sigma$, which will be shown later to be the mean and standard deviation, respectively, of the normal random variable. As demonstrated in the following theorem, this bell-shaped function is a legitimate probability density function.

Theorem 1. For any real numbers $\mu$ and $\sigma$ with $-\infty < \mu < \infty$ and $\sigma > 0$,

\[
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy = 1.
\]

Proof. Consider the integral

\[
I = \int_{-\infty}^{\infty} \exp\left(-\frac{x^2}{2}\right) dx.
\]

This integral exists because the integrand is a positive continuous function which is bounded by an integrable function; that is,

\[
0 < \exp\left(-\frac{x^2}{2}\right) < \exp\left(-|x| + 1\right) \quad \text{for } -\infty < x < \infty,
\]

and

\[
\int_{-\infty}^{\infty} \exp\left(-|x| + 1\right) dx = 2\int_{0}^{\infty} \exp\left(-x + 1\right) dx = 2 \lim_{b \to \infty} \left[-\exp\left(-x + 1\right)\right]_{0}^{b} = 2e.
\]

To evaluate the integral $I$, we note that $I > 0$ and that $I^2$ may be written

\[
I^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left(-\frac{x^2 + w^2}{2}\right) dx\, dw.
\]

This double integral can be evaluated by changing to polar coordinates.
If we let $x = \rho\cos\theta$ and $w = \rho\sin\theta$, then we have $dx\,dw = \rho\, d\rho\, d\theta$, $x^2 + w^2 = \rho^2$, and

\[
I^2 = \int_{0}^{2\pi} \int_{0}^{\infty} \rho \exp\left(-\frac{\rho^2}{2}\right) d\rho\, d\theta
= \int_{0}^{2\pi} \lim_{t \to \infty} \left[-\exp\left(-\frac{\rho^2}{2}\right)\right]_{\rho=0}^{\rho=t} d\theta
= \int_{0}^{2\pi} d\theta = 2\pi.
\]

Accordingly, $I = \sqrt{2\pi}$ and thus

\[
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx = 1.
\]

Now if we let $x = (y - \mu)/\sigma$, where $\sigma > 0$, then $dx = dy/\sigma$ and the preceding integral becomes

\[
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy = 1.
\]

Since $\sigma > 0$, the integrand of the preceding integral is positive and integrates to one, so it satisfies the requirements of a probability density function of a continuous random variable. As a result, we now formally define the normal distribution as follows.

Definition 1. A continuous random variable $Y$ is said to have a normal probability distribution (or Gaussian distribution) with parameters $\mu$ and $\sigma$ if its probability density function is given by

\[
f(y; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right]
\]

for $-\infty < \mu < \infty$, $\sigma > 0$, and $-\infty < y < \infty$.

As will be shown later by means of the moment-generating function, $\mu$ and $\sigma^2$ are the mean and variance, respectively, of the random variable $Y$. The normal distribution occurs so frequently in certain parts of statistics that we denote it, for brevity, by $Y \sim N(\mu, \sigma^2)$. Thus, by writing $Y \sim N(\mu, \sigma^2)$, we mean that the random variable $Y$ is normally distributed with mean $\mu$ and variance $\sigma^2$.

The graph of the normal distribution $N(\mu, \sigma^2)$, generally referred to as the normal curve or Gaussian curve, is a bell-shaped curve as shown below. Intuitively, the normal curve assigns the highest density to values near the mean (the center), and values become less and less likely as the curve falls off symmetrically and smoothly toward both ends.

We now derive the moment-generating function for a normal random variable as follows.

Theorem 2. The moment-generating function for a normal random variable $Y$ is given by

\[
m_Y(t) = \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right).
\]

Proof.
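By definition, the moment-generating function for a normal random variable $Y$ is given by

\[
m_Y(t) = E\left(e^{tY}\right) = \int_{-\infty}^{\infty} \exp(ty)\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy.
\]

The two exponential factors of the integrand can be combined as

\[
\exp(ty) \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right]
= \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2 + ty\right]
= \exp\left[-\frac{y^2 - 2\mu y - 2\sigma^2 t y + \mu^2}{2\sigma^2}\right].
\]

Thus,

\[
m_Y(t) = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{y^2 - 2\mu y - 2\sigma^2 t y + \mu^2}{2\sigma^2}\right] dy.
\]

Now let us complete the square in the numerator of the exponent; that is,

\[
\begin{aligned}
y^2 - 2\mu y - 2\sigma^2 t y + \mu^2
&= y^2 - 2\left(\mu + \sigma^2 t\right) y + \mu^2 \\
&= \left[y - \left(\mu + \sigma^2 t\right)\right]^2 - \left(\mu + \sigma^2 t\right)^2 + \mu^2 \\
&= \left[y - \left(\mu + \sigma^2 t\right)\right]^2 - 2\mu\sigma^2 t - \sigma^4 t^2 \\
&= \left[y - \left(\mu + \sigma^2 t\right)\right]^2 - 2\sigma^2\left(\mu t + \frac{1}{2}\sigma^2 t^2\right),
\end{aligned}
\]

which implies that

\[
\exp\left[-\frac{y^2 - 2\mu y - 2\sigma^2 t y + \mu^2}{2\sigma^2}\right]
= \exp\left[-\frac{1}{2}\left(\frac{y - (\mu + \sigma^2 t)}{\sigma}\right)^2\right] \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right).
\]

Thus, $m_Y(t)$ becomes

\[
m_Y(t) = \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right) \int_{-\infty}^{\infty} \underbrace{\frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y - (\mu + \sigma^2 t)}{\sigma}\right)^2\right]}_{N(\mu + \sigma^2 t,\ \sigma^2)} dy
= \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right),
\]

because the integrand of the last integral is the density of a normal distribution with parameters $\mu + \sigma^2 t$ and $\sigma$, and hence the integral is equal to one.

The mean and variance of a normal random variable may now be calculated from the moment-generating function $m_Y(t)$. Differentiating $m_Y(t)$ with respect to $t$, we obtain

\[
E(Y) = \left.\frac{d}{dt} m_Y(t)\right|_{t=0} = \left.\left(\mu + \sigma^2 t\right) \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)\right|_{t=0} = \mu
\]

and

\[
E\left(Y^2\right) = \left.\frac{d^2}{dt^2} m_Y(t)\right|_{t=0}
= \left.\left[\sigma^2 \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right) + \left(\mu + \sigma^2 t\right)^2 \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)\right]\right|_{t=0}
= \sigma^2 + \mu^2.
\]

Thus,

\[
Var(Y) = E\left(Y^2\right) - [E(Y)]^2 = \sigma^2 + \mu^2 - \mu^2 = \sigma^2.
\]

We summarize the preceding discussion in the following theorem.

Theorem 3. The mean and variance of a normal random variable $Y$ are $\mu$ and $\sigma^2$, respectively; that is, $E(Y) = \mu$ and $Var(Y) = \sigma^2$.

Definition 2. A continuous random variable $Y$ that follows a normal probability distribution with mean $\mu$ and variance $\sigma^2$ is referred to as a normal random variable with mean $\mu$ and variance $\sigma^2$.

Once the parameters $\mu$ and $\sigma$ are specified, the normal curve is completely determined. In Figure (a) of the following set of figures, we have sketched two normal curves which have the same standard deviation but different means.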
The two curves are identical in shape but are centered at different positions along the horizontal axis. In Figure (b), we have sketched two normal curves with the same mean but different standard deviations. In this case, the two curves are centered at exactly the same position on the horizontal axis, but the curve with the larger standard deviation is lower and spreads out farther. Remember that the area under a probability curve and above the horizontal axis must be equal to one; therefore, the more variable the set of observations, the lower and wider the corresponding curve will be. Figure (c) shows the result of sketching two normal curves that have different means and different standard deviations. Clearly, they are centered at different positions on the horizontal axis and their shapes reflect the two different values of $\sigma$.

[Figures (a), (b), (c): pairs of normal curves with (a) equal $\sigma$ and different $\mu$, (b) equal $\mu$ and different $\sigma$, (c) different $\mu$ and different $\sigma$.]

The following theorem gives a list of properties of the normal curve $N(\mu, \sigma^2)$.

Theorem 4. Let $Y$ be normally distributed with mean $\mu$ and standard deviation $\sigma$. Then we have the following properties:

(1). The mode, which is the point on the horizontal axis at which the curve is a maximum, occurs at $y = \mu$.

(2). The curve is symmetric about a vertical axis through the mean $\mu$.

(3). The curve has its points of inflection at $y = \mu \pm \sigma$. It is concave downward if $\mu - \sigma < y < \mu + \sigma$, and is concave upward elsewhere.

(4). The curve approaches the horizontal axis asymptotically as we proceed in either direction away from the mean.

Proof. (1) It is easy to show that

\[
f'(y; \mu, \sigma) = -\frac{1}{\sigma^3\sqrt{2\pi}}\, (y - \mu) \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right]
\]

and

\[
f''(y; \mu, \sigma) = \frac{1}{\sigma^5\sqrt{2\pi}} \left[(y - \mu)^2 - \sigma^2\right] \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right].
\]

Setting $f'(y; \mu, \sigma) = 0$ yields $y = \mu$. Since $f'(y; \mu, \sigma) > 0$ for $y < \mu$ and $f'(y; \mu, \sigma) < 0$ for $y > \mu$, it follows that the mode occurs at $y = \mu$.

(2) It is also easy to verify that $f(\mu + y; \mu, \sigma) = f(\mu - y; \mu, \sigma)$ for all $-\infty < y < \infty$; that is, the curve is symmetric about the vertical line $y = \mu$.

(3) If $f''(y; \mu, \sigma) = 0$, then $y = \mu - \sigma$ or $y = \mu + \sigma$. Clearly, $f''(y; \mu, \sigma) < 0$ for $|y - \mu| < \sigma$ and $f''(y; \mu, \sigma) > 0$ for $|y - \mu| > \sigma$.
Hence, the curve has its points of inflection at $y = \mu \pm \sigma$.

(4) Now note that

\[
\lim_{y \to \pm\infty} f(y; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \lim_{y \to \pm\infty} \exp\left[-\frac{1}{2\sigma^2}(y - \mu)^2\right] = \frac{1}{\sigma\sqrt{2\pi}} \cdot 0 = 0.
\]

Just as the area under the curve of any continuous probability density function bounded by the two ordinates $y = y_1$ and $y = y_2$ is equal to the probability that the random variable $Y$ assumes a value between $y = y_1$ and $y = y_2$, for $Y \sim N(\mu, \sigma^2)$ we have

\[
P\{y_1 < Y < y_2\} = \int_{y_1}^{y_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy,
\]

represented by the area of the shaded region of the curve given below.

Let

\[
z = \frac{y - \mu}{\sigma}.
\]

Then $dz = \frac{1}{\sigma}\, dy$, or $dy = \sigma\, dz$. Also, if $y$ falls between $y_1$ and $y_2$, then $z$ will fall between

\[
z_1 = \frac{y_1 - \mu}{\sigma} \quad \text{and} \quad z_2 = \frac{y_2 - \mu}{\sigma}.
\]

Therefore, the above probability can be written as

\[
P\{y_1 < Y < y_2\} = \int_{z_1}^{z_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right) \sigma\, dz = \int_{z_1}^{z_2} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right) dz.
\]

Notice that the integrand of the second integral is the normal density with mean $\mu = 0$ and variance $\sigma^2 = 1$. As will be shown later, the second integral can be evaluated by using the probability table that is designed specifically for this kind of probability. Let us first define this special normal distribution as follows.

Definition 3. The probability distribution of a normal random variable $Z$ with mean 0 and variance 1 is called a standard normal distribution. That is, the probability density function of the random variable $Z$ is given by

\[
f(z; 0, 1) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right) \quad \text{for } -\infty < z < \infty.
\]

In the same way that $Y \sim N(\mu, \sigma^2)$ denotes a normal random variable, the expression $Z \sim N(0, 1)$ means that the random variable $Z$ has a standard normal distribution. Below is the graph of the standard normal distribution.

The distribution function of the standard normal random variable is given by

\[
F(z; 0, 1) = P\{Z \le z\} = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}t^2\right) dt.
\]

The figures below graph the standard normal distribution and its distribution function.

The table for standard normal probability gives the area under the standard normal curve corresponding to

\[
P\{Z > z\} = \int_{z}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}t^2\right) dt
\]

for values of $z$ ranging from 0.00 to 5.00.
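The tabulated upper-tail area can also be computed directly. The following is a minimal sketch, assuming the standard identity $P\{Z > z\} = \tfrac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$ and Python's built-in `math.erfc`; the function names `std_normal_pdf`, `std_normal_cdf`, and `upper_tail` are illustrative choices, not part of the notes.

```python
import math

def std_normal_pdf(z: float) -> float:
    """Density f(z; 0, 1) of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z: float) -> float:
    """Distribution function F(z; 0, 1) = P{Z <= z}."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def upper_tail(z: float) -> float:
    """Upper-tail area P{Z > z}, the quantity listed in the table."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Print a few upper-tail areas, rounded to four places as in a printed table.
for z in (0.00, 0.78, 1.73, 1.85):
    print(f"z = {z:4.2f}   P(Z > z) = {upper_tail(z):.4f}")
```

Rounded to four decimal places, these values should agree with a printed table; by symmetry, `std_normal_cdf(-z)` equals `upper_tail(z)`.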
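Example 1. Let $Z$ be a standard normal random variable. From the Standard Normal Probability Table, we have

\[
P\{Z > 1.73\} = 0.0418
\]

and

\[
P\{-1.85 < Z < 0.78\} = 1 - P\{Z > 1.85\} - P\{Z > 0.78\} = 1 - 0.0322 - 0.2177 = 0.7501.
\]

Example 2. Given a standard normal random variable $Z$, find the value of $k$ such that

(a) $P\{Z > k\} = 0.2236$;

(b) $P\{k < Z < -0.15\} = 0.4329$.

Solution. (a) From the Standard Normal Probability Table, we read that $P\{Z > 0.76\} = 0.2236$. Hence, $k = 0.76$.

(b) From the Standard Normal Probability Table, we read that the total area to the left of $-0.15$ is equal to 0.4404; that is, $P\{Z < -0.15\} = P\{Z > 0.15\} = 0.4404$. In the normal curve below, the area between $k$ and $-0.15$ is 0.4329, so the area to the left of $k$ must be $0.4404 - 0.4329 = 0.0075$. Hence, from the Standard Normal Probability Table, we have $k = -2.43$.

We now prove the following very useful theorem, which was anticipated in the change of variable above.

Theorem 5. If $Y \sim N(\mu, \sigma^2)$, then

\[
Z = \frac{Y - \mu}{\sigma} \sim N(0, 1).
\]

Proof. Since $\sigma > 0$, the distribution function of $Z$ is

\[
F(z) = P\left\{\frac{Y - \mu}{\sigma} \le z\right\} = P\{Y \le \sigma z + \mu\} = \int_{-\infty}^{\sigma z + \mu} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy.
\]

If we change the variable of integration by letting $u = (y - \mu)/\sigma$, then $du = dy/\sigma$ and

\[
F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}u^2\right) du.
\]

Accordingly, the density function of $Z$ is given by

\[
\frac{d}{dz} F(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right) \quad \text{for } -\infty < z < \infty,
\]

which is the standard normal density function. Thus,

\[
Z = \frac{Y - \mu}{\sigma} \sim N(0, 1).
\]

The theorem considerably simplifies the calculation of probabilities concerning normally distributed random variables, as will be seen below.

Standardized Transformation. Suppose that $Y \sim N(\mu, \sigma^2)$ and let $Z = (Y - \mu)/\sigma$. Then, for any $y_1 < y_2$,

\[
\begin{aligned}
P\{y_1 < Y < y_2\} &= P\{Y \le y_2\} - P\{Y \le y_1\} \\
&= P\left\{\frac{Y - \mu}{\sigma} \le \frac{y_2 - \mu}{\sigma}\right\} - P\left\{\frac{Y - \mu}{\sigma} \le \frac{y_1 - \mu}{\sigma}\right\} \\
&= P\left\{Z \le \frac{y_2 - \mu}{\sigma}\right\} - P\left\{Z \le \frac{y_1 - \mu}{\sigma}\right\} \\
&= \int_{-\infty}^{(y_2 - \mu)/\sigma} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz - \int_{-\infty}^{(y_1 - \mu)/\sigma} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz,
\end{aligned}
\]

since $P\{Y = y_2\} = 0$ and $Z = (Y - \mu)/\sigma \sim N(0, 1)$.

Example 3. Let $Y \sim N(250, 1600)$. Then $\mu = 250$ and $\sigma^2 = 1600$ (or $\sigma = 40$). Thus, it follows from the Standard Normal Probability Table that

\[
P\{Y \le 208\} = P\left\{Z \le \frac{208 - 250}{40}\right\} = P\{Z \le -1.05\} = P\{Z \ge 1.05\} = 0.1469.
\]

Example 4. Let $Y \sim N(40, 64)$.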
Then $\mu = 40$ and $\sigma^2 = 64$ (or $\sigma = 8$). Thus, it follows from the Standard Normal Probability Table that

\[
\begin{aligned}
P\{36.8 < Y < 50\} &= P\left\{\frac{36.8 - 40}{8} < Z < \frac{50 - 40}{8}\right\} = P\{-0.4 < Z < 1.25\} \\
&= 1 - P\{Z > 1.25\} - P\{Z > 0.4\} = 1 - 0.1056 - 0.3446 = 0.5498.
\end{aligned}
\]

Example 5. Suppose that $Y \sim N(40, 36)$. Find the value of $c$ such that

(a) $P\{Y > c\} = 0.1401$;

(b) $P\{Y \le c\} = 0.4483$.

Solution. In (a), we see that

\[
P\{Y > c\} = P\left\{Z > \frac{c - 40}{6}\right\} = 0.1401.
\]

But from the Standard Normal Probability Table, we have

\[
P\{Z > 1.08\} = 0.1401.
\]

Thus,

\[
\frac{c - 40}{6} = 1.08
\]

and hence $c = 40 + (6)(1.08) = 46.48$.

In (b), note that

\[
P\{Y \le c\} = P\left\{Z \le \frac{c - 40}{6}\right\} = 0.4483.
\]

By the symmetry property of the normal curve, it follows that

\[
P\left\{Z > -\frac{c - 40}{6}\right\} = 0.4483.
\]

From the Standard Normal Probability Table, we read

\[
P\{Z > 0.13\} = 0.4483.
\]

Thus,

\[
-\frac{c - 40}{6} = 0.13
\]

and hence $c = 40 + (6)(-0.13) = 39.22$.
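The standardized transformation used in Examples 3 to 5 can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist` with `cdf` and `inv_cdf` methods; the helper names (`lower_prob`, `interval_prob`, `cutoff_for_upper_tail`, `cutoff_for_lower_tail`) are hypothetical and chosen only for this illustration. Each helper standardizes explicitly with $z = (y - \mu)/\sigma$ and then works with $N(0, 1)$, mirroring the hand calculation.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1).
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def lower_prob(mu: float, sigma: float, y: float) -> float:
    """P{Y <= y} for Y ~ N(mu, sigma^2), via Z = (Y - mu) / sigma."""
    return STD_NORMAL.cdf((y - mu) / sigma)

def interval_prob(mu: float, sigma: float, y1: float, y2: float) -> float:
    """P{y1 < Y < y2} = P{Z <= z2} - P{Z <= z1}."""
    z1 = (y1 - mu) / sigma
    z2 = (y2 - mu) / sigma
    return STD_NORMAL.cdf(z2) - STD_NORMAL.cdf(z1)

def cutoff_for_upper_tail(mu: float, sigma: float, p: float) -> float:
    """Value c with P{Y > c} = p, i.e. c = mu + sigma * z where P{Z > z} = p."""
    return mu + sigma * STD_NORMAL.inv_cdf(1.0 - p)

def cutoff_for_lower_tail(mu: float, sigma: float, p: float) -> float:
    """Value c with P{Y <= c} = p."""
    return mu + sigma * STD_NORMAL.inv_cdf(p)

# Example 3: Y ~ N(250, 1600), so sigma = 40.
print(f"P(Y <= 208)       = {lower_prob(250, 40, 208):.4f}")
# Example 4: Y ~ N(40, 64), so sigma = 8.
print(f"P(36.8 < Y < 50)  = {interval_prob(40, 8, 36.8, 50):.4f}")
# Example 5: Y ~ N(40, 36), so sigma = 6.
print(f"(a) c = {cutoff_for_upper_tail(40, 6, 0.1401):.2f}")
print(f"(b) c = {cutoff_for_lower_tail(40, 6, 0.4483):.2f}")
```

Because the table lists $z$ only to two decimal places, the cutoffs computed this way may differ from the hand-computed values in later decimal places, although they should agree after rounding to two places.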