Statistics 215 Lab Materials

Gaussian (or Normal) Random Variable

In this section we introduce the Gaussian random variable, which is more commonly referred to as the Normal random variable. This is the random variable that has a bell-shaped curve as its probability density function. This is pictured below.

[Figure: bell-shaped Normal probability density curve]

The Normal distribution, or a Normal random variable, has nothing truly “normal” about it. That is to say, there is nothing abnormal about other random variables. The Normal distribution does, however, arise more frequently than other distributions. There are two settings in which it occurs quite frequently. The first of these is biological. The Normal distribution seems to arise when numerous quantities are added together. This often happens in biology when large amounts of genetic material combine in a particular trait, e.g. heights or lengths. The other setting where the Normal is often observed is the psychological setting. As with heights and lengths, this is thought to be the result of many genetic factors combining. For example, IQ measurements are often modeled as having a Normal distribution. More specific examples of Normal RVs include: lengths of newborn male piglets, heights of female peacocks, lengths of 2 inch nails, and scores on the Stanford-Binet psychological test.

The Normal distribution has the following characteristics. Its range is the entire number line. We can completely determine any Normal distribution by knowing its mean and its standard deviation. The distribution is symmetric about the mean. Consequently the mean and the median are the same number. The mean tells us where the center of the distribution is, and the standard deviation tells us how dispersed or spread out the distribution is.

The Normal distribution is used so commonly that we have special notation for it. The reason that the Normal distribution is so widely used is that it is an extremely flexible distribution.
Notation: X ~ N(5, 4) is read “X is a RV with a Normal distribution with mean 5 and variance 4”. In general, the notation Y ~ N(µy, σy²) is read “Y is a Normal random variable with mean µy and variance σy².”

As with other continuous RVs, the Normal distribution uses area to determine probability. However, the Normal has a special feature that separates it from other distributions: all that is needed to find a particular probability is the z-score corresponding to the cutoff of interest. That is, if we want to know P(X<7) for a Normal RV X, what we need to know is the z-score for 7. Recall that the z-score for 7 is z = (7 - µx)/σx, which depends on the mean µx and the standard deviation σx. One result of this is that the probability of being 2 standard deviations above the mean is the same whether the mean is 75 or 75,000 and whether the standard deviation is 2 or 200. As a consequence the z-score plays an indispensable role in calculating probabilities from a Normal distribution. Recall that the z-score of a value c is the number of standard deviations c is above or below the mean.

Because of the role that the z-score plays, we specify a random variable Z to have a Normal distribution with mean 0 and standard deviation 1. Z is often referred to as a “standard Normal” random variable. The reason for this specification is that, by taking the z-score, every Normal random variable can be transformed into one with mean 0 and standard deviation 1. The consequence is that we use the z-score (and hence the standard Normal distribution) to find Normal probabilities. Thus if X is a Normal random variable with mean 85 and standard deviation 5, then P(X>90) = P(Z > (90 - 85)/5) = P(Z>1.0). This is because we can transform the variable X to the variable Z, and by calculating the z-score for 90 we have the same probability, P(X>90) = P(Z>1.0).
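The standardization described above can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for a printed table; the helper name `z_score` and the variable names are illustrative choices, not part of the lab materials.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above (+) or below (-) `mean`."""
    return (value - mean) / sd


# Standard Normal: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

# X is Normal with mean 85 and standard deviation 5 (the example in the text).
X = NormalDist(mu=85, sigma=5)
z_90 = z_score(90, mean=85, sd=5)

p_direct = 1 - X.cdf(90)   # P(X > 90), computed on X itself
p_via_z = 1 - Z.cdf(z_90)  # P(Z > 1.0), computed on the standard Normal

# "2 standard deviations above the mean" has the same probability on any scale.
p_small = 1 - NormalDist(75, 2).cdf(75 + 2 * 2)
p_large = 1 - NormalDist(75_000, 200).cdf(75_000 + 2 * 200)

print(z_90, round(p_direct, 4), round(p_via_z, 4))
print(round(p_small, 4), round(p_large, 4))
```

The two probabilities on each printed line agree, which is the point of transforming to Z: only the z-score of the cutoff matters, not the particular mean and standard deviation.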
This is true for any calculation that we do with Normal random variables. We transform to Z and use Z for our probabilities.

Calculating Normal Probabilities

There are three steps to calculating a Normal probability.
1. Find the z-score for the value of interest.
2. Determine the appropriate formula for calculating the probability.
3. Use that z-score to find the probability using Table 3.

Example: If X is a Normal RV with mean 5 and standard deviation 2, find the z-score for x = 4. The z-score for x = 4 is (4 - 5)/2 = -1/2 = -0.5. Consequently, the value x = 4 is one-half of a standard deviation below the mean, since z = -0.5. Then P(X>4) = P(Z>-0.5).

Example: If X is a Normal RV with mean 5 and standard deviation 2, find the z-score for 8.4. The z-score for 8.4 is (8.4 - 5)/2 = 3.4/2 = 1.7. Consequently, the value 8.4 is 1.7 standard deviations above the mean, since z = 1.7. Then P(X<8.4) = P(Z<1.7).

Example: If H ~ N(142, 3.5²), find the z-score for 150. The z-score for 150 is (150 - 142)/3.5 = 8/3.5 = 2.29. Consequently, the value 150 is 2.29 standard deviations above the mean, since z = 2.29. Then P(H>150) = P(Z>2.29).

Having found the z-score, we need to determine the appropriate formula for calculating the probability of interest. The reason that we do this is the structure of Table D.3(a), which we will use for calculation. This table gives only “less than” probabilities, P(Z<z), and only for positive z-scores. However, we are often interested in probabilities that involve negative z-scores or in probabilities of being greater than a particular value.

Assume that we are interested in a random variable X with mean 70 and standard deviation 10. P(X<80) = P(Z < (80 - 70)/10) = P(Z<1.0). This is an example of a “less than” probability with a positive z-score. Instead, if we wanted P(X>80) = P(Z>1.0), then this is an example of a “greater than” probability with a positive z-score.
If we wanted to know P(X>60) = P(Z > (60 - 70)/10) = P(Z>-1.0), this is an example of a “greater than” probability with a negative z-score. Finally, if we need to calculate P(X<60) = P(Z<-1.0), this is an example of a “less than” probability with a negative z-score.

Table 3 contains probabilities such as P(Z<z). Consequently, we need rules to work other probabilities into this format. This is similar to the rules that were used for the binomial and Poisson tables to get probabilities other than P(X≤r).

What we want               Calculation we need to perform    Example
P(Z<z), with z positive    P(Z<z)                            P(Z<1.42)
P(Z>z), with z positive    P(Z>z) = 1 - P(Z<z)               P(Z>1.42) = 1 - P(Z<1.42)
P(Z<z), with z negative    P(Z<z) = 1 - P(Z<-z) *            P(Z<-1.42) = 1 - P(Z<1.42)
P(Z>z), with z negative    P(Z>z) = P(Z<-z) *                P(Z>-1.42) = P(Z<1.42)

* Recall that the negative of a negative is a positive, so -z is positive when z is negative.

These rules stem from two basic facts. First, the symmetry of the Normal distribution means that P(Z>z) = P(Z<-z). Since z and -z are the same distance from the mean of zero, symmetry says these probabilities must be the same. The other fact that is used is the complement rule, which says that P(Z>z) = 1 - P(Z<z). Combining these facts we get the above table of rules.

Finally, the last step we need is using Table 3. Suppose we want to find P(Z<1.48). The first step is to take the z-score through the tenths place, 1.4, and find it in the first column. Then go across that row to the column labeled 0.08. The entry in the table is 0.9306, so P(Z<1.48) = 0.9306. To find P(Z<0.85), first find 0.8 in the first column of the table. Then go across that row to the column for 0.05. The entry in the table is 0.8023, so P(Z<0.85) = 0.8023. To find P(Z<2.11), again we find 2.1 in the first column of the table and go across that row to the column for 0.01. The value in the table is 0.9826, so P(Z<2.11) = 0.9826.

The following examples combine all these steps.
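As an aside before those examples, the symmetry and complement rules can be sketched in code. This is only an illustration under stated assumptions: `table_lookup` is a hypothetical stand-in for the printed table (it computes P(Z<z) with Python's `statistics.NormalDist`, rounds z to two decimal places and the probability to four, and refuses negative z-scores), and `prob_less` and `prob_greater` are made-up helper names.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def table_lookup(z):
    """Stand-in for the printed table: P(Z < z) for z >= 0 only.

    z is rounded to two decimal places and the probability to four,
    mimicking the precision of a printed table.
    """
    if z < 0:
        raise ValueError("the table only lists non-negative z-scores")
    return round(_STANDARD_NORMAL.cdf(round(z, 2)), 4)


def prob_less(z):
    """P(Z < z), using only non-negative table entries."""
    if z >= 0:
        return table_lookup(z)          # read directly from the table
    return 1 - table_lookup(-z)         # symmetry, then complement


def prob_greater(z):
    """P(Z > z), using only non-negative table entries."""
    if z >= 0:
        return 1 - table_lookup(z)      # complement rule
    return table_lookup(-z)             # symmetry: P(Z > z) = P(Z < -z)


for z in (1.42, -1.42):
    print(z, round(prob_less(z), 4), round(prob_greater(z), 4))
```

For any z, `prob_less(z)` and `prob_greater(z)` add up to 1, which is a quick way to check that the right row of the rules table was applied.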
Example: Suppose that X is a Normal random variable with mean 100 and standard deviation 7.5.

Find P(X < 110).
P(X<110) = P(Z < (110 - 100)/7.5) = P(Z<1.33) = 0.9082. We can look P(Z<1.33) up directly in the table.

Find P(X > 120).
P(X>120) = P(Z > (120 - 100)/7.5) = P(Z>2.67) = (by complementary events) 1 - P(Z<2.67) = 1 - 0.9962 = 0.0038.

Find P(X > 93).
P(X>93) = P(Z > (93 - 100)/7.5) = P(Z>-0.93) = (by symmetry of the Normal distribution) P(Z<0.93) = 0.8238.

Find P(X < 84).
P(X<84) = P(Z < (84 - 100)/7.5) = P(Z<-2.13) = (by symmetry of the Normal distribution) P(Z>2.13) = (by complementary events) 1 - P(Z<2.13) = 1 - 0.9834 = 0.0166.

TIP: Since Table 3 uses only two decimal places for z-scores, round all z-scores to two decimal places when using this table.

TIP: It is common to refer to a random variable by the name of the random variable or by the distribution. They are interchangeable. Since any RV is defined by its distribution, this usage is appropriate, though it often confuses people the first time they see or hear this.

TIP: It is often helpful when doing calculations with Normal probabilities to draw a picture to get an idea about the quality of your final answer. If your answer conflicts with the picture then you may need to reconsider your calculations. The first step in this is to draw a bell-shaped curve. Draw a vertical line down the center and label it with the value of the mean. Over 99% of the Normal distribution is within 3 standard deviations of the mean. So go to the right edge of your curve and label it with the value of the mean plus three times the standard deviation. Go to the left edge and label it with the value of the mean minus three times the standard deviation. Then shade the area for the probability that you are interested in.

Example: Suppose X is a Normal random variable with mean 120 and standard deviation 7. Find P(X>125).

[Figure: bell-shaped curve with the horizontal axis labeled 99, 120, and 141]

We use 120 for the center since it is the mean.
The values 141 and 99 are 120 + 3*7 and 120 - 3*7, which are 3 standard deviations above and below the mean, respectively. P(X>125) = P(Z > (125 - 120)/7) = P(Z>0.71) = 1 - P(Z<0.71) = 1 - 0.7611 = 0.2389. Given the accuracy of the picture it seems reasonable that the probability should be around 24%. We would have been nervous had the answer we calculated been more than 50% or less than 2%. Drawing a picture is a nice check for gross errors in calculation.

7.4 Percentiles of the Normal distribution

Back in Chapter 4 we discussed percentiles for data. For example, the 80th percentile is the point in the distribution where 80% of the data or 80% of the probability is below that point (and consequently 20% is above that point). We often want to calculate percentiles for a specific distribution or set of data. For example, if I want to build a cage that 98% of frogs will be comfortable in, then I need to know the 98th percentile of frog sizes. An admissions officer might only want to accept students who are in the top 20% of all scores on some standardized test. In that case the admissions officer would need to know the 80th percentile of scores on that test, and would accept only those students whose test scores were above the 80th percentile.

To find percentiles for the Normal distribution, we reverse the process from the previous section. In the previous section we had a value and we were looking for a probability or a percentage. For example, in the previous section we wanted P(X>182) = c and we found what c was. In this section, we’ll have P(X>k) = 0.7500, say, and we’ll have to find k. Here, we have the percentage and we want to find the value that would give us that percentage. Consequently, we’ll reverse the steps we took in the previous section.

Suppose that we want to find the 75th percentile of a Normal distribution with mean 430 and standard deviation 22. Let X be a Normal RV with mean 430 and standard deviation 22. Then we want to find a value k such that P(X<k) = 0.7500.
Likewise there exists a z-score for k, call it zk, such that P(Z<zk) = 0.7500. Now we can find zk by going into the body of Table 3 and finding 0.7500. Inside the body of the table we find the closest percentage to 0.7500. That percentage is 0.7486. To find the corresponding z-score, go to the top of the column and the left of the row. This probability corresponds to a z-score of 0.67. Thus, zk = 0.67. This is the z-score for k, but we need to convert that back to k. Now zk = (k - 430)/22, so 0.67 = (k - 430)/22. Solving for k gives us k = 430 + 0.67*(22) = 444.74. So the 75th percentile of a Normal distribution with mean 430 and standard deviation 22 is approximately 444.74.

Finding the j*100th percentile, k, of a Normal random variable X with mean µ and standard deviation σ.
1. In the body of Table D.3(a) find j (or the value closest to j).
2. Find the z-score for j, call it zk.
3. Using the formula for the z-score, zk = (k - µ)/σ, solve for k: k = µ + zk*σ.

Example: For a Normal random variable X ~ N(45, 6²), find the 92nd percentile.
1. In Table 3, the closest value to 0.9200 is 0.9207.
2. zk = 1.41.
3. 1.41 = (k - 45)/6, then k = 45 + 1.41*6 = 53.46.
So the 92nd percentile is 53.46.

Example: For a Normal RV Y ~ N(76, 3²), find the 97th percentile.
1. In Table D.3(a), the closest value to 0.9700 is 0.9699.
2. zk = 1.88.
3. 1.88 = (k - 76)/3, then k = 76 + 1.88*3 = 81.64.
So the 97th percentile is 81.64.
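The reverse lookup can also be sketched in code. The block below is a minimal illustration, assuming Python's standard-library `statistics.NormalDist.inv_cdf` in place of searching the body of the printed table; the function name `normal_percentile` is a hypothetical helper, not something from the lab materials. Because it does not round the z-score to two decimal places, its answers can differ slightly from the table-based ones.

```python
from statistics import NormalDist


def normal_percentile(j, mean, sd):
    """Return k such that P(X < k) = j for a Normal X with the given mean and sd.

    j is a proportion strictly between 0 and 1 (0.75 for the 75th percentile).
    """
    # Step 1 and 2: reverse lookup, the z-score with P(Z < z_k) = j.
    z_k = NormalDist(mu=0, sigma=1).inv_cdf(j)
    # Step 3: solve z_k = (k - mean) / sd for k.
    return mean + z_k * sd


k75 = normal_percentile(0.75, mean=430, sd=22)
k92 = normal_percentile(0.92, mean=45, sd=6)
k97 = normal_percentile(0.97, mean=76, sd=3)

print(round(k75, 2), round(k92, 2), round(k97, 2))

# Check: plugging k back in should recover the percentage we started from.
print(round(NormalDist(430, 22).cdf(k75), 4))
```

The printed values land close to the table-based answers of 444.74, 53.46, and 81.64; the small gaps come from the table using the nearest two-decimal z-score rather than the exact one.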
