Principles of Statistics I

Lecture Notes for Principles of Statistics I (Economics 261)
Western Nevada College

Copyright © 2010 by Vance A. Hughey
Carson City, Nevada

The following material is copyrighted. The text of this publication, or any part thereof, may not be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, storage in an information retrieval system, or otherwise, without the prior written permission of Vance A. Hughey.

6. Continuous Probability Distributions

This chapter introduces the probability distributions of continuous random variables. Three major continuous probability distributions are introduced: the uniform, the normal, and the exponential.

Learning Objectives:
1. Understand the difference between how probabilities are computed for discrete and continuous random variables.
2. Know how to compute probability values for a continuous uniform probability distribution and be able to compute the expected value and variance for such a distribution.
3. Be able to compute probabilities using a normal probability distribution. Understand the role of the standard normal distribution in this process.
4. Know how and when the normal distribution can be used to approximate binomial probabilities.
5. Be able to compute probabilities using an exponential probability distribution.
6. Understand the relationship between the Poisson and exponential probability distributions.

Topics:
• Uniform distributions
• Normal probability distribution
• Standard normal distribution
• Exponential probability distribution

Key concepts:
• Uniform distributions
• Probability density function
• Normal probability distribution
• Bell-shaped curve
• z-value
• Standard normal distribution
• Continuity correction factor
• Exponential probability distribution

We have just finished studying discrete probability distributions; we now turn to continuous probability distributions.
The key difference (which leads to a major difference in how probabilities are computed) stems from the fact that a continuous random variable can take on any value in an interval. There are infinitely many possible values in any interval, so we won't talk about the probability of a specific value. Instead, we will discuss the probability that a continuous random variable lies within a specific interval.

This technique is useful in a broad range of problems. For example, consider the probability that a can of soup will contain between 8.00 and 8.33 ounces of soup. It is neither meaningful nor practical to ask for the probability that a can contains exactly 8.2534 ounces. Other problems that can be addressed include the probability that a stock price will fall between $45 and $76, the probability that a loan application will be processed within a particular 30-minute interval, or the probability that a firm's accounts receivable will be from 30 to 60 days old. There are many other applications in engineering, manufacturing, medicine, transportation, psychology, and other fields.

We are going to discuss three continuous probability distributions: the uniform, the normal, and the exponential. In each case we will be computing the probability that a continuous random variable lies within a specific interval. Let's start with the uniform distribution, since it shows most clearly how computing probabilities differs between discrete and continuous distributions.

Uniform Distribution

If we assume, as the text does, that the flight between Chicago and New York can take from 120 to 140 minutes, we can define x as a random variable where 120 ≤ x ≤ 140. If we assume that the flight can never take less than 120 minutes nor more than 140 minutes, then f(x) = 0 for any x below 120 or above 140.
There are 20 one-minute intervals between 120 and 140 minutes. If we further assume that each one-minute interval is equally likely, then the random variable x has a uniform probability distribution. The probability density function is:

\[ f(x) = \frac{1}{20} \quad \text{for } 120 \le x \le 140 \]

The graph of this probability density function looks like this (Figure 6.2, page 228):

[Figure 6.2: uniform probability density function for flight time. Horizontal axis: flight time in minutes, 120 to 140. Vertical axis: f(x), constant at 1/20. Shaded area: P(120 ≤ x ≤ 130) = (1/20)(10) = 0.50.]

We can also generalize the uniform probability density function as follows (Equation 6.1, page 227):

\[ f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases} \]

where:
a = smallest value the variable can assume; and
b = largest value the variable can assume.

In the case of the probability density function f(x), the height of the function at any particular value of x does not represent probability. Instead, we look at the area under the function to find probability.

What is the probability that a flight time will fall between 120 and 140? Answer: 1.0, since the area is (1/20)(20). What is the probability that the flight time will fall between 120 and 130? Answer: 0.50, since the interval is 10 minutes wide and the height is 1/20, giving an area of (1/20)(10).

Optional

The mean and variance for the uniform probability distribution are:

\[ \mu = \frac{a+b}{2} \qquad \sigma^2 = \frac{(b-a)^2}{12} \]

The Normal Distribution

One of the most common distributions used in business and economics for decision-making purposes is the normal probability distribution. This distribution is also known as the Gaussian distribution, named after Carl Friedrich Gauss (1777-1855), who did much to develop the properties of the normal curve. It has been used in a wide variety of applications including heights and weights of people, scientific measurements, test scores, and amounts of rainfall.
(Later we will also see how a continuous normal random variable can be used as an approximation in situations involving discrete random variables.)

You've all probably seen a representation of the normal (or bell-shaped) curve at some time or another. Perhaps one of your instructors has drawn such a curve when reporting test scores. The probability density function that defines the bell-shaped curve of the normal probability distribution is (Equation 6.2, page 232):

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)} \]

Don't worry about this formula, since it only provides the height of the curve for any value of x.

The bell-shaped curve looks like this:

[Figure: bell-shaped normal curve, centered at µ on the horizontal axis.]

The normal probability distribution (normal curve) is an extremely important distribution in statistics, so let's spend some time reviewing some of the important characteristics of this kind of distribution.

Characteristics of normal probability distributions
1. There is an entire family of normal probability distributions. Each has a different combination of mean µ and standard deviation σ.
2. The mean, median, and mode are the same for a normal curve and occur at the value of x where the curve is at its highest point.
3. The mean of a normal distribution can be any numerical value: positive, negative, or zero.
4. The normal distribution is symmetric about its mean. The "tails" approach but never touch the horizontal axis.
5. The standard deviation determines the width of the curve: the larger the standard deviation, the wider and flatter the curve.
6. The total area under the curve (as with all continuous probability distributions) is 1.
7. Regardless of the value of the mean, µ, and the standard deviation, σ, probabilities for the normal random variable are given by the area under the curve. We'll show you how to compute these probabilities shortly.
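To make Equation 6.2 and characteristic 6 concrete, the short sketch below evaluates the height of the normal curve and approximates the total area under it with a simple midpoint sum. The function names `normal_pdf` and `area_under_pdf`, the integration range of plus or minus 8 standard deviations, and the values µ = 100 and σ = 15 are assumptions chosen for illustration; they do not come from the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x (Equation 6.2)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_pdf(mu, sigma, lo, hi, steps=100_000):
    """Approximate the area under the curve from lo to hi (midpoint sum)."""
    width = (hi - lo) / steps
    heights = (normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return sum(heights) * width

mu, sigma = 100.0, 15.0  # illustrative values only, not from the text

peak = normal_pdf(mu, mu, sigma)  # the curve is highest at the mean
total = area_under_pdf(mu, sigma, mu - 8 * sigma, mu + 8 * sigma)

print(f"height at the mean: {peak:.5f}")
print(f"total area:         {total:.5f}")
```

The height at the mean is not a probability; only the area is. Changing `mu` or `sigma` changes the height and width of the curve, but the total area should stay at 1.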
The Standard Normal Probability Distribution

There is a special normal curve that is used in decision-making called the standard normal probability distribution. What distinguishes this curve from all of the other normal curves is that it has a mean, µ, of 0 and a standard deviation, σ, of 1. Instead of using x as the random variable, we use the letter z. We use this special curve to find probabilities.

We know that there is a 100% chance, or a probability of 1.00, that z falls somewhere under the curve. What is the probability that z falls above the mean? 0.50. Half of the z values are above the mean and half are below the mean.

What if we wanted to know the probability that z was between 0 and 1, inclusive? Here's where we refer to a table developed specifically for computing the area under the standard normal curve. Computing the area under the curve within a defined interval is the same as determining the probability that z falls within the defined interval. The tables on the inside front cover of your textbook show the cumulative area under the curve to the left of any specified z value. If we want to know the probability that z falls within 1 standard deviation above the mean, we look up the z value of 1.00 in the right-side table, find 0.8413, and subtract the 0.5000 that lies below the mean to get 0.3413.

What is the probability that z falls within 1 standard deviation of the mean? In this case we are looking for the area both above and below the mean that is within one standard deviation of the mean. Since we already know the probability of z falling within the interval z = 0 to 1, and the curve is symmetric, we can just double it to get 0.6826 (68.3%). Note that this is the same probability as we see in Figure 6.4 (page 234).
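The table lookups above can be checked with a few lines of Python. This is only a sketch: the helper name `std_normal_cdf` is an assumption for illustration, and it relies on the standard identity that expresses the cumulative standard normal area through the error function `math.erf`.

```python
import math

def std_normal_cdf(z):
    """Cumulative area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Area between the mean (z = 0) and z = 1
p_zero_to_one = std_normal_cdf(1.0) - std_normal_cdf(0.0)

# Area within one standard deviation on either side of the mean
p_within_one = std_normal_cdf(1.0) - std_normal_cdf(-1.0)

print(f"P(0 <= z <= 1)  = {p_zero_to_one:.4f}")
print(f"P(-1 <= z <= 1) = {p_within_one:.4f}")
```

The second value prints as 0.6827 rather than 0.6826 because the table entry 0.3413 is already rounded before it is doubled. Differences of this size between table work and computed values are normal.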
If we were to calculate the probability that the random variable z takes on a value within plus or minus 2 standard deviations of the mean, we would get 95.44%, and for 3 standard deviations we would get about 99.7%.

Before we can use the standard normal probability distribution in any real-world decision-making, we need a means of converting an x-value to a z-value. There are very few real-world cases where the distribution has a mean of 0 and a standard deviation of 1. The formula for converting to the standard normal distribution is:

\[ z = \frac{x - \mu}{\sigma} \]

Using this formula we can compute probabilities for any normal distribution. Let's go through an example now to put this entire discussion in a more meaningful light. Again, for simplicity, let's start with the example from the text: the Grear Tire Company problem on page 239. This company wants to place a guarantee on its new tire but isn't sure what the policy should be. Actual tests show the mean mileage to be 36,500 miles with a standard deviation of 5,000 miles. What is the probability that the tire mileage will exceed 40,000 miles for any given tire? Using 40,000 for x, 36,500 for µ, and 5,000 for σ, we can use Equation 6.3 (page 238) to convert x to z.

\[ z = \frac{x - \mu}{\sigma} = \frac{40{,}000 - 36{,}500}{5{,}000} = \frac{3{,}500}{5{,}000} = 0.70 \]

Now that we have a z value, we can compute the probability that z > 0.70 (which is the same as the probability that x > 40,000). We know that the probability that z is greater than 0 is 0.50. But we want the probability that z > 0.70, so we're looking for the area to the right of z = 0.70. We need to find the probability that z is between 0 and 0.70, which is 0.7580 − 0.5000 = 0.2580, and then subtract that value from 0.5. This gives us

\[ 0.5000 - 0.2580 = 0.2420 \]

From this we conclude that about 24.2 percent of the tires will exceed 40,000 miles.
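As a check on the 40,000-mile calculation, here is a minimal sketch using `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later). The variable names are illustrative assumptions; the mean, standard deviation, and mileage are the values from the Grear Tire example.

```python
from statistics import NormalDist

mu, sigma = 36_500.0, 5_000.0   # mean and standard deviation of tire mileage
x = 40_000.0                    # mileage of interest

# Convert x to a z value (Equation 6.3)
z = (x - mu) / sigma

# Area to the right of x under the normal curve with this mean and sigma
mileage = NormalDist(mu, sigma)
p_exceed = 1.0 - mileage.cdf(x)

print(f"z = {z:.2f}")
print(f"P(x > 40,000) = {p_exceed:.4f}")
```

Working directly with `NormalDist(mu, sigma)` skips the table step, but the z value is still worth computing because it shows how far 40,000 miles sits above the mean in standard-deviation units.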
What should the guarantee be if Grear would like no more than 10 percent of the tires to be eligible for the discount guarantee? We need to set a minimum guaranteed mileage where the probability of a tire not getting that mileage is 10 percent. Figure 6.7 (page 240) shows this situation. The x value for the guarantee mileage splits the lower half of the distribution (area 0.50) into an area of 0.40 between that value and the mean and an area of 0.10 in the lower tail. So, if we find 0.1000 (actually 0.1003) in the left-side table on the inside of the front cover of the text, we see that it corresponds with z = −1.28. The z value is negative because it lies below the mean. We have the z-value, but we need to convert back to x:

\[ z = \frac{x - \mu}{\sigma} = -1.28 \]
\[ x - \mu = -1.28\sigma \]
\[ x = \mu - 1.28\sigma \]
\[ x = 36{,}500 - 1.28(5{,}000) = 30{,}100 \]

Grear might want to set a 30,000-mile guarantee for its tires.

Normal Approximation of Binomial Probabilities

Suppose, as the text does, that we want to find the probability that a sample of 100 invoices has exactly 12 errors when the company's average error rate is 10 percent. Since binomial probability tables don't usually go over n = 20, we can use the normal distribution to approximate the binomial probability. Here np = 10 and n(1 − p) = 90 are both at least 5, which is the usual rule of thumb for the approximation to be reasonable.

First, we set µ = np and σ = √(np(1 − p)):

\[ \mu = (100)(0.1) = 10 \]
\[ \sigma = \sqrt{(100)(0.1)(0.9)} = \sqrt{9} = 3 \]

Next, we want to find the area under the curve where x = 12. But since we cannot find a continuous probability at a point, we need to convert the discrete point to an interval using a continuity correction factor. We add and subtract 0.5 from 12 to get an interval of 11.5 to 12.5. Now we can find the area under the curve for this interval.

Finally, we convert to the standard normal probability distribution using Equation 6.3 (page 238). We'll find the z value for x = 12.5 and for x = 11.5. Once we have the cumulative probability for each z value, we'll subtract to find the probability for the interval.
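The procedure just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the textbook's method: it uses `NormalDist` from the standard `statistics` module in place of the printed table, and it adds an exact binomial calculation with `math.comb` purely for comparison (both need Python 3.8 or later). All variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, p, k = 100, 0.10, 12   # sample size, error rate, number of errors of interest

# Parameters of the approximating normal distribution
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma)

# Continuity correction: treat the point k as the interval k - 0.5 to k + 0.5
p_approx = approx.cdf(k + 0.5) - approx.cdf(k - 0.5)

# Exact binomial probability, shown only to judge the quality of the approximation
p_exact = math.comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"normal approximation of P(x = 12): {p_approx:.4f}")
print(f"exact binomial P(x = 12):          {p_exact:.4f}")
```

Run as written, the approximation comes out near 0.106 and the exact binomial value near 0.099. The hand calculation that follows rounds z to 0.83 before using the table, so its answer differs slightly in the third decimal place.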
\[ z = \frac{x - \mu}{\sigma} = \frac{12.5 - 10.0}{3} = 0.83 \quad \text{at } x = 12.5 \]
\[ z = \frac{x - \mu}{\sigma} = \frac{11.5 - 10.0}{3} = 0.50 \quad \text{at } x = 11.5 \]

From the right-side table, the cumulative area under the curve for z = 0.83 is 0.7967 and for z = 0.50 is 0.6915. Subtracting, we get P(x = 12) = 0.1052.

The Exponential Distribution

A continuous probability distribution that is useful in describing the time or space between occurrences of an event is the exponential probability distribution. The formula for calculating the probability of obtaining a value for the exponential random variable of less than or equal to some specific value x₀ is:

\[ P(x \le x_0) = 1 - e^{-x_0/\mu} \quad \text{(Equation 6.4)} \]

where µ is the mean of the distribution.

Let's go through an example. Let's say that your experience at a bank is such that it takes 7 minutes on average for you to complete your weekly deposit. One day you decide that you would like to know the probability of getting your business done in 3 minutes or less. You recognize this as a problem involving a continuous probability distribution and further recognize it as a time interval problem where the exponential distribution would be appropriate. You apply Equation 6.4 with µ = 7 and x₀ = 3:

\[ P(x \le x_0) = 1 - e^{-x_0/\mu} \]
\[ P(x \le 3) = 1 - e^{-3/7} \]
\[ P(x \le 3) = 0.3486 \]

The exponential and Poisson distributions are related: the Poisson distribution describes the number of occurrences per interval, and the exponential distribution describes the length of the interval between occurrences.

Next class
• Chapter 7: Sampling and Sampling Distributions
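As a closing check on the bank example in the exponential section above, the sketch below evaluates Equation 6.4 directly. The function name `exponential_cdf` is an assumption for illustration. The last few lines also illustrate the Poisson link under one added assumption: that completions follow a Poisson process with an average rate of 1/µ per minute, so the chance of zero completions in 3 minutes is e^(−3/7).

```python
import math

def exponential_cdf(x0, mean):
    """P(x <= x0) for an exponential random variable with the given mean (Equation 6.4)."""
    return 1.0 - math.exp(-x0 / mean)

mean_minutes = 7.0   # average time to complete the deposit
x0 = 3.0             # time limit of interest, in minutes

p_three_or_less = exponential_cdf(x0, mean_minutes)
print(f"P(x <= 3) = {p_three_or_less:.4f}")

# Poisson view of the same situation (assumed rate: 1/mean completions per minute).
# The probability of zero occurrences in x0 minutes is exp(-rate * x0),
# which is the probability that the wait is longer than x0.
rate_per_minute = 1.0 / mean_minutes
p_zero_occurrences = math.exp(-rate_per_minute * x0)
print(f"P(no completion within 3 minutes) = {p_zero_occurrences:.4f}")
```

The two printed values add up to 1, which is the relationship in miniature: "the wait exceeds 3 minutes" (exponential) and "no occurrence in a 3-minute interval" (Poisson) describe the same event.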