
8.5 Normal Distributions

We have seen that the histogram for a binomial distribution with n = 20 trials and p = 0.50 is shaped like a bell if we join the tops of the rectangles with a smooth curve. Real-world data such as IQ scores, weights, heights, and test scores often have histograms with a symmetric bell shape. We call such distributions normal distributions. They are the focus of this section.

De Moivre
http://www-gap.dcs.st-and.ac.uk/~history/Mathematicians/De_Moivre.html
Three mathematicians contributed to the mathematical foundation for this curve: Abraham De Moivre, Pierre Laplace, and Carl Friedrich Gauss. De Moivre pioneered the development of analytic geometry and the theory of probability. He published The Doctrine of Chances in 1718. The definition of statistical independence appears in this book, together with many problems involving dice and other games. He also investigated mortality statistics and the foundation of the theory of annuities.

Laplace
Laplace also systematized and elaborated probability theory in "Essai Philosophique sur les Probabilités" (Philosophical Essay on Probability, 1814). He was the first to publish the value of the Gaussian integral.

Bell-shaped curves
Many frequency distributions have a symmetric, bell-shaped histogram.
Example 1: The frequency distribution of heights of males is symmetric about a mean of 69.5 inches.
Example 2: IQ scores are symmetrically distributed about a mean of 100 with a standard deviation of 15 or 16. The frequency distribution of IQ scores is bell shaped.
Example 3: SAT test scores have a bell-shaped, symmetric distribution.

Graph of a generic normal distribution
[Figure: bell-shaped curve; horizontal axis from -4 to 4, vertical axis from 0 to 0.5]
Values on the x axis represent the number of standard deviation units a particular data value is from the mean. Values on the y axis give the height of the curve. The height itself is not a probability; probabilities of the random variable x correspond to areas under the curve.

Area under the Normal Curve
1. Normal distribution: a smoothed-out histogram.
2. P(a < x < b), the probability that the random variable x is between a and b, is the area under the normal curve between x = a and x = b.

Properties of Normal distributions
1. Symmetric about its mean.
2. Approaches, but never touches, the horizontal axis as x gets very large (or very small).
3. Almost all observations lie within 3 standard deviations of the mean.

Area under normal curve
Example: A midwestern college has an enrollment of 3264 female students whose mean height is 64.4 inches with a standard deviation of 2.4 inches. By constructing a relative frequency distribution with class boundaries of 56, 57, 58, …, 74, we find that the frequency distribution resembles a bell-shaped, symmetric distribution.

Heights of Females at a College
[Figure: relative frequency distribution with class width = 1, smoothed out to form a normal, bell-shaped curve]

Normal curve areas
Key fact: For a normally distributed variable, the percentage of all possible observations that lie within any specified range equals the corresponding area under its associated normal curve, expressed as a percentage. This holds approximately for a variable that is approximately normally distributed.
The area of the red portion of the graph equals P(66 < x < 68), the probability that a female student chosen at random from the college has a height between 66 and 68 inches.

Finding areas under a normal, bell-shaped curve
The difficulty in finding the area under a normal curve between x = a and x = b, and thus the probability P(a < x < b), is that calculus is needed. We can sidestep this by using results already worked out with calculus: tables have been constructed that give areas under what is called the standard normal curve. The standard normal curve will be discussed shortly.
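The slides rely on a table or a TI-83 for these areas. As a rough cross-check, the following Python sketch computes the same kind of area numerically for the female-heights example. It assumes the standard relationship between the cumulative area under a normal curve and the error function (`math.erf`); the helper names `normal_cdf` and `normal_area` are illustrative and do not come from the original material.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal curve (given mean and sd) to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_area(a, b, mean, sd):
    """P(a < x < b): area under the normal curve between a and b."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

# Heights of female students: mean 64.4 in, standard deviation 2.4 in.
MEAN_HEIGHT, SD_HEIGHT = 64.4, 2.4

# Shaded area from the slide: heights between 66 and 68 inches.
p_66_to_68 = normal_area(66, 68, MEAN_HEIGHT, SD_HEIGHT)

# Property 3: share of observations within 3 standard deviations of the mean.
within_3_sd = normal_area(
    MEAN_HEIGHT - 3 * SD_HEIGHT, MEAN_HEIGHT + 3 * SD_HEIGHT,
    MEAN_HEIGHT, SD_HEIGHT,
)

print(f"P(66 < x < 68) = {p_66_to_68:.4f}")
print(f"Within 3 standard deviations: {within_3_sd:.4f}")
```

Under these assumptions the first value comes out near 0.19, so roughly 19% of the female students would be expected to have heights between 66 and 68 inches, and the second value is consistent with the property that almost all observations lie within 3 standard deviations of the mean.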
A normal curve is characterized by its mean and standard deviation. The scale for the x axis will be different for each normal curve. The shape also differs from curve to curve, since it is determined by the standard deviation: the greater the standard deviation, the "flatter" and more spread out the normal curve will be.

Standardizing a Normally Distributed Variable
To find the percentage of scores that lie within a certain interval, we need the area under the normal curve between the desired x values. To do this, we would need a table of areas for each normal curve. The problem is that there are infinitely many normal curves, so we would need infinitely many tables.

Non-standard normal curves
Ex. 1. The distribution of IQ scores is normal with mean = 100 and standard deviation = 16.
Ex. 2. The heights of females at a certain midwestern college are normally distributed with a mean of 64.4 inches and a standard deviation of 2.4 inches.
Ex. 3. The probability distribution of x, the diameter of CDs produced by a company, is normal with a mean of 4 inches and a standard deviation of 0.03 inches.
Thus, for these three examples we would need three separate tables giving the areas under the normal curve for each distribution. Obviously, this poses a problem.

Standard normal curve
The way out of this problem is to standardize each normal curve, which transforms every individual normal distribution into one particular standardized distribution. To find P(a < x < b) for a non-standard normal curve with mean $\mu$ and standard deviation $\sigma$, we can find
$P\left(\frac{a-\mu}{\sigma} < z < \frac{b-\mu}{\sigma}\right)$
Thus $P(a < x < b) = P\left(\frac{a-\mu}{\sigma} < z < \frac{b-\mu}{\sigma}\right)$. The variable z is called the standard normal variable.

The Standard normal distribution
The standard normal distribution has a mean of 0 and a standard deviation of 1. Values on the horizontal axis are called z values; z will be defined shortly. Values on the y axis are heights of the curve, which lie between 0 and about 0.4. Probabilities are areas under the curve and are decimal numbers between 0 and 1, inclusive.
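The claim that a single standard normal table can replace a separate table for every curve can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the usual standardization z = (x − mean)/sd and the error function `math.erf` in place of a printed table, and the interval chosen (from one standard deviation below the mean to 1.5 standard deviations above it) is a hypothetical choice made for the demonstration, as are the helper names.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal curve (given mean and sd) to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def standardize(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def area_direct(a, b, mean, sd):
    """P(a < x < b) computed on the original (non-standard) curve."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

def area_standardized(a, b, mean, sd):
    """Same probability, computed on the standard normal curve (mean 0, sd 1)."""
    z_a = standardize(a, mean, sd)
    z_b = standardize(b, mean, sd)
    return normal_cdf(z_b, 0.0, 1.0) - normal_cdf(z_a, 0.0, 1.0)

# (mean, sd) for the three non-standard normal curves in the examples.
examples = {
    "IQ scores": (100, 16),
    "Female heights (in)": (64.4, 2.4),
    "CD diameters (in)": (4, 0.03),
}

results = {}
for name, (mean, sd) in examples.items():
    a, b = mean - sd, mean + 1.5 * sd  # hypothetical interval for illustration
    results[name] = (area_direct(a, b, mean, sd), area_standardized(a, b, mean, sd))
    print(f"{name}: direct = {results[name][0]:.4f}, standardized = {results[name][1]:.4f}")
```

All three curves give the same area once the endpoints are expressed in standard deviation units, which is why one table for the standard normal curve is enough.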
[Figure: standard normal curve; horizontal axis from -4 to 4, vertical axis from 0 to 0.5]

Standardized Normally Distributed Variable
The formula below for z can be used to standardize any normally distributed variable x. z is referred to as the amount of standard deviations from the mean (A.S.D.M. = z):
$z = \frac{x-\mu}{\sigma}$
where $\mu$ and $\sigma$ represent the mean and standard deviation of the distribution, respectively.
For example, if IQ scores are distributed normally with a mean of 100 and a standard deviation of 16, then for an individual with IQ x = 124,
$z = \frac{124-100}{16} = 1.5$

Areas under the standard normal curve
Find the following probability:
A) P(0 < z < 1.2). Use table or TI-83 to find the area. Answer: 0.3849

Areas under the Standard Normal Curve
Let z be the standard normal variable. Find the following probabilities. Be sure to sketch a normal curve and shade the appropriate area. If you use a TI-83, give the commands required to do the problem.

Examples
P(-1.3 < z < 0)
1. Draw diagram.
2. Shade appropriate area.
3. Use table or calculator to find area.
4. Answer: 0.4032

Examples (continued)
P(-1.25 < z < 0.89)
1. Draw picture.
2. Shade appropriate area.
3. Use table to find two different areas.
4. Find the sum of the two areas.
5. Answer: 0.7076

More examples
P(z > 0.75)
1. Draw diagram.
2. Shade appropriate area.
3. Use table to find P(0 < z < 0.75).
4. Subtract this area from 0.5000.
5. Answer: 0.2266

More examples (continued)
P(-1.13 < z < -0.79)
1. Draw diagram.
2. Shade appropriate area.
3. Use table to find P(0 < z < 1.13).
4. Use table to find P(0 < z < 0.79).
5. Subtract the smaller area from the larger area.
6. Answer: 0.0855

Finding probabilities for nonstandard normal curves
P(a < x < b) is the same as $P\left(\frac{a-\mu}{\sigma} < z < \frac{b-\mu}{\sigma}\right)$.

Example 1
IQ scores are normally distributed with a mean of 100 and a standard deviation of 16. Find the probability that a randomly chosen person has an IQ greater than 120.
Step 1. Draw a normal curve and shade the appropriate area.
State the probability: P(x > 120), where x is IQ.
Step 2. Convert the x score to a standardized z score: z = (120 − 100)/16 = 20/16 = 5/4 = 1.25, so P(x > 120) = P(z > 1.25).
Step 3. Draw the standard normal curve and shade the appropriate area.
Step 4. Use table or TI-83 to find the area. Answer: 0.1056

Areas under the non-standard normal curve
A traffic study at one point on an interstate highway shows that vehicle speeds are normally distributed with a mean of 61.3 mph and a standard deviation of 3.3 mph. If a vehicle is randomly checked, find the probability that its speed is between 55 and 60 mph.
Solution:
1. Draw diagram.
2. Shade appropriate area.
3. Use $z = \frac{x-\mu}{\sigma}$ to standardize.
4. Find $P\left(\frac{55-61.3}{3.3} < z < \frac{60-61.3}{3.3}\right)$.
5. Answer: 0.3187

Non-standard normal curve areas
If IQ scores are normally distributed with a mean of 100 and a standard deviation of 16, find the probability that a randomly chosen person will have an IQ greater than 84. Answer: approximately 0.84

IQ scores example
If IQ scores are normally distributed with a mean of 100 and a standard deviation of 16, find the probability that a person's IQ is between 85 and 95.
1. Draw diagram.
2. Shade appropriate area.
3. Standardize the variable x using $z = \frac{x-\mu}{\sigma}$.
4. Find $P\left(\frac{x_1-\mu}{\sigma} < z < \frac{x_2-\mu}{\sigma}\right)$.
5. Answer: 0.2031

Areas under non-standard normal curves
The lengths of a certain snake are normally distributed with a mean of 73 inches and a standard deviation of 6.5 inches. Let x represent the length of a particular snake. Find P(65 < x < 75). Answer: 0.5116

Mathematical Equation for bell-shaped curves
Carl Friedrich Gauss, a mathematician, was probably the first to realize that certain data had bell-shaped distributions. He determined that the following equation could be used to describe these distributions:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
where $\mu$ and $\sigma$ are the mean and standard deviation of the data.

Using the Normal Curve to approximate binomial probabilities
Example: We have seen that the histogram for a binomial distribution with n = 20 trials and p = 0.50 is shaped like a bell if we join the tops of the rectangles with a smooth curve. If we wanted to find the probability that x (the number of heads) is 12 or more, we would have to use the binomial probability formula and calculate P(x = 12) + P(x = 13) + P(x = 14) + … + P(x = 20). The calculations would be very tedious, to say the least.

Binomial Distribution for n = 20 and p = 0.5
(A coin is tossed 20 times and the probability of x = 0, 1, 2, 3, …, 20 is calculated. Each vertical bar represents one outcome of x.)

Using the Normal curve to approximate binomial probabilities
We could instead treat the binomial distribution as a normal curve, since its shape is close to a bell-shaped curve, and then find the probability that x is 12 or more using the procedure for finding areas under a normal curve.
P(x ≥ 12) ≈ P(x > 11.5) = total area in yellow
Because the normal curve is continuous and the binomial distribution is discrete (x = 0, 1, 2, …, 20), we have to make what is called a correction for continuity. Since we want P(x ≥ 12), we must include the rectangular area corresponding to x = 12. The rectangle representing P(x = 12) extends from 11.5 to 12.5 on the horizontal axis. Therefore, we must find P(x > 11.5).
Solution: The binomial distribution has mean np = 10 and standard deviation $\sqrt{np(1-p)} = \sqrt{5} \approx 2.24$. Using the procedure for finding the area under a non-standard normal curve, we have
$P(x > 11.5) = P\left(z > \frac{11.5-10}{2.24}\right) = P(z > 0.67) \approx 0.25$
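To see how close the approximation is, the tedious binomial sum can be handed to a short program and compared with the continuity-corrected normal area. This is a minimal sketch, not part of the original slides: it assumes Python 3.8 or later (for `math.comb`), uses `math.erf` in place of a printed table, and the function names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(x = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 20, 0.5

# Exact answer: P(x = 12) + P(x = 13) + ... + P(x = 20).
exact = sum(binomial_pmf(k, n, p) for k in range(12, n + 1))

# Normal approximation with the correction for continuity: P(x > 11.5).
mean = n * p                        # 10
sd = math.sqrt(n * p * (1 - p))     # sqrt(5), about 2.24
z = (11.5 - mean) / sd
approx = 1.0 - standard_normal_cdf(z)

print(f"Exact binomial sum:   {exact:.4f}")
print(f"Normal approximation: {approx:.4f}")
```

Both values round to 0.25, which suggests that for n = 20 and p = 0.5 the normal curve with the continuity correction is a good stand-in for the exact binomial calculation.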