The Normal Distribution

Up until now, we've dealt with discrete probability distributions. The normal distribution is a continuous probability distribution. It has the shape of a bell curve.

http://mathworld.wolfram.com/NormalDistribution.html

Since the normal curve represents a probability distribution, the area under the normal curve must be equal to 1. The center and spread of the curve vary from one population to another.

[Figure: probability density histogram with continuous normal distribution curve]

The location of the center of the normal curve is based on the mean of the population (called µ, "mu"). The spread of the normal curve is based on the standard deviation of the population (called σ, "sigma").

http://en.wikibooks.org/wiki/A-level_Mathematics/OCR/S2/Normal_Distribution

The area under a normal curve from a to b is the probability that a data value will lie between a and b. Typically, calculus is used to find the area under a curve. In statistics, though, the calculus has been done and stored in a table. All we need is to be able to look up areas in the table.

[Figure: normal curve over a horizontal axis marked 64, 66, 68, 70, 72, 74, 76, with a region shaded in blue]

Example: Say the curve above represents the probability distribution for men's heights in inches. If the area in blue is .301, then the probability that a man is between 70 and 72 inches tall is .301.

Definition: In a normal distribution with mean µ and standard deviation σ, the z-score related to any data point x is

z = (x − µ) / σ

Note: the z-score related to x = µ is 0.

Example: If the weight of paper discarded weekly by a typical household is normally distributed, with a mean of 9.4 pounds and a standard deviation of 4.2 pounds, find the z-score related to 19.354 pounds of paper being discarded weekly.

z = (19.354 − 9.4) / 4.2 = 2.37

The values in the table on page A-1 of your text return the area under the normal curve up to the x value related to each z-score (pink area). This area is the probability that a data value has a z-score less than the z-score related to that data point.
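The z-score calculation and the table lookup can also be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later; the standard library's statistics.NormalDist stands in for the printed table, and the function names z_score and area_below are hypothetical helpers introduced only for this illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Return the z-score of data point x for mean mu and standard deviation sigma."""
    return (x - mu) / sigma

def area_below(z):
    """Area under the standard normal curve up to z (what the table reports)."""
    return NormalDist().cdf(z)

# Paper example: mean 9.4 pounds, standard deviation 4.2 pounds.
# Round to two decimals, as a printed table would require.
z = round(z_score(19.354, 9.4, 4.2), 2)
p = area_below(z)

print(z)            # 2.37
print(round(p, 4))  # area up to z = 2.37
```

Rounding z to two decimal places mirrors how a printed table is indexed; the computed area may differ from a table entry in the last digit.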
For our problem, we can interpret that as the probability that the weight of paper discarded by a household in a week is less than 19.354 pounds.

pink area = P(discarded paper weighs less than 19.354 pounds) = .9911

Remember that the total probability = total area under the curve = 1. What is the probability that a household discards more than 19.354 pounds of paper weekly?

1 − .9911 = .0089

Example: Find the z-score related to a household discarding 4.444 pounds of paper weekly.

z = (4.444 − 9.4) / 4.2 = −1.18

Note that for data points less than the mean, the z-score is always negative.

What is the probability that a household discards less than 4.444 pounds of paper weekly? It is the area up to the z-score of −1.18, which the table gives as .1190.

Example: What is the probability that the amount of paper discarded weekly by a household is between 4.444 and 19.354 pounds?

The probability is the area in pink, between z = −1.18 and z = 2.37. We know that the area up to the z-score of −1.18 is .1190, while the area up to 2.37 is .9911.

Area in pink = area up to 2.37 − area up to −1.18 = .9911 − .1190 = .8721 = P(between 4.444 and 19.354 pounds of paper discarded)

We can use our table of areas to show that for normally distributed data, the probability that a data point is
within 1 standard deviation of the mean is .683,
within 2 standard deviations: .954,
within 3 standard deviations: .997.
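The subtraction of areas used above, and the 1, 2 and 3 standard deviation figures, can be reproduced with a short script. This is a sketch under stated assumptions: it uses Python's statistics.NormalDist (Python 3.8 or later) in place of the printed table, and prob_between is a hypothetical helper name, not something from the text.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X: area up to b minus area up to a."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Paper example: mean 9.4 pounds, standard deviation 4.2 pounds.
paper = prob_between(4.444, 19.354, 9.4, 4.2)
print(round(paper, 4))

# Probability of falling within k standard deviations of the mean,
# computed on the standard normal curve (mean 0, standard deviation 1).
within = [prob_between(-k, k, 0, 1) for k in (1, 2, 3)]
for k, p in zip((1, 2, 3), within):
    print(k, round(p, 3))
```

Because the script does not round z-scores to two decimals first, its results can differ from the table-based answers in the last digit.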