The Normal Distribution

Normal Distributions

• Many data sets display similar characteristics
• The normal distribution is a way of describing a certain kind of "ideal" data set
• Although no real-world data is perfect, a surprising number of natural phenomena are approximately "normal"

What is a Normal Distribution?

• Symmetrical: no skew; mean, median, and mode all equal
• Mound / bell shaped: peaks in the middle, slopes down towards the sides

[Figure: histogram of the data with the curve Density = normalDensity(x, 30, sd) drawn over it; horizontal axis s from 10 to 55, vertical axis Density from 0.02 to 0.10]

Histograms Revisited

The bars in the histogram tell us how much of the data falls into each interval.

[Figure: the same histogram and curve, with one region labelled "lots of data here" and another labelled "little data here"]

Why is Normal good?

• The normal distribution is so well behaved that we can draw a curve that almost matches it
• This makes it very easy to measure how tall the histogram bars are
• The height of the bars is given by the curve that matches them
• This allows us to find almost exactly how much data is in each part of the distribution

Where are we on the curve?

[Figure: normal curve with the horizontal axis marked at x̄ - 2σ, x̄ - σ, x̄, x̄ + σ, x̄ + 2σ]

• x̄: the mean
• x̄ - σ: one standard deviation below
• x̄ + σ: one standard deviation above
• x̄ - 2σ: 2 standard deviations below
• x̄ + 2σ: 2 standard deviations above

Area under the curve

[Figure: normal curve divided into bands at x̄ - 3σ, x̄ - 2σ, x̄ - σ, x̄, x̄ + σ, x̄ + 2σ, x̄ + 3σ]

• x̄ - 3σ to x̄ - 2σ: 2.35%
• x̄ - 2σ to x̄ - σ: 13.5%
• x̄ - σ to x̄: 34%
• x̄ to x̄ + σ: 34%
• x̄ + σ to x̄ + 2σ: 13.5%
• x̄ + 2σ to x̄ + 3σ: 2.35%
• x̄ - σ to x̄ + σ: 68% in total
• x̄ - 2σ to x̄ + 2σ: 95% in total

More Properties

• Approximately 68% of the data is within one standard deviation of the mean
• Approximately 95% of the data is within two standard deviations of the mean
• Approximately 99.7% of the data is within three standard deviations of the mean

Notation

X ~ N(x̄, σ²)

If we want to say "this data is approximated by the normal distribution"...
We should also state what the mean and standard deviation are.

X ~ N(x̄, σ²)

• X: our data (call it X)
• ~: "is approximated by"
• N: the normal distribution
• x̄: with this mean
• σ²: and this standard deviation or variance

Notation example

• The data is normal, and has a mean of 3 and a standard deviation of 2: X ~ N(3, 2²)
• The data is normal, has a mean of 5.4, and a standard deviation of 3: X ~ N(5.4, 9)

Be careful - if there is no square, then the second number is the variance, and you need to take the square root to get the standard deviation.

Problem Example

Julie is an engineer who is designing roller coasters. Her roller coaster must have mass restrictions that are suitable for 95% of the population. The average adult in North America has a mass of 71.8 kg with a standard deviation of 13.6 kg. What range of mass should her ride accommodate?

1. Assume that the masses are normally distributed.
2. 95% of the data will fall within two standard deviations of the mean.

Consequently, the range will be between 71.8 - 2(13.6) = 44.6 kg and 71.8 + 2(13.6) = 99.0 kg.

Example 2

Out of 150 packages of crackers, 97 packages contain between 80 and 100 crackers. Assume a normal distribution. Estimate:

A) the average number of crackers
B) the standard deviation of the sample
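
The arithmetic in the two examples can be checked with a short Python sketch. It is only an illustration, not part of the original slides: the function names are hypothetical, and the identity used in fraction_within (the share of a normal distribution within k standard deviations of the mean equals erf(k / √2)) is a standard result that the slides do not state. The reading of Example 2 is also an assumption: 97 out of 150 is about 65%, which is treated here as close enough to 68% that 80 and 100 sit roughly one standard deviation either side of the mean.

```python
import math
from typing import Tuple


def fraction_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


def interval_within(mean: float, sd: float, k: float) -> Tuple[float, float]:
    """Interval from k standard deviations below the mean to k above it."""
    return mean - k * sd, mean + k * sd


def estimate_from_one_sd_interval(low: float, high: float) -> Tuple[float, float]:
    """Assuming [low, high] is mean +/- one standard deviation, recover both."""
    mean = (low + high) / 2
    sd = (high - low) / 2
    return mean, sd


# Check the "68% / 95% / 99.7%" properties.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")

# Roller coaster example: mean 71.8 kg, standard deviation 13.6 kg, 95% coverage.
low, high = interval_within(71.8, 13.6, 2)
print(f"mass range: {low:.1f} kg to {high:.1f} kg")

# Example 2 (assumption: 97/150 is treated as roughly 68% of the packages).
print(f"share of packages in range: {97 / 150:.3f}")
mean, sd = estimate_from_one_sd_interval(80, 100)
print(f"estimated mean: {mean}, estimated standard deviation: {sd}")
```

Under that assumption, Example 2 works out to an average of about 90 crackers with a standard deviation of about 10. The roller coaster range matches the hand calculation of 44.6 kg to 99.0 kg.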