Chapter 2
The Normal Distribution

Density Curves
A density curve is a model we will use for probability problems. Two characteristics:
– The curve never dips below the x-axis, so all area under it is positive
– Total area under the curve = 1
It is because Area = 1 that we can use a density curve to model probability, where P(Sample Space) = 1.

Think of a density curve as a relative frequency histogram with tiny class width: so tiny that the width of each class approaches zero. When you connect the tops of all the bars, a smooth, continuous curve forms.

The Normal Distribution
The most famous density curve models the Normal Distribution. The normal distribution describes many commonly found phenomena: test scores, physical properties of various creatures (humans included), measurements in scientific experiments, and many more. The Normal Distribution will be your constant friend until the end of the course!

Identify a normal distribution using two things:
– Shape: Symmetric, Unimodal, Bell-shaped
– The Empirical Rule:
    About 68% of all data fall within 1 standard deviation of the mean
    About 95% are within 2 standard deviations of the mean
    About 99.7% are within 3 SDs of the mean
There are many, many normal distributions out there, but they all follow the empirical rule.
From your textbook, Yates, Moore, Starnes, p. 87

The points of inflection (where the curve changes from bending downward to bending upward) correspond to the points one standard deviation away from the mean on either side.

A Normal Distribution is completely described by its mean and standard deviation. For instance, the heights of men are N(69", 2.5"). This means "normally distributed with a mean of 69 inches and a standard deviation of 2.5 inches." Those three facts (the distribution is Normal, its mean, its SD) tell you everything you need to know to solve problems using the normal distribution.

Solving Problems with the Normal Distribution
#7, page 89
ALWAYS draw the curve.
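The Empirical Rule can be checked numerically. The sketch below is only an illustration, not part of the textbook's method: it uses Python's standard-library `statistics.NormalDist` with the men's heights example N(69, 2.5), and the helper name `proportion_within` is made up for this example.

```python
from statistics import NormalDist

# Men's heights from the slide: N(69, 2.5), in inches.
heights = NormalDist(mu=69, sigma=2.5)


def proportion_within(dist, k):
    """Area under the curve within k standard deviations of the mean.

    Hypothetical helper for illustration only.
    """
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    # cdf(x) is the area to the left of x, so subtracting gives the middle.
    return dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    share = proportion_within(heights, k)
    print(f"within {k} SD ({low} to {high} inches): {share:.4f}")
```

The three proportions come out close to 68%, 95%, and 99.7%. Changing the mean or SD changes the height ranges but not the proportions, which is why every normal distribution follows the same rule.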
Standardizing: Speaking a common language
How would you compare a score of 600 on an SAT with a score of 27 on an ACT?
The secret: Express both scores as "the number of standard deviations above or below the mean."

Standardized Scores
AKA: "Z-scores"

    z = (x - μ) / σ

In other words: What is the distance from the score to the mean, compared to the size of the standard deviation? This gives an answer to the question, "How many standard deviations from the middle is this score?"
Scores expressed as "standard deviations from the mean" are called Standardized Scores, or Z-scores.
Z-scores can be positive or negative!

Standardizing: Speaking a common language
How would you compare a score of 650 on an SAT with a score of 25 on an ACT?
The mean SAT score is 500, while the mean ACT score is 18. The standard deviation for the SAT is 100. The standard deviation for the ACT is 6.
Express both as Z-scores, then compare:

    z(SAT) = (650 - 500) / 100 = 150 / 100 = 1.5
    z(ACT) = (25 - 18) / 6 = 7 / 6 ≈ 1.17

SAT guy wins.

The STANDARD Normal Distribution
If you were to create a distribution of z-scores, the result is N(0, 1): it, too, is a Normal Distribution, whose mean is 0 and whose SD is 1. Or if you prefer: μ = 0 and σ = 1.

Normal Distribution Calculations
The Empirical Rule: All fine and good until someone's Z-score isn't an integer!
Easy: What percentile is a score of 600 for an SAT if μ = 500 and σ = 100?
It's the 84th percentile, because a score 1 SD above the mean is better than the 68% within 1 SD of the mean plus the 16% in the lower tail, or 84%.
Hard: What percentile is a score of 550?
– Uhhh… z = .5. Now what?

First, The Three P's
Percent (or percentile)
Proportion
Probability
– The problems may differ, but the procedures are the same.
– It all comes down to figuring out how much area is involved.

Table A: Gives you the area to the left of your z-score.
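The standardizing step and the "area to the left" idea can be sketched in a few lines of Python. This is a rough illustration rather than the course's required method (the course uses Table A). The function names `z_score` and `area_left` are invented here, and `area_left` stands in for the table by using the standard normal CDF written with `math.erf`.

```python
import math


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def area_left(z):
    """Area under the standard normal curve N(0, 1) to the left of z.

    Software stand-in for a Table A lookup (an assumption of this sketch).
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# SAT vs. ACT comparison from the slides.
sat_z = z_score(650, mean=500, sd=100)
act_z = z_score(25, mean=18, sd=6)
print("SAT z:", round(sat_z, 2))
print("ACT z:", round(act_z, 2))
print("higher relative score:", "SAT" if sat_z > act_z else "ACT")

# The "easy" and "hard" percentile questions: SAT scores of 600 and 550.
for score in (600, 550):
    z = z_score(score, mean=500, sd=100)
    print(score, "-> z =", z, "-> area to the left =", round(area_left(z), 4))
```

For the score of 600 (z = 1) the area to the left is about 0.84, which agrees with the 84th percentile found from the Empirical Rule.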
In this case (z = .5 for the score of 550), that is the area to the left of .5. Table A gives .6915, so a 550 is about the 69th percentile.

Using Table A
Along the edge, "assemble" your z-score: whole units and the tenths place along the side, intersected with the hundredths place along the top. Where the two intersect, that's the area to the left.

Example (using a positive z-score)
Say a person scores a 695 on a math SAT, where scores are ~N(500, 100). What is that person's percentile?
Calculate the z-score: z = (695 - 500) / 100 = 1.95

Normal Calculation Toolbox
(Use it without fail: get all the points.)
1. Draw and label the curve, shading the area of interest. (Don't forget "N(500, 100)," e.g.)
2. Create a probability statement, leaving blanks for the unknowns.
3. Show all calculations (z-score calculations, etc.) and complete the probability statement.
4. Answer the question in context using words.

SAT Example
1. Draw the curve, labeled ~N(500, 100), with 500 and 695 marked and the area to the left of 695 shaded.
2. P(x <= 695) = ________
3. z = (695 - 500) / 100 = 195 / 100 = 1.95
   From Table A, the area to the left of z = 1.95 is .9744, so P(x <= 695) = .9744
4. A score of 695 on the SAT is approximately the 97.4th percentile.

Assessing Normality
"Are these data normally distributed?"
1. Is the shape of the histogram unimodal, bell-shaped, approximately symmetric?
2. Are the mean & median close? (This suggests symmetry.)
3. Does the Empirical Rule apply? In other words, are about 68% of the numbers within 1 SD of the mean? Are about 95% within 2? Are just about all of them within 3 SDs of the mean?
4. Is an NPP of the data approximately straight? If yes, the data are approximately normal.
Be somewhat forgiving in your judgment. On the test, as long as you check all of these things (climb the mountain!) your judgment is valid. The phrase "approximately normal" is open to interpretation, and data are never perfect, so feel free to have an opinion.

NPP: Normal Probability Plot
Read the book (end of 2.2)
To create an NPP with the calculator:
– Put data into a list
– Go to Stat Plot. Choose the 6th option (after the boxplots)
– Let the data axis remain "x"
– Do the usual thing: Zoom…ZoomStat.
– If what you see is pretty much straight, the distribution is approximately normal.
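For readers without the calculator, the idea behind an NPP can be sketched in Python: sort the data, pair each value with the z-score a normal distribution would expect at that rank, and see whether the pairs fall near a line. This is a minimal sketch under stated assumptions. The plotting positions (i - 0.5)/n are one common convention and may not match what the calculator or the textbook uses, every function name is hypothetical, and the sample data are invented for illustration.

```python
import math
from statistics import NormalDist


def normal_quantiles(n):
    """Expected z-scores for n ordered observations.

    Assumption: plotting positions (i - 0.5) / n; other conventions exist.
    """
    std_normal = NormalDist()  # N(0, 1)
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def npp_points(data):
    """(data value, expected z-score) pairs: the points an NPP would plot."""
    ordered = sorted(data)
    return list(zip(ordered, normal_quantiles(len(ordered))))


def npp_straightness(data):
    """Correlation of the NPP points; values near 1 mean 'pretty much straight'."""
    points = npp_points(data)
    values = [value for value, _ in points]
    z_scores = [z for _, z in points]
    return correlation(values, z_scores)


# Hypothetical sample, invented purely for illustration.
sample = [61, 64, 66, 67, 68, 69, 70, 71, 73, 76]
for value, z in npp_points(sample):
    print(f"{value:5.1f}  {z:6.2f}")
print("straightness (correlation):", round(npp_straightness(sample), 3))
```

The correlation is only a rough numerical stand-in for eyeballing the plot. The judgment "approximately straight" still belongs to you, along with the other three checks.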