Lesson Plan
• Answer Questions
• Areas Under Normal Curves
• Graphical Summary
• Quiz 1

The TA schedule can be found at http://www.stat.duke.edu/secc/schedule/. TAs are available Monday-Thursday and Sunday in the SECC (Old Chem 211B).

Good math website: http://mathworld.wolfram.com/

Lecture notes are uploaded the day before the lecture.

Please download labrep2 from my website (www.stat.duke.edu/ chengg/teaching101) for Wednesday's lab.

Students can check their grades on Blackboard.

3. Areas Under Normal Curves

In the last lecture, we learned how to find areas under a standard normal distribution (mean µ = 0, standard deviation σ = 1). This required use of the table on page A-105.

A region under a normal curve corresponds to a proportion of the population. This is because a normal curve can be viewed as the limit of a series of histograms, in which the sample gets large while the bin size goes to zero.

For example, 50% of the area under the standard normal lies to the left of 0. Thus if this represents temperatures in Nome, then about half the time the temperature is below 0.

We now show how to convert a question about an arbitrary normal distribution into an equivalent question about the standard normal, and vice versa. Thus we can use the table on A-105 to answer questions about all normal distributions, not just the standard normal.

Let X be a random value from a population with mean µ and standard deviation σ. We write this as X ∼ N(µ, σ).

Define a new random variable

    Z = (X − µ)/σ.

Then one can prove that Z ∼ N(0, 1). This is called the z-transformation.

To go the other way, we convert a standard normal value to a value from an arbitrary normal distribution by solving for X. So

    X = µ + Zσ.

3.1 Using the z-Transformation

Reggie Jackson has an IQ of 140. What percentage of people are smarter?

Assume that IQs (X) are normally distributed with mean 100 and standard deviation 16:

    X ∼ N(100, 16).

Then P(X > 140) = ?
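As a numerical check on this question, the z-transformation and its inverse can be sketched in a few lines of Python. This is only an illustration, assuming Python 3.8 or later (for `statistics.NormalDist`); the helper names `z_transform` and `inverse_z_transform` are hypothetical and not part of the lecture, and the software value stands in for the table on A-105.

```python
from statistics import NormalDist

def z_transform(x, mu, sigma):
    """Convert a value x from N(mu, sigma) to a standard normal z-value."""
    return (x - mu) / sigma

def inverse_z_transform(z, mu, sigma):
    """Convert a standard normal z-value back to the N(mu, sigma) scale."""
    return mu + z * sigma

# The standard normal distribution plays the role of the printed table.
standard_normal = NormalDist(0, 1)

# IQ example: X ~ N(100, 16), and we want P(X > 140).
z = z_transform(140, 100, 16)
upper_tail = 1 - standard_normal.cdf(z)   # area to the right of z

print(z)                                  # 2.5
print(round(100 * upper_tail, 2))         # 0.62 (percent)
```

The same two helpers cover both directions: `z_transform` turns a question about N(µ, σ) into one about N(0, 1), and `inverse_z_transform` maps a z-value back to the original scale.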
We want the area under the normal distribution for IQ that lies to the right of 140. By the z-transformation, this is equivalent to the area under the standard normal distribution that lies to the right of

    z = (X − µ)/σ = (140 − 100)/16 = 2.5.

From the table on A-105, the area between ±2.5 is 98.76%. Thus the area above 2.5 is (1/2)(100 − 98.76)% = 0.62%.

Now we go the other way. We find the X value that corresponds to a given percentage.

To join Mensa one must be in the top 2% of the IQ distribution. What score do you need?

The table on A-105 gives the area between −z and +z, so leaving 2% in the upper tail (and, by symmetry, 2% in the lower tail) means looking up 96%. That gives a z-value of approximately 2.05. We know that 2% of the area under the standard normal is above 2.05, and 2% is below −2.05.

Now we use the inverse z-transformation:

    X = µ + Zσ = 100 + (2.05)(16) = 132.8.

One needs a score of at least 132.8 to join.

Always draw a picture.

Assume heights are normally distributed with mean 5.8 inches and standard deviation 0.4 inches. What proportion of people are shorter than 5.5 inches?

Assume heights are normally distributed with mean 5.8 inches and standard deviation 0.4 inches. Then 30% of people are shorter than what?

3.2 The Continuity Correction

A perfect normal distribution describes data that can take any possible value: negatives, fractions, irrationals, etc. But often the data can only take non-negative integer values; e.g., the number of students who come to class on a given day.

It is reasonable to say that the number of students who attend a class on "The History of European Socialism" is approximately normal with mean 30 and standard deviation 5.

We can use the normal table to make statements about the probability that more than 32 students will attend tomorrow's lecture. But because only integers are possible, we can improve the accuracy of the normal table by using the continuity correction.

What is the probability that more than 32 students attend?
The bad way uses the z-transformation z = (32 − 30)/5 = 0.4, and finds the area under the N(0, 1) curve that lies above 0.4.

The good way handles the area between 32 and 33 appropriately. Since attendance is an integer, "more than 32" means 33 or more, so the cutoff is placed halfway between the two, at 32.5. We use the z-transformation z = (32.5 − 30)/5 = 0.5, and find the area under the curve that lies above 0.5.

How can you decide if data are a random sample from a normal distribution?
• Inspect the histogram.
• Make a normal probability plot.

To make a normal probability plot, order the observations from smallest to largest; denote the ordered observations by X(1), X(2), ..., X(n).

For observation X(i), find the z-value such that ((i − 0.5)/n) × 100% of the area under the standard normal curve is to the left. Call this z-value Yi.

Then plot (X(i), Yi) for all i = 1, ..., n. If the plot looks roughly like a straight line, then the data are approximately normal.

Pictures Worth a Thousand Words

A graphical display shows where the sample values are located and where they concentrate.
• Stem and leaf
• Boxplot

Stem and Leaf

Crude death rates of 22 African countries:

16 23 22 13 23 14 23 14 22 16 18
20 22 14 19 12 17 17 17 17 28 18

The decimal point is 1 digit(s) to the right of the |

  2 | 8
  2 | 0222333
  1 | 667777889
  1 | 23444

Boxplots

Use 5 numbers to summarize data:
• the maximum;
• the third quartile;
• the median;
• the first quartile;
• the minimum.
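The five numbers behind a boxplot, and the coordinates of a normal probability plot, can both be computed directly from the death-rate data shown in the stem-and-leaf display. The following is a minimal sketch, assuming Python 3.8 or later. The variable names are illustrative, and the "inclusive" quartile rule is an assumption: the lecture does not say which quartile convention it uses, and other conventions can give slightly different first and third quartiles.

```python
from statistics import NormalDist, quantiles

# Crude death rates of the 22 African countries from the stem-and-leaf slide.
death_rates = [16, 23, 22, 13, 23, 14, 23, 14, 22, 16, 18,
               20, 22, 14, 19, 12, 17, 17, 17, 17, 28, 18]

ordered = sorted(death_rates)   # X(1) <= X(2) <= ... <= X(n)
n = len(ordered)

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
# Assumption: quartiles follow the "inclusive" interpolation rule.
q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
five_numbers = (ordered[0], q1, median, q3, ordered[-1])
print(five_numbers)             # (12, 16.0, 17.5, 22.0, 28)

# Normal probability plot coordinates (X(i), Y_i): Y_i is the z-value with
# a fraction (i - 0.5)/n of the standard normal area to its left.
standard_normal = NormalDist(0, 1)
plot_points = [
    (x, standard_normal.inv_cdf((i - 0.5) / n))
    for i, x in enumerate(ordered, start=1)
]
```

Plotting the pairs in `plot_points` with any plotting tool and checking whether they fall near a straight line is the visual test described for the normal probability plot.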