Chapter 5
week 4 Renske Doorenspleet

Normal Curve
  Bell shaped
  Unimodal
  Symmetrical
  Unskewed
  Mode, median, and mean are the same value
  [Figure: normal curve with 68.26%, 95.44%, and 99.72% of the area marked]

Theoretical Normal Curve
  General relationships:
  ±1 s = about 68%
  ±2 s = about 95%
  ±3 s = about 99%

Theoretical Normal Curve
  [Figure: theoretical normal curve, horizontal axis from -5 to +5, with 68.26%, 95.44%, and 99.72% of the area marked]

Using the Normal Curve: Z Scores
  To find areas, first compute Z scores.
  The formula changes a “raw” score (Xi) to a standardized score (Z):
  Z = (Xi - X̄) / s

Using Appendix A to Find Areas Below a Score
  Appendix A can be used to find the areas above and below a score.
  First compute the Z score, taking careful note of the sign of the score.
  Draw a picture of the normal curve and shade in the area in which you are interested.

Using Appendix A
  Appendix A has three columns.
  (a) = Z scores.
  (b) = areas between the score and the mean.
  (c) = areas beyond the Z score.

Using Appendix A
  Find your Z score in column a.
  To find the area below a positive score: add the column b area to .50.
  To find the area above a positive score: look in column c.

  (a)     (b)      (c)
  ...
  1.66    0.4515   0.0485
  1.67    0.4525   0.0475
  1.68    0.4535   0.0465
  ...

Using Appendix A
  The area below Z = 1.67 is 0.4525 + 0.5000, or 0.9525.
  Areas can be expressed as percentages: 0.9525 = 95.25%.

Using Appendix A
  What if the Z score is negative (-1.67)?
  To find the area below a negative score: look in column c.
  To find the area above a negative score: add the column b area to .50.

  (a)     (b)      (c)
  ...
  1.66    0.4515   0.0485
  1.67    0.4525   0.0475
  1.68    0.4535   0.0465
  ...

Using Appendix A
  The area below Z = -1.67 is 0.0475.
  Areas can be expressed as percentages: 0.0475 = 4.75%.
  Areas under the curve can also be expressed as probabilities.
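The Appendix A lookups above can be cross-checked numerically. The following is a minimal sketch, not part of the original slides: it assumes the standard normal curve and uses Python's `math.erf` to compute the cumulative area. The function names `z_score`, `area_below`, and `area_above` are hypothetical helpers introduced only for illustration.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Convert a raw score to a standardized score: Z = (Xi - mean) / s."""
    return (x - mean) / sd


def area_below(z: float) -> float:
    """Proportion of the standard normal curve lying below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_above(z: float) -> float:
    """Proportion of the standard normal curve lying above z."""
    return 1.0 - area_below(z)


# Positive score: column b (0.4525) + 0.50
print(round(area_below(1.67), 4))   # 0.9525
# Negative score: column c for 1.67
print(round(area_below(-1.67), 4))  # 0.0475
```

Because the curve is symmetrical, the area below -1.67 equals the area above +1.67, which is why column c serves for both.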
  Probabilities are proportions and range from 0.00 to 1.00.
  The higher the value, the greater the probability (the more likely the event).

Finding Probabilities
  If a distribution has:
  X̄ = 13
  s = 4
  What is the probability of randomly selecting a score of 19 or more?

Finding Probabilities
  1. Find the Z score.
  2. For Xi = 19, Z = (19 - 13) / 4 = 1.50.
  3. Find the area above in column c.
  4. The probability is 0.0668, or 0.07.

  (a)     (b)      (c)
  ...
  1.49    0.4319   0.0681
  1.50    0.4332   0.0668
  1.51    0.4345   0.0655
  ...

Finding Probabilities (exercise 1)
  The mean (X̄) of the grades of final papers for this class is 65 and the standard deviation is 5.
  What percentage of the students have scores above 70?
  In other words, what is the probability of randomly selecting a score of 70 or more?

Finding Probabilities (exercise 2)
  Stephen Jay Gould (1996). Full House: The Spread of Excellence from Plato to Darwin.
  Doctors: “You have an aggressive type of cancer and half of the patients will die within 8 months.”
  Question: An optimistic person like Gould was neither impressed nor shocked by this message. Why not?

Chapter 6
Introduction to Inferential Statistics: Sampling and the Sampling Distribution
  Problem: The populations we wish to study are almost always so large that we are unable to gather information from every case.

Basic Logic and Terminology
  Solution: We choose a sample, a carefully chosen subset of the population, and use information gathered from the cases in the sample to generalize to the population.

Samples
  Must be representative of the population.
  Representative: the sample has the same characteristics as the population.
  How can we ensure samples are representative?
  Samples in which every case in the population has the same chance of being selected for the sample are likely to be representative.
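The equal-chance selection rule above can be illustrated with a short sketch. This is not from the slides: it assumes the population is available as a complete list of cases, and the helper name `simple_random_sample`, the ID numbering, and the population and sample sizes are all hypothetical choices made for illustration. It relies on `random.Random.sample` from the Python standard library, which draws without replacement so that every case is equally likely to be selected.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement; every case has the same chance."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: one ID number per case in the population.
student_ids = list(range(1, 20001))
sample = simple_random_sample(student_ids, 500, seed=42)

print(len(sample))       # number of cases drawn
print(sorted(sample)[:5])  # a few of the selected IDs
```

In practice the hard part is usually obtaining a complete list of the population, not the draw itself.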

Sampling Techniques
  Simple Random Sampling (SRS)
  Systematic Random Sampling
  Stratified Random Sampling
  Cluster Sampling
  See Healey’s book for more information on the differences between these techniques.

Applying Logic and Terminology
  For example:
  Population = all 20,000 students.
  Sample = the 500 students selected and interviewed.

The Sampling Distribution
  Every application of inferential statistics involves 3 different distributions.
  Information from the sample is linked to the population via the sampling distribution.
  [Diagram: Population, Sampling Distribution, Sample]

First Theorem
  Tells us the shape of the sampling distribution and defines its mean and standard deviation.
  If we begin with a trait that is normally distributed across a population (IQ, height) and take an infinite number of equally sized random samples from that population, the sampling distribution of sample means will be normal.

Central Limit Theorem
  For any trait or variable, even those that are not normally distributed in the population, as sample size grows larger, the sampling distribution of sample means will become normal in shape.

The Sampling Distribution: Properties
  1. Normal in shape.
  2. Has a mean equal to the population mean.
  3. Has a standard deviation (standard error) equal to the population standard deviation divided by the square root of N.
  The sampling distribution is normal, so we can use Appendix A to find areas.
  See Table 6.1, p. 160 of Healey’s book for specific important symbols.
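
The second and third properties above can be checked with a small simulation. This is a rough sketch under stated assumptions, not part of the original slides: the population is assumed to be exponential with mean 1 and standard deviation 1 (clearly not normal), a finite number of samples stands in for the "infinite number" in the theorem, and the name `simulate_sample_means` is a hypothetical helper.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Assumed population: exponential, mean = 1, standard deviation = 1.
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


N = 50
means = simulate_sample_means(N, 5000)

mean_of_means = statistics.fmean(means)       # should be close to 1
standard_error = statistics.pstdev(means)     # should be close to 1 / sqrt(N)
expected_standard_error = 1.0 / math.sqrt(N)  # population sd / sqrt(N)

print(mean_of_means, standard_error, expected_standard_error)
```

Plotting a histogram of `means` for increasing N is one way to see the shape approach the normal curve even though the assumed population is strongly skewed.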