Instructor Notes – Chapter 4

Chapter 4: Variability

Chapter Outline

4.1 Overview (The Purpose for Measuring Variability)
4.2 The Range
4.3 Standard Deviation and Variance for a Population
    Formulas for Population Variance and Standard Deviation
    Final Formulas and Notation
4.4 Standard Deviation and Variance for Samples
    Sample Variability and Degrees of Freedom
4.5 More about Variance and Standard Deviation
    Presenting the Mean and Standard Deviation in a Frequency Distribution Graph
    Sample Variance as an Unbiased Statistic
    Standard Deviation and Descriptive Statistics
    Transformations of Scale (Adding a Constant or Multiplying by a Constant)
    In the Literature: Reporting the Standard Deviation
    Variance and Inferential Statistics

Learning Objectives and Chapter Summary

1. Students should understand the general purpose for measuring variability, and they should be able to recognize the difference between scores with high variability and scores with low variability. Variability recognizes that not all of the scores are the same; unless there is zero variability, there are differences, and a measure of variability describes how big those differences are. Note that variability is a measure of distance.

2. Students should be able to define and calculate the range, but they should also realize that it is a relatively crude measure of variability. The range is defined as the distance from the largest score to the smallest score and thus is completely determined by only two scores. Technically, the range is measured as the distance from the lower real limit of the lowest score to the upper real limit of the highest score, but it is often calculated as simply the highest score minus the lowest score. Note that the range does not measure variability for the entire set of scores, but rather is determined entirely by the two extreme scores.

3. Students should understand the concept of standard deviation as a measure of the standard distance from the mean.
Some scores are close to the mean and other scores are far away. The standard deviation provides a measure of the standard or “average” distance from the mean. In a frequency distribution graph showing either a sample or a population, students should be able to estimate the location of the mean and estimate the size of the standard deviation. If students are given values for the mean and standard deviation, they should be able to sketch a frequency distribution corresponding to the given values. Although the concept of variance is not as concrete as the concept of standard deviation, students should be able to provide a verbal definition for variance: standard deviation is a measure of distance from the mean, so variance is a measure of squared distance.

4. Students should be able to calculate SS (sum of squared deviations), variance, and standard deviation for a sample and for a population. In addition, they should understand the concept of an unbiased statistic and the correction for bias that is used in the formula for sample variance. The three values (SS, variance, and standard deviation) are always calculated in order: first find the sum of the squared deviations, then find the average of the squared deviations (the variance), and finally take the square root to find the standard deviation. If students can provide verbal definitions for these three values, then most of the formulas are self-evident. (The notable exception is the computational formula for SS.) Note that the correction for bias (dividing by n – 1) is made only when calculating sample variance. SS is computed in exactly the same way for samples and populations, and standard deviation is always the square root of variance for both samples and populations.

Other Lecture Suggestions

1. The same demonstration that was used in Chapter 3 to show how adding a constant (or multiplying by a constant) affects the mean can be used again to show how these actions affect the standard deviation. Sketch a simple histogram and label the values along the X axis using 1, 2, 3, 4, and 5. Locate the mean using a vertical line at M = 3 and identify the standard deviation using arrows that point from the mean up to X = 4 and from the mean down to X = 2 (showing a standard deviation of one point). Now ask the students what will happen to the distribution if you add 10 points to every score. Answer: The whole distribution moves 10 points to the right. (You can keep the same sketch and simply re-label the values on the X axis as 11, 12, 13, 14, and 15.) Note that the mean (middle) has shifted 10 points to the right but the standard deviation arrows still extend exactly one point. Adding a constant does not change the standard deviation. Go back to the original distribution and ask what would happen if every score were multiplied by 10. This time the 1s become 10s, the 2s become 20s, and so on. Again, you can keep the same sketch and simply re-label the values on the X axis as 10, 20, 30, 40, and 50. After multiplying, the mean (that was M = 3) is now located at M = 30 (10 times bigger), and the arrows that used to extend one point from the mean now extend 10 points. Multiplying by a constant multiplies both the mean and the standard deviation by that constant.

2. An alternative demonstration of the effects of adding or multiplying by a constant is simply to note that standard deviation is a measure of distance. To determine how the standard deviation is affected, you can pick any two scores and determine how the distance between them is affected. For example, you have a quiz score of 7 and your friend has a score of 5. What happens to the distance between these scores if (a) five points are added to every score in the distribution? (b) every score is multiplied by 3? (In case (a) the scores become 12 and 10 and are still 2 points apart; in case (b) they become 21 and 15 and are now 6 points apart.)

3. To test students’ understanding of standard deviation, give them parameters for a distribution of scores (for example, μ = 70 and σ = 15) and ask if a specific score such as X = 80 is an extreme score, out in the tail of the distribution. To answer the question, sketch a distribution (a pile of scores) with the appropriate parameters and locate the approximate position of the specific score. Note: If you use scores that are within one standard deviation of the mean (not extreme) or more than two or three standard deviations from the mean (extreme), you probably will not need to define “extreme.”

4. Example 4.7 and Table 4.1 can be repeated in class to demonstrate several concepts and calculations concerning variability. First, you can calculate variance for the original population, showing the population formulas. Second, you can compute SS and variance for one or two of the samples to demonstrate the sample formulas. Finally, you can use the set of nine sample variances to demonstrate the concept of an unbiased statistic. Note that none of the samples has a variance exactly equal to the population variance; however, the average of the sample variances is exactly equal to the population variance.

5. The concept of biased samples and representative samples can be used as an analogy to help explain the concept of biased and unbiased statistics. If you were interested in studying elementary school children, for example, you probably would not recruit your sample from a science and computers camp for child geniuses. These kids would be a biased sample that is not representative of the general population. Similarly, calculating sample variance without making some correction produces a biased value that is not representative of the general population. In each case, the goal is to obtain a sample (or a statistic) that is representative of the population. (Note: Be sure to differentiate between the concepts of a biased sample and a biased statistic.
Even if you have a good representative sample, the sample variance will be a biased statistic unless you make a correction in the formula.)

6. The following values produce whole-number answers for classroom examples.

Population Variance and Standard Deviation
    Definitional formula:     X = 7, 1, 7, 9       SS = 36   σ² = 9   σ = 3
    Computational formula:    X = 1, 6, 1, 1, 1    SS = 20   σ² = 4   σ = 2

Sample Variance and Standard Deviation
    Definitional formula:     X = 5, 1, 5, 5       SS = 12   s² = 4   s = 2
    Computational formula:    X = 1, 7, 1, 1       SS = 27   s² = 9   s = 3

Exam Items for Chapter 4

Multiple-Choice Questions

1. (www) In a population of N = 10 scores, the smallest score is X = 8 and the largest score is X = 20. Using the concept of real limits, what is the range for this population?
   a. 11   b. 12   c. 13   d. cannot be determined without more information
2. A sample consists of n = 16 scores. How many of the scores are used to calculate the range?
   a. 2   b. 4   c. 8   d. all 16
3. For a sample of n = 16 scores, how many scores are used to calculate the sample variance?
   a. 2   b. 8   c. 15   d. all 16
4. What is the value of SS (sum of squared deviations) for the following sample? Sample: 2, 3, 4, 7
   a. 14/3 = 4.67   b. 14   c. 72   d. 78
5. (www) What is the value of SS (sum of squared deviations) for the following population? Population: 1, 1, 1, 5
   a. 3   b. 7   c. 12   d. 28
6. What is the value of SS (sum of squared deviations) for the following sample? Sample: 1, 1, 1, 3
   a. 0   b. 1   c. 3   d. 12
7. What is the value of SS (sum of squared deviations) for the following population? Population: 2, 3, 0, 5
   a. 13   b. 38   c. 13/4 = 3.25   d. 38/4 = 9.50
8. What is the value of SS for the following set of scores? Scores: 8, 3, 1
   a. 26   b. 29   c. 74   d. 144
9. A population of N = 5 scores has ΣX = 20 and ΣX² = 100. For this population, what is the value of SS?
   a. 20   b. 80   c. 100   d. 380
10. (www) A population has SS = 100 and σ² = 4. How many scores are in the population?
   a. 25   b. 26   c. 200   d. 400
11. A population has SS = 100 and σ² = 4. What is the value of Σ(X – μ) for the population?
   a. 0   b. 25   c. 100   d. 400
12. What is the value of SS for the following set of scores? Scores: 1, 1, 4, 0
   a. 18   b. 10   c. 9   d. 6
13. A sample of n = 4 scores has ΣX = 8 and ΣX² = 40. What is the value of SS for this sample?
   a. 6   b. 8   c. 24   d. 40
14. A sample of n = 5 scores has ΣX = 20 and ΣX² = 120. For this sample, what is the value of SS?
   a. 20   b. 40   c. 100   d. 120
15. A population of N = 6 scores has ΣX = 12 and ΣX² = 54. What is the value of SS for this population?
   a. 5   b. 9   c. 30   d. 54
16. Which of the following symbols identifies the sample variance?
   a. s   b. s²   c. σ   d. σ²
17. Which of the following symbols identifies the population standard deviation?
   a. s   b. s²   c. σ   d. σ²
18. A population of N = 100 scores has μ = 30 and σ = 4. What is the population variance?
   a. 2   b. 4   c. 8   d. 16
19. A sample of n = 25 scores has M = 20 and s² = 9. What is the sample standard deviation?
   a. 3   b. 4.5   c. 9   d. 81
20. A set of 10 scores has SS = 90. If the scores are a sample, the sample variance is ____ and if the scores are a population, the population variance is ____.
   a. s² = 9, σ² = 9
   b. s² = 9, σ² = 10
   c. s² = 10, σ² = 9
   d. s² = 10, σ² = 10
21. (www) The sum of the squared deviation scores is SS = 20 for a population of N = 5 scores. What is the variance for this population?
   a. 4   b. 5   c. 80   d. 100
22. (www) The sum of the squared deviation scores is SS = 20 for a sample of n = 5 scores. What is the variance for this sample?
   a. 4   b. 5   c. 80   d. 100
23. (www) A population has μ = 50 and σ = 5. If 10 points are added to every score in the population, then what are the new values for the mean and standard deviation?
   a. μ = 50 and σ = 5
   b. μ = 50 and σ = 15
   c. μ = 60 and σ = 5
   d. μ = 60 and σ = 15
24. A population of scores has μ = 50 and σ = 5.
If every score in the population is multiplied by 3, then what are the new values for the mean and standard deviation?
   a. μ = 50 and σ = 5
   b. μ = 50 and σ = 15
   c. μ = 150 and σ = 5
   d. μ = 150 and σ = 15
25. A sample of n = 8 scores has SS = 50. If these same scores were a population, then the SS value for the population would be _____.
   a. 50
   b. greater than 50
   c. less than 50
   d. impossible to determine without additional information
26. A sample of n = 9 scores has a variance of s² = 18. If the scores were a population, what value would be obtained for the population variance?
   a. σ² = 14   b. σ² = 16   c. σ² = 24   d. σ² = 144
27. (www) What are the values for SS and variance for the following sample of n = 3 scores? Sample: 1, 4, 7
   a. SS = 18 and variance = 6
   b. SS = 18 and variance = 9
   c. SS = 66 and variance = 22
   d. SS = 66 and variance = 33
28. (www) What are the values for SS and variance for the following sample of n = 4 scores? Sample: 1, 1, 0, 4
   a. SS = 9 and variance = 3
   b. SS = 9 and variance = 2.25
   c. SS = 18 and variance = 6
   d. SS = 18 and variance = 9
29. What is the value of SS for the following set of scores? Scores: 0, 1, 4, 5
   a. 17
   b. 18
   c. 42
   d. Cannot answer without knowing whether it is a sample or a population.
30. What is the variance for the following population of scores? Scores: 5, 2, 5, 4
   a. 6   b. 2   c. 1.5   d. 1.22
31. Which of the following is true for most distributions?
   a. Around 30% of the scores will be located within one standard deviation of the mean.
   b. Around 50% of the scores will be located within one standard deviation of the mean.
   c. Around 70% of the scores will be located within one standard deviation of the mean.
   d. Around 90% of the scores will be located within one standard deviation of the mean.
32. Which set of scores has the smallest standard deviation?
   a. 11, 17, 31, 53
   b. 5, 11, 42, 22
   c. 145, 143, 145, 147
   d. 27, 105, 10, 80
33. For a particular sample, the largest distance (deviation) between a score and the mean is 11 points. The smallest distance between a score and the mean is 4 points. Therefore, the standard deviation _____.
   a. will be less than 4
   b. will be between 4 and 11
   c. will be greater than 11
   d. It is impossible to say anything about the standard deviation.
34. The smallest score in a population is X = 5 and the largest score is X = 10. Based on this information, you can conclude that ______.
   a. the population mean is somewhere between 5 and 10.
   b. the population standard deviation is smaller than 6.
   c. the population mean is between 5 and 10, and the standard deviation is less than 6.
   d. None of the other choices is correct.
35. If sample variance is computed by dividing SS by n, then the average value of the sample variances from all the possible random samples will be _______ the population variance.
   a. smaller than   b. larger than   c. exactly equal to   d. unrelated to
36. If sample variance is computed by dividing SS by df = n – 1, then the average value of the sample variances from all the possible random samples will be _______ the population variance.
   a. smaller than   b. larger than   c. exactly equal to   d. unrelated to
37. For a population with μ = 60, which of the following values for the population standard deviation would cause X = 68 to have the most extreme position in the distribution?
   a. σ = 1   b. σ = 2   c. σ = 3   d. σ = 4
38. There is a 6-point difference between two sample means. If the two samples have the same variance, then which of the following values for the variance would make the mean difference easiest to see in a graph showing the two distributions?
   a. s² = 2   b. s² = 8   c. s² = 16   d. s² = 64
39. (www) On an exam with a mean of μ = 70, you have a score of X = 75. Which of the following values for the standard deviation would give you the highest position within the class?
   a. σ = 1   b. σ = 5   c. σ = 10   d. cannot determine from the information given
40. (www) On an exam with a mean of μ = 70, you have a score of X = 65. Which of the following values for the standard deviation would give you the highest position within the class?
   a. σ = 1   b. σ = 5   c. σ = 10   d. cannot determine from the information given

True/False Questions

41. The range and the standard deviation are both measures of distance.
42. Using the concept of real limits, the range is 8 points for a set of scores that range from a high of X = 16 to a low of X = 8.
43. The range is usually considered to be a relatively crude measure of variability.
44. For a population of scores, the sum of the deviation scores is equal to N.
45. A population of N = 5 scores has SS = 20 and σ² = 4. If the 5 scores were a sample, the value of SS would still be 20 but the variance would be s² = 5.
46. The value for SS is always greater than or equal to zero.
47. A positive deviation always indicates a score that is less than the mean.
48. After a researcher adds 5 points to every score in a sample, the standard deviation is found to be s = 10. The original sample had a standard deviation of s = 5.
49. After a researcher multiplies every score in a sample by 2, the standard deviation is found to be s = 10. The original sample had a standard deviation of s = 5.
50. Multiplying every score in a sample by 3 will not change the value of the standard deviation.
51. If the population variance is 5, then the population standard deviation is σ = 25.
52. A sample with a variance of 25 has a standard deviation equal to 5 points.
53. If the scores in a population range from a low of X = 5 to a high of X = 14, then the population standard deviation must be less than 15 points.
54. A sample of n = 25 scores is selected from a population with a variance of σ² = 16. The sample variance probably will be smaller than 16.
55. For a population, a deviation score is computed as X – μ.
56. If the population variance is 4, then the standard deviation will be σ = 16.
57. A sample of n = 7 scores has SS = 42. The variance for this sample is s² = 6.
58. For a sample of n = 6 scores with ΣX = 30 and ΣX² = 200, SS = 20.
59. For a population of N = 4 scores with ΣX = 10 and ΣX² = 30, SS = 5.
60. A population with SS = 90 and a variance of 9 has N = 10 scores.
61. A sample with SS = 40 and a variance of 8 has n = 5 scores.
62. To calculate the variance for a sample, SS is divided by df = n – 1.
63. To calculate the variance for a population, SS is divided by N.
64. In a population with a mean of μ = 40 and a standard deviation of σ = 8, a score of X = 46 would be an extreme value, far out in the tail of the distribution.
65. If you have a score of X = 66 on an exam with μ = 70, you should expect a better grade if σ = 10 than if σ = 5.
66. If you have a score of X = 76 on an exam with μ = 70, you should expect a better grade if σ = 10 than if σ = 5.
67. For a sample with M = 20 and s = 1, a score of X = 17 would be considered an extremely low score.
68. For a sample with M = 40 and s = 4, about 95% of the individuals will have scores between X = 32 and X = 48.
69. For a population with μ = 70 and σ = 5, about 95% of the individuals will have scores between X = 65 and X = 75.
70. It is easier to see the mean difference between two samples if the sample variances are small.

Other Exam Items

71. For each of the following samples:
       Sample #1: 1, 0, 3, 6
       Sample #2: 1, 3, 5, 4, 7
   a. Compute the mean.
   b. Determine whether it would be better to use the computational or the definitional formula for SS.
   c. Compute SS.
72. Using the definitional formula, compute SS, variance, and the standard deviation for the following sample of scores. Scores: 3, 6, 1, 6, 5, 3
73. (www) For the following sample, use the computational formula to calculate SS.
Then, compute the sample variance and standard deviation. Scores: 1, 3, 1, 1
74. Calculate the variance and the standard deviation for the following sample data. Scores: 10, 7, 9, 1, 2, 0, 6
75. Without some correction, sample variability is said to be "biased." Define the term biased, and explain how this bias is corrected in the formula for sample variance.

Answers for Multiple-Choice Questions (with section and page numbers from the text)

 1. c, 4.2, p. 106    11. a, 4.3, p. 108    21. a, 4.3, p. 113    31. c, 4.5, p. 121
 2. a, 4.2, p. 106    12. c, 4.3, p. 111    22. b, 4.4, p. 115    32. c, 4.5, p. 121
 3. d, 4.4, p. 115    13. c, 4.4, p. 115    23. c, 4.5, p. 122    33. b, 4.4, p. 116
 4. b, 4.4, p. 115    14. b, 4.4, p. 115    24. d, 4.5, p. 123    34. c, 4.5, p. 121
 5. c, 4.3, p. 111    15. c, 4.3, p. 112    25. a, 4.4, p. 115    35. a, 4.5, p. 120
 6. c, 4.4, p. 115    16. b, 4.4, p. 115    26. b, 4.4, p. 115    36. c, 4.5, p. 120
 7. a, 4.3, p. 111    17. c, 4.3, p. 113    27. b, 4.4, p. 115    37. a, 4.5, p. 122
 8. a, 4.3, p. 111    18. d, 4.3, p. 108    28. a, 4.4, p. 115    38. a, 4.5, p. 125
 9. a, 4.3, p. 112    19. a, 4.4, p. 115    29. a, 4.4, p. 115    39. a, 4.5, p. 122
10. a, 4.3, p. 113    20. c, 4.4, p. 115    30. c, 4.3, p. 113    40. c, 4.5, p. 122

Answers for True/False Questions (with section and page numbers from the text)

41. T, 4.1, p. 105    51. F, 4.3, p. 108    61. F, 4.4, p. 115
42. F, 4.2, p. 106    52. T, 4.4, p. 115    62. T, 4.4, p. 117
43. T, 4.2, p. 106    53. T, 4.3, p. 110    63. T, 4.3, p. 113
44. F, 4.3, p. 108    54. F, 4.5, p. 120    64. F, 4.5, p. 122
45. T, 4.4, p. 115    55. T, 4.3, p. 107    65. T, 4.5, p. 122
46. T, 4.3, p. 111    56. F, 4.3, p. 108    66. F, 4.5, p. 122
47. F, 4.3, p. 107    57. F, 4.4, p. 115    67. T, 4.5, p. 122
48. F, 4.5, p. 122    58. F, 4.4, p. 115    68. T, 4.5, p. 121
49. T, 4.5, p. 123    59. T, 4.3, p. 112    69. F, 4.5, p. 121
50. F, 4.5, p. 123    60. T, 4.3, p. 113    70. T, 4.5, p. 125

Answers for Other Exam Items

71. a. For sample #1, M = 2.5. For sample #2, M = 4.
    b. The computational formula is better for sample #1, because its mean is not a whole number, and the definitional formula is better for sample #2.
    c. For sample #1, SS = 21. For sample #2, SS = 20.
72. SS = 20; s² = 4; s = 2.
73. ΣX = 6 and ΣX² = 12. SS = 3; s² = 1; s = 1.
74. SS = 96; s² = 16; s = 4.
75. Without some correction, sample variability tends to be smaller than the population variability. Whenever a statistic consistently underestimates (or overestimates) the corresponding population parameter, the statistic is said to be biased. The bias in sample variability is corrected by dividing the sum of squared deviations (SS) by n – 1 (instead of n) in the formula for sample variance.
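
Optional classroom check. For instructors who want to verify answers or demonstrate the calculations on a projector, the order of calculation described in the learning objectives (SS, then variance, then standard deviation) can be sketched in a few lines of Python. This is only an illustrative sketch and is not part of the text: the function names are made up for this example, and the three-score population used for the unbiasedness check is a hypothetical one, not the data from Example 4.7.

```python
from itertools import product
from math import sqrt


def ss_definitional(scores):
    """Sum of squared deviations from the mean: sum of (X - M)^2."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def ss_computational(scores):
    """Same quantity from totals: sum(X^2) - (sum X)^2 / n."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n


def variance(scores, sample=False):
    """SS / N for a population, SS / (n - 1) for a sample."""
    n = len(scores)
    return ss_definitional(scores) / (n - 1 if sample else n)


def std_dev(scores, sample=False):
    """Standard deviation is the square root of the variance."""
    return sqrt(variance(scores, sample))


# Whole-number classroom examples from the lecture suggestions.
population = [7, 1, 7, 9]
sample = [5, 1, 5, 5]
print(ss_definitional(population), variance(population), std_dev(population))
print(ss_computational(sample), variance(sample, sample=True),
      std_dev(sample, sample=True))

# Transformations of scale: add 10 to every score, then multiply by 10.
shifted = [x + 10 for x in population]
scaled = [x * 10 for x in population]
print(std_dev(shifted), std_dev(scaled))

# Unbiasedness on a HYPOTHETICAL population (not the data of Example 4.7):
# all nine samples of n = 2 drawn with replacement from three scores.
toy_population = [0, 3, 9]
all_samples = list(product(toy_population, repeat=2))
mean_unbiased = sum(variance(s, sample=True) for s in all_samples) / len(all_samples)
mean_biased = sum(variance(s) for s in all_samples) / len(all_samples)
print(variance(toy_population), mean_unbiased, mean_biased)
```

The first two printed lines should reproduce the whole-number classroom values (SS = 36, σ² = 9, σ = 3 for the population; SS = 12, s² = 4, s = 2 for the sample). The third line illustrates that adding a constant leaves the standard deviation unchanged while multiplying by a constant multiplies it. In the last line, the average of the nine sample variances computed with n – 1 matches the population variance, while the average computed with n falls below it.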