
The Practice of Statistics for AP*, 4/e
Chapter 2: Modeling Distributions of Data

Section 2.1

Check Your Understanding, page 89:
1. c
2. Her daughter weighs more than 87% of girls her age, and she is taller than 67% of girls her age.
3. About 65% of calls lasted less than 30 minutes. This means that about 35% of calls lasted 30 minutes or longer.
4. The first quartile (25th percentile) is at about 14 minutes. The third quartile (75th percentile) is at about 32 minutes. This suggests that the IQR = 32 − 14 = 18 minutes.

Check Your Understanding, page 91:
1. \(z = \frac{65 - 67}{4.29} = -0.466\). Lynette’s height is about 0.47 standard deviations below the mean height of the class.
2. Brent’s z-score is \(z = \frac{74 - 67}{4.29} = 1.63\). Brent’s height is about 1.63 standard deviations above the mean height of the class.
3. Since Brent’s z-score is −0.85, we know that \(-0.85 = \frac{74 - 76}{\sigma}\). Solving this for σ, we find that σ = 2.35 inches.

Check Your Understanding, page 97:
1. Converting the heights from inches to centimeters will not change the shape. However, it will multiply the center and spread by 2.54.
2. Adding 6 inches to each of the students’ heights will not change the shape of the distribution, nor will it change the spread. It will, however, add 6 inches to the center.
3. Converting the class heights to z-scores will not change the shape of the distribution. It will change the mean to 0 and the standard deviation to 1.

Check Your Understanding, page 103:
1. This figure depicts a legitimate density curve because it is positive everywhere and it has a total area of 1.
2. About 12% of the observations lie between 7 and 8.
3. Point A in the graph below is the approximate median.
4. Point B in the graph above is the approximate mean. The mean is less than the median in this case because the distribution is skewed to the left.

Exercises, page 105:
2.1 (a) Putting the data in order we get: 13 13 13 15 19 22 23 23 24 26 26 30 31 34 38 49 50 50 51 57. The girl with 22 pairs of shoes is the 6th smallest.
Therefore, her percentile is \(\frac{5}{20} = 0.25\). In other words, 25% of the girls had fewer pairs of shoes than she did. (b) Putting the data in order we get: 4 5 5 5 6 7 7 7 7 8 10 10 10 10 11 12 14 22 35 38. The boy with 22 pairs has more shoes than 17 people. Therefore, his percentile is \(\frac{17}{20} = 0.85\). In other words, 85% of boys had fewer pairs of shoes than he did. (c) The boy is more unusual because only 15% of the boys have as many or more than he has, while the girl has a value that is more centered in the distribution: 25% have fewer and 75% have as many or more.
2.2 (a) Since 4 states have smaller values for the percent of residents aged 65 and older, the percentile for Colorado is \(\frac{4}{50} = 0.08\). In other words, 8% of states have a smaller percent of residents aged 65 and older. (b) Since 40 states have smaller values for the percent of residents aged 65 and older, the percentile for Rhode Island is \(\frac{40}{50} = 0.80\). In other words, 80% of states have a smaller percent of residents aged 65 and older. (c) Colorado is the more unusual state, having only 8% of the states with smaller percentages of residents aged 65 and older. Rhode Island has a value that is closer to the center of the distribution, with 80% lower and 20% as large or larger.
2.3 According to the Los Angeles Times, the speed limits on California highways are such that 85% of the vehicle speeds on those stretches of road are less than the speed limit.
2.4 Larry’s wife should gently break the news that being in the 90th percentile is not good news in this situation. About 90% of men similar to Larry have lower blood pressures. The doctor was suggesting that Larry take action to lower his blood pressure.
2.5 The girl in question weighs more than 48% of girls her age, but is taller than 78% of the girls her age. Since she is taller than 78% of girls, but only weighs more than 48% of girls, she is probably fairly skinny.
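The percentile calculations in Exercises 2.1 and 2.2 count the observations strictly below a given value and divide by the number of observations. A minimal Python sketch of that counting rule, using the shoe data from Exercise 2.1 (the function name `percentile_rank` is a hypothetical helper, not something from the text):

```python
def percentile_rank(values, x):
    """Proportion of observations strictly less than x (hypothetical helper)."""
    below = sum(1 for v in values if v < x)
    return below / len(values)

# Ordered data from Exercise 2.1: pairs of shoes owned
girls = [13, 13, 13, 15, 19, 22, 23, 23, 24, 26,
         26, 30, 31, 34, 38, 49, 50, 50, 51, 57]
boys = [4, 5, 5, 5, 6, 7, 7, 7, 7, 8,
        10, 10, 10, 10, 11, 12, 14, 22, 35, 38]

girl_pct = percentile_rank(girls, 22)  # 5 of 20 girls have fewer pairs
boy_pct = percentile_rank(boys, 22)    # 17 of 20 boys have fewer pairs
print(girl_pct, boy_pct)
```

Other definitions of a percentile (for example, "less than or equal to") would give different values for tied observations; this sketch assumes the "strictly fewer" wording used in the answers above.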
2.6 Peter’s time was slower than 80% of his previous race times that season, but it was slower than only 50% of the racers at the league championship meet.
2.7 (a) The highlighted student sent about 212 text messages in the 2-day period, which placed her at about the 80th percentile. (b) The median number of texts is the same as the 50th percentile. Locate 50% on the y-axis, read over to the points and then find the relevant place on the x-axis. The median is approximately 115 text messages.
2.8 (a) Maryland has about 13% foreign-born residents, placing it at about the 70th percentile. (b) Locate 30% on the y-axis, read over to the points and then find the relevant place on the x-axis. The 30th percentile is approximately 4.5% foreign-born.
2.9 (a) First find the quartiles. The first quartile is the 25th percentile. Find 25 on the y-axis, read over to the line and then down to the x-axis to get about $19. The third quartile is the 75th percentile. Find 75 on the y-axis, read over to the line and then down to the x-axis to get about $50. So the interquartile range is $50 − $19 = $31. (b) The person who spent $19.50 is just above what we have called the 25th percentile. It appears that $19.50 is at about the 26th percentile. (c) The graph is below:
2.10 (a) To find the 60th percentile, find 60 on the y-axis, read over to the line and then read down to the x-axis to find approximately 1000 hours. (b) To find the percentile for the lamp that lasted 900 hours, find 900 on the x-axis, read up to the line and across to the y-axis to find that it is approximately the 35th percentile.
2.11 Eleanor’s standardized score, \(z = \frac{680 - 500}{100} = 1.8\), is higher than Gerald’s standardized score, \(z = \frac{27 - 18}{6} = 1.5\).
2.12 The standardized batting averages (z-scores) for these three outstanding hitters are:
Player      z-score
Cobb        \(z = \frac{0.420 - 0.266}{0.0371} = 4.15\)
Williams    \(z = \frac{0.406 - 0.267}{0.0326} = 4.26\)
Brett       \(z = \frac{0.390 - 0.261}{0.0317} = 4.07\)
All three hitters were at least 4 standard deviations above their peers, but Williams’ z-score is the highest.
2.13 (a) Judy’s bone density score is about one and a half standard deviations below the average score for all women her age. The fact that her standardized score is negative indicates that her bone density is below the average for her peer group. The magnitude of the standardized score tells us how many standard deviations she is below the average (about 1.5). (b) If we let σ denote the standard deviation of the bone density in Judy’s reference population, then we can solve for σ in the equation \(-1.45 = \frac{948 - 956}{\sigma}\). Thus, σ = 5.52 grams/cm².
2.14 (a) Mary’s z-score (0.5) indicates that her bone density score is about half a standard deviation above the average score for all women her age. Even though the two bone density scores are exactly the same, Mary is 10 years older, so her z-score is higher than Judy’s (−1.45). Mary’s bones are healthier when comparisons are made to other women in their age groups. (b) If we let σ denote the standard deviation of the bone density in Mary’s reference population, then we can solve for σ in the equation \(0.5 = \frac{948 - 944}{\sigma}\). Thus, σ = 8 grams/cm². There is more variability in the bone densities for older women, which is not surprising.
2.15 (a) Since 22 salaries were less than Lidge’s salary, his salary is at the \(\frac{22}{29} = 0.7586\), or 75.86th, percentile. (b) \(z = \frac{6{,}350{,}000 - 3{,}388{,}617}{3{,}767{,}484} = 0.79\). Lidge’s salary was 0.79 standard deviations above the mean salary of $3,388,617.
2.16 Madson’s salary in 2008 was $1,400,000.
Since there were 14 salaries less than Madson’s, his salary is at the \(\frac{14}{29} = 0.48\), or 48th, percentile. Note that we could also argue that since Madson’s salary was the median, his percentile would be 50%. Madson’s z-score is \(z = \frac{1{,}400{,}000 - 3{,}388{,}617}{3{,}767{,}484} = -0.53\). Madson had a typical salary compared to the rest of the team since his percentile was 50%, but there were some players who made much more than he did since his z-score is negative, indicating a salary below the mean team salary.
2.17 (a) In the national group, about 94.8% of the test takers scored below 65. Scott’s percentiles, 94.8th among the national group and 68th within the school, indicate that he did better among all test takers than he did among the 50 boys at his school. (b) Scott’s z-scores are \(z = \frac{64 - 46.9}{10.9} = 1.57\) among the national group and \(z = \frac{64 - 58.2}{9.4} = 0.62\) among the 50 boys at his school.
2.18 The boys at Scott’s school did very well on the PSAT. Scott’s score was relatively better when compared to the national group than to his peers at school. Only 5.2% of the test takers nationally scored 65 or higher, yet about 32% scored 65 or higher at Scott’s school.
2.19 (a) The mean and the median both increase by 18, so the mean is 87.188 and the median is 87.5.
\[
\begin{aligned}
\text{Mean}_{\text{New}} &= \frac{\text{sum of student heights standing on chairs}}{\text{number of students}} \\
&= \frac{(\text{height of first student} + 18) + \cdots + (\text{height of last student} + 18)}{\text{number of students}} \\
&= \frac{\text{sum of student heights standing on floor} + 18 \times \text{number of students}}{\text{number of students}} \\
&= \text{Mean}_{\text{Old}} + 18 = 69.188 + 18 = 87.188.
\end{aligned}
\]
The median is still the height of the middle student. Now that this student is standing on a chair 18 inches from the ground, the median will be 18 inches larger. (b) The standard deviation and IQR do not change. For the standard deviation, note that although the mean increased by 18, the observations each increased by 18 as well, so the deviations did not change.
For the IQR, Q1 and Q3 both increase by 18, so their difference remains the same as in the original data set.
2.20 (a) The mean and median salaries will each increase by $1000 (the distribution of salaries just shifts by $1000). (b) The extremes and quartiles will also each increase by $1000. The standard deviation will not change. Nothing has happened to affect the variability of the distribution. The center has shifted location, but the spread has not changed.
2.21 (a) To give the heights in feet, not inches, we would divide each observation by 12 (12 inches = 1 foot). Thus
\[
\begin{aligned}
\text{Mean}_{\text{New}} &= \frac{\frac{\text{height of first student (inches)}}{12} + \cdots + \frac{\text{height of last student (inches)}}{12}}{\text{number of students}} \\
&= \frac{1}{12}\left(\frac{\text{height of first student (inches)} + \cdots + \text{height of last student (inches)}}{\text{number of students}}\right) \\
&= \frac{1}{12}\,\text{Mean}_{\text{Old}} = \frac{1}{12}(69.188) = 5.77 \text{ feet}.
\end{aligned}
\]
The median is still the height of the middle student. To convert this height to feet, we divide by 12: \(\text{Median}_{\text{New}} = \frac{69.5}{12} = 5.79\) feet. (b) To find the standard deviation in feet, note that each deviation in terms of feet is found by dividing the original deviation by 12.
\[
\begin{aligned}
\text{standard deviation}_{\text{New}} &= \sqrt{\frac{(\text{first deviation (ft)})^2 + \cdots + (\text{last deviation (ft)})^2}{n-1}} \\
&= \sqrt{\frac{\left(\frac{\text{first deviation (in)}}{12}\right)^2 + \cdots + \left(\frac{\text{last deviation (in)}}{12}\right)^2}{n-1}} \\
&= \frac{1}{12}\,\text{standard deviation}_{\text{Old}} = \frac{3.2}{12} = 0.27 \text{ feet}.
\end{aligned}
\]
The first and third quartiles are still the medians of the first and second halves of the data; these values must simply be converted to feet. To do this, divide the first and third quartiles of the original data set by 12: \(Q_1 = \frac{67.75}{12} = 5.65\) feet and \(Q_3 = \frac{71}{12} = 5.92\) feet. So the interquartile range is IQR = 5.92 − 5.65 = 0.27 feet.
2.22 (a) The mean and median will each increase by 5% since all of the observations will increase by 5%. (b) The 5% raise will increase the distance of the quartiles from the median.
The quartiles and the standard deviation will each increase by 5% (the original values will be multiplied by 1.05).
2.23 To find the mean temperature in degrees Fahrenheit, multiply by \(\frac{9}{5}\) and add 32, so we get \(\text{Mean}_f = \frac{9}{5}(25) + 32 = 77\) degrees Fahrenheit. To find the standard deviation, we just multiply by \(\frac{9}{5}\), since adding 32 just shifts the distribution and does not affect the spread. So we get \(\text{standard deviation}_f = \frac{9}{5}(2) = 3.6\) degrees Fahrenheit.
2.24 To get a correct measurement in inches we need to subtract the 0.2 inches that Clarence mistakenly added. Then to transform that same measurement to cm we need to multiply by 2.54. So the new mean is (3.2 − 0.2) × 2.54 = 7.62 cm. The change in location does not affect the standard deviation, so we just multiply the old standard deviation by 2.54: 0.1(2.54) = 0.254 cm.
2.25 Sketches will vary. Use them to confirm that the students understand the meaning of (a) symmetric and bimodal
2.26 Sketches will vary. Use them to confirm that the students understand the meaning of skewed to the left.
2.27 (a) It is on or above the horizontal axis everywhere, and because it forms a \(\frac{1}{3}\) by 3 rectangle, the area beneath the curve is 1. (b) One third of accidents occur in the first mile: this is a \(\frac{1}{3}\) by 1 rectangle, so the proportion is \(\frac{1}{3}\). (c) One-tenth of accidents occur next to Sue’s property: this is a \(\frac{1}{3}\) by 0.3 rectangle, so the proportion is 0.1.
2.28 (a) The area under the curve is a rectangle with height 1 and width 1. Thus, the total area under the curve is 1(1) = 1. (b) The area under the uniform distribution between 0.8 and 1 is 0.2(1) = 0.2, so 20% of the observations lie above 0.8. (c) The area under the uniform distribution between 0.25 and 0.75 is 0.5(1) = 0.5, so 50% of the observations lie between 0.25 and 0.75.
2.29 Both are 1.5. The mean is 1.5 because this is the obvious balance point of the rectangle.
The median is also 1.5 because the distribution is symmetric (so that median = mean) and because half of the area lies to the left and half to the right of 1.5.
2.30 Both are 0.5. The mean is 0.5 because this is the obvious balance point of the rectangle. The median is also 0.5 because the distribution is symmetric (so that median = mean) and because half of the area lies to the left and half to the right of 0.5.
2.31 (a) Mean is C, median is B (the right skew pulls the mean to the right). (b) Mean is B, median is B (this distribution is symmetric).
2.32 (a) Mean is A, median is A (the distribution is symmetric). (b) Mean is A, median is B (the left skew pulls the mean to the left).
2.33 c
2.34 b
2.35 c
2.36 b
2.37 d
2.38 e
2.39 The distribution is skewed to the right since most of the values are 25 minutes or less, but the values stretch out up to about 90 minutes. The data are centered roughly around 20 minutes and the range of the distribution is close to 90 minutes. The two largest values appear to be outliers.
2.40 (a) A bar chart is given below: (b) 6%. Since \(\frac{3}{50} = 6\%\) of the sample was left-handed, our best estimate of the percentage of the population that is left-handed is 6%.

Section 2.2

Check Your Understanding, page 114:
1. The graph is shown below:
2. Since 67 inches is one standard deviation above the mean, approximately \(\frac{1 - 0.68}{2} = 16\%\) of young women have heights greater than 67 inches.
3. Since 62 is one standard deviation below the mean and 72 is three standard deviations above the mean, approximately \(\frac{0.68}{2} + \frac{0.997}{2} = 84\%\) of young women have heights between 62 and 72 inches.

Check Your Understanding, page 119:
1. The proportion is 0.9177. A graph is shown below:
2. The proportion is 0.9842. A graph is shown below:
3. The proportion is 0.9649 − 0.2877 = 0.6772. A graph is shown below:
4.
The z-score for the 20th percentile is −0.84. A graph is shown below:
5. The 55th percentile is the z-value where 45% are greater than that value. z = 0.13. A graph is shown below:

Check Your Understanding, page 124:
1. For 14-year-old boys with cholesterol 240, the z-score is \(z = \frac{240 - 170}{30} = 2.33\). The proportion of z-scores above 2.33 is 1 − 0.9901 = 0.0099. A graph is shown below:
2. The z-score for a cholesterol level of 200 is \(z = \frac{200 - 170}{30} = 1\). The proportion of z-scores between 1 and 2.33 is 0.9901 − 0.8413 = 0.1488. A graph is shown below:
3. The 80th percentile of the Standard Normal distribution is 0.84 (see graph below). This means that the distance, x, of Tiger Woods’ drive lengths that satisfies the 80th percentile is the solution to \(0.84 = \frac{x - 304}{8}\). Solving for x, we get 310.72 yards.

Exercises, page 131:
2.41 The Normal density curve with mean 69 and standard deviation 2.5 is shown below.
2.42 The Normal distribution for the weights of 9-ounce bags of potato chips is shown below. The interval containing weights within 1 standard deviation of the mean goes from 9.07 to 9.17. The interval containing weights within 2 standard deviations of the mean goes from 9.02 to 9.22. The interval containing weights within 3 standard deviations of the mean goes from 8.97 to 9.27.
2.43 (a) Approximately 2.5% of men are taller than 74 inches, which is 2 standard deviations above the mean. (b) Approximately 95% of men have heights between 69 − 5 = 64 inches and 69 + 5 = 74 inches. (c) Approximately 16% of men are shorter than 66.5 inches, because 66.5 is one standard deviation below the mean. Approximately 2.5% are shorter than 64 inches, because 64 inches is two standard deviations below the mean. So approximately 16% − 2.5% = 13.5% of men have heights between 64 inches and 66.5 inches. (d) The value 71.5 is one standard deviation above the mean.
Thus, the area to the left of 71.5 is 0.68 + 0.16 = 0.84. In other words, 71.5 is the 84th percentile of adult male American heights.
2.44 (a) Approximately 2.5% of bags weigh less than 9.02 ounces, because 9.02 is two standard deviations below the mean. (b) Approximately 68% of bags have weights between 9.12 − 0.05 = 9.07 ounces and 9.12 + 0.05 = 9.17 ounces. (c) Approximately 84% of bags have weights less than 9.17 ounces, because 9.17 is one standard deviation above the mean. Approximately 0.15% of bags have weight less than 8.97 ounces, because 8.97 is three standard deviations below the mean. So approximately 84% − 0.15% = 83.85% of bags have weight between 8.97 and 9.17 ounces. (d) The value 9.07 is one standard deviation below the mean. This means that the area to the left of 9.07 is 0.16. In other words, 9.07 is the 16th percentile of the weights of these potato chip bags.
2.45 The standard deviation is approximately 0.2 for the tall, more concentrated one and 0.5 for the short, less concentrated one.
2.46 The mean is 10 and the standard deviation is about 2. The entire range of scores is about 16 − 4 = 12. Since the range of values in a Normal distribution is about six standard deviations, we have \(\frac{12}{6} = 2\) for σ.
2.47 (a) 0.9978. The graph is shown below (left). (b) 1 − 0.9978 = 0.0022. The graph is shown below (right). (c) 1 − 0.0485 = 0.9515. The graph is shown below (left). (d) 0.9978 − 0.0485 = 0.9493. The graph is shown below (right).
2.48 (a) 0.0069. The graph is shown below (left). (b) 1 − 0.9931 = 0.0069. The graph is shown below (right). (c) 0.9931 − 0.8133 = 0.1798. The graph is shown below (left). (d) 0.1020 − 0.0016 = 0.1004. The graph is shown below (right).
2.49 (a) 0.9505 − 0.0918 = 0.8587. The graph is shown below (left). (b) 0.9633 − 0.6915 = 0.2718. The graph is shown below (right).
2.50 (a) 0.7823 − 0.0202 = 0.7621. The graph is shown below (left).
(b) 0.3745 − 0.1335 = 0.241. The graph is shown below (right).
2.51 (a) The value that is closest to 0.1000 in Table A is 0.1003. This corresponds to a value of −1.28 for z. (b) The point where 34% of observations are greater is also the 100 − 34 = 66th percentile. The value that is closest to 0.6600 in Table A is 0.6591, which corresponds to a z-score of 0.41.
2.52 (a) The value that is closest to 0.6300 in Table A is 0.6293, which corresponds to a z-value of 0.33. (b) If 75% of values are greater than z, then 25% are lower. The value that is closest to 0.2500 in Table A is 0.2514, which corresponds to a z-score of −0.67.
2.53 (a) State: Let x = the length of pregnancies. The variable x has a Normal distribution with µ = 266 days and σ = 16 days. We want the proportion of pregnancies that last less than 240 days. Plan: The proportion of pregnancies lasting less than 240 days is shown in the graph below (left). Do: For x = 240 we have \(z = \frac{240 - 266}{16} = -1.63\), so x < 240 corresponds to z < −1.63. Using Table A we see that the proportion of observations less than −1.63 is 0.0516 or about 5.2%. Conclude: About 5.2% of pregnancies last less than 240 days, which means that 240 is approximately the 5th percentile.
(b) State: Let x = the length of pregnancies. The variable x has a Normal distribution with µ = 266 days and σ = 16 days. We want the proportion of pregnancies lasting between 240 and 270 days. Plan: The proportion of pregnancies lasting between 240 and 270 days is shown in the graph above (right). Do: From part (a) we have that for x = 240, z = −1.63. For x = 270, we have \(z = \frac{270 - 266}{16} = 0.25\). Using Table A we see that the proportion of observations less than 0.25 is 0.5987. So the proportion of observations between −1.63 and 0.25 is 0.5987 − 0.0516 = 0.5471 or about 55%. Conclude: Approximately 55% of pregnancies last between 240 and 270 days.
(c) State: Let x = the length of pregnancies.
The variable x has a Normal distribution with µ = 266 days and σ = 16 days. We want the number of days such that 80% of people have shorter pregnancies than that number of days. Plan: The 80th percentile for the length of human pregnancy is shown in the graph below. Do: Using Table A, the 80th percentile for the standard Normal distribution is 0.84. Therefore, the 80th percentile for the length of human pregnancy can be found by solving the equation \(0.84 = \frac{x - 266}{16}\) for x. Thus, x = (0.84)16 + 266 = 279.44. Conclude: The longest 20% of pregnancies last approximately 279 or more days.
2.54 (a) State: Let x = the IQ scores of people aged 20 to 34. The variable x has a Normal distribution with µ = 110 and σ = 25. We want the proportion of IQ scores that are less than 150. Plan: The proportion of IQ scores less than 150 is shown in the graph below (left). Do: For x = 150 we have \(z = \frac{150 - 110}{25} = 1.6\). Using Table A we see that the proportion of observations less than 1.6 is 0.9452 or approximately 94.5%. Conclude: An IQ score of 150 is approximately the 95th percentile.
(b) State: Let x = the IQ scores of people aged 20 to 34. The variable x has a Normal distribution with µ = 110 and σ = 25. We want the proportion of IQ scores that are between 125 and 150. Plan: The proportion of IQ scores between 125 and 150 is shown in the graph above (right). Do: From part (a), we have that for x = 150, z = 1.6. For x = 125 we have that \(z = \frac{125 - 110}{25} = 0.6\). So the proportion of observations between 0.6 and 1.6, using Table A, is 0.9452 − 0.7257 = 0.2195 or about 22%. Conclude: About 22% of people aged 20-34 have IQ scores between 125 and 150.
(c) State: Let x = the IQ scores of people aged 20 to 34. The variable x has a Normal distribution with µ = 110 and σ = 25. We want to find the score such that 98% of people aged 20-34 have an IQ score that is smaller. Plan: The 98th percentile of the IQ scores is shown in the graph below. Do: Using Table A, the 98th percentile for the standard Normal distribution is closest to 2.05. Therefore, the 98th percentile for the IQ scores can be found by solving the equation \(2.05 = \frac{x - 110}{25}\) for x. Thus, x = (2.05)25 + 110 = 161.25. Conclude: In order to qualify for MENSA membership a person must score 162 or higher.
2.55 (a) We want to find the area under the N(0.37, 0.04) distribution to the right of 0.3. The graphs below show that this area is equivalent to the area under the N(0, 1) distribution to the right of \(z = \frac{0.3 - 0.37}{0.04} = -1.75\). Using Table A, the proportion of adhesions higher than 0.30 is 1 − 0.0401 = 0.9599. We would expect trains to arrive on time about 96% of the time. (b) We want to find the area under the N(0.37, 0.04) distribution to the right of 0.5. The graphs below show that this area is equivalent to the area under the N(0, 1) distribution to the right of \(z = \frac{0.5 - 0.37}{0.04} = 3.25\). Using Table A, the proportion of adhesions higher than 0.50 is 1 − 0.9994 = 0.0006. So we would expect trains to arrive early 0.06% of the time. (c) It makes sense to try to have the value found in part (a) larger. We want the train to arrive at its destination on time, but not to arrive at the switch point early.
2.56 (a) If a lid is too small to fit, that means that its diameter is less than 3.95. Let x = diameter of the lid. We wish to find the proportion of lids for which x < 3.95. The graph showing this area is below (on the left). For x = 3.95, we have \(z = \frac{3.95 - 3.98}{0.02} = -1.5\). Using Table A, the proportion of observations less than −1.5 is 0.0668. So, approximately 6.7% of lids are too small. (b) If a lid is too big to fit, that means that the diameter is more than 4.05. Let x = diameter of the lid. We wish to find the proportion of lids for which x > 4.05. The graph showing this area is below (on the right). For x = 4.05, we have \(z = \frac{4.05 - 3.98}{0.02} = 3.5\).
Using Table A, the proportion of observations less than 3.5 is approximately 1, so the proportion of observations greater than 3.5 is approximately 0. In other words, it is extremely rare for a lid to be too big. (c) It makes more sense to have a larger proportion of lids too small rather than too large. The company wants to make sure that the fit is snug. If more lids are too large, there will be more spills. If lids are too small, customers will just try another lid. But if lids are too large, the customer may not notice and then spill the drink.
2.57 (a) We want to solve for the appropriate mean of the distribution so that the adhesion is less than 0.30 on less than 2% of days. First, find the z-value for which 2% of observations are lower. Using Table A, we find z = −2.05. Since we want this z-value to correspond to an adhesion of 0.30, we solve \(z = \frac{0.30 - \mu}{0.04} = -2.05\) for µ. In other words, µ = 0.30 + 2.05(0.04) = 0.382. The mean adhesion should be 0.382. (b) If the mean adhesion stays at 0.37, then solve \(z = \frac{0.30 - 0.37}{\sigma} = -2.05\) for σ. In other words, \(\sigma = \frac{0.30 - 0.37}{-2.05} = 0.034\). The standard deviation of the adhesion values should be 0.034. (c) To compare the options, we want to find the area under the N(µ, σ) distribution to the right of 0.50. Under option (a), \(z = \frac{0.50 - 0.382}{0.04} = 2.95\). Using Table A, we find that the area is 1 − 0.9984 = 0.0016. Under option (b), \(z = \frac{0.50 - 0.37}{0.034} = 3.82\). Using Table A, we find that the area is 1 − 0.9999 = 0.0001. Therefore, we prefer option (b).
2.58 (a) We want to solve for the appropriate mean so that less than 1% of lids have diameter less than 3.95. First, find the z-value for which 1% of observations are lower. Using Table A, we find z = −2.33. Since we want this z-value to correspond to a diameter of 3.95, we solve \(z = \frac{3.95 - \mu}{0.02} = -2.33\) for µ. In other words, µ = 3.95 + 2.33(0.02) = 4.00 inches. So the mean lid diameter should be 4.00 inches.
(b) If the mean diameter stays at 3.98, then solve \(z = \frac{3.95 - 3.98}{\sigma} = -2.33\) for σ. In other words, \(\sigma = \frac{3.95 - 3.98}{-2.33} = 0.013\) inches. So the standard deviation of the lid diameters should be 0.013 inches. (c) We prefer reducing the standard deviation. This will not increase the number of lids that are too big, but will reduce the number of lids that are too small. If we just shift the mean to the right, we will reduce the number of lids that are too small, but we will increase the number of lids that are too big.
2.59 (a) Using Table A, the closest values to the deciles are ±1.28. (b) The deciles for the heights of young women are 64.5 ± 1.28(2.5), or 61.3 inches and 67.7 inches.
2.60 The quartiles for a standard Normal distribution are ±0.6745. For a N(µ, σ) distribution, Q1 = µ − 0.6745σ, Q3 = µ + 0.6745σ, and IQR = 1.349σ. Therefore, 1.5(IQR) = 2.0235σ, and the suspected outliers are below Q1 − 1.5(IQR) = µ − 2.698σ or above Q3 + 1.5(IQR) = µ + 2.698σ. The proportion outside of this range is approximately the same as the area under the standard Normal distribution outside of the range from −2.7 to 2.7, which is 2(0.0035) = 0.007 or 0.70%.
2.61 Use the two graphs below and the given information to set up two equations in two unknowns. The two equations are \(1.04 = \frac{60 - \mu}{\sigma}\) and \(1.88 = \frac{75 - \mu}{\sigma}\). Multiplying both sides of the equations by σ and subtracting yields 0.84σ = 15 or σ = 17.86 minutes. Substituting this value back into the first equation we obtain \(1.04 = \frac{60 - \mu}{17.86}\) or µ = 60 − 1.04(17.86) = 41.43 minutes.
2.62 Use the given information and the graphs below to set up two equations in two unknowns. The two equations are \(-0.25 = \frac{1 - \mu}{\sigma}\) and \(2.05 = \frac{2 - \mu}{\sigma}\). Multiplying both sides of the equations by σ and subtracting yields −2.3σ = −1 or σ = 0.4348 minutes. Substituting this value back into the first equation we obtain \(-0.25 = \frac{1 - \mu}{0.4348}\) or µ = 1 + 0.25(0.4348) = 1.1087 minutes.
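The two-equation method in Exercises 2.61 and 2.62 can be checked numerically. The sketch below assumes the z-scores have already been read from Table A, as in the solutions; the function name `solve_mu_sigma` is hypothetical and simply encodes the subtract-then-substitute steps.

```python
def solve_mu_sigma(x1, z1, x2, z2):
    """Solve z1 = (x1 - mu)/sigma and z2 = (x2 - mu)/sigma for (mu, sigma).

    Hypothetical helper: subtracting the two equations gives
    (z2 - z1) * sigma = x2 - x1, then mu follows from the first equation.
    """
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma

# Exercise 2.61: x = 60 at z = 1.04 and x = 75 at z = 1.88
mu_61, sigma_61 = solve_mu_sigma(60, 1.04, 75, 1.88)
print(round(mu_61, 2), round(sigma_61, 2))   # about 41.43 and 17.86 minutes

# Exercise 2.62: x = 1 at z = -0.25 and x = 2 at z = 2.05
mu_62, sigma_62 = solve_mu_sigma(1, -0.25, 2, 2.05)
print(round(mu_62, 4), round(sigma_62, 4))   # about 1.1087 and 0.4348 minutes
```

Because the sketch carries full precision for σ rather than the rounded 17.86, the later digits of µ can differ slightly from a hand calculation, although the rounded answers agree here.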
2.63 (a) Descriptive statistics and a histogram are provided below.
Variable   N    Mean     StDev   Minimum   Q1       Median   Q3       Maximum
shlength   44   15.586   2.550   9.400     13.525   15.750   17.400   22.800
The distribution of shark lengths is roughly symmetric with a peak at 16 and varies from 9.4 feet to 22.8 feet. (b) 68.2% of the lengths fall within one standard deviation of the mean, 95.5% of the lengths fall within two standard deviations of the mean, and 100% of the lengths fall within 3 standard deviations of the mean. These are very close to the 68-95-99.7 rule. (c) A Normal probability plot is shown below. Except for one small shark and one large shark, the plot is fairly linear, indicating that the Normal distribution is appropriate. (d) The graphical display in (a), check of the 68-95-99.7 rule in (b), and Normal probability plot in (c) indicate that shark lengths are approximately Normal.
2.64 (a) Descriptive statistics and a histogram are provided below.
Variable   N    Mean     StDev    Minimum   Q1       Median   Q3       Maximum
density    29   5.4479   0.2209   4.8800    5.2950   5.4600   5.6150   5.8500
The measurements of the earth’s density are roughly symmetric with a mean of 5.45 and vary from 4.88 to 5.85. (b) The densities follow the 68-95-99.7 rule closely: 75.86% (22 out of 29) of the densities fall within one standard deviation of the mean, 96.55% (28 out of 29) of the densities fall within two standard deviations of the mean, and 100% of the densities fall within 3 standard deviations of the mean. (c) Normal probability plots from Minitab (left) and a TI calculator (right) are shown below. The Normal probability plot is roughly linear, indicating that the densities are approximately Normal. (d) The graphical display in (a), the 68-95-99.7 rule in (b) and the Normal probability plot in (c) all indicate that these measurements are approximately Normal.
2.65 The plot is nearly linear.
Because heart rate is measured in whole numbers, there is a slight “step” appearance to the graph.
2.66 The shape of the Normal probability plot suggests that the data are right-skewed. This can be seen in the steep, nearly vertical section in the lower left (these numbers were less spread out than they should be for Normal data) and the three apparent outliers that deviate from the line in the upper right; these were much larger than they would be for a Normal distribution.
2.67 (a) The percent of scores above 27 is \(\frac{149{,}164}{1{,}300{,}599} = 0.1147\) or about 11.47%. (b) The percent of scores greater than or equal to 27 is \(\frac{149{,}164 + 50{,}310}{1{,}300{,}599} = 0.1534\) or about 15.34%. (c) For the Normal distribution with µ = 21.2 and σ = 5.0, the percent of observations greater than 27 corresponds to the percent of observations above \(z = \frac{27 - 21.2}{5} = 1.16\) on the Standard Normal curve. Using Table A, this proportion is 0.1230. So about 12.30% of scores would be 27 or higher.
2.68 Women’s weights are skewed to the right: This makes the mean higher than the median, and it is also revealed in the differences M − Q1 = 133.2 − 118.3 = 14.9 pounds and Q3 − M = 157.3 − 133.2 = 24.1 pounds.
2.69 d
2.70 c
2.71 b
2.72 c
2.73 c
2.74 c
2.75 For both kinds of cars we see that the highway miles per gallon is higher than the city miles per gallon, a conclusion that we would have expected. The two-seater cars have a wider spread of miles per gallon values than the minicompact cars do, both on the highway and in the city. Also the miles per gallon values are slightly lower for the two-seater cars than for the minicompact cars. This difference is greater on the highway than it is in the city.
2.76 (a) The two-way table is shown below. (b) The percents of eggs in each group that hatched are 59.26% in a cold nest, 67.86% in a neutral nest, and 72.12% in a hot nest. The percents indicate that hatching increases with temperature.
The cold nest did not prevent hatching, but made it less likely.

              Cold   Neutral   Hot
Hatched        16      38       75
Not hatched    11      18       29
Total          27      56      104

Chapter Review Exercises (page 136)

R2.1 (a) The only way to obtain a z-score of 0 is if the x value equals the mean. Thus, the mean is 170 cm. To find the standard deviation, use the fact that a z-score of 1 corresponds to a height of 177.5. Then 1 = (177.5 − 170)/σ, so σ = 7.5. Thus the standard deviation of the height distribution for 15-year-old males is 7.5 cm. (b) Since 2.5 = (x − 170)/7.5, x = 2.5(7.5) + 170 = 188.75 cm. A height of 188.75 cm has a z-score of 2.5.

R2.2 (a) z = (179 − 170)/7.5 = 1.20. Paul is somewhat taller than average for his age. His height is 1.20 standard deviations above the average male height for his age. (b) 85% of boys Paul’s age are shorter than Paul.

R2.3 (a) Reading up from 10 hours on the x-axis to the graphed line and then across to the y-axis, we see that 10 hours corresponds to about the 70th percentile. (b) The median (50th percentile) is about 5, Q1 (25th percentile) is about 2.5, and Q3 (75th percentile) is about 11. There are outliers, according to the 1.5(IQR) rule, because values exceeding Q3 + 1.5(IQR) = 11 + 1.5(8.5) = 23.75 clearly exist.

R2.4 (a) If we converted the guesses from feet to meters, the shape of the distribution would not change. The new mean would be 43.7/3.28 = 13.32 meters, the median would be 42/3.28 = 12.80 meters, the standard deviation would be 12.5/3.28 = 3.81 meters, and the IQR would be 12.5/3.28 = 3.81 meters. (b) The mean error would be 43.7 − 42.6 = 1.1 feet. The standard deviation of the errors would be the same as the standard deviation of the guesses, 12.5 feet, because subtracting 42.6 from each guess shifts the distribution without changing its width.

R2.5 (a) Answers will vary but the line should be slightly to the right of the main peak. Line A in the graph below.
(b) Answers will vary but the line should be slightly to the right of the line for the median. Line B in the graph below.

R2.6 (a) 336 ± 3(3) = (327, 345) (b) 339 is one standard deviation above the mean, so 16% of the horse pregnancies last longer than 339 days. This is because 68% are within one standard deviation of the mean, so 32% are more than one standard deviation from the mean and half of those are greater than 339.

R2.7 (a) The graph is given below (on the left). Looking up −2.25 in Table A, the proportion of observations below −2.25 is 0.0122. (b) The graph is below (on the right). The proportion of observations larger than −2.25 is 1 − 0.0122 = 0.9878. (c) The graph is given below (on the left). Looking up 1.77 in Table A gives 0.9616. This is the proportion of observations below 1.77. We would like the proportion greater than 1.77, so we subtract this from 1: 1 − 0.9616 = 0.0384. So 3.84% of observations are above 1.77. (d) The graph is given below (on the right). Subtract the area below −2.25 from the area below 1.77 to get 0.9616 − 0.0122 = 0.9494. So 94.94% of observations are between −2.25 and 1.77.

R2.8 (a) Looking up 0.80 in the body of Table A, we find that the z corresponding to the number closest to 0.80 is 0.84. The graph is shown below (on the left). (b) If 35% of all values are greater than a particular z-value, then 65% are lower. Looking up 0.65 in the body of Table A, we find that the z corresponding to the number closest to 0.65 is 0.39. The graph is shown below (on the right).

R2.9 (a) z = (2500 − 3668)/511 = −2.29. The z-value that corresponds to a baby weight of 2500 grams at birth is −2.29. The percent of babies weighing less than this is the area to the left (see graph below). According to Table A, this is 0.0110. So, approximately 1% of babies will be identified as having low birth weight. (b) The z-values corresponding to the quartiles are −0.67 and 0.67. To find the x-values corresponding to the quartiles, we solve the following equations for x.

−0.67 = (x − 3668)/511 ⇒ x = −0.67(511) + 3668 = 3325.63
0.67 = (x − 3668)/511 ⇒ x = 0.67(511) + 3668 = 4010.37

The quartiles of the birth weight distribution are 3325.6 and 4010.4 grams.

R2.10 If the distribution is Normal, it must be symmetric about its mean. In particular, the 10th and 90th percentiles must be equal distances below and above the mean, so the mean is 250 points. If 225 points below (above) the mean is the 10th (90th) percentile, this is 1.28 standard deviations below (above) the mean, so the distribution’s standard deviation is 225/1.28 = 175.8 points.

R2.11 A histogram and Normal probability plot are given below. Both indicate that the data are not exactly Normally distributed. The histogram is roughly symmetric. The descriptive statistics given below indicate that the mean and median are very similar, which is consistent with rough symmetry. For most purposes, these data can be considered approximately Normal.

Variable           N    Mean     StDev    Minimum   Q1       Median   Q3       Maximum
length of thorax   49   0.8004   0.0782   0.6400    0.7600   0.8000   0.8600   0.9400

R2.12 The steep, nearly vertical portion at the bottom and the bow upward indicate that the distribution of the data is right-skewed with several outliers. In other words, these data are not Normally distributed.

AP Statistics Practice Test (page 138)

T2.1 e. The percentile tells what percent of scores are below that figure.

T2.2 d. At one standard deviation away from the mean, the curve has its inflection point. Also, the vast majority of the curve lies within 3 standard deviations of the mean.

T2.3 b. Both the addition and the multiplication affect the mean, but only the multiplication affects the standard deviation.

T2.4 b.
About 20% consume fewer than 4 ounces and about 60% consume fewer than 8 ounces, so about 40% consume between 4 and 8 ounces.

T2.5 a. Solve the following equation for σ: 1.04 = (60 − 55)/σ. The value 1.04 is the approximate 85th percentile of the Standard Normal distribution as found in Table A.

T2.6 d. There is not enough area between A and B or between F and G to constitute 25%. Since A and G are the minimum and the maximum, they are obviously part of the five-number summary. The points B and F are too close to the ends of the distribution.

T2.7 c. Two feet constitutes 24 inches. Since there is a spread of about 6 standard deviations in a length that encompasses 99.7% of all observations, we would estimate the standard deviation to be 24/6 = 4 inches.

T2.8 e. The proportion of observations with z > −3.0 in a Standard Normal distribution is 0.9987.

T2.9 e. z = (500 − 470)/110 = 0.27.

T2.10 c. Jane’s z-score is 0.27 (see previous problem). Colleen’s z-score is z = (530 − 515)/116 = 0.13.

T2.11 (a) Jane’s performance was better. She did more curl-ups than 85% of girls her age. This means that she qualified for both the President’s award and the National award. Matt did more curl-ups than 50% of boys his age. This means that less than 50% of the boys his age did better than he did, whereas less than 15% of the girls her age did better than Jane. Matt qualified for the National award, but did not qualify for the President’s award. (b) Since Jane’s score has a higher percentile than Matt’s score, her standardized score will also be larger than his.

T2.12 (a) The z-value that corresponds to this soldier’s head circumference is z = (23.9 − 22.8)/1.1 = 1. So the proportion of observations lower than this is 0.8413 (using Table A). This means that this soldier’s head circumference is at approximately the 84th percentile. A graph is shown below. (b) Standardizing the left endpoint we get z = (20 − 22.8)/1.1 = −2.55. Using Table A, the area below −2.55 is 0.0054 (see graph below). Standardizing the right endpoint we get z = (26 − 22.8)/1.1 = 2.91. The area below 2.91 (using Table A) is 0.9982, so the area above 2.91 is 1 − 0.9982 = 0.0018 (see graph below). This means that the area in both tails is 0.0054 + 0.0018 = 0.0072. So approximately 0.7% of soldiers require custom helmets. (c) The quartiles of a Standard Normal distribution are −0.67 and 0.67. To find the quartiles of the head circumference distribution we solve the following equations for x.

−0.67 = (x − 22.8)/1.1 ⇒ x = −0.67(1.1) + 22.8 = 22.063
0.67 = (x − 22.8)/1.1 ⇒ x = 0.67(1.1) + 22.8 = 23.537

This means that Q1 = 22.063 and Q3 = 23.537. So IQR = Q3 − Q1 = 23.537 − 22.063 = 1.474 inches.

T2.13 No, these data do not seem to follow a Normal distribution. First, there is a large difference between the mean and the median. The mean is 48.25 and the median is 37.80. The Normal distribution is symmetric, so the mean and median should be quite close in a data set drawn from one. This data set appears to be highly skewed to the right. This can be seen by the fact that the mean is so much larger than the median. It can also be seen by the fact that the distance between the minimum and the median is 37.80 − 2 = 35.80, but the distance between the median and the maximum is 204.90 − 37.80 = 167.10.
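
The Table A lookups in T2.12 can also be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the variable names and the helper `z_score` are illustrative and are not part of the original solutions.

```python
from statistics import NormalDist

# Head circumference model from T2.12: mean 22.8 inches, standard deviation 1.1 inches.
heads = NormalDist(mu=22.8, sigma=1.1)


def z_score(x, dist):
    """Standardize x using the mean and standard deviation of dist (hypothetical helper)."""
    return (x - dist.mean) / dist.stdev


# (a) z-score and percentile for a 23.9-inch head circumference.
z_a = z_score(23.9, heads)
pct_a = heads.cdf(23.9)  # proportion of soldiers with a smaller head circumference

# (b) Proportion needing custom helmets: below 20 inches or above 26 inches.
z_left = z_score(20, heads)
z_right = z_score(26, heads)
tails = heads.cdf(20) + (1 - heads.cdf(26))

# (c) Quartiles and IQR of the head circumference distribution.
q1 = heads.inv_cdf(0.25)
q3 = heads.inv_cdf(0.75)
iqr = q3 - q1

print(f"(a) z = {z_a:.2f}, proportion below = {pct_a:.4f}")
print(f"(b) z = {z_left:.2f} and {z_right:.2f}, area in both tails = {tails:.4f}")
print(f"(c) Q1 = {q1:.3f}, Q3 = {q3:.3f}, IQR = {iqr:.3f}")
```

The computed values differ slightly from the Table A answers above, because Table A requires rounding each z-score to two decimal places (for example, using ±0.67 for the quartiles) while the code works with unrounded values.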