Homework 8 – Solutions

Chapter 5D

Review Questions.

6. What is an exponential scale? When is an exponential scale useful?
An exponential scale is one in which each unit corresponds to a power of 10. In general, such scales are useful for displaying data that vary over a huge range of values.

14. Education and Earnings. Examine Figure 5.15, which shows the unemployment rate and the median weekly earnings for eight different levels of education.
a. Briefly describe how earnings vary with educational attainment.
Earnings increase with level of education.
b. Briefly describe how unemployment varies with educational attainment.
The unemployment rate decreases with level of education.
c. What is the percentage increase in weekly earnings when a professional degree is compared to a bachelor’s degree?
($1522 − $978) / $978 ≈ 56%
d. How much more likely is a high school dropout to be unemployed than a worker with a bachelor’s degree?
A high school dropout is 9.0/2.8 ≈ 3.2 times as likely to be unemployed as someone with a bachelor’s degree.
e. On average, people spend about 45 years in the work force before retiring. Based on the data in Figure 5.15, how much more would the average college graduate (bachelor’s degree) earn during these 45 years than the average high school graduate?
At 52 weeks per year for 45 years, the average college graduate would earn 52 × 45 × ($978 − $591) = $905,580 more than the average high school graduate.

18. Federal Spending. Figure 5.31 shows the major spending categories of the federal budget over the last 50 years. (Payments to individuals includes Social Security and Medicare; net interest represents interest payments on the national debt; all other represents non-defense discretionary spending.) Interpret the stack plot and discuss some of the trends it reveals.
a. Find the percentage of the budget that went to net interest in 1990, 1995, and 2005.
About 15% of the budget went to net interest in 1990 and 1995; it dropped to about 8% in 2005.
b.
Find the percentage of the budget that went to defense in 1960, 1980, and 2005.
In 1960, about 52% of the budget went to defense; it dropped to about 23% in 1980, and to about 20% in 2005.
c. Find the percentage of the budget that went to payments to individuals in 1980, 2000, and 2005.
Payments to individuals were about 47% of the budget in 1980, 57% in 2000, and about 54% in 2005.

27. Comparing Earnings. Figure 5.40 compares the average weekly earnings of men and women. Identify any misleading aspects of the display. Draw the display in a fairer way.
The graph does not start at zero, making it appear that women earn about a quarter of what men earn. A fairer graph would start from zero.
[Figure: average weekly earnings for men and women, with the vertical axis running from $0 to $900]

28. Braking Distances. Figure 5.41 shows the braking distance for three different cars. Discuss the ways in which it might be deceptive. How much greater is the braking distance of a Lincoln than the braking distance of a Lexus? Draw the display in a fairer way.
As presented in Figure 5.41, the Lincoln superficially appears to have a braking distance about twice that of the Lexus because the x-scale starts at 170 feet. In fact, the difference is only about (207 − 187) = 20 feet, or (207 − 187)/187 ≈ 10.7%. A fairer way of representing the data might start the graph at zero:
[Figure: braking distance (feet) for the Lincoln, Saab, and Lexus, with the horizontal axis running from 0 to 200]

29. Cell Phone Users. The following table shows the number of cell phone subscribers in the United States for selected years between 1990 and 2007. Display the data using both an ordinary vertical scale and an exponential vertical scale. (Hint: For the exponential scale, use tick marks at 1 million, 10 million, and 100 million.) Which graph is more useful? Why?
Year    Subscribers (millions)
1990    5.3
1995    33.8
1997    55.3
1998    69.2
1999    86.0
2000    109.5
2001    128.3
2002    140.8
2003    158.7
2007    255.0

[Figure: two plots of subscribers (millions) against year, 1990 to 2005. Ordinary scale: vertical axis ticks from 50 to 250. Exponential scale: vertical axis ticks at 1, 10, and 100.]

Either scale has its uses: the ordinary scale shows steadily increasing growth, while the exponential scale illustrates that this growth is somewhat less than exponential.

Chapter 6A

Does it make sense? Decide whether each of the following statements makes sense (or is clearly true) or does not make sense (or is clearly false). Explain your reasoning.

11. The distribution of grades was left-skewed, but the mean, median, and mode were all the same.
Does not make sense. If the mean, median, and mode are the same, the distribution should be symmetric.

Mean, Median, and Mode. Compute the mean, median, and mode of the following data sets.

14. Body temperatures (in degrees Fahrenheit) of randomly selected normal and healthy adults:
98.6 98.6 98.0 98.0 99.0 98.4 98.4 98.4 98.4 98.6
The mean, median, and mode are 98.44, 98.4, and 98.4, respectively.

20. Margin of Victory. The data set below gives the margin of victory in the NFL Super Bowl games for 2002-2009.
3 12 11 3 3 27 3 27
a. Find the mean and median margin of victory.
The mean is 11.125 points and the median is 7 points.
b. Identify the outliers in the set. If you eliminate the outliers on the high side, what are the new mean and median?
The outliers are the two margins of 27 points. After eliminating them, the mean is 5.83 points and the median is 3 points.

Approximate Average. State, with an explanation, whether the mean, median, or mode gives the best description of the following averages.

23. The average number of times that people change jobs during their careers.
The distribution is probably right-skewed by a few people who change jobs frequently, so the median will give a better description.

Describing Distributions. Consider the following distributions.
a.
How many peaks would you expect from the distribution? Explain.
b. Would you expect the distribution to be symmetric, left-skewed, or right-skewed? Explain.
c. Would you expect the variation of the distribution to be small, moderate, or large? Explain.

27. The exam scores for 100 students when 40 students got an F, 25 students got a D, and 20 students got a C.
a. There would be one peak on the far left (F’s).
b. The distribution would be right-skewed because the scores trail off to the right.
c. The variation is large.

32. The weights of cars at a dealership at which about half of the inventory consists of compact cars and half of the inventory consists of sport utility vehicles.
a. There would likely be two peaks, one for compact cars, and one for sport utility vehicles.
b. The distribution would be symmetric.
c. The variation would be moderate because, although the difference in weight between compact cars and sport utility vehicles is large, the differences between compact cars or between sport utility vehicles tend to be small.

Smooth Distributions. Through each histogram, draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.

35. Times between 300 eruptions of Old Faithful geyser in Yellowstone National Park, shown in Figure 6.6.
[Figure: histogram titled "Times Between Eruptions of Old Faithful"; horizontal axis: time (minutes), 50 to 110; vertical axis: frequency, 0 to 60]
The distribution has two peaks (i.e., it is bimodal), with no symmetry and large variation.

36. Time until failure for a sample of 108 computer chips that failed, shown in Figure 6.7.
[Figure: histogram titled "Failure Time of Computer Chips"; horizontal axis: time (months), 0 to 12; vertical axis: frequency, 0 to 50]
The distribution has one peak and is right-skewed. Although most of the data is clustered around its peak, the distribution has considerable spread, so it has moderate variation.

39. Family Income.
Suppose you study family income in a random sample of 300 families. You find that the mean family income is $55,000; the median is $45,000; and the highest and lowest incomes are $250,000 and $2400, respectively.
a. Draw a rough sketch of the income distribution, with clearly labeled axes. Describe the distribution as symmetric, left-skewed, or right-skewed.
Sketches will vary, but, because the mean is larger than the median and because there are large outliers, the distribution is likely right-skewed with a single peak at the mode.
b. How many families in the sample earned less than $45,000? Explain how you know.
About 150 families (50%) earned less than $45,000 because that value is the median income.
c. Based on the given information, can you determine how many families earned more than $55,000? Why or why not?
Other than to say that it is less than half, we don’t have enough information to determine how many families earned more than $55,000.

Chapter 6B

Does it make sense? Decide whether each of the following statements makes sense (or is clearly true) or does not make sense (or is clearly false). Explain your reasoning.

9. For the 30 students who took the test, the high score was 80, the median was 74, and the low score was 40.
Makes sense. Supposing half the students scored 74 or better, this is entirely possible.

12. The mean gas mileage of the compact cars we tested was 34 miles per gallon, with a standard deviation of 5 gallons.
Does not make sense. The standard deviation and mean should have the same units.

Comparing Variations. Consider the following data sets.
a. Find the mean, median, and range for each of the two data sets.
b. Give the five-number summary and draw a boxplot for each of the two data sets.
c. Find the standard deviation for each of the two data sets.
d. Apply the range rule of thumb to estimate the standard deviation of each of the two data sets. How well does the rule work in each case? Briefly discuss why it does or does not work well.
e.
Based on all your results, compare and discuss the two data sets in terms of center and variation.

15. The table below gives the cost of living index for six East Coast cities and six West Coast cities (using the ACCRA index, where 100 represents the average cost of living for all participating cities with a population over 1.5 million).

East Coast Cities              West Coast Cities
Atlanta           98.2         Los Angeles      155.8
Baltimore        108.7         Portland         113.2
Boston           135.4         San Diego        144.8
Miami            111.5         San Francisco    182.4
New York City    216.0         San Jose         156.0
Washington, DC   140.0         Seattle          122.7

a. For the East Coast, the mean, median, and range are 134.97, 123.45, and 117.8, respectively; for the West Coast, they are 145.82, 150.3, and 69.2, respectively.
b. For the East Coast, the five-number summary is (98.2, 108.7, 123.45, 140.0, 216.0), while for the West Coast, it is (113.2, 122.7, 150.3, 156.0, 182.4). The boxplots are then:
[Figure: boxplots for the East Coast and West Coast on a common axis running from 80 to 220]
c. The standard deviation is 42.86 for the East Coast, and 25.06 for the West Coast.
d. For the East Coast, the approximate standard deviation is 117.8/4 = 29.45, which is a far cry from the true value of 42.86, largely because of New York. On the West Coast, the approximate value is 69.2/4 = 17.3, which is also fairly inaccurate compared with the true value of 25.06.
e. The cost of living is lower, though more varied, on the East Coast.

Understanding Variation. The following exercises give four data sets consisting of seven numbers.
a. Make a histogram for each set.
b. Give the five-number summary and draw a boxplot for each set.
c. Compute the standard deviation for each set.
d. Based on your results, briefly explain how the standard deviation provides a useful single-number summary of the variation in these data sets.

20. The following sets of numbers all have a mean of 6:
{6,6,6,6,6,6,6}, {5,5,6,6,6,7,7}, {5,5,5,6,7,7,7}, {3,3,3,6,9,9,9}
a. [Figure: histograms of the four sets]
b.
The five-number summaries for each of the sets are (in order): (6,6,6,6,6); (5,5,6,7,7); (5,5,6,7,7); (3,3,6,9,9). The boxplots are:
[Figure: boxplots of Sets 1 through 4 on a common axis running from 0 to 8]
c. The standard deviations for the sets are (in order): 0.000, 0.816, 1.000, and 3.000.
d. Looking at part c, we can see that the standard deviation increases with each successive set, reflecting the increasing variation in the data.

23. Portfolio Standard Deviation. The book Investments by Zvi Bodie, Alex Kane, and Alan Marcus claims that the returns for investment portfolios with a single stock have a standard deviation of 0.55, while the returns for portfolios with 32 stocks have a standard deviation of 0.325. Explain how the standard deviation measures the risk in these two types of portfolios.
A lower standard deviation suggests more certainty in the expected return, and a lower risk.
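
The standard deviations quoted in Exercise 20, part c, can be checked with a short script. The following is only a sketch in Python, assuming the sample standard deviation (division by n − 1) is the intended formula, since that choice reproduces the values given; the names data_sets and std_devs are illustrative and do not come from the text.

```python
import statistics

# The four data sets from Exercise 20 (each has a mean of 6).
data_sets = [
    [6, 6, 6, 6, 6, 6, 6],
    [5, 5, 6, 6, 6, 7, 7],
    [5, 5, 5, 6, 7, 7, 7],
    [3, 3, 3, 6, 9, 9, 9],
]

# statistics.stdev computes the sample standard deviation (divides by n - 1).
# Assumption: this is the formula the exercise intends.
std_devs = [round(statistics.stdev(s), 3) for s in data_sets]

for number, (s, sd) in enumerate(zip(data_sets, std_devs), start=1):
    print(f"Set {number}: mean = {statistics.mean(s)}, standard deviation = {sd:.3f}")
```

Under that assumption the rounded results are 0.000, 0.816, 1.000, and 3.000, matching part c. Using the population formula instead (statistics.pstdev) would give smaller values for Sets 2 through 4.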