Chapter 3 Data Description

Section 3-3 Measures of Variation

Range, Variance, Sample Variance, Sample Standard Deviation, Shortcut Formula

Section 3-3 Exercise #7
The number of incidents where police were needed for a sample of ten schools in Allegheny County is 7, 37, 3, 8, 48, 11, 6, 0, 10, 3. Assume the data represent samples.
Find the range.
Use the shortcut formula for the unbiased estimator to compute the variance and standard deviation.
Are the data consistent or do they vary? Explain.

Finding the Sample Variance and Standard Deviation for Grouped Data

Section 3-3 Exercise #21
The data show the number of murders in 25 selected cities. Find the variance and standard deviation.

Number      f
27-90      13
91-154      2
155-218     0
219-282     5
283-346     0
347-410     2
411-474     0
475-538     1
539-602     2

Worksheet columns: Class, Xm, f, f · Xm, f · Xm² (Xm is the class midpoint)

Section 3-3 Exercise #33
The mean of a distribution is 20 and the standard deviation is 2. Use Chebyshev’s theorem to answer each.
a. At least what percentage of the values will fall between 10 and 30?
b. At least what percentage of the values will fall between 12 and 28?

a. Subtract the mean from the larger value: 30 - 20 = 10. Divide by the standard deviation to get k: 10/2 = 5.
b. Subtract the mean from the larger value: 28 - 20 = 8. Divide by the standard deviation to get k: 8/2 = 4.

Chebyshev’s theorem
At least 1 - 1/k² of the values fall within k standard deviations of the mean, where k is greater than 1. For part a, 1 - 1/5² = 0.96, so at least 96% of the values fall between 10 and 30. For part b, 1 - 1/4² = 0.9375, so at least 93.75% of the values fall between 12 and 28.

The Empirical (Normal) Rule
Chebyshev’s theorem applies to any distribution regardless of its shape. However, when a distribution is bell-shaped (or what is called normal), the following statements, which make up the empirical rule, are true.
Approximately 68% of the data values will fall within 1 standard deviation of the mean.
Approximately 95% of the data values will fall within 2 standard deviations of the mean.
Approximately 99.7% of the data values will fall within 3 standard deviations of the mean.

Section 3-3 Exercise #41
The average U.S. yearly per capita consumption of citrus fruits is 26.8 pounds. Suppose that the distribution of fruit amounts consumed is bell-shaped with a standard deviation equal to 4.2 pounds. What percentage of Americans would you expect to consume more than 31 pounds of citrus fruit per year?

Since 26.8 + 4.2 = 31, a consumption of 31 pounds lies 1 standard deviation above the mean. By the empirical rule, approximately 68% of consumption values are within 1 standard deviation of the mean, which leaves 32% outside that interval. Because a bell-shaped distribution is symmetric, half of that 32%, or about 16%, of Americans would be expected to consume more than 31 pounds of citrus fruit per year.

Section 3-4 Measures of Position

A z score or standard score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation. The symbol for a standard score is z. The formula is

z = (value - mean) / standard deviation

For a sample, z = (X - X̄) / s.

Section 3-4 Exercise #13
Which of the following exam scores has a better relative position?
a. A score of 42 on an exam with X̄ = 39 and s = 4
b. A score of 76 on an exam with X̄ = 71 and s = 3

Percentile Formula
Percentile = (number of values below X + 0.5) / (total number of values) × 100

Section 3-4 Exercise #22
Find the percentile rank of each weight in the data set. The weights are in pounds.
Data: 78, 82, 86, 88, 92, 97

Section 3-4 Exercise #23
Using the same weights in pounds (78, 82, 86, 88, 92, 97), what value corresponds to the 30th percentile?

Section 3-5 Exploratory Data Analysis

The Five-Number Summary and Boxplots
A boxplot is a graph of a data set obtained by drawing a horizontal line from the minimum data value to Q1, drawing a horizontal line from Q3 to the maximum data value, and drawing a box whose vertical sides pass through Q1 and Q3 with a vertical line inside the box passing through the median or Q2.

Section 3-5 Exercise #1
Identify the five-number summary and find the interquartile range.
8, 12, 32, 6, 27, 19, 54

Data arranged in order: 6, 8, 12, 19, 27, 32, 54
Minimum: 6
Median: 19
Maximum: 54
Q1: 8 (the median of the values below the median)
Q3: 32 (the median of the values above the median)
Interquartile Range: 32 - 8 = 24

Section 3-5 Exercise #9
Use the boxplot to identify the maximum value, minimum value, median, first quartile, third quartile, and interquartile range.

Information Obtained from a Boxplot
1. a. If the median is near the center of the box, the distribution is approximately symmetric.
   b. If the median falls to the left of the center of the box, the distribution is positively skewed.
   c. If the median falls to the right of the center, the distribution is negatively skewed.
2. a. If the lines are about the same length, the distribution is approximately symmetric.
   b. If the right line is larger than the left line, the distribution is positively skewed.
   c. If the left line is larger than the right line, the distribution is negatively skewed.

Section 3-5 Exercise #15
These data are the number of inches of snow reported in randomly selected cities for September 1 through January 10. Construct a boxplot and comment on the skewness of the data.
9.8 8.0 13.9 4.4 3.9 21.7 15.9 3.2 11.7 24.8 34.1 17.6
Data arranged in order: 3.2, 3.9, 4.4, 8.0, 9.8, 11.7, 13.9, 15.9, 17.6, 21.7, 24.8, 34.1

Section 3-5 Exercise #16
These data represent the volumes in cubic yards of the largest dams in the United States and in South America. Construct a boxplot of the data for each region and compare the distributions.

United States    South America
125,628          311,539
92,000           274,026
78,008           105,944
77,700           102,014
66,500           56,242
62,850           46,563
52,435
50,000
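
Worked check for Section 3-3 Exercise #7. The shortcut formula for the unbiased sample variance, s² = [n·ΣX² - (ΣX)²] / [n(n - 1)], can be tried out with a short Python sketch. The function name `shortcut_variance` is made up for this illustration; `statistics.variance` from the standard library is used only as a cross-check.

```python
import math
import statistics


def shortcut_variance(values):
    """Unbiased sample variance via the shortcut (computational) formula."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (n * sum_x_squared - sum_x ** 2) / (n * (n - 1))


# Incident counts for the ten schools in Exercise #7.
incidents = [7, 37, 3, 8, 48, 11, 6, 0, 10, 3]

data_range = max(incidents) - min(incidents)
variance = shortcut_variance(incidents)
std_dev = math.sqrt(variance)

print("range:", data_range)
print("variance:", round(variance, 1))
print("standard deviation:", round(std_dev, 1))
# Cross-check against the definitional formula in the standard library.
print("agrees with statistics.variance:",
      math.isclose(variance, statistics.variance(incidents)))
```

Comparing the standard deviation with the size of the typical data value is one way to approach the "consistent or vary" question.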
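
Worked check for Section 3-3 Exercise #21. For grouped data, the same shortcut idea uses class midpoints: s² = [n·Σ(f·Xm²) - (Σ f·Xm)²] / [n(n - 1)]. The sketch below fills in the worksheet columns Xm, f·Xm and f·Xm². It assumes the eighth class is 475-538, since an upper limit of 539 would overlap the next class; the function name `grouped_variance` is hypothetical.

```python
import math

# (lower limit, upper limit, frequency) for each class in Exercise #21.
# Assumption: the eighth class is 475-538, so that classes do not overlap.
classes = [
    (27, 90, 13), (91, 154, 2), (155, 218, 0),
    (219, 282, 5), (283, 346, 0), (347, 410, 2),
    (411, 474, 0), (475, 538, 1), (539, 602, 2),
]


def grouped_variance(class_table):
    """Sample variance for grouped data using class midpoints."""
    n = sum(f for _, _, f in class_table)
    sum_f_xm = 0.0
    sum_f_xm_squared = 0.0
    for lower, upper, f in class_table:
        xm = (lower + upper) / 2          # class midpoint
        sum_f_xm += f * xm                # f · Xm column
        sum_f_xm_squared += f * xm ** 2   # f · Xm² column
    return (n * sum_f_xm_squared - sum_f_xm ** 2) / (n * (n - 1))


total_cities = sum(f for _, _, f in classes)
grouped_var = grouped_variance(classes)
grouped_sd = math.sqrt(grouped_var)

print("n:", total_cities)
print("variance:", round(grouped_var, 1))
print("standard deviation:", round(grouped_sd, 1))
```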
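
Worked check for Section 3-3 Exercise #33. Chebyshev's theorem guarantees that at least 1 - 1/k² of the values lie within k standard deviations of the mean (k > 1). A minimal sketch, with the hypothetical helper name `chebyshev_minimum_fraction`, assuming the interval is symmetric about the mean:

```python
import math


def chebyshev_minimum_fraction(mean, std_dev, lower, upper):
    """Minimum fraction of values in [lower, upper] by Chebyshev's theorem."""
    if not math.isclose(mean - lower, upper - mean):
        raise ValueError("interval must be symmetric about the mean")
    k = (upper - mean) / std_dev  # number of standard deviations
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


part_a = chebyshev_minimum_fraction(20, 2, 10, 30)  # k = 5
part_b = chebyshev_minimum_fraction(20, 2, 12, 28)  # k = 4

print(f"a. at least {part_a:.2%}")
print(f"b. at least {part_b:.2%}")
```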
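
Worked check for Section 3-4 Exercise #13. The z score, (value - mean) / standard deviation, puts the two exam scores on a common scale, and the larger z score marks the better relative position. The helper name `z_score` is chosen only for this sketch.

```python
def z_score(value, mean, std_dev):
    """Standard score: how many standard deviations a value is from the mean."""
    return (value - mean) / std_dev


z_a = z_score(42, 39, 4)  # exam a
z_b = z_score(76, 71, 3)  # exam b

better = "b" if z_b > z_a else "a"

print("z for exam a:", round(z_a, 2))
print("z for exam b:", round(z_b, 2))
print("better relative position: exam", better)
```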
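
Worked check for Section 3-4 Exercises #22 and #23. The sketch below rests on two assumptions, because the slide's formula does not survive in the text: the percentile rank is taken as (number of values below X + 0.5) / n × 100, and the value at the p-th percentile is located with c = n·p/100, rounding c up when it is not whole and averaging the c-th and (c+1)-th values when it is. Both function names are hypothetical.

```python
def percentile_rank(values, x):
    """Percentile rank of x: (count below x + 0.5) / n * 100 (assumed formula)."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) / len(values) * 100


def value_at_percentile(values, p):
    """Value at the p-th percentile (integer p, 0 < p < 100), assumed rule."""
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(n * p, 100)  # c = n * p / 100, kept in integers
    if remainder:
        # c is not whole: round up and take that position (1-based).
        return ordered[whole]
    # c is whole: average the c-th and (c + 1)-th values.
    return (ordered[whole - 1] + ordered[whole]) / 2


weights = [78, 82, 86, 88, 92, 97]

ranks = [round(percentile_rank(weights, w)) for w in weights]
thirtieth = value_at_percentile(weights, 30)

for w, r in zip(weights, ranks):
    print(f"{w} lb: percentile rank {r}")
print("30th percentile value:", thirtieth)
```

Other textbooks and software packages use different percentile conventions, so results can differ slightly from this rule.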
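
Worked check for Section 3-5 Exercise #1. A five-number summary can be computed by sorting the data and taking medians. This sketch assumes the convention that Q1 and Q3 are the medians of the values below and above the overall median (the median itself is left out when n is odd); the function name `five_number_summary` is made up for the example.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves quartiles."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )


data = [8, 12, 32, 6, 27, 19, 54]
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print("five-number summary:", minimum, q1, median, q3, maximum)
print("interquartile range:", iqr)
```

The same function can be applied to the snowfall data in Exercise #15 or to each region in Exercise #16 before drawing the boxplots.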