Math 251, 10 October 2003, Exam I
Name: ANSWERS

Instructions: Complete each of the following eight questions, and please explain and justify all appropriate details in your solutions in order to obtain maximal credit for your answers.

1. (6 pts) Classify the type of sampling used in the following examples.

(a) To conduct a poll, the Join Arnold team randomly chose 8 different prefixes in California (the first 3 digits of the telephone number) and called all households from those prefixes.
ANS: This is a cluster sample. The population was divided into groups, some groups were randomly selected, and every member of the selected groups was surveyed.

(b) To maintain quality control, a tire manufacturer tests every 100th tire that comes off the assembly line in its plant.
ANS: This is a systematic sample.

(c) To determine student attitudes toward worship requirements at La Sierra, President Geraty gave questionnaires to ten randomly selected students from each of the following groups: Freshmen, Sophomores, Juniors, Seniors and Graduate Students.
ANS: This is a stratified sample. The population was divided into groups, and a random sample from each group was selected.

2. (6 pts) Categorize the following data according to level: nominal, ordinal, interval, or ratio.

(a) The time someone goes to bed.
ANS: Interval: differences in times make sense, but ratios do not.

(b) Length of time to complete a marathon.
ANS: Ratio: differences and ratios are both meaningful.

(c) The condition of a highway: poor, acceptable, good.
ANS: Ordinal: the responses can be ranked, but differences between the ranks do not make sense.

3. (2 pts) In a set of data with more than two values, how does increasing the largest number affect the mean? How would it affect the median?
ANS: The mean is increased, but the median remains unchanged.

4. (6 pts) How will the mean, standard deviation, and coefficient of variation in Population 1 below compare with those in Population 2 below?
Explain why, but do not compute them. Notice that the data values in Population 2 are 10 times the data values in Population 1.

Pop1:   Pop2:
5       50
10      100
15      150
20      200
25      250
30      300
40      400
45      450
50      500
75      750
80      800
95      950

ANS: The mean and standard deviation of the second population are 10 times the mean and standard deviation of the first, because both the average and the distances of the data from the mean in the second population are exactly 10 times those in the first. However, the coefficients of variation of the two populations are equal, since the factors of 10 in the mean and the standard deviation cancel when computing the ratio for the coefficient of variation.

5. Consider the data (which are systolic blood pressures of 25 subjects):

105   126   146
108   126   152
110   128   166
110   130   188
112   130   190
112   130
116   132
118   134
118   136
120   140

(a) (2 pts) What class width should be chosen if you would like to have 8 classes?
ANS: First, the range divided by the number of classes is (190 - 105)/8 = 10.625. Now go up to the next whole number to make sure all data are covered by the 8 classes, so we choose a class width of 11.

(b) (8 pts) Complete the following table for this data, given that the first class has limits 105-119.

Lower   Upper   Lower      Upper                             Cumulative   Relative
Limit   Limit   Boundary   Boundary   Midpoint   Frequency   Frequency    Frequency
105     119     104.5      119.5      112        9           9            0.36
120     134     119.5      134.5      127        9           18           0.36
135     149     134.5      149.5      142        3           21           0.12
150     164     149.5      164.5      157        1           22           0.04
165     179     164.5      179.5      172        1           23           0.04
180     194     179.5      194.5      187        2           25           0.08

(c) (10 pts) Find the median, Q1, and Q3 for the above data. Draw a box and whisker plot for the data. You may draw it horizontally if you prefer.
ANS: The median is in the ((25 + 1)/2)th place, i.e. the 13th place, so the median is 128. The first quartile is the median of the lowest 12 numbers, which is 114 (the average of the 6th and 7th data values). The third quartile is the median of the highest 12 numbers, which is 138 (the average of the 19th and 20th data values).
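As a quick check on the median and quartiles just computed, here is a minimal Python sketch. It assumes the median-of-halves rule used in the answer (when the count is odd, the median itself is left out of both halves). The helper name `quartiles` is hypothetical and chosen only for this illustration.

```python
from statistics import median

# Systolic blood pressures of the 25 subjects from question 5.
pressures = [
    105, 108, 110, 110, 112, 112, 116, 118, 118, 120,
    126, 126, 128, 130, 130, 130, 132, 134, 136, 140,
    146, 152, 166, 188, 190,
]


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # the lowest `half` values
    upper = data[n - half:]    # the highest `half` values
    return median(lower), median(data), median(upper)


q1, med, q3 = quartiles(pressures)
print(q1, med, q3)  # 114.0 128 138.0
```

Other quartile conventions (for example, interpolation-based percentiles in some software packages) can give slightly different values for Q1 and Q3, so the rule above is only one reasonable choice.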
High: 190
Third Quartile: 138
Median: 128
First Quartile: 114
Low: 105

See text for method of constructing box and whisker plot. The lower whisker starts at 105 and goes to 114, the low edge of the box is at 114, the upper edge is at 138, and the line in the box is at 128. The upper whisker starts at 138 and goes up to 190.

(d) (5 pts) Construct a relative frequency histogram for the data using the table in (b).

[Figure: Relative Frequency Histogram. Vertical axis: Relative Frequency, from 0 to 0.4 in steps of 0.05. Horizontal axis: Systolic Pressure, with class boundaries 104.5, 119.5, 134.5, 149.5, 164.5, 179.5, 194.5.]

6. (6 pts) At a large university, 4000 students wrote a mathematics placement test one day. Given that $\sum x = 85{,}400$ and $\sum x^2 = 1{,}904{,}290$ for these test scores, find the mean, the population standard deviation, and the coefficient of variation.
ANS: The mean is $85{,}400/4000 = 21.35$.
$SS_x = 1{,}904{,}290 - (85{,}400)^2/4000 = 81{,}000$
The population standard deviation is $\sigma = (SS_x/N)^{1/2} = (81{,}000/4000)^{1/2} = 4.5$.
The coefficient of variation is $CV = (4.5/21.35) \cdot 100\% \approx 21.08\%$.

7. (4 pts) A population is known to have a mean of 50 and standard deviation of 15. Use Chebyshev's theorem to find the interval in which you would expect to find at least 8/9 of the data.
ANS: Chebyshev's theorem says that at least $1 - 1/k^2$ of the data lies within $k$ standard deviations of the mean; with $k = 3$ this is at least 8/9 of the data. Therefore, we compute the interval $50 \pm 3(15)$, which is (5, 95). So at least 8/9 of all data in the population should be in the interval (5, 95).

8. (5 pts) Professor Henry Wiggins decided to study the ages of the children attending the nursery at his school. He constructed the following frequency distribution for ages in months.

x   10-19   20-29   30-39
f   20      55      25

Please help Professor Wiggins by estimating the mean and sample standard deviation for the ages of children at the nursery.
ANS: Use the formulas $\sum x \approx \sum xf$ and $\sum x^2 \approx \sum x^2 f$, where on the right-hand side we use the class midpoints and frequencies.
Then
$\sum x \approx 14.5 \cdot 20 + 24.5 \cdot 55 + 34.5 \cdot 25 = 2500$
$\sum x^2 \approx 14.5^2 \cdot 20 + 24.5^2 \cdot 55 + 34.5^2 \cdot 25 = 66{,}975$
and
$SS_x \approx 66{,}975 - (2500)^2/100 = 4475$
Therefore, the mean is approximately $2500/100 = 25$, and the standard deviation is approximately $s = (SS_x/(n-1))^{1/2} = (4475/99)^{1/2} \approx 6.7232448$.
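The grouped-data estimate in question 8 can be reproduced with a short Python sketch. It assumes, as the answer does, that every value in a class is represented by the class midpoint. The function name `grouped_mean_and_sd` is a hypothetical helper introduced only for this illustration.

```python
from math import sqrt


def grouped_mean_and_sd(midpoints, frequencies):
    """Estimate the mean and sample standard deviation from grouped data.

    Uses the computational formula SS_x = sum(x^2 f) - (sum(x f))^2 / n,
    with class midpoints standing in for the individual data values.
    """
    n = sum(frequencies)
    sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    sum_x2f = sum(x * x * f for x, f in zip(midpoints, frequencies))
    ss_x = sum_x2f - sum_xf ** 2 / n
    return sum_xf / n, sqrt(ss_x / (n - 1))


# Classes 10-19, 20-29, 30-39 have midpoints 14.5, 24.5, 34.5.
mean, sd = grouped_mean_and_sd([14.5, 24.5, 34.5], [20, 55, 25])
print(mean, round(sd, 4))  # 25.0 6.7232
```

Dividing by `n - 1` gives the sample standard deviation asked for here; a population standard deviation, as in question 6, would divide by `n` instead.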