Name:                    uNID:

First Midterm Exam (MATH1070 Spring 2012)

Instructions: This is a one-hour exam. You can use a notecard. Calculators are allowed, but other electronics are prohibited.

1. [40pts] Multiple Choice Problems

In a statistics class with 136 students, the professor records how much money each student has in his or her possession during the first class of the semester. The following histogram is of the data collected. Based on this histogram, answer questions 1) to 3).

1) The number of students with under USD 10 in their possession is closest to
A. 50.
B. 70.
C. 60.
D. 40.
Answer: C

2) The percent of students with over USD 20 in their possession is about
A. 10%.
B. 20%.
C. 30%.
D. 40%.
Answer: B

3) From the histogram, which of the following is true?
A. The mean is much larger than the median.
B. The mean is much smaller than the median.
C. It is impossible to compare the mean and median for these data.
D. The mean and median are approximately equal.
Answer: A

A sample was taken of the verbal SAT scores of applicants to a California State College. The following is a boxplot of the scores. Based on this boxplot, answer questions 4) and 5).

4) Based on this boxplot, the interquartile range is closest to
A. 500.
B. 200.
C. 600.
D. 400.
Answer: B

5) If 25 points were added to each score, then the interquartile range of the new scores would
A. remain unchanged.
B. be increased by 5.
C. be increased by 25.
D. be increased by 625.
Answer: A

6) A Normal density curve has which of the following properties?
A. It has a peak centered above its mean.
B. It is symmetric.
C. The spread of the curve is proportional to the standard deviation.
D. All of the above.
Answer: D

Refer to the following scatterplot. For each menu item at a fast food restaurant, fat content (in grams) and number of calories were recorded. A scatterplot of these data is given below.

7) A plausible value for the correlation between calories and fat is
A. +0.9.
B. −0.9.
C. −1.2.
D. +0.2.
Answer: A

8) Which of the following is not true of the correlation coefficient r?
A. −1 ≤ r ≤ 1.
B. If r = 0, then there is no relationship between x and y.
C. If r is the correlation between x and y, then r is also the correlation between y and x.
D. Multiplying all data values (x's and y's) by 10 will have no impact on r.
Answer: B (r = 0 means only that there is no linear relationship; a strong nonlinear relationship can still have r = 0. Statement D is true, since r does not change when x and y are rescaled.)

2. [12pts] A company produces packets of soap powder labeled Giant Size 32 Ounces. The actual weight of soap powder in such a box has a Normal distribution with a mean of 33 oz and a standard deviation of 0.7 oz. To avoid having dissatisfied customers, the company says a box of soap is considered underweight if it weighs less than 32 oz. To avoid losing money, it labels the top 5% (the heaviest 5%) overweight.

1) What proportion of boxes is underweight (i.e., weighs less than 32 oz)?
2) How heavy does a box have to be for it to be labeled overweight?

1. Let X denote the weight of a box. Then we want to know the proportion of boxes such that X < 32. The corresponding z-score is

$$Z = \frac{X - 33}{0.7} = \frac{32 - 33}{0.7} = -1.43.$$

From the table of the standard normal cumulative proportions, we find that the proportion for X < 32 is 0.0764.

2. Let $x_0$ be the threshold for overweight. Then the proportion corresponding to $X \ge x_0$ is 5%, or equivalently the proportion corresponding to $X < x_0$ is 95%. From the table of the standard normal cumulative proportions, we find that the z-score corresponding to 0.95 is 1.645 (both 1.64 and 1.65 are O.K.). Therefore

$$x_0 = 0.7(1.645) + 33 = 34.1515.$$

3. [10pts] The following are the heights (in inches) of 25 students in a given class. Draw the histogram.

51   62   68
53   63   69
55   63   70
55   64   70
57   66   72
59   66   74
60   67   78
60   68
62   68

Since there are 25 observations, it is suggested to use $\sqrt{25} = 5$ bins for our histogram. (It's O.K. to use a different number of bins as long as that number is neither too big nor too small.) The range is 78 − 51 = 27. Thus the bin size should be around 27/5 = 5.4, i.e., about 6.
In fact, it is more natural to use 6 bins with bin size 5 here. The following is the frequency table:

bins            frequency
50 ≤ x < 55     2
55 ≤ x < 60     4
60 ≤ x < 65     7
65 ≤ x < 70     7
70 ≤ x < 75     4
75 ≤ x < 80     1

Here is the histogram:

4. The following are the grades of 18 students in a given exam.

(a) [4pts] Make a stemplot.

Here we draw a stemplot with split stems, i.e., the stem 6− represents 60 to 64 and the stem 6+ represents 65 to 69. The stemplot is given as follows:

6− | 0 3
6+ | 9
7− | 4
7+ | 6 6 7 8 9
8− | 2 3
8+ | 5 6 8
9− | 0 2
9+ | 7 9

(b) [10pts] Find the five-number summary (min, Q1, median, Q3, max).

Since there are 18 observations, the median is the average of the 9th and 10th observations, i.e., (79 + 82)/2 = 80.5. Since the median is not an observation in the data set, the lower half consists of the 9 observations from 60 to 79. Then the first quartile, which is the median of the lower half, is the 5th observation, which is 76. Similarly, the third quartile is the 5th observation of the upper half (82 to 99), which is 88. Therefore the five-number summary is

min    Q1    median    Q3    max
60     76    80.5      88    99

(c) [6pts] Are there any potential outlier(s) according to the 1.5×IQR rule?

We have IQR = Q3 − Q1 = 88 − 76 = 12, and 1.5 × IQR = 12(1.5) = 18. Since Q1 − 18 = 58 < 60 and Q3 + 18 = 106 > 99, there is no outlier according to the 1.5 × IQR rule.

5. A student wonders if people of similar heights tend to date each other. She measures herself, her dormitory roommate, and the women in the adjoining rooms; then she measures the next man each woman dates. Here are the data (heights in inches).

Women x    66    64    66
Men y      72    68    70

(a) [4pts] What is the mean of the heights of these three women? What about the men?

We have

$$\bar{x} = \frac{66 + 64 + 66}{3} = 65.333 \quad \text{and} \quad \bar{y} = \frac{72 + 68 + 70}{3} = 70.$$

(b) [8pts] Compute the standard deviation of the heights of these 3 men by completing the following table. Use your calculator only to add, subtract, multiply, divide, square or take the square root of numbers.
$y_i$                    72      68      70
$y_i - \bar{y}$          2       −2      0
$(y_i - \bar{y})^2$      4       4       0

Therefore the standard deviation of y is

$$s_y = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2} = \sqrt{\frac{1}{3-1}(4 + 4 + 0)} = 2.$$

Now find the standard deviation of the height for these 3 women by the same procedure.

$x_i$                    66       64        66
$x_i - \bar{x}$          0.667    −1.333    0.667
$(x_i - \bar{x})^2$      0.444    1.778     0.444

Therefore the standard deviation of x is

$$s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{1}{3-1}(0.444 + 1.778 + 0.444)} = 1.155.$$

(c) [6pts] Find the correlation coefficient r between the height of men and women.

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y} = \frac{1}{3-1}\left(\frac{0.667}{1.155} \cdot \frac{2}{2} + \frac{-1.333}{1.155} \cdot \frac{-2}{2} + \frac{0.667}{1.155} \cdot \frac{0}{2}\right) = 0.866$$
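
The hand calculations in Problem 5 can be double-checked with a few lines of Python. The sketch below is only a verification aid and was not part of the exam; the helper names `mean`, `sample_sd`, and `correlation` are chosen here for illustration. It uses the same n − 1 divisor as the formulas above.

```python
import math

# Heights in inches, copied from the data table in Problem 5.
women = [66, 64, 66]
men = [72, 68, 70]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation, using the n - 1 divisor."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


s_x = sample_sd(women)
s_y = sample_sd(men)
r = correlation(women, men)

# Rounded to three decimals for comparison with the hand-computed values.
print(round(s_x, 3), round(s_y, 3), round(r, 3))
```

Rounded to three decimals, the values agree with the hand-computed $s_x = 1.155$, $s_y = 2$, and $r = 0.866$. Small differences in later decimals can appear because the hand calculation rounds the deviations before squaring.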