Math 251, 15 October, Exam I

Results from this test:
Number of students writing test: 26
Mean: 78.8%
Standard Deviation: 11.4%
Low: 57.6%
First Quartile: 68.5%
Median: 78.8%
Third Quartile: 88.8%
High: 95.6%

Name: (Partial Answers)

Instructions: Complete each of the following eight questions, and please explain and justify all appropriate details in your solutions in order to obtain maximal credit for your answers.

1. (2 pts) What is your birthday (Month & Day)? (This data will be used in class later, so please enter your true birthday.)

2. (2 pts) If your instructor were to compute the class mean of this test when it is graded, and use it to estimate the average for all tests taken by this class this quarter, would this be an example of descriptive or inferential statistics? Explain.
Inferential: using a sample mean (one test) to estimate the population (all tests, quizzes and assignments) mean.

3. (a) (2 pts) In a survey of a sample of parents, 53% said they protect their children from sun exposure using sunscreen. Is 53% a parameter or statistic? Explain.
Statistic: it is a numerical property of the sample of parents.
(b) (2 pts) In a union's vote, 55% voted in favor of ratifying a contract proposal. Is 55% a statistic or parameter? Explain.
Parameter: it is a numerical property of the population of union members.
(c) (1 pt) A study on attitudes about smoking is conducted at a college. The students are divided by class, and then a random sample is selected from each class. What type of sampling technique is this (e.g. simple random, convenience, stratified, systematic, cluster)?
Stratified: a random sample is taken from each class (stratum).

4. (5 points) (True or False)
(a) T The median is (generally) to the left of the mean in data that is skewed to the right.
(b) T The Empirical Rule for bell-shaped distributions says that about 95% of the data lies within two standard deviations of the mean.
(c) F The 70th percentile of a set of data is the number so that 70% of the data lie above that number, and 30% of the data are below that number. (Approximately 70% of the data lie below, and 30% above.)
(d) T The z-score for a number 4 standard deviations below the mean is -4.
(e) F Chebyshev's Theorem says that exactly 8/9 of the data in any distribution will lie within 3 standard deviations of the mean. (Not exactly 8/9, but at least 8/9.)

5. At a large university, 5000 students wrote a mathematics placement test one day. For these test scores, $\sum x = 306{,}250$ and $\sum (x - \mu)^2 = 451{,}250$.
(a) (4 pts) Find the mean and population standard deviation for these scores.
Mean: $\mu = 306{,}250 / 5000 = 61.25$
Population standard deviation: $\sigma = (451{,}250 / 5000)^{1/2} = 9.5$
(b) (2 pts) Find the test score that is 2 standard deviations below the mean.
$61.25 - 2(9.5) = 42.25$
(c) (2 pts) If the distribution is normal (bell-shaped), according to the empirical rule, what is the approximate percentile of a score that is two standard deviations above the mean?
50% + 47.5% = 97.5%; therefore, the 97.5th percentile.

6. Consider the following data of 26 numbers.

33 35 47 48 51 57 60 64 64 65
66 70 72 76 78 80 82 84 85 89
90 90 93 94 96 97

(a) (2 pts) Find the median of the data.
Because 26 is even, the median is the average of the values in the 26/2 = 13th place and the 14th place; therefore the median is (72 + 76)/2 = 74.
(b) (4 pts) Given that Q1 = 61 and Q3 = 88, find the IQR and construct a box-and-whisker plot for the data.
IQR = 88 - 61 = 27. To make the box plot, note that L = 33, Q1 = 61, Q2 = 74, Q3 = 88, and H = 97 (see text for further details on box plots).
(c) (6 pts) Construct a relative frequency histogram for the data where the first class has limits 30-44. Be sure to list all class limits and boundaries, and the class width.

| Limits | Boundaries | Frequency | Relative Freq. |
| 30-44 | 29.5-44.5 | 2 | 2/26 = .077 |
| 45-59 | 44.5-59.5 | 4 | 4/26 = .154 |
| 60-74 | 59.5-74.5 | 7 | 7/26 = .269 |
| 75-89 | 74.5-89.5 | 7 | 7/26 = .269 |
| 90-104 | 89.5-104.5 | 6 | 6/26 = .231 |

Class Width = 15
See text for details on the histogram: the boundaries (or midpoints) should be on the horizontal axis, and the heights of the bars should be the relative frequencies.

7. A doctor is interested in the relationship between age (x) and blood pressure (y) in men. So far the doctor has collected the following data.

| Age (x) | Blood Pressure (y) |
| 16 | 109 |
| 25 | 122 |
| 39 | 143 |
| 45 | 132 |
| 49 | 199 |
| 57 | 175 |
| 64 | 185 |
| 70 | 199 |

For this data: $\sum x = 365$, $\sum x^2 = 19073$, $\sum y = 1264$, $\sum y^2 = 208690$, $\sum xy = 61807$.
(a) (4 pts) Find the equation of the least squares regression line.
Slope: $m = \dfrac{8(61807) - 365(1264)}{8(19073) - 365^2} = 1.7095924$
y-intercept: $b = 1264/8 - 1.7095924(365/8) = 79.999845$
The line equation is: $y = 1.7095x + 79.9998$
(b) (2 pts) Use the regression line equation to predict the blood pressure of a 40-year-old man.
Ans: 1.7095(40) + 79.9998 = 148.4 (plug x = 40 into the line, and compute y).
(c) (2 pts) At what age is a man's expected blood pressure 140?
Ans: (140 - 79.9998)/1.7095 = 35.1 years (plug y = 140 into the line, and solve for x).
(d) (2 pts) The correlation coefficient for this data is .888. Does this indicate that there is a good linear fit? Explain.
It represents a pretty good fit. The closer the number is to 1, the better the fit to a line of positive slope, while a correlation coefficient close to 0 indicates that there is practically no linear correlation.

8. (2 pts) In studying the relation between hours of TV watched per week (x) and GPAs (y), it was found that GPAs tended to decrease as the hours of TV watched increased. Would you expect the correlation coefficient to be positive or negative for the data collected? Explain.
A negative correlation coefficient, because as the x's increase, the y's decrease.
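
The arithmetic in Questions 5, 6, and 7 can be checked with a short script. What follows is a minimal Python sketch and is not part of the original exam. All variable names are illustrative. The Question 6 list assumes the 26 values are the ones printed in that question, taken in increasing order. The regression and correlation formulas are the standard sum-based ones that match the worked answer in 7(a).

```python
import math
import statistics

# Question 5: mean and population standard deviation from the given sums.
n_students = 5000
sum_scores = 306_250        # sum of all scores
sum_sq_dev = 451_250        # sum of squared deviations from the mean
mean_score = sum_scores / n_students
sigma_score = math.sqrt(sum_sq_dev / n_students)
two_sd_below = mean_score - 2 * sigma_score

# Question 6: median and class frequencies (data assumed as listed in Q6).
data = [33, 35, 47, 48, 51, 57, 60, 64, 64, 65, 66, 70, 72,
        76, 78, 80, 82, 84, 85, 89, 90, 90, 93, 94, 96, 97]
median_q6 = statistics.median(data)
lower_limits = [30, 45, 60, 75, 90]
class_width = 15
# Count the values falling in each class [lower, lower + width).
freqs = [sum(lo <= x < lo + class_width for x in data) for lo in lower_limits]
rel_freqs = [round(f / len(data), 3) for f in freqs]

# Question 7: least squares line and correlation from the raw pairs.
ages = [16, 25, 39, 45, 49, 57, 64, 70]
pressures = [109, 122, 143, 132, 199, 175, 185, 199]
n = len(ages)
sx, sy = sum(ages), sum(pressures)
sxx = sum(x * x for x in ages)
syy = sum(y * y for y in pressures)
sxy = sum(x * y for x, y in zip(ages, pressures))

slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
intercept = sy / n - slope * sx / n
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

pred_at_40 = slope * 40 + intercept      # part (b)
age_for_140 = (140 - intercept) / slope  # part (c)

print("Q5:", mean_score, sigma_score, two_sd_below)
print("Q6:", median_q6, freqs, rel_freqs)
print("Q7:", round(slope, 7), round(intercept, 6), round(r, 3))
print("Q7 predictions:", round(pred_at_40, 1), round(age_for_140, 1))
```

Computing the sums from the raw pairs, rather than typing in the totals from the exam, also confirms that the five sums listed in Question 7 are consistent with the data table.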