Sample Exam #1, Math 201

1. Use the data set given below to answer all of the following questions.

14.0, 18.4, 21.6, 22.1, 23.8, 24.3, 25.9, 26.5, 27.5, 29.2, 29.3, 29.4, 29.7, 29.8, 30.2, 30.8, 31.9, 33.5

(a) Use the statistical capability of your scientific calculator to find the mean, standard deviation, and variance of the data:
$\bar{x} = 26.55$, $s = 5.05083$, variance $= 25.5109$

(b) Find by hand the first quartile $Q_1$, median $M$, third quartile $Q_3$, and the IQR.
$Q_1 = 23.8$, $M = 28.35$, $Q_3 = 29.8$, IQR $= 6$
The numbers are in order and there are 18 pieces of data, so the median is the average of the 9th and 10th pieces of data: $M = \frac{27.5 + 29.2}{2} = 28.35$. $Q_1$ is the median of the first half of the data, or the 5th piece of data, and $Q_3$ is the median of the second half of the data, or the 14th piece of data. The IQR is $Q_3 - Q_1 = 29.8 - 23.8 = 6$.

(c) Create a boxplot for these data:
[Figure: boxplot; axis marked from 15 to 30.]

(d) Create a split stemplot for these data:
First, round to the nearest whole number.
1 | 4
1 | 8
2 | 2244
2 | 668999
3 | 000124
3 |

(e) Which measure would be a better measure for the center of this distribution? Justify your choice.
Since the distribution is skewed rather than symmetric, the median is the better measure of center.

2. The histogram below shows the distribution of a set of observations:
[Figure: histogram; horizontal axis marked from 80 to 150 in steps of 10, vertical axis marked from 5 to 40 in steps of 5.]

(a) Is the distribution symmetric, skewed to the left, or skewed to the right?
The distribution is skewed to the left.

(b) Is the mean less than or greater than the median?
Since the mean follows the skew, the mean is less than the median.

(c) How many data values are in the data set?
Adding up the height of each bar, we get: 2 + 3 + 4 + 6 + 16 + 25 + 43 + 42 = 141

(d) Use the histogram to accurately estimate the median.
The position of the median is found by $\frac{n+1}{2} = \frac{141+1}{2} = \frac{142}{2} = 71$. The median is the 71st entry, so $M$ is between 130 and 140.

3. Use Table A to answer the following questions.
Find the proportion of observations from a standard Normal distribution that satisfies each of the following statements.

(a) $z \ge 3.13$
[Figure: standard Normal curve; axis marked from -3 to 3.]
From Table A, the proportion is $1 - 0.9991 = 0.0009$, or 0.09%.

(b) $-1.25 \le z \le 0.54$
[Figure: standard Normal curve; axis marked from -3 to 3.]
We look up both -1.25 and 0.54 in Table A. The proportion is $0.7054 - 0.1056 = 0.5998$, or 59.98%.

(c) 58% of the observations are greater than what z value?
[Figure: standard Normal curve; axis marked from -3 to 3.]
Since Table A gives the area to the left, we compute 100% - 58% = 42% and look for 0.42 in the body of Table A. The closest entry is 0.4168, which gives $z = -0.21$.

4. Estimate the mean and standard deviation for the Normal distribution whose density curve is shown.
[Figure: Normal density curve; horizontal axis marked from 5 to 25.]
$\mu = 16$ (This is the center.)
$\sigma = 3$ (This is the distance from the center to the inflection points, i.e., the steepest points on the curve.)

5. The scores on the math section of the SAT test for Washington students (2006) are Normally distributed with mean 532 and standard deviation 103.

(a) What proportion of students received a score between 500 and 650?
[Figure: Normal curve with 500 and 650 marked.]
First we'll find the z values for 500 and 650.
$z_1 = \frac{500 - 532}{103} \approx -0.31$
$z_2 = \frac{650 - 532}{103} \approx 1.15$
Using Table A, the proportion of students scoring less than 500 is 37.83% and the proportion scoring less than 650 is 87.49%. So the proportion of students scoring between 500 and 650 is 87.49% - 37.83% = 49.66%.

(b) 83% of the test scores were less than what value?
[Figure: Normal curve with the unknown score $x$ marked.]
The percentage closest to 83% in Table A is 83.15%. This gives a z value of $z = 0.96$. We'll use this to solve for the test score.
$0.96 = \frac{x - 532}{103}$
$98.88 = x - 532$
$x \approx 631$
So 83% of the test scores were less than 631.

6.
Match the scatterplot with the correlation values given below:
[Figure: six scatterplots, panels (a) through (f), labeled Scatterplot #1 through Scatterplot #6.]
r = -0.64 goes with Scatterplot #4
r = 1.00 goes with Scatterplot #6
r = -0.11 goes with Scatterplot #1
r = 0.68 goes with Scatterplot #3
r = 0.59 goes with Scatterplot #5
r = -0.92 goes with Scatterplot #2

7. A new teacher is analyzing whether or not there is an association between scores earned by students on their first exam in the course and the course grade earned by students at the end of the term. Exams are scored using a 100 point scale (0 to 100 points) and course grades use a 100% scale (0% to 100%). There are 35 students in the course.

(a) Decide which variable, Exam 1 Score or Course Grade, is the explanatory variable and which is the response variable. Circle the scatterplot below that matches your decision.
Explanatory Variable: Exam 1 Score
Response Variable: Course Grade
[Figure: scatterplot; vertical axis "Course Grade" marked from 0.5 to 1, horizontal axis "Exam 1" marked from 50 to 100.]

(b) Find the equation of the regression line $\hat{y} = a + bx$.
The mean and standard deviation for the Course Grade variable are 0.766 and 0.123.
The mean and standard deviation for the Exam 1 Score variable are 83.943 and 11.295.
The correlation is 0.7845.
Be very sensitive to roundoff errors.
We'll use the formula $\hat{y} = a + bx$, with $b = r \frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$.
$b = 0.7845 \left( \frac{0.123}{11.295} \right) \approx 0.008543$
$a = 0.766 - 0.008543(83.943) \approx 0.04887$
So the regression line is given by $\hat{y} = 0.04887 + 0.008543x$.

(c) Predict the course grade for a student who scores a 91 on their first exam.
Use the equation of the regression line found in (b).
$\hat{y} = 0.04887 + 0.008543(91) \approx 0.826$

(d) There is an obvious outlier present in the data set. What is its coordinate on the scatterplot? Describe what happened to the student represented by the outlier.
The coordinate is approximately (96, 0.56). The student did well on the first exam, scoring a 96 out of 100, but didn't pass the class, finishing with 56% overall.

(e) What proportion of the 35 students earned an A for the course? Any Course Grade between 90% and 100% would be assigned the A letter grade.
There are 4 students with course grades above 0.9, so $\frac{4}{35} \approx 0.114$, or 11.4%, of the students earned an A.

Depending on the quarter and instructor, some of the previous exercises may not appear until exam 2.

8. Use the data set to answer the following questions:

2, 2, 2, 4, 4, 5, 5, 5, 7, 7, 7, 7, 8, 11, 11

(a) Find the five number summary for the given data.
There are 15 pieces of data, so the median is the 8th or middle piece. $M = 5$
The median of the first half of the data is the 4th piece. $Q_1 = 4$
The median of the second half of the data is the 12th piece. $Q_3 = 7$
This gives a five number summary of Min = 2, $Q_1 = 4$, $M = 5$, $Q_3 = 7$, Max = 11.

(b) Create a boxplot for the data.
[Figure: boxplot; axis marked from 2 to 10.]

9. For the data set from the previous problem, describe the distribution of the data and determine if the five number summary was the best representation of the spread.
The distribution is skewed, so the five number summary is the best representation of the spread; the mean and standard deviation are better suited to symmetric distributions.

10. Create a split stemplot for the following data and describe the distribution:

11, 12, 16, 19, 22, 23, 25, 25, 26, 28, 29, 30, 32, 34, 38, 38

1 | 12
1 | 69
2 | 23
2 | 55689
3 | 024
3 | 88

11. For the data set in the previous problem determine the best summary and give justification for your answer. (Just state the type of summary, don't compute it.)
Either summary could be justified.
It is single peaked and somewhat symmetric, so mean and standard deviation could be used. On the other hand, there is a little bit of skewness, so the five number summary may be more desirable.

12. The length of human pregnancies from conception to birth varies according to a distribution that is approximately Normal with mean 266 days and standard deviation 16 days. Use the 68-95-99.7 rule to answer the following questions.

(a) Between what values do the lengths of the middle 99.7% of all pregnancies fall?
[Figure: Normal curve; axis marked 218, 234, 250, 266, 282, 298, 314.]
The middle 99.7% of the pregnancies will fall within 3 standard deviations of the mean.
$266 \pm 3(16) = 266 \pm 48$, which is 218 to 314 days.
That is, 99.7% of the pregnancies will fall between 218 and 314 days.

(b) How long are the longest 2.5% of all pregnancies?
[Figure: Normal curve; axis marked 218, 234, 250, 266, 282, 298, 314.]
The middle 95% of pregnancies fall within 2 standard deviations of the mean, leaving 2.5% in each tail, so the longest 2.5% of all pregnancies fall more than 2 standard deviations above the mean.
$266 + 2(16) = 298$
So the longest 2.5% of all pregnancies last 298 or more days.

13. Use Table A to answer the following questions.

(a) What percentage of human pregnancies last less than 270 days?
[Figure: Normal curve; axis marked 218, 234, 250, 266, 282, 298, 314.]
$z = \frac{270 - 266}{16} = 0.25$
From Table A, we get $P(z \le 0.25) = 0.5987$, or 59.87%.

(b) What percentage of human pregnancies last between 250 and 270 days?
[Figure: Normal curve; axis marked 218, 234, 250, 266, 282, 298, 314.]
$z_1 = \frac{270 - 266}{16} = 0.25$, and $P(z \le 0.25) = 0.5987$
$z_2 = \frac{250 - 266}{16} = -1$, and $P(z \le -1) = 0.1587$
So the proportion of human pregnancies lasting between 250 and 270 days is $0.5987 - 0.1587 = 0.44$, or 44%.

14. Below how many days do 67% of all human pregnancies last?
Using Table A, we'll find the value of z that corresponds to 0.67. We find $z = 0.44$. Solving $0.44 = \frac{x - 266}{16}$ for $x$, we get $x = 266 + 16(0.44) = 273.04$ days. So 67% of all human pregnancies last less than about 273.04 days, that is, fewer than 274 days.

15. Use the histogram to answer the following questions:
[Figure: frequency histogram; vertical axis "Frequency" marked from 0 to 9, horizontal axis "Bin" marked 20, 25, 30, 35, 40, More.]

(a) Describe the distribution of the data set.
The distribution is single peaked with a slight skew to the right.

(b) How many observations are represented by the histogram?
4 + 7 + 8 + 4 + 2 = 25

(c) Find the median and mean on the histogram and justify your answers.
Median ≈ 27 (Find either the 13th entry or the point where the areas on either side are equal.)
Mean ≈ 31 (The average gets pulled towards the skew, so it should be more than the median.)
Note: Actual answers may vary, but the relationships described above need to be true.

16. The following table gives information about a sample of sports cars that were test driven. Determine who the individuals are in the study, what the variables are, and whether each variable is categorical or quantitative.

Car                 City mpg   Highway mpg   Color
Audi TT Quattro     20         28            white
BMW M Coupe         17         25            black
Ford Thunderbird    17         23            red

The individuals are the cars being tested. The variables are city mpg, highway mpg, and color. The two mpg variables are quantitative and color is categorical.

17. Compute the mean and standard deviation for the city mpg for all the cars in the study from problem 16.
$\bar{x} = \frac{20 + 17 + 17}{3} = 18$
$s = \sqrt{\frac{(20-18)^2 + (17-18)^2 + (17-18)^2}{2}} = 1.73205$
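
As a quick check on the hand computation of the city mpg mean and standard deviation, the following Python sketch recomputes both values from the three cars in the table. The function name `sample_mean_and_sd` is made up for illustration; the sketch assumes only the standard `math` and `statistics` modules and uses the n - 1 divisor, as in the formula for s.

```python
import math
import statistics

# City mpg values for the three sports cars in the table.
city_mpg = [20, 17, 17]


def sample_mean_and_sd(values):
    """Return (mean, sample standard deviation) using the n - 1 divisor."""
    mean = statistics.mean(values)
    # Sum of squared deviations from the mean.
    squared_deviations = sum((v - mean) ** 2 for v in values)
    sd = math.sqrt(squared_deviations / (len(values) - 1))
    return mean, sd


mean, sd = sample_mean_and_sd(city_mpg)
print(f"mean = {mean}, s = {sd:.5f}")
```

The same function can be applied to any of the other data sets in this exam, for example the 18 values in problem 1, to confirm a calculator result.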