STATISTICS TEAM QUESTIONS February Invitational 2016 at Sickles HS

Statistics Team Question #1. NO CALCULATOR
The stem plot below summarizes the number of gold medals earned by 40 countries in the Winter Olympics for 1924-1998.

0 | 8 2 7 9 9 0 2 0 1 1 1 0 0 1 1 0 1 0 0 0 1 0 0 0
1 | 9 8 8
2 | 9 5 7 1
3 | 9 8 9 5
4 | 3
5 | 9
6 | 1
7 | 8
8 | 8
Key: 7 | 1 means 71 medals

Part A: Write THE LETTER(S) of each of the following descriptions that fits the data in the stem plot shown: {R = skewed to the right, L = skewed to the left, S = symmetric, T = has 2 modes}. Your answer should be one or more LETTERS.
Part B: Write the median of the data represented by the stem plot shown.
Part C: For the data x in the stem plot shown that satisfy {x | 10 ≤ x ≤ 40}, find the interquartile range (IQR).
Part D: True or False: For the countries that won LESS than 10 medals, the mean is less than 3. (Write the word, True or False, as your answer.)

Statistics Team Question #2. NO CALCULATOR
Part A: A data set with seven data points has a mean of 4. One of the seven values in the data set is 7. What numerical contribution did the data point 7 make to the variance of this sample?
Part B: The mean age of twelve members of a club at a meeting is 37. Ralph, who is age 24, arrived late. What is the mean age of the 13 members of the club at the meeting after Ralph arrived?
Part C: In the circumstances for Part B, if we add the information that everyone in the club is the same age except Ralph, find the standard deviation (σ) of the ages of the population of the 13 members at the meeting, to the nearest whole number.
Part D: In a room, there are four persons who weigh 100 lbs and one person who weighs 120 lbs. Find the mean of the following numbers: the mean of the five weights, the median of the five weights, the mode of the five weights, and the range of the five weights.
----------------------------------------
Question #3. NO CALCULATOR
Suppose events X and Y are independent, the probability of X is 0.6, and the probability of Y is 0.1.
A = the probability that both X and Y occur. Give your answer in reduced fraction form.
B = the probability that either X or Y occurs. Give your answer in reduced fraction form.
C = the probability that X occurs given Y. Give your answer in reduced fraction form.
D = the probability that X occurs and Y does not. Give your answer in reduced fraction form.
----------------------------------------
Question #4. NO CALCULATOR
Given the set of ordered pairs below and its linear regression model in y = ax + b form:
{ (0, 16), (1, 13), (2, 10), (3, 7), (4, 4) }
A = a in the equation above
B = b in the equation above
C = the correlation coefficient
D = the coefficient of determination
----------------------------------------
Question #5. NO CALCULATOR
Assume the scores of an Intelligence Quotient Test are collected from a group of 100 students, and that the scores are normally distributed with a mean of 600 and a standard deviation of 50. Using the Empirical Rule,
A = the score of a student at the 97.5th percentile
B = the score of a student at the 16th percentile
C = the score of a student at the 50th percentile
D = the variance of the data set
----------------------------------------
Question #6. NO CALCULATOR
The power for a hypothesis test run at a 5% significance level was 0.75.
A = the probability of a Type I error for this test, written as a decimal.
B = the probability of a Type II error for this test, written as a decimal.
The summary to the right is information about a men’s long jump competition, for 22 contestants.
It gives the minimum and maximum jumps, quartile information, and some single-variable statistics. Assume the data is normally distributed and the given mean and standard deviation are μ (mu) and σ (sigma), respectively.
C = the percent of the data, to the tenth of a percent, that lie between 281.5 and 298.
D = the z-score for the data point 324.4, to the nearest tenth.
----------------------------------------
Question #7. CALCULATOR OK.
Scores on an SAT test are approximately normally distributed with a mean of 500 and a standard deviation of 90.
A. What proportion of scores are above 600? (round to nearest thousandth)
B. What is the 25th percentile of the scores? (round to nearest whole number)
C. What proportion of the scores are between 420 and 520? (round to nearest thousandth)
D. The WISC IQ test scores are normally distributed with Mu = 100 and Sigma = 15. Stephen Hawking had an IQ test score of 160. What SAT score would have a z-score comparable to the WISC score of 160? (round to nearest tenth)
----------------------------------------
Question #8. CALCULATOR OK.
The following grouped frequency table of the income, x, of 30 employees at a local small business (in thousands of dollars) is given:

Income             Frequency
$26 < x ≤ $28      2
$28 < x ≤ $30      11
$30 < x ≤ $32      8
$32 < x ≤ $34      5
$34 < x ≤ $36      4

A = the relative cumulative frequency of the $28 < x ≤ $30 class.
B = the class that contains the 66.6th percentile (answer with an interval as shown in row 1 of the table).
C: Using class midpoints as representative values, give the mean for this data, with units in thousands of dollars, as ##.## to the nearest hundredth.
D: If the boss’ income is $300,000, find the mean income for all 31 workers, written in dollars (as the boss’ income was just written), to the nearest dollar.
----------------------------------------
Question #9. CALCULATOR OK.
A. A target is circular with a 12 inch diameter. What is the exact probability that you throw a dart and it lands 2 inches from the center of the target, given that it lands on the target randomly?
B. A target is circular with a 12 inch diameter. What is the exact probability of throwing a dart and having it land farther than 2 inches from the center of the target, given that it lands on the target randomly?
C. A game has four cards with amounts $4, $6, $10 and $20. You pay $5 to play the game and randomly receive one of the four cards, which shows the amount of your winnings. What is the expected gain or loss of the game to the nearest hundredth of a dollar? (Use a negative to indicate a loss, positive to indicate a gain.)
D. A game has four cards with amounts $1, $2, $2 and $20. If you pay to play the game and randomly receive one of the cards, which shows the amount of your winnings, what is the fair price to pay to play the game, to the nearest hundredth of a dollar?
----------------------------------------
Question #10. CALCULATOR OK.
The line of best fit (linear regression line) of a set of data is y = -0.22x + 8.
A. If each x-coordinate of points (x, y) of the data set is increased by 10 and each y-coordinate is decreased by 3, then give the slope of the new line of best fit as a decimal.
B. If each x-coordinate of points (x, y) of the data set is increased by 10 and each y-coordinate is decreased by 3, then give the y-intercept of the new line of best fit as a decimal.
C. If each x-coordinate of points (x, y) of the data set is increased by 3 and each y-coordinate is increased by 3, then give the slope of the new line of best fit as a decimal.
D. If each x-coordinate of points (x, y) of the data set is increased by 3 and each y-coordinate is increased by 3, then give the y-intercept of the new line of best fit as a decimal.
----------------------------------------
Question #11. CALCULATOR OK.
A lottery sells tickets and claims that 1 in 9 tickets is a winning ticket.
A = the probability that you will win at least once if you purchase 4 tickets. Write your answer as a decimal to the thousandth place.
B = the probability that you will win at least twice if you purchase 10 tickets. Write your answer as a decimal to the thousandth place.
C = the least number of tickets you would have to buy to have over 50% probability of winning at least once. Your answer must be a whole number.
D = the probability of buying 10 tickets and not winning at all. Write your answer as a decimal to the thousandth place.
----------------------------------------
Question #12. CALCULATOR OK.
The table below categorizes 40 students in the Chess Club by hours worked after school weekly and the grade they earned in Quarter I in Statistics class. The Chess Club has a total of 40 members, all represented in the table.

Variable: Hours worked at job after school weekly

Hours Worked             A    B    C    D
< 5 hrs                  5    4    4    1
Between 5 and 10 hrs     4    0    3    2
> 10 hrs                 6    3    2    6

Part A. Based on the data in the table, what is the probability that a student in the Chess Club works less than 5 hours per week?
Part B. Based on the data in the table, what is the probability that a student in the Chess Club earns a grade of a B or C for Quarter I Statistics?
Part C. Based on the data in the table, what is the probability that a student in the Chess Club works more than 10 hours per week, given that they earned a grade of a C in Statistics in Quarter I?
Part D. Based on the data in the table, what is the probability that a student worked between 5 and 10 hours per week, given that they earned higher than a C in Quarter I Statistics?
----------------------------------------
Question #13. CALCULATOR OK.
There is a negative relationship between the number of hours that a student uses his/her phone per week (x) and the grade they earn in their A. P. Statistics class (y). The mean time that a student spends on their phone is 21 hours per week with a standard deviation of 2 hours. The mean grade in A. P. Statistics class is 81 with a standard deviation of 6. There is only one Stat course in the school, and so “A. P. Statistics” and “Statistics” refer to the one class. The coefficient of determination between phone use hours and grade in the class is 0.8836. Use this information to find the following.
A = the slope of the line of best fit between the weekly phone hours and the Statistics grade.
B = the y-intercept of the line of best fit between the weekly phone hours and the Statistics grade.
C: Matilda spends 25 hours on her phone weekly. Her grade in the Statistics class is 65. What is the value of Matilda’s residual?
D: Ralph spends 15 hours on his phone weekly. What is his predicted grade in A. P. Statistics?
----------------------------------------
Question #14. CALCULATOR OK.
On a 30-question test, Paula got a score of 68, Samuel scored 22, Quentin scored 80 and Victor scored 77. The test was normally distributed with a mean of 60 and standard deviation of 8.
A = Paula’s z-score, written as a decimal to the hundredth place.
B = Samuel’s z-score, written as a decimal to the hundredth place.
C = Quentin’s percentile score, written as a decimal to the hundredth place. (note: 81.25 percentile should be written as 81.25, not 0.813)
D = Victor’s percentile score, written as a decimal to the hundredth place.

ANSWERS!!
Questions #1-6 are non-calculator. Questions #7-14 are calculator.

#    Part A          Part B                       Part C       Part D          Notes
1    R               4.5 or 9/2                   19           True
2    1.5 or 3/2      36                           3            81
3    3/50            16/25                        3/5          27/50           Answers must be in fraction form
4    -3              16                           -1           1
5    700             550                          600          2500
6    0.05            0.25                         25           0.5 or 1/2      A, B must be decimal form
7    0.133           439                          0.401        860.0 or 860
8    13              $30 < x ≤ $32 or (30, 32]    30.87        39548
9    0               8/9 or 0.8 (repeating)       5.00 or 5    6.25
10   -0.22           7.2                          -0.22        11.66
11   0.376           0.307                        6            0.308
12   0.35 or 7/20    0.575 or 23/40               3/13         13/31
13   -2.82           140.22                       -4.72        97.92
14   1               -4.75                        99.38        98.32

Where noted in the key, answers must be in decimal form.

SOLUTIONS!!!
1. A: Plotting the data points gives a general skewed-right shape, so R is correct. There is one mode, 0, so T is false.
B: There are 40 numbers, so the median is the average of the 20th and the 21st values. These are found in the first row. In order: 000000000001111111227899. The answer is (2+7)/2 = 4.5.
C: The values from 10 to 40 are 18, 18, 19, 21, 25, 27, 29, 35, 38, 39, 39. The IQR = Q3 - Q1 = 38 - 19 = 19.
D: Again look at the first row to see 24 numbers with a sum of 44. The mean is 44/24, about 1.8, which is less than 3. TRUE.

2. A: For a sample, the contribution would be (7 - 4)^2/(7 - 1) = 9/6 = 3/2 = 1.5.
B: The total before Ralph’s arrival is 37 x 12 = 444. After Ralph arrives, the total is 444 + 24 = 468. The mean is then 468/13 = 36.
C: We know that before Ralph’s arrival the mean was 37. If they all have the same age, they are all 37 years old. After Ralph’s arrival the mean is 36, so the deviations are:

x           37 (each of the 12 members)    24 (Ralph)
x - mean    1                              -12
Square      1                              144

The standard deviation of the population is the square root of the sum of the last row, 12(1) + 144 = 156, divided by 13. 156/13 = 12, and the square root of 12 is about 3.46, which is 3 to the nearest whole number. Answer = 3.
D: The set {100, 100, 100, 100, 120} gives a mean of 520/5 = 104. Median = 100. Mode = 100. Range = 20. The mean of these statistics is 324/4 = 81.

3. A: P(X and Y) = (3/5)(1/10) = 3/50.
B: P(X or Y) = 3/5 + 1/10 - 3/50 = 30/50 + 5/50 - 3/50 = 32/50 = 16/25.
C: P(X | Y) = (3/50)/(1/10) = 3/5.
D: P(X and not Y) = (3/5)(9/10) = 27/50.

4. Using { (0, 16), (1, 13), (2, 10), (3, 7), (4, 4) } we see that all data points lie on the line y = -3x + 16.
A = slope of the line = -3.
B = y-intercept of the line = 16.
C = the correlation coefficient = -1, since the slope -3 < 0 and we have a perfect correlation.
D = coefficient of determination = (-1)^2 = 1.

5. A: Using the Empirical Rule, 95% of the data falls within 2 SD of the mean. The student at the 97.5th percentile has 2.5% of the scores above and 97.5% below, so the score is mean + 2(SD) = 600 + 2(50) = 700.
B: 68% of the data falls within 1 SD of the mean, 34% on each side, so 50 - 34 = 16 percent lies more than 1 SD below the mean. The score is 600 - 50 = 550.
C: The 50th percentile means that 50% lie below the score. That is the mean, 600.
D: The variance is 50 squared, or 2500.

6. A: The probability of a Type I error in hypothesis testing is predetermined by the significance level. A = 0.05.
B: Power = 1 - P(Type II error), so 0.75 = 1 - P and P = 0.25.
C: If the min is 281.5 and Q1 = 298, then 25% of the data lie between the two scores.
D: 324.4 - 314.096 = 10.3 approximately. 10.3/20.7 is approximately one half, so z = 0.5 to the nearest tenth.

7. A: z = (600 - 500)/90 = 1.11111. The area to the right of this is 0.13326, using normalcdf(600,999999,500,90) on the TI-84, which gives answer 0.133.
B. invNorm(0.25,500,90) = 439.2959225 gives answer 439.
C. normalcdf(420,520,500,90) = 0.4008981546 gives answer 0.401.
D.
(160 - 100)/15 = 4 is the z-score, or 4 SD above the mean. For the SAT data given, this corresponds to 500 + 90(4) = 500 + 360 = 860.

8. A: The cumulative frequency through the $28 < x ≤ $30 class is 2 + 11 = 13 of the 30 employees (13/30, about 0.433, as a relative frequency).
B: 30 x (2/3) = 20, so we look for the class where the cumulative frequency reaches 20. Counting from the left, the cumulative frequency at the end of the $30 < x ≤ $32 interval is 21, so the class $30 < x ≤ $32 contains this percentile.
C: 27(2) + 29(11) + 31(8) + 33(5) + 35(4) = 926. Divide by 30 to get 30.8666... for answer 30.87.
D: From part C, take the 926 and add 300 for the boss. Divide by 31 to get 39.548387, which means the mean is 39548 dollars to the nearest dollar.

9. A. 0. The points exactly 2 inches from the center form a circle, which has zero area.
B. The target has radius 6 and area 36π; the region within 2 inches of the center has area 4π. The probability is (36π - 4π)/(36π) = 32/36 = 8/9.
C. Weighted wins/losses are -1(0.25) + 1(0.25) + 5(0.25) + 15(0.25) = 5.00.
D. Fair price = 1(0.25) + 2(0.25) + 2(0.25) + 20(0.25) = 6.25.

10. A. y = -0.22(x - 10) + 5 has slope -0.22.
B. y = -0.22(x - 10) + 5 has y-intercept -0.22(-10) + 5 = 7.2.
C. y = -0.22(x - 3) + 11 has slope -0.22.
D. y = -0.22(x - 3) + 11 has y-intercept -0.22(-3) + 11 = 11.66.

11. A. The probability that you do NOT win at least once is the probability that you lose all four times, (8/9)^4. So we take 1 minus that answer: 1 - (8/9)^4 = 0.3757 --> 0.376.
B. The probability is 1 - P(losing every time) - P(winning exactly once) = 1 - (8/9)^10 - 10(1/9)(8/9)^9 = 0.307.
C. We need 1 - (8/9)^n > 0.5, so (8/9)^n < 0.5. Taking logs, n log(8/9) < log(0.5), and dividing by the negative number log(8/9) gives n > 5.88, so you must buy 6 tickets.
D. (8/9)^10 = 0.308.

12. A. The first row adds to 14. This gives 14/40 as our probability: 0.35 or 7/20.
B. Add the B and C columns and divide by 40. Answer = 0.575 or 23/40.
C. P(more than 10 hrs AND earned a C)/P(earned a C) = 3/13.
D. P(worked between 5 and 10 hrs AND earned higher than a C)/P(earned higher than a C) = 13/31.

13. A: The line of best fit is y - ȳ = r(sy/sx)(x - x̄), with r = -√0.8836 = -0.94. The negative comes from the negative relationship. So y - 81 = -0.94(6/2)(x - 21). Slope = -2.82.
B: 140.22. Let x = 0 in y - 81 = -0.94(6/2)(x - 21).
C: Let x = 25 in y - 81 = -2.82(x - 21) and get y = 69.72. Residual = Observed - Predicted = 65 - 69.72 = -4.72.
D: Let x = 15 in y - 81 = -2.82(x - 21) to get y = 97.92.

14. A: (68 - 60)/8 = 1.
B: (22 - 60)/8 = -4.75.
C: normalcdf(-99999, 80, 60, 8) = 0.993790, which is the 99.38 percentile.
D: normalcdf(-99999, 77, 60, 8) gives the 98.32 percentile.
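
The calculator solutions for Questions 7, 11 and 14 can be cross-checked without a TI-84. The sketch below is an illustration and not part of the original key. It assumes Python 3.8 or later and uses the standard-library statistics.NormalDist in place of normalcdf and invNorm; all variable names are made up for this example.

```python
from math import ceil, log
from statistics import NormalDist

# Question 7: SAT scores, assumed Normal(mean=500, sd=90) as stated in the question.
sat = NormalDist(mu=500, sigma=90)
q7a = 1 - sat.cdf(600)              # proportion above 600 (normalcdf upper tail)
q7b = sat.inv_cdf(0.25)             # 25th percentile (invNorm)
q7c = sat.cdf(520) - sat.cdf(420)   # proportion between 420 and 520
q7d = 500 + 90 * (160 - 100) / 15   # SAT score with the same z-score as WISC 160

# Question 11: each ticket is assumed to win independently with probability 1/9.
p_lose = 8 / 9
q11a = 1 - p_lose ** 4                                # at least one win in 4 tickets
q11b = 1 - p_lose ** 10 - 10 * (1 / 9) * p_lose ** 9  # at least two wins in 10 tickets
q11c = ceil(log(0.5) / log(p_lose))                   # smallest n with 1 - (8/9)^n > 0.5
q11d = p_lose ** 10                                   # no wins in 10 tickets

# Question 14: test scores, Normal(mean=60, sd=8).
test = NormalDist(mu=60, sigma=8)
q14a = (68 - 60) / 8                # Paula's z-score
q14b = (22 - 60) / 8                # Samuel's z-score
q14c = 100 * test.cdf(80)           # Quentin's percentile
q14d = 100 * test.cdf(77)           # Victor's percentile

print(round(q7a, 3), round(q7b), round(q7c, 3), round(q7d, 1))
print(round(q11a, 3), round(q11b, 3), q11c, round(q11d, 3))
print(q14a, q14b, round(q14c, 2), round(q14d, 2))
```

The ceil call in q11c relies on log(0.5)/log(8/9) not being a whole number; if it were, the strict inequality would require one more ticket.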