March Statewide Invitational Statistics Team

Answers:
1. A = 5/9, B = 5/9, C = 0.134 or 0.135, D = Undefined
2. A = 24, B = 8, C = 9, D = 19
3. A = 1, B = 1, C = 1, D = 0
4. A = 1, B = 1, C = 1, D = 1
5. A = 4/5 or 0.8, B = 1/5 or 0.2, C = 0, D = 8/25 or 0.32
6. A = 1/115,200, B = 1/10, C = 1/120, D = 1/46,080
7. A = 5, B = 59, C = 32, D = 377/6
8. A = 1, B = 10, C = 0, D = 10
9. A = 31/75 or 0.41, B = 152/175 or 0.87, C = 10.87 or 1087/100 or 69,432/6,385, D = “NONE”
10. A = 0, B = 6, C = 0, D = 2
11. A = -0.91, B = 198, C = 0.3454, D = 793.346
12. A = 96, B = 46, C = 7th, D = 4
13. A = 0, B = 0, C = 0, D = 0.173
14. A = 1.00, B = 1.00, C = 1.00, D = 0
15. A = 2, B = 2, C = 2, D = 0

Answers & Solutions

Solutions:

1. Answers: A = 5/9, B = 5/9, C = 0.134 or 0.135, D = undefined
Solution: Parts A and B are essentially the same. The first and third quartiles of each are 2 and 7, respectively. Therefore, the QCD is (7 − 2) / (7 + 2) = 5 / 9 for each of them. For part C, the first and third quartiles are located approximately 0.67 standard deviations on each side of the mean in a Normal distribution when using the Standard Normal Distribution Chart. Thus, Q1 = 100 − 0.67(20) = 86.6 and Q3 = 100 + 0.67(20) = 113.4, and $\mathrm{QCD} = \frac{113.4 - 86.6}{113.4 + 86.6} = 0.134$ (when using the chart). Otherwise, Q1 = invNorm(0.25, 100, 20) = 86.51020501 and Q3 = invNorm(0.75, 100, 20) = 113.489795, and so $\mathrm{QCD} = \frac{113.489795 - 86.51020501}{113.489795 + 86.51020501} \approx 0.135$ when rounded. For part D, the QCD is undefined since the quartiles are located at approximately -0.67 and 0.67 in a Standard Normal Distribution, so Q1 + Q3 = 0 and the denominator of the QCD is 0.

2. Answers: A = 24, B = 8, C = 9, D = 19
Solution: The one-sample t-test has df = n − 1 = 25 − 1 = 24. Regression inference has df = n − 2 = 10 − 2 = 8 for a set of 10 data pairs. Since there are C = 10 digits, the chi-square goodness of fit test has df = C − 1 = 10 − 1 = 9. The degrees of freedom using the conservative approach to the two-sample t-test is the smaller sample size minus one: df = 20 − 1 = 19.

3.
Answers: A = 1, B = 1, C = 1, D = 0
Solution: A = B = C = 1 since these statements are all true for a sum of a finite set of independent Normally distributed random variables, which is a linear combination. D = 0 since the standard deviation is the square root of the result from C.

4. Answers: A = 1 (Yes), B = 1 (Yes), C = 1 (Yes), D = 1 (Yes)
Solution: The Empirical Rule is specifically defined for the Normal distribution, and each of the other distributions converges in distribution to the Normal distribution either as the degrees of freedom approach infinity (in the case of the Student's t-Distribution and the Chi-Square Distribution) or as the number of trials approaches infinity (as for the Binomial Distribution). This is also a direct consequence of the Central Limit Theorem.

5. Answers: A = 4/5 or 0.8, B = 1/5 or 0.2, C = 0, D = 8/25 or 0.32
Solution: A: Since the ratio of positive to negative Z-scores in the data set is 4:1, the probability of randomly selecting a data value with a positive Z-score is 4/(4+1) = 4/5 = 0.8. B: A data value below the mean has a negative Z-score; therefore, B is just the complement of part A: B = 1/5 = 0.2. C: P(Data Value = Mean) = 0. Since it is stated that none of the Z-scores are equal to 0, none of the data values in the set are equal to the mean. D: This is simply the product of the results from parts A and B, doubled: D = 2(4/5)(1/5) = 2(0.8)(0.2) = 8/25 = 0.32.

6. Answers: A = 1/115,200, B = 1/10, C = 1/120, D = 1/46,080
Solution: Let the notation “Nsd = n” represent the outcome on an N-sided die. For example, 4sd = 2 denotes rolling a 2 on the 4-sided die. Therefore: A = P(4sd = anything)*P(6sd = same as 4sd)*P(8sd = same as 4sd)*P(10sd = same as 4sd)*P(12sd = same as 4sd)*P(20sd = same as 4sd), since the 4-sided die limits the possible outcomes to six ones, six twos, six threes, and six fours and the dice are all independent.
Therefore: $A = \frac{4}{4} \times \frac{1}{6} \times \frac{1}{8} \times \frac{1}{10} \times \frac{1}{12} \times \frac{1}{20} = \frac{1}{115{,}200}$.
B = 1/10 since the only way to get a product of 0 is to roll the 0 on the 10-sided die. The results on the other dice do not matter.
$C = P(\text{prime on all dice}) = \frac{2}{4} \times \frac{3}{6} \times \frac{4}{8} \times \frac{4}{10} \times \frac{5}{12} \times \frac{8}{20} = \frac{3{,}840}{460{,}800} = \frac{1}{120}$.
Since the dice are all independent, D = P(sum on all dice is 50 | 10sd = 0) = P(sum on all five remaining dice is 50, which is the maximum possible) = P(4sd = 4, 6sd = 6, 8sd = 8, 12sd = 12, and 20sd = 20) = $\frac{1}{4} \times \frac{1}{6} \times \frac{1}{8} \times \frac{1}{12} \times \frac{1}{20} = \frac{1}{46{,}080}$.

7. Answers: A = 5, B = 59, C = 32, D = 377/6
Solution: A = 5 since each die must show a 1, with the exception of the 10-sided die, which must show a 0. B = 59 since the maximum possible is 9 on the 10-sided die, while on all other dice the maximum equals the number of sides. C = 32 since the expected value of the sum on the dice is the sum of the expected values on each die. Also, note that the mean on each die is equal to the median value on the die since they are all discrete uniform distributions. Thus we have: E(Sum) = 2.5 + 3.5 + 4.5 + 4.5 + 6.5 + 10.5 = 32. D = 377/6. The dice are all independent, so the variance of the sum on the dice is the sum of the variances of each die. The most efficient way to calculate each die's variance is to take the sum of the squares of each value on the die divided by the number of sides on the die and then subtract the square of the mean of the die, as follows:
Var(4sd) = (1 + 4 + 9 + 16) / 4 − 2.5^2 = 5 / 4
Var(6sd) = (1 + 4 + 9 + 16 + 25 + 36) / 6 − 3.5^2 = 35 / 12
Var(8sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64) / 8 − 4.5^2 = 21 / 4
Var(10sd) = (0 + 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81) / 10 − 4.5^2 = 33 / 4
Var(12sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 + 121 + 144) / 12 − 6.5^2 = 143 / 12
Var(20sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 + 121 + 144 + 169 + … + 361 + 400) / 20 − 10.5^2 = 133 / 4
The sum of the variances is 5 / 4 + 35 / 12 + 21 / 4 + 33 / 4 + 143 / 12 + 133 / 4 = 377 / 6.
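
The dice results in Solutions 6 and 7 can be cross-checked by brute arithmetic with exact fractions. The sketch below is only a verification aid, not part of the official key. It assumes, as the solutions state, that the 10-sided die is numbered 0 to 9 and every other N-sided die is numbered 1 to N. The names `DICE`, `is_prime`, `mean`, and `variance` are illustrative helpers introduced here.

```python
from fractions import Fraction
from math import prod

# Faces on each die, as described in Solutions 6 and 7 (assumption: the
# 10-sided die shows 0-9; every other N-sided die shows 1-N).
DICE = {
    4: range(1, 5),
    6: range(1, 7),
    8: range(1, 9),
    10: range(0, 10),
    12: range(1, 13),
    20: range(1, 21),
}


def is_prime(n):
    """Trial-division primality test, adequate for faces up to 20."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def mean(faces):
    """Exact mean of a fair (discrete uniform) die."""
    return Fraction(sum(faces), len(faces))


def variance(faces):
    """Exact variance: E(X^2) - [E(X)]^2."""
    return Fraction(sum(f * f for f in faces), len(faces)) - mean(faces) ** 2


# Problem 6
p_one_outcome = prod(Fraction(1, len(f)) for f in DICE.values())
shared_faces = set.intersection(*(set(f) for f in DICE.values()))
a6 = len(shared_faces) * p_one_outcome          # all six dice match
b6 = Fraction(1, len(DICE[10]))                 # product is 0 only if 10sd = 0
c6 = prod(Fraction(sum(map(is_prime, f)), len(f)) for f in DICE.values())
d6 = prod(Fraction(1, len(f)) for n, f in DICE.items() if n != 10)

# Problem 7
a7 = sum(min(f) for f in DICE.values())         # minimum possible sum
b7 = sum(max(f) for f in DICE.values())         # maximum possible sum
c7 = sum(mean(f) for f in DICE.values())        # expected sum
d7 = sum(variance(f) for f in DICE.values())    # variance of the sum

print(a6, b6, c6, d6)
print(a7, b7, c7, d7)
```

Using `Fraction` keeps every intermediate value exact, so the results can be compared directly with the fractional answers in the key rather than with rounded decimals.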

8. Answers: A = 1, B = 10, C = 0, D = 10
Solution: The completed two-way table below helps answer each part:

      | X                | X^C                | Total
Y^C   | b = P(X and Y^C) | d = P(X^C and Y^C) | P(Y^C) = b + d
Y     | a = P(X and Y)   | c = P(X^C and Y)   | P(Y) = a + c
Total | P(X) = a + b     | P(X^C) = c + d     | a + b + c + d = 1

A = a + b + c + d = 1 since the sum of all the probabilities must equal 1.
B = 1 + 2 + 3 + 4 = 10 since statements 1, 2, 3, and 4 are true while statement 5 is false, since two mutually exclusive events cannot possibly be independent. Statement 1 is true since a = P(X and Y) = 0 when X and Y are mutually exclusive. Hence, Statement 2 is true as a consequence of Statement 1 being true: P(X or Y) = a + b + a + c = b + c. Likewise, Statement 3 is true for the same reason, since d represents the probability of the complement of (X or Y).
C = 0 since if X and Y are both mutually exclusive and exhaustive, then the probability of their intersection is 0, as is the probability of the intersection of their complements. Thus: a = P(X and Y) = 0 and d = P(X^C and Y^C) = 0, which makes a + d = 0.
D = 1 + 2 + 3 + 4 = 10 since all 4 statements are true. Statements 1 and 2 are both true by definition of independence. Statement 1: a = P(X and Y) = P(X) P(Y) = (a + b)(a + c) and d = P(X^C and Y^C) = P(X^C) P(Y^C) = (c + d)(b + d). Statement 2: b = P(X and Y^C) = P(X) P(Y^C) and c = P(X^C and Y) = P(X^C) P(Y) = [1 − P(X)] [1 − P(Y^C)]. Statement 3: if P(X) = P(Y) then a + b = a + c, so b = c. Statement 4: a cannot be 0 since a = P(X and Y) = P(X) P(Y) and neither probability is 0 itself. There is one notable exception, which is if events X and Y are “essentially deterministic,” which means one of them has probability 0 and the other has probability 1. An essentially deterministic event is considered independent of any other event, even itself. Therefore, this is the only acceptable dispute of this result because it would force either b = 1 or c = 1 and all other values 0.

9.
Answers: A = 31 / 75 or 0.41, B = 152 / 175 or 0.87, C = 10.87 or 1087/100 or 69,432 / 6,385, D = “NONE”
Solution: A = P(Vendor B | Rejected) = 31 / 75 = 0.41 when rounded. B = P(not Rejected | not Vendor B) = (53 + 93 + 70 + 88) / (170 + 180) = 0.87 when rounded. C is a Chi-Square two-way table test of homogeneity of part quality distribution across each vendor. Entering the nine observed counts into a 3 x 3 matrix in the calculator and running the chi-square test produces a test statistic of $\chi^2 \approx 6.87$ (when rounded) and a p-value of 0.1427. The degrees of freedom for a 3 x 3 table are df = (r − 1)(c − 1) = (3 − 1)(3 − 1) = 4. Therefore, C = 6.87 + 4 = 10.87 or 1087/100. Alternatively, the table below shows the expected counts for each cell obtained by (row total)(column total) / (table total):

Part Quality | Perfect             | Acceptable          | Rejected
Vendor A     | 171*170/500 = 58.14 | 252*170/500 = 85.68 | 77*170/500 = 26.18
Vendor B     | 171*150/500 = 51.3  | 252*150/500 = 75.6  | 77*150/500 = 23.1
Vendor C     | 171*180/500 = 61.56 | 252*180/500 = 90.72 | 77*180/500 = 27.72

The Chi-Square test statistic is $\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \frac{(53 - 58.14)^2}{58.14} + \frac{(48 - 51.3)^2}{51.3} + \dots + \frac{(22 - 27.72)^2}{27.72} \approx 6.87$, and the Chi-Square critical value for $\alpha = 0.05$ with df = 4 is $\chi^2 = 9.49$. D = “NONE” since the test statistic does not exceed the critical value and since the p-value of 0.1427 is not less than 0.05. The null hypothesis represents the plant manager's fear, and we cannot reject the null and hence the manager's fear. Thus, he cannot justify cancelling the contract with any of the vendors.

10. Answers: A = 0, B = 6, C = 0, D = 2
Solution: Statement i.) is usually false for discrete random variables but true for continuous random variables, and neither X nor Y is specified as being discrete or continuous. Statement ii.) is true if X and Y are independent (which is not specified), and false otherwise. Statement iii.)
is true if the events X = c and Y = k are mutually exclusive (which is not specified), or if X and Y are both continuous variables and so both probabilities are 0. Statement iv.) is true when µX and µY are opposites of each other, and false otherwise. Statement v.) is true only if Y is symmetric and continuous, which are not specified. Statement vi.) is true only when X and Y are independent, which is not specified. Thus, A = 0, B = 6, C = 0, and D = 2 (which, of course, is true regardless of how the 6 statements are distributed among A, B, and C).

11. Answers: A = -0.91, B = 198, C = 0.3454, D = 793.346
Solution: $A = -\sqrt{R\text{-squared}} = -\sqrt{0.8288} \approx -0.91$. It is negative since the slope is negative: -10.6932. B = 198 since there are a total of n = 200 observations from the top 20 Hustle scores over the last 10 years and df = n − 2 for regression inference. $C = SE_b = \frac{-10.6932}{-30.9589} \approx 0.3454$ since we are solving $t = \frac{b_1 - 0}{SE_b}$ for $SE_b$. $D = 28.1664^2 \approx 793.346$, which is just the square of the Standard Error (Standard Deviation) of the Residuals.

12. Answers: A = 96, B = 46, C = 7th, D = 4
Solution: A = 96 since it is the difference between the predicted 1st place team's score and the predicted 10th place team's score, as follows: $\hat{y} = 408.9332 - 10.6932(1) = 398.24$ and $\hat{y} = 408.9332 - 10.6932(10) = 302.0012$. Their difference is 398.24 − 302.0012 = 96.2388 ≈ 96 when rounded. B = 46 since it is the residual between the team's predicted score and its actual score: $\hat{y} = 408.9332 - 10.6932(11) = 291.308$ and $e = y - \hat{y} = 337 - 291.308 = 45.692 \approx 46$ when rounded. C = 7th if we solve $337 = 408.9332 - 10.6932x$ for x, which is the Hustle Rank: $x = \frac{337 - 408.9332}{-10.6932} \approx 6.727$, or 7th place.
D = 4 since statements a, b, c, and d are true by definition of the Standard Error of the Residuals, the Coefficient of Determination, and the fact that the p-value for the slope is less than 0.01, respectively; but statement e is false because using the model to predict the Hustle Score of the 40th place team is an extrapolation and even results in a negative score!

13. Answers: A = 0, B = 0, C = 0, D = 0.173
Solution: A = B = C = 0 since they each request computing the probability of a small, finite, and discrete set of integers out of infinitely many real numbers on the interval [0, 10]. $D = 0.10(\pi - \sqrt{2}) \approx 0.173$ when rounded to the nearest thousandth.

14. Answers: A = 1.00, B = 1.00, C = 1.00, D = 0
Solution: A = 1.00 since the independent two-sample t-test statistic is $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{25 - 20}{\sqrt{\frac{160}{10} + \frac{90}{10}}} = \frac{5}{\sqrt{16 + 9}} = 1.00$. B = C = A = 1.00 since parts B and C are just linear transformations of the same set of data in part A, which will result in the exact same value for the test statistic. D = 0 since the mean of each set of Z-scores is 0, which makes the numerator of the test statistic formula 0.

15. Answers: A = 2, B = 2, C = 2, D = 0
Solution: A = 2 since it is always possible to extract the exact data values from a stemplot and any categorical data set is appropriately displayed with either a bar graph or a pie chart, making i and iii always true. B = 2 since it is possible to estimate the mean of a data set reasonably well from a boxplot if the distribution is sufficiently symmetric (making the mean and median approximately equal) and when there is sufficient detail in the scale along the axis. Also, the shape of the distribution in a frequency histogram and a relative frequency histogram is identical, so long as the same scale and class size are used in both. Otherwise, the shapes may be different. This makes iv and vi sometimes true.
C = 2 since the concepts of symmetry, skewness, and linear correlations apply only to quantitative data sets and not categorical ones. This makes ii and v nonsensical statements, and hence, always false. D = 0 since A = B = C = 2 which makes the sample variance 0.
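
As a cross-check on the arithmetic in Solutions 1, 9, 11, 12, and 14, the short script below recomputes the quoted figures using only the Python standard library. It is a sketch for verification, not the method used to produce the key. `NormalDist.inv_cdf` stands in for the calculator's invNorm. The variable and function names are illustrative. One assumption is made for Solution 9: the Rejected column total is taken as 500 − 171 − 252 = 77, the value that reproduces the expected counts 26.18, 23.1, and 27.72.

```python
from math import sqrt
from statistics import NormalDist

# Solution 1, part C: quartiles of a Normal(100, 20) distribution and the QCD.
dist = NormalDist(mu=100, sigma=20)
q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
qcd = (q3 - q1) / (q3 + q1)

# Solution 9: expected count = (row total)(column total) / (table total).
# Assumption: the Rejected column total is 500 - 171 - 252 = 77.
vendor_totals = {"A": 170, "B": 150, "C": 180}
quality_totals = {"Perfect": 171, "Acceptable": 252, "Rejected": 77}
table_total = sum(vendor_totals.values())
expected = {
    (vendor, quality): v_total * q_total / table_total
    for vendor, v_total in vendor_totals.items()
    for quality, q_total in quality_totals.items()
}
df = (len(vendor_totals) - 1) * (len(quality_totals) - 1)

# Solutions 11 and 12: fitted line, Hustle Score = 408.9332 - 10.6932 * rank.
INTERCEPT, SLOPE = 408.9332, -10.6932


def predict(rank):
    """Predicted Hustle Score for a given Hustle Rank."""
    return INTERCEPT + SLOPE * rank


r = -sqrt(0.8288)                       # sign follows the negative slope
se_slope = SLOPE / -30.9589             # from t = (b1 - 0) / SE_b
residual_variance = 28.1664 ** 2        # square of the residual standard error
gap_1st_to_10th = predict(1) - predict(10)
residual_11th = 337 - predict(11)       # actual minus predicted
rank_for_337 = (337 - INTERCEPT) / SLOPE

# Solution 14: two-sample t statistic with unpooled variances.
t_stat = (25 - 20) / sqrt(160 / 10 + 90 / 10)

print(round(qcd, 3), df, round(r, 2), round(se_slope, 4))
print(round(residual_variance, 3), round(gap_1st_to_10th),
      round(residual_11th), round(rank_for_337), t_stat)
```

Rounding is applied only when printing, so each intermediate value can also be compared against the unrounded figures in the solutions (for example, 96.2388 and 45.692 in Solution 12).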