Download Solutions - FloridaMAO

March Statewide Invitational
Statistics Team
Answers:
1. A = 5/9, B = 5/9, C = 0.134 or 0.135, D = Undefined
2. A = 24, B = 8, C = 9, D = 19
3. A = 1, B = 1, C = 1, D = 0
4. A = 1, B = 1, C = 1, D = 1
5. A = 4/5 or 0.8, B = 1/5 or 0.2, C = 0, D = 8/25 or 0.32
6. A = 1/115,200, B = 1/10, C = 1/120, D = 1/46,080
7. A = 5, B = 59, C = 32, D = 377/6
8. A = 1, B = 10, C = 0, D = 10
9. A = 31/75 or 0.41, B = 152/175 or 0.87, C = 10.87 or 1087/100 or 69,432/6,385, D = "NONE"
10. A = 0, B = 6, C = 0, D = 2
11. A = -0.91, B = 198, C = 0.3454, D = 793.346
12. A = 96, B = 46, C = 7th, D = 4
13. A = 0, B = 0, C = 0, D = 0.173
14. A = 1.00, B = 1.00, C = 1.00, D = 0
15. A = 2, B = 2, C = 2, D = 0
Answers & Solutions
Solutions:
1. Answers: A = 5/9, B = 5/9, C = 0.134 or 0.135, D = undefined
Solution: Parts A and B are essentially the same. The first and third quartiles of each are 2 and 7, respectively.
Therefore, the QCD is (7 – 2) / (7 + 2) = 5 / 9 for each of them. For part C, the first and third quartiles are located
approximately 0.67 standard deviations on each side of the mean in a Normal distribution when using the Standard
Normal Distribution Chart. Thus, Q1 = 100 – 0.67(20) = 86.6 and Q3 = 100 + 0.67(20) = 113.4, and
QCD = (113.4 – 86.6) / (113.4 + 86.6) = 0.134 (when using the chart). Otherwise, Q1 = invNorm(0.25, 100, 20) = 86.51020501
and Q3 = invNorm(0.75, 100, 20) = 113.489795, and so
QCD = (113.489795 – 86.51020501) / (113.489795 + 86.51020501) ≈ 0.135 when rounded. For part D, the QCD is undefined since the
quartiles are located at approximately -0.67 and 0.67 in a Standard Normal Distribution, which makes the denominator Q3 + Q1 equal to 0.
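The quartile coefficient of dispersion values above can be checked with a short sketch. It assumes Python's standard-library `statistics.NormalDist.inv_cdf` as a stand-in for the calculator's invNorm; the helper name `qcd` is made up for illustration.

```python
from statistics import NormalDist


def qcd(q1: float, q3: float) -> float:
    """Quartile coefficient of dispersion: (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


# Parts A and B: both data sets have quartiles 2 and 7.
qcd_ab = qcd(2, 7)

# Part C: Normal distribution with mean 100 and standard deviation 20.
dist = NormalDist(mu=100, sigma=20)
qcd_c = qcd(dist.inv_cdf(0.25), dist.inv_cdf(0.75))

# Part D: the standard Normal quartiles are symmetric about 0,
# so the denominator Q3 + Q1 is 0 and the ratio is undefined.
std = NormalDist()
denominator_d = std.inv_cdf(0.25) + std.inv_cdf(0.75)

print(qcd_ab, round(qcd_c, 3), denominator_d)
```

Using exact quantiles rather than the 0.67 chart value is why part C comes out as 0.135 here instead of 0.134.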
2. Answers: A = 24, B = 8, C = 9, D = 19
Solution: The one-sample t-test has df = n – 1 = 25 – 1 = 24. Regression inference has df = n – 2 = 10 – 2 = 8 for a set of
10 data pairs. Since there are C = 10 digits, the chi-square goodness of fit test has df = C – 1 = 10 – 1 = 9. The degrees of
freedom using the conservative approach to the two-sample t-test is the smaller sample size minus one: df = 20 – 1 = 19.
3. Answers: A = 1, B = 1, C = 1, D = 0
Solution: A = B = C = 1 since these statements are all true for a sum of a finite set of independent Normally distributed
random variables, which is a linear combination. D = 0 since the standard deviation is the square root of the result from C.
4. Answer: A = 1 (Yes), B = 1 (Yes), C = 1 (Yes), and D = 1 (Yes)
Solution: The Empirical Rule is specifically defined for the Normal distribution and each of the other distributions
converges in distribution to the Normal distribution either as the degrees of freedom approach infinity (in the case of the
Student's t-Distribution and the Chi-Square Distribution) or as the number of trials approaches infinity (as for the
Binomial Distribution). This is also a direct consequence of The Central Limit Theorem.
5. Answers: A = 4/5 or 0.8, B = 1/5 or 0.2, C =0, D = 8/25 or 0.32
Solution: A: Since the odds ratio of positive to negative Z-scores in the data set is 4:1, the probability of randomly
selecting a data value with a positive Z-score is 4/(4+1) = 4/5 = 0.8.
B: A data value below the mean has a negative Z-score; therefore, B is just the complement of part A: B = 1/5 = 0.2.
C: P(Data Value = Mean) = 0. Since it is stated that none of the Z-scores are equal to 0, none of the data values in the
set are equal to the mean.
D: This is simply the product of the results from parts A and B doubled: D = 2(4/5)(1/5) = 2(0.8)(0.2) = 8/25 = 0.32.
6. Answers: A = 1/115,200, B = 1/10, C = 1/120, D = 1/46,080
Solution: Let the notation "Nsd = n" represent the outcome on an N-sided die. For example: 4sd = 2 denotes rolling a 2
on the 4-sided die. Therefore: A = P(4sd = anything)*P(6sd = same as 4sd)*P(8sd = same as 4sd)*P(10sd = same as
4sd)*P(12sd = same as 4sd)*P(20sd = same as 4sd) since the 4-sided die limits the possible outcome to six ones, six twos,
six threes, and six fours and the dice are all independent. Therefore: A = 4/4 × 1/6 × 1/8 × 1/10 × 1/12 × 1/20 = 1/115,200.
B = 1/10 since the only way to get a product of 0 is to roll the 0 on the 10-sided die. The results on the other dice do not matter.
C = P(Prime on all dice) = 2/4 × 3/6 × 4/8 × 4/10 × 5/12 × 8/20 = 3,840/460,800 = 1/120.
Since the dice are all independent,
D = P(sum on all dice is 50 | 10sd = 0) = P(sum on all five remaining dice is 50, which is the maximum possible) =
P(4sd = 4, 6sd = 6, 8sd = 8, 12sd = 12, and 20sd = 20) = 1/4 × 1/6 × 1/8 × 1/12 × 1/20 = 1/46,080.
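As a rough cross-check of the four probabilities in this problem, the sketch below enumerates all 460,800 equally likely outcomes. It assumes, as the solution implies, that the 10-sided die is numbered 0 to 9 and every other die is numbered 1 to N; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Faces on each die, in the order 4sd, 6sd, 8sd, 10sd, 12sd, 20sd.
# Assumption: the 10-sided die shows 0-9, all others show 1..N.
DICE = [
    range(1, 5),
    range(1, 7),
    range(1, 9),
    range(0, 10),
    range(1, 13),
    range(1, 21),
]
D10 = 3  # position of the 10-sided die in each outcome tuple
PRIMES = {2, 3, 5, 7, 11, 13, 17, 19}

total = all_same = zero_product = all_prime = 0
d10_zero = d10_zero_sum50 = 0
for roll in product(*DICE):
    total += 1
    all_same += len(set(roll)) == 1          # part A: every die matches
    zero_product += 0 in roll                # part B: product is 0
    all_prime += all(face in PRIMES for face in roll)  # part C
    if roll[D10] == 0:                       # part D: condition on 10sd = 0
        d10_zero += 1
        d10_zero_sum50 += sum(roll) == 50

A = Fraction(all_same, total)
B = Fraction(zero_product, total)
C = Fraction(all_prime, total)
D = Fraction(d10_zero_sum50, d10_zero)
print(A, B, C, D)
```

Brute-force counting is slower than multiplying the per-die probabilities, but it does not rely on the independence argument, so it serves as an independent check.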
7. Answers: A = 5, B = 59, C = 32, D = 377/6
Solution: A = 5 since each die must show a 1 with the exception of the 10-sided die which must show a 0.
B = 59 since the maximum possible is 9 on the 10-sided die, while on all other dice the maximum equals the number of sides.
C = 32 since the expected value of the sum on the dice is the sum of the expected values on each die. Also, note that the
mean on each die is equal to the median value on the die since they are all discrete uniform distributions. Thus we have:
E(Sum) = 2.5 + 3.5 + 4.5 + 4.5 + 6.5 + 10.5 = 32.
D = 377/6. The dice are all independent so the variance of the sum on the dice is the sum of the variances of each die. The
most efficient way to calculate each die's variance is to take the sum of the squares of each value on the die divided by the
number of sides on the die and then subtract the square of the mean of the die as follows:
Var(4sd) = (1 + 4 + 9 + 16) / 4 – 2.5² = 5 / 4
Var(6sd) = (1 + 4 + 9 + 16 + 25 + 36) / 6 – 3.5² = 35 / 12
Var(8sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64) / 8 – 4.5² = 21 / 4
Var(10sd) = (0 + 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81) / 10 – 4.5² = 33 / 4
Var(12sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 + 121 + 144) / 12 – 6.5² = 143 / 12
Var(20sd) = (1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 + 121 + 144 + 169 + … + 361 + 400) / 20 – 10.5² = 133 / 4
The sum of the variances is 5 / 4 + 35 / 12 + 21 / 4 + 33 / 4 + 143 / 12 + 133 / 4 = 377 / 6.
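The minimum, maximum, mean, and variance of the sum can be reproduced with exact fractions. This is a minimal sketch under the same assumption about the dice (10-sided die numbered 0 to 9, all others 1 to N); the helper names `mean` and `variance` are illustrative.

```python
from fractions import Fraction

# Faces on each die: 4sd, 6sd, 8sd, 10sd (assumed 0-9), 12sd, 20sd.
FACES = [
    range(1, 5),
    range(1, 7),
    range(1, 9),
    range(0, 10),
    range(1, 13),
    range(1, 21),
]


def mean(faces):
    """Mean of a discrete uniform distribution over the given faces."""
    return Fraction(sum(faces), len(faces))


def variance(faces):
    """Var(X) = E[X^2] - (E[X])^2 for a fair die with the given faces."""
    mean_of_squares = Fraction(sum(f * f for f in faces), len(faces))
    return mean_of_squares - mean(faces) ** 2


min_sum = sum(min(f) for f in FACES)            # part A
max_sum = sum(max(f) for f in FACES)            # part B
expected_sum = sum(mean(f) for f in FACES)      # part C
variance_sum = sum(variance(f) for f in FACES)  # part D (dice are independent)

print(min_sum, max_sum, expected_sum, variance_sum)
```

Adding the variances is only valid because the dice are independent; the means would add even if they were not.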
8. Answers: A = 1, B = 10, C = 0, D = 10
Solution: The completed two-way table below helps answer each part (X^C and Y^C denote the complements of X and Y):
      | X                | X^C                | Total
Y     | a = P(X and Y)   | c = P(X^C and Y)   | P(Y) = a + c
Y^C   | b = P(X and Y^C) | d = P(X^C and Y^C) | P(Y^C) = b + d
Total | P(X) = a + b     | P(X^C) = c + d     | a + b + c + d = 1
A = a + b + c + d = 1 since the sum of all the probabilities must equal 1.
B = 1 + 2 + 3 + 4 = 10 since statements 1, 2, 3, and 4 are true while statement 5 is false since two mutually exclusive
events cannot possibly be independent. Statement 1 is true since a = P(X and Y) = 0 when X and Y are mutually
exclusive. Hence, Statement 2 is true as a consequence of Statement 1 being true:
P(X or Y) = P(X) + P(Y) – P(X and Y) = (a + b) + (a + c) – a = b + c when a = 0.
Likewise, Statement 3 is true for the same reason since d represents the probability of the complement of (X or Y).
C = 0 since if X and Y are both mutually exclusive and exhaustive, then the probability of their intersection is 0, as is
the probability of the intersection of their complements. Thus: a = P(X and Y) = 0 and d = P(X^C and Y^C) = 0, which
makes a + d = 0.
D = 1 + 2 + 3 + 4 = 10 since all 4 statements are true. Statements 1 and 2 are both true by definition of independence.
Statement 1: a = P(X and Y) = P(X) P(Y) = (a + b)(a + c) and d = P(X^C and Y^C) = P(X^C) P(Y^C) = (c + d)(b + d).
Statement 2: b = P(X and Y^C) = P(X) P(Y^C) and c = P(X^C and Y) = P(X^C) P(Y) = [1 – P(X)] [1 – P(Y^C)].
Statement 3: if P(X) = P(Y) then a + b = a + c so b = c. Statement 4: a cannot be 0 since a = P(X and Y) = P(X) P(Y)
and neither probability is 0 itself. There is one notable exception, which is if events X and Y are "essentially
deterministic", which means one of them has probability 0 and the other has probability 1. An essentially deterministic
event is considered independent of any other event, even itself. Therefore, this is the only acceptable dispute of this result
because it would force either b = 1 or c = 1 and all other values 0.
9. Answers: A = 31 / 75 or 0.41, B = 152 / 175 or 0.87, C = 10.87 or 1087/100 or 69,432 / 6,385, D = "NONE"
Solution: A = P (Vendor B | Rejected) = 31 / 75 = 0.41 when rounded. B = P (not Rejected | not Vendor B) =
(53+93+70+88) / (170+180) = 0.87 when rounded. C is a Chi-Square two-way table test of homogeneity of part quality
distribution across each vendor. Entering the nine observed counts into a 3 x 3 matrix in the calculator and running the
chi-square test produces a test statistic of χ² ≈ 6.87 (when rounded) and a p-value of 0.1427. The degrees of freedom for
a 3 x 3 table are df = (r-1)(c-1) = (3-1)(3-1) = 4. Therefore, C = 6.87 + 4 = 10.87 or 1087/100.
Alternatively, the table below shows the expected counts for each cell obtained by (row total)(column total) / (table total):
Part Quality | Vendor A            | Vendor B           | Vendor C
Perfect      | 171*170/500 = 58.14 | 171*150/500 = 51.3 | 171*180/500 = 61.56
Acceptable   | 252*170/500 = 85.68 | 252*150/500 = 75.6 | 252*180/500 = 90.72
Rejected     | 77*170/500 = 26.18  | 77*150/500 = 23.1  | 77*180/500 = 27.72
The Chi-Square Test statistic is
χ² = Σ (Observed – Expected)² / Expected = (53 – 58.14)²/58.14 + (48 – 51.3)²/51.3 + … + (22 – 27.72)²/27.72 ≈ 6.87
and the Chi-Square critical value for α = 0.05 with df = 4 is χ² = 9.49.
D = "NONE" since the test statistic does not exceed the critical value and since the p-value of 0.1427 is not less than 0.05.
The null hypothesis represents the plant manager's fear; we cannot reject the null and hence cannot reject the manager's fear. Thus,
he cannot justify cancelling the contract with any of the vendors.
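The chi-square computation can be sketched without a calculator. The solution quotes only some of the observed counts, so the two cells marked below are an assumption: they are inferred from the vendor totals (170, 150, 180) and the column totals used in the expected-count formulas. The p-value uses the closed form of the chi-square upper tail that holds specifically for df = 4.

```python
from math import exp

# Observed counts. Rows: Vendor A, B, C. Columns: Perfect, Acceptable, Rejected.
# Assumption: the cells marked "inferred" are not quoted in the solution;
# they are derived from the stated row and column totals.
observed = [
    [53, 93, 24],  # 24 inferred: 170 - 53 - 93
    [48, 71, 31],  # 71 inferred: 252 - 93 - 88
    [70, 88, 22],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability for df = 4 only: P(X > x) = exp(-x/2) * (1 + x/2).
p_value = exp(-chi2 / 2) * (1 + chi2 / 2)

print(round(chi2, 2), df, round(p_value, 4))
```

With these counts the Rejected column sums to 77 and the table total is 500, which matches the expected counts 26.18, 23.1, and 27.72.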
10. Answers: A = 0, B = 6, C = 0, D = 2
Solution: Statement i.) is usually false for discrete random variables but true for continuous random variables, and
neither X nor Y is specified as being discrete or continuous. Statement ii.) is true if X and Y are independent (which is
not specified), and false otherwise. Statement iii.) is true if the events X = c and Y = k are mutually exclusive (which is
not specified), or if X and Y are both continuous variables and so both probabilities are 0. Statement iv.) is true when µX
and µY are opposites of each other, and false otherwise. Statement v.) is true only if Y is symmetric and continuous,
which are not specified. Statement vi.) is true only when X and Y are independent, which is not specified. Thus, A = 0,
B = 6, C = 0, and D = 2 (which, of course, is true regardless of how the 6 statements are distributed among A, B, and C).
11. Answers: A = -0.91, B = 198, C = 0.3454, D = 793.346
Solution: A = –√(R Square) = –√0.8288 ≈ –0.91. It is negative since the slope is negative: -10.6932.
B = 198 since there are a total of n = 200 observations from the top 20 Hustle scores over the last 10 years and df = n – 2 for
regression inference.
C = SE(b₁) = –10.6932 / –30.9589 ≈ 0.3454 since we are solving t = (b₁ – 0) / SE(b₁) for SE(b₁).
D = 28.1664² ≈ 793.346, which is just the square of the Standard Error (Standard Deviation) of the Residuals.
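The four values in this problem follow from numbers quoted in the solution (R Square, slope, t statistic, and residual standard error). The sketch below repeats that arithmetic; the variable names are illustrative and are not taken from the original regression output.

```python
from math import sqrt

# Values quoted in the solution above.
r_squared = 0.8288
slope = -10.6932
t_stat = -30.9589
s_resid = 28.1664  # standard error of the residuals
n = 200            # top 20 scores over 10 years

# Part A: the correlation takes the sign of the slope.
r = -sqrt(r_squared) if slope < 0 else sqrt(r_squared)

# Part B: degrees of freedom for regression inference.
df = n - 2

# Part C: solve t = (b1 - 0) / SE(b1) for SE(b1).
se_slope = slope / t_stat

# Part D: square of the residual standard error.
resid_variance = s_resid ** 2

print(round(r, 2), df, round(se_slope, 4), round(resid_variance, 3))
```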
12. Answers: A = 96, B = 46, C = 7th, D = 4
Solution: A = 96 since it is the difference between the predicted 1st place team's score and the predicted 10th place team's
score as follows: ŷ = 408.9332 – 10.6932(1) = 398.24 and ŷ = 408.9332 – 10.6932(10) = 302.0012. Their
difference is 398.24 – 302.0012 = 96.2388 ≈ 96 when rounded.
B = 46 since it is the residual between the team's predicted score and its actual score: ŷ = 408.9332 – 10.6932(11) =
291.308 and e = y – ŷ = 337 – 291.308 = 45.692 ≈ 46 when rounded.
C = 7th if we solve 337 = 408.9332 – 10.6932x for x, which is the Hustle Rank: x = (337 – 408.9332) / (–10.6932) ≈ 6.727 or 7th place.
D = 4 since statements a, b, c, and d are true by definition of the Standard Error of the Residuals, the Coefficient of
Determination, and the fact that the p-value for the slope is less than 0.01, respectively; but statement e is false because
using the model to predict the Hustle Score of the 40th place team is an extrapolation and even results in a negative score!
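The predictions, the residual, and the extrapolation issue can all be read off the fitted line ŷ = 408.9332 – 10.6932x. A minimal sketch, with a hypothetical helper named `predict`:

```python
INTERCEPT = 408.9332
SLOPE = -10.6932


def predict(rank: float) -> float:
    """Predicted Hustle score for a given Hustle rank."""
    return INTERCEPT + SLOPE * rank


# Part A: predicted gap between 1st and 10th place.
gap = predict(1) - predict(10)

# Part B: residual = actual - predicted for the 11th place team scoring 337.
residual = 337 - predict(11)

# Part C: rank at which the line predicts a score of 337.
rank_for_337 = (337 - INTERCEPT) / SLOPE

# Statement e: predicting for 40th place extrapolates beyond the top 20.
extrapolated = predict(40)

print(round(gap), round(residual), round(rank_for_337), extrapolated)
```

The gap in part A equals 9 times the magnitude of the slope, since the two ranks are 9 places apart.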
13. Answers: A = 0, B = 0, C = 0, D = 0.173
Solution: A = B = C = 0 since they each request computing the probability of a small, finite, and discrete set of integers out
of infinitely many real numbers on the interval [0, 10]. D = 0.10(π – √2) ≈ 0.173 when rounded to the nearest
thousandth.
14. Answer: A = 1.00, B = 1.00, C = 1.00, D = 0
Solution: A = 1.00 since the independent two-sample t-test statistic is
t = (X̄₁ – X̄₂) / √(S₁²/n₁ + S₂²/n₂) = (25 – 20) / √(160/10 + 90/10) = 5 / √(16 + 9) = 1.00.
B = C = A = 1.00 since parts B and C are just linear transformations of the same set of data in part A which will result in
the exact same value for the test statistic. D = 0 since the mean of each set of Z-scores is 0 which makes the numerator of
the test statistic formula 0.
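The invariance claimed for parts B and C can be illustrated numerically. The sketch below computes the two-sample t statistic from summary statistics and then applies a made-up linear transformation y = a + b·x (with b > 0) to both samples; the values of `a` and `b` are hypothetical and do not come from the original problem. For part D it assumes each set of Z-scores has mean 0 and variance 1.

```python
from math import sqrt


def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Unpooled two-sample t statistic from sample means and variances."""
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)


# Part A: summary statistics quoted in the solution.
t_a = two_sample_t(25, 160, 10, 20, 90, 10)

# Hypothetical transformation y = a + b*x applied to both samples:
# means become a + b*mean and variances become b^2 * variance.
a, b = 3.0, 2.0
t_linear = two_sample_t(a + b * 25, b**2 * 160, 10,
                        a + b * 20, b**2 * 90, 10)

# Part D: Z-scores have mean 0 in each sample, so the numerator is 0.
t_z = two_sample_t(0, 1, 10, 0, 1, 10)

print(t_a, t_linear, t_z)
```

The shift a cancels in the numerator and the scale b cancels between numerator and denominator, which is why the statistic is unchanged.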
15. Answers: A = 2, B = 2, C = 2, D = 0
Solution: A = 2 since it is always possible to extract the exact data values from a stemplot and any categorical data set is
appropriately displayed with either a bar graph or a pie chart, making i and iii always true.
B = 2 since it is possible to estimate the mean of a data set reasonably well from a boxplot if it is sufficiently symmetric
(making the mean and median approximately equal) and when there is sufficient detail in the scale along the axis. Also,
the shape of the distribution in a frequency histogram and a relative frequency histogram is identical, so long as the same
scale and class size is used in both. Otherwise, the shapes may be different. This makes iv and vi sometimes true.
C = 2 since the concepts of symmetry, skewness, and linear correlations apply only to quantitative data sets and not
categorical ones. This makes ii and v nonsensical statements, and hence, always false.
D = 0 since A = B = C = 2 which makes the sample variance 0.