Exe 2.
Elementary school: x̄ = 648.1, median = 655, no mode, midrange = 666.5, range = 1059, sample variance = 101433, sample standard deviation = 318.5.
Secondary school: x̄ = 264.3, median = 279.5, no mode, midrange = 233.5, range = 381, sample variance = 16968.6, sample standard deviation = 130.3.
The number of elementary schools is more variable than the number of secondary schools, since its sample standard deviation is larger (318.5 vs. 130.3).

Exe 4.
Score      Freq. f   Midpoint Xm   f·Xm   f·Xm²
478-504    4         491           1964   964324
505-531    6         518           3108   1609944
532-558    2         545           1090   594050
559-585    2         572           1144   654368
586-612    2         599           1198   717602
Sum        16                      8504   4540288
Sample mean x̄ = 8504/16 = 531.5, variance = (4540288 − 8504²/16)/15 = 1360.8, standard deviation = 36.9. The modal class is the second class (505-531). The distribution is right-skewed.

Exe 8. The weighted mean of the payoffs is $4,700.

Exe 10. These two data sets have different units, so we use the coefficient of variation (CV) to compare their variation. The CV of the numbers of textbooks, (5/16) × 100% = 31.25%, is larger than the CV of the ages, 18.6%, so the first data set is more variable than the second one.

Exe 12. Ordered data: 3, 4, 5, 15, 16, 17, 19, 23, 24, 31, 33 (n = 11 observations).
Percentile rank of x = (number of values below x + 0.5) / n × 100%.
The percentile ranks corresponding to the above 11 values are 5%, 14%, 23%, 32%, 41%, 50%, 59%, 68%, 77%, 86%, 95% respectively.
The value that corresponds to the 40th percentile is at position 11 × 40/100 = 4.4, rounded up to 5, which is 16.
Boxplot: min = 3, Q1 = 5, median = 17, Q3 = 24, max = 33.

Exe 14.
a) Ordered data: 400, 506, 511, 514, 517, 521 (6 observations). Q1 = 506, Q3 = 517, IQR = Q3 − Q1 = 11, (Q1 − 1.5·IQR, Q3 + 1.5·IQR) = (489.5, 533.5). So 400 is an outlier.
b)-d) No outliers.

Exe 16. To estimate the percentage of data points that fall within a particular range, one can use the empirical rule or Chebyshev’s theorem. The empirical rule often gives a more accurate estimate, but it requires the assumption of a bell-shaped distribution, while Chebyshev’s theorem applies to any data set.
The data set in this exercise is not assumed to be bell-shaped, so Chebyshev’s theorem should be used.
a) The interval $47,300 to $69,700 is within k = 1 standard deviation of the mean. Chebyshev’s theorem says that at least 1 − 1/k² = 0% of data points fall in this interval. This bound is uninformative: Chebyshev’s theorem tells us nothing in this case.
b) $80,900 is k = (80,900 − 58,500)/11,200 = 2 standard deviations above the mean. According to Chebyshev’s theorem, at least 1 − 1/k² = 75% of data points are within k = 2 standard deviations of the mean. Therefore at most 25% of data points fall outside that interval. That is, at most 25% of workers earn more than $80,900.
c) k = 3.705. At most 1/k² = 7.28%.

Exe 18. k = 2.4. At least 1 − 1/k² = 82.6%.

Exe 20. The relative position of a data point in a data set depends on the mean and the standard deviation of the data set. In the first test, a grade of 82 corresponds to z = (82 − 85)/6 = −0.5. In the second test, a grade of 56 corresponds to z = −0.8. The grade in the first test has the better relative position because its z-score is higher (−0.5 > −0.8): it lies fewer standard deviations below the mean.

Exe 22.
“Before” data set: min = 12, Q1 = 19.5, median = 30, Q3 = 34.25, max = 38.
“After” data set: min = 12, Q1 = 14.25, median = 18, Q3 = 23.5, max = 32.

Exe 23. The distribution is bell-shaped, so the empirical rule can be used. About 68% of the times are expected to fall within 1 standard deviation of the mean, that is, between 23.7 and 35.7.
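
The Chebyshev bounds in Exe 16 and Exe 18 and the first z-score in Exe 20 can be checked numerically. The following is a minimal Python sketch, assuming the mean of 58,500 and standard deviation of 11,200 used in Exe 16; the function names chebyshev_lower_bound and z_score are illustrative and do not come from the exercises.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # The bound 1 - 1/k^2 is informative only for k > 1; clamp it at 0 otherwise.
    return max(0.0, 1.0 - 1.0 / k**2)


def z_score(x: float, mean: float, std: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std


# Exe 16: mean and standard deviation assumed from the solution text.
mean, std = 58_500, 11_200

k_a = z_score(69_700, mean, std)  # upper end of the interval in part a)
k_b = z_score(80_900, mean, std)  # salary threshold in part b)

print(f"a) k = {k_a:.1f}: at least {chebyshev_lower_bound(k_a):.0%} inside")
print(f"b) k = {k_b:.1f}: at least {chebyshev_lower_bound(k_b):.0%} inside, "
      f"at most {1 - chebyshev_lower_bound(k_b):.0%} outside")
print(f"c) k = 3.705: at most {1 - chebyshev_lower_bound(3.705):.2%} outside")
print(f"Exe 18, k = 2.4: at least {chebyshev_lower_bound(2.4):.1%} inside")
print(f"Exe 20, first test: z = {z_score(82, 85, 6):.1f}")
```

Clamping the bound at zero makes part a) explicit: for k = 1 the theorem guarantees nothing, which matches the remark in the solution.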