Review 1 Answers

1. D. The mean of the sum of two random variables is the sum of the respective means, or 45 + 15 = 60. For the standard deviation, the variance of the sum of two independent random variables is the sum of the two variances. Therefore, the variance of the sum is 15^2 + 3^2 = 234, and the standard deviation is the square root of 234, which is approximately 15.3.

2. D. The boxplot for data set A is wider overall, indicating a wider range. Also, the “box” portion of the boxplot for data set A is wider, indicating a wider interquartile range, or IQR. However, boxplots do not indicate the sample sizes of the sets they are based upon.

3. D. Cluster sampling is a process whereby a portion of the population is selected, and all values from that portion are taken as the sample. That is the case here, where a single county is used, in its entirety, as the sample.

4. E. First, since the size of one of the data sets is 10, and the distribution of the land sizes is likely not normal, the conditions for using the t-distribution are not met. Second, since the entire populations are taken into the samples, the means would be population means and have no variation. Third, since the two nations are contiguous, sharing borders, it can also be argued that the land sizes of the states and provinces are not independent.

5. B. The main purpose of a stratified sample is to divide the population into groups, called strata, so that each stratum has members represented in the sample. It is the simple random sample that gives each member of the population an equal probability of being selected. Both types of samples can be taken with any sample size, and significance tests can be done with both. Moreover, a stratified sample is often more difficult to collect.

6. E. Since each value in the set is divided by 2, the mean and standard deviation would both be divided by 2.
With the original mean being 170 and the standard deviation being 50, the new mean and standard deviation would be 85 and 25, respectively.

7. E. First, since no maximum possible score was given, it is not known whether the outlier was an extremely high value or an extremely low value. If the outlier is a low value, its removal would cause the mean to increase, while the median could stay the same or go up with the removal of the lowest value of the set. If the outlier is a high value, its removal would cause the mean to decrease, while the median could stay the same or go down with the removal of the highest value of the set. Therefore, the exact behavior of the mean and median cannot be determined from the information given.

8. C. A stratified sample involves dividing the population into a number of subsets, called strata, and taking a simple random sample from each stratum. In this case, the strata are the counties of the state.

9. A. The most likely explanation for the difference between the poll results and the actual election results is that those who participated in the poll were only those who chose to be included in the sample. This is the problem of voluntary response sampling. Although the other responses are all possible explanations for the difference in the results, they are unlikely to cause large differences. It is possible that the question was slanted, but the actual question is not given and, therefore, this cannot be assumed. Lying in the responses is possible, but would likely not account for large differences in the results. Also, respondents changing their minds might account for some difference, but likely not a large one. Last, it is likely, from the context, that the sample size was sufficient to produce publishable results.

10. D.
It was originally noted that there was an inverse relationship between the rate of involvement in volunteer activities and the rate of delinquency among this group of children. Therefore, it was assumed that if children who had a high rate of delinquency were placed in volunteer activities, their rate of delinquency would be reduced. However, this rested on the assumption that there was a cause-and-effect relationship between volunteer activities and delinquency. In reality, since the children placed in the volunteer activities did not show a reduction in delinquency, volunteer activities were not the cause of low delinquency. The high volunteer rate and low delinquency were merely correlated, both being caused by a third factor, namely the type of children who would be involved in both. Differences between boys and girls would not account for the difference, nor, necessarily, would examining other factors. Observing other samples could possibly produce differing results but may not explain why placing children into volunteer activities failed to lower delinquency rates.

11. B. In a study to measure the effect of a drug, the effects are compared to a control group, which measures the natural behavior of a condition without medication, in this case, that of migraine headaches. In order to reduce the psychological bias of patients who know they are receiving medication, the members of the control group are given a placebo, or fake pill. To further reduce this bias, the study is a blind study, whereby the patients do not know whether the pill they are given is the real medication or the placebo. Moreover, in a double-blind study, neither the patient nor the doctor administering the pill knows whether the pill is the real medication or the placebo. This is meant to eliminate any unintended clues the doctor may give to the patient to indicate whether the medication is real or not.

12. C.
With replacement, the possible samples of size 2 are {1,1}, {1,3}, {1,4}, {1,8}, {3,1}, {3,3}, {3,4}, {3,8}, {4,1}, {4,3}, {4,4}, {4,8}, {8,1}, {8,3}, {8,4}, and {8,8}, yielding possible sample means 1, 2, 2.5, 4.5, 2, 3, 3.5, 5.5, 2.5, 3.5, 4, 6, 4.5, 5.5, 6, and 8. For these sample means, the mean is μ_x̄ = 4 and the standard deviation is σ_x̄ ≈ 1.80.

13. A. Plan I is the superior plan, because both the initial selection of the deer and the second selection are based on random samples from the population. Plan II lacks randomness, because only the one-square-mile region is taken at random. This process would not take into consideration factors, such as terrain, that may affect the distribution of the deer over the region, so the deer would not be randomly chosen into the samples.

14. E. The percentile associated with a score represents the whole-number percentage of all scores that lie below that score. Therefore, if a score is in the 90th percentile, this means that 90% of all other scores are below that score.

15. D. The center box for Set A is wider than the center box for Set B, indicating a greater interquartile range for Set A. Other statements concerning the exact mean, standard deviation, set size, or presence of outliers cannot be justified, since the exact distribution of actual data points cannot be determined from the boxplot alone.

16. A. The definition of “random sample” is that each possible sample of size n is equally likely to be chosen as the sample. Therefore, if a sample is random, each member of the population is equally likely to be included in the sample. A random sample does not need to have any particular size, nor does a random sample guarantee that each stratum of the population will be represented or that the values will follow any pattern.

17. C. The goal in this case was to best represent the typical student in the large class. The typical student, representing the largest portion of the class, was a Freshman.
By assigning values 1, 2, 3, and 4 to the students, the mode, the value that appears most often, would indicate that the Freshmen make up the greatest portion of the class.

18. E. Standard deviation gives a measure of the amount of variation in the values in either a sample or a population. In this case, the standard deviation would give a measure of the amount of variation in the wattages in the sample of power strips. Low variation would indicate that the values are, on the whole, close to the mean. Therefore, standard deviation would be the best indicator of stability.

19. B. The only statement that is not true is the claim that the ranges are equal. The range for Stockton is (38 - 17) = 21, and the range for Waffletown is (37 - 18) = 19. The other statements are true. Waffletown has 5 players over 30, and Stockton has 4 players over 30. The ages of the Stockton players, if compared in order, are equal to or lower than each corresponding age of the Waffletown players, with the exception of only the oldest player. The median age for Waffletown is 24.5, and the median age for Stockton is 23.5. Stockton has 9 players in their 20s, and Waffletown has 7.

20. C. To produce a simple random sample, all values in the population must be equally likely to be chosen into the sample. Therefore, a random process should be implemented that selects values from the population as one single, nonstratified group. Using random numbers to select voters from a voter registration list would best produce a random sample from the entire population. Selecting 25 voters from four counties would not treat the population as one single group, selecting voters at regular intervals on a list is a systematic process that is not random, and selecting voters from street corners eliminates a large number of voters from the possibility of selection.

21. C.
In order to simulate a process whereby 20% of students are absent and 80% are present, assign two of the random digits to represent “absent” and eight of the random digits to represent “present.” Assigning 0 and 1 to represent “absent” and the remainder of the digits as “present” will accomplish this.
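
A few of the numerical claims in answers 1, 12, and 21 can be checked with a short script. The Python sketch below is illustrative only and is not part of the original answer key: the function name `simulate_absence_rate`, the seed, and the class size of 100,000 are assumptions chosen for the example, and the answer 1 check assumes the two random variables are independent.

```python
import itertools
import math
import random
import statistics

# Answer 1: for independent variables the variances add,
# so the SD of the sum is sqrt(15^2 + 3^2).
sd_of_sum = math.sqrt(15**2 + 3**2)

# Answer 12: list all 16 ordered samples of size 2 drawn with
# replacement from the population {1, 3, 4, 8}, then summarize
# the resulting sample means.
population = [1, 3, 4, 8]
sample_means = [(a + b) / 2
                for a, b in itertools.product(population, repeat=2)]
mean_of_means = statistics.mean(sample_means)
# Population SD, because every possible sample is listed.
sd_of_means = statistics.pstdev(sample_means)


def simulate_absence_rate(n_students, seed=0):
    """Answer 21: draw one random digit per student; 0 or 1 means absent.

    The name, seed, and interface are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    digits = [rng.randrange(10) for _ in range(n_students)]
    absent = sum(1 for d in digits if d in (0, 1))
    return absent / n_students


if __name__ == "__main__":
    print("Answer 1, SD of the sum:", round(sd_of_sum, 1))
    print("Answer 12, mean of sample means:", mean_of_means)
    print("Answer 12, SD of sample means:", round(sd_of_means, 2))
    print("Answer 21, simulated absence rate:",
          simulate_absence_rate(100_000))
```

Any two digits would serve as “absent” in the simulation; 0 and 1 simply match the assignment named in answer 21. The simulated rate should land close to, but rarely exactly on, 20%.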