STAT 211 Section 506, Spring 2006
Exam I-Form A

Last Name:          First Name:          Student ID:

DO NOT OPEN THIS EXAM UNTIL YOU ARE INSTRUCTED TO DO SO

• If there is no correct answer or if multiple answers are correct, select the best answer.
• Hand in both the exam form and the scantron.
• Make sure your name is on both the exam form and the scantron.
• Make sure you mark the appropriate letter for your exam form on your scantron.
• Each question is worth 5 points for a total of 100 points possible.
• If you are caught cheating or helping someone cheat on this exam, you both will receive a grade of ZERO on the exam. The work you submit must be your own.

Good Luck!

1. The inferential branch of statistics deals with:
A. Tables and graphs.
B. Generalization from a sample to a population.
C. Deductive statistics.
D. Discrete and continuous variables.
E. Univariate data sets.
Answer: B. Statistics has two branches: descriptive statistics and inferential statistics. The former refers to techniques for summarizing and describing data, and the latter refers to techniques for generalizing from a sample to a population.

The following is a histogram of the number of hits per nine-inning baseball game.

2. According to the above histogram, approximately what proportion of nine-inning baseball games have 9 or 10 hits?
A. 10%
B. 12%
C. 22%
D. 80%
E. 40%
Answer: C. Proportion is the same as relative frequency. 9 hits has a relative frequency a little above 10%, and 10 hits has a relative frequency around 10%. So the best approximation is 22%.

3. How would you describe the above histogram?
A. Bimodal and symmetric
B. Unimodal and positively skewed
C. Bimodal and positively skewed
D. Unimodal and negatively skewed
E. Bimodal and negatively skewed
Answer: B. A histogram is called unimodal if it has a single peak and bimodal if it has two peaks. A histogram is called positively skewed if its upper tail (toward larger values) stretches out farther than its lower tail (toward smaller values).
A negatively skewed histogram is the other way round.

How does the speed of a runner vary over the course of a marathon (42.195 km)? Consider determining both the time to run the first 5 km and the time to run between the 35 km and 40 km points, and then subtracting the former time from the latter time. A positive value of this difference corresponds to a runner slowing down towards the end of the race. The histogram below is based on such times.

4. For the above histogram:
A. The mean is about the same as the median.
B. The mean is larger than the median.
C. The median is larger than the mean.
D. The range is smaller than the fourth spread.
E. The range is smaller than the interquartile range.
Answer: B. When a histogram is symmetric, the median equals the mean. Here the histogram is positively skewed, so the mean is larger than the median because the large observations in the upper tail pull the mean upward.

A company utilizes two different machines to manufacture parts of a certain type. During a single shift, a sample of n = 20 parts produced by each machine is obtained, and the value of a particular critical dimension for each part is determined. The comparative boxplot is shown below.

[Figure: Boxplot of Machine 1, Machine 2; vertical axis "Data", scale 50 to 85]

5. Which of the following statements best conforms to the above boxplots?
A. Machine 1 is more consistent.
B. Both machines are about equally consistent.
C. Machine 2 has a greater median critical dimension.
D. Machine 2 is more consistent.
E. Machine 2 produces the parts with the greatest critical dimension.
Answer: D. A machine with less variability in the values of the critical dimension is more consistent.

The following is a stem-and-leaf display of a simulated sample from a population distributed as lognormal(µ=0, σ=1). (The values are rounded to one decimal place.)

The decimal point is at the |
0 | 1355699
1 | 28
2 |
3 |
4 |
5 |
6 | 2

The next three problems are based on the above stem-and-leaf display.

6. Calculate the sample mean and sample standard deviation of the values in the simulated sample above.
A. x̄ = 0, s = 1.00
B. x̄ = 1.65, s = 2.16
C. x̄ = 1.65, s = 4.67
D. x̄ = 1.30, s = 1.69
E. x̄ = 1.30, s = 1.79
Answer: E. Note that one can recover the original data from a stem-and-leaf display. Here we read off the data as 0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2. The observations sum to 13.0, so x̄ = 13.0/10 = 1.30, and the squared deviations from x̄ sum to 28.8, so s = √(28.8/9) ≈ 1.79.

7. For the above simulated data, compute the 10% trimmed mean.
A. 1.65
B. 0.75
C. 1.30
D. 0.84
E. 0.77
Answer: D. The 10% trimmed mean is the average of what is left over after removing 10% of the observations from each end of the ordered sequence. Here we remove 1 observation from each end to get 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, whose average is 6.7/8 ≈ 0.84.

8. If we randomly select another observation from the same population, what is the probability that the new observation will fall in the observed range? (Hint: use the relationship of the lognormal distribution to the normal distribution.)
A. 0.05
B. 0.95
C. 0.75
D. 0.50
E. Need more information to determine.
Answer: B. The observed range of the data is (0.1, 6.2), so we compute P(0.1 < X < 6.2). Since X is distributed as lognormal(µ=0, σ=1), ln X is standard normal, and P(0.1 < X < 6.2) = P(ln 0.1 < ln X < ln 6.2) = P(−2.30 < Z < 1.82) ≈ 0.95.

9. Suppose X is some random variable having mean 0 and standard deviation 10. Define a new random variable Y = −1 · (X + 10). Calculate the mean and the standard deviation of Y.
A. µy = −10, σy = 10
B. µy = 0, σy = 100
C. µy = −10, σy = −10
D. µy = 0, σy = 10
E. Can't say: we have to know the probability density function of Y.
Answer: A. E[Y] = E[−1 · (X + 10)] = −E[X + 10] = −E[X] − 10 = −10. Var(Y) = Var[−1 · (X + 10)] = (−1)^2 Var[X + 10] = Var[X] = 10^2. Thus σy = 10.

10. A particular airline has 10 A.M. flights from Chicago to NY, Atlanta, and LA. Let A denote the event that the NY flight is full and define events B and C analogously for the other two flights. Suppose P(A) = 0.6, P(B) = 0.5, P(C) = 0.4, and the three events are independent.
What is the probability that only the NY flight is full?
A. 0.12
B. 0.88
C. 0.18
D. 0.10
E. 0.60
Answer: C. The event "only the NY flight is full" = A ∩ B^c ∩ C^c, where B^c and C^c denote the complements of B and C, respectively. Then the desired probability = P(A ∩ B^c ∩ C^c) = P(A)P(B^c)P(C^c) = 0.6 · (1 − 0.5) · (1 − 0.4) = 0.18. The probability P(A ∩ B^c ∩ C^c) factors because the three events A, B, and C are independent, which makes A, B^c, and C^c independent as well.

11. One percent of all individuals in a certain population are carriers of a particular disease. A diagnostic test for this disease has a 90% detection rate for carriers and a 5% detection rate for noncarriers. Suppose the test is applied independently to two different blood samples from the same randomly selected individual. What is the probability that both tests yield a positive result? (Hint: The two events "The randomly selected person is a carrier" and "The randomly selected person is a noncarrier" are mutually exclusive and exhaustive events.)
A. 0.01
B. 0.89
C. 0.90
D. 0.95
E. 0.06
Answer: A. Let A = "The randomly selected person is a carrier"; then A^c = "The randomly selected person is a noncarrier". Let B = "both tests yield a positive result"; then P(B) = P(B|A)P(A) + P(B|A^c)P(A^c). Because the two tests are applied independently, P(B|A) = 0.9^2 and P(B|A^c) = 0.05^2, so P(B) = 0.9^2 · 0.01 + 0.05^2 · (1 − 0.01) = 0.0081 + 0.002475 ≈ 0.01.

12. When circuit boards used in the manufacture of compact disc players are tested, the long-run percentage of defectives is 5%. Among 25 randomly selected boards, what is the expected value of the number of defective boards?
A. 0.05
B. 0.5
C. 1.19
D. 1.25
E. 12.5
Answer: D. The number of defective boards among the 25 randomly selected boards is a random variable with a binomial distribution with n = 25 and p = 0.05. The expected number of defective boards is np = 25 · 0.05 = 1.25.

13. A geologist has collected 10 specimens of basaltic rock and 10 specimens of granite.
The geologist instructs a lab assistant to randomly select 15 of the specimens for analysis. What is the distribution of the number of granite specimens selected for analysis?
A. Binomial
B. Hypergeometric
C. Poisson
D. Negative binomial
E. Geometric
Answer: B. The 15 specimens are drawn without replacement from a finite population of 20, of which 10 are granite. Refer to the definition of the hypergeometric distribution in the textbook.

14. An insurance company offers its policyholders a number of different premium payment options. For a randomly selected policyholder, let X = the number of months between successive payments. The CDF of X is as follows.

$$F(x) = \begin{cases} 0 & \text{if } x < 1 \\ 0.36 & \text{if } 1 \le x < 3 \\ 0.47 & \text{if } 3 \le x < 4 \\ 0.51 & \text{if } 4 \le x < 6 \\ 0.91 & \text{if } 6 \le x < 12 \\ 1 & \text{if } x \ge 12 \end{cases}$$

Calculate P(3 ≤ X ≤ 6).
A. 0.55
B. 0.51
C. 0.15
D. 0.11
E. 0.47
Answer: A. X is discrete, so the lower endpoint must be included: P(3 ≤ X ≤ 6) = F(6) − P(X < 3) = F(6) − F(2) = 0.91 − 0.36 = 0.55.

15. Let X denote the amount of time for which a book on a 2-hour reserve at a college library is checked out by a randomly selected student. The CDF of checkout duration X is

$$F(x) = \begin{cases} 0 & \text{if } x < 0 \\ \dfrac{x^2}{4} & \text{if } 0 \le x < 2 \\ 1 & \text{if } x \ge 2 \end{cases}$$

If the borrower is charged an amount h(X) = 3X^2, compute the expected charge E(h(X)).
A. 18.0
B. 2.0
C. 6.0
D. 2.6
E. 4.0
Answer: C. First compute the pdf of X: f(x) = F'(x) = x/2 if 0 ≤ x < 2, and 0 otherwise. By definition,

$$E(h(X)) = E(3X^2) = \int_0^2 3x^2 \cdot f(x)\,dx = \int_0^2 3x^2 \cdot \frac{x}{2}\,dx = 6.$$

16. Find the 75th percentile of the distribution of checkout duration in the previous problem.
A. 1
B. 1.41
C. 1.73
D. 2
E. Cannot determine.
Answer: C. Let x0.75 denote the 75th percentile of the distribution. Then we solve F(x0.75) = 0.75, that is, (x0.75)^2/4 = 0.75, which gives x0.75 = √3 ≈ 1.73.

17. Find the interquartile range (75th percentile − 25th percentile) for the standard normal distribution using Table A.3. Interpolate where appropriate.
A. 1.350
B. 0.675
C. 0.773
D. 1.546
E. 1.196
Answer: A. The 75th percentile is 0.674, and by symmetry of the standard normal density function, the 25th percentile is −0.674. Thus the interquartile range is 1.348 ≈ 1.35.

18. Suppose the force acting on a column that helps to support a building is normally distributed with mean 15.0 kips and standard deviation 1.25 kips. What is the probability that the force differs from 15.0 kips by at most 2 standard deviations?
A. 0.954
B. 0.977
C. 0.046
D. 0.023
E. 1
Answer: A. Let X denote the force acting on a column. Since X is Normal(µ = 15, σ = 1.25), Z = (X − 15)/1.25 is Normal(0, 1). Thus P(|X − 15| < 2 · 1.25) = P(|X − 15|/1.25 < 2) = P(|Z| < 2) = Φ(2) − Φ(−2) ≈ 0.954.

19. Let X have a binomial distribution with parameters n = 25 and p = 0.5. Calculate P(15 ≤ X ≤ 20) using the exact distribution from Appendix Table A.1 and using the normal approximation with continuity correction. Report the absolute value of the approximation error.
A. 0.0430
B. 0.0540
C. 0.0962
D. 0.0008
E. 0.0616
Answer: D. X is distributed as Binomial(n = 25, p = 0.5). Look up Table A.1 and obtain P(15 ≤ X ≤ 20) = B(20; 25, 0.5) − B(14; 25, 0.5) = 1.000 − 0.788 = 0.212. Since Binomial(n = 25, p = 0.5) can be approximated by the normal distribution with mean µ = np = 12.5 and variance σ^2 = np(1 − p) = 6.25, we apply the continuity correction and approximate P(15 ≤ X ≤ 20) by

$$P(15 - 0.5 \le X \le 20 + 0.5) = P\left(\frac{14.5 - 12.5}{\sqrt{6.25}} \le \frac{X - 12.5}{\sqrt{6.25}} \le \frac{20.5 - 12.5}{\sqrt{6.25}}\right) = \Phi(3.2) - \Phi(0.8) \approx 0.2112.$$

The absolute error is |0.212 − 0.2112| = 0.0008.

20. Let X be any normally distributed random variable. What is the probability that X is within 1.5 SDs of its mean?
A. 0.0668
B. 0.9332
C. 0.8664
D. 0.1336
E. Cannot determine without knowing the mean and the variance.
Answer: C. Let X ∼ N(µ, σ); then Z = (X − µ)/σ ∼ N(0, 1), so P(|X − µ| < 1.5σ) = P(|Z| < 1.5) = Φ(1.5) − Φ(−1.5) = 0.8664.
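
Several of the numerical answers above (Problems 6, 7, 8, and 19) can be checked with a few lines of Python. The following is a minimal sketch using only the standard library; the helper names (`trimmed_mean`, `lognormal_range_prob`, `binomial_prob`, `normal_approx`) are illustrative and not part of the exam. It computes at full precision rather than from Tables A.1 and A.3, so results may differ slightly from the table-based answers.

```python
import math
from statistics import NormalDist, mean, stdev

# Data read off the stem-and-leaf display (Problems 6 and 7).
data = [0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2]

Z = NormalDist()  # standard normal distribution


def trimmed_mean(values, fraction):
    """Average after dropping `fraction` of the observations from each end."""
    ordered = sorted(values)
    k = int(fraction * len(ordered) + 1e-9)  # observations removed per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def lognormal_range_prob(lo, hi, mu=0.0, sigma=1.0):
    """P(lo < X < hi) when ln X is Normal(mu, sigma)."""
    upper = Z.cdf((math.log(hi) - mu) / sigma)
    lower = Z.cdf((math.log(lo) - mu) / sigma)
    return upper - lower


def binomial_prob(n, p, a, b):
    """Exact P(a <= X <= b) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(a, b + 1))


def normal_approx(n, p, a, b):
    """Normal approximation to P(a <= X <= b) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return Z.cdf((b + 0.5 - mu) / sigma) - Z.cdf((a - 0.5 - mu) / sigma)


x_bar = mean(data)                      # Problem 6: sample mean
s = stdev(data)                         # Problem 6: sample SD (divides by n - 1)
t_mean = trimmed_mean(data, 0.10)       # Problem 7
p_range = lognormal_range_prob(min(data), max(data))  # Problem 8
exact = binomial_prob(25, 0.5, 15, 20)  # Problem 19, exact
approx = normal_approx(25, 0.5, 15, 20) # Problem 19, approximation

print(f"mean = {x_bar:.2f}, s = {s:.2f}, 10% trimmed mean = {t_mean:.2f}")
print(f"P(observed range) = {p_range:.3f}")
print(f"exact = {exact:.4f}, approx = {approx:.4f}, "
      f"|error| = {abs(exact - approx):.4f}")
```

For Problem 19 the full-precision error comes out somewhat smaller than the table-based 0.0008, because Table A.1 rounds to three decimals and Table A.3 to four.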