STAT 211 Section 506, Spring 2006
Exam I-Form A
Last Name :
First Name:
Student ID :
DO NOT OPEN THIS EXAM UNTIL YOU ARE INSTRUCTED TO DO SO
• If there is no correct answer or if multiple answers are correct, select the best
answer.
• Hand in both the exam form and the scantron.
• Make sure your name is on both the exam form and the scantron.
• Make sure you mark the appropriate letter for your exam form on your scantron.
• Each question is worth 5 points for a total of 100 points possible.
• If you are caught cheating or helping someone cheat on this exam, you both
will receive a grade of ZERO on the exam. The work you submit must be
your own.
Good Luck!
1. The inferential branch of statistics deals with:
A. Tables and graphs.
B. Generalization from a sample to a population.
C. Deductive statistics.
D. Discrete and continuous variables.
E. Univariate data sets.
Answer: B. Statistics has two branches: descriptive statistics and inferential statistics. The former refers to techniques for summarizing and describing data, and the latter refers to techniques
for generalizing from a sample to a population.
The following is a histogram of the number of hits per nine-inning baseball game.
2. According to the above histogram, approximately what proportion of nine-inning baseball games
have 9 or 10 hits?
A. 10%
B. 12%
C. 22%
D. 80%
E. 40%
Answer: C. Proportion is the same as relative frequency. 9 hits has a relative frequency a little
above 10%, and 10 hits has a relative frequency around 10%. So the best approximation is 22%.
3. How would you describe the above histogram?
A. Bimodal and symmetric
B. Unimodal and positively skewed
C. Bimodal and positively skewed
D. Unimodal and negatively skewed
E. Bimodal and negatively skewed
Answer: B. A histogram with a single peak is called unimodal; one with two peaks is called
bimodal. A histogram is positively skewed if its tail extends toward larger values rather than toward
smaller values. Negatively skewed is the other way round.
How does the speed of a runner vary over the course of a marathon (42.195 km)? Consider determining both
the time to run the first 5 km and the time to run between the 35 km and 40 km points, and then subtracting
the former time from the latter time. A positive value of this difference corresponds to a runner slowing
down towards the end of the race. The histogram below is based on such times.
4. For the above histogram:
A. The mean is about the same as the median.
B. The mean is larger than the median.
C. The median is larger than the mean.
D. The range is smaller than the fourth spread.
E. The range is smaller than the interquartile range.
Answer: B. When a histogram is symmetric, the median equals the mean. Here the histogram is
positively skewed. So the mean is larger than the median due to the effect of large observations
in the data.
A company utilizes two different machines to manufacture parts of a certain type. During a single shift, a
sample of n = 20 parts produced by each machine is obtained, and the value of a particular critical dimension
for each part is determined. The comparative boxplot is shown below.
[Figure: Boxplot of Machine 1, Machine 2. Vertical axis "Data", with ticks from 50 to 85.]
5. Which of the following statements best conforms to the above boxplots?
A. Machine 1 is more consistent.
B. Both machines are about equally consistent.
C. Machine 2 has a greater median critical dimension.
D. Machine 2 is more consistent.
E. Machine 2 produces the parts with the greatest critical dimension.
Answer: D. A machine with less variability in the values of critical dimension is obviously more
consistent.
The following is a stem-and-leaf display of a simulated sample from a population distributed as lognormal(µ=0,σ=1).
(The values are rounded to have one decimal point.)
The decimal point is at the |

0 | 1355699
1 | 28
2 |
3 |
4 |
5 |
6 | 2
The next three problems are based on the above stem-and-leaf display.
6. Calculate the sample mean and sample standard deviation of the values in the simulated sample above.
A. x̄ = 0, s = 1.00
B. x̄ = 1.65, s = 2.16
C. x̄ = 1.65, s = 4.67
D. x̄ = 1.30, s = 1.69
E. x̄ = 1.30, s = 1.79
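Answer: E. Note that one can recover the original data from a stem-and-leaf display. Here we
read off the data as 0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2. These ten values sum to 13.0,
so x̄ = 1.30, and the sample standard deviation (divisor n − 1 = 9) is s = √(28.8/9) ≈ 1.79.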
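The arithmetic for this answer can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the exam.

```python
from statistics import mean, stdev

# Data read off the stem-and-leaf display (stem = units, leaf = tenths).
data = [0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2]

x_bar = mean(data)   # sample mean
s = stdev(data)      # sample standard deviation, divisor n - 1

print(round(x_bar, 2), round(s, 2))
```

Note that `stdev` uses the n − 1 divisor; `pstdev` would use n instead and give a smaller value.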
7. For the above simulated data, compute the 10% trimmed mean.
A. 1.65
B. 0.75
C. 1.30
D. 0.84
E. 0.77
Answer: D. The 10% trimmed mean is the average of what is left over after removing 10% of the
observations from either end of the ordered sequence. Here we remove 1 observation from
either end to get 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, whose average is 6.7/8 ≈ 0.84.
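A trimmed mean of this kind might be computed as follows. The helper `trimmed_mean` is a hypothetical name introduced for illustration, and the sketch assumes the trimming proportion times the sample size is a whole number, as it is here.

```python
def trimmed_mean(values, proportion):
    """Average after dropping `proportion` of the observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # observations removed from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2]
tm = trimmed_mean(data, 0.10)
print(round(tm, 2))
```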
8. If we randomly select another observation from the same population, what is the probability that the new
observation will fall in the observed range? (Hint: use the relationship of the lognormal distribution to the
normal distribution.)
A. 0.05
B. 0.95
C. 0.75
D. 0.50
E. Need more information to determine.
Answer: B. The observed range of the data is (0.1, 6.2), so we compute P(0.1 < X < 6.2). Since X
is distributed as lognormal(µ=0,σ=1), ln X is standard normal, and P(0.1 < X < 6.2) = P(ln 0.1 < ln X < ln 6.2)
= P(−2.30 < Z < 1.82) = Φ(1.82) − Φ(−2.30) ≈ 0.95.
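The same probability can be evaluated without a table. The sketch below uses `statistics.NormalDist` from the Python standard library for the standard normal CDF; it is a check on the table lookup, not the method the exam expects.

```python
from math import log
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# X is lognormal(0, 1), so ln X is standard normal.
p = z.cdf(log(6.2)) - z.cdf(log(0.1))
print(round(p, 2))
```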
9. Suppose X is some random variable having mean 0 and standard deviation 10. Define a new random variable
Y = −1 ∗ (X + 10)
Calculate the mean and the standard deviation of Y .
A. µy = −10, σy = 10
B. µy = 0, σy = 100
C. µy = −10, σy = −10
D. µy = 0, σy = 10
E. Can’t say—we have to know the probability density function of Y .
Answer: A. $E[Y] = E[-(X + 10)] = -E[X + 10] = -E[X] - 10 = -10$, and $\mathrm{Var}(Y) = \mathrm{Var}[-(X + 10)] =
(-1)^2\,\mathrm{Var}[X + 10] = \mathrm{Var}[X] = 10^2$. Thus $\sigma_y = 10$.
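The two rules used here, E[aX + b] = aE[X] + b and SD(aX + b) = |a|·SD(X), can be written as a small helper. The function name `linear_transform_moments` is hypothetical and only serves this illustration.

```python
def linear_transform_moments(mean_x, sd_x, a, b):
    """Return (mean, sd) of Y = a*X + b given the mean and sd of X."""
    return a * mean_x + b, abs(a) * sd_x

# Y = -1 * (X + 10) = -X - 10, so a = -1 and b = -10.
mu_y, sigma_y = linear_transform_moments(0, 10, a=-1, b=-10)
print(mu_y, sigma_y)
```

No density function is needed: only the mean and standard deviation of X enter the calculation, which is why the "can't say" option is wrong.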
10. A particular airline has 10 A.M. flights from Chicago to NY, Atlanta, and LA. Let A denote the event that
the NY flight is full and define events B and C analogously for the other two flights. Suppose P(A)=0.6,
P(B)=0.5, P(C)=0.4, and the three events are independent. What is the probability that only the NY flight
is full?
A. 0.12
B. 0.88
C. 0.18
D. 0.10
E. 0.60
Answer: C. The event “only the NY flight is full” is $A \cap B^c \cap C^c$, where $B^c$ and $C^c$ denote
the complements of B and C, respectively. The desired probability is $P(A \cap B^c \cap C^c) =
P(A)P(B^c)P(C^c) = 0.6 \cdot (1 - 0.5) \cdot (1 - 0.4) = 0.18$. The probability factors
because the three events A, B, and C are independent, which implies that A, $B^c$, and $C^c$ are independent as well.
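As a quick numerical check of the product rule for independent events, the calculation might be sketched as follows (variable names are illustrative):

```python
p_a, p_b, p_c = 0.6, 0.5, 0.4   # P(A), P(B), P(C) from the problem

# Only the NY flight is full: A occurs, B and C do not.
p_only_ny = p_a * (1 - p_b) * (1 - p_c)
print(round(p_only_ny, 2))
```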
11. One percent of all individuals in a certain population are carriers of a particular disease. A diagnostic test for
this disease has a 90% detection rate for carriers and a 5% detection rate for noncarriers. Suppose the test is
applied independently to two different blood samples from the same randomly selected individual. What is
the probability that both tests yield a positive result? (Hint: The two events “The randomly selected person is
a carrier” and “The randomly selected person is a noncarrier” are mutually exclusive and exhaustive events.)
A. 0.01
B. 0.89
C. 0.90
D. 0.95
E. 0.06
Answer: A. Let A = “The randomly selected person is a carrier”, so $A^c$ = “The randomly
selected person is a noncarrier”. Let B = “both tests yield a positive result”. By the law of total
probability, $P(B) = P(B \mid A)P(A) + P(B \mid A^c)P(A^c)$. Since the two tests are applied independently,
$P(B \mid A) = 0.9^2$ and $P(B \mid A^c) = 0.05^2$, so $P(B) = 0.9^2 \cdot 0.01 + 0.05^2 \cdot (1 - 0.01) = 0.010575 \approx 0.01$.
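The law of total probability in this answer can be sketched in a few lines. The variable names are illustrative, and the sketch assumes, as the answer does, that the two test results are independent given the person's carrier status.

```python
p_carrier = 0.01               # P(A)
p_pos_given_carrier = 0.90     # detection rate for carriers
p_pos_given_noncarrier = 0.05  # detection rate for noncarriers

# Both tests positive: square the per-test rate within each group,
# then weight by the probability of belonging to that group.
p_both = (p_pos_given_carrier**2 * p_carrier
          + p_pos_given_noncarrier**2 * (1 - p_carrier))
print(round(p_both, 6))
```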
12. When circuit boards used in the manufacture of compact disc players are tested, the long-run percentage of
defectives is 5%. Among 25 randomly selected boards, what is the expected value of the number of defective
boards?
A. 0.05
B. 0.5
C. 1.19
D. 1.25
E. 12.5
Answer: D. The number of defective boards among the 25 randomly selected boards is a
random variable and has Binomial distribution with n = 25, and p = 0.05. The expected number
of defective boards is np = 1.25.
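The shortcut E(X) = np can be compared against the definition of expected value, summing k·P(X = k) over the binomial pmf. A minimal sketch:

```python
from math import comb

n, p = 25, 0.05

# Expected value from the definition: sum of k * P(X = k).
expected = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(round(expected, 4), n * p)
```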
13. A geologist has collected 10 specimens of basaltic rock and 10 specimens of granite. The geologist instructs a
lab assistant to randomly select 15 of the specimens for analysis. What is the distribution of the number of
granite specimens selected for analysis?
A. Binomial
B. Hypergeometric
C. Poisson
D. Negative binomial
E. Geometric
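Answer: B. The 15 specimens are drawn without replacement from a finite population of 20 specimens, 10 of which are granite, so the number of granite specimens selected is hypergeometric. Refer to the definition of the hypergeometric distribution in the textbook.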
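For this setup the hypergeometric pmf can be tabulated directly. The sketch below uses the common parameterization with sample size n, M "successes" (granite) in a population of size N; the function name `hypergeom_pmf` is a hypothetical helper, and the textbook's notation may differ.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x) when n items are drawn without replacement from N, M of them successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, M, N = 15, 10, 20   # 15 drawn; 10 granite among 20 specimens
pmf = {x: hypergeom_pmf(x, n, M, N) for x in range(0, M + 1)}
mean = sum(x * prob for x, prob in pmf.items())

for x, prob in pmf.items():
    print(x, round(prob, 4))
```

Values of x below 5 get probability zero, since at most 10 of the 15 selected specimens can be basaltic.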
14. An insurance company offers its policyholders a number of different premium payment options. For a randomly
selected policyholder, let X = the number of months between successive payments. The CDF of X is as follows.
$$
F(x) = \begin{cases}
0 & \text{if } x < 1 \\
0.36 & \text{if } 1 \le x < 3 \\
0.47 & \text{if } 3 \le x < 4 \\
0.51 & \text{if } 4 \le x < 6 \\
0.91 & \text{if } 6 \le x < 12 \\
1 & \text{if } x \ge 12
\end{cases}
$$
Calculate P(3 ≤ X ≤ 6).
A. 0.55
B. 0.51
C. 0.15
D. 0.11
E. 0.47
Answer: A. X takes only the values 1, 3, 4, 6, and 12, so P(3 ≤ X ≤ 6) = F(6) − F(2) = 0.91 − 0.36 = 0.55.
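One way to see why F(2) rather than F(3) is subtracted is to recover the pmf from the jumps of the CDF and add the probabilities of the values in [3, 6]. A minimal sketch, with illustrative names:

```python
# (value, CDF at that value) for each jump point of F.
cdf_steps = [(1, 0.36), (3, 0.47), (4, 0.51), (6, 0.91), (12, 1.00)]

pmf = {}
previous = 0.0
for value, cumulative in cdf_steps:
    pmf[value] = cumulative - previous   # size of the jump at this value
    previous = cumulative

p = sum(prob for value, prob in pmf.items() if 3 <= value <= 6)
print(round(p, 2))
```

Using F(6) − F(3) would drop the jump at x = 3 and give 0.44 instead.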
15. Let X denote the amount of time for which a book on a 2-hour reserve at a college library is checked out by
a randomly selected student. The CDF of checkout duration X is
$$
F(x) = \begin{cases}
0 & \text{if } x < 0 \\
\dfrac{x^2}{4} & \text{if } 0 \le x < 2 \\
1 & \text{if } x \ge 2
\end{cases}
$$
If the borrower is charged an amount $h(X) = 3X^2$, compute the expected charge $E(h(X))$.
A. 18.0
B. 2.0
C. 6.0
D. 2.6
E. 4.0
Answer: C. First compute the pdf of X: $f(x) = F'(x) = \frac{x}{2}$ if $0 \le x < 2$, and $f(x) = 0$ otherwise. By definition,
$$
E(h(X)) = E(3X^2) = \int_0^2 3x^2 \cdot f(x)\,dx = \int_0^2 3x^2 \cdot \frac{x}{2}\,dx = 6.
$$
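The integral can also be approximated numerically as a sanity check. The sketch below uses a plain midpoint rule rather than any particular library routine; `midpoint_integral` and the other names are hypothetical helpers for this illustration.

```python
def pdf(x):
    """Density of checkout duration: f(x) = x/2 on [0, 2), zero elsewhere."""
    return x / 2 if 0 <= x < 2 else 0.0

def charge(x):
    """Charge h(x) = 3x^2."""
    return 3 * x**2

def midpoint_integral(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

expected_charge = midpoint_integral(lambda x: charge(x) * pdf(x), 0, 2)
print(round(expected_charge, 4))
```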
16. Find the 75th percentile of the distribution of checkout duration in the previous problem.
A. 1
B. 1.41
C. 1.73
D. 2
E. Cannot determine.
Answer: C. Let $x_{0.75}$ denote the 75th percentile of the distribution. Then we solve the following
equation for $x_{0.75}$:
$$
F(x_{0.75}) = \frac{x_{0.75}^2}{4} = 0.75,
$$
which gives us $x_{0.75} = \sqrt{3} \approx 1.73$.
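Because F(x) = x²/4 on [0, 2), the percentile equation inverts in closed form: x_p = 2√p. A short sketch, where `percentile` is a hypothetical helper name:

```python
from math import sqrt

def percentile(p):
    """Solve x**2 / 4 = p for x in [0, 2], i.e. invert the CDF on its support."""
    return 2 * sqrt(p)

x_75 = percentile(0.75)
print(round(x_75, 2))
```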
17. Find the interquartile range (75th percentile-25th percentile) for the standard normal distribution using Table
A.3. Interpolate where appropriate.
A. 1.350
B. 0.675
C. 0.773
D. 1.546
E. 1.196
Answer: A. The 75th percentile is 0.674, and by symmetry of the standard normal density
function, the 25th percentile is -0.674. Thus the interquartile range is 1.348 ≈ 1.35.
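The table interpolation can be compared with a direct computation of the quartiles. This sketch relies on `NormalDist.inv_cdf` from Python's standard library and is offered only as a check on the table value.

```python
from statistics import NormalDist

z = NormalDist()            # standard normal
q3 = z.inv_cdf(0.75)        # 75th percentile
q1 = z.inv_cdf(0.25)        # 25th percentile, equal to -q3 by symmetry
iqr = q3 - q1
print(round(q3, 3), round(iqr, 3))
```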
18. Suppose the force acting on a column that helps to support a building is normally distributed with mean 15.0
kips and standard deviation 1.25 kips. What is the probability that the force differs from 15.0 kips by at most
2 standard deviations?
A. 0.954
B. 0.977
C. 0.046
D. 0.023
E. 1
Answer: A. Let X denote the force acting on a column. Since X is Normal(µ = 15, σ = 1.25),
$Z = \frac{X - 15}{1.25}$ is Normal(0, 1). Thus
$$
P(|X - 15| < 2 \cdot 1.25) = P(|Z| < 2) = \Phi(2) - \Phi(-2) \approx 0.954
$$
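The same probability can be computed on the original scale, without standardizing, by evaluating the Normal(15, 1.25) CDF at 15 ± 2(1.25). A minimal sketch with illustrative names:

```python
from statistics import NormalDist

force = NormalDist(mu=15.0, sigma=1.25)   # force in kips

# Within 2 standard deviations of the mean: between 12.5 and 17.5 kips.
p_within = force.cdf(15.0 + 2 * 1.25) - force.cdf(15.0 - 2 * 1.25)
print(round(p_within, 4))
```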
19. Let X have a binomial distribution with parameters n = 25 and p = 0.5. Calculate P (15 ≤ X ≤ 20) using the
exact distribution from Appendix Table A.1 and using the normal approximation with continuity correction.
Report the absolute value of the approximation error.
A. 0.0430
B. 0.0540
C. 0.0962
D. 0.0008
E. 0.0616
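Answer: D. X is distributed as Binomial(n = 25, p = 0.5). Looking up Table A.1 gives $P(15 \le X \le 20) =
P(X \le 20) - P(X \le 14) \approx 0.212$. Since Binomial(n = 25, p = 0.5) can be approximated by the normal distribution with
$\mu = np = 12.5$ and variance $\sigma^2 = np(1 - p) = 6.25$, we apply the continuity correction and approximate
$P(15 \le X \le 20)$ by
$$
P(15 - 0.5 \le X \le 20 + 0.5) = P\left(\frac{14.5 - 12.5}{\sqrt{6.25}} \le \frac{X - 12.5}{\sqrt{6.25}} \le \frac{20.5 - 12.5}{\sqrt{6.25}}\right) = \Phi(3.2) - \Phi(0.8) \approx 0.211.
$$
With $\Phi(3.2) - \Phi(0.8) = 0.9993 - 0.7881 = 0.2112$ from Table A.3, the absolute approximation error is $|0.212 - 0.2112| = 0.0008$.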
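The comparison between the exact binomial probability and the continuity-corrected normal approximation might be reproduced as below. This is a sketch at full floating-point precision, so the error it prints (roughly 0.0006) is a little smaller than the 0.0008 obtained from table values rounded to three or four decimals; answer D remains the closest choice.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5

# Exact binomial probability P(15 <= X <= 20).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, 21))

# Normal approximation with continuity correction: P(14.5 <= X <= 20.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(20.5) - approx_dist.cdf(14.5)

error = abs(exact - approx)
print(round(exact, 4), round(approx, 4), round(error, 4))
```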
20. Let X be any normally distributed random variable. What is the probability that X is within 1.5
SDs of its mean?
A. 0.0668
B. 0.9332
C. 0.8664
D. 0.1336
E. Cannot determine without knowing the mean and the variance.
Answer: C. Let $X \sim N(\mu, \sigma)$; then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$, and
$$
P(|X - \mu| < 1.5\sigma) = P(|Z| < 1.5) = \Phi(1.5) - \Phi(-1.5) = 0.8664.
$$
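That the answer does not depend on the mean or the variance can be illustrated by evaluating the probability for a few arbitrary parameter choices. The function name `prob_within` and the parameter values below are assumptions made for this sketch only.

```python
from statistics import NormalDist

def prob_within(k, mu, sigma):
    """P(|X - mu| < k*sigma) for X ~ Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Arbitrary (mean, sd) pairs; the probability should agree for all of them.
results = [round(prob_within(1.5, mu, sigma), 4)
           for mu, sigma in [(0, 1), (15, 1.25), (-3, 40)]]
print(results)
```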