Name:
Student Number:
STAT 285: Fall Semester 2014
Final Examination — Solutions
5 December 2014
Instructor: Richard Lockhart
Instructions: This is an open book test. As such you may use formulas such as those for
means and variances of standard distributions from the book without deriving them. You
may use notes, text, other books and a calculator; you may not use a computer. Your work
will be marked for clarity of explanation. I expect you to explain what assumptions you are
making and to comment if those assumptions seem unreasonable. In general you need not
finish doing arithmetic in confidence interval problems; I will be satisfied if your answers
contain things like 27 ± 1.96 · √(247.5/11),
but I have to be absolutely convinced you know what arithmetic to do! In hypothesis testing
problems you will have to finish the arithmetic enough to reach a real world conclusion. I
want the answers written on the paper. The exam is out of 70.
1. We generally assume that when a coin is tossed it lands heads up with probability
0.5. It has been suggested, however, that if you do the experiment in a different way
the chance might change. One such way is to stand the coin on edge on a hard flat
surface, hold it upright with a finger and then flick the edge with a finger to send the
coin spinning away.
A group of 107 statistics students actually did this, 40 times each for a total of 4280
spins.
(a) [5 marks] Suppose they got a total of 2376 heads. Give a 99% confidence interval
for the probability that spinning produces heads.
We are going to assume that all 107 students had the same probability p of spinning
heads; it is not obvious that this is realistic. If so then the number, X, of heads
has a Binomial(4280, p) distribution. The estimate is
p̂ = 2376/4280 = 0.5551402 ≈ 0.555.
The estimated standard error is
σ̂p̂ = √(p̂(1 − p̂)/4280) = √(0.555 · 0.445/4280) = 0.007596106 ≈ 0.00760.
For a 99% confidence interval the relevant critical point from the normal curve is
z0.005 = 2.58
and the interval is
0.5551 ± 2.58 · 0.0076 or 0.5355 to 0.5747.
It is not necessary to work out all the numbers for full marks.
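The interval arithmetic above can be checked with a short script (a sketch in Python; the critical value 2.58 is the z0.005 quoted in the solution):

```python
import math

# 99% CI for a binomial proportion: p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n)
n, heads = 4280, 2376
p_hat = heads / n                          # 0.5551...
se = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
z = 2.58                                   # z_{0.005} for a 99% interval
lo, hi = p_hat - z * se, p_hat + z * se
print(round(p_hat, 4), round(se, 5), round(lo, 4), round(hi, 4))
```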
(b) [5 marks] Is it reasonable to believe that this method produces heads with probability 1/2?
This is a typical hypothesis testing question. The way it is asked, the null hypothesis is H0: p = 0.5 and the alternative is Ha: p ≠ 0.5. The relevant test statistic is
T = (0.5551 − 0.5)/0.0076 = 7.25
and the P-value is obtained from a normal curve. On the test, the tables being
used end at 3.49, so you can only say that P is less than 2 × 0.0002 = 0.0004. I
can use R to check that
P ≈ 4.2 × 10⁻¹³.
In any case P is so ridiculously small that the null hypothesis is untenable. The
probability of heads is more than 0.5.
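The statistic and its two-sided normal P-value can be reproduced without any statistics library (a sketch; math.erfc gives the normal tail directly):

```python
import math

# z test of H0: p = 0.5 against Ha: p != 0.5, using the rounded values above
p_hat, se, p0 = 0.5551, 0.0076, 0.5
z = (p_hat - p0) / se                    # about 7.25
p_value = math.erfc(z / math.sqrt(2))    # two-sided: 2 * (1 - Phi(z))
print(round(z, 2), p_value)              # P is around 4e-13
```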
2. [5 marks] It has also been suggested that if you flip a coin in the usual way and catch
it in your hand it is slightly more likely than 50% to land the same way up as it started.
If the chance of landing the same side up as it started were really 0.51 how many tosses
would I need to make to have at least a 90% chance that a level 5% test would detect
a significant difference between the probability of heads and 0.5?
This is an application of the sample size determination problem. We are given α = 0.05,
β = 0.1, p0 = 0.5, and p′ = 0.51. The alternative suggested calls for a one-sided test
so
n = [(z0.05 √(0.5 · 0.5) + z0.1 √(0.51 · 0.49))/(0.51 − 0.5)]² = [(1.65 · 0.5 + 1.28 · 0.50)/0.01]² = (146.5)² ≈ 147² = 21609
is the required sample size.
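The formula can be evaluated directly (a sketch; carried at full precision it gives about 21459, while rounding the ratio 146.5 up to 147 before squaring gives the 21609 quoted above):

```python
import math

# Sample size so a one-sided level-0.05 test of p0 = 0.5 has 90% power at p = 0.51
z_alpha, z_beta = 1.65, 1.28      # z_{0.05} and z_{0.10}, the table values used above
p0, p1 = 0.5, 0.51
num = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
n = (num / (p1 - p0)) ** 2
print(math.ceil(n))               # about 21459 at full precision
```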
3. A large population of new mothers is divided into two groups: smokers and nonsmokers. Independent samples of 40 mothers are drawn from each group in order
to make comparisons. One comparison made is birth weight. The babies of the 40
smokers average 112.1 ounces with a standard deviation of 16.1 ounces while those of
the 40 non-smokers average 123.1 ounces with an SD of 15.8 ounces.
(a) [5 marks] Is it clear that smokers have lower birth weight babies?
We have two samples: X1, . . . , Xn with n = 40 from the population of smokers,
and Y1, . . . , Ym with m = 40 from the population of non-smokers. If µ1 is the
population mean birth weight for smokers' babies and µ2 is the population mean
birth weight for non-smokers' babies then our null hypothesis is H0: µ2 = µ1 (or
H0: µ2 ≤ µ1) and the alternative is Ha: µ1 < µ2. (There may be students who
try to make the case for a two-sided test; I would dock 0.5 marks only.) The relevant
test statistic is
T = (Ȳ − X̄)/√(16.1²/40 + 15.8²/40) = 3.08.
The degrees of freedom, no matter how you calculate them, will be close to 78;
they cannot be lower than 39, and since the two SDs are close together the df will
be close to 78. The P-value for 3.08, one-sided, with any degrees of freedom from
40 to 120 is in the range 0.001 to 0.002, so in any case we conclude that there is
very strong evidence against the idea that smokers have babies as heavy as those
of non-smokers. Clearly they have lower birth weight babies. From R the P-value
is between 0.00141 and 0.00187, likely closer to the smaller number, but the
conclusion is not affected by this difference.
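The t statistic and the approximate degrees of freedom can be verified numerically from the summary statistics alone (a sketch using the Welch-Satterthwaite formula):

```python
import math

# Two-sample t statistic: non-smokers (Y) minus smokers (X), unpooled variances
n, m = 40, 40
x_bar, s_x = 112.1, 16.1          # smokers
y_bar, s_y = 123.1, 15.8          # non-smokers
se = math.sqrt(s_x**2 / n + s_y**2 / m)
t = (y_bar - x_bar) / se          # about 3.08
# Welch-Satterthwaite degrees of freedom, about 78 as claimed above
df = se**4 / ((s_x**2 / n)**2 / (n - 1) + (s_y**2 / m)**2 / (m - 1))
print(round(t, 2), round(df))
```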
(b) [5 marks] Give a 90 percent confidence interval for the difference in mean birth
weights between smoking and non-smoking mothers.
In the previous part you computed both a degrees of freedom (about 78) and an
estimated standard error (about 3.57). The df should be used to find a t multiplier,
t0.05,78 ≈ 1.665, so the interval is (Ȳ − X̄) ± 1.665 × 3.57 ≈ 11.0 ± 5.9 ounces.
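A sketch of the interval arithmetic; the multiplier 1.665 is t0.05 for roughly 78 df, read from tables (an assumed value, since the exact df depends on the formula used):

```python
import math

# 90% CI for the difference in mean birth weights (non-smokers minus smokers)
diff = 123.1 - 112.1
se = math.sqrt(16.1**2 / 40 + 15.8**2 / 40)
t_mult = 1.665                    # t_{0.05, 78} from tables (assumed value)
lo, hi = diff - t_mult * se, diff + t_mult * se
print(round(lo, 2), round(hi, 2))
```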
4. The following sequence of questions concern the following model. Imagine we have 3
dice which have been carefully manufactured so that they all have exactly the same
weight, θ. We begin by weighing 1 of the dice and recording Y1 which you may assume
has mean θ. The error in the measurement, namely Y1 − θ has a normal distribution
with mean 0 and standard deviation σ. In order to keep this problem simple you may
assume that σ = 1 and that you know this somehow.
Then you weigh 2 of the dice together and record Y2 whose mean is 2θ. Assume that
the error Y2 − 2θ has a normal distribution with mean 0 and standard deviation σ.
In this problem you are to compare two estimators for θ.
(a) [5 marks] The first estimator is based on the idea that Y2 /2 has mean θ – the
same as Y1 . This estimator is the average of these two.
θ̂1 = (Y1 + Y2/2)/2.
Find the bias, standard error and mean square error of θ̂1 .
The mean of θ̂1 is
E(θ̂1) = E[(Y1 + Y2/2)/2] = (E(Y1) + E(Y2)/2)/2 = (θ + 2θ/2)/2 = θ.
Thus θ̂1 is unbiased (the bias is 0). The variance is
Var(θ̂1) = Var[(Y1 + Y2/2)/2] = (Var(Y1) + Var(Y2)/4)/4 = (σ² + σ²/4)/4 = 5σ²/16.
The mean squared error is also the variance. The standard error is the square
root of the variance, √5 σ/4.
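A quick Monte Carlo check of these moment calculations (a sketch; θ = 10 and σ = 1 are arbitrary choices for the simulation):

```python
import math
import random

# Simulate theta-hat-1 = (Y1 + Y2/2)/2; expect mean = theta, SD = sqrt(5)*sigma/4
random.seed(0)
theta, sigma, reps = 10.0, 1.0, 200_000
est = [(random.gauss(theta, sigma) + random.gauss(2 * theta, sigma) / 2) / 2
       for _ in range(reps)]
mean = sum(est) / reps
sd = math.sqrt(sum((e - mean)**2 for e in est) / reps)
print(round(mean, 2), round(sd, 2))   # near 10.0 and sqrt(5)/4 = 0.56
```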
(b) [5 marks] Another estimator is obtained by least squares. Derive the formula
for the least squares estimate of θ; call this estimator θ̂2 .
The error sum of squares is
(Y1 − θ)2 + (Y2 − 2θ)2
To minimize this take the derivative with respect to θ and get
−2(Y1 − θ) − 4(Y2 − 2θ) = −2(Y1 + 2Y2 − 5θ).
This is 0 when
Y1 + 2Y2 − 5θ = 0
so
θ̂2 = (Y1 + 2Y2)/5.
(c) [5 marks] Find the bias, standard error and mean squared error of θ̂2 .
The mean of θ̂2 is
E(θ̂2) = E[(Y1 + 2Y2)/5] = (E(Y1) + 2E(Y2))/5 = (θ + 4θ)/5 = θ.
Thus θ̂2 is unbiased. Its variance is
Var(θ̂2) = Var[(Y1 + 2Y2)/5] = (Var(Y1) + 4Var(Y2))/25 = (σ² + 4σ²)/25 = σ²/5.
This is also the MSE. The standard error is σ/√5.
(d) [2 marks] Based on these calculations, which is the better estimator of θ, θ̂1 or
θ̂2 ?
Both are unbiased and θ̂2 has a smaller variance because 1/5 < 5/16.
Thus θ̂2 is better.
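The variance comparison can also be checked by simulation, applying both estimators to the same simulated weighings (a sketch with θ = 10, σ = 1):

```python
import random

# Monte Carlo MSEs: Var(theta-hat-1) = 5/16 = 0.3125, Var(theta-hat-2) = 1/5 = 0.2
random.seed(1)
theta, reps = 10.0, 200_000
mse1 = mse2 = 0.0
for _ in range(reps):
    y1 = random.gauss(theta, 1.0)        # one die weighed, mean theta
    y2 = random.gauss(2 * theta, 1.0)    # two dice weighed together, mean 2*theta
    mse1 += ((y1 + y2 / 2) / 2 - theta)**2
    mse2 += ((y1 + 2 * y2) / 5 - theta)**2
print(round(mse1 / reps, 2), round(mse2 / reps, 2))   # near 0.31 and 0.20
```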
5. Suppose X has density
f(x, θ) = θ/x² for x > θ, and 0 for x ≤ θ.
(a) [2 marks] Find the cumulative distribution function of X.
Let F(x) = P(X ≤ x) be the cdf. For x < θ we have F(x) = 0. For x ≥ θ we
have
F(x) = P(X ≤ x) = ∫ from θ to x of (θ/u²) du = θ/θ − θ/x = 1 − θ/x.
(b) [2 marks] For any b such that 1 ≤ b find
P(1 ≤ X/θ ≤ b)
and then find the values of b for which this probability is 0.025 and 0.975.
This is just
P(θ ≤ X ≤ bθ) = F(bθ) − F(θ) = 1 − θ/(bθ) = 1 − 1/b.
To make this be α we need α = 1 − 1/b, or 1/b = 1 − α, or
b = 1/(1 − α).
Thus for 0.025 we have b = 1/0.975 and for 0.975 we need b = 1/0.025.
(c) [5 marks] Use the results of the previous problem to find a 95% confidence
interval for θ.
We now know that
P(1/0.975 ≤ X/θ ≤ 1/0.025) = 0.975 − 0.025 = 0.95.
Solving the inequalities we find
P(0.025X ≤ θ ≤ 0.975X) = 0.95
so that the interval [0.025X, 0.975X] is a 95% confidence interval for θ.
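The 95% coverage can be confirmed by simulation, drawing X by inverse-CDF sampling from F(x) = 1 − θ/x (a sketch; θ = 7 is an arbitrary choice):

```python
import random

# X = theta/(1 - U) has cdf F(x) = 1 - theta/x for x >= theta, U ~ Uniform(0, 1)
random.seed(2)
theta, reps = 7.0, 100_000
hits = 0
for _ in range(reps):
    x = theta / (1 - random.random())
    if 0.025 * x <= theta <= 0.975 * x:   # does the interval cover theta?
        hits += 1
print(round(hits / reps, 3))   # close to 0.95
```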
(d) [1 mark] Evaluate your interval if we observe X = 40.
The interval runs from 0.025 × 40 = 1 to 0.975 × 40 = 39.
6. In the October 7, 2014 issue of the Canadian Medical Association Journal a randomized
controlled double blind study of 378 patients studied the effect of melatonin on delirium.
The treatment group had 186 patients and observed 55 cases of delirium while the
control group had 192 patients and 49 cases of delirium.
(a) [1 mark] If the treatment is completely ineffective what is the estimated probability of delirium in this group?
The pooled estimate of the binomial probability is
p̂ = (55 + 49)/(186 + 192) = 104/378 = 0.27513.
(b) [2 marks] What is the estimated standard error for the estimate in the previous
part.
If the treatment is completely ineffective we have n = 378 trials and p̂ is a sample
proportion, so its standard error is
√(p(1 − p)/378),
which we estimate by
√(p̂(1 − p̂)/378) = √(0.275 × 0.725/378) = 0.02297.
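These two computations amount to a few lines (a sketch in Python):

```python
import math

# Pooled delirium proportion and its standard error under "treatment is ineffective"
cases_t, n_t = 55, 186    # treatment group
cases_c, n_c = 49, 192    # control group
p_hat = (cases_t + cases_c) / (n_t + n_c)
se = math.sqrt(p_hat * (1 - p_hat) / (n_t + n_c))
print(round(p_hat, 5), round(se, 5))
```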
(c) [5 marks] Is there clear evidence of a difference in either direction in delirium
rates between treatment and control?
This is a test for the equality of two proportions: p1, the probability of delirium
in the treatment group, and p2, the probability of delirium in the control group.
The null hypothesis is p1 = p2. The alternative is two-sided: p1 ≠ p2. The test
statistic is
T = (p̂1 − p̂2)/√(p̂(1 − p̂)(1/186 + 1/192)) = (0.2957 − 0.2552)/0.0459 = 0.88.
This is nowhere near significant; there is little evidence of a difference in the
probability of delirium between treatment and control. The actual P-value is
about 0.38.
7. I have regularly used the heights of 1078 father / adult son pairs gathered in Victorian
England. We are interested in predicting the heights of sons from the heights of fathers
supposing that the relationship is described by a straight line. Fathers average 67.69
inches in height with a standard deviation of 2.74 inches. Sons average 68.68 inches
in height with a standard deviation of 2.81 inches. The average of the products (son’s
height times father’s height) is 4652.895 square inches. Students writing in Burnaby
or at the CSD got to see the following sentence, which was added after the exams were
printed; students in Surrey never found this out! I adjusted the marking to compensate.
The Error Sum of Squares Σᵢ (Yᵢ − β̂0 − β̂1 xᵢ)² is 6390.331 square inches.
(a) [2 marks] Show that
(1/n) Σᵢ₌₁ⁿ xᵢyᵢ − x̄ȳ = (1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ).
The right hand side is
(1/n) Σᵢ (xᵢ − x̄)(yᵢ − ȳ) = (1/n) Σᵢ (xᵢyᵢ − xᵢȳ − yᵢx̄ + x̄ȳ)
= (1/n) Σᵢ xᵢyᵢ − ȳ (1/n) Σᵢ xᵢ − x̄ (1/n) Σᵢ yᵢ + (1/n) n x̄ȳ
= (1/n) Σᵢ xᵢyᵢ − ȳx̄ − x̄ȳ + x̄ȳ
= (1/n) Σᵢ xᵢyᵢ − x̄ȳ
as desired.
(b) [3 marks] Estimate the slope and intercept of the least squares line for predicting
son’s heights from father’s heights.
This just requires you to plug in to the various formulas. The numerator of β̂1 is
(1/n) Σᵢ xᵢyᵢ − x̄ȳ = 4652.895 − 67.69 × 68.68 = 3.946.
The denominator is
(1/n) Σᵢ xᵢ² − x̄² = ((n − 1)/n) sₓ² = (1077/1078) × 2.74² = 7.501.
Thus
β̂1 = 3.946/7.501 = 0.526.
Then
β̂0 = ȳ − β̂1 x̄ = 68.68 − 0.526 × 67.69 = 33.08.
The units of β̂0 are inches while β̂1 is unitless (inches per inch).
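The plug-in arithmetic can be checked directly from the summary statistics quoted in the question (a sketch in Python):

```python
# Least-squares slope and intercept from summary statistics
n = 1078
x_bar, s_x = 67.69, 2.74     # fathers' heights (inches)
y_bar = 68.68                # sons' mean height (inches)
xy_bar = 4652.895            # average of the products x_i * y_i

slope = (xy_bar - x_bar * y_bar) / ((n - 1) / n * s_x**2)
intercept = y_bar - slope * x_bar
print(round(slope, 3), round(intercept, 1))
```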
(c) [5 marks]Give a 95% confidence interval for the true slope for the population
these families are drawn from.
The interval is
β̂1 ± t0.025,n−2 × S/√((n − 1)sₓ²)
where n − 2 = 1076, which is so large we must use the normal critical value 1.96.
The quantity S is the estimated residual standard deviation, which I had to add
to the data given to you (it can be computed from the information given if you
know how, but I did not teach how). Using the Error Sum of Squares given above
we find
S = √(6390.331/1076) = 2.437
inches. Our interval is
0.526 ± 1.96 × 2.437/√(1077 × 2.74²).
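Finishing the arithmetic that the solution leaves to the reader (a sketch; 1.96 is the normal critical value the solution uses):

```python
import math

# 95% CI for the slope: b1 +/- 1.96 * S / sqrt((n - 1) * s_x^2)
n, s_x, b1 = 1078, 2.74, 0.526
S = math.sqrt(6390.331 / 1076)                    # residual SD, about 2.437 inches
half = 1.96 * S / math.sqrt((n - 1) * s_x**2)
print(round(b1 - half, 3), round(b1 + half, 3))   # roughly 0.473 to 0.579
```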
Marking scheme:
1a 5   1b 5   2 5
3a 5   3b 5
4a 5   4b 5   4c 5   4d 2
5a 2   5b 2   5c 5   5d 1
6a 1   6b 2   6c 5
7a 2   7b 3   7c 5
Total 70