Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
BS1.
(a) Let the joint density for random variables X and Y be defined by
fX,Y (x, y) = 6(1 − y)
0<x<y<1
and zero otherwise.
(i) Evaluate EfX [X] and EfX [X 2 ], and hence evaluate VfX [X].
5 MARKS
(ii) Derive the conditional density, fY |X (y|x), and the conditional expectation
EfY |X [(1 − Y )|X = x],
for general x, 0 < x < 1. Hence or otherwise, evaluate EfY [Y ] and CfX,Y [X, Y ],
the covariance of X and Y .
5 MARKS
(b) Let the joint density for random variables U and V be defined by
3
fU,V (u, v) = u2 (1 − |v|)
2
− 1 < u < 1, −1 < v < 1
and zero otherwise. Let A ≡ {(u, v) : 0 < u < 1, 0 < v < u}. Compute P [(U, V ) ∈ A].
5 MARKS
(c) Let the joint density for random variables R and S be defined by
fR,S (r, s) = 2r
0 < r < 1, 0 < s < 1
and zero otherwise. Find the probability
P R2 < S < R .
5 MARKS
BS2.
(a) The cumulant generating function for random variable X, KX (t), is defined
by
KX (t) = log MX (t) = log EfX [etX ]
for t ∈ (−h, h), some h > 0. Show that
KX (λt1 + (1 − λ)t2 ) ≤ λKX (t1 ) + (1 − λ)KX (t2 )
for t1 , t2 ∈ (−h, h), and 0 ≤ λ ≤ 1.
6 MARKS
(b) Consider the probability density functions f0 and fβ for β > 0 such that
fβ (x) =
f0 (x) exp{−βh(x)}
Z(β)
1
for function h(x) > 0, where Z(β) is the normalizing constant for density fβ .
(i) Show that
dZ(β)
= −Z(β)Efβ [h(X)]
dβ
and find a similar expression for
d
dβ
1
Z(β)
.
6 MARKS
(ii) Let g(β) be defined by
g(β) = Efβ [h(X)].
Show that g(β) is a non-increasing function of β.
8 MARKS
BS3.
(a) Suppose that
FX (x) =
(x + 1)2 − 1
(x + 1)2
0<x<∞
with FX (x) = 0 for x ≤ 0. Let Yn be the maximum order statistic derived from a
random sample X1 , . . . , Xn from this distribution.
(i) Show that in this case Yn has no limiting distribution.
4 MARKS
(ii) Consider the transformed variable
1
Zn = √ Yn .
n
Show that Zn does have a limiting distribution, and identify it.
4 MARKS
(b) Now suppose that X1 , ...Xn are independent Exponential (1) random variables.
Let Yn be the maximum order statistic derived from a random sample X1 , . . . , Xn .
(i) Find the distribution function for Zn = Yn − a for constant a.
4 MARKS
(ii) Find a sequence of constants {an } such that, for z > 0,
lim FZn (z) = exp {− exp {−z}}
n→∞
and hence find an approximation to the probability P [Yn > y0 ] for large n, for
fixed y0 > 0.
4 MARKS
2
(c) The times between earthquakes of a certain magnitude that occur in Southern
California are assumed to be independent Exponential random variables with rate
parameter λ = 0.01 (per day), so that the expected time between earthquakes
is 100 days. Over a given recording period, n = 200 earthquakes were observed,
and the longest time between any two consecutive earthquakes was 1021 days.
Comment on the validity of the Exponential distribution assumption in the light
of these data.
4 MARKS
BS4.
~ = [X1 , X2 , . . . , Xn ]0 be a random sample from an Exponential distribution
Let X
with density
fθ (x) = θe−θx I[x ≥ 0], θ > 0.
(a) Find the maximum likelihood estimator (MLE) of θ, say θ̂n , and show that
θ̂n is consistent and asymptotically normal (without resorting to known results
regarding MLEs). Invoke clearly any other theorem you need.
5 MARKS
Now suppose that we observe only Yi = [Xi ] + 1, where the square bracket means
the largest integer smaller than or equal to Xi , i = 1, . . . , n.
(b) Find the MLE of θ, say θ̃n based on the random sample Y~ = [Y1 , . . . , Yn ]0 .
5 MARKS
(c) Find the asymptotic distribution of θ̃n .
5 MARKS
(d) Find the asymptotic relative efficiency of θ̂n with respect to θ̃n and comment
on the situations where θ is small and where θ tends to infinity.
5 MARKS
BS5.
The class of power series distributions consists of families of probability mass functions (pmf) of the form
a(x) x
γ I[x ∈ {0, 1, 2, . . . }]
K(γ)
P
x
where γ > 0, and where a(x) satisfies ∞
x=0 a(x)γ = K(γ) < ∞. Suppose that
~ = [X1 , . . . , Xn ]0 is a random sample from a family of power series distributions
X
P
given by {fγ : γ > 0}. Let Y = ni=1 Xi .
fγ (x) =
(a) Show that Y has a power series distribution.
2 MARKS
(b) Show that Y is γ-sufficient.
2 MARKS
(c) Show that Y is γ-complete if the set {γ > 0 : K(γ) < ∞} contains an interval.
3 MARKS
(d) Let A(y) =
P
~
x:
Pn
i=1
xi =y
Qn
i=1
a(xi ). Show that the uniformly minimum vari3
ance unbiased estimator (UMVUE) of γ r is
(
0,
if Y < r
τ (Y ) =
A(Y − r)/A(Y ), if Y ≥ r
for any Y with positive probability under any γ > 0.
Optional hint: You may wish to consider an expression for K(γ)n Eγ T for a
candidate estimator T .
5 MARKS
~ = [X1 , . . . , Xn ]0 be a random sample from a Geometric(p) distribution
(e) Let X
with pmf fp (x) = p(1 − p)x I[x ∈ {0, 1, . . . }]. Obtain the UMVUE of p.
5 MARKS
(f ) Under the same setting as in (e), find the UMVUE of θ = (1 − p)/p.
3 MARKS
BS6.
(a) State the Neyman-Pearson Lemma. Make sure to clearly state the criteria and
conclusions for the so-called necessity and sufficiency parts of the lemma.
4 MARKS
For the rest of this question, let X ∼ U (θ, θ + 1), θ ∈ {0, θ1 } for some θ1 > 0.
We wish to test H0 : θ = 0 against H1 : θ = θ1 .
(b) Let θ1 = 1. Find the most powerful level α test for all 1 ≥ α ≥ 0. Indicate
which necessity condition of the Neyman-Pearson Lemma is not met. 4 MARKS
(c) Let θ1 be a value in (0.95, 1). Show that there exists a most powerful level
5% test that has size strictly smaller than 5%. Explain what condition in the
Neyman-Pearson Lemma is not satisfied.
4 MARKS
(d) Let θ1 be a value in (0, 0.95].
i. Find a most powerful level 5% test.
2 MARKS
ii. Show that this test is not unique.
2 MARKS
(e) Sketch on one graph the maximum power attainable in the alternative for
H1 : θ = θ1 as a function of θ1 ∈ (0, 1].
4 MARKS
BS7.
The expected proportions of three genetic phenotypes are (Table 1): where θ is the
allele frequency. Suppose we observe the following frequencies from n individuals
selected independently (Table 2).
(a) Write down the likelihood for θ.
5 MARKS
4
AA
Aa
θ2 2θ(1 − θ)
aa
(1 − θ)2
Table 1: Expected proportions.
AA
fAA
Aa
fAa
aa
faa
Total
n
Table 2: Observed frequencies.
(b) From first principles derive an expression for θ̂M LE .
5 MARKS
(c) From first principles, derive Var(θ̂M LE ).
5 MARKS
(d) Show how one could estimate θ using the Newton-Raphson algorithm, and how
one can use the procedure to calculate its variance.
5 MARKS
BS8.
(a) Let θ be a parameter of a distribution and X a random variable from the
distribution. Consider an estimator W (X) for τ (θ). Give the definition of bias in
terms of W (X) for τ (θ).
4 MARKS
(b) Now consider a single random variable Y distributed as Bin(n, p). Characterize
functions of p that are unbiasedly estimable by any function of W (Y ).
6 MARKS
(c) Again consider Y from Bin(n, p). Show that
by any function of W (Y ).
p
1−p
is not unbiasedly estimable
4 MARKS
(d) In Epidemiology
p
1−p
is called the odds and log
p
1−p
the logit of p. Consider
p̂
again a random variable Y from Bin(n, p) that yields θ̂ = log 1−p̂
. Derive an
approximation for Var(θ̂).
6 MARKS
BS9.
(a) Give an example where the nuisance parameter(s) in a multi-parameter likelihood can be eliminated by a conditioning argument. Give enough details that a
reader new to this topic could follow.
10 MARKS
(b) Explain the concept of a profile likelihood, and give an example where the
nuisance parameter(s) in a multi-parameter likelihood can be eliminated by analytically calculating a profile likelihood. What could be done if one were not able
5
to arrive at the profile likelihood analytically?
10 MARKS
6
STUDENT NAME:
STUDENT ID#
McGILL UNIVERSITY
Department of Epidemiology, Biostatistics and Occupational Health
Biostatistics Part A Comprehensive Exam
Theory Paper
Date: Friday May 1, 2009
Time: 13:00 - 17:00
INSTRUCTIONS
Answer only six questions out of BS1-BS9.
Questions
BS1
BS2
BS3
BS4
BS5
BS6
BS7
BS8
BS9
Marks
This exam comprises the cover and 6 pages of questions.