SOLUTION FOR HOMEWORK 8, STAT 4372
Welcome to your 8th homework. Here you have an opportunity to solve classical estimation problems that, because of their simplicity, you must be able to solve on the exam.
1. Problem 15.4 Given: $\bar X = 35{,}000$, $S_n = 75{,}000$, $\hat\pi_{50} = 10{,}000$, $\hat\pi_{90} = 100{,}000$.
Using percentile matching for the Weibull distribution, find its two parameters.
Solution: I do not know why the moments are also given; in any case, you could use them for the method of moments. Now an important remark. For the Weibull, based on the current Table, you can match either via percentiles, using $\mathrm{VaR}_g(X|\theta,\tau)$, or via the cdf $F(x|\theta,\tau)$. Below I present both approaches; on your exam choose whichever is faster.
(a) Matching percentiles $\hat\pi_g = \pi_g(\theta,\tau) = \mathrm{VaR}_g(X|\theta,\tau)$, where here $g = .5$ and $g = .9$.
From the Table, $\mathrm{VaR}_g(X|\theta,\tau) = \theta[-\ln(1-g)]^{1/\tau}$.
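Now I solve the system of two equations, always going from system to system (to avoid a stupid mistake). Write
$$\begin{cases} \theta[-\ln(.5)]^{1/\tau} = 10{,}000 \\ \theta[-\ln(.1)]^{1/\tau} = 100{,}000 \end{cases}$$
Now I begin to solve it by dividing the second equation by the first:
$$\begin{cases} [\ln(.1)/\ln(.5)]^{1/\tau} = 10 \\ \theta[-\ln(.5)]^{1/\tau} = 10{,}000 \end{cases} \;\Longrightarrow\; \begin{cases} 1/\tau = \ln(10)/\ln[\ln(.1)/\ln(.5)] \\ \theta = 10{,}000[-\ln(.5)]^{-1/\tau} \end{cases} \;\Longrightarrow\; \begin{cases} 1/\tau = 1.918 \\ \theta = 20{,}197.5 \end{cases}$$
Answer: $\hat\tau = .52$ and $\hat\theta = 20{,}197.5$.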
(b) Matching via the known cdf $F(x|\theta,\tau) = 1 - e^{-(x/\theta)^\tau}$. I need here two equations $F(\hat\pi_g|\theta,\tau) = g$ with $g = .5, .9$. Let us solve the system of equations
$$\begin{cases} 1 - \exp(-(10{,}000/\theta)^\tau) = .5 \\ 1 - \exp(-(100{,}000/\theta)^\tau) = .9 \end{cases}$$
Put the exponents on the right and the numbers on the left, take logarithms, and get a system similar to the one considered above:
$$\begin{cases} (10{,}000/\theta)^\tau = -\ln(.5) \\ (100{,}000/\theta)^\tau = -\ln(.1) \end{cases}$$
Solve it and get the same answer.
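If you want to check the arithmetic of Problem 15.4 numerically, the following is a minimal Python sketch. It assumes only the Weibull formulas quoted above; the function names are illustrative and are not taken from the textbook. It solves the percentile-matching system in closed form and then verifies that the fitted cdf reproduces $g = .5$ and $g = .9$, which is approach (b).

```python
import math

def weibull_percentile_match(x1, g1, x2, g2):
    """Solve theta * (-ln(1 - g))**(1/tau) = x for two (x, g) pairs.

    Illustrative helper (name is an assumption, not from the source).
    """
    k1 = -math.log(1.0 - g1)
    k2 = -math.log(1.0 - g2)
    # Dividing the two equations eliminates theta:
    # (k2 / k1)**(1/tau) = x2 / x1.
    tau = math.log(k2 / k1) / math.log(x2 / x1)
    theta = x1 / k1 ** (1.0 / tau)
    return theta, tau

def weibull_cdf(x, theta, tau):
    """Weibull cdf F(x) = 1 - exp(-(x/theta)**tau)."""
    return 1.0 - math.exp(-((x / theta) ** tau))

theta_hat, tau_hat = weibull_percentile_match(10_000, 0.5, 100_000, 0.9)
print(f"theta = {theta_hat:.1f}, tau = {tau_hat:.4f}")
# Approach (b): the fitted cdf should return the matched probabilities.
print(weibull_cdf(10_000, theta_hat, tau_hat), weibull_cdf(100_000, theta_hat, tau_hat))
```

Without intermediate rounding of $1/\tau$ the scale comes out a fraction of a unit below 20,197.5; the difference is only rounding.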
2. Problem 15.6 Solution: First of all, you need to restate the claims at the year-3 level by taking inflation into account. The 100 claims from the first year will give you
$$(100)(10{,}000)(1.1)^2 = 1{,}210{,}000$$
and the 200 claims from the second year will give you
$$(200)(12{,}500)(1.1) = 2{,}750{,}000.$$
To use the method of moments I calculate the average of the $n = 300$ claims as
$$\bar X = [1{,}210{,}000 + 2{,}750{,}000]/300 = 13{,}200.$$
For the Pareto distribution $\mu = \theta/(\alpha - 1)$, and because $\alpha = 3$ is given we get $\mu = \theta/2$. From $\mu = \bar X$ we get the answer $\hat\theta = 26{,}400$.
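The same calculation can be sketched in a few lines of Python. The inputs (claim counts, average claims, and the 10% inflation factor) are read off the arithmetic above; the variable names are illustrative assumptions.

```python
# (number of claims, average claim, years of inflation needed to reach year 3)
# Values are taken from the arithmetic in the solution above.
claims_by_year = [(100, 10_000, 2), (200, 12_500, 1)]
inflation = 0.10   # 10% per year, as implied by the factor 1.1
alpha = 3          # given Pareto shape parameter

# Total of all claims restated at the year-3 level.
total = sum(n * avg * (1 + inflation) ** yrs for n, avg, yrs in claims_by_year)
n_claims = sum(n for n, _, _ in claims_by_year)
x_bar = total / n_claims

# Method of moments for the Pareto: mean = theta / (alpha - 1).
theta_hat = x_bar * (alpha - 1)
print(round(x_bar, 2), round(theta_hat, 2))
```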
3. Problem 15.8 This is a nice problem. Given: $X := ZA + (1 - Z)B$, where $Z$ is Bernoulli($p$) and independent of two exponentially distributed random variables $A$ and $B$ with means 1 and 10, respectively. [Note that there is no information that $A$ and $B$ are independent.] Given $\mathrm{Var}_p(X) = 2^2$, find $p$ using the method of moments.
Solution: Here the empirical variance is given, so we need to calculate the theoretical variance as a function of $p$ and then solve the equation. To find the variance we first calculate the first and second moments. Write
$$E_p(X) = E_p(Z)E(A) + E_p(1 - Z)E(B) = p(1) + (1 - p)(10) = 10 - 9p.$$
For the second moment, the Table gives $E(A^2) = 2[E(A)]^2$ for an exponential random variable (and likewise for $B$); using this formula and $E_p(Z^2) = p$ we can write
$$E_p(X^2) = E_p(Z^2A^2) + 2E_p\{Z(1 - Z)AB\} + E_p((1 - Z)^2B^2)$$
$$= pE(A^2) + 0 + (1 - p)E(B^2) = 2p + (1 - p)200 = 200 - 198p.$$
In the above I used $Z(1 - Z) = 0$, which holds because $Z$ is either 0 or 1.
Now we can calculate
$$\mathrm{Var}_p(X) = E_p(X^2) - [E_p(X)]^2 = 200 - 198p - (10 - 9p)^2 = 100 - 18p - 81p^2.$$
Solve the equation $\mathrm{Var}_p(X) = 4$ and get $\hat p = .98$.
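To see where the final number comes from, here is a small Python sketch that solves the quadratic and keeps the root that is a valid probability. It assumes the variance formula derived above and a target variance of 4; the names are illustrative.

```python
import math

def variance(p):
    """Theoretical Var_p(X) built from the two moments derived above."""
    mean = 10 - 9 * p
    second_moment = 200 - 198 * p
    return second_moment - mean ** 2

# Var_p(X) = 4  <=>  100 - 18p - 81p^2 = 4  <=>  81p^2 + 18p - 96 = 0.
a, b, c = 81.0, 18.0, -96.0
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]

# Only a root in [0, 1] can be a Bernoulli parameter.
p_hat = next(r for r in roots if 0.0 <= r <= 1.0)
print(round(p_hat, 4), variance(p_hat))
```

The other root is negative, so it is discarded automatically.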
4. Problem 15.21 Given: Losses are Weibull($\theta, \tau$). A sample of 16 losses is given (I skip it). Use the smoothed empirical estimates of the 20th and 70th percentiles to find the underlying parameters $\theta, \tau$.
Solution: First, $n + 1 = 17$. Then for the 20th percentile we have $(17)(.2) = 3.4$, which yields $j_{.2} = 3$ and $h_{.2} = .4$, and $X_{(3)} = 75$. Then using Definition 15.3 on page 377 yields
$$\hat\pi_{.2} = (.6)X_{(3)} + (.4)X_{(4)} = .6(75) + .4(81) = 77.4.$$
Similarly, for the 70th percentile we get $.7(17) = 11.9$, which yields $j_{.7} = 11$, $h_{.7} = .9$, and $X_{(11)} = 122$. Then
$$\hat\pi_{.7} = (.1)X_{(11)} + (.9)X_{(12)} = .1(122) + .9(125) = 124.7.$$
Here we can use the cdf and equate $F_X(\hat\pi_g) = g$. We get
$$\begin{cases} 1 - \exp(-(77.4/\theta)^\tau) = .2 \\ 1 - \exp(-(124.7/\theta)^\tau) = .7 \end{cases}$$
Solve the system and get $\hat\tau = 3.53$ and $\hat\theta = 118.32$.
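The two steps of this problem (smoothed empirical percentiles, then Weibull matching) can be sketched in Python as follows. Since the full sample is skipped above, the sketch uses only the four order statistics quoted in the solution; the dictionary layout and function name are illustrative assumptions.

```python
import math

def smoothed_percentile(order_stats, n, g):
    """Smoothed empirical percentile: (1 - h) X_(j) + h X_(j+1),
    where (n + 1) g = j + h with integer j and 0 <= h < 1.

    order_stats maps the 1-based rank j to X_(j); only ranks j and j + 1
    are looked up. (Illustrative helper, not from the source.)
    """
    pos = (n + 1) * g
    j = math.floor(pos)
    h = pos - j
    return (1 - h) * order_stats[j] + h * order_stats[j + 1]

# Only the order statistics quoted in the solution are known here.
known = {3: 75, 4: 81, 11: 122, 12: 125}
pi_20 = smoothed_percentile(known, 16, 0.2)
pi_70 = smoothed_percentile(known, 16, 0.7)

# Weibull matching: (pi_g / theta)**tau = -ln(1 - g) for g = .2 and .7.
k20 = -math.log(1 - 0.2)
k70 = -math.log(1 - 0.7)
tau_hat = math.log(k70 / k20) / math.log(pi_70 / pi_20)
theta_hat = pi_20 / k20 ** (1 / tau_hat)
print(round(pi_20, 1), round(pi_70, 1), round(tau_hat, 2), round(theta_hat, 2))
```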
5. Problem 15.22 Given: $X$ has pdf $f_X(x) = \theta^{-1}\exp(-(x - \delta)/\theta)I(x > \delta)$. Given $\bar X = 300$ and $\hat\pi_{.5}(X) = 240$. Find $\delta$ and $\theta$.
Solution: It is important to note (with the purpose of a faster solution) that $\delta$ is a shift parameter, that is, $X = Z + \delta$ where $Z$ is an exponential RV with mean $\theta$.
As a result of this remark, $\bar Z = 300 - \delta$, $\hat\pi_{.5}(Z) = 240 - \delta$, and we can use the Table for the theoretical characteristics
$$E_\theta(Z) = \theta, \qquad \pi_{.5}(Z) = \mathrm{VaR}_{.5}(Z) = -\theta\ln(.5).$$
Remark: If the RV is not from the Table then you need to calculate the characteristics from the given pdf, that is, calculate the mean and median directly using their definitions. For instance, here the solution of $\int_0^m f_Z(z)\,dz = \int_m^\infty f_Z(z)\,dz$ will give you the median $m$.
Then we solve the system
$$\begin{cases} \theta = 300 - \delta \\ -\theta\ln(.5) = 240 - \delta \end{cases} \;\Longrightarrow\; \begin{cases} 60 = \theta[1 + \ln(.5)] \\ \delta = 300 - \theta \end{cases}$$
Answer: $\hat\theta = 195.5$ and $\hat\delta = 104.5$.
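A quick numerical check of this system, as a minimal Python sketch with illustrative variable names: subtracting the median equation from the mean equation eliminates $\delta$, after which $\delta$ follows from the mean equation.

```python
import math

x_bar, median = 300.0, 240.0   # given sample mean and sample median

# Mean:   theta + delta = x_bar
# Median: -theta * ln(.5) + delta = median
# Subtracting: theta * (1 + ln(.5)) = x_bar - median.
theta_hat = (x_bar - median) / (1.0 + math.log(0.5))
delta_hat = x_bar - theta_hat
print(round(theta_hat, 2), round(delta_hat, 2))

# Sanity check: the fitted median delta + theta * ln(2) should return 240.
print(delta_hat + theta_hat * math.log(2.0))
```

Carrying full precision gives a shift close to 104.5.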
6. Problem 15.29 Solution: First of all, I need to explain what $q_{35}$ is. This is a discrete analog of the hazard rate function,
$$q_{35} = \frac{S(35) - S(36)}{S(35)}.$$
In other words, it is the probability of dying during the 36th year of life given that a life was observed at age 35 (see the discussion and definition on page 368; this parameter is used for life tables and in life insurance - Exam 3). Because here we are dealing with a conditional probability, we can restrict attention to lives that were observed at age 35 and then consider a new distribution such that $F(35) = 0$ and $S(35) = 1$. Then $q_{35} = F(36) = 1 - S(36)$. This is the idea of shifting the data discussed on page 385.
Now let us look at the data and the factors in the likelihood. Write $w = q_{35}$ and measure time from age 35, so that $F(1) = w$; the calculations below use $F(.4) = .4w$, that is, $F(t) = wt$ for $0 \le t \le 1$. We have 4 groups of data and corresponding likelihood factors.
(1) 6 lives observed at age 35.4 that died before 36. The conditional probability of this event for each life is (I use the technique discussed on p. 385)
$$\frac{F(1) - F(.4)}{1 - F(.4)} = \frac{w - .4w}{1 - .4w} = \frac{.6w}{1 - .4w}.$$
The corresponding factor in the likelihood is
$$L_1 = \left[\frac{.6w}{1 - .4w}\right]^6.$$
(2) 4 lives observed at age 35.4 that survived to age 36. The conditional probability of the event for each life is
$$\frac{F(\infty) - F(1)}{F(\infty) - F(.4)} = \frac{1 - w}{1 - .4w},$$
with the corresponding factor in the likelihood
$$L_2 = \left[\frac{1 - w}{1 - .4w}\right]^4.$$
(3) 8 lives observed at age 35 (the initial point) that died before 36. The conditional probability of the event for a life is $w$, with the likelihood factor for the group being
$$L_3 = w^8.$$
(4) 12 lives first observed at 35 that survived to age 36. The conditional probability of the event is $1 - w$ and the factor is
$$L_4 = (1 - w)^{12}.$$
Now note that up to a constant factor (which is irrelevant) the total likelihood is proportional to
$$L = \frac{w^{14}(1 - w)^{16}}{(1 - .4w)^{10}}.$$
Take its logarithm; the derivative is
$$l' = (14/w) - 16/(1 - w) + 4/(1 - .4w).$$
Set the derivative to zero, look at the numerator, and get
$$14 - 31.6w + 8w^2 = 0.$$
The solution which is less than 1 is $\hat w = \hat q_{35} = .51$.
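The last step can be checked with a short Python sketch. It assumes the total likelihood $w^{14}(1-w)^{16}/(1-.4w)^{10}$ stated above; the function names are illustrative. It solves the quadratic, keeps the root in $(0, 1)$, and confirms that the derivative of the log-likelihood vanishes there.

```python
import math

def log_likelihood(w):
    """ln of w^14 (1 - w)^16 / (1 - .4w)^10, up to an additive constant."""
    return 14 * math.log(w) + 16 * math.log(1 - w) - 10 * math.log(1 - 0.4 * w)

def score(w):
    """Derivative of the log-likelihood with respect to w."""
    return 14 / w - 16 / (1 - w) + 4 / (1 - 0.4 * w)

# Numerator of the score: 8 w^2 - 31.6 w + 14 = 0.
a, b, c = 8.0, -31.6, 14.0
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]

# q_35 is a probability, so only a root strictly between 0 and 1 is admissible.
w_hat = next(r for r in roots if 0.0 < r < 1.0)
print(round(w_hat, 4), score(w_hat))
```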
7. Problem 15.30 Solution: Remember that for censored data the likelihood is the product of the density function at the moments of observed deaths times the product of the survival function at the moments of censoring. As a result, our first step is to calculate the underlying survival and density functions.
According to page 17, if the hazard rate function $h$ of a nonnegative RV is given then its survival function can be calculated as
$$S(t) = e^{-\int_0^t h(u)\,du}.$$
Then the pdf is $f(t) = -S'(t) := -dS(t)/dt$. Here $h(t) = \lambda_1$ for $0 \le t < 2$ and $h(t) = \lambda_2$ for $t \ge 2$, so using these formulae we get
$$S(t) = \exp(-\lambda_1 t)I(0 \le t < 2) + \exp(-2\lambda_1 - \lambda_2(t - 2))I(t \ge 2),$$
and
$$f(t) = \lambda_1\exp(-\lambda_1 t)I(0 \le t < 2) + \lambda_2\exp(-2\lambda_1 - \lambda_2(t - 2))I(t \ge 2).$$
Then we calculate the likelihood function,
$$L(\lambda_1, \lambda_2) = \lambda_1\exp(-\lambda_1(1.7))\,\lambda_2\exp(-2\lambda_1 - \lambda_2(3.3 - 2))$$
$$\times \exp(-\lambda_1(1.5))\exp(-2\lambda_1 - \lambda_2(2.6 - 2))\exp(-2\lambda_1 - \lambda_2(3.5 - 2))$$
$$= \lambda_1\lambda_2\exp(-\lambda_1[1.7 + 1.5 + 6] - \lambda_2[1.3 + .6 + 1.5]) = \lambda_1\lambda_2\exp(-9.2\lambda_1 - 3.4\lambda_2).$$
Now we need to find the values of the parameters $\lambda_1$ and $\lambda_2$ that maximize the likelihood. To do this it is convenient to take the logarithm, set the first-order partial derivatives with respect to $\lambda_1$ and $\lambda_2$ equal to zero, and then solve the system of equations. Formally you then need to check that the solution is a point of maximum, but if the solution is unique then typically it is the point of maximum, so you can save time.
Remark: Just in case, this is the rule for checking that a function $h(x, y)$ attains a (local) maximum at the point $(a, b)$. (a) The first-order partial derivatives are zero, that is
$$\partial h(x, y)/\partial x|_{x=a,y=b} = 0 \quad\text{and}\quad \partial h(x, y)/\partial y|_{x=a,y=b} = 0.$$
(b) At least one second-order partial derivative is negative, that is
$$\partial^2 h(x, y)/\partial x^2|_{x=a,y=b} < 0 \quad\text{or}\quad \partial^2 h(x, y)/\partial y^2|_{x=a,y=b} < 0.$$
(c) The determinant of the matrix of second-order partial derivatives (the Hessian) is positive,
$$\{[\partial^2 h(x, y)/\partial x^2][\partial^2 h(x, y)/\partial y^2] - [\partial^2 h(x, y)/\partial x\partial y]^2\}|_{x=a,y=b} > 0.$$
For the data at hand the log-likelihood function is
$$l(\lambda_1, \lambda_2) = \ln(L(\lambda_1, \lambda_2)) = \ln(\lambda_1) + \ln(\lambda_2) - 9.2\lambda_1 - 3.4\lambda_2.$$
Take first-order partial derivatives and get the system
$$\begin{cases} \lambda_1^{-1} - 9.2 = 0 \\ \lambda_2^{-1} - 3.4 = 0. \end{cases}$$
Its solution gives us the Answer: $\hat\lambda_1 = 1/9.2 = .11$ and $\hat\lambda_2 = 1/3.4 = .294$.
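The coefficients 9.2 and 3.4 are the total times the five lives spend in each hazard regime. The Python sketch below recomputes them from the death and censoring times used in the likelihood above. It relies on the fact that, for a piecewise-constant hazard, the log-likelihood has the form $d_1\ln\lambda_1 + d_2\ln\lambda_2 - \lambda_1 E_1 - \lambda_2 E_2$, so each MLE is (deaths in the interval)/(exposure in the interval); the variable names are illustrative.

```python
deaths = [1.7, 3.3]          # observed death times
censored = [1.5, 2.6, 3.5]   # censoring times
change = 2.0                 # the hazard switches from lambda1 to lambda2 at t = 2

times = deaths + censored
# Time each life spends under lambda1 (before t = 2) and under lambda2 (after).
exposure1 = sum(min(t, change) for t in times)
exposure2 = sum(max(t - change, 0.0) for t in times)
deaths1 = sum(1 for t in deaths if t < change)
deaths2 = sum(1 for t in deaths if t >= change)

# Setting the partial derivatives d_k / lambda_k - E_k to zero:
lambda1_hat = deaths1 / exposure1
lambda2_hat = deaths2 / exposure2

# Second-order check from the Remark: both pure second derivatives are
# negative and the mixed derivative is zero, so the determinant is positive.
d2_11 = -deaths1 / lambda1_hat ** 2
d2_22 = -deaths2 / lambda2_hat ** 2
hessian_det = d2_11 * d2_22 - 0.0 ** 2
print(round(lambda1_hat, 3), round(lambda2_hat, 3), hessian_det > 0)
```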
8. Problem 15.33 Given: Loss $X$ is Exponential($\theta$). Find $E(X) = \theta$ based on the given data.
Solution: We have 5 observations of the loss and 495 censorings at the value 4,000. The 5 observed losses should be plugged into the density and all 495 censoring values into the survival function. Here $f(t) = \theta^{-1}e^{-t/\theta}$ and $S(t) = e^{-t/\theta}$. As a result, the likelihood function is
$$L(\theta) = \theta^{-5}\exp(-[1100 + 3200 + 3300 + 3500 + 3900]/\theta)\exp(-(495)(4000)/\theta).$$
This yields the log-likelihood
$$l(\theta) = -5\ln(\theta) - 1{,}995{,}000/\theta.$$
Take the derivative, set it to zero, and find the solution:
$$dl(\theta)/d\theta = -5/\theta + 1{,}995{,}000/\theta^2 = 0,$$
and the solution is $\hat\theta = 399{,}000$.
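For a censored exponential sample the MLE is the total of all observed and censored amounts divided by the number of uncensored observations. A minimal Python sketch with the data from the likelihood above (names are illustrative):

```python
import math

observed = [1100, 3200, 3300, 3500, 3900]   # uncensored losses
n_censored, censor_point = 495, 4000        # losses censored at 4,000

# Every loss, censored or not, contributes its amount to the exponent.
total_exposure = sum(observed) + n_censored * censor_point
# Only uncensored losses contribute a factor 1/theta.
theta_hat = total_exposure / len(observed)

def log_likelihood(theta):
    """l(theta) = -5 ln(theta) - 1,995,000 / theta for these data."""
    return -len(observed) * math.log(theta) - total_exposure / theta

print(total_exposure, theta_hat)
```

Evaluating `log_likelihood` slightly below and above `theta_hat` is an easy way to confirm that the stationary point is a maximum.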
9. Problem 15.35 Solution: Actuary X is dealing with 4 observed times and 1 time censored at 5 years; actuary Y is dealing with 5 observed times. For the problem at hand the pdf is
$$f(t) = -dS(t)/dt = w^{-1}I(0 \le t \le w).$$
Note that this is the uniform distribution, where the parameter of interest determines the support of the density. Remember that in this case finding the MLE can be tricky and may rely on visualizing the likelihood rather than taking a derivative!
For actuary X the likelihood is
$$L_X(w) = f(1)f(3)f(4)f(4)S(5) = w^{-4}(1 - 5/w)I(w \ge 5). \qquad (1)$$
Please note that the indicator functions must be taken care of; in (1) I used
$$I(0 \le 1 \le w)I(0 \le 3 \le w)I(0 \le 4 \le w)I(0 \le 4 \le w)I(0 \le 5 \le w) = I(5 \le w).$$
The next step is to understand that $L_X(w)$ can be positive only for $w \in [5, \infty)$, and we need to understand what this function looks like for these values of $w$. Set $g(w) = w^{-4}(1 - 5/w)$ and note that its derivative is
$$dg(w)/dw = -4w^{-5} + 25w^{-6}.$$
Set the derivative to zero and find the extreme point $\hat w = 6.25$. Note that it is greater than 5, so it is admissible for the support. This is also plainly the maximum point because $g(5) = 0$, $g(w) > 0$ for $w > 5$, and $g(w) \to 0$ as $w \to \infty$. As a result, the MLE for actuary X is $\hat w_X = 6.25$.
Actuary Y observes all times, so for him the likelihood is
$$L_Y(w) = f(1)f(3)f(4)f(4)f(6) = w^{-5}I(w \ge 6). \qquad (2)$$
This is an interesting case because $w^{-5}$ is decreasing in $w$, and thus the maximum of the likelihood is attained at the point $\hat w_Y = 6$. To see this more clearly, graph the likelihood (2) as a function of $w$ and see where the maximum is. The function is zero for $w < 6$ and then $w^{-5}$ for $w \ge 6$.
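Instead of graphing, you can evaluate both likelihoods on a grid and see where each one peaks. The sketch below is only an illustration under an assumed grid (steps of 0.01 from 1 to 20); the function names are not from the textbook.

```python
def likelihood_x(w):
    """Actuary X: four observed times and one time censored at 5."""
    if w < 5:
        return 0.0          # the indicator I(w >= 5)
    return w ** -4 * (1 - 5 / w)

def likelihood_y(w):
    """Actuary Y: five observed times, the largest equal to 6."""
    return w ** -5 if w >= 6 else 0.0   # the indicator I(w >= 6)

# Assumed evaluation grid: w = 1.00, 1.01, ..., 20.00.
grid = [i / 100 for i in range(100, 2001)]
w_x = max(grid, key=likelihood_x)
w_y = max(grid, key=likelihood_y)
print(w_x, w_y)
```

For actuary X the maximum is an interior stationary point, while for actuary Y it sits on the boundary of the admissible region, which is why taking a derivative alone would not find it.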
10. Problem 15.46 Solution: Here the cdf is $F(x) = x^p$ for $0 < x < 1$, so the density is $f(x) = px^{p-1}I(0 < x < 1)$. Then the log-likelihood function is
$$l(p) = \sum_{l=1}^n [\ln(p) + (p - 1)\ln(x_l)].$$
Take the derivative, equate it to zero, and get
$$np^{-1} + \sum_{l=1}^n \ln(x_l) = 0.$$
The solution gives you the MLE $\hat p = -n/\sum_{l=1}^n \ln(x_l)$.
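As a quick illustration of the formula, here is a minimal Python sketch. The sample is hypothetical (the problem's data are not reproduced in this solution), and the function names are illustrative.

```python
import math

def mle_p(sample):
    """MLE p_hat = -n / sum(ln x_l) for the density p x^(p-1) on (0, 1)."""
    return -len(sample) / sum(math.log(x) for x in sample)

def score(p, sample):
    """Derivative of the log-likelihood: n / p + sum(ln x_l)."""
    return len(sample) / p + sum(math.log(x) for x in sample)

# Hypothetical sample with values in (0, 1); NOT the data of Problem 15.46.
sample = [0.2, 0.5, 0.7, 0.9]
p_hat = mle_p(sample)
print(p_hat, score(p_hat, sample))
```

Because every $\ln(x_l)$ is negative for $0 < x_l < 1$, the estimate is always positive.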