SOLUTION FOR HOMEWORK 1, STAT 3372
Welcome to your first homework. Remember that you may always use the Tables
allowed on SOA/CAS Exam 4. You can find them on my webpage.
Another remark: 6 minutes per problem is your “speed”. For now you probably will not be
able to solve your problems that fast, but this is the goal.
Try to find mistakes (and get extra points) in my solutions. Typically they are silly
arithmetic mistakes (not methodological ones). They allow me to check that you did your
HW on your own. Please do not e-mail me about your findings — just mention them on
the first page of your solution and count extra points. You can use them to compensate for
wrongly solved problems in this homework. They cannot be counted beyond the maximum of 10
points for each homework.
Now let us look at your problems.
General Remark: It is always prudent to begin by writing down what is given and what
you need to establish. This step can help you to understand or guess a possible solution. Also,
write your solution neatly so that, if you have extra time, it is easier to check.
1. Problem 2.4 Given: The hazard function is $h(x) = (A + e^{2x})I(x \ge 0)$ and $S(0.4) = 0.5$. Find: $A$.
Solution: I need to relate $h(x)$ and $S(x)$ to find $A$. There is a formula on page 17, also
discussed in class, which is helpful here. Namely,
$$S(x) = e^{-\int_0^x h(u)\,du}.$$
It allows us to find $A$. Indeed,
$$\int_0^x h(u)\,du = \int_0^x (A + e^{2u})\,du = Ax + (1/2)(e^{2x} - 1).$$
Using the given relation $S(0.4) = 0.5$ we get
$$\ln(0.5) = \ln(S(0.4)) = -\left[Ax + (1/2)(e^{2x} - 1)\right]\Big|_{x=0.4}.$$
Now I simplify the last relation and get
$$\ln(2) = A(0.4) + (1/2)(e^{0.8} - 1)$$
and then
$$A = \frac{\ln(2) - e^{0.8}/2 + 1/2}{0.4} = \frac{0.693 - 1.113 + 0.5}{0.4} = 0.20.$$
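As a quick numerical check of the algebra in Problem 2.4, here is a minimal Python sketch that solves $S(0.4) = 0.5$ for $A$ using the closed-form integral of the hazard. The function name `solve_A` is a label chosen for this illustration only.

```python
import math

def solve_A(x0: float, s0: float) -> float:
    """Solve exp(-(A*x0 + (exp(2*x0) - 1)/2)) = s0 for A."""
    # Taking logs: -ln(s0) = A*x0 + (e^{2*x0} - 1)/2, then isolate A.
    return (-math.log(s0) - (math.exp(2 * x0) - 1) / 2) / x0

A = solve_A(0.4, 0.5)

# Plug A back into S(x) = exp(-integral of the hazard) to confirm S(0.4) = 0.5.
survival = math.exp(-(A * 0.4 + (math.exp(0.8) - 1) / 2))
print(round(A, 2), round(survival, 2))
```

Keeping three decimals in the intermediate numbers matters here: the numerator is a small difference of larger terms, so early rounding changes the second decimal of $A$.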
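2. Problem 3.3 Given:
$$\mu := E(X) = 2, \quad \eta := \frac{\sigma}{\mu} = \frac{\sqrt{\mathrm{Var}(X)}}{\mu} = \frac{\sqrt{E(X-\mu)^2}}{\mu} = 2; \quad \mu_3' := E(X^3) = 136.$$
Find the skewness $\gamma_1$.
Solution: First, let us remember that the skewness is
$$\gamma_1 := \frac{E(X-\mu)^3}{\sigma^3} = \frac{\mu_3}{\sigma^3}. \tag{1}$$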
The standard deviation σ I can calculate from the coefficient of variation η and the mean
µ, namely
σ = µη = 4.
A formula for the third central moment via raw moments is (check it via $(a + b)^3 =
a^3 + 3a^2 b + 3ab^2 + b^3$, and remember that this is a particular case of a general binomial
formula $(a + b)^k = \sum_{r=0}^{k} [k!/(r!(k-r)!)]\,a^r b^{k-r}$)
$$E(X-\mu)^3 = E(X^3) - 3E(X^2)\mu + 2\mu^3. \tag{2}$$
On the right side of (2) I do not know $\mu_2' := E(X^2)$, but I do know how to calculate it via
central moments:
$$E(X^2) = \mathrm{Var}(X) + [E(X)]^2 = \sigma^2 + \mu^2 = 16 + 4 = 20.$$
Using this in (2) yields
$$\mu_3 = E(X-\mu)^3 = 136 - (3)(20)(2) + (2)(8) = 32.$$
Then, using (1) we get the desired
$$\gamma_1 = 32/4^3 = 1/2.$$
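The chain of steps in Problem 3.3 (coefficient of variation to $\sigma$, then $E(X^2)$, then the third central moment) can be sketched in a few lines of Python. The function and argument names are hypothetical and chosen only for this illustration.

```python
def skewness_from_moments(mean: float, cv: float, raw3: float) -> float:
    """Skewness from the mean, coefficient of variation, and E(X^3)."""
    sigma = mean * cv                       # sigma = mu * eta
    raw2 = sigma**2 + mean**2               # E(X^2) = Var(X) + mu^2
    central3 = raw3 - 3 * raw2 * mean + 2 * mean**3  # third central moment
    return central3 / sigma**3

gamma1 = skewness_from_moments(mean=2.0, cv=2.0, raw3=136.0)
print(gamma1)
```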
3. Problem 3.6 Given that the mean excess loss functions $e_X(d)$ and $e_Y(d)$ are related
as
$$e_Y(30) = e_X(30) + 4 \tag{3}$$
where $X \sim \mathrm{Unif}([0, 100])$ and $Y \sim \mathrm{Unif}([0, w])$, find $w$.
Solution: Let us remember formulae for the mean excess loss function that may be useful
here:
$$e_Z(d) := \frac{\int_d^\infty (z - d) f_Z(z)\,dz}{S_Z(d)} = \frac{\int_d^\infty S_Z(z)\,dz}{S_Z(d)}. \tag{4}$$
Above I wrote two possible expressions because one of them can be more helpful (faster to
use). Here the second one looks more attractive to me because I only need to know the
survival function $S_Z$ for a uniform RV $\mathrm{Unif}([0, u])$. Let us calculate it (but if you can find
it in the Table, use it!)
$$S_Z(z) = \Pr(Z > z) = u^{-1}\int_z^\infty I(0 < x < u)\,dx = (u - z)u^{-1} I(z \in [0, u]) + I(z < 0).$$
This together with (4) yields
$$e_Z(d) = \frac{\int_d^u (u - z)u^{-1}\,dz}{(u - d)u^{-1}} = \frac{u(u - d) - (1/2)(u^2 - d^2)}{u - d} = u - (1/2)(u + d) = (1/2)(u - d).$$
Using this expression to calculate the two excess loss functions in (3) yields (note that here
$d = 30$)
$$e_X(30) = (1/2)[100 - 30] = 35,$$
$$e_Y(30) = (1/2)[w - 30] = w/2 - 15.$$
Plug in (3) and get
$$35 = w/2 - 15 + 4. \tag{5}$$
We get $w = 92$.
Remark: Now is the time to check the correctness of the answer. Can $w$ be smaller than 100?
Does this look right to you? Note that CAS/SOA exams are multiple-choice exams, so your
understanding of a topic can help and drastically reduce the time needed to solve a problem. Here it
is clear that $w$ must be larger than 100 because $e_Y(30) > e_X(30)$! My mistake was in (5),
where I plugged in the numbers incorrectly. A correct step is: $w/2 - 15 = 35 + 4$, which gives
me $w = 108$.
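The sanity check in the remark is easy to automate. The sketch below uses the formula $e_Z(d) = (u - d)/2$ derived for $\mathrm{Unif}([0, u])$ and solves relation (3) for $w$; the helper names `mean_excess_uniform` and `solve_w` are hypothetical.

```python
def mean_excess_uniform(u: float, d: float) -> float:
    """Mean excess loss e(d) = (u - d)/2 for Unif([0, u]), valid for 0 <= d < u."""
    if not 0 <= d < u:
        raise ValueError("need 0 <= d < u")
    return (u - d) / 2

def solve_w(d: float, extra: float, u_x: float) -> float:
    """Solve (w - d)/2 = e_X(d) + extra for w, where X ~ Unif([0, u_x])."""
    return d + 2 * (mean_excess_uniform(u_x, d) + extra)

w = solve_w(d=30, extra=4, u_x=100)
print(w, mean_excess_uniform(w, 30) - mean_excess_uniform(100, 30))
```

Because $e(d)$ increases with the upper endpoint $u$, any answer with $w < 100$ can be rejected without further computation.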
4. Problem 3.7. Given: $f_X(x) = \lambda^{-1} e^{-x/\lambda} I(x > 0)$. [This is an exponential RV with
mean $\lambda$.] Find the mean excess loss function $e_X(d)$ at $d = \lambda$.
Solution: Using (4) we get
$$e_X(d) = \frac{\int_d^\infty S_X(u)\,du}{S_X(d)}.$$
For the exponential RV the survival function is, for $x \ge 0$,
$$S(x) = \int_x^\infty f_X(u)\,du = \int_x^\infty \lambda^{-1} e^{-u/\lambda}\,du = -e^{-u/\lambda}\Big|_x^\infty = e^{-x/\lambda}.$$
Please check that $S_X$ has the properties of a survival function.
Then
$$e_X(d) = \frac{\int_d^\infty e^{-u/\lambda}\,du}{e^{-d/\lambda}} = \frac{-\lambda e^{-u/\lambda}\Big|_d^\infty}{e^{-d/\lambda}} = \frac{\lambda e^{-d/\lambda}}{e^{-d/\lambda}} = \lambda.$$
What we see is the famous memoryless property of the exponential RV.
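To see the memoryless property numerically, the following sketch approximates $\int_d^\infty S(u)\,du / S(d)$ with a midpoint rule for several values of $d$. The value $\lambda = 3$, the truncation point $d + 50\lambda$, and the function name are assumptions made only for illustration.

```python
import math

def mean_excess_numeric(survival, d, upper, steps=100_000):
    """Approximate int_d^upper S(u) du / S(d) with the midpoint rule."""
    h = (upper - d) / steps
    total = sum(survival(d + (i + 0.5) * h) for i in range(steps))
    return total * h / survival(d)

lam = 3.0  # illustrative mean of the exponential RV

def S(x):
    """Survival function of the exponential RV with mean lam, for x >= 0."""
    return math.exp(-x / lam)

# The tail beyond d + 50*lam is of order e^{-50} relative to S(d), so it is ignored.
for d in (0.0, lam, 5 * lam):
    print(d, round(mean_excess_numeric(S, d, d + 50 * lam), 4))
```

Each printed value should be close to $\lambda$ regardless of $d$, which is what the closed-form result says.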
5. Problem 3.10. (a) Wrong. Empirical distribution function is discontinuous (it corresponds to a discrete random variable) and then the mean excess loss function is also
discontinuous.
(b). Correct, proved earlier.
(c). Wrong. Using Table A, p.671, the survival function is $S_X(u) = [\theta/(\theta + u)]^{\alpha} I(u > 0)$,
$\alpha > 0$. Thus the mean excess loss function is
$$e_X(d) = \frac{\int_d^\infty S_X(u)\,du}{S_X(d)} = \frac{\theta^{\alpha} \int_d^\infty (u + \theta)^{-\alpha}\,du}{[\theta/(d + \theta)]^{\alpha}} = (d + \theta)^{\alpha} \int_d^\infty (u + \theta)^{-\alpha}\,du.$$
Remark: Another way to quickly check a possible answer is to use the formula
$$e_X(d) = \frac{E(X) - E(X \wedge d)}{S_X(d)}$$
and then use the Table.
Note that the integral converges only if $\alpha > 1$, and then
$$e_X(d) = (d + \theta)^{\alpha} (\alpha - 1)^{-1} (d + \theta)^{-\alpha + 1} = (d + \theta)/(\alpha - 1), \tag{6}$$
so it is always increasing in $d$.
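The two routes to the Pareto mean excess loss can be compared numerically. The sketch below assumes the standard two-parameter Pareto table entries $E(X) = \theta/(\alpha - 1)$ and $E(X \wedge d) = \frac{\theta}{\alpha - 1}\left[1 - (\theta/(d + \theta))^{\alpha - 1}\right]$ for $\alpha > 1$; the parameter values $\alpha = 3$, $\theta = 10$ and the function names are illustrative only.

```python
def pareto_mean_excess_table(alpha: float, theta: float, d: float) -> float:
    """(E[X] - E[X ^ d]) / S(d) for the two-parameter Pareto, alpha > 1."""
    mean = theta / (alpha - 1)
    ratio = theta / (d + theta)
    limited_mean = mean * (1 - ratio ** (alpha - 1))
    return (mean - limited_mean) / ratio ** alpha

def pareto_mean_excess_closed(alpha: float, theta: float, d: float) -> float:
    """Closed form (d + theta) / (alpha - 1) from equation (6)."""
    return (d + theta) / (alpha - 1)

for d in (0.0, 5.0, 50.0):
    print(d, pareto_mean_excess_table(3.0, 10.0, d), pareto_mean_excess_closed(3.0, 10.0, d))
```

The second column grows linearly in $d$, in line with the conclusion that the mean excess loss is increasing.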
6. Problem 3.11. Remember that in Problem 3.10 I explained that the mean excess loss
function (and the mean) for Pareto exists only if α > 1.
7. Problem 3.13. Given $f_X(x) = 2.5x^{-3.5} I(x \ge 1)$. Find the coefficient of variation $\eta$.
Solution: By its definition
$$\eta = \frac{[\mathrm{Var}(X)]^{1/2}}{E(X)} = \frac{[E(X^2) - \mu^2]^{1/2}}{\mu}. \tag{7}$$
Now we calculate: $E(X) = \int_1^\infty (2.5)\,x\,x^{-3.5}\,dx = 5/3$ and $E(X^2) = \int_1^\infty (2.5)\,x^2 x^{-3.5}\,dx = 5$.
Plug in (7) and get
$$\eta = [5 - 25/9]^{1/2}/[5/3] \approx 0.9.$$
Remark: You may notice that $X$ is a single-parameter Pareto (see the Table, A.4.1.4)
with $\alpha = 2.5$ and fixed (set in advance) $\theta = 1$. Note that for this Pareto the support is
$x > \theta$. Do not confuse it with the other, two-parameter Pareto (Pareto Type II) described in
A.2.3. There both $\alpha$ and $\theta$ are parameters, and the difference is that the support is $x > 0$! So
be cautious with Pareto as well as with other distributions: carefully figure out
which one is related to your problem. By the way, do you think that $Y = X - \theta$, where $X$
is the single-parameter and $Y$ is the two-parameter Pareto? In any case, because the Table is
given, you can use it to save some time.
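For the table route in Problem 3.13, the raw moments of the single-parameter Pareto are $E(X^k) = \alpha\theta^k/(\alpha - k)$ for $k < \alpha$. A minimal sketch with $\alpha = 2.5$ and $\theta = 1$ (the density exponent is $-3.5 = -(\alpha + 1)$) follows; the function names are hypothetical.

```python
import math

def sp_pareto_raw_moment(alpha: float, theta: float, k: int) -> float:
    """E(X^k) = alpha * theta^k / (alpha - k) for the single-parameter Pareto."""
    if k >= alpha:
        raise ValueError("moment exists only for k < alpha")
    return alpha * theta**k / (alpha - k)

def coefficient_of_variation(alpha: float, theta: float) -> float:
    """Standard deviation divided by the mean; needs alpha > 2."""
    m1 = sp_pareto_raw_moment(alpha, theta, 1)
    m2 = sp_pareto_raw_moment(alpha, theta, 2)
    return math.sqrt(m2 - m1**2) / m1

cv = coefficient_of_variation(2.5, 1.0)
print(round(sp_pareto_raw_moment(2.5, 1.0, 1), 4), sp_pareto_raw_moment(2.5, 1.0, 2), round(cv, 2))
```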
8. Problem 3.16. Given: The empirical cdf $\hat{F}(x)$ has jumps of 0.2 at $x = 400$, 0.7 at $x = 800$,
and 0.1 at $x = 1600$. Find the corresponding empirical skewness.
Solution: Remember that skewness is $\gamma_1 = E(X - \mu)^3/\sigma^3$. For the empirical one we use the
empirical cdf. We need to calculate three moments. We do this via the corresponding
empirical probability mass function $\hat{p}(x)$, which is equal to the jumps of the empirical cdf; that
is, $\hat{p}(x)$ is equal to 0.2, 0.7, and 0.1 at $x$ equal to 400, 800, and 1,600. Now we calculate:
$$\hat{E}(X) = \hat{\mu} = \sum_x x\,\hat{p}(x) = (.2)(400) + (.7)(800) + (.1)(1600) = 80 + 560 + 160 = 800.$$
Further,
$$\hat{E}(X - \hat{\mu})^2 = (.2)(400 - 800)^2 + (.7)(800 - 800)^2 + (.1)(1600 - 800)^2 = 96{,}000.$$
Further,
$$\hat{E}(X - \hat{\mu})^3 = (.2)(400 - 800)^3 + (.7)(800 - 800)^3 + (.1)(1600 - 800)^3 = 38{,}400{,}000.$$
Plug in the numbers and get
$$\gamma_1 = 38{,}400{,}000/(96{,}000)^{3/2} = 1.29.$$
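The three empirical moments in Problem 3.16 are weighted sums over the jump points, so the whole calculation fits in a short Python sketch. The function name `empirical_skewness` is a label chosen for this illustration.

```python
def empirical_skewness(values, probs):
    """Skewness of a discrete distribution given its support points and masses."""
    mean = sum(p * x for x, p in zip(values, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    third = sum(p * (x - mean) ** 3 for x, p in zip(values, probs))
    return third / variance ** 1.5

# Jump points and jump sizes of the empirical cdf from the problem statement.
gamma1_hat = empirical_skewness([400, 800, 1600], [0.2, 0.7, 0.1])
print(round(gamma1_hat, 2))
```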
9. Problem 3.17. Given: cdf is $F(x) = (1 - x^{-2}) I(x \ge 1)$. Find: mean, median, mode.
Solution: (a) For a continuous RV it is typically easier to work with its density
$$f(x) := dF(x)/dx = 2x^{-3} I(x \ge 1). \tag{8}$$
Then
$$\mu = E(X) = \int_1^\infty 2x\,x^{-3}\,dx = 2\int_1^\infty x^{-2}\,dx = -2x^{-1}\Big|_1^\infty = 2.$$
Remark: You also can calculate expectations via the cdf/survival function like this:
$$E(X) = \int_1^\infty x f(x)\,dx = -\int_1^\infty x\,dS(x) \quad \text{[use integration by parts]}$$
$$= -xS(x)\Big|_1^\infty + \int_1^\infty S(x)\,dx = 1 + \int_1^\infty S(x)\,dx.$$
Since $S(x) = x^{-2}$ for $x \ge 1$ we get
$$E(X) = 1 + \int_1^\infty x^{-2}\,dx = 1 - x^{-1}\Big|_1^\infty = 2.$$
Use whatever approach is faster/easier for you. Of course, the Table can be the fastest
solution if you recognize your distribution; but you must be able to do this calculation on your
own without a table.
(b) The median $m$ for $X$ is a solution of $F_X(m) = .5$. Here $1 - m^{-2} = .5$ yields $m = 2^{1/2}$.
(c) From (8) we get that the mode is 1 because this is the value of $x$ where the density
is largest.
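The survival-function route to the mean and the inverse-cdf route to the median can both be checked with a short sketch. The substitution $x = 1/t$, which maps the infinite tail onto $(0, 1)$, is a numerical convenience introduced here and is not part of the solution above; all function names are hypothetical.

```python
import math

def survival(x: float) -> float:
    """S(x) = x^{-2} for x >= 1 and 1 otherwise."""
    return x ** -2 if x >= 1 else 1.0

def density(x: float) -> float:
    """f(x) = 2 x^{-3} for x >= 1 and 0 otherwise."""
    return 2 * x ** -3 if x >= 1 else 0.0

def mean_from_survival(steps: int = 10_000) -> float:
    """E(X) = 1 + int_1^inf S(x) dx, with the tail computed after x = 1/t."""
    h = 1.0 / steps
    # With x = 1/t the tail becomes int_0^1 S(1/t) / t^2 dt (midpoint rule).
    mids = ((i + 0.5) * h for i in range(steps))
    tail = sum(survival(1 / t) / t**2 for t in mids) * h
    return 1.0 + tail

def quantile(p: float) -> float:
    """Inverse of F(x) = 1 - x^{-2} on x >= 1."""
    return (1 - p) ** -0.5

mean = mean_from_survival()
median = quantile(0.5)
print(round(mean, 6), round(median, 4), density(1.0))
```

The density is strictly decreasing on $x \ge 1$, so its value at $x = 1$ is the largest, which agrees with the mode found in part (c).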
10. Problem 3.19. For $X \sim \mathrm{Pareto}(\alpha, \theta)$ the 10th percentile is $\theta - k$ and the 90th percentile
is $5\theta - 3k$. Find $\alpha$.
Remark: Well, which Pareto is meant here? My guess is the two-parameter one, because this is
how the authors write it: with two parameters. Unfortunately, this is the only hint given, and you
should be ready for such things. As a result, here $X > 0$, but if it were the one-parameter
Pareto then $X > \theta$. This is another hint: note that the 10th percentile is smaller than $\theta$
(well, I hope that $k$ is positive and less than $\theta$). Second remark: pay attention to this
problem, because it is very typical for Exam C.
Solution: From the Table I get $F(x) = (1 - [\theta/(x + \theta)]^{\alpha}) I(x > 0)$, so we get a system of
two equations
$$\begin{cases} 0.1 = 1 - [\theta/(2\theta - k)]^{\alpha} \\ 0.9 = 1 - [\theta/(6\theta - 3k)]^{\alpha} \end{cases}$$
When you solve a system you always go from a system to a new system. So I simplify the
two equations and get a new system
$$\begin{cases} [\theta/(2\theta - k)]^{\alpha} = 0.9 \\ 3^{-\alpha} [\theta/(2\theta - k)]^{\alpha} = 0.1 \end{cases}$$
Now I divide the first equation by the second and get $3^{\alpha} = 9$, which yields $\alpha = 2$.
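The division step in Problem 3.19 generalizes: for the two-parameter Pareto, $(1 - p_{low})/(1 - p_{high}) = r^{\alpha}$, where $r = (x_{high} + \theta)/(x_{low} + \theta)$, which equals 3 here. The sketch below solves for $\alpha$ and then verifies both percentiles for a hypothetical $\theta = 1$, with $k$ chosen so that the 10th percentile equals $\theta - k$; these concrete values and the function names are assumptions for illustration only.

```python
import math

def pareto_cdf(x: float, alpha: float, theta: float) -> float:
    """Two-parameter Pareto cdf: 1 - (theta/(x + theta))^alpha for x > 0."""
    return 1 - (theta / (x + theta)) ** alpha if x > 0 else 0.0

def alpha_from_percentiles(p_low: float, p_high: float, ratio: float) -> float:
    """Solve ((1 - p_low)/(1 - p_high)) = ratio^alpha for alpha."""
    return math.log((1 - p_low) / (1 - p_high)) / math.log(ratio)

alpha = alpha_from_percentiles(0.1, 0.9, 3.0)

# Hypothetical theta; k follows from (theta/(2*theta - k))^alpha = 0.9.
theta = 1.0
k = 2 * theta - theta * 0.9 ** (-1 / alpha)
p10 = pareto_cdf(theta - k, alpha, theta)
p90 = pareto_cdf(5 * theta - 3 * k, alpha, theta)
print(round(alpha, 6), round(p10, 6), round(p90, 6))
```

With these values $k$ is positive and smaller than $\theta$, consistent with the hope expressed in the remark.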