Download Solutions #8 - Bryn Mawr College

Math 295
Solutions #8
November 4, 2002
(formula in solution B corrected 11/18)
Problem A. (From text, question 5.2.3) A random variable $Y$ has an exponential distribution
with parameter $\lambda$ if it has a density function given by
$$f_Y(y) = \lambda e^{-\lambda y} \quad \text{for } y \ge 0.$$
The parameter $\lambda$ must satisfy $\lambda > 0$.
[Note that $\int_0^\infty f_Y(y)\,dy = 1$ as required. The mean of this distribution is
$\int_0^\infty y f_Y(y)\,dy = 1/\lambda$. The variance turns out to be $1/\lambda^2$, so the standard deviation is also $1/\lambda$.]

[This is a good model if $Y$ represents the waiting time for a random event that occurs at a rate of $\lambda$.
Thus, if raindrops are falling at $\lambda$ per minute, it makes sense that on average you have to wait $1/\lambda$ minutes
for the next one.]
[Sometimes we write the density function as $f_Y(y) = \frac{1}{\mu} e^{-\frac{1}{\mu} y}$. This is the same class of
distributions, but parameterized differently. The new parameter $\mu$ corresponds to $1/\lambda$. This is convenient
since the mean is exactly $\mu$.]
Suppose that we make four independent random draws from an exponential
random variable with distribution $f_Y(y) = \lambda e^{-\lambda y}$. The random draws turn out to be
8.2, 9.1, 10.6, and 4.9. What is a maximum-likelihood estimate for $\lambda$?
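The likelihood function, given a sequence of draws $y_1, y_2, \ldots, y_n$, is
$$L(\lambda) = \prod_{i=1}^{n} f_Y(y_i) = \prod_{i=1}^{n} \lambda e^{-\lambda y_i}.$$
Since this expression contains a lot of products and exponentials it is easier
to maximize the log-likelihood function
$$\mathcal{L}(\lambda) = \ln L(\lambda) = \sum_{i=1}^{n} \left( \ln\lambda - \lambda y_i \right) = n \ln\lambda - \lambda \left( \sum_{i=1}^{n} y_i \right).$$
We want to choose $\lambda$ to maximize this, so we solve $\mathcal{L}'(\lambda) = 0$:
$$\mathcal{L}'(\lambda) = \frac{n}{\lambda} - \left( \sum_{i=1}^{n} y_i \right) = 0 \quad \text{when} \quad \sum_{i=1}^{n} y_i = \frac{n}{\hat\lambda} \;\Longrightarrow\; \hat\lambda = \frac{n}{\sum_{i=1}^{n} y_i} = \frac{1}{\bar y}.$$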
In this case y-bar is (8.2+9.1+10.6+4.9)/4 = 8.2, so $\hat\lambda = 1/8.2 \approx 0.122$.
Note: It isn’t too surprising that $\hat\lambda = 1/\bar y$, because $\lambda$ is the inverse of
the mean $\mu$ of the distribution. We are just saying that the
maximum-likelihood estimator is the one that makes $\mu = \bar y$.
Since the method-of-moments estimator is (by definition) the
one that makes $\mu = \bar y$, it follows that in this instance, the
method-of-moments estimator is the same as the maximum-likelihood estimator.
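The calculation in Problem A can be checked numerically. The following is a minimal Python sketch, not part of the original solution; the function names `exp_log_likelihood` and `exp_mle` are made up for illustration. It evaluates the closed-form estimate $1/\bar y$ and confirms that nudging $\lambda$ in either direction lowers the log-likelihood.

```python
import math

# The four observed draws from Problem A.
draws = [8.2, 9.1, 10.6, 4.9]

def exp_log_likelihood(lam, ys):
    """Log-likelihood n*ln(lam) - lam*sum(ys) of an exponential sample."""
    return len(ys) * math.log(lam) - lam * sum(ys)

def exp_mle(ys):
    """Closed-form maximizer: n / sum(ys), the reciprocal of the sample mean."""
    return len(ys) / sum(ys)

lam_hat = exp_mle(draws)

# Sanity check: moving lambda 1% either way gives a smaller log-likelihood.
best = exp_log_likelihood(lam_hat, draws)
assert best > exp_log_likelihood(0.99 * lam_hat, draws)
assert best > exp_log_likelihood(1.01 * lam_hat, draws)

print(round(lam_hat, 3))  # prints 0.122
```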
Problem B. A random variable $Y$ has a Cauchy distribution if it has a density function given
by
$$f_Y(y) = \frac{1/\pi}{1 + (y - m)^2}.$$
The only parameter is $m$, which can be any real number. We say that the distribution is
centered at $m$.
a. Suppose we make four random draws from this distribution, and they turn out to be
-4.6, -4.5, -4.1, +1.2. What is a maximum likelihood estimator for $m$? (This will
require computation. Excel is probably sufficient. 2-digit accuracy is fine.)
The likelihood function, with these values substituted, is
$$L(m) = \frac{1/\pi}{1 + (-4.6 - m)^2} \cdot \frac{1/\pi}{1 + (-4.5 - m)^2} \cdot \frac{1/\pi}{1 + (-4.1 - m)^2} \cdot \frac{1/\pi}{1 + (1.2 - m)^2}.$$

I didn’t even try to maximize this mess analytically. I just set it up in Excel
and evaluated it for different values of $m$. The best value I found
corresponded to $m = -4.33$, so to two decimal places, we have
$\hat m = -4.33$.
Note that this is NOT the average of the observed values. The parameter
m is something vaguely like an average for the distribution, but the
best estimator of it is not the sample average---it seems to give
smaller weight to “outliers” like +1.2. Often this is what we want for
an unknown distribution.
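The solution used an Excel table for this search. The same idea can be sketched in Python as a plain grid search over the log-likelihood. The search interval $[-6, 3]$, the step of 0.001, and the helper names are assumptions chosen for illustration, not part of the original solution. On this fine a grid the maximum lands near $-4.34$, within 0.01 of the value reported above and well inside the two-digit accuracy the problem asks for, while the sample average is $-3.0$.

```python
import math

# The four observed draws from Problem B.
draws = [-4.6, -4.5, -4.1, 1.2]

def cauchy_log_likelihood(m, ys):
    """Sum of ln[(1/pi) / (1 + (y - m)^2)] over the sample."""
    return sum(-math.log(math.pi) - math.log(1.0 + (y - m) ** 2) for y in ys)

def grid_argmax(f, lo, hi, steps):
    """Return the grid point in [lo, hi] (steps equal intervals) where f is largest."""
    best_x, best_val = lo, f(lo)
    for i in range(1, steps + 1):
        x = lo + (hi - lo) * i / steps
        val = f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

# Assumed search window and resolution: [-6, 3] in steps of 0.001.
m_hat = grid_argmax(lambda m: cauchy_log_likelihood(m, draws), -6.0, 3.0, 9000)
sample_mean = sum(draws) / len(draws)

print(round(m_hat, 2), round(sample_mean, 2))
```

A grid search only finds the best point on the grid, so the window has to contain the maximum; here the draws themselves suggest where to look.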
b. (Extra credit for brave souls only) Try computing $E(Y)$ when $m = 0$.
Try the definition:
$$E(Y) = \int_{y=-\infty}^{\infty} y f_Y(y)\,dy = \frac{1}{\pi} \int_{y=-\infty}^{\infty} \frac{y}{1 + y^2}\,dy.$$
Now, there are a lot of reasons for wanting that integral to exist and be
zero. But, it doesn’t exist. Recall that an integral that is improper at
both ends exists only if the corresponding improper-at-one-end-only
integrals exist. In this case, that means that we must evaluate the
integral by
$$\int_{y=-\infty}^{\infty} \frac{y}{1 + y^2}\,dy = \int_{y=-\infty}^{0} \frac{y}{1 + y^2}\,dy + \int_{y=0}^{\infty} \frac{y}{1 + y^2}\,dy = \int_{z=\infty}^{1} \frac{1}{2} \cdot \frac{1}{z}\,dz + \int_{z=1}^{\infty} \frac{1}{2} \cdot \frac{1}{z}\,dz \quad (\text{substituting } z = 1 + y^2)$$
and neither of those integrals exists.
So, this is an example of a random variable without an expected value. (The
Cauchy distribution is mostly useful as a bad example.)
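To see concretely why the integral fails to exist, one can truncate it and watch what happens as the limits grow. The sketch below is an illustration, not part of the original solution. It uses the antiderivative $\tfrac{1}{2}\ln(1+y^2)$ of $y/(1+y^2)$; the function name and the choice of cutoffs are assumptions. The one-sided piece grows without bound, and the two-sided value depends on how the two limits are taken: symmetric limits give 0, while limits $-T$ and $2T$ tend to $\ln 2$.

```python
import math

def truncated_integral(lower, upper):
    """Integral of y / (1 + y^2) from lower to upper, via 0.5 * ln(1 + y^2)."""
    return 0.5 * (math.log1p(upper ** 2) - math.log1p(lower ** 2))

for t in (1e1, 1e3, 1e6):
    one_sided = truncated_integral(0.0, t)      # keeps growing, roughly like ln(t)
    symmetric = truncated_integral(-t, t)       # exactly 0 for every t
    lopsided = truncated_integral(-t, 2.0 * t)  # approaches ln(2), not 0
    print(t, one_sided, symmetric, lopsided)
```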
Problem C. (based on text, question 5.2.12) Suppose $X$ has the density function
$$f_X(x) = \theta k^{\theta} (1/x)^{\theta+1} \quad \text{when } x \ge k.$$
Assume that $k$ is a known constant, but that we don’t know $\theta$ except that $\theta \ge 1$.
Suppose we get 25 random draws from this distribution, say $x_1, x_2, \ldots, x_{25}$.
a. Write a formula for the likelihood function $L(\theta)$ in terms of $x_1, x_2, \ldots, x_{25}$ (and $k$).
Just write:
$$L(\theta) = \prod_{i=1}^{25} f_X(x_i) = \prod_{i=1}^{25} \theta k^{\theta} (1/x_i)^{\theta+1}.$$
This simplifies a bit:
$$L(\theta) = \theta^{25} k^{25\theta} \left( \frac{1}{\prod_{i=1}^{25} x_i} \right)^{\theta+1}.$$
b. Try maximizing $L(\theta)$ to obtain a maximum likelihood estimator for $\theta$ in terms of
$x_1, x_2, \ldots, x_{25}$ (and $k$). Find a formula for the value of $\theta$ that maximizes $L(\theta)$ if
you can; otherwise, get as far as you can.
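Let’s write $P$ for $\prod_{i=1}^{25} x_i$. Now:
$$L(\theta) = \theta^{25} k^{25\theta} \left( \frac{1}{P} \right)^{\theta+1}$$
so
$$\mathcal{L}(\theta) = \ln L(\theta) = 25 \ln\theta + \theta \ln(k^{25}) - (\theta + 1) \ln P = 25 \ln\theta + \theta (\ln k^{25} - \ln P) - \ln P$$
$$\mathcal{L}'(\theta) = 25/\theta + (\ln k^{25} - \ln P).$$
This is zero for $\theta = \hat\theta$, so
$$\hat\theta = \frac{25}{\ln P - \ln k^{25}}.$$
That’s the maximum likelihood estimator.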
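With some extra work, we can put this in a simpler form:
$$\hat\theta = \frac{1}{\ln\left( \text{geometric mean of the numbers } \{x_i / k\}_{i=1..25} \right)}.$$
(We were given at the outset that $\theta \ge 1$, and we haven’t enforced that.
The last form of the estimator makes it clear that if the observed
values are really greater than $k$, then the estimator is automatically
positive, but it is at least 1 only when that geometric mean is no larger than $e$.
If the formula gives a value below 1, the likelihood restricted to $\theta \ge 1$ is largest at $\theta = 1$.)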
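The two forms of the maximum likelihood estimator can be compared numerically. This is a rough sketch under stated assumptions: the value of `k` and the five data points are hypothetical (the problem gives no numbers and has 25 draws), and the function names are invented for illustration. The sum of logs replaces $\ln P - \ln k^{25}$, which is the same quantity written term by term.

```python
import math

def pareto_mle(xs, k):
    """n / sum(ln(x_i / k)), i.e. n / (ln P - ln k^n) with P the product of the x_i."""
    return len(xs) / sum(math.log(x / k) for x in xs)

def geometric_mean(values):
    """exp of the average log; assumes every value is positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical known constant and hypothetical draws, all greater than k.
k = 2.0
xs = [2.4, 3.1, 2.2, 5.0, 2.9]

theta_hat = pareto_mle(xs, k)
gm = geometric_mean([x / k for x in xs])

# The "simpler form": one over the log of the geometric mean of x_i / k.
assert math.isclose(theta_hat, 1.0 / math.log(gm))

# Enforcing theta >= 1: the log-likelihood is concave, so clamp at 1.
theta_constrained = max(1.0, theta_hat)
print(theta_hat, theta_constrained)
```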
c. (Text, exercise 5.2.19) Find a formula for a method-of-moments estimator for $\theta$.
(Since there is only one unknown parameter, this just means choosing $\theta$ so that
the mean of the distribution matches the mean of the observed values.)
First we need to compute the theoretical mean of this distribution, as
a function of $\theta$.
$$\mu = \int_{x=k}^{\infty} x \, \theta k^{\theta} \left( \frac{1}{x} \right)^{\theta+1} dx = k \left( \frac{\theta}{\theta - 1} \right).$$
That result only makes sense if $\theta > 1$. (Did you notice when $\theta > 1$ was
used in the evaluation of the integral?)


Now the method-of-moments estimator is the one that makes x-bar,
the mean of the observations, equal to $\mu$. We compute:
$$\bar x = k \left( \frac{\hat\theta}{\hat\theta - 1} \right) \;\Longrightarrow\; \hat\theta = \frac{\bar x}{\bar x - k}.$$
So the method-of-moments estimator is different (sometimes very
different) from the maximum likelihood estimator.
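As a check on the algebra, the sketch below computes the method-of-moments estimate and then plugs it back into the formula for the theoretical mean, which should return x-bar. The value of `k`, the data, and the function names are hypothetical, chosen only to illustrate the formula.

```python
def pareto_mean(theta, k):
    """Theoretical mean k * theta / (theta - 1); only valid for theta > 1."""
    return k * theta / (theta - 1.0)

def pareto_mom(xs, k):
    """Method-of-moments estimate x_bar / (x_bar - k)."""
    x_bar = sum(xs) / len(xs)
    if x_bar <= k:
        raise ValueError("the sample mean must exceed k")
    return x_bar / (x_bar - k)

# Hypothetical known constant and hypothetical draws, all greater than k.
k = 2.0
xs = [2.4, 3.1, 2.2, 5.0, 2.9]

theta_mom = pareto_mom(xs, k)
x_bar = sum(xs) / len(xs)

# By construction, the fitted distribution's mean matches the sample mean.
assert abs(pareto_mean(theta_mom, k) - x_bar) < 1e-9
print(theta_mom)
```

For these made-up numbers the estimate is about 2.79, whereas the maximum likelihood formula $25/(\ln P - \ln k^{25})$, adapted to five draws, gives roughly 2.50.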
Problem D. Assume that $b > a$. A random variable $X$ has a uniform distribution with limits $a$
and $b$ if it has a density function given by
$$f_X(x) = 1/(b-a) \quad \text{when } a \le x \le b.$$
This distribution has mean $(a+b)/2$ and standard deviation $(b-a)/\sqrt{12}$ (which means
that the variance is $(b-a)^2/12$). Suppose we make random draws $x_1, x_2, x_3, x_4$ from this
distribution. Assume that these have been rearranged so that $x_1 \le x_2 \le x_3 \le x_4$.
Let’s verify that variance calculation. We’ll accept that $E(X) = (a+b)/2$. Now:
$$E(X^2) = \int_{x=a}^{b} x^2 \, \frac{1}{b-a}\,dx = \frac{1}{b-a} \int_{x=a}^{b} x^2\,dx = \frac{1}{b-a} \left[ \frac{1}{3} x^3 \right]_{x=a}^{b} = \frac{1}{3} \cdot \frac{b^3 - a^3}{b - a} = \frac{1}{3} \left( b^2 + ab + a^2 \right)$$
so
$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = \frac{1}{3} \left( b^2 + ab + a^2 \right) - \left( \frac{a+b}{2} \right)^2 = \frac{1}{12} (b - a)^2.$$
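The last step of that algebra is easy to get wrong by hand, so here is a small exact-arithmetic check, offered as a sketch rather than a proof. It evaluates both sides of the identity with Python's `fractions.Fraction` at a few arbitrarily chosen pairs $(a, b)$; the function name and the test pairs are assumptions for illustration.

```python
from fractions import Fraction

def uniform_variance_two_ways(a, b):
    """Return (E(X^2) - E(X)^2, (b - a)^2 / 12) as exact fractions."""
    a, b = Fraction(a), Fraction(b)
    second_moment = (b ** 2 + a * b + a ** 2) / 3  # E(X^2) from the integral
    mean = (a + b) / 2                             # E(X), accepted as given
    return second_moment - mean ** 2, (b - a) ** 2 / 12

# Arbitrary test pairs with b > a, including negative and fractional limits.
for a, b in [(0, 1), (-2, 5), (Fraction(1, 3), Fraction(7, 2))]:
    lhs, rhs = uniform_variance_two_ways(a, b)
    assert lhs == rhs
    print(a, b, lhs)
```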

a. Find maximum likelihood estimators for a and b (in terms of the xi’s).
The likelihood function is
$$L(a, b) = \prod_{i=1}^{n} f_X(x_i; a, b) = \prod_{i=1}^{n} \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x_i \le b \\ 0 & \text{otherwise.} \end{cases}$$
This means that if ANY of the $x_i$’s is outside the range $[a,b]$, then
$L(a,b) = 0$. Otherwise, $L(a,b) = (1/(b-a))^n$. It’s pretty clear that the
way to maximize this is to choose the interval $[a,b]$ as small as
possible, without excluding any of the observations. Therefore:
$$\hat a = x_1, \qquad \hat b = x_4$$
is the maximum-likelihood estimator.
b. Find method-of-moments estimators for a and b.
This requires finding the values of a and b for which the theoretical
mean and variance of the distribution match the mean and variance of
the observed data. Write M and V for the mean and variance of the
observed data; that is,
$M = (x_1 + \cdots + x_4) / 4$
$V = ((x_1^2 + \cdots + x_4^2)/4) - M^2.$
Then choose $a$ and $b$ so that
$(a+b)/2 = M$
and
$(b-a)^2/12 = V.$
The solution to this system is
$\hat a = M - \sqrt{3V}$
$\hat b = M + \sqrt{3V}.$
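This solution seems pretty sensible (it says, place the boundaries
about 1.7 sample-standard-deviations above and below the sample
mean), but in general it does NOT guarantee that all of the data points are in
the interval $[\hat a, \hat b]$. With only four draws no point can lie more than
$\sqrt{3}$ sample-standard-deviations from the sample mean, so here the data do fit;
with five or more draws they need not.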
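The two pairs of estimators in Problem D can be put side by side in a few lines of Python. This is an illustrative sketch only: the function names and the data are hypothetical. The sample deliberately has five draws, because with exactly four draws every point is within $\sqrt{3}$ sample-standard-deviations of the mean, so a point falling outside the method-of-moments interval can only show up with a larger sample.

```python
import math

def uniform_mle(xs):
    """Maximum-likelihood estimates: the smallest and largest observations."""
    return min(xs), max(xs)

def uniform_mom(xs):
    """Method-of-moments estimates M -/+ sqrt(3V), with V computed using divisor n."""
    n = len(xs)
    m = sum(xs) / n
    v = max(sum(x * x for x in xs) / n - m * m, 0.0)  # guard against rounding below 0
    half_width = math.sqrt(3.0 * v)
    return m - half_width, m + half_width

# Hypothetical sample of five draws with one large value.
xs = [0.1, 0.2, 0.2, 0.3, 2.0]

a_mle, b_mle = uniform_mle(xs)
a_mom, b_mom = uniform_mom(xs)

# Observations that the method-of-moments interval fails to cover.
outside = [x for x in xs if not (a_mom <= x <= b_mom)]
print((a_mle, b_mle), (a_mom, b_mom), outside)
```

For this sample the method-of-moments upper limit comes out near 1.81, so the observation 2.0 is left outside, while the maximum-likelihood interval covers every point by construction.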