Economics 375: Introduction to Econometrics
Homework #1
This homework is due on January 14th.
1. Consider a random variable X that comes from a normal distribution with mean μ and
variance σ². Think of four different ways to estimate the unobservable mean μ:
(a) $\hat{X}_a = \frac{5X_1 + 5X_2}{12}$
(b) $\hat{X}_b = \frac{X_1}{10} + \frac{X_2}{5} + \frac{7X_3}{10}$
(c) $\hat{X}_c = \mathrm{Median}(X_1, X_2, X_3)$
(d) $\hat{X}_d = \frac{4X_1}{10} + \frac{3X_2}{10} + \frac{3X_3}{10}$
a. Which, if any, estimators are unbiased?
Estimators (b) and (d) are unbiased. Estimator (c) is unbiased in this situation (a normal
distribution) because a normal distribution is symmetrical around the mean (and hence the median
will, on average, equal the mean). In general, the median is not an unbiased estimator of the mean
when the distribution is asymmetric. Estimator (a) is a biased estimator because, on average, it
will produce estimates of μ that are too small.
A simple way to answer most of these types of questions is to insert the population average for
each random draw (i.e., let X1 equal μ) and then see if the estimator equals the population
average. For instance, using estimator (a), we see that $\frac{5\mu + 5\mu}{12} = \frac{5}{6}\mu$
so, on average, estimator (a) would produce an estimate that is only five-sixths the size of the
actual population mean. Notice that for estimators (b) and (d) this technique produces 1μ and
hence each is unbiased.
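Written out with expectations, the substitution looks like the sketch below. It uses only the fact that each draw has $E[X_i] = \mu$; the subscripts a, b, and d are labels taken from the list letters, a notational assumption made here for readability.

$$
\begin{aligned}
E[\hat{X}_a] &= \tfrac{5}{12}\mu + \tfrac{5}{12}\mu = \tfrac{5}{6}\mu \\
E[\hat{X}_b] &= \left(\tfrac{1}{10} + \tfrac{1}{5} + \tfrac{7}{10}\right)\mu = \mu \\
E[\hat{X}_d] &= \left(\tfrac{4}{10} + \tfrac{3}{10} + \tfrac{3}{10}\right)\mu = \mu
\end{aligned}
$$

The shortcut works because these three estimators are weighted sums of the draws. The median in (c) is not a weighted sum, which is why its unbiasedness rests on the symmetry argument instead.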
b. Between estimators (b) and (d), which estimator is more efficient?
Let me state at the outset, this is somewhat of a difficult problem to prove. Please don’t get
discouraged if you tried to prove it and were not successful. I think this is one of the more
difficult algebra problems you will encounter in my class. Later, I’ll show you how I solved it
but for now, let’s try to think of the intuition to arrive at the correct answer.
Let’s imagine you draw three random observations from the distribution with mean μ and
variance σ², the first two of which are quite close to μ and the third just happens to be a long
way away (here, we are not defining “close” and “long way away”). Think of what happens when
you use estimator (b) to estimate the mean: because the third observation is a long way away from
the mean and estimator (b) gives most of the weight (seven-tenths) to this third observation, the
estimate will be “pulled” quite a bit towards the value of the third observation. This means that
the value of estimator (b) will tend to be close to the third observation. Now, because the third
observation may be either above or below the mean, this means that if we repeatedly compute
estimator (b) with multiple different draws of the Xs, then it will tend to take values far from
the mean; in other words, estimator (b) will tend to have a large variance. This is less likely to
happen with estimator (d) because no particular observation will have an overwhelming amount of
weight that will allow outliers to pull it away from the mean. In our terms, estimator (d) will be
more efficient than estimator (b).
Again, before proving this, consider the following Monte Carlo experiment (we’ll talk more
about Monte Carlo experiments later; for now, just think of them as big computer simulations).
Imagine I ask the computer to draw three random, normally distributed numbers, compute
both estimators (b) and (d), and save their values. Then, repeat this process, using different
random draws, many times (say, 10,000). If I present histograms of all 10,000 values of each
estimator, then the one that is least efficient should have more values further from the mean (or,
better said, should have a larger variance).
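The experiment described above can be sketched in a few lines of Python. This is a minimal illustration, not the program used for the course: the function name `simulate`, the seed, and the choice of a standard normal distribution are assumptions made here, and the two formulas are the weighted sums given in (b) and (d) of the problem statement.

```python
import random
import statistics


def simulate(replications=10_000, mu=0.0, sigma=1.0, seed=375):
    """Draw three normal observations per replication and record both estimates.

    The seed and default parameters are arbitrary choices for this sketch.
    """
    rng = random.Random(seed)
    est_b, est_d = [], []
    for _ in range(replications):
        x1, x2, x3 = (rng.gauss(mu, sigma) for _ in range(3))
        est_b.append(x1 / 10 + x2 / 5 + 7 * x3 / 10)   # estimator (b)
        est_d.append((4 * x1 + 3 * x2 + 3 * x3) / 10)  # estimator (d)
    return est_b, est_d


est_b, est_d = simulate()
var_b = statistics.pvariance(est_b)
var_d = statistics.pvariance(est_d)

print(f"estimator (b): mean = {statistics.fmean(est_b):.3f}, variance = {var_b:.3f}")
print(f"estimator (d): mean = {statistics.fmean(est_d):.3f}, variance = {var_d:.3f}")
print(f"variance of (d) relative to (b) = {var_d / var_b:.2f}")
```

Both sample means should land near the true mean of zero, while the variance of the estimates from (b) should come out noticeably larger than the variance of those from (d).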
[Figure: histograms of the 10,000 simulated values of estimators (b) and (d)]
Notice that this is exactly what we get. Ten thousand replications (using a standard normal
distribution) give a variance for estimator (d) that is about 60% of that of estimator (b). Now,
for the proof.
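When one generates three observations, the variance equation is [equation not preserved in this
transcript]. However, in this case, we weren’t asked to compute x-bar (the traditional mean of x),
but instead estimators (b) and (d), so our two variance equations will be:
(1) [variance equation using estimator (b); not preserved]
(2) [variance equation using estimator (d); not preserved]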
Expanding (1) by inserting the definition of estimator (b),
$\frac{X_1}{10} + \frac{X_2}{5} + \frac{7X_3}{10}$, at each of its three occurrences:
[expanded equation not preserved in this transcript]
Now comes some algebra work where we simplify this expression so it contains as few X1, X2,
and X3s as possible:
[algebra steps not preserved in this transcript]
(3) [simplified expression for estimator (b); not preserved]
Doing the same for (2) gives:
[algebra steps not preserved in this transcript]
(4) [simplified expression for estimator (d); not preserved]
Concentrate on the first three terms of this expression (the ones associated with the squared X’s).
In all three cases, those in expression (4) (associated with estimator (d)) are smaller than those
of (3). This leads us to believe that usually (4) will be smaller than (3) but, if you look at the
last three terms, this does not always need to be the case: one could probably pick an X1, X2 and
X3 in such a way to get (3) smaller than (4). However, remember the concept of efficiency, like
that of unbiasedness, is an “average” one. In other words, an efficient estimator on average gives
a smaller variance.
To see the conclusion of this proof, we need one mathematical concept that is not always
introduced in the prerequisites for this class. I introduce it and show the remainder of this
proof in this video.
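The ranking can also be checked directly with a short calculation. This is a sketch that assumes the three draws are independent with common variance $\sigma^2$; it is a separate shortcut and not necessarily the argument given in the video. For a weighted sum with fixed weights $w_i$, $\mathrm{Var}\left(\sum_i w_i X_i\right) = \sigma^2 \sum_i w_i^2$, so with the weights from (b) and (d) in the problem statement (subscripts b and d are labels taken from the list letters):

$$
\begin{aligned}
\mathrm{Var}(\hat{X}_b) &= \left(\tfrac{1}{100} + \tfrac{4}{100} + \tfrac{49}{100}\right)\sigma^2 = 0.54\,\sigma^2 \\
\mathrm{Var}(\hat{X}_d) &= \left(\tfrac{16}{100} + \tfrac{9}{100} + \tfrac{9}{100}\right)\sigma^2 = 0.34\,\sigma^2 \\
\frac{\mathrm{Var}(\hat{X}_d)}{\mathrm{Var}(\hat{X}_b)} &= \frac{0.34}{0.54} \approx 0.63
\end{aligned}
$$

This agrees with the simulation, where the variance of (d) came out at about 60% of the variance of (b).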
c. Can you think of a time where you would rather use an efficient, biased estimator rather than
an unbiased, inefficient estimator?
In some cases, it is better to be a little wrong and be very precise. Consider a field goal kicker.
Would you rather have an accurate one that is imprecise (that is, sometimes missing to the left
and sometimes to the right but on average making it through the goal) or a precise and inaccurate
one who always squeaks the ball within the left goal post?
2. Consider a random variable X that comes from a fair, four-sided die (1 with probability
.25, 2 with probability .25, 3 with probability .25, and 4 with probability .25).
a. What is the population mean of Y where Y = (X1 + X2)/2?
The mean of Y is the sum of the means of X1 and X2, divided by 2. The mean of both X1 and X2
is 2.5, so the mean of Y is (2.5 + 2.5)/2 = 2.5.
b. Consider two methods of estimating the population mean of X: the first,
$\frac{X_1}{3} + \frac{2X_2}{3}$, and the second, $\frac{3X_1}{5} + \frac{2X_2}{5}$. Which method
of estimating the population mean is unbiased? Which estimator provides estimates with the
smallest variance?
Both estimators are unbiased so each produces the correct mean, on average.
To determine variance, consider the following spreadsheet:
Die 1   Die 2   X1/3 + 2X2/3   3X1/5 + 2X2/5
1       1       1              1
2       1       1.333333       1.6
3       1       1.666667       2.2
4       1       2              2.8
1       2       1.666667       1.4
2       2       2              2
3       2       2.333333       2.6
4       2       2.666667       3.2
1       3       2.333333       1.8
2       3       2.666667       2.4
3       3       3              3
4       3       3.333333       3.6
1       4       3              2.2
2       4       3.333333       2.8
3       4       3.666667       3.4
4       4       4              4
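All of the possible values of the first estimator (weights 1/3 and 2/3) vary between 1 and 4, as
do those of the second (weights 3/5 and 2/5). However, the values of the second estimator seem to
vary less (notice, there are more numbers close to the true mean for the second estimator than
there are for the first). Indeed, the variance of the first estimator is .694 while the variance
of the second is .65, indicating the first estimator is less efficient than the second.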
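The spreadsheet calculation can be reproduced by enumerating the 16 equally likely outcomes of the two dice. The sketch below is one way to do that; the helper name `mean_and_variance` is an illustrative choice, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]  # fair four-sided die


def mean_and_variance(w1, w2):
    """Mean and population variance of w1*X1 + w2*X2 over all 16 outcomes."""
    values = [w1 * x1 + w2 * x2 for x1, x2 in product(faces, repeat=2)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


# First estimator: X1/3 + 2*X2/3; second estimator: 3*X1/5 + 2*X2/5.
mean_first, var_first = mean_and_variance(Fraction(1, 3), Fraction(2, 3))
mean_second, var_second = mean_and_variance(Fraction(3, 5), Fraction(2, 5))

print(f"first:  mean = {float(mean_first):.3f}, variance = {float(var_first):.3f}")
print(f"second: mean = {float(mean_second):.3f}, variance = {float(var_second):.3f}")
```

Both means equal 2.5, and the variances are 25/36 (about .694) and 13/20 (.65), matching the figures quoted above.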
3. A company services copiers. A review of its records shows that the time taken for a
service call is normally distributed with a mean of 72.5 minutes and standard deviation of 20
minutes.
a. What proportion of service calls take less than one hour?
Z = (60 – 72.5)/20 = -.625. P(Z<-.625) = .266, or 26.6% of the time (from
http://www.statpages.org/pdfs.html).
b. What proportion of service calls take more than 90 minutes?
Z = (90 – 72.5)/20 = .875. P(Z>.875) = .1908.
c. If a service call has taken an hour, what is the probability that it will take more than 90
minutes?
Z = (90 – 72.5)/20 = .875. P(Z>.875|Z>-.625) = .1908/(1-.266) = .260
d. In a random sample of four calls, what is the probability that the average length of calls is
less than one hour?
Z = (60 – 72.5)/(20/4^.5) = -1.25. P(Z<-1.25) = .10565
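The four probabilities in this problem can be checked without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative choices made here.

```python
from statistics import NormalDist

# Service-call time in minutes: mean 72.5, standard deviation 20.
service = NormalDist(mu=72.5, sigma=20)

# (a) P(X < 60)
p_under_60 = service.cdf(60)

# (b) P(X > 90)
p_over_90 = 1 - service.cdf(90)

# (c) P(X > 90 | X > 60): every call over 90 minutes is also over 60 minutes.
p_over_90_given_over_60 = p_over_90 / (1 - p_under_60)

# (d) The mean of four calls has standard deviation 20 / sqrt(4) = 10.
mean_of_four = NormalDist(mu=72.5, sigma=20 / 4 ** 0.5)
p_mean_under_60 = mean_of_four.cdf(60)

print(f"(a) {p_under_60:.3f}")
print(f"(b) {p_over_90:.4f}")
print(f"(c) {p_over_90_given_over_60:.3f}")
print(f"(d) {p_mean_under_60:.5f}")
```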
4. Following are the miles-per-gallon figures for a sample of cars of the same model tested
under identical conditions:
25 30 28 29 24 25 27 28 26 24 27 31
First, the relevant sample statistics are:
x-bar = 27
s² = 5.2727
s = 2.296
The hypothesis test being conducted is:
H0: μ = 28
HA: μ ≠ 28
For this problem, I am testing the first statement, “The consumers’ group that conducted this test
is skeptical of the manufacturer’s claim that its cars average 28 mpg,” which implies doubt about
the average itself (a two-tailed test) rather than doubt that the average is specifically greater
than or less than 28 (a one-tailed test).
My test statistic is:
t = (27 – 28)/(2.296/12^.5) = -1.509
The critical value of a t-statistic at the 99% level in a two-tailed test with 11 degrees of freedom
is 3.106. In this case, the absolute value of my test statistic is less than my critical value so
I fail to reject my null hypothesis and conclude that it is likely enough for me to observe an
average of 27 when the true mean is 28 that I will not conclude the auto company is being dishonest.
b. Find the 95% confidence interval for the mean gas mileage of all cars of this model.
The t-statistic with 11 degrees of freedom at the 95% level is 2.201. My confidence interval for
the mean gas mileage of all cars in this model is: 27 ± 2.201(2.296/12^.5) = {25.54, 28.46}
c. Interpret the confidence interval you found in (b). Can you conclude that 95% of all cars of
this model will have mpg values within this interval? Explain.
This confidence interval is the confidence interval of the unobservable true average gas mileage
of all cars in this model (what we typically call μ), not of the individual vehicles themselves. In
other words, I am 95% confident that the average gas mileage of this model is between
25.54 and 28.46 mpg. This does not tell me about any single car, which may have a gas mileage
outside of this range. To find an estimate for any single car, I simply use the estimated mean and
standard deviation. I would be 95% confident that any given car has an mpg within 27 ±
2.296 × 2.201 = {21.95, 32.05}.
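The sample statistics, test statistic, and confidence interval for this problem can be recomputed as a check. This is a minimal sketch using Python's standard library, which has no t distribution, so the two critical values (3.106 and 2.201, both for 11 degrees of freedom) are copied from the answers above and not computed; the variable names are illustrative.

```python
import math
import statistics

mpg = [25, 30, 28, 29, 24, 25, 27, 28, 26, 24, 27, 31]
n = len(mpg)

mean = statistics.mean(mpg)
s2 = statistics.variance(mpg)      # sample variance, divides by n - 1
s = math.sqrt(s2)
std_error = s / math.sqrt(n)

# Two-tailed test of H0: mu = 28.
t_stat = (mean - 28) / std_error

# Critical values quoted in the text (t distribution, 11 degrees of freedom).
T_CRIT_99 = 3.106
T_CRIT_95 = 2.201

reject_null = abs(t_stat) > T_CRIT_99
ci_95 = (mean - T_CRIT_95 * std_error, mean + T_CRIT_95 * std_error)

print(f"mean = {mean}, s2 = {s2:.4f}, s = {s:.3f}")
print(f"t = {t_stat:.3f}, reject H0 at 99%: {reject_null}")
print(f"95% CI for the mean: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
```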