Department of Economics and Statistics
Karlstad University
Jari Appelgren
Assignment 2
Comparing Qualitative Response Regression Models
Applied Econometrics (NEK) & Econometrics B (STAT)
For this assignment we will use a subset of a data set, Table 15.27, from the textbook
website. NOTE! To make the analyses more interesting, the data have been changed
compared to the original on the website. Because of limitations on “Kurstorget”, the
file has been given a '.doc' extension. It can be imported into STATA via File > Import >
'ASCII data created by spreadsheet': use Browse to find the file, and click OK.
The assignment is based on exercise 15.19 from the textbook, which originally concerns
about 2000 women and uses six variables: county, work, age, marital status, number
of children, and education. Note that the data set on the website includes more
variables and more cases than this subset.
The assignment is about using the data to estimate three separate models (the linear
probability model, the logit model, and the probit model) and evaluating which of
these models you would consider an appropriate choice for the data.
Part 1 – The Linear Probability Model
Estimate a linear probability model and decide on a suitable set of variables that are
statistically useful for this model. Explain the parameter estimates and their effect in
terms of the probabilities that they estimate.
It is important that you explain all the problems found in the process and how you
have solved them. Finally, investigate the final estimated model and comment on
any problems that are still evident.
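As an illustration of the kind of problem to look for, here is a minimal sketch of a linear probability model fitted by OLS. It uses hypothetical toy data (ten made-up observations of age and a 0/1 work indicator, not the assignment data) and a single regressor, so it is not a template for the final model. In STATA the corresponding fit would come from `regress`.

```python
# Hypothetical toy data (NOT the assignment data): age and a 0/1 work indicator.
age = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
work = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(age)
mean_age = sum(age) / n
mean_work = sum(work) / n

# OLS with one regressor: slope = S_xy / S_xx, intercept = ybar - slope * xbar.
s_xy = sum((a - mean_age) * (w - mean_work) for a, w in zip(age, work))
s_xx = sum((a - mean_age) ** 2 for a in age)
slope = s_xy / s_xx
intercept = mean_work - slope * mean_age

# In the LPM the fitted value is read as an estimated probability P(work = 1 | age).
fitted = [intercept + slope * a for a in age]

# Problem 1: nothing keeps the fitted "probabilities" inside [0, 1].
out_of_range = [(a, round(p, 3)) for a, p in zip(age, fitted) if p < 0 or p > 1]

# Problem 2: the error variance implied by the model is p(1 - p), which changes
# with age (heteroskedasticity). Computed only where the fitted value is in (0, 1).
implied_var = [round(p * (1 - p), 3) for p in fitted if 0 < p < 1]

print("slope (change in probability per year of age):", round(slope, 4))
print("fitted values outside [0, 1]:", out_of_range)
print("implied error variances:", implied_var)
```

The slope is read directly as the change in the estimated probability for a one-unit change in the regressor, which is what makes the model easy to explain despite these two weaknesses.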
Part 2 – The Logit Model
Estimate the model according to the Logit form, and find the statistically useful
variables. Considering the previous part of the assignment, comment on which
variables are significant compared to the previous results.
Explain the estimates and any problems with the final model.
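To make the interpretation of logit estimates concrete, the following sketch fits a one-regressor logit by Newton-Raphson on hypothetical toy data (not the assignment data). The variable names, the centring of age, and the fixed number of iterations are assumptions made for the illustration. In STATA the estimation itself is done by `logit`.

```python
import math

# Hypothetical toy data (NOT the assignment data): age and a 0/1 work indicator.
age = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
work = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

mean_age = sum(age) / len(age)
z = [x - mean_age for x in age]  # centred age, used for numerical stability


def sigmoid(t):
    """Logistic CDF: maps the linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


a, b = 0.0, 0.0  # a: index at mean age, b: slope on age
for _ in range(25):  # fixed iteration count, an assumption of this sketch
    p = [sigmoid(a + b * zi) for zi in z]
    # Score (gradient of the log-likelihood).
    g0 = sum(yi - pi for yi, pi in zip(work, p))
    g1 = sum((yi - pi) * zi for yi, pi, zi in zip(work, p, z))
    # Information matrix X'WX with weights w = p(1 - p).
    w = [pi * (1.0 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * zi for wi, zi in zip(w, z))
    h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
    det = h00 * h11 - h01 * h01
    # Newton step: solve the 2x2 system explicitly.
    a += (h11 * g0 - h01 * g1) / det
    b += (h00 * g1 - h01 * g0) / det

intercept = a - b * mean_age           # intercept on the original age scale
odds_ratio = math.exp(b)               # multiplicative change in the odds per year
p_mean = sigmoid(a)                    # estimated probability at mean age
marginal_effect = b * p_mean * (1.0 - p_mean)  # dP/d(age) at mean age

print("slope:", round(b, 4), "intercept:", round(intercept, 4))
print("odds ratio per year:", round(odds_ratio, 4))
print("marginal effect at mean age:", round(marginal_effect, 4))
```

Unlike the linear probability model, the slope here is a change in the log-odds, so its effect on the probability depends on where it is evaluated. This is why the sketch reports both an odds ratio and a marginal effect at a chosen point.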
Part 3 – The Probit Model
Follow the same procedure as in Parts 1 and 2, but use the Probit form in this case.
Explain the difference between Logit and Probit in words, as well as any differences in
assumptions and results. Are the results surprising compared to Part 2?
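One way to see how close the two link functions are is to compare the standard normal CDF (probit) with a rescaled logistic CDF (logit) on a grid. The sketch below does that with the standard library only. The rescaling constant 1.702 is a commonly quoted value and is an assumption of this illustration, not something taken from the assignment.

```python
import math


def logistic_cdf(t):
    """CDF behind the logit model."""
    return 1.0 / (1.0 + math.exp(-t))


def normal_cdf(t):
    """Standard normal CDF behind the probit model."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


SCALE = 1.702  # assumed rescaling constant that brings the two curves close

# Largest vertical gap between the two curves on a grid from -5 to 5.
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(normal_cdf(t) - logistic_cdf(SCALE * t)) for t in grid)

# The logistic distribution has heavier tails: compare the two at t = -4.
tail_logit = logistic_cdf(SCALE * -4.0)
tail_probit = normal_cdf(-4.0)

print("largest gap between the curves:", round(max_gap, 4))
print("P(index < -4): logit", tail_logit, "probit", tail_probit)
```

Because the curves differ mainly by a scale factor, logit coefficients typically come out roughly 1.6 to 1.8 times the probit coefficients while the fitted probabilities stay similar. The visible differences, if any, tend to appear in the tails.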
Optional assignment (only if you are interested)
In the Logit model, more commonly called logistic regression, the term LD50 is common
(specifically in medical applications, considering the abbreviation LD). Investigate
this term and explain its meaning and how it is related to the estimates in a simple
logistic regression with only one independent variable.
In our situation there is more than one independent variable, so it is a bit more difficult
to use the same reasoning as for one independent variable. Consider how you might
be able to get similar information from this data set. Note that you can consider either
all variables or one at a time, depending on your chosen perspective on this problem.
Why is the value 50 so interesting in these situations? Please explain.
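As a starting point for the reasoning, the relation between a 50 % response level and the coefficients can be sketched as follows. The symbols (beta_0, beta_1, x_50, and the fixed values x-bar_k) are notation assumed for this sketch and are not taken from the textbook.

```latex
% Simple logistic regression with one independent variable x:
\[
P(Y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\quad\Longleftrightarrow\quad
\ln\frac{P}{1 - P} = \beta_0 + \beta_1 x .
\]
% At P = 0.5 the odds equal 1 and the log-odds equal 0:
\[
0 = \beta_0 + \beta_1 x_{50}
\quad\Longrightarrow\quad
x_{50} = -\frac{\beta_0}{\beta_1} .
\]
% With several independent variables, one possibility is to hold the others
% at chosen values \bar{x}_k and solve for variable j:
\[
x_{j,50} = -\frac{\beta_0 + \sum_{k \neq j} \beta_k \bar{x}_k}{\beta_j} .
\]
```

The last expression depends on the values chosen for the other variables, so it gives one level per chosen profile rather than a single number.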
General instructions
In each part of the assignment you are required to find a final model. You can
proceed in two ways: by entering variables into the model or by removing variables
from it, based on significance. At least one of Parts 1, 2, and 3 should be done by
entering or by removing variables. Explain the criteria that you use for each method.
Please include at least a minimum of output and/or values from the steps used in each
part of the assignment, so that it will be simple to follow the choices made along the
way. The final models must be included in full from STATA. Also include some
diagnostics about the model.
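One simple diagnostic for a binary-response model is a classification table together with the share of correctly predicted outcomes (often called the count R-squared). The sketch below computes both from hypothetical outcomes and fitted probabilities, which are made up for illustration. The 0.5 cutoff is an assumption. In STATA a comparable table is available through `estat classification` after `logit` or `probit`.

```python
# Hypothetical outcomes and fitted probabilities, for illustration only.
actual = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
fitted_prob = [0.05, 0.10, 0.20, 0.35, 0.45, 0.55, 0.70, 0.80, 0.90, 0.95]

CUTOFF = 0.5  # assumed classification threshold

# Classify each observation as 1 if its fitted probability reaches the cutoff.
predicted = [1 if p >= CUTOFF else 0 for p in fitted_prob]

# Classification table keyed by (actual, predicted).
table = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
for a, p in zip(actual, predicted):
    table[(a, p)] += 1

# Count R-squared: share of observations on the diagonal of the table.
count_r2 = (table[(0, 0)] + table[(1, 1)]) / len(actual)

print("classification table (actual, predicted):", table)
print("count R-squared:", count_r2)
```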
The assignment must be turned in at the latest the week before the main examination of
the course, and placed in one of the department postboxes (Nationalekonomi / Statistik)
at the beginning of the 11C200-corridor.
The assignment should be done in groups of up to three students (or alone), with
the name of the assignment and the participants' codes clearly stated on the front page
of the assignment.
Jari Appelgren
2012-09-10