Project: Predicting the 2016 US Presidential election
Due in Week 10
This is an Individual Project
The website http://projects.fivethirtyeight.com/2016-election-forecast/
provides real-time polling information for the November 8th election.
Your project is to develop a statistical model to predict the 2016 US presidential election. The
dataset election2016.txt on the course web-page provides information on US presidential elections going back to 1916 (see footnote 1). Based on this data, your goal is to construct a regression
model for predicting the outcome of the 2016 election. A number of variables
affect the outcome, for example, whether the incumbent president is running again. The goal
here is to address how the current state of the economy affects the outcome. For example, a
good economy has historically been a positive for the incumbent party.
You should provide a number of statistical diagnostics of your model. For example, you should
find the average prediction error of your forecast of the presidential vote variable VP. This is
defined as the Democratic share of the two-party presidential vote in 2016.
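One way to read "average prediction error" is as an out-of-sample quantity. The sketch below is only an illustration under that assumption: it fits ordinary least squares with NumPy and averages the absolute leave-one-out errors. The function names are hypothetical, and the handout does not prescribe this particular diagnostic.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Fitted values for the rows of X (intercept column added here)."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def loo_mean_abs_error(X, y):
    """Average absolute error when each election is predicted
    from a model fitted to all the other elections."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop election i
        beta = fit_ols(X[keep], y[keep])
        errors.append(abs(y[i] - predict(beta, X[i:i + 1])[0]))
    return float(np.mean(errors))
```

Here X would hold the chosen predictors (one row per election) and y the VP column. An in-sample average of absolute residuals is a simpler alternative, though it tends to understate the forecast error.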
The election2016.txt dataset contains the following variables:
1. VP Democratic share of two-party Presidential vote.
2. VC Democratic share of two-party House vote.
3. I 1 if there is a Democratic incumbent at the time of the election and −1 if there is a
Republican incumbent.
4. DPER 1 if the Democratic presidential incumbent is running again, −1 if Republican,
and 0 otherwise.
5. DUR 0 if the incumbent party has been in power for one term, 1 if the incumbent party
has been in power for two consecutive terms, 1.25 if the incumbent party has been in
power for three consecutive terms, 1.50 for four consecutive terms, and so on.
6. WAR 1 for the elections of 1920, 1944, and 1948, and 0 otherwise.
7. GROWTH, G growth rate of real per capita GDP in the first three quarters of the
election year (annual rate).
8. PRICE INFLATION, P absolute value of the growth rate of the GDP deflator in
the first 15 quarters of the administration (annual rate) except for 1920, 1944, and 1948,
where the values are zero.
9. GOODNEWS, Z number of quarters in the first 15 quarters of the administration in
which the growth rate of real per capita GDP is greater than 3.2 percent at an annual rate
except for 1920, 1944, and 1948, where the values are zero.

Footnote 1: The econometrician Ray Fair has done extensive work on this dataset and on predicting elections. His website and
book provide background reading material.
For example, the price inflation variable measures the growth rate of the GDP deflator in the
first 15 quarters of the second Obama administration's term. The Goodnews
(Z) variable is defined as the number of those quarters during which per capita GDP growth
has exceeded 3.2%. The slow growth of the US economy since the financial crisis of 2008 is a
negative for the incumbent party.
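To make the Goodnews definition concrete, here is a minimal sketch of the count. The function name and the input list are hypothetical; the dataset already supplies Z, so this only illustrates how the variable is defined (strictly greater than 3.2 percent, first 15 quarters only).

```python
def goodnews(quarterly_growth, threshold=3.2, n_quarters=15):
    """Count the quarters, among the first n_quarters of an administration,
    whose annualized real per capita GDP growth (in percent) exceeds threshold."""
    return sum(1 for g in quarterly_growth[:n_quarters] if g > threshold)
```

For the war-affected elections of 1920, 1944, and 1948 the dataset sets Z to zero regardless of this count.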
Your project should focus on the key variables: Growth (G), Inflation (P) and Goodnews (Z).
Current values for the economic variables prior to the November 8th election are:
Date                G      P      Z
January 30, 2016    1.97   1.37   3
April 28, 2016      0.87   1.28   3
July 29, 2016       0.94   1.40   2
October 28, 2016    0.97   1.42   2
You have to build a model to predict the presidential vote, VOTE (VP), for the incumbent
(Democratic) party. Your analysis should build on the following:
(a) Predict the election outcome Vote as a function of the three key variables Growth,
Inflation, Goodnews.
(b) Develop a model with interaction variables such as G × I and include the control variables Dper, Dur, War.
(c) Calculate the number of false predictions that your model makes. Is this a reasonable
number?
(d) Calculate your estimate of the probability that the Democrats will win in 2016.
(e) Given the outcome of the 2016 election, assess how well your model foresaw this year’s
election result.
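Steps (b) through (d) can be sketched as follows. This is an illustration under stated assumptions, not the required solution: the specification (intercept, G×I, P×I, Z×I, Dper, Dur, War) is one possible choice, VP is assumed to be measured in percent so that 50 is the winning threshold, the win probability uses a normal approximation to the prediction error, and all function names are hypothetical. Loading election2016.txt is left out because its file layout is not described here.

```python
import math
import numpy as np

def design_matrix(G, P, Z, I, DPER, DUR, WAR):
    """Columns: intercept, G*I, P*I, Z*I, DPER, DUR, WAR (an assumed specification)."""
    G, P, Z, I, DPER, DUR, WAR = (
        np.atleast_1d(np.asarray(v, dtype=float))
        for v in (G, P, Z, I, DPER, DUR, WAR)
    )
    return np.column_stack([np.ones_like(G), G * I, P * I, Z * I, DPER, DUR, WAR])

def fit(X, y):
    """OLS coefficients and the residual variance estimate."""
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (len(y) - X.shape[1])
    return beta, sigma2

def count_false_predictions(X, y, beta, threshold=50.0):
    """Elections where the fitted VP falls on the wrong side of the threshold."""
    y = np.asarray(y, dtype=float)
    return int(np.sum((X @ beta > threshold) != (y > threshold)))

def win_probability(x_new, X, beta, sigma2, threshold=50.0):
    """P(VP > threshold) for a new election, normal approximation."""
    x_new = np.asarray(x_new, dtype=float).ravel()
    mean = float(x_new @ beta)
    xtx_inv = np.linalg.pinv(X.T @ X)
    # Prediction standard error: noise variance plus coefficient uncertainty.
    se = math.sqrt(sigma2 * (1.0 + float(x_new @ xtx_inv @ x_new)))
    z = (mean - threshold) / se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# 2016 predictor row from the October 28, 2016 values in the table above.
# I = 1, DPER = 0, DUR = 1, WAR = 0 follow from the variable definitions:
# Democratic incumbent, president not running again, two consecutive terms.
x_2016 = design_matrix(0.97, 1.42, 2, 1, 0, 1, 0)[0]
```

With X and y built from the historical data, `fit(X, y)` gives the coefficients, `count_false_predictions` answers (c), and `win_probability(x_2016, X, beta, sigma2)` gives one estimate for (d). A t distribution with the residual degrees of freedom would be a more careful choice than the normal approximation for a sample this small.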
Keep in mind the data analytic tools we’ve covered in class. The following is a list of techniques, not all of which will necessarily be central to your analysis: Regression analytics; Hypothesis
testing; Confidence and Prediction intervals; Variable Selection; Diagnostics and Outliers.
Present your analysis in a paper describing your approach with the above in mind. Maximum
length is fifteen pages including exhibits.
Good Luck!