```
UNC-Wilmington
Dept. Economics and Finance
ECN 525
Dr. Chris Dumas
Model Building Example 1: Air Pollution & Lung Cancer
Suppose that public health officials suspect a relationship between the air pollution index (X) in
various cities and the number of new lung cancer cases per year per 100,000 people (Y).
Suppose that you have available the following (hypothetical) data on X and Y in various cities.
Public health officials say that more air pollution leads to more lung cancer, whereas industry
representatives claim that lung cancer is caused by other factors (smoking, etc.).
Air Pollution Index    New Lung Cancer Cases per Year
                       (per 100,000 people)
X                      Y
100                    304
134                    323
178                    365
205                    341
240                    400
160                    336
134                    301
122                    327
210                    343
136                    292
227                    364
130                    292
162                    335
170                    302
187                    343
156                    317
176                    355
144                    357
127                    318
139                    298
152                    329
166                    334
188                    321
1) Based on the data given in the table above, try to find a pattern in the data, and
form a (very) simple theory about the relationship between the air pollution index and new lung
cancer cases. State your theory in a sentence. Remember that a theory is a pattern in data, so
your statement of a theory should be a description of a possible pattern in the data. When
describing the pattern, be sure to specify the particular variables involved in the pattern and how
they relate to one another.
2) <this question deleted>
3) Use Excel to construct a graph of the air pollution index vs. new lung cancer cases and look
for any pattern (i.e., does the pattern in your graph even vaguely support your theory?). (Note
that by constructing a graph and looking for a pattern to support your theory, you are engaging in
inductive reasoning, or inference.)
4) Based on your graph, it should be uncertain which of two patterns is present in your data (i.e.,
a flat line, indicating no relationship, or an upward sloping line, indicating a relationship). Use
Nested Modeling to determine the best operators for a model of your theory (by doing so, you
determine which pattern is best supported by the data).
5) Use Excel to determine the best parameters for your model. (This should be a by-product of
6) What does Statistical Hypothesis Testing (the t-stat and the Adjusted R-squared numbers) say
7) Conduct a Sensitivity Analysis of your model with respect to the parameters. The model
results (that is, Y) are most sensitive to estimates of which model parameter?
8) Derive two theorems from your theory. The theorems should answer the following questions:
(a) If the air pollution index increases by 50 units, what is the increase in the number of new lung
cancer cases? (b) If there were no air pollution in a city, what would be the baseline/background
number of new lung cancer cases per year?
9) Suppose industry representatives say that it is smoking that is causing most of the new lung
cancer cases. Describe a natural experiment that would allow you to test this claim.
10) By performing the analysis in this homework, have you proved your theory from question (1)
above? If so, why? If not, why not (what other factors could explain the pattern observed in