732A20 Data Mining and Statistical Learning
Department of Computer and Information Science

Computer lab 7: Generalized additive models

Learning objectives

The main objective of this computer lab is to familiarize the student with the use of generalized additive models for prediction and hypothesis testing. After completing the lab, the student shall be able to:

(i) Use proc GAM in SAS to fit different types of generalized additive models to a given dataset.
(ii) Interpret the output from proc GAM, and test hypotheses regarding the model components.
(iii) Visually inspect the residuals of a generalized additive model and, based on the patterns found in the residuals, suggest how the model can be improved.

Recommended reading

Chapter 9.1 in Hastie et al.

Assignment 1: Using additive models to examine how mortality is related to the number of influenza cases

The Excel document influenza.xls contains weekly data on the mortality and the number of laboratory-confirmed cases of influenza in Sweden. In addition, there is information about population-weighted temperature anomalies (temperature deficits). The last columns contain time-lagged variables. Your task is to employ generalized additive models to examine how the mortality is influenced by the number of influenza cases.

a) Use time series plots and scatter charts to visually inspect how the mortality varies with year, week, and the number of laboratory-confirmed cases of influenza.

b) Use Proc GAM to investigate how the mortality can be described as a function of year (or time) and week. The set of models fitted to the data shall include:
(i) ordinary regression models with independent normally distributed error terms;
(ii) additive semiparametric models with independent normally distributed error terms.
Use various combinations of param() and spline() in the model statement, and examine how the degrees of freedom of the spline function(s) influence the deviance of the model.
Also, plot predicted and observed mortality against time for the fitted models.

c) Choose one of the models you have already fitted to the data and examine the residuals. Is the temporal pattern in the residuals correlated with the outbreaks of influenza?

d) Use Proc GAM to investigate how the mortality can be described as a function of year (or time), week, and the number of confirmed cases of influenza. Summarize your findings in a table of deviances for the tested models. Choose the model having the smallest deviance and make suitable plots of:
(i) the spline components in the model;
(ii) observed and predicted mortality rates.
Test whether or not the mortality is influenced by the outbreaks of influenza.

Assignment 2: Using additive models to examine how mortality is related to the number of influenza cases and extremely low temperatures

The Excel file influenza.xls contains observations of population-weighted temperature deficits in Sweden. (A high deficit means that it is unusually cold.) Your task is to employ generalized additive models to investigate how the mortality is influenced by influenza outbreaks and temperature deficits.

a) Take the best model in assignment 1 and examine by visual inspection whether or not the residuals in that model are correlated with the temperature deficit.

b) Use Proc GAM to investigate how the mortality can be described as a function of year (or time), week, the number of confirmed cases of influenza, and the temperature deficit. Summarize your findings in a table of deviances for the tested models. Choose the model having the smallest deviance and make suitable plots of:
(i) the spline components in the model;
(ii) observed and predicted mortality rates.
Test whether or not the mortality is influenced by the outbreaks of influenza and the temperature deficit.

c) Use Proc GAM to investigate whether your so far best model can be further improved by introducing time-lagged information about influenza cases and temperature deficits. Summarize your findings in a table of deviances for the tested models. Choose the model having the smallest deviance and make suitable plots of:
(i) the spline components in the model;
(ii) observed and predicted mortality rates.
Test whether or not the mortality is influenced by the outbreaks of influenza and the temperature deficit.

To hand in

Highlighted items.
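
Example sketch

As a starting point for the Proc GAM steps in the assignments, the following is a minimal sketch and not a prescribed solution. The variable names Mortality, Time, Week and Influenza, the dataset names influenza and gam_out, and the df values are assumptions made for illustration; check them against the actual column headers in influenza.xls before running anything.

```sas
/* Read the Excel sheet into a SAS dataset, taking variable names from
   the first row. DBMS=XLS assumes that SAS/ACCESS Interface to PC Files
   is installed; the file path is assumed to be the working directory. */
proc import datafile="influenza.xls" out=influenza dbms=xls replace;
   getnames=yes;
run;

/* Additive semiparametric model with normally distributed errors:
   a linear (parametric) trend in Time, a smooth seasonal effect of Week,
   and a smooth effect of the number of influenza cases.
   Variable names and df values are hypothetical. */
proc gam data=influenza;
   model Mortality = param(Time) spline(Week, df=4) spline(Influenza, df=3)
         / dist=gaussian;
   /* Save overall and component-wise predictions and the residuals. */
   output out=gam_out predicted residual;
run;

/* Observed and predicted mortality against time. The output variables
   are expected to be named P_Mortality and R_Mortality; confirm the
   names with proc contents if the plot step fails. */
proc sgplot data=gam_out;
   series x=Time y=Mortality;
   series x=Time y=P_Mortality;
run;

/* Residuals against time, with a reference line at zero. */
proc sgplot data=gam_out;
   scatter x=Time y=R_Mortality;
   refline 0 / axis=y;
run;
```

Refitting the model with other df values, or moving a variable between param() and spline(), yields the deviances needed for the comparison tables. The analysis of deviance table in the Proc GAM output reports an approximate chi-square test for each spline term, which is one way to approach the hypothesis tests asked for in the assignments.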