
732A20 Data Mining and Statistical Learning
Department of Computer and Information Science

Computer lab 9: Neural networks

Learning objectives

The main objective of this computer lab is to make the student familiar with some advanced properties of neural networks and the use of such models for classification and regression. After completing the lab the student shall be able to:

(i) Use the Neural Network node in SAS Enterprise Miner to select and fit classification and regression models
(ii) Understand the difference in configuration and areas of application of Multilayer Perceptrons and Radial Basis Function Networks
(iii) Understand how the user can select various optimization methods for fitting neural networks to given data sets.

Recommended reading

SAS Enterprise Miner documentation on the Neural Network node.

Assignment 1: Hand-written digits recognition

The aim of the research is to find out how well a neural network can recognize images. The problem is supposed to be done by both persons of the group.

1. Open the file grid.bmp with the Paint program. Choose the white pencil color. Choose View->Zoom->Large size. In each cell of the first row you should draw the digit ‘1’. Student 1 fills in the first 7 cells and student 2 fills in the remaining 3 cells. Row number 2 should then be filled with the digit ‘2’ by the same principle. Rows 3, 4 and 5 are filled with the corresponding digits. Save the result as ‘data.bmp’. A fragment of the filled picture is given. Include the picture in your report.
2. Download the files converter.ctf (make sure that the extension was not changed after you downloaded the file) and converter.exe into the same folder as the images.
3. Perform the following steps in Windows. Open the Start menu, right-click on My Computer, choose Properties, click on the ‘Advanced’ tab and click the ‘Environment Variables’ button. In the new window that opens, click the ‘New’ button. Specify ‘Variable name’= PATH and ‘Variable value’= C:\programs\matlab\bin\win32. Click OK in all three windows.
4. Run converter.exe. The new files train.txt and score.txt should appear. These files contain records representing the digit images in binary format and the labels showing which digit each binary map represents.
5. Import both train.txt and score.txt to SAS using the ‘Import data...’ menu item. In the ‘Select a data source from the list below’ list choose the ‘Delimited file (*.*)’ item. Give the names traindata and scoredata to the corresponding data sets.
6. Create input nodes with training set traindata, test set scoredata and score set scoredata. The last column ‘Var257’ in both tables represents the digit value. Set ‘Model role’ and ‘Measurement’ for ‘Var257’ in the appropriate Enterprise Miner nodes to ‘target’ and ‘nominal’, respectively.
7. Create a neural network with the default configuration (multilayer perceptron) and run the network for training and scoring.
8. Find the misclassification rate for the scoredata table. Compare the real digit values given in the scoredata table with the predicted ones.
9. Change the configuration of the network: set the activation function in the output layer to ‘Logistic’ and the error function to ‘Normal’. Change the number of neurons in the hidden layer to 10. Run the network. Perform step 8 and compare the results obtained with the default configuration and the new configuration.
10. Do you think that the created neural networks perform well for this data?

Assignment 2: Comparison of ORBF networks with multilayer perceptron and optimization methods

1. Import the data set from wave.xls to SAS (call it wave) and plot the response variable (Y) versus the explanatory variable (X) using the GPLOT procedure.
2. How many peaks can you see in that function? Let us denote this number by p.
3. Create an Input Data Source node in Enterprise Miner for the wave data set and select suitable variable roles.
4. Create two neural networks by creating two Neural Network nodes and connecting them to the Input Data Source node. The first network should be an Ordinary Radial Basis Function (ORBF) network. Think about the minimal number of hidden neurons this network must contain to have a chance of fitting the data properly. How is this quantity related to p? Let h denote the minimal number of neurons. Set the number of neurons in your ORBF network to h. The second network should be a Multilayer Perceptron (MLP). Set the number of neurons in the hidden layer of this network to h as well. Switch off ‘Training Process monitor’.
5. Train both networks and score the wave data set. Plot the fitted values produced by each network using GPLOT together with the original observations (you shall draw two plots: (ORBF fit, original) and (MLP fit, original)). Are you satisfied with the results? Which network performs better? What is the average squared error in each of the two cases?
6. Enable the advanced user interface for the ORBF network and compare how much time the different training techniques require. Try Standard backprop, Quickprop, Quasi-Newton, Conjugate Gradient and Levenberg-Marquardt with Maximum Iterations=10000. The time can be seen in the Log of the Results, after the line ‘NOTE: PROCEDURE NEURAL used (Total process time):’. Compose a table with columns (technique, time).
7. Set the Default training method for the ORBF network. Set Optimization Step=‘Prelim and Train’ and Number of Preliminary runs to ‘50’ for both networks.
8. Run the MLP and ORBF networks and produce two more plots as you did at step 5. Compare the plots obtained at step 8 and answer which network performed better now and why. Compare the plots for both networks with the plots obtained at step 5. Is there an improvement? If there is, what was the problem at step 5 and why did the new settings help at step 8?
9. Try different configurations of the MLP network:
   a. h hidden neurons with activation function=‘Hyperbolic tangent’ and activation function in output layer=‘Square’
   b. h-1 hidden neurons with activation function=‘Hyperbolic tangent’, 1 neuron with activation function=‘Sine’, activation function in output layer=‘Default’ (add one more hidden layer and set the number of neurons in both layers appropriately)
   For each configuration obtain the plot of fitted values and original values versus predictors. Compare the obtained graphs with the ones you had at step 8.
10. For the ORBF network set the Weight decay parameter to ‘1’. How and why did it influence the fit?

To hand in

Highlighted items.
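
Assignment 1 (step 8) and Assignment 2 (step 5) ask for a misclassification rate and an average squared error. Enterprise Miner reports both in its results, but the following Python sketch may help to check what the numbers mean. It assumes that the misclassification rate is the share of records whose predicted digit differs from the real one, and that the average squared error is the mean of the squared differences between observed and fitted values. The function names and the small example lists are hypothetical and are not taken from the lab data.

```python
def misclassification_rate(actual, predicted):
    """Share of records whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one record is required")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def average_squared_error(observed, fitted):
    """Mean of squared differences between observed and fitted values."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one observation is required")
    return sum((y - f) ** 2 for y, f in zip(observed, fitted)) / len(observed)


# Hypothetical values for illustration only (not results from the lab).
digits_actual = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
digits_predicted = [1, 1, 2, 3, 3, 3, 4, 1, 5, 5]
rate = misclassification_rate(digits_actual, digits_predicted)

y_observed = [0.0, 1.0, 0.0, -1.0]
y_fitted = [0.5, 0.5, 0.0, -1.0]
ase = average_squared_error(y_observed, y_fitted)

print(f"misclassification rate: {rate:.3f}")
print(f"average squared error:  {ase:.3f}")
```

In the lab itself the actual and predicted values would come from the scored scoredata and wave tables exported from SAS; how those columns are named depends on the Enterprise Miner output and is not assumed here.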
