Stat 6601 Project:
Neural Networks
(V&R 6.3)
Group Members:
Xu Yang
Haiou Wang
Jing Wu
5/24/2017
Definition

Neural Network: a broad class of models that mimic the functioning of the human brain.

There are various classes of NN models. They differ from one another depending on:
(1) Problem type: prediction, classification, clustering
(2) Structure of the model
(3) Model-building algorithm

We will focus on the feed-forward neural network.
A bit of biology . . .

The most important functional unit in the human brain is a class of cells called NEURONS.

[Figure: a biological neuron (dendrites, cell body, axon, synapse) alongside a neural network]

• Dendrites – receive information
• Cell body – processes information
• Axon – carries processed information to other neurons
• Synapse – junction between an axon end and the dendrites of other neurons
An Artificial Neuron

[Figure: inputs X1, X2, ..., Xp enter the cell body through connections with weights w1, w2, ..., wp; the axon carries the output, showing the direction of flow of information]

$$ I = w_1 X_1 + w_2 X_2 + \cdots + w_p X_p, \qquad V = f(I) $$

• Receives inputs X1, X2, ..., Xp from other neurons or the environment
• Inputs are fed in through connections with 'weights'
• Total input = weighted sum of inputs from all sources
• A transfer function (activation function) converts the input to the output
• The output goes to other neurons or the environment
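To make the weighted-sum-then-activation idea concrete, here is a minimal sketch of one artificial neuron in R; the function name and the logistic choice of f are illustrative, not from the slides.

Code:

# One artificial neuron: the total input I is the weighted sum of the inputs,
# and the output is V = f(I); f defaults to the logistic function here.
neuron <- function(x, w, f = function(I) 1 / (1 + exp(-I))) {
  I <- sum(w * x)   # I = w1*X1 + w2*X2 + ... + wp*Xp
  f(I)              # V = f(I)
}
neuron(x = c(0.5, -1, 2), w = c(0.1, 0.4, 0.3))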
Simplest but most common form (one hidden layer)

$$ y_k = \phi_o\Big( \alpha_k + \sum_h w_{hk}\, \phi_h\big( \alpha_h + \sum_i w_{ih} x_i \big) \Big) $$
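This formula can be evaluated directly. A minimal sketch in R, assuming a hidden-weight matrix W_in (one row per hidden unit), output weights W_out, and bias vectors a_h and a_o; all names here are illustrative.

Code:

feedforward <- function(x, W_in, a_h, W_out, a_o,
                        phi_h = function(z) 1 / (1 + exp(-z)),  # logistic hidden units
                        phi_o = identity) {                     # linear output units
  h <- phi_h(a_h + W_in %*% x)   # alpha_h + sum_i w_ih x_i, then phi_h
  phi_o(a_o + W_out %*% h)       # alpha_k + sum_h w_hk h_h, then phi_o
}
# Example: 2 inputs, 3 hidden units, 1 output
feedforward(x = c(1, 0.5), W_in = matrix(rnorm(6), 3, 2), a_h = rnorm(3),
            W_out = matrix(rnorm(3), 1, 3), a_o = 0)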
Choice of activation function

[Figure: logistic, tanh, and threshold activation curves]

Logistic: $$ f(x) = \frac{e^x}{1 + e^x} $$

Tanh (hyperbolic tangent): $$ f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} $$

Threshold: $$ f(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases} $$
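Written out as R functions for comparison (a sketch; the second is just base R's tanh):

Code:

logistic  <- function(x) exp(x) / (1 + exp(x))
hyptan    <- function(x) (exp(x) - exp(-x)) / (exp(x) + exp(-x))  # equals tanh(x)
threshold <- function(x) ifelse(x < 0, 0, 1)
curve(logistic, -4, 4, ylim = c(-1, 1))    # S-shaped, range (0, 1)
curve(hyptan, -4, 4, add = TRUE, lty = 2)  # S-shaped, range (-1, 1)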
A collection of neurons forms a layer

Input layer – each neuron gets ONLY one input, directly from outside
Hidden layer(s) – connect the input and output layers
Output layer – the output of each neuron goes directly to outside

[Figure: network diagram with inputs x1, ..., x4, connection weights wij, hidden layer(s), and outputs]
More general format

Skip-layer connections:

$$ y_k = \phi_o\Big( \alpha_k + \sum_{i \to k} w_{ik} x_i + \sum_{j \to k} w_{jk}\, \phi_h\big( \alpha_j + \sum_{i \to j} w_{ij} x_i \big) \Big) $$

[Figure: the same network diagram with direct input-to-output (skip-layer) connections wij added]
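Extending the earlier forward-pass sketch with a hypothetical skip-connection weight matrix W_skip, which links the inputs directly to the outputs:

Code:

feedforward_skip <- function(x, W_in, a_h, W_out, a_o, W_skip,
                             phi_h = function(z) 1 / (1 + exp(-z)),
                             phi_o = identity) {
  h <- phi_h(a_h + W_in %*% x)              # hidden layer, as before
  phi_o(a_o + W_skip %*% x + W_out %*% h)   # extra sum_{i->k} w_ik x_i term
}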
Fitting criteria

Least squares:

$$ E = \sum_p \| t^p - y^p \|^2 $$

Maximum likelihood (entropy):

$$ E = \sum_p \sum_k \left[ t_k^p \log \frac{t_k^p}{y_k^p} + (1 - t_k^p) \log \frac{1 - t_k^p}{1 - y_k^p} \right] $$

Log likelihood (softmax):

$$ E = -\sum_p \sum_k t_k^p \log p_k^p, \qquad p_k^p = \frac{e^{y_k^p}}{\sum_{c=1}^{K} e^{y_c^p}} $$

One way to ensure f is smooth: minimize E + λC(f).
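The criteria translate directly into R. A sketch, assuming t and y are numeric vectors of targets and outputs, and using the convention 0 log 0 = 0 in the entropy form:

Code:

E_ls <- function(t, y) sum((t - y)^2)                      # least squares
xlogx <- function(a, b) ifelse(a == 0, 0, a * log(a / b))  # convention: 0 log 0 = 0
E_entropy <- function(t, y) sum(xlogx(t, y) + xlogx(1 - t, 1 - y))
softmax <- function(y) exp(y) / sum(exp(y))                # p_k = e^{y_k} / sum_c e^{y_c}
E_loglik <- function(t, y) -sum(t * log(softmax(y)))       # log likelihood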
Usage of nnet in R

nnet.formula(formula, data=NULL, weights, ..., subset, na.action=na.fail, contrasts=NULL)

formula: a formula of the form 'class ~ x1 + x2 + ...'
weights: (case) weights for each example; if missing, defaults to 1.
size: number of units in the hidden layer. Can be zero if there are skip-layer units.
Wts: initial parameter vector. If missing, chosen at random.
rang: if Wts is missing, use random weights from runif(n, -rang, rang).
linout: switch for linear output units. Default: logistic output units.
entropy: switch for entropy (= maximum conditional likelihood) fitting. Default: least squares.
softmax: switch for softmax (log-linear model) and maximum conditional likelihood fitting.
skip: logical for links from inputs to outputs (skip-layer connections).
decay: the weight-decay parameter λ.
maxit: maximum number of iterations for the optimizer.
Hess: should the Hessian matrix at the solution be returned?
trace: logical for output from the optimizer.
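A minimal classification call exercising several of these arguments, on R's standard iris data (the settings are illustrative; for a factor response with more than two levels, the formula method fits a softmax network automatically):

Code:

library(nnet)
ir.nn <- nnet(Species ~ ., data = iris, size = 2, rang = 0.1,
              decay = 5e-4, maxit = 200, trace = FALSE)
table(iris$Species, predict(ir.nn, iris, type = "class"))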
An Example
Code:

library(MASS)
library(nnet)
attach(rock)
area1 <- area/10000; peri1 <- peri/10000
rock1 <- data.frame(perm, area = area1, peri = peri1, shape)
rock.nn <- nnet(log(perm) ~ area + peri + shape, rock1, size = 3,
                decay = 1e-3, linout = T, skip = T, maxit = 1000, Hess = T)
summary(rock.nn)
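Since Hess = T is set, the fitted object carries the Hessian of the fit criterion at the solution; a brief follow-up check (all eigenvalues positive indicates a local minimum):

Code:

eigen(rock.nn$Hessian, only.values = TRUE)$values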
Output
# weights: 19
initial value 1712.850737
iter 10 value 34.726352
iter 20 value 32.725356
iter 30 value 30.677100
iter 40 value 29.430856
………………………………….
iter 140 value 13.658571
iter 150 value 13.248229
iter 160 value 12.941181
iter 170 value 12.913059
iter 180 value 12.904267
iter 190 value 12.901672
iter 200 value 12.900292
iter 210 value 12.899496
final value 12.899400
converged
> summary(rock.nn)
a 3-3-1 network with 19 weights
options were - skip-layer connections linear output units
decay=0.001
b->h1 i1->h1 i2->h1 i3->h1
9.48 -7.39 -14.60 6.94
b->h2 i1->h2 i2->h2 i3->h2
1.92 -11.87 -2.88 7.36
b->h3 i1->h3 i2->h3 i3->h3
-0.03 -11.12 15.61 4.62
b->o h1->o h2->o h3->o i1->o i2->o i3->o
2.64 3.89 11.90 -17.76 -0.06 4.73 -0.38
> sum((log(perm) - predict(rock.nn))^2)
[1] 11.39573
(The optimizer's final value of 12.8994 is the penalized criterion E + λΣw²; the unpenalized residual sum of squares is the smaller 11.39573.)
Use the same method from the previous section to view the fitted surface

Code:

library(lattice)   # provides trellis.device() and wireframe() in R
Xp <- expand.grid(area = seq(0.1, 1.2, 0.05),
                  peri = seq(0, 0.5, 0.02), shape = 0.2)
trellis.device()
rock.grid <- cbind(Xp, fit = predict(rock.nn, Xp))
## S: Trellis 3D Plot
wireframe(fit ~ area + peri, rock.grid, screen = list(z = 160, x = -60),
          aspect = c(1, 0.5), drape = T)
Output

[Figure: wireframe plot of the fitted surface, fit against area and peri]
Experiment to show the key factor that affects the degree of fit

attach(cpus)
cpus3 <- data.frame(syct = syct - 2, mmin = mmin - 3, mmax = mmax - 4,
                    cach = cach/256, chmin = chmin/100, chmax = chmax/100,
                    perf = perf)
detach()
test.cpus <- function(fit)
    sqrt(sum((log10(cpus3$perf) - predict(fit, cpus3))^2)/109)
cpus.nn1 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 0)
test.cpus(cpus.nn1)
[1] 0.271962
cpus.nn2 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 4,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn2)
[1] 0.2130121
cpus.nn3 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 10,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn3)
[1] 0.1960365
cpus.nn4 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 25,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn4)
[1] 0.1675305

The fitted error decreases steadily as size grows from 0 to 25: the number of hidden units is the key factor affecting the degree of fit here.