Training and Testing Neural Networks
Seoul National University, Department of Industrial Engineering, Production Information Systems Laboratory
이상진

Contents
• Introduction
• When Is the Neural Network Trained?
• Controlling the Training Process with Learning Parameters
• Iterative Development Process
• Avoiding Over-training
• Automating the Process

Introduction (1)
• Training a neural network to perform a specific processing function
  1) Which parameters are involved?
  2) How are they used to control the training process?
  3) How does management of the training data affect the training process?
• Development process
  1) Data preparation
  2) Selection of the neural network model and architecture
  3) Training the neural network
    – determined by the structure of the neural network and its function
• Application of the "trained" network

Introduction (2)
• Learning parameters for neural networks
• A disciplined approach to iterative neural network development

Introduction (3) [figure]

When Is the Neural Network Trained?
• Whether the network is trained depends on:
  – the type of neural network
  – the function being performed
    • classification
    • clustering data
    • building a model or time-series forecast
  – the acceptance criteria
    • meets the specified accuracy
• Once trained, the connection weights are "locked" and cannot be adjusted

When Is the Neural Network Trained? Classification (1)
• Measure of success: percentage of correct classifications
  – incorrect classification
  – no classification: unknown, undecided
    • set by a threshold limit

When Is the Neural Network Trained? Classification (2)
• Confusion matrix: lists the possible output categories and the corresponding percentage of correct and incorrect classifications

                Category A   Category B   Category C
   Category A      0.60         0.25         0.15
   Category B      0.25         0.45         0.30
   Category C      0.15         0.30         0.55

When Is the Neural Network Trained? Clustering (1)
• The output of a clustering network is open to analysis by the user
• The training regimen is determined by:
  – the number of times the data is presented to the neural network
  – how fast the learning rate and the neighborhood decay
• Adaptive resonance theory (ART) network training
  – vigilance training parameter
  – learning rate

When Is the Neural Network Trained? Clustering (2)
• Locking the ART network weights
  – disadvantage: no further online learning
• ART networks are sensitive to the order of the training data

When Is the Neural Network Trained? Modeling (1)
• Modeling or regression problems
• Usual error measure: RMS (root mean square) error
• Measures of prediction accuracy:
  – average error
  – MSE (mean squared error)
  – RMS error
• Expected behavior: the RMS error is very high at first, then gradually settles toward a stable minimum

When Is the Neural Network Trained? Modeling (2) [figure]

When Is the Neural Network Trained? Modeling (3)
• When the error does not stabilize, the network has fallen into a local minimum
  – the prediction error does not fall, or oscillates up and down
• Remedies:
  – reset (randomize) the weights and start again
  – adjust the training parameters
  – change the data representation
  – change the model architecture

When Is the Neural Network Trained? Forecasting (1)
• Forecasting is a prediction problem
  – error measure: RMS error (see the sketch below)
  – visualize: a time plot of the actual and desired network output
• Time-series forecasting
  – long-term trend
    • influenced by cyclical factors, etc.
  – random component
    • variability and uncertainty
  – neural networks are excellent tools for modeling complex time-series problems
    • recurrent neural networks: nonlinear dynamic systems
      – no self-feedback loops & no hidden neurons
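The Modeling and Forecasting slides both track training progress with the RMS error. As a concrete illustration, here is a minimal Python sketch (not from the original slides) of computing the RMS error and of a rough check for the local-minimum symptom described in Modeling (3); the window, flat_tol, and high values are illustrative assumptions.

```python
import numpy as np

def rms_error(predicted, desired):
    """Root mean square (RMS) error between network output and targets."""
    predicted = np.asarray(predicted, dtype=float)
    desired = np.asarray(desired, dtype=float)
    return np.sqrt(np.mean((predicted - desired) ** 2))

def looks_stuck(error_history, window=10, flat_tol=1e-4, high=0.2):
    """Heuristic for the local-minimum symptom: recent errors are
    flat (no longer falling) but still high. Thresholds are
    illustrative assumptions, not values from the slides."""
    if len(error_history) < window:
        return False
    recent = error_history[-window:]
    return (max(recent) - min(recent) < flat_tol) and min(recent) > high
```

If looks_stuck returns True, the remedies from Modeling (3) apply: reset the weights, adjust the training parameters, or revisit the data representation and model architecture.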
When Is the Neural Network Trained? Forecasting (2) [figure]

Controlling the Training Process with Learning Parameters (1)
• The learning parameters depend on:
  – the type of learning algorithm
  – the type of neural network

Controlling the Training Process with Learning Parameters (2) - Supervised training
• Pattern → Neural Network → Prediction, compared against the Desired Output
• Two questions control the process:
  1) How is the error computed?
  2) How big a step do we take when adjusting the connection weights?

Controlling the Training Process with Learning Parameters (3) - Supervised training
• Learning rate
  – the magnitude of the change when adjusting the connection weights toward the current training pattern's desired output
  – large rate: giant oscillations
  – small rate: the network learns the major features of the problem and generalizes to new patterns

Controlling the Training Process with Learning Parameters (4) - Supervised training
• Momentum
  – filters out high-frequency changes in the weight values
  – prevents oscillation around a set value
  – lets each error influence the weight changes for a long time
  – (see the weight-update sketch at the end of this section)
• Error tolerance
  – defines how close is close enough
  – 0.1 in many cases
  – why it is needed: to drive a sigmoid output all the way to 0 or 1, the net input must be quite large

Controlling the Training Process with Learning Parameters (5) - Unsupervised learning
• Parameters
  – selection of the number of outputs
    • sets the granularity of the segmentation (clustering, segmentation)
  – learning parameters (once the architecture is set)
    • neighborhood parameter: Kohonen maps
    • vigilance parameter: ART

Controlling the Training Process with Learning Parameters (6) - Unsupervised learning
• Neighborhood
  – the area around the winning unit in which the non-winning units will also be modified
  – starts at roughly half the size of the maximum dimension of the output layer
  – two methods for controlling it:
    • a square neighborhood function with a linear decrease in the learning rate
    • a Gaussian-shaped neighborhood with exponential decay of the learning rate
  – the number-of-epochs parameter is important in keeping the locality of the topographic maps

Controlling the Training Process with Learning Parameters (7) - Unsupervised learning
• Vigilance
  – controls how picky the neural network is going to be when clustering data
  – how discriminating it is when evaluating the differences between two patterns
  – defines "close enough"
  – too-high vigilance
    • uses up all of the output units

Iterative Development Process (1)
• Network convergence issues
  – the error falls quickly and then stays flat, or reaches the global minimum
  – the error oscillates up and down, or the network is trapped in a local minimum
• Remedies:
  – add some random noise to the weights
  – reset the network weights and start all over again
  – revisit the design decisions

Iterative Development Process (2) [figure]

Iterative Development Process (3)
• Model selection
  – an inappropriate neural network model for the function to be performed
  – add hidden units or another layer of hidden units
  – a strong temporal or time element embedded in the data calls for:
    • recurrent back propagation
    • a radial basis function network
• Data representation
  – a key parameter is not scaled or coded appropriately
  – a key parameter is missing from the training data
  – experience is the best guide here

Iterative Development Process (4)
• Model architecture
  – the network does not converge: the problem is too complex for the architecture
  – adding some additional hidden units helps
  – adding many more?
    • the network just memorizes the training patterns
  – keeping the hidden layers as thin as possible gives the best results

Avoiding Over-training
• Over-training
  – the same patterns are trained on over and over
  – the network can no longer generalize
  – it handles new patterns poorly
  – remedy: switch between training and testing data (see the early-stopping sketch at the end of this section)

Automating the Process
• Automate the selection of the appropriate number of hidden layers and hidden units
  – pruning out nodes and connections
  – genetic algorithms
  – the opposite approach to pruning (growing the network)
  – the use of intelligent agents
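To make the learning-rate and momentum parameters from slides (3) and (4) concrete, here is a minimal Python sketch of a gradient-descent weight update. This is an illustration, not the slides' algorithm; the values 0.05 and 0.9 are common defaults assumed for the example.

```python
import numpy as np

def update_weights(weights, gradient, velocity, learning_rate=0.1, momentum=0.9):
    """One supervised weight update.

    learning_rate sets the magnitude of each weight change; momentum
    carries a fraction of the previous change forward, filtering out
    high-frequency oscillations around a set value.
    """
    velocity = momentum * velocity - learning_rate * gradient
    return weights + velocity, velocity

# Example on a simple quadratic error surface E(w) = w**2: a small
# learning rate converges smoothly toward the minimum, while a much
# larger one would produce the giant oscillations the slides warn about.
w, v = np.array([2.0]), np.zeros(1)
for _ in range(200):
    grad = 2 * w                     # gradient of E(w) = w**2
    w, v = update_weights(w, grad, v, learning_rate=0.05, momentum=0.9)
print(w)                             # close to the minimum at w = 0
```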
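The over-training remedy of switching between training and testing data is commonly implemented as early stopping. Below is a minimal sketch, assuming the caller supplies train_step (one pass over the training data) and validate (RMS error on the held-out testing data) as callables; the patience parameter is an illustrative name, not one used in the slides.

```python
def train_with_early_stopping(train_step, validate, max_epochs=1000, patience=20):
    """Stop when the held-out (testing) error stops improving."""
    best_error = float("inf")
    stale_epochs = 0
    for epoch in range(max_epochs):
        train_step()              # adjust weights on the training data
        error = validate()        # measure error on the testing data
        if error < best_error:
            best_error, stale_epochs = error, 0
        else:
            stale_epochs += 1     # testing error no longer falling
        if stale_epochs >= patience:
            break                 # further training would only memorize
    return best_error
```

Stopping at the point where the testing error turns upward keeps the network from memorizing the training patterns while it can still generalize to new ones.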