Artificial Intelligence
Neural Networks
(Chapter 9)
Outline of this Chapter
• Biological Neurons
• Neural Networks History
• Artificial Neural Networks
• Perceptrons
• Multilayer Neural Networks
• Applications of Neural Networks
Definition
Neural Network: a broad class of models that mimic the functioning of the human brain.
There are various classes of NN models. They differ from one another depending on:
• Problem type
• Structure of the model
• Model-building algorithm
For this discussion we will focus on the Feed-forward Back-propagation Neural Network
(used for prediction and classification problems).
Biological Neurons
• The brain is made up of neurons (nerve cells), which have
– dendrites (inputs)
– a cell body (soma)
– an axon (outputs)
– synapses (connections between cells)
• Synapses can be excitatory (potential-increasing) or inhibitory (potential-decreasing), and may change over time
• Each synapse releases a chemical transmitter; the sum of these can cause a threshold to be reached, causing the neuron to fire (an electrical impulse is sent down the axon)
Biology of a neuron
A bit of biology . . .
The most important functional unit in the human brain is a class of cells called the NEURON.
[Figure: schematic of a neuron labeling the dendrites, cell body, axon, and synapse; photo of hippocampal neurons. Source: heart.cbl.utoronto.ca/~berj/projects.html]
• Dendrites – receive information
• Cell Body – processes information
• Axon – carries processed information to other neurons
• Synapse – junction between an axon end and the dendrites of other neurons
An Artificial Neuron
[Figure: an artificial neuron, drawn by analogy with the biological one. Inputs X1, X2, …, Xp arrive on the “dendrites” through connections with weights w1, w2, …, wp; the “cell body” forms the total input I and applies the transfer function f; the “axon” carries the output V in the direction of information flow.]
I = w1X1 + w2X2 + w3X3 + … + wpXp
V = f(I)
• Receives inputs X1, X2, …, Xp from other neurons or the environment
• Inputs are fed in through connections with ‘weights’
• Total input = weighted sum of inputs from all sources
• The transfer function (activation function) converts the total input to the output
• The output goes to other neurons or the environment
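As a minimal sketch of this computation in Python (the function names and the choice of a sigmoid for f are illustrative assumptions, not from the slides):

```python
import math

def artificial_neuron(inputs, weights, f):
    """V = f(I), where I is the weighted sum of the inputs."""
    # Total input: I = w1*X1 + w2*X2 + ... + wp*Xp
    I = sum(w * x for w, x in zip(weights, inputs))
    return f(I)

# One common choice of transfer function f: the sigmoid
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

# Three inputs from other neurons or the environment, with their weights
V = artificial_neuron(inputs=[0.5, -1.0, 2.0], weights=[0.1, 0.4, 0.3], f=sigmoid)
print(V)  # output passed on to other neurons or the environment
```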
Biological Neurons (cont.)
• When the inputs reach some threshold, an action potential (electrical pulse) is sent along the axon to the outputs.
• The pulse spreads out along the axon, reaching synapses and releasing transmitters into the bodies of other cells.
• Learning occurs as a result of the synapses’ plasticity: they exhibit long-term changes in connection strength.
• There are about 10^11 neurons and about 10^14 synapses in the human brain(!)
• A neuron may connect to as many as 100,000 other neurons
Brain structure
• We still do not know exactly how the brain works. For example, we are born with about 100 billion neurons; many die as we progress through life and are not replaced, yet we continue to learn. But we do know certain things about it.
• Different areas of the brain have different functions
– Some areas seem to have the same function in all humans (e.g., Broca’s region: speech & language); the overall layout is generally consistent
– Some areas vary in their function; also, the lower-level structure and function vary greatly
[Figure: brain diagram labeling regions for the senses; for emotions, reasoning, planning, movement, and parts of speech; for hearing, memory, meaning, and language; and for vision and the ability to recognize objects]
Brain structure (cont.)
• We don’t know how different functions are “assigned” or acquired
– Partly the result of the physical layout / connections to inputs (sensors) and outputs (effectors)
– Partly the result of experience (learning)
• We really don’t understand how this neural structure / collection of simple cells leads to action, consciousness, and thought
• Artificial neural networks are not nearly as complex as the actual brain structure
Comparing brains with computers

                     | Computer                        | Human Brain
Computational units  | 1 CPU, 10^5 gates               | 10^11 neurons
Storage units        | 10^9 bits RAM, 10^11 bits disk  | 10^11 neurons, 10^14 synapses
Cycle time           | 10^-8 sec                       | 10^-3 sec
Bandwidth            | 10^9 bits/sec                   | 10^14 bits/sec
Neuron updates/sec   | 10^5                            | 10^14
• There are more neurons in the human brain than there are bits in computers.
• The human brain is evolving very slowly; computer memories are growing rapidly.
• There are far more neurons than we can reasonably model in modern digital computers, and they all fire in parallel.
• A NN running on a serial computer requires hundreds of cycles to decide whether a single neuron will fire; in a real brain, all neurons do this in a single step. For example, the brain recognizes a face in less than a second, which would cost a serial computer billions of cycles.
• Neural networks are designed to be massively parallel.
• The brain is effectively a billion times faster at what it does.
Neural networks History
• McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network.
• Many of their ideas are still used today (e.g., the idea that a neuron has a threshold level, and fires once that level is reached, is still the fundamental way in which artificial neural networks operate).
• Hebb (1949) developed the first learning rule (on the premise that if two neurons are active at the same time, the strength of the connection between them should be increased).
• During the 50s and 60s many researchers, such as Minsky & Papert, worked on the perceptron (an NN model).
• 1969 saw the death of neural network research for about 15 years.
• Only in the mid-80s (Parker and LeCun) was NN research revived.
Artificial Neural Network
• (Artificial) Neural networks are made up of nodes/units connected by links, which have
– input edges, each with a numeric weight
– output edges (with weights)
– an activation level (a function of the inputs)
The computation is split into 2 components:
1. A linear component, called the input function (in_i), computes the weighted sum of the unit’s input values.
2. A non-linear component, called the activation function (g), transforms the weighted sum into the final value that serves as the unit’s activation value:
a_i = g(in_i) = g( Σ_j W_j,i a_j )
• Some nodes are inputs (perception), some are outputs (action)
Modeling a Neuron
in_i = Σ_j W_j,i a_j
Each unit does a local computation based on inputs from its neighbours: it computes a new activation level and sends it along each of its output links.
a_j: activation value of unit j
W_j,i: weight on the link from unit j to unit i
in_i: weighted sum of inputs to unit i
a_i: activation value of unit i
g: activation function
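A minimal sketch of this local computation (the weight-matrix indexing W[j][i] and the specific values are illustrative assumptions):

```python
def unit_activation(i, W, a, g):
    """a_i = g(in_i), where in_i = sum over j of W[j][i] * a[j]."""
    in_i = sum(W[j][i] * a[j] for j in range(len(a)))
    return g(in_i)

# Example: two neighbour units feeding unit 0
W = [[0.5], [-0.25]]   # W[j][i]: weight on the link from unit j to unit i
a = [1.0, 2.0]         # a[j]: activation values of the neighbour units
step = lambda x: 1 if x >= 0 else 0
print(unit_activation(0, W, a, step))  # in_0 = 0.5*1.0 - 0.25*2.0 = 0.0, so it fires: 1
```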
Activation Functions
• Different models are obtained by using different mathematical functions for g.
• 3 common choices are:
– Step(x) = 1 if x >= 0, else 0   (threshold function)
– Sign(x) = +1 if x >= 0, else −1
– Sigmoid(x) = 1/(1 + e^(−x))   (logistic function; e is the base of the natural logarithm. Being differentiable, the sigmoid suits learning methods in which we try to minimize the error by adjusting the weights of the network.)
• 1 represents the firing of a pulse down the axon, & 0 represents no firing.
• t (threshold) represents the minimum total weighted input needed to cause the neuron to fire.
ANN (Artificial Neural Network) – Feed-forward Network
A collection of neurons forms a ‘Layer’.
[Figure: a feed-forward network with inputs X1, X2, X3, X4 entering the input layer, a hidden layer in between, and outputs y1, y2 leaving the output layer; information flows from input to output]
• Input Layer – each neuron gets ONLY one input, directly from outside
• Hidden Layer – connects the Input and Output layers
• Output Layer – the output of each neuron goes directly to the outside
ANN (Artificial Neural Network) – Feed-forward Network
The number of hidden layers can be:
• None
• One
• More
Implementing logical functions
• McCulloch and Pitts: the Boolean functions AND, OR, & NOT can each be represented by a unit with suitable weights & threshold.
• We can use these units to build a network to compute any Boolean function, as sketched below.
(t = the threshold, or the value of the bias weight, that determines when the neuron fires)
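A sketch of such units, using one standard textbook choice of weights and thresholds (the specific values are an assumption; any values that separate the cases correctly would work):

```python
def threshold_unit(inputs, weights, t):
    """McCulloch-Pitts unit: fires (outputs 1) iff the weighted input sum reaches t."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= t else 0

AND = lambda a, b: threshold_unit([a, b], [1, 1], t=1.5)
OR  = lambda a, b: threshold_unit([a, b], [1, 1], t=0.5)
NOT = lambda a:    threshold_unit([a],    [-1],   t=-0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT:", NOT(0), NOT(1))  # 1 0
```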
Network Structure
There are 2 main categories of NN structure:
• Feed-forward/acyclic networks:
– Allow signals to travel one way only, from input to output. There is no feedback (no loops).
– Tend to be straightforward networks that associate inputs with outputs (e.g., pattern recognition).
– Usually arranged in layers; each unit receives input only from units in the preceding layer, and there are no links between units in the same layer.
– Examples: single-layer perceptrons, multi-layer perceptrons.
• Recurrent/cyclic networks:
– Feed their outputs back into their own inputs.
– Recurrent neural nets have directed cycles with delays.
– The links can form arbitrary topologies.
• The brain is a recurrent network – activation is fed back to the units that caused it.
Feed-forward example
[Figure: a simple NN with 2 input units (1, 2), 2 hidden units (3, 4) & 1 output unit (5); the hidden units have no direct connection to the outside world]
a5 = g(W_3,5·a3 + W_4,5·a4)
   = g(W_3,5·g(W_1,3·a1 + W_2,3·a2) + W_4,5·g(W_1,4·a1 + W_2,4·a2))
By adjusting the weights, we change the function that the network represents: this is how learning occurs in a NN!
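A minimal sketch of this nested computation, assuming a sigmoid for g and illustrative weight values:

```python
import math

g = lambda x: 1.0 / (1.0 + math.exp(-x))  # assuming a sigmoid for g

def a5(a1, a2, W):
    """Output of the 2-2-1 network above; W maps 'j,i' strings to the link weights W_j,i."""
    a3 = g(W["1,3"] * a1 + W["2,3"] * a2)
    a4 = g(W["1,4"] * a1 + W["2,4"] * a2)
    return g(W["3,5"] * a3 + W["4,5"] * a4)

# Illustrative weights: adjusting them changes the function the network represents
W = {"1,3": 0.2, "2,3": -0.4, "1,4": 0.7, "2,4": 0.1, "3,5": 1.5, "4,5": -1.0}
print(a5(1.0, 0.0, W))
```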
Perceptron
• A network with all inputs connected directly to the outputs.
• This is called a single-layer NN (Neural Network) or a Perceptron Network.
• It is a simple form of NN that is used for classification of linearly separable patterns (i.e., if there are 2 classes, we can separate them with a line, with each class on a different side of the line).
Perceptron or a Single-layer NN
• A Feed-Forward NN with no hidden units.
• Output units all operate separately – no shared weights.
• First studied in the 50s.
• Other networks were known about, but the perceptron was the only one capable of learning, and thus all research was concentrated in this area.
• A single weight only affects one output, so we can limit our study to a model with a single output unit.
• Notation can then be simpler, i.e.
O = Step_0( Σ_j W_j I_j )
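A sketch in Python. The output rule is the formula above; the update rule is the classic perceptron learning rule, which the slides allude to (“capable of learning”) but do not spell out, so the rule, the learning rate, and the OR example are standard textbook material rather than lecture content:

```python
def perceptron_output(W, inputs):
    """O = Step_0( sum_j W_j * I_j ); inputs[0] is a fixed bias input of 1."""
    return 1 if sum(w * x for w, x in zip(W, inputs)) >= 0 else 0

def train(examples, n_weights, rate=0.1, epochs=50):
    """Classic perceptron learning rule: W_j += rate * (target - output) * I_j."""
    W = [0.0] * n_weights
    for _ in range(epochs):
        for inputs, target in examples:
            err = target - perceptron_output(W, inputs)
            W = [w + rate * err * x for w, x in zip(W, inputs)]
    return W

# Learn the (linearly separable) OR function; the first input is the bias term
data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
W = train(data, n_weights=3)
print([perceptron_output(W, x) for x, _ in data])  # expect [0, 1, 1, 1]
```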
Multilayer NN
• A network with 1 or more layers of hidden units.
• Layers are usually fully connected.
• The number of hidden units is typically chosen by hand.
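A minimal sketch of a fully connected multilayer forward pass (the sigmoid activation and the specific weights are illustrative assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    """Fully connected layer: every output unit sees every input."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(x, layers):
    """Feed the input through each layer in turn (hidden layers, then output)."""
    for W in layers:
        x = layer(x, W)
    return x

# Illustrative 3-2-1 network: a hidden layer of 2 units, then 1 output unit
hidden = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]]  # 2 units x 3 inputs
out    = [[0.7, -0.8]]                          # 1 unit  x 2 hidden activations
print(forward([1.0, 0.5, -1.5], [hidden, out]))
```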
Summary
• Most brains have lots of neurons; each neuron ≈ a linear threshold unit (?)
• Perceptrons (one-layer networks) are insufficiently expressive
• Multi-layer networks are sufficiently expressive; they can be trained by gradient descent, i.e., error back-propagation
• Many applications: speech, driving, handwriting, fraud detection, etc.
• The engineering, cognitive modelling, and neural system modelling subfields have largely diverged
End of Chapter 9