Kane Hill
u4357873
The Brain, Neural Networks and Artificial Intelligence
I. Introduction
The human brain is the most complex system in the known universe. It has evolved over millions of years into the
highly sophisticated information-processing unit and control system observed in modern human beings. Specific
parts of the brain are responsible for specific tasks, such as sensory operations, co-ordination and the control of
basic bodily functions such as breathing. The functioning of the human brain also gives rise to the idea of
consciousness. This is largely due to the fact that humans have memory, which will be a central theme of this
report. The cortex is the folded hemispherical structure that is more prominent in humans than in other species and
gives rise to our uniqueness. It is believed that the cortex came about only 100,000 years ago, a relatively short
time in evolutionary terms. The sections of the cortex control the auditory senses, vision and voluntary movement,
and the cortex is most closely associated with logical reasoning and the use of language.
The central nervous system in humans contains something of the order of 100 billion neurons, and forms an
extremely complex network with a high degree of connectivity. Neurons have 6 main components, which will be
discussed in further detail in the following section of this report. Neurons are the fundamental building blocks of
the neural network of the brain. They are the information processing units of the brain, responsible for receiving
and transmitting information. Each part of the neuron plays a role in communicating information throughout the
body. A neuron comprises dendrites, a nucleus, the soma, the axon hillock, the axon and the terminal buttons. It
was thought for many years that the brain was wired according to a complicated fixed plan, in which the role of
each brain cell and its connections to others were laid down in advance. The truth has only recently been
discovered: the connections between neurons in the brain are largely random.
The neural network of the brain is a self-organising, directed and weighted graph. The network or graph contains
many ‘cycles’, and is dynamic, structurally altering, adaptive, resilient, robust and ‘plastic’. This is due to the fact
that every time a neuron-neuron connection is made, weakened or strengthened, the structure of the neurons
themselves and the structure of the brain are changed. This and the sheer size of the brain underlie the familiar
statement that “we really know very little about the brain.” The brain is referred to as robust because it is still able
to function when connections are severed and/or neurons are destroyed. This is very different to the standard
circuitry found in everyday appliances or in the computer that this report was written on. It is known as plastic
because it can adapt to new memories and functions. This is due to the brain’s ability to form new connections
between neurons, which occur at the synapses and are facilitated by neurotransmitters, which alter the strength of
the signal that can pass between neurons. During childhood and in learning processes these connections form and
alter their strengths. A massive number of connections are made in the early years of a human being’s life, which
is why children consume information so rapidly and are able to learn so quickly. It is no coincidence that
education begins at the start of one’s life; it is often said that mathematicians, physicists and musicians achieve
their best work in their twenties and thirties and then tail off or even decline thereafter.
Many models imitating how biological neural networks behave have been developed following the introduction of
the Hebb rule, which relates the strength of the connection between neurons to how often those neurons fire
together. These include the Hopfield network, a simple model that simulates memory, and the perceptron model for
decision making and learning, in which the network can be trained to achieve a desired output. The first
example of such a use of a brain-inspired artificial neural network was NETtalk in 1987. NETtalk uses a perceptron
model to read English text and convert it into sounds using a voice synthesizer. Artificial neural networks of this
type have 3 layers: the input layer, the hidden layer and the output layer. These kinds of networks can be trained to
make generalizations. For example, if the network is presented with a range of ellipses, each quantified by the
position of its centre, its height, width and angle of tilt, then the network may be trained by adjusting the
connections to the output layer. If the network is then presented with a new ellipse that was not part of the training
regime, it can still identify it as an ellipse. This algorithm has found uses in fields as diverse as digital signal
processing, financial market analysis, biological systems and, of course, physics. This system is very different to
conventional computation and more like the human brain, but it is still essentially a ‘teacher supervised’ learning
process. By this we mean the network is not self taught and requires ‘academic assistance.’ The brain, by contrast,
is a self-organising network and is required to learn its own learning procedures.
Artificial intelligence (AI) has been a popular science concept for many decades, and much ethical,
philosophical and scientific literature, media and discussion has been devoted to the subject. The ethical questions
are set aside here, as ethics lies outside the scope of this course. Human intelligence can broadly be interpreted
as our adaptability and our ability to learn new concepts and form complex societies and relationships. These
aspects of human life may be attributed to consciousness. So for a machine to be truly intelligent under
this definition it must be considered to be conscious. The previously introduced artificial network models require a
‘teacher’, so while useful they are certainly not intelligent. A network that has AI must be able to make complex
decisions by a succession of associative steps without ‘academic assistance’. There are models which attempt to do
just this and succeed in some respects. For example the Kohonen network (a self-organising map, SOM)
works via a competitive process in which one hidden layer node is made to respond progressively more strongly to
an input pattern image while the other hidden nodes minimise their response. As such, a single node can learn to
respond to a certain image. This kind of artificial network is still very basic and yet non-trivial. The brain is an
extraordinarily complex system, and if even a vague realisation of an intelligent computer is ever to be achieved,
then the fields of complex systems and neuroscience have many a gallop to go to reach the AI wall before they can
even think about climbing it.
II. The Structure of the Brain and the Neuron
Fig #1: Basic structure of the brain
The brain can be divided into sections according to function, appearance and locale:
1. The brain stem is the lower extension of the brain that connects to the spinal cord. It is responsible for the
control of automatic functions such as breathing, heart rate, pulse etc.
2. Broca’s Area – The function of this area is the understanding of language, speech, and the control of facial
neurons.
3. The Cerebellum, located at the back of the head and connected to the brain stem, controls walking, balance and
general motor functions.
4. The Cerebrum is the largest part of the brain and is divided into 2 hemispheres which control opposite sides of
the body. This section of the brain is associated with conscious thought, movement and sensation. The corpus
callosum connects the two hemispheres of the cerebrum and allows for communication between the two. The
cerebrum has 4 lobes responsible for different functions:
a) Frontal Lobe – controls attention, behaviour, abstract thinking, problem solving, creative thought, emotion,
intellect, initiative, judgment, coordinated movements, muscle movements, smell, physical reactions, and
personality.
b) Occipital Lobe – located in the back of the head, this part of the brain controls vision.
c) Parietal Lobe – controls tactile sensation, response to internal stimuli, sensory comprehension, some
language, reading, and some visual functions.
d) Temporal Lobe – controls auditory and visual memories, language, some hearing and speech, and
some behaviour.
5. The Motor Cortex helps control movement in various parts of the body.
6. The Sensory Cortex receives information from the spinal cord about the sense of touch, pressure, pain, and the
perception of the position of body parts and their movements.
7. Wernicke’s Area is the part of the temporal lobe that surrounds the auditory cortex and seems to be essential
for understanding and formulating speech. Damage in this area causes problems in understanding spoken
language.
The Neuron is the basic building block of the brain. There are about 100 billion neurons in the human brain, and
they are collectively known as a “neuron forest” because they are arranged in a complicated neural network that
resembles a forest. Unlike trees, the neuron forest contains “cycles”, i.e. many, many neurons are connected in a
largely random fashion. The biological neural network of the brain can be described as a weighted, directed graph
because information flows in only one direction through each neuron.
Fig #2: Schematic diagram of a neuron
Dendrites are treelike structures at the start of a neuron and are covered with synapses. They transmit electrical
signals to the Soma. Most neurons have many dendrites, which are short and highly branched.
The Soma is where the signals from the dendrites are received and sent onward. The soma and nucleus are not
involved however with the active transmission of the signals but simply serve to keep the neuron alive and
functional.
The Axon Hillock is responsible for the firing of the neuron. If the total strength of the incoming signal exceeds
the Axon Hillock potential limit, the neuron fires a signal down the axon. This process can set up a cascading
effect of electrical activity: since each neuron is connected to many others, one firing may trigger many more.
Different kinds of brain activity are caused by different patterns of firing.
The Terminal Buttons (Axon Terminal) are responsible for sending the signal onward to other neurons. At
the end of the button is the synaptic cleft. The strength of the synaptic connection between two neurons
influences the likelihood of the second neuron firing. A strong connection makes firing more likely,
whereas with a weak one it may occur only occasionally. This depends on the types of neurotransmitters
present, as they govern the size of the electrical signal which can pass between neurons. This
process is thought to be important in memory. New memories are formed by adjusting the strengths
of connections and are not stored on individual neurons. The adjustment appears to be governed by the
Hebb Rule: the connection between two neurons will strengthen if, more often than not, the two neurons fire
together.
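The firing threshold and the Hebb rule described above can be sketched in a few lines of Python. This is an illustrative toy rather than a biological model: the threshold of 1.0, the learning rate of 0.1 and the function names `fires` and `hebb_update` are all assumptions introduced for the example.

```python
import numpy as np

def fires(inputs, weights, threshold=1.0):
    """Return True when the weighted input sum exceeds the threshold
    (a stand-in for the axon hillock potential limit)."""
    return float(np.dot(inputs, weights)) > threshold

def hebb_update(weights, pre, post, rate=0.1):
    """Hebb rule: strengthen a connection when the sending (pre) and
    receiving (post) neurons are active together."""
    return weights + rate * post * pre

# Two incoming connections; only the first sending neuron is active.
weights = np.array([0.6, 0.6])
pre = np.array([1.0, 0.0])

before = fires(pre, weights)      # weighted sum 0.6 does not exceed 1.0
for _ in range(5):
    # Assume the receiving neuron is made to fire by other inputs
    # (post = 1), so the active connection is strengthened each time.
    weights = hebb_update(weights, pre, post=1.0)
after = fires(pre, weights)       # the active weight has grown to about 1.1
```

Only the connection whose sending neuron was active is strengthened; the idle connection keeps its original weight, which is the sense in which neurons that fire together become more strongly linked.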
Fig #3: Neurotransmitters transferring from the axon terminal of a neuron into the dendrite of another.
At the axon terminal the electrical signals cause chemicals called neurotransmitters to transfer from the
axon of one neuron into the dendritic spine of another.
Every time this process occurs the structure of the neuron is changed because a new connection has been made.
This means that the structure of the brain is altered with every new connection made. Furthermore, the next time
a signal is directed through this established connection the connection is further strengthened and information
can flow faster through this neuronal route. Much of the learning in adults merely strengthens existing
connections rather than creating new ones, although new connections are still made. Although a lot of useful
connections are made in childhood, a lot of what happens in this period of development is the editing out of
redundant and useless connections. This is perhaps why taking drugs and alcohol during adolescence is so harmful:
not only are brain cells lost and connections severed, but the brain gets confused and fails to eliminate
redundant connections. Nevertheless the brain is extremely resilient to attack. This is due to the fact that we
have approximately 1,000 – 10,000 billion synapses, and if a connection is severed we can adapt by an
automatic rewiring process in which the damaged neuron is circumvented. We can infer from this that the brain has a
high level of betweenness.
The human brain is a powerful thinking machine because of its sheer number of neural nodes – many tens of
billions. However, the processing time of a standard computer is 100,000 times faster than that of a neuron. If
the enormous number of neurons can work simultaneously and effectively in parallel, then it is easy to
see how our brains are far more advanced than current computers. We know that each neuron has around
10,000 connections, which are used for communication with other neurons. The neurons are arranged in an extremely
complex network, and different sectors of the brain are wired in different ways. These portions of the brain also
have specialized networking.
III. Artificial Neural Networks (ANN) And Artificial Intelligence (AI)
ANN are different to traditional computers in that they do not have a CPU that sequentially executes a logical set of
rules. ANN consist of many simple nodes/units (simplified versions of neurons) that are arranged in a complex
communication network to perform parallel processed operations without a program or set of rules to follow. The
computational abilities of the system are related to dynamic node firings with adjustable strengths (weights) between
connections. Such a system is much more closely related to the computation of the human brain.
The type of memory functionality exhibited by humans may be referred to as associative memory storage/recall.
This is because all memories are really strings of memories. For example we remember people by a number of
facets of their appearance, personality etc., e.g. the colour of their eyes or skin, the way they smell after a shower
and so on. A smell or taste can bring on a memory of a particular experience. Since the components of the memory
of a person or a time in life are located in different sensory parts of the brain, i.e. in entirely different regions,
the memory must be distributed throughout the brain in some fashion. So we access a memory by its content
and not its address; our memory can also be called content addressable.
Artificial neural networks mimic the behaviour of nature.
Fig # 4: Typical neural network training regime
Typically neural networks are trained, so that a particular input leads to a specific target output. Such a situation is
depicted above. Here, the network is adjusted, based on a comparison of the output and the target, until the network
output matches the target. The performance of the system is increased when greater training is provided to the
network. This means that the network is more knowledgeable but not more intelligent.
Neural networks have been trained to perform complex functions in various fields, including pattern recognition,
identification, classification, speech, vision, and control systems. Today neural networks can be trained to solve
problems that are difficult for conventional computers or human beings. These algorithms and/or hardware systems
are now used for control systems, simulations and diagnostics in a wide variety of fields: engineering, finance, and
industry. A few interesting examples include aircraft control systems, weapon steering, target tracking,
object discrimination, facial recognition, radar and image signal processing (including data compression, feature
extraction and noise suppression), signal/image identification, voice synthesis, nonlinear modelling in electronics,
breast cancer cell analysis, forklift robots and securities market analysis.
Neural networks have been applied successfully to applications such as those listed above. We may now
investigate the foundations of some historical neural networks and their uses.
The Hopfield Network
The Hopfield network is based on a very simplified model of a neuron in which it can exist in one of two states, that
is ‘firing’ and ‘not firing’. The Hopfield network is able to store certain memories or patterns in a way similar to
the brain, in that a whole image may be retrieved when only partial information about the pattern is provided. Not
only this, but the Hopfield network is resilient: if a few connections are severed the network can still reproduce the
image with reasonable accuracy. Every node is connected to every other node. The network stores a specific set of
equilibrium points such that, when an initial condition is provided, the network eventually comes to rest at one of
these design points. The network is recursive in that the output is fed back as the input once the network is in
operation. This means that we can provide the network with a series of initial conditions and the trajectories will
eventually go to the closest equilibrium point. The firing of a single node sets up a cascade of different firing
patterns, which stabilises such that the network tends to a fixed pattern. It is possible to ensure that the stable
firing pattern that is set up corresponds to the right memories being stored. The handy thing about the Hopfield
network is that any pattern of firing activity can be made “stable”; that is, we can burn any memory into the network
at the start of the process. So if the network is started with random nodal activity a memory will still be recalled,
because the system always finds the closest equilibrium point, i.e. we can give the network a corrupted image and it
will try to reconstruct the image on its own. However there is a threshold: if the image is too badly corrupted the
network may get it wrong, just like the brain. An example could be recalling a friend’s or associate’s telephone
number.
Although we know how many nodes are required for a given number of memories to be stored in the Hopfield
network, the same information is not available for most networks. The Hopfield network can be arranged to perform
the operations of memory storage and recall; however, a smart “teacher” is still required to train the network what to
remember. In other words the weights between nodes must be carefully chosen.
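One common way of choosing the weights is the Hebbian outer-product rule, and the sketch below assumes that rule together with one-node-at-a-time (asynchronous) updates. The two 16-node patterns and the function names `train_hopfield` and `recall` are made up for illustration; the patterns are deliberately orthogonal so that recall is clean.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian (outer-product) weights for bipolar (+1/-1) patterns."""
    n = patterns.shape[1]
    w = patterns.T @ patterns / n
    np.fill_diagonal(w, 0.0)      # no node is connected to itself
    return w

def recall(w, state, sweeps=5):
    """Asynchronous updates: each node in turn takes the sign of its input."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(len(state)):
            state[i] = 1 if w[i] @ state >= 0 else -1
    return state

# Two orthogonal 16-node patterns serve as the stored "memories".
p1 = np.array([1] * 8 + [-1] * 8)
p2 = np.array([1, -1] * 8)
w = train_hopfield(np.array([p1, p2]))

# Corrupt the first memory by flipping two of its nodes, then recall it.
corrupted = p1.copy()
corrupted[[0, 9]] *= -1
restored = recall(w, corrupted)
```

Starting from the corrupted pattern, the network settles back onto the first stored memory, and a stored memory presented unchanged stays where it is. With more flipped nodes or more stored patterns the recall can fail, which is the threshold behaviour described earlier.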
The perceptron model
Fig #5: Perceptron model algorithm
The perceptron has the ability to generalize from its training and to learn from initially randomly distributed
connections. Perceptrons are especially suited to simple problems in pattern classification. They are fast and
reliable networks for the problems they can solve. The perceptron is a supervised learning network. The nodes in
this model are connected in layers: the input layer, the hidden layer and the output layer. A firing pattern is fed
into the first layer, which conveys the pattern on to the hidden layer; it is then fed to the output layer, and the
pattern of firing at the output defines the response of the network to the input pattern. The output pattern does not
affect the hidden layer; this is a directional system. The response to the input firing pattern is automatically stable
and almost immediate, unlike that of the Hopfield network. This system can be used for pattern recognition because
the network can be trained to produce a desired response. This is done by comparing the output with the target
output and adjusting the weights of internodal connections in the hidden layer. For example, if the input pattern is a
circle, it can be arranged so that the output is also a circle. If a triangle and a circle are presented as the input to
the network, it can be arranged to still respond with a circle. If the network is trained with a variety of shapes and
set to respond only to circles, and a pattern void of a circle is presented, then the network can be set to respond
with a zero. The perceptron can therefore classify and recognise a pattern, because it can place patterns into
categories with or without circles. It can also be arranged so that when a circle and/or triangle is observed the
network responds with a square. In this way the perceptron can perform association tasks. NETtalk, made in 1987,
was the first popular example of a perceptron network; it used a voice synthesizer to vocalise English text. The
perceptron can also be made to add integers and act as an encoder; these are examples of the network’s ability to
make generalizations.
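The idea of comparing the output with the target and adjusting the weights can be sketched with the classic perceptron learning rule. Note the simplification: this is a single-layer perceptron with no hidden layer, so it only handles linearly separable problems; the layered networks described above are usually trained with backpropagation instead. The toy task and the names `train_perceptron` and `predict` are assumptions for illustration.

```python
import numpy as np

def train_perceptron(inputs, targets, rate=1.0, max_epochs=100):
    """Perceptron learning rule: nudge weights by (target - output) * input."""
    weights = np.zeros(inputs.shape[1])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(inputs, targets):
            output = 1 if weights @ x + bias > 0 else 0
            delta = t - output
            if delta != 0:
                weights += rate * delta * x
                bias += rate * delta
                errors += 1
        if errors == 0:           # every training pattern is classified correctly
            break
    return weights, bias

def predict(weights, bias, inputs):
    """Network response (0 or 1) for each row of inputs."""
    return (inputs @ weights + bias > 0).astype(int)

# Toy task (an assumption for illustration): respond 1 only when both inputs are on.
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 0, 0, 1])
weights, bias = train_perceptron(inputs, targets)
outputs = predict(weights, bias, inputs)
```

The ‘teacher’ appears here as the `targets` array: the weights only change when the output disagrees with the target, and training stops once a full pass produces no disagreements.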
There are many other artificial neural network models and algorithms. These include the backward propagation
algorithm, adaptive linear, competitive, feed forward, radial basis, generalised regression and self organising
networks. Some of these networks have been simulated to perform certain tasks, and they will be discussed in the
following section.
Self-organizing networks are one of the most interesting topics in the neural network field. These networks
can learn to detect regularities and correlations in their input and adapt their future responses to that input
accordingly. The neurons of competitive networks learn to recognize groups of similar input vectors. Self-organizing
maps (SOM) learn to recognize groups of similar input vectors in such a way that neurons physically near each
other in the neuron layer respond to similar input vectors. These networks can be used for data extrapolation in
multidimensional data sets.
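A minimal sketch of how such a map can be trained is given below, assuming a one-dimensional line of nodes, a Gaussian neighbourhood and a linearly decaying learning rate. These choices, the parameter values and the name `train_som` are illustrative assumptions, not a description of any particular toolbox.

```python
import numpy as np

def train_som(data, n_nodes=4, epochs=50, rate=0.5, radius=1.0, seed=0):
    """Train a 1-D self-organizing map (a line of n_nodes nodes)."""
    rng = np.random.default_rng(seed)
    # Start each node at a randomly chosen data point.
    weights = data[rng.choice(len(data), n_nodes, replace=False)].copy()
    positions = np.arange(n_nodes)        # node coordinates along the map
    for epoch in range(epochs):
        # Shrink the learning rate and the neighbourhood as training proceeds.
        decay = 1.0 - epoch / epochs
        width = radius * decay + 1e-3
        for x in data[rng.permutation(len(data))]:
            # Competition: the node closest to the input wins.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Nodes near the winner on the map move most towards the input.
            influence = np.exp(-((positions - winner) ** 2) / (2 * width ** 2))
            weights += (rate * decay * influence)[:, None] * (x - weights)
    return weights

# Toy data: two well-separated clouds of 2-D points.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.05, (20, 2)),
                  rng.normal(1.0, 0.05, (20, 2))])
weights = train_som(data)
```

No targets are supplied anywhere: the nodes organise themselves around the regularities in the input, and because neighbours of the winner are dragged along with it, nodes that sit next to each other on the map end up responding to similar inputs.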
If we are ever going to be able to build some kind of intelligent machine that exhibits even some facets of the human
brain, it is clear that the fields of neuroscience, complex systems and applied mathematics must converge to produce
some truly inspiring analysis, design and system implementation. For a machine to display some of the aspects of
the brain that make us ‘smart’, such as creativity, independent decision making etc., we require a lot more
knowledge of how the networking of the brain is set up.
IV. Simulations of Neural Networks
1. The (2 neuron) Hopfield Network
In order to obtain a Hopfield network with 2 stable points we can define two target vectors, namely
T = [+1 -1; -1 +1];
We can graphically represent these two vectors and thereby define all the possible states of the 2-neuron Hopfield
network as lying within the graph's domain.
Fig: Plot of stable points
Now if we create the Hopfield network given the target values T, we can check that these points are
stable, as the network should return these values unchanged. This is the case. Now if we specify a random
starting point and simulate the network’s response, we find that the network settles at the closest stable point.
Fig: Trajectory plot for one supplied initial condition to the Hopfield network
This process can be further verified with 100 randomly generated initial conditions, all of which
return to the stable points of the system.
Fig: Trajectory plots for 100 randomly generated initial conditions to the Hopfield network.
This network mimics how memory is thought to work in the human brain, where the stable equilibrium
points could be ‘memories’ and the trajectories neuron firing patterns. We could also think of this as a
mechanical system that loses energy as it moves and so settles into the nearest low-energy
state.
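The experiment can be reproduced in outline with the Python sketch below. It is not the procedure used to produce the figures: it assumes Hebbian outer-product weights with self-connections kept, and a simple discrete iteration in which each state is the previous one multiplied by the weights and clipped to the range [-1, 1]. The function name `settle` and the random seed are illustrative.

```python
import numpy as np

# Columns of T are the two target vectors from the text.
T = np.array([[+1, -1],
              [-1, +1]], dtype=float)

# Assumed storage rule: Hebbian outer products (self-connections kept).
W = T @ T.T / T.shape[0]

def settle(start, steps=60):
    """Iterate a <- clip(W a, -1, 1) and return the visited states (the trajectory)."""
    a = np.asarray(start, dtype=float)
    path = [a]
    for _ in range(steps):
        a = np.clip(W @ a, -1.0, 1.0)
        path.append(a)
    return np.array(path)

# The target vectors themselves should be returned unchanged.
stable = np.allclose(np.clip(W @ T, -1.0, 1.0), T)

# 100 random starting points in the square [-1, 1] x [-1, 1].
rng = np.random.default_rng(0)
starts = rng.uniform(-1.0, 1.0, size=(100, 2))
finals = np.array([settle(s)[-1] for s in starts])
```

Under these assumptions every starting point ends at whichever of the two targets lies closer to it: points with a larger first coordinate go to [+1, -1] and the rest to [-1, +1]. Plotting each array returned by `settle` would give trajectory plots of the kind referred to in the captions.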
2. Feed-forward Networks and Character Recognition
Suppose we would like to design a network that recognizes the letters of the alphabet. An imaging system that
digitizes each letter centered in the system's field of vision is available. The imaging system is imperfect and suffers
from noise. A two-layer feed-forward network is created and trained with perfect and noisy characters. Training is
conducted using backpropagation with both an adaptive learning rate and momentum.
The reliability of the network pattern recognition system may be measured by testing the network with hundreds of
input vectors and varied quantities of noise. Noise with a mean of 0 and a standard deviation from 0 to 0.5 is added
to input vectors.
The number of erroneous classifications is then added and percentages are obtained.
The solid line on the graph shows the reliability of the network trained with and without noise. The reliability of the
same network when it was trained only without noise is shown with a dashed line. As one would expect, the network
that was trained with and without noise makes fewer errors when tested with noisy characters. This is due to the fact
that the network has more ‘experience’. The system can be improved with more training, just as a rally driver
becomes more accustomed to driving on different dynamic roads. By this we mean roads that
To test the system, a letter is created with noise and presented to the network. The result is that the network can still
recognize the given noisy image as the specified letter.
The following noisy J (left) can be interpreted as the perfect J specified from the character set (right)
3. Neural Networks for Control Systems in Engineering
Let’s say our objective is to control the movement of a simple, single-link robot arm, as shown in the following
figure:
The equation of motion for the arm is [equation not recoverable from the source], in terms of the angle of the
arm and u, the torque supplied by the DC motor.
The objective is to train the controller so that the arm tracks the reference model. The model comprises two
neural networks: the plant and the reference controller.
We use a 5-13-1 neural network controller architecture to model this system. Firstly we need to specify
the standard parameters; delay, hidden layer neurons, max and min values etc. The next step is to train
the network and then generate data for training the controller. The controller is trained with dynamic
backpropagation.
Now that the controller has been successfully trained we can generate a graph of reference and plant
output to analyze the agreement of the two.
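To give a feel for the plant that the controller has to steer, the sketch below simulates a single-link arm in Python. The equation of motion is not reproduced in the text here, so the code assumes a damped-pendulum form, phi'' = -10 sin(phi) - 2 phi' + u; the coefficients 10 and 2, the step size and the name `simulate_arm` are assumptions for illustration only. No neural network is involved; the torque is supplied by a plain function.

```python
import math

def simulate_arm(torque, phi0=0.5, omega0=0.0, dt=0.001, t_end=10.0):
    """Integrate phi'' = -10*sin(phi) - 2*phi' + u with semi-implicit Euler steps.

    The coefficients 10 and 2 are assumed for illustration.
    `torque` maps (time, angle, angular velocity) to the motor torque u.
    Returns the list of arm angles, one per time step.
    """
    phi, omega = phi0, omega0
    history = [phi]
    for step in range(int(round(t_end / dt))):
        t = step * dt
        u = torque(t, phi, omega)
        accel = -10.0 * math.sin(phi) - 2.0 * omega + u
        omega += accel * dt       # update the velocity first...
        phi += omega * dt         # ...then the angle, using the new velocity
        history.append(phi)
    return history

# With no torque the arm swings back and settles at phi = 0.
free = simulate_arm(lambda t, phi, omega: 0.0)

# A constant torque u = 10*sin(0.3) balances gravity at phi = 0.3.
held = simulate_arm(lambda t, phi, omega: 10.0 * math.sin(0.3))
```

A trained controller would take the place of the `torque` function, choosing u at each step so that the recorded angles follow the reference model's output.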
4. Self Organizing Networks
Determining structures in vast multidimensional data sets is difficult and time-consuming. Interesting, novel
relations between the data items may be hidden in the data. The self-organizing map (SOM) algorithm can be used
to aid the exploration. Maps illustrating the structures of the standard of living in the world and the organisation of
full-text document collections have already been drawn and analysed.
The SOM network algorithm
Let’s say we would like to analyse gene expression in baker's yeast.
Our aim is to gain some understanding of gene expression in Saccharomyces cerevisiae, or baker's yeast. It is the
fungus that is used to bake bread and ferment wine from grapes.
Saccharomyces cerevisiae, when introduced into a medium rich in glucose, can convert glucose to ethanol. At the
start, yeast converts glucose to ethanol by a metabolic process called "fermentation". However, once the supply of
glucose is exhausted, yeast shifts from anaerobic fermentation of glucose to aerobic respiration of ethanol. This
process is called the diauxic shift. It is of considerable interest since it is accompanied by major changes in gene
expression.
This simulation uses DNA microarray data to study temporal gene expression of almost all genes in Saccharomyces
cerevisiae during the diauxic shift. This simulation uses data from DeRisi, JL, Iyer, VR, Brown, PO. "Exploring the
metabolic and genetic control of gene expression on a genomic scale." Science. 1997 Oct 24;278(5338):680-6.
PMID: 9381177
The full data set can be downloaded from the Gene Expression Omnibus website,
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28 .
Gene expression levels were measured at seven time points during the diauxic shift. The variable times contains the
times at which the expression levels were measured in the experiment.
Filtering the Genes
The data set of 6400 genes is substantial, and a lot of the information corresponds to genes that do not show any
interesting changes during the experiment. To make it easier to find the interesting genes, the first thing to do is to
reduce the size of the data set by removing genes with expression profiles that do not show anything of interest.
Genes for which data was not collected at a certain time step can also be omitted. We can further reduce the data
set by isolating profiles that correspond to large changes in expression due to the diauxic shift.
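The filtering steps can be sketched as follows. The cut-off used here (a minimum spread between the highest and lowest expression value of a gene) is an assumed criterion chosen for illustration, as are the toy data and the name `filter_genes`; the report does not state the thresholds actually used.

```python
import numpy as np

def filter_genes(expr, min_range=1.0):
    """Keep rows (genes) with complete data and a large change in expression.

    expr: array of shape (genes, time points) of expression values.
    min_range: assumed cut-off on (max - min) across the time points.
    Returns the filtered array and a boolean mask over the original rows.
    """
    complete = ~np.isnan(expr).any(axis=1)        # no missing time points
    spread = np.zeros(len(expr))
    spread[complete] = (expr[complete].max(axis=1)
                        - expr[complete].min(axis=1))
    keep = complete & (spread >= min_range)
    return expr[keep], keep

# Toy data standing in for the microarray measurements (4 genes, 7 time points).
expr = np.array([
    [0.0, 0.1, 0.0, -0.1, 0.0, 0.1, 0.0],       # flat profile: dropped
    [0.0, 0.2, 0.5, 1.0, 1.5, 2.0, 2.5],        # strong increase: kept
    [0.1, np.nan, 0.3, 0.9, 1.2, 1.8, 2.0],     # missing value: dropped
    [0.0, -0.3, -0.8, -1.2, -1.6, -2.0, -2.2],  # strong decrease: kept
])
filtered, keep = filter_genes(expr)
```

Returning the mask alongside the filtered array makes it possible to carry the same selection over to the list of gene names.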
Principal Component Analysis
Now that we have a manageable list of genes, we may analyse the relationships between the profiles.
Principal-component analysis (PCA) is a useful technique that can be used to reduce the dimensionality of large data
sets. We can use this technique to reduce the data set further by eliminating components that contribute, say, less
than 15% of the variance in the data set. We must first normalize the input vectors so that they have zero mean and
unit variance. To visualize the principal components of the data set we use the scatter function.
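A minimal sketch of the normalisation and projection, using a singular value decomposition in NumPy, is shown below. It assumes that each column (time point) is the variable to be standardised, and the random toy data and the name `pca` are placeholders for the filtered expression profiles.

```python
import numpy as np

def pca(data, n_components=2):
    """Standardise the columns, then project onto the leading principal components."""
    # Zero mean and unit variance for every variable (column).
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    # Rows of vt are the principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)   # fraction of variance per component
    scores = z @ vt[:n_components].T      # coordinates for a scatter plot
    return scores, explained

# Toy stand-in for the filtered profiles: 50 genes x 7 time points,
# built so that most of the variation lies along one trend.
rng = np.random.default_rng(0)
trend = np.linspace(-1.0, 1.0, 7)
data = rng.normal(size=(50, 1)) * trend + 0.1 * rng.normal(size=(50, 7))
scores, explained = pca(data)
```

The `explained` array is what a variance cut-off such as 15% would be applied to, and the two columns of `scores` are the values one would pass to a scatter plot.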
Cluster Analysis: Self-Organizing maps
The principal components can now be clustered using a Self-Organizing map (SOM) clustering algorithm. The
result is as follows.
We may assign clusters using the SOM by finding the nearest node to each point in the data set.
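The nearest-node assignment is a short calculation once the node positions are known. In the sketch below the node coordinates and data points are hypothetical values in the plane of the first two principal components, and `assign_clusters` is an illustrative name.

```python
import numpy as np

def assign_clusters(points, nodes):
    """Label each point with the index of its nearest SOM node (Euclidean distance)."""
    diffs = points[:, None, :] - nodes[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

# Hypothetical node positions and data points.
nodes = np.array([[-2.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
points = np.array([[-1.8, 0.3], [0.2, -0.1], [1.5, 0.4], [-0.4, 0.2]])
labels = assign_clusters(points, nodes)   # -> array([0, 1, 2, 1])
```

Each label is the index of a map node, so genes that share a label have expression profiles that the map treats as similar.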
V. Conclusion
The human brain is a unique system, and the most intriguing and complex known to man. It is a directed, dynamic,
self-organising network characterised by a complex structure that is continually recalling, comparing, processing,
discriminating and analysing information with massive parallelism. This fact is at the ‘heart’ of the power of the
mind as a thinking machine. We may very well never truly understand exactly how our brains work, what thought
actually is or how we may exploit such information to construct an intelligent computer, but the analysis of neural
networks has already led to some amazing discoveries and found realistic applications in wide-ranging fields. From
the simple model of Hopfield to all the many current additions and counterparts, artificial neural networks have come
a long way already and are now capable of solving highly nonlinear and complex problems that conventional
computation, or even human computation, cannot. However, it is clear that further scrutiny and understanding of the
human brain will lead to new models being derived and subsequent applications being realised that surely will serve
as useful tools for problem solving in the modern world.
Note: I have become very interested in neural networks and their abundant applications, specifically self-organising
networks. If I had more time I would have liked to conduct more data analysis of real systems and perhaps even
apply these networks to an original problem.
References:
[1] The Quantum Brain - Jeffrey Satinover: John Wiley and Sons Inc. (2001)
[2] Mathworks online documentation – www.mathworks.com/support
[3] http://www.codeproject.com/useritems/nxml.asp
[4] http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html