Lecture 2-3-4
ASSOCIATIONS, RULES, AND MACHINES
CONCEPT OF AN E-MACHINE:
simulating symbolic read/write memory by changing
dynamical attributes of data in a long-term memory
Victor Eliashberg
Consulting professor, Stanford University,
Department of Electrical Engineering
Slide 1
SCIENTIFIC / ENGINEERING APPROACH
“When you have eliminated the impossible, whatever remains, however
improbable, must be the truth.” (Sherlock Holmes)
[Diagram: the external world W and the sensorimotor devices D form the external system (W,D); the computing system B, simulating the work of the human nervous system, together with D forms the human-like robot (D,B).]
Slide 2
ZERO-APPROXIMATION MODEL
[Diagram of the zero-approximation model; the only recoverable labels are the states s(ν) and s(ν+1).]
Slide 3
BIOLOGICAL INTERPRETATION
[Diagram: system AM is interpreted as motor control; system AS is interpreted as working memory, episodic memory, and mental imagery.]
Slide 4
PROBLEM 1: LEARNING TO SIMULATE the Teacher
This problem is simple: system AM needs to learn a manageable number of
fixed rules.
[Diagram: system AM, with input channels X11 and X12 (components of X), the Teacher, and network NM. The input to AM is (symbol read, current state of mind); the output y is (move, type symbol, next state of mind); a selector sel (0/1) switches y between the Teacher's response and the network's own response NM.y.]
Slide 5
PROBLEM 2: LEARNING TO SIMULATE EXTERNAL SYSTEM
This problem is hard: the number of fixed rules needed to represent a RAM with
n locations explodes exponentially with n.
[Diagram: system NS with ports 1 and 2 and output y, coupled to the external system.]
NOTE. System (W,D) shown in slide 3 has the properties of a
random access memory (RAM).
Slide 6
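To make the combinatorial claim of slide 6 concrete: a RAM with n locations over a d-symbol alphabet can hold d^n distinct memory contents, so a fixed-rule (state-machine) description of its behavior grows exponentially with n. A minimal Python sketch; the alphabet size d = 3 (e.g. {a, b, ε}) and the values of n are illustrative assumptions, not values from the slides:

    # Count the internal states of a RAM with n locations over a d-symbol alphabet.
    d = 3  # assumed alphabet size, e.g. {a, b, ε}
    for n in (4, 8, 16, 32):
        print(f"n = {n:2d} locations -> d**n = {d**n} possible memory states")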
Programmable logic array (PLA): a logic implementation
of a local associative memory (solves problem 1 from slide 5)
Slide 7
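A minimal sketch of the PLA idea named on slide 7, assuming the associative memory is stored as product terms: an AND plane matches the input against each stored term (None marks a don't-care position) and an OR plane combines the outputs of the terms that fire. The term and output patterns below are illustrative assumptions:

    def pla(and_plane, or_plane, x):
        """Programmable logic array: the AND plane selects matching product terms,
        the OR plane forms each output from the terms that drive it."""
        selected = [all(xi == ti for xi, ti in zip(x, term) if ti is not None)
                    for term in and_plane]
        return [any(sel and out for sel, out in zip(selected, outs))
                for outs in zip(*or_plane)]

    # Two product terms over 3 inputs, 2 outputs: term 0 fires on x = (1, 0, *), term 1 on (0, 1, 1).
    and_plane = [(1, 0, None), (0, 1, 1)]
    or_plane  = [(1, 0), (0, 1)]            # term i drives output k if or_plane[i][k] == 1
    print(pla(and_plane, or_plane, (1, 0, 1)))   # -> [True, False]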
BASIC CONCEPTS FROM THE AREA
OF ARTIFICIAL NEURAL NETWORKS
Slide 8
Typical neuron
A neuron is a very specialized cell. There are several types of neurons, with different shapes and different types of membrane proteins. The biological neuron is a complex functional unit. However, it is helpful to start with a simple artificial neuron (next slide).
Slide 9
Neuron as a first-order linear threshold element

Inputs: xk ∈ R'; parameters: synaptic gains g1,… gm; postsynaptic potential u ∈ R'; output: y ∈ R'. R' is the set of non-negative real numbers.

Equations:

τ du/dt + u = Σ gkxk   (sum over k = 1,…m)     (1)

y = L(u)                                        (2)

L(u) = u if u > 0; 0 otherwise                  (3)
A more convenient notation

xk is the k-th component of the input vector
gk is the gain (weight) of the k-th synapse
s = Σ gkxk  (sum over k = 1,…m)  is the total postsynaptic current
u is the postsynaptic potential
y is the neuron output
τ is the time constant of the neuron
Slide 10
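A minimal Python sketch of the neuron defined by equations (1)-(3) on slide 10, discretized with a simple forward Euler step; the step size, gains, and input trace are illustrative assumptions, not values from the slides:

    import numpy as np

    def simulate_neuron(x, g, tau=10.0, dt=1.0):
        """Integrate tau*du/dt + u = sum_k g_k*x_k and return y = L(u) over time.

        x : array of shape (T, m), input vectors over time
        g : array of shape (m,), synaptic gains g_1..g_m
        """
        u = 0.0
        y = np.empty(len(x))
        for t, xt in enumerate(x):
            s = float(np.dot(g, xt))          # total postsynaptic current
            u += (dt / tau) * (s - u)         # leaky integration of u
            y[t] = max(u, 0.0)                # y = L(u), the threshold nonlinearity
        return y

    # Example: two inputs, constant stimulus; y relaxes toward s = g·x.
    g = np.array([0.5, 1.5])
    x = np.tile([1.0, 0.2], (100, 1))
    print(simulate_neuron(x, g)[-1])          # approaches 0.5*1.0 + 1.5*0.2 = 0.8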
Input synaptic matrix, input long-term memory (ILTM) and DECODING (computing similarity)

si = Σ gxik·xk   (sum over k = 1,…m),  i = 1,…n     (1)

An abstract representation of (1):

fdec: X × Gx → S     (2)

Notation:
x = (x1, .. xm) are the signals from input neurons (not shown)
gx = (gxik), i = 1,…n, k = 1,…m, is the matrix of synaptic gains; we postulate that this matrix represents input long-term memory (ILTM)
s = (s1, .. sn) is the similarity function
Slide 11
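A minimal sketch of the DECODING step, equation (1) on slide 11: the similarity vector s is just the ILTM matrix gx applied to the input vector x. The example values of gx and x are illustrative assumptions:

    import numpy as np

    def decode(gx, x):
        """DECODING: compute the similarity s_i = sum_k gx[i,k] * x[k] for each ILTM row."""
        return gx @ x

    # Example: n = 3 stored rows, m = 4 input components.
    gx = np.array([[1., 0., 0., 1.],
                   [0., 1., 1., 0.],
                   [1., 1., 0., 0.]])
    x = np.array([1., 1., 0., 0.])
    print(decode(gx, x))   # -> [1. 1. 2.]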
Layer with inhibitory connections as the mechanism of the winner-take-all (WTA) choice
[Diagram: each neuron i receives its input si through an excitatory synapse with gain α; the neurons, with postsynaptic potentials u1,…un and time constant τ, are coupled through an inhibitory signal xinh (synaptic gains β) and produce outputs d1,…dn. The differential equations (1)-(3) of the layer did not survive extraction.]

Note. Small white and black circles represent excitatory and inhibitory synapses, respectively.

Procedural representation:

iwin := RANDOM CHOICE from { i : si = max(s1,…sn) > 0 }     (4)
if (i == iwin) di=1; else di=0;                              (5)

where RANDOM CHOICE denotes a random, equally probable choice.
Slide 12
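A minimal sketch of the procedural WTA representation, equations (4)-(5) on slide 12: the winner iwin is drawn, with equal probability, from the indices where si reaches the positive maximum (the zero-based indices are a Python convention, not the slides'):

    import random

    def wta_choice(s):
        """Winner-take-all: return (iwin, d) per equations (4)-(5).

        iwin is drawn uniformly from { i : s[i] == max(s) > 0 };
        d[i] = 1 for the winner and 0 otherwise. Returns (None, zeros) if max(s) <= 0.
        """
        m = max(s)
        if m <= 0:
            return None, [0] * len(s)
        candidates = [i for i, si in enumerate(s) if si == m]
        iwin = random.choice(candidates)
        return iwin, [1 if i == iwin else 0 for i in range(len(s))]

    print(wta_choice([1.0, 2.0, 2.0, 0.5]))   # iwin is 1 or 2, chosen at random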
Output synaptic matrix, output long-term memory (OLTM) and ENCODING (data retrieval)

yk = Σ gyki·di   (sum over i = 1,…n),  k = 1,…p     (1)

An abstract representation of (1):

fenc: D × Gy → Y     (2)

NOTATION:
d = (d1, .. dn) are the signals from the WTA layer (see previous slide)
gy = (gyki), k = 1,…p, i = 1,…n, is the matrix of synaptic gains; we postulate that this matrix represents output long-term memory (OLTM)
y = (y1, .. yp) is the output vector
Slide 13
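A minimal sketch of the ENCODING step, equation (1) on slide 13: with a one-hot vector d from the WTA layer, y = gy·d simply reads out column iwin of the OLTM. The example values are illustrative assumptions:

    import numpy as np

    def encode(gy, d):
        """ENCODING: y_k = sum_i gy[k,i] * d[i]; with one-hot d this reads out one OLTM column."""
        return gy @ np.asarray(d, dtype=float)

    # Example: p = 2 output components, n = 3 stored columns; d selects column 1.
    gy = np.array([[0.0, 1.0, 0.5],
                   [1.0, 0.0, 0.5]])
    d = [0, 1, 0]
    print(encode(gy, d))   # -> [1. 0.], i.e. column 1 of gy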
A neural implementation of a local associative memory
(solves problem 1 from slide 5) (WTA.EXE)
[Diagram: addressing by content. The input passes through the input long-term memory (ILTM) in the DECODING stage, then through the RANDOM CHOICE (WTA) layer, and finally through the output long-term memory (OLTM) in the ENCODING stage, which performs retrieval. Labels S21(i,j) and N1(j) mark the corresponding synaptic matrix and neurons.]
Slide 14
A functional model of the previous network [7],[8],[11]
(WTA.EXE)
[Equations (1)-(5) of the functional model did not survive extraction.]
Slide 15
HOW CAN WE SOLVE THE HARD
PROBLEM 2 from slide 6?
Slide 16
External system as a generalized RAM
Slide 17
Concept of a generalized RAM (GRAM)
Slide 18
Slide 19
Representation of local associative memory in terms of three
“one-step” procedures: DECODING, CHOICE, ENCODING
Slide 20
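A minimal sketch of one read cycle of a local associative memory as the three "one-step" procedures named on slide 20, reusing the logic of slides 11-13; the stored ILTM/OLTM contents below are illustrative assumptions:

    import random
    import numpy as np

    def associative_recall(gx, gy, x):
        """One read cycle: DECODING -> (random) CHOICE -> ENCODING."""
        s = gx @ x                                   # DECODING: similarity of x to each ILTM row
        m = s.max()
        if m <= 0:
            return None                              # nothing sufficiently similar is stored
        iwin = random.choice(np.flatnonzero(s == m).tolist())   # CHOICE: WTA with random tie-breaking
        return gy[:, iwin]                           # ENCODING: read out the OLTM column of the winner

    # Example: three stored (key, value) pairs over 4-component keys and 2-component values.
    gx = np.array([[1., 0., 0., 1.],
                   [0., 1., 1., 0.],
                   [1., 1., 0., 0.]])
    gy = np.array([[0., 1., 0.],
                   [1., 0., 1.]])
    print(associative_recall(gx, gy, np.array([0., 1., 1., 0.])))   # -> [1. 0.] (row 1 wins)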
INTERPRETATION PROCEDURE
Slide 21
At the stage of training, sel=1; at the stage of examination sel=0.
System AS simply “tape-records” its experience, (x1,x2,xy)(0:ν).
[Diagram: system NS with ports 1 and 2 and output y, coupled to a GRAM.]
NOTE. System (W,D) shown in slide 3 has the properties of a
random access memory (RAM).
Slide 22
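A minimal sketch of the "tape-recording" idea from slide 22: during training (sel = 1) the experience at each step ν is appended to an ever-growing record and never modified. Representing a step as a (x1, x2, y) tuple, and the demo values, are assumptions for illustration:

    def record_experience(stream):
        """Complete-memory-machine style recording: store everything seen during training.

        stream yields tuples (sel, x1, x2, y); records are kept only while sel == 1 (training).
        """
        tape = []
        for sel, x1, x2, y in stream:
            if sel == 1:
                tape.append((x1, x2, y))   # the experience at step nu is appended, never edited
        return tape

    demo = [(1, 'addr1', 'a', 'ok'), (1, 'addr2', 'b', 'ok'), (0, 'addr1', '?', '?')]
    print(record_experience(demo))          # the examination step (sel = 0) is not recorded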
EXPERIMENT 1: Fixed rules and variable rules
Slide 23
EXPERIMENT 1 (continued 1)
Slide 24
EXPERIMENT 1 (continued 2)
Slide 25
A COMPLETE MEMORY MACHINE (CMM) SOLVES PROBLEM 2,
but this solution can be easily falsified!
Slide 26
GRAM as a state machine: combinatorial explosion of the
number of fixed rules
Slide 27
Concept of a primitive E-machine
Slide 28
[The equations of the primitive E-machine on this slide did not survive extraction; the surviving fragments ("(α < .5)", "s(i) >", "c") match the update rules for e(i) and se(i) shown on slide 32.]
Slide 29
Effect of a RAM without a RAM buffer

[Figure: several four-location arrays (locations 1-4) holding the symbols a, b, c in different configurations, illustrating the distinction between the G-state and the E-state of the same locations.]
Slide 30
EFFECT OF “MANY MACHINES IN ONE”
n = 8 locations of LTM, i = 1,…8:

  i:     1 2 3 4 5 6 7 8
  X(1):  0 0 0 0 1 1 1 1
  X(2):  0 0 1 1 0 0 1 1
  y(1):  0 1 0 1 0 1 0 1

The table (the G-state) enumerates every combination of the inputs X(1), X(2) and the output y(1). Depending on the E-state, the same eight locations can act as AND, OR, XOR, NAND, NOR, or any other two-input, one-output Boolean function.

A table with n = 2^(m+1) locations represents N = 2^(2^m) different m-input, 1-output Boolean functions. Let m = 10. Then n = 2048 and N = 2^1024.
Slide 31
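A quick brute-force check of the counts above for small m (m = 3 is an illustrative choice; m = 10 is far too large to enumerate):

    from itertools import product

    def boolean_functions(m):
        """Yield every m-input, 1-output Boolean function as a tuple of 2**m output bits."""
        return product((0, 1), repeat=2 ** m)

    for m in (1, 2, 3):
        count = sum(1 for _ in boolean_functions(m))
        print(f"m={m}: N={count} functions (2**(2**m) = {2 ** (2 ** m)}), table size n = {2 ** (m + 1)}")
    # For m = 10 the table would have n = 2048 locations and N = 2**1024 functions.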
Simulation of GRAM with A={1,2}, and D={a,b,ε}
[Table: the state of LTM, with rows addr, din, dout over numbered locations, together with the values of s(i), e(i), and se(i) at ν = 5. The extraction is misaligned here (addr: 1 1 2 2 1 2, din: a b b a, dout: a b b a b a); the LTM table is reproduced more cleanly on slide 33.]

s(i) is the number of matches in the first two rows. Input (addr,din) = (1,ε) produces s(i) = 1 for i = 1 and i = 2.

if ( s(i) > e(i) )  e(i)(ν+1) = s(i)(ν);
else                e(i)(ν+1) = c · e(i)(ν);    τ = 1/(1-c)

se(i) = s(i) · ( 1 + a · e(i) );    (a < .5)

dout = b is read from i = 2, which has se(i) = max(se).
Slide 32
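A minimal sketch of the primitive E-machine read cycle described above: s(i) counts matches against the stored (addr, din) rows, e(i) follows the given update rule with decay constant c, se(i) = s(i)·(1 + a·e(i)), and the output is read from the location with the largest se(i). The class name and the specific values c = 0.9 and a = 0.4 are assumptions for illustration; the slides only give τ = 1/(1-c) and a < .5. Locations use zero-based Python indices.

    import random

    class PrimitiveEMachine:
        """Sketch of a primitive E-machine simulating a GRAM over A = {1,2}, D = {a,b,ε}."""

        def __init__(self, addr_row, din_row, dout_row, c=0.9, a=0.4):
            self.gx1 = addr_row            # stored addr symbols, gx(1,1:n)
            self.gx2 = din_row             # stored din symbols, gx(2,1:n)
            self.gy1 = dout_row            # stored dout symbols, gy(1,1:n)
            self.c, self.a = c, a
            self.e = [0.0] * len(addr_row) # E-state e(i), initially zero

        def step(self, addr, din, eps='ε'):
            n = len(self.gx1)
            # s(i): number of matches of (addr, din) with the first two rows.
            # din = ε is treated as matching nothing, consistent with the slide's
            # s(i) = 1 for i = 1, 2 under input (1, ε).
            s = [(addr == self.gx1[i]) + (din != eps and din == self.gx2[i]) for i in range(n)]
            se = [s[i] * (1 + self.a * self.e[i]) for i in range(n)]
            # E-state update: e(i)(ν+1) = s(i)(ν) if s(i) > e(i), else c·e(i)(ν).
            self.e = [float(s[i]) if s[i] > self.e[i] else self.c * self.e[i] for i in range(n)]
            m = max(se)
            if m <= 0:
                return None, None
            iwin = random.choice([i for i in range(n) if se[i] == m])
            return iwin, self.gy1[iwin]    # ENCODING: dout is read from location iwin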
Assume that the E-machine starts with the state of LTM shown in the table and doesn’t
learn more, so this state remains the same. What changes is the E-state, e(1),…e(4).
Assume that at ν=1, e(1)=..e(4)=0. Let us send the input sequence (addr,din)(1:5) =
(1,a), (1,b),(2,a),(2,b),(1,ε). As can be verified, at ν = 5, the state e(i) and functions s(i)
and se(i) for i=1,..4 are as shown below. Accordingly, iwin=2 and dout=b.
The state of LTM:

  i:           1 2 3 4
  gx(1,1:4):   1 1 2 2    (addr)
  gx(2,1:4):   a b b a    (din)
  gy(1,1:4):   a b b a    (dout)

s(i) is the number of matches in the first two rows. Input (addr,din) = (1,ε) produces s(i) = 1 for i = 1 and i = 2.

if ( s(i) > e(i) )  e(i)(ν+1) = s(i)(ν);
else                e(i)(ν+1) = c · e(i)(ν);    τ = 1/(1-c)

se(i) = s(i) · ( 1 + a · e(i) );    (a < .5)

iwin : { i : se(i) = max(se) > 0 };   (random equally probable choice in case of a tie)

y = gy(iwin);

[The numerical values of s(i), e(i), and se(i) at ν = 5 shown in the figure did not survive extraction.]
Slide 33
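As a check, driving the PrimitiveEMachine sketch after slide 32 with this LTM (addr row 1 1 2 2, din row a b b a, dout row a b b a) and the input sequence (1,a), (1,b), (2,a), (2,b), (1,ε), with the assumed values c = 0.9 and a = 0.4, reproduces the result stated above: at ν = 5, s(1) = s(2) = 1 and e(2) > e(1), so se(2) > se(1), location i = 2 wins, and dout = b.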
What can be efficiently computed in this “nonclassical”
symbolic/dynamical computational paradigm (call it the
E-machine paradigm)?
What computational resources are available in the brain
-- especially in the neocortex -- for the implementation of
this paradigm?
How can dynamical equations (such as the last equation
in slide 29) be efficiently implemented in biologically
plausible neural network models?
Slide 34