Learning Sensorimotor
Contingencies
James J. Clark
Centre for Intelligent Machines
McGill University
This work is being done in collaboration with:
J. Kevin O’Regan (CNRS, Univ. Rene Descartes)
and with doctoral students at McGill University:
Fatima Drissi-Smaili
Ziad Hafed
Muhua Li
A mystery
Why do we perceive the same feature value (e.g. color)
when viewing the feature foveally or peripherally?
Why is this a mystery?
The signal provided by retinal photoreceptors can be quite
different when the image of the feature falls on different places
on the retina.
For example:
the spectral sensitivity curves of retinal photoreceptors are
shifted towards the blue in peripheral cells as compared with
the foveal cells.
A related mystery (perhaps…)
Why do neurons in areas such as V4 and IT, which have
large receptive fields, respond to the same feature
value (e.g. color, orientation, complex shape)
no matter where the feature lies in the receptive field?
The activity of these neurons is usually reduced when the
feature falls in the periphery of the receptive field as compared
with the center, but the neuron’s selectivity, or tuning, is the
same everywhere.
Perceptual Stability
These mysteries can be more generally considered as related
to the mystery of perceptual stability.
Perceptual stability is the constancy of subjective experience
across self-actions, even though these self-actions can cause
large changes in sensory inputs.
Sensorimotor Contingencies
One theory of perceptual stability, due to O’Regan and Noe,
holds that what is perceived is the sensorimotor contingency
associated with a given physical stimulus.
A sensorimotor contingency is a law or set of laws that
describes the relation between self-actions and resulting
changes in sensory input.
Since it is the presence of a lawful relationship between
sensory input and motor activity that determines the perception
of a physical stimulus, an appropriate change in sensory input
is necessary for a perception to be stable!
Conditioning using Temporal-Difference Learning
We propose that Sensorimotor Contingencies associated
with sensory changes due to eye movements can be learned
using a variety of learning techniques.
We propose the use of the Temporal-Difference Learning
scheme of Sutton and Barto.
This reinforcement learning technique can be thought of as
a form of Conditioning where the Conditioned Stimulus is
the sensory activity before the eye movement and the
Unconditioned Stimulus is the sensory activity after the
eye movement.
After conditioning, presentation of the conditioned stimulus
will produce the same behaviour as that produced by the
unconditioned stimulus.
The Sutton-Barto Temporal-Difference Learning Rule


$$\Delta V_{ij}(t) = \beta \left[ \lambda_i(t) + V_{ij}(t-1) \left( \gamma X_j(t) - X_j(t-1) \right) \right] \bar{X}_j(t)$$
$$\bar{X}_j(t) = \delta \bar{X}_j(t-1) + X_j(t-1)$$
$X_j(t) = 0$ during the attention shift from fovea to periphery.
$V_{ij}(t)$ is constrained through saturation to lie between 0 and 1.
$\lambda_i(t)$ is the foveal value of feature i, and is the "unconditioned stimulus".
$X_j(t)$ is the peripheral value of feature j.
$\bar{X}_j(t)$ is the eligibility trace, or short-term memory, of $X_j(t)$.
$\beta$, $\gamma$ and $\delta$ are the learning rate, discount factor and trace decay rate.
V is a matrix of association strengths between pre- and postsaccadic stimuli.
The pre-motor stimulus X is held in a short-term memory,
generating an eligibility trace, which is used to enhance,
in a Hebbian fashion, the association to the post-motor stimulus.
The reinforcement signal, which is multiplied by the eligibility
trace to yield the change in the association matrix,
is the difference between two predictions of the
foveal response: a weighted sum of the current and previous
foveal responses, and the action of the current association matrix
on the previous peripheral stimulus.
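A minimal Python sketch of one step of such a temporal-difference update may make the rule easier to follow. It is an illustration under stated assumptions, not the authors' implementation. The function names, the parameter names (beta, gamma, delta) and their values are hypothetical. The prediction is taken to be the action of the association matrix on the peripheral stimulus, summed over peripheral features.

```python
def update_trace(trace, x_prev, delta=0.5):
    """Eligibility trace: decayed old trace plus the previous peripheral input."""
    return [delta * tr + xp for tr, xp in zip(trace, x_prev)]

def td_update(V, lam, x_now, x_prev, trace, beta=0.1, gamma=0.9):
    """One TD step; returns the new association matrix and the reinforcement signal.

    V[i][j] links peripheral feature j to foveal feature i (assumed layout).
    """
    n_fov, n_per = len(V), len(V[0])
    new_V, errors = [], []
    for i in range(n_fov):
        pred_now = sum(V[i][j] * x_now[j] for j in range(n_per))
        pred_prev = sum(V[i][j] * x_prev[j] for j in range(n_per))
        # Reinforcement signal: difference between two predictions of lam[i].
        err = lam[i] + gamma * pred_now - pred_prev
        errors.append(err)
        # Hebbian-style change, saturated so weights stay in [0, 1].
        new_V.append([min(1.0, max(0.0, V[i][j] + beta * err * trace[j]))
                      for j in range(n_per)])
    return new_V, errors

# Toy step: peripheral feature 0 was active before the eye movement,
# the peripheral input is zero during the attention shift, and foveal
# feature 0 responds afterwards.
V = [[0.0, 0.0], [0.0, 0.0]]
x_prev = [1.0, 0.0]
trace = update_trace([0.0, 0.0], x_prev)
V, err = td_update(V, lam=[1.0, 0.0], x_now=[0.0, 0.0], x_prev=x_prev, trace=trace)
print(V)  # [[0.1, 0.0], [0.0, 0.0]]
```

Only the weight linking the remembered peripheral feature to the active foveal feature changes, because the update is gated by the eligibility trace.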
TRAINING PHASE
Attention selects a peripheral target and enhances feature
detector activity at that location.
A short-term memory (eligibility trace) of this feature
activity is generated.
An eye movement is made, foveating the target.
Attention shifts to the fovea, enhancing the feature detector
activity there.
The feature detector activity at the fovea is associated with
the feature detector activity represented in the short-term
memory, using an appropriate learning rule, e.g. the
Sutton-Barto Temporal Difference Rule.
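The training steps above can be sketched as a small simulation. Everything specific here is a hypothetical stand-in: the toy world in FOVEAL_OF (peripheral detector j responds to the stimulus that drives foveal detector FOVEAL_OF[j] once foveated), the one-hot stimuli, and the learning rate. Because the peripheral input is zero during the attention shift, the reinforcement signal reduces to the foveal response minus the prediction made from the remembered peripheral stimulus.

```python
import random

def train(foveal_of, n_trials=300, beta=0.2, seed=0):
    """Learn V[i][j]: association from peripheral feature j to foveal feature i."""
    rng = random.Random(seed)
    n = len(foveal_of)
    V = [[0.0] * n for _ in range(n)]
    for _ in range(n_trials):
        # Attention selects a peripheral target; detector j is enhanced.
        j = rng.randrange(n)
        x_prev = [1.0 if k == j else 0.0 for k in range(n)]
        # A short-term memory (eligibility trace) of that activity is kept.
        trace = list(x_prev)
        # The eye movement foveates the target; attention moves to the
        # fovea, where detector foveal_of[j] now responds.
        lam = [1.0 if i == foveal_of[j] else 0.0 for i in range(n)]
        # The foveal activity is associated with the remembered activity.
        for i in range(n):
            err = lam[i] - sum(V[i][k] * x_prev[k] for k in range(n))
            for k in range(n):
                V[i][k] = min(1.0, max(0.0, V[i][k] + beta * err * trace[k]))
    return V

FOVEAL_OF = [1, 2, 0]  # hypothetical periphery-to-fovea correspondence
V = train(FOVEAL_OF)
for row in V:
    print([round(v, 2) for v in row])
```

After training, each column of V has a single weight near 1, in the row of the foveal feature that the corresponding peripheral feature predicts.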
RECOGNITION PHASE
Once associations have been built up, the appearance of
an attended-to target in periphery can produce a response
as though the target is actually foveated.
This response can be thought of as a mental image.
This mental image might be represented by activity in
neurons in areas with large receptive fields (V4, IT) and
hence would be concerned only with feature type, rather
than feature location.
This provides an explanation for the continuity in the
quality of the subjective experience of a stimulus across
the visual field.
STEADY-STATE OPERATION
We have divided the processing into two separate phases,
Training and Recognition.
In practice, however, these can co-occur.
The learning mechanism can be continuous, allowing
adaptation to changes in the sensory and motor systems
(e.g. aging of the photoreceptors, changes in the projective
optics of the eye, …)
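As a rough illustration of continuous adaptation, the sketch below keeps applying the same error-driven update to the weights of one attended peripheral detector. Partway through, the foveal response that follows it changes; that change is a hypothetical stand-in for a drift in the sensor. The function name, learning rate, and step count are assumptions.

```python
def adapt(weights, foveal, beta=0.2, steps=40):
    """Repeat the post-saccadic update for one attended peripheral detector.

    weights[i] is the association to foveal feature i; foveal[i] is the
    foveal response observed after each eye movement.
    """
    for _ in range(steps):
        weights = [min(1.0, max(0.0, w + beta * (lam - w)))
                   for w, lam in zip(weights, foveal)]
    return weights

# Before the change, the detector predicts foveal feature 0.
w = adapt([0.0, 0.0], foveal=[1.0, 0.0])
# After the (hypothetical) change, the same detector precedes foveal feature 1.
w_after = adapt(w, foveal=[0.0, 1.0])
print([round(v, 3) for v in w], [round(v, 3) for v in w_after])
```

No separate retraining phase is needed in this sketch: the old association decays and the new one grows under the same rule.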
Creation of “Mental Images”
Once the association weights matrix, V, has been
learned, it can be used to generate predictions, M, of
what the foveal image or feature detector response
will look like, based on the peripheral responses, P.
M = V*P
It is expected that the association matrix should map
foveal images into themselves; therefore, the
eigenvectors of this matrix should be (linear combinations of)
the foveal images.
F = kV*F
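A toy numerical check of these two relations is sketched below. It assumes a hypothetical three-unit retina (left periphery, fovea, right periphery) and an idealised, hand-written association matrix rather than learned weights.

```python
def matvec(V, x):
    """Matrix-vector product V*x for a list-of-rows matrix."""
    return [sum(v * xi for v, xi in zip(row, x)) for row in V]

# Hypothetical retina units: [left periphery, fovea, right periphery].
# Assumed idealised weights: activity at any location predicts
# activity at the fovea.
V = [[0.0, 0.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 0.0, 0.0]]

P = [1.0, 0.0, 0.0]   # target seen in the left periphery
M = matvec(V, P)      # mental image, M = V*P

F = [0.0, 1.0, 0.0]   # the same target when foveated
VF = matvec(V, F)     # V maps the foveal image into itself, so k = 1 here

print(M)   # [0.0, 1.0, 0.0]
print(VF)  # [0.0, 1.0, 0.0]
```

In this idealised case the mental image of the peripheral target equals the foveal image, and the foveal image is an eigenvector of V with eigenvalue 1.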
AN EXAMPLE: STABILITY OF COLOR PERCEPTION
Many factors, including absorption of light by the lens
of the eye, cause a yellowing of the light falling on the
fovea as compared with that falling on the periphery.
After training, a presentation of a given color feature in the
periphery is associated with the color feature that would be
observed after the feature is foveated with an eye movement.
This can be seen in the structure of the association weights
matrix, where peripheral and foveal color features map to
the same color class.
ANOTHER EXAMPLE:
STABILITY OF STRAIGHT LINE PERCEPTION
The retina is hemispherical, and this causes straight lines in
space to be projected as 2-D arcs on the retinal surface,
with radii of curvature that vary with eccentricity.
[Figure: images of straight lines projected onto the receptors at various eccentricities.]
It can be seen that the “mental images” are all very close
to the foveal images, no matter where on the periphery
the projection of the physical line falls.
The eigenvectors are not equal to the foveal images, but the
foveal images can be obtained from them through a
linear sum.
Development of Position Invariance in Neural Responses
Standard View
Which feature detectors are connected to the cell must be learned
(and continually adapted).
Feature detectors with differing preferred stimuli
(corresponding to the photoreceptor responses of
a stable physical stimulus as the eye moves)
It is unclear how the development would proceed without some sort
of adaptation signal coming from the need for constancy of response
across self-actions (e.g. eye movements).
Development of Position Invariance in Neural Responses
Alternate View
“mental image” (prediction of foveal response)
Association Layer
Eye movement signal
Feature detectors with differing preferred stimuli
(corresponding to the photoreceptor responses of
a stable physical stimulus as the eye moves)
The weightings of the lower level units are continually updated
through the associative learning mechanism. This mechanism requires
input from the oculomotor system to know when an eye movement has
taken place.
Conclusions
Perceptual stability and the position invariance of higher-level
cortical neurons may arise from a learning of sensorimotor
contingencies.
Such learning can be accomplished with a reinforcement
learning network, which learns to generate predictions of
lower level visual feature detector activity which would
occur after foveation of a physical stimulus.
In our view, a projection of a physical stimulus onto any
peripheral retinal location will result in the same
“mental image” of the feature as projection onto the fovea.
On-going and Future Research
* Recurrent Feedback of predictions back down
to low-level feature detectors
- will allow small displacements of foveal image
* Interpretation of the Reinforcement Signal
- small signal can be used to drive adaptation
- large signal can be used to indicate instability of the world
or to indicate that a new class should be created
* Psychophysical studies of Pre- and Post-motor attention shifts
* Sensorimotor Basis function representations of the
Association weights matrix.