Learning Sensorimotor Contingencies

James J. Clark
Centre for Intelligent Machines
McGill University

This work is being done in collaboration with J. Kevin O’Regan (CNRS, Univ. René Descartes) and with doctoral students at McGill University: Fatima Drissi-Smaili, Ziad Hafed, and Muhua Li.

A mystery: Why do we perceive the same feature value (e.g. color) when viewing the feature foveally or peripherally?

Why is this a mystery? The signal provided by retinal photoreceptors can be quite different when the image of the feature falls on different places on the retina. For example, the spectral sensitivity curves of retinal photoreceptors are shifted towards the blue in peripheral cells as compared with the foveal cells.

A related mystery (perhaps…): Why do neurons in areas such as V4 and IT, which have large receptive fields, respond to the same feature value (e.g. color, orientation, complex shape) no matter where the feature lies in the receptive field? The activity of these neurons is usually reduced when the feature falls in the periphery of the receptive field as compared with the center, but the neuron’s selectivity, or tuning, is the same everywhere.

Perceptual Stability
These mysteries can be considered more generally as instances of the mystery of perceptual stability. Perceptual stability is the constancy of subjective experience across self-actions, even though these self-actions can cause large changes in sensory inputs.

Sensorimotor Contingencies
One theory of perceptual stability, due to O’Regan and Noë, holds that what is perceived is the sensorimotor contingency associated with a given physical stimulus. A sensorimotor contingency is a law or set of laws that describes the relation between self-actions and resulting changes in sensory input.
Since it is the presence of a lawful relationship between sensory input and motor activity that determines the perception of a physical stimulus, an appropriate change in sensory input is necessary for a perception to be stable!

Conditioning using Temporal-Difference Learning
We propose that sensorimotor contingencies associated with sensory changes due to eye movements can be learned using a variety of learning techniques. In particular, we propose the use of the Temporal-Difference Learning scheme of Sutton and Barto. This reinforcement learning technique can be thought of as a form of conditioning in which the Conditioned Stimulus is the sensory activity before the eye movement and the Unconditioned Stimulus is the sensory activity after the eye movement. After conditioning, presentation of the conditioned stimulus will produce the same behaviour as that produced by the unconditioned stimulus.

The Sutton-Barto Temporal-Difference Learning Rule

$$\Delta V_{ij}(t) = \beta \left[ \lambda_i(t) + \sum_k V_{ik}(t-1) \left( \gamma X_k(t) - X_k(t-1) \right) \right] \bar{X}_j(t)$$

$$\bar{X}_j(t) = \bar{X}_j(t-1) + \delta \left( X_j(t-1) - \bar{X}_j(t-1) \right)$$

$X(t) = 0$ during the attention shift from fovea to periphery.

* $V_{ij}(t)$ is constrained through saturation to lie between 0 and 1
* $\lambda_i(t)$ is the foveal value of feature i, and is the "unconditioned stimulus"
* $X_j(t)$ is the peripheral value of feature j
* $\bar{X}_j(t)$ is the eligibility trace, or short-term memory, of $X_j(t)$
* $\beta$, $\gamma$ and $\delta$ are the learning-rate, discount and trace-rate constants of the Sutton-Barto rule

V is a matrix of association strengths between pre- and postsaccadic stimuli. The pre-motor stimulus X is held in a short-term memory, generating an eligibility trace, which is used to enhance, in a Hebbian fashion, the association to the post-motor stimulus. The reinforcement signal, which is multiplied by the eligibility trace to yield the change in the association matrix, is the difference between two predictions of the foveal response: a weighted sum of the current and previous foveal responses, and the action of the current association matrix on the previous peripheral stimulus.
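As a rough illustration of the update described above, the following Python sketch implements one step of a Sutton-Barto style temporal-difference rule. It assumes the standard form of that rule, in which the reinforcement signal is the foveal (unconditioned) input plus a discounted prediction from the current peripheral input, minus the prediction from the previous peripheral input. The function names, the parameter names beta (learning rate), gamma (discount) and delta (trace rate), and all numerical values are illustrative assumptions and are not taken from the slides.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def update_trace(trace: Vector, x_prev: Vector, delta: float = 0.5) -> Vector:
    """Move the eligibility trace a fraction delta toward the previous peripheral input."""
    return [tb + delta * (xp - tb) for tb, xp in zip(trace, x_prev)]


def td_update(V: Matrix, lam: Vector, x_now: Vector, x_prev: Vector,
              trace: Vector, beta: float = 0.1, gamma: float = 0.9) -> Matrix:
    """Return a new association matrix after one temporal-difference step.

    V[i][j] links peripheral feature j (pre-saccadic) to foveal feature i.
    lam is the foveal response, i.e. the unconditioned stimulus.
    """
    new_V = []
    for i, row in enumerate(V):
        pred_now = sum(v * x for v, x in zip(row, x_now))    # V applied to current input
        pred_prev = sum(v * x for v, x in zip(row, x_prev))  # V applied to previous input
        reinforcement = lam[i] + gamma * pred_now - pred_prev
        # Hebbian-style change gated by the trace, saturated to [0, 1].
        new_V.append([min(1.0, max(0.0, v + beta * reinforcement * tb))
                      for v, tb in zip(row, trace)])
    return new_V


def train_single_pair(trials: int) -> float:
    """Toy run (hypothetical data): one peripheral feature followed by one foveal feature."""
    V = [[0.0]]
    for _ in range(trials):
        x_prev = [1.0]   # peripheral response before the eye movement
        x_now = [0.0]    # peripheral input taken as zero after the attention shift
        lam = [1.0]      # foveal response after the eye movement
        trace = update_trace([0.0], x_prev)
        V = td_update(V, lam, x_now, x_prev, trace)
    return V[0][0]


if __name__ == "__main__":
    print(train_single_pair(1), train_single_pair(200))
```

In this toy run the association strength rises from 0 toward its saturation value of 1 as the pairing is repeated.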
TRAINING PHASE
1. Attention selects a peripheral target and enhances feature detector activity at that location.
2. A short-term memory (eligibility trace) of this feature activity is generated.
3. An eye movement is made, foveating the target.
4. Attention shifts to the fovea, enhancing the feature detector activity there.
5. The feature detector activity at the fovea is associated with the feature detector activity represented in the short-term memory, using an appropriate learning rule, e.g. the Sutton-Barto Temporal Difference Rule.

RECOGNITION PHASE
Once associations have been built up, the appearance of an attended-to target in the periphery can produce a response as though the target is actually foveated. This response can be thought of as a mental image. This mental image might be represented by activity in neurons in areas with large receptive fields (V4, IT) and hence would be concerned only with feature type, rather than feature location. This provides an explanation for the continuity in the quality of the subjective experience of a stimulus across the visual field.

STEADY-STATE OPERATION
We have divided the processing into two separate phases, Training and Recognition. In practice, however, these can co-occur. The learning mechanism can be continuous, allowing adaptation to changes in the sensory and motor systems (e.g. aging of the photoreceptors, changes in the projective optics of the eye, …).

Creation of “Mental Images”
Once the association weights matrix, V, has been learned, it can be used to generate predictions, M, of what the foveal image or feature detector response will look like, based on the peripheral responses, P.

M = V*P

It is expected that the association matrix should map foveal images into themselves; therefore the eigenvectors of this matrix should be (linear combinations of) the foveal images:
F = kV*F

AN EXAMPLE: STABILITY OF COLOR PERCEPTION
Many factors, including absorption of light by the lens of the eye, cause a yellowing of the light falling on the fovea as compared with that falling on the periphery. After training, a presentation of a given color feature in the periphery is associated with the color feature that would be observed after the feature is foveated with an eye movement. This can be seen in the structure of the association weights matrix, where peripheral and foveal color features map to the same color class.

ANOTHER EXAMPLE: STABILITY OF STRAIGHT LINE PERCEPTION
The retina is hemispherical, and this causes straight lines in space to be projected as 2-D arcs on the retinal surface, with radii of curvature that vary with eccentricity.

[Figures: "… of Lines onto Receptors"; "Images of Straight Lines Projected At Various Eccentricities"]

It can be seen that the “mental images” are all very close to the foveal images, no matter where on the periphery the projection of the physical line falls. The eigenvectors are not equal to the foveal images, but the foveal images can be obtained from them through a linear sum.

Development of Position Invariance in Neural Responses: Standard View
[Diagram label: Feature detectors with differing preferred stimuli (corresponding to the photoreceptor responses of a stable physical stimulus as the eye moves)]
Which feature detectors are connected to the cell must be learned (and continually adapted). It is unclear how the development would proceed without some sort of adaptation signal coming from the need for constancy of response across self-actions (e.g. eye movements).

Development of Position Invariance in Neural Responses: Alternate View
[Diagram labels: “mental image” (prediction of foveal response); Association Layer; Eye movement signal; Feature detectors with differing preferred stimuli (corresponding to the photoreceptor responses of a stable physical stimulus as the eye moves)]
The weightings of the lower level units are continually updated through the associative learning mechanism. This mechanism requires input from the oculomotor system to know when an eye movement has taken place.

Conclusions
Perceptual stability and the position invariance of higher-level cortical neurons may arise from the learning of sensorimotor contingencies. Such learning can be accomplished with a reinforcement learning network that learns to predict the lower-level visual feature detector activity that would occur after foveation of a physical stimulus. In our view, a projection of a physical stimulus onto any peripheral retinal location will result in the same “mental image” of the feature as projection onto the fovea.

On-going and Future Research
* Recurrent feedback of predictions back down to low-level feature detectors
  - will allow small displacements of foveal image
* Interpretation of the reinforcement signal
  - small signal can be used to drive adaptation
  - large signal can be used to indicate instability of the world or to indicate that a new class should be created
* Psychophysical studies of pre- and post-motor attention shifts
* Sensorimotor basis function representations of the association weights matrix
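
To tie the training and recognition phases together, the sketch below trains an association matrix on a toy colour task and then forms mental images as M = V*P. It is a simplified illustration under several assumptions that are not in the slides: two colour classes, one-hot detector responses, two peripheral locations, and a delta-rule update (the form a temporal-difference rule reduces to if the post-saccadic peripheral input is zero and the eligibility trace equals the pre-saccadic input). All names and constants are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical foveal responses (2 detectors) for each colour class.
FOVEAL: Dict[str, Vector] = {"red": [1.0, 0.0], "green": [0.0, 1.0]}

# Hypothetical peripheral responses (4 detectors): a different detector is
# driven depending on both the colour and the retinal location.
PERIPHERAL: Dict[Tuple[str, str], Vector] = {
    ("red", "near"): [1.0, 0.0, 0.0, 0.0],
    ("green", "near"): [0.0, 1.0, 0.0, 0.0],
    ("red", "far"): [0.0, 0.0, 1.0, 0.0],
    ("green", "far"): [0.0, 0.0, 0.0, 1.0],
}


def mental_image(V: Matrix, P: Vector) -> Vector:
    """Predicted foveal response M = V * P."""
    return [sum(v * p for v, p in zip(row, P)) for row in V]


def train(epochs: int = 50, rate: float = 0.2) -> Matrix:
    """Training phase: associate each peripheral response with the foveal one."""
    V = [[0.0] * 4 for _ in range(2)]
    for _ in range(epochs):
        for (colour, _location), P in PERIPHERAL.items():
            F = FOVEAL[colour]          # response after foveating the target
            M = mental_image(V, P)      # current prediction of that response
            for i in range(2):
                for j in range(4):
                    V[i][j] += rate * (F[i] - M[i]) * P[j]
                    V[i][j] = min(1.0, max(0.0, V[i][j]))  # saturate to [0, 1]
    return V


if __name__ == "__main__":
    V = train()
    # Recognition phase: peripheral input alone produces a mental image.
    for key, P in PERIPHERAL.items():
        print(key, [round(m, 3) for m in mental_image(V, P)])
```

After training, a colour presented at either peripheral location yields nearly the same mental image as its foveal response, which is the position invariance the slides describe.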