Object Recognition in Clutter: Selectivity and Invariance Properties in Monkey
Inferotemporal Cortex
Davide Zoccolan and James DiCarlo
The problem: A major challenge of current theories of vision is to understand how the visual system performs
object recognition in cluttered conditions, typical of natural visual scenes, where objects of interest do not appear in
isolation but together with background objects. Object recognition in primates is thought to depend on neuronal
activity in the inferotemporal cortex (IT) [1], which is the last stage of the ventral visual stream. In fact, neurons
found in monkey IT fulfill two essential requirements for visual recognition: invariance and selectivity. They are
selectively tuned to views of complex objects such as faces and their responses show significant invariance to
stimulus transformations such as scale and position changes [2, 3]. Previous studies report a reduction of an IT
neuron response to its preferred stimulus when an additional “clutter” stimulus is simultaneously present in its
receptive field [4, 5]. However, the relationship between the position, shape, and clutter sensitivity of IT neurons has
not yet been systematically assessed.
Motivation: Understanding how single and multiple objects are represented in the higher cortical areas of primates
is one of the major objectives of computational and systems neuroscience. Such a challenge requires a highly
multidisciplinary approach that combines electrophysiology and psychophysics with computational modeling. The
hierarchical model of object recognition, recently developed in Poggio’s lab, accounts for both the identification
and the categorization of visual objects. It also provides a plausible circuitry to explain their neural basis and the
origin of the invariance and selectivity properties of higher visual cells [3]. More generally, the model can play a key
role in analyzing electrophysiological data, planning experiments and interpreting their results in the light of a
coherent theoretical framework. New experimental tests are therefore continually needed to verify the model’s
predictions and to improve its computational architecture.
Previous work: Preliminary model simulations predict a complex but testable pattern of interaction between
multiple objects simultaneously present in an IT neuron’s receptive field [6]. When the preferred shape and a
non-optimal (clutter) shape are simultaneously present in the IT neuron’s receptive field, the response to the pair of
shapes will be reduced. The amount of reduction will depend on the similarity of the clutter shape and the preferred
shape. In particular, the model specifically predicts a U-shaped dependence of clutter interference (i.e., reduced
neuronal response) on clutter-preferred shape similarity [6]. These predictions call for systematic experimental
testing.
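The qualitative shape of this prediction can be illustrated with a toy sketch. It is not the model of [6]: it assumes a one-dimensional shape axis, a handful of Gaussian-tuned afferents that are MAX-pooled over the two stimuli, and a downstream unit with Gaussian tuning around the afferent pattern evoked by its preferred shape. All names and numerical values (CENTERS, AFF_SIGMA, UNIT_SIGMA, PREFERRED) are assumptions chosen only for illustration.

```python
import math

# All values below are illustrative assumptions, not parameters taken from [6].
CENTERS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # preferred shapes of the afferents (1-D shape axis)
AFF_SIGMA = 0.5    # tuning width of each afferent
UNIT_SIGMA = 0.5   # tuning width of the downstream (IT-like) unit
PREFERRED = 0.0    # preferred shape of the downstream unit


def afferents(shape):
    """Gaussian-tuned afferent activations evoked by a single shape."""
    return [math.exp(-(shape - c) ** 2 / (2 * AFF_SIGMA ** 2)) for c in CENTERS]


def pair_response(clutter):
    """Response of the unit to its preferred shape shown together with a clutter shape."""
    target = afferents(PREFERRED)
    # Each afferent takes the MAX of its responses to the two stimuli.
    pooled = [max(t, c) for t, c in zip(target, afferents(clutter))]
    # The unit is tuned to the afferent pattern evoked by the preferred shape alone.
    sq_err = sum((p - t) ** 2 for p, t in zip(pooled, target))
    return math.exp(-sq_err / (2 * UNIT_SIGMA ** 2))


if __name__ == "__main__":
    for distance in [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]:
        print(f"clutter distance {distance:.1f}: response {pair_response(distance):.3f}")
```

In this sketch the response is unaffected when the clutter shape equals the preferred shape, drops at intermediate distances (where the clutter drives some afferents more strongly than the preferred shape does), and recovers when the clutter is so dissimilar that it no longer wins the MAX at any afferent.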
Approach: The first step in our experimental design was to train monkey subjects to be experts at detecting specific
objects. One monkey subject has been trained in a sequential object recognition task that requires the detection of a
specific shape (the target shape) embedded in a temporal sequence of shapes drawn from the same, parameterized
shape space (the distractors). Shapes are ~2 deg wide. To ensure the generality of our results, the monkey has been
trained to detect a target object in each of three different parameterized shape spaces (cars, faces, and abstract
silhouettes). In each shape space, performance improved consistently (more than doubling) during the first 7-10 days
of training and then reached an asymptotic value that remained constant over the remaining 8-10 training sessions.
Once the behavioral training was completed, we began single-unit recordings from IT
cortex. Each isolated neuron is tested for: 1) responsiveness of the neuron to stimuli sampled from the trained spaces;
2) selectivity of the neural response across the optimal stimulus space; 3) position tolerance of the shape selectivity;
4) impact of clutter on the shape selectivity. All recordings are performed during passive viewing with rapid sequential
presentation (5 stimuli per second). The selection of stimuli presented to the monkey during the recordings includes
subsets of shapes, belonging to each space, that were not used during the training phase.
Our preliminary recordings show that it is possible to find IT neurons sharply tuned across subsets of shapes
belonging to our stimulus spaces. The response of these neurons is maximal for a specific shape (the optimal
stimulus) and then smoothly decreases for stimuli increasingly dissimilar from the optimal one. Since our stimulus
spaces are parameterized, it is possible to build a tuning curve of the neuronal response as a function of the distance,
in the shape space, from the optimal stimulus. For those neurons whose tuning curves were invariant across the top and
bottom retinal positions used in training, we were able to test the interference produced by clutter, i.e., pairs of stimuli of
controlled similarity simultaneously present in the neuron’s receptive field. These first recordings confirm at least
part of the model predictions, in that the neuronal response smoothly decreases as a function of the distance, in the
shape space, of the flanker stimulus from the optimal one. However, we have not yet found any IT neuron whose
response increased when the flanker and the optimal stimulus were very dissimilar, as predicted by the U-shaped
dependence of clutter interference in the model units.
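A minimal sketch of how such a tuning curve could be assembled is given below. It assumes that each stimulus is described by a vector of shape-space parameters and that distance is Euclidean; the abstract does not specify the metric. The function name tuning_curve and the numbers in the example are hypothetical and synthetic, not recorded data.

```python
import math


def tuning_curve(stimuli, rates):
    """Return (index of the optimal stimulus, [(distance, rate), ...] sorted by distance).

    stimuli: list of parameter vectors, i.e. coordinates in the shape space
    rates:   mean firing rate evoked by each stimulus (spikes/s)
    """
    # The optimal stimulus is the one that evokes the largest response.
    best = max(range(len(rates)), key=lambda i: rates[i])
    # Euclidean distance in parameter space is an assumption of this sketch.
    curve = sorted(
        (math.dist(stimuli[i], stimuli[best]), rates[i]) for i in range(len(rates))
    )
    return best, curve


# Synthetic example values, for illustration only.
stimuli = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
rates = [30.0, 40.0, 18.0, 5.0]
best, curve = tuning_curve(stimuli, rates)

if __name__ == "__main__":
    print("optimal stimulus index:", best)
    for distance, rate in curve:
        print(f"distance {distance:.2f}: {rate:.1f} spikes/s")
```

The same function could be applied to flanker stimuli by replacing rates with the responses to each optimal-plus-flanker pair, which would give the clutter interference as a function of flanker distance.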
Difficulty: The main problems we encountered in trying to assess the impact of clutter on IT neuron responses were
the following. First, it is very hard to find neurons with sharp selectivity across some subset of the
stimulus space. Second, when such sharply tuned neurons were found, they often had small receptive fields (~3-4
deg), in which the two stimuli necessary to test the clutter interference could barely fit. Third, the selectivity of these
neurons was often depressed during the protocol for the clutter interference test. We believe that this effect could be
accounted for by adaptation. So far, because of these problems, we could test the response to clutter in only a small
fraction of the recorded IT neurons.
Impact: The aim of this project is to combine physiological, psychophysical and computational studies to investigate
the dependence of IT neuronal robustness to clutter on the neuron's shape and position sensitivities. This will help to
understand how complex shapes are represented in IT and how multiple objects interact or compete for IT
representation. These are fundamental problems of contemporary visual science that require a strong interaction
between experimental investigation and computational modeling.
Future Work: We will continue to test IT neuron shape selectivity and clutter tolerance while trying to
obtain a better characterization of the position tolerance, receptive field size, and adaptation properties of IT neurons.
Research support: This report describes research done at the Center for Biological & Computational Learning,
which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences,
and which is affiliated with the Computer Sciences & Artificial Intelligence Laboratory (CSAIL).
This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037,
Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation
(ITR/IM) Contract No. IIS-0085836, National Science Foundation (ITR/SYS) Contract No. IIS-0112991, National
Science Foundation (ITR) Contract No. IIS-0209289, National Science Foundation-NIH (CRCNS) Contract No.
EIA-0218693, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of
Health (Conte) Contract No. 1 P20 MH66239-01A1.
Additional support was provided by: Central Research Institute of Electric Power Industry, Center for e-Business
(MIT), Daimler-Chrysler AG, Compaq/Digital Equipment Corporation, Eastman Kodak Company, Honda
R&D Co., Ltd., ITRI, Komatsu Ltd., Eugene McDermott Foundation, Merrill-Lynch, Mitsubishi Corporation, NEC
Fund, Nippon Telegraph & Telephone, Oxygen, Siemens Corporate Research, Inc., Sony MOU, Sumitomo Metal
Industries, Toyota Motor Corporation, and WatchVision Co., Ltd.
Davide Zoccolan is supported by the International Human Frontier Science Program Organization.
References:
[1] K. Tanaka. Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109-139 (1996).
[2] N. K. Logothetis, J. Pauls, and T. Poggio. Shape representation in the inferior temporal cortex of monkeys. Curr. Biol.
5, 552-563 (1995).
[3] M. Riesenhuber and T. Poggio. Hierarchical models of object recognition in cortex. Nat. Neurosci. 2, 1019-1025
(1999).
[4] T. Sato. Interactions of visual stimuli in the receptive fields of inferior temporal neurons in awake
macaques. Exp. Brain Res. 77, 23-30 (1989).
[5] E. Rolls and M. Tovee. The responses of single neurons in the temporal visual cortical areas of the macaque
when more than one stimulus is present in the receptive field. Exp. Brain Res. 103, 409-420 (1995).
[6] M. Riesenhuber and T. Poggio. Are cortical models really bound by the "binding problem"? Neuron 24, 87-93
(1999).