Journal of Physiology - Paris 98 (2004) 35–52
www.elsevier.com/locate/jphysparis
From 1D to 2D via 3D: dynamics of surface motion segmentation
for ocular tracking in primates
Guillaume S. Masson
*
Institut de Neurosciences Physiologiques et Cognitives, Centre National de la Recherche Scientifique, 31 Chemin Joseph Aiguier,
13402 Marseille cedex 20, France
Abstract
In primates, tracking eye movements help vision by stabilising onto the retinas the images of a moving object of interest. This
sensorimotor transformation involves several stages of motion processing, from the local measurement of one-dimensional luminance changes up to the integration of first- and higher-order local motion cues into a global two-dimensional motion immune to
antagonistic motions arising from the surround. The dynamics of this surface motion segmentation is reflected in the various
components of the tracking responses, and its underlying neural mechanisms can be correlated with behaviour at both single-cell and
population levels. I review a series of behavioural studies which demonstrate that the neural representation driving eye movements
evolves over time from a fast vector average of the outputs of linear and non-linear spatio-temporal filtering to a progressively
slower, accurate solution for global motion. Because of the sensitivity of the earliest ocular following to binocular disparity, antagonistic
visual motion from surfaces located at different depths is filtered out. Thus, global motion integration is restricted to the depth
plane of the object to be tracked. Similar dynamics were found at the level of monkey extra-striate areas MT and MST, and I suggest
that several parallel pathways along the motion stream are involved, albeit with different latencies, to build up this accurate surface
motion representation. After 200–300 ms, most of the computational problems of early motion processing (aperture problem,
motion integration, motion segmentation) are solved and the eye velocity matches the global object velocity to maintain a clear and
steady retinal image.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: 2D visual motion integration; Tracking eye movements; Motion segmentation; Binocular disparity; Non-Fourier motion
1. Introduction
Vision is blurred when retinal slip of the image exceeds a few degrees per second. Tracking eye movements
help vision by stabilising onto the retinas the images of a
moving object of interest (see [84] for a review). To do
so, the eyes are rotated smoothly at the same velocity as
the selected object. This visual stabilisation mechanism
is found in a wide range of species and has been so
extensively investigated that it stands as a
paradigmatic instance of sensorimotor transformations
[56]. In a wide range of species, including humans,
smooth eye movements have been very carefully scrutinised and, nowadays, the fundamentals of oculomotor
control are known with high precision and its neural
basis is largely unveiled (see [19] for a review).
* Tel.: +33-491-164314/164315; fax: +33-491-774969.
E-mail address: [email protected] (G.S. Masson).
0928-4257/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jphysparis.2004.03.017
Finally,
because similar visual stimulation and ocular recording
techniques can be used in both humans and monkeys,
and because there are very close behavioural similarities
between the two species, it is commonly accepted that
monkeys are the best model for understanding the
neural basis of oculomotor control.
In contrast with the neural motor control of tracking eye
movements (see [43,49,56] for reviews), much less is
known about the visual mechanisms feeding the sensorimotor transformation. The theoretical consequences
of this lack of interest in the afferent information were
pointed out by Miles, who wrote: ‘‘to date, few studies
have been concerned with the optokinetic system’s
ability to deal with the more complex optic flow patterns
of everyday experience, and most models oversimplify
the situation, collapsing the visual processing into a
single black box that, in some unspecified way, derives
a signal encoding retinal slip’’ [81]. Ten years later
however, a growing body of evidence indicates that
complex visual motion processing is indeed engaged in
controlling tracking eye movements (see [43,51,67,82]
for reviews). Efforts have been mostly focused on how
target velocity is reconstructed from local motions (e.g.
[23,93]) and then represented in cortical visual motion
areas [55]. Behavioural studies in man and monkeys can
now decipher its dynamics using ever more complex
motion stimuli inspired by human motion psychophysics. As a consequence, eye movements reveal their
potential as a ‘‘behavioural probe’’ of the visual brain.
These works open the door to a more comprehensive
view of the coupling between vision and action, although a new theoretical framework is yet to emerge
[106].
Herein, I will review some recent works which demonstrate that reflexive tracking eye movements have
complex temporal dynamics reflecting the progressive build-up of a neural representation of the target-object motion. I will show that this progressive build-up
involves parallel detection mechanisms for the local
motion cues and surface motion integration mechanisms. Moreover, behavioural results demonstrate that
this integration process is weighted by depth cues such
as binocular disparity. This selective integration of 1D
local motion cues into a 2D global surface motion is
thus intrinsically modulated by contextual, 3D cues
which act on the very earliest stages. I will try to relate
these various motion processing stages to physiological
data collected at the single-cell levels within different
monkey visual areas. In particular the emphasis will be
put onto the tight link observed between the temporal
dynamics at both behavioural and physiological levels.
From this experimental evidence, I will then draw a
framework which integrates the modern conceptions of
parallel, cortical visual motion processing into the frontend of the visuo-oculomotor control in primates.
2. What visual motion?
In everyday life, the image of the visual world is
constantly changing as any displacement of our eyes,
head or body results in complex optic flow patterns.
During self-translation, the retinal velocity of any point
in the visual array depends upon its distance from the
observer. Therefore, stabilising the image of a given
object of interest requires that the visual system singles
out its motion from the surrounding and elaborates a
precise estimate of its direction and speed. Such processing involves a series of mechanisms which goes from
1D local motion detection to 3D object motion segmentation. These stages have been extensively investigated in motion psychophysics and physiology (see
[10,104,114] for reviews). As a consequence, specific
motion stimuli have been designed to tackle each of
them and they can be adapted for investigating the
tracking systems.
To correctly study visual motion processing in the
context of tracking eye movements, three obstacles need
first to be cleared up. First, some specific motion
information is needed for initiating and controlling
tracking eye movements, which can be different from
that used by visual perception. In a complex visual
scene, many objects are moving simultaneously and the
resulting motion parallax flow field is an important cue
for an accurate 3D perception. Motion transparency
perception generated by a random dot pattern where
each half of the dot population moves at a different speed
is a good example of this: transparency will only be
perceived if the two speeds are neurally represented at
the same time by the motion system (see [10]). Here,
motion segmentation requires a precise estimate of each
local motion and a selective grouping of the dots moving
at the same speed to generate the perception of two
transparent surfaces sliding one over the other. Perceptual performance is rather poor and sluggish in such
displays when compared to motion detection and discrimination performance [70,80]. How does the tracking
system perform with such displays? If you ask
subjects to track one of the surfaces (say, the faster),
pursuit initiation has the usual fast (∼100 ms) latency,
albeit with a lower performance than perception [112].
This contradicts in part the fact that perceptual judgements need rather long (∼200 ms) stimulus durations to
reach a decent performance [70]. When both motions are
in the same direction, tracking responses are initiated by
the average speed of the display. During steady state, the
instantaneous eye speed will fluctuate between surface
speeds, depending on which surface the subject is paying
attention at a given moment. Then, very little influence
of the non-attended moving surface will be seen, as if
the information was discarded [79]. These results indicate that, contrary to the perceptual system, the visuo–
motor system needs a simple representation, or the
selective read-out of a more complex representation,
which provides only one speed signal at a time and for
which any other motion is disregarded as noise. Moreover, since initial pursuit is driven by the vector average
of the motion display, the pursuit response is obviously
initiated before a complete motion segmentation is
performed. This example shows that investigating motion processing in the context of either perception or
ocular behaviour yields different results. A lesson is
that eye movements are not solely an objective tool to
investigate perception. We must take into account the
behavioural constraints imposed onto the visual motion
processing to understand the visuo–motor transformation and to probe the neural representation of target
motion driving the eyes.
Secondly, any movement of the eyeball will result in a
displacement of the sensing organ, the retina. Tracking
acts as a negative feedback loop where eye movements
tend to minimise the retinal image motion by smoothly
changing the orientation of the fovea. The consequence
is that, during closed-loop behaviour, the retinal stimulus is different from the displayed stimulus. It is
therefore difficult to isolate the properties of the visual
processing involved in the control of the eyes. Obviously, other sources of information such as the eye
velocity memory provided by an extra-retinal signal
come into play. One way to solve this problem is to
investigate the earliest phase of pursuit initiation (see
[56]). During a short period of time, shorter than the
pursuit reaction time, the system is transiently open-loop and the ocular responses depend solely upon the
displayed stimulus. As a consequence, a precise mapping
between stimulus parameters and tracking properties
can be measured and eventually related to the physiological data gained by recording the activities of single
neurones using the same parameter space (see [45,82]).
Third, tracking responses are continuous rotations of
the eyes. By recording the eye velocity profiles and then
quantifying responses at certain points in time, we can
measure the temporal evolution of visual motion processing. Hence, we have a tool to tape the build-up of
the visual symphony where different instruments starts
playing at different epochs but converge onto a common
beat (e.g. [66,71,84]). Two timing information can then
be accurately measured: the latency of each specific
tracking component and the time course of their integration. This is clearly an advantage over motion psychophysics where different visual latencies can hardly be
measured only by using different stimulus durations (see
[121] for instance). From eye movement responses,
analogies with neural physiology data can then be
drawn both in terms of the basic properties of some
given functional pathways [45] and of sequences of early
and late waves of activation in the visual system (see
[13,53]).
In the present review, I will focus on the properties of
the visual processing underlying the initiation of shortlatency ocular following responses, first identified by
Miles and his group (see [82,83] for reviews). If motion
of a large visual scene is applied in the wake of a saccadic eye movement, reflexive tracking responses are
elicited at ultra-short latencies (∼85 ms in humans and
∼55 ms in monkeys). These machine-like responses exhibit many properties of low-level motion detectors. For
instance, when elicited by moving luminance sine-wave
gratings, their latencies depend upon both contrast and
temporal frequency [36,84]. Amplitude of the responses
shows non-monotonic speed tuning, peaking at values
∼30–40°/s [74]. In both monkeys and man, motion in the
periphery of the visual field modulates the ocular following to motion in the central field, indicating that the
visual signal driving the eyes results from an integration
process [84]. This latter result suggests that ocular following eye movements depend upon the segmentation of
the moving visual scene and are not driven by its en
masse motion [82].
In monkeys, the neural basis of ocular following
has been carefully scrutinised by the group of Kawano
(Fig. 1). Correlated neural activity was found in visual
cortical areas MT and MST, leading the eye movement onset by ∼10 ms and showing both a strong
directional selectivity and a preference for high speeds
[48]. Neural activity linked to ocular following has also
been shown in the dorsolateral pontine nucleus
(DLPN) [46] and the ventral paraflocculus lobes of the
cerebellum (VPFL) [100]. The analysis of the visual
latencies at these different cortical and sub-cortical
stages suggests a progression of the information flow
from visual processing to motor command. Moreover,
although neurons in MST and DLPN show a wide
range of directional preferences (Fig. 1), consistent with
their role in visual processing, simple-spikes of Purkinje
cells in the VPFL are directionally tuned in motor
coordinates (i.e. vertical or horizontal) [37]. Kawano
and colleagues suggested that visual information concerning the moving visual scene is encoded in MST and
relayed via DLPN to the ventral paraflocculus which
computes the motor command for driving the eyes
[120]. However, additional cerebellar afferents from
the pretectal nucleus of the optic tract [44] should
also be taken into account to render the exact wiring
diagram of this exquisite model of sensorimotor transformation. More detailed reviews on the neural mechanisms of ocular following eye movements can be
found elsewhere [45,120].
3. From luminance to 1D motion cues: detection and
triggering
How is the visual information about motion of the
scene encoded? Any visual motion is primarily a local
change in luminance, but changes in other local visual
cues (contrast, texture, colour, binocular disparity, etc.)
also provide motion information. There is mounting
evidence that visual motion computation involves several parallel streams (see [62]). A first-order system extracts motion from drifting luminance modulations and
a second-order system measures motion from texture
contrast modulations where local luminance remains
constant. These two systems are monocular, fast and
sensitive to a wide range of spatial and temporal frequencies. They both use motion energy analysis [1,115]
or, equivalently, elaborated Reichardt detectors [111]. A
third-order system has been described which computes
motion from a saliency map. It is much more sluggish,
with a more restricted spatio-temporal sensitivity [62].
Although it provides a powerful tool for a precise
investigation of the relationship between tracking eye
Fig. 1. Information flow along the sensori-motor path for ocular following. Typical neuronal responses in cortical visual area MST, in the dorsolateral pontine nucleus (DLPN) and in the ventral paraflocculus lobes (VPFL) of the cerebellum to motion of a large field random dot pattern.
Neuron responses are plotted together with the mean eye velocity profiles, on the same time axis. Arrows indicate the mean latency of the cell
responses and of the tracking eye movements. Right-end insets show the distribution of directional selectivities, for the three recording sites. All
directions of motion are represented in both MST and DLPN while in VPFL, directions along the horizontal and vertical axes are represented in
different sub-populations coding for horizontal and vertical eye movements, respectively. Modified from [45].
movements and attention, it is beyond the scope of the
present article.
A classical signature of linear motion processing,
such as that performed by motion-energy detectors, is called reversed phi
motion [21,61]. With appropriate spatio-temporal
parameters, a single, step-wise displacement of a high
density random dot pattern produces a vivid sensation
of forward, apparent motion [116]. If the luminance
polarity of the pattern is reversed during the step,
apparent motion is perceived in the opposite direction to
the actual displacement, a situation called ‘‘reversed phi
motion’’ [5]. A similar reversal of optomotor responses
had been previously observed by Reichardt with a contrast-polarity reversing stimulus presented to the beetle
eye. This observation formed the core of his correlation model [96], and Lu and Sperling later showed
that motion reversing is predicted by all motion-energy
like models based on a linear spatio-temporal filtering
of the input sequence [61]. It is interesting to note
that inversion of neural responses with inverted
contrast polarity has been used to identify linear neural
processing of visual information in many different
invertebrate (e.g. [28,35,96]) and vertebrate (e.g. [24,
57,89]) nervous systems.
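The opponent logic that makes any linear correlator report reversed phi can be made concrete with a toy sketch. The code below is purely illustrative (a minimal 1D Reichardt-like detector of my own construction, not a model taken from the works cited above): each local unit multiplies the luminance at one sample in the first frame with the luminance at the neighbouring sample in the second frame, and the two mirror-symmetric subunits are subtracted.

```python
import numpy as np

def reichardt_output(frame0, frame1):
    """Toy opponent (Reichardt-like) correlator for a 1D luminance profile.

    Each detector correlates one sample of the first frame with the
    neighbouring sample of the second frame; the difference between the two
    mirror-symmetric subunits, summed over space, gives a net motion signal
    (> 0: motion towards higher indices, < 0: opposite direction).
    """
    rightward = frame0[:-1] * frame1[1:]   # subunit tuned to +1 sample steps
    leftward = frame0[1:] * frame1[:-1]    # subunit tuned to -1 sample steps
    return float(np.sum(rightward - leftward))

# A random-dot row stepped by one sample is signalled as forward motion;
# reversing the contrast polarity during the step flips the sign of every
# product in the forward subunit, hence the reversed phi output.
dots = np.array([1, -1, 1, 1, -1, -1, 1, -1])
forward = reichardt_output(dots, np.roll(dots, 1))        # > 0
reversed_phi = reichardt_output(dots, -np.roll(dots, 1))  # < 0
```

Note that this linear sketch predicts forward and reversed responses of equal magnitude, which is exactly the linear prediction that the weaker reversed responses discussed below deviate from.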
In humans, we have recorded the ocular following
responses to apparent motion generated by a single
position step of a large random dot pattern [73]. Despite
the transient nature of the motion signal, robust ocular
following responses were initiated at very short latencies
(<80 ms). These responses showed a non-monotonic
tuning function between amplitude and step size (Fig.
2), which is best fitted with an odd Gabor function that
peaks for step sizes of roughly half the period of the
random dot pattern fundamental frequency and
asymptotes to a default value for step sizes larger than
this period. Such spatial parameters are classical
signatures of short-range apparent motion processing
(e.g. [98]). If the contrast polarity of the random dot
pattern was reversed between the first and second frame,
ocular following responses were initiated in the direction
opposite to the actual displacement (Fig. 2). These reversed responses had the same ultra-short latency (<80
ms) and their tuning curves were symmetrical to those
found for forward responses, the best-fitted Gabor
functions being 180° out of phase. These similar spatial
parameters indicate that the same spatio-temporal filtering is used for both conditions from local changes in
luminance. This result demonstrates that the earliest
phase of ocular following is driven by a motion detection process that linearly samples the local changes in
luminance before computing its motion.
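The odd Gabor used for the step-size tuning curves can be sketched as follows; the parameter names are mine, not taken from [73], and the non-zero asymptote reported for large steps would require an extra constant term omitted here.

```python
import numpy as np

def odd_gabor(step, amplitude, sigma, freq):
    """Odd-symmetric Gabor: a sine carrier under a Gaussian envelope.

    step: position-step size (deg); amplitude, sigma (envelope width) and
    freq (carrier frequency) are free parameters of the fit.
    """
    envelope = np.exp(-step**2 / (2.0 * sigma**2))
    return amplitude * envelope * np.sin(2.0 * np.pi * freq * step)
```

Two properties of this form match the data pattern described above: it is odd-symmetric (leftward steps drive leftward tracking of equal strength), and flipping the sign of the amplitude yields the same curve 180° out of phase, which is how the reversed-polarity tuning relates to the forward one.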
Many neurons in the middle temporal (MT) area,
which provides the major inputs to MST, show directional selectivity for motion and exhibit reversed directionality with luminance-reversing motion stimuli
[57,58]. Similar reversals have been observed in retinal
directional cells in rabbit [7], cat striate cortex [30] and
fly lobular plate [28]. In monkeys it has been found that
simple––but not complex––cells in area V1 show inverted responses to opposite-contrast [57]. This result
suggest that the earliest direction selectivity of MT
neurons result from interactions between V1 simple cells
[58]. These results suggest that the earliest ocular following is driven by motion signals elaborated by MT/
MST motion detectors that linearly combined fast inputs from V1 simple cells. There is another type of eye
movement––disparity vergence––that shares a number
of features with ocular following: (1) In both humans
and monkeys, it is elicited at ultra-short latency when
the appropriate stimuli––this time, disparity steps––are
applied to large random dot patterns [16,17]; (2) it shows
response reversal with luminance-reversing stimuli, referred to as ‘‘anti-correlated stimuli’’ [69]; (3) in monkeys, it seems to be mediated, at least in part, by visual
area MST [108] whose disparity-selective neurons show
response reversals with stimuli of opposite contrast
polarity at the two eyes [110]. Interestingly, many disparity-selective neurons in monkey striate cortex [24]
Fig. 2. Forward and reversed ocular following: step-size tuning. (a) Space-time diagram of a single step apparent motion when the contrast polarity
remains constant (top panel) or is reversed (bottom panel) across the displacement. One slice of a random dot pattern is presented. Image was
blanked briefly during the step (grey horizontal bar). (b) Mean version velocity profiles (V_s) of ocular following responses to rightward steps with
(red) or without (blue) contrast reversal. Numbers indicate the size of the displacement (in degrees). Vertical dotted lines indicate the estimated mean
latency. The light blue area indicates the time-window over which the change in version position was computed, for each trial. (c) Mean (±SE) change
in version position, plotted as a function of step size. Positive (negative) values indicate rightward (leftward) apparent motion and ocular tracking.
Blue and red circles plot data obtained with constant and reversed contrast polarity, respectively. Continuous lines are best-fitted Gabor functions.
The function for reversed contrast polarity condition shows a smaller peak-to-peak amplitude and a 180° phase shift to that found with constant
contrast polarity. Both curves do not converge onto the zero level because we subtracted the mean change in version position obtained with a catch-trial (no position step) from each data point to remove any effects due to post-saccadic drift. Different catch-trials were used for constant and reversed
contrast polarity conditions. Data from [73,74].
and area MT [52] show similar inversion of their tuning
curves when presented with anti-correlated stimuli.
Hence, the clear suggestion is that the motion that initiates version and the disparity that initiates vergence
are both sensed by first-order energy mechanisms in
cortex.
However, although forward and reversed phi motion
stimuli had similar Fourier power spectra, the magnitude of the reversed ocular following responses was only half of
that found for forward responses (Fig. 2b and c, [73]).
Interestingly, Livingstone and co-workers have also reported weaker responses to contrast reversal in directionally selective cells in areas V1 and MT [57,58]. This
reduction in the magnitude of the responses is a significant deviation from the linear prediction and could be
explained in two different ways. First, an early non-linearity could prevent most of the motion detectors from
responding to anti-correlated stimuli. Such a non-linearity
has been found in the fly lobula plate [35] and, to a
somewhat lesser extent, in monkey area V1 [57]. Alternatively, there might be competing motion signals which
antagonise the ocular following response. In the reversed phi motion condition, higher-order motion signals remain in the forward direction and therefore
compete with the reversed, first-order motion signals
[61]. Such competition might explain why reversed
ocular following responses were smaller and suggests that second-order motion signals can also drive ocular following eye
movements. Direct evidence for this is still lacking in
humans but, in monkeys it has been found that pure
second-order motion stimuli can elicit ocular following
responses, albeit with a latency delayed by ∼20 ms relative to grating-driven responses [8]. In the same vein,
voluntary pursuit eye movements are elicited by second-order motion targets, although with a lower initial
acceleration profile and a longer latency [18,40,54]. In
brief, there is now experimental evidence that different
motion cues contribute to the initiation of tracking eye
movements in both human and non-human primates.
Whether pure second-order motion can elicit reflexive
ocular following responses, and at which latency, is
still a critical missing piece of evidence. If so, it will be
possible to titrate the interaction between pure first- and
second-order motion signals and to measure its temporal dynamics.
4. From 1D to 2D: integration of motion signals
That different motion cues can be used for measuring
the actual motion of a given surface has already been
suggested as a way to solve the so-called ‘‘aperture
problem’’ [113]. Single extended contours of a surface
are of considerable importance for the visual system.
However, because of spatial and temporal limits of any
retinal image sampling mechanism, the motion of these
one-dimensional (1D) features is inherently ambiguous:
there will always be a family of physical movements in
two dimensions that produces the same local visual
motion of this isolated contour. All these possible
translational velocities lie on a constraint line in the
velocity space [31,65]. One way to resolve this motion
ambiguity is to extract the different 1D motion signals
present across the visual field and then to combine them
[2,86]. Plaid motion stimuli provide one good example
of such a computation. A moving plaid is constructed by
summing two sets of parallel 1D contours, called components, such as lines or sinusoidal gratings of different
orientations. Each component moves in the direction
orthogonal to its orientation. Under certain circumstances, the components cohere to form a 2D pattern
which moves in a direction different from the component motion directions [2,113]. Different computational
solutions have been proposed to reconstruct the 2D
pattern motion direction from its 1D component motions: intersection of constraints (IOC), vector average
or feature tracking. The IOC solution is the unique
translation vector consistent with both component
motions and is defined geometrically by the intersection of the two constraint lines in velocity space. The
vector average solution is the average of the two normal
velocities. Finally, the feature tracking solution is defined by the velocity of some features of the plaid
intensity pattern such as the so-called ‘‘blobs’’ present at
the intersection between gratings. Both the IOC and the
feature tracking solutions correspond to the veridical
(true) pattern motion direction [2,33,38].
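The IOC and vector-average rules can be made concrete with a few lines of vector algebra. The sketch below is mine (the notation and function names are not from the cited works): each grating contributes a unit normal n_i and a normal speed s_i, the IOC is the unique 2D velocity V with V·n_i = s_i, and the vector average is simply the mean of the two normal velocity vectors.

```python
import numpy as np

def ioc_solution(normal1, speed1, normal2, speed2):
    """Intersection of constraints: the unique 2D velocity V satisfying
    V . n_i = s_i for both unit normals n_i and normal speeds s_i."""
    N = np.array([normal1, normal2], dtype=float)
    return np.linalg.solve(N, np.array([speed1, speed2], dtype=float))

def vector_average(normal1, speed1, normal2, speed2):
    """Mean of the two normal velocity vectors (the linear solution)."""
    return 0.5 * (speed1 * np.asarray(normal1, dtype=float)
                  + speed2 * np.asarray(normal2, dtype=float))

# Two orthogonal gratings drifting rightward (0 deg) and upward (90 deg)
# at equal speed, as in a Type I plaid: both rules give the same 45 deg
# direction, and only the recovered speed differs.
v_ioc = ioc_solution((1, 0), 1.0, (0, 1), 1.0)    # [1.0, 1.0]
v_avg = vector_average((1, 0), 1.0, (0, 1), 1.0)  # [0.5, 0.5]
```

For configurations in which the two constraint lines place the IOC outside the arc spanned by the component normals, the two rules no longer agree in direction, which is what makes the Type II plaids introduced below diagnostic.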
The simplest solution is to compute the vector average of the different 1D motions. For one family of plaid
patterns, called Type I plaids by Ferrera and Wilson
[33], the perceived direction corresponds to this linear
solution. In humans, we have recorded the ocular following responses to both single grating and plaid pattern motions [66]. Plaid patterns were constructed by
summing two low spatial frequency gratings whose
orientations and motion directions differed by 90°. We
found that responses were initiated at the same ultra-short latency (∼85 ms) by both types of stimuli and were
very similar when a single grating or a Type I plaid
pattern moved in the same direction (Fig. 3, left panel).
Moreover, a trial-by-trial analysis revealed that ocular
following responses to a moving plaid can be predicted
from the vector average of the responses to its component gratings. Finally, the local signals do not need to
overlap to be averaged since similar results were observed when comparing the ocular following to stimulus
arrays made of 16 Gabor patches with either one or two
different carrier motion directions (Fig. 3, right panel).
These results suggest that tracking of a 2D motion can
be initiated after computing a spatial average of the
different 1D motion signals. Such linear combination of
local motion signals is extremely fast since ocular following to either multiple or single 1D motion stimuli
have similar ultra-short latencies.
Fig. 3. Ocular following responses to overlapping and non-overlapping grating motions. Left panels illustrate the mean horizontal (ė_h) and vertical
(ė_v) eye velocities of responses to a single grating (broken lines) or a Type I plaid (continuous lines), both moving in the same direction (45°, oblique
axis). The Type I plaid was made of two orthogonal gratings moving in the rightward (0°) and the upward (90°) directions. Right panels show similar
results obtained with an array of Gabor patches, all moving in the same direction (broken lines) or each half moving in two orthogonal directions
(continuous lines). Vertical dotted lines indicate an estimate of latencies for horizontal and vertical responses. Data from [66].
There are neurons in monkey area MT that do respond to Type I plaids and therefore seem able to encode
the pattern motion direction [97], irrespective of the
orientation of the component gratings. These pattern
cells are different from the component cells that are
activated only when one of the two gratings is moving in
its preferred direction [86]. To the extent that a component cell ‘‘sees’’ only the local motion generated by
each grating, the directional tuning curve for the plaid
stimulus must have separate peaks corresponding to
each plaid motion direction that drifts one of the gratings
along its preferred direction. Moreover, the difference
between the two peaks must be equal to the angular
difference in the orientation of the two component
gratings. On the contrary, a pattern cell must have a
directional tuning curve presenting only one peak, which
is aligned with the pattern motion direction, irrespective of the two component motion directions. It was
then found that the two sub-populations of cells are
present at the level of area MT, but that only component
cells are found at the level of area V1 [86]. A recent
study by Pack et al. [92] unveiled a more complex picture in area MT of behaving monkeys: for many cells,
responses changed over time. Over the first ∼20 ms of
most cells' responses, direction tuning was bimodal and
consistent with the ‘‘component prediction’’. However,
the subsequent direction tuning, as determined by
averaging firing rate over the next 1500 ms of stimulus
presentation, was unimodal and consistent with the
‘‘pattern prediction’’. Clearly, the neural responses undergo complex temporal dynamics in which the earliest
part reflects predominantly the ambiguous 1D grating
motions. Thereafter, direction selectivity gradually
converges towards the unambiguous 2D pattern motion.
These results suggest that local, 1D motion cues have the
fastest access to area MT. Consistent results were found
by the group of Movshon, who described the dynamics
of a large population of MT cells in opiate-anesthetized
monkeys. Following plaid pattern presentation, component cells emerged rapidly (<70 ms) but pattern cells
exhibited a more progressive build-up of their plaid motion selectivity, the entire sub-population being stabilised ∼120–140 ms after stimulus onset. Some cells
effectively demonstrated a bi-phasic dynamics, being
first component-like and then pattern-like [87].
How can we explain the averaging process observed
for ocular following responses to Type I plaids? Since
these responses had the usual ultra-short latencies, they
cannot be explained on the sole basis of the pattern
cells sub-population. We found no evidence for a slow
build-up of the responses or a delayed onset. Obviously, if both component and pattern cells first encode
1D local motion, a simple read-out such as a vector
averaging or summation of the population activity
would be sufficient to explain our behavioural results.
The existence of such a read-out has been demonstrated by micro-stimulation studies in area MT in the
context of smooth pursuit initiation [39]. Moreover, a second, faster averaging mechanism can take place at the level of MT since several studies have demonstrated that MT cells average multiple inputs (e.g.
[12,32,95,105]). Thus, average component velocity can
be represented at the level of single neurones and drive
the ocular following responses, without the need for
pattern selective cells. Further studies are needed to
demonstrate whether the averaging process takes place
at the level of MT cells or at the level of its population
read-out (see [32,95]).
Thus, the key question remains open: how can we probe the temporal dynamics of the build-up of the surface motion representation and disentangle it from the simpler averaging solution? Fortunately, there
are instances where the 2D pattern motion direction
(i.e. the IOC solution) cannot be predicted by a linear
combination of the component motions. These plaids
have been called ‘‘Type II plaids’’ by Ferrera and
Wilson [33] and they offer an excellent opportunity to
test the contribution of the different computational
solutions for motion integration. Psychophysical studies
by the group of Hugh Wilson provided two seminal
results. First, the perceived pattern motion direction of
Type II plaids evolves over time, from the vector
average to the IOC predictions. Second, with stimulus
durations longer than 100 ms the perceived direction
can be predicted from the vector sum between first- and
second-order motion signals [121]. In plaids, first-order
motions correspond to the sinusoidal luminance grating
motions. Second-order motion can be extracted from
them through a filter–rectify–filter scheme [118]. When
pattern and grating motions have different directions, if these motion signals have different dynamics we should be able to tease them apart in the tracking
responses. We carefully investigated the open-loop,
initial part of the ocular following to uni-kinetic plaids.
Uni-kinetic plaids are a limiting case of Type II plaids,
where a single moving grating is added to a static grating of a different orientation [38]. The orientation of the static grating fully determines the pattern motion direction (Fig. 4a). We found that ocular
following was always first initiated in the direction of
the moving grating, at the usual short latency (about 85 ms). However, a second component was systematically observed about 25 ms later, which rotated the tracking responses towards the pattern motion direction (Fig. 4b
and c). Consequently, the 2D tracking direction evolved
over time as illustrated in Fig. 4d. With upward grating
motion, tracking was initiated and maintained in the
upward direction, as shown by the instantaneous mean
tracking direction vectors. When a static, oblique
grating was added to the same grating motion, the
initial vectors pointed in the direction of the grating
motion but after 20 ms, the instantaneous vectors
progressively shifted towards the oblique direction, that
is the pattern motion direction. A trial by trial analysis
indicated that final tracking direction reflected this
behaviour, with the mean value shifted about 30° away from the grating motion direction (Fig. 4d). Thus, processing of pattern motion cues is delayed by about 20 ms relative to grating motion processing, and tracking initiation slowly converges towards the pattern motion, indicating that 2D motion integration involves a slow and progressive build-up [66].
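The difference between the vector-average and IOC predictions can be made concrete with a short numerical sketch. The code below solves the intersection of constraints for a Type II plaid and compares it with the vector average of the component velocities; the grating directions and speeds are hypothetical values chosen only so that the two predictions clearly differ.

```python
import math

def ioc_velocity(n1, s1, n2, s2):
    """Intersection of constraints: the pattern velocity v satisfying
    v . n_i = s_i for both gratings (n_i = unit normal, s_i = speed)."""
    det = n1[0] * n2[1] - n1[1] * n2[0]
    vx = (s1 * n2[1] - s2 * n1[1]) / det
    vy = (n1[0] * s2 - n2[0] * s1) / det
    return vx, vy

def vector_average(n1, s1, n2, s2):
    """Vector average of the two component (normal) velocities."""
    return ((s1 * n1[0] + s2 * n2[0]) / 2.0,
            (s1 * n1[1] + s2 * n2[1]) / 2.0)

def unit(deg):
    return (math.cos(math.radians(deg)), math.sin(math.radians(deg)))

# Type II plaid: both component directions (10 and 50 deg) lie on the
# same side of the pattern direction, so the two predictions differ
n1, s1 = unit(10.0), 1.0
n2, s2 = unit(50.0), 1.5
v_ioc = ioc_velocity(n1, s1, n2, s2)
v_avg = vector_average(n1, s1, n2, s2)
dir_ioc = math.degrees(math.atan2(v_ioc[1], v_ioc[0]))  # outside [10, 50]
dir_avg = math.degrees(math.atan2(v_avg[1], v_avg[0]))  # between 10 and 50
```

For this choice of components, the IOC direction falls outside the range spanned by the two grating directions while the vector average falls between them, which is exactly the dissociation that Type II plaids exploit.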
Static and moving gratings cohere into a uni-kinetic
plaid pattern only if they have similar spatial frequencies
and contrast [26]. Indeed, we found that the amplitude
of the late tracking component was dependent upon the
relative spatial frequency of the gratings. Moreover, the early and late components exhibited different contrast response functions when tested independently. The earliest
component was characterised by a very high contrast
sensitivity, a steep contrast response function and a
saturation with high grating contrast values. On the
contrary, the late component showed a more sluggish
contrast response function with no or very little saturation [66]. Interestingly, these two contrast response
functions are very similar to those reported for the
magno-cellular and the parvo-cellular pathways of the
geniculo-cortical visual streams [99].
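These two contrast response functions can be sketched with the standard Naka–Rushton (hyperbolic-ratio) form. The parameter values below are purely illustrative, not fits to the data in [66]: a low semi-saturation contrast gives the early, magno-like behaviour (high sensitivity, early saturation), and a high one gives the late, parvo-like behaviour.

```python
def naka_rushton(c, r_max, c50, n):
    """Hyperbolic-ratio contrast response function:
    r(c) = r_max * c^n / (c^n + c50^n)."""
    return r_max * c**n / (c**n + c50**n)

# Illustrative parameters only: early component saturates at low
# contrast (magno-like), late component rises gradually (parvo-like).
early = lambda c: naka_rushton(c, r_max=1.0, c50=0.05, n=2.0)
late = lambda c: naka_rushton(c, r_max=1.0, c50=0.60, n=1.2)
```

With these numbers the early component is already near saturation at 40% contrast, whereas the late component keeps growing between 40% and 80% contrast, reproducing the qualitative dissociation described above.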
These behavioural results elucidate the dynamics of cortical motion processing. The temporal dynamics of ocular following eye movements reveal parallel processing of grating- and pattern-related motion cues.
These processes have different delays but converge onto
a single integrative stage. Such a computational scheme corresponds to the multiple motion pathways models already suggested by psychophysical studies [21,62,117,118]. These models propose that first- and second-order motions are processed through parallel pathways before being integrated to reconstruct the 2D surface motion. Wilson et al. [118] suggested that the non-Fourier pathway is slower, which is also supported by oculomotor studies [8,18,54,66].
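The filter–rectify–filter idea can be illustrated with a minimal numerical sketch (this is only a toy demonstration of the principle, not the actual model of [118]): a contrast-modulated carrier has no linear Fourier energy at its envelope frequency, but a rectifying non-linearity (squaring here) exposes the envelope to a subsequent linear motion stage. The frequencies and the choice of squaring are assumptions for illustration.

```python
import math

def dft_mag(xs, freq):
    """Magnitude of the DFT of xs at integer frequency `freq`
    (cycles per window), normalised by the window length."""
    n = len(xs)
    re = sum(x * math.cos(2 * math.pi * freq * i / n) for i, x in enumerate(xs))
    im = sum(x * math.sin(2 * math.pi * freq * i / n) for i, x in enumerate(xs))
    return math.hypot(re, im) / n

n = 512
f_carrier, f_env = 64, 4          # carrier and envelope frequencies
# First stage: a contrast-modulated carrier.  Its linear spectrum
# contains the carrier and its sidebands, but nothing at f_env.
sig = [(0.5 + 0.5 * math.sin(2 * math.pi * f_env * i / n))
       * math.sin(2 * math.pi * f_carrier * i / n) for i in range(n)]
# Rectification (here, squaring) demodulates the envelope.
rect = [s * s for s in sig]
before = dft_mag(sig, f_env)      # ~0: invisible to linear filters
after = dft_mag(rect, f_env)      # envelope now has Fourier energy
```

A purely linear (Fourier) motion detector is blind to the envelope of `sig`; after the rectification stage, the envelope frequency appears and can be tracked by a second linear filtering stage.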
Fig. 4. Early and late ocular following responses to uni-kinetic plaids. (a) A uni-kinetic plaid moving in the rightward–upward (p1) or the leftward–downward (p2) direction is constructed by summing a static oblique grating and a horizontal grating moving either upward (g1) or downward (g2), respectively. (b) Mean horizontal and vertical eye velocity in response to each grating (continuous line) or plaid (broken line) motion direction. Note that for plaid motion, the vertical response is initiated at ultra-short latency but the horizontal response is initiated only 25 ms later. The latency estimates for each component are indicated by the broken vertical lines. (c) For three subjects, mean (±SD) latency of horizontal and vertical responses to plaid motions in the rightward–upward (p1, top panel) and the leftward–downward (p2, lower panel) directions. Notice that the latency of the horizontal component is always longer than the latency of the vertical component. (d) Left panel indicates the frequency distribution of the mean final direction (time window: 95–135 ms) of ocular following to either grating (broken line) or plaid (continuous line) motion. Right panel shows the mean tracking vector, sampled every 4 ms, in response to the same moving grating or uni-kinetic plaid. Data from [66].
5. Towards surface motion: 2D features integration
Many authors have suggested that second-order
motion processing is in fact a texture grabber followed
by motion energy detectors (e.g. [62,117]). From a
computational point of view, the same algorithm can be
applied to extract single, localised 2D features and non-localised, periodic second-order cues [59]. Therefore, the
two pathways scheme proposed by Wilson et al. [118] to
compute the 2D pattern motion direction of a plaid
appears to be very similar to the feature tracking solution indicated above [117]. It has long been recognised
that individual visual features are essential for various
visual tasks such as pattern recognition, surface segmentation and so on. Wallach [113] already indicated
the critical role of 2D visual features in reconstructing
2D surface motion. The barber-pole illusion offers a
simple but very powerful tool to investigate the basic rules underlying motion integration (e.g. [20,34,94,101]). When seen behind a large aperture, the perceived global direction of a set of bars depends on the shape of the aperture and on the geometrical relationships between the bars and the aperture edges. The laboratory version of it is illustrated in Fig. 5a, where a horizontal sinusoidal grating is presented behind three instances of aperture aspect ratio (the ratio between the two axes of the aperture) and main orientation (the orientation of the long axis). If the grating is set into upward motion, the global perceived direction of the surface within the aperture is either upward (case 2), upward–rightward (case 1) or upward–leftward (case 3). Pairwise comparisons between these instances reveal the main phenomenon of the so-called ‘‘barber-pole illusion’’ (i.e. the fact that the same grating motion yields different perceived motions). Comparison between cases 1 and 3 hence demonstrates that the perceived direction is biased
Fig. 5. Ocular following responses to the barber-pole stimuli. (a) A horizontal grating, moving upward, is seen behind three different apertures. A diamond is an aperture with an aspect ratio of 1, and the perceived direction is upward (case 2, continuous arrow). With elongated apertures (aspect ratio = 3), tilted counter-clockwise (case 1) or clockwise (case 3), the perceived direction is along the upward–leftward and upward–rightward directions (continuous arrows), respectively, corresponding to the classical ‘‘barber-pole illusion’’. Right panel illustrates the horizontal and vertical mean eye velocity of tracking responses to each condition. With the diamond aperture, the response is purely vertical (blue line) at the earliest latency. With the elongated apertures, the first component is upward, at the earliest latency (red lines), but a second component is initiated 20 ms later, either rightward (continuous red line) or leftward (broken red line), corresponding to the predictions based on perceived direction. (b) A framework for ocular following of global motion. Local motion signals from gratings and line-endings are extracted through parallel pathways which converge onto an integrative stage, albeit with an additional delay (d) for feature processing. The integrative stage (presumably visual area MT in primates) averages the incoming signals so that the net population vector evolves over time. Data from [71].
towards the orientation of the long axis of the aperture.
Comparing cases 2 and 3 indicates that the perceived
direction also depends upon the aperture aspect ratio.
The classical explanation for this is that three different local motion signals compete with each other to dominate the global motion percept. The grating provides ambiguous 1D motion signals from everywhere within the surface, but line-endings or terminators arise at the intersections between the grating and the aperture edges. These form 2D features whose motion is unambiguous. Terminator motions have different directions along the two orthogonal sets of aperture edges. For an
aspect ratio of 1, they cancel each other but, for higher
aspect ratios, one of the two feature motion directions
will dominate and therefore the global perceived direction will be biased towards it.
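The competition between the 1D grating signal and the 2D terminator signals can be captured by a toy vector-sum rule in which the net terminator vote grows with the aspect ratio, cancelling exactly for an aspect ratio of 1. Everything in this sketch (the unit weights, the linear dependence on aspect ratio) is an illustrative assumption, not a fitted model of the data.

```python
import math

def barber_pole_direction(aspect_ratio, long_axis_deg, grating_deg):
    """Toy vector sum of the ambiguous 1D grating signal and the net
    unambiguous terminator signal along the aperture's long axis.
    Terminator votes from the two orthogonal edge sets cancel when the
    aspect ratio is 1 (diamond aperture); otherwise the long-axis votes
    win in proportion to the aspect ratio (an illustrative rule)."""
    g = math.radians(grating_deg)
    a = math.radians(long_axis_deg)
    # pick the long-axis direction compatible with the grating motion
    if math.cos(a - g) < 0:
        a += math.pi
    t = aspect_ratio - 1.0            # net terminator vote
    vx = math.cos(g) + t * math.cos(a)
    vy = math.sin(g) + t * math.sin(a)
    return math.degrees(math.atan2(vy, vx))
```

With an upward-moving grating (90°), a diamond aperture yields a purely upward prediction, while elongated apertures tilted one way or the other pull the predicted direction towards their long axis, qualitatively reproducing the barber-pole bias.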
Ocular following to large barber-pole stimuli had the
striking behaviour illustrated in Fig. 5a (right panel)
[71]. Ocular responses were always initiated first in the
direction of the grating motion (i.e. upward for the
illustrated examples), at the usual ultra-short latency (about 85 ms), irrespective of the geometry of the aperture. A later component was initiated at a latency of about 110 ms in
the direction of the long axis of the aperture. The
amplitude, but not the latency of this later component
varied with the aperture aspect ratio, indicating that it reflected the output of a global motion integration. Indeed, by the end of the open-loop period the mean tracking direction was best predicted by a vector summation of the 1D and 2D motions in the display. The specific role of gratings and line-endings was demonstrated by independent modulation of either the early or the
late component. Adding a foveal mask of similar shape
inside the aperture and increasing its size up to 90% of
the grating area specifically reduced the earliest component but not the later one. On the contrary, changing the
orientation of line-ending motions by staircasing the
luminance profile of the aperture edges [94] specifically
decreased the amplitude, not the latency, of the later
tracking component [71].
These results further demonstrate that global motion
computation is a dynamical mechanism which integrates
different local motion cues processed through parallel
channels which have different latencies (Fig. 5b). The
two motion pathways model adequately renders this
dynamical process. Indeed, computational studies have
demonstrated that a filter–rectify–filter scheme can also
extract 2D features such as line-endings or corners [59].
As pointed out already, this class of model suggests that
the various visual motion cues converge onto an integrative stage, albeit with different latencies. Thus, at the
level of this integration stage, the coding of the global
motion should exhibit a temporal evolution closely related to that found for eye movements. As said above,
area MT is thought to implement this integration stage.
Recent work by Born and colleagues shows that single
neurone activity exhibits a temporal evolution very similar to that reported here for tracking eye movements.
They investigated the directional selectivity of MT
neurones when presented with a set of parallel bars
travelling orthogonal or oblique to their orientation [91].
They found that the earliest part of the response showed
a strong interaction between the orientation of the bars
and their motion direction: the earliest preferred direction corresponded to the direction orthogonal to the
bars. Later direction selectivity was however insensitive
to the bars orientation. At the population level, they
found that direction selectivity evolves over time from
the direction orthogonal to the orientation of a luminance edges to its actual direction, with a time constant
of 70 ms. This shift towards an orientation-independent, direction-selectivity, may reflect the temporal
evolution of the neural solution of the aperture problem.
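A descriptive way to capture such a shift is an exponential interpolation of the preferred direction from the edge-normal (1D) direction towards the true (2D) direction. The form below is only a sketch of this kind of fit; the default time constant matches the roughly 70 ms value quoted above, and angles are interpolated linearly, which is adequate for small angular differences.

```python
import math

def preferred_direction(t_ms, d_1d_deg, d_2d_deg, tau_ms=70.0):
    """Preferred direction drifting from the 1D (edge-normal) direction
    towards the true 2D direction with time constant tau_ms.  Angles
    are interpolated linearly (fine for small differences)."""
    w = math.exp(-t_ms / tau_ms)
    return d_2d_deg + (d_1d_deg - d_2d_deg) * w
```

At stimulus onset the cell reports the edge-normal direction; one time constant later only 1/e of the initial directional error remains, and at long durations the tuning is aligned with the actual 2D direction.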
The mechanism proposed to explain this temporal evolution is still a matter of hot controversy. One
hypothesis suggests a specific role for terminators, which
can be extracted by slower, non-linear mechanisms such
as second-order motion processing [59] or end-stopped
cells [60]. Our behavioural results link the dynamics of
plaid and barber-pole motion perception and therefore
argue in favour of this hypothesis. An alternative view is that the higher spatial frequency components of tilted-line patterns, which are related to the line-ending signals, have a lower contrast and are therefore extracted through the same linear filtering as the grating motion, but with a longer latency [63]. Once again, a critical test to decide between these two solutions is to probe whether or not ocular following responses to pure second-order motion, such as contrast-modulated random-dot patterns, are indeed delayed by about 20 ms relative to responses to grating motion.
6. A role for 3D cues: segmentation of surface motion
Our studies demonstrate that the initiation of tracking eye movements involves several different local motion processes that extract local 1D and 2D motion signals. These different local motions are integrated with different delays, and with rather slow temporal dynamics, which explains why the ocular behaviour evolves over time [66,71]. In the last part of this review, I
will show how ocular tracking depends on the counterpart of motion integration: motion segregation.
In a crowded environment, the visual motion of the
object of interest is surrounded by motions of other
surfaces. Hence, for the tracking mechanism to respond
selectively to the retinal motion of the object of interest,
it must ignore the retinal motion of other objects which
are nearer or farther away. In fact, earlier studies on
optokinetic responses to wide field motion have reported
that this is indeed the case and that binocular disparity
of the retinal images of these objects located outside the
plane of fixation might play a role [42]. Together with
Miles and his colleagues, we further examined this
hypothesis by looking at the ocular following to complex patterns where competing motions are presented in
a corrugated display [72]. Half of the bands were presented in the plane of fixation while the relative depth of
the other half was manipulated by changing the binocular disparity between their left and right eye images
(Fig. 6a, bottom panel). As a baseline condition, test
bands were presented alone (Fig. 6a, top panel) and
their motion drove strong ocular following responses at
the usual short-latency (Fig. 6b, ‘‘test bands only’’
velocity profile). On the contrary, when the antagonistic motions (conditioning bands) were presented simultaneously in the same plane of fixation, only minimal responses were observed (Fig. 6b, ‘‘0°’’ velocity profile).
We found that increasing the binocular disparity of this
conditioning motion stimulus had profound effects on the earliest ocular following: the larger the binocular disparity, the larger the tracking responses in the direction of the test bands. This relationship was
best summarised by the tuning curve relating the
amplitude of the earliest ocular following to the binocular disparity of the conditioning motion (Fig. 6c).
Maximum interaction (i.e. minimum responses) was
observed for no disparity but minimum motion interactions (i.e. larger responses in the test bands motion
direction) were found with disparities larger than 2–3°.
At these large disparity values, antagonistic motion
from the conditioning bands was almost completely
eliminated and the amplitude of earliest ocular following
closely matched that observed for the control, test bands
only condition. Thus, ocular following exhibits a bell-shaped sensitivity to the binocular disparity of competing
motion [72].
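The bell-shaped dependence can be summarised by a simple descriptive model in which the suppression exerted by the conditioning motion falls off as a Gaussian function of its disparity. All parameter values below are illustrative, not fits to the data in [72].

```python
import math

def following_amplitude(disparity_deg, baseline=1.0,
                        suppression=0.9, sigma=1.0):
    """Amplitude of the earliest ocular following in the test-band
    direction when antagonistic motion is shown at the given disparity.
    Suppression is modelled as a Gaussian of disparity centred on the
    plane of fixation; parameters are illustrative, not fitted."""
    gate = math.exp(-(disparity_deg ** 2) / (2.0 * sigma ** 2))
    return baseline * (1.0 - suppression * gate)
```

The model reproduces the qualitative tuning: maximal interaction (minimal response) at zero disparity, a symmetric fall-off on either side of the fixation plane, and near-baseline responses once the conditioning bands are a few degrees away in depth.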
Binocular, direction-selective neurones in areas MT
and MST have the requisite receptive field sensitivity to
implement this ultra-fast image segmentation based on
binocular disparity. MT neurones respond to motion
with a strong direction-selectivity and antagonistic motion presented simultaneously within the receptive field
has a strong inhibitory influence [105]. Most of them are
also disparity-sensitive such that they respond best to
motion presented at the preferred depth [75]. As a
consequence, when motion in the non-preferred direction is presented outside the preferred-disparity range, its antagonistic influence decreases [11]. Similar mechanisms have been demonstrated in area MSTl, albeit with
a different spatial arrangement of antagonistic motions.
Motion presented in the surround of the receptive field has strong modulatory effects on the responses evoked by motion in the centre, and such modulation also depends on the disparity of the surround stimulus [29].
Interestingly, ocular following responses in monkeys
are also modulated by surround, antagonistic motions
[84] and this modulation depends on the binocular disparity of the surrounding stimuli [47]. Consistently,
Fig. 6. Binocular disparity sensitivity of the early component of ocular following. (a) A pattern is made of alternating stripes with random dots
moving either rightward or leftward. In the control condition, the test bands are presented alone, always in the plane of fixation. In the test condition,
the test bands (e.g. leftward motion) are presented in the plane of fixation but interlaced with conditioning bands (e.g. rightward motion) whose
binocular disparity is manipulated. (b) Mean version velocity profiles of ocular following responses to the conditioning bands only (rightward eye
movements, continuous line), the test band only (leftward eye movements, continuous line) or when both bands are presented together (broken lines).
Numbers indicate the binocular disparity of the conditioning bands (in degrees). (c) Mean change in version position, as a function of the disparity of the
conditioning bands. Responses are minimal when both patterns are in the plane of fixation (disparity = 0°). As the disparity of the antagonistic
motion increases, the amplitude of the responses in the leftward direction increases, up to the base-line values observed with the test bands only
condition. Modified from [72].
Takemura et al. [109] showed that horizontal disparity
steps applied to the image during the centering saccade
have similar effects both on the earliest ocular following (see also [15]) and on the mean firing rate of MST neurons. Altogether, these results suggest that the sensitivity of the
earliest ocular following to binocular disparity is mediated by disparity-selective motion detectors tuned to the
binocular plane of fixation. These neurons weight the
different motion signals within their receptive field to
implement an automatic motion segmentation process
where local motions within the plane of fixation are
selectively integrated together and local motions outside
the plane of fixation are eliminated.
A striking aspect of our results is that such disparity-sensitive modulation was observed for the earliest phase
of ocular following: antagonistic motion was eliminated
immediately when disparity was introduced such that
tracking was initiated in the direction of the motion
presented in the plane of fixation. Similar results, albeit
with a slightly different and less direct paradigm, were
observed in monkeys [15]. They suggest that the earliest
component of ocular following is sensitive to binocular
disparity. As a consequence, 1D local motion measurements within the plane of fixation can be linearly integrated while motions outside the plane of fixation are eliminated. Within the framework outlined above, the
fast and coarse representation of surface motion is
nevertheless sensitive to binocular 3D cues. On its way
to computing 2D surface motion, the visual system takes
advantage of some low-level 3D cues to restrict the
number of local motion signals being averaged. Therefore, 3D contextual integration occurs at the most elementary stages of motion processing, affecting the most
preliminary and coarse neural representation of the
pursued object.
It is possible that other contextual 3D cues are involved in surface motion integration and segregation, but their dynamics are largely unknown. Psychophysical
studies have suggested that the spatial integration of
motion signals is gated by depth cues other than disparity. For instance, the perceived direction of a barber-pole stimulus closely depends on the 3D interpretation
of the visual scene based on occlusion cues [20,101].
When the grating is perceived as moving behind an
occluding surface, global motion direction is predominantly perceived along the axis orthogonal to the grating
orientation. A possible explanation is that, under these
circumstances, line-endings are interpreted as being
‘‘extrinsic’’ to the moving surface and their contribution
to the motion integration process is vetoed. On the
contrary, when the moving surface is perceived as being
in front of the background, line-endings are interpreted
as being ‘‘intrinsic’’, their motion is included in the 2D integration and the perceived direction is largely
consistent with the classical barber-pole illusion [101].
Castet et al. [20] demonstrated that the extrinsic/intrinsic
interpretation of 2D features depends upon the relative
disparity between the moving and the occluding surfaces
but even more on the monocular unpaired regions.
Interestingly, similar contextual weighting of motion signal integration has been found at the earliest level of cortical processing. Sugita [107] showed that some direction-selective neurones in macaque area V1 encode the
motion of partially invisible edges when seen behind an
occluding surface covering the centre of the receptive
field but presented in front (i.e. crossed disparity) of the
moving contours. Albright and co-workers presented
another example of contextual depth-ordering. In
monkey area MT, many neurones' responses represented the perceived direction of a barber-pole, and not the direction orthogonal to its orientation. These cell responses were sensitive to depth-ordering cues present at the aperture edges, which were located outside the receptive field [27].
What would be the dynamics of such contextual
modulation of motion integration? In none of these previous psychophysical or neurophysiological studies was the temporal evolution of these contextual effects analysed (for a review see [4]). Latency and temporal evolution analyses are sparse in the physiological literature despite the fact that, as shown above for plaid motion analysis in area MT, this information is critical to fully understand the processing performed by
the visual cortex. Again, our behavioural approach
could be very helpful to tackle this question. Our results
suggest that motions of 2D features such as line-endings
are extracted with a delay relative to grating motion. If
so, then contextual cues should have profound effects only on the later tracking component. Moreover, occlusion and binocular disparity cues should have different impacts on the responses, the latter but not the former being symmetrical relative to the plane of fixation. To know whether such contextual effects can be
seen at the earliest part of the late component or would
be further delayed is of considerable importance. If
contextual influences of 3D cues such as occlusion or
monocular unpaired regions are observed at the earliest
phase of the late tracking component, then most certainly 3D surface motion interpretation is embodied in the feed-forward motion processing implementing fast motion integration. Such a result would be similar to what has been found for the disparity sensitivity of the earliest ocular following. On the contrary, if occlusion-based contextual influences have a significant impact
only on later tracking phases (>110 ms), it would then
indicate that motion integration involves higher-order
rules that cannot be explained in terms of filtering
properties of the feed-forward motion processing.
7. Parallel and hierarchical motion processing is revealed by ocular following
The behavioural results that I presented herein suggest that several different motion processes play a role in the progressive build-up of the neural representation of the 2D surface motion which is used for tracking initiation. Fig. 7 summarizes the different processing stages involved. We have been able to identify an early tracking component with latencies of about 80 ms in humans and 55 ms in monkeys, which seems to be driven by local changes in luminance and therefore detected by first-order motion detectors. These detectors are binocular and disparity-selective, so that first-order motions outside the plane of fixation are filtered out. It
seems plausible that a rapid integration of these local
signals operates immediately after local motion sensing
since tracking responses to either a single motion or a
vector sum/average of several motions are indistinguishable. This linear mechanism implements a fast and
coarse motion integration restricted to elements within
the plane of fixation. A second tracking component
Fig. 7. 1D, 2D and 3D cues for object motion integration. A complex image where a moving object is embedded in a noisy background such as a
dense foliage is first processed by local, linear spatio-temporal filters. Motion signals located in the plane of fixation are automatically segregated
because of the disparity sensitivity of local motion detectors. These local measurements are then processed within two independent, parallel pathways. A rapid pathway extracts 1D motion signals related to elongated edges and pools them to extract a quick, coarse estimate of the object motion.
A slower pathway implements a texture grabber mechanism that can extract local 2D features and compute their motion. These local, non-ambiguous motion signals are then fed into the linear solution to compute a correct, non-ambiguous estimate of the 2D object translation.
emerges about 20 ms later, which seems to be driven by local 2D features and, at least in monkeys, by pure second-order motion as well. This later component is critical to perfectly align the tracking eye movements with the global motion direction of the pursued surface, independently of its shape. Whether or not this integration depends upon depth-ordering interpretation of local motion signals is currently being investigated. Early and late tracking components can be modulated independently, suggesting that their respective inputs are processed in parallel. Finally, tracking direction undergoes a slow temporal evolution so that movements of the eyes slowly converge to the global motion direction, within a period of about 100 ms after stimulus onset.
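This two-pathway scheme can be sketched as a toy read-out in which a 1D pathway drives tracking from about 85 ms and a delayed 2D-feature pathway is added about 20 ms later, pulling the instantaneous direction towards the pattern motion. Only the approximate 85 ms and 20 ms latencies come from the results above; the gain time constant and the equal pathway weights are assumptions of this sketch.

```python
import math

def tracking_direction(t_ms, grating_deg, pattern_deg,
                       t_1d=85.0, delay_2d=20.0, tau=40.0):
    """Instantaneous tracking direction under a two-pathway read-out:
    the 1D (grating) signal drives the response from t_1d onward; the
    2D (feature) signal is added delay_2d ms later and grows with time
    constant tau (tau and the unit weights are illustrative)."""
    if t_ms < t_1d:
        return None                   # no response yet
    g = math.radians(grating_deg)
    p = math.radians(pattern_deg)
    w2d = 0.0
    if t_ms >= t_1d + delay_2d:
        w2d = 1.0 - math.exp(-(t_ms - t_1d - delay_2d) / tau)
    vx = math.cos(g) + w2d * math.cos(p)
    vy = math.sin(g) + w2d * math.sin(p)
    return math.degrees(math.atan2(vy, vx))
```

For an upward grating (90°) in a plaid whose pattern direction is 60°, the read-out first tracks straight upward, then rotates progressively towards the pattern direction as the delayed feature signal builds up, mimicking the temporal evolution of the tracking vectors in Fig. 4d.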
As said above, these results capture many of the key characteristics of a class of feed-forward models proposed for motion integration, which are known as the ‘‘two motion pathways models’’ (see [62,117]). These models postulate the existence of two parallel processing streams: a Fourier (or linear) and a non-Fourier (or non-linear) pathway. The former can extract changes in luminance of elongated edges while the latter acts as a non-linear texture grabber followed by a linear motion processing stage. Fourier and non-Fourier motion pathways are thought to correspond to the direct and indirect inputs from area V1 to MT (see [77] for a review). There are massive direct projections from spiny stellate neurones of V1 layer 4B to area MT [102]. These V1-4B neurones receive a predominant magno-cellular input from layer 4Cα [119] and have a broad
spatial selectivity and a high temporal resolution [85].
Since there are MT neurones that are activated by moving random dots at latencies that precede ocular following by about 10 ms in monkeys (see Fig. 1, [48]), I
suggest that this direct input to area MT is responsible
for the earliest initiation of ocular following, presumably via visual area MST. This hypothesis explains
several experimental results about the role of linear
motion processing. In particular, it is consistent with the
high contrast sensitivity of the earliest phase of ocular
following, which could be explained by the predominant
magno-cellular input to this fast V1-MT route [99].
There is also an indirect route from V1 to MT, which relays in area V2 [25,76]. This indirect route originates from V1 layer 4B pyramidal neurons that receive mixed M and P signals via inputs from both layers 4Cα and 4Cβ and in turn project to the thick stripes of V2 [76,119]. Visual area V2 plays a key role along this indirect pathway, which is thought to correspond to the non-Fourier stream of the two motion pathways models. This role is supported by several facts.
First, lesioning V2 produces strong deficits in texture
discrimination but no deficit in orientation discrimination, contrast sensitivity or detection of low level,
luminance-based visual cues [78]. Second, there is evidence for rapid, selective neuronal responses to different
types of second-order stimuli such as illusory contours
in macaque visual area V2 (see [6] for a review). Third,
in cat area 18, there are neurones that respond to both
first-order (i.e. luminance grating) and second-order
(contrast-modulated gratings) motion [122]. However, cell responses to second-order motion are delayed relative to responses to first-order motion [64]. Similar evidence about the dynamics of neuronal responses to second-order motion in area V2 is still lacking in primates, but it is known that a small subgroup of neurons in monkey area MT responds to various types of second-order motion [3,22,90]. One hypothesis is that these MT
cells are driven by inputs from visual area V2. Within
this framework, our results suggest that the early and late components of ocular following responses reflect the successive inputs of the parallel direct and indirect pathways onto area MT (Fig. 5), which in turn projects to area MST and then to the brainstem oculomotor system
(see Fig. 1).
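The contrast between the early, vector-average readout and the later, accurate global-motion solution can be made concrete with a small sketch. The code below is an illustrative computation, not a model taken from the reviewed work: each 1D component is reduced to its normal velocity, the fast readout is the vector average of those normals, and the true 2D pattern velocity is recovered by the intersection-of-constraints (IOC) construction; the example plaid parameters are arbitrary.

```python
import math

# Each 1D component is (normal_speed, normal_direction_rad): a drifting
# grating only constrains the velocity component along its normal.

def vector_average(components):
    """Mean of the 1D normal-velocity vectors (the fast, biased readout)."""
    vx = sum(s * math.cos(th) for s, th in components) / len(components)
    vy = sum(s * math.sin(th) for s, th in components) / len(components)
    return vx, vy

def intersection_of_constraints(c1, c2):
    """Solve v . n_i = s_i for two gratings: the true 2D pattern velocity."""
    (s1, t1), (s2, t2) = c1, c2
    a, b = math.cos(t1), math.sin(t1)
    c, d = math.cos(t2), math.sin(t2)
    det = a * d - b * c          # zero if the two gratings are parallel
    return (s1 * d - s2 * b) / det, (a * s2 - c * s1) / det

def direction_deg(v):
    return math.degrees(math.atan2(v[1], v[0]))

# Unequal component speeds make the two solutions diverge in direction.
comps = [(1.0, math.radians(0.0)), (1.2, math.radians(45.0))]
va = vector_average(comps)
ioc = intersection_of_constraints(*comps)
print(f"vector average direction: {direction_deg(va):.1f} deg")
print(f"IOC (true 2D) direction:  {direction_deg(ioc):.1f} deg")
```

For this arbitrary plaid the two readouts differ by roughly ten degrees of direction, which is the kind of transient tracking error the behavioural experiments exploit.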
From a more general point of view, I believe that our
‘‘behavioural probe’’ illuminates modern conceptions of parallel and serial processing in the visual
system. Since the pioneering studies of single-unit
recording in visual cortex, the functional models of
cortical processing have oscillated back and forth between a serial (hierarchical) and a parallel conception
(see [13,77]). In the hierarchical model, several processing steps are implemented by a cascade of neurones
whose receptive fields are of increasing complexity [53].
The earliest models of feed-forward motion processing resonate with this conception: a first stage of motion detection is followed by a second stage of motion integration [86,103]. In parallel models, independent and simultaneous processing is carried out within modules specialised for different aspects of the visual input. The
modern conception that motion processing is done
within several pathways specialised for different motion
cues offers a functional equivalent of these cortical parallel pathways [62]. As pointed out by Bullier and Nowak [14], timing is a critical feature for deciphering the
functional organisation of the cortical visual system.
From an extensive review of the literature, Lamme and
Roelfsema [53] extracted a schematic temporal figure of
the feed-forward sweep of visually driven activity in
macaque cortex. Earliest activities (<40 ms) are seen in
both V1 and MT, followed by MST and the frontal eye
field (<50 ms). As pointed out above, results by Kawano
et al. [48] indicate a strict correlation between this early
activity in areas MT/MST and the initiation of ocular
following in macaque (latencies: 55 ms). Of particular
interest, earliest activities in area V2 are delayed by 20
ms relative to V1/MT onset (see [53]). I suggest that this
20 ms delay may explain the timing difference that we
found between early and late ocular following components. Moreover, a more detailed analysis of the latency distribution of neurones across the different layers of areas V1 and V2 has revealed that the largest difference occurs between the magnocellular and parvocellular streams running through them [88]. This observation
is consistent with our finding that early and late tracking
components have different contrast response functions,
which look very similar to those reported for the magno- and parvo-driven geniculo-cortical streams, respectively
[99].
In summary, our results fit very well with the view of the
functional organisation of the cortical motion stream in
which multiple parallel pathways process different local
motion cues but converge onto an integrative stage,
presumably MT/MST. Such a parallel and hierarchical organisation unveils a more complex temporal structure in which the integrative stage receives these parallel inputs with different delays. As a consequence, the neural representation of the object-target motion undergoes a progressive build-up, going from a crude representation of 1D edge motions to a fine representation of 2D surface motion within a complex 3D visual scene.
8. Conclusion
A growing body of evidence suggests that complex motion processing is involved in the visual stabilisation of gaze. How local velocity is encoded by MT cells and then used for smooth pursuit initiation in monkeys has been carefully investigated by the group of Steve Lisberger (e.g. [23,55,93]). How global target velocity is reconstructed for tracking eye movements is under investigation in several labs (e.g. [41,68,106]). Although I focused on reflexive tracking initiation in the present article, there is also strong evidence for similar dynamics of motion integration during voluntary smooth pursuit initiation in primates (e.g. [68,91]). For instance, in humans, we found that voluntary pursuit of line-drawing objects always starts in the direction of the vector average of the edge motions. It takes more than 150 ms before the tracking direction is perfectly aligned with the true object motion. Once again, the temporal evolution of smooth pursuit eye movements unveils a dynamics of the neural representation of object motion that is in close agreement with what we have found with reflexive, ocular following eye movements [68]. In brief, it now appears that the initiation of smooth eye movements depends on a complex global visual motion build-up with a rather slow temporal dynamics.
Throughout this review article, I have indicated the missing pieces of the puzzle. Further experimental work is needed to clarify the contributions of the first-, second- and third-order motion pathways. Moreover, the slow dynamics of 2D motion integration shown by the responses of MT cells, as well as by ocular tracking of plaid motion, casts doubt on the sufficiency of feed-forward models. Clearly, more experimental work is needed at both physiological and behavioural levels. Lastly, we focused on the first 200 ms of tracking initiation. Using third-order motion cues and complex depth-ordering of moving and occluding surfaces, we will be able to further investigate the build-up of surface motion and its relationship with the attention and mid-level vision mechanisms which, presumably, operate on a longer time scale. Finally, many recent studies indicate that steady-state tracking eye movements share many of the properties of motion segmentation and integration for perception (e.g. [9,50,106,112]). Altogether, these results call for a more complex front-end motion processing stage in models of the visual tracking sensorimotor transformation. They also open the door to future research: first, to unveil the ocular following-related neural activity within these various visual areas (in particular visual area V2) and, second, to decipher when and how such a neural representation is linked to higher-order perceptual and cognitive processing. Thus, a more comprehensive view of how perception and action are coupled will progressively emerge.
Acknowledgements
The author thanks Eric Castet, Jean Lorenceau,
Daniel Mestre and Frederick A. Miles for their help in
conducting this work and their contribution to the ideas
developed in this paper. This work was supported by the
CNRS, la Fondation pour la Recherche Médicale, le Ministère de la Recherche and the National Eye Institute, NIH. I thank Leland S. Stone, Alexa Riehle and
Laurent Goffart for comments on previous versions of
this manuscript.
References
[1] E.H. Adelson, J.R. Bergen, Spatio-temporal energy models for
the perception of motion, J. Opt. Soc. Am. A 2 (1985) 284–299.
[2] E.H. Adelson, J.A. Movshon, Phenomenal coherence of moving
visual patterns, Nature 300 (1982) 523–525.
[3] T.D. Albright, Form-cue invariant motion processing in primate
visual cortex, Science 255 (1992) 1141–1143.
[4] T.D. Albright, G.R. Stoner, Contextual influences on visual
processing, Annu. Rev. Neurosci. 25 (2002) 339–379.
[5] S.M. Anstis, Phi movement as a subtraction process, Vision Res.
10 (1970) 1411–1430.
[6] C.L. Baker Jr., Central neural mechanisms for detecting second-order motion, Curr. Opin. Neurobiol. 9 (1999) 461–466.
[7] H.B. Barlow, W.R. Levick, The mechanisms of directionally
selective units in rabbit’s retina, J. Physiol. (Lond.) 178 (1965)
477–504.
[8] P.J. Benson, K. Guo, Stages in motion processing revealed by the
ocular following response, NeuroReport 10 (1999) 3803–3807.
[9] B.R. Beutter, L.S. Stone, Human motion perception and smooth
pursuit eye movements show similar directional biases for
elongated apertures, Vision Res. 38 (1998) 1273–1286.
[10] O. Braddick, N. Qian, The organization of global motion and
transparency, in: J.M. Zanker, J. Zeil (Eds.), Motion Vision––
Computational, Neural and Ecological Constraints, Springer
Verlag, Berlin, 2001, pp. 86–112.
[11] D.C. Bradley, N. Qian, R.A. Andersen, Integration of motion
and stereopsis in middle temporal cortical areas of macaques,
Nature 373 (1995) 609–611.
[12] K.H. Britten, H.W. Heuer, Spatial summation in the receptive
fields of MT neurons, J. Neurosci. 19 (1999) 5074–5084.
[13] J. Bullier, Integrated model of visual processing, Brain Res. Rev.
134 (2001) 193–204.
[14] J. Bullier, L.G. Nowak, Parallel versus serial processing: new
vistas on the distributed organization of the visual system, Curr. Opin. Neurobiol. 5 (1995) 497–503.
[15] C. Busettini, G.S. Masson, F.A. Miles, A role for binocular
stereo cues in the rapid visual stabilization of the eyes, Nature
380 (1996) 342–345.
[16] C. Busettini, F.A. Miles, R.J.K. Krauzlis, Short-latency disparity
vergence responses and their dependence on a prior saccadic eye
movement, J. Neurophysiol. 75 (1996) 1392–1410.
[17] C. Busettini, E.J. FitzGibbon, F.A. Miles, Short-latency
disparity vergence in humans, J. Neurophysiol. 85 (2001) 1129–
1152.
[18] F. Bützer, U.J. Ilg, J.M. Zanker, Smooth-pursuit eye movements elicited by first-order and second-order motion, Exp. Brain
Res. 115 (1997) 61–70.
[19] R.H.S. Carpenter, Eye Movements, in: Vision and Visual
Dysfunctions, vol. 8, Macmillan, London, 1991.
[20] E. Castet, V. Charton, A. Dufour, The extrinsic/intrinsic classification of 2D motion signals in the barberpole illusion, Vision
Res. 39 (1999) 705–720.
[21] C. Chubb, G. Sperling, Two motion perception mechanisms
revealed by distance driven reversal of apparent motion, Proc.
Natl. Acad. Sci. USA 86 (1989) 2985–2989.
[22] J. Churan, U.J. Ilg, Processing of second-order motion stimuli in
primate middle temporal area and medial superior temporal area,
J. Opt. Soc. Am. A 18 (2001) 2297–2306.
[23] M.M. Churchland, S.G.L. Lisberger, Apparent motion produces
multiple deficits in visually guided smooth pursuit eye movements
in monkeys, J. Neurophysiol. 84 (2000) 216–235.
[24] B.G. Cumming, A.J. Parker, Responses of primary visual cortical
neurons to binocular disparity without depth perception, Nature
389 (1997) 280–283.
[25] E.A. DeYoe, D.C. Van Essen, Segregation of efferent connections and receptive field properties in visual area V2 of the
macaque, Nature 317 (1985) 58–61.
[26] K.R. Dobkins, G.R. Stoner, T.D. Albright, Perceptual, oculomotor and neural responses to moving color plaids, Perception
27 (1998) 681–709.
[27] R.O. Duncan, T.D. Albright, G.R. Stoner, Occlusion and the
interpretation of visual motion: perceptual and neuronal effects
of context, J. Neurosci. 20 (2000) 5885–5897.
[28] M. Egelhaaf, A. Borst, Are there separate ON and OFF channels
in fly motion vision?, Vis. Neurosci. 8 (1992) 151–164.
[29] S. Eifuku, R.H. Wurtz, Response to motion in extrastriate area
MSTl: disparity sensitivity, J. Neurophysiol. 82 (1999) 2462–
2475.
[30] R.C. Emerson, M.N. Citron, W.J. Vaughn, S. Klein, Nonlinear
directionally selective subunits in complex cells of cat striate
cortex, J. Neurophysiol. 58 (1987) 33–65.
[31] C.L. Fennema, W.B. Thompson, Velocity determination in
scenes containing several moving objects, Comp. Graph. Image
Proc. 9 (1979) 301–315.
[32] V.P. Ferrera, S.G.L. Lisberger, Neuronal responses in visual
areas MT and MST during smooth pursuit target selection, J.
Neurophysiol. 78 (1997) 1433–1446.
[33] V.P. Ferrera, H.R. Wilson, Perceived direction of moving two-dimensional patterns, Vision Res. 30 (1990) 272–285.
[34] N. Fisher, J.M. Zanker, Directional tuning of the barberpole
illusion, Perception 30 (2001) 1321–1336.
[35] N. Franceschini, A. Riehle, A. Le Nestour, Directionally selective
motion detection by insect neurons, in: H. Stavenga (Ed.), Facets
of Vision, Springer-Verlag, Berlin, 1989, pp. 360–390.
[36] R.S. Gellman, J.R. Carl, F.A. Miles, Short-latency ocular
following responses in man, Vis. Neurosci. 5 (1990) 107–122.
[37] H. Gomi, M. Shidara, A. Takemura, Y. Inoue, K. Kawano, M.
Kawato, Temporal firing patterns of Purkinje cells in the
cerebellar ventral paraflocculus during ocular following responses
in monkey. I. Simple spikes, J. Neurophysiol. 80 (1998) 832–848.
[38] A. Gorea, J. Lorenceau, Directional performances with moving
plaids: component-related and plaid-related modes coexist, Spat.
Vis. 5 (1991) 231–252.
[39] J.M. Groh, R.T. Born, W.T. Newsome, How is a sensory map
read-out? Effects of microstimulation in visual area MT on
saccades and smooth-pursuit eye movements, J. Neurosci. 17
(1997) 4312–4330.
[40] M.J. Hawken, K.R. Gegenfurtner, Pursuit eye movements
to second-order motion targets, J. Opt. Soc. Am. A 18 (2001)
2282–2296.
[41] S.J. Heinen, S.N.J. Watamaniuk, Spatial integration in human
smooth pursuit, Vision Res. 38 (1998) 3785–3794.
[42] I.P. Howard, E.G. Gonzales, Human optokinetic nystagmus in response to moving binocularly disparate stimuli, Vision Res. 27
(1987) 1807–1816.
[43] U.J. Ilg, Slow eye movements, Prog. Neurobiol. 53 (1997) 293–
329.
[44] Y. Inoue, A. Takemura, K. Kawano, M.J. Mustari, Role of the
pretectal nucleus of the optic tract in short-latency ocular
following responses in monkeys, Exp. Brain Res. 131 (2000)
269–281.
[45] K. Kawano, Ocular tracking: behaviour and neurophysiology,
Curr. Opin. Neurobiol. 9 (1999) 467–473.
[46] K. Kawano, M. Shidara, S. Yamane, Neural activity in dorsolateral pontine nucleus of alert monkey during ocular following
responses, J. Neurophysiol. 67 (1992) 680–703.
[47] K. Kawano, Y. Inoue, A. Takemura, F.A. Miles, Effect of
disparity in the peripheral field on short-latency ocular following
responses, Vis. Neurosci. 11 (1994) 833–837.
[48] K. Kawano, M. Shidara, Y. Watanabe, S. Yamane, Neural
activity in cortical area MST of alert monkey during ocular
following responses, J. Neurophysiol. 71 (1994) 2305–
2324.
[49] E.L. Keller, S.J. Heinen, Generation of smooth pursuit eye
movements: neuronal mechanisms and pathways, Neurosci. Res.
11 (1991) 79–107.
[50] R.J. Krauzlis, S.A. Adler, Effects of directional expectation on
motion perception and pursuit eye movements, Vis. Neurosci. 18
(2001) 365–376.
[51] R.J. Krauzlis, L.S. Stone, Tracking with the mind’s eye, Trends
Neurosci. 22 (1999) 544–550.
[52] K. Krug, B.G. Cumming, A.J. Parker, Response of single MT
neurons to anti-correlated stereograms in the awake macaque,
Soc. Neurosci. Abstr. 25 (1999) 109–119.
[53] V.A. Lamme, P.R. Roelfsema, The distinct modes of vision
offered by feedforward and recurrent processing, Trends Neurosci. 23 (2000) 571–579.
[54] A. Lindner, U.J. Ilg, Initiation of smooth pursuit eye movements
to first-order and second-order stimuli, Exp. Brain Res. 133
(2000) 450–456.
[55] S.G. Lisberger, J.A. Movshon, Visual motion analysis for pursuit
eye movements in area MT of macaque monkeys, J. Neurosci. 19
(1999) 2224–2246.
[56] S.G. Lisberger, E.J. Morris, L. Tychsen, Visual motion processing and sensorimotor integration for smooth pursuit eye movements, Annu. Rev. Neurosci. 10 (1987) 97–129.
[57] M.S. Livingstone, B.R. Livingstone, Responses of V1 neurons to
reverse phi stimuli, in: 2nd Annual Meeting, Vision Sciences
Society, Sarasota, 10–15 May 2002, p. 49.
[58] M.S. Livingstone, C.C. Pack, R.T. Born, Two-dimensional
substructure of MT receptive fields, Neuron 30 (2001) 781–793.
[59] G. Löffler, H.S. Orbach, Computing feature motion without
feature detectors: a model for terminator motion, Vision Res. 39
(1999) 859–871.
[60] J. Lorenceau, M. Shiffrar, N. Wells, E. Castet, Different
motion sensitive units are involved in recovering the direction of
moving lines, Vision Res. 33 (1993) 1207–1217.
[61] Z.-L. Lu, G. Sperling, Second-order reversed phi, Percept. Psychophys. 61 (1999) 1075–1088.
[62] Z.-L. Lu, G. Sperling, Three-systems theory of human visual motion
perception: review and update, J. Opt. Soc. Am. A. 18 (2001)
2331–2370.
[63] N. Majaj, M.A. Smith, A. Kohn, W. Bair, J.A. Movshon, A role
for terminators in motion processing by macaque MT neurons?
in: 2nd Annual Meeting, Vision Sciences Society, Sarasota, 10–15
May, 2002, p. 147.
[64] I. Mareschal, C.L. Baker Jr., Temporal and spatial response to
second-order stimuli in cat area 18, J. Neurophysiol. 80 (1998)
2811–2823.
[65] D. Marr, S. Ullman, Directional selectivity and its use in early
visual processing, Proc. Roy. Soc. Lond. B 211 (1981) 151–180.
[66] G.S. Masson, E. Castet, Parallel motion processing for the
initiation of short-latency ocular following in humans, J. Neurosci. 22 (2002) 5149–5163.
[67] G.S. Masson, D.R. Mestre, A look in the black box: eye
movements as a probe of visual motion processing, Cah. Psychol.
Cog. 17 (1998) 807–829.
[68] G.S. Masson, L.S. Stone, From following edges to pursuing
objects, J. Neurophysiol. 88 (2002) 2869–2873.
[69] G.S. Masson, C. Busettini, F.A. Miles, Vergence eye movements
in response to binocular disparity without depth perception,
Nature 389 (1997) 283–286.
[70] G.S. Masson, D.R. Mestre, L.S. Stone, Speed tuning for motion
segmentation and discrimination, Vision Res. 39 (1999) 4297–
4308.
[71] G.S. Masson, Y. Rybarczyk, E. Castet, D.R. Mestre, Temporal
dynamics of motion integration for the initiation of tracking
responses at ultra-short latencies, Vis. Neurosci. 17 (2000) 753–
767.
[72] G.S. Masson, C. Busettini, D.-S. Yang, F.A. Miles, Short-latency
ocular following in humans: sensitivity to binocular disparity,
Vision Res. 41 (2001) 3371–3387.
[73] G.S. Masson, D.-S. Yang, F.A. Miles, Reversed short-latency
ocular following in humans, Vision Res. 42 (2002) 2081–
2087.
[74] G.S. Masson, D.-S. Yang, F.A. Miles, Version and vergence eye
movements in humans: open-loop dynamics determined by
monocular rather than binocular image speed, Vision Res. 42
(2002) 2853–2867.
[75] J.H.R. Maunsell, D.C. Van Essen, Functional properties of
neurons in middle temporal visual areas of the macaque monkey.
II. Binocular interactions and sensitivity to binocular disparity, J.
Neurophysiol. 49 (1983) 1148–1167.
[76] J.H.R. Maunsell, D.C. Van Essen, The connections of the middle
temporal visual area (MT) and their relationship to a cortical
hierarchy in the macaque monkey, J. Neurosci. 3 (1983) 2563–
2586.
[77] W.H. Merigan, J.H.R. Maunsell, How parallel are the primate visual pathways? Annu. Rev. Neurosci. 16 (1993) 369–
402.
[78] W.H. Merigan, T.A. Nealey, J.H.R. Maunsell, Visual effects of
lesions of cortical area V2 in macaques, J. Neurosci. 13 (1993)
3223–3334.
[79] D.R. Mestre, G.S. Masson, Ocular responses to motion parallax
stimuli: the role of perceptual and attentional factors, Vision Res.
37 (1997) 1627–1641.
[80] D.R. Mestre, G.S. Masson, L.S. Stone, Spatial scale of motion
segmentation from speed cues, Vision Res. 41 (2001) 2697–2713.
[81] F.A. Miles, The sensing of rotational and translational optic flow
by the primate optokinetic system, in: F.A. Miles, J. Wallman
(Eds.), Visual Motion and its Role in the Stabilization of Gaze,
Reviews in Oculomotor Research, vol. 5, Elsevier, New York,
1993, pp. 393–417.
[82] F.A. Miles, The neural processing of 3-D visual information:
evidence from eye movements, Eur. J. Neurosci. 10 (1998) 811–
822.
[83] F.A. Miles, K. Kawano, Visual stabilization of the eyes, Trends
Neurosci. 10 (1987) 153–158.
[84] F.A. Miles, K. Kawano, L.M. Optican, Short-latency ocular
following responses of monkey. I. Dependence on temporospatial properties of the visual input, J. Neurophysiol. 56 (1986)
1321–1354.
[85] J.A. Movshon, W.T. Newsome, Visual response properties of
striate cortical neurons projecting to area MT in macaque
monkeys, J. Neurosci. 16 (1996) 7733–7741.
[86] J.A. Movshon, E.H. Adelson, M.S. Gizzi, W.T. Newsome, The
analysis of visual moving patterns, in: C. Chagas, R. Gattass, C.
Gross (Eds.), Pattern Recognition Mechanisms, Springer, New
York, 1985, pp. 117–151.
[87] J.A. Movshon, M.A. Smith, N.J. Majaj, A. Kohn, W. Bair,
Dynamics of pattern motion signals in macaque area MT,
Perception 31S (2002) 101d.
[88] M.H.J. Munk, L.G. Nowak, P. Girard, N. Choulamountri, J.
Bullier, Visual latencies in cytochrome oxidase bands of macaque
area V2, Proc. Natl. Acad. Sci. USA 92 (1995) 988–992.
[89] A. Nieder, H. Wagner, Hierarchical processing of horizontal
disparity information in the visual forebrain of behaving owls, J.
Neurosci. 21 (2001) 4514–4522.
[90] L.P. O’Keefe, J.A. Movshon, Processing of first- and second-order motion signals by neurons in area MT of the macaque
monkey, Vis. Neurosci. 15 (1998) 305–317.
[91] C.C. Pack, R.T. Born, Temporal dynamics of a neural solution to
the aperture problem in visual area MT, Nature 409 (2001) 1040–
1042.
[92] C.C. Pack, V.K. Berezovskii, R.T. Born, Dynamic properties of
neurons in cortical area MT in alert and anesthetized macaque
monkeys, Nature 414 (2001) 905–908.
[93] N.J. Priebe, M.M. Churchland, S.G.L. Lisberger, Reconstruction
of target speed for the guidance of pursuit eye movements, J.
Neurosci. 21 (2001) 3196–3206.
[94] R.P. Power, B. Moulden, Spatial gating effects on judged motion
of gratings in apertures, Perception 21 (1992) 449–463.
[95] G.H. Recanzone, R.H. Wurtz, U. Schwartz, Responses of MT
and MST neurons to one or two moving objects in the receptive
field, J. Neurophysiol. 78 (1997) 2904–2915.
[96] W. Reichardt, Autocorrelation, a principle for the evaluation of
sensory information by the central nervous system, in: W.A.
Rosenblith (Ed.), Sensory Communication, Wiley, New York,
1961, pp. 303–317.
[97] H.R. Rodman, T.D. Albright, Single-unit analysis of pattern-motion selective properties in the middle temporal visual area
(MT), Exp. Brain Res. 73 (1989) 53–64.
[98] T. Sato, Reversed apparent motion with random dot patterns,
Vision Res. 29 (1989) 1749–1758.
[99] G. Sclar, J.H. Maunsell, P. Lennie, Coding of image contrast in
central visual pathways of the macaque monkey, Vision Res. 30
(1990) 1–10.
[100] M. Shidara, K. Kawano, Role of Purkinje cells in the ventral
paraflocculus in short-latency ocular following responses, Exp.
Brain Res. 93 (1993) 185–195.
[101] S. Shimojo, G. Silverman, K. Nakayama, Occlusion and the
solution to the aperture problem for motion, Vision Res. 29
(1989) 619–626.
[102] S. Shipp, S. Zeki, The organisation of connections between areas
V5 and V1 in macaque monkey visual cortex, Eur. J. Neurosci. 1
(1989) 308–331.
[103] E.P. Simoncelli, D.J. Heeger, A model of neuronal responses in
visual area MT, Vision Res. 38 (1998) 101–112.
[104] A.T. Smith, R.J. Snowden, Visual Detection of Motion, Academic Press, San Diego, 1994.
[105] R.J. Snowden, S. Treue, R.G. Erickson, R.A. Andersen, The
response of area MT and V1 neurons to transparent motion, J.
Neurosci. 11 (1991) 2768–2785.
[106] L.S. Stone, B.R. Beutter, J. Lorenceau, Visual motion
integration for perception and pursuit, Perception 29 (2000)
771–787.
[107] Y. Sugita, Grouping of image fragments in primary visual cortex,
Nature 401 (1999) 269–272.
[108] A. Takemura, Y. Inoue, K. Kawano, The role of MST neurons in
short-latency tracking eye movements, Soc. Neurosci. Abstr. 26
(2000) 1715.
[109] A. Takemura, Y. Inoue, K. Kawano, The effect of disparity on
the very earliest ocular following responses and the initial
neuronal activity in monkey cortical area MST, Neurosci. Res.
38 (2000) 93–101.
[110] A. Takemura, Y. Inoue, K. Kawano, C. Quaia, F.A. Miles,
Single-unit activity in cortical area MST associated with disparity-vergence eye movements: evidence for population coding, J.
Neurophysiol. 85 (2001) 2254–2266.
[111] J.P.H. Van Santen, G. Sperling, Elaborated Reichardt detectors,
J. Opt. Soc. Am. A 2 (1985) 300–321.
[112] J. Wallace, G.S. Masson, D.R. Mestre, P. Mamassian, The
efficiency of smooth pursuit for surface motion, Invest. Opth. Vis.
Sci. 42S (2001) 3330.
[113] H. Wallach, Über visuell wahrgenommene Bewegungsrichtung, Psycholog. Forsch. 20 (1935) 325–380 (English translation by S. Wuerger, R. Shapley, Perception 25 (1996) 1317–1367).
[114] T. Watanabe, High-level Motion Processing. Computational,
Neurobiological and Psychophysical Perspective, MIT Press,
Cambridge, 1998.
[115] A.B. Watson, A.J. Ahumada, Model of human visual motion
sensing, J. Opt. Soc. Am. A 2 (1985) 322–342.
[116] M. Wertheimer, Experimentelle Studien über das Sehen von Bewegung, Z. Psychol. 61 (1912) 161–265.
[117] H.R. Wilson, Non-Fourier cortical processes in texture form and
motion perception, in: P.S. Ulinsky (Ed.), Cerebral Cortex,
Models of Cortical Circuits, vol. 13, Kluwer Academic, New
York, 1999, pp. 445–477.
[118] H.R. Wilson, V.P. Ferrera, C. Yo, A psychophysically motivated
model for two-dimensional motion perception, Vis. Neurosci. 9
(1992) 79–98.
[119] N.H. Yabuta, A. Sawatari, E.M. Callaway, Two functional
channels from primary visual cortex to dorsal visual cortical
areas, Science 292 (2001) 297–300.
[120] K. Yamamoto, Y. Kobayashi, A. Takemura, K. Kawano, M.
Kawato, A mathematical model that reproduces vertical ocular
following responses from visual stimuli by reproducing the simple
spike firing frequency of Purkinje cells in the cerebellum,
Neurosci. Res. 29 (1997) 161–169.
[121] C. Yo, H.R. Wilson, Perceived direction of moving two-dimensional patterns depends on duration, contrast and eccentricity,
Vision Res. 32 (1992) 135–147.
[122] Y.-X. Zhou, C.L. Baker Jr., A processing stream in mammalian
visual cortex neurons for non-Fourier responses, Science 261
(1993) 98–101.