Integration of different features in guiding eye-movements
Frank Schumann, Alper Acik, Selim Onat & Peter König
Neurobiopsychology Department, Institute of Cognitive Science, Albrechtstrasse 28, 49076 Osnabrück
Introduction
In natural behaviour we actively attend to parts of a visual scene by moving our eyes. Models of such overt attention behaviour combine different local visual features in a bottom-up process to derive salient locations (e.g. Itti & Koch, 2001; Parkhurst et al., 2002). Here, we developed a Bayesian, data-driven measure of saliency and studied this feature integration process empirically, investigating the interaction of luminance, luminance contrast, texture contrast, edges and colour contrast during free viewing of natural stimuli. Specifically, we ask:
(1) How does the saliency of a feature depend on the feature values?
(2) How does the saliency of a feature vary in the context of a second feature?
We model this feature interaction with additive and multiplicative integration processes.
Example Stimuli and Features
The upper row shows example stimuli from all image categories; the middle row shows example feature maps for the first natural image; the bottom row shows fixations on the first natural image and a spatial control fixation distribution. We analyzed feature integration within the intensity channel and, for the colour data set, also across the intensity and colour channels.
Methods
We analyzed eye-tracking data from the free-viewing baseline conditions of two studies. A grayscale data set was taken from Acik et al. (submitted) and comprises 64 images in 4 categories (natural images, man-made scenes and objects, close-up faces, and fractals). Natural images were taken from the “Zurich Natural Image Database” (http://www.klab.caltech.edu/~wet/ZurichNatDB.tar.gz). A colour data set contained free-viewing data for 96 colour-calibrated images from the Kibale rainforest in Uganda (courtesy of Prof. Tom Troscianko, University of Bristol).
We generated feature maps for mean luminance (L), luminance contrast (LC), luminance contrast on low-pass filtered images (LCLP), texture contrast (TC), edges (Bar), red-green contrast (RG) and yellow-blue contrast (YB) at spatial scales of 1, 2 and 5 degrees, according to the respective measures used in Einhäuser & König (2003). Colour contrasts follow the definition of LC, applied to the respective channel of the DKL colour space (Derrington et al., 1984).
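For illustration, here is a minimal sketch of how such patch-based feature maps could be computed with NumPy/SciPy. The patch size in pixels, the use of a square rather than circular neighbourhood, and the normalization of the local standard deviation by the global mean luminance are assumptions of this sketch, not the exact definitions of Einhäuser & König (2003).

import numpy as np
from scipy.ndimage import uniform_filter

def luminance_feature_maps(img, patch_px):
    """Patch-based mean luminance and luminance-contrast maps.

    img      : 2-D array of pixel luminance values.
    patch_px : patch size in pixels corresponding to e.g. 1 degree of
               visual angle (a hypothetical value for this sketch).
    """
    img = img.astype(float)
    # Mean luminance: local average within the patch.
    mean_lum = uniform_filter(img, size=patch_px)

    # Luminance contrast: local standard deviation of luminance,
    # normalized by the global mean luminance (one common definition).
    mean_sq = uniform_filter(img ** 2, size=patch_px)
    local_std = np.sqrt(np.maximum(mean_sq - mean_lum ** 2, 0.0))
    lum_contrast = local_std / img.mean()

    return mean_lum, lum_contrast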
(1) Measuring Feature Interaction: a Bayesian Model of Salience
[Figure: (1) Eye Tracking Data: joint feature statistics in the images, P(LC, TC), and at gaze, P(LC, TC | Fix), yield the measured Bayesian salience model P(Fix | LC, TC). (2) Model: individual saliencies (marginals over LC 1° and TC 5°) are recombined by additive (+) and multiplicative (×) integration to reconstruct the measured integration.]

Salience ∝ FixationStatistics / ImageStatistics × Constant,
i.e. P(Fix | Feat1, Feat2) = P(Feat1, Feat2 | Fix) / P(Feat1, Feat2) × P(Fix)
Results
Linearity of individual feature salience
1) Some features contribute linearly and others non-linearly to a joint saliency map.
1A) We measured how the salience of a feature depends on the value of the feature. For some features, salience rises linearly with the feature value, whereas others contribute non-linearly to saliency, depending on the image category. Only luminance contrast appears to contribute linearly to saliency in all categories except natural images.
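One way such a linearity assessment could be quantified is by fitting a straight line to the single-feature salience curve and reporting the coefficient of determination. The sketch below is illustrative only; the function name and binning are assumptions rather than the analysis actually used.

import numpy as np

def linearity_r2(bin_values, salience):
    """R² of a straight-line fit of salience against the
    (percentile-binned) feature value."""
    slope, intercept = np.polyfit(bin_values, salience, deg=1)
    residuals = salience - (slope * bin_values + intercept)
    return 1.0 - residuals.var() / salience.var()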
2) Individual features contribute independently to salience. Feature integration can be explained by additive integration; however, multiplicative models fit the data similarly well.
(2) Modelling Feature Interaction
We modelled the empirical feature interaction data with additive and multiplicative interaction processes. We first derived the salience of each individual feature as the respective marginal distribution of the measured joint salience. We then recombined the two individual saliencies additively and multiplicatively. For additive interaction, the two individual salience distributions served as regressor variables. For multiplicative interaction, an interaction term was constructed as the outer product matrix of the individual saliencies. We used multiple regression to compare the global fit of the additive and multiplicative reconstructions of the empirical data using the coefficient of determination R².
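As an illustration of this reconstruction step, the sketch below derives the individual saliencies as marginal means of a measured joint salience map and compares additive and multiplicative least-squares models by R². The use of marginal means, the inclusion of an intercept term, and all names are assumptions of this sketch rather than the exact fitting procedure.

import numpy as np

def reconstruct_joint_salience(joint_salience):
    """Compare additive and multiplicative reconstructions of a measured
    joint salience map (a 2-D array over bins of two features)."""
    n1, n2 = joint_salience.shape
    s1 = joint_salience.mean(axis=1)      # marginal salience of feature 1
    s2 = joint_salience.mean(axis=0)      # marginal salience of feature 2

    # Expand the marginals to the full bin grid.
    S1 = np.repeat(s1[:, None], n2, axis=1)
    S2 = np.repeat(s2[None, :], n1, axis=0)
    y = joint_salience.ravel()

    def r_squared(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ beta
        return 1.0 - residuals.var() / y.var()

    ones = np.ones_like(y)
    # Additive model: intercept + salience_1 + salience_2.
    r2_add = r_squared(np.column_stack([ones, S1.ravel(), S2.ravel()]))
    # Multiplicative model: intercept + outer product of the saliencies.
    r2_mult = r_squared(np.column_stack([ones, np.outer(s1, s2).ravel()]))
    return r2_add, r2_mult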
Discussion
We used a Bayesian measure of salience to study the interaction of two features in eye-tracking data during free viewing of different image categories, and modelled the integration of two features in saliency with additive and multiplicative integration processes.
We find that (1) features contribute to salience in both linear and nonlinear ways, and (2) in general, knowing the value of one feature gives little information about the salience of another feature; that is, different features contribute independently to a joint saliency map. Additive integration of individual saliencies gives a good description of the data for most analysed feature combinations within and across feature channels. However, multiplicative integration explained the data equally well in most cases. In summary, we argue that the selection of salient regions can be explained by independent processing of individual features that are additively combined.
Contact:
The work reported here is based on the Master’s Thesis of Frank Schumann ([email protected]).
Acknowledgements:
Hans-Peter Frey, Daniel Lang (University of Osnabrück) & Tom Troscianko (University of Bristol) for eye-tracking data of colour-calibrated Uganda rainforest images; Lina Jansen and Stefan Scherbaum for help with statistical analysis and implementation.
Mutual information between the salience of two features
2A) We quantify the interaction of features in measured salience with the mutual information (MI) between the saliencies of the two features. We find that knowing the value of one feature gives little information (< 0.1 bit, out of a maximal MI of 4.3 bit) about the saliency of a second feature. Hence, features contribute independently to saliency.
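For illustration, here is a minimal sketch of how such an MI value could be computed, assuming the measured joint salience map is normalized into a joint probability table over the feature bins. The bin count implied by the maximal MI (log2(20) ≈ 4.3 bit for 20 bins per feature) and all names are assumptions of this sketch.

import numpy as np

def mutual_information_bits(joint_salience):
    """Mutual information (in bits) between the two feature dimensions of
    a joint salience map, treated as an unnormalized joint probability
    table."""
    p = joint_salience / joint_salience.sum()   # normalize to probabilities
    p1 = p.sum(axis=1, keepdims=True)           # marginal of feature 1
    p2 = p.sum(axis=0, keepdims=True)           # marginal of feature 2
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(p > 0, p / (p1 * p2), 1.0)
    return float(np.sum(np.where(p > 0, p * np.log2(ratio), 0.0)))

# With 20 x 20 bins, the maximal possible MI is log2(20) ≈ 4.3 bit.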
Correcting for spatial viewing biases
Subjects tend to look more at the centre of an image than at the periphery, independent of the images shown. Hence, our measure would overestimate the importance of features at central locations, introducing a dependence of the model on any regularities in the spatial arrangement of the images. We estimated spatial viewing biases by accumulating the available fixation mass over all subjects and images into a spatial control fixation distribution. To compensate for such biases, we weighted features according to this spatial control distribution, such that features at prominent central locations are counted less strongly in the fixation statistics.
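As an illustration of one way such a weighting could be implemented (the inverse-density weighting, the Gaussian smoothing, and all names here are assumptions of this sketch, not necessarily the procedure used on the poster):

import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_weights(fix_xy, control_xy, image_shape, smooth_px=30):
    """Weight each analysed fixation by the inverse of the spatial control
    fixation density at its location, so that locations favoured by the
    general central viewing bias count less in the fixation statistics."""
    h, w = image_shape
    # Spatial control distribution: fixations pooled over all subjects
    # and images, smoothed into a density map.
    density = np.zeros((h, w))
    for x, y in control_xy:
        density[int(y), int(x)] += 1.0
    density = gaussian_filter(density, sigma=smooth_px)
    density = density / density.sum() + 1e-9    # small floor avoids /0

    weights = np.array([1.0 / density[int(y), int(x)] for x, y in fix_xy])
    return weights / weights.mean()             # normalize to mean weight 1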
We derive joint salience, the fixation probability of a patch given two image features, from eye-tracking data and test feature integration for additive and multiplicative interaction. A) We acquire the joint distribution of the respective feature combinations in the images (image statistics) and at the subset of selected fixation points (gaze statistics). B) The empirical feature interaction in salience is calculated in a Bayesian manner as the relative difference between the likelihood of the two features at gaze and the respective prior probability of the two features in the images. This empirical saliency model describes the active selection of image regions based on two features. We quantify the interaction of features in saliency with mutual information. C) We use multiple regression to reconstruct the measured joint salience model with additive and multiplicative combinations of the individual salience distributions for the two features.
P(Feat1, Feat2 | Fixation) denotes the joint distribution of the two features in the subset of patches selected by the subjects. P(Feat1, Feat2) denotes the prior probability of a feature combination in the images, i.e. the image statistics; it reflects the probability of encountering a given feature combination under random sampling of the images. The relation between the gaze statistics (what was selected) and the image statistics (what could have been selected) shows how subjects actively choose feature combinations from the images; it is an empirical model of saliency. P(Fix) is a constant factor that depends on the number of images included in the analysis.
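A minimal sketch of how this empirical salience could be estimated from two-dimensional histograms follows; the helper name, the omission of the constant factor P(Fix), and the handling of empty bins are choices of this illustration, not the exact implementation.

import numpy as np

def bayesian_salience(feat1_img, feat2_img, feat1_fix, feat2_fix, edges1, edges2):
    """Empirical salience P(Fix | Feat1, Feat2) from 2-D histograms.

    feat1_img, feat2_img : feature values sampled over the whole images
    feat1_fix, feat2_fix : feature values at fixated patches
    edges1, edges2       : bin edges (e.g. percentile-based) per feature
    """
    # Image statistics: P(Feat1, Feat2), the prior over feature combinations.
    p_img, _, _ = np.histogram2d(feat1_img, feat2_img, bins=[edges1, edges2])
    p_img = p_img / p_img.sum()

    # Gaze statistics: P(Feat1, Feat2 | Fix), the likelihood at fixations.
    p_fix, _, _ = np.histogram2d(feat1_fix, feat2_fix, bins=[edges1, edges2])
    p_fix = p_fix / p_fix.sum()

    # Bayes: P(Fix | Feat1, Feat2) ∝ P(Feat1, Feat2 | Fix) / P(Feat1, Feat2);
    # the constant P(Fix) only scales the map and is left out here.
    with np.errstate(divide="ignore", invalid="ignore"):
        salience = np.where(p_img > 0, p_fix / p_img, np.nan)
    return salience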
Comparing different features
Different features have different units of measurement. To allow comparison across features, we placed the bin edges of each single-feature domain at percentiles of the occurrence of the respective feature in the images. Such histogram equalization of individual features is plausible in light of contrast gain control mechanisms (Ohzawa et al., 1982) and the temporal whitening of natural visual input (Dan et al., 1996) in the early visual system.
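A small sketch of such percentile-based binning; the choice of 20 bins and the names are illustrative assumptions.

import numpy as np

def percentile_bin_edges(feature_values, n_bins=20):
    """Bin edges placed at equally spaced percentiles of a feature's
    occurrence in the images, so that every bin contains roughly the same
    number of image patches (histogram equalization)."""
    percentiles = np.linspace(0, 100, n_bins + 1)
    return np.percentile(feature_values, percentiles)

# Usage with the earlier salience sketch (names are illustrative):
# edges_lc = percentile_bin_edges(lc_values_in_images)
# edges_tc = percentile_bin_edges(tc_values_in_images)
# salience = bayesian_salience(lc_values_in_images, tc_values_in_images,
#                              lc_values_at_fix, tc_values_at_fix,
#                              edges_lc, edges_tc)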
Models of saliency aim to describe how subjects actively use local features of an image to direct visual attention and gaze. Here, we ask whether the salience of one feature varies in the context of another, i.e. whether features interact. We measure the salience of two features as the conditional probability of fixating an image patch given the two feature values of that patch, P(Fix | Feat1, Feat2).
Goodness of fit for additive models
2B) We reconstruct the measured joint salience with additive and multiplicative interaction terms using multiple regression. Additive integration generally provides a good model of the measured feature interaction, both across and within feature channels. However, multiplicative models fit most feature combinations about as well as additive models (not shown here).
References:
Dan, Y., Atick, J. J. & Reid, R. C. (1996). Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory. J Neurosci (16).
Derrington, A. M., Krauskopf, J. & Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. J Physiol (357).
Einhäuser, W. & König, P. (2003). Does luminance contrast contribute to a saliency map for overt visual attention? European Journal of Neuroscience (17).
Itti, L. & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience (2).
Ohzawa, I., Sclar, G. & Freeman, R. D. (1982). Contrast gain control in the cat visual cortex. Nature (298).
Parkhurst, D., Law, K. & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research (42).