Robust Tracking and Remapping
of Eye Appearance with Passive
Computer Vision
Presented by Jason Moore
1. Introduction
• This paper discusses a single-camera iris-tracking
and remapping approach based
on passive computer vision.
• In other words: using a camera to track
an individual's gaze, specifically for
the purpose of computer control.
Gaze Estimation Applications
• Ophthalmology
• Psychology
• Neurology
• Marketing and Advertising
• HCI
• Aids for the disabled
Pupil & Iris
• The pupil and iris are monitored for gaze
estimation.
Gaze Estimation Techniques
• Categorized by:
– Degree of Intrusiveness
– Technology employed
• Active vs. Passive
– Cost
– Target application domains
Intrusive Techniques
• Require equipment to be put in physical
contact with the user.
• Examples of equipment:
– Electrodes
– Contact lenses
– Head mounted devices
Nonintrusive Techniques
• Use cameras to capture images of the
eye.
• Most commercial devices use IR light
reflected by the eye.
– These systems are fairly accurate but require
special and expensive hardware.
– Retain a degree of intrusiveness because of
active light emission.
– Can perform poorly with bad lighting
conditions or if the user is wearing glasses.
More IR problems
• Most IR-based systems require the user’s
head to remain still.
– This limits the degree of usability.
• IR-based systems that do not require the
user’s head to remain still do not yield
great accuracy.
Active Vs. Passive
• Active approaches rely on light emission to
track the eye.
• Passive approaches rely only on natural
light.
– Use off-the-shelf hardware to perform iris
localization and tracking.
– The iris is well suited to tracking due to its nearly
circular shape and its contrast with the sclera.
Gaze Tracking for HCI Is Difficult
• The remapping transformation of pupil
position to the computer screen is time
dependent and changes whenever the
user moves his head.
• Although difficult, it is an interesting
concept for its potential social and
commercial impact.
2. Iris Tracking
• Composed of three states:
– Iris localization
• Iris candidates are selected and passed to the
tracing state.
– Iris tracing
• The iris is searched for. If it is found, wait for the
next frame; otherwise go back to iris localization. If
the eye is closed, go to the Wait state.
– Wait
• This state is for both voluntary and involuntary eye
blinks.
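The three states above can be sketched as a small transition function. This is only an illustration: the names `State`, `next_state`, `iris_found`, and `eye_closed` are hypothetical, and the rule for leaving the Wait state (return to localization once the eye reopens) is an assumption, since the slides do not state it.

```python
from enum import Enum, auto


class State(Enum):
    LOCALIZATION = auto()
    TRACING = auto()
    WAIT = auto()


def next_state(state, iris_found=False, eye_closed=False):
    """Return the tracker state to use for the next frame.

    iris_found and eye_closed are hypothetical per-frame detector outputs.
    """
    if state is State.LOCALIZATION:
        # Candidates are always handed on to the tracing state.
        return State.TRACING
    if state is State.TRACING:
        if eye_closed:
            return State.WAIT
        # Keep tracing while the iris is found, otherwise re-localize.
        return State.TRACING if iris_found else State.LOCALIZATION
    # WAIT (assumed exit rule): stay until the eye reopens, then re-localize.
    return State.WAIT if eye_closed else State.LOCALIZATION
```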
Iris Tracking
Iris Localization
• In this state, the current frame is analyzed
to generate some initial guesses on the
position of the iris.
• The image is filtered to enhance the
contrast between the iris and the sclera.
• The potential iris locations are selected
based on image intensity.
Iris Localization
• One point of
interest has been
identified on the x-axis.
Two points of
interest have been
identified on the y-axis.
• Both hypotheses
will be passed to
the tracing state.
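One way to picture how points of interest on each axis combine into hypotheses is the sketch below. It assumes the points of interest are dark local minima of the mean intensity along columns and rows; the slides only say that candidates are selected by image intensity, so this projection scheme and the names `dark_minima` and `iris_hypotheses` are assumptions.

```python
from itertools import product


def dark_minima(profile, max_value):
    """Indices that are local minima of a 1-D intensity profile and dark enough."""
    picks = []
    for i in range(1, len(profile) - 1):
        is_min = profile[i] < profile[i - 1] and profile[i] <= profile[i + 1]
        if is_min and profile[i] <= max_value:
            picks.append(i)
    return picks


def iris_hypotheses(image, max_value):
    """image: list of rows of grey levels. Returns candidate (x, y) centers."""
    height, width = len(image), len(image[0])
    col_means = [sum(row[x] for row in image) / height for x in range(width)]
    row_means = [sum(row) / width for row in image]
    xs = dark_minima(col_means, max_value)  # points of interest on the x-axis
    ys = dark_minima(row_means, max_value)  # points of interest on the y-axis
    # Every (x, y) pairing is one hypothesis for the tracing state.
    return list(product(xs, ys))
```

With one point of interest on the x-axis and two on the y-axis, this yields two hypotheses, as in the slide.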
Iris Tracing
• Before considering the hypotheses
presented by the Iris Localization state,
search for the iris based on its last
location.
• If the iris is not found near the last
position, then consider the hypotheses
presented by the Iris Localization state.
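The search order described above (last known position first, localization hypotheses second) might look like the following sketch. `fit_near` is a hypothetical callback that searches a window around a position and returns a fit or `None`; it stands in for whatever fitting routine is actually used.

```python
def trace_iris(last_position, hypotheses, fit_near):
    """Return the first successful fit, or None if every starting point fails.

    fit_near(position) is a hypothetical callback returning a fit or None.
    """
    # Try the previous iris location before any localization hypothesis.
    starts = [] if last_position is None else [last_position]
    starts += list(hypotheses)
    for start in starts:
        fit = fit_near(start)
        if fit is not None:
            return fit
    return None
```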
Iris Tracing
The estimated iris position after initial failure to find the iris based
on its previous location.
The RANSAC Algorithm
• RANSAC stands for Random Sample
Consensus.
• A popular algorithm for model selection in
a data set containing both inliers and
outliers.
• Here the RANSAC
algorithm has been used to fit
a line to a set of data points
despite the large
number of outliers.
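A minimal sketch of RANSAC for the line-fitting case: repeatedly pick two random points, build the line through them, and keep the line with the most points within a distance tolerance. The function name, the parameter defaults, and the fixed seed are illustrative choices, not taken from the paper.

```python
import random


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Fit a*x + b*y + c = 0 (with a^2 + b^2 = 1) to points containing outliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # identical points define no line
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Inliers are the points within `tolerance` of the candidate line.
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers
```

Because a candidate is scored by how many points agree with it, points far from the line never pull the fit toward themselves, unlike in an ordinary least-squares fit.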
C-RANSAC
• RANSAC modified to have more
knowledge about the tracing task.
• This knowledge concerns the range of
possible ellipse dimensions.
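The idea of constraining RANSAC with prior knowledge about size can be sketched as follows. To stay short, this example fits circles rather than ellipses and rejects any candidate whose radius falls outside an expected range before scoring it; the names `circle_from_3` and `constrained_ransac_circle` and the parameters `r_min` and `r_max` are hypothetical, and the paper's actual C-RANSAC constraints may differ.

```python
import random


def circle_from_3(p1, p2, p3):
    """Center and radius of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    r = ((x1 - cx) ** 2 + (y1 - cy) ** 2) ** 0.5
    return cx, cy, r


def constrained_ransac_circle(points, r_min, r_max,
                              iterations=500, tolerance=0.5, seed=0):
    """RANSAC circle fit that only scores candidates with r_min <= r <= r_max."""
    rng = random.Random(seed)
    best, best_count = None, 0
    for _ in range(iterations):
        model = circle_from_3(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        if not r_min <= r <= r_max:
            continue  # the constraint: discard implausible sizes before scoring
        count = sum(1 for x, y in points
                    if abs(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 - r) <= tolerance)
        if count > best_count:
            best, best_count = model, count
    return best, best_count
```

Edge points from a much larger curve, such as an eyelid or eyebrow, can then never win the vote, however many of them there are.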
The left image shows the failure of standard RANSAC
to find the iris properly. The image on the right shows
the success of C-RANSAC.
C-RANSAC
Left image shows
success in finding the iris
while wearing glasses.
Right image shows
success in low light
conditions.
Left image shows
success in finding the iris
while the eyebrow is in
the interest window. Right
image shows success
when the iris is in a lateral
position.
Eye Blink Detection
• Blinking is detected by a vertical shift in the
cumulative histogram.
• The eyelashes have an intensity similar to
that of the iris, but are not in the same
vertical position.
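The slides do not spell out how the cumulative histogram is built, so the sketch below is only one possible reading: count dark pixels per row of the eye window, accumulate the counts from top to bottom, and take the row where the running total reaches half. A blink is flagged when that row moves away from its open-eye reference. The names, `dark_level`, and `min_shift` are all hypothetical.

```python
def dark_median_row(window, dark_level):
    """Row at which the cumulative count of dark pixels reaches half its total."""
    counts = [sum(1 for v in row if v <= dark_level) for row in window]
    total = sum(counts)
    if total == 0:
        return None  # no dark pixels in the window
    running = 0
    for y, count in enumerate(counts):
        running += count
        if 2 * running >= total:
            return y


def is_blink(window, open_eye_row, dark_level=60, min_shift=2):
    """Flag a blink when the dark-pixel median row shifts vertically."""
    row = dark_median_row(window, dark_level)
    return row is not None and abs(row - open_eye_row) >= min_shift
```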
3. Remapping
• Remapping iris position to screen position
is somewhat math intensive.
• Involves an initial calibration phase.
• Takes head motion into consideration.
4. Experimental Results
• Hardware:
– Digital camera with 12x optical zoom and
640x480 image resolution.
– 19” computer screen with a resolution of
1024x768
– Standard laptop with 1.73GHz processor.
User / Hardware Positions
Tracking Results
• 594 frames (approx. 23.8 s at 25 fps) were
recorded with the user looking at several
different points on the screen.
• Iris position and shape were manually
annotated.
• Manual annotations were then compared to
results from RANSAC and C-RANSAC.
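A comparison of this kind could be summarized per axis, for example with the mean absolute error between tracked and annotated ellipse centers. The slides do not name the error measure, so both the metric and the function name `axis_errors` are assumptions.

```python
def axis_errors(tracked, annotated):
    """Mean absolute error in x and y between tracked and annotated centers."""
    if not tracked or len(tracked) != len(annotated):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(tracked)
    err_x = sum(abs(t[0] - a[0]) for t, a in zip(tracked, annotated)) / n
    err_y = sum(abs(t[1] - a[1]) for t, a in zip(tracked, annotated)) / n
    return err_x, err_y
```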
RANSAC vs. Ground Truth
- Tracking accuracy for the y-coordinate of the ellipse center.
- Notice the two spikes due to ocular occlusions.
C-RANSAC vs. Ground Truth
- Tracking accuracy for the y-coordinate of the ellipse center.
- C-RANSAC does not spike when the eye is occluded.
RANSAC vs. C-RANSAC: Y
- RANSAC experiences more error on the y-axis than C-RANSAC.
RANSAC vs. C-RANSAC: X
- RANSAC and C-RANSAC do not differ much on their x-axis error.
RANSAC vs. C-RANSAC
- The distributions are similar, but C-RANSAC appears to have less
error when attempting to find the y-position of the iris’ center.
Calibration
• User looks at eight points on the screen,
staring at each point for four seconds.
• The points are arranged as follows: four at
the corners of the screen, and four (with a
rhomboidal layout) at its center.
• One hundred measurements are collected
for the iris center for each calibration point.
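To make the role of the calibration points concrete, the sketch below fits a map from iris-center coordinates to screen coordinates by least squares. It assumes a plain affine model and ignores head compensation; the slides do not give the paper's mapping model, so `fit_affine`, `remap`, and `solve3` are illustrative only. In practice each calibration input could be the average of the hundred measurements collected for that point.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(a[k][i]))
        a[i], a[p] = a[p], a[i]
        for k in range(i + 1, 3):
            f = a[k][i] / a[i][i]
            a[k] = [x - f * y for x, y in zip(a[k], a[i])]
    sol = [0.0] * 3
    for i in (2, 1, 0):
        sol[i] = (a[i][3] - sum(a[i][j] * sol[j] for j in range(i + 1, 3))) / a[i][i]
    return sol


def fit_affine(iris_pts, screen_pts):
    """Least-squares affine map from iris (u, v) to screen (X, Y)."""
    rows = [(u, v, 1.0) for u, v in iris_pts]
    # Normal equations: (A^T A) c = A^T b, solved once per screen axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(r[i] * s[axis] for r, s in zip(rows, screen_pts)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs  # [[ax, bx, cx], [ay, by, cy]]


def remap(coeffs, u, v):
    """Apply the fitted map to one iris-center measurement."""
    return tuple(c[0] * u + c[1] * v + c[2] for c in coeffs)
```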
Compensated vs. Uncompensated
Head Movement
- Circle/Red represents uncompensated movement.
- Square/Green represents compensated movement.
- The compensated clusters are more compact and more reasonably
arranged.
Remapping Calibration Clusters to
the Screen
- The crosses represent the center of the calibration circles.
Remapping Without Feedback
- Twenty random screen points denoted by crosses.
- Results obtained shortly after calibration represented by
circles.
- Results obtained ten minutes after calibration represented by
stars.
Remapping With Feedback
- Results obtained with head compensation represented by
circles.
- Results obtained without head compensation represented
by stars.
- Lack of head compensation is actually better!
Line Tracing With Visual Feedback
Conclusion
• A robust, single-camera, real-time eye-tracking algorithm
is presented.
• An eye blink detector works equally well for both
voluntary and involuntary eye closures.
• A constrained RANSAC approach for iris tracking is
proposed that performs better than standard RANSAC in
the presence of distracters and occlusions in the image
sequence.
• The on-screen remapping method is capable of
compensating for small head movements.
• Experiments outlined the importance of providing visual
feedback to the user and the benefit gained from
performing head compensation, especially during image-to-screen
map calibration.
Future Work
• Further improve the image-to-screen mapping
model by explicitly taking into account the
spherical shape of the eyeball.
• Relax the “neutral expression” constraint set for
head compensation.
• Generalize the approach to passive interaction
surfaces such as books, newspapers, and
paintings.
• Extend the framework to the problem of
determining the 3D coordinates of a location
pointed at in space.