Kernel Methods for Weakly Supervised Mean Shift Clustering
Oncel Tuzel & Fatih Porikli
Mitsubishi Electric Research Labs, Cambridge, Massachusetts
Peter Meer
Rutgers University
Outline
• Motivation
• Mean Shift
• Method Overview
• Kernel Mean Shift
• Constrained Kernel Mean Shift
• Experiments
• Conclusion
Motivation
• Clustering is an ambiguous task
• In many cases, the initially designed similarity metric fails to resolve the ambiguities
• Simple supervision can guide clustering to the desired structure
• We present a semi-supervised mean shift clustering algorithm based on pair-wise similarity constraints
Mean Shift
• Given n data points $x_i$ in $\mathbb{R}^d$ and associated bandwidths $h_i$, the sample point density estimator is given by
  $\hat{f}(x) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_i^d}\, k\!\left(\left\|\frac{x - x_i}{h_i}\right\|^2\right)$
  where $k(x)$ is the kernel profile
• Stationary points of the density can be found via the mean shift procedure
  $\bar{x} = \frac{\sum_{i=1}^{n} \frac{x_i}{h_i^{d+2}}\, g\!\left(\left\|\frac{x - x_i}{h_i}\right\|^2\right)}{\sum_{i=1}^{n} \frac{1}{h_i^{d+2}}\, g\!\left(\left\|\frac{x - x_i}{h_i}\right\|^2\right)}$
  where $g(x) = -k'(x)$
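As a concrete illustration, here is a minimal NumPy sketch of one adaptive-bandwidth mean shift step, assuming the Gaussian profile $k(u) = e^{-u/2}$, for which $g = -k'$ is proportional to $k$ and the constant cancels in the ratio; the function name and interface are ours, not from the slides.

```python
import numpy as np

def mean_shift_step(x, X, h):
    """One adaptive-bandwidth mean shift step at x, for data X (n x d)
    and per-point bandwidths h (n,), with the Gaussian profile
    k(u) = exp(-u/2); g = -k' is proportional to k, so constants cancel."""
    d = X.shape[1]
    u = np.sum((x - X) ** 2, axis=1) / h ** 2   # ||(x - x_i)/h_i||^2
    w = np.exp(-u / 2.0) / h ** (d + 2)         # proportional to g(u) / h_i^{d+2}
    return w @ X / w.sum()                      # the new estimate x-bar
```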
Mean Shift Clustering
• Mean shift iterations are initialized at the data points
• The cluster centers are located by the mean shift procedure
• The data points associated with the same local maximum of the density function produce a partitioning of the space
• There is no systematic semi-supervised mean shift algorithm
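A sketch of the full clustering loop built on the mean_shift_step above: mean shift is run from every data point, and points whose trajectories stop at the same stationary point share a label. The merge radius used to decide that two converged modes coincide is an assumed detail, not specified on the slides.

```python
def mean_shift_cluster(X, h, n_iter=100, tol=1e-5):
    """Run mean shift from every data point and group points whose
    trajectories converge to the same local maximum of the density."""
    modes = X.astype(float).copy()
    for i in range(len(X)):
        for _ in range(n_iter):
            new = mean_shift_step(modes[i], X, h)
            shift = np.linalg.norm(new - modes[i])
            modes[i] = new
            if shift < tol:
                break
    labels, centers = [], []            # group nearby modes together
    for m in modes:
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < 100 * tol:   # assumed merge radius
                labels.append(j)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return np.array(labels), np.array(centers)
```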
Method Overview
• The supervision is given in the form of a few pair-wise similarity constraints
• We embed the input space into a space where the constraint pairs are associated with the same mode
• Mode seeking is performed on the embedded space
• The method preserves all the advantages of mean shift clustering

[Figure: data points in the input space, with constraint points marked x, and their images in the embedded space, where each constraint pair converges to the same mode]
Pair-wise Constraints on the Input Space
• Data points are projected to the null space of the constraint matrix
• Since the constraint point pairs overlap after projection, they are clustered together
• At most d − 1 constraints can be defined
• The method fails if the clusters are not linearly separable

[Figure: input points c1, c2 and the constraint vector c2 − c1; projecting onto the null space of the constraint vector collapses c1 and c2 onto the same point. Flow: Input Points → Projection → Clustering, guided by the Constraint Vector]
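A minimal sketch of this input-space projection, assuming the constrained points are rows of the data matrix; the function name and pair encoding are illustrative, not from the slides.

```python
import numpy as np

def project_to_constraint_null_space(X, pairs):
    """Project data X (n x d) onto the null space of the constraint matrix A
    whose j-th row is the difference of a must-link pair; after projection
    each constrained pair coincides."""
    A = np.array([X[i] - X[j] for i, j in pairs])             # m x d
    P = np.eye(X.shape[1]) - A.T @ np.linalg.pinv(A @ A.T) @ A
    return X @ P                                              # P is symmetric
```

Each linearly independent constraint removes one dimension of the input space, which is why at most d − 1 constraints can be imposed before the data collapses to a single point.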
Pair-wise Constraints on the Feature Space
• The method can be extended to handle an increasing number of constraints, or the linearly inseparable case, using a mapping function $\phi$
• The mapping $\phi$ embeds the input space into an enlarged feature space
• The projection is performed on the feature space
• Defining the mapping explicitly is not practical. Solution: the Kernel Trick

[Figure: feature-space points $\phi(c_1)$, $\phi(c_2)$ and the constraint vector $\phi(c_2) - \phi(c_1)$. Flow: Input Points → Mapping to Feature Space → Projection → Clustering, guided by the Constraint Vector]
Kernel Mean Shift (Explicit Form)
• Given the data $x_i \in \mathcal{X}$, $i = 1, \ldots, n$, and a p.s.d. kernel $K$ satisfying $K(x, x') = \phi(x)^\top \phi(x')$, where $\phi$ maps the input space into the $d_\phi$-dimensional feature space
• The density estimator at $y = \phi(x)$ is given by
  $\hat{f}(y) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_i^{d_\phi}}\, k\!\left(\left\|\frac{y - \phi(x_i)}{h_i}\right\|^2\right)$
• The stationary points can be found via the mean shift procedure
  $\bar{y} = \frac{\sum_{i=1}^{n} \frac{\phi(x_i)}{h_i^{d_\phi+2}}\, g\!\left(\left\|\frac{y - \phi(x_i)}{h_i}\right\|^2\right)}{\sum_{i=1}^{n} \frac{1}{h_i^{d_\phi+2}}\, g\!\left(\left\|\frac{y - \phi(x_i)}{h_i}\right\|^2\right)}$
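The mapping $\phi$ is usually left implicit, but for the degree-2 polynomial kernel it can be written down, which makes the defining identity $K(x, x') = \phi(x)^\top \phi(x')$ easy to verify numerically. This toy check, and the poly2_features name, are ours, not from the slides.

```python
import numpy as np

def poly2_features(x):
    """Explicit feature map of the degree-2 polynomial kernel
    K(x, x') = (x^T x')^2 for 2-D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x, xp = np.random.randn(2), np.random.randn(2)
assert np.isclose((x @ xp) ** 2, poly2_features(x) @ poly2_features(xp))
```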
Kernel Mean Shift (Implicit Form)
• Let $\Phi = [\phi(x_1)\ \cdots\ \phi(x_n)]$ be the $d_\phi \times n$ dimensional feature matrix and $K = \Phi^\top \Phi$ be the $n \times n$ dimensional kernel matrix
• At each iteration the estimate, $\bar{y}$, lies in the column space of $\Phi$, and any point on the subspace can be written as $y = \Phi \alpha_y$
• The distance between two points $y = \Phi \alpha_y$ and $y' = \Phi \alpha_{y'}$ is given by $\|y - y'\|^2 = (\alpha_y - \alpha_{y'})^\top K\, (\alpha_y - \alpha_{y'})$
• The implicit form of mean shift updates the weighting vectors
  $\bar{\alpha} = \frac{\sum_{i=1}^{n} \frac{e_i}{h_i^{d_\phi+2}}\, g\!\left(\frac{(\alpha - e_i)^\top K (\alpha - e_i)}{h_i^2}\right)}{\sum_{i=1}^{n} \frac{1}{h_i^{d_\phi+2}}\, g\!\left(\frac{(\alpha - e_i)^\top K (\alpha - e_i)}{h_i^2}\right)}$
  where $e_i$ denotes the i-th canonical basis vector for $\mathbb{R}^n$
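A sketch of one implicit-form update, assuming a single global bandwidth and the Gaussian profile so the $h_i^{d_\phi+2}$ factors cancel; only the kernel matrix is touched and the feature vectors are never formed.

```python
import numpy as np

def kernel_mean_shift_step(alpha, K, h):
    """One implicit-form step on the weighting vector alpha; the estimate
    y = Phi @ alpha is never formed. Assumes one global bandwidth h and
    the Gaussian profile, so per-point normalization constants cancel."""
    # (alpha - e_i)^T K (alpha - e_i), vectorized over all i
    d2 = alpha @ K @ alpha - 2.0 * (K @ alpha) + np.diag(K)
    w = np.exp(-d2 / (2.0 * h ** 2))   # proportional to g at each data point
    return w / w.sum()                 # alpha-bar = sum_i w_i e_i / sum_i w_i
```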
Kernel Mean Shift Clustering
• The clustering algorithm starts on the data points, $\alpha_i = e_i$
• Upon convergence the mode can be expressed via $\bar{y} = \Phi \bar{\alpha}$
• When the rank of the kernel matrix K is smaller than n, the columns of $\Phi$ form an overcomplete basis and the modes can be identified within an equivalence relationship
• The procedure is restricted to the subspace spanned by the feature points; therefore the recovered modes also lie in this subspace
• The convergence of the procedure follows from the original proof
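A sketch of the equivalence-based mode identification: because K may be rank deficient, two distinct weighting vectors can represent the same feature-space point, so converged alphas are compared through the kernel-induced distance rather than coordinate-wise. The tolerance is an assumed parameter.

```python
def group_modes(alphas, K, tol=1e-3):
    """Label converged modes via the kernel-induced distance; alphas are
    never compared directly, since K may be rank deficient."""
    labels, reps = [], []
    for a in alphas:
        for j, r in enumerate(reps):
            d = a - r
            if d @ K @ d < tol ** 2:   # same mode within tolerance
                labels.append(j)
                break
        else:
            reps.append(a)
            labels.append(len(reps) - 1)
    return labels
```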
Constrained Kernel Mean Shift
• Let $\{(c_{j,1}, c_{j,2})\}_{j=1}^{m}$ be the set of point pairs to be clustered together
• The constraint matrix is given by $A = \left[\phi(c_{1,1}) - \phi(c_{1,2})\ \ \cdots\ \ \phi(c_{m,1}) - \phi(c_{m,2})\right]^\top$
• The null space of A is the set of vectors $\{v : A v = 0\}$, and the matrix $P = I - A^\top (A A^\top)^{+} A$ projects to it
• Under the projection the constraint point pairs are overlapped: $P\phi(c_{j,1}) = P\phi(c_{j,2})$

[Figure: feature-space points with the constraint vector $\phi(c_2) - \phi(c_1)$; after projection onto its null space, $\phi(c_1)$ and $\phi(c_2)$ map to the same point]
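Reusing the hypothetical poly2_features map from the earlier sketch, the overlap property can be checked numerically:

```python
# Check the overlap property with the explicit quadratic map:
c1, c2 = np.random.randn(2), np.random.randn(2)
A = (poly2_features(c1) - poly2_features(c2))[None, :]   # 1 x d_phi
P = np.eye(3) - A.T @ np.linalg.pinv(A @ A.T) @ A        # null-space projector
assert np.allclose(P @ poly2_features(c1), P @ poly2_features(c2))
```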
Constrained Kernel Mean Shift
• The constrained mean shift algorithm implicitly maps the data points to the null space of the constraint matrix, $\phi(x) \mapsto P\phi(x)$, and performs mean shift on the embedded space
• This process is equivalent to applying the kernel mean shift algorithm with the projected kernel function $\tilde{K}(x, x') = \phi(x)^\top P\, \phi(x')$
• The projected kernel matrix only involves mapping through the kernel function and can be expressed in terms of the original kernel matrix
  $\tilde{K} = K - K_z^\top S\, K_z$
  where $K_z = A\Phi$ is the part of the kernel matrix involving the constraint set and $S = (A A^\top)^{+}$ is the scaling matrix
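A sketch of the projected kernel matrix, assuming the constraint points are among the n data points so that every needed kernel value is already an entry of K; $K_z$ and S follow the decomposition above, and the function name and pair encoding are ours.

```python
import numpy as np

def projected_kernel(K, pairs):
    """Projected kernel matrix K~ = K - Kz^T S Kz, built from the original
    kernel matrix alone; `pairs` holds index pairs (i, j) of must-link
    points, assumed to be among the n data points."""
    i, j = np.array(pairs).T
    Kz = K[i] - K[j]              # rows of A Phi, an m x n block of kernel entries
    AAt = Kz[:, i] - Kz[:, j]     # A A^T, also built from kernel entries (m x m)
    S = np.linalg.pinv(AAt)       # the scaling matrix
    return K - Kz.T @ S @ Kz
```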
Experiments
• We conduct experiments on three datasets:
  – Synthetic experiments
  – Clustering faces across illumination on the CMU PIE dataset
  – Clustering object categories on the Caltech-4 dataset
• For the first two experiments we utilize the Gaussian kernel function
• For the last experiment we utilize a different kernel function
• We use adaptive bandwidth mean shift, where the bandwidth for each point is selected as the k-th smallest distance from the point to all the data points in the feature space
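A sketch of this bandwidth rule computed directly from the kernel matrix, using $\|\phi(x_i) - \phi(x_j)\|^2 = K_{ii} - 2K_{ij} + K_{jj}$; the sorted-column indexing is an assumed detail (column 0 holds the zero self-distance).

```python
def adaptive_bandwidths(K, k):
    """Bandwidth of each point = its k-th smallest feature-space distance,
    computed from the kernel matrix via ||phi_i - phi_j||^2 = K_ii - 2K_ij + K_jj."""
    d = np.diag(K)
    D2 = np.maximum(d[:, None] - 2.0 * K + d[None, :], 0.0)  # guard round-off
    return np.sqrt(np.sort(D2, axis=1)[:, k])  # column 0 is the zero self-distance
```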
Clustering Linear Structure

[Figure: three panels: Data Points, Mean Shift result, Constrained Mean Shift result]

• We generated 240 data points originating from six different lines
• Data is corrupted with normally distributed noise with standard deviation 0.1
• Three pair-wise constraints are given
Clustering Circular Structure

[Figure: four panels: Data Points, Data Points with Outliers, Mean Shift result, Constrained Mean Shift result]

• We generated 200 data points originating from five concentric circles
• Data is corrupted with normally distributed noise with standard deviation 0.1
• 80 outlier points are added
• Four pair-wise constraints are enforced from the same circle
Clustering Faces Across Illumination

[Figure: samples from the CMU PIE dataset and the constraint set]

• The dataset contains 441 images of 21 subjects under 21 different illumination conditions
• Images are coarsely registered and scaled to the same size, 128x128
• Each image is represented with a 16384-dimensional vector
• Two pair-wise similarity constraints are given per subject
• Approximately 1/10 of the dataset is labeled
Clustering Faces with Mean Shift

[Figure: 441 x 441 pair-wise distance matrix and the mean shift clustering result]

• Mean shift finds 5 clusters, corresponding partly to illumination conditions and partly to subject labels
Clustering Faces with Constrained Mean Shift

[Figure: pair-wise distance matrix after embedding and the constrained mean shift clustering result]

• Constrained mean shift recovers all 21 subjects perfectly
Clustering Object Categories

[Figure: samples from the Caltech-4 dataset]

• The dataset contains 400 images from four object categories: cars, motorcycles, faces, airplanes
• Each image is represented with a 500-bin feature histogram
• Pair-wise constraints are randomly selected within classes
• The experiment is repeated with a varying number of constraints (1 to 20 constraints per object class)
Clustering Object Categories with Mean Shift

[Figure: 400 x 400 pair-wise distance matrix and the mean shift clustering result]

• Some of the samples from the airplanes class and half of the motorcycles class are incorrectly identified as cars
• The overall clustering accuracy is 74.25%
Clustering Object Categories with Constrained Mean Shift

[Figure: pair-wise distance matrix after embedding and the constrained mean shift clustering result]

• Clustering example after enforcing 10 constraints per class
• Only a single example among 400 is misclustered
Clustering Performance vs. Number of Constraints
• The results are averaged over 20 runs, where at each run a different constraint set is selected
• Clustering accuracy is over 99% for more than 7 constraints per class
Conclusion
• We presented a novel constrained mean shift clustering method that can incorporate pair-wise must-link priors
• The method preserves all the advantages of the original mean shift clustering algorithm
• The presented approach also extends to inner product spaces; thus, it is applicable to a wide range of problems