國立雲林科技大學
National Yunlin University of Science and Technology
Extending structure adaptive self-organizing map for mixed data
Advisor : Dr. Hsu
Presenter : Chih-Ling Wang
Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
Outline

Motivation
Objective
Introduction
Background
GSASOM Algorithm
Experimental results
Conclusions
Q&A
Motivation

Due to the recent advances in storage, communications, image
compression, and internet technologies, multimedia
information has become more popular.

With this explosive growth in the volume of multimedia
information archives, the efficient browsing and retrieval of
desired information is of paramount importance.
Objective

In this paper, we propose a novel approach to generating
topology preserving mapping of structural shapes.
Introduction

The most commonly used properties of images for visual
content-based retrieval are color, texture, shape, spatial
relationships between various properties, or a combination of
these properties.

The most popular approach to indexing image databases
has been histogram indexing based on the properties listed
above.

In this paper, we propose a novel shape indexing scheme using
a structural histogramming technique and the SOM algorithm.
Introduction (cont.)

The edge pixels in the images have been used by some
researchers to perform shape-based similarity search.

Hirata et al. computed the correlation between the query sketch
and the database edge images.

Jain et al. constructed global 72-bin shape histograms using
edge directions.

Shape similarity is computed as a weighted
sum of Euclidean distances.
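
As a rough illustration of a 72-bin edge-direction histogram, the sketch below bins gradient directions into 72 bins of 5 degrees each (360/72). The gradient operator (`np.gradient`), the magnitude threshold, and the function name are assumptions made for illustration; they are not taken from Jain et al.

```python
import numpy as np

def edge_direction_histogram(image, n_bins=72, threshold=0.1):
    """Normalized histogram of gradient directions at edge pixels.

    Hypothetical helper: the gradient operator and threshold are
    illustrative choices, not the method of the cited work.
    """
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    magnitude = np.hypot(gx, gy)
    # Direction in degrees, mapped to [0, 360).
    angles = np.degrees(np.arctan2(gy, gx)) % 360.0
    edge_mask = magnitude > threshold
    hist, _ = np.histogram(angles[edge_mask], bins=n_bins, range=(0.0, 360.0))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)
```

Two shapes could then be compared by a distance between their histograms; the exact weighting used in the cited work is not given in these slides.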
Introduction (cont.)

Mokhtarian et al. used the curvature scale space (CSS) method
to represent two-dimensional (2-D) shapes at different
resolutions.

Maxima of the CSS image are used to represent the shape.
Introduction (cont.)

Another common approach to shape-based indexing and
retrieval is to use segmented boundary curves instead of the
edge pixels or the complete closed curves.

Petrakis et al. approximated shapes as a sequence of concave
and convex segments and then employed a dynamic programming-based
shape matching scheme to establish the
correspondences between curve segments over different
resolutions.
Introduction (cont.)

They made use of an R-tree to perform the indexing in a lower
dimensional space.

Berretti et al. proposed a shape retrieval scheme for generic
shapes using a metric tree based indexing scheme.

They also decomposed the shapes according to the shapes’
protrusions and organized the token attributes into an M-tree to
perform the shape similarity computation and retrieval.
Introduction (cont.)

The trademark image databases have been commonly used to
test image retrieval and in particular several shape retrieval
systems.

Kato, in his system, normalized the trademark images to an
8x8 pixel grid and computed shape features from the resulting
pixel frequency distributions to be used for retrieval.
Introduction (cont.)

Wu et al. developed a system for trademark archiving and
retrieval (STAR) making use of text and images.

Eakins et al. also investigated the problem of shape-based
trademark retrieval. They used region boundaries extracted
from binary images and approximated by straight lines and
circular arc segments.

These primitive boundary descriptors are grouped into families
to obtain various global shape features.
Introduction (cont.)

In this paper, we employ the SOM to organize structural
shapes in a topographical manner for efficient shape retrieval.

The concept of mapping structural shapes in a topology
conserving manner is novel.

The structural information contained in a geometrical shape is
extracted using pairwise relational attribute vectors.
Introduction (cont.)

These vectors are quantized using an SOM, because SOMs offer
advantages such as the ability to quantize
adaptively depending on the dynamic ranges of the attributes
and the ability to deal more efficiently with the curse of
dimensionality in histogram-based methods.

Using this trained quantization SOM, referred to as SOM1, a
global histogram of relational attribute vectors is generated for
every structural shape.

These histograms are treated as input vectors to another SOM,
referred to as SOM2.
Relational attribute vectors
In this study, we consider two cases, namely invariance to translation and
rotation, and invariance to translation, rotation, and scale.
Prior to computing the attributes, the intersection point between the
two lines is computed, as shown by "i". The end point of the first line
(also known as the reference line) closer to the intersection point is
labeled "a". The other end point of the first line is labeled "b".
Likewise, the end points of the second line are labeled "c" and
"d".
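
The labeling step can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the labels "c" and "d" are assumed to follow the same closer-to-"i" rule as "a" and "b", and parallel lines (which have no unique intersection) are rejected, a case the slides do not discuss.

```python
import numpy as np

def cross2(u, v):
    """Scalar 2-D cross product u x v."""
    return u[0] * v[1] - u[1] * v[0]

def label_line_pair(p, q, r, s, eps=1e-12):
    """Return (i, a, b, c, d) for reference line pq and second line rs.

    i is the intersection of the two infinite lines; a and c are the
    end points closer to i, b and d are the farther ones.
    """
    p, q, r, s = (np.asarray(v, dtype=float) for v in (p, q, r, s))
    d1, d2 = q - p, s - r
    denom = cross2(d1, d2)
    if abs(denom) < eps:
        # Assumption: parallel pairs are simply rejected here.
        raise ValueError("lines are parallel; no unique intersection point")
    # Solve p + t*d1 = r + u*d2 for t.
    t = cross2(r - p, d2) / denom
    i = p + t * d1
    a, b = sorted((p, q), key=lambda v: np.linalg.norm(v - i))
    c, d = sorted((r, s), key=lambda v: np.linalg.norm(v - i))
    return i, a, b, c, d
```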
Relational attribute vectors (cont.)

In the first set of experiments, the following seven translation and
rotation invariant relational attributes are used:
1) Angle between the two lines: $\cos^{-1}\left(\frac{\vec{ab}\cdot\vec{cd}}{|\vec{ab}|\,|\vec{cd}|}\right)$. The angle returned is between zero
and $\pi$. However, if we identify the rotation from $\vec{ab}$ to $\vec{cd}$ as clockwise or
counter-clockwise by evaluating the vector product between vectors
$\vec{ab}$ and $\vec{cd}$, then we can compute the angle attribute between $-\pi$ and $\pi$ in
order to improve the discrimination quality of this attribute.
2) Length of the reference line ab.
3) Length of the second line cd.
4) Distance ac.
5) Distance bd.
6) Distance ad.
7) Distance bc.
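
A minimal sketch of the seven attributes, assuming the end points have already been labeled a, b, c, d. The signed angle is obtained here from the arccosine of the normalized dot product, with the sign taken from the 2-D cross product; treating a positive cross product as counter-clockwise is an assumed convention, and the function name is hypothetical.

```python
import numpy as np

def seven_attributes(a, b, c, d):
    """Translation and rotation invariant attributes of a line pair."""
    a, b, c, d = (np.asarray(v, dtype=float) for v in (a, b, c, d))
    ab, cd = b - a, d - c
    l_ab, l_cd = np.linalg.norm(ab), np.linalg.norm(cd)
    # Unsigned angle in [0, pi]; clip guards against rounding error.
    cos_angle = np.clip(np.dot(ab, cd) / (l_ab * l_cd), -1.0, 1.0)
    angle = np.arccos(cos_angle)
    # Sign from the vector (cross) product: assumed convention,
    # positive means counter-clockwise rotation from ab to cd.
    if ab[0] * cd[1] - ab[1] * cd[0] < 0:
        angle = -angle
    return np.array([
        angle,
        l_ab,                    # length of reference line ab
        l_cd,                    # length of second line cd
        np.linalg.norm(c - a),   # distance ac
        np.linalg.norm(d - b),   # distance bd
        np.linalg.norm(d - a),   # distance ad
        np.linalg.norm(c - b),   # distance bc
    ])
```

Because every entry is either an angle between the two lines or a distance between end points, rotating or translating the whole line pair leaves the vector unchanged, while scaling does not.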
Relational attribute vectors (cont.)

In the second set of experiments, the following five translation, rotation,
and scale invariant relational attributes are used, where $l_{xy}$ denotes the
distance between points x and y:
1) Angle between the reference line ab and the second line cd.
2) Relative position ratio: $1/\left(\tfrac{1}{2} + l_{ib}/l_{ab}\right)$.
3) Line length ratio: $\min\{l_{ab}, l_{cd}\}/\max\{l_{ab}, l_{cd}\}$.
4) End point ratio: $\min\{l_{ac}, l_{bd}\}/\max\{l_{ac}, l_{bd}\}$.
5) Cross end point ratio: $\min\{l_{ad}, l_{bc}\}/\max\{l_{ad}, l_{bc}\}$.
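
The five scale invariant attributes can be sketched as below. Two points are assumptions rather than statements from the slides: the angle is computed as a signed angle via `arctan2`, and the line length ratio is taken as min{l_ab, l_cd}/max{l_ab, l_cd}, i.e. the ratio of the two line lengths. The function name is hypothetical.

```python
import numpy as np

def five_attributes(i, a, b, c, d):
    """Translation, rotation and scale invariant attributes of a line pair.

    i is the intersection point; a, b, c, d are the labeled end points.
    """
    i, a, b, c, d = (np.asarray(v, dtype=float) for v in (i, a, b, c, d))

    def dist(u, v):
        return float(np.linalg.norm(u - v))

    ab, cd = b - a, d - c
    # Signed angle from ab to cd (assumed convention).
    angle = np.arctan2(ab[0] * cd[1] - ab[1] * cd[0], np.dot(ab, cd))
    l_ab, l_cd = dist(a, b), dist(c, d)
    l_ac, l_bd = dist(a, c), dist(b, d)
    l_ad, l_bc = dist(a, d), dist(b, c)
    return np.array([
        angle,
        1.0 / (0.5 + dist(i, b) / l_ab),          # relative position ratio
        min(l_ab, l_cd) / max(l_ab, l_cd),        # line length ratio (assumed form)
        min(l_ac, l_bd) / max(l_ac, l_bd),        # end point ratio
        min(l_ad, l_bc) / max(l_ad, l_bc),        # cross end point ratio
    ])
```

Every entry after the angle is a ratio of lengths, so multiplying all coordinates by a common factor cancels out; this is what gives scale invariance in addition to translation and rotation invariance.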
Relational attribute vectors (cont.)

If the query line patterns are corrupted by noise, then making
use of every line pair may not be beneficial to the
performance of the system.

In such a situation, local neighborhood graphs with a
neighborhood degree of 6 are known to yield the best
performance for translation, rotation, and scale invariant
retrieval.

In this study, the pairwise relational vectors are computed for up
to six nearest neighbor line segments of every line segment in
the trademark model base.
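
A sketch of restricting the line pairs to the six nearest neighbors of each segment. The slides do not say how the distance between two line segments is measured; the distance between segment midpoints is used here purely as an assumption, and the function name is hypothetical.

```python
import numpy as np

def nearest_segment_pairs(segments, k=6):
    """List (reference, neighbor) index pairs for up to k nearest segments.

    segments: array-like of shape (n, 2, 2), two end points per segment.
    Distance between segments is taken between midpoints (assumption).
    """
    segments = np.asarray(segments, dtype=float)
    n = len(segments)
    midpoints = segments.mean(axis=1)                      # shape (n, 2)
    diff = midpoints[:, None, :] - midpoints[None, :, :]
    dist = np.linalg.norm(diff, axis=2)                    # shape (n, n)
    np.fill_diagonal(dist, np.inf)                         # exclude self-pairs
    k = min(k, n - 1)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return [(i, int(j)) for i in range(n) for j in neighbors[i]]
```

With n segments this yields at most 6n ordered pairs instead of n(n-1), so a few spurious or missing lines in a noisy query affect only their local neighborhood.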
Quantization of relational vectors
- A. Self-Organizing Maps

In this application, it is desirable for each neuron to be the
winner with probability 1/M, where M is the total number
of nodes in the SOM.

Although the use of topological neighborhoods attempts to
provide a uniform utilization of all nodes, it does not
completely resolve the problem.

Three approaches to this problem, namely convex combination,
competitive learning with conscience, and competitive
learning with attention, were recently reviewed and evaluated by
Bebis et al.

According to their findings, competitive learning with
conscience appears to yield the best performance.
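
The idea behind conscience learning can be sketched as a single competitive update: each node tracks how often it wins, and nodes that win more often than 1/M are penalized in the competition. The slides do not give the exact update rule or constants used in the paper, so the bias form, the parameter values, and the omission of the neighborhood update below are all assumptions.

```python
import numpy as np

def conscience_step(x, weights, win_freq, lr=0.1, bias_gain=10.0, freq_rate=1e-4):
    """One competitive-learning step with a conscience bias.

    weights:  float array (M, dim), updated in place.
    win_freq: float array (M,), running estimate of win probabilities,
              updated in place.
    Returns the index of the winning node.
    """
    x = np.asarray(x, dtype=float)
    m = len(weights)
    dist2 = np.sum((weights - x) ** 2, axis=1)
    # Nodes winning more often than 1/M get a negative bias (penalty).
    bias = bias_gain * (1.0 / m - win_freq)
    winner = int(np.argmin(dist2 - bias))
    # Move only the winner toward the input (no neighborhood, for brevity).
    weights[winner] += lr * (x - weights[winner])
    target = np.zeros(m)
    target[winner] = 1.0
    win_freq += freq_rate * (target - win_freq)
    return winner
```

When all win frequencies equal 1/M the bias vanishes and the step reduces to ordinary nearest-node competition.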
Quantization of relational vectors
- A. Self-Organizing Maps (cont.)

The input vectors to SOM1 have five or seven dimensions,
depending on the attribute set used.

The number of output neurons is identical to the number
of bins that we wish to have in the 1-D histogram.
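
For orientation, a generic SOM training loop is sketched below. It is a standard Kohonen-style update on a 2-D grid with a Gaussian neighborhood and linearly decaying learning rate and radius; it is not the algorithm of Table I, which is not reproduced in these slides, and all parameter names and schedules are assumptions.

```python
import numpy as np

def train_som(data, rows, cols, epochs=10, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_nodes = rows * cols
    # Grid coordinates of each node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize weights from randomly chosen training vectors.
    weights = data[rng.integers(0, len(data), size=n_nodes)].copy()
    sigma0 = sigma0 or max(rows, cols) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking radius
            winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
            grid_d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights
```

Under this sketch, a map with `rows * cols` nodes trained on the five- or seven-dimensional relational vectors would give a histogram with `rows * cols` bins.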
Quantization of relational vectors
- B. Shape Histograms and Indexing

After executing the steps in Table II, we have a
1-D histogram for every shape in the database.

These histograms are treated as the input vectors to construct
SOM2 using the same self-organizing map algorithm as in
Table I.

The dimension of the input feature vectors of SOM2 is identical to
the number of nodes in SOM1.

Every shape is associated with its three best matching neurons.
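
The two-stage use of the maps can be sketched as follows, assuming trained weight matrices for SOM1 and SOM2 are available and that the best matching neuron is the one at the smallest Euclidean distance (an assumption; Table II is not reproduced in these slides). Function names are hypothetical.

```python
import numpy as np

def shape_histogram(vectors, som1_weights):
    """Count, per SOM1 node, how many relational vectors it wins."""
    vectors = np.asarray(vectors, dtype=float)
    som1_weights = np.asarray(som1_weights, dtype=float)
    d2 = ((vectors[:, None, :] - som1_weights[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)
    # One bin per SOM1 node, so the histogram length equals the node count.
    return np.bincount(winners, minlength=len(som1_weights)).astype(float)

def best_matching_neurons(histogram, som2_weights, k=3):
    """Indices of the k SOM2 nodes closest to a shape histogram."""
    som2_weights = np.asarray(som2_weights, dtype=float)
    d2 = ((som2_weights - np.asarray(histogram, dtype=float)) ** 2).sum(axis=1)
    return np.argsort(d2)[:k]
```

For example, with three SOM1 nodes each shape yields a 3-bin histogram, which is then matched against 3-dimensional SOM2 weight vectors.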
Experimental results

Experiments were conducted using a part of the trademark
database.

We conducted two experiments using the two sets of relational
attribute vectors defined in Section II.

In our experiments, SOM1 has 1600 neurons and
SOM2 has 225 neurons.
Experimental results (cont.)

As expected, the system always retrieved
the original shape when noiseless query objects were presented
in both experiments.
Experimental results (cont.)

Figs. 3 and 4 show some retrieved objects in EXP2 when
query objects with a fraction of their lines missing are presented to the
system.

The query image is shown at the top left corner and the clean
version of the object is shown in the second column of the first
row.
Experimental results (cont.)

From these experiments, it can be concluded that the 5-D
rotation, scale, and translation invariant attributes are more
robust than the 7-D translation and rotation invariant attributes.

In order to improve the performance in EXP1, a larger nearest
neighbor graph should be used.

The translation, scale, and rotation invariant approach with a
limited neighborhood graph would be more suitable in these
situations.

From the experimental results in Figs. 2-4, it is clear that
SOM2 was able to retrieve similar shapes using the
histogram intersection similarity measure.
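
Histogram intersection can be sketched as the sum of bin-wise minima. Normalizing both histograms to unit sum first, so that the score lies in [0, 1], is an assumption made here; the slides do not state which normalization was used.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical shapes of distribution, 0 for disjoint."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    s1, s2 = h1.sum(), h2.sum()
    if s1 == 0 or s2 == 0:
        return 0.0  # an empty histogram matches nothing
    # Sum of bin-wise minima of the unit-sum histograms.
    return float(np.minimum(h1 / s1, h2 / s2).sum())
```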
Conclusions

In this paper, we proposed a novel topology preserving
mapping scheme for geometric structural objects using the
SOM.

The proposed approach offers a number of advantages, such as
the ability to make use of several relational attributes, the
ability to perform dynamic quantization, the flexibility in
including and removing model objects and database images,
and the ability to handle other attributes like color and texture
in a homogeneous manner with the SOM.

The proposed approach is capable of generating the
topological mapping with the desired invariance properties.
Personal Opinion

Drawback: the description of the SOM algorithm is ambiguous.

Application: multimedia information retrieval