
國立雲林科技大學 National Yunlin University of Science and Technology
Intelligent Database Systems Lab, N.Y.U.S.T. I. M.

Extending structure adaptive self-organizing map for mixed data

Advisor: Dr. Hsu
Presenter: Chih-Ling Wang

Outline
- Motivation
- Objective
- Introduction
- Background
- GSASOM Algorithm
- Experimental results
- Conclusions
- Q&A

Motivation
- Due to recent advances in storage, communications, image compression, and internet technologies, multimedia information has become more popular.
- With this explosive growth in the volume of multimedia information archives, efficient browsing and retrieval of the desired information is of paramount importance.

Objective
- In this paper, we propose a novel approach to generating a topology-preserving mapping of structural shapes.

Introduction
- The image properties most commonly used for visual content-based retrieval are color, texture, shape, spatial relationships between various properties, or a combination of these.
- The most popular approach to indexing image databases has been histogram indexing using the properties listed above.
- In this paper, we propose a novel shape indexing scheme using a structural histogramming technique and the SOM algorithm.
- Some researchers have used the edge pixels in the images to perform shape-based similarity search.
- Hirata et al. computed the correlation between the query sketch and the database edge images.
- Jain et al. constructed global 72-bin shape histograms using edge directions. Shape similarity is computed as a weighted sum of Euclidean distances.
- Mokhtarian et al. used the curvature scale space (CSS) method to represent two-dimensional (2-D) shapes at different resolutions. Maxima of the CSS image are used to represent the shape.
- Another common approach to shape-based indexing and retrieval is to use segmented boundary curves instead of the edge pixels or the complete closed curves.
- Petrakis et al. approximated shapes as a sequence of concave and convex segments and then employed a dynamic programming-based shape matching scheme to establish the correspondences between curve segments over different resolutions. They made use of an R-tree to perform the indexing in a lower-dimensional space.
- Berretti et al. proposed a shape retrieval scheme for generic shapes using a metric tree-based indexing scheme. They also decomposed the shapes according to the shapes' protrusions and organized the token attributes into an M-tree to perform the shape similarity computation and retrieval.
- Trademark image databases have been commonly used to test image retrieval systems, and several shape retrieval systems in particular.
- Kato, in his system, normalized the trademark images to an 8x8 pixel grid and computed shape features from the resulting pixel frequency distributions to be used for retrieval.
- Wu et al. developed a system for trademark archiving and retrieval (STAR) that makes use of text and images.
- Eakins et al. also investigated the problem of shape-based trademark retrieval. They used region boundaries extracted from binary images and approximated by straight lines and circular arc segments. These primitive boundary descriptors are grouped into families to obtain various global shape features.

Introduction (cont.)
- In this paper, we employ the SOM to organize structural shapes in a topographical manner for efficient shape retrieval.
- The concept of mapping structural shapes in a topology-preserving manner is novel.
- The structural information contained in a geometrical shape is extracted using pairwise relational attribute vectors.
- These vectors are quantized using an SOM, because SOMs offer a number of advantages, such as the ability to quantize adaptively depending on the dynamic ranges of the attributes and the ability to deal more efficiently with the curse of dimensionality in histogram-based methods.
- Using this trained quantization SOM, referred to as SOM1, a global histogram of relational attribute vectors is generated for every structural shape.
- These histograms are treated as input vectors to another SOM, referred to as SOM2.

Relational attribute vectors
- In this study, we consider two cases: invariance to translation and rotation, and invariance to translation, rotation, and scale.
- Prior to computing the attributes, the intersection point of the two lines, labeled "i", is computed.
- The end point of the first line (the reference line) that is closer to the intersection point is labeled "a". The other end point of the first line is labeled "b".
- Likewise, the end points of the second line are labeled "c" and "d".
- In the first set of experiments, the following seven translation- and rotation-invariant relational attributes are used:
  1) Angle between the lines ab and cd. The angle returned is between zero and π. However, if we identify the rotation from ab to cd as clockwise or counter-clockwise by evaluating the vector product between the vectors ab and cd, then we can compute the angle attribute between -π and π in order to improve the discrimination quality of this attribute.
  2) Length of the reference line ab.
  3) Length of the second line cd.
  4) Distance ac.
  5) Distance bd.
  6) Distance ad.
  7) Distance bc.
- In the second set of experiments, the following five translation-, rotation-, and scale-invariant relational attributes are used:
  1) Angle between the lines ab and cd [range illegible in the slides].
  2) Relative position ratio: 1/((1/2) + (l_ib/l_ab)).
  3) Line length ratio: min{l_ab, l_cd}/max{l_ab, l_cd}.
  4) End point ratio: min{l_ac, l_bd}/max{l_ac, l_bd}.
  5) Cross end point ratio: min{l_ad, l_bc}/max{l_ad, l_bc}.
- If the query line patterns are corrupted by noise, then making use of every line pair may not benefit the performance of the system.
- In such a situation, local neighborhood graphs with a neighborhood degree of 6 are known to yield the best performance for translation-, rotation-, and scale-invariant retrieval.
- In this study, the pairwise relational vectors are computed between every line segment in the trademark model base and up to six of its nearest neighbor line segments.

Quantization of relational vectors - A. Self-Organizing Maps
- In this application, it is desirable for each neuron to be the winner with probability 1/M, where M is the total number of nodes in the SOM.
- Although the use of topological neighborhoods attempts to provide a uniform utilization of all nodes, it does not completely resolve the problem.
- Three such approaches, namely convex combination, competitive learning with conscience, and competitive learning with attention, were recently reviewed and evaluated by Bebis et al.
- According to their findings, competitive learning with conscience appears to yield the best performance.
- The input vectors of SOM1 have five or seven dimensions, depending on the attribute set.
- The number of output neurons is identical to the number of bins that we wish to have in the 1-D histogram.

Quantization of relational vectors - B. Shape Histograms and Indexing
- On completing the steps in Table II, we have a 1-D histogram for every shape in the database.
- These histograms are treated as the input vectors to construct SOM2 using the same self-organizing map algorithm as in Table I.
- The dimension of the input feature vectors of SOM2 is identical to the number of nodes in SOM1.
- Every shape is associated with its three best matching neurons.

Experimental results
- Experiments were conducted using a part of the trademark database.
- We conducted two experiments using the two sets of relational attribute vectors defined in Section II.
- In our experiments, SOM1 has 1600 neurons and SOM2 has 225 neurons.
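
The attribute and histogram steps described above can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, the angle is taken as the unsigned angle in [0, π] because its range is not legible in the slides, the line length ratio is assumed to compare l_ab with l_cd, and the histogram is assumed to count nearest-node (winner) hits over a given SOM1 codebook and normalize them to sum to one. Training SOM1 itself (Table I) is not shown.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of the infinite lines p1p2 and p3p4, or None if parallel."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

def scale_invariant_attributes(line1, line2):
    """Five translation-, rotation-, and scale-invariant attributes of a line pair.

    Each line is a pair of (x, y) end points; line1 is the reference line.
    Returns None for parallel lines, which have no intersection point "i".
    """
    i = line_intersection(*line1, *line2)
    if i is None:
        return None
    # "a" and "c" are the end points closer to the intersection point.
    a, b = sorted(line1, key=lambda p: math.dist(p, i))
    c, d = sorted(line2, key=lambda p: math.dist(p, i))
    l_ab, l_cd = math.dist(a, b), math.dist(c, d)
    l_ac, l_bd = math.dist(a, c), math.dist(b, d)
    l_ad, l_bc = math.dist(a, d), math.dist(b, c)
    l_ib = math.dist(i, b)
    ab = (b[0] - a[0], b[1] - a[1])
    cd = (d[0] - c[0], d[1] - c[1])
    cross = ab[0] * cd[1] - ab[1] * cd[0]
    dot = ab[0] * cd[0] + ab[1] * cd[1]
    angle = math.atan2(abs(cross), dot)  # assumption: unsigned angle in [0, pi]
    return (
        angle,
        1.0 / (0.5 + l_ib / l_ab),               # relative position ratio
        min(l_ab, l_cd) / max(l_ab, l_cd),       # line length ratio (assumed form)
        min(l_ac, l_bd) / max(l_ac, l_bd),       # end point ratio
        min(l_ad, l_bc) / max(l_ad, l_bc),       # cross end point ratio
    )

def shape_histogram(vectors, codebook):
    """1-D histogram with one bin per SOM1 node, counting winner hits."""
    counts = [0] * len(codebook)
    for v in vectors:
        winner = min(range(len(codebook)), key=lambda k: math.dist(v, codebook[k]))
        counts[winner] += 1
    total = sum(counts)
    # Assumption: normalize so that shapes with different line counts are comparable.
    return [n / total for n in counts] if total else counts
```

In this sketch, `codebook` stands in for the weight vectors of a trained SOM1, so a 1600-node SOM1 would yield a 1600-bin histogram per shape, which is then the input vector to SOM2.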

Experimental results (cont.)
- As expected, the system always retrieved the original shape when noiseless query objects were presented to it in both experiments.
- Figs. 3 and 4 show some retrieved objects in EXP2 when query objects with a fraction of their lines missing are presented to the system.
- The query image is shown at the top left corner, and the clean version of the object is shown in the second column of the first row.
- From these experiments, it can be concluded that the 5-D rotation-, scale-, and translation-invariant attributes are more robust than the 7-D translation- and rotation-invariant attributes.
- In order to improve the performance in EXP1, a larger nearest neighbor graph should be used.
- The translation-, scale-, and rotation-invariant approach with a limited neighborhood graph would be more suitable in these situations.
- From the experimental results in Figs. 2-4, it is clear that SOM2 was able to retrieve similar shapes using the histogram intersection similarity measure.

Conclusions
- In this paper, we proposed a novel topology-preserving mapping scheme for geometric structural objects using the SOM.
- The proposed approach offers a number of advantages, such as the ability to make use of several relational attributes, the ability to perform dynamic quantization, the flexibility to include and remove model objects and database images, and the ability of the SOM to handle other attributes like color and texture in a homogeneous manner.
- The proposed approach is capable of generating the topological mapping with the desired invariance properties.

Personal Opinion
- Defect: the description of the SOM algorithm is ambiguous.
- Application: multimedia information retrieval.
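
The histogram intersection similarity measure mentioned in the experimental results can be sketched as follows. This is a minimal illustration under stated assumptions, not the system from the paper: the names `histogram_intersection` and `retrieve` are hypothetical, the histograms are assumed to be normalized to sum to one, and the query is compared directly with stored shape histograms rather than through the SOM2 neurons.

```python
def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(x, y) for x, y in zip(h1, h2))

def retrieve(query_hist, database, top_k=3):
    """Return the names of the top_k shapes most similar to the query.

    `database` maps a shape name to its normalized histogram (hypothetical layout).
    """
    scored = [(histogram_intersection(query_hist, h), name)
              for name, h in database.items()]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [name for _, name in scored[:top_k]]

# Toy example with three-bin histograms.
database = {
    "star": [0.5, 0.5, 0.0],
    "ring": [0.0, 0.5, 0.5],
    "bar": [0.0, 0.0, 1.0],
}
best = retrieve([0.6, 0.4, 0.0], database, top_k=2)
```

A query with missing lines changes some bin counts but leaves most of the histogram mass in place, which is one plausible reason an intersection measure degrades gracefully in such cases.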
