National Yunlin University of Science and Technology
Class distributions on SOM surfaces for feature extraction and object retrieval
Advisor: Dr. Hsu
Graduate: Kuo-min Wang
Authors: Jorma T. Laaksonen*, J. Markus Koskela, Erkki Oja
2005, Expert Systems with Applications
Intelligent Database Systems Lab, N.Y.U.S.T. I. M.

Outline
  Motivation
  Objective
  Introduction
  Class Distributions
  BMU Probabilities
  BMU Entropy
  SOM Surface Convolutions
  Multiple Feature Extractions
  Bayesian Decision Estimation
  Personal Opinion

Motivation
  A Self-Organizing Map (SOM) is typically trained in unsupervised mode, using a large batch of training data.
  Even from the same data, qualitatively different distributions can be obtained by using different feature extraction techniques.

Objective
  We use such distributions for comparing different classes and different feature representations of the data in our content-based image retrieval system PicSOM.
  The information-theoretic measures of entropy and mutual information are suggested to evaluate the compactness of a distribution and the independence of two distributions.

Introduction
  Image retrieval involves segmentation, feature extraction, representation, and query processing.
  Segmentation
    Partitions an image into distinct regions. Most of the time this means finding the edges of the objects in the image and then deciding whether each region is meaningful.
  Feature extraction
    Refers to the features of a particular region of an image.
    Feature extraction is directly tied to the feature representation, because different representations require different extraction methods.
    Typical features: color, shape, texture.

Introduction (cont.)
  We study how object class histograms on SOMs can be given interpretations in terms of probability densities and information-theoretic measures.
  Entropy and mutual information (Cover & Thomas, 1991)
  With a good feature, the class is heavily concentrated on only a few nearby map elements, giving a low value of entropy.
  The mutual information of two features' distributions is a measure of how independent those features are.

Class Distributions
  Normalized to unit sum, the hit frequencies give a discrete histogram, which is a sample estimate of the probability distribution of the class on the SOM surface.
  The shape of the distribution depends on several factors:
    The distribution of the original data
      We cannot control the very-high-dimensional pattern space.
    The feature extraction technique in use
      It affects the metrics and the distribution of all the generated feature vectors.
    Feature invariance
      Some pattern space directions are retained better than others.
      When the feature extraction works properly, semantically similar patterns are mapped nearer to each other.

Class Distributions (cont.)
    The overall shape of the training set
      After it has been mapped from the original data space to the feature vector space, it determines the overall organization of the SOM.
    The class distribution of the studied object subset or class, relative to the overall shape of the feature vector distribution.

Class Distributions (cont.)
  Measures of the denseness or locality of feature vectors on a SOM
    Smoothed data histograms, SDH (Pampalk, Rauber, & Merkl, 2002)
      Each data point is mapped not only to its nearest SOM unit but to its s nearest units, with reciprocally decreasing fractions.
    Quantitative locality measures
      Map usage, average pair distance, fragmentation, and purity (Pullwitt, 2002)
      They fail to take into account the topological structure of the class.

BMU Probabilities
  Calculating the a priori probability of each SOM unit being the best-matching unit (BMU) for any vector x of the feature space.
  Probability density function (pdf)

BMU Probabilities (cont.)
  Voronoi region
    The set of vectors in the original feature space that are closer to the weight vector of unit i than to any other weight vector.
  We are actually replacing the continuous pdf with a discrete probability histogram by counting the number of times that any given map unit is the BMU.
  The probability histogram of class C on the SOM surface

BMU Entropy
  The entropy H of a distribution $P = (P_0, P_1, \ldots, P_{k-1})$ is calculated as
    $H(P) = -\sum_{i=0}^{k-1} P_i \log P_i$

SOM Surface Convolutions
  Drawback of entropy
    The calculation of entropies does not yet take into account the spatial topology of the SOM units in any way.
    It is the topological order of the units that separates SOM from other vector quantization methods.
  The convolution method bears similarity to the smoothed data histogram approach (Pampalk, Rauber, & Merkl, 2002), in which data points are not mapped one-to-one to their BMUs but spread into the s closest map units in the feature space.

SOM Surface Convolutions (cont.)
  The larger the convolution window, the smoother the overall shape of the distribution, because the details vanish.
  Selecting a proper size for the convolution mask can be seen as a form of the general scale-space problem.

Multiple Feature Extractions
  It is possible to use more than one feature extraction method in parallel.
  In CBIR, three different feature categories are generally recognized: color, texture, and shape features.
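
The class histogram, entropy, and surface convolution from the preceding slides can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 6x6 map, and the 3x3 mask are assumptions chosen for illustration, since the slides do not give the window shape. The example compares two classes with identical raw entropy, one on adjacent units and one on distant units.

```python
import numpy as np

def class_histogram(bmu_indices, n_units):
    """Count BMU hits per map unit and normalize to unit sum."""
    hits = np.bincount(bmu_indices, minlength=n_units).astype(float)
    return hits / hits.sum()

def entropy(p):
    """Shannon entropy H = -sum_i P_i log P_i, treating 0 log 0 as 0."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def convolve_surface(p_grid, mask):
    """Zero-padded 2-D convolution on the SOM grid (odd-sized mask),
    renormalized to unit sum."""
    mh, mw = mask.shape
    ph, pw = mh // 2, mw // 2
    padded = np.pad(p_grid, ((ph, ph), (pw, pw)))
    flipped = mask[::-1, ::-1]  # flip so this is a convolution, not a correlation
    out = np.zeros(p_grid.shape)
    for dy in range(mh):
        for dx in range(mw):
            window = padded[dy:dy + p_grid.shape[0], dx:dx + p_grid.shape[1]]
            out += flipped[dy, dx] * window
    return out / out.sum()

# Hypothetical 6x6 SOM; unit index = row * 6 + col.
rows, cols = 6, 6
mask = np.array([[1.0, 2.0, 1.0],
                 [2.0, 4.0, 2.0],
                 [1.0, 2.0, 1.0]])  # assumed window, not taken from the slides

# Class on two adjacent units, (2,2) and (2,3), versus two distant units, (1,1) and (4,4).
p_adjacent = class_histogram([14, 15], rows * cols).reshape(rows, cols)
p_scattered = class_histogram([7, 28], rows * cols).reshape(rows, cols)

s_adjacent = convolve_surface(p_adjacent, mask)
s_scattered = convolve_surface(p_scattered, mask)

print(entropy(p_adjacent), entropy(p_scattered))  # equal: raw entropy ignores topology
print(entropy(s_adjacent), entropy(s_scattered))  # adjacent class is lower after smoothing
```

Before smoothing, the two classes are indistinguishable by entropy. After smoothing, the topologically compact class has the lower entropy, which is the effect the convolution is meant to capture.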
  Let $P = (P_0, P_1, \ldots, P_{k-1})$ and $Q = (Q_0, Q_1, \ldots, Q_{k-1})$ denote the distributions of two features.
  H(P) and H(Q) measure the distributions of the single feature vectors, while the mutual information I(P, Q) can be used for studying the interplay between them.

Multiple Feature Extractions (cont.)
  HT and nHT have by far the largest values for mutual information.
  CS and SC have the largest value on both SOMs.
  The mutual information of EH and HT is high on the smaller SOM, but not so much on the larger SOM, which has more resolution.

Bayesian Decision Estimation
  Using the Bayesian decision rule to make optimal classification
  Posterior probability
  To decide on the jth object's membership in class C

Bayesian Decision Estimation (cont.)
  Query by example
  Relevance feedback
    The system presents a number of images to the user at each query round, and the user is expected to evaluate their relevance to her current task.
    The selection is incrementally fine-tuned so that more and more relevant images will be shown at consecutive query rounds.
  Choosing the next image for the user
    Maximal probability of relevance
    Minimal probability of nonrelevance

Bayesian Decision Estimation (cont.)
  Relevance feedback problem
    The distributions are updated by adding the hits caused by the new relevant and nonrelevant samples to the map units, convolving them with the mask used, and renormalizing the distributions to unit sums.
    Let us denote the history of the query up to the (t-1)'th round by
      $H_{t-1} = (D_0, R_0, D_1, R_1, \ldots, D_{t-1}, R_{t-1})$
    Maximize the current probability of relevance
      $P(x \in x_{rel} \mid H_{t-1})$
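
The mutual information between two features and the relevance feedback update described above can be sketched as follows. This is a hedged illustration, not the PicSOM implementation. The slide formulas did not survive, so the joint distribution is assumed to be estimated from the BMU pairs that each object hits on the two SOMs. The scoring rule (smoothed relevant distribution minus smoothed nonrelevant distribution) is also an assumption, and the names `mutual_information`, `relevance_scores`, the 5x5 map, and the candidate images are hypothetical.

```python
import numpy as np

def smooth(grid, mask):
    """Zero-padded 2-D convolution with an odd-sized mask."""
    mh, mw = mask.shape
    padded = np.pad(grid, ((mh // 2, mh // 2), (mw // 2, mw // 2)))
    flipped = mask[::-1, ::-1]
    out = np.zeros(grid.shape)
    for dy in range(mh):
        for dx in range(mw):
            out += flipped[dy, dx] * padded[dy:dy + grid.shape[0], dx:dx + grid.shape[1]]
    return out

def mutual_information(bmu_a, bmu_b, k_a, k_b):
    """I(P, Q) = sum_ij J_ij log(J_ij / (P_i Q_j)), where J is the joint
    histogram of each object's BMU on SOM A and on SOM B (assumed estimator)."""
    joint = np.zeros((k_a, k_b))
    np.add.at(joint, (np.asarray(bmu_a), np.asarray(bmu_b)), 1.0)
    joint /= joint.sum()
    p = joint.sum(axis=1, keepdims=True)  # marginal on SOM A
    q = joint.sum(axis=0, keepdims=True)  # marginal on SOM B
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (p * q)[nz])).sum())

def relevance_scores(rel_bmus, nonrel_bmus, shape, mask):
    """Add hits to the map, convolve, renormalize to unit sum, and
    subtract the nonrelevant distribution from the relevant one."""
    n_units = shape[0] * shape[1]
    scores = np.zeros(shape)
    for bmus, sign in ((rel_bmus, 1.0), (nonrel_bmus, -1.0)):
        if len(bmus) == 0:
            continue
        hits = np.bincount(bmus, minlength=n_units).astype(float).reshape(shape)
        dist = smooth(hits, mask)
        scores += sign * dist / dist.sum()
    return scores

# Identical BMU assignments are maximally dependent; the crossed ones are independent.
mi_same = mutual_information([0, 1, 2, 3], [0, 1, 2, 3], 4, 4)
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1], 2, 2)

# Hypothetical query round on a 5x5 SOM; unit index = row * 5 + col.
mask = np.array([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]])
scores = relevance_scores(rel_bmus=[6, 7], nonrel_bmus=[18], shape=(5, 5), mask=mask)

# Unseen candidate images and their BMUs (made-up data).
candidates = {"img_a": 12, "img_b": 11, "img_c": 23}
next_image = max(candidates, key=lambda name: scores.flat[candidates[name]])
print(mi_same, mi_indep, next_image)
```

Choosing the candidate with the highest score favors map regions near relevant hits and away from nonrelevant ones, which matches the maximal-relevance and minimal-nonrelevance criteria listed above.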

Conclusions
  The entropy of the distribution characterizes quantitatively the compactness of an object class.
  The proposed method can be used as an efficient way of comparing features and the SOMs produced with them.
  We showed that the mutual information of the distributions can be used to identify both the most similar and the most uncorrelated features; it can also be used to select the subset of feature extraction methods with the most independent features.
  The Bayesian decision rule is used for choosing either the most probable class for a data item or the most likely data item belonging to a given class.

Personal Opinions
  Advantage
    Combines entropy, mutual information, and the smoothing method to find important and independent features.
  Application
    Feature extraction
  Drawback
    The structure of the paper is not good, and some diagrams are unclear, so it is difficult to understand.