CS848 Similarity Search in Multimedia Databases
Dr. Gisli Hjaltason

Content-based Retrieval Using Local Descriptors: Problems and Issues from Databases Perspective
Laurent Amsaleg & Patrick Gros
IRISA-CNRS, Campus de Beaulieu, Rennes, France

Presented by: Wei Jiang
February 19, 2003

Outline
Introduction
Image Processing Techniques
  – Global vs Local Descriptors
  – Extracting Interest Points
  – Computing Local Descriptors
  – Extension to Colour Images
Database Indexing Techniques
  – Traditional Approaches
  – VA-File and Pyramid-Tree
Performance Evaluation
Conclusion & Perspectives

Introduction
Descriptor
  – Root of a content-based image retrieval system
  – Represented by multi-dimensional vectors of real numbers
  – Encodes specific information extracted from an image
  – Invariant to some types of variations
Content-based retrieval
  – Image processing techniques to extract descriptors from images
      • Large-grain recognition: color histogram, grey-level histogram
      • Fine-grain recognition: local descriptors
  – Database techniques to store descriptors and accelerate searches
      • Curse of dimensionality
      • Pyramid-Tree and VA-File

Introduction (cont'd)
Motivation: To explore the consequences of using local descriptors together with up-to-date multi-dimensional database indexing strategies.

Global vs. Local Descriptors
Global Descriptor
  – Computes one descriptor per image
  – Cannot identify elements within images
  – Lower computation cost
  – Smaller database size
  – Lower searching cost
  – Globally robust
Local Descriptor
  – Computes several descriptors per image
  – Can identify elements within images
  – Higher computation cost
  – Bigger database size
  – Higher searching cost
  – Increases the recognition power

Extracting Interest Points
Interest Points
  – Points in one image that will also be found in similar images
  – Located where the signal changes in two dimensions

Harris Corner Detector
Gaussian filter to avoid image noise
  – Gaussian function: G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
  – Gaussian distribution and discrete approximation (with mean (0,0) and σ = 1)

Harris Corner Detector (cont'd)
Gaussian filter to avoid image noise
  – Computing the convolution of the original signal with the Gaussian function to get the smoothed signal I
Computing the eigenvalues of the autocorrelation matrix of I
Significant eigenvalues indicate an interest point

Computing Local Descriptor
Goal:
  – To make local descriptors invariant with respect to the types of variations
Computing in two steps:
  – Computing the derivatives of the smoothed signal I (up to the third order), to provide a basic description
  – Mixing the derivatives to enforce invariance properties and to make descriptors robust to changes

Local Descriptors for Grey-level Images
Gaining translational and rotational invariance
  – Nine invariant quantities to eliminate the angle of rotation
Gaining photometric invariance
  – Illumination is modelled by I → aI + b
Gaining scale invariance
  – Multi-scale approach
  – F is a function and a is a scale: F(x) = G(ax)

Extension to Colour Images
RGB system
Extracting interest points with the Harris detector
Computing the derivatives separately for every channel
Mixing the derivatives
  – Gaining rotational invariance
  – Gaining scale invariance
  – Gaining photometric invariance

Database Indexing Techniques
Problem:
  – The space needed for the descriptors gets too big to fit in main memory
Solution:
  – To store descriptors on disks
  – Multi-dimensional index structures to accelerate searches
Goal:
  – To minimize the resulting number of I/Os

Traditional Approaches
Data-partitioning index methods
  – Dividing the data space according to the distribution of data
      • R-Tree: minimum bounding rectangles and overlaps
      • SS-Tree: bounding spheres instead of rectangles
      • SR-Tree: intersection of a bounding sphere and a bounding rectangle
      • TV-Tree: divides the dimensions into three classes
  – Drawback: in high-dimensional space, the probability of accessing every index page gets close to 1

Traditional Approaches (cont'd)
Space-partitioning indexing
  – Dividing the data space along predefined lines, regardless of the actual data clustering
  – Grid-file, KDB-Tree, etc.
  – Drawbacks
      • Inefficient in high-dimensional space
      • Indexing large volumes of empty space
      • When the query point is near a cell boundary, the search cost increases

VA-File
To improve the sequential search
Splitting each dimension, and encoding the grid cells
One file stores all the descriptors
Another file stores the geometrical approximations of these descriptors, associating each descriptor id with a cell number
The searching algorithm computes the geometrical approximation, determines the close cells, and scans them in increasing order of distance

Pyramid-Tree
Dividing a d-dimensional space into 2×d pyramids; the top of each pyramid is the center of the data space
Each pyramid is cut into slices parallel to its base
Any point of the multi-dimensional space is mapped to a pair (pyramid number, height in the pyramid)
A given slice of a specific pyramid is stored as a page of a B+-tree
The number of slices increases linearly (and not exponentially) with the number of dimensions

Performance Evaluations
Experimental environment
  – SUN Ultra 5 workstation running SunOS 5.7
  – CPU: 333 MHz UltraSPARC-IIi
  – Main memory: 384 MB
  – Local secondary storage: 8 GB
Databases
  – One color image database
  – One grey-level image database
  – A third one for the recognition evaluation test only

Experiments
Comparing the recognition power of color and grey descriptors
Influence of the Dimensionality of Data
Influence of the Database Size
Impact of the Number of Descriptors in a Request

Experiment 2: Influence of the Dimensionality of Data

Experiment 3: Influence of the Database Size

Experiment 4: Impact of the Number of Descriptors in a Request

Conclusion and Perspectives
Slowdown is caused by the dimensionality of data, the size of the database, and the number of descriptors.
The three efficient multi-dimensional indexing techniques do NOT cope efficiently with fine-grain recognition using local descriptors.
It is crucial to come up with new indexing techniques specially designed to efficiently support the use of local descriptors.

Research Direction
Numerous local descriptors for a single query create redundancy
Exploit the distribution of data to accelerate the queries
Change the management of memory to benefit from consecutive queries
Use several low-dimension indexes instead of a unique high-dimension index
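
Pyramid-Tree mapping: a sketch
The mapping of a point to a pair (pyramid number, height in the pyramid) described on the Pyramid-Tree slide can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the implementation evaluated in the talk: it assumes the data space is normalized to the unit hypercube [0, 1]^d with its center at 0.5 in every dimension, it numbers the pyramids 0 to 2d−1, and the names pyramid_key and pyramid_value are hypothetical.

```python
from typing import Sequence, Tuple


def pyramid_key(point: Sequence[float]) -> Tuple[int, float]:
    """Map a point of [0, 1]^d to (pyramid number, height in the pyramid).

    Assumption: the data space is the unit hypercube and all 2*d pyramids
    share their top at the center (0.5, ..., 0.5).
    """
    d = len(point)
    if d == 0:
        raise ValueError("point must have at least one dimension")
    if not all(0.0 <= x <= 1.0 for x in point):
        raise ValueError("coordinates must lie in [0, 1]")

    # The dimension that deviates most from the center selects the pyramid.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))

    # Height = distance from the center along that dimension (0 at the top).
    height = abs(0.5 - point[j_max])

    # Two pyramids per dimension: one on the low side, one on the high side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    return pyramid, height


def pyramid_value(point: Sequence[float]) -> float:
    """Collapse the pair into one sortable key usable in a B+-tree.

    The height never exceeds 0.5, so keys of different pyramids cannot
    collide.
    """
    pyramid, height = pyramid_key(point)
    return pyramid + height


if __name__ == "__main__":
    print(pyramid_key([0.125, 0.5, 0.625]))  # (0, 0.375)
    print(pyramid_value([0.5, 0.875]))       # 3.375
```

Because every point collapses to a single number, points of the same slice of the same pyramid receive neighbouring keys, which is what allows a slice to be stored as a page of a one-dimensional B+-tree.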