VISUAL INFORMATION RETRIEVAL
Presented by Dipti Vaidya

OVERVIEW
• Image Retrieval
• Content-Based Image Retrieval
• Various visual features and their corresponding techniques
• Indexing
• Examples of VIR systems
• Research issues

IMAGE MANAGEMENT
• Digital images and videos are becoming an integral part of human communication
• Gigabytes of images are generated every day
• Preservation
• Organize the information to allow efficient browsing, searching and retrieval

Text-based Image Retrieval
• Annotate images with text and then use a text-based DBMS to perform the image retrieval
• Difficulties:
  – Large amount of labor in manual annotation
  – Digital imagery is a subjective source of information
  – Certain visual properties (patterns, colors, shapes, textures) are difficult or nearly impossible to describe with text

Content-Based Information Retrieval
• Instead of being manually annotated with text-based keywords, images are indexed by their own "visual content"
• Computer vision and pattern recognition

Image query visual content
• Color
• Texture
• Shape
• Spatial Relationship

Color
• Distribution of color is a useful feature for image representation.
• Color distribution is represented as a histogram of intensity values.
• The color of any pixel may be represented in terms of its components, for example RGB.
• A histogram is defined in which each bin corresponds to a range of values of each component.

Color
• Let Q and I be two histograms that both contain N bins. The intersection of the query histogram Q and a database image histogram I is defined as the sum of the bin-wise minima:
  $$ S(Q, I) = \sum_{j=1}^{N} \min(Q_j, I_j) $$
• Problem: it is computationally expensive. For N histogram bins and M images in the database, the computation cost is O(NM). Because the database is searched exhaustively, the only way to reduce search time is to reduce N.

Color
K best colors in a given color space:
• The color space is partitioned into "K super-cells," each of which corresponds to a histogram bin.
• The color histogram of an image or object can then be calculated as the normalized count of the pixels that fall in each of these super-cells.
• The advantage of this approach is that the clustering process takes into account the color distribution of images over the entire database, which minimizes the likelihood of sparsely populated histogram bins.
  – Only a small number of histogram bins tend to capture the majority of the pixels of an image.
  – Therefore only the largest bins (in terms of pixel counts) need to be selected as the representation of any histogram, and since the bins of the query and image histograms are matched, the intersection can still be computed.
  – This does not degrade the performance of histogram matching.

Problems
• A disadvantage of a histogram is that it lacks any information about location. One remedy is to divide an image into sub-areas and calculate a histogram for each of these sub-areas.
• Increasing the number of sub-areas increases the information about location, but it also increases the memory required.

Object detection by color-based techniques:
• Segmentation begins by dividing an image into achromatic and chromatic regions, based strictly on the chroma component of each pixel.
• The hue component is then used to further segment the image into a set of uniform regions based on a histogram difference metric.
• Finally, post-processing is carried out to recover from over-segmentation.
• Performance may be degraded in the presence of strong highlights or shading.
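
The color slides above describe histogram matching by intersection and its O(NM) cost under exhaustive search. The following is a minimal sketch of that idea, not the method of any particular system. It assumes a uniform quantization of RGB into `bins_per_channel` levels per channel (the slides instead suggest K clustered super-cells), and takes the intersection to be the sum of bin-wise minima, a common definition. The function names `color_histogram`, `histogram_intersection` and `rank_database` are illustrative.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized RGB histogram with bins_per_channel**3 bins.

    Assumption: uniform quantization of each channel, not clustered super-cells.
    """
    n_bins = bins_per_channel ** 3
    counts = [0] * n_bins
    for r, g, b in pixels:
        # Map each 0..255 channel value to a level 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        counts[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    total = len(pixels)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


def histogram_intersection(q: Sequence[float], i: Sequence[float]) -> float:
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    if len(q) != len(i):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(q, i))


def rank_database(query: Sequence[float],
                  database: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Exhaustive search: one O(N) intersection per image, O(N*M) overall."""
    scores = [(name, histogram_intersection(query, h)) for name, h in database.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Because every database histogram is compared against the query, the only lever on cost in this sketch is the number of bins, which is the motivation the slides give for keeping only the largest bins.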
Shape Retrieval
• Shape Representation
  – Invariant to translation, rotation and scaling
• Boundary Based
  – Outer Boundary
  – Fourier Descriptor
• Region Based
  – Entire shape region
  – Moment invariant

Texture
• Homogeneity of visual patterns
  – Granularity, directionality and repetitiveness
• Co-occurrence matrix
  – Grey-level spatial dependence of texture
  – Orientation and Distance
• Texture representation
  – Human visual perception of texture
  – 6 visual texture properties: coarseness, contrast, directionality, line-likeness, regularity and roughness
• Wavelets

Indexing
• To make CBIR truly scalable to large image collections, efficient multi-dimensional indexing techniques need to be explored
• Challenges:
  – High dimensionality
  – Non-Euclidean similarity measure

Indexing
• Towards solving these problems, one promising approach is to first perform dimension reduction and then use appropriate multi-dimensional indexing techniques.

Dimension Reduction
At least 2 approaches have appeared in the literature:
• Karhunen-Loeve Transform (KLT)
• Column-wise clustering

KLT
• Considering that an image retrieval system is a dynamic system and new images are continuously added to the image collection, a dynamic update of the indexing structure is indispensable. This algorithm provides such a tool.

Column-wise clustering
• Clustering is normally used to group similar objects together to perform recognition or grouping.
• Clustering can also be used column-wise to reduce the dimensionality of the feature space.
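
To make the dimension-reduction step concrete, here is a minimal batch sketch of the Karhunen-Loeve Transform: feature vectors are projected onto the leading eigenvectors of their covariance matrix. It is an illustration under stated assumptions, not the algorithm from the slides. The eigenvectors are found by plain power iteration with deflation (chosen only to keep the example dependency-free), the dynamic update mentioned above is not covered, and the names `klt_basis` and `project` are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def covariance_matrix(data: Matrix) -> Tuple[List[float], Matrix]:
    """Return the per-dimension mean and the d x d covariance of the rows."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)]
           for a in range(d)]
    return mean, cov


def mat_vec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def leading_eigenpair(m: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix.

    Caveat: fails if the start vector is orthogonal to the dominant eigenvector.
    """
    d = len(m)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v


def klt_basis(data: Matrix, k: int) -> Tuple[List[float], Matrix]:
    """Mean and the k leading eigenvectors of the data covariance."""
    mean, cov = covariance_matrix(data)
    d = len(cov)
    basis: Matrix = []
    for _ in range(k):
        eigenvalue, v = leading_eigenpair(cov)
        basis.append(v)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, basis


def project(vector: List[float], mean: List[float], basis: Matrix) -> List[float]:
    """Reduce a d-dimensional feature vector to len(basis) coordinates."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * b for c, b in zip(centered, axis)) for axis in basis]
```

The reduced vectors returned by `project` are what a multi-dimensional index would then store, in place of the original high-dimensional features.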
Multi-dimensional Indexing Techniques
• Bucketing Algorithm
• K-d tree
• K-D-B tree
• R-tree and its variants, R+ tree and R*-tree

CBIR Systems
• More than 80 systems have been identified
Most image retrieval systems support one or more of the following options:
• Random browsing
• Search by example
• Search by sketch
• Search by text
• Navigation with customized image categories

CBIR Systems
• QBIC
• Virage
• Photobook
• VisualSEEK
• WebSEEK
• Netra
• MARS
• Other Systems

QBIC
• Two key properties of QBIC are (1) its use of image and video content in the queries, that is, computable properties of color, texture, shape, and motion of images, videos, and their objects, and (2) its graphical query language, in which queries are posed by drawing, selecting, and other graphical means.

QBIC
• QBIC has two main components: database population (the process of creating an image database) and database query.
• During population, images and videos are processed to extract features describing their content (colors, textures, shapes, and camera and object motion), and the features are stored in a database.
• During the query, the user composes a query graphically. Features are generated from the graphical query and then input to a matching engine that finds images or videos from the database with similar features.

QBIC – Data Model
• For both population and query, the QBIC data model has still images or scenes (full images) that contain objects (subsets of an image), and video shots that consist of sets of contiguous frames and contain motion objects.
• Videos are broken into clips called shots. Representative frames, or r-frames, are generated for each extracted shot.
• R-frames are treated as still images, and features are extracted and stored in the database.
• Further processing of shots generates motion objects, for example, a car moving across the screen.
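
The data model described above (scenes containing objects, videos broken into shots, each shot with an r-frame and motion objects) can be sketched as a handful of plain records. This is only an illustration of the relationships named in the slides. All class and field names are assumptions made for the example and do not reflect QBIC's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FeatureVector = List[float]


@dataclass
class ImageObject:
    """A subset of an image, such as an outlined region."""
    label: str
    features: Dict[str, FeatureVector] = field(default_factory=dict)


@dataclass
class Scene:
    """A full still image; r-frames are stored as scenes too."""
    name: str
    features: Dict[str, FeatureVector] = field(default_factory=dict)
    objects: List[ImageObject] = field(default_factory=list)


@dataclass
class MotionObject:
    """An object tracked across a shot, e.g. a car moving across the screen."""
    label: str
    features: Dict[str, FeatureVector] = field(default_factory=dict)


@dataclass
class Shot:
    """A set of contiguous frames with one representative frame."""
    first_frame: int
    last_frame: int
    r_frame: Scene
    motion_objects: List[MotionObject] = field(default_factory=list)

    def frame_count(self) -> int:
        return self.last_frame - self.first_frame + 1


@dataclass
class Video:
    name: str
    shots: List[Shot] = field(default_factory=list)


def still_images(scenes: List[Scene], videos: List[Video]) -> List[Scene]:
    """Everything queryable as a still image: scenes plus every shot's r-frame."""
    return list(scenes) + [shot.r_frame for video in videos for shot in video.shots]
```

Treating r-frames as ordinary `Scene` records mirrors the statement that r-frames are handled as still images, so the same feature extraction and matching code can serve both.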
QBIC – Sample Queries
• For each full-scene image, identified image object, r-frame, and identified video object resulting from the above processing, a set of features is computed to allow content-based queries. The features are computed and stored during database population.
• A multi-object query asks for images that contain both a red round object and a green textured object. The features are standard color and texture. The matching is done by combining the color and texture distances. Combining distances is applied to arbitrary sets of objects and features to implement logical "and" semantics.

QBIC – Database Population
• In still image database population, the images are reduced to a standard-sized icon called a thumbnail and annotated with any available text information.
• Object identification is an optional but key part of this step. It lets users manually, semi-automatically, or fully automatically identify interesting regions, which we call objects, in the images.
• Internally, each object is represented as a binary mask. There may be an arbitrary number of objects per image. Objects can overlap and can consist of multiple disconnected components, like the set of dots on a polka-dot dress.
• Text, like "baby on beach," can be associated with an outlined object or with the scene as a whole.

Open Research Issues
• Feature Extraction
  – Robust retrieval algorithms
• Multi-dimensional Indexing
  – Retrieval speed
• Human-computer interaction
  – Query Construction
• Human Visual Perception
• Evaluation Criterion
  – Performance: effective measures; precision/recall
  – Large-scale image test bed
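
The precision/recall measures named under Evaluation Criterion can be computed for a single query as in the sketch below. It assumes each image has a unique identifier and that a ground-truth set of relevant images is available. The function name `precision_recall` is illustrative.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant, for one query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

Precision penalizes irrelevant images in the result list, while recall penalizes relevant images that were missed, so the two are usually reported together.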