Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Selected Advanced Topics Storing and Retrieving Images Content-based Image/Video Indexing and Retrieval G52IIP, School of Computer Science, University of Nottingham 1 Problem Find all images contain horses ….. G52IIP, School of Computer Science, University of Nottingham 2 Text-based technology Annotation: Each image is indexed with a set of relevant text phrases, e.g., Appropriate phrases to describe the content of this image include: Mother, Child, Vegetable, Yellow, Green, Purple …. Retrieval: based on text search technology G52IIP, School of Computer Science, University of Nottingham 3 Text-based technology - Drawbacks Annotation - subjective different people may use different phrases to describe the same or very similar image/content G52IIP, School of Computer Science, University of Nottingham 4 Text-based technology - Drawbacks Annotation - Laborious It will take a lot of man-hours to label large image/video databases with 1m+ items G52IIP, School of Computer Science, University of Nottingham 5 Content-based Technology Using Visual Examples G52IIP, School of Computer Science, University of Nottingham 6 Content-based Technology Using Visual Features r% g% b% G52IIP, School of Computer Science, University of Nottingham 7 Content-based Technology Content-based image indexing and retrieval (CBIR), is an image database management technique, which indexes the data items (images, or video clips) using visual features (e.g., color, shape, and texture) of the images or video clips. A CBIR system lets users find pictorial information in large image and video databases based on visual cues, such as colour, shape, texture, and sketches. G52IIP, School of Computer Science, University of Nottingham 8 Content-based Technology The visual features, computed using image processing and computer vision techniques are used to represent the image contents numerically. Image Content - a high level concept, e.g., this image is a sunset scene, a landscape scene, etc. Numerical Content Representations - Low level numbers, often the same set of numbers can come from very different images, making the task very hard! G52IIP, School of Computer Science, University of Nottingham 9 Content-based Technology Techniques for Computing Visual Features/Representing Image Contents – some are very sophisticated, and many are still not matured hence the computational processes in some cases are automatic but in other cases are semi-automatic in the most difficult cases, it may have to be done manually G52IIP, School of Computer Science, University of Nottingham 10 Content-based Technology Comparing Image Content/Retrieving Images based on Content Simple approaches - compute the metric distance between low level numerical representations Advanced Approaches - using sophisticated pattern recognition, artificial intelligence, neural networks, and interactive (relevant feed-back) techniques to compare the visual content (low level numerical features) G52IIP, School of Computer Science, University of Nottingham 11 Content-based Technology - IBM QBIC System The IBM’s QBIC (Query by Image and Video Content) system is one of the early examples of CBIR system developed in 1990s. The system lets users find pictorial information in large image and video databases based on color, shape, texture, and sketches. G52IIP, School of Computer Science, University of Nottingham 12 Content-based Technology - IBM QBIC System The User Interfaces Module Let user specify visual query by drawing, selecting from a color wheel, selecting a sample image … Display results as an ordered set of images The Database Population and Database Query Modules Database population - process images and video to extract features describing their content - colors, textures, shapes and camera and object motion, and store the features in a database Database Query - let user compose a query graphically, extract features from the graphical query, input to a matching engine that finds images or video clips with similar features G52IIP, School of Computer Science, University of Nottingham 13 Content-based Technology - IBM QBIC System The Data Model Still image, or scene - full image Objects contained in the full image - subsets of an image Videos - broken into clips called shots - sets of contiguous frames Representative frames, the r-frames, are generated for each shot R-frames are treated as still image - from which features are extracted and stored in the database. Further processing of shots generates motion objects - e.g., a car moving across the screen. G52IIP, School of Computer Science, University of Nottingham 14 Content-based Technology - IBM QBIC System Queries are allowed on Objects - e.g., Find images with a red round object Scenes - e.g., Find images that have approximately 30% red and 15% blue colors Shots - e.g., Find all shots panning from left to right A combination of above - e.g., Find images that have 30% red and contain a blue textured objects G52IIP, School of Computer Science, University of Nottingham 15 Content-based Technology - IBM QBIC System Similarity Measures Similarity queries are done against the database of pre-computed features using distance functions between the features Examples include, Euclidean distance, City-block distance, …. These distance functions are intended to mimic human perception to approximate a perceptual ordering of the database But, it is often the case that a distance metric in a feature space will bear little relevance to perceptual similarity. G52IIP, School of Computer Science, University of Nottingham 16 Content-based Technology - Basic Architecture Similarity Measures Imagery Meta data Record1 color texture shape positions …. Record2 color texture shape positions …. color texture shape positions …. Record3 Record4 Record n Query color color color texture shape texture shape texture shape positions positions positions …. …. …. Image Database G52IIP, School of Computer Science, University of Nottingham 17 Colour - An effective Visual Cue Colors can be a more powerful visual cue than you initially thought! What soft drink Which fruit? G52IIP, School of Computer Science, University of Nottingham 18 Colour - An effective Visual Cue In many cases, color can be very effective. Here is an example Results of content-based image retrieval using 4096-bin color histograms G52IIP, School of Computer Science, University of Nottingham 19 Colour Spaces Colour Models RGB Model: This colour model uses the three NTSC primary colours to describe a colour within a colour image. B Cyan Sometimes in Computer Vision, it is convenient to use rg chromaticity space White Magenta G gray R r = R/(R+G+B) g= G/(R+G+B) Yellow G52IIP, School of Computer Science, University of Nottingham 20 Colour Spaces YIQ Model: The YIQ models is used in commercial colour TV broadcasting, which is a re-coding of RGB for transmission efficiency and for maintaining compatibility with monochrome TV standard. Y 0.299 I 0.596 Q 0.212 0.587 0.114 R 0.275 0.321 G 0.523 0.331 B In YIQ, the luminance (Y) and colour information (I and Q) are de-coupled. YCbCr Model Y = 0.299R + 0.587G + 0.114B Cb = -0.169R - 0.331G + 0.500B Cr = 0.500R - 0.419G - 0.081B G52IIP, School of Computer Science, University of Nottingham 21 Perceived Color Differences One problem with the RGB colour system is that colorimetric distances between the individual colours don't correspond to perceived colour differences. For example, in the chromaticity diagram, a difference between green and greenish-yellow is relatively large, whereas the distance distinguishing blue and red is quite small. r = R/(R+G+B) g= G/(R+G+B) G52IIP, School of Computer Science, University of Nottingham 22 CIELAB CIE (Commission Internationale de l'Eclairage) solved this problem in 1976 with the development of the Lab colour space. A three-dimensional color space was the result. In this model, the color differences which you perceive correspond to distances when measured colorimetrically. The a axis extends from green (-a) to red (+a) and the b axis from blue (-b) to yellow (+b). The brightness (L) increases from the bottom to the top of the threedimensional model. With CIELAB what you see is what you get (in theory at least). G52IIP, School of Computer Science, University of Nottingham 23 Colour Histogram Given a discrete colour space defined by some colour axes (e.g., red, green, blue), the colour histogram is obtained by discretizing the image colours and counting the number of times each discrete colour occurs in the image. The image colours that are transformed to a common discrete colour are usefully thought of as being in the same 3D histogram bin centered at that colour. G52IIP, School of Computer Science, University of Nottingham 24 Colour Histogram Construction Step 1 Colour quantization (discretizing the image colours) Step 2 Count the number of times each discrete colour occurs in the image. G52IIP, School of Computer Science, University of Nottingham 25 Colour Quantization A true colour, 24-bit/pixel image (8 bit - R, 8 bit - G, 8 bit -B), will have 224 = 16777216 bins ! That is, each image will have to be represented by over 16 million numbers computationally impossible in practice not necessary Colour quantization - reduce the number of (colours) bins G52IIP, School of Computer Science, University of Nottingham 26 Simple Colour Quantization Simple Colour Quantization (Non-adaptive) Divide each colour axis into equal length sections (different axis can be divided differently). Map (quantize) each colour into its corresponding bin G52IIP, School of Computer Science, University of Nottingham 27 Simple Colour Quantization Example: In RGB space, quantize each image colour into one of 8x8x8 = 512 colour bins R 0 31 63 95 127 159 191 223 255 0 31 63 95 127 159 191 223 255 0 31 63 95 127 159 191 223 255 G B Colour Bin Colour Bin (123,23,45) (3, 0, 1 ) (122, 28, 46) (3, 0, 2) (132, 29,50) (4, 0, 1) (122, 172, 27) (3, 5, 0) (121,26,48) (x, x, x) (142, 28, 46) (x, x, x) G52IIP, School of Computer Science, University of Nottingham 28 Advanced Colour Quantization Adaptive Colour quantization (Not required) B A pixel is a point in the 3D colour space Vector Quantization K-means clustering K representative colours The colour histogram consists of K bins, each corresponding to one of the representative colours. A pixels is classified as belonging to the nth bin if the nth representative colour is the one (amongst all the representative colours) that is closest to the pixel. G R Representative colours G52IIP, School of Computer Science, University of Nottingham 29 Colour Histogram Construction - An Example A 3 x 3, 24-bit/pixel image has following RGB planes Green Red 23 11 22 24 24 12 77 69 12 213 11 22 Blue 24 232 12 77 239 12 23 12 22 24 24 123 77 69 123 Construct an 8-bin colour histogram (using simple colour quantization, treating each axis as equally important). Bin (0,0,0) = Bin (0,0,1) = Bin (0,1,0) = Bin (0,1,1) = Bin (1,0,0) = Bin (1,0,1) = Bin (1,1,0) = Bin (1,1,1) = G52IIP, School of Computer Science, University of Nottingham 30 Colour Histogram Construction - An Example Quantized Colour Planes Green Red 0 0 0 0 0 0 0 0 0 1 0 0 Blue 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 Count the number of times each discrete colour occurs in the image. Bin (0,0,0) = 6 Bin (0,0,1) = 0 Bin (0,1,0) = 3 Bin (0,1,1) = 0 Bin (1,0,0) = 0 Bin (1,0,1) = 0 Bin (1,1,0) = 0 Bin (1,1,1) = 0 G52IIP, School of Computer Science, University of Nottingham 31 Colour Based Image Indexing # of pixels The histogram of colours in an image defines the image colour distribution # of pixels 10 0 0 0 100 10 30 0 0 Color Distribution = (10,0,0,0,100,10,30,0,0) G52IIP, School of Computer Science, University of Nottingham 32 Colour based Image Retrieval Images are similar if their histograms are similar! Colour Distribution = (10,0,0,0,100,10,30,0,0) Dissimilar Colour Distribution Colour Distribution = = (0,40,0,0,0,0,0,0,110,0) Similar! (10,0,0,0,90,10,40,0,0) G52IIP, School of Computer Science, University of Nottingham 33 Formalizing Similarity Colour Distribution Colour Distribution 1 = 2 = H (10,0,0,0,100,10,30,0,0) (0,40,0,0,0,0,0,110,0) H 1 2 Similarity(Image 1, Image 2) = D (H1, H2) where D( ) is a distance measure between vectors (histograms) H1 and H2 G52IIP, School of Computer Science, University of Nottingham 34 Metric Distances A distance measure D( ) is a good measure if it is a metric! D(a,b) is a metric if D(a,a) = 0 (the distance from a to itself is 0 D(a,b) = D(b,a) (the distance from a to b = distance from b to a) D(a,c) <= D(a,b) + D(b,c) ( triangle inequality [ the straight line distance is always the least!] ) a D(a,b) + D(b,c) should be no smaller than D(a,c) b c G52IIP, School of Computer Science, University of Nottingham 35 Common Metric Distance measures Histogram Intersection, HI n HI(H1, H2) = 1 2 min( H , H i i ) i 1 H1 = (10, 0, H2 = ( 0, 0, 0, 100, 10, 30, 0, 40, 0, 0, 0, 0) 6, 0, 110, 0) Similarity = HI(H1, H2) = 0 + 0 + 0 + 0 + 0 + 6 + 0 + 0 = 6 G52IIP, School of Computer Science, University of Nottingham 36 Common Metric Distance measures Euclidean or straight-line distance or L2-norm, D2 D (H , H ) 2 1 2 H 1 i H 2 2 i H1 H 2 i 2 Root-mean square error H1 = (10, 0, H2 = ( 0, 0) 40, 0) Similarity = D2(H1, H2) = sqrt(100 + 1600 +0) = 41.23 G52IIP, School of Computer Science, University of Nottingham 37 Common Metric Distance measures Manhattan or city-block or L1-norm, D1 D1 ( H 1 , H 2 ) H i1 H i2 H 1 H 2 i 1 sum of absolute differences H1 = (10, 0, H2 = ( 0, 0) 40, 0) Similarity = D1(H1, H2) = (10 + 40 +0) = 50 G52IIP, School of Computer Science, University of Nottingham 38 Histogram Intersection vs City Block Distance Theorem: if H1 and H2 are colour histograms and the total count in each is N (there are N-pixels in an image) then: N H 1 H 2 0.5 H 1 H 2 1 (Histogram Intersection inversely proportional to a metric distance!) Proof (1) H 1 H 2 min( H i1 , H i2 ) (by definition) i (2) min( a, b) max( a, b) a b 1 G52IIP, School of Computer Science, University of Nottingham 39 Histogram Intersection vs City Block Distance (3) max( a, b) a b min( a, b) Substituting (2) and (3) in (1) 1 2 1 2 1 2 min( H , H ) max( H , H ) H H i i i i i i (4) i i H i1 H i2 min( H i1 , H i2 ) H i1 H i2 i i i i 2 min( H i , H i ) N N H 1 H 2 1 (5) 1 1 2 i N H 1 H 2 0.5 H 1 H 2 1 1 G52IIP, School of Computer Science, University of Nottingham 40 Colour Histogram Database Build (1)Histogram (2) Histogram Database G52IIP, School of Computer Science, University of Nottingham 41 How well does Color histogram intersection work ? 66 test histograms in the database Swain Original Test: 31 query images Recognition rate almost 100% Indeed, because color indexing worked so well it is at he heart of almost all image database systems G52IIP, School of Computer Science, University of Nottingham 42 Google Image Search G52IIP, School of Computer Science, University of Nottingham 43 Google Image Search After clicking this colour patch G52IIP, School of Computer Science, University of Nottingham 44 Problems with color histogram matching 1. Color Quantization problem: Colour Distribution = (0,40,0,0,0,0,0,110,0) 0 intersection! Colour Distribution = Because, the two images have slightly different color distributions their histograms have nothing in common! (0,0,40,0,0,0,0,0,110) Sources of quantization error: noise, illumination, camera G52IIP, School of Computer Science, University of Nottingham 45 Problems with color histogram matching 2. The resolution of a color histogram Colour Distribution = (0,40,0,0,0 … ,0,0,110,0) For the best results, Swain quantized colour space into 4096 distinct colours => Each colour distribution is a 4096-dimensional vector. => Histogram intersection costs O(4096) operations (some constant * 4096) 4096 comparisons per database histogram => histogram intersection will be very slow for large databases Many newer methods work well using 8 - 64 D features G52IIP, School of Computer Science, University of Nottingham 46 Problems with color histogram matching 3. The colour of the light Under a yellowish light all image colours are more yellow than they ought to be G52IIP, School of Computer Science, University of Nottingham 47 Problems with color histogram matching 4. The structure of colour distribution All four images have the same color distribution - need to take into account spatial relationships! G52IIP, School of Computer Science, University of Nottingham 48 Problem solution => Use statistical moments 1st order statistics = Mean R , G , B 2nd order statistics = R , G , B , RG , RB , GB 2 Variance/ Covariance R2 1 / N ( Ri R ) 2 i 2 2 RG 1 / N ( Ri R )(Gi G ) i G52IIP, School of Computer Science, University of Nottingham 49 Statistical similarity Colour Distribution = (50,50,50) Compare mean RGBs (In general compare all statistical measures) Colour Distribution = Statistical similarity = (20,70,40) [ R ( I1 ) R ( I 2 )]2 [G ( I1 ) G ( I 2 )]2 ... [ R2 G2 ]2 .... (Euclidean distance between corresponding statistical measures) G52IIP, School of Computer Science, University of Nottingham 50 Histogram vs Statistical Similarity Completeness of representation #params/ Match speed Sensitivity to Quantization error histogram complete Many/slow sensitive Low order stats incomplete Few/fast insensitive Low and high order stats complete (or over complete) Many/moderate sensitive G52IIP, School of Computer Science, University of Nottingham 51 Advanced Topics Fast Indexing Interactive/Relevant Feedback Reducing the Semantic Gap Visualization, Navigation, Browsing Internet scale image/video retrieval Flickr – billions of photos Youtube – billions of videos … G52IIP, School of Computer Science, University of Nottingham 52