Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Video Data Mining LUBANA ABO ASSAF Introduction The goal of data mining is to discover and describe interesting patterns in data. This task is especially challenging when the data consist of video sequences which may also have audio content. Video Data Mining 2 Introduction there are three types of videos: the produced, the raw, and the medical video. produced video are movies, news videos, dramas. raw video are traffic videos, surveillance videos. medical videos : Ultra sound videos including echocardiogram. Video Data Mining 3 DEFINITIONS a shot is collections of frames recorded from a single camera operation. the incoming frames are grouped into meaningful pieces in real time processing. This piece is called as segment to distinguish it from shot. Video Data Mining 4 Multimedia Data Mining Architecture Video Data Mining 5 C-BIRD: Content-Based Image Retrieval from Digital libraries Search by image colors by color percentage by texture by object model by illumination invariance by keywords Video Data Mining 6 Multi-Dimensional Search in Multimedia Databases Color layout Video Data Mining 7 Multi-Dimensional Analysis Color histogram Texture layout Video Data Mining 8 Mining Multimedia Databases Refining or combining searches Search for “airplane in blue sky” (top layout grid is blue and keyword = “airplane”) Search for “blue sky and green meadows” Search for “blue sky” (top layout grid is blue and bottom is green) (top layout grid is blue) Video Data Mining 9 Similarity Search in Multimedia Data Description-based retrieval systems Build indices and perform object retrieval based on image descriptions, such as keywords, captions, size, and time of creation Labor-intensive if performed manually Results are typically of poor quality if automated Content-based retrieval systems Support retrieval based on the image content, such as color histogram, texture, shape, objects, .. Video Data Mining 10 One Signature for the Entire Image? Similar images may contain similar regions, but a region in one image could be a translation or scaling of a matching region in the other Define regions by clustering signatures of windows of varying sizes within the image Signature of a region is the centroid of the cluster Similarity is defined in terms of the fraction of the area of the two images covered by matching pairs of regions from two images Video Data Mining 11 Multidimensional Analysis of Multimedia Data Multimedia data cube Design and construction similar to that of traditional data cubes from relational data Contain additional dimensions and measures for multimedia information, such as color, texture, and shape The database does not store images but their descriptors Feature descriptor: a set of vectors for each visual characteristic Color vector: contains the color histogram MFC (Most Frequent Color) vector: five color centroids MFO (Most Frequent Orientation) vector: five edge orientation centroids Layout descriptor: contains a color layout vector and an edge layout vector Video Data Mining 12 Mining Multimedia Databases The Data Cube and the Sub-Space Measurements By Size By Format By Format & Size RED WHITE BLUE Cross Tab JPEG GIF By Colour By Colour & Size RED WHITE BLUE Group By Colour Sum By Format Sum RED WHITE BLUE Measurement Sum Video Data Mining By Format & Colour By Colour • Format of image • Duration • Colors • Textures • Keywords • Size • Width • Height • Internet domain of image • Internet domain of parent pages • Image popularity 13 Challenge: Curse of Dimensionality Difficult to implement a data cube efficiently given a large number of dimensions, especially serious in the case of multimedia data cubes Many of these attributes are set-oriented instead of single-valued Restricting number of dimensions may lead to the modeling of an image at a rather rough, limited, and imprecise scale More research is needed to strike a balance between efficiency and power of representation Video Data Mining 14 Classification in MultiMediaMiner Video Data Mining 15 Mining Associations in Multimedia Data Associations between image content and non-image content features Associations among image contents that are not related to spatial relationships “If at least 50% of the upper part of the picture is blue, then it is likely to represent sky.” “If a picture contains two blue squares, then it is likely to contain one red circle as well.” Associations among image contents related to spatial relationships “If a red triangle is between two yellow squares, then it is likely a big oval-shaped object is underneath.” Video Data Mining 16 Mining Associations in Multimedia Data Special features: Need # of occurrences besides Boolean existence, e.g., “Two red square and one blue circle” implies theme “air-show” Need spatial relationships Blue on top of white squared object is associated with brown bottom Need multi-resolution and progressive refinement mining It is expensive to explore detailed associations among objects at high resolution It is crucial to ensure the completeness of search at multi-resolution space Video Data Mining 17 Mining Multimedia Databases Spatial Relationships from Layout property P1 on-top-of property P2 property P1 next-to property P2 Different Resolution Hierarchy Video Data Mining 18 REQUIREMENT OF VIDEO MINING It should be as unsupervised as possible. It should have as few assumptions about the data as possible. It should be computationally simple It should discover interesting events. Video Data Mining 19 What is “interesting events”? The definition of interesting is of course highly context dependent. In a sports video: highlights such as home runs, goals, three-point shots are interesting. while the general run of play is not so interesting. Video Data Mining 20 What is “interesting events”? The definition of interesting is of course highly context dependent. In a TV broadcast: The program is interesting but the commercial messages are not. Video Data Mining 21 What is “interesting events”? The definition of interesting is of course highly context dependent. In a news video: The transition points in the content that mark the semantic boundaries of the content are interesting. Video Data Mining 22 video summary video summary is a combination of the common and the uncommon events. in a soccer video, it is the uncommon events that constitute the summary. in a surveillance video, we would like to know both the common and the uncommon events to summarize the video. Video Data Mining 23 Film structure frames are the smallest unit of the video. Many frames constitute a shot. Similar shots make scenes. The complete film is the collection of several scenes presenting an idea or concept. Video Data Mining 24 Computable Features Computable features are set of attributes that can be extracted using image/signal processing and computer vision techniques. Video Data Mining 25 Computable Features Objects: Visual Objects:, Tree, Person, Hands, … Audio Objects: Music, Speech, Sound, … Scenes: Background: Building, Outdoors, Sky Relationships: The (time, spatial) relationships between objects & scenes Activities: Holding Hand in Hand, Looking for Stars Video Data Mining 26 video segmentation It is referred to as shot boundary detection. Many techniques have been developed to detect transitions from one shot to the next. These schemes differ mainly in the way that the inter-frame difference is computed. The main idea is that if the difference between the two consecutive frames is larger than a certain threshold value, then a shot boundary is considered between two corresponding frames. Video Data Mining 27 video segmentation The difference can be determined by comparing the corresponding pixels of two images. Color or grayscale histograms can also be used. Other schemes use domain knowledge such as predefined models, objects, regions, etc. Hybrids of the above techniques have also been investigated. Video Data Mining 28 Frame Difference Frame differences are computed by finding the difference between consecutive frames. the algorithm assumes a stationary background. the difference between the frames at regular intervals (k) is considered. If there are n frames, then we will get (n/k) frame differences (FD). Video Data Mining 29 Background Elimination Once the frame differences are computed the pixels that belong to the background region will have a value almost equal to zero, as the background is assumed stationary. because of camera noise, some of the pixels may not tend to zero. These values are set to zero by comparing any two frame differences. the background region is eliminated and only the moving object region will contain non-zero pixel values. Video Data Mining 30 Background Elimination After Background elimination Video Data Mining 31 Background Registration A general tracking approach is to extract salient regions from the given video clip using a learned background modeling technique. This involves subtracting every image from the background scene and thresholding the resultant difference image to determine the foreground image. Stationary pixels are identified and processed to construct the initial background registered image. Video Data Mining 32 Background Registration Vehicle is a group of pixels that move in a coherent manner, either as a lighter region over a darker background or vice versa. The vehicle may be of the same color as the background, or may be some portion of it may be camouflaged with the background, due to which tracking the object becomes difficult. This leads to an erroneous vehicle count. Video Data Mining 33 Post Processing Post processing is performed on the foreground dynamic objects to reduce the noise interference due to camera noise and irregular object motion. Order-statistics filters are used, which are the spatial filters and whose response is based on ordering the pixels contained in the image area encompassed by the filter. The response of the filter at any point is then determined by the ranking result. Median filter replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel. Video Data Mining 34 Post Processing After post processing Video Data Mining 35 Object tuning This is a post processing technique. In the current algorithm we use a median filter for noise elimination in both the object and background. As the object boundaries are not very smooth, a post processing technique is required on the foreground image. The final output of the object tuning phase is a binary image of the objects detected. Video Data Mining 36 Object Identification The image obtained after the pre-processing step has relatively less noise, so, the background area is completely eliminated. Now, if the pixel values of this image are greater than a certain threshold, then, those pixels are replaced by the pixels of the original frame. This process identifies the moving. Video Data Mining 37 Object Identification Identification of Objects Video Data Mining 38 Object counting The tracked binary image forms the input image for counting. This image is scanned from top to bottom for detecting the presence of an object. Two variables are maintained the number of objects. the information of the registered object. Video Data Mining 39 Object counting When a new object is encountered: check to see whether it is already registered, if the object is not registered then it is assumed to be a new object and count is incremented. else it is treated as a part of an already existing object and the presence of the object is neglected. A fairly good accuracy of count is achieved. Sometimes due to occlusions two objects are merged together and treated as a single entity. Video Data Mining 40 Examples Video Data Mining 41 Examples Video Data Mining 42 News videos News videos Remove commercials Remove graphics Remove anchor images but use text Video Data Mining 44 Associating text with frames Video Data Mining 45 Thank you