Analysis of user need in image archives
This paper describes a project in which user queries addressed to seven libraries managing archives of widely varying still and moving image material were analysed. The sampling procedure is described, in which queries obtained from each library were broadly categorised by image content, identification, and accessibility. Attention is focused on the image-content requests, for which a categorisation based on facet analysis is developed. The analytical tool used for this purpose is based on a schema already well established for the analysis of levels of meaning in images. The project demonstrates the possibility of formulating a general categorisation of requests that seek widely different still and moving image material. The paper concludes with observations on the potential value of embedding such a schema within the user interface of unmediated-query visual information retrieval systems.

Colour moments
1st moment: mean. 2nd moment: variance. 3rd moment: skewness (the asymmetry of the distribution). Three colour components, each with three moments, give only 9 numbers, i.e. a very compact representation.

Colour correlogram
Describes not only the colour distribution but also the spatial correlation of pairs of colours. A three-dimensional structure: the first and second dimensions are the colours of any pixel pair, while the third dimension is the spatial distance.

Tamura features
Designed in accordance with psychological studies on human perception of texture. Six components: coarseness, contrast, directionality, linelikeness, regularity, and roughness.

Normalized cuts
Organises an image by grouping objects together in a hierarchical fashion, extracting the global impression of an image rather than building up the groups from local features. Segmentation is posed as a graph-partitioning problem, using the global criterion of normalized cuts to segment the graph. The basic idea is to partition using normalized cuts instead of minimum cuts.
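The colour-moment feature above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming an H x W x 3 image array; the standardised form of the skewness (third central moment divided by the cubed standard deviation) is one common convention, not the only one:

```python
import numpy as np

def colour_moments(img):
    """img: H x W x 3 array. Returns the 9-number feature vector:
    (mean, variance, skewness) for each of the three colour channels."""
    feats = []
    for c in range(3):
        channel = img[:, :, c].astype(np.float64).ravel()
        mean = channel.mean()
        var = channel.var()
        std = np.sqrt(var)
        # Standardised third central moment; guard against flat channels.
        skew = ((channel - mean) ** 3).mean() / std**3 if std > 0 else 0.0
        feats.extend([mean, var, skew])
    return np.array(feats)
```

The 9-dimensional result can be compared between images with any vector distance, which is what makes the representation attractive despite its coarseness.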
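The colour correlogram idea can be sketched in its common simplified form, the autocorrelogram, which records only the probability that a pixel at a given distance has the *same* quantised colour (rather than the full 3-D colour-pair structure). The bin count, distance set, and restriction to horizontal offsets below are illustrative choices, not part of any fixed definition:

```python
import numpy as np

def colour_autocorrelogram(img, n_colors=8, distances=(1, 3)):
    """img: H x W array of grey/colour values in [0, 1).
    Returns an (n_colors, len(distances)) array: for each quantised
    colour c and distance d, the probability that a pixel of colour c
    has a pixel of the same colour at horizontal offset d."""
    q = np.minimum((img * n_colors).astype(int), n_colors - 1)
    out = np.zeros((n_colors, len(distances)))
    for j, d in enumerate(distances):
        a, b = q[:, :-d], q[:, d:]          # horizontally shifted pixel pairs
        for c in range(n_colors):
            n_c = np.count_nonzero(a == c)
            if n_c:
                out[c, j] = np.count_nonzero((a == c) & (b == c)) / n_c
    return out
```

A full correlogram would index all colour pairs (c1, c2, d); the autocorrelogram keeps only the diagonal c1 = c2, which is much smaller while still capturing spatial colour coherence.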
Minimum cut is a standard graph-theory problem with an efficient solution, but when used in image segmentation it favours making many small segments, whereas normalized cuts avoid this behaviour.

Principal component analysis
In statistics, principal component analysis (PCA) is a technique for simplifying a data set by reducing multidimensional data to lower dimensions for analysis. PCA is an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance under any projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.

Kullback-Leibler
The Kullback-Leibler divergence, or relative entropy, is a quantity that measures the difference between two probability distributions. It has been used as a similarity measure for texture.

Semantic enrichment
Attempts to close the semantic gap. Feature layers are built up, where features in a higher layer are based on a descriptor of features from lower layers. Together with additional information, descriptors for higher layers can then be derived (this information can be statistical, domain knowledge, etc.). This concept is applied to classes that describe properties of the human world.

Concept discovery
Semantics-intensive image retrieval. Images in the database are segmented into regions, each associated with homogeneous colour, texture, and shape features. By exploiting regional statistical information in each image and employing a vector-quantization method, a uniform and sparse region-based representation is achieved. With this representation, a probabilistic model based on statistical hidden-class assumptions about the image database is obtained, and statistical methods are used to analyse semantic concepts hidden in the database.

Auto-correlation
Autocorrelation is a mathematical tool used frequently in signal processing.
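The PCA description above translates directly into an eigen-decomposition of the sample covariance matrix. A minimal NumPy sketch (the function name and interface are illustrative):

```python
import numpy as np

def pca(X, k):
    """Project the rows of X (n samples x d features) onto the first
    k principal components, obtained as the top eigenvectors of the
    sample covariance matrix."""
    Xc = X - X.mean(axis=0)                 # centre the data
    cov = Xc.T @ Xc / (len(X) - 1)          # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # largest variance first
    components = eigvecs[:, order[:k]]
    return Xc @ components, eigvals[order[:k]]
```

The first returned column carries the greatest variance, the second the next greatest, matching the coordinate-by-coordinate description above; for feature vectors in retrieval this gives a compact, decorrelated representation.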
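The Kullback-Leibler divergence between two discrete distributions (e.g. normalised texture histograms) is short to write down; the small epsilon below is a common guard against log(0), an implementation choice rather than part of the definition:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) between two discrete distributions,
    given as (possibly unnormalised) histograms."""
    p = np.asarray(p, float) / np.sum(p)
    q = np.asarray(q, float) / np.sum(q)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Note that D(p || q) is not symmetric, so when it is used as a similarity measure it is often symmetrised, e.g. as D(p || q) + D(q || p).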
It is a measure of how well a signal matches a time-shifted version of itself, as a function of the amount of time shift. More precisely, it is the cross-correlation of a signal with itself.

Cross-correlation
Cross-correlation (or sometimes "cross-covariance") is a measure of the similarity of two signals.

Eigenvalue
An eigenvector of a transformation is defined as a vector whose direction is unchanged by the transformation. The eigenvalue is the factor by which the vector is scaled under that transformation, and, last but not least, an eigenspace is defined as a set of eigenvectors sharing a common eigenvalue. According to Harris, the eigenvalues are proportional to the curvature of the autocorrelation function.

Trace/determinant
The trace of an n-by-n square matrix A is defined as the sum of the elements on the main diagonal. The determinant is a function depending on n that associates a scalar, det(A), with every n×n square matrix A. The fundamental geometric meaning of a determinant is as the scale factor for volume when A is regarded as a linear transformation. The matrix exponential is a function on square matrices analogous to the ordinary exponential function. tr(A) = ∑ λi. From the connection between the trace and the eigenvalues, one can derive a connection between the trace function, the matrix exponential function, and the determinant: det(exp(A)) = exp(tr(A)).

Why does Moravec only consider 45-degree shifts? Because the intensity function is discrete. We therefore want to incorporate the first derivative, which we do via a Taylor expansion.

Brownian image model & Pedersen
The Brownian image model is a scale-invariant stochastic field (a field in one or more dimensions where the value at each point is determined by a local probability distribution) with spatial increments that are independent and identically Gaussian distributed with zero mean. Stochastic image model: a stochastic variable is a type of variable that describes a random experiment whose outcome is not known.
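The relationship between cross-correlation and autocorrelation above is easy to see with `np.correlate`: autocorrelation is just cross-correlation of a signal with itself, and it always peaks at zero lag, since a signal matches an unshifted copy of itself best.

```python
import numpy as np

def cross_correlation(x, y):
    """Cross-correlation of two 1-D signals at every lag
    (autocorrelation is the special case y = x)."""
    return np.correlate(x, y, mode="full")
```

For a signal of length n, mode="full" returns 2n-1 lags, with zero lag at the centre index n-1.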
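The identity det(exp(A)) = exp(tr(A)) can be checked numerically. The sketch below approximates the matrix exponential by its Taylor series, which is adequate for the small, well-scaled example matrix chosen here (the matrix values are arbitrary):

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential via the Taylor series I + A + A^2/2! + ...
    (fine for small matrices with small norm, as used here)."""
    result = np.eye(len(A))
    term = np.eye(len(A))
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

A = np.array([[0.5, 0.2],
              [0.1, 0.3]])
lhs = np.linalg.det(expm_taylor(A))
rhs = np.exp(np.trace(A))   # tr(A) = sum of eigenvalues
```

Since exp(A) has eigenvalues exp(λi), its determinant is ∏ exp(λi) = exp(∑ λi) = exp(tr(A)), which is exactly what the numerical check confirms.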
The images are outcomes of a two-dimensional Brownian motion, resulting in a Gaussian process (zero mean). A Gaussian process is a stochastic process that generates outcomes over time in such a way that any finite linear combination of the outcomes can be characterised by a normal distribution. The covariance matrix is the result of some analytical calculations that are beyond my understanding. The covariance matrix: n = n1+n2 and m = m1+m2 when these are even integers; otherwise the entry is 0.

Lowe's pyramid
DoG is used because it is a good approximation to LoG, which has given good results but is computationally heavy. The image is repeatedly convolved with a Gaussian; the result is a stack of images separated by a factor k. We want each octave to consist of s intervals, and since sigma doubles from one octave to the next, k = 2^(1/s). When going from one octave to the next, the image is downsampled by a factor of 2 (sigma doubles). The implementation I use is from Lowe's "old" paper: each octave consists of two images, image A convolved with sigma = √2, and image B, which is image A convolved one more time with sigma = √2. The image is then downsampled by a factor of 1.5 and the process "starts over". DoG images have a strong response at edges, and such points have a large curvature along the edge and a small one in the perpendicular direction; they can be examined via a 2x2 Hessian matrix, by checking whether the ratio between the two curvatures is below some constant (Lowe uses 10, I use 5). The threshold value of 3 ensures that only pixel values of 3 or above can be chosen as min/max. The radius is the pixel radius of the examined points. Point thresholding: 8-way local maxima, and a limit of 1% of the maximum value.

Steerable filters
One approach to finding the response of a filter at many orientations is to apply many versions of the same filter, each different from the others by some small rotation in angle.
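One octave of the DoG construction described under Lowe's pyramid can be sketched in pure NumPy. This follows the standard k = 2^(1/s) scheme rather than the "old paper" √2/1.5-downsampling variant the notes describe, and the base sigma of 1.6 is an illustrative choice:

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma."""
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur: convolve rows, then columns.
    Assumes the kernel is shorter than the image side."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, tmp)

def dog_octave(img, s=3, sigma0=1.6):
    """One octave of the DoG pyramid: s intervals per octave give the
    scale factor k = 2**(1/s); adjacent Gaussian images are subtracted,
    yielding s + 2 DoG images so extrema can be found in s scales."""
    k = 2 ** (1 / s)
    gaussians = [blur(img, sigma0 * k**i) for i in range(s + 3)]
    return [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]
```

In the full pyramid, the image would then be downsampled by a factor of 2 and the octave repeated; keypoints are taken as extrema across the 3x3x3 neighbourhood in scale space, with the Hessian-ratio edge test from the notes applied afterwards.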
A more efficient approach is to apply a few filters corresponding to a few angles and interpolate between the responses. One then needs to know how many filters are required and how to properly interpolate between the responses. With the correct filter set and the correct interpolation rule, it is possible to determine the response of a filter of arbitrary orientation without explicitly applying that filter. We use the term "steerable filter" to describe a class of filters in which a filter of arbitrary orientation is synthesized as a linear combination of a set of "basis filters". Such steerable filters can be constructed to analyse local orientations. We measure the orientation strength along a particular direction, θ, by the squared output of a quadrature pair of bandpass filters steered to the angle θ. We call this spectral power the "oriented energy", E(θ).

Descriptor
Normalised: cancels any possible contrast change (a contrast change scales the gradients by the same constant factor that multiplies the pixel values under the new contrast). Non-linear brightness changes: might corrupt the gradient magnitudes. We try to avoid this by thresholding at 0.2, lessening the impact of large gradient magnitudes, which may result from camera saturation or from illumination changes that affect 3D surfaces. Linear brightness change: doesn't matter, since we are working with pixel differences, and a linear brightness change adds some constant to each pixel value.

R*-tree, K-d-B tree, SOM
K-d-B: splits are made so that the data points are divided as evenly as possible between the partitions, while also minimising the number of splits. R*: like the R-tree in that it allows overlapping minimum bounding regions; it differs from the R-tree in how the MBR is computed. R* also uses "forced reinserts", which try to prevent splits.

Hierarchical and model-based clustering
Hierarchical clustering builds up (agglomerative), or breaks up (divisive), a hierarchy of clusters.
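The descriptor normalisation steps above (normalise to cancel contrast changes, clamp at 0.2 to limit non-linear brightness effects, renormalise) can be sketched as a small function; the 128-entry vector in the usage test is illustrative of a SIFT-sized descriptor:

```python
import numpy as np

def normalise_descriptor(v, clamp=0.2):
    """Normalise to unit length (cancels linear contrast changes),
    clamp large entries at 0.2 (limits the influence of non-linear
    brightness effects such as camera saturation), then renormalise."""
    v = np.asarray(v, float)
    n = np.linalg.norm(v)
    if n > 0:
        v = v / n
    v = np.minimum(v, clamp)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

Because the first normalisation divides out any constant factor, the result is identical for v and c·v, which is exactly the contrast invariance the notes describe.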
The traditional representation of this hierarchy is a tree (called a dendrogram). Clustering algorithms based on probability models offer a principled alternative to heuristic algorithms. In particular, model-based clustering assumes that the data is generated by a finite mixture of underlying probability distributions such as multivariate normal distributions.
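The agglomerative side of hierarchical clustering can be sketched with a naive single-linkage merge loop (the linkage choice and the O(n^3)-ish implementation are illustrative; real systems use optimised linkage algorithms, and model-based clustering would instead fit a finite mixture, e.g. of Gaussians):

```python
import numpy as np

def single_linkage(points, n_clusters):
    """Naive agglomerative clustering: start with every point as its
    own cluster and repeatedly merge the pair of clusters whose
    closest members are nearest (single linkage)."""
    clusters = [[i] for i in range(len(points))]
    # Pairwise distances between all points, computed once.
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    while len(clusters) > n_clusters:
        best, best_d = (0, 1), np.inf
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = min(d[i, j] for i in clusters[a] for j in clusters[b])
                if dist < best_d:
                    best_d, best = dist, (a, b)
        a, b = best
        clusters[a] += clusters.pop(b)
    return clusters
```

Recording each merge (instead of stopping at n_clusters) yields exactly the dendrogram structure mentioned above.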