Clustering

Readings: Chapters 8.4 and 9 from Principles of Data Mining by Hand, Mannila, Smyth.
-----------------------------------------------------------------------------------------------------------------

1. Clustering versus Classification

classification: assign a pre-determined label to a sample
clustering: derive the relevant labels for classification from the structure in a given dataset
clustering: maximal intra-cluster similarity and maximal inter-cluster dissimilarity

Objectives:
1. segmentation of space
2. find natural subclasses

[Figure: analysis of an actual image - cumulative intensity difference (blockfit = 95%), difference with template in % versus distance to print border in mm; the L1L2-plane with six clusters, cluster centres C1...C6 (blue dots), observed point (green blob), active cluster C1 / L-region #2 (red); pre-clustering probability histogram.]

[Fig. 5.15. a (left): clustering of 10,000 points in the (x,y)-plane, disks indicate prototypes. b (right): four types (I-IV) of true state evolutions for the true state function ul(t); mean d = 0.278, noise = 0.300.]

Cluster Analysis [book section 9.3]

Segmentation and Partitioning

Extra: Voronoi diagrams, see for example:
http://www.ics.uci.edu/~eppstein/gina/voronoi.html

Voronoi partitioning:
* Given a set of points x[k]
* Voronoi partitioning: the part V[k] of the space consisting of the points that lie closer to x[k] than to any other point x[l]
(a nearest-site code sketch follows the Hierarchical Clustering notes below)

Related constructions: convex hull, Delaunay triangulation.

Partitional Clustering [book section 9.4]

Score functions:
- centroid
- intra-cluster distance
- inter-cluster distance

C-means [book page 303]

    while changes in the clusters Ck
       % form clusters
       for k = 1,...,K do
          Ck = { x : ||x - rk|| < ||x - rl|| for all l ≠ k }
       end
       % compute new cluster centroids
       for k = 1,...,K do
          rk = mean({ x : x ∈ Ck })
       end
    end

Extra: Fuzzy C-means

μjk is the membership of sample j in cluster k
ck is the centroid of cluster k

    while changes in the memberships μjk
       % compute new memberships
       for k = 1,...,K do
          for j = 1,...,N do
             μjk = f(xj - ck)
          end
       end
       % compute new cluster centroids (weighted means)
       for k = 1,...,K do
          ck = Σj μjk xj / Σj μjk
       end
    end

(runnable NumPy sketches of the C-means and fuzzy C-means loops follow the Hierarchical Clustering notes below)

[Figure: trajectories of fuzzy multivariate centroids (clusters 1-5) converging in the unit square.]

Hierarchical Clustering [book section 9.5]

One major problem with partitional clustering is that the number of clusters (= number of classes) must be pre-specified!
This poses the question: what IS the real number of clusters in a given set of data? Answer: it depends!

Agglomerative methods: bottom-up
Divisive methods: top-down
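Extra: deciding which Voronoi cell V[k] a query point falls in is just a nearest-site query. A minimal NumPy sketch, assuming Euclidean distance (the function name voronoi_cell and the example sites are illustrative, not from the book):

    import numpy as np

    def voronoi_cell(x, sites):
        # Index k of the Voronoi cell V[k] containing x,
        # i.e. of the site x[k] closest to x in Euclidean distance.
        sites = np.asarray(sites, dtype=float)
        return int(np.linalg.norm(sites - x, axis=1).argmin())

    # Example: three sites in the plane; the query point lands in cell 2.
    sites = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
    print(voronoi_cell(np.array([2.1, 2.5]), sites))   # -> 2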
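Extra: the C-means loop of [book page 303] maps directly onto NumPy. A minimal sketch; the random initialisation from the data and the function signature are choices made here, not the book's:

    import numpy as np

    def c_means(X, K, n_iter=100, seed=0):
        # X: (N, d) data matrix; K: number of clusters.
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(seed)
        r = X[rng.choice(len(X), size=K, replace=False)]   # centroids r_k
        labels = np.full(len(X), -1)
        for _ in range(n_iter):
            # form clusters: each x joins its nearest centroid
            # (exactly the Voronoi partition of the current centroids)
            dist = np.linalg.norm(X[:, None, :] - r[None, :, :], axis=2)   # (N, K)
            new_labels = dist.argmin(axis=1)
            if np.array_equal(new_labels, labels):   # "while changes in the clusters Ck"
                break
            labels = new_labels
            # compute new cluster centroids: r_k = mean of cluster k
            for k in range(K):
                if np.any(labels == k):
                    r[k] = X[labels == k].mean(axis=0)
        return r, labels

The strict '<' in the pseudocode leaves ties between centroids unresolved; argmin breaks them by lowest index, the usual practical choice.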
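Extra: the fuzzy loop above leaves the membership function f unspecified. A minimal sketch assuming the standard Bezdek fuzzy C-means updates (inverse-distance memberships with fuzzifier m, centroids as μ^m-weighted means); the parameter m and the stopping tolerance are additions made here:

    import numpy as np

    def fuzzy_c_means(X, K, m=2.0, n_iter=100, tol=1e-5, seed=0):
        # Soft memberships u[j, k] of sample j in cluster k.
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(seed)
        c = X[rng.choice(len(X), size=K, replace=False)]   # initial centroids c_k
        u = np.zeros((len(X), K))
        for _ in range(n_iter):
            # compute new memberships from the distances to the centroids
            d = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2) + 1e-12
            u_new = d ** (-2.0 / (m - 1.0))
            u_new /= u_new.sum(axis=1, keepdims=True)
            # compute new cluster centroids as weighted means
            w = u_new ** m
            c = (w.T @ X) / w.sum(axis=0)[:, None]
            if np.max(np.abs(u_new - u)) < tol:   # "while changes in the memberships"
                u = u_new
                break
            u = u_new
        return c, u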
Probabilistic Model-Based Clustering using Mixture Models

The EM-algorithm [book section 8.4]

Suppose we have data D, a model with parameters θ, and hidden variables H (e.g. the class label!). Suppose:
1. p(D,H|θ) is the probability distribution of the data plus the hidden variables for given parameters θ
2. Q(H) is a probability distribution on the hidden variables H.

Then we can write the log-likelihood as [book pg 261]:

    l(θ) = log p(D|θ)
         = log Σ_H Q(H) p(D,H|θ) / Q(H)
         ≥ Σ_H Q(H) log ( p(D,H|θ) / Q(H) )          (Jensen's inequality)
         = Σ_H Q(H) log p(D,H|θ) + Σ_H Q(H) log (1/Q(H))
         =: F(Q,θ)

EM-algorithm:
E-step: maximize F with respect to Q with θ fixed
M-step: maximize F with respect to θ with Q(H) fixed

so:
E-step: Q(k+1) = arg max_Q F(Q, θ(k))
M-step: θ(k+1) = arg max_θ F(Q(k+1), θ)

Adaptations to the EM-algorithm [book sections 9.2.4, 9.2.5, 9.6]
Probability Mixtures [idem]
Mixture Decomposition, e.g. Gaussian Mixture Decomposition [idem]
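Extra: for Gaussian mixture decomposition the E-step computes Q(H) as the posterior responsibilities of the components, and the M-step re-estimates the weights, means, and variances in closed form. A minimal one-dimensional sketch of these standard updates (the initialisation and the fixed iteration count are choices made here, not from the book):

    import numpy as np

    def em_gmm_1d(x, K, n_iter=50, seed=0):
        # EM for a 1-D Gaussian mixture: weights pi_k, means mu_k, std devs sigma_k.
        rng = np.random.default_rng(seed)
        pi = np.full(K, 1.0 / K)
        mu = rng.choice(x, size=K, replace=False)
        sigma = np.full(K, x.std())
        for _ in range(n_iter):
            # E-step: Q(H) = posterior responsibility of component k for sample x_j
            dens = (pi / (sigma * np.sqrt(2.0 * np.pi))
                    * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))   # (N, K)
            q = dens / dens.sum(axis=1, keepdims=True)
            # M-step: maximize F over the parameters theta with Q fixed
            Nk = q.sum(axis=0)
            pi = Nk / len(x)
            mu = (q * x[:, None]).sum(axis=0) / Nk
            sigma = np.sqrt((q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
        return pi, mu, sigma

    # Example: the two components of a pooled sample are recovered.
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(5.0, 0.5, 500)])
    print(em_gmm_1d(x, K=2))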