
Clustering distributed sensor data streams using local
... In our approach, we apply incremental discretization at each sensor univariate stream Xi using the Partition Incremental Discretization (PiD) algorithm [14], which consists of two layers. The first layer simplifies and summarizes the data, while the second layer constructs the final grid. Within the ...
Discovering Regular Groups of Mobile Objects
... events when the bounding rectangles reach some boundary and by using knowledge about collisions between the MMCs (splitting or merging MMCs when those events occur). In experiments conducted on synthetic data with the K-Means as the generic algorithm used in micro-clustering, MMCs showed improvement ...
07 - Emory Math/CS Department
... Start with a tree that consists of any point. In successive steps, look for the closest pair of points (p, q) such that one point (p) is in the current tree but the other (q) is not. Add q to the tree and put an edge between p and q ...
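The tree-growing steps in this excerpt match the procedure commonly known as Prim's algorithm for a minimum spanning tree. A minimal sketch follows, assuming points are coordinate tuples compared by Euclidean distance; the function name `prim_mst` and the choice of point 0 as the starting point are illustrative assumptions, not taken from the excerpt.

```python
from math import dist


def prim_mst(points):
    """Grow a spanning tree by repeatedly attaching the closest outside point.

    Returns a list of edges (p, q, distance), where p was already in the
    tree and q is the point that was added.
    """
    n = len(points)
    if n == 0:
        return []

    in_tree = [False] * n
    in_tree[0] = True  # assumption: start from the first point ("any point")

    # For every point, the closest tree point seen so far and its distance.
    best_dist = [dist(points[0], p) for p in points]
    best_from = [0] * n

    edges = []
    for _ in range(n - 1):
        # Closest pair (p, q) with p in the tree and q outside it.
        q = min((i for i in range(n) if not in_tree[i]),
                key=best_dist.__getitem__)
        edges.append((best_from[q], q, best_dist[q]))
        in_tree[q] = True

        # The new tree point q may be closer to some outside points.
        for i in range(n):
            if not in_tree[i]:
                d = dist(points[q], points[i])
                if d < best_dist[i]:
                    best_dist[i] = d
                    best_from[i] = q
    return edges


if __name__ == "__main__":
    for p, q, d in prim_mst([(0, 0), (1, 0), (3, 0), (3, 4)]):
        print(p, q, d)
```

Keeping one "closest tree point" entry per outside point avoids rescanning all pairs at every step, so this sketch runs in time quadratic in the number of points.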
this PDF file - SEER-UFMG
... Focusing. In real situations, textual data are collected and stored, usually including almost all kinds of information about the problem domain. However, many applications are usually related to only a few aspects of the problem domain. Therefore, it is naturally more efficient to select and focus o ...
Data Mining Cluster Analysis: Basic Concepts and Algorithms
... – Finds clusters that minimize or maximize an objective function. – Enumerate all possible ways of dividing the points into clusters and evaluate the `goodness' of each potential set of clusters by using the given objective function. (NP Hard) ...
Fast Distance Metric Based Data Mining Techniques Using P
... c) d(X, Y) satisfies triangle inequality: The distance between two points can never be more than the sum of their distances from some third point. That is, for any three points X, Y and Z, d(X, Y) + d(Y, Z) ≥ d(X, Z) ...
Possible Topics - NDSU Computer Science
... to [email protected].. It must be a research topic (one that makes some new contribution to the body of knowledge, as opposed to simply an exposition of what has already been developed by others). Only one person per topic (first come, first served; email your request to me). Please check the schedule befor ...
Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering, using an amount of memory that is linear in the number of points to be clustered and an amount of time linear in the number of distinct distances between pairs of points (that is, quadratic in the number of points). The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest-neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest-neighbor pairs without taking advantage of nearest-neighbor chains.
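The chain-following idea can be sketched as below. This is an illustrative sketch under stated assumptions, not the 1982 implementation: it assumes complete linkage (one of the linkages for which chain-following is valid), Euclidean distance, and points given as coordinate tuples. The names `nn_chain` and `cluster_distance` are hypothetical. For brevity it recomputes cluster distances from the raw points, so it does not achieve the time bound quoted above; an efficient version would update distances incrementally.

```python
from math import dist


def cluster_distance(a, b, points):
    # Complete linkage (an assumption of this sketch): the largest distance
    # between a member of cluster a and a member of cluster b.
    return max(dist(points[i], points[j]) for i in a for j in b)


def nn_chain(points):
    """Agglomerative clustering by following nearest-neighbor chains.

    Returns merges as (cluster_id_a, cluster_id_b, distance). Ids 0..n-1 are
    the input points; each merge creates the next unused id.
    """
    clusters = {i: [i] for i in range(len(points))}
    next_id = len(points)
    merges = []
    chain = []  # stack of cluster ids; each entry's nearest neighbor follows it

    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))  # start from any active cluster

        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None

            # Nearest neighbor of the top cluster; ties go to the previous
            # chain element so the chain can never cycle.
            best, best_d = None, float("inf")
            for cid, members in clusters.items():
                if cid == top:
                    continue
                d = cluster_distance(clusters[top], members, points)
                if d < best_d or (d == best_d and cid == prev):
                    best, best_d = cid, d

            if best == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(best)

        # Merge the mutual nearest neighbors; the rest of the chain stays valid.
        chain.pop()
        chain.pop()
        clusters[next_id] = clusters.pop(prev) + clusters.pop(top)
        merges.append((prev, top, best_d))
        next_id += 1

    return merges


if __name__ == "__main__":
    for a, b, d in nn_chain([(0, 0), (0, 1), (5, 0), (5, 1.5)]):
        print(a, b, round(d, 3))
```

Merges are reported in the order the chain discovers them, which is not necessarily increasing distance; sorting the result by distance gives the order a greedy closest-pair procedure would produce when all distances are distinct.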