
Communication-Efficient Privacy-Preserving Clustering
... participating organizations to reveal their individual data items to each other. This paper makes several contributions. First, we present a simple, deterministic, I/O-efficient k-clustering algorithm that was designed with the goal of enabling an efficient privacy-preserving version of the algorithm ...
Spatio-Temporal Clustering: A Survey
... clusters, rather than computing them from scratch. Geo-referenced time series. In a more sophisticated situation, it might be possible to store the whole history of the evolving object, thereby providing a (geo-referenced) time series for the measured variables. When several variables are available ...
Clustering Large-Scale Data Based on Modified Affinity Propagation
... optimal value of ‘preference’ must be set. The bisection method suggests using AP itself to find a suitable preference for a specified cluster number [20]. Finding the parameters this way is very time consuming, because each change to any parameter requires re-running the algorithm. KAP [20] was dev ...
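To illustrate the bisection idea from this excerpt, here is a minimal sketch using scikit-learn's AffinityPropagation, which exposes the preference parameter directly; the helper name preference_for_k and the search bounds are illustrative assumptions, not taken from [20]:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def preference_for_k(X, k, lo=-1000.0, hi=0.0, iters=20):
    """Bisect the AP 'preference' until about k clusters emerge.

    More negative preferences give fewer clusters, so the cluster count
    is monotone enough in practice for bisection to work. Each probe
    re-runs AP from scratch, which is exactly why this parameter search
    is so time consuming.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        ap = AffinityPropagation(preference=mid, random_state=0).fit(X)
        n = len(ap.cluster_centers_indices_)  # clusters found at this preference
        if n == k:
            return mid
        if n < k:
            lo = mid  # too few clusters: raise the preference
        else:
            hi = mid  # too many clusters: lower the preference
    return (lo + hi) / 2.0  # best approximation after the iteration budget

# Two well-separated blobs, so roughly k = 2 should be recoverable.
X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 5])
print(preference_for_k(X, k=2))
```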
Ranking Interesting Subspaces for Clustering High Dimensional Data
... dimensionality reduction does not even yield the desired results (e.g., [1] presents an example where PCA does not reduce the dimensionality). In addition, using dimensionality reduction techniques, the data is clustered only in a particular subspace. The information of objects clustered differently ...
A MapReduce-Based k-Nearest Neighbor Approach for Big Data
... The k-NN algorithm is a non-parametric method that can be used for either classification or regression tasks. This section defines the k-NN problem, its current trends, and its drawbacks in managing big data. A formal notation for the k-NN algorithm is the following: Let TR be a training dataset and T ...
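As a concrete reading of this notation, a minimal k-NN classifier over a training set TR might look as follows (the Euclidean metric, the majority vote, and the name knn_predict are assumptions for illustration):

```python
import numpy as np

def knn_predict(TR_X, TR_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    TR_X: (n, d) training features; TR_y: (n,) training labels.
    Euclidean distance is assumed for illustration.
    """
    dists = np.linalg.norm(TR_X - query, axis=1)  # distance to every training point
    neighbors = np.argsort(dists)[:k]             # indices of the k closest points
    labels, counts = np.unique(TR_y[neighbors], return_counts=True)
    return labels[np.argmax(counts)]              # majority class among neighbors

TR_X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
TR_y = np.array([0, 0, 0, 1, 1])
print(knn_predict(TR_X, TR_y, np.array([1.5])))   # -> 0
```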
IOSR Journal of Computer Engineering (IOSR-JCE)
... In lazy learning [5], a representative lazy learner is the k-nearest neighbor (kNN) classifier. It is a non-parametric, instance-based learning method: the training data stream is simply stored in memory, and the inductive process is deferred until a query is given. Lazy learning methods incurred ...
4C (Computing Clusters of Correlation Connected Objects)
... can easily be transformed into the problem of finding shifting coherent δ-clusters by applying a logarithmic function to each object. Thus they focus on finding shifting coherent δ-clusters and introduce the metric of residue to measure the coherency among objects of a given cluster. An advantage is that ...
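The transformation mentioned here rests on the fact that the logarithm turns a multiplicative (scaling) pattern into an additive (shifting) one: if b = c · a entrywise, then log b = log a + log c, a constant shift. A tiny numeric check:

```python
import numpy as np

a = np.array([1.0, 2.0, 4.0])
b = 3.0 * a                   # b is a scaling-coherent copy of a
print(np.log(b) - np.log(a))  # constant vector [log 3, log 3, log 3]
```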
HD-Eye: Visual Mining of High-Dimensional Data
... 9. S. Eick and G.J. Wills, “Navigating Large Networks with Hierarchies,” Proc. Visualization 1993, IEEE CS Press, Los Alamitos, Calif., ...
Unsupervised Clustering Methods for Identifying Rare Events in
... system is being used in a different manner [6]. The second is called a misuse intrusion detection system, which collects attack signatures, compares observed behavior with these signatures, and signals an intrusion when there is a match. It is often impossible to analyze the vast amount of data as a whole, ...
Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering, using an amount of memory that is linear in the number of points to be clustered and an amount of time linear in the number of distinct distances between pairs of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest neighbor pairs without taking advantage of nearest neighbor chains.
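To make the chain-following idea concrete, the following is a minimal Python sketch; the function and variable names are ours, and it recomputes cluster distances directly for clarity rather than keeping the careful constant-memory bookkeeping the algorithm is known for:

```python
import numpy as np

def nn_chain(points, linkage_dist):
    """Agglomerative clustering via nearest-neighbor chains.

    linkage_dist(a, b) returns the dissimilarity between two clusters,
    each given as a list of point indices; it must be a reducible linkage
    (e.g. single, complete, average, Ward) for mutual-nearest-neighbor
    merges to reproduce the standard greedy hierarchy.
    Returns the merges as (cluster_id, cluster_id) pairs.
    """
    active = {i: [i] for i in range(len(points))}  # cluster id -> member indices
    next_id, merges, chain = len(points), [], []

    while len(active) > 1:
        if not chain:
            chain.append(next(iter(active)))       # start a new chain anywhere
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None
        # Nearest active neighbor of `top`, preferring `prev` on ties so
        # that mutual nearest-neighbor pairs are always detected.
        nearest, best = None, float("inf")
        for c in active:
            if c == top:
                continue
            d = linkage_dist(active[top], active[c])
            if d < best or (d == best and c == prev):
                nearest, best = c, d
        if nearest == prev:                        # mutual nearest neighbors
            chain.pop(); chain.pop()
            merges.append((prev, top))
            active[next_id] = active.pop(prev) + active.pop(top)
            next_id += 1                           # merged cluster joins the pool
        else:
            chain.append(nearest)                  # follow the chain further

    return merges

# Example with complete linkage (maximum pairwise point distance).
pts = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [9, 9]], dtype=float)
complete = lambda a, b: max(np.linalg.norm(pts[i] - pts[j]) for i in a for j in b)
print(nn_chain(pts, complete))
```

Whenever the top of the chain and the element beneath it are each other's nearest neighbors, they are merged immediately; reducibility guarantees the rest of the chain remains a valid nearest-neighbor path, so it never has to be rebuilt from scratch.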