
MIS2502: Jing Gong
... using Pivot table is not data mining • Sum, average, min, max, time trend… ...
... speaking, there are two ways to analyze the data: central analysis and distributed analysis. A typical example of central analysis is DIDS, developed by the Division of Computer Science at the University of California, Davis, in the 1990s [2]. It was the first intrusion detection system that agg ...
Cluster Analysis Research Design model, problems, issues
... Structure of the database: Real-life data may not always contain clearly identifiable clusters. Also, the order in which the tuples are arranged may affect the results when an algorithm is executed if the distance measure used is not perfect. With structureless data (e.g., having lots of missing val ...
An Entropy-Based Subspace Clustering Algorithm for - Inf
... In subspace clustering, objects are grouped into clusters according to subsets of dimensions (or attributes) of a data set [9]. These approaches involve two main tasks: identification of the subsets of dimensions where clusters can be found, and discovery of the clusters from different subsets of dim ...
A Highly-usable Projected Clustering Algorithm for Gene Expression
... quality. However, the traditional functions used in evaluating cluster quality may not be applicable in the projected case. For example, if the average within-cluster distance to centroid is used within the selected subspace, then the fewer attributes are selected, the better the evaluation score will be r ...
Ensembles of Partitions via Data Resampling
... uncertainty from a set of different k-means partitions. The key idea of this approach is to integrate multiple partitions produced by clustering of pseudo-samples of a data set. Two issues, specific to the clustering combination, must be addressed: 1) The generative mechanism for individual partitio ...
Improved Multi Threshold Birch Clustering Algorithm
... parameter is larger than the optimal value, then the number of points put into sets is increased, which requires a continuously increasing extra cost while the leaf nodes representing sets are being clustered [21]. Also, beginning with a good initial value for the threshold would save about 10% of the time [9]. ...
Efficient Mining of web log for improving the website using Density
... depends upon the websites. There are two types of logs: 1. server logs and 2. client logs. The server log records all the activities on the server. The client log is not used much. The server log contains the following information: IP address, session, port, date, and time. By using the IP address, eac ...
Mining Quantitative Association Rules on Overlapped Intervals
... Clustering can be considered the most important unsupervised learning technique, which deals with finding a structure in a collection of unlabeled data. A cluster is therefore a collection of objects which are “similar” to each other and are “dissimilar” to the objects belonging to other clusters [8 ...
Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering, using an amount of memory that is linear in the number of points to be clustered and an amount of time linear in the number of distinct distances between pairs of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest neighbor pairs without taking advantage of nearest neighbor chains.
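To make the chain-following idea concrete, here is a minimal Python sketch of the merging loop described above. It is an illustration under stated assumptions, not the 1982 Benzécri/Juan implementation: it uses complete linkage (a reducible dissimilarity, so merging mutual nearest neighbors yields the same hierarchy as ordinary greedy agglomeration), recomputes cluster dissimilarities from the raw points at each step rather than maintaining them incrementally, and the names `complete_linkage` and `nn_chain_clustering` are invented for this example.

```python
import numpy as np

def complete_linkage(points, a, b):
    """Complete-linkage dissimilarity: the maximum pairwise Euclidean
    distance between members of clusters a and b (tuples of point indices)."""
    pa, pb = points[list(a)], points[list(b)]
    return np.max(np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1))

def nn_chain_clustering(points):
    """Agglomerative clustering via nearest-neighbor chains: follow
    nearest-neighbor links from an arbitrary cluster until the chain ends in
    a pair of mutual nearest neighbors, then merge that pair.  Returns the
    merges as a list of (cluster_a, cluster_b) tuples of point indices."""
    active = [(i,) for i in range(len(points))]   # every point starts as its own cluster
    chain = []                                    # the current nearest-neighbor chain (a stack)
    merges = []

    while len(active) > 1:
        if not chain:
            chain.append(active[0])               # start a new chain from any active cluster
        top = chain[-1]
        # nearest active neighbor of the cluster at the top of the chain
        # (assumes no exact ties in dissimilarities, as in the textbook version)
        nearest = min((c for c in active if c != top),
                      key=lambda c: complete_linkage(points, top, c))
        if len(chain) >= 2 and nearest == chain[-2]:
            # top and nearest are mutual nearest neighbors: pop and merge them;
            # the rest of the chain remains valid because the linkage is reducible
            chain.pop(); chain.pop()
            active.remove(top); active.remove(nearest)
            active.append(tuple(sorted(top + nearest)))
            merges.append((top, nearest))
        else:
            chain.append(nearest)                 # extend the chain and keep following links
    return merges
```

For example, `nn_chain_clustering(np.random.rand(20, 2))` returns 19 merges; they may be discovered in a different order than a greedy agglomerative run would produce, but for a reducible linkage such as complete linkage they describe the same hierarchy. Recomputing dissimilarities from the points keeps this sketch short; the memory and time bounds quoted above rely instead on storing compact cluster summaries and updating dissimilarities as clusters merge.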