Data Mining and Sensor Networks - School of Electrical Engineering

Outlier Mining in Large High-Dimensional Data Sets

Similarity Measures

Large-Scale Dataset Incremental Association Rules Mining Model

... and deposited in a table called FList. A grouping scheme then divides the items in FList into Q groups, and each group is saved in a table called GList. An optimized grouping scheme must keep the load of each group balanced. Load balancing aims to make the time consumed be close ...

Two faces of active learning

A Study of Density-Grid based Clustering Algorithms on Data Streams

A feature group weighting method for subspace clustering of high

... representing one set of particular measurements on the nucleated blood cells. In a banking customer data set, features can be divided into a demographic group representing demographic information of customers, an account group showing the information about customer accounts, and the spending group d ...

Incremental Clustering for Mining in a Data Warehousing

Exploring the Meaning behind Twitter Hashtags through Clustering

Efficient Data Clustering Algorithms: Improvements over Kmeans

An Evolutionary Algorithm for Mining Association Rules Using

... and ultimately understandable information in large databases [1]. For several years, a wide range of applications in various domains has benefited from KDD techniques, and much work has been conducted on this topic. The problem of mining frequent itemsets arose first as a sub-problem of mining asso ...

Big Data Clustering

Hierarchical density estimates for data clustering

Automate the Process of Image Recognizing a Scatter Plot: An Application of Non-parametric Cluster Analysis in Capturing Data from Graphical Output

... A cluster is a group of objects that are more similar to each other than to those in other groups. Cluster analysis comprises a number of statistical algorithms and methods for grouping multiple objects into clusters according to their similarity. It aims at sorting different objects into groups in a way ...

Cluster Analysis: Basic Concepts and Methods

Diffused kernel dmmi approach for theoretic clustering using data

Fast and Provably Good Seedings for k

... is not an issue in practice: (1) The preprocessing step only requires a single pass through the data compared to k passes for the seeding of k-means++. (2) It is easily parallelized. (3) Given random access to the data, the proposal distribution can be calculated online when saving or copying the da ...

An association rule mining algorithm based on a Boolean matrix

... Data mining is the key step in the knowledge discovery process, and association rule mining is a very important research topic in the data mining field (Agrawal, Imielinski, & Swami, 1993). The original problem addressed by association rule mining was to find a correlation among sales of different p ...

A Simple Dimensionality Reduction Technique for Fast Similarity

... truncation of positive terms the distance in the transformed space is guaranteed to underestimate the true distance. This property is exploited by mapping the query into the same 2k space and examining the nearest neighbors. The theorem guarantees underestimation of distance, so it is possible that ...

Mining Common Outliers for Intrusion Detection

Machine Learning in Time Series Databases (and Outline of Tutorial I

Introduction to Region Discovery

Adaptive Scaling of Cluster Boundaries for Large

Discovering Interesting Association Rules: A Multi

A Theoretic Framework of K-Means-Based Consensus Clustering

Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering, using an amount of memory that is linear in the number of points to be clustered and an amount of time linear in the number of distinct distances between pairs of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest neighbor pairs without taking advantage of nearest neighbor chains.
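The chain-following idea described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the original 1982 implementation: the function name `nn_chain`, the choice of complete linkage, and the tie-breaking rule are all assumptions made for the example. For brevity the sketch keeps a full table of pairwise cluster distances, so it uses quadratic memory rather than the linear memory the algorithm can achieve.

```python
from itertools import combinations


def nn_chain(points, dist):
    """Agglomerative clustering by following nearest-neighbor chains.

    Returns a list of merges (members_a, members_b, distance).
    Assumption: complete linkage, which is a reducible linkage.
    """
    n = len(points)
    # Active clusters, keyed by integer id, each holding its member indices.
    clusters = {i: (i,) for i in range(n)}
    # Distance between every pair of active clusters (quadratic memory).
    d = {frozenset((i, j)): dist(points[i], points[j])
         for i, j in combinations(range(n), 2)}
    next_id = n
    chain = []   # current path in the nearest-neighbor graph
    merges = []

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))  # arbitrary starting cluster
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            # Find the nearest neighbor of the cluster on top of the chain,
            # breaking ties in favor of its predecessor to avoid cycles.
            best, best_d = None, float("inf")
            for c in clusters:
                if c == top:
                    continue
                dc = d[frozenset((top, c))]
                if dc < best_d or (dc == best_d and c == prev):
                    best, best_d = c, dc
            if best == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(best)

        # Merge the mutual nearest neighbors and drop them from the chain.
        chain.pop()
        chain.pop()
        a, b = prev, top
        merges.append((clusters[a], clusters[b], best_d))
        members = clusters.pop(a) + clusters.pop(b)
        new = next_id
        next_id += 1
        # Complete-linkage update: distance to the farther of the two parts.
        for c in clusters:
            d[frozenset((new, c))] = max(d[frozenset((a, c))],
                                         d[frozenset((b, c))])
        clusters[new] = members
    return merges


# Hypothetical one-dimensional data, used only to exercise the sketch.
example_points = [0.0, 1.0, 5.0, 6.5, 20.0]
example_merges = nn_chain(example_points, lambda p, q: abs(p - q))
for members_a, members_b, distance in example_merges:
    print(members_a, members_b, distance)
```

Merges found this way are not guaranteed to come out in increasing order of distance, so a final sort of the merge list by distance is typically needed before building the dendrogram. The remainder of the chain stays valid after each merge only for reducible linkages such as complete, average, or Ward linkage.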
