
Large-Scale Dataset Incremental Association Rules Mining Model
... and deposited in the table called FList. Then use a grouping scheme to divide the project(FList) into Q group, and save each group project in table called GList. Optimization grouping scheme must make each packet to keep a balanced load. Load balancing is to make the time of consumption to be close ...
A feature group weighting method for subspace clustering of high
... representing one set of particular measurements on the nucleated blood cells. In a banking customer data set, features can be divided into a demographic group representing demographic information of customers, an account group showing the information about customer accounts, and the spending group d ...
An Evolutionary Algorithm for Mining Association Rules Using
... and ultimately understandable information in large databases [1]. For several years, a wide range of applications in various domains have benefited from KDD techniques and many works has been conducted on this topic. The problem of mining frequent itemsets arose first as a sub-problem of mining asso ...
Automate the Process of Image Recognizing a Scatter Plot: An Application of Non-parametric Cluster Analysis in Capturing Data from Graphical Output
... A cluster is a group of objects, which are more similar to each other than to those in other group. Cluster analysis is a number of statistical algorithms and methods for grouping multiple objects into clusters according to their similarity. It aims at sorting different objects into groups in a way ...
Fast and Provably Good Seedings for k
... is not an issue in practice: (1) The preprocessing step only requires a single pass through the data compared to k passes for the seeding of k-means++. (2) It is easily parallelized. (3) Given random access to the data, the proposal distribution can be calculated online when saving or copying the da ...
An Association Rule Mining Algorithm Based on a Boolean Matrix
... Data mining is the key step in the knowledge discovery process, and association rule mining is a very important research topic in the data mining field (Agrawal, Imielinski, & Swami, 1993). The original problem addressed by association rule mining was to find a correlation among sales of different p ...
A Simple Dimensionality Reduction Technique for Fast Similarity
... truncation of positive terms the distance in the transformed space is guaranteed to underestimate the true distance. This property is exploited by mapping the query into the same 2k space and examining the nearest neighbors. The theorem guarantees underestimation of distance, so it is possible that ...
Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method for performing several types of agglomerative hierarchical clustering. It uses an amount of memory that is linear in the number of points to be clustered and an amount of time that is linear in the number of distinct distances between pairs of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest-neighbor graph of the clusters until each path terminates in a pair of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings from mutual nearest-neighbor pairs without taking advantage of nearest-neighbor chains.
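
The chain-following idea can be illustrated with a minimal Python sketch. It is not the original 1982 implementation. The names `nn_chain` and `complete_linkage` are hypothetical, and complete linkage is chosen here only as one example of a cluster distance for which merging mutual nearest neighbors is safe. For brevity the sketch recomputes cluster distances from the raw points, so it keeps the linear memory use but does not achieve the running time stated above.

```python
import math


def complete_linkage(a, b):
    """Distance between two clusters: the largest pairwise point distance."""
    return max(math.dist(p, q) for p in a for q in b)


def nn_chain(points, linkage=complete_linkage):
    """Return a list of merges (cluster_id_a, cluster_id_b, distance).

    Assumption: `linkage` is a distance for which merging two clusters never
    brings the result closer to a third cluster than both parts were.
    """
    clusters = {i: [p] for i, p in enumerate(points)}  # id -> member points
    next_id = len(points)
    merges = []
    chain = []  # stack of cluster ids; each entry is the nearest neighbor of the one below
    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))  # start a new chain anywhere
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None
        # Find the nearest neighbor of the top cluster, preferring the
        # predecessor on ties so that the chain cannot cycle.
        best, best_d = None, math.inf
        for cid, members in clusters.items():
            if cid == top:
                continue
            d = linkage(clusters[top], members)
            if d < best_d or (d == best_d and cid == prev):
                best, best_d = cid, d
        if best == prev:
            # top and prev are mutual nearest neighbors: merge them.
            chain.pop()
            chain.pop()
            clusters[next_id] = clusters.pop(top) + clusters.pop(prev)
            merges.append((prev, top, best_d))
            next_id += 1
        else:
            chain.append(best)  # extend the chain and keep following it
    return merges


if __name__ == "__main__":
    for a, b, d in nn_chain([(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]):
        print(f"merge {a} and {b} at distance {d}")
```

The merges are reported in the order in which the chain discovers them, which may differ from the order a greedy method would produce. Sorting them by distance recovers the usual bottom-up ordering.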