Swarm Intelligence Algorithms for Data Clustering

... maximization algorithms (Mitchell, 1997), artificial neural networks (Mao and Jain, 1995, Pal et al., 1993, Kohonen, 1995), evolutionary computing (Falkenauer, 1998, Paterlini and Minerva, 2003) and so on. Researchers all over the globe are coming up with new algorithms, on a regular basis, to meet ...

Powerpoint - University of California, Riverside

... Critical points for classification results best first or worst last?  put non-critical points last. Numerosity Reduction can partially be the good ordering solutions. The problem is very similar to ordering problem for anytime algorithms. Leave-one-out (k=1) within training data ...

10ClusBasic

... Cluster analysis (or clustering, data segmentation, …): finding similarities between data according to the characteristics found in the data and grouping similar data objects into clusters. Unsupervised learning: no predefined classes (i.e., learning by observations vs. learning by examples: supervi ...

ON FUZZY NEIGHBORHOOD BASED CLUSTERING ALGORITHM

... clusters in industry. SOM is a neural network approach [12]. Grid-based methods are fast and they handle outliers well. The grid-based methodology can also be used as an intermediate step in many other algorithms. The most important methods for this category are STING, CLIQUE, and WaveCluster [1, 2 ...

IOSR Journal of Computer Science (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727 PP 28-33 www.iosrjournals.org

Learning Optimization for Decision Tree Classification of Non

... started with the decision tree inducer with multivariate linear splits in each node and RIGf (S|w) instead of RIG(S|w). The values of w were determined by the Nelder-Mead algorithm. The corresponding linear splits are shown in the figure as blue solid lines. The classifier had the following performa ...

classification of chronic kidney disease with most known data mining

... knowledge are proceeding rapidly. Since the mid-1990s, a lot of research has been conducted to create techniques, methods and means that support the discovery of useful information [1]. In the information age, value is created by using resources efficiently rather than through physical assets. For this ...

Data mining application to decision-making processes in university

... 3.2 Cluster Analysis Once the relevant variables and categories, either latent or manifested, have been defined for the analysis, administrative procedures start being classified, grouping them in clusters through cluster analysis, based upon the scores of the variables employed [3]. This multivaria ...

Full-Text PDF

... The purpose of using k-means is to find clusters that minimize the sum of squared distances between each cluster center and all objects in each cluster. Even though the number of clusters is small, the problem of finding an optimal k-means solution is NP-hard [2,3]. For this reason, a k-mea ...

A Compression Algorithm for Mining Frequent Itemsets

... Mining association rules in transaction datasets has been demonstrated to be useful and technically feasible in several application areas, particularly in retail sales [1, 2, 3, 4], document datasets applications [5], and also in intrusion detection [6]. Association rule mining is usually divided in ...

IOSR Journal of Computer Engineering (IOSR-JCE)

1. Introduction Data mining (DM) is an interdisciplinary field in

A Survey On Clustering Techniques For Mining Big Data

... maintain data exponential expansion. “Big data” can be defined as datasets whose size is too large for database software tools to easily capture, store and handle. We do not define big data in terms of being larger than a certain number of thousands of ...

Research on Rough Set and Decision Tree Method Application in

Density-Based Clustering over an Evolving Data Stream with Noise

... attracting a lot of research attention. Previous methods, one-pass [4, 10, 11] or evolving [1, 2, 5, 18], do not consider that the clusters in data streams could be of arbitrary shape. In particular, their results are often spherical clusters. One-pass methods typically make the assumption of the un ...

Data clustering with size constraints

... The goal of cluster analysis is to divide data objects into groups so that objects within a group are similar to one another and different from objects in other groups. Traditionally, clustering is viewed as an unsupervised learning method which groups data objects based only on the information pres ...

Optimized Association Rule Mining with Maximum Constraints using

Intelligent Application for Duplication Detection

... strings, and described a general dynamic programming method for computing edit distance. While character-based metrics work well for estimating distance between strings that differ due to typographical errors or abbreviations, they become computationally expensive and less accurate for larger string ...

International Journal on Advanced Computer Theory and

Performance Analysis of Distributed Association Rule Mining

... from data [5]. As more data is gathered, with the amount of data doubling every three years [1-2], data mining is becoming an increasingly important tool to transform this data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detect ...

Clustering Methods in High

... • The key problem: How should we learn the subspace preference of a cluster or a point? • Most approaches rely on a “locality assumption” - The subspace is usually learned from the local neighborhood of cluster representatives/cluster members in the entire feature space: Cluster-based app ...

Computational Intelligence, NTU Lectures, 2005

SOM in data mining

Towards Effective and Efficient Distributed Clustering

... sites are able to put their data into a global context. The requirement to extract knowledge out of distributed data, without a prior unification of the data, created the rather new research area of Distributed Knowledge Discovery in Databases (DKDD). In this paper, we will present an approach where ...


K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. It aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions, in that both algorithms use an iterative refinement approach. Both also use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as the nearest centroid classifier or Rocchio algorithm.
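The iterative refinement described above can be illustrated with a minimal sketch in plain Python. It assumes the common heuristic usually called Lloyd's algorithm (alternate between assigning each observation to its nearest center and recomputing each center as the mean of its members). The function names `kmeans` and `nearest_center`, the random initialization from the data, and the `max_iter` and `seed` parameters are illustrative choices, not something taken from a particular library.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(point, centers):
    """Index of the closest center: a 1-nearest-neighbor lookup on the centers."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def kmeans(points, k, max_iter=100, seed=0):
    """Heuristic k-means (Lloyd-style); converges to a local optimum only."""
    rng = random.Random(seed)
    # Assumption: initialize centers with k distinct observations chosen at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        labels = [nearest_center(p, centers) for p in points]
        # Update step: each center becomes the mean of its assigned observations.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # No center moved, so the assignment is stable.
        centers = new_centers
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels
```

Calling `kmeans(points, k)` returns the cluster centers and one label per observation. Passing a new point to `nearest_center(point, centers)` then classifies it into an existing cluster, which is the nearest centroid idea mentioned in the text. Because the result depends on the initial centers, practical use often repeats the run with several seeds and keeps the best one.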
