Incremental learning in data stream analysis - ICAR-CNR

... • Browser clicks, user queries, link rating,… Beijing, ...

Clustering of Time Series Subsequences is Meaningless

... series, or a single time series, from which individual time series are extracted with a sliding window. Given the recent explosion of interest in streaming data and online algorithms, the latter case has received much attention. In this work we make a surprising claim. Clustering of streaming time s ...

IDEA: Integrative Detection of Early-stage Alzheimer's disease

... its radius visualizes its IG, which is an additional criterion for an optimal subset configuration. The smaller the distance between two objects is, the higher is the amount of redundant information. Therefore, an optimal subset of measures consists of attributes with large radius and high distance ...

CHAPTER-18 Classification by Back propagation 18.1 Introduction

Discovery of Spatio-Temporal Patterns from Location

... Different geographical discretizations can be proposed to allow the extraction of patterns at different resolutions. One possibility is to divide the area using a regular grid. This would also allow the events to be studied at different levels of granularity. Controlling the size and shape of the cells it ...

The Use of Data Mining Methods to Predict the Result of Infertility

IOSR Journal of Computer Engineering (IOSR-JCE)

... take the related clustering actions in time by identifying concept drift quickly and online. Hence the framework achieves improved clustering speed at an acceptable loss of accuracy. G. R. Marrs et al. [5] present two new algorithms that use a time of classification protocol for handl ...

Pareto Density Estimation: A Density Estimation for Knowledge

... based upon one or more of the following techniques: finite mixture models, variable kernel estimates, uniform kernel estimates. Finite mixture models attempt to find a superposition of parameterized functions, typically Gaussians which best account for the sample data. The method can in principle mo ...

4 Genetic Programming in Data Mining

Clustering Association Rules

Data Mining

... – In conjunction with existing classification algorithms-by finding near optimal solution the GA can narrow the search space of possible solutions to which the traditional system is then applied, the resultant hybrid approach presenting a more efficient solution to problems in large domains. – GAs h ...

HD-Eye: Visual Mining of High-Dimensional Data

... breakdown in efficiency (which proves true for all index-based methods) or have notable effectiveness problems (which is basically true for all other methods). Our idea, presented in this article, is to combine an advanced clustering algorithm with new visualization methods for a more effective inter ...

Determining the number of clusters using information entropy for

Pre-Processing Structured Data for Standard Machine Learning

... examples represented as graphs and in some way convert these into fixed-length feature vectors. Several approaches for generating classification models through propositionalization have been proposed in the past [3,4,5,6,7]. The propositionalization methods are usually embedded in discovery, predict ...

Pattern Extracting Engine using Genetic Algorithms

MiningPetroglyphs_KDD`09 - University of California, Riverside

... Providing a rich source of information: ...

MultiClust 2013: Multiple Clusterings, Multi-view Data, and

... Clusters in subspaces are detected using the Mean Shift algorithm, which is based on a non-parametric kernel density estimation approach. The quality of a subspace is measured in terms of the density of the clusters discovered therein. A weighting term, measuring the similarity with previously dete ...

Introduction to WEKA

... • Select an attribute and examine 1. Summary statistics (Data type, any missing data, …) 2. Visualization ...

Evolving SQL Queries for Data Mining

Analysis of Prediction Techniques based on Classification and

... overall distribution pattern and correlations among data attributes. A classification approach can also be an effective means of distinguishing groups or classes of objects, but it becomes costly, so clustering can be used as a preprocessing approach for attribute subset selection and classification[ ...

A Novel Staged Modeling Mechanism for Process Object

Cloud Based Hybrid Evolution Algorithm for NP

Spatial Data Mining by Decision Trees

... be analyzed, neighborhood objects table and spatial join index table. Whenever the attribute to be analyzed is an attribute of neighborhood, the algorithm uses a double join between the target table, the spatial join index table and the neighbors table; it is here where modification is to be made to ...

Discovering Correlated Subspace Clusters in 3D

... a 3D subspace cluster. Figure 1(b) shows the 3D subspace clusters obtained using different metrics. As we can see, several metrics can be used to find the desired CSCs, but some of them generate spurious results. In order to use correlation information on 3D continuous-valued data, we must calculate ...

Feature selection, Dimensionality Reduction and Clustering


K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum. These heuristics resemble the expectation-maximization algorithm for mixtures of Gaussian distributions, in that both use iterative refinement and both model the data with cluster centers. However, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as the nearest centroid classifier or Rocchio algorithm.
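The iterative refinement heuristic described above can be illustrated with a minimal pure-Python sketch. It follows the standard alternation of an assignment step and a mean-update step (often called Lloyd's algorithm), and is not taken from any particular library. The function names `kmeans` and `nearest_center`, the `initial_centers` parameter, and the sample points are hypothetical, chosen only for illustration. As noted above, the result is a local optimum that depends on the starting centers.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to point (a 1-nearest-neighbor lookup)."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, max_iter=100, seed=0, initial_centers=None):
    """Illustrative k-means: alternate assignment and mean update.

    Starting centers are k distinct data points drawn with a seeded RNG,
    unless initial_centers (a hypothetical convenience) is supplied.
    """
    if initial_centers is None:
        rng = random.Random(seed)
        centers = [tuple(p) for p in rng.sample(points, k)]
    else:
        centers = [tuple(c) for c in initial_centers]

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest mean.
        new_labels = [nearest_center(p, centers) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing: a local optimum
        labels = new_labels

        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return centers, labels


# Two visually separated groups of 2-D points (made-up sample data).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centers, labels = kmeans(points, k=2)

# Nearest centroid classification of a new observation:
# apply 1-nearest-neighbor to the learned cluster centers.
new_label = nearest_center((7.5, 8.1), centers)
```

The final line shows the nearest centroid idea from the paragraph above: once the centers are fixed, classifying a new point is just a nearest-center lookup, and the set of points that map to each center forms its Voronoi cell.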


