RGCA: a Reliable GPU Cluster Architecture for Large

6340 Lecture on Object-Similarity and Clustering

Filtering and Refinement: A Two-Stage Approach for Efficient and

... process and also reuse dimension measures. To differentiate the binary tree structure with partition information with regular tree structure, we define a Filter Tree structure to utilize the result of the filtering stage and keep track of the deterministic space partition process. Definition 1 (Filt ...

Discovering Similar Patterns in Time Series

... Isokinetic systems were conceived to analyse the muscular fitness of patients who are members of any population group. During a standard session, patients must perform a set of exercises, for example, ten seconds extending and flexing their leg with the machine moving at a constant speed of 90°/s. T ...

Experiment No. 1

Multi-represented kNN-Classification for Large Class Sets

A New Sequential Covering Strategy for Inducing Classification

... number of correct predictions divided by the total number of predictions—in the test set, although in some application domains (e.g. credit approval, medical diagnosis and protein function prediction) the comprehensibility of the model plays an important role [3], [4]. For instance, both neural netw ...

Automatic subspace clustering of high dimensional data for data

... DNF expression, there are clusters that are poorly approximated, such as a cigar-shaped cluster when the cluster description is restricted to be a rectangular box. On the other hand, the same criticism can also be raised against decision tree and decision-rule classifiers, such as disclosed b ...

Improving SVM Classification on Imbalanced Data Sets in Distance

... perception that the distances induced by human judgments are often non-metric [22]. The key observation of the proposed approach is that although not every four point metric space can be embedded into a Euclidean space, every three point metric space can be isometrically embedded into the plane R^2. ...

A Near-Optimal Algorithm for Differentially-Private

... approximations to PCA which guarantee differential privacy, a cryptographically motivated definition of privacy (Dwork et al., 2006b) that has gained significant attention over the past few years in the machine-learning and data-mining communities (Machanavajjhala et al., 2008; McSherry and Mironov, ...

Online Full Text

... Stopping criteria: Select the optimal number of partition p Figure 1. Discretization algorithm based on Rough sets ...

Optimization of Naïve Bayes Data Mining Classification Algorithm

... predict the class of a new dataset with unknown class by analysing its structural similarity. Multiple classification algorithms have been implemented, used and compared for different data domains, however, there has been no single algorithm found to be superior over all others for all data sets for ...

Particle Swarm Optimization Based Optimal Segmentation for

... Moreover, instead of employing local search strategies, it makes full use of a particle swarm optimizer to minimize the given objective function. PSOVW[8] is simple to implement. It has been applied to solve high-dimensional data clustering and has been demonstrated to be less sensitive to the initi ...

Reconstruction-Based Association Rule Hiding

... association rules mined from D with minimum support threshold MST and minimum confidence threshold MCT, the problem of KHD becomes association rule hiding problem. Clifton in provided a well designed scenario which clearly shows the importance of the association rule hiding problem. In the scenario, ...

A Comparative Analysis of Association Rule Mining

... algorithms [15]. The main task of every association rule mining algorithm is to find the sets of items that frequently appear together, the frequent itemsets. R. Porkodi presented the rule based approach for constructing gene and protein names dictionary from Medline abstracts that consists of thre ...

Proceedings of The Workshop on Mining Complex Patterns

... A clustering step (typical in Text Mining) can be performed on V to identify groups of elements having similar features (i.e., involved in the same verbal relationships). The underlying idea is that concepts belonging to the same cluster should share some semantics. For instance, if concepts dog, Jo ...

Feature Selection

A Survey on Nearest Neighbor Search Methods

... in space, the speed of searching is really decreased. The kNN technique was first presented in [42] for classification and used a simple algorithm. A naive solution for the NNS problem is a linear search that computes the distance from the query to every single point in the datase ...

TopCat: Data Mining for Topic Identification in a Text Corpus

Mayo_tutorial_July14

Mining Association Rules with Multiple Minimum Supports Using

A Decision Criterion for the Optimal Number Yunjae Jung ( )

A study on time series data mining based on the concepts and

... stock market investors may want to know when to enter and when to exit the market in order to maximize the investment returns. As a result, finding the turning points in stock trends might be the most interesting thing for investors. However, it is very difficult to apply the traditional time series ...

ISpaper04 July 07

... as the imprecision associated with the data in the modeling exercises. In fuzzy system modeling, the nonlinear relations in the data are approximated by means of fuzzy if-then rules. In earlier approaches, the fuzzy if-then rules were determined a priori from other sources such as experts’ knowledge. ...

K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions, since both algorithms use an iterative refinement approach. Both also use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as the nearest centroid classifier or Rocchio algorithm.
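The iterative refinement described above can be illustrated with a minimal sketch of one common heuristic, usually called Lloyd's algorithm. The function names `kmeans` and `nearest`, the random choice of initial centers, the stopping rule, and the toy data are all assumptions made for this illustration; they are not taken from any particular library or paper listed on this page.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (the 1-nearest-neighbor rule)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's heuristic: alternate assignment and mean-update steps."""
    rng = random.Random(seed)
    # Assumed initialization: k input points picked at random.
    centers = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        labels = [nearest(p, centers) for p in points]
        # Update step: each center moves to the mean of its current members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            # Centers stopped moving: a local optimum has been reached.
            break
        centers = new_centers
    return centers, labels


# Toy data (hypothetical): two well-separated groups in the plane.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(data, k=2)

# Nearest centroid classification: assign a new point to an existing cluster.
new_label = nearest((9.0, 9.5), centers)
```

Reusing `nearest` for both the assignment step and the classification of new data mirrors the relationship noted above: once the centers are fixed, placing a new point in a cluster is a 1-nearest neighbor lookup over the centers. Because the result is only a local optimum, a practical run would typically repeat the procedure from several initializations and keep the best partition.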
