... Clustering algorithms have focused on the management of numerical and categorical data. However, in recent years, textual information has grown in importance. Proper processing of this kind of information within data mining methods requires an interpretation of its meaning at a semantic level. I ...

Data Mining on Empty Result Queries

Mining Query Subtopics from Search Log Data

Scalable Model-based Clustering Algorithms for

Discovering Multiple Clustering Solutions

... Summary and Comparison in the Taxonomy ...

The GC3 framework : grid density based clustering for

TOWARD ACCURATE AND EFFICIENT OUTLIER DETECTION IN

Machine Learning for Information Visualization

THE CONSTRUCTION AND EXPLOITATION OF ATTRIBUTE

ENTROPY BASED TECHNIQUES WITH APPLICATIONS IN DATA

Prototype-based Classification and Clustering

... classifiers like decision trees, (artificial) neural networks, or (naïve) Bayes classifiers and denote the process of assigning a class from a predefined set to an object or case under consideration. Consequently, a classification problem is the task to construct a classifier, that is, an automatic ...

Research Proposal - University of South Australia

SEQUENTIAL PATTERN ANALYSIS IN DYNAMIC BUSINESS

... Our major contribution is to identify the right granularity for sequential pattern analysis. We first show that the right pattern granularity for sequential pattern mining is often unclear due to the so-called “curse of cardinality”, which corresponds to a variety of difficulties in mining sequentia ...

Efficient Classification and Prediction Algorithms for Biomedical

University of Alberta Library Release Form Name of Author Title of Thesis

Rank Based Anomaly Detection Algorithms - SUrface

Agents and Data Mining Interaction - CS

Boris Mirkin Clustering: A Data Recovery Approach

... clusters. However, implementing this idea is less than straightforward. First, too many similarity measures and clustering techniques have been invented with virtually no support to a non-specialist user for choosing among them. The trouble with this is that different similarity measures and/or clus ...

Density-based Algorithms for Active and Anytime Clustering

... cost, high time complexity, noisy and missing data, etc. Motivated by these potential difficulties of acquiring the distances among objects, we propose another approach for DBSCAN, called Active Density-based Clustering (Act-DBSCAN). Given a budget limitation B, Act-DBSCAN is only allowed to use up ...

tdp.a020a09

Clustering and Community Detection in Directed Networks: A Survey

... Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed – in the sense that there is directionality on the edges, making the semantics of the edges non symmetric as ...

Improving the Accuracy of Decision Tree Induction by - IBaI

... set of features which are calculated from various image-processing methods. Images of flaws in welds are radio-graphed by local grey level discontinuities. Subsequently, the morphological edge finding operator, the derivative of Gaussian operator and the Gaussian weighted image moment vector operato ...

Segmentation, Classification, and Clustering of Temporal Data

... Abstract Time series can be found in domains as diverse as medicine, astronomy, geophysics, engineering, and quantitative finance. In general, a time series is a sequence of data points, measured at successive points in time and spaced at uniform time intervals. This thesis is concerned with time s ...

Mahout Tutorial (PDF Version)

Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method for performing several types of agglomerative hierarchical clustering. It uses an amount of memory that is linear in the number of points to be clustered and an amount of time that is linear in the number of distinct distances between pairs of points, which is at most quadratic in the number of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest-neighbor graph of the clusters until each path terminates in a pair of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings from mutual nearest-neighbor pairs without taking advantage of nearest-neighbor chains.
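The chain-following idea described above can be sketched in a few lines of Python. This is a minimal illustration and not the 1982 implementation: it assumes Ward's linkage (one of the linkages the method is commonly used with), stores only one centroid and one size per cluster so that memory stays linear, and uses the hypothetical names `ward_distance` and `nn_chain_ward`, which do not come from the source.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def ward_distance(ca: Sequence[float], na: int,
                  cb: Sequence[float], nb: int) -> float:
    """Ward merge cost between two clusters given centroids and sizes."""
    sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq


def nn_chain_ward(points: Sequence[Sequence[float]]) -> List[Tuple[int, int, float]]:
    """Return merges as (cluster_a, cluster_b, cost) in the order they are found.

    Input points get ids 0..n-1; each merged cluster gets the next free id.
    """
    centroids: Dict[int, Tuple[float, ...]] = {
        i: tuple(float(x) for x in p) for i, p in enumerate(points)
    }
    sizes: Dict[int, int] = {i: 1 for i in centroids}
    next_id = len(points)
    merges: List[Tuple[int, int, float]] = []
    chain: List[int] = []  # stack of cluster ids, each the nearest neighbor of the previous

    while len(centroids) > 1:
        if not chain:
            chain.append(min(centroids))  # any active cluster may start a chain
        while True:
            top = chain[-1]
            prev: Optional[int] = chain[-2] if len(chain) > 1 else None
            best, best_d = None, float("inf")
            for c in centroids:
                if c == top:
                    continue
                d = ward_distance(centroids[top], sizes[top], centroids[c], sizes[c])
                # On ties, prefer the previous chain element so the chain terminates.
                if d < best_d or (d == best_d and c == prev):
                    best, best_d = c, d
            if best == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(best)

        # Merge the mutual nearest neighbors at the end of the chain.
        b, a = chain.pop(), chain.pop()
        na, nb = sizes.pop(a), sizes.pop(b)
        ca, cb = centroids.pop(a), centroids.pop(b)
        centroids[next_id] = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        sizes[next_id] = na + nb
        merges.append((a, b, best_d))
        next_id += 1
    return merges


if __name__ == "__main__":
    for merge in nn_chain_ward([(0, 0), (0, 1), (10, 0), (10, 1)]):
        print(merge)
```

The merges are not necessarily discovered in order of increasing cost, so a caller who wants a conventional dendrogram would sort the returned list by cost afterwards. Each nearest-neighbor search here is a plain linear scan, which keeps the sketch short and is consistent with the quadratic overall running time.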
