Learning Similarity Metrics for Event Identification in Social

... type (e.g., textual or time data). In addition, we use one textual document representation that contains the textual representations of all the document features (title, description, tags, time/date and location). This representation, all-text, is commonly used in similar domains [28]. Next, we list ...

Text Mining and Clustering

Subspace Scores for Feature Selection in Computer Vision

Filtering and Refinement: A Two-Stage Approach for Efficient and

... measure the trend of species distribution and also to indicate slow environmental or climate changes. From the above discussion, one can generalize the following three types of anomalies (Figure 1): • Unique Instances (sparse and distant): A unique instance is an isolated point, far from the normal ...

An Overview of Data Mining Techniques

A Lattice Algorithm for Data Mining

Exploring the wild birds' migration data for the

Optimizing metric access methods for querying and mining complex

Document

Localized Prediction of Multiple Target Variables Using Hierarchical

Consensus Guided Unsupervised Feature Selection

Multivariate Approaches to Classification in Extragalactic

... XXIst century. In this paper we would like to present these different approaches in the general context of unsupervised (clustering) and supervised (classification) learning. Clustering approaches gather objects according to their similarities either through the choice of a distance metric or using ...

Computational Geometry and Spatial Data Mining

C-TREND: A New Technique for Identifying and Visualizing Trends in Multi-Attribute

Clustering Web Usage Data using Concept Hierarchy and Self

CURIO : A Fast Outlier and Outlier Cluster Detection Algorithm for

... access. Figure 3 shows the difference in required cell numbers for the UCI-KDD dataset (Hettich & Bay 1999) on internet usage data, while increasing P and κ. However, it should be noted that an array structure is still reasonable given a dense dataset and coarse partitioning (number of cells < 2^24). ...

search engine optimization using data mining approach

Privacy Preserving Distributed DBSCAN Clustering

Evaluating the Performance of Association Rule Mining

... Abstract: Data mining is the phenomenon of extracting fruitful knowledge from contrasting perspectives. Frequent patterns are patterns that appear in a database most frequently. Various techniques have been recommended to increase the performance of frequent pattern mining algorithms. Energetic freq ...

Improving Efficiency of Apriori Algorithm

A SURVEY ON WEB MINNING ALGORITHMS

... with a complexity of O(NKM), where K is the number of clusters and M the number of batch iterations. In addition, all these centroid-based clustering techniques have an online version, which can be suitably used for adaptive attack detection in a data environment. 4.2. K-Mean Algorithm The K-Means ...

ISpaper04 July 07

What to put in the bag? Comparing and contrasting procedures for

K - Department of Computer Science

... Identify a set of data over 2 classes (squares and triangles) for which DANN will give a better result than kNN. Explain why this is the case. ...

BORDER: Efficient Computation of Boundary Points


Nearest-neighbor chain algorithm

In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method for performing several types of agglomerative hierarchical clustering. It uses an amount of memory that is linear in the number of points to be clustered and an amount of time that is linear in the number of distinct distances between pairs of points (that is, quadratic in the number of points). The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest neighbor pairs without taking advantage of nearest neighbor chains.
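
The chain-following idea described above can be illustrated with a minimal Python sketch. It is not the original 1982 implementation: the function names `nn_chain` and `complete_linkage` are hypothetical, complete linkage is assumed as the cluster distance (the method needs a linkage for which merging two clusters never brings the result closer to a third cluster), and cluster distances are recomputed from scratch for readability, so this version does not reach the time bound quoted above.

```python
import math


def complete_linkage(points, a, b):
    """Complete-linkage distance: the largest point-to-point distance
    between cluster a and cluster b (both are lists of point indices)."""
    return max(math.dist(points[i], points[j]) for i in a for j in b)


def nn_chain(points):
    """Illustrative nearest-neighbor chain clustering.

    Returns a list of merges (cluster_id_a, cluster_id_b, distance).
    Ids 0..n-1 are the input points; each merge creates the next new id.
    """
    clusters = {i: [i] for i in range(len(points))}
    next_id = len(points)
    chain = []   # stack of cluster ids, each the nearest neighbor of the one below
    merges = []

    while len(clusters) > 1:
        if not chain:
            # Start a new chain from an arbitrary active cluster.
            chain.append(next(iter(clusters)))

        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            best, best_d = None, math.inf
            for other in clusters:
                if other == top:
                    continue
                d = complete_linkage(points, clusters[top], clusters[other])
                # On ties, prefer the predecessor so the chain terminates.
                if d < best_d or (d == best_d and other == prev):
                    best, best_d = other, d
            if best == prev:
                break            # top and prev are mutual nearest neighbors
            chain.append(best)   # otherwise keep following the chain

        # Merge the mutual nearest neighbors and drop them from the chain.
        chain.pop()
        chain.pop()
        clusters[next_id] = clusters.pop(top) + clusters.pop(prev)
        merges.append((top, prev, best_d))
        next_id += 1

    return merges
```

The merges are produced in the order the chains discover them, which is not necessarily increasing order of distance; sorting them by distance afterwards gives the usual dendrogram order. For example, `nn_chain([(0, 0), (0, 1), (10, 0), (10, 2)])` first merges the two points on the left, then the two on the right, and finally joins the two resulting clusters.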

