... Clustering algorithms have focused on the management of numerical and categorical data. However, in recent years, textual information has grown in importance. Proper processing of this kind of information within data mining methods requires an interpretation of its meaning at a semantic level. I ...
Prototype-based Classification and Clustering
... classifiers like decision trees, (artificial) neural networks, or (naïve) Bayes classifiers and denote the process of assigning a class from a predefined set to an object or case under consideration. Consequently, a classification problem is the task to construct a classifier, that is, an automatic ...
SEQUENTIAL PATTERN ANALYSIS IN DYNAMIC BUSINESS
... Our major contribution is to identify the right granularity for sequential pattern analysis. We first show that the right pattern granularity for sequential pattern mining is often unclear due to the so-called “curse of cardinality”, which corresponds to a variety of difficulties in mining sequentia ...
Boris Mirkin, Clustering: A Data Recovery Approach
... clusters. However, implementing this idea is less than straightforward. First, too many similarity measures and clustering techniques have been invented with virtually no support to a non-specialist user for choosing among them. The trouble with this is that different similarity measures and/or clus ...
Density-based Algorithms for Active and Anytime Clustering
... cost, high time complexity, noisy and missing data, etc. Motivated by these potential difficulties of acquiring the distances among objects, we propose another approach for DBSCAN, called Active Density-based Clustering (Act-DBSCAN). Given a budget limitation B, Act-DBSCAN is only allowed to use up ...
Clustering and Community Detection in Directed Networks: A Survey
... Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed, in the sense that there is directionality on the edges, making the semantics of the edges non-symmetric as ...
Improving the Accuracy of Decision Tree Induction by - IBaI
... set of features which are calculated from various image-processing methods. Images of flaws in welds are radiographed by local grey level discontinuities. Subsequently, the morphological edge finding operator, the derivative of Gaussian operator and the Gaussian weighted image moment vector operato ...
Segmentation, Classification, and Clustering of Temporal Data
... Abstract: Time series can be found in domains as diverse as medicine, astronomy, geophysics, engineering, and quantitative finance. In general, a time series is a sequence of data points, measured at successive points in time and spaced at uniform time intervals. This thesis is concerned with time s ...
Nearest-neighbor chain algorithm
In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering. It uses an amount of memory that is linear in the number of points to be clustered and an amount of time that is linear in the number of distinct distances between pairs of points, that is, quadratic in the number of points. The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest-neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest-neighbor pairs without taking advantage of nearest-neighbor chains.
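The chain-following idea can be illustrated with a minimal Python sketch. It is not the original 1982 implementation: it assumes Ward's linkage (one of the linkages the method is commonly used with), represents each cluster only by its size and centroid so that memory stays linear, and uses the hypothetical names `ward_distance` and `nn_chain`.

```python
def ward_distance(a, b):
    """Ward merge cost between two clusters given as (size, centroid) pairs."""
    (na, ca), (nb, cb) = a, b
    squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * squared


def nn_chain(points):
    """Return merges as (cluster_id, cluster_id, cost) tuples.

    Input points get ids 0..n-1; each merged cluster gets the next free id.
    Assumption for this sketch: Ward's linkage with centroid bookkeeping.
    """
    clusters = {i: (1, tuple(p)) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    chain = []  # stack of cluster ids, each the nearest neighbor of the previous

    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))  # arbitrary starting cluster

        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None

            # Start from the predecessor so that ties are broken in its favor.
            best, best_d = None, float("inf")
            if prev is not None:
                best = prev
                best_d = ward_distance(clusters[top], clusters[prev])
            for cid, cluster in clusters.items():
                if cid == top or cid == prev:
                    continue
                d = ward_distance(clusters[top], cluster)
                if d < best_d:
                    best, best_d = cid, d

            if best == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(best)

        # Merge the mutual nearest neighbors at the end of the chain.
        chain.pop()
        chain.pop()
        na, ca = clusters.pop(top)
        nb, cb = clusters.pop(prev)
        n = na + nb
        centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
        clusters[next_id] = (n, centroid)
        merges.append((prev, top, best_d))
        next_id += 1

    return merges


if __name__ == "__main__":
    for merge in nn_chain([(0.0,), (1.0,), (10.0,), (11.0,)]):
        print(merge)
```

The merges are produced in the order the chains discover them, which is not necessarily increasing cost; sorting them by cost afterwards gives the order in which a greedy agglomerative procedure would report them. The rest of the chain is kept after each merge, which is what lets the total number of nearest-neighbor searches stay linear in the number of points.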