
Mining Trajectory Data
... and retrieve features based on their geographic location over time; such features include Stay Points (SP) and Points of Interest (POI), which can be useful for understanding users’ interaction and similarity, and for both understanding individuals’ movement patterns and finding interesting places in a certain ...
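To make the Stay Point idea concrete, here is a minimal detection sketch: a stay point is declared when consecutive trajectory points remain within a distance threshold for at least a minimum duration. The thresholds and the haversine distance are illustrative assumptions, not the paper's exact parameters.

```python
# Minimal stay-point detection sketch (assumed thresholds and a simple
# haversine distance; not necessarily the cited paper's exact algorithm).
from math import radians, sin, cos, asin, sqrt

def haversine_m(p, q):
    """Great-circle distance in metres between (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def detect_stay_points(points, dist_thresh_m=200, time_thresh_s=20 * 60):
    """points: list of (lat, lon, unix_time) ordered by time.
    Returns (lat, lon, arrival, departure) tuples for detected stay points."""
    stay_points, i, n = [], 0, len(points)
    while i < n:
        j = i + 1
        while j < n and haversine_m(points[i], points[j]) <= dist_thresh_m:
            j += 1
        # points[i..j-1] stayed within the distance threshold of points[i]
        if points[j - 1][2] - points[i][2] >= time_thresh_s:
            lat = sum(p[0] for p in points[i:j]) / (j - i)
            lon = sum(p[1] for p in points[i:j]) / (j - i)
            stay_points.append((lat, lon, points[i][2], points[j - 1][2]))
        i = j
    return stay_points
```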
Support vector machines based on K-means clustering for real
... Two elements affecting the response time of SVM classifiers are the number of input variables and the number of support vectors. While Viaene et al. (2001) improve response time by selecting a subset of the input variables, this paper tries to improve the response time of SVM classifiers by reducing the number of support ...
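One way to illustrate the idea of reducing support vectors is to cluster each class with K-means and train the SVM on the cluster centres, which caps the number of candidate support vectors. This is a sketch of the general approach under my assumptions (number of clusters per class, scikit-learn estimators), not necessarily the paper's procedure.

```python
# Sketch: reduce SVM support vectors by training on K-means centres
# (an illustration of the general idea, not the cited paper's exact method).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def kmeans_reduced_svm(X, y, clusters_per_class=50, **svc_kwargs):
    """Cluster each class separately and fit an SVM on the cluster centres."""
    centres, labels = [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        k = min(clusters_per_class, len(Xc))
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
        centres.append(km.cluster_centers_)
        labels.append(np.full(k, cls))
    # The SVM now has at most sum(k) candidate support vectors, so prediction is faster.
    return SVC(**svc_kwargs).fit(np.vstack(centres), np.concatenate(labels))
```

The trade-off is the usual one: fewer support vectors mean faster prediction at some cost in accuracy, controlled here by `clusters_per_class`.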
Data Mining and Exploration
... • Important, since for most astronomical studies you want either stars (~ quasars) or galaxies; the depth to which a reliable classification can be done is the effective limiting depth of your catalog, not the detection depth – There is generally more to measure for a non-PSF object • You’d like ...
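As a rough illustration of PSF-based star/galaxy separation, the sketch below trains a classifier on morphological features such as the PSF-minus-model magnitude (a concentration measure); the feature names and the random-forest choice are assumptions for illustration only, not the lecture's actual recipe.

```python
# Hypothetical star/galaxy separation sketch: the features used here
# (psf_mag - model_mag concentration, FWHM) are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_star_galaxy_classifier(psf_mag, model_mag, fwhm, labels):
    """labels: 0 = point source (star/quasar), 1 = extended (galaxy)."""
    # Extended sources have more light outside the PSF, so psf - model grows.
    X = np.column_stack([psf_mag - model_mag, fwhm])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, labels)
    return clf
```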
Data Mining and Exploration (a quick and very superficial
... How many statistically distinct kinds of things are there in my data, and which data object belongs to which class? Are there anomalies/outliers (e.g., extremely rare classes)? I know the classes present in the data, but would like to efficiently classify all of my data objects ...
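The three questions in this excerpt map naturally onto standard tools; the sketch below is one possible mapping (the specific estimators are my assumptions): model selection over Gaussian mixtures for "how many kinds", an isolation forest for outliers, and a supervised classifier once the classes are known.

```python
# Sketch mapping the three exploratory questions onto standard tools
# (the tool choices are assumptions, not the lecture's recommendations).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import IsolationForest, RandomForestClassifier

def how_many_kinds(X, max_k=10):
    """Pick the number of mixture components by BIC."""
    bics = [GaussianMixture(k, random_state=0).fit(X).bic(X) for k in range(1, max_k + 1)]
    return int(np.argmin(bics)) + 1

def find_outliers(X, contamination=0.01):
    """Flag rare/anomalous objects (True = outlier)."""
    return IsolationForest(contamination=contamination, random_state=0).fit_predict(X) == -1

def classify_rest(X_labelled, y_labelled, X_unlabelled):
    """Known classes: train on the labelled subset, label everything else."""
    clf = RandomForestClassifier(random_state=0).fit(X_labelled, y_labelled)
    return clf.predict(X_unlabelled)
```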
Pattern Recognition and Classification for Multivariate - DAI
... that the reconstruction error of the employed SVD model does not exceed a predefined threshold. In the case of two observed parameters, the correlation structure of a segment can be expressed as a position vector (two-dimensional hyperplane) which gives an approximation of the original segment data. The ...
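A small sketch of the reconstruction-error test described here: fit a rank-r SVD approximation to a segment and accept the segment if the relative residual stays below a threshold. The rank and threshold values are placeholders, not the paper's settings.

```python
# Sketch of an SVD reconstruction-error check for a multivariate segment
# (rank and threshold are assumed values).
import numpy as np

def svd_reconstruction_error(segment, rank=1):
    """segment: (n_samples, n_params) array. Relative Frobenius reconstruction error."""
    centred = segment - segment.mean(axis=0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    approx = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
    return np.linalg.norm(centred - approx) / max(np.linalg.norm(centred), 1e-12)

def segment_is_homogeneous(segment, rank=1, threshold=0.1):
    """Accept the segment if the low-rank model explains it well enough."""
    return svd_reconstruction_error(segment, rank) <= threshold
```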
Decision Tree Induction in High Dimensional, Hierarchically
... Previous work on distributed decision tree induction usually focused on tight clusters of computers, or even on shared memory machines [4–6, 10, 11]. When a wide-area distributed scenario is considered, all these algorithms become impractical because they use too much communication and synchronizat ...
Application of Data Mining Techniques to Olea - CEUR
... it produces very simple rules for classification and can be considered the baseline for classification performance. It was found to perform as well as more sophisticated algorithms when applied to many of the standard machine learning test datasets (Holte, 1993). OneR can parsimoniously discover and ...
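Since OneR (Holte, 1993) is so compact, a sketch helps: for each attribute, map every attribute value to its majority class and keep the attribute whose single rule makes the fewest training errors. The version below assumes nominal attributes; numeric attributes would need discretisation first.

```python
# Compact OneR sketch for nominal attributes (numeric attributes would need
# discretisation, which is omitted here).
from collections import Counter, defaultdict

def one_r(rows, target):
    """rows: list of dicts; target: name of the class attribute.
    Returns (best_attribute, {value: predicted_class}, training_errors)."""
    best = None
    for attr in rows[0]:
        if attr == target:
            continue
        counts = defaultdict(Counter)
        for r in rows:
            counts[r[attr]][r[target]] += 1
        # One rule: each attribute value predicts its majority class.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(1 for r in rows if rule[r[attr]] != r[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best
```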
Cluster Description and Related Problems
... R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. SIGMOD 1998. D. Angluin. Queries and concept learning. Machine Learning, 2(4): 319-342, 1988. P. Berman and B. Dasgupta. Approximating rectilinear polygon cover ...
Multiple Non-Redundant Spectral Clustering Views
... Clustering is often a first step in the analysis of complex multivariate data, particularly when a data analyst wishes to engage in a preliminary exploration of the data. Most clustering algorithms find one partitioning of the data (Jain et al., 1999), but this is overly rigid. In the exploratory data ...
Web People Search via Connection Analysis
... Disambiguation algorithm: Correlation Clustering (1/3) • CC has been applied in the past to group documents on the same topic and to other problems. • It assumes that there is a similarity function s(u, v) learned from past data. • Each (u, v) edge is assigned a “+” (similar) or “-” (different) la ...
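A minimal sketch of correlation clustering in this spirit uses the randomized pivot heuristic: repeatedly pick a pivot and group it with all remaining nodes joined to it by a "+" edge. The threshold used here to turn s(u, v) into +/- labels is an assumption, not the paper's learned labelling.

```python
# Sketch of a simple pivot-style correlation clustering heuristic
# (the similarity threshold that defines "+" edges is an assumption).
import random

def correlation_clustering(nodes, similarity, threshold=0.5, seed=0):
    """similarity(u, v) -> score in [0, 1]; edges >= threshold count as '+'."""
    rng = random.Random(seed)
    remaining = list(nodes)
    clusters = []
    while remaining:
        pivot = remaining.pop(rng.randrange(len(remaining)))
        # The pivot plus every unclustered node with a "+" edge to it form one cluster.
        cluster = [pivot] + [v for v in remaining if similarity(pivot, v) >= threshold]
        remaining = [v for v in remaining if v not in cluster]
        clusters.append(cluster)
    return clusters
```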
Density Clustering Method for Gene Expression Data
... successfully discovered the tumor classes based on the simultaneous expression profiles of thousands of genes from acute leukemia patients’ test samples using a self-organizing map clustering approach [8]. Some other clustering approaches, such as k-means [21], fuzzy k-means [1], CAST [3], etc., als ...
Subspace Clustering of High-Dimensional Data: An Evolutionary
... ORCLUS finds projected clusters as a set of data points C together with a set of orthogonal vectors such that these data points are closely clustered in the defined subspace. A limitation of these two approaches is that the process of forming the locality is based on the full dimensionality of the s ...
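One building block of such projected clustering can be sketched directly: given a candidate cluster, take the eigenvectors of its covariance matrix with the smallest eigenvalues as the orthogonal vectors defining the subspace in which the points are most tightly clustered. This is a single step under my assumptions, not the full iterative ORCLUS algorithm.

```python
# Sketch of one ORCLUS-style building block: the l orthogonal directions in
# which a candidate cluster is most tightly packed (smallest-eigenvalue
# eigenvectors of its covariance). Not the full iterative algorithm.
import numpy as np

def cluster_subspace(points, l):
    """points: (n, d) array of one cluster's members; returns a (d, l) basis."""
    cov = np.cov(points, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, :l]                    # directions of least spread

def projected_energy(points, basis):
    """Mean squared spread of the cluster inside its chosen subspace."""
    centred = points - points.mean(axis=0)
    return float(np.mean(np.sum((centred @ basis) ** 2, axis=1)))
```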
Designing Parallel and Distributed Algorithms for Data Mining and
... operates on huge deposits of data drawn from massive data resources measured in terabytes or zettabytes; such volumes are now common in data mining and tend to make data mining tasks and applications too slow and too large to be executed on a single process ...
frequent patterns for mining association rule in improved
... International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 3, Issue 3, March 2014 ...
Chapter 5. Cluster Analysis
... There is a separate “quality” function that measures the “goodness” of a cluster. The definitions of distance functions are usually very different for interval-scaled, boolean, categorical, ordinal and ratio variables. Weights should be associated with different variables based on applications and d ...
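A Gower-style sketch makes the point about type-dependent distances and weights concrete: interval-scaled variables are range-normalised, categorical/boolean variables contribute a simple mismatch, and per-variable weights reflect the application. The particular weighting scheme shown is an assumption.

```python
# Gower-style weighted dissimilarity sketch for mixed variable types
# (the per-variable weights are application-dependent assumptions).
def mixed_dissimilarity(x, y, types, weights, ranges):
    """x, y: records as dicts; types[f] in {'interval', 'categorical', 'boolean'};
    ranges[f]: max - min of interval variable f, used for normalisation."""
    num, den = 0.0, 0.0
    for f, w in weights.items():
        if types[f] == 'interval':
            d = abs(x[f] - y[f]) / ranges[f] if ranges[f] else 0.0
        else:  # categorical / boolean: simple mismatch
            d = 0.0 if x[f] == y[f] else 1.0
        num += w * d
        den += w
    return num / den if den else 0.0
```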
Iterative Projected Clustering by Subspace Mining
... Therefore, a new class of projected clustering methods (also called subspace clustering methods) [1], [2], [3], [12] has emerged, whose task is to find 1) a set of clusters C, and 2) for each cluster Ci ∈ C, the set of dimensions Di that are relevant to Ci. For instance, the projected clusters in ...
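One simple way to illustrate finding the relevant dimensions Di of a cluster Ci is to call a dimension relevant when the cluster's spread along it is much smaller than the global spread; the ratio threshold below is an assumption, not the criterion of any specific cited method.

```python
# Sketch of one simple relevance criterion for projected clustering: dimension j
# is relevant to cluster Ci if the cluster's standard deviation along j is much
# smaller than the global one (the 0.5 ratio is an assumption).
import numpy as np

def relevant_dimensions(X, member_idx, ratio=0.5):
    """X: (n, d) data; member_idx: indices of one cluster's points.
    Returns the dimension indices deemed relevant to that cluster."""
    global_std = X.std(axis=0) + 1e-12
    cluster_std = X[member_idx].std(axis=0)
    return [j for j in range(X.shape[1]) if cluster_std[j] / global_std[j] < ratio]
```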
CLINCH: Clustering Incomplete High-Dimensional Data
... mean more information; if we deal with dimensions with larger entropies, the prediction of the missing attributes will be more precise. After all the complete dimensions have been processed, characteristics of clusters on this complete subspace are built through entropies and will be employed for the predicti ...
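A short sketch of the entropy ordering described here: discretise each fully observed dimension, compute its entropy, and process dimensions from highest to lowest entropy so that more informative dimensions drive the prediction of missing attributes. The histogram binning is an assumed discretisation, not the paper's.

```python
# Sketch of entropy-based ordering of complete dimensions
# (the 10-bin histogram is an assumed discretisation).
import numpy as np

def dimension_entropy(column, bins=10):
    """Shannon entropy (bits) of a discretised numeric column."""
    counts, _ = np.histogram(column, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def entropy_order(X, complete_dims, bins=10):
    """complete_dims: indices of columns with no missing values.
    Returns those indices sorted by decreasing entropy."""
    return sorted(complete_dims, key=lambda j: dimension_entropy(X[:, j], bins), reverse=True)
```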
Opening the Black Box: Interactive Hierarchical Clustering for
... Clustering is one of the most important tasks for geographic knowledge discovery. However, existing clustering methods have two severe drawbacks for this purpose. First, spatial clustering methods have so far focused mainly on searching for patterns within the spatial dimensions (usually 2D or ...