Contents - Emory Math/CS Department

... According to William H. Inmon, a leading architect in the construction of data warehouse systems, “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process” [Inm96]. This short, but comprehensive definitio ...

Slides - Asian Institute of Technology

... Given both the network structure and all variables observable: learn only the CPTs. Network structure known, some hidden variables: gradient descent (greedy hill-climbing) method, analogous to neural network learning. Network structure unknown, all variables observable: search through the model ...

A Framework for an Intelligent Decision Support

... scientific method. Informal discussions with medical practitioners exposed a lack of confidence in data mining activities as they are perceived to not support the scientific method. This thesis demonstrates that there are strong parallels between the scientific method and the Cross-Industry Standard ...

Discovery of spatial association rules in geo

... data to be mined are represented in a single table (or relation) of a relational database, such that each row (or tuple) represents an independent unit of the sample population and columns correspond to properties of units. In spatial data mining applications this assumption turns out to be a great ...

FZ2210751085

... Zeng et al. [14], [15] optimize each path separately by decomposing the composition into execution paths, and after the optimization process, the execution paths are aggregated into an overall composition that consists of all paths. If there is a common abstract service that belongs to more than one ...

Classification - Computer Science and Engineering

... Examples of Classification Problem ...

Profiling Project - Defining profiling

... Profiling is a highly evocative term with multiple meanings, used in both specialist and non-specialist contexts. Drawing attention to the innovative feature of profiling as a form of non-representational, probabilistic knowledge, this paper focuses on machine profiling. It aims to elaborate a suita ...

Scalable Model-based Clustering Algorithms for

Spammer Detection by Extracting Message Parameters from Spam

Mining Outlying Aspects on Numeric Data

Metadata Management for Knowledge Discovery

... trustability and searchability. The metadata should be accurate and all relevant information about the resource should be captured. It should conform to a specific metadata standard and should not contain contradictions. Finally, it should be made available in a properly machine-readable format and ...

An XML-Based Database for Knowledge Discovery

... the data that can be represented by XML. In XDM the pattern definition is represented together with data. This allows the "reuse" of patterns by the inductive database management system and the representation of intensional XML data. In particular, XDM explicitly represents the statements executed ...

Towards outlier detection for high-dimensional data

... data streams. SPOT employs an innovative window-based time model in capturing dynamic statistics from stream data, and a novel data structure containing a set of top sparse subspaces to detect projected outliers effectively. SPOT also employs a multi-objective genetic algorithm as an effective searc ...

The Utility of Clustering in Prediction Tasks

K-means clustering

Toward Intelligent Data Warehouse Mining: An Ontology

... reasonable. Although data warehouses have solved the data preprocessing problems effectively, there are at least the following issues hindering the realization of such intelligent assistance by the data warehouse mining system: ...

A New Procedure of Clustering Based on Multivariate Outlier Detection

... (typically for the normal behavior) from the given data and then apply a statistical test to determine if an object belongs to this model or not. Objects that have low probability to belong to the statistical model are declared as outliers. However, distribution-based approaches cannot be applied in ...
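The distribution-based approach described in this excerpt can be sketched as follows. This is a minimal illustration only, assuming a one-dimensional Gaussian model fitted to a sample of normal behaviour; the function names, the sample readings, and the significance level `alpha = 0.01` are all hypothetical and do not come from the excerpted paper.

```python
import math
import statistics

def fit_gaussian(sample):
    """Estimate the model of normal behaviour: sample mean and standard deviation."""
    return statistics.mean(sample), statistics.stdev(sample)

def tail_probability(x, mean, std):
    """Two-sided probability, under the fitted Gaussian, of a value at least
    as far from the mean as x."""
    z = abs(x - mean) / std
    return math.erfc(z / math.sqrt(2))

def is_outlier(x, mean, std, alpha=0.01):
    """Declare x an outlier when it has low probability under the model."""
    return tail_probability(x, mean, std) < alpha

# Hypothetical readings assumed to represent normal behaviour.
normal_readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 9.7, 10.3, 10.0, 9.9]
mean, std = fit_gaussian(normal_readings)

# Test a few new objects against the fitted model.
flags = {x: is_outlier(x, mean, std) for x in (10.05, 9.6, 12.5)}
```

Under these assumptions only the value 12.5 is flagged. The sketch relies on the data actually following the assumed distribution, which is the kind of limitation the excerpt goes on to raise.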

Truth and robustness in cross-country growth regressions

Relationship between Product Based Loyalty

Support Envelopes: A Technique for Exploring the Structure of Association Patterns

... Finally, we consider two simple extensions of support envelopes. First, support envelopes involving only the items of one specific transaction provide a view of patterns with respect to a particular transaction, i.e., the set of such ‘restricted’ support envelopes will involve only those association ...

Slides

Symbiotic evolutionary subspace clustering (S-ESC)

... Axis-parallel vs. arbitrarily-oriented subspaces. Cluster 3 exists in an axis-parallel subspace, whereas clusters 1 and 2 exist in (different) arbitrarily-oriented subspaces. Reproduced from ...

Discrimination Aware Decision Tree Learning*

... three approaches include modifying the probability of the decision being positive, training one model for every sensitive attribute value and balancing them, and adding a latent variable in the Bayesian model that represents the unbiased label and optimizing the model parameters for likelihood using ...

Oracle9i Data Mining Concepts

... Oracle Corporation; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent and other intellectual and industrial property laws. Reverse engineering, disassembly or decompilation of the Programs, except to the extent requi ...

Advanced Data Mining with Weka - Department of Computer Science


Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space. Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data, that is, distance measurements.
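As a rough illustration of the manifold assumption, the sketch below samples points along a 2-D spiral (a one-dimensional manifold embedded in two dimensions) and approximates distances measured along the manifold with shortest paths through a nearest-neighbour graph, the idea used by Isomap-style methods. The spiral data, the function names, and the choice of `k = 2` neighbours are illustrative assumptions rather than part of the text above.

```python
import math

def spiral(n=60):
    """Sample n points along a 2-D spiral, ordered by the intrinsic parameter t."""
    pts = []
    for i in range(n):
        t = 3 * math.pi * i / (n - 1)
        r = 1.0 + t
        pts.append((r * math.cos(t), r * math.sin(t)))
    return pts

def geodesic_distances(points, k=2):
    """Approximate along-manifold distances: build a symmetric k-nearest-neighbour
    graph, then run Floyd-Warshall to get all-pairs shortest paths."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Index 0 of the sorted list is the point itself, so skip it.
        nearest = sorted(range(n), key=lambda j: d[i][j])[1:k + 1]
        for j in nearest:
            g[i][j] = g[j][i] = d[i][j]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g

points = spiral()
geo = geodesic_distances(points)

# For a 1-D manifold, the graph distance from one end of the curve already
# serves as a one-dimensional embedding coordinate for every point.
embedding = geo[0]
```

In this sketch the two ends of the spiral are fairly close in straight-line distance but far apart along the curve, and `embedding` preserves the order of the points along the spiral. This is the kind of structure a proximity-based method aims to recover.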
