Attribute, Event Sequence, and Event Type Similarity Notions for

Analysis of governmental ICT projects using data mining techniques

... Governments give high priority to ICT investments because of the increasing needs for information, transparency and accountability. ICT services facilitate decision-making, foster citizen participation and enhance the efficient delivery of goods and services. In addition, ICT in public administratio ...

A Decision Criterion for the Optimal Number Yunjae Jung ( )

... The cluster configuration is valid if clusters cannot reasonably occur by chance or as a beneficial artifact of a clustering algorithm [2]. An optimal cluster configuration is defined as an outcome of all possible combinations of groupings, which presents a set of the most "meaningful" associations. Eve ...

87 Mining Concept Sequences from Large

... Query suggestion plays an important role in improving usability of search engines. Although some recently proposed methods provide query suggestions by mining query patterns from search logs, none of them model the immediately preceding queries as context systematically, and use context information ...

Introduction to Clementine

Statistics (STAT)

STATISTICS (STAT)

... Meeker. Methods for analyzing data collected over time; review of multiple regression analysis. Elementary forecasting methods: moving averages and exponential smoothing. Autoregressive-moving average (Box-Jenkins) models: identification, estimation, diagnostic checking, and forecasting. Transfer fu ...

A Survey of Sequential Pattern Mining - Philippe Fournier

... and offering discounts. Although pattern mining has become very popular due to its applications in many domains, several pattern mining techniques such as those for frequent itemset mining [1, 53, 116, 86, 106] and association rule mining [1] are aimed at analyzing data, where the sequential orderin ...

Event-based Failure Prediction - Institut für Informatik

Data Mining Concepts And Techniques_Jiawei Han

... 5.3 Efficient implementation of attribute-oriented induction; 5.3.1 Basic attribute-oriented induction algorithm; 5.3.2 Data cube implementation of attribute-oriented induction; 5.4 Analytical characterization: Analysis ...

Contents

... 5.3 Efficient implementation of attribute-oriented induction; 5.3.1 Basic attribute-oriented induction algorithm; 5.3.2 Data cube implementation of attribute-oriented induction; 5.4 Analytical characterization: Analysis ...

Approximate Mining of Consensus Sequential Patterns

Oracle R Enterprise User's Guide

... Margaret Taft ...

An Annotation Management System for Relational Databases

... In almost all of these systems, the design includes multiple distributed annotation servers for storing annotations and data is merged from various sources to display it graphically to an end user. The research of these systems has been focussed on the scalability of design, distributed support for ...

Density-based Algorithms for Active and Anytime Clustering

... distances ideally to produce the same result as if it has the entire distance matrix at hand. The general idea of Act-DBSCAN is that it actively selects the most promising pairs of objects to calculate the distances between them and tries to approximate as much as possible the desired clustering res ...

ICDM06.metaclust.caruana.pdf

... applying Principal Component Analysis [8] to the data prior to weighting. PCA rotates the data to find a new orthogonal basis in which feature values are uncorrelated. Random weights applied to the rotated features (components) yield a more diverse set of distance functions. Typically, PCA componen ...

Optimal Candidate Generation in Spatial Co

... related to each other. Further, we are interested in co-located instances with distinct spatial features only. That is, for any two instances in a clique, their feature types are different, and they are neighbors. A clique indicates strong coherence between its members. An example of spatial feature ...

Michael J.A.Berry Mastering Data Mining

Data Mining Deployment Guide

Data Warehousing Fundamentals

... The Data Staging Area Ralph Kimball is one of the most widely recognized experts in the field of data warehousing. Kimball calls the data staging area the construction site for the warehouse. This is where much of the data transformation and cleansing takes place. A staging area is a typical require ...

Aalborg Universitet Sentinel Mining Middelfart, Morten

... sentinel mining is straightforward, compared to what will be presented in the following chapters, we demonstrate that this particular implementation scales linearly on large data volumes. Another important contribution in this chapter is the demonstration of the distinct differences between sentine ...

Statistical Selection of Relevant Subspace Projections for Outlier

... In general, outliers are objects that deviate from the rest of the data to a great extent. However, there have been various outlier models proposed in the literature. We categorize these models into two paradigms, traditional outlier mining methods and subspace outlier mining techniques. a) Traditio ...

Combining Classifiers with Meta Decision Trees

... iterative process of combining classifiers: at each iteration, the training data set is extended with the predictions obtained in the previous iteration. The work presented here focuses on combining the predictions of base-level classifiers induced by applying different learning algorithms to a sing ...

SoBigData Grant Agreement

... Text in italics shows the options of the Model Grant Agreement that are applicable to this Agreement. ...


Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space. Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
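The manifold assumption can be illustrated with a small sketch. It is not one of the algorithms the description refers to, only a hypothetical toy example: points are sampled on a half circle (a one-dimensional manifold embedded in two dimensions), a nearest-neighbour graph is built from pairwise distances, and shortest-path ("geodesic") distances along that graph serve as a one-dimensional coordinate, in the spirit of Isomap-style methods. The function names `arc_points`, `knn_graph` and `geodesic_from`, and the choices of 50 points and 2 neighbours, are assumptions made for illustration.

```python
import heapq
import math


def arc_points(n=50, radius=1.0):
    """Sample n points along a half circle: a 1-D manifold embedded in 2-D."""
    return [
        (radius * math.cos(math.pi * i / (n - 1)),
         radius * math.sin(math.pi * i / (n - 1)))
        for i in range(n)
    ]


def knn_graph(points, k=2):
    """Link each point to its k nearest neighbours, weighted by Euclidean distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            # Store the edge in both directions so the graph is undirected.
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path distances from `source` along the graph edges."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


points = arc_points()
graph = knn_graph(points)
# Distance along the curve from the first point: a 1-D coordinate per point.
embedding = geodesic_from(graph, 0)
straight_line = math.dist(points[0], points[-1])
```

Here `straight_line` measures the gap between the two end points through the surrounding 2-D space, while `embedding[len(points) - 1]` approximates the length of the curve between them. The graph distance respects the manifold, which is why methods based on proximity data can unfold curved structure that a straight-line distance hides.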

