Discovering Frequent Tree Patterns over Data Streams
... its underlying concept and then describe how STMer uses it in the mechanisms for the two phases. The goal of Lossy Counting proposed in [4] is to keep monitoring the data streams composed of items to report the ones with frequency counts exceeding the user-specified threshold. A data stream is conce ...
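To make the idea in this excerpt concrete, here is a minimal sketch of Lossy Counting over a stream of plain items. It is not STMer itself: the excerpt is truncated before it explains how STMer adapts the technique to tree patterns. The function names, the `epsilon` error parameter, and the `support` threshold are illustrative assumptions.

```python
import math


def lossy_count(stream, epsilon):
    """One pass of Lossy Counting; returns ({item: (count, delta)}, n).

    epsilon is the maximum tolerated undercount, as a fraction of the
    stream length n (an assumed parameter name, not taken from the excerpt).
    """
    width = math.ceil(1 / epsilon)   # number of items per bucket
    entries = {}                     # item -> (count, max possible undercount)
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)            # id of the current bucket
        if item in entries:
            count, delta = entries[item]
            entries[item] = (count + 1, delta)
        else:
            # A new item may have been seen and pruned in earlier buckets,
            # so its true count may be up to bucket - 1 higher.
            entries[item] = (1, bucket - 1)
        if n % width == 0:
            # Bucket boundary: drop entries that cannot be frequent so far.
            entries = {k: (c, d) for k, (c, d) in entries.items()
                       if c + d > bucket}
    return entries, n


def frequent_items(entries, n, support, epsilon):
    """Report items whose stored count is at least (support - epsilon) * n."""
    return sorted(k for k, (c, _) in entries.items()
                  if c >= (support - epsilon) * n)


stream = ["a", "b", "a", "c", "a", "a", "d", "a", "a", "e", "a", "b"]
entries, n = lossy_count(stream, epsilon=0.25)
print(frequent_items(entries, n, support=0.5, epsilon=0.25))
```

Pruning at bucket boundaries is what keeps the summary small: infrequent items are forgotten, while any stored count is too low by at most `epsilon * n`.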
Exploiting Data Mining Techniques For Broadcasting Data in Mobile
... The problem of scheduling the broadcast requests is to determine the sequence of items in the broadcast schedule. We need to determine what should be put in the schedule and in what sequence by taking into account the previous request patterns of the clients (i.e., the broadcast history). Organizing ...
pdf file - Charles Ling
... statistical and learning methods cannot deal with missing values directly, examples with missing values are often deleted. However, deleting cases can result in a loss of a large amount of valuable data. Thus much previous research has focused on filling or imputing the missing values before learnin ...
Data Mining and Homeland Security: An Overview
... user the value or significance of these patterns. These types of determinations must be made by the user. A second limitation is that while data mining can identify connections between behaviors and/or variables, it does not necessarily identify a causal relationship. Successful data mining still re ...
Activity Recognition Using, Smartphone Based
... main sources of information for this recognition task: environmental inputs (e.g. Wi-fi, vision based recognition, or other sensors placed in the [1]) or body-worn sensors (smartphone inputs, networks of wearable sensors). In this thesis, we will focus on this second source of input. We will specifi ...
An Explorative Parameter Sweep: Spatial-temporal Data
... does the localization and copy number of species change through time) features from individual simulations. This will be a challenge because of the high dimensionality associated with the simulations output. For the purpose of extracting features, it is not a straightforward task to analyze time ser ...
An Intergrated Data Mining and Survival Analysis Model for
... value-creating potential and target them successfully with corresponding marketing strategies to reduce the risk of these high lifetime value customers defecting to competitors (Andrew Banasiewicz, 2004). Segmenting customer is the basic work of data mining according to known historic segmentation i ...
A Comprehensive Analysis on Associative Classification in Medical
... Association Rule Mining (ARM) is a tool used to find interesting associations and correlations among large set of data items which is a strong mechanism used in market basket analysis. Association rules shows attribute value conditions that occur frequently together in a given data set. Association ...
Uncertain Data Management - Computer Science
... (as it is not part of the skyline) (ii) If p dominates any point in the list, it is inserted into the list, and all points in the list dominated by p are dropped. (iii) If p is neither dominated, nor dominates, any point in the list, it is inserted into the list as it may be part of the ...
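The list-maintenance steps in this excerpt can be sketched as follows. The first case is cut off in the excerpt; the sketch assumes it discards p when some point already in the list dominates it, which matches the surviving phrase "as it is not part of the skyline". The convention that smaller values are better in every dimension, and the names `dominates` and `skyline`, are also assumptions made for illustration.

```python
def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumes smaller is better in every dimension.
    """
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Maintain a candidate list while scanning the points once."""
    candidates = []
    for p in points:
        # (i) assumed: p is dominated by a listed point, so discard it.
        if any(dominates(q, p) for q in candidates):
            continue
        # (ii) drop every listed point that p dominates, then insert p.
        # (iii) if p dominates nothing, the filter removes nothing and
        #       p is simply inserted as a possible skyline point.
        candidates = [q for q in candidates if not dominates(p, q)]
        candidates.append(p)
    return candidates


points = [(1, 9), (3, 3), (2, 8), (4, 4), (9, 1), (3, 2)]
print(skyline(points))
```

Here (4, 4) is discarded because (3, 3) dominates it, and (3, 3) is later dropped when (3, 2) arrives.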
Ministerul Educaţiei al Republicii Moldova Universitatea
... of the research and what presents each chapter in introduction. In chapter I, entitled „Advanced Data Analisys And Processing” is presented the objective study of the literature on the topic, and namely the interested information about Data Mining and OLAP terms, the tools that use each of these met ...
Data Stream Mining: an Evolutionary Approach
... in time, the model of its behavior at a specific time, may not be obtained and therefore not exploited at time. For this reason, many conventional algorithms, even when they have been proved to be effective, have been discarded to be applied to data streams. Classical data mining techniques have bee ...
Stream Cube: An Architecture for Multi
... With years of research and development of data warehousing and OLAP technology [9, 15], a large number of data warehouses and data cubes have been successfully constructed and deployed in applications, and data cube has become an essential component in most data warehouse systems and in some extende ...
Master Thesis - Department of computing science
... Grid infrastructures are distributed and dynamic computing environments that are owned and used by a large number of individuals and organizations. In such environments information and computational power are shared, making it a large networked platform where applications are running as services. Op ...
KDID'03 Keynote Talk
... decision, one regression) to eliminate two data columns (predicted attributes) ...
Exploring Temporal Data Using Relational Concept - CEUR
... Bio parameters is needed. To this end, preprocessings of the raw sequential data allow to build a qualitative temporal model that can be used to apply RCA on these data. The RCA result is a family of lattices that can be navigated by the users. The users can select relevant navigation paths through ...
yes
... Naïve Bayesian prediction requires each conditional prob. be non-zero. Otherwise, the predicted prob. will be zero P( X | C i) ...
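The excerpt stops before naming a remedy for zero conditional probabilities. One common option, assumed here and not confirmed by the source, is Laplace (add-one) smoothing of the per-class value counts. The helper name, the `alpha` parameter, and the toy counts below are hypothetical.

```python
from collections import Counter


def smoothed_conditional(values, num_categories, alpha=1.0):
    """Return a function estimating P(value | class) with add-alpha smoothing.

    values: attribute values observed for one class.
    num_categories: number of distinct values the attribute can take.
    """
    counts = Counter(values)
    total = len(values)

    def prob(value):
        # Adding alpha to every count keeps unseen values above zero.
        return (counts[value] + alpha) / (total + alpha * num_categories)

    return prob


# Hypothetical class with 1000 examples and no "low" income observed.
incomes = ["medium"] * 990 + ["high"] * 10
p = smoothed_conditional(incomes, num_categories=3)
print(p("low"), p("medium"), p("high"))
```

Without smoothing, the estimate for "low" would be 0 and would zero out the whole product P(X | C i); with it, the estimate becomes 1/1003 while the other two values change only slightly.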
Application of Convention 108 to the profiling mechanism
... 3. Anonymous data: there is no identifier. When two data sets relating to the same person are examined, it is impossible, using state-of-the-art resources, to be reasonably certain that the two data sets concern the same person. To illustrate this point, let us consider the example of a sales receip ...
Nonlinear dimensionality reduction
High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically, those that just give a visualisation are based on proximity data, that is, distance measurements.