MapReduce/Hadoop
... – An efficient programming framework for processing parallelizable problems across huge datasets using a large number of machines. ...
Tutorial Notes - ECML/PKDD 2006
... 59. Krebs F. & H. Bossel, 1996. "Emergent value orientation in self-organization of an animat", Ecological Modelling, vol. 96, pp. 143-164. 60. Kwon O.B. & J.J. Lee, 2001. "A multi agent intelligent system for efficient ERP maintenance", Expert Systems with Applications, vol. 21, pp. 191-202. ...
Mining asynchronous periodic patterns in time series data
... valid segments to form the longest valid subsequence. As shown in Fig. 1, with min_rep = 3, S1, S2, and S3 are three valid segments of the pattern P = (d1, *, *). If we set max_dis = 3, then X1 is the longest subsequence before S3 is considered, which in turn makes X2 the longest one. If we only ...
See new possibilities with predictive analytics
... software product for data analysis and data management, helps solve your business and research problems. SPSS for Windows is a modular, tightly integrated, full-featured product line that allows you to add modules and products to ensure you meet all your ...
Spatio-temporal Co-occurrence Pattern Mining in Data Sets with
... for many application domains such as weather monitoring, astronomy, and solar physics - which is our application focus. Spatio-temporal co-occurring patterns frequently occur among various solar events. Fig. 1 shows two types of solar phenomena, Filaments (green) and Sigmoids (red) in spatial contex ...
Deep web - AllThesisOnline
... web sites grow. As dynamic Web sites are connected to databases, the number of sites with a back-end database holding important information increases. This information can be retrieved through user queries to the database server. The information stored in databases and hidden be ...
Rough Set Theory
... All mathematical objects, such as relations, functions and numbers, can be considered as sets. However, the concept of the classical set within mathematics is contradictory, since a set is considered to be a "grouping" even when all elements are absent, in which case it is known as an empty set (Stoll, 1979). The ...
Realizing a Process Cube Allowing for the Comparison of Event Data
... and a couple of hundred thousand events (for example, in BPI Challenge [2] files). However, corporations nowadays work with event logs on a different scale. Giants like Royal Dutch Shell, Walmart, and IBM would rather consider millions of events (a day or even a second) and this number will continue to gro ...
Preprocessing Solutions for Telecommunication Specific Big Data
... anonymization, and merging multiple data sets together. As a part of the thesis, 20 experts were interviewed to shed light on big data, its use cases, data preprocessing, feature requirements and available tools. This thesis investigates what big data is, and how the organizations, especi ...
Data Mining Algorithms
... Data Bases, 506-521, Bombay, India, Sept. 1996. D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data warehouses. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 417-427, Tucson, Arizona, May 1997. R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic ...
Interpretation of Inconsistent Choice Data: How Many Context
... where R consists of all sets of orderings over X, E(D, R) is the number of observations in dataset D that are inconsistent with maximization of any ordering in R, and λ ∈ R+ is a constant. The program in (1) thus maximizes fit (by minimizing the number of unexplained observations E(D, R)) subject to ...
SQL Based Frequent Pattern Mining - Otto-von-Guericke
... support. Frequent pattern mining is a foundation of several essential data mining tasks. These facts motivated us to develop original SQL-based approaches for mining frequent patterns. In this work, we investigate approaches based on SQL for the problem of finding frequent patterns from a transactio ...
TESI DOCTORAL
... to new habits of sun exposure. Considering the medical criteria, early diagnosis has become the best method of prevention but this is not trivial because experts are facing a problem characterized by a large volume of data, heterogeneous, and with partial knowledge. Based on these requirements we pr ...
An overview of anomaly detection techniques: Existing solutions and
... mal" baseline. On the other hand, a hybrid intrusion detection system combines the techniques of the two approaches. Both signature detection and anomaly detection systems have their share of advantages and drawbacks. The primary advantage of signature detection is that known attacks can be detecte ...
Customer Churn Prediction for the Icelandic Mobile
... Facing this challenge, mobile operators shift their attention from customer acquisition to customer retention. The crucial elements of customer retention are accurate churn prediction models and effective churn prevention strategies. The goal of this study is to construct a churn prediction model th ...
Nonlinear dimensionality reduction
High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space. Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
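To make the "visualisation from proximity data" group concrete, here is a minimal sketch of classical multidimensional scaling (MDS). The excerpt does not name this technique; it is chosen here only because it is one of the simplest methods that needs nothing but a matrix of pairwise distances. The function name `classical_mds`, the helix toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points in n_components dimensions using only pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to recover a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues caused by rounding or non-Euclidean input.
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales


# Hypothetical toy data: 40 points on a helix in 3-D (a curved 1-D manifold).
t = np.linspace(0.0, 3.0 * np.pi, 40)
points = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])

# The method sees only the distance measurements, never the coordinates.
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(dist, n_components=2)
print(embedding.shape)
```

The result is a set of low-dimensional coordinates for the given points only. It provides no function for placing new points, which is the sense in which such a method "just gives a visualisation" rather than a mapping. With Euclidean input distances, classical MDS is equivalent to a linear projection, so a method in the non-linear family would typically replace `dist` with distances measured along the manifold.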