
Genetic and Evolutionary Computation Conference 2008
... in Atlanta at the Centers for Disease Control, and will be giving us a unique view into the use of agent-based modeling at the CDC. Another part of GECCO’s continued success is its personnel. Between organizers, track chairs and tutorial speakers, there are almost seventy people involved in making GE ...
Data mining with SpagoBI, Weka and Oracle.
... you can blacklist from your firm and groups with low claim cost who you can do business with. This is an example of how data mining can be used in the real world. Marketers also use clustering algorithms to discover certain groups in their customer data whom they target with specific products. In th ...
SCALABLE MINING ON EMERGING ARCHITECTURES
... and outlier detection to glean significant runtime improvements. These improvements are afforded by the high floating point throughput of this emerging processor. We also show examples where the Cell processor requires more compute time than competing technologies, primarily due to its high latency ...
Contents
... huge size (often several gigabytes or more), and their likely origin from multiple, heterogeneous sources. Low quality data will lead to low quality mining results. “How can the data be preprocessed in order to help improve the quality of the data and, consequently, of the mining results? How can the ...
statistical models and analysis techniques
... structures). These data offer unique opportunities to improve model accuracy, and thereby decision-making, if machine learning techniques can effectively exploit the relational information. This work focuses on how to learn accurate statistical models of complex, relational data sets and develops tw ...
Partition Incremental Discretization
... also define this number. The input for this layer is the set of intervals of the first layer. A. Initialization of the layers: First layer The number of intervals in this layer should be much higher than required. It can be initialized in two modes: • Without seeing any previous data. We use a EWD ...
Using On-line Analytical Processing (OLAP)
... Treatment Facility (MTF) emergency room (ER) manager to improve ER staffing and utilization. MTF ER managers use statistical data analysis to help manage the efficient operation and use of ERs. As the size and complexity of databases increase, traditional statistical analysis becomes limited in the ...
Time Series Knowledge Mining
... knowledge discovery in databases (KDD) that has evolved from a collaboration of mostly the following fields: statistics, machine learning, artificial intelligence, pattern recognition, knowledge acquisition, expert system, data visualization, high-performance computing, databases, information retrie ...
Oracle Data Mining Programmer's Guide
... Oracle Corporation; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent and other intellectual and industrial property laws. Reverse engineering, disassembly or decompilation of the Programs, except to the extent requi ...
Mining for Spatial Patterns
... S. Chawla, S. Shekhar, W. Wu and U. Ozesmi, “Extending Data Mining for Spatial Applications: A Case Study in Predicting Nest Locations”, Proc. Int. Conf. on 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000), Dallas, TX, May 14, 2000. ...
Patterns that Matter - Department of Information and Computing
... the past to discover useful knowledge. In this thesis, we only consider Descriptive Data Mining, as we want to get as much insight in the data as possible. Our goal is to avoid ‘black boxes’ – we prefer to use methods that allow for human inspection and interpretation, in order to maximise our unde ...
Mining periodic behaviors of object movements for animal and
... pattern mining problem for a moving object with a given period. However, the rigid definition of frequent periodic pattern does not encode the statistical information. It cannot describe the case such as “The eagle has 0.8 probability to be inside the nest at 6:00 every day.” One may argue that these ...
Data Preprocessing
... that is not part of any precomputed data cube in your data warehouse. You soon realize that data transformation operations, such as normalization and aggregation, are additional data preprocessing procedures that would contribute toward the success of the mining process. Data integration and data tr ...
Efficient Frequent Pattern Mining
... on the amount of work that a typical range of frequent itemset algorithms will need to perform. By computing our upper bounds, we have at all times an airtight guarantee of what is still to come, on which then various optimization decisions can be based, depending on the specific algorithm that is u ...
Unsupervised Identification of the User’s Query Intent in Web Search Liliana Calderón-Benavides
... and all the members of the WRG and Yahoo! Research Barcelona. I would like to thank Vicente Lopez and Joan Codina from Barcelona Media, I learnt very much from our work together. I thank Devdatt Dubhashi for his invitation to work together at Chalmers University of Technology / Göteborg University, ...
Cluster analysis
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.
Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς, “grape”) and typological analysis. The subtle differences are often in the usage of the results: while in data mining the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest. This often leads to misunderstandings between researchers coming from the fields of data mining and machine learning, since they use the same terms and often the same algorithms, but have different goals.
Cluster analysis originated in anthropology with Driver and Kroeber in 1932, was introduced to psychology by Zubin in 1938 and Robert Tryon in 1939, and was famously used by Cattell beginning in 1943 for trait theory classification in personality psychology.
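As an illustration of one of the cluster notions mentioned above (groups with small distances among the cluster members), the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is one possible algorithm among many, not the definitive method. The function name kmeans, the sample points, the Euclidean distance function and the choice of the first k points as starting centroids are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (assignment, centroids), where assignment[i] is the index of
    the cluster that points[i] belongs to.
    """
    # Assumption for reproducibility: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance is the distance function chosen here).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return assignment, centroids


# Hypothetical sample data: two visibly separated groups in the plane.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

The number of clusters k and the distance function are exactly the kind of parameter settings the text says depend on the data set. Rerunning the sketch with a different k or a different starting choice gives a different partition, which is why clustering is usually an iterative, exploratory process.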