Applying Subgroup Discovery Based on Evolutionary Fuzzy

... 2.3 Evolutionary Fuzzy Systems. An EFS is basically a fuzzy system augmented by a learning process based on evolutionary computation, which includes genetic algorithms, genetic programming, and evolutionary strategies, among other evolutionary algorithms [15]. Fuzzy systems are one of the most importa ...

Web Mining (網路探勘)

ALADIN: Active Learning of Anomalies to Detect Intrusion

... traffic by requesting labels for examples which it cannot classify with high certainty. Combining these two goals overcomes many problems associated with earlier anomaly-detection based IDSs. Once trained, the system can be run as a fixed classifier with no further learning. Alternatively, it can co ...

Clustering Techniques Analysis for Microarray Data

... distance, Maximum distance, Mahalanobis distance and cosine similarity. 2. Partitioning Algorithms: These are iterative relocation algorithms. They are non-hierarchical, or flat, methods. They divide the data objects into non-overlapping clusters such that each data object is in exactly one subs ...

Drowning in Data

... Metrics and Measurement: Report Library Key Standardized Reports in an Automated Fashion ...

Fast Approximate Query Processing on Temporal Data

... 1. Consider the set of Twitter users and the subset of Tweets containing mentions of consumer products. What were the top 10 frequently mentioned products across all users belonging to a given geographical region over a given hour, day, week or month? What is the similarity score between the set of ...

Classification: Basic Concepts, Decision Trees, and Model

... – Learn a model that maps each attribute set x into one of the predefined class labels y Introduction to Data Mining ...

Uniqueness of medical data mining

... More and more medical procedures employ imaging as a preferred diagnostic tool. Thus, there is a need to develop methods for efficient mining in databases of images, which are more difficult than mining in purely numerical databases. As an example, imaging techniques like SPECT, MRI, PET, and collec ...

Text document pre-processing using the Bayes formula for

Decision Analytics (DAPT)

... into relational schema. The course will give students competence in SQL and other search techniques, data validation and data cleansing. DAPT 612. Text Mining and Unstructured Data. 2 Hours. Semester course; 2 lecture hours. 2 credits. Focuses on unstructured data and includes the topics: creation o ...

mike_phd_defense_final

... – The support, (X), of an itemset X is the number of transactions that contain all the items of the itemset • Frequent itemsets have support > specified threshold • Different types of itemset patterns are distinguished by a measure and a threshold – The confidence of an association rule is given by ...

Discovering Lag Intervals for Temporal Dependencies

... Figure 1, [5min, 6min] is the predicted time range, indicating when a database alert occurs after a disk capacity alert is received. Furthermore, the associated lag interval characterizes the cause of a temporal dependency. For example, if the database is writing a huge temporal log file which is lar ...

sentiment analysis of twitter data using emoticons and emoji

Spatial Association Rules - Artificial Intelligence Group

... • Classifier: decision trees, neural network, multidimensional regression • Clustering: collection of objects ...

Data Mining for Intrusion Detection

... Data sets in Intrusion Detection: DARPA 1998 data set and its modification KDDCup99 data set created in MADAM ID project; DARPA 1999 data set; System call traces data set – U. New Mexico; Solaris audit data using BSM (Basic Security Module); University of Melbourne, Australia ...

Data Mining: Concepts and Techniques

... The definitions of distance functions are usually very different for interval-scaled, boolean, categorical, ordinal, ratio, and vector variables. Weights should be associated with different variables based on applications and data semantics. It is hard to define “similar enough” or “good enough” ...

080-31: Fraud Detection – A Primer for SAS® Programmers

Cloud-based Malware Detection for Evolving Data Streams

Data Mining of Machine Learning Performance Data

Meta Data for Visual Data Mining

... This example demonstrates that a more comprehensive and systematic treatment of meta data would be desirable. In this paper we describe our general framework, which provides a variety of meta data for exploring and visualizing large data sets. In our framework the process of exploring data can be r ...

A Dynamic Indexing Technique for Multidimensional Non-Ordered Discrete Data Spaces, ACM Transactions on Database Systems, Vol. 31, No. 2, 2006, Gang Qian, Qiang Zhu, Qiang Xue and Sakti Pramanik.

... et al. 2004]. These trees only consider relative distances of data objects to organize and partition the search space and apply the triangle inequality property of distances to prune the search space. These techniques, in fact, could be applied to support similarity searches in an NDDS. However, mos ...

PPT

CSIS 0323 Advanced Database Systems Spring 2003

Chapter 15 - VCU DMB Lab.

Unilever Data Analysis Project - MIT Center for Digital Business

... through the Sloan School of Management Center for eBusiness. In this document we trace our interactions with Unilever, describe the data made available to us, describe various analyses and results, and present overall conclusions learned in the course of the project. Unilever has been a pioneer in m ...


Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
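
As a rough illustration of how distance measurements alone can expose a low-dimensional manifold, the sketch below samples points from a spiral (a one-dimensional curve embedded in two dimensions), links each point to its nearest neighbours, and uses shortest-path distances along that graph as a one-dimensional coordinate. This follows the spirit of the geodesic-distance step used by methods such as Isomap; it is not taken from the text above. The function names (`make_spiral`, `knn_graph`, `geodesic_from`) and the parameters (`n`, `turns`, `k`) are hypothetical and chosen only for this example.

```python
import heapq
import math


def make_spiral(n=60, turns=1.5):
    """Sample n points along a 2-D spiral, i.e. a 1-D manifold embedded in 2-D."""
    points = []
    for i in range(n):
        t = turns * 2 * math.pi * i / (n - 1)
        r = 1.0 + t
        points.append((r * math.cos(t), r * math.sin(t)))
    return points


def knn_graph(points, k=2):
    """Link each point to its k nearest neighbours (undirected, distance-weighted)."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path distances from source, measured along the graph."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


points = make_spiral()
graph = knn_graph(points, k=2)
coords = geodesic_from(graph, source=0)

# 1-D embedding: approximate position of each point along the curve.
embedding = [coords[i] for i in range(len(points))]
```

The straight-line distance between the two ends of the spiral is much shorter than the distance measured along the curve, which is why the neighbour graph, rather than raw Euclidean distance, is used to recover the ordering of the points. The value of `k` is an assumption that suits this evenly sampled curve; noisier data would generally need a larger neighbourhood.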
