Developing a Hybrid Intrusion Detection System Using Data Mining

... paths for a single scenario. Common paths reflect the states that occur most frequently for a scenario. The common path mining algorithm consists of six steps. The first five steps create paths, P, for each instance of a scenario. First, raw data is collected from various sensors in the system. Seco ...

Learning Translation Consensus with Structured Label Propagation

Clustering Approach to Stock Market Prediction

... that most feature vectors in each subcluster belong to the same class. Then, for each subcluster, we choose its centroid as the representative feature vector. Finally, we employ the representative feature vectors to predict the stock price movements. The experimental results show the proposed metho ...

Comparative Study on Hierarchical and Partitioning Data Mining

GROUP SYNTAX ERROR BA 180.1 THW2 RESEARCH PAPER

Introduction to Data Mining

Visually-driven analysis of movement data by progressive clustering

... As we have mentioned, the aim of a clustering method is to produce a set of groups of objects where the objects in the same group (cluster) are near each other and the groups are distant from each other. The problem of finding the optimal clustering is NP-hard. There are several strategies proposed ...

data - Shuigeng Zhou

... In data warehousing literature, an n-D base cube is called a base cuboid. The top most 0-D cuboid, which holds the highest-level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube. ...
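The cuboid lattice described in this snippet can be sketched in a few lines. This is only an illustration, assuming three hypothetical dimension names (`time`, `item`, `location`) and no concept hierarchies, in which case an n-D cube has 2^n cuboids; the function name `cuboid_lattice` is also an assumption, not taken from the source.

```python
from itertools import combinations

def cuboid_lattice(dimensions):
    """Group every cuboid (subset of dimensions) by its dimensionality."""
    n = len(dimensions)
    # Level k holds all k-D cuboids; level 0 is the apex, level n is the base.
    return {k: list(combinations(dimensions, k)) for k in range(n + 1)}

# Hypothetical dimension names, chosen only for illustration.
dims = ("time", "item", "location")
lattice = cuboid_lattice(dims)

apex = lattice[0][0]           # 0-D apex cuboid: highest level of summarization
base = lattice[len(dims)][0]   # n-D base cuboid: no summarization
total = sum(len(level) for level in lattice.values())  # 2**n without hierarchies

for k, cuboids in lattice.items():
    print(k, cuboids)
```

The apex cuboid is the empty tuple (grouping by nothing), and the base cuboid groups by all dimensions at once.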

A Study of Pattern Prediction in the Monitoring Data of Earthen

MultiClust 2013: Multiple Clusterings, Multi-view Data, and

... How a (semi-supervised) clustering approach addresses such constraints would be a different question. Spectral graph partitioning is the topic addressed in the short paper by Zheng and Wu [15]. The basic motivation for this study is to try to overcome an accuracy issue in spectral modularity optimizati ...

Data Mining

... subset data: sampling might hurt if highly skewed data; feature selection: principal component analysis, heuristic search; name/address cleaning, different meanings (annual, yearly), duplicate removal, supplying missing values ...

Scalable Algorithms for Distribution Search

... data in different modalities, such as digital images, audio, video, and text data. For example, consider motion capture datasets, which contain a list of numerical attributes of kinetic energy values. In this case, every motion can be represented as a cloud of hundreds of frames, with each frame bei ...

Classification of symbolic objects: A lazy learning approach

Document

... Scales linearly: finds a good clustering with a single scan and improves the quality with a few additional scans ...

A survey of Knowledge Discovery and Data Mining process models

Outlier Detection - SFU computing science

Statistical Learning Theory

SVM in Oracle Database 10g: Removing the Barriers to Widespread

... fi = 0; b is the intercept; αj is the Lagrangian multiplier for the jth training data record xj; and yj is the corresponding target value (±1). Eq. 1 is a linear equation in the space of attributes θj = K(xj, xi). The kernel function, K, can be linear or non-linear. If K is a linear kernel, E ...

Data Mining

... Many queries of interest are difficult to state in a query language (Query formulation problem) ...

Clustering - Semantic Scholar

IOSR Journal of Computer Engineering (IOSR-JCE)

... A Survey of Fuzzy Based Association Rule Mining to Find Co-Occurrence Relationships. This proposed work integrates the fuzzy set concepts in the newly proposed CFP-tree algorithm by constructing a compact sub-tree for a fuzzy frequent item, generating candidates in batch from the compact sub-tree an ...

A Hit-Miss Model for Duplicate Detection

... numerical difference is as likely as a small one. In order to handle both possibilities, we propose a hit-miss mixture model which includes a new type of miss for which small deviations from the true value are more likely than large ones. To distinguish between the two types of misses in this model, ...

Multivariate Approaches to Classification in Extragalactic

A Survey on Concept Drift Adaptation

Technologies and Computational Intelligence

... Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. “Big Data” is data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analyt ...

Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data, that is, distance measurements.

