What's Strange About Recent Events (WSARE v3.0)

Iterative Projected Clustering by Subspace Mining

... are noise. Therefore, a new class of projected clustering methods (also called subspace clustering methods) [1], [2], [3], [12] has emerged, whose task is to find 1) a set of clusters C, and 2) for each cluster Ci ∈ C, the set of dimensions Di that are relevant to Ci. For instance, the projected c ...

Quantitative Evaluation of Approximate Frequent Pattern Mining

Steven F. Ashby Center for Applied Scientific Computing

Data Mining Products - Lyle School of Engineering

... • Techniques: Decision trees (modified CART), K-means, neural networks (MLP, back-propagation, RBF), regression (linear) • Platforms: Windows, Solaris, AIX, OS/390, OS/400 DB2 Intelligent Miner for Data performs mining functions against traditional DB2 databases or flat files. It also has capabiliti ...

Data Mining Techniques in Parallel and Distributed

... in the last 20 years. Hundreds of algorithms have been proposed to date, but the recent focus is on mining association rules in a distributed fashion. A-priori, pincer-search, and FP-Tree growth algorithms have been implemented in different ways using different data structures. Moving toward distributed approac ...

Recursive information granulation

... 1) derivation of information granule(s) from the original numeric data contained in the window of observation; 2) recursive processing of the mixture of granular and numeric data. In the detailed construct, we start with a collection (block) of data, as shown in Fig. 3(a). The phase-1 granulation r ...

M. Tech. Computational Biology

... The M.Tech Computational Biology program has been designed to develop interdisciplinary skills required in the area of Computational Biology. In recent years, due to advancements in experimental techniques, there has been a phenomenal increase in the volume of biological data. In order to organize, ...

Mining Motifs from Human Motion

Comparative Study of Techniques to Discover Frequent Patterns of

... Comparative Study of Techniques to Discover Frequent Patterns of Web Usage Mining ...

Cluster Analysis

Automating Knowledge Discovery Workflow Composition Through

... of workflows. Notably, in the Wings component of the Pegasus project [28] a planner employing semantic reasoning is used to construct a concrete workflow from a given abstract workflow based on concrete input data [29]. In our research we tackle a related yet different goal; given an ontology and a ...

dmsa technique for finding significant patterns in large database

... Frequent pattern mining in databases plays a vital role in many data mining tasks such as classification, sequential patterns, clustering, and association rule analysis. There are numerous mining algorithms for finding association rules. One of the most common algorithms is Apriori. It is used to mine ...

IJSRSET Paper Word Template in A4 Page Size

... in D. Note that, at this point, the information we have is based solely on the proportions of tuples of each class. Info(D) is also known as the entropy of D. Now, suppose we were to partition the tuples in D on some attribute A having v distinct values, {a1, a2, …, av}, as observed from the training dat ...
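As a rough illustration of the entropy mentioned in the snippet above, the following sketch computes Info(D) from class proportions alone, using the standard Shannon entropy formula with base-2 logarithms. The function name `info` and the list-of-labels input are assumptions made for this example and do not come from the excerpted paper.

```python
import math
from collections import Counter


def info(labels):
    """Entropy of a collection of class labels, in bits.

    Uses only the proportion of tuples in each class:
    Info(D) = -sum(p_i * log2(p_i)).
    """
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A collection split evenly between two classes has the highest two-class entropy, while a collection containing a single class has entropy zero.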

A decoupled exponential random graph model for prediction of

... It can only be applied to recover unknown structures of the temporal network based on the observed node values x_t for t = 1, ..., T. Direct predictions of the future values x_{t+1} and the edge set E_{t+1} are impossible in this model. Another approach [4] models the network dynamics as a stochast ...

Co-clustering Numerical Data under User-defined Constraints

Density Biased Sampling

Using data mining technology to solve classification problems

... noise, deciding on strategies for handling missing data fields, and accounting for time-sequence information and known changes. Step 4. Data reduction and projection: finding useful features to represent the data depending on the goal of the task. With dimensionality reduction or transformation meth ...

Deriving private information from randomized data

... when data are highly correlated (thus redundant), we are able to derive, from the disguised data, more accurate information about the original data. In other words, there exists a strong relationship between the correlation and the randomization’s privacy-preserving property. The goal of this paper ...

33. MINING OF SEQUENTIAL PATTERNS WITH A PROGRESSIVE

Frequent Itemset Mining for Big Data

... low level programming languages, i.e., C and Fortran. It is known, however, that higher level languages are more popular in businesses [12]. Although they are not the most efficient in terms of computation or resources, they are easily accessible. Fortunately, thanks to the MapReduce framework propo ...

Mining Telecom System Logs to Facilitate Debugging Tasks

... together and are usually distributed in nature. When errors or abnormal behaviours occur, software engineers turn to the analysis of logs, generated by monitoring and tracing the system's activities. Logs, however, tend to be overwhelmingly large, which hinders any viable analysis unless adequate (a ...

Complex building's energy system operation patterns analysis using

... real life conditions, supplies the reference data for validation. Results: The proposed method has been compared with dynamic time warping (DTW) method using cophenetic coefficients and it has been shown that the BoWR has produced better results as compared to DTW. The results of BoWR are further i ...

Multi-represented kNN-Classification for Large Class Sets

Hartigan's K-Means Versus Lloyd's K-Means -- Is It Time for a

... K clusters that minimizes D(C). The most popular optimization heuristic to this end is Lloyd’s algorithm [Lloyd, 1982; MacQueen, 1967; Forgy, 1965]. In most applications, d(v_x, v_c) = ½‖v_x − v_c‖², where ‖·‖ denotes the Euclidean norm, and one starts from some random partition of the data into ...


Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
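To make the idea of a visualisation built purely from distance measurements concrete, here is a minimal sketch of classical multidimensional scaling (MDS). Classical MDS is itself a linear technique and is swapped in here only as a simple stand-in; nonlinear methods such as Isomap apply the same step to geodesic rather than straight-line distances. The function names, the power-iteration eigensolver, and the fixed iteration count are assumptions chosen for this example, and the sketch assumes a symmetric distance matrix that can be embedded in Euclidean space.

```python
import math


def matvec(b, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(bij * vj for bij, vj in zip(row, v)) for row in b]


def classical_mds(dist, dims=2, iters=500):
    """Embed n points in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    # Double-centre the squared distances to obtain a Gram matrix B.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the dominant eigenvector of B.
        v = [math.sin(i + d + 1.0) for i in range(n)]  # deterministic start
        for _ in range(iters):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        lam = sum(vi * wi for vi, wi in zip(v, matvec(b, v)))
        if lam <= 1e-9:
            break  # no remaining positive eigenvalue to use
        for i in range(n):
            coords[i][d] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords
```

Calling `classical_mds` with the pairwise distances of a 3-4-5 right triangle returns three 2-D points whose mutual distances match the input, up to rotation and reflection. Only distances go in and only coordinates come out, which is why methods of this kind give a visualisation but no reusable mapping for new points.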

Top subcategories
