Enhanced SMART-TV - Internetworking Indonesia Journal

... determined using a distance measure, e.g. Euclidean distance. KNN has shown good performance on various datasets. However, when the training set is very large, i.e. millions of objects, it will not scale. The brute force searches to find the k-nearest neighbors will increase the classification time ...

A Review on Clustering and Outlier Analysis Techniques in

Singular Value Decomposition

stat_9

... The parameters of a pdf are constants that characterize its shape, e.g. ...

Organizing Data and Information

... Disadvantages of Database Approach Relatively high cost of purchasing and operating a DBMS in a mainframe ...

use bp-network to construct composite attribute

... maximal speed in variable speed population is vMAX = 20, and the threshold of locked ant moving time is l = 50. The clustering result is illustrated in figure 1, where each cluster represents one customer cluster. Objects in a cluster have some common characteristics and these characteristics can be ob ...

IEEE Transactions on Magnetics

... been applied to this area. Data mining refers to mining or extracting information from large data sets. Some of its functionalities are the discovery of class descriptions or concept, clustering, correlations and associations, outlier, deviation, classification, prediction, trend analysis, analysis, ...

Data Mining using Ensemble Classifiers for Improved Prediction of

IOSR Journal of Computer Engineering (IOSR-JCE)

... language processing. Although WordNet and other dictionaries can support this task, they are not sufficient. First, many words and phrases may not be synonyms in a dictionary but they may refer to the same attribute in an application domain. For example, “sweet” and “pleasant” are synonyms in commun ...

Indian Agriculture Land through Decision Tree in Data Mining

... with high data-processing efficiency and easily-understood characteristics becomes much more popular and has already been widely used in many fields, for example, speech recognition, medical treatment, model recognition and expert system, etc. And it includes many methods, and each method has its ch ...

Iberoamerican Journal of Applied Computing ISSN 2237

... calculation of subjective measures. However, as previously mentioned, new features can be added to the tool. Subjective measures such as: conformity, unexpected antecedent, unexpected consequent and unexpected antecedent and consequent (Sinoara, 2006), could be coupled to the environment. In the sec ...

Anomaly Detection

... Firm mathematical foundation ...

Pre-processing data using ID3 classifier

... attributes present in a dataset, and if specified it excludes class attribute. b. Numeric to Binary: This method converts all the attribute values into binary. If the Numeric value of ... In this paper, records belonging to known diabetes dataset were extracted to create training and testing dataset fo ...

Miscellaneous Topics - McMaster Computing and Software

... ◦ Instance-based learners are susceptible to those attributes because they might change the neighborhoods ◦ Naive Bayes model does not suffer from this, but suffers from redundant attributes. Surprising: relevant attributes might also be harmful! Manual selection is desirable, but too time-consuming A ...

Review on Data Mining Techniques for Intrusion Detection System

... Probe: Attackers usually apply a probe to get information, to determine the targets and the type of operating system. DoS (Denial of Service): Such an attack may stop server operation, so that the server cannot provide services. The attack usually occupies all system resources of the server, or occupie ...

Clustering - Computer Science, Stony Brook University

... presence of noise and outliers Medoids are less influenced by outliers ...

COMP 790-090 Data Mining: Concepts, Algorithms, and Applications 2

... presence of noise and outliers Medoids are less influenced by outliers ...

A cluster is considered to be stable depending on stability value

... In a grid environment the number of computing nodes and participating users increases and may reach thousands or millions. The abundance of these resources creates new problems, such as how to collect the massive amounts of evolving resources in real time and extract the useful information f ...

Software Bug Detection Algorithm using Data mining

... defective modules in software. It helps to improve software quality and testing efficiency by constructing predictive models from code attributes to enable a timely identification of fault-prone modules, it also helps us in planning, monitoring and control and predict defect density and to better un ...

Application of Data Mining and Soft Computing Techniques for

... Ankit Dewan et al. [21] have applied various machine learning techniques, such as artificial neural networks and back propagation with a genetic algorithm, for optimization purposes. But due to the drawback of getting stuck in local minima, researchers were not able to achieve the maximum profit. So they employed t ...

1 CHAPTER -1 INTRODUCTION 1.1 DATA MINING Data mining

... antecedent (if) and a consequent (then). An antecedent is an item found in the data. A consequent is an item that is found in combination with the antecedent ...

GUJARAT TECHNOLOGICAL UNIVERSITY

... 2. Design a data mart from scratch to store the credit history of customers of a bank. Use this credit profiling to process future loan applications. ...

Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision

... classifier for Fisher's Iris data set, where the task is to determine the type of iris based on four attributes. Each bar represents evidence for a given class and attribute value. Users can immediately see that all values for petal-width and petal length are excellent determiners, while the middle r ...

Predicting Customer Loyalty Using Data Mining Techniques


Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
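To make the idea of an embedding built purely from proximity data more concrete, here is a minimal sketch in the spirit of Isomap. Isomap is not named in the text above; it is used here only as an assumed, illustrative example of a manifold learning method. The function names, the neighbour count, and the helix test data are all hypothetical choices. The sketch builds a nearest-neighbour graph, approximates distances along the manifold by shortest paths in that graph, and then applies classical multidimensional scaling to the resulting distance matrix.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def isomap_sketch(X, n_neighbors=4, n_components=1):
    """Illustrative Isomap-style embedding (hypothetical helper, not a library API)."""
    n = X.shape[0]
    D = pairwise_distances(X)

    # Neighbourhood graph: keep only edges to each point's nearest neighbours.
    # Column 0 of the argsort is the point itself (distance 0), so skip it.
    G = np.full((n, n), np.inf)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)  # make the graph undirected
    np.fill_diagonal(G, 0.0)

    # Floyd-Warshall shortest paths approximate distances along the manifold.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if not np.isfinite(G).all():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")

    # Classical MDS: double-centre the squared distances, then take the
    # eigenvectors belonging to the largest eigenvalues.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))


# Assumed toy data: a helix in 3-D whose intrinsic dimension is 1.
t = np.linspace(0, 4 * np.pi, 80)
X = np.column_stack([np.cos(t), np.sin(t), 0.5 * t])
Y = isomap_sketch(X, n_neighbors=4, n_components=1)
```

On this toy helix the one-dimensional coordinate in `Y` should track the position `t` along the curve, up to sign and scale. Note that this sketch only yields coordinates for the points it was given, so it sits in the "visualisation" group described above; producing a mapping for new points would need an additional out-of-sample step that is not shown here.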
