
Feature Extraction based Approaches for Improving the
... intrusion detection system is applicable to any situation. In addition to the variety of algorithms and intrusion detection system models, dimension reduction methods are often used to select important features and to reduce dimensionality, saving computational cost. Typically, there are two groups ...
Effective framework for prediction of disease outcome using medical
... accuracy of 86.8%. Das et al. (2009) proposed a system called Neural Networks Ensemble, which creates new models by combining the posterior probabilities or the predicted values from multiple predecessor models using the SAS data miner tool, and achieved an accuracy of 89.01%. Cardiac disorders diagnosis ...
Classification and Novel Class Detection in Data
... Unlike MineClass, ActMiner does not need all the instances in the training data to have true labels. ActMiner selectively presents some data points to the user for labeling. We call these data points WCIs. In order to perform ensemble voting on an instance xj, first we initialize a vector V = {v ...
Effective Classification of 3D Image Data using
... image on the basis of only those regions that are of interest to an expert seems to be more meaningful [6-8]. The 3-D images or volumes we consider here consist of region data that can be defined as sets of (often connected) voxels (volume elements) in three-dimensional space that form 3-D structure ...
Microarray Basics: Part 2
... subgrid as an entire array (block by block loess) • Corrects best for artifacts introduced by the pins, as opposed to artifacts of regions of the slide – Because each subgrid has relatively few spots, risk having a subgrid where a substantial proportion of spots are really differentially expressed- ...
CANCER MICROARRAY DATA FEATURE SELECTION USING
... Cancer investigations in microarray data play a major role in cancer analysis and treatment. Cancer microarray data consists of complex gene expression patterns of cancer. In this article, a Multi-Objective Binary Particle Swarm Optimization (MOBPSO) algorithm is proposed for analyzing cancer gen ...
Improved Clustering using Hierarchical Approach
... A centroid-based partitioning technique uses the centroid of a cluster, Ci, to represent that cluster. The centroid of a cluster is its center point. The centroid can be defined in various ways such as by the mean or medoids of the objects assigned to the cluster. The difference between an object p ...
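The snippet above describes centroid-based partitioning, where each cluster is represented by its center point (e.g. the mean of its members) and objects are compared against that center. A minimal sketch of the mean-centroid case, using hypothetical helper names (`centroid`, `assign`) not taken from the source:

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def centroid(cluster):
    """Mean point of a cluster of equal-length numeric tuples."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def assign(points, centroids):
    """Assign each point to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points]

c1 = centroid([(0, 0), (2, 0), (1, 3)])            # (1.0, 1.0)
labels = assign([(0.5, 0.5), (5, 5)], [c1, (5, 4)])  # [0, 1]
```

A medoid-based variant would instead pick the cluster member minimizing total distance to the others, which is more robust to outliers but costlier to compute.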
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
... Data Mining Approach Using Apriori Algorithm: The Review. Firstly, a binary-string data structure is employed to describe the database. The support count can be implemented by performing a bitwise AND operation on the binary strings. Another technique for improving efficiency in Bitpriori is a ...
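The bitwise-AND support count described above can be illustrated with a toy sketch (this is an illustration of the general idea, not the Bitpriori implementation): each item gets a bitmap with one bit per transaction, and the support of an itemset is the popcount of the AND of its item bitmaps.

```python
# Toy database: 5 transactions over items a, b, c.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}, {"a", "b"}]

# Encode each item as a bitmap: bit t is set iff the item occurs in transaction t.
bitmaps = {}
for t, itemset in enumerate(transactions):
    for item in itemset:
        bitmaps[item] = bitmaps.get(item, 0) | (1 << t)

def support(items):
    """Support count of an itemset via bitwise AND of its item bitmaps."""
    acc = (1 << len(transactions)) - 1   # all-ones mask
    for item in items:
        acc &= bitmaps[item]
    return bin(acc).count("1")           # popcount of surviving bits

support({"a", "b"})   # items co-occur in transactions 0, 2, 4 -> 3
```

Because the AND and popcount are word-parallel, this counts support over 64 transactions per machine word, which is where the efficiency gain over set intersection comes from.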
Advances in Natural and Applied Sciences Metaheuristics for Mining
... The rise of Meta-Heuristics can mainly be attributed to the increase in data generation, driven by information-leveraging devices such as sensors, high-resolution cameras and video recorders, satellites, and user information generated on the internet. Hence a huge amount of ...
Chapter 40 DATA MINING FOR IMBALANCED DATASETS: AN
... undersampling. Random resampling consisted of oversampling the smaller class at random until it contained as many samples as the majority class, and "focused resampling" consisted of oversampling only those minority examples that occurred on the boundary between the minority and majority classes. ...
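The random oversampling step described above is simple to sketch: duplicate minority-class samples at random until the class sizes match. The function name and data below are illustrative, not from the chapter:

```python
import random

def random_oversample(majority, minority, seed=0):
    """Duplicate minority samples at random until both classes are equal-sized."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra

maj = [("x%d" % i, 0) for i in range(10)]   # 10 majority samples, label 0
mino = [("y%d" % i, 1) for i in range(3)]   # 3 minority samples, label 1
maj2, mino2 = random_oversample(maj, mino)
len(mino2)   # 10: minority class now matches the majority class size
```

Focused resampling differs only in which candidates `rng.choice` may draw from: it restricts the pool to minority examples near the class boundary rather than the whole minority class.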
ASSOCIATION RULE MINING BASED VIDEO CLASSIFIER WITH
... is a product of the size of the data set and the number of candidate rules, both of which may in some cases be large. It is clear that the choice of support and confidence thresholds will strongly influence the operation of CBA. The coverage analysis is performed either as a separate phase or integr ...
II. Data Reduction
... dimensionality reduction to various areas, including computer vision, text mining and bioinformatics. If the evaluation procedure is tied to the task (e.g. clustering) of the learning algorithm, the FS algorithm employs the wrapper approach. This method searches through the feature subset space usin ...
Algorithmic Information Theory-Based Analysis of Earth
... adopted to compute similarities between any two objects, and the conditions for it to be a metric hold under certain assumptions. Before the concept of information distance, the idea of exploiting the intrinsic power of data compression to match recurring patterns within the data can be found in nuc ...
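One standard way to exploit compression for similarity, in the spirit of the information-distance idea above, is the Normalized Compression Distance. The sketch below approximates it with zlib; it is a generic illustration, not the paper's method:

```python
import zlib

def C(data):
    """Compressed length of a byte string under zlib (proxy for Kolmogorov complexity)."""
    return len(zlib.compress(data))

def ncd(x, y):
    """Normalized Compression Distance: small when x and y share structure."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"pack my box with five dozen liquor jugs " * 20
ncd(a, a) < ncd(a, b)   # an object is closer to itself than to a different text
```

Intuitively, if y's patterns recur in x, compressing the concatenation x + y costs little more than compressing x alone, so the numerator stays small.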
Segmentation using decision trees
... • Summary Table (upper left)
• Tree-Ring Navigator (upper right) – accessible from here: Tree Diagram + Assessment Statistics
• Assessment Table (lower left)
• Assessment Graph (lower right) – blue: Training Data, red: Validation Data ...
β-Thalassemia Knowledge Elicitation Using Data Engineering
... also. In this paper, the Thalassemia knowledge was elicited using data engineering techniques (PCA, Pearson's chi-square, and machine learning). This knowledge is presented in the form of a comparison of the classification performance of machine learning techniques between using Principal Components Analysis ...
Classify Uncertain Data with Neural Network
... Classification is the process of building a model that can describe and predict the class label of data based on the feature vector [8]. An intuitive way of handling uncertainty in classification is to represent the uncertain value by its expectation and treat it as certain data. Thus, conve ...
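The expectation-based reduction mentioned above can be sketched in a few lines: each uncertain attribute, given as a small discrete distribution of (value, probability) pairs, collapses to its expected value, after which any ordinary classifier can be applied. The representation below is an assumption for illustration, not the paper's data model:

```python
def expectation(dist):
    """Collapse an uncertain attribute, given as (value, probability) pairs,
    to its expected value."""
    return sum(v * p for v, p in dist)

# Uncertain sample: each attribute is a small discrete distribution.
uncertain_x = [[(1.0, 0.5), (3.0, 0.5)],    # E = 0.5 + 1.5 = 2.0
               [(0.0, 0.25), (4.0, 0.75)]]  # E = 0.0 + 3.0 = 3.0
certain_x = [expectation(attr) for attr in uncertain_x]   # [2.0, 3.0]
```

The limitation the snippet hints at is that this discards the spread of each distribution, so two attributes with the same mean but very different variance become indistinguishable to the classifier.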
Study of Euclidean and Manhattan Distance Metrics
... conclude that the Manhattan distance gives the best performance in terms of precision of retrieved images. There may be cases where one measure performs better than another, depending entirely on the criterion adopted, the parameters used to validate the study, and so on. There are two categorie ...
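The two metrics compared above differ only in how coordinate differences are aggregated: Euclidean takes the root of summed squares, Manhattan sums absolute differences. A minimal sketch:

```python
def euclidean(p, q):
    """L2 distance: straight-line distance between points p and q."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    """L1 distance: sum of absolute coordinate differences (city-block)."""
    return sum(abs(a - b) for a, b in zip(p, q))

p, q = (1, 2), (4, 6)
euclidean(p, q)   # 5.0  (the 3-4-5 right triangle)
manhattan(p, q)   # 7    (3 + 4)
```

Manhattan distance never undercuts Euclidean on the same pair, and it weights large single-coordinate differences less heavily because nothing is squared, which is one reason the two can rank retrieved images differently.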
K-nearest neighbors algorithm
In pattern recognition, the k-Nearest Neighbors algorithm (or k-NN for short) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:

In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.

In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.

k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.

Both for classification and regression, it can be useful to assign weights to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor.

The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.

A shortcoming of the k-NN algorithm is that it is sensitive to the local structure of the data. The algorithm has nothing to do with, and is not to be confused with, k-means, another popular machine learning technique.
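The classification case described above (majority vote among the k nearest training examples) fits in a few lines; this is a minimal unweighted sketch, not a production implementation:

```python
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+)

def knn_classify(train, query, k=3):
    """train: list of (point, label) pairs.
    Returns the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
knn_classify(train, (1, 1), k=3)   # "A": all three nearest neighbors are A
```

The distance-weighted variant mentioned above would replace the plain vote with a sum of 1/d weights per label; the regression variant would average the k neighbors' values instead of voting. Note the "lazy learning" property: all the work happens inside `knn_classify` at query time, since there is no training step to precompute.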