View/Download-PDF - International Journal of Computer Science
... facts. Mining data to predict various diseases is nowadays very helpful in the health area. The data generated at various specialist hospitals is voluminous. There is a vast number of attributes, i.e. features of a particular disease, from which the specialist will diagnose the particular diseas ...
Clinical Decision Support for Heart Disease Using Predictive Models
... potential heart patient to go untreated, which may even be life-threatening. We evaluate the models based on their false-positive counts; models with a lower number of false positives are desired. Three classification algorithms – (1) Naïve Bayes, (2) Decision Trees, and (3) Artificial Neural Networks (A ...
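The evaluation criterion in the snippet above, counting a model's false positives, amounts to tallying a 2x2 confusion matrix. A minimal illustrative sketch in Python (not code from the cited paper; the function name, label encoding, and toy data are assumptions):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally the 2x2 confusion matrix for one designated positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            counts["tp"] += 1  # correctly flagged positive
        elif p == positive:
            counts["fp"] += 1  # flagged, but actually negative
        elif t == positive:
            counts["fn"] += 1  # missed positive
        else:
            counts["tn"] += 1  # correctly passed over
    return counts

# Hypothetical labels: 1 = heart disease present, 0 = absent
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
counts = confusion_counts(y_true, y_pred)
```

The same tally also exposes the false negatives, which matter when the cost of leaving a patient untreated is high.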
A Practical Case Study on the Performance of Text Classifiers
... methods used in statistics and machine learning is the K-Means algorithm, which aims to create a group of k clusters from an initial set of n objects, each object being assigned to the cluster with the closest mean. The centroid-based K-Means technique uses a number n of objects and a constant K whic ...
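The procedure the snippet describes (assign each of the n objects to the nearest of K centroids, then recompute each centroid as the mean of its cluster) can be sketched in a few lines. This is an illustrative implementation, not code from the cited paper; the function name and toy data are assumptions:

```python
import random

def k_means(points, k, iters=100, seed=0):
    """Plain K-Means: assign each point to the nearest of k centroids,
    then recompute each centroid as the mean of its assigned points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize centroids at k of the points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the centroid closest to p (squared Euclidean distance)
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        new_centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # centroids stable: converged
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups of 3 points each
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, clusters = k_means(points, k=2)
```

On this toy data the algorithm recovers the two groups regardless of which points seed the centroids.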
BAYDA: Software for Bayesian Classification and Feature Selection
... The Bayesian classification and feature selection approach described above is implemented in the BAYDA (Bayesian Discriminant Analysis) software programmed in the Java language. BAYDA has several unique features that distinguish it from standard statistical software for classification (such as predi ...
Data Mining Techniques and Application to
... classification to an unknown sample. The K-Nearest Neighbor uses the information in the training set, but it does not extract any rule for classifying the others. ... used for classifying soils in combination with GPS-based technologies [23]. Meyer GE et al. [24] use a K-Means approach to classify soils ...
Preprocessing and Classification of Data Analysis in Institutional
... developing methods for exploring the information in the large data sets that come from educational settings, and using these techniques to better understand students' results and the settings in which they learn [7]. Data mining is an extraction of interesting (non-trivial, implicit, previously unknown an ...
Feature Extraction for Classification in the Data Mining Process M
... Feature extraction (FE) is one of the dimensionality reduction techniques [Liu, 1998]. FE extracts a subset of new features from the original feature set by means of some functional mapping, keeping as much information in the data as possible [Fukunaga, 1990]. Conventional Principal Component Analysi ...
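Conventional PCA, the functional mapping the snippet points to, projects centered data onto the eigenvectors of its covariance matrix with the largest eigenvalues, keeping the directions of greatest variance. A minimal sketch using NumPy (illustrative only; the function name and toy data are assumptions):

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its leading principal components: the eigenvectors
    of the covariance matrix with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder: largest variance first
    return Xc @ eigvecs[:, order[:n_components]]

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2],
              [3.1, 3.0], [2.3, 2.7], [2.0, 1.6], [1.0, 1.1]])
Z = pca(X, 1)  # 8 samples reduced from 2 features to 1
```

Because the data are centered before projection, the extracted feature has zero mean by construction.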
2. The DBSCAN algorithm - Linköpings universitet
... 4. Comparisons (DBSCAN vs. CLARANS) In the original report [1], the DBSCAN algorithm is compared to another clustering algorithm, called CLARANS (Clustering Large Applications based on RANdomized Search). It is an improvement of the k-medoid algorithms (one object of the cluster loc ...
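The DBSCAN side of that comparison can be sketched directly from its definition: points with at least min_pts neighbors within radius eps are core points, density-reachable points join their cluster, and everything else is noise. An illustrative Python sketch, not the original implementation (names and toy data are assumptions):

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: core points (>= min_pts neighbors within eps) seed
    clusters, density-reachable points join them, the rest are noise (-1)."""
    def neighbors(i):
        return [j for j in range(len(points))
                if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2]

    labels = [None] * len(points)  # None = unvisited, -1 = noise
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1         # not a core point: provisionally noise
            continue
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is itself core: keep expanding
                queue.extend(j_nbrs)
        cluster += 1
    return labels

# Two dense blobs and one isolated point
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (2.5, 2.5)]
labels = dbscan(points, eps=0.5, min_pts=3)
```

Unlike K-Means or CLARANS, no cluster count is supplied: the two blobs emerge from density alone and the isolated point stays noise.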
Increasing Cement Strength Using Data Mining Techniques
... The tremendous growth in data and databases has spawned a pressing need for new techniques and tools that can intelligently and automatically transform data into useful information and knowledge [1]. Data mining is a process of extracting implicit, previously unknown, but potentially useful information ...
KClustering
... The problem can be generalized as follows: given a set N of n data points in d-dimensional space, we must determine how to choose a set K of k points, called centers, so as to optimize some criterion. In most cases, it is natural to assume that n is much greater than k and that d is relativ ...
Stabilization of regression trees T. Urban, T. Kampke
... supports the generalization. The improvement in accuracy is achieved by partitioning the learning data set and fitting specific regression surfaces to each of the subsets. The generated regression tree considers the whole information of the sample set at each split node rather than the information ...
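The partitioning step underlying such regression trees can be illustrated on a single feature: choose the threshold that minimizes the summed squared error of fitting a constant (here the mean, rather than the paper's regression surfaces) to each side. An illustrative sketch, not the authors' code:

```python
def best_split(xs, ys):
    """One-feature regression-tree split: pick the threshold minimizing
    the summed squared error of fitting each side's mean."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds (last x splits nothing)
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = sse(left) + sse(right)
        if best is None or err < best[1]:
            best = (t, err)
    return best  # (threshold, summed squared error)

xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
threshold, error = best_split(xs, ys)  # splits between x = 3 and x = 10
```

Applying this search recursively to each side yields the tree; the stabilization the snippet describes changes what is fitted at the leaves, not the splitting machinery.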
View/Download-PDF - International Journal of Computer Science
... improve business in the education sector by offering competitive courses and improving students' academic performance and teachers' performance [5]. The process of providing quality education is described by Brijesh Kumar Baradwaj and Saurabh Pal (2011): to provide quality education, higher educational institutions ...
K-nearest neighbors algorithm
In pattern recognition, the k-Nearest Neighbors algorithm (or k-NN for short) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:

In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.

In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.

k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.

Both for classification and regression, it can be useful to assign weights to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor.

The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.

A shortcoming of the k-NN algorithm is that it is sensitive to the local structure of the data. The algorithm has nothing to do with, and is not to be confused with, k-means, another popular machine learning technique.
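The classification and regression behaviors described above can be sketched together; this unweighted version uses a plain majority vote and a plain average (the 1/d weighting mentioned above could replace both). The function name and toy data are illustrative:

```python
from collections import Counter

def knn_predict(train, query, k, mode="classify"):
    """k-NN: take the k training items nearest to `query` (Euclidean
    distance) and majority-vote their labels or average their values."""
    dist = lambda p: sum((a - b) ** 2 for a, b in zip(p, query)) ** 0.5
    nearest = sorted(train, key=lambda item: dist(item[0]))[:k]
    if mode == "classify":
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]  # most frequent neighbor label
    return sum(value for _, value in nearest) / k  # regression: mean value

train = [((1.0, 1.0), "A"), ((1.1, 0.9), "A"), ((0.9, 1.2), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B")]
label = knn_predict(train, (1.05, 1.0), k=3)  # three nearest are all "A"
```

Note the laziness described above: there is no training step at all, only a scan of the stored examples at prediction time.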