4 Data Mining
... Generally, the use of visual analytics has been well received by industry. Several companies have embraced this business model and are selling visual analytics tools and/or offering consultancy services to different industries. Arguably, the main reason to adopt this novel approach is that business ...
Data Privacy in Data Engineering, the Privacy Preserving Models
... and might prove to be a hindrance to the wide distribution of the data concerned. A more intensively researched area is preserving privacy in data mining [Aggarwal and Yu 2008c]. The main intent of privacy-preserving data mining (PPDM), which materialized in 2000 [Agrawal and Srikant 2000], is to mask confidentia ...
Representing, Storing and Mining Moving Objects Data
... intends to extract new knowledge from data using data mining algorithms. These new patterns, useful in the decision-making process, involve different data positions in the time dimension. Research in Mining Moving Object Data has not yet produced a theoretical framework for Data Mining o ...
HERE - Kevin Pei
... test set of 100,000 samples, 290 are Mobile, which accounts for a small proportion; we would ideally expect it to yield a small percentage change in error. We propose a k-nearest-neighbour treatment for the restaurant type. We first construct a query matrix within the test set where each row ...
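The snippet above cuts off before it finishes describing the query matrix, but the general k-nearest-neighbour treatment it proposes can be sketched. The sketch below is a minimal, illustrative version: the feature tuples, labels, and the Euclidean distance metric are assumptions, not details from the paper; the idea is simply that a rare class such as "Mobile" gets its prediction by majority vote among nearby samples.

```python
import math

def knn_predict(query, train_rows, train_labels, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training rows under Euclidean distance. All names are illustrative."""
    dists = sorted(
        (math.dist(query, row), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy data: two dense clusters, one of them the rare "Mobile" type.
rows = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 4.9)]
labels = ["Casual", "Casual", "Mobile", "Mobile"]
```

A query near the second cluster, e.g. `knn_predict((5.0, 5.0), rows, labels)`, is assigned the rare class by its neighbours rather than by a global prior.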
... deal with the scalability problem and quickly produce good-quality recommendations, but they have to undergo expensive matrix factorization steps. An incremental SVD CF algorithm [38] precomputes the SVD decomposition using existing users. When a new set of ratings is added to the database, the alg ...
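The incremental step the snippet refers to is commonly done by "folding in" a new user: projecting the new rating vector onto the already-computed item factors instead of redoing the factorization. The sketch below shows that projection under simple assumptions (a dense toy rating matrix, rank-2 truncation); it is an illustration of the fold-in idea, not the exact procedure of [38].

```python
import numpy as np

# Toy user-item rating matrix (rows: users, cols: items).
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [1., 0., 0., 4.]])

# One-time (expensive) truncated SVD on the existing users.
k = 2
U, s, Vt = np.linalg.svd(R, full_matrices=False)
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

def fold_in_user(ratings):
    """Project a new user's rating vector into the existing latent
    space without recomputing the SVD (the 'fold-in' step)."""
    return ratings @ Vt_k.T @ np.linalg.inv(S_k)

def predict(user_factors):
    """Reconstruct approximate ratings from latent factors."""
    return user_factors @ S_k @ Vt_k

new_user = np.array([5., 2., 0., 1.])
approx = predict(fold_in_user(new_user))
```

Folding in an existing row recovers exactly its latent factors, which is why the approximation stays consistent with the precomputed decomposition.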
Tidset-Based Parallel FP-tree Algorithm for the Frequent Pattern
... algorithm only scans the database twice. It is notable that the FP-growth algorithm can be divided into two phases: the FP-tree construction and the mining of frequent patterns from the FP-tree. 2.1.1 Construction of the FP-tree The FP-tree is a data structure representing the necessary information ...
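The two scans mentioned above can be sketched directly: the first pass counts item frequencies, the second inserts each transaction's frequent items, in descending frequency order, into a shared prefix tree. This is a minimal FP-tree construction (header links and the mining phase are omitted); the variable names and the toy transactions are illustrative.

```python
from collections import Counter

class FPNode:
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

def build_fp_tree(transactions, min_support=2):
    """Two database scans, as in FP-growth: scan 1 counts items,
    scan 2 inserts each transaction (frequent items only, in
    descending frequency order) into a shared prefix tree."""
    freq = Counter(i for t in transactions for i in t)           # scan 1
    order = {i: c for i, c in freq.items() if c >= min_support}
    root = FPNode(None)
    for t in transactions:                                       # scan 2
        node = root
        for item in sorted((i for i in t if i in order),
                           key=lambda i: (-order[i], i)):
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root

txns = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
tree = build_fp_tree(txns)
```

Because shared prefixes collapse into shared paths (here, "a, b" appears twice and becomes one path with count 2), the tree compactly represents the information needed for mining without further database scans.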
From DBMiner to WebMiner: What is the Future of Data Mining?
... of a "random browser." Hub/authority method (Kleinberg, 1998): Prominent authorities often do not endorse one another directly on the Web. Hub pages have a large number of links to many relevant authorities. Thus hubs and authorities exhibit a mutually reinforcing relationship: Both the page-r ...
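The mutually reinforcing relationship described above is exactly what Kleinberg's HITS iteration computes: authority scores are sums of the hub scores of pages linking in, hub scores are sums of the authority scores of pages linked to, with normalisation between rounds. A small sketch on a toy link graph (the graph and iteration count are illustrative):

```python
def hits(links, iters=50):
    """Kleinberg's hub/authority iteration on a link graph given as
    {page: [pages it links to]}. Authorities are pages pointed to by
    good hubs; hubs are pages pointing to good authorities."""
    pages = set(links) | {p for tgts in links.values() for p in tgts}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        auth = {p: sum(hub[q] for q in links if p in links.get(q, []))
                for p in pages}
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalise to keep the scores from growing without bound.
        na = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        nh = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / na for p, v in auth.items()}
        hub = {p: v / nh for p, v in hub.items()}
    return hub, auth

# Two hub pages both pointing at the same two authorities:
graph = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "a1": [], "a2": []}
hub, auth = hits(graph)
```

Note that the two authorities never link to each other, yet both score highly because the same hubs endorse them, which is the point the snippet makes about prominent authorities not endorsing one another directly.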
Rubik: Knowledge Guided Tensor Factorization and Completion for
... • 8 sub-phenotypes in total. Experts' evaluation: ...
Effect of Data Repair on Mining Network Streams
... In this paper, we use network data that were collected from a mobility network over a period of several weeks. The data consist of measurements on network performance indicators gathered from individual network elements. Network engineers use these data to monitor, in real time, the performance of th ...
Data Mining: Concepts and Techniques
... Sorting, hashing, and grouping operations are applied to the dimension attributes in order to reorder and cluster related tuples. Aggregates may be computed from previously computed aggregates, rather than from the base fact table ...
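The second point above — deriving higher-level aggregates from lower-level ones instead of re-scanning the base fact table — is the core of cube roll-up. A minimal sketch with an assumed toy fact table (city, quarter, sales):

```python
from collections import defaultdict

# Base fact table: (city, quarter, sales). Values are illustrative.
facts = [("NYC", "Q1", 10), ("NYC", "Q2", 20),
         ("LA",  "Q1", 5),  ("LA",  "Q2", 15)]

# Lowest-level aggregate, computed once from the fact table.
by_city_quarter = defaultdict(int)
for city, quarter, sales in facts:
    by_city_quarter[(city, quarter)] += sales

# Higher-level aggregates derived from the previous aggregate,
# never touching the base table again.
by_city = defaultdict(int)
for (city, _), total in by_city_quarter.items():
    by_city[city] += total

grand_total = sum(by_city.values())
```

This works because SUM (like COUNT and MIN/MAX) is distributive: group totals can be combined into coarser totals without revisiting the individual tuples.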
Data Mining with Fuzzy Methods: Status and
... In order to obtain meaningful fuzzy partitions, it is better to create rule bases by structure-oriented learning than by cluster-oriented or by hyperbox-oriented rule learning. The latter two approaches create individual fuzzy sets for each rule and thus provide less interpretable solutions. Structu ...
Data Mining with Fuzzy Methods - Computational
... needed and to check whether it may be useful to take additional, related data into account. In the data preparation step, the gathered data are cleaned, transformed and, where appropriate, rescaled to produce the input for the modeling techniques. In this step, fuzzy methods may, for example, be used to dete ...
Enhanced SPRINT Algorithm based on SLIQ to Improve Attribute
... Classification is the most commonly applied data mining technique and is effective for data mining analysis. It can be used to describe and extract models from data classes and to predict future data [1]. The analysis and forecasts of these data provide good decision support in various industries. Cl ...
Ontological Learning Assistant for Knowledge Discovery
... Therefore, structuring the knowledge about previous initiatives and using the CBR paradigm to aid the creation of new processes may solve this problem [13], [6]. Our platform takes advantage of that approach. However, our case representation is not limited to the whole KDD project. We distinguished its ...
IOSR Journal of Computer Engineering (IOSR-JCE)
... improves the recognition accuracy. With the aim of choosing a subset of good features with respect to the target concepts, feature subset selection is an effective way of reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Many fe ...
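The snippet is cut off before it names specific methods, but the simplest family it alludes to is filter-style selection: score each feature independently against the target and keep the best-scoring subset. The sketch below uses absolute Pearson correlation as the score; the scoring function and the synthetic data are assumptions chosen for illustration, not the method of the paper.

```python
import numpy as np

def select_k_features(X, y, k):
    """Filter-style feature subset selection: score each column of X by
    the absolute Pearson correlation with the target y and keep the
    indices of the top k. A deliberately simple illustrative filter."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return sorted(np.argsort(scores)[-k:].tolist())

# Synthetic data: one column strongly tied to the target, three noise columns.
rng = np.random.default_rng(0)
n = 200
y = rng.normal(size=n)
relevant = y + 0.1 * rng.normal(size=n)
noise = rng.normal(size=(n, 3))
X = np.column_stack([noise[:, 0], relevant, noise[:, 1:]])
```

Here `select_k_features(X, y, 1)` keeps only the informative column (index 1), discarding the irrelevant ones — the dimensionality-reduction effect the text describes.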
Document
...
– math: requires more tuning and validation than unsupervised learning
– task: classification, regression
– models: decision trees, Naïve Bayes, linear/logistic regression, SVM, neural nets
...
15.6 Confidence Limits on Estimated Model Parameters
... as in the previous discussion, you subject these data sets to the same estimation procedure as was performed on the actual data, giving a set of simulated measured parameters a_S^(1), a_S^(2), .... These will be distributed around a^(0) in close to the same way that a^(0) is distributed around a_true ...
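The Monte Carlo procedure described above can be sketched end to end for the simplest possible "estimation procedure": fitting the mean of Gaussian data. Everything concrete here (the true parameter, sample size, number of synthetic sets) is an assumed toy setup; the point is the structure — synthetic data sets drawn using the fitted parameter, each re-fit the same way, with the spread of the simulated parameters around the fit standing in for the spread of the fit around the truth.

```python
import numpy as np

rng = np.random.default_rng(42)

# "True" parameter and one actual data set drawn from it.
a_true = 3.0
data = rng.normal(a_true, 1.0, size=100)
a0 = data.mean()                      # parameter estimated from actual data

# Monte Carlo: synthetic data sets drawn using a0, each run through the
# same estimation procedure, giving simulated parameters a_S^(i).
a_sim = np.array([rng.normal(a0, 1.0, size=100).mean()
                  for _ in range(2000)])

# The spread of a_S around a0 estimates the spread of a0 around a_true:
lo, hi = np.percentile(a_sim - a0, [2.5, 97.5])
ci = (a0 + lo, a0 + hi)               # approximate 95% confidence interval
```

For a sample mean of 100 unit-variance points the standard error is 0.1, so the interval half-width comes out near 1.96 x 0.1, matching the analytic answer this simulation approximates.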
Parallel Outlier Detection on Uncertain Data for GPUs
... A considerable portion of data in the real world contains some degree of uncertainty [3], due to factors such as limitations in measuring equipment, partial responses or interpolation [5] [13]. An example of this could be a remote sensor network or location-based tracking system [42]. There have bee ...
Nonlinear dimensionality reduction
![](https://commons.wikimedia.org/wiki/Special:FilePath/Lle_hlle_swissroll.png?width=300)
High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically, those that just give a visualisation are based on proximity data – that is, distance measurements.
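A concrete example of the "visualisation from proximity data" group is classical multidimensional scaling (MDS), which turns a matrix of pairwise distances into low-dimensional coordinates; several NLDR methods (Isomap, for instance) use it as their final step. The sketch below is a minimal classical MDS on an assumed toy distance matrix, not any particular method from the summary that follows.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical multidimensional scaling: recover coordinates whose
    pairwise distances approximate the given distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]              # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Proximity data only: distances between the corners of a unit square.
pts = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
X = classical_mds(D)
D_rec = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
```

Because the input distances here are exactly Euclidean in two dimensions, the recovered configuration reproduces them exactly (up to rotation and reflection); non-linear methods differ mainly in how they construct the proximity matrix before this step.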