
Privacy-Preserving Decision Tree Mining Based on
... To perturb a set of data records O = {o1 , . . . , on } on an attribute A, we create a perturbation matrix M for the attribute domain U = {u1 , . . . , uN }. For each uk ∈ U, p(k → h) = Pr(uk → uh ) denotes the (transition) probability that uk is replaced by uh ∈ U. The perturbation matrix is then d ...
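The excerpt above defines a perturbation matrix M whose entry p(k → h) gives the probability that domain value u_k is replaced by u_h. A minimal sketch of one such matrix, assuming a uniform randomized-response scheme (the retention probability `gamma` and the toy domain below are illustrative, not from the paper):

```python
import numpy as np

def make_perturbation_matrix(N, gamma=0.7):
    """M[k, h] = Pr(u_k -> u_h): keep u_k with probability gamma,
    otherwise replace it with a uniformly chosen other domain value."""
    off = (1.0 - gamma) / (N - 1)
    M = np.full((N, N), off)
    np.fill_diagonal(M, gamma)
    return M

def perturb(values, domain, M, rng):
    """Replace each value u_k by a u_h drawn with probability M[k, h]."""
    index = {u: k for k, u in enumerate(domain)}
    return [domain[rng.choice(len(domain), p=M[index[v]])] for v in values]

rng = np.random.default_rng(0)
domain = ["low", "mid", "high"]
M = make_perturbation_matrix(N=3)
out = perturb(["low", "low", "high"], domain, M, rng)
print(M.sum(axis=1))  # each row of M sums to 1
```

Each row of M is a probability distribution over the domain, which is what lets the miner later invert the perturbation in expectation.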
Clustering - Politecnico di Milano
... using a mixture of distributions • Each cluster is represented by one distribution • The distribution governs the probabilities of attribute values in the corresponding cluster • They are called finite mixtures because there is only a finite number of clusters being represented • Usually individual ...
DECODE: a new method for discovering clusters of different
... Density-based cluster methods are characterized by aggregating mechanisms based on density (Han et al. 2001). It is believed that density-based cluster methods have the potential to reveal the structure of a spatial data set in which different point processes overlap. Ester et al. (1996) and Sander ...
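The excerpt above refers to density-based aggregation in the style of Ester et al. (1996). A minimal DBSCAN-like sketch, where a core point (one with at least `min_pts` neighbors within `eps`) grows a cluster by density-reachability; the parameters and toy points are illustrative:

```python
import numpy as np

def dbscan(points, eps=0.5, min_pts=3):
    n = len(points)
    labels = [-1] * n  # -1 = noise / not yet assigned
    # Precompute the eps-neighborhood of every point (a point is its own neighbor).
    neighbors = [
        [j for j in range(n) if np.linalg.norm(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue  # already assigned, or not a core point
        # Grow a new cluster from core point i by density-reachability.
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])
        cluster += 1
    return labels

pts = np.array([[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]])
print(dbscan(pts, eps=0.5, min_pts=3))  # two dense groups, two clusters
```

Because membership depends on local density rather than distance to a centroid, clusters of different shapes and overlapping point processes can be separated, which is the property the excerpt highlights.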
Discovering Correlated Subspace Clusters in 3D
... Axis-parallel 3D subspace clusters are extensions of the 2D subspace clusters with time/location as the third dimension. Tricluster [1] is the pioneer work on 3D subspace clusters. Similar to 2D subspace clusters, triclusters fulfill certain similarity-based functions and thresholds have to be set o ...
Inducing Decision Trees with an Ant Colony Optimization Algorithm
... 2.1. Top-down Induction of Decision Trees Decision trees provide a comprehensible graphical representation of a classification model, where the internal nodes correspond to attribute tests (decision nodes) and leaf nodes correspond to the predicted class labels—illustrated in Fig. 1. In order to cla ...
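The excerpt above describes classification with a decision tree: internal nodes test an attribute, and leaves hold the predicted class label. A minimal sketch of that traversal; the tree (outlook/humidity) and record fields are illustrative, not from the paper:

```python
def classify(node, record):
    """Walk from the root, following the branch chosen by each
    attribute test, until a leaf (a class label) is reached."""
    while isinstance(node, dict):  # decision node: attribute test
        node = node["branches"][record[node["attribute"]]]
    return node                    # leaf node: predicted class label

tree = {"attribute": "outlook",
        "branches": {"sunny": {"attribute": "humidity",
                               "branches": {"high": "no", "normal": "yes"}},
                     "overcast": "yes",
                     "rain": "yes"}}

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```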
Parallel Outlier Detection on Uncertain Data for GPUs
... 2.2 Parallel and GPU-accelerated data mining Data mining applications such as outlier detection are good candidates for parallelization [16] [18] as a large amount of data is processed by a small number of routines. Several outlier detection algorithms have been parallelized for acceleration with GP ...
Clustering Non-Ordered Discrete Data, JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, Vol 30, PP. 1-23, 2014, Alok Watve, Sakti Pramanik, Sungwon Jung, Bumjoon Jo, Sunil Kumar, Shamik Sural
... idea behind this is that even though two data points may not be directly similar to each other, if they share many common neighbors then they should be put in the same cluster. Based on this new measure, they propose a hierarchical clustering method that recursively merges clusters with maximum ...
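The shared-neighbor idea in the excerpt above can be sketched directly: similarity between two points is the size of the overlap of their neighbor sets, so two points that never appear in each other's neighborhoods can still score high. The toy neighbor lists are illustrative:

```python
def shared_neighbor_similarity(neighbors_a, neighbors_b):
    """Similarity = number of neighbors the two points have in common."""
    return len(set(neighbors_a) & set(neighbors_b))

# a and b do not list each other as neighbors, yet share three neighbors,
# so a shared-neighbor method would still place them in the same cluster.
print(shared_neighbor_similarity(["c", "d", "e", "f"], ["c", "d", "e", "g"]))  # 3
```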
Expectation–maximization algorithm

In statistics, an expectation–maximization (EM) algorithm is an iterative method for finding maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables. The EM iteration alternates between an expectation (E) step, which creates a function for the expectation of the log-likelihood evaluated using the current estimate of the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found in the E step. These parameter estimates are then used to determine the distribution of the latent variables in the next E step.
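The E/M alternation described above can be sketched for a two-component, one-dimensional Gaussian mixture (the finite-mixture model mentioned earlier); the synthetic data, initial guesses, and iteration count are illustrative assumptions:

```python
import numpy as np

def em_gmm(x, n_iter=50):
    """EM for a 2-component 1-D Gaussian mixture."""
    mu = np.array([x.min(), x.max()])     # initial component means
    sigma = np.array([x.std(), x.std()])  # initial standard deviations
    pi = np.array([0.5, 0.5])             # mixing weights
    for _ in range(n_iter):
        # E step: responsibilities r[i, j] = Pr(component j | x_i)
        # under the current parameter estimates.
        d = (x[:, None] - mu) / sigma
        lik = pi * np.exp(-0.5 * d**2) / (sigma * np.sqrt(2 * np.pi))
        r = lik / lik.sum(axis=1, keepdims=True)
        # M step: re-estimate parameters to maximize the expected
        # log-likelihood found in the E step.
        nj = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nj
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nj)
        pi = nj / len(x)
    return mu, sigma, pi

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)])
mu, sigma, pi = em_gmm(x)
print(np.sort(mu))  # recovered means, near 0 and 5
```

The updated `mu`, `sigma`, and `pi` from each M step feed the responsibilities of the next E step, which is exactly the alternation the paragraph describes.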