Classification in the Presence of Background Domain Knowledge

... more concise models by working at different levels of abstraction and exploring the relationship between concepts in the data set. The main contributions of this work are: 1. a concept hierarchy guided decision tree learning algorithm, that is able to take advantage of user supplied feature (attribu ...

Clustering of time series data—a survey

... time series data are concerned, distinctions can be made as to whether the data are discrete-valued or real-valued, uniformly or non-uniformly sampled, univariate or multivariate, and whether data series are of equal or unequal length. Non-uniformly sampled data must be converted into uniformed data ...

Solving Complex Machine Learning Problems with Ensemble Methods

... was to discuss ensemble strategies that not only focused on supervised classification, but that could be used to solve difficult and general machine learning problems. The workshop brought together members of the ensemble methods community and also researchers from other fields that could benefit fr ...

frbs: Fuzzy Rule-based Systems for Classification and Regression in R

Evaluation of clustering methods for adaptive learning systems

The GC3 framework: grid density based clustering for

... system and as such will not be totally reliable. Also the rate at which this data is being generated (real time in many cases) is much higher than the rate at which it can be analyzed by traditional data mining techniques. In such a dynamic environment, the basic tasks of Data mining such as Cluster ...

PDF version

... One of the application areas of data mining is the World Wide Web (WWW), which serves as a huge, widely distributed, global information service center for every kind of information such as news, advertisements, consumer information, financial management, education, government, e-commerce, health servic ...

Intrusion detection using clustering

statistical models and analysis techniques

PMML: An Open Standard for Sharing Models

A Survey on Distribution Testing

A Tutorial on Dirichlet Processes and Hierarchical Dirichlet Processes

Some contributions to semi-supervised learning

Deep learning is not the panacea - Computer Science | CU

... into cognition but the latter often perform better. This tension has recently surfaced in the realm of educational data mining, where a deep learning approach to estimating student proficiency, termed deep knowledge tracing or DKT [17], has demonstrated a stunning performance advantage over the main ...

A Survey on Consensus Clustering Techniques

chapter 6 data mining

... It is common to have observations with missing values for one or more variables. The primary options for addressing missing data are: (1) discard observations with any missing values, (2) discard variable(s) with missing values, (3) fill-in missing entries with estimated values, or (4) apply a data ...

Ensemble Learning Techniques for Structured

... classification models such as decision trees, artificial neural networks, Naïve Bayes, as well as many other classifiers (Kim, 2009). Ensemble learning, based on aggregating the results from multiple models, is a more sophisticated approach for increasing model accuracy as compared to the traditiona ...

Application and evaluation of inductive reasoning methods for the

... applied to draw conclusions about an individual given some statistical quantities such as probabilities, averages, or deviations from a previous examined population. In other words, by the use of statistical induction techniques, additional triples are derived based on some (precomputed) statistics ...

Aalborg Universitet Learning Bayesian Networks with Mixed Variables Bøttcher, Susanne Gammelgaard

lecture12and13_clustering

Improving student model for individualized learning

Package `RODM`

... specified (or defaults to an algorithm-specific model name). When created, the model will exist in Oracle as a database schema object. Most algorithms accept a parameter to direct ODM to enable automatic data preparation (default TRUE). This feature will request that ODM prepare data as befitting in ...

The 2009 Knowledge Discovery in Data Competition (KDD Cup

A Short Tutorial on Model

To Explain or to Predict? - Department of Statistics

... is often valued for its applied utility, yet is discarded for scientific purposes such as theory building or testing. Shmueli and Koppius (2010) illustrated the lack of predictive modeling in the field of IS. Searching the 1072 papers published in the two top-rated journals Information Systems Resea ...


Mixture model

In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Formally a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population. However, while problems associated with "mixture distributions" relate to deriving the properties of the overall population from those of the sub-populations, "mixture models" are used to make statistical inferences about the properties of the sub-populations given only observations on the pooled population, without sub-population identity information.

Some ways of implementing mixture models involve steps that attribute postulated sub-population-identities to individual observations (or weights towards such sub-populations), in which case these can be regarded as types of unsupervised learning or clustering procedures. However, not all inference procedures involve such steps.

Mixture models should not be confused with models for compositional data, i.e., data whose components are constrained to sum to a constant value (1, 100%, etc.). However, compositional models can be thought of as mixture models, where members of the population are sampled at random. Conversely, mixture models can be thought of as compositional models, where the total size of the population has been normalized to 1.
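As a rough illustration of inferring sub-population properties from pooled observations alone, the sketch below fits a two-component, one-dimensional Gaussian mixture with the expectation-maximization (EM) algorithm. EM is one common choice and is assumed here; the definition above does not prescribe it. The function names (`sample_pooled_population`, `fit_two_component_mixture`), the Gaussian component family, the synthetic data, and the crude initialization are all assumptions made for the example. The E-step computes the "weights towards such sub-populations" mentioned above.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sample_pooled_population(seed=0, n_first=300, n_second=700):
    """Hypothetical data: two Gaussian sub-populations, pooled without labels."""
    rng = random.Random(seed)
    first = [rng.gauss(0.0, 1.0) for _ in range(n_first)]
    second = [rng.gauss(5.0, 1.0) for _ in range(n_second)]
    return first + second  # sub-population identity is discarded here


def fit_two_component_mixture(data, iterations=200):
    """Estimate mixing weights, means and standard deviations with EM."""
    n = len(data)
    # Crude initialization (an assumption of this sketch): means at the extremes,
    # both standard deviations equal to the pooled standard deviation.
    means = [min(data), max(data)]
    pooled_mean = sum(data) / n
    pooled_std = math.sqrt(sum((x - pooled_mean) ** 2 for x in data) / n)
    stds = [pooled_std, pooled_std]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: soft weight of each observation towards each sub-population.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M-step: re-estimate each component from its weighted observations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)  # guard against a collapsed component

    return weights, means, stds


if __name__ == "__main__":
    pooled = sample_pooled_population()
    w, m, s = fit_two_component_mixture(pooled)
    print("weights:", w)
    print("means:  ", m)
    print("stds:   ", s)
```

With well-separated components such as these, the estimated weights and means should land close to the values used to generate the data, even though the fitting routine never sees which sub-population an observation came from. With overlapping components or a poor starting point, EM can settle on a local optimum, so in practice several initializations are usually tried.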
