
Recommendation via Query Centered Random Walk on K-partite Graph
... graph is referred to as QRank in this paper. The QRank algorithm has some desirable properties. The Markov chain model computes the relevance score of each document from a global perspective, with respect to all the documents, terms, and authors in the graph. In contrast, standard approaches conside ...
Research on High-Dimensional Data Reduction
... difficult, we must adopt some special means to solve. With the increase of the dimension of the data, the performance of high-dimensional index structure decreased rapidly, in the low dimensional space, those who often used Euclidean distance as the similarity measurement, but among data in high-dim ...
Social-media Text Mining and Network Analysis to support Decision
... audience engagement. In order to reduce or isolate user classes the influence measures will be calculated based on a variety of social network measures will be examined such as influence, activity, use of single topic, and language amongst the dependencies of influence, both [15] and [16] share the ...
Data Staging
... Identify data sources; Extract and analyze source data; Standardize data; Correct and complete data; Match and consolidate data; Analyze data defect types; Transform and enhance data into target; Calculate derivations and summary data; Audit and control data extract, transformation and loading ...
APRIORI algorithm based medical data mining for frequent
... discovery from data or KDD. Decision making can be achieved by converting mined data into knowledge, and this process is called knowledge discovery. The iterative sequence in knowledge discovery is: 1. Data cleaning [inconsistent data and noise are removed], 2. Data integration [the combina ...
A Novel Approach for Data Cleaning by Selecting the Optimal Data
... features for instances will be built and the problem of intrusion detection will be mapped as a 10 feature problem, such feature creation and as features in new problem only have discrete values, in final classification decision tree will be used[12]. The idea of grouping utilizing least spreading o ...
Educational Data Mining 2010 - International Educational Data
... Alberta[homepage] Abstract: Using computer-supported collaborative learning tools, learners interact forming relationships and complex flows of information. In a forum with very few learners it is customary to quickly collect thousands of messages in few months, and these are interrelated in intrica ...
A Comparative Analysis of Classification Techniques on
... variables, parameter and structure learning. The density of the arcs is a measure of its complexity. A simple model is represented by a sparse Bayesnet, while complex models are represented by a dense Bayesnet. Thus, it provides a flexible method for probabilistic modeling. 3.4 Neural Network: An artificial neural network (A ...
Parallel K-Means Clustering Based on MapReduce
... The map function performs the procedure of assigning each sample to the closest center while the reduce function performs the procedure of updating the new centers. In order to decrease the cost of network communication, a combiner function is developed to deal with partial combination of the interm ...
Dimensionality reduction Feature selection
... Dimensionality reduction. Motivation. • Classification problem example: – We have an input data { x 1 , x 2 ,.., x N } such that x i = ( x i1 , x i2 ,.., x id ) and a set of corresponding output labels { y1 , y 2 ,.., y N } – Assume the dimension d of the data point x is very large – We want to clas ...
A comparison of various clustering methods and algorithms in data
... Density functions clustering: It computes density functions defined over the underlying attribute space instead of computing densities pinned to data points. They introduced the algorithm DENCLUE (DENsity-based CLUstEring). It has a firm mathematical foundation. Along with DBCLASD that uses a density f ...
Supervised learning
... 3. Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. Typically, the input object is transformed into a feature vector, which contains a number of features that are descriptive of the objec ...
Clustering Sentence-Level Text Using a Novel Fuzzy Relational
... algorithm is capable of identifying overlapping clusters of semantically related sentences. Comparisons with the ARCA algorithm on each of these data sets suggest that FRECCA is capable of identifying softer clusters than ARCA, without sacrificing performance as evaluated by external measures. Altho ...
Radial Basis Function (RBF) Networks
... • The final layer performs a simple weighted sum with a linear output. • If the RBF network is used for function approximation (matching a real number) then this output is fine. • However, if pattern classification is required, then a hard-limiter or sigmoid function could be placed on the output ne ...
IOSR Journal of Computer Engineering (IOSR-JCE)
... applied for further prediction; in case-based reasoning methods the predictive model is implicit in historical data; the third option is a mixture between prior explicit knowledge model and iterative refinements based on future data (Bayesian learning). Finally, in the presented conceptual map of Da ...
LN22
... – Goal: A consumer appliance repair company wants to anticipate the nature of repairs on its consumer products and keep the service vehicles equipped with right parts to reduce on number of visits to consumer households. – Approach: Process the data on tools and parts required in previous repairs at ...
notes
... results, thus usually need a large and complete set of training data. The second is discriminative, where we directly estimate the conditional probability P (Y |X). This approach can optimize directly for the task of interest, but the conditional probability may be harder to interpret in some cases. ...
lcpc_xgli - Ohio State Computer Science and Engineering
... Department of Computer and Information Sciences Ohio State University ...
Analysis of Student Performance by using Data Mining Concept
... or advanced technique of data mining that can be applied on the data related to the field of education. The data can be collected from past used data and operational data reside in the databases of educational institutes. The data of students can be personal information or academic performance. Furt ...
Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data, that is, distance measurements.
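To make the idea of a visualisation built purely from proximity data concrete, here is a minimal sketch using classical multidimensional scaling (MDS). Classical MDS is a linear technique, chosen here only because it is short; it is not one of the non-linear algorithms themselves. The function names `classical_mds` and `pairwise_distances`, the synthetic data, and the use of NumPy are all assumptions made for this illustration.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(distances, n_components=2):
    """Embed n items in n_components dimensions from distances alone."""
    squared = np.asarray(distances, dtype=float) ** 2
    n = squared.shape[0]
    # Double centering turns squared distances into a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ squared @ centering
    # The top eigenpairs of the Gram matrix give the embedding coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Illustrative data: a 2-D point cloud placed isometrically inside a 5-D space.
rng = np.random.default_rng(0)
flat = rng.normal(size=(30, 2))
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # orthonormal columns
high = flat @ basis.T

# Only the distance matrix is passed in, never the 5-D coordinates.
embedding = classical_mds(pairwise_distances(high), n_components=2)
```

Because the synthetic data lie on a flat two-dimensional subspace, the two-dimensional embedding reproduces the original pairwise distances up to rotation and reflection. Data on a curved manifold would generally need one of the non-linear methods instead.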