Algorithmic Information Theory-Based Analysis of Earth

PDF - International Journal of Advanced Research

... In India, sixty percent of people follow astrology from birth to death. They believe that astrology can resolve the confusions in their lives. The publications related to astrology in popular magazines show how accessible this traditional science is to common people. If we con ...

Knowledge Discovery – Techniques and Application

... We conclude that we presented some definitions of basic terms in the Knowledge discovery. Our primary focus was to clarify the relation between data mining and knowledge discovery. Overview of the KDD process and basic data mining methods have been provided. There are several data-mining techniques, ...

Robust Outlier Detection Technique in Data Mining- A

... If the sample size is small (80 or fewer cases), a case is an outlier if its standard score is ±2.5 or beyond. If the sample size is larger than 80 cases, a case is an outlier if its standard score is ±3.0 or beyond. Then run the k-means (clustering) algorithm, for the dataset with and without the ...
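The size-dependent standard-score rule in the snippet above can be sketched roughly as follows. This is a minimal illustration, not the cited paper's implementation: the function names `cutoff_for` and `zscore_outliers` are hypothetical, and the use of the sample standard deviation is an assumption, since the snippet does not say which estimator is used. The subsequent k-means step is omitted because the snippet is truncated.

```python
from statistics import mean, stdev


def cutoff_for(n):
    """Standard-score cutoff from the rule: 2.5 for 80 or fewer cases, else 3.0."""
    return 2.5 if n <= 80 else 3.0


def zscore_outliers(values):
    """Return indices of cases whose |standard score| reaches the cutoff.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    n = len(values)
    cutoff = cutoff_for(n)
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values identical: no case can be an outlier.
        return []
    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) >= cutoff]


# Hypothetical data: 19 ordinary cases and one extreme case.
sample = [10] * 19 + [100]
flagged = zscore_outliers(sample)
```

With these hypothetical values, only the last case is flagged, because the sample has 20 cases and the 2.5 cutoff applies.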

anomaly detection

... $LL_t(D) = |M_t| \log(1 - \lambda) + \sum_{x_i \in M_t} \log P_{M_t}(x_i) + |A_t| \log \lambda + \sum_{x_i \in A_t} \log P_{A_t}(x_i)$ ...

Predicting the outcome of English Premier League games using

... straightforward task. The main problem was with statto.com, where information is stored in JavaScript and cannot be accessed simply by following the link. It was necessary to send a request that changes the page layout so that it can be parsed. This problem has been solved by applying selenium libra ...

Slide 1

... Data mining vs machine learning • Machine learning methods are used for data mining – Classification, clustering ...

Consensus Guided Unsupervised Feature Selection

... by selecting the pivot portion of feature, which has been widely discussed in machine learning and data mining community (Guyon and Elisseeff 2003; Li and Fu 2015). Clearly, features after selection are easily interpreted, need shorter training time, and most importantly overcome the overfitting pro ...

ACM SIGKDD Conference on Knowledge Discovery and Data

Knowledge Discovery in Database Nisha Rani Department of

... a defined distance of the queried object; otherwise, it will find all pairs that are within some distance of each other ...

Time-Series Similarity Problems and Well

7. C07-Machine Learning

... Managers at Samsung want to find users' consumption patterns so that the company can provide personalized services. ...

Ch 9.2.1

... Obtain a sample of points from the data set. Compute the link value for each set of points, i.e., transform the original similarities (computed by Jaccard coefficient) into similarities that reflect the number of shared neighbors between points. Perform an agglomerative hierarchical clustering on the ...

O - 國立雲林科技大學

... The partial/merge stream-based k-means (1) makes it simpler to find an appropriate cluster representation and (2) provides a highly scalable, parallel, and efficient approach with significantly higher clustering quality. ...

Overview of Predictive Modeling Approaches in Health Care Data

... domain. A predictive modeling approach of Data Mining has been systematically applied for the prognosis, diagnosis, and planning for treatment of chronic disease. For example, a classification system can assist the physician to predict if the patient is likely to have a certain disease, or by consid ...

Document

Supervised Learning for Gene Expression Microarray Data

... All top 1% splits are based on AD. Leave-one-out results appear to be 100%; double-checking this to be sure. 35 is the cutoff point for the myeloma vote. No normal gets more than 15 votes, and no myeloma gets fewer than 55. ...

Contents - Computer Science

... appropriate classes, and (2) it forms descriptions for each class, as in classification. The guideline of striving for high intraclass similarity and low interclass similarity still applies. In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in large datab ...

Anomaly Detection

Multi-Agent Distributed Data Mining by Ontologies

DECODE: a new method for discovering clusters of different

... on density (Han et al. 2001). It is believed that density-based cluster methods have the potential to reveal the structure of a spatial data set in which different point processes overlap. Ester et al. (1996) and Sander et al. (1998) introduced the approaches of DBSCAN and GDBSCAN to address the det ...

A Hybrid Fuzzy Firefly Based Evolutionary Radial Basis Functional

... To compare the performance of our proposed method, we have considered two other evolutionary methods, FA-RBFN and PSO-RBFN. In the FA-RBFN method, all the parameters of the RBF network are optimized simultaneously by means of the firefly algorithm. So, a firefly is encoded as a combinatio ...

Anytime Concurrent Clustering of Multiple Streams with an Indexing

... Data streams are continuously produced and need to be analysed online. Moreover, multistream applications demand higher anytime requirements due to streams arriving at any time and with varying speeds. This continuously arriving data means huge storage requirements. Therefore, online multi-stream cl ...

Topic Models over Text Streams: A Study of

CV - Grafia - University of California, Santa Barbara

... Developed a domain-independent and fully automatic Web data records extraction algorithm. The algorithm captures repetitive patterns rendered in Web pages by analyzing the HTML tag paths. Both flat data records and nested data records can be extracted automatically. No prior knowledge on how the Web ...


Cluster analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς, "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest. This often leads to misunderstandings between researchers coming from the fields of data mining and machine learning, since they use the same terms and often the same algorithms, but have different goals.

Cluster analysis originated in anthropology with Driver and Kroeber in 1932, was introduced to psychology by Zubin in 1938 and Robert Tryon in 1939, and was famously used by Cattell beginning in 1943 for trait theory classification in personality psychology.
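To make one of the cluster notions above concrete (groups with small distances among the cluster members), here is a minimal sketch of a centroid-based procedure in the style of k-means. It is only an illustration under stated assumptions: the names `euclidean` and `kmeans` are hypothetical, the first `k` points are used as initial centroids (a naive, deterministic choice), and the sample points are invented. The `k` and `distance` arguments correspond to the "number of expected clusters" and "distance function" parameters mentioned in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, distance=euclidean, max_iter=100):
    """Assign each point to the nearest of k centroids, then recompute centroids.

    Assumption: the first k points are the initial centroids.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the chosen distance function.
        new_labels = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical data: two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(data, k=2)
```

Changing `k` or passing a different `distance` function changes the grouping, which is why the choice of algorithm and parameters depends on the data set and the intended use of the results.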

Top subcategories

