Subspace Memory Clustering
... obtain the original clusters, see Figure 2. Moreover, the dimensions of these clusters equal 1, 1, 2, 2.04. Similarly, if we put p = 0.56 for dataset X2 then we obtain 100% compatibility with the original clusters, and the dimensions of these clusters are given by 1, 1, 2, 2.67, see Figure 2. In Figure 3, we show ...
On the effect of data set size on bias and variance in classification
... variance when moving from one sample size to the next (i.e. sample sizes of 125 were not compared to 500 or more). Because prior predictions were made for variance, one-tailed tests are applied for the variance outcomes. As no predictions were made with respect to bias, two-tailed tests are appl ...
Feature Subset Selection - Department of Computer Science
... improves the performance of IB1 on 3 of the natural domains and 3 of the artificial domains. Performance is significantly degraded on 2 of the natural and 3 of the artificial domains. CFS has successfully removed the 27 irrelevant attributes from the first two boolean domains (B1 & B2). As expected, CFS ...
A Dynamic Knowledge Base - K
... (i.e. the non-stationarity condition) can be paired with the more permanent, long-term asymptotic information coming from the averages and distributions (i.e. the stability condition). Through the interaction between the short-term and long-term information evolutions, we may have to design different strategies f ...
comparison of purity and entropy of k-means
... points that occur and the overlap between the Di cluster and Sj class cluster are counted, and this count is written as n_ij in our contingency matrix. For example, suppose K = 3 and K' = 3, which means that there are 3 clusters from K-means and 3 class clusters; we represent this as a three-by-three contingency ...
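The counting scheme in the excerpt can be sketched in a few lines of Python. The labels and cluster/class counts below are hypothetical, purely to illustrate how n_ij and purity are computed:

```python
def contingency_matrix(cluster_labels, class_labels, K, K_prime):
    # n[i][j] counts the points that fall in K-means cluster D_i and class cluster S_j
    n = [[0] * K_prime for _ in range(K)]
    for d, s in zip(cluster_labels, class_labels):
        n[d][s] += 1
    return n

def purity(n):
    # Fraction of points assigned to the majority class of their cluster
    total = sum(sum(row) for row in n)
    return sum(max(row) for row in n) / total

# K = 3 clusters vs. K' = 3 class clusters, as in the excerpt
clusters = [0, 0, 0, 1, 1, 2, 2, 2, 2]
classes  = [0, 0, 1, 1, 1, 2, 2, 2, 0]
n = contingency_matrix(clusters, classes, 3, 3)
print(n)          # [[2, 1, 0], [0, 2, 0], [1, 0, 3]]
print(purity(n))  # 7/9 ≈ 0.778
```

Purity sums the largest overlap in each row of the contingency matrix, so here (2 + 2 + 3) / 9.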
Mining Sensor Streams for Discovering Human Activity
... of a general pattern and determining their relative importance can be beneficial in many applications. For example, in one study Hayes et al. [19] found that variation in overall activity performance at home was correlated with mild cognitive impairment. This highlights the fact that it is import ...
Expert Systems
... Use human knowledge to solve problems that normally would require human intelligence; embody some non-algorithmic expertise; represent the expert knowledge as data or rules within the computer; can be called upon when needed to solve problems ...
Algorithms and Software for Collaborative Discovery from
... knowledge acquisition algorithms that can learn from statistical summaries of data (e.g., counts of instances that match certain criteria) that are made available as needed from the distributed data sources in the absence of access to raw data. (b) Autonomously developed and operated data sources of ...
NLP Biosurveillance-Interface2004-chapman2
... potentially relevant for outbreak detection – Free-text format ...
PDF version - PCP-net
... among constructs, and the elements associated with those constructs. In this paper we have argued that FCA is a natural model for representing ordinacy in grid data. However, there are other methods that might also be appropriate for capturing the way constructs are organised. As mentioned in Sectio ...
The Impact of Sample Reduction on PCA-based Feature Mykola Pechenizkiy Seppo Puuronen
... FE and subset selection are, of course, not totally independent processes; they can be considered different ways of representing the task. The use of such techniques is determined by the purpose at hand, and sometimes FE and selection methods are combined in order to improve the ...
business intelligence and analytics
... » APPLICATION CASE 1.4 Moneyball: Analytics in Sports and Movies 53 » APPLICATION CASE 1.5 Analyzing Athletic Injuries 54 Prescriptive Analytics 54 » APPLICATION CASE 1.6 Industrial and Commercial Bank of China (ICBC) Employs Models to Reconfigure Its Branch Network 55 Analytics Applied to Different ...
Context-Sensitive and Expectation-Guided Temporal Abstraction of High- Frequency Data
... and Musen 1993; Shahar and Musen 1996). A comprehensive review of temporal-reasoning approaches and useful references are given in (Shahar and Musen 1996). In the fol- ...
Multidimensional database representation of
... The proposed design includes the database architecture and methods of interaction between the client software and the database. The design extends a multidimensional database model to include new elements and capabilities that would serve well in real-time, volatile environments. A multidimensional model ...
Full text
... To date, LISp-Miner implements ten data mining procedures: 4ft-Miner (derived from the original GUHA procedure ASSOC), SD4ft-Miner, AC4ft-Miner, KLMiner, CF-Miner, SDKL-Miner, SDCF-Miner, KEX, ETree-Miner and MClusterMiner. Most of the procedures mine for various types of rule-like patterns; this ma ...
Computational Intelligence in Data Mining
... pattern (a model and its parameters) meet the goals of the KDD process. For example, predictive models can often be judged by their empirical prediction accuracy on some test set. Descriptive models can be evaluated along the dimensions of predictive accuracy, novelty, utility, and understandab ...
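Judging a predictive model by empirical accuracy on a test set, as the excerpt describes, reduces to one fraction. A minimal generic sketch (the labels below are hypothetical, not from the source):

```python
def accuracy(y_true, y_pred):
    # Empirical prediction accuracy: fraction of test-set labels predicted correctly
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical held-out test set: 3 of 4 predictions are correct
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```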
DATA MINING OF INPUTS: ANALYSING MAGNITUDE AND
... well known. The basic principles are to avoid encrypting the underlying structure of the data, and to avoid using irrelevant inputs. This is not easy in the real world, where we often receive data which has been processed by at least one previous user. The data may contain too many instances of some ...
Pattern Recognition Algorithms for Cluster
... probabilistic algorithms output a list of the N-best labels with associated probabilities, for some value of N, instead of simply a single best label. When the number of possible labels is fairly small (e.g. in the case of classification), N may be set so that the probability of all possible labels ...
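One way to realize the N-best scheme described above is to grow the list until the accumulated probability crosses a chosen threshold. This is a sketch; the coverage threshold and label names are illustrative assumptions:

```python
def n_best(label_probs, coverage=0.95):
    # Shortest list of (label, prob), highest probability first,
    # whose probabilities sum to at least `coverage`
    ranked = sorted(label_probs.items(), key=lambda kv: kv[1], reverse=True)
    out, cum = [], 0.0
    for label, p in ranked:
        out.append((label, p))
        cum += p
        if cum >= coverage:
            break
    return out

probs = {"cat": 0.55, "dog": 0.30, "fox": 0.10, "owl": 0.05}
best = n_best(probs, coverage=0.9)
print([label for label, _ in best])  # ['cat', 'dog', 'fox']
```

With a small label set, raising the coverage threshold toward 1.0 simply lengthens the list until it contains every label with non-negligible probability.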
A New Measure for the Accuracy of a Bayesian Network
... only if, the joint probability distribution represented by the Bayesian network (P_BN) matches the joint probability distribution described by the data set (P_D). Both the joint probability distribution represented by the Bayesian network and the joint probability distribution described by the data se ...
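One common way to quantify how closely P_BN matches P_D is the Kullback-Leibler divergence, which is zero exactly when the two joint distributions coincide. This is a generic sketch and not necessarily the measure the paper proposes; the four joint configurations are hypothetical:

```python
import math

def kl_divergence(p_d, p_bn):
    # KL(P_D || P_BN): zero iff the data distribution and the network's
    # joint distribution agree on every joint configuration
    return sum(p * math.log(p / q) for p, q in zip(p_d, p_bn) if p > 0)

# Joint distributions over four hypothetical configurations
p_d = [0.4, 0.3, 0.2, 0.1]
print(kl_divergence(p_d, [0.4, 0.3, 0.2, 0.1]))        # 0.0 (perfect match)
print(kl_divergence(p_d, [0.25, 0.25, 0.25, 0.25]) > 0)  # True (mismatch)
```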
Knowledge acquisition and processing: new methods for
... All data vectors are correctly classified after 4 stages of the classification. No misclassifications! ...
Non-zero probability of nearest neighbor searching
... method may report an incorrect nearest neighbor because different instances of q can be selected. Figure 2 shows an example of two data points a and b and an uncertain query point q. The bisector of a and b is shown by B_ab. It is easy to see that all points above (including all instances of q, especially ...
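The bisector argument can be checked directly: whichever side of B_ab an instance of q falls on determines its nearest neighbor among a and b, so picking a single representative of an uncertain q can give the wrong answer. The coordinates below are hypothetical:

```python
import math

def nearer(a, b, q):
    # Which of a or b is nearer to q; equal distances put q exactly on the bisector B_ab
    da, db = math.dist(q, a), math.dist(q, b)
    if da < db:
        return "a"
    if db < da:
        return "b"
    return "on B_ab"

a, b = (0.0, 0.0), (4.0, 0.0)
# Two instances of the uncertain query q fall on opposite sides of B_ab
# (here the vertical line x = 2), so they have different nearest neighbors.
print(nearer(a, b, (1.0, 1.0)))  # a
print(nearer(a, b, (3.0, 1.0)))  # b
print(nearer(a, b, (2.0, 5.0)))  # on B_ab
```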
Tracking evolving communities in large linked networks
... A prominent alternative is to use cocitation analysis (13). In cocitation, two papers are judged similar if they are both cited by another paper. This is a very useful similarity measure. However, for this measure to work properly, a certain time-lag is required in order for papers to build up a cit ...
Building Knowledge-Driven DSS and Mining Data
... "rules" that are the basis for storing the knowledge in the system. Many expert system development environments store knowledge as rules. The following is an example of a rule: IF INCOME > $45,000 (condition) AND IF SEX = "M" (condition) THEN ADD to Target list (action) A rule is a formal way of spe ...
... "rules" that are the basis for storing the knowledge in the system. Many expert system development environments store knowledge as rules. The following is an example of a rule: IF INCOME > $45,000 (condition) AND IF SEX = "M" (condition) THEN ADD to Target list (action) A rule is a formal way of spe ...
Graph-Based Relational Learning: Current and Future Directions
... subgraph would be removed from subsequent iterations. If using the information-theoretic measure, then instances of the learned subgraph in both the positive and negative examples (even multiple instances per example) are compressed to a single vertex. We should note that the compression is a lossy ...