Locally defined principal curves and surfaces
... Writing a PhD thesis is not something that you can do over the weekend; it takes a long time. Therefore, the first lines of this dissertation will be the words of gratitude to all those who helped me over the past years and made this possible. To begin with, I would like to express my gratitude to m ...
Information-theoretic graph mining - PuSH
... center of attention in the data mining field. There are many important tasks in graph mining, such as graph clustering, outlier detection, and link prediction. Many algorithms have been proposed in the literature to solve these tasks. However, normally these issues are solved separately, although th ...
An automatic email mining approach using semantic non
... Figure 1.1: Categories of Email Management Tasks. Figure 1.2: Automatic Folder Creation by Email Clustering. Figure 2.1: Email as shown on user screen ...
Automatic Document Topic Identification Using Hierarchical
... find the best matching topic for input documents. There are several applications for this task. For example, it can be used to improve the relevancy of search engine results by categorizing the search results according to their general topic. It can also give users the ability to choose the domain w ...
now
... is a leaf node labeled as y_t. If D_t is an empty set, then t is a leaf node labeled by the default class, y_d. If D_t contains records that belong to more than one class, use an attribute test to split the data into smaller subsets. Recursively apply the procedure to each subset. ...
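The recursive procedure in the excerpt above can be sketched in Python as follows. This is a minimal illustration, not the excerpted document's implementation: the names `build_tree` and `predict`, the dictionary-based tree layout, and the rule of splitting on the first remaining attribute are all assumptions made for the example. A real decision-tree learner would choose the attribute test with an impurity measure.

```python
from collections import Counter


def build_tree(records, attributes, default_class):
    """Grow a tree from records, a list of (features_dict, label) pairs.

    Hypothetical sketch: the split rule (first remaining attribute) is a
    placeholder, not a recommended attribute-selection strategy.
    """
    # Empty record set: leaf labeled with the default class.
    if not records:
        return {"leaf": default_class}

    labels = [label for _, label in records]

    # All records share one class: leaf labeled with that class.
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}

    majority = Counter(labels).most_common(1)[0][0]

    # No attribute left to test: fall back to the majority class.
    if not attributes:
        return {"leaf": majority}

    # Mixed classes: split on an attribute test and recurse on each subset.
    attr, remaining = attributes[0], attributes[1:]
    children = {}
    for value in {features[attr] for features, _ in records}:
        subset = [(f, y) for f, y in records if f[attr] == value]
        children[value] = build_tree(subset, remaining, majority)
    return {"split": attr, "children": children, "default": majority}


def predict(tree, features):
    """Follow attribute tests from the root down to a leaf label."""
    while "leaf" not in tree:
        child = tree["children"].get(features[tree["split"]])
        if child is None:
            # Attribute value never seen at this node during training.
            return tree["default"]
        tree = child
    return tree["leaf"]
```

Because this sketch only creates branches for attribute values that occur in the data, the empty-set case is reached only when the initial record list is empty; it is kept to mirror the procedure as described.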
Nearest-neighbor chain algorithm
In the theory of cluster analysis, the nearest-neighbor chain algorithm is a method that can be used to perform several types of agglomerative hierarchical clustering, using an amount of memory that is linear in the number of points to be clustered and an amount of time linear in the number of distinct distances between pairs of points (that is, quadratic in the number of points). The main idea of the algorithm is to find pairs of clusters to merge by following paths in the nearest-neighbor graph of the clusters until the paths terminate in pairs of mutual nearest neighbors. The algorithm was developed and implemented in 1982 by J. P. Benzécri and J. Juan, based on earlier methods that constructed hierarchical clusterings using mutual nearest-neighbor pairs without taking advantage of nearest-neighbor chains.
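The chain-following idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the excerpt does not name a linkage, so Ward's criterion is assumed here because it can be evaluated from cluster sizes and centroids alone, which keeps memory linear in the number of points. The names `ward_distance` and `nn_chain` and the merge-record format are hypothetical.

```python
def ward_distance(a, b):
    """Ward's merge cost between two clusters given as (size, centroid)."""
    (size_a, cen_a), (size_b, cen_b) = a, b
    sq = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
    return size_a * size_b / (size_a + size_b) * sq


def nn_chain(points):
    """Return merges as (id_a, id_b, distance) tuples.

    Input points get ids 0..n-1; each merged cluster gets the next free id.
    Merges are not necessarily emitted in increasing order of distance.
    """
    clusters = {i: (1, tuple(p)) for i, p in enumerate(points)}
    next_id = len(points)
    merges, chain = [], []

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))  # any active cluster may start a chain

        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None

            # Nearest neighbor of the chain's end; the previous chain element
            # wins ties so that the chain cannot cycle.
            best, best_d = None, float("inf")
            if prev is not None:
                best, best_d = prev, ward_distance(clusters[top], clusters[prev])
            for cid, cluster in clusters.items():
                if cid == top or cid == prev:
                    continue
                d = ward_distance(clusters[top], cluster)
                if d < best_d:
                    best, best_d = cid, d

            if best == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(best)

        # Merge the mutual nearest neighbors at the end of the chain.
        b, a = chain.pop(), chain.pop()
        size_a, cen_a = clusters.pop(a)
        size_b, cen_b = clusters.pop(b)
        size = size_a + size_b
        centroid = tuple(
            (size_a * x + size_b * y) / size for x, y in zip(cen_a, cen_b)
        )
        clusters[next_id] = (size, centroid)
        merges.append((a, b, best_d))
        next_id += 1

    return merges
```

After a merge, the remaining chain is reused rather than rebuilt. That reuse is only valid for linkages where merging two clusters never brings the result closer to a third cluster than both originals were (the reducibility property), which Ward's criterion satisfies.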