Mining Sequential Alarm Patterns in a Telecommunication Database
... a transaction-time associated with each transaction. A sequential pattern also consists of a list of sets of items. The problem is to find all sequential patterns with a user-specified minimum support, where the support of a sequential pattern is the percentage of data-sequences that contain the patt ...
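The support definition in this snippet can be illustrated with a minimal sketch. It assumes a data-sequence is a time-ordered list of item sets and that a pattern is contained when each of its sets is a subset of a distinct transaction, in order. The function names `contains` and `support` and the toy alarm data are hypothetical, and time constraints are ignored.

```python
def contains(data_sequence, pattern):
    """Return True if every set in `pattern` is a subset of some transaction,
    with the matched transactions appearing in the same order as the pattern."""
    pos = 0
    for transaction in data_sequence:
        # Greedily match the next pattern element against the earliest transaction.
        if pos < len(pattern) and pattern[pos] <= transaction:
            pos += 1
    return pos == len(pattern)


def support(data_sequences, pattern):
    """Fraction of data-sequences that contain the pattern."""
    if not data_sequences:
        return 0.0
    hits = sum(contains(seq, pattern) for seq in data_sequences)
    return hits / len(data_sequences)


# Hypothetical alarm data: each data-sequence is a time-ordered list of alarm sets.
sequences = [
    [{"A"}, {"B", "C"}, {"D"}],
    [{"A", "B"}, {"D"}],
    [{"B"}, {"A"}, {"C"}],
    [{"A"}, {"C"}, {"B", "D"}],
]
pattern = [{"A"}, {"D"}]

print(support(sequences, pattern))  # 0.75
```

A pattern would be reported only if this fraction reaches the user-specified minimum support.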
towards outlier detection for high-dimensional data
... research problem. A key observation that motivates this research is that outliers in high-dimensional data are projected outliers, i.e., they are embedded in lower-dimensional subspaces. Detecting projected outliers from high-dimensional stream data is a very challenging task for several reasons. Fir ...
Discovering Multiple Clustering Solutions
... For example, K-MEANS aims at a single partitioning of the data: each object is assigned to exactly one cluster. It aims at one clustering solution: one set of K clusters forming the resulting groups of objects. ⇒ In contrast, we focus on multiple clustering solutions... Müller, Günnemann, Färber, Seidl ...
Mining periodic behaviors of object movements for animal and
... There are many works on mining spatio-temporal patterns (Wang et al. 2003; Mamoulis et al. 2004; Cao et al. 2005; Li et al. 2010a). Mamoulis et al. (2004) detect the periodic patterns for moving objects. However, the work takes the period as an input without discussing how to detect the period automaticall ...
An automatic email mining approach using semantic non
... Jianguo Lu and my thesis committee chair, Dr. Dan Wu for accepting to be in my thesis committee. Your decision, despite your tight schedules, to help in reading the thesis and ...
Temporal Data Mining in Electronic Medical Records from Patients
... across the US. I simulated data to examine SPM performance and found that it is well-suited to extract ...
... Clustering algorithms have focused on the management of numerical and categorical data. However, in recent years, textual information has grown in importance. Proper processing of this kind of information within data mining methods requires an interpretation of its meaning at a semantic level. I ...
doctoral thesis - Department of Cybernetics
... In this thesis we first propose a framework for relational data mining with taxonomic domain knowledge. The proposed framework is based on inductive logic programming and enables efficient handling of taxonomies on concepts and predicates by means of a specialized refinement operator. The operator i ...
tutorial[1]. - Penn State Department of Statistics
... • Constraints are specified to focus on only interesting portions of database – Example: find association rules where the prices of items are at most 200 dollars (max < 200) • Incorporating constraints can result in efficiency – Anti-monotonicity: • When an itemset violates the constraint, so does a ...
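The pruning idea behind an anti-monotone constraint can be sketched as follows. This is an illustration only: it assumes the constraint is "the maximum item price is at most 200 dollars", the prices and the names `satisfies` and `constrained_itemsets` are hypothetical, and support counting is left out so that only the constraint-based pruning is visible.

```python
from itertools import combinations

# Hypothetical item prices in dollars.
PRICES = {"pen": 5, "book": 40, "lamp": 120, "tv": 650}
MAX_PRICE = 200  # assumed threshold, read as "at most 200 dollars"


def satisfies(itemset):
    """Constraint: the most expensive item costs at most MAX_PRICE."""
    return max(PRICES[item] for item in itemset) <= MAX_PRICE


def constrained_itemsets(items):
    """Level-wise enumeration that extends only itemsets satisfying the constraint.

    Because the constraint is anti-monotone, an itemset that violates it is
    never extended: all of its supersets would violate it as well.
    """
    level = [frozenset([item]) for item in sorted(items) if satisfies([item])]
    result = []
    while level:
        result.extend(level)
        next_level = set()
        # Join surviving itemsets that differ in exactly one item.
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == len(a) + 1 and satisfies(union):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
    return result


found = constrained_itemsets(PRICES)
print(len(found))  # 7
```

The item `tv` is discarded at the first level, so none of the eight itemsets containing it is ever generated.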
Locally defined principal curves and surfaces
... and manifolds with a particular intrinsic dimensionality, which we characterize in terms of the gradient and the Hessian of the probability density estimate. The theory lays a geometric understanding of the principal curves and surfaces, and a unifying view for clustering, principal curve fitting an ...
Association Analysis Book Chapter
... A lattice structure can be used to enumerate the list of possible itemsets. For example, Figure 6.1 illustrates all itemsets derivable from the set {A, B, C, D, E}. In general, a data set that contains d items may generate up to 2^d − 1 possible itemsets, excluding the null set. Some of these itemset ...
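The itemset count in this snippet can be checked with a short sketch that enumerates every non-empty subset of {A, B, C, D, E}. The helper name `all_itemsets` is hypothetical.

```python
from itertools import combinations


def all_itemsets(items):
    """Enumerate every non-empty itemset, smallest itemsets first."""
    ordered = sorted(items)
    return [
        frozenset(combo)
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]


itemsets = all_itemsets({"A", "B", "C", "D", "E"})
print(len(itemsets))  # 31, i.e. 2**5 - 1
```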
- Free Documents
... ence in a street address since it effectively changes the house number, while a single letter substitution is semantically insignificant because it is more likely to be caused by a typo or an abbreviation. Therefore, adapting string edit distance to a particular domain requires assigning different w ...
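One way to picture a domain-adapted edit distance is a standard dynamic-programming edit distance with a custom substitution cost. The sketch below is an assumption-laden illustration, not the method of the cited document: the cost values (1.5 for a substitution involving a digit, 0.5 for any other substitution, 1.0 for an insertion or deletion) and the function names are hypothetical.

```python
def substitution_cost(a, b):
    """Assumed costs: changing a digit is expensive, changing a letter is cheap."""
    if a == b:
        return 0.0
    if a.isdigit() or b.isdigit():
        return 1.5  # hypothetical weight: a digit change alters the house number
    return 0.5      # hypothetical weight: likely a typo or abbreviation


def weighted_edit_distance(s, t, indel_cost=1.0):
    """Edit distance with per-character substitution costs (row-by-row DP)."""
    prev = [j * indel_cost for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [i * indel_cost]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + indel_cost,                   # delete a from s
                curr[j - 1] + indel_cost,               # insert b into s
                prev[j - 1] + substitution_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


print(weighted_edit_distance("12 Main St", "13 Main St"))  # 1.5
print(weighted_edit_distance("12 Main St", "12 Maim St"))  # 0.5
```

Under these assumed weights, a one-digit difference in the house number counts three times as much as a one-letter difference in the street name.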
Mining Query Subtopics from Search Log Data
... Most queries are ambiguous or multifaceted [14]. For example, ‘harry shum’ is an ambiguous query, which may refer to an American actor, a vice president of Microsoft, or another person named Harry Shum. ‘Xbox’ is a multifaceted query. When people search for ‘xbox’, they may be looking for informatio ...
Contributions to Automatic Knowledge Extraction from Unstructured
... extracting interesting information. Users need tools to compare documents, for example by effectiveness and relevance, or to find patterns that direct them to further documents. The number of online documents is increasing, and automated document classification is an important challeng ...
Ontology-based semantic anonymisation of microdata
... The exploitation of microdata compiled by statistical agencies is of great interest for the data mining community. However, such data often include sensitive information that can be directly or indirectly related to individuals. Hence, an appropriate anonymisation process is needed to minimise the r ...
Managing Discoveries in The Visual Analytics Process
... the mining process itself, rather than being carried out completely by machines. In VDM, visualizations are utilized to support a specific mining task or display the results of a mining algorithm, such as association rule mining. However, VDM offers little help for knowledge organization and managem ...