Developing Methods for Combining Multiple Clustering of Patterns
Towards the discovery of natural clusters
Hanan Ayad
Supervisor: Prof. Mohamed Kamel
Outline
• Motivations
• Research Summary
• 2003 Publications
• Application to Learning Objects
    • Diverse Sources of Information
    • Multiple Clustering in LO
    • Process Overview
Motivations
• Multiple clustering solutions
    • Different clustering methods
    • Selection of learning parameters (e.g. NNets)
    • Random starts, random ordering of patterns
• Enhance Quality
    • Compensatory effects in clustering methods
    • Repeated fine decompositions
• Distributed clustering
    • Feature-Distributed
        • Multiple partial views
        • Random subspaces
        • Alternative feature reductions
    • Data-Distributed
        • Overlapping subsets of patterns
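
As an illustration of how multiple clustering solutions can arise from random starts and random subspaces, the following is a minimal sketch rather than the method used in this work. It assumes a plain k-means clusterer; the function names `kmeans` and `random_subspace_ensemble` and the toy data are hypothetical.

```python
import random

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)  # random start: k data points as centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def random_subspace_ensemble(data, k, n_clusterings, n_features, seed=0):
    """Cluster the same patterns several times, each time on a random
    subset of the features and from a different random start."""
    rng = random.Random(seed)
    dim = len(data[0])
    ensemble = []
    for _ in range(n_clusterings):
        feats = sorted(rng.sample(range(dim), n_features))  # random subspace
        view = [tuple(row[f] for f in feats) for row in data]
        ensemble.append(kmeans(view, k, rng))
    return ensemble

# Toy data (assumed for illustration): six patterns with three features.
data = [(0.0, 0.1, 5.0), (0.2, 0.0, 5.1), (0.1, 0.2, 4.9),
        (4.0, 4.1, 0.0), (4.2, 3.9, 0.1), (3.9, 4.0, 0.2)]
ensemble = random_subspace_ensemble(data, k=2, n_clusterings=5, n_features=2)
```

Each entry of `ensemble` is one label vector over the same patterns. Label numbering is arbitrary from run to run, which is one reason a combiner typically works on pairwise co-clustering rather than on the raw labels.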
Research Summary
• Measure co-associations between patterns based on their co-clustering (voting)
• Development of combination rules based on shared co-associations: shared nearest neighbors (binary votes, weighted votes, sum rule, product rule, rank-based rule)
    • Determine strength-of-association
    • Accumulate local neighborhood densities of patterns
    • Pattern weights are inversely proportional to their local neighbourhood densities (number and weights of relationships)
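
The steps above can be sketched in a few lines. This is one plausible reading under stated assumptions, not the definitions from the publications: the neighbourhood threshold of 0.5, the exact form of the "binary" and "product" rules, the density taken as the sum of a pattern's association strengths, and all function names are hypothetical.

```python
def co_association(ensemble):
    """Fraction of clusterings in which each pair of patterns is co-clustered."""
    n = len(ensemble[0])
    votes = [[0] * n for _ in range(n)]
    for labels in ensemble:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1  # one vote per clustering that agrees
    return [[v / len(ensemble) for v in row] for row in votes]

def shared_neighbor_strength(co, threshold=0.5, rule="binary"):
    """Strength of association between patterns from their shared neighbours.

    A pattern's neighbours are those co-associated with it at or above
    `threshold` (assumed). "binary" counts shared neighbours; "product"
    sums co[i][k] * co[j][k] over the shared neighbours k.
    """
    n = len(co)
    neighbors = [{j for j in range(n) if co[i][j] >= threshold} for i in range(n)]
    strength = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = neighbors[i] & neighbors[j]
            if rule == "binary":
                strength[i][j] = float(len(shared))
            else:
                strength[i][j] = sum(co[i][k] * co[j][k] for k in shared)
    return strength

def pattern_weights(strength):
    """Weight each pattern inversely to its local neighbourhood density,
    here taken as the sum of its association strengths (assumed)."""
    weights = []
    for row in strength:
        density = sum(row)
        weights.append(1.0 / density if density > 0 else 0.0)
    return weights

# Three clusterings of five patterns (toy labels, assumed for illustration).
ensemble = [[0, 0, 0, 1, 1],
            [1, 1, 0, 0, 0],
            [0, 0, 1, 1, 1]]
co = co_association(ensemble)
binary = shared_neighbor_strength(co, rule="binary")
product = shared_neighbor_strength(co, rule="product")
weights = pattern_weights(binary)
```

In this toy case patterns 0 and 1 are always co-clustered, so `co[0][1]` is 1.0, while patterns 0 and 3 never are, so `co[0][3]` is 0.0. Patterns in the denser group (2, 3, 4) receive smaller weights than those in the sparser group (0, 1).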
Research Summary, Cont’d
• Pruning of associations for efficiency and assessment of behaviour
    • Effect on mutuality of relationships
    • Improving quality using subsets of pattern relations
    • Study of convergence and stability
• Induce a graph representing the patterns' weighted relationships and the patterns' own weights: the Weighted Shared Nearest Neighbors Graph (WSnnG)
• The graph is partitioned, resulting in an integrated clustering of the objects
    • Use of the graph partitioning package METIS
    • Minimize the edge-cut subject to the weights of vertices being equally distributed among the clusters
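
To make the pruning step and the partitioning objective concrete, here is a small sketch under stated assumptions. It does not call METIS: the exhaustive two-way search below is a stand-in that is only feasible for tiny graphs, and the top-k pruning rule, the balance tolerance, the toy graph, and all function names are hypothetical.

```python
from itertools import combinations

def prune_top_k(strength, k, mutual=False):
    """Keep, for each vertex, only its k strongest relationships.

    Returns undirected edges as {(i, j): weight} with i < j. With
    mutual=True an edge survives only if each endpoint keeps the other.
    """
    n = len(strength)
    kept = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i and strength[i][j] > 0),
                        key=lambda j: -strength[i][j])
        kept.append(set(ranked[:k]))
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            in_i, in_j = j in kept[i], i in kept[j]
            if (in_i and in_j) if mutual else (in_i or in_j):
                edges[(i, j)] = strength[i][j]
    return edges

def edge_cut(edges, part):
    """Total weight of edges whose endpoints fall in different clusters."""
    return sum(w for (i, j), w in edges.items() if part[i] != part[j])

def best_balanced_bipartition(edges, vertex_weights, tolerance=0.1):
    """Exhaustively find the 2-way partition with minimum edge-cut whose
    sides each carry about half of the total vertex weight."""
    n = len(vertex_weights)
    total = sum(vertex_weights)
    best = None
    for size in range(1, n):
        for side in combinations(range(n), size):
            if 0 not in side:
                continue  # fix vertex 0 on side 0 to skip mirrored partitions
            w = sum(vertex_weights[v] for v in side)
            if abs(w - total / 2) > tolerance * total:
                continue  # violates the balance constraint
            part = [0 if v in side else 1 for v in range(n)]
            cut = edge_cut(edges, part)
            if best is None or cut < best[0]:
                best = (cut, part)
    return best

# Toy graph (assumed): two triangles joined by one weak relationship (2, 3).
pairs = {(0, 1): 3, (0, 2): 3, (1, 2): 3, (3, 4): 3, (3, 5): 3, (4, 5): 3, (2, 3): 1}
strength = [[0] * 6 for _ in range(6)]
for (i, j), w in pairs.items():
    strength[i][j] = strength[j][i] = w

edges_k2 = prune_top_k(strength, k=2)   # the weak bridge is pruned away
edges_k3 = prune_top_k(strength, k=3)   # every relationship is kept
cut, part = best_balanced_bipartition(edges_k3, [1.0] * 6)
```

With `k=2` the weak bridge between the two triangles is dropped, while with `k=3` it survives and becomes the only edge cut by the balanced partition. A real graph of any size would need a dedicated partitioner such as METIS rather than this exhaustive search.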
2003 Publications
• H. Ayad and M. Kamel. "Finding Natural Clusters Using Multi-Clusterer Combiner Based on Shared Nearest Neighbors". Multiple Classifier Systems: Fourth International Workshop, MCS 2003, Guildford, Surrey, United Kingdom, June 11-13. Proceedings.
• H. Ayad and M. Kamel. "Refined Shared Nearest Neighbors Graph for Combining Multiple Data Clusterings". The 5th International Symposium on Intelligent Data Analysis, IDA 2003, Berlin, Germany. Proceedings. LNCS, Springer, August 2003.
• H. Ayad and M. Kamel. "Development of New Methods for Combining Cluster Ensembles". Ongoing journal paper.
Application to Learning Objects
Diverse Sources of Information
• Meta Data
    • Standardized indexing, content structure and organization
• Intelligent Content Mining
    • Natural Language Understanding
    • Image Analysis and Understanding
    • Automatic Speech Recognition
    • Statistical Learning
• Re-Use/Learning Scenarios
    • Dynamic assembly: objects are grouped and regrouped with other objects
Application to Learning Objects
Multiple Clustering in LO
• Clusters of Learning Objects
    • Multiple distributed taxonomies
    • Info. sources: different sets of meta data, different re-use/assembling scenarios
    • Dynamic environment, clusters based on partial views
• Combining of multiple clustering
    • Mining complex web of relationships integrating multiple objects clustering
    • Discovery of combined multi-view clusters
Application to Learning Objects
Process Overview
[Diagram: Learning Objects are described by Meta Data 1 ... m and Re-use Scenarios 1 ... s; these sources yield Clustering 1 ... r, which feed "Combining Multiple Objects Clustering" to produce Integrated Learning Objects Clusters.]