College of Science & Technology
Dept. of Computer Science & IT
BCs of Information Technology
Data Mining
Chapter 6: Clustering Methods
2013
Prepared by: Mahmoud Rafeek Al-Farra
www.cst.ps/staff/mfarra

Course Outline
- Introduction
- Data Preparation and Preprocessing
- Data Representation
- Classification Methods
- Evaluation
- Clustering Methods
- Mid Exam
- Association Rules
- Knowledge Representation
- Special Case Study: Document Clustering
- Discussion of Case Studies by Students

Outline
- Definition of Clustering
- Why clustering?
- Where to use clustering?
- Next: Types of Data in Cluster Analysis
- Next: A Categorization of Major Clustering Methods

Definition of Clustering
Clustering can be considered the most important unsupervised learning technique; like every other problem of this kind, it deals with finding structure in a collection of unlabeled data.
Clustering is “the process of organizing objects into groups whose members are similar in some way”.
A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.

Definition of Clustering
- Cluster: a collection of data objects
  - Similar to one another within the same cluster
  - Dissimilar to the objects in other clusters
- Cluster analysis: grouping a set of data objects into clusters
- Clustering is unsupervised classification: no predefined classes

Learning

Why clustering?
- Simplification
- Pattern detection
- Useful in data concept construction
- Unsupervised learning process

Where to use clustering?
- Data mining
- Information retrieval
- Text mining
- Web analysis
- Marketing
- Medical diagnostics

Which method should I use?
- Type of attributes in data
- Scalability to larger datasets
- Ability to work with irregular data
- Time cost complexity
- Data order dependency
- Result presentation

Thanks
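
As a rough illustration of the definition of clustering given in these slides (objects similar to one another within a cluster, dissimilar to objects in other clusters, with no predefined classes), the sketch below groups six unlabeled 2-D points. It uses a toy version of k-means, which the slides themselves do not cover; it is chosen here only as one familiar method. The sample points, the function names, and the choice of the first k points as initial centers are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, iterations=10):
    """Toy k-means: assign each point to its nearest center, then recompute centers.

    Assumption for this sketch: the first k points serve as the initial centers.
    """
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [min(range(k), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = mean_point(members)
    return labels, centers

# Hypothetical unlabeled data: two visibly separate groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 0, 1, 1]
```

No class labels are supplied at any point; the two groups emerge only from the distances between the objects, which is what "unsupervised classification" means here. A real application would also have to weigh the criteria listed under "Which method should I use?", such as attribute types and scalability.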