Download Kuvankäsittelyn demot

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Content and material:
Clustering Methods
Spring 2007
1. Introduction
- Applications: cluster analysis, quantization, data reduction, pattern matching.
- Cluster software.
- K-means algorithm
- Clustering problem definitions.
2. Algorithms
- Hierarchical algorithms: Divisive, Agglomerative, MST-based.
- Iterative algorithms
- Self-organizing map (SOM)
- Optimization techniques (local search, GA, Tabu search)
- Optimal clustering (Branch-and-Bound, DP for 1-d case)
- Projection-based (Cleju’s thesis, Mantao’s paper)
- Comparison (Tomi+Marko manuscript, PR paper)
3. Number of clusters / cluster validity
- DBI, F-ratio…
- Stochastic complexity (Mantao’s PRL paper)
4. Preprocessing and metrics
- Distance metrics and distortion functions
- 0/1 normalization, RCMS (??)
- Outlier detection / removal (ICPR paper, SCIA paper)
- Missing values: recognition and fixing.
- Binary data, categorical data, non-attributive data
5. Search techqniques / nearest neighbor search
- Partial distortion search
- Projection-based search (MPS, PCA, some other projection axis?)
- Neighborhood graph
6. Visualization
- Graph-based
- Circle (?)
- Projection and sorting
- SOM
7. Data dimensionality
- Estimation of dimensionality
- Dimensional reduction
- Sphere packing / kissing number problem
8. Other models
- Fuzzy clustering
- Gaussian Mixture model
- Image segmentation
- Spatial clustering
Applications:








Data compression
Text mining
Speaker recognition
Customer relationship management
Shopping basquet analysis
Business intelligence
Oracle http://www.oracle.com/technology/products/bi/odm/index.html
Case example: Amazon (find similar books!)
Material:





Cluster-software, GhostMiner, Oracle(?)
Pseuco codes
Data: S sets, Dim, Birch, Image
Data: UCI, Delve dataset
SA hakemus, Singapore-esitelmä,
Articles / Pointers














SIPU papers + web material
Theodoridis and Koutroumbas, Pattern Recognition, Academic Press, 3rd ed., 2006.
http://cgi.di.uoa.gr/~stpatrec/welcome3d.html
Christopher Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Timo Kaukoranta, PhD thesis, TUCS, 1999.
Juha Kivijärvi, PhD thesis, TUCS, 2004.
Mantao Xu, PhD thesis, Joensuu, 2005.
Ismo Kärkkäinen, PhD thesis, Joensuu, 2006.
Jarkko Venna, PhD thesis, Tech.Univ. of Helsinki, 2007 (to appear)
Sami Äyrämö, PhD thesis, Knowledge Mining Using Robust Clustering, J-kylä, 2006.
http://dissertations.jyu.fi/studcomp/9513926559.pdf
Data: http://www.ics.uci.edu/~mlearn/MLRepository.html
Data: http://www.cs.toronto.edu/~delve/data/datasets.html
Data: http://kdd.ics.uci.edu/summary.data.type.html
Data: http://lib.stat.cmu.edu/
Data: http://cs.joensuu.fi/pages/franti/ts/