STMIK AMIKOM Yogyakarta
Chapter 9
Clustering Algorithms
and WEKA
Clustering
K-Means
Case
Sulidar Fitri, M.Sc
Data Mining
© Sulidar Fitri, M.Sc
REFERENCES
• Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques. 2006. Department of Computer Science, University of Illinois at Urbana-Champaign. www.cs.uiuc.edu/~hanj
• Ian H. Witten, Eibe Frank, Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques, Third Edition. 2011. Elsevier.
• Kusrini and Luthfi, E. Algoritma Data Mining. 2009. Penerbit Andi.
• Kusrini. Pattern Recognition.
• WEKA
Clustering
Introduction
Classification, the previous data mining task, partitions data based on a pre-classified training sample.
Clustering, by contrast, is an automated process that groups related records together without pre-assigned classes.
Records are considered related when they have similar values for their attributes.
The groups are usually disjoint.
Via (Yohana, 2011)
(Larose, 2005)
Example case: discretizing a continuous class
Input
Initial data, either continuous or discrete
Delta, the value used to determine the allowed difference between a centroid and the mean
Output: a mapping table containing the discrete classes and their centroid values
Steps
Process:
1. Determine the number of clusters
2. Allocate the data to the clusters at random
3. Compute the centroid/mean of the data in each cluster
4. Allocate each data point to the nearest centroid/mean
5. Return to Step 3 if any data point still moves to a different cluster, or if the change in any centroid value is above the specified threshold, or if the change in the value of the objective function in use is above the specified threshold
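The five steps above can be sketched in Python for one-dimensional data. This is a minimal illustration under stated assumptions, not the slides' own implementation: the function name `kmeans_1d`, the `seed` and `max_iter` parameters, and the handling of an emptied cluster are all hypothetical choices, and only the first two stopping tests of Step 5 (moved points, centroid shift) are implemented, not the objective-function test.

```python
import random

def kmeans_1d(data, k, threshold=0.0, seed=0, max_iter=100):
    """Cluster 1-D values with the five steps listed above (a sketch)."""
    rng = random.Random(seed)
    # Step 2: random allocation. Labels are dealt round-robin and then
    # shuffled so every cluster starts non-empty (assumes len(data) >= k).
    labels = [i % k for i in range(len(data))]
    rng.shuffle(labels)
    centroids = None
    for _ in range(max_iter):
        # Step 3: the centroid of each cluster is the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:
                new_centroids.append(sum(members) / len(members))
            else:
                # An emptied cluster keeps its previous centroid (assumption).
                new_centroids.append(centroids[c])
        # Step 4: reassign every value to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(x - new_centroids[c])) for x in data
        ]
        # Step 5: repeat while any value moved or any centroid shifted
        # by more than the threshold.
        moved = new_labels != labels
        if centroids is None:
            shift = float("inf")
        else:
            shift = max(abs(n - o) for n, o in zip(new_centroids, centroids))
        labels, centroids = new_labels, new_centroids
        if not moved and shift <= threshold:
            break
    return labels, centroids

# Step 1: the number of clusters is chosen by the caller.
data = [79, 85, 83, 90, 82, 81, 85, 87, 89, 84]
labels, centroids = kmeans_1d(data, k=3)
```

Because the initial allocation is random, a different `seed` may lead to a different final grouping; that sensitivity is a known property of k-means, not an artifact of this sketch.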
Choosing the centroids: at random or determined
by a formula
Input: 79, 85, 83, 90, 82, 81, 85, 87, 89 and 84
Number of target classes: 3
Delta: 0.01
Process:
Min: 79
Max: 90
Error tolerance: 0.01 * (90 - 79) = 0.11
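The tolerance arithmetic above can be checked with a few lines of Python. The variable names are illustrative only; the data and delta are the ones given on the slide.

```python
data = [79, 85, 83, 90, 82, 81, 85, 87, 89, 84]
delta = 0.01

lo, hi = min(data), max(data)
# Error tolerance = delta times the range of the data.
tolerance = delta * (hi - lo)

print(lo, hi, round(tolerance, 2))  # prints: 79 90 0.11
```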
Min: 79, max: 90
Initial centroids C2 and C3?
0.92 > error (0.11)
The mean becomes
the new centroid
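The comparison on this slide can be read as a per-cluster update rule: when the mean of a cluster differs from its centroid by more than the error tolerance, the mean replaces the centroid. The sketch below illustrates that reading; the function name `next_centroid` and the sample cluster are hypothetical and are not taken from the slides.

```python
def next_centroid(centroid, members, tolerance):
    """Return (new_centroid, changed) for one cluster."""
    mean = sum(members) / len(members)
    if abs(mean - centroid) > tolerance:
        # Difference above the tolerance: the mean becomes the new centroid.
        return mean, True
    # Difference within the tolerance: keep the current centroid.
    return centroid, False

# Hypothetical cluster and starting centroid, for illustration only.
new_c, changed = next_centroid(80.0, [79, 81, 82, 83], tolerance=0.11)
```

Iteration stops once no cluster reports a change.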
WEKA PRACTICE
Clustering
• Open WEKA and load the .arff data
• Select the Cluster tab
• Choose the kMeans algorithm
• Set the desired number of clusters/groups
• Start
• Read the output
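To try these steps with the sample values from the worked example, the data first has to be in ARFF form. The following Python sketch builds a minimal single-attribute ARFF document; the helper `to_arff`, the relation name `scores`, and the attribute name `score` are assumptions made for illustration, since the slides do not name the data file or its attributes.

```python
def to_arff(values, relation="scores", attribute="score"):
    """Build a minimal ARFF document with one numeric attribute."""
    lines = [
        f"@relation {relation}",
        "",
        f"@attribute {attribute} numeric",
        "",
        "@data",
    ]
    # One instance per line in the data section.
    lines += [str(v) for v in values]
    return "\n".join(lines) + "\n"

data = [79, 85, 83, 90, 82, 81, 85, 87, 89, 84]
arff_text = to_arff(data)
print(arff_text)
```

Saving the printed text under a name such as scores.arff (a hypothetical file name) gives a file that can be loaded in the first step of the list above.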
GET STARTED