CS/CMPE 536 – Data Mining
Outline
Description
A comprehensive introduction to the concepts and techniques in data mining
• data mining process – its need and motivation
• data mining tasks and functionalities
• association rule mining
• clustering
• text and web mining
• mining sequential data
• evaluation of DM tools and programming of algorithms in C/C++/Java
Emphasis on concept building, algorithm evaluation and applications
CS 536 - Data Mining (Au 2004/2005) - Asim Karim @ LUMS
Goals
To provide a comprehensive introduction to data mining
To develop conceptual and theoretical understanding of the data mining process
To provide hands-on experience in the implementation and evaluation of data mining algorithms and tools
To develop interest in data mining research
After Taking this Course…
You should be able to …
• understand the need and motivation for data mining
• understand the characteristics of different data mining tasks
• decide what data mining task and algorithm to use for a given problem/data set
• implement and evaluate data mining solutions
• use commercially available DM tools
Before Taking This Course…
You should be comfortable with…
Data structures and algorithms!
• CS213 is a prerequisite
• You should be comfortable with algorithm descriptions and implementations in a high-level programming language
Databases
• Understanding of the database concept and familiarity with database terms and terminology
• CS341 is recommended, not required
Basic math background
• Algebra, calculus, etc.
Programming in a high-level language
• C/C++ or Java
Grading
Points distribution
Quizzes (5 to 6)                 10%
Assignments (hand + computer)    20%
Project                          15%
Midterm exam                     25%
Final exam (comprehensive)       30%
Policies (1)
Quizzes
• Most quizzes will be announced a day or two in advance
• Unannounced quizzes are also possible
Sharing
• No copying is allowed for assignments. Discussions are encouraged; however, you must submit your own work
• Violators can face mark reduction and/or be reported to the Disciplinary Committee
Plagiarism
• Do NOT pass off someone else’s work as yours! Write in your own words and cite the reference. This applies to code as well.
Policies (2)
Submission policy
• Submissions are due on the day and at the time specified
• Late penalties: 1 day late = 10%; 2 days late = 20%; not accepted after 2 days
• An extension will be granted only if there is a need and it is requested several days in advance
Classroom behavior
• Maintain classroom sanctity by remaining quiet and attentive
• If you need to talk and gossip, please leave the classroom so as not to disturb others
• Dozing is allowed provided you do not snore loudly
Project
Design, implementation and evaluation of a data mining application
You may choose a problem of your liking (after consultation with me) or select one suggested by me
You may do the project in groups (of 2)
Start thinking about the project now
Summarized Course Contents
Introduction and motivation
The data mining process – tasks and functionalities
Data preprocessing for data mining – data cleaning, reduction, summarization, normalization, etc.
Mining association rules – algorithms and applications
Mining by clustering – algorithms and applications
Mining text and web data
Mining sequential data
Course Material
Required textbook
• Data Mining: Concepts and Techniques, Han and Kamber, 2001.
Supplementary material
• Data Mining: Introductory and Advanced Topics, Dunham, Pearson Education, 2003.
• Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Witten et al., Morgan Kaufmann, 006.3 W829D, 2000.
• Handouts (as and when necessary)
Other resources
• Books in library
• Web
Course Web Site
For announcements, lecture slides, handouts, assignments, quiz solutions, web resources:
http://suraj.lums.edu.pk/~cs536a04/
The resource page has links to information available on the Web. It is basically a meta-list for finding further information.
Other Stuff
How to contact me?
• Office hours: 3.00 to 4.30 PM TR (office: 429)
• E-mail: [email protected]
• By appointment: e-mail me for an appointment before coming
Philosophy
• Knowledge cannot be taught; it is learned.
• Be excited. That is the best way to learn. I cannot teach everything in class. Develop an inquisitive mind, ask questions, and go beyond what is required.
• I don’t believe in strict grading. But… there has to be a way of rewarding performance.
Reference Books in LUMS Library (1)
Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic, 006.3 K167D, 2003.
Principles of Data Mining, Hand and Mannila, 006.3 H236P, 2001.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani and Jerome Friedman, 006.31 H356E, 2001.
Data Mining and Uncertain Reasoning: An Integrated Approach, Zhengxin Chen, 006.321 C518D, 2001.
Graphical Models: Methods for Data Analysis and Mining, Christian Borgelt and Rudolf Kruse, 006.3 B732G, 2001.
Information Visualization in Data Mining and Knowledge Discovery, Usama Fayyad (ed.), 006.3 I434, 2002.
Intelligent Data Warehousing: From Data Preparation to Data Mining, Zhengxin Chen, 005.74 C518I, 2002.
Machine Learning and Data Mining: Methods and Applications, Ryszard S. Michalski, Ivan Bratko and Miroslav Kubat (eds.), 006.31 M149, 1999.
Reference Books in LUMS Library (2)
Managing and Mining Multimedia Databases, Bhavani Thuraisingham, 006.7 T536M, 2001.
Mastering Data Mining: The Art and Science of Customer Relationship Management, Michael J.A. Berry and Gordon Linoff, 006.3 B534M, 2000.
Data Mining Explained: A Manager's Guide to Customer-Centric Business Intelligence, Rhonda Delmater and Monte Hancock, 006.3 D359D, 2001.
Data Mining Solutions: Methods and Tools for Solving Real-World Problems, Christopher Westphal and Teresa Blaxton, 006.3 W537D, 1998.