VICTORIA UNIVERSITY OF WELLINGTON
Te Whare Wananga o te Upoko o te Ika a Maui
VUW
School of Engineering and Computer Science
COMP422
Welcome to COMP 422
Data Mining, Neural Networks, and Genetic Programming
Mengjie Zhang, Bing Xue, Yi Mei, Harith Al-Sahaf
{mengjie.zhang, bing.xue, yi.mei, harith.al-sahaf}@ecs.vuw.ac.nz
Outline
• Who and where
• Course focus and objectives
• Course Texts
• Course components and workload
• Assessment
• Course materials
• Suggested reading and exercises
Who and Where
• Course Coordinator and Lecturer: Prof Mengjie Zhang (Meng, mengjie.zhang@ecs.vuw.ac.nz, CO355)
• Lecturer: Bing Xue (bing.xue@ecs.vuw.ac.nz, CO352)
• Lecturer and Tutor: Yi Mei (yi.mei@ecs.vuw.ac.nz, CO351)
• (Casual) Lecturer and Tutor: Harith Al-Sahaf (harith.al-sahaf@ecs.vuw.ac.nz, CO325)
• Lectures/Tutorials/Discussions:
– Monday: 2:10–4:00, MY 103 (2 hours)
– Wednesday: 3:10–4:00, MY 103
• You can ask us questions at any time!
• Get to know each other!!
Course Focus
This course focuses on data mining algorithms, techniques, tasks, and applications.
• Main topics:
– KDD/DM concepts, process, tasks, basic algorithms
– DM example: Image analysis and object recognition
– DM measure: Performance evaluation
– DM algorithms: Neural networks and applications
– DM algorithms: Genetic programming (other EC techniques: PSO, LCS, GAs) and applications
– DM algorithms: Feature analysis
– Learning Theory
• Relationship between these topics
Course Goals
• Understand the key concepts, theories, and tasks of DM.
• Understand the main strengths and limitations of commonly used DM algorithms and how to apply them to a particular task.
• Understand key concepts and tasks in computer vision and image processing.
• Select/develop good features and algorithms for object recognition, particularly classification.
• Use neural networks and genetic programming techniques for data mining tasks.
• Analyse a given DM task/problem and choose/develop a good algorithm to solve it.
• Select an appropriate criterion to evaluate data mining/learning systems such as a classifier.
• Present your work in writing and oral presentation.
Course Text
No assigned textbooks. Check with the library. Good references:
• Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. Addison Wesley, 2006 (or Pearson New International Edition, 2014).
• Dursun Delen. Real-World Data Mining. Pearson, 2015.
• Margaret H. Dunham. Data Mining: Introductory and Advanced Topics. Prentice Hall, 2003.
• David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach. Prentice Hall, 2003.
• Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Pearson Education Inc., 3rd edition, 2010.
• Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee. A Field Guide to Genetic Programming. 2008. Freely available at http://www.gp-field-guide.org.uk.
• Michael Affenzeller. Genetic Algorithms and Genetic Programming: Modern Concepts and Practical Applications. CRC Press, 2009.
Check the course website and the ECRG website for more detail.
Practical Work
• Projects: two
– Project 1: Due 11:59pm Monday 15 August 2016 (Week 6)
– Project 2: Due 11:59pm Monday 10 October 2016 (Week 12)
• Labs/tutorials: in lectures for discussions
• Workload: the expected workload for the paper is 10 hours on
average per week including 3 hours of lectures (sometimes 2
lectures per week).
Course Components
Target 32-34 lectures (Monday has two lectures/hours, Wednesday has one lecture/hour).
• Lectures – Concepts, theories, ideas, methods
• Discussions – more thinking for students, review of the concepts and theories.
• Projects – Practice and presentation of your work, analysis,
new/extended ideas and methods.
• Paper Review – Check other researchers’ methods and compare them with those used/developed by yourselves.
• Demonstration – Project 1 will involve a demonstration.
• Writing skills – Project 2 will involve more writing.
Course Schedule (Draft)
• Week 1: Introduction to COMP 422, Data Mining and KDD Process.
• Week 2: Data mining tasks and algorithms.
• Week 3: DM Applications: Image Analysis and Object Classification.
• Week 5: Performance Evaluation
• Week 4: Feature Extraction and Selection
• Week 6: Neural Networks and Neural Engineering
• Week 7: Neural network simulator and special architecture
• Week 8: Neural Networks for Object Recognition, Introduction to EC
• Week 9: Data Preprocessing and Feature Manipulation
• Week 10: Feature Manipulation and Advanced Topics in EC and Data Mining
• Week 11: Advanced topics in Evolutionary Computation and Data Mining
• Week 12: Learning Theory and Summary
Course Assessment
• Overall marks:
– The two projects contribute 40% (20% + 20%) of the total marks.
– The final examination is worth 60% of the total marks.
• To pass the course you must obtain at least a C- grade overall and meet the mandatory requirements:
– Submit both projects with reasonable attempts;
– Obtain at least a D grade in the final exam; and
– Attend at least 26 lectures for discussions.
• READ the course outline
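The assessment weights and mandatory requirements listed for this course can be sketched as a small calculation. This is only an illustration: the weights (20% + 20% + 60%) and the 26-lecture minimum come from the slides, but the percentage cut-offs used for the C- and D grades are assumptions (the slides do not state them), and the `StudentRecord` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Weights stated on the assessment slide.
PROJECT_WEIGHT = 0.20  # each of the two projects
EXAM_WEIGHT = 0.60
MIN_LECTURES = 26      # minimum attendance stated on the slide

# ASSUMPTION: the slides give grades (C-, D) but no percentage cut-offs.
# These values are illustrative placeholders, not official boundaries.
ASSUMED_C_MINUS_CUTOFF = 50.0
ASSUMED_D_CUTOFF = 40.0


@dataclass
class StudentRecord:  # hypothetical container for one student's marks
    project1: Optional[float]  # mark out of 100, None if not submitted
    project2: Optional[float]
    exam: float                # mark out of 100
    lectures_attended: int


def overall_mark(r: StudentRecord) -> float:
    """Weighted overall mark out of 100; a missing project counts as 0."""
    p1 = r.project1 if r.project1 is not None else 0.0
    p2 = r.project2 if r.project2 is not None else 0.0
    return PROJECT_WEIGHT * p1 + PROJECT_WEIGHT * p2 + EXAM_WEIGHT * r.exam


def passes(r: StudentRecord) -> bool:
    """Check the overall grade and each mandatory requirement."""
    both_submitted = r.project1 is not None and r.project2 is not None
    return (
        overall_mark(r) >= ASSUMED_C_MINUS_CUTOFF
        and both_submitted
        and r.exam >= ASSUMED_D_CUTOFF
        and r.lectures_attended >= MIN_LECTURES
    )


if __name__ == "__main__":
    student = StudentRecord(project1=80, project2=70, exam=60, lectures_attended=28)
    print(round(overall_mark(student), 1), passes(student))
```

Note that a high overall mark is not sufficient on its own: a record that misses any one mandatory requirement fails the check regardless of the weighted total.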
Background Knowledge and Related Courses
• COMP307 – Introduction to AI
– Search techniques
– Knowledge based systems
– Learning concepts and methods
• COMP302/SWEN304 and COMP442/SWEN432: large collection of data, data warehouse, data marts, (data record retrieval)
• Mathematics: discrete mathematics, calculus, probability, statistics
• Programming Language (any of C/C++, Java, Pascal, ...)
• COMP 473 (T1), COMP 421 (T2), and COMP 423 (T1)
Course Materials
• Course web page: http://ecs.victoria.ac.nz/Courses/COMP422_2016T2/
– Course outlines
– Lecture notes
– Announcements
– Assignments
– Past exam papers
– Important Links
– Web submissions
• Course public directory: contains a lot of useful source code, packages, executable programs, documents, data files, ...
/vol/courses/comp422/
• These materials will be updated from time to time
Rules and Policies
• Lab hours and access;
• Lab Rules;
• Printing Allocations;
• Special Needs Students;
• Student Support;
• Plagiarism;
• Withdrawals;
• Student Grievances.
Check course outline!!!
Summary
• General issues for the course: course focus, course outline, lectures, practical work, assessment, web page, ...
• Suggested reading: course outline, a paper, and announcements and useful links from the course web site.
• Questions:
– Why is data mining (DM) necessary?
– What are DM and knowledge discovery in databases?
– What is the KDD process?
– What are the relevant fields of KDD?
– What are common DM tasks?
– What is the difference between DM and data warehousing?
– What is the difference between DM and record retrieval?