VICTORIA UNIVERSITY OF WELLINGTON
Te Whare Wananga o te Upoko o te Ika a Maui
School of Engineering and Computer Science

Welcome to COMP 422
Data Mining, Neural Networks, and Genetic Programming

Mengjie Zhang, Bing Xue, Yi Mei, Harith Al-Sahaf
{mengjie.zhang, bing.xue, yi.mei, harith.al-sahaf}@ecs.vuw.ac.nz

Outline
• Who and where
• Course focus and objectives
• Course Texts
• Course components and workload
• Assessment
• Course materials
• Suggested reading and exercises

Who and Where
• Course Coordinator and Lecturer: Prof Mengjie Zhang (Meng, CO355)
• Lecturer: Bing Xue (CO352)
• Lecturer and Tutor: Yi Mei (CO351)
• (Casual) Lecturer and Tutor: Harith Al-Sahaf (CO325)
• Lectures/Tutorials/Discussions:
  – Monday: 2:10–4:00, MY 103 (2 hours)
  – Wednesday: 3:10–4:00, MY 103
• You can ask us questions at any time!
• Get to know each other!!

Course Focus
This course is focused on data mining algorithms, techniques, tasks, and applications.
• Main topics:
  – KDD/DM concepts, process, tasks, basic algorithms
  – DM example: Image analysis and object recognition
  – DM measure: Performance evaluation
  – DM algorithms: Neural networks and applications
  – DM algorithms: Genetic programming (other EC techniques: PSO, LCS, GAs) and applications
  – DM algorithms: Feature analysis
  – Learning Theory
• Relationship between these topics

Course Goals
• Understand the key concepts, theories, and tasks of DM.
• Understand the main strengths and limitations of commonly used DM algorithms and how to apply them to a particular task.
• Understand key concepts and tasks in computer vision and image processing.
• Select/develop good features and algorithms for object recognition, particularly classification.
• Use neural networks and genetic programming techniques for data mining tasks.
• Analyse a given DM task/problem and choose/develop a good algorithm to solve it.
• Select an appropriate criterion to evaluate data mining/learning systems such as a classifier.
• Present your work in writing and oral presentation.

Course Text
No assigned textbooks. Check with the library. Good references:
• Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. Addison Wesley, 2006 (or Pearson New International Edition, 2014).
• Dursun Delen. Real-World Data Mining. Pearson, 2015.
• Margaret H. Dunham. Data Mining: Introductory and Advanced Topics. Prentice Hall, 2003.
• David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach. Prentice Hall, 2003.
• Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Pearson Education Inc., 3rd edition, 2010.
• Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee. A Field Guide to Genetic Programming. 2008. Freely available at http://www.gp-field-guide.org.uk.
• Michael Affenzeller. Genetic Algorithms and Genetic Programming: Modern Concepts and Practical Applications. CRC Press, 2009.
Check the course website and the ECRG website for more detail.

Practical Work
• Projects: two
  – Project 1: Due 11:59pm Monday 15 August 2016 (Week 6)
  – Project 2: Due 11:59pm Monday 10 October 2016 (Week 12)
• Labs/tutorials: in lectures for discussions
• Workload: the expected workload for the paper is 10 hours on average per week, including 3 hours of lectures (sometimes 2 lectures per week).

Course Components
Target 32-34 lectures (Monday has one lecture/hour, Wednesday has two lectures/hours).
• Lectures
  – Concepts, theories, ideas, methods.
• Discussions
  – More thinking for students, review of the concepts and theories.
• Projects
  – Practice and presentation of your work, analysis, new/extended ideas and methods.
• Paper Review
  – Check other researchers' methods, comparing with those used/developed by yourselves.
• Demonstration
  – Project 1 will involve a demonstration.
• Writing skills
  – Project 2 will involve more writing.

Course Schedule (Draft)
• Week 1: Introduction to COMP 422, Data Mining and KDD Process
• Week 2: Data Mining Tasks and Algorithms
• Week 3: DM Applications: Image Analysis and Object Classification
• Week 4: Feature Extraction and Selection
• Week 5: Performance Evaluation
• Week 6: Neural Networks and Neural Engineering
• Week 7: Neural Network Simulator and Special Architecture
• Week 8: Neural Networks for Object Recognition, Introduction to EC
• Week 9: Data Preprocessing and Feature Manipulation
• Week 10: Feature Manipulation and Advanced Topics in EC and Data Mining
• Week 11: Advanced Topics in Evolutionary Computation and Data Mining
• Week 12: Learning Theory and Summary

Course Assessment
• Overall marks:
  – The two projects contribute 40% (20% + 20%) of the total marks.
  – The final examination is worth 60% of the total marks.
• To pass the course you must obtain at least a C- grade overall and meet the mandatory requirements:
  – Submit both projects with reasonable attempts;
  – Obtain at least a D grade in the final exam; and
  – Attend at least 26 lectures for discussions.
• READ the course outline

Background Knowledge and Related Courses
• COMP307: Introduction to AI
  – Search techniques
  – Knowledge based systems
  – Learning concepts and methods
• COMP302/SWEN304 and COMP442/SWEN432: large collections of data, data warehouses, data marts (data record retrieval)
• Mathematics: statistics, discrete mathematics, calculus, probability
• Programming Language (any of C/C++, Java, Pascal, ...)
• COMP 473 (T1), COMP 421 (T2), and 423 (T1)

Course Materials
• Course web page: http://ecs.victoria.ac.nz/Courses/COMP422 2016T2/
  – Course outlines
  – Lecture notes
  – Announcements
  – Assignments
  – Past exam papers
  – Important Links
  – Web submissions
• Course public directory: /vol/courses/comp422/
  – Contains a lot of useful source code, packages, executable programs, documents, data files, ...
• These materials will be updated from time to time

Rules and Policies
• Lab hours and access;
• Lab Rules;
• Printing Allocations;
• Special Needs Students;
• Student Support;
• Plagiarism;
• Withdrawals;
• Student Grievances.
Check course outline!!!

Summary
• General issues for the course: course focus, course outline, lectures, practical work, assessment, web page, ...
• Suggested reading: course outline, a paper, and announcements and useful links from the course web site.
• Questions:
  – Why is data mining (DM) necessary?
  – What are DM and knowledge discovery in databases?
  – What is the KDD process?
  – What are the relevant fields of KDD?
  – What are common DM tasks?
  – What is the difference between DM and data warehousing?
  – What is the difference between DM and record retrieval?