4/8/16

IST 565 Data Mining

Course: Data Mining
Instructor: Jon Fox
Office: Bentonville, AR
Office Hours: by Appointment / Online
Semester: Summer 2016
Email: [email protected]
Phone: 706.399.1538 (cell)
Meeting Place: Online

Catalog Description

A broad introduction to data mining tools and techniques for information professionals. Students will develop a portfolio of resources, demonstrations, recipes, and examples of various data mining techniques.

Course Description

This course will introduce popular data mining methods for extracting knowledge from data. The principles and theories of data mining methods will be discussed and related to current opportunities in the business environment. Students will also acquire hands-on experience using state-of-the-art software to develop data mining solutions to practical problems.

The course focuses on becoming familiar and comfortable with a range of the available analytic tools in the context of several difficult, data-focused problems. The topics of the course will include the key tasks of data mining, including data preparation, concept description, association rule mining, classification, clustering, evaluation, and analysis. By addressing problems in creative ways and connecting sets of available tools and data, students gain a practical understanding of analytics as a whole while identifying areas of interest for additional exploration.

Learning Objectives

During the course, we will emphasize:
1. Experiential learning through reading and practical exercises.
2. Collaborative learning through online discussions between instructors and peers.
3. Self-learning with appropriate instructional support and timely feedback using data mining case studies.

In order to be successful in this course, the student will:
1. Proactively research solution options rather than relying solely on textbook content.
2. Actively code while completing the reading assignments.
3. Present results in a professional manner.
   Comments – Clarity – Correctness – Credit.
4. Submit their assignments on time.

Upon completion of the course, the student will be able to:
1. Understand the fundamental processes, concepts, and techniques of data mining.
2. Appreciate the range of applicability of data mining to real problems in areas such as business, science, and engineering.
3. Advance their understanding of contemporary data mining tools.
4. Communicate their results in a meaningful way.

Required Course Materials

Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, Pearson, 2005. (Free sample chapters are available at the following website: http://www-users.cs.umn.edu/~kumar/dmbook/index.php)

The required software includes Weka, R, and Tableau. All of these software packages are available through the remote lab.

Course Assignments / Percent of Final Grades / Due Dates:

Task  Assignment            Percent of Course  Due Dates
1     Discussions           10%                All semester long
2     Homework Assignments  55%                Weeks 3, 5, 7, & 9
3     Project Checkpoints   5%                 Week 4 & Week 10
4     Final Project         30%                Due at end of semester

Assignment #1: Discussions

The online discussions provide an opportunity to discuss current data mining readings, events, technologies, and methods. The discussions facilitate the first, second, and fourth learning objectives of the course by providing the opportunity to demonstrate understanding of data mining concepts and the range of applicability of data mining. There are 10 possible discussions in this course worth a maximum of 1 point each. Maximum points are possible if the submission is on time, complete, and correct.

Assignment #2: Homework Assignments

Homework assignments provide open-ended problem-solving experiences that build on the material covered in the readings. The assignments facilitate the first and third learning objectives of the course by providing the opportunity to apply techniques from class to realistic problem-solving situations.
A separate instruction document will be provided with specific instructions for each assignment. The first homework assignment is worth a maximum of 10 points. The final 3 homework assignments in this course are worth a maximum of 15 points each. Maximum points are possible if the submission is on time, complete, and correct.

Assignment #3: Final Course Project

For the final project, students will identify a data mining problem, bring together different data sources, conduct analysis, draw conclusions, and produce a report explaining the results. Maximum points are possible if the submission is on time, complete, and demonstrates the student’s ability to match the appropriate data mining methods to the chosen problem, draw appropriate conclusions, and present the results in a meaningful way.

Class-Wide Phone Conferences:

The instructor will answer student questions during toll-free phone conferences. There will be an introductory call early in the semester and then one call prior to each of the two project checkpoints. The phone conferences are optional, but participation is highly encouraged because course learning objectives, specific concepts, and upcoming assignments will be discussed.

Course Grading:

Grades for specific assignments and the final course grade will be assigned by the instructor through the course’s online site. There are 100 possible grade points in this course, and each assignment’s grade value goes directly toward the total earned by each student. The numeric final point total will translate to the final letter grade for the course as follows:

Grade  Points
A      100-93
A-     92-90
B+     89-88
B      87-83
B-     82-80
C+     79-78
C      77-73
C-     72-70
D      69-60
F      < 60

Grades will be available for viewing in the Grade Book section of the course’s online site.

Academic Integrity

The academic community of Syracuse University and of the School of Information Studies requires the highest standards of professional ethics and personal integrity from all members of the community.
Violations of these standards are violations of a mutual obligation characterized by trust, honesty, and personal honor. As a community, we commit ourselves to standards of academic conduct, impose sanctions against those who violate these standards, and keep appropriate records of violations. The academic integrity statement can be found at http://supolicies.syr.edu/ethics/acad_integrity.htm.

Blackboard

The iSchool uses Syracuse University’s Blackboard system to facilitate distance learning and main campus resources. The environment is composed of a number of elements that will help you be successful in both your current coursework and your lifelong learning opportunities. To access Blackboard, go to the following URL: http://blackboard.syr.edu. Use your Syracuse University NetID & Password to log into Blackboard.

For questions regarding technical aspects of Blackboard, please submit a help ticket to the iSchool dashboard at My.iSchool.Dashboard (https://my.ischool.syr.edu). Log in with your NetID, select “Submit a Helpdesk Ticket,” and select Blackboard as the request type. The iSchool Blackboard support team will assist you.

Students with Disabilities

In compliance with Section 504 of the Americans with Disabilities Act (ADA), Syracuse University is committed to ensuring that “no otherwise qualified individual with a disability … shall, solely by reason of disability, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity …” If you feel that you are a student who may need academic accommodations due to a disability, you should immediately register with:

Office of Disability Services (ODS)
804 University Avenue
Room 308, 3rd Floor
315.443.4498 or 315.443.1371 (TTD only)

ODS is the Syracuse University office that authorizes special accommodations for students with disabilities.
Course Schedule

Week 0 (5/16 – 5/22): Course Introduction
  Readings: Syllabus
  Assignments: Intro Lecture

Week 1 (5/23 – 5/29): Data Mining Intro
  Align our class with the methods, goals, and expectations of the course.
  Readings: Pang Ch. 1
  Assignments: Syllabus Quiz; Discussion Profile-Post

Week 2 (5/30 – 6/5): Data Preparation
  Before we use the data, we have to make sure it is ready.
  Readings: Pang Ch. 2
  Assignments: Adobe / Phone Conf 1; Discussions

Week 3 (6/6 – 6/12): Data Exploration
  Understanding the structure and content of the data.
  Readings: Pang Ch. 3
  Assignments: Exercise 1; Discussions

Week 4 (6/13 – 6/19): Classification
  Using decision trees to classify data.
  Readings: Pang Ch. 4
  Assignments: Final Project Proposal; Discussions

Week 5 (6/20 – 6/26): Model Evaluation
  How good is our model? Does our algorithm help us understand the data?
  Readings: Pang Ch. 4
  Assignments: Exercise 2; Discussions

Week 6 (6/27 – 7/3): Classification
  Using Naïve Bayes as a decision tree alternative.
  Readings: Pang Ch. 5
  Assignments: Discussions; Adobe / Phone Conf 2

Week 7 (7/4 – 7/10): Classification
  Nearest neighbors and support vectors.
  Readings: Pang Ch. 5
  Assignments: Discussions; Exercise 3

Week 8 (7/11 – 7/17): Clustering
  Finding the patterns and producing meaningful results.
  Readings: Pang Ch. 8
  Assignments: Discussions; Adobe / Phone Conf 3

Week 9 (7/18 – 7/24): Clustering
  Finding the patterns and producing meaningful results.
  Readings: Pang Ch. 8
  Assignments: Exercise 4; Discussions

Week 10 (7/25 – 7/31): Association Rules
  Finding oranges in your shopping basket.
  Readings: Pang Ch. 6
  Assignments: Final Project Status; Discussions

Week 11 (8/1 – 8/7): Project Preparation
  Identify, resolve, and hopefully avoid the pitfalls of data mining.
  Readings: NA
  Assignments: Project Preparation; no assignments due

Week 12 (8/8 – 8/14): Project Presentation
  Present results in a meaningful way.
  Readings: NA
  Assignments: Final Course Project, due on 8/14

Read More About It:

Lantz, B. (2015). Machine Learning with R. Packt Publishing.
Mitchell, T. (1997). Machine Learning. McGraw-Hill: Boston. Available at the following website: http://www.cs.cmu.edu/~tom/mlbook.html.
North, M. (2012). Data Mining for the Masses. Infinite.
Putler, D. D. & R. E. Krider (2012). Customer and Business Analytics. Boca Raton, FL: CRC Press.