DEREE COLLEGE SYLLABUS FOR:
ITC 3333 DATA MINING AND BIG DATA
3/0/3
(Spring 2016)
PREREQUISITES:
ITC 1070 Information Technology Fundamentals or CS 1070 Introduction to Information Systems
ITC 2188 Introduction to Programming
MA 2010 Statistics I
CATALOG DESCRIPTION:
Data and feature selection, cleaning, extracting patterns from
structured and unstructured data, evaluation, big data, tools,
applications.
RATIONALE:
The volume of data that organisations collect is increasing
exponentially. We are in the era of big data, but more data does not
mean more knowledge. With data mining techniques, we can
navigate through data that are chaotic, heterogeneous,
unstructured and noisy in order to infer what is relevant.
The applications range from decision support and marketing
to fraud detection and medicine. The aim of this
course is to provide an understanding of the basic principles and
to apply them, through appropriate tools, to many real-world
problems that involve big data.
LEARNING OUTCOMES:
As a result of taking this course, the student should be able to:
1. Combine appropriate data mining techniques, while also
considering scalability, to discover information nuggets that are
appropriate for a specific problem.
2. Evaluate the quality of the inferred information by using a variety
of evaluation methods.
METHOD OF TEACHING AND LEARNING:
In congruence with the learning and teaching strategy of the
College, the following tools/activities are used:
• Classroom lectures and occasional laboratory practical sessions.
• Office hours held by the instructor to provide further assistance to
students.
• Use of the Blackboard Learning platform, where instructors post
lecture notes, assignment instructions, timely announcements, as
well as additional resources.
ASSESSMENT:
Summative:
Project (100%): Programming and/or tool usage to address
one or more problems in data mining.
Formative:
In-class quizzes or lab exercises (0%).
The formative assessment aims to shape teaching throughout the
semester and prepare students for the summative assessments.
The project tests Learning Outcomes 1 and 2.
(Assignment instructions and assessment rubrics are distributed on
the first day of class with the Course Outline.)
READING LIST:
REQUIRED MATERIAL:
Tan, P., Steinbach, M., & Kumar, V. (2006). Introduction to data
mining. Boston: Pearson Addison Wesley.
Instructor notes
FURTHER READING:
Hand, D., Mannila, H., & Smyth, P. (2001). Principles of data
mining. MIT Press.
Zaki, M. J., & Meira, W. (2014). Data mining and analysis. Cambridge
University Press.
RECOMMENDED MATERIAL:
Bishop, C. M. (2006). Pattern recognition and machine learning.
Heidelberg, Germany: Springer. ISBN 0-387-31073-8.
See also Kybernetes, 36(2), 275. doi:10.1108/03684920710743466
Mitchell, T. M. (1997). Machine learning. McGraw-Hill.
COMMUNICATION REQUIREMENTS:
Daily access to the course’s site on the College’s Blackboard CMS.
Use of word processing and/or presentation graphics software for
documentation of assignments.
SOFTWARE REQUIREMENTS:
Python and related libraries: scikit-learn, NumPy
Apache Flink
Weka and/or Java machine learning libraries
WWW RESOURCES:
http://www.kdnuggets.com/
http://www.autonlab.org/tutorials/
http://www-users.cselabs.umn.edu/classes/Fall2011/csci5523/index.php?page=homework%20and%20projects
http://archive.ics.uci.edu/ml/
http://www.sciencemag.org/site/feature/data/compsci/machine_learning.xhtml
INDICATIVE CONTENT:
1. Introduction to data mining
2. Data preprocessing
3. Data mining, knowledge representation
4. Data mining algorithms: Association rules
5. Data mining algorithms: Classification
6. Data mining algorithms: Prediction
7. Evaluation
8. Data mining algorithms: Clustering
9. Visualisation
10. Scalability of data mining
11. Big Data
12. Streaming
13. Advanced topics in data mining