July 2016

The Hong Kong Polytechnic University
Hong Kong Community College

Subject Description Form

Subject Code: CCN3163
Subject Title: Introduction to Big Data Analytics
Level: 3
Credit Value: 3
Medium of Instruction: English
Pre-requisite / Co-requisite / Exclusion: Pre-requisite: CCN2041 Applied Computing

Objectives
This subject aims to provide students with knowledge of the current challenges, methodologies and technologies in processing big data. Emphasis will be placed on the students’ understanding of the rationale behind the technologies and the students’ ability to analyse big data using professional software packages.

Intended Learning Outcomes
Upon completion of the subject, students will be able to:
(a) understand the current challenges in processing big data
(b) be aware of the technologies available for handling big data
(c) understand how big data are generated in different industries
(d) understand the ideas behind data mining methods targeted at big data
(e) analyse big datasets through the use of application software

Subject Synopsis / Indicative Syllabus
Basic Concepts and Issues in Handling Big Data
  Industries that generate big data; Types of big data; Challenges in processing big data; The curse of dimensionality.
Technologies and Infrastructures
  Parallel computing; Map-Reduce; Distributed data.
Clustering and Mining of Similar Items
  Similarity measures; Near-neighbor search; Similarity of text documents; Clustering of similar items; Strategies for clustering.
Mining of Data Streams
  Common sources of data streams; Sampling from data streams; Obtaining summary statistics from data streams.
Frequent Itemsets
  Common sources of market-basket data; Association rules; Support and confidence of association rules; Efficient algorithms for mining association rules from large datasets.
Link Analysis and Social Networks
  Basic concepts and applications of PageRank; Representations of social networks; Identification of communities in social networks.
Advertising on the Web and Direct Marketing
  Targeted vs untargeted advertising; On-line vs off-line algorithms; Adwords bidding; Predictive models for direct marketing; Evaluation of marketing campaigns.

Teaching/Learning Methodology
Lectures are used to introduce concepts, challenges and methodologies in processing big data. Real-life examples will be used to enhance students’ understanding of the subject matter. Tutorials will combine demonstrations of data analysis with hands-on activities in analysing big data.

Assessment Methods in Alignment with Intended Learning Outcomes
A variety of assessment tools will be used to develop and assess students’ achievement of the subject intended learning outcomes.

Specific assessment methods/tasks    % weighting    Intended subject learning outcomes to be assessed (a b c d e)
                                                    ✓ ✓ ✓ ✓ ✓ ✓
1. Continuous Assessment*            40
   Test                              16             ✓ ✓
   Assignment 1                      10             ✓ ✓
   Assignment 2                      10
   Participation                     4              ✓ ✓ ✓ ✓ ✓
2. Final Examination                 60             ✓ ✓ ✓ ✓ ✓
Total                                100

*Continuous assessment items and/or weighting may be adjusted by the subject team subject to the approval of the College Programme Committee.

To pass this subject, students are required to obtain Grade D or above in both the Continuous Assessment and Final Examination.

Student Study Effort Expected
Class contact                        Hours
   Lecture                           26
   Tutorial                          13
Other student study effort
   Self-study                        52
   Continuous Assessment             39
Total student study effort           130

Reading List and References
Recommended Textbook
Leskovec, J., Rajaraman, A., & Ullman, J. D. (2014). Mining of massive datasets (2nd ed.). Cambridge University Press.

References
Baesens, B. (2014). Analytics in a big data world: The essential guide to data science and its applications. Wiley.
White, T. (2015). Hadoop: The definitive guide (4th ed.). O’Reilly Media.