DSC 421 Big Data
DSC 421 Big Data (3, 0, 3). Manipulation, storage, and analysis of large-scale data;
large-scale distributed filesystems such as HDFS (Hadoop Distributed File System);
large-scale databases, including SQL and NoSQL; MapReduce algorithm design.
Sample Texts:
Anand Rajaraman and Jeffrey David Ullman, Mining of Massive Datasets, Cambridge
University Press, 2011.
Tom White, Hadoop: The Definitive Guide, 3rd Edition, O’Reilly Media, 2012.
Prerequisites:
• DSC 411 (Data Mining)
• CSC 450 (Database Management Systems)
Objectives and Outcomes:
1. Explain how large-scale distributed filesystems work.
2. Analyze and select appropriate database solutions from SQL and NoSQL options.
3. Design and implement algorithms for MapReduce systems.
4. Apply data mining techniques to big data sets.
Topics:
1. What is Big Data?
2. Large-Scale Distributed Filesystems
3. Developing MapReduce Algorithms
4. How MapReduce Works
5. Locality Sensitive Hashing
6. Link Analysis
7. Analysis of Massive Graphs
8. Large-Scale Machine Learning
9. Scaling Up Relational Databases
10. NoSQL Databases
11. Mining Data Streams
12. Big Data Case Studies
Coursework:
• Programming assignments
• Team projects
• Presentations
• Midterm and final exams