Detailed Syllabus
Lecture-wise Breakup
JIIT University, Noida

Subject Code:
Subject Name: Big Data and Data Analytics
Semester: Even (Semester Eight), Session 2016-17, Month from January
Credits: 3
Contact Hours: 3
Coordinator: Mr. Ashish Tripathi

Learning Objectives
The objectives of this course are to:
1. Introduce the main concepts, techniques, and applications of Big Data.
2. Give students some experience with social big data applications on the Hadoop platform.

Learning Outcomes
1. Students can apply traditional data mining techniques to handle big data problems.
2. Students will be able to build good projects in the fields of machine learning, data mining, and big data.

Module-wise Breakup

Module 1. Introduction to Big Data (No. of lectures: 10)
Introduction to Big Data: introduction; distributed file system; Big Data and its importance; the Four Vs; drivers for Big Data; Big Data analytics; Big Data applications; algorithms using MapReduce; word count job with MapReduce.
Introduction to Hadoop: Big Data; Apache Hadoop and the Hadoop ecosystem; moving data in and out of Hadoop; understanding the inputs and outputs of MapReduce; data serialization.

Module 2. Introduction to Hadoop (No. of lectures: 6)
Hadoop architecture; Hadoop storage: HDFS; common Hadoop shell commands; anatomy of a file write and read; NameNode, Secondary NameNode, and DataNode; the Hadoop MapReduce paradigm; Map and Reduce tasks; Job and Task trackers; cluster setup: SSH and Hadoop configuration; HDFS administration; monitoring and maintenance.

Module 3. Hadoop Ecosystem and YARN (No. of lectures: 8)
Hadoop Ecosystem and YARN (6 hours): Hadoop ecosystem components; schedulers: Fair and Capacity; Hadoop 2.0 new features: NameNode High Availability, HDFS Federation, MRv2, YARN, running MRv1 in YARN.

Module 4. Hive and HiveQL, HBase (No. of lectures: 10)
Hive and HiveQL, HBase (6 hours): Hive architecture and installation; comparison with traditional databases; HiveQL: querying data, sorting and aggregating, MapReduce scripts, joins and subqueries; HBase concepts: advanced usage, schema design, advanced indexing; Pig; ZooKeeper: how it helps in monitoring a cluster, how HBase uses ZooKeeper, and how to build applications with ZooKeeper.

Module 5. Applications to Data Mining (No. of lectures: 6)
Cluster analysis; K-means algorithm; Naïve Bayes; parallel k-means using Hadoop; case studies on big data mining.

Total number of lectures: 40

Recommended Reading Material
(Author(s), Title, Edition, Publisher, Year of Publication, etc.; text books, reference books, journals, reports, websites, etc.)
1. Bernard Marr. Big Data: Using Smart Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance. (source: www.amazon.com)
2. Cheng, Shi, et al. "Evolutionary Computation and Big Data: Key Challenges and Future Directions." International Conference on Data Mining and Big Data. Springer International Publishing, 2016.
3. Michael Minelli and Michele Chambers. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses.

Evaluation Scheme
1. T1: 20 Marks
2. T2: 20 Marks
3. T3: 35 Marks
4. Paper Presentations: 25 Marks
Total: 100 Marks
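
Illustrative note on the word count job named in the Introduction to Big Data module: the idea can be tried without a cluster. The following is a minimal sketch in plain Python, not Hadoop code and not part of the official course material. The function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical; they only mimic the map, shuffle, and reduce stages that a MapReduce framework would distribute across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data needs big clusters", "data moves to Hadoop"]
    print(word_count(sample))
```

On a real Hadoop cluster the map and reduce functions would run as separate tasks over HDFS blocks, and the grouping step would be handled by the framework rather than by an in-memory dictionary.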