Detailed Syllabus
Lecture-wise Breakup

Subject Code:
Semester: Even
Subject Name: Big Data and Data Analytics
Credits: 3
Coordinator: Mr. Ashish Tripathi
Contact Hours: 3
Semester: Eight, Session: 2016-17, Month: from January
Learning Objective
The objectives of this course are:
1. Introduce the main concepts, techniques and applications of Big Data.
2. Give students some experience with social big data applications on the Hadoop platform.
Learning Outcome
1. Students can apply traditional data mining techniques to handle big data problems.
2. Students will be able to develop good projects in the fields of machine learning, data mining and big data.

Module No.: 1
Subtitle of the Module: Introduction to Big Data
Topics in the module: Introduction; distributed file system; Big Data and its importance; the Four Vs; drivers for Big Data; Big Data analytics; Big Data applications; algorithms using MapReduce; word count job using MapReduce.
No. of Lectures for the module: 10

Module No.: 2
Subtitle of the Module: Introduction to Hadoop
Topics in the module: Big Data; Apache Hadoop and the Hadoop ecosystem; moving data in and out of Hadoop; understanding inputs and outputs of MapReduce; data serialization. Hadoop architecture; Hadoop storage: HDFS; common Hadoop shell commands; anatomy of file write and read; NameNode, Secondary NameNode and DataNode; Hadoop MapReduce paradigm; Map and Reduce tasks; Job and Task trackers; cluster setup: SSH and Hadoop configuration; HDFS administration; monitoring and maintenance.
No. of Lectures for the module: 6

Module No.: 3
Subtitle of the Module: Hadoop Ecosystem and YARN
Topics in the module: (6 hours) Hadoop ecosystem components; schedulers: Fair and Capacity; Hadoop 2.0 new features: NameNode High Availability, HDFS Federation, MRv2, YARN, running MRv1 in YARN.
No. of Lectures for the module: 8

Module No.: 4
Subtitle of the Module: Hive and HiveQL, HBase
Topics in the module: (6 hours) Hive architecture and installation; comparison with traditional databases; HiveQL: querying data, sorting and aggregating, MapReduce scripts, joins and subqueries; HBase concepts: advanced usage, schema design, advanced indexing; Pig; ZooKeeper: how it helps in monitoring a cluster, how HBase uses ZooKeeper, and how to build applications with ZooKeeper.
No. of Lectures for the module: 10

Module No.: 5
Subtitle of the Module: Applications to Data Mining
Topics in the module: Cluster analysis: K-means algorithm; Naïve Bayes; parallel k-means using Hadoop; case studies on big data mining.
No. of Lectures for the module: 6

Total number of Lectures: 40
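The syllabus lists a word count job using MapReduce among the Module 1 topics. As a rough illustration only, the sketch below simulates the map, shuffle and reduce phases in plain Python on a single machine. It does not use Hadoop, and the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are hypothetical, chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reducer: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read splits from HDFS.
    sample = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(sample))
```

On a real cluster the mapper and reducer would run as separate tasks on different nodes, and the framework would perform the grouping step between them.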
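Module 5 lists parallel k-means using Hadoop. One common way to express k-means in the MapReduce style is to let the mapper assign each point to its nearest centroid and let the reducer average the points of each cluster. The sketch below imitates that idea in plain Python, assuming Euclidean distance and caller-supplied initial centroids. It is not the course's implementation, it does not call Hadoop, and all names (nearest, map_split, reduce_cluster, kmeans_iteration, kmeans) are hypothetical.

```python
import math
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def map_split(split, centroids):
    """Mapper: emit (centroid index, (point, 1)) for each point in one data split."""
    for point in split:
        yield nearest(point, centroids), (point, 1)


def reduce_cluster(values):
    """Reducer: average all points assigned to one centroid."""
    total, count = None, 0
    for point, n in values:
        total = list(point) if total is None else [a + b for a, b in zip(total, point)]
        count += n
    return tuple(x / count for x in total)


def kmeans_iteration(splits, centroids):
    """One map/shuffle/reduce round; returns the updated centroid list."""
    groups = defaultdict(list)
    for split in splits:  # on a cluster, each split could be mapped on a different node
        for key, value in map_split(split, centroids):
            groups[key].append(value)
    # A centroid with no assigned points keeps its previous position.
    return [reduce_cluster(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))]


def kmeans(splits, centroids, max_iter=20):
    """Repeat rounds until the centroids stop changing or max_iter is reached."""
    for _ in range(max_iter):
        updated = kmeans_iteration(splits, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    # Hypothetical data: two splits of 2-D points and two starting centroids.
    splits = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
    print(kmeans(splits, [(0.0, 0.0), (10.0, 10.0)]))
```

Each round corresponds to one MapReduce job, so an iterative algorithm like this launches a new job per iteration when run on Hadoop.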
Recommended Reading material: Author(s), Title, Edition, Publisher, Year of Publication etc. (Text books, Reference Books, Journals, Reports, Websites etc.)
1. Bernard Marr. Big Data: Using Smart Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance. (Source: www.amazon.com)
2. Cheng, Shi, et al. "Evolutionary Computation and Big Data: Key Challenges and Future Directions." International Conference on Data Mining and Big Data. Springer International Publishing, 2016.
3. Michael Minelli and Michele Chambers. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses.
Evaluation Scheme
1. T1: 20 Marks
2. T2: 20 Marks
3. T3: 35 Marks
4. Paper Presentations: 25 Marks
Total: 100 Marks
JIIT University, Noida