GROUP 7
TOOLS FOR BIG DATA
Sandeep Prasad
Dipojjwal Ray
Objectives...


Apache Hadoop
Successful installation of Apache Hadoop v1.0.3 and v1.0.4

Wordcount functionality by hadoop mapreduce

Estimating value of 'Pi' by hadoop mapreduce

MapReduce and HDFS
Apache Hadoop...

High-Availability Distributed object-oriented platform

Open Source

Pseudo-Distributed single-node cluster

Originally a subproject of Apache Lucene, now a top-level Apache project

Handles petabytes of data
Installation of Hadoop v1.0.3 & 1.0.4...

Release Date v1.0.3 : May 16, 2012

Release Date v1.0.4 : October 12, 2012

OS : Ubuntu v12.04

Prerequisites : Sun Java, dedicated Hadoop user account (hduser)

Configuration
Examples...
WordCount example :
$ /bin/hadoop jar hadoop-1.0.3-examples.jar wordcount file01.txt (out)
out= Output directory (must not already exist)
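
The idea behind the WordCount job can be sketched in plain Python. This is only an illustration of the map, shuffle and reduce steps, not the Java code shipped in the examples jar; the function names map_phase, shuffle, reduce_phase and word_count are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hello world", "hello hadoop"]))
```

In a real cluster the map calls run in parallel on different input splits and the shuffle moves data between nodes; here everything runs in one process.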

Estimation of 'Pi'

$ /bin/hadoop jar hadoop-1.0.3-examples.jar pi (x) (y)
x= Number of maps
y= Samples per map
Runtime 2.25 seconds (x=10 ; y=100)
Estimated value 3.1480000000000
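
A rough sketch of how such an estimate can be computed, assuming a quasi-Monte Carlo approach in which each map task tests its own slice of a 2-D Halton sequence (bases 2 and 3) against a circle inscribed in the unit square. The names halton, pi_map and estimate_pi are hypothetical, and this single-process version is not the Hadoop example's source code.

```python
def halton(index, base):
    """Return element number `index` (index >= 1) of the van der Corput sequence in `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def pi_map(map_id, samples_per_map):
    """One map task: count how many of its points fall inside the inscribed circle."""
    start = map_id * samples_per_map  # each map works on its own slice of the sequence
    inside = 0
    for i in range(start + 1, start + samples_per_map + 1):
        x = halton(i, 2) - 0.5  # shift so the circle is centred at the origin
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:  # radius 0.5, so radius squared is 0.25
            inside += 1
    return inside


def estimate_pi(num_maps, samples_per_map):
    """Reduce step: add up the per-map counts and scale by the area ratio."""
    inside = sum(pi_map(m, samples_per_map) for m in range(num_maps))
    total = num_maps * samples_per_map
    return 4.0 * inside / total


if __name__ == "__main__":
    print(estimate_pi(10, 100))  # x = 10 maps, y = 100 samples per map
```

The circle covers pi/4 of the square, so four times the fraction of points inside approximates pi. More maps or more samples per map generally tighten the estimate.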
MapReduce & HDFS...


Divide and conquer algorithm
The Map() and Reduce() functions have their roots in functional programming

JobTracker and TaskTracker

NameNode and DataNode

Hadoop Distributed File System

Java Framework
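
The functional-programming roots and the divide-and-conquer idea mentioned above can be illustrated with Python's built-in map and functools.reduce. This is a minimal single-machine sketch; sum_of_squares and chunked_sum_of_squares are hypothetical names, and the chunks stand in for the input splits that Hadoop would hand to separate tasks.

```python
from functools import reduce


def sum_of_squares(numbers):
    """Map each number to its square, then reduce the squares to one total."""
    squared = map(lambda n: n * n, numbers)
    return reduce(lambda acc, value: acc + value, squared, 0)


def chunked_sum_of_squares(numbers, chunk_size):
    """Divide the input into chunks, solve each chunk, then combine the partial results."""
    chunks = [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]
    partials = [sum_of_squares(chunk) for chunk in chunks]  # could run in parallel
    return reduce(lambda a, b: a + b, partials, 0)


if __name__ == "__main__":
    print(chunked_sum_of_squares([1, 2, 3, 4], 2))
```

Because addition is associative, combining per-chunk results gives the same answer as processing the whole list at once, which is what makes the work safe to distribute.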
References...



http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster
http://lintool.github.io/Cloud9/
Jimmy Lin and Chris Dyer, Data-Intensive Text Processing with MapReduce (book)

http://hadoop.apache.org/releases.html

http://www.apache.org/dyn/closer.cgi/hadoop/co
THANK YOU
framework written in Java
highly fault-tolerant distributed file system
The JobTracker web UI provides general job statistics for the Hadoop cluster, lists
running, completed and failed jobs, and gives access to the job history log file
The TaskTracker web UI shows running and non-running tasks