How to install Hadoop on Ubuntu - Part 2

Continued from Part 1

Step 7:

Create a temp directory for Hadoop:


userhdp@ubuntu:~$ sudo mkdir -p /app/hadoop/tmp
userhdp@ubuntu:~$ sudo chmod 777 /app/hadoop/tmp
Make sure you give the required permissions to the tmp folder.
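
chmod 777 is fine for a quick single-node test setup, but a tighter alternative is to hand the directory over to the Hadoop user instead. This is just a sketch, assuming the dedicated user and group from Part 1 are userhdp and hadoop (adjust the names if yours differ):


userhdp@ubuntu:~$ sudo chown -R userhdp:hadoop /app/hadoop/tmp
userhdp@ubuntu:~$ sudo chmod 750 /app/hadoop/tmp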

Step 8:

This step shows how to configure the Hadoop-related configuration files:

conf/core-site.xml, 
conf/hdfs-site.xml,
conf/mapred-site.xml.

Add the following lines to conf/core-site.xml:


<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>

Add the following lines to conf/hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>

Add the following lines to conf/mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>
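
Note that the <property> blocks above go inside the <configuration> element of each file, not at the top level. As a quick check, conf/core-site.xml should end up looking roughly like this (a minimal sketch showing only the first property; the XML header lines may vary with your Hadoop release):


userhdp@ubuntu:~$ cat /usr/local/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <!-- remaining properties for this file go here -->
</configuration>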


Step 9:

Before starting the Hadoop daemons, we have to format the NameNode. This is required to initialize the HDFS directory specified by dfs.name.dir.

Execute the command below:


userhdp@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format

The output looks like this:



/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Wed Jan 28 13:40:20 UTC 2014
************************************************************/
14/01/28 13:40:21 INFO namenode.FSNamesystem: fsOwner=userhdp,hadoop
14/01/28 13:40:21 INFO namenode.FSNamesystem: supergroup=supergroup
14/01/28 13:40:21 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/01/28 13:40:21 INFO common.Storage: Image file of size 96 saved in 0 seconds.
14/01/28 13:40:21 INFO common.Storage: Storage directory .../hadoop-userhdp/dfs/name has been successfully formatted.
14/01/28 13:40:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
userhdp@ubuntu:/usr/local/hadoop$
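
Formatting is a one-time step for a new cluster; re-running hadoop namenode -format later will wipe all data stored in HDFS. To double-check that the format succeeded, you can list the name directory on disk. This is just a quick check, assuming dfs.name.dir is left at its default location under the hadoop.tmp.dir configured above (if the log shows a different storage directory, list that path instead):


userhdp@ubuntu:~$ ls /app/hadoop/tmp/dfs/name

It should now contain a freshly created current subdirectory.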



Step 10:

Start all the Hadoop daemons by running the start-all.sh script:


userhdp@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh

The output will look like this:


starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-userhdp-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-userhdp-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-userhdp-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-userhdp-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-userhdp-tasktracker-ubuntu.out
userhdp@ubuntu:/usr/local/hadoop$

We can also start the HDFS and MapReduce daemons separately using:

start-dfs.sh, start-mapred.sh
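
To shut everything down again, use the matching stop scripts (stop-all.sh, or stop-mapred.sh followed by stop-dfs.sh):


userhdp@ubuntu:/usr/local/hadoop$ bin/stop-all.sh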

Verify the Hadoop daemons using the jps command:


userhdp@ubuntu:/usr/local/hadoop$ jps
2237 TaskTracker
2119 JobTracker
2338 DataNode 
1185 SecondaryNameNode
1349 Jps
1288 NameNode
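
If all the daemons show up in jps, the built-in web interfaces are another quick way to check the cluster. Assuming the default ports for this Hadoop version, they should be reachable at:

NameNode web UI: http://localhost:50070/
JobTracker web UI: http://localhost:50030/
TaskTracker web UI: http://localhost:50060/
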
Please check my YouTube video on "How to install Hadoop on Ubuntu":

http://youtu.be/bq4ljrwSRZc


Abbreviations:

HDFS: Hadoop Distributed File System
SSH: Secure Shell


For more information on Hadoop installation, please check:


http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/


To download Hadoop binaries, please check:

http://apache.cs.utah.edu/hadoop/common/stable/

Please make sure you download a stable version of Hadoop.


==============================================================

Like it? Give the ad a click :)

Live and let live 
