How to Install Hadoop on Ubuntu













Installing Hadoop on a Single-Node Cluster.


Let us start by installing and configuring Hadoop on a single-node cluster.

Pre-requisites:


OS: Ubuntu Linux 10 and above
Java: Java 6 and above
Hadoop version: 1.2 and above


Step 1:

Install Java. This guide assumes Java is already installed.
Check the Java version:


dills@ubuntu:~# java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) Client VM (build 16.3-b01, mixed mode, sharing)

If you get a response like the one above, Java is installed correctly.

The Java JDK should be located at the path below, which later steps use as JAVA_HOME.

/usr/lib/jvm/java-6-sun
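
As a quick sanity check on that path, here is a minimal sketch that tests whether a directory looks like a full JDK. The helper name jdk_ok is made up for this example, and it only checks that the java and javac binaries exist and are executable. It does not verify the Java version.

```shell
# Sketch: check that a directory looks like a usable JDK install.
# jdk_ok is a hypothetical helper name; the directory is passed as an argument.
jdk_ok() {
    jdk_dir=$1
    # A JDK (as opposed to a bare JRE) ships both java and javac under bin/.
    [ -x "$jdk_dir/bin/java" ] && [ -x "$jdk_dir/bin/javac" ]
}
```

For the path used in this guide you could run: if jdk_ok /usr/lib/jvm/java-6-sun; then echo "JDK found"; else echo "JDK not found"; fi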

Step 2: [Optional]

Add a Hadoop system user:
The commands below add the user userhdp and the group hadoop to the local machine.


dills@ubuntu:~$ sudo addgroup hadoop
dills@ubuntu:~$ sudo adduser --ingroup hadoop userhdp

Step 3:

Configuring SSH.

SSH configuration is mandatory because Hadoop uses it to manage its nodes.
Here we configure SSH access to the local system (localhost) for the hadoop user userhdp.

Make sure SSH is installed on your machine; errors will be thrown if SSH is not configured properly.

On Ubuntu, SSH is easy to configure.

Run the commands below:

dills@ubuntu:~$ su - userhdp
userhdp@ubuntu:~$ ssh-keygen -t rsa -P ""

After running the commands above, you will get output like the following.


Generating public/private rsa key pair.
Enter file in which to save the key (/home/userhdp/.ssh/id_rsa): //Press enter here
Created directory '/home/userhdp/.ssh'.
Your identification has been saved in /home/userhdp/.ssh/id_rsa.
Your public key has been saved in /home/userhdp/.ssh/id_rsa.pub.
The key fingerprint is:
9c:22:3a:58:b4:e1:35:d2:ff:19:66:a6:ff:ae:2e:d3 userhdp@ubuntu
The key's randomart image is:
[...snipp...]
userhdp@ubuntu:~$

This creates an RSA key pair with an empty passphrase.

Next, enable SSH access to the local machine (localhost) with this key.

Run the command below:
userhdp@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
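
If you rerun the command above, the same key gets appended twice. The sketch below is one possible way to make the step repeatable. The helper name authorize_key is made up for this example. It also tightens permissions, on the assumption that your sshd runs with its default StrictModes setting, which ignores keys kept in group-writable or world-writable locations.

```shell
# Sketch: append a public key to an authorized_keys file only if it is not
# already present, then tighten permissions.
# authorize_key is a hypothetical helper; both paths are passed as arguments.
authorize_key() {
    pub_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")
    mkdir -p "$auth_dir"
    touch "$auth_file"
    key=$(cat "$pub_file")
    # -x matches whole lines only, -F treats the key as a fixed string.
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    # Only the owner may read or write the directory and the key file.
    chmod 700 "$auth_dir"
    chmod 600 "$auth_file"
}
```

For userhdp this would be called as: authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"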


The final step is to test the SSH connection to localhost.

This must succeed; otherwise you may face connection errors
while starting the Hadoop daemons.
If you have any trouble connecting to localhost, make sure
your SSH service is running.

Run the command below:


userhdp@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 9c:22:3a:58:b4:e1:35:d2:ff:19:66:a6:ff:ae:2e:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Linux ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed Jan 28 13:27:30 UTC 2014 i686 GNU/Linux
Ubuntu 10.04 LTS
[...snipp...]
userhdp@ubuntu:~$

Step 4:

Make sure the Hadoop installation tar file is ready, then navigate to the path below and run the commands (they assume the tar file is in /usr/local).


userhdp@ubuntu:~$ cd /usr/local
userhdp@ubuntu:/usr/local$ sudo tar xvf hadoop-1.2.1.tar.gz
userhdp@ubuntu:/usr/local$ sudo mv hadoop-1.2.1 hadoop
userhdp@ubuntu:/usr/local$ sudo chown -R userhdp:hadoop hadoop

Your Hadoop directory is now located at:
/usr/local/hadoop
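
The extract and rename can also be done in one go. The sketch below is one way to do that, assuming a tar that supports --strip-components (GNU tar does) and a gzip-compressed tarball. The helper name unpack_release is made up for this example.

```shell
# Sketch: unpack a release tarball straight into a target directory.
# unpack_release is a hypothetical helper; both arguments are caller-supplied.
unpack_release() {
    tarball=$1
    target=$2
    mkdir -p "$target"
    # --strip-components=1 drops the top-level folder (e.g. hadoop-1.2.1/)
    # from every path, so files land directly in $target without a later mv.
    tar -xzf "$tarball" -C "$target" --strip-components=1
}
```

Writing under /usr/local needs root privileges, so on a real machine the mkdir and tar would have to run via sudo, and the chown from the commands above is still needed afterwards.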


Step 5:

Update .bashrc to set the Hadoop installation directory and the Java home directory.

Add the lines below to the end of the $HOME/.bashrc file for userhdp.


# Set Hadoop Home directory
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME directory
export JAVA_HOME=/usr/lib/jvm/java-6-sun
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
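
If you prefer to add these lines from the command line instead of an editor, the sketch below shows one way to do it without creating duplicates on a second run. The helper name append_once is made up for this example.

```shell
# Sketch: append a line to a file only if that exact line is not there yet.
# append_once is a hypothetical helper name.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x matches whole lines only, -F disables regex so '$' is taken literally.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls for this guide (left commented out so nothing is modified
# by accident; single quotes keep $PATH and $HADOOP_HOME unexpanded):
# append_once 'export HADOOP_HOME=/usr/local/hadoop' "$HOME/.bashrc"
# append_once 'export JAVA_HOME=/usr/lib/jvm/java-6-sun' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$HADOOP_HOME/bin' "$HOME/.bashrc"
```

The order matters: HADOOP_HOME has to be set before the PATH line that refers to it. Open a new shell or run source $HOME/.bashrc for the changes to take effect.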


Step 6:

Configure hadoop-env.sh.

Open $HADOOP_HOME/conf/hadoop-env.sh and update the path below.


# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
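
The same edit can be scripted. The sketch below assumes the file either already has an "export JAVA_HOME=" line (possibly commented out with a leading #) or has none at all, and that the Java path contains no | or & characters. The helper name set_java_home is made up for this example.

```shell
# Sketch: set JAVA_HOME in an env file such as hadoop-env.sh.
# set_java_home is a hypothetical helper; it assumes the path given in $2
# contains no '|' or '&', since both are special inside the sed replacement.
set_java_home() {
    env_file=$1
    java_home=$2
    tmp_file=$(mktemp)
    # Rewrite any existing "export JAVA_HOME=..." line, commented out or not.
    sed "s|^#\{0,1\}[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$tmp_file"
    # If the file had no such line, append one.
    grep -q '^export JAVA_HOME=' "$tmp_file" || printf 'export JAVA_HOME=%s\n' "$java_home" >> "$tmp_file"
    # Copy the contents back so the original file keeps its owner and mode.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}
```

For this guide the call would be: set_java_home "$HADOOP_HOME/conf/hadoop-env.sh" /usr/lib/jvm/java-6-sun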
