Multi-Node Cluster Setup in Hadoop
1) Copy Ubuntu1204.zip (keep it in zip format; otherwise it will copy the previous instance's state)
into three different folders (namely:- Master, Slave1, Slave2)
2) Copy and paste the Hadoop and JDK archives inside Master, Slave1 & Slave2, and extract them by using the command:- tar -zxvf <fileName>
3) Set the JAVA_HOME and HADOOP_HOME paths inside ~/.bashrc by using the command:-> sudo gedit ~/.bashrc -> at the end, paste the paths
in this format (JAVA_HOME and HADOOP_HOME should point to the extracted JDK and Hadoop folders themselves, not to their bin subfolders):-
export JAVA_HOME=/home/user/desktop/jdk-7u804
export HADOOP_HOME=/home/user/desktop/hadoop-1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
To load the updated file, type in the terminal
Command:- source ~/.bashrc (or simply start a new shell with: bash)
Check the Java and Hadoop versions with the commands:-
java -version
hadoop version
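NOTE:- As a quick sanity check (a minimal sketch; the exact paths depend on where you extracted the JDK and Hadoop), you can confirm the variables were picked up:
echo $JAVA_HOME
echo $HADOOP_HOME
which hadoop (should print the hadoop script inside $HADOOP_HOME/bin)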
4) First, check the IP addresses of all the machines (Master, Slave1, Slave2)
Command:-> $ifconfig
eg:- Master IP :- 192.168.6.1
Slave 1:- 192.168.6.2
Slave 2:- 192.168.6.3
5) Ping the slave machines from the master and the master from the slaves to check the connections.
Command:-> $ping <IP address> (eg:- ping 192.168.6.2)
NOTE:- It will keep running until you press Ctrl + C
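NOTE:- If you prefer the ping to stop by itself, the -c option limits the number of packets (IP assumed from the step 4 example):
ping -c 4 192.168.6.2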
6) Set the hostname on the MASTER MACHINE ONLY.
Command:-> sudo gedit /etc/hostname
And Type:- master
7) Set the hostname on the SLAVE 1 MACHINE ONLY
Command:-> sudo gedit /etc/hostname
And Type:-> slave1
8) Set the hostname on the SLAVE 2 MACHINE ONLY
Command:->sudo gedit /etc/hostname
And Type:-> slave2
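NOTE:- Editing /etc/hostname takes effect after a reboot. To apply the new name immediately on Ubuntu 12.04 you can also run, for example on the master:
sudo hostname master
(and likewise sudo hostname slave1 / sudo hostname slave2 on the slaves)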
9) Make the following entries in /etc/hosts on the MASTER MACHINE ONLY.
Command:-> sudo gedit /etc/hosts
And Type:->
<masterIP> master
<slave1IP> slave1
<slave2IP> slave2
example (using the IPs from step 4):-
192.168.6.1 master
192.168.6.2 slave1
192.168.6.3 slave2
10) Make the following entries in /etc/hosts on the SLAVE 1 NODE ONLY.
Command:-> sudo gedit /etc/hosts
And Type:-
192.168.6.1 master
192.168.6.2 slave1
11) Make the following entries in /etc/hosts on the SLAVE 2 NODE ONLY.
Command:-> sudo gedit /etc/hosts
Type:- 192.168.6.1 master
192.168.6.3 slave2
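NOTE:- You can verify the /etc/hosts entries by pinging the machines by name instead of by IP, for example from the master:
ping -c 2 slave1
ping -c 2 slave2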
12)Configuring ssh
[SSH should be configured on the MASTER AND ALL THE SLAVE MACHINES]
$sudo apt-get install ssh
$sudo apt-get install rsync
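NOTE:- To confirm the SSH server is running after installation, you can check its status:
sudo service ssh status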
13) Generate a key:-
Command:-> ssh-keygen -t rsa -P ""
(when prompted, give a file name for the generated key, or press Enter to accept the default ~/.ssh/id_rsa)
(note the path of the generated public key, e.g. ~/.ssh/id_rsa.pub)
Use the following command to copy the generated public key to the master VM
Command:->
i) ssh-copy-id -i <path of generated key> <userName>@master
ex:- ssh-copy-id -i /root/mastergen.pub root@master (here, username= root, and generated key name= mastergen.pub)
ii) [Only For Master Node]
Command:-> ssh-copy-id -i /home/user/.ssh/id_rsa.pub username@slave1
ex:- ssh-copy-id -i /root/mastergen.pub root@slave1 (here, username= root, and generated key name= mastergen.pub)
iii)[Only For Master Node]
Command:-> ssh-copy-id -i /home/user/.ssh/id_rsa.pub username@slave2
ex:- ssh-copy-id -i /root/mastergen.pub root@slave2 (here, username= root, and generated key name= mastergen.pub)
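NOTE:- If ssh-copy-id is not available, the public key can be appended manually instead (a sketch assuming the default key path and username root):
cat ~/.ssh/id_rsa.pub | ssh root@slave1 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'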
14) [Only For Master Node]
Command:-> $whoami (cmd to find the username)
15) Check whether SSH has been configured successfully [from the MASTER MACHINE ONLY]:
ssh username@master (eg:- username=root)
ssh username@slave1
ssh username@slave2
(type exit after each login to return to the master; no password prompt should appear)
16)Configure the masters file with the name of the master node (from MASTER NODE ONLY)
cd /home/user/Downloads/hadoop-1.0.3/conf
sudo gedit masters
And Type:- master
17) Configure the slaves file with the names of all the slave nodes (from MASTER NODE ONLY)
cd /home/user/Downloads/hadoop-1.0.3/conf
sudo gedit slaves
And Type:- master
slave1
slave2
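NOTE:- From the same conf directory you can confirm both files; cat masters should show only master, and cat slaves should list master, slave1 and slave2, one per line:
cat masters
cat slaves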
18) Configure the following files (from the MASTER AND ALL THE SLAVE NODES)
cd /home/user/Downloads/hadoop-1.0.3/conf
sudo gedit core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
</configuration>
sudo gedit hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
sudo gedit mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://master:54311</value>
  </property>
</configuration>
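NOTE:- Here fs.default.name tells every node where the NameNode listens (master, port 54310), mapred.job.tracker points to the JobTracker (master, port 54311), and dfs.replication=3 keeps three copies of each HDFS block. In Hadoop 1.x the conf/hadoop-env.sh file usually also needs JAVA_HOME set (there is a commented-out export line in it), otherwise the daemons may fail to start. A minimal sketch, assuming the same JDK path used in ~/.bashrc:
sudo gedit hadoop-env.sh
export JAVA_HOME=/home/user/desktop/jdk-7u804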
19) Format the NameNode (type all of the following from the MASTER NODE's terminal; the ssh steps run the format on each slave node as well)
cd /home/user/Downloads/hadoop-1.0.3/
-> bin/hadoop namenode -format
-> ssh username@slave1
cd /home/user/Downloads/hadoop-1.0.3/
bin/hadoop namenode -format
exit
-> ssh username@slave2
cd /home/user/Downloads/hadoop-1.0.3/
bin/hadoop namenode -format
exit
-> cd /home/user/Downloads/hadoop-1.0.3/
bin/start-all.sh
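NOTE:- To shut the cluster down later, run the matching stop script from the same directory on the master:
bin/stop-all.sh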
20) To check whether the daemons are running fine:
jps
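NOTE:- With the configuration above (master is also listed in the slaves file), jps on the master should typically show NameNode, SecondaryNameNode, JobTracker, DataNode and TaskTracker; on each slave it should show DataNode and TaskTracker. You can also check the NameNode web UI at http://master:50070 and the JobTracker web UI at http://master:50030.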