Multi-Node Cluster Setup in Hadoop


1)Copy Ubuntu1204.zip (keep it in zip format, otherwise it will copy the previous instance's state)
    into three different folders (namely:- Master, Slave1, Slave2)
2)Copy and paste hadoop and jdk inside Master, Slave1 & Slave2, and extract them by using command:- tar -zxvf <fileName> (example below)
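  For instance, with the Hadoop 1.0.3 release used later in this guide (the JDK archive name is only a placeholder; use whatever file you actually downloaded):
       tar -zxvf hadoop-1.0.3.tar.gz
       tar -zxvf <jdk archive>.tar.gz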

3)Set the JAVA_HOME and HADOOP_HOME paths inside ~/.bashrc by using command:-> sudo gedit ~/.bashrc -> at the end paste the entries
  in this format (the variables point to the extracted directories, and their bin folders are appended to PATH):-
             export JAVA_HOME=/home/user/desktop/jdk-7u804
             export HADOOP_HOME=/home/user/desktop/hadoop-1
             export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
   To load the updated file, type in the terminal
    Command:- bash    (or: source ~/.bashrc)
    Check the Java and Hadoop versions with the commands:-
    java -version
    hadoop version
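   A quick sanity check that the new entries are picked up (a minimal sketch; the paths printed should match whatever locations you used above):
       echo $JAVA_HOME
       echo $HADOOP_HOME
       which java
       which hadoop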
4) First check the IP address of all the machines (Master, Slave1, Slave2)
   Command:-> $ifconfig
   eg:- Master IP:- 192.168.6.1
        Slave 1:-   192.168.6.2
        Slave 2:-   192.168.6.3
5)Ping the slave machines from the master and the master from the slaves to check the connections.
     Command:->  $ping <IP address>    (eg:- ping 192.168.6.2)
     NOTE:- It will keep running continuously until you press Ctrl + C; a bounded alternative is sketched below.
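     If you would rather have ping stop on its own, the -c flag limits the number of packets sent (sketch using the example Slave 1 address from step 4):
         ping -c 4 192.168.6.2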

6)Register the following in the MASTER MACHINE ONLY.
     Command:-> sudo gedit /etc/hostname
      And Type:-   master

7)Register the following in the SLAVE 1 MACHINE ONLY
      Command:-> sudo gedit /etc/hostname
       And Type:->  slave1

8)Register the following in the SLAVE 2 MACHINE ONLY
     Command:-> sudo gedit /etc/hostname
      And Type:-> slave2
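   The new hostname normally takes effect after a reboot; to apply and verify it immediately on each machine (shown here for the master):
      sudo hostname master     (apply without rebooting)
      hostname                 (should now print: master)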
 
9)Make the following entry in the MASTER MACHINE ONLY.
     Command:-> sudo gedit /etc/hosts
     And Type:->
        <masterIP> master
        <slave1IP> slave1
        <slave2IP> slave2
     example:-
        192.168.1.6 master
        192.168.1.7 slave1
        192.168.1.5 slave2
10)Make the following entry in the SLAVE 1 NODE ONLY.
      Command:-> sudo gedit /etc/hosts
      And Type:-
        192.168.1.6 master
        192.168.1.7 slave1

11)Make the following entry in the SLAVE 2 NODE ONLY.
     Command:-> sudo gedit /etc/hosts
     And Type:-
        192.168.1.6 master
        192.168.1.5 slave2
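   To confirm the names now resolve, ping each machine by hostname (one packet is enough; sketched from the master):
      ping -c 1 slave1
      ping -c 1 slave2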

12)Configuring ssh
     [SSH should be configured on the MASTER AND ALL THE SLAVE MACHINES]
      $sudo apt-get install ssh
      $sudo apt-get install rsync
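      To confirm the SSH daemon is actually running after the install (on Ubuntu the service is named ssh):
      $sudo service ssh status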

13)Generate Key:-
  Command:-> ssh-keygen -t rsa -P ""
  (when prompted, either accept the default file name or give the generated key a name)
  (note the path of the generated public key, e.g. ~/.ssh/id_rsa.pub)

  Use this command to copy the generated public key to the master VM:
  Command:->
       i) ssh-copy-id -i <path of generated public key> <userName>@master
  ex:-  ssh-copy-id -i /root/mastergen.pub root@master (here, username = root and generated key name = mastergen.pub)
       ii) [Only For Master Node]
      Command:-> ssh-copy-id -i /home/user/.ssh/id_rsa.pub username@slave1
  ex:-  ssh-copy-id -i /root/mastergen.pub root@slave1 (here, username = root and generated key name = mastergen.pub)
       iii) [Only For Master Node]
      Command:-> ssh-copy-id -i /home/user/.ssh/id_rsa.pub username@slave2
  ex:-  ssh-copy-id -i /root/mastergen.pub root@slave2 (here, username = root and generated key name = mastergen.pub)
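  If ssh-copy-id is not available, the same result can be achieved manually (a sketch assuming the default key path ~/.ssh/id_rsa.pub; it appends the public key to authorized_keys on the target machine):
      cat ~/.ssh/id_rsa.pub | ssh username@slave1 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"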

14) [Only For Master Node]
     Command:-> $whoami (command to find the username)

15)Check whether ssh has been configured successfully or not: [from the MASTER MACHINE ONLY]
        ssh username@master (eg:- username=root)
        ssh username@slave1
        ssh username@slave2
   (each login should succeed without asking for a password; type exit to come back to the master)
16)Configure the masters file with the name of the master node (from MASTER NODE ONLY)
      cd /home/user/Downloads/hadoop-1.0.3/conf
      sudo gedit masters
        And Type:- master

17)Configure the slaves file with the names of all the slave nodes (from MASTER NODE ONLY)
       cd /home/user/Downloads/hadoop-1.0.3/conf
       sudo gedit slaves
      And Type:-  master
                  slave1
                  slave2
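   The two files should end up like this (note that in Hadoop 1.x the masters file actually controls where the SecondaryNameNode starts, and listing master in slaves means the master machine will also run a DataNode and TaskTracker):
       $ cat masters
       master
       $ cat slaves
       master
       slave1
       slave2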

18)Configure the following files (from the MASTER AND ALL THE SLAVE NODES)
        cd /home/user/Downloads/hadoop-1.0.3/conf

        sudo gedit core-site.xml
        <configuration>
          <property>
            <name>fs.default.name</name>
            <value>hdfs://master:54310</value>
          </property>
        </configuration>

        sudo gedit hdfs-site.xml
        <configuration>
          <property>
            <name>dfs.replication</name>
            <value>3</value>
          </property>
        </configuration>

        sudo gedit mapred-site.xml
        <configuration>
          <property>
            <name>mapred.job.tracker</name>
            <value>hdfs://master:54311</value>
          </property>
        </configuration>
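   Instead of editing the three files by hand on every node, you can edit them once on the master and push them out (a sketch, assuming Hadoop is unpacked at the same path on every machine and the passwordless SSH from step 13 is working; run it from the conf directory on the master):
       scp core-site.xml hdfs-site.xml mapred-site.xml username@slave1:/home/user/Downloads/hadoop-1.0.3/conf/
       scp core-site.xml hdfs-site.xml mapred-site.xml username@slave2:/home/user/Downloads/hadoop-1.0.3/conf/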

19)Format the Namenode (run all of the commands below FROM THE MASTER NODE ONLY; the ssh steps reach the slave nodes over the same session)
cd /home/user/Downloads/hadoop-1.0.3/

     ->  bin/hadoop namenode -format

-> ssh username@slave1
   cd /home/user/Downloads/hadoop-1.0.3/
   bin/hadoop namenode -format
   exit
-> ssh username@slave2
   cd /home/user/Downloads/hadoop-1.0.3/
   bin/hadoop namenode -format
   exit

-> cd /home/user/Downloads/hadoop-1.0.3/
   bin/start-all.sh
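   When you need to shut the cluster down later, the counterpart script stops every daemon from the master:
       bin/stop-all.sh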
   
20)For checking that the daemons are working fine :
     jps
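   Roughly what you should see if everything started (process IDs omitted; since master is also listed in the slaves file it runs the worker daemons too):
      On master:  NameNode, SecondaryNameNode, JobTracker, DataNode, TaskTracker, Jps
      On slave1 / slave2:  DataNode, TaskTracker, Jps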
