
I have a Hadoop cluster made of 3 slaves and 1 master, on top of which runs an HBase cluster with 3 Region Servers and 1 HBase Master, respectively. Additionally, there is a Zookeeper ensemble on 3 machines.

The Hadoop cluster and the Zookeeper ensemble are both functioning correctly. However, the HBase cluster fails to initialize properly.

I start HBase by running ./bin/start-hbase.sh. This correctly starts the HBase Master and the Region Servers, and the hbase folder in HDFS is set up correctly.
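(For reference, a quick way to check the HDFS side is a plain directory listing — a sketch, assuming the /hbase path from hbase.rootdir further down:)

hduser@master:~/hbase$ hdfs dfs -ls /hbase    # the HBase root dir should exist and be writable by the HBase user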

jps on master

hduser@master:~/hbase$ jps
5694 HMaster
3934 JobHistoryServer
3786 NameNode
3873 ResourceManager
6025 Jps

jps on slaves

5737 Jps
5499 HRegionServer
3736 DataNode
3820 NodeManager

However, the HBase Master does not register the Region Servers, as is also apparent from the logs:

master log

[master:master:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1511 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.

slave log

[regionserver60020] regionserver.HRegionServer: reportForDuty to master=master,60000,1404856451890 with port=60020, startcode=1404856453874
[regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending local=/10.0.2.15:53939 remote=master/192.168.66.60:60000]
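For context, these are the kind of basic checks one can run from a slave to narrow this down (a sketch; the hostnames and the 60000 RPC port are the ones from my setup):

hduser@slave-1:~$ getent hosts master      # does "master" resolve to 192.168.66.60 on this node?
hduser@slave-1:~$ nc -zv master 60000      # is the HMaster RPC port reachable from here?
hduser@slave-1:~$ ip -4 addr show          # which interfaces exist? note the 10.0.2.15 address in the log vs. 192.168.66.x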

Here are the configuration details:

/etc/hosts on master

192.168.66.63   slave-3 # Data Node and Region Server
192.168.66.60   master # Name Node and HBase Master
192.168.66.73   zookeeper-3 # Zookeeper node
192.168.66.71   zookeeper-1 # Zookeeper node
192.168.66.72   zookeeper-2 # Zookeeper node
192.168.66.62   slave-2 # Data Node and Region Server
192.168.66.61   slave-1 # Data Node and Region Server

/etc/hosts on slave-1

192.168.66.60   master
192.168.66.73   zookeeper-3
192.168.66.71   zookeeper-1
192.168.66.72   zookeeper-2

hbase-site.xml on ALL cluster nodes

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hbase.tmp.dir</name>
        <value>/home/hduser/hbase/tmp</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.local.dir</name>
        <value>/home/hduser/hbase/local</value>
    </property>
    <property>
        <name>hbase.master.info.port</name>
        <value>6010</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>zookeeper-1,zookeeper-2,zookeeper-3,</value>
    </property>
</configuration>

regionservers file on master and slaves

slave-3
slave-1
slave-2

hbase-env.sh on master and slaves

export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_MANAGES_ZK=false
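
The first export just derives JAVA_HOME from the system javac; for reference it evaluates to something like the following (example output only, the actual path depends on the installed JDK):

hduser@master:~$ readlink -f /usr/bin/javac | sed "s:/bin/javac::"
/usr/lib/jvm/java-7-openjdk-amd64          # example; varies with the JDK package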

What am I doing wrong so that the nodes cannot talk to each other? I am using Hadoop 2.4.0 and HBase 0.98.3 along with Zookeeper 3.4.6 on Ubuntu Trusty Tahr x64.


1 Answer


My mystery was solved by Ian Brooks on the HBase mailing list.

Essentially, I needed to manually list the slaves in the /etc/hosts file of each slave (I suspect that adding only the slave itself would have been enough), so that I ended up with something like:

/etc/hosts on all slaves (RS)

192.168.66.60   master
192.168.66.73   zookeeper-3
192.168.66.71   zookeeper-1
192.168.66.72   zookeeper-2
192.168.66.61   slave-1
192.168.66.62   slave-2
192.168.66.63   slave-3

The reason for this was that there were two eth interfaces running on the slaves and the local hostname resolved to a different IP.
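
A quick way to confirm the fix on each Region Server is to check what the local hostname resolves to and which interfaces are up (a sketch, using slave-1 as an example):

hduser@slave-1:~$ getent hosts slave-1     # after the /etc/hosts fix this should return 192.168.66.61
hduser@slave-1:~$ ip -4 addr show          # both interfaces are listed; the RegionServer should advertise the 192.168.66.x address, not the 10.0.2.x one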