2 votes

This is happening in pseudo-distributed as well as distributed mode. When I try to start HBase, all three services - master, regionserver, and quorumpeer - initially start. However, within a minute the master stops. This is the trace from the logs:

2013-05-06 20:10:25,525 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 0 time(s).
2013-05-06 20:10:26,528 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 1 time(s).
2013-05-06 20:10:27,530 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 2 time(s).
2013-05-06 20:10:28,533 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 3 time(s).
2013-05-06 20:10:29,535 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 4 time(s).
2013-05-06 20:10:30,538 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 5 time(s).
2013-05-06 20:10:31,540 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 6 time(s).
2013-05-06 20:10:32,543 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 7 time(s).
2013-05-06 20:10:33,544 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 8 time(s).
2013-05-06 20:10:34,547 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: <master/master_ip>:9000. Already tried 9 time(s).
2013-05-06 20:10:34,550 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call to <master/master_ip>:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
        at org.apache.hadoop.ipc.Client.call(Client.java:1155)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at $Proxy9.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:132)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:259)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:220)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1611)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:68)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1645)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1627)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
        at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:363)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:86)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:368)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:301)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
        at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
        at org.apache.hadoop.ipc.Client.call(Client.java:1121)
        ... 18 more

Steps I have taken to fix this, without any success:

- downgraded from distributed mode to pseudo-distributed mode. Same issue.
- tried standalone mode. No luck.
- used the same user (hadoop) for both Hadoop and HBase, and set up passwordless ssh for hadoop. Same problem.
- edited the /etc/hosts file and changed localhost/servername as well as 127.0.0.1 to the actual IP address, referencing SO and other sources. Still the same issue.
- rebooted the server.
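Since the refused connection is to port 9000 (the HDFS NameNode, not HBase itself), a quick way to confirm whether the NameNode is up and listening at all, before touching HBase (Linux commands shown for illustration; the port comes from hbase.rootdir below):

# is the NameNode JVM running on the master?
$ jps | grep NameNode

# is anything listening on the port hbase.rootdir points at?
$ netstat -lnt | grep 9000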

Here are the conf files.

hbase-site.xml

<configuration>
<property>
        <name>hbase.rootdir</name>
        <value>hdfs://<master>:9000/hbase</value>
        <description>The directory shared by regionservers.</description>
</property>

<property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
</property>

<property>
        <name>hbase.zookeeper.quorum</name>
        <value><master></value>
</property>

<property>
        <name>hbase.master</name>
        <value><master>:60000</value>
        <description>The host and port that the HBase master runs at.</description>
</property>

<property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>The replication count for HLog and HFile storage. Should not be greater than the number of HDFS datanodes.</description>
</property>

</configuration>

/etc/hosts file

127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6

What am I doing wrong here?

Hadoop version: 0.20.2-cdh3u5
HBase version: 0.90.6-cdh3u5


2 Answers

5 votes

Looking at your configuration file, I assume you are using the actual hostname in your config files. If that is the case, add the hostname along with the machine's IP to the /etc/hosts file, and make sure it matches the hostname in Hadoop's core-site.xml. Correct name resolution is vital for HBase to function properly.
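For example, if the machine's hostname were master and its address 192.168.0.2 (both placeholders here), the relevant /etc/hosts entries would look something like:

127.0.0.1      localhost.localdomain localhost
192.168.0.2    master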

If you still face problems, please follow the steps mentioned here carefully. I have tried to explain the procedure in detail, and hopefully you'll be able to get it running if you follow all the steps.
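As a quick sanity check, the host and port in hbase.rootdir must match fs.default.name in core-site.xml exactly. One way to eyeball both side by side, assuming the stock conf/ layout (adjust the paths to your install):

$ grep -A 1 'fs.default.name' $HADOOP_HOME/conf/core-site.xml
$ grep -A 1 'hbase.rootdir' $HBASE_HOME/conf/hbase-site.xml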

HTH

-1 votes

I believe you're trying to use pseudo-distributed mode. I was getting the same error until I fixed three things:

  1. local /etc/hosts file

$ cat /etc/hosts

127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost 
fe80::1%lo0 localhost
172.20.x.x  my.hostname.com
  2. Instead of pointing to the hostname, point to localhost in hbase-env.sh.

  3. Correct the classpath.

A. Ensure Hadoop is on the classpath (via hbase-env.sh):

    export JAVA_HOME=<your path to java home>
    export HADOOP_HOME=<your path to hadoop home>
    export HBASE_HOME=<your path to hbase home>

    export HBASE_CLASSPATH=<your path to hbase home>/conf:<your path to hadoop home>/conf

B. When running my program, I edited the following bash script from HBase: The Definitive Guide (bin/run.sh), listed here with comment lines stripped via grep -v '#' (the pattern must be quoted, or the shell treats the # as a comment). That filter also drops the argument-check line, because $# contains a #, so that line is restored in the listing:

$ grep -v '#' bin/run.sh

bin=`dirname "$0"`
bin=`cd "$bin">/dev/null; pwd`

  echo "usage: $(basename $0) <example-name>"
  exit 1;
fi

MVN="mvn"
if [ "$MAVEN_HOME" != "" ]; then
  MVN=${MAVEN_HOME}/bin/mvn
fi

CLASSPATH="${HBASE_CONF_DIR}"

if [ -d "${bin}/../target/classes" ]; then
  CLASSPATH=${CLASSPATH}:${bin}/../target/classes
fi

cpfile="${bin}/../target/cached_classpath.txt"
if [ ! -f "${cpfile}" ]; then
  ${MVN} -f "${bin}/../pom.xml" dependency:build-classpath -Dmdep.outputFile="${cpfile}" &> /dev/null
fi
CLASSPATH=`hbase classpath`:${CLASSPATH}:`cat "${cpfile}"`

JAVA_HOME=<your path to java home>
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx512m

echo "Classpath is $CLASSPATH"
"$JAVA" $JAVA_HEAP_MAX -classpath "$CLASSPATH" "$@"

It's worth noting that I am using a Mac, but I believe these instructions will work on Linux too.