1
votes

I am writing a MapReduce job that reads from one or more HBase tables. Almost everything works as expected, except the Configuration class. This is what I did:

Configuration config = HBaseConfiguration.create();
GenericOptionsParser parser = new GenericOptionsParser(config, args);
// This should work but is not working.
config.addResource(new Path(parser.getCommandLine().getOptionValue("conf", DEFAULT_HBASE_CONF)));

When I run the job like this (passing the path to hbase-site.xml correctly), I get this error.

14/06/30 23:02:30 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
14/06/30 23:02:30 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

But when I add the following two lines, it works like a charm (even though it seems completely ridiculous).

// So these are the workarounds.
config.set("hbase.rootdir", config.get("hbase.rootdir"));
config.set("hbase.zookeeper.quorum", config.get("hbase.zookeeper.quorum"));

Basically, I read the parameters back from the Configuration object and set them again on the same object, which is bonkers.

I found a bug report about this, HBASE-11066, but it was closed as a local configuration problem (which I don't think applies here). There is also an SO question here that is probably similar to mine, but it has no answer yet. I use CDH 5.0.2 with HBase 0.96.1.1. Any insight would be deeply appreciated.


1 Answer

0
votes

Today I ran into something similar.

Effectively: My job has 'localhost' as the hbase.zookeeper.quorum when I run it from my IDE.

The cause was that the 'yarn' and 'hadoop' scripts add the config dir (i.e. the directory where hbase-site.xml is located) to the classpath before starting the Java runtime. When I run from my IDE, this is not done at all, so the defaults from hbase-default.xml stay in effect.

Now, when you create the HBase config, two files are loaded:

  • hbase-default.xml: This is part of one of the HBase jars, so it will always be found.
  • hbase-site.xml: This is in the config dir. The config dir should be on the classpath, and this file can override some of the settings from the default.
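A quick way to see which of the two files the JVM can actually find is to ask the class loader for them by name. This is a minimal sketch, not HBase code: the class name ConfigResourceCheck and the helper locate are made up for illustration, and it assumes the configuration files are looked up through the thread context class loader, which is the usual default.

```java
import java.net.URL;

public class ConfigResourceCheck {

    /**
     * Returns the URL the named resource would be loaded from,
     * or null if it is not visible on the classpath.
     * (Hypothetical helper, not part of Hadoop or HBase.)
     */
    public static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the system class loader if no context loader is set.
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        String[] names = {"hbase-default.xml", "hbase-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

If hbase-site.xml shows up as not on the classpath when started from the IDE but resolves when started through the 'hadoop' or 'yarn' script, that matches the situation described above.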

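I validated this by printing the classpath from within my application using a snippet like this (copied from here):

import java.net.URL;
import java.net.URLClassLoader;

// Note: this cast only works on Java 8 and earlier. On Java 9+ the system
// class loader is not a URLClassLoader and the cast throws ClassCastException.
ClassLoader cl = ClassLoader.getSystemClassLoader();
URL[] urls = ((URLClassLoader) cl).getURLs();
for (URL url : urls) {
    System.out.println(url.getFile());
}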
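On JVMs where the system class loader is not a URLClassLoader (Java 9 and later), the same check can be sketched by reading the java.class.path system property instead. The class name PrintClasspath and the helper splitClasspath are illustrative only. Note that this property reflects the launch classpath, not anything added later by custom class loaders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class PrintClasspath {

    /** Splits a classpath string into its non-empty entries. (Hypothetical helper.) */
    public static List<String> splitClasspath(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with.
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : splitClasspath(classpath)) {
            System.out.println(entry);
        }
    }
}
```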

and by printing the result of

config.get("hbase.zookeeper.quorum")

I suspect you have a similar issue.

One thing I'm considering is to read the "HADOOP_CONF_DIR" environment variable, check that this directory is part of the classpath, and log a warning if it is not.
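A rough sketch of what such a check could look like follows. It is not something I have running in production: the class name ConfDirCheck and the method isOnClasspath are made up, and it assumes the config dir appears on the classpath as a plain directory entry (it compares normalized absolute paths and does not resolve symlinks).

```java
import java.io.File;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ConfDirCheck {

    /** True if confDir is one of the entries of the given classpath string. (Hypothetical helper.) */
    public static boolean isOnClasspath(String confDir, String classpath) {
        if (confDir == null || confDir.isEmpty() || classpath == null) {
            return false;
        }
        Path wanted = Paths.get(confDir).toAbsolutePath().normalize();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            try {
                if (Paths.get(entry).toAbsolutePath().normalize().equals(wanted)) {
                    return true;
                }
            } catch (InvalidPathException e) {
                // Skip entries that are not valid paths on this platform.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String confDir = System.getenv("HADOOP_CONF_DIR");
        String classpath = System.getProperty("java.class.path", "");
        if (confDir == null) {
            System.err.println("WARN: HADOOP_CONF_DIR is not set");
        } else if (!isOnClasspath(confDir, classpath)) {
            System.err.println("WARN: " + confDir + " is not on the classpath; site config files may not be loaded");
        }
    }
}
```

Calling this at the start of the job's main method would at least make the "silently falling back to localhost" case visible when launching from an IDE.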