I'm running Hive and HBase on a 2-node Hadoop cluster. I'm using hadoop-0.20.205.0, hive-0.9.0, hbase-0.92.0, and zookeeper-3.4.2.

Hive and HBase each work fine on their own. I then followed this manual to integrate them: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

Hive started without errors, and I created the sample table:

CREATE TABLE hbase_table_1(key int, value string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");

"show tables" in Hive and "list" or "scan" in HBase work well. But when I run "select * from hbase_table_1;" in Hive, I get these errors:

2012-09-12 11:25:56,975 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Hive Internal Error: java.lang.RuntimeException(Error while making MR scratch directory - check filesystem config (null))
java.lang.RuntimeException: Error while making MR scratch directory - check filesystem config (null)
...
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://10.10.10.15:54310/tmp/hive-hadoop/hive_2012-09-12_11-25-56_602_1946700606338541381, expected: hdfs://hadoop01:54310

It says the filesystem is wrong, but I don't think it's right to point the filesystem setting at such a path. Where should I configure it?

Here are my config files. The IP address of hadoop01 is 10.10.10.15.

hbase-site.xml

<configuration>
<property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
</property>


<property>
    <name>hbase.zookeeper.quorum</name>
    <value>10.10.10.15</value>
    <description>The directory shared by RegionServers.
    </description>
</property>
<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/datas/zookeeper</value>
    <description>Property from ZooKeeper's config zoo.cfg.
      The directory where the snapshot is stored.
    </description>
</property>

<property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop01:54310/hbase</value>
    <description>The directory shared by RegionServers.
    </description>
</property>
<property>
     <name>hbase.cluster.distributed</name>
     <value>true</value>
     <description>The mode the cluster will be in. Possible values are
       false: standalone and pseudo-distributed setups with managed Zookeeper
       true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
     </description>
</property>
</configuration>

Can anyone help, please?

1 Answer

I solved it myself.

Modify $HADOOP_HOME/conf/core-site.xml and change fs.default.name from the IP address to the hostname, like this:

<property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop01:54310/</value>   
</property>

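Make sure that both this property and the hbase.rootdir property in hbase-site.xml use the same hostname or the same IP address. Mixing the two produces the "Wrong FS" error above, even though hadoop01 resolves to 10.10.10.15.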
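
The error message suggests that the two URIs are compared as text (scheme plus host:port) rather than by resolving the hostname. Below is a minimal Python sketch of that kind of check, which can be used to sanity-check config values before restarting anything. It is only an illustration, not Hadoop's actual code; the function name same_filesystem is made up for this example, and the URIs are copied from the question.

```python
from urllib.parse import urlparse

def same_filesystem(default_fs: str, path: str) -> bool:
    """Return True when both URIs have the same scheme and host:port.

    The comparison is purely textual (case-insensitive); no DNS lookup
    is done, so a hostname and its IP address count as different.
    """
    a, b = urlparse(default_fs), urlparse(path)
    return (a.scheme.lower(), a.netloc.lower()) == (b.scheme.lower(), b.netloc.lower())

# Values taken from the configs and the error message in the question.
fs_default_name = "hdfs://hadoop01:54310/"
hbase_rootdir = "hdfs://hadoop01:54310/hbase"
scratch_dir_by_ip = "hdfs://10.10.10.15:54310/tmp/hive-hadoop"

print(same_filesystem(fs_default_name, hbase_rootdir))      # True
print(same_filesystem(fs_default_name, scratch_dir_by_ip))  # False
```

The second call returns False for the same reason Hive failed: one side uses the IP address and the other uses the hostname.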