I'm setting up Alluxio as a middle layer between Ceph and Hive. Following the tutorial Running Apache Hive with Alluxio, I tried the "Serve Existing Tables Stored in HDFS from Alluxio" approach, because I currently access the data through external tables.
The critical step is to change the table location from a distributed storage system such as HDFS or Ceph to Alluxio. The tutorial describes it as follows:

4.2. Move an External Table from HDFS to Alluxio

Assume there is an existing external table u_user in Hive with location set to hdfs://namenode_hostname:port/ml-100k. You can use the following HiveQL statement to check its “Location” attribute:

hive> desc formatted u_user;
Then use the following HiveQL statement to change the table data location from HDFS to Alluxio:

hive> alter table u_user set location "alluxio://master_hostname:port/ml-100k";

The statement I used is:

alter table call_center set location "alluxio://alluxio_master:19998/tpcds_text_1000.db/call_center";
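
If more tables in the same database need the same change, the statement can be generated instead of typed by hand. The following is only a minimal sketch: alter_location_stmt is a hypothetical helper name, and the host, port, and directory variables simply mirror the values from the statement above. It assumes each table's data sits in a directory named after the table.

```shell
# Sketch: print the ALTER TABLE statement for one table.
# ALLUXIO_MASTER, ALLUXIO_PORT and DB_DIR mirror the values used in the
# statement above; alter_location_stmt is a hypothetical helper name.
ALLUXIO_MASTER="alluxio_master"
ALLUXIO_PORT="19998"
DB_DIR="tpcds_text_1000.db"

alter_location_stmt() {
    # $1: table name, assumed to match its directory name under DB_DIR
    printf 'alter table %s set location "alluxio://%s:%s/%s/%s";\n' \
        "$1" "$ALLUXIO_MASTER" "$ALLUXIO_PORT" "$DB_DIR" "$1"
}

alter_location_stmt call_center
```

The printed statement can be pasted into the hive> prompt or collected in a file and run with hive -f.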

However, I got the following error:

 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. null

With Hive logging set to the WARN level, I got more exception details:

WARN [ main] metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. alter_table_with_environmentContext
org.apache.thrift.transport.TTransportException: null
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[hive-exec-3.1.1.jar:3.1.1]
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-3.1.1.jar:3.1.1]
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) ~[hive-exec-3.1.1.jar:3.1.1]

The current call_center table information is:

# Detailed Table Information
Database:               tpcds_text_1000
OwnerType:              USER
Owner:                  root
Retention:              0
Location:               s3a://tpcds/user/root/tpcds/1000/call_center
Table Type:             EXTERNAL_TABLE

Any comments are welcome, thanks.

1 Answer

Some of the changes described in Running Apache Hive with Alluxio might only take effect after the metastore is restarted.

I killed the metastore and restarted it with hive --service metastore. After that, the table location was modified successfully.
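
The restart can be sketched as a small script. This is an illustration under stated assumptions, not the exact commands I ran: it assumes the metastore JVM's command line contains the class name org.apache.hadoop.hive.metastore.HiveMetaStore, and run and DRY_RUN are hypothetical helpers. DRY_RUN defaults to 1, so the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch: stop the Hive metastore and start it again.
# Assumption (not from the post): the metastore JVM's command line
# contains METASTORE_CLASS, so pkill -f can match it.
METASTORE_CLASS="org.apache.hadoop.hive.metastore.HiveMetaStore"

run() {
    # Hypothetical helper: print the command when DRY_RUN is 1 (the default),
    # execute it otherwise.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restart_metastore() {
    # Stop the running metastore; ignore the error if none is running.
    run pkill -f "$METASTORE_CLASS" || true
    # Start it again in the background, as described above.
    run hive --service metastore &
}

restart_metastore
```

Once the metastore is accepting connections again, rerun the alter table statement and check the result with desc formatted.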

I'm leaving this thread here for anyone who might run into the same situation.