I'm setting up Alluxio as a middle layer between Ceph and Hive. Following the tutorial Running Apache Hive with Alluxio, I tried the section "Serve Existing Tables Stored in HDFS from Alluxio", because I currently access my data through external tables.
The critical step is to change the table location from the distributed storage system (such as HDFS or Ceph) to Alluxio:
4.2. Move an External Table from HDFS to Alluxio
Assume there is an existing external table u_user in Hive with location set to hdfs://namenode_hostname:port/ml-100k. You can use the following HiveQL statement to check its “Location” attribute:
hive> desc formatted u_user;
Then use the following HiveQL statement to change the table data location from HDFS to Alluxio:
hive> alter table u_user set location "alluxio://master_hostname:port/ml-100k";
The statement I used is:
alter table call_center set location "alluxio://alluxio_master:19998/tpcds_text_1000.db/call_center";
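For context, here is a sketch of the tutorial's two steps applied to this table. The `use` line assumes the database name shown in the table details; the final `desc formatted` is only the check the tutorial suggests, and would show the new location only if the alter succeeds.

```sql
-- Select the database that holds the table (name taken from the table details).
use tpcds_text_1000;

-- Step 1 (from the tutorial): check the current "Location" attribute.
desc formatted call_center;

-- Step 2 (from the tutorial): point the external table at Alluxio.
alter table call_center set location "alluxio://alluxio_master:19998/tpcds_text_1000.db/call_center";

-- Re-check "Location"; it should now start with alluxio:// if step 2 worked.
desc formatted call_center;
```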
However, I got the following error:
ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. null
After setting the Hive log level to WARN, I got more exception details:
WARN [ main] metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. alter_table_with_environmentContext
org.apache.thrift.transport.TTransportException: null
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[hive-exec-3.1.1.jar:3.1.1]
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-3.1.1.jar:3.1.1]
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) ~[hive-exec-3.1.1.jar:3.1.1]
The current call_center table information is shown below:
# Detailed Table Information
Database: tpcds_text_1000
OwnerType: USER
Owner: root
Retention: 0
Location: s3a://tpcds/user/root/tpcds/1000/call_center
Table Type: EXTERNAL_TABLE
Any comments are welcome, thanks.