
Hive has two kinds of tables, managed and external. For the difference, see Managed vs. External Tables.

Currently, to move an external database from HDFS to Alluxio, I need to change each external table's location to an alluxio:// URI.

The statement is something like: alter table catalog_page set location "alluxio://node1:19998/user/root/tpcds/1000/catalog_returns"
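With many tables, the statements can be generated rather than typed by hand. This is only a rough sketch: the function name gen_alter_statements and the file name relocate.hql are made up for illustration, and it assumes each table's directory under the base URI is named after the table, which may not match your layout.

```shell
#!/bin/sh
# Base URI taken from the statement above; adjust to your cluster.
BASE_URI="alluxio://node1:19998/user/root/tpcds/1000"

# Print one ALTER TABLE statement per table name passed as an argument.
# Assumption: the table's directory is <BASE_URI>/<table name>.
gen_alter_statements() {
  for table in "$@"; do
    printf 'ALTER TABLE %s SET LOCATION "%s/%s";\n' "$table" "$BASE_URI" "$table"
  done
}

# Example use (commented out so the sketch has no side effects):
#   gen_alter_statements catalog_page catalog_returns > relocate.hql
#   hive -f relocate.hql
```

Writing the statements to a file first makes it easy to review them before handing them to hive -f.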

As I understand it, this should be a simple metastore modification. However, for some tables the change takes dozens of minutes. The database itself contains about 1 TB of data, by the way.

Is there any way to speed up the ALTER TABLE process? If not, why is it so slow? Any comments are welcome, thanks.

Suppose I have 10 tables. Does it make sense to launch 10 Hive client processes, each altering one table? - Eugene

1 Answer


I found the suggested way: metatool, which lives under $HIVE_HOME/bin.

metatool -updateLocation <new-loc> <old-loc>      Update FS root location in the
                                          metastore to new location.Both
                                          new-loc and old-loc should be
                                          valid URIs with valid host names
                                          and schemes.When run with the
                                          dryRun option changes are
                                          displayed but are not persisted.
                                          When run with the
                                          serdepropKey/tablePropKey option
                                          updateLocation looks for the
                                          serde-prop-key/table-prop-key
                                          that is specified and updates
                                          its value if found.

With this tool, the location change is very fast (maybe several seconds).

Leaving this thread here for anyone who might run into the same situation.
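For reference, here is roughly how an invocation might look, based on the help text above. The two URIs are hypothetical placeholders (in particular the HDFS host and port), the helper name metatool_cmd is made up, and /opt/hive is only a fallback if HIVE_HOME is unset. The script only prints the command lines so they can be checked first. Note that the help text puts the new location before the old one, and that the tool rewrites the FS root in the metastore rather than a single table's location.

```shell
#!/bin/sh
# Hypothetical FS roots for illustration; replace with your own.
OLD_ROOT="hdfs://node1:8020"
NEW_ROOT="alluxio://node1:19998"

# Build a metatool command line.
#   $1 = new root, $2 = old root, $3 = optional "-dryRun"
metatool_cmd() {
  printf '%s/bin/metatool -updateLocation %s %s%s\n' \
    "${HIVE_HOME:-/opt/hive}" "$1" "$2" "${3:+ $3}"
}

# Step 1: preview. With -dryRun the changes are displayed but not persisted.
metatool_cmd "$NEW_ROOT" "$OLD_ROOT" -dryRun

# Step 2: the real update, once the preview looks right.
metatool_cmd "$NEW_ROOT" "$OLD_ROOT"
```

Run the printed dry-run line first, compare the listed locations with what you expect, and only then run the second line.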