2
votes

I'm running Hive 2.1.1 on EMR 5.5.0 with a remote mysql metastore DB. I need to enable transactions on hive, but when I follow the configuration here and run any query, I get the following error

FAILED: Error in acquiring locks: Error communicating with the metastore

Settings on the metastore:

hive.compactor.worker.threads = 0
hive.compactor.initiator.on = true

Settings in the hive client:

SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

This only happens when I set hive.txn.manager, so my hive metastore is definitely online.

I've tried some of the old suggestions of turning hive test features on which didn't work, but I don't think this is a test feature anymore. I can't turn off concurrency as a similar post in SO suggests because I need concurrency. It seems like the problem is that either DbTxnManager isn't getting the remote metastore connection info properly from hive config or the mysqldb is missing some tables required by DbTxnManager. I have datanucleus.autoCreateTables=true.

2
The only class that I can find that implements HiveTxnManager is DbTxnManager which is the class that's not working for me. I have an alternative which would be to write a custom transaction manager, but that's not ideal. - Brandon Hamric

2 Answers

1
votes

It looks like hive wasn't properly creating the tables needed for the transaction manager. I'm not sure where it was getting its schema, but it was definitely wrong.

So we just ran the hive-txn-schema query to setup the schema manually. We'll do this at the start of any of our clusters from now on.

https://github.com/apache/hive/blob/master/metastore/scripts/upgrade/mysql/hive-txn-schema-2.1.0.mysql.sql

0
votes

The error from FAILED: Error in acquiring locks: Error communicating with the metastore

sometimes because of it without any data, you need to initialization some data in your tables. for example below.

create table t1(id int, name string)     
clustered by (id) into 8 buckets     
stored as orc TBLPROPERTIES ('transactional'='true');