
We have a Hadoop cluster (HDP 2.6.5) with a Hive metastore and Presto workers.

On the Presto workers we defined the following catalog configuration:

[root@presto_worker catalog]# ls -ltr
total 12
-rw-r--r-- 1 root root 247 Aug  5 14:30 jmx.properties
-rw-r--r-- 1 root root  54 Aug  5 14:30 memory.properties
-rw-r--r-- 1 root root 329 Aug  5 14:30 hive.properties

[root@presto_worker catalog]# more hive.properties
#
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop01.sys65.com:9083,thrift://hadoop03.sys65.com:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
hive.parquet.fail-on-corrupted-statistics=false
hive.force-local-scheduling=true
hive.parquet.use-column-names=true
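
For reference, the `hive.metastore.uri` value above is a comma-separated list of Thrift endpoints. Below is a minimal sketch of how such a line breaks down into host/port pairs. It is illustrative Python, not Presto's own parsing code, and the function name `metastore_endpoints` is made up for this example.

```python
from urllib.parse import urlparse


def metastore_endpoints(properties_text):
    """Extract (host, port) pairs from the hive.metastore.uri line of a catalog file."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments such as the lone "#" in hive.properties.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip() == "hive.metastore.uri":
            uris = [u.strip() for u in value.split(",") if u.strip()]
            return [(urlparse(u).hostname, urlparse(u).port) for u in uris]
    return []


# Two lines copied from the hive.properties shown above.
sample = """\
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop01.sys65.com:9083,thrift://hadoop03.sys65.com:9083
"""
print(metastore_endpoints(sample))
```

Each pair is a plain TCP endpoint on which a metastore Thrift service is expected to listen (port 9083 in this setup).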

My question is: how does presto_worker connect to the Hive metastore?

What steps are performed in the background when presto_worker connects to the Hive metastore?

It opens a Thrift client? Just like Beeline or Hive CLI? – OneCricketeer
Can you give more details from beginning to end? What are the stages? – Judy
Do you mean it first logs in with su hive, then starts beeline, and then runs some queries? – Judy
No, I mean the steps are the same. Afterwards, it uses Java classes internal to the Hive client and runs queries just as any other client would. – OneCricketeer

1 Answer

A worker needs a connection to the Hive metastore when performing an INSERT into an existing partitioned table. The relevant code is here: https://github.com/trinodb/trino/blob/6d9d47e0909b3fe9367584ba8450827dbbb8e1d7/presto-hive/src/main/java/io/prestosql/plugin/hive/metastore/HivePageSinkMetadataProvider.java#L58

AFAIR, a worker does not need the metastore otherwise.
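
If you want to check from a worker whether the metastore endpoints are reachable at all, a plain TCP connect is a reasonable first test. The sketch below only opens and closes a socket; it does not speak the Thrift protocol and is not how Presto itself connects. The helper name `first_reachable` and the idea of trying the endpoints in listed order are assumptions made for illustration.

```python
import socket


def first_reachable(endpoints, timeout=3.0, connect=socket.create_connection):
    """Return the first (host, port) that accepts a TCP connection, else None.

    `connect` is injectable so the function can be exercised without a network.
    """
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            continue
        conn.close()
        return (host, port)
    return None
```

Calling `first_reachable([("hadoop01.sys65.com", 9083), ("hadoop03.sys65.com", 9083)])` on a worker, with the hosts taken from the `hive.metastore.uri` in the question, returns the first endpoint that accepted a connection, or `None` if neither did.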