1
votes

I would like to know whether GCP's Dataproc supports WebHCat. Googling hasn't turned up anything.

So, does Dataproc support or provide WebHCat, and if so, what is the URL endpoint?

2 Answers

1
votes

Dataproc does not provide WebHCat out of the box. However, it's trivial to install it with an initialization action such as:

#!/bin/bash
# Initialization actions run without a terminal, so install non-interactively.
apt-get install -y hive-webhcat-server
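
To wire the script above into a cluster, it has to be uploaded to Cloud Storage and passed to `gcloud` with `--initialization-actions`. The following is only a sketch: the cluster name `my-cluster`, the bucket path `gs://my-bucket/install-webhcat.sh`, and the helper `create_cluster` are hypothetical placeholders, and depending on your gcloud configuration you may also need `--region`. It prints the command by default rather than running it.

```shell
#!/bin/bash
# Hypothetical names; replace them with your own cluster and bucket.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
INIT_SCRIPT="${INIT_SCRIPT:-gs://my-bucket/install-webhcat.sh}"

# create_cluster NAME SCRIPT_URI
# With DRY_RUN=1 (the default) the command is only printed; set DRY_RUN=0 to run it.
create_cluster() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo gcloud dataproc clusters create "$1" --initialization-actions "$2"
  else
    gcloud dataproc clusters create "$1" --initialization-actions "$2"
  fi
}

create_cluster "$CLUSTER_NAME" "$INIT_SCRIPT"
```

The script itself would first be copied to the bucket, for example with `gsutil cp install-webhcat.sh gs://my-bucket/`.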

WebHCat will then be available on port 50111 of the master node, for example:

http://my-cluster-m:50111/templeton/v1/ddl/database/default/table/my-table
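
As a rough illustration of calling that endpoint, the helper below builds a WebHCat URL and appends the `user.name` query parameter, which WebHCat expects when the cluster uses default (non-Kerberos) security. The function name `webhcat_url` and the user `alice` are hypothetical, and the sketch assumes an unsecured cluster reachable by host name.

```shell
#!/bin/bash
# webhcat_url HOST RESOURCE USER
# Builds a WebHCat (Templeton) v1 URL on the default port 50111.
# webhcat_url is a hypothetical helper, not part of any tool.
webhcat_url() {
  printf 'http://%s:50111/templeton/v1/%s?user.name=%s\n' "$1" "$2" "$3"
}

# Print the URL for the table used in the answer.
webhcat_url my-cluster-m ddl/database/default/table/my-table alice
```

From a machine that can resolve and reach the master node, the result can be passed to curl, for example `curl -s "$(webhcat_url my-cluster-m status alice)"` to check that the server is up.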

Alternatively, you can set up a JDBC connection to HiveServer2, which is available by default: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC

1
votes

Dataproc now offers Hive WebHCat as an optional component, which you can enable when creating the cluster:

gcloud dataproc clusters create $CLUSTER_NAME --optional-components=HIVE_WEBHCAT