I'm trying to set up LLAP (interactive query) for Hive 2.1.0, which ships with Google Cloud Dataproc. I have already enabled Tez as the execution engine, but I can't find any documentation or steps for enabling LLAP to make Hive even faster. Most of the available guides are for Hortonworks clusters, where this is done through Ambari.
2
votes
2 Answers
1
votes
I think you can take the properties listed in the LLAP section of the Hive Configuration Properties documentation and pass them when creating the cluster:
--properties 'hive:hive.llap.execution.mode=<mode>,hive:hive.server2.llap.concurrent.queries=<n>,...'
Note that the "hive:" prefix is required so that Dataproc passes the properties on to Hive.
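As a rough illustration of the prefixing rule, here is a minimal sketch of a shell helper that builds the value for --properties. The function name build_hive_properties is hypothetical (it is not part of gcloud), and the example values for the two LLAP properties are assumptions chosen only for illustration; pick values that suit your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): prefixes each key=value pair
# with "hive:" and joins the pairs with commas, the shape --properties expects.
build_hive_properties() {
  local result="" pair
  for pair in "$@"; do
    # Add a comma separator before every pair except the first.
    result="${result:+${result},}hive:${pair}"
  done
  printf '%s\n' "$result"
}

# The values below are illustrative assumptions, not recommendations.
HIVE_PROPS="$(build_hive_properties \
  'hive.llap.execution.mode=all' \
  'hive.server2.llap.concurrent.queries=2')"
echo "$HIVE_PROPS"
```

The resulting string could then be passed as --properties "$HIVE_PROPS" to gcloud dataproc clusters create.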
0
votes
According to the documents "Using Apache Hive on Cloud Dataproc" and "Cloud SQL I/O and Hive Metastore", you can create the cluster like this:
gcloud dataproc clusters create hive-cluster \
--scopes sql-admin \
--image-version 1.3 \
--initialization-actions gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
--properties "hive:hive.metastore.warehouse.dir=gs://${PROJECT}-warehouse/datasets,hive:hive.llap.execution.mode=<mode>,hive:hive.server2.llap.concurrent.queries=<n>" \
--metadata "hive-metastore-instance=<PROJECT_ID>:<REGION>:<INSTANCE_NAME>"
If you need to set any other Hive configuration property (one that goes in hive-site.xml), add it to --properties with the hive: prefix, in the form hive:<property>=<value>.