2
votes

I'm trying to set up LLAP (interactive query) for Hive 2.1.0, which ships with Google Cloud Dataproc. I have already enabled Tez as the execution engine, but I can't find any documentation or steps for enabling LLAP to make Hive even faster. Most of the available guides are for Hortonworks clusters, where this is done through Ambari.

2 Answers

1
votes

I think you can follow the LLAP section of Hive Configuration Properties and pass the properties you need when creating the cluster:

--properties 'hive:hive.llap.execution.mode=<mode>,hive:hive.server2.llap.concurrent.queries=<n>,...'

Note that the "hive:" prefix is required so that Dataproc passes the properties through to Hive.
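
If you have several settings, building the --properties string by hand gets error-prone. Here is a minimal sketch of a helper that adds the "hive:" prefix for you. The function name hive_props, the cluster name my-cluster, and the example values (tez, all) are my own assumptions for illustration; they are not something Dataproc provides.

```shell
# Hypothetical helper (not part of gcloud): prefix each hive-site.xml
# setting with "hive:" and join the results with commas.
hive_props() {
  out=""
  for kv in "$@"; do
    out="${out:+$out,}hive:$kv"
  done
  printf '%s\n' "$out"
}

# Example values are assumptions; substitute the ones you need.
PROPS="$(hive_props "hive.execution.engine=tez" "hive.llap.execution.mode=all")"

# Print the command instead of running it, so it can be reviewed first.
echo "gcloud dataproc clusters create my-cluster --properties '$PROPS'"
```

Once the printed command looks right, you can run it as-is or paste the properties string into your own create command.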

0
votes

According to the documents "Using Apache Hive on Cloud Dataproc" and "Cloud SQL I/O and Hive Metastore", you can pass the Hive properties when creating the cluster:

gcloud dataproc clusters create hive-cluster \
    --scopes sql-admin \
    --image-version 1.3 \
    --initialization-actions gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
    --properties "hive:hive.metastore.warehouse.dir=gs://${PROJECT}-warehouse/datasets,hive:hive.llap.execution.mode=<mode>,hive:hive.server2.llap.concurrent.queries=<n>" \
    --metadata "hive-metastore-instance=<PROJECT_ID>:<REGION>:<INSTANCE_NAME>"

If you need to set any Hive configuration property (anything that goes in hive-site.xml), just add it as hive:xxx in your properties.
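
To confirm that a hive:xxx property actually ended up in hive-site.xml, something like the sketch below may help. The helper name hive_site_value is hypothetical, and the path in the usage comment (/etc/hive/conf/hive-site.xml on a cluster node) is an assumption you should verify on your image.

```shell
# Hypothetical helper: print the value of one property from a hive-site.xml file.
#   $1 = property name, $2 = path to hive-site.xml
hive_site_value() {
  awk -v name="$1" '
    # Remember when the <name> element for the requested property is seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> that follows it, then stop.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$2"
}

# Usage on a cluster node (path is an assumption):
#   hive_site_value hive.llap.execution.mode /etc/hive/conf/hive-site.xml
```

It prints nothing if the property is not present, which usually means the "hive:" prefix was missing or the property name was misspelled.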