I am new to Azure HDInsight. I am trying to install Presto on an HDInsight cluster.

As a test, I want to run TPC-H queries on it. Here is what I have done so far.

  1. I loaded the TPC-H tables into Hive.

  2. I am able to run a query from the Hive CLI.

  3. I am able to run a show tables query from the Presto CLI.

  4. I am not able to run queries such as select count(*) from region; from the Presto CLI. They fail with this error message: Query 20200605_074052_00011_6etih failed: cannot create caching file system
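The two Presto CLI steps above can be reproduced non-interactively, which makes the failing case easier to share and re-run. This is a minimal sketch, not the asker's actual setup: the executable name `presto`, the server address, the catalog name `hive`, and the schema `default` are all assumptions and need to be adjusted to the cluster. It only builds and prints the command lines; it does not diagnose the error.

```python
import shlex

# Assumed values, not taken from the original post: adjust to your cluster.
PRESTO_SERVER = "localhost:8080"  # coordinator address (assumption)
CATALOG = "hive"                  # catalog backed by the Hive metastore (assumption)
SCHEMA = "default"                # schema holding the TPC-H tables (assumption)


def presto_command(sql, server=PRESTO_SERVER, catalog=CATALOG, schema=SCHEMA):
    """Build the argument list for one non-interactive Presto CLI run."""
    return [
        "presto",  # CLI executable name (assumption; it may be a renamed jar)
        "--server", server,
        "--catalog", catalog,
        "--schema", schema,
        "--execute", sql,
    ]


if __name__ == "__main__":
    # The metadata-only query that works and the table scan that fails.
    for sql in ("show tables", "select count(*) from region"):
        print(shlex.join(presto_command(sql)))
```

Running the printed commands one after the other separates the metadata path (show tables) from the data-reading path (the count over region), which is where the reported error appears.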

When I submit a show tables query from the Presto CLI, I get the output below.

Query 20200605_074050_00010_6etih, FINISHED, 5 nodes Splits: 70 total, 70 done (100.00%) 0:00 [8 rows, 326B] [27 rows/s, 1.08KB/s]

I have barely touched Hadoop settings such as hdfs-site.xml or core-site.xml, and Presto's configuration contains nothing but memory settings.

Any help would be appreciated. Thanks for reading it.

1 Answer

You can install Starburst Presto from the HDInsight marketplace. Read more: https://azure.microsoft.com/pl-pl/blog/azure-hdinsight-and-starburst-brings-presto-to-microsoft-azure-customers/

However, Starburst does not provide an updated version of this solution and recommends a Kubernetes-based deployment (e.g. using Azure AKS) instead. See https://docs.starburstdata.com/latest/installation/azure.html

Disclaimer: I am from Starburst.