0
votes

I cannot connect Tableau to Presto on an EMR cluster. Versions: Tableau 10, emr-5.3.0, Presto 0.157.1

I am able to connect via the presto-cli using these commands:

[hadoop@ip-172-xx-yy-zz scripts]$ presto-cli
presto> use hive.poc;
presto:poc> show tables;
        Table
...

But I am not able to connect from Tableau using the Teradata ODBC connector. Tableau reports the error "catalog is not specified".

However, when I inspect the Java stack trace available in the Presto web interface (http://ec2-aaa-bbb-ccc-ddd.eee.compute.amazonaws.com:8889/query.html?20170130_165412_00329_5gbba), I see the following error. It looks more like a parsing error.

com.facebook.presto.sql.parser.ParsingException: line 1:1: no viable alternative at input '{'
    at com.facebook.presto.sql.parser.SqlParser$1.syntaxError(SqlParser.java:45)
    at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:65)
    ... 60 more

The submitted query was the following:

{"query":"select * from \"hive.poc\".\"information_schema\".\"tables\" WHERE table_schema LIKE 'default' AND table_name LIKE '*'","preparedStatements":{}}
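To make the parse error concrete: the body above is a JSON envelope, so the first character the SQL parser sees is '{', which matches "line 1:1" in the trace. Below is a minimal Python sketch that unwraps such an envelope to show the SQL inside. The helper name unwrap is hypothetical and only for illustration; it is not part of Presto or of the ODBC driver.

```python
import json

# Body exactly as captured from the Presto web interface in the question.
submitted = r'''{"query":"select * from \"hive.poc\".\"information_schema\".\"tables\" WHERE table_schema LIKE 'default' AND table_name LIKE '*'","preparedStatements":{}}'''


def unwrap(body: str) -> str:
    """Return the SQL text from a JSON envelope, or the body unchanged if it is not one.

    Hypothetical helper for illustration only.
    """
    stripped = body.strip()
    if not stripped.startswith("{"):
        return stripped
    try:
        envelope = json.loads(stripped)
    except json.JSONDecodeError:
        return stripped
    if isinstance(envelope, dict) and isinstance(envelope.get("query"), str):
        return envelope["query"]
    return stripped


sql = unwrap(submitted)
print(submitted[0])  # the character the parser rejects at line 1:1
print(sql)           # the statement the driver actually wanted to run
```

The unwrapped statement is ordinary SQL, which suggests the server is parsing the whole envelope as if it were the query text.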

Any help / hint will be appreciated.

Note: in the EMR cluster, the Presto server is available on port 8889, not on the usual 8080.

Thank you!

3 Answers

1
votes

This happens because Tableau is using the Teradata drivers (at least that's what the Tableau website tells you to install). Version 1.1.8 of the drivers is only compatible with Presto 0.157t (i.e. the Teradata release of 0.157).

This is the specific pull request responsible for the error you're seeing: https://github.com/prestodb/presto/pull/5868. A solution would probably be to install 0.157t on EMR, but this requires a lot more work than simply checking the Presto box when spinning up a cluster on EMR.

A workaround that fixes the issue introduced by pull request 5868:

In the aws emr create-cluster command, add the --configurations flag and point it at a local JSON file. For example:

aws emr create-cluster [...] --configurations file://./clusterConfiguration.json 

With the contents of clusterConfiguration.json being:

[
  {
    "Classification": "presto-config",
    "Properties": {
      "presto.version": "0.148"
    }
  }
]

This makes Presto publish its version as 0.148. The Teradata drivers pick this up and fall back to the old payload, which is the one that stock (non-Teradata) 0.157 expects.
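If you would rather generate the file than write it by hand, here is a minimal Python sketch. It assumes the list-of-objects layout that EMR documents for --configurations; the function names are hypothetical, and 0.148 is simply the value used above.

```python
import json


def presto_version_override(version: str) -> list:
    """Build an EMR configuration list that overrides presto.version.

    Assumption: --configurations takes a JSON list of classification objects.
    """
    return [
        {
            "Classification": "presto-config",
            "Properties": {"presto.version": version},
        }
    ]


def write_config(path: str, version: str = "0.148") -> None:
    """Write the override to a local JSON file for use with file://<path>."""
    with open(path, "w") as fh:
        json.dump(presto_version_override(version), fh, indent=2)
        fh.write("\n")


if __name__ == "__main__":
    write_config("clusterConfiguration.json")
```

Running it writes clusterConfiguration.json to the current directory, which can then be passed as file://./clusterConfiguration.json.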

0
votes

Presto on EMR uses Hive as its catalog by default. Try starting the presto-cli with the catalog set explicitly, for example: presto-cli --catalog hive

Then you should be able to access all the tables. In Tableau you might also have to set hive as the catalog.

0
votes

I also have the same problem.

Using a REST client, I changed the submitted query to be only the SQL statement, without the JSON envelope, and it worked. It doesn't resolve the problem, but it gives a hint toward a solution.
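For anyone who wants to repeat that experiment, here is a rough Python sketch of the request, assuming Presto's HTTP protocol of POSTing the bare SQL text to /v1/statement with X-Presto-* headers. The host is a placeholder, the user name is an assumption, and the catalog, schema, and port are taken from the question. The sketch only builds the request; sending it is left commented out.

```python
import urllib.request


def build_statement_request(sql, host, port=8889, user="hadoop",
                            catalog="hive", schema="poc"):
    """Build (but do not send) a POST to /v1/statement with bare SQL as the body."""
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/statement",
        data=sql.encode("utf-8"),  # plain SQL text, no JSON envelope
        headers={
            "X-Presto-User": user,        # assumed user name
            "X-Presto-Catalog": catalog,  # catalog from the question
            "X-Presto-Schema": schema,    # schema from the question
        },
        method="POST",
    )


# "localhost" is a placeholder; substitute the EMR master node's address.
req = build_statement_request("show tables", host="localhost")
print(req.get_method(), req.full_url)
print(req.data)

# To actually submit the query (requires a reachable Presto coordinator):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The response is JSON; as far as I understand the protocol, the client then follows the nextUri field until the query finishes.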