I am running a PySpark application on AWS EMR that is configured to use the AWS Glue Data Catalog as its metastore. I have a table set up in AWS Glue that points to a DynamoDB table, and I am trying to access that Glue table from my PySpark script. I can run show tables and see the Glue table, but when I try to query it I get the exception below:
pyspark.sql.utils.AnalysisException: u'java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: arn:aws:dynamodb:<region>:<acct_id>:table/DDBTABLE;'
The query in my PySpark script:
spark.sql("select * from ddbtable").show()
I couldn't find a good reference on this. I see people attributing similar errors to spark.sql.warehouse.dir, but I am not sure how that relates to the Glue Data Catalog. Any input?
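
For what it's worth, my reading of the message (an assumption on my part, not something I have confirmed in the Spark or Glue source) is that the table's location, which is the DynamoDB ARN, is being parsed as a URI: "arn" is taken as the scheme and the rest as a path that does not start with "/". Below is a minimal sketch in plain Python that mimics that check. The helper name, the region, the account ID and the S3 location are placeholders I made up for illustration.

```python
from urllib.parse import urlsplit


def is_relative_path_in_absolute_uri(location):
    """Hypothetical helper: True when the string has a scheme but its
    path part does not start with '/', which is the condition named
    in the exception message."""
    parts = urlsplit(location)
    return bool(parts.scheme) and bool(parts.path) and not parts.path.startswith("/")


# Placeholder values, not my real region, account or bucket.
ddb_arn = "arn:aws:dynamodb:us-east-1:123456789012:table/DDBTABLE"
s3_location = "s3://example-bucket/warehouse/ddbtable"

print(urlsplit(ddb_arn).scheme, urlsplit(ddb_arn).path)
print(is_relative_path_in_absolute_uri(ddb_arn))
print(is_relative_path_in_absolute_uri(s3_location))
```

If that reading is right, the error would come from the location stored in the Glue table definition rather than from my query, but I would appreciate confirmation.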