
I have configured the S3 keys (access key and secret key) in a JCEKS file using the hadoop credential API. The commands used are below:

hadoop credential create fs.s3a.access.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks

hadoop credential create fs.s3a.secret.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks
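
As a quick sanity check, you can list the aliases stored in the provider (same host/path as above):

hadoop credential list -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks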

Then I open a connection to the Spark Thrift Server using beeline, passing the JCEKS file path in the connection string as below:

beeline -u "jdbc:hive2://hostname:10001/;principal=hive/_HOST@?hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks"
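
If the session-variable form is not picked up, an alternative sketch is to pass the property with beeline's --hiveconf flag (whether the Thrift Server forwards it to the S3A filesystem depends on your deployment):

beeline -u "jdbc:hive2://hostname:10001/;principal=hive/_HOST@" \
  --hiveconf hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks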

Now, when I try to create an external table with its location in S3, it fails with the exception below:

CREATE EXTERNAL TABLE IF NOT EXISTS test_table_on_s3 (col1 String, col2 String) row format delimited fields terminated by ',' LOCATION 's3a://bucket_name/kalmesh/';

Exception: Error: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://bucket_name/kalmesh: getFileStatus on s3a://bucket_name/kalmesh: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: request_id), S3 Extended Request ID: extended_request_id=) (state=,code=0)
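
Before involving Hive at all, it can help to test whether the S3A filesystem can read the location with those credentials; hadoop fs accepts generic -D options, so the provider path can be set per command (bucket and path mirror the error message):

hadoop fs \
  -D hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks \
  -ls s3a://bucket_name/kalmesh/

If this also returns 403 Forbidden, the problem lies with the credentials or bucket policy rather than with Hive/Spark.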


2 Answers


I don't think JCEKS support for the fs.s3a.* secrets went in until Hadoop 2.8; it's hard to tell from the source. If that is the case and you are using Hadoop 2.7, then the secret isn't going to be picked up. I'm afraid you will have to put it in the config.
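
For reference, a minimal sketch of what "putting it in the config" could look like when launching the Spark Thrift Server; Spark forwards spark.hadoop.*-prefixed properties into the Hadoop configuration, and the key values here are placeholders:

./sbin/start-thriftserver.sh \
  --conf spark.hadoop.fs.s3a.access.key=YOUR_ACCESS_KEY \
  --conf spark.hadoop.fs.s3a.secret.key=YOUR_SECRET_KEY

Note that secrets passed on the command line are visible in process listings; putting them in core-site.xml (or spark-defaults.conf) avoids that.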


I had a similar situation, just with Drill instead of Hive. But as in your case, I was:

  • using Hadoop 2.9 jars (1st version to support AWS KMS)
  • writing to s3a://
  • encrypting with SSE-KMS

... and got AmazonS3Exception: Access Denied.
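
For context, SSE-KMS on s3a:// in Hadoop 2.9 is controlled by two properties, shown here as generic options on a test write (the key ARN is a placeholder):

hadoop fs \
  -D fs.s3a.server-side-encryption-algorithm=SSE-KMS \
  -D fs.s3a.server-side-encryption.key=arn:aws:kms:us-east-1:111122223333:key/your-key-id \
  -put localfile.csv s3a://bucket_name/kalmesh/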

In my case (perhaps in yours as well) the exception description was a bit ambiguous. The reported AmazonS3Exception: Access Denied did not originate from S3 but from KMS! Access was denied to the key I used for encryption: the user making the API calls was not on the key's users list. Once I added that user to the key's list, writing started to work and I could create encrypted tables on s3a://...
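
To check this from the CLI, a sketch (key id and principal ARN are placeholders; "key users" in the AWS console correspond to statements in the key policy, and a grant is one way to authorize a principal without editing that policy):

# Inspect the current key policy
aws kms get-key-policy \
  --key-id your-key-id \
  --policy-name default \
  --output text

# Authorize a principal to use the key for SSE-KMS reads and writes
aws kms create-grant \
  --key-id your-key-id \
  --grantee-principal arn:aws:iam::111122223333:user/your-user \
  --operations Encrypt Decrypt GenerateDataKey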