
I have a bucket on AWS S3 that enforces that all objects are KMS-encrypted. I'm running Presto on emr-5.2.1.

I have an external table on S3 (no data). When I run

INSERT INTO hive.s3.new_table
SELECT * FROM src_table 

I get an AccessDenied error. I have tested a few different options and reached out to support, but without luck. If I remove the policy from the bucket, Presto works just fine, but the files created on S3 are not encrypted.

Presto has no problem reading encrypted external S3 tables or creating tables locally on HDFS. I cannot allow unencrypted data.

Policy example:

{
   "Version":"2012-10-17",
   "Id":"PutObjPolicy",
   "Statement":[{
         "Sid":"DenyUnEncryptedObjectUploads",
         "Effect":"Deny",
         "Principal":"*",
         "Action":"s3:PutObject",
         "Resource":"arn:aws:s3:::YourBucket/*",
         "Condition":{
            "StringNotEquals":{
               "s3:x-amz-server-side-encryption":"aws:kms"
            }
         }
      }
   ]
}
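
As a local sanity check (this is a simulation, not AWS code), here is a small Python sketch of how the StringNotEquals condition in the Deny statement evaluates a PutObject request's x-amz-server-side-encryption header. Note that under IAM semantics a missing header also makes StringNotEquals match, so unencrypted uploads are denied:

```python
# Simplified local simulation of the bucket policy's Deny condition:
# PutObject is denied unless the x-amz-server-side-encryption header
# equals "aws:kms", mirroring the StringNotEquals condition above.
REQUIRED_SSE = "aws:kms"

def is_put_denied(request_headers):
    """Return True if the DenyUnEncryptedObjectUploads statement matches."""
    sse = request_headers.get("x-amz-server-side-encryption")
    # StringNotEquals matches when the header is absent OR has a different value
    return sse != REQUIRED_SSE

print(is_put_denied({}))                                          # True: no SSE header
print(is_put_denied({"x-amz-server-side-encryption": "AES256"}))  # True: wrong algorithm
print(is_put_denied({"x-amz-server-side-encryption": "aws:kms"})) # False: allowed through
```

This is consistent with the observed behavior: Presto's PUT requests are rejected because they do not carry the aws:kms header, while reads of already-encrypted objects succeed.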

http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html

Presto config /etc/presto/conf/catalog/hive.properties

hive.s3.ssl.enabled=true
hive.s3.use-instance-credentials=true
hive.s3.sse.enabled=true
hive.s3.kms-key-id=long_key_id_here

...

Error:
com.facebook.presto.spi.PrestoException: Error committing write to Hive
    at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:132)
    at com.facebook.presto.hive.HiveWriter.commit(HiveWriter.java:49)
    at com.facebook.presto.hive.HivePageSink.doFinish(HivePageSink.java:152)
    at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
    at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:76)
    at com.facebook.presto.hive.HivePageSink.finish(HivePageSink.java:144)
    at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:49)
    at com.facebook.presto.operator.TableWriterOperator.finish(TableWriterOperator.java:156)
    at com.facebook.presto.operator.Driver.processInternal(Driver.java:394)
    at com.facebook.presto.operator.Driver.processFor(Driver.java:301)
    at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:622)
    at com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:534)
    at com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:670)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxx), S3 Extended Request ID: xxxxxxxxxxxxxx+xxx=
    at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.uploadObject(PrestoS3FileSystem.java:1003)
    at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.close(PrestoS3FileSystem.java:967)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2429)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
    at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:129)
    ... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxxx)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1387)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:940)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4039)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1583)
    at com.amazonaws.services.s3.AmazonS3EncryptionClient.access$101(AmazonS3EncryptionClient.java:80)
    at com.amazonaws.services.s3.AmazonS3EncryptionClient$S3DirectImpl.putObject(AmazonS3EncryptionClient.java:603)
    at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectUsingMetadata(S3CryptoModuleBase.java:175)
    at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectSecurely(S3CryptoModuleBase.java:161)
    at com.amazonaws.services.s3.internal.crypto.CryptoModuleDispatcher.putObjectSecurely(CryptoModuleDispatcher.java:108)
    at com.amazonaws.services.s3.AmazonS3EncryptionClient.putObject(AmazonS3EncryptionClient.java:483)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more

Am I missing something in the configuration, or does Presto not use KMS when inserting into tables?

According to Amazon: "All GET and PUT requests for an object protected by AWS KMS will fail if they are not made via SSL or by using SigV4."

Can you please share the CREATE TABLE command for this table? I am specifically looking for the S3 location. Are you using s3, s3a, or s3n? – Chirag
I got a reply from support: "Can I use SSE-KMS with Presto? Unfortunately, no. Presto currently supports either SSE-S3 (AES256) or client-side encryption (CSE-KMS). EMR does support SSE-KMS for all the applications which use the EMRFS filesystem, like Hive, Spark, MR, and so on. Unfortunately, Presto uses PrestoFileSystem. That is the reason any changes/improvements need to be added directly to Presto." – KonradK
AWS support submitted the ticket for this: link – KonradK

1 Answer


Presto now supports SSE-KMS via the hive.s3.sse.kms-key-id Hive connector configuration property.
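
A minimal hive.properties sketch combining this property with the SSE switch from the question (the key ID below is a placeholder; hive.s3.sse.type selects KMS over the default S3-managed keys):

```properties
hive.s3.sse.enabled=true
hive.s3.sse.type=KMS
hive.s3.sse.kms-key-id=long_key_id_here
```

With this in place, Presto's PUT requests carry the aws:kms server-side-encryption header, so they should pass the DenyUnEncryptedObjectUploads bucket policy shown in the question.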