I have a bucket on AWS S3 that enforces KMS encryption for all objects. I'm running Presto on emr-5.2.1 and have an external table on S3 (no data). When I run
INSERT INTO hive.s3.new_table
SELECT * FROM src_table
I get an AccessDenied error. I tested a few different options and reached out to support, but without luck. If I remove the policy from the bucket, Presto works just fine, but the files it creates on S3 are not encrypted.
Presto has no problem reading encrypted external S3 tables or creating them locally on HDFS, but I cannot allow unencrypted data.
Policy example:
{
  "Version": "2012-10-17",
  "Id": "PutObjPolicy",
  "Statement": [
    {
      "Sid": "DenyUnEncryptedObjectUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::YourBucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}
http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html
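For what it's worth, the deny condition can be sketched as follows. This is only an illustration of the StringNotEquals semantics, not the real AWS policy evaluator; the point is that a PUT request with no x-amz-server-side-encryption header at all also matches the Deny statement:

```python
# Illustration only: how the bucket policy's StringNotEquals condition
# treats the headers of a PutObject request. Not the real AWS evaluator.
def policy_denies(headers):
    # StringNotEquals is true when the key is absent OR has a different value,
    # so the Deny statement matches any request not carrying "aws:kms".
    return headers.get("x-amz-server-side-encryption") != "aws:kms"

print(policy_denies({}))                                           # True  (denied)
print(policy_denies({"x-amz-server-side-encryption": "AES256"}))   # True  (denied)
print(policy_denies({"x-amz-server-side-encryption": "aws:kms"}))  # False (allowed)
```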
Presto config (/etc/presto/conf/catalog/hive.properties):
hive.s3.ssl.enabled=true
hive.s3.use-instance-credentials=true
hive.s3.sse.enabled=true
hive.s3.kms-key-id=long_key_id_here
...
Error:
com.facebook.presto.spi.PrestoException: Error committing write to Hive
at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:132)
at com.facebook.presto.hive.HiveWriter.commit(HiveWriter.java:49)
at com.facebook.presto.hive.HivePageSink.doFinish(HivePageSink.java:152)
at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:76)
at com.facebook.presto.hive.HivePageSink.finish(HivePageSink.java:144)
at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:49)
at com.facebook.presto.operator.TableWriterOperator.finish(TableWriterOperator.java:156)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:394)
at com.facebook.presto.operator.Driver.processFor(Driver.java:301)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:622)
at com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:534)
at com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:670)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxx), S3 Extended Request ID: xxxxxxxxxxxxxx+xxx=
at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.uploadObject(PrestoS3FileSystem.java:1003)
at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.close(PrestoS3FileSystem.java:967)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2429)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:129)
... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxxx)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1387)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:940)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4039)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1583)
at com.amazonaws.services.s3.AmazonS3EncryptionClient.access$101(AmazonS3EncryptionClient.java:80)
at com.amazonaws.services.s3.AmazonS3EncryptionClient$S3DirectImpl.putObject(AmazonS3EncryptionClient.java:603)
at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectUsingMetadata(S3CryptoModuleBase.java:175)
at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectSecurely(S3CryptoModuleBase.java:161)
at com.amazonaws.services.s3.internal.crypto.CryptoModuleDispatcher.putObjectSecurely(CryptoModuleDispatcher.java:108)
at com.amazonaws.services.s3.AmazonS3EncryptionClient.putObject(AmazonS3EncryptionClient.java:483)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Am I missing something in the configuration, or does Presto simply not use KMS when inserting into tables?
According to Amazon: "All GET and PUT requests for an object protected by AWS KMS will fail if they are not made via SSL or by using SigV4."
Can I use SSE-KMS with Presto? Unfortunately, no. Presto currently supports either SSE-S3 (AES256) or client-side encryption (CSE-KMS). EMR does support SSE-KMS for all the applications that use the EMRFS filesystem, such as Hive, Spark, and MapReduce. Unfortunately, Presto uses its own PrestoS3FileSystem, which is why any changes/improvements need to be added directly to Presto.
– KonradK
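To make the two supported modes concrete, here is a sketch of the corresponding hive.properties settings (the key ID is a placeholder). Note that with CSE-KMS the object is encrypted on the client before upload, which is consistent with the AmazonS3EncryptionClient frames in the stack trace above, so the PUT request never carries the x-amz-server-side-encryption header and a bucket policy like the one in the question will still deny it:

```
# SSE-S3 (AES256): server-side encryption; the request includes an
# x-amz-server-side-encryption header, but with the value AES256, not aws:kms.
hive.s3.sse.enabled=true

# CSE-KMS: client-side encryption via AmazonS3EncryptionClient; no
# x-amz-server-side-encryption header is sent, so the
# DenyUnEncryptedObjectUploads statement still rejects the upload.
hive.s3.kms-key-id=<your-kms-key-id>
```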