
I'm trying to connect Apache Drill to AWS S3 without specifying my access key and secret key in the config, so I added

"fs.s3a.aws.credentials.provider": "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"

to the config, hoping it would pick up the credentials from the default credentials profile file on my PC and from the IAM role once I deploy it.

When I specify the access key and secret key in the config, the connection works just fine, but after I changed the config to use DefaultAWSCredentialsProviderChain, it stopped working.

Drill shows this error when I try to use the S3 storage plugin:

Error: SYSTEM ERROR: AmazonClientException: Unable to load AWS credentials from any provider in the chain

I can write to S3 using the DefaultAWSCredentialsProviderChain with org.apache.parquet.hadoop.ParquetWriter, and I can read the S3 bucket using awscli without any problem.

Here is my storage plugin config:

{
  "type": "file",
  "connection": "s3a://my-bucket",
  "config": {
    "fs.s3a.endpoint": "s3.REGION.amazonaws.com",
    "fs.s3a.aws.credentials.provider": "com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
  },
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    }
  },
  "enabled": true
}

1 Answer


Apache Drill does not support ~/.aws/credentials, but it does support the Hadoop CredentialProvider API. To use it, you need to create an external credential provider and set the "hadoop.security.credential.provider.path" property (pointing to that provider) in the "config" section of Drill's S3 storage plugin, as sketched below.
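For reference, here is a minimal sketch of that approach. It assumes a JCEKS keystore at /opt/drill/conf/s3.jceks; the path is just an example, not anything Drill requires. First store the keys with the hadoop credential CLI:

# store both S3 keys in a local JCEKS keystore (example path)
hadoop credential create fs.s3a.access.key -value YOUR_ACCESS_KEY -provider jceks://file/opt/drill/conf/s3.jceks
hadoop credential create fs.s3a.secret.key -value YOUR_SECRET_KEY -provider jceks://file/opt/drill/conf/s3.jceks

Then point the storage plugin at the keystore in its "config" section:

"config": {
  "fs.s3a.endpoint": "s3.REGION.amazonaws.com",
  "hadoop.security.credential.provider.path": "jceks://file/opt/drill/conf/s3.jceks"
}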

Alternatively, you may store your credentials in Drill's core-site.xml.
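If you go that route, a minimal sketch of what core-site.xml (in Drill's conf directory) could contain, with placeholder values to fill in:

<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>

Note that this stores the keys in plain text on disk, which is why the CredentialProvider approach above is generally preferable.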