I am new to Apache Drill and am trying to understand how it works.
I am trying to query my JSON data stored on S3 using Apache Drill.
My bucket was created in the US East (N. Virginia) region.
I have created a new Storage Plugin for S3 using this link.
Here is the configuration for my new S3 storage plugin:
{
  "type": "file",
  "enabled": true,
  "connection": "s3a://testing-drill/",
  "config": {
    "fs.s3a.access.key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "fs.s3a.secret.key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "avro": {
      "type": "avro"
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    }
  }
}
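To rule out a malformed plugin definition, the JSON above can be checked locally before it is pasted into the Drill web UI. The following is only a minimal sketch: `check_s3_plugin` is a hypothetical helper, not part of Drill, and it checks only the fields shown in the configuration above (the `s3a` scheme, the bucket name, the two credential keys, and the `enabled` flag). It does not contact S3.

```python
import json
from urllib.parse import urlparse


def check_s3_plugin(plugin_json):
    """Return a list of problems found in a Drill file-plugin definition.

    Hypothetical helper for illustration; it inspects only the fields
    that appear in the plugin configuration shown in this question.
    """
    problems = []
    plugin = json.loads(plugin_json)  # raises ValueError if the JSON is malformed

    # The connection should look like s3a://<bucket>/
    conn = urlparse(plugin.get("connection", ""))
    if conn.scheme != "s3a":
        problems.append("connection scheme is %r, expected 's3a'" % conn.scheme)
    if not conn.netloc:
        problems.append("connection has no bucket name")

    # Both credential entries should be present and non-empty.
    config = plugin.get("config") or {}
    for key in ("fs.s3a.access.key", "fs.s3a.secret.key"):
        if not config.get(key):
            problems.append("missing or empty config entry " + key)

    if not plugin.get("enabled", False):
        problems.append("plugin is not enabled")
    return problems


# Trimmed copy of the plugin definition, with placeholder credentials.
sample = """
{
  "type": "file",
  "enabled": true,
  "connection": "s3a://testing-drill/",
  "config": {
    "fs.s3a.access.key": "xxxx",
    "fs.s3a.secret.key": "xxxx"
  }
}
"""

print(check_s3_plugin(sample))
```

An empty list means the definition passed these basic checks; it says nothing about whether the credentials or the endpoint are accepted by S3.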
I have also configured my core-site-example.xml as follows:
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>xxxxxxxxxxxxxxxxxxxx</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>xxxxxxxxxxxxxxxxxxxxxxxx</value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.us-east-1.amazonaws.com</value>
  </property>
</configuration>
But when I try to set the workspace using the following command:
USE shiv.`root`;
I get the following error:
Error: VALIDATION ERROR: Schema [shiv.root] is not valid with respect to either root schema or current default schema.
Current default schema: No default schema selected
[Error Id: 6d9515c0-b90f-48aa-9dc5-0c660f1c06ca on ip-10-0-3-241.ec2.internal:31010] (state=,code=0)
If I try to execute show schemas; I get the following error:
show schemas;
Error: SYSTEM ERROR: AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: EEB438A6A0A5E667, AWS Error Code: null, AWS Error Message: Bad Request
Fragment 0:0
[Error Id: 85883537-9b4f-4057-9c90-cdaedec116a8 on ip-10-0-3-241.ec2.internal:31010] (state=,code=0)
I am not able to understand the root cause of this issue.
Bad Request. Maybe step through the setup again? - tobi6