I am trying to query my HDFS file system from Apache Drill. I can successfully query Hive tables and CSV files, but part files are not working.
hadoop fs -cat BANK_FINAL/2015-11-02/part-r-00000 | head -1
gives the following output:
028|S80306432|2015-11-02|BRN-CLG-CHQ PAID TO SILVER ROCK BANDRA CO-OP|485|ZONE SERIAL [ 485]|L|I|MAHARASHTRA STATE CO-OP BANK LTD|3320.0|INWARD CLG|D11528|SBPRM
select * from dfs.`/user/ituser1/e.csv` limit 10
works fine and returns results.
But when I try the query
select * from dfs.`/user/ituser1/BANK_FINAL/2015-11-02/part-r-00000` limit 10
it gives this error:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Table 'dfs./user/ituser1/BANK_FINAL/2015-11-02/part-r-00000' not found [Error Id: 6f80392a-51af-4b61-94d8-335b33b0048c on genome-dev13.axs:31010]
My Apache Drill dfs storage plugin JSON is as follows:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://10.9.1.33:8020/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": true,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "psv"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    },
    "avro": {
      "type": "avro"
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    }
  }
}