I have an S3 bucket that contains CSV files (see 'Data sample' below).
The files were created as a result of an Athena query.
I want to treat the 'time' column as decimal(17,7) and the 'size' column as bigint.
I defined a table (see 'Table definition' below), and when I run the query (see 'Query' below) I get the error shown under 'Error'.
My questions are:
Is there a way to control the Athena query output format so that it produces a CSV without quotes around the fields? (i.e. 1545980216.9581780,1349 instead of "1545980216.9581780","1349")
Assuming the answer to the previous question is no: how can I query the data in the table `del_me_later_4` and treat the fields as decimal and bigint?
Data sample
"time","size"
"1545751457.8957720","432"
"1545751458.0753407","583"
"1545751458.1407920","583"
"1545751458.1683733","560"
Table definition
CREATE EXTERNAL TABLE `del_me_later_4`(
`time` string,
`size` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://del-me-later/'
TBLPROPERTIES (
'skip.header.line.count'='1'
)
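(For context on why the cast fails: the table above uses ROW FORMAT DELIMITED, i.e. LazySimpleSerDe, which does not understand quoting, so the surrounding double quotes become part of each field value. A possible alternative, not part of my original setup, would be to define the table with OpenCSVSerde, which strips the quotes at read time. A sketch, assuming the same bucket location:)

```sql
CREATE EXTERNAL TABLE `del_me_later_4`(
  `time` string,
  `size` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',',
  'quoteChar' = '"')
LOCATION
  's3://del-me-later/'
TBLPROPERTIES (
  'skip.header.line.count'='1'
)
```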
Query
SELECT cast(time AS decimal(17,7)) as time,
cast(size AS bigint) as size
FROM "del_me_later_4"
Error
INVALID_CAST_ARGUMENT: Cannot cast VARCHAR '"1545751457.8957720"' to DECIMAL(17, 7)
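(The error message shows the literal double quotes are part of the string values. One workaround I am considering, without changing the table definition, is to strip the quotes in the query itself before casting; this is a sketch, not a confirmed solution:)

```sql
SELECT cast(replace(time, '"', '') AS decimal(17,7)) as time,
       cast(replace(size, '"', '') AS bigint) as size
FROM "del_me_later_4"
```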