I tried to create external table from s3 parquet files(created with spark saved to s3) in Athena. Whenever the data type is decimal or timestamp, no data shows up after
select "column" from athena_table
The code I used to create external table is as follows
CREATE EXTERNAL TABLE IF NOT EXISTS tablename (
`date` timestamp,
`number` decimal(38, 0)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
LOCATION 's3 path'
TBLPROPERTIES (
'has_encrypted_data'='true'
);
What I have tried:
- Replace decimal(38, 0) in above code with double, doesn't work, nothing shows up in that column
- Replace decimal(38, 0) or timestamp in above code with string, doesn't work, garbled character shows up
- Update the data type in my s3 parquet file in above two formats (double or string) and resave the file, doesn't work, error
HIVE_BAD_DATA: Field 'number' type FIXED_LEN_BYTE_ARRAY in parquet is incompatible with type double defined in table schema`