I have several avro files stored in an S3 bucket, and have them crawled by AWS Glue using AvroSerde, glue indexes files perfectly.
When I use AWS athena to query the table like this: select * from mytable
I get all results as expected and the "values"column is displayed as [23.4,345.6] for example.
However when I try to select the "values" column (or try to use any array function on it) I get no results and the query tab in Athena shows a "FAiled" icon, however I get no error whatsoever.
Example query that fails: select values from mytable
The output in Athena of describe mytable;
timestamp bigint from deserializer
values array<float> from deserializer
account string
year string
month string
day string
hour string
device string
type string
# Partition Information
# col_name data_type comment
account string
year string
month string
day string
hour string
device string
type string
The table definition in AWS Glue:
Name mytable
Database new_formats_test
Classification avro
Location s3://mytable/
Input format org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
Output format org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serde serialization lib org.apache.hadoop.hive.serde2.avro.AvroSerDe
Serde parameters
avro.schema.literal {"type":"record","name":"ArchiveDataEntry","namespace":"com.epihunter.data.archive","fields":[{"name":"timestamp","type":"long"},{"name":"values","type":{"type":"array","items":"float"}}]}serialization.format 1
Table properties
sizeKey 18525162objectCount 1UPDATED_BY_CRAWLER Scan S3 new formats
avro.schema.literal {"type":"record","name":"ArchiveDataEntry","namespace":"com.epihunter.data.archive","fields":[{"name":"timestamp","type":"long"},{"name":"values","type":{"type":"array","items":"float"}}]}