I am using Spark to write files to S3 in ORC format, and Athena to query this data.
I am using the following partition keys:
s3://bucket/company=1123/date=20190207
Once I run the Glue crawler on the bucket, everything works as expected except the types of the partition keys.
The crawler configures them in the catalog as String type instead of int.
Is there a configuration to define the default type of the partition keys?
I know the types can be changed manually later, with the crawler config set to "Add new columns only".
– Alex Stanovsky
The "Add new columns only" option doesn't work well when the schema is subject to change once in a while, as it's so easy to forget this particular crawler config value... – ChernikovP
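For reference, the manual workaround mentioned above (retype the partition keys, then keep the crawler on "Add new columns only") could be scripted so the setting is not forgotten. This is only a rough sketch, not a crawler option for default partition types: the helper names and the `apply_to_catalog` arguments are hypothetical, it assumes boto3 is installed with Glue permissions, and it assumes the console's "Add new columns only" corresponds to the `MergeNewColumns` value in the crawler configuration JSON.

```python
import copy
import json

# Keys accepted by Glue's TableInput structure. get_table also returns
# read-only fields (CreateTime, DatabaseName, ...) that update_table rejects.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def retype_partition_keys(table, new_types):
    """Build a TableInput dict from a get_table result, retyping partition keys.

    new_types maps partition key name -> Glue type, e.g. {"company": "int"}.
    The input dict is left unmodified.
    """
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    for partition_key in table_input.get("PartitionKeys", []):
        if partition_key["Name"] in new_types:
            partition_key["Type"] = new_types[partition_key["Name"]]
    return table_input


def add_new_columns_only_config():
    """Crawler configuration JSON assumed to match "Add new columns only"."""
    return json.dumps(
        {
            "Version": 1.0,
            "CrawlerOutput": {
                "Tables": {"AddOrUpdateBehavior": "MergeNewColumns"}
            },
        }
    )


def apply_to_catalog(database, table_name, crawler_name, new_types):
    """Hypothetical wrapper: retype the keys and pin the crawler setting."""
    import boto3  # imported lazily so the helpers above work without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=retype_partition_keys(table, new_types),
    )
    glue.update_crawler(
        Name=crawler_name,
        Configuration=add_new_columns_only_config(),
    )
```

Something like `apply_to_catalog("my_db", "my_table", "my_crawler", {"company": "int", "date": "int"})` (all names made up) would run it once after the first crawl. Whether the crawler leaves the retyped keys alone on later runs is worth checking on a test table first.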