
I've set up an AWS Glue crawler to index a set of bucketed CSV files in S3 (which then creates an Athena DB).

My timestamps are in the "Java" format described in the documentation, for example:

2019-03-07 14:07:17.651795

I've tried creating a custom classifier (and a new crawler) yet this column keeps being detected as a "string" and not a "timestamp".

I'm at a loss as to why Athena / Glue won't detect this as a timestamp.

How did you solve it? (Hugo)

1 Answer


I think the problem may be the fractional seconds in the timestamp. I found this StackOverflow answer that lists the patterns Glue recognizes as timestamps, but I haven't been able to find where those patterns come from; as far as I can tell they are not in the Glue docs.

You might have better luck using a custom classifier to make it understand your timestamp format.

I don't know how much that will help, since you also have to convince Athena to parse your timestamps. You might be better off letting Glue classify the column as a string and creating a view that uses DATE_PARSE to convert the strings to timestamps.
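
Here is a rough sketch of the view approach, in case it helps. It only builds the SQL text and checks the format locally; I have not run it against Athena. The table, column, and view names (`events_raw`, `event_time`, `events`) are made up for illustration, and the format string assumes Athena's DATE_PARSE follows the MySQL-style specifiers that Presto uses (`%i` for minutes, `%s` for seconds, `%f` for the fraction).

```python
from datetime import datetime

# Sample value taken from the question.
SAMPLE = "2019-03-07 14:07:17.651795"

# Python strptime layout for the same value, used only as a local sanity check.
PY_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Assumed DATE_PARSE layout (MySQL-style specifiers as used by Presto).
ATHENA_FORMAT = "%Y-%m-%d %H:%i:%s.%f"


def build_view_sql(table, column, view):
    """Build a CREATE VIEW statement that adds a parsed timestamp column.

    All three names are hypothetical placeholders; the string column that
    Glue detected is kept, and a new "<column>_ts" column is added next to it.
    """
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT t.*, date_parse(t.{column}, '{ATHENA_FORMAT}') AS {column}_ts\n"
        f"FROM {table} AS t"
    )


if __name__ == "__main__":
    # Check locally that the sample really matches the expected layout.
    print(datetime.strptime(SAMPLE, PY_FORMAT))
    # Print the statement to paste into the Athena query editor.
    print(build_view_sql("events_raw", "event_time", "events"))
```

You would then query the view instead of the crawled table, and leave the Glue schema alone so a later crawler run does not undo anything.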