I have been playing around with AWS Glue for some quick analytics by following the tutorial here.
While I have been able to successfully create crawlers and discover data in Athena, I've had issues with the data types created by the crawler. The date and timestamp data types get read as string data types.
I followed this up by creating an ETL job in Glue, using the data source created by the crawler as the input and a target table in Amazon S3.
As part of the mapping transformation, I changed the data type of the date and timestamp columns from string to timestamp, but unfortunately the ETL job converted the values in these columns to NULL. I have contemplated using classifiers with Grok expressions, but then decided to transform them as part of the ETL job in Glue.
The timestamp format looks like this: 1/08/2010 6:15:00 PM
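
To make the format concrete, here is a minimal sketch in plain Python (not Glue code) of how I read the sample value. It assumes the value is month/day/year with a 12-hour clock, which I cannot confirm from one sample; `ASSUMED_FORMAT` and `parse_sample_timestamp` are hypothetical names used only for illustration. My guess is that the NULLs come from the value not matching the layout the conversion expects, which the second call below imitates.

```python
from datetime import datetime

# Assumption: month/day/year with a 12-hour clock and an AM/PM marker.
# If the data is actually day/month/year, swap %m and %d.
ASSUMED_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_sample_timestamp(raw, fmt=ASSUMED_FORMAT):
    """Return a datetime for `raw`, or None when it does not match `fmt`."""
    try:
        return datetime.strptime(raw, fmt)
    except ValueError:
        # Mirror the behaviour seen in the job: unparseable input becomes a null.
        return None


sample = "1/08/2010 6:15:00 PM"

# Parsing with an explicit pattern that matches the sample.
parsed = parse_sample_timestamp(sample)

# Parsing with an ISO-style pattern, which the sample does not match.
iso_attempt = parse_sample_timestamp(sample, "%Y-%m-%d %H:%M:%S")

print(parsed, iso_attempt)
```

If this reading is right, the conversion inside the ETL job would probably need an explicit pattern as well, for example something like `M/dd/yyyy h:mm:ss a` passed to Spark's `to_timestamp`, rather than a plain string-to-timestamp cast. That pattern carries the same month-first assumption.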