We have a folder of .csv and .ctl files. The CSVs are daily files, five per day, covering a period of time. Their naming convention is a string prefix followed by a date identifier (e.g. ABCDE090619.csv). The header row of each of the five daily files is consistent over time.
We expected the Glue crawler to recognize the five table schemas and add each day's data to the corresponding table. Instead, the crawler creates an individual schema for every single file, roughly 550 in total.
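To make the expected grouping concrete, here is a minimal sketch of how we would expect the files to collapse into five tables by prefix. It assumes the prefix is purely alphabetic and the date identifier is exactly six digits, as in the example name above; the function name `group_by_prefix` and the pattern are our own illustration and have nothing to do with how the crawler itself works.

```python
import re
from collections import defaultdict

# Assumption: alphabetic prefix followed by a six-digit date identifier,
# e.g. ABCDE090619.csv. Real file names may need a different pattern.
FILENAME_PATTERN = re.compile(r"^([A-Za-z]+)(\d{6})\.csv$")


def group_by_prefix(filenames):
    """Map each file prefix (one per expected table) to its date identifiers."""
    groups = defaultdict(list)
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            # Skip .ctl files and anything that does not fit the pattern.
            continue
        prefix, date_id = match.groups()
        groups[prefix].append(date_id)
    return dict(groups)
```

Running this over a listing of the folder should give five keys, one per daily file type, each with one date identifier per day.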
Is there any mechanism that could be driving this behaviour? We have considered the naming convention, but according to the Glue docs, only the file schema should matter.
Thank you.