I get the following error when running a Glue job over partitioned Parquet files: Unable to infer a schema for Parquet. It must be specified manually
I have set up my crawler and successfully obtained the schema for my Parquet files. I can view the data in Athena. I have created the schema manually in my target Redshift.
I can load the files via Glue into Redshift if all my data is in a single folder. But when I point the job at a folder that has nested folders (e.g. folder X contains subfolders 04 and 05), the Glue job fails with the message: Unable to infer a schema for Parquet. It must be specified manually
This is strange, because it works if I put all of these files into the same folder.
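
To make the layout concrete, here is a minimal sketch in plain Python (not Glue code) that recreates the folder structure locally with empty placeholder files and compares a top-level listing with a recursive one. The file names part-04.parquet and part-05.parquet and the helpers build_layout and list_parquet are made up for illustration. Whether the Glue reader only looks at the top level by default is an assumption, not something confirmed here; the sketch only shows that a non-recursive listing of X finds no Parquet files at all, which would be consistent with the error.

```python
import tempfile
from pathlib import Path
from typing import List


def build_layout(root: Path) -> None:
    """Create the hypothetical layout X/04 and X/05 with placeholder files."""
    for sub in ("04", "05"):
        folder = root / "X" / sub
        folder.mkdir(parents=True)
        # Empty placeholder; a real job would have actual Parquet data here.
        (folder / f"part-{sub}.parquet").touch()


def list_parquet(folder: Path, recurse: bool) -> List[str]:
    """Return Parquet file paths under folder, relative to it, sorted."""
    pattern = "**/*.parquet" if recurse else "*.parquet"
    return sorted(p.relative_to(folder).as_posix() for p in folder.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_layout(root)
    # Listing only the top level of X sees the subfolders but no files.
    top_level = list_parquet(root / "X", recurse=False)
    # Listing recursively finds the files inside 04 and 05.
    nested = list_parquet(root / "X", recurse=True)
    print(top_level)
    print(nested)
```

If the job reads straight from S3 rather than through the catalog table, it may be worth checking whether the read is configured to descend into subfolders.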