I have Parquet files in S3 that were created by different sources. They all have the same schema. One set was created using Athena CTAS; the other was created using AWS Glue/Spark.
The files created by Glue look like:
The Athena CTAS ones look like:
I tried copying the files from the missing partitions into a separate folder and running a Glue crawler on it, and the crawler detects them there. But it does not seem to detect these partitions when everything is put together. Why is that? Do I need to process all the data using one method for this to work?
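
In case it helps with diagnosing this, here is a minimal sketch of how the two layouts could be compared side by side. It groups object keys by their parent prefix and lists the Hive-style partition columns found in each prefix. The keys, the `table/` prefix, the `dt` column, and the helper names `summarize_layout` and `partition_columns` are all hypothetical and only for illustration; they are not the real listing from my bucket.

```python
from collections import defaultdict
from posixpath import dirname


def summarize_layout(keys):
    """Group S3 object keys by their parent prefix (the partition folder)."""
    layout = defaultdict(list)
    for key in keys:
        # Everything before the last "/" is the folder; the rest is the file name.
        layout[dirname(key)].append(key.rsplit("/", 1)[-1])
    return dict(layout)


def partition_columns(prefix):
    """Return the column names from Hive-style `name=value` path segments."""
    return [part.split("=", 1)[0] for part in prefix.split("/") if "=" in part]


# Hypothetical keys, for illustration only (assumed names, not real output).
keys = [
    "table/dt=2020-01-01/part-00000-abc.snappy.parquet",
    "table/dt=2020-01-02/20200102_120000_00001_xyz",
]

layout = summarize_layout(keys)
for prefix, files in sorted(layout.items()):
    print(prefix, partition_columns(prefix), files)
```

In practice the `keys` list would come from listing the bucket (for example with the boto3 `list_objects_v2` paginator) rather than being typed in by hand. If the two sources differ in folder depth, partition column names, or file naming, that difference should show up in this summary.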