I have an AWS Glue ETL job that runs every 15 minutes and writes one Parquet file to S3 on each run.
I need to create another job that runs at the end of each hour and merges those four Parquet files in S3 into a single Parquet file, using AWS Glue ETL PySpark code.
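
For concreteness, a minimal sketch of what such a merge step might look like is below. It is only an illustration under assumptions, not a tested Glue job: the helper names `hourly_prefix` and `merge_parquet`, the bucket name, and the `YYYY/MM/DD/HH` prefix layout are all hypothetical, and the four files are assumed to share one schema and sit under one hourly prefix.

```python
from datetime import datetime


def hourly_prefix(base: str, hour: datetime) -> str:
    """Build the S3 prefix for one hour.

    The YYYY/MM/DD/HH layout is an assumption for illustration only.
    """
    return f"{base.rstrip('/')}/{hour:%Y/%m/%d/%H}/"


def merge_parquet(spark, src: str, dst: str) -> None:
    """Read every Parquet file under src and write the rows back under dst."""
    df = spark.read.parquet(src)
    # coalesce(1) collapses the data into a single partition, so Spark
    # writes a single part file into the dst directory.
    df.coalesce(1).write.mode("overwrite").parquet(dst)


# Inside a Glue job the session would typically be obtained like this:
#   from awsglue.context import GlueContext
#   from pyspark.context import SparkContext
#   spark = GlueContext(SparkContext.getOrCreate()).spark_session
#   src = hourly_prefix("s3://my-bucket/raw", datetime(2024, 1, 2, 3))     # hypothetical bucket
#   dst = hourly_prefix("s3://my-bucket/merged", datetime(2024, 1, 2, 3))  # hypothetical bucket
#   merge_parquet(spark, src, dst)
```

Note that Spark writes `dst` as a directory containing one `part-*.parquet` file (plus a `_SUCCESS` marker), not a single file with a chosen name, and `coalesce(1)` pushes all rows through one task, which is fine for small hourly batches but may not scale to large ones.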
Has anyone tried this? Any suggestions or best practices?
Thanks in advance!