My use case is simple. I have 20 TB of raw, uncompressed CSV data in S3, laid out in a folder structure partitioned by year (10 partitions for 10 years, 2 TB per partition). I want to convert this data to Parquet (Snappy-compressed) and keep the same partition/folder structure. The goal is ONE Parquet table with ten partitions in Athena, which I will query by partition, and I may delete the raw CSV data later. With Glue, it looks like I would end up with 10 separate Parquet tables, which I can't use.
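To make the target layout concrete, here is a minimal sketch of the folder mapping I am after. The bucket name, the two prefixes, and the Hive-style `year=` folder naming are placeholders made up for illustration, not my actual paths.

```python
# Hypothetical locations: bucket and prefixes are placeholders, not real paths.
RAW_PREFIX = "s3://my-bucket/raw/"
PARQUET_PREFIX = "s3://my-bucket/parquet/"


def target_partition(raw_path: str) -> str:
    """Return the Parquet partition folder that a raw CSV object should land in.

    Assumes the first folder under RAW_PREFIX is a Hive-style year partition,
    for example "year=2015".
    """
    if not raw_path.startswith(RAW_PREFIX):
        raise ValueError(f"not under the raw prefix: {raw_path}")
    relative = raw_path[len(RAW_PREFIX):]
    year_folder = relative.split("/", 1)[0]
    if not year_folder.startswith("year="):
        raise ValueError(f"no year partition folder in: {raw_path}")
    # Same partition folder name, different root: one table, one folder per year.
    return f"{PARQUET_PREFIX}{year_folder}/"


if __name__ == "__main__":
    print(target_partition("s3://my-bucket/raw/year=2015/data-0001.csv"))
```

All ten year folders would sit under the single Parquet prefix, so Athena would see one table location with ten partitions rather than ten table locations.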
Is this doable in Glue? I was looking for a simple solution instead of using EC2 or Hive/Spark. Any recommendations? Any help is much appreciated.