I have a Structured Streaming job that reads from Event Hub and writes to a Delta Lake table at /mytablepath, which is stored on Azure Blob Storage. Over the last 2 months of running in production, it has created ~1000 small files in storage, each containing only 2-3 rows.
I tried running the OPTIMIZE command on my Delta Lake table (by path), but even after that the number of files on Blob Storage has not gone down, and whenever I run a query on the table in a notebook, it continues to show the warning "query is on a delta table with many small files, run optimize to improve performance".
Thanks