Morning everyone,
I have created an ELT process with the following layers and steps:
- An SSIS package per source table that pulls data from a SQL source and drops it into ADLS as CSV files.
- An SSIS package that uses PolyBase to query the external tables and applies some transformations.
- Further SSIS packages that build the slowly changing dimensions, fact and dimension tables in Azure SQL Data Warehouse and Azure SQL Database.
This runs smoothly every day for around 40 tables from the original SQL source, but the collective size of the files in the lake is now substantial and the storage costs are adding up.
Is there any way to compress the files in SSIS before uploading them to ADLS? I've found a tool called ZappySys that offers an SSIS component which compresses the data, but it requires you to save the original table as a flat file first, then compress it, then send it to ADLS, and we can't have flat files stored on our VMs.
How else could I compress the data before it lands in ADLS? It must not involve writing flat files to local directories at any point.
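To make it concrete, this is the kind of in-memory approach I'm after, sketched in Python purely for illustration (our pipeline is SSIS, and I've left the actual ADLS upload out so this only shows the compression step; the function name and row format are just placeholders):

```python
import gzip
import io

def compress_rows_in_memory(rows):
    """Gzip CSV-style rows entirely in memory; returns bytes ready to upload.

    No temporary flat file is ever written to disk -- the compressed
    payload lives only in a BytesIO buffer.
    """
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for row in rows:
            gz.write((",".join(map(str, row)) + "\n").encode("utf-8"))
    return buf.getvalue()

# Two sample rows compressed without touching the filesystem
payload = compress_rows_in_memory([(1, "a"), (2, "b")])
```

The returned bytes could then be streamed straight to the lake by whatever upload mechanism the package uses, so nothing ever exists as an uncompressed file on the VM.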
Many thanks
