The idea behind data retention is to establish policies so that data which should no longer be kept is removed automatically as part of the process.
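By default, Delta Lake keeps a history of all data modifications in its transaction log. Two table properties control retention: delta.logRetentionDuration (default: interval 30 days), which sets how long the log history is kept, and delta.deletedFileRetentionDuration (default: interval 1 week), which sets how long data files no longer referenced by the table are kept before VACUUM can remove them.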
_delta_log is the directory that holds the transaction log in Databricks Delta Lake. It keeps the commit history of table transactions for a default period of 30 days. However, if you ingest data into Delta Lake tables frequently, you may see a large number of tiny JSON and CRC files accumulate in your storage account under the _delta_log directory. This can increase your storage costs unnecessarily if you do not need to keep 30 days of log history.
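To get a feel for how quickly these files add up, here is a rough back-of-the-envelope sketch in Python. It assumes one JSON file and one CRC file per commit and ignores checkpoint files, so treat the result as an order-of-magnitude estimate rather than an exact count. The function name estimate_log_files and the commit rate are made up for illustration.

```python
def estimate_log_files(commits_per_hour: float, retention_days: int = 30) -> int:
    """Rough count of per-commit files kept under _delta_log.

    Assumption: each commit writes one JSON file and one CRC file.
    Checkpoint files are ignored, so this is only an estimate.
    """
    commits = int(commits_per_hour * 24 * retention_days)
    return commits * 2


# One commit per minute with the default 30-day log retention.
print(estimate_log_files(60))      # 86400
# Same commit rate with a 10-day (240-hour) log retention.
print(estimate_log_files(60, 10))  # 28800
```

Under these assumptions, shortening the log retention from 30 days to 10 days cuts the number of retained log files to a third.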
If you administer and manage your Databricks environment, you should look into truncating these log files. The default way to do this in Databricks Delta Lake is to run an ALTER TABLE ... SET TBLPROPERTIES statement for each table, which can be cumbersome to administer.
%sql
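-- Keep 240 hours (10 days) of log history and 1 hour of deleted data files
ALTER TABLE table_name SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 240 hours', 'delta.deletedFileRetentionDuration'='interval 1 hours');
-- Confirm the new property values
SHOW TBLPROPERTIES table_name;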
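Since running the statement table by table is tedious, one option is to generate it for a list of tables. The following is a minimal sketch, not an official Databricks utility: the helper build_retention_statements and the table names are hypothetical, and the retention values simply repeat the ones used above. It only builds the SQL strings. Running them would rely on the `spark` session that a Databricks notebook normally provides.

```python
from typing import Iterable, List


def build_retention_statements(
    table_names: Iterable[str],
    log_retention: str = "interval 240 hours",
    deleted_file_retention: str = "interval 1 hours",
) -> List[str]:
    """Return one ALTER TABLE ... SET TBLPROPERTIES statement per table."""
    statements = []
    for name in table_names:
        statements.append(
            f"ALTER TABLE {name} SET TBLPROPERTIES ("
            f"'delta.logRetentionDuration'='{log_retention}', "
            f"'delta.deletedFileRetentionDuration'='{deleted_file_retention}')"
        )
    return statements


# Hypothetical table names, for illustration only.
tables = ["sales.orders", "sales.customers"]
for stmt in build_retention_statements(tables):
    print(stmt)
    # In a Databricks notebook, the statement could then be executed with:
    # spark.sql(stmt)
```

Building the statements separately from executing them makes it easy to review what will be run before applying it to every table.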