Our PowerShell test harness calls Databricks, which generates Parquet files in Azure Storage.
When the harness cleans up the Parquet files (and other files) after a test run, it searches for all blobs in the given locations and removes them. After cleanup the blobs are no longer visible in the Azure portal, yet each time the script runs again it finds an increasing number of blobs to delete.
Is this a case of some soft/hard delete policy?
I'm not specifying the snapshot parameter when deleting blobs, as I'm not interested in retaining snapshots.
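One way to rule a retention policy in or out is to query the blob service properties and, if soft delete is enabled, list the soft-deleted blobs explicitly. A minimal sketch, assuming the Az.Storage module, the existing $ctx and $remoteContainer variables, that Get-AzStorageServiceProperty exposes the DeleteRetentionPolicy, and a module version recent enough to support -IncludeDeleted:

# Check whether blob soft delete is enabled on the storage account (data-plane query)
$blobServiceProps = Get-AzStorageServiceProperty -ServiceType Blob -Context $ctx
$blobServiceProps.DeleteRetentionPolicy | Format-List Enabled, RetentionDays

# Soft-deleted blobs are only returned when explicitly included in the listing
$softDeleted = Get-AzStorageBlob -Container $remoteContainer -Context $ctx -IncludeDeleted |
    Where-Object { $_.IsDeleted }
"Soft-deleted blobs still retained: $($softDeleted.Count)"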
Some of the code used is:
# Build the storage context first, then enumerate the blobs to delete
$ctx = GetStorageContext -storageaccountName $remoteStorageAccount -storageaccountkey $remoteStorageKey
$availableBlobs = Get-AzStorageBlob -Container $remoteContainer -Context $ctx
# The delete runs inside a parallel script block (implied by the $using: scope), e.g. ForEach-Object -Parallel:
$availableBlobs | ForEach-Object -Parallel {
    $blob = $_
    Remove-AzStorageBlob -Container $using:remoteContainer -Blob $blob.Name -Force -Context $using:ctx -ErrorAction SilentlyContinue
}
Why might Remove-AzStorageBlob appear to fully remove a blob, such that the blob is no longer visible, yet add to the growing blob count found the next time the cleanup script runs?
Additional information: After removing -ErrorAction SilentlyContinue from Remove-AzStorageBlob, it seems that part way through deleting all the blobs in $availableBlobs an HTTP 500 error is returned.
The error does not appear at the same point on subsequent attempts to run the same code.
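Given that the 500 is transient and moves around, one option while investigating is to stop suppressing the error and retry each individual delete a few times instead. A rough sketch (the retry count and back-off are arbitrary, and a "blob does not exist" error would also surface here):

foreach ($blob in $availableBlobs) {
    for ($attempt = 1; $attempt -le 3; $attempt++) {
        try {
            Remove-AzStorageBlob -Container $remoteContainer -Blob $blob.Name -Force -Context $ctx -ErrorAction Stop
            break   # delete succeeded, move on to the next blob
        }
        catch {
            Write-Warning "Attempt $attempt failed for '$($blob.Name)': $($_.Exception.Message)"
            Start-Sleep -Seconds (2 * $attempt)   # simple back-off before retrying
        }
    }
}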
Output $availableBlobs just so that you can see what the script is trying to delete. Is it the same blobs? (Usually a caching issue; you may have to refresh the storage context before re-running.) Is it the same blobs twice? (i.e. blobs += blobs because of caching.) Remove -ErrorAction SilentlyContinue from the remove and you may get the error message that the "blob does not exist", i.e. the blobs are gone, it's a caching issue, and you simply are not seeing it because you are hiding the error message. – HAL9256
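One way to act on that suggestion is to log the blob names each run and, after rebuilding the storage context, diff them against the previous run; that shows whether the "extra" blobs are the same ones reappearing or genuinely new files. A sketch, with the log file name ('previous-run-blobs.txt') purely as an illustrative placeholder:

# Rebuild the context so the listing is not served from a stale client object
$ctx = GetStorageContext -storageaccountName $remoteStorageAccount -storageaccountkey $remoteStorageKey

# Record what this run sees, then compare with what the previous run saw
$currentRun = (Get-AzStorageBlob -Container $remoteContainer -Context $ctx).Name | Sort-Object
"Blobs listed this run: $($currentRun.Count)"

if (Test-Path '.\previous-run-blobs.txt') {
    $previousRun = Get-Content '.\previous-run-blobs.txt'
    $reappearing = Compare-Object $previousRun $currentRun -IncludeEqual -ExcludeDifferent
    "Blobs also listed in the previous run: $($reappearing.Count)"
}

$currentRun | Set-Content '.\previous-run-blobs.txt'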