2
votes

I need some archive cleanup code to remove old Azure logs once a certain retention period has elapsed.

I am aware that I can do this:

CloudStorageAccount storageAccount = CloudStorageAccount.Parse("");
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("ctr");

var blobList = container.ListBlobs();
foreach(var blob in blobList)
{
    logger.Info($"Blob Name: {blob.Uri}");
}

However, within my container the structure is:

/
/year/month/day/hour/files

So right now there is

/2017/5/11/14/files
/2017/5/11/17/files
/2017/5/11/22/files
/2017/5/11/23/files

and

/2017/5/12/11/files

Here, files stands for multiple backup files.

The foreach loop only has 1 item in its collection, as the 2017 folder is the root.

Is there a way to retrieve all blobs?

The end goal is to delete all blobs older than the retention period.


2 Answers

5
votes

Try this pattern. It can be handy when browsing large storage accounts. I found it much friendlier to the GC and the memory footprint, because the blobs are fetched one segment at a time rather than all at once.

var blobAccount = "<account>";
var apiKey = "<api-key>";
var containerName = "<container>";
var storageCredentials = new StorageCredentials(blobAccount, apiKey);

var account = new CloudStorageAccount(storageCredentials, true);
var blobClient = account.CreateCloudBlobClient();
var container = blobClient.GetContainerReference(containerName);
var blobLimit = 500; // maximum number of blobs returned per segment

if (container == null) { return; }

// Stack both usings so the writer is flushed and disposed before the stream is closed.
using (var fs = new FileStream("Output.csv", FileMode.Create))
using (var sw = new StreamWriter(fs))
{
    sw.WriteLine("Type,Name,Length");

    BlobContinuationToken continuationToken = null;
    do
    {   
        var blobList = container.ListBlobsSegmented("",
                                   true,
                                   BlobListingDetails.Metadata,
                                   blobLimit,
                                   continuationToken,
                                   new BlobRequestOptions
                                   {
                                       LocationMode = LocationMode.PrimaryOnly
                                   },
                                   null);

        continuationToken = blobList.ContinuationToken;

        // I was looking only for BlockBlobs
        foreach (var item in blobList.Results.OfType<CloudBlockBlob>())
        {
            sw.WriteLine($"block,\"{item.Name}\",{item.Properties.Length}");
        }

    } while (continuationToken != null);
}
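
The same segmented loop can be pointed at the cleanup itself instead of a CSV report. The following is only a sketch, assuming the classic Microsoft.WindowsAzure.Storage SDK used above and assuming that each blob's LastModified timestamp is an acceptable measure of its age. BlobCleanup, DeleteOlderThan and retention are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.WindowsAzure.Storage.Blob;

static class BlobCleanup
{
    // Hypothetical helper: deletes every block blob whose LastModified
    // timestamp is older than the retention period; returns the delete count.
    public static int DeleteOlderThan(CloudBlobContainer container, TimeSpan retention)
    {
        var cutoff = DateTimeOffset.UtcNow - retention;
        var deleted = 0;
        BlobContinuationToken token = null;

        do
        {
            // Flat listing ("" prefix, true) walks every virtual folder.
            var segment = container.ListBlobsSegmented(
                "", true, BlobListingDetails.None, 500, token, null, null);
            token = segment.ContinuationToken;

            foreach (var blob in segment.Results.OfType<CloudBlockBlob>())
            {
                var modified = blob.Properties.LastModified;
                if (modified.HasValue && modified.Value < cutoff && blob.DeleteIfExists())
                {
                    deleted++;
                }
            }
        } while (token != null);

        return deleted;
    }
}
```

A call such as BlobCleanup.DeleteOlderThan(container, TimeSpan.FromDays(30)) would then remove anything last modified more than 30 days ago. The 30 days is an example value, not something from the question.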
4
votes

Use the useFlatBlobListing parameter like this:

CloudStorageAccount storageAccount = CloudStorageAccount.Parse("");
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("ctr");

var blobList = container.ListBlobs(useFlatBlobListing: true);
foreach(var blob in blobList)
{
    logger.Info($"Blob Name: {blob.Uri}");
}

This will give you all blobs in a flattened way.

See https://docs.microsoft.com/en-us/dotnet/api/microsoft.windowsazure.storage.blob.cloudblobcontainer.listblobs?view=azure-dotnet

If you also include the prefix parameter, you can filter results based on the folder structure. To get everything in May 2017 you can do:

var blobList = container.ListBlobs(prefix: "2017/5/", useFlatBlobListing: true);

This might help reduce the list of blobs, depending on your retention period.
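
Because the blob names already encode year/month/day/hour, the age of a blob can also be worked out from its name alone. The sketch below assumes names of the form 2017/5/11/14/file as shown in the question, and assumes the hour in the path is UTC (the question does not say). BlobPathAge, TryGetTimestamp and IsExpired are hypothetical names, and nothing here calls the storage SDK.

```csharp
using System;

static class BlobPathAge
{
    // Hypothetical helper: parses "year/month/day/hour/file" into a UTC timestamp.
    public static bool TryGetTimestamp(string blobName, out DateTime timestampUtc)
    {
        timestampUtc = default(DateTime);
        var parts = blobName.TrimStart('/').Split('/');
        if (parts.Length < 5) return false;

        int year, month, day, hour;
        if (!int.TryParse(parts[0], out year) ||
            !int.TryParse(parts[1], out month) ||
            !int.TryParse(parts[2], out day) ||
            !int.TryParse(parts[3], out hour))
        {
            return false;
        }

        try
        {
            timestampUtc = new DateTime(year, month, day, hour, 0, 0, DateTimeKind.Utc);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            return false; // not a valid date, e.g. month 13 or hour 25
        }
    }

    // True only when the name parses and its timestamp is older than the retention period.
    public static bool IsExpired(string blobName, DateTime nowUtc, TimeSpan retention)
    {
        DateTime timestampUtc;
        return TryGetTimestamp(blobName, out timestampUtc) && timestampUtc < nowUtc - retention;
    }
}
```

With a flat listing, each blob's Name could be passed to IsExpired and the blob deleted when it returns true. Names that do not match the pattern are never treated as expired, so unrelated blobs are left alone.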