3
votes

I'm attempting to create a logger for an application in Azure using the new Azure append blobs and the Azure Storage SDK 6.0.0. So I created a quick test application to get a better understanding of append blobs and their performance characteristics.

My test program simply loops 100 times and appends a line of text to the append blob. If I use the synchronous AppendText() method, everything works fine, but it appears to be limited to about 5-6 appends per second. So I tried the asynchronous AppendTextAsync() method instead. With this method the loop runs much faster (as expected), but the append blob ends up missing about 98% of the appended text, and no exception is thrown.

If I add a Thread.Sleep and sleep for 100 milliseconds between each append operation, I end up with about 50% of the data. Sleep for 1 second and I get all of the data.

This seems similar to an issue that was discovered in v5.0.0 but was fixed in v5.0.2: https://github.com/Azure/azure-storage-net/releases/tag/v5.0.2

Here is my test code if you'd like to try to reproduce this issue:

static void Main(string[] args)
{
    var accountName = "<account-name>";

    var accountKey = "<account-key>";

    var credentials = new StorageCredentials(accountName, accountKey);

    var account = new CloudStorageAccount(credentials, true);

    var client = account.CreateCloudBlobClient();

    var container = client.GetContainerReference("<container-name>");

    container.CreateIfNotExists();

    var blob = container.GetAppendBlobReference("append-blob.txt");

    blob.CreateOrReplace();

    for (int i = 0; i < 100; i++)
        blob.AppendTextAsync(string.Format("Appending log number {0} to an append blob.\r\n", i));

    Console.WriteLine("Press any key to exit.");

    Console.ReadKey();
}

Does anyone know if I'm doing something wrong with my attempt to append lines of text to an append blob? Otherwise, any idea why this would just lose data without throwing some kind of exception?

I'd really like to start using this as a repository for my application logs (since it was largely created for that purpose). However, it would be quite unreliable if logs would just go missing without warning if the rate of logging went above 5-6 logs per second.

Any thoughts or feedback would be greatly appreciated.


2 Answers

7
votes

I now have a working solution based upon the information provided by @ZhaoxingLu-Microsoft. According to the API documentation, the AppendTextAsync() method should only be used in a single-writer scenario, because the API internally uses the append-offset conditional header to avoid duplicate blocks, and that does not work in a multiple-writer scenario.

Here is the documentation that specifies this behavior is by design: https://msdn.microsoft.com/en-us/library/azure/mt423049.aspx

So the solution is to use the AppendBlockAsync() method instead. The following implementation appears to work correctly:

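// One task per append, so that all of them can be waited on below.
var tasks = new Task[100];

for (int i = 0; i < 100; i++)
{
    var message = string.Format("Appending log number {0} to an append blob.\r\n", i);

    var bytes = Encoding.UTF8.GetBytes(message);

    var stream = new MemoryStream(bytes);

    tasks[i] = blob.AppendBlockAsync(stream);
}

// Block until every append has completed.
Task.WaitAll(tasks);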

Please note that I am not explicitly disposing the memory stream in this example. Doing so would require a using block with an async/await inside it, so that the async append operation finishes before the memory stream is disposed, but that causes a completely unrelated issue.
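For readers who do want the stream disposed, here is a minimal sketch of one possible approach: move the using block into a small async helper so that the loop itself stays synchronous. The helper name AppendLineAsync, the appendBlock delegate, and the in-memory sink are all hypothetical stand-ins so the sample runs without a storage account. With the real SDK the delegate would presumably be something like s => blob.AppendBlockAsync(s). Whether this sidesteps the unrelated issue mentioned above is not something this sketch can confirm.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AppendHelper
{
    // appendBlock is a hypothetical stand-in for the real append call,
    // e.g. s => blob.AppendBlockAsync(s).
    public static async Task AppendLineAsync(Func<Stream, Task> appendBlock, string message)
    {
        var bytes = Encoding.UTF8.GetBytes(message);

        using (var stream = new MemoryStream(bytes))
        {
            // The stream is disposed only after the append has completed.
            await appendBlock(stream);
        }
    }

    // Runs three appends against an in-memory sink and returns the bytes written.
    public static long Demo()
    {
        var sink = new MemoryStream();

        // Fake append: waits briefly, then copies the (still open) stream to the sink.
        Func<Stream, Task> fakeAppend = async s =>
        {
            await Task.Delay(10);
            lock (sink) { s.CopyTo(sink); }
        };

        var tasks = new Task[3];
        for (int i = 0; i < tasks.Length; i++)
        {
            var message = string.Format("Appending log number {0}.\r\n", i);
            tasks[i] = AppendLineAsync(fakeAppend, message);
        }

        Task.WaitAll(tasks);
        return sink.Length;
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
```

If the stream were disposed before the append finished, the CopyTo call in the fake append would throw an ObjectDisposedException, which is the failure the using/await pairing is meant to prevent.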

2
votes

You are using the async method incorrectly. blob.AppendTextAsync() is non-blocking, so the append has not actually finished when the call returns. You should wait for all of the async tasks to complete before exiting the process.

The following code shows the correct usage:

var tasks = new Task[100];
for (int i = 0; i < 100; i++)
    tasks[i] = blob.AppendTextAsync(string.Format("Appending log number {0} to an append blob.\r\n", i));

Task.WaitAll(tasks);

Console.WriteLine("Press any key to exit.");

Console.ReadKey();
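
A related point about waiting on the tasks: an exception thrown inside an async method is stored on the returned Task and is not thrown at the call site, so a loop that discards the tasks never sees it. The sketch below illustrates this with a hypothetical FailingAppendAsync method that stands in for an append that fails. It is not the SDK's code, and it does not show what actually happened to the blob in the question.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitDemo
{
    // Hypothetical stand-in for an append operation that fails.
    static async Task FailingAppendAsync(int i)
    {
        await Task.Delay(10);
        throw new InvalidOperationException("append " + i + " failed");
    }

    // Starts three failing appends and returns how many failures WaitAll reports.
    public static int CountObservedFailures()
    {
        var tasks = new Task[3];
        for (int i = 0; i < tasks.Length; i++)
            tasks[i] = FailingAppendAsync(i);   // nothing is thrown here

        try
        {
            Task.WaitAll(tasks);                // failures surface here
            return 0;
        }
        catch (AggregateException ex)
        {
            return ex.InnerExceptions.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountObservedFailures());
    }
}
```

Keeping the tasks and waiting on them therefore serves two purposes: the process does not exit early, and any failures are reported as an AggregateException instead of going unnoticed.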