I am building an application that needs to allow users to upload large images (up to about 100 MB) to the Windows Azure Blob Storage service. Having read Rob Gillen's excellent article on file upload optimization for Windows Azure, I borrowed his approach for doing parallel upload of file chunks, using the CloudBlockBlob.PutBlock() method within a Parallel.For loop (code is available here).
The problem I have is that whenever I try to upload a file I get an "InvalidMd5" exception from the storage client. Suspecting that the problem may be in the development storage, I also tried running the code against my live Azure storage account, but I got the same error. Looking at the traffic with Fiddler I see that the "Content-MD5" header is set to a valid MD5 hash. The description of the error says that "The MD5 value specified in the request is invalid. The MD5 value must be 128 bits and Base64-encoded.", but to the best of my knowledge the value I see being sent in Fiddler is valid (e.g. a91c588092cedbdb1b82c2d3786fd509).
Here is the code I use for calculating the hash (courtesy of Rob Gillen):
public static string GetMD5HashFromStream(byte[] data)
{
MD5 md5 = new MD5CryptoServiceProvider();
byte[] retVal = md5.ComputeHash(data);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < retVal.Length; i++)
{
sb.Append(retVal[i].ToString("x2"));
}
return sb.ToString();
}
And this is the actual call to PutBlock():
blob.PutBlock(transferDetails[j].BlockId, new MemoryStream(buff), blockHash, options);
I also tried passing the hash like so:
Convert.ToBase64String(Encoding.UTF8.GetBytes(blockHash))
but the result was the same - "InvalidMd5" error :(
The MD5 hash being passed to PutBlock() with base64 encoding (e.g. YTkxYzU4ODA5MmNlZGJkYjFiODJjMmQzNzg2ZmQ1MDk=) and without it (e.g. a91c588092cedbdb1b82c2d3786fd509) doesn't seem to make a difference.
Rob's code obviously worked for him and I really have no idea what may be causing the problem in my case. The only change I've made to Rob's code is to alter the ParallelUpload() extension method to take a Stream instead of a file name and to dynamically determine the block size depending on the size of the file being uploaded.
Please, if anyone has an idea how to solve this problem, let me know! I will be really grateful! I already lost two days struggling with this.