Users can upload files to Azure Blob Storage through my web API service, and each blob has an MD5 hash.

Now another API lets the user download files they previously uploaded. The API returns:

return File(blobFile, MediaTypeNames.Application.Octet, file.FileName);

So the JavaScript client receives a byte array with the header:

Content-Type: application/octet-stream

The question is: how can the JS client validate that the MD5 of the downloaded file matches the one stored on the blob?

I tried some online MD5 tools, but they don't give me the same MD5 as the blob...

Be aware that large files may not have the MD5 property set "for free": stackoverflow.com/a/69319211/32453. I wonder if maybe the data is binary, so you're not seeing it line up? - rogerdpack

1 Answer

I tried some online MD5 tools and they don't give me the same MD5 as the blob.

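The MD5 algorithm itself is the same; what most likely differs is the encoding of the result. Online MD5 tools typically print the digest as a 32-character hexadecimal string, whereas the blob's ContentMD5 property holds the Base64 encoding of the same 16 hash bytes, so the two strings do not match when compared directly. A tool that hashes pasted text rather than the raw file bytes will also produce a different value for binary data.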
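To compare the two by hand, the hex digest from an online tool can be converted to Base64. The snippet below is a minimal sketch that assumes a Node.js environment (it relies on `Buffer`); `hexMd5ToBase64` is a hypothetical helper name, not part of any Azure SDK.

```javascript
// Hypothetical helper: convert a hex MD5 digest (as printed by most online
// tools) into the Base64 form used by the blob's Content-MD5 property.
function hexMd5ToBase64(hex) {
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error("expected a 32-character hex MD5 digest");
  }
  // Decode the hex string to its 16 raw bytes, then re-encode as Base64.
  return Buffer.from(hex, "hex").toString("base64");
}

// MD5 of the empty input, a well-known test vector.
console.log(hexMd5ToBase64("d41d8cd98f00b204e9800998ecf8427e"));
// prints "1B2M2Y8AsgTpgAmY7PhCfg=="
```

If the converted value equals the blob's MD5 property, the download is intact and only the string encoding differed.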

The following C# example shows how a client can validate the blob's MD5 hash once all the data has been retrieved.

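// Assumes retrievedBuffer (byte[]) holds the downloaded blob content and
// blobRef is the blob reference whose properties have already been fetched.
// Requires: using System; using System.IO; using System.Security.Cryptography;

// Validate MD5 value
using (var md5Check = MD5.Create())
{
    // TransformBlock can be called repeatedly when the data arrives in chunks
    md5Check.TransformBlock(retrievedBuffer, 0, retrievedBuffer.Length, null, 0);
    md5Check.TransformFinalBlock(new byte[0], 0, 0);

    // ContentMD5 is a Base64 string, so encode the computed hash the same way
    byte[] hashBytes = md5Check.Hash;
    string hashVal = Convert.ToBase64String(hashBytes);

    if (hashVal != blobRef.Properties.ContentMD5)
    {
        throw new InvalidDataException("MD5 Mismatch, Data is corrupted!");
    }
}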
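Since the question is about a JavaScript client, the same check might look roughly like this on the JS side. This is only a sketch under some assumptions: it uses Node.js's built-in `crypto` module (browsers' Web Crypto API does not offer MD5, so a browser client would need a third-party MD5 library), and `md5Base64` and `verifyDownload` are hypothetical helper names. How the client obtains the expected value is also an assumption; the download API would have to expose the blob's ContentMD5 somehow, for example in a response header.

```javascript
const crypto = require("crypto");

// Hypothetical helper: Base64-encoded MD5 of the downloaded bytes
// (a Buffer or Uint8Array), matching the encoding of ContentMD5.
function md5Base64(bytes) {
  return crypto.createHash("md5").update(bytes).digest("base64");
}

// Hypothetical helper: expectedMd5 is assumed to be the blob's ContentMD5
// value, passed to the client by the API in some way.
function verifyDownload(bytes, expectedMd5) {
  const actual = md5Base64(bytes);
  if (actual !== expectedMd5) {
    throw new Error("MD5 mismatch, data is corrupted");
  }
  return true;
}
```

A caller would pass the downloaded bytes, for instance `new Uint8Array(await response.arrayBuffer())` after a `fetch`, together with the expected Base64 string.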

Also, when you upload a blob to storage, setting the validate_content (bool) parameter to true makes the client calculate an MD5 hash for each chunk of the blob.

The storage service checks the hash of the content that has arrived against the hash that was sent. This is primarily valuable for detecting bit flips on the wire when using HTTP instead of HTTPS, since HTTPS (the default) already validates the content. Note that this MD5 hash is not stored with the blob.