I'm referring to the discussion in this linked question:
What is the algorithm to compute the Amazon-S3 Etag for a file larger than 5GB?
The steps to recreate the MD5 hash are to 1) concatenate the MD5 hashes for each upload part, 2) convert the concatenated hash into binary, 3) get the MD5 hash of the binary, then 4) add the hyphen and number of parts to the hash. That all sounds easy enough, but where I'm struggling is in step 3. To get the hash of the binary I need to convert the string into a byte array. To get the byte array I need to know what encoding format to use. That's the part I'm missing. Do I use ASCII, UTF8, Unicode, BigEndian, something else?
I've tried using the four formats above and none have produced the correct hash. I just can't seem to figure this one out. The code I'm using is:
    CompleteMultipartUploadResponse compResp = new CompleteMultipartUploadResponse();
    CompleteMultipartUploadRequest compReq = new CompleteMultipartUploadRequest();
    string requestETagHash = "";
    compResp = client.CompleteMultipartUpload(compReq);
    string compETag = compResp.ETag;
    foreach (PartETag s in compReq.PartETags)
    {
        requestETagHash += s.ETag.Replace('\"', ' ').Trim().Split('-').First();
    }
    StringBuilder sb = new StringBuilder();
    foreach (char c in requestETagHash)
    {
        try
        {
            sb.AppendFormat(Convert.ToString(Convert.ToInt16(c.ToString(), 16), 2).PadLeft(4, '0'));
        }
        catch (Exception ex)
        {
            MessageBox.Show("Hash error:\n\n" + ex.Message);
        }
    }
    //What encoding is used in this line?
    byte[] b = System.Text.Encoding.UTF8.GetBytes(sb.ToString());
    byte[] data = md5Hash.ComputeHash(b, 0, b.Length);
    StringBuilder sBuilder = new StringBuilder();
    for (int i = 0; i < data.Length; i++)
    {
        sBuilder.Append(data[i].ToString("x2"));
    }
Any help in solving this would be appreciated.
byte[], then a) concatenating those byte[] hashes together (so you can hash the result again); b) converting each hash into hex for the etag. – Jon Skeet
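A minimal sketch of what the comment above seems to suggest, assuming the four steps in the question describe the scheme correctly: each part's hash is kept as (or decoded back into) its raw 16 bytes, those byte arrays are concatenated and hashed again, and only the final digest is turned into hex. Under that reading no text encoding is involved at all, because each pair of hex digits maps directly to one byte. The class and method names below (`MultipartETagSketch`, `HexToBytes`, `BytesToHex`, `FromPartETags`, `FromPartHashes`) are hypothetical and not part of the AWS SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper class for illustration only; not part of the AWS SDK.
public static class MultipartETagSketch
{
    // Decodes a hex string into the raw bytes it represents
    // (two hex digits per byte). No text encoding is used.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.", nameof(hex));

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Formats raw bytes as lowercase hex.
    public static string BytesToHex(byte[] bytes)
    {
        return string.Concat(bytes.Select(b => b.ToString("x2")));
    }

    // Takes per-part ETags as hex strings (optionally wrapped in double
    // quotes), decodes each to raw bytes, and combines them.
    public static string FromPartETags(IEnumerable<string> partETags)
    {
        List<byte[]> partHashes = partETags
            .Select(e => HexToBytes(e.Trim().Trim('"')))
            .ToList();
        return FromPartHashes(partHashes);
    }

    // Concatenates the raw per-part MD5 digests, hashes the result again,
    // and appends "-<number of parts>" as described in the question.
    public static string FromPartHashes(IReadOnlyList<byte[]> partHashes)
    {
        byte[] concatenated = partHashes.SelectMany(h => h).ToArray();
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(concatenated);
            return BytesToHex(digest) + "-" + partHashes.Count;
        }
    }
}
```

With the variables from the question, this could presumably be called as `MultipartETagSketch.FromPartETags(compReq.PartETags.Select(p => p.ETag))` and the result compared with `compResp.ETag` after stripping its quotes. The key difference from the code in the question is that the hex digits are parsed into bytes directly, instead of being expanded into a string of '0' and '1' characters and then passed through `Encoding.UTF8.GetBytes`.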