I have 2 GB of hashes in storage that I want to make checkable through a public API.
Use Case
Let's say I want to create an API that checks whether a person is known to my product. To respect the person's privacy, I don't want to upload their name, member ID, and so on. So I decided to upload only a hash of the combined information that identifies them. Now I have 2 GB (about 6*10^7) of SHA-256 hashes and want to be able to check them as fast as possible.
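To make "a hash of the combined information" concrete, here is a minimal sketch of the kind of hashing I mean. The class name `MemberHash`, the choice of fields, and the newline separator are illustrative assumptions, not the exact scheme used in my product.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class MemberHash
{
    // Hypothetical helper: joins the identifying fields with a separator
    // and returns the SHA-256 digest, which is always 32 bytes (256 bits).
    public static byte[] Compute(string name, string memberId)
    {
        // The separator keeps ("ab", "c") and ("a", "bc") from producing
        // the same combined string.
        string combined = name + "\n" + memberId;
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(combined));
        }
    }
}
```

Each digest is 32 bytes, so 2 GB corresponds to roughly 6*10^7 hashes, and `Convert.ToBase64String` turns each digest into a 44-character string.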
This API should be hosted in Azure.
After reading the documentation for Azure storage accounts, I think Azure Table Storage is the right storage solution. I would set the base64 hash as the partition key and leave the row key empty.
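One caveat with using the base64 hash as a key: as far as I understand the Table Storage naming rules, `/` is not allowed in partition or row key values, and standard base64 can produce it. A minimal sketch of a URL-safe variant follows. The helper name `TableKey.FromHash` is hypothetical, and hex encoding would work just as well.

```csharp
using System;

public static class TableKey
{
    // Hypothetical helper: standard Base64 with '+' replaced by '-',
    // '/' replaced by '_', and the trailing '=' padding removed.
    public static string FromHash(byte[] hash)
    {
        return Convert.ToBase64String(hash)
            .Replace('+', '-')
            .Replace('/', '_')
            .TrimEnd('=');
    }
}
```

For a 32-byte SHA-256 digest this gives a 43-character key. The same encoding would have to be applied both when uploading and when looking up.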
Question
- First, is Azure Table Storage the right storage for the job?
- Will there be a performance difference between:
  - partition key: base64 hash, row key: empty
  - partition key: 'Upload Id', row key: base64 hash
- Does the time to access an entity by its keys depend on the size of the table?
What is the fastest way to check if a partition key is present? I think my naive first try is not really the best way.
    if (members.Where(x => x.PartitionKey == Convert.ToBase64String(data.Hash)).AsEnumerable().Any())
    {
        return req.CreateResponse(HttpStatusCode.OK, "Found Hash");
    }
    else
    {
        return req.CreateResponse(HttpStatusCode.NotFound, "Don't found Hash");
    }
How should I upload the 2 GB of hashes? I am thinking of uploading one big file and using an Azure Function to split it every 256 bits and add each value to Azure Storage. Or is there a better idea?
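The splitting step I have in mind could look roughly like this. It is a sketch that assumes the file is nothing but raw 32-byte digests written back to back, with no header or separators. `HashFile.ReadHashes` is a made-up name, and the actual insert into the table is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class HashFile
{
    public const int HashSize = 32; // 256 bits

    // Reads a stream of concatenated raw hashes and yields one
    // 32-byte array per hash, without loading the whole file into memory.
    public static IEnumerable<byte[]> ReadHashes(Stream input)
    {
        while (true)
        {
            byte[] buffer = new byte[HashSize];
            int filled = 0;
            // Stream.Read may return fewer bytes than requested,
            // so keep reading until the buffer is full or the stream ends.
            while (filled < HashSize)
            {
                int n = input.Read(buffer, filled, HashSize - filled);
                if (n == 0) break;
                filled += n;
            }
            if (filled == 0) yield break; // clean end of input
            if (filled < HashSize)
                throw new InvalidDataException("Trailing partial hash.");
            yield return buffer;
        }
    }
}
```

Each yielded array would then be encoded and written as one entity. As far as I understand, batch inserts in Table Storage require all entities in the batch to share a partition key, so with the hash as the partition key this would mean one insert per hash.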