2
votes

Current hash functions are designed so that the hash changes drastically even if only a very small portion of the input is changed. What I need is a hash algorithm whose output changes in proportion to changes in the input. For example, I need something similar to this:

Hash("STR1") => 1000
Hash("STR2") => 1001
Hash("STR3") => 1002

etc. I'm not good at algorithms and have never heard of such an implementation, although I'm almost sure someone must already have come up with this algorithm.

My current requirement is a large output size (512 bits maybe?) to avoid collisions.

Thanks

UPDATE

I think I should clarify my goal; I see that I did a very poor job explaining what I need. Sorry, I'm not a native English speaker or a great communicator.

So basically I need this hash algorithm for searching for similar binary files. You can think of it as an antivirus hashing algorithm: it calculates a file checksum, but unlike traditional hash functions it can still detect a malware binary even after some small modification. This is pretty much what I'm looking for.

Another aspect is avoiding collisions. Let me explain what I mean by that; it's not a conflicting goal. I want Hash("STR1") to produce 1000 and Hash("STR2") to produce 1001 or maybe 1010, it doesn't matter as long as the value is close to the previous hash. But Hash("This is a very large string or maybe even binary data" + 100 random chars) should not produce a value close to 1000. I understand it won't always work and there will be some hash/hash-range collisions, but I think I can introduce another hashing algorithm and verify both to minimize collisions.

So what do you think? Maybe there is a better way to achieve my goal, maybe I'm asking too much, I don't know. I'm not well versed in cryptography, math, or algorithms.

Thank you again for your time and effort

6
I hope you know this is very weak security-wise, but I think I may be able to find something...Laurel
Yes, it's not for security purposes, but for search :). Thanks for your effort, Laurel :)Davita
Do you need "1str", "2str", "3str" to hash close together also?brian beuning
Avoiding collisions is incompatible with your goal of preserving "closeness" of hash results. You'll have to pick one.user149341
Locality-sensitive hashing can do this, although you end up with more collisions. If your data set is known and reasonably small, you can create a perfect hash function, although that doesn't fulfill your goal of small input change resulting in small output change. A minimal perfect hash might be what you're looking for.Jim Mischel

6 Answers

2
votes

How about a simple summation? Your hash can then wrap at the desired size, and if you take this into account during hash comparisons, a small difference in inputs should yield a small difference in hashes.

However, I think "minimal collisions" and "proportional change in output" are conflicting goals.
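
Something along these lines, as a minimal sketch (the 32-bit width and the distance helper are assumptions for illustration, not a definitive implementation):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sum of all input bytes; the sum wraps modulo 2^32. */
uint32_t sum_hash(const unsigned char *data, size_t len) {
    uint32_t h = 0;
    for (size_t i = 0; i < len; ++i)
        h += data[i];
    return h;
}

/* Compare hashes modulo 2^32 so values near the wrap-around point
   still count as close. */
uint32_t hash_distance(uint32_t a, uint32_t b) {
    uint32_t d = a - b;      /* unsigned subtraction wraps */
    uint32_t e = b - a;
    return d < e ? d : e;
}

int main(void) {
    const char *s1 = "STR1", *s2 = "STR2";
    uint32_t h1 = sum_hash((const unsigned char *)s1, strlen(s1));
    uint32_t h2 = sum_hash((const unsigned char *)s2, strlen(s2));
    printf("%u %u distance=%u\n", (unsigned)h1, (unsigned)h2,
           (unsigned)hash_distance(h1, h2));
    return 0;
}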

1
votes

This is called, in other domains, perceptual hashing.

One approach to this is as follows:

  1. Get a training multiset of n-grams. (E.g. if n=2 and your training data was "This is a test" your training set would be "Th", "hi", "is", "s ", etc)
  2. Calculate the frequencies of said n-grams and sort them, descending.

Then the hash of a word is formed by the first bits of the answers to "for each n-gram in the database, is this word's frequency of said n-gram higher than the average frequency?"

Note that this can and will result in many collisions with similar words, unfortunately, unless the hash length is absurdly long.
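
A rough sketch of this idea in C (the byte-bigram features, the tiny training text, and the 64-bit hash width are assumptions for illustration, not a tuned implementation):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NGRAMS 65536        /* all possible byte bigrams (n = 2) */
#define HASH_BITS 64        /* take the first 64 "questions" as the hash */

static double train_freq[NGRAMS];  /* relative bigram frequency in training data */
static int    ranked[NGRAMS];      /* bigram ids sorted by descending training frequency */

static int cmp_rank(const void *a, const void *b) {
    double fa = train_freq[*(const int *)a];
    double fb = train_freq[*(const int *)b];
    return (fa < fb) - (fa > fb);  /* descending order */
}

/* Relative frequency of every bigram in s. */
static void bigram_freq(const char *s, double *freq) {
    size_t len = strlen(s);
    memset(freq, 0, NGRAMS * sizeof(double));
    if (len < 2) return;
    for (size_t i = 0; i + 1 < len; ++i)
        freq[((unsigned char)s[i] << 8) | (unsigned char)s[i + 1]] += 1.0 / (len - 1);
}

void train(const char *corpus) {
    bigram_freq(corpus, train_freq);
    for (int i = 0; i < NGRAMS; ++i) ranked[i] = i;
    qsort(ranked, NGRAMS, sizeof(int), cmp_rank);
}

/* Bit i answers: does the input use the i-th most common training bigram
   more often than the training data does? */
uint64_t ngram_hash(const char *s) {
    static double freq[NGRAMS];    /* static: too large for the stack */
    uint64_t h = 0;
    bigram_freq(s, freq);
    for (int i = 0; i < HASH_BITS; ++i)
        if (freq[ranked[i]] > train_freq[ranked[i]])
            h |= (uint64_t)1 << i;
    return h;
}

int main(void) {
    train("This is a test. This is only a test of the n-gram idea.");
    printf("%016llx\n", (unsigned long long)ngram_hash("This is a test"));
    printf("%016llx\n", (unsigned long long)ngram_hash("This is a text"));
    return 0;
}

Similar words share most of their bigram statistics, so their hashes differ in only a few bit positions.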

1
votes

MD5 or SHA-x is not what you want.

According to Wikipedia, the substitution cipher, for example, has no avalanche effect ("avalanche effect" is the term you mean).

In terms of hashing, you could use some kind of character sum.

For example:

#include <stdio.h>
#include <string.h>
int main(void) {
    const char *hashme = "hallo123";
    int result = 0;
    for (size_t i = 0; i < strlen(hashme); ++i)   /* sum the character values */
        result += (unsigned char)hashme[i];
    printf("%d\n", result);
    return 0;
}
0
votes

It may be geared towards kids, but the old NSA Kids' section has some really good ideas.

Of course, these algorithms are really insecure, so you cannot use this in place of REAL encryption. (But you can't use a real encryption algorithm when you just want to have fun, either.)


The number grid involves setting up a grid, then using the coordinates of each letter:

grid of letters

Further ideas:

  • Mix up the letter arrangement
  • Convert numbers to binary to obfuscate

A winding way also uses a grid. Essentially, the letters are packed in the grid left to right, in rows downwards. The output is produced by slicing vertically through the grid:

The password is an enigma
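
A minimal sketch of the winding way in C (the 5-column grid width and the example text are assumptions for illustration):

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *msg = "Thepasswordisanenigma";
    size_t len = strlen(msg), cols = 5;
    /* The text is conceptually written into the grid row by row;
       reading every cols-th character slices vertically through it. */
    for (size_t c = 0; c < cols; ++c)
        for (size_t i = c; i < len; i += cols)
            putchar(msg[i]);
    putchar('\n');
    return 0;
}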

0
votes

Typically hash and encryption algorithms oriented towards cryptography will behave in the exact opposite way of what you're looking for (i.e. small changes in the input will cause large changes in the output and vice versa), so this algorithm class is a dead end.

As a quick digression on why these algorithms behave like this: of necessity, they're designed to obscure statistical relationships between the input and output to make them more difficult to crack. For example, in the English language the letter "e" is by far the most commonly-used letter; in some very weak classical ciphers you could simply find the most common letter and figure that that corresponds to "e" (e.g. - if n is the most common letter, then odds are n = e). Actually, a statistical pattern like you describe would likely make the algorithm significantly more vulnerable to chosen-plaintext, known-plaintext, man in the middle, and replay attacks.

The man in the middle and replay attacks would be made significantly easier by the fact that it would be much easier to edit the ciphertext to achieve the desired plaintext without knowing the key (especially if you have access to a couple of chosen plaintexts).

If you know that

7/19/2016 1:35 transfer $10 from account x to account y

(where the datestamp is used to defend against a replay attack) encodes to

12345678910

whereas

7/19/2016 1:40 transfer $10 from account x to account y

encodes to

12445678910

it's a pretty safe guess that

12545678910

will mean something like

7/19/2016 1:45 transfer $10 from account x to account y

Without having access to the original key, you could replay this packet on a regular basis to continue to steal money from someone's account simply by making a trivial edit. Granted, this is a fairly contrived example, but it still illustrates the basic problem.

My understanding of what you're looking for is statistical similarity between files. This might help some: https://en.wikipedia.org/wiki/Semantic_similarity

0
votes

This does indeed exist. The term is locality-sensitive hashing. A concrete implementation can be found here: https://github.com/trendmicro/tlsh . Depending on the source documents, you might want to look at digital forensics or VisualRank (from Google) for finding similar images and video. For textual data this is commonly used in anti-spam (read more here: http://spdp.di.unimi.it/papers/pdcs04.pdf). For binary files you might want to first run a disassembler and then run the algorithm on the text version - but this is just my feeling; I don't have research to back this statement, but it would be an interesting hypothesis to test.
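
To give a flavor of the idea, here is a minimal SimHash-style sketch in C (this is not the TLSH algorithm or its API; the byte-trigram features and mixing constants are assumptions for illustration). Similar inputs produce hashes that differ in only a few bits, so similarity is measured by Hamming distance rather than by numeric closeness:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Mix a 3-byte feature into 64 bits (FNV-1a style). */
static uint64_t feature_hash(const unsigned char *p) {
    uint64_t h = 1469598103934665603ULL;
    for (int i = 0; i < 3; ++i) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

uint64_t simhash(const unsigned char *data, size_t len) {
    int weight[64] = {0};
    for (size_t i = 0; i + 2 < len; ++i) {         /* every byte trigram is a feature */
        uint64_t f = feature_hash(data + i);
        for (int b = 0; b < 64; ++b)
            weight[b] += ((f >> b) & 1) ? 1 : -1;  /* each feature votes per bit */
    }
    uint64_t h = 0;
    for (int b = 0; b < 64; ++b)
        if (weight[b] > 0) h |= (uint64_t)1 << b;  /* keep the majority vote */
    return h;
}

int hamming(uint64_t a, uint64_t b) {
    uint64_t x = a ^ b;
    int d = 0;
    while (x) { d += (int)(x & 1); x >>= 1; }
    return d;
}

int main(void) {
    const char *a = "This is a very large string or maybe even binary data";
    const char *b = "This is a very large string or maybe even binary stuff";
    uint64_t ha = simhash((const unsigned char *)a, strlen(a));
    uint64_t hb = simhash((const unsigned char *)b, strlen(b));
    printf("distance = %d bits\n", hamming(ha, hb));
    return 0;
}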