31
votes

I'm trying to write a hash function that accepts any data type. Inside the function, I handle the data as a byte slice. I'm having trouble figuring out how to convert an arbitrary interface{} to a byte slice.

I tried the encoding/binary package, but it seems to depend on the type of data passed in: one of the parameters of the Write() function requires knowing the byte order of the value.

All datatype sizes are some multiple of a byte (even the bool), so this should be simple in theory.

Code in question below:

package bloom

import (
    "bytes"
    "encoding/gob"
)

// adapted from http://bretmulvey.com/hash/7.html
func ComputeHash(key interface{}) (uint, error) {
    var buf bytes.Buffer
    enc := gob.NewEncoder(&buf)
    err := enc.Encode(key)
    if err != nil {
        return 0, err
    }
    data := buf.Bytes()

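    var a, b, c uint
    a, b = 0x9e3779b9, 0x9e3779b9
    c = 0
    i := 0

    // Consume 12 bytes per round: three little-endian 4-byte words.
    // Each byte is widened to uint before shifting; shifting a byte
    // by 8 or more would always yield 0.
    for ; i+12 <= len(data); i += 12 {
        a += uint(data[i]) | uint(data[i+1])<<8 | uint(data[i+2])<<16 | uint(data[i+3])<<24
        b += uint(data[i+4]) | uint(data[i+5])<<8 | uint(data[i+6])<<16 | uint(data[i+7])<<24
        c += uint(data[i+8]) | uint(data[i+9])<<8 | uint(data[i+10])<<16 | uint(data[i+11])<<24

        a, b, c = mix(a, b, c)
    }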

    c += uint(len(data))

    // Fold in the remaining bytes. Each byte is widened to uint before
    // shifting; shifting a byte by 8 or more would always yield 0.
    if i < len(data) {
        a += uint(data[i])
        i++
    }
    if i < len(data) {
        a += uint(data[i]) << 8
        i++
    }
    if i < len(data) {
        a += uint(data[i]) << 16
        i++
    }
    if i < len(data) {
        a += uint(data[i]) << 24
        i++
    }

    if i < len(data) {
        b += uint(data[i])
        i++
    }
    if i < len(data) {
        b += uint(data[i]) << 8
        i++
    }
    if i < len(data) {
        b += uint(data[i]) << 16
        i++
    }
    if i < len(data) {
        b += uint(data[i]) << 24
        i++
    }

    // c starts at bit 8: its low byte is left for the length added above.
    if i < len(data) {
        c += uint(data[i]) << 8
        i++
    }
    if i < len(data) {
        c += uint(data[i]) << 16
        i++
    }
    if i < len(data) {
        c += uint(data[i]) << 24
        i++
    }

    a, b, c = mix(a, b, c)
    return c, nil
}

func mix(a, b, c uint) (uint, uint, uint) {
    a -= b; a -= c; a ^= c >> 13
    b -= c; b -= a; b ^= a << 8
    c -= a; c -= b; c ^= b >> 13
    a -= b; a -= c; a ^= c >> 12
    b -= c; b -= a; b ^= a << 16
    c -= a; c -= b; c ^= b >> 5
    a -= b; a -= c; a ^= c >> 3
    b -= c; b -= a; b ^= a << 10
    c -= a; c -= b; c ^= b >> 15

    return a, b, c
}
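
A note on assembling multi-byte words from a []byte, as the hash above does: in Go, a shift is carried out in the type of its left operand, so shifting a byte by 8 or more bits discards every bit. The sketch below shows the difference between shifting first and converting first. The names narrowShift, wideShift and word32 are hypothetical helpers used only for illustration.

```go
package main

import "fmt"

// narrowShift shifts while the value is still a byte, so any bits moved
// past bit 7 are lost before the conversion to uint happens.
func narrowShift(b byte, n uint) uint { return uint(b << n) }

// wideShift converts to uint first, so the shifted bits are kept.
func wideShift(b byte, n uint) uint { return uint(b) << n }

// word32 assembles the first four bytes of p into one little-endian word.
func word32(p []byte) uint {
	return uint(p[0]) | uint(p[1])<<8 | uint(p[2])<<16 | uint(p[3])<<24
}

func main() {
	fmt.Printf("%#x %#x\n", narrowShift(0xff, 8), wideShift(0xff, 8)) // prints 0x0 0xff00
	fmt.Printf("%#x\n", word32([]byte{0x78, 0x56, 0x34, 0x12}))       // prints 0x12345678
}
```

With the narrow form, only the unshifted byte of each word contributes to the hash, which makes many short inputs collide.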
2
How about pkg "encoding/gob"? Can you use it? - nvcnvn
@nvcnvn, seems to be working. I tried it earlier but now I realize there's a weakness in the hash on small values (0-62 are identical?). I changed the range I was working with and it now seems to work. Thanks! - Nate Brennand
fixed the errors in the hash fn, updated code found here: gist.github.com/natebrennand/10442587 - Nate Brennand

2 Answers

71
votes

Other problems in my code led me away from the gob package earlier, but it turns out it was the proper way, as @nvcnvn suggested. The relevant code for solving this issue is below:

package bloom

import (
    "bytes"
    "encoding/gob"
)

func GetBytes(key interface{}) ([]byte, error) {
    var buf bytes.Buffer
    enc := gob.NewEncoder(&buf)
    err := enc.Encode(key)
    if err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}
4
votes

Another way to convert interface{} to []byte is to use the fmt package.

// Convert the variable `key` from interface{} to []byte.
// No type assertion is needed: key.(interface{}) would panic on a nil key.
byteKey := []byte(fmt.Sprintf("%v", key))

fmt.Sprintf converts the interface value to a string.
[]byte(...) converts that string to a byte slice.

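※ Note ※ This method does not work as intended if the interface{} value is a pointer: %v then formats the pointer itself (for a *int, its address) rather than the value it points to. See @PassKit's comment below.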