I am trying to learn Rust and decided to write a program that converts a hex string into a u64.

Currently, I have parsed the string into a vector of u8 values, each representing four bits (or "nibble"). I wrote the following code to take a Vec<u8> and return a corresponding u64. It works (as far as my testing shows), but I am not sure if it is the "appropriate" way in Rust to go about doing this.

fn convert_nibbles_to_u64(values: &Vec<u8>) -> u64 {
    // We need to turn this buffer into a u64 now
    let mut temp:u64 = 0;
    for i in values {
        temp = temp << 4;

        unsafe {
            // We need to unsafely convert a u8 to a u64. Note that
            // the host endian-ness will matter here.
            use std::mem;
            let i_64_buffer = [0u8,0u8,0u8,0u8,0u8,0u8,0u8,i.clone()];
            let i_64 = mem::transmute::<[u8; 8], u64>(i_64_buffer);
            let i_64_be = u64::from_be(i_64);
            temp = temp | i_64_be;           
        }
    }
    return temp;
}

I suppose the main issue is that I don't know how else to cast a u8 to a u64 value. Could you comment on ways to improve the code or write it in a more idiomatic, Rust-like style?

EDIT: I have tried the following (unsuccessful) alternatives to the unsafe block:

Or'ing with i as a u64:

temp = temp | i as u64; 
------
Compiler error:
main.rs:115:23: 115:31 error: non-scalar cast: `&u8` as `u64`
main.rs:115         temp = temp | i as u64;

Or'ing with i directly:

temp = temp | i;
------
Compiler error:
main.rs:115:16: 115:24 error: the trait `core::ops::BitOr<&u8>` is not implemented for the type `u64` [E0277]
main.rs:115         temp = temp | i;
What’s wrong with i as u64? - Chris Morgan
So, out of each u8 you’re only using four of the bits. So if your bits are [0b0000AAAA, 0b0000BBBB, 0b0000CCCC, 0b0000DDDD, 0b0000EEEE, 0b0000FFFF, 0b0000GGGG, 0b0000HHHH] you want to end up with 0bAAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHu64. Am I correct? - Chris Morgan
@ChrisMorgan if input is [0xF,0xE,0xE,0xD,0xB,0xE,0xE,0xF,0xF,0xE,0xE,0xD,0xB,0xE,0xE,0xF], the output should be 0xFEEDBEEFFEEDBEEFu64. - samoz
Using &Vec<T> is practically never what you want: it’s two levels of indirection where only one is necessary. You should use just a simple slice &[T] instead. If you have a vector v, &v will coerce from &Vec<T> to &[T] silently. - Chris Morgan
If values is too long, temp << 4 is going to overflow at some point. - Matthieu M.

1 Answer

Your issue is a simple one: for i in values, where values is of type &Vec<u8>, iterates over references to each value; that is, i is of type &u8. There is no | between a u64 and a &u8, and as cannot cast a reference to an integer (hence the two compiler errors you saw); you need to dereference i to get the underlying u8 and then cast that. The easiest way to do this is to write the dereference into the for loop’s pattern. The grammar of for is for PATTERN in EXPRESSION (refer to the documentation on patterns if you need more explanation); in this simple case, for &x in y { … } basically means for x in y { let x = *x; … }:

fn convert_nibbles_to_u64(values: &[u8]) -> u64 {
    let mut out = 0;
    for &i in values {
        out = out << 4 | i as u64;
    }
    out
}
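
As a minimal sketch of that equivalence, here are three ways to turn the &u8 items into u8 values. The function names with_pattern, with_deref and with_copied are illustrative only, and the use of u64::from (a lossless widening conversion) in place of as is an assumption of this sketch rather than something the code above requires:

```rust
// 1. Destructure the reference in the loop pattern.
fn with_pattern(values: &[u8]) -> u64 {
    let mut out = 0u64;
    for &i in values {
        out = out << 4 | u64::from(i);
    }
    out
}

// 2. Iterate over references and dereference explicitly in the body.
fn with_deref(values: &[u8]) -> u64 {
    let mut out = 0u64;
    for i in values {
        out = out << 4 | u64::from(*i);
    }
    out
}

// 3. Let the iterator copy each u8 out of the slice.
fn with_copied(values: &[u8]) -> u64 {
    let mut out = 0u64;
    for i in values.iter().copied() {
        out = out << 4 | u64::from(i);
    }
    out
}
```

All three accept &v for a Vec<u8> named v, because &Vec<u8> coerces to &[u8] as mentioned in the comments on the question.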

The whole loop can also be collapsed using Iterator::fold, like this:

fn convert_nibbles_to_u64(values: &[u8]) -> u64 {
    values.iter().fold(0, |x, &i| x << 4 | i as u64)
}
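
On the overflow concern raised in the comments: with more than 16 nibbles, the shift silently discards the oldest bits, and an input value above 0xF would bleed into the neighbouring nibble. Below is a hedged sketch of a checked variant, assuming you want None in both cases. The name checked_nibbles_to_u64 is hypothetical, and rejecting long inputs outright, even when their leading nibbles are zero, is a simplification:

```rust
// Hypothetical checked variant: returns None instead of a silently
// truncated or corrupted value.
fn checked_nibbles_to_u64(values: &[u8]) -> Option<u64> {
    // A u64 has 64 bits, so it can hold at most 16 nibbles.
    if values.len() > 16 {
        return None;
    }
    // try_fold stops at the first None produced by the closure.
    values.iter().try_fold(0u64, |acc, &n| {
        if n > 0xF {
            None // not a nibble: at least one of the high four bits is set
        } else {
            Some(acc << 4 | u64::from(n))
        }
    })
}
```

Whether to return an Option, a Result with a descriptive error, or simply to document the precondition depends on how the nibbles are produced; if the parser already guarantees both conditions, the plain fold is enough.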