2
votes

To train a classifier, the training data must be supplied as a set of float arrays. Unfortunately, the training data available to me are byte arrays (actually they are Ipp8u arrays, which can be converted to unsigned char arrays).

Essentially, given an unsigned char array, I need to convert it to a float array: in other words, given an unsigned char array, I should read it as a float array. Is this operation always allowed? Does the float data type allow all possible configurations of bits? If yes, how do I implement this conversion?

4
I would make use of the stringstream library - Syntactic Fructose
Please clarify: is the data binary (and in correct byte order, etc), or human-readable strings? - Chris
@enzom83: "binary" means nothing - what kind of binary format is it? Is it the binary representation of an array of your platform's floats? Or something else? - Matteo Italia
Still not sure: If each unsigned char represents one value, the answer by @Dims works. If each 4 bytes are a binary float representation, things get a bit complicated. - Chris
@enzom83 If you don't have to convert each element, what do you have to do? What are you not telling us? - Mr Lister

4 Answers

6
votes

There are many ways to approximate real numbers. There are floating point representations, where some bits represent an exponent and other bits represent a coefficient; fixed point representations, where some bits represent the whole number part and some bits represent the fractional part; arbitrary finite precision representations, where some number of 'digits' in some base are stored; and so on. For every general class of representation there is an infinite variety of details that would matter when converting that representation into floats.

Your question does not specify what representation the byte array contains. Specifying that the array is Ipp8u does not come close to providing the necessary information.


What you probably mean is that the byte array contains a byte representation of the machine's native representation of floats (which is probably IEEE-754), differing at most in endianness.

You can simply do a memcpy of data from the char array into an array of floats:

char c[10 * sizeof(float)] = {...};
float f[10];
std::memcpy(f, c, 10 * sizeof(float)); // or you can search for an implementation of bit_cast

One thing not to do is to simply cast the char array: float *f = reinterpret_cast<float*>(c); Reading through the resulting pointer has undefined behavior: it violates the strict aliasing rule, and float probably has stricter alignment requirements than char.
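To make the memcpy approach concrete, here is a minimal sketch wrapped in a helper. The name bytes_to_floats is hypothetical, and the sketch assumes the bytes already hold floats in this machine's own representation and byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies a byte buffer into a vector of floats.
// Assumes the bytes hold floats in the host's native format and byte order.
std::vector<float> bytes_to_floats(const unsigned char* bytes, std::size_t byte_count) {
    // The buffer must contain a whole number of floats.
    assert(byte_count % sizeof(float) == 0);

    std::vector<float> out(byte_count / sizeof(float));
    if (!out.empty()) {
        // memcpy copies the object representation, so there is no
        // alignment or aliasing problem as there is with a pointer cast.
        std::memcpy(out.data(), bytes, byte_count);
    }
    return out;
}
```

Because the result is a freshly allocated vector, the source buffer can have any alignment.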

If the endianness differs, first go through the byte array and reorder the bytes, something like this:

// assuming sizeof(float) == sizeof(uint32_t); ntohl is declared in <arpa/inet.h> on POSIX
for (std::size_t i = 0; i < sizeof c; i += sizeof(float)) {
    uint32_t v;
    std::memcpy(&v, c + i, sizeof(uint32_t));
    v = ntohl(v); // swaps bytes from Network TO Host order.
    std::memcpy(c + i, &v, sizeof(uint32_t));
}
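If a networking header is not available, the same reordering can be sketched with plain shifts. The helper name big_endian_bytes_to_floats is hypothetical. The sketch assumes the source bytes are big-endian IEEE-754 single-precision values, that float is 32 bits wide, and that the host stores floats in the same byte order as its integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: decodes big-endian IEEE-754 binary32 values.
std::vector<float> big_endian_bytes_to_floats(const unsigned char* bytes, std::size_t byte_count) {
    // The buffer must contain a whole number of 4-byte values.
    assert(byte_count % 4 == 0);

    std::vector<float> out(byte_count / 4);
    for (std::size_t k = 0; k < out.size(); ++k) {
        const unsigned char* p = bytes + 4 * k;
        // Build the integer with shifts so the result does not depend
        // on the byte order of the host.
        std::uint32_t bits = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                             (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
        // Copy the bit pattern into the float without a pointer cast.
        std::memcpy(&out[k], &bits, sizeof bits);
    }
    return out;
}
```

Unlike the in-place loop, this version leaves the source buffer untouched and writes straight into the float array.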
3
votes

Given the documentation of the Intel Integrated Performance Primitives, the function:

IppStatus ippsConvert_8u32f(const Ipp8u* pSrc, Ipp32f* pDst, int len);

would seem a most-handy function for doing exactly what you're looking for.

2
votes

You should do each operation explicitly rather than relying on implicit conversion. First read the array in char form:

unsigned char charArray[100];
// reading 

then convert the elements one by one:

float floatArray[100];
for (int i = 0; i < 100; ++i) {
   floatArray[i] = (float) charArray[i];
}
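For comparison, the same element-by-element conversion can be sketched with std::vector, which sizes the output for you. The helper name values_to_floats is hypothetical, and the sketch assumes each byte is an independent numeric value (0 to 255), not a piece of a packed float.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts each unsigned char to the float with the same value.
std::vector<float> values_to_floats(const unsigned char* values, std::size_t count) {
    // The iterator-range constructor converts every element by value,
    // so the byte 255 becomes the float 255.0f.
    std::vector<float> out(values, values + count);
    assert(out.size() == count);
    return out;
}
```

This is a value conversion, so the output occupies four times as many bytes as the input on a platform with a 4-byte float.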
-1
votes

You could just run an iterative conversion utilising atof in a for loop. Then you can just use the float array. I'm sure there is a one line solution though.
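A text-parsing loop of this kind only applies if the bytes are human-readable numbers. Here is a minimal sketch, assuming a NUL-terminated string of whitespace-separated decimal values. The helper name parse_floats is hypothetical, and it uses std::strtof in place of atof because strtof reports where parsing stopped.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper: parses whitespace-separated numbers from a C string.
std::vector<float> parse_floats(const char* text) {
    assert(text != nullptr);

    std::vector<float> out;
    const char* cursor = text;
    for (;;) {
        char* end = nullptr;
        // strtof skips leading whitespace and sets end past the parsed number.
        float value = std::strtof(cursor, &end);
        if (end == cursor) {
            break; // nothing more could be parsed
        }
        out.push_back(value);
        cursor = end;
    }
    return out;
}
```

Parsing stops at the first token that is not a number, so malformed input yields a shorter result without raising an error.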