1
votes

Floating point values (IEEE 32 and 64-bit) are encoded using fixed-length big-endian encoding (7 bits used to avoid use of reserved bytes like 0xFF):

That paragraphs comes from the Smile Format spec (a JSON-like binary format).

What could that mean? Is there some standard way to encode IEEE floating point (single and double precision) so that the encoded bytes are in the 0-127 range?

More in general: I believe that, in the standard binary representation, there is no reserved or prohibited byte value, a IEEE floating point number can include any of the 256 possible bytes. Granted that, is there any standard binary encoding (or trick) so that some bytes value/s will never appear (as, say, in UTF8 encoding of strings one have some prohibited bytes values, as 0xFF)?

(I guess that would imply either losing some precision, or using more bytes.)

1
I would expect "using more bytes" rather than "using more bits". If you only use 7 bits per byte you need more bytes to fit in 64 bits than if you use 8 bits per byte. - Patricia Shanahan
@PatriciaShanahan : Well, I guess that "using 7 bits" actually means restricting your bytes to the 0-127 range (keeping the MSB at zero). But, yes, in the end it would imply using more bytes. - leonbloy

1 Answers

2
votes

I don't know the detail of such a format, but it looks as a kind of serialization of a data structure. Of course, since the final result is a byte-stream, you should be able to recognize a value from some other metadata. Probably they use the 7th bit a special-bit, then any misinterpreting byte-value should be avoided. That's the reason to "spread" an IEEE fp number along (five, for a single) bytes where only seven bits are actually used for the value.

I should read the format specs, so I tried to "extrapolate" what they're going to do. However, this kind of codify is relatively often in the low-level (e.g. embedded) programming.