5
votes

I want to convert an integer (int or long) a big-endian byte string. The byte string has to be of variable length, so that only the minimum number of bytes are used (the total length length of the preceding data is known, so the variable length can be inferred).

My current solution is

import bitstring

bitstring.BitString(hex=hex(456)).tobytes()

Which obviously depends on the endianness of the machine and gives false results, because 0 bits are append and no prepended.

Does any one know a way to do this without making any assumption about the length or endianess of an int?

4
Does this only need to work for an int, or does it need to work for a long as well? - jchl
For long as well, I forgot about this. I will edit the question. - user141335
This can be done simply in any version of Python without external dependencies -- in any case, you want a BYTEstring, not a BITstring. - John Machin

4 Answers

6
votes

Something like this. Untested (until next edit). For Python 2.x. Assumes n > 0.

tmp = []
while n:
    n, d = divmod(n, 256)
    tmp.append(chr(d))
result = ''.join(tmp[::-1])

Edit: tested.

If you don't read manuals but like bitbashing, instead of the divmod caper, try this:

d = n & 0xFF; n >>= 8

Edit 2: If your numbers are relatively small, the following may be faster:

result = ''
while n:
    result = chr(n & 0xFF) + result
    n >>= 8

Edit 3: The second method doesn't assume that the int is already bigendian. Here's what happens in a notoriously littleendian environment:

Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> n = 65539
>>> result = ''
>>> while n:
...     result = chr(n & 0xFF) + result
...     n >>= 8
...
>>> result
'\x01\x00\x03'
>>> import sys; sys.byteorder
'little'
>>>
1
votes

A solution using struct and itertools:

>>> import itertools, struct
>>> "".join(itertools.dropwhile(lambda c: not(ord(c)), struct.pack(">i", 456))) or chr(0)
'\x01\xc8'

We can drop itertools by using a simple string strip:

>>> struct.pack(">i", 456).lstrip(chr(0)) or chr(0)
'\x01\xc8'

Or even drop struct using a recursive function:

def to_bytes(n): 
    return ([chr(n & 255)] + to_bytes(n >> 8) if n > 0 else [])

"".join(reversed(to_bytes(456))) or chr(0)
0
votes

If you're using Python 2.7 or later then you can use the bit_length method to round the length up to the next byte:

>>> i = 456
>>> bitstring.BitString(uint=i, length=(i.bit_length()+7)/8*8).bytes
'\x01\xc8'

otherwise you can just test for whole-byteness and pad with a zero nibble at the start if needed:

>>> s = bitstring.BitString(hex=hex(i))
>>> ('0x0' + s if s.len%8 else s).bytes
'\x01\xc8'
0
votes

I reformulated John Machins second answer in one line for use on my server:

def bytestring(n):
    return ''.join([chr((n>>(i*8))&0xFF) for i in range(n.bit_length()/8,-1,-1)])

I have found that the second method, using bit-shifting, was faster for both large and small numbers, and not just small numbers.