3
votes

Four consecutive bytes in a byte string together specify some value. However, only 7 bits in each byte are used; the most significant bit is always zero and therefore its ignored (that makes 28 bits altogether). So...

b"\x00\x00\x02\x01"

would be 000 0000 000 0000 000 0010 000 0001.

Or, for the sake of legibility, 10 000 0001. That's the value the four bytes represent. But I want a decimal, so I do this:

>>> 0b100000001
257

I can work all that out myself, but how would I incorporate it into a program?

2

2 Answers

7
votes

Use bitshifting and addition:

bytes = b"\x00\x00\x02\x01"
i = 0
for b in bytes:
    i <<= 7
    i += b     # Or use (b & 0x7f) if the last bit might not be zero.
print(i)

Result:

257
1
votes

Using the bitarray module, you can do it a lot quicker for big numbers:

Benchmarks (factor 2.4x speedup!):

janus@Zeus /tmp % python3 -m timeit -s "import tst" "tst.tst(10000)" 
10 loops, best of 3: 251 msec per loop
janus@Zeus /tmp % python3 -m timeit -s "import tst" "tst.tst(100)"  
1000 loops, best of 3: 700 usec per loop
janus@Zeus /tmp % python3 -m timeit -s "import sevenbittoint, os" "sevenbittoint.sevenbittoint(os.urandom(10000))"
10 loops, best of 3: 73.7 msec per loop
janus@Zeus /tmp % python3 -m timeit -s "import quick, os" "quick.quick(os.urandom(10000))"                        
10 loops, best of 3: 179 msec per loop

quick.py (from Mark Byers):

def quick(bites):
  i = 0
  for b in bites:
    i <<= 7
    i += (b & 0x7f)
    #i += b
  return i

sevenbittoint.py:

import bitarray
import functools

def inttobitarray(x):
  a = bitarray.bitarray()
  a.frombytes(x.to_bytes(1,'big'))
  return a

def concatter(accumulator,thisitem):
  thisitem.pop(0)
  for i in thisitem.tolist():
    accumulator.append(i)
  return accumulator

def sevenbittoint(bajts):
  concatted = functools.reduce(concatter, map(inttobitarray, bajts), bitarray.bitarray())
  missingbits = 8 - len(concatted) % 8
  for i in range(missingbits): concatted.insert(0,0) # zeropad
  return int.from_bytes(concatted.tobytes(), byteorder='big')

def tst():
  num = 32768
  print(bin(num))
  print(sevenbittoint(num.to_bytes(2,'big')))

if __name__ == "__main__":
  tst()

tst.py:

import os
import quick
import sevenbittoint

def tst(sz):
    bajts = os.urandom(sz)
  #for i in range(pow(2,16)):
  #  if i % pow(2,12) == 0: print(i)
  #  bajts = i.to_bytes(2, 'big')
    a = quick.quick(bajts)
    b = sevenbittoint.sevenbittoint(bajts)
    if a != b: raise Exception((i, bin(int.from_bytes(bajts,'big')), a, b))