0
votes

I am trying to work out a solution to a driver problem in which I have to extract a number of bits from an arbitrary bit stream starting from an arbitrary position. I have searched for the post around here and I have found that most of the Bit operations involved are 64 Bit machines or no special storage layout conditions.

Well, the Bit stream is stored as a Byte array in a 32 bit Micro. But the stream is little endian, for example a stream of bytes below

LSBit (Bit 0) --> 0100 1001 0100 0100 <-- MSBit (Bit 15)

is stored in memory as

Byte 0 1001 0010 Bit Layout, Bit 7-> 1001 0010 -> Bit 0

Byte 1 0010 0010 Bit Layout, Bit 15-> 0010 0010 -> Bit 8

here the Bit layout in a Byte is Big Endian, but the bytes are Little Endian.

If I have to extract Bit 4 to 11 from the Little endian stream to get 0010 1001 then I need to extract

from Byte 0

1001, Higher nibble of Byte 0

from Byte 1

0010, Lower Nibble of Byte 1

Shift the bits left from Byte 1 to get 0010 and OR with 1001 from Byte 0

The problem is that the stream can go to a length of 64 bits and the number of bits (upto 64) is arbitrary, as well as the starting bit is arbitrary. But for storage internally I can of course store in the appropriate data type which can fit the bits of course.

And to add to this, I have to pack them in the same way, from a valid data into this stream of Little Big Endian. :'(

I have to worry about runtime as well and the fact that long is 4 bytes. So to store a value of 48 bits I need to maintain an array of 6 bytes arranged in host format. without the compiler long long support.

EDIT: 64-bit long long support is there; I just checked the compiler manual.

Any suggestions please, I am stuck for three weeks now.

2
Welcome to Stack Overflow. Please read the About page soon. Do I understand you correctly that you have an incoming byte stream in little-endian order, and this byte stream can be up to 8 bytes long. From that stream, you have to pick up an arbitrary set of bits and store them into a 64-bit integer (since up to 64 bits can be requested)?Jonathan Leffler
This looks like a homework...joce
@Joce Care to elaborate?...this
I think little and big endian refer to byte order in word. Byte array does not have byte order, and bytes don't really have bit order as such, they are smallest accessible memory unit. If you get bits in reverse order in a byte, google for reversing bits. If you need to tweak byte order after that in a byte array, reorder the bytes. Separate operations.hyde
Thank you for the greeting. I am reading the About page now. Yes the incoming stream is Little endian, and is stored in memory just the way it arrives. This ends up in the bytes being Little Endian, But the Bit Layout in a Byte is Big Endian :\. The stream can be upto 64 bits, thats right, but the starting bit can be arbitrary and the number of bits as well. And also long is 4 bytes on the 32 bit micro since compiler max primitive type is 4 bytes.WinterS

2 Answers

2
votes

Read your stream byte-per-byte, building a 64Bit number in host-byte-order (As that is the maximum bitstream length).
Then extract using standard bit-fiddling.

This two-step recipe has the benefit of being host-endian-agnostic.

Getting a window of n bits at index i:

uint64_t extract(uint64_t in, int n, int i) {
    return in<<(64-i-n)>>(64-n);
}
1
votes

My offered solution is:

pickbits.h

#ifndef PICKBITS_H_INCLUDED
#define PICKBITS_H_INCLUDED

#include <stddef.h>
#include <stdint.h>

extern uint64_t pick_bits(unsigned char *bytes, size_t nbytes, int lo, int hi);

#endif /* PICKBITS_H_INCLUDED */

The headers are necessary for size_t from <stddef.h> and uint64_t from <inttypes.h>. It is important that headers should be self-contained and idempotent. Including those two headers is necessary to make pickbits.h self-contained; the header guards make it idempotent, though they could be removed and it would still be OK since there are no type definitions in the code (and standard headers are already constrained to be idempotent by the C standard).

pickbits.c

#include "pickbits.h"
#include <assert.h>

uint64_t pick_bits(unsigned char *bytes, size_t nbytes, int lo, int hi)
{
  assert(bytes != 0 && nbytes > 0 && nbytes <= 8);
  assert(lo >= 0 && lo < 64);
  assert(hi >= 0 && hi < 64 && hi >= lo);
  uint64_t result = 0;
  for (int i = nbytes - 1; i >= 0; i--)
    result = (result << 8) | bytes[i];
  result >>= lo;
  result &= (UINT64_C(1) << (hi - lo + 1)) - 1;
  return result;
}

Note that including "pickbits.h" first checks that the header is self-contained. The UINT64_C macro ensures that the constant 1 is treated as a uint64_t value.

picktest.c

#include "pickbits.h"
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
  unsigned char d1[8] = "\xA5\xB4\xC3\xD2\xE1\xF0\x96\x87";
  /* Selecting nybbles */
  for (int u = 0; u < 64; u += 4)
  {
    uint64_t v = pick_bits(d1, sizeof(d1), u, u+3);
    printf("Picking bits %2d..%2d gives 0x%" PRIX64 "\n", u, u+3, v);
  }
  /* Select across nybble boundaries - T.B.D */
  return 0;
}

The test needs to be extended to cross nybble boundaries, or stay wholly within a nybble.

Sample run

Picking bits  0.. 3 gives 0x5
Picking bits  4.. 7 gives 0xA
Picking bits  8..11 gives 0x4
Picking bits 12..15 gives 0xB
Picking bits 16..19 gives 0x3
Picking bits 20..23 gives 0xC
Picking bits 24..27 gives 0x2
Picking bits 28..31 gives 0xD
Picking bits 32..35 gives 0x1
Picking bits 36..39 gives 0xE
Picking bits 40..43 gives 0x0
Picking bits 44..47 gives 0xF
Picking bits 48..51 gives 0x6
Picking bits 52..55 gives 0x9
Picking bits 56..59 gives 0x7
Picking bits 60..63 gives 0x8

Why not just one source file?

Answer: see Why is GCC 4.8.2 complaining about addition under strict overflow?

Basically, the code as presented above compiles cleanly under stringent warnings:

$ gcc -g -O3 -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
>     -Wold-style-definition -Wold-style-declaration -Werror -c \
>     pickbits.c picktest.c
$

If integrated into a single file, a 'feature' of GCC 4.8.2 kicks in and generates a warning (which -Werror converts into an error), even though the situation it warns about cannot possibly occur (the overflow couldn't occur if the types were int8_t, let alone int).