172
votes

In C++,

  • Why is a boolean 1 byte and not 1 bit of size?
  • Why aren't there types like 4-bit or 2-bit integers?

I miss having these when writing an emulator for a CPU.

12
In C++ you can "pack" the data by using bit-fields: struct Packed { unsigned int flag1 : 1; unsigned int flag2 : 1; };. Most compilers will allocate a full unsigned int, but they handle the bit-twiddling themselves when you read or write. They typically handle the modulo arithmetic too: an unsigned small : 4 member holds a value between 0 and 15, and when it would reach 16 it wraps around without overwriting the neighbouring bits :)
Matthieu M.
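A minimal sketch of the wrap-around behaviour the comment describes. The struct and function names here are made up for illustration. Storing an out-of-range value in a bit-field was formally implementation-defined before C++20, so treat the result as what mainstream compilers do rather than a guarantee.

```cpp
// Hypothetical struct for illustration: two adjacent 4-bit unsigned fields.
struct Nibbles {
    unsigned int low  : 4;  // holds 0..15
    unsigned int high : 4;  // holds 0..15
};

// Push `low` one step past its maximum and return the resulting struct.
inline Nibbles wrap_low() {
    Nibbles n{};
    n.low = 15;
    n.high = 5;
    n.low = n.low + 1;  // 16 does not fit in 4 bits; typical compilers store 16 % 16
    return n;
}
```

On common compilers, wrap_low() leaves low at 0 and high untouched at 5.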

12 Answers

265
votes

Because the CPU can't address anything smaller than a byte.

45
votes

From Wikipedia:

Historically, a byte was the number of bits used to encode a single character of text in a computer and it is for this reason the basic addressable element in many computer architectures.

So the byte is the basic addressable unit, and a computer architecture cannot address anything smaller. And since there (probably) are no computers with a 4-bit byte, you don't have a 4-bit bool, etc.

However, if you could design an architecture that uses 4 bits as its basic addressable unit, then you would have a 4-bit bool, on that computer only!
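One way to check what "a byte" means on a given platform is sketched below. The constant name bool_storage_bits is an assumption for illustration. Note that standard C++ requires a byte to have at least 8 bits, so the hypothetical 4-bit machine above would need its own dialect of the language.

```cpp
#include <climits>
#include <cstddef>

// CHAR_BIT is the number of bits in a byte on the target platform.
// The C++ standard requires it to be at least 8.
static_assert(CHAR_BIT >= 8, "a C++ byte has at least 8 bits");

// sizeof counts in bytes, so no complete object type can report less than 1.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(bool) >= 1, "bool occupies at least one whole byte");

// Number of bits of storage a bool occupies on this platform.
constexpr std::size_t bool_storage_bits = sizeof(bool) * CHAR_BIT;
```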

17
votes

The easiest answer is: the CPU addresses memory in bytes, not in bits, and picking out an individual bit costs extra shift and mask operations on every access.

However, it's possible to use bit-sized allocation in C++. There's the std::vector<bool> specialization for bit vectors, and structs can have bit-sized members (bit-fields).
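A small sketch of std::vector<bool> in use. The helper names every_nth and count_set are made up for this example. Implementations typically store one bit per element, although the standard does not strictly require that.

```cpp
#include <cstddef>
#include <vector>

// Build a vector of `size` flags with every `step`-th entry set to true.
inline std::vector<bool> every_nth(std::size_t size, std::size_t step) {
    std::vector<bool> bits(size, false);
    if (step == 0) return bits;  // avoid an endless loop on a zero step
    for (std::size_t i = 0; i < size; i += step) {
        bits[i] = true;  // writes a single packed bit through a proxy reference
    }
    return bits;
}

// Count the entries that are set to true.
inline std::size_t count_set(const std::vector<bool>& bits) {
    std::size_t n = 0;
    for (bool b : bits) {
        if (b) ++n;
    }
    return n;
}
```

For example, every_nth(16, 4) sets indices 0, 4, 8 and 12, so count_set reports 4.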

17
votes

Back in the old days when I had to walk to school in a raging blizzard, uphill both ways, and lunch was whatever animal we could track down in the woods behind the school and kill with our bare hands, computers had much less memory available than today. The first computer I ever used had 6K of RAM. Not 6 megabytes, not 6 gigabytes, 6 kilobytes. In that environment, it made a lot of sense to pack as many booleans into an int as you could, and so we would regularly use operations to take them out and put them in.

Today, when people will mock you for having only 1 GB of RAM, and the only place you could find a hard drive with less than 200 GB is at an antique shop, it's just not worth the trouble to pack bits.
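The kind of packing described above can be sketched like this, assuming a made-up set of CPU status flags stored in one 8-bit value. All names here are hypothetical.

```cpp
#include <cstdint>

// Hypothetical flag masks, one bit each.
constexpr std::uint8_t kCarry    = 1u << 0;
constexpr std::uint8_t kZero     = 1u << 1;
constexpr std::uint8_t kNegative = 1u << 7;

// Put a flag in: OR the mask into the packed value.
inline std::uint8_t set_flag(std::uint8_t flags, std::uint8_t mask) {
    return static_cast<std::uint8_t>(flags | mask);
}

// Take a flag out: AND with the inverted mask.
inline std::uint8_t clear_flag(std::uint8_t flags, std::uint8_t mask) {
    return static_cast<std::uint8_t>(flags & ~mask);
}

// Read a flag: nonzero after masking means the bit is set.
inline bool test_flag(std::uint8_t flags, std::uint8_t mask) {
    return (flags & mask) != 0;
}
```

Eight booleans fit in the one byte this way, at the cost of a mask operation on every read and write.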

11
votes

You could have 1-bit bools and 4 and 2-bit ints. But that would make for a weird instruction set for no performance gain because it's an unnatural way to look at the architecture. It actually makes sense to "waste" a better part of a byte rather than trying to reclaim that unused data.

The only app that bothers to pack several bools into a single byte, in my experience, is SQL Server.

11
votes

Because a byte is the smallest addressable unit in the language.

But you can make a bool take 1 bit if, for example, you have a bunch of them in a struct, like this:

struct A
{
  bool a:1, b:1, c:1, d:1, e:1;
};
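
To see the effect, one could compare the packed struct against a plain one. This is a sketch with made-up names. The exact sizes are implementation-defined, though common compilers fit the five bit-fields into a single byte.

```cpp
#include <cstddef>

// Five 1-bit bool members, as in the struct above.
struct PackedFlags { bool a : 1, b : 1, c : 1, d : 1, e : 1; };

// The same five members without bit-fields: one byte (or more) each.
struct PlainFlags { bool a, b, c, d, e; };

constexpr std::size_t packed_size = sizeof(PackedFlags);
constexpr std::size_t plain_size  = sizeof(PlainFlags);

// Bit-field members still read and write like ordinary bools.
inline bool third_flag_round_trips() {
    PackedFlags f{};  // all flags start out false
    f.c = true;
    return f.c && !f.a && !f.b && !f.d && !f.e;
}
```

One thing the packed version gives up: you cannot take the address of a bit-field member, so &f.c does not compile.
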
9
votes

You can use bit-fields to get integers smaller than a byte.

struct X
{
    int   val:4;   // 4 bit int.
};

Though bit-fields are usually used to map structures onto the exact bit patterns that hardware expects:

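// Typically 1 byte in total (on a system where 8 bits is a byte).
// unsigned char is used because int bit-fields usually make the
// struct as large as an int; the exact layout is implementation-defined.
struct SomeThing
{
    unsigned char p1:4;   // 4 bit field
    unsigned char p2:3;   // 3 bit field
    unsigned char p3:1;   // 1 bit
};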
5
votes

bool can be one byte, the smallest addressable size on the CPU, or it can be bigger. It's not unusual for bool to be the size of int for performance reasons. If you need a type with N bits for a specific purpose (say, hardware simulation), you can find a library for that (e.g. the GBL library has a BitSet<N> class). If you are concerned about the size of bool (you probably have a big container), you can pack the bits yourself, or use std::vector<bool>, which will do it for you (be careful with the latter, as it doesn't satisfy the container requirements).
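
The container caveat can be illustrated with a short sketch. The function name is made up for this example. Because a single bit has no address, std::vector<bool> hands out a proxy object instead of a bool&.

```cpp
#include <type_traits>
#include <vector>

// An ordinary vector gives out real references to its elements.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int> hands out int&");

// vector<bool> gives out a proxy class object that stands in for one bit.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> does not hand out bool&");

// Consequence: `auto` deduces the proxy type, so the "copy" still
// refers to the element inside the vector.
inline bool auto_copy_aliases_vector() {
    std::vector<bool> v(4, false);
    auto first = v[0];  // a proxy, not a bool
    first = true;       // writes through to v[0]
    return v[0];
}
```

With any other element type, `auto first = v[0]` would make an independent copy and the vector would stay unchanged. Here the function returns true.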

3
votes

Think about how you would implement this at your emulator level...

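#include <cassert>

int main()
{
    bool a[10] = {false};

    bool &rbool = a[3];   // a reference must bind to one specific bool
    bool *pbool = a + 3;  // a pointer must hold the address of one specific bool

    assert(pbool == &rbool);
    rbool = true;
    assert(*pbool);
    *pbool = false;
    assert(!rbool);
}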
3
votes

Because, in general, a CPU addresses memory with 1 byte as the basic unit, although some CPUs, such as MIPS, use a 4-byte word as their natural unit.

However, vector treats bool specially: with vector<bool>, one bit is allocated for each bool.

0
votes

The byte is the smallest addressable unit of digital data storage in a computer. The RAM in a computer has millions of bytes, and each of them has an address. If every bit had its own address, a computer could manage 8 times less RAM than it can now.
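
As a worked example of that factor of 8, assume a 32-bit address and 8-bit bytes. Both are assumptions for illustration, as are the constant names.

```cpp
#include <cstdint>

// Assumed: 32-bit addresses, so 2^32 distinct addresses are available.
constexpr std::uint64_t kAddresses = std::uint64_t{1} << 32;

// Assumed: 8 bits per byte.
constexpr std::uint64_t kBitsPerByte = 8;

// One address per byte: 2^32 bytes reachable (4 GiB).
constexpr std::uint64_t byte_addressed_capacity = kAddresses;

// One address per bit: 2^32 bits reachable, which is 2^29 bytes (512 MiB).
constexpr std::uint64_t bit_addressed_capacity = kAddresses / kBitsPerByte;
```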

More info: Wikipedia

0
votes

Even though the minimum possible size is 1 byte, you can store 8 bits of boolean information in 1 byte:

http://en.wikipedia.org/wiki/Bit_array

The Julia language has BitArray, for example, and I have read about C++ implementations.
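
A minimal sketch of what such a C++ bit array could look like. This BitArray class is hypothetical, written for illustration, and not taken from any particular library. It does no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal bit array: stores 8 boolean values per byte.
class BitArray {
public:
    explicit BitArray(std::size_t size)
        : size_(size), bytes_((size + 7) / 8, 0) {}  // round up to whole bytes

    std::size_t size() const { return size_; }
    std::size_t byte_count() const { return bytes_.size(); }

    // Read bit i (the caller must keep i < size()).
    bool get(std::size_t i) const {
        return ((bytes_[i / 8] >> (i % 8)) & 1u) != 0;
    }

    // Write bit i (the caller must keep i < size()).
    void set(std::size_t i, bool value) {
        const std::uint8_t mask = static_cast<std::uint8_t>(1u << (i % 8));
        if (value) {
            bytes_[i / 8] |= mask;
        } else {
            bytes_[i / 8] &= static_cast<std::uint8_t>(~mask);
        }
    }

private:
    std::size_t size_;
    std::vector<std::uint8_t> bytes_;
};
```

A BitArray of 10 flags occupies 2 bytes of storage here, instead of the 10 bytes a bool array would typically take.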