Memory Address Alignment

Question

I'm a bit confused about the concept of memory alignment. Here's my doubt: what the text says is that if you want to read 4 bytes of data starting from an address that is not divisible by 4, you have an unaligned memory access. For example, if I want to read 10 bytes starting at address 05, this will be termed an unaligned access (http://www.mjmwired.net/kernel/Documentation/unaligned-memory-access.txt).

Is this case specific to a 4-byte word-addressable architecture, or does it also hold for a byte-addressable architecture? If the above case is unaligned on a byte-addressable architecture as well, why is that so?

Thanks!

This is one of the problems that operating systems were put in place to solve... — ControlAltDel
There's no processor I know of that has a 10 byte data type. Let's keep it simple with a plain processor that has a 32-bit data bus and is trying to read a 32-bit value from address 1. That requires reading the value at address 0 first, moving byte 1 to offset 0, moving byte 2 to offset 1, moving byte 3 to offset 2. Then reading the value at address 4 and moving byte 0 to offset 3. Sounds involved, doesn't it :) — Hans Passant

DigitalRoss · Accepted Answer · 2012-04-05T01:31:39

As a general rule, bit 0 in memory is gated onto a bus and bit 0 of that bus is connected to bit 0 of every register. It goes on like this until bit 31. (When you get to bit "32", it's actually bit 0 of the 4-byte word at address 4.) There may be special hardware that directs each of the other bytes (bits 15:8, 23:16, and 31:24) onto the low-order byte, bits 7:0.

However, in the nominal case there is not any special hardware that moves bytes to any position other than the one they are nominally connected to in the natural order and, maybe, byte lane 0.

Imagine a simple memory chip with 32 data pins and a simple CPU with 32 data pins. A given data pin on each chip is wired to the corresponding one on the other, and only to that one. There simply is no way for a simple CPU to do the misaligned read at all.

So, consider a read from 0. The next 4 bytes all fall into a register as wired, and this also happens for a read from address 4. But what if you read (32 bits) from address 1? Or 2? Or 3? Although the read cannot be done directly in hardware, a fancy controller can cause a whole lot of things to happen:

1. The CPU can do TWO reads just to get all the bits. It can't do them at the same time, since it only has 32 pins. One read is from address 0 and one is from address 4.
2. The CPU must then do various shift, mask, and inclusive-OR operations in order to construct a single word out of the two components.

All of these things take extra time.
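To make the two-reads-plus-shifting idea concrete, here is a minimal sketch in C of how such a read could be emulated in software. It is only an illustration under stated assumptions: the function names `aligned_read32` and `unaligned_read32` are made up for this example, memory is modelled as an array of 32-bit words, and byte order is assumed to be little-endian (the byte at the lowest address sits in bits 7:0). Real hardware or a real kernel trap handler will differ in the details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the "simple" memory: it can only deliver a whole
 * 32-bit word, and only from a byte address that is a multiple of 4. */
static uint32_t aligned_read32(const uint32_t *mem, size_t byte_addr)
{
    assert(byte_addr % 4 == 0);      /* nothing else fits the wiring */
    return mem[byte_addr / 4];
}

/* Emulate a 32-bit read from ANY byte address (little-endian assumed). */
static uint32_t unaligned_read32(const uint32_t *mem, size_t byte_addr)
{
    size_t   base  = byte_addr & ~(size_t)3;         /* round down to a word */
    unsigned shift = (unsigned)(byte_addr & 3) * 8;  /* misalignment in bits */

    uint32_t lo = aligned_read32(mem, base);         /* first read */
    if (shift == 0)
        return lo;                                   /* aligned: one read is enough */

    uint32_t hi = aligned_read32(mem, base + 4);     /* second read */

    /* The shifts discard the unwanted bytes (this is the "mask" step),
     * and the OR glues the two remaining pieces into one word. */
    return (lo >> shift) | (hi << (32 - shift));
}
```

For example, if the word at address 0 is 0x33221100 and the word at address 4 is 0x77665544, a read from address 1 under these assumptions yields 0x44332211: three bytes come from the first word and one from the second. The aligned case needs a single read and no shifting, which is where the time difference comes from.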

Note: In reality the data bus is typically a multiple of 32 bits wide, and so is the memory. Special hardware may exist for realigning objects. But even then, because it's an abnormal case, it may not get the pipeline optimizations that properly aligned reads get, and even with special hardware there is probably a time penalty for running operands through it.
