Why are the 2 LSBs of a 32-bit ARM instruction address not used?

Question

I am studying the ARM instruction architecture, and I have read that instructions are stored word-aligned, so the least significant two bits of instruction addresses are always zero in ARM state.

Thumb and Thumb-2 instructions are either 16 or 32 bits long. Instructions are stored half-word aligned, so the least significant bit of instruction addresses is always zero in Thumb state.

In some of my work with other microcontrollers such as AVR, when accessing program memory I used the least significant bit to select whether the higher or the lower byte was accessed. But that was for data memory access.

In ARM, instructions are 32 bits wide anyway, so all four bytes should be fetched at once.

Why, then, would the last two bits be used to fetch a particular byte of the instruction (one bit in Thumb mode), and why use banks?

PS: If I were to fetch an individual byte of a 4-byte instruction, it would take 4 cycles, which is very inefficient. So what is the purpose of having byte addressability? Is it because of the new Thumb instructions, which are 16 bits wide but still occupy 32 bits of space?

It's not really clear what you're looking for clarification on; your first two paragraphs are the answer to your question! — Oliver Charlesworth
Why would they have banks in either mode, ARM or Thumb, to access an individual byte of a 32/16-bit instruction? An instruction should be fetched as a whole word, right? — Haswell
If I were to fetch an individual byte of a 4-byte instruction, it would take 4 cycles, which is very inefficient. So what is the purpose of having byte addressability? Is it because of the new Thumb instructions, which are 16 bits wide but still occupy 32 bits of space? — Haswell
Thumb instructions do not occupy 32 bits of space - that is the whole point of using them. In general terms, code space is byte addressable rather than half-word addressable for consistency with data space - some embedded or cache-equipped ARM designs may be built with a semi-Harvard architecture as an efficiency, but do not require special instructions for data access to code memory. Note that unaligned word access is usually prohibited for both. — Chris Stratton
The purpose of having byte addressability is that there are accesses to memory for things other than fetching instructions, and byte-level addressability is often useful in those scenarios. Note that there are architectures that are word addressable with word sizes other than 8-bit bytes, but they aren't as common as 8-bit addressable machines (for one thing, POSIX requires 8-bit addressability, if I recall correctly). Also, there are some ARM architectures (such as the Cortex-M3) that have limited bit-level addressability. — Michael Burr

RootPhoenix RootPhoenix · Accepted Answer · 2014-04-27T08:47:54

I think you are again mixing up instruction access with data access. As far as data access is concerned, we may use the last two bits to select any byte of a 4-byte word.

But the concept of not using the last two bits has nothing to do with accessing an individual byte of a 32-bit instruction. As you said, accessing one byte at a time for instruction fetch is highly inefficient, and it is not permitted either. To enforce this rule (instructions are never fetched from addresses that are not word-aligned), the last two bits are not considered. The following diagram explains this:

The addresses are 32 bit:

|--0x00000007--|--0x00000006--|--0x00000005--|--0x00000004--|

|--0x00000003--|--0x00000002--|--0x00000001--|--0x00000000--|

Focus on the last nibble of each address:

| 3-0011; 2-0010; 1-0001; 0-0000; |

| 7-0111; 6-0110; 5-0101; 4-0100; |

Now focus on the two least significant bits. Our aim is not to allow an instruction to start at locations 1, 2, 3, 5, 6, or 7. If you check the two LSBs of those addresses, they are 01, 10, or 11, so none of these values can be allowed; only "00" is allowed. Since the two LSBs are always 00 whenever the address is a multiple of 4, they carry no information and can simply be ignored.
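The same check can be written as a bit mask. The following is only a minimal sketch of the idea, not how any particular ARM core implements it; the names `is_valid_fetch_address` and `word_index` are made up for illustration, and the Thumb case simply applies the halfword-alignment rule from the question.

```python
# Illustrative sketch only: the function and constant names are hypothetical.
ARM_ALIGN_MASK = 0b11   # ARM state: word-aligned, two LSBs must be 00
THUMB_ALIGN_MASK = 0b1  # Thumb state: halfword-aligned, LSB must be 0


def is_valid_fetch_address(addr: int, thumb: bool = False) -> bool:
    """Return True if addr is a legal instruction start in the given state."""
    mask = THUMB_ALIGN_MASK if thumb else ARM_ALIGN_MASK
    return (addr & mask) == 0


def word_index(addr: int) -> int:
    """Drop the two LSBs: for aligned addresses they carry no information."""
    return addr >> 2


# Check the eight byte addresses from the diagram above.
valid_arm = [a for a in range(8) if is_valid_fetch_address(a)]
valid_thumb = [a for a in range(8) if is_valid_fetch_address(a, thumb=True)]

print(valid_arm)    # [0, 4]
print(valid_thumb)  # [0, 2, 4, 6]
```

In ARM state only addresses 0 and 4 pass, matching the "00" rule above. Shifting the address right by two gives the word number, which is all an instruction fetch needs.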

Hope this helps you visualize it better.
