I have read multiple articles on this topic, including the ones below, but things are still hazy to me: http://elinux.org/Tims_Notes_on_ARM_memory_allocation
Linux kernel ARM Translation table base (TTB0 and TTB1)
The ARM hardware L1 translation table has 4096 entries of 4 bytes each, and each entry translates a 1 MB region of memory. A second-level table has 256 entries of 4 bytes each, and each second-level entry translates a 4 KB page. So under this scheme any virtual address has to be divided 12-8-12.
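To make the 12-8-12 split concrete, here is a minimal sketch of how I understand the hardware indexes a virtual address. The names `hw_split` and `hw_split_va` are made up for illustration and are not taken from the kernel or the ARM manuals.

```c
#include <stdint.h>

/* Index widths of the hardware scheme described above (illustrative names). */
#define HW_L1_SHIFT   20u   /* each L1 entry maps 1 MB = 2^20 bytes */
#define HW_L2_SHIFT   12u   /* each L2 entry maps 4 KB = 2^12 bytes */
#define HW_L2_ENTRIES 256u  /* 8 index bits at the second level */

struct hw_split {
    uint32_t l1;      /* index into the 4096-entry L1 table */
    uint32_t l2;      /* index into a 256-entry L2 table */
    uint32_t offset;  /* byte offset inside the 4 KB page */
};

/* Split a 32-bit virtual address into its 12-8-12 fields. */
static inline struct hw_split hw_split_va(uint32_t va)
{
    struct hw_split s;
    s.l1     = va >> HW_L1_SHIFT;                          /* bits 31..20 */
    s.l2     = (va >> HW_L2_SHIFT) & (HW_L2_ENTRIES - 1u); /* bits 19..12 */
    s.offset = va & ((1u << HW_L2_SHIFT) - 1u);            /* bits 11..0  */
    return s;
}
```

For example, `hw_split_va(0xC0123456)` gives L1 index 0xC01, L2 index 0x23 and page offset 0x456.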
But on 32-bit ARM Linux this division is 11-9-12. Here the L1 translation table consists of 2048 entries of 8 bytes each: two 4-byte hardware entries are clubbed together, and the second-level translation tables they point to are laid out one after the other in memory, so that the second level has 512 entries instead of 256. Additionally, since Linux memory management expects various flags that are not native to ARM, 512 more entries are defined for the Linux page table (one Linux entry for each second-level HW entry).
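Here is the same kind of sketch for the 11-9-12 view, as far as I understand it. Again, `linux_split` and `linux_split_va` are hypothetical names for illustration, not kernel macros. The comments show how each Linux index would map back onto the hardware indexes if a PGD entry is simply a pair of adjacent hardware L1 entries.

```c
#include <stdint.h>

/* Index widths of the 11-9-12 layout described above (illustrative names). */
#define LX_PGD_SHIFT   21u   /* each 8-byte PGD entry covers 2 MB */
#define LX_PTE_SHIFT   12u   /* each PTE covers one 4 KB page */
#define LX_PTE_ENTRIES 512u  /* 9 index bits at the second level */

struct linux_split {
    uint32_t pgd;     /* index into the 2048-entry first level */
    uint32_t pte;     /* index into the 512-entry second level */
    uint32_t offset;  /* byte offset inside the 4 KB page */
};

/* Split a 32-bit virtual address into its 11-9-12 fields. */
static inline struct linux_split linux_split_va(uint32_t va)
{
    struct linux_split s;
    s.pgd    = va >> LX_PGD_SHIFT;                           /* bits 31..21 */
    s.pte    = (va >> LX_PTE_SHIFT) & (LX_PTE_ENTRIES - 1u); /* bits 20..12 */
    s.offset = va & ((1u << LX_PTE_SHIFT) - 1u);             /* bits 11..0  */
    return s;
}

/* Hardware L1 index implied by a Linux split: two HW entries per PGD entry,
 * selected by the top bit of the 9-bit PTE index. */
static inline uint32_t implied_hw_l1(struct linux_split s)
{
    return s.pgd * 2u + (s.pte >> 8);
}

/* Hardware L2 index implied by a Linux split: the low 8 bits of the PTE index. */
static inline uint32_t implied_hw_l2(struct linux_split s)
{
    return s.pte & 0xFFu;
}
```

For 0xC0123456 this gives PGD index 0x600 and PTE index 0x123, which map back to hardware L1 index 0xC01 and L2 index 0x23, so both views describe the same translation.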
Now the question: Linux does not enforce the PGD/PMD/PTE sizes (however, it does enforce a 4 KB page size, hence PAGE_SHIFT is set to 12). Why, then, is the 11-9-12 layout selected (i.e. 11 bits for the PGD and 9 bits for the HW PTE)? Is it just to make sure that the 512 HW + 512 Linux PTEs are aligned to a page boundary?
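The arithmetic behind my page-boundary guess can be checked with a small sketch. It assumes 4-byte entries for both the HW and the Linux tables and one Linux entry shadowing each HW entry. `pte_block_bytes` is a made-up helper. It only confirms the sizes; it does not show that this is the actual reason for the layout.

```c
#include <stdint.h>

#define HW_PTES_PER_TABLE 256u  /* entries in one hardware L2 table */
#define PTE_ENTRY_BYTES   4u    /* assumed size of one HW or Linux entry */

/* Bytes needed for the second-level tables behind one first-level entry,
 * given how many hardware L2 tables are grouped under that entry. */
static inline uint32_t pte_block_bytes(uint32_t hw_tables_per_entry)
{
    uint32_t hw_bytes    = hw_tables_per_entry * HW_PTES_PER_TABLE * PTE_ENTRY_BYTES;
    uint32_t linux_bytes = hw_bytes;  /* assumption: one Linux entry per HW entry */
    return hw_bytes + linux_bytes;
}
```

With two hardware tables per first-level entry (the 11-9-12 case) this comes to 2048 + 2048 = 4096 bytes, exactly one 4 KB page. With a single hardware table (a plain 12-8-12 layout) it would be 2048 bytes, only half a page.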
It would be great if someone could explain the logic behind this division in detail.