3
votes

I came up with this question because I don't understand how address translation is carried out in kernel area.

From what I understand, to translate any address above 0xC0000000, we just need to subtract PAGE_OFFSET from it (except during kernel initialization, where a page table covering an 8MB range is needed). But this does not make sense when the CPU executes an instruction that requires an address at, say, 0xF0000020 while the system has only 256MB of RAM.

For this reason, I think the kernel does have a page table that allows the MMU to translate virtual addresses above 0xC0000000 to physical ones. So in which situations can we simply subtract PAGE_OFFSET, and in which situations do we need the kernel page table?

I may have misunderstood something from the start, so please correct me.


EDIT

<< Understanding the Linux Virtual Memory Manager >> says that kernel page tables do exist. Now I am even more confused...

3.6 Kernel Page Tables

When the system first starts, paging is not enabled because page tables do not magically initialize themselves. Each architecture implements this differently so only the x86 case will be discussed. The page table initialization is divided into two phases. The bootstrap phase sets up page tables for just 8MiB so that the paging unit can be enabled. The second phase initializes the rest of the page tables. We discuss both of these phases in the following sections.

3.6.1 Bootstrapping

...

3.6.2 Finalizing

The function responsible for finalizing the page tables is called paging_init(). The call graph for this function on the x86 can be seen on Figure 3.4.

The function first calls pagetable_init() to initialize the page tables necessary to reference all physical memory in ZONE_DMA and ZONE_NORMAL. Remember that high memory in ZONE_HIGHMEM cannot be directly referenced and that mappings are set up for it temporarily. For each pgd_t used by the kernel, the boot memory allocator (see Chapter 5) is called to allocate a page for the PGD, and the PSE bit will be set if available to use 4MiB TLB entries instead of 4KiB. If the PSE bit is not supported, a page for PTEs will be allocated for each pmd_t. If the CPU supports the PGE flag, it also will be set so that the page table entry will be global and visible to all processes.

Next, pagetable_init() calls fixrange_init() to set up the fixed address space mappings at the end of the virtual address space starting at FIXADDR_START. These mappings are used for purposes such as the local Advanced Programmable Interrupt Controller (APIC) and the atomic kmappings between FIX_KMAP_BEGIN and FIX_KMAP_END required by kmap_atomic(). Finally, the function calls fixrange_init() to initialize the page table entries required for normal high memory mappings with kmap().

After pagetable_init() returns, the page tables for kernel space are now fully initialized, so the static PGD (swapper_pg_dir) is loaded into the CR3 register so that the static table is now being used by the paging unit.

The next task of the paging_init() is responsible for calling kmap_init() to initialize each of the PTEs with the PAGE_KERNEL protection flags. The final task is to call zone_sizes_init(), which initializes all the zone structures used.

3 Answers

0
votes

There is a linear mapping for roughly the first 900MB of physical memory, if it exists (896MB on 32-bit x86 with the default 3G/1G split), so that the virtual address for that memory is equal to the physical address plus PAGE_OFFSET. Of course, this doesn't prevent the same physical memory from also being mapped elsewhere, for example in a process's address space, should the kernel want to use that memory for other purposes.
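To make the arithmetic concrete, here is a minimal sketch of that linear mapping in Python. It assumes 32-bit x86 with PAGE_OFFSET = 0xC0000000 and an 896MB cap on the direct mapping; the function names are made up for illustration (they only mimic what the kernel's __pa()/__va() helpers compute) and the RAM size is passed in explicitly.

```python
PAGE_OFFSET = 0xC0000000   # assumed: default 3G/1G split on 32-bit x86
MIB = 1 << 20
LOWMEM_MAX = 896 * MIB     # assumed upper bound of the linear mapping


def lowmem_size(ram_bytes):
    """Bytes of physical RAM covered by the linear (direct) mapping."""
    return min(ram_bytes, LOWMEM_MAX)


def phys_to_virt(phys, ram_bytes):
    """Hypothetical helper: kernel virtual address of a lowmem physical address."""
    if not 0 <= phys < lowmem_size(ram_bytes):
        raise ValueError("physical address is not in the linear mapping")
    return phys + PAGE_OFFSET


def virt_to_phys(virt, ram_bytes):
    """Hypothetical helper: the subtraction is only valid inside the linear mapping."""
    phys = virt - PAGE_OFFSET
    if not 0 <= phys < lowmem_size(ram_bytes):
        raise ValueError("virtual address is not in the linear mapping")
    return phys
```

With 256MB of RAM, virt_to_phys(0xF0000020, 256 * MIB) raises, because 0xF0000020 - PAGE_OFFSET lies beyond the end of RAM: the subtraction is only meaningful for addresses that the linear mapping actually covers.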

0
votes

Although a linear mapping may seem special to a human, it (usually) isn't special in terms of MMU configuration.

So as you said, to translate virtual addresses from 3G to 3G+900MB, we can simply subtract PAGE_OFFSET from them. Does that mean the kernel does not need any page tables?

It still needs those tables to describe that specific (linear) mapping to the MMU, although there are some special cases such as the MIPS R3000.

But this does not make sense when the CPU executes an instruction that requires an address at, say, 0xF0000020

I'd ask whether it makes sense to access that address in the first place. What I mean is that on a system with 256MB of RAM you simply won't encounter such a request through the linear mapping (at least assuming the code is not buggy).

The point that confuses you is, IMO: who is responsible for doing address translations? The answer is (again, usually) the MMU, implemented in hardware. The page tables are therefore the way to tell the MMU how it should do such translations. Doing the translation itself is not the kernel's responsibility; the kernel just needs to configure the MMU.
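As a rough illustration of that division of labour, the sketch below models a classic two-level x86 (non-PAE) lookup in Python: 10 bits of page-directory index, 10 bits of page-table index, 12 bits of offset. The dictionaries stand in for the in-memory tables, and map_page, mmu_translate and PageFault are names invented for this example, not kernel or hardware APIs.

```python
PAGE_SHIFT = 12                 # 4KiB pages
PGDIR_SHIFT = 22                # each page-directory entry covers 4MiB
PAGE_SIZE = 1 << PAGE_SHIFT
ENTRIES = 1024                  # entries per directory / per table


class PageFault(Exception):
    """Raised when no translation exists for a virtual address."""


def map_page(pgd, virt, phys):
    """The kernel's job: record one 4KiB mapping in the tables."""
    table = pgd.setdefault(virt >> PGDIR_SHIFT, {})
    table[(virt >> PAGE_SHIFT) & (ENTRIES - 1)] = phys & ~(PAGE_SIZE - 1)


def mmu_translate(pgd, virt):
    """The MMU's job: walk the tables for every access, kernel or user."""
    table = pgd.get(virt >> PGDIR_SHIFT)
    if table is None:
        raise PageFault(hex(virt))
    frame = table.get((virt >> PAGE_SHIFT) & (ENTRIES - 1))
    if frame is None:
        raise PageFault(hex(virt))
    return frame | (virt & (PAGE_SIZE - 1))


pgd = {}
map_page(pgd, 0xC0001000, 0x00001000)   # a kernel lowmem page (linear)
map_page(pgd, 0x08048000, 0x00ABC000)   # an arbitrary user page
```

Note that mmu_translate() has no special case for addresses above 0xC0000000: the kernel page is looked up exactly like the user page, and it only happens to come out as "virtual minus PAGE_OFFSET" because that is what was written into the table.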

How is the address translation (virt->phys) performed when the CPU accesses an address above 0xC0000000?

Just as for the addresses below it. This reading may be helpful.

0
votes

In fact, both your userspace program and the kernel use virtual addressing. This means that every memory access goes through the MMU, and the MMU uses page tables (see the CR3 register on x86).

virtual addr --> MMU --> physical addr 

And the kernel doesn't use any magical optimization when you access lowmem. Yes, lowmem is mapped directly, which is why, from a human's point of view, you can translate lowmem virtual addresses to physical ones by a simple subtraction, but the CPU performs this translation through the kernel page tables.
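A small sketch of that last point, under stated assumptions: 32-bit x86, PAGE_OFFSET = 0xC0000000, 256MB of RAM (so all of it is lowmem), and lowmem mapped with 4MiB pages as when PSE is available. The helper names are hypothetical, and a plain dict stands in for the page directory.

```python
PAGE_OFFSET = 0xC0000000        # assumed 3G/1G split
LARGE_SHIFT = 22                # 4MiB pages (PSE), one directory entry each
LARGE_SIZE = 1 << LARGE_SHIFT
MIB = 1 << 20


class PageFault(Exception):
    """Raised when the directory has no entry for the address."""


def build_kernel_pgd(ram_bytes):
    """Fill the directory so that lowmem is mapped linearly.

    Assumes ram_bytes is a multiple of 4MiB and at most 896MiB.
    """
    pgd = {}
    for phys in range(0, ram_bytes, LARGE_SIZE):
        pgd[(PAGE_OFFSET + phys) >> LARGE_SHIFT] = phys
    return pgd


def mmu_translate(pgd, virt):
    """What the hardware does: look up the entry, then add the offset."""
    base = pgd.get(virt >> LARGE_SHIFT)
    if base is None:
        raise PageFault(hex(virt))
    return base | (virt & (LARGE_SIZE - 1))


pgd = build_kernel_pgd(256 * MIB)
```

For every address this directory covers, mmu_translate(pgd, virt) equals virt - PAGE_OFFSET, which is why the subtraction works as a shortcut for humans and for kernel code. An address such as 0xF0000020 has no entry in this directory at all, so in the model it faults instead of being silently "translated" to a nonexistent physical address.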