Introduction to Page Table (Including 4 Different Types)
What is a page table, and what is it used for? This post answers both questions: it covers what a page table stores, how address translation uses it, and four common page table types.
Overview of Page Table
What is a page table? It is the data structure that the virtual memory system of a computer operating system uses to store the mapping between virtual addresses and physical addresses. Programs running in a process use virtual addresses, while the hardware, more specifically the RAM subsystem, uses physical addresses. The page table is a key component of virtual address translation, which is required to access data in memory.
In an operating system that uses virtual memory, each process is given the impression that it is using a large, contiguous section of memory. Physically, the memory of each process may be scattered across different areas of physical memory, or may have been moved (paged out) to secondary storage, usually a hard disk drive or solid-state drive.
When a process requests access to data in its memory, the operating system is responsible for mapping the virtual address provided by the process to the physical address of the memory where the data is actually stored. The page table is where the operating system stores these virtual-to-physical address mappings, and each mapping is called a page table entry (PTE).
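The mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular operating system lays out its tables: the 4 KiB page size, the dictionary-based table, the PTE fields and the PageFault exception are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

# Hypothetical single-level page table: virtual page number -> PTE.
# Each PTE records whether the page is resident and, if so, its frame number.
page_table = {
    0: {"present": True, "frame": 5},
    1: {"present": True, "frame": 2},
    2: {"present": False, "frame": None},  # paged out to secondary storage
}


class PageFault(Exception):
    """Raised when a virtual page has no resident mapping."""


def translate(vaddr):
    """Map a virtual address to a physical address via the page table."""
    # The page number selects the PTE; the offset is carried over unchanged.
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    pte = page_table.get(vpn)
    if pte is None or not pte["present"]:
        raise PageFault(f"no resident mapping for virtual page {vpn}")
    return pte["frame"] * PAGE_SIZE + offset
```

With this table, translate(4100) falls in virtual page 1 at offset 4 and returns 8196 (frame 2 at offset 4), while translate(8192) raises PageFault because virtual page 2 is marked as paged out.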
Translation Process of Page Table
The memory management unit (MMU) of the CPU keeps a cache of the most recently used mappings from the operating system's page table. This cache is called the translation lookaside buffer (TLB), and it is an associative cache.
When a virtual address needs to be converted to a physical address, the TLB is searched first. If a match is found (a TLB hit), the physical address is returned and the memory access can continue. If there is no match (a TLB miss), the memory management unit or the operating system's TLB miss handler typically looks up the address in the page table to see whether a mapping exists (a page walk).
If a mapping exists, it is written back to the TLB (this must be done because the hardware accesses memory through the TLB in a virtual memory system), and the faulting instruction is restarted (this may also happen in parallel). The subsequent translation then results in a TLB hit, and the memory access continues. If no valid mapping exists, the lookup ends in a page fault, which the operating system must handle.
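The hit, miss and refill sequence can be sketched as follows. This is a software model only: the two-entry TLB, the least-recently-used eviction policy, the flat page_table dictionary and the stats counters are assumptions chosen to keep the example small, and real TLBs are hardware structures with their own replacement policies.

```python
from collections import OrderedDict

PAGE_SIZE = 4096   # assumed 4 KiB pages
TLB_CAPACITY = 2   # deliberately tiny so that eviction is easy to observe

page_table = {0: 7, 1: 3, 2: 9}  # hypothetical mapping: page number -> frame
tlb = OrderedDict()              # cached subset of page_table, oldest first
stats = {"hits": 0, "misses": 0}


def translate(vaddr):
    """Translate through the TLB, falling back to a page walk on a miss."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:
        stats["hits"] += 1
        tlb.move_to_end(vpn)  # mark as most recently used
    else:
        stats["misses"] += 1
        # Page walk: consult the full page table.
        if vpn not in page_table:
            raise LookupError(f"page fault: virtual page {vpn} is unmapped")
        if len(tlb) >= TLB_CAPACITY:
            tlb.popitem(last=False)  # evict the least recently used entry
        tlb[vpn] = page_table[vpn]   # write the mapping back to the TLB
    # The access always completes through the TLB entry.
    return tlb[vpn] * PAGE_SIZE + offset
```

Calling translate(0x1004) twice records one miss followed by one hit, and both calls return 12292 (frame 3 at offset 4).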
Page Table Types
Inverted Page Tables
It is best to think of the inverted page table (IPT) as an off-chip extension of the TLB that uses normal system RAM. Unlike a true page table, it does not necessarily hold all current mappings, so the operating system must be prepared to handle misses, just as it would with a MIPS-style software-filled TLB.
The IPT combines a page table and a frame table into one data structure. At its core is a fixed-size table whose number of rows equals the number of frames in memory. If there are 4,000 frames, the inverted page table has 4,000 rows. Each row holds the virtual page number (VPN), the physical page number (not the physical address), some other data, and a means of creating a collision chain.
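A hashed inverted page table with collision chains might look like the sketch below. Several details are assumptions made for illustration: the row index doubles as the physical page number instead of being stored as a separate field, each row also stores a process ID so that different processes can use the same VPN, the hash function is arbitrary, and there is no support for removing or replacing an entry.

```python
class InvertedPageTable:
    """One row per physical frame, searched through a hash anchor table."""

    def __init__(self, num_frames):
        # rows[frame] is None or a tuple (pid, vpn, next_frame_in_chain).
        self.rows = [None] * num_frames
        # anchors[bucket] is the first frame in that bucket's collision chain.
        self.anchors = [None] * num_frames

    def _bucket(self, pid, vpn):
        # Arbitrary illustrative hash; real designs choose this carefully.
        return (pid * 31 + vpn) % len(self.rows)

    def insert(self, pid, vpn, frame):
        """Record that frame now holds page vpn of process pid."""
        bucket = self._bucket(pid, vpn)
        # Link the new row at the head of the bucket's collision chain.
        self.rows[frame] = (pid, vpn, self.anchors[bucket])
        self.anchors[bucket] = frame

    def lookup(self, pid, vpn):
        """Return the frame holding (pid, vpn), or None on a miss."""
        frame = self.anchors[self._bucket(pid, vpn)]
        while frame is not None:
            row_pid, row_vpn, next_frame = self.rows[frame]
            if (row_pid, row_vpn) == (pid, vpn):
                return frame
            frame = next_frame  # follow the collision chain
        return None
```

With eight frames, the pairs (pid 1, VPN 0) and (pid 1, VPN 8) hash to the same bucket, so the second insert chains onto the first and both lookups still succeed. A lookup that returns None is the miss that the operating system has to handle.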
Multilevel Page Tables
Multilevel page tables are also called “hierarchical page tables”. The inverted page table keeps a list of the mappings installed for all frames in physical memory, but this can be quite wasteful. Instead, we can create a page table structure that contains mappings for virtual pages. This is done by keeping several smaller page tables, each covering a certain block of virtual memory. For example, a small page table with 1024 entries, each mapping a 4 KB page, covers 4 MB of virtual memory.
This is useful because a running process usually uses only the two ends of its virtual address space: one end typically holds the text and data segments and the other holds the stack, with free memory in between. A multilevel page table can keep a few small page tables that cover only those two ends of the address space, and create new page tables only when they are needed.
Each of these smaller page tables is linked from a master page table, effectively creating a tree data structure. The tree is not limited to two levels; more levels may be used.
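The space saving from creating second-level tables only on demand can be seen in a small model. The sketch below assumes a 32-bit address space with 4 KB pages and 1024-entry tables at both levels. The class and method names are hypothetical.

```python
ENTRIES = 1024  # assumed entries per table, at both levels


class TwoLevelPageTable:
    """Master table whose slots point to second-level tables made on demand."""

    def __init__(self):
        self.root = [None] * ENTRIES  # None means no second-level table yet

    def map(self, vpn, frame):
        """Install vpn -> frame, allocating a second-level table if needed."""
        root_idx, leaf_idx = divmod(vpn, ENTRIES)
        if self.root[root_idx] is None:
            self.root[root_idx] = [None] * ENTRIES
        self.root[root_idx][leaf_idx] = frame

    def lookup(self, vpn):
        """Return the frame for vpn, or None if it is unmapped."""
        root_idx, leaf_idx = divmod(vpn, ENTRIES)
        leaf = self.root[root_idx]
        return None if leaf is None else leaf[leaf_idx]

    def leaf_tables(self):
        """Count the second-level tables that have actually been allocated."""
        return sum(1 for leaf in self.root if leaf is not None)
```

Mapping one page at each end of the address space, for example virtual pages 0 and 2**20 - 1, allocates only two second-level tables, whereas covering the whole 4 GB space would need 1024 of them.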
In a two-level scheme, the virtual address is divided into three parts: the index into the root page table, the index into the second-level page table, and the offset within the page.
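As an illustration of the three-part split, the sketch below assumes a 32-bit virtual address with 10 bits for each table index and 12 bits for the page offset, which matches 1024-entry tables and 4 KB pages. Other architectures use different field widths.

```python
OFFSET_BITS = 12  # assumed: 4 KB pages
INDEX_BITS = 10   # assumed: 1024 entries per table


def split_address(vaddr):
    """Split a 32-bit virtual address into (root index, leaf index, offset)."""
    index_mask = (1 << INDEX_BITS) - 1
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    leaf_index = (vaddr >> OFFSET_BITS) & index_mask
    root_index = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & index_mask
    return root_index, leaf_index, offset
```

For example, split_address(0x00403004) returns (1, 3, 4): entry 1 of the root table, entry 3 of the second-level table it points to, and byte 4 within the page.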
Virtualized Page Tables
A page table structure that contains a mapping for every virtual page in the virtual address space can end up being wasteful. The excessive space requirement can be avoided by placing the page table itself in virtual memory and letting the virtual memory system manage the memory that the page table occupies.
However, part of this linear page table structure must always reside in physical memory. Otherwise a circular page fault can occur, in which resolving a fault requires a key part of the page table that is itself not resident.
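One way to picture a linear page table placed in virtual memory is to compute where the PTE for a given address lives. The sketch below is an assumption-laden illustration: the 4 KB page size, the 4-byte PTE size and especially the base address PT_BASE are hypothetical values, not taken from any specific system.

```python
PAGE_SIZE = 4096      # assumed 4 KB pages
PTE_SIZE = 4          # assumed 4-byte page table entries
PT_BASE = 0xC0000000  # hypothetical virtual base address of the linear table


def pte_vaddr(vaddr):
    """Virtual address of the PTE that maps vaddr."""
    vpn = vaddr // PAGE_SIZE
    return PT_BASE + vpn * PTE_SIZE


def table_page(vaddr):
    """Index of the page of the table that holds the PTE for vaddr."""
    return (pte_vaddr(vaddr) - PT_BASE) // PAGE_SIZE
```

For the address 0x00403004, pte_vaddr returns 0xC000100C and table_page returns 1. Because that PTE is itself at a virtual address, reading it may need a further translation. Under these assumptions the whole table spans 1024 pages, and it is the mappings for those table pages that have to stay in physical memory to keep the lookup from faulting in a circle.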
Nested Page Tables
Nested page tables can be implemented to improve the performance of hardware virtualization. With hardware support for page table virtualization, the need to emulate guest page tables in software is greatly reduced. For x86 virtualization, the current choices are Intel’s Extended Page Tables (EPT) feature and AMD’s Rapid Virtualization Indexing (RVI) feature.
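Conceptually, nested paging composes two translations: the guest's own page table maps guest virtual addresses to guest physical addresses, and a second table controlled by the hypervisor maps guest physical addresses to host physical addresses. The sketch below models both stages with flat dictionaries. This is a simplifying assumption, since real hardware walks multilevel tables at both stages. The table contents and function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages at both stages

# Hypothetical tables, both keyed by page number.
guest_page_table = {0: 4, 1: 9}     # guest virtual page -> guest physical page
nested_page_table = {4: 20, 9: 13}  # guest physical page -> host physical frame


def translate(table, addr):
    """Translate addr through one flat table, preserving the page offset."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def guest_virtual_to_host_physical(gva):
    """Compose the guest-controlled and hypervisor-controlled translations."""
    gpa = translate(guest_page_table, gva)    # stage 1: maintained by the guest OS
    return translate(nested_page_table, gpa)  # stage 2: maintained by the hypervisor
```

Here guest_virtual_to_host_physical(4104) first yields guest physical address 36872 (guest page 9 at offset 8) and then host physical address 53256 (host frame 13 at offset 8). In this model the guest edits only guest_page_table and the hypervisor edits only nested_page_table.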