TLB - L1/L2 Cache
Translation Lookaside Buffer (TLB): Small HW cache on MMU for recently used pages
(typically 16-64 pages, so 64-256KB memory)
- dict[page# (VPN), Page Table Entry] - lookup in one clock cycle
- Locality of Reference: Processes only use a few pages at a time
(typically a >99% hit rate)
- TLB hit costs 1 cycle; a 4-level page-table walk is very expensive (>30% overhead on a miss).
- TLB coverage (the fraction of physical memory the TLB can map) has shrunk from ~10% in 1985 to ~0.0001% in 2023.
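The hit/miss costs above can be combined into an effective access cost; the numbers below are illustrative assumptions (1-cycle hit, ~120-cycle 4-level walk), not figures from the notes:

```python
# Effective translation cost, assuming a 1-cycle TLB hit and a
# ~120-cycle 4-level page-table walk (illustrative numbers).
def effective_access_cycles(hit_rate, hit_cost=1, walk_cost=120):
    return hit_rate * hit_cost + (1 - hit_rate) * walk_cost

# Even a 99% hit rate more than doubles the average translation cost:
print(effective_access_cycles(0.99))   # ~2.19 cycles
print(effective_access_cycles(0.999))  # ~1.12 cycles
```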
TLB Management:
- Hardware Loaded: The MMU loads PTEs into the TLB on a TLB miss
(typically on Intel)
- OS maintains the PTBR (Page Table Base Register); the page table format is dictated by the processor
- Software Loaded: The OS loads PTEs into the TLB
(seen in MIPS, slower)
- OS must keep the TLB consistent when the page table changes (e.g. a protection bit flips)
- Invalidate the entire TLB on a context switch
- TLB Replacement Policy: Choosing what to evict when TLB is full
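A software-loaded TLB with a replacement policy can be sketched as a small fixed-size map; this toy model (class name, capacity, and FIFO eviction are my own choices for illustration) also shows the flush-on-context-switch behavior:

```python
from collections import OrderedDict

class TLB:
    """Toy software-loaded TLB: VPN -> PTE, FIFO eviction when full."""
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()   # insertion order = FIFO order

    def lookup(self, vpn, page_table):
        if vpn in self.entries:        # TLB hit: fast path
            return self.entries[vpn]
        pte = page_table[vpn]          # TLB miss: expensive table walk
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        self.entries[vpn] = pte
        return pte

    def flush(self):
        """Invalidate every entry, e.g. on a context switch."""
        self.entries.clear()
```

Real MMUs use other replacement policies (e.g. pseudo-LRU); FIFO is just the simplest to sketch.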
Page Policies
Demand Paging
Page Fault: Accessing a page whose PTE is marked invalid (e.g. evicted)
- On eviction, the OS swaps the frame out to the swap file, marks the PTE invalid, and records the file offset
(update TLB)
- On a page fault, the OS swaps the page back into memory, updates the PTE, and resumes the process
Page Daemon: A kernel thread to proactively evict pages when memory is close to full
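The evict/fault bookkeeping above can be sketched with a PTE that is either resident (valid) or points at a swap-file offset; all names here are illustrative, not a real OS API:

```python
# Toy demand-paging bookkeeping (illustrative only).
swap_file = {}     # swap offset -> page contents
next_offset = 0

def evict(pte, frame_contents):
    """Swap the frame out: write to swap, mark the PTE invalid."""
    global next_offset
    swap_file[next_offset] = frame_contents
    pte['valid'] = False
    pte['swap_offset'] = next_offset   # remember where it went
    next_offset += 1
    # (a real OS would also invalidate the TLB entry here)

def page_fault(pte):
    """Swap the page back in, mark the PTE valid, then resume."""
    contents = swap_file.pop(pte['swap_offset'])
    pte['valid'] = True
    pte['swap_offset'] = None
    return contents
```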
Locality
Temporal Locality: Locations referenced recently are likely to be referenced again
(e.g. for loop index i)
Spatial Locality: Locations close by are likely to be referenced soon
(e.g. sequential array access)
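One way to see spatial locality concretely: sequential accesses stay within a handful of pages, while large-stride accesses touch a new page every time (the 4 KB page size and strides below are assumed for illustration):

```python
PAGE_SIZE = 4096  # bytes, a common page size

def pages_touched(addresses):
    """Distinct pages covered by a sequence of byte addresses."""
    return len({addr // PAGE_SIZE for addr in addresses})

n = 1024
seq = [i * 8 for i in range(n)]        # sequential 8-byte accesses
strided = [i * 4096 for i in range(n)] # one full page per access
print(pages_touched(seq), pages_touched(strided))  # 2 vs 1024
```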
Paging Policies ⭐
Placement Policy: Where to put a page (469: placement impacts performance more when lots of memory is available)
Fetch Policy: When to swap a frame into memory
- Demand Paging: Load a page into memory when referenced, evict when memory is full
- Optimize: Evict clean pages first
- Optimize: Lazy loading of process code (only load when accessed)
Cold Miss: Page miss on the first access
Capacity Miss: Page miss caused by an eviction
- Optimize: Load large chunks of code from disk on page faults (based on locality)
- Prepaging: Anticipate which pages will be used
(rarely used)
Replacement Policy: Which page to evict?
(evaluated by counting number of page faults over a reference string)
- Belady's Optimal: Look at the future reference string, evict the page whose next use is furthest away
(not practical; an upper-bound yardstick to compare other algorithms against)
- Random Replacement: Evict a random page
(a lower-bound for evaluation)
- FIFO (First In First Out): Evict the oldest page (temporal locality)
(bad: an old page can still be used very frequently)
- Belady's Anomaly: with FIFO, adding more memory can cause more page faults
- LRU (Least Recently Used): Evict the page that hasn't been used for the longest time in the past
(bad when sequentially accessing an array larger than physical memory)
- FIFO but move a page to the front when referenced
- Expensive runtime: the queue must be manipulated on every memory access
- Second Chance Algo: FIFO but evict a page with ref bit 0 (and prefer clean pages)
- ⭐ Clock Algo: Circular list of pages, clock hand pointing at the next oldest
- When a page loads, ref=1 (and each later reference sets ref=1 again)
- If the pointed-at page is unreferenced (ref=0), evict it. Otherwise, set its ref=0
- Rotate clock hand +1
- LFU (Least Frequently Used): Count number of accesses in the past
(pages heavily used but no longer relevant may stick around)
- Simplified 2Q: Two queues: an A1 FIFO (holds pages used once) and an Am LRU (pages referenced again are promoted to Am)
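The ⭐ clock algorithm above can be sketched as a circular scan over frames with reference bits; the frame count and reference strings in the test are made up, and the fault count is the evaluation metric from the notes:

```python
def clock_replace(reference_string, num_frames):
    """Simulate the clock algorithm; return the number of page faults."""
    frames = [None] * num_frames   # page resident in each frame
    ref = [0] * num_frames         # reference bit per frame
    hand = 0
    faults = 0
    for page in reference_string:
        if page in frames:                 # hit: set the reference bit
            ref[frames.index(page)] = 1
            continue
        faults += 1
        # Advance the hand, clearing ref bits, until an unreferenced
        # (or empty) frame is found.
        while frames[hand] is not None and ref[hand] == 1:
            ref[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page
        ref[hand] = 1                      # page loads with ref=1
        hand = (hand + 1) % num_frames
    return faults
```

The `frames.index()` scan is O(n) per access; real MMUs set the reference bit in hardware, and the OS only runs the hand on a fault.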
