An MMU, or Memory Management Unit, is a hardware component inside your computer’s processor that translates the memory addresses used by software into the actual physical locations in RAM. It sits between the processor and memory, intercepting every single read and write operation to ensure programs access only the memory they’re allowed to touch. Without it, every program on your computer would be able to read or overwrite another program’s data, and modern multitasking operating systems simply wouldn’t work.
What the MMU Actually Does
The MMU has two primary jobs: address translation and access control.
Every program running on your computer thinks it has its own private block of memory, starting from address zero and stretching as far as it needs. This is an illusion. The operating system and MMU work together to maintain it. When your web browser asks to read data at what it believes is memory address 5000, the MMU intercepts that request and converts it to a completely different physical address in your actual RAM chips. This lets dozens of programs each “own” address 5000 without ever colliding.
The access control side is equally important. The MMU checks permission bits attached to each region of memory before allowing a read or write. These permissions define whether a program has no access, read-only access, or full read/write access to a given block. If a program tries to touch memory it shouldn’t, the MMU signals a memory fault to the processor, and the operating system steps in to stop it. This is what’s happening behind the scenes when you see a “segmentation fault” or an application crash due to an access violation. On processors with security extensions like Arm’s TrustZone, the MMU can even separate memory into secure and non-secure zones, preventing untrusted software from accessing sensitive data regardless of its privilege level.
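The permission check described above can be sketched in a few lines. This is a toy model, not real hardware behavior: permissions are stored in a plain dictionary, and the names (`check_access`, `SegmentationFault`) are illustrative stand-ins for the fault the MMU raises and the signal the OS delivers.

```python
# Toy model of the MMU's permission check. Real hardware stores
# these as flag bits in each page table entry; the dict and names
# here are illustrative only.
NONE, READ_ONLY, READ_WRITE = 0, 1, 2

page_permissions = {0: READ_WRITE, 1: READ_ONLY, 2: NONE}

class SegmentationFault(Exception):
    """Stand-in for the fault the MMU signals to the processor."""

def check_access(page, is_write):
    perm = page_permissions.get(page, NONE)
    if perm == NONE or (is_write and perm != READ_WRITE):
        # Hardware would raise a fault here; the OS then typically
        # terminates the offending process.
        kind = "write" if is_write else "read"
        raise SegmentationFault(f"illegal {kind} to page {page}")

check_access(1, is_write=False)   # allowed: page 1 is read-only
# check_access(1, is_write=True)  # would raise SegmentationFault
```

An illegal write to a read-only page in this model raises the same kind of fault a real program sees as a segmentation fault.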
How Address Translation Works
The translation process relies on a system called paging. Both virtual memory (what the program sees) and physical memory (the actual RAM) are divided into small, fixed-size chunks called pages, typically 4 kilobytes each. The operating system maintains a data structure called a page table that maps each virtual page to a physical page.
When the MMU receives a virtual address, it splits it into two parts: the virtual page number and an offset within that page. It looks up the virtual page number in the page table to find the corresponding physical page number, then recombines it with the original offset. The result is a physical address pointing to the exact byte in RAM. This happens for every memory operation your processor executes, which means it needs to be extraordinarily fast.
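The split-and-recombine step above can be shown concretely. The sketch below assumes 4 KiB pages and uses a flat dictionary as the page table; real MMUs walk multi-level tables in hardware, and the specific mappings are made up for illustration.

```python
# Minimal sketch of MMU address translation with 4 KiB pages.
# A flat dict stands in for the (multi-level) page table.
PAGE_SIZE = 4096   # 4 KiB
OFFSET_BITS = 12   # log2(4096)

page_table = {0: 7, 1: 3, 2: 9}  # hypothetical vpn -> ppn mappings

def translate(virtual_addr):
    vpn = virtual_addr >> OFFSET_BITS        # virtual page number
    offset = virtual_addr & (PAGE_SIZE - 1)  # byte offset within the page
    if vpn not in page_table:
        raise MemoryError(f"page fault: no mapping for page {vpn}")
    ppn = page_table[vpn]                    # physical page number
    return (ppn << OFFSET_BITS) | offset     # recombine with the offset

# Virtual address 5000 falls in virtual page 1 at offset 904,
# which maps to physical page 3: 3 * 4096 + 904 = 13192.
print(translate(5000))  # 13192
```

Note that the offset passes through unchanged: translation only swaps the page number, which is why pages must be a power-of-two size.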
The TLB: Making Translation Fast
Looking up a page table entry in main memory for every single load and store instruction would be painfully slow. To solve this, the MMU contains a small, specialized cache called the Translation Lookaside Buffer (TLB). The TLB stores the most recently used virtual-to-physical page mappings so the MMU can resolve addresses without touching main memory at all.
When the MMU needs to translate an address, it first checks the TLB. If the mapping is there (a “TLB hit”), the translation completes almost instantly, adding essentially zero delay. If the mapping isn’t cached (a “TLB miss”), the MMU has to search the page tables stored in main memory, a process called a page table walk. Because a walk requires one or more extra memory accesses, it typically costs tens of clock cycles, and on some architectures the penalty can be significantly higher. Early measurements across several processor families found that the performance impact of TLB misses varied widely, from modest slowdowns to over six times the baseline execution time for workloads that triggered frequent misses.
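The hit/miss flow can be sketched as a cache in front of the page table. The structures here are toy stand-ins (a single-level table, an unbounded TLB dict) chosen to show the control flow, not the hardware.

```python
# Sketch of a TLB lookup ahead of a page table walk.
# Single-level table and unbounded TLB are simplifications.
page_table = {n: n + 100 for n in range(16)}  # hypothetical mappings
tlb = {}

def translate_page(vpn):
    """Return (physical page number, whether it was a TLB hit)."""
    if vpn in tlb:            # TLB hit: no memory access needed
        return tlb[vpn], True
    ppn = page_table[vpn]     # TLB miss: walk the page table
    tlb[vpn] = ppn            # cache the mapping for next time
    return ppn, False

print(translate_page(4))  # (104, False)  first access misses
print(translate_page(4))  # (104, True)   repeat access hits
```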
Because the TLB is small (often 64 entries or fewer for a given level), it uses a replacement policy to decide which old entry to evict when a new one arrives. The operating system can also flush the TLB entirely or selectively when memory mappings change. Programs that access memory in predictable, localized patterns benefit enormously from the TLB, while programs that jump around large, scattered memory regions pay more of a penalty.
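An eviction policy like the one described can be modeled with a least-recently-used (LRU) cache. Real TLBs are set-associative and use hardware approximations of LRU; the 4-entry capacity and class below are purely illustrative.

```python
from collections import OrderedDict

# Toy LRU-replacement TLB. Real hardware approximates LRU in
# set-associative structures; capacity here is illustrative.
class TinyTLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # vpn -> ppn, oldest first

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as recently used
            return self.entries[vpn]
        return None                        # miss: caller must walk

    def insert(self, vpn, ppn):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[vpn] = ppn

    def flush(self):
        self.entries.clear()  # e.g. after the OS changes mappings

tlb = TinyTLB()
for vpn in range(5):       # insert one entry past capacity
    tlb.insert(vpn, vpn + 100)
print(tlb.lookup(0))       # None: page 0 was evicted
print(tlb.lookup(4))       # 104: still cached
```

The final lookups show why localized access patterns win: pages touched recently stay resident, while a fifth distinct page pushed the oldest one out.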
How the Operating System Controls the MMU
The MMU is hardware, but it takes its instructions from the operating system kernel. The kernel builds and maintains the page tables that define how virtual addresses map to physical ones. Each running process gets its own set of page tables, giving it a unique view of memory.
When the operating system switches from running one program to another (a context switch), it loads the new program’s page table into a special processor register. On x86 processors, this means writing to a register called CR3. Loading new page tables has a side effect: it flushes the TLB, since the old translations no longer apply. This flush is one reason context switches have a performance cost.
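The cost described above can be sketched from the MMU's point of view. The `page_table` attribute below stands in for the CR3 register, and the per-process tables are made-up examples; the point is only that loading a new table invalidates the cached translations.

```python
# Sketch of a context switch as the MMU sees it. The page_table
# attribute stands in for CR3; mappings are illustrative.
class MMU:
    def __init__(self):
        self.page_table = None  # stand-in for the CR3 register
        self.tlb = {}

    def load_page_table(self, table):
        self.page_table = table
        self.tlb.clear()        # old translations no longer apply

process_a = {0: 7}
process_b = {0: 12}             # same virtual page, different frame

mmu = MMU()
mmu.load_page_table(process_a)
mmu.tlb[0] = process_a[0]       # warm the TLB while A runs
mmu.load_page_table(process_b)  # context switch to B
print(mmu.tlb)                  # {}  -- the flush is the cost
```

Both processes map virtual page 0, but to different frames, which is exactly why the stale entry for process A must not survive the switch.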
Linux and other modern operating systems use a technique called lazy TLB flushing to minimize this overhead. Rather than aggressively clearing the TLB every time anything changes, the kernel defers the flush until it’s absolutely necessary. A full TLB flush across all processor cores is the most expensive option and is reserved for situations where the entire address space has changed, like when a process exits or a new process is created via fork.
Virtual Memory and the Bigger Picture
The MMU is the hardware foundation that makes virtual memory possible. Virtual memory lets your computer use more memory than it physically has by swapping less-used pages to disk and bringing them back when needed. When a program tries to access a page that’s been moved to disk, the MMU triggers a page fault. The operating system catches this fault, loads the page back into RAM, updates the page table, and lets the program continue without it ever knowing anything happened.
This same mechanism enables memory-mapped files, shared memory between processes, and copy-on-write optimizations that make creating new processes fast. All of these features depend on the MMU’s ability to intercept memory accesses and let the operating system decide what happens next.
IOMMU: The MMU for Devices
A related component called the IOMMU (Input/Output Memory Management Unit) does the same job for hardware peripherals like graphics cards, network adapters, and storage controllers. These devices often need to read and write system memory directly (a technique called DMA), and without an IOMMU, they would operate on raw physical addresses with no protection. The IOMMU gives each device its own translated view of memory, preventing a misbehaving or compromised device from accessing memory belonging to the operating system or other devices. On Arm-based systems, this component is called the System MMU, and it can share the same page table format used by the CPU’s MMU.