Trying to Explain QEMU in Simple Terms (Part 4): Address Translation, Soft-MMU, System Mode, User Mode
Introduction
In the previous article (Part 3), we explained how peripherals are handled in QEMU.
In this article (Part 4), we will explain the following in QEMU:
- Address Translation
- Soft-MMU
- System Mode
- User Mode
From here on, things get a bit more complicated, and we are not sure the topic can be explained simply. We will nevertheless do our best to explain it as clearly as possible.
What is Address Translation?
What exactly does address translation refer to in QEMU?
Address translation in QEMU involves several types of “addresses.” This may seem complex at first, so Figure 1 illustrates the whole process. The diagram relates specifically to QEMU’s Soft-MMU (MMU = Memory Management Unit), which corresponds to QEMU’s system mode. QEMU also has another mode called user mode. The terms Soft-MMU, system mode, and user mode are explained later. To introduce the types of addresses, we will first use part of this diagram.

In QEMU, the following types of addresses exist:
- Guest Virtual Address on the ARM CPU (guest) (abbreviated as gvaddr in QEMU source code)
- Guest Physical Address on the ARM CPU (guest) (abbreviated as gpaddr in QEMU source code)
- Host Virtual Address on the x86 CPU (host) (abbreviated as hvaddr in QEMU source code)
Virtual addresses are also called logical addresses. Personally, we find the term “logical address” more intuitive, but since QEMU uses the term “virtual address,” we will use “virtual address” from now on.
We considered explaining what a virtual address is here, but doing so would disrupt the flow of the discussion, so virtual addresses will be explained in a later part of this series. Many readers interested in QEMU are probably already familiar with them anyway.
Cross-Compilation
Now, let’s go through Figure 1 step by step. First, we will focus on part ① in Figure 1. Here, a C program A for the ARM CPU is cross-compiled on the host, generating an executable file (binary file) for the ARM CPU. Suppose this executable file contains an instruction that loads a value from the virtual address gvaddr.
Next, as shown in part ② of Figure 1, this program A is placed in RAM1, which exists in the physical memory space as seen from the ARM CPU. In part ②, the physical address gpaddr corresponding to the virtual address gvaddr is also indicated.
Memory Allocation on the Host
QEMU first allocates memory on the host for RAM1 using the C function calloc(), as shown in part ③ of Figure 1. The ARM CPU executable is then loaded, and ARM program A is placed into RAM1 on the host. QEMU then executes the ARM program in RAM1 by converting its ARM instructions into x86 CPU instructions with the TCG (Tiny Code Generator), so that program A runs on the x86 CPU.
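The allocation and loading steps above can be pictured with a small toy model. This is only a sketch, not QEMU code: the names RAM1_GUEST_BASE, RAM1_SIZE, and load_program() are hypothetical, the base address is an assumed value, and a zero-filled Python bytearray stands in for the block that calloc() would return.

```python
# Toy model: guest RAM is backed by an ordinary host buffer.
# RAM1_GUEST_BASE, RAM1_SIZE and load_program() are hypothetical names
# chosen for this sketch; they are not QEMU identifiers.

RAM1_GUEST_BASE = 0x4000_0000   # assumed start of RAM1 in the guest physical map
RAM1_SIZE = 64 * 1024           # 64 KiB is enough for the demonstration

# Zero-filled host buffer, playing the role of the calloc()'d block.
ram1_host = bytearray(RAM1_SIZE)


def load_program(image, gpaddr):
    """Copy a guest executable image into RAM1 and return its offset in the host buffer."""
    offset = gpaddr - RAM1_GUEST_BASE
    if offset < 0 or offset + len(image) > RAM1_SIZE:
        raise ValueError("image does not fit inside RAM1")
    ram1_host[offset:offset + len(image)] = image
    return offset


# Four placeholder bytes standing in for the machine code of program A.
program_a = bytes([0xDE, 0xAD, 0xBE, 0xEF])
host_offset = load_program(program_a, RAM1_GUEST_BASE + 0x100)
print(hex(host_offset))
```

The point of the sketch is that, from the host's perspective, guest RAM is just a region of the QEMU process's own memory, and the guest program is plain data copied into it.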
ARM CPU’s MMU (Memory Management Unit)
At this point, QEMU needs to know which address in host memory (specifically, which host virtual address) corresponds to the virtual address gvaddr referenced in the ARM CPU program. In other words, it needs to know the address hvaddr shown in part ③ of Figure 1.
As shown in part ④ of Figure 1, the virtual addresses referenced in the ARM CPU program are converted into physical addresses by a circuit inside the ARM CPU called the MMU (Memory Management Unit). QEMU also emulates this MMU. The MMU is controlled by an operating system such as Linux running on the ARM CPU. If no such operating system is used on the ARM CPU, the MMU is generally not used and there are no virtual addresses; only physical addresses are used.
The physical address (gpaddr) output by the MMU inside the ARM CPU is a physical address as seen from the ARM CPU. In other words, it is the gpaddr shown in part ② of Figure 1. QEMU then needs to further determine the hvaddr shown in part ③ of Figure 1.
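The first translation step, from gvaddr to gpaddr, can be sketched as follows. This is a deliberately simplified, single-level model under assumed values: the 4 KiB page size, the table contents, and the names guest_page_table and mmu_translate() are all hypothetical, and a real ARM MMU walks multi-level tables and checks permissions.

```python
PAGE_SIZE = 4096  # 4 KiB pages (an assumption for this sketch)

# Hypothetical table set up by the guest OS:
# guest virtual page number -> guest physical page number.
guest_page_table = {
    0x00010: 0x40000,
    0x00011: 0x40001,
}


def mmu_translate(gvaddr):
    """Translate a guest virtual address into a guest physical address."""
    vpn, offset = divmod(gvaddr, PAGE_SIZE)
    if vpn not in guest_page_table:
        raise LookupError(f"page fault at gvaddr {gvaddr:#x}")
    # The offset inside the page is unchanged; only the page number is replaced.
    return guest_page_table[vpn] * PAGE_SIZE + offset


gpaddr = mmu_translate(0x0001_0123)
print(hex(gpaddr))
```

The result is still an address as seen from the guest, so one more step is needed before QEMU can touch host memory.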
Page Table, TLB, Soft-MMU
In QEMU, the conversion from guest physical address to host virtual address is performed using a page table and TLB (Translation Lookaside Buffer) created based on the guest’s physical address. Since an explanation of page tables can be complex, we plan to explain it in a later part of this series if possible.
However, if you take a closer look at parts ② and ③ of Figure 1, you might feel that the host virtual address hvaddr can be calculated as:
hvaddr = gpaddr + some value
The “some value” mentioned above is determined using the page table. Once the “some value” is calculated, it is stored in the TLB (Translation Lookaside Buffer). When the address gpaddr needs to be translated again, the value stored in the TLB is used directly without recalculating the “some value.”
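The caching idea can be sketched as below. This is a toy model, not QEMU's actual data structures: the region list, the addresses, and the names ram_regions, find_offset(), and to_hvaddr() are hypothetical, and a plain list search stands in for the page-table lookup.

```python
PAGE_SIZE = 4096  # assumed page size

# Hypothetical RAM layout: (guest physical base, size, host virtual base).
ram_regions = [
    (0x4000_0000, 0x0010_0000, 0x7F00_0000_0000),
]

tlb = {}                        # guest physical page number -> "some value"
stats = {"slow_lookups": 0}     # how often the slow path had to run


def find_offset(gpaddr):
    """Slow path: work out the "some value" for this address."""
    stats["slow_lookups"] += 1
    for guest_base, size, host_base in ram_regions:
        if guest_base <= gpaddr < guest_base + size:
            return host_base - guest_base
    raise LookupError(f"no RAM at gpaddr {gpaddr:#x}")


def to_hvaddr(gpaddr):
    """hvaddr = gpaddr + some value, with the value cached per page."""
    page = gpaddr // PAGE_SIZE
    if page not in tlb:
        tlb[page] = find_offset(gpaddr)   # miss: compute once and remember
    return gpaddr + tlb[page]             # hit: reuse the stored value


first = to_hvaddr(0x4000_0123)
second = to_hvaddr(0x4000_0456)   # same page, served from the cache
print(hex(first), hex(second), stats["slow_lookups"])
```

Both accesses fall in the same page, so the slow path runs only once; the second translation is a single dictionary lookup and an addition.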
In QEMU, the part inside the red frame in part ④ of Figure 1 is called the Soft-MMU.
By using the page table inside the Soft-MMU, QEMU can determine whether a guest physical address is an MMIO (memory-mapped I/O, that is, a peripheral access). The reason will be explained in another part of this series. In the case of MMIO, the peripheral’s C functions (such as its read/write functions) are called, which is how peripherals are emulated.
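The RAM-versus-MMIO decision can be illustrated with a small sketch. Everything here is hypothetical: the ToyUart class, the addresses, and the page_table layout are invented for illustration, and QEMU's real peripheral models are C functions rather than Python methods.

```python
PAGE_SIZE = 4096  # assumed page size


class ToyUart:
    """Hypothetical peripheral model with read/write callbacks."""

    def __init__(self):
        self.sent = []              # characters the guest has written

    def read(self, offset):
        return 0x1                  # pretend the device is always ready

    def write(self, offset, value):
        self.sent.append(chr(value & 0xFF))


ram_page = bytearray(PAGE_SIZE)     # one page of ordinary guest RAM
uart = ToyUart()

# Guest physical page number -> (kind, backing object). Addresses are assumed.
page_table = {
    0x4000_0000 // PAGE_SIZE: ("ram", ram_page),
    0x0900_0000 // PAGE_SIZE: ("mmio", uart),
}


def guest_store(gpaddr, value):
    """Store one byte to a guest physical address."""
    page, offset = divmod(gpaddr, PAGE_SIZE)
    kind, target = page_table[page]
    if kind == "mmio":
        target.write(offset, value)     # the peripheral model runs
    else:
        target[offset] = value & 0xFF   # plain memory write


def guest_load(gpaddr):
    """Load one byte from a guest physical address."""
    page, offset = divmod(gpaddr, PAGE_SIZE)
    kind, target = page_table[page]
    if kind == "mmio":
        return target.read(offset)
    return target[offset]


guest_store(0x4000_0010, 0x41)          # lands in RAM
guest_store(0x0900_0000, ord("Q"))      # handled by the UART model
print(guest_load(0x4000_0010), uart.sent)
```

The guest program issues the same kind of store in both cases; only the table entry for the page decides whether memory is written or a device callback runs.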
QEMU’s System Mode and User Mode
QEMU has both a system mode and a user mode. System mode is the mode that uses the Soft-MMU described above. The QEMU commands for system mode contain the string “system,” as shown below.
- qemu-system-arm
- qemu-system-i386
- qemu-system-mips
So, what does user mode refer to? User mode assumes that Linux is the guest (ARM CPU) OS; that is, the guest program is an ordinary Linux application. Furthermore, user mode is a mode in which the Soft-MMU is not used. In other words, user mode is used when both of the following conditions are met:
- The guest OS is Linux
- Soft-MMU is not used
The process of address translation in user mode is shown in Figure 2.

In Figure 2, the C source code for the ARM CPU is compiled with the cross-compiler arm-linux-gnu-gcc, assuming that Linux is running on the ARM CPU. In Figure 1, the Soft-MMU converted the guest virtual address to the host virtual address, but in Figure 2, QEMU does not perform that conversion. In user mode, address management is left to the host OS (e.g., Linux).
The QEMU execution command in user mode does not contain the string “system,” as shown below.
- qemu-arm
- qemu-i386
- qemu-mips
As a side note, leaving address management to the host OS raises a question: what if the start address start_gvaddr of guest program A (shown in Figure 3) is already in use on the host?

In user mode, when QEMU places guest program A in the host’s memory, it uses the C function mmap() to allocate space in virtual memory, passing start_gvaddr as the requested starting address. If that region is not available, the return value of mmap() will not equal start_gvaddr. In this case, QEMU increases start_gvaddr slightly and calls mmap() again. This repeats until mmap() succeeds, and QEMU keeps track of how much has been added to start_gvaddr.
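The retry loop described above can be simulated without touching the real host address space. This is a toy model under stated assumptions: fake_mmap() is a stand-in for mmap() with an address hint, and STEP, MAX_ATTEMPTS, host_in_use, and the addresses are hypothetical values chosen for illustration, not what QEMU actually uses.

```python
STEP = 0x1000_0000      # hypothetical increment between attempts
MAX_ATTEMPTS = 16       # give up eventually instead of looping forever

# Hypothetical ranges the host process already uses: (start, end), end exclusive.
host_in_use = [(0x0001_0000, 0x0800_0000)]


def fake_mmap(hint, length):
    """Stand-in for mmap() with an address hint.

    Returns the hint when the range is free; otherwise returns some other
    address, as the real call may place the mapping elsewhere.
    """
    for start, end in host_in_use:
        if hint < end and start < hint + length:
            return 0x7000_0000_0000     # the mapping ended up somewhere else
    return hint


def place_guest_program(start_gvaddr, length):
    """Return (host start address, amount added to start_gvaddr)."""
    added = 0
    for _ in range(MAX_ATTEMPTS):
        wanted = start_gvaddr + added
        # Real code would also release an unwanted mapping before retrying.
        if fake_mmap(wanted, length) == wanted:
            return wanted, added
        added += STEP
    raise MemoryError("no free region found for the guest program")


host_start, added = place_guest_program(0x0001_0000, 0x2000)


def guest_to_host(gvaddr):
    """In this model, a guest address maps to the host by adding the tracked amount."""
    return gvaddr + added


print(hex(host_start), hex(added), hex(guest_to_host(0x0001_0004)))
```

Because the added amount is remembered, every later guest address can be converted with a single addition, which is why no page table or TLB is needed in this mode.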
In user mode, peripheral models cannot be used
In QEMU’s user mode, Soft-MMU is not used. Therefore, the page tables of Soft-MMU are also not used. Without using the page tables, it is not possible to determine whether an accessed address is MMIO. Additionally, since the host OS does not have information about the guest’s peripherals, it cannot execute QEMU’s peripheral C models.
Next Time
In this article (Part 4), we explained address translation in QEMU, Soft-MMU, system mode, and user mode. Although the explanation may have been somewhat difficult to follow, we believe that understanding the details is not essential when using QEMU.
In the next article (Part 5), we will explain QEMU-KVM (Kernel-based Virtual Machine). This will cover the case where both the guest and host CPUs are x86-based, such as running x86-based Linux (guest) on x86-based Linux (host); KVM is a feature of the Linux kernel, so the host is Linux. The underlying concept is that if both the host and guest use x86 machine code, there should be no need for machine code translation.