Emulog: QEMU Study
Welcome to the first emulog. This will be a bi-weekly series where I post updates on my emudevving, mainly for motivation but also as something to reflect on. Most of these posts will be technical (though I may intersperse some fun posts). I’m currently focused on building out a generic iPhone emulator, so I’m doing a lot of background reading. For my first post, here are some of the notes I took on “QEMU Internals” by Airbus Security Lab. Feel free to email me if you see anything wrong.
QEMU Intro #
QEMU is a generic userland and system emulator that can target many different processor architectures. Userland (user-mode) emulation refers to running a binary built for a different architecture under the same OS (e.g. a Linux x86 binary on a Linux ARM host). Conveniently, QEMU then only has to emulate the processor instructions and can forward the syscalls to the host kernel (this mode is out of scope for the project).
QEMU’s system-mode emulation (qemu-system-) provides a JIT (just-in-time) compiler called the Tiny Code Generator, or TCG.
API Shenanigans #
QEMU uses QOM (the QEMU Object Model), a framework that gives C OOP-style features for registering QEMU-related types and instantiating objects from them. QEMU also relies on macros (e.g. DEFINE_MACHINE) to generate machine boilerplate code.
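To get a feel for what "types registered by name, then instantiated" means, here is a toy Python model of a QOM-style registry. It is not QEMU code: `type_register`, `object_new`, `is_instance` and the `toy-uart` type are hypothetical names made up for this sketch, and the only behavior borrowed from QOM is that a parent type's instance initializer runs before the child's.

```python
# Toy model of a QOM-style type registry (illustrative only, not the QEMU API).
TYPES = {}

def type_register(name, parent=None, instance_init=None):
    """Record a type, its parent type name, and an optional instance initializer."""
    TYPES[name] = {"parent": parent, "instance_init": instance_init}

def _init_chain(name, obj):
    """Run initializers from the root ancestor down to the concrete type."""
    info = TYPES[name]
    if info["parent"] is not None:
        _init_chain(info["parent"], obj)
    if info["instance_init"] is not None:
        info["instance_init"](obj)

def object_new(name):
    """Instantiate a registered type by its string name."""
    obj = {"type": name}
    _init_chain(name, obj)
    return obj

def is_instance(obj, name):
    """Walk the parent chain, similar in spirit to a checked cast."""
    t = obj["type"]
    while t is not None:
        if t == name:
            return True
        t = TYPES[t]["parent"]
    return False

# A base "device" type and a hypothetical UART that inherits from it.
type_register("device", instance_init=lambda o: o.update(realized=False))
type_register("toy-uart", parent="device",
              instance_init=lambda o: o.update(baud=115200))

uart = object_new("toy-uart")
```

The `uart` object ends up with fields from both initializers, which is roughly the effect QOM achieves in C with struct embedding and per-type init callbacks.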
Guest memory can be accessed through a raw pointer to the backing buffer, through MMIO, or through the QEMU functions cpu_physical_memory_read/write. It’s preferable to assign a block of memory (a subregion) to a device and keep the device constrained within it, rather than handing it the whole system memory (generated by default). memory_region_init_io() can be used to achieve this.[1]
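The subregion idea can be sketched with a toy Python model: an address space holds a list of regions, and each access is routed to whichever device owns that address, with the offset made relative to the region. All names here (`MemoryRegion`, `AddressSpace`, `ToyTimer`, the base address `0x10000000`) are assumptions for illustration. They only mirror the shape of the concept, not QEMU's actual C structures or signatures.

```python
class MemoryRegion:
    """A named window of guest-physical address space backed by callbacks."""
    def __init__(self, name, size, read, write):
        self.name, self.size, self.read, self.write = name, size, read, write

class AddressSpace:
    """Routes guest-physical accesses to the subregion that contains them."""
    def __init__(self):
        self.subregions = []  # list of (base address, MemoryRegion)

    def add_subregion(self, base, region):
        self.subregions.append((base, region))

    def _find(self, addr):
        for base, region in self.subregions:
            if base <= addr < base + region.size:
                return region, addr - base  # device sees a region-relative offset
        raise ValueError(f"unmapped guest address {addr:#x}")

    def read(self, addr):
        region, offset = self._find(addr)
        return region.read(offset)

    def write(self, addr, value):
        region, offset = self._find(addr)
        region.write(offset, value)

class ToyTimer:
    """Hypothetical device with two 32-bit registers at offsets 0x0 and 0x4."""
    def __init__(self):
        self.regs = {0x0: 0, 0x4: 0}
        self.region = MemoryRegion("toy-timer", 0x8, self.mmio_read, self.mmio_write)

    def mmio_read(self, offset):
        return self.regs[offset]

    def mmio_write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF  # registers are 32 bits wide

system = AddressSpace()
timer = ToyTimer()
system.add_subregion(0x1000_0000, timer.region)
system.write(0x1000_0004, 0x1_0000_0042)  # upper bits are truncated by the device
```

The device never sees absolute addresses and cannot touch anything outside its own 8-byte window, which is the point of constraining it to a subregion.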
TCG Internals #
The execution loop checks for pending interrupts and otherwise executes translated blocks (TBs). The JIT compiler dynamically translates blocks of instructions from the target architecture to the host architecture. For each block, QEMU looks for an existing TB, generates one if none is found, and then runs it.
The code is generated by gen_intermediate_code (the frontend, which generates IR) and tcg_gen_code (the backend, which generates host assembly). gen_intermediate_code is generic but wraps target-specific hooks (e.g. i386_tr_ops) and calls into a translator loop, which processes each instruction and emits TCG IR operations for the TB. From this IR, the host code (the “target” from TCG’s point of view) can be generated and run by QEMU. However, a few instructions can’t be expressed directly in IR (e.g. PPC mtmsr, which writes the machine state register). For these, the TB calls a TCG helper: a precompiled C function that can still modify the CPU state.
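Here is a toy Python sketch of that loop, under heavy simplifying assumptions: the "guest ISA" is a made-up four-instruction machine, a TB is just a Python closure covering instructions up to the next branch, and interrupts are a plain list. The names `translate_block` and `cpu_exec` are illustrative and the bodies have nothing to do with QEMU's real implementation.

```python
def translate_block(program, pc):
    """Toy 'translation': bundle guest instructions up to the next branch into a closure."""
    ops = []
    while True:
        insn = program[pc]
        ops.append((pc, insn))
        pc += 1
        if insn[0] in ("jnz", "halt"):  # a TB ends at a control-flow instruction
            break

    def run(cpu):
        for insn_pc, insn in ops:
            if insn[0] == "add":
                cpu["acc"] += insn[1]
            elif insn[0] == "dec":
                cpu["counter"] -= 1
            elif insn[0] == "jnz":
                return insn[1] if cpu["counter"] != 0 else insn_pc + 1
            elif insn[0] == "halt":
                return None  # no next pc: stop the machine
    return run

def cpu_exec(program, cpu, pending_irqs):
    """Check interrupts first, else find (or generate) the TB for pc and run it."""
    tb_cache = {}
    stats = {"translated": 0, "executed": 0, "irqs": 0}
    pc = 0
    while pc is not None:
        if pending_irqs:
            pending_irqs.pop()  # a real emulator would dispatch to a handler here
            stats["irqs"] += 1
            continue
        tb = tb_cache.get(pc)
        if tb is None:  # cache miss: translate once, reuse afterwards
            tb = tb_cache[pc] = translate_block(program, pc)
            stats["translated"] += 1
        pc = tb(cpu)
        stats["executed"] += 1
    return stats

# Loop body at pc 0..2 runs three times, then falls through to halt at pc 3.
program = [("add", 5), ("dec",), ("jnz", 0), ("halt",)]
cpu = {"acc": 0, "counter": 3}
stats = cpu_exec(program, cpu, pending_irqs=["timer"])
```

The loop body is translated once and executed three times, which is where a JIT wins over a plain interpreter.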
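The frontend/backend split and the helper escape hatch can be sketched in Python. The two function names are borrowed from the text above, but everything else is an assumption for illustration: the two-instruction guest ISA, the IR op names (`movi`, `add`, `call`), the simplified `mtmsr` semantics, and the use of generated Python source as a stand-in for host assembly.

```python
def helper_mtmsr(env, value):
    """Stand-in for a precompiled helper: runs as ordinary code, may touch any CPU state."""
    env["msr"] = value & 0xFFFF
    env["irq_enabled"] = bool(value & 0x8000)  # toy side effect of the register write

HELPERS = {"mtmsr": helper_mtmsr}

def gen_intermediate_code(guest_insns):
    """Frontend: guest instructions -> architecture-neutral IR ops."""
    ir = []
    for insn in guest_insns:
        if insn[0] == "addi":
            _, rd, rs, imm = insn
            ir.append(("movi", "tmp0", imm))     # load the immediate into a temp
            ir.append(("add", rd, rs, "tmp0"))
        elif insn[0] == "mtmsr":
            ir.append(("call", "mtmsr", insn[1]))  # too complex for IR: call a helper
        else:
            raise ValueError(f"unknown guest instruction {insn[0]!r}")
    return ir

def tcg_gen_code(ir):
    """Backend: IR ops -> 'host code' (here, Python source compiled to a function)."""
    lines = ["def tb(env, helpers):"]
    for op in ir:
        if op[0] == "movi":
            lines.append(f"    env[{op[1]!r}] = {op[2]}")
        elif op[0] == "add":
            lines.append(
                f"    env[{op[1]!r}] = (env[{op[2]!r}] + env[{op[3]!r}]) & 0xFFFFFFFF")
        elif op[0] == "call":
            lines.append(f"    helpers[{op[1]!r}](env, env[{op[2]!r}])")
    lines.append("    return None")
    namespace = {}
    exec("\n".join(lines), namespace)  # "assemble" the generated source
    return namespace["tb"]

guest_tb = [("addi", "r3", "r3", 0x8000), ("mtmsr", "r3")]
ir = gen_intermediate_code(guest_tb)
tb = tcg_gen_code(ir)

env = {"r3": 0x0002, "msr": 0, "irq_enabled": False}
tb(env, HELPERS)
```

Only the backend knows about the host, so supporting a new guest means writing a new frontend and supporting a new host means writing a new backend.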
Memory #
QEMU allocates a big chunk of memory for the guest (the host OS deals with this). QEMU also has a softmmu that translates guest virtual addresses to physical ones, along with virtual Translation Lookaside Buffers (vTLBs): on a hit, the vTLB directly yields the final host memory address. QEMU’s LDST labels handle the slow path taken on a TLB miss.
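The fast-path/slow-path split can be sketched with a toy software MMU in Python. The assumptions are loud ones: 4 KiB pages, a single-level page table stored as a dict, a TLB that is also a dict (QEMU's is a fixed-size table), and a `bytearray` standing in for the host buffer that backs guest RAM. `SoftMMU` and its methods are hypothetical names.

```python
PAGE_BITS = 12
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1

class SoftMMU:
    def __init__(self, page_table, ram_size):
        self.page_table = page_table    # guest virtual page -> guest physical page
        self.ram = bytearray(ram_size)  # host buffer backing guest RAM
        self.tlb = {}                   # guest virtual page -> host offset of that page
        self.misses = 0

    def _slow_path(self, vpn):
        """TLB miss: walk the page table and cache the final host location."""
        self.misses += 1
        if vpn not in self.page_table:
            raise LookupError(f"guest page fault at virtual page {vpn:#x}")
        host_page = self.page_table[vpn] << PAGE_BITS
        self.tlb[vpn] = host_page
        return host_page

    def _host_offset(self, vaddr):
        vpn = vaddr >> PAGE_BITS
        host_page = self.tlb.get(vpn)   # fast path: one lookup, no page walk
        if host_page is None:
            host_page = self._slow_path(vpn)
        return host_page + (vaddr & PAGE_MASK)

    def load8(self, vaddr):
        return self.ram[self._host_offset(vaddr)]

    def store8(self, vaddr, value):
        self.ram[self._host_offset(vaddr)] = value & 0xFF

mmu = SoftMMU(page_table={0x400: 0x1, 0x401: 0x0}, ram_size=2 * PAGE_SIZE)
mmu.store8(0x400010, 0xAB)   # miss: walks the page table and fills the TLB
value = mmu.load8(0x400010)  # hit: resolved straight from the TLB
```

The second access never touches the page table, which is what the vTLB buys: the cached entry already points at the final host location.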
[1] You can make bus devices, interrupt controllers, timers and a PCI controller via QEMU APIs.