Currently there are two mechanisms for handling guest MMIO/PIO accesses in KVM: returning KVM_EXIT_MMIO/KVM_EXIT_IO from ioctl(KVM_RUN), and ioeventfd. In the first case, KVM exits back to QEMU, which then forwards the access to the emulated device. The traditional dispatch mechanism looks like this:
In the second case, the ioeventfd mechanism can be used for posted doorbell writes. A guest write to a registered address signals the provided eventfd instead of triggering an exit to userspace. This lets the host be notified in a lightweight way (a so-called "lightweight vmexit"). It suits triggers that need to send a notification asynchronously and return as quickly as possible. ioeventfd can also be dispatched through QEMU (using KVM_EXIT_MMIO/KVM_EXIT_IO from ioctl(KVM_RUN)) when kvm_eventfds_allowed is false, but this results in lower performance. Benchmarking shows that using KVM ioeventfd is about 30% or more faster.
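The doorbell semantics described above can be illustrated with a plain Linux eventfd. This is only a userspace sketch: with ioeventfd the signal is raised by KVM when the guest writes to the registered address, whereas here `doorbell_ring()` stands in for that write. The `doorbell_*` helper names are hypothetical and chosen for this example.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd with an initial counter of zero. */
int doorbell_create(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Stand-in for the guest write: add 1 to the eventfd counter. */
int doorbell_ring(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side: read and reset the counter.
 * Returns the number of rings since the last drain, or 0 if none are
 * pending (the non-blocking read fails with EAGAIN in that case).
 */
uint64_t doorbell_drain(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

Rings that arrive before the consumer reads are coalesced into a single counter value, which is why this mechanism fits notifications that carry no payload and need no reply.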
The ioregionfd mechanism is proposed for faster in-kernel device dispatching. The control plane is a KVM VM ioctl, ioctl(KVM_SET_IOREGION), which registers MMIO/PIO regions. The ioctl has to be provided with read and write file descriptors, which the wire protocol uses for communication. Regions registered with ioregionfd should not overlap each other and should not overlap with ioeventfd regions, so that only one mechanism handles any given MMIO/PIO access. A region can be deleted by setting its fd to -1.
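The non-overlap rule can be checked with simple interval arithmetic before a region is registered. The sketch below is illustrative only: `struct demo_region` and `demo_regions_overlap()` are hypothetical names, not part of the KVM API, and a region is assumed to be described by a start address and a size in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a guest MMIO/PIO region. */
struct demo_region {
    uint64_t start; /* first guest address covered by the region */
    uint64_t size;  /* length in bytes */
};

/*
 * Return true if the two regions share at least one address.
 * The comparison uses differences rather than start + size so that a
 * region ending at the top of the address space cannot overflow.
 */
bool demo_regions_overlap(const struct demo_region *a,
                          const struct demo_region *b)
{
    if (a->size == 0 || b->size == 0)
        return false;
    if (a->start <= b->start)
        return b->start - a->start < a->size;
    return a->start - b->start < b->size;
}
```

For example, regions at 0x1000 and 0x1100 that are each 0x100 bytes long are adjacent and do not overlap, while a 0x10-byte region at 0x1080 falls inside the first one.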
The data plane is a bidirectional message protocol (the wire protocol) that ioregionfd uses to communicate with the emulated device. The device reads commands with the following layout from the file descriptor:
The info field layout is as follows:
Thus a device emulation task can use a run loop with the following code:
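As a rough, self-contained illustration of such a run loop, the sketch below serves read and write commands against a small in-memory register file. Everything prefixed `demo_` is an assumption made for this example: the command and response structures, the bit positions inside `info`, and the rule that reads always get a response while writes get one only when the response bit is set. The authoritative definitions are those in the ioregionfd patches. The sketch also assumes a little-endian host.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wire format; names and bit positions are assumptions. */
struct demo_cmd {
    uint32_t info;      /* bits 0-3: command, bits 4-5: log2(size), bit 6: response wanted */
    uint32_t padding;
    uint64_t user_data; /* opaque token chosen when the region was registered */
    uint64_t offset;    /* byte offset of the access within the region */
    uint64_t data;      /* value being written (unused for reads) */
};

struct demo_resp {
    uint64_t data;      /* value read (zero when acknowledging a write) */
    uint8_t pad[24];
};

enum { DEMO_CMD_READ = 0, DEMO_CMD_WRITE = 1 };

/* Read exactly len bytes: 1 on success, 0 on clean EOF, -1 on error. */
static int read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);

        if (n == 0)
            return done == 0 ? 0 : -1;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return 1;
}

/* Write exactly len bytes: 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, (const char *)buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

/* Handle one command: 1 if handled, 0 on EOF, -1 on error. */
int demo_handle_one(int rfd, int wfd, uint8_t *regs, size_t regs_len)
{
    struct demo_cmd cmd;
    struct demo_resp resp;
    int ret = read_full(rfd, &cmd, sizeof(cmd));

    if (ret <= 0)
        return ret;

    uint32_t op = cmd.info & 0xf;
    size_t nbytes = (size_t)1 << ((cmd.info >> 4) & 0x3);
    int want_resp = (cmd.info >> 6) & 0x1;
    /* Overflow-safe bounds check; out-of-range accesses are ignored. */
    int in_range = cmd.offset <= regs_len && nbytes <= regs_len - cmd.offset;

    memset(&resp, 0, sizeof(resp));
    if (op == DEMO_CMD_READ) {
        if (in_range)
            memcpy(&resp.data, regs + cmd.offset, nbytes);
        want_resp = 1; /* a read always needs its data sent back */
    } else if (op == DEMO_CMD_WRITE) {
        if (in_range)
            memcpy(regs + cmd.offset, &cmd.data, nbytes);
    } else {
        return -1; /* unknown command */
    }

    if (want_resp && write_full(wfd, &resp, sizeof(resp)) < 0)
        return -1;
    return 1;
}

/* Serve commands until the peer closes the descriptor or an error occurs. */
void demo_run_loop(int rfd, int wfd, uint8_t *regs, size_t regs_len)
{
    while (demo_handle_one(rfd, wfd, regs, regs_len) > 0)
        ;
}
```

A device emulation task would call `demo_run_loop()` with its ends of the read and write file descriptors that were passed to ioctl(KVM_SET_IOREGION), typically from a dedicated thread or process, so that the vCPU task is not involved in the dispatch.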
ioregionfd improves performance by eliminating the need for the vCPU task to forward MMIO/PIO exits to device emulation tasks:
ioregionfd API design discussions can be found here and here.