Currently there are two mechanisms for handling guest MMIO/PIO accesses in KVM: returning KVM_EXIT_MMIO/KVM_EXIT_IO from ioctl(KVM_RUN) and ioeventfd. In the first case KVM exits back to qemu and then forward the access to emulated device. The traditional dispatch mechanism looks like this:

kvm.ko  <---ioctl(KVM_RUN)---> VMM vCPU task <---messages---> device task

In the second case ioeventfd mechanism can be used for the posted doorbell writes. A guest write in the registered address will signal the provided event instead of triggering an exit. This allows host to be notified in a lightweight way (this is called a «lightweight vmexit»). This is suitable for triggers which want to transmit a notify asynchronously and return as quickly as possible. ioeventfd can be also dispatched through QEMU (using KVM_EXIT_MMIO/KVM_EXIT_IO from ioctl(KVM_RUN)) when kvm_eventfds_allowed is false. This will lead to a lower performance. The benchmarking shows that using KVM ioeventfd is about 30+% faster.

ioregionfd mechanism is suggested to be used for faster in-kernel device dispatching. The control plane is KVM vm ioctl(KVM_SET_IOREGION) for registering MMIO/PIO regions. ioctl(KVM_SET_IOREGION) has to be provided with read/write file descriptors which will be used by wire protocol for communication. ioregionfd registered regions should not be overlapping and should not overlap with ioeventfd. Only one mechanism handles a MMIO/PIO access. Regions can be deleted by setting fd to -1.

struct kvm_ioregion {
    __u64 guest_paddr; /* guest physical address */
    __u64 memory_size; /* bytes */
    __u64 user_data;
    __s32 rfd;
    __s32 wfd;
    __u32 flags;
    __u8  pad[28];
};

The data plane is a bi-directional message protocol (wire protocol) that ioregionfd uses to communicate with emulated device. The device reads commands from the file descriptor with the following layout:

struct ioregionfd_cmd {
    __u32 info;
    __u32 padding;
    __u64 user_data;
    __u64 offset;
    __u64 data;
};

The info field layout is as follows::

bits:  | 31 ... 8 |  6   | 5 ... 4 | 3 ... 0 |
field: | reserved | resp |   size  |   cmd   |

Thus a device emulation task can use a run loop with the following code:

switch (cmd.info & IOREGIONFD_CMD_MASK) {
case IOREGIONFD_CMD_READ:
    /* It's a read access */
    break;
case IOREGIONFD_CMD_WRITE:
    /* It's a write access */
    break;
default:
    /* Protocol violation, terminate connection */
}

ioregionfd improves performance by eliminating the need for the vCPU task to forward MMIO/PIO exits to device emulation tasks:

kvm.ko       <---------ioctl(KVM_RUN)-------> VMM vCPU task
  ^ 
ioregionfd   -------------------------------> device task

ioregionfd API design discussions can be found here and here.