Task
Resolution: Unresolved
Tracing of the qemu-kvm process in the customer case has shown that ppoll() is a relatively expensive system call, especially with many file descriptors.
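For reference, the pattern being measured can be sketched roughly as follows. This is a simplified illustration and not QEMU's actual event loop: `wait_and_clear()` and `MAX_FDS` are hypothetical names. It shows why the cost grows with the number of file descriptors: the kernel has to walk the whole pollfd array on every ppoll() call, and each signalled eventfd then needs an additional read() to reset it.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_FDS 64 /* hypothetical limit, for this sketch only */

/*
 * Hypothetical helper (not QEMU code): block in ppoll() until at least one
 * eventfd is readable, then clear every ready eventfd with its own read().
 * Returns the number of eventfds that were signalled, or -1 on error.
 */
static int wait_and_clear(const int *efds, int nfds)
{
    struct pollfd pfds[MAX_FDS];
    int handled = 0;
    int i;

    if (nfds < 0 || nfds > MAX_FDS) {
        return -1;
    }

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = efds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* One syscall, but the kernel inspects every entry in the array. */
    if (ppoll(pfds, (nfds_t)nfds, NULL, NULL) < 0) {
        return -1;
    }

    for (i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            uint64_t counter;

            /* One more syscall per ready eventfd, just to reset its counter. */
            if (read(pfds[i].fd, &counter, sizeof(counter)) !=
                (ssize_t)sizeof(counter)) {
                return -1;
            }
            handled++;
        }
    }
    return handled;
}
```

With N eventfds of which k are signalled, one iteration costs 1 + k syscalls plus an O(N) scan inside the kernel.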
We could probably reduce latencies by using io_uring. QEMU has fdmon-io_uring.c, but it is effectively dead code today because it automatically disables itself in every code path.
However, the existing implementation lacks the optimisation that would make this most interesting. Instead of using IORING_OP_POLL_ADD to poll eventfds and then clearing them with separate read() syscalls, we can let io_uring read from the eventfd directly. When the read completes, we know that an event has happened, and nothing further is needed to clear the eventfd.
A related question is whether one eventfd per virtqueue is optimal, or whether one eventfd per device and iothread would be enough.
The goal of this task is to explore the options that io_uring could give us to improve performance and possibly implement a solution upstream.