German Maglione 9d4fc177aa vhost-user-backend: bump up MAX_MEM_SLOTS to 509
Let's support up to 509 mem slots, just like vhost in the kernel usually
does. This is required to properly support memory hotplug, either using
multiple DIMMs (ACPI supports up to 256) or using virtio-mem.

The 509 used to be the KVM limit: it supported 512, but 3 were
reserved for internal purposes. Currently, KVM supports more than 512, but
it usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot
memory), except when other memory devices like PCI devices with BARs are
used. So, 509 seems to work well for vhost in the kernel.

Details can be found in the QEMU change that made virtio-mem consume
up to 256 mem slots across all virtio-mem devices. [1]

509 mem slots implies 509 VMAs/mappings in the worst case (even though,
in practice with virtio-mem we won't be seeing more than ~260 in most
setups).

With max_map_count under Linux defaulting to 64k, 509 mem slots
still correspond to less than 1% of the maximum number of mappings.
There are plenty left for the application to consume.

[1] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: German Maglione <gmaglione@redhat.com>
2024-01-12 15:31:51 +01:00

vhost-user-backend

Design

The vhost-user-backend crate provides a framework to implement vhost-user backend services, which includes the following external public APIs:

  • A daemon control object (VhostUserDaemon) to start and stop the service daemon.
  • A vhost-user backend trait (VhostUserBackendMut) to handle vhost-user control messages and virtio messages.
  • A vring access trait (VringT) to access virtio queues, and three implementations of the trait: VringState, VringMutex and VringRwLock.
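The split between one access trait and several lock-based implementations can be sketched in plain std Rust. The names below (QueueAccess, MutexQueue, RwLockQueue) are invented stand-ins for VringT, VringMutex and VringRwLock, not the crate's real types:

```rust
use std::sync::{Mutex, RwLock};

// Minimal stand-in for per-queue state.
struct Queue {
    next_used: u16,
}

// One trait, several lock strategies behind it.
trait QueueAccess {
    fn add_used(&self, n: u16);
    fn used(&self) -> u16;
}

// Counterpart to VringMutex: an exclusive lock for every access.
struct MutexQueue(Mutex<Queue>);
impl QueueAccess for MutexQueue {
    fn add_used(&self, n: u16) {
        self.0.lock().unwrap().next_used += n;
    }
    fn used(&self) -> u16 {
        self.0.lock().unwrap().next_used
    }
}

// Counterpart to VringRwLock: many concurrent readers, one writer.
struct RwLockQueue(RwLock<Queue>);
impl QueueAccess for RwLockQueue {
    fn add_used(&self, n: u16) {
        self.0.write().unwrap().next_used += n;
    }
    fn used(&self) -> u16 {
        self.0.read().unwrap().next_used
    }
}
```

Code written against the trait works unchanged with either lock strategy, which is the point of keeping VringT separate from its implementations.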

Usage

The main interface provided by the vhost-user-backend library is the VhostUserDaemon struct:

pub struct VhostUserDaemon<S, V, B = ()>
where
    S: VhostUserBackend<V, B>,
    V: VringT<GM<B>> + Clone + Send + Sync + 'static,
    B: Bitmap + 'static,
{
    pub fn new(name: String, backend: S, atomic_mem: GuestMemoryAtomic<GuestMemoryMmap<B>>) -> Result<Self>;
    pub fn start(&mut self, listener: Listener) -> Result<()>;
    pub fn wait(&mut self) -> Result<()>;
    pub fn get_epoll_handlers(&self) -> Vec<Arc<VringEpollHandler<S, V, B>>>;
}

Create a VhostUserDaemon Instance

The VhostUserDaemon::new() method creates an instance of the VhostUserDaemon object. The client needs to pass in a VhostUserBackend object, which will be used to configure the VhostUserDaemon instance, handle control messages from the vhost-user frontend and handle virtio requests from virtio queues. A group of working threads will be created to handle virtio requests from the configured virtio queues.
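As a rough, std-only illustration of the per-queue working threads: each worker is fed over a channel and runs until its sender is dropped. This is a stand-in for the idea only, not the crate's actual epoll-driven threads:

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Spawn one worker thread per "virtio queue". Each worker sums the request
// payloads it receives and returns the total when its channel closes.
fn spawn_queue_workers(num_queues: usize) -> Vec<(Sender<u32>, JoinHandle<u32>)> {
    (0..num_queues)
        .map(|_| {
            let (tx, rx) = channel::<u32>();
            // rx.iter() yields until every Sender is dropped.
            let handle = thread::spawn(move || rx.iter().sum::<u32>());
            (tx, handle)
        })
        .collect()
}
```

Each queue getting its own thread is what lets one slow queue avoid stalling the others, at the cost of the locking discipline described under "Threading Model" below.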

Start the VhostUserDaemon

The VhostUserDaemon::start() method waits for an incoming connection from a vhost-user frontend on the listener. Once a connection is ready, a main thread will be created to handle vhost-user messages from the vhost-user frontend.

Stop the VhostUserDaemon

The VhostUserDaemon::wait() method waits for the main thread to exit. An exit event must be sent to the main thread by writing to the exit_event EventFd before waiting for it to exit, otherwise the call blocks indefinitely.
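The exit-then-wait ordering matters: waiting blocks until the main thread exits, and the main thread exits only after seeing the exit event. A std-only analogy, using an mpsc channel closing in place of the exit_event EventFd (MiniDaemon is an invented name, not the crate's API):

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Illustrative stand-in for the daemon's main thread: it services "messages"
// until the exit signal (here: the channel closing) arrives.
struct MiniDaemon {
    main_thread: Option<JoinHandle<u32>>,
    exit_tx: Option<Sender<u32>>,
}

impl MiniDaemon {
    fn start() -> Self {
        let (tx, rx) = channel::<u32>();
        let main_thread = thread::spawn(move || {
            // Count messages; recv() errors once all senders are gone.
            let mut handled = 0;
            while rx.recv().is_ok() {
                handled += 1;
            }
            handled
        });
        Self { main_thread: Some(main_thread), exit_tx: Some(tx) }
    }

    // Like wait(): deliver the "exit event" first (drop the sender),
    // then join the main thread.
    fn wait(mut self) -> u32 {
        self.exit_tx.take();
        self.main_thread.take().unwrap().join().unwrap()
    }
}
```

If the exit signal were never delivered, the join would block forever, which is exactly why the exit_event write must precede the wait.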

Threading Model

The main thread and the virtio queue working threads access the underlying virtio queues concurrently, so every virtio queue must be safe to use from multiple threads. The main thread, however, only touches the virtio queues for configuration, so clients can adopt a locking policy optimized for the virtio queue working threads.

Example

Example code to handle virtio messages from a virtio queue:

impl VhostUserBackendMut for VhostUserService {
    fn process_queue(&mut self, vring: &VringMutex) -> Result<bool> {
        let mut used_any = false;
        // Guest memory handle, needed when reading/writing descriptor buffers.
        let _mem = match &self.mem {
            Some(m) => m.memory(),
            None => return Err(Error::NoMemoryConfigured),
        };

        let mut vring_state = vring.get_mut();

        while let Some(avail_desc) = vring_state
            .get_queue_mut()
            .iter()
            .map_err(|_| Error::IterateQueue)?
            .next()
        {
            // Remember the head index so the chain can be returned to the ring.
            let head_index = avail_desc.head_index();

            // Process the request...
            used_any = true;

            if self.event_idx {
                if vring_state.add_used(head_index, 0).is_err() {
                    warn!("Couldn't return used descriptors to the ring");
                }

                match vring_state.needs_notification() {
                    Err(_) => {
                        warn!("Couldn't check if queue needs to be notified");
                        vring_state.signal_used_queue().unwrap();
                    }
                    Ok(needs_notification) => {
                        if needs_notification {
                            vring_state.signal_used_queue().unwrap();
                        }
                    }
                }
            } else {
                if vring_state.add_used(head_index, 0).is_err() {
                    warn!("Couldn't return used descriptors to the ring");
                }
                vring_state.signal_used_queue().unwrap();
            }
        }

        Ok(used_any)
    }
}

Xen support

Supporting Xen requires special handling while mapping the guest memory. The vm-memory crate implements xen memory mapping support via a separate feature xen, and this crate uses the same feature name to enable Xen support.

Also, for xen mappings, the memory regions passed by the frontend contain a few extra fields, as described in the vhost-user protocol documentation.

It was decided by the rust-vmm maintainers to keep the interface simple and build the crate for either standard Unix memory mapping or Xen, and not both.
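For example, a backend targeting Xen would enable the feature in its own Cargo.toml (the version shown is illustrative; pick the release you actually target):

```toml
[dependencies]
vhost-user-backend = { version = "0.12", features = ["xen"] }
```

Because the choice is made at build time, a single binary supports either standard Unix memory mapping or Xen, never both.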

License

This project is licensed under