vhost_user: fix replies without GET_PROTOCOL_FEATURES

I observed hangs in the following situation when using Cloud
Hypervisor with virtiofsd:

1. Start virtiofsd
2. Start Cloud Hypervisor instance 1, connected to virtiofsd.
3. Start Cloud Hypervisor instance 2, waiting for migration.
4. Migrate VM from Cloud Hypervisor instance 1 to 2.
5. Start virtiofsd again.

The hangs happened because Cloud Hypervisor remembered, as part of the
migration data, which vhost-user protocol features the backend for its
fs device supported.  Instance 2 therefore never sent
GET_PROTOCOL_FEATURES to the second invocation of virtiofsd.  This
should work, but it didn't, because update_reply_ack_flag() checked
whether self.protocol_features contained GET_PROTOCOL_FEATURES, but
self.protocol_features is only filled in when GET_PROTOCOL_FEATURES is
called.  As a result, Cloud Hypervisor expected a reply that virtiofsd
would never send.

Since REPLY_ACK is handled entirely by the vhost-user library, and not
by the backend, there's no need to ask the backend whether it supports
REPLY_ACK in the first place, so we can just drop the check for that
from update_reply_ack_flag().  We know that we always support it, so
we just need to check whether the backend has acked it.  This fixes
the hang described above.

Since we will now always reply if the backend acks the feature,
REPLY_ACK is now always included in the set of features returned by
GET_PROTOCOL_FEATURES, just like with XEN_MMAP (when enabled).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
This commit is contained in:
Alyssa Ross 2025-03-27 13:34:58 +01:00 committed by Stefano Garzarella
parent 8c7cc594ac
commit 14db3cd165
3 changed files with 9 additions and 6 deletions

View file

@ -59,7 +59,8 @@ impl VhostUserBackendMut for MockVhostBackend {
}
fn protocol_features(&self) -> VhostUserProtocolFeatures {
VhostUserProtocolFeatures::all()
// Exclude REPLY_ACK to test that it is automatically added.
VhostUserProtocolFeatures::all() - VhostUserProtocolFeatures::REPLY_ACK
}
fn reset_device(&mut self) {

View file

@ -5,6 +5,10 @@
- [[#268]](https://github.com/rust-vmm/vhost/pull/268) Add support for `VHOST_USER_GET_SHARED_OBJECT`
### Changed
- [[#290]](https://github.com/rust-vmm/vhost/pull/290) Backends now
always support `VHOST_USER_PROTOCOL_F_REPLY_ACK`, without the
`VhostUserBackendReqHandler` implementation having to include it in
the features returned from `get_protocol_features`.
### Deprecated

View file

@ -327,7 +327,6 @@ pub struct BackendReqHandler<S: VhostUserBackendReqHandler> {
virtio_features: u64,
acked_virtio_features: u64,
protocol_features: VhostUserProtocolFeatures,
acked_protocol_features: u64,
// sending ack for messages without payload
@ -347,7 +346,6 @@ impl<S: VhostUserBackendReqHandler> BackendReqHandler<S> {
backend,
virtio_features: 0,
acked_virtio_features: 0,
protocol_features: VhostUserProtocolFeatures::empty(),
acked_protocol_features: 0,
reply_ack_enabled: false,
error: None,
@ -512,7 +510,9 @@ impl<S: VhostUserBackendReqHandler> BackendReqHandler<S> {
}
Ok(FrontendReq::GET_PROTOCOL_FEATURES) => {
self.check_request_size(&hdr, size, 0)?;
let features = self.backend.get_protocol_features()?;
// REPLY_ACK is implemented entirely by this library, so it's always supported.
let features =
self.backend.get_protocol_features()? | VhostUserProtocolFeatures::REPLY_ACK;
// Enable the `XEN_MMAP` protocol feature for backends if xen feature is enabled.
#[cfg(feature = "xen")]
@ -520,7 +520,6 @@ impl<S: VhostUserBackendReqHandler> BackendReqHandler<S> {
let msg = VhostUserU64::new(features.bits());
self.send_reply_message(&hdr, &msg)?;
self.protocol_features = features;
self.update_reply_ack_flag();
}
Ok(FrontendReq::SET_PROTOCOL_FEATURES) => {
@ -952,7 +951,6 @@ impl<S: VhostUserBackendReqHandler> BackendReqHandler<S> {
let pflag = VhostUserProtocolFeatures::REPLY_ACK;
self.reply_ack_enabled = (self.virtio_features & vflag) != 0
&& self.protocol_features.contains(pflag)
&& (self.acked_protocol_features & pflag.bits()) != 0;
}