# What
This commit introduces file-based advisory locking for the files backing
the block devices by using the fcntl() syscall with OFD locks. Open file
description (OFD) locks are more robust than traditional POSIX locks
(F_SETLK) as they are tied to the open file description rather than to
the process ID, and avoid
common issues in multithreaded or multi-fd scenarios [1]. Therefore,
we don't use `std::fs::File::try_lock()`, which is not backed by OFD
locks (its documentation describes it as corresponding to flock() on Unix).
The locking mechanism is aware of the `readonly` property and allows
`n` readers or `1` writer (exclusive mode).
As the locks are advisory, multiple cloud-hypervisor processes can
prevent each other from writing to the same file. However, this is not
a system-wide, file-system-level locking mechanism that prevents other
processes from calling open() on the file.
The new locking mechanism does not cover vhost-user devices.
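For illustration, the `n` readers or `1` writer behaviour can be sketched
with a thin wrapper around fcntl(F_OFD_SETLK). This is a minimal,
dependency-free sketch and not the code of this commit: `ofd_try_lock`,
`LockType` and `Flock` are hypothetical names, and the constants and the
struct layout are assumed to match 64-bit Linux (x86_64/aarch64).

```rust
use std::fs::File;
use std::io;
use std::os::raw::{c_int, c_short};
use std::os::unix::io::AsRawFd;

// Assumed values from the Linux UAPI headers (asm-generic/fcntl.h),
// hard-coded only to keep the sketch free of external crates.
const F_OFD_SETLK: c_int = 37; // non-blocking set/clear of an OFD lock
const F_RDLCK: c_short = 0; // shared (read) lock
const F_WRLCK: c_short = 1; // exclusive (write) lock
const F_UNLCK: c_short = 2; // release the lock
const SEEK_SET: c_short = 0;

/// Mirrors `struct flock` as laid out on 64-bit Linux (assumption).
#[repr(C)]
struct Flock {
    l_type: c_short,
    l_whence: c_short,
    l_start: i64,
    l_len: i64,
    l_pid: i32,
}

// `unsafe extern` needs Rust 1.82+; older toolchains use plain `extern "C"`.
unsafe extern "C" {
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
}

/// Hypothetical lock kinds: shared, exclusive, or release.
#[derive(Clone, Copy, Debug)]
pub enum LockType {
    Read,
    Write,
    Unlock,
}

/// Tries to set or clear an OFD lock on the whole file without blocking.
/// A conflicting lock held through another open file description makes
/// the call fail immediately with an OS error.
pub fn ofd_try_lock(file: &File, lock_type: LockType) -> io::Result<()> {
    let mut fl = Flock {
        l_type: match lock_type {
            LockType::Read => F_RDLCK,
            LockType::Write => F_WRLCK,
            LockType::Unlock => F_UNLCK,
        },
        l_whence: SEEK_SET,
        l_start: 0,
        l_len: 0, // 0 means "until EOF", i.e. the whole file
        l_pid: 0, // must be 0 for OFD locks
    };
    // SAFETY: the fd is valid for the lifetime of `file`, and `fl` is a
    // properly initialized struct that outlives the call.
    let ret = unsafe { fcntl(file.as_raw_fd(), F_OFD_SETLK, &mut fl as *mut Flock) };
    if ret == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(())
    }
}
```

With two independently opened handles on the same file, both can hold
`LockType::Read` at the same time, whereas `LockType::Write` only
succeeds while no other handle holds any lock. A process that never
calls such a function can still open and write the file, which is what
"advisory" means here.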
# Why
To prevent misconfiguration and improve safety, it is good practice to
protect disk image files with a locking mechanism. Experience and common
best practices suggest that advisory locks are preferable to mandatory
locks due to better compatibility and fewer file-system pitfalls.
The introduced functionality is aligned with the approach taken by
QEMU [0], and is also recommended in [1].
# Implementation Details
We need to ensure that not only normal operation keeps working but also
state save/resume and live migration. Especially for live migration,
it is crucial that the sender VMM releases the locks when the VM stops
so the receiver VMM can acquire them right after that.
Therefore, locking and releasing happen directly on the block
device struct. The device manager knows all block devices and can
forward lock and release requests to them.
Last but not least, this commit relies on explicit lock acquisition
but implicit lock release (closing the FD). It only releases locks
explicitly where this integrates more smoothly into the existing
code.
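The division of labour described above could look roughly like the
following sketch. It is illustrative only: `BlockDevice`,
`DeviceManager`, `add_disk`, `try_lock_image`, `unlock_image` and
`release_disk_locks` are hypothetical names rather than the types and
methods touched by this commit, and the fcntl constants and struct
layout are assumed for 64-bit Linux.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::raw::{c_int, c_short};
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Assumed Linux values (asm-generic/fcntl.h).
const F_OFD_SETLK: c_int = 37;
const F_RDLCK: c_short = 0;
const F_WRLCK: c_short = 1;
const F_UNLCK: c_short = 2;

/// Mirrors `struct flock` on 64-bit Linux (assumption).
#[repr(C)]
struct Flock {
    l_type: c_short,
    l_whence: c_short,
    l_start: i64,
    l_len: i64,
    l_pid: i32,
}

// `unsafe extern` needs Rust 1.82+; older toolchains use plain `extern "C"`.
unsafe extern "C" {
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
}

/// Sets or clears a whole-file OFD lock without blocking.
fn set_ofd_lock(file: &File, l_type: c_short) -> io::Result<()> {
    let mut fl = Flock { l_type, l_whence: 0, l_start: 0, l_len: 0, l_pid: 0 };
    // SAFETY: valid fd and a fully initialized struct that outlives the call.
    let ret = unsafe { fcntl(file.as_raw_fd(), F_OFD_SETLK, &mut fl as *mut Flock) };
    if ret == -1 { Err(io::Error::last_os_error()) } else { Ok(()) }
}

/// Hypothetical stand-in for a block device that owns its backing file.
pub struct BlockDevice {
    file: File,
    readonly: bool,
}

impl BlockDevice {
    pub fn open(path: &Path, readonly: bool) -> io::Result<Self> {
        let file = OpenOptions::new().read(true).write(!readonly).open(path)?;
        Ok(Self { file, readonly })
    }

    /// Explicit acquisition: shared lock if read-only, exclusive otherwise.
    pub fn try_lock_image(&self) -> io::Result<()> {
        set_ofd_lock(&self.file, if self.readonly { F_RDLCK } else { F_WRLCK })
    }

    /// Explicit release, e.g. when the VM stops on the migration sender.
    pub fn unlock_image(&self) -> io::Result<()> {
        set_ofd_lock(&self.file, F_UNLCK)
    }
}

/// Hypothetical stand-in for the device manager that knows all disks.
#[derive(Default)]
pub struct DeviceManager {
    block_devices: Vec<BlockDevice>,
}

impl DeviceManager {
    /// Opens and locks a disk image; the disk is only added on success.
    pub fn add_disk(&mut self, path: &Path, readonly: bool) -> io::Result<()> {
        let dev = BlockDevice::open(path, readonly)?;
        dev.try_lock_image()?;
        self.block_devices.push(dev);
        Ok(())
    }

    /// Forwards the release request to every known block device.
    pub fn release_disk_locks(&self) -> io::Result<()> {
        self.block_devices.iter().try_for_each(|dev| dev.unlock_image())
    }
}
```

In this sketch, dropping a `BlockDevice` closes its `File`, which
releases the lock implicitly. The explicit `release_disk_locks()` path
models the live-migration case, where the sender keeps its files open
but must let the receiver acquire the locks.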
# Testing
I tested
- normal operation,
- state save/resume,
- device hot plugging,
- and live migration
with read/shared and write/exclusive locks.
One can use the `fcntl-tool` to test if locks are actually acquired
or released [2].
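Independently of that tool, a lock can also be probed programmatically
with fcntl(F_OFD_GETLK). The following is a minimal sketch under the
same assumptions as OFD locks on 64-bit Linux in general (hard-coded
constants and struct layout); `probe_lock`, `LockState` and `Flock` are
hypothetical names and not part of this commit or of `fcntl-tool`.

```rust
use std::fs::File;
use std::io;
use std::os::raw::{c_int, c_short};
use std::os::unix::io::AsRawFd;

// Assumed Linux values (asm-generic/fcntl.h).
const F_OFD_GETLK: c_int = 36;
const F_RDLCK: c_short = 0;
const F_WRLCK: c_short = 1;
const F_UNLCK: c_short = 2;

/// Mirrors `struct flock` on 64-bit Linux (assumption).
#[repr(C)]
struct Flock {
    l_type: c_short,
    l_whence: c_short,
    l_start: i64,
    l_len: i64,
    l_pid: i32,
}

// `unsafe extern` needs Rust 1.82+; older toolchains use plain `extern "C"`.
unsafe extern "C" {
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
}

#[derive(Debug, PartialEq, Eq)]
pub enum LockState {
    Unlocked,
    Shared,
    Exclusive,
}

/// Asks the kernel which lock, held through another open file
/// description, would conflict with an exclusive whole-file lock.
/// Locks held through `file` itself are not reported.
pub fn probe_lock(file: &File) -> io::Result<LockState> {
    let mut fl = Flock { l_type: F_WRLCK, l_whence: 0, l_start: 0, l_len: 0, l_pid: 0 };
    // SAFETY: valid fd and a fully initialized struct that outlives the call.
    let ret = unsafe { fcntl(file.as_raw_fd(), F_OFD_GETLK, &mut fl as *mut Flock) };
    if ret == -1 {
        return Err(io::Error::last_os_error());
    }
    // The kernel overwrites `l_type` with F_UNLCK if nothing conflicts,
    // or with the type of one conflicting lock otherwise.
    Ok(match fl.l_type {
        F_UNLCK => LockState::Unlocked,
        F_RDLCK => LockState::Shared,
        _ => LockState::Exclusive,
    })
}
```

Opening the disk image read-only from a separate process and calling
`probe_lock()` before and after stopping the VM is one way to observe
whether a lock was acquired or released.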
# Links
[0] 825b96dbce/util/osdep.c (L266)
[1] https://apenwarr.ca/log/20101213
[2] https://crates.io/crates/fcntl-tool
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com