This commit is part of a series of similar commits.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This commit is the first in a series of similar commits to clean up
obsolete `extern crate` statements.
Since Rust 1.30, normal macros can be imported via `use`, and with Rust
1.31 and edition 2018 this has become the preferred approach.
`extern crate` is only needed for `alloc` in `no_std` crates, which does
not apply here.
By dropping these (often redundant or odd) `extern crate` lines, we
expose the actual dependencies more clearly and reduce technical debt.
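The shape of the change can be sketched as follows; `my_log` is a hypothetical stand-in for real macros such as `log::info`, which since edition 2018 are imported with a plain `use` instead of `#[macro_use] extern crate`:

```rust
// Sketch of the modern macro-import style (edition 2018+). The `logging`
// module and `my_log` macro are illustrative stand-ins for an external
// crate such as `log` and its `info!` macro.
mod logging {
    macro_rules! my_log {
        ($msg:expr) => {
            format!("[log] {}", $msg)
        };
    }
    // Since Rust 1.30, a `macro_rules!` macro can be re-exported by name.
    pub(crate) use my_log;
}

// No `#[macro_use] extern crate ...` needed: a normal path import suffices.
use crate::logging::my_log;

fn main() {
    assert_eq!(my_log!("boot"), "[log] boot");
    println!("{}", my_log!("boot"));
}
```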
## Auto-generation of the series
Most of this series was produced automatically:
1. Remove all `extern crate` references
2. Run the script [0] to add missing `use` statements
3. Run `cargo +nightly fmt --all`
4. Fix the remaining problems manually
The treewide changes were then split into per-folder commits.
[0]
```python
import os
import re

# Mapping of macro/function usage to imports
MACRO_IMPORTS = {
    "info!": "use log::info;\n",
    "debug!": "use log::debug;\n",
    "error!": "use log::error;\n",
    "trace!": "use log::trace;\n",
    "warn!": "use log::warn;\n",
    "event!": "use event_monitor::event;\n",
    "anyhow!(": "use anyhow::anyhow;\n",
    "bitflags!(": "use bitflags::bitflags;\n",
    "ioctl_ior_nr!": "use vmm_sys_util::{ioctl_ior_nr};\n",
    "ioctl_iow_nr!": "use vmm_sys_util::{ioctl_iow_nr};\n",
}

# Regex for finding the first use statement
USE_REGEX = re.compile(r"^\s*(use|pub use) .+?;")


def process_file(path):
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    content = "".join(lines)
    existing_imports = set(lines)
    needed_imports = set()

    # Check macros/functions against mapping, only add if not already present
    for key, import_stmt in MACRO_IMPORTS.items():
        if key in content and import_stmt not in existing_imports:
            needed_imports.add(import_stmt)

    if not needed_imports:
        print(f"Unmodified {path} (no new imports needed)")
        return  # Nothing to do

    # Find first use or pub use statement
    for i, line in enumerate(lines):
        if USE_REGEX.match(line):
            insertion_index = i + 1
            break
    else:
        print(f"Unmodified {path} (no use or pub use statement found)")
        return  # No use statement found, skip file

    # Insert imports
    lines[insertion_index:insertion_index] = list(needed_imports)

    # Write back file
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)

    print(f"Modified {path}, added imports: {''.join(needed_imports).strip()}")


for root, _, files in os.walk("."):
    for file in files:
        if file.endswith(".rs"):
            process_file(os.path.join(root, file))
```
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Add tests:
- zlib: test_virtio_block_qcow2_zlib()
- zstd: test_virtio_block_qcow2_zstd()
These tests use a zlib- and a zstd-compressed image, respectively, as
the OS image.
Modify the test_virtio_block_qcow2_backing_file() test: it is practical
to test qcow2 file backing with compression, so use a zlib-compressed
image as the backing file.
Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
Add support for reading and writing compressed clusters.
Support zlib and zstd compression.
L2 cache: store entire L2 entries, not only standard cluster addresses.
Read path: offsets of compressed clusters cannot be determined,
therefore replace QcowFile.file_offset_read() with QcowFile.file_read().
This method reads the cluster, decompresses it if necessary, and returns
the data to the caller.
Write path: in QcowFile.file_offset_write(), since writing to compressed
clusters is generally not possible, allocate a new standard
(non-compressed) cluster if a compressed L2 entry is encountered; then
decompress the compressed cluster into the new cluster; then return the
offset inside the new cluster to the caller. Processing of standard
clusters is unchanged.
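The write-path rule can be sketched in simplified form (a hedged model with placeholder types, not the actual QcowFile code; `decompress` stands in for the real zlib/zstd handling):

```rust
// Simplified model of "never write into a compressed cluster": on write,
// a compressed cluster is first decompressed into a freshly allocated
// standard cluster. `Cluster` and `decompress` are illustrative stubs.
#[derive(Clone)]
enum Cluster {
    Standard(Vec<u8>),
    Compressed(Vec<u8>), // opaque compressed payload
}

fn decompress(payload: &[u8], cluster_size: usize) -> Vec<u8> {
    // Placeholder: the real code would run zlib or zstd here.
    let mut data = payload.to_vec();
    data.resize(cluster_size, 0);
    data
}

/// Return a writable standard cluster, converting a compressed cluster
/// into a newly "allocated" standard one when necessary.
fn cluster_for_write(c: &mut Cluster, cluster_size: usize) -> &mut Vec<u8> {
    if let Cluster::Compressed(payload) = c {
        let data = decompress(payload, cluster_size);
        *c = Cluster::Standard(data); // allocate new standard cluster
    }
    match c {
        Cluster::Standard(data) => data,
        Cluster::Compressed(_) => unreachable!(),
    }
}

fn main() {
    let mut c = Cluster::Compressed(vec![1, 2, 3]);
    let data = cluster_for_write(&mut c, 8);
    data[0] = 0xff; // write lands in the new standard cluster
    assert!(matches!(c, Cluster::Standard(_)));
}
```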
Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
As well as saving the MSRs, as it currently does, ensure that the KVM
capability is enabled and keep the internal state updated.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Co-authored-by: Chengyu Fu <chengyu.fu@linux.alibaba.com>
The granularity has significant implications in typical cloud
deployments with network storage. The Linux kernel will sync advisory
locks to network file systems, but these backends may have different
policies and handle locks differently. For example, NetApp speaks an NFS
API but treats advisory OFD locks on the whole file as mandatory
locks, whereas byte-range locks covering the whole file remain
advisory [0].
It is a valid use case to prevent multiple CHV instances from
accessing the same disk, while disk management software (e.g., Cinder in
OpenStack) must still be able to snapshot disks while VMs are running,
so we need fine-grained control over the lock granularity. Locking the
whole byte range of a disk image without technically locking the whole
file gets the best of both worlds.
This also brings CHV's behavior in line with QEMU [1].
Whole-file locks remain a valid use case and could be supported later.
This patch only provides the necessary groundwork; making it
configurable is out of scope for now.
[0] https://kb.netapp.com/on-prem/ontap/da/NAS/NAS-KBs/How_is_Mandatory_Locking_supported_for_NFSv4_on_ONTAP_9
[1] <qemu>/util/osdep.c::qemu_lock_fcntl()
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This should be guaranteed by GuestMemory and GuestMemoryRegion, but
those traits are currently safe, so add checks to guard against
incorrect implementations of them.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
It takes a pointer to a userspace address that it accesses, so it should
be marked unsafe. This was missed earlier.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Callers of get_host_address_range() rely on it returning a pointer to at
least size bytes of memory. mem.get_host_address() is an overrideable
method of a safe trait, so it is better for safe code to not rely on its
correctness for safety. Instead, use mem.get_slice(), which returns a
VolatileSlice whose invariants guarantee that it points to a sufficient
amount of memory. If mem.check_range() succeeds but mem.get_slice()
returns a slice that is too small, this means that there is either a
logic error or a situation the code cannot support yet, so panic.
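The safety argument can be illustrated with a generic sketch (a hypothetical `Memory` trait, not the vm-memory API): a raw pointer returned by an overridable method of a safe trait carries no length guarantee, while a returned slice does:

```rust
// Illustration of the design point: safe code should rely on a returned
// slice (whose type guarantees its length) rather than on a raw pointer
// from an overridable safe-trait method. Names here are hypothetical.
trait Memory {
    fn backing(&self) -> &[u8];

    // Overridable: a buggy impl could point at too little memory.
    fn host_address(&self, offset: usize) -> *const u8 {
        self.backing()[offset..].as_ptr()
    }

    // Preferred: a returned slice guarantees `len` accessible bytes.
    fn get_slice(&self, offset: usize, len: usize) -> Option<&[u8]> {
        self.backing().get(offset..offset.checked_add(len)?)
    }
}

struct Ram(Vec<u8>);
impl Memory for Ram {
    fn backing(&self) -> &[u8] {
        &self.0
    }
}

fn main() {
    let ram = Ram(vec![0u8; 16]);
    // The slice's length invariant is checked, unlike the raw pointer's.
    assert!(ram.get_slice(8, 8).is_some());
    assert!(ram.get_slice(8, 9).is_none());
    let _p = ram.host_address(8); // no length guarantee attached
}
```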
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
The kernel will validate that the size is page-aligned. The file offset
is always zero, so the kernel will also validate that the offset is
page-aligned.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This API passes a u64 to a kernel API that treats the u64 as a userspace
address. Therefore, it should be marked unsafe, but it currently is not
[1]. Wrap the call in an unsafe block to document that invariants must
be upheld to avoid undefined behavior. This causes a compiler warning,
so suppress the warning with #[allow(unused_unsafe)].
[1]: https://github.com/rust-vmm/vfio/issues/100
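The pattern looks roughly like this minimal sketch, with `write_to_addr` as a hypothetical stand-in for the not-yet-`unsafe` vfio call:

```rust
// Sketch of wrapping a call that should be `unsafe` (but is not yet marked
// as such upstream) in an `unsafe` block, and suppressing the resulting
// `unused_unsafe` warning. `write_to_addr` is an illustrative stand-in.
fn write_to_addr(addr: u64) -> u64 {
    // Placeholder: the real API treats `addr` as a userspace address.
    addr
}

#[allow(unused_unsafe)]
fn caller(buf: &mut u64) -> u64 {
    let addr = buf as *mut u64 as usize as u64;
    // SAFETY: `addr` is derived from a live &mut u64, so it is a valid
    // address for the duration of the call.
    unsafe { write_to_addr(addr) }
}

fn main() {
    let mut x = 7u64;
    let a = caller(&mut x);
    assert_eq!(a, &mut x as *mut u64 as usize as u64);
}
```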
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
I believe that its only caller used it safely, but it is still better to
mark the code as unsafe. Also add additional validity checks.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Also drop support for building the TDX code for 32-bit targets. All
CPUs with TDX support are 64-bit so supporting 32-bit targets is not
needed.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
To ensure that struct sizes are the same on 32-bit and 64-bit, various
kernel APIs use __u64 (Rust u64) to represent userspace pointers.
Userspace is expected to cast pointers to __u64 before passing them to
the kernel, and cast kernel-provided __u64 to a pointer before using
them. However, various safe APIs in Cloud Hypervisor took
caller-provided u64 values and passed them to syscalls that interpret
them as userspace addresses. Therefore, passing bad u64 values would
cause memory disclosure or corruption.
Fix the bug by using usize and pointer types as appropriate. To make
soundness of the code easier to reason about, the PCI code gains a new
MmapRegion abstraction that ensures the validity of pointers. The rest
of the code already has an MmapRegion abstraction it can use. To avoid
having to reason about whether something is keeping the MmapRegion
alive, reference counting is added. MmapRegion cannot hold references
to other objects, so the reference counting cannot introduce cycles.
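The casting discipline described above can be sketched as follows (hypothetical helper names, not the actual CHV code):

```rust
// Userspace addresses cross the kernel ABI as u64, but should be handled
// as usize/pointers on the userspace side so pointer widths stay honest
// on both 32-bit and 64-bit targets. Helper names are illustrative.
fn to_kernel_repr(ptr: *mut u8) -> u64 {
    ptr as usize as u64 // usize -> u64 is lossless on 32- and 64-bit
}

fn from_kernel_repr(addr: u64) -> *mut u8 {
    // On 32-bit targets this truncates, which is why arbitrary
    // caller-provided u64 values cannot be trusted as addresses.
    addr as usize as *mut u8
}

fn main() {
    let mut byte = 0u8;
    let p: *mut u8 = &mut byte;
    assert_eq!(from_kernel_repr(to_kernel_repr(p)), p);
}
```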
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
To ensure that struct sizes are the same on 32-bit and 64-bit, various
kernel APIs use __u64 (Rust u64) to represent userspace pointers.
Userspace is expected to cast pointers to __u64 before passing them to
the kernel, and cast kernel-provided __u64 to a pointer before using
them. However, various safe APIs in Cloud Hypervisor took
caller-provided u64 values and passed them to syscalls that treat them
as userspace addresses. Therefore, passing bad u64 values would cause
memory disclosure or corruption. The memory region APIs are one example
of this, so mark them as unsafe.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
When booting a Linux guest in an SMP configuration on Sapphire Rapids
or Granite Rapids, the following kernel warning can be observed:
[Firmware Bug]: CPUID leaf 0x1f subleaf 1 APIC ID mismatch 1 != 0
[Firmware Bug]: CPUID leaf 0x1f subleaf 2 APIC ID mismatch 1 != 0
The reason is that we announce the presence of the extended topology
leaf, but fail to announce the x2apic ID in EDX for each subleaf.
Signed-off-by: Thomas Prescher <thomas.prescher@cyberus-technology.de>
On-behalf-of: SAP thomas.prescher@sap.com
As of Rust 1.90, writes to Unix sockets use the sendto syscall. This
affects the vCPU threads when vfio_user devices are accessed.
Signed-off-by: Stefan Nürnberger <stefan.nuernberger@cyberus-technology.de>
Unfortunately, there is no lint for that.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This is the first commit in a series of commits to improve the code
quality in Cloud Hypervisor in a sustainable way. These are the
default rules from `clippy::all` but written here to be more explicit.
`clippy::all` refers to all "default sensible" lints, not all
existing lints.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This better aligns with the rest of the code and makes it clearer
that these tests can run "as is" in a normal hosted environment
without the special test environment.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
TL;DR: Massive quality of life improvement for devs
Cloud Hypervisor uses the Cargo test framework for multiple tests:
- normal unit tests
- unit tests requiring special environment (the Tap device tests)
- integration tests requiring a special environment
This prevented the execution of `cargo test --workspace`, which results
in a very poor developer experience. Although
`./scripts/run_unit_tests.sh` exists, there are valid reasons why devs
cannot, or simply do not want to, use it.
By adding a new `chv_testenv` rustc cfg, we can activate these tests
only when the `./scripts/` machinery runs them. This significantly
improves the general developer experience.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Commit 5ec47d4883 was intended to patch ebx bits 23-16 in cpuid leaf
0x1, but it was not working as expected, since in Rust the `<<` operator
has higher precedence than `&` [1]. Later commit b6667f948e fixed the
operator-precedence clippy warning, but did not fix the actual issue. As
a result, the current code does not change ebx:
```
cpu_ebx |= ((dies_per_package as u32) * (cores_per_die as u32) * (threads_per_core as u32))
& (0xff << 16);
```
Since the total number of logical processors is generally less than
65536, the right-hand side of the expression is 0 in most cases.
[1] https://doc.rust-lang.org/reference/expressions.html#expression-precedence
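A small demonstration of the precedence pitfall (illustrative values, not the actual topology code):

```rust
// `<<` binds tighter than `&` in Rust, so the mask never applies to the
// computed count. Values are illustrative: 2 dies * 4 cores * 2 threads.
fn main() {
    let total: u32 = 2 * 4 * 2; // dies * cores * threads = 16
    let mut ebx: u32 = 0;

    // Buggy: parses as `total & (0xff << 16)`, which is 0 for total < 65536.
    ebx |= total & (0xff << 16);
    assert_eq!(ebx, 0);

    // Intended: mask the count to 8 bits, then place it in bits 23..16.
    ebx |= (total & 0xff) << 16;
    assert_eq!((ebx >> 16) & 0xff, 16);
}
```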
Fixes: 5ec47d4883 ("arch: x86_64: enable HTT flag")
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
A major improvement to the developer experience of clippy in
Cloud Hypervisor.
1. Make `cargo clippy` just work with the same lints we use in CI
2. Simplify adding new lints
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
... by just passing the iterator along. For large VMs this bitmap is
gigantic. A 12TB VM has 384MB of dirty bitmap.
With all these optimizations from the previous commits in place, we
see quite the improvement when it comes to scanning the dirty bitmap.
For a bitmap with 1% bits (randomly) set, dirty_log() takes:
Original code: 2166ms (100.0%)
New code: 382ms ( 17.6%)
on my system. The sparser the dirty bitmap, the faster the scan:
scanning an empty bitmap is 100x faster, and for a 5% populated bitmap
we are still 3x faster.
If someone wants to play with this, there is a benchmark harness here:
https://github.com/blitz/chv-bitmap-bench
On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
... by passing the slice along instead.
On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
This would be a good opportunity to optimize away another pointless
vector, but I don't have a good way to test this at the moment. Maybe
someone else can give it a shot.
On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Adding itertools as a dependency significantly improves the iteration
code below.
With this change, we don't need a copy of the vector, just something
that can be turned into an iterator. We also use the bit-position
iterator to make the code somewhat clearer. The new code is much
faster, because it does not iterate over every bit, only over each set
bit in the input.
The next commit will complete this optimization and have some concrete
numbers.
On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Instead of using ad-hoc code, just write an extension to the Iterator
trait that we can easily unit test.
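A hedged sketch of such an extension trait (hypothetical names, std-only, not the actual CHV code): it maps an iterator over u64 bitmap words to an iterator over the positions of set bits, skipping zero words entirely.

```rust
// Iterator extension: turn an iterator of u64 bitmap words into an
// iterator over set-bit positions. Zero words cost one check, not 64
// per-bit iterations, which is where the speedup on sparse bitmaps comes
// from. Names are illustrative.
trait BitPositions: Iterator<Item = u64> + Sized {
    fn bit_positions(self) -> impl Iterator<Item = u64> {
        self.enumerate().flat_map(|(word_idx, mut word)| {
            std::iter::from_fn(move || {
                if word == 0 {
                    return None; // nothing set in this word
                }
                let bit = word.trailing_zeros() as u64;
                word &= word - 1; // clear the lowest set bit
                Some(word_idx as u64 * 64 + bit)
            })
        })
    }
}

impl<I: Iterator<Item = u64>> BitPositions for I {}

fn main() {
    let words = [0b1010u64, 0, 1 << 63];
    let positions: Vec<u64> = words.iter().copied().bit_positions().collect();
    assert_eq!(positions, vec![1, 3, 191]);
}
```

Because the trait is implemented for any `Iterator<Item = u64>`, it composes with slices, maps, or chunked reads, and is trivial to unit test in isolation.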
Co-authored-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP julian.stecklina@sap.com
On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>