Commit graph

9065 commits

Author SHA1 Message Date
Philipp Schuster
ea4f07d3bf misc: clippy: add uninlined_format_args
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-21 09:32:11 +00:00
Philipp Schuster
7cb73e9e56 misc: clippy: add unnecessary_semicolon
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-21 09:32:11 +00:00
Philipp Schuster
06390342a6 misc: clippy: add default clippy lint groups
This is the first commit in a series of commits to improve the Code
Quality in Cloud Hypervisor in a sustainable way. These are the
default rules from `clippy::all` but written here to be more explicit.
`clippy::all` refers to all "default sensible" lints, not all
existing lints.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-21 09:32:11 +00:00
Philipp Schuster
11d17fbf79 docs: update CONTRIBUTING.md
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Philipp Schuster
e160a17131 net_util: unrelated code improvement
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Philipp Schuster
7364fbdc8e tests: move VM test into a test module
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Philipp Schuster
35b91f76af tests: prevent broken terminal after running cargo test -p vmm
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Philipp Schuster
d1680b9ff9 tests: streamline module names to unit_tests
This better aligns with the rest of the code and makes it clearer
that these tests can run "as is" in a normal hosted environment
without the special test environment.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Philipp Schuster
c990f1bdaa tests: enable cargo test --workspace + #[cfg(devcli_testenv)]
TL;DR: Massive quality of life improvement for devs

Cloud Hypervisor uses the Cargo test framework for multiple tests:

- normal unit tests
- unit tests requiring special environment (the Tap device tests)
- integration tests requiring a special environment

This prevented the execution of `cargo test --workspace`, which results
in a very poor developer experience. Although
`./scripts/run_unit_tests.sh` exists, there are valid reasons why devs
cannot or even don't want to use it.

By adding a new `chv_testenv` rustc config, we can conditionally only
activate tests when the `./scripts/` magic runs them. This improves
the general developer experience by a lot.
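
As a rough illustration of the gating described above, the sketch below uses the `chv_testenv` name from this message (the subject line spells it `devcli_testenv`, so the exact name is uncertain). The function and test names are hypothetical, and the assumption is that the wrapper scripts pass `--cfg chv_testenv` to rustc, for example through `RUSTFLAGS`.

```rust
/// Reports whether this crate was compiled with `--cfg chv_testenv`.
pub fn special_test_env() -> bool {
    cfg!(chv_testenv)
}

#[cfg(test)]
mod unit_tests {
    /// Always compiled, so it runs with a plain `cargo test --workspace`.
    #[test]
    fn plain_unit_test() {
        assert_eq!(2 + 2, 4);
    }

    /// Compiled only when rustc receives `--cfg chv_testenv`, so a plain
    /// `cargo test` never tries to run it outside the special environment.
    #[cfg(chv_testenv)]
    #[test]
    fn needs_special_environment() {
        assert!(super::special_test_env());
    }
}
```

Without the flag, the second test is not compiled at all, so it can neither fail nor show up as ignored.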

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Bo Chen
e3e9e1c84c tests: Add retries for artifact downloads
This change makes integration tests more resilient to transient download
failures. This will reduce churn in our CI workflow, specifically for
the Merge Queue.

Examples:
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345066/job/55962570896
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345122/job/55962570736
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345034/job/55962570724

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-11-20 19:34:57 +00:00
Changyuan Lyu
be495ec64a arch: x86_64: fix cpuid leaf 0x1 EBX bits 23-16
Commit 5ec47d4883 was intended to patch ebx bits 23-16 in cpuid leaf
0x1, but it was not working as expected, as in Rust, operator << has a
higher precedence than & [1]. Later commit b6667f948e fixed the
operator precedence clippy warning, but did not fix the actual issue. As
a result, the current code is not changing ebx:

```
cpu_ebx |= ((dies_per_package as u32) * (cores_per_die as u32) * (threads_per_core as u32))
    & (0xff << 16);
```

Since the total number of logical processors is generally less than
65536, the right hand side of the expression is 0 in most cases.
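
The effect can be reproduced in isolation. In the sketch below both function names are hypothetical. The first mirrors the expression quoted above; the second shows one plausible way to place the count in bits 23-16, which is an assumption and not necessarily the exact fix applied by this commit.

```rust
/// Mirrors the quoted expression: the count is masked with 0x00ff_0000
/// without being shifted first, so small counts always yield 0.
pub fn ebx_masked_only(ebx: u32, logical_cpus: u32) -> u32 {
    ebx | (logical_cpus & (0xff << 16))
}

/// One plausible intent (an assumption): keep the low 8 bits of the count
/// and shift them into bits 23-16.
pub fn ebx_shifted(ebx: u32, logical_cpus: u32) -> u32 {
    ebx | ((logical_cpus & 0xff) << 16)
}

/// Without parentheses, `<<` binds tighter than `&`, so this is the same
/// computation as `ebx_masked_only`.
pub fn ebx_unparenthesized(ebx: u32, logical_cpus: u32) -> u32 {
    ebx | (logical_cpus & 0xff << 16)
}
```

With 8 logical processors, the first and third functions leave `ebx` unchanged, while the second sets bits 23-16 to 8.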

[1] https://doc.rust-lang.org/reference/expressions.html#expression-precedence

Fixes: 5ec47d4883 ("arch: x86_64: enable HTT flag")
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
2025-11-18 08:27:59 +00:00
dependabot[bot]
8ee26286ac build: Bump crate-ci/typos from 1.39.0 to 1.39.2
Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.39.0 to 1.39.2.
- [Release notes](https://github.com/crate-ci/typos/releases)
- [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crate-ci/typos/compare/v1.39.0...v1.39.2)

---
updated-dependencies:
- dependency-name: crate-ci/typos
  dependency-version: 1.39.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-18 00:27:28 +00:00
Philipp Schuster
935332bc42 misc: unrelated misc code improvements
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-17 16:59:49 +00:00
Philipp Schuster
e4fd066d82 misc: improve developer experience of cargo clippy
A major improvement to the developer experience of clippy in
Cloud Hypervisor.

1. Make `cargo clippy` just work with the same lints we use in CI
2. Simplify adding new lints

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-17 16:59:49 +00:00
Philipp Schuster
a7fa3a0c86 vm-migration: better naming + unittests
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-17 14:34:54 +00:00
Julian Stecklina
b6c266c880 vmm: avoid creating large temporary vector during migration
... by just passing the iterator along. For large VMs this bitmap is
gigantic. A 12TB VM has 384MB of dirty bitmap.
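
The size quoted above follows from a short calculation, assuming one dirty bit per 4 KiB guest page (the page size is an assumption; the message does not state it). The function name is hypothetical.

```rust
/// Size in bytes of a dirty bitmap holding one bit per guest page.
/// Both divisions round up so partial pages and partial bytes are counted.
pub fn dirty_bitmap_bytes(guest_mem_bytes: u64, page_size: u64) -> u64 {
    let pages = (guest_mem_bytes + page_size - 1) / page_size;
    (pages + 7) / 8
}
```

For 12 TiB of guest memory and 4 KiB pages this gives 3 * 2^30 pages, which is 384 MiB of bitmap.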

With all these optimizations from the previous commits in place, we
see quite the improvement when it comes to scanning the dirty bitmap.

For a bitmap with 1% bits (randomly) set, dirty_log() takes:

Original code: 2166ms (100.0%)
New code:       382ms ( 17.6%)

on my system. The sparser the dirty bitmap the faster. Scanning an
empty bitmap is 100x faster. For a 5% populated bitmap we are still 3x
faster.

If someone wants to play with this, there is a benchmark harness here:
https://github.com/blitz/chv-bitmap-bench

On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
2025-11-17 14:34:54 +00:00
Julian Stecklina
fc99e299c3 virtio-devices: avoid creating a temporary vector
... by passing the slice along instead.

On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
2025-11-17 14:34:54 +00:00
Julian Stecklina
3d5f9a3a98 virtio-devices: mark a possible improvement
This would be a good opportunity to optimize another pointless vector
away, but I don't have a good way to test this at the moment. But
maybe someone else gives it a shot.

On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
2025-11-17 14:34:54 +00:00
Julian Stecklina
ad9034ed1d vm-migration: optimize dirty bitmap scanning
Adding itertools as dependency improves the iteration code in the
following significantly.

With this change, we don't need a copy of the vector. Just something
that can be coerced into an iterator. We also use the bit position
iterator to make the code somewhat clearer. The new code is much
faster, because it will not iterate over every bit, just each 1 bit in
the input.

The next commit will complete this optimization and have some concrete
numbers.

On-behalf-of: SAP julian.stecklina@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
2025-11-17 14:34:54 +00:00
Julian Stecklina
96f4e33897 vm-migration: add helper to iterate over bitmaps
Instead of using ad-hoc code, just write an extension to the Iterator
trait that we can easily unit test.
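
A minimal sketch of what such an iterator extension could look like is shown below. All names (`BitPositions`, `BitPosIteratorExt`, `bit_positions`) are hypothetical and the actual helper in the tree may differ. It is an assumption that the bitmap arrives as a stream of `u64` words with bit 0 of the first word at position 0.

```rust
/// Iterator adapter yielding the absolute positions of all set bits in a
/// stream of 64-bit words (word 0 covers positions 0..=63, and so on).
pub struct BitPositions<I> {
    words: I,
    /// Bits of the current word that have not been yielded yet.
    current: u64,
    /// Number of words pulled from `words` so far.
    words_consumed: u64,
}

impl<I: Iterator<Item = u64>> Iterator for BitPositions<I> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        // All-zero words are skipped with a single comparison each.
        while self.current == 0 {
            self.current = self.words.next()?;
            self.words_consumed += 1;
        }
        let bit = u64::from(self.current.trailing_zeros());
        // Clear the lowest set bit so the next call finds the following one.
        self.current &= self.current - 1;
        Some((self.words_consumed - 1) * 64 + bit)
    }
}

/// Extension trait adding `bit_positions()` to any iterator over `u64`.
pub trait BitPosIteratorExt: Iterator<Item = u64> + Sized {
    fn bit_positions(self) -> BitPositions<Self> {
        BitPositions { words: self, current: 0, words_consumed: 0 }
    }
}

impl<I: Iterator<Item = u64>> BitPosIteratorExt for I {}
```

Because the adapter only does work per set bit and per word, a sparse bitmap is scanned much faster than with a loop over every bit, and the adapter is easy to unit test on small arrays.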

Co-authored-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP julian.stecklina@sap.com
On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2025-11-17 14:34:54 +00:00
Rob Bradford
6147c4c8b7 tests: Disable nextest failure on fw_cfg tests on MSHV
Due to the fw_cfg test being disabled on MSHV this results in no tests
being runnable which results in an error in nextest. Reduce that error
to a warning using --no-tests=warn.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
f9076cccfa tests: Add test timeout (10 minutes) to nextest configuration
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
9f046f02a2 tests: Disable test_snapshot_restore_with_fd()
This is now failing on x86-64 as well after the update of the Rust
version.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
3734a13cbf tests: Use cargo nextest for integration tests
This alternative test runner supports retries and also reports how long
each test takes to run.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
2707b0f72a build: Add cargo nextest to container
cargo nextest is an improved test runner that allows retries as well as
reporting the times for the test runs.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
176023156e test_infra: Make guest ID generation multiprocess safe
When using nextest for running tests each test is run in its own process
so the old solution of using a static variable for the guest ID (used to
determine the network segment) no longer works.

Instead use a text file on the filesystem protected with an exclusive
lock. The test process will read from it and then write back the next ID
that can be used. It wraps around at the limit of u8 and skips ID 0.

This function intentionally panics rather than propagate errors as it
should only be called for testing purposes and there the panic handler
will give a useful backtrace and cleanup.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
5051feb0bd build: Bump MSRV to 1.89.0
This is required to support exclusive locking on files which is needed
for safe test ID generation when using nextest (since it runs each test
as a separate process.)

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Matt Moriarity
ec57aade15 seccomp: allow sendto for vsock thread
As of Rust 1.90, writes to Unix socket streams use send_with_flags
instead of write, which results in a sendto syscall instead of write.

Signed-off-by: Matt Moriarity <matt@mattmoriarity.com>
2025-11-13 18:47:01 +00:00
Philipp Schuster
02da2f2d36 misc: gitlint: python code improvements
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Philipp Schuster
62345cd6fd misc: gitlint: allow more prefixes disabling 72 width limit
Suggested-by Alyssa Ross [0].

[0]: https://github.com/cloud-hypervisor/cloud-hypervisor/pull/7471#discussion_r2519894440

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Philipp Schuster
f826d92601 misc: gitlint: allow well-known commit tags to exceed line limit
To get that list, I've used

```
git log | grep --fixed-strings -- "-by:" | head -n 100000 | sort | less
```

on the Linux kernel's git repository.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Rob Bradford
063aa4b7d5 arch: riscv64: Expose host extension set to guest via FDT
The set of extensions supported by a RISC-V system needs to be exposed
to the guest - currently that is a fixed, minimal set of extensions.
These extensions are not sufficient to boot Ubuntu 25.10 which now has a
minimum requirement of RVA23S64 (which is a minimum set of extensions
that make sense for server use cases.)

The easiest way to convey the extensions that the guest should use is to
copy those that the host kernel understands (and thus includes in the
/proc/cpuinfo data).

However since nested virtualisation is not currently possible - exclude
the "H" (Hypervisor) extension from the list of short (single letter)
extensions.
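
A minimal sketch of the filtering step, assuming the host ISA string has the usual form of an `rv32`/`rv64` prefix, then single-letter extensions, then underscore-separated multi-letter extensions. The function name is hypothetical and the real code may parse the string differently.

```rust
/// Removes the single-letter "h" extension from an ISA string such as
/// "rv64imafdch_zicsr_zifencei". Multi-letter extensions after the first
/// underscore are left untouched.
pub fn guest_isa_string(host_isa: &str) -> String {
    // Split the single-letter part from the multi-letter extensions.
    let (short, rest) = match host_isa.split_once('_') {
        Some((short, rest)) => (short, Some(rest)),
        None => (host_isa, None),
    };
    // Keep the 4-character "rv32"/"rv64" prefix, filter only what follows.
    let (prefix, letters) = short.split_at(short.len().min(4));
    let filtered: String = letters
        .chars()
        .filter(|c| !c.eq_ignore_ascii_case(&'h'))
        .collect();
    match rest {
        Some(rest) => format!("{prefix}{filtered}_{rest}"),
        None => format!("{prefix}{filtered}"),
    }
}
```

Only the short extension list is filtered, so a multi-letter extension that happens to contain the letter "h" is preserved.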

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-13 14:34:38 +00:00
Rob Bradford
d586c844de tests: Update test_iommu_segments check
After updating the Linux kernel to 6.16.9 the second segment (segment=1)
is now under the 2nd IOMMU group (which is a more logical setup) and as
such the added device which is on that segment is in that second IOMMU
group.

The same check is made in test_vdpa_block so also test there.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-13 14:29:26 +00:00
Rob Bradford
b3b51bd3a2 tests: aarch64: Fix test_virtio_iommu for kernel IOMMU groups change
The numbers for the IOMMU groups have shifted after the update to Linux
kernel 6.16.9.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-13 14:29:26 +00:00
Rob Bradford
8bf284e713 scripts: Bump Linux version to 6.16.9
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-13 14:29:26 +00:00
Rob Bradford
e023efce3d vmm: cpu: Retry signalling the vCPU thread if it doesn't acknowledge
Resignal the thread every 10ms if it has not acknowledged the signal by
setting the atomic. Further, avoid
an infinite loop by generating an error if it takes more than 1000ms to
interrupt the thread.

The retry helps mitigate a race condition where the signal is received
between checking the pause atomic and entering KVM_RUN ioctl when
pausing. Hitting this race condition would leave the
wait_until_signal_acknowledged() method spinning indefinitely.

The timeout error avoids the VMM process being blocked indefinitely.
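
The retry-with-timeout pattern described above could look roughly like the sketch below. The function name, its parameters, and the 1ms polling step are assumptions for illustration. The real code signals a vCPU thread, whereas this sketch takes the signalling action as a closure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Calls `signal` repeatedly until `acked` is set or `timeout` expires.
/// Returns the number of signals sent on success.
pub fn signal_until_acknowledged(
    acked: &AtomicBool,
    mut signal: impl FnMut(),
    retry_interval: Duration,
    timeout: Duration,
) -> Result<u32, &'static str> {
    let start = Instant::now();
    let mut signals_sent = 0;
    loop {
        signal();
        signals_sent += 1;
        // Poll the acknowledgement flag until it is time to resignal.
        let round = Instant::now();
        while round.elapsed() < retry_interval {
            if acked.load(Ordering::Acquire) {
                return Ok(signals_sent);
            }
            std::thread::sleep(Duration::from_millis(1));
        }
        // Give up with an error instead of spinning forever.
        if start.elapsed() >= timeout {
            return Err("timed out waiting for signal acknowledgement");
        }
    }
}
```

With the values from the message, a caller would pass a 10ms retry interval and a 1000ms timeout.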

See: #7427

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-11 23:02:00 +00:00
Rob Bradford
bb7730e00f vmm: seccomp: Use rseq syscall constant
Formerly these syscalls had to be specified by number as the constants
were missing in musl.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-11 07:30:22 +00:00
Anirudh Rayabharam
dd66eb834c ci: dump kernel logs in MSHV workflow
Dump kernel logs after running the tests in the MSHV workflow to help
debug failures.

In addition to getting the kernel logs using `dmesg` also use AzCli to
retrieve the serial console logs. If the VM is hung or panicked, the
workflow would be unable to SSH into it and execute `dmesg`. In this
case the serial console logs would be helpful.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
2025-11-08 16:06:15 +00:00
Bo Chen
9acf610a7b build: Release v49.0
Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-11-08 04:05:16 +00:00
Aastha Rawat
c4cce38631 scripts: aarch64: replace guestmount for image modification
Replace the use of `guestmount` and `guestunmount` with standard tools
(losetup, mount) to modify cloud disk image. This eliminates
dependency on guestmount which requires /dev/kvm and is not available
in MSHV root partition.

Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
2025-11-06 19:33:53 +00:00
Aastha Rawat
e302d50b09 scripts: aarch64: enable MSHV support
Refactor MSHV hypervisor selection logic to enable aarch64 support
instead of exiting. Use `--features mshv` to conditionally build &
test when hypervisor is mshv replacing hard exit for aarch64 on mshv.

Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
2025-11-06 19:33:53 +00:00
Eugene Korenevsky
3791062b23 block: qcow: refactor: extract method cache_l2_cluster()
There are several copy-pasted code fragments in impl QcowFile. All of
them add an L2 entry to the cache, and one of them (in
file_offset_write()) also allocates a new L2 entry if necessary.

Fold all these code fragments (except for the one in l2_table(), which
handles errors in a special way) into a cache_l2_cluster() method
without changing the logic.
This makes the code more compact and clean.

Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
2025-11-05 16:58:24 +00:00
Muminul Islam
e3fa27e251 scripts: Fix issue when extra docker volumes provided
With the current syntax docker gives 'docker: invalid reference format'
error. Also during parsing /xxx:/yyy in process_volumes_args
with \" inside variable i.e arr_vols=("${arg_vols//#/ }")
gives wrong output.

Example:
scripts/dev_cli.sh tests --integration --volumes /mshv:/mshv
Error: The volume /mshv /mshv does not exist.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2025-11-05 10:45:24 +00:00
Rob Bradford
932e1a636a tests: Disable integration tests that use virtio-mem on MSHV
These tests are now failing

See: #7456

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-04 19:44:02 +00:00
Alyssa Ross
d59dfdf8b6 vmm: seccomp: allow http-server to use sendto
Fixes: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7449
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2025-11-03 12:41:18 +00:00
Ariel Chenet
9158c36dbf docs: Update documentation to reflect initial VMM output
The API documentation tells users to expect a message when Cloud
Hypervisor is launched, but this message was removed in commit 13724db.

This change updates the documentation to reflect how the program
actually functions, which is to say no message.

Signed-off-by: Ariel Chenet <achenet@fastmail.com>
2025-11-03 10:30:05 +00:00
dependabot[bot]
2a39b4a4e5 build: Bump crate-ci/typos from 1.38.1 to 1.39.0
Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.38.1 to 1.39.0.
- [Release notes](https://github.com/crate-ci/typos/releases)
- [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crate-ci/typos/compare/v1.38.1...v1.39.0)

---
updated-dependencies:
- dependency-name: crate-ci/typos
  dependency-version: 1.39.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-01 00:03:30 +00:00
Alyssa Ross
0395b10b29 openapi: add missing NetConfig offload parameters
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2025-10-30 18:40:07 +00:00
Alyssa Ross
1356b26c0f build: bump bitfield-struct from 0.11.0 to 0.12.1
Should fix beta clippy.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2025-10-30 00:40:50 +00:00
Philipp Schuster
7536a95424 misc: cleanup &Arc<dyn T> -> &dyn T
Consuming `&Arc<T>` as argument is almost always an antipattern as it
hides whether the callee is going to take over (shared) ownership
(by .clone()) or not. Instead, it is better to consume `&dyn T` or
`Arc<dyn T>` to be more explicit. This commit cleans up the code.
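
The difference between the two signatures can be illustrated with a small sketch. The `Device` trait, `Serial`, `describe`, and `Registry` are hypothetical names chosen for this example and do not come from the code base.

```rust
use std::sync::Arc;

pub trait Device {
    fn name(&self) -> String;
}

pub struct Serial;

impl Device for Serial {
    fn name(&self) -> String {
        "serial".to_string()
    }
}

/// Only borrows the device: `&dyn Device` makes it obvious that the
/// callee does not take over ownership.
pub fn describe(dev: &dyn Device) -> String {
    format!("device: {}", dev.name())
}

/// Keeps devices alive: taking `Arc<dyn Device>` by value makes the shared
/// ownership explicit at the call site.
pub struct Registry {
    devices: Vec<Arc<dyn Device>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { devices: Vec::new() }
    }

    pub fn register(&mut self, dev: Arc<dyn Device>) {
        self.devices.push(dev);
    }

    pub fn len(&self) -> usize {
        self.devices.len()
    }
}
```

A caller holding `dev: Arc<dyn Device>` passes `&*dev` to `describe` and `Arc::clone(&dev)` to `register`, so the clone is visible where the ownership is shared.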

The change is very mechanical and was very easy to implement across the
code base.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-10-28 17:37:49 +00:00