Commit graph

426 commits

Author SHA1 Message Date
Anatol Belski
e6dd429a64 tests: qcow: Add testing for uncompressed backing file
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2025-12-05 15:38:55 +00:00
Anatol Belski
248e786363 tests: qcow: Adjust naming for zstd compressed backing file
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2025-12-05 15:38:55 +00:00
Sebastien Boeuf
801059c2d5 scripts: Update custom image to Ubuntu Noble Numbat
Updating the helper script to create a VFIO custom image from Ubuntu
22.04 to Ubuntu 24.04.

Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
2025-12-04 15:25:13 +00:00
Wei Liu
a1a018bd83 tests: Do not return a failure when no tests are run
It is a common use case to run a subset of tests locally to verify
certain functionalities.

The default behaviour for nextest is to error out if no tests are run.
That causes the test scripts to return a non-zero value (failure). Pass
`--no-tests=pass` to nextest to match what `cargo test` does if no tests
are run.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-11-25 15:59:39 +00:00
Eugene Korenevsky
e6d31a3d81 block: qcow: switch qcow2 tests from focal to jammy qcow2 images
Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
2025-11-24 08:52:13 +00:00
Eugene Korenevsky
94ed7c1745 block: qcow: add integration tests for qcow2 compression
Add tests:
- zlib: test_virtio_block_qcow2_zlib()
- zstd: test_virtio_block_qcow2_zstd()
These tests use zlib- and zstd-compressed images, respectively, as the
OS image.

Modify the test_virtio_block_qcow2_backing_file() test: it is practical
to test qcow2 file-backing with compression, so use a zlib-compressed
image as the backing file.

Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
2025-11-24 08:52:13 +00:00
Philipp Schuster
c990f1bdaa tests: enable cargo test --workspace + #[cfg(devcli_testenv)]
TL;DR: Massive quality of life improvement for devs

Cloud Hypervisor uses the Cargo test framework for multiple tests:

- normal unit tests
- unit tests requiring special environment (the Tap device tests)
- integration tests requiring a special environment

This prevented the execution of `cargo test --workspace`, which results
in a very poor developer experience. Although
`./scripts/run_unit_tests.sh` exists, there are valid reasons why devs
cannot or even don't want to use it.

By adding a new `devcli_testenv` rustc config, we can activate those
tests only when the `./scripts/` magic runs them. This improves
the general developer experience by a lot.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-20 21:15:03 +00:00
Bo Chen
e3e9e1c84c tests: Add retries for artifact downloads
This change makes integration tests more resilient to transient download
failures. This will reduce churn in our CI workflow, specifically for
the Merge Queue.

Examples:
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345066/job/55962570896
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345122/job/55962570736
https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345034/job/55962570724
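
The retry idea described above can be sketched as follows. This is a minimal illustration in Python, not the change made to the test scripts: the helper name `with_retries`, the attempt count and the exponential backoff are all assumptions.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` tries are used up.

    Hypothetical helper: name, defaults and backoff policy are assumptions.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            # Network errors are OSError subclasses in Python. The last
            # failure is re-raised so the caller still sees the error.
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A download would be wrapped as `with_retries(lambda: fetch(url))`, where `fetch` stands for whatever function performs the transfer.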

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-11-20 19:34:57 +00:00
Rob Bradford
6147c4c8b7 tests: Disable nextest failure on fw_cfg tests on MSHV
Because the fw_cfg test is disabled on MSHV, no tests are runnable,
which results in an error in nextest. Reduce that error to a warning
using --no-tests=warn.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
3734a13cbf tests: Use cargo nextest for integration tests
This alternative test runner supports retries and also reports how long
each test takes to run.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Rob Bradford
5051feb0bd build: Bump MSRV to 1.89.0
This is required to support exclusive locking on files which is needed
for safe test ID generation when using nextest (since it runs each test
as a separate process.)
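
Why an exclusive file lock matters here can be shown with a small analogy. The sketch below is Python using `fcntl.flock`, not the Rust code in the repository; the function name `next_test_id` and the counter-file layout are assumptions made for illustration.

```python
import fcntl
import os


def next_test_id(counter_path):
    """Return an ID that is unique across processes sharing counter_path.

    Hypothetical sketch: the counter file holds the next free ID as text.
    """
    # O_CREAT so that the first caller creates the counter file.
    fd = os.open(counter_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Block until this process holds the exclusive lock, so that the
        # read-modify-write below cannot interleave with another process.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            text = f.read().strip()
            current = int(text) if text else 0
            f.seek(0)
            f.truncate()
            f.write(str(current + 1))
            f.flush()
            return current
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Without the lock, two test processes started at the same moment could read the same value and end up with the same ID.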

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-17 10:22:34 +00:00
Philipp Schuster
02da2f2d36 misc: gitlint: python code improvements
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Philipp Schuster
62345cd6fd misc: gitlint: allow more prefixes disabling 72 width limit
Suggested-by Alyssa Ross [0].

[0]: https://github.com/cloud-hypervisor/cloud-hypervisor/pull/7471#discussion_r2519894440

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Philipp Schuster
f826d92601 misc: gitlint: allow well-known commit tags to exceed line limit
To get that list, I've used

```
git log | grep --fixed-strings -- "-by:" | head -n 100000 | sort | less
```

on the Linux kernel's git repository.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-11-13 19:16:45 +00:00
Rob Bradford
8bf284e713 scripts: Bump Linux version to 6.16.9
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-11-13 14:29:26 +00:00
Aastha Rawat
c4cce38631 scripts: aarch64: replace guestmount for image modification
Replace the use of `guestmount` and `guestunmount` with standard tools
(losetup, mount) to modify the cloud disk image. This eliminates the
dependency on guestmount, which requires /dev/kvm and is not available
in the MSHV root partition.

Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
2025-11-06 19:33:53 +00:00
Aastha Rawat
e302d50b09 scripts: aarch64: enable MSHV support
Refactor MSHV hypervisor selection logic to enable aarch64 support
instead of exiting. Use `--features mshv` to conditionally build &
test when hypervisor is mshv replacing hard exit for aarch64 on mshv.

Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
2025-11-06 19:33:53 +00:00
Muminul Islam
e3fa27e251 scripts: Fix issue when extra docker volumes provided
With the current syntax, docker gives a 'docker: invalid reference
format' error. Also, parsing /xxx:/yyy in process_volumes_args
with \" inside the variable, i.e. arr_vols=("${arg_vols//#/ }"),
gives wrong output.

Example:
scripts/dev_cli.sh tests --integration --volumes /mshv:/mshv
Error: The volume /mshv /mshv does not exist.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2025-11-05 10:45:24 +00:00
Anirudh Rayabharam
f74cde7882 scripts: build mshv feature for ivshmem testing
The ivshmem tests are all failing in the CI for MSHV because Cloud
Hypervisor is built without the mshv feature.

Error: Cloud Hypervisor exited with the following chain of errors:
  0: Failed to open hypervisor interface (is hypervisor interface
       available?)
  1: Failed to create the hypervisor
  2: no supported hypervisor

Modify the build command to include the mshv feature.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
2025-10-28 15:31:58 +00:00
Anirudh Rayabharam
861b7ab64d tests: exclude test_fw_cfg for mshv
test_fw_cfg is frequently failing in the CI for MSHV. Exclude it for
now. It needs further investigation. See issue #7434 for details.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
2025-10-25 10:16:24 +00:00
Bo Chen
205e62aaa8 scripts: Download jammy images for rate-limiter tests
We recently moved many of our tests from focal to jammy guest
images, including the rate-limiter tests (#7367). We forgot to update
the rate-limiter scripts to reflect that change.

Our CI pipeline failed to report the error because our self-hosted
runner happened not to be working when we landed the changes (see #7405).

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-10-09 22:49:12 +00:00
Rob Bradford
d760301c8d tests: Reduce parallelism on x86-64 testing
Only use 75% of the available threads - this will reduce disk and
memory pressure, reducing the chance of flaky tests.
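
As a rough illustration of the arithmetic, assuming integer rounding down and a floor of one job (neither is stated above; the real change lives in the test scripts), the job count could be derived like this in Python:

```python
import os


def parallel_jobs(available=None, percent=75):
    """Return the number of test jobs for `percent` of the available threads.

    Hypothetical helper: rounding down and the minimum of 1 are assumptions.
    """
    if available is None:
        available = os.cpu_count() or 1
    return max(1, available * percent // 100)
```

On a 16-thread machine this yields 12 jobs.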

See: #7405

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-10-09 11:02:49 +00:00
Wei Liu
78a16227d2 scripts: Exit create-cloud-init.sh on error
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-09-22 14:24:50 +00:00
Shubham Chakrawar
2d9e243163 misc: Remove SGX support from Cloud Hypervisor
This commit removes the SGX support from Cloud Hypervisor. SGX support
was deprecated in May as part of #7090.

Signed-off-by: Shubham Chakrawar <schakrawar@crusoe.ai>
2025-09-05 18:08:36 +00:00
Muminul Islam
1ca6c159ef tests: option to override default migratable version
This patch gives the user an option to override the
default migratable version with any later release.
This option makes the MSHV-specific tests suitable for
testing, since MSHV is stable only after some breaking changes.
This patch is also necessary for MSHV CI.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2025-09-03 18:51:24 +00:00
Philipp Schuster
92f415ea3f build: Bump MSRV to 1.88
This is necessary to use the let-chains feature in a
follow-up. After upgrading to Rust edition 2024, clippy
wants to collapse various if's with let-chains.

Update image to 20250815-0 since MSRV in Dockerfile is updated.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-08-15 10:55:48 +00:00
Songqian Li
4c1ee0329e tests: add ivshmem integration test case
Signed-off-by: Songqian Li <sionli@tencent.com>
2025-08-14 22:14:34 +00:00
Alex Orozco
5d478c534e tests: Add fw_cfg device integration test
This test verifies that we can see custom items added to the fw_cfg
device from inside the guest

Signed-off-by: Alex Orozco <alexorozco@google.com>
2025-08-11 17:29:51 +00:00
Songqian Li
530719a57a build: Bump MSRV to 1.87.0
rustc 1.90.0-beta.1 (788da80fc 2025-08-04) suggests using the library
feature `unsigned_is_multiple_of`, which was stabilized in Rust 1.87.0.

Update image to 20250807-0 since MSRV in Dockerfile is updated.

Signed-off-by: Songqian Li <sionli@tencent.com>
Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-08-07 16:53:59 +00:00
Wei Liu
5f2392c095 tests: Avoid repeatedly downloading files from GitHub
Running one or two tests in a tight loop can cause the download
functions to quickly hit GitHub's API rate limit. That causes the test
script to fail for no apparent reason.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-07-24 16:46:43 +00:00
Philipp Schuster
d580ed55c6 seccomp: add SYS_getcwd (79) to support proper Rust backtraces
When a proper Rust backtrace is printed, the Rust std wants to use the
SYS_getcwd(79) system call to prettify some paths while printing. In
Cloud Hypervisor, this is at least relevant for printing panics or if
an `anyhow::Error` value is printed using `{e:?}` (but not `{e:#?}`).

The cause of the syscall can be found in `impl fmt::Display for Backtrace {}`
in `library/std/src/backtrace.rs`.

Without this addition, the seccomp violation of the SYS_getcwd (79)
hinders the proper error message including a full backtrace from showing
up. This annoying behaviour already delayed many debugging efforts. With
this fix, things just work. The new syscall itself should be pretty
harmless for normal operation.

```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!

==== Possible seccomp violation ====
Try running with `strace -ff` to identify the cause and open an issue: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/new
[1]    287683 invalid system call (core dumped)  RUST_BACKTRACE=full cargo run --bin cloud-hypervisor -- --api-socket  --kerne
```

```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!
stack backtrace:
   0:     0x557d91286b62 - std::backtrace_rs::backtrace::libunwind::trace::hc20b48b31ee52608
                               at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
   1:     0x557d91286b62 - std::backtrace_rs::backtrace::trace_unsynchronized::h5d207cd20f193d88
                               at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14

...

  67:                0x0 - <unknown>
Error: Cloud Hypervisor exited with the following error:
  Failed to join on VMM thread: Any { .. }

Debug Info: ThreadJoin(Any { .. })
```

- add any panic, for example into the create or drop function of a
  device
- add --seccomp=true|log to analyze the situation

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-26 20:50:57 +00:00
Bo Chen
2b05753716 ci: Update reference kernel to 'v6.12.8-20250613'
This bump also includes another release 'ch-release-v6.12.8-20250422'
that changed the naming convention of the released kernel binaries
[1]. As a result, a few changes are made to our integration tests and
test scripts.

[1] https://github.com/cloud-hypervisor/linux/releases/tag/ch-release-v6.12.8-20250422

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-06-16 17:59:22 +00:00
Philipp Schuster
77e042237d ci: improve gitlint (max line length in body with exceptions)
Follow-up of 5aa1540c5d but way more
mature. We now use custom gitlint rules written in Python to better
handle the max line length, with respect to a few valid exceptions.
Recognizing code blocks or compiler output, as discussed, is not
trivial and hard to get right for all corner-cases. Therefore, this
commit is a pragmatic way forward. The CI job should be kept optional.

Allowed exceptions to the 72-character line length limit are now:

1. links in the following three common patterns:
https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links
[0] https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links
[0]: https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links

2. code blocks (anything between the three backticks)

```
let x = "very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_"
```
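
The two exceptions can be sketched as a standalone check. The Python below is only an illustration of the rule as described above: it does not use gitlint's rule classes, and the function name and regular expression are assumptions.

```python
import re

MAX_LEN = 72
# The three link patterns listed above: bare URL, "[0] URL" and "[0]: URL".
LINK_RE = re.compile(r"^(\[\d+\]:? )?https?://\S+$")


def too_long_lines(body):
    """Return (line_number, line) pairs that break the length limit.

    Hypothetical helper, not part of gitlint or of the repository.
    """
    violations = []
    in_code_block = False
    for number, line in enumerate(body.splitlines(), start=1):
        if line.strip().startswith("```"):
            # A fence toggles the code-block state and is itself exempt.
            in_code_block = not in_code_block
            continue
        if in_code_block or LINK_RE.match(line):
            continue
        if len(line) > MAX_LEN:
            violations.append((number, line))
    return violations
```

Only ordinary prose lines longer than 72 characters are reported; links in the three listed forms and anything between fences are skipped.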

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-05-09 14:50:23 +01:00
Ruoqing He
226ecf47bb build: Bump MSRV to 1.83.0
The dependency `bitfield-struct` 0.10.x of `igvm` 0.3.5 requires MSRV
1.83.0, bump to catch up.

Update image to 20250412-0 because MSRV in Dockerfile is updated.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-04-12 18:31:02 +01:00
Ruoqing He
6768a13d95 build: Bump MSRV to 1.82.0
Clippy from Rust 1.86.0-beta.1 (f0cb41030 2025-02-17) complains and
suggests replacing `repeat().take()` with `repeat_n()`, which was
stabilized in Rust 1.82.0.

Update image to 20250307-2 because MSRV in Dockerfile is updated.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-03-07 15:09:14 +00:00
Ruoqing He
5cb5115456 build: Fix spdk in linux/arm64 image
The reason `test_vfio_user` fails is as @likebreath pointed out: our ARM
host does not support SVE, while the nvme_tgt binary built from the
container image requires it. As a result, we encountered a SIGILL when
running the nvme_tgt binary. This also explains why this is not
happening when the container is built on the same host itself.

And quote from @rbradford:

When a job is run on one of the workers it looks to see if there is a
container locally matching the name as specified in the dev_cli.sh
script - if there is then it uses it. Otherwise it will try and download
it from the container registry - if that fails then it will build it
locally. The x86-64 workers started dynamically will never have a
local version as they are fresh VMs. But on the ARM64 builder there is a
local container image cache.
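
The lookup order quoted above can be sketched as a small function. This is a Python illustration only (the real logic lives in the dev_cli.sh script); `resolve_image` and its three callables are hypothetical names.

```python
def resolve_image(name, has_local, pull, build):
    """Return how the image `name` was obtained: 'local', 'pulled' or 'built'.

    Hypothetical sketch. `has_local` and `pull` return True on success,
    `build` builds the image locally.
    """
    if has_local(name):
        # A locally cached image with a matching name always wins.
        return "local"
    if pull(name):
        return "pulled"
    build(name)
    return "built"
```

A fresh x86-64 worker never takes the first branch, while the ARM64 builder with its local cache can.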

This can lead to an issue where, if the image is built with one version
(a handcrafted datestamp) and then the Dockerfile is changed without
changing the datestamp, an old version may be fetched from the cache
or server. It is therefore essential to always bump the datestamp (there
is a number after the - that can be used for this).

However, there is also the added complexity that the image that is built
and uploaded to the container registry is not the same as the one built
locally and thus used for the initial testing of the Dockerfile change.
This leads to the issue we have seen, where the CPU compiler flags (from
-march=native) differ between the QEMU cross build in the hosted GHA
action and the local ARM64 build, resulting in a binary in the remotely
built container not working locally.

We end up specifying TARGET_ARCHITECTURE="armv8.2-a" for building spdk,
and put the built `python/spdk/` folder into `/usr/local/bin/spdk-nvme`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-28 18:34:23 +00:00
Ruoqing He
0fbba66b21 scripts: Remove SPDK build in aarch64 test script
We already build `SPDK` for `linux/arm64` in our `Dockerfile`, no need
to build it here anymore.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-24 14:41:46 +00:00
Ruoqing He
655d512523 build: Upgrade to 24.04 in Dockerfile
The `arm64` build in ubuntu:22.04 errors out with `error processing
package libc-bin`. This is a known issue between binfmt (running
different architectures via QEMU) and the libc ldconfig binary running
in the container. We are "suddenly" hitting it because ubuntu-latest
(the OS version we run the GH action container with) was recently
changed from 22.04 to 24.04, which is why upgrading the container
userspace from 22.04 to 24.04 solves the problem.

Removed deprecated package `python3-distutils`.

Update image name from `20250111-0` to `20250222-0`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-24 14:41:46 +00:00
Rob Bradford
f892789481 docs: Update documentation for new kernel configuration
Replace the use of a reference kernel configuration file from this
repository with the use of a defconfig from the linux fork.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-22 17:45:32 +00:00
Rob Bradford
2f9436bc12 build: Switch to named released kernel binary
For more control over updating the guest kernel use a fixed tag name
rather than fetching the latest.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-15 09:21:16 +00:00
Rob Bradford
fa686fdfc7 tests: Bump OVMF version
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-14 17:53:43 +00:00
Rob Bradford
66da3b9970 scripts: Temporarily build kernel as part of CI
Updating the kernel to v6.12 has shown up a flaw in the workflow for our
binary kernel releases. The CI job that builds the binary kernel in the
cloud-hypervisor/linux repository fetches the config from the main
branch of the cloud-hypervisor/cloud-hypervisor repository. However the
CI job to update the kernel version to use is in the cloud-hypervisor
repository.

As a workaround - update the kernel config and version in the
cloud-hypervisor repository to point to v6.12 and use the ability to
build the kernel during the CI run. Once merged to main a new release
can be made in the linux repository which will build a binary asset
using the new config. After that release the CI jobs on the
cloud-hypervisor repository can be changed back to using the binary
kernel assets.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-13 21:46:23 +00:00
Rob Bradford
6ddbd60d9d build: Update kernel to v6.12
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-13 21:46:23 +00:00
Rob Bradford
2fe7f54ece build: Bump version number of Docker image
No change to the Dockerfile but I observed that the 20251022-0 image was
not available in the repository.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-11 15:03:01 +00:00
Rob Bradford
72452707ee scripts: Reduce number of parallel jobs on ARM64 CI
This system is erroring out on jobs due to insufficient memory - reduce
parallelism to allow CI jobs to complete.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-12-20 10:36:13 +00:00
Ruoqing He
261bfac4d4 ci: Constrain FW_URL to x86_64 one
With the 0.5.0 release of `rust-hypervisor-firmware`, an `aarch64` binary
was added to the assets, which causes `FW_URL` to contain multiple
download URLs separated by whitespace, so our integration tests fail.

Constrain `FW_URL` to `hypervisor-fw` to resolve this.
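
The selection can be sketched in Python as picking exactly one asset by name. This is an illustration, not the change made in the scripts: `firmware_url` is a hypothetical helper, and the `name` and `browser_download_url` keys assume the asset layout of the GitHub releases API.

```python
def firmware_url(assets, wanted="hypervisor-fw"):
    """Return the download URL of the single asset named `wanted`.

    Hypothetical helper. `assets` is a list of dicts with "name" and
    "browser_download_url" keys.
    """
    matches = [a["browser_download_url"] for a in assets if a["name"] == wanted]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one asset named {wanted!r}, found {len(matches)}"
        )
    return matches[0]
```

Matching on the exact asset name keeps the result a single URL even when binaries for further architectures are added to a release.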

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-12-02 14:14:57 +00:00
Ruoqing He
3c05626ad1 scripts: Replace download_linux with prepare_linux
`prepare_linux` is capable of determining whether we need to invoke
`build_custom_linux` to build Linux from source or `download_linux`
to download a pre-built kernel.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-12 16:27:20 +00:00
Ruoqing He
906580ee92 scripts: Add prepare_linux function
`prepare_linux` checks if the `--build-guest-kernel` option is present
and, if so, builds the kernel from `cloud-hypervisor/linux.git`.
Otherwise, it invokes `download_linux` to use a pre-built kernel.
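
The dispatch described above amounts to a single branch. The sketch below restates it in Python for illustration; the real `prepare_linux` is a shell function, and passing the two actions in as callables is an assumption made to keep the example self-contained.

```python
def prepare_linux(args, build_custom_linux, download_linux):
    """Build the guest kernel if requested, otherwise download a pre-built one.

    Hypothetical Python restatement of the shell logic described above.
    """
    if "--build-guest-kernel" in args:
        return build_custom_linux()
    return download_linux()
```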

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-12 16:27:20 +00:00
Ruoqing He
337cbf3d33 scripts: Add consistency check script
Add the `package-consistency-check.py` script to prevent #6809 and #6815
from happening again. This script takes a string present in the
repository field of packages to identify packages from a specific source.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-02 16:09:09 +00:00
Songqian Li
7e6326b3c5 scripts: fix code coverage script args parsing
Signed-off-by: Songqian Li <sionli@tencent.com>
2024-10-23 18:38:34 +01:00