Replace the use of `guestmount` and `guestunmount` with standard tools
(losetup, mount) to modify the cloud disk image. This eliminates the
dependency on guestmount, which requires /dev/kvm and is not available
in the MSHV root partition.
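A minimal sketch of the loop-device approach, assuming a raw disk image
whose root filesystem is on the first partition and a caller with root
privileges. The helper name `modify_image` and the partition choice are
illustrative and not taken from the actual script:

```shell
# Hypothetical helper: mount the first partition of a raw image so files
# under the mount point can be edited, then clean up again.
modify_image() {
    image=$1
    mnt=$2
    # Attach the image to the first free loop device and scan its partitions.
    loop=$(losetup --find --show --partscan "$image") || return 1
    # Assumption: the root filesystem lives on partition 1.
    if ! mount "${loop}p1" "$mnt"; then
        losetup --detach "$loop"
        return 1
    fi
    # ... modify files under "$mnt" here ...
    umount "$mnt"
    losetup --detach "$loop"
}
```

Unlike guestmount, this path does not start a helper VM, so it does not
depend on /dev/kvm.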
Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
Refactor the MSHV hypervisor selection logic to enable aarch64 support
instead of exiting. Use `--features mshv` to conditionally build and
test when the hypervisor is mshv, replacing the hard exit for aarch64
on mshv.
Signed-off-by: Aastha Rawat <aastharawat@microsoft.com>
With the current syntax docker gives a 'docker: invalid reference format'
error. Also, when parsing /xxx:/yyy in process_volumes_args, having
\" inside the variable, i.e. arr_vols=("${arg_vols//#/ }"),
gives the wrong output.
Example:
scripts/dev_cli.sh tests --integration --volumes /mshv:/mshv
Error: The volume /mshv /mshv does not exist.
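The effect of the extra quotes can be reproduced in isolation. The sketch
below assumes bash and assumes the host/guest split uses the same quoted
substitution pattern with `:`; the variable names are illustrative:

```shell
#!/usr/bin/env bash
spec="/mshv:/mshv"

# Quoted: the substitution result is not word-split, so the array holds a
# single element that contains a space.
quoted=("${spec//:/ }")

# Splitting on ':' explicitly yields one element per path.
IFS=':' read -r -a parts <<< "$spec"

echo "quoted[0] = '${quoted[0]}'"
echo "parts[0] = '${parts[0]}', parts[1] = '${parts[1]}'"
```

The single quoted element is the string "/mshv /mshv" seen in the error
message above.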
Signed-off-by: Muminul Islam <muislam@microsoft.com>
The ivshmem tests are all failing in the CI for MSHV because Cloud
Hypervisor is built without the mshv feature.
Error: Cloud Hypervisor exited with the following chain of errors:
0: Failed to open hypervisor interface (is hypervisor interface
available?)
1: Failed to create the hypervisor
2: no supported hypervisor
Modify the build command to include the mshv feature.
Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
test_fw_cfg is frequently failing in the CI for MSHV. Exclude it for
now. It needs further investigation. See issue #7434 for details.
Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
We recently moved many of our tests from focal to jammy guest images,
including the rate-limiter tests (#7367), but forgot to update the
rate-limiter scripts to reflect the change.
Our CI pipeline failed to report the error because our self-hosted runner
happened to not be working when the changes landed (see #7405).
Signed-off-by: Bo Chen <bchen@crusoe.ai>
Only use 75% of the available threads. This reduces disk and memory
pressure, lowering the chance of flaky tests.
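One way to compute such a limit, as a sketch: the function name
`test_threads` and the lower bound of one thread are assumptions, not
part of the actual script.

```shell
# Hypothetical helper: return 75% of the given thread count (integer
# arithmetic, rounded down), but never less than 1.
test_threads() {
    total=$1
    n=$(( total * 3 / 4 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Usage: derive the limit from the CPUs visible to this process.
test_threads "$(nproc)"
```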
See: #7405
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This commit removes SGX support from Cloud Hypervisor. SGX support
was deprecated in May as part of #7090.
Signed-off-by: Shubham Chakrawar <schakrawar@crusoe.ai>
This patch gives the user an option to override the
default migratable version with any later release.
This option makes the MSHV-specific tests suitable for
testing, since MSHV is stable after some breaking changes.
This patch is also necessary for the MSHV CI.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This is necessary to use the let-chains feature in a
follow-up. After upgrading to Rust edition 2024, clippy
wants to collapse various if's with let-chains.
Update image to 20250815-0 since MSRV in Dockerfile is updated.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
rustc 1.90.0-beta.1 (788da80fc 2025-08-04) suggests using the library
feature `unsigned_is_multiple_of`. It was stabilized in Rust 1.87.0.
Update image to 20250807-0 since MSRV in Dockerfile is updated.
Signed-off-by: Songqian Li <sionli@tencent.com>
Signed-off-by: Bo Chen <bchen@crusoe.ai>
Running one or two tests in a tight loop can cause the download
functions to quickly hit GitHub's API rate limit. That causes the test
script to fail for no apparent reason.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
When a proper Rust backtrace is printed, the Rust std wants to use the
SYS_getcwd(79) system call to prettify some paths while printing. In
Cloud Hypervisor, this is at least relevant for printing panics or if
an `anyhow::Error` value is printed using `{e:?}` (but not `{e:#?}`).
The cause of the syscall can be found in `impl fmt::Display for Backtrace {}`
in `library/std/src/backtrace.rs`.
Without this addition, the seccomp violation of the SYS_getcwd (79)
hinders the proper error message including a full backtrace from showing
up. This annoying behaviour already delayed many debugging efforts. With
this fix, things just work. The new syscall itself should be pretty
harmless for normal operation.
```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!
==== Possible seccomp violation ====
Try running with `strace -ff` to identify the cause and open an issue: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/new
[1] 287683 invalid system call (core dumped) RUST_BACKTRACE=full cargo run --bin cloud-hypervisor -- --api-socket --kerne
```
```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!
stack backtrace:
0: 0x557d91286b62 - std::backtrace_rs::backtrace::libunwind::trace::hc20b48b31ee52608
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
1: 0x557d91286b62 - std::backtrace_rs::backtrace::trace_unsynchronized::h5d207cd20f193d88
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14
...
67: 0x0 - <unknown>
Error: Cloud Hypervisor exited with the following error:
Failed to join on VMM thread: Any { .. }
Debug Info: ThreadJoin(Any { .. })
```
- add any panic, for example into the create or drop function of a
device
- add --seccomp=true|log to analyze the situation
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This bump also includes another release 'ch-release-v6.12.8-20250422'
that changed the naming convention of the released kernel binaries
[1]. As a result, a few changes are made to our integration tests and
test scripts.
[1] https://github.com/cloud-hypervisor/linux/releases/tag/ch-release-v6.12.8-20250422
Signed-off-by: Bo Chen <bchen@crusoe.ai>
The dependency `bitfield-struct` 0.10.x of `igvm` 0.3.5 requires MSRV
1.83.0, bump to catch up.
Update image to 20250412-0 because MSRV in Dockerfile is updated.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Clippy from Rust 1.86.0-beta.1 (f0cb41030 2025-02-17) complains and
suggests replacing `repeat().take()` with `repeat_n()`, which was
stabilized in Rust 1.82.0.
Update image to 20250307-2 because MSRV in Dockerfile is updated.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
The reason `test_vfio_user` fails is, as @likebreath pointed out, that our
ARM host does not support SVE, while the nvme_tgt binary built from the
container image requires it. As a result, we encountered a SIGILL when
running the nvme_tgt binary. This also explains why this does not
happen when the container is built on the same host itself.
To quote @rbradford:
When a job is run on one of the workers it looks to see if there is a
container locally matching the name as specified in the dev_cli.sh
script - if there is then it uses it. Otherwise it will try and download
it from the container registry - if that fails then it will build it
locally. For the x86-64 workers started dynamically there will never be
a local version as they are a fresh VM. But on the ARM64 builder there is
a local container image cache.
This can lead to an issue where if the image is built with one version
(a handcrafted datestamp) and then the Dockerfile is changed without
changing the datestamp then an old version may be fetched from the cache
or server. It is therefore essential to always bump the datestamp (there
is a number after the - that can be used for this.)
However there is also the added complexity that the image that is built
and uploaded to the container registry is not the same as the one built
locally and thus used for the initial testing of the Dockerfile change.
This leads to the issue we have seen where the CPU compiler flags (from
-march=native) differ between the QEMU cross build in the hosted GHA
action and the local ARM64 build, resulting in a binary in the remotely
built container not working locally.
We end up specifying TARGET_ARCHITECTURE="armv8.2-a" for building spdk,
and putting the built `python/spdk/` folder into `/usr/local/bin/spdk-nvme`.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
The `arm64` build in ubuntu:22.04 errors out with `error processing package
libc-bin`. This is a known issue between binfmt (running different
architectures via QEMU) and the libc ldconfig binary running in the
container. We're "suddenly" having issues because ubuntu-latest (which is
the OS version we run the GH action container with) was recently changed
from 22.04 to 24.04, which is why upgrading the container userspace from
22.04 to 24.04 solves the problem.
Removed deprecated package `python3-distutils`.
Update image name from `20250111-0` to `20250222-0`.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Replace the use of a reference kernel configuration file from this
repository with the use of a defconfig from the linux fork.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
For more control over updating the guest kernel use a fixed tag name
rather than fetching the latest.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Updating the kernel to v6.12 has shown up a flaw in the workflow for our
binary kernel releases. The CI job that builds the binary kernel in the
cloud-hypervisor/linux repository fetches the config from the main
branch of the cloud-hypervisor/cloud-hypervisor repository. However the
CI job to update the kernel version to use is in the cloud-hypervisor
repository.
As a workaround - update the kernel config and version in the
cloud-hypervisor repository to point to v6.12 and use the ability to
build the kernel during the CI run. Once merged to main a new release
can be made in the linux repository which will build a binary asset
using the new config. After that release the CI jobs on the
cloud-hypervisor repository can be changed back to using the binary
kernel assets.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
No change to the Dockerfile but I observed that the 20251022-0 image was
not available in the repository.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This system is erroring out on jobs due to insufficient memory - reduce
parallelism to allow CI jobs to complete.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
With the 0.5.0 release of `rust-hypervisor-firmware`, `aarch64` binaries
were added to the assets, which causes `FW_URL` to contain multiple
download URLs separated by whitespace, so our integration tests fail.
Constrain `FW_URL` to `hypervisor-fw` to resolve this.
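The kind of constraint meant here can be sketched on a hard-coded asset
listing. The URLs, the second asset name and the exact grep pattern are
hypothetical; only the idea of matching the `hypervisor-fw` asset alone
comes from the description above:

```shell
# Hypothetical excerpt of a release listing with two assets.
assets='"browser_download_url": "https://example.invalid/download/0.5.0/hypervisor-fw"
"browser_download_url": "https://example.invalid/download/0.5.0/hypervisor-fw-aarch64"'

# Unconstrained: every asset URL matches, one per line.
all_urls=$(printf '%s\n' "$assets" | grep -o 'https://[^"]*')

# Constrained: keep only the URL whose last path component is exactly
# hypervisor-fw (the closing quote anchors the end of the name).
FW_URL=$(printf '%s\n' "$assets" | grep -o 'https://[^"]*/hypervisor-fw"' | tr -d '"')

echo "$FW_URL"
```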
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
`prepare_linux` is capable of determining whether we need to invoke
`build_custom_linux` to build linux from source or `download_linux`
to download a pre-built kernel.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
`prepare_linux` checks if the `--build-guest-kernel` option is present
and, if so, builds the kernel from `cloud-hypervisor/linux.git`.
Otherwise, it invokes `download_linux` to use a pre-built kernel.
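The dispatch described above can be sketched as follows. The two helper
bodies are stubs, and passing the option as the first argument is an
assumption made for illustration:

```shell
# Stubs standing in for the real helpers.
build_custom_linux() { echo "building kernel from source"; }
download_linux() { echo "downloading pre-built kernel"; }

# Choose between building and downloading based on the option.
prepare_linux() {
    if [ "$1" = "--build-guest-kernel" ]; then
        build_custom_linux
    else
        download_linux
    fi
}

# Usage: without the option, the pre-built kernel path is taken.
prepare_linux
```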
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Add `package-consistency-check.py` script to prevent #6809 and #6815
from happening. This script takes a string present in the repository
field of packages to identify packages from a specific source.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
This system has a lot of cores (80) resulting in all the tests being
spawned simultaneously and leading to exhaustion of the available
memory. Instead limit the number of threads.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Only download the kernel binaries from the github release if the remote
file is newer (avoids multiple copies accumulating in the download
directory.)
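One way to get this behaviour is wget's timestamping mode. This is a
sketch only: the helper name and its arguments are hypothetical, and the
actual script may use a different tool or flag.

```shell
# Hypothetical helper: download "$1" into directory "$2" only when the
# remote file is newer than the local copy. With -N, wget keeps the
# original file name instead of creating numbered duplicates.
fetch_if_newer() {
    url=$1
    dest_dir=$2
    wget --quiet -N -P "$dest_dir" "$url"
}
```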
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Add release and target params to `cargo test` since we collect
the code coverage reports from `xx/$BUILD_TARGET/release/`.
Signed-off-by: Songqian Li <sionli@tencent.com>
The Linux kernel fork repository for Cloud Hypervisor now produces
prebuilt x86-64 and aarch64 binaries. Speed up the CI by using those
binaries.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Misspellings were identified by:
https://github.com/marketplace/actions/check-spelling
* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
The ability to control the rustc flags (required for adding new
attributes to the allowed list of #[cfg(..)]) requires bumping the MSRV
to 1.77.0
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files
Fixes: #5887
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
This removes the requirement to ensure that we land PRs that update the
Dockerfile (and the appropriate dev_cli.sh change) in a specific time
frame.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>