409 commits

Author SHA1 Message Date
Muminul Islam
e3fa27e251 scripts: Fix issue when extra docker volumes provided
With the current syntax, docker gives a 'docker: invalid reference
format' error. Also, parsing /xxx:/yyy in process_volumes_args with \"
inside the variable, i.e. arr_vols=("${arg_vols//#/ }"), gives wrong
output.

Example:
scripts/dev_cli.sh tests --integration --volumes /mshv:/mshv
Error: The volume /mshv /mshv does not exist.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2025-11-05 10:45:24 +00:00
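The parsing problem described in e3fa27e251 can be illustrated with a minimal sketch. It assumes that extra volumes arrive as a single string of `host:container` pairs separated by `#`, which is inferred from the `${arg_vols//#/ }` expansion quoted in the message; `parse_volumes` and `docker_volume_args` are hypothetical names, not functions from `dev_cli.sh`.

```python
def parse_volumes(arg_vols):
    """Split a '#'-separated string of 'host:container' volume specs."""
    volumes = []
    for spec in arg_vols.split("#"):
        if not spec:
            continue  # tolerate empty entries such as a trailing '#'
        host, sep, container = spec.partition(":")
        if not sep or not host or not container:
            raise ValueError(f"invalid volume spec: {spec!r}")
        volumes.append((host, container))
    return volumes


def docker_volume_args(volumes):
    """Emit a separate '--volume' flag and value for every parsed volume."""
    args = []
    for host, container in volumes:
        # Keep host and container joined by ':' so docker sees one reference.
        args += ["--volume", f"{host}:{container}"]
    return args
```

Keeping each pair as a tuple avoids the failure mode in the example, where `/mshv:/mshv` was turned into the single nonexistent path `/mshv /mshv`.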
Anirudh Rayabharam
f74cde7882 scripts: build mshv feature for ivshmem testing
The ivshmem tests are all failing in the CI for MSHV because Cloud
Hypervisor is built without the mshv feature.

Error: Cloud Hypervisor exited with the following chain of errors:
  0: Failed to open hypervisor interface (is hypervisor interface
       available?)
  1: Failed to create the hypervisor
  2: no supported hypervisor

Modify the build command to include the mshv feature.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
2025-10-28 15:31:58 +00:00
Anirudh Rayabharam
861b7ab64d tests: exclude test_fw_cfg for mshv
test_fw_cfg is frequently failing in the CI for MSHV. Exclude it for
now. It needs further investigation. See issue #7434 for details.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
2025-10-25 10:16:24 +00:00
Bo Chen
205e62aaa8 scripts: Download jammy images for rate-limiter tests
We recently moved many of our tests from focal to jammy guest images,
including the rate-limiter tests (#7367), but forgot to update the
rate-limiter scripts to reflect that change.

Our CI pipeline failed to report the error because our self-hosted
runner happened not to be working when the changes landed (see #7405).

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-10-09 22:49:12 +00:00
Rob Bradford
d760301c8d tests: Reduce parallelism on x86-64 testing
Only use 75% of the available threads. This reduces disk and memory
pressure, lowering the chance of flaky tests.

See: #7405

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-10-09 11:02:49 +00:00
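The 75% rule from d760301c8d amounts to a small piece of arithmetic. The sketch below is only an illustration: `thread_budget` is a hypothetical helper, and rounding down with a floor of one thread is an assumption, since the commit message does not say how the test script rounds.

```python
import os


def thread_budget(available=None, fraction=0.75):
    """Return how many test threads to use out of the available ones."""
    if available is None:
        # os.cpu_count() may return None, so fall back to a single thread.
        available = os.cpu_count() or 1
    # Round down, but never go below one thread.
    return max(1, int(available * fraction))
```

On a machine with 8 threads this yields 6, leaving headroom for the guests' own disk and memory activity.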
Wei Liu
78a16227d2 scripts: Exit create-cloud-init.sh on error
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-09-22 14:24:50 +00:00
Shubham Chakrawar
2d9e243163 misc: Remove SGX support from Cloud Hypervisor
This commit removes the SGX support from cloud hypervisor. SGX support
was deprecated in May as part of #7090.

Signed-off-by: Shubham Chakrawar <schakrawar@crusoe.ai>
2025-09-05 18:08:36 +00:00
Muminul Islam
1ca6c159ef tests: option to override default migratable version
This patch gives the user an option to override the
default migratable version with any later release.
This option makes the MSHV-specific tests usable,
since MSHV is stable after some breaking changes.
This patch is also necessary for MSHV CI.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2025-09-03 18:51:24 +00:00
Philipp Schuster
92f415ea3f build: Bump MSRV to 1.88
This is necessary to use the let-chains feature in a
follow-up. After upgrading to Rust edition 2024, clippy
wants to collapse various if's with let-chains.

Update image to 20250815-0 since MSRV in Dockerfile is updated.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-08-15 10:55:48 +00:00
Songqian Li
4c1ee0329e tests: add ivshmem integration test case
Signed-off-by: Songqian Li <sionli@tencent.com>
2025-08-14 22:14:34 +00:00
Alex Orozco
5d478c534e tests: Add fw_cfg device integration test
This test verifies that we can see custom items added to the fw_cfg
device from inside the guest

Signed-off-by: Alex Orozco <alexorozco@google.com>
2025-08-11 17:29:51 +00:00
Songqian Li
530719a57a build: Bump MSRV to 1.87.0
rustc 1.90.0-beta.1 (788da80fc 2025-08-04) suggests using the library
feature `unsigned_is_multiple_of`, which was stabilized in Rust 1.87.0.

Update image to 20250807-0 since MSRV in Dockerfile is updated.

Signed-off-by: Songqian Li <sionli@tencent.com>
Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-08-07 16:53:59 +00:00
Wei Liu
5f2392c095 tests: Avoid repeatedly downloading files from GitHub
Running one or two tests in a tight loop can cause the download
functions to quickly hit GitHub's API rate limit. That causes the test
script to fail for no apparent reason.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-07-24 16:46:43 +00:00
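One common way to avoid the repeated downloads described in 5f2392c095 is to reuse a file that is already on disk. The sketch below shows that idea only; it is not taken from the test scripts, and `cached_download` and its `fetch` callback are hypothetical names.

```python
from pathlib import Path


def cached_download(url, dest, fetch):
    """Return dest, calling fetch(url) only when dest does not exist yet."""
    dest = Path(dest)
    if dest.exists():
        return dest  # reuse the earlier download, no network request
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file at the final path.
    tmp = dest.with_name(dest.name + ".part")
    tmp.write_bytes(fetch(url))
    tmp.replace(dest)
    return dest
```

Running a test in a tight loop then touches the network once, so the loop no longer counts against the API rate limit.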
Philipp Schuster
d580ed55c6 seccomp: add SYS_getcwd (79) to support proper Rust backtraces
When a proper Rust backtrace is printed, the Rust std wants to use the
SYS_getcwd(79) system call to prettify some paths while printing. In
Cloud Hypervisor, this is at least relevant for printing panics or if
an `anyhow::Error` value is printed using `{e:?}` (but not `{e:#?}`).

The cause of the syscall can be found in
`impl fmt::Display for Backtrace {}` in `library/std/src/backtrace.rs`.

Without this addition, the seccomp violation of the SYS_getcwd (79)
hinders the proper error message including a full backtrace from showing
up. This annoying behaviour already delayed many debugging efforts. With
this fix, things just work. The new syscall itself should be pretty
harmless for normal operation.

```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!

==== Possible seccomp violation ====
Try running with `strace -ff` to identify the cause and open an issue: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/new
[1]    287683 invalid system call (core dumped)  RUST_BACKTRACE=full cargo run --bin cloud-hypervisor -- --api-socket  --kerne
```

```
thread 'vmm' panicked at virtio-devices/src/rng.rs:224:9:
Yikes, things went horribly wrong!
stack backtrace:
   0:     0x557d91286b62 - std::backtrace_rs::backtrace::libunwind::trace::hc20b48b31ee52608
                               at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
   1:     0x557d91286b62 - std::backtrace_rs::backtrace::trace_unsynchronized::h5d207cd20f193d88
                               at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14

...

  67:                0x0 - <unknown>
Error: Cloud Hypervisor exited with the following error:
  Failed to join on VMM thread: Any { .. }

Debug Info: ThreadJoin(Any { .. })
```

- add any panic, for example into the create or drop function of a
  device
- add --seccomp=true|log to analyze the situation

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-26 20:50:57 +00:00
Bo Chen
2b05753716 ci: Update reference kernel to 'v6.12.8-20250613'
This bump also includes another release 'ch-release-v6.12.8-20250422'
that changed the naming convention of the released kernel binaries
[1]. As a result, a few changes are made to our integration tests and
test scripts.

[1] https://github.com/cloud-hypervisor/linux/releases/tag/ch-release-v6.12.8-20250422

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-06-16 17:59:22 +00:00
Philipp Schuster
77e042237d ci: improve gitlint (max line length in body with exceptions)
Follow-up of 5aa1540c5d but way more
mature. We now use custom gitlint rules written in Python to better
handle the max line length, with respect to a few valid exceptions.
Recognizing code blocks or compiler output, as discussed, is not
trivial and hard to get right for all corner-cases. Therefore, this
commit is a pragmatic way forward. The CI job should be kept optional.

Allowed exceptions to the 72-character line length limit are now:

1. links in the following three common patterns:
https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links
[0] https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links
[0]: https://example.com/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links/very-long-links

2. code blocks (anything between the three backticks)

```
let x = "very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_very_long_"
```

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-05-09 14:50:23 +01:00
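The two exceptions listed in 77e042237d can be sketched as a plain function. This is an illustration of the rule only, not the project's gitlint plugin: it deliberately avoids the gitlint API, and `overlong_body_lines` and `LINK_LINE` are hypothetical names.

```python
import re

MAX_BODY_LINE_LENGTH = 72
# A line that is only a link: bare, "[0] link", or "[0]: link".
LINK_LINE = re.compile(r"^(\[\d+\]:? )?https?://\S+$")


def overlong_body_lines(body, limit=MAX_BODY_LINE_LENGTH):
    """Return (line_number, line) pairs that break the limit with no excuse."""
    violations = []
    in_code_block = False
    for number, line in enumerate(body.splitlines(), start=1):
        if line.strip().startswith("```"):
            # A fence line toggles the code-block state and is never flagged.
            in_code_block = not in_code_block
            continue
        if in_code_block or len(line) <= limit:
            continue
        if LINK_LINE.match(line):
            continue  # exception 1: a long link on its own line
        violations.append((number, line))
    return violations
```

A link embedded in the middle of a sentence is still flagged here, which matches the three link patterns given in the message.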
Ruoqing He
226ecf47bb build: Bump MSRV to 1.83.0
The dependency `bitfield-struct` 0.10.x of `igvm` 0.3.5 requires MSRV
1.83.0, bump to catch up.

Update image to 20250412-0 because MSRV in Dockerfile is updated.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-04-12 18:31:02 +01:00
Ruoqing He
6768a13d95 build: Bump MSRV to 1.82.0
Clippy from Rust 1.86.0-beta.1 (f0cb41030 2025-02-17) complains and
suggests replacing `repeat().take()` with `repeat_n()`, which was
stabilized in Rust 1.82.0.

Update image to 20250307-2 because MSRV in Dockerfile is updated.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-03-07 15:09:14 +00:00
Ruoqing He
5cb5115456 build: Fix spdk in linux/arm64 image
`test_vfio_user` fails, as @likebreath pointed out, because our ARM
host does not support SVE, while the nvme_tgt binary built from the
container image requires it. As a result, we encountered a SIGILL when
running the nvme_tgt binary. This also explains why this is not
happening when the container is built on the same host itself.

And quote from @rbradford:

When a job is run on one of the workers it looks to see if there is a
container locally matching the name as specified in the dev_cli.sh
script - if there is then it uses it. Otherwise it will try and download
it from the container registry - if that fails then it will be built
locally. The x86-64 workers started dynamically will never have a local
version as each is a fresh VM. But on the ARM64 builder there is a
local container image cache.

This can lead to an issue where if the image is built with one version
(a handcrafted datestamp) and then the Dockerfile is changed without
changing the datestamp then an old version may be fetched from the cache
or server. It is therefore essential to always bump the datestamp (there
is a number after the - that can be used for this.)

However there is also the added complexity that the image that is built
and uploaded to the container registry is not the same as the one built
locally and thus used for the initial testing of the Dockerfile change.
This leads to the issue we have seen, where CPU compiler flags (from
-march=native) differ between the QEMU cross build in the hosted GHA
action and the local ARM64 build, resulting in a binary in the remotely
built container not working locally.

We end up specifying TARGET_ARCHITECTURE="armv8.2-a" for building spdk,
and put built `python/spdk/` folder into `/usr/local/bin/spdk-nvme`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-28 18:34:23 +00:00
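The advice in 5cb5115456 to always bump the datestamp, using the number after the `-` when needed, can be sketched as follows. The project hand-crafts these tags, so `next_image_tag` is a hypothetical helper, and the `YYYYMMDD-N` layout is inferred from tags such as `20250307-2` that appear in this log.

```python
import re

TAG_PATTERN = re.compile(r"^(\d{8})-(\d+)$")


def next_image_tag(current, today):
    """Return a tag that differs from current; today is a YYYYMMDD string."""
    match = TAG_PATTERN.match(current)
    if match is None:
        raise ValueError(f"unexpected tag format: {current!r}")
    datestamp, revision = match.group(1), int(match.group(2))
    # Eight-digit date strings compare correctly as plain strings.
    if today > datestamp:
        return f"{today}-0"
    # Same day (or a tag dated in the future): bump the trailing number.
    return f"{datestamp}-{revision + 1}"
```

Because the result never equals the current tag, a changed Dockerfile cannot silently resolve to a stale image in the cache or registry.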
Ruoqing He
0fbba66b21 scripts: Remove SPDK build in aarch64 test script
We already build `SPDK` for `linux/arm64` in our `Dockerfile`, no need
to build it here anymore.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-24 14:41:46 +00:00
Ruoqing He
655d512523 build: Upgrade to 24.04 in Dockerfile
`arm64` build in ubuntu:22.04 errors out with `error processing package
libc-bin`. This issue is a known issue between the binfmt (running
different architectures via QEMU) and the libc ldconfig binary running
in container. We're "suddenly" having issues as ubuntu-latest (which is
the OS version we run the GH action container with) was recently changed
from 22.04 to 24.04 and hence why upgrading the container userspace from
22.04 to 24.04 solves the problem.

Removed deprecated package `python3-distutils`.

Update image name from `20250111-0` to `20250222-0`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2025-02-24 14:41:46 +00:00
Rob Bradford
f892789481 docs: Update documentation for new kernel configuration
Replace the use of a reference kernel configuration file from this
repository with the use of a defconfig from the linux fork.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-22 17:45:32 +00:00
Rob Bradford
2f9436bc12 build: Switch to named released kernel binary
For more control over updating the guest kernel use a fixed tag name
rather than fetching the latest.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-15 09:21:16 +00:00
Rob Bradford
fa686fdfc7 tests: Bump OVMF version
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-14 17:53:43 +00:00
Rob Bradford
66da3b9970 scripts: Temporarily build kernel as part of CI
Updating the kernel to v6.12 has shown up a flaw in the workflow for our
binary kernel releases. The CI job that builds the binary kernel in the
cloud-hypervisor/linux repository fetches the config from the main
branch of the cloud-hypervisor/cloud-hypervisor repository. However the
CI job to update the kernel version to use is in the cloud-hypervisor
repository.

As a workaround - update the kernel config and version in the
cloud-hypervisor repository to point to v6.12 and use the ability to
build the kernel during the CI run. Once merged to main a new release
can be made in the linux repository which will build a binary asset
using the new config. After that release the CI jobs on the
cloud-hypervisor repository can be changed back to using the binary
kernel assets.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-13 21:46:23 +00:00
Rob Bradford
6ddbd60d9d build: Update kernel to v6.12
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-13 21:46:23 +00:00
Rob Bradford
2fe7f54ece build: Bump version number of Docker image
No change to the Dockerfile but I observed that the 20251022-0 image was
not available in the repository.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2025-01-11 15:03:01 +00:00
Rob Bradford
72452707ee scripts: Reduce number of parallel jobs on ARM64 CI
This system is erroring out on jobs due to insufficient memory - reduce
parallelism to allow CI jobs to complete.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-12-20 10:36:13 +00:00
Ruoqing He
261bfac4d4 ci: Constrain FW_URL to x86_64 one
With the 0.5.0 release of `rust-hypervisor-firmware`, an `aarch64`
binary was added to the assets, which causes `FW_URL` to contain
multiple download URLs separated by whitespace, so our integration
tests fail.

Constrain `FW_URL` to `hypervisor-fw` to resolve this.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-12-02 14:14:57 +00:00
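The fix in 261bfac4d4 boils down to selecting exactly one asset by name. The sketch below assumes release metadata shaped like the GitHub releases API response, with an `assets` list whose entries carry `name` and `browser_download_url`; `firmware_url` is a hypothetical helper, and the CI script itself may do the selection differently.

```python
def firmware_url(release, asset_name="hypervisor-fw"):
    """Return the download URL of the single asset named asset_name."""
    urls = [
        asset["browser_download_url"]
        for asset in release.get("assets", [])
        # Exact match, so a second binary with a longer name is ignored.
        if asset.get("name") == asset_name
    ]
    if len(urls) != 1:
        raise LookupError(
            f"expected exactly one asset named {asset_name!r}, found {len(urls)}"
        )
    return urls[0]
```

Failing loudly when the match is not unique surfaces a changed asset list at the download step rather than later in the integration tests.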
Ruoqing He
3c05626ad1 scripts: Replace download_linux with prepare_linux
`prepare_linux` is capable of determining whether we need to invoke
`build_custom_linux` to build linux from source or `download_linux`
to download a pre-built kernel.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-12 16:27:20 +00:00
Ruoqing He
906580ee92 scripts: Add prepare_linux function
`prepare_linux` checks if a `--build-guest-kernel` option is present,
and if so builds the kernel from `cloud-hypervisor/linux.git`.
Otherwise, it invokes `download_linux` to use a pre-built kernel.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-12 16:27:20 +00:00
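The dispatch described in 906580ee92 and 3c05626ad1 can be sketched in a few lines. The function names and the `--build-guest-kernel` option come from the commit messages; passing the two actions in as callables is an assumption made here to keep the example self-contained, and the real shell function may be structured differently.

```python
def prepare_linux(args, build_custom_linux, download_linux):
    """Build the guest kernel when requested, otherwise use a pre-built one."""
    if "--build-guest-kernel" in args:
        return build_custom_linux()
    return download_linux()
```

Callers only ever invoke `prepare_linux`, so the choice between building and downloading lives in one place.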
Ruoqing He
337cbf3d33 scripts: Add consistency check script
Add `package-consistency-check.py` script to prevent #6809 and #6815
from happening. This script takes a string present in the repository
field of packages to identify packages from a specific source.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-02 16:09:09 +00:00
Songqian Li
7e6326b3c5 scripts: fix code coverage script args parsing
Signed-off-by: Songqian Li <sionli@tencent.com>
2024-10-23 18:38:34 +01:00
Rob Bradford
34dc97f74f build: Bump dev container version
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-22 13:08:17 +00:00
Rob Bradford
2fa4dc6338 tests: Limit number of test thread on aarch64
This system has a lot of cores (80) resulting in all the tests being
spawned simultaneously and leading to exhaustion of the available
memory. Instead limit the number of threads.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
b1547c4ccb tests: Update version of Jammy image in use
This version is generated with the new script and adds kexec-tools.

Fixes: #6726

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
c162494867 scripts: Add a script to automate the custom image construction
Only for x86-64 right now but does include support for custom VFIO
image.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
19d36c765f scripts: Only download kernel binaries if changed
Only download the kernel binaries from the github release if the remote
file is newer (avoids multiple copies accumulating in the download
directory.)

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-02 14:50:39 +00:00
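The "only if the remote file is newer" check from 19d36c765f can be sketched as a timestamp comparison. How the script obtains the remote timestamp is not stated in the message, so `remote_mtime` is simply passed in here, and `needs_download` is a hypothetical name.

```python
import os


def needs_download(local_path, remote_mtime):
    """Return True when there is no local copy or the remote one is newer."""
    try:
        local_mtime = os.path.getmtime(local_path)
    except FileNotFoundError:
        return True  # nothing downloaded yet
    return remote_mtime > local_mtime
```

When the check returns True the download overwrites the same path, so no extra copies accumulate in the download directory.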
Songqian Li
0a3ad6153a scripts: add cargo test args for code coverage reports
Add release and target params to `cargo test` since we collect
the code coverage reports from `xx/$BUILD_TARGET/release/`.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-09-28 13:59:40 +00:00
Songqian Li
9f02839448 scripts: add code coverage script
Fixes: #6507

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-09-28 13:59:40 +00:00
Rob Bradford
a10e9e6cd5 build: Use binary kernel artifacts
The Linux kernel fork repository for Cloud Hypervisor now produces
prebuilt x86-64 and aarch64 binaries. Speed up the CI by using those
binaries.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-09 13:09:10 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Rob Bradford
8b86c7724b build: Bump MSRV to 1.77.0
The ability to control the rustc flags (required for adding new
attributes to the allowed list of #[cfg(..)]) requires bumping the MSRV
to 1.77.0

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-08 08:10:28 +00:00
Rob Bradford
f5abb168e3 tests: Re-enable live upgrade tests
And bump the release version used to v39.0

Fixes: #6134

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-04 10:06:04 +00:00
Ruslan Mstoi
5e9886bba4 build: add REUSE Compliance Check
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files

Fixes: #5887

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2024-04-19 17:35:45 +00:00
Ruslan Mstoi
8a80bea4c5 gitlint: add openapi to valid components
See #6372

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2024-04-09 13:05:53 +00:00
Rob Bradford
c4ad9b45d0 build: Use explicit date version number for dev container
This removes the requirement to ensure that we land PRs that update the
Dockerfile (and the appropriate dev_cli.sh change) in a specific time
frame.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-08 21:11:57 +00:00
Rob Bradford
d485896edd build: Bump Rust version from 1.74.0 to 1.74.1
Fixes: #6368

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-08 21:11:57 +00:00
Rob Bradford
084eb0792d build: Bump MSRV to 1.74
This is required for the updated clap crate (see #6237)

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-29 19:42:16 +00:00
Alexandru Matei
3f2ca5375e scripts: Change features_build variable type to array
Because of the double quotes, the current value is passed
to cargo as a single argument with a space in it.
This commit changes it to an array so that each element
is passed as a separate argument.

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-29 11:38:43 +00:00
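The difference that 3f2ca5375e fixes is visible in the argument vector that cargo receives. The sketch below models that vector as a Python list; `cargo_build_argv` is a hypothetical helper, and `--features mshv` is used only as an example value.

```python
def cargo_build_argv(features_build):
    """Build the argv for 'cargo build' from a list of extra arguments."""
    return ["cargo", "build", *features_build]


# Quoted string: cargo sees one argument that contains a space.
as_string = cargo_build_argv(["--features mshv"])
# Array: the flag and its value arrive as two separate arguments.
as_array = cargo_build_argv(["--features", "mshv"])
```

Only the second form lets cargo recognise `--features` as a flag followed by its value.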