Commit graph

503 commits

Author SHA1 Message Date
Bo Chen
987ad11c90 main: Report errors with 'error!()'
This was missed from #7183, likely because `eprint!` is used instead of
`eprintln!`.

Signed-off-by: Bo Chen <bchen@crusoe.ai>
2025-07-17 16:18:56 +00:00
Maximilian Güntner
50b33db718 vmm: replace eprintln with log::error
Unify log formatting and printing as `eprintln!` and `log::error!`
would be used alongside each other.
When using e.g. `env_logger` lines printed with `eprintln!` would
lack formatting / colors.
Currently only relevant in `ch-remote` + `cli_print_error_chain`.

Note that the replaced messages now also end up in the logfile of
`cloud-hypervisor` when configured and not any longer in stderr.

Signed-off-by: Maximilian Güntner <code@mguentner.de>
2025-07-10 16:36:54 +00:00
Maximilian Güntner
19dc733267 ch-remote: add env_logger, log messages to stderr
Until now all messages generated using `log::level!`
(e.g., `warn!`) have not been printed as `ch-remote` did not
register a logger.
Furthermore, replace all `eprintln!` with `error!`
to align formatting for consistency.

Signed-off-by: Maximilian Güntner <code@mguentner.de>
2025-07-10 16:36:54 +00:00
Philipp Schuster
190a11f212 ch-remote: also pretty-print remote server errors
Remote server errors are transferred as raw HTTP body. This way,
we lose the nested structured error information.

This is an attempt to retrieve the errors from the HTTP response
and to align the output with the normal error output.

For example, this produces the following chain of errors. Note
that everything after level 0 was retrieved from the HTTP server
response:

```
Error: ch-remote exited with the following chain of errors:
  0: http client error
  1: Server responded with InternalServerError
  2: Error from API
  3: The disk could not be added to the VM
  4: Failed to validate config
  5: Identifier disk1 is not unique

Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("Error from API<br>The disk could not be added to the VM<br>Failed to validate config<br>Identifier disk1 is not unique")))
```

In case the JSON can't be parsed properly, ch-remote will print:

```
Error: ch-remote exited with the following chain of errors:
  0: http client error
  X: Can't get remote's error messages from JSON response: EOF while parsing a value at line 1 column 0: body=''

Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("")))
```

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:55:54 +00:00
Philipp Schuster
6ea132708c vmm: use Error trait directly with Note for compiler bug
While working on this, I found a subtle but severe compiler bug [0].
To fight the bug with explicitness rather than implicitness (to
prevent weird stuff in the future), this change is beneficial.

The bug is at least in Rust stable 1.34..1.87.

[0]: https://github.com/rust-lang/rust/issues/141673

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:55:54 +00:00
Philipp Schuster
060c9de07f vmm: introduce nice error messages on exit (CHV and ch-remote)
With the foundations of each error type implementing std::error::Error,
we can now nicely walk the `.source()` chain and print an error trace.

This commit introduces improved user-facing error printing when:
- Cloud Hypervisor fails with an error
- ch-remote fails (client error)
- ch-remote fails (remote error)

The additional context is a clear improvement in UX for both users and
developers. In the following example, the new behaviour is shown for
a direct invocation of Cloud Hypervisor leading to a failure. This
looks similar for ch-remote.

```
Old Style
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs /etc/bootitems/linux/initrd_minimal/default`

Error booting VM: VmBoot(LockingError(BlockError(LockDiskImage(AlreadyLocked)))
```

```
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs
/etc/bootitems/linux/initrd_minimal/default`

Error: Cloud Hypervisor exited with the following chain of errors:
  0: Error booting VM
  1: The VM could not boot
  2: Error locking disk images: Another instance likely holds a lock
  3: Cannot lock images of all block devices
  4: Failed to get Write lock for disk image: ./img.raw
  5: The file is already locked

Debug Info: VmBoot(VmBoot(LockingError(DiskLockError(LockDiskImage { error: AlreadyLocked, lock_type: Write, path: "./raw_disk.bin" })))
```

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:55:54 +00:00
Philipp Schuster
1b03e59152 misc: streamline error Display::fmt()
The changes were mostly automatically applied using the Python
script mentioned in the first commit of this series.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:55:54 +00:00
Philipp Schuster
5711f31995 misc: ch-remote: streamline error Display::fmt()
The changes were mostly automatically applied using the Python
script mentioned in the first commit of this series.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:55:54 +00:00
Philipp Schuster
53e9c94e68 tests: cleanup test_util module
We can remove the `tests` module as the entire file is only
available when running tests.

Follow-up of #7130 / 1f13165fae.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-13 19:02:39 +00:00
Philipp Schuster
d594107c0d ch-remote: sort all commands and args alphabetically
Having them sorted alphabetically makes more sense since there are
already many, and the list is growing. This improves the UX for users
and developers.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-12 13:53:55 +00:00
Philipp Schuster
1f13165fae tests: prepare common test infrastructure for CLI args
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-12 13:53:55 +00:00
Philipp Schuster
12493db144 ch-remote: move Args and Commands creation to function
This enables to sort them alphabetically in a next step, similar
to #6988 / c37c639f3f610378ba7e523659e2d60ebfd769a4.

[0] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/6988

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-06-12 13:53:55 +00:00
Philipp Schuster
ab6e1bd2d8 misc: ch-remote: streamline #[source] and Error impl
This streamlines the Error implementation in the Cloud Hypervisor code
base to match the remaining parts so that everything follows the agreed
conventions. These are leftovers missed in the previous commits.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-05-22 15:13:27 +00:00
Philipp Schuster
fff62d9302 misc: vmm: streamline #[source] and Error
This streamlines the code base to follow best practices for
error handling in Rust: Each error struct implements
std::error::Error (most due via thiserror::Error derive macro)
and sets its source accordingly.

This allows future work that nicely prints the error chains,
for example.

So far, the convention is that each error prints its
sub error as part of its Display::fmt() impl.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-05-21 09:09:30 +00:00
Philipp Schuster
a212343908 misc: arch/riscv64: streamline #[source] and Error
This streamlines the code base to follow best practices for
error handling in Rust: Each error struct implements
std::error::Error (most due via thiserror::Error derive macro)
and sets its source accordingly.

This allows future work that nicely prints the error chains,
for example.

So far, the convention is that each error prints its
sub error as part of its Display::fmt() impl.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-05-21 09:09:30 +00:00
Philipp Schuster
fa58b725cb vmm: alphabetically sort CLI options in --help output
The CLI has grown to a big variety of options. clap prints them in the
help message (--help) in the order they were defined. We now are at a
point where grouping things logically together doesn't work well.
Further, there is no support by clap for logical grouping and the
current code base wasn't consistent. Therefore, this commit introduces
two changes:

- a new structure to define arguments (all in an array)
- an alphabetical ordering of the arguments

No other changes have been made. No options have been altered.

This significantly improves:
- code maintainability and extensibility
- readability of the --help output

A unit test ensures they stay sorted. A better approach to check if the
list of arguments (known at build time) is sorted would be a compile
time check (`const`), but this currently isn't possible in stable Rust.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
2025-03-20 08:43:09 +00:00
Nikolay Edigaryev
74ca38f7a9 vmm: introduce platform option to limit maximum IOMMU address width
Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
2025-01-14 21:31:47 +00:00
Rob Bradford
2ef04671be main: Place --tpm in the correct argument group
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-11-08 00:12:23 +00:00
Rob Bradford
453bc31994 main: Require a payload to boot when any VM argument provided
If any VM argument (e.g. --disk) is provided require some payload (e.g.
--kernel or --firmware) when parsing the command line arguments.

See: #6831

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-11-08 00:12:23 +00:00
Ruoqing He
0aab960bf1 misc: Elide needless lifetimes
As clippy of rust-toolchain version 1.83.0-beta.1 suggests, elide
needless lifetimes to `'_`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-18 17:46:39 +00:00
Songqian Li
33c15ca273 vmm: remove pub use vm_config in config
This patch removes pub import vm_config in config.rs to eliminate
the ambiguity of vm_comfig reference.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-09-30 08:18:02 +00:00
Ruoqing He
61e57e1cb1 misc: Further improve imports styling
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-09-29 16:13:48 +00:00
Rob Bradford
88a9f79944 misc: Adapt consistent import style formatting
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.

By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-29 13:08:12 +01:00
Songqian Li
cc9899e09d vmm: remove unused mutex in api
This patch removes locks in VmCreate request and VmInfo response
since we needn't use a lock here and should ensure that internal
implementation is transparent to the runtime.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-09-28 14:02:04 +00:00
Yuanchu Xie
5f18ac3bc0 devices: Add pvmemcontrol device
Pvmemcontrol provides a way for the guest to control its physical memory
properties, and enables optimizations and security features. For
example, the guest can provide information to the host where parts of a
hugepage may be unbacked, or sensitive data may not be swapped out, etc.

Pvmemcontrol allows guests to manipulate its gPTE entries in the SLAT,
and also some other properties of the memory map the back's host memory.
This is achieved by using the KVM_CAP_SYNC_MMU capability. When this
capability is available, the changes in the backing of the memory region
on the host are automatically reflected into the guest. For example, an
mmap() or madvise() that affects the region will be made visible
immediately.

There are two components of the implementation: the guest Linux driver
and Virtual Machine Monitor (VMM) device. A guest-allocated shared
buffer is negotiated per-cpu through a few PCI MMIO registers, the VMM
device assigns a unique command for each per-cpu buffer. The guest
writes its pvmemcontrol request in the per-cpu buffer, then writes the
corresponding command into the command register, calling into the VMM
device to perform the pvmemcontrol request.

The synchronous per-cpu shared buffer approach avoids the kick and busy
waiting that the guest would have to do with virtio virtqueue transport.

The Cloud Hypervisor component can be enabled with --pvmemcontrol.

Co-developed-by: Stanko Novakovic <stanko@google.com>
Co-developed-by: Pasha Tatashin <tatashin@google.com>
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
2024-08-05 22:41:56 +00:00
Praveen K Paladugu
bd180bc3eb main: rename landlock_config to landlock_rules
To keep the naming consistent, rename all uses of landlock_config
to landlock_rules.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-08-05 17:46:30 +00:00
Praveen K Paladugu
d2f0e8aebb Revert "vmm: make landlock configs VMM-level config"
This reverts commit 94929889ac.
This revert moves landlock config back to VMConfig.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-08-05 17:46:30 +00:00
Wei Liu
94929889ac vmm: make landlock configs VMM-level config
This requires stashing the config values in `struct Vmm`. The configs
should be validated before before creating the VMM thread. Refactor the
code and update documentation where necessary.

The place where the rules are applied remain the same.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
11c17ca319 main: Enable landlock on main thread
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
130c988380 vmm: Enable Landlock on signal-handler thread
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
8c76a3e4b5 vmm: Enable Landlock on event-monitor thread
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
1d89f98edf vmm: Introduce landlock-rules cmdline param
Users can use this parameter to pass extra paths that 'vmm' and its
child threads can use at runtime. Hotplug is the primary usecase for
this parameter.

In order to hotplug devices that use local files: disks, memory zones,
pmem devices etc, users can use this option to pass the path/s that will
be used during hotplug while starting cloud-hypervisor. Doing this will
allow landlock to add required rules to grant access to these paths when
cloud-hypervisor process starts.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
287dbd4fc9 vmm: Introduce landlock cmdline parameter
Users can use this cmdline option to enable/disable Landlock based
sandboxing while running cloud-hypervisor.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Alexandru Matei
091ce85473 main: update expand_fdtable comment
Updated the comment so it is sync with the code

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-05-28 14:50:40 +01:00
Alexandru Matei
f13d8f1412 main: fix high latency generated by file handle creation
Whenever the file descriptor table is full, Linux expands it by doubling
it's size.
The filesystem code that does this uses RCU synchronization to ensure
all pre-existing RCU read-side critical sections have completed. The
latency induced by this synchronization is a big part of the total time
required to restore a snapshot.

The kernel has an optimization in code, where it doesn't call
synchronize_rcu() if there is only one thread in the process. We can
take advantage of this optimization by expanding the descriptor table at
the application start, when it has only one thread.

This commit tries to expand the table to 4096 entries, this way we avoid
any expansion that could take place later.

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-05-25 01:09:10 +00:00
Omer Faruk Bayram
036e7e3797 vmm: ch-remote: replace deprecated zbus macros with new equivalents
Fixes deprecation related warnings introduced in #6400.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2024-05-23 12:20:06 +00:00
Purna Pavan Chandra
555c4c41ab ch-remote: allow fds to be sent along with 'restore'
Enable restore command the ability to send file descriptors along with
HTTP request. This is useful when restoring a VM with explicit FDs
passed to NetConfig(s).

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-05-14 10:52:46 +00:00
Yi Wang
4fd5070f5d ch-remote: fix help of remove-device
remove-device can remove not only VFIO device but also pci device.

No functional change.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-05-09 14:34:30 +00:00
Thomas Barrett
e7e856d8ac vmm: add pci_segment mmio aperture configs
When using multiple PCI segments, the 32-bit and 64-bit mmio
aperture is split equally between each segment. Add an option
to configure the 'weight'. For example, a PCI segment with a
`mmio32_aperture_weight` of 2 will be allocated twice as much
32-bit mmio space as a normal PCI segment.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-04-24 09:35:19 +00:00
Yi Wang
1708561c74 ch-remote: add support for nmi
Adding the wrapping layer to be able to trigger NMI for the guest
from the ch-remote tool.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-03-04 10:02:38 +00:00
Alexandru Matei
1091494320 vmm: http: graceful shutdown of the http api thread
This commit ensures that the HttpApi thread flushes all the responses
before the application shuts down. Without this step, in case of a
VmmShutdown request the application might terminate before the
thread sends a response.

Fixes: #6247

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-29 12:34:30 +00:00
Chris Webb
0310c5726f main: Show help text when run without arguments
cloud-hypervisor, ch-remote, vhost-user-block and vhost-user-net all
need at least one argument to do anything useful, so printing command
help is helpful when they are run without arguments or a subcommand.

Use clap::Command::arg_required_else_help(true) to do this.

Signed-off-by: Chris Webb <chris@arachsys.com>
2024-02-24 09:35:37 +00:00
Chris Webb
5627c26405 ch-remote: Fix crash when run with no subcommand
ch-remote crashes when run with --api-socket but no subcommand:

  $ target/release/ch-remote --api-socket /tmp/api
  thread 'main' panicked at src/bin/ch-remote.rs:509:14:
  internal error: entered unreachable code

Use clap::Command::subcommand_required(true) to yield a more friendly
error in this case.

Signed-off-by: Chris Webb <chris@arachsys.com>
2024-02-24 09:35:37 +00:00
Muminul Islam
aa6c486a6b vmm: add host-data as a command line argument
The host data provided at launch. Data is passed
to the hypervisor during the completion of the
isolated import.

Host Data provided by the hypervisor during guest launch.
The firmware includes this value in all attestation
reports for the guest.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
Chris Webb
09f3658999 vmm: Avoid zombie sigwinch_listener processes
When a guest running on a terminal reboots, the sigwinch_listener
subprocess exits and a new one restarts. The parent never wait()s
for children, so the old subprocess remains as a zombie. With further
reboots, more and more zombies build up.

As there are no other children for which we want the exit status,
the easiest fix is to take advantage of the implicit reaping specified
by POSIX when we set the disposition of SIGCHLD to SIG_IGN.

For this to work, we also need to set the correct default exit signal
of SIGCHLD when using clone3() CLONE_CLEAR_SIGHAND. Unlike the fallback
fork() path, clone_args::default() initialises the exit signal to zero,
which results in a child with non-standard reaping behaviour.

Signed-off-by: Chris Webb <chris@arachsys.com>
2024-02-19 17:08:47 +00:00
Muminul Islam
56dbb8f0db main: Support igvm as a payload
Currently kernel and firmware are checked as a payload.
IGVM should be checked as well. Otherwise, it hangs indefinitely.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-08 09:46:45 -08:00
Bo Chen
c1f4a7b295 main: Clarify truncate behavior for event monitor file
Fix beta clippy issue:

error: file opened with `create`, but `truncate` behavior not defined
   --> src/main.rs:624:26
    |
624 |                         .create(true)
    |                          ^^^^^^^^^^^^- help: add: `.truncate(true)`
    |
    = help: if you intend to overwrite an existing file entirely, call `.truncate(true)`
    = help: if you instead know that you may want to keep some parts of the old file, call `.truncate(false)`
    = help: alternatively, use `.append(true)` to append to the file instead of overwriting it
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#suspicious_open_options
    = note: `-D clippy::suspicious-open-options` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::suspicious_open_options)]`

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-02-07 09:25:40 +00:00
Philipp Schuster
e50a641126 devices: add debug-console device
This commit adds the debug-console (or debugcon) device to CHV. It is a
very simple device on I/O port 0xe9 supported by QEMU and BOCHS. It is
meant for printing information as easy as possible, without any
necessary configuration from the guest at all.

It is primarily interesting to OS/kernel and firmware developers as they
can produce output as soon as the guest starts without any configuration
of a serial device or similar. Furthermore, a kernel hacker might use
this device for information of type B whereas information of type A are
printed to the serial device.

This device is not used by default by Linux, Windows, or any other
"real" OS, but only by toy kernels and during firmware development.

In the CLI, it can be configured similar to --console or --serial with
the --debug-console parameter.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2024-01-25 10:25:14 -08:00