This is the first change to Cloud Hypervisor in a series of changes
intended to increase the max number of supported vCPUs in guest VMs,
which is currently limited to 255 (254 on x86_64).
No user-visible/behavior changes are expected as a result of
applying this patch, as the type of boot_cpus and related
fields in config structs remains u8 for now, and all configuration
validations remain the same.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Ofir Weisse <oweisse@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
This allows us to enable/disable the fw_cfg device via the cli
We can also now upload files into the guest vm using fw_cfg_items
via the cli
Signed-off-by: Alex Orozco <alexorozco@google.com>
Here we add the fw_cfg device as a legacy device to the device manager.
It is guarded behind a fw_cfg flag in vmm at creation of the
DeviceManager. In this cl we implement the fw_cfg device with one
function (signature).
Signed-off-by: Alex Orozco <alexorozco@google.com>
This partially reverts
ed8f347fe62edd33355ad771615296ff8edc8d33 from #7183 and
6277d7d5f20126945904fefdf5fb990bbcce5ae8 from #7201.
# Output how it was merged for v47 (#7066)
```
Error: Cloud Hypervisor exited with the following chain of errors:
0: Error booting VM
1: The VM could not boot
2: Error manipulating firmware file
3: No such file or directory (os error 2)
Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" })))
```
# Output after #7183 and #7201
```
cloud-hypervisor: 31.385730ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:27 -- Error: Cloud Hypervisor exited with the following chain of errors:
cloud-hypervisor: 31.417961ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 0: Error booting VM
cloud-hypervisor: 31.448078ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 1: The VM could not boot
cloud-hypervisor: 31.486711ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 2: Error manipulating firmware file
cloud-hypervisor: 31.513331ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 3: No such file or directory (os error 2)
cloud-hypervisor: 31.548037ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:44 --
cloud-hypervisor: 31.568045ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:45 -- Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" })))
```
The "proper logger" has indeed the advantage that messages can
be gracefully redirected to log files etc. However, this makes the
error message hardly readable.
Therefore, I propose to use error!() only for runtime errors messages
but not a pretty-printed version of those.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Unify log formatting and printing as `eprintln!` and `log::error!`
would be used alongside each other.
When using e.g. `env_logger` lines printed with `eprintln!` would
lack formatting / colors.
Currently only relevant in `ch-remote` + `cli_print_error_chain`.
Note that the replaced messages now also end up in the logfile of
`cloud-hypervisor` when configured and not any longer in stderr.
Signed-off-by: Maximilian Güntner <code@mguentner.de>
Until now all messages generated using `log::level!`
(e.g., `warn!`) have not been printed as `ch-remote` did not
register a logger.
Furthermore, replace all `eprintln!` with `error!`
to align formatting for consistency.
Signed-off-by: Maximilian Güntner <code@mguentner.de>
Remote server errors are transferred as raw HTTP body. This way,
we lose the nested structured error information.
This is an attempt to retrieve the errors from the HTTP response
and to align the output with the normal error output.
For example, this produces the following chain of errors. Note
that everything after level 0 was retrieved from the HTTP server
response:
```
Error: ch-remote exited with the following chain of errors:
0: http client error
1: Server responded with InternalServerError
2: Error from API
3: The disk could not be added to the VM
4: Failed to validate config
5: Identifier disk1 is not unique
Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("Error from API<br>The disk could not be added to the VM<br>Failed to validate config<br>Identifier disk1 is not unique")))
```
In case the JSON can't be parsed properly, ch-remote will print:
```
Error: ch-remote exited with the following chain of errors:
0: http client error
X: Can't get remote's error messages from JSON response: EOF while parsing a value at line 1 column 0: body=''
Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("")))
```
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
While working on this, I found a subtle but severe compiler bug [0].
To fight the bug with explicitness rather than implicitness (to
prevent weird stuff in the future), this change is beneficial.
The bug is at least in Rust stable 1.34..1.87.
[0]: https://github.com/rust-lang/rust/issues/141673
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
With the foundations of each error type implementing std::error::Error,
we can now nicely walk the `.source()` chain and print an error trace.
This commit introduces improved user-facing error printing when:
- Cloud Hypervisor fails with an error
- ch-remote fails (client error)
- ch-remote fails (remote error)
The additional context is a clear improvement in UX for both users and
developers. In the following example, the new behaviour is shown for
a direct invocation of Cloud Hypervisor leading to a failure. This
looks similar for ch-remote.
```
Old Style
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs /etc/bootitems/linux/initrd_minimal/default`
Error booting VM: VmBoot(LockingError(BlockError(LockDiskImage(AlreadyLocked)))
```
```
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs
/etc/bootitems/linux/initrd_minimal/default`
Error: Cloud Hypervisor exited with the following chain of errors:
0: Error booting VM
1: The VM could not boot
2: Error locking disk images: Another instance likely holds a lock
3: Cannot lock images of all block devices
4: Failed to get Write lock for disk image: ./img.raw
5: The file is already locked
Debug Info: VmBoot(VmBoot(LockingError(DiskLockError(LockDiskImage { error: AlreadyLocked, lock_type: Write, path: "./raw_disk.bin" })))
```
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
The changes were mostly automatically applied using the Python
script mentioned in the first commit of this series.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
The changes were mostly automatically applied using the Python
script mentioned in the first commit of this series.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
We can remove the `tests` module as the entire file is only
available when running tests.
Follow-up of #7130 / 1f13165fae.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Having them sorted alphabetically makes more sense since there are
already many, and the list is growing. This improves the UX for users
and developers.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This streamlines the Error implementation in the Cloud Hypervisor code
base to match the remaining parts so that everything follows the agreed
conventions. These are leftovers missed in the previous commits.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This streamlines the code base to follow best practices for
error handling in Rust: Each error struct implements
std::error::Error (most due via thiserror::Error derive macro)
and sets its source accordingly.
This allows future work that nicely prints the error chains,
for example.
So far, the convention is that each error prints its
sub error as part of its Display::fmt() impl.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This streamlines the code base to follow best practices for
error handling in Rust: Each error struct implements
std::error::Error (most due via thiserror::Error derive macro)
and sets its source accordingly.
This allows future work that nicely prints the error chains,
for example.
So far, the convention is that each error prints its
sub error as part of its Display::fmt() impl.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
The CLI has grown to a big variety of options. clap prints them in the
help message (--help) in the order they were defined. We now are at a
point where grouping things logically together doesn't work well.
Further, there is no support by clap for logical grouping and the
current code base wasn't consistent. Therefore, this commit introduces
two changes:
- a new structure to define arguments (all in an array)
- an alphabetical ordering of the arguments
No other changes have been made. No options have been altered.
This significantly improves:
- code maintainability and extensibility
- readability of the --help output
A unit test ensures they stay sorted. A better approach to check if the
list of arguments (known at build time) is sorted would be a compile
time check (`const`), but this currently isn't possible in stable Rust.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
If any VM argument (e.g. --disk) is provided require some payload (e.g.
--kernel or --firmware) when parsing the command line arguments.
See: #6831
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This patch removes pub import vm_config in config.rs to eliminate
the ambiguity of vm_comfig reference.
Signed-off-by: Songqian Li <sionli@tencent.com>
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.
By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This patch removes locks in VmCreate request and VmInfo response
since we needn't use a lock here and should ensure that internal
implementation is transparent to the runtime.
Signed-off-by: Songqian Li <sionli@tencent.com>
Pvmemcontrol provides a way for the guest to control its physical memory
properties, and enables optimizations and security features. For
example, the guest can provide information to the host where parts of a
hugepage may be unbacked, or sensitive data may not be swapped out, etc.
Pvmemcontrol allows guests to manipulate its gPTE entries in the SLAT,
and also some other properties of the memory map the back's host memory.
This is achieved by using the KVM_CAP_SYNC_MMU capability. When this
capability is available, the changes in the backing of the memory region
on the host are automatically reflected into the guest. For example, an
mmap() or madvise() that affects the region will be made visible
immediately.
There are two components of the implementation: the guest Linux driver
and Virtual Machine Monitor (VMM) device. A guest-allocated shared
buffer is negotiated per-cpu through a few PCI MMIO registers, the VMM
device assigns a unique command for each per-cpu buffer. The guest
writes its pvmemcontrol request in the per-cpu buffer, then writes the
corresponding command into the command register, calling into the VMM
device to perform the pvmemcontrol request.
The synchronous per-cpu shared buffer approach avoids the kick and busy
waiting that the guest would have to do with virtio virtqueue transport.
The Cloud Hypervisor component can be enabled with --pvmemcontrol.
Co-developed-by: Stanko Novakovic <stanko@google.com>
Co-developed-by: Pasha Tatashin <tatashin@google.com>
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
This requires stashing the config values in `struct Vmm`. The configs
should be validated before before creating the VMM thread. Refactor the
code and update documentation where necessary.
The place where the rules are applied remain the same.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Users can use this parameter to pass extra paths that 'vmm' and its
child threads can use at runtime. Hotplug is the primary usecase for
this parameter.
In order to hotplug devices that use local files: disks, memory zones,
pmem devices etc, users can use this option to pass the path/s that will
be used during hotplug while starting cloud-hypervisor. Doing this will
allow landlock to add required rules to grant access to these paths when
cloud-hypervisor process starts.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Users can use this cmdline option to enable/disable Landlock based
sandboxing while running cloud-hypervisor.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Misspellings were identified by:
https://github.com/marketplace/actions/check-spelling
* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Whenever the file descriptor table is full, Linux expands it by doubling
it's size.
The filesystem code that does this uses RCU synchronization to ensure
all pre-existing RCU read-side critical sections have completed. The
latency induced by this synchronization is a big part of the total time
required to restore a snapshot.
The kernel has an optimization in code, where it doesn't call
synchronize_rcu() if there is only one thread in the process. We can
take advantage of this optimization by expanding the descriptor table at
the application start, when it has only one thread.
This commit tries to expand the table to 4096 entries, this way we avoid
any expansion that could take place later.
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Enable restore command the ability to send file descriptors along with
HTTP request. This is useful when restoring a VM with explicit FDs
passed to NetConfig(s).
Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
When using multiple PCI segments, the 32-bit and 64-bit mmio
aperture is split equally between each segment. Add an option
to configure the 'weight'. For example, a PCI segment with a
`mmio32_aperture_weight` of 2 will be allocated twice as much
32-bit mmio space as a normal PCI segment.
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
This commit ensures that the HttpApi thread flushes all the responses
before the application shuts down. Without this step, in case of a
VmmShutdown request the application might terminate before the
thread sends a response.
Fixes: #6247
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
cloud-hypervisor, ch-remote, vhost-user-block and vhost-user-net all
need at least one argument to do anything useful, so printing command
help is helpful when they are run without arguments or a subcommand.
Use clap::Command::arg_required_else_help(true) to do this.
Signed-off-by: Chris Webb <chris@arachsys.com>
ch-remote crashes when run with --api-socket but no subcommand:
$ target/release/ch-remote --api-socket /tmp/api
thread 'main' panicked at src/bin/ch-remote.rs:509:14:
internal error: entered unreachable code
Use clap::Command::subcommand_required(true) to yield a more friendly
error in this case.
Signed-off-by: Chris Webb <chris@arachsys.com>