Also tweak extended interrupt handlings, as needed.
Most credit should go to Neel Natu, who figured out the magic
bits needed to make things work and provided detailed comments.
This patch is still NOOP, as VM config allows only up to
254 vCPUs on x86_64.
Note: changes in this and related previous patches/PRs have
only been tested on Linux hosts running on Intel x86_64 hardware.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Ofir Weisse <oweisse@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
test_direct_kernel_boot_bzimage runs only on x86, so the cfg!() branch
for selecting grep_cmd is unnecessary. Remove it for clarity.
Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
One can call `to_vec()` anyway if one needs an owned copy. This change
further helps to prevent needless copies in upcoming changes.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
On aarch64 and RISC-V, calling load_firmware() through load_kernel()
provides no benefit and only duplicates checks already performed in
load_payload(). load_payload() now directly invokes load_firmware() or
load_kernel(), removing unnecessary indirection and redundancy.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Both functions are defined separately for the two architecture with
minor differences.
* `load_firmware()`: call `arch::uefi::load_uefi` which are available on
both architecture;
* `load_kernel()`: manually align to `arch::layout::KERNEL_START` 2MB
for both architecture (e.g. no-op for `aarch64`);
Signed-off-by: Bo Chen <bchen@crusoe.ai>
With 254 vCPUs, pausing now takes ~4ms instead of >254ms. This
improvement is visible when running `ch-remote pause` and is
particularly important for live migration, where every millisecond
of downtime matters.
For the wait logic, it is fine to stick to the approach of
sleeping 1ms on the first missed ACK as:
1) we have to wait anyway
2) we give time to the OS, enabling it to schedule a vCPU thread next
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
Enable kvm build test and clippy test on RISC-V 64-bit platform to
ensure whole projects builds properly.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
The underlying problem currently causes unrelated PRs to fail.
This commit fixes that.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
MP table is a legacy device that is incompatible
with x2apic CPU IDs exceeding 254. The Linux kernel
is perfectly happy without MP table in these cases.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Ofir Weisse <oweisse@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Currently, the following scenarios are supported by Cloud Hypervisor to
bootstrap a VM:
1. provide firmware
2. provide kernel
3. provide kernel + cmdline
4. provide kernel + initrd
5. provide kernel + cmdline + initrd
As the difference between `--firmware` and `--kernel` is not very clear
currently, especially as both use/support a Xen PVH entry, adding this
helps to identify the cause of misconfiguration.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This is necessary to use the let-chains feature in a
follow-up. After upgrading to Rust edition 2024, clippy
wants to collapse various if's with let-chains.
Update image to 20250815-0 since MSRV in Dockerfile is updated.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
On-behalf-of: SAP philipp.schuster@sap.com
This patch introduces the inter-vm shared memory(ivshmem) device
to share a memory region between multiple processes running
different guests and the host.
This patch supports the basic ivshmem functions like ivshmem-plain
in QEMU[1].
[1] https://www.qemu.org/docs/master/specs/ivshmem-spec.html
Signed-off-by: Yi Wang <foxywang@tencent.com>
Signed-off-by: Songqian Li <sionli@tencent.com>
Move UserspaceMapping to vm-device to avoid redefinition since
UserspaceMapping is used by both `virtio-devices` and `device`
crate.
Signed-off-by: Songqian Li <sionli@tencent.com>
It is always called with topology provided, so there is no
need to pass topology as an Option. Simplifying the signature
makes further topology-related changes to arc/src/x86_64 module
simpler.
Signed-off-by: Peter Oskolkov <posk@google.com>
This is the second patch in a series intended to let Cloud Hypervisor
support more than 255 vCPUs in guest VMs; the first patch/commit is
https://github.com/cloud-hypervisor/cloud-hypervisor/pull/7231
At the moment, CPU topology in Cloud Hypervisor is using
u8 for components, and somewhat inconsistently:
- struct CpuTopology in vmm/src/vm_config.rs uses four components
(threads_per_core, cores_per_die, dies_per_package, packages);
- when passed around as a tuple, it is a 3-tuple of u8, with
some inconsistency:
- in get_x2apic_id in arch/src/x86_64/mod.rs the three u8
are assumed to be (correctly)
threads_per_core, cores_per_die, and dies_per_package, but
- in get_vcpu_topology() in vmm/src/cpu.rs the three-tuple is
threads_per_core, cores_per_die, and packages (dies_per_package
is assumed to always be one? not clear).
So for consistency, a 4-tuple is always passed around.
In addition, the types of the tuple components is changed from u8 to
u16, as on x86_64 subcomponents can consume up to 16 bits.
Again, config constraints have not been changed, so this patch
is mostly NOOP.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Ofir Weisse <oweisse@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
This is the first change to Cloud Hypervisor in a series of changes
intended to increase the max number of supported vCPUs in guest VMs,
which is currently limited to 255 (254 on x86_64).
No user-visible/behavior changes are expected as a result of
applying this patch, as the type of boot_cpus and related
fields in config structs remains u8 for now, and all configuration
validations remain the same.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Ofir Weisse <oweisse@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
This allows us to enable/disable the fw_cfg device via the cli
We can also now upload files into the guest vm using fw_cfg_items
via the cli
Signed-off-by: Alex Orozco <alexorozco@google.com>
This allows the fw_cfg device to be recognized by the guest linux
kernel. This becomes more relavnt in the following cl where I add
the option to load files into the guest via fw_cfg. The Linux kernel
already has a fw_cfg driver that will automatically load these files
under /sys when CONFIG_FW_CFG_SYSFS is enabled in the kernel config
For arm we must add fw_cfg to the devices tree
Signed-off-by: Alex Orozco <alexorozco@google.com>
We pass a reference to the guest memory when we create the device
in DeviceManager. This allows us to access the guest memory for DMA.
Signed-off-by: Alex Orozco <alexorozco@google.com>
The acpi tables are created in the same place the acpi tables would be
created for the regular bootflow, except here we add them to the
fw_cfg device to be measured by the fw and then the fw will put the
acpi tables into memory.
Signed-off-by: Alex Orozco <alexorozco@google.com>
The kernel and initramfs are passed to the fw_cfg device as
file references. The cmdline is passed directly.
Signed-off-by: Alex Orozco <alexorozco@google.com>
Here we add the fw_cfg device as a legacy device to the device manager.
It is guarded behind a fw_cfg flag in vmm at creation of the
DeviceManager. In this cl we implement the fw_cfg device with one
function (signature).
Signed-off-by: Alex Orozco <alexorozco@google.com>