cloud-hypervisor

Author	SHA1	Message	Date
Songqian Li	bd17c84d3c	virtio-devices: move userspace mapping to vm-device Move UserspaceMapping to vm-device to avoid redefinition since UserspaceMapping is used by both `virtio-devices` and `device` crate. Signed-off-by: Songqian Li <sionli@tencent.com>	2025-08-14 22:14:34 +00:00
Songqian Li	cd2c43b489	misc: Fix beta clippy errors Fix clippy error: "error: manual implementation of `.is_multiple_of() `" from rustc 1.90.0-beta.1 (788da80fc 2025-08-04). Signed-off-by: Songqian Li <sionli@tencent.com>	2025-08-07 16:53:59 +00:00
Oliver Anderson	8c136041cb	build: Use workspace dependencies Many of the workspace members in the Cloud-hypervisor workspace share common dependencies. Making these workspace dependencies reduces duplication and improves maintainability. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com	2025-07-28 20:19:27 +00:00
dependabot[bot]	b0bf889d58	build: Bump serde_with from 3.9.0 to 3.14.0 Bumps [serde_with](https://github.com/jonasbb/serde_with) from 3.9.0 to 3.14.0. - [Release notes](https://github.com/jonasbb/serde_with/releases) - [Commits](https://github.com/jonasbb/serde_with/compare/v3.9.0...v3.14.0) --- updated-dependencies: - dependency-name: serde_with dependency-version: 3.14.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-26 00:03:54 +00:00
Songqian Li	e32fa593e5	build: clean up unused dependencies Signed-off-by: Songqian Li <sionli@tencent.com>	2025-07-15 07:16:36 +00:00
Alyssa Ross	ec8fceb4a6	virtio-devices: stop corrupting vsock commands The read_exact() call was introduced in `82ac114b8` ("virtio-devices: vsock: handle short read in muxer") to solve a crash when a connection disconnected without sending any data, but it introduced a problem of its own: because the socket is non-blocking, read_exact() may read some data, then return ErrorKind::WouldBlock. In that case, the data it read will be discarded. So for example if it read "CONNECT ", and then nothing else was available to read yet, "CONNECT " would be discarded, and so the next time this function was called, when epoll triggered again for the socket, only the following data would end up in command.buf, causing an error due to just a port number being an invalid command. Contrary to that commit message, this code was actually designed to handle short reads just fine — in the case of a short read, it stores the data it has read in command, and returns Error::UnixRead(ErrorKind::WouldBlock), which is ignored by the caller, and the function gets called again when there is more data to read, building up command potentially over the course of several reads. The only thing it didn't handle correctly, as far as I can tell, was a 0-byte read, which happens when a client disconnects from the socket without writing anything. All that's needed to fix this is to avoid an invalid subtraction in that case, so this change reverts `82ac114b8`, fixing the issue with partial commands being discarded, and instead handles the 0-byte read by using slice::get, and treating an empty command as an incomplete command, which of course it is. Fixes: `82ac114b8` ("virtio-devices: vsock: handle short read in muxer") Signed-off-by: Alyssa Ross <hi@alyssa.is>	2025-07-14 18:07:07 +00:00
Alyssa Ross	01aed9733c	build: add missing dependency features This makes it possible to run cargo test just for the virtio-devices crate (as long as either KVM or MSHV is specified). Signed-off-by: Alyssa Ross <hi@alyssa.is>	2025-07-14 18:06:54 +00:00
Muminul Islam	b268e88ba3	virtio-devices: remove unnecessary parentheses Cargo fuzz build reports a warning: warning: unnecessary parentheses around closure body --> virtio-devices/src/iommu.rs:578:41 \| 578 \|.retain(\|&x, _\| (x < req.virt_start \|\| x > req.virt_end)); \| ^ \| = note: `#[warn(unused_parens)]` on by default help: remove these parentheses \| 578 -.retain(\|&x, _\| (x < req.virt_start \|\| x > req.virt_end)); 578 +.retain(\|&x, _\| x < req.virt_start \|\| x > req.virt_end); \| warning: `virtio-devices` (lib) generated 1 warning (run `cargo fix --lib -p virtio-devices` to apply 1 suggestion) Signed-off-by: Muminul Islam <muislam@microsoft.com>	2025-07-11 22:02:15 +00:00
Jinank Jain	190d90196f	build: Bump vfio and all the dependent crates to latest version Recently vfio crates have moved to crates.io, thus we should start consuming the crate from crates.io instead of a git URL. This results in better versioning instead of tracking some git commit sha. Signed-off-by: Jinank Jain <jinankjain@microsoft.com>	2025-07-07 03:05:38 +00:00
Philipp Schuster	1433763d40	misc: virtio-devices: manual adjustment of special case Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-06-13 19:55:54 +00:00
Philipp Schuster	8e2973fe7c	misc: virtio-devices: streamline error Display::fmt() The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-06-13 19:55:54 +00:00
Gauthier Jolly	3d78662498	block: virtio-blk: report IO errors to the guest Instead of exiting on IO errors, report the errors to the guest with VIRTIO_BLK_S_IOERR. For example, the guest kernel will log something similar to this if the nbd behind /dev/vdc is unexpectedly disconnected: [ 166.033957] I/O error, dev vdc, sector 264 op 0x1:(WRITE) flags 0x9800 phys_seg 1 prio class 2 [ 166.035083] Aborting journal on device vdc-8. [ 166.037307] Buffer I/O error on dev vdc, logical block 9, lost sync page write [ 166.038471] JBD2: I/O error when updating journal superblock for vdc-8. [...] [ 174.234470] EXT4-fs (vdc): I/O error while writing superblock In case the rootfs is not located on the affected block device, this will not crash the guest. Fixes: #6995 Signed-off-by: Gauthier Jolly <contact@gjolly.fr>	2025-06-09 16:48:07 +00:00
Jinank Jain	2bc8d51a60	misc: Fix missing lifetime syntax clippy warning This was caught by the nightly compiler during cargo fuzz build. error: lifetime flowing from input to output with different syntax can be confusing --> /home/runner/work/cloud-hypervisor/cloud-hypervisor/hypervisor/src/arch/x86/emulator/mod.rs:493:26 \| 493 \| pub fn new(platform: &mut dyn PlatformEmulator<CpuState = T>) -> Emulator<T> { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ----------- the lifetime gets resolved as `'_` \| \| \| this lifetime flows to the output \| = note: `-D mismatched-lifetime-syntaxes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(mismatched_lifetime_syntaxes)]` help: one option is to remove the lifetime for references and use the anonymous lifetime for paths \| 493 \| pub fn new(platform: &mut dyn PlatformEmulator<CpuState = T>) -> Emulator<'_, T> { Signed-off-by: Jinank Jain <jinankjain@microsoft.com>	2025-06-09 11:19:11 +00:00
Philipp Schuster	20296e909a	misc: streamline thiserror cargo dep As almost every sub crate depends on thiserror, let's upgrade it to a workspace dependency. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-28 17:24:34 +00:00
Philipp Schuster	28e0a95450	misc: virtio-devices: streamline #[source] and Error This streamlines the code base to follow best practices for error handling in Rust: Each error struct implements std::error::Error (mostly via the thiserror::Error derive macro) and sets its source accordingly. This allows future work that nicely prints the error chains, for example. So far, the convention is that each error prints its sub error as part of its Display::fmt() impl. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-21 09:09:30 +00:00
Philipp Schuster	a615c809eb	misc: vsock: streamline #[source] and Error This streamlines the code base to follow best practices for error handling in Rust: Each error struct implements std::error::Error (mostly via the thiserror::Error derive macro) and sets its source accordingly. This allows future work that nicely prints the error chains, for example. So far, the convention is that each error prints its sub error as part of its Display::fmt() impl. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-21 09:09:30 +00:00
Philipp Schuster	a212343908	misc: arch/riscv64: streamline #[source] and Error This streamlines the code base to follow best practices for error handling in Rust: Each error struct implements std::error::Error (mostly via the thiserror::Error derive macro) and sets its source accordingly. This allows future work that nicely prints the error chains, for example. So far, the convention is that each error prints its sub error as part of its Display::fmt() impl. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-21 09:09:30 +00:00
Gregory Anders	dce82a34d0	net_util: add support for IPv6 addresses on tap interfaces Allow tap interfaces to be configured with an IPv6 address. The change is fairly straightforward: we need to update the API types and CLI parsing to accept either an IPv6 or IPv4 and then match on the IP address type when the tap device is configured. For IPv6 addresses, the netmask (prefix) must be provided at the same time as the address itself (in the SIOCSIFADDR ioctl). They cannot be configured separately. So we remove the separate "set_netmask" function and convert "set_ip_addr" to also accept a netmask. For IPv4 addresses, the IP address and netmask were already always set together, so this should have no functional impact for users of IPv4 addresses. Signed-off-by: Gregory Anders <ganders@cloudflare.com>	2025-05-20 16:41:04 +00:00
Philipp Schuster	78b0f68b21	vmm: Error for MemoryManagerError Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP <philipp.schuster@sap.com>	2025-05-16 11:42:01 +00:00
Philipp Schuster	b2993fb2fa	vmm: Error for vsock::unix::Error Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-16 11:42:01 +00:00
Philipp Schuster	05968f5c2c	block: introduce advisory locks for disk image files # What This commit introduces file-based advisory locking for the files backing up the block devices by using the fcntl() syscall with OFD locks. The per-open-file-descriptor (OFD) locks are more robust than traditional POSIX locks (F_SETLK) as they are not tied to process IDs and avoid common issues in multithreaded or multi-fd scenarios [1]. Therefore, we don't use `std::fs::File::try_lock()`, which is backed by F_SETLKW. The locking mechanism is aware of the `readonly` property and allows `n` readers or `1` writer (exclusive mode). As the locks are advisory, multiple cloud-hypervisor processes can prevent themselves from writing to the same file. However, this is not a system-wide file-system level locking mechanism preventing to open() a file. The introduced new locking mechanism does not cover vhost-user devices. # Why To prevent misconfiguration and improve safety, it is good practice to protect disk image files with a locking mechanism. Experience and common best practices suggest that advisory locks are preferable over mandatory locks due to better compatibility and fewer pitfalls (in fs space). The introduced functionality is aligned with the approach taken by QEMU [0], and is also recommended in [1]. # Implementation Details We need to ensure that not only normal operation keeps working but also state save/resume and live-migration. Especially for live migration, it is crucial that the sender VMM releases the locks when the VM stops so the receiver VMM can acquire them right after that. Therefore, the locking and releasing happen directly on the block device struct. The device manager knows all block devices and can forward requests to these types. Last but not least, this commit uses on explicit lock acquiring but implicit lock releasing (FD close). It only explicitly releases the locks where this integrates more smoothly into the existing code. 
# Testing I tested - normal operation - state save/resume, - device hot plugging, - and live-migration with read/shared and write/exclusive locks. One can use the `fcntl-tool` to test if locks are actually acquired or released [2]. # Links [0] `825b96dbce/util/osdep.c (L266)` [1] https://apenwarr.ca/log/20101213 [2] https://crates.io/crates/fcntl-tool Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com	2025-05-16 08:07:32 +00:00
Bo Chen	aaf86ef209	pci: Reprogram device BAR when its MSE bit is set The Memory Space Enable (MSE) bit from the COMMAND register in the PCI configuration space controls whether a PCI device responds to memory space accesses, e.g. read and write cycles to the device MMIO regions defined by its BARs. The MSE bit is used by the device drivers to ensure the correctness of BAR reprogramming. A common workflow is, the driver first clears the MSE bit, then writes new values to the BAR registers, and finally set the MSE bit to finish the BAR reprogramming. This patch changes how we handle BAR reprogramming for all PCI devices (e.g. virtio-pci, vfio, vfio-user, etc.), so that we follow the same convention, e.g. moving PCI BARs when its MSE bit is set. Note that some device drivers (such as edk2) only clear and set MSE once while reprogramming multiple BARs of a single device. To support such behavior, this patch adds support for multiple pending BAR reprogramming. See: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7027#issuecomment-2853642959 Signed-off-by: Bo Chen <bchen@crusoe.ai>	2025-05-15 17:35:44 +00:00
Bo Chen	cb52cf91df	pci: Keep `detect_bar_reprogramming` internal to `PciConfiguration` A BAR reprogramming of a PCI device will only happen when the (guest) kernel writes to its PCI config space, i.e. the detection of BAR reprogramming (`detect_bar_reprogramming()`) can be embedded in the PCI config space write (`write_config_register()`). It simplifies APIs exposed by the `struct PciConfiguration` and `trait PciDevice`. It also prepares for easier handling of pending BAR reprogramming when the MSE bit of the COMMAND register is not enabled at the time of changing BAR registers. See: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/7027#issuecomment-2853642959 Signed-off-by: Bo Chen <bchen@crusoe.ai>	2025-05-15 15:04:01 +00:00
Jinank Jain	ea4693a091	misc: Fix clippy error from beta compiler Rust has a new way of constructing other error and clippy complains if we are still using the older way to construct error message. Thus, migrate to the new approach suggested by the clippy. Warning from beta compiler: error: this can be `std::io::Error::other(_)` --> block/src/vhdx/mod.rs:142:17 \| \| / std::io::Error::new( \| \| std::io::ErrorKind::Other, \| \| format!("Failed to update VHDx header: {e}"), \| \| ) \| \|_________________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#io_other_error help: use `std::io::Error::other` std::io::Error::other( format!("Failed to update VHDx header: {e}"), Signed-off-by: Jinank Jain <jinankjain@microsoft.com>	2025-04-03 13:11:49 +00:00
Jinank Jain	3698b8e74c	build: Centralize serde_json crate to workspace `serde_json` crate is referenced by multiple components, centralize it to workspace to better manage this crate. Signed-off-by: Jinank Jain <jinankjain@microsoft.com>	2025-04-02 06:20:54 +00:00
Ruoqing He	4de422ad69	misc: Fix clippy - manually reimplementing div_ceil Reported by 1.86.0-beta.1 (f0cb41030 2025-02-17). Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-01 01:02:17 +00:00
Ruoqing He	c441bb2968	misc: Fix clippy - doc list item overindented Reported by 1.86.0-beta.1 (f0cb41030 2025-02-17). Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-01 01:02:17 +00:00
Nikolay Edigaryev	27fda753e1	virtio-devices: iommu: allow limiting maximum address width in bits Currently, Cloud Hypervisor does not set a VIRTIO_IOMMU_F_INPUT_RANGE feature bit for the VirtIO IOMMU device, which, according to spec[1], means that the guest may use the whole 64-bit address space for IOMMU purposes: >If the feature is not offered, virtual mappings span over the whole >64-bit address space (start = 0, end = 0xffffffff ffffffff) As far as I am aware, there are currently no host platforms on the market capable of addressing the whole 64-bit address space. For example, I am currently working with a host platform that reports 39-bit address space for IOMMU purposes: >DMAR: Host address width 39 When running a VFIO pass-through guest on such a platform, NVIDIA driver in guest gets DMA mapping failures when working with large data, and this results in Cloud Hypervisor exiting with the following error: >cloud-hypervisor: 1501.220535s: <__iommu> >ERROR:virtio-devices/src/thread_helper.rs:53 -- Error running worker: >HandleEvent(Failed to process request queue : ExternalMapping(Custom >{ kind: Other, error: "failed to map memory for VFIO container, iova >0x7fff00000000, gpa 0x24ce25000, size 0x1000: IommuDmaMap(Error(22))" >})) Passing "--platform iommu_address_width=39" to Cloud Hypervisor built with this change fixes this. [1]: https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-5420006 Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>	2025-01-14 21:31:47 +00:00
Wei Liu	0cb2c86ff4	fuzz: introduce a virtio vsock fuzzer Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-14 00:26:01 +00:00
Wei Liu	d359c8cdce	virtio-devices: vsock: allow fuzzer to use TestBackend Instead of reinventing this mock infrastructure in the upcoming fuzzer, reuse the one that is already available. However this change makes Clippy complain that TestBackend and TestContext don't implement Default. This is just test code, we can suppress Clippy in this case. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-14 00:26:01 +00:00
Rob Bradford	2fc4de6c65	virtio-devices: iommu: Use hex formatting in log messages This means that the addresses are more easily readable. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2025-01-13 21:46:23 +00:00
Rob Bradford	03eeb36b74	virtio-devices: iommu: Search full range for GVA conversion Remove an erroneous optimisation that used the page size mask to reduce the range to iterate through on the set of mappings. This doesn't work as the virtio-iommu ranges are larger than a single page. This may have worked in the past when the mappings were limited to a single page. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2025-01-13 21:46:23 +00:00
Wei Liu	a1af4238ae	virtio-devices: make ioeventfds() return an iterator MSHV's SEV-SNP implementation calls ioeventfds whenever there is an event. This change removes the need for frequent allocation and deallocation of a vector, while at the same time making sure other call sites are unaffected. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-09 21:28:46 +00:00
Wei Liu	d2e798944a	virtio-devices: rename two variables They are used. No need to start their names with an underscore. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-09 21:28:46 +00:00
Wei Liu	d99f294281	pci: rename as_any to as_any_mut That trait function returns a mutable reference. Rename it to follow Rust's convention. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-09 21:28:46 +00:00
Wei Liu	778c05d678	virtio-devices: use C ABI-qualification for packed structures Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-09 13:51:42 +00:00
Rob Bradford	2624f17ffe	virtio-devices: Automatically fix operator precedence clippy warning Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2025-01-07 17:44:41 +00:00
Rob Bradford	eeae63b459	build: Bump thiserror version Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2025-01-06 17:39:45 +00:00
Wei Liu	32482f6634	block: make available VIRTIO_BLK_F_SEG_MAX This allows the guest to put in more than one segment per request. It can improve the throughput of the system. Introduce a new check to make sure the queue size configured by the user is large enough to hold at least one segment. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2025-01-01 18:50:39 +00:00
dependabot[bot]	0c2f2d3ec1	build: Bump anyhow from 1.0.87 to 1.0.94 Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.87 to 1.0.94. - [Release notes](https://github.com/dtolnay/anyhow/releases) - [Commits](https://github.com/dtolnay/anyhow/compare/1.0.87...1.0.94) --- updated-dependencies: - dependency-name: anyhow dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-12-05 00:30:01 +00:00
dependabot[bot]	30cf1eed5e	build: Bump libc from 0.2.158 to 0.2.167 Bumps [libc](https://github.com/rust-lang/libc) from 0.2.158 to 0.2.167. - [Release notes](https://github.com/rust-lang/libc/releases) - [Changelog](https://github.com/rust-lang/libc/blob/0.2.167/CHANGELOG.md) - [Commits](https://github.com/rust-lang/libc/compare/0.2.158...0.2.167) --- updated-dependencies: - dependency-name: libc dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-12-03 01:15:36 +00:00
Ruoqing He	ab7b294688	misc: Replace map_or on false with is_some_and Replace `map_or()` on false condition with `is_some_and` to provide better readability, as suggested by v1.84.0-beta.1 `cargo clippy`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2024-11-29 12:44:33 +00:00
Rob Bradford	df1d6eaaee	virtio-devices: Enable VIRTIO_RING_F_INDIRECT_DESC This improves sequential write performance using fio (2888MiB/s -> 3293MiB/s) VM config: cloud-hypervisor --disk path=~/workloads/jammy.raw,direct=on path=~/workloads/big-disk.img,direct=on --cpus boot=1 --memory size=2G,shared=on --serial tty --console off --seccomp log --kernel ~/workloads/hypervisor-fw Host: fio --filename=big-disk.img --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1 VM: fio --filename=/dev/vdb --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1 Baseline (file on filesystem on host used as backing store for block device): throughput-test-job: (groupid=0, jobs=1): err= 0: pid=10169: Tue Nov 5 09:31:55 2024 write: IOPS=13.5k, BW=3385MiB/s (3549MB/s)(397GiB/120008msec); 0 zone resets slat (usec): min=4, max=10222, avg=20.25, stdev=29.01 clat (usec): min=984, max=45599, avg=4706.01, stdev=2278.11 lat (usec): min=1002, max=45610, avg=4726.27, stdev=2278.77 clat percentiles (usec): \| 1.00th=[ 3195], 5.00th=[ 3228], 10.00th=[ 3261], 20.00th=[ 3261], \| 30.00th=[ 3261], 40.00th=[ 3261], 50.00th=[ 3294], 60.00th=[ 3916], \| 70.00th=[ 5014], 80.00th=[ 7308], 90.00th=[ 7635], 95.00th=[ 7898], \| 99.00th=[ 8586], 99.50th=[ 8979], 99.90th=[36439], 99.95th=[36963], \| 99.99th=[43779] bw ( MiB/s): min= 1934, max= 4821, per=100.00%, avg=3391.67, stdev=1266.42, samples=239 iops : min= 7738, max=19286, avg=13566.67, stdev=5065.65, samples=239 lat (usec) : 1000=0.01% lat (msec) : 2=0.03%, 4=61.10%, 10=38.62%, 20=0.11%, 50=0.15% cpu : usr=17.13%, sys=14.38%, ctx=1352501, majf=0, minf=11 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 
16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=0,1624829,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): WRITE: bw=3385MiB/s (3549MB/s), 3385MiB/s-3385MiB/s (3549MB/s-3549MB/s), io=397GiB (426GB), run=120008-120008msec Disk stats (read/write): dm-2: ios=129/1624787, sectors=1872/831364040, merge=0/0, ticks=185/6960387, in_queue=6960572, util=100.00%, aggrios=130/1626025, aggsectors=1880/831915888, aggrmerge=0/0, aggrticks=194/6967818, aggrin_queue=6968012, aggrutil=99.97% dm-0: ios=130/1626025, sectors=1880/831915888, merge=0/0, ticks=194/6967818, in_queue=6968012, util=99.97%, aggrios=130/1606095, aggsectors=1880/831915888, aggrmerge=0/19930, aggrticks=204/6634488, aggrin_queue=6635288, aggrutil=58.59% nvme0n1: ios=130/1606095, sectors=1880/831915888, merge=0/19930, ticks=204/6634488, in_queue=6635288, util=58.59% On block device in VM: throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov 5 09:53:19 2024 write: IOPS=13.2k, BW=3293MiB/s (3453MB/s)(386GiB/120008msec); 0 zone resets slat (usec): min=4, max=3518, avg=27.77, stdev=35.32 clat (usec): min=723, max=44252, avg=4829.82, stdev=2222.41 lat (usec): min=735, max=44270, avg=4857.85, stdev=2223.45 clat percentiles (usec): \| 1.00th=[ 3097], 5.00th=[ 3195], 10.00th=[ 3195], 20.00th=[ 3228], \| 30.00th=[ 3261], 40.00th=[ 3294], 50.00th=[ 3621], 60.00th=[ 4555], \| 70.00th=[ 5997], 80.00th=[ 7242], 90.00th=[ 7570], 95.00th=[ 7898], \| 99.00th=[ 8586], 99.50th=[ 8848], 99.90th=[36439], 99.95th=[36963], \| 99.99th=[40633] bw ( MiB/s): min= 1914, max= 4857, per=100.00%, avg=3299.46, stdev=1180.81, samples=239 iops : min= 7658, max=19430, avg=13197.77, stdev=4723.22, samples=239 lat (usec) : 750=0.01%, 1000=0.01% lat (msec) : 2=0.01%, 4=52.79%, 10=46.95%, 20=0.10%, 50=0.14% cpu : usr=25.95%, sys=16.71%, ctx=1111821, majf=0, minf=10 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 
4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=0,1580693,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): WRITE: bw=3293MiB/s (3453MB/s), 3293MiB/s-3293MiB/s (3453MB/s-3453MB/s), io=386GiB (414GB), run=120008-120008msec Disk stats (read/write): vdb: ios=60/1953213, merge=0/0, ticks=14/8229134, in_queue=8229149, util=100.00% Prior to change: throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov 5 09:37:45 2024 write: IOPS=11.6k, BW=2888MiB/s (3028MB/s)(338GiB/120008msec); 0 zone resets slat (usec): min=3, max=3200, avg=18.48, stdev=24.54 clat (usec): min=1237, max=46575, avg=5521.41, stdev=2641.99 lat (usec): min=1249, max=46591, avg=5540.06, stdev=2643.54 clat percentiles (usec): \| 1.00th=[ 2999], 5.00th=[ 3163], 10.00th=[ 3195], 20.00th=[ 3261], \| 30.00th=[ 3294], 40.00th=[ 3359], 50.00th=[ 6063], 60.00th=[ 7111], \| 70.00th=[ 7373], 80.00th=[ 7570], 90.00th=[ 7832], 95.00th=[ 8094], \| 99.00th=[ 8717], 99.50th=[ 9241], 99.90th=[36963], 99.95th=[37487], \| 99.99th=[41157] bw ( MiB/s): min= 1936, max= 4826, per=100.00%, avg=2892.43, stdev=1202.99, samples=239 iops : min= 7746, max=19306, avg=11569.68, stdev=4811.98, samples=239 lat (msec) : 2=0.01%, 4=46.26%, 10=53.38%, 20=0.09%, 50=0.26% cpu : usr=14.20%, sys=8.59%, ctx=1246257, majf=0, minf=12 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=0,1386102,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): WRITE: bw=2888MiB/s (3028MB/s), 2888MiB/s-2888MiB/s (3028MB/s-3028MB/s), io=338GiB (363GB), run=120008-120008msec Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	
2024-11-05 15:44:41 +00:00
Ruoqing He	0aab960bf1	misc: Elide needless lifetimes As clippy of rust-toolchain version 1.83.0-beta.1 suggests, elide needless lifetimes to `'_`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2024-10-18 17:46:39 +00:00
Ruoqing He	297236a7c0	misc: Eliminate use of `assert!((...).is_ok())` Asserting on .is_ok()/.is_err() leads to hard-to-debug failures (if the test fails, it will only say "assertion failed: false"). We replace these with `.unwrap()`, which also prints the exact error variant that was unexpectedly encountered (we can do this these days thanks to efforts to implement Display and Debug for our error types). If the assert!((...).is_ok()) was followed by an .unwrap() anyway, we just drop the assert. Inspired by and quoted from @roypat. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2024-10-03 12:03:49 +00:00
Ruoqing He	61e57e1cb1	misc: Further improve imports styling By introducing `imports_granularity="Module"` format strategy, effectively groups imports from the same module into one line or block, improving maintainability and readability. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2024-09-29 16:13:48 +00:00
Rob Bradford	88a9f79944	misc: Adapt consistent import style formatting Historically the Cloud Hypervisor coding style has been to ensure that all imports are ordered and placed in a single group. Unfortunately cargo fmt has no support for ensuring that all imports are in a single group, so if whitespace lines were added as part of the import statements then they would only be ordered correctly within each group. By adopting `group_imports="StdExternalCrate"` we can enforce a style where imports are placed in at most three groups for std, external crates and the crate itself. Choosing a style enforceable by the tooling reduces the reviewer burden. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2024-09-29 13:08:12 +01:00
Ruoqing He	5a70d7ec69	build: Centralize rust-vmm crates to workspace Modify `Cargo.toml` in each member crate to follow the dependencies specified in root `Cargo.toml` file. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2024-09-27 15:58:21 +00:00
Rob Bradford	d90fa96bb7	build: Bulk update vm-memory and related dependencies Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2024-09-26 12:31:25 +00:00
Alyssa Ross	287887c99c	vmm: fix console IO safety Rebooting a VM fails with the following error when debug assertions are enabled: fatal runtime error: IO Safety violation: owned file descriptor already closed This happens because FromRawFd::from_raw_fd is used on RawFds stored in ConsoleInfo every time a VM begins to boot, so the second time (after a reboot, or if the first attempt to boot via the API failed), the fd will be closed. Until this assertion is hit, the code is operating on either closed file descriptors, or new file descriptors for something completely different. If debug assertions are disabled, it will just continue doing this with unpredictable results. To fix this, and prevent the problem recurring, ownership of the console file descriptors needs to be properly tracked, using Rust's type system, so this commit refactors the console code to do that. The file descriptors are now passed around with reference counts, so they won't be closed prematurely. The obvious way to do this would be to just have each member of ConsoleInfo be an Arc<File>, but we need to accommodate that serial console file descriptors can also be sockets. We can't just store an OwnedFd and convert it when it's used, because we only get a reference from the Arc, so we need to store the descriptors as their concrete types in an enum. Since this basically duplicates the ConsoleOutputMode enum from the config, the ConsoleOutputMode enum is now not used past constructing the ConsoleInfo. So that ownership can be represented consistently, the debug console's tty mode now uses its own stdout descriptor. I'm still using .try_clone().unwrap() (i.e. dup()) to clone file descriptors for Endpoint::FilePair and Endpoint::TtyPair, because I assume there's a reason for them not just to hold a single file descriptor. I've also retained the existing behaviour of having serial manager ignore the tty file descriptor passed to it (which is stdout), and instead using stdin. 
It looks a lot weirder now, because it has to explicitly indicate it's ignoring the fd with an underscore binding. Fixes: `52eebaf6` ("vmm: refactor DeviceManager to use console_info") Signed-off-by: Alyssa Ross <hi@alyssa.is>	2024-09-25 22:34:43 +00:00
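
Commit ec8fceb4a6 above describes handling a 0-byte read by using slice::get and treating an empty command as incomplete, while short reads keep accumulating in the command buffer. The following is a minimal sketch of that idea only, not the muxer code itself: the names `command_is_complete` and `push_chunk` are hypothetical, and the newline terminator is an assumption made for illustration.

```rust
/// Hypothetical helper: reports whether the buffered bytes form a complete
/// command. A trailing newline is assumed to be the terminator.
fn command_is_complete(buf: &[u8]) -> bool {
    // `checked_sub` avoids the underflow that `buf.len() - 1` would hit on an
    // empty buffer, and `slice::get` returns None instead of panicking.
    buf.len()
        .checked_sub(1)
        .and_then(|last| buf.get(last))
        .is_some_and(|&b| b == b'\n')
}

/// Hypothetical helper: appends the bytes from one (possibly short or empty)
/// read to the pending command and reports whether it is now complete.
/// Nothing already buffered is discarded between calls.
fn push_chunk(pending: &mut Vec<u8>, chunk: &[u8]) -> bool {
    pending.extend_from_slice(chunk);
    command_is_complete(pending)
}
```

Under these assumptions, feeding `b"CONNECT "`, then an empty chunk, then `b"1234\n"` to `push_chunk` reports completion only on the last call, and the pending buffer holds the whole command.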

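The "streamline #[source] and Error" commits above (28e0a95450, a615c809eb, a212343908) note that setting each error's source allows future work that prints error chains. A rough sketch of what such printing could look like, using only the standard library instead of the thiserror derive: the types `QueueError` and `DeviceError`, their messages, and the function `format_chain` are hypothetical and do not come from the code base.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical leaf error.
#[derive(Debug)]
struct QueueError;

impl fmt::Display for QueueError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "descriptor chain too short")
    }
}

impl Error for QueueError {}

/// Hypothetical wrapping error that records its cause as the source.
#[derive(Debug)]
struct DeviceError {
    source: QueueError,
}

impl fmt::Display for DeviceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only this level's message; the cause is reachable via `source()`.
        write!(f, "failed to process queue")
    }
}

impl Error for DeviceError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Walks the `source()` chain and joins the messages with ": ".
fn format_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}
```

In this sketch each Display impl prints only its own message, which differs from the convention the commits describe (each error printing its sub error itself); combining both would repeat the cause in the output.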

900 commits