Commit graph

599 commits

Author SHA1 Message Date
af8bd6b9ca Fix copy/paste problem with large buffers
The optimizations in ede53078b6 introduced a
bug where the per-mime-type buffer size limit became a total size limit. This
commit fixes it to again apply per mime type, and also bumps the limit to
128MB.
2026-03-22 21:04:45 +00:00
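
The difference between a per-mime-type limit and a total limit can be sketched roughly as follows. This is an illustration only: the function names, the dictionary shape, and the reading of "128MB" as 128 * 1024 * 1024 bytes are assumptions, not the project's actual code.

```python
# Hypothetical sketch; only the idea of a 128 MB per-MIME-type limit comes
# from the commit message. The byte value and all names are assumptions.
PER_MIME_LIMIT = 128 * 1024 * 1024


def accept_offers(offers: dict[str, int], limit: int = PER_MIME_LIMIT) -> dict[str, bool]:
    """Intended behavior: each MIME type is checked against the limit on its own."""
    return {mime: size <= limit for mime, size in offers.items()}


def accept_offers_total(offers: dict[str, int], limit: int = PER_MIME_LIMIT) -> dict[str, bool]:
    """Regressed behavior: sizes accumulate, so the limit acts as a total."""
    total = 0
    result = {}
    for mime, size in offers.items():
        total += size
        result[mime] = total <= limit
    return result
```

With two 100 MB buffers of different MIME types, the first function accepts both, while the second rejects whichever one is counted last.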
21959d065c Force zink (OpenGL emulation using Vulkan) for gpu device server 2026-03-22 18:45:28 +00:00
8cff8b6597 fix: use full nix store paths for head/grep in USB probe
writeShellScript doesn't set PATH, so head and grep were not found
in the sandboxed service (TemporaryFileSystem=/).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 18:06:38 +00:00
a0fd96f7d7 fix: cloud-hypervisor USB passthrough vsock probe using wrong protocol
The readiness probe was sending a binary struct.pack message instead of
the ASCII proxy protocol (CONNECT <port>\n → OK <remoteport>\n), causing
it to always time out. Replace with a socat-based probe matching the
protocol used everywhere else.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 17:59:35 +00:00
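
For reference, the ASCII handshake described above might look like this in Python. The actual fix uses socat; this sketch only illustrates the `CONNECT <port>\n` / `OK <remoteport>\n` exchange, and the function name and timeout value are hypothetical.

```python
import socket


def probe_vsock_port(sock: socket.socket, port: int, timeout: float = 5.0) -> int:
    """Send 'CONNECT <port>' and return the port number from the 'OK <remoteport>' reply.

    `sock` is assumed to be an already-connected stream socket (for example the
    hypervisor's Unix socket). Raises ConnectionError on any other reply.
    """
    sock.settimeout(timeout)
    sock.sendall(f"CONNECT {port}\n".encode("ascii"))

    # Read one byte at a time so nothing past the newline is consumed.
    reply = bytearray()
    while not reply.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed before replying")
        reply += chunk

    status, _, remote = reply.decode("ascii").strip().partition(" ")
    if status != "OK":
        raise ConnectionError(f"unexpected reply: {bytes(reply)!r}")
    return int(remote)
```

A binary `struct.pack` message never produces an `OK` line from the proxy, which is consistent with the timeouts the commit describes.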
b1c4f57dcf fix: GPU device service missing graphics driver environment
Set HOME, LIBGL_DRIVERS_PATH, __EGL_VENDOR_LIBRARY_DIRS, and add
/run/opengl-driver/lib to LD_LIBRARY_PATH so mesa can find DRI
drivers and EGL vendors. Set HOME to the per-VM GPU runtime dir
to fix shader cache directory creation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 17:57:53 +00:00
22f632df88 fix: vm-run console commands not returning output with cloud-hypervisor
Cloud-hypervisor's hybrid vsock (Unix socket + CONNECT protocol) doesn't
support half-close. When recv_pkt() gets a 0-byte read from shutdown(SHUT_WR),
it sends VSOCK_OP_SHUTDOWN with both RCV|SEND flags, tearing down the entire
connection and killing the response path.

Two fixes:
- Remove s.shutdown(SHUT_WR) from the vsock proxy
- Make guest command handler self-terminating: head -1 | bash. The pipe
  gives bash a clean EOF after one command line, so it no longer depends
  on vsock half-close to exit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 17:10:10 +00:00
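
The self-terminating handler idea (`head -1 | bash`) can be sketched in Python as below. This is not the guest's real handler: the function names and the `shell` parameter are hypothetical, and the point is only that the handler stops reading after one line instead of waiting for the peer to half-close.

```python
import socket
import subprocess


def read_line(conn: socket.socket) -> bytes:
    """Read up to and including the first newline, like `head -1`."""
    line = bytearray()
    while not line.endswith(b"\n"):
        chunk = conn.recv(1)
        if not chunk:
            break  # peer closed early; return whatever arrived
        line += chunk
    return bytes(line)


def handle_one_command(conn: socket.socket, shell: str = "bash") -> None:
    """Run exactly one command line and close the connection.

    The shell receives the single line on stdin and then sees EOF, so it
    exits without depending on the client calling shutdown(SHUT_WR).
    """
    command = read_line(conn)
    result = subprocess.run(
        [shell], input=command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    conn.sendall(result.stdout)
    conn.close()
```

The client then only needs to send one newline-terminated command and read until the connection closes.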
faad5006c9 cleanup: remove vmsilo-start-* scripts, rename vmsilo-usb to vm-usb, fix vm-run output
- Remove vmsilo-start-* user-facing symlinks from package.nix (internal
  VM launcher scripts are only used by systemd ExecStart, not by users)
- Rename vmsilo-usb to vm-usb to match the vm-* naming convention
- Increase socat -t timeout in vm-run from default 0.5s to 5s to fix
  missing output from console commands (cloud-hypervisor proxy startup
  latency exceeded the default timeout window)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 16:51:57 +00:00
7148a5578c Bump crosvm for gpu dev debug logging 2026-03-22 16:51:57 +00:00
47899b819e Fix kwin runtime config reload for vmsilo
Allow overriding allowed wayland protocols for VMs through ~/.config/kwinrc:
```
kwriteconfig6 --file kwinrc --group Vmsilo --type stringlist --key AllowedProtocols "wl_shm,wl_compositor,wl_subcompositor,xdg_wm_base,wl_data_device_manager,zxdg_output_manager_v1,zwp_primary_selection_device_manager_v1,gtk_primary_selection_device_manager,wl_seat,wl_output,org_kde_kwin_server_decoration_manager,zxdg_decoration_manager_v1,zwp_relative_pointer_manager_v1,zwp_pointer_constraints_v1,wp_viewporter,wp_cursor_shape_manager_v1,wp_fractional_scale_manager_v1,wp_single_pixel_buffer_manager_v1,wp_alpha_modifier_v1,wp_color_representation_manager_v1,wp_color_manager_v1,frog_color_management_factory_v1,wp_fifo_manager_v1,wp_presentation,zwp_linux_dmabuf_v1"
```
2026-03-22 16:32:34 +00:00
08709827fb feat: replace crosvm USB passthrough with usbip-over-vsock
Replace crosvm xhci-based USB passthrough with usbip-rs over vsock,
enabling USB passthrough for both crosvm and cloud-hypervisor VMs.

Guest runs a persistent usbip-rs client listener on vsock port 5002.
Host runs one sandboxed usbip-rs host connect process per attached
device as a systemd template service (vmsilo-<vm>-usb@<devpath>).

Eliminates the JSON state file, file locking, and crosvm-specific
shell helper library in favor of systemd as the source of truth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 16:15:27 +00:00
8b6a5594c5 docs: update CLAUDE.md for module restructuring
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:44:58 +00:00
43113ee984 refactor: unify socketWaitScript interface, reorder scripts.nix sections
Replace the dual mkSocketWaitScript/socketWaitScript pair in vm-config.nix
with a single socketWaitScript taking an onTimeout argument. Add section
headers to scripts.nix to distinguish VM launcher scripts from proxy and
user-facing scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:41:01 +00:00
eff3a8f1db refactor: move GPU config computation into vm-config.nix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:35:50 +00:00
7d3a1e3d76 refactor: move cloud-hypervisor config building into vm-config.nix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:31:45 +00:00
aff10fd01f refactor: centralize user UID/GID/home as _internal options
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:27:17 +00:00
43496674f6 refactor: extract USB helper lib and CLI into modules/usb.nix
Move usbHelperLib (~200 lines) and vmsiloUsbScript (~170 lines) from
scripts.nix into a dedicated usb.nix module. Deduplicate the
cloud-hypervisor USB rejection check into a shared usb_reject_ch_vm()
shell function. USB systemd services in services.nix continue consuming
cfg._internal.usbHelperLib unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:22:36 +00:00
b1bb3ad7cd refactor: remove dead _generatedGuestConfig option
The _generatedGuestConfig option was declared and read but never set by
any module, making it always evaluate to []. Remove the declaration and
simplify getEffectiveGuestConfig to read only from netvmInjections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:18:20 +00:00
033a2b7375 Fix cloud-hypervisor vsock socket permissions for dbus-proxy
chown the vsock socket to the configured user after VM creation so the
dbus-proxy (which runs as the user) can connect to it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 13:25:48 +00:00
86cdc93d0e refactor: split services.nix into named generators, remove dead tray option
Phase 4: Break the monolithic systemd.services expression into named
generator functions (mkPrepServices, mkVmServices, mkProxyServices,
mkConsoleRelayServices, mkConsoleScreenServices, mkVirtiofsdServices,
mkSoundServices, mkDbusProxyServices, mkWaylandSeccontextServices,
mkGpuServices, mkUsbServices, mkBalloondService). The top-level
expression is now a clean concatenation of named generators.

Phase 5: Remove unused vm.tray option (superseded by vm.dbus.tray).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 13:06:41 +00:00
1c00fbf674 refactor: deduplicate helpers and extract syscall allowlists
Phase 2: Move getEffectiveInterfaces, getEffectiveIfaceMac, and
resolveColor into lib/helpers.nix so they're defined once instead of
copy-pasted across services.nix, networking.nix, assertions.nix, and
vm-config.nix. getEffectiveInterfaces takes netvmInjections as a
parameter to avoid the config-access issue.

Phase 3: Extract gpuSyscallAllowlist and soundSyscallAllowlist (~210
lines) from services.nix into lib/syscall-allowlists.nix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 13:00:31 +00:00
fa7e43a2b4 refactor: extract shared VM config into lib/vm-config.nix
Deduplicate the computation shared between mkCrosvmVmScript and
mkCloudHypervisorVmScript by extracting it into a hypervisor-neutral
mkVmConfig function. Both script generators now consume this shared
attrset and only contain hypervisor-specific rendering logic.

Eliminates ~550 net lines from scripts.nix: rootfs resolution, GPU
normalization, kernel params, network entries, PCI handling, IOMMU
validation, and socket wait loops are now computed once.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 12:57:30 +00:00
626007760a Fix dbus-proxy log filter to match actual crate name
The EnvFilter targeted `vmsilo_tray` (a stale name) instead of
`vmsilo_dbus_proxy`, causing all log output to be silently dropped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 12:56:53 +00:00
feb3012469 Add structured logging to vmsilo-dbus-proxy
INFO for listening socket, initiating connection, and accepting
connection (with protocol/port/address fields). DEBUG for tray entry
and notification creation. TRACE for all vsock messages sent/received
with full contents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 12:45:18 +00:00
db915d990f Bump cloud-hypervisor to fix log spam 2026-03-22 12:25:44 +00:00
6d7b606d02 feat: add cloud-hypervisor support to balloond and dbus-proxy
- BalloonBackend trait abstracting over hypervisor-specific balloon control
- CrosvmBackend wrapping existing crosvm control socket protocol
- CloudHypervisorBackend using raw HTTP/1.1 over persistent Unix socket
  (GET /api/v1/vm.balloon-statistics, PUT /api/v1/vm.resize)
- Watcher recognizes both crosvm-control.socket and
  cloud-hypervisor-control.socket for auto-discovery
- dbus-proxy: CONNECT protocol support for cloud-hypervisor vsock,
  generic stream handling, --cid/--vsock-socket CLI args
- NixOS module: enable dbus-proxy for all VMs, vary args by hypervisor
2026-03-22 11:19:57 +00:00
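
As a rough illustration of "raw HTTP/1.1 over a persistent Unix socket", a request could be serialized as shown below. The project's backend is written in Rust; this Python helper, its name, the header set, and the `desired_balloon` body field in the usage note are assumptions made for the sketch. Only the two endpoint paths come from the commit message.

```python
import json
from typing import Optional


def build_request(method: str, path: str, body: Optional[dict] = None) -> bytes:
    """Serialize a minimal HTTP/1.1 request for sending over a Unix socket."""
    payload = b"" if body is None else json.dumps(body).encode("utf-8")
    head = [
        f"{method} {path} HTTP/1.1",
        "Host: localhost",  # placeholder host; required by HTTP/1.1
        "Accept: application/json",
    ]
    if body is not None:
        head.append("Content-Type: application/json")
        head.append(f"Content-Length: {len(payload)}")
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + payload
```

For example, `build_request("GET", "/api/v1/vm.balloon-statistics")` yields a body-less request, and `build_request("PUT", "/api/v1/vm.resize", {"desired_balloon": 1024})` carries a JSON body whose field name is assumed here. Because HTTP/1.1 connections are persistent by default, the same socket can be reused for successive requests.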
4f97ccb28c Bump cloud-hypervisor 2026-03-22 10:09:28 +00:00
99cccf89ba VM option cloudHypervisor -> cloud-hypervisor for consistency 2026-03-22 09:55:43 +00:00
1f1d625bcc Fix sound config for CH guests, update TODO 2026-03-22 09:55:28 +00:00
43438277d8 fix: use screen -dmS with Type=forking for console drain service
screen -dmS forks to background (daemon mode), which should work
without a controlling terminal. Type=forking tells systemd to expect
the fork.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 20:10:06 +00:00
dae577fe0f fix: switch from screen to tmux for console drain sessions
screen requires a controlling terminal which systemd services don't
provide. tmux works without a terminal via new-session without -d.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 20:04:27 +00:00
a12f2c2d0f fix: wrap screen in script(1) to provide a TTY for systemd services
screen requires a controlling terminal which systemd services don't
have. Using script(1) to allocate a pseudo-terminal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 19:57:33 +00:00
1d82180c47 feat: vm-shell attaches to screen session instead of direct PTY
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:49:20 +00:00
cb4c7d163e docs: update vm-shell detach instructions for screen
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:49:11 +00:00
575932a9a4 feat: add console screen session service per VM
Adds a vmsilo-<name>-console-screen systemd service for every VM that
attaches a GNU Screen session to the console PTY
(/run/vmsilo/<name>/console). The service polls for the PTY (needed for
cloud-hypervisor VMs), keeps the serial buffer drained, and cleans up
stale screen sessions on start and stop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:48:20 +00:00
57c9e00000 refactor: make console-relay service crosvm-only
Cloud-hypervisor VMs now use PTY-direct serial mode and no longer need
a console-relay service. Filter the relay to crosvm VMs only via
lib.filter, removing the isCh/chRelayScript conditional logic. Also
add ExecStopPost cleanup of the console symlink for CH VMs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:47:10 +00:00
a0c28a5c57 feat: switch cloud-hypervisor serial to PTY-direct mode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 19:45:50 +00:00
73fa020cf7 Enable colored border for notifications from VMs
Forgot to actually add the patch file to the package overlay before.
2026-03-21 19:18:29 +00:00
4febe2c4b3 Bump cloud-hypervisor to fix dev feature negotiation 2026-03-21 18:26:59 +00:00
ce3b162a4f Fix cloud-hypervisor VM launch: root overlay disk ID and other divergences from crosvm
The root overlay disk had no serial set and the kernel param hardcoded
"raw,vdb" instead of "raw,ephemeral", so the guest couldn't find the
overlay disk by /dev/disk/by-id/virtio-ephemeral.

Also fixes: missing netvmInjections nameservers, shared dir mount params
only handling sharedHome (not all dirs with mountPath), additional disks
dropping id field, ephemeral disk using direct=false (double-caching),
missing IOMMU group validation for PCI passthrough, and redundant
ephemeral disk creation in the script body.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:00:31 +00:00
8d825366e4 Fix early serial console with cloud-hypervisor
Use the legacy serial port, since virtio-console is not compiled into the
default NixOS kernel.

This reverts commits:
25bbbd14b6.
0d72a5f710.
2026-03-21 17:35:35 +00:00
2107cf7deb fix: add ftruncate to sound service seccomp allowlist
PipeWire's mempool allocator (mem.c) calls ftruncate() to size memfds
for shared memory. The missing syscall caused EPERM, which cascaded
through the adapter factory to produce EINVAL on pw_stream_connect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 16:27:10 +00:00
bfb7527588 fix: set PIPEWIRE_CONFIG_DIR for sound service, fix conditional RUST_LOG
Point PIPEWIRE_CONFIG_DIR at the pipewire package's data dir so
libpipewire finds client.conf directly, bypassing the /etc/pipewire
search path which doesn't have client.conf in the confinement.

Also fix sound.logLevel: RUST_LOG was being set unconditionally
(producing "RUST_LOG=" when null); now only set when logLevel is
configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 16:16:47 +00:00
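
The conditional `RUST_LOG` behavior amounts to omitting the variable entirely when no level is configured. A minimal sketch, assuming a hypothetical helper that builds the service environment (the real logic lives in the Nix module):

```python
from typing import Optional


def sound_service_env(log_level: Optional[str]) -> dict[str, str]:
    """Build the environment for the sound service (hypothetical helper).

    RUST_LOG is added only when a level is configured, so a null level
    never produces an empty 'RUST_LOG=' entry.
    """
    env: dict[str, str] = {}
    if log_level is not None:
        env["RUST_LOG"] = log_level
    return env
```

Here `sound_service_env(None)` returns an empty mapping, while `sound_service_env("debug")` returns a mapping containing only `RUST_LOG`.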
6b8cd140d3 fix: add pipewire config closure to sound service confinement
The /etc/pipewire directory contains symlinks to the
pipewire-extra-config store path, which isn't in the confinement
closure. Add the pipewire config source to confinement.packages so
its full closure is available in the namespace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:54:37 +00:00
cd73430c3e feat: add RUST_BACKTRACE=full to all Rust services and sound.logLevel option
Set RUST_BACKTRACE=full on VM, GPU, sound, dbus-proxy, balloond, and
wayland-seccontext services for better crash diagnostics. Add per-VM
sound.logLevel option (default "info") that sets RUST_LOG on the
vhost-device-sound service.

Also document previously undocumented options in README: cloud-hypervisor
hugepages, netvmRange, sound.logLevel, sound.seccompPolicy,
cloudHypervisor.hugepages, cloudHypervisor.seccompPolicy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:53:49 +00:00
81e6394b78 fix: mount /etc/pipewire in sound service namespace, add debugging aids
Mount the NixOS-generated pipewire config directory at /etc/pipewire
inside the confined sound service namespace — libpipewire has
/etc/pipewire as a compiled-in config search path.

Also add RUST_BACKTRACE=full to all Rust service environments
(balloond, VM, sound, dbus-proxy, wayland-seccontext, GPU) and a
sound.logLevel option for RUST_LOG control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:50:55 +00:00
a55934497c fix: bind-mount pipewire config into sound service namespace
The confined vhost-device-sound service can't find client.conf because
the pipewire config files (in the default output) aren't in the
confinement closure — only the library output is pulled in transitively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:36:48 +00:00
588f8dc6f6 fix: wait for vhost-user sockets before starting VM
GPU, sound, and virtiofs backend services use Type=simple, so systemd
considers them started before they create their sockets. This race
causes crosvm to fail with "No such file or directory" on the
vhost-user socket paths. Add socket-wait loops (up to 30s) in both
crosvm and cloud-hypervisor start scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:30:03 +00:00
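
A socket-wait loop of the kind described above could look like the following. The real loops are shell snippets in the start scripts; this Python version, its function name, and the polling interval are assumptions. Only the 30-second upper bound comes from the commit message.

```python
import os
import stat
import time


def wait_for_socket(path: str, timeout: float = 30.0, interval: float = 0.1) -> bool:
    """Poll until `path` exists and is a Unix socket; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISSOCK(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            pass  # backend has not created its socket yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling this for each vhost-user socket path before launching the hypervisor avoids depending on systemd's notion of "started" for `Type=simple` units.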
d486c7ee0c fix: add dup3, getsockname, symlink to GPU seccomp allowlist
These syscalls were being denied in enforcing mode, causing GPU device
units to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:13:35 +00:00
d3d869c1ab feat: sandbox vhost-device-sound service with confinement and seccomp
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 15:06:54 +00:00
cd1279166f feat: add soundSyscallAllowlist for sound service seccomp 2026-03-21 15:04:25 +00:00