# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
vmsilo is a lightweight virtualization system inspired by Qubes OS. It runs isolated VMs using crosvm (Chrome OS VMM) with different security domains (banking, shopping, untrusted, etc.). VMs are configured declaratively via a NixOS module.
## Environment
You are running under NixOS. If you need any tools not in the environment, use nix-shell.
## Development rules and guidelines
- Do not commit design or implementation plans to git.
- Update documentation (README.md and CLAUDE.md) along with code. Keep these concise.
## Build Commands

```bash
# Build the default rootfs image
nix build .#
```
## Code Style

This project uses treefmt-nix with nixfmt for formatting. Run before committing:

```bash
# Format all Nix files
nix fmt
```

There are no tests in this project.
## Architecture
### VM Launch Flow (NixOS module)
Each VM runs under its own dynamic service user (`vmsilo-<name>`) via `DynamicUser=yes`. A privileged `ExecStartPre=+` script grants the dynamic user access to devices and sockets (ACLs, chown). Console relay and proxy services run as the configured desktop user.
VMs start automatically when first accessed via socket activation:
1. `vm-run banking firefox` connects to `/run/vmsilo/banking/command.socket`
2. Socket activation triggers `vmsilo-banking@.service` (proxy template)
3. The proxy requires `vmsilo-banking-vm.service`, which starts crosvm
4. The proxy waits for guest vsock:5000, then forwards the command
The configured user can manage VM services via polkit (no sudo required for vm-start/vm-stop).
### Key Files
- `rootfs-nixos/default.nix` - NixOS-based rootfs builder (outputs qcow2, kernel, initrd)
- `rootfs-nixos/configuration.nix` - Guest NixOS configuration (systemd, vsock listener, idle watchdog)
- `modules/options.nix` - NixOS module interface (`programs.vmsilo` options)
- `modules/config.nix` - NixOS module implementation (VM scripts, systemd units, networking)
- `patches/vmsilo-decorations-combined-v6.5.5.patch` - KWin patch for VM window decoration colors
- `flake.nix` - Flake exposing `nixosModules.default` and `lib.makeRootfsNixos`
### Generated Scripts (NixOS module)
- `vm-run <name> <cmd>` - Run command in VM (starts VM on-demand via socket activation)
- `vm-start <name>` - Start VM via systemd (uses polkit, no sudo needed)
- `vm-stop <name>` - Stop VM via systemd (uses polkit, no sudo needed)
- `vm-start-debug <name>` - Start VM directly for debugging (requires sudo, bypasses socket activation)
- `vm-shell <name>` - Connect to VM serial console (default) or SSH with `--ssh`
Bash completion: Enabled by default (`enableBashIntegration = true`). VM names are queried dynamically from systemd, so completions update in existing shells after `nixos-rebuild switch`.
`vm-shell` options:

- Default: connect to the serial console (no SSH keys required). Escape with `CTRL+]`.
- `--ssh`: use SSH over vsock (requires SSH keys configured in `guestConfig`)
- `--ssh --root`: SSH as root
Note: SSH mode requires SSH keys configured in per-VM `guestConfig`:

```nix
guestConfig = {
  users.users.user.openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA..." ];
  users.users.root.openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA..." ];
};
```
### Sockets and Devices
Files in `/run/vmsilo/<name>/` (per-VM subdirectory owned by `cfg.user`):
| Path | Type | Purpose |
|---|---|---|
| `<name>/command.socket` | Socket | Socket activation for `vm-run` commands |
| `<name>/crosvm-control.socket` | Socket | crosvm control socket (VM management) |
| `<name>/console-backend.socket` | Socket | Serial console backend (crosvm connects here) |
| `<name>/console` | PTY | User-facing serial console (for `vm-shell`) |
| `<name>/wayland-0` | Bind mount | Wayland socket (via `BindPaths`, if GPU enabled) |
| `<name>/pulse-native` | Bind mount | PulseAudio socket (via `BindPaths`, if sound enabled) |
The console relay service (`vmsilo-<name>-console-relay.service`) bridges crosvm to a PTY, allowing users to connect and disconnect without disrupting crosvm.
**Service User Isolation:** Each service runs under its own user:

| Service | User | Method |
|---|---|---|
| `vmsilo-<name>-vm` | `vmsilo-<name>` | `DynamicUser=yes` |
| `vmsilo-<name>-console-relay` | `cfg.user` | Static (desktop user) |
| `vmsilo-<name>@` (proxy) | `cfg.user` | Static (desktop user) |
| `vm-switch-<netname>` | `vm-switch-<netname>` | `DynamicUser=yes` |
Groups for device/socket access: `kvm` (KVM), `vfio` (VFIO container), `vmsilo-video` (Wayland ACL), `vmsilo-audio` (PulseAudio ACL), `vmsilo-net-<netname>` (vhost-user sockets). The VM service's `ExecStartPre=+` runs as root to set ACLs, chown VFIO devices, and set TAP interface ownership.
**Desktop Integration:** The module generates `.desktop` files for all applications in `guestPrograms`, allowing VM apps to appear in the host's desktop menu. Apps are organized into submenus named "VM: <name>" (e.g., "VM: banking" containing Firefox and Konsole). Each app launches via `vm-run`. Icons are copied from guest packages to ensure proper display.
**Window Decoration Colors:** Each VM's `color` option is passed to crosvm via `--wayland-security-context app_id=vmsilo:<name>:<color>`. A KWin patch (`patches/`) reads this security context and applies the color to window decorations (title bar, frame). Server-side decorations are forced so colors are always visible. The text color auto-contrasts based on luminance.
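The patch's actual color logic is C++ inside KWin; purely as an illustration of a luminance-based contrast rule like the one described, here is a minimal Rust sketch (the function name, BT.709 coefficients, and midpoint threshold are assumptions, not taken from the patch):

```rust
/// Parse "#rrggbb" and decide whether black or white text contrasts better.
/// Hypothetical helper -- the real logic lives in the KWin patch, not here.
fn contrast_text_color(hex: &str) -> &'static str {
    let hex = hex.trim_start_matches('#');
    let r = u8::from_str_radix(&hex[0..2], 16).unwrap_or(0) as f64;
    let g = u8::from_str_radix(&hex[2..4], 16).unwrap_or(0) as f64;
    let b = u8::from_str_radix(&hex[4..6], 16).unwrap_or(0) as f64;
    // Perceived luminance (ITU-R BT.709 coefficients), range 0.0..=255.0
    let luma = 0.2126 * r + 0.7152 * g + 0.0722 * b;
    if luma > 128.0 { "black" } else { "white" }
}

fn main() {
    assert_eq!(contrast_text_color("#2ecc71"), "black"); // bright green -> dark text
    assert_eq!(contrast_text_color("#00008b"), "white"); // dark blue -> light text
    println!("ok");
}
```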
**Host networking:** VMs are offline by default. Set `hostNetworking = true` for internet access.
**Disposable VMs:** Set `disposable = true` to auto-shutdown after idle:

- The guest runs a `vm-idle-watchdog` service
- It polls for active `vsock-cmd@*` instances every 5 seconds
- It shuts down after `idleTimeout` seconds of inactivity
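The watchdog's decision rule can be modeled as a pure function over a series of poll results (a hypothetical simplification for illustration; the real guest service queries systemd for `vsock-cmd@*` units rather than taking a slice of counts):

```rust
/// Given per-poll counts of active `vsock-cmd@*` instances, decide whether
/// the idle timeout has elapsed. Illustrative sketch, not the guest service.
fn should_shutdown(poll_counts: &[usize], poll_interval_s: u64, idle_timeout_s: u64) -> bool {
    let mut idle_s = 0;
    for &active in poll_counts {
        if active == 0 {
            idle_s += poll_interval_s;
            if idle_s >= idle_timeout_s {
                return true;
            }
        } else {
            idle_s = 0; // any activity resets the idle clock
        }
    }
    false
}

fn main() {
    // 120 s timeout with 5 s polls: 24 consecutive idle polls trigger shutdown.
    assert!(should_shutdown(&vec![0usize; 24], 5, 120));
    // A single active instance in between resets the countdown.
    let mut busy = vec![0usize; 23];
    busy.push(1);
    assert!(!should_shutdown(&busy, 5, 120));
    println!("ok");
}
```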
**VM-to-VM Networking:** VMs can communicate via the `vmNetwork` option:

- Each network has one router VM and multiple client VMs
- Uses vhost-user-net backed by the vm-switch daemon
- MAC files are written to `/run/vm-switch/<network>/<vm>/<type>.mac`
- vm-switch monitors the MAC files and creates sockets
- crosvm connects via `--vhost-user type=net,socket=...`
## rootfs-nixos Package

Builds a full NixOS system into a qcow2 image:

- Outputs: `{ qcow2, kernel, initrd }` for direct crosvm boot
- Features: systemd stage-1, overlayfs root (read-only ext4 + tmpfs upper), wayland-proxy-virtwl as a systemd service
- Self-contained: no host /nix sharing; packages configured at build time via `guestPrograms`
- Socket activation: uses `vsock-cmd.socket` + `vsock-cmd@.service` for command handling
- Idle watchdog: optional `vm-idle-watchdog.service` for disposable VMs
```nix
# Build with custom packages
vmsilo.lib.makeRootfsNixos "x86_64-linux" {
  guestPrograms = [ pkgs.firefox pkgs.konsole ];
  guestConfig = {
    # Additional NixOS configuration
    fileSystems."/home/user" = { device = "/dev/vdb"; fsType = "ext4"; };
  };
}
```
## vm-switch Daemon

Provides L2 switching for VM-to-VM networks:

- Location: `vm-switch/` Rust crate
- Build: `nix build .#vm-switch`
- Purpose: handles the vhost-user protocol for VM network interfaces
- Systemd: one service per vmNetwork (`vm-switch-<netname>.service`), runs as dynamic user `vm-switch-<netname>` with group `vmsilo-net-<netname>`
CLI flags:

```
-d, --config-dir <PATH>   Config/MAC file directory (default: /run/vm-switch)
--log-level <LEVEL>       error, warn, info, debug, trace (default: warn)
--no-sandbox              Disable namespace sandboxing
--seccomp-mode <MODE>     kill (default), trap, log, disabled
```
Testing locally:

```bash
# Build and run manually
nix build .#vm-switch
./result/bin/vm-switch -d /tmp/test-switch --log-level debug

# In another terminal, create test MAC files
mkdir -p /tmp/test-switch/router
echo "52:00:00:00:00:01" > /tmp/test-switch/router/router.mac
```
**Process model:** The main process forks one child per VM. Children are vhost-user net backends that handle virtio TX/RX for their VM. Main orchestrates lifecycle, config watching, and buffer exchange between children. Children exit when the vhost-user client (crosvm) disconnects; main automatically restarts them so crosvm can reconnect.
Startup sequence:

1. Parse args, apply the namespace sandbox (single-threaded, before tokio)
2. Apply the main seccomp filter
3. Start the tokio runtime, create ConfigWatcher + BackendManager, enter the async event loop (SIGCHLD handled via a tokio select branch)
Key source files:

- `src/main.rs` - Entry point, sandbox/seccomp setup, async event loop, SIGCHLD handling
- `src/manager.rs` - BackendManager: fork children, buffer exchange, crash cleanup
- `src/args.rs` - CLI argument parsing (clap)
- `src/config.rs` - VM configuration types (VmRole, VmConfig)
- `src/watcher.rs` - Config directory file watcher (inotify + debouncer)
- `src/mac.rs` - MAC address type and parsing
- `src/control.rs` - Main-child IPC over Unix seqpacket sockets + SCM_RIGHTS
- `src/ring.rs` - Lock-free SPSC ring buffer in shared memory (memfd)
- `src/frame.rs` - Ethernet frame parsing, MAC validation
- `src/sandbox.rs` - Namespace isolation (user, PID, mount, IPC, network)
- `src/seccomp.rs` - BPF syscall filters (main and child whitelists)
- `src/child/process.rs` - Child entry point: control channel, vhost daemon, child seccomp
- `src/child/forwarder.rs` - PacketForwarder: L2 routing via ring buffers
- `src/child/vhost.rs` - ChildVhostBackend: virtio TX/RX callbacks
- `src/child/poll.rs` - Event polling for control channel + ingress buffers
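The MAC files shown above hold addresses like `52:00:00:00:00:01`. As a rough illustration of what `src/mac.rs`-style parsing involves (a sketch under assumptions, not the crate's actual type or API):

```rust
/// Parse a colon-separated MAC address like "52:00:00:00:00:01".
/// Illustrative sketch; not the actual src/mac.rs implementation.
fn parse_mac(s: &str) -> Option<[u8; 6]> {
    let mut out = [0u8; 6];
    let mut parts = s.trim().split(':');
    for byte in out.iter_mut() {
        *byte = u8::from_str_radix(parts.next()?, 16).ok()?;
    }
    // Reject trailing garbage (e.g. a seventh octet)
    if parts.next().is_some() {
        return None;
    }
    Some(out)
}

fn main() {
    assert_eq!(parse_mac("52:00:00:00:00:01"), Some([0x52, 0, 0, 0, 0, 1]));
    assert_eq!(parse_mac("not-a-mac"), None);
    assert_eq!(parse_mac("52:00:00:00:00:01:ff"), None);
    println!("ok");
}
```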
Control protocol (main <-> child IPC via `SOCK_SEQPACKET` + `SCM_RIGHTS`):

| Direction | Message | FDs | Purpose |
|---|---|---|---|
| Main -> Child | `GetBuffer { peer_name, peer_mac }` | - | Ask child to create ingress buffer for a peer |
| Child -> Main | `BufferReady { peer_name }` | memfd, eventfd | Ingress buffer created, here are the FDs |
| Main -> Child | `PutBuffer { peer_name, peer_mac, broadcast }` | memfd, eventfd | Give child a peer's buffer as egress target |
| Main -> Child | `RemovePeer { peer_name }` | - | Clean up buffers for disconnected/crashed peer |
| Main -> Child | `Ping` | - | Heartbeat request (sent every 1s) |
| Child -> Main | `Ready` | - | Child initialized and ready |
| Child -> Main | `Pong` | - | Heartbeat response (must arrive within 100ms) |
Messages serialized with postcard. FDs passed via ancillary data.
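The message set can be modeled as a Rust enum roughly like the following. This is inferred from the table only: field layout is an assumption, `peer_mac` fields are omitted for brevity, and the real definitions (with serde/postcard derives) live in `src/control.rs`:

```rust
/// Rough model of the control messages -- NOT the real src/control.rs types.
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum Msg {
    // Main -> Child
    GetBuffer { peer_name: String },
    PutBuffer { peer_name: String, broadcast: bool },
    RemovePeer { peer_name: String },
    Ping,
    // Child -> Main
    Ready,
    BufferReady { peer_name: String },
    Pong,
}

/// Which messages carry FDs (memfd + eventfd) as ancillary data?
fn carries_fds(m: &Msg) -> bool {
    matches!(m, Msg::BufferReady { .. } | Msg::PutBuffer { .. })
}

fn main() {
    assert!(carries_fds(&Msg::BufferReady { peer_name: "vm1".into() }));
    assert!(!carries_fds(&Msg::Ping));
    println!("ok");
}
```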
Buffer exchange flow:

1. Main sends `GetBuffer` to Child1 ("create ingress buffer for Child2")
2. Child1 creates an SPSC ring buffer (memfd + eventfd), becomes Consumer, replies `BufferReady`
3. Main forwards those FDs to Child2 via `PutBuffer`; Child2 becomes Producer
4. Packets now flow: Child2 writes to Producer -> shared memfd -> Child1 reads from Consumer
**SPSC ring buffer** (`ring.rs`): Lock-free single-producer/single-consumer queue backed by `memfd_create()` + `mmap(MAP_SHARED)`. 64 slots, ~598 KB total. Head/tail use atomic operations (no locks in the datapath). An eventfd signals empty-to-non-empty transitions.
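The head/tail scheme can be sketched with std atomics. This in-process sketch uses a plain array instead of a shared memfd mapping and omits the eventfd wakeup; it is an illustration of the technique, not the code in `ring.rs`:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Minimal lock-free SPSC ring over a plain array. Head/tail are
/// monotonically increasing counters; slot index is counter % N.
struct Spsc<const N: usize> {
    slots: [UnsafeCell<u64>; N],
    head: AtomicUsize, // total items ever pushed (producer-owned)
    tail: AtomicUsize, // total items ever popped (consumer-owned)
}

// Sound only under the SPSC discipline: one producer, one consumer.
unsafe impl<const N: usize> Sync for Spsc<N> {}

impl<const N: usize> Spsc<N> {
    fn new() -> Self {
        Spsc {
            slots: std::array::from_fn(|_| UnsafeCell::new(0)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: returns false when the ring is full.
    fn push(&self, v: u64) -> bool {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head - tail == N {
            return false; // full
        }
        unsafe { *self.slots[head % N].get() = v };
        self.head.store(head + 1, Ordering::Release); // publish the slot
        true
    }

    /// Consumer side: returns None when the ring is empty.
    fn pop(&self) -> Option<u64> {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if head == tail {
            return None; // empty
        }
        let v = unsafe { *self.slots[tail % N].get() };
        self.tail.store(tail + 1, Ordering::Release); // free the slot
        Some(v)
    }
}

fn main() {
    let q: Spsc<64> = Spsc::new();
    assert!(q.push(7));
    assert!(q.push(8));
    assert_eq!(q.pop(), Some(7));
    assert_eq!(q.pop(), Some(8));
    assert_eq!(q.pop(), None);
    println!("ok");
}
```

The Acquire/Release pairing mirrors the datapath described above: the producer's Release store on `head` publishes the written slot to the consumer's Acquire load, so no locks are needed.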
**Sandbox** (applied before tokio, requires single-threaded):

- User namespace - maps the real UID to 0 inside, enabling unprivileged namespace creation
- PID namespace - fork into a new PID ns; main becomes PID 1
- Mount namespace - minimal tmpfs root with `/config` (bind-mount of the config dir), `/dev` (null, zero, urandom), `/proc`, `/tmp`. Pivot root, unmount old.
- IPC namespace - isolates System V IPC
- Network namespace - empty (no interfaces); communication only via inherited FDs
**Seccomp filtering** (BPF syscall whitelist):

- `--seccomp-mode=kill` (default): terminate on a blocked syscall
- `--seccomp-mode=trap`: send SIGSYS (debug with strace)
- `--seccomp-mode=log`: log violations but allow
- `--seccomp-mode=disabled`: skip filtering
Two filter tiers (the child filter is a strict subset of the main filter):

- Main: allows fork, socket creation, inotify, openat (config watching + child management)
- Child: no fork, no socket creation, no file open; applied after vhost setup completes. Allows clone3 for vhost-user threads.
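The subset relationship between the tiers can be stated as a checkable invariant. The syscall names below are illustrative stand-ins, not the actual whitelists in `src/seccomp.rs`:

```rust
use std::collections::HashSet;

/// The documented invariant: every syscall the child filter allows must
/// also be allowed by the main filter (child is a strict subset of main).
fn is_strict_subset(child: &HashSet<&str>, main: &HashSet<&str>) -> bool {
    child.is_subset(main) && child.len() < main.len()
}

fn main() {
    // Illustrative whitelists -- NOT the real lists from src/seccomp.rs.
    let main_allowed: HashSet<&str> =
        ["read", "write", "clone", "socket", "inotify_add_watch", "openat", "clone3"]
            .into_iter()
            .collect();
    let child_allowed: HashSet<&str> = ["read", "write", "clone3"].into_iter().collect();
    assert!(is_strict_subset(&child_allowed, &main_allowed));
    println!("ok");
}
```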
## Dependencies

- Custom crosvm fork: `git.dsg.is/davidlowsec/crosvm.git`
- wayland-proxy-virtwl: Wayland/X11 proxying between host and guests
- NixOS 25.11 base
## NixOS Module Usage

Note: The configured user must have an explicit UID set (`users.users.<name>.uid = <number>`).
```nix
{ config, pkgs, ... }: {
  # User must have explicit UID for vmsilo
  users.users.david.uid = 1000;

  programs.vmsilo = {
    enable = true;
    user = "david";
    isolatedPciDevices = [ "01:00.0" ]; # PCI devices to isolate with vfio-pci

    # Global crosvm configuration
    crosvm = {
      logLevel = "info"; # error, warn, info, debug, trace
      extraArgs = [ ]; # Args before "run" subcommand
      extraRunArgs = [ ]; # Args after "run" subcommand
    };

    # VM-switch daemon configuration
    vm-switch = {
      logLevel = "info"; # error, warn, info, debug, trace
      extraArgs = [ ]; # Extra CLI arguments
    };

    nixosVms = [{
      id = 3;
      name = "banking";
      color = "#2ecc71"; # Window decoration color (default: "red")
      memory = 4096;
      hostNetworking = true; # Enable internet access (default: false)
      disposable = true; # Auto-shutdown when idle
      idleTimeout = 120; # Shutdown after 2 minutes idle

      # Per-VM crosvm overrides (optional)
      crosvm = {
        logLevel = "debug"; # Override global log level for this VM
        extraArgs = [ ]; # Appended to global extraArgs
        extraRunArgs = [ ]; # Appended to global extraRunArgs
      };

      # Per-VM packages and NixOS config
      guestPrograms = [ pkgs.firefox pkgs.chromium ];
      guestConfig = {
        users.users.user.openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA..." ];
      };

      # Disk configuration (uses crosvm --block)
      # Free-form attrsets with path as positional arg, rest passed to crosvm
      additionalDisks = [{
        path = "/tmp/data.qcow2";
        ro = false;
        sparse = true;
        block-size = 4096;
        id = "data";
      }];

      # Custom boot (optional - defaults to built rootfs)
      # rootDisk = { path = "/path/to/root.qcow2"; ro = true; };
      # kernel = /path/to/bzImage;
      # initramfs = /path/to/initrd;
      rootDiskReadonly = true; # default true

      # Extra kernel parameters
      kernelParams = [ "debug" ];

      # GPU config (crosvm --gpu=)
      # false=disabled, true=default, attrset=custom
      gpu = { context-types = "cross-domain:virgl2"; };

      # Sound config (crosvm --virtio-snd=)
      # false=disabled, true=default PulseAudio, attrset=custom
      sound = { backend = "pulse"; capture = true; };

      # PCI passthrough (devices from isolatedPciDevices)
      # Attrset with path (BDF or sysfs) and optional kv pairs
      pciDevices = [{ path = "01:00.0"; iommu = "on"; }];

      # Shared directories (crosvm --shared-dir)
      # Attrset with path, tag, and optional kv pairs (colon separator)
      sharedDirectories = [{ path = "/tmp/shared"; tag = "shared"; uid = 1000; }];

      # vhost-user devices (auto-populated from vmNetwork, can add custom)
      vhostUser = [{ type = "net"; socket = "/path/to/socket"; }];
    }];
  };

  # Access built package via config.programs.vmsilo.package
}
```