docs: add spec for fuzzing setup targeting host codepaths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Davíð Steinn Geirsson 2026-03-25 21:49:39 +00:00
parent e0aa6bfbf6
commit 1cb95dd57c


@@ -0,0 +1,304 @@
# Fuzzing Setup for usbip-rs Host Codepaths
## Motivation
The host process sits between untrusted client data (from a VM) and the host kernel's vhci_hcd driver. Malformed USB/IP protocol input could crash the host process or, worse, produce malformed responses that confuse the kernel. Fuzzing the host-side parsing, dispatch, and handler codepaths catches both classes of bug.
## Security Model Context
- **Client (untrusted):** runs inside a VM, sends raw USB/IP protocol bytes
- **Host (trusted):** parses client input, dispatches URBs to device handlers, sends responses back
- **Host kernel:** downstream consumer of USB/IP responses — robustness to malformed responses is not guaranteed
The fuzz targets simulate an untrusted client by feeding arbitrary bytes into the host-side entry points.
## Approach
Raw-bytes fuzzing via `cargo-fuzz` (libFuzzer). Fuzz input is fed through `MockSocket` into the host-side protocol parsing and URB handling functions. Responses written to the output side of `MockSocket` are validated for well-formedness.
Structured fuzzing (via `Arbitrary`-derived protocol types) is out of scope for this initial setup but can be added later.
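For orientation, the sketch below shows the kind of byte sequence the fuzzer has to discover before a URB target gets past header parsing: the 48-byte header of a `USBIP_CMD_SUBMIT` request, with every field big-endian as in the USB/IP wire format. `build_cmd_submit_header` is a hypothetical helper written for this illustration and is not part of usbip-rs; it only uses the standard library.

```rust
/// Command code of USBIP_CMD_SUBMIT in the USB/IP wire protocol.
const USBIP_CMD_SUBMIT: u32 = 0x0000_0001;

/// Builds the 48-byte header of a CMD_SUBMIT request without any payload.
/// Hypothetical helper for illustration only, not a usbip-rs API.
pub fn build_cmd_submit_header(
    seqnum: u32,
    devid: u32,
    direction: u32, // 0 = OUT, 1 = IN
    ep: u32,
    transfer_buffer_length: u32,
    setup: [u8; 8],
) -> Vec<u8> {
    let mut buf = Vec::with_capacity(48);
    // Basic header: command, seqnum, devid, direction, ep (big-endian u32 each).
    for field in [USBIP_CMD_SUBMIT, seqnum, devid, direction, ep] {
        buf.extend_from_slice(&field.to_be_bytes());
    }
    // CMD_SUBMIT fields: transfer_flags, transfer_buffer_length, start_frame,
    // number_of_packets, interval. This sketch leaves all but the length at 0.
    for field in [0u32, transfer_buffer_length, 0, 0, 0] {
        buf.extend_from_slice(&field.to_be_bytes());
    }
    // Eight bytes of USB setup packet (only meaningful for control transfers).
    buf.extend_from_slice(&setup);
    buf
}
```

A buffer like this, optionally followed by payload bytes, could also serve as a hand-written seed input for the URB targets.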
## Fuzz Targets
Five targets, each a `[[bin]]` in `lib/fuzz/Cargo.toml`:
| Target | Entry Point | Device | What It Exercises |
|--------|-------------|--------|-------------------|
| `fuzz_parse_command` | `UsbIpCommand::read_from_socket()` | None | Raw protocol deserialization — headers, bounds checks, ISO descriptors |
| `fuzz_handle_client` | `handler()` | HID (lightweight) | Negotiation phase (DEVLIST, IMPORT) then URB loop |
| `fuzz_urb_hid` | `handle_urb_loop()` | HID keyboard | Interrupt + control transfers, single interface |
| `fuzz_urb_uac` | `handle_urb_loop()` | UAC1 loopback | ISO transfers, alt settings, 3 interfaces |
| `fuzz_urb_cdc` | `handle_urb_loop()` | CDC ACM | Bulk + interrupt transfers |
### Target Details
**`fuzz_parse_command`** — the lightweight target. Wraps fuzz bytes in a `MockSocket`, calls `UsbIpCommand::read_from_socket()`, asserts no panic. No device construction, no response validation. Tests the protocol deserialization layer in isolation.
**`fuzz_handle_client`** — exercises the full connection lifecycle. Constructs a `UsbIpServer` with a HID device registered, feeds fuzz bytes through `handler()`. This covers the negotiation phase (`OP_REQ_DEVLIST`, `OP_REQ_IMPORT`) and, if the fuzzer produces a valid import sequence, transitions into the URB loop. Validates all responses.
**`fuzz_urb_hid`** — constructs a HID keyboard `UsbDevice`, feeds fuzz bytes directly into `handle_urb_loop()`. Exercises interrupt transfers (keyboard reports), control transfers (GET_DESCRIPTOR, SET_IDLE), single interface. Validates all responses.
**`fuzz_urb_uac`** — constructs a UAC1 loopback `UsbDevice` via `build_uac_loopback_device()`, feeds fuzz bytes into `handle_urb_loop()`. Exercises isochronous transfers, alternate interface settings, 3 interfaces (control, stream out, stream in). Validates all responses.
**`fuzz_urb_cdc`** — constructs a CDC ACM `UsbDevice`, feeds fuzz bytes into `handle_urb_loop()`. Exercises bulk transfers (serial data), interrupt transfers, single interface. Validates all responses.
### Harness Pattern
All URB loop and handler targets follow the same pattern:
```rust
#![no_main]
use libfuzzer_sys::fuzz_target;
use std::sync::Arc;

fuzz_target!(|data: &[u8]| {
    // A fresh single-threaded runtime per input keeps iterations independent.
    // The timer driver is enabled in case handler code uses `tokio::time`.
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_time()
        .build()
        .unwrap();
    rt.block_on(async {
        let device = /* construct device */;
        let mock = usbip_rs::util::mock::MockSocket::new(data.to_vec());
        let output = mock.output_handle();
        // Errors are expected on malformed input, so the result is discarded.
        let _ = usbip_rs::handle_urb_loop(mock, Arc::new(device)).await;
        let output_bytes = output.lock().unwrap();
        usbip_rs::fuzz_helpers::assert_usbip_responses_valid(&output_bytes);
    });
});
```
`fuzz_handle_client` is similar but constructs a `UsbIpServer`, adds a device, and calls `handler()` instead.
## Response Validation (`fuzz_helpers`)
A module in the lib crate (`lib/src/fuzz_helpers.rs`), gated behind `#[cfg(feature = "fuzz")]`.
### `assert_usbip_responses_valid(output_bytes: &[u8])`
Parses all complete responses from the output buffer. Silently returns if the buffer is empty or contains only a partial response (expected when the fuzzer hits EOF mid-stream).
The USB/IP protocol has two response formats with different header layouts:
**Negotiation-phase responses** (used by `fuzz_handle_client`):
- Header: version (2 bytes, must be `0x0111`) + command (2 bytes) + status (4 bytes)
- `OP_REP_DEVLIST` (0x0005): status 0, followed by device count (4 bytes) and device descriptors
- `OP_REP_IMPORT` (0x0003): status 0 (success) or non-zero (failure), followed by device descriptor on success
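The negotiation-phase header checks above can be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: `check_op_reply_header` and `OpReplyHeader` are hypothetical names, and the sketch returns a `Result` where the real helper is described as asserting. A partial header yields `Ok(None)`, mirroring the "silently returns on a partial response" rule.

```rust
const USBIP_VERSION: u16 = 0x0111;
const OP_REP_IMPORT: u16 = 0x0003;
const OP_REP_DEVLIST: u16 = 0x0005;

/// Parsed 8-byte negotiation reply header (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub struct OpReplyHeader {
    pub command: u16,
    pub status: u32,
}

/// Ok(None): fewer than 8 bytes available (partial response, not an error).
/// Ok(Some(..)): header satisfies the checks listed above.
/// Err(..): a check failed; a fuzz helper would panic at this point.
pub fn check_op_reply_header(buf: &[u8]) -> Result<Option<OpReplyHeader>, String> {
    if buf.len() < 8 {
        return Ok(None);
    }
    // All fields are big-endian on the wire.
    let version = u16::from_be_bytes([buf[0], buf[1]]);
    let command = u16::from_be_bytes([buf[2], buf[3]]);
    let status = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);

    if version != USBIP_VERSION {
        return Err(format!("unexpected version {version:#06x}"));
    }
    match command {
        // DEVLIST replies must report success.
        OP_REP_DEVLIST if status != 0 => {
            Err(format!("OP_REP_DEVLIST with non-zero status {status}"))
        }
        // IMPORT replies may report success or failure.
        OP_REP_DEVLIST | OP_REP_IMPORT => Ok(Some(OpReplyHeader { command, status })),
        other => Err(format!("unexpected reply code {other:#06x}")),
    }
}
```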
**URB-phase responses** (used by all targets except `fuzz_parse_command`):
Structural checks:
- Each response has at least a 48-byte header (full `UsbIpRetSubmit` or `UsbIpRetUnlink` header)
- Command code is `RET_SUBMIT` (0x00000003) or `RET_UNLINK` (0x00000004)
- Direction field is 0 or 1
- For `RET_SUBMIT`: actual transfer buffer data length matches the `actual_length` field in the header
- For `RET_SUBMIT` with ISO: `number_of_packets` × 16 bytes of ISO descriptors present after the transfer buffer
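A minimal sketch of the header-level structural checks, assuming the standard USB/IP layout (five big-endian `u32` basic-header fields, then the reply-specific fields, padded to 48 bytes). `check_urb_reply_header` and `UrbReplyHeader` are hypothetical names for this illustration; payload and ISO descriptor length checks are deliberately left out to keep it short.

```rust
const USBIP_RET_SUBMIT: u32 = 0x0000_0003;
const USBIP_RET_UNLINK: u32 = 0x0000_0004;
const HEADER_LEN: usize = 48;

/// Reads a big-endian u32 at `off`; the caller guarantees the bounds.
fn be_u32(buf: &[u8], off: usize) -> u32 {
    u32::from_be_bytes([buf[off], buf[off + 1], buf[off + 2], buf[off + 3]])
}

/// Fields of a URB-phase reply header used by the checks (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct UrbReplyHeader {
    pub command: u32,
    pub seqnum: u32,
    pub direction: u32,
    pub status: i32,
    /// Only meaningful for RET_SUBMIT; padding for RET_UNLINK.
    pub actual_length: u32,
    /// Only meaningful for RET_SUBMIT; padding for RET_UNLINK.
    pub number_of_packets: u32,
}

/// Ok(None): fewer than 48 bytes (partial response at EOF).
/// Err(..): a structural check failed.
pub fn check_urb_reply_header(buf: &[u8]) -> Result<Option<UrbReplyHeader>, String> {
    if buf.len() < HEADER_LEN {
        return Ok(None);
    }
    let header = UrbReplyHeader {
        command: be_u32(buf, 0),
        seqnum: be_u32(buf, 4),
        direction: be_u32(buf, 12),
        status: be_u32(buf, 20) as i32,
        actual_length: be_u32(buf, 24),
        number_of_packets: be_u32(buf, 32),
    };
    if header.command != USBIP_RET_SUBMIT && header.command != USBIP_RET_UNLINK {
        return Err(format!("unexpected command {:#010x}", header.command));
    }
    if header.direction > 1 {
        return Err(format!("invalid direction {}", header.direction));
    }
    Ok(Some(header))
}
```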
Semantic checks:
- ISO packet descriptors: `offset + actual_length` does not exceed transfer buffer length
- No duplicate seqnums across all responses
- `RET_UNLINK` status is `-ECONNRESET` (the URB was unlinked before it completed) or 0 (the URB had already completed, so there was nothing to unlink)
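The semantic checks are small enough to sketch as standalone functions. The names below (`IsoPacketDescriptor`, `iso_descriptors_in_bounds`, `first_duplicate_seqnum`, `unlink_status_ok`) are hypothetical and chosen for this illustration; the `ECONNRESET` value assumes Linux errno numbering.

```rust
use std::collections::HashSet;

/// Linux errno value for ECONNRESET (assumption: Linux numbering).
const ECONNRESET: i32 = 104;

/// One 16-byte ISO packet descriptor, already decoded (hypothetical type).
pub struct IsoPacketDescriptor {
    pub offset: u32,
    pub length: u32,
    pub actual_length: u32,
    pub status: i32,
}

/// True if every descriptor's `offset + actual_length` stays inside the
/// transfer buffer. The sum is widened to u64 so it cannot overflow.
pub fn iso_descriptors_in_bounds(
    descs: &[IsoPacketDescriptor],
    transfer_buffer_len: usize,
) -> bool {
    descs.iter().all(|d| {
        u64::from(d.offset) + u64::from(d.actual_length) <= transfer_buffer_len as u64
    })
}

/// Returns the first seqnum that appears a second time, if any.
pub fn first_duplicate_seqnum(seqnums: &[u32]) -> Option<u32> {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already present.
    seqnums.iter().copied().find(|s| !seen.insert(*s))
}

/// RET_UNLINK may only carry 0 or -ECONNRESET as its status.
pub fn unlink_status_ok(status: i32) -> bool {
    status == 0 || status == -ECONNRESET
}
```

Widening to `u64` matters here because `offset` and `actual_length` come straight from host output that the check is meant to distrust; a plain `u32` addition could wrap and hide an out-of-bounds descriptor.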
**Intentionally not checked:**
- Whether every request received a response (loop may exit early on malformed input)
- USB descriptor content validity
- Specific `RET_SUBMIT` error status codes (many valid negative errno values)
## Lib Crate Changes
### `lib/Cargo.toml`
Add optional dependency and feature:
```toml
[dependencies]
arbitrary = { version = "1", features = ["derive"], optional = true }
[features]
fuzz = ["arbitrary"]
```
### `lib/src/lib.rs`
Conditionally expose the module:
```rust
#[cfg(feature = "fuzz")]
pub mod fuzz_helpers;
```
### `lib/src/fuzz_helpers.rs`
New file containing `assert_usbip_responses_valid()` as described above.
### `lib/src/util.rs` — MockSocket Visibility
Currently `MockSocket` lives inside `#[cfg(test)] pub(crate) mod tests`. Move it to a separate conditional block so fuzz targets can use it:
```rust
#[cfg(any(test, feature = "fuzz"))]
pub mod mock {
    // MockSocket struct, new(), output_handle(), AsyncRead, AsyncWrite impls
}

#[cfg(test)]
pub(crate) mod tests {
    pub(crate) use super::mock::MockSocket;
    // get_free_address, poll_connect, setup_test_logger, IsoLoopbackHandler stay here
}
```
This makes `MockSocket` available as `usbip_rs::util::mock::MockSocket` when the `fuzz` feature is enabled, without duplicating code or exposing test-only utilities.
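To make the role of `output_handle()` concrete, here is a simplified, synchronous stand-in for the mock. It is an assumption-laden sketch: the real `MockSocket` implements tokio's `AsyncRead`/`AsyncWrite`, whereas `SketchSocket` (a hypothetical name) uses `std::io::Read`/`Write` so the example needs nothing beyond the standard library. The point it illustrates is the shared output buffer, which stays readable after the socket itself has been moved into the code under test.

```rust
use std::io::{self, Cursor, Read, Write};
use std::sync::{Arc, Mutex};

/// Synchronous illustration of the mock: reads replay a fixed input,
/// writes are appended to a buffer shared with the test harness.
pub struct SketchSocket {
    input: Cursor<Vec<u8>>,
    output: Arc<Mutex<Vec<u8>>>,
}

impl SketchSocket {
    pub fn new(input: Vec<u8>) -> Self {
        Self {
            input: Cursor::new(input),
            output: Arc::new(Mutex::new(Vec::new())),
        }
    }

    /// Clone of the shared output buffer; remains valid after the
    /// socket has been moved or dropped.
    pub fn output_handle(&self) -> Arc<Mutex<Vec<u8>>> {
        Arc::clone(&self.output)
    }
}

impl Read for SketchSocket {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Returns 0 once the input is exhausted, which callers see as EOF.
        self.input.read(buf)
    }
}

impl Write for SketchSocket {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.output.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```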
## Fuzz Crate Structure
```
lib/fuzz/
├── Cargo.toml
├── .gitignore
└── fuzz_targets/
    ├── fuzz_parse_command.rs
    ├── fuzz_handle_client.rs
    ├── fuzz_urb_hid.rs
    ├── fuzz_urb_uac.rs
    └── fuzz_urb_cdc.rs
```
### `lib/fuzz/Cargo.toml`
```toml
[package]
name = "usbip-rs-fuzz"
version = "0.0.0"
publish = false
edition = "2024"
[package.metadata]
cargo-fuzz = true
[dependencies]
libfuzzer-sys = "0.4"
tokio = { version = "1", features = ["rt", "sync", "time", "io-util"] }
[dependencies.usbip-rs]
path = ".."
features = ["fuzz"]
[workspace]
members = ["."]
[[bin]]
name = "fuzz_parse_command"
path = "fuzz_targets/fuzz_parse_command.rs"
doc = false
[[bin]]
name = "fuzz_handle_client"
path = "fuzz_targets/fuzz_handle_client.rs"
doc = false
[[bin]]
name = "fuzz_urb_hid"
path = "fuzz_targets/fuzz_urb_hid.rs"
doc = false
[[bin]]
name = "fuzz_urb_uac"
path = "fuzz_targets/fuzz_urb_uac.rs"
doc = false
[[bin]]
name = "fuzz_urb_cdc"
path = "fuzz_targets/fuzz_urb_cdc.rs"
doc = false
```
### `lib/fuzz/.gitignore`
```
target/
corpus/
artifacts/
Cargo.lock
```
## Nix Integration
### `flake.nix` Input
Add `rust-overlay`:
```nix
inputs = {
  # ... existing inputs ...
  rust-overlay = {
    url = "github:oxalica/rust-overlay";
    inputs.nixpkgs.follows = "nixpkgs";
  };
};
```
### `fuzz` devShell
```nix
devShells.fuzz = let
  rust-nightly = rust-overlay.packages.${system}.rust-nightly;
in pkgs.mkShell {
  buildInputs = [
    rust-nightly
    pkgs.cargo-fuzz
    pkgs.libusb1
  ] ++ pkgs.lib.optionals pkgs.stdenv.hostPlatform.isLinux [
    pkgs.udev
  ];
  nativeBuildInputs = [ pkgs.stdenv.cc pkgs.pkg-config ];
};
```
### `fuzz-usbip` App
Wrapper script with `--fork=N` support for parallel overnight fuzzing:
```nix
apps.fuzz-usbip = let
  rust-nightly = rust-overlay.packages.${system}.rust-nightly;
  fuzz-usbip = pkgs.writeShellScriptBin "fuzz-usbip" ''
    set -euo pipefail
    export PATH="${rust-nightly}/bin:${pkgs.cargo-fuzz}/bin:${pkgs.stdenv.cc}/bin:${pkgs.pkg-config}/bin:$PATH"
    export PKG_CONFIG_PATH="${pkgs.libusb1.dev}/lib/pkgconfig:${pkgs.udev.dev}/lib/pkgconfig:''${PKG_CONFIG_PATH:-}"
    cd "$(${pkgs.git}/bin/git rev-parse --show-toplevel)/lib"
    if [ $# -eq 0 ]; then
      cargo fuzz list
    else
      target="$1"
      shift
      fork=0
      args=()
      for arg in "$@"; do
        case "$arg" in
          --fork=*) fork=''${arg#--fork=} ;;
          *) args+=("$arg") ;;
        esac
      done
      if [ "$fork" -gt 0 ]; then
        while true; do
          cargo fuzz run "$target" -- -max_len=1048576 "-fork=$fork" "''${args[@]}" || true
          echo "--- fuzzer exited, restarting (artifacts saved) ---"
        done
      else
        cargo fuzz run "$target" -- -max_len=1048576 "''${args[@]}"
      fi
    fi
  '';
in {
  type = "app";
  program = "${fuzz-usbip}/bin/fuzz-usbip";
};
```
**Usage:**
- `nix run .#fuzz-usbip` — list available fuzz targets
- `nix run .#fuzz-usbip -- fuzz_urb_hid` — fuzz HID target, single process
- `nix run .#fuzz-usbip -- fuzz_urb_hid --fork=8` — fuzz HID target with 8 parallel processes, auto-restart on crash (for overnight runs)
## Future Work (Out of Scope)
- **Structured fuzzing:** derive `Arbitrary` on `UsbIpCommand`, `SetupPacket`, etc. to generate structurally valid protocol messages that reach deeper into handler logic
- **USB descriptor validation:** verify GET_DESCRIPTOR responses produce structurally valid USB descriptors
- **Corpus seeding:** extract protocol traces from existing tests as seed corpus