Add pve-exporter design spec
Full design for a Go Prometheus exporter for Proxmox VE, replacing the Python
prometheus-pve-exporter, with corosync metrics added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/superpowers/specs/2026-03-20-pve-exporter-design.md

# pve-exporter Design Spec

A Prometheus exporter for Proxmox VE written in Go. Replaces the Python
prometheus-pve-exporter with a single static binary, matching all existing
metric names for dashboard compatibility, and adding corosync cluster metrics.

## Goals

- Drop-in metric compatibility with prometheus-pve-exporter (same metric names
  and labels where possible) so existing Grafana dashboards work unchanged
- Add corosync/quorum metrics not available in the Python exporter
- Single statically-linked binary for easy deployment via Ansible
- Cluster-wide scrape from a single instance (no per-node exporter deployment)

## Non-Goals

- Ceph metrics (collected separately via ceph-mgr)
- General-purpose PVE API client library
- Full parity with PVE's web UI

## Architecture

### Project Structure

```
pve-exporter/
├── main.go                    # Entry point, flag parsing, HTTP server
├── collector/
│   ├── collector.go           # Collector interface, registry, PVECollector
│   ├── client.go              # PVE API client (HTTP, auth, JSON parsing)
│   ├── cluster_status.go      # pve_up, pve_node_info, pve_cluster_info
│   ├── cluster_resources.go   # CPU, memory, disk, network, storage, guest/HA info
│   ├── corosync.go            # pve_cluster_quorate, nodes_total, expected_votes, node_online
│   ├── version.go             # pve_version_info
│   ├── backup.go              # pve_not_backed_up_*
│   ├── node_config.go         # pve_onboot_status
│   ├── replication.go         # pve_replication_*
│   └── subscription.go        # pve_subscription_*
├── go.mod
├── go.sum
├── Makefile
└── README.md
```

### Collector Framework

Follows the node_exporter pattern:

```go
type Collector interface {
	Update(client *Client, ch chan<- prometheus.Metric) error
}
```

Collectors self-register via `init()` + `registerCollector()`. The framework
runs all collectors in parallel (goroutines + WaitGroup) and emits per-collector
scrape duration and success metrics automatically.

Collectors that need the node list or shared `/cluster/resources` data implement
additional interfaces:

```go
type NodeAwareCollector interface {
	Collector
	SetNodes(nodes []string)
}

type ResourceAwareCollector interface {
	Collector
	SetResources(data []byte)
}
```

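The registration and fan-out side of this pattern can be sketched as follows. This is illustrative, not the final API: `Metric` stands in for `prometheus.Metric` so the sketch runs without client_golang, and the `*Client` parameter from the interface above is elided for the same reason.

```go
package main

import (
	"fmt"
	"sync"
)

// Metric stands in for prometheus.Metric in this sketch.
type Metric struct {
	Name  string
	Value float64
}

type Collector interface {
	Update(ch chan<- Metric) error
}

var (
	registryMu sync.Mutex
	registry   = map[string]func() Collector{}
)

// registerCollector is called from each collector file's init().
func registerCollector(name string, factory func() Collector) {
	registryMu.Lock()
	defer registryMu.Unlock()
	registry[name] = factory
}

// versionCollector shows the shape of a concrete collector.
type versionCollector struct{}

func (versionCollector) Update(ch chan<- Metric) error {
	ch <- Metric{Name: "pve_version_info", Value: 1}
	return nil
}

func init() {
	registerCollector("version", func() Collector { return versionCollector{} })
}

// collectAll runs every registered collector in its own goroutine and
// gathers the emitted metrics, node_exporter style.
func collectAll() []Metric {
	ch := make(chan Metric)
	var wg sync.WaitGroup
	for name := range registry {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := registry[name]().Update(ch); err != nil {
				fmt.Println("collector failed:", name, err)
			}
		}(name)
	}
	go func() { wg.Wait(); close(ch) }()
	var metrics []Metric
	for m := range ch {
		metrics = append(metrics, m)
	}
	return metrics
}

func main() {
	for _, m := range collectAll() {
		fmt.Printf("%s %g\n", m.Name, m.Value)
	}
}
```

The factory map plus `init()` keeps `main.go` free of per-collector wiring: adding a collector is just adding a file.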
### Scrape Flow

1. Prometheus hits `/metrics`
2. `PVECollector.Collect()` fetches `/cluster/resources` first (needed by
   multiple collectors and provides the node list)
3. Node list and resources data are passed to collectors that need them
4. All collectors run in parallel
5. Per-node API calls within collectors (subscription, replication, node_config)
   are parallelized across nodes with bounded concurrency (5 concurrent requests)
6. Framework measures duration, catches errors, emits scrape meta-metrics

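Step 5's bounded fan-out can be sketched with a buffered channel acting as a semaphore; `fetchNodeMetric` is a hypothetical stand-in for a real per-node API call:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchNodeMetric is a hypothetical stand-in for a per-node API call
// such as GET /nodes/{node}/subscription.
func fetchNodeMetric(node string) string {
	return fmt.Sprintf("pve_subscription_status{id=%q} 1", "node/"+node)
}

// collectPerNode queries every node with at most maxInFlight requests
// running at once, using a buffered channel as a semaphore.
func collectPerNode(nodes []string, maxInFlight int) []string {
	sem := make(chan struct{}, maxInFlight)
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		samples []string
	)
	for _, node := range nodes {
		wg.Add(1)
		go func(node string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			s := fetchNodeMetric(node)
			mu.Lock()
			samples = append(samples, s)
			mu.Unlock()
		}(node)
	}
	wg.Wait()
	return samples
}

func main() {
	for _, s := range collectPerNode([]string{"node01", "node02", "node03"}, 5) {
		fmt.Println(s)
	}
}
```

The cap of 5 keeps a scrape from opening one connection per guest against every node at once.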
### API Client

```go
type Client struct {
	httpClient *http.Client
	hosts      []string // tried in order on failure
	token      string   // PVEAPIToken=user@realm!tokenid=uuid
}
```

- Tries hosts in order; on connection/HTTP error, falls through to next host.
  Remembers the last working host and tries it first on subsequent scrapes.
- 1-second TCP connect timeout for fast failover to next host.
- TLS certificate verification enabled by default. `--pve.tls-insecure` to
  disable.
- Single `Get(path string) ([]byte, error)` method. No caching; each scrape
  makes fresh API calls.
- Context-aware with scrape timeout propagated from Prometheus.

### Authentication

- `--pve.api-token` flag or `PVE_API_TOKEN` env var for token string
- `--pve.token-file` for reading token from file at startup (avoids token in
  process list, Ansible-friendly)
- Sent as `Authorization: PVEAPIToken=...` header

## CLI & HTTP

```
pve-exporter \
  --pve.host=https://node02:8006 \
  --pve.host=https://node01:8006 \
  --pve.token-file=/etc/pve-exporter/apikey \
  --web.listen-address=:9221 \
  --web.telemetry-path=/metrics
```

| Flag | Default | Description |
|------|---------|-------------|
| `--pve.host` | (required, repeatable) | PVE API base URLs, tried in order |
| `--pve.api-token` | — | API token string (mutually exclusive with token-file) |
| `--pve.token-file` | — | Path to file containing API token |
| `--pve.tls-insecure` | `false` | Disable TLS certificate verification |
| `--web.listen-address` | `:9221` | Address to listen on |
| `--web.telemetry-path` | `/metrics` | Path for metrics endpoint |
| `--log.level` | `info` | Log level (debug, info, warn, error) |
| `--log.format` | `logfmt` | Log format (logfmt, json) |

HTTP endpoints:

- `/metrics` — Prometheus metrics
- `/` — Landing page with link to metrics

Port 9221 matches the Python exporter for drop-in compatibility.

## Metrics

All metrics use namespace `pve`.

### cluster_status collector

API: `/cluster/status`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_up` | Gauge | `id` |
| `pve_node_info` | Gauge | `id`, `level`, `name`, `nodeid` |
| `pve_cluster_info` | Gauge | `id`, `nodes`, `quorate`, `version` |

### corosync collector

API: `/cluster/status`, `/cluster/config/nodes`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_cluster_quorate` | Gauge | — |
| `pve_cluster_nodes_total` | Gauge | — |
| `pve_cluster_expected_votes` | Gauge | — |
| `pve_node_online` | Gauge | `name`, `nodeid` |

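For illustration, the quorate and node-online parsing might look like the following sketch. The `statusEntry` fields mirror the `/cluster/status` response; the text-sample rendering stands in for the real `prometheus.Metric` emission, and `pve_cluster_expected_votes` (which needs data outside this payload) is omitted:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusEntry declares only the /cluster/status fields this sketch uses.
type statusEntry struct {
	Type    string `json:"type"`
	Name    string `json:"name"`
	Quorate int    `json:"quorate"`
	Online  int    `json:"online"`
	NodeID  int    `json:"nodeid"`
}

// corosyncSamples renders text samples in place of real metric emission.
func corosyncSamples(data []byte) ([]string, error) {
	var payload struct {
		Data []statusEntry `json:"data"`
	}
	if err := json.Unmarshal(data, &payload); err != nil {
		return nil, err
	}
	var samples []string
	nodes := 0
	for _, e := range payload.Data {
		switch e.Type {
		case "cluster":
			samples = append(samples, fmt.Sprintf("pve_cluster_quorate %d", e.Quorate))
		case "node":
			nodes++
			samples = append(samples, fmt.Sprintf("pve_node_online{name=%q,nodeid=\"%d\"} %d",
				e.Name, e.NodeID, e.Online))
		}
	}
	samples = append(samples, fmt.Sprintf("pve_cluster_nodes_total %d", nodes))
	return samples, nil
}

func main() {
	raw := []byte(`{"data":[
		{"type":"cluster","name":"prod","quorate":1},
		{"type":"node","name":"node01","nodeid":1,"online":1},
		{"type":"node","name":"node02","nodeid":2,"online":0}]}`)
	samples, err := corosyncSamples(raw)
	if err != nil {
		panic(err)
	}
	for _, s := range samples {
		fmt.Println(s)
	}
}
```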
### cluster_resources collector

API: `/cluster/resources`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_cpu_usage_ratio` | Gauge | `id` |
| `pve_cpu_usage_limit` | Gauge | `id` |
| `pve_memory_usage_bytes` | Gauge | `id` |
| `pve_memory_size_bytes` | Gauge | `id` |
| `pve_disk_usage_bytes` | Gauge | `id` |
| `pve_disk_size_bytes` | Gauge | `id` |
| `pve_network_transmit_bytes_total` | Counter | `id` |
| `pve_network_receive_bytes_total` | Counter | `id` |
| `pve_disk_written_bytes_total` | Counter | `id` |
| `pve_disk_read_bytes_total` | Counter | `id` |
| `pve_uptime_seconds` | Gauge | `id` |
| `pve_storage_shared` | Gauge | `id` |
| `pve_guest_info` | Gauge | `id`, `node`, `name`, `type`, `template`, `tags` |
| `pve_storage_info` | Gauge | `id`, `node`, `storage`, `plugintype`, `content` |
| `pve_ha_state` | Gauge | `id`, `state` |
| `pve_lock_state` | Gauge | `id`, `state` |

### version collector

API: `/version`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_version_info` | Gauge | `release`, `repoid`, `version` |

### backup collector

API: `/cluster/backup-info/not-backed-up`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_not_backed_up_total` | Gauge | `id` |
| `pve_not_backed_up_info` | Gauge | `id` |

### node_config collector

API: `/nodes/{node}/qemu/{vmid}/config`, `/nodes/{node}/lxc/{vmid}/config`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_onboot_status` | Gauge | `id`, `node`, `type` |

### replication collector

API: `/nodes/{node}/replication`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_replication_info` | Gauge | `id`, `type`, `source`, `target`, `guest` |
| `pve_replication_duration_seconds` | Gauge | `id` |
| `pve_replication_last_sync_timestamp_seconds` | Gauge | `id` |
| `pve_replication_last_try_timestamp_seconds` | Gauge | `id` |
| `pve_replication_next_sync_timestamp_seconds` | Gauge | `id` |
| `pve_replication_failed_syncs` | Gauge | `id` |

### subscription collector

API: `/nodes/{node}/subscription`

| Metric | Type | Labels |
|--------|------|--------|
| `pve_subscription_info` | Gauge | `id`, `level` |
| `pve_subscription_status` | Gauge | `id`, `status` |
| `pve_subscription_next_due_timestamp_seconds` | Gauge | `id` |

### Scrape meta-metrics

| Metric | Type | Labels |
|--------|------|--------|
| `pve_scrape_collector_duration_seconds` | Gauge | `collector` |
| `pve_scrape_collector_success` | Gauge | `collector` |

## Dependencies

- `github.com/alecthomas/kingpin/v2` — CLI flags
- `github.com/prometheus/client_golang` — Prometheus client
- `github.com/prometheus/common` — logging (promslog)
- `github.com/prometheus/exporter-toolkit` — TLS, web config, landing page

## Testing Strategy

- Unit tests per collector with mock API responses (JSON fixtures)
- Integration test: start exporter, scrape `/metrics`, verify expected metric
  names and labels are present
- Manual validation against live PVE cluster

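The integration check could be sketched as a containment test over the scrape body; the metric list here is a small illustrative subset of the tables above:

```go
package main

import (
	"fmt"
	"strings"
)

// requiredMetrics is a small illustrative subset of the compatibility
// contract; the real test would list every metric in the tables above.
var requiredMetrics = []string{
	"pve_up",
	"pve_node_info",
	"pve_cluster_quorate",
	"pve_cpu_usage_ratio",
}

// missingMetrics returns the required names absent from a /metrics body.
func missingMetrics(body string) []string {
	var missing []string
	for _, name := range requiredMetrics {
		if !strings.Contains(body, name) {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	scrape := "pve_up{id=\"node/node01\"} 1\npve_node_info{id=\"node/node01\"} 1\n"
	fmt.Println(missingMetrics(scrape))
}
```

A production version would match whole sample names rather than substrings (e.g. `pve_up` also matches `pve_uptime_seconds`), but the shape of the check is the same.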
## Future Metrics (TODO)

The following metrics are available from the PVE API but deferred to future work:

### Per-node detailed status (`/nodes/{node}/status`)

- Load averages (1m, 5m, 15m)
- Swap usage (total, used, free)
- Root filesystem usage (total, used, available)
- KSM shared memory
- Kernel version info
- Boot mode and secure boot status
- CPU model info (model, sockets, cores, MHz)

### Per-VM pressure metrics (`/nodes/{node}/qemu`)

- `pressurecpusome`, `pressurecpufull`
- `pressurememorysome`, `pressurememoryfull`
- `pressureiosome`, `pressureiofull`

### HA detailed status (`/cluster/ha/status/current`)

- CRM master node and status
- Per-node LRM status (idle/active) and timestamps
- Per-service HA config (failback, max_restart, max_relocate)

### Physical disks (`/nodes/{node}/disks/list`)

- Disk health (SMART status)
- Wearout level
- Size and model info
- OSD mapping

### SDN/Network (`/cluster/resources` type=sdn)

- Zone status per node
- Zone type info