docs: address spec review feedback
- Split pve_ha_service_info into _config and _status to avoid stale series
- Handle wearout "N/A" and health "UNKNOWN" edge cases for physical disks
- Clarify node label convention and rootfs available vs free naming
- Note QEMU-only scope for VM pressure (LXC lacks PSI in PVE API)
- Add full node_status/lrm_status examples showing all cluster nodes
- Document mutex-guarded nodes pattern and test fixture requirements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent b4ed302009
commit e10156323b
1 changed files with 63 additions and 10 deletions
@@ -27,6 +27,14 @@ not operationally useful). Kernel version and CPU model are excluded from node s
| `pve_node_ksm_shared_bytes` | Gauge | `node` | KSM shared memory in bytes |
| `pve_node_boot_mode_info` | Gauge | `node`, `mode`, `secureboot` | Boot mode info (always 1) |

The `pve_node_` prefix disambiguates these from `cluster_resources` metrics which use
`id` labels like `node/node01`. The `node` label here is the bare node name (e.g.,
`node01`), consistent with how other NodeAwareCollectors label per-node data.

Rootfs uses `available` (not `free`) because the API field `avail` reflects usable
space after reserved blocks, which is the operationally relevant value. Swap has no
reserved blocks so `free` is correct there.

**API response structure:**
```json
{

@@ -73,14 +81,18 @@ Load averages are strings in the API, parse with `strconv.ParseFloat`.
```

- Only emit metrics for running VMs (stopped VMs lack pressure fields).
-- `id` label follows existing convention: `"qemu/{vmid}"` (e.g., `qemu/112`).
+- QEMU only — LXC containers run in the host kernel namespace and do not expose
+  per-container PSI metrics through the PVE API.
+- `id` label: constructed as `fmt.Sprintf("qemu/%d", vmid)` to match the existing
+  convention used by `cluster_resources` and `node_config` collectors.
- API returns values 0–100. Divide by 100 to produce 0.0–1.0 ratios.
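The skip-stopped, id-label, and ratio rules above can be sketched as follows. This is a minimal illustration, not the collector's actual code: the `vmEntry` struct and its `CPUSome` field are hypothetical stand-ins for whatever the real `/nodes/{node}/qemu` response type looks like.

```go
package main

import "fmt"

// pressureRatio converts the API's 0–100 pressure value to a 0.0–1.0 ratio.
func pressureRatio(apiValue float64) float64 {
	return apiValue / 100.0
}

// vmEntry is a hypothetical stand-in for one /nodes/{node}/qemu response item;
// the real type and pressure field names may differ.
type vmEntry struct {
	VMID    int
	Status  string
	CPUSome float64 // pressure value as returned by the API, 0–100
}

func main() {
	vms := []vmEntry{
		{VMID: 112, Status: "running", CPUSome: 4.2},
		{VMID: 113, Status: "stopped"}, // stopped VMs lack pressure fields: skipped
	}
	for _, vm := range vms {
		if vm.Status != "running" {
			continue // only running VMs get metrics
		}
		id := fmt.Sprintf("qemu/%d", vm.VMID) // matches the existing id label convention
		fmt.Printf("id=%s cpu_some=%.3f\n", id, pressureRatio(vm.CPUSome))
	}
}
// prints: id=qemu/112 cpu_some=0.042
```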

### 3. HA Status Collector

**File:** `collector/ha_status.go`
**API:** `/cluster/ha/status/manager_status` + `/cluster/ha/resources` (cluster-level, no per-node fan-out)
-**Type:** Plain Collector (not NodeAwareCollector)
+**Type:** Plain Collector (not NodeAwareCollector). Both API calls happen inside a
+single `Update()` method.

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
@@ -88,7 +100,12 @@ Load averages are strings in the API, parse with `strconv.ParseFloat`.
| `pve_ha_node_status` | Gauge | `node`, `status` | Per-node HA status (always 1) |
| `pve_ha_lrm_timestamp_seconds` | Gauge | `node` | Last LRM heartbeat as unix timestamp |
| `pve_ha_lrm_mode` | Gauge | `node`, `mode` | LRM mode per node (always 1) |
-| `pve_ha_service_info` | Gauge | `sid`, `type`, `node`, `state`, `max_restart`, `max_relocate`, `failback` | Service config + runtime (always 1) |
+| `pve_ha_service_config` | Gauge | `sid`, `type`, `max_restart`, `max_relocate`, `failback` | Service config (always 1) |
+| `pve_ha_service_status` | Gauge | `sid`, `node`, `state` | Service runtime state (always 1) |

Service metrics are split into `_config` (from `/cluster/ha/resources`, static labels)
and `_status` (from `manager_status.service_status`, runtime labels that change). This
avoids stale series when a service migrates between nodes or changes state.

**API response structure (`/cluster/ha/status/manager_status`):**
```json
@@ -96,13 +113,23 @@ Load averages are strings in the API, parse with `strconv.ParseFloat`.
  "data": {
    "manager_status": {
      "master_node": "node03",
-      "node_status": {"node01": "online", "node02": "online"},
+      "node_status": {
+        "node01": "online",
+        "node02": "online",
+        "node03": "online",
+        "node04": "online",
+        "node05": "online"
+      },
      "service_status": {
        "vm:106": {"node": "node04", "running": 1, "state": "started"}
      }
    },
    "lrm_status": {
-      "node01": {"mode": "active", "state": "wait_for_agent_lock", "timestamp": 1774016351}
+      "node01": {"mode": "active", "state": "wait_for_agent_lock", "timestamp": 1774016351},
+      "node02": {"mode": "active", "state": "wait_for_agent_lock", "timestamp": 1774016351},
+      "node03": {"mode": "active", "state": "wait_for_agent_lock", "timestamp": 1774016351},
+      "node04": {"mode": "active", "state": "active", "timestamp": 1774016350},
+      "node05": {"mode": "active", "state": "wait_for_agent_lock", "timestamp": 1774016351}
    }
  }
}
@@ -117,8 +144,12 @@ Load averages are strings in the API, parse with `strconv.ParseFloat`.
}
```

-- `pve_ha_crm_master`: iterate `node_status` keys, emit 1 for `master_node`, 0 for all others.
-- `pve_ha_service_info`: merge runtime state from `service_status` (node, state) with config from `/cluster/ha/resources` (max_restart, max_relocate, failback). Numeric config values become string labels.
+- `pve_ha_crm_master`: iterate `node_status` keys from `manager_status`, emit 1 for
+  `master_node`, 0 for all others. The `node_status` map contains all cluster nodes.
+- `pve_ha_service_config`: from `/cluster/ha/resources`. Numeric config values
+  (`max_restart`, `max_relocate`, `failback`) become string labels.
+- `pve_ha_service_status`: from `manager_status.service_status`. The `state` label
+  reflects runtime state (e.g., `started`, `stopped`, `migrate`).

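The `pve_ha_crm_master` derivation described in the first bullet can be sketched like this. The `managerStatus` struct name is an assumption (not the repo's actual type), and only the two fields needed here are modeled:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// managerStatus models just the fields needed for pve_ha_crm_master;
// the struct name is an assumption, not the repo's actual type.
type managerStatus struct {
	MasterNode string            `json:"master_node"`
	NodeStatus map[string]string `json:"node_status"`
}

// crmMasterValues emits 1 for the master node and 0 for every other
// node present in node_status.
func crmMasterValues(ms managerStatus) map[string]int {
	out := make(map[string]int, len(ms.NodeStatus))
	for n := range ms.NodeStatus {
		if n == ms.MasterNode {
			out[n] = 1
		} else {
			out[n] = 0
		}
	}
	return out
}

func main() {
	payload := []byte(`{"master_node":"node03","node_status":{"node01":"online","node02":"online","node03":"online"}}`)
	var ms managerStatus
	if err := json.Unmarshal(payload, &ms); err != nil {
		panic(err)
	}
	fmt.Println(crmMasterValues(ms)) // prints: map[node01:0 node02:0 node03:1]
}
```

Because every key of `node_status` is iterated, the metric covers all cluster nodes even when none of them is the master.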
### 4. Physical Disks Collector

@@ -150,8 +181,13 @@ Load averages are strings in the API, parse with `strconv.ParseFloat`.
}
```

-- `wearout`: API returns 0–100 integer (100 = new). Divide by 100 for ratio.
-- `health`: compare string to `"PASSED"`, emit 1 or 0.
+- `wearout`: API returns 0–100 integer representing percentage remaining (100 = new,
+  0 = fully worn). Divide by 100 to get ratio directly (no inversion needed). The API
+  may return `"N/A"` (string) for disks that don't support wear leveling — use
+  `json.Number` or similar to handle this. Skip emitting `pve_physical_disk_wearout_remaining_ratio`
+  when the value is not a valid number.
+- `health`: compare string to `"PASSED"`, emit 1 or 0. If `health` is empty or
+  `"UNKNOWN"`, emit 0.
- `osd` label: format as `"osd.N"` (e.g., `osd.8`) matching Ceph daemon naming convention.
- Multi-OSD disks: emit one `pve_physical_disk_osd` entry per item in `osdid-list`.
- Non-OSD disks (`osdid` is -1 / `osdid-list` is null): no `pve_physical_disk_osd` entry emitted.

@@ -163,11 +199,27 @@ All 4 collectors follow the established patterns:
- Self-register via `init()` + `registerCollector()`
- NodeAwareCollector interface for per-node endpoints (node_status, vm_pressure, physical_disk)
- Plain Collector for cluster-level endpoints (ha_status)
- NodeAwareCollector implementations guard the `nodes` slice with a `sync.Mutex` and copy
  it at the start of `Update()`, matching the pattern in `node_config.go` and `replication.go`
- Per-node fan-out uses `sync.WaitGroup` + semaphore from `client.MaxConcurrent()`
- JSON response types as unexported structs in each collector file
- Unit tests with JSON fixtures in `collector/fixtures/`
- `testCollectorAdapter` pattern for `testutil.GatherAndCompare`

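The mutex-guarded copy and WaitGroup-plus-semaphore fan-out described above can be sketched like this. The `collector` type, method signatures, and the in-memory "scrape" are assumptions for illustration; the real collectors issue per-node API calls instead:

```go
package main

import (
	"fmt"
	"sync"
)

type collector struct {
	mu    sync.Mutex
	nodes []string
}

// SetNodes stores a private copy of the node list under the mutex.
func (c *collector) SetNodes(nodes []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nodes = append([]string(nil), nodes...)
}

// Update snapshots the node list under the lock, then fans out with a
// WaitGroup plus a semaphore channel that caps concurrent per-node calls.
func (c *collector) Update(maxConcurrent int) []string {
	c.mu.Lock()
	nodes := append([]string(nil), c.nodes...)
	c.mu.Unlock()

	sem := make(chan struct{}, maxConcurrent)
	var (
		wg      sync.WaitGroup
		resMu   sync.Mutex
		scraped []string
	)
	for _, node := range nodes {
		wg.Add(1)
		sem <- struct{}{} // blocks once maxConcurrent goroutines are in flight
		go func(n string) {
			defer wg.Done()
			defer func() { <-sem }()
			// stand-in for the real per-node API call
			resMu.Lock()
			scraped = append(scraped, n)
			resMu.Unlock()
		}(node)
	}
	wg.Wait()
	return scraped
}

func main() {
	c := &collector{}
	c.SetNodes([]string{"node01", "node02", "node03"})
	fmt.Println(len(c.Update(2)), "nodes scraped") // prints: 3 nodes scraped
}
```

Copying the slice before the fan-out means a concurrent `SetNodes` during a scrape can never mutate the list the goroutines are iterating.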
### Test Fixtures

Each collector needs fixture files in `collector/fixtures/`:

- `node_status.json` — full `/nodes/{node}/status` response
- `node_qemu_pressure.json` — `/nodes/{node}/qemu` response with running + stopped VMs
  (reuse or extend existing `node_qemu.json` if pressure fields can be added)
- `ha_manager_status.json` — `/cluster/ha/status/manager_status` response
- `ha_resources.json` — `/cluster/ha/resources` response
- `node_disks.json` — `/nodes/{node}/disks/list` response

The HA collector test needs two routes mapped in the test server (one per endpoint),
unlike NodeAwareCollector tests which share a single route pattern.

## Scope Exclusions

- **SDN/Network**: Excluded — API exposes config only, not operational state.

@@ -177,4 +229,5 @@
## README Update

Remove the implemented items from the TODO section. Remove the SDN, kernel version,
-and CPU model entries entirely. If no TODO items remain, remove the section.
+and CPU model entries entirely. If no TODO items remain, remove the section. Add
+metrics tables for all 4 new collectors following the existing table format.