pve-exporter
A Prometheus exporter for Proxmox VE written in Go. Produces a single static
binary for easy deployment.
Designed as a drop-in replacement for prometheus-pve-exporter, with matching
metric names for dashboard compatibility, plus additional corosync cluster
metrics.
Installation
CGO_ENABLED=0 go build -o pve-exporter .
Usage
pve-exporter \
--pve.host=https://node01:8006 \
--pve.host=https://node02:8006 \
--pve.token-file=/etc/pve-exporter/apikey \
--web.listen-address=:9221
The exporter scrapes all cluster data from a single PVE API endpoint. Multiple
--pve.host values provide failover: hosts are tried in order, each with a
1-second connect timeout, so an unreachable host is skipped quickly.
Flags
| Flag | Default | Description |
| --- | --- | --- |
| --pve.host | (required) | PVE API base URL (repeatable) |
| --pve.api-token | | API token string (user@realm!tokenid=uuid) |
| --pve.token-file | | Path to file containing API token |
| --pve.tls-insecure | false | Disable TLS certificate verification |
| --pve.max-concurrent | 5 | Max concurrent API requests for per-node fan-out |
| --web.listen-address | :9221 | Address to listen on |
| --web.telemetry-path | /metrics | Path for metrics endpoint |
| --log.level | info | Log level (debug, info, warn, error) |
| --log.format | logfmt | Log format (logfmt, json) |
Authentication
Create a PVE API token with at least the PVEAuditor role. Provide it in one of
three ways:
- --pve.api-token=user@realm!tokenid=uuid (visible in the process list)
- --pve.token-file=/path/to/file (recommended)
- PVE_API_TOKEN environment variable

--pve.api-token and --pve.token-file are mutually exclusive.
Metrics
Cluster Status
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_node_info | Gauge | id, level, name, nodeid | Node info (always 1) |
| pve_cluster_info | Gauge | id, nodes, quorate, version | Cluster info (always 1) |
Corosync
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_cluster_quorate | Gauge | | 1 if cluster has quorum |
| pve_cluster_nodes_total | Gauge | | Total node count |
| pve_cluster_expected_votes | Gauge | | Sum of quorum votes from config |
| pve_node_online | Gauge | name, nodeid | 1 if node is online |
Cluster Resources
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_up | Gauge | id | 1 if node/VM/CT is online/running |
| pve_cpu_usage_ratio | Gauge | id | CPU utilization ratio |
| pve_cpu_usage_limit | Gauge | id | Number of available CPUs |
| pve_memory_usage_bytes | Gauge | id | Used memory in bytes |
| pve_memory_size_bytes | Gauge | id | Total memory in bytes |
| pve_disk_usage_bytes | Gauge | id | Used disk space in bytes |
| pve_disk_size_bytes | Gauge | id | Total disk space in bytes |
| pve_uptime_seconds | Gauge | id | Uptime in seconds |
| pve_network_transmit_bytes_total | Counter | id | Network bytes sent |
| pve_network_receive_bytes_total | Counter | id | Network bytes received |
| pve_disk_written_bytes_total | Counter | id | Disk bytes written |
| pve_disk_read_bytes_total | Counter | id | Disk bytes read |
| pve_guest_info | Gauge | id, node, name, type, template, tags | VM/CT info (always 1) |
| pve_storage_info | Gauge | id, node, storage, plugintype, content | Storage info (always 1) |
| pve_storage_shared | Gauge | id | 1 if storage is shared |
| pve_ha_state | Gauge | id, state | HA service status |
| pve_lock_state | Gauge | id, state | Guest config lock state |
Version
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_version_info | Gauge | release, repoid, version | PVE version info (always 1) |
Backup
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_not_backed_up_total | Gauge | id | 1 if guest has no backup job |
| pve_not_backed_up_info | Gauge | id | 1 if guest has no backup job |
Node Config
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_onboot_status | Gauge | id, node, type | VM/CT onboot config value |
Replication
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_replication_info | Gauge | id, type, source, target, guest | Replication job info (always 1) |
| pve_replication_duration_seconds | Gauge | id | Last replication duration |
| pve_replication_last_sync_timestamp_seconds | Gauge | id | Last successful sync time |
| pve_replication_last_try_timestamp_seconds | Gauge | id | Last sync attempt time |
| pve_replication_next_sync_timestamp_seconds | Gauge | id | Next scheduled sync time |
| pve_replication_failed_syncs | Gauge | id | Failed sync count |
Subscription
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_subscription_info | Gauge | id, level | Subscription info (always 1) |
| pve_subscription_status | Gauge | id, status | Subscription status |
| pve_subscription_next_due_timestamp_seconds | Gauge | id | Next due date as Unix timestamp |
Node Status
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_node_load1 | Gauge | node | 1-minute load average |
| pve_node_load5 | Gauge | node | 5-minute load average |
| pve_node_load15 | Gauge | node | 15-minute load average |
| pve_node_swap_total_bytes | Gauge | node | Total swap in bytes |
| pve_node_swap_used_bytes | Gauge | node | Used swap in bytes |
| pve_node_swap_free_bytes | Gauge | node | Free swap in bytes |
| pve_node_rootfs_total_bytes | Gauge | node | Root filesystem total in bytes |
| pve_node_rootfs_used_bytes | Gauge | node | Root filesystem used in bytes |
| pve_node_rootfs_available_bytes | Gauge | node | Root filesystem available in bytes |
| pve_node_ksm_shared_bytes | Gauge | node | KSM shared memory in bytes |
| pve_node_boot_mode_info | Gauge | node, mode, secureboot | Boot mode info (always 1) |
VM Pressure
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_vm_pressure_cpu_some_ratio | Gauge | id, node | CPU pressure (some) |
| pve_vm_pressure_cpu_full_ratio | Gauge | id, node | CPU pressure (full) |
| pve_vm_pressure_memory_some_ratio | Gauge | id, node | Memory pressure (some) |
| pve_vm_pressure_memory_full_ratio | Gauge | id, node | Memory pressure (full) |
| pve_vm_pressure_io_some_ratio | Gauge | id, node | I/O pressure (some) |
| pve_vm_pressure_io_full_ratio | Gauge | id, node | I/O pressure (full) |
HA Status
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_ha_crm_master | Gauge | node | 1 if node is CRM master, 0 otherwise |
| pve_ha_node_status | Gauge | node, status | Per-node HA status (always 1) |
| pve_ha_lrm_timestamp_seconds | Gauge | node | Last LRM heartbeat as Unix timestamp |
| pve_ha_lrm_mode | Gauge | node, mode | LRM mode per node (always 1) |
| pve_ha_service_config | Gauge | id, type, max_restart, max_relocate, failback | Service config (always 1) |
| pve_ha_service_status | Gauge | id, node, state | Service runtime state (always 1) |
Physical Disks
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_physical_disk_health | Gauge | node, devpath, model, serial, type | 1 if SMART PASSED, 0 otherwise |
| pve_physical_disk_wearout_remaining_ratio | Gauge | node, devpath | Wearout remaining (1.0 = new) |
| pve_physical_disk_size_bytes | Gauge | node, devpath | Disk size in bytes |
| pve_physical_disk_info | Gauge | node, devpath, model, serial, type, used | Disk info (always 1) |
| pve_physical_disk_osd | Gauge | node, devpath, osd | Disk-to-OSD mapping (always 1) |
Scrape Meta
| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pve_scrape_collector_duration_seconds | Gauge | collector | Scrape duration per collector |
| pve_scrape_collector_success | Gauge | collector | 1 if collector succeeded |