Proxmox VE Prometheus exporter
Davíð Steinn Geirsson 5e066a5c4b fix: normalize HA service IDs to match cluster_resources format
Convert HA API service IDs (vm:106, ct:200) to the resource ID format
used by /cluster/resources and the Python exporter (qemu/106, lxc/200).
Rename label from "sid" to "id" so HA metrics can be joined with
pve_ha_state, pve_guest_info, and other id-labeled metrics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 15:12:01 +00:00

pve-exporter

A Prometheus exporter for Proxmox VE written in Go. Produces a single static binary for easy deployment.

It is designed as a drop-in replacement for prometheus-pve-exporter: metric names match, so existing dashboards keep working, and it adds corosync cluster metrics.

Installation

CGO_ENABLED=0 go build -o pve-exporter .

Usage

pve-exporter \
  --pve.host=https://node01:8006 \
  --pve.host=https://node02:8006 \
  --pve.token-file=/etc/pve-exporter/apikey \
  --web.listen-address=:9221

The exporter scrapes all cluster data from a single PVE API endpoint. Multiple --pve.host values provide failover: hosts are tried in the order given, each with a 1-second connect timeout, so an unreachable host is skipped quickly.
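The failover behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the exporter's actual code: `newFailoverClient`, `tryHosts`, `errAllHostsFailed`, and the `fetch` callback are hypothetical names, and the only facts taken from the README are the in-order retry and the 1-second connect timeout.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// errAllHostsFailed is a hypothetical sentinel returned when no host answered.
var errAllHostsFailed = errors.New("all PVE hosts failed")

// newFailoverClient returns an HTTP client whose TCP connect phase gives up
// after one second, so an unreachable host is abandoned quickly.
func newFailoverClient() *http.Client {
	dialer := &net.Dialer{Timeout: 1 * time.Second}
	return &http.Client{
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

// tryHosts calls fetch for each host in the order given and returns the first
// host for which fetch succeeds. fetch stands in for one API request against
// the given base URL (an assumption made for this sketch).
func tryHosts(hosts []string, fetch func(baseURL string) error) (string, error) {
	var lastErr error
	for _, host := range hosts {
		if err := fetch(host); err != nil {
			lastErr = fmt.Errorf("%s: %w", host, err)
			continue
		}
		return host, nil
	}
	if lastErr == nil {
		return "", errAllHostsFailed // no hosts were configured
	}
	return "", fmt.Errorf("%w (last error: %v)", errAllHostsFailed, lastErr)
}
```

A caller would pass the --pve.host values as `hosts` and a closure that issues a request with `newFailoverClient()`. Keeping the request behind a callback makes the ordering logic easy to test without a network.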

Flags

| Flag | Default | Description |
|------|---------|-------------|
| --pve.host | (required) | PVE API base URL (repeatable) |
| --pve.api-token | | API token string (user@realm!tokenid=uuid) |
| --pve.token-file | | Path to file containing API token |
| --pve.tls-insecure | false | Disable TLS certificate verification |
| --pve.max-concurrent | 5 | Max concurrent API requests for per-node fan-out |
| --web.listen-address | :9221 | Address to listen on |
| --web.telemetry-path | /metrics | Path for metrics endpoint |
| --log.level | info | Log level (debug, info, warn, error) |
| --log.format | logfmt | Log format (logfmt, json) |
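To illustrate what --pve.max-concurrent bounds, here is one common way to cap a per-node fan-out in Go, using a buffered channel as a semaphore. It is a sketch under assumptions: `fanOut` and its callback are hypothetical, and the exporter may implement the limit differently.

```go
package main

import "sync"

// fanOut runs fn once per node with at most maxConcurrent calls in flight and
// returns each node's error (nil on success), keyed by node name.
func fanOut(nodes []string, maxConcurrent int, fn func(node string) error) map[string]error {
	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	sem := make(chan struct{}, maxConcurrent) // one slot per allowed request
	results := make(map[string]error, len(nodes))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for _, node := range nodes {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot; blocks while the limit is reached
		go func(node string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			err := fn(node)
			mu.Lock()
			results[node] = err
			mu.Unlock()
		}(node)
	}
	wg.Wait()
	return results
}
```

With the default of 5, a call such as `fanOut(nodes, 5, queryNode)` would issue at most five per-node requests at a time, where `queryNode` is whatever function performs the request for one node.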

Authentication

Create a PVE API token with at least the PVEAuditor role. Provide it in one of these ways:

  • --pve.api-token=user@realm!tokenid=uuid (visible in process list)
  • --pve.token-file=/path/to/file (recommended)
  • PVE_API_TOKEN environment variable

--pve.api-token and --pve.token-file are mutually exclusive.
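The token handling above might look roughly like this. The sketch makes several assumptions that the README does not state: the function names are hypothetical, the environment variable is consulted only when neither flag is set, and surrounding whitespace in the token file is trimmed. The `PVEAPIToken=` prefix is the Authorization header format that the Proxmox VE API uses for token authentication.

```go
package main

import (
	"errors"
	"strings"
)

// resolveToken picks the API token from the three sources listed above.
// readFile is injected to keep the sketch testable; os.ReadFile has a
// matching signature. Precedence of the environment variable is an assumption.
func resolveToken(flagToken, tokenFile, envToken string, readFile func(string) ([]byte, error)) (string, error) {
	if flagToken != "" && tokenFile != "" {
		return "", errors.New("--pve.api-token and --pve.token-file are mutually exclusive")
	}
	switch {
	case flagToken != "":
		return flagToken, nil
	case tokenFile != "":
		data, err := readFile(tokenFile)
		if err != nil {
			return "", err
		}
		// Trimming a trailing newline is an assumption of this sketch.
		return strings.TrimSpace(string(data)), nil
	case envToken != "":
		return envToken, nil
	}
	return "", errors.New("no API token provided")
}

// authHeader builds the Authorization header value for PVE token auth
// from a token of the form user@realm!tokenid=uuid.
func authHeader(token string) string {
	return "PVEAPIToken=" + token
}
```

In a real program, `envToken` would come from `os.Getenv("PVE_API_TOKEN")` and `readFile` would be `os.ReadFile`.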

Metrics

Cluster Status

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_node_info | Gauge | id, level, name, nodeid | Node info (always 1) |
| pve_cluster_info | Gauge | id, nodes, quorate, version | Cluster info (always 1) |
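Metrics marked "always 1" follow the usual Prometheus info-metric pattern: the value is constant and the information lives in the labels. As a rough illustration, the hypothetical helper below renders one such sample in the Prometheus text exposition format using only the standard library. The label values in the usage note are made up, and the exporter itself may well use a client library instead.

```go
package main

import (
	"sort"
	"strings"
)

// labelEscaper escapes backslash, double quote, and newline, the three
// characters that must be escaped inside a label value.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatInfoMetric renders a single "always 1" sample, with labels sorted by
// name so the output is deterministic.
func formatInfoMetric(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(name)
	b.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			b.WriteByte(',')
		}
		b.WriteString(k)
		b.WriteString(`="`)
		b.WriteString(labelEscaper.Replace(labels[k]))
		b.WriteByte('"')
	}
	b.WriteString("} 1")
	return b.String()
}
```

For example, calling it with the name pve_node_info and the illustrative labels id=node/node01, level empty, name=node01, nodeid=1 yields `pve_node_info{id="node/node01",level="",name="node01",nodeid="1"} 1`.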

Corosync

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_cluster_quorate | Gauge | | 1 if cluster has quorum |
| pve_cluster_nodes_total | Gauge | | Total node count |
| pve_cluster_expected_votes | Gauge | | Sum of quorum votes from config |
| pve_node_online | Gauge | name, nodeid | 1 if node is online |

Cluster Resources

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_up | Gauge | id | 1 if node/VM/CT is online/running |
| pve_cpu_usage_ratio | Gauge | id | CPU utilization ratio |
| pve_cpu_usage_limit | Gauge | id | Number of available CPUs |
| pve_memory_usage_bytes | Gauge | id | Used memory in bytes |
| pve_memory_size_bytes | Gauge | id | Total memory in bytes |
| pve_disk_usage_bytes | Gauge | id | Used disk space in bytes |
| pve_disk_size_bytes | Gauge | id | Total disk space in bytes |
| pve_uptime_seconds | Gauge | id | Uptime in seconds |
| pve_network_transmit_bytes_total | Counter | id | Network bytes sent |
| pve_network_receive_bytes_total | Counter | id | Network bytes received |
| pve_disk_written_bytes_total | Counter | id | Disk bytes written |
| pve_disk_read_bytes_total | Counter | id | Disk bytes read |
| pve_guest_info | Gauge | id, node, name, type, template, tags | VM/CT info (always 1) |
| pve_storage_info | Gauge | id, node, storage, plugintype, content | Storage info (always 1) |
| pve_storage_shared | Gauge | id | 1 if storage is shared |
| pve_ha_state | Gauge | id, state | HA service status |
| pve_lock_state | Gauge | id, state | Guest config lock state |

Version

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_version_info | Gauge | release, repoid, version | PVE version info (always 1) |

Backup

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_not_backed_up_total | Gauge | id | 1 if guest has no backup job |
| pve_not_backed_up_info | Gauge | id | 1 if guest has no backup job |

Node Config

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_onboot_status | Gauge | id, node, type | VM/CT onboot config value |

Replication

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_replication_info | Gauge | id, type, source, target, guest | Replication job info (always 1) |
| pve_replication_duration_seconds | Gauge | id | Last replication duration |
| pve_replication_last_sync_timestamp_seconds | Gauge | id | Last successful sync time |
| pve_replication_last_try_timestamp_seconds | Gauge | id | Last sync attempt time |
| pve_replication_next_sync_timestamp_seconds | Gauge | id | Next scheduled sync time |
| pve_replication_failed_syncs | Gauge | id | Failed sync count |

Subscription

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_subscription_info | Gauge | id, level | Subscription info (always 1) |
| pve_subscription_status | Gauge | id, status | Subscription status |
| pve_subscription_next_due_timestamp_seconds | Gauge | id | Next due date as Unix timestamp |

Node Status

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_node_load1 | Gauge | node | 1-minute load average |
| pve_node_load5 | Gauge | node | 5-minute load average |
| pve_node_load15 | Gauge | node | 15-minute load average |
| pve_node_swap_total_bytes | Gauge | node | Total swap in bytes |
| pve_node_swap_used_bytes | Gauge | node | Used swap in bytes |
| pve_node_swap_free_bytes | Gauge | node | Free swap in bytes |
| pve_node_rootfs_total_bytes | Gauge | node | Root filesystem total in bytes |
| pve_node_rootfs_used_bytes | Gauge | node | Root filesystem used in bytes |
| pve_node_rootfs_available_bytes | Gauge | node | Root filesystem available in bytes |
| pve_node_ksm_shared_bytes | Gauge | node | KSM shared memory in bytes |
| pve_node_boot_mode_info | Gauge | node, mode, secureboot | Boot mode info (always 1) |

VM Pressure

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_vm_pressure_cpu_some_ratio | Gauge | id, node | CPU pressure (some) |
| pve_vm_pressure_cpu_full_ratio | Gauge | id, node | CPU pressure (full) |
| pve_vm_pressure_memory_some_ratio | Gauge | id, node | Memory pressure (some) |
| pve_vm_pressure_memory_full_ratio | Gauge | id, node | Memory pressure (full) |
| pve_vm_pressure_io_some_ratio | Gauge | id, node | I/O pressure (some) |
| pve_vm_pressure_io_full_ratio | Gauge | id, node | I/O pressure (full) |

HA Status

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_ha_crm_master | Gauge | node | 1 if node is CRM master, 0 otherwise |
| pve_ha_node_status | Gauge | node, status | Per-node HA status (always 1) |
| pve_ha_lrm_timestamp_seconds | Gauge | node | Last LRM heartbeat as Unix timestamp |
| pve_ha_lrm_mode | Gauge | node, mode | LRM mode per node (always 1) |
| pve_ha_service_config | Gauge | id, type, max_restart, max_relocate, failback | Service config (always 1) |
| pve_ha_service_status | Gauge | id, node, state | Service runtime state (always 1) |
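The id label on the HA service metrics uses the resource ID format of /cluster/resources (qemu/106, lxc/200) rather than the HA API's service ID format (vm:106, ct:200), so that HA metrics can be joined with pve_ha_state, pve_guest_info, and other id-labeled metrics. A minimal sketch of such a conversion is shown below. The function name is hypothetical, and passing unknown prefixes through unchanged is an assumption of this sketch.

```go
package main

import "strings"

// normalizeHAServiceID converts an HA API service ID such as "vm:106" or
// "ct:200" into the /cluster/resources form "qemu/106" or "lxc/200".
// IDs with an unrecognised prefix are returned unchanged (an assumption).
func normalizeHAServiceID(sid string) string {
	kind, vmid, ok := strings.Cut(sid, ":")
	if !ok {
		return sid
	}
	switch kind {
	case "vm":
		return "qemu/" + vmid
	case "ct":
		return "lxc/" + vmid
	}
	return sid
}
```

With matching id values, a PromQL join such as `pve_ha_service_status * on (id) group_left (name) pve_guest_info` becomes possible.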

Physical Disks

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_physical_disk_health | Gauge | node, devpath, model, serial, type | 1 if SMART PASSED, 0 otherwise |
| pve_physical_disk_wearout_remaining_ratio | Gauge | node, devpath | Wearout remaining (1.0 = new) |
| pve_physical_disk_size_bytes | Gauge | node, devpath | Disk size in bytes |
| pve_physical_disk_info | Gauge | node, devpath, model, serial, type, used | Disk info (always 1) |
| pve_physical_disk_osd | Gauge | node, devpath, osd | Disk-to-OSD mapping (always 1) |

Scrape Meta

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| pve_scrape_collector_duration_seconds | Gauge | collector | Scrape duration per collector |
| pve_scrape_collector_success | Gauge | collector | 1 if collector succeeded |
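The two scrape meta metrics can be derived by timing each collector run and recording whether it returned an error. The sketch below shows the idea with hypothetical names (`collectorResult`, `runCollector`, `errCollectorFailed`). Reporting 0 on failure is an assumption, since the table only states the success case.

```go
package main

import (
	"errors"
	"time"
)

// errCollectorFailed is a placeholder error used only for illustration.
var errCollectorFailed = errors.New("collector failed")

// collectorResult holds the values behind the two scrape meta metrics
// for a single collector.
type collectorResult struct {
	Collector       string  // value of the "collector" label
	DurationSeconds float64 // pve_scrape_collector_duration_seconds
	Success         float64 // pve_scrape_collector_success (0 on failure is assumed)
}

// runCollector times one collector run and records whether it succeeded.
func runCollector(name string, collect func() error) collectorResult {
	start := time.Now()
	err := collect()
	res := collectorResult{
		Collector:       name,
		DurationSeconds: time.Since(start).Seconds(),
	}
	if err == nil {
		res.Success = 1
	}
	return res
}
```

A dashboard or alert can then watch for `pve_scrape_collector_success == 0` to spot a collector that is failing while the exporter as a whole still responds.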