← Back to DevOps & Cloud
DevOps & Cloud by @akellacom

prometheus

Query Prometheus monitoring data to check server metrics, resource usage, and system health

0
Source Code

Prometheus Skill

Query Prometheus monitoring data to get insights about your infrastructure.

Environment Variables

Set in .env file:

  • PROMETHEUS_URL - Prometheus server URL (e.g., http://localhost:9090)
  • PROMETHEUS_USER - HTTP Basic Auth username (optional)
  • PROMETHEUS_PASSWORD - HTTP Basic Auth password (optional)

Usage

Query Metrics

Use the CLI to run PromQL queries:

source .env && node scripts/cli.js query '<promql_query>'

Common Examples

Disk space usage:

node scripts/cli.js query '100 - (node_filesystem_avail_bytes / node_filesystem_size_bytes * 100)'

CPU usage:

node scripts/cli.js query '100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'

Memory usage:

node scripts/cli.js query '(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100'

Load average:

node scripts/cli.js query 'node_load1'

List Metrics

Find available metrics matching a pattern:

node scripts/cli.js metrics 'node_memory_*'

Series Discovery

Find time series by label selectors:

node scripts/cli.js series '{__name__=~"node_cpu_.*", instance=~".*:9100"}'

Get Labels

List label names:

node scripts/cli.js labels

List values for a specific label:

node scripts/cli.js label-values instance

Output Format

All commands output JSON for easy parsing. Use jq for pretty printing:

node scripts/cli.js query 'up' | jq .

Common Queries Reference

Metric PromQL Query
Disk free % node_filesystem_avail_bytes / node_filesystem_size_bytes * 100
Disk used % 100 - (node_filesystem_avail_bytes / node_filesystem_size_bytes * 100)
CPU idle % avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100
Memory used % (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
Network RX rate(node_network_receive_bytes_total[5m])
Network TX rate(node_network_transmit_bytes_total[5m])
Uptime node_time_seconds - node_boot_time_seconds
Service up up

Notes

  • Time range defaults to last 1 hour for instant queries
  • Use range queries [5m] for rate calculations
  • All queries return JSON with data.result containing the results
  • Instance labels typically show host:port format