vms: add monitoring stack and node-exporter for docker host
utility-101-shadow:
- Add full monitoring stack (Prometheus + Blackbox Exporter + Alertmanager) with Docker Compose and a systemd unit (monitoring.service)
- Prometheus scrapes: itself, blackbox-exporter, and node-exporter on the docker host (docker:9100); blackbox probes cover HTTPS endpoints with TLS cert monitoring
- Alertmanager routes warnings to Slack/Discord, critical alerts also to email (Gmail SMTP); inhibit rule suppresses SSLCertExpiringSoon when SSLCertExpired already fires
- Alert rules: 11 node-exporter alerts (host down, CPU, memory, disk fill/prediction, iowait, OOM kill, systemd failed units) + 3 blackbox alerts (probe failed, SSL expiring, SSL expired)
- readme: add services list and Docker Engine installation steps

docker host:
- Add node-exporter container running with host pid/network and read-only mounts of /proc, /sys, / for full host metrics visibility
- Enable --collector.systemd for systemd unit state metrics
- Add systemd unit (node-exporter.service) to manage the container

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
15
vms/utility-101-shadow/docker/monitoring/monitoring.service
Normal file
@@ -0,0 +1,15 @@
[Unit]
Description=Monitoring Stack (Prometheus + Blackbox Exporter + Alertmanager)
After=docker.service network-online.target
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/srv/docker/monitoring
ExecStart=/usr/bin/docker compose up -d --remove-orphans
ExecStop=/usr/bin/docker compose down
TimeoutStartSec=300

[Install]
WantedBy=multi-user.target
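
Note on the unit above: `After=network-online.target` only orders monitoring.service after that target; it does not pull the target into the boot transaction. systemd's documentation recommends pairing it with `Wants=network-online.target` when a unit really needs a configured network. A minimal sketch of a drop-in that would add this, assuming the unit is installed as /etc/systemd/system/monitoring.service (the drop-in path and file name are hypothetical and not part of this commit):

```ini
# /etc/systemd/system/monitoring.service.d/10-network-online.conf
# Hypothetical drop-in, not part of this commit.
[Unit]
# After= in the main unit handles ordering only; Wants= requests that
# network-online.target is actually started alongside this unit.
Wants=network-online.target
```

After adding a drop-in like this, `systemctl daemon-reload` is needed for systemd to pick it up. Whether the stack needs it depends on the host: if something else already pulls in network-online.target, the existing unit behaves the same.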