From 2e3c0f9188c8f7aa38cf20823f967301362e8071 Mon Sep 17 00:00:00 2001
From: Ronni Baslund
Date: Thu, 11 Jun 2026 11:49:00 +0200
Subject: [PATCH] =?UTF-8?q?docs(runbook):=20monitoring=20update=20?=
 =?UTF-8?q?=E2=80=94=20TCP-25=20rationale=20+=20blacklist=20monitors?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 infrastructure/production/RUNBOOK.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/infrastructure/production/RUNBOOK.md b/infrastructure/production/RUNBOOK.md
index 094d542..c5de053 100644
--- a/infrastructure/production/RUNBOOK.md
+++ b/infrastructure/production/RUNBOOK.md
@@ -39,12 +39,15 @@ bottom to rebuild it. Per-layer detail lives in `host/README.md`,
   type): keep newest 5 versions per image + `latest`, remove older than 7 days.
   Applied by Gitea's daily cleanup cron.
 - **Monitoring** — HetrixTools (Ronni's account): 11 uptime monitors via API
-  (HTTPS on the five apps + Gitea w/ SSL verify, ping, IMAPS/SMTPS TCP, SMTP
-  protocol on :25; 1-min checks from ams/fra/lon, alert after 2 fails) + the
-  Linux server agent on node1 (root mode, per-minute cron in
-  /etc/hetrixtools/; watches stalwart/k3s/dockerd processes, mdadm RAID,
-  NVMe SMART via smartmontools). Re-create monitors via their v2 API
-  (uptime/add, Type 9 = server agent — hidden in the new UI); agent install:
+  (HTTPS on the five apps + Gitea w/ SSL verify, ping, IMAPS/SMTPS/port-25
+  TCP — port 25 is a TCP check ON PURPOSE: Stalwart's DNSBL screening
+  rejects HetrixTools' probe IPs, so an SMTP-protocol check reads down while
+  real MTAs are fine; 1-min checks from ams/fra/lon, alert after 2 fails),
+  blacklist monitors on dezky.eu + 46.4.78.187, and the Linux server agent
+  on node1 (root mode, per-minute cron in /etc/hetrixtools/; watches
+  stalwart/k3s/dockerd processes, mdadm RAID, NVMe SMART via smartmontools).
+  Re-create monitors via their v2 API (uptime/add, Type 9 = server agent —
+  hidden in the new UI); agent install:
   hetrixtools_install.sh 1 "stalwart,k3s,dockerd" 1 1.
 
 ## Deploy flow (day-to-day)