dezky/infrastructure/production/host/README.md
Ronni Baslund 9d075343c5
feat(infra): migrate Stalwart to the v0.16 config model (config.json)
v0.16 dropped TOML config. The host service now boots from a tiny config.json
that describes only the datastore (RocksDB); all other settings live in the DB
(web UI / stalwart-cli / platform-api JMAP).

- add stalwart/config.json (RocksDb datastore at /opt/stalwart/data)
- install.sh: install config.json instead of config.toml
- stalwart-mail.service: --config points at config.json
- README: document the v0.16 model + remaining DB-side config + DNS/PTR

Verified: Stalwart 0.16.8 runs on node1 with default mail listeners + the :8080
management server. config.toml retained as a reference for the DB settings.
2026-06-08 21:02:17 +02:00

# Dezky production — host layer
OS baseline + firewall for the bare-metal **Hetzner AX41** that runs the k3s
node. This layer is everything that lives on the *host* (outside Kubernetes):
hardening, the k3s-safe firewall, k3s registration, Stalwart mail, and Restic
backups.
Managed by **Fleet/Rancher** once k3s is up; this host layer is the part Fleet
*can't* do, so it runs over SSH from reviewed scripts.
## Files
| File | Purpose |
|------|---------|
| `config.env.example` | Template for host-specific values |
| `config.env` | **Real values — gitignored.** Source of truth lives only on your machine/box |
| `bootstrap.sh` | One-shot OS hardening: user, SSH, sysctl, swap, fail2ban, auto-updates, firewall |
| `firewall/firewall.sh` | Renders + applies the k3s-safe nftables ruleset (idempotent) |
| `firewall/dezky-firewall.service` | systemd unit; reapplies our table on boot, never flushes globally |
| `k3s/register.sh` | Registers the node into Rancher (Custom k3s cluster); secrets from `config.env` |
| `stalwart/install.sh` | Installs Stalwart as a hardened host service (binary, units, secrets, bootstrap cert) |
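| `stalwart/config.json` | v0.16 boot config; describes only the datastore (RocksDB at `/opt/stalwart/data`) |
| `stalwart/config.toml` | Pre-v0.16 config, kept as a reference for the settings to recreate in the DB; **not loaded by v0.16** |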
| `stalwart/stalwart-mail.service` | systemd unit; non-root + `CAP_NET_BIND_SERVICE` for low ports |
| `stalwart/cert-sync.sh` + `*.service`/`*.timer` | Pulls the cert-manager mail cert into Stalwart, reloads on change |
| `restic/install.sh` | Sets up Restic, the backup SSH key/config, env, and the nightly timer |
| `restic/backup.sh` | Backup → primary Storage Box, retention, then `copy` → Helsinki DR |
| `restic/restore.sh` | List/restore snapshots (run drills!) |
| `restic/dezky-backup.service` + `.timer` | Nightly 03:20 UTC backup |
## The firewall model (read this)
k3s, kube-proxy and flannel manage their **own** nftables tables (`ip`/`ip6`:
`filter`, `nat`, `mangle`). The classic mistake is running `ufw`/`firewalld` or
`nft flush ruleset`, which wipes or fights those rules and breaks pod networking.
So instead:
- We own a single dedicated table — **`inet dezky_fw`** — with only an INPUT
chain (default `drop`). Separate tables coexist; a packet is dropped if *any*
base chain drops it, so our default-drop INPUT gates host-bound traffic while
k3s keeps owning FORWARD/NAT untouched.
- We explicitly **accept the pod (`10.42.0.0/16`) and service (`10.43.0.0/16`)
CIDRs and the CNI interfaces** (`cni0`, `flannel.1`) so cluster↔host traffic
(API server, kubelet, CoreDNS) is never dropped.
- We **never** `flush ruleset`. The systemd unit's `ExecStop` removes only our
table.
### Access policy
| Surface | Ports | Who |
|---------|-------|-----|
| Web + ACME | 80, 443 | **World** (customers) |
| Mail | 25, 465, 587, 143, 993, 4190 | **World** |
| SSH | 22 | **`MGMT_ALLOW_V4/V6` only** |
| k3s API | 6443 | **`MGMT_ALLOW_V4/V6` only** |

Current management allowlist: **home `46.32.144.38`**, **office `46.32.144.45`**.
The Rancher plane (`91.99.122.153`) needs **no inbound rule** — the cluster
agent dials *out* to Rancher over 443, so replies ride the established/related
fast-path.
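
The model and access policy above can be sketched as a small renderer. This is a hypothetical illustration, not the contents of `firewall/firewall.sh`: the function name `render_ruleset`, the ICMP rule, and the comma-separated allowlist format are assumptions, and IPv6 management rules are omitted for brevity. It only prints a ruleset; nothing is applied.

```bash
#!/usr/bin/env bash
# Sketch only: prints a ruleset to stdout, it does not apply anything.
# render_ruleset is a hypothetical helper, not the real firewall.sh.
render_ruleset() {
  local mgmt_v4="$1"   # comma-separated IPv4 list (assumed format)
  cat <<EOF
# Declare-then-delete makes reloading idempotent without touching other tables.
table inet dezky_fw
delete table inet dezky_fw

table inet dezky_fw {
  chain input {
    type filter hook input priority filter; policy drop;

    ct state established,related accept
    iif "lo" accept
    meta l4proto { icmp, ipv6-icmp } accept

    # Cluster <-> host traffic: CNI interfaces, pod and service CIDRs
    iifname { "cni0", "flannel.1" } accept
    ip saddr { 10.42.0.0/16, 10.43.0.0/16 } accept

    # World-facing: web + ACME, then mail
    tcp dport { 80, 443 } accept
    tcp dport { 25, 465, 587, 143, 993, 4190 } accept

    # Management: SSH and the k3s API, allowlist only
    ip saddr { ${mgmt_v4} } tcp dport { 22, 6443 } accept
  }
}
EOF
}

# Preview, then validate without applying (nft -c only checks the file):
#   render_ruleset "46.32.144.38, 46.32.144.45" > /tmp/dezky-fw.nft
#   sudo nft -c -f /tmp/dezky-fw.nft
```

Because the file only ever names `inet dezky_fw`, loading it with `nft -f` replaces that one table atomically and leaves the k3s-managed tables alone.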
## Apply order
> Prereqs: AX41 provisioned with **Debian 12 (bookworm)**, reachable as `root`.
> `config.env` filled in — in particular `ADMIN_SSH_PUBKEY` and
> `SERVER_PUBLIC_IPV4` (still TODO until the box exists).
```bash
# From your laptop:
scp -r infrastructure/production/host root@<server-ip>:/opt/dezky-host
# On the server:
ssh root@<server-ip>
cd /opt/dezky-host
# config.env is gitignored, so copy it up separately or recreate it here:
# cp config.env.example config.env && nano config.env
./bootstrap.sh
```
`bootstrap.sh` creates your admin user and installs your key **before** it
disables root/password SSH, so the order is lockout-safe. It's idempotent —
re-run anytime.
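
The lockout-safe ordering can be illustrated with a guard that refuses to harden SSH until the admin key is verifiably in place. This is a minimal sketch under assumptions: `key_installed` and `harden_ssh_if_safe` are hypothetical names, and the actual sshd changes made by `bootstrap.sh` are left as a comment.

```bash
#!/usr/bin/env bash
# Hypothetical guard, not taken from bootstrap.sh.

# True only if the authorized_keys file is non-empty and contains the exact key line.
key_installed() {
  local auth_keys="$1" pubkey="$2"
  [ -s "$auth_keys" ] && grep -qxF -- "$pubkey" "$auth_keys"
}

# Harden sshd only after the key check passes; otherwise leave sshd untouched.
harden_ssh_if_safe() {
  local auth_keys="$1" pubkey="$2"
  if key_installed "$auth_keys" "$pubkey"; then
    echo "key present: safe to disable root/password SSH"
    # The real script would now set PermitRootLogin/PasswordAuthentication
    # and reload sshd.
  else
    echo "key missing: leaving sshd untouched" >&2
    return 1
  fi
}

# Usage (paths are illustrative):
#   harden_ssh_if_safe "/home/$ADMIN_USER/.ssh/authorized_keys" "$ADMIN_SSH_PUBKEY"
```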
To touch only the firewall later:
```bash
sudo ./firewall/firewall.sh --dry-run # preview the ruleset
sudo ./firewall/firewall.sh # render, validate, apply, install unit
```
### Then register into Rancher
Once the host is hardened, register the node as a **Custom k3s cluster**
(create the cluster in Rancher first, choosing the **K3s** distribution, then
paste its token/checksum into `config.env`):
```bash
sudo ./k3s/register.sh # downloads agent installer, joins cluster
journalctl -u rancher-system-agent -f # follow provisioning
```
Rancher is currently reached by IP, so the installer is fetched with
`--insecure`; the agent's ongoing link is still verified via `--ca-checksum`.
Give Rancher a real hostname + cert later to drop the insecure fetch.
### Then install Stalwart (mail)
```bash
sudo ./stalwart/install.sh # binary + systemd + bootstrap cert
systemctl status stalwart-mail
```
Requires `STALWART_ADMIN_PASSWORD` + `STALWART_WEBHOOK_SECRET` in `config.env`
(`openssl rand -hex 24` / `-hex 32`). See the mail topology below.
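
For clarity on which length goes with which variable, here is a small sketch that emits both lines. The helper name `gen_stalwart_secrets` is hypothetical; the `openssl rand -hex` lengths are the ones given above.

```bash
#!/usr/bin/env bash
# Sketch: print config.env lines with freshly generated secrets.
# -hex N emits N random bytes as 2*N hex characters.
gen_stalwart_secrets() {
  printf 'STALWART_ADMIN_PASSWORD=%s\n' "$(openssl rand -hex 24)"
  printf 'STALWART_WEBHOOK_SECRET=%s\n' "$(openssl rand -hex 32)"
}

# Usage:
#   gen_stalwart_secrets >> config.env && chmod 600 config.env
```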
## Mail (Stalwart) topology
Stalwart runs on the **host**, not in k3s — mail must keep flowing regardless of
cluster state, and SMTP/IMAP want the real public IP for reputation. The single
public IP forces a deliberate split with Traefik:
| Concern | Owner | Detail |
|---------|-------|--------|
| Mail protocol ports (25/465/587/143/993/4190) | **Stalwart (host)** | Bound on the public IP; opened to the world by the firewall |
| Web/JMAP for `mail.dezky.eu:443` | **Traefik (k3s)** | Terminates TLS, reverse-proxies to Stalwart's internal `:8080` |
| ACME / TLS issuance | **cert-manager (k3s)** | Issues `mail.dezky.eu` via HTTP-01; Stalwart runs no ACME (80/443 are Traefik's) |
| Cert delivery to mail ports | **`cert-sync.sh` (host)** | Reads the cluster TLS secret via local kubeconfig, reloads Stalwart on change |
| Storage | **RocksDB on host disk** | Intentionally independent of the in-cluster Postgres |
| Domain/DKIM provisioning | **platform-api (k3s)** | JMAP management API at `http://<node>:8080/jmap`, Basic auth |
| Audit webhook | **Stalwart → platform-api** | POSTs to `https://api.dezky.eu/ingest/...`, HMAC-signed |

**platform-api Fleet env** (must match the host's `config.env`):
```
STALWART_API_URL=http://<node-internal-ip>:8080
STALWART_ADMIN_USER=admin
STALWART_ADMIN_PASSWORD=<same as host STALWART_ADMIN_PASSWORD>
STALWART_WEBHOOK_SECRET=<same as host STALWART_WEBHOOK_SECRET>
STALWART_PROVISIONING_ENABLED=true
```
The firewall already lets the k3s pod CIDR reach host `:8080` while blocking the
world, so no extra rule is needed.
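
To confirm that platform-api's credentials and the `:8080` path line up, a Basic auth header can be built by hand. This is a sketch: `basic_auth_header` is a hypothetical helper, and the probe in the comment uses the standard JMAP discovery path (`/.well-known/jmap`, RFC 8620) on the assumption that Stalwart serves it on the management listener.

```bash
#!/usr/bin/env bash
# Build an HTTP Basic auth header: base64("user:password"), no trailing newline.
basic_auth_header() {
  local user="$1" pass="$2"
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')"
}

# Hypothetical reachability probe; prints only the HTTP status code:
#   curl -sS -o /dev/null -w '%{http_code}\n' \
#     -H "$(basic_auth_header "$STALWART_ADMIN_USER" "$STALWART_ADMIN_PASSWORD")" \
#     "$STALWART_API_URL/.well-known/jmap"
```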
> **Forward dependency:** `cert-sync.sh` needs the fleet layer to create the
> `mail/mail-tls` cert secret. Until then Stalwart serves the self-signed
> bootstrap cert `install.sh` generated; the timer swaps in the real cert
> automatically once it exists.
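
The "reload only on change" behaviour can be sketched as a compare-then-swap step. This is not the contents of `cert-sync.sh`; `sync_if_changed` and its return codes are assumptions made for illustration, and the fetch and reload steps appear only as comments.

```bash
#!/usr/bin/env bash
# Returns 0 if dst was updated, 1 if already identical, 2 if src is unusable.
sync_if_changed() {
  local src="$1" dst="$2"
  [ -s "$src" ] || return 2            # never install an empty or missing cert
  if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
    return 1                           # byte-identical: nothing to do
  fi
  # Write next to the target, then rename, so readers never see a partial file.
  install -m 0640 "$src" "$dst.new" && mv -f "$dst.new" "$dst"
}

# Sketch of the surrounding flow (secret name taken from mail/mail-tls):
#   kubectl -n mail get secret mail-tls -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d > "$tmp/tls.crt"
#   if sync_if_changed "$tmp/tls.crt" "$cert_dir/tls.crt"; then
#     : # reload Stalwart here, however cert-sync.sh does it
#   fi
```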
### Finally, backups
```bash
sudo ./restic/install.sh # restic + key + nightly timer
# upload the printed public key to BOTH Storage Boxes (port 23), then:
sudo ./restic/install.sh # re-run to init the repos
sudo /opt/dezky-backup/backup.sh # first backup (or wait for 03:20 UTC)
```
Needs `RESTIC_PASSWORD` + `BACKUP_PRIMARY_REPO` (+ `BACKUP_DR_REPO`) in
`config.env`. See backups below.
## Backups (Restic)
Nightly at **03:20 UTC**: back up to the **primary Storage Box**, apply
retention, `restic check`, then a dedup-aware **`copy` to the Helsinki DR box**.
| What | Why |
|------|-----|
| `/opt/stalwart/data` + `/etc` | Mail store (RocksDB) + config — the crown jewels |
| `/var/lib/rancher/k3s/server/db/snapshots` | k3s **etcd snapshots** (cluster state) |
| `/var/lib/rancher/k3s/storage` | local-path PVCs — incl. where fleet `pg_dump`/`mongodump` CronJobs land |
- **Retention:** 7 daily · 4 weekly · 6 monthly (tunable via `BACKUP_RETENTION`).
- **Storage Box quirk:** SSH/SFTP on **port 23**, key auth. A single ssh-config
wildcard covers both boxes, so one key + `restic copy` mirrors primary → DR.
- **Encryption:** repos are Restic-encrypted with `RESTIC_PASSWORD`. **Store it
offline** — losing it makes every backup unrecoverable.
- **Alerting:** set `BACKUP_HEALTHCHECK_URL` (e.g. healthchecks.io) for a
dead-man's switch — get paged when a nightly run is missed, not when you need
to restore.
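
As a rough picture of the nightly sequence, the sketch below prints the commands rather than running them. Several things here are assumptions: the helper names, the `"<daily> <weekly> <monthly>"` format for `BACKUP_RETENTION`, and the use of `copy --from-repo` (the spelling in restic 0.14 and later). Source-repo credentials for `copy` are omitted.

```bash
#!/usr/bin/env bash
# Assumed format: BACKUP_RETENTION="<daily> <weekly> <monthly>"; default "7 4 6".
retention_flags() {
  local daily weekly monthly
  read -r daily weekly monthly <<<"${1:-7 4 6}"
  printf -- '--keep-daily %s --keep-weekly %s --keep-monthly %s\n' \
    "$daily" "$weekly" "$monthly"
}

# Print the nightly plan: backup, retention, check, then copy to the DR repo.
backup_plan() {
  local primary="$1" dr="$2"
  echo "restic -r $primary backup /opt/stalwart/data /etc /var/lib/rancher/k3s/server/db/snapshots /var/lib/rancher/k3s/storage"
  echo "restic -r $primary forget --prune $(retention_flags "${BACKUP_RETENTION:-}")"
  echo "restic -r $primary check"
  echo "restic -r $dr copy --from-repo $primary"
}

# Usage:
#   backup_plan "$BACKUP_PRIMARY_REPO" "$BACKUP_DR_REPO"
```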
> **Database consistency:** live DB files in PVCs are crash-consistent at best.
> The reliable path is logical dumps — the **fleet layer** adds `pg_dump` /
> `mongodump` CronJobs that write into a backup PVC under
> `/var/lib/rancher/k3s/storage`, which Restic then captures. Restore those
> dumps, not the raw data dirs.
**Run restore drills.** A backup you've never restored isn't a backup:
```bash
sudo /opt/dezky-backup/restore.sh snapshots
sudo /opt/dezky-backup/restore.sh restore latest /tmp/restore-test
```
## ⚠️ Lockout safety
- **Always** open a second SSH session and confirm access **before** closing the
one you ran bootstrap in.
- Management is pinned to home + office IPs. **Residential IPs can change** — if
yours does, you'll be locked out of SSH/6443 (public services stay up).
- **Break-glass:** Hetzner's **KVM/LARA** console (Robot panel) is out-of-band
and bypasses the firewall entirely. From there you can edit
`/etc/nftables.d/dezky-fw.nft` or update `config.env` + re-run `firewall.sh`.
- If your IP changes often, widen `MGMT_ALLOW_V4` to a small prefix, or add a
WireGuard bastion later.
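
One cheap pre-flight check before re-applying the firewall is to confirm that the address you are currently connected from is on the allowlist. The sketch below is hypothetical (`client_ip_allowed` is not part of the repo), assumes a comma- or space-separated `MGMT_ALLOW_V4`, and does exact matching only, so CIDR prefixes in the list are not evaluated.

```bash
#!/usr/bin/env bash
# Exact-match check only: CIDR prefixes in the allowlist are NOT evaluated.
client_ip_allowed() {
  local client_ip="$1" allowlist="$2" entry
  for entry in ${allowlist//,/ }; do     # split on commas and whitespace
    [ "$entry" = "$client_ip" ] && return 0
  done
  return 1
}

# On the server, $SSH_CLIENT is "<client-ip> <client-port> <server-port>":
#   client_ip_allowed "${SSH_CLIENT%% *}" "$MGMT_ALLOW_V4" \
#     || echo "WARNING: current SSH source IP is not in MGMT_ALLOW_V4" >&2
```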
## Verifying after apply
```bash
sudo nft list table inet dezky_fw # our rules
sudo nft list ruleset | grep -c KUBE # k3s rules still present (>0 once k3s runs)
sudo systemctl status dezky-firewall # enabled + active (exited)
sudo fail2ban-client status sshd # jail active
# From a NON-allowlisted network, `ssh` should hang/timeout; 443 should work.
```
## Host layer status
**Complete:** hardening ✅ · firewall ✅ · k3s registration ✅ · Stalwart ✅
(installed; DB-side configuration still pending) · backups ✅.
Next is the **Fleet/GitOps layer** (`infrastructure/production/fleet/`):
cert-manager + `ClusterIssuer`, ingress, the data tier (Postgres/Mongo/Redis),
Authentik, OCIS + Collabora, and portal + platform-api — plus the
`mail/mail-tls` cert and the DB-dump CronJobs this layer's `cert-sync` and
backups depend on.
## Stalwart v0.16 — config model change (IMPORTANT)
v0.16 **removed TOML configuration**. The host service now boots from
`stalwart/config.json` — a tiny file describing ONLY the datastore (RocksDB at
`/opt/stalwart/data`). Every other setting (listeners, authentication, TLS,
domains, DKIM, spam, webhooks) is stored in the DB and managed via the web admin
UI, `stalwart-cli`, or platform-api over JMAP. `stalwart/config.toml` is kept as
a reference for the settings to recreate in the DB; it is NOT loaded by v0.16.
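
Since the service now depends on a single JSON file, a pre-start sanity check may be worth having. The sketch below assumes nothing about Stalwart's schema: it only checks that the path ends in `.json`, is readable, and parses as JSON. `check_boot_config` is a hypothetical helper, and it assumes `python3` is available on the host.

```bash
#!/usr/bin/env bash
# Schema-agnostic check: right extension, readable, and well-formed JSON.
check_boot_config() {
  local path="$1"
  case "$path" in
    *.json) ;;
    *) echo "not a .json path: $path" >&2; return 1 ;;
  esac
  [ -r "$path" ] || { echo "unreadable: $path" >&2; return 1; }
  python3 -m json.tool "$path" >/dev/null 2>&1 \
    || { echo "invalid JSON: $path" >&2; return 1; }
  echo "ok: $path"
}

# Usage (path is illustrative; use whatever --config points at in the unit):
#   check_boot_config /opt/stalwart/etc/config.json
```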
**Status (node1):** Stalwart 0.16.8 installed + running with default listeners
(25/465/587/143/993/4190 + management on `:8080`). Still to configure (DB-side):
- Fallback admin password (so platform-api can authenticate) + the audit webhook.
- TLS for `mail.dezky.eu` — Stalwart's own ACME, or rework `cert-sync.sh` to feed
the cert-manager cert into the v0.16 DB cert model.
- Domains / DKIM — provisioned by platform-api over JMAP.
Then publish DNS (MX, SPF, DKIM, DMARC) and set the **PTR/rDNS** to `mail.dezky.eu`.
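
Once the PTR is set, it can be compared against the expected hostname. This is a minimal sketch with hypothetical helper names; the comparison ignores case and the trailing dot that `dig` prints, and the lookup itself is shown only as a usage comment because it needs network access.

```bash
#!/usr/bin/env bash
# Lowercase a hostname and strip one trailing dot (requires bash 4+ for ${n,,}).
normalize_fqdn() {
  local n="${1%.}"
  printf '%s\n' "${n,,}"
}

# True if the PTR answer and the expected name are the same host.
ptr_matches() {
  [ "$(normalize_fqdn "$1")" = "$(normalize_fqdn "$2")" ]
}

# Usage:
#   ptr_matches "$(dig +short -x "$SERVER_PUBLIC_IPV4")" mail.dezky.eu \
#     && echo "rDNS ok" || echo "rDNS mismatch" >&2
```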