feat(infra): production host bootstrap and bare-metal Stalwart scaffolding
Host provisioning for the single-server production target: SSH + firewall hardening (nftables allowlist), k3s node registration, bare-metal Stalwart install with systemd units and TLS cert-sync from the cluster secret, and Restic encrypted backup/restore (primary + DR) with timer units. Host-specific secrets live in config.env (gitignored); config.env.example is the template. Also gitignores MemPalace per-project files.
@@ -3,6 +3,9 @@
|
||||
.env.local
|
||||
.env.*.local
|
||||
|
||||
# Production host config (real IPs / SSH key — keep out of git)
|
||||
infrastructure/production/host/config.env
|
||||
|
||||
# TLS certificates (mkcert generated)
|
||||
infrastructure/docker-compose/certs/*.pem
|
||||
|
||||
@@ -41,3 +44,7 @@ coverage/
|
||||
# Temporary
|
||||
tmp/
|
||||
.tmp/
|
||||
|
||||
# MemPalace per-project files (issue #185)
|
||||
mempalace.yaml
|
||||
entities.json
|
||||
|
||||
@@ -0,0 +1,227 @@
|
||||
# Dezky production — host layer
|
||||
|
||||
OS baseline + firewall for the bare-metal **Hetzner AX41** that runs the k3s
|
||||
node. This layer is everything that lives on the *host* (outside Kubernetes):
|
||||
hardening, the k3s-safe firewall, k3s registration, Stalwart
|
||||
mail, and Restic backups.
|
||||
|
||||
The cluster is managed by **Fleet/Rancher** once k3s is up; this host layer is the part Fleet
|
||||
*can't* do, so it runs over SSH from reviewed scripts.
|
||||
|
||||
## Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `config.env.example` | Template for host-specific values |
|
||||
| `config.env` | **Real values — gitignored.** Source of truth lives only on your machine/box |
|
||||
| `bootstrap.sh` | One-shot OS hardening: user, SSH, sysctl, swap, fail2ban, auto-updates, firewall |
|
||||
| `firewall/firewall.sh` | Renders + applies the k3s-safe nftables ruleset (idempotent) |
|
||||
| `firewall/dezky-firewall.service` | systemd unit; reapplies our table on boot, never flushes globally |
|
||||
| `k3s/register.sh` | Registers the node into Rancher (Custom k3s cluster); secrets from `config.env` |
|
||||
| `stalwart/install.sh` | Installs Stalwart as a hardened host service (binary, units, secrets, bootstrap cert) |
|
||||
| `stalwart/config.toml` | Production Stalwart config (mail ports on host, JMAP on internal 8080) |
|
||||
| `stalwart/stalwart-mail.service` | systemd unit; non-root + `CAP_NET_BIND_SERVICE` for low ports |
|
||||
| `stalwart/cert-sync.sh` + `*.service`/`*.timer` | Pulls the cert-manager mail cert into Stalwart, reloads on change |
|
||||
| `restic/install.sh` | Sets up Restic, the backup SSH key/config, env, and the nightly timer |
|
||||
| `restic/backup.sh` | Backup → primary Storage Box, retention, then `copy` → Helsinki DR |
|
||||
| `restic/restore.sh` | List/restore snapshots (run drills!) |
|
||||
| `restic/dezky-backup.service` + `.timer` | Nightly 03:20 UTC backup |
|
||||
|
||||
## The firewall model (read this)
|
||||
|
||||
k3s, kube-proxy and flannel manage their **own** nftables tables (`ip`/`ip6`:
|
||||
`filter`, `nat`, `mangle`). The classic mistake is running `ufw`/`firewalld` or
|
||||
`nft flush ruleset`, which wipes or fights those rules and breaks pod networking.
|
||||
|
||||
So instead:
|
||||
|
||||
- We own a single dedicated table — **`inet dezky_fw`** — with only an INPUT
|
||||
chain (default `drop`). Separate tables coexist; a packet is dropped if *any*
|
||||
base chain drops it, so our default-drop INPUT gates host-bound traffic while
|
||||
k3s keeps owning FORWARD/NAT untouched.
|
||||
- We explicitly **accept the pod (`10.42.0.0/16`) and service (`10.43.0.0/16`)
|
||||
CIDRs and the CNI interfaces** (`cni0`, `flannel.1`) so cluster↔host traffic
|
||||
(API server, kubelet, CoreDNS) is never dropped.
|
||||
- We **never** `flush ruleset`. The systemd unit's `ExecStop` removes only our
|
||||
table.
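To make the CIDR accept rules concrete, here is a minimal sketch in plain bash of the membership test that a rule such as `ip saddr 10.42.0.0/16 accept` expresses. The helper names `ip_to_int` and `ip_in_cidr` are illustrative and not part of `firewall.sh`; the kernel does this matching itself, so the snippet is only meant for reasoning about which sources a CIDR covers.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if IPv4 address $1 lies inside CIDR $2 (e.g. 10.42.0.0/16).
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}

# A pod address matches the pod CIDR; a public address does not.
ip_in_cidr 10.42.3.7 10.42.0.0/16 && echo "pod source: matched by the CIDR rule"
ip_in_cidr 198.51.100.9 10.42.0.0/16 || echo "public source: falls through to later rules"
```

A source that matches neither the CIDRs nor the CNI interfaces continues down the chain to the public-port and management rules, and is dropped by the chain policy if nothing accepts it.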
|
||||
|
||||
### Access policy
|
||||
|
||||
| Surface | Ports | Who |
|
||||
|---------|-------|-----|
|
||||
| Web + ACME | 80, 443 | **World** (customers) |
|
||||
| Mail | 25, 465, 587, 143, 993, 4190 | **World** |
|
||||
| SSH | 22 | **`MGMT_ALLOW_V4/V6` only** |
|
||||
| k3s API | 6443 | **`MGMT_ALLOW_V4/V6` only** |
|
||||
|
||||
Current management allowlist: **home `46.32.144.38`**, **office `46.32.144.45`**.
|
||||
|
||||
The Rancher plane (`91.99.122.153`) needs **no inbound rule** — the cluster
|
||||
agent dials *out* to Rancher over 443, so replies ride the established/related
|
||||
fast-path.
|
||||
|
||||
## Apply order
|
||||
|
||||
> Prereqs: AX41 provisioned with **Debian 12 (bookworm)**, reachable as `root`.
|
||||
> `config.env` filled in — in particular `ADMIN_SSH_PUBKEY` and
|
||||
> `SERVER_PUBLIC_IPV4` (still TODO until the box exists).
|
||||
|
||||
```bash
|
||||
# From your laptop:
|
||||
scp -r infrastructure/production/host root@<server-ip>:/opt/dezky-host
|
||||
|
||||
# On the server:
|
||||
ssh root@<server-ip>
|
||||
cd /opt/dezky-host
|
||||
# config.env is gitignored, so copy it up separately or recreate it here:
|
||||
# cp config.env.example config.env && nano config.env
|
||||
./bootstrap.sh
|
||||
```
|
||||
|
||||
`bootstrap.sh` creates your admin user and installs your key **before** it
|
||||
disables root/password SSH, so the order is lockout-safe. It's idempotent —
|
||||
re-run anytime.
|
||||
|
||||
To touch only the firewall later:
|
||||
|
||||
```bash
|
||||
sudo ./firewall/firewall.sh --dry-run # preview the ruleset
|
||||
sudo ./firewall/firewall.sh # render, validate, apply, install unit
|
||||
```
|
||||
|
||||
### Then register into Rancher
|
||||
|
||||
Once the host is hardened, register the node as a **Custom k3s cluster**
|
||||
(create the cluster in Rancher first, choosing the **K3s** distribution, then
|
||||
paste its token/checksum into `config.env`):
|
||||
|
||||
```bash
|
||||
sudo ./k3s/register.sh # downloads agent installer, joins cluster
|
||||
journalctl -u rancher-system-agent -f # follow provisioning
|
||||
```
|
||||
|
||||
Rancher is currently reached by IP, so the installer is fetched with
|
||||
`--insecure`; the agent's ongoing link is still verified via `--ca-checksum`.
|
||||
Give Rancher a real hostname + cert later to drop the insecure fetch.
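As a rough illustration of what the checksum pins, the sketch below compares the SHA-256 of a downloaded CA file against `RANCHER_CA_CHECKSUM`. It assumes the checksum is the SHA-256 of the CA bundle Rancher serves (commonly fetched from `/cacerts`); check this against your Rancher version. `ca_checksum_matches` is a hypothetical helper, not something `register.sh` is known to contain.

```bash
# Return 0 if the SHA-256 of file $1 equals the expected hex digest $2.
# The comparison is case-insensitive on the expected value.
ca_checksum_matches() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [[ "$actual" == "${expected,,}" ]]
}

# Hypothetical usage (the /cacerts path is an assumption about your setup):
#   curl --insecure -fsSL "$RANCHER_SERVER_URL/cacerts" -o /tmp/rancher-ca.pem
#   ca_checksum_matches /tmp/rancher-ca.pem "$RANCHER_CA_CHECKSUM" \
#     || echo "CA checksum mismatch, do not register" >&2
```

Running a check like this by hand before registration is one way to gain some confidence in the `--insecure` fetch while Rancher is still addressed by IP.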
|
||||
|
||||
### Then install Stalwart (mail)
|
||||
|
||||
```bash
|
||||
sudo ./stalwart/install.sh # binary + systemd + bootstrap cert
|
||||
systemctl status stalwart-mail
|
||||
```
|
||||
|
||||
Requires `STALWART_ADMIN_PASSWORD` + `STALWART_WEBHOOK_SECRET` in `config.env`
|
||||
(`openssl rand -hex 24` / `-hex 32`). See the mail topology below.
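For convenience, the two secrets can be produced by a small helper. This is only a sketch: `gen_secret` is a hypothetical name, and the `/dev/urandom` fallback is an addition for machines without `openssl`, not something the install script is known to do.

```bash
# Print N random bytes as 2*N lowercase hex characters.
gen_secret() {
  local bytes="$1"
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex "$bytes"
  else
    # Fallback when openssl is absent: read the kernel CSPRNG directly.
    head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
  fi
}

# Example: print lines ready to paste into config.env.
#   printf 'STALWART_ADMIN_PASSWORD="%s"\n' "$(gen_secret 24)"
#   printf 'STALWART_WEBHOOK_SECRET="%s"\n' "$(gen_secret 32)"
```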
|
||||
|
||||
## Mail (Stalwart) topology
|
||||
|
||||
Stalwart runs on the **host**, not in k3s — mail must keep flowing regardless of
|
||||
cluster state, and SMTP/IMAP want the real public IP for reputation. The single
|
||||
public IP forces a deliberate split with Traefik:
|
||||
|
||||
| Concern | Owner | Detail |
|
||||
|---------|-------|--------|
|
||||
| Mail protocol ports (25/465/587/143/993/4190) | **Stalwart (host)** | Bound on the public IP; opened to the world by the firewall |
|
||||
| Web/JMAP for `mail.dezky.eu:443` | **Traefik (k3s)** | Terminates TLS, reverse-proxies to Stalwart's internal `:8080` |
|
||||
| ACME / TLS issuance | **cert-manager (k3s)** | Issues `mail.dezky.eu` via HTTP-01; Stalwart runs no ACME (80/443 are Traefik's) |
|
||||
| Cert delivery to mail ports | **`cert-sync.sh` (host)** | Reads the cluster TLS secret via local kubeconfig, reloads Stalwart on change |
|
||||
| Storage | **RocksDB on host disk** | Intentionally independent of the in-cluster Postgres |
|
||||
| Domain/DKIM provisioning | **platform-api (k3s)** | JMAP management API at `http://<node>:8080/jmap`, Basic auth |
|
||||
| Audit webhook | **Stalwart → platform-api** | POSTs to `https://api.dezky.eu/ingest/...`, HMAC-signed |
|
||||
|
||||
**platform-api Fleet env** (must match the host's `config.env`):
|
||||
|
||||
```
|
||||
STALWART_API_URL=http://<node-internal-ip>:8080
|
||||
STALWART_ADMIN_USER=admin
|
||||
STALWART_ADMIN_PASSWORD=<same as host STALWART_ADMIN_PASSWORD>
|
||||
STALWART_WEBHOOK_SECRET=<same as host STALWART_WEBHOOK_SECRET>
|
||||
STALWART_PROVISIONING_ENABLED=true
|
||||
```
|
||||
|
||||
The firewall already lets the k3s pod CIDR reach host `:8080` while blocking the
|
||||
world, so no extra rule is needed.
|
||||
|
||||
> **Forward dependency:** `cert-sync.sh` needs the fleet layer to create the
|
||||
> `mail/mail-tls` cert secret. Until then Stalwart serves the self-signed
|
||||
> bootstrap cert `install.sh` generated; the timer swaps in the real cert
|
||||
> automatically once it exists.
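The "reload on change" behaviour can be sketched as a byte-for-byte comparison before install. `cert-sync.sh` itself is not shown here, so the function names and the `0640` file mode below are assumptions; the real script may compare fingerprints or use a different layout.

```bash
# Return 0 when the candidate cert differs from the installed one,
# or when nothing is installed yet.
cert_changed() {
  local candidate="$1" installed="$2"
  [[ -f "$installed" ]] || return 0
  ! cmp -s "$candidate" "$installed"
}

# Install the candidate if needed and report whether a reload is required.
sync_cert() {
  local candidate="$1" installed="$2"
  if cert_changed "$candidate" "$installed"; then
    install -m 0640 "$candidate" "$installed"
    echo "changed"
  else
    echo "unchanged"
  fi
}

# Hypothetical usage inside a timer-driven script:
#   [[ "$(sync_cert /run/mail-tls/tls.crt /opt/stalwart/etc/tls.crt)" == changed ]] \
#     && systemctl reload stalwart-mail
```

Reloading only on an actual change keeps the timer cheap to run often and avoids needless connection churn on the mail ports.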
|
||||
|
||||
### Finally, backups
|
||||
|
||||
```bash
|
||||
sudo ./restic/install.sh # restic + key + nightly timer
|
||||
# upload the printed public key to BOTH Storage Boxes (port 23), then:
|
||||
sudo ./restic/install.sh # re-run to init the repos
|
||||
sudo /opt/dezky-backup/backup.sh # first backup (or wait for 03:20 UTC)
|
||||
```
|
||||
|
||||
Needs `RESTIC_PASSWORD` + `BACKUP_PRIMARY_REPO` (+ `BACKUP_DR_REPO`) in
|
||||
`config.env`. See backups below.
|
||||
|
||||
## Backups (Restic)
|
||||
|
||||
Nightly at **03:20 UTC**: back up to the **primary Storage Box**, apply
|
||||
retention, `restic check`, then a dedup-aware **`copy` to the Helsinki DR box**.
|
||||
|
||||
| What | Why |
|
||||
|------|-----|
|
||||
| `/opt/stalwart/data` + `/opt/stalwart/etc` | Mail store (RocksDB) + config, the crown jewels |
|
||||
| `/var/lib/rancher/k3s/server/db/snapshots` | k3s **etcd snapshots** (cluster state) |
|
||||
| `/var/lib/rancher/k3s/storage` | local-path PVCs — incl. where fleet `pg_dump`/`mongodump` CronJobs land |
|
||||
|
||||
- **Retention:** 7 daily · 4 weekly · 6 monthly (tunable via `BACKUP_RETENTION`).
|
||||
- **Storage Box quirk:** SSH/SFTP on **port 23**, key auth. A single ssh-config
|
||||
wildcard covers both boxes, so one key + `restic copy` mirrors primary → DR.
|
||||
- **Encryption:** repos are Restic-encrypted with `RESTIC_PASSWORD`. **Store it
|
||||
offline** — losing it makes every backup unrecoverable.
|
||||
- **Alerting:** set `BACKUP_HEALTHCHECK_URL` (e.g. healthchecks.io) for a
|
||||
dead-man's switch — get paged when a nightly run is missed, not when you need
|
||||
to restore.
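A dead-man's switch of this kind is usually wired in as a thin wrapper around the backup command. The sketch below assumes a healthchecks.io-style endpoint, where appending `/start` and `/fail` to the base URL signals the start of a run and a failure. `hc_ping` and `run_with_healthcheck` are illustrative names, and `backup.sh` may implement this differently.

```bash
# Ping the healthcheck URL with an optional suffix. A missing URL disables
# pings, and a failed ping never fails the backup itself.
hc_ping() {
  local suffix="${1:-}"
  [[ -n "${BACKUP_HEALTHCHECK_URL:-}" ]] || return 0
  curl -fsS -m 10 --retry 3 -o /dev/null "${BACKUP_HEALTHCHECK_URL}${suffix}" || true
}

# Run a command between a start ping and a success or failure ping,
# preserving the command's exit status.
run_with_healthcheck() {
  local rc=0
  hc_ping "/start"
  "$@" || rc=$?
  if (( rc == 0 )); then hc_ping ""; else hc_ping "/fail"; fi
  return "$rc"
}

# Hypothetical usage:
#   run_with_healthcheck restic backup /opt/stalwart/data
```

If the host is down or the timer never fires, no success ping arrives and the monitoring service raises the alert on its own schedule, which is the case a failure-only notification would miss.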
|
||||
|
||||
> **Database consistency:** live DB files in PVCs are crash-consistent at best.
|
||||
> The reliable path is logical dumps — the **fleet layer** adds `pg_dump` /
|
||||
> `mongodump` CronJobs that write into a backup PVC under
|
||||
> `/var/lib/rancher/k3s/storage`, which Restic then captures. Restore those
|
||||
> dumps, not the raw data dirs.
|
||||
|
||||
**Run restore drills.** A backup you've never restored isn't a backup:
|
||||
|
||||
```bash
|
||||
sudo /opt/dezky-backup/restore.sh snapshots
|
||||
sudo /opt/dezky-backup/restore.sh restore latest /tmp/restore-test
|
||||
```
|
||||
|
||||
## ⚠️ Lockout safety
|
||||
|
||||
- **Always** open a second SSH session and confirm access **before** closing the
|
||||
one you ran bootstrap in.
|
||||
- Management is pinned to home + office IPs. **Residential IPs can change** — if
|
||||
yours does, you'll be locked out of SSH/6443 (public services stay up).
|
||||
- **Break-glass:** Hetzner's **KVM/LARA** console (Robot panel) is out-of-band
|
||||
and bypasses the firewall entirely. From there you can edit
|
||||
`/etc/nftables.d/dezky-fw.nft` or update `config.env` + re-run `firewall.sh`.
|
||||
- If your IP changes often, widen `MGMT_ALLOW_V4` to a small prefix, or add a
|
||||
WireGuard bastion later.
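Because `MGMT_ALLOW_V4` is interpolated straight into an nftables set, a typo only surfaces when `nft -c` rejects the rendered file. A small pre-check can give a clearer error before the firewall is touched. The sketch below assumes entries are plain IPv4 addresses or CIDRs separated by commas; `valid_v4_entry` and `valid_allowlist` are hypothetical helpers, not part of `firewall.sh`.

```bash
# Return 0 if $1 is a dotted-quad IPv4 address with an optional /0-32 prefix.
valid_v4_entry() {
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})(/([0-9]{1,2}))?$'
  [[ "$1" =~ $re ]] || return 1
  local i
  for i in 1 2 3 4; do
    (( 10#${BASH_REMATCH[i]} <= 255 )) || return 1
  done
  # The prefix length is optional; when present it must be at most 32.
  [[ -z "${BASH_REMATCH[6]}" ]] || (( 10#${BASH_REMATCH[6]} <= 32 ))
}

# Check every comma-separated entry of an allowlist string.
valid_allowlist() {
  local entry entries
  IFS=, read -ra entries <<<"$1"
  (( ${#entries[@]} > 0 )) || return 1
  for entry in "${entries[@]}"; do
    entry="${entry//[[:space:]]/}"   # tolerate "a, b" spacing
    valid_v4_entry "$entry" || return 1
  done
}

# Hypothetical usage before re-running firewall.sh:
#   valid_allowlist "$MGMT_ALLOW_V4" || { echo "bad MGMT_ALLOW_V4" >&2; exit 1; }
```

The check is purely syntactic. It cannot tell whether the prefix actually contains your current address, so keeping a second SSH session open remains the real safeguard.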
|
||||
|
||||
## Verifying after apply
|
||||
|
||||
```bash
|
||||
sudo nft list table inet dezky_fw # our rules
|
||||
sudo nft list ruleset | grep -c KUBE # k3s rules still present (>0 once k3s runs)
|
||||
sudo systemctl status dezky-firewall # enabled + active (exited)
|
||||
sudo fail2ban-client status sshd # jail active
|
||||
# From a NON-allowlisted network, `ssh` should hang/timeout; 443 should work.
|
||||
```
|
||||
|
||||
## Host layer status
|
||||
|
||||
**Complete:** hardening ✅ · firewall ✅ · k3s registration ✅ · Stalwart ✅ ·
|
||||
backups ✅.
|
||||
|
||||
Next is the **Fleet/GitOps layer** (`infrastructure/production/fleet/`):
|
||||
cert-manager + `ClusterIssuer`, ingress, the data tier (Postgres/Mongo/Redis),
|
||||
Authentik, OCIS + Collabora, and portal + platform-api — plus the
|
||||
`mail/mail-tls` cert and the DB-dump CronJobs this layer's `cert-sync` and
|
||||
backups depend on.
|
||||
Executable
+192
@@ -0,0 +1,192 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Dezky production host bootstrap — OS hardening for the AX41 k3s node.
|
||||
#
|
||||
# Run ONCE on a fresh Debian 12 (bookworm) install, as root, e.g.:
|
||||
# scp -r infrastructure/production/host root@<server>:/opt/dezky-host
|
||||
# ssh root@<server> 'cd /opt/dezky-host && cp config.env.example config.env && nano config.env'
|
||||
# ssh root@<server> 'cd /opt/dezky-host && ./bootstrap.sh'
|
||||
#
|
||||
# Order matters: we create your admin user + install your SSH key BEFORE
|
||||
# disabling root/password login, so you can't lock yourself out. The script
|
||||
# is idempotent — safe to re-run.
|
||||
#
|
||||
# What it does NOT do: install k3s, Stalwart, or backups. Those are separate
|
||||
# steps in this host/ layer (see k3s/, stalwart/, restic/). This is OS baseline + firewall only.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
|
||||
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
|
||||
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
|
||||
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
|
||||
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
CONFIG_FILE="$SCRIPT_DIR/config.env"
|
||||
|
||||
echo ""
|
||||
echo "╔══════════════════════════════════════════════════════════════╗"
|
||||
echo "║ Dezky Production Host Bootstrap (Debian 12) ║"
|
||||
echo "╚══════════════════════════════════════════════════════════════╝"
|
||||
echo ""
|
||||
|
||||
# ── Preflight ──────────────────────────────────────────────────────────────
|
||||
if [[ $EUID -ne 0 ]]; then
|
||||
error "Run as root (you'll create the unprivileged admin user from here)."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if [[ ! -f "$CONFIG_FILE" ]]; then
|
||||
error "Missing $CONFIG_FILE — copy config.env.example and fill it in."
|
||||
exit 1
|
||||
fi
|
||||
# shellcheck disable=SC1090
|
||||
source "$CONFIG_FILE"
|
||||
|
||||
: "${ADMIN_USER:?ADMIN_USER required}"
|
||||
: "${ADMIN_SSH_PUBKEY:?ADMIN_SSH_PUBKEY required — without it you would lock yourself out}"
|
||||
: "${MGMT_ALLOW_V4:?MGMT_ALLOW_V4 required}"
|
||||
: "${SERVER_HOSTNAME:?SERVER_HOSTNAME required}"
|
||||
: "${SSH_PORT:=22}"
|
||||
|
||||
if [[ ! "$ADMIN_SSH_PUBKEY" =~ ^(ssh-|ecdsa-sha2-|sk-) ]]; then
  error "ADMIN_SSH_PUBKEY doesn't look like a public key (expected an 'ssh-', 'ecdsa-sha2-' or 'sk-' key type)."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# ── Step 1: base packages + system upgrade ─────────────────────────────────
|
||||
info "Step 1: Updating system and installing base packages..."
|
||||
export DEBIAN_FRONTEND=noninteractive
|
||||
apt-get update -qq
|
||||
apt-get upgrade -y -qq
|
||||
apt-get install -y -qq \
|
||||
nftables fail2ban unattended-upgrades apt-listchanges \
|
||||
curl ca-certificates gnupg htop tmux vim chrony \
|
||||
>/dev/null
|
||||
ok "Base packages installed."
|
||||
|
||||
# ── Step 2: hostname + timezone + time sync ────────────────────────────────
|
||||
info "Step 2: Hostname, timezone (UTC), time sync..."
|
||||
hostnamectl set-hostname "$SERVER_HOSTNAME"
|
||||
timedatectl set-timezone UTC
|
||||
systemctl enable --now chrony >/dev/null 2>&1 || true
|
||||
# Ensure the FQDN resolves locally
|
||||
if ! grep -qF "$SERVER_HOSTNAME" /etc/hosts; then
|
||||
echo "127.0.1.1 ${SERVER_HOSTNAME} ${SERVER_HOSTNAME%%.*}" >> /etc/hosts
|
||||
fi
|
||||
ok "Hostname set to $SERVER_HOSTNAME (UTC)."
|
||||
|
||||
# ── Step 3: admin user + SSH key (BEFORE locking SSH) ──────────────────────
|
||||
info "Step 3: Admin user '$ADMIN_USER' + SSH key..."
|
||||
if ! id -u "$ADMIN_USER" >/dev/null 2>&1; then
|
||||
adduser --disabled-password --gecos "" "$ADMIN_USER"
|
||||
fi
|
||||
usermod -aG sudo "$ADMIN_USER"
|
||||
install -d -m 0700 -o "$ADMIN_USER" -g "$ADMIN_USER" "/home/$ADMIN_USER/.ssh"
|
||||
AUTH_KEYS="/home/$ADMIN_USER/.ssh/authorized_keys"
|
||||
touch "$AUTH_KEYS"
|
||||
grep -qxF "$ADMIN_SSH_PUBKEY" "$AUTH_KEYS" || echo "$ADMIN_SSH_PUBKEY" >> "$AUTH_KEYS"
|
||||
chmod 0600 "$AUTH_KEYS"
|
||||
chown "$ADMIN_USER:$ADMIN_USER" "$AUTH_KEYS"
|
||||
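# sudo asks for a password (plain sudo-group membership), but the account is
# created with --disabled-password, so sudo stays unusable until you set one.
# Do it while you still have this root session: `passwd $ADMIN_USER`.
# SSH remains key-only either way (PasswordAuthentication is disabled below).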
|
||||
ok "Admin user ready with your SSH key."
|
||||
|
||||
# ── Step 4: SSH hardening (drop-in) ────────────────────────────────────────
|
||||
info "Step 4: Hardening SSH..."
|
||||
SSHD_DROPIN="/etc/ssh/sshd_config.d/99-dezky.conf"
|
||||
cat > "$SSHD_DROPIN" <<EOF
|
||||
# Managed by Dezky bootstrap.sh
|
||||
Port ${SSH_PORT}
|
||||
PermitRootLogin no
|
||||
PasswordAuthentication no
|
||||
KbdInteractiveAuthentication no
|
||||
ChallengeResponseAuthentication no
|
||||
PubkeyAuthentication yes
|
||||
PermitEmptyPasswords no
|
||||
X11Forwarding no
|
||||
MaxAuthTries 3
|
||||
LoginGraceTime 30
|
||||
AllowUsers ${ADMIN_USER}
|
||||
EOF
|
||||
if sshd -t; then
|
||||
systemctl reload ssh 2>/dev/null || systemctl reload sshd 2>/dev/null || true
|
||||
ok "SSH hardened: key-only, no root, AllowUsers=${ADMIN_USER}, port ${SSH_PORT}."
|
||||
else
|
||||
error "sshd config test FAILED — removing drop-in, leaving SSH as-is."
|
||||
rm -f "$SSHD_DROPIN"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# ── Step 5: kernel sysctl for k3s + sane limits ────────────────────────────
|
||||
info "Step 5: sysctl + kernel modules for k3s..."
|
||||
modprobe br_netfilter 2>/dev/null || true
|
||||
modprobe overlay 2>/dev/null || true
|
||||
cat > /etc/modules-load.d/dezky-k3s.conf <<EOF
|
||||
br_netfilter
|
||||
overlay
|
||||
EOF
|
||||
cat > /etc/sysctl.d/99-dezky-k3s.conf <<EOF
|
||||
# Routing/bridging required by k3s/flannel
|
||||
net.ipv4.ip_forward = 1
|
||||
net.ipv6.conf.all.forwarding = 1
|
||||
net.bridge.bridge-nf-call-iptables = 1
|
||||
net.bridge.bridge-nf-call-ip6tables = 1
|
||||
# Many containers => raise inotify + file limits
|
||||
fs.inotify.max_user_instances = 8192
|
||||
fs.inotify.max_user_watches = 524288
|
||||
fs.file-max = 2097152
|
||||
EOF
|
||||
sysctl --system >/dev/null
|
||||
ok "sysctl applied."
|
||||
|
||||
# ── Step 6: disable swap (kubelet best practice) ───────────────────────────
|
||||
info "Step 6: Disabling swap (recommended for k3s nodes)..."
|
||||
swapoff -a || true
|
||||
# Comment any swap entries so it stays off across reboots
|
||||
sed -i.bak -E 's@^([^#].*\sswap\s.*)$@# \1 # disabled by dezky bootstrap@' /etc/fstab || true
|
||||
ok "Swap disabled."
|
||||
|
||||
# ── Step 7: fail2ban (ssh) ─────────────────────────────────────────────────
|
||||
info "Step 7: fail2ban for SSH..."
|
||||
cat > /etc/fail2ban/jail.d/dezky-sshd.local <<EOF
|
||||
[sshd]
|
||||
enabled = true
|
||||
port = ${SSH_PORT}
|
||||
backend = systemd
|
||||
maxretry = 4
|
||||
findtime = 10m
|
||||
bantime = 1h
|
||||
EOF
|
||||
systemctl enable --now fail2ban >/dev/null 2>&1 || true
|
||||
systemctl restart fail2ban >/dev/null 2>&1 || true
|
||||
ok "fail2ban active on SSH."
|
||||
|
||||
# ── Step 8: unattended security upgrades ───────────────────────────────────
|
||||
info "Step 8: Enabling unattended security upgrades..."
|
||||
cat > /etc/apt/apt.conf.d/20auto-upgrades <<EOF
|
||||
APT::Periodic::Update-Package-Lists "1";
|
||||
APT::Periodic::Unattended-Upgrade "1";
|
||||
EOF
|
||||
# Keep defaults for which origins (security). Auto-reboot OFF — you decide when.
|
||||
ok "Unattended security upgrades enabled (auto-reboot left off)."
|
||||
|
||||
# ── Step 9: firewall (k3s-safe nftables) ───────────────────────────────────
|
||||
info "Step 9: Applying k3s-safe nftables firewall..."
|
||||
# Ensure distro nftables.service won't fight us: we run our own unit and never
|
||||
# flush the global ruleset. Disable the stock service's auto-load of its conf.
|
||||
systemctl disable --now nftables.service >/dev/null 2>&1 || true
|
||||
CONFIG_FILE="$CONFIG_FILE" "$SCRIPT_DIR/firewall/firewall.sh"
|
||||
|
||||
echo ""
|
||||
echo "╔══════════════════════════════════════════════════════════════╗"
|
||||
echo "║ Host bootstrap complete ║"
|
||||
echo "╚══════════════════════════════════════════════════════════════╝"
|
||||
warn "BEFORE you close this root session:"
|
||||
warn " 1. Open a new terminal and run: ssh -p ${SSH_PORT} ${ADMIN_USER}@${SERVER_PUBLIC_IPV4:-<server-ip>}"
|
||||
warn " 2. Confirm you get in with your key."
|
||||
warn " 3. Only then close this session. KVM/LARA is your fallback if not."
|
||||
echo ""
|
||||
info "Next host-layer steps (separate scripts): k3s registration,"
|
||||
info "Stalwart mail, Restic backups."
|
||||
@@ -0,0 +1,59 @@
|
||||
# ─────────────────────────────────────────────────────────────
|
||||
# Dezky production host configuration
|
||||
#
|
||||
# Copy to `config.env` and fill in real values. `config.env` is
|
||||
# gitignored — it holds host-specific values, not the repo's source
|
||||
# of truth. Both bootstrap.sh and firewall/firewall.sh source this.
|
||||
# ─────────────────────────────────────────────────────────────
|
||||
|
||||
# --- Management allowlist -------------------------------------------------
|
||||
# Source addresses allowed to reach SSH (22) and the k3s API (6443).
|
||||
# Everything else on those ports is dropped. Accepts a comma-separated
|
||||
# list of single IPs and/or CIDRs (e.g. home + office, or a /29 block,
|
||||
# or a v6 /64 prefix) — the firewall treats these as nftables interval sets.
|
||||
#
|
||||
# NOTE: residential IPs can change. If yours is dynamic, prefer a small
|
||||
# prefix here, and remember Hetzner's KVM/LARA console is always reachable
|
||||
# out-of-band if you ever lock yourself out (see README).
|
||||
MGMT_ALLOW_V4="203.0.113.10, 203.0.113.11" # REQUIRED — management IPv4(s)/CIDR(s)
|
||||
MGMT_ALLOW_V6="" # optional — management IPv6(s)/prefix (empty to skip)
|
||||
|
||||
# --- Server identity ------------------------------------------------------
|
||||
SERVER_HOSTNAME="node1.dezky.eu" # FQDN set on the box
|
||||
SERVER_PUBLIC_IPV4="" # AX41 primary IPv4 (fill after provisioning)
|
||||
SERVER_PUBLIC_IPV6="" # AX41 primary IPv6 (fill after provisioning)
|
||||
|
||||
# --- Admin (non-root) user ------------------------------------------------
|
||||
ADMIN_USER="dezky" # created with sudo; root SSH login is then disabled
|
||||
ADMIN_SSH_PUBKEY="" # REQUIRED — your SSH public key (the WHOLE line, e.g. "ssh-ed25519 AAAA... you@home")
|
||||
|
||||
# --- SSH ------------------------------------------------------------------
|
||||
SSH_PORT="22" # keep 22 unless you have a reason; obscurity is not security
|
||||
|
||||
# --- k3s networking (defaults; change ONLY if you customise k3s CIDRs) ----
|
||||
K3S_POD_CIDR="10.42.0.0/16" # flannel pod network — accepted to/from host
|
||||
K3S_SERVICE_CIDR="10.43.0.0/16" # cluster service network — accepted to/from host
|
||||
|
||||
# --- Rancher Custom-cluster registration (SECRET) -------------------------
|
||||
# From Rancher → Cluster Management → <cluster> → Registration tab. Create the
|
||||
# cluster with the **K3s** distribution first. Token + checksum are secrets.
|
||||
RANCHER_SERVER_URL="https://rancher.example.com"
|
||||
RANCHER_NODE_TOKEN="" # REQUIRED — node registration token
|
||||
RANCHER_CA_CHECKSUM="" # REQUIRED — CA checksum from the same command
|
||||
RANCHER_NODE_ROLES="--etcd --controlplane --worker" # single node = all three
|
||||
RANCHER_INSECURE_FETCH="true" # true if Rancher is reached by IP / self-signed cert
|
||||
|
||||
# --- Stalwart mail (host service) -----------------------------------------
|
||||
# SECRETS — platform-api (k3s) must use the SAME admin password + webhook secret.
|
||||
STALWART_VERSION="latest" # pin to a release tag after first install
|
||||
STALWART_ADMIN_PASSWORD="" # REQUIRED — openssl rand -hex 24
|
||||
STALWART_WEBHOOK_SECRET="" # REQUIRED — openssl rand -hex 32
|
||||
|
||||
# --- Restic backups (host) ------------------------------------------------
|
||||
# Storage Box is SSH/SFTP on PORT 23, key auth. STORE RESTIC_PASSWORD OFFLINE.
|
||||
RESTIC_PASSWORD="" # REQUIRED — openssl rand -hex 32 (save offline!)
|
||||
BACKUP_PRIMARY_REPO="" # sftp:<user>@<user>.your-storagebox.de:/dezky
|
||||
BACKUP_DR_REPO="" # sftp:<user>@<user>.your-storagebox.de:/dezky (Helsinki box)
|
||||
BACKUP_PATHS="/opt/stalwart/data /opt/stalwart/etc /var/lib/rancher/k3s/server/db/snapshots /var/lib/rancher/k3s/storage"
|
||||
BACKUP_RETENTION="--keep-daily 7 --keep-weekly 4 --keep-monthly 6"
|
||||
BACKUP_HEALTHCHECK_URL="" # optional dead-man's-switch base URL
|
||||
@@ -0,0 +1,27 @@
|
||||
# Dezky host firewall — loads ONLY our table on boot.
|
||||
#
|
||||
# Deliberately does NOT use the distro 'nftables.service', whose default
|
||||
# config starts with `flush ruleset` and would wipe k3s's tables. This unit
|
||||
# applies /etc/nftables.d/dezky-fw.nft, which only (re)creates inet dezky_fw.
|
||||
#
|
||||
# Ordering: runs early (before k3s) so the box is never briefly exposed.
|
||||
# k3s adds its own tables independently afterwards.
|
||||
|
||||
[Unit]
|
||||
Description=Dezky host firewall (nftables, k3s-safe)
|
||||
Wants=network-pre.target
|
||||
Before=network-pre.target k3s.service
|
||||
DefaultDependencies=no
|
||||
Conflicts=shutdown.target
|
||||
Before=shutdown.target
|
||||
|
||||
[Service]
|
||||
Type=oneshot
|
||||
RemainAfterExit=yes
|
||||
ExecStart=/usr/sbin/nft -f /etc/nftables.d/dezky-fw.nft
|
||||
ExecReload=/usr/sbin/nft -f /etc/nftables.d/dezky-fw.nft
|
||||
# On stop, remove only our table — leave k3s networking intact.
|
||||
ExecStop=-/usr/sbin/nft delete table inet dezky_fw
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
+160
@@ -0,0 +1,160 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# Dezky production host firewall (nftables) — k3s-safe.
|
||||
#
|
||||
# Why this design:
|
||||
# - k3s/kube-proxy/flannel manage their OWN nftables tables (ip/ip6: filter,
|
||||
# nat, mangle). We must never `flush ruleset` or use ufw/firewalld, or we
|
||||
# wipe/clobber cluster networking. Instead we own a single dedicated table,
|
||||
# `inet dezky_fw`, with only an INPUT chain. Separate tables coexist; a
|
||||
# packet is dropped if ANY base chain drops it, so our default-drop INPUT
|
||||
# is the gate for host-bound traffic while k3s keeps owning FORWARD/NAT.
|
||||
# - We explicitly accept the pod/service CIDRs and CNI interfaces so
|
||||
# cluster<->host traffic (API server, kubelet, CoreDNS) is never dropped.
|
||||
#
|
||||
# Idempotent: re-running replaces only our table (it is removed and recreated in one nft transaction).
|
||||
#
|
||||
# Usage (as root, on the server):
|
||||
# ./firewall.sh # render from ../config.env, install unit, apply
|
||||
# ./firewall.sh --dry-run # print the ruleset, apply nothing
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
|
||||
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
|
||||
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
|
||||
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
|
||||
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
HOST_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
|
||||
CONFIG_FILE="${CONFIG_FILE:-$HOST_DIR/config.env}"
|
||||
NFT_OUT="/etc/nftables.d/dezky-fw.nft"
|
||||
UNIT_SRC="$SCRIPT_DIR/dezky-firewall.service"
|
||||
UNIT_DST="/etc/systemd/system/dezky-firewall.service"
|
||||
|
||||
DRY_RUN=0
|
||||
[[ "${1:-}" == "--dry-run" ]] && DRY_RUN=1
|
||||
|
||||
# ── Load config ───────────────────────────────────────────────────────────
|
||||
if [[ ! -f "$CONFIG_FILE" ]]; then
|
||||
error "Config not found: $CONFIG_FILE"
|
||||
error "Copy config.env.example → config.env and fill it in."
|
||||
exit 1
|
||||
fi
|
||||
# shellcheck disable=SC1090
|
||||
source "$CONFIG_FILE"
|
||||
|
||||
: "${MGMT_ALLOW_V4:?MGMT_ALLOW_V4 is required in config.env}"
|
||||
: "${SSH_PORT:=22}"
|
||||
: "${K3S_POD_CIDR:=10.42.0.0/16}"
|
||||
: "${K3S_SERVICE_CIDR:=10.43.0.0/16}"
|
||||
|
||||
# ── Build the management v6 block only if a v6 address is configured ───────
|
||||
V6_SET=""
|
||||
V6_RULE=""
|
||||
if [[ -n "${MGMT_ALLOW_V6:-}" ]]; then
|
||||
V6_SET=$(cat <<EOF
|
||||
|
||||
set mgmt_v6 {
|
||||
type ipv6_addr
|
||||
flags interval
|
||||
elements = { ${MGMT_ALLOW_V6} }
|
||||
}
|
||||
EOF
|
||||
)
|
||||
V6_RULE=" ip6 saddr @mgmt_v6 tcp dport { ${SSH_PORT}, 6443 } accept"
|
||||
fi
|
||||
|
||||
# ── Render the ruleset ─────────────────────────────────────────────────────
|
||||
RULESET=$(cat <<EOF
|
||||
#!/usr/sbin/nft -f
|
||||
#
|
||||
# Managed by Dezky firewall.sh — DO NOT edit by hand.
|
||||
# Owns only 'inet dezky_fw'. k3s manages its own ip/ip6 tables separately.
|
||||
# NEVER add 'flush ruleset' here: it would wipe k3s networking.
|
||||
|
||||
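# 'destroy' is not available in the nft/kernel versions Debian 12 ships, so
# declare the table (a no-op if it exists) and then delete it for a clean start.
table inet dezky_fw
delete table inet dezky_fw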
|
||||
|
||||
table inet dezky_fw {
|
||||
# Management source allowlist (SSH + k3s API). Intervals allow CIDRs.
|
||||
set mgmt_v4 {
|
||||
type ipv4_addr
|
||||
flags interval
|
||||
elements = { ${MGMT_ALLOW_V4} }
|
||||
}${V6_SET}
|
||||
|
||||
chain input {
|
||||
type filter hook input priority filter; policy drop;
|
||||
|
||||
# Stateful fast-path
|
||||
ct state established,related accept
|
||||
ct state invalid drop
|
||||
|
||||
# Loopback
|
||||
iif "lo" accept
|
||||
|
||||
# ICMP — keep ping working and (critically) IPv6 NDP/RA + PMTUD
|
||||
ip protocol icmp accept
|
||||
ip6 nexthdr icmpv6 accept
|
||||
|
||||
# ── k3s internal: never block cluster <-> host traffic ──────────────
|
||||
iifname "cni0" accept
|
||||
iifname "flannel.1" accept
|
||||
ip saddr ${K3S_POD_CIDR} accept
|
||||
ip saddr ${K3S_SERVICE_CIDR} accept
|
||||
|
||||
# ── Public services (world-reachable) ──────────────────────────────
|
||||
# Web + ACME HTTP-01 challenge
|
||||
tcp dport { 80, 443 } accept
|
||||
# Mail: smtp, submissions, submission, imap, imaps, managesieve
|
||||
tcp dport { 25, 465, 587, 143, 993, 4190 } accept
|
||||
|
||||
# ── Management surfaces: allowlisted sources only ──────────────────
|
||||
ip saddr @mgmt_v4 tcp dport { ${SSH_PORT}, 6443 } accept
|
||||
${V6_RULE}
|
||||
|
||||
# Rate-limited drop logging for debugging (then policy drop applies)
|
||||
limit rate 5/minute burst 5 packets log prefix "dezky-fw drop: " level info
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)
|
||||
|
||||
if [[ $DRY_RUN -eq 1 ]]; then
  echo "$RULESET"
  info "Dry run — nothing applied."
  exit 0
fi

if [[ $EUID -ne 0 ]]; then
  error "Must run as root to apply the firewall."
  exit 1
fi

# ── Write, validate, install, apply ────────────────────────────────────────
mkdir -p /etc/nftables.d
echo "$RULESET" > "$NFT_OUT"
chmod 0644 "$NFT_OUT"
info "Wrote ruleset → $NFT_OUT"

# Validate syntax before touching the live ruleset
if ! nft -c -f "$NFT_OUT"; then
  error "nft syntax check FAILED — not applying. Live firewall unchanged."
  exit 1
fi
ok "Ruleset syntax valid."

# Install the systemd unit so the rules survive reboot (and never flush global)
if [[ -f "$UNIT_SRC" ]]; then
  install -m 0644 "$UNIT_SRC" "$UNIT_DST"
  systemctl daemon-reload
  systemctl enable dezky-firewall.service >/dev/null 2>&1 || true
  ok "Installed + enabled dezky-firewall.service"
fi

# Apply now
nft -f "$NFT_OUT"
ok "Firewall applied. Management restricted to: ${MGMT_ALLOW_V4} ${MGMT_ALLOW_V6:-}"
warn "Open a SECOND SSH session NOW and confirm you still have access before"
warn "closing this one. Hetzner KVM/LARA is your out-of-band fallback."
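The ruleset comments stress that established traffic, loopback, and the k3s interfaces must never be blocked. A minimal sketch of a pre-apply lint for that idea follows. `ruleset_has_required_rules` is a hypothetical helper that is not part of firewall.sh, and the list of required rules is an assumption taken from the rules shown in this file.

```shell
# Hypothetical pre-apply lint (not part of firewall.sh).
# Returns 0 only if every required rule text appears in the rendered ruleset.
ruleset_has_required_rules() {
  ruleset="$1"
  for rule in \
    'ct state established,related accept' \
    'iif "lo" accept' \
    'iifname "cni0" accept' \
    'iifname "flannel.1" accept'
  do
    # Plain substring match; no pipes, so it behaves the same under pipefail.
    case "$ruleset" in
      *"$rule"*) ;;
      *) echo "missing rule: $rule" >&2; return 1 ;;
    esac
  done
  return 0
}
```

It could be called as `ruleset_has_required_rules "$RULESET" || exit 1` before the `nft -c` syntax check, so a bad template edit is caught before anything touches the live firewall.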
+93
@@ -0,0 +1,93 @@
#!/usr/bin/env bash
#
# Register the AX41 as a single-node k3s cluster in Rancher (Custom cluster,
# provisioning v2). Run AFTER bootstrap.sh — the firewall already allows the
# outbound 443 the cluster-agent needs (no inbound rule required).
#
# This downloads Rancher's system-agent installer and runs it. The agent then
# pulls the cluster spec from Rancher and stands up k3s with the configured
# roles. The Rancher Custom cluster MUST be created with the K3s distribution.
#
# Security note: Rancher here is addressed by IP, whose TLS cert won't match,
# so we fetch the installer with --insecure. The agent still pins Rancher's CA
# via --ca-checksum for its ongoing connection, but the installer itself runs
# unverified; move Rancher behind rancher.dezky.eu + a valid cert to fix that.
#
# Usage (on the server):
#   sudo ./register.sh            # register this node
#   sudo ./register.sh --force    # re-run even if an agent is already present

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HOST_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
CONFIG_FILE="${CONFIG_FILE:-$HOST_DIR/config.env}"

FORCE=0
[[ "${1:-}" == "--force" ]] && FORCE=1

if [[ $EUID -ne 0 ]]; then
  error "Run with sudo/root (the agent installer needs root)."
  exit 1
fi

if [[ ! -f "$CONFIG_FILE" ]]; then
  error "Missing $CONFIG_FILE — fill in the RANCHER_* values first."
  exit 1
fi
# shellcheck disable=SC1090
source "$CONFIG_FILE"

: "${RANCHER_SERVER_URL:?RANCHER_SERVER_URL required}"
: "${RANCHER_NODE_TOKEN:?RANCHER_NODE_TOKEN required}"
: "${RANCHER_CA_CHECKSUM:?RANCHER_CA_CHECKSUM required}"
: "${RANCHER_NODE_ROLES:=--etcd --controlplane --worker}"
: "${RANCHER_INSECURE_FETCH:=true}"

# ── Idempotency guard ──────────────────────────────────────────────────────
if [[ -n "$(systemctl list-unit-files --no-legend 'rancher-system-agent*' 2>/dev/null)" ]]; then
  if [[ $FORCE -eq 0 ]]; then
    warn "rancher-system-agent already installed — node looks registered."
    warn "Re-run with --force to register again. Skipping."
    exit 0
  fi
  warn "rancher-system-agent present, but --force given — proceeding."
fi

# ── Fetch installer ────────────────────────────────────────────────────────
INSECURE_FLAG=""
if [[ "$RANCHER_INSECURE_FETCH" == "true" ]]; then
  INSECURE_FLAG="--insecure"
  warn "Fetching installer insecurely (Rancher reached by IP). CA checksum still pins the agent connection."
fi

TMP_INSTALLER="$(mktemp /tmp/rancher-system-agent-install.XXXXXX.sh)"
trap 'rm -f "$TMP_INSTALLER"' EXIT

info "Downloading system-agent installer from ${RANCHER_SERVER_URL} ..."
# shellcheck disable=SC2086
curl -fsSL $INSECURE_FLAG "${RANCHER_SERVER_URL}/system-agent-install.sh" -o "$TMP_INSTALLER"
ok "Installer downloaded ($(wc -c < "$TMP_INSTALLER") bytes)."

# ── Register ───────────────────────────────────────────────────────────────
info "Registering node with roles: ${RANCHER_NODE_ROLES}"
info "(token masked: ${RANCHER_NODE_TOKEN:0:6}…)"
# shellcheck disable=SC2086
sh "$TMP_INSTALLER" \
  --server "${RANCHER_SERVER_URL}" \
  --label 'cattle.io/os=linux' \
  --token "${RANCHER_NODE_TOKEN}" \
  --ca-checksum "${RANCHER_CA_CHECKSUM}" \
  ${RANCHER_NODE_ROLES}

echo ""
ok "Registration submitted. Watch progress in Rancher (cluster goes Active in a few minutes)."
info "On the node you can follow along with:"
info "  journalctl -u rancher-system-agent -f"
info "  k3s kubectl get nodes    # once k3s is up"
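Because the installer is fetched with `--insecure` and then executed as root, one possible extra guard is to pin the installer's SHA-256 and compare it before running the file. This is a sketch under assumptions: `verify_sha256` and `RANCHER_INSTALLER_SHA256` are hypothetical names, not Rancher options, and the pinned hash would have to be obtained over a trusted channel and updated whenever Rancher is upgraded.

```shell
# Hypothetical helper (not part of register.sh).
# verify_sha256 FILE EXPECTED_HEX: returns 0 when FILE's SHA-256 equals EXPECTED_HEX.
verify_sha256() {
  # sha256sum prints "<hex>  <path>"; keep only the hex digest.
  actual="$(sha256sum "$1" | cut -d' ' -f1)"
  [ "$actual" = "$2" ]
}

# Possible use before executing the downloaded installer
# (RANCHER_INSTALLER_SHA256 is an assumed config.env entry):
#   verify_sha256 "$TMP_INSTALLER" "$RANCHER_INSTALLER_SHA256" || exit 1
```

The comparison is deliberately exact, so a truncated download or an altered script fails the check instead of being executed.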
+86
@@ -0,0 +1,86 @@
#!/usr/bin/env bash
#
# Dezky host backup — Restic to a Hetzner Storage Box (primary), then a
# dedup-aware `restic copy` to a second Storage Box in Helsinki (DR).
#
# Runs as root (must read stalwart- and root-owned data). HOME is pointed at
# /opt/dezky-backup so ssh uses the dedicated backup key + config (Storage Box
# is SSH/SFTP on port 23). Triggered daily by dezky-backup.timer.
#
# Requires restic >= 0.14 (for `copy --from-repo`).

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

BACKUP_HOME="/opt/dezky-backup"
ENV_FILE="${ENV_FILE:-$BACKUP_HOME/restic.env}"

if [[ ! -f "$ENV_FILE" ]]; then
  error "Missing $ENV_FILE — run restic/install.sh first."
  exit 1
fi
# shellcheck disable=SC1090
source "$ENV_FILE"

: "${RESTIC_PASSWORD:?RESTIC_PASSWORD required}"
: "${BACKUP_PRIMARY_REPO:?BACKUP_PRIMARY_REPO required}"
: "${BACKUP_PATHS:?BACKUP_PATHS required}"
: "${BACKUP_RETENTION:=--keep-daily 7 --keep-weekly 4 --keep-monthly 6}"

# ssh (spawned by restic) should read $HOME/.ssh/config, but OpenSSH resolves ~ via passwd, not $HOME: confirm with `ssh -G <host>`.
export HOME="$BACKUP_HOME"
export RESTIC_PASSWORD
# For `copy`: both repos share the same password.
export RESTIC_FROM_PASSWORD="$RESTIC_PASSWORD"

# Optional dead-man's-switch (e.g. healthchecks.io). Pinged /start, success, /fail.
HC="${BACKUP_HEALTHCHECK_URL:-}"
ping_hc() { [[ -n "$HC" ]] && curl -fsS -m 10 --retry 3 "${HC}${1:-}" >/dev/null 2>&1 || true; }
fail() { error "$1"; ping_hc "/fail"; exit 1; }

ping_hc "/start"

# Exclude obvious churn/noise from the PVC tree
EXCLUDES=(--exclude-caches
          --exclude '*/lost+found'
          --exclude '*.tmp')

# ── 1) Back up to the primary Storage Box ──────────────────────────────────
info "Backing up to primary: $BACKUP_PRIMARY_REPO"
# shellcheck disable=SC2086
restic -r "$BACKUP_PRIMARY_REPO" backup $BACKUP_PATHS \
  "${EXCLUDES[@]}" \
  --tag dezky --tag host \
  --host dezky-node1 \
  || fail "Primary backup failed."
ok "Primary backup done."

# ── 2) Retention on primary ────────────────────────────────────────────────
info "Applying retention on primary..."
# shellcheck disable=SC2086
restic -r "$BACKUP_PRIMARY_REPO" forget $BACKUP_RETENTION --prune \
  || warn "Primary forget/prune reported an issue (backup itself is safe)."

# ── 3) Light integrity check on primary ────────────────────────────────────
restic -r "$BACKUP_PRIMARY_REPO" check || warn "restic check flagged the primary repo — investigate."

# ── 4) Mirror to the Helsinki DR box (dedup-aware copy) ─────────────────────
if [[ -n "${BACKUP_DR_REPO:-}" ]]; then
  info "Copying snapshots to DR: $BACKUP_DR_REPO"
  restic -r "$BACKUP_DR_REPO" copy --from-repo "$BACKUP_PRIMARY_REPO" \
    || fail "DR copy failed."
  # shellcheck disable=SC2086
  restic -r "$BACKUP_DR_REPO" forget $BACKUP_RETENTION --prune \
    || warn "DR forget/prune reported an issue."
  ok "DR mirror done."
else
  warn "BACKUP_DR_REPO not set — skipping off-site mirror (set it for real DR)."
fi

ok "Backup cycle complete."
ping_hc ""   # success ping (bare URL)
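`BACKUP_PATHS` is a space-separated list that the script expands unquoted, so a typo in config.env only surfaces once restic is already running. A small preflight check could report missing paths first. The sketch below assumes a hypothetical helper, `missing_paths`, which is not part of backup.sh.

```shell
# Hypothetical preflight helper (not part of backup.sh).
# missing_paths PATH...: prints each argument that does not exist, one per
# line, and returns 1 if at least one path was missing.
missing_paths() {
  rc=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      printf '%s\n' "$p"
      rc=1
    fi
  done
  return "$rc"
}

# Possible use, relying on the same word splitting as the restic call:
#   missing_paths $BACKUP_PATHS || fail "Some BACKUP_PATHS entries do not exist."
```

Like the restic invocation itself, this relies on word splitting, so it inherits the same limitation: paths containing spaces are not supported.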
@@ -0,0 +1,13 @@
# Dezky nightly backup (Restic → Storage Box primary + Helsinki DR).
[Unit]
Description=Dezky host backup (Restic)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/opt/dezky-backup/backup.sh
# Backups are I/O heavy but should never starve mail/k3s
Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=6
@@ -0,0 +1,12 @@
# Nightly at 03:20 UTC, with a randomized delay so it doesn't hammer the
# Storage Box at the same second every night. Catches up if the box was off.
[Unit]
Description=Run the Dezky host backup nightly

[Timer]
OnCalendar=*-*-* 03:20:00 UTC
RandomizedDelaySec=20min
Persistent=true

[Install]
WantedBy=timers.target
+115
@@ -0,0 +1,115 @@
#!/usr/bin/env bash
#
# Install Dezky host backups: Restic + a dedicated backup SSH key/config for the
# Hetzner Storage Box(es), the env file, the backup/restore scripts, and the
# nightly systemd timer. Idempotent.
#
#   sudo ./install.sh
#
# Storage Box uses SSH/SFTP on PORT 23 with key auth. After this runs, you must
# upload the printed public key to BOTH Storage Boxes, then re-run to init the
# repos (the box must trust the key before `restic init` can connect).

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HOST_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
CONFIG_FILE="${CONFIG_FILE:-$HOST_DIR/config.env}"
BACKUP_HOME="/opt/dezky-backup"
SSH_DIR="$BACKUP_HOME/.ssh"
KEY="$SSH_DIR/id_ed25519"

if [[ $EUID -ne 0 ]]; then error "Run as root."; exit 1; fi
if [[ ! -f "$CONFIG_FILE" ]]; then error "Missing $CONFIG_FILE"; exit 1; fi
# shellcheck disable=SC1090
source "$CONFIG_FILE"

: "${RESTIC_PASSWORD:?RESTIC_PASSWORD required (and STORE IT OFFLINE — losing it loses the backups)}"
: "${BACKUP_PRIMARY_REPO:?BACKUP_PRIMARY_REPO required}"
: "${BACKUP_PATHS:?BACKUP_PATHS required}"
: "${BACKUP_RETENTION:=--keep-daily 7 --keep-weekly 4 --keep-monthly 6}"

# ── 1) Packages ────────────────────────────────────────────────────────────
info "Installing restic + openssh client..."
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y -qq restic curl openssh-client >/dev/null
ok "restic $(restic version | awk '{print $2}') installed."

# ── 2) Backup home + SSH key/config ────────────────────────────────────────
info "Setting up $BACKUP_HOME ..."
install -d -m 0700 "$BACKUP_HOME" "$SSH_DIR"
if [[ ! -f "$KEY" ]]; then
  ssh-keygen -t ed25519 -N "" -C "dezky-backup@node1" -f "$KEY" >/dev/null
  ok "Generated backup SSH key."
fi
# Single wildcard config covers BOTH Storage Boxes (same domain, port 23, key).
cat > "$SSH_DIR/config" <<EOF
Host *.your-storagebox.de
  Port 23
  IdentityFile $KEY
  IdentitiesOnly yes
  StrictHostKeyChecking accept-new
  UserKnownHostsFile $SSH_DIR/known_hosts
EOF
chmod 0600 "$SSH_DIR/config" "$KEY"
chmod 0644 "$KEY.pub"

# ── 3) restic.env (secrets; generated, not in git) ─────────────────────────
umask 077
cat > "$BACKUP_HOME/restic.env" <<EOF
# Generated by restic/install.sh from config.env — DO NOT commit.
RESTIC_PASSWORD=$(printf '%q' "$RESTIC_PASSWORD")
BACKUP_PRIMARY_REPO=$(printf '%q' "$BACKUP_PRIMARY_REPO")
BACKUP_DR_REPO=$(printf '%q' "${BACKUP_DR_REPO:-}")
BACKUP_PATHS=$(printf '%q' "$BACKUP_PATHS")
BACKUP_RETENTION=$(printf '%q' "$BACKUP_RETENTION")
BACKUP_HEALTHCHECK_URL=$(printf '%q' "${BACKUP_HEALTHCHECK_URL:-}")
EOF
chmod 0600 "$BACKUP_HOME/restic.env"
ok "Wrote restic.env."

# ── 4) Scripts + systemd units ─────────────────────────────────────────────
install -m 0750 "$SCRIPT_DIR/backup.sh" "$BACKUP_HOME/backup.sh"
install -m 0750 "$SCRIPT_DIR/restore.sh" "$BACKUP_HOME/restore.sh"
install -m 0644 "$SCRIPT_DIR/dezky-backup.service" /etc/systemd/system/dezky-backup.service
install -m 0644 "$SCRIPT_DIR/dezky-backup.timer" /etc/systemd/system/dezky-backup.timer
systemctl daemon-reload
systemctl enable --now dezky-backup.timer
ok "Nightly timer enabled."

# ── 5) Try to init the repos (only works once the key is on the box) ───────
export HOME="$BACKUP_HOME" RESTIC_PASSWORD
init_repo() {
  local repo="$1" label="$2"
  [[ -z "$repo" ]] && return 0
  if restic -r "$repo" cat config >/dev/null 2>&1; then
    ok "$label repo already initialized."
  elif restic -r "$repo" init >/dev/null 2>&1; then
    ok "$label repo initialized."
  else
    warn "$label repo not reachable/authorized yet — upload the key, then re-run."
  fi
}
init_repo "$BACKUP_PRIMARY_REPO" "Primary"
init_repo "${BACKUP_DR_REPO:-}" "DR"

echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║                   Backup install complete                    ║"
echo "╚══════════════════════════════════════════════════════════════╝"
warn "Upload this PUBLIC key to BOTH Storage Boxes, then re-run install.sh:"
echo ""
cat "$KEY.pub"
echo ""
info "  ssh-copy-id -p 23 -i $KEY.pub <primary-user>@<primary-host>.your-storagebox.de"
info "  ssh-copy-id -p 23 -i $KEY.pub <dr-user>@<dr-host>.your-storagebox.de"
info "Then test: sudo $BACKUP_HOME/backup.sh (or wait for 03:20 UTC)"
info "Drill restore: sudo $BACKUP_HOME/restore.sh restore latest /tmp/restore-test"
warn "STORE RESTIC_PASSWORD OFFLINE. Without it, the encrypted backups are unrecoverable."
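restic.env is read back with `source`, so any value containing spaces (the default `BACKUP_RETENTION`, for instance) has to be quoted when the file is written. The sketch below shows one portable way to do that; `shell_quote` and `write_env_line` are hypothetical helpers, not part of install.sh.

```shell
# Hypothetical helpers (not part of install.sh).
# shell_quote VALUE: prints VALUE wrapped in single quotes, rewriting each
# embedded single quote as '\'' so the result survives `source`.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# write_env_line NAME VALUE: prints one NAME='VALUE' assignment line.
write_env_line() {
  printf '%s=%s\n' "$1" "$(shell_quote "$2")"
}

# Possible use when generating the env file:
#   write_env_line BACKUP_RETENTION "$BACKUP_RETENTION" >> "$BACKUP_HOME/restic.env"
```

Single-quoting keeps `$`, backticks, and spaces literal. One limitation of this sketch is that trailing newlines in a value are dropped by the command substitution.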
+57
@@ -0,0 +1,57 @@
#!/usr/bin/env bash
#
# Dezky restore helper. A backup you've never restored is a backup you don't
# have — run a drill periodically. This wraps the common restic restore flows.
#
#   sudo ./restore.sh snapshots                           # list snapshots (primary)
#   sudo ./restore.sh snapshots --dr                      # list from the DR box
#   sudo ./restore.sh restore <snapshot-id> <target-dir> [--dr]
#   sudo ./restore.sh restore latest /tmp/restore-test    # safe drill target
#
# Restores go to an arbitrary target dir (NOT in place) so you can inspect first.
# For Stalwart, stop the service, swap /opt/stalwart/data, then start it.

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

BACKUP_HOME="/opt/dezky-backup"
ENV_FILE="${ENV_FILE:-$BACKUP_HOME/restic.env}"
[[ -f "$ENV_FILE" ]] || { error "Missing $ENV_FILE"; exit 1; }
# shellcheck disable=SC1090
source "$ENV_FILE"
export HOME="$BACKUP_HOME"
export RESTIC_PASSWORD

pick_repo() {
  if [[ "${*: -1}" == "--dr" ]]; then
    [[ -n "${BACKUP_DR_REPO:-}" ]] || { error "BACKUP_DR_REPO not set"; exit 1; }
    echo "$BACKUP_DR_REPO"
  else
    echo "$BACKUP_PRIMARY_REPO"
  fi
}

cmd="${1:-}"; shift || true
case "$cmd" in
  snapshots)
    repo="$(pick_repo "$@")"
    info "Snapshots in $repo:"
    restic -r "$repo" snapshots --tag dezky
    ;;
  restore)
    snap="${1:?snapshot id (or 'latest')}"; target="${2:?target dir}"
    repo="$(pick_repo "$@")"
    mkdir -p "$target"
    info "Restoring $snap from $repo → $target"
    restic -r "$repo" restore "$snap" --target "$target"
    ok "Restored. Inspect $target before putting anything back in place."
    ;;
  *)
    error "Usage: $0 {snapshots|restore} ... (see header)"
    exit 1
    ;;
esac
+77
@@ -0,0 +1,77 @@
#!/usr/bin/env bash
#
# Sync the mail.dezky.eu TLS cert from the cluster (issued by cert-manager) to
# Stalwart on the host. The host IS the k3s node, so we read the secret via the
# local kubeconfig — no external machinery. Reloads Stalwart only when the cert
# actually changed (cert-manager renews ~30 days before expiry).
#
# Run by stalwart-cert-sync.timer (every 12h + on boot). Safe to run by hand.
#
# Forward dependency: needs the fleet layer to have created the TLS secret
# (default: namespace 'mail', secret 'mail-tls'). Until then this is a no-op and
# Stalwart keeps using the self-signed bootstrap cert from install.sh.

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

TLS_NAMESPACE="${TLS_NAMESPACE:-mail}"
TLS_SECRET="${TLS_SECRET:-mail-tls}"
TLS_DIR="/opt/stalwart/etc/tls"
KUBECONFIG_PATH="${KUBECONFIG:-/etc/rancher/k3s/k3s.yaml}"

# kubectl: prefer standalone, fall back to the k3s-bundled one
if command -v kubectl >/dev/null 2>&1; then
  KUBECTL=(kubectl)
elif command -v k3s >/dev/null 2>&1; then
  KUBECTL=(k3s kubectl)
else
  error "Neither kubectl nor k3s found — is the node provisioned yet?"
  exit 1
fi
export KUBECONFIG="$KUBECONFIG_PATH"

# Pull the secret (no-op if it doesn't exist yet)
if ! "${KUBECTL[@]}" -n "$TLS_NAMESPACE" get secret "$TLS_SECRET" >/dev/null 2>&1; then
  warn "Secret ${TLS_NAMESPACE}/${TLS_SECRET} not present yet — cert-manager hasn't issued it. Skipping."
  exit 0
fi

TMP_CRT="$(mktemp)"; TMP_KEY="$(mktemp)"
trap 'rm -f "$TMP_CRT" "$TMP_KEY"' EXIT

"${KUBECTL[@]}" -n "$TLS_NAMESPACE" get secret "$TLS_SECRET" \
  -o jsonpath='{.data.tls\.crt}' | base64 -d > "$TMP_CRT"
"${KUBECTL[@]}" -n "$TLS_NAMESPACE" get secret "$TLS_SECRET" \
  -o jsonpath='{.data.tls\.key}' | base64 -d > "$TMP_KEY"

if [[ ! -s "$TMP_CRT" || ! -s "$TMP_KEY" ]]; then
  error "Fetched cert or key is empty — leaving current cert in place."
  exit 1
fi

# Only reload if something changed (byte-for-byte comparison with cmp)
changed=0
mkdir -p "$TLS_DIR"
if ! cmp -s "$TMP_CRT" "$TLS_DIR/cert.pem" 2>/dev/null; then changed=1; fi
if ! cmp -s "$TMP_KEY" "$TLS_DIR/key.pem" 2>/dev/null; then changed=1; fi

if [[ $changed -eq 0 ]]; then
  info "Cert unchanged — nothing to do."
  exit 0
fi

install -o stalwart -g stalwart -m 0644 "$TMP_CRT" "$TLS_DIR/cert.pem"
install -o stalwart -g stalwart -m 0640 "$TMP_KEY" "$TLS_DIR/key.pem"
ok "Updated mail TLS cert from ${TLS_NAMESPACE}/${TLS_SECRET}."

# SIGHUP Stalwart to reload certs without dropping connections
if systemctl is-active --quiet stalwart-mail; then
  systemctl reload stalwart-mail && ok "Reloaded stalwart-mail (SIGHUP)."
else
  warn "stalwart-mail not active — cert staged, will be used on next start."
fi
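The sync only checks that the fetched cert and key are non-empty. An additional guard could confirm that the two belong together before they replace the live files. The sketch below is an assumption-laden illustration: `cert_matches_key` is a hypothetical helper that is not part of cert-sync.sh, and it needs the `openssl` CLI on the host.

```shell
# Hypothetical helper (not part of cert-sync.sh).
# cert_matches_key CERT KEY: returns 0 when the certificate and the private
# key carry the same public key, non-zero otherwise (or if either is unreadable).
cert_matches_key() {
  cert_pub="$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null)" || return 1
  key_pub="$(openssl pkey -in "$2" -pubout 2>/dev/null)" || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Possible use before installing the fetched files:
#   cert_matches_key "$TMP_CRT" "$TMP_KEY" || { error "cert/key mismatch"; exit 1; }
```

Comparing the derived public keys works for both RSA and EC keys, since both commands print the key in the same PEM form.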
@@ -0,0 +1,102 @@
# Stalwart Mail Server — Dezky PRODUCTION (bare-metal host, outside k3s)
#
# Topology (see host/README.md):
#   - Mail protocol ports bind directly on the host's public IP.
#   - Web/JMAP is served in plaintext on :8080 (not reachable from the internet)
#     and fronted by Traefik (k3s) for mail.dezky.eu:443. Stalwart does NOT
#     bind 80/443; those belong to Traefik.
#   - TLS for the mail-protocol ports uses a cert ISSUED BY cert-manager
#     (mail.dezky.eu) and copied here by stalwart/cert-sync.sh. Stalwart runs
#     no ACME of its own (80/443 are Traefik's).
#   - Storage is RocksDB on local disk — intentionally independent of the
#     in-cluster Postgres so mail keeps flowing regardless of cluster state.
#
# Reference: https://stalw.art/docs

[server]
hostname = "mail.dezky.eu"   # MUST match the IP's PTR/rDNS record

# ── Listeners ──────────────────────────────────────────────────────────────
# Mail protocols on the public IP; management/JMAP on internal 8080 only
# (firewall blocks 8080 from the world, allows the k3s pod CIDR + Traefik).
[server.listener]
"smtp"        = { bind = "[::]:25",   protocol = "smtp" }
"submission"  = { bind = "[::]:587",  protocol = "smtp", tls.implicit = false }
"submissions" = { bind = "[::]:465",  protocol = "smtp", tls.implicit = true }
"imap"        = { bind = "[::]:143",  protocol = "imap", tls.implicit = false }
"imaps"       = { bind = "[::]:993",  protocol = "imap", tls.implicit = true }
"sieve"       = { bind = "[::]:4190", protocol = "managesieve" }
# Internal HTTP: JMAP + WebAdmin + management API. Traefik terminates TLS for
# the public hostname and proxies here; platform-api (pod) calls it directly.
"http"        = { bind = "0.0.0.0:8080", protocol = "http" }

# ── Storage — RocksDB on local disk (host-isolated from the cluster) ────────
[store."rocksdb"]
type = "rocksdb"
path = "/opt/stalwart/data"
compression = "lz4"

[storage]
data = "rocksdb"
fts = "rocksdb"
blob = "rocksdb"
lookup = "rocksdb"
directory = "internal"

[directory."internal"]
type = "internal"
store = "rocksdb"

# ── TLS — cert issued by cert-manager, synced here by cert-sync.sh ──────────
# Until the first sync runs, install.sh drops a self-signed bootstrap cert so
# the TLS listeners can start. cert-sync replaces it with the real LE cert.
[certificate."default"]
cert = "%{file:/opt/stalwart/etc/tls/cert.pem}%"
private-key = "%{file:/opt/stalwart/etc/tls/key.pem}%"
default = true

# ── Authentication ─────────────────────────────────────────────────────────
# Fallback admin is what platform-api uses for Basic auth on the JMAP
# management API (STALWART_ADMIN_USER/PASSWORD on the platform-api side).
[authentication]
fallback-admin.user = "admin"
fallback-admin.secret = "$env{STALWART_ADMIN_PASSWORD}"

# ── Resolver ───────────────────────────────────────────────────────────────
# DNSSEC-aware system resolver. Mail deliverability depends on clean DNS.
[resolver]
type = "system"
preserve-intermediates = true
concurrency = 4

# ── Spam filtering — built-in filter ON in production ──────────────────────
[spam-filter]
enable = true

# ── Logging — journald captures stdout ─────────────────────────────────────
[tracer."stdout"]
type = "stdout"
level = "info"
ansi = false
enable = true

# ── Audit webhook → platform-api (via the public api ingress) ──────────────
# Stalwart on the host reaches platform-api through Traefik on the public
# hostname; HMAC-signed so a public endpoint is safe.
[webhook."audit-ingest"]
url = "https://api.dezky.eu/ingest/stalwart/webhook"
signature-key = "$env{STALWART_WEBHOOK_SECRET}"
events = [
  "auth.success",
  "auth.failure",
  "auth.banned",
  "account.created",
  "account.deleted",
  "account.password-changed",
  "message.rejected",
  "policy.rejection",
  "dkim.failure",
  "dmarc.failure",
  "spam.detected",
]
throttle = "1s"
+144
@@ -0,0 +1,144 @@
#!/usr/bin/env bash
#
# Install Stalwart mail server as a hardened host systemd service on the AX41.
# Run AFTER bootstrap.sh (and ideally after k3s registration, so cert-sync can
# immediately pull the real cert). Idempotent — safe to re-run to upgrade.
#
#   sudo ./install.sh
#
# What it does: creates the stalwart user + /opt/stalwart layout, downloads the
# Stalwart binary (STALWART_VERSION; defaults to latest, so pin it), installs
# config.toml + the secrets EnvironmentFile, drops a self-signed bootstrap cert
# (replaced later by cert-sync), and installs the systemd units (mail + cert-sync).

set -euo pipefail

RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
info() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HOST_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
CONFIG_FILE="${CONFIG_FILE:-$HOST_DIR/config.env}"

PREFIX="/opt/stalwart"
STALWART_REPO="${STALWART_REPO:-stalwartlabs/mail-server}"

if [[ $EUID -ne 0 ]]; then
  error "Run as root."
  exit 1
fi
if [[ ! -f "$CONFIG_FILE" ]]; then
  error "Missing $CONFIG_FILE — fill in the STALWART_* values first."
  exit 1
fi
# shellcheck disable=SC1090
source "$CONFIG_FILE"

: "${STALWART_ADMIN_PASSWORD:?STALWART_ADMIN_PASSWORD required (openssl rand -hex 24)}"
: "${STALWART_WEBHOOK_SECRET:?STALWART_WEBHOOK_SECRET required (openssl rand -hex 32)}"
: "${STALWART_VERSION:=latest}"

# ── Step 1: user + directory layout ────────────────────────────────────────
info "Step 1: stalwart user + ${PREFIX} layout..."
if ! id -u stalwart >/dev/null 2>&1; then
  useradd --system --home-dir "$PREFIX" --shell /usr/sbin/nologin stalwart
fi
install -d -o stalwart -g stalwart -m 0750 "$PREFIX" "$PREFIX/bin" "$PREFIX/data" "$PREFIX/logs"
install -d -o stalwart -g stalwart -m 0750 "$PREFIX/etc" "$PREFIX/etc/tls"
ok "Layout ready."

# ── Step 2: download the Stalwart binary ───────────────────────────────────
info "Step 2: fetching Stalwart binary (${STALWART_REPO}@${STALWART_VERSION})..."
arch="$(uname -m)"
case "$arch" in
  x86_64) target="x86_64-unknown-linux-gnu" ;;
  aarch64) target="aarch64-unknown-linux-gnu" ;;
  *) error "Unsupported arch: $arch"; exit 1 ;;
esac

if [[ "$STALWART_VERSION" == "latest" ]]; then
  api="https://api.github.com/repos/${STALWART_REPO}/releases/latest"
  warn "Using 'latest' — pin STALWART_VERSION to a tag in config.env after this install."
else
  api="https://api.github.com/repos/${STALWART_REPO}/releases/tags/${STALWART_VERSION}"
fi

asset_url="$(curl -fsSL "$api" \
  | grep -oE "https://[^\"]+${target}[^\"]*\.tar\.gz" \
  | head -n1 || true)"
if [[ -z "$asset_url" ]]; then
  error "Could not find a ${target} .tar.gz asset in ${STALWART_REPO}@${STALWART_VERSION}."
  error "Check the release assets or set STALWART_REPO/STALWART_VERSION."
  exit 1
fi

tmp="$(mktemp -d)"; trap 'rm -rf "$tmp"' EXIT
info "Downloading $asset_url"
curl -fsSL "$asset_url" -o "$tmp/stalwart.tar.gz"
tar -xzf "$tmp/stalwart.tar.gz" -C "$tmp"
bin="$(find "$tmp" -type f \( -name stalwart -o -name stalwart-mail \) | head -n1)"
if [[ -z "$bin" ]]; then
  error "No 'stalwart'/'stalwart-mail' binary found in the archive."
  exit 1
fi
systemctl stop stalwart-mail 2>/dev/null || true
install -o stalwart -g stalwart -m 0755 "$bin" "$PREFIX/bin/stalwart"
ok "Installed $("$PREFIX/bin/stalwart" --version 2>/dev/null || echo 'stalwart binary')."

# ── Step 3: config + secrets EnvironmentFile ───────────────────────────────
info "Step 3: config.toml + secrets env..."
install -o stalwart -g stalwart -m 0640 "$SCRIPT_DIR/config.toml" "$PREFIX/etc/config.toml"
umask 077
cat > "$PREFIX/etc/stalwart.env" <<EOF
# Generated by install.sh from config.env — DO NOT commit.
STALWART_ADMIN_PASSWORD=${STALWART_ADMIN_PASSWORD}
STALWART_WEBHOOK_SECRET=${STALWART_WEBHOOK_SECRET}
EOF
chown root:stalwart "$PREFIX/etc/stalwart.env"
chmod 0640 "$PREFIX/etc/stalwart.env"
ok "Config + secrets installed."

# ── Step 4: self-signed bootstrap cert (only if none yet) ──────────────────
if [[ ! -s "$PREFIX/etc/tls/cert.pem" ]]; then
  info "Step 4: generating self-signed bootstrap cert (cert-sync replaces it)..."
  openssl req -x509 -newkey rsa:2048 -nodes -days 3650 \
    -keyout "$PREFIX/etc/tls/key.pem" \
    -out "$PREFIX/etc/tls/cert.pem" \
    -subj "/CN=mail.dezky.eu" >/dev/null 2>&1
  chown stalwart:stalwart "$PREFIX/etc/tls/"*.pem
  chmod 0644 "$PREFIX/etc/tls/cert.pem"; chmod 0640 "$PREFIX/etc/tls/key.pem"
  ok "Bootstrap cert in place."
else
  ok "Step 4: TLS cert already present — keeping it."
fi

# ── Step 5: cert-sync + systemd units ──────────────────────────────────────
|
||||
info "Step 5: installing cert-sync + systemd units..."
|
||||
install -o root -g root -m 0755 "$SCRIPT_DIR/cert-sync.sh" "$PREFIX/cert-sync.sh"
|
||||
install -m 0644 "$SCRIPT_DIR/stalwart-mail.service" /etc/systemd/system/stalwart-mail.service
|
||||
install -m 0644 "$SCRIPT_DIR/stalwart-cert-sync.service" /etc/systemd/system/stalwart-cert-sync.service
|
||||
install -m 0644 "$SCRIPT_DIR/stalwart-cert-sync.timer" /etc/systemd/system/stalwart-cert-sync.timer
|
||||
systemctl daemon-reload
|
||||
systemctl enable --now stalwart-mail.service
|
||||
systemctl enable --now stalwart-cert-sync.timer
|
||||
ok "Services enabled."
|
||||
|
||||
# Try an immediate cert sync (no-op until cert-manager has issued the secret)
|
||||
"$PREFIX/cert-sync.sh" || true
|
||||
|
||||
echo ""
|
||||
echo "╔══════════════════════════════════════════════════════════════╗"
|
||||
echo "║ Stalwart installed & running ║"
|
||||
echo "╚══════════════════════════════════════════════════════════════╝"
|
||||
systemctl --no-pager --lines=0 status stalwart-mail || true
|
||||
echo ""
|
||||
warn "Follow-ups:"
|
||||
warn " • PTR/rDNS for the server IP MUST be 'mail.dezky.eu' (Hetzner Robot)."
|
||||
warn " • Publish DNS at simply.com: MX → mail.dezky.eu, SPF, DMARC; per-domain"
|
||||
warn " DKIM records come from Stalwart's dnsZoneFile via platform-api."
|
||||
warn " • platform-api (k3s) env: STALWART_API_URL=http://<node-ip>:8080"
|
||||
warn " STALWART_ADMIN_USER=admin STALWART_ADMIN_PASSWORD=<same as here>"
|
||||
warn " STALWART_WEBHOOK_SECRET=<same as here> STALWART_PROVISIONING_ENABLED=true"
|
||||
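
Note on the first follow-up: the PTR requirement can be checked from any machine that has `dig`. The sketch below is one possible check, not part of install.sh; `ptr_matches` and `SERVER_IP` are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper (not part of install.sh): compare a PTR answer with the
# expected mail hostname. DNS answers usually carry a trailing dot and are
# case-insensitive, so both sides are normalised before comparing.
# The body runs in a subshell so the helper variables do not leak.
ptr_matches() (
    answer=$(printf '%s' "${1%.}" | tr '[:upper:]' '[:lower:]')
    expected=$(printf '%s' "${2%.}" | tr '[:upper:]' '[:lower:]')
    [ -n "$answer" ] && [ "$answer" = "$expected" ]
)

# Usage (SERVER_IP is an assumed variable holding the host's public IPv4):
#   ptr=$(dig +short -x "$SERVER_IP" | head -n1)
#   ptr_matches "$ptr" "mail.dezky.eu" || echo "PTR mismatch: got '${ptr:-none}'" >&2
```

Keeping the comparison separate from the `dig` call means the matching logic can be exercised without network access.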
@@ -0,0 +1,10 @@
# Oneshot: sync the mail TLS cert from the cluster to Stalwart.
# Triggered by stalwart-cert-sync.timer.
[Unit]
Description=Sync mail.dezky.eu TLS cert from cluster to Stalwart
After=network-online.target k3s.service
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/opt/stalwart/cert-sync.sh
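
Note: cert-sync.sh itself is not shown in this part of the diff. Because the oneshot runs repeatedly, a sync step of this kind is usually written so that it only touches the files and reloads the service when the certificate has actually changed. The following is a minimal sketch of that change-detection pattern under stated assumptions; `install_if_changed` and the `/tmp/new-cert.pem` path are hypothetical and may not match what the real script does.

```shell
# Hypothetical sketch: replace DST with SRC only when the content differs.
# Returns 0 when DST was replaced (the caller should reload the service),
# 1 when DST already matches SRC.
install_if_changed() {
    src=$1 dst=$2 mode=$3
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        return 1
    fi
    tmp="$dst.tmp.$$"
    # Write next to DST, then rename. A rename within one directory is
    # atomic, so a reader never sees a half-written PEM file.
    install -m "$mode" "$src" "$tmp" && mv -f "$tmp" "$dst"
}

# Usage (paths under /opt/stalwart/etc/tls follow the unit files in this commit):
#   if install_if_changed /tmp/new-cert.pem /opt/stalwart/etc/tls/cert.pem 0644; then
#       systemctl reload stalwart-mail.service
#   fi
```

Returning a distinct status for "unchanged" keeps the twice-daily timer runs quiet: no file writes and no reload unless cert-manager has issued a new certificate.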
@@ -0,0 +1,12 @@
# Run cert-sync shortly after boot and every 12h thereafter. cert-manager
# renews well before expiry, so twice-daily comfortably picks up new certs.
[Unit]
Description=Periodic mail TLS cert sync for Stalwart

[Timer]
OnBootSec=3min
OnUnitActiveSec=12h
Persistent=true

[Install]
WantedBy=timers.target
@@ -0,0 +1,39 @@
# Dezky — Stalwart mail server (bare-metal host service).
#
# Secrets (admin password, webhook secret) come from the EnvironmentFile, which
# install.sh generates from config.env. The binary needs CAP_NET_BIND_SERVICE
# to bind the privileged mail ports (25/143/...) while running as a non-root user.

[Unit]
Description=Stalwart Mail Server (Dezky)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=stalwart
Group=stalwart
EnvironmentFile=/opt/stalwart/etc/stalwart.env
ExecStart=/opt/stalwart/bin/stalwart --config /opt/stalwart/etc/config.toml
# Stalwart reloads its TLS certs / config on SIGHUP — used by cert-sync.
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

# Bind privileged ports without full root
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE

# Hardening — Stalwart only needs to write under /opt/stalwart
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=/opt/stalwart/data /opt/stalwart/logs /opt/stalwart/etc/tls
ProtectKernelTunables=true
ProtectControlGroups=true
RestrictSUIDSGID=true

[Install]
WantedBy=multi-user.target
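
Note: the hardening directives in this unit are easy to lose in a later edit. The sketch below is one way to lint the installed unit file for a few of them; `unit_has` and `check_hardening` are hypothetical helpers, not part of this commit, and the directive list is only a sample of what the unit sets.

```shell
# Hypothetical lint helper: succeed only if UNIT_FILE contains the exact
# line KEY=VALUE (systemd directives are one per line, no inline comments).
unit_has() {
    grep -qxF -- "$2=$3" "$1"
}

# Check a handful of the hardening directives from stalwart-mail.service.
# Prints each missing directive to stderr and returns non-zero if any is absent.
check_hardening() {
    unit=$1 rc=0
    for kv in NoNewPrivileges=true ProtectSystem=strict PrivateTmp=true \
              CapabilityBoundingSet=CAP_NET_BIND_SERVICE; do
        if ! unit_has "$unit" "${kv%%=*}" "${kv#*=}"; then
            echo "missing: $kv" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   check_hardening /etc/systemd/system/stalwart-mail.service
```

On a live host, `systemd-analyze security stalwart-mail.service` gives a fuller exposure report; the file-based check above is meant for review or CI, where systemd is not running.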