# Dezky production — node1 build runbook

The actual, reproducible order used to stand up **node1.dezky.eu** (Hetzner
AX41, `46.4.78.187`, Ubuntu 24.04). If the box is lost, follow this top to
bottom to rebuild it. Per-layer detail lives in `host/README.md`,
`fleet/cert-manager/`, `fleet/longhorn/`, `fleet/data/`.

> Secrets are **never** in git. They're generated with `openssl rand -hex 24`
> and stored in **Bitwarden**. See "Secrets" below for how to read the live
> values back out of the cluster.

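As a small illustration of the secret convention above: `openssl rand -hex 24` emits 24 random bytes as 48 hexadecimal characters, so the value can be passed as a shell argument without escaping. The wrapper name `gen_secret` is hypothetical and not part of the repo.

```bash
# Hypothetical helper (not in the repo): print N random bytes as hex.
# Defaults to 24 bytes, i.e. 48 hex characters, matching this runbook.
gen_secret() {
  openssl rand -hex "${1:-24}"
}

pw=$(gen_secret)
echo "${#pw}"   # 48
```
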
## Current state (built 2026-06-08)

- **Host:** hardened via `host/bootstrap.sh` — `dezky` admin user, **key-only
  SSH** (no root, no passwords), k3s-safe nftables firewall (SSH/6443 → mgmt
  IPs `46.32.144.38`/`46.32.144.45`; 80/443+mail → world), fail2ban,
  unattended-upgrades, `open-iscsi`+`iscsid` (Longhorn prereq).
  `dezky` has **NOPASSWD sudo** (`/etc/sudoers.d/90-dezky`).
- **k3s** v1.33.11 — single node (control-plane/etcd/worker), registered in
  Rancher (`91.99.122.153`).
- **Longhorn** — default StorageClass, `numberOfReplicas: 1` (single node).
- **cert-manager** + `letsencrypt-staging` / `letsencrypt-prod` (HTTP-01/Traefik).
- **Data tier** (`dezky-data` ns) — Postgres 16, Mongo 7, Redis 7 as
  StatefulSets on Longhorn PVCs. Postgres holds the `authentik` + `ocis` DBs.
- **Authentik** (`dezky-auth` ns) — live at https://auth.dezky.eu (LE cert),
  image `2026.5.2`, on our Postgres/Redis. `akadmin` bootstrap login.
- **Traefik** — global HTTP→HTTPS 308 redirect (`fleet/traefik/`).

## Reproduce from scratch

### 1. Host layer
```bash
# from laptop
scp -r infrastructure/production/host root@<ip>:/opt/dezky-host
# copy/fill config.env on the box (gitignored — MGMT IPs, ADMIN_SSH_PUBKEY,
# RANCHER_* token/checksum, STALWART_*, RESTIC_*)
ssh root@<ip> 'cd /opt/dezky-host && ./bootstrap.sh'
# set a console/sudo password for the admin user, then (optional) NOPASSWD.
# -t allocates a TTY so passwd and sudo can prompt for the password:
ssh -t root@<ip> 'passwd dezky'
ssh -t dezky@<ip> "echo 'dezky ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/90-dezky && sudo chmod 0440 /etc/sudoers.d/90-dezky"
```

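A syntax error in a sudoers drop-in can lock `sudo` out entirely, so it may be worth validating the file before installing it. The sketch below is one way to do that with `visudo -cf`; the function names `sudoers_line` and `install_nopasswd` are hypothetical and not part of `bootstrap.sh`.

```bash
# Hypothetical helpers (not in the repo). Run on the box as the admin user.
sudoers_line() {
  printf '%s ALL=(ALL) NOPASSWD:ALL\n' "$1"
}

install_nopasswd() {
  local user=$1 tmp
  tmp=$(mktemp) || return 1
  sudoers_line "$user" > "$tmp"
  # visudo -c -f checks the syntax of the given file only; install it
  # into /etc/sudoers.d just when that check passes.
  if sudo visudo -cf "$tmp"; then
    sudo install -m 0440 -o root -g root "$tmp" "/etc/sudoers.d/90-$user"
  else
    echo "refusing to install invalid sudoers file" >&2
    rm -f "$tmp"
    return 1
  fi
  rm -f "$tmp"
}
```

Usage on the node would be `install_nopasswd dezky`, which produces the same `/etc/sudoers.d/90-dezky` as the one-liner above.
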
### 2. k3s + kubectl access
```bash
ssh dezky@<ip>
sudo /opt/dezky-host/k3s/register.sh          # joins the Rancher Custom (K3s) cluster
kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get nodes   # -> Ready
# give dezky a kubeconfig:
mkdir -p ~/.kube && sudo install -m 600 -o dezky -g dezky /etc/rancher/k3s/k3s.yaml ~/.kube/config
```

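Right after registration the node can take a little while to report `Ready`. Instead of re-running `get nodes` by hand, `kubectl wait` can block until the condition holds. This is a minimal sketch; the wrapper name `wait_nodes_ready` and the default timeout are assumptions, not something the repo defines.

```bash
# Hypothetical wrapper (not in the repo): block until every node reports
# the Ready condition, or fail once the timeout (default 120s) expires.
wait_nodes_ready() {
  kubectl wait --for=condition=Ready node --all --timeout="${1:-120s}"
}
```

Once `~/.kube/config` is in place, `wait_nodes_ready 180s` returns non-zero if the node is still not Ready after three minutes.
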
### 3. Longhorn (storage)
```bash
sudo apt-get install -y open-iscsi nfs-common && sudo systemctl enable --now iscsid   # (bootstrap.sh does this now)
helm repo add longhorn https://charts.longhorn.io && helm repo update
helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace \
  --version 1.12.0 -f fleet/longhorn/values.yaml          # replica=1, default class
# one default SC only:
kubectl patch storageclass local-path -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl -n longhorn-system patch settings.longhorn.io default-replica-count --type=merge -p '{"value":"1"}'
kubectl get storageclass    # only 'longhorn (default)'
```

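The "one default SC only" requirement can also be checked mechanically rather than by eye. The sketch below lists each StorageClass with its `is-default-class` annotation and succeeds only when exactly one is `true`. Both function names are hypothetical.

```bash
# Hypothetical check (not in the repo). Prints "name<TAB>is-default" per class.
list_sc() {
  kubectl get storageclass -o \
    jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
}

# Reads those lines on stdin; succeeds only if exactly one class is "true".
one_default_sc() {
  awk -F'\t' '$2 == "true" { n++ } END { exit n != 1 }'
}
```

A possible use: `list_sc | one_default_sc && echo "exactly one default StorageClass"`.
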
### 4. cert-manager + issuers
```bash
kubectl apply -f fleet/cert-manager/cert-manager.yaml
kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s
kubectl apply -f fleet/cert-manager/cluster-issuer.yaml
kubectl get clusterissuer    # both READY=True
```

### 5. Data tier
```bash
kubectl create namespace dezky-data --dry-run=client -o yaml | kubectl apply -f -
# secrets — generate fresh, store in Bitwarden:
kubectl -n dezky-data create secret generic postgres-secret \
  --from-literal=POSTGRES_PASSWORD=$(openssl rand -hex 24) \
  --from-literal=AUTHENTIK_DB_PASSWORD=$(openssl rand -hex 24) \
  --from-literal=OCIS_DB_PASSWORD=$(openssl rand -hex 24)
kubectl -n dezky-data create secret generic mongo-secret \
  --from-literal=root-username=dezky --from-literal=root-password=$(openssl rand -hex 24)
kubectl -n dezky-data create secret generic redis-secret \
  --from-literal=REDIS_PASSWORD=$(openssl rand -hex 24)
kubectl apply -k fleet/data/
kubectl -n dezky-data get pods,pvc   # all Running, PVCs Bound on longhorn
```

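`kubectl create secret` fails if the secret already exists, and deleting and recreating one would generate a new password that no longer matches the data already initialised on the PVC. On a partially built cluster, a guard like the following keeps the step re-runnable. It is a sketch only; `ensure_secret` is a hypothetical name.

```bash
# Hypothetical guard (not in the repo): create a generic secret only when it
# does not exist yet, so a re-run never replaces passwords already in use.
ensure_secret() {
  local ns=$1 name=$2
  shift 2
  if kubectl -n "$ns" get secret "$name" >/dev/null 2>&1; then
    echo "secret $ns/$name exists, leaving it untouched"
  else
    kubectl -n "$ns" create secret generic "$name" "$@"
  fi
}
```

For example: `ensure_secret dezky-data redis-secret --from-literal=REDIS_PASSWORD="$(openssl rand -hex 24)"`.
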
### 6. Authentik (IdP)
See `fleet/authentik/README.md`. Create the `dezky-auth` namespace and
`authentik-secret` (DB/Redis passwords are read back from `dezky-data` so they
match; `SECRET_KEY` and the bootstrap credentials are generated fresh), then
`kubectl apply -f fleet/authentik/helmchart.yaml`. Reachable at
https://auth.dezky.eu; first login is `akadmin` / `AUTHENTIK_BOOTSTRAP_PASSWORD`.

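The read-back step described above might look roughly like this. Treat it as a sketch under assumptions: the key names inside `authentik-secret` are taken from Authentik's environment variable names, and the layout that `fleet/authentik/helmchart.yaml` actually expects may differ (check `fleet/authentik/README.md`). The function names are hypothetical.

```bash
# Read one key of a secret in dezky-data and decode it.
data_secret() {
  kubectl -n dezky-data get secret "$1" -o jsonpath="{.data.$2}" | base64 -d
}

# ASSUMPTION: key names below mirror Authentik's env var names; the real
# helmchart.yaml may reference different keys.
create_authentik_secret() {
  kubectl create namespace dezky-auth --dry-run=client -o yaml | kubectl apply -f -
  kubectl -n dezky-auth create secret generic authentik-secret \
    --from-literal=AUTHENTIK_POSTGRESQL__PASSWORD="$(data_secret postgres-secret AUTHENTIK_DB_PASSWORD)" \
    --from-literal=AUTHENTIK_REDIS__PASSWORD="$(data_secret redis-secret REDIS_PASSWORD)" \
    --from-literal=AUTHENTIK_SECRET_KEY="$(openssl rand -hex 32)" \
    --from-literal=AUTHENTIK_BOOTSTRAP_PASSWORD="$(openssl rand -hex 24)"
}
```

Reading the DB and Redis passwords from the live secrets, rather than from Bitwarden by hand, avoids copy-paste mismatches between the two namespaces.
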
### 7. Traefik — global HTTP→HTTPS redirect
```bash
kubectl apply -f fleet/traefik/helmchartconfig.yaml
kubectl -n kube-system delete job helm-install-traefik   # force the controller to re-run with merged values
# verify: curl -sI http://auth.dezky.eu  -> 308 -> https://auth.dezky.eu/
```

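The verification comment above can be turned into a pass/fail check. The sketch below reads `curl -sI` response headers on stdin and succeeds only if the status is 308 and the `Location` header points at an `https://` URL. `is_https_redirect` is a hypothetical name.

```bash
# Hypothetical check (not in the repo). Reads HTTP response headers on stdin;
# succeeds if the status code is 308 and Location is an https:// URL.
is_https_redirect() {
  tr -d '\r' | awk '
    NR == 1 && $2 == "308"                           { status = 1 }
    tolower($1) == "location:" && $2 ~ /^https:\/\// { loc = 1 }
    END { exit !(status && loc) }'
}
```

A possible use: `curl -sI http://auth.dezky.eu | is_https_redirect && echo "redirect OK"`.
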
## Secrets — read live values for Bitwarden

```bash
k(){ kubectl -n dezky-data get secret "$1" -o jsonpath="{.data.$2}" | base64 -d; echo; }
k postgres-secret POSTGRES_PASSWORD
k postgres-secret AUTHENTIK_DB_PASSWORD   # must match Authentik's DB config
k postgres-secret OCIS_DB_PASSWORD        # must match OCIS's DB config
k mongo-secret    root-password
k redis-secret    REDIS_PASSWORD
```

## Still TODO (next layers)

1. **Authentik** — ✅ deployed (`auth.dezky.eu`). Remaining: OIDC app
   blueprints (portal + operator, with prod redirect URLs + client secrets) and
   the cosmetic rebrand. See `fleet/authentik/README.md`.
2. **OCIS** (files) — uses the `ocis` Postgres DB + Hetzner Object Storage (S3).
3. **Apps** — `fleet/apps/` (portal · platform-api · booking) + their secrets.
4. **Stalwart** (host) — `host/stalwart/install.sh`; needs DNS + PTR.
5. **Backups** — Longhorn → Hetzner Object Storage (`fleet/longhorn/README.md`),
   plus host Restic for the mail store + etcd snapshots, plus pg_dump/mongodump
   CronJobs.
6. **DNS** — A records `api`/`app`/`booking`/`auth`/`mail`.dezky.eu → 46.4.78.187,
   and PTR for mail.

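For the DNS item, a quick check that each A record already points at the node could look like the sketch below. It assumes `dig` is installed on the machine running it; `check_dns` is a hypothetical name.

```bash
# Hypothetical check (not in the repo): report hostnames whose A records do
# not include the expected IP. Returns non-zero if any name fails.
check_dns() {
  local ip=$1 rc=0 name
  shift
  for name in "$@"; do
    if dig +short A "$name" | grep -qxF "$ip"; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      rc=1
    fi
  done
  return "$rc"
}
```

A possible use: `check_dns 46.4.78.187 {api,app,booking,auth,mail}.dezky.eu`. The PTR record for mail needs a separate reverse lookup.
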
## Access cheatsheet
- SSH: `ssh dezky@46.4.78.187` (key only). Root SSH disabled.
- kubectl: works as `dezky` (kubeconfig at `~/.kube/config`).
- Out-of-band if locked out: Hetzner Robot KVM/LARA or Rescue System.
- The `level=warning … 50-rancher.yaml: permission denied` from kubectl is
  harmless noise (k3s kubectl probing a root-only config dir).