Adds the production cluster foundation (authored + applied live on node1):
- cert-manager via the k3s HelmChart controller + letsencrypt staging/prod ClusterIssuers (HTTP-01 / Traefik).
- Longhorn config for single-node (values: replica=1, default StorageClass, Retain) + backup-to-Hetzner-Object-Storage credential template.
- In-cluster data tier (dezky-data): Postgres 16 (with Authentik+OCIS DB init), MongoDB 7, Redis 7 as StatefulSets on Longhorn, + secret template.
- bootstrap.sh: install open-iscsi/nfs-common + enable iscsid (Longhorn prereq).
- RUNBOOK.md: full reproducible node1 build order.
Real secrets are generated on-box and kept in Bitwarden, never in git.
Dezky production — node1 build runbook
The actual, reproducible order used to stand up node1.dezky.eu (Hetzner
AX41, 46.4.78.187, Ubuntu 24.04). If the box is lost, follow this top to
bottom to rebuild it. Per-layer detail lives in host/README.md,
fleet/cert-manager/, fleet/longhorn/, fleet/data/.
Secrets are never in git. They're generated with openssl rand -hex 24 and stored in Bitwarden. See "Secrets" below for how to read the live values back out of the cluster.
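As a small sketch of the generation step: openssl rand -hex 24 prints 24 random bytes as 48 lowercase hex characters, so the value contains nothing that needs shell quoting. The helper names gen_secret and is_hex_secret are assumptions made for illustration; they are not defined anywhere in the repo.

```shell
# gen_secret: hypothetical wrapper around the runbook's generator.
# 24 random bytes are printed as 48 lowercase hex characters.
gen_secret() {
  openssl rand -hex 24
}

# is_hex_secret: hypothetical sanity check that a value has that shape,
# e.g. before pasting it into Bitwarden.
is_hex_secret() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{48}$'
}
```

A value that fails is_hex_secret was probably truncated or mangled in a copy/paste step.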
Current state (built 2026-06-08)
- Host: hardened via host/bootstrap.sh: dezky admin user, key-only SSH (no root, no passwords), k3s-safe nftables firewall (SSH/6443 → mgmt IPs 46.32.144.38/46.32.144.45; 80/443 + mail → world), fail2ban, unattended-upgrades, open-iscsi + iscsid (Longhorn prereq). dezky has NOPASSWD sudo (/etc/sudoers.d/90-dezky).
- k3s v1.33.11: single node (control-plane/etcd/worker), registered in Rancher (91.99.122.153).
- Longhorn: default StorageClass, numberOfReplicas: 1 (single node).
- cert-manager + letsencrypt-staging / letsencrypt-prod (HTTP-01/Traefik).
- Data tier (dezky-data ns): Postgres 16, Mongo 7, Redis 7 as StatefulSets on Longhorn PVCs. Postgres holds the authentik + ocis DBs.
Reproduce from scratch
1. Host layer
# from laptop
scp -r infrastructure/production/host root@<ip>:/opt/dezky-host
# copy/fill config.env on the box (gitignored — MGMT IPs, ADMIN_SSH_PUBKEY,
# RANCHER_* token/checksum, STALWART_*, RESTIC_*)
ssh root@<ip> 'cd /opt/dezky-host && ./bootstrap.sh'
# set a console/sudo password for the admin user, then (optional) NOPASSWD.
# -t allocates a TTY so passwd and sudo can prompt for the password:
ssh -t root@<ip> 'passwd dezky'
ssh -t dezky@<ip> "echo 'dezky ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/90-dezky && sudo chmod 0440 /etc/sudoers.d/90-dezky"
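Before running bootstrap.sh it can help to confirm that config.env is actually filled in. The following is a minimal sketch, assuming config.env consists of plain KEY=value lines (optionally prefixed with export). The function name check_config_env is hypothetical, and apart from ADMIN_SSH_PUBKEY the runbook does not spell out the exact variable names, so the caller passes them in.

```shell
# check_config_env FILE VAR...
# Hypothetical preflight: returns 0 only if every named variable is
# assigned a non-empty value in FILE; reports the others on stderr.
check_config_env() {
  file=$1
  shift
  missing=0
  for var in "$@"; do
    # Match VAR=value, VAR="value" or VAR='value'; reject VAR= and VAR="".
    if ! grep -Eq "^(export[[:space:]]+)?${var}=[\"']?[^\"'[:space:]]" "$file"; then
      echo "missing or empty: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (variable list is up to the caller):
#   check_config_env /opt/dezky-host/config.env ADMIN_SSH_PUBKEY
```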
2. k3s + kubectl access
ssh dezky@<ip>
sudo /opt/dezky-host/k3s/register.sh # joins the Rancher Custom (K3s) cluster
kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get nodes # -> Ready
# give dezky a kubeconfig:
mkdir -p ~/.kube && sudo install -m 600 -o dezky -g dezky /etc/rancher/k3s/k3s.yaml ~/.kube/config
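The "-> Ready" check above can also be scripted. This is a sketch only: all_nodes_ready is a hypothetical helper that reads the output of kubectl get nodes --no-headers on stdin and relies on STATUS being the second column, which is the default layout.

```shell
# all_nodes_ready: hypothetical filter over `kubectl get nodes --no-headers`.
# Succeeds only if there is at least one node and every node's STATUS
# column is exactly "Ready" (so "NotReady" or "Ready,SchedulingDisabled"
# both count as a failure).
all_nodes_ready() {
  awk '
    NF { n++; if ($2 != "Ready") bad++ }
    END { if (n > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Example:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "cluster is up"
```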
3. Longhorn (storage)
sudo apt-get install -y open-iscsi nfs-common && sudo systemctl enable --now iscsid # (bootstrap.sh does this now)
helm repo add longhorn https://charts.longhorn.io && helm repo update
helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace \
--version 1.12.0 -f fleet/longhorn/values.yaml # replica=1, default class
# one default SC only:
kubectl patch storageclass local-path -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl -n longhorn-system patch settings.longhorn.io default-replica-count --type=merge -p '{"value":"1"}'
kubectl get storageclass # only 'longhorn (default)'
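The "one default SC only" requirement can be checked mechanically rather than by eye. A minimal sketch, assuming the usual kubectl table output where a default class appears as "name (default)"; the function name default_storageclasses is hypothetical.

```shell
# default_storageclasses: hypothetical filter over
# `kubectl get storageclass --no-headers`.
# kubectl marks a default class by printing "(default)" after its name,
# so that marker shows up as the second whitespace-separated field.
default_storageclasses() {
  awk '$2 == "(default)" { print $1 }'
}

# Example: should print exactly one line, "longhorn".
#   kubectl get storageclass --no-headers | default_storageclasses
```

If this prints two names, the local-path patch above has not taken effect and new PVCs may land on either class.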
4. cert-manager + issuers
kubectl apply -f fleet/cert-manager/cert-manager.yaml
kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s
kubectl apply -f fleet/cert-manager/cluster-issuer.yaml
kubectl get clusterissuer # both READY=True
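READY=True on the issuers only shows that the ACME accounts registered. To exercise the HTTP-01 path end to end, one option is to request a throwaway certificate from the staging issuer. The sketch below only prints a manifest; the function name, the Certificate name staging-smoke-test, and the default namespace are assumptions, and the challenge can only succeed for a hostname that already resolves to the node.

```shell
# staging_cert_manifest HOST
# Hypothetical helper: prints a cert-manager Certificate that asks the
# letsencrypt-staging ClusterIssuer for a certificate covering HOST.
staging_cert_manifest() {
  host=$1
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: staging-smoke-test
  namespace: default
spec:
  secretName: staging-smoke-test-tls
  dnsNames:
    - ${host}
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer
EOF
}

# Example:
#   staging_cert_manifest <host> | kubectl apply -f -
#   kubectl -n default get certificate staging-smoke-test
#   staging_cert_manifest <host> | kubectl delete -f -
```

Using the staging issuer for this keeps failed attempts away from the production rate limits.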
5. Data tier
kubectl create namespace dezky-data --dry-run=client -o yaml | kubectl apply -f -
# secrets — generate fresh, store in Bitwarden:
kubectl -n dezky-data create secret generic postgres-secret \
--from-literal=POSTGRES_PASSWORD=$(openssl rand -hex 24) \
--from-literal=AUTHENTIK_DB_PASSWORD=$(openssl rand -hex 24) \
--from-literal=OCIS_DB_PASSWORD=$(openssl rand -hex 24)
kubectl -n dezky-data create secret generic mongo-secret \
--from-literal=root-username=dezky --from-literal=root-password=$(openssl rand -hex 24)
kubectl -n dezky-data create secret generic redis-secret \
--from-literal=REDIS_PASSWORD=$(openssl rand -hex 24)
kubectl apply -k fleet/data/
kubectl -n dezky-data get pods,pvc # all Running, PVCs Bound on longhorn
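The "PVCs Bound on longhorn" check can be made explicit. This is a sketch under assumptions: pvc_problems is a hypothetical helper, and it expects three columns (name, phase, storage class) produced with custom-columns so that the parsing does not depend on the default table layout.

```shell
# pvc_problems: hypothetical filter that prints the name of every PVC
# that is not Bound or is not backed by the "longhorn" StorageClass.
# Expected input columns: NAME PHASE STORAGECLASS
pvc_problems() {
  awk '$2 != "Bound" || $3 != "longhorn" { print $1 }'
}

# Example: no output means every PVC is Bound on longhorn.
#   kubectl -n dezky-data get pvc --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,SC:.spec.storageClassName \
#     | pvc_problems
```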
Secrets — read live values for Bitwarden
k(){ kubectl -n dezky-data get secret "$1" -o jsonpath="{.data.$2}" | base64 -d; echo; }
k postgres-secret POSTGRES_PASSWORD
k postgres-secret AUTHENTIK_DB_PASSWORD # must match Authentik's DB config
k postgres-secret OCIS_DB_PASSWORD # must match OCIS's DB config
k mongo-secret root-password
k redis-secret REDIS_PASSWORD
Still TODO (next layers)
- Authentik (auth.dezky.eu): OIDC for the portal; uses the authentik Postgres DB + Redis.
- OCIS (files): uses the ocis Postgres DB + Hetzner Object Storage (S3).
- Apps: fleet/apps/ (portal · platform-api · booking) + their secrets.
- Stalwart (host): host/stalwart/install.sh; needs DNS + PTR.
- Backups: Longhorn → Hetzner Object Storage (fleet/longhorn/README.md), plus host Restic for the mail store + etcd snapshots, plus pg_dump/mongodump CronJobs.
- DNS: A records api/app/booking/auth/mail.dezky.eu → 46.4.78.187, and PTR for mail.
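Once the DNS item is done, the A records can be verified in one pass. A minimal sketch, assuming dig is installed on the machine running the check; mismatched_records is a hypothetical helper that reads "hostname ip" pairs on stdin and prints every hostname whose address differs from the expected one (including names that do not resolve at all).

```shell
# mismatched_records EXPECTED_IP
# Hypothetical filter: input lines are "<hostname> <resolved-ip>";
# prints the hostnames whose resolved IP is missing or not EXPECTED_IP.
mismatched_records() {
  expected=$1
  awk -v ip="$expected" '$2 != ip { print $1 }'
}

# Example: no output means all five records point at node1.
#   for h in api app booking auth mail; do
#     printf '%s %s\n' "$h.dezky.eu" "$(dig +short A "$h.dezky.eu" | tail -n 1)"
#   done | mismatched_records 46.4.78.187
```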
Access cheatsheet
- SSH: ssh dezky@46.4.78.187 (key only). Root SSH disabled.
- kubectl: works as dezky (kubeconfig at ~/.kube/config).
- Out-of-band if locked out: Hetzner Robot KVM/LARA or Rescue System.
- The level=warning … 50-rancher.yaml: permission denied message from kubectl is harmless noise (k3s kubectl probing a root-only config dir).