Commit Graph

10 Commits

Author SHA1 Message Date
Ronni Baslund 4c5fdde787 fix(infra): docker:24-dind + capacity 2 (fix moby cgroup-v2 teardown deadlock that hung 'Complete job') 2026-06-08 22:56:21 +02:00
Ronni Baslund aef0f44915 chore(infra): act_runner capacity 4 + disable cache server
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 31s
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / test (push) Has been cancelled
Add an act_runner config.yaml (ConfigMap, CONFIG_FILE env): capacity 4 so the
typecheck matrix + image builds run in parallel instead of one-at-a-time, and
cache.enabled: false (we removed the setup-node cache; the cache server isn't
reachable from the DinD job containers anyway).
2026-06-08 22:46:43 +02:00
Ronni Baslund f331e3c1e6 feat(infra): in-cluster Gitea Actions runner (act_runner + dind)
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
Self-registering act_runner on node1 with a privileged docker:dind sidecar so
workflow jobs can build + push app images (k3s has containerd only, no Docker
daemon). Labels ubuntu-latest + docker; state persisted on a Longhorn PVC. The
registration token is applied out-of-band as the gitea-runner-token Secret
(not in git). Verified: runner declared successfully, dind API up.
2026-06-08 22:13:38 +02:00
Ronni Baslund a27c238c76 feat(infra): nightly DB-dump CronJobs feeding the Restic backup
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
pg_dumpall (all Postgres DBs + roles) and mongodump (all Mongo DBs) write
gzipped dumps to the hostPath /opt/dezky-backup/dumps at 02:50/02:52 UTC, which
the host Restic job (03:20) ships to the Storage Box. Each keeps the last 7
local dumps; Restic holds the real off-box retention.

- pods run as root (hostPath dir is root-owned, as is the host Restic reader)
- mongo job uses bash (mongo:7 /bin/sh is dash → no pipefail)
- creds from postgres-secret / mongo-secret via secretKeyRef

Verified: both jobs Complete, dumps present on the host
(postgres-all ~2.2MB w/ Authentik data, mongo archive).
2026-06-08 21:55:14 +02:00
Ronni Baslund 326b626fc6 feat(infra): full dezky rebrand of Authentik login (logo, favicon, bg, footer)
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
Brand CSS only reaches the flow shadow DOM via CSS vars (colors), not the
logo/favicon (deeper shadow root) or the "Powered by authentik" footer (light
DOM). So, dev-style: serve real dezky assets + sed the bundle.

- web-assets/: dezky-logo.svg, dezky-favicon.svg, dezky-bg.svg (carbon).
- server-rebrand.py: patches the authentik-server Deployment with an
  initContainer that copies /web/dist to an emptyDir, drops the svgs into
  assets/icons, and seds "Powered by authentik" -> "Powered by Dezky".
- brand.yaml: branding_logo / branding_favicon / branding_default_flow_background
  point at the served svgs; auth-flow title "Welcome to Dezky"; signal-green CSS.

Verified live: login now matches dev (logo, title, carbon bg, green button,
favicon, Powered by Dezky). Durability caveat documented (reverts on helm
upgrade).
2026-06-08 20:36:01 +02:00
Ronni Baslund 99cd86cd3a feat(infra): full dezky branding on Authentik (logo, carbon bg, flow title)
ci / typecheck (map[dir:apps/booking name:booking]) (push) Failing after 7s
ci / test (push) Failing after 7s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Failing after 6s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
branding_logo / branding_default_flow_background are file-path fields (reject
data URIs), so the dezky logo + carbon background are injected via the brand's
custom CSS (data URIs allowed there): logo replaces the authentik wordmark,
background overrides the forest. Auth-flow title -> "Welcome to Dezky".
Signal-green primary button retained.
2026-06-08 19:54:44 +02:00
Ronni Baslund db1354a151 feat(infra): Authentik blueprints (portal+operator OIDC, dezky brand)
ci / typecheck (map[dir:apps/booking name:booking]) (push) Failing after 6s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / test (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
Mirror the dev Authentik config in prod via blueprints, applied & successful on
node1:
- brand.yaml: dezky branding on the default brand (title + signal-green custom
  CSS) — login page now in dezky colors.
- portal-application.yaml / operator-application.yaml: dezky-portal &
  dezky-operator OIDC apps/providers (prod redirect URLs) + the
  dezky-platform-admins group & operator access policy.

Two 2026.5 gotchas handled + documented in README:
- invalidation_flow is now REQUIRED on OAuth2 providers (added via !Find).
- ConfigMap mounts are symlinks (discovery can't read them) → worker uses an
  initContainer that copies them to an emptyDir as real files. (chart
  worker.volumes didn't apply on this version; patch reverts on helm upgrade —
  noted as a durability TODO.)

Client secrets (PORTAL/OPERATOR_OIDC_CLIENT_SECRET) live in authentik-secret;
the apps must reuse them.
2026-06-08 19:46:48 +02:00
Ronni Baslund 406e2ca78b feat(infra): deploy Authentik (auth.dezky.eu) + global HTTP→HTTPS redirect
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
- Authentik on the in-cluster Postgres/Redis (mirrors the dev compose config:
  external DB/Redis, error-reporting off, update-check off, bootstrap admin),
  via the k3s Helm controller; Ingress + cert-manager letsencrypt-prod. Live at
  https://auth.dezky.eu (image 2026.5.2). Secrets generated on-box (Bitwarden).
- Traefik HelmChartConfig: global :80 -> :443 (308) redirect via
  additionalArguments (to=:443, HTTP-01-safe).
- RUNBOOK updated.

Deferred (mirror remaining dev bits): OIDC app blueprints (portal/operator with
prod URLs) + the cosmetic "Powered by Dezky" rebrand.
2026-06-08 19:00:07 +02:00
Ronni Baslund 153d7053ca feat(infra): k3s foundation — cert-manager, Longhorn config, in-cluster data tier
ci / typecheck (map[dir:apps/website name:website]) (push) Failing after 10m58s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Failing after 11m56s
ci / typecheck (map[dir:apps/booking name:booking]) (push) Failing after 14m0s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
Adds the production cluster foundation (authored + applied live on node1):
- cert-manager via the k3s HelmChart controller + letsencrypt staging/prod
  ClusterIssuers (HTTP-01 / Traefik).
- Longhorn config for single-node (values: replica=1, default StorageClass,
  Retain) + backup-to-Hetzner-Object-Storage credential template.
- In-cluster data tier (dezky-data): Postgres 16 (with Authentik+OCIS DB init),
  MongoDB 7, Redis 7 as StatefulSets on Longhorn, + secret template.
- bootstrap.sh: install open-iscsi/nfs-common + enable iscsid (Longhorn prereq).
- RUNBOOK.md: full reproducible node1 build order.

Real secrets are generated on-box and kept in Bitwarden — never in git.
2026-06-08 18:39:31 +02:00
Ronni Baslund 35bc7b6c31 chore(infra): production manifests + CI for scheduling apps
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
2026-06-07 09:27:44 +02:00