fb4ff48617cec5ead8e057623082dcb586057f61
11 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
25d932d3c1 |
fix(domains): platform tenant slug is configurable (prod: dezky-aps)
ci / changes (push) Successful in 4s
ci / tc_portal (push) Has been skipped
ci / tc_booking (push) Has been skipped
ci / tc_operator (push) Has been skipped
ci / tc_website (push) Has been skipped
ci / tc_platform_api (push) Successful in 23s
ci / build_portal (push) Has been skipped
ci / build_booking (push) Has been skipped
ci / build_operator (push) Has been skipped
ci / test_platform_api (push) Successful in 32s
ci / build_platform_api (push) Successful in 18s
ci / deploy (push) Successful in 41s
The company tenant ended up as slug dezky-aps (the seeded 'dezky' tenant was deleted), so the hardcoded apex allowance for slug 'dezky' would have rejected adding dezky.eu to the real tenant. PLATFORM_TENANT_SLUG env (default 'dezky') now names the only tenant allowed to claim the PLATFORM_TENANT_DOMAIN apex. |
||
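A minimal sketch of how the configurable apex rule above might be evaluated. The helper names `platformTenantSlug` and `mayClaimApex` and the `Env` type are hypothetical; only the `PLATFORM_TENANT_SLUG` and `PLATFORM_TENANT_DOMAIN` variables and the `'dezky'` default come from the commit message.

```typescript
// Hypothetical helpers; the actual check lives in DomainsService.add.
type Env = Record<string, string | undefined>;

/** Slug of the only tenant allowed to claim the platform apex. */
function platformTenantSlug(env: Env): string {
  // Falls back to 'dezky' when the variable is unset or blank.
  const value = env.PLATFORM_TENANT_SLUG?.trim();
  return value ? value : "dezky";
}

/** True when the apex rule does not block `tenantSlug` from adding `domain`. */
function mayClaimApex(env: Env, tenantSlug: string, domain: string): boolean {
  const apex = env.PLATFORM_TENANT_DOMAIN?.trim().toLowerCase();
  // Assumption: with no apex configured, or a different domain, this rule does not apply.
  if (!apex || domain.trim().toLowerCase() !== apex) return true;
  return tenantSlug === platformTenantSlug(env);
}
```

With `PLATFORM_TENANT_SLUG=dezky-aps`, the tenant `dezky-aps` can claim `dezky.eu` while a tenant named `dezky` cannot.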
|
|
f66a343472 |
fix(infra): Stalwart v0.16 management admin is a real account (admin@dezky.eu)
ci / changes (push) Successful in 3s
ci / tc_operator (push) Has been skipped
ci / build_portal (push) Has been skipped
ci / build_operator (push) Has been skipped
ci / build_platform_api (push) Has been skipped
ci / tc_portal (push) Has been skipped
ci / tc_booking (push) Has been skipped
ci / tc_website (push) Has been skipped
ci / tc_platform_api (push) Has been skipped
ci / test_platform_api (push) Has been skipped
ci / build_booking (push) Has been skipped
ci / deploy (push) Successful in 42s
The v0.16 config migration silently dropped the fallback admin — the live server had ZERO accounts, so every platform-api JMAP call 401'd and tenant mail provisioning was dead. Bootstrapped via recovery mode on node1 (STALWART_RECOVERY_ADMIN): created the dezky.eu domain + an admin account with the Admin role and the existing STALWART_ADMIN_PASSWORD. v0.16 logins use the full address, so STALWART_ADMIN_USER becomes admin@dezky.eu; config-rev annotation bump rolls platform-api so it picks up the new env. install.sh follow-ups now document the recovery-mode bootstrap for rebuilds instead of the defunct fallback-admin promise. |
||
|
|
a43a172449 |
feat(domains): reserve the platform namespace + one workspace per domain
ci / changes (push) Successful in 4s
ci / tc_portal (push) Has been skipped
ci / build_operator (push) Has been skipped
ci / test_platform_api (push) Successful in 34s
ci / tc_booking (push) Has been skipped
ci / tc_operator (push) Has been skipped
ci / tc_website (push) Has been skipped
ci / tc_platform_api (push) Successful in 23s
ci / build_portal (push) Has been skipped
ci / build_booking (push) Has been skipped
ci / build_platform_api (push) Successful in 18s
ci / deploy (push) Successful in 41s
dezky.eu doubles as the platform's infrastructure domain AND the company's
own employee mail domain (added to the dezky tenant via the normal Domains
flow). Guard rails in DomainsService.add:
- a domain already used by ANY other workspace is rejected — Stalwart's
idempotent ensureDomain would otherwise silently share one mail domain
(and its mailboxes) between tenants
- the PLATFORM_TENANT_DOMAIN apex is claimable only by the dezky tenant;
everything under it (per-tenant service domains, auth/api/mail/* infra
hosts) is reserved outright
Set PLATFORM_TENANT_DOMAIN=dezky.eu in the prod ConfigMap (was unset, so
prod service domains would have been {slug}.dezky.local) and align the
seeded dezky tenant's display domain with the environment.
|
||
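The guard rails described above can be sketched as a pure function. This is an illustration under stated assumptions, not the actual `DomainsService.add` code: the `DomainClaim`, `GuardConfig` and `GuardResult` types, the `checkDomainAdd` name, the rejection messages and the order of the checks are all hypothetical.

```typescript
// Hypothetical shapes; the real service presumably reads these from its database and config.
interface DomainClaim {
  domain: string;
  tenantSlug: string;
}

interface GuardConfig {
  platformDomain: string;     // value of PLATFORM_TENANT_DOMAIN, e.g. "dezky.eu"
  platformTenantSlug: string; // the one tenant allowed to claim the apex
}

type GuardResult = { ok: true } | { ok: false; reason: string };

function checkDomainAdd(
  domain: string,
  tenantSlug: string,
  existing: DomainClaim[],
  cfg: GuardConfig,
): GuardResult {
  const name = domain.trim().toLowerCase();
  const apex = cfg.platformDomain.trim().toLowerCase();

  // One workspace per domain: reject if any OTHER tenant already uses it.
  const takenElsewhere = existing.some(
    (claim) => claim.domain.toLowerCase() === name && claim.tenantSlug !== tenantSlug,
  );
  if (takenElsewhere) {
    return { ok: false, reason: "domain is already used by another workspace" };
  }

  // Everything under the apex (service domains, infra hosts) is reserved outright.
  if (name.endsWith("." + apex)) {
    return { ok: false, reason: "domain is inside the reserved platform namespace" };
  }

  // The apex itself is claimable only by the platform tenant.
  if (name === apex && tenantSlug !== cfg.platformTenantSlug) {
    return { ok: false, reason: "platform apex is reserved for the platform tenant" };
  }

  return { ok: true };
}
```

Keeping the check free of I/O makes each rule easy to unit-test; the caller would load the existing claims first and only call Stalwart's `ensureDomain` once the result is `ok`.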
|
|
94270c1f22 |
fix(health): env-driven infrastructure probe targets
ci / typecheck (map[dir:apps/booking name:booking]) (push) Successful in 20s
ci / typecheck (map[dir:apps/website name:website]) (push) Successful in 22s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Successful in 28s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 22s
ci / test (push) Successful in 30s
ci / typecheck (map[dir:apps/operator name:operator]) (push) Successful in 23s
ci / build (map[dir:apps/booking name:booking]) (push) Successful in 10s
ci / build (map[dir:apps/operator name:operator]) (push) Successful in 31s
ci / build (map[dir:services/platform-api name:platform-api]) (push) Successful in 15s
ci / build (map[dir:apps/portal name:portal]) (push) Successful in 38s
ci / deploy (push) Successful in 42s
The operator infrastructure page probed docker-compose hostnames (stalwart/postgres/redis/traefik…), which don't resolve in k3s, so 7 of 9 services showed as down. Probe targets now come from HEALTH_* env vars with the compose names as dev defaults; platform-api-config.yaml sets the in-cluster/host addresses. A value of 'disabled' omits a service from the report; this is used for OCIS/Collabora until the files tier is deployed. |
||
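A rough sketch of the env-driven probe resolution described above. The exact variable names are not given in the commit message, so the `HEALTH_<SERVICE>` pattern, the `resolveProbeTargets` name and the shape of the defaults table are assumptions; the compose hostnames and the `'disabled'` behaviour come from the message.

```typescript
type Env = Record<string, string | undefined>;

// Dev defaults: docker-compose service names (hostnames only; illustrative subset).
const DEV_DEFAULTS: Record<string, string> = {
  stalwart: "stalwart",
  postgres: "postgres",
  redis: "redis",
  traefik: "traefik",
};

/** Maps each service to its probe target, skipping services marked 'disabled'. */
function resolveProbeTargets(env: Env): Record<string, string> {
  const targets: Record<string, string> = {};
  for (const service of Object.keys(DEV_DEFAULTS)) {
    // Assumed naming: HEALTH_POSTGRES, HEALTH_REDIS, ...
    const raw = env[`HEALTH_${service.toUpperCase()}`]?.trim();
    if (raw === "disabled") continue; // omitted from the report entirely
    targets[service] = raw ? raw : DEV_DEFAULTS[service];
  }
  return targets;
}
```

With no `HEALTH_*` variables set, the function returns the compose names unchanged, which matches the dev behaviour the commit describes.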
|
|
0840efb759 |
fix(operator,portal): env-driven sign-out URLs + host labels (no more .local in prod)
Operator sign-out hardcoded the dev Authentik end-session URL, so prod logout landed on auth.dezky.local. Mirror the portal's env-driven pattern (NUXT_PUBLIC_AUTH_URL/NUXT_PUBLIC_OPERATOR_URL with .local fallbacks). Expose authUrl/operatorUrl via public runtimeConfig and use them for the Authentik admin links and the cosmetic host labels (sidebar, eyebrows, auth-page hints). Portal: signed-out + webmail copy now derive their hosts from runtime config (new public.mailUrl, NUXT_PUBLIC_MAIL_URL in prod). |
||
|
|
91134c94f5 |
feat(auth): Redis-backed OIDC sessions for portal + operator
ci / typecheck (map[dir:apps/operator name:operator]) (push) Successful in 19s
ci / typecheck (map[dir:apps/booking name:booking]) (push) Successful in 22s
ci / typecheck (map[dir:apps/website name:website]) (push) Successful in 23s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Successful in 28s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 23s
ci / test (push) Successful in 31s
ci / build (map[dir:apps/booking name:booking]) (push) Successful in 9s
ci / build (map[dir:apps/operator name:operator]) (push) Successful in 43s
ci / build (map[dir:services/platform-api name:platform-api]) (push) Successful in 5s
ci / build (map[dir:apps/portal name:portal]) (push) Successful in 51s
ci / deploy (push) Failing after 3m42s
nuxt-oidc-auth persists sessions via useStorage('oidc'), whose default
mount is per-pod memory — broken at >1 replica (random 401s) and every
deploy logged all users out. A nitro plugin now mounts 'oidc' on the
dezky-data Redis (db 1, app-prefixed keys, 14d TTL) when SESSION_REDIS_URL
is set; dev keeps the memory driver with no Redis required. Replicas back
to 2 for both apps.
|
||
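The driver selection described above might look roughly like this. It is a sketch of the decision only, not the actual Nitro plugin: the `sessionStorageFor` name, the `SessionStorage` shape and the `<app>:oidc` key prefix format are hypothetical, and the Redis database index is assumed to be carried in the URL.

```typescript
type Env = Record<string, string | undefined>;

type SessionStorage =
  | { driver: "memory" }
  | { driver: "redis"; url: string; base: string; ttlSeconds: number };

// 14 days, expressed in seconds.
const FOURTEEN_DAYS_SECONDS = 14 * 24 * 60 * 60;

/** Chooses the storage backing the 'oidc' mount for one app. */
function sessionStorageFor(app: string, env: Env): SessionStorage {
  const url = env.SESSION_REDIS_URL?.trim();
  // Dev: no Redis configured, keep the per-process memory driver.
  if (!url) return { driver: "memory" };
  return {
    driver: "redis",
    url, // assumption: the db index (db 1 in the commit) is part of this URL
    base: `${app}:oidc`, // app-prefixed keys so portal and operator don't collide
    ttlSeconds: FOURTEEN_DAYS_SECONDS,
  };
}
```

A plugin would then hand the Redis variant to whatever mounts the `oidc` storage, so every replica reads and writes the same session keys.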
|
|
fd0c5d011b |
fix(infra): single replica for portal/operator (per-pod OIDC sessions)
ci / typecheck (map[dir:apps/booking name:booking]) (push) Successful in 22s
ci / typecheck (map[dir:apps/operator name:operator]) (push) Successful in 24s
ci / typecheck (map[dir:apps/website name:website]) (push) Successful in 21s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Successful in 26s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 21s
ci / test (push) Successful in 30s
ci / build (map[dir:apps/booking name:booking]) (push) Successful in 10s
ci / build (map[dir:apps/operator name:operator]) (push) Successful in 9s
ci / build (map[dir:apps/portal name:portal]) (push) Successful in 6s
ci / build (map[dir:services/platform-api name:platform-api]) (push) Successful in 6s
ci / deploy (push) Successful in 41s
nuxt-oidc-auth stores sessions in per-pod memory. With 2 replicas, any request load-balanced to the pod that didn't handle the login gets a 401; in practice roughly half of all operator API calls failed after sign-in. Stay on one replica until sessions move to shared storage (nitro storage on the dezky-data Redis), then scale back up. Already scaled live; this pins the manifests so the next deploy doesn't undo it. |
||
|
|
b155e34fe6 |
fix(infra): runtime OIDC overrides for prod portal/operator login
ci / typecheck (map[dir:apps/booking name:booking]) (push) Successful in 20s
ci / typecheck (map[dir:apps/operator name:operator]) (push) Successful in 24s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Successful in 26s
ci / typecheck (map[dir:apps/website name:website]) (push) Successful in 23s
ci / build (map[dir:apps/booking name:booking]) (push) Successful in 9s
ci / build (map[dir:apps/operator name:operator]) (push) Successful in 9s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 18s
ci / test (push) Successful in 34s
ci / build (map[dir:apps/portal name:portal]) (push) Successful in 6s
ci / build (map[dir:services/platform-api name:platform-api]) (push) Successful in 6s
ci / deploy (push) Successful in 41s
CI builds the Nuxt images with no env, so nuxt.config bakes empty OIDC client creds and .local Authentik URLs into runtimeConfig — sign-in dead-ended on the app's own /auth/login. Nitro env overrides only apply when the var name matches the runtimeConfig path (oidc.providers.oidc.* -> NUXT_OIDC_PROVIDERS_OIDC_*), so production secrets need that second set of names; the plain NUXT_OIDC_* ones only work in dev. Also pin NUXT_OIDC_TOKEN_KEY/AUTH_SESSION_SECRET so sessions survive pod restarts. Live secrets patched on the cluster accordingly. |
||
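To illustrate the naming rule above, the following sketch derives the override variable name from a runtimeConfig path. It is a simplified approximation of the conversion Nitro performs, and the `runtimeConfigEnvName` helper and the `clientId` key are used for illustration only.

```typescript
/**
 * Approximates the env var name that overrides a runtimeConfig path:
 * NUXT_ prefix, one segment per path key, camelCase split into SNAKE_CASE.
 */
function runtimeConfigEnvName(path: string): string {
  const segments = path.split(".").map((segment) =>
    // Insert "_" at each lower-to-upper boundary, then uppercase: clientId -> CLIENT_ID
    segment.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toUpperCase(),
  );
  return "NUXT_" + segments.join("_");
}
```

For example, `runtimeConfigEnvName("oidc.providers.oidc.clientId")` yields `NUXT_OIDC_PROVIDERS_OIDC_CLIENT_ID`, which is why a shorter `NUXT_OIDC_*` name never reaches the nested provider config at runtime.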
|
|
c60937c5cb |
feat(ci): deploy to k3s straight from the pipeline (drop Flux plan)
ci / build (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / build (map[dir:apps/operator name:operator]) (push) Has been cancelled
ci / build (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / build (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / deploy (push) Has been cancelled
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/operator name:operator]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
Push to main = release: after build, a deploy job pins each app image to the commit SHA (kustomize edit set image), kubectl-applies fleet/apps and waits for the rollouts. The runner already runs in-cluster, so it reaches the API server on the in-cluster service IP with a kubeconfig for the new ci-deployer ServiceAccount (namespace-scoped admin, KUBECONFIG_B64 repo secret).
The drafted Flux sync/image-automation layer is removed: a GitOps controller plus bot tag-bump commits is more machinery than a single-node cluster needs. Sortable image tags and $imagepolicy markers go with it.
Also:
- per-router ACME-safe HTTP->HTTPS redirects for the app ingresses
- platform-api prod config completed (Authentik JWT/JWKS + admin API, Stalwart via the cni0 gateway IP, OCIS/cold-storage placeholders until those tiers exist)
- secrets template/README updated to match |
||
|
|
52e0f5e375 |
feat(operator): production build + k3s deployment
- Dockerfile for the operator app (same pattern as portal/booking).
- Env-driven auth/app base URLs in nuxt.config so one build serves dev (.local) and production (.eu).
- Deployment + Service + Ingress on operator.dezky.eu.
- Add operator to the typecheck matrix. |
||
|
|
35bc7b6c31 |
chore(infra): production manifests + CI for scheduling apps
ci / typecheck (map[dir:apps/booking name:booking]) (push) Has been cancelled
ci / typecheck (map[dir:apps/portal name:portal]) (push) Has been cancelled
ci / typecheck (map[dir:apps/website name:website]) (push) Has been cancelled
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
|