docs(runbook): app tier + push-to-deploy CI/CD flow
ci / typecheck (map[dir:apps/booking name:booking]) (push) Successful in 20s
ci / typecheck (map[dir:apps/operator name:operator]) (push) Successful in 23s
ci / typecheck (map[dir:apps/website name:website]) (push) Successful in 20s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Successful in 26s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Successful in 22s
ci / test (push) Successful in 32s
ci / build (map[dir:apps/booking name:booking]) (push) Successful in 9s
ci / build (map[dir:apps/operator name:operator]) (push) Successful in 9s
ci / build (map[dir:apps/portal name:portal]) (push) Successful in 6s
ci / build (map[dir:services/platform-api name:platform-api]) (push) Successful in 5s
ci / deploy (push) Successful in 41s
Bring the runbook up to the 2026-06-10 state: app tier + CI/CD in current state, a Deploy flow section (push to main = release, rollback, break-glass, required Gitea secrets), reproduce steps 8-9 (app tier secrets+apply, CI runner + ci-deployer with the runner gotchas), per-router ACME-safe redirect instead of the old global one, platform-api key read-back for Bitwarden, and a pruned TODO list.
@@ -9,7 +9,7 @@ bottom to rebuild it. Per-layer detail lives in `host/README.md`,
 > and stored in **Bitwarden**. See "Secrets" below for how to read the live
 > values back out of the cluster.

-## Current state (built 2026-06-08)
+## Current state (built 2026-06-08, app tier + CI/CD 2026-06-10)

 - **Host:** hardened via `host/bootstrap.sh` — `dezky` admin user, **key-only
 SSH** (no root, no passwords), k3s-safe nftables firewall (SSH/6443 → mgmt
@@ -23,8 +23,42 @@ bottom to rebuild it. Per-layer detail lives in `host/README.md`,
 - **Data tier** (`dezky-data` ns) — Postgres 16, Mongo 7, Redis 7 as
 StatefulSets on Longhorn PVCs. Postgres holds the `authentik` + `ocis` DBs.
 - **Authentik** (`dezky-auth` ns) — live at https://auth.dezky.eu (LE cert),
-image `2026.5.2`, on our Postgres/Redis. `akadmin` bootstrap login.
-- **Traefik** — global HTTP→HTTPS 308 redirect (`fleet/traefik/`).
+chart pinned `2026.5.2`, on our Postgres/Redis. Portal + operator OIDC app
+blueprints applied (`fleet/authentik/blueprints/`).
+- **Stalwart** (host, not k3s) — mail on the bare host; JMAP management API
+reachable from pods at `http://10.42.0.1:8080` (cni0 gateway).
+- **Traefik** — per-router HTTP→HTTPS redirect via `redirectScheme`
+Middleware on each Ingress (`web,websecure` entrypoints). **No global
+entrypoint redirect** — that breaks cert-manager HTTP-01 (`fleet/traefik/`).
+- **App tier** (`dezky-apps` ns) — portal (`app.dezky.eu`), platform-api
+(`api.dezky.eu`), booking (`booking.dezky.eu`), operator
+(`operator.dezky.eu`). See `fleet/README.md`.
+- **CI/CD** (`gitea-runner` ns) — in-cluster `gitea/runner:1.0.8` + dind
+sidecar. **Push to main = deploy** (see "Deploy flow" below).
+- **Registry hygiene** — Gitea package cleanup rule (user-level, Container
+type): keep newest 5 versions per image + `latest`, remove older than 7
+days. Applied by Gitea's daily cleanup cron.

+## Deploy flow (day-to-day)
+
+Push to `main` on Gitea → `.gitea/workflows/ci.yml` runs in-cluster:
+**typecheck + test → docker build + push** (each app image tagged `:latest` +
+the commit SHA, to `git.lastcloud.io/ronnibaslund/dezky/<app>`) → **deploy**
+(`kustomize edit set image` pins the SHA, `kubectl apply -k fleet/apps`,
+waits for rollouts). No GitOps controller, no manual steps. Push-to-live is
+~2 min with a warm build cache, 5–10 min after a runner pod restart (the dind
+layer cache is an emptyDir).
+
+- **Watch:** repo → Actions in Gitea, or
+`kubectl -n dezky-apps get deploy -o wide` (image column shows the SHA).
+- **Rollback:** re-run an older green run from the Gitea Actions UI, or
+`kubectl -n dezky-apps set image deploy/<app> <app>=git.lastcloud.io/ronnibaslund/dezky/<app>:<old-sha>`.
+- **Break-glass (runner down):** `kubectl apply -k fleet/apps/` by hand —
+manifests reference `:latest`.
+- **Gitea Actions secrets** (repo Settings → Actions → Secrets):
+`KUBECONFIG_B64` (ci-deployer kubeconfig, see step 9) and `REGISTRY_TOKEN`
+(Gitea PAT with package read+write — the per-job GITHUB_TOKEN is NOT
+accepted by the container registry).
+
 ## Reproduce from scratch

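The deploy step described in the hunk above (pin the SHA, apply, wait for rollouts) can be sketched roughly as below. This is an illustration only, not the contents of `.gitea/workflows/ci.yml`: the function names `image_ref` and `deploy_cmds` are hypothetical, the 180s timeout is an arbitrary choice, and the sketch assumes that the image names in the manifests are exactly `git.lastcloud.io/ronnibaslund/dezky/<app>` and that each Deployment is named after its app, as the rollback command in the runbook suggests.

```bash
#!/usr/bin/env bash
set -euo pipefail

REGISTRY="git.lastcloud.io/ronnibaslund/dezky"   # registry path quoted in the runbook
APPS=(portal platform-api booking operator)      # assumption: image name == Deployment name

# image_ref APP TAG -> fully qualified image reference for one app
image_ref() { printf '%s/%s:%s\n' "$REGISTRY" "$1" "$2"; }

# deploy_cmds SHA -> print, one per line, the commands a deploy job could run
# from inside fleet/apps. Printing instead of executing keeps this a dry run.
deploy_cmds() {
  local sha="$1" app
  for app in "${APPS[@]}"; do
    # "<image>:<newtag>" form: retags the matching image in kustomization.yaml
    echo "kustomize edit set image $(image_ref "$app" "$sha")"
  done
  echo "kubectl apply -k ."
  for app in "${APPS[@]}"; do
    echo "kubectl -n dezky-apps rollout status deploy/$app --timeout=180s"
  done
}
```

To run it for real after sourcing the functions, something like `(cd fleet/apps && deploy_cmds "$(git rev-parse HEAD)" | bash -ex)` would execute the printed commands in order and stop at the first failure. Note that `kustomize edit set image` rewrites `kustomization.yaml` in place, so a by-hand run leaves a local change behind.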
@@ -91,11 +125,46 @@ See `fleet/authentik/README.md`. Create `dezky-auth` ns + `authentik-secret`
 generated), then `kubectl apply -f fleet/authentik/helmchart.yaml`. Reachable at
 https://auth.dezky.eu; first login `akadmin` / `AUTHENTIK_BOOTSTRAP_PASSWORD`.

-### 7. Traefik — global HTTP→HTTPS redirect
+### 7. Traefik — per-router HTTPS redirect (ACME-safe)
 ```bash
+# NO global entrypoint redirect — it would 301 the HTTP-01 challenge before
+# cert-manager's solver router can answer it. Redirect lives per-Ingress via
+# a redirectScheme Middleware instead (applied with each tier's kustomize).
 kubectl apply -f fleet/traefik/helmchartconfig.yaml
 kubectl -n kube-system delete job helm-install-traefik # force the controller to re-run with merged values
-# verify: curl -sI http://auth.dezky.eu -> 308 -> https://auth.dezky.eu/
+# verify: curl -sI http://app.dezky.eu -> 301 https://... AND new certs still issue
+```
+
+### 8. App tier (portal · platform-api · booking · operator)
+```bash
+# Secrets first (out-of-band, values from Bitwarden / generated — see
+# fleet/README.md "Required env / secrets" + fleet/apps/secrets.example.yaml):
+# portal-secrets, booking-secrets, operator-secrets, platform-api-secrets
+kubectl apply -k fleet/apps/
+kubectl -n dezky-apps get pods # all Running once images exist in the registry
+```
+
+### 9. CI runner + push-to-deploy
+```bash
+# In-cluster Gitea Actions runner (gitea/runner + privileged dind sidecar).
+# Registration token from Gitea: Settings → Actions → Runners → Create token.
+kubectl create namespace gitea-runner --dry-run=client -o yaml | kubectl apply -f -
+kubectl -n gitea-runner create secret generic gitea-runner-token \
+--from-literal=token=<registration token>
+kubectl apply -f fleet/ci/gitea-runner.yaml
+
+# Deploy ServiceAccount + kubeconfig for the pipeline's deploy job:
+kubectl apply -f fleet/ci/ci-deployer.yaml
+# mint the kubeconfig (full recipe in fleet/README.md "Deploy") and store it
+# as the KUBECONFIG_B64 repo secret; create a Gitea PAT with package
+# read+write and store as REGISTRY_TOKEN.
+
+# Gotchas baked into fleet/ci/gitea-runner.yaml — don't "simplify" them away:
+# - gitea/runner 1.x (NOT act_runner 0.2.x: Gitea 1.26 never marks its jobs
+# complete, which freezes runs at "Complete job").
+# - dind shares /var/run with the runner: jobs can only get a docker host
+# by bind-mounting a UNIX socket (tcp://+TLS can't be mounted).
+# - docker:24-dind (moby 27 has a cgroup-v2 teardown deadlock).
 ```

 ## Secrets — read live values for Bitwarden
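The `# verify:` comment in step 7 could be turned into a repeatable check along these lines. This is a minimal sketch with hypothetical helper names (`http_status`, `location_of`, `check_redirect`); it accepts any of the usual redirect codes rather than asserting 301 specifically, since the exact code depends on how the `redirectScheme` Middleware is configured.

```bash
# http_status: read `curl -sI` output on stdin, print the status code of the first line
http_status() { awk 'NR==1 {print $2}'; }

# location_of: read response headers on stdin, print the Location target (CR stripped)
location_of() {
  awk 'tolower($1) == "location:" { sub(/\r$/, "", $2); print $2; exit }'
}

# check_redirect HOST: succeed only if plain HTTP answers with a redirect to https://
check_redirect() {
  local host="$1" headers status location
  headers="$(curl -sI "http://$host/")"
  status="$(printf '%s\n' "$headers" | http_status)"
  location="$(printf '%s\n' "$headers" | location_of)"
  case "$status" in
    301|302|307|308) ;;
    *) echo "$host: expected a redirect, got '$status'" >&2; return 1 ;;
  esac
  case "$location" in
    https://*) echo "$host: $status -> $location" ;;
    *) echo "$host: redirect target is not https: '$location'" >&2; return 1 ;;
  esac
}
```

A run over the hostnames named in the runbook would be `for h in app.dezky.eu api.dezky.eu booking.dezky.eu operator.dezky.eu; do check_redirect "$h"; done`. The second half of the verify comment ("new certs still issue") is not covered here; assuming cert-manager's CRDs are installed, `kubectl get certificate -A` should show the READY column for each certificate.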
@@ -107,21 +176,28 @@ k postgres-secret AUTHENTIK_DB_PASSWORD # must match Authentik's DB config
 k postgres-secret OCIS_DB_PASSWORD # must match OCIS's DB config
 k mongo-secret root-password
 k redis-secret REDIS_PASSWORD
+
+a(){ kubectl -n dezky-apps get secret platform-api-secrets -o jsonpath="{.data.$1}" | base64 -d; echo; }
+a SCHEDULING_CREDENTIAL_KEY # AES key for stored scheduling creds — losing it orphans them
+a AUDIT_SIGNING_KEY # audit hash-chain key — rotation closes the segment
 ```

 ## Still TODO (next layers)

-1. **Authentik** — ✅ deployed (`auth.dezky.eu`). Remaining: OIDC app
-blueprints (portal + operator, with prod redirect URLs + client secrets) and
-the cosmetic rebrand. See `fleet/authentik/README.md`.
-2. **OCIS** (files) — uses the `ocis` Postgres DB + Hetzner Object Storage (S3).
-3. **Apps** — `fleet/apps/` (portal · platform-api · booking) + their secrets.
-4. **Stalwart** (host) — `host/stalwart/install.sh`; needs DNS + PTR.
-5. **Backups** — Longhorn → Hetzner Object Storage (`fleet/longhorn/README.md`),
+1. **OCIS** (files) — uses the `ocis` Postgres DB + Hetzner Object Storage
+(S3). platform-api already carries placeholder `OCIS_*` config
+(`fleet/apps/platform-api-config.yaml`) — swap in real values when live.
+2. **Audit cold storage** — Hetzner Object Storage bucket + real
+`AUDIT_COLD_*` keys in `platform-api-secrets`; flip `ARCHIVE_ENABLED`.
+3. **Backups** — Longhorn → Hetzner Object Storage (`fleet/longhorn/README.md`),
 plus host Restic for the mail store + etcd snapshots, plus pg_dump/mongodump
 CronJobs.
-6. **DNS** — A records `api`/`app`/`booking`/`auth`/`mail`.dezky.eu → 46.4.78.187,
-and PTR for mail.
+4. **Stripe live keys** — billing is dark-launched off
+(`BILLING_STRIPE_ENABLED: "false"` in the app config).

+Done since first build: ✅ Authentik + OIDC blueprints · ✅ Stalwart on the
+host · ✅ app tier (incl. operator) · ✅ CI/CD push-to-deploy · ✅ DNS A
+records (`api`/`app`/`booking`/`auth`/`mail`/`operator`).dezky.eu.
+
 ## Access cheatsheet
 - SSH: `ssh dezky@46.4.78.187` (key only). Root SSH disabled.
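The `a(){ ... }` read-back helper added in the Secrets hunk prints an empty line when a key is missing, which is easy to overlook when copying values into Bitwarden. A possible variant that fails loudly instead is sketched below; `decode_field` and `read_secret` are hypothetical names, not part of the runbook.

```bash
# decode_field KEY B64VALUE -> print the decoded value, or fail when the field is empty
decode_field() {
  local key="$1" b64="$2"
  if [ -z "$b64" ]; then
    echo "missing or empty key: $key" >&2
    return 1
  fi
  printf '%s' "$b64" | base64 -d
  echo
}

# read_secret NAMESPACE SECRET KEY -> live value read back from the cluster
read_secret() {
  local ns="$1" name="$2" key="$3"
  decode_field "$key" "$(kubectl -n "$ns" get secret "$name" -o jsonpath="{.data.$key}")"
}
```

Usage mirrors the existing helper, for example `read_secret dezky-apps platform-api-secrets AUDIT_SIGNING_KEY`. Splitting the decode step from the `kubectl` call keeps the error path checkable without cluster access.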