dezky/docs/NEXT-STEPS.md
Ronni Baslund 5407c04682 docs: feature-flag usage guide + cross-links
New docs/FEATURE-FLAGS.md captures when to add a flag, where the moving
parts live, how to use useFeatureFlag from app code, the 4 states + 4
scope axes, kill-switch flow, naming conventions, and the parts we know
aren't built yet (partnerSlug eval context, user-level flags, audit-log
integration, server-side cache).

CLAUDE.md gets a one-line convention entry under "Code conventions" so
future devs notice it when grepping for code rules. NEXT-STEPS.md is
updated: the feature-flag backend follow-up is now ticked done with a
pointer to FEATURE-FLAGS.md for the remaining sub-tasks, and the
"What landed" section reflects the real Infrastructure + Flags pages
and the notification drawer.
2026-05-24 19:29:24 +02:00


Next Steps — After Local Stack Is Running

Once ./scripts/bootstrap.sh completes successfully and all services are reachable, here's the development roadmap.

Phase 1: Verify everything works (day 1) — done

  • https://app.dezky.local shows portal landing page (now the new auth design / post-login home)
  • https://auth.dezky.local shows Authentik login
  • Log into Authentik as admin (still using generated AUTHENTIK_BOOTSTRAP_PASSWORD from .env — rotate before exposing to anyone else)
  • Follow docs/AUTHENTIK-SETUP.md to configure OIDC providers (ocis + dezky-portal)
  • Test OCIS SSO end-to-end (login from https://files.dezky.local)
  • Verify Stalwart admin UI loads at https://mail.dezky.local/login (root path 404s — admin SPA is at /login)

Phase 2: Build portal authentication (week 1) — done

Goal: Users can log in to the portal via Authentik.

  • Add nuxt-oidc-auth to apps/portal (1.0.0-beta.11)
  • Configure Authentik as OIDC provider (generic oidc preset with explicit URLs + discovery)
  • Implement login/logout flows (/auth/oidc/login, /auth/oidc/logout from the module)
  • Display logged-in user info on the portal home (pages/index.vue uses useOidcAuth())
  • Add protected routes (globalMiddlewareEnabled: true; public pages opt out via definePageMeta({ auth: false }))

Where things live

| Concern | File |
| --- | --- |
| OIDC module config | apps/portal/nuxt.config.ts (oidc block) |
| Custom login page | apps/portal/pages/auth/login.vue |
| Error states (expired / disabled) | apps/portal/pages/auth/{expired,disabled}.vue |
| Post-login landing | apps/portal/pages/index.vue |
| Visual shell + tokens | apps/portal/components/auth/*, assets/styles/tokens.css |
| Brand mark | apps/portal/components/NodeMark.vue |

Dev-mode caveats (clean up before prod)

  • skipAccessTokenParsing: true in the OIDC config — Authentik's access tokens in this setup aren't reliably JWT-parseable; production should re-evaluate
  • openIdConfiguration is pinned to the discovery URL because the generic oidc preset doesn't ship a default — required for id_token JWKS validation
  • docker-compose.yml mounts infrastructure/docker-compose/certs/mkcert-root.pem into the portal at /etc/ssl/mkcert-root.pem and sets NODE_EXTRA_CA_CERTS so Node fetch trusts the mkcert root CA. In prod, replace with real CA-signed certs
  • Traefik has Docker network aliases for auth.dezky.local, app.dezky.local, etc. so container-to-Authentik fetch resolves inside the network without going through host /etc/hosts

Phase 3: Tenant data model (week 1-2) — done

  • Mongoose schemas in services/platform-api/src/schemas/ (Tenant, User, Subscription)
  • Tenant: slug, name, status, plan, domains, authentikGroupId, ocisSpaceId, stalwartDomain, billingInfo
  • User: authentikSubjectId, tenantIds[], email, name, role, active, lastLoginAt
  • Subscription: tenantId, plan, status, stripeCustomerId, stripeSubscriptionId, period dates
  • CRUD endpoints behind JwtAuthGuard (validates Authentik JWT via JWKS)
  • Group-based authorization: users see only tenants whose slug matches one of their Authentik groups; dezky-platform-admins group has global access
  • Idempotent seed (SeedService) creates the dezky tenant + matching subscription on bootstrap
  • platform-api exposed at https://api.dezky.local (Traefik label, dev only) and via internal http://platform-api:3001
  • Portal Nitro route at /api/me forwards the user's encrypted access token to platform-api — verified end-to-end
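
The group-based authorization rule above can be sketched as a pure function. This is an illustration only, not the guard code in platform-api: `visibleTenants` and `TenantRef` are hypothetical names, and it assumes the caller's Authentik groups arrive as a plain string array taken from the JWT.

```typescript
// Hypothetical helper; the real check lives somewhere in platform-api.
const PLATFORM_ADMIN_GROUP = 'dezky-platform-admins';

interface TenantRef {
  slug: string;
}

/** Returns the tenants a caller may see, given the group names from their JWT. */
function visibleTenants<T extends TenantRef>(
  groups: readonly string[],
  tenants: readonly T[],
): T[] {
  // Platform admins have global access.
  if (groups.includes(PLATFORM_ADMIN_GROUP)) return [...tenants];
  // Everyone else sees only tenants whose slug equals one of their group names.
  const allowed = new Set(groups);
  return tenants.filter((tenant) => allowed.has(tenant.slug));
}
```

Keeping the rule in one function makes it easy to unit-test the admin and non-admin paths without a running Authentik.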

Endpoints

| Method | Path | Notes |
| --- | --- | --- |
| GET | /health | open |
| POST/GET | /tenants, /tenants/:slug | platform admin to create/delete; tenant members can read+update their own |
| GET | /users/me | upserts the user on first call from JWT claims |
| GET/POST/PATCH/DELETE | /users[/:subject] | platform admin for mutations |
| GET/POST/PATCH | /subscriptions[/:slug] | platform admin for mutations |

Dev-mode caveats (clean up before prod)

  • NUXT_OIDC_TOKEN_KEY must be base64-encoded 32 bytes (openssl rand -base64 32) — NOT hex. Module silently fails with "Invalid key length" if wrong
  • Portal config has exposeAccessToken: true so Nitro routes can forward the token; token still never reaches the browser
  • The dezky group in Authentik is the single tenant for dev. New tenants in Phase 4 need to create matching Authentik groups
  • A dezky-platform-admins group doesn't exist yet — for now akadmin's membership in authentik Admins does NOT grant platform-admin rights. Create that group if you want admin-only endpoints to work for you
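
Because a wrong NUXT_OIDC_TOKEN_KEY format is easy to miss, a small startup check can catch it early. The sketch below is an assumption about how one might validate the key, not part of nuxt-oidc-auth; `isValidTokenKey` is a hypothetical helper.

```typescript
import { randomBytes } from 'node:crypto';

/** True when `key` is standard base64 that decodes to exactly 32 bytes. */
function isValidTokenKey(key: string): boolean {
  // Check the alphabet first, because Buffer.from silently skips invalid characters.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(key)) return false;
  return Buffer.from(key, 'base64').length === 32;
}

// Same shape as the output of `openssl rand -base64 32`.
const goodKey = randomBytes(32).toString('base64');

// The common mistake: 32 random bytes printed as hex give 64 characters,
// which decode to 48 bytes when read as base64.
const hexKey = randomBytes(32).toString('hex');

console.log(isValidTokenKey(goodKey), isValidTokenKey(hexKey));
```

Calling the helper with `process.env.NUXT_OIDC_TOKEN_KEY ?? ''` during boot would turn the misconfiguration into an explicit error message.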

Phase 4: Provisioning automation (week 2-3) — partial

Orchestration ships; two of the three integrations are still stubs pending upstream-specific work.

  • POST /tenants writes tenant and triggers reconciliation in one call
  • POST /tenants/:slug/reconcile retries provisioning for an existing tenant — idempotent, useful when an upstream was down or external state drifted
  • Per-step state recorded on Tenant.provisioningStatus (ok / skipped / error / pending) + Tenant.provisioningErrors for the last failure message; tenant auto-activates when all steps settle
  • Worker: Authentik group creation (real, idempotent)
  • Worker: Stalwart domain + DKIM (stubbed — v0.16 dropped REST in favor of JMAP, see follow-up below)
  • Worker: OCIS space (stubbed — needs libregraph /drives endpoint with service-to-service auth)
  • Worker: onboarding email (no SMTP wired yet)
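
The per-step bookkeeping described above might look roughly like the following. It is a minimal sketch under assumptions: the step names match the smoke-test response, "settled" is read as every step being ok or skipped, and the shape of `provisioningErrors` (one message per step) is a guess. `reconcile` and `Worker` are hypothetical names, not the code in provisioning.service.ts.

```typescript
type StepState = 'ok' | 'skipped' | 'error' | 'pending';
type StepName = 'authentik' | 'stalwart' | 'ocis';

interface ProvisioningResult {
  provisioningStatus: Record<StepName, StepState>;
  provisioningErrors: Partial<Record<StepName, string>>;
  settled: boolean; // true when the tenant could be auto-activated
}

// A worker resolves to 'ok' or 'skipped' (stub) and throws on failure.
type Worker = (slug: string) => Promise<'ok' | 'skipped'>;

async function reconcile(
  slug: string,
  workers: Record<StepName, Worker>,
): Promise<ProvisioningResult> {
  const provisioningStatus: Record<StepName, StepState> = {
    authentik: 'pending',
    stalwart: 'pending',
    ocis: 'pending',
  };
  const provisioningErrors: Partial<Record<StepName, string>> = {};

  for (const step of Object.keys(workers) as StepName[]) {
    try {
      provisioningStatus[step] = await workers[step](slug);
    } catch (err) {
      // One failing upstream must not stop the remaining steps.
      provisioningStatus[step] = 'error';
      provisioningErrors[step] = err instanceof Error ? err.message : String(err);
    }
  }

  const settled = Object.values(provisioningStatus).every(
    (state) => state === 'ok' || state === 'skipped',
  );
  return { provisioningStatus, provisioningErrors, settled };
}
```

Running the same function again after an upstream recovers is what makes a retry endpoint cheap, provided each worker is itself idempotent.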

Where things live

Concern File
Integration clients services/platform-api/src/integrations/{authentik,stalwart,ocis}.client.ts
Orchestration services/platform-api/src/tenants/provisioning.service.ts
/tenants/:slug/reconcile services/platform-api/src/tenants/tenants.controller.ts
Portal proxy routes apps/portal/server/api/tenants/index.post.ts + [slug]/reconcile.post.ts

Quick smoke test

From the portal in the browser (signed in), in DevTools:

```javascript
// Create a fresh tenant
await fetch('/api/tenants', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ slug: 'acme', name: 'Acme Co', plan: 'pro' })
}).then(r => r.json())

// Re-run provisioning (idempotent)
await fetch('/api/tenants/acme/reconcile', { method: 'POST' }).then(r => r.json())
```

The response should include provisioningStatus: { authentik: 'ok', stalwart: 'skipped', ocis: 'skipped' } and status: 'active'. Verify that the Authentik group exists via the admin UI at /if/admin/#/identity/groups.

Stub follow-up work

Stalwart (JMAP) — v0.16 moved management off REST. Need a minimal JMAP client that wraps Domain/set (create), Domain/get (idempotency check), Principal/set (DKIM-keyed signing identity). Auth via the persistent admin's bearer token from the OAuth flow we already use for the web UI.
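
The get-then-set idempotency pattern for the Stalwart worker could be sketched as below. Everything here is an assumption: `JmapTransport` is a hypothetical wrapper that sends one method call and returns its response arguments, and the argument and response shapes (`ids`, `list`, `create`, `notCreated`, a `name` property on domains) are borrowed from standard JMAP /get and /set conventions, not confirmed against Stalwart's management objects.

```typescript
// Hypothetical transport: sends a single JMAP method call, returns its response arguments.
interface JmapTransport {
  call(method: string, args: Record<string, unknown>): Promise<Record<string, unknown>>;
}

async function ensureDomain(
  jmap: JmapTransport,
  name: string,
): Promise<'created' | 'exists'> {
  // Idempotency check: list existing domains and look for a name match.
  const got = await jmap.call('Domain/get', { ids: null });
  const list = (got.list ?? []) as Array<{ name?: string }>;
  if (list.some((domain) => domain.name === name)) return 'exists';

  // Create under a client-chosen temporary id, as in standard JMAP /set calls.
  const set = await jmap.call('Domain/set', { create: { new1: { name } } });
  const notCreated = (set.notCreated ?? {}) as Record<string, unknown>;
  if ('new1' in notCreated) throw new Error(`Domain/set rejected ${name}`);
  return 'created';
}
```

Hiding the wire format behind a tiny transport interface keeps the worker testable with an in-memory fake until the real Stalwart shapes are known.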

OCIS (libregraph): POST /graph/v1.0/drives with body { "name": "<slug>", "driveType": "project" }. Needs service-to-service auth: either an OIDC client_credentials grant (requires registering a new Authentik provider for the worker) or the IDM admin user's bearer token.
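
As a starting point for the OCIS worker, the request described above can be assembled without touching the network. This is a sketch: `buildCreateDriveRequest` is a hypothetical helper, the base URL is whatever address the worker uses to reach OCIS, and how the bearer token is obtained is the open question noted above.

```typescript
interface DriveRequest {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: string;
  };
}

function buildCreateDriveRequest(
  baseUrl: string,
  slug: string,
  bearerToken: string,
): DriveRequest {
  return {
    url: new URL('/graph/v1.0/drives', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${bearerToken}`,
        'Content-Type': 'application/json',
      },
      // Body shape taken from the note above: one project drive per tenant slug.
      body: JSON.stringify({ name: slug, driveType: 'project' }),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = buildCreateDriveRequest('https://files.dezky.local', 'acme', token);
//   const res = await fetch(url, init);
```

Separating request construction from the fetch call lets the payload be asserted in unit tests while the auth decision is still pending.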

Authentik API examples (for the eventual user-creation flow)

```javascript
// Create user
await authentikClient.coreUsersCreate({
  username: user.email,
  email: user.email,
  name: user.name,
  groups: [authentikGroupId],
})
```

Operator portal (out-of-band track): shipped (O.0 to O.9)

operator.dezky.local is live as a separate Nuxt app with its own dezky-operator Authentik OAuth client. Full plan and execution log in OPERATOR-PLAN.md.

What landed:

  • services/provisioning renamed to services/platform-api
  • Audience-aware JwtAuthGuard accepts both dezky-portal and dezky-operator
  • Partner schema + CRUD endpoints, Tenant.partnerId ref
  • Tenant lifecycle (suspend / resume) gated by OperatorGuard
  • Real Infrastructure live-probes: GET /health/platform runs TCP + HTTP probes against every neighbouring service; UI splits "Live" vs "Planned" with honest status.
  • Real feature-flag system: Flag schema + CRUD + bulk eval + operator UI + useFeatureFlag composable in the portal. Hash-based deterministic rollout. See FEATURE-FLAGS.md.
  • Operator UI: Overview (real KPIs), Tenants (7-tab detail w/ Danger), Partners (attach/detach), Users, Operator team, real Infrastructure, real Feature flags. Visual-only Audit. Placeholders for Support/Billing/Reports/Settings.
  • Interactions: ⌘K command palette, impersonation stub (modal + banner), incident modal, tweaks panel, notification drawer.
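
For readers unfamiliar with hash-based deterministic rollout, the idea can be illustrated as follows. This is not the implementation described in FEATURE-FLAGS.md: the hash function (SHA-256), the `flagKey:subject` input, and the 100-bucket granularity are all assumptions made for the sketch, and the function names are hypothetical.

```typescript
import { createHash } from 'node:crypto';

/** Maps a (flag, subject) pair to a stable bucket in the range 0..99. */
function rolloutBucket(flagKey: string, subject: string): number {
  const digest = createHash('sha256').update(`${flagKey}:${subject}`).digest();
  // The first 4 bytes as an unsigned integer are plenty for 100 buckets.
  return digest.readUInt32BE(0) % 100;
}

/** True when the subject falls inside the first `percentage` buckets. */
function isInRollout(flagKey: string, subject: string, percentage: number): boolean {
  return rolloutBucket(flagKey, subject) < percentage;
}
```

Because the bucket depends only on the flag key and the subject, a tenant that is enabled at 10% stays enabled when the rollout grows to 50%, and including the flag key in the hash keeps different flags from always selecting the same tenants.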

Follow-ups before operator hits production

In rough priority order — bulk lifted from OPERATOR-PLAN.md:

  • Real impersonation flow — OAuth Token Exchange (RFC 8693), act claim on customer portal, audit on entry+exit, banner with origin operator identity
  • Real audit log collection: platform_audit Mongo collection, written by platform-api on every privileged action; stream from there instead of data/fixtures.ts
  • Feature flag backend — shipped. See FEATURE-FLAGS.md. Remaining sub-tasks: partnerSlug eval context, user-level flags, audit-log integration, server-side cache (all called out in that doc).
  • Incident management backend: Incident schema + paging (PagerDuty / OpsGenie / custom). Until then, IncidentModal is mock.
  • Support ticket queue: SupportTicket schema + email-in ingestion from a dedicated mailbox via Stalwart
  • Self-serve Partner portal at partner.dezky.local — own Nuxt app, own OAuth client, scoped to a partner's own customers
  • Real environment switcher — currently cosmetic; would need separate API endpoints per env, separate Authentik tenants
  • Real on-call indicator — integration with the paging system from the incident backend
  • Operator workspace impersonation in OCIS/Stalwart — operator tooling reaches into the customer's files + mail for support, with the same audit trail
  • MRR aggregation on Partner when Subscription gains real pricing
  • MFA-required Authentik policy on the dezky-operator provider (deferred from O.1)
  • Delete throwaway endpoints added during verification: apps/operator/server/api/_verify-token.get.ts, apps/portal/server/api/_verify-token.get.ts, apps/operator/server/api/operator-smoke-test.post.ts, apps/portal/server/api/partners/index.post.ts
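
For the impersonation follow-up, the form body of an RFC 8693 token-exchange request can be sketched ahead of time. The parameter names and URNs come from the RFC; everything else is an assumption, including whether Authentik's token endpoint accepts this grant in this setup and how a subject token for the customer would be obtained. `buildTokenExchangeBody` is a hypothetical helper.

```typescript
interface TokenExchangeInput {
  subjectToken: string; // token for the customer identity being impersonated
  actorToken: string;   // the operator's own access token
  audience: string;     // client the resulting token is meant for
}

const ACCESS_TOKEN_TYPE = 'urn:ietf:params:oauth:token-type:access_token';

function buildTokenExchangeBody(input: TokenExchangeInput): URLSearchParams {
  return new URLSearchParams({
    grant_type: 'urn:ietf:params:oauth:grant-type:token-exchange',
    subject_token: input.subjectToken,
    subject_token_type: ACCESS_TOKEN_TYPE,
    // Supplying an actor token is what asks for delegation semantics,
    // where the issued token can carry an `act` claim naming the operator.
    actor_token: input.actorToken,
    actor_token_type: ACCESS_TOKEN_TYPE,
    audience: input.audience,
    requested_token_type: ACCESS_TOKEN_TYPE,
  });
}
```

The body would be sent as application/x-www-form-urlencoded to the provider's token endpoint, with audit entries written around the call.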

Phase 5: Custom webmail (week 3-4)

Goal: Branded webmail client using Stalwart's JMAP API.

  • Add JMAP client library to portal
  • Build inbox view in Nuxt
  • Build compose dialog
  • Build message view with thread support
  • Style to match Dezky branding

JMAP (RFC 8620 / RFC 8621) is a modern JSON-over-HTTP protocol that batches several method calls into one request, which makes it clean to work with.
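
To give a feel for the protocol, here is a sketch of the request an inbox view might send: one Email/query to find the newest messages and one Email/get that reuses its result through a back-reference. The method names, capability URNs, and back-reference syntax follow RFC 8620 and RFC 8621; `buildInboxRequest`, the property list, and the default limit are choices made for this example.

```typescript
type MethodCall = [name: string, args: Record<string, unknown>, callId: string];

interface JmapRequest {
  using: string[];
  methodCalls: MethodCall[];
}

function buildInboxRequest(accountId: string, inboxId: string, limit = 25): JmapRequest {
  return {
    using: ['urn:ietf:params:jmap:core', 'urn:ietf:params:jmap:mail'],
    methodCalls: [
      [
        'Email/query',
        {
          accountId,
          filter: { inMailbox: inboxId },
          sort: [{ property: 'receivedAt', isAscending: false }],
          limit,
        },
        'q',
      ],
      [
        'Email/get',
        {
          accountId,
          // Back-reference: take the ids from the Email/query result in the same request.
          '#ids': { resultOf: 'q', name: 'Email/query', path: '/ids' },
          properties: ['id', 'threadId', 'subject', 'from', 'receivedAt', 'preview'],
        },
        'g',
      ],
    ],
  };
}
```

The object is POSTed as JSON to the API URL advertised in the server's JMAP session resource; the account id and inbox mailbox id come from that session and a prior Mailbox/get call.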

Phase 6: Production migration prep (week 4+)

When the local stack is solid and you have 2-3 pilot customers interested:

  • Order Hetzner AX41-NVMe
  • Order Storage Box BX11 (Falkenstein)
  • Enable Hetzner Object Storage (bucket: dezky-ocis-prod)
  • Build Terraform module for Hetzner provisioning
  • Build Ansible playbook for bare-metal Stalwart deployment
  • Set up k3s on the cloud server
  • Migrate compose to Helm charts
  • Configure Let's Encrypt via cert-manager
  • Set up Restic backup jobs to Storage Box + B2

Phase 7: Add Zulip and Jitsi (when chat/video needed)

These were excluded from MVP for simplicity. When ready:

  • Create infrastructure/docker-compose/docker-compose.optional.yml
  • Add Zulip stack (server + db + worker)
  • Add Jitsi stack (web + prosody + jicofo + jvb)
  • Configure OIDC integration with Authentik
  • Add to portal launcher

Decisions still open

These need to be made before public launch:

  • Final pricing tiers (MVP, Pro, Enterprise)
  • dezky.com purchase decision ($3,000 via BrandBucket)
  • Final logo design (4 directions explored, need to pick one)
  • Legal entity structure for the new business
  • DPA (databehandleraftale) template
  • Customer support process (ticket system choice)

Long-term architecture goals

  • Multi-region deployment (Hetzner Falkenstein + Helsinki)
  • Disaster recovery: cross-DC Restic copies
  • ISO 27001 certification via Vanta
  • GDPR Article 30 record of processing activities
  • SOC 2 (later, for enterprise customers)
  • Customer-facing status page (Uptime Kuma or cstate)
  • Public documentation site
  • Self-service migration tooling from M365