2db41fec5e
JwtAuthGuard now accepts a comma-separated AUTHENTIK_AUDIENCE
('dezky-portal,dezky-operator'). jose.jwtVerify takes an array and succeeds
on any match — both customer-portal and operator-portal tokens validate
against this service. Per-endpoint guards restrict further.

New OperatorGuard enforces operator-only mutations:
  1. JWT audience claim includes 'dezky-operator' (proof from the token
     alone that this is a privileged session)
  2. ActorService-resolved User has platformAdmin=true (DB check so
     revocation works without waiting for the token to expire)
Both required; either alone is insufficient.

Partner module:
  - Partner schema: slug, name, domain, status, marginPct, contactInfo,
    billingInfo. marginPct is one number per partner (decided in grilling)
  - CRUD endpoints under @UseGuards(JwtAuthGuard, OperatorGuard) — every
    partner mutation requires operator scope
  - GET /partners returns each row with a computed customers count from
    aggregating Tenant.partnerId. MRR aggregation deferred until
    Subscription gains a price column
  - GET /partners/:slug/tenants for the partner detail view
  - DELETE soft-terminates (status='terminated') — never hard-delete
    because tenants may still reference the partner

Tenant changes:
  - partnerId?: Types.ObjectId (ref Partner, indexed sparse) added to
    Tenant schema
  - UpdateTenantDto accepts partnerId so PATCH can attach/detach
  - POST /tenants/:slug/suspend and /resume — operator-only via
    OperatorGuard. PATCH already covers plan/domains/partnerId changes

Smoke test: customer-portal session sends POST /api/partners through the
portal proxy → 403 "This endpoint requires an operator-scoped token". The
positive test (operator-token → 200) waits for O.3 when there's an
operator app to mint the right token.

apps/portal/server/api/partners/index.post.ts is a temporary verification
proxy — delete once the operator portal exists.

410 lines
19 KiB
Markdown
# Operator Portal — Plan

`operator.dezky.local` (dev) → `operator.dezky.com` (prod). Internal admin portal
for Dezky staff: managing tenants, partners, operating the platform.

Distinct from the customer portal at `app.dezky.local`. Different OAuth client,
different cookie domain, different surface — though they share Authentik as the
IdP and (eventually) platform-api as the backend.

This file is the running record of decisions made during the design grilling
session. Updated inline as questions resolve.

---

## Scope — C-visual with real management for Tenants + Partners

Decision: build every screen from the source design visually, but back two
domains with real CRUD from day one — Tenants and Partners. Everything else
renders against mock-data fixtures until its backend is built.

| Surface | Day-1 state |
|---|---|
| Overview / dashboard | Visual — aggregates from real Tenant+Partner data where available, mock for the rest |
| Tenants (list + detail with 7 tabs) | **Real backend**, full CRUD, suspend/resume/delete |
| Partners (list + detail) | **Real backend**, new schema, full CRUD |
| Users (global) | Real read across tenants (already in DB) |
| Support queue | Mock |
| Platform billing | Mock |
| Reports | Mock |
| Infrastructure | Visual; could derive from Docker health checks but probably mock initially |
| Feature flags | Mock |
| Audit log | Mock (real backfill is a follow-up) |
| Operator team | Real (Users with `platformAdmin: true`) |
| Platform settings | Mock |
| Command palette ⌘K | Visual — opens, navigates, but "execute action" just toasts |
| Impersonation modal + banner | Visual — confirms the action but doesn't actually mint a token |
| Incident modal | Mock |
| Env switcher (prod/staging/dev) | Cosmetic — picks a label, no real env switch |
| On-call indicator | Mock |

### Real-backend surface this adds

Two genuinely new things on the backend:

1. **Partner schema and CRUD** in `services/platform-api` — id, name, domain,
   status, customers count (computed), MRR (computed), margin, sinceDate. Tenants
   gain an optional `partnerId` field. The existing `dezky` seed gets no partner.
2. **Tenant lifecycle actions** beyond create — suspend, resume, change plan,
   change seat cap, soft-delete with grace period. Existing schema covers most
   of this; controllers need new methods.

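The suspend and resume actions amount to a small state machine over the tenant's status. A minimal sketch, assuming an `active` status alongside the `suspended` value that the verification checklist looks for; the `TenantStatus` type and the `applyLifecycleAction` helper are hypothetical names, and rejecting no-op transitions is an illustrative choice rather than the service's confirmed behaviour:

```typescript
// Hypothetical status values: 'suspended' appears in this plan, 'active' is assumed.
type TenantStatus = 'active' | 'suspended';
type LifecycleAction = 'suspend' | 'resume';

// Returns the next status, or throws when the action does not apply
// to the tenant's current status (for example, resuming an active tenant).
function applyLifecycleAction(current: TenantStatus, action: LifecycleAction): TenantStatus {
  if (action === 'suspend' && current === 'active') return 'suspended';
  if (action === 'resume' && current === 'suspended') return 'active';
  throw new Error(`Cannot ${action} a tenant that is ${current}`);
}
```

A controller method for `POST /tenants/:slug/suspend` could call this before persisting, so an invalid request surfaces as an error instead of a silent write.
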
Everything else (incidents, flags, support tickets, audit log collection,
impersonation tokens) stays mock until explicitly promoted.

---

## Lives at `apps/operator/` — separate Nuxt app

Decision: new Nuxt 3 app, separate `package.json`, separate Traefik route at
`operator.dezky.local`. Reuses design tokens / NodeMark / UiIcon by copy for
now; a `packages/ui` workspace is a likely follow-up once we have a third
consumer.

**Why separate, not a route group in `apps/portal/`:** security boundary. The
moment any operator-only feature mutates customer state (impersonation, suspend
tenant), a routing or middleware bug on a shared app is catastrophic. Separate
apps make that nearly impossible. Different cookies, different OIDC client,
different domain.

**Cost:** one more docker-compose service, ~10 lines of Traefik labels, one more
volume for `node_modules`. Some duplicated dev tooling (eslint, tsconfig).

---

## Auth — new `dezky-operator` Authentik OAuth provider

Decision: a dedicated OAuth client in Authentik, distinct from `dezky-portal`.

- New provider `dezky-operator` (confidential, PKCE on)
- Redirect URIs: `https://operator.dezky.local/auth/oidc/callback`
- Group binding: `dezky-platform-admins` required at the provider's authorization
  flow (Authentik policy), so non-admins can't even consent
- Stricter policies attached only to this provider: MFA required, future IP
  allowlist for the office network/VPN
- Token audience claim: `dezky-operator`
- Provisioning's `JwtAuthGuard` widens its audience check to a list:
  `['dezky-portal', 'dezky-operator']`
- Per-endpoint guard for operator-only mutations: require `aud === 'dezky-operator'`
  AND `actor.platformAdmin === true`. The audience check makes "is this a privileged
  session" provable from the token alone, independent of the DB lookup

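The two-part operator check in the last bullet can be sketched as a pure decision function, which a NestJS `CanActivate` guard could wrap. This is a hedged illustration: the `TokenClaims` and `Actor` shapes and both function names are hypothetical, and only the `aud` claim and the `platformAdmin` flag come from this plan.

```typescript
// Hypothetical shapes: only `aud` and `platformAdmin` come from this plan.
interface TokenClaims {
  aud?: string | string[]; // a JWT `aud` claim may be a single string or an array
}
interface Actor {
  platformAdmin: boolean;
}

const OPERATOR_AUDIENCE = 'dezky-operator';

// True when the token's audience claim contains the operator audience.
function hasOperatorAudience(claims: TokenClaims): boolean {
  const aud = claims.aud;
  if (aud === undefined) return false;
  return Array.isArray(aud) ? aud.includes(OPERATOR_AUDIENCE) : aud === OPERATOR_AUDIENCE;
}

// Both conditions must hold; either one alone is insufficient.
function isOperatorRequest(claims: TokenClaims, actor: Actor | null): boolean {
  return hasOperatorAudience(claims) && actor !== null && actor.platformAdmin === true;
}
```

Keeping the decision separate from the guard class makes it easy to unit-test the matrix of token audience and database flag without starting the service.
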
**UX trade-off accepted:** if Ronni (or any operator who is also a customer)
wants to be in both apps, they log into Authentik twice — once per audience.
Correct security-wise, fine ergonomically.

---

## Backend stays as one service — rename to `services/platform-api`

Decision: route all operator mutations and reads through the existing NestJS
service (no second backend, no Nitro-direct-to-Mongo). Rename
`services/provisioning` → `services/platform-api` because the service now owns
more than just provisioning — it's the platform's data + control plane.

**What changes during the rename:**

- Directory: `services/provisioning/` → `services/platform-api/`
- Package: `@dezky/provisioning` → `@dezky/platform-api`
- Docker container name: `dezky-provisioning` → `dezky-platform-api`
- Compose service key, network alias, volume names
- Portal env var: `PROVISIONING_INTERNAL_URL` → `PLATFORM_API_INTERNAL_URL`
- Portal proxy routes: `http://provisioning:3001` → `http://platform-api:3001`
- Internal module names referencing "provisioning" stay (e.g.
  `ProvisioningService` is now one orchestration concern *inside*
  `platform-api`; not the whole service's purpose)
- Public URL stays `api.dezky.local` (Traefik routes by Host header, unaffected)

**New endpoints platform-api gains in this phase:**

- `POST /tenants/:slug/suspend`, `POST /tenants/:slug/resume`
- `PATCH /tenants/:slug` already exists; ensure it can change plan / seat cap
- `GET /partners`, `POST /partners`, `GET /partners/:slug`, `PATCH /partners/:slug`
- `Tenant.partnerId` foreign key + filter on tenant queries
- `JwtAuthGuard` accepts both `dezky-portal` and `dezky-operator` audiences;
  per-endpoint requirement of `dezky-operator` aud for operator-only mutations

**Strategy:** rename in a separate prep commit before the operator work starts,
so the rename diff is mechanical and reviewable on its own.

---

## Partner schema

```typescript
@Schema({ collection: 'partners', timestamps: true })
class Partner {
  slug: string                  // 'nordicmsp', URL-safe, unique
  name: string                  // 'NordicMSP'
  domain: string                // 'nordicmsp.dk' — partner's own org domain
  status: 'active' | 'in-negotiation' | 'paused' | 'terminated'  // default 'in-negotiation'
  marginPct: number             // 20 = partner keeps 20% of customer MRR; one number per partner
  partnershipStartedAt?: Date
  contactInfo: { primaryName?: string; primaryEmail?: string; billingEmail?: string }
  billingInfo: { /* same shape as Tenant.billingInfo */ }
}
```

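As a worked example of the `marginPct` convention in the schema comment (20 means the partner keeps 20% of customer MRR), the split is a single multiplication. The `splitMrr` helper below is hypothetical, the currency unit is left unspecified, and the 0 to 100 range check is an assumption rather than a rule stated in this plan.

```typescript
// Splits a customer's monthly recurring revenue between partner and platform.
// `marginPct` follows the schema comment: 20 means the partner keeps 20%.
function splitMrr(
  customerMrr: number,
  marginPct: number,
): { partnerShare: number; platformShare: number } {
  // Assumed validation: a margin outside 0..100 is treated as a data error.
  if (marginPct < 0 || marginPct > 100) {
    throw new RangeError(`marginPct must be between 0 and 100, got ${marginPct}`);
  }
  const partnerShare = (customerMrr * marginPct) / 100;
  return { partnerShare, platformShare: customerMrr - partnerShare };
}
```

With a customer MRR of 1000 and `marginPct` of 20, the partner share is 200 and the platform share is 800.
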
**Tenant side:** add `partnerId?: Types.ObjectId` (ref Partner, indexed,
optional). Direct customers have no `partnerId`; partner-owned customers
reference one.

**Computed at query time, not stored:**

- `Partner.customers` — count of tenants with `partnerId === this._id`
- `Partner.mrr` — sum of those tenants' MRR

Storing these values denormalized would force write-time syncing on every tenant
create/suspend/plan-change for ~zero benefit at our scale.

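The query-time customers count can be sketched in two equivalent forms: an in-memory grouping and a MongoDB aggregation pipeline. Both are illustrations under assumptions: `TenantRow`, `countCustomersByPartner`, and `customersPipeline` are hypothetical names, and `partnerId` is shown as a string instead of an `ObjectId` to keep the example self-contained.

```typescript
// Minimal stand-in for a tenant row; only `partnerId` comes from this plan.
interface TenantRow {
  slug: string;
  partnerId?: string; // an ObjectId in the real schema, a string here for brevity
}

// Counts tenants per partner in one pass. Direct customers (no partnerId) are skipped.
function countCustomersByPartner(tenants: TenantRow[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const tenant of tenants) {
    if (tenant.partnerId === undefined) continue;
    counts.set(tenant.partnerId, (counts.get(tenant.partnerId) ?? 0) + 1);
  }
  return counts;
}

// The same grouping expressed as an aggregation pipeline over the tenants collection.
// `$ne: null` excludes both missing and null partnerId values.
const customersPipeline = [
  { $match: { partnerId: { $ne: null } } },
  { $group: { _id: '$partnerId', customers: { $sum: 1 } } },
];
```

Partners with no tenants do not appear in either result, so the list endpoint would need to default their count to zero when merging.
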
**Operator-only.** A self-serve partner portal at `partner.dezky.local` is a
future surface; not in this phase. Partners are visible/manageable only from
the operator app.

---

## Impersonation — visual stub now, real flow later

Decision: build the UI exactly as designed (modal with reason field, top
banner, exit button) but do not wire actual token exchange. The confirm action
toasts "impersonation not implemented yet" and writes a mock audit entry.

**Why now:** validates the UX, lets future hires see the operator surface
end-to-end, doesn't introduce a dangerous capability before there's an
operational need.

**Mitigations against confusion:**

- Modal carries a `Demo only` badge — same styling as other stub-data badges
  in the operator UI
- Toast on confirm makes the no-op explicit
- The banner does display in mock mode (so we can iterate on its design), but
  the underlying session state is local to the operator tab

**Real flow design recorded for the follow-up:** OAuth 2 Token Exchange
(RFC 8693). Authentik supports it. Customer portal needs to accept tokens
carrying an `act` claim alongside `sub`, and show its own impersonation banner
when the two differ. ~2 days of careful work + security review.

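For the follow-up flow, the customer portal's banner decision could be sketched as below. RFC 8693 defines `act` as an object whose `sub` identifies the acting party; the `ImpersonationClaims` interface and the `impersonatingActor` helper are hypothetical names for illustration, not an existing portal API.

```typescript
// Claim shapes follow RFC 8693: `sub` is the subject, `act.sub` the acting party.
interface ImpersonationClaims {
  sub: string;
  act?: { sub: string };
}

// Returns the operator's subject when the token represents an impersonated
// session (an `act` claim is present and differs from `sub`), otherwise null.
function impersonatingActor(claims: ImpersonationClaims): string | null {
  if (claims.act === undefined) return null;
  return claims.act.sub !== claims.sub ? claims.act.sub : null;
}
```
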

---

## Decisions made without grilling (small, low-risk)

- **Theme:** dark by default. Existing `apps/portal/assets/styles/tokens.css`
  already defines `[data-theme='dark']` tokens; the operator app sets
  `<html data-theme="dark">` at app root and reuses them
- **Mock data location:** TypeScript files under `apps/operator/data/`
  (`tenants-mock.ts`, `partners-mock.ts`, `flags-mock.ts`, etc.). Same shape
  as `operator-data.jsx` from the design bundle, just retyped
- **Design system reuse:** copy `NodeMark.vue`, `UiIcon.vue`, and the auth
  components into `apps/operator/components/` directly. A shared `packages/ui`
  workspace becomes worth doing once a third surface needs them (partner
  portal? landing site?)
- **OCIS / Stalwart admin shortcuts in operator UI:** out of scope for this
  phase. Operator drills via the customer-facing service URLs

---

## Follow-up tasks (post-MVP)

In rough priority order:

1. **Real impersonation flow** — OAuth Token Exchange (RFC 8693), customer
   portal `act`-claim handling, audit on entry+exit, banner with origin
   operator identity
2. **Real audit log collection** — replace mock fixtures with a `platform_audit`
   collection in Mongo that platform-api writes on every privileged action;
   stream from there in the operator UI
3. **Feature flag backend** — `Flag` schema + per-tenant rollout state + a
   tiny flag-eval client every service imports
4. **Incident management backend** — `Incident` schema + paging integration
   (PagerDuty / OpsGenie / custom). Until then, the incident modal renders
   from mock
5. **Support ticket queue** — `SupportTicket` schema + email-in ingestion
   from a dedicated mailbox via Stalwart
6. **Self-serve Partner portal at `partner.dezky.local`** — Phase 6+ work,
   own Nuxt app, own OAuth client, scoped to a partner's own customers
7. **Real environment switcher** — currently cosmetic; would need separate
   API endpoints per env, separate Authentik tenants, etc.
8. **Real on-call indicator** — integration with the paging system that
   gets installed in (4)
9. **Operator workspace impersonation in OCIS/Stalwart** — operator tooling
   reaches *into* the customer's file storage and mail for support, with the
   same audit trail as portal impersonation

---

## Out of scope for this entire effort

- Multi-region operator UI
- Read-only investor / board mode (a real persona but build it when there's a
  real investor — design has a placeholder "Read-only" role for Jonas Berg)
- White-label of the operator portal (partners get their own portal eventually;
  Dezky operator never gets white-labeled — it's our internal tool)

---

## Execution checklist

Tick boxes as work lands. Each phase is roughly one commit. Phases must be
done in order — earlier ones unblock later ones.

### O.0 · Prep — service rename ✓

- [x] Rename `services/provisioning/` → `services/platform-api/`
- [x] Update `package.json` name → `@dezky/platform-api`
- [x] Update `docker-compose.yml`: container name, service key, volume name,
      env var `PROVISIONING_INTERNAL_URL` → `PLATFORM_API_INTERNAL_URL`,
      `NUXT_API_BASE` points at new hostname
- [x] Update portal proxy routes to read `PLATFORM_API_INTERNAL_URL` and
      default to `http://platform-api:3001`
- [x] Sweep docs (README, CLAUDE.md, SERVICES.md, AUTHENTIK-SETUP.md,
      NEXT-STEPS.md, TROUBLESHOOTING.md) for stale references
- [x] Verify customer portal `/api/me` still works end-to-end after rename

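The proxy-route default described in the checklist above can be sketched as a small resolver. `resolvePlatformApiUrl` and `DEFAULT_PLATFORM_API_URL` are hypothetical names, and trimming trailing slashes is an added convenience that this plan does not specify.

```typescript
const DEFAULT_PLATFORM_API_URL = 'http://platform-api:3001';

// Reads PLATFORM_API_INTERNAL_URL from the given environment and falls back to the
// compose-network default. Trailing slashes are trimmed so paths can be appended.
function resolvePlatformApiUrl(env: Record<string, string | undefined>): string {
  const configured = env.PLATFORM_API_INTERNAL_URL?.trim();
  const base = configured ? configured : DEFAULT_PLATFORM_API_URL;
  return base.replace(/\/+$/, '');
}
```

A server route would call `resolvePlatformApiUrl(process.env)` once and append the request path to the result.
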
### O.1 · Authentik — operator OAuth client ✓

- [x] Create `dezky-operator` OAuth provider via Authentik API
- [x] Set redirect URIs to `https://operator.dezky.local/auth/oidc/{callback,logout}`
- [x] Confidential client; client_secret persisted to `.env` as
      `OPERATOR_OIDC_CLIENT_SECRET`
- [x] `Dezky Operator` application created and linked to the provider
- [x] Group binding on the application: `dezky-platform-admins` required to
      reach the consent screen. (Authentik 2025.10 supports group-direct
      policy bindings — no separate `policy_group_membership` object needed)
- [ ] **Deferred to follow-up:** MFA-required policy on this provider.
      Authentik does this via a stage binding on the authentication flow,
      which is app-specific configuration we'll wire when there's an actual
      MFA enrollment to gate against. For dev with one akadmin, akadmin
      already has WebAuthn — the auth flow prompts for it automatically
- [x] Discovery doc verified at
      `/application/o/dezky-operator/.well-known/openid-configuration` —
      issuer correct, scopes include `groups`, all endpoints resolve

### Gotchas worth noting

- Authentik 2025.10 requires both `authorization_flow` AND `invalidation_flow`
  when creating OAuth2 providers. The default invalidation flow is at
  `/api/v3/flows/instances/?designation=invalidation` (slug
  `default-provider-invalidation-flow`)
- The `policies/group_membership/` endpoint mentioned in older Authentik
  docs is gone in 2025.10. Use `policies/bindings/` with a direct `group`
  reference instead

### O.2 · platform-api — multi-audience + Partner CRUD ✓

- [x] `JwtAuthGuard`: accepts comma-separated `AUTHENTIK_AUDIENCE`
      (`dezky-portal,dezky-operator`). Both audiences validate; per-endpoint
      guards further restrict
- [x] `OperatorGuard` (not a decorator — a regular `CanActivate` guard)
      enforcing `aud includes 'dezky-operator' && actor.platformAdmin`.
      Applied via `@UseGuards(JwtAuthGuard, OperatorGuard)`
- [x] `schemas/partner.schema.ts` — Partner model
- [x] `partners/` module: controller + service + DTOs (create / read /
      update / soft-terminate / list tenants under partner)
- [x] `partnerId?: Types.ObjectId` added to Tenant schema (indexed, sparse).
      `UpdateTenantDto` accepts `partnerId` to attach/detach
- [x] `Partner.customers` aggregated at query time (count of Tenants by
      partnerId). MRR aggregation **deferred** — Tenant has no monthly
      amount yet and Subscription lacks a price column. Will land when
      Subscription gains pricing
- [x] Tenant lifecycle endpoints: `POST /tenants/:slug/suspend`,
      `POST /tenants/:slug/resume` (operator-only). PATCH already accepts
      plan/domains/partnerId changes
- [x] Smoke test: customer-portal token → `POST /partners` returns 403
      "This endpoint requires an operator-scoped token" ✓. Positive test
      (operator token → 200) deferred until O.3 when the operator app
      exists to mint that token

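The comma-separated `AUTHENTIK_AUDIENCE` handling from the first checklist item might look like the sketch below. `parseAudiences` is a hypothetical helper, and failing closed on an empty list is an assumption, not confirmed behaviour of `JwtAuthGuard`.

```typescript
// Splits a comma-separated audience list, trimming whitespace and dropping empty entries.
function parseAudiences(raw: string | undefined): string[] {
  const audiences = (raw ?? '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  // Assumed policy: refuse to start with no audiences instead of accepting any token.
  if (audiences.length === 0) {
    throw new Error('AUTHENTIK_AUDIENCE must list at least one audience');
  }
  return audiences;
}
```

The resulting array can be passed as the `audience` option to jose's `jwtVerify`, which accepts either a string or an array and succeeds when the token's `aud` matches any entry.
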
### O.3 · Scaffold `apps/operator/`

- [ ] `apps/operator/package.json` (Nuxt 3, `nuxt-oidc-auth` beta.11, same
      deps as portal)
- [ ] `nuxt.config.ts` with `oidc` block pointing at `dezky-operator`
- [ ] Docker compose service `operator`, with Traefik labels for
      `operator.dezky.local`, `node_modules` volume, same `NODE_EXTRA_CA_CERTS`
      mount for mkcert
- [ ] Network alias on Traefik: `operator.dezky.local`
- [ ] User task: add `operator.dezky.local` to `/etc/hosts`
- [ ] Session secrets in `.env`: `NUXT_OIDC_TOKEN_KEY` (base64-32),
      `NUXT_OIDC_SESSION_SECRET`, `NUXT_OIDC_AUTH_SESSION_SECRET` —
      **distinct from** the customer portal's secrets
- [ ] Verify login: visit `https://operator.dezky.local`, bounce to Authentik,
      sign in as akadmin, land on a placeholder index page

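One way to produce a fresh value for `NUXT_OIDC_TOKEN_KEY` is sketched below, reading the "base64-32" note in the checklist as 32 random bytes encoded in base64. `generateTokenKey` is a hypothetical helper; the required formats of the two session secrets are not stated in this plan, so they are left out.

```typescript
import { randomBytes } from 'node:crypto';

// Generates 32 cryptographically random bytes and encodes them as base64,
// matching the "base64-32" note on NUXT_OIDC_TOKEN_KEY.
function generateTokenKey(): string {
  return randomBytes(32).toString('base64');
}
```

Running it once per app keeps the operator key distinct from the customer portal's key.
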
### O.4 · Design system + app shell

- [ ] `assets/styles/tokens.css` — copy with `data-theme="dark"` as default
- [ ] `assets/styles/base.css`
- [ ] Components: `NodeMark.vue`, `UiIcon.vue` (copy from portal)
- [ ] Shared primitives ported from the design: `Card`, `Button`, `Table`,
      `Badge`, `Mono`, `Eyebrow`, `StatusDot`, `Avatar`, `PageHeader`
- [ ] `OpSidebar.vue` — collapsible, badges per nav item
- [ ] `OpTopbar.vue` — env badge, ⌘K trigger, on-call pill, bell, avatar
- [ ] `app.vue` shell wires sidebar + topbar + `<NuxtPage />`
- [ ] Keyboard shortcut: ⌘[ collapses sidebar, ⌘K opens palette

### O.5 · Tenant management (real backend)

- [ ] `pages/tenants/index.vue` — list with status/plan/seats/MRR columns,
      filter by partner and status, search by slug/name
- [ ] `pages/tenants/[slug].vue` — detail view with tabs
- [ ] Tab: **Overview** — header card, key stats, partner link
- [ ] Tab: **Users** — list users via `GET /users?tenantSlug=…`
- [ ] Tab: **Resources** — provisioning status per integration
      (Authentik / Stalwart / OCIS), error messages, "Reconcile" button
- [ ] Tab: **Billing** (mock fixtures)
- [ ] Tab: **Audit** (mock fixtures)
- [ ] Tab: **Support** (mock fixtures)
- [ ] Tab: **Danger** — suspend, resume, change plan, soft-delete; real
      backend calls, confirmation modals

### O.6 · Partner management (real backend)

- [ ] `pages/partners/index.vue` — list with name/domain/status/customers/MRR
- [ ] `pages/partners/[slug].vue` — detail panel with customers list,
      MRR breakdown, margin, contact info
- [ ] "Create partner" modal — `POST /partners`
- [ ] Attach / detach tenant to partner (PATCH on `tenant.partnerId`)

### O.7 · Visual-only screens (mock fixtures)

- [ ] `data/*.ts` — typed mock fixtures (tenants-extra, partners-extra,
      services, incident, flags, audit, team)
- [ ] `pages/index.vue` — Overview dashboard
- [ ] `pages/operator-team.vue` — real backend (Users where
      `platformAdmin === true`)
- [ ] `pages/users.vue` — global users, real read
- [ ] `pages/infrastructure.vue` — service health (mock for now;
      docker health check integration is a follow-up)
- [ ] `pages/flags.vue` — feature flags (mock)
- [ ] `pages/audit.vue` — global audit (mock)
- [ ] `pages/support.vue` — placeholder
- [ ] `pages/billing.vue` — placeholder
- [ ] `pages/reports.vue` — placeholder
- [ ] `pages/settings.vue` — placeholder

### O.8 · Interactions

- [ ] `CommandPalette.vue` — ⌘K opens, fuzzy search over tenants + partners
      + flags + nav items + actions
- [ ] `ImpersonationModal.vue` — visual stub with reason field, Demo-only
      badge, no-op confirm + toast
- [ ] `ImpersonationBanner.vue` — top banner shown when impersonating
- [ ] `IncidentModal.vue` — mock incident render
- [ ] `TweaksPanel.vue` — theme (light/dark), density (comfy/compact),
      env (prod/staging/dev cosmetic switch)

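The palette's fuzzy search could start from a simple subsequence matcher like the sketch below. This is one possible technique, not the design bundle's algorithm: `PaletteItem`, `fuzzyScore`, and `searchPalette` are hypothetical names, and the scoring (shorter matched span ranks higher) is an illustrative choice.

```typescript
// Hypothetical palette entry; the real item shape is not specified in this plan.
interface PaletteItem {
  label: string;
  kind: 'tenant' | 'partner' | 'flag' | 'nav' | 'action';
}

// Subsequence match: every query character must appear in the label in order,
// case-insensitively. Returns the length of the span covered by the leftmost
// greedy match (smaller is tighter), or null when there is no match.
function fuzzyScore(query: string, label: string): number | null {
  const q = query.toLowerCase();
  const l = label.toLowerCase();
  if (q.length === 0) return 0;
  let qi = 0;
  let first = -1;
  for (let li = 0; li < l.length; li++) {
    if (l[li] === q[qi]) {
      if (first === -1) first = li;
      qi++;
      if (qi === q.length) return li - first + 1;
    }
  }
  return null;
}

// Keeps only matching items and sorts the tightest matches first.
function searchPalette(query: string, items: PaletteItem[]): PaletteItem[] {
  return items
    .map((item) => ({ item, score: fuzzyScore(query, item.label) }))
    .filter((entry): entry is { item: PaletteItem; score: number } => entry.score !== null)
    .sort((a, b) => a.score - b.score)
    .map((entry) => entry.item);
}
```

A dedicated fuzzy-search library may be a better fit once the item count grows; this version is small enough to ship with the visual stub.
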
### O.9 · Verification

- [ ] Sign in to `operator.dezky.local` as akadmin via the new OAuth client
- [ ] Confirm JWT audience is `dezky-operator` (decode in DevTools, post
      response back)
- [ ] Create a real Partner via the UI, see it in Mongo
- [ ] Attach the `acme` tenant to that partner; verify count goes 0 → 1
- [ ] Suspend a tenant from the Danger tab; confirm `status: 'suspended'`
      in Mongo
- [ ] Sign in to `app.dezky.local` simultaneously in another browser
      profile, confirm the customer portal still works and that customer
      token's `aud` is `dezky-portal`
- [ ] Record all the relevant follow-up tasks in NEXT-STEPS.md as remaining
      work; file separate issues for anything that was deferred