docs: capture operator portal plan from grilling session

OPERATOR-PLAN.md records the decisions from the design review:

- Scope: C-visual (full UI fidelity, mock data for most screens) but real
  CRUD for tenants and partners from day one
- Lives at apps/operator/ as a separate Nuxt app, separate domain, separate
  Authentik OAuth client (dezky-operator), aud-claim distinguishes operator
  vs portal tokens
- Backend stays as a single NestJS service; rename services/provisioning ->
  services/platform-api as a prep commit
- Partner schema designed: slug/name/domain/status/marginPct/contactInfo;
  Tenant gains optional partnerId; counts and MRR are computed at query time
- Impersonation: visual stub now (modal + banner, no-op toast); real OAuth
  Token Exchange flow recorded as the first follow-up task

Also lists follow-up tasks (real audit log, feature flag backend, incident
management, partner portal) and out-of-scope items so the next grilling
session has a starting point.

Pointer added in NEXT-STEPS.md under a new 'Operator portal' track.

@@ -141,6 +141,18 @@ await authentikClient.coreUsersCreate({

## Operator portal — out-of-band track

`operator.dezky.local` (internal admin portal — separate Nuxt app, separate
Authentik OAuth client, real CRUD for tenants + partners). Plan and decisions
captured in [`OPERATOR-PLAN.md`](./OPERATOR-PLAN.md).

Touches platform-api substantially:
- Service rename `services/provisioning` → `services/platform-api` (prep)
- New `Partner` schema + CRUD endpoints
- Tenant lifecycle actions (suspend/resume/plan change)
- Audience-aware JwtAuthGuard for operator-only mutations

## Phase 5: Custom webmail (week 3-4)

Goal: Branded webmail client using Stalwart's JMAP API.

@@ -0,0 +1,241 @@
# Operator Portal — Plan

`operator.dezky.local` (dev) → `operator.dezky.com` (prod). Internal admin portal
for Dezky staff: managing tenants and partners, and operating the platform.

Distinct from the customer portal at `app.dezky.local`. Different OAuth client,
different cookie domain, different surface — though they share Authentik as the
IdP and (eventually) the provisioning service as the backend.

This file is the running record of decisions made during the design grilling
session. Updated inline as questions resolve.

---

## Scope — C-visual with real management for Tenants + Partners

Decision: build every screen from the source design visually, but back two
domains with real CRUD from day one — Tenants and Partners. Everything else
renders against mock-data fixtures until its backend is built.

| Surface | Day-1 state |
|---|---|
| Overview / dashboard | Visual — aggregates from real Tenant+Partner data where available, mock for the rest |
| Tenants (list + detail with 7 tabs) | **Real backend**, full CRUD, suspend/resume/delete |
| Partners (list + detail) | **Real backend**, new schema, full CRUD |
| Users (global) | Real read across tenants (already in DB) |
| Support queue | Mock |
| Platform billing | Mock |
| Reports | Mock |
| Infrastructure | Visual; could derive from Docker health checks but probably mock initially |
| Feature flags | Mock |
| Audit log | Mock (real backfill is a follow-up) |
| Operator team | Real (Users with `platformAdmin: true`) |
| Platform settings | Mock |
| Command palette ⌘K | Visual — opens, navigates, but "execute action" just toasts |
| Impersonation modal + banner | Visual — confirms the action but doesn't actually mint a token |
| Incident modal | Mock |
| Env switcher (prod/staging/dev) | Cosmetic — picks a label, no real env switch |
| On-call indicator | Mock |

### Real-backend surface this adds

Two genuinely new things on the backend:

1. **Partner schema and CRUD** in `services/provisioning` — id, name, domain,
   status, customers count (computed), MRR (computed), margin, sinceDate. Tenants
   gain an optional `partnerId` field. The existing `dezky` seed gets no partner.
2. **Tenant lifecycle actions** beyond create — suspend, resume, change plan,
   change seat cap, soft-delete with grace period. Existing schema covers most
   of this; controllers need new methods.

Everything else (incidents, flags, support tickets, audit log collection,
impersonation tokens) stays mock until explicitly promoted.

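
The tenant lifecycle actions listed above can be read as a small state machine. The sketch below is one possible shape, not the service's actual code: the status names, the 30-day grace period, and the rule that `resume` also cancels a pending deletion are all assumptions made for illustration.

```typescript
// Hypothetical status values; the plan does not name them.
type TenantStatus = 'active' | 'suspended' | 'pending-deletion';
type LifecycleAction = 'suspend' | 'resume' | 'soft-delete';

interface TenantState {
  status: TenantStatus;
  deleteAfter?: Date; // set only while the soft-delete grace period runs
}

const GRACE_PERIOD_DAYS = 30; // assumption, not a decided value
const DAY_MS = 24 * 60 * 60 * 1000;

// For each action, the statuses it may be applied from (assumed rules).
const ALLOWED_FROM: Record<LifecycleAction, TenantStatus[]> = {
  suspend: ['active'],
  resume: ['suspended', 'pending-deletion'],
  'soft-delete': ['active', 'suspended'],
};

// Returns the next state, or throws if the transition is not allowed.
function applyLifecycleAction(
  tenant: TenantState,
  action: LifecycleAction,
  now: Date,
): TenantState {
  if (!ALLOWED_FROM[action].includes(tenant.status)) {
    throw new Error(`cannot ${action} a tenant that is ${tenant.status}`);
  }
  switch (action) {
    case 'suspend':
      return { status: 'suspended' };
    case 'resume':
      return { status: 'active' };
    case 'soft-delete':
      return {
        status: 'pending-deletion',
        deleteAfter: new Date(now.getTime() + GRACE_PERIOD_DAYS * DAY_MS),
      };
  }
}
```
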
---

## Lives at `apps/operator/` — separate Nuxt app

Decision: new Nuxt 3 app, separate `package.json`, separate Traefik route at
`operator.dezky.local`. Reuses design tokens / NodeMark / UiIcon by copy for
now; a `packages/ui` workspace is a likely follow-up once we have a third
consumer.

**Why separate, not a route group in `apps/portal/`:** security boundary. The
moment any operator-only feature mutates customer state (impersonation, suspend
tenant), a routing or middleware bug on a shared app is catastrophic. Separate
apps make that failure mode nearly impossible: different cookies, different
OIDC client, different domain.

**Cost:** one more docker-compose service, ~10 lines of Traefik labels, one more
volume for `node_modules`. Some duplicated dev tooling (eslint, tsconfig).

---

## Auth — new `dezky-operator` Authentik OAuth provider

Decision: a dedicated OAuth client in Authentik, distinct from `dezky-portal`.

- New provider `dezky-operator` (confidential, PKCE on)
- Redirect URIs: `https://operator.dezky.local/auth/oidc/callback`
- Group binding: `dezky-platform-admins` required at the provider's authorization
  flow (Authentik policy), so non-admins can't even consent
- Stricter policies attached only to this provider: MFA required, future IP
  allowlist for the office network/VPN
- Token audience claim: `dezky-operator`
- Provisioning's `JwtAuthGuard` widens its audience check to a list:
  `['dezky-portal', 'dezky-operator']`
- Per-endpoint guard for operator-only mutations: require `aud === 'dezky-operator'`
  AND `actor.platformAdmin === true`. The audience check makes "is this a privileged
  session" provable from the token alone, independent of the DB lookup

**UX trade-off accepted:** if Ronni (or any operator who is also a customer)
wants to be in both apps, they log into Authentik twice — once per audience.
Correct security-wise, fine ergonomically.

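
The two-level audience rule from the list above can be sketched as plain functions, independent of NestJS. This is an illustrative sketch only: `TokenClaims`, `Actor`, and the function names are hypothetical, and the real `JwtAuthGuard` may be structured differently. It handles `aud` as either a string or an array, since the JWT spec allows both.

```typescript
// Hypothetical shapes for illustration; not the service's real types.
interface TokenClaims {
  sub: string;
  aud: string | string[]; // a JWT `aud` may be one string or an array
}

interface Actor {
  platformAdmin: boolean; // loaded from the DB after token validation
}

const ACCEPTED_AUDIENCES = ['dezky-portal', 'dezky-operator'];
const OPERATOR_AUDIENCE = 'dezky-operator';

// Normalise `aud` to an array so both encodings are handled the same way.
function audiences(claims: TokenClaims): string[] {
  return Array.isArray(claims.aud) ? claims.aud : [claims.aud];
}

// Baseline check for every request: any accepted audience passes.
function hasAcceptedAudience(claims: TokenClaims): boolean {
  return audiences(claims).some((aud) => ACCEPTED_AUDIENCES.includes(aud));
}

// Operator-only mutations: operator audience AND the platformAdmin flag.
function canRunOperatorMutation(claims: TokenClaims, actor: Actor): boolean {
  return (
    audiences(claims).includes(OPERATOR_AUDIENCE) &&
    actor.platformAdmin === true
  );
}
```
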
---

## Backend stays as one service — rename to `services/platform-api`

Decision: route all operator mutations and reads through the existing NestJS
service (no second backend, no Nitro-direct-to-Mongo). Rename
`services/provisioning` → `services/platform-api` because the service now owns
more than just provisioning — it's the platform's data + control plane.

**What changes during the rename:**

- Directory: `services/provisioning/` → `services/platform-api/`
- Package: `@dezky/provisioning` → `@dezky/platform-api`
- Docker container name: `dezky-provisioning` → `dezky-platform-api`
- Compose service key, network alias, volume names
- Portal env var: `PROVISIONING_INTERNAL_URL` → `PLATFORM_API_INTERNAL_URL`
- Portal proxy routes: `http://provisioning:3001` → `http://platform-api:3001`
- Internal module names referencing "provisioning" stay (e.g.
  `ProvisioningService` is now one orchestration concern *inside*
  `platform-api`; not the whole service's purpose)
- Public URL stays `api.dezky.local` (Traefik routes by Host header, unaffected)

**New endpoints platform-api gains in this phase:**

- `POST /tenants/:slug/suspend`, `POST /tenants/:slug/resume`
- `PATCH /tenants/:slug` already exists; ensure it can change plan / seat cap
- `GET /partners`, `POST /partners`, `GET /partners/:slug`, `PATCH /partners/:slug`
- `Tenant.partnerId` foreign key + filter on tenant queries
- `JwtAuthGuard` accepts both `dezky-portal` and `dezky-operator` audiences;
  per-endpoint requirement of `dezky-operator` aud for operator-only mutations

**Strategy:** rename in a separate prep commit before the operator work starts,
so the rename diff is mechanical and reviewable on its own.

---

## Partner schema

```typescript
import { Schema } from '@nestjs/mongoose'

// Field sketch; @Prop() decorators omitted for brevity
@Schema({ collection: 'partners', timestamps: true })
class Partner {
  slug: string                  // 'nordicmsp', URL-safe, unique
  name: string                  // 'NordicMSP'
  domain: string                // 'nordicmsp.dk' — partner's own org domain
  status: 'active' | 'in-negotiation' | 'paused' | 'terminated'  // default 'in-negotiation'
  marginPct: number             // 20 = partner keeps 20% of customer MRR; one number per partner
  partnershipStartedAt?: Date
  contactInfo: { primaryName?: string; primaryEmail?: string; billingEmail?: string }
  billingInfo: { /* same shape as Tenant.billingInfo */ }
}
```

**Tenant side:** add `partnerId?: Types.ObjectId` (ref Partner, indexed,
optional). Direct customers have no `partnerId`; partner-owned customers
reference one.

**Computed at query time, not stored:**
- `Partner.customers` — count of tenants with `partnerId === this._id`
- `Partner.mrr` — sum of those tenants' MRR

Storing these denormalized would force write-time syncing on every tenant
create/suspend/plan-change for ~zero benefit at our scale.

**Operator-only.** A self-serve partner portal at `partner.dezky.local` is a
future surface; not in this phase. Partners are visible/manageable only from
the operator app.

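
To make the "computed at query time" rule and the `marginPct` arithmetic concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration: `TenantRow`, `PartnerStats`, and `partnerStats` are hypothetical names, `partnerId` is simplified to a string, and a real implementation would more likely run as a Mongo query or aggregation than over an array.

```typescript
// Hypothetical row shape; ObjectId is simplified to a string here.
interface TenantRow {
  partnerId?: string; // absent for direct customers
  mrr: number;
}

interface PartnerStats {
  customers: number;    // tenants referencing this partner
  mrr: number;          // sum of those tenants' MRR
  partnerShare: number; // portion the partner keeps, from marginPct
}

// Derive the computed fields on demand instead of storing them.
function partnerStats(
  partnerId: string,
  marginPct: number,
  tenants: TenantRow[],
): PartnerStats {
  const owned = tenants.filter((t) => t.partnerId === partnerId);
  const mrr = owned.reduce((sum, t) => sum + t.mrr, 0);
  // marginPct of 20 means the partner keeps 20% of customer MRR.
  return { customers: owned.length, mrr, partnerShare: (mrr * marginPct) / 100 };
}
```
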
---

## Impersonation — visual stub now, real flow later

Decision: build the UI exactly as designed (modal with reason field, top
banner, exit button) but do not wire actual token exchange. The confirm action
toasts "impersonation not implemented yet" and writes a mock audit entry.

**Why now:** validates the UX, lets future hires see the operator surface
end-to-end, doesn't introduce a dangerous capability before there's an
operational need.

**Mitigations against confusion:**
- Modal carries a `Demo only` badge — same styling as other stub-data badges
  in the operator UI
- Toast on confirm makes the no-op explicit
- The banner does display in mock mode (so we can iterate on its design), but
  the underlying session state is local to the operator tab

**Real flow design recorded for the follow-up:** OAuth 2.0 Token Exchange
(RFC 8693). Authentik supports it. Customer portal needs to accept tokens
carrying an `act` claim alongside `sub`, and show its own impersonation banner
when the two differ. ~2 days of careful work + security review.

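
For the follow-up flow, the portal-side check described above ("show a banner when `act` and `sub` differ") might look like the sketch below. In RFC 8693 the `act` claim is a JSON object whose `sub` member identifies the acting party. The `PortalClaims` shape, the function name, and the banner wording are hypothetical.

```typescript
// Actor claim as defined by RFC 8693: an object identifying the acting party.
interface ActClaim {
  sub: string;
}

// Hypothetical subset of the claims the customer portal would read.
interface PortalClaims {
  sub: string;    // the customer user the session represents
  act?: ActClaim; // present only when someone acts on the user's behalf
}

// Returns banner text when the session is impersonated, otherwise null.
function impersonationBanner(claims: PortalClaims): string | null {
  if (!claims.act || claims.act.sub === claims.sub) {
    return null; // ordinary session: no actor, or the actor is the user
  }
  return `Impersonated by ${claims.act.sub}`; // wording is a placeholder
}
```
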
---

## Decisions made without grilling (small, low-risk)

- **Theme:** dark by default. Existing `apps/portal/assets/styles/tokens.css`
  already defines `[data-theme='dark']` tokens; the operator app sets
  `<html data-theme="dark">` at app root and reuses them
- **Mock data location:** TypeScript files under `apps/operator/data/`
  (`tenants-mock.ts`, `partners-mock.ts`, `flags-mock.ts`, etc.). Same shape
  as `operator-data.jsx` from the design bundle, just retyped
- **Design system reuse:** copy `NodeMark.vue`, `UiIcon.vue`, and the auth
  components into `apps/operator/components/` directly. A shared `packages/ui`
  workspace becomes worth doing once a third surface needs them (partner
  portal? landing site?)
- **OCIS / Stalwart admin shortcuts in operator UI:** out of scope for this
  phase. Operator drills via the customer-facing service URLs

---

## Follow-up tasks (post-MVP)

In rough priority order:

1. **Real impersonation flow** — OAuth Token Exchange (RFC 8693), customer
   portal `act`-claim handling, audit on entry+exit, banner with origin
   operator identity
2. **Real audit log collection** — replace mock fixtures with a `platform_audit`
   collection in Mongo that platform-api writes on every privileged action;
   stream from there in the operator UI
3. **Feature flag backend** — `Flag` schema + per-tenant rollout state + a
   tiny flag-eval client every service imports
4. **Incident management backend** — `Incident` schema + paging integration
   (PagerDuty / OpsGenie / custom). Until then, the incident modal renders
   from mock
5. **Support ticket queue** — `SupportTicket` schema + email-in ingestion
   from a dedicated mailbox via Stalwart
6. **Self-serve Partner portal at `partner.dezky.local`** — Phase 6+ work,
   own Nuxt app, own OAuth client, scoped to a partner's own customers
7. **Real environment switcher** — currently cosmetic; would need separate
   API endpoints per env, separate Authentik tenants, etc.
8. **Real on-call indicator** — integration with the paging system that
   gets installed in (4)
9. **Operator workspace impersonation in OCIS/Stalwart** — operator tooling
   reaches *into* the customer's file storage and mail for support, with the
   same audit trail as portal impersonation

---

## Out of scope for this entire effort

- Multi-region operator UI
- Read-only investor / board mode (a real persona but build it when there's a
  real investor — design has a placeholder "Read-only" role for Jonas Berg)
- White-label of the operator portal (partners get their own portal eventually;
  Dezky operator never gets white-labeled — it's our internal tool)