dezky

Author	SHA1	Message	Date
Ronni Baslund	02341d8ba5	feat(audit): platform-api audit log + operator UI wired to real events	2026-05-24 19:50:24 +02:00

    Phase 1 of the audit work: capture everything we control today, ingest
    from external systems (Authentik / OCIS / Stalwart) in a later phase.
    The mock OP_AUDIT fixture is gone; both the /audit page and Overview's
    activity card now show real events recorded by AuditService.record() in
    platform-api.

    Schema (services/platform-api/src/schemas/audit-event.schema.ts):
      AuditEvent { at, actorType, actorId, actorEmail, actorIp, action,
                   outcome, resourceType, resourceId, resourceName,
                   tenantSlug, partnerSlug, source, metadata, prevHash, hash }
      Indexes: {at:-1}, {tenantSlug,at:-1}, {actorId,at:-1}, {action,at:-1}.
      prevHash/hash are nullable now; hash-chain tamper evidence is a later
      phase.

    AuditService:
    - record(): best-effort write, swallows errors so the underlying
      mutation that succeeded isn't failed by a downstream log issue.
      Surfaces failures via Logger.
    - list(): filters: since/until/before, action (exact OR prefix match
      via leading-anchor regex), tenantSlug, partnerSlug, actorEmail,
      outcome, free-text q across action/resourceName/actorEmail/tenantSlug,
      limit (default 100, max 500). Cursor pagination via `before`.
    - No UPDATE/DELETE surface; entries are append-only by construction.

    AuditController: GET /audit, behind JwtAuthGuard + OperatorGuard. No
    mutations exposed; entries written internally by other modules.

    X-Forwarded-For threading:
    - apps/operator/server/utils/platform-api.ts forwards the originating
      client IP to platform-api so audit entries carry a real address.
    - services/platform-api/src/auth/client-ip.ts extracts leftmost
      X-Forwarded-For, falls back to socket.remoteAddress.

    Instrumented mutations (every one threads actor + IP through):
      Tenants:  create, update, softDelete, setStatus(suspend/resume)
      Partners: create, update, terminate
      Flags:    create, update (incl. flag.killed verb when
                state=off+note=kill-switch), remove
      Users:    deactivate

    Each controller resolves the User doc via ActorService, extracts IP via
    clientIp(req), and passes { userId, email, ip } as AuditActor to the
    service. FlagsService's local ActorRef collapses to AuditActor so flag
    history and the audit log share one shape.

    Operator UI:
    - /api/audit proxy that forwards query params verbatim
    - types/audit.ts
    - pages/audit.vue: real list with quick-pick action chips
      (All/Tenants/Partners/Flags/Users), outcome filter, free-text search,
      "Load older events" cursor pagination
    - pages/index.vue: Overview activity card swaps mock OP_AUDIT for the
      same /api/audit endpoint, rows link into /audit
    - data/fixtures.ts: OP_AUDIT / AuditEntry / AuditTone exports removed

    Verified end-to-end: suspended + resumed acme, flipped oci_versioning
    through rollout → kill → on, then /audit returned all 5 events with the
    right action verbs (tenant.suspended, tenant.resumed, flag.updated,
    flag.killed, flag.updated), actor admin@dezky.local, IP 192.168.65.1.
    Filters (action prefix + free-text q) narrow correctly.

    Out of scope for this commit (each gets its own conversation):
    - Authentik / OCIS / Stalwart ingest adapters (Phase 2)
    - Hash-chain tamper evidence (Phase 3)
    - TTL + cold-storage archival to Hetzner Object Storage (Phase 4)
    - GDPR right-to-erasure tooling
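The client-IP rule described in the commit above (leftmost X-Forwarded-For entry, falling back to socket.remoteAddress) could look roughly like the sketch below. This is not the contents of client-ip.ts; the `RequestLike` shape and the `null` return for "no address known" are assumptions made so the example is self-contained.

```typescript
// Hypothetical minimal request shape; the real helper presumably receives
// the framework's request object instead.
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
  socket: { remoteAddress?: string };
}

// Returns the leftmost X-Forwarded-For entry (the originating client when
// every proxy appends to the header), else the socket's remote address.
function clientIp(req: RequestLike): string | null {
  const raw = req.headers["x-forwarded-for"];
  // Node may deliver a repeated header as an array; use the first value.
  const header = Array.isArray(raw) ? raw[0] : raw;
  if (header) {
    const leftmost = header.split(",")[0]?.trim();
    if (leftmost) return leftmost;
  }
  return req.socket.remoteAddress ?? null;
}
```

The leftmost entry is only trustworthy when the header is set by a proxy you control (here, the operator app's server-side proxy); a client talking to platform-api directly could forge it.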
Ronni Baslund	868a305539	feat(flags): real feature-flag system with bulk eval + operator UI	2026-05-24 19:21:15 +02:00

    Real backend for the flags page (was pure mock). Built so it's ready for
    the first risky rollout (likely the Stalwart JMAP client or the Stripe
    billing engine).

    services/platform-api:
    - Flag schema (key, description, state, pct, scope.{plans, tenantSlugs,
      partnerSlugs, environments}, embedded history capped at 20)
    - FlagsService with CRUD + evaluateAll(tenantSlug) → { key: bool }
      Eval algorithm:
        off      → false; on → true
        targeted → require non-empty scope (empty allowlist means
                   "nobody"), then match every non-empty axis
        rollout  → match scope, then
                   sha256(`${tenantId}:${key}`) % 100 < pct
      Hash-based rollout is deterministic: bumping pct only flips the new
      slice. Pure helpers (matchesScope, hasAnyScope, inRolloutBucket) are
      exported for future unit tests.
    - FlagsController exposes GET /flags, GET /flags/:key, POST
      /flags/evaluate (JwtAuthGuard); POST/PATCH/DELETE require
      OperatorGuard. History entries capture the actor's email.
    - SeedService idempotently creates 10 flag keys mapping to real Dezky
      concerns (jmap_native_v2, gdpr_export_v2, new_billing_engine, etc.).
      $setOnInsert so operator edits survive restarts.

    apps/operator:
    - 6 proxies: /api/flags index get/post, [key] get/patch/delete,
      evaluate post
    - types/flag.ts with the shape that mirrors the backend
    - pages/flags.vue: useFetch real list, row click opens FlagDetail,
      "New flag" opens NewFlagModal, scope summary column shows targeting
      at a glance
    - FlagDetail.vue: side panel with segmented state, rollout slider with
      live "~N of M tenants" preview from /api/tenants, plan/tenant/env
      chip pickers, dirty-tracked Save, instant Kill-switch (PATCH
      state=off+pct=0), embedded change history
    - NewFlagModal.vue: minimal create form (key + description). Everything
      else is configured in the detail panel afterward.
    - CommandPalette: feature-flag rows now come from /api/flags instead of
      the dropped fixture, so newly-created flags are searchable
      immediately
    - data/fixtures.ts: drop FLAGS / FeatureFlag exports (replaced by the
      real backend)

    Smoke-tested end-to-end: list renders 10 seed flags, opening
    gdpr_export_v2 and flipping to rollout 25% then saving persists + adds
    a history entry, kill-switch sets state=off in one click,
    /api/flags/evaluate returns the correct booleans for the seeded tenant,
    same tenant gets the same answer on consecutive evals (determinism),
    and creating + deleting a flag through the UI roundtrips correctly.
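The eval algorithm in the flags commit above can be sketched as follows. The helper names matchesScope, hasAnyScope and inRolloutBucket come from the commit message; everything else is an assumption for illustration: the `TenantContext` shape, the `evaluateFlag` name, passing a tenant object rather than a slug to `evaluateAll`, and reducing the SHA-256 digest to a bucket by reading its first four bytes as an unsigned integer (the commit only says `sha256(...) % 100`).

```typescript
import { createHash } from "node:crypto";

type FlagState = "off" | "on" | "targeted" | "rollout";

interface FlagScope {
  plans: string[];
  tenantSlugs: string[];
  partnerSlugs: string[];
  environments: string[];
}

interface Flag {
  key: string;
  state: FlagState;
  pct: number; // 0..100, only meaningful for state "rollout"
  scope: FlagScope;
}

// Hypothetical: the tenant attributes each scope axis is matched against.
interface TenantContext {
  id: string;
  slug: string;
  plan: string;
  partnerSlug: string | null;
  environment: string;
}

// True when at least one scope axis has entries.
function hasAnyScope(scope: FlagScope): boolean {
  return [scope.plans, scope.tenantSlugs, scope.partnerSlugs, scope.environments]
    .some((axis) => axis.length > 0);
}

// Every non-empty axis must contain the tenant's value; empty axes are skipped.
function matchesScope(scope: FlagScope, t: TenantContext): boolean {
  const axisOk = (allowed: string[], value: string | null): boolean =>
    allowed.length === 0 || (value !== null && allowed.includes(value));
  return (
    axisOk(scope.plans, t.plan) &&
    axisOk(scope.tenantSlugs, t.slug) &&
    axisOk(scope.partnerSlugs, t.partnerSlug) &&
    axisOk(scope.environments, t.environment)
  );
}

// Deterministic bucket in 0..99 per (tenant, flag) pair.
// Assumption: the bucket is the first 4 digest bytes modulo 100.
function inRolloutBucket(tenantId: string, key: string, pct: number): boolean {
  const digest = createHash("sha256").update(`${tenantId}:${key}`).digest();
  return digest.readUInt32BE(0) % 100 < pct;
}

function evaluateFlag(flag: Flag, t: TenantContext): boolean {
  switch (flag.state) {
    case "off":
      return false;
    case "on":
      return true;
    case "targeted":
      // An empty allowlist means "nobody", not "everybody".
      return hasAnyScope(flag.scope) && matchesScope(flag.scope, t);
    case "rollout":
      return matchesScope(flag.scope, t) && inRolloutBucket(t.id, flag.key, flag.pct);
  }
}

// Bulk evaluation: one boolean per flag key.
function evaluateAll(flags: Flag[], t: TenantContext): Record<string, boolean> {
  const result: Record<string, boolean> = {};
  for (const flag of flags) result[flag.key] = evaluateFlag(flag, t);
  return result;
}
```

Because a tenant's bucket is fixed for a given key, raising pct from 25 to 50 leaves every tenant that was already enabled untouched and only adds tenants whose bucket falls in 25..49. Including the flag key in the hash input means a tenant's bucket is computed independently for each flag rather than being the same number everywhere.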
Ronni Baslund	77a09aaf77	feat(operator): live Infrastructure probes + honest split between deployed and planned	2026-05-24 18:47:38 +02:00

    The Infrastructure page used to read from a mock fixture that lied two
    ways: it listed services that aren't deployed (Jitsi, Zulip, Cloudflare,
    Object Storage, Postmark) and showed hardcoded uptime/latency for the
    ones that are. Now it shows truth from real probes plus a
    clearly-labelled "planned" section for the rest.

    Backend (services/platform-api):
    - New src/health/ module: HealthService runs 9 probes in parallel with
      a 1.5s timeout each:
        Stalwart     → TCP stalwart:8080
        OCIS         → HTTP GET ocis:9200/health
        Collabora    → HTTP GET collabora:9980/hosting/discovery
        Authentik    → HTTP GET authentik-server:9000/-/health/ready/
        Postgres     → TCP postgres:5432
        Mongo        → existing Mongoose connection.db.admin().ping()
        Redis        → TCP redis:6379
        Traefik      → TCP traefik:80
        Platform API → trivially ok (this code is running)
      Status thresholds: ok ≤500ms, warn 500–1500ms, bad on timeout/refuse.
    - HealthController exposes GET /health/platform behind JwtAuthGuard,
      plus keeps the existing public GET /health for infra liveness checks.
    - Moved the old src/health.controller.ts into the new module.

    Frontend (apps/operator):
    - /api/health/platform proxy forwards the operator's access token.
    - Infrastructure page swaps SERVICES fixture for useFetch with 30s
      auto-refresh + a manual Refresh button. Cards show real status badge
      + real latency; uptime/error stay as em-dash with a "no probe history
      yet" tooltip until a Prometheus/event-log backend lands.
    - Below the live grid, a "Planned · not deployed" section renders 5
      dimmed cards (Jitsi, Zulip, simpledns.plus, Hetzner Object Storage,
      Postmark). simpledns.plus replaces the misnamed Cloudflare entry; we
      use simpledns.plus, not Cloudflare.
    - Subtitle is now truthful: "8 / 9 services live · checked 2s ago".

    Verified: stopped redis → card flipped to "down · getaddrinfo ENOTFOUND
    redis", subtitle reflected 8/9, incident banner appeared. Restarted →
    back to 9/9, banner gone.

    SERVICES fixture stays in place for Overview's incident banner;
    replacing that is a separate follow-up tied to the incident-management
    backend.
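A TCP probe with the timeout and status thresholds from the health commit above might be sketched like this. It is an illustration under assumptions, not HealthService itself: the `ProbeResult` shape and the function names `classify`, `tcpProbe` and `runTcpProbes` are hypothetical, and the HTTP and Mongo probes are left out.

```typescript
import { connect } from "node:net";

type ProbeStatus = "ok" | "warn" | "bad";

// Hypothetical result shape for one probe.
interface ProbeResult {
  name: string;
  status: ProbeStatus;
  latencyMs: number | null; // null when the probe never connected
  error?: string;
}

// Thresholds from the commit message: ok <= 500ms, warn up to 1500ms,
// anything slower counts as bad (in practice the timeout fires first).
function classify(latencyMs: number): ProbeStatus {
  if (latencyMs <= 500) return "ok";
  if (latencyMs <= 1500) return "warn";
  return "bad";
}

// Opens a TCP connection and reports how long the connect took.
// Never rejects: refusals, DNS failures and timeouts become status "bad".
function tcpProbe(name: string, host: string, port: number, timeoutMs = 1500): Promise<ProbeResult> {
  return new Promise((resolve) => {
    const started = Date.now();
    const socket = connect({ host, port });
    let settled = false;

    const finish = (result: ProbeResult): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      socket.destroy();
      resolve(result);
    };

    const timer = setTimeout(
      () => finish({ name, status: "bad", latencyMs: null, error: "timeout" }),
      timeoutMs,
    );

    socket.once("connect", () => {
      const latencyMs = Date.now() - started;
      finish({ name, status: classify(latencyMs), latencyMs });
    });
    socket.once("error", (err: Error) =>
      finish({ name, status: "bad", latencyMs: null, error: err.message }),
    );
  });
}

// Runs all probes in parallel; total wall time is bounded by the slowest probe.
function runTcpProbes(targets: Array<{ name: string; host: string; port: number }>): Promise<ProbeResult[]> {
  return Promise.all(targets.map((t) => tcpProbe(t.name, t.host, t.port)));
}
```

Resolving with a "bad" result instead of rejecting keeps one unreachable service from failing the whole Promise.all, which fits a page that has to render 8 of 9 services while one is down.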
Ronni Baslund	9fac11e668	feat(operator): notification drawer behind the topbar bell	2026-05-24 17:08:14 +02:00

    Right-anchored slide-in inbox triggered by the bell button. Backend is a
    follow-up; for now this is a visual + behavior shell with mock fixtures,
    same pattern as INCIDENT / FLAGS / OP_AUDIT.

    - data/fixtures.ts: new NotificationItem type + 6 seed rows from the
      design (DMARC, invitation, invoice, SAML, ticket reply, failed
      sign-in)
    - useNotifications composable: isOpen + items + unreadCount + markRead
      + markAllRead. Items deep-clone the fixture on first import so
      toggling unread doesn't mutate the shared seed.
    - NotificationDrawer component: Teleport + scrim + slide animation,
      header/list/footer. Each row shows tone-tinted icon tile + title +
      description + timestamp + left-rail unread dot. Click a row to mark
      read; click Mark all read or Preferences in the footer.
    - OpTopbar: bell now opens the drawer and only shows .icon-btn-dot when
      unreadCount > 0.
    - Layout mounts <NotificationDrawer /> alongside the other floating
      components.

    Dismissal: backdrop click, Escape, X, and route-change watcher (so
    Preferences → /settings closes the drawer cleanly).
Ronni Baslund	e0ac643e80	feat(operator): visual-only screens with real-data overview (O.7)	2026-05-24 08:17:26 +02:00

    - Overview (pages/index.vue): KPIs from real /tenants /partners /users,
      status meter, recent + needs-follow-up tables. Mock activity stream
      and incident banner overlay come from data/fixtures.ts.
    - Operator team: real GET /users filtered to platformAdmin === true,
      with last-seen + tenant counts.
    - Users (global): real read with All/Admins/Inactive views and search.
    - Infrastructure / Feature flags / Audit: mock fixtures only; wiring to
      real backends (Prometheus, OpenFeature, append-only audit) is tracked
      as follow-ups in OPERATOR-PLAN.md.
    - Placeholder pages (support/billing/reports/settings) via
      OpPlaceholder.
    - Shared: Stat, MetricCell, OpPlaceholder components, /api/users proxy,
      PlatformUser type.
    - .gitignore: scope the docker volumes data/ rule so apps/*/data/ is
      tracked again (operator carries mock fixtures there).

5 Commits