Ronni Baslund 5407c04682 docs: feature-flag usage guide + cross-links
New docs/FEATURE-FLAGS.md captures when to add a flag, where the moving
parts live, how to use useFeatureFlag from app code, the 4 states + 4
scope axes, kill-switch flow, naming conventions, and the parts we know
aren't built yet (partnerSlug eval context, user-level flags, audit-log
integration, server-side cache).

CLAUDE.md gets a one-line convention entry under "Code conventions" so
future devs notice it when grepping for code rules. NEXT-STEPS.md is
updated: the feature-flag backend follow-up is now ticked done with a
pointer to FEATURE-FLAGS.md for the remaining sub-tasks, and the
"What landed" section reflects the real Infrastructure + Flags pages
and the notification drawer.
2026-05-24 19:29:24 +02:00

# Feature flags
Dezky has a real, tenant-aware feature flag system. Use it whenever you ship
something that should roll out incrementally, be gated per plan/tenant, or
needs an instant kill switch in production. Don't hide risky behavior behind
hardcoded `if (env === ...)` checks; flip a flag instead.
## When to add a flag
- The change can break things for real customers and you want a kill switch
- You want to ship to internal / friendly tenants first
- The feature is gated by plan tier (Pro/Enterprise)
- You're doing trunk-based development on a feature that takes more than
one PR to land
- Compliance-sensitive features (GDPR export, retention, audit) — kill
switch is mandatory
When you **don't** need one: pure UI tweaks, bug fixes, anything that's safe
to release to everyone at once.
## Where it lives
| Layer | Path | What it does |
|---|---|---|
| Schema + service | `services/platform-api/src/flags/` | CRUD + bulk eval (hash-based rollout) |
| Operator UI | `apps/operator/pages/flags.vue` + `components/FlagDetail.vue` | List, side panel, kill-switch, change history |
| Portal helper | `apps/portal/composables/useFeatureFlag.ts` | What you'll import from app code |
| Seed | `services/platform-api/src/seed/seed.service.ts` (`FLAG_SEEDS`) | The 10 flags created on bootstrap |
## Using a flag from app code
In the customer portal:
```vue
<script setup lang="ts">
const showNewInbox = useFeatureFlag('jmap_native_v2')
</script>
<template>
<NewInbox v-if="showNewInbox" />
<LegacyInbox v-else />
</template>
```
- One bulk eval per session — the composable shares a module-level cache.
- Fail-closed: every flag stays `false` if the eval call errors.
- The returned ref is reactive — gated UI stays hidden during the ~25ms
round-trip and appears when the answer lands.
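
The caching and fail-closed behavior described in the bullets can be sketched without Vue. This is an illustration only, not the composable's actual code: `createFlagCache`, `FlagMap`, and `FlagLoader` are hypothetical names, and the loader stands in for whatever bulk eval call the portal really makes.

```typescript
// Hypothetical shape of a bulk-eval result: flag key -> enabled.
type FlagMap = Record<string, boolean>;

// Hypothetical loader; in the portal this would be the bulk eval HTTP call.
type FlagLoader = () => Promise<FlagMap>;

function createFlagCache(loader: FlagLoader) {
  // Shared by every lookup: the first caller starts the request, the rest reuse it.
  let pending: Promise<FlagMap> | null = null;

  const load = (): Promise<FlagMap> => {
    if (pending === null) {
      // Fail closed: a failed eval resolves to an empty map, so every lookup is false.
      pending = Promise.resolve()
        .then(loader)
        .catch((): FlagMap => ({}));
    }
    return pending;
  };

  return {
    async isEnabled(key: string): Promise<boolean> {
      const flags = await load();
      // Unknown keys are treated as disabled.
      return flags[key] === true;
    },
  };
}
```

Creating one such cache at module scope would give the "one bulk eval per session" property; a reactive wrapper would start at `false` and flip once the promise resolves.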
For multi-flag panels or long-lived sessions:
```ts
const { flags, ready, refresh } = useFeatureFlags()
```
The composable's tenant context comes from the signed-in user's JWT — no
slug parameter. Operator-side checks (where there's no "current tenant")
go directly through `POST /api/flags/evaluate` with an explicit
`{ tenantSlug }`.
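
For the operator-side path, a call might look roughly like the sketch below. Only the route and the `{ tenantSlug }` body come from this guide; the response shape (a key-to-boolean map), the `evaluateForTenant` helper, and the `FetchLike` type are assumptions, and auth headers are left out.

```typescript
// Assumed response shape: flag key -> enabled. The guide does not specify it.
type EvalResponse = Record<string, boolean>;

// Minimal fetch-like signature so the sketch can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function evaluateForTenant(
  tenantSlug: string,
  fetchImpl: FetchLike,
): Promise<EvalResponse> {
  const res = await fetchImpl("/api/flags/evaluate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Operator code has no "current tenant", so the slug is passed explicitly.
    body: JSON.stringify({ tenantSlug }),
  });
  if (!res.ok) {
    throw new Error(`flag evaluation failed: HTTP ${res.status}`);
  }
  return (await res.json()) as EvalResponse;
}
```

Injecting the fetch implementation keeps the helper easy to test; in a browser or recent Node runtime the global `fetch` should fit this signature.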
## Adding a new flag
1. **Add to the seed list** in
`services/platform-api/src/seed/seed.service.ts → FLAG_SEEDS`. This
documents what the flag is for and ensures every environment gets it
on bootstrap. State defaults to `off` for safety.
2. **Restart platform-api** (or wait for HMR + the bootstrap hook). New
keys are upserted via `$setOnInsert` so existing operator edits
survive.
3. **Open `https://operator.dezky.local/flags`**, click the row, set
targeting/rollout, save.
4. **Reference the key** from app code via `useFeatureFlag('your_key')`.
Alternative: create the flag directly through the operator UI's
"New flag" button. The seed list is for keys that should always exist;
the UI is for ad-hoc experiments.
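
To illustrate why `$setOnInsert` keeps operator edits safe, here is a minimal sketch of how seed entries could be turned into MongoDB upsert operations. The `FlagSeed` fields and the `buildSeedOps` helper are hypothetical; the real `FLAG_SEEDS` entries and seed service may look different.

```typescript
// Hypothetical seed shape; the real FLAG_SEEDS entries may carry other fields.
interface FlagSeed {
  key: string;
  description: string;
  state: "off" | "on" | "targeted" | "rollout";
}

const EXAMPLE_SEEDS: FlagSeed[] = [
  // New flags start `off` for safety.
  { key: "your_key", description: "What this flag gates and why", state: "off" },
];

// Builds bulkWrite-style operations. $setOnInsert only applies when the upsert
// creates a document, so a flag an operator has already edited is left untouched.
function buildSeedOps(seeds: FlagSeed[]) {
  return seeds.map((seed) => ({
    updateOne: {
      filter: { key: seed.key },
      update: { $setOnInsert: seed },
      upsert: true,
    },
  }));
}
```

Using `$set` here instead would overwrite targeting and rollout settings on every restart, which is the failure mode step 2 avoids.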
## The 4 states
| State | Meaning |
|---|---|
| `off` | Disabled for everyone, ignores scope. Default kill-switch state. |
| `on` | Enabled for everyone, ignores scope. |
| `targeted` | Explicit allowlist. Requires non-empty scope — empty allowlist evaluates to false ("nobody is on the list yet"). |
| `rollout` | Scope filter + deterministic hash bucket. `sha256("${tenantId}:${flagKey}") % 100 < pct`. Same tenant always gets the same answer until `pct` changes, so bumping 25→50 only flips the new slice. |
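
The `rollout` bucket can be sketched as follows. The guide gives the formula but not how the 256-bit digest is reduced to a number, so reading the first 4 bytes as an unsigned integer is an assumption, and `rolloutBucket` / `inRollout` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Maps a (tenant, flag) pair to a stable bucket in the range 0..99.
// Assumption: the bucket is taken from the first 4 bytes of the SHA-256 digest.
function rolloutBucket(tenantId: string, flagKey: string): number {
  const digest = createHash("sha256").update(`${tenantId}:${flagKey}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// A tenant is in the rollout when its bucket falls below the percentage.
function inRollout(tenantId: string, flagKey: string, pct: number): boolean {
  return rolloutBucket(tenantId, flagKey) < pct;
}
```

Because the bucket depends only on the tenant and the flag key, raising `pct` adds tenants without removing any, and including the flag key in the hash means a tenant does not land in the same slice for every flag.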
## The 4 scope axes (all optional, AND-ed when set)
- **plans** — `['pro', 'enterprise']`
- **tenantSlugs** — explicit allowlist of tenants
- **partnerSlugs** — partner-level pilots (not wired into eval context yet)
- **environments** — `['prod', 'staging']`
Empty list on an axis = "no restriction on this axis".
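
A minimal sketch of the AND-ed scope check, plus the `targeted` rule that an entirely empty scope matches nobody. The `EvalContext` field names other than `partnerSlug` are hypothetical, and treating a restricted `partnerSlugs` axis as a non-match when the context has no partner slug is an assumption about the service, not documented behavior.

```typescript
interface FlagScope {
  plans?: string[];
  tenantSlugs?: string[];
  partnerSlugs?: string[];
  environments?: string[];
}

// Hypothetical eval context; only `partnerSlug` is named in this guide.
interface EvalContext {
  plan: string;
  tenantSlug: string;
  environment: string;
  partnerSlug?: string;
}

// An unset or empty axis places no restriction on that axis.
function axisAllows(allowed: string[] | undefined, value: string | undefined): boolean {
  if (allowed === undefined || allowed.length === 0) return true;
  return value !== undefined && allowed.includes(value);
}

// Every axis that is set must match (logical AND).
function matchesScope(scope: FlagScope, ctx: EvalContext): boolean {
  return (
    axisAllows(scope.plans, ctx.plan) &&
    axisAllows(scope.tenantSlugs, ctx.tenantSlug) &&
    axisAllows(scope.partnerSlugs, ctx.partnerSlug) &&
    axisAllows(scope.environments, ctx.environment)
  );
}

function isScopeEmpty(scope: FlagScope): boolean {
  return [scope.plans, scope.tenantSlugs, scope.partnerSlugs, scope.environments]
    .every((axis) => axis === undefined || axis.length === 0);
}

// `targeted` is an allowlist: with nothing on the list, nobody is enabled.
function evaluateTargeted(scope: FlagScope, ctx: EvalContext): boolean {
  return !isScopeEmpty(scope) && matchesScope(scope, ctx);
}
```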
## Kill switch
One click in the operator UI flips a flag to `state: 'off'` + `pct: 0` and
appends a `kill-switch` history entry. Use it when something's misbehaving
in production and you need it dark immediately. Then triage at leisure.
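
As a rough model of what the kill switch does to a flag document, consider the sketch below. The `Flag` and `HistoryEntry` shapes and the `applyKillSwitch` helper are hypothetical; the cap of 20 history entries is the one this guide mentions, and keeping the newest entries when trimming is an assumption.

```typescript
type FlagState = "off" | "on" | "targeted" | "rollout";

// Hypothetical document shapes, for illustration only.
interface HistoryEntry {
  action: string;
  by: string;
  at: string;
}

interface Flag {
  key: string;
  state: FlagState;
  pct: number;
  history: HistoryEntry[];
}

const HISTORY_CAP = 20;

// Returns a new flag object: forced off, rollout zeroed, history entry appended.
function applyKillSwitch(flag: Flag, operator: string, now: Date): Flag {
  const entry: HistoryEntry = {
    action: "kill-switch",
    by: operator,
    at: now.toISOString(),
  };
  return {
    ...flag,
    state: "off",
    pct: 0,
    // Assumption: when the cap is hit, the oldest entries are dropped.
    history: [...flag.history, entry].slice(-HISTORY_CAP),
  };
}
```

Resetting `pct` alongside `state` means that re-enabling the flag later starts from a deliberate rollout value rather than silently resuming the old percentage.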
## Conventions
- **Keys** are snake_case, lowercase, start with a letter. Match the regex
in `CreateFlagDto`: `^[a-z][a-z0-9_]{1,62}[a-z0-9]$`.
- **One flag per intent**. Don't reuse `new_thing_v2` for unrelated
features — name them separately.
- **Delete flags** once a feature is `on` for everyone and you've removed
the legacy branch. Stale flags rot fast.
- **Don't gate auth, billing-critical, or audit-logging code** behind a
flag where `false` would silently skip security work. Flags should
pick between two correct paths, not enable correctness.
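
The key rule is easy to check locally before opening a PR. The sketch below reuses the regex quoted above; `isValidFlagKey` is a hypothetical helper, not something exported by `CreateFlagDto`.

```typescript
// Same pattern as quoted in the conventions: 3 to 64 characters, starts with a
// letter, ends with a letter or digit, only lowercase letters, digits, underscores.
const FLAG_KEY_PATTERN = /^[a-z][a-z0-9_]{1,62}[a-z0-9]$/;

function isValidFlagKey(key: string): boolean {
  return FLAG_KEY_PATTERN.test(key);
}

console.log(isValidFlagKey("jmap_native_v2")); // true
console.log(isValidFlagKey("New_Thing"));      // false: uppercase letters
console.log(isValidFlagKey("new_thing_"));     // false: trailing underscore
```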
## What's not built yet
- **partnerSlug eval context** — the schema axis exists but the service
doesn't currently hydrate `ctx.partnerSlug` from the tenant doc.
Add when the first partner-gated flag actually needs it.
- **User-level flags** — eval is tenant-level only. If you need
per-individual gating (e.g. internal preview for specific staff),
combine `targeted` with a synthetic single-user tenant for now.
- **Audit log integration** — flag changes write to embedded `history`
on the flag doc, capped at 20. Switch to the real audit collection
once that exists.
- **Server-side cache** — `evaluateAll` re-reads all flags from Mongo
on every call. With roughly 10 to 50 flags this is fine; if a service ends up
evaluating per-request and flag count grows, add a small TTL cache
(~5s) in `FlagsService`.
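
If the server-side cache is ever needed, a small TTL wrapper along these lines might be enough. This is a sketch, not existing code: `withTtlCache` is a hypothetical helper, the loader stands in for the Mongo read inside `evaluateAll`, and the injectable clock exists only to make the behavior testable.

```typescript
// Wraps an async loader so repeated calls within `ttlMs` reuse one result.
function withTtlCache<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: Promise<T>; expiresAt: number } | null = null;

  return (): Promise<T> => {
    const t = now();
    if (cached !== null && t < cached.expiresAt) {
      return cached.value;
    }
    // Cache the promise itself so concurrent callers share one in-flight read.
    const entry = { value: load(), expiresAt: t + ttlMs };
    cached = entry;
    // Do not keep a failed load around for the rest of the TTL.
    entry.value.catch(() => {
      if (cached === entry) cached = null;
    });
    return entry.value;
  };
}
```

With a TTL of about 5 seconds, a kill switch could take up to that long to reach a service that reads through this cache, which is the trade-off to weigh before adding it.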