feat(audit): cold-storage archival to S3 (Phase 4)

Final piece of the audit work. Events older than the hot retention window
move to S3-compatible object storage with signed manifests. Production uses
Hetzner Object Storage; dev uses a MinIO container with the same API.

Infra (infrastructure/docker-compose):
- New `minio` service exposing the S3 API at minio:9000 + admin console at
minio.dezky.local. Healthchecked. Bucket-init sidecar runs `mc mb` once
to create `dezky-audit`; safe to re-run.
- .env adds MINIO_ROOT_USER + MINIO_ROOT_PASSWORD.
- platform-api env: AUDIT_COLD_{ENDPOINT,REGION,BUCKET,ACCESS_KEY,SECRET_KEY}
+ AUDIT_HOT_RETENTION_DAYS=90 + ARCHIVE_ENABLED=false (dormant in dev;
operator UI's "Run archive now" bypasses this gate). AUDIT_COLD_SSE
opts into SSE-S3 — left unset in dev because MinIO without a KMS rejects
AES256 PUTs with "KMS is not configured".
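The env contract above could be read into a typed config roughly as follows. This is a minimal sketch, not the code in platform-api: `ColdStorageConfig`, `loadColdStorageConfig`, and `hotCutoff` are hypothetical names, and treating only the literal string "true" as enabled is an assumption. The env var names and the 90-day default come from this commit.

```typescript
// Hypothetical config shape; field names mirror the env vars in this commit.
interface ColdStorageConfig {
  endpoint: string;
  region: string;
  bucket: string;
  accessKey: string;
  secretKey: string;
  hotRetentionDays: number;
  archiveEnabled: boolean; // gates the scheduled worker only
  sse: boolean; // opt-in SSE-S3; left off for MinIO without a KMS
}

type Env = Record<string, string | undefined>;

// Fail fast on missing settings instead of failing at the first upload.
function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required env var ${name}`);
  }
  return value;
}

function loadColdStorageConfig(env: Env): ColdStorageConfig {
  const days = Number(env["AUDIT_HOT_RETENTION_DAYS"] ?? "90");
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error("AUDIT_HOT_RETENTION_DAYS must be a positive integer");
  }
  return {
    endpoint: required(env, "AUDIT_COLD_ENDPOINT"),
    region: required(env, "AUDIT_COLD_REGION"),
    bucket: required(env, "AUDIT_COLD_BUCKET"),
    accessKey: required(env, "AUDIT_COLD_ACCESS_KEY"),
    secretKey: required(env, "AUDIT_COLD_SECRET_KEY"),
    hotRetentionDays: days,
    // Assumption: anything other than the string "true" counts as disabled.
    archiveEnabled: env["ARCHIVE_ENABLED"] === "true",
    sse: env["AUDIT_COLD_SSE"] === "true",
  };
}

// Events with `at` strictly before this instant are eligible for archival.
function hotCutoff(now: Date, hotRetentionDays: number): Date {
  return new Date(now.getTime() - hotRetentionDays * 24 * 60 * 60 * 1000);
}
```

In a service this would typically be called once at startup with the process environment, so a missing AUDIT_COLD_* value surfaces before the first archive run.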

Platform-api (services/platform-api/src/cold/):
- cold-storage.client.ts: thin @aws-sdk/client-s3 wrapper — put/head/list.
forcePathStyle=true so MinIO and Hetzner both work; same code, env-swap.
- archive.service.ts: runOnce() selects chained events with at < cutoff →
serializes to JSONL → gzips → computes sha256 → uploads JSONL + signed
manifest → HEAD-confirms both objects exist → records an ArchiveBatch doc
→ only then deletes from hot Mongo. Crash-safe: a failed upload leaves
events in hot. Manifest uses the Phase 3 AUDIT_SIGNING_KEY
(HMAC-SHA-256), so archives + checkpoints share one trust chain. The
ARCHIVE_ENABLED gate is bypassable via { override: true } for the
operator UI's force-run.
- archive.worker.ts: hourly tick guarded by configured run-hour-UTC
(default 03:00) + day-guard so the same UTC day doesn't archive twice.
Disabled until ARCHIVE_ENABLED=true.
- archive-batch.schema.ts: { archivedAt, startSeq, endSeq, eventCount,
manifestSha256, jsonlKey, manifestKey, bytesUncompressed }. The
manifest sha256 stored in Mongo lets us detect manifest tampering
without downloading the actual manifest.
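The serialize, gzip, hash, and sign steps, plus the worker's hour/day guard, can be sketched as below. This is a sketch under stated assumptions, not the code in archive.service.ts or archive.worker.ts: the manifest field names, signing `JSON.stringify` of the unsigned manifest, hashing the gzipped bytes rather than the raw JSONL, and the `buildBatch` / `shouldRunNow` signatures are all hypothetical. Only the step order and HMAC-SHA-256 come from this commit. The upload, HEAD check, ArchiveBatch insert, and hot delete are left out.

```typescript
import { createHash, createHmac } from "node:crypto";
import { gzipSync } from "node:zlib";

// Hypothetical minimal event shape; real audit events carry more fields.
interface AuditEvent {
  seq: number;
  at: string;
  [key: string]: unknown;
}

// Hypothetical manifest layout.
interface ManifestBody {
  startSeq: number;
  endSeq: number;
  eventCount: number;
  bytesUncompressed: number;
  jsonlSha256: string; // assumption: sha256 of the gzipped object as uploaded
}

interface SignedManifest extends ManifestBody {
  signature: string; // hex HMAC-SHA-256 over JSON.stringify(ManifestBody)
}

function sha256Hex(data: Buffer | string): string {
  return createHash("sha256").update(data).digest("hex");
}

function buildBatch(events: AuditEvent[], signingKey: string) {
  if (events.length === 0) throw new Error("nothing to archive");
  const sorted = [...events].sort((a, b) => a.seq - b.seq);
  // One JSON document per line, newline-terminated.
  const jsonl = sorted.map((e) => JSON.stringify(e)).join("\n") + "\n";
  const gzipped = gzipSync(Buffer.from(jsonl, "utf8"));
  const body: ManifestBody = {
    startSeq: sorted[0].seq,
    endSeq: sorted[sorted.length - 1].seq,
    eventCount: sorted.length,
    bytesUncompressed: Buffer.byteLength(jsonl, "utf8"),
    jsonlSha256: sha256Hex(gzipped),
  };
  // Note: key order matters here; a real implementation would want a
  // canonical serialization before signing.
  const signature = createHmac("sha256", signingKey)
    .update(JSON.stringify(body))
    .digest("hex");
  const manifest: SignedManifest = { ...body, signature };
  const manifestJson = JSON.stringify(manifest);
  // manifestSha256 is what an ArchiveBatch record would store.
  return { gzipped, manifest, manifestJson, manifestSha256: sha256Hex(manifestJson) };
}

// Worker guard: run only in the configured UTC hour, at most once per UTC day.
function shouldRunNow(now: Date, runHourUtc: number, lastRunDayUtc: string | null): boolean {
  const today = now.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return now.getUTCHours() === runHourUtc && lastRunDayUtc !== today;
}
```

Keeping the delete from hot storage as the last step, after both objects are confirmed and the batch is recorded, is what makes a mid-run crash leave events in hot rather than lose them.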

Audit module additions:
- audit.controller.ts: adds GET /audit/archives and POST /audit/archive/run;
/audit/verify now reports { oldestHotSeq, highestArchivedSeq } so the
UI shows the tier boundary.
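One way a consumer might interpret the two boundary fields is sketched here. `VerifyTierInfo`, `TierState`, and `classifyTierBoundary` are hypothetical names, and using null for an empty tier is an assumption; the actual /audit/verify response shape beyond the two field names is not shown in this commit. Because seqs can have gaps, the check only requires that the tiers do not overlap, not that they are contiguous.

```typescript
// Hypothetical view of the two tier-boundary fields from /audit/verify.
interface VerifyTierInfo {
  oldestHotSeq: number | null; // assumption: null when hot storage is empty
  highestArchivedSeq: number | null; // assumption: null when nothing is archived
}

type TierState = "nothing-archived" | "hot-empty" | "overlap" | "disjoint";

function classifyTierBoundary(info: VerifyTierInfo): TierState {
  const { oldestHotSeq, highestArchivedSeq } = info;
  if (highestArchivedSeq === null) return "nothing-archived";
  if (oldestHotSeq === null) return "hot-empty";
  // A hot event at or below the archived high-water mark would mean an
  // archived event was not deleted from hot, or the tiers interleave.
  return oldestHotSeq <= highestArchivedSeq ? "overlap" : "disjoint";
}
```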

Operator UI (apps/operator):
- 2 new proxies: /api/audit/archives + /api/audit/archive/run (the latter
forces override=true). Both behind operator auth via the existing
platformApi helper.
- audit.vue: new "Cold storage" card with batch table (archived-at, seq
range, event count, size, truncated manifest sha256), "Run archive
now" button + per-run result line.

Smoke-tested end-to-end:
- 7 chained events in hot. /api/audit/archive/run → ok=true, batchId
returned. JSONL + manifest both exist in MinIO (verified via mc ls +
mc cat). Mongo's chained set went 7 → 0. Verify reports
highestArchivedSeq=1446, far above the event count because seqs are
burn-allocated on Authentik dup-key rejections. Operator /audit panel
shows the batch with manifest hash 1d8263…
- First attempt with SSE-S3 enabled failed cleanly (MinIO KMS not
configured) — archive service correctly left events in hot Mongo.
Made SSE opt-in via AUDIT_COLD_SSE=true; prod turns it on.

Out of scope (each could be its own session):
- Restore-to-hot endpoint (today: download from S3 + offline query)
- Client-side encryption (today: SSE-S3 in prod, none in dev)
- Multi-region replication
- Soft TTL safety net (defense-in-depth on top of app-managed deletion)
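Until a restore endpoint exists, the "download + offline query" path might look like the sketch below: verify the manifest signature, check the JSONL hash, then parse and filter. It is a sketch under assumptions, not shipped code: the manifest fields, signing over `JSON.stringify` of the manifest minus `signature`, hashing the gzipped bytes, and the name `verifyAndParse` are all hypothetical. The in-memory batch at the end stands in for objects already downloaded from the bucket.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";
import { gunzipSync, gzipSync } from "node:zlib";

// Hypothetical manifest shape; the real manifest's fields are not shown here.
interface SignedManifest {
  eventCount: number;
  jsonlSha256: string; // assumed: hex sha256 of the gzipped JSONL object
  signature: string; // assumed: hex HMAC-SHA-256 over the manifest minus `signature`
}

function verifyAndParse(
  gzipped: Buffer,
  manifest: SignedManifest,
  signingKey: string,
): Record<string, unknown>[] {
  // 1. Check the manifest signature before trusting any of its fields.
  const { signature, ...body } = manifest;
  const expected = createHmac("sha256", signingKey).update(JSON.stringify(body)).digest();
  const actual = Buffer.from(signature, "hex");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("manifest signature mismatch");
  }
  // 2. Check that the data object is the one the manifest describes.
  const digest = createHash("sha256").update(gzipped).digest("hex");
  if (digest !== manifest.jsonlSha256) throw new Error("JSONL sha256 mismatch");
  // 3. Decompress and parse one JSON document per non-empty line.
  const events = gunzipSync(gzipped)
    .toString("utf8")
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Record<string, unknown>);
  if (events.length !== manifest.eventCount) throw new Error("event count mismatch");
  return events;
}

// Usage with an in-memory batch standing in for downloaded objects.
const demoKey = "dev-only-key";
const demoGz = gzipSync(
  Buffer.from('{"seq":1,"action":"login"}\n{"seq":2,"action":"logout"}\n', "utf8"),
);
const demoBody = {
  eventCount: 2,
  jsonlSha256: createHash("sha256").update(demoGz).digest("hex"),
};
const demoManifest: SignedManifest = {
  ...demoBody,
  signature: createHmac("sha256", demoKey).update(JSON.stringify(demoBody)).digest("hex"),
};
const logins = verifyAndParse(demoGz, demoManifest, demoKey).filter(
  (e) => e["action"] === "login",
);
```

Verifying the signature first means a tampered manifest is rejected before its hash or count fields are used for anything.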

This completes the four-phase audit log work:
1. platform-api as audit hub
2. External system ingest (Authentik / Stalwart / OCIS)
3. Hash-chain + signed checkpoints (tamper evidence)
4. Cold-storage archival (retention without unbounded Mongo growth)
@@ -32,6 +32,9 @@ volumes:
   portal_node_modules:
   platform_api_node_modules:
   operator_node_modules:
+  # MinIO data (S3-compatible cold storage for audit archives). Production
+  # swaps the endpoint to Hetzner Object Storage and this volume goes away.
+  minio_data:

 services:
   # ─────────────────────────────────────────────────────────────────
@@ -127,6 +130,52 @@ services:
       timeout: 3s
       retries: 5

+  # ─────────────────────────────────────────────────────────────────
+  # MinIO — S3-compatible cold storage for audit archives (Phase 4).
+  # Production swaps endpoint to Hetzner Object Storage; same protocol.
+  # ─────────────────────────────────────────────────────────────────
+  minio:
+    image: minio/minio:latest
+    container_name: dezky-minio
+    restart: unless-stopped
+    command: server /data --console-address ":9001"
+    environment:
+      MINIO_ROOT_USER: ${MINIO_ROOT_USER}
+      MINIO_ROOT_PASSWORD: ${MINIO_ROOT_PASSWORD}
+    volumes:
+      - minio_data:/data
+    networks: [dezky]
+    healthcheck:
+      test: ["CMD", "mc", "ready", "local"]
+      interval: 10s
+      timeout: 3s
+      retries: 5
+    labels:
+      - traefik.enable=true
+      # Optional: expose MinIO admin UI behind Traefik. Dev only — production
+      # uses Hetzner's console.
+      - traefik.http.routers.minio.rule=Host(`minio.dezky.local`)
+      - traefik.http.routers.minio.tls=true
+      - traefik.http.services.minio.loadbalancer.server.port=9001
+
+  # One-shot init container that creates the audit bucket if it doesn't
+  # exist. Idempotent — re-running is a no-op. Exits cleanly so docker
+  # doesn't restart it.
+  minio-init:
+    image: minio/mc:latest
+    container_name: dezky-minio-init
+    depends_on:
+      minio:
+        condition: service_healthy
+    networks: [dezky]
+    entrypoint: >
+      sh -c "
+      mc alias set local http://minio:9000 ${MINIO_ROOT_USER} ${MINIO_ROOT_PASSWORD} &&
+      mc mb --ignore-existing local/dezky-audit &&
+      echo 'MinIO bucket dezky-audit ready'
+      "
+    restart: "no"
+
   # ─────────────────────────────────────────────────────────────────
   # Authentik — Identity provider (OIDC/SAML SSO)
   # ─────────────────────────────────────────────────────────────────
@@ -485,6 +534,18 @@ services:
       # out the current segment with a key-rotation checkpoint (not in scope
       # for Phase 3). Prod swaps HMAC for ed25519 from an HSM.
       AUDIT_SIGNING_KEY: ${AUDIT_SIGNING_KEY}
+      # Cold storage (Phase 4). Dev uses MinIO on the docker network; prod
+      # swaps endpoint to Hetzner Object Storage and provides real IAM keys.
+      # ARCHIVE_ENABLED defaults to false in dev so the worker doesn't move
+      # data we still want to query while building. The UI "Run archive now"
+      # button bypasses this gate.
+      AUDIT_COLD_ENDPOINT: http://minio:9000
+      AUDIT_COLD_REGION: us-east-1
+      AUDIT_COLD_BUCKET: dezky-audit
+      AUDIT_COLD_ACCESS_KEY: ${MINIO_ROOT_USER}
+      AUDIT_COLD_SECRET_KEY: ${MINIO_ROOT_PASSWORD}
+      AUDIT_HOT_RETENTION_DAYS: "90"
+      ARCHIVE_ENABLED: "false"
     volumes:
       - ../../services/platform-api:/app
       - platform_api_node_modules:/app/node_modules