feat(audit): cold-storage archival to S3 (Phase 4)

Final piece of the audit work. Events older than the hot retention window
move to S3-compatible object storage with signed manifests. Production uses
Hetzner Object Storage; dev uses a MinIO container with the same API.

Infra (infrastructure/docker-compose):
  - New `minio` service exposing the S3 API at minio:9000 + admin console at
    minio.dezky.local. Healthchecked. Bucket-init sidecar runs `mc mb` once
    to create `dezky-audit`; safe to re-run.
  - .env adds MINIO_ROOT_USER + MINIO_ROOT_PASSWORD.
  - platform-api env: AUDIT_COLD_{ENDPOINT,REGION,BUCKET,ACCESS_KEY,SECRET_KEY}
    + AUDIT_HOT_RETENTION_DAYS=90 + ARCHIVE_ENABLED=false (dormant in dev;
    operator UI's "Run archive now" bypasses this gate). AUDIT_COLD_SSE
    opts into SSE-S3 — left unset in dev because MinIO without a KMS rejects
    AES256 PUTs with "KMS is not configured".
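
A minimal sketch of how the env block above could be turned into client
options, assuming AUDIT_COLD_SSE=true maps to the S3 `AES256` setting and
that missing required variables should fail fast. `ColdStorageConfig` and
`coldConfigFromEnv` are hypothetical names for illustration, not the actual
contents of cold-storage.client.ts.

```typescript
// Hypothetical config shape; field names mirror common S3 client options.
interface ColdStorageConfig {
  endpoint: string
  region: string
  bucket: string
  credentials: { accessKeyId: string; secretAccessKey: string }
  forcePathStyle: true
  serverSideEncryption: 'AES256' | undefined
  hotRetentionDays: number
  archiveEnabled: boolean
}

function coldConfigFromEnv(env: Record<string, string | undefined>): ColdStorageConfig {
  // Required variables throw instead of silently defaulting.
  const need = (name: string): string => {
    const value = env[name]
    if (!value) throw new Error(`missing env var ${name}`)
    return value
  }

  const days = Number(env['AUDIT_HOT_RETENTION_DAYS'] ?? '90')
  if (!Number.isInteger(days) || days < 0) {
    throw new Error('AUDIT_HOT_RETENTION_DAYS must be a non-negative integer')
  }

  return {
    endpoint: need('AUDIT_COLD_ENDPOINT'),
    region: need('AUDIT_COLD_REGION'),
    bucket: need('AUDIT_COLD_BUCKET'),
    credentials: {
      accessKeyId: need('AUDIT_COLD_ACCESS_KEY'),
      secretAccessKey: need('AUDIT_COLD_SECRET_KEY'),
    },
    // Path-style addressing so the same code works against MinIO and Hetzner.
    forcePathStyle: true,
    // Opt-in only: MinIO without a KMS rejects AES256 PUTs.
    serverSideEncryption: env['AUDIT_COLD_SSE'] === 'true' ? 'AES256' : undefined,
    hotRetentionDays: days,
    // Dormant unless explicitly enabled.
    archiveEnabled: env['ARCHIVE_ENABLED'] === 'true',
  }
}
```

With this shape, dev and prod differ only in the env values passed in.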

Platform-api (services/platform-api/src/cold/):
  - cold-storage.client.ts: thin @aws-sdk/client-s3 wrapper — put/head/list.
    forcePathStyle=true so MinIO and Hetzner both work; same code, env-swap.
  - archive.service.ts: runOnce() selects chained events with at < cutoff →
    serializes to JSONL → gzips → computes sha256 → uploads JSONL + signed
    manifest → HEAD-confirms both objects exist → records an ArchiveBatch
    doc → only then deletes from hot Mongo. Crash-safe: a failed upload
    leaves events in hot. The manifest is signed with the Phase 3
    AUDIT_SIGNING_KEY (HMAC-SHA-256), so archives and checkpoints share one
    trust chain. Bypassable via { override: true } for the operator UI's
    force-run.
  - archive.worker.ts: hourly tick guarded by configured run-hour-UTC
    (default 03:00) + day-guard so the same UTC day doesn't archive twice.
    Disabled until ARCHIVE_ENABLED=true.
  - archive-batch.schema.ts: { archivedAt, startSeq, endSeq, eventCount,
    manifestSha256, jsonlKey, manifestKey, bytesUncompressed }. The
    manifest sha256 stored in Mongo lets us detect manifest tampering
    without downloading the actual manifest.
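
The serialize → gzip → hash → sign steps above can be sketched as pure
functions. This is an illustration under stated assumptions, not the code
in archive.service.ts: the manifest field layout, the choice to hash the
gzipped bytes, and the names `buildArchive`, `verifyManifest`, and
`decodeArchive` are all hypothetical. Uploading, HEAD confirmation, and the
Mongo delete are left out.

```typescript
import { createHash, createHmac, timingSafeEqual } from 'node:crypto'
import { gunzipSync, gzipSync } from 'node:zlib'

interface AuditEvent { seq: number; at: string; [key: string]: unknown }

// Assumed manifest layout; the real one may carry different fields.
interface Manifest {
  startSeq: number
  endSeq: number
  eventCount: number
  jsonlKey: string
  jsonlSha256: string
  bytesUncompressed: number
}

const sha256Hex = (data: Buffer | string): string =>
  createHash('sha256').update(data).digest('hex')

function buildArchive(events: AuditEvent[], signingKey: string, jsonlKey: string) {
  const first = events[0]
  const last = events[events.length - 1]
  if (!first || !last) throw new Error('nothing to archive')

  // One JSON document per line, then gzip.
  const jsonl = Buffer.from(events.map((e) => JSON.stringify(e)).join('\n') + '\n', 'utf8')
  const gzipped = gzipSync(jsonl)

  const manifest: Manifest = {
    startSeq: first.seq,
    endSeq: last.seq,
    eventCount: events.length,
    jsonlKey,
    jsonlSha256: sha256Hex(gzipped), // assumption: hash covers the gzipped bytes
    bytesUncompressed: jsonl.length,
  }

  // HMAC-SHA-256 over the serialized manifest, keyed with the signing key.
  const signature = createHmac('sha256', signingKey)
    .update(JSON.stringify(manifest))
    .digest('hex')
  const manifestDoc = JSON.stringify({ manifest, signature })

  // manifestSha256 is what an ArchiveBatch doc would record.
  return { gzipped, manifestDoc, manifestSha256: sha256Hex(manifestDoc) }
}

function verifyManifest(manifestDoc: string, signingKey: string): boolean {
  const parsed = JSON.parse(manifestDoc) as { manifest: Manifest; signature: string }
  const expected = createHmac('sha256', signingKey)
    .update(JSON.stringify(parsed.manifest))
    .digest()
  const actual = Buffer.from(parsed.signature, 'hex')
  // Length check first: timingSafeEqual throws on unequal lengths.
  return actual.length === expected.length && timingSafeEqual(actual, expected)
}

function decodeArchive(gzipped: Buffer): AuditEvent[] {
  return gunzipSync(gzipped)
    .toString('utf8')
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as AuditEvent)
}
```

In this sketch, verification re-serializes the parsed manifest, which relies
on JSON.parse and JSON.stringify preserving key order for string keys.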

Audit module additions:
  - audit.controller.ts: adds GET /audit/archives and POST
    /audit/archive/run. /audit/verify now also reports { oldestHotSeq,
    highestArchivedSeq } so the UI can show the tier boundary.
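
One way a consumer might interpret the two new fields, as a hedged sketch:
`describeTierBoundary` and the `TierBoundary` shape are hypothetical, and
the "overlap" case assumes a crash between recording a batch and deleting
from hot can leave archived events in both tiers.

```typescript
interface TierReport {
  oldestHotSeq: number | null
  highestArchivedSeq: number | null
}

type TierBoundary =
  | { kind: 'empty' }
  | { kind: 'hot-only'; oldestHotSeq: number }
  | { kind: 'cold-only'; highestArchivedSeq: number }
  | { kind: 'split'; oldestHotSeq: number; highestArchivedSeq: number; overlap: boolean }

function describeTierBoundary(report: TierReport): TierBoundary {
  const { oldestHotSeq, highestArchivedSeq } = report
  if (oldestHotSeq === null && highestArchivedSeq === null) return { kind: 'empty' }
  if (highestArchivedSeq === null) return { kind: 'hot-only', oldestHotSeq: oldestHotSeq as number }
  if (oldestHotSeq === null) return { kind: 'cold-only', highestArchivedSeq }
  return {
    kind: 'split',
    oldestHotSeq,
    highestArchivedSeq,
    // Hot events at or below the archived high-water mark exist in both tiers.
    overlap: oldestHotSeq <= highestArchivedSeq,
  }
}
```

For the smoke test in this commit (hot set emptied, highestArchivedSeq=1446)
this sketch would report the `cold-only` case.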

Operator UI (apps/operator):
  - 2 new proxies: /api/audit/archives + /api/audit/archive/run (force
    override=true). Both behind operator auth via the existing platformApi
    helper.
  - audit.vue: new "Cold storage" card with batch table (archived-at, seq
    range, event count, size, truncated manifest sha256), "Run archive
    now" button + per-run result line.

Smoke-tested end-to-end:
  - 7 chained events in hot. /api/audit/archive/run → ok=true, batchId
    returned. JSONL + manifest both exist in MinIO (verified via mc ls +
    mc cat). Mongo's chained set went 7 → 0. Verify reports
    highestArchivedSeq=1446 (far above 7 because seqs are allocated and
    burned on Authentik dup-key rejections). Operator /audit panel shows
    the batch with manifest hash 1d8263…
  - First attempt with SSE-S3 enabled failed cleanly (MinIO KMS not
    configured) — archive service correctly left events in hot Mongo.
    Made SSE opt-in via AUDIT_COLD_SSE=true; prod turns it on.

Out of scope (each could be its own session):
  - Restore-to-hot endpoint (today: download from S3 + offline query)
  - Client-side encryption (today: SSE-S3 in prod, none in dev)
  - Multi-region replication
  - Soft TTL safety net (defense-in-depth on top of app-managed deletion)

This completes the four-phase audit log work:
  1. platform-api as audit hub
  2. External system ingest (Authentik / Stalwart / OCIS)
  3. Hash-chain + signed checkpoints (tamper evidence)
  4. Cold-storage archival (retention without unbounded Mongo growth)
Ronni Baslund
2026-05-24 21:03:41 +02:00
parent 9435baa09d
commit 4d9e906ec1
13 changed files with 1279 additions and 10 deletions
+115 -4
@@ -72,13 +72,16 @@ function shortIp(ip?: string) {
return ip.replace(/^::ffff:/, '')
}
// ── Tamper-evidence (Phase 3) ──────────────────────────────────────────
// ── Tamper-evidence (Phase 3) + cold-storage archives (Phase 4) ────────
interface VerifyReport {
ok: boolean
totalEventsVerified: number
checkpointsChecked: number
latestCheckpointAt: string | null
latestVerifiedSeq: number | null
// Phase 4 additions — included in /audit/verify response.
oldestHotSeq: number | null
highestArchivedSeq: number | null
break?:
| { kind: 'event-hash-mismatch'; seq: number; expected: string; actual: string }
| { kind: 'event-prev-hash-mismatch'; seq: number; expected: string; actual: string }
@@ -90,13 +93,30 @@ interface CheckpointSummary {
headHash: string | null
reason?: string
}
interface ArchiveBatch {
_id: string
archivedAt: string
startSeq: number
endSeq: number
eventCount: number
manifestSha256: string
jsonlKey: string
manifestKey: string
bytesUncompressed: number
}
const { data: latestCp, refresh: refreshCp } = useLazyFetch<CheckpointSummary>(
'/api/audit/checkpoint/latest',
{ default: () => ({ at: null, headSeq: null, headHash: null }), server: false },
)
const { data: archives, refresh: refreshArchives } = useLazyFetch<ArchiveBatch[]>(
'/api/audit/archives',
{ default: () => [], server: false },
)
const verifyReport = ref<VerifyReport | null>(null)
const verifying = ref(false)
const archiving = ref(false)
const archiveResult = ref<{ ok: boolean; reason?: string; eventCount?: number; startSeq?: number; endSeq?: number } | null>(null)
async function runVerify() {
verifying.value = true
@@ -113,6 +133,23 @@ async function forceCheckpoint() {
await refreshCp()
}
async function forceArchive() {
archiving.value = true
archiveResult.value = null
try {
archiveResult.value = await $fetch('/api/audit/archive/run', { method: 'POST' })
await Promise.all([refreshArchives(), refresh()])
} finally {
archiving.value = false
}
}
function fmtBytes(n: number): string {
if (n < 1024) return `${n} B`
if (n < 1024 * 1024) return `${(n / 1024).toFixed(1)} KB`
return `${(n / 1024 / 1024).toFixed(1)} MB`
}
function fmtRelative(iso: string | null | undefined): string {
if (!iso) return 'never'
const ms = Date.now() - new Date(iso).getTime()
@@ -259,10 +296,62 @@ function fmtRelative(iso: string | null | undefined): string {
</div>
</Card>
<!-- Cold-storage archives Phase 4 -->
<Card :pad="0" class="archive-card">
<div class="archive-head">
<div>
<Eyebrow>Cold storage</Eyebrow>
<div class="cap">
{{
archives?.length
? `archived through seq ${archives[0].endSeq} · ${archives.length} batch${archives.length === 1 ? '' : 'es'}`
: 'no archives yet · 90-day hot retention'
}}
</div>
</div>
<UiButton variant="secondary" :disabled="archiving" @click="forceArchive">
{{ archiving ? 'Archiving' : 'Run archive now' }}
</UiButton>
</div>
<div v-if="archiveResult" class="archive-result" :data-ok="archiveResult.ok">
<Badge v-if="archiveResult.ok && archiveResult.eventCount" tone="ok" dot>archived</Badge>
<Badge v-else-if="archiveResult.ok" tone="info" dot>no-op</Badge>
<Badge v-else tone="bad" dot>failed</Badge>
<Mono v-if="archiveResult.ok && archiveResult.eventCount">
{{ archiveResult.eventCount }} event(s) · seq {{ archiveResult.startSeq }}–{{ archiveResult.endSeq }}
</Mono>
<Mono v-else dim>{{ archiveResult.reason || '—' }}</Mono>
</div>
<table v-if="archives?.length">
<thead>
<tr>
<th>Archived</th>
<th>Seq range</th>
<th>Events</th>
<th>Size</th>
<th>Manifest sha256</th>
</tr>
</thead>
<tbody>
<tr v-for="b in archives" :key="b._id">
<td><Mono>{{ fmtAbs(b.archivedAt) }}</Mono></td>
<td><Mono>{{ b.startSeq }}–{{ b.endSeq }}</Mono></td>
<td><Mono>{{ b.eventCount }}</Mono></td>
<td><Mono dim>{{ fmtBytes(b.bytesUncompressed) }}</Mono></td>
<td><Mono dim>{{ b.manifestSha256.slice(0, 16) }}</Mono></td>
</tr>
</tbody>
</table>
<div v-else class="empty"><Mono dim>// no archive batches yet — events stay in hot Mongo for {{ '90' }} days, then move to S3 (MinIO in dev / Hetzner in prod)</Mono></div>
</Card>
<Mono dim class="note">
// sourced from /audit on platform-api · append-only · sha256 hash-chain
with HMAC-signed checkpoints every 100 events or 5 minutes · retention
+ cold-storage archival to Hetzner Object Storage is Phase 4
// hot tier: Mongo · cold tier: S3-compatible object storage ·
sha256 hash-chain with HMAC-signed checkpoints + signed archive
manifests · retention 90 days hot, indefinite cold · production
encryption at rest is SSE-S3
</Mono>
</div>
</div>
@@ -368,4 +457,26 @@ td.actor { display: flex; align-items: center; gap: 10px; }
.verify-result[data-ok="true"] { background: rgba(31, 138, 91, 0.05); }
.verify-result[data-ok="false"] { background: rgba(240, 88, 88, 0.06); }
.result-line { display: flex; align-items: center; gap: 12px; flex-wrap: wrap; }
/* Cold-storage archives panel */
.archive-card { margin-top: 8px; }
.archive-head {
padding: 14px 18px;
display: flex; justify-content: space-between; align-items: center;
gap: 16px;
border-bottom: 1px solid var(--border);
}
.archive-head .cap { font-family: var(--font-display); font-weight: 600; font-size: 15px; margin-top: 2px; }
.archive-result {
padding: 10px 18px;
display: flex; align-items: center; gap: 12px; flex-wrap: wrap;
border-bottom: 1px solid var(--border);
}
.archive-result[data-ok="true"] { background: rgba(31, 138, 91, 0.05); }
.archive-result[data-ok="false"] { background: rgba(240, 88, 88, 0.06); }
.archive-card table { width: 100%; border-collapse: collapse; }
.archive-card th, .archive-card td { padding: 10px 18px; font-size: 12px; text-align: left; }
.archive-card th { font-family: var(--font-mono); font-size: 9px; letter-spacing: 0.12em; text-transform: uppercase; color: var(--text-mute); font-weight: 500; border-bottom: 1px solid var(--border); }
.archive-card td { border-top: 1px solid var(--border); }
.archive-card .empty { padding: 16px 18px; }
</style>
@@ -0,0 +1,9 @@
import { platformApi } from '~~/server/utils/platform-api'
export default defineEventHandler(async (event) => {
// We force override=true here because this proxy is only callable from the
// operator UI's "Run archive now" button, which is explicitly a dev/ops
// exercise of the cold-storage path. Production may want to remove this
// proxy entirely once schedulers are trusted.
return platformApi(event, '/audit/archive/run?override=true', { method: 'POST' })
})
@@ -0,0 +1,3 @@
import { platformApi } from '~~/server/utils/platform-api'
export default defineEventHandler((event) => platformApi(event, '/audit/archives'))