feat(audit): hash-chain tamper evidence + signed checkpoints (Phase 3)

The audit log now carries cryptographic chain-of-custody. Every chained
event references the previous event's sha256, and periodic checkpoints
sign the head with HMAC-SHA-256. An attacker who modifies a historical
row must also forge every checkpoint signature past it — which requires
the AUDIT_SIGNING_KEY, kept outside Mongo.
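
The chaining idea above can be sketched in a few lines. This is an
illustration only: the `ChainedEvent` shape, `hashEvent`, `append` and
`firstBreak` are hypothetical names, and the hash input is reduced to
three fields rather than the full canonical form this commit uses.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical minimal event shape; the real schema carries many more fields.
interface ChainedEvent {
  seq: number
  action: string
  prevHash: string
  hash: string
}

const GENESIS = 'GENESIS'

// The hash covers seq, action and prevHash, but never the hash field itself.
function hashEvent(seq: number, action: string, prevHash: string): string {
  return createHash('sha256').update(JSON.stringify({ action, prevHash, seq })).digest('hex')
}

function append(chain: ChainedEvent[], action: string): ChainedEvent[] {
  const last = chain[chain.length - 1]
  const prevHash = last ? last.hash : GENESIS
  const seq = chain.length + 1
  return [...chain, { seq, action, prevHash, hash: hashEvent(seq, action, prevHash) }]
}

// Returns the seq of the first event that fails verification, or null if intact.
function firstBreak(chain: ChainedEvent[]): number | null {
  let prevHash = GENESIS
  for (const evt of chain) {
    if (evt.prevHash !== prevHash) return evt.seq
    if (evt.hash !== hashEvent(evt.seq, evt.action, evt.prevHash)) return evt.seq
    prevHash = evt.hash
  }
  return null
}

let chain: ChainedEvent[] = []
for (const action of ['tenant.create', 'user.invite', 'user.delete']) chain = append(chain, action)

// Editing a row in place without recomputing its hash is caught at that row.
const tampered = chain.map((e) => (e.seq === 2 ? { ...e, action: 'user.promote' } : e))
```

Here `firstBreak(chain)` returns null and `firstBreak(tampered)` returns 2.
An attacker who also recomputes every later hash defeats this walk on its
own, which is the gap the signed checkpoints are meant to close.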

Schema (services/platform-api/src/schemas/):
  - audit-event.schema.ts: new `seq` (monotonic) + `chained` (Phase-3-or-
    later flag) + `prevHash` + `hash`. Compound unique index on seq with
    partial filter so pre-Phase-3 rows don't collide on null.
  - audit-counter.schema.ts: single doc `_id='audit_seq'`, incremented
    atomically by findOneAndUpdate($inc).
  - audit-checkpoint.schema.ts: { at, headSeq, headHash, signature,
    sigAlg, reason }. Reason ∈ {startup, interval, threshold, manual}.

Audit module (services/platform-api/src/audit/):
  - canonical.ts: stable JSON form + hashCanonical (sha256) +
    checkpointSignature (HMAC-SHA-256) + verifyCheckpointSignature
    (timingSafeEqual). Single source of truth for hash inputs — schema
    additions land here at the same time as the field.
  - audit.service.ts: record() now allocates seq → looks up lastHash() →
    computes hash → inserts. Per-process write mutex serializes the
    allocate+lookup so concurrent writers don't both chain off the same
    predecessor. Documented multi-instance caveat (needs Mongo replica
    set + transactions OR a distributed lock).
  - checkpoint.service.ts: scheduler triggers on startup + every 5min
    + threshold of 100 events accumulated. Skips when no new chained
    events since the last anchor.
  - verifier.service.ts: walks chain in seq order, recomputes each
    hash, validates checkpoint signatures. Returns a precise break:
    'event-hash-mismatch' (in-place modification), 'event-prev-hash-
    mismatch' (insertion/deletion), or 'checkpoint-signature-mismatch'.
  - audit.controller.ts: GET /audit/verify, GET /audit/checkpoint/latest,
    POST /audit/checkpoint (manual force).
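
The per-process write mutex described for audit.service.ts is a promise
chain. A minimal sketch of that pattern, assuming an in-memory array in
place of Mongo; `rows`, `writeOne`, `sleep` and `demo` are hypothetical
stand-ins, not code from this commit.

```typescript
// Hypothetical in-memory stand-in for the audit collection.
const rows: { seq: number; prevHash: string; hash: string }[] = []
let counter = 0
let writeLock: Promise<unknown> = Promise.resolve()

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Allocate seq, read the current head, then insert. The await between the
// read and the insert is where unserialized writers would interleave.
async function writeOne(): Promise<void> {
  const seq = ++counter
  const last = rows[rows.length - 1]
  const prevHash = last ? last.hash : 'GENESIS'
  await sleep(5) // simulated database latency
  rows.push({ seq, prevHash, hash: `h${seq}` })
}

// Each call queues behind the previous one. A failed write is swallowed so
// it cannot block the writers queued behind it.
function record(): Promise<void> {
  const run = writeLock.then(() => writeOne().catch(() => {}))
  writeLock = run
  return run
}

// Three writers started in the same tick still chain one after another.
async function demo(): Promise<string[]> {
  await Promise.all([record(), record(), record()])
  return rows.map((r) => r.prevHash)
}
```

With the lock in place, `demo()` resolves to `['GENESIS', 'h1', 'h2']`.
Calling `writeOne()` directly three times would instead let all three
read an empty head and record 'GENESIS' as their predecessor.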

Operator UI (apps/operator/):
  - 3 new proxies under /api/audit/{verify, checkpoint/latest, checkpoint}.
  - pages/audit.vue: new "Tamper evidence" card with "Force checkpoint"
    + "Verify chain" buttons. Header shows live head seq; result line
    shows verified count or a precise break (kind + seq + expected vs
    actual hash). Background tinted green/red on ok/broken.

Env (.env + docker-compose.yml):
  - new AUDIT_SIGNING_KEY (32-byte hex HMAC secret). Prod swaps this for
    ed25519 from an HSM/KMS; verifier code stays the same because sigAlg
    is on the checkpoint doc.
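
One way the sigAlg field could drive verification is a small registry
keyed by algorithm name. This is a sketch under assumptions: the
`verifiers` map, `payloadOf` and `verifyCheckpoint` are hypothetical, the
key is a throwaway example value, and only the HMAC entry mirrors what
this commit signs.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

interface Checkpoint {
  at: Date
  headSeq: number
  headHash: string
  signature: string // hex-encoded
  sigAlg: string
}

type Verifier = (payload: string, signatureHex: string, key: string) => boolean

// Registry keyed by sigAlg. An ed25519 entry would be added here when the
// signing backend changes; existing checkpoints keep verifying via their own sigAlg.
const verifiers: Record<string, Verifier> = {
  'HMAC-SHA-256': (payload, signatureHex, key) => {
    const expected = createHmac('sha256', key).update(payload).digest()
    const actual = Buffer.from(signatureHex, 'hex')
    // timingSafeEqual throws on unequal lengths, so check the length first.
    return expected.length === actual.length && timingSafeEqual(expected, actual)
  },
}

function payloadOf(cp: Checkpoint): string {
  return `${cp.headSeq}:${cp.headHash}:${cp.at.toISOString()}`
}

function verifyCheckpoint(cp: Checkpoint, key: string): boolean {
  const verify = verifiers[cp.sigAlg]
  if (!verify) return false // unknown algorithm: treat as unverifiable
  return verify(payloadOf(cp), cp.signature, key)
}

const key = 'example-key-not-a-real-secret'
const at = new Date('2026-01-01T00:00:00.000Z')
const base = { at, headSeq: 5, headHash: 'abc123', sigAlg: 'HMAC-SHA-256' }
const good: Checkpoint = {
  ...base,
  signature: createHmac('sha256', key).update(`5:abc123:${at.toISOString()}`).digest('hex'),
}
const forged: Checkpoint = { ...base, signature: '00'.repeat(32) }
```

`verifyCheckpoint(good, key)` is true and `verifyCheckpoint(forged, key)`
is false. A checkpoint carrying an unregistered sigAlg is rejected rather
than silently accepted.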

Smoke-tested three verify states against a clean chain of 5 events:
  - normal verify: ok=true, 5/5 events verified, 1 checkpoint signed
  - modified seq=3 in Mongo directly: verify returns ok=false with
    break = { kind: 'event-hash-mismatch', seq: 3, expected, actual }
  - restored, nuked checkpoint signature: break = { kind:
    'checkpoint-signature-mismatch', headSeq: 5 }
  - operator UI's verify panel reflects all three states correctly.

Legacy data: pre-Phase-3 events stay `chained: false` and are excluded
from the chain walk. Retroactive chaining of historical entries is a
one-off migration script we can run if we ever care to.

Out of scope (Phase 4 etc.):
  - TTL + cold-storage archival to Hetzner Object Storage
  - GDPR right-to-erasure tooling
  - ed25519 / HSM signing (swap is well-defined; sigAlg field is ready)
  - Multi-instance write coordination (Mongo transaction OR distributed
    lock when we scale platform-api beyond 1 replica)
Ronni Baslund
2026-05-24 20:43:54 +02:00
parent df18128617
commit 9435baa09d
14 changed files with 737 additions and 18 deletions
+124 -3
@@ -71,6 +71,56 @@ function shortIp(ip?: string) {
// Strip v4-in-v6 prefix that node sockets sometimes report (::ffff:1.2.3.4 → 1.2.3.4)
return ip.replace(/^::ffff:/, '')
}
// ── Tamper-evidence (Phase 3) ──────────────────────────────────────────
interface VerifyReport {
ok: boolean
totalEventsVerified: number
checkpointsChecked: number
latestCheckpointAt: string | null
latestVerifiedSeq: number | null
break?:
| { kind: 'event-hash-mismatch'; seq: number; expected: string; actual: string }
| { kind: 'event-prev-hash-mismatch'; seq: number; expected: string; actual: string }
| { kind: 'checkpoint-signature-mismatch'; headSeq: number }
}
interface CheckpointSummary {
at: string | null
headSeq: number | null
headHash: string | null
reason?: string
}
const { data: latestCp, refresh: refreshCp } = useLazyFetch<CheckpointSummary>(
'/api/audit/checkpoint/latest',
{ default: () => ({ at: null, headSeq: null, headHash: null }), server: false },
)
const verifyReport = ref<VerifyReport | null>(null)
const verifying = ref(false)
async function runVerify() {
verifying.value = true
try {
verifyReport.value = await $fetch<VerifyReport>('/api/audit/verify')
await refreshCp()
} finally {
verifying.value = false
}
}
async function forceCheckpoint() {
await $fetch('/api/audit/checkpoint', { method: 'POST' })
await refreshCp()
}
function fmtRelative(iso: string | null | undefined): string {
if (!iso) return 'never'
// Clamp at zero: the server clock that stamped `at` may run ahead of the browser.
const ms = Math.max(0, Date.now() - new Date(iso).getTime())
if (ms < 60_000) return `${Math.floor(ms / 1000)}s ago`
if (ms < 3_600_000) return `${Math.floor(ms / 60_000)}m ago`
if (ms < 86_400_000) return `${Math.floor(ms / 3_600_000)}h ago`
return new Date(iso).toLocaleDateString('da-DK')
}
</script>
<template>
@@ -163,10 +213,56 @@ function shortIp(ip?: string) {
<Mono v-else dim>// reached the start of the log</Mono>
</div>
<!-- Tamper-evidence panel Phase 3 -->
<Card :pad="0" class="verify-card">
<div class="verify-head">
<div>
<Eyebrow>Tamper evidence</Eyebrow>
<div class="cap">
Hash chain · {{ latestCp?.headSeq != null ? `signed through seq ${latestCp.headSeq}` : 'no checkpoints yet' }}
</div>
</div>
<div class="verify-actions">
<UiButton variant="secondary" :disabled="verifying" @click="forceCheckpoint">
Force checkpoint
</UiButton>
<UiButton variant="primary" :disabled="verifying" @click="runVerify">
{{ verifying ? 'Verifying' : 'Verify chain' }}
</UiButton>
</div>
</div>
<div class="verify-meta">
<div class="kv"><Eyebrow>Last signed checkpoint</Eyebrow><Mono>{{ fmtRelative(latestCp?.at ?? null) }}</Mono></div>
<div v-if="latestCp?.headHash" class="kv"><Eyebrow>Head hash</Eyebrow><Mono dim>{{ latestCp.headHash.slice(0, 16) }}</Mono></div>
<div v-if="latestCp?.reason" class="kv"><Eyebrow>Reason</Eyebrow><Mono dim>{{ latestCp.reason }}</Mono></div>
</div>
<!-- Verify result -->
<div v-if="verifyReport" class="verify-result" :data-ok="verifyReport.ok">
<div v-if="verifyReport.ok" class="result-line">
<Badge tone="ok" dot>verified</Badge>
<Mono>{{ verifyReport.totalEventsVerified }} event(s) · {{ verifyReport.checkpointsChecked }} checkpoint(s) · last seq {{ verifyReport.latestVerifiedSeq ?? '—' }}</Mono>
</div>
<div v-else class="result-line">
<Badge tone="bad" dot>BROKEN</Badge>
<Mono v-if="verifyReport.break?.kind === 'event-hash-mismatch'">
event hash mismatch at seq {{ verifyReport.break.seq }} · stored {{ verifyReport.break.actual.slice(0, 16) }} expected {{ verifyReport.break.expected.slice(0, 16) }}
</Mono>
<Mono v-else-if="verifyReport.break?.kind === 'event-prev-hash-mismatch'">
chain link broken at seq {{ verifyReport.break.seq }} · prevHash mismatch
</Mono>
<Mono v-else-if="verifyReport.break?.kind === 'checkpoint-signature-mismatch'">
checkpoint signature mismatch at head seq {{ verifyReport.break.headSeq }}
</Mono>
</div>
</div>
</Card>
<Mono dim class="note">
// sourced from /audit on platform-api · append-only · sha256 hash-chain
with HMAC-signed checkpoints every 100 events or 5 minutes · retention
+ cold-storage archival to Hetzner Object Storage is Phase 4
</Mono>
</div>
</div>
@@ -247,4 +343,29 @@ td.actor { display: flex; align-items: center; gap: 10px; }
.empty { padding: 40px 20px; text-align: center; }
.footer { display: flex; justify-content: center; padding: 4px 0; }
.note { display: block; padding: 4px 4px 0 4px; }
/* Tamper-evidence panel */
.verify-card { margin-top: 8px; }
.verify-head {
padding: 14px 18px;
display: flex; justify-content: space-between; align-items: center;
gap: 16px;
border-bottom: 1px solid var(--border);
}
.verify-head .cap { font-family: var(--font-display); font-weight: 600; font-size: 15px; margin-top: 2px; }
.verify-actions { display: flex; gap: 8px; flex-shrink: 0; }
.verify-meta {
padding: 12px 18px;
display: flex; gap: 24px; flex-wrap: wrap;
}
.verify-meta .kv { display: flex; flex-direction: column; gap: 4px; }
.verify-result {
padding: 12px 18px;
border-top: 1px solid var(--border);
}
.verify-result[data-ok="true"] { background: rgba(31, 138, 91, 0.05); }
.verify-result[data-ok="false"] { background: rgba(240, 88, 88, 0.06); }
.result-line { display: flex; align-items: center; gap: 12px; flex-wrap: wrap; }
</style>
@@ -0,0 +1,3 @@
import { platformApi } from '~~/server/utils/platform-api'
export default defineEventHandler((event) => platformApi(event, '/audit/checkpoint', { method: 'POST' }))
@@ -0,0 +1,3 @@
import { platformApi } from '~~/server/utils/platform-api'
export default defineEventHandler((event) => platformApi(event, '/audit/checkpoint/latest'))
@@ -0,0 +1,3 @@
import { platformApi } from '~~/server/utils/platform-api'
export default defineEventHandler((event) => platformApi(event, '/audit/verify'))
@@ -481,6 +481,10 @@ services:
# Path to the OCIS audit log inside this container. The same shared
# volume is mounted on the OCIS service writeable; here it's read-only.
OCIS_AUDIT_LOG_PATH: /var/log/ocis/audit.log
# Tamper-evidence signing key for the audit hash chain. Rotation closes
# out the current segment with a key-rotation checkpoint (not in scope
# for Phase 3). Prod swaps HMAC for ed25519 from an HSM.
AUDIT_SIGNING_KEY: ${AUDIT_SIGNING_KEY}
volumes:
- ../../services/platform-api:/app
- platform_api_node_modules:/app/node_modules
@@ -1,16 +1,28 @@
import { Controller, Get, Post, Query, UseGuards } from '@nestjs/common'
import { JwtAuthGuard } from '../auth/jwt-auth.guard.js'
import { OperatorGuard } from '../auth/operator.guard.js'
import { ListAuditDto } from './dto/list-audit.dto.js'
import { AuditService } from './audit.service.js'
import { CheckpointService } from './checkpoint.service.js'
import { AuditVerifier } from './verifier.service.js'
// Read-only. There is intentionally no POST/PATCH/DELETE on /audit itself;
// entries are written by AuditService.record(), called from every
// mutation in other modules. Operator-only because the trail is sensitive.
//
// Phase 3 surfaces:
// GET /audit/verify — walks the chain + validates checkpoint signatures
// GET /audit/checkpoint/latest — current "last verified" anchor for the UI
// POST /audit/checkpoint — force a fresh checkpoint (rare; testing
// the chain or anchoring before an export)
@Controller('audit')
@UseGuards(JwtAuthGuard, OperatorGuard)
export class AuditController {
constructor(
private readonly audit: AuditService,
private readonly checkpoints: CheckpointService,
private readonly verifier: AuditVerifier,
) {}
@Get()
list(@Query() q: ListAuditDto) {
@@ -27,4 +39,21 @@ export class AuditController {
limit: q.limit,
})
}
@Get('verify')
verify() {
return this.verifier.verify()
}
@Get('checkpoint/latest')
async latestCheckpoint() {
const cp = await this.checkpoints.latest()
if (!cp) return { at: null, headSeq: null, headHash: null }
return { at: cp.at, headSeq: cp.headSeq, headHash: cp.headHash, reason: cp.reason }
}
@Post('checkpoint')
forceCheckpoint() {
return this.checkpoints.tryWrite('manual')
}
}
@@ -1,17 +1,31 @@
import { Module } from '@nestjs/common'
import { MongooseModule } from '@nestjs/mongoose'
import { AuthModule } from '../auth/auth.module.js'
import {
AuditCheckpoint,
AuditCheckpointSchema,
} from '../schemas/audit-checkpoint.schema.js'
import {
AuditCounter,
AuditCounterSchema,
} from '../schemas/audit-counter.schema.js'
import { AuditEvent, AuditEventSchema } from '../schemas/audit-event.schema.js'
import { AuditController } from './audit.controller.js'
import { AuditService } from './audit.service.js'
import { CheckpointService } from './checkpoint.service.js'
import { AuditVerifier } from './verifier.service.js'
@Module({
imports: [
AuthModule,
MongooseModule.forFeature([
{ name: AuditEvent.name, schema: AuditEventSchema },
{ name: AuditCounter.name, schema: AuditCounterSchema },
{ name: AuditCheckpoint.name, schema: AuditCheckpointSchema },
]),
],
controllers: [AuditController],
providers: [AuditService, CheckpointService, AuditVerifier],
exports: [AuditService],
})
export class AuditModule {}
@@ -3,15 +3,20 @@
// reflects reality. The service is intentionally minimal: no UPDATE/DELETE
// methods exist, and `list` is read-only.
//
// Phase 3 (tamper-evidence) wiring: each write now allocates an atomic
// sequence number, chains its hash to the previous event, and marks the row
// as `chained: true`. Pre-Phase-3 events stay in the collection with
// `chained: false` and no seq/hash; the verify endpoint only walks the
// chained set.
//
// What's still NOT here:
// - Cold-storage archival (Phase 4)
// - GDPR right-to-erasure (delete events for tenant X)
import { Injectable, Logger } from '@nestjs/common'
import { InjectModel } from '@nestjs/mongoose'
import type { FilterQuery, Model, Types } from 'mongoose'
import { AuditCounter, type AuditCounterDocument } from '../schemas/audit-counter.schema.js'
import {
AuditEvent,
type AuditEventDocument,
@@ -19,6 +24,7 @@ import {
type AuditResourceType,
type AuditSource,
} from '../schemas/audit-event.schema.js'
import { eventToCanonical, hashCanonical, GENESIS_HASH } from './canonical.js'
export interface AuditActor {
userId?: Types.ObjectId | string
@@ -61,13 +67,28 @@ export interface AuditListFilters {
const DEFAULT_LIMIT = 100
const MAX_LIMIT = 500
const SEQ_COUNTER_ID = 'audit_seq'
@Injectable()
export class AuditService {
private readonly logger = new Logger(AuditService.name)
// Serializes writes within this process. seq allocation + lastHash lookup
// need to be atomic relative to each other so two concurrent writers don't
// both observe the same "previous" event and chain off it — that produces
// two events with the same prevHash but different seq, which the verifier
// flags as a chain break. A Mongo transaction would handle this cleanly
// but standalone Mongo (our dev setup) doesn't support transactions. For a
// single-instance deployment a per-process queue is enough.
//
// Multi-instance caveat: this mutex doesn't cross processes. When we scale
// platform-api to >1 replica (k3s phase), we need either a Mongo replica
// set + real transaction here, OR a distributed lock (Redis SETNX or a
// Mongo lock-doc with findOneAndUpdate on a "writing" key).
private writeLock: Promise<unknown> = Promise.resolve()
constructor(
@InjectModel(AuditEvent.name) private readonly model: Model<AuditEventDocument>,
@InjectModel(AuditCounter.name) private readonly counter: Model<AuditCounterDocument>,
) {}
// Best-effort. We deliberately swallow write failures rather than failing
@@ -75,10 +96,24 @@ export class AuditService {
// the underlying mutation (which already succeeded) is worse. Write failures
// are surfaced via the logger so they show up in container logs.
async record(input: AuditRecordInput, actor?: AuditActor): Promise<void> {
// Chain to the previous write before kicking off our own. .catch() here so
// a single failed write doesn't deadlock the queue for everyone behind it.
const run = this.writeLock.then(() => this.recordChained(input, actor).catch(() => {}))
this.writeLock = run
return run as Promise<void>
}
private async recordChained(input: AuditRecordInput, actor?: AuditActor): Promise<void> {
try {
const seq = await this.nextSeq()
const prevHash = await this.lastHash()
// Build the doc fields, then derive the canonical form + hash from them.
// Doing this in two passes guarantees the hash matches exactly what the
// verifier will recompute from the persisted row.
const baseDoc = {
at: input.at ?? new Date(),
actorType: (actor?.userId || actor?.email ? 'user' : 'system') as 'user' | 'system',
actorId: actor?.userId,
actorEmail: actor?.email,
actorIp: actor?.ip,
@@ -92,7 +127,14 @@ export class AuditService {
source: input.source ?? 'platform-api',
metadata: input.metadata,
externalId: input.externalId,
seq,
prevHash,
chained: true,
}
const canonical = eventToCanonical(baseDoc)
const hash = hashCanonical(canonical)
await this.model.create({ ...baseDoc, hash })
} catch (err) {
// Duplicate-key on (source, externalId) is expected when ingest workers
// re-poll an overlapping window; quietly ignore so the worker doesn't
@@ -147,6 +189,57 @@ export class AuditService {
const limit = clamp(filters.limit ?? DEFAULT_LIMIT, 1, MAX_LIMIT)
return this.model.find(q).sort({ at: -1, _id: -1 }).limit(limit).exec()
}
// ── Chain integrity helpers ────────────────────────────────────────────
// Atomic monotonic counter. findOneAndUpdate with $inc and upsert returns
// the post-increment value — each caller gets a distinct seq even under
// concurrent inserts.
private async nextSeq(): Promise<number> {
const res = await this.counter
.findOneAndUpdate(
{ _id: SEQ_COUNTER_ID },
{ $inc: { n: 1 } },
{ upsert: true, new: true },
)
.exec()
return res!.n
}
// The previous chained event's hash, by seq desc. Returns GENESIS for the
// first chained event.
async lastHash(): Promise<string> {
const last = await this.model
.findOne({ chained: true }, { hash: 1 })
.sort({ seq: -1 })
.exec()
return last?.hash ?? GENESIS_HASH
}
// Used by the checkpoint scheduler to know the current chain head.
async headChainState(): Promise<{ headSeq: number; headHash: string } | null> {
const last = await this.model
.findOne({ chained: true }, { seq: 1, hash: 1 })
.sort({ seq: -1 })
.exec()
if (!last || last.seq == null || !last.hash) return null
return { headSeq: last.seq, headHash: last.hash }
}
// Count of chained events at or below `seq`. Useful for the verify endpoint
// to know how many events it walks per checkpoint segment.
async countChainedThrough(seq: number): Promise<number> {
return this.model.countDocuments({ chained: true, seq: { $lte: seq } }).exec()
}
// Walk the chain in seq order for the verifier. Stream-y signature so we
// don't load everything into memory for large chains.
chainCursor(fromSeq: number, toSeq: number) {
return this.model
.find({ chained: true, seq: { $gte: fromSeq, $lte: toSeq } })
.sort({ seq: 1 })
.cursor()
}
}
function escapeRegex(s: string): string {
@@ -0,0 +1,112 @@
// Canonical form for audit-chain hashing. Must be deterministic across
// writes and reads — anyone with the same data must compute the same hash.
// Stability rules:
// - keys are sorted alphabetically at every level
// - undefined fields are omitted (not null)
// - dates serialized as ISO-8601 with millisecond precision (Date.toISOString())
// - the `hash` field itself is never included (chicken-and-egg)
// - new schema fields land in this canonicalizer at the same time the
// schema field lands, otherwise the chain breaks at the upgrade boundary
import { createHash, createHmac, timingSafeEqual } from 'node:crypto'
import type { AuditEventDocument } from '../schemas/audit-event.schema.js'
// Subset of AuditEvent we hash. The order of keys here doesn't matter —
// canonicalJson() sorts before serializing — but listing them explicitly
// guarantees we don't accidentally hash Mongoose internals (_id, __v) or
// Date timestamps (createdAt/recordedAt) that vary across writes.
const HASHABLE_FIELDS = [
'seq',
'at',
'actorType',
'actorId',
'actorEmail',
'actorIp',
'action',
'outcome',
'resourceType',
'resourceId',
'resourceName',
'tenantSlug',
'partnerSlug',
'source',
'externalId',
'metadata',
'prevHash',
] as const
export interface CanonicalEvent {
seq: number
at: string
actorType: string
actorId?: string
actorEmail?: string
actorIp?: string
action: string
outcome: string
resourceType?: string
resourceId?: string
resourceName?: string
tenantSlug?: string
partnerSlug?: string
source: string
externalId?: string
metadata?: unknown
prevHash: string
}
export function eventToCanonical(evt: AuditEventDocument | Record<string, unknown>): CanonicalEvent {
const e = evt as Record<string, unknown>
const out: Record<string, unknown> = {}
for (const k of HASHABLE_FIELDS) {
const v = e[k]
if (v === undefined || v === null) continue
out[k] = k === 'at' && v instanceof Date ? v.toISOString() : v
}
return out as unknown as CanonicalEvent
}
// Stable JSON: sort keys recursively. We never emit `undefined`, and arrays
// preserve order (they're position-sensitive).
export function canonicalJson(value: unknown): string {
return JSON.stringify(value, (_key, v) => {
if (v === null || typeof v !== 'object' || Array.isArray(v)) return v
if (v instanceof Date) return v.toISOString()
const sorted: Record<string, unknown> = {}
for (const k of Object.keys(v as Record<string, unknown>).sort()) {
const val = (v as Record<string, unknown>)[k]
if (val !== undefined) sorted[k] = val
}
return sorted
})
}
export function hashCanonical(canonical: CanonicalEvent): string {
return createHash('sha256').update(canonicalJson(canonical)).digest('hex')
}
// HMAC of a checkpoint's identifying triple. Signed with AUDIT_SIGNING_KEY.
// Verification: recompute on read, compare with timingSafeEqual.
export function checkpointSignature(
key: string,
headSeq: number,
headHash: string,
at: Date,
): string {
const payload = `${headSeq}:${headHash}:${at.toISOString()}`
return createHmac('sha256', key).update(payload).digest('hex')
}
export function verifyCheckpointSignature(
key: string,
headSeq: number,
headHash: string,
at: Date,
signature: string,
): boolean {
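// Compare decoded bytes, not hex string lengths: Buffer.from(..., 'hex')
// stops at the first non-hex character, and timingSafeEqual throws when the
// two buffers differ in length.
const expected = Buffer.from(checkpointSignature(key, headSeq, headHash, at), 'hex')
const actual = Buffer.from(signature, 'hex')
if (expected.length !== actual.length) return false
return timingSafeEqual(expected, actual)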
}
export const GENESIS_HASH = 'GENESIS'
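
The stability rules at the top of this file (sorted keys, omitted
undefined fields, order-preserving arrays) can be checked in isolation.
The sketch below is a hypothetical pre-pass variant (`sortKeys`,
`stableHash`), not the replacer-based canonicalJson above, and the two
sample objects are invented.

```typescript
import { createHash } from 'node:crypto'

// Recursively rebuild objects with sorted keys; arrays keep their order.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeys)
  if (value instanceof Date) return value.toISOString()
  if (value !== null && typeof value === 'object') {
    const src = value as Record<string, unknown>
    const out: Record<string, unknown> = {}
    for (const k of Object.keys(src).sort()) {
      // Undefined fields are dropped so they never influence the hash.
      if (src[k] !== undefined) out[k] = sortKeys(src[k])
    }
    return out
  }
  return value
}

function stableHash(value: unknown): string {
  return createHash('sha256').update(JSON.stringify(sortKeys(value))).digest('hex')
}

// Same data, different key order, one explicit undefined on the write side.
const written = { seq: 1, action: 'user.invite', metadata: { role: 'admin', via: 'ui' }, actorIp: undefined }
const readBack = { metadata: { via: 'ui', role: 'admin' }, action: 'user.invite', seq: 1 }
```

`stableHash(written)` equals `stableHash(readBack)`, while reordering an
array inside the value changes the hash, because array order is treated
as meaningful.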
@@ -0,0 +1,114 @@
// Periodic signed anchors over the audit hash chain. The chain itself
// (event[N].hash references event[N-1].hash) protects integrity between
// events. Checkpoints layer signed attestations on top: "the head at time T
// was seq=S, hash=H, signed by us". An attacker who modifies a historical
// event must also forge every checkpoint signature past that event — which
// requires AUDIT_SIGNING_KEY, kept separately from Mongo write access.
//
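// Trigger model: every N events (DEFAULT_THRESHOLD, overridable via
// AUDIT_CHECKPOINT_THRESHOLD) or every T milliseconds (DEFAULT_INTERVAL_MS,
// overridable via AUDIT_CHECKPOINT_INTERVAL_MS), whichever comes first, plus
// one on startup so a long-quiet system still has a recent checkpoint to
// verify against.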
import {
Injectable,
Logger,
type OnApplicationBootstrap,
type OnModuleDestroy,
} from '@nestjs/common'
import { ConfigService } from '@nestjs/config'
import { InjectModel } from '@nestjs/mongoose'
import type { Model } from 'mongoose'
import {
AuditCheckpoint,
type AuditCheckpointDocument,
} from '../schemas/audit-checkpoint.schema.js'
import { AuditService } from './audit.service.js'
import { checkpointSignature } from './canonical.js'
const DEFAULT_INTERVAL_MS = 5 * 60 * 1_000 // 5 minutes
const DEFAULT_THRESHOLD = 100
@Injectable()
export class CheckpointService implements OnApplicationBootstrap, OnModuleDestroy {
private readonly logger = new Logger(CheckpointService.name)
private readonly key: string
private readonly intervalMs: number
private readonly threshold: number
private timer: NodeJS.Timeout | null = null
private writing = false
constructor(
@InjectModel(AuditCheckpoint.name) private readonly model: Model<AuditCheckpointDocument>,
private readonly audit: AuditService,
config: ConfigService,
) {
this.key = config.getOrThrow<string>('AUDIT_SIGNING_KEY')
this.intervalMs = Number(config.get('AUDIT_CHECKPOINT_INTERVAL_MS') ?? DEFAULT_INTERVAL_MS)
this.threshold = Number(config.get('AUDIT_CHECKPOINT_THRESHOLD') ?? DEFAULT_THRESHOLD)
}
async onApplicationBootstrap(): Promise<void> {
this.logger.log(
`Audit checkpoint scheduler starting · interval=${this.intervalMs / 1000}s threshold=${this.threshold} events`,
)
// Fire one on startup so a fresh deployment / restart immediately has a
// verifiable checkpoint covering the head.
await this.tryWrite('startup')
this.timer = setInterval(() => void this.tryWrite('interval'), this.intervalMs)
}
onModuleDestroy(): void {
if (this.timer) clearInterval(this.timer)
}
// Called by AuditService writes (via the threshold path) and by the
// interval ticker. Skips when no chained events exist yet OR when the
// current head matches the previous checkpoint (nothing new to anchor).
async tryWrite(reason: AuditCheckpointDocument['reason']): Promise<AuditCheckpointDocument | null> {
if (this.writing) return null
this.writing = true
try {
const head = await this.audit.headChainState()
if (!head) return null // no chained events yet
const last = await this.model.findOne().sort({ headSeq: -1 }).exec()
if (last && last.headSeq === head.headSeq) return null // nothing new
// For threshold reason, only fire if N events have accumulated since the
// last checkpoint. Other reasons (startup, interval, manual) always
// proceed past the headSeq guard above.
if (reason === 'threshold' && last && head.headSeq - last.headSeq < this.threshold) {
return null
}
const at = new Date()
const signature = checkpointSignature(this.key, head.headSeq, head.headHash, at)
const doc = await this.model.create({
at,
headSeq: head.headSeq,
headHash: head.headHash,
signature,
sigAlg: 'HMAC-SHA-256',
reason,
})
this.logger.log(`Audit checkpoint written · seq=${head.headSeq} reason=${reason}`)
return doc
} catch (err) {
this.logger.error(
`checkpoint write failed: ${err instanceof Error ? err.message : String(err)}`,
)
return null
} finally {
this.writing = false
}
}
// Used by the verify endpoint to walk anchors in order.
list(): Promise<AuditCheckpointDocument[]> {
return this.model.find().sort({ headSeq: 1 }).exec()
}
latest(): Promise<AuditCheckpointDocument | null> {
return this.model.findOne().sort({ headSeq: -1 }).exec()
}
}
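
A signed checkpoint constrains the chain only when its headHash is also
compared with the hash recomputed at headSeq during the walk. A minimal
sketch of that combined check, assuming the recomputed hashes are held in
a map; `checkAnchor`, `AnchorResult` and the `hashBySeq` map are
hypothetical and not part of this commit.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

interface Anchor {
  headSeq: number
  headHash: string
  at: Date
  signature: string // hex-encoded
}

function sign(key: string, headSeq: number, headHash: string, at: Date): string {
  return createHmac('sha256', key).update(`${headSeq}:${headHash}:${at.toISOString()}`).digest('hex')
}

type AnchorResult = 'ok' | 'bad-signature' | 'head-hash-mismatch' | 'unknown-seq'

// hashBySeq holds the hashes recomputed while walking the chain.
function checkAnchor(key: string, cp: Anchor, hashBySeq: Map<number, string>): AnchorResult {
  const expected = Buffer.from(sign(key, cp.headSeq, cp.headHash, cp.at), 'hex')
  const actual = Buffer.from(cp.signature, 'hex')
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return 'bad-signature'
  const walked = hashBySeq.get(cp.headSeq)
  if (walked === undefined) return 'unknown-seq'
  return walked === cp.headHash ? 'ok' : 'head-hash-mismatch'
}

const key = 'example-key-not-a-real-secret'
const at = new Date('2026-01-01T00:00:00.000Z')
const cp: Anchor = { headSeq: 2, headHash: 'h2', at, signature: sign(key, 2, 'h2', at) }

const intact = new Map<number, string>([[1, 'h1'], [2, 'h2']])
// An attacker rewrote event 2 and recomputed its hash, but cannot re-sign the checkpoint.
const rewritten = new Map<number, string>([[1, 'h1'], [2, 'h2-forged']])
```

`checkAnchor(key, cp, intact)` returns 'ok' and
`checkAnchor(key, cp, rewritten)` returns 'head-hash-mismatch'. Keeping
every hash in memory is a simplification; a streaming walk could instead
look up only the seq values that checkpoints refer to.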
@@ -0,0 +1,137 @@
// Walks the audit chain in seq order, recomputes hashes, and validates each
// checkpoint signature. Returns the position of the first detected break OR
// a clean ok with the count of entries verified.
//
// Three classes of failure surface separately so the operator can act on
// them:
// - 'event-hash-mismatch' — an event's stored hash doesn't match the
// recompute. Modification in place.
// - 'event-prev-hash-mismatch' — an event's prevHash doesn't match the
// prior event's hash. Insertion/deletion in the middle.
// - 'checkpoint-signature-mismatch' — a checkpoint's signature doesn't
// match the recomputed HMAC. Checkpoint tampered OR signing key
// rotated without a key-rotation event (the latter is not yet
// implemented; for now treat as tampering).
import { Injectable, Logger } from '@nestjs/common'
import { ConfigService } from '@nestjs/config'
import { AuditService } from './audit.service.js'
import { CheckpointService } from './checkpoint.service.js'
import {
eventToCanonical,
GENESIS_HASH,
hashCanonical,
verifyCheckpointSignature,
} from './canonical.js'
export type VerifyBreak =
| { kind: 'event-hash-mismatch'; seq: number; expected: string; actual: string }
| { kind: 'event-prev-hash-mismatch'; seq: number; expected: string; actual: string }
| { kind: 'checkpoint-signature-mismatch'; headSeq: number }
export interface VerifyReport {
ok: boolean
totalEventsVerified: number
checkpointsChecked: number
latestCheckpointAt: string | null
latestVerifiedSeq: number | null
break?: VerifyBreak
}
@Injectable()
export class AuditVerifier {
private readonly logger = new Logger(AuditVerifier.name)
private readonly key: string
constructor(
private readonly audit: AuditService,
private readonly checkpoints: CheckpointService,
config: ConfigService,
) {
this.key = config.getOrThrow<string>('AUDIT_SIGNING_KEY')
}
async verify(): Promise<VerifyReport> {
const head = await this.audit.headChainState()
if (!head) {
return {
ok: true,
totalEventsVerified: 0,
checkpointsChecked: 0,
latestCheckpointAt: null,
latestVerifiedSeq: null,
}
}
// Walk every chained event in seq order, recomputing each hash from its
// canonical form + the prior event's hash. The first break terminates
// the walk with a precise location.
const cursor = this.audit.chainCursor(1, head.headSeq)
let prevHash = GENESIS_HASH
let totalVerified = 0
let lastSeq: number | null = null
let breakInfo: VerifyBreak | undefined
for await (const evt of cursor) {
if (evt.prevHash !== prevHash) {
breakInfo = {
kind: 'event-prev-hash-mismatch',
seq: evt.seq ?? -1,
expected: prevHash,
actual: evt.prevHash ?? '<missing>',
}
break
}
const recomputed = hashCanonical(eventToCanonical(evt))
if (evt.hash !== recomputed) {
breakInfo = {
kind: 'event-hash-mismatch',
seq: evt.seq ?? -1,
expected: recomputed,
actual: evt.hash ?? '<missing>',
}
break
}
prevHash = evt.hash
lastSeq = evt.seq ?? lastSeq
totalVerified++
}
if (breakInfo) {
return {
ok: false,
totalEventsVerified: totalVerified,
checkpointsChecked: 0,
latestCheckpointAt: null,
latestVerifiedSeq: lastSeq,
break: breakInfo,
}
}
// Chain integrity holds end to end. Now validate every checkpoint
// signature against the same key.
const checkpoints = await this.checkpoints.list()
for (const cp of checkpoints) {
const ok = verifyCheckpointSignature(this.key, cp.headSeq, cp.headHash, cp.at, cp.signature)
if (!ok) {
return {
ok: false,
totalEventsVerified: totalVerified,
// Number of checkpoints that verified cleanly before this failing one.
checkpointsChecked: checkpoints.indexOf(cp),
latestCheckpointAt: cp.at.toISOString(),
latestVerifiedSeq: lastSeq,
break: { kind: 'checkpoint-signature-mismatch', headSeq: cp.headSeq },
}
}
}
const latest = checkpoints[checkpoints.length - 1]
return {
ok: true,
totalEventsVerified: totalVerified,
checkpointsChecked: checkpoints.length,
latestCheckpointAt: latest?.at?.toISOString() ?? null,
latestVerifiedSeq: lastSeq,
}
}
}
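The walk above can be illustrated with a small in-memory sketch. It is not the service code: the event shape, the canonical form (a fixed-key `JSON.stringify`), the `GENESIS` value, and the names `hashEvent`, `buildChain`, and `findBreak` are all assumptions made for illustration. The real canonical form lives in `canonical.ts`.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical minimal event shape; the real AuditEvent carries more fields.
interface ChainedEvent {
  seq: number
  action: string
  prevHash: string
  hash: string
}

type ChainBreak =
  | { kind: 'event-hash-mismatch'; seq: number }
  | { kind: 'event-prev-hash-mismatch'; seq: number }

// Assumed genesis value; the real constant is GENESIS_HASH in canonical.ts.
const GENESIS = 'GENESIS'

// Assumed canonical form: fixed key order, `hash` itself excluded.
function hashEvent(e: Omit<ChainedEvent, 'hash'>): string {
  const canonical = JSON.stringify({ seq: e.seq, action: e.action, prevHash: e.prevHash })
  return createHash('sha256').update(canonical).digest('hex')
}

// Builds a valid chain: each event points at the hash of the one before it.
function buildChain(actions: string[]): ChainedEvent[] {
  const chain: ChainedEvent[] = []
  let prevHash = GENESIS
  actions.forEach((action, i) => {
    const base = { seq: i + 1, action, prevHash }
    const evt = { ...base, hash: hashEvent(base) }
    chain.push(evt)
    prevHash = evt.hash
  })
  return chain
}

// Walks the chain in order and reports the first break, or null if clean.
function findBreak(chain: ChainedEvent[]): ChainBreak | null {
  let prevHash = GENESIS
  for (const evt of chain) {
    if (evt.prevHash !== prevHash) return { kind: 'event-prev-hash-mismatch', seq: evt.seq }
    if (evt.hash !== hashEvent(evt)) return { kind: 'event-hash-mismatch', seq: evt.seq }
    prevHash = evt.hash
  }
  return null
}

const chain = buildChain(['login', 'export', 'delete'])
// In-place edit of event 2: its stored hash no longer matches the recompute.
const edited = chain.map((e) => (e.seq === 2 ? { ...e, action: 'noop' } : e))
// Deletion of event 2: event 3 now points at a predecessor that is gone.
const deleted = chain.filter((e) => e.seq !== 2)
console.log(findBreak(chain), findBreak(edited), findBreak(deleted))
```

In this sketch an in-place edit surfaces at the edited event, while a deletion surfaces at the first event after the gap, which is why the two break kinds are reported separately.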
@@ -0,0 +1,42 @@
import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose'
import { HydratedDocument } from 'mongoose'
export type AuditCheckpointDocument = HydratedDocument<AuditCheckpoint>
// Periodic signed anchor over the audit hash chain. CheckpointService writes
// one of these every N events (default 100) or every T minutes (default 5),
// whichever comes first. The verify endpoint walks the whole chain in seq
// order, then re-validates the signature of every checkpoint.
//
// `signature` is the HMAC-SHA-256 of `${headSeq}:${headHash}:${at.toISOString()}`
// keyed with AUDIT_SIGNING_KEY. Verification recomputes the HMAC; a mismatch
// means the checkpoint was tampered with OR the signing key was rotated.
//
// Reason recorded for the audit trail's own audit trail: 'interval' (clock
// tick), 'threshold' (N events accumulated), 'startup' (first checkpoint
// after a process start), 'manual' (forced via operator UI).
@Schema({ collection: 'audit_checkpoints', timestamps: { createdAt: 'createdAt', updatedAt: false } })
export class AuditCheckpoint {
@Prop({ required: true, index: true })
at!: Date
@Prop({ required: true, unique: true })
headSeq!: number
@Prop({ required: true })
headHash!: string
@Prop({ required: true })
signature!: string
// Signature algorithm used. Future-proofing: when we move from HMAC-SHA-256 to
// ed25519 in prod, the verifier picks the right primitive from this field
// and we add a new value rather than breaking old checkpoints.
@Prop({ default: 'HMAC-SHA-256' })
sigAlg!: string
@Prop({ enum: ['interval', 'threshold', 'startup', 'manual'], default: 'interval' })
reason!: 'interval' | 'threshold' | 'startup' | 'manual'
}
export const AuditCheckpointSchema = SchemaFactory.createForClass(AuditCheckpoint)
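A minimal sketch of the checkpoint signature described in the schema comment, using Node's built-in `crypto`. The payload layout comes from that comment; the hex encoding and the names `signCheckpointSketch` and `verifyCheckpointSketch` are assumptions for illustration. The real helpers are `checkpointSignature` and `verifyCheckpointSignature` in `canonical.ts`, and they may differ in detail.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Payload layout per the schema comment: `${headSeq}:${headHash}:${at.toISOString()}`.
function signCheckpointSketch(key: string, headSeq: number, headHash: string, at: Date): string {
  const payload = `${headSeq}:${headHash}:${at.toISOString()}`
  // Hex output is an assumption; the source does not state the encoding.
  return createHmac('sha256', key).update(payload).digest('hex')
}

function verifyCheckpointSketch(
  key: string,
  headSeq: number,
  headHash: string,
  at: Date,
  signature: string,
): boolean {
  const expected = Buffer.from(signCheckpointSketch(key, headSeq, headHash, at), 'utf8')
  const actual = Buffer.from(signature, 'utf8')
  // timingSafeEqual throws on unequal lengths, so reject those up front.
  if (expected.length !== actual.length) return false
  return timingSafeEqual(expected, actual)
}

const at = new Date('2024-01-01T00:00:00.000Z')
const sig = signCheckpointSketch('example-key', 100, 'abc123', at)
console.log(verifyCheckpointSketch('example-key', 100, 'abc123', at, sig))
console.log(verifyCheckpointSketch('rotated-key', 100, 'abc123', at, sig))
```

The second call shows why a rotated key is indistinguishable from tampering at this layer: both produce a signature mismatch.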
@@ -0,0 +1,21 @@
import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose'
import { HydratedDocument } from 'mongoose'
export type AuditCounterDocument = HydratedDocument<AuditCounter>
// Atomic monotonic counter for AuditEvent.seq. A single document with
// `_id: 'audit_seq'` is incremented via findOneAndUpdate({_id}, {$inc: {n:1}}).
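// Concurrent inserts each get a distinct sequence number, and the chain is
// ordered by seq regardless of which insert commits to disk first. The counter
// alone does not pick the right predecessor hash: AuditService serializes
// allocate + lastHash() lookup behind a per-process write mutex.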
@Schema({ collection: 'audit_counters', timestamps: { updatedAt: true, createdAt: false } })
export class AuditCounter {
// Counter identifier — for now we only have 'audit_seq' but the same
// collection could host other monotonic counters later.
@Prop({ required: true, unique: true })
_id!: string
@Prop({ required: true, default: 0 })
n!: number
}
export const AuditCounterSchema = SchemaFactory.createForClass(AuditCounter)
@@ -102,10 +102,26 @@ export class AuditEvent {
@Prop()
externalId?: string
// ── Tamper-evidence (Phase 3) ──────────────────────────────────────────
// Monotonic sequence number from the audit_counters atomic counter. Defines
// the canonical order of the hash chain — events are insertion-ordered by
// seq, NOT by `at` (which may be backfilled by ingest workers).
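// No `index: true` here: the unique partial { seq: 1 } index is declared on
// AuditEventSchema, and a second plain index on the same key would conflict.
@Prop()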
seq?: number
// True for events written by Phase 3+ AuditService. Pre-Phase-3 entries
// exist in the collection without seq/prevHash/hash — the verifier walks
// only the chained set.
@Prop({ default: false, index: true })
chained!: boolean
// The previous chained event's `hash` (itself a sha256), in seq order. The
// first chained event uses the literal "GENESIS".
@Prop()
prevHash?: string
// sha256 of canonical(this event without `hash`). Recomputed by the verify
// endpoint to detect modification.
@Prop()
hash?: string
}
@@ -125,3 +141,10 @@ AuditEventSchema.index(
{ source: 1, externalId: 1 },
{ unique: true, partialFilterExpression: { externalId: { $type: 'string' } } },
)
// Hash-chain sequence — exactly one event per seq. Partial filter so
// pre-Phase-3 entries (no seq) don't conflict at the null spot.
AuditEventSchema.index(
{ seq: 1 },
{ unique: true, partialFilterExpression: { seq: { $type: 'number' } } },
)