nim-sds/doc/PLAN_SNAPSHOT_PERSISTENCE.md
NagyZoltanPeter 4ccdd122fc
refactor(persistence): snapshot-based interface (5 procs, atomic per-op) (#72)
* feat: propagate persistence backend errors via Result

The Persistence contract previously returned `Future[void]` for writes and
`Future[ChannelSnapshot]` for the loader, with `raises: []`. Backends had no
way to report a failure, so a failed write or a failed/partial read was
silently swallowed — and on the read path a mid-scan failure could bootstrap
a *truncated* channel snapshot, corrupting the rebuilt bloom filter and
lamport clock across a restart.

Make every contract field Result-returning:
  * mutating ops  -> Future[Result[void, string]]
  * loadAllForChannel -> Future[Result[ChannelSnapshot, string]]

The backend-supplied error string is mapped to a new
`ReliabilityError.rePersistenceError` (logged once at the boundary via
`reliabilityErr`) and threaded up through every persistence-touching proc to
the public API, where the caller decides what to do. Request-driven paths
(wrap/unwrap/markDependenciesMet/ensureChannel/removeChannel/reset) propagate
the error; background maintenance loops (periodicBufferSweep,
periodicRepairSweep) log and retry on the next tick, since they have no
synchronous caller.

Tests: in-memory backend gains a `failingOps` injection hook; new
"Persistence: error propagation" suite asserts read/write/drop failures
surface as `rePersistenceError`. Full suite passes (90 OK).

BREAKING CHANGE: the `Persistence` contract signature changed; custom
backends must return `Result` and `ok()` on success. Bumped to 0.3.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(persistence): add snapshot types and codec (phase 0)

Introduce atomic-snapshot persistence types that will replace the current
fine-grained 13-proc Persistence interface. This commit is purely additive:
no existing call site changes, no behaviour change.

New types (sds/types/):
- channel_meta.nim — ChannelMeta (atomic per-channel snapshot blob),
  ChannelData (bootstrap payload), OutgoingRepairKV / IncomingRepairKV
  (flattened map entries for protobuf wire shape).
- history_update.nim — HistoryUpdate (combined append/evict payload for
  the message log).

New codec (sds/snapshot_codec.nim):
- Protobuf encode/decode for all new types, reusing the existing
  SdsMessage and HistoryEntry encoders from sds/protobuf.nim.
- Explicit schemaVersion=1 on ChannelMeta; decoder rejects unknown
  versions loudly rather than silently truncating.
- Time encoded as int64 unix milliseconds.

Tests (tests/test_snapshot_codec.nim):
- 13 round-trip cases covering empty, single-entry, full-buffer, and
  repair-heavy snapshots; ChannelData ordering; HistoryUpdate variants;
  schemaVersion rejection.

Planning artefacts:
- ANALYSIS_SDS_PERSISTENCE.md — problem statement (partial-write
  divergence, chatty call rate, non-fatal-error policy gap).
- ANALYSIS_SNAPSHOT_SAVE_POINTS.md — exact save points per protocol op
  and projected call rates.
- PLAN_SNAPSHOT_PERSISTENCE.md — phased refactor plan; this commit
  implements phase 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(persistence): add PersistenceV2 interface alongside legacy (phase 1)

Introduce the 5-proc snapshot-based Persistence interface that will
replace the legacy 13-proc one. Both coexist on `ReliabilityManager` so
phase 2 can migrate protocol ops one at a time without breaking existing
callers.

New file:
- sds/types/persistence_v2.nim — `PersistenceV2` type with
  saveChannelMeta / updateHistory / loadChannel / dropChannel /
  setRetrievalHint. `noOpPersistenceV2()` default. Doc-comments capture
  the atomicity pairing (meta save + history update issued back-to-back
  under the channel lock) and the non-fatal failure policy from PLAN §8.

Modified:
- sds/types/reliability_manager.nim — adds `persistenceV2: PersistenceV2`
  field alongside `persistence`; constructor takes both, both default to
  no-op.
- sds.nim — `newReliabilityManager` plumbs the new optional parameter.
- AGENTS.md / CLAUDE.md — GitNexus index re-indexed after phase 0 +
  phase 1 additions; symbol counts updated by `npx gitnexus analyze`.

No call site uses the new interface yet — that's phase 2. All existing
tests still pass against the legacy interface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(persistence): migrate runRepairSweep to PersistenceV2 (phase 2.1)

Per-entry removeIncomingRepair / removeOutgoingRepair calls are replaced
by a single trySaveMeta per *dirty* channel at the end of that channel's
sweep. Failure is logged but does NOT abort the sweep — in-memory state
is the source of truth (PLAN_SNAPSHOT_PERSISTENCE.md §8).

Helpers added in sds/sds_utils.nim:
- snapshotMeta(channel) — capture current ChannelContext as ChannelMeta
  blob (flattens Table-keyed buffers to seqs for the wire shape).
- trySaveMeta(rm, channelId, channel) — best-effort meta snapshot save;
  logs on failure, never propagates.
- tryUpdateHistory(rm, channelId, append, evict) — best-effort history
  update; skips the call entirely when both lists are empty (HistoryUpdate
  contract).

Call-rate impact for runRepairSweep:
- Before: N persistence calls per expired entry per channel.
- After:  at most 1 saveChannelMeta per dirty channel; 0 on idle channels
  (matches the dirty-flag floor in ANALYSIS_SNAPSHOT_SAVE_POINTS).

All existing tests pass — including the 3 SDS-R Repair Sweep tests that
directly exercise this proc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(persistence): migrate checkUnacknowledgedMessages to PersistenceV2 (phase 2.2)

Per-entry saveOutgoing / removeOutgoing calls are replaced by one
trySaveMeta at the end of the pass, conditional on a dirty flag (resend
attempt incremented, or entry expired). Pass succeeds even if the save
fails — next tick reissues the snapshot.

Call-rate impact:
- Before: N persistence calls per affected entry per pass.
- After:  at most 1 saveChannelMeta per pass; 0 when nothing aged out.

All existing tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(persistence): add V2 meta snapshot saves to foreground ops (phase 2A)

Wires `trySaveMeta` into the three public protocol ops that mutate
per-channel state — wrapOutgoingMessage, unwrapReceivedMessage, and
markDependenciesMet — at the operation's end, under the channel lock.

Legacy fine-grained persistence calls REMAIN in place; this commit is
additive. Both interfaces persist the same state simultaneously, so all
existing tests pass and a real backend wired to either interface
continues to work. Phase 2B will strip the legacy calls.

Save points match the §"Save Points" table in
ANALYSIS_SNAPSHOT_SAVE_POINTS.md exactly:
- wrapOutgoingMessage: 1 save (always)
- unwrapReceivedMessage: 1 save on every path including duplicate
  (the duplicate path still mutates the repair buffers)
- markDependenciesMet: 1 save after the processIncomingBuffer cascade

Non-fatal failure policy (PLAN §8): trySaveMeta logs and continues;
the protocol op never returns rePersistenceError for snapshot failures.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(persistence): strip legacy interface from protocol path; migrate tests to V2 (phase 2B+2C+2D)

End-state of phase 2: the protocol code no longer issues any legacy
fine-grained Persistence calls. All state survives via the snapshot-based
PersistenceV2 interface — one trySaveMeta per op end, plus tryUpdateHistory
batched inside addToHistory. The legacy Persistence field on
ReliabilityManager remains for backwards compatibility; phase 3 deletes it.

Protocol changes (sds.nim, sds/sds_utils.nim):
- reviewAckStatus, processIncomingBuffer, updateLamportTimestamp →
  pure in-memory; no per-mutation persistence.
- addToHistory: replaces appendLogEntry+removeLogEntry with a single
  tryUpdateHistory call carrying (append, evict) atomically.
- getRecentHistoryEntries: setRetrievalHint switched to V2; non-fatal.
- wrapOutgoingMessage, unwrapReceivedMessage, markDependenciesMet:
  all per-row saveOutgoing / removeOutgoing / saveIncoming /
  removeIncoming / saveOutgoingRepair / removeOutgoingRepair /
  saveIncomingRepair / removeIncomingRepair calls removed (16 call
  sites in total). State is captured by the op-end trySaveMeta added
  in phase 2A.
- getOrCreateChannel: bootstraps from persistenceV2.loadChannel.
- dropChannelFromPersistence: uses persistenceV2.dropChannel.

Failure policy (PLAN_SNAPSHOT_PERSISTENCE.md §8):
- Foreground ops (wrap, unwrap, markDeps, sweeps): non-fatal —
  trySaveMeta / tryUpdateHistory log and continue; the protocol op
  returns ok regardless of disk failure. In-memory state is the source
  of truth; the next op re-issues a complete snapshot and disk catches
  up automatically.
- Durability-intent ops (removeChannel, resetReliabilityManager via
  dropChannelFromPersistence; getOrCreateChannel via loadChannel):
  still propagate rePersistenceError, because the caller asked us to
  confirm a disk operation and we cannot silently lie.

Test infrastructure:
- tests/in_memory_persistence_v2.nim: new V2 adapter mock that
  decomposes the meta blob into the existing InMemoryStore shape so
  test assertions on store.outgoing / store.incoming / etc. continue to
  work without change.
- tests/test_persistence.nim: 17 tests, all rewritten against V2.
  - 13 state-survival tests carry over with identical assertions.
  - "loadChannel failure surfaces as err on bootstrap" — bootstrap
    keeps durability-intent semantics.
  - "saveChannelMeta failure during send does NOT surface" — deliberate
    inversion of the legacy "write failure surfaces as err" test. Asserts
    the new non-fatal policy: op returns ok, in-memory state correct,
    disk re-syncs on the next op.
  - "updateHistory failure during send does NOT surface" — same policy
    applied to the history path.
  - "dropChannel failure during removeChannel surfaces as err" — kept.
- All 17 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(persistence): delete legacy interface; rename PersistenceV2 -> Persistence (phase 3)

End-state of the snapshot-persistence refactor. The legacy 13-proc
Persistence interface and its noOpPersistence are gone; the 5-proc
snapshot-based interface (formerly PersistenceV2) takes their place under
the canonical name.

Source:
- sds/types/persistence.nim: replaced 13-proc contract with the 5-proc
  snapshot interface (saveChannelMeta, updateHistory, loadChannel,
  dropChannel, setRetrievalHint). noOpPersistence returns ok everywhere
  and an empty ChannelData on load.
- sds/types/persistence_v2.nim: removed.
- sds/types/reliability_manager.nim: dropped the second persistenceV2
  field; constructor takes a single `persistence: Persistence`.
- sds/sds_utils.nim: rm.persistenceV2.X -> rm.persistence.X; doc-comments
  updated.
- sds.nim: dropped the persistenceV2 parameter from newReliabilityManager.

Tests:
- tests/in_memory_persistence_v2.nim: removed; its content moved to...
- tests/in_memory_persistence.nim: replaces the old legacy mock with the
  snapshot adapter under the canonical filename. Same InMemoryStore
  shape so test assertions stay unchanged.
- tests/test_persistence.nim: ctor param renamed, suite name de-prefixed.

FFI smoke (`nimble libsdsDynamicMac`, refc/threads:on): builds clean.
All 4 test suites pass:
- test_bloom
- test_reliability
- test_persistence (17 V2 tests)
- test_snapshot_codec (13 codec round-trip tests)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Persisting persistence redesign plan for reference

* refactor(persistence): R2 pending-write queue + per-op accumulator (PR #72 review fix)

Addresses all three substantive review findings on PR #72 in one
structural change: fold the per-op accumulator and the R2 retry buffer
into a single queue on `ChannelContext`, flushed once at op end.

Changes:

- sds/types/channel_context.nim: add `pendingHistoryAppends`
  (`OrderedSet[SdsMessageID]`) and `pendingHistoryEvicts`
  (`HashSet[SdsMessageID]`) fields. Only ids are stored — the full
  SdsMessage is looked up from `messageHistory` at flush time. Documented
  invariant: every id in pendingHistoryAppends is also in messageHistory,
  upheld by the merge rule.

- sds/sds_utils.nim:
  * `queueHistoryAppend(channel, msgId)` / `queueHistoryEvict(channel,
    msgId)` — "latest-wins" merge: append cancels any pending evict
    and vice versa. Symmetric, simple, handles the evict-then-re-add
    sequence correctly (SDS-R repair re-delivering an evicted message
    while the backend is unreachable).
  * `tryUpdateHistory(rm, channelId)` — no more list params; flushes the
    channel's pending queue. Dual role: per-op accumulator (multiple
    `addToHistory` calls within one op queue together and flush as one
    round-trip) AND R2 retry buffer (a failed flush leaves the queue
    populated for the next op to retry).
  * `addToHistory` queues via the helpers; does not call persistence.
  * Pending queue cleared on `cleanup` and `removeChannel`.

- sds.nim:
  * `processIncomingBuffer` returns to its single-arg signature — the
    queue lives on the channel, no parameter threading needed.
  * `wrapOutgoingMessage`, `unwrapReceivedMessage` (all three paths),
    `markDependenciesMet` issue exactly one `trySaveMeta` +
    `tryUpdateHistory` pair at op end, under the lock, with no
    intervening `await`-of-other-work. Matches the Persistence atomicity
    contract documented in `sds/types/persistence.nim`.
  * Pending queue cleared in `resetReliabilityManager`.

- tests/test_persistence.nim:
  * Direct `addToHistory` callers (state-survival setup) now follow with
    explicit `tryUpdateHistory(channelId)` to flush. Reflects the
    production op-end flush pattern.
  * New: `updateHistory failure is retried via R2 pending-write queue` —
    verifies that two failed sends leave both messages on the queue,
    and a third successful send drains the whole queue in one call.
  * New: `pending queue survives idle ops` — verifies that an op with
    no history changes of its own still flushes a previously-failed
    batch at op end.
  * New: `evict-then-re-add merge rule preserves the re-added message
    on disk` — regression for the "latest-wins" merge rule. The original
    "evict-wins" rule would silently drop the re-add and leave the
    message permanently absent from disk; this test would fail under
    that rule and passes under the corrected one.

Resolves PR #72 review comments:
- #1 (delta loss on failed updateHistory) — R2 retry queue.
- #2 (cascade chattiness — N updateHistory calls per op) — queue collects
  cascaded entries, flushed as one batch.
- #3 (atomicity contract mismatch) — implementation now matches the
  documented "saveChannelMeta then updateHistory back-to-back" pairing.

Test summary: 50 tests pass (47 prior + 3 new R2/merge-rule tests).
FFI dylib (`nimble libsdsDynamicMac`, refc + threads:on): clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 12:24:38 +02:00

24 KiB
Raw Blame History

SDS Snapshot Persistence — Design & Refactor Plan

Companion to ANALYSIS_SDS_PERSISTENCE.md (problem statement) and ANALYSIS_SNAPSHOT_SAVE_POINTS.md (where & how often we save).

This document defines:

  1. Data structures to be persisted (snapshot + history)
  2. New Persistence interface (5 procs replacing the current 13)
  3. Refactor plan — phased, test-gated, backward-compatible interim state

1. Data Structure Design

1.1 Design principles

Principle Reason
Snapshot is one atomic blob Eliminates partial-write divergence (the root cause from ANALYSIS_SDS_PERSISTENCE.md §4)
Snapshot is small (buffers only, no history) Keeps per-op write cost ≤ a few KB; foldable into one SQLite txn
History is separate, append-batched Large data, append-mostly, queryable by msg_id for SDS-R
Bloom filter is not persisted Already the case — rebuilt from history on bootstrap
Versioned wire format Allow future schema evolution without breaking on-disk data
Protobuf serialization Project already uses it (sds/protobuf.nim); keeps one codec

1.2 ChannelMeta — the snapshot payload

# sds/types/channel_meta.nim  (new file)

import std/[tables, times]
import ./sds_message_id
import ./unacknowledged_message
import ./incoming_message
import ./repair_entry
export
  sds_message_id, unacknowledged_message, incoming_message, repair_entry

const ChannelMetaSchemaVersion* = 1'u32

type ChannelMeta* = object
  ## Atomic snapshot of the fast-changing per-channel protocol state.
  ## Persisted as one blob per `saveChannelMeta` call. Bloom filter is
  ## intentionally absent — rebuilt from the message log on bootstrap.
  ## Message history is also absent — persisted separately via `updateHistory`
  ## because it is large and append-mostly.
  schemaVersion*: uint32
    ## On-disk format version. Backends MUST refuse to load a meta whose
    ## version they don't know how to decode rather than silently truncating
    ## or zero-filling unknown fields.

  lamportTimestamp*: int64

  outgoingBuffer*: seq[UnacknowledgedMessage]
    ## Sent-but-not-yet-acked messages. Order matters: the protocol iterates
    ## in insertion order for resend-attempt accounting.

  incomingBuffer*: seq[IncomingMessage]
    ## Received-but-not-yet-deliverable messages, each carrying its
    ## still-missing dependency set. Order is irrelevant; flattened from
    ## the in-memory `Table` for wire-friendliness.

  outgoingRepairBuffer*: seq[OutgoingRepairKV]
  incomingRepairBuffer*: seq[IncomingRepairKV]
    ## SDS-R repair buffers, flattened from in-memory `Table` to seq of
    ## (key, value) for stable serialization.

type
  OutgoingRepairKV* = object
    messageId*: SdsMessageID
    entry*: OutgoingRepairEntry

  IncomingRepairKV* = object
    messageId*: SdsMessageID
    entry*: IncomingRepairEntry

Why flatten the Tables to seqs? Protobuf has no native map of SdsMessageID → object. Flattening to seq of KV objects gives deterministic encoding and trivial decode-time rebuild of the in-memory Table. The cost is one extra alloc per entry on encode/decode — negligible vs. the I/O it replaces.

Why an explicit schemaVersion? The current interface has no version field. Adding fields later (e.g., a new SDS-R counter) silently truncates old data on load. The version makes incompatibility explicit; backends fail loud instead of corrupting state.

1.3 HistoryAppend — the history-write payload

# extension to sds/types/persistence.nim or new history_update.nim

type HistoryUpdate* = object
  ## Combined append/evict for one protocol operation. Empty `append` and
  ## empty `evict` ⇒ caller should skip the call entirely.
  append*: seq[SdsMessage]
    ## New delivered messages, in delivery order (matters for SDS-R retrieval
    ## hint correctness and FIFO eviction on the backend side).
  evict*: seq[SdsMessageID]
    ## Oldest messages now past `maxMessageHistory`. Backend deletes by id.

append is a seq (not a single SdsMessage) because processIncomingBuffer can deliver a chain of unblocked messages in one call to the parent op (unwrapReceivedMessage / markDependenciesMet). Sending them all in one updateHistory call keeps the "one save per protocol op" guarantee.

1.4 ChannelData — the bootstrap payload

type ChannelData* = object
  ## Returned by `loadChannel` on `getOrCreateChannel` bootstrap.
  ## Carries everything needed to rebuild the in-memory `ChannelContext`
  ## from a clean restart.
  meta*: ChannelMeta
  messageHistory*: seq[SdsMessage]
    ## MUST be ordered oldest-first (lamportTimestamp ASC, tie-break msg_id
    ## ASC). Bloom filter is rebuilt from this on load; FIFO eviction relies
    ## on this ordering. Backend contract; validated by nim-sds on load.

1.5 Storage encoding (internal to nim-sds — not the SDS network wire format)

Disambiguation. The SDS network wire format (bytes peers exchange) is handled by the existing sds/protobuf.nim and is untouched by this plan. What this section defines is the storage encoding: the codec nim-sds uses to turn a ChannelMeta Nim object into the opaque seq[byte] blob it hands to saveChannelMeta. The KV persistence worker treats that blob as fully opaque — it stores (key: bytes) → (value: bytes) and does its own buffering/batching of writes. Whether nim-sds uses protobuf, CBOR, or anything else is invisible to the worker.

Why this codec exists at all. The worker stores bytes; something must produce those bytes from the in-memory ChannelMeta. That responsibility sits inside nim-sds, on the producer side of the persistence boundary. It runs synchronously inside saveChannelMeta, before the blob crosses to the worker.

Choice: protobuf, reusing the existing toolchain.

  • sds/protobuf.nim is already a dependency and already encodes SdsMessage
  • Field-number versioning composes naturally with the explicit schemaVersion
  • Encoders for the new types compose on top of the existing SdsMessage one — no new codec to maintain

Encoders to add:

  • UnacknowledgedMessage (wraps SdsMessage + sendTime: int64 unix-ms + resendAttempts: uint32)
  • IncomingMessage (wraps SdsMessage + missingDeps: repeated bytes)
  • OutgoingRepairEntry / IncomingRepairEntry (HistoryEntry + Time + optional cachedMessage)
  • OutgoingRepairKV / IncomingRepairKV (msgId + entry — flattened map; see §6)
  • ChannelMeta (top-level)

Time is serialized as int64 unix milliseconds. The wall-clock semantics are already used by the protocol itself (getTime() in wrapOutgoingMessage).

On durability. Because the worker buffers blobs, saveChannelMeta returning ok() means "the blob was accepted by the worker," not "the blob is fsynced." That is the worker's contract to manage. nim-sds's own invariant — one snapshot save per protocol op, after all in-memory mutation completes — is satisfied as soon as the worker accepts the blob, because on recovery the worker replays its own buffer in order, so the snapshot nim-sds last issued is the snapshot nim-sds will see on next loadChannel.


2. New Persistence Interface

Replace the current 13 procs in sds/types/persistence.nim with 5:

type Persistence* = object
  saveChannelMeta*: proc(
    channelId: SdsChannelID, meta: ChannelMeta
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  updateHistory*: proc(
    channelId: SdsChannelID, update: HistoryUpdate
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  loadChannel*: proc(
    channelId: SdsChannelID
  ): Future[Result[ChannelData, string]] {.async: (raises: []), gcsafe.}

  dropChannel*: proc(
    channelId: SdsChannelID
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  setRetrievalHint*: proc(
    msgId: SdsMessageID, hint: seq[byte]
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

Atomicity contract (documented in the interface comment)

Backends SHOULD execute saveChannelMeta and the immediately following updateHistory call within a single transaction when both arrive together from the same protocol op. nim-sds always issues them back-to-back under the channel lock, with no await-of-other-work in between, so the backend can either (a) buffer saveChannelMeta until the next updateHistory or flush, or (b) use a txn(channelId) handle. Variant (b) is cleaner; see §3.2 for the optional beginTxn/commitTxn extension.

Backend assumption: schema-agnostic KV blob store

The target backend is the existing schema-agnostic KV persistence module in the sibling repo. It stores opaque (key: bytes) → (value: bytes) blobs with its own crash-consistency guarantees. Therefore:

  • nim-sds owns the wire format end-to-end (no SQL schema to coordinate)
  • The "single transaction per op" requirement reduces to "two KV puts per op": meta:<channelId> and history:<channelId>:<msgId> (one or more)
  • The backend's existing batch/atomicity primitives are what guarantee crash consistency — nim-sds doesn't need transaction-handle plumbing

3. Refactor Plan

Phase 0 — Pre-work (no behavior change)

Step File(s) Verify
0.1 Add ChannelMeta, HistoryUpdate, ChannelData types new sds/types/channel_meta.nim, sds/types/history_update.nim nimble c sds.nim compiles
0.2 Add protobuf encoders/decoders for new types extend sds/protobuf.nim round-trip unit tests
0.3 Add tests/test_snapshot_codec.nim new test file nimble test passes; covers empty, single-entry, full-buffer, repair-heavy cases

Phase 1 — New interface alongside old

Step File(s) Verify
1.1 Add new 5-proc Persistence type as PersistenceV2 (rename later) sds/types/persistence.nim compiles; old interface still works
1.2 Add noOpPersistenceV2() for tests same nimble test passes
1.3 Add ReliabilityManager.persistenceV2 field, optional sds/types/reliability_manager.nim one of persistence / persistenceV2 is in use; assert at construction

Phase 2 — Migrate protocol ops, one at a time

For each op, the pattern is:

  1. Add a dirty: bool local accumulator
  2. Replace inner await rm.persistence.X calls with in-memory mutation + set dirty = true
  3. At the end of the op (under lock, before return), emit at most one saveChannelMeta and at most one updateHistory call

Order (least risky → highest risk):

Step Op File:line Verify
2.1 runRepairSweep sds.nim:510 repair sweep unit test, with failure injection
2.2 checkUnacknowledgedMessages sds.nim:445 resend-flow integration test
2.3 processIncomingBuffer → pure (no persistence) sds.nim:176 callers will persist; covered by 2.4/2.5
2.4 reviewAckStatus → pure (no persistence) sds.nim:36 covered by 2.5
2.5 unwrapReceivedMessage sds.nim:235 full receive-path tests (paths A/B/C); duplicate early-return must skip save
2.6 wrapOutgoingMessage sds.nim:87 send-path tests
2.7 markDependenciesMet sds.nim:378 dep-resolution tests
2.8 addToHistory → return appended/evicted lists instead of persisting sds_utils.nim:81 covered by 2.5/2.6/2.7
2.9 updateLamportTimestamp → pure (no persistence) sds_utils.nim:108 covered
2.10 getOrCreateChannel use loadChannel sds_utils.nim:289 bootstrap unit test
2.11 removeChannel, resetReliabilityManagerdropChannel sds_utils.nim, sds.nim wipe tests

Each step is a small commit. After every step: nimble test + gitnexus_detect_changes to confirm scope.

Phase 3 — Remove the old interface

Step File(s) Verify
3.1 Delete old 13-proc Persistence fields sds/types/persistence.nim compile fails on stragglers — fix
3.2 Rename PersistenceV2Persistence all call sites full test suite
3.3 Delete noOpPersistence (old), keep noOpPersistenceV2 as noOpPersistence same tests pass
3.4 Update library/ FFI thread to construct the new Persistence library/sds_thread/... FFI smoke test on macOS + Linux
3.5 Update Broker_FFI_API.md and any docs referencing the old contract docs review

Phase 4 — (removed)

A reference backend is not part of this plan. The schema-agnostic KV persistence module in the sibling repo is the production backend. Its authors own the integration adapter that maps the 5 Persistence procs onto KV puts/gets. nim-sds only needs to expose the interface and a working noOpPersistence for its own tests.


4. Risk Mitigation During Refactor

Risk Mitigation
Mid-refactor inconsistency (some ops on new interface, some on old) Phase 2 keeps both interfaces wired — only one is active per RM via a constructor switch; integration tests run against both
Behavior change masked by passing tests Add tests/test_persistence_contract.nim that asserts exact call count per protocol op (before vs after must match the table in ANALYSIS_SNAPSHOT_SAVE_POINTS.md)
Memory-first mutation pattern preserved by accident Move all persistence calls to the end of the op, after the lock-held mutation block completes. The dirty flag is set during mutation; the save fires after. If save fails, the in-memory state is still the source of truth for the next op — but now there's only one possible point of divergence per op, not 10.
FFI thread breakage Phase 3.4 is the FFI cutover; smoke test on both --mm:refc and --mm:orc, macOS and Linux, before declaring done. ASAN run on the FFI example.
Snapshot blob growth surprises Add a len() log on saveChannelMeta for the first week of integration; fail-loud if any blob exceeds (configurable) 1 MB

5. Acceptance Criteria

  • All existing nimble test cases pass against the new interface
  • New tests/test_persistence_contract.nim enforces exactly the call counts from ANALYSIS_SNAPSHOT_SAVE_POINTS.md §"Save Points" table
  • New tests/test_snapshot_codec.nim round-trips every ChannelMeta variant
  • Failure-injection test: kill persistence between saveChannelMeta and updateHistory → on restart, the manager loads a self-consistent snapshot (no orphan history entries; no dangling buffer references)
  • FFI smoke (liblogosdelivery-style) runs clean on macOS+refc, macOS+orc, Linux+refc, Linux+orc
  • Broker_FFI_API.md reflects the new contract
  • Bench: snapshot save rate matches the predicted S + R (foreground) and ≤ 0.2/s/channel background floor (with dirty-guard) under a synthetic 50-msg/s workload
  • Snapshot blob size on the bench workload matches the estimate in §7 within 2×; outliers logged

6. Codec & flattening — where protobuf comes in

Codec choice

The KV backend stores opaque blobs. The codec that produces the blob is internal to nim-sds. Protobuf is the natural choice because:

  • The project already uses protobuf for the SDS wire format (sds/protobuf.nim encodes SdsMessage). One codec, one toolchain.
  • Field-number versioning gives forward/backward compatibility for free — pairs naturally with the schemaVersion field.
  • Repeated message fields encode efficiently and round-trip cleanly.

Concretely: ChannelMeta is a top-level protobuf message; saveChannelMeta serializes it to seq[byte] and the backend writes that under meta:<channelId>. On load, the backend returns the bytes; nim-sds deserializes.

Why flatten Table[Id, Entry] to seq[KV]

Protobuf's wire format has no first-class "map of bytes-key → message-value" type in the minimal subset used by sds/protobuf.nim (the nim-libp2p-style minprotobuf). Even the full proto3 map<K, V> is encoded on the wire as repeated KV messages anyway — the map syntax is just sugar over repeated Entry { key = 1; value = 2; }.

So flattening is making the wire shape explicit:

ChannelMeta {
  ...
  repeated OutgoingRepairKV outgoingRepairBuffer = 5;
  repeated IncomingRepairKV incomingRepairBuffer = 6;
}

OutgoingRepairKV {
  bytes messageId = 1;
  OutgoingRepairEntry entry = 2;
}

The Table exists only in memory; the wire and disk form is the flat seq. Decode rebuilds the Table by iterating the seq. Cost: one alloc per entry on encode/decode — negligible against the I/O it replaces.

outgoingBuffer (already a seq) and incomingBuffer (a Table flattened to seq[IncomingMessage] — the key is message.messageId so no separate KV wrapper is needed) follow the same logic.


7. Snapshot size estimates

Assumptions (call out — every number below derives from these):

Quantity Assumed bytes Source
SdsMessageID 32 typical content-addressed id
SdsParticipantID 32 same
SdsChannelID 32 same
bloomFilter (serialized, in an SdsMessage) 256 derived from default bloomFilterCapacity × errorRate
causalHistory 10 entries × ~40 B maxCausalHistory = 10 from reliability_config.nim
repairRequest in a wire SdsMessage up to 3 × ~40 B maxRepairRequests = 3
Application payload (content) — small 100 B typical short chat payload
Application payload — medium 1 KB richer payload
Protobuf framing ~10% overhead tag bytes + varints

One SdsMessage on the wire (no content): ~700 B One SdsMessage with 100 B content: ~800 B One SdsMessage with 1 KB content: ~1.7 KB

Per-entry sizes inside ChannelMeta:

Entry Size (100 B payload) Size (1 KB payload) Notes
UnacknowledgedMessage ~820 B ~1.7 KB SdsMessage + sendTime + resendAttempts
IncomingMessage ~950 B ~1.9 KB SdsMessage + missingDeps (avg 3 × 32 B)
OutgoingRepairKV ~110 B ~110 B no cached message, payload-independent
IncomingRepairKV ~920 B ~1.8 KB cached serialized SdsMessage dominates

Fixed overhead per ChannelMeta: ~30 B (schemaVersion + lamportTimestamp + framing).

Per-channel snapshot size by load

Profile outBuf inBuf outRepair inRepair Size (100 B payload) Size (1 KB payload)
Idle 0 0 0 0 ~30 B ~30 B
Light chat 2 0 0 0 ~1.7 KB ~3.5 KB
Steady 5 1 1 1 ~6 KB ~12 KB
Busy 10 3 3 3 ~14 KB ~28 KB
Heavy, lossy network (SDS-R churning) 30 10 20 10 ~45 KB ~95 KB
Pathological (resend window full, big repair caches) 50 20 30 20 ~75 KB ~155 KB

Where the bytes go

Load profile Dominant contributor
Idle / light Fixed overhead + outgoingBuffer
Steady / busy outgoingBuffer (each entry ~1 KB+)
Heavy / lossy incomingRepairBuffer — each KV entry caches a full serialized message for rebroadcast. This is the single biggest amplifier; 20 entries with 1 KB payloads ≈ 36 KB on their own.

Implications

  1. Typical write is small (130 KB). Comfortably foldable into the per-op KV write cost; the backend's blob-write cost is bounded.
  2. IncomingRepairEntry.cachedMessage is the size lever to watch. Under heavy SDS-R activity it dominates the snapshot. If snapshot size becomes a bottleneck, the optimization is to drop the cache from the snapshot and re-serialize from messageHistory on demand — at the cost of more CPU and the corner case where the requested message has been evicted from history between snapshot save and repair sweep firing.
  3. Heavy profile (~95 KB) at the predicted 6/s/ch save rate = ~570 KB/s per channel. A 10-channel heavy node is then ~5.7 MB/s of snapshot churn — well within KV backend throughput, but worth a real bench before declaring it OK.
  4. The 1 MB hard cap suggested in §4 stays appropriate; pathological profile at 1 KB payload is ~155 KB, leaving healthy headroom.

8. Persistence failure policy — non-fatal, best-effort

Change from current branch. The current implementation treats every rePersistenceError as fatal: the protocol op returns err(), the caller sees a failure, and normal SDS operation breaks even though the in-memory state is fine. This is wrong for the snapshot model.

New policy.

  • In-memory state is the source of truth for protocol correctness. Lamport clock, buffers, history, bloom filter — all live in ChannelContext and are mutated under the lock before any persistence call. SDS message processing never depends on disk state for correctness within a session.
  • Persistence is best-effort durability. A failed saveChannelMeta or updateHistory does not abort the operation, does not return err to the FFI caller, and does not corrupt protocol semantics. The next op will issue its own snapshot — if that succeeds, on-disk state is re-synchronised; if it also fails, the one after that tries again.
  • Snapshot writes are idempotent and self-contained. Each saveChannelMeta blob is the complete current ChannelMeta. A missed write is fully recovered by any later successful write — no log of deltas to replay, no compensating action needed.
  • Bootstrap loss tolerance: if loadChannel fails or returns stale state on restart, the manager starts from whatever it could load (possibly empty). Peer traffic and SDS-R repair will re-populate it. This is the expected behaviour of the bloom-rebuilt-from-history design extended to the meta blob.

Implementation pattern. At each save point:

# end of wrapOutgoingMessage / unwrapReceivedMessage / etc.
if dirty:
  let saveRes = await rm.persistence.saveChannelMeta(channelId, snapshot)
  if saveRes.isErr:
    warn "snapshot save failed; in-memory state unaffected, next op will retry",
      channelId = channelId, detail = saveRes.error
    # DO NOT return err; protocol op succeeded.
if appended.len > 0 or evicted.len > 0:
  let histRes = await rm.persistence.updateHistory(channelId,
                  HistoryUpdate(append: appended, evict: evicted))
  if histRes.isErr:
    warn "history update failed; in-memory log authoritative, next op will retry",
      channelId = channelId, detail = histRes.error
return ok(serializedMessage)  # protocol op succeeded regardless

What still returns err(rePersistenceError). Only operations whose semantic intent is durability:

  • removeChannel, resetReliabilityManager → must confirm dropChannel succeeded; otherwise the caller may assume disk is clean when it isn't.
  • getOrCreateChannel on first bootstrap → if loadChannel errors (vs. returns empty), surface it so the caller can decide between "start fresh in memory" and "abort init".

Impact on §5 acceptance criteria. Add: failure-injection test must prove that wrapOutgoingMessage, unwrapReceivedMessage, markDependenciesMet, checkUnacknowledgedMessages, runRepairSweep all return ok under 100%-failing persistence, with correct in-memory behaviour and a recovered on-disk state after persistence is restored.

Why this is safe. Each snapshot is a full self-contained blob; partial-write divergence (the original ANALYSIS §4 critical risk) is already eliminated by the atomic-blob design. Once that's true, treating persistence failure as fatal is pure downside — it propagates a recoverable I/O hiccup into a user-visible protocol failure for no correctness gain.


9. What this plan deliberately does NOT do

  • Does not add transaction handles — the KV backend's batch primitive is sufficient
  • Does not ship a reference backend — the schema-agnostic KV module in the sibling repo is the production backend
  • Does not change the bloom filter persistence policy (still rebuilt from history)
  • Does not introduce SDS-R repair extension changes
  • Does not touch the FFI surface shape beyond construction of Persistence — the existing C API is unchanged
  • Does not auto-migrate on-disk data from an older format (no production data exists yet; schemaVersion=1 starts clean)