* feat: propagate persistence backend errors via Result
The Persistence contract previously returned `Future[void]` for writes and
`Future[ChannelSnapshot]` for the loader, with `raises: []`. Backends had no
way to report a failure, so a failed write or a failed/partial read was
silently swallowed — and on the read path a mid-scan failure could bootstrap
a *truncated* channel snapshot, corrupting the rebuilt bloom filter and
lamport clock across a restart.
Make every contract field Result-returning:
* mutating ops -> Future[Result[void, string]]
* loadAllForChannel -> Future[Result[ChannelSnapshot, string]]
The backend-supplied error string is mapped to a new
`ReliabilityError.rePersistenceError` (logged once at the boundary via
`reliabilityErr`) and threaded up through every persistence-touching proc to
the public API, where the caller decides what to do. Request-driven paths
(wrap/unwrap/markDependenciesMet/ensureChannel/removeChannel/reset) propagate
the error; background maintenance loops (periodicBufferSweep,
periodicRepairSweep) log and retry on the next tick, since they have no
synchronous caller.
Tests: in-memory backend gains a `failingOps` injection hook; new
"Persistence: error propagation" suite asserts read/write/drop failures
surface as `rePersistenceError`. Full suite passes (90 OK).
BREAKING CHANGE: the `Persistence` contract signature changed; custom
backends must return `Result` and `ok()` on success. Bumped to 0.3.0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(persistence): add snapshot types and codec (phase 0)
Introduce atomic-snapshot persistence types that will replace the current
fine-grained 13-proc Persistence interface. This commit is purely additive:
no existing call site changes, no behaviour change.
New types (sds/types/):
- channel_meta.nim — ChannelMeta (atomic per-channel snapshot blob),
ChannelData (bootstrap payload), OutgoingRepairKV / IncomingRepairKV
(flattened map entries for protobuf wire shape).
- history_update.nim — HistoryUpdate (combined append/evict payload for
the message log).
New codec (sds/snapshot_codec.nim):
- Protobuf encode/decode for all new types, reusing the existing
SdsMessage and HistoryEntry encoders from sds/protobuf.nim.
- Explicit schemaVersion=1 on ChannelMeta; decoder rejects unknown
versions loudly rather than silently truncating.
- Time encoded as int64 unix milliseconds.
Tests (tests/test_snapshot_codec.nim):
- 13 round-trip cases covering empty, single-entry, full-buffer, and
repair-heavy snapshots; ChannelData ordering; HistoryUpdate variants;
schemaVersion rejection.
Planning artefacts:
- ANALYSIS_SDS_PERSISTENCE.md — problem statement (partial-write
divergence, chatty call rate, non-fatal-error policy gap).
- ANALYSIS_SNAPSHOT_SAVE_POINTS.md — exact save points per protocol op
and projected call rates.
- PLAN_SNAPSHOT_PERSISTENCE.md — phased refactor plan; this commit
implements phase 0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(persistence): add PersistenceV2 interface alongside legacy (phase 1)
Introduce the 5-proc snapshot-based Persistence interface that will
replace the legacy 13-proc one. Both coexist on `ReliabilityManager` so
phase 2 can migrate protocol ops one at a time without breaking existing
callers.
New file:
- sds/types/persistence_v2.nim — `PersistenceV2` type with
saveChannelMeta / updateHistory / loadChannel / dropChannel /
setRetrievalHint. `noOpPersistenceV2()` default. Doc-comments capture
the atomicity pairing (meta save + history update issued back-to-back
under the channel lock) and the non-fatal failure policy from PLAN §8.
Modified:
- sds/types/reliability_manager.nim — adds `persistenceV2: PersistenceV2`
field alongside `persistence`; constructor takes both, both default to
no-op.
- sds.nim — `newReliabilityManager` plumbs the new optional parameter.
- AGENTS.md / CLAUDE.md — GitNexus index re-indexed after phase 0 +
phase 1 additions; symbol counts updated by `npx gitnexus analyze`.
No call site uses the new interface yet — that's phase 2. All existing
tests still pass against the legacy interface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): migrate runRepairSweep to PersistenceV2 (phase 2.1)
Per-entry removeIncomingRepair / removeOutgoingRepair calls are replaced
by a single trySaveMeta per *dirty* channel at the end of that channel's
sweep. Failure is logged but does NOT abort the sweep — in-memory state
is the source of truth (PLAN_SNAPSHOT_PERSISTENCE.md §8).
Helpers added in sds/sds_utils.nim:
- snapshotMeta(channel) — capture current ChannelContext as ChannelMeta
blob (flattens Table-keyed buffers to seqs for the wire shape).
- trySaveMeta(rm, channelId, channel) — best-effort meta snapshot save;
logs on failure, never propagates.
- tryUpdateHistory(rm, channelId, append, evict) — best-effort history
update; skips the call entirely when both lists are empty (HistoryUpdate
contract).
Call-rate impact for runRepairSweep:
- Before: N persistence calls per expired entry per channel.
- After: at most 1 saveChannelMeta per dirty channel; 0 on idle channels
(matches the dirty-flag floor in ANALYSIS_SNAPSHOT_SAVE_POINTS).
All existing tests pass — including the 3 SDS-R Repair Sweep tests that
directly exercise this proc.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): migrate checkUnacknowledgedMessages to PersistenceV2 (phase 2.2)
Per-entry saveOutgoing / removeOutgoing calls are replaced by one
trySaveMeta at the end of the pass, conditional on a dirty flag (resend
attempt incremented, or entry expired). Pass succeeds even if the save
fails — next tick reissues the snapshot.
Call-rate impact:
- Before: N persistence calls per affected entry per pass.
- After: at most 1 saveChannelMeta per pass; 0 when nothing aged out.
All existing tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): add V2 meta snapshot saves to foreground ops (phase 2A)
Wires `trySaveMeta` into the three public protocol ops that mutate
per-channel state — wrapOutgoingMessage, unwrapReceivedMessage, and
markDependenciesMet — at the operation's end, under the channel lock.
Legacy fine-grained persistence calls REMAIN in place; this commit is
additive. Both interfaces persist the same state simultaneously, so all
existing tests pass and a real backend wired to either interface
continues to work. Phase 2B will strip the legacy calls.
Save points match the §"Save Points" table in
ANALYSIS_SNAPSHOT_SAVE_POINTS.md exactly:
- wrapOutgoingMessage: 1 save (always)
- unwrapReceivedMessage: 1 save on every path including duplicate
(the duplicate path still mutates the repair buffers)
- markDependenciesMet: 1 save after the processIncomingBuffer cascade
Non-fatal failure policy (PLAN §8): trySaveMeta logs and continues;
the protocol op never returns rePersistenceError for snapshot failures.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): strip legacy interface from protocol path; migrate tests to V2 (phase 2B+2C+2D)
End-state of phase 2: the protocol code no longer issues any legacy
fine-grained Persistence calls. All state survives via the snapshot-based
PersistenceV2 interface — one trySaveMeta per op end, plus tryUpdateHistory
batched inside addToHistory. The legacy Persistence field on
ReliabilityManager remains for backwards compatibility; phase 3 deletes it.
Protocol changes (sds.nim, sds/sds_utils.nim):
- reviewAckStatus, processIncomingBuffer, updateLamportTimestamp →
pure in-memory; no per-mutation persistence.
- addToHistory: replaces appendLogEntry+removeLogEntry with a single
tryUpdateHistory call carrying (append, evict) atomically.
- getRecentHistoryEntries: setRetrievalHint switched to V2; non-fatal.
- wrapOutgoingMessage, unwrapReceivedMessage, markDependenciesMet:
all per-row saveOutgoing / removeOutgoing / saveIncoming /
removeIncoming / saveOutgoingRepair / removeOutgoingRepair /
saveIncomingRepair / removeIncomingRepair calls removed (16 call
sites in total). State is captured by the op-end trySaveMeta added
in phase 2A.
- getOrCreateChannel: bootstraps from persistenceV2.loadChannel.
- dropChannelFromPersistence: uses persistenceV2.dropChannel.
Failure policy (PLAN_SNAPSHOT_PERSISTENCE.md §8):
- Foreground ops (wrap, unwrap, markDeps, sweeps): non-fatal —
trySaveMeta / tryUpdateHistory log and continue; the protocol op
returns ok regardless of disk failure. In-memory state is the source
of truth; the next op re-issues a complete snapshot and disk catches
up automatically.
- Durability-intent ops (removeChannel, resetReliabilityManager via
dropChannelFromPersistence; getOrCreateChannel via loadChannel):
still propagate rePersistenceError, because the caller asked us to
confirm a disk operation and we cannot silently lie.
Test infrastructure:
- tests/in_memory_persistence_v2.nim: new V2 adapter mock that
decomposes the meta blob into the existing InMemoryStore shape so
test assertions on store.outgoing / store.incoming / etc. continue to
work without change.
- tests/test_persistence.nim: 17 tests, all rewritten against V2.
- 13 state-survival tests carry over with identical assertions.
- "loadChannel failure surfaces as err on bootstrap" — bootstrap
keeps durability-intent semantics.
- "saveChannelMeta failure during send does NOT surface" — deliberate
inversion of the legacy "write failure surfaces as err" test. Asserts
the new non-fatal policy: op returns ok, in-memory state correct,
disk re-syncs on the next op.
- "updateHistory failure during send does NOT surface" — same policy
applied to the history path.
- "dropChannel failure during removeChannel surfaces as err" — kept.
- All 17 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): delete legacy interface; rename PersistenceV2 -> Persistence (phase 3)
End-state of the snapshot-persistence refactor. The legacy 13-proc
Persistence interface and its noOpPersistence are gone; the 5-proc
snapshot-based interface (formerly PersistenceV2) takes their place under
the canonical name.
Source:
- sds/types/persistence.nim: replaced 13-proc contract with the 5-proc
snapshot interface (saveChannelMeta, updateHistory, loadChannel,
dropChannel, setRetrievalHint). noOpPersistence returns ok everywhere
and an empty ChannelData on load.
- sds/types/persistence_v2.nim: removed.
- sds/types/reliability_manager.nim: dropped the second persistenceV2
field; constructor takes a single `persistence: Persistence`.
- sds/sds_utils.nim: rm.persistenceV2.X -> rm.persistence.X; doc-comments
updated.
- sds.nim: dropped the persistenceV2 parameter from newReliabilityManager.
Tests:
- tests/in_memory_persistence_v2.nim: removed; its content moved to...
- tests/in_memory_persistence.nim: replaces the old legacy mock with the
snapshot adapter under the canonical filename. Same InMemoryStore
shape so test assertions stay unchanged.
- tests/test_persistence.nim: ctor param renamed, suite name de-prefixed.
FFI smoke (`nimble libsdsDynamicMac`, refc/threads:on): builds clean.
All 4 test suites pass:
- test_bloom
- test_reliability
- test_persistence (17 V2 tests)
- test_snapshot_codec (13 codec round-trip tests)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Persisting persistence redesign plan for reference
* refactor(persistence): R2 pending-write queue + per-op accumulator (PR #72 review fix)
Addresses all three substantive review findings on PR #72 in one
structural change: fold the per-op accumulator and the R2 retry buffer
into a single queue on `ChannelContext`, flushed once at op end.
Changes:
- sds/types/channel_context.nim: add `pendingHistoryAppends`
(`OrderedSet[SdsMessageID]`) and `pendingHistoryEvicts`
(`HashSet[SdsMessageID]`) fields. Only ids are stored — the full
SdsMessage is looked up from `messageHistory` at flush time. Documented
invariant: every id in pendingHistoryAppends is also in messageHistory,
upheld by the merge rule.
- sds/sds_utils.nim:
* `queueHistoryAppend(channel, msgId)` / `queueHistoryEvict(channel,
msgId)` — "latest-wins" merge: append cancels any pending evict
and vice versa. Symmetric, simple, handles the evict-then-re-add
sequence correctly (SDS-R repair re-delivering an evicted message
while the backend is unreachable).
* `tryUpdateHistory(rm, channelId)` — no more list params; flushes the
channel's pending queue. Dual role: per-op accumulator (multiple
`addToHistory` calls within one op queue together and flush as one
round-trip) AND R2 retry buffer (a failed flush leaves the queue
populated for the next op to retry).
* `addToHistory` queues via the helpers; does not call persistence.
* Pending queue cleared on `cleanup` and `removeChannel`.
- sds.nim:
* `processIncomingBuffer` returns to its single-arg signature — the
queue lives on the channel, no parameter threading needed.
* `wrapOutgoingMessage`, `unwrapReceivedMessage` (all three paths),
`markDependenciesMet` issue exactly one `trySaveMeta` +
`tryUpdateHistory` pair at op end, under the lock, with no
intervening `await`-of-other-work. Matches the Persistence atomicity
contract documented in `sds/types/persistence.nim`.
* Pending queue cleared in `resetReliabilityManager`.
- tests/test_persistence.nim:
* Direct `addToHistory` callers (state-survival setup) now follow with
explicit `tryUpdateHistory(channelId)` to flush. Reflects the
production op-end flush pattern.
* New: `updateHistory failure is retried via R2 pending-write queue` —
verifies that two failed sends leave both messages on the queue,
and a third successful send drains the whole queue in one call.
* New: `pending queue survives idle ops` — verifies that an op with
no history changes of its own still flushes a previously-failed
batch at op end.
* New: `evict-then-re-add merge rule preserves the re-added message
on disk` — regression for the "latest-wins" merge rule. The original
"evict-wins" rule would silently drop the re-add and leave the
message permanently absent from disk; this test would fail under
that rule and passes under the corrected one.
Resolves PR #72 review comments:
- #1 (delta loss on failed updateHistory) — R2 retry queue.
- #2 (cascade chattiness — N updateHistory calls per op) — queue collects
cascaded entries, flushed as one batch.
- #3 (atomicity contract mismatch) — implementation now matches the
documented "saveChannelMeta then updateHistory back-to-back" pairing.
Test summary: 50 tests pass (47 prior + 3 new R2/merge-rule tests).
FFI dylib (`nimble libsdsDynamicMac`, refc + threads:on): clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
24 KiB
SDS Snapshot Persistence — Design & Refactor Plan
Companion to ANALYSIS_SDS_PERSISTENCE.md (problem statement) and
ANALYSIS_SNAPSHOT_SAVE_POINTS.md (where & how often we save).
This document defines:
- Data structures to be persisted (snapshot + history)
- New
Persistenceinterface (5 procs replacing the current 13) - Refactor plan — phased, test-gated, backward-compatible interim state
1. Data Structure Design
1.1 Design principles
| Principle | Reason |
|---|---|
| Snapshot is one atomic blob | Eliminates partial-write divergence (the root cause from ANALYSIS_SDS_PERSISTENCE.md §4) |
| Snapshot is small (buffers only, no history) | Keeps per-op write cost ≤ a few KB; foldable into one SQLite txn |
| History is separate, append-batched | Large data, append-mostly, queryable by msg_id for SDS-R |
| Bloom filter is not persisted | Already the case — rebuilt from history on bootstrap |
| Versioned wire format | Allow future schema evolution without breaking on-disk data |
| Protobuf serialization | Project already uses it (sds/protobuf.nim); keeps one codec |
1.2 ChannelMeta — the snapshot payload
# sds/types/channel_meta.nim (new file)
import std/[tables, times]
import ./sds_message_id
import ./unacknowledged_message
import ./incoming_message
import ./repair_entry
export
sds_message_id, unacknowledged_message, incoming_message, repair_entry
const ChannelMetaSchemaVersion* = 1'u32
type ChannelMeta* = object
## Atomic snapshot of the fast-changing per-channel protocol state.
## Persisted as one blob per `saveChannelMeta` call. Bloom filter is
## intentionally absent — rebuilt from the message log on bootstrap.
## Message history is also absent — persisted separately via `updateHistory`
## because it is large and append-mostly.
schemaVersion*: uint32
## On-disk format version. Backends MUST refuse to load a meta whose
## version they don't know how to decode rather than silently truncating
## or zero-filling unknown fields.
lamportTimestamp*: int64
outgoingBuffer*: seq[UnacknowledgedMessage]
## Sent-but-not-yet-acked messages. Order matters: the protocol iterates
## in insertion order for resend-attempt accounting.
incomingBuffer*: seq[IncomingMessage]
## Received-but-not-yet-deliverable messages, each carrying its
## still-missing dependency set. Order is irrelevant; flattened from
## the in-memory `Table` for wire-friendliness.
outgoingRepairBuffer*: seq[OutgoingRepairKV]
incomingRepairBuffer*: seq[IncomingRepairKV]
## SDS-R repair buffers, flattened from in-memory `Table` to seq of
## (key, value) for stable serialization.
type
OutgoingRepairKV* = object
messageId*: SdsMessageID
entry*: OutgoingRepairEntry
IncomingRepairKV* = object
messageId*: SdsMessageID
entry*: IncomingRepairEntry
Why flatten the Tables to seqs?
Protobuf has no native map of SdsMessageID → object. Flattening to seq of KV
objects gives deterministic encoding and trivial decode-time rebuild of the
in-memory Table. The cost is one extra alloc per entry on encode/decode —
negligible vs. the I/O it replaces.
Why an explicit schemaVersion?
The current interface has no version field. Adding fields later (e.g., a new
SDS-R counter) silently truncates old data on load. The version makes
incompatibility explicit; backends fail loud instead of corrupting state.
1.3 HistoryAppend — the history-write payload
# extension to sds/types/persistence.nim or new history_update.nim
type HistoryUpdate* = object
## Combined append/evict for one protocol operation. Empty `append` and
## empty `evict` ⇒ caller should skip the call entirely.
append*: seq[SdsMessage]
## New delivered messages, in delivery order (matters for SDS-R retrieval
## hint correctness and FIFO eviction on the backend side).
evict*: seq[SdsMessageID]
## Oldest messages now past `maxMessageHistory`. Backend deletes by id.
append is a seq (not a single SdsMessage) because processIncomingBuffer
can deliver a chain of unblocked messages in one call to the parent op
(unwrapReceivedMessage / markDependenciesMet). Sending them all in one
updateHistory call keeps the "one save per protocol op" guarantee.
1.4 ChannelData — the bootstrap payload
type ChannelData* = object
## Returned by `loadChannel` on `getOrCreateChannel` bootstrap.
## Carries everything needed to rebuild the in-memory `ChannelContext`
## from a clean restart.
meta*: ChannelMeta
messageHistory*: seq[SdsMessage]
## MUST be ordered oldest-first (lamportTimestamp ASC, tie-break msg_id
## ASC). Bloom filter is rebuilt from this on load; FIFO eviction relies
## on this ordering. Backend contract; validated by nim-sds on load.
1.5 Storage encoding (internal to nim-sds — not the SDS network wire format)
Disambiguation. The SDS network wire format (bytes peers exchange) is
handled by the existing sds/protobuf.nim and is untouched by this plan.
What this section defines is the storage encoding: the codec nim-sds uses
to turn a ChannelMeta Nim object into the opaque seq[byte] blob it hands
to saveChannelMeta. The KV persistence worker treats that blob as
fully opaque — it stores (key: bytes) → (value: bytes) and does its own
buffering/batching of writes. Whether nim-sds uses protobuf, CBOR, or
anything else is invisible to the worker.
Why this codec exists at all. The worker stores bytes; something must
produce those bytes from the in-memory ChannelMeta. That responsibility
sits inside nim-sds, on the producer side of the persistence boundary. It
runs synchronously inside saveChannelMeta, before the blob crosses to the
worker.
Choice: protobuf, reusing the existing toolchain.
sds/protobuf.nimis already a dependency and already encodesSdsMessage- Field-number versioning composes naturally with the explicit
schemaVersion - Encoders for the new types compose on top of the existing
SdsMessageone — no new codec to maintain
Encoders to add:
UnacknowledgedMessage(wrapsSdsMessage+sendTime: int64unix-ms +resendAttempts: uint32)IncomingMessage(wrapsSdsMessage+missingDeps: repeated bytes)OutgoingRepairEntry/IncomingRepairEntry(HistoryEntry + Time + optional cachedMessage)OutgoingRepairKV/IncomingRepairKV(msgId + entry — flattened map; see §6)ChannelMeta(top-level)
Time is serialized as int64 unix milliseconds. The wall-clock semantics
are already used by the protocol itself (getTime() in wrapOutgoingMessage).
On durability. Because the worker buffers blobs, saveChannelMeta
returning ok() means "the blob was accepted by the worker," not "the blob
is fsynced." That is the worker's contract to manage. nim-sds's own
invariant — one snapshot save per protocol op, after all in-memory mutation
completes — is satisfied as soon as the worker accepts the blob, because
on recovery the worker replays its own buffer in order, so the snapshot
nim-sds last issued is the snapshot nim-sds will see on next loadChannel.
2. New Persistence Interface
Replace the current 13 procs in sds/types/persistence.nim with 5:
type Persistence* = object
saveChannelMeta*: proc(
channelId: SdsChannelID, meta: ChannelMeta
): Future[Result[void, string]] {.async: (raises: []), gcsafe.}
updateHistory*: proc(
channelId: SdsChannelID, update: HistoryUpdate
): Future[Result[void, string]] {.async: (raises: []), gcsafe.}
loadChannel*: proc(
channelId: SdsChannelID
): Future[Result[ChannelData, string]] {.async: (raises: []), gcsafe.}
dropChannel*: proc(
channelId: SdsChannelID
): Future[Result[void, string]] {.async: (raises: []), gcsafe.}
setRetrievalHint*: proc(
msgId: SdsMessageID, hint: seq[byte]
): Future[Result[void, string]] {.async: (raises: []), gcsafe.}
Atomicity contract (documented in the interface comment)
Backends SHOULD execute
saveChannelMetaand the immediately followingupdateHistorycall within a single transaction when both arrive together from the same protocol op. nim-sds always issues them back-to-back under the channel lock, with noawait-of-other-work in between, so the backend can either (a) buffersaveChannelMetauntil the nextupdateHistoryorflush, or (b) use atxn(channelId)handle. Variant (b) is cleaner; see §3.2 for the optionalbeginTxn/commitTxnextension.
Backend assumption: schema-agnostic KV blob store
The target backend is the existing schema-agnostic KV persistence module in
the sibling repo. It stores opaque (key: bytes) → (value: bytes) blobs with
its own crash-consistency guarantees. Therefore:
- nim-sds owns the wire format end-to-end (no SQL schema to coordinate)
- The "single transaction per op" requirement reduces to "two KV puts per
op":
meta:<channelId>andhistory:<channelId>:<msgId>(one or more) - The backend's existing batch/atomicity primitives are what guarantee crash consistency — nim-sds doesn't need transaction-handle plumbing
3. Refactor Plan
Phase 0 — Pre-work (no behavior change)
| Step | File(s) | Verify |
|---|---|---|
0.1 Add ChannelMeta, HistoryUpdate, ChannelData types |
new sds/types/channel_meta.nim, sds/types/history_update.nim |
nimble c sds.nim compiles |
| 0.2 Add protobuf encoders/decoders for new types | extend sds/protobuf.nim |
round-trip unit tests |
0.3 Add tests/test_snapshot_codec.nim |
new test file | nimble test passes; covers empty, single-entry, full-buffer, repair-heavy cases |
Phase 1 — New interface alongside old
| Step | File(s) | Verify |
|---|---|---|
1.1 Add new 5-proc Persistence type as PersistenceV2 (rename later) |
sds/types/persistence.nim |
compiles; old interface still works |
1.2 Add noOpPersistenceV2() for tests |
same | nimble test passes |
1.3 Add ReliabilityManager.persistenceV2 field, optional |
sds/types/reliability_manager.nim |
one of persistence / persistenceV2 is in use; assert at construction |
Phase 2 — Migrate protocol ops, one at a time
For each op, the pattern is:
- Add a
dirty: boollocal accumulator - Replace inner
await rm.persistence.Xcalls with in-memory mutation + setdirty = true - At the end of the op (under lock, before
return), emit at most onesaveChannelMetaand at most oneupdateHistorycall
Order (least risky → highest risk):
| Step | Op | File:line | Verify |
|---|---|---|---|
| 2.1 | runRepairSweep |
sds.nim:510 | repair sweep unit test, with failure injection |
| 2.2 | checkUnacknowledgedMessages |
sds.nim:445 | resend-flow integration test |
| 2.3 | processIncomingBuffer → pure (no persistence) |
sds.nim:176 | callers will persist; covered by 2.4/2.5 |
| 2.4 | reviewAckStatus → pure (no persistence) |
sds.nim:36 | covered by 2.5 |
| 2.5 | unwrapReceivedMessage |
sds.nim:235 | full receive-path tests (paths A/B/C); duplicate early-return must skip save |
| 2.6 | wrapOutgoingMessage |
sds.nim:87 | send-path tests |
| 2.7 | markDependenciesMet |
sds.nim:378 | dep-resolution tests |
| 2.8 | addToHistory → return appended/evicted lists instead of persisting |
sds_utils.nim:81 | covered by 2.5/2.6/2.7 |
| 2.9 | updateLamportTimestamp → pure (no persistence) |
sds_utils.nim:108 | covered |
| 2.10 | getOrCreateChannel use loadChannel |
sds_utils.nim:289 | bootstrap unit test |
| 2.11 | removeChannel, resetReliabilityManager → dropChannel |
sds_utils.nim, sds.nim | wipe tests |
Each step is a small commit. After every step: nimble test + gitnexus_detect_changes to confirm scope.
Phase 3 — Remove the old interface
| Step | File(s) | Verify |
|---|---|---|
3.1 Delete old 13-proc Persistence fields |
sds/types/persistence.nim |
compile fails on stragglers — fix |
3.2 Rename PersistenceV2 → Persistence |
all call sites | full test suite |
3.3 Delete noOpPersistence (old), keep noOpPersistenceV2 as noOpPersistence |
same | tests pass |
3.4 Update library/ FFI thread to construct the new Persistence |
library/sds_thread/... |
FFI smoke test on macOS + Linux |
3.5 Update Broker_FFI_API.md and any docs referencing the old contract |
docs | review |
Phase 4 — (removed)
A reference backend is not part of this plan. The schema-agnostic KV
persistence module in the sibling repo is the production backend. Its
authors own the integration adapter that maps the 5 Persistence procs onto
KV puts/gets. nim-sds only needs to expose the interface and a working
noOpPersistence for its own tests.
4. Risk Mitigation During Refactor
| Risk | Mitigation |
|---|---|
| Mid-refactor inconsistency (some ops on new interface, some on old) | Phase 2 keeps both interfaces wired — only one is active per RM via a constructor switch; integration tests run against both |
| Behavior change masked by passing tests | Add tests/test_persistence_contract.nim that asserts exact call count per protocol op (before vs after must match the table in ANALYSIS_SNAPSHOT_SAVE_POINTS.md) |
| Memory-first mutation pattern preserved by accident | Move all persistence calls to the end of the op, after the lock-held mutation block completes. The dirty flag is set during mutation; the save fires after. If save fails, the in-memory state is still the source of truth for the next op — but now there's only one possible point of divergence per op, not 10. |
| FFI thread breakage | Phase 3.4 is the FFI cutover; smoke test on both --mm:refc and --mm:orc, macOS and Linux, before declaring done. ASAN run on the FFI example. |
| Snapshot blob growth surprises | Add a len() log on saveChannelMeta for the first week of integration; fail-loud if any blob exceeds (configurable) 1 MB |
5. Acceptance Criteria
- All existing
nimble testcases pass against the new interface - New
tests/test_persistence_contract.nimenforces exactly the call counts fromANALYSIS_SNAPSHOT_SAVE_POINTS.md§"Save Points" table - New
tests/test_snapshot_codec.nimround-trips everyChannelMetavariant - Failure-injection test: kill persistence between
saveChannelMetaandupdateHistory→ on restart, the manager loads a self-consistent snapshot (no orphan history entries; no dangling buffer references) - FFI smoke (
liblogosdelivery-style) runs clean on macOS+refc, macOS+orc, Linux+refc, Linux+orc Broker_FFI_API.mdreflects the new contract- Bench: snapshot save rate matches the predicted
S + R(foreground) and ≤ 0.2/s/channel background floor (with dirty-guard) under a synthetic 50-msg/s workload - Snapshot blob size on the bench workload matches the estimate in §7 within 2×; outliers logged
6. Codec & flattening — where protobuf comes in
Codec choice
The KV backend stores opaque blobs. The codec that produces the blob is internal to nim-sds. Protobuf is the natural choice because:
- The project already uses protobuf for the SDS wire format
(
sds/protobuf.nimencodesSdsMessage). One codec, one toolchain. - Field-number versioning gives forward/backward compatibility for free —
pairs naturally with the
schemaVersionfield. - Repeated message fields encode efficiently and round-trip cleanly.
Concretely: ChannelMeta is a top-level protobuf message; saveChannelMeta
serializes it to seq[byte] and the backend writes that under
meta:<channelId>. On load, the backend returns the bytes; nim-sds
deserializes.
Why flatten Table[Id, Entry] to seq[KV]
Protobuf's wire format has no first-class "map of bytes-key → message-value"
type in the minimal subset used by sds/protobuf.nim (the
nim-libp2p-style minprotobuf). Even the full proto3 map<K, V> is
encoded on the wire as repeated KV messages anyway — the map syntax is
just sugar over repeated Entry { key = 1; value = 2; }.
So flattening is making the wire shape explicit:
ChannelMeta {
...
repeated OutgoingRepairKV outgoingRepairBuffer = 5;
repeated IncomingRepairKV incomingRepairBuffer = 6;
}
OutgoingRepairKV {
bytes messageId = 1;
OutgoingRepairEntry entry = 2;
}
The Table exists only in memory; the wire and disk form is the flat seq.
Decode rebuilds the Table by iterating the seq. Cost: one alloc per entry
on encode/decode — negligible against the I/O it replaces.
outgoingBuffer (already a seq) and incomingBuffer (a Table flattened
to seq[IncomingMessage] — the key is message.messageId so no separate KV
wrapper is needed) follow the same logic.
7. Snapshot size estimates
Assumptions (call out — every number below derives from these):
| Quantity | Assumed bytes | Source |
|---|---|---|
SdsMessageID |
32 | typical content-addressed id |
SdsParticipantID |
32 | same |
SdsChannelID |
32 | same |
bloomFilter (serialized, in an SdsMessage) |
256 | derived from default bloomFilterCapacity × errorRate |
causalHistory |
10 entries × ~40 B | maxCausalHistory = 10 from reliability_config.nim |
repairRequest in a wire SdsMessage |
up to 3 × ~40 B | maxRepairRequests = 3 |
Application payload (content) — small |
100 B | typical short chat payload |
| Application payload — medium | 1 KB | richer payload |
| Protobuf framing | ~10% overhead | tag bytes + varints |
One SdsMessage on the wire (no content): ~700 B
One SdsMessage with 100 B content: ~800 B
One SdsMessage with 1 KB content: ~1.7 KB
Per-entry sizes inside ChannelMeta:
| Entry | Size (100 B payload) | Size (1 KB payload) | Notes |
|---|---|---|---|
UnacknowledgedMessage |
~820 B | ~1.7 KB | SdsMessage + sendTime + resendAttempts |
IncomingMessage |
~950 B | ~1.9 KB | SdsMessage + missingDeps (avg 3 × 32 B) |
OutgoingRepairKV |
~110 B | ~110 B | no cached message, payload-independent |
IncomingRepairKV |
~920 B | ~1.8 KB | cached serialized SdsMessage dominates |
Fixed overhead per ChannelMeta: ~30 B (schemaVersion + lamportTimestamp + framing).
Per-channel snapshot size by load
| Profile | outBuf | inBuf | outRepair | inRepair | Size (100 B payload) | Size (1 KB payload) |
|---|---|---|---|---|---|---|
| Idle | 0 | 0 | 0 | 0 | ~30 B | ~30 B |
| Light chat | 2 | 0 | 0 | 0 | ~1.7 KB | ~3.5 KB |
| Steady | 5 | 1 | 1 | 1 | ~6 KB | ~12 KB |
| Busy | 10 | 3 | 3 | 3 | ~14 KB | ~28 KB |
| Heavy, lossy network (SDS-R churning) | 30 | 10 | 20 | 10 | ~45 KB | ~95 KB |
| Pathological (resend window full, big repair caches) | 50 | 20 | 30 | 20 | ~75 KB | ~155 KB |
Where the bytes go
| Load profile | Dominant contributor |
|---|---|
| Idle / light | Fixed overhead + outgoingBuffer |
| Steady / busy | outgoingBuffer (each entry ~1 KB+) |
| Heavy / lossy | incomingRepairBuffer — each KV entry caches a full serialized message for rebroadcast. This is the single biggest amplifier; 20 entries with 1 KB payloads ≈ 36 KB on their own. |
Implications
- Typical write is small (1–30 KB). Comfortably foldable into the per-op KV write cost; the backend's blob-write cost is bounded.
IncomingRepairEntry.cachedMessageis the size lever to watch. Under heavy SDS-R activity it dominates the snapshot. If snapshot size becomes a bottleneck, the optimization is to drop the cache from the snapshot and re-serialize frommessageHistoryon demand — at the cost of more CPU and the corner case where the requested message has been evicted from history between snapshot save and repair sweep firing.- Heavy profile (~95 KB) at the predicted 6/s/ch save rate = ~570 KB/s per channel. A 10-channel heavy node is then ~5.7 MB/s of snapshot churn — well within KV backend throughput, but worth a real bench before declaring it OK.
- The 1 MB hard cap suggested in §4 stays appropriate; pathological profile at 1 KB payload is ~155 KB, leaving healthy headroom.
8. Persistence failure policy — non-fatal, best-effort
Change from current branch. The current implementation treats every
rePersistenceError as fatal: the protocol op returns err(), the caller
sees a failure, and normal SDS operation breaks even though the in-memory
state is fine. This is wrong for the snapshot model.
New policy.
- In-memory state is the source of truth for protocol correctness.
Lamport clock, buffers, history, bloom filter — all live in
ChannelContextand are mutated under the lock before any persistence call. SDS message processing never depends on disk state for correctness within a session. - Persistence is best-effort durability. A failed
saveChannelMetaorupdateHistorydoes not abort the operation, does not returnerrto the FFI caller, and does not corrupt protocol semantics. The next op will issue its own snapshot — if that succeeds, on-disk state is re-synchronised; if it also fails, the one after that tries again. - Snapshot writes are idempotent and self-contained. Each
saveChannelMetablob is the complete currentChannelMeta. A missed write is fully recovered by any later successful write — no log of deltas to replay, no compensating action needed. - Bootstrap loss tolerance: if
loadChannelfails or returns stale state on restart, the manager starts from whatever it could load (possibly empty). Peer traffic and SDS-R repair will re-populate it. This is the expected behaviour of the bloom-rebuilt-from-history design extended to the meta blob.
Implementation pattern. At each save point:
# end of wrapOutgoingMessage / unwrapReceivedMessage / etc.
if dirty:
let saveRes = await rm.persistence.saveChannelMeta(channelId, snapshot)
if saveRes.isErr:
warn "snapshot save failed; in-memory state unaffected, next op will retry",
channelId = channelId, detail = saveRes.error
# DO NOT return err; protocol op succeeded.
if appended.len > 0 or evicted.len > 0:
let histRes = await rm.persistence.updateHistory(channelId,
HistoryUpdate(append: appended, evict: evicted))
if histRes.isErr:
warn "history update failed; in-memory log authoritative, next op will retry",
channelId = channelId, detail = histRes.error
return ok(serializedMessage) # protocol op succeeded regardless
What still returns err(rePersistenceError). Only operations whose
semantic intent is durability:
removeChannel,resetReliabilityManager→ must confirmdropChannelsucceeded; otherwise the caller may assume disk is clean when it isn't.getOrCreateChannelon first bootstrap → ifloadChannelerrors (vs. returns empty), surface it so the caller can decide between "start fresh in memory" and "abort init".
Impact on §5 acceptance criteria. Add: failure-injection test must
prove that wrapOutgoingMessage, unwrapReceivedMessage,
markDependenciesMet, checkUnacknowledgedMessages, runRepairSweep all
return ok under 100%-failing persistence, with correct in-memory
behaviour and a recovered on-disk state after persistence is restored.
Why this is safe. Each snapshot is a full self-contained blob; partial-write divergence (the original ANALYSIS §4 critical risk) is already eliminated by the atomic-blob design. Once that's true, treating persistence failure as fatal is pure downside — it propagates a recoverable I/O hiccup into a user-visible protocol failure for no correctness gain.
9. What this plan deliberately does NOT do
- Does not add transaction handles — the KV backend's batch primitive is sufficient
- Does not ship a reference backend — the schema-agnostic KV module in the sibling repo is the production backend
- Does not change the bloom filter persistence policy (still rebuilt from history)
- Does not introduce SDS-R repair extension changes
- Does not touch the FFI surface shape beyond construction of
Persistence— the existing C API is unchanged - Does not auto-migrate on-disk data from an older format (no production data exists yet; schemaVersion=1 starts clean)