mirror of
https://github.com/logos-messaging/nim-sds.git
synced 2026-06-16 05:59:57 +00:00
* feat: propagate persistence backend errors via Result
The Persistence contract previously returned `Future[void]` for writes and
`Future[ChannelSnapshot]` for the loader, with `raises: []`. Backends had no
way to report a failure, so a failed write or a failed/partial read was
silently swallowed — and on the read path a mid-scan failure could bootstrap
a *truncated* channel snapshot, corrupting the rebuilt bloom filter and
lamport clock across a restart.
Make every contract field Result-returning:
* mutating ops -> Future[Result[void, string]]
* loadAllForChannel -> Future[Result[ChannelSnapshot, string]]
The backend-supplied error string is mapped to a new
`ReliabilityError.rePersistenceError` (logged once at the boundary via
`reliabilityErr`) and threaded up through every persistence-touching proc to
the public API, where the caller decides what to do. Request-driven paths
(wrap/unwrap/markDependenciesMet/ensureChannel/removeChannel/reset) propagate
the error; background maintenance loops (periodicBufferSweep,
periodicRepairSweep) log and retry on the next tick, since they have no
synchronous caller.
Tests: in-memory backend gains a `failingOps` injection hook; new
"Persistence: error propagation" suite asserts read/write/drop failures
surface as `rePersistenceError`. Full suite passes (90 OK).
BREAKING CHANGE: the `Persistence` contract signature changed; custom
backends must return `Result` and `ok()` on success. Bumped to 0.3.0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(persistence): add snapshot types and codec (phase 0)
Introduce atomic-snapshot persistence types that will replace the current
fine-grained 13-proc Persistence interface. This commit is purely additive:
no existing call site changes, no behaviour change.
New types (sds/types/):
- channel_meta.nim — ChannelMeta (atomic per-channel snapshot blob),
ChannelData (bootstrap payload), OutgoingRepairKV / IncomingRepairKV
(flattened map entries for protobuf wire shape).
- history_update.nim — HistoryUpdate (combined append/evict payload for
the message log).
New codec (sds/snapshot_codec.nim):
- Protobuf encode/decode for all new types, reusing the existing
SdsMessage and HistoryEntry encoders from sds/protobuf.nim.
- Explicit schemaVersion=1 on ChannelMeta; decoder rejects unknown
versions loudly rather than silently truncating.
- Time encoded as int64 unix milliseconds.
Tests (tests/test_snapshot_codec.nim):
- 13 round-trip cases covering empty, single-entry, full-buffer, and
repair-heavy snapshots; ChannelData ordering; HistoryUpdate variants;
schemaVersion rejection.
Planning artefacts:
- ANALYSIS_SDS_PERSISTENCE.md — problem statement (partial-write
divergence, chatty call rate, non-fatal-error policy gap).
- ANALYSIS_SNAPSHOT_SAVE_POINTS.md — exact save points per protocol op
and projected call rates.
- PLAN_SNAPSHOT_PERSISTENCE.md — phased refactor plan; this commit
implements phase 0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(persistence): add PersistenceV2 interface alongside legacy (phase 1)
Introduce the 5-proc snapshot-based Persistence interface that will
replace the legacy 13-proc one. Both coexist on `ReliabilityManager` so
phase 2 can migrate protocol ops one at a time without breaking existing
callers.
New file:
- sds/types/persistence_v2.nim — `PersistenceV2` type with
saveChannelMeta / updateHistory / loadChannel / dropChannel /
setRetrievalHint. `noOpPersistenceV2()` default. Doc-comments capture
the atomicity pairing (meta save + history update issued back-to-back
under the channel lock) and the non-fatal failure policy from PLAN §8.
Modified:
- sds/types/reliability_manager.nim — adds `persistenceV2: PersistenceV2`
field alongside `persistence`; constructor takes both, both default to
no-op.
- sds.nim — `newReliabilityManager` plumbs the new optional parameter.
- AGENTS.md / CLAUDE.md — GitNexus index re-indexed after phase 0 +
phase 1 additions; symbol counts updated by `npx gitnexus analyze`.
No call site uses the new interface yet — that's phase 2. All existing
tests still pass against the legacy interface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): migrate runRepairSweep to PersistenceV2 (phase 2.1)
Per-entry removeIncomingRepair / removeOutgoingRepair calls are replaced
by a single trySaveMeta per *dirty* channel at the end of that channel's
sweep. Failure is logged but does NOT abort the sweep — in-memory state
is the source of truth (PLAN_SNAPSHOT_PERSISTENCE.md §8).
Helpers added in sds/sds_utils.nim:
- snapshotMeta(channel) — capture current ChannelContext as ChannelMeta
blob (flattens Table-keyed buffers to seqs for the wire shape).
- trySaveMeta(rm, channelId, channel) — best-effort meta snapshot save;
logs on failure, never propagates.
- tryUpdateHistory(rm, channelId, append, evict) — best-effort history
update; skips the call entirely when both lists are empty (HistoryUpdate
contract).
Call-rate impact for runRepairSweep:
- Before: N persistence calls per expired entry per channel.
- After: at most 1 saveChannelMeta per dirty channel; 0 on idle channels
(matches the dirty-flag floor in ANALYSIS_SNAPSHOT_SAVE_POINTS).
All existing tests pass — including the 3 SDS-R Repair Sweep tests that
directly exercise this proc.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): migrate checkUnacknowledgedMessages to PersistenceV2 (phase 2.2)
Per-entry saveOutgoing / removeOutgoing calls are replaced by one
trySaveMeta at the end of the pass, conditional on a dirty flag (resend
attempt incremented, or entry expired). Pass succeeds even if the save
fails — next tick reissues the snapshot.
Call-rate impact:
- Before: N persistence calls per affected entry per pass.
- After: at most 1 saveChannelMeta per pass; 0 when nothing aged out.
All existing tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): add V2 meta snapshot saves to foreground ops (phase 2A)
Wires `trySaveMeta` into the three public protocol ops that mutate
per-channel state — wrapOutgoingMessage, unwrapReceivedMessage, and
markDependenciesMet — at the operation's end, under the channel lock.
Legacy fine-grained persistence calls REMAIN in place; this commit is
additive. Both interfaces persist the same state simultaneously, so all
existing tests pass and a real backend wired to either interface
continues to work. Phase 2B will strip the legacy calls.
Save points match the §"Save Points" table in
ANALYSIS_SNAPSHOT_SAVE_POINTS.md exactly:
- wrapOutgoingMessage: 1 save (always)
- unwrapReceivedMessage: 1 save on every path including duplicate
(the duplicate path still mutates the repair buffers)
- markDependenciesMet: 1 save after the processIncomingBuffer cascade
Non-fatal failure policy (PLAN §8): trySaveMeta logs and continues;
the protocol op never returns rePersistenceError for snapshot failures.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): strip legacy interface from protocol path; migrate tests to V2 (phase 2B+2C+2D)
End-state of phase 2: the protocol code no longer issues any legacy
fine-grained Persistence calls. All state survives via the snapshot-based
PersistenceV2 interface — one trySaveMeta per op end, plus tryUpdateHistory
batched inside addToHistory. The legacy Persistence field on
ReliabilityManager remains for backwards compatibility; phase 3 deletes it.
Protocol changes (sds.nim, sds/sds_utils.nim):
- reviewAckStatus, processIncomingBuffer, updateLamportTimestamp →
pure in-memory; no per-mutation persistence.
- addToHistory: replaces appendLogEntry+removeLogEntry with a single
tryUpdateHistory call carrying (append, evict) atomically.
- getRecentHistoryEntries: setRetrievalHint switched to V2; non-fatal.
- wrapOutgoingMessage, unwrapReceivedMessage, markDependenciesMet:
all per-row saveOutgoing / removeOutgoing / saveIncoming /
removeIncoming / saveOutgoingRepair / removeOutgoingRepair /
saveIncomingRepair / removeIncomingRepair calls removed (16 call
sites in total). State is captured by the op-end trySaveMeta added
in phase 2A.
- getOrCreateChannel: bootstraps from persistenceV2.loadChannel.
- dropChannelFromPersistence: uses persistenceV2.dropChannel.
Failure policy (PLAN_SNAPSHOT_PERSISTENCE.md §8):
- Foreground ops (wrap, unwrap, markDeps, sweeps): non-fatal —
trySaveMeta / tryUpdateHistory log and continue; the protocol op
returns ok regardless of disk failure. In-memory state is the source
of truth; the next op re-issues a complete snapshot and disk catches
up automatically.
- Durability-intent ops (removeChannel, resetReliabilityManager via
dropChannelFromPersistence; getOrCreateChannel via loadChannel):
still propagate rePersistenceError, because the caller asked us to
confirm a disk operation and we cannot silently lie.
Test infrastructure:
- tests/in_memory_persistence_v2.nim: new V2 adapter mock that
decomposes the meta blob into the existing InMemoryStore shape so
test assertions on store.outgoing / store.incoming / etc. continue to
work without change.
- tests/test_persistence.nim: 17 tests, all rewritten against V2.
- 13 state-survival tests carry over with identical assertions.
- "loadChannel failure surfaces as err on bootstrap" — bootstrap
keeps durability-intent semantics.
- "saveChannelMeta failure during send does NOT surface" — deliberate
inversion of the legacy "write failure surfaces as err" test. Asserts
the new non-fatal policy: op returns ok, in-memory state correct,
disk re-syncs on the next op.
- "updateHistory failure during send does NOT surface" — same policy
applied to the history path.
- "dropChannel failure during removeChannel surfaces as err" — kept.
- All 17 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(persistence): delete legacy interface; rename PersistenceV2 -> Persistence (phase 3)
End-state of the snapshot-persistence refactor. The legacy 13-proc
Persistence interface and its noOpPersistence are gone; the 5-proc
snapshot-based interface (formerly PersistenceV2) takes their place under
the canonical name.
Source:
- sds/types/persistence.nim: replaced 13-proc contract with the 5-proc
snapshot interface (saveChannelMeta, updateHistory, loadChannel,
dropChannel, setRetrievalHint). noOpPersistence returns ok everywhere
and an empty ChannelData on load.
- sds/types/persistence_v2.nim: removed.
- sds/types/reliability_manager.nim: dropped the second persistenceV2
field; constructor takes a single `persistence: Persistence`.
- sds/sds_utils.nim: rm.persistenceV2.X -> rm.persistence.X; doc-comments
updated.
- sds.nim: dropped the persistenceV2 parameter from newReliabilityManager.
Tests:
- tests/in_memory_persistence_v2.nim: removed; its content moved to...
- tests/in_memory_persistence.nim: replaces the old legacy mock with the
snapshot adapter under the canonical filename. Same InMemoryStore
shape so test assertions stay unchanged.
- tests/test_persistence.nim: ctor param renamed, suite name de-prefixed.
FFI smoke (`nimble libsdsDynamicMac`, refc/threads:on): builds clean.
All 4 test suites pass:
- test_bloom
- test_reliability
- test_persistence (17 V2 tests)
- test_snapshot_codec (13 codec round-trip tests)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Persisting persistence redesign plan for reference
* refactor(persistence): R2 pending-write queue + per-op accumulator (PR #72 review fix)
Addresses all three substantive review findings on PR #72 in one
structural change: fold the per-op accumulator and the R2 retry buffer
into a single queue on `ChannelContext`, flushed once at op end.
Changes:
- sds/types/channel_context.nim: add `pendingHistoryAppends`
(`OrderedSet[SdsMessageID]`) and `pendingHistoryEvicts`
(`HashSet[SdsMessageID]`) fields. Only ids are stored — the full
SdsMessage is looked up from `messageHistory` at flush time. Documented
invariant: every id in pendingHistoryAppends is also in messageHistory,
upheld by the merge rule.
- sds/sds_utils.nim:
* `queueHistoryAppend(channel, msgId)` / `queueHistoryEvict(channel,
msgId)` — "latest-wins" merge: append cancels any pending evict
and vice versa. Symmetric, simple, handles the evict-then-re-add
sequence correctly (SDS-R repair re-delivering an evicted message
while the backend is unreachable).
* `tryUpdateHistory(rm, channelId)` — no more list params; flushes the
channel's pending queue. Dual role: per-op accumulator (multiple
`addToHistory` calls within one op queue together and flush as one
round-trip) AND R2 retry buffer (a failed flush leaves the queue
populated for the next op to retry).
* `addToHistory` queues via the helpers; does not call persistence.
* Pending queue cleared on `cleanup` and `removeChannel`.
- sds.nim:
* `processIncomingBuffer` returns to its single-arg signature — the
queue lives on the channel, no parameter threading needed.
* `wrapOutgoingMessage`, `unwrapReceivedMessage` (all three paths),
`markDependenciesMet` issue exactly one `trySaveMeta` +
`tryUpdateHistory` pair at op end, under the lock, with no
intervening `await`-of-other-work. Matches the Persistence atomicity
contract documented in `sds/types/persistence.nim`.
* Pending queue cleared in `resetReliabilityManager`.
- tests/test_persistence.nim:
* Direct `addToHistory` callers (state-survival setup) now follow with
explicit `tryUpdateHistory(channelId)` to flush. Reflects the
production op-end flush pattern.
* New: `updateHistory failure is retried via R2 pending-write queue` —
verifies that two failed sends leave both messages on the queue,
and a third successful send drains the whole queue in one call.
* New: `pending queue survives idle ops` — verifies that an op with
no history changes of its own still flushes a previously-failed
batch at op end.
* New: `evict-then-re-add merge rule preserves the re-added message
on disk` — regression for the "latest-wins" merge rule. The original
"evict-wins" rule would silently drop the re-add and leave the
message permanently absent from disk; this test would fail under
that rule and passes under the corrected one.
Resolves PR #72 review comments:
- #1 (delta loss on failed updateHistory) — R2 retry queue.
- #2 (cascade chattiness — N updateHistory calls per op) — queue collects
cascaded entries, flushed as one batch.
- #3 (atomicity contract mismatch) — implementation now matches the
documented "saveChannelMeta then updateHistory back-to-back" pairing.
Test summary: 50 tests pass (47 prior + 3 new R2/merge-rule tests).
FFI dylib (`nimble libsdsDynamicMac`, refc + threads:on): clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
503 lines
24 KiB
Markdown
# SDS Snapshot Persistence — Design & Refactor Plan

Companion to `ANALYSIS_SDS_PERSISTENCE.md` (problem statement) and
`ANALYSIS_SNAPSHOT_SAVE_POINTS.md` (where & how often we save).

This document defines:
1. **Data structures** to be persisted (snapshot + history)
2. **New `Persistence` interface** (5 procs replacing the current 13)
3. **Refactor plan** — phased, test-gated, backward-compatible interim state

---

## 1. Data Structure Design

### 1.1 Design principles

| Principle | Reason |
|-----------|--------|
| Snapshot is **one atomic blob** | Eliminates partial-write divergence (the root cause from ANALYSIS_SDS_PERSISTENCE.md §4) |
| Snapshot is **small** (buffers only, no history) | Keeps per-op write cost ≤ a few KB; foldable into one SQLite txn |
| History is **separate, append-batched** | Large data, append-mostly, queryable by msg_id for SDS-R |
| Bloom filter is **not persisted** | Already the case — rebuilt from history on bootstrap |
| **Versioned wire format** | Allow future schema evolution without breaking on-disk data |
| **Protobuf** serialization | Project already uses it (`sds/protobuf.nim`); keeps one codec |

### 1.2 `ChannelMeta` — the snapshot payload

```nim
# sds/types/channel_meta.nim (new file)

import std/[tables, times]
import ./sds_message_id
import ./unacknowledged_message
import ./incoming_message
import ./repair_entry

export
  sds_message_id, unacknowledged_message, incoming_message, repair_entry

const ChannelMetaSchemaVersion* = 1'u32

# One `type` section, so `ChannelMeta` can reference the KV types declared below it.
type
  ChannelMeta* = object
    ## Atomic snapshot of the fast-changing per-channel protocol state.
    ## Persisted as one blob per `saveChannelMeta` call. Bloom filter is
    ## intentionally absent — rebuilt from the message log on bootstrap.
    ## Message history is also absent — persisted separately via `updateHistory`
    ## because it is large and append-mostly.
    schemaVersion*: uint32
      ## On-disk format version. Backends MUST refuse to load a meta whose
      ## version they don't know how to decode rather than silently truncating
      ## or zero-filling unknown fields.

    lamportTimestamp*: int64

    outgoingBuffer*: seq[UnacknowledgedMessage]
      ## Sent-but-not-yet-acked messages. Order matters: the protocol iterates
      ## in insertion order for resend-attempt accounting.

    incomingBuffer*: seq[IncomingMessage]
      ## Received-but-not-yet-deliverable messages, each carrying its
      ## still-missing dependency set. Order is irrelevant; flattened from
      ## the in-memory `Table` for wire-friendliness.

    outgoingRepairBuffer*: seq[OutgoingRepairKV]
    incomingRepairBuffer*: seq[IncomingRepairKV]
      ## SDS-R repair buffers, flattened from in-memory `Table` to seq of
      ## (key, value) for stable serialization.

  OutgoingRepairKV* = object
    messageId*: SdsMessageID
    entry*: OutgoingRepairEntry

  IncomingRepairKV* = object
    messageId*: SdsMessageID
    entry*: IncomingRepairEntry
```

**Why flatten the `Table`s to `seq`s?**
Protobuf has no native map of `SdsMessageID → object`. Flattening to `seq` of KV
objects gives deterministic encoding and trivial decode-time rebuild of the
in-memory `Table`. The cost is one extra alloc per entry on encode/decode —
negligible vs. the I/O it replaces.

**Why an explicit `schemaVersion`?**
The current interface has no version field. Adding fields later (e.g., a new
SDS-R counter) silently truncates old data on load. The version makes
incompatibility explicit; backends fail loud instead of corrupting state.
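
The fail-loud rule can be sketched as a version gate that runs before any field is decoded. This is only an illustration: `DecodeOutcome` and `checkSchemaVersion` are hypothetical names standing in for whatever `Result`-returning decoder the codec ends up exposing, and the constant merely mirrors `ChannelMetaSchemaVersion` from §1.2.

```nim
# Hypothetical sketch: none of these names are taken from nim-sds itself.
const ChannelMetaSchemaVersion = 1'u32 # mirrors the constant in section 1.2

type DecodeOutcome = object
  ## Stand-in for a Result[ChannelMeta, string]; only the verdict is modelled.
  accepted: bool
  error: string

proc checkSchemaVersion(found: uint32): DecodeOutcome =
  ## Accept only the version this build knows how to decode.
  if found == ChannelMetaSchemaVersion:
    result = DecodeOutcome(accepted: true, error: "")
  else:
    # Report the mismatch instead of decoding a partial snapshot.
    result = DecodeOutcome(
      accepted: false,
      error: "unsupported ChannelMeta schemaVersion " & $found &
        " (expected " & $ChannelMetaSchemaVersion & ")"
    )

when isMainModule:
  doAssert checkSchemaVersion(1'u32).accepted
  let rejected = checkSchemaVersion(2'u32)
  doAssert not rejected.accepted
  echo rejected.error
```

Running the check first means an unknown version never reaches the field decoders, so a partially understood blob cannot bootstrap a channel.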

### 1.3 `HistoryUpdate` — the history-write payload

```nim
# extension to sds/types/persistence.nim or new history_update.nim

type HistoryUpdate* = object
  ## Combined append/evict for one protocol operation. Empty `append` and
  ## empty `evict` ⇒ caller should skip the call entirely.
  append*: seq[SdsMessage]
    ## New delivered messages, in delivery order (matters for SDS-R retrieval
    ## hint correctness and FIFO eviction on the backend side).
  evict*: seq[SdsMessageID]
    ## Oldest messages now past `maxMessageHistory`. Backend deletes by id.
```

`append` is a `seq` (not a single `SdsMessage`) because `processIncomingBuffer`
can deliver a chain of unblocked messages in one call to the parent op
(`unwrapReceivedMessage` / `markDependenciesMet`). Sending them all in one
`updateHistory` call keeps the "one save per protocol op" guarantee.
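
A minimal sketch of how a cascade of deliveries could accumulate into one `HistoryUpdate`, assuming FIFO eviction against `maxMessageHistory`. The types are trimmed stand-ins and `recordDelivery` / `isEmpty` are hypothetical helper names, not nim-sds procs.

```nim
type
  SdsMessageID = string # stand-in; the real id type lives in sds/types
  SdsMessage = object # trimmed stand-in carrying only the id
    messageId: SdsMessageID

  HistoryUpdate = object
    append: seq[SdsMessage]
    evict: seq[SdsMessageID]

proc recordDelivery(
    history: var seq[SdsMessage],
    update: var HistoryUpdate,
    msg: SdsMessage,
    maxMessageHistory: int
) =
  ## Append in delivery order, then evict the oldest entries past the cap.
  history.add(msg)
  update.append.add(msg)
  while history.len > maxMessageHistory:
    update.evict.add(history[0].messageId)
    history.delete(0)

proc isEmpty(update: HistoryUpdate): bool =
  ## An empty update means the caller should skip `updateHistory` entirely.
  update.append.len == 0 and update.evict.len == 0

when isMainModule:
  var history = @[SdsMessage(messageId: "m1"), SdsMessage(messageId: "m2")]
  var update: HistoryUpdate
  doAssert update.isEmpty
  # Two messages unblocked by the same protocol op share one update.
  for id in ["m3", "m4"]:
    recordDelivery(history, update, SdsMessage(messageId: id), 3)
  doAssert update.append.len == 2
  doAssert update.evict == @["m1"]
```

Both cascaded deliveries and the resulting eviction travel in the same payload, so the parent op still issues a single `updateHistory` call.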

### 1.4 `ChannelData` — the bootstrap payload

```nim
type ChannelData* = object
  ## Returned by `loadChannel` on `getOrCreateChannel` bootstrap.
  ## Carries everything needed to rebuild the in-memory `ChannelContext`
  ## from a clean restart.
  meta*: ChannelMeta
  messageHistory*: seq[SdsMessage]
    ## MUST be ordered oldest-first (lamportTimestamp ASC, tie-break msg_id
    ## ASC). Bloom filter is rebuilt from this on load; FIFO eviction relies
    ## on this ordering. Backend contract; validated by nim-sds on load.
```
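
The load-time validation of that ordering could look roughly like the sketch below. `historyOrder` and `isOldestFirst` are hypothetical names, and `SdsMessage` is a trimmed stand-in that keeps only the two fields the ordering rule mentions.

```nim
import std/algorithm

type SdsMessage = object # trimmed stand-in for the real message type
  messageId: string
  lamportTimestamp: int64

proc historyOrder(a, b: SdsMessage): int =
  ## lamportTimestamp ascending, then messageId ascending as the tie-break.
  result = cmp(a.lamportTimestamp, b.lamportTimestamp)
  if result == 0:
    result = cmp(a.messageId, b.messageId)

proc isOldestFirst(history: openArray[SdsMessage]): bool =
  ## True when the backend returned history in the contractual order.
  isSorted(history, historyOrder)

when isMainModule:
  let good = [
    SdsMessage(messageId: "a", lamportTimestamp: 1),
    SdsMessage(messageId: "b", lamportTimestamp: 1),
    SdsMessage(messageId: "a", lamportTimestamp: 2),
  ]
  doAssert isOldestFirst(good)
  let bad = [
    SdsMessage(messageId: "a", lamportTimestamp: 2),
    SdsMessage(messageId: "a", lamportTimestamp: 1),
  ]
  doAssert not isOldestFirst(bad)
```

A failed check would be a natural place to return an error from bootstrap rather than rebuild the bloom filter from misordered history, though what nim-sds does on failure is not specified here.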

### 1.5 Storage encoding (internal to nim-sds — not the SDS network wire format)

**Disambiguation.** The SDS **network** wire format (bytes peers exchange) is
handled by the existing `sds/protobuf.nim` and is untouched by this plan.
What this section defines is the **storage** encoding: the codec nim-sds uses
to turn a `ChannelMeta` Nim object into the opaque `seq[byte]` blob it hands
to `saveChannelMeta`. The KV persistence worker treats that blob as
fully opaque — it stores `(key: bytes) → (value: bytes)` and does its own
buffering/batching of writes. Whether nim-sds uses protobuf, CBOR, or
anything else is invisible to the worker.

**Why this codec exists at all.** The worker stores bytes; something must
produce those bytes from the in-memory `ChannelMeta`. That responsibility
sits inside nim-sds, on the producer side of the persistence boundary. It
runs synchronously inside `saveChannelMeta`, before the blob crosses to the
worker.

**Choice: protobuf, reusing the existing toolchain.**
- `sds/protobuf.nim` is already a dependency and already encodes `SdsMessage`
- Field-number versioning composes naturally with the explicit `schemaVersion`
- Encoders for the new types compose on top of the existing `SdsMessage` one
  — no new codec to maintain

**Encoders to add:**
- `UnacknowledgedMessage` (wraps `SdsMessage` + `sendTime: int64` unix-ms + `resendAttempts: uint32`)
- `IncomingMessage` (wraps `SdsMessage` + `missingDeps: repeated bytes`)
- `OutgoingRepairEntry` / `IncomingRepairEntry` (HistoryEntry + Time + optional cachedMessage)
- `OutgoingRepairKV` / `IncomingRepairKV` (msgId + entry — flattened map; see §6)
- `ChannelMeta` (top-level)

`Time` is serialized as `int64` unix milliseconds. The wall-clock semantics
are already used by the protocol itself (`getTime()` in `wrapOutgoingMessage`).
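
As a rough illustration of the `Time` encoding, the conversion can be written with `std/times` alone. `toUnixMillis` and `fromUnixMillis` are hypothetical helper names, and the sketch assumes timestamps at or after the Unix epoch.

```nim
import std/times

proc toUnixMillis(t: Time): int64 =
  ## Whole milliseconds since the Unix epoch; sub-millisecond precision is dropped.
  t.toUnix * 1000 + int64(t.nanosecond div 1_000_000)

proc fromUnixMillis(ms: int64): Time =
  ## Inverse of toUnixMillis, assuming a non-negative input.
  initTime(ms div 1000, NanosecondRange((ms mod 1000) * 1_000_000))

when isMainModule:
  let t = initTime(1_700_000_000, 123_000_000)
  doAssert toUnixMillis(t) == 1_700_000_000_123'i64
  # Round-trips exactly because `t` has no sub-millisecond component.
  doAssert fromUnixMillis(toUnixMillis(t)) == t
```

A `Time` with a sub-millisecond component will not round-trip bit-for-bit, so a codec test comparing decoded snapshots would need to compare at millisecond granularity.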

**On durability.** Because the worker buffers blobs, `saveChannelMeta`
returning `ok()` means "the blob was accepted by the worker," not "the blob
is fsynced." That is the worker's contract to manage. nim-sds's own
invariant — one snapshot save per protocol op, after all in-memory mutation
completes — is satisfied as soon as the worker accepts the blob, because
on recovery the worker replays its own buffer in order, so the snapshot
nim-sds last issued is the snapshot nim-sds will see on next `loadChannel`.

---

## 2. New `Persistence` Interface

Replace the current 13 procs in `sds/types/persistence.nim` with **5**:

```nim
type Persistence* = object
  saveChannelMeta*: proc(
    channelId: SdsChannelID, meta: ChannelMeta
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  updateHistory*: proc(
    channelId: SdsChannelID, update: HistoryUpdate
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  loadChannel*: proc(
    channelId: SdsChannelID
  ): Future[Result[ChannelData, string]] {.async: (raises: []), gcsafe.}

  dropChannel*: proc(
    channelId: SdsChannelID
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}

  setRetrievalHint*: proc(
    msgId: SdsMessageID, hint: seq[byte]
  ): Future[Result[void, string]] {.async: (raises: []), gcsafe.}
```

### Atomicity contract (documented in the interface comment)

> Backends SHOULD execute `saveChannelMeta` and the immediately following
> `updateHistory` call within a single transaction when both arrive together
> from the same protocol op. nim-sds always issues them back-to-back under
> the channel lock, with no `await`-of-other-work in between, so the backend
> can either (a) buffer `saveChannelMeta` until the next `updateHistory` or
> `flush`, or (b) use a `txn(channelId)` handle. Variant (b) is cleaner; see
> §3.2 for the optional `beginTxn`/`commitTxn` extension.

### Backend assumption: schema-agnostic KV blob store

The target backend is the existing schema-agnostic KV persistence module in
the sibling repo. It stores opaque `(key: bytes) → (value: bytes)` blobs with
its own crash-consistency guarantees. Therefore:

- nim-sds owns the wire format end-to-end (no SQL schema to coordinate)
- The "single transaction per op" requirement reduces to "two KV puts per
  op": `meta:<channelId>` and `history:<channelId>:<msgId>` (one or more)
- The backend's existing batch/atomicity primitives are what guarantee
  crash consistency — nim-sds doesn't need transaction-handle plumbing
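
To make the key layout concrete, here is a small sketch of one protocol op expressed as KV writes. Everything in it is an assumption for illustration: `KvStore` is an in-memory `Table` standing in for the sibling repo's module (whose API is not shown in this plan), ids are plain strings rather than bytes, and `applyOp`, `metaKey`, and `historyKey` are hypothetical names. The sketch applies the writes one after another and models none of the backend's batching or atomicity.

```nim
import std/tables

type
  KvStore = Table[string, seq[byte]] # hypothetical stand-in for the KV backend
  HistoryPut = tuple[msgId: string, blob: seq[byte]]

proc metaKey(channelId: string): string =
  "meta:" & channelId

proc historyKey(channelId, msgId: string): string =
  "history:" & channelId & ":" & msgId

proc applyOp(
    store: var KvStore,
    channelId: string,
    metaBlob: seq[byte],
    appended: seq[HistoryPut],
    evicted: seq[string]
) =
  ## One protocol op: overwrite the meta blob, then apply history puts and deletes.
  store[metaKey(channelId)] = metaBlob
  for item in appended:
    store[historyKey(channelId, item.msgId)] = item.blob
  for msgId in evicted:
    store.del(historyKey(channelId, msgId))

when isMainModule:
  var store = initTable[string, seq[byte]]()
  store.applyOp("chan", @[1'u8], @[(msgId: "m1", blob: @[9'u8])], newSeq[string]())
  doAssert store.hasKey("meta:chan")
  doAssert store["history:chan:m1"] == @[9'u8]
  # A later op evicts m1 and replaces the snapshot.
  store.applyOp("chan", @[2'u8], newSeq[HistoryPut](), @["m1"])
  doAssert not store.hasKey("history:chan:m1")
  doAssert store["meta:chan"] == @[2'u8]
```

Because the meta key is overwritten in place, the store holds exactly one snapshot per channel regardless of how many ops have run.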

---

## 3. Refactor Plan

### Phase 0 — Pre-work (no behavior change)

| Step | File(s) | Verify |
|------|---------|--------|
| 0.1 Add `ChannelMeta`, `HistoryUpdate`, `ChannelData` types | new `sds/types/channel_meta.nim`, `sds/types/history_update.nim` | `nimble c sds.nim` compiles |
| 0.2 Add protobuf encoders/decoders for new types | extend `sds/protobuf.nim` | round-trip unit tests |
| 0.3 Add `tests/test_snapshot_codec.nim` | new test file | `nimble test` passes; covers empty, single-entry, full-buffer, repair-heavy cases |

### Phase 1 — New interface alongside old

| Step | File(s) | Verify |
|------|---------|--------|
| 1.1 Add new 5-proc `Persistence` type as `PersistenceV2` (rename later) | `sds/types/persistence.nim` | compiles; old interface still works |
| 1.2 Add `noOpPersistenceV2()` for tests | same | `nimble test` passes |
| 1.3 Add `ReliabilityManager.persistenceV2` field, optional | `sds/types/reliability_manager.nim` | one of `persistence` / `persistenceV2` is in use; assert at construction |

### Phase 2 — Migrate protocol ops, one at a time

For each op, the pattern is:
1. Add a `dirty: bool` local accumulator
2. Replace inner `await rm.persistence.X` calls with in-memory mutation + set `dirty = true`
3. At the end of the op (under lock, before `return`), emit at most one `saveChannelMeta` and at most one `updateHistory` call
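
The three steps above can be sketched with a synchronous counting double. This is an assumption-heavy illustration: the real interface is async and returns `Future[Result[void, string]]`, while `CountingBackend`, `Channel`, and `batchOp` are hypothetical names that only model the call-count behaviour.

```nim
type
  CountingBackend = object # hypothetical test double, not the real Persistence type
    metaSaves: int
    historyUpdates: int

  Channel = object # trimmed stand-in for ChannelContext
    lamportTimestamp: int64
    outgoing: seq[string]

proc saveChannelMeta(b: var CountingBackend, ch: Channel) =
  ## Stands in for one snapshot write; only the call is counted.
  inc b.metaSaves

proc updateHistory(b: var CountingBackend, appended: seq[string]) =
  ## Stands in for one batched history write.
  inc b.historyUpdates

proc batchOp(b: var CountingBackend, ch: var Channel, msgIds: seq[string]) =
  ## Mutate in memory first, track dirtiness, persist once at the end.
  var dirty = false
  var appended: seq[string]
  for id in msgIds:
    inc ch.lamportTimestamp
    ch.outgoing.add(id)
    appended.add(id)
    dirty = true
  # Op end: at most one snapshot save and at most one history update.
  if dirty:
    b.saveChannelMeta(ch)
  if appended.len > 0:
    b.updateHistory(appended)

when isMainModule:
  var backend: CountingBackend
  var channel: Channel
  batchOp(backend, channel, @["m1", "m2", "m3"])
  doAssert backend.metaSaves == 1
  doAssert backend.historyUpdates == 1
  batchOp(backend, channel, newSeq[string]()) # idle op: nothing is written
  doAssert backend.metaSaves == 1
```

A double of this shape is also one way to assert per-op call counts in a contract test, since the counters are independent of what the backend stores.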

Order (lowest risk → highest risk):

| Step | Op | File:line | Verify |
|------|-----|-----------|--------|
| 2.1 | `runRepairSweep` | sds.nim:510 | repair sweep unit test, with failure injection |
| 2.2 | `checkUnacknowledgedMessages` | sds.nim:445 | resend-flow integration test |
| 2.3 | `processIncomingBuffer` → pure (no persistence) | sds.nim:176 | callers will persist; covered by 2.4/2.5 |
| 2.4 | `reviewAckStatus` → pure (no persistence) | sds.nim:36 | covered by 2.5 |
| 2.5 | `unwrapReceivedMessage` | sds.nim:235 | full receive-path tests (paths A/B/C); duplicate early-return must skip save |
| 2.6 | `wrapOutgoingMessage` | sds.nim:87 | send-path tests |
| 2.7 | `markDependenciesMet` | sds.nim:378 | dep-resolution tests |
| 2.8 | `addToHistory` → return appended/evicted lists instead of persisting | sds_utils.nim:81 | covered by 2.5/2.6/2.7 |
| 2.9 | `updateLamportTimestamp` → pure (no persistence) | sds_utils.nim:108 | covered |
| 2.10 | `getOrCreateChannel` use `loadChannel` | sds_utils.nim:289 | bootstrap unit test |
| 2.11 | `removeChannel`, `resetReliabilityManager` → `dropChannel` | sds_utils.nim, sds.nim | wipe tests |

Each step is a small commit. After every step: `nimble test` + `gitnexus_detect_changes` to confirm scope.

### Phase 3 — Remove the old interface

| Step | File(s) | Verify |
|------|---------|--------|
| 3.1 Delete old 13-proc `Persistence` fields | `sds/types/persistence.nim` | compile fails on stragglers — fix |
| 3.2 Rename `PersistenceV2` → `Persistence` | all call sites | full test suite |
| 3.3 Delete `noOpPersistence` (old), keep `noOpPersistenceV2` as `noOpPersistence` | same | tests pass |
| 3.4 Update `library/` FFI thread to construct the new `Persistence` | `library/sds_thread/...` | FFI smoke test on macOS + Linux |
| 3.5 Update `Broker_FFI_API.md` and any docs referencing the old contract | docs | review |

### Phase 4 — (removed)

A reference backend is **not** part of this plan. The schema-agnostic KV
persistence module in the sibling repo is the production backend. Its
authors own the integration adapter that maps the 5 `Persistence` procs onto
KV puts/gets. nim-sds only needs to expose the interface and a working
`noOpPersistence` for its own tests.

---

## 4. Risk Mitigation During Refactor

| Risk | Mitigation |
|------|------------|
| Mid-refactor inconsistency (some ops on new interface, some on old) | Phase 2 keeps both interfaces wired — only one is active per RM via a constructor switch; integration tests run against both |
| Behavior change masked by passing tests | Add `tests/test_persistence_contract.nim` that asserts exact call count per protocol op (before vs after must match the table in `ANALYSIS_SNAPSHOT_SAVE_POINTS.md`) |
| Memory-first mutation pattern preserved by accident | Move *all* persistence calls to the end of the op, after the lock-held mutation block completes. The dirty flag is set *during* mutation; the save fires *after*. If save fails, the in-memory state is still the source of truth for the next op — but now there's only one possible point of divergence per op, not 10. |
| FFI thread breakage | Phase 3.4 is the FFI cutover; smoke test on both `--mm:refc` and `--mm:orc`, macOS and Linux, before declaring done. ASAN run on the FFI example. |
| Snapshot blob growth surprises | Add a `len()` log on `saveChannelMeta` for the first week of integration; fail-loud if any blob exceeds (configurable) 1 MB |

---

## 5. Acceptance Criteria

- [ ] All existing `nimble test` cases pass against the new interface
- [ ] New `tests/test_persistence_contract.nim` enforces exactly the call counts from `ANALYSIS_SNAPSHOT_SAVE_POINTS.md` §"Save Points" table
- [ ] New `tests/test_snapshot_codec.nim` round-trips every `ChannelMeta` variant
- [ ] Failure-injection test: kill persistence between `saveChannelMeta` and `updateHistory` → on restart, the manager loads a self-consistent snapshot (no orphan history entries; no dangling buffer references)
- [ ] FFI smoke (`liblogosdelivery`-style) runs clean on macOS+refc, macOS+orc, Linux+refc, Linux+orc
- [ ] `Broker_FFI_API.md` reflects the new contract
- [ ] Bench: snapshot save rate matches the predicted `S + R` (foreground) and ≤ 0.2/s/channel background floor (with dirty-guard) under a synthetic 50-msg/s workload
- [ ] Snapshot blob size on the bench workload matches the estimate in §7 within 2×; outliers logged
|
||
|
||
---
|
||
|
||
## 6. Codec & flattening — where protobuf comes in
|
||
|
||
### Codec choice
|
||
|
||
The KV backend stores opaque blobs. The codec that produces the blob is
|
||
**internal to nim-sds**. Protobuf is the natural choice because:
|
||
|
||
- The project already uses protobuf for the SDS wire format
|
||
(`sds/protobuf.nim` encodes `SdsMessage`). One codec, one toolchain.
|
||
- Field-number versioning gives forward/backward compatibility for free —
|
||
pairs naturally with the `schemaVersion` field.
|
||
- Repeated message fields encode efficiently and round-trip cleanly.
|
||
|
||
Concretely: `ChannelMeta` is a top-level protobuf message; `saveChannelMeta`
|
||
serializes it to `seq[byte]` and the backend writes that under
|
||
`meta:<channelId>`. On load, the backend returns the bytes; nim-sds
|
||
deserializes.
|
||
|
||
### Why flatten `Table[Id, Entry]` to `seq[KV]`

Protobuf's wire format has no first-class "map of bytes-key → message-value"
type in the minimal subset used by `sds/protobuf.nim` (the
`nim-libp2p`-style `minprotobuf`). Even the full proto3 `map<K, V>` is
encoded on the wire as **repeated KV messages anyway**; the map syntax is
just sugar over `repeated Entry { key = 1; value = 2; }` (and proto3 does
not accept `bytes` as a map key type in any case).

So flattening simply makes the wire shape explicit:

```
ChannelMeta {
  ...
  repeated OutgoingRepairKV outgoingRepairBuffer = 5;
  repeated IncomingRepairKV incomingRepairBuffer = 6;
}

OutgoingRepairKV {
  bytes messageId = 1;
  OutgoingRepairEntry entry = 2;
}
```

The `Table` exists only in memory; the wire and disk form is the flat seq.
Decode rebuilds the `Table` by iterating the seq. Cost: one alloc per entry
on encode/decode, negligible against the I/O it replaces.

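A minimal sketch of that flatten/rebuild step, under stated assumptions: the key type is a plain `string` standing in for `SdsMessageID`, the fields of `OutgoingRepairEntry` are invented for illustration, and `flatten` / `rebuild` are hypothetical helper names rather than the codec's actual procs. Iteration order of a `Table` is unspecified, so the flat seq carries no ordering guarantee; the round trip only promises the same key/value pairs.

```nim
import std/tables

type
  MessageId = string               # stand-in for SdsMessageID
  OutgoingRepairEntry = object     # hypothetical fields, for illustration only
    requestedAt: int64
    attempts: int
  OutgoingRepairKV = object        # flat wire/disk form: key plus entry
    messageId: MessageId
    entry: OutgoingRepairEntry

proc flatten(t: Table[MessageId, OutgoingRepairEntry]): seq[OutgoingRepairKV] =
  ## Encode side: one KV element per table entry.
  result = newSeqOfCap[OutgoingRepairKV](t.len)
  for id, entry in t.pairs:
    result.add OutgoingRepairKV(messageId: id, entry: entry)

proc rebuild(kvs: seq[OutgoingRepairKV]): Table[MessageId, OutgoingRepairEntry] =
  ## Decode side: iterate the flat seq and re-insert each entry by key.
  result = initTable[MessageId, OutgoingRepairEntry]()
  for kv in kvs:
    result[kv.messageId] = kv.entry
```
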
`outgoingBuffer` (already a `seq`) and `incomingBuffer` (a `Table` flattened
to `seq[IncomingMessage]`; the key is `message.messageId`, so no separate KV
wrapper is needed) follow the same logic.

---

## 7. Snapshot size estimates

Assumptions (stated explicitly, since every number below derives from them):

| Quantity | Assumed bytes | Source |
|----------|---------------|--------|
| `SdsMessageID` | 32 | typical content-addressed id |
| `SdsParticipantID` | 32 | same |
| `SdsChannelID` | 32 | same |
| `bloomFilter` (serialized, in an `SdsMessage`) | 256 | derived from default `bloomFilterCapacity` × `errorRate` |
| `causalHistory` | 10 entries × ~40 B | `maxCausalHistory = 10` from `reliability_config.nim` |
| `repairRequest` in a wire `SdsMessage` | up to 3 × ~40 B | `maxRepairRequests = 3` |
| Application payload (`content`), small | 100 B | typical short chat payload |
| Application payload, medium | 1 KB | richer payload |
| Protobuf framing | ~10% overhead | tag bytes + varints |

- **One `SdsMessage` on the wire (no content):** ~700 B
- **One `SdsMessage` with 100 B content:** ~800 B
- **One `SdsMessage` with 1 KB content:** ~1.7 KB

Per-entry sizes inside `ChannelMeta`:

| Entry | Size (100 B payload) | Size (1 KB payload) | Notes |
|-------|----------------------|---------------------|-------|
| `UnacknowledgedMessage` | ~820 B | ~1.7 KB | SdsMessage + sendTime + resendAttempts |
| `IncomingMessage` | ~950 B | ~1.9 KB | SdsMessage + missingDeps (avg 3 × 32 B) |
| `OutgoingRepairKV` | ~110 B | ~110 B | no cached message, payload-independent |
| `IncomingRepairKV` | ~920 B | ~1.8 KB | **cached serialized SdsMessage dominates** |

Fixed overhead per `ChannelMeta`: ~30 B (schemaVersion + lamportTimestamp + framing).

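As a rough cross-check, a snapshot's size can be modelled as the fixed overhead plus, for each of the four buffers, the entry count times the per-entry size. The sketch below does that arithmetic with the per-entry figures from the table above; `EntrySizes`, `Profile`, and `snapshotBytes` are hypothetical names, and the example counts are those of the Steady profile in the per-channel table in the next subsection. Because every input is itself an estimate, the results should be read as approximations (in a quick check the rows land within roughly 10% of that table's rounded figures).

```nim
type
  EntrySizes = object        # bytes per entry, from the per-entry table above
    unacked, incoming, outRepair, inRepair: int
  Profile = object           # entry counts per buffer
    outBuf, inBuf, outRepair, inRepair: int

const
  fixedOverhead = 30         # bytes: schemaVersion + lamportTimestamp + framing
  small = EntrySizes(unacked: 820, incoming: 950, outRepair: 110, inRepair: 920)
  medium = EntrySizes(unacked: 1700, incoming: 1900, outRepair: 110, inRepair: 1800)

func snapshotBytes(p: Profile, s: EntrySizes): int =
  ## Fixed overhead plus count times per-entry size for each buffer.
  fixedOverhead +
    p.outBuf * s.unacked + p.inBuf * s.incoming +
    p.outRepair * s.outRepair + p.inRepair * s.inRepair

let steady = Profile(outBuf: 5, inBuf: 1, outRepair: 1, inRepair: 1)
echo snapshotBytes(steady, small)    # 6110 bytes, roughly 6 KB
echo snapshotBytes(steady, medium)   # 12340 bytes, roughly 12 KB
```

Swapping in other entry counts gives a quick way to sanity-check a bench result against the §5 "within 2×" criterion.
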
### Per-channel snapshot size by load

The `outBuf`, `inBuf`, `outRepair`, and `inRepair` columns are entry counts
for `outgoingBuffer`, `incomingBuffer`, `outgoingRepairBuffer`, and
`incomingRepairBuffer` respectively.

| Profile | outBuf | inBuf | outRepair | inRepair | Size (100 B payload) | Size (1 KB payload) |
|---------|--------|-------|-----------|----------|----------------------|---------------------|
| Idle | 0 | 0 | 0 | 0 | **~30 B** | ~30 B |
| Light chat | 2 | 0 | 0 | 0 | **~1.7 KB** | ~3.5 KB |
| Steady | 5 | 1 | 1 | 1 | **~6 KB** | ~12 KB |
| Busy | 10 | 3 | 3 | 3 | **~14 KB** | ~28 KB |
| Heavy, lossy network (SDS-R churning) | 30 | 10 | 20 | 10 | **~45 KB** | ~95 KB |
| Pathological (resend window full, big repair caches) | 50 | 20 | 30 | 20 | **~75 KB** | ~155 KB |

### Where the bytes go

| Load profile | Dominant contributor |
|--------------|----------------------|
| Idle / light | Fixed overhead + outgoingBuffer |
| Steady / busy | outgoingBuffer (each entry ~1 KB+) |
| Heavy / lossy | **incomingRepairBuffer**: each KV entry caches a full serialized message for rebroadcast. This is the single biggest amplifier; 20 entries with 1 KB payloads ≈ 36 KB on their own. |

### Implications

1. **Typical write is small (1–30 KB).** Comfortably foldable into the
   per-op KV write cost; the backend's blob-write cost is bounded.
2. **`IncomingRepairEntry.cachedMessage` is the size lever to watch.**
   Under heavy SDS-R activity it dominates the snapshot. If snapshot size
   becomes a bottleneck, the optimization is to drop the cache from the
   snapshot and re-serialize from `messageHistory` on demand, at the cost
   of more CPU and the corner case where the requested message has been
   evicted from history between snapshot save and repair sweep firing.
3. **Heavy profile (~95 KB) at the predicted 6/s/ch save rate = ~570 KB/s
   per channel.** A 10-channel heavy node is then ~5.7 MB/s of snapshot
   churn: expected to be well within KV backend throughput, but worth a
   real bench before declaring it OK.
4. **The 1 MB hard cap** suggested in §4 stays appropriate; the pathological
   profile at 1 KB payload is ~155 KB, leaving roughly 6× headroom.

---

## 8. Persistence failure policy: non-fatal, best-effort

**Change from current branch.** The current implementation treats every
`rePersistenceError` as fatal: the protocol op returns `err()`, the caller
sees a failure, and normal SDS operation breaks even though the in-memory
state is fine. This is wrong for the snapshot model.

**New policy.**
- In-memory state is the **source of truth** for protocol correctness.
  Lamport clock, buffers, history, and bloom filter all live in
  `ChannelContext` and are mutated under the lock before any persistence
  call. SDS message processing never depends on disk state for correctness
  within a session.
- Persistence is **best-effort durability**. A failed `saveChannelMeta` or
  `updateHistory` does **not** abort the operation, does not return `err`
  to the FFI caller, and does not corrupt protocol semantics. The next op
  will issue its own snapshot: if that succeeds, on-disk state is
  re-synchronised; if it also fails, the one after that tries again.
- Snapshot writes are **idempotent and self-contained.** Each
  `saveChannelMeta` blob is the complete current `ChannelMeta`. A missed
  write is fully recovered by any later successful write, with no log of
  deltas to replay and no compensating action needed.
- Bootstrap loss tolerance: if `loadChannel` fails or returns stale state
  on restart, the manager starts from whatever it could load (possibly
  empty). Peer traffic and SDS-R repair will re-populate it. This is the
  expected behaviour of the bloom-rebuilt-from-history design extended to
  the meta blob.

**Implementation pattern.** At each save point:

```nim
# end of wrapOutgoingMessage / unwrapReceivedMessage / etc.
if dirty:
  let saveRes = await rm.persistence.saveChannelMeta(channelId, snapshot)
  if saveRes.isErr:
    warn "snapshot save failed; in-memory state unaffected, next op will retry",
      channelId = channelId, detail = saveRes.error
    # DO NOT return err; protocol op succeeded.

if appended.len > 0 or evicted.len > 0:
  let histRes = await rm.persistence.updateHistory(channelId,
    HistoryUpdate(append: appended, evict: evicted))
  if histRes.isErr:
    warn "history update failed; in-memory log authoritative, next op will retry",
      channelId = channelId, detail = histRes.error

return ok(serializedMessage) # protocol op succeeded regardless
```

**What still returns `err(rePersistenceError)`.** Only operations whose
**semantic intent** is durability:
- `removeChannel`, `resetReliabilityManager` → must confirm `dropChannel`
  succeeded; otherwise the caller may assume disk is clean when it isn't.
- `getOrCreateChannel` on first bootstrap → if `loadChannel` errors (vs.
  returns empty), surface it so the caller can decide between "start
  fresh in memory" and "abort init".

**Impact on §5 acceptance criteria.** Add: failure-injection test must
prove that `wrapOutgoingMessage`, `unwrapReceivedMessage`,
`markDependenciesMet`, `checkUnacknowledgedMessages`, `runRepairSweep` all
return `ok` under 100%-failing persistence, with correct in-memory
behaviour and a recovered on-disk state after persistence is restored.

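The shape of such a failure-injection check can be sketched in a few lines. Everything here is a simplified assumption rather than the project's test code: `FlakyStore`, `SaveResult`, `Channel`, and `wrapOutgoing` are hypothetical stand-ins, the calls are synchronous instead of async, `SaveResult` replaces `Result[void, string]`, and the "codec" just packs the lamport clock into 8 bytes. It illustrates the three properties the criterion asks for: the op reports success while persistence fails, in-memory state still advances, and the first successful save brings the stored blob back in line with memory.

```nim
import std/tables

type
  SaveResult = object        # minimal stand-in for Result[void, string]
    ok: bool
    error: string
  FlakyStore = object        # hypothetical in-memory backend with a failure switch
    blobs: Table[string, seq[byte]]
    failing: bool
  Channel = object
    id: string
    lamport: int64           # in-memory state, the source of truth
    failedSaves: int

proc saveChannelMeta(s: var FlakyStore, channelId: string,
                     blob: seq[byte]): SaveResult =
  if s.failing:
    return SaveResult(ok: false, error: "injected failure")
  s.blobs["meta:" & channelId] = blob
  SaveResult(ok: true)

proc snapshot(c: Channel): seq[byte] =
  ## Placeholder codec: the real blob would be the encoded ChannelMeta.
  for i in 0 ..< 8:
    result.add byte((c.lamport shr (8 * i)) and 0xff)

proc wrapOutgoing(c: var Channel, s: var FlakyStore): bool =
  ## Mutates memory first, then attempts a best-effort save.
  inc c.lamport
  let res = s.saveChannelMeta(c.id, c.snapshot())
  if not res.ok:
    inc c.failedSaves        # real code would log a warning here
  true                       # the protocol op succeeded regardless
```
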
**Why this is safe.** Each snapshot is a full self-contained blob;
partial-write divergence (the original ANALYSIS §4 critical risk) is
already eliminated by the atomic-blob design. Once that's true, treating
persistence failure as fatal is pure downside: it propagates a
recoverable I/O hiccup into a user-visible protocol failure for no
correctness gain.

---

## 9. What this plan deliberately does NOT do

- Does not add transaction handles; the KV backend's batch primitive is sufficient
- Does not ship a reference backend; the schema-agnostic KV module in the sibling repo is the production backend
- Does not change the bloom filter persistence policy (still rebuilt from history)
- Does not introduce SDS-R repair extension changes
- Does not touch the FFI surface shape beyond construction of `Persistence`; the existing C API is unchanged
- Does not auto-migrate on-disk data from an older format (no production data exists yet; schemaVersion=1 starts clean)