99 Commits

Author SHA1 Message Date
NagyZoltanPeter
f919957950
refactor(persistence): migrate runRepairSweep to PersistenceV2 (phase 2.1)
Per-entry removeIncomingRepair / removeOutgoingRepair calls are replaced
by a single trySaveMeta per *dirty* channel at the end of that channel's
sweep. Failure is logged but does NOT abort the sweep — in-memory state
is the source of truth (PLAN_SNAPSHOT_PERSISTENCE.md §8).

Helpers added in sds/sds_utils.nim:
- snapshotMeta(channel) — capture current ChannelContext as ChannelMeta
  blob (flattens Table-keyed buffers to seqs for the wire shape).
- trySaveMeta(rm, channelId, channel) — best-effort meta snapshot save;
  logs on failure, never propagates.
- tryUpdateHistory(rm, channelId, append, evict) — best-effort history
  update; skips the call entirely when both lists are empty (HistoryUpdate
  contract).

Call-rate impact for runRepairSweep:
- Before: N persistence calls per expired entry per channel.
- After:  at most 1 saveChannelMeta per dirty channel; 0 on idle channels
  (matches the dirty-flag floor in ANALYSIS_SNAPSHOT_SAVE_POINTS).

All existing tests pass — including the 3 SDS-R Repair Sweep tests that
directly exercise this proc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:42:10 +02:00
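The call-rate change described in f919957950 can be illustrated with a small sketch. This is a language-neutral Python illustration, not the repository's Nim code: `Channel`, `try_save_meta`, and `run_repair_sweep` are hypothetical stand-ins for `ChannelContext`, `trySaveMeta`, and `runRepairSweep`, and the `save` callback is an assumed placeholder for the persistence backend.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("sds.sketch")


@dataclass
class Channel:
    # Hypothetical stand-in for ChannelContext: repair id -> expiry time.
    incoming_repairs: dict = field(default_factory=dict)
    outgoing_repairs: dict = field(default_factory=dict)


def try_save_meta(save, channel_id, channel):
    """Best-effort snapshot save: log on failure, never propagate."""
    try:
        # Flatten the keyed buffers to lists, as a wire shape would need.
        save(channel_id, {
            "incoming": sorted(channel.incoming_repairs.items()),
            "outgoing": sorted(channel.outgoing_repairs.items()),
        })
    except Exception as exc:
        log.warning("meta save failed for %s: %s", channel_id, exc)


def run_repair_sweep(channels, now, save):
    """Drop expired repair entries; save once per channel that changed."""
    for channel_id, channel in channels.items():
        dirty = False
        for buf in (channel.incoming_repairs, channel.outgoing_repairs):
            expired = [key for key, expiry in buf.items() if expiry <= now]
            for key in expired:
                del buf[key]
            dirty = dirty or bool(expired)
        if dirty:
            # One save per dirty channel; idle channels cost nothing.
            try_save_meta(save, channel_id, channel)
```

In this sketch a channel with three expired entries triggers one save rather than three, and a failing `save` leaves the in-memory state updated.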
NagyZoltanPeter
f5946763c4
feat(persistence): add PersistenceV2 interface alongside legacy (phase 1)
Introduce the 5-proc snapshot-based Persistence interface that will
replace the legacy 13-proc one. Both coexist on `ReliabilityManager` so
phase 2 can migrate protocol ops one at a time without breaking existing
callers.

New file:
- sds/types/persistence_v2.nim — `PersistenceV2` type with
  saveChannelMeta / updateHistory / loadChannel / dropChannel /
  setRetrievalHint. `noOpPersistenceV2()` default. Doc-comments capture
  the atomicity pairing (meta save + history update issued back-to-back
  under the channel lock) and the non-fatal failure policy from PLAN §8.

Modified:
- sds/types/reliability_manager.nim — adds `persistenceV2: PersistenceV2`
  field alongside `persistence`; constructor takes both, both default to
  no-op.
- sds.nim — `newReliabilityManager` plumbs the new optional parameter.
- AGENTS.md / CLAUDE.md — GitNexus index re-indexed after phase 0 +
  phase 1 additions; symbol counts updated by `npx gitnexus analyze`.

No call site uses the new interface yet — that's phase 2. All existing
tests still pass against the legacy interface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:39:10 +02:00
NagyZoltanPeter
979a66360b
feat(persistence): add snapshot types and codec (phase 0)
Introduce atomic-snapshot persistence types that will replace the current
fine-grained 13-proc Persistence interface. This commit is purely additive:
no existing call site changes, no behaviour change.

New types (sds/types/):
- channel_meta.nim — ChannelMeta (atomic per-channel snapshot blob),
  ChannelData (bootstrap payload), OutgoingRepairKV / IncomingRepairKV
  (flattened map entries for protobuf wire shape).
- history_update.nim — HistoryUpdate (combined append/evict payload for
  the message log).

New codec (sds/snapshot_codec.nim):
- Protobuf encode/decode for all new types, reusing the existing
  SdsMessage and HistoryEntry encoders from sds/protobuf.nim.
- Explicit schemaVersion=1 on ChannelMeta; decoder rejects unknown
  versions loudly rather than silently truncating.
- Time encoded as int64 unix milliseconds.

Tests (tests/test_snapshot_codec.nim):
- 13 round-trip cases covering empty, single-entry, full-buffer, and
  repair-heavy snapshots; ChannelData ordering; HistoryUpdate variants;
  schemaVersion rejection.

Planning artefacts:
- ANALYSIS_SDS_PERSISTENCE.md — problem statement (partial-write
  divergence, chatty call rate, non-fatal-error policy gap).
- ANALYSIS_SNAPSHOT_SAVE_POINTS.md — exact save points per protocol op
  and projected call rates.
- PLAN_SNAPSHOT_PERSISTENCE.md — phased refactor plan; this commit
  implements phase 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:33:17 +02:00
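The two codec rules named in 979a66360b (an explicit schema version that the decoder refuses to guess about, and time as integer unix milliseconds) can be sketched as follows. This is an assumption-laden Python illustration: it uses a plain dict instead of protobuf, and the field names `lamport` and `savedAtMs` are hypothetical, not taken from the repository.

```python
from datetime import datetime, timedelta, timezone

SCHEMA_VERSION = 1  # the commit states schemaVersion=1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(moment: datetime) -> int:
    """Encode an aware datetime as integer unix milliseconds."""
    # Integer timedelta division avoids float rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_unix_ms(ms: int) -> datetime:
    """Decode integer unix milliseconds back to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def encode_meta(lamport: int, saved_at: datetime) -> dict:
    # Hypothetical field names; a dict stands in for the protobuf message.
    return {
        "schemaVersion": SCHEMA_VERSION,
        "lamport": lamport,
        "savedAtMs": to_unix_ms(saved_at),
    }


def decode_meta(blob: dict):
    """Reject unknown versions loudly rather than decoding a partial blob."""
    version = blob.get("schemaVersion")
    if version != SCHEMA_VERSION:
        raise ValueError(f"unsupported ChannelMeta schemaVersion: {version!r}")
    return blob["lamport"], from_unix_ms(blob["savedAtMs"])
```

Failing on an unknown version keeps a newer writer's snapshot from being read as a shorter, apparently valid one.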
NagyZoltanPeter
145e5d6459
feat: propagate persistence backend errors via Result
The Persistence contract previously returned `Future[void]` for writes and
`Future[ChannelSnapshot]` for the loader, with `raises: []`. Backends had no
way to report a failure, so a failed write or a failed/partial read was
silently swallowed — and on the read path a mid-scan failure could bootstrap
a *truncated* channel snapshot, corrupting the rebuilt bloom filter and
lamport clock across a restart.

Make every contract field Result-returning:
  * mutating ops  -> Future[Result[void, string]]
  * loadAllForChannel -> Future[Result[ChannelSnapshot, string]]

The backend-supplied error string is mapped to a new
`ReliabilityError.rePersistenceError` (logged once at the boundary via
`reliabilityErr`) and threaded up through every persistence-touching proc to
the public API, where the caller decides what to do. Request-driven paths
(wrap/unwrap/markDependenciesMet/ensureChannel/removeChannel/reset) propagate
the error; background maintenance loops (periodicBufferSweep,
periodicRepairSweep) log and retry on the next tick, since they have no
synchronous caller.

Tests: in-memory backend gains a `failingOps` injection hook; new
"Persistence: error propagation" suite asserts read/write/drop failures
surface as `rePersistenceError`. Full suite passes (90 OK).

BREAKING CHANGE: the `Persistence` contract signature changed; custom
backends must return `Result` and `ok()` on success. Bumped to 0.3.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 17:12:06 +02:00
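The split described in 145e5d6459 between request-driven paths (propagate the error) and background loops (log and retry) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the Nim contract: `Result`, `InMemoryBackend`, `wrap_outgoing`, and `sweep_once` are hypothetical names, and `failing_ops` only mimics the injection hook the commit mentions.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("sds.sketch")


@dataclass
class Result:
    # Minimal stand-in for a Result[void, string]: None means success.
    error: Optional[str] = None

    @property
    def is_ok(self) -> bool:
        return self.error is None


class InMemoryBackend:
    def __init__(self, failing_ops=()):
        self.failing_ops = set(failing_ops)  # failure-injection hook
        self.entries = []

    async def append(self, channel_id, message_id) -> Result:
        if "append" in self.failing_ops:
            return Result(error="append: injected failure")
        self.entries.append((channel_id, message_id))
        return Result()


async def wrap_outgoing(backend, channel_id, message_id) -> Result:
    """Request-driven path: hand the backend error to the caller."""
    saved = await backend.append(channel_id, message_id)
    if not saved.is_ok:
        return Result(error=f"persistence error: {saved.error}")
    return Result()


async def sweep_once(backend, channel_id, pending) -> list:
    """Background path: log failures and keep the items for the next tick."""
    retry = []
    for message_id in pending:
        saved = await backend.append(channel_id, message_id)
        if not saved.is_ok:
            log.warning("sweep write failed, will retry: %s", saved.error)
            retry.append(message_id)
    return retry
```

The caller of `wrap_outgoing` sees the failure and decides what to do, while `sweep_once` has no synchronous caller and so only returns what still needs writing.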
Siddarth Kumar
980c830415
nix: set nim cache to proper tmp directory
Otherwise we end up with cache collisions like this in CI:

```
 > Error: cannot open '/tmp/nim/libsds_d/@z..@f..@f..@f..@f..@f..
 @ffgber@f6y2zz1uv2lzi4ln2717py8m0aix64u56-avz-hajenccrq-2.2.4@favz
 @fyvo@fflfgrz@frkprcgvbaf.nim.c'
```
2026-05-26 16:01:55 +05:30
NagyZoltanPeter
35a33adc98
feat: make Persistence interface async (#69)
* feat: make Persistence interface async

The 14 Persistence proc fields now return Future[...] with
{.async: (raises: []), gcsafe.}, allowing real I/O backends (SQLite,
encrypted file, network) to suspend rather than block the Chronos event
loop the manager runs on.

Propagates through:
- ReliabilityManager.lock: system.Lock -> chronos.AsyncLock. Acquired
  across awaits cleanly; matches the single-threaded Chronos worker the
  FFI uses. Multi-OS-thread use is now explicitly the caller's
  responsibility.
- sds_utils + sds.nim public API procs (wrapOutgoingMessage,
  unwrapReceivedMessage, markDependenciesMet, setCallbacks,
  resetReliabilityManager, cleanup, ensureChannel, removeChannel, the
  getter snapshots, etc.) are now async.
- FFI request handlers in library/sds_thread/... await the new API.
- Tests converted via an asyncTest template that wraps each test body
  in an async proc; setup/teardown use waitFor for their single async
  call (ensureChannel / cleanup).

Lock scope is preserved exactly: the same call sites that held the
kernel Lock today hold AsyncLock now -- no new locking added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor: drop asyncSpawn, add asyncSetup/asyncTeardown

Three asyncSpawn usages removed:

- sds.nim startPeriodicTasks: stored the periodic-task futures on
  ReliabilityManager (new field `periodicTasks: seq[FutureBase]`) so
  cleanup can cancel them on shutdown instead of leaking the loops
  against a cleared manager.
- library/sds_thread/sds_thread.nim: fireSync moved BEFORE processing,
  then `await SdsThreadRequest.process(...)` instead of asyncSpawn'ing
  it. Aligns the worker with the SP-channel + lock assumption that
  there are no concurrent requests; caller throughput is unchanged
  because the caller only waits for receipt (fireSync), not processing.
- tests TestBus repair callback: replaced asyncSpawn(deliverExcept...)
  with an explicit pending-delivery queue drained by `bus.drain()`.
  Integration tests no longer rely on `sleepAsync(10ms)` to let
  spawned deliveries finish — they await drain instead.

Tests also pick up an asyncSetup/asyncTeardown pair (tests/async_unittest.nim)
so suite fixtures can `await` directly. All `waitFor` in setup/teardown
blocks is gone; only the top-level asyncTest wrapper still uses waitFor
(once, to drive the async proc to completion).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Correctly propagate error hidden by new async move

* Correctly handle future cancellation exceptions, +some housekeeping

* Apply suggestion from @Ivansete-status

Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>

* Stylistics, async default implication addressed, nph style run

* Remove CancelledFuture leaking from the public-facing API; as a consequence it is tunneled into CatchableError handling everywhere

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-25 22:30:15 +02:00
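The first asyncSpawn removal in 35a33adc98 (keep the periodic-task futures so cleanup can cancel them) follows a general pattern that can be sketched with Python's asyncio. This is an analogy only, not Chronos code: `Manager`, `start_periodic_tasks`, and `cleanup` are hypothetical names, and `periodic_tasks` merely mirrors the `periodicTasks` field the commit describes.

```python
import asyncio


class Manager:
    def __init__(self):
        self.periodic_tasks = []  # handles to the running loops
        self.ticks = 0

    async def _periodic(self, interval):
        while True:
            await asyncio.sleep(interval)
            self.ticks += 1

    def start_periodic_tasks(self, interval=0.01):
        # Keep a handle on each loop rather than firing and forgetting it.
        task = asyncio.create_task(self._periodic(interval))
        self.periodic_tasks.append(task)

    async def cleanup(self):
        for task in self.periodic_tasks:
            task.cancel()
        # Wait until every loop has actually stopped before clearing state.
        await asyncio.gather(*self.periodic_tasks, return_exceptions=True)
        self.periodic_tasks.clear()


async def demo():
    manager = Manager()
    manager.start_periodic_tasks()
    await asyncio.sleep(0.05)
    tasks = list(manager.periodic_tasks)
    await manager.cleanup()
    return manager.ticks, all(task.cancelled() for task in tasks)
```

Because `cleanup` awaits the cancelled tasks, no loop can keep running against a manager whose state has already been cleared.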
Darshan
2e9a7683f0
fix: require participantId on newReliabilityManager (#67) 2026-05-10 13:24:42 +05:30
Darshan
881d8cb359
feat: persistence interface for SDS state (#66) 2026-05-08 03:14:12 +05:30
Darshan
9d08f5995b
feat: Implementation of SDS-Repair (#60) 2026-05-01 18:35:38 +05:30
Ivan FB
8ee857c908
generic refactor to make the code more aligned to logos-delivery style (#62)
* generic refactor to make the code more aligned to logos-delivery style
* use explicit return statement
2026-04-24 09:50:18 +02:00
Ivan FB
6f49a9742a
add raw no-nix ci.yml (#61)
* adapt arch and cpu flags and fix ios build
* require chronos 4.2.0 or higher
* Android fixes and rm Makefile
* enable long paths in git windows ci
2026-04-10 14:23:30 +02:00
Igor Sirotin
a3e98a99ec
docs: added CLAUDE.md (#58) 2026-04-02 11:57:05 +01:00
Ivan FB
0dea35d364
feat: refactor to support building with Nimble (#52)
Changes include:

- Removing all submodules from vendor folder.
- Updating sds.nimble with required dependencies.
- Generating a nimble.lock file using Nimble.
- Updated Nim code to reference dependencies correctly.
- Added nix/deps.nix fixed output derivation that calls Nimble.
- Updated nixpkgs to use 25.11 commit which provides Nimble 0.20.1.
- Disabled Nix Android builds on MacOS due to Nimble segfault.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2026-02-11 16:32:02 +01:00
Igor Sirotin
47757bacea
chore: update license files to comply with Logos licensing requirements 2026-02-05 15:11:35 +00:00
Igor Sirotin
55ba7e2bc3
docs: added licenses (#51) 2026-02-03 19:05:46 +00:00
e301dad197
nix: use Nix Flake from NBS repo to provide Nim
This way we can track the same Nim as in the vendor folder.

Notably this upgrades from Nim 2.2.4 to 2.2.6.

Depends on:
https://github.com/status-im/nimbus-build-system/pull/112

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2026-02-01 20:53:14 +01:00
Ivan FB
19c48ef602
Merge pull request #46 from logos-messaging/use-epoll-in-android
force epoll to be used on Android
2026-01-29 23:28:00 +01:00
shash256
9f7ae0c7df
feat: support retrieval hints for efficient message retrieval from store nodes (#18)
* feat: updates for retrieval hint

* use HistoryEntry for deps

* chore: rearrange helper funcs

* chore: address review comments

* fix: simplify with mapIt
v0.3.0
2026-01-29 09:52:40 +00:00
Ivan FB
a8a5e42530
Merge pull request #45 from logos-messaging/fix-shebangs
fix: use env instead of hardcoding bash path
2026-01-29 10:23:17 +01:00
Ivan Folgueira Bande
4aa800ad2b
force epoll to be used on Android 2026-01-27 22:41:46 +01:00
c6e54b70ee
fix: use env instead of hardcoding bash path
Causes issues in Nix shells and derivations.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2026-01-27 15:28:57 +01:00
Ivan FB
239f619625
Merge pull request #42 from logos-messaging/initialize-lock
initialize ctxPoolLock to avoid crash on Windows/iOS
v0.2.4
2026-01-14 12:16:15 +01:00
Ivan Folgueira Bande
be4c283581
initialize ctxPoolLock 2026-01-14 11:49:04 +01:00
Siddarth Kumar
fb8039c5a5
chore: fix iOS build
otherwise iOS linker fails with
  Undefined symbols for architecture arm64
2025-12-23 19:47:02 +04:00
Igor Sirotin
8d33a7f7da
feat: thread pool (#40)
* feat: thread pool

* proper pass ARCH in Makefile when building for Android

---------

Co-authored-by: Ivan Folgueira Bande <ivansete@status.im>
2025-12-22 18:10:45 +00:00
Ivan Folgueira Bande
e67639ee08
get arch from uname -m if ARCH env var is not set v0.2.3 2025-12-17 16:02:53 +01:00
Ivan FB
ac31e5adf2
Merge pull request #37 from logos-messaging/fix/buildForAppleSilicon
fix: Add condition to check hostCpu and then build based on that
2025-12-15 12:06:17 +01:00
Khushboo Mehta
13c3c348fa
fix: Add condition to check hostCpu and then build based on that 2025-12-15 12:01:51 +01:00
Ivan FB
0b4d3cc03f
Merge pull request #38 from logos-messaging/rm-log
rm UNWRAP_MESSAGE failed error
v0.2.2
2025-12-10 09:51:41 +01:00
Ivan Folgueira Bande
191928adc6
rm UNWRAP_MESSAGE failed error 2025-12-09 11:45:35 +01:00
Ivan Folgueira Bande
ae445d5585
rename ANDROID_NDK_HOME to ANDROID_NDK_ROOT v0.2.1 2025-11-28 19:34:51 +01:00
Ivan Folgueira Bande
024b8c50e9
adapt Makefile and sds.nimble to support iOS target v0.2.0 2025-11-27 23:01:39 +01:00
Ivan Folgueira Bande
b643126011
Makefile and sds.nimble adaptation to ensure all lib ext are covered v0.1.0 2025-11-07 11:20:17 +01:00
Ivan Folgueira Bande
c82dad828a
protobuf.nim enhancement to avoid crash due to badly handled Result object
This aims to avoid the error that appeared in status-go's
test-functional:

ERROR    root:statusgo_container.py:146 Container health check failed:
Container is not running. Status: exited. Logs (last 10 lines):
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/vendor/nim-chronos/chronos/internal/asyncfutures.nim(624)
pollFor
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/vendor/nim-chronos/chronos/internal/asyncengine.nim(150)
poll
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/library/sds_thread/sds_thread.nim(45)
runSds
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/vendor/nim-chronos/chronos/internal/asyncmacro.nim(71)
process
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/library/sds_thread/inter_thread_communication/requests/sds_message_request.nim(65)
process
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/src/reliability.nim(190)
unwrapReceivedMessage
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/src/protobuf.nim(55)
extractChannelId
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/vendor/nim-results/results.nim(445)
value
/go/src/github.com/status-im/status-go/vendor/github.com/waku-org/sds-go-bindings/third_party/nim-sds/vendor/nim-results/results.nim(433)
raiseResultDefect
Error: unhandled exception: Trying to access value with err Result:
BadWireType [ResultDefect]
2025-10-31 19:54:27 +01:00
Ivan Folgueira Bande
26e13dd87a
fix in flake.nix pkgs.stdenv.isDarwin 2025-10-25 00:54:04 +02:00
Ivan Folgueira Bande
98b738c9c2
flake.nix: completely avoid adding android objects in macos
That produces a very long object file path, which exceeds
the 100-character limit of the macOS ar utility
2025-10-25 00:44:34 +02:00
Ivan Folgueira Bande
b9114ec917
Makefile set proper lib extension depending on platform etc 2025-10-24 20:59:08 +02:00
cb472bc829
nix: fix Android builds on Darwin platforms
The assert for Android NDK is obsolete. But we also can avoid fetching
the Android dependencies when not building for Android.

Also, lsb-release is a linux-only tool.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2025-10-24 17:27:49 +02:00
a8060c5600
nix: do not add android SDK on aarch64 Darwin
Otherwise it fails with:

error: aarch64-darwin not supported for Android SDK. Use: NIXPKGS_SYSTEM_OVERRIDE=x86_64-darwin

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2025-10-24 12:29:56 +02:00
2a216f3eb5
nix: fix targets to not be nested under androidPackages
Otherwise this happens:

 > nix flake show
error: expected a derivation

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2025-10-24 12:26:02 +02:00
Ivan Folgueira Bande
f3b084103d
try to add windows in default.nix 2025-10-22 02:33:19 +02:00
Ivan Folgueira Bande
9fa8c7cd64
add windows in flake.nix 2025-10-22 02:13:39 +02:00
Ivan Folgueira Bande
52966cb874
flake nix allow build for all four archs 2025-10-21 03:10:08 +02:00
Ivan Folgueira Bande
35c0f1964d
nix add all unix status go supported platforms 2025-10-21 02:55:09 +02:00
Ivan FB
972d7862fc
include libsds.h in the resulting nix package 2025-10-15 19:56:13 +02:00
Ivan Folgueira Bande
27a2caecaf
adjust as per comments 2025-10-15 19:54:09 +02:00
Ivan Folgueira Bande
68dcde059d
adjust as per comments 2025-10-15 19:53:27 +02:00
Ivan Folgueira Bande
8b6b4046fb
adjust 2025-10-15 19:51:02 +02:00
Ivan Folgueira Bande
2f7fc614d2
include header 2025-10-15 19:47:48 +02:00
c1e47fd449
nix: add Gh workflow for building Flake packages
Important if `status-go` is to consume the Nix Flake.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2025-10-15 18:34:38 +02:00