2325 Commits

Author SHA1 Message Date
NagyZoltanPeter
42e0aa43d1
feat: persistency (#3880)
* persistency: per-job SQLite-backed storage layer (singleton, brokered)

Adds a backend-neutral CRUD library at waku/persistency/, plus the
nim-brokers dependency swap that enables it.

Architecture (ports-and-adapters):
  * Persistency: process-wide singleton, one root directory.
  * Job: one tenant, one DB file, one worker thread, one BrokerContext.
  * Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema
    kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key)
    WITHOUT ROWID, WAL mode.
  * Writes are fire-and-forget via EventBroker(mt) PersistEvent.
  * Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists,
    KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError].
  * One storage thread per job; tenants isolated by BrokerContext.
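
The uniform schema above can be sketched outside Nim. This is a minimal illustration using Python's standard `sqlite3` module, not the code in backend_sqlite.nim; the helper names `open_job_db`, `put`, and `scan` are hypothetical, and only the table shape, the WITHOUT ROWID clause, WAL mode, and the half-open/open-ended range scans are taken from the commit message.

```python
import sqlite3

# Table shape as described in the commit message: one kv table keyed by
# (category, key), stored WITHOUT ROWID so rows live in primary-key order.
SCHEMA = """
CREATE TABLE IF NOT EXISTS kv (
  category BLOB NOT NULL,
  key      BLOB NOT NULL,
  payload  BLOB NOT NULL,
  PRIMARY KEY (category, key)
) WITHOUT ROWID
"""


def open_job_db(path):
    """Open or create a per-job database file (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # WAL mode, as in the commit
    conn.execute(SCHEMA)
    return conn


def put(conn, category, key, payload):
    """Insert or replace one row inside its own transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO kv (category, key, payload) VALUES (?, ?, ?)",
            (category, key, payload),
        )


def scan(conn, category, start, stop=None, descending=False):
    """Scan [start, stop) within one category; stop=None means open-ended."""
    order = "DESC" if descending else "ASC"
    sql = "SELECT key, payload FROM kv WHERE category = ? AND key >= ?"
    args = [category, start]
    if stop is not None:
        sql += " AND key < ?"
        args.append(stop)
    sql += f" ORDER BY key {order}"
    return conn.execute(sql, args).fetchall()
```

SQLite compares BLOB values bytewise, so the scan order here is exactly the byte order of the encoded keys, which is why the key encoding has to be order-preserving.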

Public surface (waku/persistency/persistency.nim):
  Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset()
  p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close()
  p.job(id) / p[id] / p.hasJob(id)
  Writes (Job form & string-id form, fire-and-forget):
    persist / persistPut / persistDelete / persistEncoded
  Reads (Job form & string-id form, async Result):
    get / exists / scan / scanPrefix / count / deleteAcked

Key & payload encoding (keys.nim, payload.nim):
  * encodePart family + variadic key(...) / payload(...) macros +
    single-value toKey / toPayload.
  * Primitives: string and openArray[byte] are 2-byte BE length + bytes;
    int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE;
    bool/byte/char are 1 byte; enums are int64(ord(v)).
  * Generic encodePart[T: tuple | object] recurses through fields() so
    any composite Nim type is encodable without ceremony.
  * Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no
    cast on pointers, no host-endianness dependency.
  * `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers
    bypass the built-in encoder with their own format (CBOR, protobuf...).
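
The primitive encodings listed above can be illustrated with a short Python sketch. It is not the keys.nim implementation; the function names are hypothetical and only the byte layouts (2-byte BE length prefix for strings, sign-flipped 8-byte BE for signed integers, 1 byte for bool) come from the commit message. Python integers are treated as signed 64-bit values here, which is an assumption of the sketch.

```python
import struct


def encode_int(v):
    """Signed 64-bit int as sign-flipped 8-byte big-endian.

    Adding 2**63 flips the sign bit of the two's-complement value, so
    negative numbers sort before positive ones under bytewise comparison.
    """
    if not -(1 << 63) <= v < (1 << 63):
        raise ValueError("value does not fit in int64")
    return struct.pack(">Q", v + (1 << 63))


def encode_str(s):
    """String as 2-byte big-endian length followed by the UTF-8 bytes."""
    raw = s.encode("utf-8")
    if len(raw) > 0xFFFF:
        raise ValueError("string too long for a 2-byte length prefix")
    return struct.pack(">H", len(raw)) + raw


def encode_bool(b):
    """Bool as a single byte."""
    return b"\x01" if b else b"\x00"


def key(*parts):
    """Concatenate encoded parts into one composite key (hypothetical)."""
    out = bytearray()
    for part in parts:
        if isinstance(part, bool):  # check bool first: bool is a subclass of int
            out += encode_bool(part)
        elif isinstance(part, int):
            out += encode_int(part)
        elif isinstance(part, str):
            out += encode_str(part)
        else:
            raise TypeError(f"unsupported key part: {type(part).__name__}")
    return bytes(out)
```

Because nothing here depends on host endianness or in-memory layout, the same input always yields the same bytes, which is the stability property the commit message calls out.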

Lifecycle:
  * Persistency.new is private; Persistency.instance is the only public
    constructor. Same rootDir is idempotent; conflicting rootDir is
    peInvalidArgument. Persistency.reset for test/restart paths.
  * openJob opens-or-creates the per-job SQLite file; an existing file
    is reused with its data preserved.
  * Teardown integration: Persistency.instance registers a Teardown
    MultiRequestBroker provider that closes all jobs and clears the
    singleton slot when Waku.stop() issues Teardown.request.

Internal layering:
  types.nim          pure value types (Key, KeyRange, KvRow, TxOp,
                     PersistencyError)
  keys.nim           encodePart primitives + key(...) macro
  payload.nim        toPayload + payload(...) macro
  schema.nim         CREATE TABLE + connection pragmas + user_version
  backend_sqlite.nim KvBackend, applyOps (single source of write SQL),
                     getOne/existsOne/deleteOne, scanRange (asc/desc,
                     half-open ranges, open-ended stop), countRange
  backend_comm.nim   EventBroker(mt) PersistEvent + 5 RequestBroker(mt)
                     declarations; encodeErr/decodeErr boundary helpers
  backend_thread.nim startStorageThread / stopStorageThread (shared
                     allocShared0 arg, cstring dbPath, atomic
                     ready/shutdown flags); per-thread provider
                     registration
  persistency.nim    Persistency + Job types, singleton state, public
                     facade
  ../requests/lifecycle_requests.nim
                     Teardown MultiRequestBroker

Tests (69 cases, all passing):
  test_keys.nim          sort-order invariants (length-prefix strings,
                         sign-flipped ints, composite tuples, prefix
                         range)
  test_backend.nim       round-trip / replace / delete-return-value /
                         batched atomicity / asc-desc-half-open-open-
                         ended scans / category isolation / batch
                         txDelete
  test_lifecycle.nim     open-or-create rootDir / non-dir collision /
                         reopen across sessions / idempotent openJob /
                         two-tenant parallel isolation / closeJob joins
                         worker / dropJob removes file / acked delete
  test_facade.nim        put-then-get / atomic batch / scanPrefix
                         asc/desc / deleteAcked hit-miss /
                         fire-and-forget delete / two-tenant facade
                         isolation
  test_encoding.nim      tuple/named-tuple/object keys, embedded Key,
                         enum encoding, field-major composite sort,
                         payload struct encoding, end-to-end struct
                         round-trip through SQLite
  test_string_lookup.nim peJobNotFound semantics / hasJob / subscript /
                         persistPut+get via id / reads short-circuit /
                         writes drop+warn / persistEncoded via id /
                         scan parity Job-ref vs id
  test_singleton.nim     idempotent same-rootDir / different-rootDir
                         rejection / no-arg instance lifecycle / reset
                         retargets / reset idempotence / Teardown.request
                         end-to-end

Prerequisite delivered in the same series: replace the in-tree broker
implementation with the external nim-brokers package; update all
broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay,
delivery_service, peer_manager, requests/*, factory/*, api tests, etc.)
to the new package API; chat2 made to compile again.

Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is
still developed side-by-side and the persistency layer is intentionally
SDS-agnostic.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* persistency: pin nim-brokers by URL+commit (workaround for stale registry)

The bare `brokers >= 2.0.1` form cannot resolve on machines where the
local nimble SAT solver enumerates only the registry-recorded 0.1.0 for
brokers. The nim-lang/packages entry for `brokers` carries no per-tag
metadata (only the URL), so until that registry entry is refreshed the
SAT solver clamps the available-versions list to 0.1.0 and rejects the
>= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1
cloned locally.

Pinning by URL+commit bypasses the registry path entirely. Inline
comment in waku.nimble documents the situation and the path back to
the bare form once nim-lang/packages is updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* persistency: nph format pass

Run `nph` on all 57 Nim files touched by this PR. Pure formatting:
17 files re-styled, no semantic change. Suite still 69/69.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Fix build, add local-storage-path config, lazy init of Persistency from Waku start

* fix: fix nix deps

* fixes for nix build, regenerate deps

* reverting accidental dependency changes

* Fixing deps

* Apply suggestions from code review

Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>

* persistency tests: migrate to suite / asyncTest / await

Match the in-tree test convention (procSuite -> suite, sync test +
waitFor -> asyncTest + await):

- procSuite "X": -> suite "X":
- For tests doing async work: test -> asyncTest, waitFor -> await.
- Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim,
  proc waitUntilExists(...) in test_facade.nim and
  test_string_lookup.nim) -> Future[bool] {.async.}, internal
  `waitFor X` -> `await X`, internal `sleep(N)` ->
  `await sleepAsync(chronos.milliseconds(N))`.
- Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)`
  -> `pollExists(t: Job, ...)`; the previous name shadowed
  chronos.waitFor in the chronos macro expansion.
- `chronos.milliseconds(N)` explicitly qualified because `std/times`
  also exports `milliseconds` (returning TimeInterval, not Duration).
- `check await x` -> `let okN = await x; check okN` to dodge chronos's
  "yield in expr not lowered" with await-as-macro-argument.
- `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the
  same reason.

waku/persistency/persistency.nim: nph also pulled the proc signatures
across multiple lines; restored explicit `Future[void] {.async.}`
return types after the colon (an intermediate nph pass had elided them).

Suite: 71 / 71 OK against the new async write surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* use idiomatic valueOr instead of ifs

* Reworked persistency shutdown, removed unnecessary teardown mechanism

* Use const for DefaultStoragePath

* format to follow coding guidelines (no use of `result`, explicit returns); no functional change

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-16 00:09:07 +02:00
Ivan FB
34c197c5cd
avoid keeping delivery tasks in propagated state when check store is disabled (#3843) 2026-05-15 17:39:38 +02:00
Fabiana Cecin
cb35b59f95
stop recv_service from delivering messages on unsubscribed topics for store-recovered messages (#3874)
* fix/harden recv_service so it won't deliver messages on unsubscribed content topics
* fix SubscriptionManager's subscribed-content-topics iterator
* fix broken store-message-receive test
* misc cleanups
2026-05-13 12:09:56 -03:00
Ivan FB
f23983f488
ensure peers are retrieved in random order from peer store (#3860) 2026-05-13 15:29:11 +02:00
Ivan FB
3c98aa7fac
Merge pull request #3875 from logos-messaging/update_master_from_v0.38 2026-05-13 13:01:52 +02:00
darshankabariya
a537c85594
Merge branch 'master' into update_master_from_v0.38 2026-05-13 16:25:35 +05:30
Fabiana Cecin
71a369ffad
feat: allow a port value of zero for service ports (auto-assign port) (#3828)
* any port set to 0 in the config is bound to a random free port
* Debug API MyBoundPorts reports actually bound ports for all services, reports 0 if disabled
* write back bound values to both WakuConf and WakuNode.ports
* setupDiscoveryV5 returns Result and errors out on port 0
* rename setupAndStartDiscv5WithAutoPort to setupAndStartDiscv5
* updateWaku ENR rebuild now runs after discv5 startup
* Add DefaultP2pTcpPort, DefaultDiscv5UdpPort, DefaultWebSocketPort, DefaultRestPort, DefaultMetricsHttpPort
* add tests
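
The port-0 behaviour described above relies on the operating system picking a free port at bind time. A minimal Python sketch of that mechanism follows, assuming a plain TCP listener on the loopback interface; `bind_auto_port` is a hypothetical helper and does not correspond to any function in this repository.

```python
import socket


def bind_auto_port(host="127.0.0.1", port=0):
    """Bind a TCP listener; port 0 asks the OS to choose a free port.

    Returns the socket and the port that was actually bound, which is the
    value a caller would write back into its configuration.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    bound_port = sock.getsockname()[1]  # real port, never 0 after bind
    return sock, bound_port
```

Reading the port back with `getsockname()` after binding is what makes it possible to report the actually bound value instead of the configured 0.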
2026-05-11 15:22:22 -03:00
Darshan
a62ab1e7b1
chore: add nim-sds (no runtime integration yet) (#3820) 2026-05-11 15:32:25 +02:00
Ivan FB
fb30109a22
update changelog for v0.38.1
Co-authored-by: Copilot <copilot@github.com>
v0.38.1 v0.38.1-rc.0
2026-05-07 18:19:34 +02:00
Ivan FB
35da224d5d
Evict peer instead of abrupt disconnect and avoid sending unnecessary store requests (#3857)
* peer manager no longer abruptly disconnects ongoing service peer streams
* fix: recv_service delivers store-recovered messages (#3805)
* recv_service now delivers store-recovered messages via MessageReceivedEvent
2026-05-07 17:28:30 +02:00
Ivan FB
27ae07adaa
receive_service: ensure fetch msgs query is performed when missing msg (#3849) 2026-05-06 19:58:19 +02:00
NagyZoltanPeter
75864a705e
Fix websock nimble dependency version restriction to match lock file. (#3829) 2026-04-30 14:20:11 +02:00
Ivan FB
587014e34f
add event_loop_accumulates_lag_secs (#3833) 2026-04-30 00:27:38 +02:00
NagyZoltanPeter
300f584efc
Removed duplicates of announcedAddresses, extMultiaddresses (#3831)
Remove duplicate multiaddresses from the ENR.
Build the ENR record safely.

Co-authored-by: Copilot <copilot@github.com>
2026-04-29 15:10:21 +02:00
osmaczko
5034086fef
Chore/make nix build phase configurable (#3826)
* nix: parameterize build flags with named args

Expose `enablePostgres`, `enableNimDebugDlOpen`, and `chroniclesLogLevel`
as arguments on `nix/default.nix`. Defaults preserve today's hardcoded
behavior, so `nix build .#liblogosdelivery` with no overrides is a
no-op change.

Consume the package via `callPackage` in `flake.nix` so consumers can
use `.override { ... }` without extra wrapping.

* nix: link libstdc++ on Linux so consumers don't need patchelf

Append `stdenv.cc.cc.lib` to `buildInputs` on Linux and add `-lstdc++`
to the Nim `--passL` flags. Nix stdenv's fixupPhase will auto-inject
`${stdenv.cc.cc.lib}/lib` into the output's RUNPATH, so downstream
consumers can drop their patchelf step.

macOS resolves the C++ stdlib via dyld/libc++ and is unaffected.

* nix: bundle librln into the output for a self-contained package

Copy the librln shared library (`librln.so` / `librln.dylib`) from the
zerokit input into `$out/lib` and rewrite the internal reference in
`liblogosdelivery`:

- Darwin: set librln's install name to `@rpath/librln.dylib`, change the
  consumer's reference to match, and add `@loader_path` as an rpath.
- Linux: add `$ORIGIN` to the rpath so `librln.so` resolves from the
  sibling directory, preserving the gcc-lib entry injected by the stdenv
  fixupPhase for libstdc++.

The installed `liblogosdelivery` no longer carries a `/nix/store/...`
absolute path to zerokit, so downstream consumers can ship the bundle
as-is.
2026-04-27 12:51:39 +02:00
Darshan
324048430b
fix: restore -d:postgres in nimble task and propagate NIMFLAGS (#3830) 2026-04-25 00:03:46 +05:30
Fabiana Cecin
ff98d85313
fix: relay validator registration and sync filter (#3823)
* reuse stored validator in relay
* fix skip check in store sync
* increase sync tolerance in test (matches similar test)
2026-04-23 21:02:34 +02:00
NagyZoltanPeter
820ccc6e10
Add CI support for liblogosdelivery, build and artifacts (#3746) 2026-04-23 18:24:55 +02:00
Fabiana Cecin
bb8a7e8782
Fix redundant start/stop calls (#3817)
* remove redundant proto start/stop calls from node start/stop
* fix WakuRelay start/stop not overriding GossipSub start/stop
* replace startRelay with reconnectRelayPeers
2026-04-22 09:52:57 -03:00
Igor Sirotin
4394843299
fix: make update and wakunode2 build on arm64 after Nimble migration (#3814)
Rebuild nat libs (miniupnpc, libnatpmp) for the host architecture during
nimble deps setup. The prebuilt libs from the nimble cache are x86_64 and
fail to link on arm64 (Apple Silicon).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 23:20:53 +02:00
Ivan FB
260def68ad
use EWMA to show main loop lag information (#3808) 2026-04-20 18:05:44 +02:00
Ivan FB
cda0197168
use nimble 0.22.3 and more appropriate nimble.lock (#3809) 2026-04-20 13:54:34 +02:00
Fabiana Cecin
9cbb4e7338
fix: prefer --num-shards-in-network over preset (#3816)
* fill numShardsInCluster from preset when builder slot is none
* add regression tests
2026-04-20 13:48:27 +02:00
Fabiana Cecin
9ae108b4a7
Fix peer stats endpoint (#3815) 2026-04-20 08:16:01 -03:00
Fabiana Cecin
ca4dbb19e0
Improve logging of content topic on server (#3818) 2026-04-20 13:05:54 +02:00
NagyZoltanPeter
509c875533
chore: enable postgres support in nix liblogosdelivery build (#3813)
Add -d:postgres and -d:nimDebugDlOpen to both the dynamic and static
nim c invocations in nix/default.nix, matching the POSTGRES=1 flag
already used in the Make-based build path.
2026-04-15 16:12:52 +02:00
Darshan
ecd3758580
Merge pull request #3760 from logos-messaging/release/v0.38 2026-04-14 18:17:49 +05:30
darshankabariya
04b8e8c2a8
chore: update remaining changelog 2026-04-14 18:07:02 +05:30
Gabriel Cruz
166dc69c39
chore: bump nim-jwt version (#3812) 2026-04-13 21:44:30 +02:00
darshankabariya
a4db8895e4
chore: resolving lint 2026-04-10 17:03:25 +05:30
Fabiana Cecin
c04df751db
Fix BearSSL and NAT lib build reproducibility (#3806)
* pass -mssse3 on x86_64 to BearSSL and NAT C lib builds
* add BearSSL.mk and Nat.mk to nimbledeps cache key
2026-04-10 07:38:02 -03:00
Fabiana Cecin
494ea94606
fix: recv_service delivers store-recovered messages (#3805)
* recv_service now delivers store-recovered messages via MessageReceivedEvent
* add regression test_api_receive to prove store recovery actually delivers messages
* fix confusing "UNSUCCESSFUL / Missed message" log message
* removed some dead/duplicated code

Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
Co-authored by Zoltan Nagy
2026-04-09 14:29:17 -03:00
Ivan FB
ca7ec3de05
add main loop lag monitor (#3803)
* add loop lagging as health status
2026-04-09 16:51:46 +02:00
Ivan FB
4d314b376d
setting num-shards-in-network to 0 by default (#3748)
Co-authored-by: darshankabariya <darshan@status.im>
v0.38.0
2026-04-09 11:52:48 +05:30
NagyZoltanPeter
5503529531
chore: Add pre-check of options used in config Json for liblogosdelivery pre-createNode, treat unrecognized options as error (#3801)
* Add pre-check of options used in config Json for logos-delivery-api pre-createNode, treat unrecognized options as error
* Collect all unrecognized options and report them at once.
* Refactor json config parsing and error detection
2026-04-09 07:17:17 +02:00
Ivan FB
59bd365c16
setting num-shards-in-network to 0 by default (#3748)
Co-authored-by: darshankabariya <darshan@status.im>
2026-04-08 15:33:16 +02:00
Ivan FB
f5762af4c4
Start using nimble and deprecate vendor dependencies (#3798)
Co-authored-by: NagyZoltanPeter <113987313+NagyZoltanPeter@users.noreply.github.com>
Co-authored-by: Darshan K <35736874+darshankabariya@users.noreply.github.com>
2026-04-08 12:42:14 +02:00
darshankabariya
b2e46b6e91
Merge branch 'master' into release/v0.38 v0.38.0-rc.0 2026-04-08 00:55:39 +05:30
Darshan
9a344553e7
chore: update master changelog after v0.37.4 (#3802) 2026-04-07 18:00:57 +02:00
Danish Arora
549bf8bc43
fix(nix): fetch git submodules automatically via inputs.self (#3738)
The Nix build fails when consumers use `nix build github:logos-messaging/logos-delivery#liblogosdelivery`
without appending `?submodules=1` — vendor/nimbus-build-system is missing,
causing patchShebangs and substituteInPlace to fail.

Two fixes:
1. Add `inputs.self.submodules = true` to flake.nix (Nix >= 2.27) so
   submodules are fetched automatically without requiring callers to
   pass `?submodules=1`.
2. Fix the assertion in nix/default.nix: `(src.submodules or true)`
   always evaluates to true, silently masking the missing-submodules
   error. Changed to `builtins.pathExists` check on the actual
   submodule directory so it fails with a helpful message when
   submodules are genuinely absent.
2026-04-07 13:14:32 +05:30
Fabiana Cecin
56359e49ed
prefer reusing service peers across shards in edge filter reconciliation (#3789)
* selectFilterCandidates prefers peers already serving other shards
* restructure edgeFilterSubLoop (plan all dials then execute) for safety
2026-04-06 11:08:47 -03:00
Darshan
39719e1247
increase default timeout to 20s and add debug logging (#3792) 2026-04-06 15:53:45 +05:30
Fabiana Cecin
b0c0e0b637
chore: optimize release builds for speed (#3735) (#3777)
* Add -flto (lto_incremental, link-time optimization) for release builds
* Add -s (strip symbols) for release builds
* Switch library builds from --opt:size to --opt:speed
* Change -d:marchOptimized to x86-64-v2 target from broadwell
* Remove obsolete chronicles_colors=off for Windows
* Remove obsolete withoutPCRE define
2026-04-02 12:10:02 +02:00
Fabiana Cecin
dc026bbff1
feat: active filter subscription management for edge nodes (#3773)

## Subscription Manager
* edgeFilterSubLoop reconciles desired vs actual filter subscriptions
* edgeFilterHealthLoop pings filter peers, evicts stale ones
* EdgeFilterSubState per-shard tracking of confirmed peers and health
* best-effort unsubscribe on peer removal
* RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers

## WakuNode
* Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth)
* Register MessageSeenEvent push handler on filter client during start
* startDeliveryService now returns `Result[void, string]` and propagates errors

## Health Monitor
* getFilterClientHealth queries RequestEdgeFilterPeerCount via broker
* Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive
* Listen to EventShardTopicHealthChange for health recalculation
* Add missing return p.notReady() on failed edge filter peer count request
* HealthyThreshold constant moved to `connection_status.nim`

## Broker types
* RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types
* EventShardTopicHealthChange event type

## Filter Client
* Add timeout parameter to ping proc

## Tests
* Health monitor event tests with per-node lockNewGlobalBrokerContext
* Edge (light client) health update test
* Edge health driven by confirmed filter subscriptions test
* API subscription tests: sub/receive, failover, peer replacement

Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
Ivan FB
0623c10635
completely remove storev2 (#3781) 2026-03-30 00:08:08 +02:00
Ivan FB
5c335c2002
address leftover comments (#3782) 2026-03-27 13:55:27 +01:00
Ivan FB
b1e1c87534
update changelog for v0.37.3 (#3783)
Co-authored-by: darshankabariya <darshan@status.im>
2026-03-27 13:54:06 +01:00
Ivan FB
0b86093247
allow override user-message-rate-limit (#3778) 2026-03-25 13:23:20 +01:00
Ivan Folgueira Bande
6749144739
update change log for v0.37.2 2026-03-24 12:03:59 +01:00
Ivan Folgueira Bande
37f587f057
set default retention policy in archive.nim 2026-03-24 12:03:21 +01:00