* persistency: per-job SQLite-backed storage layer (singleton, brokered)
Adds a backend-neutral CRUD library at waku/persistency/, plus the
nim-brokers dependency swap that enables it.
Architecture (ports-and-adapters):
* Persistency: process-wide singleton, one root directory.
* Job: one tenant, one DB file, one worker thread, one BrokerContext.
* Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema
kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key)
WITHOUT ROWID, WAL mode.
* Writes are fire-and-forget via EventBroker(mt) PersistEvent.
* Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists,
KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError].
* One storage thread per job; tenants isolated by BrokerContext.
Public surface (waku/persistency/persistency.nim):
Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset()
p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close()
p.job(id) / p[id] / p.hasJob(id)
Writes (Job form & string-id form, fire-and-forget):
persist / persistPut / persistDelete / persistEncoded
Reads (Job form & string-id form, async Result):
get / exists / scan / scanPrefix / count / deleteAcked
Key & payload encoding (keys.nim, payload.nim):
* encodePart family + variadic key(...) / payload(...) macros +
single-value toKey / toPayload.
* Primitives: string and openArray[byte] are 2-byte BE length + bytes;
int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE;
bool/byte/char are 1 byte; enums are int64(ord(v)).
* Generic encodePart[T: tuple | object] recurses through fields() so
any composite Nim type is encodable without ceremony.
* Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no
cast on pointers, no host-endianness dependency.
* `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers
bypass the built-in encoder with their own format (CBOR, protobuf...).
Lifecycle:
* Persistency.new is private; Persistency.instance is the only public
constructor. Same rootDir is idempotent; conflicting rootDir is
peInvalidArgument. Persistency.reset for test/restart paths.
* openJob opens-or-creates the per-job SQLite file; an existing file
is reused with its data preserved.
* Teardown integration: Persistency.instance registers a Teardown
MultiRequestBroker provider that closes all jobs and clears the
singleton slot when Waku.stop() issues Teardown.request.
Internal layering:
types.nim pure value types (Key, KeyRange, KvRow, TxOp,
PersistencyError)
keys.nim encodePart primitives + key(...) macro
payload.nim toPayload + payload(...) macro
schema.nim CREATE TABLE + connection pragmas + user_version
backend_sqlite.nim KvBackend, applyOps (single source of write SQL),
getOne/existsOne/deleteOne, scanRange (asc/desc,
half-open ranges, open-ended stop), countRange
backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt)
declarations; encodeErr/decodeErr boundary helpers
backend_thread.nim startStorageThread / stopStorageThread (shared
allocShared0 arg, cstring dbPath, atomic
ready/shutdown flags); per-thread provider
registration
persistency.nim Persistency + Job types, singleton state, public
facade
../requests/lifecycle_requests.nim
Teardown MultiRequestBroker
Tests (69 cases, all passing):
test_keys.nim sort-order invariants (length-prefix strings,
sign-flipped ints, composite tuples, prefix
range)
test_backend.nim round-trip / replace / delete-return-value /
batched atomicity / asc-desc-half-open-open-
ended scans / category isolation / batch
txDelete
test_lifecycle.nim open-or-create rootDir / non-dir collision /
reopen across sessions / idempotent openJob /
two-tenant parallel isolation / closeJob joins
worker / dropJob removes file / acked delete
test_facade.nim put-then-get / atomic batch / scanPrefix
asc/desc / deleteAcked hit-miss /
fire-and-forget delete / two-tenant facade
isolation
test_encoding.nim tuple/named-tuple/object keys, embedded Key,
enum encoding, field-major composite sort,
payload struct encoding, end-to-end struct
round-trip through SQLite
test_string_lookup.nim peJobNotFound semantics / hasJob / subscript /
persistPut+get via id / reads short-circuit /
writes drop+warn / persistEncoded via id /
scan parity Job-ref vs id
test_singleton.nim idempotent same-rootDir / different-rootDir
rejection / no-arg instance lifecycle / reset
retargets / reset idempotence / Teardown.request
end-to-end
Prerequisite delivered in the same series: replace the in-tree broker
implementation with the external nim-brokers package; update all
broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay,
delivery_service, peer_manager, requests/*, factory/*, api tests, etc.)
to the new package API; chat2 made to compile again.
Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is
still developed side-by-side and the persistency layer is intentionally
SDS-agnostic.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* persistency: pin nim-brokers by URL+commit (workaround for stale registry)
The bare `brokers >= 2.0.1` form cannot resolve on machines where the
local nimble SAT solver enumerates only the registry-recorded 0.1.0 for
brokers. The nim-lang/packages entry for `brokers` carries no per-tag
metadata (only the URL), so until that registry entry is refreshed the
SAT solver clamps the available-versions list to 0.1.0 and rejects the
>= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1
cloned locally.
Pinning by URL+commit bypasses the registry path entirely. Inline
comment in waku.nimble documents the situation and the path back to
the bare form once nim-lang/packages is updated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* persistency: nph format pass
Run `nph` on all 57 Nim files touched by this PR. Pure formatting:
17 files re-styled, no semantic change. Suite still 69/69.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Fix build, add local-storage-path config, lazy init of Persistency from Waku start
* fix: fix nix deps
* fixes for nix build, regenerate deps
* reverting accidental dependency changes
* Fixing deps
* Apply suggestions from code review
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
* persistency tests: migrate to suite / asyncTest / await
Match the in-tree test convention (procSuite -> suite, sync test +
waitFor -> asyncTest + await):
- procSuite "X": -> suite "X":
- For tests doing async work: test -> asyncTest, waitFor -> await.
- Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim,
proc waitUntilExists(...) in test_facade.nim and
test_string_lookup.nim) -> Future[bool] {.async.}, internal
`waitFor X` -> `await X`, internal `sleep(N)` ->
`await sleepAsync(chronos.milliseconds(N))`.
- Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)`
-> `pollExists(t: Job, ...)`; the previous name shadowed
chronos.waitFor in the chronos macro expansion.
- `chronos.milliseconds(N)` explicitly qualified because `std/times`
also exports `milliseconds` (returning TimeInterval, not Duration).
- `check await x` -> `let okN = await x; check okN` to dodge chronos's
"yield in expr not lowered" with await-as-macro-argument.
- `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the
same reason.
waku/persistency/persistency.nim: nph also pulled the proc signatures
across multiple lines; restored explicit `Future[void] {.async.}`
return types after the colon (an intermediate nph pass had elided them).
Suite: 71 / 71 OK against the new async write surface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* use idiomatic valueOr instead of ifs
* Reworked persistency shutdown, remove not necessary teardown mechanism
* Use const for DefaultStoragePath
* format to follow coding guidelines - no use of result and explicit returns - no functional change
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
feat: active filter subscription management for edge nodes
## Subscription Manager
* edgeFilterSubLoop reconciles desired vs actual filter subscriptions
* edgeFilterHealthLoop pings filter peers, evicts stale ones
* EdgeFilterSubState per-shard tracking of confirmed peers and health
* best-effort unsubscribe on peer removal
* RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers
## WakuNode
* Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth)
* Register MessageSeenEvent push handler on filter client during start
* startDeliveryService now returns `Result[void, string]` and propagates errors
## Health Monitor
* getFilterClientHealth queries RequestEdgeFilterPeerCount via broker
* Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive
* Listen to EventShardTopicHealthChange for health recalculation
* Add missing return p.notReady() on failed edge filter peer count request
* HealthyThreshold constant moved to `connection_status.nim`
## Broker types
* RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types
* EventShardTopicHealthChange event type
## Filter Client
* Add timeout parameter to ping proc
## Tests
* Health monitor event tests with per-node lockNewGlobalBrokerContext
* Edge (light client) health update test
* Edge health driven by confirmed filter subscriptions test
* API subscription tests: sub/receive, failover, peer replacement
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
Co-authored by Zoltan Nagy
* Fix peer selection for cases where ENR is not yet advertiesed but metadata exchange already adjusted supported shards. Fix initialization rendezvous protocol with configured and autoshards to let connect to relay nodes without having a valid subscribed shard already. This solves issue for autoshard nodes to connect ahead of subscribing.
* Extend peer selection, rendezvous and metadata tests
* Fix rendezvous test, fix metadata test failing due wrong setup, added it into all_tests
* Adapt using chronos' TokenBucket. Removed TokenBucket and test. bump nim-chronos -> nim-libp2p/nim-lsquic/nim-jwt -> adapt to latest libp2p changes
* Fix libp2p/utility reports unlisted exception can occure from close of socket in waitForService - -d:ssl compile flag caused it
* Adapt request_limiter to new chronos' TokenBucket replenish algorithm to keep original intent of use
* Fix filter dos protection test
* Fix peer manager tests due change caused by new libp2p
* Adjust store test rate limit to eliminate CI test flakyness of timing
* Adjust store test rate limit to eliminate CI test flakyness of timing - lightpush/legacy_lightpush/filter
* Rework filter dos protection test to avoid CI crazy timing causing flakyness in test results compared to local runs
* Rework lightpush dos protection test to avoid CI crazy timing causing flakyness in test results compared to local runs
* Rework lightpush and legacy lightpush rate limit tests to eliminate timing effect in CI that cause longer awaits thus result in minting new tokens unlike local runs
* update rendezvous to work with WakuPeeRecord and use libp2p updated version
* split rendezvous client and service implementation
* mount rendezvous client by default
* fix: admin API peer shards field from metadata protocol
Store and return peer shard info from metadata protocol exchange instead of only checking ENR records.
* peer_manager set shard info and extend rest test to validate it
Co-authored-by: MorganaFuture <andrewmochalskyi@gmail.com>
* properly pass userMessageLimit to OnchainGroupManager
* waku.nimble 2.2.4 Nim compiler
* rm stew/shims/net import
* change ValidIpAddress.init with parseIpAddress
* fix serialize for zerokit
* group_manager: separate if statements
* protocol_types: add encode UInt32 with zeros up to 32 bytes
* windows build: skip libunwind build and rm libunwind.a inlcusion step
* bump nph to overcome the compilation issues with 2.2.x
* bump nim-libp2p to v1.10.1
* fix: addPeer could unintentionally override metadata of previously stored peer with defaults and empty
* Add explanation why we discard updates of different peerStore books.
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
---------
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
* Extended /admin/v1 RESP API with different option to look at current connected/relay/mesh state of the node
* Added score information for peer info retrievals
* add new unit test to validate that any peer can be retrieved
* add new discv5 test and better peer store management
* wakuPeerStore -> switch.peerStore
* simplify waku_peer_store, better logs and peer_manager enhancements
Better control when the remote peer closes the WakuFilterPushCodec
stream.
For example, go-waku closes the stream for every received message.
On the other hand, js-waku keeps the stream opened.
Therefore, we support both scenarios.
* waku/waku_filter_v2/protocol.nim keeps track of the filter-client connections in Table[PeerId, Connection]
* waku/waku_filter_v2/protocol.nim starts listening for peer-left events in order to completely remove the previous Connection instance. Also, a new Connection is added when the filter-service starts publishing to its peers.
---------
Co-authored-by: NagyZoltanPeter <113987313+NagyZoltanPeter@users.noreply.github.com>