logos-messaging-nim/tests/node/test_wakunode_health_monitor.nim

435 lines
16 KiB
Nim
Raw Normal View History

{.used.}
import
std/[json, options, sequtils, strutils, tables], testutils/unittests, chronos, results
feat: persistency (#3880) * persistency: per-job SQLite-backed storage layer (singleton, brokered) Adds a backend-neutral CRUD library at waku/persistency/, plus the nim-brokers dependency swap that enables it. Architecture (ports-and-adapters): * Persistency: process-wide singleton, one root directory. * Job: one tenant, one DB file, one worker thread, one BrokerContext. * Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key) WITHOUT ROWID, WAL mode. * Writes are fire-and-forget via EventBroker(mt) PersistEvent. * Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists, KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError]. * One storage thread per job; tenants isolated by BrokerContext. Public surface (waku/persistency/persistency.nim): Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset() p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close() p.job(id) / p[id] / p.hasJob(id) Writes (Job form & string-id form, fire-and-forget): persist / persistPut / persistDelete / persistEncoded Reads (Job form & string-id form, async Result): get / exists / scan / scanPrefix / count / deleteAcked Key & payload encoding (keys.nim, payload.nim): * encodePart family + variadic key(...) / payload(...) macros + single-value toKey / toPayload. * Primitives: string and openArray[byte] are 2-byte BE length + bytes; int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE; bool/byte/char are 1 byte; enums are int64(ord(v)). * Generic encodePart[T: tuple | object] recurses through fields() so any composite Nim type is encodable without ceremony. * Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no cast on pointers, no host-endianness dependency. * `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers bypass the built-in encoder with their own format (CBOR, protobuf...). Lifecycle: * Persistency.new is private; Persistency.instance is the only public constructor. Same rootDir is idempotent; conflicting rootDir is peInvalidArgument. Persistency.reset for test/restart paths. * openJob opens-or-creates the per-job SQLite file; an existing file is reused with its data preserved. * Teardown integration: Persistency.instance registers a Teardown MultiRequestBroker provider that closes all jobs and clears the singleton slot when Waku.stop() issues Teardown.request. Internal layering: types.nim pure value types (Key, KeyRange, KvRow, TxOp, PersistencyError) keys.nim encodePart primitives + key(...) macro payload.nim toPayload + payload(...) macro schema.nim CREATE TABLE + connection pragmas + user_version backend_sqlite.nim KvBackend, applyOps (single source of write SQL), getOne/existsOne/deleteOne, scanRange (asc/desc, half-open ranges, open-ended stop), countRange backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt) declarations; encodeErr/decodeErr boundary helpers backend_thread.nim startStorageThread / stopStorageThread (shared allocShared0 arg, cstring dbPath, atomic ready/shutdown flags); per-thread provider registration persistency.nim Persistency + Job types, singleton state, public facade ../requests/lifecycle_requests.nim Teardown MultiRequestBroker Tests (69 cases, all passing): test_keys.nim sort-order invariants (length-prefix strings, sign-flipped ints, composite tuples, prefix range) test_backend.nim round-trip / replace / delete-return-value / batched atomicity / asc-desc-half-open-open- ended scans / category isolation / batch txDelete test_lifecycle.nim open-or-create rootDir / non-dir collision / reopen across sessions / idempotent openJob / two-tenant parallel isolation / closeJob joins worker / dropJob removes file / acked delete test_facade.nim put-then-get / atomic batch / scanPrefix asc/desc / deleteAcked hit-miss / fire-and-forget delete / two-tenant facade isolation test_encoding.nim tuple/named-tuple/object keys, embedded Key, enum encoding, field-major composite sort, payload struct encoding, end-to-end struct round-trip through SQLite test_string_lookup.nim peJobNotFound semantics / hasJob / subscript / persistPut+get via id / reads short-circuit / writes drop+warn / persistEncoded via id / scan parity Job-ref vs id test_singleton.nim idempotent same-rootDir / different-rootDir rejection / no-arg instance lifecycle / reset retargets / reset idempotence / Teardown.request end-to-end Prerequisite delivered in the same series: replace the in-tree broker implementation with the external nim-brokers package; update all broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay, delivery_service, peer_manager, requests/*, factory/*, api tests, etc.) to the new package API; chat2 made to compile again. Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is still developed side-by-side and the persistency layer is intentionally SDS-agnostic. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: pin nim-brokers by URL+commit (workaround for stale registry) The bare `brokers >= 2.0.1` form cannot resolve on machines where the local nimble SAT solver enumerates only the registry-recorded 0.1.0 for brokers. The nim-lang/packages entry for `brokers` carries no per-tag metadata (only the URL), so until that registry entry is refreshed the SAT solver clamps the available-versions list to 0.1.0 and rejects the >= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1 cloned locally. Pinning by URL+commit bypasses the registry path entirely. Inline comment in waku.nimble documents the situation and the path back to the bare form once nim-lang/packages is updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: nph format pass Run `nph` on all 57 Nim files touched by this PR. Pure formatting: 17 files re-styled, no semantic change. Suite still 69/69. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix build, add local-storage-path config, lazy init of Persistency from Waku start * fix: fix nix deps * fixes for nix build, regenerate deps * reverting accidental dependency changes * Fixing deps * Apply suggestions from code review Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> * persistency tests: migrate to suite / asyncTest / await Match the in-tree test convention (procSuite -> suite, sync test + waitFor -> asyncTest + await): - procSuite "X": -> suite "X": - For tests doing async work: test -> asyncTest, waitFor -> await. - Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim, proc waitUntilExists(...) in test_facade.nim and test_string_lookup.nim) -> Future[bool] {.async.}, internal `waitFor X` -> `await X`, internal `sleep(N)` -> `await sleepAsync(chronos.milliseconds(N))`. - Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)` -> `pollExists(t: Job, ...)`; the previous name shadowed chronos.waitFor in the chronos macro expansion. - `chronos.milliseconds(N)` explicitly qualified because `std/times` also exports `milliseconds` (returning TimeInterval, not Duration). - `check await x` -> `let okN = await x; check okN` to dodge chronos's "yield in expr not lowered" with await-as-macro-argument. - `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the same reason. waku/persistency/persistency.nim: nph also pulled the proc signatures across multiple lines; restored explicit `Future[void] {.async.}` return types after the colon (an intermediate nph pass had elided them). Suite: 71 / 71 OK against the new async write surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * use idiomatic valueOr instead of ifs * Reworked persistency shutdown, remove not necessary teardown mechanism * Use const for DefaultStoragePath * format to follow coding guidelines - no use of result and explicit returns - no functional change --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-16 00:09:07 +02:00
import brokers/broker_context
import
waku/[
waku_core,
common/waku_protocol,
node/waku_node,
node/peer_manager,
node/health_monitor/health_status,
node/health_monitor/connection_status,
node/health_monitor/protocol_health,
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
node/health_monitor/topic_health,
node/health_monitor/node_health_monitor,
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
node/delivery_service/delivery_service,
node/delivery_service/subscription_manager,
node/kernel_api/relay,
node/kernel_api/store,
node/kernel_api/lightpush,
node/kernel_api/filter,
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
events/health_events,
events/peer_events,
waku_archive,
]
import ../testlib/[wakunode, wakucore], ../waku_archive/archive_utils
const MockDLow = 4 # Mocked GossipSub DLow value
const TestConnectivityTimeLimit = 3.seconds
proc protoHealthMock(kind: WakuProtocol, health: HealthStatus): ProtocolHealth =
var ph = ProtocolHealth.init(kind)
if health == HealthStatus.READY:
return ph.ready()
else:
return ph.notReady("mock")
suite "Health Monitor - health state calculation":
test "Disconnected, zero peers":
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(RelayProtocol, HealthStatus.NOT_READY),
protoHealthMock(StoreClientProtocol, HealthStatus.NOT_READY),
protoHealthMock(FilterClientProtocol, HealthStatus.NOT_READY),
protoHealthMock(LightpushClientProtocol, HealthStatus.NOT_READY),
]
let strength = initTable[WakuProtocol, int]()
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.Disconnected
test "PartiallyConnected, weak relay":
let weakCount = MockDLow - 1
let protocols = @[protoHealthMock(RelayProtocol, HealthStatus.READY)]
var strength = initTable[WakuProtocol, int]()
strength[RelayProtocol] = weakCount
let state = calculateConnectionState(protocols, strength, some(MockDLow))
# Partially connected since relay connectivity is weak (> 0, but < dLow)
check state == ConnectionStatus.PartiallyConnected
test "Connected, robust relay":
let protocols = @[protoHealthMock(RelayProtocol, HealthStatus.READY)]
var strength = initTable[WakuProtocol, int]()
strength[RelayProtocol] = MockDLow
let state = calculateConnectionState(protocols, strength, some(MockDLow))
# Fully connected since relay connectivity is ideal (>= dLow)
check state == ConnectionStatus.Connected
test "Connected, robust edge":
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(RelayProtocol, HealthStatus.NOT_MOUNTED),
protoHealthMock(LightpushClientProtocol, HealthStatus.READY),
protoHealthMock(FilterClientProtocol, HealthStatus.READY),
protoHealthMock(StoreClientProtocol, HealthStatus.READY),
]
var strength = initTable[WakuProtocol, int]()
strength[LightpushClientProtocol] = HealthyThreshold
strength[FilterClientProtocol] = HealthyThreshold
strength[StoreClientProtocol] = HealthyThreshold
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.Connected
test "Disconnected, edge missing store":
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(LightpushClientProtocol, HealthStatus.READY),
protoHealthMock(FilterClientProtocol, HealthStatus.READY),
protoHealthMock(StoreClientProtocol, HealthStatus.NOT_READY),
]
var strength = initTable[WakuProtocol, int]()
strength[LightpushClientProtocol] = HealthyThreshold
strength[FilterClientProtocol] = HealthyThreshold
strength[StoreClientProtocol] = 0
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.Disconnected
test "PartiallyConnected, edge meets minimum failover requirement":
let weakCount = max(1, HealthyThreshold - 1)
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(LightpushClientProtocol, HealthStatus.READY),
protoHealthMock(FilterClientProtocol, HealthStatus.READY),
protoHealthMock(StoreClientProtocol, HealthStatus.READY),
]
var strength = initTable[WakuProtocol, int]()
strength[LightpushClientProtocol] = weakCount
strength[FilterClientProtocol] = weakCount
strength[StoreClientProtocol] = weakCount
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.PartiallyConnected
test "Connected, robust relay ignores store server":
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(RelayProtocol, HealthStatus.READY),
protoHealthMock(StoreProtocol, HealthStatus.READY),
]
var strength = initTable[WakuProtocol, int]()
strength[RelayProtocol] = MockDLow
strength[StoreProtocol] = 0
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.Connected
test "Connected, robust relay ignores store client":
2026-03-17 10:15:35 -03:00
let protocols = @[
protoHealthMock(RelayProtocol, HealthStatus.READY),
protoHealthMock(StoreProtocol, HealthStatus.READY),
protoHealthMock(StoreClientProtocol, HealthStatus.NOT_READY),
]
var strength = initTable[WakuProtocol, int]()
strength[RelayProtocol] = MockDLow
strength[StoreProtocol] = 0
strength[StoreClientProtocol] = 0
let state = calculateConnectionState(protocols, strength, some(MockDLow))
check state == ConnectionStatus.Connected
suite "Health Monitor - events":
asyncTest "Core (relay) health update":
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
var nodeA: WakuNode
lockNewGlobalBrokerContext:
let nodeAKey = generateSecp256k1Key()
nodeA = newTestWakuNode(nodeAKey, parseIpAddress("127.0.0.1"), Port(0))
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
(await nodeA.mountRelay()).expect("Node A failed to mount Relay")
await nodeA.start()
let monitorA = NodeHealthMonitor.new(nodeA)
var
lastStatus = ConnectionStatus.Disconnected
callbackCount = 0
healthChangeSignal = newAsyncEvent()
monitorA.onConnectionStatusChange = proc(status: ConnectionStatus) {.async.} =
lastStatus = status
callbackCount.inc()
healthChangeSignal.fire()
monitorA.startHealthMonitor().expect("Health monitor failed to start")
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
var nodeB: WakuNode
lockNewGlobalBrokerContext:
let nodeBKey = generateSecp256k1Key()
nodeB = newTestWakuNode(nodeBKey, parseIpAddress("127.0.0.1"), Port(0))
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
let driver = newSqliteArchiveDriver()
nodeB.mountArchive(driver).expect("Node B failed to mount archive")
(await nodeB.mountRelay()).expect("Node B failed to mount relay")
await nodeB.mountStore()
await nodeB.start()
await nodeA.connectToNodes(@[nodeB.switch.peerInfo.toRemotePeerInfo()])
proc dummyHandler(topic: PubsubTopic, msg: WakuMessage): Future[void] {.async.} =
discard
nodeA.subscribe((kind: PubsubSub, topic: DefaultPubsubTopic), dummyHandler).expect(
"Node A failed to subscribe"
)
nodeB.subscribe((kind: PubsubSub, topic: DefaultPubsubTopic), dummyHandler).expect(
"Node B failed to subscribe"
)
let connectTimeLimit = Moment.now() + TestConnectivityTimeLimit
var gotConnected = false
while Moment.now() < connectTimeLimit:
if lastStatus == ConnectionStatus.PartiallyConnected:
gotConnected = true
break
if await healthChangeSignal.wait().withTimeout(connectTimeLimit - Moment.now()):
healthChangeSignal.clear()
check:
gotConnected == true
callbackCount >= 1
lastStatus == ConnectionStatus.PartiallyConnected
healthChangeSignal.clear()
await nodeB.stop()
await nodeA.disconnectNode(nodeB.switch.peerInfo.toRemotePeerInfo())
let disconnectTimeLimit = Moment.now() + TestConnectivityTimeLimit
var gotDisconnected = false
while Moment.now() < disconnectTimeLimit:
if lastStatus == ConnectionStatus.Disconnected:
gotDisconnected = true
break
if await healthChangeSignal.wait().withTimeout(disconnectTimeLimit - Moment.now()):
healthChangeSignal.clear()
check:
gotDisconnected == true
await monitorA.stopHealthMonitor()
await nodeA.stop()
asyncTest "Edge (light client) health update":
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
var nodeA: WakuNode
lockNewGlobalBrokerContext:
let nodeAKey = generateSecp256k1Key()
nodeA = newTestWakuNode(nodeAKey, parseIpAddress("127.0.0.1"), Port(0))
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
nodeA.mountLightpushClient()
await nodeA.mountFilterClient()
nodeA.mountStoreClient()
require nodeA.mountAutoSharding(1, 8).isOk
nodeA.mountMetadata(1, @[0'u16]).expect("Node A failed to mount metadata")
await nodeA.start()
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
let ds =
DeliveryService.new(false, nodeA).expect("Failed to create DeliveryService")
ds.startDeliveryService().expect("Failed to start DeliveryService")
let monitorA = NodeHealthMonitor.new(nodeA)
var
lastStatus = ConnectionStatus.Disconnected
callbackCount = 0
healthChangeSignal = newAsyncEvent()
monitorA.onConnectionStatusChange = proc(status: ConnectionStatus) {.async.} =
lastStatus = status
callbackCount.inc()
healthChangeSignal.fire()
monitorA.startHealthMonitor().expect("Health monitor failed to start")
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
var nodeB: WakuNode
lockNewGlobalBrokerContext:
let nodeBKey = generateSecp256k1Key()
nodeB = newTestWakuNode(nodeBKey, parseIpAddress("127.0.0.1"), Port(0))
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
let driver = newSqliteArchiveDriver()
nodeB.mountArchive(driver).expect("Node B failed to mount archive")
(await nodeB.mountRelay()).expect("Node B failed to mount relay")
(await nodeB.mountLightpush()).expect("Node B failed to mount lightpush")
await nodeB.mountFilter()
await nodeB.mountStore()
require nodeB.mountAutoSharding(1, 8).isOk
nodeB.mountMetadata(1, toSeq(0'u16 ..< 8'u16)).expect(
"Node B failed to mount metadata"
)
await nodeB.start()
var metadataFut = newFuture[void]("waitForMetadata")
let metadataLis = WakuPeerEvent
.listen(
nodeA.brokerCtx,
proc(evt: WakuPeerEvent): Future[void] {.async: (raises: []), gcsafe.} =
if not metadataFut.finished and
evt.kind == WakuPeerEventKind.EventMetadataUpdated:
metadataFut.complete()
,
)
.expect("Failed to listen for metadata")
await nodeA.connectToNodes(@[nodeB.switch.peerInfo.toRemotePeerInfo()])
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
let metadataOk = await metadataFut.withTimeout(TestConnectivityTimeLimit)
feat: persistency (#3880) * persistency: per-job SQLite-backed storage layer (singleton, brokered) Adds a backend-neutral CRUD library at waku/persistency/, plus the nim-brokers dependency swap that enables it. Architecture (ports-and-adapters): * Persistency: process-wide singleton, one root directory. * Job: one tenant, one DB file, one worker thread, one BrokerContext. * Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key) WITHOUT ROWID, WAL mode. * Writes are fire-and-forget via EventBroker(mt) PersistEvent. * Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists, KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError]. * One storage thread per job; tenants isolated by BrokerContext. Public surface (waku/persistency/persistency.nim): Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset() p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close() p.job(id) / p[id] / p.hasJob(id) Writes (Job form & string-id form, fire-and-forget): persist / persistPut / persistDelete / persistEncoded Reads (Job form & string-id form, async Result): get / exists / scan / scanPrefix / count / deleteAcked Key & payload encoding (keys.nim, payload.nim): * encodePart family + variadic key(...) / payload(...) macros + single-value toKey / toPayload. * Primitives: string and openArray[byte] are 2-byte BE length + bytes; int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE; bool/byte/char are 1 byte; enums are int64(ord(v)). * Generic encodePart[T: tuple | object] recurses through fields() so any composite Nim type is encodable without ceremony. * Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no cast on pointers, no host-endianness dependency. * `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers bypass the built-in encoder with their own format (CBOR, protobuf...). Lifecycle: * Persistency.new is private; Persistency.instance is the only public constructor. Same rootDir is idempotent; conflicting rootDir is peInvalidArgument. Persistency.reset for test/restart paths. * openJob opens-or-creates the per-job SQLite file; an existing file is reused with its data preserved. * Teardown integration: Persistency.instance registers a Teardown MultiRequestBroker provider that closes all jobs and clears the singleton slot when Waku.stop() issues Teardown.request. Internal layering: types.nim pure value types (Key, KeyRange, KvRow, TxOp, PersistencyError) keys.nim encodePart primitives + key(...) macro payload.nim toPayload + payload(...) macro schema.nim CREATE TABLE + connection pragmas + user_version backend_sqlite.nim KvBackend, applyOps (single source of write SQL), getOne/existsOne/deleteOne, scanRange (asc/desc, half-open ranges, open-ended stop), countRange backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt) declarations; encodeErr/decodeErr boundary helpers backend_thread.nim startStorageThread / stopStorageThread (shared allocShared0 arg, cstring dbPath, atomic ready/shutdown flags); per-thread provider registration persistency.nim Persistency + Job types, singleton state, public facade ../requests/lifecycle_requests.nim Teardown MultiRequestBroker Tests (69 cases, all passing): test_keys.nim sort-order invariants (length-prefix strings, sign-flipped ints, composite tuples, prefix range) test_backend.nim round-trip / replace / delete-return-value / batched atomicity / asc-desc-half-open-open- ended scans / category isolation / batch txDelete test_lifecycle.nim open-or-create rootDir / non-dir collision / reopen across sessions / idempotent openJob / two-tenant parallel isolation / closeJob joins worker / dropJob removes file / acked delete test_facade.nim put-then-get / atomic batch / scanPrefix asc/desc / deleteAcked hit-miss / fire-and-forget delete / two-tenant facade isolation test_encoding.nim tuple/named-tuple/object keys, embedded Key, enum encoding, field-major composite sort, payload struct encoding, end-to-end struct round-trip through SQLite test_string_lookup.nim peJobNotFound semantics / hasJob / subscript / persistPut+get via id / reads short-circuit / writes drop+warn / persistEncoded via id / scan parity Job-ref vs id test_singleton.nim idempotent same-rootDir / different-rootDir rejection / no-arg instance lifecycle / reset retargets / reset idempotence / Teardown.request end-to-end Prerequisite delivered in the same series: replace the in-tree broker implementation with the external nim-brokers package; update all broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay, delivery_service, peer_manager, requests/*, factory/*, api tests, etc.) to the new package API; chat2 made to compile again. Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is still developed side-by-side and the persistency layer is intentionally SDS-agnostic. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: pin nim-brokers by URL+commit (workaround for stale registry) The bare `brokers >= 2.0.1` form cannot resolve on machines where the local nimble SAT solver enumerates only the registry-recorded 0.1.0 for brokers. The nim-lang/packages entry for `brokers` carries no per-tag metadata (only the URL), so until that registry entry is refreshed the SAT solver clamps the available-versions list to 0.1.0 and rejects the >= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1 cloned locally. Pinning by URL+commit bypasses the registry path entirely. Inline comment in waku.nimble documents the situation and the path back to the bare form once nim-lang/packages is updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: nph format pass Run `nph` on all 57 Nim files touched by this PR. Pure formatting: 17 files re-styled, no semantic change. Suite still 69/69. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix build, add local-storage-path config, lazy init of Persistency from Waku start * fix: fix nix deps * fixes for nix build, regenerate deps * reverting accidental dependency changes * Fixing deps * Apply suggestions from code review Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> * persistency tests: migrate to suite / asyncTest / await Match the in-tree test convention (procSuite -> suite, sync test + waitFor -> asyncTest + await): - procSuite "X": -> suite "X": - For tests doing async work: test -> asyncTest, waitFor -> await. - Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim, proc waitUntilExists(...) in test_facade.nim and test_string_lookup.nim) -> Future[bool] {.async.}, internal `waitFor X` -> `await X`, internal `sleep(N)` -> `await sleepAsync(chronos.milliseconds(N))`. - Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)` -> `pollExists(t: Job, ...)`; the previous name shadowed chronos.waitFor in the chronos macro expansion. - `chronos.milliseconds(N)` explicitly qualified because `std/times` also exports `milliseconds` (returning TimeInterval, not Duration). - `check await x` -> `let okN = await x; check okN` to dodge chronos's "yield in expr not lowered" with await-as-macro-argument. - `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the same reason. waku/persistency/persistency.nim: nph also pulled the proc signatures across multiple lines; restored explicit `Future[void] {.async.}` return types after the colon (an intermediate nph pass had elided them). Suite: 71 / 71 OK against the new async write surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * use idiomatic valueOr instead of ifs * Reworked persistency shutdown, remove not necessary teardown mechanism * Use const for DefaultStoragePath * format to follow coding guidelines - no use of result and explicit returns - no functional change --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-16 00:09:07 +02:00
await WakuPeerEvent.dropListener(nodeA.brokerCtx, metadataLis)
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
require metadataOk
let connectTimeLimit = Moment.now() + TestConnectivityTimeLimit
var gotConnected = false
while Moment.now() < connectTimeLimit:
if lastStatus == ConnectionStatus.PartiallyConnected:
gotConnected = true
break
if await healthChangeSignal.wait().withTimeout(connectTimeLimit - Moment.now()):
healthChangeSignal.clear()
check:
gotConnected == true
callbackCount >= 1
lastStatus == ConnectionStatus.PartiallyConnected
healthChangeSignal.clear()
await nodeB.stop()
await nodeA.disconnectNode(nodeB.switch.peerInfo.toRemotePeerInfo())
let disconnectTimeLimit = Moment.now() + TestConnectivityTimeLimit
var gotDisconnected = false
while Moment.now() < disconnectTimeLimit:
if lastStatus == ConnectionStatus.Disconnected:
gotDisconnected = true
break
if await healthChangeSignal.wait().withTimeout(disconnectTimeLimit - Moment.now()):
healthChangeSignal.clear()
check:
gotDisconnected == true
lastStatus == ConnectionStatus.Disconnected
await monitorA.stopHealthMonitor()
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
await ds.stopDeliveryService()
await nodeA.stop()
asyncTest "Edge health driven by confirmed filter subscriptions":
var nodeA: WakuNode
lockNewGlobalBrokerContext:
let nodeAKey = generateSecp256k1Key()
nodeA = newTestWakuNode(nodeAKey, parseIpAddress("127.0.0.1"), Port(0))
await nodeA.mountFilterClient()
nodeA.mountLightpushClient()
nodeA.mountStoreClient()
require nodeA.mountAutoSharding(1, 8).isOk
nodeA.mountMetadata(1, @[0'u16]).expect("Node A failed to mount metadata")
await nodeA.start()
let ds =
DeliveryService.new(false, nodeA).expect("Failed to create DeliveryService")
ds.startDeliveryService().expect("Failed to start DeliveryService")
let subMgr = ds.subscriptionManager
var nodeB: WakuNode
lockNewGlobalBrokerContext:
let nodeBKey = generateSecp256k1Key()
nodeB = newTestWakuNode(nodeBKey, parseIpAddress("127.0.0.1"), Port(0))
let driver = newSqliteArchiveDriver()
nodeB.mountArchive(driver).expect("Node B failed to mount archive")
(await nodeB.mountRelay()).expect("Node B failed to mount relay")
(await nodeB.mountLightpush()).expect("Node B failed to mount lightpush")
await nodeB.mountFilter()
await nodeB.mountStore()
require nodeB.mountAutoSharding(1, 8).isOk
nodeB.mountMetadata(1, toSeq(0'u16 ..< 8'u16)).expect(
"Node B failed to mount metadata"
)
await nodeB.start()
let monitorA = NodeHealthMonitor.new(nodeA)
var
lastStatus = ConnectionStatus.Disconnected
healthSignal = newAsyncEvent()
monitorA.onConnectionStatusChange = proc(status: ConnectionStatus) {.async.} =
lastStatus = status
healthSignal.fire()
monitorA.startHealthMonitor().expect("Health monitor failed to start")
var metadataFut = newFuture[void]("waitForMetadata")
let metadataLis = WakuPeerEvent
.listen(
nodeA.brokerCtx,
proc(evt: WakuPeerEvent): Future[void] {.async: (raises: []), gcsafe.} =
if not metadataFut.finished and
evt.kind == WakuPeerEventKind.EventMetadataUpdated:
metadataFut.complete()
,
)
.expect("Failed to listen for metadata")
await nodeA.connectToNodes(@[nodeB.switch.peerInfo.toRemotePeerInfo()])
let metadataOk = await metadataFut.withTimeout(TestConnectivityTimeLimit)
feat: persistency (#3880) * persistency: per-job SQLite-backed storage layer (singleton, brokered) Adds a backend-neutral CRUD library at waku/persistency/, plus the nim-brokers dependency swap that enables it. Architecture (ports-and-adapters): * Persistency: process-wide singleton, one root directory. * Job: one tenant, one DB file, one worker thread, one BrokerContext. * Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key) WITHOUT ROWID, WAL mode. * Writes are fire-and-forget via EventBroker(mt) PersistEvent. * Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists, KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError]. * One storage thread per job; tenants isolated by BrokerContext. Public surface (waku/persistency/persistency.nim): Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset() p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close() p.job(id) / p[id] / p.hasJob(id) Writes (Job form & string-id form, fire-and-forget): persist / persistPut / persistDelete / persistEncoded Reads (Job form & string-id form, async Result): get / exists / scan / scanPrefix / count / deleteAcked Key & payload encoding (keys.nim, payload.nim): * encodePart family + variadic key(...) / payload(...) macros + single-value toKey / toPayload. * Primitives: string and openArray[byte] are 2-byte BE length + bytes; int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE; bool/byte/char are 1 byte; enums are int64(ord(v)). * Generic encodePart[T: tuple | object] recurses through fields() so any composite Nim type is encodable without ceremony. * Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no cast on pointers, no host-endianness dependency. * `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers bypass the built-in encoder with their own format (CBOR, protobuf...). Lifecycle: * Persistency.new is private; Persistency.instance is the only public constructor. Same rootDir is idempotent; conflicting rootDir is peInvalidArgument. Persistency.reset for test/restart paths. * openJob opens-or-creates the per-job SQLite file; an existing file is reused with its data preserved. * Teardown integration: Persistency.instance registers a Teardown MultiRequestBroker provider that closes all jobs and clears the singleton slot when Waku.stop() issues Teardown.request. Internal layering: types.nim pure value types (Key, KeyRange, KvRow, TxOp, PersistencyError) keys.nim encodePart primitives + key(...) macro payload.nim toPayload + payload(...) macro schema.nim CREATE TABLE + connection pragmas + user_version backend_sqlite.nim KvBackend, applyOps (single source of write SQL), getOne/existsOne/deleteOne, scanRange (asc/desc, half-open ranges, open-ended stop), countRange backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt) declarations; encodeErr/decodeErr boundary helpers backend_thread.nim startStorageThread / stopStorageThread (shared allocShared0 arg, cstring dbPath, atomic ready/shutdown flags); per-thread provider registration persistency.nim Persistency + Job types, singleton state, public facade ../requests/lifecycle_requests.nim Teardown MultiRequestBroker Tests (69 cases, all passing): test_keys.nim sort-order invariants (length-prefix strings, sign-flipped ints, composite tuples, prefix range) test_backend.nim round-trip / replace / delete-return-value / batched atomicity / asc-desc-half-open-open- ended scans / category isolation / batch txDelete test_lifecycle.nim open-or-create rootDir / non-dir collision / reopen across sessions / idempotent openJob / two-tenant parallel isolation / closeJob joins worker / dropJob removes file / acked delete test_facade.nim put-then-get / atomic batch / scanPrefix asc/desc / deleteAcked hit-miss / fire-and-forget delete / two-tenant facade isolation test_encoding.nim tuple/named-tuple/object keys, embedded Key, enum encoding, field-major composite sort, payload struct encoding, end-to-end struct round-trip through SQLite test_string_lookup.nim peJobNotFound semantics / hasJob / subscript / persistPut+get via id / reads short-circuit / writes drop+warn / persistEncoded via id / scan parity Job-ref vs id test_singleton.nim idempotent same-rootDir / different-rootDir rejection / no-arg instance lifecycle / reset retargets / reset idempotence / Teardown.request end-to-end Prerequisite delivered in the same series: replace the in-tree broker implementation with the external nim-brokers package; update all broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay, delivery_service, peer_manager, requests/*, factory/*, api tests, etc.) to the new package API; chat2 made to compile again. Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is still developed side-by-side and the persistency layer is intentionally SDS-agnostic. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: pin nim-brokers by URL+commit (workaround for stale registry) The bare `brokers >= 2.0.1` form cannot resolve on machines where the local nimble SAT solver enumerates only the registry-recorded 0.1.0 for brokers. The nim-lang/packages entry for `brokers` carries no per-tag metadata (only the URL), so until that registry entry is refreshed the SAT solver clamps the available-versions list to 0.1.0 and rejects the >= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1 cloned locally. Pinning by URL+commit bypasses the registry path entirely. Inline comment in waku.nimble documents the situation and the path back to the bare form once nim-lang/packages is updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: nph format pass Run `nph` on all 57 Nim files touched by this PR. Pure formatting: 17 files re-styled, no semantic change. Suite still 69/69. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix build, add local-storage-path config, lazy init of Persistency from Waku start * fix: fix nix deps * fixes for nix build, regenerate deps * reverting accidental dependency changes * Fixing deps * Apply suggestions from code review Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> * persistency tests: migrate to suite / asyncTest / await Match the in-tree test convention (procSuite -> suite, sync test + waitFor -> asyncTest + await): - procSuite "X": -> suite "X": - For tests doing async work: test -> asyncTest, waitFor -> await. - Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim, proc waitUntilExists(...) in test_facade.nim and test_string_lookup.nim) -> Future[bool] {.async.}, internal `waitFor X` -> `await X`, internal `sleep(N)` -> `await sleepAsync(chronos.milliseconds(N))`. - Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)` -> `pollExists(t: Job, ...)`; the previous name shadowed chronos.waitFor in the chronos macro expansion. - `chronos.milliseconds(N)` explicitly qualified because `std/times` also exports `milliseconds` (returning TimeInterval, not Duration). - `check await x` -> `let okN = await x; check okN` to dodge chronos's "yield in expr not lowered" with await-as-macro-argument. - `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the same reason. waku/persistency/persistency.nim: nph also pulled the proc signatures across multiple lines; restored explicit `Future[void] {.async.}` return types after the colon (an intermediate nph pass had elided them). Suite: 71 / 71 OK against the new async write surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * use idiomatic valueOr instead of ifs * Reworked persistency shutdown, remove not necessary teardown mechanism * Use const for DefaultStoragePath * format to follow coding guidelines - no use of result and explicit returns - no functional change --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-16 00:09:07 +02:00
await WakuPeerEvent.dropListener(nodeA.brokerCtx, metadataLis)
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
require metadataOk
var deadline = Moment.now() + TestConnectivityTimeLimit
while Moment.now() < deadline:
if lastStatus == ConnectionStatus.PartiallyConnected:
break
if await healthSignal.wait().withTimeout(deadline - Moment.now()):
healthSignal.clear()
check lastStatus == ConnectionStatus.PartiallyConnected
var shardHealthFut = newFuture[EventShardTopicHealthChange]("waitForShardHealth")
let shardHealthLis = EventShardTopicHealthChange
.listen(
nodeA.brokerCtx,
proc(
evt: EventShardTopicHealthChange
): Future[void] {.async: (raises: []), gcsafe.} =
if not shardHealthFut.finished and (
evt.health == TopicHealth.MINIMALLY_HEALTHY or
evt.health == TopicHealth.SUFFICIENTLY_HEALTHY
):
shardHealthFut.complete(evt)
,
)
.expect("Failed to listen for shard health")
let contentTopic = ContentTopic("/waku/2/default-content/proto")
subMgr.subscribe(contentTopic).expect("Failed to subscribe")
let shardHealthOk = await shardHealthFut.withTimeout(TestConnectivityTimeLimit)
feat: persistency (#3880) * persistency: per-job SQLite-backed storage layer (singleton, brokered) Adds a backend-neutral CRUD library at waku/persistency/, plus the nim-brokers dependency swap that enables it. Architecture (ports-and-adapters): * Persistency: process-wide singleton, one root directory. * Job: one tenant, one DB file, one worker thread, one BrokerContext. * Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key) WITHOUT ROWID, WAL mode. * Writes are fire-and-forget via EventBroker(mt) PersistEvent. * Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists, KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError]. * One storage thread per job; tenants isolated by BrokerContext. Public surface (waku/persistency/persistency.nim): Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset() p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close() p.job(id) / p[id] / p.hasJob(id) Writes (Job form & string-id form, fire-and-forget): persist / persistPut / persistDelete / persistEncoded Reads (Job form & string-id form, async Result): get / exists / scan / scanPrefix / count / deleteAcked Key & payload encoding (keys.nim, payload.nim): * encodePart family + variadic key(...) / payload(...) macros + single-value toKey / toPayload. * Primitives: string and openArray[byte] are 2-byte BE length + bytes; int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE; bool/byte/char are 1 byte; enums are int64(ord(v)). * Generic encodePart[T: tuple | object] recurses through fields() so any composite Nim type is encodable without ceremony. * Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no cast on pointers, no host-endianness dependency. * `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers bypass the built-in encoder with their own format (CBOR, protobuf...). Lifecycle: * Persistency.new is private; Persistency.instance is the only public constructor. Same rootDir is idempotent; conflicting rootDir is peInvalidArgument. Persistency.reset for test/restart paths. * openJob opens-or-creates the per-job SQLite file; an existing file is reused with its data preserved. * Teardown integration: Persistency.instance registers a Teardown MultiRequestBroker provider that closes all jobs and clears the singleton slot when Waku.stop() issues Teardown.request. Internal layering: types.nim pure value types (Key, KeyRange, KvRow, TxOp, PersistencyError) keys.nim encodePart primitives + key(...) macro payload.nim toPayload + payload(...) macro schema.nim CREATE TABLE + connection pragmas + user_version backend_sqlite.nim KvBackend, applyOps (single source of write SQL), getOne/existsOne/deleteOne, scanRange (asc/desc, half-open ranges, open-ended stop), countRange backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt) declarations; encodeErr/decodeErr boundary helpers backend_thread.nim startStorageThread / stopStorageThread (shared allocShared0 arg, cstring dbPath, atomic ready/shutdown flags); per-thread provider registration persistency.nim Persistency + Job types, singleton state, public facade ../requests/lifecycle_requests.nim Teardown MultiRequestBroker Tests (69 cases, all passing): test_keys.nim sort-order invariants (length-prefix strings, sign-flipped ints, composite tuples, prefix range) test_backend.nim round-trip / replace / delete-return-value / batched atomicity / asc-desc-half-open-open- ended scans / category isolation / batch txDelete test_lifecycle.nim open-or-create rootDir / non-dir collision / reopen across sessions / idempotent openJob / two-tenant parallel isolation / closeJob joins worker / dropJob removes file / acked delete test_facade.nim put-then-get / atomic batch / scanPrefix asc/desc / deleteAcked hit-miss / fire-and-forget delete / two-tenant facade isolation test_encoding.nim tuple/named-tuple/object keys, embedded Key, enum encoding, field-major composite sort, payload struct encoding, end-to-end struct round-trip through SQLite test_string_lookup.nim peJobNotFound semantics / hasJob / subscript / persistPut+get via id / reads short-circuit / writes drop+warn / persistEncoded via id / scan parity Job-ref vs id test_singleton.nim idempotent same-rootDir / different-rootDir rejection / no-arg instance lifecycle / reset retargets / reset idempotence / Teardown.request end-to-end Prerequisite delivered in the same series: replace the in-tree broker implementation with the external nim-brokers package; update all broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay, delivery_service, peer_manager, requests/*, factory/*, api tests, etc.) to the new package API; chat2 made to compile again. Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is still developed side-by-side and the persistency layer is intentionally SDS-agnostic. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: pin nim-brokers by URL+commit (workaround for stale registry) The bare `brokers >= 2.0.1` form cannot resolve on machines where the local nimble SAT solver enumerates only the registry-recorded 0.1.0 for brokers. The nim-lang/packages entry for `brokers` carries no per-tag metadata (only the URL), so until that registry entry is refreshed the SAT solver clamps the available-versions list to 0.1.0 and rejects the >= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1 cloned locally. Pinning by URL+commit bypasses the registry path entirely. Inline comment in waku.nimble documents the situation and the path back to the bare form once nim-lang/packages is updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * persistency: nph format pass Run `nph` on all 57 Nim files touched by this PR. Pure formatting: 17 files re-styled, no semantic change. Suite still 69/69. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix build, add local-storage-path config, lazy init of Persistency from Waku start * fix: fix nix deps * fixes for nix build, regenerate deps * reverting accidental dependency changes * Fixing deps * Apply suggestions from code review Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> * persistency tests: migrate to suite / asyncTest / await Match the in-tree test convention (procSuite -> suite, sync test + waitFor -> asyncTest + await): - procSuite "X": -> suite "X": - For tests doing async work: test -> asyncTest, waitFor -> await. - Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim, proc waitUntilExists(...) in test_facade.nim and test_string_lookup.nim) -> Future[bool] {.async.}, internal `waitFor X` -> `await X`, internal `sleep(N)` -> `await sleepAsync(chronos.milliseconds(N))`. - Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)` -> `pollExists(t: Job, ...)`; the previous name shadowed chronos.waitFor in the chronos macro expansion. - `chronos.milliseconds(N)` explicitly qualified because `std/times` also exports `milliseconds` (returning TimeInterval, not Duration). - `check await x` -> `let okN = await x; check okN` to dodge chronos's "yield in expr not lowered" with await-as-macro-argument. - `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the same reason. waku/persistency/persistency.nim: nph also pulled the proc signatures across multiple lines; restored explicit `Future[void] {.async.}` return types after the colon (an intermediate nph pass had elided them). Suite: 71 / 71 OK against the new async write surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * use idiomatic valueOr instead of ifs * Reworked persistency shutdown, remove not necessary teardown mechanism * Use const for DefaultStoragePath * format to follow coding guidelines - no use of result and explicit returns - no functional change --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
2026-05-16 00:09:07 +02:00
await EventShardTopicHealthChange.dropListener(nodeA.brokerCtx, shardHealthLis)
feat: active filter subscription management for edge nodes (#3773) feat: active filter subscription management for edge nodes ## Subscription Manager * edgeFilterSubLoop reconciles desired vs actual filter subscriptions * edgeFilterHealthLoop pings filter peers, evicts stale ones * EdgeFilterSubState per-shard tracking of confirmed peers and health * best-effort unsubscribe on peer removal * RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers ## WakuNode * Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth) * Register MessageSeenEvent push handler on filter client during start * startDeliveryService now returns `Result[void, string]` and propagates errors ## Health Monitor * getFilterClientHealth queries RequestEdgeFilterPeerCount via broker * Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive * Listen to EventShardTopicHealthChange for health recalculation * Add missing return p.notReady() on failed edge filter peer count request * HealthyThreshold constant moved to `connection_status.nim` ## Broker types * RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types * EventShardTopicHealthChange event type ## Filter Client * Add timeout parameter to ping proc ## Tests * Health monitor event tests with per-node lockNewGlobalBrokerContext * Edge (light client) health update test * Edge health driven by confirmed filter subscriptions test * API subscription tests: sub/receive, failover, peer replacement Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com> Co-authored by Zoltan Nagy
2026-03-30 08:30:34 -03:00
check shardHealthOk == true
check subMgr.edgeFilterSubStates.len > 0
healthSignal.clear()
deadline = Moment.now() + TestConnectivityTimeLimit
while Moment.now() < deadline:
if lastStatus == ConnectionStatus.PartiallyConnected:
break
if await healthSignal.wait().withTimeout(deadline - Moment.now()):
healthSignal.clear()
check lastStatus == ConnectionStatus.PartiallyConnected
await ds.stopDeliveryService()
await monitorA.stopHealthMonitor()
await nodeB.stop()
await nodeA.stop()