Root cause (post libp2p f54c715 update + rv/kad/service-disco changes):
- chat2disco (and the waku factory, for circuitRelayClient) assign
`node.switch.services = @[Service(hp/autonat)]` directly, bypassing the
deprecated switch.add, which used to call .setup.
- AutonatService (enableAddressMapper=true by default) + HP require
their .setup(switch) to populate .addressMapper (and handlers)
before start().
- In switch.start: services.start → autonat.start does
addressMappers.add(nil), then await peerInfo.update() → for m in ...:
await m(...) → nil proc deref (Defect/SEGV in release).
- The crash surfaces at `await node.switch.start()` (waku_node:589).
- Secondary: wakuKademlia.start() and the waku mapper (which captures
node and returns the announced addrs that hp mutates) were scheduled
before switch.start, although switch.start is what activates the mounted
kad via ms and runs the hp mappers/updates.
Fix:
- After services= in apps/chat2disco/chat2disco.nim and in
waku/factory/waku.nim (both hp and bare autonat branches), explicitly
call the .setup(node.switch) (or hp.setup) and handle error.
- Move `if not node.wakuKademlia.isNil(): ...start()` to after
switch.start() + reconnectRelayPeers (correct ordering for mounted
protocol user loops).
- Harden waku addressMapper (nil/empty guard, return listenAddrs) and
set peerInfo.announcedAddrs (short-circuit) at the add site, in
updateAnnounced..., and in the onReservation callbacks (chat2disco +
factory) so expandAddrs prefers it.
- Minor: lookup/periodic guards in waku_kademlia; doc in
autonat_service.
Also nph reformats on touched files.
Reuses: the .setup methods, existing post-switch init patterns,
isNil guards, CatchableError handling, make chat2disco + nph.
Verified: make chat2disco (twice, pre/post nph) reports SuccessX; no
SEGV in multiple start-path runs (only the expected thread EOF on pipe
close); the diff touches only our 5 files.
Builds on 82d87cfa (libp2p update) without touching pins or vendored code.
Caveat: clean `make update` still requires the mix temp patches in
pkgs2 (as documented in the update).
Fixes the reported chat2disco startup segfault.
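
For illustration only, the failure mode and the fix described above can be modeled in a few lines of Python. This is a hedged sketch, not the nim-libp2p code: `Switch`, `Service`, `PeerInfo`, and their members are hypothetical stand-ins, and Python's `TypeError` plays the role of the nil proc dereference.

```python
import asyncio


class PeerInfo:
    """Hypothetical stand-in for the switch's peer info."""

    def __init__(self, listen_addrs):
        self.listen_addrs = list(listen_addrs)
        self.addrs = list(listen_addrs)
        self.address_mappers = []

    async def update(self):
        addrs = list(self.listen_addrs)
        for mapper in self.address_mappers:
            # Calling a mapper that was never set (None) fails right here.
            addrs = await mapper(addrs)
        self.addrs = addrs


class Service:
    """Hypothetical service whose mapper only exists after setup()."""

    def __init__(self):
        self.address_mapper = None

    def setup(self, switch):
        async def mapper(addrs):
            return addrs  # identity mapper, enough for the sketch

        self.address_mapper = mapper

    async def start(self, switch):
        switch.peer_info.address_mappers.append(self.address_mapper)
        await switch.peer_info.update()


class Switch:
    def __init__(self):
        self.peer_info = PeerInfo(["/ip4/127.0.0.1/tcp/60000"])
        self.services = []

    async def start(self):
        for service in self.services:
            await service.start(self)


def start_without_setup():
    # Mirrors the bug: services assigned directly, setup() never called.
    switch = Switch()
    switch.services = [Service()]
    asyncio.run(switch.start())


def start_with_setup():
    # Mirrors the fix: call setup() explicitly after assigning services.
    switch = Switch()
    service = Service()
    switch.services = [service]
    service.setup(switch)
    asyncio.run(switch.start())
    return switch.peer_info.addrs
```

In this model `start_without_setup()` raises because the unset mapper is called, while `start_with_setup()` completes and leaves the addresses unchanged.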
- waku.nimble: libp2p #f54c7150a7ccbc4e9871bb8b56ecfd7e3e59f7de; also pin protobuf_serialization#ce97ba0 and websock#fb8ba71 to match new libp2p reqs; mix remains on 6c5f43 (its declared pins lag)
- nimble.lock + nix/deps.nix updated (libp2p rev/sha)
- Source fixes for new libp2p (object configs, removed utility module -> libp2p/utils/opt, rendezvous nil, kademlia no longer imports mix_protocol to reduce bad dep surface)
- nph on touched .nim
- chat2disco builds and starts successfully against the updated libp2p (with in-nimbledeps patches to mix for its removed symbols like sequninit/utility and withValue(Opt) sites; running `make update` will require similar patches or an upstream mix bump)
Refs the 106-commit libp2p delta with kademlia/service-disco fixes (e.g. ticket time, record sizes, registration).
* any port set to 0 in the conf results in a random port being bound
* Debug API MyBoundPorts reports actually bound ports for all services, reports 0 if disabled
* write back bound values to both WakuConf and WakuNode.ports
* setupDiscoveryV5 returns Result and errors out on port 0
* rename setupAndStartDiscv5WithAutoPort to setupAndStartDiscv5
* updateWaku ENR rebuild now runs after discv5 startup
* Add DefaultP2pTcpPort, DefaultDiscv5UdpPort, DefaultWebSocketPort, DefaultRestPort, DefaultMetricsHttpPort
* add tests
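
As a rough illustration of the port-0 behavior listed above (not the waku implementation), the sketch below binds a TCP socket to port 0, reads back the port the OS actually assigned, and writes it into a config object. The `conf` dict and its `p2p_tcp_port` key are hypothetical; only the standard `socket` calls are real.

```python
import socket


def bind_port(requested_port, host="127.0.0.1"):
    """Bind a TCP socket; port 0 asks the OS to pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, requested_port))
    # getsockname() reports the port that was actually bound.
    return sock, sock.getsockname()[1]


conf = {"p2p_tcp_port": 0}  # hypothetical config: 0 means "any free port"
sock, bound = bind_port(conf["p2p_tcp_port"])
conf["p2p_tcp_port"] = bound  # write the bound value back to the config
sock.close()
```

Writing the bound value back is what lets later consumers (for example a debug endpoint) report the real port instead of 0.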
feat: active filter subscription management for edge nodes
## Subscription Manager
* edgeFilterSubLoop reconciles desired vs actual filter subscriptions
* edgeFilterHealthLoop pings filter peers, evicts stale ones
* EdgeFilterSubState per-shard tracking of confirmed peers and health
* best-effort unsubscribe on peer removal
* RequestEdgeShardHealth and RequestEdgeFilterPeerCount broker providers
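
The reconcile step of a desired-vs-actual subscription loop can be sketched as a set difference. This is a minimal Python model under assumptions: the function name `reconcile` and the example content topics are hypothetical, and the real edgeFilterSubLoop additionally tracks peers and health per shard.

```python
def reconcile(desired, actual):
    """Return (to_subscribe, to_unsubscribe) as sorted lists of topics."""
    to_subscribe = sorted(set(desired) - set(actual))
    to_unsubscribe = sorted(set(actual) - set(desired))
    return to_subscribe, to_unsubscribe


# Hypothetical content topics, for illustration only.
desired = {"/app/1/chat/proto", "/app/1/news/proto"}
actual = {"/app/1/news/proto", "/app/1/old/proto"}

to_subscribe, to_unsubscribe = reconcile(desired, actual)
```

Here `/app/1/chat/proto` ends up in `to_subscribe` and `/app/1/old/proto` in `to_unsubscribe`; topics present in both sets need no action.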
## WakuNode
* Remove old edge health loop (loopEdgeHealth, edgeHealthEvent, calculateEdgeTopicHealth)
* Register MessageSeenEvent push handler on filter client during start
* startDeliveryService now returns `Result[void, string]` and propagates errors
## Health Monitor
* getFilterClientHealth queries RequestEdgeFilterPeerCount via broker
* Shard/content health providers fall back to RequestEdgeShardHealth when relay inactive
* Listen to EventShardTopicHealthChange for health recalculation
* Add missing return p.notReady() on failed edge filter peer count request
* HealthyThreshold constant moved to `connection_status.nim`
## Broker types
* RequestEdgeShardHealth, RequestEdgeFilterPeerCount request types
* EventShardTopicHealthChange event type
## Filter Client
* Add timeout parameter to ping proc
## Tests
* Health monitor event tests with per-node lockNewGlobalBrokerContext
* Edge (light client) health update test
* Edge health driven by confirmed filter subscriptions test
* API subscription tests: sub/receive, failover, peer replacement
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
Co-authored-by: Zoltan Nagy
* refactor retention policy to allow union of several retention policies
* fix bug in time retention policy
* add removal of orphan partitions if any
* use nim-http-utils 0.4.1