Alex Jbanca
2da08c3e14
fix(filter): bound per-Sub retry storm under sustained subscribe failures
Sustained subscribe failures saturated CPU, leaked 600+ subscriptionLoop
goroutines, and twice panicked with `strings: Join output length overflow`.
Five independent issues:
- api/filter: errcnt budget was gated on `possibleRecursiveError`, which
matched only `ErrNoPeersAvailable` / `swarm.ErrDialBackoff`. The dominant
error class never incremented errcnt, so the 3-error-per-5s budget was
dead code. Replaced gate with `shouldIncrementErrCnt(err)`: counts every
non-nil error.
- protocol/filter: WakuFilterLightNode.Subscribe flattened per-peer errors
via `fmt.Errorf+strings.Join`, losing typed *FilterError and growing
unboundedly. Replaced with typed `*SubscribeError` (PeerID, ContentTopics,
Err) plus `HasRateLimitError()`; `Error()` is hard-capped. Concurrent
per-peer appends now mutex-guarded.
- api/filter: added a 60 s rate-limit backoff when `(*SubscribeError).HasRateLimitError()` is true.
`shouldHonourRateLimitBackoff(rateLimitedUntil, now)` gates both the ticker push and the
closing-channel checkAndResubscribe path. The backoff is cleared on subscribe success.
- api/filter: FilterManager.waitingToSubQueue was a cap-100 chan written and
drained under the same lock, deadlocking the manager once full. Replaced
with mutex-guarded slice.
- api/filter: Sub.cleanup closed DataCh while multiplex forwarders could
still be sending. Added multiplexWG awaited in cleanup; forwarder send is
in a select with apiSub.ctx.Done() so it can't deadlock when
subDetails.C is never closed (node-stop transitions).
Tests (all under -race):
- TestSub_CleanupRaceWithMultiplex (50 iter)
- TestSub_CleanupDoesNotDeadlockWhenSubChannelStaysOpen
- TestFilterManager_SubscribeFilter_DoesNotDeadlockWhenQueueFull
- TestShouldIncrementErrCnt
2026-05-18 12:14:54 +05:30
Alex Jbanca
de84ba47f9
fix: SetClosing no longer holds SubscriptionDetails lock across a potentially blocking channel send (#1301)
2026-04-21 15:40:52 +05:30
Igor Sirotin
b0af7695bd
feat: single point of localnode parametrization (#1297)
2025-12-10 10:21:57 +00:00
Prem Chaitanya Prathi
84a4b1be7a
fix: use simple protocol and pubsubTopic based selection for peerExchange (#1295)
2025-10-15 06:37:06 +05:30
Igor Sirotin
070ff0d7c5
chore: upgrade go-ethereum (#1292)
2025-10-03 00:33:39 +01:00
Igor Sirotin
6ceea038ff
ci: fix linter, storev3 tests and nix-flake jobs (#1289)
Signed-off-by: Jakub Sokołowski <jakub@status.im>
Co-authored-by: Jakub Sokołowski <jakub@status.im>
2025-08-25 15:44:07 +01:00
Igor Sirotin
5dea6d3bce
fix: remove automatic relay unsubscribe in a goroutine (#1284)
2025-05-28 12:04:23 +01:00
gabrielmer
fdf03de179
fix: setting peerId for store query (#1278)
2025-03-20 15:47:51 +02:00
richΛrd
e68fcdb554
refactor: use peerInfo and multiaddresses instead of peerID (#1269)
2025-03-13 18:15:44 +02:00
Prem Chaitanya Prathi
24932b529c
chore: disable filter logs to prevent spam (#1275)
2025-02-18 11:16:16 +05:30
Prem Chaitanya Prathi
4ef460cb95
fix: filter network change handling (#1270)
2025-01-03 15:47:27 +05:30
Prem Chaitanya Prathi
78b522db50
fix: have better defaults for send/recv rate-limits (#1267)
2024-12-16 19:18:16 +05:30
Pablo Lopez
9a243696d7
fix: fatal error: concurrent map writes (#1265)
2024-12-10 14:08:04 +02:00
Prem Chaitanya Prathi
809dba5854
fix: use bad peer removal logic only for lightpush and filter and option to restart missing message verifier (#1244)
2024-12-09 14:14:28 +05:30
richΛrd
0c594b3140
feat: filter rate limit (#1258)
2024-11-29 12:25:55 -04:00
richΛrd
fdb3c3d0b3
refactor: use protobuffer for API storenode queries (#1248)
2024-10-24 14:47:57 -04:00
kaichao
38be0dc169
chore: fix store request id log (#1242)
2024-10-21 11:23:49 +08:00
richΛrd
76275f6fb8
feat: storenode cycle (#1223)
2024-10-14 14:58:51 -04:00
richΛrd
15b4aee808
fix: use byte array to decode ENRs uint8 fields (#1227)
2024-10-03 10:12:31 -04:00
frank
8b0e03113d
feat: log error and stacktrace when panic in goroutine (#1225)
2024-09-25 17:15:20 +08:00
Arseniy Klempner
798c9c5d81
feat: emit an event in EventBus upon dial error (#1222)
2024-09-23 14:41:07 -07:00
Richard Ramos
2800391204
fix: requestID validation
2024-09-18 17:27:51 -04:00
richΛrd
f0acee4d1d
feat: ratelimit store queries and add options to Next (#1221)
2024-09-18 17:09:37 -04:00
Richard Ramos
991e872de9
chore: add requestID to error message in store validation
2024-09-17 10:13:01 -04:00
Prem Chaitanya Prathi
bf2b7dce1a
feat: increase outbound q size for pubsub (#1217)
2024-09-10 18:12:08 +05:30
kaichao
99d2477035
fix: check subscription when relay publish message (#1212)
2024-08-31 09:22:59 +08:00
Igor Sirotin
4c3ec60da5
fix: prevent panics in peermanager and WakuRelay (#1206)
2024-08-23 15:23:07 +01:00
Igor Sirotin
1472b17d39
fix: flaky panic on relay unsubscribe (#1201)
2024-08-22 10:16:03 +05:30
Prem Chaitanya Prathi
8ff8779bb0
feat: shard aware pruning of peer store (#1193)
2024-08-21 18:08:11 +05:30
Prem Chaitanya Prathi
bc16c74f2e
feat: shard based filtering in peer exchange (#1194)
2024-08-15 07:27:56 +05:30
Prem Chaitanya Prathi
f3560ced3b
chore: move filter manager from status-go to go-waku (#1177)
2024-08-06 13:10:56 +05:30
richΛrd
04a9af931f
fix: handle scenario where the node's ENR has no shard (due to shard update) (#1176)
2024-07-31 14:58:21 -04:00
a4009b70d1
fix: replace references to old statusim.net domain
Use of the `statusim.net` domain has been deprecated since March:
https://github.com/status-im/infra-shards/commit/7df38c14
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-31 13:31:16 +02:00
Prem Chaitanya Prathi
e1e136cc68
fix: parallelize filter subs to different peers (#1169)
2024-07-30 18:06:41 +05:30
Prem Chaitanya Prathi
a9be17fd48
chore: method to disconnect all peers and not notify (#1168)
2024-07-24 18:17:31 +05:30
Prem Chaitanya Prathi
58d9721026
fix: filter ping timeout and retry in case of failure (#1166)
2024-07-24 07:59:17 +05:30
Prem Chaitanya Prathi
f3da812b33
fix: record connection failures when stream opening fails for any protocol (#1163)
2024-07-18 21:52:33 -07:00
Prem Chaitanya Prathi
8afeb529df
chore: change log levels (#1165)
2024-07-17 15:32:32 +05:30
Prem Chaitanya Prathi
dacff8a6ae
feat: lightclient err handling (#1160)
2024-07-15 19:47:27 +05:30
Prem Chaitanya Prathi
9fbb955b16
chore: allow setting enr shards for lightclient (#1159)
2024-07-15 19:29:31 +05:30
Vaclav Pavlin
2f333c1e1c
chore(wakunode2): add ability to specify PX options in wakunode2 (#1157)
Co-authored-by: Prem Chaitanya Prathi <chaitanyaprem@gmail.com>
2024-07-12 10:09:04 +05:30
Prem Chaitanya Prathi
bb74e39ed9
feat: support for lightpush to use more than 1 peer (#1158)
2024-07-12 09:28:23 +05:30
richΛrd
3b0c8e9207
chore: bump go-libp2p (#1155)
2024-07-11 11:26:04 -04:00
Prem Chaitanya Prathi
221cbf6599
fix: for light node do not check for matching shards but only clusterID (#1154)
2024-07-09 18:50:44 +05:30
richΛrd
7c13021a32
feat: use mesh peers instead of all peers for determining topic health (#1150)
2024-07-03 16:35:39 -04:00
Prem Chaitanya Prathi
5b5ea977af
feat: optimize filter subs (#1144)
Co-authored-by: richΛrd <info@richardramos.me>
2024-07-01 19:48:00 +05:30
richΛrd
e3d7ab1d58
fix: panic due to enr having more than 300 bytes (#1140)
2024-07-01 09:47:38 -04:00
richΛrd
201d434d50
fix: ignore ws from circuit relay addresses, and allow non multiaddresses in multiaddrs ENR key (#1141)
2024-07-01 09:03:34 -04:00
richΛrd
8d7c2f7bfa
feat: filter peers stored in cache by cluster-id in peer-exchange (#1139)
2024-06-27 10:02:01 -04:00
Prem Chaitanya Prathi
19a47a1ac1
feat: modify peer-manager to consider relay target peers (#1135)
2024-06-26 06:18:44 +05:30