logos-messaging-go

mirror of https://github.com/logos-messaging/logos-messaging-go.git synced 2026-07-08 18:29:25 +00:00


Alex Jbanca 2da08c3e14 fix(filter): bound per-Sub retry storm under sustained subscribe failures

Sustained subscribe failures saturated CPU, leaked 600+ subscriptionLoop
  goroutines, and twice panicked with `strings: Join output length overflow`.
  This commit fixes five independent issues:

  - api/filter: errcnt budget was gated on `possibleRecursiveError`, which
    matched only `ErrNoPeersAvailable` / `swarm.ErrDialBackoff`. The dominant
    error class never incremented errcnt, so the 3-error-per-5s budget was
    dead code. Replaced gate with `shouldIncrementErrCnt(err)`: counts every
    non-nil error.

  - protocol/filter: WakuFilterLightNode.Subscribe flattened per-peer errors
    via `fmt.Errorf+strings.Join`, losing typed *FilterError and growing
    unboundedly. Replaced with typed `*SubscribeError` (PeerID, ContentTopics,
    Err) plus `HasRateLimitError()`; `Error()` is hard-capped. Concurrent
    per-peer appends now mutex-guarded.

  - api/filter: 60-s rate-limit backoff on `*SubscribeError.HasRateLimitError()`.
    `shouldHonourRateLimitBackoff(rateLimitedUntil, now)` gates ticker push and
    closing-channel checkAndResubscribe. Cleared on subscribe success.

  - api/filter: FilterManager.waitingToSubQueue was a cap-100 chan written and
    drained under the same lock, deadlocking the manager once full. Replaced
    with mutex-guarded slice.

  - api/filter: Sub.cleanup closed DataCh while multiplex forwarders could
    still be sending. Added multiplexWG awaited in cleanup; forwarder send is
    in a select with apiSub.ctx.Done() so it can't deadlock when
    subDetails.C is never closed (node-stop transitions).
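
The cleanup ordering in the last item can be sketched as below. This is a minimal illustration, not the code from the commit: the `sub` type, its fields, and `cleanupWithOpenSource` are hypothetical stand-ins, and only the pattern (a WaitGroup awaited before closing the output channel, and a send guarded by `ctx.Done()`) is taken from the description above.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// sub is a hypothetical stand-in for the API-level subscription. The field
// names echo the commit message, but the real struct is not shown there.
type sub struct {
	ctx         context.Context
	cancel      context.CancelFunc
	dataCh      chan string    // plays the role of DataCh
	multiplexWG sync.WaitGroup // tracks running forwarders
}

func newSub() *sub {
	ctx, cancel := context.WithCancel(context.Background())
	return &sub{ctx: ctx, cancel: cancel, dataCh: make(chan string)}
}

// multiplex forwards messages from one per-peer channel into dataCh.
// Both the receive and the send also watch ctx.Done(), so the forwarder
// exits even if src is never closed or nobody reads dataCh.
func (s *sub) multiplex(src <-chan string) {
	s.multiplexWG.Add(1)
	go func() {
		defer s.multiplexWG.Done()
		for {
			select {
			case <-s.ctx.Done():
				return
			case msg, ok := <-src:
				if !ok {
					return
				}
				select {
				case s.dataCh <- msg:
				case <-s.ctx.Done():
					return
				}
			}
		}
	}()
}

// cleanup cancels the context, waits for every forwarder to return, and
// only then closes dataCh, so no goroutine can send on a closed channel.
func (s *sub) cleanup() {
	s.cancel()
	s.multiplexWG.Wait()
	close(s.dataCh)
}

// cleanupWithOpenSource exercises the "source channel stays open" case:
// src is never closed and dataCh has no reader, yet cleanup must return.
func cleanupWithOpenSource() bool {
	s := newSub()
	src := make(chan string, 1) // intentionally never closed
	src <- "pending"
	s.multiplex(src)
	s.cleanup()
	_, ok := <-s.dataCh
	return !ok // true when dataCh was closed by cleanup
}

func main() {
	fmt.Println("cleanup returned, dataCh closed:", cleanupWithOpenSource())
}
```

Without the inner `select`, a forwarder blocked on `s.dataCh <- msg` would keep `multiplexWG.Wait()` from returning; without the WaitGroup, `close(s.dataCh)` could race with an in-flight send.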

  Tests (all under -race):
  - TestSub_CleanupRaceWithMultiplex (50 iter)
  - TestSub_CleanupDoesNotDeadlockWhenSubChannelStaysOpen
  - TestFilterManager_SubscribeFilter_DoesNotDeadlockWhenQueueFull
  - TestShouldIncrementErrCnt
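
For orientation, the two predicates named in this commit could be exercised with table-driven cases along these lines. This is a sketch under assumptions: the function names come from the commit message, but the signatures, the zero-time convention, and the `checkPredicates` helper are guesses made for illustration and are not the repository's test code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// shouldIncrementErrCnt mirrors the rule stated in the commit message:
// every non-nil error counts against the error budget.
func shouldIncrementErrCnt(err error) bool {
	return err != nil
}

// shouldHonourRateLimitBackoff reports whether a backoff window is still open.
// Assumption: a zero rateLimitedUntil means "no backoff recorded".
func shouldHonourRateLimitBackoff(rateLimitedUntil, now time.Time) bool {
	return !rateLimitedUntil.IsZero() && now.Before(rateLimitedUntil)
}

// checkPredicates runs table-driven cases and returns the first mismatch.
func checkPredicates() error {
	errCases := []struct {
		name string
		err  error
		want bool
	}{
		{"nil error", nil, false},
		{"plain error", errors.New("subscribe failed"), true},
		{"wrapped error", fmt.Errorf("peer: %w", errors.New("timeout")), true},
	}
	for _, c := range errCases {
		if got := shouldIncrementErrCnt(c.err); got != c.want {
			return fmt.Errorf("%s: got %v, want %v", c.name, got, c.want)
		}
	}

	now := time.Date(2026, 5, 18, 12, 0, 0, 0, time.UTC)
	backoffCases := []struct {
		name  string
		until time.Time
		want  bool
	}{
		{"no backoff recorded", time.Time{}, false},
		{"window still open", now.Add(30 * time.Second), true},
		{"window expired", now.Add(-time.Second), false},
	}
	for _, c := range backoffCases {
		if got := shouldHonourRateLimitBackoff(c.until, now); got != c.want {
			return fmt.Errorf("%s: got %v, want %v", c.name, got, c.want)
		}
	}
	return nil
}

func main() {
	if err := checkPredicates(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("all cases passed")
}
```

Passing `now` in as a parameter, as the commit's `shouldHonourRateLimitBackoff(rateLimitedUntil, now)` does, keeps the predicate pure and lets the cases use a fixed clock.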

2026-05-18 12:14:54 +05:30

cliutils

chore: remove bridging topics feature (#949)

2023-12-09 13:59:35 -04:00

metrics

fix: return error in relay publish if exceeding max-msg-size (#939)

2023-12-01 06:53:28 +05:30

persistence

feat: log error and stacktrace when panic in goroutine (#1225)

2024-09-25 17:15:20 +08:00

tools

chore: switch to Google's Protobuf library

2023-02-16 11:37:59 -04:00

try

feat: Implement logic for publish from node

2021-12-07 14:32:02 +01:00
