mirror of
https://github.com/logos-messaging/logos-delivery.git
synced 2026-05-21 14:19:36 +00:00
* persistency: per-job SQLite-backed storage layer (singleton, brokered)
Adds a backend-neutral CRUD library at waku/persistency/, plus the
nim-brokers dependency swap that enables it.
Architecture (ports-and-adapters):
* Persistency: process-wide singleton, one root directory.
* Job: one tenant, one DB file, one worker thread, one BrokerContext.
* Backend: SQLite via waku/common/databases/db_sqlite. Uniform schema
kv(category BLOB, key BLOB, payload BLOB) PRIMARY KEY (category, key)
WITHOUT ROWID, WAL mode.
* Writes are fire-and-forget via EventBroker(mt) PersistEvent.
* Reads are async via five RequestBroker(mt) shapes (KvGet, KvExists,
KvScan, KvCount, KvDelete). Reads return Result[T, PersistencyError].
* One storage thread per job; tenants isolated by BrokerContext.
Public surface (waku/persistency/persistency.nim):
Persistency.instance(rootDir) / Persistency.instance() / Persistency.reset()
p.openJob(id) / p.closeJob(id) / p.dropJob(id) / p.close()
p.job(id) / p[id] / p.hasJob(id)
Writes (Job form & string-id form, fire-and-forget):
persist / persistPut / persistDelete / persistEncoded
Reads (Job form & string-id form, async Result):
get / exists / scan / scanPrefix / count / deleteAcked
Key & payload encoding (keys.nim, payload.nim):
* encodePart family + variadic key(...) / payload(...) macros +
single-value toKey / toPayload.
* Primitives: string and openArray[byte] are 2-byte BE length + bytes;
int{8..64} are sign-flipped 8-byte BE; uint{16..64} are 8-byte BE;
bool/byte/char are 1 byte; enums are int64(ord(v)).
* Generic encodePart[T: tuple | object] recurses through fields() so
any composite Nim type is encodable without ceremony.
* Stable across Nim/C compiler upgrades: no sizeof, no memcpy, no
cast on pointers, no host-endianness dependency.
* `rawKey(bytes)` + `persistPut(..., openArray[byte])` let callers
bypass the built-in encoder with their own format (CBOR, protobuf...).
Lifecycle:
* Persistency.new is private; Persistency.instance is the only public
constructor. Same rootDir is idempotent; conflicting rootDir is
peInvalidArgument. Persistency.reset for test/restart paths.
* openJob opens-or-creates the per-job SQLite file; an existing file
is reused with its data preserved.
* Teardown integration: Persistency.instance registers a Teardown
MultiRequestBroker provider that closes all jobs and clears the
singleton slot when Waku.stop() issues Teardown.request.
Internal layering:
types.nim pure value types (Key, KeyRange, KvRow, TxOp,
PersistencyError)
keys.nim encodePart primitives + key(...) macro
payload.nim toPayload + payload(...) macro
schema.nim CREATE TABLE + connection pragmas + user_version
backend_sqlite.nim KvBackend, applyOps (single source of write SQL),
getOne/existsOne/deleteOne, scanRange (asc/desc,
half-open ranges, open-ended stop), countRange
backend_comm.nim EventBroker(mt) PersistEvent + 5 RequestBroker(mt)
declarations; encodeErr/decodeErr boundary helpers
backend_thread.nim startStorageThread / stopStorageThread (shared
allocShared0 arg, cstring dbPath, atomic
ready/shutdown flags); per-thread provider
registration
persistency.nim Persistency + Job types, singleton state, public
facade
../requests/lifecycle_requests.nim
Teardown MultiRequestBroker
Tests (69 cases, all passing):
test_keys.nim sort-order invariants (length-prefix strings,
sign-flipped ints, composite tuples, prefix
range)
test_backend.nim round-trip / replace / delete-return-value /
batched atomicity / asc-desc-half-open-open-
ended scans / category isolation / batch
txDelete
test_lifecycle.nim open-or-create rootDir / non-dir collision /
reopen across sessions / idempotent openJob /
two-tenant parallel isolation / closeJob joins
worker / dropJob removes file / acked delete
test_facade.nim put-then-get / atomic batch / scanPrefix
asc/desc / deleteAcked hit-miss /
fire-and-forget delete / two-tenant facade
isolation
test_encoding.nim tuple/named-tuple/object keys, embedded Key,
enum encoding, field-major composite sort,
payload struct encoding, end-to-end struct
round-trip through SQLite
test_string_lookup.nim peJobNotFound semantics / hasJob / subscript /
persistPut+get via id / reads short-circuit /
writes drop+warn / persistEncoded via id /
scan parity Job-ref vs id
test_singleton.nim idempotent same-rootDir / different-rootDir
rejection / no-arg instance lifecycle / reset
retargets / reset idempotence / Teardown.request
end-to-end
Prerequisite delivered in the same series: replace the in-tree broker
implementation with the external nim-brokers package; update all
broker call-sites (waku_filter_v2, waku_relay, waku_rln_relay,
delivery_service, peer_manager, requests/*, factory/*, api tests, etc.)
to the new package API; chat2 made to compile again.
Note: SDS adapter (Phase 5 of the design) is deferred -- nim-sds is
still developed side-by-side and the persistency layer is intentionally
SDS-agnostic.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* persistency: pin nim-brokers by URL+commit (workaround for stale registry)
The bare `brokers >= 2.0.1` form cannot resolve on machines where the
local nimble SAT solver enumerates only the registry-recorded 0.1.0 for
brokers. The nim-lang/packages entry for `brokers` carries no per-tag
metadata (only the URL), so until that registry entry is refreshed the
SAT solver clamps the available-versions list to 0.1.0 and rejects the
>= 2.0.1 constraint -- even though pkgs2 and pkgcache both have v2.0.1
cloned locally.
Pinning by URL+commit bypasses the registry path entirely. Inline
comment in waku.nimble documents the situation and the path back to
the bare form once nim-lang/packages is updated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* persistency: nph format pass
Run `nph` on all 57 Nim files touched by this PR. Pure formatting:
17 files re-styled, no semantic change. Suite still 69/69.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Fix build, add local-storage-path config, lazy init of Persistency from Waku start
* fix: fix nix deps
* fixes for nix build, regenerate deps
* reverting accidental dependency changes
* Fixing deps
* Apply suggestions from code review
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
* persistency tests: migrate to suite / asyncTest / await
Match the in-tree test convention (procSuite -> suite, sync test +
waitFor -> asyncTest + await):
- procSuite "X": -> suite "X":
- For tests doing async work: test -> asyncTest, waitFor -> await.
- Poll helpers (proc waitFor(t: Job, ...) in test_lifecycle.nim,
proc waitUntilExists(...) in test_facade.nim and
test_string_lookup.nim) -> Future[bool] {.async.}, internal
`waitFor X` -> `await X`, internal `sleep(N)` ->
`await sleepAsync(chronos.milliseconds(N))`.
- Renamed test_lifecycle.nim's helper proc from `waitFor(t: Job, ...)`
-> `pollExists(t: Job, ...)`; the previous name shadowed
chronos.waitFor in the chronos macro expansion.
- `chronos.milliseconds(N)` explicitly qualified because `std/times`
also exports `milliseconds` (returning TimeInterval, not Duration).
- `check await x` -> `let okN = await x; check okN` to dodge chronos's
"yield in expr not lowered" with await-as-macro-argument.
- `(await x).foo()` -> `let awN = await x; ... awN.foo() ...` for the
same reason.
waku/persistency/persistency.nim: nph also pulled the proc signatures
across multiple lines; restored explicit `Future[void] {.async.}`
return types after the colon (an intermediate nph pass had elided them).
Suite: 71 / 71 OK against the new async write surface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* use idiomatic valueOr instead of ifs
* Reworked persistency shutdown, remove not necessary teardown mechanism
* Use const for DefaultStoragePath
* format to follow coding guidelines - no use of result and explicit returns - no functional change
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ivan FB <128452529+Ivansete-status@users.noreply.github.com>
751 lines
24 KiB
Nim
751 lines
24 KiB
Nim
## Waku Relay module. Thin layer on top of GossipSub.
|
||
##
|
||
## See https://github.com/vacp2p/specs/blob/master/specs/waku/v2/waku-relay.md
|
||
## for spec.
|
||
{.push raises: [].}
|
||
|
||
import
|
||
std/[strformat, strutils, sets],
|
||
stew/byteutils,
|
||
results,
|
||
sequtils,
|
||
chronos,
|
||
chronicles,
|
||
metrics,
|
||
libp2p/multihash,
|
||
libp2p/protocols/pubsub/gossipsub,
|
||
libp2p/protocols/pubsub/rpc/messages,
|
||
libp2p/stream/connection,
|
||
libp2p/switch,
|
||
brokers/broker_context
|
||
|
||
import
|
||
waku/waku_core,
|
||
waku/node/health_monitor/topic_health,
|
||
waku/requests/health_requests,
|
||
waku/events/health_events,
|
||
./message_id,
|
||
waku/events/peer_events
|
||
|
||
from waku/waku_core/codecs import WakuRelayCodec
|
||
export WakuRelayCodec
|
||
|
||
type ShardMetrics = object
|
||
count: float64
|
||
sizeSum: float64
|
||
avgSize: float64
|
||
maxSize: float64
|
||
|
||
logScope:
|
||
topics = "waku relay"
|
||
|
||
declareCounter waku_relay_network_bytes,
|
||
"total traffic per topic, distinct gross/net and direction",
|
||
labels = ["topic", "type", "direction"]
|
||
|
||
declarePublicGauge(
|
||
waku_relay_total_msg_bytes_per_shard,
|
||
"total length of messages seen per shard",
|
||
labels = ["shard"],
|
||
)
|
||
|
||
declarePublicGauge(
|
||
waku_relay_max_msg_bytes_per_shard,
|
||
"Maximum length of messages seen per shard",
|
||
labels = ["shard"],
|
||
)
|
||
|
||
declarePublicGauge(
|
||
waku_relay_avg_msg_bytes_per_shard,
|
||
"Average length of messages seen per shard",
|
||
labels = ["shard"],
|
||
)
|
||
|
||
# see: https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#overview-of-new-parameters
|
||
const TopicParameters = TopicParams(
|
||
topicWeight: 1,
|
||
|
||
# p1: favours peers already in the mesh
|
||
timeInMeshWeight: 0.01,
|
||
timeInMeshQuantum: 1.seconds,
|
||
timeInMeshCap: 10.0,
|
||
|
||
# p2: rewards fast peers
|
||
firstMessageDeliveriesWeight: 1.0,
|
||
firstMessageDeliveriesDecay: 0.5,
|
||
firstMessageDeliveriesCap: 10.0,
|
||
|
||
# p3: penalizes lazy peers. safe low value
|
||
meshMessageDeliveriesWeight: 0.0,
|
||
meshMessageDeliveriesDecay: 0.0,
|
||
meshMessageDeliveriesCap: 0,
|
||
meshMessageDeliveriesThreshold: 0,
|
||
meshMessageDeliveriesWindow: 0.milliseconds,
|
||
meshMessageDeliveriesActivation: 0.seconds,
|
||
|
||
# p3b: tracks history of prunes
|
||
meshFailurePenaltyWeight: 0.0,
|
||
meshFailurePenaltyDecay: 0.0,
|
||
|
||
# p4: penalizes invalid messages. highly penalize
|
||
# peers sending wrong messages
|
||
invalidMessageDeliveriesWeight: -100.0,
|
||
invalidMessageDeliveriesDecay: 0.5,
|
||
)
|
||
|
||
# see: https://rfc.vac.dev/spec/29/#gossipsub-v10-parameters
|
||
const GossipsubParameters = GossipSubParams.init(
|
||
pruneBackoff = chronos.minutes(1),
|
||
unsubscribeBackoff = chronos.seconds(5),
|
||
floodPublish = true,
|
||
gossipFactor = 0.25,
|
||
d = 6,
|
||
dLow = 4,
|
||
dHigh = 8,
|
||
dScore = 6,
|
||
dOut = 3,
|
||
dLazy = 6,
|
||
heartbeatInterval = chronos.seconds(1),
|
||
historyLength = 6,
|
||
historyGossip = 3,
|
||
fanoutTTL = chronos.minutes(1),
|
||
seenTTL = chronos.minutes(2),
|
||
|
||
# no gossip is sent to peers below this score
|
||
gossipThreshold = -100,
|
||
|
||
# no self-published msgs are sent to peers below this score
|
||
publishThreshold = -1000,
|
||
|
||
# used to trigger disconnections + ignore peer if below this score
|
||
graylistThreshold = -10000,
|
||
|
||
# grafts better peers if the mesh median score drops below this. unset.
|
||
opportunisticGraftThreshold = 0,
|
||
|
||
# how often peer scoring is updated
|
||
decayInterval = chronos.seconds(12),
|
||
|
||
# below this we consider the parameter to be zero
|
||
decayToZero = 0.01,
|
||
|
||
# remember peer score during x after it disconnects
|
||
retainScore = chronos.minutes(10),
|
||
|
||
# p5: application specific, unset
|
||
appSpecificWeight = 0.0,
|
||
|
||
# p6: penalizes peers sharing more than threshold ips
|
||
ipColocationFactorWeight = -50.0,
|
||
ipColocationFactorThreshold = 5.0,
|
||
|
||
# p7: penalizes bad behaviour (weight and decay)
|
||
behaviourPenaltyWeight = -10.0,
|
||
behaviourPenaltyDecay = 0.986,
|
||
|
||
# triggers disconnections of bad peers aka score <graylistThreshold
|
||
disconnectBadPeers = true,
|
||
)
|
||
|
||
type
|
||
WakuRelayResult*[T] = Result[T, string]
|
||
WakuRelayHandler* = proc(pubsubTopic: PubsubTopic, message: WakuMessage): Future[void] {.
|
||
gcsafe, raises: [Defect]
|
||
.}
|
||
WakuValidatorHandler* = proc(
|
||
pubsubTopic: PubsubTopic, message: WakuMessage
|
||
): Future[ValidationResult] {.gcsafe, raises: [Defect].}
|
||
WakuRelay* = ref object of GossipSub
|
||
brokerCtx: BrokerContext
|
||
peerEventListener: WakuPeerEventListener
|
||
# seq of tuples: the first entry in the tuple contains the validators are called for every topic
|
||
# the second entry contains the error messages to be returned when the validator fails
|
||
wakuValidators: seq[tuple[handler: WakuValidatorHandler, errorMessage: string]]
|
||
# a map of validators to error messages to return when validation fails
|
||
topicValidator: Table[PubsubTopic, ValidatorHandler]
|
||
# map topic with its assigned validator within pubsub
|
||
topicHandlers: Table[PubsubTopic, TopicHandler]
|
||
# map topic with the TopicHandler proc in charge of attending topic's incoming message events
|
||
topicsHealth*: Table[string, TopicHealth]
|
||
onTopicHealthChange*: TopicHealthChangeHandler
|
||
topicHealthLoopHandle*: Future[void]
|
||
topicHealthUpdateEvent: AsyncEvent
|
||
topicHealthDirty: HashSet[string]
|
||
# list of topics that need their health updated in the update event
|
||
topicHealthCheckAll: bool
|
||
# true if all topics need to have their health status refreshed in the update event
|
||
msgMetricsPerShard*: Table[string, ShardMetrics]
|
||
|
||
# predefinition for more detailed results from publishing new message
|
||
type PublishOutcome* {.pure.} = enum
|
||
NoTopicSpecified
|
||
DuplicateMessage
|
||
NoPeersToPublish
|
||
CannotGenerateMessageId
|
||
|
||
proc initProtocolHandler(w: WakuRelay) =
|
||
proc handler(conn: Connection, proto: string) {.async: (raises: [CancelledError]).} =
|
||
## main protocol handler that gets triggered on every
|
||
## connection for a protocol string
|
||
## e.g. ``/wakusub/0.0.1``, etc...
|
||
info "Incoming WakuRelay connection", connection = conn, protocol = proto
|
||
|
||
try:
|
||
await w.handleConn(conn, proto)
|
||
except CancelledError:
|
||
# This is top-level procedure which will work as separate task, so it
|
||
# do not need to propogate CancelledError.
|
||
error "Unexpected cancellation in relay handler",
|
||
conn = conn, error = getCurrentExceptionMsg()
|
||
except CatchableError:
|
||
error "WakuRelay handler leaks an error",
|
||
conn = conn, error = getCurrentExceptionMsg()
|
||
|
||
# XXX: Handler hijack GossipSub here?
|
||
w.handler = handler
|
||
w.codec = WakuRelayCodec
|
||
|
||
proc logMessageInfo*(
|
||
w: WakuRelay,
|
||
remotePeerId: string,
|
||
topic: string,
|
||
msg_id_short: string,
|
||
msg: WakuMessage,
|
||
onRecv: bool,
|
||
) =
|
||
let msg_hash = computeMessageHash(topic, msg).to0xHex()
|
||
let payloadSize = float64(msg.payload.len)
|
||
|
||
if onRecv:
|
||
debug "received relay message",
|
||
my_peer_id = w.switch.peerInfo.peerId,
|
||
msg_hash = msg_hash,
|
||
msg_id = msg_id_short,
|
||
from_peer_id = remotePeerId,
|
||
topic = topic,
|
||
contentTopic = msg.contentTopic,
|
||
receivedTime = getNowInNanosecondTime(),
|
||
payloadSizeBytes = payloadSize
|
||
else:
|
||
debug "sent relay message",
|
||
my_peer_id = w.switch.peerInfo.peerId,
|
||
msg_hash = msg_hash,
|
||
msg_id = msg_id_short,
|
||
to_peer_id = remotePeerId,
|
||
topic = topic,
|
||
contentTopic = msg.contentTopic,
|
||
sentTime = getNowInNanosecondTime(),
|
||
payloadSizeBytes = payloadSize
|
||
|
||
var shardMetrics = w.msgMetricsPerShard.getOrDefault(topic, ShardMetrics())
|
||
shardMetrics.count += 1
|
||
shardMetrics.sizeSum += payloadSize
|
||
if payloadSize > shardMetrics.maxSize:
|
||
shardMetrics.maxSize = payloadSize
|
||
shardMetrics.avgSize = shardMetrics.sizeSum / shardMetrics.count
|
||
w.msgMetricsPerShard[topic] = shardMetrics
|
||
|
||
waku_relay_max_msg_bytes_per_shard.set(shardMetrics.maxSize, labelValues = [topic])
|
||
|
||
waku_relay_avg_msg_bytes_per_shard.set(shardMetrics.avgSize, labelValues = [topic])
|
||
|
||
waku_relay_total_msg_bytes_per_shard.set(shardMetrics.sizeSum, labelValues = [topic])
|
||
|
||
proc initRelayObservers(w: WakuRelay) =
|
||
proc decodeRpcMessageInfo(
|
||
peer: PubSubPeer, msg: Message
|
||
): Result[
|
||
tuple[msgId: string, topic: string, wakuMessage: WakuMessage, msgSize: int], void
|
||
] =
|
||
let msg_id = w.msgIdProvider(msg).valueOr:
|
||
warn "Error generating message id",
|
||
my_peer_id = w.switch.peerInfo.peerId,
|
||
from_peer_id = peer.peerId,
|
||
pubsub_topic = msg.topic,
|
||
error = $error
|
||
return err()
|
||
|
||
let msg_id_short = shortLog(msg_id)
|
||
|
||
let wakuMessage = WakuMessage.decode(msg.data).valueOr:
|
||
warn "Error decoding to Waku Message",
|
||
my_peer_id = w.switch.peerInfo.peerId,
|
||
msg_id = msg_id_short,
|
||
from_peer_id = peer.peerId,
|
||
pubsub_topic = msg.topic,
|
||
error = $error
|
||
return err()
|
||
|
||
let msgSize = msg.data.len + msg.topic.len
|
||
return ok((msg_id_short, msg.topic, wakuMessage, msgSize))
|
||
|
||
proc updateMetrics(
|
||
peer: PubSubPeer,
|
||
pubsub_topic: string,
|
||
msg: WakuMessage,
|
||
msgSize: int,
|
||
onRecv: bool,
|
||
) =
|
||
if onRecv:
|
||
waku_relay_network_bytes.inc(
|
||
msgSize.int64, labelValues = [pubsub_topic, "gross", "in"]
|
||
)
|
||
else:
|
||
# sent traffic can only be "net"
|
||
# TODO: If we can measure unsuccessful sends would mean a possible distinction between gross/net
|
||
waku_relay_network_bytes.inc(
|
||
msgSize.int64, labelValues = [pubsub_topic, "net", "out"]
|
||
)
|
||
|
||
proc onRecv(peer: PubSubPeer, msgs: var RPCMsg) =
|
||
if msgs.control.isSome():
|
||
let ctrl = msgs.control.get()
|
||
var topicsChanged = false
|
||
|
||
for graft in ctrl.graft:
|
||
w.topicHealthDirty.incl(graft.topicID)
|
||
topicsChanged = true
|
||
|
||
for prune in ctrl.prune:
|
||
w.topicHealthDirty.incl(prune.topicID)
|
||
topicsChanged = true
|
||
|
||
if topicsChanged:
|
||
w.topicHealthUpdateEvent.fire()
|
||
|
||
for msg in msgs.messages:
|
||
let (msg_id_short, topic, wakuMessage, msgSize) = decodeRpcMessageInfo(peer, msg).valueOr:
|
||
continue
|
||
# message receive log happens in onValidated observer as onRecv is called before checks
|
||
updateMetrics(peer, topic, wakuMessage, msgSize, onRecv = true)
|
||
discard
|
||
|
||
proc onValidated(peer: PubSubPeer, msg: Message, msgId: MessageId) =
|
||
let msg_id_short = shortLog(msgId)
|
||
let wakuMessage = WakuMessage.decode(msg.data).valueOr:
|
||
warn "onValidated: failed decoding to Waku Message",
|
||
my_peer_id = w.switch.peerInfo.peerId,
|
||
msg_id = msg_id_short,
|
||
from_peer_id = peer.peerId,
|
||
pubsub_topic = msg.topic,
|
||
error = $error
|
||
return
|
||
|
||
logMessageInfo(
|
||
w, shortLog(peer.peerId), msg.topic, msg_id_short, wakuMessage, onRecv = true
|
||
)
|
||
|
||
proc onSend(peer: PubSubPeer, msgs: var RPCMsg) =
|
||
for msg in msgs.messages:
|
||
let (msg_id_short, topic, wakuMessage, msgSize) = decodeRpcMessageInfo(peer, msg).valueOr:
|
||
warn "onSend: failed decoding RPC info",
|
||
my_peer_id = w.switch.peerInfo.peerId, to_peer_id = peer.peerId
|
||
continue
|
||
logMessageInfo(
|
||
w, shortLog(peer.peerId), topic, msg_id_short, wakuMessage, onRecv = false
|
||
)
|
||
updateMetrics(peer, topic, wakuMessage, msgSize, onRecv = false)
|
||
|
||
let administrativeObserver =
|
||
PubSubObserver(onRecv: onRecv, onSend: onSend, onValidated: onValidated)
|
||
|
||
w.addObserver(administrativeObserver)
|
||
|
||
proc new*(
|
||
T: type WakuRelay, switch: Switch, maxMessageSize = int(DefaultMaxWakuMessageSize)
|
||
): WakuRelayResult[T] =
|
||
## maxMessageSize: max num bytes that are allowed for the WakuMessage
|
||
|
||
var w: WakuRelay
|
||
try:
|
||
w = WakuRelay.init(
|
||
switch = switch,
|
||
anonymize = true,
|
||
verifySignature = false,
|
||
sign = false,
|
||
triggerSelf = true,
|
||
msgIdProvider = defaultMessageIdProvider,
|
||
maxMessageSize = maxMessageSize,
|
||
parameters = GossipsubParameters,
|
||
)
|
||
w.brokerCtx = globalBrokerContext()
|
||
|
||
procCall GossipSub(w).initPubSub()
|
||
w.topicsHealth = initTable[string, TopicHealth]()
|
||
w.topicHealthUpdateEvent = newAsyncEvent()
|
||
w.topicHealthDirty = initHashSet[string]()
|
||
w.topicHealthCheckAll = false
|
||
w.initProtocolHandler()
|
||
w.initRelayObservers()
|
||
|
||
w.peerEventListener = WakuPeerEvent.listen(
|
||
w.brokerCtx,
|
||
proc(evt: WakuPeerEvent): Future[void] {.async: (raises: []), gcsafe.} =
|
||
if evt.kind == WakuPeerEventKind.EventDisconnected:
|
||
w.topicHealthCheckAll = true
|
||
w.topicHealthUpdateEvent.fire()
|
||
,
|
||
).valueOr:
|
||
return err("Failed to subscribe to peer events: " & error)
|
||
except InitializationError:
|
||
return err("initialization error: " & getCurrentExceptionMsg())
|
||
|
||
return ok(w)
|
||
|
||
proc addValidator*(
|
||
w: WakuRelay, handler: WakuValidatorHandler, errorMessage: string = ""
|
||
) {.gcsafe.} =
|
||
w.wakuValidators.add((handler, errorMessage))
|
||
|
||
proc addObserver*(w: WakuRelay, observer: PubSubObserver) {.gcsafe.} =
|
||
## Observes when a message is sent/received from the GossipSub PoV
|
||
procCall GossipSub(w).addObserver(observer)
|
||
|
||
proc getDHigh*(T: type WakuRelay): int =
|
||
return GossipsubParameters.dHigh
|
||
|
||
proc getPubSubPeersInMesh*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic
|
||
): Result[HashSet[PubSubPeer], string] =
|
||
## Returns the list of PubSubPeers in a mesh defined by the passed pubsub topic.
|
||
## The 'mesh' atribute is defined in the GossipSub ref object.
|
||
|
||
# If pubsubTopic is empty, we return all peers in mesh for any pubsub topic
|
||
if pubsubTopic == "":
|
||
var allPeers = initHashSet[PubSubPeer]()
|
||
for topic, topicMesh in w.mesh.pairs:
|
||
allPeers = allPeers.union(topicMesh)
|
||
return ok(allPeers)
|
||
|
||
if not w.mesh.hasKey(pubsubTopic):
|
||
info "getPubSubPeersInMesh - there is no mesh peer for the given pubsub topic",
|
||
pubsubTopic = pubsubTopic
|
||
return ok(initHashSet[PubSubPeer]())
|
||
|
||
let peersRes = catch:
|
||
w.mesh[pubsubTopic]
|
||
|
||
let peers: HashSet[PubSubPeer] = peersRes.valueOr:
|
||
return err(
|
||
"getPubSubPeersInMesh - exception accessing " & pubsubTopic & ": " & error.msg
|
||
)
|
||
|
||
return ok(peers)
|
||
|
||
proc getPeersInMesh*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic = ""
|
||
): Result[seq[PeerId], string] =
|
||
## Returns the list of peerIds in a mesh defined by the passed pubsub topic.
|
||
## The 'mesh' atribute is defined in the GossipSub ref object.
|
||
let pubSubPeers = ?w.getPubSubPeersInMesh(pubsubTopic)
|
||
let peerIds = toSeq(pubSubPeers).mapIt(it.peerId)
|
||
|
||
return ok(peerIds)
|
||
|
||
proc getNumPeersInMesh*(w: WakuRelay, pubsubTopic: PubsubTopic): Result[int, string] =
|
||
## Returns the number of peers in a mesh defined by the passed pubsub topic.
|
||
|
||
let peers = w.getPubSubPeersInMesh(pubsubTopic).valueOr:
|
||
return err(
|
||
"getNumPeersInMesh - failed retrieving peers in mesh: " & pubsubTopic & ": " &
|
||
error
|
||
)
|
||
|
||
return ok(peers.len)
|
||
|
||
proc calculateTopicHealth(wakuRelay: WakuRelay, topic: string): TopicHealth =
|
||
let numPeersInMesh = wakuRelay.getNumPeersInMesh(topic).valueOr:
|
||
error "Could not calculate topic health", topic = topic, error = error
|
||
return TopicHealth.UNHEALTHY
|
||
|
||
if numPeersInMesh < 1:
|
||
return TopicHealth.UNHEALTHY
|
||
elif numPeersInMesh < wakuRelay.parameters.dLow:
|
||
return TopicHealth.MINIMALLY_HEALTHY
|
||
return TopicHealth.SUFFICIENTLY_HEALTHY
|
||
|
||
proc isSubscribed*(w: WakuRelay, topic: PubsubTopic): bool =
|
||
GossipSub(w).topics.hasKey(topic)
|
||
|
||
proc subscribedTopics*(w: WakuRelay): seq[PubsubTopic] =
|
||
return toSeq(GossipSub(w).topics.keys())
|
||
|
||
proc topicsHealthLoop(w: WakuRelay) {.async.} =
|
||
while true:
|
||
await w.topicHealthUpdateEvent.wait()
|
||
w.topicHealthUpdateEvent.clear()
|
||
|
||
var topicsToCheck: seq[string]
|
||
|
||
if w.topicHealthCheckAll:
|
||
topicsToCheck = toSeq(w.topics.keys)
|
||
else:
|
||
topicsToCheck = toSeq(w.topicHealthDirty)
|
||
|
||
w.topicHealthCheckAll = false
|
||
w.topicHealthDirty.clear()
|
||
|
||
var futs = newSeq[Future[void]]()
|
||
|
||
for topic in topicsToCheck:
|
||
# guard against topic being unsubscribed since fire()
|
||
if not w.isSubscribed(topic):
|
||
continue
|
||
|
||
let
|
||
oldHealth = w.topicsHealth.getOrDefault(topic, TopicHealth.UNHEALTHY)
|
||
currentHealth = w.calculateTopicHealth(topic)
|
||
|
||
if oldHealth == currentHealth:
|
||
continue
|
||
|
||
w.topicsHealth[topic] = currentHealth
|
||
|
||
EventShardTopicHealthChange.emit(w.brokerCtx, topic, currentHealth)
|
||
|
||
if not w.onTopicHealthChange.isNil():
|
||
futs.add(w.onTopicHealthChange(topic, currentHealth))
|
||
|
||
if futs.len() > 0:
|
||
try:
|
||
discard await allFinished(futs)
|
||
except CancelledError:
|
||
break
|
||
except CatchableError as e:
|
||
warn "Error in topic health callback", error = e.msg
|
||
|
||
# safety cooldown to protect from edge cases
|
||
await sleepAsync(100.milliseconds)
|
||
|
||
method start*(w: WakuRelay) {.async: (raises: [CancelledError]).} =
|
||
info "start"
|
||
await procCall GossipSub(w).start()
|
||
w.topicHealthLoopHandle = w.topicsHealthLoop()
|
||
|
||
method stop*(w: WakuRelay) {.async: (raises: []).} =
|
||
info "stop"
|
||
await procCall GossipSub(w).stop()
|
||
|
||
await WakuPeerEvent.dropListener(w.brokerCtx, w.peerEventListener)
|
||
|
||
if not w.topicHealthLoopHandle.isNil():
|
||
await w.topicHealthLoopHandle.cancelAndWait()
|
||
|
||
proc generateOrderedValidator(w: WakuRelay): ValidatorHandler {.gcsafe.} =
|
||
# rejects messages that are not WakuMessage
|
||
let wrappedValidator = proc(
|
||
pubsubTopic: string, message: messages.Message
|
||
): Future[ValidationResult] {.async.} =
|
||
# can be optimized by checking if the message is a WakuMessage without allocating memory
|
||
# see nim-libp2p protobuf library
|
||
let msg = WakuMessage.decode(message.data).valueOr:
|
||
error "protocol generateOrderedValidator reject decode error",
|
||
pubsubTopic = pubsubTopic, error = $error
|
||
return ValidationResult.Reject
|
||
|
||
# now sequentially validate the message
|
||
for (validator, errorMessage) in w.wakuValidators:
|
||
let validatorRes = await validator(pubsubTopic, msg)
|
||
|
||
if validatorRes != ValidationResult.Accept:
|
||
let msgHash = computeMessageHash(pubsubTopic, msg).to0xHex()
|
||
error "protocol generateOrderedValidator reject waku validator",
|
||
msg_hash = msgHash,
|
||
pubsubTopic = pubsubTopic,
|
||
contentTopic = msg.contentTopic,
|
||
validatorRes = validatorRes,
|
||
error = errorMessage
|
||
|
||
return validatorRes
|
||
|
||
return ValidationResult.Accept
|
||
|
||
return wrappedValidator
|
||
|
||
proc validateMessage*(
|
||
w: WakuRelay, pubsubTopic: string, msg: WakuMessage
|
||
): Future[Result[void, string]] {.async.} =
|
||
let messageSizeBytes = msg.encode().buffer.len
|
||
let msgHash = computeMessageHash(pubsubTopic, msg).to0xHex()
|
||
|
||
if messageSizeBytes > w.maxMessageSize:
|
||
let message = fmt"Message size exceeded maximum of {w.maxMessageSize} bytes"
|
||
error "too large Waku message",
|
||
msg_hash = msgHash,
|
||
error = message,
|
||
messageSizeBytes = messageSizeBytes,
|
||
maxMessageSize = w.maxMessageSize
|
||
|
||
return err(message)
|
||
|
||
for (validator, message) in w.wakuValidators:
|
||
let validatorRes = await validator(pubsubTopic, msg)
|
||
if validatorRes != ValidationResult.Accept:
|
||
if message.len > 0:
|
||
error "invalid Waku message", msg_hash = msgHash, error = message
|
||
return err(message)
|
||
else:
|
||
## This should never happen
|
||
error "uncertain invalid Waku message", msg_hash = msgHash, error = message
|
||
return err("validator failed")
|
||
|
||
return ok()
|
||
|
||
proc subscribe*(w: WakuRelay, pubsubTopic: PubsubTopic, handler: WakuRelayHandler) =
|
||
info "subscribe", pubsubTopic = pubsubTopic
|
||
|
||
# We need to wrap the handler since gossipsub doesnt understand WakuMessage
|
||
let topicHandler = proc(
|
||
pubsubTopic: string, data: seq[byte]
|
||
): Future[void] {.gcsafe, raises: [].} =
|
||
let decMsg = WakuMessage.decode(data).valueOr:
|
||
# fine if triggerSelf enabled, since validators are bypassed
|
||
error "failed to decode WakuMessage, validator passed a wrong message",
|
||
pubsubTopic = pubsubTopic, error = error
|
||
let fut = newFuture[void]()
|
||
fut.complete()
|
||
return fut
|
||
# this subscription handler is called once for every validated message
|
||
# that will be relayed, hence this is the place we can count net incoming traffic
|
||
waku_relay_network_bytes.inc(
|
||
data.len.int64 + pubsubTopic.len.int64, labelValues = [pubsubTopic, "net", "in"]
|
||
)
|
||
|
||
return handler(pubsubTopic, decMsg)
|
||
|
||
# Add the ordered validator to the topic
|
||
# This assumes that if `w.validatorInserted.hasKey(pubSubTopic) is true`, it contains the ordered validator.
|
||
# Otherwise this might lead to unintended behaviour.
|
||
if not w.topicValidator.hasKey(pubSubTopic):
|
||
let newValidator = w.generateOrderedValidator()
|
||
procCall GossipSub(w).addValidator(pubSubTopic, newValidator)
|
||
w.topicValidator[pubSubTopic] = newValidator
|
||
|
||
# set this topic parameters for scoring
|
||
w.topicParams[pubsubTopic] = TopicParameters
|
||
|
||
# subscribe to the topic with our wrapped handler
|
||
procCall GossipSub(w).subscribe(pubsubTopic, topicHandler)
|
||
|
||
w.topicHandlers[pubsubTopic] = topicHandler
|
||
w.topicHealthDirty.incl(pubsubTopic)
|
||
w.topicHealthUpdateEvent.fire()
|
||
|
||
proc unsubscribeAll*(w: WakuRelay, pubsubTopic: PubsubTopic) =
|
||
## Unsubscribe all handlers on this pubsub topic
|
||
|
||
info "unsubscribe all", pubsubTopic = pubsubTopic
|
||
|
||
procCall GossipSub(w).unsubscribeAll(pubsubTopic)
|
||
w.topicValidator.del(pubsubTopic)
|
||
w.topicHandlers.del(pubsubTopic)
|
||
w.topicsHealth.del(pubsubTopic)
|
||
w.topicHealthDirty.excl(pubsubTopic)
|
||
|
||
proc unsubscribe*(w: WakuRelay, pubsubTopic: PubsubTopic) =
|
||
if not w.topicValidator.hasKey(pubsubTopic):
|
||
error "unsubscribe no validator for this topic", pubsubTopic
|
||
return
|
||
|
||
if not w.topicHandlers.hasKey(pubsubTopic):
|
||
error "not subscribed to the given topic", pubsubTopic
|
||
return
|
||
|
||
var topicHandler: TopicHandler
|
||
var topicValidator: ValidatorHandler
|
||
try:
|
||
topicHandler = w.topicHandlers[pubsubTopic]
|
||
topicValidator = w.topicValidator[pubsubTopic]
|
||
except KeyError:
|
||
error "exception in unsubscribe", pubsubTopic, error = getCurrentExceptionMsg()
|
||
return
|
||
|
||
info "unsubscribe", pubsubTopic
|
||
procCall GossipSub(w).unsubscribe(pubsubTopic, topicHandler)
|
||
procCall GossipSub(w).removeValidator(pubsubTopic, topicValidator)
|
||
|
||
w.topicValidator.del(pubsubTopic)
|
||
w.topicHandlers.del(pubsubTopic)
|
||
w.topicsHealth.del(pubsubTopic)
|
||
w.topicHealthDirty.excl(pubsubTopic)
|
||
|
||
proc publish*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic, wakuMessage: WakuMessage
|
||
): Future[Result[int, PublishOutcome]] {.async.} =
|
||
if pubsubTopic.isEmptyOrWhitespace():
|
||
return err(NoTopicSpecified)
|
||
|
||
var message = wakuMessage
|
||
if message.timestamp == 0:
|
||
message.timestamp = getNowInNanosecondTime()
|
||
|
||
let data = message.encode().buffer
|
||
|
||
let msgHash = computeMessageHash(pubsubTopic, message).to0xHex()
|
||
notice "start publish Waku message",
|
||
msg_hash = msgHash, pubsubTopic = pubsubTopic, contentTopic = message.contentTopic
|
||
|
||
let relayedPeerCount = await procCall GossipSub(w).publish(pubsubTopic, data)
|
||
|
||
if relayedPeerCount <= 0:
|
||
return err(NoPeersToPublish)
|
||
|
||
return ok(relayedPeerCount)
|
||
|
||
proc getConnectedPubSubPeers*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic
|
||
): Result[HashSet[PubsubPeer], string] =
|
||
## Returns the list of peerIds of connected peers and subscribed to the passed pubsub topic.
|
||
## The 'gossipsub' atribute is defined in the GossipSub ref object.
|
||
|
||
if pubsubTopic == "":
|
||
## Return all the connected peers
|
||
var peerIds = initHashSet[PubsubPeer]()
|
||
for k, v in w.gossipsub:
|
||
peerIds = peerIds + v
|
||
return ok(peerIds)
|
||
|
||
if not w.gossipsub.hasKey(pubsubTopic):
|
||
return err(
|
||
"getConnectedPeers - there is no gossipsub peer for the given pubsub topic: " &
|
||
pubsubTopic
|
||
)
|
||
|
||
let peersRes = catch:
|
||
w.gossipsub[pubsubTopic]
|
||
|
||
let peers: HashSet[PubSubPeer] = peersRes.valueOr:
|
||
return
|
||
err("getConnectedPeers - exception accessing " & pubsubTopic & ": " & error.msg)
|
||
|
||
return ok(peers)
|
||
|
||
proc getConnectedPeers*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic
|
||
): Result[seq[PeerId], string] =
|
||
## Returns the list of peerIds of connected peers and subscribed to the passed pubsub topic.
|
||
## The 'gossipsub' atribute is defined in the GossipSub ref object.
|
||
|
||
let peers = ?w.getConnectedPubSubPeers(pubsubTopic)
|
||
|
||
let peerIds = toSeq(peers).mapIt(it.peerId)
|
||
return ok(peerIds)
|
||
|
||
proc getNumConnectedPeers*(
|
||
w: WakuRelay, pubsubTopic: PubsubTopic
|
||
): Result[int, string] =
|
||
## Returns the number of connected peers and subscribed to the passed pubsub topic.
|
||
|
||
## Return all the connected peers
|
||
let peers = w.getConnectedPubSubPeers(pubsubTopic).valueOr:
|
||
return err(
|
||
"getNumConnectedPeers - failed retrieving peers in mesh: " & pubsubTopic & ": " &
|
||
error
|
||
)
|
||
|
||
return ok(peers.len)
|
||
|
||
proc getSubscribedTopics*(w: WakuRelay): seq[PubsubTopic] =
|
||
## Returns a seq containing the current list of subscribed topics
|
||
return PubSub(w).topics.keys.toSeq().mapIt(cast[PubsubTopic](it))
|