* add more traces, remove async from rebalance
* more traces
* avoid computng scores when weight is 0.0
* debug colocation, fix an indent in unsubpeer (minor)
* add full ValidationResult coverage
* store in cache only after validation
* gossip 1.0 fixes
* fix typo
* gossip 10 internal test fixes
* test fixing
* refactor peerstats usages
* populate tables if missing when scoring
* streamline socket read/write hot path
This avoids some unnecessary memory copying on the hot path of noise /
mplex, as well as getting rid of a few futures - profiling shows that
this is one of the main culprits of small memory allocations, which
makes sense - this is where gossip fan-out happens.
* fewer futures (and corresponding closures) when sending lpchannel
messages
* avoid data copies when encrypting and framing noise messages
* avoid copying tuple when reading noise data (poor c codegen)
* fix setting eof flag in secure read
* write noise frames in one go
...and closing secure socket once is enough
* check that connection is not closed or eof
* don't release connection lock prematurely
* test that only valid connections can be added
* correct exception type on closed connection
* add clarifying comment
* use closeWithEOF for more stable test
* misc comments
* log stream id in buffestream asserts
* use closeWithEOF to prevent races in tests
* give some time to the remote handler to trigger
* adding more tests to make codecov happy
* handle resets properly with/without pushes/reads
* add clarifying comments
* pushEof should also not be concurrent
* move channel reset to bufferstream
this is where the action happens - lpchannel merely redefines how close
is done
Co-authored-by: Jacek Sieka <jacek@status.im>
* move gossip parameters to runtime
* internal test fixes
* add missing params
* restore const parameters are soldi base and use them in init
* more constants tuning
* rework transport to use the new accept api
* use the new chronos primits
* fixup tests to use the new transport api
* handle all exceptions in upgradeIncoming
* master merge
* add multiaddress exception type
* raise appropriate exception on invalida address
* allow retrying on TransportTooManyError
* adding TODO
* wip
* merge master
* add sleep if nil is returned
* accept loop handles all exceptions
* avoid issues with tray/except/finally
* make consistent with master
* cleanup accept loop
* logging
* Update libp2p/transports/tcptransport.nim
Co-authored-by: Jacek Sieka <jacek@status.im>
* use Direction enum instead of initiator flag
* use consistent import style
* remove experimental `closeWithEOF()`
Co-authored-by: Jacek Sieka <jacek@status.im>
* fix channels not being reset
silly for loop..
* allow only one concurrent read
* fix mplex test race condition
* add some bufferstream eof tests
* deadlock, lost data and hung channel fixes
* prevent concurrent `reset` calls
* reset LPChannel when read is cancelled (since data is lost)
* ensure there's one, and one only, 0-byte readOnce on EOF
* ensure that all data is returned before EOF is returned
* keep running activity monitor for half-closed channels (or they never
get closed)
* break stream tracking by type
* use closeWithEOF to await wrapped stream
* fix cancelation leaks
* fix channel leaks
* logging
* use close monitor and always call closeUnderlying
* don't use closeWithEOF
* removing close monitor
* logging
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.
Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.
* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.
* remove dial lock
* run reconnection loop in background task
* channel close race and deadlock fixes
* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes
A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.
* docs, simplification
* add peer lifecycle events
* rework peer events to not use connection events
* don't use result in pubsub and switch init
* wip
* use ordered hashes and remove logscope
* logging
* add missing test
* small fixes
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.
When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.
In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.
* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
if the connection is already closed (because the remote closes during
identfiy for example), an exception would be raised which would leave
the connection in limbo, beacuse it would not go through the rest of
internalConnect.
Also, if the connection is already closed, the disconnect event would be
scheduled before the connect event :/
* mcache fixes
* remove timed cache - the window shifting already removes old messages
* ref -> object
* avoid unnecessary allocations with `[]` operator
* simplify init
* fix several gossipsub/floodsub issues
* floodsub, gossipsub: don't rebroadcast messages that fail validation
(!)
* floodsub, gossipsub: don't crash when unsubscribing from unknown
topics (!)
* gossipsub: don't send message to peers that are not interested in the
topic, when messages don't share topic list
* floodsub: don't repeat all messages for each message when
rebroadcasting
* floodsub: allow sending empty data
* floodsub: fix inefficient unsubscribe
* sync floodsub/gossipsub logging
* gossipsub: include incoming messages in mcache (!)
* gossipsub: don't rebroadcast already-seen messages (!)
* pubsubpeer: remove incoming/outgoing seen caches - these are already
handled in gossipsub, floodsub and will cause trouble when peers try to
resubscribe / regraft topics (because control messages will have same
digest)
* timedcache: reimplement without timers (fixes timer leaks and extreme
inefficiency due to per-message closures, futures etc)
* timedcache: ref -> obj
when identify is run on incoming connections, the connmanager tables are
updated too late for incoming connections to properly be handled
this is a quickfix that will eventually need cleaning up