Commit Graph

756 Commits

Author SHA1 Message Date
Jacek Sieka 1befeb8c2e
clean up peerid (#470)
* fix dangling cstring on error return
* remove some useless inlines
* less mallocs in shortlog
* proc -> func
* rename test
2020-12-03 13:53:16 -06:00
Dmitriy Ryajov e9d4679059
Race in connection setup (#464)
* check that connection is not closed or eof

* don't release connection lock prematurely

* test that only valid connections can be added

* correct exception type on closed connection

* add clarifying comment

* use closeWithEOF for more stable test

* misc comments

* log stream id in buffestream asserts

* use closeWithEOF to prevent races in tests

* give some time to the remote handler to trigger

* adding more tests to make codecov happy
2020-12-02 19:24:48 -06:00
Dmitriy Ryajov d1c689e5ab
adding libp2p tag to logScope (#465) 2020-12-01 11:34:27 -06:00
Giovanni Petrantoni e1648d4404 fix mcache logic check in gossipsub 2020-12-01 23:55:51 +09:00
Giovanni Petrantoni b4738d723c
Some gossip fixes (#467)
* fix some missing rpc in rebalanceMesh

* clarify some variable names and lifetime

* further improvements
2020-12-01 11:44:09 +01:00
Dmitriy Ryajov 94e672ead0
allow concurrent closeWithEOF (#466)
* allow concurrent closeWithEOF

* add dedicated closedWithEOF flag
2020-12-01 09:44:21 +01:00
Jacek Sieka 5c2a54bdd9
fix timeoutmonitor loop (#463)
* fix timeoutmonitor loop

* Clarify that cancellation can happen while in timeoutMonitor
2020-11-29 13:34:19 +01:00
Dmitriy Ryajov 18443dafc1
rework peer event to take an initiator flag (#456)
* rework peer event to take an initiator flag

* use correct direction for initiator
2020-11-28 10:59:47 -06:00
Dmitriy Ryajov 3d44fcb8b3
use cancelAndAwait to mitigate further hangs (#459) 2020-11-28 09:48:06 -06:00
Dmitriy Ryajov a8f5f7a8bb
move dialing logic to it's own proc to avoid try/finally bugs (#461)
* move dialing logic to it's own proc to avoid try/finally bugs

* re-export transport

* lint

* add cancelation test

* test remote conn close on dial
2020-11-28 09:05:12 +01:00
Giovanni Petrantoni 02b20440f2
Limit ihave emission (#462)
* add some limits to ihave emission, go has them, rust does not actually

* restore shuffling of IDs

* add some context
2020-11-28 16:27:39 +09:00
Giovanni Petrantoni 12db9a4cf2 TopicParams validation tuning 2020-11-28 00:23:09 +09:00
Giovanni Petrantoni 809df8d04d add some extra gossip metrics 2020-11-26 16:20:34 +09:00
Giovanni Petrantoni 6c7f2766fe
expose seenTTL parameters (#457) 2020-11-26 14:45:10 +09:00
Dmitriy Ryajov ca9c5c85e4
dont break chronicles logging streamline connsetup (#455) 2020-11-25 13:34:48 -06:00
Dmitriy Ryajov 7b1e652224
Allow custom identify agent string (#451)
* allow custom agent version string

* rework tests and add test for custom agent version
2020-11-25 07:42:02 -06:00
Dmitriy Ryajov 164892776b
get rid of hangs cleanup (#453) 2020-11-25 07:35:25 -06:00
Dmitriy Ryajov 21110636cb
fixing log level to avoid sacring users (#452) 2020-11-24 12:07:27 -06:00
Dmitriy Ryajov 351489bfa9
getMuxedStream to more appropriate getStream (#448) 2020-11-24 00:37:45 -06:00
Dmitriy Ryajov 69ae24dc8d
less leak prone cleanup (#447)
* less leak prone cleanup

* fix double allFinished
2020-11-23 18:22:15 -06:00
Dmitriy Ryajov 6cc3f4283a
update conn peerinfo instead of replacing (#445)
* update conn peerinfo instead of replacing

* remove unnecesary peerid var
2020-11-23 15:15:55 -06:00
Dmitriy Ryajov 034a1e8b1b
small cleanups from tcp-limits2 (#446) 2020-11-23 15:02:23 -06:00
Dmitriy Ryajov 1d16d22f5f
Don't allow concurrent pushdata (#444)
* handle resets properly with/without pushes/reads

* add clarifying comments

* pushEof should also not be concurrent

* move channel reset to bufferstream

this is where the action happens - lpchannel merely redefines how close
is done

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-23 09:07:11 -06:00
Dmitriy Ryajov c42009d56e
don't quit accept prematurelly (#443) 2020-11-19 09:10:25 -06:00
Giovanni Petrantoni 93b6c4dc52
Gossip runtime params (#437)
* move gossip parameters to runtime

* internal test fixes

* add missing params

* restore const parameters are soldi base and use them in init

* more constants tuning
2020-11-19 16:48:17 +09:00
Dmitriy Ryajov 92fa4110c1
Rework transport to use chronos accept (#420)
* rework transport to use the new accept api

* use the new chronos primits

* fixup tests to use the new transport api

* handle all exceptions in upgradeIncoming

* master merge

* add multiaddress exception type

* raise appropriate exception on invalida address

* allow retrying on TransportTooManyError

* adding TODO

* wip

* merge master

* add sleep if nil is returned

* accept loop handles all exceptions

* avoid issues with tray/except/finally

* make consistent with master

* cleanup accept loop

* logging

* Update libp2p/transports/tcptransport.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* use Direction enum instead of initiator flag

* use consistent import style

* remove experimental `closeWithEOF()`

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-18 20:06:42 -06:00
Dmitriy Ryajov 8c8d73380f
Re-add connection manager tests (#441)
* use table.getOrDefault()

* re-add missing connection manager tests
2020-11-17 18:48:26 -06:00
Jacek Sieka 74acd0a33a
fix channels not being reset (#439)
* fix channels not being reset

silly for loop..

* allow only one concurrent read
* fix mplex test race condition
* add some bufferstream eof tests

* deadlock, lost data and hung channel fixes

* prevent concurrent `reset` calls
* reset LPChannel when read is cancelled (since data is lost)
* ensure there's one, and one only, 0-byte readOnce on EOF
* ensure that all data is returned before EOF is returned
* keep running activity monitor for half-closed channels (or they never
get closed)
2020-11-17 08:59:25 -06:00
Dmitriy Ryajov da37eee285
Test disconnect from conn event (#432)
* logs

* adding disconnect test in connection events

* adding immediate disconnect from connection event
2020-11-11 13:20:14 -06:00
Giovanni Petrantoni a9948b0b05
clarify validation messages (#431)
* clarify validation messages

* add codecov threshold
2020-11-12 01:42:12 +09:00
Dmitriy Ryajov 90921bff09
move some importance trace logs to debug (#428) 2020-11-09 22:14:46 -06:00
Dmitriy Ryajov 4fb3f50d2c
Reset channels on close (#425)
* reset when failed to read/write muxed conn

* add more comprehensive resource cleanup tests

* style

* cleanup tests
2020-11-06 09:24:24 -06:00
Dmitriy Ryajov 3956f3fd69
make sure all streams are tracked (#422)
* make sure all streams are tracked

* revert unnecesary change
2020-11-04 21:52:54 -06:00
Dmitriy Ryajov 6040cb4ef1
fix debugutils (#423) 2020-11-04 19:56:28 -06:00
Giovanni Petrantoni 7cc42ce219
start adding more tests + minor fixes (#419)
* start adding more tests + minor fixes

* add wrong secure negotiation test

* add noise failed handshake test
2020-11-04 23:24:41 +09:00
Giovanni Petrantoni e496802943
Least expensive metrics (#421)
* add more general and useful metrics

* fix gossipsub peers metrics in heartbeat
2020-11-04 15:18:00 +01:00
Dmitriy Ryajov 7b5259dbc7
Move triggers (#416)
* move event triggers to connmanager

* use base error type

* avoid deadlocks

* handle eof and closed when identifying incoming

* use `closeWait`
2020-11-02 14:35:26 -06:00
Dmitriy Ryajov 43a77e60a1
split stream counts by direction (#418) 2020-11-01 16:23:26 -06:00
Jacek Sieka 03639f1446
Revert "Channel leaks (#413)" (#417)
This reverts commit 1de1d49223.
2020-11-01 14:49:25 -06:00
Giovanni Petrantoni 9c1633bf87 fix ValidIpAddress multiaddress init return type 2020-10-31 13:20:29 +09:00
cheatfate 04c95cb7b0
Fix write should be writeArray. 2020-10-31 03:23:34 +02:00
cheatfate ff48d0b1a2
Proper fix for init(ValidIpAddress). 2020-10-30 17:52:38 +02:00
Giovanni Petrantoni 3d9948a65e ensure all multiaddress routines use Result 2020-10-30 23:50:04 +09:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00
Dmitriy Ryajov 1de1d49223
Channel leaks (#413)
* break stream tracking by type

* use closeWithEOF to await wrapped stream

* fix cancelation leaks

* fix channel leaks

* logging

* use close monitor and always call closeUnderlying

* don't use closeWithEOF

* removing close monitor

* logging
2020-10-27 11:21:03 -06:00
Giovanni Petrantoni eeaa62feec add more debug details to multiaddress assertions 2020-10-21 18:29:59 +09:00
Giovanni Petrantoni 462da1f7a8
gossip MessageID as seq[byte] (#391)
* gossip MessageID as seq[byte]

* combina hashes in defaultMsgIdProvider

* wip

* fix defaultMsgIdProvider
2020-10-21 12:26:04 +09:00
Giovanni Petrantoni 27b9bf436e
fix validation according to specification (#410) 2020-10-21 12:25:42 +09:00
Giovanni Petrantoni 5c19668b2d
avoid verbose EOF messages in readOnce(secure) (#411)
* avoid verbose EOF messages in readOnce(secure)

* shorten azure tests further
2020-10-21 10:08:24 +09:00
Giovanni Petrantoni 32623b930e
handle secure errors in readOnce (secure) (#397)
* handle secure errors in readOnce(secure)

* small synthax fix

* fix mistake in readOnce's isNil
2020-10-19 14:13:14 +09:00
Giovanni Petrantoni 556213abf4
Extended validators (#395)
* gossip extended validation

* fix flood tests

* fix gossip 1.0 tests

* synthax consistency
2020-10-12 16:56:00 +09:00
Giovanni Petrantoni e3bdb9eb13
decode properly ControlPrune (#392) 2020-10-09 09:12:38 +09:00
Giovanni Petrantoni 0f2435f551
better opportunistic grafting score (when score is disabled) (#389) 2020-10-03 09:26:45 +09:00
Dean Eigenmann 853238a215
feature/expose-matcher (#387)
* exposes matcher

* might work

* fix

* fix
2020-10-02 08:59:15 -06:00
Giovanni Petrantoni 4a98a8af5a
gossip pruning fixes related to #371 (#385)
* gossip pruning fixes related to #371

* better trace for grafted/pruned

* shorted azure testing again
2020-10-02 13:09:31 +09:00
Mamy Ratsimbazafy 03f5bbba6d
saner logging (#381) 2020-09-29 09:40:06 -06:00
Giovanni Petrantoni 98d0cc3a16
defaultMsgIdProvider alternative/test anonymize (#379)
* defaultMsgIdProvider alternative/test anonymize

* avoid freeze during flood tests

* avoid `empty message, skipping` situation

* test observers

* avoid double initPubSub

* fix gossip testing (specially when anonymize is on)

* make azure tests shorter
2020-09-28 09:11:18 +02:00
Jacek Sieka 8ecef46738
reencode gossipsub messages with anonymization (#378)
This helps protect against clients sending more data than they should
and thus getting penalized on topics that require anonymity
2020-09-25 18:39:34 +02:00
Jacek Sieka 17e00e642a
limit write queue length (#376)
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.

Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.

* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
2020-09-24 18:43:20 +02:00
Jacek Sieka 25bd0a18f4
small fixes (#374)
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
2020-09-24 07:30:19 +02:00
Giovanni Petrantoni ec322124ac
allow to omit peerId and seqno (#372)
* allow to omit peerId and seqno

* small refactor

* wip

* fix message encoding

* improve rpc signature logic

* remove peerid from verify

* trace fixes

* fix message test

* fix gossip 1.0 tests
2020-09-23 17:56:33 +02:00
Dmitriy Ryajov abd234601b
move events to conn manager (#373) 2020-09-23 08:07:16 -06:00
Jacek Sieka 471e5906f6
fix gossipsub memory leak on disconnected peer (#371)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.

* remove dial lock
* run reconnection loop in background task
2020-09-22 09:05:53 +02:00
Jacek Sieka 49a12e619d
channel close race and deadlock fixes (#368)
* channel close race and deadlock fixes

* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes

A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.

* docs, simplification
2020-09-21 19:48:19 +02:00
Giovanni Petrantoni b99d2039a8
Gossip one one (#240)
* allow multiple codecs per protocol (without breaking things)

* add 1.1 protocol to gossip

* explicit peering part 1

* explicit peering part 2

* explicit peering part 3

* PeerInfo and ControlPrune protocols

* fix encodePrune

* validated always, even explicit peers

* prune by score (score is stub still)

* add a way to pass parameters to gossip

* standard setup fixes

* take into account explicit direct peers in publish

* add floodPublish logic

* small fixes, publish still half broken

* make sure to waitsub in sparse test

* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* restore utility module import

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* triage publish nil peers (issue is on master too but just hidden behind a if/in)

* getGossipPeers fixes

* remove topics from pubsubpeer (was unused)

* simplify rebalanceMesh (following spec) and make it finally reach D_high

* better diagnostics

* merge new pubsubpeer, copy 1.1 to new module

* fix up merge

* conditional enable gossip11 module

* add back topics in peers, re-enable flood publish

* add more heartbeat locking to prevent races

* actually lock the heartbeat

* minor fixes

* with sugar

* merge 1.0

* remove assertion in publish

* fix multistream 1.1 multi proto

* Fix merge oops

* wip

* fix gossip 11 upstream

* gossipsub11 -> gossipsub

* support interop testing

* tests fixing

* fix directchat build

* control prune updates (pb)

* wip parameters

* gossip internal tests fixes

* parameters wip

* finishup with params

* cleanups/wip

* small sugar

* grafted and pruned procs

* wip updateScores

* wip

* fix logging issue

* pubsubpeer, chronicles explicit override

* fix internal gossip tests

* wip

* tables troubleshooting

* score wip

* score wip

* fixes

* fix test utils generateNodes

* don't delete while iterating in score update

* fix grafted defect

* add a handleConnect in subscribeTopic

* pruning improvements

* wip

* score fixes

* post merge - builds gossip tests

* further merge fixes

* rebalance improvements and opportunistic grafting

* fix test for now

* restore explicit peering

* implement peer exchange graft message

* add an hard cap to PX

* backoff time management

* IWANT cap/budget

* Adaptive gossip dissemination

* outbound mesh quota, internal tests fixing

* oversub prune score based, finish outbound quota

* finishup with score and ihave budget

* use go daemon 0.3.0

* import fixes

* byScore cleanup score sorting

* remove pointless scaling in `/` Duration operator

* revert using libp2p org for daemon

* interop fixes

* fixes and cleanup

* remove heartbeat assertion, minor debug fixes

* logging improvements and cleaning up

* (to revert) add some traces

* add explicit topic to gossip rpcs

* pubsub merge fixes and type fix in switch

* Revert "(to revert) add some traces"

This reverts commit 4663eaab6cc336c81cee50bc54025cf0b7bcbd99.

* cleanup some now irrelevant todo

* shuffle peers anyway as score might be disabled

* add missing shuffle

* old merge fix

* more merge fixes

* debug improvements

* re-enable gossip internal tests

* add gossip10 fallback (dormant but tested)

* split gossipsub internal tests into 1.0 and 1.1

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-09-21 11:16:29 +02:00
Jacek Sieka b7e5d1122c
cleanups (#366)
* reuse connection timeout for noise handshake (avoid extra timer)
* enforce nbytes > 0 for readOnce
* avoid some unnecessary memory zeroing
* simplify noise
* fix dumping when noise splits message
2020-09-16 11:55:25 +02:00
Dmitriy Ryajov b0d86b95dd
add peer lifecycle events (#357)
* add peer lifecycle events

* rework peer events to not use connection events

* don't use result in pubsub and switch init

* wip

* use ordered hashes and remove logscope

* logging

* add missing test

* small fixes
2020-09-15 14:19:22 -06:00
Giovanni Petrantoni a6007be428
avoid sending empty seqno and/or fromPeer (gossip rpc) (#364) 2020-09-15 12:33:18 +02:00
Eugene Kabanov e2d46c6b53
Add `libp2p_dump` and `libp2p_dump_dir` compiler time options. (#359)
* Add `libp2p_dump` and `libp2p_dump_dir` compiler time options.

* Add tools/pbcap_parser.

* Add mplex control messages decoding.
2020-09-15 11:16:43 +02:00
Oskar Thorén 5e66f6fbd8
Add logScope to connmanager and pubsubprotobuf (#363) 2020-09-15 08:03:53 +02:00
Jacek Sieka 0db45462cd
mplex fixes (#362)
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
2020-09-14 10:19:54 +02:00
Jacek Sieka 96d4c44fec
refactor bufferstream to use a queue (#346)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.

When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.

In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.

* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
2020-09-10 08:19:13 +02:00
Jacek Sieka 5b347adf58
logging fixes and small cleanups (#361)
In particular, allow longer multistream select reads
2020-09-09 19:12:08 +02:00
Jacek Sieka 63b38734bd
fix poor performance in LRU cache (#360)
it turns out (in NBC) a heap is sufficiently slow becuase of all the
deletes that it makes more sense to go with a linked list
2020-09-09 18:28:46 +02:00
Jacek Sieka 82c179db9e
mplex fixes (#356)
* close the right connection when channel send fails
* don't crash on channel id that is not unique
2020-09-08 08:24:28 +02:00
Jacek Sieka 2b72d485a3
a few more log fixes (#355) 2020-09-07 14:15:11 +02:00
Jacek Sieka c1856fda53
simplify and unify logging (#353)
* use short format for logging peerid
* log peerid:oid for connections
2020-09-06 10:31:47 +02:00
Jacek Sieka 9b815efe8f
gossipsub: don't subscribe to floodsub also (#352) 2020-09-04 22:53:03 +02:00
Jacek Sieka 16a008db75
fix connection event order when connection dies early (#351)
if the connection is already closed (because the remote closes during
identfiy for example), an exception would be raised which would leave
the connection in limbo, beacuse it would not go through the rest of
internalConnect.

Also, if the connection is already closed, the disconnect event would be
scheduled before the connect event :/
2020-09-04 20:30:26 +02:00
Jacek Sieka 6d91d61844
small cleanups & docs (#347)
* simplify gossipsub heartbeat start / stop
* avoid alloc in peerid check
* stop iterating over seq after unsubscribing item (could crash)
* don't crash on missing private key with enabled sigs (shouldn't happen
but...)
2020-09-04 18:31:43 +02:00
Eugene Kabanov 0b85192119
Remove asyncCheck from codebase. (#345)
* Remove asyncCheck from codebase.

* Replace all `discard` statements with new `asyncSpawn`.

* Bump `nim-chronos` requirement.
2020-09-04 18:30:45 +02:00
Jacek Sieka 5819c6a9a7
gossipsub / floodsub fixes (#348)
* mcache fixes

* remove timed cache - the window shifting already removes old messages
* ref -> object
* avoid unnecessary allocations with `[]` operator

* simplify init

* fix several gossipsub/floodsub issues

* floodsub, gossipsub: don't rebroadcast messages that fail validation
(!)
* floodsub, gossipsub: don't crash when unsubscribing from unknown
topics (!)
* gossipsub: don't send message to peers that are not interested in the
topic, when messages don't share topic list
* floodsub: don't repeat all messages for each message when
rebroadcasting
* floodsub: allow sending empty data
* floodsub: fix inefficient unsubscribe
* sync floodsub/gossipsub logging
* gossipsub: include incoming messages in mcache (!)
* gossipsub: don't rebroadcast already-seen messages (!)
* pubsubpeer: remove incoming/outgoing seen caches - these are already
handled in gossipsub, floodsub and will cause trouble when peers try to
resubscribe / regraft topics (because control messages will have same
digest)
* timedcache: reimplement without timers (fixes timer leaks and extreme
inefficiency due to per-message closures, futures etc)
* timedcache: ref -> obj
2020-09-04 08:10:32 +02:00
Jacek Sieka cd1c68dbc5
avoid send deadlock by not allowing send to block (#342)
* avoid send deadlock by not allowing send to block

* handle message issues more consistently
2020-09-01 09:33:03 +02:00
Dmitriy Ryajov d3182c4dba
No raise send (#339)
* dont raise in send

* check that the lock is acquire on release
2020-08-20 20:50:33 -06:00
Giovanni Petrantoni 840a76915e warn -> debug log levels in errors.nim 2020-08-20 16:53:28 +09:00
Jacek Sieka eb13845f65 work around send that may raise
`send` can raise exceptions that together with asyncCheck will
crash NBC
2020-08-19 14:25:30 +03:00
Zahary Karadjov af0955c58b
Add comments explaning a possible deadlock 2020-08-18 13:51:41 +03:00
Zahary Karadjov 60122a044c
Restore interop with Lighthouse by preventing concurrent meshsub dials 2020-08-17 22:40:58 +03:00
Jacek Sieka 833a5b8e57
add muxer nil check 2020-08-17 13:32:02 +02:00
Jacek Sieka cfcda3c3ef
work around race conditions between identify and other protocols
when identify is run on incoming connections, the connmanager tables are
updated too late for incoming connections to properly be handled

this is a quickfix that will eventually need cleaning up
2020-08-17 13:29:45 +02:00
Jacek Sieka 790b67c923
work around bufferstream deadlock (#332)
mplex backpressure handling deadlocks with something
2020-08-17 12:45:54 +02:00
Jacek Sieka 53877e97bd
trace logs 2020-08-17 12:39:25 +02:00
Jacek Sieka f46bf0faa4
remove send lock (#334)
* remove send lock

When mplex receives data it will block until a reader has processed the
data. Thus, when a large message is received, such as a gossipsub
subscription table, all of mplex will be blocked until all reading is
finished.

However, if at the same time a `dial` to establish a gossipsub send
connection is ongoing, that `dial` will be blocked because mplex is no
longer reading data - specifically, it might indeed be the connection
that's processing the previous data that is waiting for a send
connection.

There are other problems with the current code:
* If an exception is raised, it is not necessarily raised for the same
connection as `p.sendConn`, so resetting `p.sendConn` in the exception
handling is wrong
* `p.isConnected` is checked before taking the lock - thus, if it
returns false, a new dial will be started. If a new task enters `send`
before dial is finished, it will also determine `p.isConnected` is
false, then get stuck on the lock - when the previous task finishes and
releases the lock, the new task will _also_ dial and thus reset
`p.sendConn` causing a leak.

* prefer existing connection

simplifies flow
2020-08-17 12:38:27 +02:00
Jacek Sieka b12145dff7
avoid crash when subscribe is received (#333)
...by making subscribeTopic synchronous, avoiding a peer table lookup
completely.

rebalanceMesh will be called a second later - it's fine
2020-08-17 12:10:22 +02:00
Jacek Sieka ab864fc747
logging cleanups and small fixes (#331) 2020-08-15 21:50:31 +02:00
Jacek Sieka 397f9edfd4
simplify mplex (#327)
* less async
* less copying of data
* less redundant cleanup
2020-08-15 07:58:30 +02:00
Jacek Sieka 9c7e055310
set activity flag on noise / secio (#330) 2020-08-15 07:36:15 +02:00
Dmitriy Ryajov d1f1e1b31e
add missing mplex half closed test (#326) 2020-08-12 07:23:49 +02:00
Dmitriy Ryajov b76b3e0e9b
Rework pubsub (#322)
* move pubsub of off switch, pass switch into pubsub

* use join on lpstreams

* properly cleanup up failed peers

* fix tests

* fix peertable hasPeerId

* fix tests

* rework sending, remove helpers from pubsubpeer, unify in broadcast

* further split broadcast into send

* use send where appropriate

* use formatIt

* improve trace

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-08-11 18:05:49 -06:00
Eugene Kabanov 59b290fcc7
Refactor minasn1 and fix security issues. (#323)
* Refactor minasn1 and fix security issues.

* Fix for RSA test vectors.
2020-08-11 16:58:51 -06:00