Commit Graph

964 Commits

Author SHA1 Message Date
Jacek Sieka 9e5ba64c48
avoid creating prune message unless we're pruning (#487) 2020-12-15 22:46:03 +01:00
Giovanni Petrantoni 18d69a5c41
Workaround the gossipsub table race condition (#486) 2020-12-15 12:32:11 -06:00
Jacek Sieka b52dab9fd7
use stew/leb128 (#481)
* avoids multiple reallocations in readLp
* simplifies varint implementation
* remove vbuffer.length (unused)
2020-12-15 12:15:22 -06:00
Giovanni Petrantoni 5543f6681f
first pass, use results for Cid module (#480)
* first pass, use results for Cid module

* improvements to decode
2020-12-15 14:19:18 +01:00
Giovanni Petrantoni f8f0bc1bd8
Gossip improvements (#476)
* add more traces, remove async from rebalance

* more traces

* avoid computng scores when weight is 0.0

* debug colocation, fix an indent in unsubpeer (minor)

* add full ValidationResult coverage

* store in cache only after validation

* gossip 1.0 fixes

* fix typo

* gossip 10 internal test fixes

* test fixing

* refactor peerstats usages

* populate tables if missing when scoring
2020-12-15 10:25:22 +09:00
Mamy Ratsimbazafy 42d264d8b0
Rm bearssl + Deactivate Travis completely (#477)
* Rm bearssl added in #2167

* Travis ARM doesn't work
2020-12-10 14:19:27 +01:00
Mamy André-Ratsimbazafy 8805e5f061 Use Travis only for ARM64 - https://github.com/status-im/nimbus-eth2/pull/2167 2020-12-09 16:05:41 -06:00
Jacek Sieka 6f1ecc8df7
streamline socket read/write hot path (#473)
* streamline socket read/write hot path

This avoids some unnecessary memory copying on the hot path of noise /
mplex, as well as getting rid of a few futures - profiling shows that
this is one of the main culprits of small memory allocations, which
makes sense - this is where gossip fan-out happens.

* fewer futures (and corresponding closures) when sending lpchannel
messages
* avoid data copies when encrypting and framing noise messages
* avoid copying tuple when reading noise data (poor c codegen)
* fix setting eof flag in secure read

* write noise frames in one go

...and closing secure socket once is enough
2020-12-09 08:56:40 -06:00
Jacek Sieka 1befeb8c2e
clean up peerid (#470)
* fix dangling cstring on error return
* remove some useless inlines
* less mallocs in shortlog
* proc -> func
* rename test
2020-12-03 13:53:16 -06:00
Dmitriy Ryajov e9d4679059
Race in connection setup (#464)
* check that connection is not closed or eof

* don't release connection lock prematurely

* test that only valid connections can be added

* correct exception type on closed connection

* add clarifying comment

* use closeWithEOF for more stable test

* misc comments

* log stream id in buffestream asserts

* use closeWithEOF to prevent races in tests

* give some time to the remote handler to trigger

* adding more tests to make codecov happy
2020-12-02 19:24:48 -06:00
Dmitriy Ryajov d1c689e5ab
adding libp2p tag to logScope (#465) 2020-12-01 11:34:27 -06:00
Giovanni Petrantoni e1648d4404 fix mcache logic check in gossipsub 2020-12-01 23:55:51 +09:00
Giovanni Petrantoni b4738d723c
Some gossip fixes (#467)
* fix some missing rpc in rebalanceMesh

* clarify some variable names and lifetime

* further improvements
2020-12-01 11:44:09 +01:00
Dmitriy Ryajov 94e672ead0
allow concurrent closeWithEOF (#466)
* allow concurrent closeWithEOF

* add dedicated closedWithEOF flag
2020-12-01 09:44:21 +01:00
Jacek Sieka 5c2a54bdd9
fix timeoutmonitor loop (#463)
* fix timeoutmonitor loop

* Clarify that cancellation can happen while in timeoutMonitor
2020-11-29 13:34:19 +01:00
Dmitriy Ryajov 18443dafc1
rework peer event to take an initiator flag (#456)
* rework peer event to take an initiator flag

* use correct direction for initiator
2020-11-28 10:59:47 -06:00
Dmitriy Ryajov 3d44fcb8b3
use cancelAndAwait to mitigate further hangs (#459) 2020-11-28 09:48:06 -06:00
Dmitriy Ryajov a8f5f7a8bb
move dialing logic to it's own proc to avoid try/finally bugs (#461)
* move dialing logic to it's own proc to avoid try/finally bugs

* re-export transport

* lint

* add cancelation test

* test remote conn close on dial
2020-11-28 09:05:12 +01:00
Giovanni Petrantoni 02b20440f2
Limit ihave emission (#462)
* add some limits to ihave emission, go has them, rust does not actually

* restore shuffling of IDs

* add some context
2020-11-28 16:27:39 +09:00
Giovanni Petrantoni 12db9a4cf2 TopicParams validation tuning 2020-11-28 00:23:09 +09:00
Giovanni Petrantoni 809df8d04d add some extra gossip metrics 2020-11-26 16:20:34 +09:00
Giovanni Petrantoni 6c7f2766fe
expose seenTTL parameters (#457) 2020-11-26 14:45:10 +09:00
Dmitriy Ryajov ca9c5c85e4
dont break chronicles logging streamline connsetup (#455) 2020-11-25 13:34:48 -06:00
Dmitriy Ryajov 7b1e652224
Allow custom identify agent string (#451)
* allow custom agent version string

* rework tests and add test for custom agent version
2020-11-25 07:42:02 -06:00
Dmitriy Ryajov 164892776b
get rid of hangs cleanup (#453) 2020-11-25 07:35:25 -06:00
Dmitriy Ryajov 21110636cb
fixing log level to avoid sacring users (#452) 2020-11-24 12:07:27 -06:00
Dmitriy Ryajov 351489bfa9
getMuxedStream to more appropriate getStream (#448) 2020-11-24 00:37:45 -06:00
Dmitriy Ryajov 69ae24dc8d
less leak prone cleanup (#447)
* less leak prone cleanup

* fix double allFinished
2020-11-23 18:22:15 -06:00
Dmitriy Ryajov 6cc3f4283a
update conn peerinfo instead of replacing (#445)
* update conn peerinfo instead of replacing

* remove unnecesary peerid var
2020-11-23 15:15:55 -06:00
Dmitriy Ryajov 034a1e8b1b
small cleanups from tcp-limits2 (#446) 2020-11-23 15:02:23 -06:00
Dmitriy Ryajov 1d16d22f5f
Don't allow concurrent pushdata (#444)
* handle resets properly with/without pushes/reads

* add clarifying comments

* pushEof should also not be concurrent

* move channel reset to bufferstream

this is where the action happens - lpchannel merely redefines how close
is done

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-23 09:07:11 -06:00
Dmitriy Ryajov c42009d56e
don't quit accept prematurelly (#443) 2020-11-19 09:10:25 -06:00
Giovanni Petrantoni 93b6c4dc52
Gossip runtime params (#437)
* move gossip parameters to runtime

* internal test fixes

* add missing params

* restore const parameters are soldi base and use them in init

* more constants tuning
2020-11-19 16:48:17 +09:00
Dmitriy Ryajov 92fa4110c1
Rework transport to use chronos accept (#420)
* rework transport to use the new accept api

* use the new chronos primits

* fixup tests to use the new transport api

* handle all exceptions in upgradeIncoming

* master merge

* add multiaddress exception type

* raise appropriate exception on invalida address

* allow retrying on TransportTooManyError

* adding TODO

* wip

* merge master

* add sleep if nil is returned

* accept loop handles all exceptions

* avoid issues with tray/except/finally

* make consistent with master

* cleanup accept loop

* logging

* Update libp2p/transports/tcptransport.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* use Direction enum instead of initiator flag

* use consistent import style

* remove experimental `closeWithEOF()`

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-18 20:06:42 -06:00
Dmitriy Ryajov 8c8d73380f
Re-add connection manager tests (#441)
* use table.getOrDefault()

* re-add missing connection manager tests
2020-11-17 18:48:26 -06:00
Jacek Sieka 74acd0a33a
fix channels not being reset (#439)
* fix channels not being reset

silly for loop..

* allow only one concurrent read
* fix mplex test race condition
* add some bufferstream eof tests

* deadlock, lost data and hung channel fixes

* prevent concurrent `reset` calls
* reset LPChannel when read is cancelled (since data is lost)
* ensure there's one, and one only, 0-byte readOnce on EOF
* ensure that all data is returned before EOF is returned
* keep running activity monitor for half-closed channels (or they never
get closed)
2020-11-17 08:59:25 -06:00
Dmitriy Ryajov da37eee285
Test disconnect from conn event (#432)
* logs

* adding disconnect test in connection events

* adding immediate disconnect from connection event
2020-11-11 13:20:14 -06:00
Giovanni Petrantoni a9948b0b05
clarify validation messages (#431)
* clarify validation messages

* add codecov threshold
2020-11-12 01:42:12 +09:00
Dmitriy Ryajov 90921bff09
move some importance trace logs to debug (#428) 2020-11-09 22:14:46 -06:00
Dmitriy Ryajov 4fb3f50d2c
Reset channels on close (#425)
* reset when failed to read/write muxed conn

* add more comprehensive resource cleanup tests

* style

* cleanup tests
2020-11-06 09:24:24 -06:00
Dmitriy Ryajov 3956f3fd69
make sure all streams are tracked (#422)
* make sure all streams are tracked

* revert unnecesary change
2020-11-04 21:52:54 -06:00
Dmitriy Ryajov 6040cb4ef1
fix debugutils (#423) 2020-11-04 19:56:28 -06:00
Giovanni Petrantoni 7cc42ce219
start adding more tests + minor fixes (#419)
* start adding more tests + minor fixes

* add wrong secure negotiation test

* add noise failed handshake test
2020-11-04 23:24:41 +09:00
Giovanni Petrantoni e496802943
Least expensive metrics (#421)
* add more general and useful metrics

* fix gossipsub peers metrics in heartbeat
2020-11-04 15:18:00 +01:00
Dmitriy Ryajov 7b5259dbc7
Move triggers (#416)
* move event triggers to connmanager

* use base error type

* avoid deadlocks

* handle eof and closed when identifying incoming

* use `closeWait`
2020-11-02 14:35:26 -06:00
Dmitriy Ryajov 43a77e60a1
split stream counts by direction (#418) 2020-11-01 16:23:26 -06:00
Jacek Sieka 03639f1446
Revert "Channel leaks (#413)" (#417)
This reverts commit 1de1d49223.
2020-11-01 14:49:25 -06:00
Giovanni Petrantoni 9c1633bf87 fix ValidIpAddress multiaddress init return type 2020-10-31 13:20:29 +09:00
cheatfate 04c95cb7b0
Fix write should be writeArray. 2020-10-31 03:23:34 +02:00
cheatfate ff48d0b1a2
Proper fix for init(ValidIpAddress). 2020-10-30 17:52:38 +02:00
Giovanni Petrantoni 3d9948a65e ensure all multiaddress routines use Result 2020-10-30 23:50:04 +09:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00
Dmitriy Ryajov 1de1d49223
Channel leaks (#413)
* break stream tracking by type

* use closeWithEOF to await wrapped stream

* fix cancelation leaks

* fix channel leaks

* logging

* use close monitor and always call closeUnderlying

* don't use closeWithEOF

* removing close monitor

* logging
2020-10-27 11:21:03 -06:00
Giovanni Petrantoni eeaa62feec add more debug details to multiaddress assertions 2020-10-21 18:29:59 +09:00
Giovanni Petrantoni 462da1f7a8
gossip MessageID as seq[byte] (#391)
* gossip MessageID as seq[byte]

* combina hashes in defaultMsgIdProvider

* wip

* fix defaultMsgIdProvider
2020-10-21 12:26:04 +09:00
Giovanni Petrantoni 27b9bf436e
fix validation according to specification (#410) 2020-10-21 12:25:42 +09:00
Giovanni Petrantoni 5c19668b2d
avoid verbose EOF messages in readOnce(secure) (#411)
* avoid verbose EOF messages in readOnce(secure)

* shorten azure tests further
2020-10-21 10:08:24 +09:00
Giovanni Petrantoni 32623b930e
handle secure errors in readOnce (secure) (#397)
* handle secure errors in readOnce(secure)

* small synthax fix

* fix mistake in readOnce's isNil
2020-10-19 14:13:14 +09:00
Giovanni Petrantoni 556213abf4
Extended validators (#395)
* gossip extended validation

* fix flood tests

* fix gossip 1.0 tests

* synthax consistency
2020-10-12 16:56:00 +09:00
Giovanni Petrantoni e3bdb9eb13
decode properly ControlPrune (#392) 2020-10-09 09:12:38 +09:00
Giovanni Petrantoni 0f2435f551
better opportunistic grafting score (when score is disabled) (#389) 2020-10-03 09:26:45 +09:00
Dean Eigenmann 853238a215
feature/expose-matcher (#387)
* exposes matcher

* might work

* fix

* fix
2020-10-02 08:59:15 -06:00
Giovanni Petrantoni 4a98a8af5a
gossip pruning fixes related to #371 (#385)
* gossip pruning fixes related to #371

* better trace for grafted/pruned

* shorted azure testing again
2020-10-02 13:09:31 +09:00
Mamy Ratsimbazafy 03f5bbba6d
saner logging (#381) 2020-09-29 09:40:06 -06:00
Giovanni Petrantoni 98d0cc3a16
defaultMsgIdProvider alternative/test anonymize (#379)
* defaultMsgIdProvider alternative/test anonymize

* avoid freeze during flood tests

* avoid `empty message, skipping` situation

* test observers

* avoid double initPubSub

* fix gossip testing (specially when anonymize is on)

* make azure tests shorter
2020-09-28 09:11:18 +02:00
Jacek Sieka 8ecef46738
reencode gossipsub messages with anonymization (#378)
This helps protect against clients sending more data than they should
and thus getting penalized on topics that require anonymity
2020-09-25 18:39:34 +02:00
Jacek Sieka 17e00e642a
limit write queue length (#376)
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.

Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.

* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
2020-09-24 18:43:20 +02:00
Jacek Sieka 25bd0a18f4
small fixes (#374)
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
2020-09-24 07:30:19 +02:00
Giovanni Petrantoni ec322124ac
allow to omit peerId and seqno (#372)
* allow to omit peerId and seqno

* small refactor

* wip

* fix message encoding

* improve rpc signature logic

* remove peerid from verify

* trace fixes

* fix message test

* fix gossip 1.0 tests
2020-09-23 17:56:33 +02:00
Dmitriy Ryajov abd234601b
move events to conn manager (#373) 2020-09-23 08:07:16 -06:00
Jacek Sieka 471e5906f6
fix gossipsub memory leak on disconnected peer (#371)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.

* remove dial lock
* run reconnection loop in background task
2020-09-22 09:05:53 +02:00
Jacek Sieka 49a12e619d
channel close race and deadlock fixes (#368)
* channel close race and deadlock fixes

* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes

A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.

* docs, simplification
2020-09-21 19:48:19 +02:00
Giovanni Petrantoni b99d2039a8
Gossip one one (#240)
* allow multiple codecs per protocol (without breaking things)

* add 1.1 protocol to gossip

* explicit peering part 1

* explicit peering part 2

* explicit peering part 3

* PeerInfo and ControlPrune protocols

* fix encodePrune

* validated always, even explicit peers

* prune by score (score is stub still)

* add a way to pass parameters to gossip

* standard setup fixes

* take into account explicit direct peers in publish

* add floodPublish logic

* small fixes, publish still half broken

* make sure to waitsub in sparse test

* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* restore utility module import

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* triage publish nil peers (issue is on master too but just hidden behind a if/in)

* getGossipPeers fixes

* remove topics from pubsubpeer (was unused)

* simplify rebalanceMesh (following spec) and make it finally reach D_high

* better diagnostics

* merge new pubsubpeer, copy 1.1 to new module

* fix up merge

* conditional enable gossip11 module

* add back topics in peers, re-enable flood publish

* add more heartbeat locking to prevent races

* actually lock the heartbeat

* minor fixes

* with sugar

* merge 1.0

* remove assertion in publish

* fix multistream 1.1 multi proto

* Fix merge oops

* wip

* fix gossip 11 upstream

* gossipsub11 -> gossipsub

* support interop testing

* tests fixing

* fix directchat build

* control prune updates (pb)

* wip parameters

* gossip internal tests fixes

* parameters wip

* finishup with params

* cleanups/wip

* small sugar

* grafted and pruned procs

* wip updateScores

* wip

* fix logging issue

* pubsubpeer, chronicles explicit override

* fix internal gossip tests

* wip

* tables troubleshooting

* score wip

* score wip

* fixes

* fix test utils generateNodes

* don't delete while iterating in score update

* fix grafted defect

* add a handleConnect in subscribeTopic

* pruning improvements

* wip

* score fixes

* post merge - builds gossip tests

* further merge fixes

* rebalance improvements and opportunistic grafting

* fix test for now

* restore explicit peering

* implement peer exchange graft message

* add an hard cap to PX

* backoff time management

* IWANT cap/budget

* Adaptive gossip dissemination

* outbound mesh quota, internal tests fixing

* oversub prune score based, finish outbound quota

* finishup with score and ihave budget

* use go daemon 0.3.0

* import fixes

* byScore cleanup score sorting

* remove pointless scaling in `/` Duration operator

* revert using libp2p org for daemon

* interop fixes

* fixes and cleanup

* remove heartbeat assertion, minor debug fixes

* logging improvements and cleaning up

* (to revert) add some traces

* add explicit topic to gossip rpcs

* pubsub merge fixes and type fix in switch

* Revert "(to revert) add some traces"

This reverts commit 4663eaab6c.

* cleanup some now irrelevant todo

* shuffle peers anyway as score might be disabled

* add missing shuffle

* old merge fix

* more merge fixes

* debug improvements

* re-enable gossip internal tests

* add gossip10 fallback (dormant but tested)

* split gossipsub internal tests into 1.0 and 1.1

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-09-21 11:16:29 +02:00
Jacek Sieka b7e5d1122c
cleanups (#366)
* reuse connection timeout for noise handshake (avoid extra timer)
* enforce nbytes > 0 for readOnce
* avoid some unnecessary memory zeroing
* simplify noise
* fix dumping when noise splits message
2020-09-16 11:55:25 +02:00
Dmitriy Ryajov b0d86b95dd
add peer lifecycle events (#357)
* add peer lifecycle events

* rework peer events to not use connection events

* don't use result in pubsub and switch init

* wip

* use ordered hashes and remove logscope

* logging

* add missing test

* small fixes
2020-09-15 14:19:22 -06:00
Giovanni Petrantoni a6007be428
avoid sending empty seqno and/or fromPeer (gossip rpc) (#364) 2020-09-15 12:33:18 +02:00
Eugene Kabanov e2d46c6b53
Add `libp2p_dump` and `libp2p_dump_dir` compiler time options. (#359)
* Add `libp2p_dump` and `libp2p_dump_dir` compiler time options.

* Add tools/pbcap_parser.

* Add mplex control messages decoding.
2020-09-15 11:16:43 +02:00
Oskar Thorén 5e66f6fbd8
Add logScope to connmanager and pubsubprotobuf (#363) 2020-09-15 08:03:53 +02:00
Jacek Sieka 0db45462cd
mplex fixes (#362)
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
2020-09-14 10:19:54 +02:00
Jacek Sieka 96d4c44fec
refactor bufferstream to use a queue (#346)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.

When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.

In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.

* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
2020-09-10 08:19:13 +02:00
Jacek Sieka 5b347adf58
logging fixes and small cleanups (#361)
In particular, allow longer multistream select reads
2020-09-09 19:12:08 +02:00
Jacek Sieka 63b38734bd
fix poor performance in LRU cache (#360)
it turns out (in NBC) a heap is sufficiently slow becuase of all the
deletes that it makes more sense to go with a linked list
2020-09-09 18:28:46 +02:00
Jacek Sieka 82c179db9e
mplex fixes (#356)
* close the right connection when channel send fails
* don't crash on channel id that is not unique
2020-09-08 08:24:28 +02:00
Jacek Sieka 2b72d485a3
a few more log fixes (#355) 2020-09-07 14:15:11 +02:00
Jacek Sieka c1856fda53
simplify and unify logging (#353)
* use short format for logging peerid
* log peerid:oid for connections
2020-09-06 10:31:47 +02:00
Jacek Sieka 9b815efe8f
gossipsub: don't subscribe to floodsub also (#352) 2020-09-04 22:53:03 +02:00
Jacek Sieka 16a008db75
fix connection event order when connection dies early (#351)
if the connection is already closed (because the remote closes during
identfiy for example), an exception would be raised which would leave
the connection in limbo, beacuse it would not go through the rest of
internalConnect.

Also, if the connection is already closed, the disconnect event would be
scheduled before the connect event :/
2020-09-04 20:30:26 +02:00
Jacek Sieka 6d91d61844
small cleanups & docs (#347)
* simplify gossipsub heartbeat start / stop
* avoid alloc in peerid check
* stop iterating over seq after unsubscribing item (could crash)
* don't crash on missing private key with enabled sigs (shouldn't happen
but...)
2020-09-04 18:31:43 +02:00
Eugene Kabanov 0b85192119
Remove asyncCheck from codebase. (#345)
* Remove asyncCheck from codebase.

* Replace all `discard` statements with new `asyncSpawn`.

* Bump `nim-chronos` requirement.
2020-09-04 18:30:45 +02:00
Jacek Sieka 5819c6a9a7
gossipsub / floodsub fixes (#348)
* mcache fixes

* remove timed cache - the window shifting already removes old messages
* ref -> object
* avoid unnecessary allocations with `[]` operator

* simplify init

* fix several gossipsub/floodsub issues

* floodsub, gossipsub: don't rebroadcast messages that fail validation
(!)
* floodsub, gossipsub: don't crash when unsubscribing from unknown
topics (!)
* gossipsub: don't send message to peers that are not interested in the
topic, when messages don't share topic list
* floodsub: don't repeat all messages for each message when
rebroadcasting
* floodsub: allow sending empty data
* floodsub: fix inefficient unsubscribe
* sync floodsub/gossipsub logging
* gossipsub: include incoming messages in mcache (!)
* gossipsub: don't rebroadcast already-seen messages (!)
* pubsubpeer: remove incoming/outgoing seen caches - these are already
handled in gossipsub, floodsub and will cause trouble when peers try to
resubscribe / regraft topics (because control messages will have same
digest)
* timedcache: reimplement without timers (fixes timer leaks and extreme
inefficiency due to per-message closures, futures etc)
* timedcache: ref -> obj
2020-09-04 08:10:32 +02:00
Jacek Sieka cd1c68dbc5
avoid send deadlock by not allowing send to block (#342)
* avoid send deadlock by not allowing send to block

* handle message issues more consistently
2020-09-01 09:33:03 +02:00
Dmitriy Ryajov d3182c4dba
No raise send (#339)
* dont raise in send

* check that the lock is acquire on release
2020-08-20 20:50:33 -06:00
Giovanni Petrantoni 840a76915e warn -> debug log levels in errors.nim 2020-08-20 16:53:28 +09:00
Jacek Sieka eb13845f65 work around send that may raise
`send` can raise exceptions that together with asyncCheck will
crash NBC
2020-08-19 14:25:30 +03:00
Zahary Karadjov af0955c58b
Add comments explaning a possible deadlock 2020-08-18 13:51:41 +03:00
Zahary Karadjov 60122a044c
Restore interop with Lighthouse by preventing concurrent meshsub dials 2020-08-17 22:40:58 +03:00
Jacek Sieka 833a5b8e57
add muxer nil check 2020-08-17 13:32:02 +02:00
Jacek Sieka cfcda3c3ef
work around race conditions between identify and other protocols
when identify is run on incoming connections, the connmanager tables are
updated too late for incoming connections to properly be handled

this is a quickfix that will eventually need cleaning up
2020-08-17 13:29:45 +02:00
Jacek Sieka 790b67c923
work around bufferstream deadlock (#332)
mplex backpressure handling deadlocks with something
2020-08-17 12:45:54 +02:00
Jacek Sieka 53877e97bd
trace logs 2020-08-17 12:39:25 +02:00
Jacek Sieka f46bf0faa4
remove send lock (#334)
* remove send lock

When mplex receives data it will block until a reader has processed the
data. Thus, when a large message is received, such as a gossipsub
subscription table, all of mplex will be blocked until all reading is
finished.

However, if at the same time a `dial` to establish a gossipsub send
connection is ongoing, that `dial` will be blocked because mplex is no
longer reading data - specifically, it might indeed be the connection
that's processing the previous data that is waiting for a send
connection.

There are other problems with the current code:
* If an exception is raised, it is not necessarily raised for the same
connection as `p.sendConn`, so resetting `p.sendConn` in the exception
handling is wrong
* `p.isConnected` is checked before taking the lock - thus, if it
returns false, a new dial will be started. If a new task enters `send`
before dial is finished, it will also determine `p.isConnected` is
false, then get stuck on the lock - when the previous task finishes and
releases the lock, the new task will _also_ dial and thus reset
`p.sendConn` causing a leak.

* prefer existing connection

simplifies flow
2020-08-17 12:38:27 +02:00
Jacek Sieka b12145dff7
avoid crash when subscribe is received (#333)
...by making subscribeTopic synchronous, avoiding a peer table lookup
completely.

rebalanceMesh will be called a second later - it's fine
2020-08-17 12:10:22 +02:00
Jacek Sieka ab864fc747
logging cleanups and small fixes (#331) 2020-08-15 21:50:31 +02:00
Jacek Sieka 397f9edfd4
simplify mplex (#327)
* less async
* less copying of data
* less redundant cleanup
2020-08-15 07:58:30 +02:00
Jacek Sieka 9c7e055310
set activity flag on noise / secio (#330) 2020-08-15 07:36:15 +02:00
Dmitriy Ryajov d1f1e1b31e
add missing mplex half closed test (#326) 2020-08-12 07:23:49 +02:00
Dmitriy Ryajov b76b3e0e9b
Rework pubsub (#322)
* move pubsub of off switch, pass switch into pubsub

* use join on lpstreams

* properly cleanup up failed peers

* fix tests

* fix peertable hasPeerId

* fix tests

* rework sending, remove helpers from pubsubpeer, unify in broadcast

* further split broadcast into send

* use send where appropriate

* use formatIt

* improve trace

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-08-11 18:05:49 -06:00
Eugene Kabanov 59b290fcc7
Refactor minasn1 and fix security issues. (#323)
* Refactor minasn1 and fix security issues.

* Fix for RSA test vectors.
2020-08-11 16:58:51 -06:00
Eugene Kabanov d47b2d805f
Use constant-time hex encoding/decoding procedures explicitly. (#305)
* Use constant-time hex encoding/decoding procedures explicitly.

* Add comments.
2020-08-11 08:48:21 -06:00
Dmitriy Ryajov 2325692f55
Fix half closed (#324)
* don't call `close` in `remoteClose`

* make sure timeout are properly propagted

* fix tests

* adding remote close write test
2020-08-10 16:17:11 -06:00
Giovanni Petrantoni 6ffd5be059 fix crash at TRACE log level 2020-08-08 16:31:37 +09:00
Eugene Kabanov 7c1aac5dc1
Use constant-time comparison for keys and signatures. (#299) 2020-08-08 08:53:33 +02:00
Jacek Sieka f303954989
peer hooks -> events (#320)
* peer hooks -> events

* peerinfo -> peerid
* include connection direction in event
* check connection status after event
* lock connmanager lookup also when dialling peer
* clean up un-upgraded connection when upgrade fails
* await peer eventing

* remove join/lifetime future from peerinfo

Peerinfo instances are not unique per peer so the lifetime future is
misleading - it fires when a random connection is closed, not the "last"
one

* document switch values

* naming

* peerevent->conneevent
2020-08-08 08:52:20 +02:00
zah fbb59c3638
`msg` is a reserved property name in Chronicles (#321)
Every Chronicles log record has an existing `msg` property matching
the static string supplied in the log statement. Thus, it's currently
not possible to use `msg` as the name of a user property:

https://github.com/status-im/nim-chronicles/issues/86
2020-08-07 16:46:00 -06:00
Jacek Sieka 7c2ab38da1
cleanups (#319) 2020-08-06 20:14:40 +02:00
Jacek Sieka c6c0c152c0
Dial peerid (#308)
* prefer PeerID in switch api

This avoids ref issues like ref identity and nil

* use existing peerinfo instance if possible

* remove secureCodec

there may be multiple connections per peerinfo with different codecs

* avoid some extra async::
2020-08-06 09:29:27 +02:00
Giovanni Petrantoni 9bbe5e4841
Fix subclass calls to handleDisconnect (#314)
* Fix subclass calls to handleDisconnect

* add peer ID to nil peer debug message
2020-08-06 11:12:52 +09:00
Giovanni Petrantoni 5c986cf657
Fix build, add some raises (#315)
* Fix build, add some raises

* wip

* wip more raises

* missing exc object in mplex

* proper lifetime for subscribePeer

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-08-05 19:30:57 -06:00
Ștefan Talpalaru bd5d43874a
more expensive metrics (#312) 2020-08-05 14:02:26 +02:00
Dmitriy Ryajov 74a6dccd80
adding channel limits to mplex (#309) 2020-08-04 23:16:04 -06:00
Ștefan Talpalaru 843d32f8db
put expensive metrics under a Nim define (#310) 2020-08-04 17:27:59 -06:00
Dmitriy Ryajov cf2b42b914
Moving idle timeout to Connection to enable across all connection streams (#307)
* move idle timeout logic to connection

* more informative logs

* more informative logs
2020-08-04 07:22:05 -06:00
Giovanni Petrantoni 5f0637c49a
Audit curve fixes part2 (#298)
* refactor and fix mulgen (curve25519)

* crypto tests fixing

* fix some confusion in curve25519 mul

* removing ForbiddenCurveValues table and checks

* fix remaining merge issues
2020-08-04 18:19:26 +09:00
Giovanni Petrantoni 5cd93540fa
add a timeout during noise handshake (#294)
* add a timeout during noise handshake

* noise hs timeout into const
2020-08-04 17:04:16 +09:00
Giovanni Petrantoni 504e0444d3
refactor and fix mulgen (curve25519) (#293)
* refactor and fix mulgen (curve25519)

* crypto tests fixing
2020-08-04 14:07:53 +09:00
Dmitriy Ryajov b6877b8aac
increase send timeout for prune and graft msgs (#306)
* increase send timeout for prune and graft msgs

* use trace logs for subscribe monitor
2020-08-03 17:55:42 -06:00
Dmitriy Ryajov 980764774e
pubsub timeouts tuning (#295)
* add finegrained timeouts to pubsub

* use 10 millis timeout in tests

* finalization

* revert timeouts

* use `atEof` for reads

* adjust timeouts and use atEof for reads

* use atEof for reads

* set isEof flag

* no backoff for pubsub streams

* temp timer increase, make macos finalize

* don't call `subscribePeer` in libp2p anymore

* more traces

* leak tests

* lower timeouts

* handle exceptions in control message

* don't use `cancelAndWait`

* handle exceptions in helpers

* wip

* don't send empty messages

* check for leaks properly

* don't use cancelAndWait

* don't await subscribption sends

* remove subscrivePeer calls from switch

* trying without the hooks again
2020-08-02 23:20:11 -06:00
Jacek Sieka e655a510cd
misc cleanups (#303) 2020-08-02 12:22:49 +02:00
Jacek Sieka d544b64010
resolve several races in connmanager (#302)
* resolve several races in connmanager

collections may change while doing await

* close conn

* simplify connmanager API

PeerID avoids nil and ref issues

* remove silly condition
2020-08-01 22:50:40 +02:00
Giovanni Petrantoni afcfd27aa0 add some verbosity to multistream handshake for debugging pruposes 2020-07-31 14:02:03 +09:00
Giovanni Petrantoni 0f06ae5a1d
Audit multistream fixes (#291)
* Don't ignore missing \n in multistream requests

Also make sure to except and quit an existing connection if multistream handshake fails

* solve handshake tracking in ms handler
2020-07-28 23:03:22 +09:00
Dmitriy Ryajov f7fdf31365
Pubsub lifetime (#284)
* lifecycle hooks

* tests

* move trace after closed check

* restore 1 second heartbeat

* await close event

* fix tests

* print direction string

* more trace logging

* add pubsub monitor

* add log scope

* adjust idle timeout

* add exc.msg to trace
2020-07-27 13:33:51 -06:00
Dmitriy Ryajov ed0df74bbd
Connection lifecycle hooks (#288)
* lifecycle hooks

* trigger hooks as tasks

* handle exceptions in trigger hooks

* trigger hooks after storing the connection

* add disconnected hook

* tests
2020-07-24 13:24:31 -06:00
Eugene Kabanov 6af3cb6406
Public key infrastructure filters. (#272)
* Initial commit.

* Workaround nim's bug and add some other compilation error fixes.

* Rename to libp2p_pki_schemes.
Fix secio.
Add tests.

* Attempt to fix command line.

* Fix command line.
Show status in tests.
2020-07-21 14:10:21 -06:00
Giovanni Petrantoni c3404f6eea
Handle cancellation in timeoutMonitor (#283)
* Handle cancellation in timeoutMonitor

* refactor lpchannel timeout as suggested by cheatfate
2020-07-21 09:03:41 -06:00
Giovanni Petrantoni 3b088f8980
Fix some unsubscribe issues and add unsubscribeAll helper (#282)
* Fix some unsub issues and add unsuball helper

* batch sendprune in unsubscribe methods

* add unsubscribeAll for floodsub
2020-07-20 10:16:13 -06:00
Dmitriy Ryajov 38eb36efae
don't use close event to stop timer (#280) 2020-07-18 11:00:44 -06:00
Dmitriy Ryajov 94196fee71
Connections and pubsub peers cleanup (#279)
* better peer tracking and cleanup

* check if peer and conn is nil

* test name

* make timeout more agressive

* rename method for better clarity
2020-07-17 13:46:24 -06:00
Dmitriy Ryajov ba071cafa6
Channel timeout (#278)
* add support for channel timeouts

* tests for channel timeout

* add timeouts to standard switch

* fix mplex init

* cleanup timer on stream close

* add comment for `isConnected`

* move cleanup event
2020-07-17 12:44:41 -06:00
Dmitriy Ryajov 0348773ec9
Connection manager (#277)
* splitting out connection management

* wip

* wip conn mngr tests

* set peerinfo in contructor

* comments and documentation

* tests

* wip

* add `None` to detect untagged connections

* use `PeerID` to index connections

* fix tests

* remove useless equality
2020-07-17 09:36:48 -06:00
Jacek Sieka 170685f9c6
gossipsub fixes (#276)
* graft up to D peers
* fix logging so it's clear who is grafting/pruning who
* clear fanout when grafting
2020-07-16 21:26:57 +02:00
Jacek Sieka c76152f2c1
Simplify send (#271)
* PubSubPeer.send single message

* gossipsub: simplify send further
2020-07-16 12:06:57 +02:00
Dmitriy Ryajov f35b8999b3
some light cleanup for pub/gossip sub (#273)
* move peer table out to its own file

* move peer table

* cleanup `==` and add one to peerinfo

* add peertable

* missed equality check
2020-07-15 13:18:55 -06:00
Eugene Kabanov b832668768
Minprotobuf refactoring 2 (#269)
* Protobuf refactoring stage II.

* Remove NoError.

* Change trace level for invalid message.
2020-07-15 10:25:39 +02:00
Eugene Kabanov 9eb5828a42
Fix #266. (#270)
* Fix security issue #266.

* Add more tests.

* Fix PeerID tests should not use RSA-512 keys.

* Fix crypto tests to use vectors with 2048+ bits.

* Disable 4096bit RSA key generation for CI debug runs.
2020-07-15 10:24:04 +02:00
Giovanni Petrantoni d7bab37119
Fix gossip messages seqno according to spec (#253)
* Fix gossip messages seqno according to spec

* Add peers back to gossipsub table, slow down heartbeat

* Revert "Add peers back to gossipsub table, slow down heartbeat"

This reverts commit 01e2e62172.

* make seqno a threadvar, remove from peerinfo

* seqno refactor, into pubsub
2020-07-14 21:51:33 -06:00
Ștefan Talpalaru b8b0a2b4bc
CI: build binaries with TRACE & JSON logs (#268)
Also: remove unused imports.
2020-07-14 02:02:16 +02:00
Jacek Sieka c6c2d99907
one more log fix 2020-07-13 20:19:20 +02:00
Jacek Sieka 76853f064a
json logging again 2020-07-13 19:59:49 +02:00
Jacek Sieka 6620b7a00b
more comment fixes 2020-07-13 19:30:18 +02:00
Jacek Sieka 0d4c74b33a
comment log that can't be json-serialized 2020-07-13 18:36:49 +02:00
Jacek Sieka 061c54d3c6
logging fixes 2020-07-13 17:26:05 +02:00
Jacek Sieka 87e58c1c8d
metrics: one more pubsub peers fix 2020-07-13 16:16:46 +02:00
Jacek Sieka c7895ccc52
metrics: fix pubsub_peers add metric 2020-07-13 16:15:27 +02:00
Giovanni Petrantoni fcda0f6ce1
PubSubPeer tables refactor (#263)
* refactor peer tables

* tests fixing

* override PubSubPeer equality

* fix pubsubpeer comparison
2020-07-13 15:32:38 +02:00
Eugene Kabanov efb952f18b
[WIP] Minprotobuf refactoring (#259)
* Minprotobuf initial commit

* Fix noise.

* Add signed integers support.
Add checks for field number value.
Remove some casts.

* Fix compile errors.

* Fix comments and constants.
2020-07-13 14:43:07 +02:00
Dmitriy Ryajov 181cf73ca7
Drain buffer (#264)
* drain lpchannel on reset

* move drainBuffer to bufferstream
2020-07-12 18:37:10 +02:00
Dmitriy Ryajov bec9a0658f
Cleanup rpc handler (#261)
* more cleanup

* fix tests

* merging master

* remove `withLock` as it conflicts with stdlib

* wip

* more fanout ttl

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-09 17:54:16 -06:00
Dmitriy Ryajov 4c815d75e7
More gossip cleanup (#257)
* more cleanup

* correct pubsub peer count

* close the stream first

* handle cancelation

* fix tests

* fix fanout ttl

* merging master

* remove `withLock` as it conflicts with stdlib

* fix trace build

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-09 14:21:47 -06:00
Jacek Sieka c720e042fc
clean up mesh handling logic (#260)
* gossipsub is a function of subscription messages only
* graft/prune work with mesh, get filled up from gossipsub
* fix race conditions with await
* fix exception unsafety when grafting/pruning
* fix allowing up to DHi peers in mesh on incoming graft
* fix metrics in several places
2020-07-09 11:16:46 -06:00
Jacek Sieka 9a3684c221
init from concrete key type (#252) 2020-07-09 02:59:09 -06:00
Jacek Sieka 45c089ff0d
noise updates (#255)
* clear secrets explicitly
* simplify keygen
* avoid some trivial memory allocations
* fix little endian encoding of nonce
2020-07-09 02:53:19 -06:00
Giovanni Petrantoni 4e12d0d97a nil check peer before disconnect 2020-07-09 17:20:45 +09:00
Giovanni Petrantoni f9e0a1f069 CI fix handleDisconnect (pubsub) 2020-07-09 13:56:59 +09:00
Giovanni Petrantoni 9b8b159abb Remove other spurious getStacktrace in pubsub traces 2020-07-09 13:19:34 +09:00
Giovanni Petrantoni 4bcb567d47 fix gossip tests 2020-07-09 12:34:36 +09:00
Giovanni Petrantoni 4698f41a91 Remove stacktrace logging from pubsub connect 2020-07-09 12:23:03 +09:00
Giovanni Petrantoni fec507e755
Add peers back to gossipsub table, slow down heartbeat (#256)
* Add peers back to gossipsub table, slow down heartbeat

* exclude on unsub from mesh and fanout
2020-07-08 11:06:26 -06:00
Dmitriy Ryajov a52763cc6d
fix publishing (#250)
* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* Cleanup resources (#246)

* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* cleanup stale peers more eaguerly

* synchronize connection cleanup and small refactor

* close client first and call parent second

* disconnect failed peers on publish

* check for publish result

* fix tests

* fix tests

* always call close

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-07 18:33:05 -06:00
Eugene Kabanov 775cab414a
Remove SHA1 from crypto and crypto tests. (#251)
* Remove SHA1 from crypto and crypto tests.

* Simplify RSA comparison procedure.
Refactor some procedures in crypto.nim.
2020-07-07 15:48:15 +02:00
Jacek Sieka d522537b19
reuse single RNG instance for all crypto key generation (#249)
* reuse single RNG instance for all crypto key generation

* use foolproof rng

* initRng -> newRng (because it's ref)

* fix test

* imports/exports, chat fix

* fix rsa

* imports and exports

* work around threadvar issue

* fixup

* mac workaround test
2020-07-07 13:14:11 +02:00
Jacek Sieka b49c619ca8
export public field types (#248)
* export public field types

* one more
2020-07-01 09:22:29 +02:00
Giovanni Petrantoni ec00c7fc50
Peer resultification and defect only (#245)
* Peer resultification and defect only

* Fixing some tests

* test fixes

* Rename peer into peerid

* better result error message in identify

* further merge fixes
2020-07-01 08:25:09 +02:00
Dmitriy Ryajov c788a6a3c0
Cleanup resources (#246)
* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises
2020-06-29 09:15:31 -06:00
Jacek Sieka aa6756dfe0
allow message id provider to be specified (#243)
* don't send public key in message when not signing (information leak)
* don't run rebalance if there are peers in gossip (see #242)
* don't crash randomly on bad peer id from remote
2020-06-28 09:56:38 -06:00
Dmitriy Ryajov 902880ef1f
consolidate reading in lpstream (#241)
* consolidate reading in lpstream

* remove debug echo

* throw if not enough bytes where read

* tune log level

* set eof flag

* test readExactly to fail on not enough bytes
2020-06-27 11:33:34 -06:00
Dmitriy Ryajov 7a95f1844b
Concurrent dials (#238)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* add secureconn constructor

* close in the correct order

* concurent dial lock and track in/out conns better

* make tests pass

* add todo comment

* disconect peers that open too many connections

* wip

* do connection and muxer tracking in one place

* prevent nil pointer in observers

* drop connections when peers is over max

* prevent channel leaks

* don't use closure to handle channel
2020-06-24 09:08:44 -06:00
Giovanni Petrantoni ee6e545878
multistream select make sure to not report NA (#235)
* multistream select make sure to not report NA but rather empty string if all fails

Also re-enable tests

* avoid using bad constructs, make multistream.select flow crystal clear
2020-06-22 15:38:48 -06:00
Jacek Sieka 6331b04cb4
secp: requiresInit updates (#237)
* secp: requiresInit updates

* fix
2020-06-22 19:03:15 +02:00
Jacek Sieka b99fd88deb
logging fixes 2020-06-21 11:14:19 +02:00
Giovanni Petrantoni 7852c6dd0f
Noise and eth2/nbc fixes (#226)
* Remove noise padding payload (spec removed it)

* add log scope in secure

* avoid defect array out of range in switch secure when "na"

* improve identify traces

* wip noise fixes

* noise protobuf adjustments (trying)

* add more debugging messages/traces, improve their actual contents

* re-enable ID check in noise

* bump go daemon tag version

* bump go daemon tag version

* enable noise in daemonapi

* interop testing, (both secio and noise will be tested)

* azure cache bump (p2pd)

* CI changes

- Travis: use Go 1.14
- azure-pipelines.yml: big cleanup
- Azure: bump cache keys
- build 64-bit p2pd on 32-bit Windows
- install both Mingw-w64 architectures

* noise logging fixes

* alternate testing between noise and secio

* increase timeout to avoid VM errors in CI (multistream tests)

* refactor heartbeat management in gossipsub

* remove locking within heartbeat

* refactor heartbeat management in gossipsub

* remove locking within heartbeat

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2020-06-20 19:56:55 +09:00
Dmitriy Ryajov 7a1c1c2ea6
fixing some key not found exceptions (#231) 2020-06-19 15:19:07 -06:00
Dmitriy Ryajov 5b28e8c488
Cleanup lpstream, Connection and BufferStream (#228)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* don't use delegation in connection

* move connection out to own file

* don't breakout on reset

* make sure to call close on secured conn

* add lpstream tracing

* don't breackdown by conn id

* fix import

* remove unused lable

* reset  connection on exception

* add additional metrics for skipped messages

* check for nil in secure.close
2020-06-19 11:29:43 -06:00
Dmitriy Ryajov 719744f46a
Small fixes (#230)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* make sure sending doesn't fail

* add `contains`

* review comment from prev pr
2020-06-19 11:29:25 -06:00
Dmitriy Ryajov 5f75e5adc6
don't use `switch.dial` in `subscribeToPeer` (#227)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`
2020-06-17 20:12:32 -06:00
Dmitriy Ryajov fe828d87d8
count published messages (#224) 2020-06-16 22:14:02 -06:00
Giovanni Petrantoni 224c2613e3
Remove noise padding payload (spec removed it) (#222) 2020-06-16 17:34:02 -06:00
Jacek Sieka 523eed65ce
metrics: use eth2 name for peers metric (#223) 2020-06-16 14:43:45 -06:00
Giovanni Petrantoni 0d81acafa8 Fix multistream select trace proto logging 2020-06-16 12:02:47 +09:00
Dmitriy Ryajov 9d9f793b4f
add metrics for sent messages by topic and peer (#220) 2020-06-15 17:39:03 -06:00
Dmitriy Ryajov 3c4c10d871
con't crash on nil ptr 2020-06-15 13:37:25 -06:00
Dmitriy Ryajov 7cb6c81159
Don't modify iterables while iterating them (#219)
* don't modify iterables while iterating

* assert handlers to properly close connections
2020-06-15 12:30:09 -06:00
Dmitriy Ryajov 9cd47fe816
add a logScope to make tracing less ugly (#218)
* add a logScope to make tracing less ugly

* don't crash on nil pointer
2020-06-15 12:10:41 -06:00
Dmitriy Ryajov 85b56d0b3a
Observedaddr (#217)
* send correct observerAddr

* run tests with info log level

* set observedaddr for channels
2020-06-12 19:56:17 -06:00
Oskar Thorén f97e2deec7
Make methods in FloodSub, GossipSub public (#216)
Similar to https://github.com/status-im/nim-libp2p/pull/193 but doing it for all
methods to avoid this issue in PubSub in the future. Got hit by this behavior
again in rpcHandler and took me a while to figure out.

See https://github.com/oskarth/nim-private-method-skipping-case/ for minimal
repro
2020-06-12 17:54:12 -06:00
Dmitriy Ryajov ac04ca6e31
make sure keys exist and more metrics (#215) 2020-06-11 20:20:58 -06:00
Dmitriy Ryajov 55a294a5c9
better pubsub metrics (#214) 2020-06-11 12:09:34 -06:00
Dmitriy Ryajov 6b196ad7b4
remove pubsub peer on disconnect (#212)
* remove pubsub peer on disconnect

* make sure lock is aquired

* add $

* count upgrades/dials/disconnects
2020-06-11 08:45:59 -06:00
Jacek Sieka 92579435b6
secio then noise (#213)
* secio then noise

much fewer peers on witti with noise first

* comment
2020-06-11 08:38:47 +02:00
Viktor Kirilov 1afec627c2
proper name for topics so that we can filter dynamically using chronicles (#210)
* proper name for topics so that we can filter dynamically using chronicles

* lowercase
2020-06-10 10:48:01 +02:00
Jacek Sieka 8d9e231a74
grab agentversion/protoversion (#211) 2020-06-09 12:42:52 -06:00
Dmitriy Ryajov 35ff99829e
close streams (#208)
* close streams

* don't warn on missing proto
2020-06-08 17:42:27 -06:00
Dmitriy Ryajov ee281310c0
move trace log 2020-06-08 10:40:08 -06:00
Giovanni Petrantoni 82b4ed8f44 use declareCounter rather then gauge for certain metrics 2020-06-07 16:41:23 +09:00
Giovanni Petrantoni a6a2a81711
Start adding some metrics to pubsub (#192)
* Start adding some metrics to pubsub

In order to visualize it's functionality
Still WIP

* more metrics

* add per topic metrics

* finishup with requested metrics

* add a metrisServer define to start local server

* PR fixes and cleanup
2020-06-07 09:15:21 +02:00
Dmitriy Ryajov 130c64f33a
don't return nil in dial (#205)
* dont return nil in dial

* don't crash on pubsub send
2020-06-05 18:17:05 -06:00
Zahary Karadjov 2aebae56c0
Don't rely on the side-effects from doAssert 2020-06-05 19:12:10 +03:00
Zahary Karadjov 828a80ec8f
Make the MultiAddress.init function used in NBC non-failing 2020-06-05 17:51:22 +03:00
Dmitriy Ryajov 5960d42c50
remove casts from (#203) 2020-06-02 20:21:11 -06:00
Dmitriy Ryajov bb8bff2195
add sparse message propagation tests to gossipsub (#202)
* add sparce tests to gossipsub

* add send hooks

* remove `all`
2020-06-02 17:53:38 -06:00
Dmitriy Ryajov 285884c20c
Close peers (#201)
* wip

* exceptions and resource cleanup

* correct peerlifetime on disconnect

* emulate defered

* remove comment
2020-06-02 11:32:42 -06:00
Dmitriy Ryajov c7298f34f4 additional comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 71640da8f2 remove close from read/write methods 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 1b4876d26d emulate `defered` 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov abf659a01a more consistent dialing proto selecting logic 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 86e1c8169c decorate observers hooks with {.raises: [Defect].}
move hooks logic out into standalone procs

License: MIT
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 4df151a3a3 typos 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 27a7f8c948 move EOF flag after local close and comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 815282a5da remove all() 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 293b7da295 typo 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 6e0eb93d4f remove readloops 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov abc12d0fb5 move stram close to a better location 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov daef00fc7b don't crash schlesi-dev 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 640c3bdc45 better exception handling and resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 7ff76d76b6 better exceptions 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 95774b2b81 better cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 9c2f31262e wip: try handling child stream exceptions 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov e3f8f53620 initStream method and better exceptions handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d3b79b002e better exceptions and don't fail writes 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 74f8b5b5f1 resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 93e5805c01 better exception handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d83ce4c932 exceptions and resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d4bdb42046 gossipsub fixes 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 9d3cc9647b fix merge 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 3d9c0bffba read from stream 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 048b1db1ad revert back allread 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 8e9716f5c3 remove on transport close cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov cf76edb0dd make noise work again 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 5158d96eaf close connection on chronos close 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 46daed9a38 wip 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 0f691cbafd add eof and closed handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 62da2a05c3 wip: rework with proper half-closed 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 98117a3068 call write until all is written out 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 802299e69a breakout from publish loop 2020-06-02 09:10:27 -06:00
Jacek Sieka 7e31210455
cover missing case in MultiAddress.init (#198)
* cover missing case in MultiAddress.init

* raise assert on marker in protocol

* unify inits for markers / non-markers

* fix string
2020-06-01 14:41:45 +02:00
Jacek Sieka 88dbeebf17
add side effect annotations (#197) 2020-06-01 09:25:16 +02:00
Giovanni Petrantoni 37b98ad45c
Secure managers are now sorted, giving priority to noise (#191)
* Secure managers are now sorted, giving priority to noise

* fix nimble test command

* Fix native tests

* fix directchat sample

* Could not write to connection - reduce verbosity

* fix interop testing

* Remove more tables

* test interop fixes

* directchat fix

* fix interop/remove some deprecation
2020-06-01 08:41:32 +02:00
Giovanni Petrantoni 6affcda937
Less exceptions more results (#188)
* Less exceptions more results

* Fix daemonapi and interop tests

* Add multibase

* wip multiaddress

* fix the build, consuming new result types

* fix standard setup

* Simplify match, rename into MaError, add more exaustive err text

* Fix the CI issues

* Fix directchat build

* daemon api fixes

* better err messages formatting

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2020-05-31 16:22:49 +02:00
Giovanni Petrantoni 7c9e5c2f7a "Could not write to connection" message split between trace and debug due to log size 2020-05-30 23:47:56 +09:00
Oskar Thorén b88bfc05f8
Make GossipSub initPubSub method public (#193)
This means we can use it from other protocols that inherit GossipSub. Otherwise,
a lot of internal state (heartbeat lock etc) doesn't get initialized properly.
2020-05-29 09:35:03 -06:00
Dmitriy Ryajov 7b6e1c0688
Gossipsub interop (#189)
* interop fixes

* add custom messageid provider and fix seqno

* use ECDSA for speed

* adding messageid tests

* breakout from publish loop

* addressing review comments

* remove unneded var

* dont stop broadcasting on failed peers
2020-05-27 12:33:49 -06:00
Giovanni Petrantoni 536555138c fix "future" typo 2020-05-26 16:08:09 +09:00
Giovanni Petrantoni 4447d97234 add back forgotten message in tryAndWarn 2020-05-26 15:21:59 +09:00
Dmitriy Ryajov 9132f16927
gossipsub fixes (#186) 2020-05-21 14:24:20 -06:00
Dmitriy Ryajov ba53c08b3c
Track incoming connections (#181)
* call write until all is written out

* wip: rework with proper half-closed

* add eof and closed handling

* wip

* close connection on chronos close

* don't use read

* make noise work again

* don't reraise just yet

* fixes after backporting

* remove on transport close cleanup

* revert back allread

* rust interop fixes

* read from stream

* inc count before closing

* rebasing master

* store incomming connections

* fix merge

* remove unneeded changes

* use internal close flag to indicate disposal
2020-05-21 11:33:48 -06:00
Dmitriy Ryajov 7900fd9f61
Half closed (#174)
* call write until all is written out

* add comments to lpchannel fields

* add an eof flag to signal which end closed

* wip: rework with proper half-closed

* add eof and closed handling

* propagate closes to piped

* call parent close

* moving bufferstream trackers out

* move writeLock to bufferstream

* move writeLock out

* remove unused call

* wip

* rebasing master

* fix mplex tests

* wip

* fix bufferstream after backport

* wip

* rename to differentiate from chronos tracker

* close connection on chronos close

* make reset request asyncCheck

* fix channel cleanup

* misc

* don't use read

* fix backports

* make noise work again

* proper exception handling

* don't reraise just yet

* add convenience templates

* dont double wrap

* use async pragma

* fixes after backporting

* muxer owns connection

* remove on transport close cleanup

* revert back allread

* adding some todos

* read from stream

* inc count before closing

* rebasing master

* rebase master

* use correct exception type

* use try/finally insted of defer

* fix compile in trace mode

* reset channels on mplex close
2020-05-19 18:14:15 -06:00
Dmitriy Ryajov 681991ae48
reduce buffer size in lpchannel (#184) 2020-05-19 16:15:36 -06:00
Giovanni Petrantoni 01339c991f
Don't use and expose directly secp types (#183)
* Don't use and expose directly secp types

* Reuse same secp type names
2020-05-19 14:48:55 +02:00
Giovanni Petrantoni c219100e64
Use results and no exceptions in chacha and curve25519 (#182) 2020-05-19 10:22:49 +02:00
Dmitriy Ryajov f8029e7359
use sha256 digest as cache keys (#135)
* use sha256 digest as cache keys

* rebasing master
2020-05-18 14:49:49 -06:00
Dmitriy Ryajov 9cf1fd0216
remove generic constructor and expose serverflags (#176)
* remove generic constructor and expose serverflags

* fix transport constructor

* fix merge issues
2020-05-18 13:04:05 -06:00
Dmitriy Ryajov 773b738c12
don't track Connection, track StreamTransport (#177)
* don't track Connection, track StreamTransport

* make tests more deterministic
2020-05-18 11:05:34 -06:00
Dmitriy Ryajov 5583168965
Track all connections (#175)
* adding to do for future refactor

* no mix conn and transport stopping or it will race

* don't use intermediary vars
2020-05-18 09:08:10 -06:00
Dmitriy Ryajov 1819502fb5
Cleanup - tests and logging (#178)
* make async for proper exception handling

* tryAndWarn msg messes up Exception msg

* misc: comment out tracker dumps

* cleanup mplex tests

* more informative errors

* give CI time to run

* revert change, bacause it causes races
2020-05-18 07:49:49 -06:00
Giovanni Petrantoni 7dcb807f64
Crypto utilities resultification (#150) 2020-05-18 07:25:55 +02:00
Dmitriy Ryajov 167f42ed45
Remove read (#171)
* use readExactly

* remove `read`

* remove read

* no more `read`
2020-05-14 22:02:05 -06:00
Jacek Sieka 69abf5097d
handle a few exceptions (#170)
* handle a few exceptions

Some of these are maybe too aggressive, but in return, they'll log
their exception - more refactoring needed to sort this out - right now
we get crashes on unhandled exceptions of unknown origin

* during connection setup
* while closing channels
* while processing pubsubs

* catch exceptions that are raised and don't try to catch exceptions that are not raised

* propagate cancellederror

* one more

* more

* more

* make interop tests less fragile

* Raise expiration time in gossipsub fanout test for slow CI

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-05-14 21:56:56 -06:00
Giovanni Petrantoni 005e088405
Properly track and close mplex handlers (#166)
* Properly track and close mplex handlers

* Avoid verbose warnings

* Fix tryAndWarn trace issue

* Handle LPEOF in lpchannel close
2020-05-12 14:45:32 +02:00
Jacek Sieka 618e01eba3
don't crash on double peerinfo close (#167) 2020-05-11 21:05:24 +02:00
Jacek Sieka 3053f03814 fix varint issues
* fixes #111
2020-05-11 09:12:23 -06:00
Jacek Sieka 87e1f3c61f
fix trace logging 2020-05-09 13:56:07 +02:00
Jacek Sieka fbf640794b
move channel size constant 2020-05-09 08:48:26 +02:00
Dmitriy Ryajov 6196d56fc2
check for nil observers 2020-05-08 15:44:18 -06:00
Jacek Sieka ccd019b328
use stream directly in chronosstream (#163)
* use stream directly in chronosstream

for now, chronos.AsyncStream is not used to provide any features on top
of chronos.Stream, so in order to simplify the code, chronosstream can
be used directly.

In particular, the exception handling is broken in the current
chronosstream - opening and closing the stream is simplified this way as
well.

A future implementation that actually takes advantage of the AsyncStream
features would wrap AsyncStream instead as a separate lpstream
implementation, leaving this one as-is.

* work around chronos exception type issue
2020-05-08 22:10:06 +02:00
Giovanni Petrantoni c889224012 Add PubSub observer+ hooks (they can modify as well) 2020-05-08 13:31:52 -06:00
Ștefan Talpalaru 268253ea18 remove Chronos type from public API 2020-05-08 13:23:36 -06:00
Ștefan Talpalaru 313f9b0952 use a Transport.serverFlags attribute 2020-05-08 13:23:36 -06:00
Ștefan Talpalaru c480e65055 use set[ServerFlags] params instead of hardcoded flags 2020-05-08 13:23:36 -06:00
Ștefan Talpalaru 125843af7d TcpTransport.listen(): enable SO_REUSEADDR
in order to avoid failures when restarting the process while the OS is
keeping open sockets from previous runs
2020-05-08 13:23:36 -06:00
Jacek Sieka 1efada474c
remove readLoop in secure protocols (#162)
* remove readLoop in secure protocols, fix security issues

* fix Defect on remote sending 0-byte noise/secio message
* remove msglen from `write` (unused)
* simplify SecureConn data flow
* document some control-flow issues

* unify exception behaviour across noise and secio

* secio would not raise on mac/decryption errors

* fix compile error
2020-05-07 14:37:46 -06:00
Jacek Sieka 330da51819
removals (#159)
* remove unused stream methods
* reimplement some of them with proc's
* remove broken tests
* Error->Defect for defect
* warning fixes
2020-05-06 18:31:47 +02:00
Dmitriy Ryajov 6da4d2af48
Pubsub signatures flags (#161)
* add verify signature flag

* add sign flag to enable/disable msg signing

* moving internal tests out to their own file

* cleanup nimble file

* remove unneeded tests

* move pubsub tests out

* fix tests
2020-05-06 11:26:08 +02:00
Jacek Sieka e1928456a7 avoid newlines in $
they mess with debug prints and logging (same reason why $(seq) doesn't
print them
2020-04-28 10:59:53 -06:00
cheatfate 290ba712e9
Fix MultiAddress.protoAddress() bug for fixed arrays.
Add tests for it.
2020-04-28 14:43:44 +03:00
Giovanni Petrantoni 8a22c073c7 Fix secure/noise securing explicitly, added noise to pubsub tests 2020-04-24 14:33:08 -06:00
cheatfate 917b5f5c84 Add MultiAddress.init(integer) for tcp,udp,dccp,sctp protocols.
Add tests for it.
2020-04-23 08:10:17 -06:00
Giovanni Petrantoni 1c4d72f5e3
Use Result construct in minasn1 (#144) 2020-04-23 14:10:20 +02:00
Giovanni Petrantoni 6ae92eb21a Hotfix noise read usage, replace with readExactly 2020-04-22 23:36:59 +09:00
Giovanni Petrantoni d96372f820 Minor tweaking in errors.nim in order to give priority to Defect 2020-04-21 22:10:47 +09:00
Giovanni Petrantoni 4c6a123d31
Add chronos trackers and used them to sanitize resource disposal (#131)
* Add chronos trackers and used them to sanitize resource disposal

* Chronos trackers for transport tests wip

* No more chronos leaks in testtransport

* Make tcp transport and test more robust when closing

* Test async leaking tracking wip

* Fix a regression in wire connect

* Add chronos trackers to more tests and sanitize resource closure

* Wip fixing floodsub tests

* Floodsub wip

* Made floodsub basically deterministic, hit a nim bug with captures tho

* Wrap up floodsub tests refactor

* Wrapping up

* Add allFuturesThrowing utility

* Fix missing allFuturesThrowing in noise tests!

* Make tests green

* attempt fixing gossipsub failing cases

* Make sure to check also fanout in waitSub

* More verbose traces

* Gossipsub test improvments

* Refactor TcpTransport remove asyncCheck

* Add Connection trackers

* Add stricter connection tracking, wip mplex fix

* More asynccheck removal, in order to avoid connection leaks

* bump chronicles requirement

* Enable tracker dump to check CI output

* Wait for more futures in testmplex

* Remove tracker dump messages

* add tryAndWarn utility, fix mplex issue with go interop

* All allFuturesThrowing to directchat too

* make sure to cleanup on transport close
2020-04-21 10:24:42 +09:00
Jacek Sieka e8b33c64fa
secp: use upstream secp convenience api (#141)
* secp: use upstream secp convenience api
2020-04-17 12:51:13 +02:00
Ștefan Talpalaru eaa73ae6e8
add stream metrics (#136)
* add stream metrics

- just BufferStream and Connection are tracked, for now
- flag checking is enforced more strictly in close(), since it became
  clear that instances are closed multiple times

* add "metrics" dependency

and sort the list
2020-04-14 15:27:07 +02:00
Ștefan Talpalaru 7723403b1f
debug prints (#132)
* debug prints

* CI: enable stack traces

* Azure: better NimBinaries cache key

* CI changes

- Azure: remove Linux target
- Travis: add ARM64 target

* uglify the code in order to save 12 bytes per LPStream object
2020-04-14 15:21:16 +02:00
Jacek Sieka 2b823bde68
secp: update (#138) 2020-04-12 19:03:08 +02:00
Giovanni Petrantoni 303ec297da
Start removing allFutures (#125)
* Start removing allFutures

* More allfutures removal

* Complete allFutures removal except legacy and tests

* Introduce table values copies to prevent error

* Switch to allFinished

* Resolve TODOs in flood/gossip

* muxer handler, log and re-raise

* Add a common and flexible way to check multiple futures
2020-04-11 13:08:25 +09:00
Dmitriy Ryajov f4740c8b8e fix trace runs in connection 2020-04-07 14:55:05 -06:00
Dmitriy Ryajov 00fbc9246e fix nil condition 2020-04-07 12:16:59 -06:00
Dmitriy Ryajov 6cbcc7859e reduse usssage of asyncCheck 2020-04-07 12:16:59 -06:00
Dmitriy Ryajov bd49a35e0a formatting 2020-04-07 12:16:59 -06:00
Dmitriy Ryajov 976164ba3c proper connection cleanup 2020-04-07 12:16:59 -06:00