Commit Graph

821 Commits

Author SHA1 Message Date
Giovanni Petrantoni e496802943
Least expensive metrics (#421)
* add more general and useful metrics

* fix gossipsub peers metrics in heartbeat
2020-11-04 15:18:00 +01:00
Dmitriy Ryajov 7b5259dbc7
Move triggers (#416)
* move event triggers to connmanager

* use base error type

* avoid deadlocks

* handle eof and closed when identifying incoming

* use `closeWait`
2020-11-02 14:35:26 -06:00
Dmitriy Ryajov 43a77e60a1
split stream counts by direction (#418) 2020-11-01 16:23:26 -06:00
Jacek Sieka 03639f1446
Revert "Channel leaks (#413)" (#417)
This reverts commit 1de1d49223.
2020-11-01 14:49:25 -06:00
Giovanni Petrantoni 9c1633bf87 fix ValidIpAddress multiaddress init return type 2020-10-31 13:20:29 +09:00
cheatfate 04c95cb7b0
Fix write should be writeArray. 2020-10-31 03:23:34 +02:00
cheatfate ff48d0b1a2
Proper fix for init(ValidIpAddress). 2020-10-30 17:52:38 +02:00
Giovanni Petrantoni 3d9948a65e ensure all multiaddress routines use Result 2020-10-30 23:50:04 +09:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00
Dmitriy Ryajov 1de1d49223
Channel leaks (#413)
* break stream tracking by type

* use closeWithEOF to await wrapped stream

* fix cancelation leaks

* fix channel leaks

* logging

* use close monitor and always call closeUnderlying

* don't use closeWithEOF

* removing close monitor

* logging
2020-10-27 11:21:03 -06:00
Giovanni Petrantoni eeaa62feec add more debug details to multiaddress assertions 2020-10-21 18:29:59 +09:00
Giovanni Petrantoni 462da1f7a8
gossip MessageID as seq[byte] (#391)
* gossip MessageID as seq[byte]

* combina hashes in defaultMsgIdProvider

* wip

* fix defaultMsgIdProvider
2020-10-21 12:26:04 +09:00
Giovanni Petrantoni 27b9bf436e
fix validation according to specification (#410) 2020-10-21 12:25:42 +09:00
Giovanni Petrantoni 5c19668b2d
avoid verbose EOF messages in readOnce(secure) (#411)
* avoid verbose EOF messages in readOnce(secure)

* shorten azure tests further
2020-10-21 10:08:24 +09:00
Giovanni Petrantoni 32623b930e
handle secure errors in readOnce (secure) (#397)
* handle secure errors in readOnce(secure)

* small synthax fix

* fix mistake in readOnce's isNil
2020-10-19 14:13:14 +09:00
Giovanni Petrantoni 556213abf4
Extended validators (#395)
* gossip extended validation

* fix flood tests

* fix gossip 1.0 tests

* synthax consistency
2020-10-12 16:56:00 +09:00
Giovanni Petrantoni e3bdb9eb13
decode properly ControlPrune (#392) 2020-10-09 09:12:38 +09:00
Giovanni Petrantoni 0f2435f551
better opportunistic grafting score (when score is disabled) (#389) 2020-10-03 09:26:45 +09:00
Dean Eigenmann 853238a215
feature/expose-matcher (#387)
* exposes matcher

* might work

* fix

* fix
2020-10-02 08:59:15 -06:00
Giovanni Petrantoni 4a98a8af5a
gossip pruning fixes related to #371 (#385)
* gossip pruning fixes related to #371

* better trace for grafted/pruned

* shorted azure testing again
2020-10-02 13:09:31 +09:00
Mamy Ratsimbazafy 03f5bbba6d
saner logging (#381) 2020-09-29 09:40:06 -06:00
Giovanni Petrantoni 98d0cc3a16
defaultMsgIdProvider alternative/test anonymize (#379)
* defaultMsgIdProvider alternative/test anonymize

* avoid freeze during flood tests

* avoid `empty message, skipping` situation

* test observers

* avoid double initPubSub

* fix gossip testing (specially when anonymize is on)

* make azure tests shorter
2020-09-28 09:11:18 +02:00
Jacek Sieka 8ecef46738
reencode gossipsub messages with anonymization (#378)
This helps protect against clients sending more data than they should
and thus getting penalized on topics that require anonymity
2020-09-25 18:39:34 +02:00
Jacek Sieka 17e00e642a
limit write queue length (#376)
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.

Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.

* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
2020-09-24 18:43:20 +02:00
Jacek Sieka 25bd0a18f4
small fixes (#374)
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
2020-09-24 07:30:19 +02:00
Giovanni Petrantoni ec322124ac
allow to omit peerId and seqno (#372)
* allow to omit peerId and seqno

* small refactor

* wip

* fix message encoding

* improve rpc signature logic

* remove peerid from verify

* trace fixes

* fix message test

* fix gossip 1.0 tests
2020-09-23 17:56:33 +02:00
Dmitriy Ryajov abd234601b
move events to conn manager (#373) 2020-09-23 08:07:16 -06:00
Jacek Sieka 471e5906f6
fix gossipsub memory leak on disconnected peer (#371)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.

* remove dial lock
* run reconnection loop in background task
2020-09-22 09:05:53 +02:00
Jacek Sieka 49a12e619d
channel close race and deadlock fixes (#368)
* channel close race and deadlock fixes

* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes

A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.

* docs, simplification
2020-09-21 19:48:19 +02:00
Giovanni Petrantoni b99d2039a8
Gossip one one (#240)
* allow multiple codecs per protocol (without breaking things)

* add 1.1 protocol to gossip

* explicit peering part 1

* explicit peering part 2

* explicit peering part 3

* PeerInfo and ControlPrune protocols

* fix encodePrune

* validated always, even explicit peers

* prune by score (score is stub still)

* add a way to pass parameters to gossip

* standard setup fixes

* take into account explicit direct peers in publish

* add floodPublish logic

* small fixes, publish still half broken

* make sure to waitsub in sparse test

* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* restore utility module import

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* triage publish nil peers (issue is on master too but just hidden behind a if/in)

* getGossipPeers fixes

* remove topics from pubsubpeer (was unused)

* simplify rebalanceMesh (following spec) and make it finally reach D_high

* better diagnostics

* merge new pubsubpeer, copy 1.1 to new module

* fix up merge

* conditional enable gossip11 module

* add back topics in peers, re-enable flood publish

* add more heartbeat locking to prevent races

* actually lock the heartbeat

* minor fixes

* with sugar

* merge 1.0

* remove assertion in publish

* fix multistream 1.1 multi proto

* Fix merge oops

* wip

* fix gossip 11 upstream

* gossipsub11 -> gossipsub

* support interop testing

* tests fixing

* fix directchat build

* control prune updates (pb)

* wip parameters

* gossip internal tests fixes

* parameters wip

* finishup with params

* cleanups/wip

* small sugar

* grafted and pruned procs

* wip updateScores

* wip

* fix logging issue

* pubsubpeer, chronicles explicit override

* fix internal gossip tests

* wip

* tables troubleshooting

* score wip

* score wip

* fixes

* fix test utils generateNodes

* don't delete while iterating in score update

* fix grafted defect

* add a handleConnect in subscribeTopic

* pruning improvements

* wip

* score fixes

* post merge - builds gossip tests

* further merge fixes

* rebalance improvements and opportunistic grafting

* fix test for now

* restore explicit peering

* implement peer exchange graft message

* add an hard cap to PX

* backoff time management

* IWANT cap/budget

* Adaptive gossip dissemination

* outbound mesh quota, internal tests fixing

* oversub prune score based, finish outbound quota

* finishup with score and ihave budget

* use go daemon 0.3.0

* import fixes

* byScore cleanup score sorting

* remove pointless scaling in `/` Duration operator

* revert using libp2p org for daemon

* interop fixes

* fixes and cleanup

* remove heartbeat assertion, minor debug fixes

* logging improvements and cleaning up

* (to revert) add some traces

* add explicit topic to gossip rpcs

* pubsub merge fixes and type fix in switch

* Revert "(to revert) add some traces"

This reverts commit 4663eaab6c.

* cleanup some now irrelevant todo

* shuffle peers anyway as score might be disabled

* add missing shuffle

* old merge fix

* more merge fixes

* debug improvements

* re-enable gossip internal tests

* add gossip10 fallback (dormant but tested)

* split gossipsub internal tests into 1.0 and 1.1

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-09-21 11:16:29 +02:00
Jacek Sieka b7e5d1122c
cleanups (#366)
* reuse connection timeout for noise handshake (avoid extra timer)
* enforce nbytes > 0 for readOnce
* avoid some unnecessary memory zeroing
* simplify noise
* fix dumping when noise splits message
2020-09-16 11:55:25 +02:00
Dmitriy Ryajov b0d86b95dd
add peer lifecycle events (#357)
* add peer lifecycle events

* rework peer events to not use connection events

* don't use result in pubsub and switch init

* wip

* use ordered hashes and remove logscope

* logging

* add missing test

* small fixes
2020-09-15 14:19:22 -06:00
Giovanni Petrantoni a6007be428
avoid sending empty seqno and/or fromPeer (gossip rpc) (#364) 2020-09-15 12:33:18 +02:00
Eugene Kabanov e2d46c6b53
Add `libp2p_dump` and `libp2p_dump_dir` compiler time options. (#359)
* Add `libp2p_dump` and `libp2p_dump_dir` compiler time options.

* Add tools/pbcap_parser.

* Add mplex control messages decoding.
2020-09-15 11:16:43 +02:00
Oskar Thorén 5e66f6fbd8
Add logScope to connmanager and pubsubprotobuf (#363) 2020-09-15 08:03:53 +02:00
Jacek Sieka 0db45462cd
mplex fixes (#362)
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
2020-09-14 10:19:54 +02:00
Jacek Sieka 96d4c44fec
refactor bufferstream to use a queue (#346)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.

When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.

In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.

* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
2020-09-10 08:19:13 +02:00
Jacek Sieka 5b347adf58
logging fixes and small cleanups (#361)
In particular, allow longer multistream select reads
2020-09-09 19:12:08 +02:00
Jacek Sieka 63b38734bd
fix poor performance in LRU cache (#360)
it turns out (in NBC) a heap is sufficiently slow becuase of all the
deletes that it makes more sense to go with a linked list
2020-09-09 18:28:46 +02:00
Jacek Sieka 82c179db9e
mplex fixes (#356)
* close the right connection when channel send fails
* don't crash on channel id that is not unique
2020-09-08 08:24:28 +02:00
Jacek Sieka 2b72d485a3
a few more log fixes (#355) 2020-09-07 14:15:11 +02:00
Jacek Sieka c1856fda53
simplify and unify logging (#353)
* use short format for logging peerid
* log peerid:oid for connections
2020-09-06 10:31:47 +02:00
Jacek Sieka 9b815efe8f
gossipsub: don't subscribe to floodsub also (#352) 2020-09-04 22:53:03 +02:00
Jacek Sieka 16a008db75
fix connection event order when connection dies early (#351)
if the connection is already closed (because the remote closes during
identfiy for example), an exception would be raised which would leave
the connection in limbo, beacuse it would not go through the rest of
internalConnect.

Also, if the connection is already closed, the disconnect event would be
scheduled before the connect event :/
2020-09-04 20:30:26 +02:00
Jacek Sieka 6d91d61844
small cleanups & docs (#347)
* simplify gossipsub heartbeat start / stop
* avoid alloc in peerid check
* stop iterating over seq after unsubscribing item (could crash)
* don't crash on missing private key with enabled sigs (shouldn't happen
but...)
2020-09-04 18:31:43 +02:00
Eugene Kabanov 0b85192119
Remove asyncCheck from codebase. (#345)
* Remove asyncCheck from codebase.

* Replace all `discard` statements with new `asyncSpawn`.

* Bump `nim-chronos` requirement.
2020-09-04 18:30:45 +02:00
Jacek Sieka 5819c6a9a7
gossipsub / floodsub fixes (#348)
* mcache fixes

* remove timed cache - the window shifting already removes old messages
* ref -> object
* avoid unnecessary allocations with `[]` operator

* simplify init

* fix several gossipsub/floodsub issues

* floodsub, gossipsub: don't rebroadcast messages that fail validation
(!)
* floodsub, gossipsub: don't crash when unsubscribing from unknown
topics (!)
* gossipsub: don't send message to peers that are not interested in the
topic, when messages don't share topic list
* floodsub: don't repeat all messages for each message when
rebroadcasting
* floodsub: allow sending empty data
* floodsub: fix inefficient unsubscribe
* sync floodsub/gossipsub logging
* gossipsub: include incoming messages in mcache (!)
* gossipsub: don't rebroadcast already-seen messages (!)
* pubsubpeer: remove incoming/outgoing seen caches - these are already
handled in gossipsub, floodsub and will cause trouble when peers try to
resubscribe / regraft topics (because control messages will have same
digest)
* timedcache: reimplement without timers (fixes timer leaks and extreme
inefficiency due to per-message closures, futures etc)
* timedcache: ref -> obj
2020-09-04 08:10:32 +02:00
Jacek Sieka cd1c68dbc5
avoid send deadlock by not allowing send to block (#342)
* avoid send deadlock by not allowing send to block

* handle message issues more consistently
2020-09-01 09:33:03 +02:00
Dmitriy Ryajov d3182c4dba
No raise send (#339)
* dont raise in send

* check that the lock is acquire on release
2020-08-20 20:50:33 -06:00
Giovanni Petrantoni 840a76915e warn -> debug log levels in errors.nim 2020-08-20 16:53:28 +09:00
Jacek Sieka eb13845f65 work around send that may raise
`send` can raise exceptions that together with asyncCheck will
crash NBC
2020-08-19 14:25:30 +03:00
Zahary Karadjov af0955c58b
Add comments explaning a possible deadlock 2020-08-18 13:51:41 +03:00
Zahary Karadjov 60122a044c
Restore interop with Lighthouse by preventing concurrent meshsub dials 2020-08-17 22:40:58 +03:00
Jacek Sieka 833a5b8e57
add muxer nil check 2020-08-17 13:32:02 +02:00
Jacek Sieka cfcda3c3ef
work around race conditions between identify and other protocols
when identify is run on incoming connections, the connmanager tables are
updated too late for incoming connections to properly be handled

this is a quickfix that will eventually need cleaning up
2020-08-17 13:29:45 +02:00
Jacek Sieka 790b67c923
work around bufferstream deadlock (#332)
mplex backpressure handling deadlocks with something
2020-08-17 12:45:54 +02:00
Jacek Sieka 53877e97bd
trace logs 2020-08-17 12:39:25 +02:00
Jacek Sieka f46bf0faa4
remove send lock (#334)
* remove send lock

When mplex receives data it will block until a reader has processed the
data. Thus, when a large message is received, such as a gossipsub
subscription table, all of mplex will be blocked until all reading is
finished.

However, if at the same time a `dial` to establish a gossipsub send
connection is ongoing, that `dial` will be blocked because mplex is no
longer reading data - specifically, it might indeed be the connection
that's processing the previous data that is waiting for a send
connection.

There are other problems with the current code:
* If an exception is raised, it is not necessarily raised for the same
connection as `p.sendConn`, so resetting `p.sendConn` in the exception
handling is wrong
* `p.isConnected` is checked before taking the lock - thus, if it
returns false, a new dial will be started. If a new task enters `send`
before dial is finished, it will also determine `p.isConnected` is
false, then get stuck on the lock - when the previous task finishes and
releases the lock, the new task will _also_ dial and thus reset
`p.sendConn` causing a leak.

* prefer existing connection

simplifies flow
2020-08-17 12:38:27 +02:00
Jacek Sieka b12145dff7
avoid crash when subscribe is received (#333)
...by making subscribeTopic synchronous, avoiding a peer table lookup
completely.

rebalanceMesh will be called a second later - it's fine
2020-08-17 12:10:22 +02:00
Jacek Sieka ab864fc747
logging cleanups and small fixes (#331) 2020-08-15 21:50:31 +02:00
Jacek Sieka 397f9edfd4
simplify mplex (#327)
* less async
* less copying of data
* less redundant cleanup
2020-08-15 07:58:30 +02:00
Jacek Sieka 9c7e055310
set activity flag on noise / secio (#330) 2020-08-15 07:36:15 +02:00
Dmitriy Ryajov d1f1e1b31e
add missing mplex half closed test (#326) 2020-08-12 07:23:49 +02:00
Dmitriy Ryajov b76b3e0e9b
Rework pubsub (#322)
* move pubsub of off switch, pass switch into pubsub

* use join on lpstreams

* properly cleanup up failed peers

* fix tests

* fix peertable hasPeerId

* fix tests

* rework sending, remove helpers from pubsubpeer, unify in broadcast

* further split broadcast into send

* use send where appropriate

* use formatIt

* improve trace

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-08-11 18:05:49 -06:00
Eugene Kabanov 59b290fcc7
Refactor minasn1 and fix security issues. (#323)
* Refactor minasn1 and fix security issues.

* Fix for RSA test vectors.
2020-08-11 16:58:51 -06:00
Eugene Kabanov d47b2d805f
Use constant-time hex encoding/decoding procedures explicitly. (#305)
* Use constant-time hex encoding/decoding procedures explicitly.

* Add comments.
2020-08-11 08:48:21 -06:00
Dmitriy Ryajov 2325692f55
Fix half closed (#324)
* don't call `close` in `remoteClose`

* make sure timeout are properly propagted

* fix tests

* adding remote close write test
2020-08-10 16:17:11 -06:00
Giovanni Petrantoni 6ffd5be059 fix crash at TRACE log level 2020-08-08 16:31:37 +09:00
Eugene Kabanov 7c1aac5dc1
Use constant-time comparison for keys and signatures. (#299) 2020-08-08 08:53:33 +02:00
Jacek Sieka f303954989
peer hooks -> events (#320)
* peer hooks -> events

* peerinfo -> peerid
* include connection direction in event
* check connection status after event
* lock connmanager lookup also when dialling peer
* clean up un-upgraded connection when upgrade fails
* await peer eventing

* remove join/lifetime future from peerinfo

Peerinfo instances are not unique per peer so the lifetime future is
misleading - it fires when a random connection is closed, not the "last"
one

* document switch values

* naming

* peerevent->conneevent
2020-08-08 08:52:20 +02:00
zah fbb59c3638
`msg` is a reserved property name in Chronicles (#321)
Every Chronicles log record has an existing `msg` property matching
the static string supplied in the log statement. Thus, it's currently
not possible to use `msg` as the name of a user property:

https://github.com/status-im/nim-chronicles/issues/86
2020-08-07 16:46:00 -06:00
Jacek Sieka 7c2ab38da1
cleanups (#319) 2020-08-06 20:14:40 +02:00
Jacek Sieka c6c0c152c0
Dial peerid (#308)
* prefer PeerID in switch api

This avoids ref issues like ref identity and nil

* use existing peerinfo instance if possible

* remove secureCodec

there may be multiple connections per peerinfo with different codecs

* avoid some extra async::
2020-08-06 09:29:27 +02:00
Giovanni Petrantoni 9bbe5e4841
Fix subclass calls to handleDisconnect (#314)
* Fix subclass calls to handleDisconnect

* add peer ID to nil peer debug message
2020-08-06 11:12:52 +09:00
Giovanni Petrantoni 5c986cf657
Fix build, add some raises (#315)
* Fix build, add some raises

* wip

* wip more raises

* missing exc object in mplex

* proper lifetime for subscribePeer

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-08-05 19:30:57 -06:00
Ștefan Talpalaru bd5d43874a
more expensive metrics (#312) 2020-08-05 14:02:26 +02:00
Dmitriy Ryajov 74a6dccd80
adding channel limits to mplex (#309) 2020-08-04 23:16:04 -06:00
Ștefan Talpalaru 843d32f8db
put expensive metrics under a Nim define (#310) 2020-08-04 17:27:59 -06:00
Dmitriy Ryajov cf2b42b914
Moving idle timeout to Connection to enable across all connection streams (#307)
* move idle timeout logic to connection

* more informative logs

* more informative logs
2020-08-04 07:22:05 -06:00
Giovanni Petrantoni 5f0637c49a
Audit curve fixes part2 (#298)
* refactor and fix mulgen (curve25519)

* crypto tests fixing

* fix some confusion in curve25519 mul

* removing ForbiddenCurveValues table and checks

* fix remaining merge issues
2020-08-04 18:19:26 +09:00
Giovanni Petrantoni 5cd93540fa
add a timeout during noise handshake (#294)
* add a timeout during noise handshake

* noise hs timeout into const
2020-08-04 17:04:16 +09:00
Giovanni Petrantoni 504e0444d3
refactor and fix mulgen (curve25519) (#293)
* refactor and fix mulgen (curve25519)

* crypto tests fixing
2020-08-04 14:07:53 +09:00
Dmitriy Ryajov b6877b8aac
increase send timeout for prune and graft msgs (#306)
* increase send timeout for prune and graft msgs

* use trace logs for subscribe monitor
2020-08-03 17:55:42 -06:00
Dmitriy Ryajov 980764774e
pubsub timeouts tuning (#295)
* add finegrained timeouts to pubsub

* use 10 millis timeout in tests

* finalization

* revert timeouts

* use `atEof` for reads

* adjust timeouts and use atEof for reads

* use atEof for reads

* set isEof flag

* no backoff for pubsub streams

* temp timer increase, make macos finalize

* don't call `subscribePeer` in libp2p anymore

* more traces

* leak tests

* lower timeouts

* handle exceptions in control message

* don't use `cancelAndWait`

* handle exceptions in helpers

* wip

* don't send empty messages

* check for leaks properly

* don't use cancelAndWait

* don't await subscribption sends

* remove subscrivePeer calls from switch

* trying without the hooks again
2020-08-02 23:20:11 -06:00
Jacek Sieka e655a510cd
misc cleanups (#303) 2020-08-02 12:22:49 +02:00
Jacek Sieka d544b64010
resolve several races in connmanager (#302)
* resolve several races in connmanager

collections may change while doing await

* close conn

* simplify connmanager API

PeerID avoids nil and ref issues

* remove silly condition
2020-08-01 22:50:40 +02:00
Giovanni Petrantoni afcfd27aa0 add some verbosity to multistream handshake for debugging pruposes 2020-07-31 14:02:03 +09:00
Giovanni Petrantoni 0f06ae5a1d
Audit multistream fixes (#291)
* Don't ignore missing \n in multistream requests

Also make sure to except and quit an existing connection if multistream handshake fails

* solve handshake tracking in ms handler
2020-07-28 23:03:22 +09:00
Dmitriy Ryajov f7fdf31365
Pubsub lifetime (#284)
* lifecycle hooks

* tests

* move trace after closed check

* restore 1 second heartbeat

* await close event

* fix tests

* print direction string

* more trace logging

* add pubsub monitor

* add log scope

* adjust idle timeout

* add exc.msg to trace
2020-07-27 13:33:51 -06:00
Dmitriy Ryajov ed0df74bbd
Connection lifecycle hooks (#288)
* lifecycle hooks

* trigger hooks as tasks

* handle exceptions in trigger hooks

* trigger hooks after storing the connection

* add disconnected hook

* tests
2020-07-24 13:24:31 -06:00
Eugene Kabanov 6af3cb6406
Public key infrastructure filters. (#272)
* Initial commit.

* Workaround nim's bug and add some other compilation error fixes.

* Rename to libp2p_pki_schemes.
Fix secio.
Add tests.

* Attempt to fix command line.

* Fix command line.
Show status in tests.
2020-07-21 14:10:21 -06:00
Giovanni Petrantoni c3404f6eea
Handle cancellation in timeoutMonitor (#283)
* Handle cancellation in timeoutMonitor

* refactor lpchannel timeout as suggested by cheatfate
2020-07-21 09:03:41 -06:00
Giovanni Petrantoni 3b088f8980
Fix some unsubscribe issues and add unsubscribeAll helper (#282)
* Fix some unsub issues and add unsuball helper

* batch sendprune in unsubscribe methods

* add unsubscribeAll for floodsub
2020-07-20 10:16:13 -06:00
Dmitriy Ryajov 38eb36efae
don't use close event to stop timer (#280) 2020-07-18 11:00:44 -06:00
Dmitriy Ryajov 94196fee71
Connections and pubsub peers cleanup (#279)
* better peer tracking and cleanup

* check if peer and conn is nil

* test name

* make timeout more agressive

* rename method for better clarity
2020-07-17 13:46:24 -06:00
Dmitriy Ryajov ba071cafa6
Channel timeout (#278)
* add support for channel timeouts

* tests for channel timeout

* add timeouts to standard switch

* fix mplex init

* cleanup timer on stream close

* add comment for `isConnected`

* move cleanup event
2020-07-17 12:44:41 -06:00
Dmitriy Ryajov 0348773ec9
Connection manager (#277)
* splitting out connection management

* wip

* wip conn mngr tests

* set peerinfo in contructor

* comments and documentation

* tests

* wip

* add `None` to detect untagged connections

* use `PeerID` to index connections

* fix tests

* remove useless equality
2020-07-17 09:36:48 -06:00
Jacek Sieka 170685f9c6
gossipsub fixes (#276)
* graft up to D peers
* fix logging so it's clear who is grafting/pruning who
* clear fanout when grafting
2020-07-16 21:26:57 +02:00
Jacek Sieka c76152f2c1
Simplify send (#271)
* PubSubPeer.send single message

* gossipsub: simplify send further
2020-07-16 12:06:57 +02:00
Dmitriy Ryajov f35b8999b3
some light cleanup for pub/gossip sub (#273)
* move peer table out to its own file

* move peer table

* cleanup `==` and add one to peerinfo

* add peertable

* missed equality check
2020-07-15 13:18:55 -06:00
Eugene Kabanov b832668768
Minprotobuf refactoring 2 (#269)
* Protobuf refactoring stage II.

* Remove NoError.

* Change trace level for invalid message.
2020-07-15 10:25:39 +02:00
Eugene Kabanov 9eb5828a42
Fix #266. (#270)
* Fix security issue #266.

* Add more tests.

* Fix PeerID tests should not use RSA-512 keys.

* Fix crypto tests to use vectors with 2048+ bits.

* Disable 4096bit RSA key generation for CI debug runs.
2020-07-15 10:24:04 +02:00
Giovanni Petrantoni d7bab37119
Fix gossip messages seqno according to spec (#253)
* Fix gossip messages seqno according to spec

* Add peers back to gossipsub table, slow down heartbeat

* Revert "Add peers back to gossipsub table, slow down heartbeat"

This reverts commit 01e2e62172.

* make seqno a threadvar, remove from peerinfo

* seqno refactor, into pubsub
2020-07-14 21:51:33 -06:00
Ștefan Talpalaru b8b0a2b4bc
CI: build binaries with TRACE & JSON logs (#268)
Also: remove unused imports.
2020-07-14 02:02:16 +02:00
Jacek Sieka c6c2d99907
one more log fix 2020-07-13 20:19:20 +02:00
Jacek Sieka 76853f064a
json logging again 2020-07-13 19:59:49 +02:00
Jacek Sieka 6620b7a00b
more comment fixes 2020-07-13 19:30:18 +02:00
Jacek Sieka 0d4c74b33a
comment log that can't be json-serialized 2020-07-13 18:36:49 +02:00
Jacek Sieka 061c54d3c6
logging fixes 2020-07-13 17:26:05 +02:00
Jacek Sieka 87e58c1c8d
metrics: one more pubsub peers fix 2020-07-13 16:16:46 +02:00
Jacek Sieka c7895ccc52
metrics: fix pubsub_peers add metric 2020-07-13 16:15:27 +02:00
Giovanni Petrantoni fcda0f6ce1
PubSubPeer tables refactor (#263)
* refactor peer tables

* tests fixing

* override PubSubPeer equality

* fix pubsubpeer comparison
2020-07-13 15:32:38 +02:00
Eugene Kabanov efb952f18b
[WIP] Minprotobuf refactoring (#259)
* Minprotobuf initial commit

* Fix noise.

* Add signed integers support.
Add checks for field number value.
Remove some casts.

* Fix compile errors.

* Fix comments and constants.
2020-07-13 14:43:07 +02:00
Dmitriy Ryajov 181cf73ca7
Drain buffer (#264)
* drain lpchannel on reset

* move drainBuffer to bufferstream
2020-07-12 18:37:10 +02:00
Dmitriy Ryajov bec9a0658f
Cleanup rpc handler (#261)
* more cleanup

* fix tests

* merging master

* remove `withLock` as it conflicts with stdlib

* wip

* more fanout ttl

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-09 17:54:16 -06:00
Dmitriy Ryajov 4c815d75e7
More gossip cleanup (#257)
* more cleanup

* correct pubsub peer count

* close the stream first

* handle cancelation

* fix tests

* fix fanout ttl

* merging master

* remove `withLock` as it conflicts with stdlib

* fix trace build

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-09 14:21:47 -06:00
Jacek Sieka c720e042fc
clean up mesh handling logic (#260)
* gossipsub is a function of subscription messages only
* graft/prune work with mesh, get filled up from gossipsub
* fix race conditions with await
* fix exception unsafety when grafting/pruning
* fix allowing up to DHi peers in mesh on incoming graft
* fix metrics in several places
2020-07-09 11:16:46 -06:00
Jacek Sieka 9a3684c221
init from concrete key type (#252) 2020-07-09 02:59:09 -06:00
Jacek Sieka 45c089ff0d
noise updates (#255)
* clear secrets explicitly
* simplify keygen
* avoid some trivial memory allocations
* fix little endian encoding of nonce
2020-07-09 02:53:19 -06:00
Giovanni Petrantoni 4e12d0d97a nil check peer before disconnect 2020-07-09 17:20:45 +09:00
Giovanni Petrantoni f9e0a1f069 CI fix handleDisconnect (pubsub) 2020-07-09 13:56:59 +09:00
Giovanni Petrantoni 9b8b159abb Remove other spurious getStacktrace in pubsub traces 2020-07-09 13:19:34 +09:00
Giovanni Petrantoni 4bcb567d47 fix gossip tests 2020-07-09 12:34:36 +09:00
Giovanni Petrantoni 4698f41a91 Remove stacktrace logging from pubsub connect 2020-07-09 12:23:03 +09:00
Giovanni Petrantoni fec507e755
Add peers back to gossipsub table, slow down heartbeat (#256)
* Add peers back to gossipsub table, slow down heartbeat

* exclude on unsub from mesh and fanout
2020-07-08 11:06:26 -06:00
Dmitriy Ryajov a52763cc6d
fix publishing (#250)
* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* Cleanup resources (#246)

* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* cleanup stale peers more eaguerly

* synchronize connection cleanup and small refactor

* close client first and call parent second

* disconnect failed peers on publish

* check for publish result

* fix tests

* fix tests

* always call close

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-07 18:33:05 -06:00
Eugene Kabanov 775cab414a
Remove SHA1 from crypto and crypto tests. (#251)
* Remove SHA1 from crypto and crypto tests.

* Simplify RSA comparison procedure.
Refactor some procedures in crypto.nim.
2020-07-07 15:48:15 +02:00
Jacek Sieka d522537b19
reuse single RNG instance for all crypto key generation (#249)
* reuse single RNG instance for all crypto key generation

* use foolproof rng

* initRng -> newRng (because it's ref)

* fix test

* imports/exports, chat fix

* fix rsa

* imports and exports

* work around threadvar issue

* fixup

* mac workaround test
2020-07-07 13:14:11 +02:00
Jacek Sieka b49c619ca8
export public field types (#248)
* export public field types

* one more
2020-07-01 09:22:29 +02:00
Giovanni Petrantoni ec00c7fc50
Peer resultification and defect only (#245)
* Peer resultification and defect only

* Fixing some tests

* test fixes

* Rename peer into peerid

* better result error message in identify

* further merge fixes
2020-07-01 08:25:09 +02:00
Dmitriy Ryajov c788a6a3c0
Cleanup resources (#246)
* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises
2020-06-29 09:15:31 -06:00
Jacek Sieka aa6756dfe0
allow message id provider to be specified (#243)
* don't send public key in message when not signing (information leak)
* don't run rebalance if there are peers in gossip (see #242)
* don't crash randomly on bad peer id from remote
2020-06-28 09:56:38 -06:00
Dmitriy Ryajov 902880ef1f
consolidate reading in lpstream (#241)
* consolidate reading in lpstream

* remove debug echo

* throw if not enough bytes where read

* tune log level

* set eof flag

* test readExactly to fail on not enough bytes
2020-06-27 11:33:34 -06:00
Dmitriy Ryajov 7a95f1844b
Concurrent dials (#238)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* add secureconn constructor

* close in the correct order

* concurent dial lock and track in/out conns better

* make tests pass

* add todo comment

* disconect peers that open too many connections

* wip

* do connection and muxer tracking in one place

* prevent nil pointer in observers

* drop connections when peers is over max

* prevent channel leaks

* don't use closure to handle channel
2020-06-24 09:08:44 -06:00
Giovanni Petrantoni ee6e545878
multistream select make sure to not report NA (#235)
* multistream select make sure to not report NA but rather empty string if all fails

Also re-enable tests

* avoid using bad constructs, make multistream.select flow crystal clear
2020-06-22 15:38:48 -06:00
Jacek Sieka 6331b04cb4
secp: requiresInit updates (#237)
* secp: requiresInit updates

* fix
2020-06-22 19:03:15 +02:00
Jacek Sieka b99fd88deb
logging fixes 2020-06-21 11:14:19 +02:00
Giovanni Petrantoni 7852c6dd0f
Noise and eth2/nbc fixes (#226)
* Remove noise padding payload (spec removed it)

* add log scope in secure

* avoid defect array out of range in switch secure when "na"

* improve identify traces

* wip noise fixes

* noise protobuf adjustments (trying)

* add more debugging messages/traces, improve their actual contents

* re-enable ID check in noise

* bump go daemon tag version

* bump go daemon tag version

* enable noise in daemonapi

* interop testing, (both secio and noise will be tested)

* azure cache bump (p2pd)

* CI changes

- Travis: use Go 1.14
- azure-pipelines.yml: big cleanup
- Azure: bump cache keys
- build 64-bit p2pd on 32-bit Windows
- install both Mingw-w64 architectures

* noise logging fixes

* alternate testing between noise and secio

* increase timeout to avoid VM errors in CI (multistream tests)

* refactor heartbeat management in gossipsub

* remove locking within heartbeat

* refactor heartbeat management in gossipsub

* remove locking within heartbeat

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2020-06-20 19:56:55 +09:00
Dmitriy Ryajov 7a1c1c2ea6
fixing some key not found exceptions (#231) 2020-06-19 15:19:07 -06:00
Dmitriy Ryajov 5b28e8c488
Cleanup lpstream, Connection and BufferStream (#228)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* don't use delegation in connection

* move connection out to own file

* don't breakout on reset

* make sure to call close on secured conn

* add lpstream tracing

* don't breackdown by conn id

* fix import

* remove unused lable

* reset  connection on exception

* add additional metrics for skipped messages

* check for nil in secure.close
2020-06-19 11:29:43 -06:00
Dmitriy Ryajov 719744f46a
Small fixes (#230)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* make sure sending doesn't fail

* add `contains`

* review comment from prev pr
2020-06-19 11:29:25 -06:00
Dmitriy Ryajov 5f75e5adc6
don't use `switch.dial` in `subscribeToPeer` (#227)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`
2020-06-17 20:12:32 -06:00
Dmitriy Ryajov fe828d87d8
count published messages (#224) 2020-06-16 22:14:02 -06:00
Giovanni Petrantoni 224c2613e3
Remove noise padding payload (spec removed it) (#222) 2020-06-16 17:34:02 -06:00
Jacek Sieka 523eed65ce
metrics: use eth2 name for peers metric (#223) 2020-06-16 14:43:45 -06:00
Giovanni Petrantoni 0d81acafa8 Fix multistream select trace proto logging 2020-06-16 12:02:47 +09:00
Dmitriy Ryajov 9d9f793b4f
add metrics for sent messages by topic and peer (#220) 2020-06-15 17:39:03 -06:00
Dmitriy Ryajov 3c4c10d871
con't crash on nil ptr 2020-06-15 13:37:25 -06:00
Dmitriy Ryajov 7cb6c81159
Don't modify iterables while iterating them (#219)
* don't modify iterables while iterating

* assert handlers to properly close connections
2020-06-15 12:30:09 -06:00
Dmitriy Ryajov 9cd47fe816
add a logScope to make tracing less ugly (#218)
* add a logScope to make tracing less ugly

* don't crash on nil pointer
2020-06-15 12:10:41 -06:00
Dmitriy Ryajov 85b56d0b3a
Observedaddr (#217)
* send correct observerAddr

* run tests with info log level

* set observedaddr for channels
2020-06-12 19:56:17 -06:00
Oskar Thorén f97e2deec7
Make methods in FloodSub, GossipSub public (#216)
Similar to https://github.com/status-im/nim-libp2p/pull/193 but doing it for all
methods to avoid this issue in PubSub in the future. Got hit by this behavior
again in rpcHandler and took me a while to figure out.

See https://github.com/oskarth/nim-private-method-skipping-case/ for minimal
repro
2020-06-12 17:54:12 -06:00
Dmitriy Ryajov ac04ca6e31
make sure keys exist and more metrics (#215) 2020-06-11 20:20:58 -06:00
Dmitriy Ryajov 55a294a5c9
better pubsub metrics (#214) 2020-06-11 12:09:34 -06:00
Dmitriy Ryajov 6b196ad7b4
remove pubsub peer on disconnect (#212)
* remove pubsub peer on disconnect

* make sure lock is aquired

* add $

* count upgrades/dials/disconnects
2020-06-11 08:45:59 -06:00
Jacek Sieka 92579435b6
secio then noise (#213)
* secio then noise

much fewer peers on witti with noise first

* comment
2020-06-11 08:38:47 +02:00
Viktor Kirilov 1afec627c2
proper name for topics so that we can filter dynamically using chronicles (#210)
* proper name for topics so that we can filter dynamically using chronicles

* lowercase
2020-06-10 10:48:01 +02:00
Jacek Sieka 8d9e231a74
grab agentversion/protoversion (#211) 2020-06-09 12:42:52 -06:00
Dmitriy Ryajov 35ff99829e
close streams (#208)
* close streams

* don't warn on missing proto
2020-06-08 17:42:27 -06:00
Dmitriy Ryajov ee281310c0
move trace log 2020-06-08 10:40:08 -06:00
Giovanni Petrantoni 82b4ed8f44 use declareCounter rather then gauge for certain metrics 2020-06-07 16:41:23 +09:00
Giovanni Petrantoni a6a2a81711
Start adding some metrics to pubsub (#192)
* Start adding some metrics to pubsub

In order to visualize it's functionality
Still WIP

* more metrics

* add per topic metrics

* finishup with requested metrics

* add a metrisServer define to start local server

* PR fixes and cleanup
2020-06-07 09:15:21 +02:00
Dmitriy Ryajov 130c64f33a
don't return nil in dial (#205)
* dont return nil in dial

* don't crash on pubsub send
2020-06-05 18:17:05 -06:00
Zahary Karadjov 2aebae56c0
Don't rely on the side-effects from doAssert 2020-06-05 19:12:10 +03:00
Zahary Karadjov 828a80ec8f
Make the MultiAddress.init function used in NBC non-failing 2020-06-05 17:51:22 +03:00
Dmitriy Ryajov 5960d42c50
remove casts from (#203) 2020-06-02 20:21:11 -06:00
Dmitriy Ryajov bb8bff2195
add sparse message propagation tests to gossipsub (#202)
* add sparce tests to gossipsub

* add send hooks

* remove `all`
2020-06-02 17:53:38 -06:00
Dmitriy Ryajov 285884c20c
Close peers (#201)
* wip

* exceptions and resource cleanup

* correct peerlifetime on disconnect

* emulate defered

* remove comment
2020-06-02 11:32:42 -06:00
Dmitriy Ryajov c7298f34f4 additional comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 71640da8f2 remove close from read/write methods 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 1b4876d26d emulate `defered` 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov abf659a01a more consistent dialing proto selecting logic 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 86e1c8169c decorate observers hooks with {.raises: [Defect].}
move hooks logic out into standalone procs

License: MIT
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 4df151a3a3 typos 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 27a7f8c948 move EOF flag after local close and comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 815282a5da remove all() 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 293b7da295 typo 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 6e0eb93d4f remove readloops 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov abc12d0fb5 move stram close to a better location 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov daef00fc7b don't crash schlesi-dev 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 640c3bdc45 better exception handling and resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 7ff76d76b6 better exceptions 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 95774b2b81 better cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 9c2f31262e wip: try handling child stream exceptions 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov e3f8f53620 initStream method and better exceptions handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d3b79b002e better exceptions and don't fail writes 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 74f8b5b5f1 resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 93e5805c01 better exception handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d83ce4c932 exceptions and resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d4bdb42046 gossipsub fixes 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 9d3cc9647b fix merge 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 3d9c0bffba read from stream 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 048b1db1ad revert back allread 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 8e9716f5c3 remove on transport close cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov cf76edb0dd make noise work again 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 5158d96eaf close connection on chronos close 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 46daed9a38 wip 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 0f691cbafd add eof and closed handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 62da2a05c3 wip: rework with proper half-closed 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 98117a3068 call write until all is written out 2020-06-02 09:10:27 -06:00