Commit Graph

115 Commits

Author SHA1 Message Date
Zahary Karadjov 60122a044c
Restore interop with Lighthouse by preventing concurrent meshsub dials 2020-08-17 22:40:58 +03:00
Jacek Sieka 53877e97bd
trace logs 2020-08-17 12:39:25 +02:00
Jacek Sieka f46bf0faa4
remove send lock (#334)
* remove send lock

When mplex receives data it will block until a reader has processed the
data. Thus, when a large message is received, such as a gossipsub
subscription table, all of mplex will be blocked until all reading is
finished.

However, if at the same time a `dial` to establish a gossipsub send
connection is ongoing, that `dial` will be blocked because mplex is no
longer reading data - specifically, it might indeed be the connection
that's processing the previous data that is waiting for a send
connection.

There are other problems with the current code:
* If an exception is raised, it is not necessarily raised for the same
connection as `p.sendConn`, so resetting `p.sendConn` in the exception
handling is wrong
* `p.isConnected` is checked before taking the lock - thus, if it
returns false, a new dial will be started. If a new task enters `send`
before dial is finished, it will also determine `p.isConnected` is
false, then get stuck on the lock - when the previous task finishes and
releases the lock, the new task will _also_ dial and thus reset
`p.sendConn` causing a leak.

* prefer existing connection

simplifies flow
2020-08-17 12:38:27 +02:00
Dmitriy Ryajov b76b3e0e9b
Rework pubsub (#322)
* move pubsub of off switch, pass switch into pubsub

* use join on lpstreams

* properly cleanup up failed peers

* fix tests

* fix peertable hasPeerId

* fix tests

* rework sending, remove helpers from pubsubpeer, unify in broadcast

* further split broadcast into send

* use send where appropriate

* use formatIt

* improve trace

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-08-11 18:05:49 -06:00
zah fbb59c3638
`msg` is a reserved property name in Chronicles (#321)
Every Chronicles log record has an existing `msg` property matching
the static string supplied in the log statement. Thus, it's currently
not possible to use `msg` as the name of a user property:

https://github.com/status-im/nim-chronicles/issues/86
2020-08-07 16:46:00 -06:00
Ștefan Talpalaru 843d32f8db
put expensive metrics under a Nim define (#310) 2020-08-04 17:27:59 -06:00
Dmitriy Ryajov b6877b8aac
increase send timeout for prune and graft msgs (#306)
* increase send timeout for prune and graft msgs

* use trace logs for subscribe monitor
2020-08-03 17:55:42 -06:00
Dmitriy Ryajov 980764774e
pubsub timeouts tuning (#295)
* add finegrained timeouts to pubsub

* use 10 millis timeout in tests

* finalization

* revert timeouts

* use `atEof` for reads

* adjust timeouts and use atEof for reads

* use atEof for reads

* set isEof flag

* no backoff for pubsub streams

* temp timer increase, make macos finalize

* don't call `subscribePeer` in libp2p anymore

* more traces

* leak tests

* lower timeouts

* handle exceptions in control message

* don't use `cancelAndWait`

* handle exceptions in helpers

* wip

* don't send empty messages

* check for leaks properly

* don't use cancelAndWait

* don't await subscribption sends

* remove subscrivePeer calls from switch

* trying without the hooks again
2020-08-02 23:20:11 -06:00
Dmitriy Ryajov 94196fee71
Connections and pubsub peers cleanup (#279)
* better peer tracking and cleanup

* check if peer and conn is nil

* test name

* make timeout more agressive

* rename method for better clarity
2020-07-17 13:46:24 -06:00
Dmitriy Ryajov 0348773ec9
Connection manager (#277)
* splitting out connection management

* wip

* wip conn mngr tests

* set peerinfo in contructor

* comments and documentation

* tests

* wip

* add `None` to detect untagged connections

* use `PeerID` to index connections

* fix tests

* remove useless equality
2020-07-17 09:36:48 -06:00
Jacek Sieka 170685f9c6
gossipsub fixes (#276)
* graft up to D peers
* fix logging so it's clear who is grafting/pruning who
* clear fanout when grafting
2020-07-16 21:26:57 +02:00
Jacek Sieka c76152f2c1
Simplify send (#271)
* PubSubPeer.send single message

* gossipsub: simplify send further
2020-07-16 12:06:57 +02:00
Dmitriy Ryajov f35b8999b3
some light cleanup for pub/gossip sub (#273)
* move peer table out to its own file

* move peer table

* cleanup `==` and add one to peerinfo

* add peertable

* missed equality check
2020-07-15 13:18:55 -06:00
Eugene Kabanov b832668768
Minprotobuf refactoring 2 (#269)
* Protobuf refactoring stage II.

* Remove NoError.

* Change trace level for invalid message.
2020-07-15 10:25:39 +02:00
Giovanni Petrantoni d7bab37119
Fix gossip messages seqno according to spec (#253)
* Fix gossip messages seqno according to spec

* Add peers back to gossipsub table, slow down heartbeat

* Revert "Add peers back to gossipsub table, slow down heartbeat"

This reverts commit 01e2e62172.

* make seqno a threadvar, remove from peerinfo

* seqno refactor, into pubsub
2020-07-14 21:51:33 -06:00
Giovanni Petrantoni fcda0f6ce1
PubSubPeer tables refactor (#263)
* refactor peer tables

* tests fixing

* override PubSubPeer equality

* fix pubsubpeer comparison
2020-07-13 15:32:38 +02:00
Dmitriy Ryajov 4c815d75e7
More gossip cleanup (#257)
* more cleanup

* correct pubsub peer count

* close the stream first

* handle cancelation

* fix tests

* fix fanout ttl

* merging master

* remove `withLock` as it conflicts with stdlib

* fix trace build

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-09 14:21:47 -06:00
Jacek Sieka c720e042fc
clean up mesh handling logic (#260)
* gossipsub is a function of subscription messages only
* graft/prune work with mesh, get filled up from gossipsub
* fix race conditions with await
* fix exception unsafety when grafting/pruning
* fix allowing up to DHi peers in mesh on incoming graft
* fix metrics in several places
2020-07-09 11:16:46 -06:00
Dmitriy Ryajov a52763cc6d
fix publishing (#250)
* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* Cleanup resources (#246)

* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* cleanup stale peers more eaguerly

* synchronize connection cleanup and small refactor

* close client first and call parent second

* disconnect failed peers on publish

* check for publish result

* fix tests

* fix tests

* always call close

Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-07-07 18:33:05 -06:00
Giovanni Petrantoni ec00c7fc50
Peer resultification and defect only (#245)
* Peer resultification and defect only

* Fixing some tests

* test fixes

* Rename peer into peerid

* better result error message in identify

* further merge fixes
2020-07-01 08:25:09 +02:00
Dmitriy Ryajov c788a6a3c0
Cleanup resources (#246)
* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises
2020-06-29 09:15:31 -06:00
Jacek Sieka aa6756dfe0
allow message id provider to be specified (#243)
* don't send public key in message when not signing (information leak)
* don't run rebalance if there are peers in gossip (see #242)
* don't crash randomly on bad peer id from remote
2020-06-28 09:56:38 -06:00
Dmitriy Ryajov 7a95f1844b
Concurrent dials (#238)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* add secureconn constructor

* close in the correct order

* concurent dial lock and track in/out conns better

* make tests pass

* add todo comment

* disconect peers that open too many connections

* wip

* do connection and muxer tracking in one place

* prevent nil pointer in observers

* drop connections when peers is over max

* prevent channel leaks

* don't use closure to handle channel
2020-06-24 09:08:44 -06:00
Dmitriy Ryajov 5b28e8c488
Cleanup lpstream, Connection and BufferStream (#228)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* don't use delegation in connection

* move connection out to own file

* don't breakout on reset

* make sure to call close on secured conn

* add lpstream tracing

* don't breackdown by conn id

* fix import

* remove unused lable

* reset  connection on exception

* add additional metrics for skipped messages

* check for nil in secure.close
2020-06-19 11:29:43 -06:00
Dmitriy Ryajov 9d9f793b4f
add metrics for sent messages by topic and peer (#220) 2020-06-15 17:39:03 -06:00
Dmitriy Ryajov ac04ca6e31
make sure keys exist and more metrics (#215) 2020-06-11 20:20:58 -06:00
Dmitriy Ryajov 55a294a5c9
better pubsub metrics (#214) 2020-06-11 12:09:34 -06:00
Viktor Kirilov 1afec627c2
proper name for topics so that we can filter dynamically using chronicles (#210)
* proper name for topics so that we can filter dynamically using chronicles

* lowercase
2020-06-10 10:48:01 +02:00
Dmitriy Ryajov ee281310c0
move trace log 2020-06-08 10:40:08 -06:00
Giovanni Petrantoni 82b4ed8f44 use declareCounter rather then gauge for certain metrics 2020-06-07 16:41:23 +09:00
Giovanni Petrantoni a6a2a81711
Start adding some metrics to pubsub (#192)
* Start adding some metrics to pubsub

In order to visualize it's functionality
Still WIP

* more metrics

* add per topic metrics

* finishup with requested metrics

* add a metrisServer define to start local server

* PR fixes and cleanup
2020-06-07 09:15:21 +02:00
Dmitriy Ryajov 130c64f33a
don't return nil in dial (#205)
* dont return nil in dial

* don't crash on pubsub send
2020-06-05 18:17:05 -06:00
Dmitriy Ryajov bb8bff2195
add sparse message propagation tests to gossipsub (#202)
* add sparce tests to gossipsub

* add send hooks

* remove `all`
2020-06-02 17:53:38 -06:00
Dmitriy Ryajov 1b4876d26d emulate `defered` 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 86e1c8169c decorate observers hooks with {.raises: [Defect].}
move hooks logic out into standalone procs

License: MIT
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-06-02 09:10:27 -06:00
Dmitriy Ryajov daef00fc7b don't crash schlesi-dev 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 93e5805c01 better exception handling 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 9132f16927
gossipsub fixes (#186) 2020-05-21 14:24:20 -06:00
Dmitriy Ryajov 7900fd9f61
Half closed (#174)
* call write until all is written out

* add comments to lpchannel fields

* add an eof flag to signal which end closed

* wip: rework with proper half-closed

* add eof and closed handling

* propagate closes to piped

* call parent close

* moving bufferstream trackers out

* move writeLock to bufferstream

* move writeLock out

* remove unused call

* wip

* rebasing master

* fix mplex tests

* wip

* fix bufferstream after backport

* wip

* rename to differentiate from chronos tracker

* close connection on chronos close

* make reset request asyncCheck

* fix channel cleanup

* misc

* don't use read

* fix backports

* make noise work again

* proper exception handling

* don't reraise just yet

* add convenience templates

* dont double wrap

* use async pragma

* fixes after backporting

* muxer owns connection

* remove on transport close cleanup

* revert back allread

* adding some todos

* read from stream

* inc count before closing

* rebasing master

* rebase master

* use correct exception type

* use try/finally insted of defer

* fix compile in trace mode

* reset channels on mplex close
2020-05-19 18:14:15 -06:00
Dmitriy Ryajov f8029e7359
use sha256 digest as cache keys (#135)
* use sha256 digest as cache keys

* rebasing master
2020-05-18 14:49:49 -06:00
Jacek Sieka 3053f03814 fix varint issues
* fixes #111
2020-05-11 09:12:23 -06:00
Dmitriy Ryajov 6196d56fc2
check for nil observers 2020-05-08 15:44:18 -06:00
Giovanni Petrantoni c889224012 Add PubSub observer+ hooks (they can modify as well) 2020-05-08 13:31:52 -06:00
Dmitriy Ryajov 6da4d2af48
Pubsub signatures flags (#161)
* add verify signature flag

* add sign flag to enable/disable msg signing

* moving internal tests out to their own file

* cleanup nimble file

* remove unneeded tests

* move pubsub tests out

* fix tests
2020-05-06 11:26:08 +02:00
Dmitriy Ryajov 3effb95f10 close underlying bufferstream in lpchannel 2020-03-28 09:29:43 -06:00
Giovanni Petrantoni 0a3e4a764b
Less verbose traces (#112)
* Make traces less verbose with shortHexDump utility

* Rename shortHexDump into shortLog

* Improve shortLog, add shortLog for crypto keys

* Add proper shortLog implementations in messages
2020-03-23 15:03:36 +09:00
Dmitriy Ryajov 88a030d8fb fix: removing timeouts from conn 2020-02-05 20:38:43 +01:00
Dmitriy Ryajov 2232ca598e don't timeout in pubsub 2020-02-04 17:59:57 +01:00
Zahary Karadjov 1bd933cd5a
More precise tracing 2020-02-04 17:27:32 +01:00
Zahary Karadjov 7bd305471c
Make sure the library can compile with json logging in trace mode 2020-02-04 15:17:39 +01:00
Dmitriy Ryajov 667691f784 send messages in batches 2020-01-09 12:55:21 -06:00
Dmitriy Ryajov 68cc57669e
Feat/pubsub validators (#58)
* feat: adding validator hooks to pubsub

* expose add/remove validators on switch

* do less unnecessary copyng
2019-12-16 23:24:03 -06:00
Dmitriy Ryajov 293a219dbe
Cleanup (#55)
* fix: don't allow replacing pubkey

* fix: several small improvements

* removing pubkey setter

* improove error handling

* remove the use of Option[T] if not needed

* don't use optional

* fix-ci: temporarily pin p2pd to a working tag

* fix example to comply with latest changes

* bumping p2pd again to a higher version
2019-12-10 14:50:35 -06:00
Dmitriy Ryajov 5f6fcc3d90
extract public and private keys fields from peerid (#44)
* extract public and private keys fields from peerid

* allow assigning a public key

* cleaned up TODOs

* make pubsub prefix a const

* public key should be an `Option`
2019-12-07 10:36:39 -06:00
Dmitriy Ryajov e623e70e7b
PubSub (Gossip & Flood) Implementation (#36)
This adds gossipsub and floodsub, as well as basic interop testing with the go libp2p daemon. 

* add close event

* wip: gossipsub

* splitting rpc message

* making message handling more consistent

* initial gossipsub implementation

* feat: nim 1.0 cleanup

* wip: gossipsub protobuf

* adding encoding/decoding of gossipsub messages

* add disconnect handler

* add proper gossipsub msg handling

* misc: cleanup for nim 1.0

* splitting floodsub and gossipsub tests

* feat: add mesh rebalansing

* test pubsub

* add mesh rebalansing tests

* testing mesh maintenance

* finishing mcache implementatin

* wip: commenting out broken tests

* wip: don't run heartbeat for now

* switchout debug for trace logging

* testing gossip peer selection algorithm

* test stream piping

* more work around message amplification

* get the peerid from message

* use timed cache as backing store

* allow setting timeout in constructor

* several changes to improve performance

* more through testing of msg amplification

* prevent gc issues

* allow piping to self and prevent deadlocks

* improove floodsub

* allow running hook on cache eviction

* prevent race conditions

* prevent race conditions and improove tests

* use hashes as cache keys

* removing useless file

* don't create a new seq

* re-enable pubsub tests

* fix imports

* reduce number of runs to speed up tests

* break out control message processing

* normalize sleeps between steps

* implement proper transport filtering

* initial interop testing

* clean up floodsub publish logic

* allow dialing without a protocol

* adding multiple reads/writes

* use protobuf varint in mplex

* don't loose conn's peerInfo

* initial interop pubsub tests

* don't duplicate connections/peers

* bring back interop tests

* wip: interop

* re-enable interop and daemon tests

* add multiple read write tests from handlers

* don't cleanup channel prematurely

* use correct channel to send/receive msgs

* adjust tests with latest changes

* include interop tests

* remove temp logging output

* fix ci

* use correct public key serialization

* additional tests for pubsub interop
2019-12-05 20:16:18 -06:00
Dmitriy Ryajov 903e79ede1
Feat/conn cleanup (#41)
Backporting proper connection cleanup from #36 to align with latest chronos changes.

* add close event

* use proper varint encoding

* add proper channel cleanup in mplex

* add connection cleanup in secio

* tidy up

* add dollar operator

* fix tests

* don't close connections prematurely

* handle closing streams properly

* misc

* implement address filtering logic

* adding pipe tests

* don't use gcsafe if not needed

* misc

* proper connection cleanup and stream muxing

* re-enable pubsub tests
2019-12-03 22:44:54 -06:00
Dmitriy Ryajov 37d7a03fba use a timed cache in floodsub 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 3b9d34116d decrease floodsub traffic 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 34d1a641de cleanup/test pubsub 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 9862064234 changed copyright year 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 5b3f93ba1c feat: allow multiple handlers per topic in pubsub 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 054085620c logging: switch debug for trace in most cases 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 9f3b80b60c got pubsub working without signing 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 8920cd7d60 misc: pubsub/floodsub and logging 2019-10-11 08:15:24 +09:00
Dmitriy Ryajov 177eb71ffa wip: floodsub initial implementation 2019-10-11 08:15:24 +09:00