Commit Graph

435 Commits

Author SHA1 Message Date
Eric Mastro fffa7e8cc2
fix: remove returned Futures from switch.start (#662)
* fix: remove returned Futures from switch.start

The proc `start` returned a seq of futures that was mean to be awaited by the caller. However, the start proc itself awaited each Future before returning it, so the ceremony requiring the caller to await the Future, and returning the Futures themselves was just used to handle errors. But we'll give a better way to handle errors in a future revision

Remove `switch.start` return type (implicit `Future[void]`)

Update tutorials and examples to reflect the change.

* Raise error during failed transport

Replaces logging of error, and adds comment that it should be replaced with a callback in a future PR.
2021-12-03 19:23:12 +01:00
Tanguy 6893bd9dbb
Customizable gossipsub backoff on unsubscribe (#665)
* Customizable gossipsub backoff on unsubscribe
* change default to 5s
2021-12-02 14:47:40 +00:00
Tanguy 0dfac6fce7
Signed envelopes and routing records (#656) 2021-11-24 14:03:40 -06:00
Dmitriy Ryajov 73168b6eae
Add support for multiple addresses to transports (#598)
* add test for multiple local addresses

* allow transports to listen on multiple addrs

* fix tcp transport accept

* check switch addrs are correct

* switch test to port 0

* close accepted peers on close

* ignore CancelledError in transport accept

* test ci

* only accept in accept loop

* avoid accept greedyness

* close acceptedPeers

* accept doesn't crash on cancelled fut

* add common transport test

* close conn on handling failure

* close accepted peers in two steps

* test for macos

* revert accept greedyness

* fix dialing cancel

* test chronos fix

* add ws

* ws cancellation

* small fix

* remove chronos blocked test

* fix testping

* Fix transport's switch start (like #609)

* bump chronos

* Websocket: handle both ws & wss

Co-authored-by: Tanguy Cizain <tanguycizain@gmail.com>
Co-authored-by: Tanguy <tanguy@status.im>
2021-11-24 14:01:12 -06:00
Tanguy 6f779c47c8
Gossipsub: don't send to peers seen during validation (#648)
* Gossipsub: don't send to peers seen during validation

* Less error prone code

* add metric

* Fix metric

* remove dangling code test

* address comments

* don't allocate memory
2021-11-14 09:08:05 +01:00
Tanguy c92125a1a4
Integrate dns resolving (#615)
* integrate dns

* give hostname to transport dial

* add hostname test

* switched to websock master

* Add dnsaddr dial test w multiple transports
2021-11-08 13:02:03 +01:00
Tanguy e1d96a0f4d
Remove isWire (#640) 2021-10-28 19:11:31 +02:00
Tanguy 5885e03680
Add maxMessageSize option to pubsub (#634)
* Add maxMessageSize option to pubsub
* Switched default to 1mb
2021-10-25 12:58:38 +02:00
Tanguy 846baf3853
Various cleanups part 1 (#632)
* raise -> raise exc
* replace stdlib random with bearssl
* object init -> new
* Remove deprecated procs
* getMandatoryField
2021-10-25 10:26:32 +02:00
Tanguy 3669b90ceb
Fix WS observed address (#631)
* Fix WS observed address

* Unify tcptransport & wstransport
2021-10-14 13:16:34 +02:00
Menduist d02735dc46
Remove peer info (#610)
Peer Info is now for local peer data only.
For other peers info, use the peer store.

Previous reference to peer info are replaced with the peerid
2021-09-08 11:07:46 +02:00
Tanguy Cizain 8cddfde837
Rename getKey -> getPublicKey (#621)
* rename getKey to getPublicKey

* use publicKey directly in gossipsub

* update error messages
2021-09-02 12:03:40 +02:00
Tanguy Cizain 62ca6dd9a0
Add missing test to testnative (#618)
* add missing tests

* fix testping
2021-09-01 08:38:24 +02:00
Tanguy Cizain f274bfe19d
DNS Addresses handling (#580)
* add 'dns' multiaddr protocol

* multiaddr: isWire is true for DNS protocols

* resolve dns on connect

* fix typo

* add dns test

* update resolveDns error handling

* handle multiple dns entries

* start of new resolver

* working dns resolver

* use the DnsResolver

* fix json logs

* small overhaul

* fix dns implem in lp2p

* update dnsclient repo

* add dns test to testnative

* dummy dns server for ut

* better mocked

* moved resolving to transport

* moved mockresolver to libp2p

* test resolve in switch test

* try multiple txt & track leaks

* raise e

* catchable error instead of exception

* save failed dns server

* moved resolve back to dialer

* remove nameresolver from dialer
2021-08-18 09:40:12 +02:00
Tanguy Cizain af3be7966b
Websocket Transport (#593)
* start of websocket transport

* more ws tests

* switch to common test

* add close to wsstream

* update ws & chronicles version

* cleanup

* removed multicodec

* clean ws outgoing connections

* renamed to websock

* removed stream from logs

* renamed ws to websock

* add connection closing test to common transport

* close incoming connection on ws stop

* renamed testwebsocket.nim -> testwstransport.nim

* removed raise todo

* split out/in connections

* add wss to tests

* Fix tls (#608)

* change log level

* fixed issue related to stopping

some cosmetic cleanup

* use `allFutures` to stop/close things

Prevent potential race conditions when stopping two or more transports

* misc

* point websock to server-case-object branch

* interop test with go

* removed websock version specification

* add daemon -> native ws test

* fix & test closed read/write

* update readOnce, thanks jangko

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2021-08-03 15:48:03 +02:00
markspanbroek 8b438142ce
Fixes failing peer store test in Nim 1.4.8 (#612)
* Fixes failing test in Nim 1.4.8

This tests failed because the order in which the elements
of a hashset are added to a seq is non-deterministic.

Co-authored-by: Tanguy Cizain <tanguycizain@gmail.com>
2021-07-28 12:06:18 +02:00
Tanguy Cizain c1b2d45d1b
Track ChronosStream in test (#605)
* track chronos stream in tests

* use chronosstreamtrackername
2021-07-13 13:53:08 +02:00
Tanguy Cizain 26e47d7da5
Various transports improvement (#594)
* little transport cleanup

* rename TcpTransport.init -> TcpTransport.new

* moved transport e2e to common file

* remove localAddress

* rename testtransport -> testtcptransport

* add checktrackers to commontransports

* removed multicodec from transports
2021-06-30 10:59:30 +02:00
Tanguy Cizain 7e3974f9c8
Fix flaky macos switch test (#592)
* sleep to avoid flaky switch test on macos

* Update tests/testswitch.nim

Co-authored-by: markspanbroek <mark@spanbroek.net>

Co-authored-by: markspanbroek <mark@spanbroek.net>
2021-06-23 09:27:52 -06:00
Dmitriy Ryajov b63e064b4a
Remove asynccheck (#590)
* Merge master (#555)

* Revisit Floodsub (#543)

Fixes #525

add coverage to unsubscribeAll and testing

* add mounted protos to identify message (#546)

* add stable/unstable auto bumps

* fix auto-bump CI

* merge nbc auto bump with CI in order to bump only on CI success

* put conditional locks on nbc bump (#549)

* Fix minor exception issues (#550)

Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.

* fix nimbus ref for auto-bump stable's PR

* Split dialer (#542)

* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments

* add base exception class and fix hierarchy

* fix imports

* `doAssert` is `ValueError` not `AssertionError`?

* revert back to `AssertionError`

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>

* Merge master (#555)

* Revisit Floodsub (#543)

Fixes #525

add coverage to unsubscribeAll and testing

* add mounted protos to identify message (#546)

* add stable/unstable auto bumps

* fix auto-bump CI

* merge nbc auto bump with CI in order to bump only on CI success

* put conditional locks on nbc bump (#549)

* Fix minor exception issues (#550)

Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.

* fix nimbus ref for auto-bump stable's PR

* Split dialer (#542)

* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments

* add base exception class and fix hierarchy

* fix imports

* `doAssert` is `ValueError` not `AssertionError`?

* revert back to `AssertionError`

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>

* cleanup

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>
2021-06-14 17:21:44 -06:00
Tanguy Cizain bed00ec43c
Identify Push (#587)
* start of identifypush

* better pushidentify

* push identify test

* fix: make peerid optional
2021-06-14 11:08:47 -06:00
Tanguy Cizain bd2e9a0462
Fix trackers tests (#588)
* Fix checkTracker typo

* fix testswich tests

* use tracker consts
2021-06-14 10:26:11 +02:00
Dmitriy Ryajov 9e2b464e14
Cleanup testinterop imports (#589)
* cleanup exports/imports, specially for testinterop

* export pubsub
2021-06-11 16:34:40 -06:00
Tanguy Cizain 93156447ba
Peer Store implement part II (#586)
* Connect & Peer event handlers now receive a peerinfo

* small peerstore refacto

* implement peerstore in switch

* changed PeerStore to final ref object

* revert libp2p/builders.nim
2021-06-08 18:55:24 +02:00
Tanguy Cizain 55a3606ecb
add ping protocol (#584)
* add ping protocol

* add ping protocol handler

* ping styling

* more ping tests

* switch ping to bearssl rng

* update ping style

* new cancellation test
2021-06-08 18:53:45 +02:00
Tanguy Cizain caac8191d2
Change newXXXX procs to XXXX.new (#585)
* newBufferStream -> BufferStream.new

* newMultistream -> MultistreamSelect.new

* newSecio -> Secio.new

* newNoise -> Noise.new

* newPlainText -> PlainText.new

* newPubSubPeer -> PubSubPeer.new

* newIdentify -> Identify.new

* newMuxerProvider -> MuxerProvider.new
2021-06-07 09:32:08 +02:00
Dmitriy Ryajov a3c00af945
Split dialer (#542)
* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments
2021-06-02 12:23:44 -06:00
Dmitriy Ryajov ac4e060e1a
adding raises defect across the codebase (#572)
* adding raises defect across the codebase

* use unittest2

* add windows deps caching

* update mingw link

* die on failed peerinfo initialization

* use result.expect instead of get

* use expect more consistently and rework inits

* use expect more consistently

* throw on missing public key

* remove unused closure annotation

* merge master
2021-05-21 10:27:01 -06:00
Jacek Sieka 83a20a992a
gossipsub: unsubscribe fixes (#569)
* gossipsub: unsubscribe fixes

* fix KeyError when updating metric of unsubscribed topic
* fix unsubscribe message not being sent to all peers causing them to
keep thinking we're still subscribed
* release memory earlier in a few places

* floodsub fix
2021-05-07 00:43:45 +02:00
Giovanni Petrantoni f81a085d0b
More gossip coverage (#553)
* add floodPublish test

* test delivery via control Iwant/have mechanics

* fix issues in control, and add testing

* fix possible backoff issue with pruned routine overriding it
2021-04-22 18:51:22 +09:00
Jacek Sieka e285d8bbf4
mem usage cleanups for pubsub (#564)
In `async` functions, a closure environment is created for variables
that cross an await boundary - this closure environment is kept in
memory for the lifetime of the associated future - this means that
although _some_ variables are no longer used, they still take up memory
for a long time.

In Nimbus, message validation is processed in batches meaning the future
of an incoming gossip message stays around for quite a while - this
leads to memory consumption peaks of 100-200 mb when there are many
attestations in the pipeline.

To avoid excessive memory usage, it's generally better to move non-async
code into proc's such that the variables therein can be released earlier
- this includes the many hidden variables introduced by macro and
template expansion (ie chronicles that does expensive exception
handling)

* move seen table salt to floodsub, use there as well
* shorten seen table salt to size of hash
* avoid unnecessary memory allocations and copies in a few places
* factor out message scoring
* avoid reencoding outgoing message for every peer
* keep checking validators until reject (in case there's both reject and
ignore)
* `readOnce` avoids `readExactly` overhead for single-byte read
* genericAssign -> assign2
2021-04-18 10:08:33 +02:00
Giovanni Petrantoni 795a651839
use a builder pattern to build the switch (#551)
* use a builder pattern to build the switch

* with with

* more refs
2021-04-02 10:20:51 +09:00
Giovanni Petrantoni 03bbdd2261
Revisit Floodsub (#543)
Fixes #525

add coverage to unsubscribeAll and testing
2021-03-15 13:16:03 -06:00
Jacek Sieka 70deac9e0d
fix peer score accumulation (#541)
* fix accumulating peer score
* fix missing exception handling
* remove unnecessary initHashSet/initTable calls
* simplify peer stats management
* clean up tests a little
* fix some missing raises annotations
2021-03-09 13:22:52 +01:00
Giovanni Petrantoni 02ad017107
Gossipsub fixes and Initiator flagging fixes (#539)
* properly propagate initiator information for gossipsub

* Fix pubsubpeer lifetime management

* restore old behavior

* tests fixing

* clamp backoff time value received

* fix member name collisions

* internal test fixes

* better names and explaining of the importance of transport direction

* fixes
2021-03-03 08:23:40 +09:00
Giovanni Petrantoni 45300c28a9
[SEC] gossipsub - handleIHAVE/handleIWANT recommendations & notes (#535)
Fixes #400
2021-02-26 14:27:42 +09:00
Giovanni Petrantoni d7469b2286
[SEC] gossipsub - a peer score is not retained up until expiry if abusive peer unsubscribes (#534)
* [SEC] gossipsub - a peer score is not retained up until expiry if abusive peer unsubscribes
Fixes #402

* remove debug logging
2021-02-26 14:15:58 +09:00
Giovanni Petrantoni eac6cd3dbf
Debt: cleanup warnings #426 (#536)
* testswitch cleanups

* Debt: cleanup warnings
Fixes #426
2021-02-25 09:24:49 -06:00
Giovanni Petrantoni 8236319a91
[SEC] gossipsub - handleGraft/handlePrune recommendations & notes (#530)
Fixes #401
2021-02-22 12:04:20 +09:00
Giovanni Petrantoni 1368bf7ecb
test rebalanceMesh with low score peers (#529) 2021-02-22 10:05:25 +09:00
Giovanni Petrantoni 51d8cd4ade
[SEC] gossipsub - rebalanceMesh may prune up to D_lo on oversubscription (#531)
Fixes #403
2021-02-13 13:39:32 +09:00
Giovanni Petrantoni e124e342b0
n subscription limits (#528)
* subscription high water, cleanups

* subscription limits test

* newline
2021-02-12 12:27:26 +09:00
Dmitriy Ryajov 2658181df9
Merge unstable (#518)
* Address Book POC implementation (#499)

* Address Book POC implementation

* Feat/peerstore impl (#505)

Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com>
2021-02-08 15:16:23 -06:00
Dmitriy Ryajov 4dea23c394
Remove secio usage and cleanup exports (#519)
* cleaned up exports

* remove secio use

* added more useful exports

* proper import
2021-02-08 14:33:34 -06:00
Dmitriy Ryajov fb493d1a4a
Connection limits tests (#509)
* connection limit tests

* remove use of secio

* check that upgraded fut is not nil

* rebuild
2021-01-27 21:27:33 -06:00
Dmitriy Ryajov 0959877b29
Connection limits (#384)
* master merge

* wip

* avoid deadlocks

* tcp limits

* expose client field in chronosstream

* limit incoming connections

* update with new listen api

* fix release

* don't override peerinfo in connection

* rework transport with accept

* use semaphore to track resource ussage

* rework with new transport accept api

* move events to conn manager (#373)

* use semaphore to track resource ussage

* merge master

* expose api to acquire conn slots

* don't fail expensive metrics

* allow tracking and updating connections

* set global connection limits to 80

* add per peer connection limits

* make sure conn is closed if tracking failed

* more descriptive naming for handle

* rework with new transport accept api

* add `getStream` hide `selectConn`

* add TransportClosedError

* make nil explicit

* don't make unnecessary copies of message

* logging

* error handling

* cleanup semaphore

* track connections properly

* throw `TooManyConnections` when tracking outgoing

* use proper exception and handle conventions

* check onCloseHandle for nil

* revert internalConnect changes

* adding upgraded flag

* await stream before closing

* simplify tracking

* wip

* logging

* split connection limits into incoming and outgoing

* further streamline connection limits split counts

* don't use closeWithEOF

* move peer and conn event triggers from switch

* wip

* wip

* wip

* merge master

* handle nil connections properly

* add clarifying comment

* don't raise exc on nil

* no finally

* add proper min/max connections logic

* rebase master

* merge master

* master merge

* remove request timeout

should be addressed in separate PR

* merge master

* share semaphore when in/out limits arent enforced

* merge master

* use import

* pass semaphore to trackConn

* don't close last conn

* use storeConn

* merge master

* use storeConn
2021-01-20 22:00:24 -06:00
Giovanni Petrantoni 240ec84ffb
Gossipsub wip (#502)
* Remove unused connections in pubsubpeer, also removed wrong usages, add a disconnect bad peers parameter

* handle exceptions in disconnectPeer

* small fix

* use the proper disconnection procedure for gossip peers

* fixes, more metrics add test about disconnection

* hot fix possible null pointers in switch

* silly isnil sugar

* Fix and test gossip directPeer connections
2021-01-15 13:48:03 +09:00
Dmitriy Ryajov 3878a95b23
Semaphore cancellations (#503)
* add proper cancelation handling

* remove cancelled futures explicitly

* use fifo to keep proper order

* add out of order cancelations test

* make count public

* use `new` instead of `init`

* remove private `queue` from tests

* expose count as a readonly prop

* use `delete()` to preserve seq order
2021-01-14 10:11:12 +01:00
Giovanni Petrantoni dc48170b0d
Gossip subscription improvements (#497)
* salt ids in seen table

* add subscription validation callback and avoid processing topics we don't care of

* apply penalty on bad subscription

* fix IHave handling IDs

* reduce indenting, add some comments

* fix gossip randombytes generation

* do not descore unwanted topics (might happen, due to timing, needs improvements)

* cleaning up and added tests

* validate subscriptions only when subscribing

* set notice level for failed publish

* fix floodsub behavior
2021-01-13 23:49:44 +09:00
Giovanni Petrantoni 87be2c7f1f
make switch tests less sensitive to time (#501)
* make switch tests less sensitive to time

* missing new line
2021-01-12 10:26:48 +09:00
Dmitriy Ryajov b2ea5a3c77
Concurrent upgrades (#489)
* adding an upgraded event to conn

* set stopped flag asap

* trigger upgradded event on conn

* set concurrency limit for accepts

* backporting semaphore from tcp-limits2

* export unittests module

* make params explicit

* tone down debug logs

* adding semaphore tests

* use semaphore to throttle concurent upgrades

* add libp2p scope

* trigger upgraded event before any other events

* add event handler for connection upgrade

* cleanup upgraded event on conn close

* make upgrades slot release rebust

* dont forget to release slot on nil connection

* misc

* make sure semaphore is always released

* minor improvements and a nil check

* removing unneeded comment

* make upgradeMonitor a non-closure proc

* make sure the `upgraded` event is initialized

* handle exceptions in accepts when stopping

* don't leak exceptions when stopping accept loops
2021-01-04 12:59:05 -06:00
Giovanni Petrantoni 4858e0ab15
Gossipsub refactor pt2 (#495)
* add sub/unsub test

* remove unused variable from gossip
2020-12-20 00:45:34 +09:00
Giovanni Petrantoni 05e789a34f
Gossipsub refactor (#490)
* refactor peerStats, re-enable scores for testing

* remove gossip 1.0

* cleanup

* codecov matrix fixes

* restore previous score on onNewPeer

* fix coverage n checks

* unsubscribeAll gossipsub fixes

* refactor unsub/sub

* refactor onNewPeer and fix score flow

* disable scores by default (change in tests later)

* fix tests, enable scores in tests

* fix wrongly merged test

* ensure topic removal from topics table

* small typo fix

* testinterop fixes
2020-12-19 15:43:32 +01:00
Giovanni Petrantoni 5543f6681f
first pass, use results for Cid module (#480)
* first pass, use results for Cid module

* improvements to decode
2020-12-15 14:19:18 +01:00
Dmitriy Ryajov a990fe95a0
Fixing range error introduces in v1.2.8 (#485) 2020-12-15 06:58:38 +01:00
Giovanni Petrantoni f8f0bc1bd8
Gossip improvements (#476)
* add more traces, remove async from rebalance

* more traces

* avoid computng scores when weight is 0.0

* debug colocation, fix an indent in unsubpeer (minor)

* add full ValidationResult coverage

* store in cache only after validation

* gossip 1.0 fixes

* fix typo

* gossip 10 internal test fixes

* test fixing

* refactor peerstats usages

* populate tables if missing when scoring
2020-12-15 10:25:22 +09:00
Dmitriy Ryajov 4224f12503
fix memory safety issue in tests (#484) 2020-12-14 15:22:53 -06:00
Jacek Sieka 1befeb8c2e
clean up peerid (#470)
* fix dangling cstring on error return
* remove some useless inlines
* less mallocs in shortlog
* proc -> func
* rename test
2020-12-03 13:53:16 -06:00
Dmitriy Ryajov e9d4679059
Race in connection setup (#464)
* check that connection is not closed or eof

* don't release connection lock prematurely

* test that only valid connections can be added

* correct exception type on closed connection

* add clarifying comment

* use closeWithEOF for more stable test

* misc comments

* log stream id in buffestream asserts

* use closeWithEOF to prevent races in tests

* give some time to the remote handler to trigger

* adding more tests to make codecov happy
2020-12-02 19:24:48 -06:00
Dmitriy Ryajov 94e672ead0
allow concurrent closeWithEOF (#466)
* allow concurrent closeWithEOF

* add dedicated closedWithEOF flag
2020-12-01 09:44:21 +01:00
Dmitriy Ryajov 18443dafc1
rework peer event to take an initiator flag (#456)
* rework peer event to take an initiator flag

* use correct direction for initiator
2020-11-28 10:59:47 -06:00
Dmitriy Ryajov a8f5f7a8bb
move dialing logic to it's own proc to avoid try/finally bugs (#461)
* move dialing logic to it's own proc to avoid try/finally bugs

* re-export transport

* lint

* add cancelation test

* test remote conn close on dial
2020-11-28 09:05:12 +01:00
Dmitriy Ryajov 7b1e652224
Allow custom identify agent string (#451)
* allow custom agent version string

* rework tests and add test for custom agent version
2020-11-25 07:42:02 -06:00
Dmitriy Ryajov 351489bfa9
getMuxedStream to more appropriate getStream (#448) 2020-11-24 00:37:45 -06:00
Dmitriy Ryajov 1d16d22f5f
Don't allow concurrent pushdata (#444)
* handle resets properly with/without pushes/reads

* add clarifying comments

* pushEof should also not be concurrent

* move channel reset to bufferstream

this is where the action happens - lpchannel merely redefines how close
is done

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-23 09:07:11 -06:00
Giovanni Petrantoni 93b6c4dc52
Gossip runtime params (#437)
* move gossip parameters to runtime

* internal test fixes

* add missing params

* restore const parameters are soldi base and use them in init

* more constants tuning
2020-11-19 16:48:17 +09:00
Dmitriy Ryajov 92fa4110c1
Rework transport to use chronos accept (#420)
* rework transport to use the new accept api

* use the new chronos primits

* fixup tests to use the new transport api

* handle all exceptions in upgradeIncoming

* master merge

* add multiaddress exception type

* raise appropriate exception on invalida address

* allow retrying on TransportTooManyError

* adding TODO

* wip

* merge master

* add sleep if nil is returned

* accept loop handles all exceptions

* avoid issues with tray/except/finally

* make consistent with master

* cleanup accept loop

* logging

* Update libp2p/transports/tcptransport.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* use Direction enum instead of initiator flag

* use consistent import style

* remove experimental `closeWithEOF()`

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-18 20:06:42 -06:00
Dmitriy Ryajov 8c8d73380f
Re-add connection manager tests (#441)
* use table.getOrDefault()

* re-add missing connection manager tests
2020-11-17 18:48:26 -06:00
Jacek Sieka 74acd0a33a
fix channels not being reset (#439)
* fix channels not being reset

silly for loop..

* allow only one concurrent read
* fix mplex test race condition
* add some bufferstream eof tests

* deadlock, lost data and hung channel fixes

* prevent concurrent `reset` calls
* reset LPChannel when read is cancelled (since data is lost)
* ensure there's one, and one only, 0-byte readOnce on EOF
* ensure that all data is returned before EOF is returned
* keep running activity monitor for half-closed channels (or they never
get closed)
2020-11-17 08:59:25 -06:00
Jacek Sieka 51a0aec058
read failure (#436)
* read failure

Test showing read failure on cancel

* adding one more test case from jacek

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-11-13 12:30:14 -06:00
Jacek Sieka 5e6ec6f422
Revert "failing test"
This reverts commit 5dbed426c2.
2020-11-13 12:26:08 +01:00
Jacek Sieka 5dbed426c2
failing test 2020-11-13 12:24:23 +01:00
Dmitriy Ryajov 55b763264e
Cleanup tests (#435)
* add async testing methods

* refactor with async testing methods

* use iffy in async tests
2020-11-12 21:44:02 -06:00
Dmitriy Ryajov 23ffd1f9f9
remove unnecesary sleeps (#434) 2020-11-12 14:18:17 -06:00
Dmitriy Ryajov da37eee285
Test disconnect from conn event (#432)
* logs

* adding disconnect test in connection events

* adding immediate disconnect from connection event
2020-11-11 13:20:14 -06:00
Dmitriy Ryajov 331961ef14
dont use double allFinished in mplex tests (#427) 2020-11-06 12:20:59 -06:00
Dmitriy Ryajov 4fb3f50d2c
Reset channels on close (#425)
* reset when failed to read/write muxed conn

* add more comprehensive resource cleanup tests

* style

* cleanup tests
2020-11-06 09:24:24 -06:00
Dmitriy Ryajov 3956f3fd69
make sure all streams are tracked (#422)
* make sure all streams are tracked

* revert unnecesary change
2020-11-04 21:52:54 -06:00
Giovanni Petrantoni 7cc42ce219
start adding more tests + minor fixes (#419)
* start adding more tests + minor fixes

* add wrong secure negotiation test

* add noise failed handshake test
2020-11-04 23:24:41 +09:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00
Giovanni Petrantoni bd70515087
fix first interop test (#398) 2020-10-15 10:37:20 +09:00
Giovanni Petrantoni 556213abf4
Extended validators (#395)
* gossip extended validation

* fix flood tests

* fix gossip 1.0 tests

* synthax consistency
2020-10-12 16:56:00 +09:00
Giovanni Petrantoni 98d82fce5c
fix opportunistic graft in internal 11 testing (#390) 2020-10-05 11:35:03 +09:00
Dean Eigenmann 853238a215
feature/expose-matcher (#387)
* exposes matcher

* might work

* fix

* fix
2020-10-02 08:59:15 -06:00
Giovanni Petrantoni 98d0cc3a16
defaultMsgIdProvider alternative/test anonymize (#379)
* defaultMsgIdProvider alternative/test anonymize

* avoid freeze during flood tests

* avoid `empty message, skipping` situation

* test observers

* avoid double initPubSub

* fix gossip testing (specially when anonymize is on)

* make azure tests shorter
2020-09-28 09:11:18 +02:00
Jacek Sieka 58f26d4164
temporarily skip broken test on windows (#375) 2020-09-24 14:50:04 +02:00
Giovanni Petrantoni ec322124ac
allow to omit peerId and seqno (#372)
* allow to omit peerId and seqno

* small refactor

* wip

* fix message encoding

* improve rpc signature logic

* remove peerid from verify

* trace fixes

* fix message test

* fix gossip 1.0 tests
2020-09-23 17:56:33 +02:00
Jacek Sieka 471e5906f6
fix gossipsub memory leak on disconnected peer (#371)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.

* remove dial lock
* run reconnection loop in background task
2020-09-22 09:05:53 +02:00
Jacek Sieka 49a12e619d
channel close race and deadlock fixes (#368)
* channel close race and deadlock fixes

* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes

A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.

* docs, simplification
2020-09-21 19:48:19 +02:00
Giovanni Petrantoni b99d2039a8
Gossip one one (#240)
* allow multiple codecs per protocol (without breaking things)

* add 1.1 protocol to gossip

* explicit peering part 1

* explicit peering part 2

* explicit peering part 3

* PeerInfo and ControlPrune protocols

* fix encodePrune

* validated always, even explicit peers

* prune by score (score is stub still)

* add a way to pass parameters to gossip

* standard setup fixes

* take into account explicit direct peers in publish

* add floodPublish logic

* small fixes, publish still half broken

* make sure to waitsub in sparse test

* use var semantics to optimize table access

* wip... lvalues don't work properly sadly...

* big publish refactor, replenish and balance

* fix internal tests

* use g.peers for fanout (todo: don't include flood peers)

* exclude non gossip from fanout

* internal test fixes

* fix flood tests

* fix test's trypublish

* test interop fixes

* make sure to not remove peers from gossip table

* restore old replenishFanout

* cleanups

* restore utility module import

* restore trace vs debug in gossip

* improve fanout replenish behavior further

* triage publish nil peers (issue is on master too but just hidden behind a if/in)

* getGossipPeers fixes

* remove topics from pubsubpeer (was unused)

* simplify rebalanceMesh (following spec) and make it finally reach D_high

* better diagnostics

* merge new pubsubpeer, copy 1.1 to new module

* fix up merge

* conditional enable gossip11 module

* add back topics in peers, re-enable flood publish

* add more heartbeat locking to prevent races

* actually lock the heartbeat

* minor fixes

* with sugar

* merge 1.0

* remove assertion in publish

* fix multistream 1.1 multi proto

* Fix merge oops

* wip

* fix gossip 11 upstream

* gossipsub11 -> gossipsub

* support interop testing

* tests fixing

* fix directchat build

* control prune updates (pb)

* wip parameters

* gossip internal tests fixes

* parameters wip

* finishup with params

* cleanups/wip

* small sugar

* grafted and pruned procs

* wip updateScores

* wip

* fix logging issue

* pubsubpeer, chronicles explicit override

* fix internal gossip tests

* wip

* tables troubleshooting

* score wip

* score wip

* fixes

* fix test utils generateNodes

* don't delete while iterating in score update

* fix grafted defect

* add a handleConnect in subscribeTopic

* pruning improvements

* wip

* score fixes

* post merge - builds gossip tests

* further merge fixes

* rebalance improvements and opportunistic grafting

* fix test for now

* restore explicit peering

* implement peer exchange graft message

* add an hard cap to PX

* backoff time management

* IWANT cap/budget

* Adaptive gossip dissemination

* outbound mesh quota, internal tests fixing

* oversub prune score based, finish outbound quota

* finishup with score and ihave budget

* use go daemon 0.3.0

* import fixes

* byScore cleanup score sorting

* remove pointless scaling in `/` Duration operator

* revert using libp2p org for daemon

* interop fixes

* fixes and cleanup

* remove heartbeat assertion, minor debug fixes

* logging improvements and cleaning up

* (to revert) add some traces

* add explicit topic to gossip rpcs

* pubsub merge fixes and type fix in switch

* Revert "(to revert) add some traces"

This reverts commit 4663eaab6c.

* cleanup some now irrelevant todo

* shuffle peers anyway as score might be disabled

* add missing shuffle

* old merge fix

* more merge fixes

* debug improvements

* re-enable gossip internal tests

* add gossip10 fallback (dormant but tested)

* split gossipsub internal tests into 1.0 and 1.1

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-09-21 11:16:29 +02:00
Dmitriy Ryajov b0d86b95dd
add peer lifecycle events (#357)
* add peer lifecycle events

* rework peer events to not use connection events

* don't use result in pubsub and switch init

* wip

* use ordered hashes and remove logscope

* logging

* add missing test

* small fixes
2020-09-15 14:19:22 -06:00
Jacek Sieka 0db45462cd
mplex fixes (#362)
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
2020-09-14 10:19:54 +02:00
Jacek Sieka 96d4c44fec
refactor bufferstream to use a queue (#346)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.

When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.

In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.

* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
2020-09-10 08:19:13 +02:00
Jacek Sieka 63b38734bd
fix poor performance in LRU cache (#360)
it turns out (in NBC) a heap is sufficiently slow becuase of all the
deletes that it makes more sense to go with a linked list
2020-09-09 18:28:46 +02:00
Jacek Sieka c1856fda53
simplify and unify logging (#353)
* use short format for logging peerid
* log peerid:oid for connections
2020-09-06 10:31:47 +02:00
Jacek Sieka 5819c6a9a7
gossipsub / floodsub fixes (#348)
* mcache fixes

* remove timed cache - the window shifting already removes old messages
* ref -> object
* avoid unnecessary allocations with `[]` operator

* simplify init

* fix several gossipsub/floodsub issues

* floodsub, gossipsub: don't rebroadcast messages that fail validation
(!)
* floodsub, gossipsub: don't crash when unsubscribing from unknown
topics (!)
* gossipsub: don't send message to peers that are not interested in the
topic, when messages don't share topic list
* floodsub: don't repeat all messages for each message when
rebroadcasting
* floodsub: allow sending empty data
* floodsub: fix inefficient unsubscribe
* sync floodsub/gossipsub logging
* gossipsub: include incoming messages in mcache (!)
* gossipsub: don't rebroadcast already-seen messages (!)
* pubsubpeer: remove incoming/outgoing seen caches - these are already
handled in gossipsub, floodsub and will cause trouble when peers try to
resubscribe / regraft topics (because control messages will have same
digest)
* timedcache: reimplement without timers (fixes timer leaks and extreme
inefficiency due to per-message closures, futures etc)
* timedcache: ref -> obj
2020-09-04 08:10:32 +02:00
Jacek Sieka cd1c68dbc5
avoid send deadlock by not allowing send to block (#342)
* avoid send deadlock by not allowing send to block

* handle message issues more consistently
2020-09-01 09:33:03 +02:00
Jacek Sieka f46bf0faa4
remove send lock (#334)
* remove send lock

When mplex receives data it will block until a reader has processed the
data. Thus, when a large message is received, such as a gossipsub
subscription table, all of mplex will be blocked until all reading is
finished.

However, if at the same time a `dial` to establish a gossipsub send
connection is ongoing, that `dial` will be blocked because mplex is no
longer reading data - specifically, it might indeed be the connection
that's processing the previous data that is waiting for a send
connection.

There are other problems with the current code:
* If an exception is raised, it is not necessarily raised for the same
connection as `p.sendConn`, so resetting `p.sendConn` in the exception
handling is wrong
* `p.isConnected` is checked before taking the lock - thus, if it
returns false, a new dial will be started. If a new task enters `send`
before dial is finished, it will also determine `p.isConnected` is
false, then get stuck on the lock - when the previous task finishes and
releases the lock, the new task will _also_ dial and thus reset
`p.sendConn` causing a leak.

* prefer existing connection

simplifies flow
2020-08-17 12:38:27 +02:00
Jacek Sieka ab864fc747
logging cleanups and small fixes (#331) 2020-08-15 21:50:31 +02:00
Dmitriy Ryajov d1f1e1b31e
add missing mplex half closed test (#326) 2020-08-12 07:23:49 +02:00