355 Commits

Author SHA1 Message Date
Jacek Sieka e285d8bbf4
mem usage cleanups for pubsub (#564)
In `async` functions, a closure environment is created for variables
that cross an await boundary. This closure environment is kept in
memory for the lifetime of the associated future, which means that
although _some_ variables are no longer used, they still take up memory
for a long time.

In Nimbus, message validation is processed in batches, so the future
of an incoming gossip message stays around for quite a while. This
leads to memory consumption peaks of 100-200 MB when there are many
attestations in the pipeline.

To avoid excessive memory usage, it's generally better to move non-async
code into procs so that the variables therein can be released earlier.
This includes the many hidden variables introduced by macro and
template expansion (for example chronicles, which does expensive
exception handling).

* move seen table salt to floodsub, use there as well
* shorten seen table salt to size of hash
* avoid unnecessary memory allocations and copies in a few places
* factor out message scoring
* avoid reencoding outgoing message for every peer
* keep checking validators until reject (in case there's both reject and
ignore)
* `readOnce` avoids `readExactly` overhead for single-byte read
* genericAssign -> assign2
2021-04-18 10:08:33 +02:00
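
The effect described in the #564 message (locals that cross an await stay alive as long as the future does) is not specific to Nim. The sketch below is a rough analogy in Python's `asyncio`, not the Nim code from the commit: `Payload`, `measure`, `handle_inline` and `handle_factored` are hypothetical names invented for illustration. It compares a coroutine that keeps a temporary as its own local with one that confines the temporary to a non-async helper, and uses weak references to check which temporary is still alive while both coroutines are suspended.

```python
import asyncio
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large temporary (e.g. a decoded message)."""

    def __init__(self, size):
        self.data = bytearray(size)


def measure(size, refs):
    # Non-async helper: `payload` is a local of this call only, so it can be
    # released as soon as the helper returns.
    payload = Payload(size)
    refs.append(weakref.ref(payload))
    return len(payload.data)


async def handle_inline(gate, refs):
    # `payload` is a local of the coroutine itself. It is stored in the
    # coroutine frame and survives the await, although it is not used again.
    payload = Payload(1024)
    refs.append(weakref.ref(payload))
    n = len(payload.data)
    await gate.wait()
    return n


async def handle_factored(gate, refs):
    # Same work, but the temporary never becomes a local of the coroutine.
    n = measure(1024, refs)
    await gate.wait()
    return n


async def main():
    gate = asyncio.Event()
    inline_refs, factored_refs = [], []
    t1 = asyncio.ensure_future(handle_inline(gate, inline_refs))
    t2 = asyncio.ensure_future(handle_factored(gate, factored_refs))
    await asyncio.sleep(0)  # let both tasks run up to their await
    gc.collect()
    alive_inline = inline_refs[0]() is not None
    alive_factored = factored_refs[0]() is not None
    gate.set()  # unblock both tasks so they can finish
    await asyncio.gather(t1, t2)
    return alive_inline, alive_factored


result = asyncio.run(main())
print(result)
```

On CPython the first flag should be true and the second false: only the coroutine that held the temporary as its own local keeps it alive while suspended. The Nim change follows the same idea by moving non-async work into separate procs.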
Giovanni Petrantoni 795a651839
use a builder pattern to build the switch (#551)
* use a builder pattern to build the switch

* with with

* more refs
2021-04-02 10:20:51 +09:00
Giovanni Petrantoni 03bbdd2261
Revisit Floodsub (#543)
Fixes #525

add coverage to unsubscribeAll and testing
2021-03-15 13:16:03 -06:00
Jacek Sieka 70deac9e0d
fix peer score accumulation (#541)
* fix accumulating peer score
* fix missing exception handling
* remove unnecessary initHashSet/initTable calls
* simplify peer stats management
* clean up tests a little
* fix some missing raises annotations
2021-03-09 13:22:52 +01:00
Giovanni Petrantoni 02ad017107
Gossipsub fixes and Initiator flagging fixes (#539)
* properly propagate initiator information for gossipsub

* Fix pubsubpeer lifetime management

* restore old behavior

* tests fixing

* clamp backoff time value received

* fix member name collisions

* internal test fixes

* better names and explaining of the importance of transport direction

* fixes
2021-03-03 08:23:40 +09:00
Giovanni Petrantoni 45300c28a9
[SEC] gossipsub - handleIHAVE/handleIWANT recommendations & notes (#535)
Fixes #400
2021-02-26 14:27:42 +09:00
Giovanni Petrantoni d7469b2286
[SEC] gossipsub - a peer score is not retained up until expiry if abusive peer unsubscribes (#534)
* [SEC] gossipsub - a peer score is not retained up until expiry if abusive peer unsubscribes
Fixes #402

* remove debug logging
2021-02-26 14:15:58 +09:00
Giovanni Petrantoni eac6cd3dbf
Debt: cleanup warnings #426 (#536)
* testswitch cleanups

* Debt: cleanup warnings
Fixes #426
2021-02-25 09:24:49 -06:00
Giovanni Petrantoni 8236319a91
[SEC] gossipsub - handleGraft/handlePrune recommendations & notes (#530)
Fixes #401
2021-02-22 12:04:20 +09:00
Giovanni Petrantoni 1368bf7ecb
test rebalanceMesh with low score peers (#529)
2021-02-22 10:05:25 +09:00
Giovanni Petrantoni 51d8cd4ade
[SEC] gossipsub - rebalanceMesh may prune up to D_lo on oversubscription (#531)
Fixes #403
2021-02-13 13:39:32 +09:00
Giovanni Petrantoni e124e342b0
n subscription limits (#528)
* subscription high water, cleanups

* subscription limits test

* newline
2021-02-12 12:27:26 +09:00
Dmitriy Ryajov 2658181df9
Merge unstable (#518)
* Address Book POC implementation (#499)

* Address Book POC implementation

* Feat/peerstore impl (#505)

Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com>
2021-02-08 15:16:23 -06:00
Dmitriy Ryajov 4dea23c394
Remove secio usage and cleanup exports (#519)
* cleaned up exports

* remove secio use

* added more useful exports

* proper import
2021-02-08 14:33:34 -06:00
Dmitriy Ryajov fb493d1a4a
Connection limits tests (#509)
* connection limit tests

* remove use of secio

* check that upgraded fut is not nil

* rebuild
2021-01-27 21:27:33 -06:00
Dmitriy Ryajov 0959877b29
Connection limits (#384)
* master merge

* wip

* avoid deadlocks

* tcp limits

* expose client field in chronosstream

* limit incoming connections

* update with new listen api

* fix release

* don't override peerinfo in connection

* rework transport with accept

* use semaphore to track resource usage

* rework with new transport accept api

* move events to conn manager (#373)

* use semaphore to track resource usage

* merge master

* expose api to acquire conn slots

* don't fail expensive metrics

* allow tracking and updating connections

* set global connection limits to 80

* add per peer connection limits

* make sure conn is closed if tracking failed

* more descriptive naming for handle

* rework with new transport accept api

* add `getStream` hide `selectConn`

* add TransportClosedError

* make nil explicit

* don't make unnecessary copies of message

* logging

* error handling

* cleanup semaphore

* track connections properly

* throw `TooManyConnections` when tracking outgoing

* use proper exception and handle conventions

* check onCloseHandle for nil

* revert internalConnect changes

* adding upgraded flag

* await stream before closing

* simplify tracking

* wip

* logging

* split connection limits into incoming and outgoing

* further streamline connection limits split counts

* don't use closeWithEOF

* move peer and conn event triggers from switch

* wip

* wip

* wip

* merge master

* handle nil connections properly

* add clarifying comment

* don't raise exc on nil

* no finally

* add proper min/max connections logic

* rebase master

* merge master

* master merge

* remove request timeout

should be addressed in separate PR

* merge master

* share semaphore when in/out limits aren't enforced

* merge master

* use import

* pass semaphore to trackConn

* don't close last conn

* use storeConn

* merge master

* use storeConn
2021-01-20 22:00:24 -06:00
Giovanni Petrantoni 240ec84ffb
Gossipsub wip (#502)
* Remove unused connections in pubsubpeer, also removed wrong usages, add a disconnect bad peers parameter

* handle exceptions in disconnectPeer

* small fix

* use the proper disconnection procedure for gossip peers

* fixes, more metrics, add test about disconnection

* hot fix possible null pointers in switch

* silly isnil sugar

* Fix and test gossip directPeer connections
2021-01-15 13:48:03 +09:00
Dmitriy Ryajov 3878a95b23
Semaphore cancellations (#503)
* add proper cancellation handling

* remove cancelled futures explicitly

* use fifo to keep proper order

* add out of order cancellations test

* make count public

* use `new` instead of `init`

* remove private `queue` from tests

* expose count as a readonly prop

* use `delete()` to preserve seq order
2021-01-14 10:11:12 +01:00
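
The bullets of #503 (FIFO ordering, explicit removal of cancelled futures, a read-only `count`) describe a common pattern for async semaphores. The following is a minimal sketch of that pattern in Python's `asyncio`, offered as an analogy under stated assumptions rather than as the library's actual implementation: `FifoSemaphore` and all of its members are hypothetical names, and the real Nim code may differ in every detail.

```python
import asyncio
from collections import deque


class FifoSemaphore:
    """Hypothetical counting semaphore that serves waiters in arrival order."""

    def __init__(self, size):
        self._size = size
        self._count = size      # number of free slots
        self._queue = deque()   # pending waiters, oldest first

    @property
    def count(self):
        # Exposed read-only, so callers can inspect but not modify it.
        return self._count

    async def acquire(self):
        if self._count > 0:
            self._count -= 1
            return
        fut = asyncio.get_running_loop().create_future()
        self._queue.append(fut)
        try:
            await fut
        except asyncio.CancelledError:
            if fut.done() and not fut.cancelled():
                # A slot was already handed to this waiter; pass it on.
                self.release()
            else:
                # Remove the cancelled waiter explicitly so that it cannot
                # swallow a later release.
                try:
                    self._queue.remove(fut)
                except ValueError:
                    pass
            raise

    def release(self):
        # Hand the slot to the oldest waiter that is still pending.
        while self._queue:
            fut = self._queue.popleft()
            if not fut.done():
                fut.set_result(None)
                return
        if self._count < self._size:
            self._count += 1


async def demo():
    sem = FifoSemaphore(1)
    await sem.acquire()
    order = []

    async def worker(name):
        await sem.acquire()
        order.append(name)
        sem.release()

    tasks = [asyncio.ensure_future(worker(n)) for n in ("a", "b", "c")]
    await asyncio.sleep(0)  # all three workers are now queued
    tasks[1].cancel()       # cancel a waiter in the middle of the queue
    await asyncio.sleep(0)  # let the cancellation propagate
    sem.release()
    await asyncio.gather(*tasks, return_exceptions=True)
    return order, sem.count


order, count = asyncio.run(demo())
print(order, count)
```

In this sketch the cancelled middle waiter is dropped from the queue, the remaining waiters run in arrival order, and the slot count returns to its initial value once everyone has finished.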
Giovanni Petrantoni dc48170b0d
Gossip subscription improvements (#497)
* salt ids in seen table

* add subscription validation callback and avoid processing topics we don't care about

* apply penalty on bad subscription

* fix IHave handling IDs

* reduce indenting, add some comments

* fix gossip randombytes generation

* do not descore unwanted topics (might happen, due to timing, needs improvements)

* cleaning up and added tests

* validate subscriptions only when subscribing

* set notice level for failed publish

* fix floodsub behavior
2021-01-13 23:49:44 +09:00
Giovanni Petrantoni 87be2c7f1f
make switch tests less sensitive to time (#501)
* make switch tests less sensitive to time

* missing new line
2021-01-12 10:26:48 +09:00
Dmitriy Ryajov b2ea5a3c77
Concurrent upgrades (#489)
* adding an upgraded event to conn

* set stopped flag asap

* trigger upgraded event on conn

* set concurrency limit for accepts

* backporting semaphore from tcp-limits2

* export unittests module

* make params explicit

* tone down debug logs

* adding semaphore tests

* use semaphore to throttle concurrent upgrades

* add libp2p scope

* trigger upgraded event before any other events

* add event handler for connection upgrade

* cleanup upgraded event on conn close

* make upgrades slot release robust

* don't forget to release slot on nil connection

* misc

* make sure semaphore is always released

* minor improvements and a nil check

* removing unneeded comment

* make upgradeMonitor a non-closure proc

* make sure the `upgraded` event is initialized

* handle exceptions in accepts when stopping

* don't leak exceptions when stopping accept loops
2021-01-04 12:59:05 -06:00
Giovanni Petrantoni 4858e0ab15
Gossipsub refactor pt2 (#495)
* add sub/unsub test

* remove unused variable from gossip
2020-12-20 00:45:34 +09:00
Giovanni Petrantoni 05e789a34f
Gossipsub refactor (#490)
* refactor peerStats, re-enable scores for testing

* remove gossip 1.0

* cleanup

* codecov matrix fixes

* restore previous score on onNewPeer

* fix coverage n checks

* unsubscribeAll gossipsub fixes

* refactor unsub/sub

* refactor onNewPeer and fix score flow

* disable scores by default (change in tests later)

* fix tests, enable scores in tests

* fix wrongly merged test

* ensure topic removal from topics table

* small typo fix

* testinterop fixes
2020-12-19 15:43:32 +01:00
Giovanni Petrantoni 5543f6681f
first pass, use results for Cid module (#480)
* first pass, use results for Cid module

* improvements to decode
2020-12-15 14:19:18 +01:00
Dmitriy Ryajov a990fe95a0
Fixing range error introduced in v1.2.8 (#485)
2020-12-15 06:58:38 +01:00
Giovanni Petrantoni f8f0bc1bd8
Gossip improvements (#476)
* add more traces, remove async from rebalance

* more traces

* avoid computing scores when weight is 0.0

* debug colocation, fix an indent in unsubpeer (minor)

* add full ValidationResult coverage

* store in cache only after validation

* gossip 1.0 fixes

* fix typo

* gossip 1.0 internal test fixes

* test fixing

* refactor peerstats usages

* populate tables if missing when scoring
2020-12-15 10:25:22 +09:00
Dmitriy Ryajov 4224f12503
fix memory safety issue in tests (#484)
2020-12-14 15:22:53 -06:00
Jacek Sieka 1befeb8c2e
clean up peerid (#470)
* fix dangling cstring on error return
* remove some useless inlines
* less mallocs in shortlog
* proc -> func
* rename test
2020-12-03 13:53:16 -06:00
Dmitriy Ryajov e9d4679059
Race in connection setup (#464)
* check that connection is not closed or eof

* don't release connection lock prematurely

* test that only valid connections can be added

* correct exception type on closed connection

* add clarifying comment

* use closeWithEOF for more stable test

* misc comments

* log stream id in bufferstream asserts

* use closeWithEOF to prevent races in tests

* give some time to the remote handler to trigger

* adding more tests to make codecov happy
2020-12-02 19:24:48 -06:00
Dmitriy Ryajov 94e672ead0
allow concurrent closeWithEOF (#466)
* allow concurrent closeWithEOF

* add dedicated closedWithEOF flag
2020-12-01 09:44:21 +01:00
Dmitriy Ryajov 18443dafc1
rework peer event to take an initiator flag (#456)
* rework peer event to take an initiator flag

* use correct direction for initiator
2020-11-28 10:59:47 -06:00
Dmitriy Ryajov a8f5f7a8bb
move dialing logic to its own proc to avoid try/finally bugs (#461)
* move dialing logic to its own proc to avoid try/finally bugs

* re-export transport

* lint

* add cancellation test

* test remote conn close on dial
2020-11-28 09:05:12 +01:00
Dmitriy Ryajov 7b1e652224
Allow custom identify agent string (#451)
* allow custom agent version string

* rework tests and add test for custom agent version
2020-11-25 07:42:02 -06:00
Dmitriy Ryajov 351489bfa9
getMuxedStream to more appropriate getStream (#448)
2020-11-24 00:37:45 -06:00
Dmitriy Ryajov 1d16d22f5f
Don't allow concurrent pushdata (#444)
* handle resets properly with/without pushes/reads

* add clarifying comments

* pushEof should also not be concurrent

* move channel reset to bufferstream

this is where the action happens - lpchannel merely redefines how close
is done

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-23 09:07:11 -06:00
Giovanni Petrantoni 93b6c4dc52
Gossip runtime params (#437)
* move gossip parameters to runtime

* internal test fixes

* add missing params

* restore const parameters as solid base and use them in init

* more constants tuning
2020-11-19 16:48:17 +09:00
Dmitriy Ryajov 92fa4110c1
Rework transport to use chronos accept (#420)
* rework transport to use the new accept api

* use the new chronos primitives

* fixup tests to use the new transport api

* handle all exceptions in upgradeIncoming

* master merge

* add multiaddress exception type

* raise appropriate exception on invalid address

* allow retrying on TransportTooManyError

* adding TODO

* wip

* merge master

* add sleep if nil is returned

* accept loop handles all exceptions

* avoid issues with try/except/finally

* make consistent with master

* cleanup accept loop

* logging

* Update libp2p/transports/tcptransport.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* use Direction enum instead of initiator flag

* use consistent import style

* remove experimental `closeWithEOF()`

Co-authored-by: Jacek Sieka <jacek@status.im>
2020-11-18 20:06:42 -06:00
Dmitriy Ryajov 8c8d73380f
Re-add connection manager tests (#441)
* use table.getOrDefault()

* re-add missing connection manager tests
2020-11-17 18:48:26 -06:00
Jacek Sieka 74acd0a33a
fix channels not being reset (#439)
* fix channels not being reset

silly for loop..

* allow only one concurrent read
* fix mplex test race condition
* add some bufferstream eof tests

* deadlock, lost data and hung channel fixes

* prevent concurrent `reset` calls
* reset LPChannel when read is cancelled (since data is lost)
* ensure there's one, and one only, 0-byte readOnce on EOF
* ensure that all data is returned before EOF is returned
* keep running activity monitor for half-closed channels (or they never
get closed)
2020-11-17 08:59:25 -06:00
Jacek Sieka 51a0aec058
read failure (#436)
* read failure

Test showing read failure on cancel

* adding one more test case from jacek

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-11-13 12:30:14 -06:00
Jacek Sieka 5e6ec6f422
Revert "failing test"
This reverts commit 5dbed426c2.
2020-11-13 12:26:08 +01:00
Jacek Sieka 5dbed426c2
failing test
2020-11-13 12:24:23 +01:00
Dmitriy Ryajov 55b763264e
Cleanup tests (#435)
* add async testing methods

* refactor with async testing methods

* use iffy in async tests
2020-11-12 21:44:02 -06:00
Dmitriy Ryajov 23ffd1f9f9
remove unnecessary sleeps (#434)
2020-11-12 14:18:17 -06:00
Dmitriy Ryajov da37eee285
Test disconnect from conn event (#432)
* logs

* adding disconnect test in connection events

* adding immediate disconnect from connection event
2020-11-11 13:20:14 -06:00
Dmitriy Ryajov 331961ef14
don't use double allFinished in mplex tests (#427)
2020-11-06 12:20:59 -06:00
Dmitriy Ryajov 4fb3f50d2c
Reset channels on close (#425)
* reset when failed to read/write muxed conn

* add more comprehensive resource cleanup tests

* style

* cleanup tests
2020-11-06 09:24:24 -06:00
Dmitriy Ryajov 3956f3fd69
make sure all streams are tracked (#422)
* make sure all streams are tracked

* revert unnecessary change
2020-11-04 21:52:54 -06:00
Giovanni Petrantoni 7cc42ce219
start adding more tests + minor fixes (#419)
* start adding more tests + minor fixes

* add wrong secure negotiation test

* add noise failed handshake test
2020-11-04 23:24:41 +09:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00