Commit Graph

297 Commits

Author SHA1 Message Date
Tanguy 7323ecc9c4
Optimize rebalanceMesh (#708) 2022-05-25 12:59:33 +02:00
Tanguy 60becadcf9
Peer store refacto (#700)
There is now a global PeerStore structure (instead of having one for libp2p, one for waku, etc)

The user can create custom books for new types easily

Also add a pruning system to remove dead peers
2022-05-25 12:12:57 +02:00
Tanguy 991549f391
Gossipsub scoring fixes (#709)
* Use decayInterval as a scoring heartbeat period
* Take mesh delivery window into account
2022-05-11 10:38:43 +02:00
lchenut 32ca1898d9
Gossipsub: Put Peer Exchange behind a flag (#715)
Add a flag to enable Peer Exchange in Gossipsub (disabled by default)
2022-05-10 10:39:43 +02:00
Tanguy c97befb387
Add tests for gossipsub direct peers (#707) 2022-04-12 14:03:31 +00:00
Csaba Kiraly 9973b9466d
expose more libp2p performance and queuing metrics (#678)
* gossipsub: adding duplicate arrival metrics

Adding counters for received deduplicated messages and for
duplicates recognized by the seen cache. Note that duplicates that
are not recognized (arrive after seenTTL) are not counted as
duplicates here either.

* gossipsub: adding mcache (message cache for responding IWANT) stats

It is generally assumed that IWANT messages arrive when mcache still
has the message. These stats are to verify this assumption.

* libp2p: adding internal TX queuing stats

Messages are queued in TX before getting written on the stream,
but we have no statistics about these queues. This patch adds
some queue length and queuing time related statistics.

* adding Grafana libp2p dashboard

Adding Grafana dashboard with newly exposed metrics.

* enable libp2p_mplex_metrics in nimble test

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
2022-04-06 16:00:24 +02:00
Tanguy c7504d2446
Gossipsub peer exchange (#647)
* Signed envelopes and routing records
* Send signed peer record as part of identify (#649)
* Add SPR from identify to new peer book (#657)
* Send & receive gossipsub PX
* Add Signed Payload

Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com>
2022-03-14 09:39:30 +01:00
Eric Mastro 44a7260f07
fixes from #688 (#697)
* tests: invert message logic on expect from #688
* fix: export pubsub_errors for backward compatibility
2022-02-24 17:32:20 +01:00
Tanguy fd59cbc7a9
Fix shuffle of #638 2022-02-21 17:00:18 +01:00
Tanguy bc318084f4
GS: Publish to fanout when mesh unhealthy (#638)
* Send to fanout when mesh unhealthy

* don't use fanout when floodPublish
2022-02-21 16:22:08 +01:00
Eric Mastro 3b718baa97
feat: allow msgIdProvider to fail (#688)
* feat: allow msgIdProvider to fail

Closes: #642.

Changes the return type of the msgIdProvider to `Result[MessageID, string]` so that message id generation can fail.

String error type was chosen as this `msgIdProvider` mainly because the failed message id generation drops the message and logs the error provided. Because `msgIdProvider` can be externally provided by library consumers, an enum didn’t make sense and a object seemed to be overkill. Exceptions could have been used as well, however, in this case, Result ergonomics were warranted and prevented wrapping quite a large block of code in try/except.

The `defaultMsgIdProvider` function previously allowed message id generation to fail silently for use in the tests: when seqno or source peerid were not valid, the message id generated was based on a hash of the message data and topic ids. The silent failing was moved to the `defaultMsgIdProvider` used only in the tests so that it could not fail silently in applications.

Unit tests were added for the `defaultMsgIdProvider`.

* Change MsgIdProvider error type to ValidationResult
2022-02-21 16:04:17 +01:00
Tanguy c18830ad33
Score correctly on mesh peer unsub (#644)
* Score correctly on mesh peer unsub
* remove from mesh before removing from gossipsub
2022-01-15 12:47:41 +01:00
Tanguy 1a97d0a2f5
Validate pubsub subscriptions (#627)
* Check topic before subscribing
* Block subscribe to invalid topics
2022-01-14 12:40:30 -06:00
Tanguy fb0d10b6fd
Gossipsub: process messages concurrently (#680)
* Gossip sub: process messages concurrently

* Retries for flaky test
2021-12-27 11:17:00 +01:00
Tanguy df566e69db
Fixes for style check (#676) 2021-12-16 11:05:20 +01:00
Tanguy 47a35e26d7
Typo: s/unsubcribeBackoff/unsubscribeBackoff (#675) 2021-12-14 10:50:57 +01:00
Tanguy 6893bd9dbb
Customizable gossipsub backoff on unsubscribe (#665)
* Customizable gossipsub backoff on unsubscribe
* change default to 5s
2021-12-02 14:47:40 +00:00
Tanguy 6f779c47c8
Gossipsub: don't send to peers seen during validation (#648)
* Gossipsub: don't send to peers seen during validation

* Less error prone code

* add metric

* Fix metric

* remove dangling code test

* address comments

* don't allocate memory
2021-11-14 09:08:05 +01:00
Tanguy 5885e03680
Add maxMessageSize option to pubsub (#634)
* Add maxMessageSize option to pubsub
* Switched default to 1mb
2021-10-25 12:58:38 +02:00
Tanguy 846baf3853
Various cleanups part 1 (#632)
* raise -> raise exc
* replace stdlib random with bearssl
* object init -> new
* Remove deprecated procs
* getMandatoryField
2021-10-25 10:26:32 +02:00
Tanguy 1b2cdd6aec
Merge branch 'master' into unstable 2021-09-09 13:22:45 +02:00
Menduist d02735dc46
Remove peer info (#610)
Peer Info is now for local peer data only.
For other peers info, use the peer store.

Previous reference to peer info are replaced with the peerid
2021-09-08 11:07:46 +02:00
Tanguy Cizain 8cddfde837
Rename getKey -> getPublicKey (#621)
* rename getKey to getPublicKey

* use publicKey directly in gossipsub

* update error messages
2021-09-02 12:03:40 +02:00
Tanguy Cizain 8ea90037e5
Fix gossipsub incoming graft backoff (#616)
* Fix gossipsub incoming graft backoff

* Improve debug messages

* clamp to 24h
2021-09-01 08:41:11 +02:00
Tanguy Cizain 93156447ba
Peer Store implement part II (#586)
* Connect & Peer event handlers now receive a peerinfo

* small peerstore refacto

* implement peerstore in switch

* changed PeerStore to final ref object

* revert libp2p/builders.nim
2021-06-08 18:55:24 +02:00
Tanguy Cizain caac8191d2
Change newXXXX procs to XXXX.new (#585)
* newBufferStream -> BufferStream.new

* newMultistream -> MultistreamSelect.new

* newSecio -> Secio.new

* newNoise -> Noise.new

* newPlainText -> PlainText.new

* newPubSubPeer -> PubSubPeer.new

* newIdentify -> Identify.new

* newMuxerProvider -> MuxerProvider.new
2021-06-07 09:32:08 +02:00
Dmitriy Ryajov ce42674d80
avoid memory safety errors with nim 1.4.x 2021-06-02 12:26:01 -06:00
Dmitriy Ryajov 1c3616e3a5
merge latest master 2021-06-02 12:25:36 -06:00
Dmitriy Ryajov c949f14a99
Merge master to unstable (#570)
* Revisit Floodsub (#543)

Fixes #525

add coverage to unsubscribeAll and testing

* add mounted protos to identify message (#546)

* add stable/unstable auto bumps

* fix auto-bump CI

* merge nbc auto bump with CI in order to bump only on CI success

* put conditional locks on nbc bump (#549)

* Fix minor exception issues (#550)

Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.

* fix nimbus ref for auto-bump stable's PR

* use a builder pattern to build the switch (#551)

* use a builder pattern to build the switch

* with with

* more refs

* builders (#559)

* More builders (#560)

* address some issues pointed out in review

* re-add to prevent breaking other projects

* mem usage cleanups for pubsub (#564)

In `async` functions, a closure environment is created for variables
that cross an await boundary - this closure environment is kept in
memory for the lifetime of the associated future - this means that
although _some_ variables are no longer used, they still take up memory
for a long time.

In Nimbus, message validation is processed in batches meaning the future
of an incoming gossip message stays around for quite a while - this
leads to memory consumption peaks of 100-200 mb when there are many
attestations in the pipeline.

To avoid excessive memory usage, it's generally better to move non-async
code into proc's such that the variables therein can be released earlier
- this includes the many hidden variables introduced by macro and
template expansion (ie chronicles that does expensive exception
handling)

* move seen table salt to floodsub, use there as well
* shorten seen table salt to size of hash
* avoid unnecessary memory allocations and copies in a few places
* factor out message scoring
* avoid reencoding outgoing message for every peer
* keep checking validators until reject (in case there's both reject and
ignore)
* `readOnce` avoids `readExactly` overhead for single-byte read
* genericAssign -> assign2

* More gossip coverage (#553)

* add floodPublish test

* test delivery via control Iwant/have mechanics

* fix issues in control, and add testing

* fix possible backoff issue with pruned routine overriding it

* fix control messages (#566)

* remove unused control graft check in handleControl

* avoid sending empty Iwant messages

* Split dialer (#542)

* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments

* add base exception class and fix hierarchy

* fix imports

* Merge master (#555)

* Revisit Floodsub (#543)

Fixes #525

add coverage to unsubscribeAll and testing

* add mounted protos to identify message (#546)

* add stable/unstable auto bumps

* fix auto-bump CI

* merge nbc auto bump with CI in order to bump only on CI success

* put conditional locks on nbc bump (#549)

* Fix minor exception issues (#550)

Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.

* fix nimbus ref for auto-bump stable's PR

* Split dialer (#542)

* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments

* add base exception class and fix hierarchy

* fix imports

* `doAssert` is `ValueError` not `AssertionError`?

* revert back to `AssertionError`

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>

* Builders (#558)

* use a builder pattern to build the switch (#551)

* use a builder pattern to build the switch

* with with

* more refs

* Merge master (#555)

* Revisit Floodsub (#543)

Fixes #525

add coverage to unsubscribeAll and testing

* add mounted protos to identify message (#546)

* add stable/unstable auto bumps

* fix auto-bump CI

* merge nbc auto bump with CI in order to bump only on CI success

* put conditional locks on nbc bump (#549)

* Fix minor exception issues (#550)

Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.

* fix nimbus ref for auto-bump stable's PR

* Split dialer (#542)

* extracting dialing logic to dialer

* exposing upgrade methods on transport

* cleanup

* fixing tests to use new interfaces

* add comments

* add base exception class and fix hierarchy

* fix imports

* `doAssert` is `ValueError` not `AssertionError`?

* revert back to `AssertionError`

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>

* `doAssert` is `ValueError` not `AssertionError`?

* revert back to `AssertionError`

* fix builders

* more builder stuff

* more builders

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>

Co-authored-by: Giovanni Petrantoni <7008900+sinkingsugar@users.noreply.github.com>
Co-authored-by: Jacek Sieka <jacek@status.im>
2021-06-02 12:24:46 -06:00
Dmitriy Ryajov 3da656687b
use LPError more consistently (#582)
* use LPError more consistently

* don't use Exceptino

* annotate with raises

* don't panic on concatenation

* further rework error handling
2021-06-02 15:39:10 +02:00
Dmitriy Ryajov 1b94c3feda
fix #581 (#583) 2021-06-01 18:06:08 -06:00
Dmitriy Ryajov ac4e060e1a
adding raises defect across the codebase (#572)
* adding raises defect across the codebase

* use unittest2

* add windows deps caching

* update mingw link

* die on failed peerinfo initialization

* use result.expect instead of get

* use expect more consistently and rework inits

* use expect more consistently

* throw on missing public key

* remove unused closure annotation

* merge master
2021-05-21 10:27:01 -06:00
Jacek Sieka 83a20a992a
gossipsub: unsubscribe fixes (#569)
* gossipsub: unsubscribe fixes

* fix KeyError when updating metric of unsubscribed topic
* fix unsubscribe message not being sent to all peers causing them to
keep thinking we're still subscribed
* release memory earlier in a few places

* floodsub fix
2021-05-07 00:43:45 +02:00
Giovanni Petrantoni 9f301964ed
fix control messages (#566)
* remove unused control graft check in handleControl

* avoid sending empty Iwant messages
2021-04-28 10:03:03 +09:00
Giovanni Petrantoni f81a085d0b
More gossip coverage (#553)
* add floodPublish test

* test delivery via control Iwant/have mechanics

* fix issues in control, and add testing

* fix possible backoff issue with pruned routine overriding it
2021-04-22 18:51:22 +09:00
Jacek Sieka e285d8bbf4
mem usage cleanups for pubsub (#564)
In `async` functions, a closure environment is created for variables
that cross an await boundary - this closure environment is kept in
memory for the lifetime of the associated future - this means that
although _some_ variables are no longer used, they still take up memory
for a long time.

In Nimbus, message validation is processed in batches meaning the future
of an incoming gossip message stays around for quite a while - this
leads to memory consumption peaks of 100-200 mb when there are many
attestations in the pipeline.

To avoid excessive memory usage, it's generally better to move non-async
code into proc's such that the variables therein can be released earlier
- this includes the many hidden variables introduced by macro and
template expansion (ie chronicles that does expensive exception
handling)

* move seen table salt to floodsub, use there as well
* shorten seen table salt to size of hash
* avoid unnecessary memory allocations and copies in a few places
* factor out message scoring
* avoid reencoding outgoing message for every peer
* keep checking validators until reject (in case there's both reject and
ignore)
* `readOnce` avoids `readExactly` overhead for single-byte read
* genericAssign -> assign2
2021-04-18 10:08:33 +02:00
Giovanni Petrantoni 4760df1e31 fix build with libp2p_agents_metrics switch 2021-03-15 01:42:47 +00:00
Jacek Sieka 70deac9e0d
fix peer score accumulation (#541)
* fix accumulating peer score
* fix missing exception handling
* remove unnecessary initHashSet/initTable calls
* simplify peer stats management
* clean up tests a little
* fix some missing raises annotations
2021-03-09 13:22:52 +01:00
Giovanni Petrantoni 269d3df351
Consolidate metrics collection for mesh (#540)
* Consolidate metrics collection for mesh

* more fixes

* wrapping up

* Update libp2p/protocols/pubsub/gossipsub/behavior.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* Update libp2p/protocols/pubsub/gossipsub/behavior.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* Update libp2p/protocols/pubsub/gossipsub/behavior.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-03-03 22:11:21 +01:00
Giovanni Petrantoni 34c2fbeb16 small gossipsub metrics change 2021-03-03 08:41:21 +00:00
Giovanni Petrantoni 02ad017107
Gossipsub fixes and Initiator flagging fixes (#539)
* properly propagate initiator information for gossipsub

* Fix pubsubpeer lifetime management

* restore old behavior

* tests fixing

* clamp backoff time value received

* fix member name collisions

* internal test fixes

* better names and explaining of the importance of transport direction

* fixes
2021-03-03 08:23:40 +09:00
Giovanni Petrantoni c1334c6d89 pubsubpeer better address management 2021-02-28 04:53:17 +00:00
Giovanni Petrantoni 7b2727d930 avoid leaking in peersInIP, don't depend on sendConn 2021-02-27 23:49:56 +09:00
Giovanni Petrantoni 67d0926e89 use in any case PeerID for peersInIP to avoid keeping references 2021-02-27 21:31:59 +09:00
Giovanni Petrantoni fae38e0146 fix PubSubPeer hashing issues 2021-02-26 19:19:15 +09:00
Giovanni Petrantoni 45300c28a9
[SEC] gossipsub - handleIHAVE/handleIWANT recommendations & notes (#535)
Fixes #400
2021-02-26 14:27:42 +09:00
Giovanni Petrantoni c1d8317e3c fix badly merged code in gossipsub.colocationFactor 2021-02-26 12:39:57 +09:00
Giovanni Petrantoni 922cd92f94 don't check if peers have `sendConn` when disconnecting for bad scoring 2021-02-22 10:04:02 +09:00
Giovanni Petrantoni 51d8cd4ade
[SEC] gossipsub - rebalanceMesh may prune up to D_lo on oversubscription (#531)
Fixes #403
2021-02-13 13:39:32 +09:00
Giovanni Petrantoni e124e342b0
n subscription limits (#528)
* subscription high water, cleanups

* subscription limits test

* newline
2021-02-12 12:27:26 +09:00
Giovanni Petrantoni fff54fa23c add more diagnostics when gossip publish fails 2021-02-09 18:42:59 +09:00
Giovanni Petrantoni 646557597d lower some gossipsub logging to debug level 2021-02-08 10:11:41 +09:00
Giovanni Petrantoni fd73cf9f9d
Refactor gossipsub into multiple modules (#515)
* Refactor gossipsub into multiple modules

* splitup further gossipsub

* move more mesh related stuff to behavior

* fix internal tests

* fix PubSubPeer.outbound flag, make it more reliable

* use discard rather then _
2021-02-06 09:13:04 +09:00
Giovanni Petrantoni 5aebf0990e
peer stats fixes (#511)
Gossipsub fix, required by nimbus, merging into master as low impact
2021-01-29 12:41:51 +09:00
Giovanni Petrantoni 1d77d37f17
Gossipsub scoring fixes (#508)
* Fix some problematics when running with full scoring

* more fixes
2021-01-25 21:13:42 +09:00
Giovanni Petrantoni b57101f265 add an invalid topic subscriptions metric 2021-01-15 18:55:54 +09:00
Giovanni Petrantoni 1fb783eb7f let apps decide if they want to penalize peers on invalid topic 2021-01-15 18:50:42 +09:00
Giovanni Petrantoni 6542b913df set "ignoring invalid topic subscription" to trace level 2021-01-15 18:48:58 +09:00
Giovanni Petrantoni 240ec84ffb
Gossipsub wip (#502)
* Remove unused connections in pubsubpeer, also removed wrong usages, add a disconnect bad peers parameter

* handle exceptions in disconnectPeer

* small fix

* use the proper disconnection procedure for gossip peers

* fixes, more metrics add test about disconnection

* hot fix possible null pointers in switch

* silly isnil sugar

* Fix and test gossip directPeer connections
2021-01-15 13:48:03 +09:00
Giovanni Petrantoni dc48170b0d
Gossip subscription improvements (#497)
* salt ids in seen table

* add subscription validation callback and avoid processing topics we don't care of

* apply penalty on bad subscription

* fix IHave handling IDs

* reduce indenting, add some comments

* fix gossip randombytes generation

* do not descore unwanted topics (might happen, due to timing, needs improvements)

* cleaning up and added tests

* validate subscriptions only when subscribing

* set notice level for failed publish

* fix floodsub behavior
2021-01-13 23:49:44 +09:00
Giovanni Petrantoni b902c030a0
add metrics into chronosstream to identify peers agents (#458)
* add metrics into chronosstream to identify peers agents

* avoid too many agent strings

* use gauge instead of counter for stream metrics

* filter identity on /

* also track bytes traffic

* fix identity tracking closeimpl call

* add gossip rpc metrics

* fix missing metrics inclusions

* metrics fixes and additions

* add a KnownLibP2PAgents strdefine

* enforse toLowerAscii to agent names (metrics)

* incoming rpc metrics

* fix silly mistake in rpc metrics

* fix agent metrics logic

* libp2p_gossipsub_failed_publish metric

* message ids metrics

* libp2p_pubsub_broadcast_ihave metric improvement

* refactor expensive gossip metrics

* more detailed metrics

* metrics improvements

* remove generic metrics for `set` users

* small fixes, add debug counters

* fix counter and add missing subs metrics!

* agent metrics behind -d:libp2p_agents_metrics

* remove testing related code from this PR

* small rebroadcast metric fix

* fix small mistake

* add some guide to the readme in order to use new metrics

* add libp2p_gossipsub_peers_scores metric

* add protobuf metrics to understand bytes traffic precisely

* refactor gossipsub metrics

* remove unused variable

* add more metrics, refactor rebalance metrics

* avoid bad metric concurrent states

* use a stack structure for gossip mesh metrics

* refine sub metrics

* add received subs metrics fixes

* measure handlers of known topics

* sub/unsub counter

* unsubscribeAll log unknown topics

* expose a way to specify known topics at runtime
2021-01-08 14:21:24 +09:00
Giovanni Petrantoni 5e79d3ab9c avoid triggering unsubscribeAll from unsubscribe (gossip) 2020-12-20 17:08:03 +09:00
Giovanni Petrantoni 4858e0ab15
Gossipsub refactor pt2 (#495)
* add sub/unsub test

* remove unused variable from gossip
2020-12-20 00:45:34 +09:00
Giovanni Petrantoni 05e789a34f
Gossipsub refactor (#490)
* refactor peerStats, re-enable scores for testing

* remove gossip 1.0

* cleanup

* codecov matrix fixes

* restore previous score on onNewPeer

* fix coverage n checks

* unsubscribeAll gossipsub fixes

* refactor unsub/sub

* refactor onNewPeer and fix score flow

* disable scores by default (change in tests later)

* fix tests, enable scores in tests

* fix wrongly merged test

* ensure topic removal from topics table

* small typo fix

* testinterop fixes
2020-12-19 15:43:32 +01:00
Giovanni Petrantoni 6c2e743ff3
Race condition in pubsub #469 (#471)
* Race condition in pubsub #469

* use allFinished

* improve cancellation handling
2020-12-19 00:56:46 +09:00
Jacek Sieka a1a5f9abac
nice msgid log formatting (#491) 2020-12-18 16:14:07 +01:00
Giovanni Petrantoni 52628d1a2e avoid using debug logging in gossip, just because 2020-12-17 17:21:09 +09:00
Jacek Sieka 9e5ba64c48
avoid creating prune message unless we're pruning (#487) 2020-12-15 22:46:03 +01:00
Giovanni Petrantoni 18d69a5c41
Workaround the gossipsub table race condition (#486) 2020-12-15 12:32:11 -06:00
Giovanni Petrantoni f8f0bc1bd8
Gossip improvements (#476)
* add more traces, remove async from rebalance

* more traces

* avoid computng scores when weight is 0.0

* debug colocation, fix an indent in unsubpeer (minor)

* add full ValidationResult coverage

* store in cache only after validation

* gossip 1.0 fixes

* fix typo

* gossip 10 internal test fixes

* test fixing

* refactor peerstats usages

* populate tables if missing when scoring
2020-12-15 10:25:22 +09:00
Jacek Sieka 1befeb8c2e
clean up peerid (#470)
* fix dangling cstring on error return
* remove some useless inlines
* less mallocs in shortlog
* proc -> func
* rename test
2020-12-03 13:53:16 -06:00
Dmitriy Ryajov d1c689e5ab
adding libp2p tag to logScope (#465) 2020-12-01 11:34:27 -06:00
Giovanni Petrantoni e1648d4404 fix mcache logic check in gossipsub 2020-12-01 23:55:51 +09:00
Giovanni Petrantoni b4738d723c
Some gossip fixes (#467)
* fix some missing rpc in rebalanceMesh

* clarify some variable names and lifetime

* further improvements
2020-12-01 11:44:09 +01:00
Dmitriy Ryajov 18443dafc1
rework peer event to take an initiator flag (#456)
* rework peer event to take an initiator flag

* use correct direction for initiator
2020-11-28 10:59:47 -06:00
Giovanni Petrantoni 02b20440f2
Limit ihave emission (#462)
* add some limits to ihave emission, go has them, rust does not actually

* restore shuffling of IDs

* add some context
2020-11-28 16:27:39 +09:00
Giovanni Petrantoni 12db9a4cf2 TopicParams validation tuning 2020-11-28 00:23:09 +09:00
Giovanni Petrantoni 809df8d04d add some extra gossip metrics 2020-11-26 16:20:34 +09:00
Giovanni Petrantoni 6c7f2766fe
expose seenTTL parameters (#457) 2020-11-26 14:45:10 +09:00
Dmitriy Ryajov 034a1e8b1b
small cleanups from tcp-limits2 (#446) 2020-11-23 15:02:23 -06:00
Giovanni Petrantoni 93b6c4dc52
Gossip runtime params (#437)
* move gossip parameters to runtime

* internal test fixes

* add missing params

* restore const parameters are soldi base and use them in init

* more constants tuning
2020-11-19 16:48:17 +09:00
Giovanni Petrantoni a9948b0b05
clarify validation messages (#431)
* clarify validation messages

* add codecov threshold
2020-11-12 01:42:12 +09:00
Dmitriy Ryajov 90921bff09
move some importance trace logs to debug (#428) 2020-11-09 22:14:46 -06:00
Giovanni Petrantoni e496802943
Least expensive metrics (#421)
* add more general and useful metrics

* fix gossipsub peers metrics in heartbeat
2020-11-04 15:18:00 +01:00
Jacek Sieka 03639f1446
Revert "Channel leaks (#413)" (#417)
This reverts commit 1de1d49223.
2020-11-01 14:49:25 -06:00
Giovanni Petrantoni 75b023c9e5
gossipsub audit fixes (#412)
* [SEC] gossipsub - rebalanceMesh grafts peers giving preference to low scores #405

* comment score choices

* compiler warning fixes/bug fixes (unsubscribe)

* rebalanceMesh does not enforce D_out quota

* fix outbound grafting

* fight the nim compiler

* fix closure capture bs...

* another closure fix

* #403 rebalance prune fixes

* more test fixing

* #403 fixes

* #402 avoid removing scores on unsub

* #401 handleGraft improvements

* [SEC] handleIHAVE/handleIWANT recommendations

* add a note about peer exchange handling
2020-10-30 21:49:54 +09:00
Dmitriy Ryajov 1de1d49223
Channel leaks (#413)
* break stream tracking by type

* use closeWithEOF to await wrapped stream

* fix cancelation leaks

* fix channel leaks

* logging

* use close monitor and always call closeUnderlying

* don't use closeWithEOF

* removing close monitor

* logging
2020-10-27 11:21:03 -06:00
Giovanni Petrantoni 462da1f7a8
gossip MessageID as seq[byte] (#391)
* gossip MessageID as seq[byte]

* combina hashes in defaultMsgIdProvider

* wip

* fix defaultMsgIdProvider
2020-10-21 12:26:04 +09:00
Giovanni Petrantoni 27b9bf436e
fix validation according to specification (#410) 2020-10-21 12:25:42 +09:00
Giovanni Petrantoni 556213abf4
Extended validators (#395)
* gossip extended validation

* fix flood tests

* fix gossip 1.0 tests

* synthax consistency
2020-10-12 16:56:00 +09:00
Giovanni Petrantoni e3bdb9eb13
decode properly ControlPrune (#392) 2020-10-09 09:12:38 +09:00
Giovanni Petrantoni 0f2435f551
better opportunistic grafting score (when score is disabled) (#389) 2020-10-03 09:26:45 +09:00
Giovanni Petrantoni 4a98a8af5a
gossip pruning fixes related to #371 (#385)
* gossip pruning fixes related to #371

* better trace for grafted/pruned

* shorted azure testing again
2020-10-02 13:09:31 +09:00
Mamy Ratsimbazafy 03f5bbba6d
saner logging (#381) 2020-09-29 09:40:06 -06:00
Giovanni Petrantoni 98d0cc3a16
defaultMsgIdProvider alternative/test anonymize (#379)
* defaultMsgIdProvider alternative/test anonymize

* avoid freeze during flood tests

* avoid `empty message, skipping` situation

* test observers

* avoid double initPubSub

* fix gossip testing (specially when anonymize is on)

* make azure tests shorter
2020-09-28 09:11:18 +02:00
Jacek Sieka 8ecef46738
reencode gossipsub messages with anonymization (#378)
This helps protect against clients sending more data than they should
and thus getting penalized on topics that require anonymity
2020-09-25 18:39:34 +02:00
Jacek Sieka 17e00e642a
limit write queue length (#376)
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.

Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.

* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
2020-09-24 18:43:20 +02:00
Jacek Sieka 25bd0a18f4
small fixes (#374)
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
2020-09-24 07:30:19 +02:00
Giovanni Petrantoni ec322124ac
allow to omit peerId and seqno (#372)
* allow to omit peerId and seqno

* small refactor

* wip

* fix message encoding

* improve rpc signature logic

* remove peerid from verify

* trace fixes

* fix message test

* fix gossip 1.0 tests
2020-09-23 17:56:33 +02:00
Jacek Sieka 471e5906f6
fix gossipsub memory leak on disconnected peer (#371)
When messages can't be sent to peer, we try to establish a send
connection - this causes messages to stack up as more and more unsent
messages are blocked on the dial lock.

* remove dial lock
* run reconnection loop in background task
2020-09-22 09:05:53 +02:00