82 Commits

Author SHA1 Message Date
Marco Munizaga
0c5ee7bbfe
feat(gossipsub): Add MessageBatch (#607)
to support batch publishing messages

Replaces #602.

Batch publishing lets the system know there are multiple related
messages to be published so it can prioritize sending different messages
before sending copies of messages. For example, with the default API,
when you publish two messages A and B, under the hood A gets sent to D=8
peers first, before B gets sent out. With this MessageBatch API we can
now send one copy of A _and then_ one copy of B before sending multiple
copies.

When a node has bandwidth constraints relative to the messages it is
publishing, this improves dissemination time.
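
The ordering difference can be sketched in a few lines of Go. This is only an illustration of the send order described above, not the library's MessageBatch implementation: the `send` type and the functions `sequentialOrder`, `batchOrder` and `firstSendIndex` are hypothetical names, and the peer index simply stands for "the n-th copy".

```go
package main

import "fmt"

// send is one outgoing copy of a message (hypothetical type for this sketch).
type send struct {
	msg  string
	copy int // which copy of the message this is (0 = first)
}

// sequentialOrder models the default behaviour described above: every copy
// of a message goes out before the next message starts.
func sequentialOrder(msgs []string, d int) []send {
	var out []send
	for _, m := range msgs {
		for c := 0; c < d; c++ {
			out = append(out, send{msg: m, copy: c})
		}
	}
	return out
}

// batchOrder models batch publishing: one copy of each message goes out
// before any message gets its second copy.
func batchOrder(msgs []string, d int) []send {
	var out []send
	for c := 0; c < d; c++ {
		for _, m := range msgs {
			out = append(out, send{msg: m, copy: c})
		}
	}
	return out
}

// firstSendIndex returns the slot of the first copy of msg, or -1 if absent.
func firstSendIndex(order []send, msg string) int {
	for i, s := range order {
		if s.msg == msg {
			return i
		}
	}
	return -1
}

func main() {
	msgs := []string{"A", "B"}
	const d = 8
	fmt.Println("sequential: first copy of B at slot", firstSendIndex(sequentialOrder(msgs, d), "B"))
	fmt.Println("batched:    first copy of B at slot", firstSendIndex(batchOrder(msgs, d), "B"))
}
```

With D=8, the first copy of B leaves in slot 8 under the sequential order and in slot 1 under the batched order, which is why a bandwidth-limited publisher gets every message into the network sooner.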

For more context see this post:
https://ethresear.ch/t/improving-das-performance-with-gossipsub-batch-publishing/21713
2025-05-08 10:23:02 -07:00
Marco Munizaga
50ccc5ca90
fix(IDONTWANT)!: Do not IDONTWANT your sender (#609)
We were sending IDONTWANT to the sender of the received message. This is
pointless, as the sender should not repeat a message it already sent.
The sender could also have tracked that it had sent this peer the
message (we don't do this currently, and it's probably not necessary).

@ppopth
2025-04-30 10:58:50 +03:00
Pop Chunhapanya
bf5b583843
Allow cancelling IWANT using IDONTWANT (#591)
As specified in the Gossipsub v1.2 spec, we should allow cancelling
IWANT by IDONTWANT.

That is, if an IDONTWANT has already arrived, we should not process the
matching IWANT.

However, due to the code structure, we can cancel an IWANT only in
handleIWant.
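
A minimal sketch of that cancellation, assuming a per-peer set of message IDs received via IDONTWANT. The `peerState` type and its methods are hypothetical names for illustration and are not the router's actual data structures:

```go
package main

import "fmt"

// peerState is a hypothetical per-peer record of the message IDs the peer
// has told us, via IDONTWANT, that it does not want.
type peerState struct {
	unwanted map[string]struct{}
}

func newPeerState() *peerState {
	return &peerState{unwanted: make(map[string]struct{})}
}

// handleIDontWant records the message IDs carried by an IDONTWANT.
func (p *peerState) handleIDontWant(ids []string) {
	for _, id := range ids {
		p.unwanted[id] = struct{}{}
	}
}

// handleIWant returns the requested IDs that should still be served,
// skipping any ID already cancelled by an earlier IDONTWANT.
func (p *peerState) handleIWant(ids []string) []string {
	var serve []string
	for _, id := range ids {
		if _, cancelled := p.unwanted[id]; cancelled {
			continue
		}
		serve = append(serve, id)
	}
	return serve
}

func main() {
	p := newPeerState()
	p.handleIDontWant([]string{"m2"})
	// m2 was cancelled before the IWANT arrived, so it is not served.
	fmt.Println(p.handleIWant([]string{"m1", "m2", "m3"}))
}
```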


https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.2.md#cancelling-iwant
2024-12-30 22:25:26 +02:00
Nishant Das
3536508a9d
Fix the Router's Ability to Prune the Mesh Periodically (#589)
When a new peer wants to graft us into their mesh, we check our current
mesh size to determine whether we can add any more new peers to it. This
is done to prevent our mesh size from being greater than `Dhi` and
prevent mesh takeover attacks here:


c06df2f9a3/gossipsub.go (L943)

During every heartbeat we check our mesh size and if it is **greater**
than `Dhi` then we will prune our mesh back down to `D`.

c06df2f9a3/gossipsub.go (L1608)

However, taken together these two checks have a problematic result: we
stop grafting new peers into our mesh once the mesh size is **greater
than or equal to** `Dhi`, but we prune peers only if the mesh size is
strictly greater than `Dhi`.

As a result, the mesh ends up in a state of stasis at `Dhi`. Rather than
floating between `D` and `Dhi`, the mesh stagnates at `Dhi`, which
effectively raises the target degree of the node from `D` to `Dhi`. This
was observed on Ethereum mainnet by recording mesh interactions and
message fulfillment from those peers.

This PR fixes it by adding an equality check to the pruning conditional
so that the mesh is periodically pruned. The PR also adds a regression
test for this particular case.
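
The interaction of the two checks can be reproduced with a small simulation. This is a sketch under stated assumptions: `d` and `dhi` are illustrative values, and `acceptGraft`, `heartbeat` and `simulate` are hypothetical helpers rather than the router's real functions.

```go
package main

import "fmt"

const (
	d   = 6  // target mesh degree (illustrative value)
	dhi = 12 // upper bound on the mesh degree (illustrative value)
)

// acceptGraft mirrors the graft check described above: refuse new peers once
// the mesh size is greater than or equal to dhi.
func acceptGraft(meshSize int) bool {
	return meshSize < dhi
}

// heartbeat prunes the mesh back to d when it is too large. With
// inclusive=false it models the old check (size > dhi); with inclusive=true
// it models the fixed check (size >= dhi).
func heartbeat(meshSize int, inclusive bool) int {
	if meshSize > dhi || (inclusive && meshSize == dhi) {
		return d
	}
	return meshSize
}

// simulate offers a number of incoming GRAFTs and then runs one heartbeat.
func simulate(start, grafts int, inclusive bool) int {
	size := start
	for i := 0; i < grafts; i++ {
		if acceptGraft(size) {
			size++
		}
	}
	return heartbeat(size, inclusive)
}

func main() {
	fmt.Println("old check:  ", simulate(d, 20, false)) // mesh stays at dhi
	fmt.Println("fixed check:", simulate(d, 20, true))  // mesh is pruned back to d
}
```

Because grafting can never push the mesh above `dhi`, the old strict comparison never fires; the inclusive comparison does.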
2024-12-26 20:02:14 +02:00
Pavel Zbitskiy
f71345c1ec
Do not format expensive debug messages in non-debug levels in doDropRPC (#580)
In high-load scenarios where the consumer is slow, `doDropRPC` is called
often and makes unnecessary allocations formatting the `log.Debug`
message.

Fixed by checking log level before running expensive formatting.

Before:
```
BenchmarkAllocDoDropRPC-10    	13684732	        76.28 ns/op	     144 B/op	       3 allocs/op
```

After:
```
BenchmarkAllocDoDropRPC-10    	28140273	        42.88 ns/op	     112 B/op	       1 allocs/op
```
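
The pattern behind the fix can be sketched as follows. The `logger` type, its `enabled` method, and the `describe`, `dropUnguarded` and `dropGuarded` functions are hypothetical stand-ins, not the logging API the library uses:

```go
package main

import "fmt"

type level int

const (
	levelDebug level = iota
	levelInfo
)

// logger is a hypothetical stand-in for a levelled logger.
type logger struct {
	min level // lowest level that is actually emitted
}

func (l *logger) enabled(lv level) bool { return lv >= l.min }

// debug emits msg only if debug logging is enabled.
func (l *logger) debug(msg string) {
	if l.enabled(levelDebug) {
		fmt.Println("DEBUG:", msg)
	}
}

// describe stands in for an expensive formatting step. calls counts how
// often the formatting work was actually done.
func describe(calls *int, peer string) string {
	*calls++
	return fmt.Sprintf("dropping message to peer %s; queue full", peer)
}

// dropUnguarded always pays for formatting, even when debug logging is off.
func dropUnguarded(l *logger, calls *int, peer string) {
	l.debug(describe(calls, peer))
}

// dropGuarded checks the level first and formats only when needed.
func dropGuarded(l *logger, calls *int, peer string) {
	if l.enabled(levelDebug) {
		l.debug(describe(calls, peer))
	}
}

func main() {
	l := &logger{min: levelInfo} // debug logging disabled
	unguarded, guarded := 0, 0
	for i := 0; i < 1000; i++ {
		dropUnguarded(l, &unguarded, "peerA")
		dropGuarded(l, &guarded, "peerA")
	}
	fmt.Println("formatting calls, unguarded:", unguarded)
	fmt.Println("formatting calls, guarded:  ", guarded)
}
```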
2024-09-25 09:33:35 +03:00
Pop Chunhapanya
b421b3ab05
GossipSub v1.2: IDONTWANT control message and priority queue. (#553)
## GossipSub v1.2 implementation

Specification: libp2p/specs#548

### Work Summary
Sending IDONTWANT

* Implement a smart queue
* Add priorities to the smart queue
    * Put IDONTWANT packets into the smart priority queue as soon as the node gets the packets

Handling IDONTWANT

* Use a map to remember the message IDs whose IDONTWANT packets have been received
* Implement max_idontwant_messages (ignore the IDONTWANT packets if the max is reached)
* Clear the message IDs from the cache after 3 heartbeats
    * Hash the message IDs before putting them into the cache

More requested features

* Add a feature test to not send IDONTWANT if the other side doesn't support it
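
One way to picture the prioritised queue named in this summary is a two-lane FIFO in which urgent items (such as IDONTWANT) are always popped first. This is a sketch only: the two-lane layout is an assumption, the `queue` type is hypothetical, and it is not safe for concurrent use.

```go
package main

import "fmt"

// queue is a hypothetical two-lane FIFO: urgent items are always popped
// before normal ones. It is not the library's rpcQueue.
type queue struct {
	urgent []string
	normal []string
}

// Push appends an item to the normal lane.
func (q *queue) Push(item string) { q.normal = append(q.normal, item) }

// UrgentPush appends an item to the urgent lane so it is popped ahead of
// everything in the normal lane.
func (q *queue) UrgentPush(item string) { q.urgent = append(q.urgent, item) }

// Pop returns the next item, preferring the urgent lane. ok is false when
// the queue is empty.
func (q *queue) Pop() (item string, ok bool) {
	if len(q.urgent) > 0 {
		item, q.urgent = q.urgent[0], q.urgent[1:]
		return item, true
	}
	if len(q.normal) > 0 {
		item, q.normal = q.normal[0], q.normal[1:]
		return item, true
	}
	return "", false
}

func main() {
	q := &queue{}
	q.Push("publish-1")
	q.Push("publish-2")
	q.UrgentPush("idontwant-1") // arrives last, leaves first
	for {
		item, ok := q.Pop()
		if !ok {
			break
		}
		fmt.Println(item)
	}
}
```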

### Commit Summary

* Replace sending channel with the smart rpcQueue

Since we want to implement a priority queue later, we need to replace
the normal sending channels with the new smart structures first.

* Implement UrgentPush in the smart rpcQueue

UrgentPush allows you to push an rpc packet to the front of the queue so
that it will be popped out fast.

* Add IDONTWANT to rpc.proto and trace.proto

* Send IDONTWANT right before validation step

Most importantly, this commit adds a new method called PreValidation to
the interface PubSubRouter, which will be called right before validating
the gossipsub message.

In GossipSubRouter, PreValidation will send the IDONTWANT control
messages to all the mesh peers of the topics of the received messages.

* Test GossipSub IDONTWANT sending

* Send IDONWANT only for large messages

* Handle IDONTWANT control messages

When receiving IDONTWANTs, the host should remember the message ids
contained in IDONTWANTs using a hash map.

When receiving messages with those ids, it shouldn't forward them to the
peers who already sent the IDONTWANTs.

When the maximum number of IDONTWANTs is reached for any particular
peer, the host should ignore any excessive IDONTWANTs from that peer.

* Clear expired message IDs from the IDONTWANT cache

If the message IDs received from IDONTWANTs are older than 3
heartbeats, they should be removed from the IDONTWANT cache.

* Keep the hashes of IDONTWANT message ids instead

Rather than keeping the raw message ids, keep their hashes instead to
save memory and protect against memory DoS attacks.

* Increase GossipSubMaxIHaveMessages to 1000

* fixup! Clear expired message IDs from the IDONTWANT cache

* Not send IDONTWANT if the receiver doesn't support

* fixup! Replace sending channel with the smart rpcQueue

* Not use pointers in rpcQueue

* Simply rcpQueue by using only one mutex

* Check ctx error in rpc sending worker

Co-authored-by: Steven Allen <steven@stebalien.com>

* fixup! Simply rcpQueue by using only one mutex

* fixup! Keep the hashes of IDONTWANT message ids instead

* Use AfterFunc instead implementing our own

* Fix misc lint errors

* fixup! Fix misc lint errors

* Revert "Increase GossipSubMaxIHaveMessages to 1000"

This reverts commit 6fabcdd068a5f5238c5280a3460af9c3998418ec.

* Increase GossipSubMaxIDontWantMessages to 1000

* fixup! Handle IDONTWANT control messages

* Skip TestGossipsubConnTagMessageDeliveries

* Skip FuzzAppendOrMergeRPC

* Revert "Skip FuzzAppendOrMergeRPC"

This reverts commit f141e13234de0960d139339acb636a1afea9e219.

* fixup! Send IDONWANT only for large messages

* fixup! fixup! Keep the hashes of IDONTWANT message ids instead

* fixup! Implement UrgentPush in the smart rpcQueue

* fixup! Use AfterFunc instead implementing our own
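
The IDONTWANT handling described in the commit summary above (a map of message IDs, a per-peer maximum, expiry after 3 heartbeats, and hashing of the IDs) can be sketched roughly as below. The `cache` type, its methods, the use of SHA-256, and the exact expiry boundary are assumptions made for illustration, not the router's actual code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// cache is a hypothetical per-peer IDONTWANT cache. It stores hashes of
// message IDs (fixed size, so a peer cannot bloat memory with long IDs)
// together with the heartbeat at which each one was recorded.
type cache struct {
	maxPerPeer int
	ttl        int // lifetime in heartbeats
	beat       int // current heartbeat number
	byPeer     map[string]map[[32]byte]int
}

func newCache(maxPerPeer, ttl int) *cache {
	return &cache{maxPerPeer: maxPerPeer, ttl: ttl, byPeer: make(map[string]map[[32]byte]int)}
}

// add records that peer does not want msgID. It returns false when the
// peer has reached its maximum and the entry is ignored.
func (c *cache) add(peer, msgID string) bool {
	ids := c.byPeer[peer]
	if ids == nil {
		ids = make(map[[32]byte]int)
		c.byPeer[peer] = ids
	}
	if len(ids) >= c.maxPerPeer {
		return false
	}
	ids[sha256.Sum256([]byte(msgID))] = c.beat
	return true
}

// shouldForward reports whether msgID may still be forwarded to peer.
func (c *cache) shouldForward(peer, msgID string) bool {
	_, unwanted := c.byPeer[peer][sha256.Sum256([]byte(msgID))]
	return !unwanted
}

// heartbeat advances time and drops entries that have lived for ttl
// heartbeats (the exact boundary is an assumption of this sketch).
func (c *cache) heartbeat() {
	c.beat++
	for _, ids := range c.byPeer {
		for h, at := range ids {
			if c.beat-at >= c.ttl {
				delete(ids, h)
			}
		}
	}
}

func main() {
	c := newCache(2, 3) // small maximum so the limit is visible
	fmt.Println(c.add("p", "m1"), c.add("p", "m2"), c.add("p", "m3"))
	fmt.Println("forward m1 to p:", c.shouldForward("p", "m1"))
	for i := 0; i < 3; i++ {
		c.heartbeat()
	}
	fmt.Println("forward m1 to p after expiry:", c.shouldForward("p", "m1"))
}
```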

---------

Co-authored-by: Steven Allen <steven@stebalien.com>
2024-08-16 18:16:35 +03:00
Steven Allen
19ffbb3a48
Re-enable disabled gossipsub test (#566)
And change it to take into account the fact that libp2p now trims
connections immediately (when no grace-period is specified) instead of
waiting for a timeout.
2024-08-06 20:43:28 +00:00
galargh
097b4671b0 chore: staticcheck 2024-08-06 19:52:49 +00:00
galargh
8f56e8c97a chore: update rand usage 2024-08-06 19:52:49 +00:00
Steven Allen
1f5b81fb61
test: use the regular libp2p host (#565)
This removes dependencies on swarm/testing and the blank host.

1. swarm/testing really shouldn't be used at all except for internal
libp2p stuff.
2. The blank host should only be used in _very_ special cases (autonat,
mostly).
2024-07-11 10:32:18 +00:00
Marco Munizaga
dbd1c9eade
Fix: Own our CertifiedAddrBook (#555)
* Subscribe to libp2p events to maintain our own Certified Address Book

* Update go version

* Use TestGossipsubStarTopology test instead of new test

* Don't return an error in manageAddrBook

* Return on error while subscribing

* Use null resource manager so that the new IP limit doesn't break tests

* Mod tidy
2024-05-20 17:13:30 -07:00
Marco Munizaga
c0a528ee7b
Replace fragmentRPC with appendOrMergeRPC (#557)
This will allow us to add more logic around when we split/merge
messages. It will also allow us to build the outgoing rpcs as we go
rather than building one giant rpc and then splitting it.
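
The "build as we go" idea can be illustrated with plain strings in place of RPCs. This is a simplified sketch, assuming a byte-size limit per batch; `appendOrMerge` and `size` here are hypothetical helpers and do not reproduce the library's appendOrMergeRPC:

```go
package main

import "fmt"

// size returns the total length in bytes of all messages in a batch.
func size(batch []string) int {
	n := 0
	for _, m := range batch {
		n += len(m)
	}
	return n
}

// appendOrMerge adds msg to the last batch if it still fits within limit
// bytes; otherwise it starts a new batch. A message larger than limit ends
// up in a batch of its own.
func appendOrMerge(batches [][]string, msg string, limit int) [][]string {
	if n := len(batches); n > 0 && size(batches[n-1])+len(msg) <= limit {
		batches[n-1] = append(batches[n-1], msg)
		return batches
	}
	return append(batches, []string{msg})
}

func main() {
	var batches [][]string
	for _, m := range []string{"aaaa", "bbbb", "cccc", "dd"} {
		batches = appendOrMerge(batches, m, 10)
	}
	// Batches are sized correctly at every step; nothing is split afterwards.
	fmt.Println(batches)
}
```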
2024-05-02 09:40:54 -07:00
Sukun
d13e24ddc9
remove usage of deprecated peerid.Pretty method (#542) 2023-09-14 11:11:11 +03:00
Marten Seemann
4f56e8f0a7
update go-libp2p to v0.22.0 (#498)
* update go-libp2p to v0.22.0

* skip TestGossipsubConnTagMessageDeliveries
2022-08-26 02:45:41 -07:00
Nishant Das
ca702289e6
update pubsub deps (#491) 2022-06-30 07:30:19 +03:00
Marco Munizaga
68cdae031b
Gossipsub: Unsubscribe backoff (#488)
* Implement Unsubscribe backoff

* Add test to check that prune backoff time is used

* Update which backoff to use in TestGossibSubJoinTopic test

* Fix race in TestGossipSubLeaveTopic

* Wait for all the backoff checks, and check that we aren't missing too many

* Remove open question
2022-06-03 06:46:56 +03:00
nisdas
aeb30a2ac1 Add in Backoff Check 2022-02-08 09:20:54 +02:00
nisdas
3d93f5f991 Add Backoff For Pruned Peers 2022-02-07 14:09:18 +02:00
Gus Eggert
c6dd285c5d
feat: plumb through context changes (#459) 2021-11-11 11:09:45 -05:00
Simon Zhu
628353661b Create peer filter option 2021-09-21 13:50:09 +03:00
Ian Davis
2efd313b83
cleanup: fix vet and staticcheck failures (#435)
* cleanup: fix vet failures and most staticcheck failures

* Fix remaining staticcheck failures

* Give test goroutines chance to exit early when context is canceled
2021-07-22 15:27:32 -07:00
Steven Allen
0094708cc4
Refactor Gossipsub Parameters To Make Them More Configurable (#421)
Co-authored-by: nisdas <nishdas93@gmail.com>
2021-05-03 08:59:15 -07:00
nisdas
f7f33e10cc satisfy race detector 2020-09-10 12:39:04 +03:00
nisdas
b0d384d2e8 clean up 2020-09-10 12:39:04 +03:00
nisdas
309d45acef copy string topic 2020-09-10 12:39:04 +03:00
vyzo
73880606b5 add test for topic score parameter reset method 2020-09-09 16:57:36 +03:00
vyzo
769831b478 add regression test for issue 371 2020-08-10 21:00:00 +03:00
Raúl Kripalani
ae55bf9603 upgrade deps + interoperable uvarint delimited writer/reader. 2020-07-30 14:00:54 +03:00
Alan Shaw
c0712c6e92 feat: add direct connect ticks option
In [drand](https://github.com/drand/drand) we have a gossipsub relay that lets users subscribe to random values over pubsub. We want to support pure gossip relays that relay from another relay. For this we need direct peering agreements, and we want to mitigate the possibility of "missing" randomness messages by ensuring the direct connect ticks period is shorter than the period between updates.

This PR simply adds a new functional option allowing us to set the direct connect ticks value without modifying the global variable.
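
The functional-option pattern mentioned here can be sketched as follows. The `router` type, `option`, `withDirectConnectTicks`, `newRouter` and the default value are hypothetical names and numbers chosen for this illustration; they are not taken from the library.

```go
package main

import "fmt"

// defaultDirectConnectTicks plays the role of the global default
// (illustrative value).
const defaultDirectConnectTicks uint64 = 300

// router is a hypothetical stand-in for the gossipsub router.
type router struct {
	directConnectTicks uint64
}

// option configures a single router instance.
type option func(*router)

// withDirectConnectTicks overrides the direct connect ticks for one router
// without touching the package-level default.
func withDirectConnectTicks(t uint64) option {
	return func(r *router) {
		r.directConnectTicks = t
	}
}

// newRouter starts from the defaults and applies the options in order.
func newRouter(opts ...option) *router {
	r := &router{directConnectTicks: defaultDirectConnectTicks}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

func main() {
	plain := newRouter()
	relay := newRouter(withDirectConnectTicks(30))
	fmt.Println("default router ticks:", plain.directConnectTicks)
	fmt.Println("relay router ticks:  ", relay.directConnectTicks)
}
```

Keeping the override per instance means two routers in the same process can use different values, which a mutated global cannot offer.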
2020-05-27 16:26:41 +03:00
Yusef Napora
906c941b29 sleep longer for travis 2020-05-19 19:26:53 +03:00
Yusef Napora
568fa5a244 close stream in test 2020-05-06 19:01:22 +03:00
Yusef Napora
cb02a50cd8 split large IWANT / IHAVE messages, add unit test 2020-05-06 19:01:22 +03:00
Yusef Napora
4427c3def7 fix prune message in test 2020-05-06 19:01:22 +03:00
Yusef Napora
cb65238a39 fix race condition in rpc fragmentation test 2020-05-06 19:01:22 +03:00
Yusef Napora
8642662340 rewrite test for rpc fragmentation 2020-05-06 19:01:22 +03:00
Yusef Napora
27f009a9c7 fragment large RPCs in sendRPC 2020-05-06 19:01:22 +03:00
vyzo
deee35d9b8 fix apparent flakiness in test 2020-05-04 09:42:20 +03:00
vyzo
213da1cf8c add test exercising score integration with extended validation 2020-05-04 09:42:20 +03:00
vyzo
8b7e7a1103 fix typo 2020-04-27 18:35:25 +03:00
vyzo
5c1b637dce add test for peer score inspection 2020-04-27 18:35:25 +03:00
vyzo
01041fa327 improve reliability of star topology tests
Configure the star with 0 D, to act as a proper bootstrapper
2020-04-23 13:40:50 +03:00
vyzo
caffc3bf2c make star topology tests more reliable
probabilities are such that they occasionally fail
2020-04-23 13:40:50 +03:00
vyzo
1a3695988b import grouping 2020-04-23 13:40:50 +03:00
vyzo
94db23fd41 add signed peer records only in the center of the star for signed peer record test 2020-04-23 13:40:50 +03:00
Yusef Napora
516a32c7ad add test with signed peer records 2020-04-23 13:40:50 +03:00
vyzo
11ef2a9cf2 fix the global variable mutation races 2020-04-22 21:08:13 +03:00
vyzo
a50deb04c0 a little bit more time to avoid races with restoring mutated config variables 2020-04-22 21:08:13 +03:00
vyzo
6a230e711e add a heartbeat's worth of delay before restoring mutated globals
the race detector cries on travis
2020-04-22 21:08:13 +03:00
vyzo
6d24f46a13 reduce prune backoff times for opportunistic grafting test 2020-04-22 21:08:13 +03:00
vyzo
bac5d5910c add test for opportunistic grafting 2020-04-22 21:08:13 +03:00