v0.28.0

🔦 Highlights

Smart Dialing

This release introduces smart dialing logic. Currently, libp2p dials all addresses of a remote peer in parallel, and aborts all outstanding dials as soon as the first one succeeds. Dialing many addresses in parallel creates a lot of churn on the client side, and unnecessary load on the network and on the server side, and is heavily discouraged by the networking community (see RFC 8305 for example).

When connecting to a peer we first determine the order to dial its addresses. This ranking logic considers a number of corner cases described in detail in the documentation of the swarm package (swarm.DefaultDialRanker). At a high level, this is what happens:

  • If a peer offers a WebTransport and a QUIC address (on the same IP:port), the QUIC address is preferred.
  • If a peer has a QUIC and a TCP address, the QUIC address is dialed first. A TCP connection is started only if the QUIC connection attempt doesn't succeed within 250ms.
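The two rules above can be illustrated with a small, self-contained sketch. It is not the code behind swarm.DefaultDialRanker: addresses are plain strings rather than multiaddrs, the names rankAddrs and plannedDial are made up for this example, and the corner cases covered by the real ranker are ignored.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// tcpDelay is the head start QUIC gets over TCP (250ms, as stated in the notes).
const tcpDelay = 250 * time.Millisecond

// plannedDial is a hypothetical type pairing an address with the delay
// after which it should be dialed.
type plannedDial struct {
	Addr  string
	Delay time.Duration
}

func isWebTransport(a string) bool { return strings.Contains(a, "/webtransport") }
func isQUIC(a string) bool         { return strings.Contains(a, "/quic-v1") && !isWebTransport(a) }
func isTCP(a string) bool          { return strings.Contains(a, "/tcp/") }

// rankAddrs applies the two simplified rules:
//  1. drop a WebTransport address if a QUIC address exists on the same IP:port,
//  2. delay TCP addresses by tcpDelay if any QUIC address is present.
func rankAddrs(addrs []string) []plannedDial {
	quic := make(map[string]bool)
	for _, a := range addrs {
		if isQUIC(a) {
			quic[a] = true
		}
	}
	var plan []plannedDial
	for _, a := range addrs {
		if isWebTransport(a) {
			// The part before "/webtransport" is the underlying QUIC address.
			base := a[:strings.Index(a, "/webtransport")]
			if quic[base] {
				continue
			}
		}
		delay := time.Duration(0)
		if isTCP(a) && len(quic) > 0 {
			delay = tcpDelay
		}
		plan = append(plan, plannedDial{Addr: a, Delay: delay})
	}
	// Addresses with the shortest delay are dialed first.
	sort.SliceStable(plan, func(i, j int) bool { return plan[i].Delay < plan[j].Delay })
	return plan
}

func main() {
	plan := rankAddrs([]string{
		"/ip4/1.2.3.4/tcp/4001",
		"/ip4/1.2.3.4/udp/4001/quic-v1/webtransport",
		"/ip4/1.2.3.4/udp/4001/quic-v1",
	})
	for _, d := range plan {
		fmt.Printf("dial %s after %v\n", d.Addr, d.Delay)
	}
}
```

With the three example addresses, the WebTransport address is dropped, the QUIC address is dialed immediately, and the TCP address is scheduled 250ms later.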

Our measurements on the IPFS network show that for >90% of established libp2p connections, the first connection attempt succeeds, leading to a dramatic decrease in the number of aborted connection attempts.

We also added new metrics to the swarm Grafana dashboard, showing:

  • The number of connection attempts it took to establish a connection
  • The delay introduced by the ranking logic

This feature should be safe to enable for nodes running in data centers and for most nodes in home networks. However, some networks (mostly home and corporate networks) block all UDP traffic. On such networks, enabling the current implementation of the smart dialing logic leads to a regression, since it prefers QUIC addresses over TCP addresses. Nodes would still be able to connect, but establishment of the TCP connection would be delayed by 250ms.

In a future release (see #1605 for details), we will introduce a feature called blackhole detection. By observing the outcome of QUIC connection attempts, we can determine if UDP traffic is blocked (namely, if all QUIC connection attempts fail), and stop dialing QUIC in this case altogether. Once this detection logic is in place, smart dialing will be enabled by default.
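To make the idea concrete, here is a minimal sketch of such a detector. It is only an illustration of the "all QUIC attempts fail" rule described above, not the design tracked in #1605: the type udpBlackholeDetector, its methods, and the minAttempts threshold are all assumptions made for this example.

```go
package main

import "fmt"

// udpBlackholeDetector is a hypothetical type that counts the outcomes of
// QUIC connection attempts.
type udpBlackholeDetector struct {
	minAttempts int // assumption: attempts to observe before drawing a conclusion
	attempts    int
	successes   int
}

// Record stores the outcome of one QUIC connection attempt.
func (d *udpBlackholeDetector) Record(success bool) {
	d.attempts++
	if success {
		d.successes++
	}
}

// Blocked reports whether UDP looks blocked: enough attempts have been
// observed and every one of them failed.
func (d *udpBlackholeDetector) Blocked() bool {
	return d.attempts >= d.minAttempts && d.successes == 0
}

func main() {
	d := &udpBlackholeDetector{minAttempts: 5}
	for i := 0; i < 5; i++ {
		d.Record(false)
	}
	fmt.Println("skip QUIC:", d.Blocked()) // prints "skip QUIC: true"

	d.Record(true)
	fmt.Println("skip QUIC:", d.Blocked()) // prints "skip QUIC: false"
}
```

A dialer could consult Blocked() before ranking addresses and leave out QUIC addresses while it returns true.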

More Metrics!

Since the last release, we've added metrics for:

WebTransport

  • #2251: Infer public WebTransport address from quic-v1 addresses if both transports are using the same port for both quic-v1 and WebTransport addresses.
  • #2271: Only add certificate hashes to WebTransport multiaddress if listening on WebTransport

Housekeeping updates

  • Identify
    • #2303: Don't send default protocol version
    • Prevent polluting PeerStore with local addrs
      • #2325: Don't save signed peer records
      • #2300: Filter received addresses based on the node's remote address
  • WebSocket
    • #2280: Reverted to the Gorilla library for WebSocket
  • NAT
    • #2248: Move NAT mapping logic out of the host

🐞 Bugfixes

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.27.0...v0.28.0

v0.27.0

Breaking Changes

  • The LocalPrivateKey method was removed from the network.Conn interface. #2144

🔦 Highlights

Additional metrics

Since the last release, we've added metrics for:

  • Relay Service: RequestStatus, RequestCounts, RejectionReasons for Reservation and Connection Requests, ConnectionDuration, BytesTransferred, Relay Service Status.
  • Autorelay: relay finder status, reservation request outcomes, current reservations, candidate circuit v2 support, current candidates, relay addresses updated, num relay address, and scheduled work times

🐞 Bugfixes

  • autonat: don't change status on dial request refused 2225
  • relaysvc: fix flaky TestReachabilityChangeEvent 2215
  • basichost: prevent duplicate dials 2196
  • websocket: don't set a WSS multiaddr for accepted unencrypted conns 2199
  • identify: Fix IdentifyWait when Connected events happen out of order 2173
  • circuitv2: cleanup relay service properly 2164

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.26.4...v0.27.0

v0.26.4

This patch release fixes a busy loop inside AutoRelay on private nodes, see 2208.

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.26.0...v0.26.4

v0.26.3

  • rcmgr: fix JSON marshalling of ResourceManagerStat peer map 2156
  • websocket: Don't limit message sizes in the websocket reader 2193

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.26.0...v0.26.3

v0.26.2

This patch release fixes two bugs:

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.26.0...v0.26.2

v0.26.1

This version was retracted due to errors when publishing the release.

v0.26.0

🔦 Highlights

Circuit Relay Changes

Removed Circuit Relay v1

We've decided to remove support for Circuit Relay v1 in this release. v1 Relays were retired a few months ago. Notably, running the Relay v1 protocol was expensive, which resulted in only a small number of v1 relay nodes in the network. Users had to either manually configure these nodes as static relays, or discover them from the DHT. Furthermore, rust-libp2p has dropped support and js-libp2p is dropping support for Relay v1.

Support for Relay v2 was first added in late 2021 in v0.16.0. With Circuit Relay v2 it became cheap to run (limited) relays. Public nodes also started the relay service by default. There's now a massive number of Relay v2 nodes on the IPFS network, and they don't advertise their service to the DHT any more. Because there are now so many of these nodes, a node that connects to just a small number of peers (e.g. by joining the DHT) is statistically guaranteed to connect to some relays.

Unlimited Relay v2

In conjunction with removing relay v1, we also added an option to Circuit Relay v2 to disable limits. This is done by enabling WithInfiniteLimits. When enabled, this allows users to use Relay v2 as a drop-in replacement for Relay v1.

Additional metrics

Since the last release, we've added additional metrics to different components. Metrics were added to:

We also migrated the metric dashboards to a top-level dashboards directory.

🐞 Bugfixes

AutoNat

  • Fixed a bug where AutoNat would emit events when the observed address has changed even though the node reachability hadn't changed.

Relay Manager

  • Fixed a bug where the Relay Manager started a new relay even though the previous reachability was Public or if a relay already existed.

Stop sending detailed error messages on closing QUIC connections

Users reported seeing confusing error messages and could not determine the root cause or if the error was from a local or remote peer:

{12D... Application error 0x0: conn-27571160: system: cannot reserve inbound connection: resource limit exceeded}

This error occurred when a connection had been made with a remote peer but the remote peer dropped the connection (due to it exceeding limits). This was actually an Application error emitted by quic-go and it was a bug in go-libp2p that we sent the whole message. For now, we decided to stop sending this confusing error message. In the future, we will report such errors via error codes.

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.25.1...v0.26.0

v0.25.1

Fix some test-utils used by https://github.com/libp2p/go-libp2p-kad-dht

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.25.0...v0.25.1

v0.25.0

🔦 Highlights

Metrics

We've started instrumenting the entire stack. In this release, we're adding metrics for:

Our metrics effort is still ongoing, see https://github.com/libp2p/go-libp2p/issues/1356 for progress. We'll add metrics and dashboards for more libp2p components in a future release.

Switching to Google's official Protobuf compiler

So far, we were using GoGo Protobuf to compile our Protobuf definitions to Go code. However, this library was deprecated in October last year: https://twitter.com/awalterschulze/status/1584553056100057088. We benchmarked serialization and deserialization with the official compiler, and found that it's (only) 20% slower than GoGo. Since the vast majority of go-libp2p's CPU time is spent in code paths other than Protobuf handling, switching to the official compiler seemed like a worthwhile tradeoff.

Removal of OpenSSL

Before this release, go-libp2p had an option to use OpenSSL bindings for certain cryptographic primitives, mostly to speed up the generation of signatures and their verification. When building go-libp2p using go build, we'd use the standard library crypto packages. OpenSSL was only used when passing in a build tag: go build -tags openssl. Maintaining our own fork of the long unmaintained go-openssl package has proven to place a larger than expected maintenance burden on the libp2p stewards, and when we recently discovered a range of new bugs (this and this and this), we decided to re-evaluate whether this code path is really worth it. The results surprised us. It turns out that:

  • The Go standard library is faster than OpenSSL for all key types that are not RSA.
  • Verifying RSA signatures is as fast as Ed25519 signatures using the Go standard library, and even faster in OpenSSL.
  • Generating RSA signatures is painfully slow, both using Go standard library crypto and using OpenSSL (but even slower using Go standard library).

The good news is that if your node is not using an RSA key, it will never create any RSA signatures (it might need to verify them though, when it connects to a node that uses RSA keys). If you're concerned about CPU performance, it's a good idea to avoid RSA keys (the same applies to bandwidth: RSA keys are huge!). Even for nodes using RSA keys, it turns out that generating the signatures is not a significant part of their CPU load, as verified by profiling one of Kubo's bootstrap nodes.
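If you want to get a feeling for the standard library side of this comparison on your own machine, a rough timing sketch like the following can help. It is not the benchmark we ran: the 2048-bit key size, PKCS #1 v1.5 with SHA-256, the payload, and the helper names average and measure are assumptions made for this example, and it does not cover OpenSSL at all.

```go
package main

import (
	"crypto"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"time"
)

// average runs f n times and returns the mean duration of one call.
func average(n int, f func()) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		f()
	}
	return time.Since(start) / time.Duration(n)
}

// measure times signing and verification for RSA-2048 and Ed25519 using only
// the Go standard library. It panics if key generation, signing, or
// verification fails.
func measure(n int) map[string]time.Duration {
	msg := []byte("example payload")
	digest := sha256.Sum256(msg)

	rsaKey, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		panic(err)
	}
	edPub, edPriv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}

	// Create one signature of each kind up front and check that it verifies.
	rsaSig, err := rsa.SignPKCS1v15(rand.Reader, rsaKey, crypto.SHA256, digest[:])
	if err != nil {
		panic(err)
	}
	edSig := ed25519.Sign(edPriv, msg)
	if rsa.VerifyPKCS1v15(&rsaKey.PublicKey, crypto.SHA256, digest[:], rsaSig) != nil ||
		!ed25519.Verify(edPub, msg, edSig) {
		panic("signature did not verify")
	}

	return map[string]time.Duration{
		"rsa sign": average(n, func() {
			_, _ = rsa.SignPKCS1v15(rand.Reader, rsaKey, crypto.SHA256, digest[:])
		}),
		"rsa verify": average(n, func() {
			_ = rsa.VerifyPKCS1v15(&rsaKey.PublicKey, crypto.SHA256, digest[:], rsaSig)
		}),
		"ed25519 sign":   average(n, func() { _ = ed25519.Sign(edPriv, msg) }),
		"ed25519 verify": average(n, func() { _ = ed25519.Verify(edPub, msg, edSig) }),
	}
}

func main() {
	results := measure(100)
	for _, name := range []string{"rsa sign", "rsa verify", "ed25519 sign", "ed25519 verify"} {
		fmt.Printf("%-15s %v\n", name, results[name])
	}
}
```

The absolute numbers depend on the machine and the Go version, so treat them as a rough indication only. For anything more serious, use the testing package's benchmark support.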

We therefore concluded that it's safe to drop this code path altogether, and thereby reduce our maintenance burden.

New Resource Manager types

  • Introduces a new type LimitVal which can explicitly specify "use default", "unlimited", "block all", as well as any positive number. The zero value of LimitVal (the value when you create the object in Go) is "Use default".
    • The JSON marshalling of this is straightforward.
  • Introduces a new ResourceLimits type which uses LimitVal instead of ints so it can encode the above for the resources.
  • Changes LimitConfig to PartialLimitConfig and uses ResourceLimits. This along with the marshalling changes means you can now marshal the fact that some resource limit is set to block all.
    • Because the default is to use the defaults, this avoids the footgun of initializing the resource manager with 0 limits (that would block everything).

In general, you can go from a resource config with defaults to a concrete one with .Build(), e.g. ResourceLimits.Build() => BaseLimit, PartialLimitConfig.Build() => ConcreteLimitConfig, LimitVal.Build() => int. See PR #2000 for more details.
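The following self-contained model shows the idea behind a limit value whose zero value means "use default". It deliberately does not import the resource manager package: the lowercase names (limitVal, partialLimits, concreteLimits, build) and the sentinel values are stand-ins chosen for this sketch, so check the package documentation for the real identifiers.

```go
package main

import (
	"fmt"
	"math"
)

// limitVal models a limit that is either a positive number or one of three
// special values. The zero value means "use default".
type limitVal int

const (
	useDefault limitVal = 0  // zero value: fall back to the default
	unlimited  limitVal = -1 // no limit
	blockAll   limitVal = -2 // allow nothing
)

// build resolves a limitVal into a concrete int, given the default to use.
func (l limitVal) build(def int) int {
	switch l {
	case useDefault:
		return def
	case unlimited:
		return math.MaxInt
	case blockAll:
		return 0
	default:
		return int(l)
	}
}

// partialLimits may leave fields unset; concreteLimits holds resolved values.
type partialLimits struct {
	Streams limitVal
	Conns   limitVal
}

type concreteLimits struct {
	Streams int
	Conns   int
}

// build fills every unset field of p from defaults.
func (p partialLimits) build(defaults concreteLimits) concreteLimits {
	return concreteLimits{
		Streams: p.Streams.build(defaults.Streams),
		Conns:   p.Conns.build(defaults.Conns),
	}
}

func main() {
	defaults := concreteLimits{Streams: 256, Conns: 64}

	// Streams is left at its zero value, so the default applies.
	// Conns is explicitly set to block everything.
	p := partialLimits{Conns: blockAll}
	fmt.Printf("%+v\n", p.build(defaults)) // prints "{Streams:256 Conns:0}"
}
```

Because "block all" is a distinct value rather than a plain 0, an uninitialized config resolves to the defaults instead of silently blocking everything.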

If you're using the defaults for the resource manager, there should be no changes needed.

Other Breaking Changes

We've cleaned up our API to consistently use protocol.ID for libp2p and application protocols. Specifically, this means that the peer store now uses protocol.IDs, and the host's SetStreamHandler as well.

What's Changed

New Contributors

Full Changelog: https://github.com/libp2p/go-libp2p/compare/v0.24.2...v0.25.0