As part of the discussions surrounding EIP-7594 (PeerDAS), it was highlighted that during sampling and/or data requests, the sampler does not have timing information for when a samplee will have data available. It is desirable not to introduce a deadline, since this artificially introduces latency for the typical scenario where data becomes available earlier than an agreed-upon deadline.
Similarly, when a client issues a request for blocks, it often does not know the rate limiting policy of the serving end and must either pessimistically rate limit itself or run the risk of getting disconnected for spamming the server - outcomes which lead to unnecessarily slow syncing as well as messy testnet behaviour due to peer scoring and disconnection issues.
This PR solves both problems by:
* removing the time-to-first-byte and response timeouts, allowing requesters to optimistically queue requests - to this date, these timeouts have never been fully implemented across clients
* introducing a hard limit on the number of concurrent requests that a client may issue, per protocol (see the sketch after this list)
* introducing a recommendation for rate limiting that allows optimal bandwidth usage without protocol changes or additional messaging roundtrips
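
As a purely illustrative sketch of such a per-protocol concurrency cap, assuming an asyncio-based client; `ProtocolLimiter`, `MAX_CONCURRENT_REQUESTS` and `send_request` are hypothetical names and values, not part of this proposal:

```python
# Minimal sketch of a per-protocol concurrency cap (assumed asyncio client).
import asyncio

MAX_CONCURRENT_REQUESTS = 2  # hypothetical per-protocol limit


class ProtocolLimiter:
    def __init__(self, limit: int = MAX_CONCURRENT_REQUESTS):
        # One semaphore per protocol keeps the number of in-flight requests
        # bounded without imposing any response deadline on the server.
        self._sem = asyncio.Semaphore(limit)

    async def request(self, send_request, payload):
        async with self._sem:
            # The request stays "open" until the server responds; no
            # time-to-first-byte or response timeout is applied here.
            return await send_request(payload)
```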
On the server side, an "open" request does not consume significant
resources while it's resting, meaning that allowing the server to manage
resource allocation by slowing down data serving is safe, as long as
concurrency is adequately limited.
On the client side, clients must already be prepared to handle slow servers and can simply apply their existing strategy to both uncertainty and rate-limiting scenarios (how long to wait before acting, what to do with "slow" peers).
Token / leaky buckets are a classic option for rate limiting, with desirable properties both when we're sending requests to many clients concurrently (good burst performance) and when the requestees are busy (keeping long-term resource usage in check and serving clients fairly).
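
A minimal token-bucket sketch, assuming the serving side expresses its policy as a replenishment rate plus a burst capacity; the class name and parameter values are illustrative, not taken from any client implementation:

```python
# Token bucket sketch for server-side response pacing (illustrative only).
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens replenished per second (long-term average)
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def consume(self, amount: float) -> float:
        """Return how long to wait before serving `amount` units of work."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        self.tokens -= amount
        if self.tokens >= 0:
            return 0.0  # within the burst allowance - serve immediately
        return -self.tokens / self.rate  # delay until tokens are replenished
```

A server applying such a bucket delays serving a response rather than dropping the request or disconnecting the requester, which is safe precisely because an open request is cheap to keep around.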