Merge pull request #2131 from ethereum/bbr-ws

BlocksByRange under WS
2021-05-12 08:51:16 -06:00 · 2021-05-12 08:51:16 -06:00 · 84830e8f26
parent dff12dd644 f52f067b8e
commit 84830e8f26
3 changed files with 68 additions and 5 deletions
--- a/README.md
+++ b/README.md
@ -23,6 +23,7 @@ The current features are:
 * [Deposit Contract](specs/phase0/deposit-contract.md)
 * [Honest Validator](specs/phase0/validator.md)
 * [P2P Networking](specs/phase0/p2p-interface.md)
 * [Weak Subjectivity](specs/phase0/weak-subjectivity.md)
 ### Altair
--- a/specs/phase0/p2p-interface.md
+++ b/specs/phase0/p2p-interface.md
@ -93,6 +93,8 @@ It consists of four main sections:
    - [Why is it called Req/Resp and not RPC?](#why-is-it-called-reqresp-and-not-rpc)
    - [Why do we allow empty responses in block requests?](#why-do-we-allow-empty-responses-in-block-requests)
    - [Why does `BeaconBlocksByRange` let the server choose which branch to send blocks from?](#why-does-beaconblocksbyrange-let-the-server-choose-which-branch-to-send-blocks-from)
    - [Why are `BlocksByRange` requests only required to be served for the latest `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs?](#why-are-blocksbyrange-requests-only-required-to-be-served-for-the-latest-min_epochs_for_block_requests-epochs)
    - [Why must the proposer signature be checked when backfilling blocks in the database?](#why-must-the-proposer-signature-be-checked-when-backfilling-blocks-in-the-database)
    - [What's the effect of empty slots on the sync algorithm?](#whats-the-effect-of-empty-slots-on-the-sync-algorithm)
  - [Discovery](#discovery)
    - [Why are we using discv5 and not libp2p Kademlia DHT?](#why-are-we-using-discv5-and-not-libp2p-kademlia-dht)
@ -171,6 +173,7 @@ This section outlines constants that are used in this spec.
 |---|---|---|
 | `GOSSIP_MAX_SIZE` | `2**20` (= 1048576, 1 MiB) | The maximum allowed size of uncompressed gossip messages. |
 | `MAX_REQUEST_BLOCKS` | `2**10` (= 1024) | Maximum number of blocks in a single request |
 | `MIN_EPOCHS_FOR_BLOCK_REQUESTS` | `MIN_VALIDATOR_WITHDRAWABILITY_DELAY + CHURN_LIMIT_QUOTIENT // 2` (= 33024, ~5 months) | The minimum epoch range over which a node must serve blocks |
 | `MAX_CHUNK_SIZE` | `2**20` (1048576, 1 MiB) | The maximum allowed size of uncompressed req/resp chunked responses. |
 | `TTFB_TIMEOUT` | `5s` | The maximum time to wait for first byte of request response (time-to-first-byte). |
 | `RESP_TIMEOUT` | `10s` | The maximum time for complete response transfer. |
@ -179,7 +182,6 @@ This section outlines constants that are used in this spec.
 | `MESSAGE_DOMAIN_INVALID_SNAPPY` | `0x00000000` | 4-byte domain for gossip message-id isolation of *invalid* snappy messages |
 | `MESSAGE_DOMAIN_VALID_SNAPPY`  | `0x01000000` | 4-byte domain for gossip message-id isolation of *valid* snappy messages |
 ## MetaData
 Clients MUST locally store the following `MetaData`:
@ -565,6 +567,8 @@ The response code can have one of the following values, encoded as a single unsi
  The response payload adheres to the `ErrorMessage` schema (described below).
 -  2: **ServerError** -- the responder encountered an error while processing the request.
  The response payload adheres to the `ErrorMessage` schema (described below).
 -  3: **ResourceUnavailable** -- the responder does not have requested resource.
  The response payload adheres to the `ErrorMessage` schema (described below).
 Clients MAY use response codes above `128` to indicate alternative, erroneous request-specific responses.
@ -745,10 +749,27 @@ The request MUST be encoded as an SSZ-container.
 The response MUST consist of zero or more `response_chunk`.
 Each _successful_ `response_chunk` MUST contain a single `SignedBeaconBlock` payload.
-Clients MUST keep a record of signed blocks seen since the start of the weak subjectivity period
+Clients MUST keep a record of signed blocks seen on the epoch range
-and MUST support serving requests of blocks up to their own `head_block_root`.
+`[max(GENESIS_EPOCH, current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS), current_epoch]`
 where `current_epoch` is defined by the current wall-clock time,
 and clients MUST support serving requests of blocks on this range.
-Clients MUST respond with at least the first block that exists in the range, if they have it, and no more than `MAX_REQUEST_BLOCKS` blocks.
+Peers that are unable to reply to block requests within the
 `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epoch range MAY get descored or disconnected at any time.
 *Note*: The above requirement implies that nodes that start from a recent weak subjectivity checkpoint
 MUST backfill the local block database to at least epoch `current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS`
 to be fully compliant with `BlocksByRange` requests. To safely perform such a
 backfill of blocks to the recent state, the node MUST validate both (1) the
 proposer signatures and (2) that the blocks form a valid chain up to the most
 recent block referenced in the weak subjectivity state.
 *Note*: Although clients that bootstrap from a weak subjectivity checkpoint can begin
 participating in the networking immediately, other peers MAY
 disconnect and/or temporarily ban such an un-synced or semi-synced client.
 Clients MUST respond with at least the first block that exists in the range, if they have it,
 and no more than `MAX_REQUEST_BLOCKS` blocks.
 The following blocks, where they exist, MUST be sent in consecutive order.
@ -1393,6 +1414,45 @@ To avoid this race condition, we allow the responding side to choose which branc
 The requesting client then goes on to validate the blocks and incorporate them in their own database
 -- because they follow the same rules, they should at this point arrive at the same canonical chain.
 ### Why are `BlocksByRange` requests only required to be served for the latest `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs?
 Due to economic finality and weak subjectivity requirements of a proof-of-stake blockchain, for a new node to safely join the network
 the node must provide a recent checkpoint found out-of-band. This checkpoint can be in the form of a `root` & `epoch` or it can be the entire
 beacon state and then a simple block sync from there to the head. We expect the latter to be the dominant UX strategy.
 These checkpoints *in the worst case* (i.e. very large validator set and maximal allowed safety decay) must be from the
 most recent `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs, and thus a user must be able to block sync to the head from this starting point.
 Thus, this defines the epoch range outside which nodes may prune blocks, and
 the epoch range that a new node syncing from a checkpoint must backfill.
 `MIN_EPOCHS_FOR_BLOCK_REQUESTS` is calculated using the arithmetic from `compute_weak_subjectivity_period` found in the
 [weak subjectivity guide](./weak-subjectivity.md). Specifically to find this max epoch range, we use the worst case event of a very large validator size
 (`>= MIN_PER_EPOCH_CHURN_LIMIT * CHURN_LIMIT_QUOTIENT`).
 ```python
 MIN_EPOCHS_FOR_BLOCK_REQUESTS = (
    MIN_VALIDATOR_WITHDRAWABILITY_DELAY
    + MAX_SAFETY_DECAY * CHURN_LIMIT_QUOTIENT // (2 * 100)
 )
 ```
 Where `MAX_SAFETY_DECAY = 100` and thus `MIN_EPOCHS_FOR_BLOCK_REQUESTS = 33024` (~5 months).
 ### Why must the proposer signature be checked when backfilling blocks in the database?
 When backfilling blocks in a database from a know safe block/state (e.g. when starting from a weak subjectivity state),
 the node not only must ensure the `BeaconBlock`s form a chain to the known safe block,
 but also must check that the proposer signature is valid in the `SignedBeaconBlock` wrapper.
 This is because the signature is not part of the `BeaconBlock` hash chain, and
 thus could be corrupted by an attacker serving valid `BeaconBlock`s but invalid
 signatures contained in `SignedBeaconBlock`.
 Although in this particular use case this does not represent a decay in safety
 (due to the assumptions of starting at a weak subjectivity checkpoint), it
 would represent invalid historic data and could be unwittingly transmitted to
 additional nodes.
 ### What's the effect of empty slots on the sync algorithm?
 When syncing one can only tell that a slot has been skipped on a particular branch
--- a/specs/phase0/weak-subjectivity.md
+++ b/specs/phase0/weak-subjectivity.md
@ -136,7 +136,9 @@ A brief reference for what these values look like in practice ([reference script
 ## Weak Subjectivity Sync
-Clients should allow users to input a Weak Subjectivity Checkpoint at startup, and guarantee that any successful sync leads to the given Weak Subjectivity Checkpoint along the canonical chain. If such a sync is not possible, the client should treat this as a critical and irrecoverable failure.
+Clients should allow users to input a Weak Subjectivity Checkpoint at startup,
 and guarantee that any successful sync leads to the given Weak Subjectivity Checkpoint along the canonical chain.
 If such a sync is not possible, the client should treat this as a critical and irrecoverable failure.
 ### Weak Subjectivity Sync Procedure