Merge pull request #2131 from ethereum/bbr-ws

BlocksByRange under WS
This commit is contained in:
Danny Ryan 2021-05-12 08:51:16 -06:00 committed by GitHub
commit 84830e8f26
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 68 additions and 5 deletions

View File

@ -23,6 +23,7 @@ The current features are:
* [Deposit Contract](specs/phase0/deposit-contract.md) * [Deposit Contract](specs/phase0/deposit-contract.md)
* [Honest Validator](specs/phase0/validator.md) * [Honest Validator](specs/phase0/validator.md)
* [P2P Networking](specs/phase0/p2p-interface.md) * [P2P Networking](specs/phase0/p2p-interface.md)
* [Weak Subjectivity](specs/phase0/weak-subjectivity.md)
### Altair ### Altair

View File

@ -93,6 +93,8 @@ It consists of four main sections:
- [Why is it called Req/Resp and not RPC?](#why-is-it-called-reqresp-and-not-rpc) - [Why is it called Req/Resp and not RPC?](#why-is-it-called-reqresp-and-not-rpc)
- [Why do we allow empty responses in block requests?](#why-do-we-allow-empty-responses-in-block-requests) - [Why do we allow empty responses in block requests?](#why-do-we-allow-empty-responses-in-block-requests)
- [Why does `BeaconBlocksByRange` let the server choose which branch to send blocks from?](#why-does-beaconblocksbyrange-let-the-server-choose-which-branch-to-send-blocks-from) - [Why does `BeaconBlocksByRange` let the server choose which branch to send blocks from?](#why-does-beaconblocksbyrange-let-the-server-choose-which-branch-to-send-blocks-from)
- [Why are `BlocksByRange` requests only required to be served for the latest `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs?](#why-are-blocksbyrange-requests-only-required-to-be-served-for-the-latest-min_epochs_for_block_requests-epochs)
- [Why must the proposer signature be checked when backfilling blocks in the database?](#why-must-the-proposer-signature-be-checked-when-backfilling-blocks-in-the-database)
- [What's the effect of empty slots on the sync algorithm?](#whats-the-effect-of-empty-slots-on-the-sync-algorithm) - [What's the effect of empty slots on the sync algorithm?](#whats-the-effect-of-empty-slots-on-the-sync-algorithm)
- [Discovery](#discovery) - [Discovery](#discovery)
- [Why are we using discv5 and not libp2p Kademlia DHT?](#why-are-we-using-discv5-and-not-libp2p-kademlia-dht) - [Why are we using discv5 and not libp2p Kademlia DHT?](#why-are-we-using-discv5-and-not-libp2p-kademlia-dht)
@ -171,6 +173,7 @@ This section outlines constants that are used in this spec.
|---|---|---| |---|---|---|
| `GOSSIP_MAX_SIZE` | `2**20` (= 1048576, 1 MiB) | The maximum allowed size of uncompressed gossip messages. | | `GOSSIP_MAX_SIZE` | `2**20` (= 1048576, 1 MiB) | The maximum allowed size of uncompressed gossip messages. |
| `MAX_REQUEST_BLOCKS` | `2**10` (= 1024) | Maximum number of blocks in a single request | | `MAX_REQUEST_BLOCKS` | `2**10` (= 1024) | Maximum number of blocks in a single request |
| `MIN_EPOCHS_FOR_BLOCK_REQUESTS` | `MIN_VALIDATOR_WITHDRAWABILITY_DELAY + CHURN_LIMIT_QUOTIENT // 2` (= 33024, ~5 months) | The minimum epoch range over which a node must serve blocks |
| `MAX_CHUNK_SIZE` | `2**20` (1048576, 1 MiB) | The maximum allowed size of uncompressed req/resp chunked responses. | | `MAX_CHUNK_SIZE` | `2**20` (1048576, 1 MiB) | The maximum allowed size of uncompressed req/resp chunked responses. |
| `TTFB_TIMEOUT` | `5s` | The maximum time to wait for first byte of request response (time-to-first-byte). | | `TTFB_TIMEOUT` | `5s` | The maximum time to wait for first byte of request response (time-to-first-byte). |
| `RESP_TIMEOUT` | `10s` | The maximum time for complete response transfer. | | `RESP_TIMEOUT` | `10s` | The maximum time for complete response transfer. |
@ -179,7 +182,6 @@ This section outlines constants that are used in this spec.
| `MESSAGE_DOMAIN_INVALID_SNAPPY` | `0x00000000` | 4-byte domain for gossip message-id isolation of *invalid* snappy messages | | `MESSAGE_DOMAIN_INVALID_SNAPPY` | `0x00000000` | 4-byte domain for gossip message-id isolation of *invalid* snappy messages |
| `MESSAGE_DOMAIN_VALID_SNAPPY` | `0x01000000` | 4-byte domain for gossip message-id isolation of *valid* snappy messages | | `MESSAGE_DOMAIN_VALID_SNAPPY` | `0x01000000` | 4-byte domain for gossip message-id isolation of *valid* snappy messages |
## MetaData ## MetaData
Clients MUST locally store the following `MetaData`: Clients MUST locally store the following `MetaData`:
@ -565,6 +567,8 @@ The response code can have one of the following values, encoded as a single unsi
The response payload adheres to the `ErrorMessage` schema (described below). The response payload adheres to the `ErrorMessage` schema (described below).
- 2: **ServerError** -- the responder encountered an error while processing the request. - 2: **ServerError** -- the responder encountered an error while processing the request.
The response payload adheres to the `ErrorMessage` schema (described below). The response payload adheres to the `ErrorMessage` schema (described below).
- 3: **ResourceUnavailable** -- the responder does not have requested resource.
The response payload adheres to the `ErrorMessage` schema (described below).
Clients MAY use response codes above `128` to indicate alternative, erroneous request-specific responses. Clients MAY use response codes above `128` to indicate alternative, erroneous request-specific responses.
@ -745,10 +749,27 @@ The request MUST be encoded as an SSZ-container.
The response MUST consist of zero or more `response_chunk`. The response MUST consist of zero or more `response_chunk`.
Each _successful_ `response_chunk` MUST contain a single `SignedBeaconBlock` payload. Each _successful_ `response_chunk` MUST contain a single `SignedBeaconBlock` payload.
Clients MUST keep a record of signed blocks seen since the start of the weak subjectivity period Clients MUST keep a record of signed blocks seen on the epoch range
and MUST support serving requests of blocks up to their own `head_block_root`. `[max(GENESIS_EPOCH, current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS), current_epoch]`
where `current_epoch` is defined by the current wall-clock time,
and clients MUST support serving requests of blocks on this range.
Clients MUST respond with at least the first block that exists in the range, if they have it, and no more than `MAX_REQUEST_BLOCKS` blocks. Peers that are unable to reply to block requests within the
`MIN_EPOCHS_FOR_BLOCK_REQUESTS` epoch range MAY get descored or disconnected at any time.
*Note*: The above requirement implies that nodes that start from a recent weak subjectivity checkpoint
MUST backfill the local block database to at least epoch `current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS`
to be fully compliant with `BlocksByRange` requests. To safely perform such a
backfill of blocks to the recent state, the node MUST validate both (1) the
proposer signatures and (2) that the blocks form a valid chain up to the most
recent block referenced in the weak subjectivity state.
*Note*: Although clients that bootstrap from a weak subjectivity checkpoint can begin
participating in the networking immediately, other peers MAY
disconnect and/or temporarily ban such an un-synced or semi-synced client.
Clients MUST respond with at least the first block that exists in the range, if they have it,
and no more than `MAX_REQUEST_BLOCKS` blocks.
The following blocks, where they exist, MUST be sent in consecutive order. The following blocks, where they exist, MUST be sent in consecutive order.
@ -1393,6 +1414,45 @@ To avoid this race condition, we allow the responding side to choose which branc
The requesting client then goes on to validate the blocks and incorporate them in their own database The requesting client then goes on to validate the blocks and incorporate them in their own database
-- because they follow the same rules, they should at this point arrive at the same canonical chain. -- because they follow the same rules, they should at this point arrive at the same canonical chain.
### Why are `BlocksByRange` requests only required to be served for the latest `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs?
Due to economic finality and weak subjectivity requirements of a proof-of-stake blockchain, for a new node to safely join the network
the node must provide a recent checkpoint found out-of-band. This checkpoint can be in the form of a `root` & `epoch` or it can be the entire
beacon state and then a simple block sync from there to the head. We expect the latter to be the dominant UX strategy.
These checkpoints *in the worst case* (i.e. very large validator set and maximal allowed safety decay) must be from the
most recent `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs, and thus a user must be able to block sync to the head from this starting point.
Thus, this defines the epoch range outside which nodes may prune blocks, and
the epoch range that a new node syncing from a checkpoint must backfill.
`MIN_EPOCHS_FOR_BLOCK_REQUESTS` is calculated using the arithmetic from `compute_weak_subjectivity_period` found in the
[weak subjectivity guide](./weak-subjectivity.md). Specifically to find this max epoch range, we use the worst case event of a very large validator size
(`>= MIN_PER_EPOCH_CHURN_LIMIT * CHURN_LIMIT_QUOTIENT`).
```python
MIN_EPOCHS_FOR_BLOCK_REQUESTS = (
MIN_VALIDATOR_WITHDRAWABILITY_DELAY
+ MAX_SAFETY_DECAY * CHURN_LIMIT_QUOTIENT // (2 * 100)
)
```
Where `MAX_SAFETY_DECAY = 100` and thus `MIN_EPOCHS_FOR_BLOCK_REQUESTS = 33024` (~5 months).
### Why must the proposer signature be checked when backfilling blocks in the database?
When backfilling blocks in a database from a know safe block/state (e.g. when starting from a weak subjectivity state),
the node not only must ensure the `BeaconBlock`s form a chain to the known safe block,
but also must check that the proposer signature is valid in the `SignedBeaconBlock` wrapper.
This is because the signature is not part of the `BeaconBlock` hash chain, and
thus could be corrupted by an attacker serving valid `BeaconBlock`s but invalid
signatures contained in `SignedBeaconBlock`.
Although in this particular use case this does not represent a decay in safety
(due to the assumptions of starting at a weak subjectivity checkpoint), it
would represent invalid historic data and could be unwittingly transmitted to
additional nodes.
### What's the effect of empty slots on the sync algorithm? ### What's the effect of empty slots on the sync algorithm?
When syncing one can only tell that a slot has been skipped on a particular branch When syncing one can only tell that a slot has been skipped on a particular branch

View File

@ -136,7 +136,9 @@ A brief reference for what these values look like in practice ([reference script
## Weak Subjectivity Sync ## Weak Subjectivity Sync
Clients should allow users to input a Weak Subjectivity Checkpoint at startup, and guarantee that any successful sync leads to the given Weak Subjectivity Checkpoint along the canonical chain. If such a sync is not possible, the client should treat this as a critical and irrecoverable failure. Clients should allow users to input a Weak Subjectivity Checkpoint at startup,
and guarantee that any successful sync leads to the given Weak Subjectivity Checkpoint along the canonical chain.
If such a sync is not possible, the client should treat this as a critical and irrecoverable failure.
### Weak Subjectivity Sync Procedure ### Weak Subjectivity Sync Procedure