This PR reintroduces and further decouples blocks and blobs in EIP-4844,
so as to improve network and processing performance.
Block and blob processing, for the purpose of gossip validation, are
independent: they can both be propagated and gossip-validated
in parallel - the decoupled design allows 4 important optimizations
(or, if you are so inclined, removes 4 unnecessary pessimizations):
* Blocks and blobs travel on independent meshes allowing for better
parallelization and utilization of high-bandwidth peers
* Re-broadcasting after validation can start earlier allowing more
efficient use of upload bandwidth - blocks for example can be
rebroadcast to peers while blobs are still being downloaded
* bandwidth-reduction techniques such as per-peer deduplication are more
efficient because of the smaller message size
* gossip verification happens independently for blocks and blobs,
allowing better sharing / use of CPU and I/O resources in clients
With growing block sizes and additional blob data to stream, the network
streaming time becomes a dominant factor in propagation times - on a
100 Mbit/s line, streaming 1 MB to 8 peers takes ~1 s - this process is
repeated for each hop in both incoming and outgoing directions.
This design in particular sends each blob on a separate subnet, thus
maximising the potential for parallelisation and providing a natural
path for growing the number of blobs per block should the network be
judged to be able to handle it.
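For illustration, a minimal sketch of a per-blob topic/subnet mapping (the topic name and function are assumptions, not necessarily what the PR adopts):
```
# Hypothetical per-index gossip topic; one topic per blob index lets blobs of
# the same block propagate and be validated in parallel.
def blob_sidecar_topic(fork_digest: str, blob_index: int) -> str:
    return f"/eth2/{fork_digest}/blob_sidecar_{blob_index}/ssz_snappy"
```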
Changes compared to the current design include:
* `BlobsSidecar` is split into individual `BlobSidecar` containers -
each container is signed individually by the proposer (see the sketch
after this list)
* the signature is used during gossip validation but later dropped.
* KZG commitment verification is moved out of the gossip pipeline and
instead done before fork choice addition, when both block and sidecars
have arrived
* clients may verify individual blob commitments earlier
* more generally and similar to block verification, gossip propagation
is performed solely based on trivial consistency checks and proposer
signature verification
* by-root blob requests are done per-blob, so as to retain the ability
to fill in blobs one-by-one assuming clients generally receive blobs
from gossip
* by-range blob requests are done per-block, so as to simplify
historical sync
* range and root requests are limited to `128` entries for both blocks
and blobs - practically, the current higher limit of `1024` for blocks
does not get used and keeping the limits consistent simplifies
implementation - with the merge, block sizes have grown significantly
and clients generally fetch smaller chunks.
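As referenced in the first bullet above, a hedged sketch of the split containers (the field set is illustrative; the PR defines the exact layout):
```
class BlobSidecar(Container):
    block_root: Root
    index: uint64                  # position of the blob within the block
    slot: Slot
    block_parent_root: Root
    proposer_index: ValidatorIndex
    blob: Blob
    kzg_commitment: KZGCommitment
    kzg_proof: KZGProof

class SignedBlobSidecar(Container):
    message: BlobSidecar
    signature: BLSSignature        # proposer signature, only used for gossip validation
```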
In Altair, the light client sync protocol exchanges `BeaconBlockHeader`
structures for tracking current progress. Wrapping `BeaconBlockHeader`
inside a `LightClientHeader` allows future extensions of this header,
e.g., to also track `ExecutionPayloadHeader`.
Note: This changes the JSON REST format by adding a `beacon` nesting.
For SSZ, the serialization format stays the same (but the overall root changes).
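A minimal sketch of the wrapping, in the executable-spec container style:
```
class LightClientHeader(Container):
    # Future forks may add further fields here, e.g., an `ExecutionPayloadHeader`
    beacon: BeaconBlockHeader
```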
* Add @description decorator
* Unify test case naming style
* more clean ups
* Altair tests cleanup
* Clean up Altair and Bellatrix `process_deposit` tests
* Clean up Bellatrix tests
* Clean up Capella tests
* PR feedback from @ralexstokes
* Add comments on the deposit fork version tests
* Remove `test_incorrect_sig_other_version` since it is duplicate to `test_ineffective_deposit_with_bad_fork_version`
* Add `test_ineffective_deposit_with_current_fork_version`
While the light client sync protocol currently provides access to the
latest `BeaconBlockHeader`, obtaining the matching execution data needs
workarounds such as downloading the full block.
Having ready access to the EL state root simplifies use cases that need
a way to cross-check `eth_getProof` responses against LC data.
Access to `block_hash` unlocks scenarios where a CL light client drives
an EL without `engine_newPayload`. As of Altair, only the CL beacon
block root is available, but the EL block hash is needed for engine API.
Other fields in the `ExecutionPayloadHeader` such as `logs_bloom` may
allow light client applications to monitor blocks for local interest,
e.g. for transfers affecting a certain wallet. This makes it possible to
download only the few relevant blocks instead of every single one.
A new `LightClientStore` is proposed for the Capella spec that may be
used to sync LC data that includes execution data. Existing pre-Capella
LC data will remain as is, but can be locally upgraded before feeding it
into the new `LightClientStore` so that light clients do not need to run
a potentially expensive fork transition at a specific time. This allows
the `LightClientStore` to be upgraded at a use-case-dependent time, at
any point before Capella activates. Smart contract and embedded
deployments benefit from reduced code size and do not need
synchronization with the beacon chain clock to perform the Capella fork.
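A hedged sketch of what such a local upgrade could look like for the header (the helper name and the Capella header layout are assumptions for illustration):
```
def upgrade_lc_header_to_capella(pre: BeaconBlockHeader) -> LightClientHeader:
    return LightClientHeader(
        beacon=pre,
        # Execution fields are unknown for pre-Capella data and stay default/empty
        execution=ExecutionPayloadHeader(),
        execution_branch=ExecutionBranch(),
    )
```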
Introduce `get_lc_beacon_slot` and `get_lc_beacon_root` accessors,
similar to `get_current_slot(state)`, to account for future extensions
to the light client header structure that may override how those fields
are accessed. The idea is to extend these with execution accessors in the future.
Introduce `block_to_light_client_header` helper function to enable
future forks to override it with additional info (e.g., execution),
without having to change the general light client logic.
Likewise, update existing light client data creation flow to use
`block_to_light_client_header` and default-initialize empty fields.
Furthermore, generalize `create_update` helper to streamline test code
using `block_to_light_client_header`.
Note: In the Altair spec, the LC header is the same as `BeaconBlockHeader`;
however, future forks will extend it with more information.
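A hedged sketch of the accessors and helper in the Altair setting, where the LC header is still a plain `BeaconBlockHeader` (the bodies are illustrative):
```
def get_lc_beacon_slot(header: BeaconBlockHeader) -> Slot:
    return header.slot

def get_lc_beacon_root(header: BeaconBlockHeader) -> Root:
    return hash_tree_root(header)

def block_to_light_client_header(block: SignedBeaconBlock) -> BeaconBlockHeader:
    # Future forks can override this to add more information, e.g., execution data
    return BeaconBlockHeader(
        slot=block.message.slot,
        proposer_index=block.message.proposer_index,
        parent_root=block.message.parent_root,
        state_root=block.message.state_root,
        body_root=hash_tree_root(block.message.body),
    )
```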
* EIP4844: bytes_to_bls_field() must not accept values >= BLS_MODULUS
bytes_to_bls_field() will be used in the precompile and hence it should error out when provided with malicious inputs.
* EIP4844: Add hash_to_bls_field() for use in compute_challenges()
The previous commit made bytes_to_bls_field() strict about its inputs. However, in compute_challenges() we are
dealing with Fiat-Shamir hash outputs that can legitimately exceed the modulus. For this reason we add the
hash_to_bls_field() helper for use in compute_challenges() (see the sketch below).
* EIP4844: Further use of bytes_to_bls_field() // Fix executable spec
Co-authored-by: Hsiao-Wei Wang <hsiaowei.eth@gmail.com>
Additionally, it makes the Fiat-Shamir hashing logic more robust by making the challenges independent of each other. It also makes it more efficient to implement by moving both challenge computations to a single function needing a single transcript hash.
Co-authored-by: George Kadianakis <desnacked@riseup.net>
Co-authored-by: Dankrad Feist <mail@dankradfeist.de>
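A hedged, self-contained sketch of the two helpers referenced above (byte order and hash function are assumptions for illustration):
```
import hashlib

BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513
ENDIANNESS = "little"  # assumed byte order; the spec fixes the canonical one

def bytes_to_bls_field(b: bytes) -> int:
    # Strict: reject non-canonical encodings instead of silently reducing
    value = int.from_bytes(b, ENDIANNESS)
    assert value < BLS_MODULUS
    return value

def hash_to_bls_field(data: bytes) -> int:
    # Fiat-Shamir challenges come from hash output, which may exceed the
    # modulus, so reduce here rather than in bytes_to_bls_field()
    return int.from_bytes(hashlib.sha256(data).digest(), ENDIANNESS) % BLS_MODULUS
```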
- Implemented many of Alex's comments including reinsertion of the
withdrawal index in the BeaconState
- Implemented Sean's suggestion of separating the logic for block
production, so that the withdrawals list in the payload is matched
against what `get_expected_withdrawals` returns (see the sketch below)
- Changed `get_expected_withdrawals` to match the current behavior and
moved it to `beacon-chain.md`
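A hedged sketch of the block-processing side of that separation (helper names follow the spec; the actual function has more bookkeeping, e.g. updating the withdrawal index):
```
def process_withdrawals(state: BeaconState, payload: ExecutionPayload) -> None:
    expected_withdrawals = get_expected_withdrawals(state)
    # The payload must carry exactly the withdrawals the protocol expects
    assert len(payload.withdrawals) == len(expected_withdrawals)
    for expected, withdrawal in zip(expected_withdrawals, payload.withdrawals):
        assert withdrawal == expected
        decrease_balance(state, withdrawal.validator_index, withdrawal.amount)
```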
This commit changes the public API of the KZG library to the following high-level API:
```
- verify_kzg_proof()
- compute_aggregate_kzg_proof()
- verify_aggregate_kzg_proof()
- blob_to_kzg_commitment()
```
compared to the previous much more low-level API:
```
- compute_powers()
- matrix_lincomb()
- lincomb()
- bytes_to_bls_field()
- evaluate_polynomial_in_evaluation_form()
- verify_kzg_proof()
- compute_kzg_proof()
```
This means that all the cryptographic logic (including Fiat-Shamir) is now isolated and hidden in the KZG library and the `validator.md` file ends up being significantly simplified, only calling high-level KZG functions.
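For example, sidecar validation in `validator.md` reduces to roughly the following, with all Fiat-Shamir and polynomial logic hidden behind `verify_aggregate_kzg_proof()` (field names follow the EIP-4844 draft at the time):
```
def validate_blobs_sidecar(slot: Slot,
                           beacon_block_root: Root,
                           expected_kzg_commitments: Sequence[KZGCommitment],
                           blobs_sidecar: BlobsSidecar) -> None:
    assert slot == blobs_sidecar.beacon_block_slot
    assert beacon_block_root == blobs_sidecar.beacon_block_root
    blobs = blobs_sidecar.blobs
    assert len(expected_kzg_commitments) == len(blobs)
    # One high-level call replaces the previous low-level polynomial plumbing
    assert verify_aggregate_kzg_proof(blobs, expected_kzg_commitments,
                                      blobs_sidecar.kzg_aggregated_proof)
```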
Some additional things that this commit does:
- Moves all EIP4844 cryptography into polynomial-commitments.md
- Improves the Fiat-Shamir stack by removing the need for SSZ and by introducing simple domain separators
Co-authored-by: Kevaundray Wedderburn <kevtheappdev@gmail.com>
Co-authored-by: Hsiao-Wei Wang <hsiaowei.eth@gmail.com>
Co-authored-by: Dankrad Feist <mail@dankradfeist.de>
Future light client protocol extensions may include data from the block
in addition to data from the state, e.g., `ExecutionPayloadHeader`.
To prepare for this, also pass the block to the corresponding functions.
In practice, block access is easier than historic state access, meaning
there are no practical limitations due to this change.
Add more detailed LC object documentation to explain that the various
merkle proofs are relative to the beacon block's state root.
Likewise, clarify that sync committees relate to the finalized header
(not to the optimistic header, which can be a period ahead).
For LC gossip, the documentation did not specify which slot number to use
for deriving the gossip objects. This missing documentation is now added,
documenting the use of `attested_header.slot`.
This PR, a continuation of earlier work, replaces `historical_roots`
with `historical_block_roots`.
By keeping an accumulator of historical block roots in the state, it
becomes possible to validate the entire block history that led up to
that particular state without executing the transitions, and without
checking them one by one in backwards order using a parent chain.
This is interesting for archival purposes as well as when implementing
sync protocols that can verify chunks of blocks quickly, meaning they
can be downloaded in any order.
It's also useful as it provides a canonical hash by which such chunks of
blocks can be named, with a direct reference in the state.
In this PR, `historical_roots` is frozen at its current value and
`historical_batches` are computed from the merge epoch onwards.
After this PR, `block_batch_root` in the state can be used to verify an
era of blocks against the state with a simple root check.
The `historical_roots` values on the other hand can be used to verify
that a constant distributed with clients is valid for a particular
state, and therefore extends the block validation all the way back to
genesis without backfilling `block_batch_root` and without introducing
any new security assumptions in the client.
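A hedged sketch of that root check (container and field names follow this description; the exact layout is defined in the PR):
```
def verify_era_blocks(block_roots: Sequence[Root],
                      state: BeaconState,
                      batch_index: int) -> bool:
    # An era is SLOTS_PER_HISTORICAL_ROOT = 8192 slots, i.e. 8192 * 12 s ~= 1.14 days;
    # `batch_index` is relative to when `historical_batches` started accumulating.
    assert len(block_roots) == SLOTS_PER_HISTORICAL_ROOT
    root = hash_tree_root(Vector[Root, SLOTS_PER_HISTORICAL_ROOT](*block_roots))
    return root == state.historical_batches[batch_index].block_batch_root
```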
As far as naming goes, it's convenient to talk about an "era" being 8192
slots ~= 1.14 days. The 8192 number comes from the
SLOTS_PER_HISTORICAL_ROOT constant.
With multiple easily verifiable blocks in a file, it becomes trivial to
offload block history to out-of-protocol transfer methods (bittorrent /
ftp / whatever) - including execution payloads, paving the way for a
future in which clients purge block history in p2p.
This PR can be applied along with the merge which simplifies payload
distribution from the get-go. Both execution and consensus clients
benefit because from the merge onwards, they both need to be able to
supply ranges of blocks in the sync protocol from what effectively is
"cold storage".
Another possibility is to include it in a future cleanup PR - this
complicates the "cold storage" mode above by not covering execution
payloads from the start.
* Remove the work-in-progress note in Bellatrix spec
Bellatrix is done and released.
* Remove work-in-progress notes in Bellatrix specs
Improves separation between BLS cryptography and Ethereum SSZ logic.
Now the BLS library just implements bytes_to_bls_field(), while hash_to_bls_field() handles the Ethereum SSZ
specifics and then calls bytes_to_bls_field().
While the current Altair specs define structures to aid light client
development, one missing key aspect is the network protocol definition.
Certain implementations have started defining their own REST based APIs,
e.g., Lodestar at https://github.com/ChainSafe/lodestar/blob/master/packages/api/src/routes/lightclient.ts
While such APIs are useful, REST does not seem to be the idiomatic
choice as the sole API available at such a low level for Ethereum.
This patch introduces a libp2p based protocol to allow light clients to
sync to the latest `BeaconBlockHeader` in a trustless and decentralized
manner, building on top of prior work from:
- @hwwhww at https://github.com/ethereum/consensus-specs/pull/2267
- @jinfwhuang at https://github.com/ethereum/consensus-specs/pull/2786
- Lodestar's REST API (also has an endpoint to fetch merkle proofs!)
Replaces `process_slot_for_light_client_store`, which force-updates the
`LightClientStore` automatically based on `finalized_header` age, with
`try_light_client_store_force_update`, which may be called manually,
based on use-case-dependent heuristics, if light client sync appears stuck.
Not all use cases share the same risk profile.
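A hedged sketch of the force-update helper (constant and helper names follow the Altair LC sync protocol; the actual condition may be stricter):
```
def try_light_client_store_force_update(store: LightClientStore, current_slot: Slot) -> None:
    if (
        current_slot > store.finalized_header.slot + UPDATE_TIMEOUT
        and store.best_valid_update is not None
    ):
        # Forced updates skip waiting for finality proofs; callers decide when
        # that risk is acceptable for their use case
        apply_light_client_update(store, store.best_valid_update)
        store.best_valid_update = None
```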
Introduces reduced `LightClientUpdate` structures to allow keeping track
of the latest `finalized_header` and `optimistic_header`. This may also
help in scheduling the next query for a full `LightClientUpdate` once
sync committee finality has been reached.
Adds `create_light_client_bootstrap` and `create_light_client_update`
functions as a reference implementation for serving light client data.
This also enables a new test harness to verify that light client data
gets applied to a `LightClientStore` as expected.
- Move more code into polynomial-commitments.md
- Implement aggregated sidecar verification logic from PR #2915
- Rename `kzgs` to `kzg_commitments`
Co-authored-by: Hsiao-Wei Wang <hsiaowei.eth@gmail.com>
Introduces a new `LightClientBootstrap` structure to allow setting up a
`LightClientStore` with the initial sync committee and block header from
a user-configured trusted block root.
This leads to new cases where the `LightClientStore` is only aware of
the current but not the next sync committee. As a side effect of these
new cases, the store's `finalized_header` may now advance into the next
sync committee period before a corresponding `LightClientUpdate` with
the new sync committee is obtained, improving responsiveness.
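A hedged sketch of the new structure (the branch length depends on the generalized index of `current_sync_committee` in the beacon state):
```
class LightClientBootstrap(Container):
    # Header matching the user-configured trusted block root
    header: BeaconBlockHeader
    # Current sync committee at `header`, plus a merkle proof rooted in `header.state_root`
    current_sync_committee: SyncCommittee
    current_sync_committee_branch: Vector[Bytes32, floorlog2(CURRENT_SYNC_COMMITTEE_INDEX)]
```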
Note that so far, `LightClientUpdate.attested_header.slot` needed to be
newer than `LightClientStore.finalized_header.slot`. However, it is now
necessary to also consider certain older updates to try and backfill the
`next_sync_committee`. The `is_better_update` helper is also updated to
improve `best_valid_update` tracking.
`LightClientUpdate` structures currently use a different merkle proof root
depending on the presence of `finalized_header`. By always rooting the
proofs in the same state (the `attested_header.state_root`), the logic gets simpler.
Caveats:
- In periods of extended non-finality, `update.finalized_header` may now
be outdated by several sync committee periods. The old implementation
rejected such updates as the `next_sync_committee` in them was stale,
but the new implementation can properly handle this case.
- The `next_sync_committee` can no longer be considered finalized based
on `is_finality_update`. Instead, waiting until `finalized_header` is
in the `attested_header`'s sync committee period is now necessary.
- Because `update.finalized_header > store.finalized_header` no longer
holds (for updates with finality), an `is_better_update` helper is
added to improve `best_valid_update` tracking (in the past, finalized
updates with supermajority participation would always directly apply)
This PR builds on prior work from:
- @hwwhww at https://github.com/ethereum/consensus-specs/pull/2829
The producer of `LightClientUpdate` structures usually does not know how
far the `LightClientStore` on the client side has advanced. Updates are
currently rejected when they include a redundant `next_sync_committee` that
does not advance the `LightClientStore`. The behaviour is changed to allow this.
- `committee_index` is used as an input to `compute_subnet_for_attestation` but is not previously defined
- `attestation.data.committee_index` is incorrect; the field is `index`
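A hedged sketch of the corrected usage (`get_attestation_subnet` is an illustrative wrapper; the other helpers follow the phase0 validator guide):
```
def get_attestation_subnet(state: BeaconState, attestation: Attestation) -> uint64:
    committees_per_slot = get_committee_count_per_slot(
        state, compute_epoch_at_slot(attestation.data.slot)
    )
    # The field is `attestation.data.index`, not `attestation.data.committee_index`
    return compute_subnet_for_attestation(
        committees_per_slot, attestation.data.slot, attestation.data.index
    )
```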
When `state.finalized_checkpoint` references the genesis slot, it points
to an empty `root`, instead of the actual genesis block hash. This patch
updates the `LightClientUpdate` logic to allow including finality proofs
for genesis `finalized_checkpoint.root`, better supporting non-mainnet.
When including such a finality proof, the proof is for the empty `root`,
but `finalized_header` is kept zeroed out to signify this edge case.
The `fork_version` field in `LightClientUpdate` can be derived from the
`update.signature_slot` value by consulting the locally configured fork
schedule. The light client already needs access to the fork schedule to
determine the `GeneralizedIndex` values used for merkle proofs, and the
memory layouts of the structures (including `LightClientUpdate`). The
`fork_version` itself is network dependent and doesn't reveal that info.
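A hedged sketch of deriving the version locally (assuming a `compute_fork_version` helper that encodes the configured fork schedule):
```
def get_update_fork_version(update: LightClientUpdate) -> Version:
    # Derived from the locally configured fork schedule; no need to carry it in the update
    return compute_fork_version(compute_epoch_at_slot(update.signature_slot))
```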
* push base design for partial withdrawals
* more tests
* clean up withdrawals naming
* make partial withdrawal randomized tests better
* Apply suggestions from code review
Co-authored-by: Alex Stokes <r.alex.stokes@gmail.com>
Co-authored-by: Hsiao-Wei Wang <hsiaowei.eth@gmail.com>
* fix broken mainnet test
* name swap
* lint
Co-authored-by: Alex Stokes <r.alex.stokes@gmail.com>
Co-authored-by: Hsiao-Wei Wang <hsiaowei.eth@gmail.com>
* deprecate `BeaconBlocksByRange.step`
The `step` parameter has not seen much adoption in real-life clients,
which instead opt to request variations of a few epochs at a time
(instead of interleaving single blocks, entire epochs are interleaved).
At the same time, supporting `step` on the server side brings several
complications: more complex bounds checking logic, more complex loading
of blocks from linear storage (which presumably stores all blocks and
not just certain increments).
This PR suggests that we deprecate the whole idea. Backwards
compatibility is kept by simply responding with a single block when
`step > 1` - this is allowed by the spec and should thus be handled
gracefully by requesting clients already, should there exist any that
use larger step counts (see the sketch below).
Removing `step` now allows simplifying the EL-CL protocol for serving
execution data from the EL to avoid double storage.
* Update specs/phase0/p2p-interface.md
Co-authored-by: Danny Ryan <dannyjryan@gmail.com>
Co-authored-by: Danny Ryan <dannyjryan@gmail.com>
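A hedged sketch of a server-side handler once `step` is deprecated (`get_canonical_blocks` and the storage access are illustrative):
```
def serve_beacon_blocks_by_range(request: BeaconBlocksByRangeRequest) -> Sequence[SignedBeaconBlock]:
    if request.step > 1:
        # Deprecated path: respond with at most the first block; responses with
        # fewer blocks than requested are already allowed by the spec
        count = min(request.count, 1)
    else:
        count = min(request.count, MAX_REQUEST_BLOCKS)
    # Consecutive canonical blocks from linear storage; no interleaving logic needed
    return get_canonical_blocks(request.start_slot, count)
```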
* Ignore subset aggregates
When aggregates are propagated through the network, it is often the case
that a better aggregate has already been seen - in particular, this
happens when an aggregator has not been able to include itself in the
mesh and therefore publishes an aggregate with only its own
attestations.
This new ignore rule allows dropping all aggregates that are
(non-strict) subsets of aggregates that have already been seen on the
network. In particular, it does not mandate dropping aggregates where a
union of previous aggregates would cause them to become a subset.
The logic for allowing this is based on the premise that any aggregate
that has already been seen by a peer will also have been seen by its
neighbours - a subset aggregate (strict or not) brings no new value to
the aggregation algorithm, except in the extreme edge case where you
could combine several such sparse aggregates into a single, more dense
"combined" aggregate and thus use less block space.
Further, as a small benefit, computing the `hash_tree_root` of the full
aggregate is generally not needed - however, `hash_tree_root(data)` is
already computed for other purposes, as it is used as an index in the
beacon API (see the sketch below).
* add subset ignore rule to sync contributions as well
* typo
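A hedged sketch of the subset check referenced above (the cache structure is illustrative):
```
def covers(seen_bits: Sequence[bool], new_bits: Sequence[bool]) -> bool:
    # `new_bits` is a (non-strict) subset of `seen_bits` if it sets no bit
    # that `seen_bits` does not already have
    return all(seen or not new for seen, new in zip(seen_bits, new_bits))

def should_ignore_aggregate(aggregate: Attestation,
                            seen: Dict[Root, List[Sequence[bool]]]) -> bool:
    data_root = hash_tree_root(aggregate.data)  # already computed for the beacon API
    # Each previously seen aggregate is checked individually; the rule does not
    # require considering unions of previously seen aggregates
    return any(covers(seen_bits, aggregate.aggregation_bits)
               for seen_bits in seen.get(data_root, []))
```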
As the sync committee signs the previous block, at every sync committee
period boundary the new sync committee signs a block that is in the
previous sync committee period. The light client cannot reliably detect
this condition (e.g., by assuming that it holds when currently on the
last slot of a sync committee period), because the last couple of slots
of a sync committee period may not have a block.
For example, when receiving a `LightClientUpdate` that is constructed
as in the following illustration, it is unknown whether `sync_aggregate`
was signed by the current or next sync committee at `attested_header`.
```
   slot     N             N + 1   |  N + 2 (slot not sent!)
                                  |
   +-----------------+     \ /    |     +----------------+
   | attested_header | <--- X ----|---- | sync_aggregate |
   +-----------------+     / \    |     +----------------+
                         missed   |
                                  |
                           sync committee
                           period boundary
```
This patch addresses this edge case by including the slot at which the
`sync_aggregate` was created into the `LightClientUpdate` object.
Note that the `signature_slot` cannot be trusted beyond the purpose of
signature verification, as it could be manipulated to any other slot
within the same sync committee period and fork version, without making
the `sync_aggregate` invalid.
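A hedged sketch of how `signature_slot` resolves the ambiguity when selecting the signing committee (helper names follow the Altair LC sync protocol; full validation has more steps):
```
def get_signing_sync_committee(store: LightClientStore, update: LightClientUpdate) -> SyncCommittee:
    # The signature is attributed based on `signature_slot`, not `attested_header.slot`
    assert update.signature_slot > update.attested_header.slot
    signature_period = compute_sync_committee_period(compute_epoch_at_slot(update.signature_slot))
    store_period = compute_sync_committee_period(compute_epoch_at_slot(store.finalized_header.slot))
    if signature_period == store_period:
        return store.current_sync_committee
    else:
        return store.next_sync_committee
```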
Documentation on how to call `prepare_execution_payload` had the params
for `safe_block_hash` and `finalized_block_hash` flipped compared to the
function definition. Tests were also updated for consistency.