The validator beacon APIs `getAttesterDuties`, `getProposerDuties`, and
`getSyncCommitteeDuties`, have reported the `execution_optimistic`
state for the current head block. This can lead to a race if duties are
requested around the slot start, if a new head block is currently being
processed by the EL, during which the BN head may be briefly optimistic.
`execution_optimistic` is documented in beacon APIs as:
> True if the response references an unverified execution payload.
> Optimistic information may be invalidated at a later time.
> If the field is not present, assume the False value.
As the duty endpoints reference the shuffling dependent root instead of
the currently selected head block, `execution_optimistic` is now fetched
based on that shuffling dependent block root. As this dependent block is
in the past it doesn't usually become optimistic when adding new blocks.
Note that the endpoints requested 4/8 seconds into the slot that perform
the actual duties instead of just querying for duty schedule, still
report `execution_optimistic` based on the BN head block.
When fetching historical `getSyncCommitteeDuties` for the very first
sync committee period, the case must be handled where Altair may not
have been scheduled on a sync committee period boundary.
When an uncached `ShufflingRef` is requested, we currently replay state
which can take several seconds. Acceleration is possible by:
1. Start from any state with locked-in `get_active_validator_indices`.
Any blocks / slots applied to such a state can only affect that
result for future epochs, so are viable for querying target epoch.
`compute_activation_exit_epoch(state.slot.epoch) > target.epoch`
2. Determine highest common ancestor among `state` and `target.blck`.
At the ancestor slot, same rules re `get_active_validator_indices`.
`compute_activation_exit_epoch(ancestorSlot.epoch) > target.epoch`
3. We now have a `state` that shares history with `target.blck` up
through a common ancestor slot. Any blocks / slots that the `state`
contains, which are not part of the `target.blck` history, affect
`get_active_validator_indices` at epochs _after_ `target.epoch`.
4. Select `state.randao_mixes[N]` that is closest to common ancestor.
Either direction is fine (above / below ancestor).
5. From that RANDAO mix, mix in / out all RANDAO reveals from blocks
in-between. This is just an XOR operation, so fully reversible.
`mix = mix xor SHA256(blck.message.body.randao_reveal)`
6. Compute the attester dependent slot from `target.epoch`.
`if epoch >= 2: (target.epoch - 1).start_slot - 1 else: GENESIS_SLOT`
7. Trace back from `target.blck` to the attester dependent slot.
We now have the destination for which we want to obtain RANDAO.
8. Mix in all RANDAO reveals from blocks up through the `dependentBlck`.
Same method, no special handling necessary for epoch transitions.
9. Combine `get_active_validator_indices` from `state` at `target.epoch`
with the recovered RANDAO value at `dependentBlck` to obtain the
requested shuffling, and construct the `ShufflingRef` without replay.
* more tests and simplify logic
* test with different number of deposits per branch
* Update beacon_chain/consensus_object_pools/blockchain_dag.nim
Co-authored-by: Jacek Sieka <jacek@status.im>
* `commonAncestor` tests
* lint
---------
Co-authored-by: Jacek Sieka <jacek@status.im>
* fix SSZ response for `produceBlindedBlock`
In `produceBlindedBlock`, we sent the `ForkedBlindedBeaconBlock` when
requested to reply in SSZ format. However, expected result is just the
inner `ForkyBlindedBeaconBlock` together with `eth-consensus-version`.
Note: We do not use SSZ format in our VC for this endpoint at this time,
which explains why we haven't noticed earlier.
* fix Altair/Phase0
* Incremental pruning
When turning on pruning the first time the current pruning algorithm
will prune the full database at startup. This delays restart
unnecessarily, since all of the pruned space is not needed at once.
This PR introduces incremental pruning such that we will never prune
more than 32 blocks or the sync speed, whichever is higher.
This mode is expected to become default in a follow-up release.
* Kzg: Load trusted setup
* scripts/launch_local_testnet.sh: set FIELD_ELEMENTS_PER_BLOB
* Use right setup file for mainnet/minimal
* Force rebuild
* Add comment explaining why build with -f
`attachMerkleProofs` is used by `mockUpdateStateForNewDeposit` to create
a single deposit. The function doesn't work correctly when trying with
with multiple deposits, though. Fix this to enable more complex tests,
and also return the `deposit_root` for forming matching `Eth1Data`.
* final portion of non-trivial v1.3.0 bumps
Updates unchanged logic to latest v1.3.0 consensus-specs refs,
and cleans up surrounding sections / syncs comments, and so on.
```
https://github.com/ethereum/consensus-specs/(blob|tree)/(?!v1\.3\.0/)
```
* lint
* linebreak
* cleanup `state_transition_epoch` and bump to v1.3.0
More v1.3.0 consensus-specs bumps, focused on `state_transition_epoch`.
Also fixed `current_epoch` spurious style check warning, and cleanup.
* Update beacon_chain/spec/state_transition_epoch.nim
Make intent clearer about when to expect `bls_to_execution_changes`,
and replace test-time errors with compile-time errors if the field
is renamed or goes away in the future.
The `SignedContributionAndProof: invalid contribution signature` check
is sometimes hit around fork boundaries when running local testnet.
To avoid failing CI, revert this isntance to a plain `errReject` until
the underlying problem is addressed.
We already updated the field order in the actual `ExecutionPayload`,
but in init code and tests / logs etc we still used the old order.
Update those occurrences to also match the field order in the struct.
Furthermore, add `excess_data_gas` to last entry in `test_eth1_monitor`.
Updates gossip validation spec references to v1.3.0 and fixes an
incorrect reference to "signed_aggregate_and_proof" in sync contribution
documentation.
The `SAFE_SLOTS_TO_UPDATE_JUSTIFIED` constant is no longer used as the
bouncing attack fix was removed:
https://github.com/ethereum/consensus-specs/pull/3290
Note: Some test networks still define the constant, ignoring the config
constant for now until it is no longer used.
Only comment changes:
- Bump refs to final v1.3.0 spec
- Align documentation style in various `BeaconState` structures
- Add `justification_bits` / `historical_roots` comment from spec
- Remove `previous_justified_checkpoint` from non-phase0 (same as spec)
- Cleanup some `Modified` tags
Fail local testnets on any gossip REJECT, instead of just asserting some
of the attestation related checks. This now also ensures that blocks,
BLS to Execution changes, blob sidecars and LC messages are checked
when running in a local testnet environment (`--verify-finalization`).
https://github.com/status-im/nimbus-eth2/pull/2904#discussion_r719603935
Add compatibility with https://github.com/ethereum/beacon-APIs/pull/290
to the beacon node. Behaviour when configured with multiple ELs is not
specified; intention suggests to indicate whether all ELs are offline.
When using trusted node sync with light client (`--trusted-block-root`),
the trust assumption on the server is reduced to solely be responsible
for data availability, but not data correctness. This means that we must
check block proposer signatures against the downloaded checkpoint, as
they are not covered by the block root.
Note that this lowers the backfill speed when using LC based CP sync
due to the extra checks, by about 60% for me.
Post-Capella, historical roots are computed from historical summaries
instead of being directly stored in the beacon state.
Slightly messy to pass both lists around - this is done to avoid
computing the historical root unnecessarily.
* low attestations during epoch should instafail in CI; dbg -> warn level on newPayload log
* improve newPayload warning message when no valid EL connected
* reduce potential spam; make log spelling more consistent; use fatal/quit
Post-Deneb, when the request manager receives a missing block from a
peer, it needs to check if the corresponding blobs are available, and
if so pass them along. If they aren't available, the newly-fetched
block must be put in blobless quarantine (while the blobs are
retrieved, coming in next commit).
The consensus-spec-tests already cover the scenarios of our custom test
runner, so the custom tests can be removed. Also cleans up unused config
flags and related unreachable logic.
* Fix durationToNextSlot() and durationToNextEpoch() to work not only after Genesis, but also before Genesis.
Change VC pre-genesis behavior, add runPreGenesisWaitingLoop() and runGenesisWaitingLoop().
Add checkedWaitForSlot() and checkedWaitForNextSlot() to strictly check current time and print warnings.
Fix VC main loop to use checkedWaitForNextSlot().
Fix attestation_service to run attestations processing only until the end of the duty slot.
Change attestation_service main loop to use checkedWaitForNextSlot().
Change block_service to properly cancel all the pending proposer tasks.
Use checkedWaitForSlot to wait for block proposal.
Fix block_service waitForBlockPublished() to be compatible with BN.
Fix sync_committee_service to avoid asyncSpawn.
Fix sync_committee_service to run only until the end of the duty slot.
Fix sync_committee_service to use checkedWaitForNextSlot().
* Refactor validator logging.
Fix aggregated attestation publishing missing delay.
* Fix doppelganger detection should not start at pre-genesis time.
Fix fallback service sync status spam.
Fix false `sync committee subnets subscription error`.
* Address review comments part 1.
* Address review comments.
* Fix condition issue for near genesis waiting loop.
* Address review comments.
* Address review comments 2.
The 'peek' name was incorrect as it was actually removing from the
table. It was consequently used incorrectly in block processing: the
blobless block wasn't returned to the table when it should be.
* Simplify block quarantine blobless
The quarantine blobless table was initially keyed off of (Eth2Digest,
ValidatorSig). This was modelled off the orphan table. The presence of
the signature in the key is necessary for orphans, because we can't
verify the signature for an orphan. That is not the case for a
blobless block, where the signature can be verified.
So this PR changes the blobless block table to be keyed off a
Eth2Digest only. This simplifies the retrieval and handling of
blobless blocks.
* review feedback
* allow trusted node sync based on LC trusted block root
Extends `trustedNodeSync` with a new `--trusted-block-root` option that
allows initializing a light client. No `--state-id` must be provided.
The beacon node will then use this light client to obtain the latest
finalized state from the remote server in a trust-minimized fashion.
Note that the provided `--trusted-block-root` should be somewhat recent,
and that security precautions such as comparing the state root against
block explorers is still recommended.
* fix
* workaround for `valueOr` limitations
* reduce magic numbers
* digest len > context len for readability
* move `cstring` conversion to caller
* avoid abbreviations
* `return` codestyle
* add `formatIt` for `ForkedLightClientXyz`
Even though the `ForkyLightClientXyz` have `formatIt`, they do not apply
when logging `ForkedLightClientXyz`, leading to large logs at times.
Defining `formatIt` for `ForkedLightClientXyz` fixes this.
* exports not needed
Two fixes to `/eth/v1/debug/fork_choice`:
- `validity` enum is expected to be serialized as string instead of int
- `data` wrapper is not expected for this endpoint
Syncs the `/eth/v1/debug/fork_choice` REST endpoint with latest specs.
- Validity is now reported as tri-state `enum` instead of two `bool`s
- Response includes store's justified and finalized checkpoints
- Additional `ExtraData` field on outer layer (empty for now)
https://github.com/ethereum/beacon-APIs/pull/232
* Refactor nimbus_signing_node to support Unix signals.
* Fix SN unable to close REST server properly.
* Fix `keys`, `deposit` and `validator_registration` endpoints issues.
Add getValidatorExitSignature() and getDepositMessageSignature() to validator_pool.
* Add /reload endpoint and implementation.
Fix signData to not cancel `timer`.
Fix validator_pool should clear attachedValidators table.
* Diva protocol enhancement implementation.
* outline for comparing bids from builder and engine API in BN
* set up async
* decision scaffold
* clean up logging
* Refactor proposeBlockMEV
* Update beacon_chain/validators/validator_duties.nim
Co-authored-by: zah <zahary@status.im>
* Update beacon_chain/validators/validator_duties.nim
Co-authored-by: zah <zahary@status.im>
* use typedescs instead of explicit generic parameters
---------
Co-authored-by: zah <zahary@status.im>
Just the variable, not yet `lcDataForkAtStateFork` / `atStateFork`.
- Shorten comment in `light_client.nim` to keep line width
- Do not rename `stateFork` mention in `runProposalForkchoiceUpdated`.
- Do not rename `stateFork` in `getStateField(dag.headState, fork)`
Rest is just a mechanical mass replace
* `engine_api_response_time` provides a histogram for the Engine API
response times for each unique pair ot URL and request type.
* All engine API requests are now tracked
Other changes:
The client will no longer exit on start-up if it fails to connect to
a properly configured EL node.
When using `--history=prune`, `dag.tail.slot` may advance beyond the
configured light client data retention period. Update the LC logic so
that the `dag.tail.slot` is no longer considered for LC pruning.
It is still considered to check whether new data can be produced.
* Update sync to use post-decoupling RPCs
blob_sidecars_by_range returns a flat list of sidecars, which must
then be grouped per-slot.
* Add test for groupBlobs
* createBlobs: convert proc to func
* Support for driving multiple EL nodes from a single Nimbus BN
Full list of changes:
* Eth1Monitor has been renamed to ELManager to match its current
responsibilities better.
* The ELManager is no longer optional in the code (it won't have
a nil value under any circumstances).
* The support for subscribing for headers was removed as it only
worked with WebSockets and contributed significant complexity
while bringing only a very minor advantage.
* The `--web3-url` parameter has been deprecated in favor of a
new `--el` parameter. The new parameter has a reasonable default
value and supports specifying a different JWT for each connection.
Each connection can also be configured with a different set of
responsibilities (e.g. download deposits, validate blocks and/or
produce blocks). On the command-line, these properties can be
configured through URL properties stored in the #anchor part of
the URL. In TOML files, they come with a very natural syntax
(althrough the URL scheme is also supported).
* The previously scattered EL-related state and logic is now moved
to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
in a follow-up commit). State is assigned properly either to the
`ELManager` or the to individual `ELConnection` objects where
appropriate.
The ELManager executes all Engine API requests against all attached
EL nodes, in parallel. It compares their results and if there is a
disagreement regarding the validity of a certain payload, this is
detected and the beacon node is protected from publishing a block
with a potential execution layer consensus bug in it.
The BN provides metrics per EL node for the number of successful or
failed requests for each type Engine API requests. If an EL node
goes offline and connectivity is resoted later, we report the
problem and the remedy in edge-triggered fashion.
* More progress towards implementing Deneb block production in the VC
and comparing the value of blocks produced by the EL and the builder
API.
* Adds a Makefile target for the zhejiang testnet
* Fix issue when VC unable to detect errors properly and act accordingly.
Switch all API functions used by VC to RestPlainResponse, this allows us to print errors returned by BN servers.
* Fix issue when prepareBeaconCommitteeSubnet() do not perform actions when BN is optimistically synced only.
* Fix Defect issue.
* Fix submit/publish returning `false` when operation was successful.
* Address review comments.
* Fix some client calls unable to receive `execution_optimistic` field, mark BN as OptSynced when such request has been made.
* Adjust warning levels.
---------
Co-authored-by: Jacek Sieka <jacek@status.im>
Sepolia in particular goes through two hard forks during era 1, meaning
the phase0 fork is no longer part of the state, even though blocks that
belong to it still are phase0.
* log validator that triggers doppelganger
* move activity detection closer to where it's performed, record
aggregation as activity (since it's part of the liveness endpoint)
* vc: check for doppelgangers in epoch 0 also (so that activity in epoch
1 can happen)
* avoid some looping when processing activities
* Remove use of beacon_block_and_blobs_sidecar topic
This topic goes away with decoupled blocks and blobs.
* remove use of getBeaconBlockAndBlobsSidecarTopic from test
* update nimbus_light_client.nim
This commit removes ForkySignedBeaconBlockMaybeBlobs and all
references. I tried to pull that thread only as little as was needed
to get rid of it. Left a placeholder BlobSidecar array (in lieu of
Opt[BlobsSidecar]) in a few places; this will be used as we rebuild
the decoupled implementation.
* Local sim impovements
* Added support for running Capella and EIP-4844 simulations
by downloading the correct version of Geth.
* Added support for using Nimbus remote signer and Web3Signer.
Use 2 out of 3 threshold signing configuration in the mainnet
configuration and regular remote signing in the minimal one.
* The local testnet simulation can now use a payload builder.
This is currently not activated in CI due to lack of automated
procedures for installing third-party relays or builders.
You are adviced to use mergemock for now, but for most realistic
results, we can create a simple builder based on the nimbus-eth1
codebase that will be able to propose transactions from the regular
network mempool.
* Start the simulation from a merged state. This would allow us
to start removing pre-merge functionality such as the gossip
subsciption logic. The commit also removes the merge-forcing
hack installed after the TTD removal.
* Consolidate all the tools used in the local simulation into a
single `ncli_testnet` binary.
* Initial commit.
* Address review comments and recommendations.
* Fix too often `Execution client not in sync` messages in logs.
* Add failure reason for duties requests.
* Add more reasons to every place of ValidatorApiError.
* Address race condition issue.
* Remove `vc` argument for getFailureReason().
* restore doppelganger check on connectivity loss
https://github.com/status-im/nimbus-eth2/pull/4398 introduced a
regression in functionality where doppelganger detection would not be
rerun during connectivity loss. This PR reintroduces this check and
makes some adjustments to the implementation to simplify the code flow
for both BN and VC.
* track when check was last performed for each validator (to deal with
late-added validators)
* track when we performed a doppel-detectable activity (attesting) so as
to avoid false positives
* remove nodeStart special case (this should be treated the same as
adding a validator dynamically just after startup)
* allow sync committee duties in doppelganger period
* don't trigger doppelganger when registering duties
* fix crash when expected index response is missing
* fix missing slashingSafe propagation
Other changes:
Renamed the `EIP_4844_FORK_*` config constants to `DENEB_FORK_*` as
this matches the latest spec and it's already used in the official
Sepolia config.
* Fix getStateRoot() and getBlockRoot() API functions which should obtain `execution_optimistic` field.
Fix sync committee service to check `execution_optimistic` field of getBlockRoot() response.
* 2nd part.
* Remove presets usage.
We do a linear scan of all pubkeys for each validator and slot - this
becomes expensive with large validator counts.
* normalise BN/VC validator startup logging
* fix crash when host cannot be resolved while adding remote validator
* silence repeated log spam for unknown validators
* print pubkey/index/activation mapping on startup/validator
identification
By pre-seeding the sync committee cache when applying blocks, we avoid a
significantly expensive validator set traversal / sync committee index
construction during sync / block application - 20-30% sync speedup
post-altair.
* also cache/reload total active balance for another cool 10%
While syncing the finalized portion of the chain, the execution client
cannot efficiently sync and most of the time returns `SYNCING` - in this
PR, we use CL-verified optmistic sync as long as the block is claimed to
be finalized, only occasionally updating the EL with progress.
Although a peer might lie about what is finalized and what isn't,
eventually we'll call the execution client - thus, all a dishonest
client can do is delay execution verification slightly. Gossip blocks in
particular are never assumed to be finalized.
Extends fork choice state to also track slot numbers to improve accuracy
of `/eth/v1/debug/fork_choice` endpoint. Autoenable this API on devnet,
and disable some extra checks on devnet to aid focused testing efforts.
Align fork choice pruning logic with API based on checkpoints vs root.
* clean up some Nim 1.2 workarounds
* re-add notes about JS backend
* another proc/noSideEffect -> func
* revert ncli/ncli_common.nim changes; 19969 evidently wasn't backported to 1.6
To allow LC data retention longer than the one for historic states,
introduce persistent DB caches for `current_sync_committee` and
`LightClientHeader` for finalized epoch boundary blocks.
This way, historic `LightClientBootstrap` requests may still be honored
even after pruning. Note that historic `LightClientUpdate` requests are
already answered using fully persisted objects, so don't need changes.
Sync committees and headers are cached on finalization of new data.
For existing data, info is lazily cached on first access.
Co-authored-by: Jacek Sieka <jacek@status.im>
The "on" default for validator monitor details incurs a heavy
performance penalty on large-validator setups - this may cause excess
memory usage or slowdowns when metrics are queried - this PR changes the
default to off, as was intended for the 23.1.0 release.
* debug log upon sidecar validation failure
* Fill in signature catch upon SignedBeaconBlockAndBlobsSidecar deser
* Always fill blobssidecar slot and root
* Skip lastFCU when eth1monitor is nil
* fix
* Use cached root
When the epoch boundary block is missed, we incorrectly assume that the
next couple blocks improve finality, leading to repeated pushes of the
same light client finality update and incorrectly ignoring some gossip.
* combine common implementation of LC helpers
Combine replicated helper code from Altair/Capella/EIP4844 into single
`Forky` based implementation. Also convert `template` to `func` to avoid
selection of incorrect overload.
* fix