* low attestations during epoch should instafail in CI; dbg -> warn level on newPayload log
* improve newPayload warning message when no valid EL connected
* reduce potential spam; make log spelling more consistent; use fatal/quit
* Refactor nimbus_signing_node to support Unix signals.
* Fix SN unable to close REST server properly.
* Fix `keys`, `deposit` and `validator_registration` endpoints issues.
Add getValidatorExitSignature() and getDepositMessageSignature() to validator_pool.
* Add /reload endpoint and implementation.
Fix signData to not cancel `timer`.
Fix validator_pool should clear attachedValidators table.
* Diva protocol enhancement implementation.
* outline for comparing bids from builder and engine API in BN
* set up async
* decision scaffold
* clean up logging
* Refactor proposeBlockMEV
* Update beacon_chain/validators/validator_duties.nim
Co-authored-by: zah <zahary@status.im>
* Update beacon_chain/validators/validator_duties.nim
Co-authored-by: zah <zahary@status.im>
* use typedescs instead of explicit generic parameters
---------
Co-authored-by: zah <zahary@status.im>
Just the variable, not yet `lcDataForkAtStateFork` / `atStateFork`.
- Shorten comment in `light_client.nim` to keep line width
- Do not rename `stateFork` mention in `runProposalForkchoiceUpdated`.
- Do not rename `stateFork` in `getStateField(dag.headState, fork)`
Rest is just a mechanical mass replace
* Support for driving multiple EL nodes from a single Nimbus BN
Full list of changes:
* Eth1Monitor has been renamed to ELManager to match its current
responsibilities better.
* The ELManager is no longer optional in the code (it won't have
a nil value under any circumstances).
* The support for subscribing for headers was removed as it only
worked with WebSockets and contributed significant complexity
while bringing only a very minor advantage.
* The `--web3-url` parameter has been deprecated in favor of a
new `--el` parameter. The new parameter has a reasonable default
value and supports specifying a different JWT for each connection.
Each connection can also be configured with a different set of
responsibilities (e.g. download deposits, validate blocks and/or
produce blocks). On the command-line, these properties can be
configured through URL properties stored in the #anchor part of
the URL. In TOML files, they come with a very natural syntax
(althrough the URL scheme is also supported).
* The previously scattered EL-related state and logic is now moved
to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
in a follow-up commit). State is assigned properly either to the
`ELManager` or the to individual `ELConnection` objects where
appropriate.
The ELManager executes all Engine API requests against all attached
EL nodes, in parallel. It compares their results and if there is a
disagreement regarding the validity of a certain payload, this is
detected and the beacon node is protected from publishing a block
with a potential execution layer consensus bug in it.
The BN provides metrics per EL node for the number of successful or
failed requests for each type Engine API requests. If an EL node
goes offline and connectivity is resoted later, we report the
problem and the remedy in edge-triggered fashion.
* More progress towards implementing Deneb block production in the VC
and comparing the value of blocks produced by the EL and the builder
API.
* Adds a Makefile target for the zhejiang testnet
* log validator that triggers doppelganger
* move activity detection closer to where it's performed, record
aggregation as activity (since it's part of the liveness endpoint)
* vc: check for doppelgangers in epoch 0 also (so that activity in epoch
1 can happen)
* avoid some looping when processing activities
This commit removes ForkySignedBeaconBlockMaybeBlobs and all
references. I tried to pull that thread only as little as was needed
to get rid of it. Left a placeholder BlobSidecar array (in lieu of
Opt[BlobsSidecar]) in a few places; this will be used as we rebuild
the decoupled implementation.
* Local sim impovements
* Added support for running Capella and EIP-4844 simulations
by downloading the correct version of Geth.
* Added support for using Nimbus remote signer and Web3Signer.
Use 2 out of 3 threshold signing configuration in the mainnet
configuration and regular remote signing in the minimal one.
* The local testnet simulation can now use a payload builder.
This is currently not activated in CI due to lack of automated
procedures for installing third-party relays or builders.
You are adviced to use mergemock for now, but for most realistic
results, we can create a simple builder based on the nimbus-eth1
codebase that will be able to propose transactions from the regular
network mempool.
* Start the simulation from a merged state. This would allow us
to start removing pre-merge functionality such as the gossip
subsciption logic. The commit also removes the merge-forcing
hack installed after the TTD removal.
* Consolidate all the tools used in the local simulation into a
single `ncli_testnet` binary.
* restore doppelganger check on connectivity loss
https://github.com/status-im/nimbus-eth2/pull/4398 introduced a
regression in functionality where doppelganger detection would not be
rerun during connectivity loss. This PR reintroduces this check and
makes some adjustments to the implementation to simplify the code flow
for both BN and VC.
* track when check was last performed for each validator (to deal with
late-added validators)
* track when we performed a doppel-detectable activity (attesting) so as
to avoid false positives
* remove nodeStart special case (this should be treated the same as
adding a validator dynamically just after startup)
* allow sync committee duties in doppelganger period
* don't trigger doppelganger when registering duties
* fix crash when expected index response is missing
* fix missing slashingSafe propagation
Other changes:
Renamed the `EIP_4844_FORK_*` config constants to `DENEB_FORK_*` as
this matches the latest spec and it's already used in the official
Sepolia config.
We do a linear scan of all pubkeys for each validator and slot - this
becomes expensive with large validator counts.
* normalise BN/VC validator startup logging
* fix crash when host cannot be resolved while adding remote validator
* silence repeated log spam for unknown validators
* print pubkey/index/activation mapping on startup/validator
identification
* debug log upon sidecar validation failure
* Fill in signature catch upon SignedBeaconBlockAndBlobsSidecar deser
* Always fill blobssidecar slot and root
* Skip lastFCU when eth1monitor is nil
* fix
* Use cached root
* eip4844 beacon block proposals
* Don't fetch blobs under minimal preset
@tersec's summary of the issue:
BlobsBundleV1 in the execution API spec assumes a mainnet preset blob
size, where the EIP4844 consensus spec defines
FIELD_ELEMENTS_PER_BLOB: 4 under the minimal preset, which leads to a
Blob having a length of 4 * 32, not 4096 * 32 which BlobsBundleV1
requires.
* Revert unintentional script change
* exit/validatorchange pool includes BLS to execution messages; REST
support for new pool
* catch failed individual futures
* increase BLS changes bound and keep BLS seen consistent with subpool
* deque capacities should be powers of 2
* Refactor block/blobs types
Use type system to enforce invariant that a pre-4844 block cannot have
a sidecar.
* Update beacon_chain/nimbus_beacon_node.nim
Co-authored-by: tersec <tersec@users.noreply.github.com>
* review feedback
Co-authored-by: tersec <tersec@users.noreply.github.com>
By enabling the validator monitor, more precise information about the
lifecycle of an attestation is logged at the higher `NOTICE` log level
while current `sent` messages are logged at `INF` instead, since they
are less interesting.
In particular, missed attestations and those that vote for the wrong
head are now detected and logged at NOTICE.
In addition to logging, this feature enables rich metrics around
attestation and sync committee performance - by default, validators are
tracked in aggregate but a detailed mode exists as well
This feature has been available since early Nimbus days, but it has now
been tuned and optimised such that it is safe to enable by default, even
for large setups.
* enable automatic validator monitoring by default
* replace `--validator-monitor-totals` flag with
`--validator-monitor-details` - the detailed mode is disabled by default
* lower "sent" log level to `INF` for several messages - in particular
those that are traced by the validator monitor
This is a retake on #3531 which was later reverted in #3578.
Distinguish between those code locations that need to be updated on each
light client data format change, and those others that should generally
be fine, as long as a valid light client object is processed.
The former are tagged with static assert for `LightClientDataFork.high`.
The latter are changed to `lcDataFork > LightClientDataFork.None` to
indicate that they depend only on presence of any valid object.
Also bundled a few minor cleanups and fixes.
Also add `Forky` type for `LightClientStore` and minor fixes / cleanups.
The light client data structures were changed to accommodate additional
fields in future forks (e.g., to also hold execution data).
There is a minor change to the JSON serialization, where the `header`
properties are now nested inside a `LightClientHeader`.
The SSZ serialization remains compatible.
See https://github.com/ethereum/consensus-specs/pull/3190
and https://github.com/ethereum/beacon-APIs/pull/287
In a future fork, light client data will be extended with execution info
to support more use cases. To anticipate such an upgrade, introduce
`Forky` and `Forked` types, and ready the database schema.
Because the mapping of sync committee periods to fork versions is not
necessarily unique (fork schedule not in sync with period boundaries),
an additional column is added to `period` -> `LightClientUpdate` table.
* fix REST liveness endpoint responding even when gossip is not enabled
* fix VC exit code on doppelganger hit
* fix activation epoch not being updated correctly on long deposit
queues
* fix activation epoch being set incorrectly when updating validator
* move most implementation logic to `validator_pool`, add tests
* ensure consistent logging between VC and BN
* add docs
* Types and scaffolding for EIP-4844
This commit adds the EIP-4844 spec types, and fills in
scaffolding/boilerplate for the use of these types across the repo.
None of the actual EIP-4844 logic is introduced yet.
This follows the pattern used by @tersec when introducing Capella (#4276).
* use eth2-networks fork
* review feedback: add static check EIP4844_FORK_EPOCH == FAR_FUTURE_EPOCH
* review feedback: remove EIP4844 from /eth/v1/config/spec response
* Cleanup / review feedback
* Fix REST test
* Initial commit.
* NextAttestationEntry type.
* Add doppelgangerCheck and actual check.
* Recover deleted check.
* Remove NextAttestainEntry changes.
* More cleanups for NextAttestationEntry.
* Address review comments.
* Remove GENESIS_EPOCH specific check branch.
* Decrease number of full epochs for doppelganger check in VC.
Co-authored-by: zah <zahary@status.im>
When the EL/Builder fails to produce an execution payload, we fall back
to an empty `ExecutionPayload`. Even though it contains no transactions
it should refer to the configured fee recipient. This is useful for
privacy reasons (do not reveal the reason for the empty payload) and for
compliance with additional fee recipient rules by staking pools.
* move duty tracking code to `ActionTracker`
* fix earlier duties overwriting later ones
* re-run subnet selection when new duty appears
* log upcoming duties as soon as they're known (vs 4 epochs before)
* detect mismatch of config and binary
When loading configuration that sets keys that Nimbus bakes into the
binary at compile-time, raise an error if the config is incompatible
instead of ignoring the conflicting value.
Since these files may have been created in a previous run or manually,
we want to keep loading them even on nodes that don't enable the
keystore API (for example static setups)
Other changes:
* log keystore loading progressively (#3699)
* print initial fee recipient when loading validators
* log dynamic fee recipient updates
* more efficient forkchoiceUpdated usage
* await rather than asyncSpawn; ensure head update before dag.updateHead
* use action tracker rather than attached validators to check for next slot proposal; use wall slot + 1 rather than state slot + 1 to correctly check when missing blocks
* re-add two-fcU case for when newPayload not VALID
* check dynamicFeeRecipientsStore for potential proposal
* remove duplicate checks for whether next proposer
* Harden block proposal against expired slashings/exits
When a message is signed in a phase0 domain, it can no longer be
validated under bellatrix due to the correct fork no longer being
available in the `BeaconState`.
To ensure that all slashing/exits are still valid, in this PR we re-run
the checks in the state that we're proposing for, thus hardening against
both signatures and other changes in the state that might have
invalidated the message.
* fix same message added multiple times
in case of attestation slashing of multiple validators in one go
* Fixes a segfault during block production when the Keymanager API
is disabled. The Keymanager is now disabled on half of the local
testnet nodes to catch such problems in the future.
* Fixes multiple potential stalls from REST requests being done
without a timeout. From practice, we know that such requests
can hang forever if not cancelled with a timeout. At best,
this would be a resource leak, at worst, it may lead to a
full stall of the client and missed validator duties.
* Changes some Options usages to Opt (for easier use of valueOr)
When the client was started without any validators, the doppelganger
detection structures were never initialized properly. Later, when
validators were added through the Keymanager API, they interacted
with the uninitialized doppelganger detection structures and their
duties were inappropriately skipped.
* Keymanager API for the validator client
* Properly treat the 'description' field as optional when loading Keystores
* Spec-compliant serialization of the slashing data in Keymanager's DeleteKeys response ()
Fixes#3940Fixes#3964Closes#3884 by adding test
In order to avoid full replays when validating attestations hailing from
untaken forks, it's better to keep shufflings separate from `EpochRef`
and perform a lookahead on the shuffling when processing the block that
determines them.
This also helps performance in the case where REST clients are trying to
perform lookahead on attestation duties and decreases memory usage by
sharing shufflings between EpochRef instances of the same dependent
root.
* MEV validator registration
* add nearby canary to detect new beacon chain forks
* remove special MEV graffiti
* web3signer support
* fix trace logging
* Nim 1.2 needs raises Defect
* use template rather than proc in REST JSON parsing
* use --payload-builder-enable and --payload-builder-url
* explicitly default MEV to disabled
* explicitly empty default value for payload builder URL
* revert attestation pool to unstable version
Other changes:
* The Keymanager error responses differ from the Beacon API responses.
'keymanagerApiError' replaces the former usages of 'jsonError'.
* Return status code 401 and 403 for authorization errors in accordance
to the spec.
* Eliminate inconsistencies in the REST JSON parsing. Some of the code
paths allowed missing fields.
* Added logging of serialization failure details at DEBUG level.
a notice in the log is enough - we don't want the REST API to return an
error in this case because that makes the validator client think
something is seriously wrong (like the BN or message being broken)
Whether new blocks/attestations/etc are produced internally or received
via REST, their journey through the node is the same - to ensure that
they get the same treatment (logging, metrics, processing), this PR
moves the routing to a dedicated module and fixes several small
differences that existed before.
* `xxxValidator` -> `processMessageName` - the processor also was adding
messages to pools, so we want the name to reflect that action
* add missing "sent" metrics for some messages
* document ignore policy better - already-seen messages are not actaully
rebroadcast by libp2p
* skip redundant signature checks for internal validators consistently
The justified and finalized `Checkpoint` are frequently passed around
together. This introduces a new `FinalityCheckpoint` data structure that
combines them into one.
Due to the large usage of this structure in fork choice, also took this
opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec.
Many additional tests enabled, some need more work, e.g. EL mock blocks.
Also implemented `discard_equivocations` which was skipped in #3661,
and improved code reuse across fork choice logic while at it.
* optimistic sync
* flag that initially loaded blocks from database might need execution block root filled in
* return optimistic status in REST calls
* refactor blockslot pruning
* ensure beacon_blocks_by_{root,range} do not provide optimistic blocks
* handle forkchoice head being pre-merge with block being postmerge
* re-enable blocking head updates on validator duties
* fix is_optimistic_candidate_block per spec; don't crash with nil future
* fix is_optimistic_candidate_block per spec; don't crash with nil future
* mark blocks sans execution payloads valid during head update
Combines the LC data configuration options (serve / importMode), the
callbacks (finality / optimistic LC update) as well as the cache storing
light client data, into a new `LightClientDataStore` structure.
Also moves the structure into a light client specific file.
* check for and log gossip broadcast failure
* switch notices to warns; update LC variables regardless
* don't both return a Result and log sending error
* add metrics counter for failed-due-to-no-peers and removed unnecessary async
* don't report failure of sync committee messages
* remove redundant metric
* document metric being incremented
For consistency with other options, use a common prefix for light client
data configuration options.
* `--serve-light-client-data` --> `--light-client-data-serve`
* `--import-light-client-data` --> `--light-client-data-import-mode`
No deprecation of the old identifiers as they were only sparingly used
and all usage can be easily updated without interferance.
* SSZ `[]` -> `mitem`
* `[]` -> `item`
immutable access via mutable instance cannot rely on template
overloading, and `[]` cannot be a `func` because of special seq handling
in compiler.
Incorporates the latest changes to the light client sync protocol based
on Devconnect AMS feedback. Note that this breaks compatibility with the
previous prototype, due to changes to data structures and endpoints.
See https://github.com/ethereum/consensus-specs/pull/2802
* Some Web3Signer versions insist replying with text/plain messages
* When reading blocks, the Web3Signer uses upper-case fork identifiers
instead of lower-case identifies like the Beacon API.
Other changes:
* logtrace can now verify sync committee messages and contributions
* Many unnecessary use of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
Other fixes:
* Fix bit rot in the `make prater-dev-deposit` target.
* Correct content-type in the responses of the Nimbus signing node
* Invalid JSON payload was being sent in the web3signer requests
* use MAX_CHUNK_SIZE_BELLATRIX for signed Bellatrix blocks
* Update beacon_chain/networking/eth2_network.nim
Co-authored-by: Etan Kissling <etan@status.im>
* localPassC to localPassc
* check against maxChunkSize rather than constant
Co-authored-by: Etan Kissling <etan@status.im>
Some upstream repos still need fixes, but this gets us close enough that
style hints can be enabled by default.
In general, "canonical" spellings are preferred even if they violate
nep-1 - this applies in particular to spec-related stuff like
`genesis_validators_root` which appears throughout the codebase.
Validator monitoring improves logging by giving more specific monitoring
information, and can now be seen as complete.
Previously, logging has focused on "Attestation sent" messages which
carry little informational value when things go wrong, and are overly
aggressive when everything works as expected (sending attestations is
the norm).
* lower "Attestation sent" log to `INFO`
* mark 1.7.0 as the start of the validator monitor feature - previous
versions had significant bugs in totals mode
* harden validator API against pre-finalized slot requests
* check `syncHorizon` when responding to validator api requests too far
from `head`
* limit state-id based requests to one epoch ahead of `head`
* put historic data bounds on block/attestation/etc validator production API, preventing them from being used with already-finalized slots
* add validator block smoke tests
* make rest test create a new genesis with the tests running roughly in
the first epoch to allow testing a few more boundary conditions
Recently, block processing times have been going up as the network grows
making early attestation riskier. Since blocks are big and attestations
are small (though numerous and therefore bandwidth-intense), it seems
better to wait a little bit longer after receiving a block, before we
publish the attestation.
Up til now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.
By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory
landing us at a steady state of ~750mb normal memory usage for a
validating node.
Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time - instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).
In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensure that consensus performance remains unchanged.
What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no percievable cost to functionality or performance.
A nice collateral benefit is the almost-instant startup since we no
longer load any large indicies at dag init.
The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this need revisiting :)
One more step on the journey to reduce `BlockRef` usage across the
codebase - this one gets rid of `StateData` whose job was to keep track
of which block was last assigned to a state - these duties have now been
taken over by `latest_block_root`, a fairly recent addition that
computes this block root from state data (at a small cost that should be
insignificant)
99% mechanical change.
* fewer deps on `BlockRef` traversal in anticipation of pruning
* allows identifying EpochRef:s by their shuffling as a first step of
* tighten error handling around missing blocks
using the zero hash for signalling "missing block" is fragile and easy
to miss - with checkpoint sync now, and pruning in the future, missing
blocks become "normal".
* Initial commit.
* Fix current test suite.
* Fix keymanager api test.
* Fix wss_sim.
* Add more keystore_management tests.
* Recover deleted isEmptyDir().
* Add `HttpHostUri` distinct type.
Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls.
Add REST serialization of RemoteKeystore and Keystore object.
Add tests for Remote Keystore management API.
Add tests for Keystore management API (Add keystore).
Fix serialzation issues.
* Fix test to use HttpHostUri instead of Uri.
* Add links to specification in comments.
* Remove debugging echoes.
* deactivate Doppelganger Protection during genesis
* also don't actually flag supposed-doppelgangers (because they're before broadcastStartEpoch) on GENESIS_SLOT start
* update action tracker on dependent-root-changing reorg (instead of
epoch change)
* don't try to log duties while syncing - we're not tracking actions yet
* fix slot used for doppelganger loss detection
These use a separate flow, and were previously only registered from the
network
* don't log successes in totals mode (TMI)
* remove `attestation-sent` event which is unused
The current counters set gauges etc to the value of the _last_ validator
to be processed - as the name of the feature implies, we should be using
sums instead.
* fix missing beacon state metrics on startup, pre-first-head-selection
* fix epoch metrics not being updated on cross-epoch reorg
* limit by-root requests to non-finalized blocks
Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.
We can distinguish between two cases where by-root access is useful:
* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really
In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.
Future work includes:
* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)
Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.
* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`
* fix dag.blocks ref
* initial support for minification and new interchange tests. Removal of v1 and v1 migration.
* Synthetic attestations: SQLite3 requires one statement/query per prepared statement
* Fix DB import interrupted if no attestation was found
* Skip test relying on undocumented test behavior (https://github.com/eth-clients/slashing-protection-interchange-tests/pull/12#issuecomment-1011158701)
* Skip test relying on unclear minification behavior:
creating an invalid minified attestation with source > target or setting target = max(source, target)
* remove DB v1 and update submodule
* Apply suggestions from code review
Co-authored-by: Jacek Sieka <jacek@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>