* move duty tracking code to `ActionTracker`
* fix earlier duties overwriting later ones
* re-run subnet selection when new duty appears
* log upcoming duties as soon as they're known (vs 4 epochs before)
Currently, we require genesis and a checkpoint block and state to start
from an arbitrary slot - this PR relaxes this requirement so that we can
start with a state alone.
The current trusted-node-sync algorithm works by first downloading
blocks until we find an epoch aligned non-empty slot, then downloads the
state via slot.
However, current
[proposals](https://github.com/ethereum/beacon-APIs/pull/226) for
checkpointing prefer finalized state as
the main reference - this allows more simple access control and caching
on the server side - in particular, this should help checkpoint-syncing
from sources that have a fast `finalized` state download (like infura
and teku) but are slow when accessing state via slot.
Earlier versions of Nimbus will not be able to read databases created
without a checkpoint block and genesis. In most cases, backfilling makes
the database compatible except where genesis is also missing (custom
networks).
* backfill checkpoint block from libp2p instead of checkpoint source,
when doing trusted node sync
* allow starting the client without genesis / checkpoint block
* perform epoch start slot lookahead when loading tail state, so as to
deal with the case where the epoch start slot does not have a block
* replace `--blockId` with `--state-id` in TNS command line
* when replaying, also look at the parent of the last-known-block (even
if we don't have the parent block data, we can still replay from a
"parent" state) - in particular, this clears the way for implementing
state pruning
* deprecate `--finalized-checkpoint-block` option (no longer needed)
* Allow chain dag without genesis / block
This PR enables the initialization of the dag without access to blocks
or genesis state - it is a prerequisite for implementing a number of
interesting features:
* checkpoint sync without any block download
* pruning of blocks and states
* backfill checkpoint block
When EL `newPayload` is slow (e.g., Raspberry Pi with Besu), the epoch
and shuffling caches tend to fill up with multiple copies per epoch when
processing gossip and performing validator duties close to wall slot.
The old strategy of evicting oldest epoch led to the same item being
evicted over and over, leading to blocking of over 5 minutes in extreme
cases where alternate epochs/shuffling got loaded repeatedly.
Changing the cache eviction strategy to least-recently-used seems to
improve the situation drastically. A simple implementation was selected
based on single linked-list without a hashtable.
When backfilling LC updates (`--light-client-data-import-mode=full`),
the highest participation update is computed without ensuring that the
finalized header is in the same period. Updates sharing same period for
both finalized and attested headers should be preferred.
Fixes a bug leading to suboptimal update selection.
* avoid database race-condition inconsistency after fcU `INVALID` then crash
* ensure head doesn't fall behind finalized; add more tests for head movement/reloading DAG
When the BN's head is reorged while shut down, reloading the BN will not
assign `BlockRef` to alternate branches. However, blocks from other
branches are still present in the database, leading to their descendants
incorrectly marked as `UnviableFork`. By restricting the check to blocks
that have been finalized, they should be reported as `MissingParent`
instead, eventually re-assigning a `BlockRef` to them.
Since these files may have been created in a previous run or manually,
we want to keep loading them even on nodes that don't enable the
keystore API (for example static setups)
Other changes:
* log keystore loading progressively (#3699)
* print initial fee recipient when loading validators
* log dynamic fee recipient updates
* more efficient forkchoiceUpdated usage
* await rather than asyncSpawn; ensure head update before dag.updateHead
* use action tracker rather than attached validators to check for next slot proposal; use wall slot + 1 rather than state slot + 1 to correctly check when missing blocks
* re-add two-fcU case for when newPayload not VALID
* check dynamicFeeRecipientsStore for potential proposal
* remove duplicate checks for whether next proposer
When the BN-embedded LC makes sync progress, pass the corresponding
execution block hash to the EL via `engine_forkchoiceUpdatedV1`.
This allows the EL to sync to wall slot while the chain DAG is behind.
Renamed `--light-client` to `--sync-light-client` for clarity, and
`--light-client-trusted-block-root` to `--trusted-block-root` for
consistency with `nimbus_light_client`.
Note that this does not work well in practice at this time:
- Geth sticks to the optimistic sync:
"Ignoring payload while snap syncing" (when passing the LC head)
"Forkchoice requested unknown head" (when updating to LC head)
- Nethermind syncs to LC head but does not report ancestors as VALID,
so the main forward sync is still stuck in optimistic mode:
"Pre-pivot block, ignored and returned Syncing"
To aid EL client teams in fixing those issues, having this available
as a hidden option is still useful.
The optimistic sync spec was updated since the LC based optsync module
was introduced. It is no longer necessary to wait for the justified
checkpoint to have execution enabled; instead, any block is okay to be
optimistically imported to the EL client, as long as its parent block
has execution enabled. Complex syncing logic has been removed, and the
LC optsync module will now follow gossip directly, reducing the latency
when using this module. Note that because this is now based on gossip
instead of using sync manager / request manager, that individual blocks
may be missed. However, EL clients should recover from this by fetching
missing blocks themselves.
* Harden block proposal against expired slashings/exits
When a message is signed in a phase0 domain, it can no longer be
validated under bellatrix due to the correct fork no longer being
available in the `BeaconState`.
To ensure that all slashing/exits are still valid, in this PR we re-run
the checks in the state that we're proposing for, thus hardening against
both signatures and other changes in the state that might have
invalidated the message.
* fix same message added multiple times
in case of attestation slashing of multiple validators in one go
Aligns the default retention policy for LC data with the one for blocks.
Minimum spec requirement for both blocks and LC data is ~5 months.
Additional use cases are better supported by retaining data for longer.
In order to avoid full replays when validating attestations hailing from
untaken forks, it's better to keep shufflings separate from `EpochRef`
and perform a lookahead on the shuffling when processing the block that
determines them.
This also helps performance in the case where REST clients are trying to
perform lookahead on attestation duties and decreases memory usage by
sharing shufflings between EpochRef instances of the same dependent
root.