Removes a few extra-ambitious templates to make `self` updates explicit,
and moves the `FinalityCheckpoints` type from `base` to `helpers` as it
is an additional Nimbus specific type not defined by spec.
Whether new blocks/attestations/etc are produced internally or received
via REST, their journey through the node is the same - to ensure that
they get the same treatment (logging, metrics, processing), this PR
moves the routing to a dedicated module and fixes several small
differences that existed before.
* `xxxValidator` -> `processMessageName` - the processor also was adding
messages to pools, so we want the name to reflect that action
* add missing "sent" metrics for some messages
* document ignore policy better - already-seen messages are not actaully
rebroadcast by libp2p
* skip redundant signature checks for internal validators consistently
The justified and finalized `Checkpoint` are frequently passed around
together. This introduces a new `FinalityCheckpoint` data structure that
combines them into one.
Due to the large usage of this structure in fork choice, also took this
opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec.
Many additional tests enabled, some need more work, e.g. EL mock blocks.
Also implemented `discard_equivocations` which was skipped in #3661,
and improved code reuse across fork choice logic while at it.
* optimistic sync
* flag that initially loaded blocks from database might need execution block root filled in
* return optimistic status in REST calls
* refactor blockslot pruning
* ensure beacon_blocks_by_{root,range} do not provide optimistic blocks
* handle forkchoice head being pre-merge with block being postmerge
* re-enable blocking head updates on validator duties
* fix is_optimistic_candidate_block per spec; don't crash with nil future
* fix is_optimistic_candidate_block per spec; don't crash with nil future
* mark blocks sans execution payloads valid during head update
Separate LC initialization options from the main ChainDAGRef options to
allow ChainDAGRef to treat them as opaque and reduce risk for conflicts
when extending those options in the future.
Merkle proofs tend to have long underlying type definitions, e.g.,
`array[log2trunc(NEXT_SYNC_COMMITTEE_INDEX), Eth2Digest]`. For the
ones used in the LC sync protocol, dedicated types are introduced
to improve readability. Furthermore, the `CachedLightClientBootstrap`
wrapper that solely wrapped a merkle branch is eliminated.
Adds a `--light-client-data-max-periods` option to override the number
of sync committee periods to retain light client data.
Raising it above the default enables archive nodes to serve full data.
Lowering below the default speeds up import times (still no persistence)
This updates `nim-ssz-serialization` to
`3db6cc0f282708aca6c290914488edd832971d61`.
Notable changes:
- Use `uint64` for `GeneralizedIndex`
- Add support for building merkle multiproofs
* Initial commit
* Make `events` API spec compliant.
* Add `Eth-Consensus-Version` in responses.
* Bump chronos to get redirect with headers working.
* Add `is_optimistic` field and handling to syncing RestSyncInfo.
* adopt LC REST API with v0 suffix (without proofs)
Adopts the light client data REST API used by Lodestar as defined in
https://github.com/ethereum/beacon-APIs/pull/181 with a v0 suffix.
Requests:
- `/eth/v0/beacon/light_client/bootstrap/{block_root}`
- `/eth/v0/beacon/light_client/updates?start_period={start_period}&count={count}`
- `/eth/v0/beacon/light_client/finality_update`
- `/eth/v0/beacon/light_client/optimistic_update`
HTTP Server-Sent Events (SSE):
- `light_client_finality_update_v0`
- `light_client_optimistic_update_v0`
More work is needed to adopt the proofs endpoint, it is not included.
* initialize event queues
* register event topics
When launched with `--light-client-enable` the latest blocks are fetched
and optimistic candidate blocks are passed to a callback (log for now).
This helps accelerate syncing in the future (optimistic sync).
Introduces a new library for syncing using libp2p based light client
sync protocol, and adds a new `nimbus_light_client` executable that uses
this library for syncing. The new executable emits log messages when
new beacon block headers are received, and is integrated into testing.
* SSZ `[]` -> `mitem`
* `[]` -> `item`
immutable access via mutable instance cannot rely on template
overloading, and `[]` cannot be a `func` because of special seq handling
in compiler.
* remove deprecated JSON-RPC server
* keep the command-line options around as no-ops, temporarily
* service -> server; JSON-RPC is still used elsewhere
* document static vs dynamic range checking requirements
* add `vindices` iterator to iterate over valid validator indices in a
state
* clean up spec comments in general
* fixup
Co-authored-by: tersec <tersec@users.noreply.github.com>
Incorporates the latest changes to the light client sync protocol based
on Devconnect AMS feedback. Note that this breaks compatibility with the
previous prototype, due to changes to data structures and endpoints.
See https://github.com/ethereum/consensus-specs/pull/2802
Since we were not verifying BLS signature in blocks that we produce,
we were failing to notice that some deposits need to be ignored (due
to having an invalid signature). Processing these deposits resulted
in a different ending state after the state transition which caused
our blocks to be rejected by the network.
* Some Web3Signer versions insist replying with text/plain messages
* When reading blocks, the Web3Signer uses upper-case fork identifiers
instead of lower-case identifies like the Beacon API.
Other changes:
* logtrace can now verify sync committee messages and contributions
* Many unnecessary use of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
Other fixes:
* Fix bit rot in the `make prater-dev-deposit` target.
* Correct content-type in the responses of the Nimbus signing node
* Invalid JSON payload was being sent in the web3signer requests
* era file verification
Implement and document era file verification
* era file states now come with block applied for easier verification
* clarify conflicting version handling
* document verification requirements
* remove count from name, use start-era, end-root to discover range
* remove obsolete todo
* abstract out block root loading
Updated outdated presets / configs / REST config to v1.1.10 specs.
- `TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH` and `PROPOSER_SCORE_BOOST` are
not yet used in `eth2-networks`, added configurability as TODOs.
- `MIN_ANCHOR_POW_BLOCK_DIFFICULTY` is no longer needed, put on ignore
list as some Altair devnets still reference it.
* use MAX_CHUNK_SIZE_BELLATRIX for signed Bellatrix blocks
* Update beacon_chain/networking/eth2_network.nim
Co-authored-by: Etan Kissling <etan@status.im>
* localPassC to localPassc
* check against maxChunkSize rather than constant
Co-authored-by: Etan Kissling <etan@status.im>
* `gnosis-chain` -> `gnosis`
Use same name as LH/Teku throughout
* fixes#3504
* fixes large stack temporary that can cause crashes during genesis
detection
Some upstream repos still need fixes, but this gets us close enough that
style hints can be enabled by default.
In general, "canonical" spellings are preferred even if they violate
nep-1 - this applies in particular to spec-related stuff like
`genesis_validators_root` which appears throughout the codebase.
* era: load blocks and states
Era files contain finalized history and can be thought of as an
alternative source for block and state data that allows clients to avoid
syncing this information from the P2P network - the P2P network is then
used to "top up" the client with the most recent data. They can be
freely shared in the community via whatever means (http, torrent, etc)
and serve as a permanent cold store of consensus data (and, after the
merge, execution data) for history buffs and bean counters alike.
This PR gently introduces support for loading blocks and states in two
cases: block requests from rest/p2p and frontfilling when doing
checkpoint sync.
The era files are used as a secondary source if the information is not
found in the database - compared to the database, there are a few key
differences:
* the database stores the block indexed by block root while the era file
indexes by slot - the former is used only in rest, while the latter is
used both by p2p and rest.
* when loading blocks from era files, the root is no longer trivially
available - if it is needed, it must either be computed (slow) or cached
(messy) - the good news is that for p2p requests, it is not needed
* in era files, "framed" snappy encoding is used while in the database
we store unframed snappy - for p2p2 requests, the latter requires
recompression while the former could avoid it
* front-filling is the process of using era files to replace backfilling
- in theory this front-filling could happen from any block and
front-fills with gaps could also be entertained, but our backfilling
algorithm cannot take advantage of this because there's no (simple) way
to tell it to "skip" a range.
* front-filling, as implemented, is a bit slow (10s to load mainnet): we
load the full BeaconState for every era to grab the roots of the blocks
- it would be better to partially load the state - as such, it would
also be good to be able to partially decompress snappy blobs
* lookups from REST via root are served by first looking up a block
summary in the database, then using the slot to load the block data from
the era file - however, there needs to be an option to create the
summary table from era files to fully support historical queries
To test this, `ncli_db` has an era file exporter: the files it creates
should be placed in an `era` folder next to `db` in the data directory.
What's interesting in particular about this setup is that `db` remains
as the source of truth for security purposes - it stores the latest
synced head root which in turn determines where a node "starts" its
consensus participation - the era directory however can be freely shared
between nodes / people without any (significant) security implications,
assuming the era files are consistent / not broken.
There's lots of future improvements to be had:
* we can drop the in-memory `BlockRef` index almost entirely - at this
point, resident memory usage of Nimbus should drop to a cool 500-600 mb
* we could serve era files via REST trivially: this would drop backfill
times to whatever time it takes to download the files - unlike the
current implementation that downloads block by block, downloading an era
at a time almost entirely cuts out request overhead
* we can "reasonably" recreate detailed state history from almost any
point in time, turning an O(slot) process into O(1) effectively - we'll
still need caches and indices to do this with sufficient efficiency for
the rest api, but at least it cuts the whole process down to minutes
instead of hours, for arbitrary points in time
* CI: ignore failures with Nim-1.6 (temporary)
* test fixes
Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
Adds `LightClientProcessor` as the pendant to `BlockProcessor` while
operating in light client mode. Note that a similar mechanism based on
async futures is used for interoperability with existing infrastructure,
despite light client object validation being done synchronously.
The spec implicitly talks about the slot of a block in several places,
and keeping it readily available is useful in a number of context -
might as well put this implicitly refereneced helper in the spec code
directly
One more step on the journey to reduce `BlockRef` usage across the
codebase - this one gets rid of `StateData` whose job was to keep track
of which block was last assigned to a state - these duties have now been
taken over by `latest_block_root`, a fairly recent addition that
computes this block root from state data (at a small cost that should be
insignificant)
99% mechanical change.
* fewer deps on `BlockRef` traversal in anticipation of pruning
* allows identifying EpochRef:s by their shuffling as a first step of
* tighten error handling around missing blocks
using the zero hash for signalling "missing block" is fragile and easy
to miss - with checkpoint sync now, and pruning in the future, missing
blocks become "normal".
When syncing as a light client, different behaviour is needed to handle
the various ways how errors may occur. The existing logic for blocks can
also be applied to light client objects:
- `Invalid`: Malformed object that is clearly an error by its producer.
- `MissingParent`: More data is needed to decide applicability.
- `UnviableFork`: Object may be valid but will never apply on this fork.
- `Duplicate`: No errors were encountered but the object was not useful.
When performing trusted node sync, historical access is limited to
states after the checkpoint.
Reindexing restores full historical access by replaying historical
blocks against the state and storing snapshots in the database.
The process can be initiated or resumed at any point in time.
This adopts the spec sections of the pre-release proposal of the libp2p
based light client sync protocol, and also adds a test runner for the
new accompanying tests. While the release version of the light client
sync protocol contains conflicting definitions, it is currently unused,
and the code specific to the pre-release proposal is marked as such.
See https://github.com/ethereum/consensus-specs/pull/2802
The spec does not provide code for validating the `fork_version` field
of `LightClientUpdate`. However, we can use our own logic for additional
validation of that field. The spec's python test suite sets up states
that do not follow the fork schedule (e.g., that use Altair fork version
before Altair fork epoch), which complicates upstreaming this as code.
Uses consistent formatting in `light_client_sync.nim`, always refers to
fork-dependent light client objects in full qualified notation, moves
`get_safety_threshold` helper function to same location as in the spec.
Can't apply a phase0 block to a later phase state and vice versa.
Since instantiation has been a topic, pre/post c file size:
```
424K @mspec@sstate_transition.nim.c
892K @mspec@sstate_transition_block.nim.c
```
```
288K @mspec@sstate_transition.nim.c
880K @mspec@sstate_transition_block.nim.c
```
Updates the spec references for `GeneralizedIndex` constants used by the
light client sync protocol, and adds a short explanation how they are
derived and which SSZ fields they refer to.
* Support for Gnosis Chain
`make gnosis-chain-build` will build the Nimbus gnosis chain binary,
stored in `build/nimbus_beacon_node_for_gnosis_chain`.
`make gnosis-chain` will connect to the network.
Other changes:
* Restore compilation with -d:has_genesis_detection
* Removed Makefile target related to testnet0 and testnet1
* Added more debug logging for failed peer handshakes
* Report misconfigured builds which try to embed network metadata
that is incompatible with the currently selected const preset.
* Don't bundle network metadata in minimal builds, as they are not compatible
To calculate the deltas correctly, the `process_inactivity_updates` function
must be called before the rewards and penalties processing code in order to
update the `inactivity_scores` field in the state. This would have required
duplicating more logic from the spec in the ncli modules, so I've decided to
pay the price of introducing a run-time copy of the state at each epoch which
eliminates the need to duplicate logic (both for this fix and the previous one).
Other changes:
* Fixes for the read-only mode of the `BeaconChainDb`
* Fix an uint64 underflow in the debug output procedure for printing
balance deltas
* Allow Bellatrix states in the reward computation helpers
* clean up / document init
* drop `immutable_validators` data (pre-altair)
* document versions where data is first added
* avoid needlessly loading genesis block data on startup
* add a few more internal database consistency checks
* remove duplicate state root lookup on state load
* comment
With these changes, we can backfill about 400-500 slots/sec, which means
a full backfill of mainnet takes about 2-3h.
However, the CPU is not saturated - neither in server nor in client
meaning that somewhere, there's an artificial inefficiency in the
communication - 16 parallel downloads *should* saturate the CPU.
One plasible cause would be "too many async event loop iterations" per
block request, which would introduce multiple "sleep-like" delays along
the way.
I can push the speed up to 800 slots/sec by increasing parallel
downloads even further, but going after the root cause of the slowness
would be better.
* avoid some unnecessary block copies
* double parallel requests
When node is restarted before backfill has started but after some blocks
have finalized with forward sync, we would not start the backfill.
* also clean up one last `SomeSome`
* Initial commit.
* Fix current test suite.
* Fix keymanager api test.
* Fix wss_sim.
* Add more keystore_management tests.
* Recover deleted isEmptyDir().
* Add `HttpHostUri` distinct type.
Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls.
Add REST serialization of RemoteKeystore and Keystore object.
Add tests for Remote Keystore management API.
Add tests for Keystore management API (Add keystore).
Fix serialzation issues.
* Fix test to use HttpHostUri instead of Uri.
* Add links to specification in comments.
* Remove debugging echoes.
Make `validator exit command` work both with `JSON-RPC` and `REST` APIs
Fix problem with specifying rest-url using `localhost`
Change back exit error messages in `state_transition_block`
* Store finalized block roots in database (3s startup)
When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in `.era` files to allow
constant-time by-slot lookups.
In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.
Doing so has a number of interesting effects:
* mainnet startup time is improved 3-5x (3s on my laptop)
* the _first_ startup might take slightly longer as the new index is
being built - ~10s on the same laptop
* we no longer rely on the beacon block summaries to load the full dag -
this is a lot faster because we no longer have to look up each block by
parent root
* a collateral benefit is that we no longer need to load the full
summaries table into memory - we get the RSS benefits of #3164 without
the CPU hit.
Other random stuff:
* simplify forky block generics
* fix withManyWrites multiple evaluation
* fix validator key cache not being updated properly in chaindag
read-only mode
* drop pre-altair summaries from `kvstore`
* recreate missing summaries from altair+ blocks as well (in case
database has lost some to an involuntary restart)
* print database startup timings in chaindag load log
* avoid allocating superfluos state at startup
* use a recursive sql query to load the summaries of the unfinalized
blocks
The new format is based on compressed CSV files in two channels:
* Detailed per-epoch data
* Aggregated "daily" summaries
The use of append-only CSV file speeds up significantly the epoch
processing speed during data generation. The use of compression
results in smaller storage requirements overall. The use of the
aggregated files has a very minor cost in both CPU and storage,
but leads to near interactive speed for report generation.
Other changes:
- Implemented support for graceful shut downs to avoid corrupting
the saved files.
- Fixed a memory leak caused by lacking `StateCache` clean up on each
iteration.
- Addressed review comments
- Moved the rewards and penalties calculation code in a separate module
Required invasive changes to existing modules:
- The `data` field of the `KeyedBlockRef` type is made public to be used
by the validator rewards monitor's Chain DAG update procedure.
- The `getForkedBlock` procedure from the `blockchain_dag.nim` module
is made public to be used by the validator rewards monitor's Chain DAG
update procedure.
This is an alternative take on https://github.com/status-im/nimbus-eth2/pull/3107
that aims for more minimal interventions in the spec modules at the expense of
duplicating more of the spec logic in ncli_db.
Using the wrong type here causes requests to fail due to the overly
zealous parameter validation - the failure is harmless in the current
duty subscription model, but would have caused more serious failures
down the line.
* Trusted node sync
Trusted node sync, aka checkpoint sync, allows syncing tyhe chain from a
trusted node instead of relying on a full sync from genesis.
Features include:
* sync from any slot, including the latest finalized slot
* backfill blocks either from the REST api (default) or p2p (#3263)
Future improvements:
* top up blocks between head in database and some other node - this
makes for an efficient backup tool
* recreate historical state to enable historical queries
* fixes
* load genesis from network metadata
* check checkpoint block root against state
* fix invalid block root in rest json decoding
* odds and ends
* retry looking for epoch-boundary checkpoint blocks
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).
Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.
* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`
A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.