* --stop-at-synced-epoch
This allows benchmarking the initial sync (only forward sync, 1s error
margin). Might be useful in CI, with a timeout, as a sanity check.
The spec does not provide code for validating the `fork_version` field
of `LightClientUpdate`. However, we can use our own logic for additional
validation of that field. The spec's python test suite sets up states
that do not follow the fork schedule (e.g., that use Altair fork version
before Altair fork epoch), which complicates upstreaming this as code.
* Refactor and optimize logs.
* Introduce shortLog(SyncRequest).
* Address review comment.
* make sync queue logs more consistent
Adds a few minor logging improvements:
- Fixes a typo (`was happened` -> `has happened`)
- Avoids passing `reset_slot` argument to log statement multiple times
- Uses same `rewind_to_slot` label when logging in both sync directions
- Consistent rewind point logging
Co-authored-by: cheatfate <eugene.kabanov@status.im>
Uses consistent formatting in `light_client_sync.nim`, always refers to
fork-dependent light client objects in full qualified notation, moves
`get_safety_threshold` helper function to same location as in the spec.
Can't apply a phase0 block to a later phase state and vice versa.
Since instantiation has been a topic, pre/post c file size:
```
424K @mspec@sstate_transition.nim.c
892K @mspec@sstate_transition_block.nim.c
```
```
288K @mspec@sstate_transition.nim.c
880K @mspec@sstate_transition_block.nim.c
```
Updates the spec references for `GeneralizedIndex` constants used by the
light client sync protocol, and adds a short explanation how they are
derived and which SSZ fields they refer to.
This PR names and documents the concept of the archive: a range of slots
for which we have degraded functionality in terms of historical access -
in particular:
* we don't support rewinding to states in this range
* we don't keep an in-memory representation of the block dag
The archive de-facto exists in a trusted-node-synced node, but this PR
gives it a name and drops the in-memory digest index.
In order to satisfy `GetBlocksByRange` requests, we ensure that we have
blocks for the entire archive period via backfill. Future versions may
relax this further, adding a "pre-archive" period that is fully pruned.
During by-slot searches in the archive (both for libp2p and rest
requests), an extra database lookup is used to covert the given `slot`
to a `root` - future versions will avoid this using era files which
natively are indexed by `slot`. That said, the lookup is quite
fast compared to the actual block loading given how trivial the table
is - it's hard to measure, even.
A collateral benefit of this PR is that checkpoint-synced nodes will see
100-200MB memory usage savings, thanks to the dropped in-memory cache -
future pruning work will bring this benefit to full nodes as well.
* document chaindag storage architecture and assumptions
* look up parent using block id instead of full block in clearance
(future-proofing the code against a future in which blocks come from era
files)
* simplify finalized block init, always writing the backfill portion to
db at startup (to ensure lookups work as expected)
* preallocate some extra memory for finalized blocks, to avoid immediate
realloc
https://github.com/ethereum/consensus-specs/pull/2225 removed an ignore
rule that would filter out duplicate aggregates from gossip publishing -
however, this causes increased bandwidth and CPU usage as discussed in
https://github.com/ethereum/consensus-specs/issues/2183 - the intent is
to revert the removal and reinstate the rule.
This PR implements ignore filtering which cuts down on CPU usage (fewer
aggregates to validate) and bandwidth usage (less fanout of duplicates)
- as #2225 points out, this may lead to a small increase in IHAVE
messages.
* Support for Gnosis Chain
`make gnosis-chain-build` will build the Nimbus gnosis chain binary,
stored in `build/nimbus_beacon_node_for_gnosis_chain`.
`make gnosis-chain` will connect to the network.
Other changes:
* Restore compilation with -d:has_genesis_detection
* Removed Makefile target related to testnet0 and testnet1
* Added more debug logging for failed peer handshakes
* Report misconfigured builds which try to embed network metadata
that is incompatible with the currently selected const preset.
* Don't bundle network metadata in minimal builds, as they are not compatible
To calculate the deltas correctly, the `process_inactivity_updates` function
must be called before the rewards and penalties processing code in order to
update the `inactivity_scores` field in the state. This would have required
duplicating more logic from the spec in the ncli modules, so I've decided to
pay the price of introducing a run-time copy of the state at each epoch which
eliminates the need to duplicate logic (both for this fix and the previous one).
Other changes:
* Fixes for the read-only mode of the `BeaconChainDb`
* Fix an uint64 underflow in the debug output procedure for printing
balance deltas
* Allow Bellatrix states in the reward computation helpers
Streamline lookup with Forky and BeaconBlockFork (then we can do the
same for era)
We use type to avoid conditionals, as fork is often already known at a
"higher" level.
* load blockid before loading block by root - this is needed to map root
to slot and will eventually be done via block summary table for "old"
blocks
Co-authored-by: tersec <tersec@users.noreply.github.com>
Update several `ncli_db` commands to run in readOnly mode, allowing them
to be used with a running instance - in particular era export.
* export all eras by default
* skip already-exported eras
* clean up / document init
* drop `immutable_validators` data (pre-altair)
* document versions where data is first added
* avoid needlessly loading genesis block data on startup
* add a few more internal database consistency checks
* remove duplicate state root lookup on state load
* comment
When initializing backfill sync, the implementation intends to start at
the first unknown slot (`1` before tail). However, an incorrect variable
is passed, and backfill sync actually starts at the tail slot instead.
This patch corrects this by passing the intended variable. The problem
was introduced with the original backfill implementation at #3263.
The added options work in opt-in fashion. If they are not specified,
the server will respond to all requests as if the CORS specification
doesn't exist. This will result in errors in CORS-enabled clients.
Please note that future versions may support more than one allowed
origin. The option names will stay the same, but the user will be
able to repeat them on the command line (similar to other options
such as --web3-url).
To be documented in the guide in a separate PR.
The `SyncManager` has a leftover optional `sleepTime` parameter in
its constructor that used to configure the sync loop polling rate.
This parameter was replaced with a constant in #1602 and is no longer
functional. This patch removes the `sleepTime` leftovers.
#3304 introduced a regression to the sync status string displayed in the
status bar; during the main forward sync, the current slot is no longer
reported and always displays as `0`. This patch corrects the computation
to accurately report the current slot once more.
The `SyncManager` has a leftover optional `maxStatusAge` parameter in
its constructor that used to configure the libp2p `Status` polling rate.
This parameter was replaced with a constant in #1827 and is no longer
functional. This patch removes the `maxStatusAge` leftovers.
With these changes, we can backfill about 400-500 slots/sec, which means
a full backfill of mainnet takes about 2-3h.
However, the CPU is not saturated - neither in server nor in client
meaning that somewhere, there's an artificial inefficiency in the
communication - 16 parallel downloads *should* saturate the CPU.
One plasible cause would be "too many async event loop iterations" per
block request, which would introduce multiple "sleep-like" delays along
the way.
I can push the speed up to 800 slots/sec by increasing parallel
downloads even further, but going after the root cause of the slowness
would be better.
* avoid some unnecessary block copies
* double parallel requests
When node is restarted before backfill has started but after some blocks
have finalized with forward sync, we would not start the backfill.
* also clean up one last `SomeSome`