562 Commits

Author SHA1 Message Date
Etan Kissling
55a5ffaf8c
Merge branch 'dev/etan/zf-branchpull' into feat/splitview 2024-03-27 16:40:29 +01:00
Etan Kissling
b37ad4dccb
fix 2024-03-27 16:39:42 +01:00
Etan Kissling
b869546524
Merge branch 'dev/etan/zf-branchpull' into feat/splitview 2024-03-27 16:14:38 +01:00
Etan Kissling
02a69be4e2
generic branch discovery version that supports mocking peers 2024-03-27 16:00:36 +01:00
Etan Kissling
9f37ffdc62
suspend light client sync while branch discovery is in progress 2024-03-27 16:00:02 +01:00
Etan Kissling
ea2cf8e69b
Merge branch 'dev/etan/zf-branchpull' into feat/splitview 2024-03-25 23:44:50 +01:00
Etan Kissling
74606c6e1b
handoff useless peers from sync manager directly into branch discovery 2024-03-25 23:44:05 +01:00
Etan Kissling
58383d1ca0
Merge branch 'dev/etan/vd-incprop' into feat/splitview 2024-03-25 22:29:03 +01:00
Etan Kissling
97ec45e939
after a deep reorg, both newPayload and forkchoiceUpdated are needed 2024-03-25 22:25:46 +01:00
Etan Kissling
63971c0e1f
Merge branch 'dev/etan/zf-branchpull' into feat/splitview 2024-03-25 22:05:01 +01:00
Etan Kissling
08b87e2506
add branch discovery module for use in split view scenarios
When the network is partitioned for a long time, e.g., Goerli, branches
start forming where different peers have distinct views about the chain
state. The current syncing solution with sync manager doesn't handle the
case well, as it is optimized for a healthy network where syncing can be
parallelized across different peers. To support sync manager discovering
additional branches, a new module is added that pulls in histories from
peers on unknown branches in a backwards manner.
2024-03-25 22:02:23 +01:00
Etan Kissling
c5352cf89f
add option to incrementally compute proposal state if behind
There can be situations where proposals need to be made on top of stale
parent state. Because the distance to the wall slot is big, the proposal
state (`getProposalState`) has to be computed incrementally to avoid lag
spikes. Proposals have to be timely to have a chance to propagate on the
network. Peak memory requirements don't change, but the proposal state
needs to be allocated for a longer duration than usual.
2024-03-25 21:11:37 +01:00
Etan Kissling
2dbe24c740
move split view catchup to research branch (#6133)
Using a dedicated branch for researching the effectiveness of split view
scenario handling simplifies testing and avoids having partial work on
`unstable`. If we want, we can reintroduce it under a `--debug` flag at
a later time. But for now, Goerli is a rare opoprtunity to test this,
maybe just for another week or so.

- https://github.com/status-im/infra-nimbus/pull/179
2024-03-25 19:09:31 +01:00
Etan Kissling
fc9bc1da3a
add branch discovery module for supporting chain stall situation (#6125)
In split view situation, the canonical chain may only be served by a
tiny amount of peers, and branches may span long durations. Minority
branches may still have a large weight from attestations and should
be discovered. To assist with that, add a branch discovery module that
assists in such a situation by specifically targeting peers with unknown
histories and downloading from them, in addition to sync manager work
which handles popular branches.
2024-03-24 08:41:47 +00:00
Etan Kissling
66a9304fea
use separate state when catching up to perform validator duties (#6131)
There are situations where all states in the `blockchain_dag` are
occupied and cannot be borrowed.

- headState: Many assumptions in the code that it cannot be advanced
- clearanceState: Resets every time a new block gets imported, including
  blocks from non-canonical branches
- epochRefState: Used even more frequently than clearanceState

This means that during the catch-up mechanic where the head state is
slowly advanced to wall clock to catch up on validator duties in the
situation where the canonical head is way behind non-canonical heads,
we cannot use any of the three existing states. In that situation,
Nimbus already consumes an increased amount of memory due to all the
`BlockRef`, fork choice states and so on, so experience is degraded.
It seems reasonable to allocate a fourth state temporarily during that
mechanic, until a new proposal could be made on the canonical chain.

Note that currently, on `unstable`, proposals _do_ happen every couple
hours because sync manager doesn't manage to discover additional heads
in a split-view scenario on Goerli. However, with the branch discovery
module, new blocks are discovered all the time, and the clearanceState
may no longer be borrowed as it is reset to different branch too often.

The extra state could also find other uses in the future, e.g., for
incremental computations as in reindexing the database, or online
collection of historical light client data.
2024-03-24 07:18:33 +01:00
Etan Kissling
6f466894ab
answer RequestManager queries from disk if possible (#6109)
When restarting beacon node, orphaned blocks remain in the database but
on startup, only the canonical chain as selected by fork choice loads.
When a new block is discovered that builds on top of an orphaned block,
the orphaned block is re-downloaded using sync/request manager, despite
it already being present on disk. Such queries can be answered locally
to improve discovery speed of alternate forks.
2024-03-21 18:37:31 +01:00
Etan Kissling
de2d205f61
use PR-3431 style fork choice on all networks (#6110)
To start phasing out Capella fork choice logic, set default to PR 3431.
A subsequent release can remove the fallback option.
2024-03-20 19:12:33 +01:00
Etan Kissling
035ca015e6
continue validator duties if chain does not progress for a long time (#6101)
Nimbus currently stops performing validator duties if the blockchain
does not progress for `node.config.syncHorizon` slots. This means that
the chain won't recover because no new blocks are proposed. To fix that,
continue performing validator duties if no progress is registered for a
long time, and none of our peers is indicating any progress.
2024-03-20 03:23:53 +01:00
Etan Kissling
5d42859176
make Gwei distinct (#6090)
#6087 introduced a subtle change to `nim-web3` resulting in `Gwei` to be
serialized differently than before. Using a `distinct` type for `Gwei`
improves type safety and avoids such problems in the future.
2024-03-19 14:22:07 +01:00
tersec
0a6d189161
automated consensus spec URL updating to v1.4.0 (#6074) 2024-03-14 07:26:36 +01:00
Eugene Kabanov
72c844534f
Add Keymanager API graffiti endpoints. (#6054)
* Initial commit.

* Add more tests.

* Fix API mistypes.

* Fix mistypes in tests.

* Fix one more mistype.

* Fix affected tests because of error code 401.

* Add GetGraffitiResponse object.

* Add more tests.

* Fix compilation errors.

* Recover old behavior.

* Recover old behavior.

* Fix mistype.

* Test could not know default graffiti value.

* Make VC use adopted graffiti settings.

* Make BN use adopted graffiti settings.

* Update Alltests.

* Fix test.

* Revert "Fix test."

This reverts commit c735f855d3cb9c4a1c8e8af29d3f4438d068e31f.

* Workaround {.push raises.} requirement.

* Fix comment.

* Update Alltests.
2024-03-14 03:44:00 +00:00
Etan Kissling
984d5e4631
fix regression when fetching genesis data in trustedNodeSync mode (#6047)
Corrects a regression from #5998 that led to crashes in #6046.
In `trustedNodeSync` mode, the config does not contain genesis keys,
so attempting to load from them is a `Defect`.
2024-03-08 14:52:54 +01:00
Etan Kissling
a0bc3fff86
fix /eth/v1/beacon/deposit_snapshot for EIP-4881 (#6038)
Fix the `/eth/v1/beacon/deposit_snapshot` API to produce proper EIP-4881
compatible `DepositTreeSnapshot` responses. The endpoint used to expose
a Nimbus-specific database internal format.

Also fix trusted node sync to consume properly formatted EIP-4881 data
with `--with-deposit-snapshot`, and `--finalized-deposit-tree-snapshot`
beacon node launch option to use the EIP-4881 data. Further ensure that
`ncli_testnet` produces EIP-4881 formatted data for interoperability.
2024-03-08 14:22:03 +01:00
Etan Kissling
50a43f397f
rename DepositTreeSnapshot -> DepositContractSnapshot (#6036)
EIP-4881 was never correctly implemented, the `DepositTreeSnapshot`
structure has nothing to do with its actual definition. Reflect that
by renaming the type to a Nimbus-specific `DepositContractSnapshot`,
so that an actual EIP-4881 implementation can use the correct names.

- https://eips.ethereum.org/EIPS/eip-4881#specification

Notably, `DepositTreeSnapshot` contains a compressed sequence in
`finalized`, only containing the minimally required intermediate roots.

That also explains the incorrect REST response reported in #5508.

The non-canonical representation was introduced in #4303 and is also
persisted in the database. We'll have to maintain it for a while.
2024-03-07 18:42:52 +01:00
Etan Kissling
3b4b1ccb89
metrics for proposer and sync duties (#6028)
To avoid restarting with scheduled proposals or sync duties, which are
rare but have high rewards, add metrics to simpify tracking them.
2024-03-07 15:08:48 +01:00
Etan Kissling
b7c52c9537
consider proposal slot in Slot end's nextActionWait log/metric (#6026)
`nextActionWait` currently shows `n/a` if only proposal is scheduled
but no attestation, e.g., attestation was already made for current
epoch and validator is exiting next epoch so doesn't have another
attestation lined up. It's an edge case but it's still more correct
to also log `nextActionWait` if only proposal is scheduled.
2024-03-06 12:20:53 +01:00
Etan Kissling
89d9dc24bd
allow --external-beacon-api-url to fallback to genesis if viable (#5998)
When using `--external-beacon-api-url`, one has to accompany it with
either `--trusted-block-root` or `--trusted-state-root`. If neither is
specified, we can fallback to a deeply finalized noncontroversial block
root. For networks that started post Altair, e.g., Holesky, the genesis
block root fulfills that requirement, as in, it is implicitly trusted.
Therefore, if only `--external-beacon-api-url` is provided without any
`--trusted-block-root` or `--trusted-state-root`, use genesis block root
if it is a viable starting point (post-Altair).

```
build/nimbus_beacon_node \
    --network=holesky \
    --data-dir="$HOME/Downloads/nimbus/data/holesky" \
    "--external-beacon-api-url=http://unstable.holesky.beacon-api.nimbus.team" \
    --tcp-port=9010 --udp-port=9010 \
    --rest --log-level=DEBUG \
    --no-el
```
2024-03-05 15:41:22 +01:00
Etan Kissling
4e9bc7f570
add EIP-7044 support to keymanager API (#5959)
* add EIP-7044 support to keymanager API

When trying to sign `VoluntaryExit` via keymanager API, the logic is not
yet aware of EIP-7044 (part of Deneb). This patch adds missing EIP-7044
support to the keymanager API as well.

As part of this, the VC needs to become aware about:

- `CAPELLA_FORK_VERSION`: To correctly form the EIP-7044 signing domain.
  The fork schedule does not indicate which of the results, if any,
  corresponds to Capella.
- `CAPELLA_FORK_EPOCH`: To detect whether Capella was scheduled.
  If a BN does not have it in its config while other BNs have it,
  this leads to a log if Capella has not activated yet, or marks the BN
  as incompatible if Capella already activated.
- `DENEB_FORK_EPOCH`: To check whether EIP-7044 logic should be used.

Related PRs:

- #5120 added support for processing EIP-7044 `VoluntaryExit` messages
  as part of the state transition functions (tested by EF spec tests).
- #5953 synced the support from #5120 to gossip validation.
- #5954 added support to the `nimbus_beacon_node deposits exit` command.
- #5956 contains an alternative generic version of `VCForkConfig`.

* address reviewer feedback: letter case, module location, double lookup

---------

Co-authored-by: cheatfate <eugene.kabanov@status.im>

* Update beacon_chain/rpc/rest_constants.nim

* move `VCRuntimeConfig` back to `rest_types`

---------

Co-authored-by: cheatfate <eugene.kabanov@status.im>

* fix `getForkVersion` helper

---------

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2024-02-26 09:48:07 +01:00
tersec
a4f4a35845
Revert "initial Electra support skeleton" (#5955)
* Revert "initial Electra support skeleton (#5946)"

This reverts commit d09bf3b587bc4f0c91b8e2f58884665a0ae80e34.

* Update test_signing_node.nim
2024-02-25 19:42:44 +00:00
tersec
d09bf3b587
initial Electra support skeleton (#5946) 2024-02-24 13:44:15 +00:00
tersec
c73d7c6f6f
automated consensus spec URL updating to v1.4.0-beta.7 (#5942) 2024-02-21 19:44:48 +00:00
Etan Kissling
88045a91cd
rename new timing metrics, as _total suffix is implicit (#5917)
* track latest duration instead of total in new timing metrics

Change `db_checkpoint_seconds` and `state_replay_seconds` metrics to
record the latest duration instead of the total. `nim-metrics` already
synthesizes a `_total` metric from these implicitly.

* still have to use inc, metrics only synthesizes the name not the sum

* prefix with `beacon_dag`
2024-02-20 20:34:41 +01:00
Etan Kissling
92197ce690
add metric for database checkpoint duration (#5897)
Database checkpointing can take seconds, e.g., while Geth is syncing.
Add a debug log + metric for it, and also info log if it takes longer
than 250ms, same as for the existing `State replayed` log. If the log
shows up for a user while the system is not overloaded, it may point
to slow disk speed or thermal issue.
2024-02-19 11:00:11 +01:00
Etan Kissling
7c53841cd8
Revert "Revert "fix checkpoint block potentially not getting backfilled into DB (#5863)" (#5871)" (#5875)
This reverts commit 1575478b721b6f62d6466a226b91188bdf97e00b.
2024-02-09 20:44:54 +01:00
tersec
1575478b72
Revert "fix checkpoint block potentially not getting backfilled into DB (#5863)" (#5871)
This reverts commit 65e6f892deb5d9ff4399a0840a90788726024008.
2024-02-09 12:49:07 +00:00
Etan Kissling
65e6f892de
fix checkpoint block potentially not getting backfilled into DB (#5863)
When using checkpoint sync, only checkpoint state is available, block is
not downloaded and backfilled later.

`dag.backfill` tracks latest filled `slot`, and latest `parent_root` for
which no block has been synced yet.

In checkpoint sync, this assumption is broken, because there, the start
`dag.backfill.slot` is set based on checkpoint state slot, and the block
is also not available.

However, sync manager in backward mode also requests `dag.backfill.slot`
and `block_clearance` then backfills the checkpoint block once it is
synced. But, there is no guarantee that a peer ever sends us that block.
They could send us all parent blocks and solely omit the checkpoint
block itself. In that situation, we would accept the parent blocks and
advance `dag.backfill`, and subsequently never request the checkpoint
block again, resulting in gap inside blocks DB that is never filled.

To mitigate that, the assumption is restored that `dag.backfill.slot`
is the latest filled `slot`, and `dag.backfill.parent_root` is the next
block that needs to be synced. By setting `slot` to `tail.slot + 1` and
`parent_root` to `tail.root`, we put a fake summary into `dag.backfill`
so that `block_clearance` only proceeds once checkpoint block exists.
2024-02-09 11:20:36 +01:00
tersec
642774e596
unrevert rest of https://github.com/status-im/nimbus-eth2/pull/5765 (#5867)
* unrevert rest of https://github.com/status-im/nimbus-eth2/pull/5765

* rm stray e2store docs changes

* reduce diff

* fix indent

---------

Co-authored-by: Jacek Sieka <jacek@status.im>
2024-02-09 09:35:41 +01:00
tersec
8b261dd3e0
fix blob_sidecar SSE versioned_hash field to be 0x-prefixed hex (#5844) 2024-01-31 04:50:24 +01:00
tersec
225ef5e69a
partially revert https://github.com/status-im/nimbus-eth2/pull/5765 (#5833) 2024-01-28 23:45:52 +01:00
Jacek Sieka
6328c77778
raises for gossip (#5808)
* raises for gossip

* fix light client
2024-01-22 17:34:54 +01:00
tersec
d8a2690a92
update builder API spec reference URLs to v0.4.0 (#5812) 2024-01-22 08:36:46 +01:00
tersec
d669eef97b
rm unused code; fix a Deprecated warning; proc to func (#5807) 2024-01-20 21:36:01 +00:00
tersec
6c53dc1e11
automated consensus spec URL updating to v1.4.0-beta.6 (#5804) 2024-01-20 11:19:47 +00:00
Jacek Sieka
3ff9b69bf1
simplify eth2_network error handling (#5765)
This PR gets rid of a bunch of redundant exception handling through
async raises guarantees.

More can be removed once libp2p gets properly annotated.
2024-01-19 21:05:52 +00:00
Etan Kissling
68d0542ae1
log const_preset on beacon node startup (#5764)
To understand what binary is being used (regular / minimal / gnosis),
extend launch logging.
2024-01-17 14:38:56 +01:00
Etan Kissling
ad74c1a6a5
add vanity mascot for upcoming fork to status bar (#5761)
To simplify supporting "am I ready for the fork" requests, add a simple
marker to the default status bar that indicates readiness by displaying
the fork's corresponding mascot. This is the same one that is also
displayed during the actual fork transition, so does not introduce
new dependencies. It also only shows in default status bar, not in logs.
2024-01-16 17:33:46 +00:00
Etan Kissling
39a2a91003
only show upcoming fork info if one is scheduled (#5751)
This ensures that information about the next scheduled fork is only
displayed if one is actually scheduled. Current fork name is no longer
shown.
2024-01-15 17:48:03 +01:00
tersec
3541cfe020
remove extraneous length checks in KZG batch proofs (#5744)
* remove extraneous length checks in KZG batch proofs

* re-add winservice import but only for Windows, to avoid UnusedImport warning

* also uses establishWindowsService
2024-01-15 16:53:34 +01:00
Eugene Kabanov
5404178a40
Dissect Windows specific code from beacon node. (#5612)
* Make some startup procedures async.
Add more handful makeBannerAndConfig().

* Dissect windows service code from `nimbus_beacon_node.nim`.

* Add report service startup errors using windows error codes.
Add plug able exitService().

Co-authored-by: Zahary Karadjov <zahary@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>
2024-01-13 12:53:53 +02:00
Jacek Sieka
b98f46c04d
Avoid global in p2p macro (fixes #4578) (#5719)
* Avoid global in p2p macro (fixes #4578)

* copy p2p macro to this repo and start de-crufting it
* make protocol registration dynamic, removing light client hacks et al
* split out light client protocol into its own file

* cleanups

* Option -> Opt
* remove more cruft

* further split beacon_sync

this allows the light client to respond to peer metadata messages
without exposing the block sync protocol

* better protocol init

* "constant" protocol index

* avoid casts

* copyright

* move some discovery code to discovery

* avoid extraneous data copy when sending chunks

* remove redundant forkdigest field

* document how to connect to a specific peer
2024-01-13 11:54:24 +02:00