127 Commits

Author SHA1 Message Date
Etan Kissling
2dbe24c740
move split view catchup to research branch (#6133)
Using a dedicated branch for researching the effectiveness of split view
scenario handling simplifies testing and avoids having partial work on
`unstable`. If we want, we can reintroduce it under a `--debug` flag at
a later time. But for now, Goerli is a rare opoprtunity to test this,
maybe just for another week or so.

- https://github.com/status-im/infra-nimbus/pull/179
2024-03-25 19:09:31 +01:00
Etan Kissling
fc9bc1da3a
add branch discovery module for supporting chain stall situation (#6125)
In split view situation, the canonical chain may only be served by a
tiny amount of peers, and branches may span long durations. Minority
branches may still have a large weight from attestations and should
be discovered. To assist with that, add a branch discovery module that
assists in such a situation by specifically targeting peers with unknown
histories and downloading from them, in addition to sync manager work
which handles popular branches.
2024-03-24 08:41:47 +00:00
Etan Kissling
66a9304fea
use separate state when catching up to perform validator duties (#6131)
There are situations where all states in the `blockchain_dag` are
occupied and cannot be borrowed.

- headState: Many assumptions in the code that it cannot be advanced
- clearanceState: Resets every time a new block gets imported, including
  blocks from non-canonical branches
- epochRefState: Used even more frequently than clearanceState

This means that during the catch-up mechanic where the head state is
slowly advanced to wall clock to catch up on validator duties in the
situation where the canonical head is way behind non-canonical heads,
we cannot use any of the three existing states. In that situation,
Nimbus already consumes an increased amount of memory due to all the
`BlockRef`, fork choice states and so on, so experience is degraded.
It seems reasonable to allocate a fourth state temporarily during that
mechanic, until a new proposal could be made on the canonical chain.

Note that currently, on `unstable`, proposals _do_ happen every couple
hours because sync manager doesn't manage to discover additional heads
in a split-view scenario on Goerli. However, with the branch discovery
module, new blocks are discovered all the time, and the clearanceState
may no longer be borrowed as it is reset to different branch too often.

The extra state could also find other uses in the future, e.g., for
incremental computations as in reindexing the database, or online
collection of historical light client data.
2024-03-24 07:18:33 +01:00
Etan Kissling
3d45c0575a
avoid resetting chain stall detection on lag spike (#6115)
During lag spike, e.g., from state replays, peer count can temporarily
drop significantly. Should not have to wait another 60 minutes in that
situation just to be back where one started.
2024-03-21 04:55:29 +01:00
Etan Kissling
4c0b9efb30
fix chain stall detection (#6105)
Used the incorrect count for `numPeers` and did not account for heads on
alternate branches in in chain stall detection.
2024-03-20 04:51:55 +01:00
Etan Kissling
035ca015e6
continue validator duties if chain does not progress for a long time (#6101)
Nimbus currently stops performing validator duties if the blockchain
does not progress for `node.config.syncHorizon` slots. This means that
the chain won't recover because no new blocks are proposed. To fix that,
continue performing validator duties if no progress is registered for a
long time, and none of our peers is indicating any progress.
2024-03-20 03:23:53 +01:00
tersec
0a6d189161
automated consensus spec URL updating to v1.4.0 (#6074) 2024-03-14 07:26:36 +01:00
Eugene Kabanov
72c844534f
Add Keymanager API graffiti endpoints. (#6054)
* Initial commit.

* Add more tests.

* Fix API mistypes.

* Fix mistypes in tests.

* Fix one more mistype.

* Fix affected tests because of error code 401.

* Add GetGraffitiResponse object.

* Add more tests.

* Fix compilation errors.

* Recover old behavior.

* Recover old behavior.

* Fix mistype.

* Test could not know default graffiti value.

* Make VC use adopted graffiti settings.

* Make BN use adopted graffiti settings.

* Update Alltests.

* Fix test.

* Revert "Fix test."

This reverts commit c735f855d3cb9c4a1c8e8af29d3f4438d068e31f.

* Workaround {.push raises.} requirement.

* Fix comment.

* Update Alltests.
2024-03-14 03:44:00 +00:00
Eugene Kabanov
f088e5f57b
Consensus block value calculation for produceBlockV3 API call. (#5873)
* allow specifying get_proposer_reward block root at state.slot

* Add consensus_block_value calculation.

* Address review comments.

* Post-rebase adjustments.

* Use proper state to calculate consensus block value.

* Revert "allow specifying get_proposer_reward block root at state.slot"

This reverts commit 9fef9a8199f63056060527ac2531acc3b0ed8dcb.

* Fix post-revert problems.
Return back to Gwei.

* Adding test which is not working.

* Do not use test suite if it does not have post-state.

* Add debug logging.

* Increase logging to track sources of balance changes.

* Fix sync committee rewards/penalties calculation.

* Revert "Increase logging to track sources of balance changes."

This reverts commit 32feb20f2fdb66521401710866cd59ecc9951ef8.

* Adopt new vision to block rewards.

* Add block produce logging to VC.

* Remove rewards.nim.

* Eliminate toWei changes.

* Improve UInt256 shortLog.

* Fix conversion procedure.

* Address review comments.

* Fix test.

* Revert "Fix test."

This reverts commit 4948b2c1ec83fbe93f265fd4dea8cc678d48534c.

---------

Co-authored-by: tersec <tersec@users.noreply.github.com>
Co-authored-by: Etan Kissling <etan@status.im>
2024-03-11 14:18:50 +00:00
tersec
84034c0379
rm Capella builder API-related remote signer support (#6003) 2024-03-01 05:30:09 +00:00
tersec
684de046db
switch Builder API validator registration error to warning (#6005) 2024-03-01 06:25:29 +01:00
tersec
f076502e25
rm Capella builder API bid types and blinded block construction (#6002) 2024-03-01 00:02:13 +00:00
tersec
5da2bcd243
rm Capella builder API REST calls (#5997) 2024-02-29 12:37:08 +00:00
tersec
84b752c7a1
rm REST blinded forked Capella block support (#5994) 2024-02-28 18:27:26 +00:00
Etan Kissling
acb1eb1ac6
extend notes on random Jenkins aarch64 test failures (#5962)
The weird `let` bug from #5757 appeared again :-) Document findings.
2024-02-26 03:02:03 +01:00
tersec
a4f4a35845
Revert "initial Electra support skeleton" (#5955)
* Revert "initial Electra support skeleton (#5946)"

This reverts commit d09bf3b587bc4f0c91b8e2f58884665a0ae80e34.

* Update test_signing_node.nim
2024-02-25 19:42:44 +00:00
tersec
d09bf3b587
initial Electra support skeleton (#5946) 2024-02-24 13:44:15 +00:00
tersec
c73d7c6f6f
automated consensus spec URL updating to v1.4.0-beta.7 (#5942) 2024-02-21 19:44:48 +00:00
Jacek Sieka
b5089ebf70
log elmanager timeouts (#5895)
Also:

* remove some unused metrics
* simplify execution payload fetching flow
2024-02-17 10:15:02 +01:00
tersec
8240c1bf34
use decimal representations of engine and builder bid values (#5879) 2024-02-10 05:13:00 +01:00
Etan Kissling
9593ef74b8
do not cache zero block hash if block unavailable (#5865)
With checkpoint sync, the checkpoint block is typically unavailable at
the start, and only backfilled later. To avoid treating it as having
zero hash, execution disabled in some contexts, wrap the result of
`loadExecutionBlockHash` in `Opt` and handle block hash being unknown.

---------

Co-authored-by: Jacek Sieka <jacek@status.im>
2024-02-09 22:10:38 +00:00
Etan Kissling
e398078abc
...ExecutionPayloadHash --> ...ExecutionBlockHash (#5864)
Finish the rename started in #4809 to have a consistent naming.
`ExecutionPayloadHash` suggests hash over payload instead of block.
`BlockHash` is also the canonical name in engine API.
2024-02-08 01:24:55 +01:00
Jacek Sieka
47704bde14
raises for beacon validators & router (#5826)
Changes here are more significant because of some good old tech debt in
block production which has grown quite hairy - the reduction in
exception handling at least provides some steps in the right direction.
2024-02-07 12:26:04 +01:00
tersec
0638741f8b
halve validator registration chunk size (#5837) 2024-01-29 14:09:09 +01:00
tersec
128834a8eb
use RestPlainResponse to improve builder API rerror reporting (#5819) 2024-01-24 23:27:22 +00:00
tersec
d8a2690a92
update builder API spec reference URLs to v0.4.0 (#5812) 2024-01-22 08:36:46 +01:00
tersec
4ec36e0670
Revert "use RestPlainResponse to improve builder API rerror reporting" (#5811)
* Revert "use `RestPlainResponse` to improve builder API rerror reporting"

* Update rest_deneb_mev_calls.nim

copyright year linting

* Update rest_capella_mev_calls.nim

more copyright year linting
2024-01-21 22:39:45 +00:00
tersec
6c53dc1e11
automated consensus spec URL updating to v1.4.0-beta.6 (#5804) 2024-01-20 11:19:47 +00:00
Etan Kissling
d59632acd0
remove obsolete curSlot variable (#5786)
#5773 removed catching up on validator duties after lag. The `curSlot`
variable that was used originally to track catch-up progress no longer
has a use and is also no longer properly updated. Remove it.
2024-01-19 03:21:38 +00:00
tersec
545fb17649
use RestPlainResponse to improve builder API rerror reporting (#5777) 2024-01-19 03:20:47 +00:00
tersec
db7909c1fe
don't catch up on validator duties (#5773) 2024-01-18 15:56:43 +00:00
Etan Kissling
b382833f43
workaround random SIGSEGV on macOS aarch64 CI (#5757)
Separate a `let` block into multiple `let` statements to reduce
probability of hitting random `SIGSEGV` during flaky CI tests.

whatever... 🤯
2024-01-16 13:41:49 +01:00
Eugene Kabanov
5404178a40
Dissect Windows specific code from beacon node. (#5612)
* Make some startup procedures async.
Add more handful makeBannerAndConfig().

* Dissect windows service code from `nimbus_beacon_node.nim`.

* Add report service startup errors using windows error codes.
Add plug able exitService().

Co-authored-by: Zahary Karadjov <zahary@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>
2024-01-13 12:53:53 +02:00
tersec
251143fd51
attest to known valid block when possible (#5313)
* attest to known valid block when possible

* cleaner approach; slot is always == attestation slot itself

* copyright year linting
2024-01-11 22:34:10 +00:00
Jacek Sieka
62cbdeefc5
verify genesis_time more strictly (fixes #1667) (#5694)
Bogus values lead to crashes down the line when timers overflow
2024-01-06 15:26:56 +01:00
Etan Kissling
030226148d
rename exit_pool > validator_change_pool (#5679)
The `ExitPool` was renamed to `ValidatorChangePool` with Capella, but
the files were still using the previous name. Rename for consistency.
2023-12-23 06:55:47 +01:00
tersec
9c6ba7d142
consensus spec v1.4.0-beta.5 URL updates (#5672) 2023-12-16 03:27:06 +01:00
tersec
cb6b54ec89
log engine/builder API decisionmaking (#5608) 2023-12-15 22:31:14 +02:00
andri lim
15147cccb1
Bump nim-web3 to dcabb8f29ee55afedefdf93cd3e102bb1daee354 (#5664)
* bump nim-web3 to dcabb8f29ee55afedefdf93cd3e102bb1daee354

also bump json-rpc to a8731e91bc336d930ac66f985d3b88ed7cf2a7d7
2023-12-12 22:15:00 +07:00
tersec
4776fecc33
consensus spec v1.4.0-beta.5 URL updates (#5655) 2023-12-06 22:16:55 +00:00
tersec
91029ce6d6
fix XDeclaredButNotUsed hints (#5652) 2023-12-06 17:23:45 +01:00
tersec
c36d2aa103
fix XDeclaredButNotUsed warnings (#5648) 2023-12-05 11:45:47 +00:00
tersec
9efb2958ec
automated consensus spec URL updating to v1.4.0-beta.5 (#5647) 2023-12-05 03:34:45 +01:00
Etan Kissling
b0839d1ae5
use correct KZG commitments in Deneb constructPlainBlindedBlock (#5642)
For Deneb, extend on #5639 and use correct KZG commitments when
producing new blinded blocks using Nimbus VC.
2023-12-04 17:36:50 +01:00
tersec
144d453f4a
Update to current (deprecated, but) version of produceBlindedBlock (#5639) 2023-12-03 10:04:12 +01:00
tersec
2fc43c9ba7
track block/blob matching/quarantines using both indices and commitments (#5621) 2023-12-01 18:58:46 +00:00
Eugene Kabanov
e2e4912645
REST API produceBlockV3 implementation (#5474)
Co-authored-by: Etan Kissling <etan@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>
2023-11-29 00:30:14 +01:00
tersec
ab5343d1bc
update some consensus spec URLs to v1.4.0-beta.4 (#5631) 2023-11-27 19:56:34 +01:00
tersec
0e5234efcc
avoid potentially subtle template/function symbol name interactions (#5622)
* avoid potentially subtle template/function symbol name interactions

* use warn instead of error in getExecutionPayload codepath to ensure lack of ambiguity
2023-11-24 16:34:25 +00:00
Jacek Sieka
e1e809eeb7
Batch slashing protection registration (#5604)
This PR brings down the time to send 100 attestations from ~1s to
~100ms, making it feasible to run 10k validators on a single node (which
regularly send 300 attestations / slot).

This is done by batching the slashing protection database write in a
single transaction thus avoiding a slow fsync for every signature -
effects will be more pronounced on slow drives.

The benefit applies both to beacon and client validators.
2023-11-19 14:08:07 +01:00