Commit Graph

157 Commits

Author SHA1 Message Date
tersec 8541674498
simplify ELMonitor fcU payload attributes handling (#4696) 2023-03-06 16:19:15 +00:00
zah 8771e91d53
Support for driving multiple EL nodes from a single Nimbus BN (#4465)
* Support for driving multiple EL nodes from a single Nimbus BN

Full list of changes:

* Eth1Monitor has been renamed to ELManager to match its current
  responsibilities better.

* The ELManager is no longer optional in the code (it won't have
  a nil value under any circumstances).

* The support for subscribing for headers was removed as it only
  worked with WebSockets and contributed significant complexity
  while bringing only a very minor advantage.

* The `--web3-url` parameter has been deprecated in favor of a
  new `--el` parameter. The new parameter has a reasonable default
  value and supports specifying a different JWT for each connection.
  Each connection can also be configured with a different set of
  responsibilities (e.g. download deposits, validate blocks and/or
  produce blocks). On the command-line, these properties can be
  configured through URL properties stored in the #anchor part of
  the URL. In TOML files, they come with a very natural syntax
  (althrough the URL scheme is also supported).

* The previously scattered EL-related state and logic is now moved
  to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
  in a follow-up commit). State is assigned properly either to the
  `ELManager` or the to individual `ELConnection` objects where
  appropriate.

  The ELManager executes all Engine API requests against all attached
  EL nodes, in parallel. It compares their results and if there is a
  disagreement regarding the validity of a certain payload, this is
  detected and the beacon node is protected from publishing a block
  with a potential execution layer consensus bug in it.

  The BN provides metrics per EL node for the number of successful or
  failed requests for each type Engine API requests. If an EL node
  goes offline and connectivity is resoted later, we report the
  problem and the remedy in edge-triggered fashion.

* More progress towards implementing Deneb block production in the VC
  and comparing the value of blocks produced by the EL and the builder
  API.

* Adds a Makefile target for the zhejiang testnet
2023-03-05 01:40:21 +00:00
tersec 982d79f9a2
more eip4844 -> deneb changes (#4666) 2023-02-25 03:03:34 +02:00
zah 6036f2e7d7
Local sim impovements (#4551)
* Local sim impovements

* Added support for running Capella and EIP-4844 simulations
  by downloading the correct version of Geth.

* Added support for using Nimbus remote signer and Web3Signer.
  Use 2 out of 3 threshold signing configuration in the mainnet
  configuration and regular remote signing in the minimal one.

* The local testnet simulation can now use a payload builder.
  This is currently not activated in CI due to lack of automated
  procedures for installing third-party relays or builders.

  You are adviced to use mergemock for now, but for most realistic
  results, we can create a simple builder based on the nimbus-eth1
  codebase that will be able to propose transactions from the regular
  network mempool.

* Start the simulation from a merged state. This would allow us
  to start removing pre-merge functionality such as the gossip
  subsciption logic. The commit also removes the merge-forcing
  hack installed after the TTD removal.

* Consolidate all the tools used in the local simulation into a
  single `ncli_testnet` binary.
2023-02-23 02:10:07 +00:00
tersec 63ed5885ab
update engine API URLs to v1.0.0-beta.2 (#4579) 2023-02-01 18:49:36 +00:00
tersec 58ed9308d2
automated v1.3.0-rc.1 to v1.3.0-rc.2 consensus spec URL updates (#4568) 2023-01-31 00:26:57 +01:00
tersec 8e2792bdd7
remove Nim 1.2 workarounds and `TODO`s (#4533)
* clean up some Nim 1.2 workarounds

* re-add notes about JS backend

* another proc/noSideEffect -> func

* revert ncli/ncli_common.nim changes; 19969 evidently wasn't backported to 1.6
2023-01-30 20:21:51 +01:00
henridf f8ee0def2b
Add stubs for EIP4844 engine API calls (#4536) 2023-01-21 00:47:38 +00:00
tersec aacc8d702d
remove Nim 1.2-compatible `push raise`s and update copyright notice years (#4528) 2023-01-20 14:14:37 +00:00
tersec aea7a0c8b8
remove TTD monitoring (#4486) 2023-01-18 16:01:49 +02:00
tersec 073c544f0c
automated update from v1.3.0-rc.0 to v1.3.0-rc.1 consensus spec URLs (#4517) 2023-01-17 16:10:52 +00:00
tersec eb02415f00
execution/engine withdrawals amount in uint64 gwei (#4509) 2023-01-14 17:26:57 +00:00
zah 0f758c5f02
Working Makefile targets for Capella devnet2 (#4494)
* Working Makefile targets for Capella devnet2

make capella-devnet-2
make clean-capella-devnet-2

You'll need to have https://github.com/tmuxinator/tmuxinator installed.
It's available as a regular package in most Linux distributions or through
Nix or Brew on macOS.

This commit also fixes the initial hang in the Eth1 monitor in the "find
TTD block" procedure through a fix to the network metadata files which
hasn't been upstreamed yet.

Other changes:

* Disabled Geth snap sync in the simulation

When all Geth nodes are configured to run with snap sync enabled, they all
start snap sync after the first forkchoiceUpdated which causes the BNs to
skip validator duties because the EL is syncing. The snap sync never completes
due to poor connectivity between the Geth nodes in the simulation.
2023-01-13 12:21:58 +02:00
Zahary Karadjov 3b03ef8ffb
Remove the genesis detection code 2023-01-13 04:28:30 +02:00
Zahary Karadjov c01e53e35e
Fix bitrot in the gnosis build; Up-to-date bootstrap nodes
To prevent similar bitrot from occuring in the future, the gnosis-build
is now part of the default `make` targets.
2023-01-13 04:28:29 +02:00
henridf 309f8690de
Wire up engine_newPayloadV3 (#4482)
* Wire up eip4844's newPayloadV3

* Add eip4844 test

* Update AllTests-mainnet.md and fix typo
2023-01-11 18:21:19 +00:00
Zahary Karadjov 713bdd317d
Bugfix: Handle networks with TTD=0 (e.g. withdrawal devnets) 2023-01-10 19:17:11 +02:00
tersec 2fc0fa4288
don't use non-engine-API-whitelisted web3 methods (#4484) 2023-01-10 15:50:34 +00:00
tersec 2dd3cd786f
consensus spec ref URL update v1.3.0-{alpha.2,rc.0}; copyright year update (#4477) 2023-01-09 22:44:44 +00:00
tersec d1b799eb64
bump nim-web3 for correct getPayloadV2 response signature (#4471) 2023-01-06 21:06:45 +00:00
tersec 47cb0f7991
capella forkchoiceUpdated support (#4462)
* capella forkchoiceUpdated support

* match V2 fcU with V2 getPayload
2023-01-06 22:01:10 +01:00
tersec ec01065555
getPayloadV2 support for capella (#4457)
* getPayloadV2 support for capella

* check execution payload type more rigorously
2023-01-04 20:13:17 +01:00
zah 07d4160e00
Migrating the deposit contract snapshot can no longer fail on start-up (#4438)
The missing piece of data that had to be obtained previously from
the configured EL client is now part of the network metadata baked
into the binary.
2022-12-19 18:19:48 +01:00
tersec bb4ea37baa
update EF consensus spec URLs from v1.3.0-alpha.1 to v1.3.0-alpha.2 (#4432) 2022-12-15 12:15:12 +00:00
zah d30cb8baf1
Support for obtaining deposit snapshots during trustedNodeSync (#4303)
Other changes:

* More optimal search for TTD block.

* Add timeouts to all REST requests during trusted node sync.
  Fixes #4037

* Removed support for storing a deposit snapshot in the network
  metadata.
2022-12-07 12:24:51 +02:00
tersec 9572bb8721
fix effective merge conflict between #s 4357 and 4358 around Withdrawal symbol ambiguity (#4370) 2022-11-29 11:04:36 +00:00
tersec ed672113bc
support engine API execution payloads with withdrawals (#4358) 2022-11-29 05:02:16 +00:00
tersec 61c5ac32d8
automated consensus spec ref URL update to v1.3.0-alpha.1 (#4354) 2022-11-24 19:07:02 +00:00
tersec c8083f2c32
implement more missing capella functionality (#4344) 2022-11-24 09:53:04 +02:00
tersec 909c095e64
initial automated v1.2.0 -> v1.3.0-alpha.0 consensus spec URL update (#4296) 2022-11-08 02:37:28 +00:00
tersec 16817fef95
cleanups: `proc` -> `func`, unused import, spec URLs (#4224) 2022-10-08 05:07:54 -05:00
tersec a0ead042ad
newPayload `INVALIDATED` should be `unviableFork` (#4180) 2022-09-26 21:24:32 +00:00
tersec deb043796b
a few more manual v1.2.0 consensus spec ref URL updates (#4165) 2022-09-23 12:00:17 +00:00
zah 154723947b
Don't search for the TTD block after the merge (#4152) 2022-09-20 09:17:25 +03:00
Etan Kissling 0708fcd7cf
rm require engine API check (#4144)
The `eth1_monitor` check to require engine API from bellatrix onward
has issues in setups where the EL and CL are started simultaneously
because the EL may not be ready to answer requests by the time that the
check is performed. This can be observed, e.g., on Raspberry Pi 4 when
using Besu as the EL client. Now that the merge transition happened, the
check is also not that useful anymore, as users have other ways to know
that their setup is not working correctly (e.g., repeated exchange logs)
2022-09-19 23:47:46 +02:00
Etan Kissling 4b3768c3a1
fix TTD before bellatrix (#4137)
When TTD hits before Bellatrix, avoid waiting for new blocks and detect
the TTD block as the terminal block hash even before Bellatrix hits.
Also allow detecting EL genesis block as merge transition block.
This fixes the local testnet simulation with Geth to actually merge.
2022-09-18 08:45:51 +03:00
Jacek Sieka 43188a0990
clean up exchange configuration handling (#4126)
Per spec, we should not be sending our detected terminal block to EL -
the EL configuration exchange should only look at values from
configuration and report mismatches.
2022-09-16 15:33:22 +02:00
tersec 0b93eeeaaf
delay first exchangeTransitionConfiguration (#4114) 2022-09-15 15:00:23 +02:00
Jacek Sieka ee1465e320
Don't consider stubbed terminal block hash terminal (fixes #4094) (#4096) 2022-09-08 10:57:26 +02:00
Etan Kissling e6b8bc6527
harden `exchangeTransitionConfiguration` retries (#4095)
`p.dataProvider` may become `nil` between individual attempts to
exchange transition configuration with the EL. Harden by capturing
the data provider on function start.

Note that other functions are already hardened, or are unaffected.
Only `close` transitions `p.dataProvider` to `nil`, and `close` is
only called by the main deposits import sequence. During the deposits
import, `close` is not called, so extra checks are not needed.
2022-09-08 09:36:53 +02:00
tersec b90ae838c7
checking for merge terminal block should be debug-level (#4075) 2022-09-06 23:41:55 +00:00
Zahary Karadjov 74ac85a75f
Add reassuring log message upon connecting to the EL 2022-08-29 23:11:09 +03:00
tersec 61dc296046
update engine API spec ref URLs from alpha.9 to beta.1 (#4030)
* update engine API spec ref URLs from alpha.9 to beta.1

* require exactly 256-bit JWT keys
2022-08-26 13:44:50 +03:00
Etan Kissling 64972e3c8a
set `safe_block_hash` to fork choice justified (#4010)
Implements the fork choice safe block spec, where `safe_block_hash` in
`forkChoiceUpdated` is set to justified (used to be `ZERO_HASH`).
https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.3/fork_choice/safe-block.md#get_safe_execution_payload_hash
2022-08-25 23:34:02 +00:00
Etan Kissling d619b539f3
fix engine API crash when EL disconnected (#4027)
When issuing an engine API call while the EL is disconnected, a `nil`
pointer is dereferenced. Fixed by correctly initializing futures.

```
Traceback (most recent call last, using override)
vendor/nim-libp2p/libp2p/protocols/pubsub/pubsub.nim(890) main
beacon_chain/nimbus_beacon_node.nim(2139) main
beacon_chain/nimbus_beacon_node.nim(0) handleStartUpCmd
beacon_chain/nimbus_beacon_node.nim(0) doRunBeaconNode
beacon_chain/nimbus_beacon_node.nim(0) start
beacon_chain/nimbus_beacon_node.nim(1589) run
vendor/nimbus-build-system/vendor/Nim/lib/system/iterators_1.nim(107) poll
vendor/nim-chronos/chronos/asyncfutures2.nim(365) futureContinue
beacon_chain/consensus_object_pools/consensus_manager.nim(297) updateHeadWithExecution
vendor/nim-chronos/chronos/asyncmacro2.nim(213) runProposalForkchoiceUpdated
vendor/nim-chronos/chronos/asyncfutures2.nim(365) futureContinue
beacon_chain/consensus_object_pools/consensus_manager.nim(259) runProposalForkchoiceUpdated
beacon_chain/eth1/eth1_monitor.nim(0) forkchoiceUpdated
vendor/nim-chronos/chronos/asyncfutures2.nim(219) complete
vendor/nim-chronos/chronos/asyncfutures2.nim(149) cancelled
vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(610) signalHandler
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
```
2022-08-25 20:07:29 +02:00
zah 4e41ed1d5a
Require properly configured Engine API connection after the merge (#4006) 2022-08-22 22:44:40 +03:00
zah 09de83af80
Reviewed the Engine API calls for missing error handling (#4004) 2022-08-20 09:09:25 +03:00
Etan Kissling 052f9edfd4
import EL deposits even when EL is stuck (#3956)
* import EL deposits even when EL is stuck

The `eth1_monitor` only starts importing deposits once the EL reports a
new head block. However, the EL may be stuck at a block, e.g., the TTD.
By polling the latest EL block once after subscribing to new EL block
events it is ensured that deposits are still imported in this situation.

* also poll once on re-connects

* update `eth1_latest_head` metric in poll mode

* add comment about similar polling vs events parts

* replace check with assert

* `isNewLastBlock` helper
2022-08-12 19:44:55 +00:00
Etan Kissling c360db8194
avoid materializing potentially long deposits seq (#3947)
When fetching eth1 data and deposits for a new block proposal, the list
of deposits from previous eth1 data to the next one is fully loaded into
a `seq`. This can potentially be a very long list in active periods.
Changing this to an `iterator` saves memory by ensuring that the entire
list is no longer materialized; only the `DepositData` roots are needed.
2022-08-12 16:52:06 +03:00
Etan Kissling 03d6a1a934
track request chunk size across EL reconnects (#3960)
When the EL connection is interrupted, deposits are once more requested
in chunks of 5000 blocks. This is a problem when the response takes over
a minute to produce and consistently times out as followup requests with
lower chunk sizes may no longer work after a request was canceled, e.g.,
when using Geth with websockets. By keeping track of `blocksPerRequest`
across EL reconnections, it is possible to recover from this by avoiding
to continuously repeat the initial request with the full 5000 blocks.
Also cleans up one more "retry of retry" instance; `DataProviderTimeout`
is a `CatchableError` and already handled by the existing retry logic.
2022-08-12 16:51:33 +03:00