Commit Graph

279 Commits

Author SHA1 Message Date
zah 8771e91d53
Support for driving multiple EL nodes from a single Nimbus BN (#4465)
* Support for driving multiple EL nodes from a single Nimbus BN

Full list of changes:

* Eth1Monitor has been renamed to ELManager to match its current
  responsibilities better.

* The ELManager is no longer optional in the code (it won't have
  a nil value under any circumstances).

* The support for subscribing to headers was removed as it only
  worked with WebSockets and contributed significant complexity
  while bringing only a very minor advantage.

* The `--web3-url` parameter has been deprecated in favor of a
  new `--el` parameter. The new parameter has a reasonable default
  value and supports specifying a different JWT for each connection.
  Each connection can also be configured with a different set of
  responsibilities (e.g. download deposits, validate blocks and/or
  produce blocks). On the command-line, these properties can be
  configured through URL properties stored in the #anchor part of
  the URL. In TOML files, they come with a very natural syntax
  (although the URL scheme is also supported).

* The previously scattered EL-related state and logic is now moved
  to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
  in a follow-up commit). State is assigned properly either to the
  `ELManager` or to the individual `ELConnection` objects where
  appropriate.

  The ELManager executes all Engine API requests against all attached
  EL nodes, in parallel. It compares their results and if there is a
  disagreement regarding the validity of a certain payload, this is
  detected and the beacon node is protected from publishing a block
  with a potential execution layer consensus bug in it.

  The BN provides metrics per EL node for the number of successful or
  failed requests for each type of Engine API request. If an EL node
  goes offline and connectivity is restored later, we report the
  problem and the remedy in edge-triggered fashion.

* More progress towards implementing Deneb block production in the VC
  and comparing the value of blocks produced by the EL and the builder
  API.

* Adds a Makefile target for the zhejiang testnet
2023-03-05 01:40:21 +00:00
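The parallel payload validation described in the commit above can be illustrated with a small Python sketch. It is not the Nimbus implementation: `new_payload`, `validate_on_all`, the node names, and the canned verdicts are hypothetical stand-ins, and the rule that only a `VALID` versus `INVALID` split counts as a disagreement is an assumption made for this example.

```python
import asyncio

# Hypothetical stand-in for an Engine API call. A real client would send
# a JSON-RPC request to the EL node; here the verdict is looked up in a dict.
async def new_payload(el_name, payload, verdicts):
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return verdicts[el_name]

async def validate_on_all(payload, verdicts):
    """Ask every attached EL node concurrently and compare the answers."""
    names = list(verdicts)
    results = await asyncio.gather(
        *(new_payload(name, payload, verdicts) for name in names)
    )
    by_el = dict(zip(names, results))
    seen = set(results)
    # Assumption: only a VALID/INVALID split is treated as a disagreement.
    disagreement = "VALID" in seen and "INVALID" in seen
    return by_el, disagreement

def check(verdicts):
    return asyncio.run(validate_on_all("payload", verdicts))
```

With `check({"el-a": "VALID", "el-b": "INVALID"})` the second element of the result is `True`, which is the signal a caller could use to refuse to publish the block.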
tersec adeaa9e6c4
build make all targets in debug mode on GitHub Actions CI (#4655) 2023-02-27 11:30:13 +00:00
tersec cdca07908b
no remote signer or threshold in default CI (#4667)
* no remote signer or threshold in default CI

* also no remote validators
2023-02-25 10:52:10 +00:00
tersec 73797fb911
revert that version of no-remote-signer 2023-02-25 07:17:01 +00:00
tersec 3c542c3ae0
no remote signer in default CI 2023-02-25 00:18:34 +00:00
zah 6036f2e7d7
Local sim improvements (#4551)
* Local sim improvements

* Added support for running Capella and EIP-4844 simulations
  by downloading the correct version of Geth.

* Added support for using Nimbus remote signer and Web3Signer.
  Use 2 out of 3 threshold signing configuration in the mainnet
  configuration and regular remote signing in the minimal one.

* The local testnet simulation can now use a payload builder.
  This is currently not activated in CI due to lack of automated
  procedures for installing third-party relays or builders.

  You are advised to use mergemock for now, but for the most realistic
  results, we can create a simple builder based on the nimbus-eth1
  codebase that will be able to propose transactions from the regular
  network mempool.

* Start the simulation from a merged state. This would allow us
  to start removing pre-merge functionality such as the gossip
  subscription logic. The commit also removes the merge-forcing
  hack installed after the TTD removal.

* Consolidate all the tools used in the local simulation into a
  single `ncli_testnet` binary.
2023-02-23 02:10:07 +00:00
Etan Kissling ea6a6b1acd
track slot as part of fork choice debug API (#4565)
Extends fork choice state to also track slot numbers to improve accuracy
of `/eth/v1/debug/fork_choice` endpoint. Autoenable this API on devnet,
and disable some extra checks on devnet to aid focused testing efforts.
Align fork choice pruning logic with API based on checkpoints vs root.
2023-01-31 12:35:01 +00:00
tersec b975455254
avoid gnosis-build-triggered OOMs (#4531) 2023-01-20 14:14:08 +00:00
zah a2bc10e51c
Makefile targets for capella-devnet-3 (#4521) 2023-01-18 17:19:02 +02:00
tersec 87a34bff6c
build largest two RAM consumers, nimbus_beacon_node and all_tests, by default separately (#4513) 2023-01-16 11:34:40 +01:00
zah 0f758c5f02
Working Makefile targets for Capella devnet2 (#4494)
* Working Makefile targets for Capella devnet2

make capella-devnet-2
make clean-capella-devnet-2

You'll need to have https://github.com/tmuxinator/tmuxinator installed.
It's available as a regular package in most Linux distributions or through
Nix or Brew on macOS.

This commit also fixes the initial hang in the Eth1 monitor in the "find
TTD block" procedure through a fix to the network metadata files which
hasn't been upstreamed yet.

Other changes:

* Disabled Geth snap sync in the simulation

When all Geth nodes are configured to run with snap sync enabled, they all
start snap sync after the first forkchoiceUpdated which causes the BNs to
skip validator duties because the EL is syncing. The snap sync never completes
due to poor connectivity between the Geth nodes in the simulation.
2023-01-13 12:21:58 +02:00
zah 9861f0ab7d
Enable the gnosis build in CI on non-Windows targets (#4501) 2023-01-13 12:17:53 +02:00
Zahary Karadjov 3b03ef8ffb
Remove the genesis detection code 2023-01-13 04:28:30 +02:00
Zahary Karadjov 2ac28b1346
Disable the CI build of Gnosis 2023-01-13 04:28:30 +02:00
Zahary Karadjov b06502bf65
Gnosis const preset 2023-01-13 04:28:29 +02:00
Zahary Karadjov c01e53e35e
Fix bitrot in the gnosis build; Up-to-date bootstrap nodes
To prevent similar bitrot from occurring in the future, the gnosis-build
is now part of the default `make` targets.
2023-01-13 04:28:29 +02:00
Etan Kissling 44aa1f5152
avoid VC keymanager port conflict in CI (#4459)
`BASE_VC_KEYMANAGER_PORT` was not configured separately between runs.
Make it configurable and base it on `EXECUTOR_NUMBER` like the others.
2023-01-04 18:59:35 +01:00
Etan Kissling 7878e8083b
avoid port re-use across unit test runs (#4374)
In Jenkins CI we run two instances of unit tests concurrently.
This can trigger CI failure when the same port numbers are re-used
by the different test instances. Fixed one more instance of this by
allowing user configuration of the base port number.
2022-11-30 16:29:08 +02:00
Zahary Karadjov 4b9621cd77
Switch to a HTTPS web3-url for the Gnosis network (the wss is currently down) 2022-11-30 12:48:08 +02:00
Zahary Karadjov fbbab3bfef
Add version metric for the VC; Enable VC metrics in the local simulation 2022-11-30 12:47:11 +02:00
Jacek Sieka 04392c236b
bumps (#4352)
* build&style fixes
* use multithreading for LTO outside of make too
* blst 0.3.10 with no significant changes
2022-11-24 20:56:02 +00:00
Zahary Karadjov 26fe3990b9
Show last 50 lines on test failure instead of last 10 2022-11-04 14:30:29 +02:00
zah 865637e8d8
Show last 10 lines from the test log file on test failure (#4284) 2022-11-04 01:11:11 +00:00
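Printing only the tail of a test log, as the two commits above do, can be sketched in Python as follows. The function name `tail_lines` and its default of 50 lines are illustrative choices for this example, not code from the repository.

```python
from collections import deque

def tail_lines(path, count=50):
    """Return the last `count` lines of a text file as a list of strings."""
    with open(path, encoding="utf-8", errors="replace") as log:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return list(deque(log, maxlen=count))
```

Because the deque is bounded, memory use stays proportional to `count` even for very large log files.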
tersec f9830836a9
deprecate --terminal-total-difficulty-override; remove launch script for deprecated ropsten (#4241)
* deprecate --terminal-total-difficulty-override; remove launch script for deprecated ropsten

* remove Makefile support for Ropsten
2022-10-24 23:32:52 +03:00
Jacek Sieka fa9c60089c
add fake execution engine server (#4250)
Useful for testing beacon node without running an execution client
(results in an optimistically synced node)
2022-10-18 22:18:36 +00:00
Etan Kissling 23007eac3b
finalize 🐼 transition in local testnet CI (#4142)
When running local testnets as part of Jenkins CI, add one more epoch
to finalize the merge transition block.
2022-09-19 11:10:47 +00:00
Etan Kissling 3f17ceb0e2
run local finalization testnets with Geth (#4138) 2022-09-18 08:44:20 +03:00
Etan Kissling ae655e7b0f
add `lodestar` to known libp2p agents (#4108)
Lodestar is switching from `js-libp2p/0.36.2` to `lodestar/version`.
Collect metrics on Lodestar peers following that scheme.
https://github.com/status-im/nimbus-eth2/issues/4106
2022-09-10 17:57:34 +02:00
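Grouping peers by the name part of their agent string, as the commit above describes for `lodestar/version`, might look like the following Python sketch. The list `KNOWN_AGENTS` and the `"other"` fallback label are assumptions made for illustration and do not reflect the actual set of agents tracked by the beacon node.

```python
# Hypothetical list of client names; the real set of known agents may differ.
KNOWN_AGENTS = ("nimbus", "lighthouse", "prysm", "teku", "lodestar")

def classify_agent(agent_version):
    """Map an agent string such as 'lodestar/v1.0.0' to a metrics label."""
    name = agent_version.split("/", 1)[0].strip().lower()
    return name if name in KNOWN_AGENTS else "other"
```

Under this scheme a peer still announcing `js-libp2p/0.36.2` falls into `"other"`, while one announcing `lodestar/...` is counted under `"lodestar"`.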
Etan Kissling 613f4a9a50
accelerate EL sync with LC with `--sync-light-client` (#4041)
When the BN-embedded LC makes sync progress, pass the corresponding
execution block hash to the EL via `engine_forkchoiceUpdatedV1`.
This allows the EL to sync to wall slot while the chain DAG is behind.
Renamed `--light-client` to `--sync-light-client` for clarity, and
`--light-client-trusted-block-root` to `--trusted-block-root` for
consistency with `nimbus_light_client`.

Note that this does not work well in practice at this time:
- Geth sticks to the optimistic sync:
  "Ignoring payload while snap syncing" (when passing the LC head)
  "Forkchoice requested unknown head" (when updating to LC head)
- Nethermind syncs to LC head but does not report ancestors as VALID,
  so the main forward sync is still stuck in optimistic mode:
  "Pre-pivot block, ignored and returned Syncing"

To aid EL client teams in fixing those issues, having this available
as a hidden option is still useful.
2022-08-29 12:16:35 +00:00
zah 8273b3d909
Keep CLI options consistent by removing the '-enable' suffix from the outliers (#3928) 2022-08-05 17:38:26 +02:00
Etan Kissling 3ec7982293
update light client protocol version (#3550)
* Use final `v1` version for light client protocols
* Unhide LC data collection options
* Default enable LC data serving
* rm unneeded import
* Connect to EL on startup
* Add docs for LC based EL sync
2022-07-29 11:45:39 +03:00
Jakub Sokołowski d2d6e632a8
fix testnet port conflicts on Jenkins CI
A fix for a bug triggered by recent `Jenkinsfile` refactoring done in:
https://github.com/status-im/nimbus-eth2/pull/3827

Which due to a bug in Jenkins Throttling plugin caused jobs to start
running in parallel on the same host despite global configuration that
is supposed to block this:
https://issues.jenkins.io/browse/JENKINS-49173
https://github.com/jenkinsci/throttle-concurrent-builds-plugin/pull/68

An attempt to fix this was made in this PR:
https://github.com/status-im/nimbus-eth2/pull/3913

But it was ineffective due to bugs in the Throttle plugin.

As a result semi-random testnet launches would fail with errors like this:
```
./scripts/launch_local_testnet.sh: line 1026: 58977 Killed: 9   ${BEACON_NODE_COMMAND} ...
```
The culprit was the old process cleanup in `scripts/launch_local_testnet.sh`:
```
+ make local-testnet-mainnet
Found old process listening on port 7001, with PID 58977. Killing it.
Found old process listening on port 7002, with PID 59024. Killing it.
Found old process listening on port 7003, with PID 59027. Killing it.
Found old process listening on port 7004, with PID 59030. Killing it.
```
Which was triggered due to use of immediate assignment for `EXECUTOR_NUMBER`:
```
EXECUTOR_NUMBER := 0
```
Which caused the `EXECUTOR_NUMBER` value set by Jenkins to be ignored.

For more details see:
https://www.gnu.org/software/make/manual/html_node/Flavors.html#Flavors

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2022-07-26 11:54:02 +02:00
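The idea behind the fix above, letting an externally provided `EXECUTOR_NUMBER` win over the built-in default and deriving distinct ports from it, can be sketched in Python. This is only an analogy for the Makefile behaviour: the helper names, the first port of 7000, and the stride of 100 ports per executor are hypothetical values chosen for the example.

```python
import os

def executor_number(environ=None):
    """Use the EXECUTOR_NUMBER provided from outside, falling back to 0."""
    env = os.environ if environ is None else environ
    # Similar in spirit to a conditional assignment in Make: the default
    # applies only when the variable is not already defined.
    return int(env.get("EXECUTOR_NUMBER", "0"))

def base_port(executor, first_port=7000, ports_per_executor=100):
    """Give every executor its own non-overlapping port range (assumed stride)."""
    return first_port + executor * ports_per_executor
```

With these assumed numbers, executor 0 starts at port 7000 and executor 1 at port 7100, so two jobs running on the same host no longer find each other's processes on their ports.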
Etan Kissling fd4cf35c20
fix concurrent Jenkins stages (#3904)
The ports for the concurrently executing REST and Minimal testnet clash,
leading to some CI failures since #3827 introduced further concurrency.
Adjusting the ports to be distinct across various tests should fix this.
2022-07-23 14:28:10 +00:00
Jacek Sieka dbd3d02e63
Migrate docs to mkdocs (#3900)
`mkdocs` works with markdown similar to `mdbook` but is generally more
pleasing to the eye and has several nice UX features.

This PR does the bulk of the transition - likely, a followup would be
needed to fully make use of the extra features and navigation.

Book pages have been kept url-compatible, meaning that for the most
part, old links should continue to work!

Co-authored-by: Etan Kissling <etan@status.im>
2022-07-22 21:47:24 +02:00
Jakub Sokołowski c33989e490
ci: refactor Jenkinsfile to be a pipeline (#3827)
Changes:
- Name local testnet output folders same as the `make` target.
- Move both `Jenkinsfile`s to `ci` folder to avoid cluttering repo root.
- Separate builds by platform so logs from macos and linux hosts don't get mixed.
- Detect platform and architecture from Jenkins Job path to use one Jenkinsfile.
- Divide shell commands into as many stages as possible to make debugging easier.
- Generalize running testnets via a `launchLocalTestnet()` Groovy method.
- Handle uploading of results of running testnets on a stage-by-stage basis.
- Use `catchError()` to upload test results while marking job as failed.
- Abort previously started PR build jobs using `disableConcurrentBuilds()`.
- Throttle jobs using the new `throttleJobProperty()` function.

Builds:
- https://ci.status.im/job/nimbus/job/nimbus-eth2/job/platforms/job/linux/job/x86_64/
- https://ci.status.im/job/nimbus/job/nimbus-eth2/job/platforms/job/macos/job/x86_64/
- https://ci.status.im/job/nimbus/job/nimbus-eth2/job/platforms/job/macos/job/aarch64/

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2022-07-22 01:34:31 +03:00
Jacek Sieka f98e9ec8bc
update docs (#3890)
* update docs

* introduce mdbook-admonish for nice looking callouts
* new section on data directory
* recommend source build for advanced users and direct the rest to
binaries
* more strongly highlight that execution client is needed
* write an actual deposit guide
* remove cruft / fix links / etc
2022-07-21 21:19:47 +03:00
zah 20d45e69b5
Re-enabled requireAllFields after a fix in nim-json-serialization (#3871)
* Re-enabled requireAllFields after a fix in nim-json-serialization

The problem was that `Option[T]` fields were not treated as optional
when requireAllFields is set to true. This is now fixed in NJS.

* Add makefile targets for recreating the Jenkins simulation runs

* Fix a discrepancy with the REST spec
2022-07-15 03:19:19 +03:00
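The `requireAllFields` behaviour described in the commit above, where optional fields must not count as missing, can be illustrated with a Python analogy. This is not the nim-json-serialization code: the `Example` class, its field names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional, Union, get_args, get_origin, get_type_hints

def is_optional(tp):
    """True for annotations of the form Optional[T]."""
    return get_origin(tp) is Union and type(None) in get_args(tp)

def missing_required(cls, data):
    """List fields absent from `data`, skipping those declared optional."""
    hints = get_type_hints(cls)
    return [
        f.name
        for f in fields(cls)
        if f.name not in data and not is_optional(hints[f.name])
    ]

# Hypothetical record type used only for this illustration.
@dataclass
class Example:
    pubkey: str
    description: Optional[str] = None
```

A strict decoder built on `missing_required` would accept `{"pubkey": "0x"}` for `Example` and reject `{}`, because only `pubkey` is mandatory.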
zah a517e8718c
Allow the user to use 'goerli' instead of 'prater' (#3874) 2022-07-14 20:07:16 +00:00
Etan Kissling a6deacd878
allow driving EL with LC (#3865)
Adds the `--web3-url` launch argument to `nimbus_light_client` to enable
driving the EL with the optimistic head obtained from LC sync protocol.
This will keep issuing `newPayload` / `forkChoiceUpdated` requests for
new blocks, marking them as optimistic. `ZERO_HASH` is reported as the
finalized block for now.
2022-07-14 04:07:40 +00:00
Jacek Sieka 1e213567fe
lock down mdbook version 2022-06-21 10:31:34 +02:00
zah 5402dfab6b
Correct URLs for the DeleteKeys request in the Keymanager API (#3727)
Other changes:

* Make it easier to run the REST tests locally through a Makefile target
2022-06-19 20:54:12 +03:00
zah e8efc0f184
Add support for the Sepolia network (#3762) 2022-06-16 17:11:26 +03:00
Etan Kissling 52ba4f7999
rename light client config parameters (#3740)
For consistency with other options, use a common prefix for light client
data configuration options.

* `--serve-light-client-data` --> `--light-client-data-serve`
* `--import-light-client-data` --> `--light-client-data-import-mode`

No deprecation of the old identifiers as they were only sparingly used
and all usage can be easily updated without interference.
2022-06-14 12:03:39 +03:00
Etan Kissling 15967c4076
keep track of latest blocks for optimistic sync (#3715)
When launched with `--light-client-enable` the latest blocks are fetched
and optimistic candidate blocks are passed to a callback (log for now).
This helps accelerate syncing in the future (optimistic sync).
2022-06-10 14:16:37 +00:00
Etan Kissling 01efa93cf6
add light client (standalone) (#3653)
Introduces a new library for syncing using libp2p based light client
sync protocol, and adds a new `nimbus_light_client` executable that uses
this library for syncing. The new executable emits log messages when
new beacon block headers are received, and is integrated into testing.
2022-05-31 12:45:37 +02:00
zah e7ce3cacd0
Add support for the Ropsten beacon chain (#3648) 2022-05-20 18:26:07 +03:00
zah a0a6dd2f63
Add a ncli tool for converting a regular keystore into a distributed one (#3634) 2022-05-17 16:50:49 +03:00
zah 6d11ad6ce1
Support for distributed keystores with multiple remotes based on threshold signatures (#3616)
Other fixes:

* Fix bit rot in the `make prater-dev-deposit` target.
* Correct content-type in the responses of the Nimbus signing node
* Invalid JSON payload was being sent in the web3signer requests
2022-05-10 03:32:12 +03:00
Zahary Karadjov def69e2a06
Revert "More sparse state snapshots in the Gnosis network"
This reverts commit 557717b517.
2022-04-11 13:56:42 +03:00
Zahary Karadjov 557717b517
More sparse state snapshots in the Gnosis network 2022-04-09 18:07:36 +03:00