Commit Graph

140 Commits

Author SHA1 Message Date
Jacek Sieka b8a32419b8
async batch verification (+40% sig verification throughput) (#5176)
* async batch verification

When batch verification is done, the main thread is blocked reducing
concurrency.

With this PR, the new thread signalling primitive in chronos is used to
offload the full batch verification process to a separate thread
allowing the main threads to continue async operations while the other
threads verify signatures.

Similar to previous behavior, the number of ongoing batch verifications
is capped to prevent runaway resource usage.

In addition to the asynchronous processing, 3 addition changes help
drive throughput:

* A loop is used for batch accumulation: this prevents a stampede of
small batches in eager mode where both the eager and the scheduled batch
runner would pick batches off the queue, prematurely picking "fresh"
batches off the queue
* An additional small wait is introduced for small batches - this helps
create slightly larger batches which make better used of the increased
concurrency
* Up to 2 batches are scheduled to the threadpool during high pressure,
reducing startup latency for the threads

Together, these changes increase attestation verification throughput
under load up to 30%.

* fixup

* Update submodules

* fix blst build issues (and a PIC warning)

* bump

---------

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2023-08-03 11:36:45 +03:00
tersec 3a818ecb93
fall back to non-fcu fork choice on epoch boundaries (#5195)
* fall back to non-fcu fork choice on epoch boundaries

* Future[bool]

* fix

* Update beacon_chain/consensus_object_pools/consensus_manager.nim

Co-authored-by: Etan Kissling <etan@status.im>

* make things consistent with Opt[void] return

---------

Co-authored-by: Etan Kissling <etan@status.im>
2023-07-17 22:30:38 +02:00
Jacek Sieka a2adbf809f
Perform block pre-check before validating execution (#5169)
* Perform block pre-check before validating execution

When syncing, blocks have not been gossip-validated and are therefore
prone to trivial faults like being known-unviable, duplicate or missing
their parent.

In addition, the duplicate-block check in BlockProcessor was not
considering the quarantine flow and would therefore cause
recently-quarantined blocks to be silenty dropped when their parent
appears delaying the sync end-game and thus causing longer startup
resync time.

This PR verifies trivial conditions before performing execution
validation thus avoiding duplicates and missing parents alike.

It also ensures that the fast-sync EL mode is used for finalized blocks
even if the EL is timing out / slow to respond - this allows the CL to
complete its sync faster and switch to "normal" lock-step at the head of
the chain more quickly, thus also allowing the EL to access the latest
consensensus information earlier.

* oops

* remove unused constant
2023-07-11 18:55:51 +02:00
tersec b4c4f0955e
https://github.com/ethereum/consensus-specs/pull/3421 https://github.com/ethereum/execution-apis/pull/420 (#5147) 2023-06-30 08:14:20 +00:00
tersec 614202e30d
automated consensus spec URL updating to v1.4.0-beta.0 (#5121) 2023-06-24 15:43:30 +00:00
tersec 591c2246d5
update consensus spec URLs to v1.4.0-alpha.3 (#5088) 2023-06-16 16:45:09 +00:00
tersec 9e14d904ac
https://github.com/ethereum/consensus-specs/pull/3359 (#5047) 2023-06-10 05:39:10 +00:00
henridf 5ef748b19d
Clarify addOrphan error/logging (#4981)
* Clarify addOrphan error/logging

addOrphan returned a bool to indicate success. Change this to a Result
so that different errors can be distinguished.

* Update beacon_chain/consensus_object_pools/block_quarantine.nim

Co-authored-by: tersec <tersec@users.noreply.github.com>

* Update beacon_chain/gossip_processing/gossip_validation.nim

---------

Co-authored-by: tersec <tersec@users.noreply.github.com>
2023-05-21 17:47:00 +00:00
tersec cd087b9a43
replace `optimisticRoots` table with field in `BlockRef` (#4969)
* replace optimisticRoots table with field in BlockRef

* copyright year

* mark finalized blocks as verified on load

* Update beacon_chain/consensus_object_pools/block_dag.nim

Co-authored-by: Etan Kissling <etan@status.im>

* expand non-optimistic block checking to all pre-merge blocks; refactor markBlockVerified to use BlockRef rather than block root and remove superfluous caller in newPayload path replaced by addResolvedHeadBlock BlockRef construction

* don't treat finalized block specially; VALID status is sticky

---------

Co-authored-by: Etan Kissling <etan@status.im>
2023-05-20 12:18:51 +00:00
henridf 1cf777c64b
Fix sync for blocks older than MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS (#4977)
When doing sync for blocks older than
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS, we skip the blobs by range
request, but we then pass en empty blob sequence to
validation, which then fails.

To fix this: Use an Option[Blobsidecars] to allow expressing the
distinction between "empty blob sequence" and "blobs unavailable". Use
the latter for "old" blocks, and don't attempt to run blob validation.
2023-05-19 16:25:11 +00:00
henridf 01549f6aa4
Wire in blob validation (#4864)
* Wire in blob validation

* Remove unused "is_data_available"

* Log blobs when blob validation fails
2023-05-17 13:55:50 +00:00
henridf 573228ffa0
Rename eth1/ -> el/ and eth1_monitor.nim -> el_monitor.nim (#4944) 2023-05-15 05:05:12 +00:00
tersec 2fcc01f516
modify newPayload failure logging (#4930) 2023-05-11 13:58:25 +03:00
henridf be3f5b1eac
More blob tweaks/fixes from running in devnet (#4933)
* BeaconNode: don't call fetchMissingblobs with empty list

* More logging

* BlockProcessor.checkBloblessSignature: Add missing return value
2023-05-11 00:36:35 +00:00
tersec d3929cbb45
update some beacon API spec URLs; fix some Name and DuplicateModuleImport hints (#4929) 2023-05-10 10:20:55 +00:00
tersec e503cb4f51
make invalid execution payloads more visible (#4906) 2023-05-10 07:17:15 +00:00
Etan Kissling 297881edb7
bump gossip validation refs to 1.3.0 spec (#4895)
Updates gossip validation spec references to v1.3.0 and fixes an
incorrect reference to "signed_aggregate_and_proof" in sync contribution
documentation.
2023-05-05 22:48:33 +02:00
tersec 1ccb36b272
include small dedup in block processor to handle blockByRoot blocks (#4850) 2023-04-26 07:00:03 +00:00
henridf da6169bfe9
BlockProcessor.storeBlock: write blobs to DB (#4854) 2023-04-25 13:55:35 +03:00
tersec d3400ca11b
low attestations during epoch should instafail in CI; dbg -> warn level in `newPayload` log (#4830)
* low attestations during epoch should instafail in CI; dbg -> warn level on newPayload log

* improve newPayload warning message when no valid EL connected

* reduce potential spam; make log spelling more consistent; use fatal/quit
2023-04-19 19:42:30 +00:00
tersec 2246a6ec95
Revert "include small dedup in block processor to handle blockByRoot blocks (#4814)" (#4840)
This reverts commit 8b3ffec0d5.

Syncing was broken with this: https://github.com/status-im/infra-nimbus/issues/132#issuecomment-1514465481
2023-04-19 19:16:27 +00:00
tersec 228e10f1d9
update engine API URLs from v1.0.0-beta.2 to beta.3 (#4828) 2023-04-17 20:11:28 +00:00
tersec 8b3ffec0d5
include small dedup in block processor to handle blockByRoot blocks (#4814) 2023-04-17 19:36:15 +00:00
tersec 75be7d267d
always use fcUV2 in shapella even for non-proposer fcUs (#4817)
* always use fcUV2 in shapella even for non-proposer fcUs

* avoid template/proc naming conflict with libp2p/signed_envelope.nim having a payload proc
2023-04-17 16:17:52 +02:00
henridf 29b431e312
Simplify block quarantine blobless (#4824)
* Simplify block quarantine blobless

The quarantine blobless table was initially keyed off of (Eth2Digest,
ValidatorSig). This was modelled off the orphan table. The presence of
the signature in the key is necessary for orphans, because we can't
verify the signature for an orphan. That is not the case for a
blobless block, where the signature can be verified.

So this PR changes the blobless block table to be keyed off a
Eth2Digest only. This simplifies the retrieval and handling of
blobless blocks.

* review feedback
2023-04-16 08:37:56 +00:00
henridf 57623af36a
Remove unnecessary field derefs in BlockProcessor.storeBlock (#4823) 2023-04-16 01:25:17 +00:00
henridf 021de18e06
Quarantine and reassembly of gossiped blobs and blocks (#4808) 2023-04-13 19:11:40 +00:00
tersec cd7da00d16
eliminate fcU/getPayload race condition causing missed proposals (#4800) 2023-04-12 12:33:21 +03:00
Etan Kissling c3d043c0e1
rename `loadExecutionBlockRoot` > `loadExecutionBlockHash` (#4807)
There are still some `executionBlockRoot` after this, separate rename.
2023-04-11 16:56:29 +00:00
Etan Kissling b7d08d0a38
do not report pre-Merge sync progress as `/opt` (#4801)
Before the merge, assume `payloadStatus == NewPaylodStatus.valid` to
avoid cases of sync progress being reported with `/opt` suffix.
2023-04-09 14:58:20 +00:00
henridf 635a924e8c
Add nim-kzg4844 and use it in validate_blobs (#4732) 2023-03-23 09:47:04 +00:00
tersec 2f634c10a4
automated consensus spec URL updating from v1.3.0-rc.4 to rc.5 (#4756) 2023-03-21 00:42:22 +00:00
tersec ec77116414
automated consensus spec URL updating from v1.3.0-rc.3 to rc.4 (#4742) 2023-03-17 01:10:31 +00:00
Etan Kissling ad118cd354
rename `stateFork` > `consensusFork` (#4718)
Just the variable, not yet `lcDataForkAtStateFork` / `atStateFork`.

- Shorten comment in `light_client.nim` to keep line width
- Do not rename `stateFork` mention in `runProposalForkchoiceUpdated`.
- Do not rename `stateFork` in `getStateField(dag.headState, fork)`

Rest is just a mechanical mass replace
2023-03-11 00:35:52 +00:00
henridf f5612f2a77
Remove BlobsSidecar used in BeaconChainDB (#4710) 2023-03-10 12:51:36 +00:00
tersec a47f0b054e
finish eip4844 to deneb module rename (#4705) 2023-03-09 01:34:17 +01:00
henridf 90640cce05
Update sync to use post-decoupling RPC (#4701)
* Update sync to use post-decoupling RPCs

blob_sidecars_by_range returns a flat list of sidecars, which must
then be grouped per-slot.

* Add test for groupBlobs

* createBlobs: convert proc to func
2023-03-07 20:19:17 +00:00
tersec 8541674498
simplify ELMonitor fcU payload attributes handling (#4696) 2023-03-06 16:19:15 +00:00
zah 8771e91d53
Support for driving multiple EL nodes from a single Nimbus BN (#4465)
* Support for driving multiple EL nodes from a single Nimbus BN

Full list of changes:

* Eth1Monitor has been renamed to ELManager to match its current
  responsibilities better.

* The ELManager is no longer optional in the code (it won't have
  a nil value under any circumstances).

* The support for subscribing for headers was removed as it only
  worked with WebSockets and contributed significant complexity
  while bringing only a very minor advantage.

* The `--web3-url` parameter has been deprecated in favor of a
  new `--el` parameter. The new parameter has a reasonable default
  value and supports specifying a different JWT for each connection.
  Each connection can also be configured with a different set of
  responsibilities (e.g. download deposits, validate blocks and/or
  produce blocks). On the command-line, these properties can be
  configured through URL properties stored in the #anchor part of
  the URL. In TOML files, they come with a very natural syntax
  (althrough the URL scheme is also supported).

* The previously scattered EL-related state and logic is now moved
  to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
  in a follow-up commit). State is assigned properly either to the
  `ELManager` or the to individual `ELConnection` objects where
  appropriate.

  The ELManager executes all Engine API requests against all attached
  EL nodes, in parallel. It compares their results and if there is a
  disagreement regarding the validity of a certain payload, this is
  detected and the beacon node is protected from publishing a block
  with a potential execution layer consensus bug in it.

  The BN provides metrics per EL node for the number of successful or
  failed requests for each type Engine API requests. If an EL node
  goes offline and connectivity is resoted later, we report the
  problem and the remedy in edge-triggered fashion.

* More progress towards implementing Deneb block production in the VC
  and comparing the value of blocks produced by the EL and the builder
  API.

* Adds a Makefile target for the zhejiang testnet
2023-03-05 01:40:21 +00:00
tersec 3b41e6a0e7
rename ConsensusFork.EIP4844 to ConsensusFork.Deneb (#4692) 2023-03-04 13:35:39 +00:00
tersec 88092bb411
don't try to validate execution block hashes of non-execution payloads (#4687) 2023-03-02 00:11:46 +00:00
henridf 3681177cf4
Remove ForkySignedBeaconBlockMaybeBlobs (#4681)
This commit removes ForkySignedBeaconBlockMaybeBlobs and all
references. I tried to pull that thread only as little as was needed
to get rid of it. Left a placeholder BlobSidecar array (in lieu of
Opt[BlobsSidecar]) in a few places; this will be used as we rebuild
the decoupled implementation.
2023-02-28 11:36:17 +00:00
henridf dede36fe86
Remove blobsSidecar from orphans table (#4670) 2023-02-27 06:10:22 +00:00
tersec 29fb65a9db
automated update of v1.3.0-rc.2 to v1.3.0-rc.3 consensus spec URLs (#4647) 2023-02-21 16:43:21 +00:00
tersec cf551f10c4
don't fcU on blocks for which block processor received no newPayload reply (#4623) 2023-02-14 21:41:49 +01:00
tersec 3011d49946
refactor fcU sending and rename EL-side root to hash (#4614) 2023-02-14 07:48:39 +01:00
tersec aee19fec6b
block on forkchoiceUpdated EL calls due to doing fewer of them (#4609) 2023-02-13 12:13:52 +01:00
henridf 59e41dc65d
EIP4844 sync (#4581)
* EIP4844 Sync

* Pass eip4844 fork epoch rather than cfg to syncmanager

* Fix sync

* Update test

* map->mapIt
2023-02-11 20:48:35 +00:00
Jacek Sieka f3ddea6c86
Skip execution payload verification for finalized blocks (#4591)
While syncing the finalized portion of the chain, the execution client
cannot efficiently sync and most of the time returns `SYNCING` - in this
PR, we use CL-verified optmistic sync as long as the block is claimed to
be finalized, only occasionally updating the EL with progress.

Although a peer might lie about what is finalized and what isn't,
eventually we'll call the execution client - thus, all a dishonest
client can do is delay execution verification slightly. Gossip blocks in
particular are never assumed to be finalized.
2023-02-06 08:22:08 +01:00
tersec 63ed5885ab
update engine API URLs to v1.0.0-beta.2 (#4579) 2023-02-01 18:49:36 +00:00