Commit Graph

423 Commits

Author SHA1 Message Date
Tanguy 769ed00203
Add light client gossip metrics (#4745) 2023-03-21 08:55:48 +01:00
tersec 2f634c10a4
automated consensus spec URL updating from v1.3.0-rc.4 to rc.5 (#4756) 2023-03-21 00:42:22 +00:00
tersec ec77116414
automated consensus spec URL updating from v1.3.0-rc.3 to rc.4 (#4742) 2023-03-17 01:10:31 +00:00
Etan Kissling 69013d153c
bump light client spec references to `v1.3.0-rc.3` (#4719) 2023-03-11 01:11:51 +00:00
Etan Kissling ad118cd354
rename `stateFork` > `consensusFork` (#4718)
Just the variable, not yet `lcDataForkAtStateFork` / `atStateFork`.

- Shorten comment in `light_client.nim` to keep line width
- Do not rename `stateFork` mention in `runProposalForkchoiceUpdated`.
- Do not rename `stateFork` in `getStateField(dag.headState, fork)`

Rest is just a mechanical mass replace
2023-03-11 00:35:52 +00:00
henridf f5612f2a77
Remove BlobsSidecar used in BeaconChainDB (#4710) 2023-03-10 12:51:36 +00:00
tersec a47f0b054e
finish eip4844 to deneb module rename (#4705) 2023-03-09 01:34:17 +01:00
henridf 90640cce05
Update sync to use post-decoupling RPC (#4701)
* Update sync to use post-decoupling RPCs

blob_sidecars_by_range returns a flat list of sidecars, which must
then be grouped per-slot.

* Add test for groupBlobs

* createBlobs: convert proc to func
2023-03-07 20:19:17 +00:00
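As a rough illustration of the per-slot grouping described in #4701, the following Python sketch turns a flat list of sidecars into one list per slot. It is not the Nim `groupBlobs` implementation: the `BlobSidecar` class and the `group_blobs` name are hypothetical stand-ins, and the input is assumed to be ordered by slot.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class BlobSidecar:
    """Hypothetical minimal sidecar: only the fields needed for grouping."""
    slot: int
    index: int


def group_blobs(sidecars):
    """Group a flat list of sidecars (assumed sorted by slot) per slot."""
    return {
        slot: list(group)
        for slot, group in groupby(sidecars, key=lambda s: s.slot)
    }
```

Slots with no sidecars simply do not appear in the result, so a caller would have to treat a missing key as "no blobs for this slot".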
tersec 63b1b0840f
5 more modules of eip4844.foo to deneb.foo renames (#4698) 2023-03-06 18:45:52 +00:00
tersec 8541674498
simplify ELMonitor fcU payload attributes handling (#4696) 2023-03-06 16:19:15 +00:00
zah 8771e91d53
Support for driving multiple EL nodes from a single Nimbus BN (#4465)
* Support for driving multiple EL nodes from a single Nimbus BN

Full list of changes:

* Eth1Monitor has been renamed to ELManager to match its current
  responsibilities better.

* The ELManager is no longer optional in the code (it won't have
  a nil value under any circumstances).

* The support for subscribing for headers was removed as it only
  worked with WebSockets and contributed significant complexity
  while bringing only a very minor advantage.

* The `--web3-url` parameter has been deprecated in favor of a
  new `--el` parameter. The new parameter has a reasonable default
  value and supports specifying a different JWT for each connection.
  Each connection can also be configured with a different set of
  responsibilities (e.g. download deposits, validate blocks and/or
  produce blocks). On the command-line, these properties can be
  configured through URL properties stored in the #anchor part of
  the URL. In TOML files, they come with a very natural syntax
  (although the URL scheme is also supported).

* The previously scattered EL-related state and logic is now moved
  to `eth1_monitor.nim` (this module will be renamed to `el_manager.nim`
  in a follow-up commit). State is assigned properly either to the
  `ELManager` or to the individual `ELConnection` objects where
  appropriate.

  The ELManager executes all Engine API requests against all attached
  EL nodes, in parallel. It compares their results and if there is a
  disagreement regarding the validity of a certain payload, this is
  detected and the beacon node is protected from publishing a block
  with a potential execution layer consensus bug in it.

  The BN provides metrics per EL node for the number of successful or
  failed requests for each type of Engine API request. If an EL node
  goes offline and connectivity is restored later, we report the
  problem and the remedy in edge-triggered fashion.

* More progress towards implementing Deneb block production in the VC
  and comparing the value of blocks produced by the EL and the builder
  API.

* Adds a Makefile target for the zhejiang testnet
2023-03-05 01:40:21 +00:00
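The "query every EL in parallel and compare the answers" idea from #4465 could be sketched in Python roughly as follows. This is only an illustration under assumptions: each engine is modelled as a hypothetical async callable that returns an Engine API status string, and how the ELManager actually reacts to a disagreement is not shown here.

```python
import asyncio


async def new_payload_on_all(engines, payload):
    """Send the same payload to every engine concurrently, collect statuses."""
    return await asyncio.gather(*(engine(payload) for engine in engines))


def has_validity_disagreement(statuses):
    """True when one engine reports VALID while another reports INVALID."""
    return "VALID" in statuses and "INVALID" in statuses
```

A `SYNCING` answer is deliberately not counted as a disagreement in this sketch, since it makes no claim about validity.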
tersec 3b41e6a0e7
rename ConsensusFork.EIP4844 to ConsensusFork.Deneb (#4692) 2023-03-04 13:35:39 +00:00
tersec d058aa09c8
more withdrowls (#4674) 2023-03-02 17:13:35 +01:00
henridf 1de3cf5246
Remove SignedBeaconBlockAndBlobsSidecar (#4683)
This commit removes SignedBeaconBlockAndBlobsSidecar and all remaining
references.
2023-03-02 15:12:04 +01:00
tersec 88092bb411
don't try to validate execution block hashes of non-execution payloads (#4687) 2023-03-02 00:11:46 +00:00
henridf 3681177cf4
Remove ForkySignedBeaconBlockMaybeBlobs (#4681)
This commit removes ForkySignedBeaconBlockMaybeBlobs and all
references. I tried to pull that thread only as little as was needed
to get rid of it. Left a placeholder BlobSidecar array (in lieu of
Opt[BlobsSidecar]) in a few places; this will be used as we rebuild
the decoupled implementation.
2023-02-28 11:36:17 +00:00
henridf dede36fe86
Remove blobsSidecar from orphans table (#4670) 2023-02-27 06:10:22 +00:00
tersec 29fb65a9db
automated update of v1.3.0-rc.2 to v1.3.0-rc.3 consensus spec URLs (#4647) 2023-02-21 16:43:21 +00:00
Jacek Sieka 83f9745df1
restore doppelganger check on connectivity loss (#4616)
* restore doppelganger check on connectivity loss

https://github.com/status-im/nimbus-eth2/pull/4398 introduced a
regression in functionality where doppelganger detection would not be
rerun during connectivity loss. This PR reintroduces this check and
makes some adjustments to the implementation to simplify the code flow
for both BN and VC.

* track when check was last performed for each validator (to deal with
late-added validators)
* track when we performed a doppel-detectable activity (attesting) so as
to avoid false positives
* remove nodeStart special case (this should be treated the same as
adding a validator dynamically just after startup)

* allow sync committee duties in doppelganger period

* don't trigger doppelganger when registering duties

* fix crash when expected index response is missing

* fix missing slashingSafe propagation
2023-02-20 13:28:56 +02:00
tersec 629b005c27
refactor batch validation not to require genesis_validators_root each time (#4640) 2023-02-20 09:26:22 +01:00
tersec a382498cfe
batch-verify BLS to execution change messages (#4637) 2023-02-17 13:35:12 +00:00
tersec cf551f10c4
don't fcU on blocks for which block processor received no newPayload reply (#4623) 2023-02-14 21:41:49 +01:00
tersec 3011d49946
refactor fcU sending and rename EL-side root to hash (#4614) 2023-02-14 07:48:39 +01:00
tersec aee19fec6b
block on forkchoiceUpdated EL calls due to doing fewer of them (#4609) 2023-02-13 12:13:52 +01:00
henridf 59e41dc65d
EIP4844 sync (#4581)
* EIP4844 Sync

* Pass eip4844 fork epoch rather than cfg to syncmanager

* Fix sync

* Update test

* map->mapIt
2023-02-11 20:48:35 +00:00
Jacek Sieka f3ddea6c86
Skip execution payload verification for finalized blocks (#4591)
While syncing the finalized portion of the chain, the execution client
cannot efficiently sync and most of the time returns `SYNCING` - in this
PR, we use CL-verified optimistic sync as long as the block is claimed to
be finalized, only occasionally updating the EL with progress.

Although a peer might lie about what is finalized and what isn't,
eventually we'll call the execution client - thus, all a dishonest
client can do is delay execution verification slightly. Gossip blocks in
particular are never assumed to be finalized.
2023-02-06 08:22:08 +01:00
tersec bca781b1b0
prioritize REST API-provided BLS to execution changes over gossip-received changes (#4580) 2023-02-03 16:28:28 +01:00
tersec 63ed5885ab
update engine API URLs to v1.0.0-beta.2 (#4579) 2023-02-01 18:49:36 +00:00
henridf 94837caa2a
Eip4844 sync fixes (#4577)
* fixes

* Make some log messages blob-aware

* remove redundant optBlobs()
2023-02-01 14:14:50 +00:00
tersec 58ed9308d2
automated v1.3.0-rc.1 to v1.3.0-rc.2 consensus spec URL updates (#4568) 2023-01-31 00:26:57 +01:00
tersec 0fb726c420
`BeaconStateFork/BeaconBlockFork` -> `ConsensusFork` (#4560)
* `BeaconStateFork/BeaconBlockFork` -> `ConsensusFork`

* revert unrelated change

* revert unrelated changes

* update test summaries
2023-01-28 19:53:41 +00:00
henridf 03b468c537
Pass correct block root to validate_blobs_sidecar (#4557) 2023-01-26 09:25:37 +00:00
henridf 7966ab6be2
Some EIP4844 fixes (#4549)
* debug log upon sidecar validation failure

* Fill in signature catch upon SignedBeaconBlockAndBlobsSidecar deser

* Always fill blobssidecar slot and root

* Skip lastFCU when eth1monitor is nil

* fix

* Use cached root
2023-01-25 18:35:46 +01:00
Etan Kissling efbd4e395a
avoid sending redundant LC finality updates (#4546)
When the epoch boundary block is missed, we incorrectly assume that the
next couple blocks improve finality, leading to repeated pushes of the
same light client finality update and incorrectly ignoring some gossip.
2023-01-24 17:44:55 +00:00
tersec fe1a57c220
use shortLog for execution payload logging (#4544) 2023-01-24 13:19:38 +00:00
Zahary Karadjov 285eec6512
Add metrics and debug logging for dropped BLS to execution change messages 2023-01-23 14:58:40 +01:00
tersec aacc8d702d
remove Nim 1.2-compatible `push raise`s and update copyright notice years (#4528) 2023-01-20 14:14:37 +00:00
tersec 819e007689
exit/validatorchange pool includes BLS to execution messages; REST support for new pool (#4519)
* exit/validatorchange pool includes BLS to execution messages; REST
support for new pool

* catch failed individual futures

* increase BLS changes bound and keep BLS seen consistent with subpool

* deque capacities should be powers of 2
2023-01-19 22:00:40 +00:00
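For the "deque capacities should be powers of 2" item in #4519, a small Python sketch of rounding a requested capacity up to the next power of two is given below. The helper names are hypothetical, and the commit does not state its motivation; a common reason in ring-buffer designs is that indices can then wrap with a bit mask.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return n > 0 and n & (n - 1) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("capacity must be positive")
    return 1 << (n - 1).bit_length()
```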
tersec aea7a0c8b8
remove TTD monitoring (#4486) 2023-01-18 16:01:49 +02:00
tersec 073c544f0c
automated update from v1.3.0-rc.0 to v1.3.0-rc.1 consensus spec URLs (#4517) 2023-01-17 16:10:52 +00:00
henridf 727920a571
Refactor block/blobs types (#4491)
* Refactor block/blobs types

Use type system to enforce invariant that a pre-4844 block cannot have
a sidecar.

* Update beacon_chain/nimbus_beacon_node.nim

Co-authored-by: tersec <tersec@users.noreply.github.com>

* review feedback

Co-authored-by: tersec <tersec@users.noreply.github.com>
2023-01-16 16:26:48 +00:00
Etan Kissling fda03548e3
use `ForkedLightClientStore` internally (#4512)
When running `nimbus_light_client`, we persist the latest header from
`LightClientStore.finalized_header` in a database across restarts.
Because the data format is derived from the latest `LightClientStore`,
this could lead to data being persisted in pre-release formats.

To enable us to test later `LightClientStore` versions on devnets,
transition to a `ForkedLightClientStore` internally that is only
migrated to newer forks on-demand (instead of starting at latest).
2023-01-16 16:53:45 +01:00
Etan Kissling 609227559f
LC data fork cleanup (#4506)
Distinguish between those code locations that need to be updated on each
light client data format change, and those others that should generally
be fine, as long as a valid light client object is processed.

The former are tagged with static assert for `LightClientDataFork.high`.

The latter are changed to `lcDataFork > LightClientDataFork.None` to
indicate that they depend only on presence of any valid object.
Also bundled a few minor cleanups and fixes.

Also add `Forky` type for `LightClientStore` and minor fixes / cleanups.
2023-01-14 22:19:50 +01:00
Etan Kissling 2324136552
add `LightClientHeader` wrapper (#4481)
The light client data structures were changed to accommodate additional
fields in future forks (e.g., to also hold execution data).

There is a minor change to the JSON serialization, where the `header`
properties are now nested inside a `LightClientHeader`.
The SSZ serialization remains compatible.

See https://github.com/ethereum/consensus-specs/pull/3190
and https://github.com/ethereum/beacon-APIs/pull/287
2023-01-13 16:46:35 +01:00
Etan Kissling 7e276937dc
make LC data fork aware (#4493)
In a future fork, light client data will be extended with execution info
to support more use cases. To anticipate such an upgrade, introduce
`Forky` and `Forked` types, and ready the database schema.
Because the mapping of sync committee periods to fork versions is not
necessarily unique (fork schedule not in sync with period boundaries),
an additional column is added to `period` -> `LightClientUpdate` table.
2023-01-12 18:11:38 +01:00
Jacek Sieka 6bfc766629
drop subset sync contributions in gossip (#4490)
* correctly report ignored contributions in metrics
* avoid counting subset contributions in vmon (bring in line with
attestation aggregates)
* avoid signature checks for subset attestations

A being a non-strict subset is a sufficient condition to ignore.
2023-01-12 15:08:08 +01:00
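The non-strict subset rule from #4490 can be illustrated with aggregation bits packed into integers. This Python sketch is an assumption-laden illustration (the function name and the integer encoding are hypothetical), not the Nim gossip validation code.

```python
def is_subset(candidate_bits: int, seen_bits: int) -> bool:
    """True when every bit set in candidate is also set in seen.

    Equality counts as a subset (non-strict), matching the rule that an
    identical contribution adds nothing new and can be ignored.
    """
    return candidate_bits & ~seen_bits == 0
```

For example, a contribution with bits `0b0101` is ignored once `0b0111` has been seen, while `0b1001` still carries a new participant and is kept.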
henridf 309f8690de
Wire up engine_newPayloadV3 (#4482)
* Wire up eip4844's newPayloadV3

* Add eip4844 test

* Update AllTests-mainnet.md and fix typo
2023-01-11 18:21:19 +00:00
Jacek Sieka ba3db7aa5a
spec: Option -> Opt (#4488) 2023-01-11 12:29:21 +00:00
tersec e28e1aeec8
a few consensus spec ref URL updates (#4483) 2023-01-10 16:14:17 +00:00
tersec 2dd3cd786f
consensus spec ref URL update v1.3.0-{alpha.2,rc.0}; copyright year update (#4477) 2023-01-09 22:44:44 +00:00
henridf 64878888bd
Blob storage (#4454)
* Blob storage

* fix indentation

* Fix build (none->Opt.none)

* putBlobs -> putBlobsSidecar

* getBlobs -> getBlobsSidecar

* Check blob correctness when storing a backfill block

* Blobs table: rename and conditionally create

* Check block<->blob match in storeBackfillBlock

* Use when .. toFork() to condition on type

* Check blob viability in block_processor.storeBlock()

* Fix build

* Review feedback
2023-01-09 18:42:10 +00:00
tersec c5d1683f19
spec ref URL & copyright year updates (#4467) 2023-01-06 16:28:46 +00:00
Jacek Sieka 7c2ed5c609
Always-on optimistic mode (#4458)
With https://github.com/status-im/nimbus-eth2/pull/4420 implemented, the
checks that we perform are equivalent to those of a `SYNCING` EL - as
such, we can treat missing EL the same as SYNCING and proceed with an
optimistic sync.

This mode of operation significantly speeds up recovery after an offline
EL event because the CL is already synced and can immediately inform the
EL of the latest head.

It also allows using a beacon node for consensus archival queries
without an execution client.

* deprecate `--optimistic` flag
* log block details on EL error, soften log level because we can now
continue to operate
* `UnviableFork` -> `Invalid` when block hash verification fails -
failed hash verification is not a fork-related block issue
2023-01-04 15:51:14 +00:00
henridf 8251cc223d
eip4844 gossip (#4444)
* eip4844 gossip

* Check BLSFieldElement range validity in gossip validation

* lint/nits cleanup

* Use template to avoid an assignment with copy.

* More review feedback

* lint

* lint

* processSignedBeaconBlockAndBlobsSidecar: clean up error handling flow

* Undo factoring-out of beacon blocks validator installation
2023-01-04 12:34:15 +00:00
Jacek Sieka 75c7195bfd
Backfill only up to MIN_EPOCHS_FOR_BLOCK_REQUESTS blocks (#4421)
When backfilling, we only need to download blocks that are newer than
MIN_EPOCHS_FOR_BLOCK_REQUESTS - the rest cannot reliably be fetched from
the network and does not have to be provided to others.

This change affects only trusted-node-synced clients - genesis sync
continues to work as before (because it needs to construct a state by
building it from genesis).

Those wishing to complete a backfill should do so with era files
instead.
2022-12-23 08:42:55 +01:00
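To make the backfill limit in #4421 concrete, here is a hedged Python sketch of how far back a node would need to go. The constants are the mainnet consensus-spec values; the function name is hypothetical and the exact computation in Nimbus may differ.

```python
SLOTS_PER_EPOCH = 32
# Mainnet value from the consensus spec:
# MIN_VALIDATOR_WITHDRAWABILITY_DELAY + CHURN_LIMIT_QUOTIENT // 2 = 256 + 32768
MIN_EPOCHS_FOR_BLOCK_REQUESTS = 33024


def backfill_horizon_slot(current_slot: int) -> int:
    """First slot of the oldest epoch that still has to be backfilled."""
    current_epoch = current_slot // SLOTS_PER_EPOCH
    horizon_epoch = max(0, current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS)
    return horizon_epoch * SLOTS_PER_EPOCH
```

On a young chain the horizon is simply slot 0, so the limit only starts to matter once the chain is older than `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs.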
Etan Kissling c91d9d61e2
validate EL block hash in EL simulation (#4420)
When simulating EL with `--optimistic` flag, perform block hash check.
2022-12-20 09:24:33 +01:00
tersec bb4ea37baa
update EF consensus spec URLs from v1.3.0-alpha.1 to v1.3.0-alpha.2 (#4432) 2022-12-15 12:15:12 +00:00
tersec 7faef7827e
fix EIP4844 withBlck (#4411)
* fix EIP4844 withBlck

* don't raiseAssert by default
2022-12-14 18:30:56 +01:00
Jacek Sieka 6e2a02466e
unify bn/vc doppelganger detection (#4398)
* fix REST liveness endpoint responding even when gossip is not enabled
* fix VC exit code on doppelganger hit
* fix activation epoch not being updated correctly on long deposit
queues
* fix activation epoch being set incorrectly when updating validator
* move most implementation logic to `validator_pool`, add tests
* ensure consistent logging between VC and BN
* add docs
2022-12-09 17:05:55 +01:00
Etan Kissling bbf1d6030c
add hooks for observing LC progress (#4401)
For Fluffy injection, add observer callbacks that get called whenever
new light client data is successfully processed.

```
  proc onLightClientObject(
      lightClient: LightClient, obj: SomeLightClientObject) =
    info "New LC object", obj

  lightClient.bootstrapObserver =
    proc(lightClient: LightClient, obj: altair.LightClientBootstrap) =
      lightClient.onLightClientObject(obj)
  lightClient.updateObserver =
    proc(lightClient: LightClient, obj: altair.LightClientUpdate) =
      lightClient.onLightClientObject(obj)
  lightClient.finalityUpdateObserver =
    proc(lightClient: LightClient, obj: altair.LightClientFinalityUpdate) =
      lightClient.onLightClientObject(obj)
  lightClient.optimisticUpdateObserver =
    proc(lightClient: LightClient, obj: altair.LightClientOptimisticUpdate) =
      lightClient.onLightClientObject(obj)
```
2022-12-08 16:24:16 +00:00
tersec 7cf432b155
eip4844 fork and epoch transition tests; some eip4844 gossip (#4393) 2022-12-06 16:43:11 +00:00
tersec 415b11aa67
EIP4844 tweaks to pass SSZ consensus object tests (#4390) 2022-12-05 21:36:53 +00:00
tersec 474b0d8502
`withUpdatedState` injects `updatedState` rather than `state` template (#4375) 2022-11-30 16:37:23 +02:00
tersec ed672113bc
support engine API execution payloads with withdrawals (#4358) 2022-11-29 05:02:16 +00:00
tersec 61c5ac32d8
automated consensus spec ref URL update to v1.3.0-alpha.1 (#4354) 2022-11-24 19:07:02 +00:00
tersec c8083f2c32
implement more missing capella functionality (#4344) 2022-11-24 09:53:04 +02:00
Eugene Kabanov fb4fea81b5
Fix doppelganger protection in validator duties. (#4345)
Fix missing activationEpoch setup.
2022-11-24 09:48:10 +02:00
Eugene Kabanov eb661565ed
Per-validator doppelganger protection. (#4304)
* Initial commit.

* NextAttestationEntry type.

* Add doppelgangerCheck and actual check.

* Recover deleted check.

* Remove NextAttestationEntry changes.

* More cleanups for NextAttestationEntry.

* Address review comments.

* Remove GENESIS_EPOCH specific check branch.

* Decrease number of full epochs for doppelganger check in VC.

Co-authored-by: zah <zahary@status.im>
2022-11-20 15:55:43 +02:00
Etan Kissling 48994f67d3
rename `BlockError` -> `VerifierError` (#4310)
We currently use `BlockError` for both beacon blocks and LC objects.
In light of EIP4844, we will likely also use it for blob sidecars.
To avoid confusion, renaming it to a more generic `VerifierError`,
and update its documentation to be more generic.

To avoid long lines as a followup, also renaming the `block_processor`'s
`BlockProcessingCompleted.completed`->`ProcessingStatus.completed` and
`BlockProcessingCompleted.notCompleted`->`ProcessingStatus.notCompleted`
2022-11-10 17:40:27 +00:00
tersec 909c095e64
initial automated v1.2.0 -> v1.3.0-alpha.0 consensus spec URL update (#4296) 2022-11-08 02:37:28 +00:00
tersec 5b46f0b723
add Capella support to Forked* (#4276)
* add Capella support to Forked*

* remove cruft

* add `OnForkyBlockAdded`
2022-11-02 16:23:30 +00:00
tersec 69ed3a2fd6
fix false-positive warnings on expected VALID fcU status; adjust log levels (#4242)
* fix false-positive warnings on expected VALID fcU status; adjust log levels

* clearer info/warning message wording
2022-10-26 21:14:11 +00:00
Jacek Sieka b08d0ff2ab
Optimistic mode (#4262)
In optimistic mode, Nimbus will sync optimistically even when the
execution client is offline / not available.

An optimistic node is less secure because it has not validated block
transactions via the execution client and can thus not be used for
validation duties.
2022-10-26 20:44:45 +00:00
tersec fb6e6d9cf4
remove `newPayload` from block production flow (#4186)
* remove `newPayload` from block production flow

* refactor block_processor to run `newPayload` as part of `storeBlock`
2022-10-14 22:48:56 +03:00
tersec ad7541567c
move LVH handling to tests/; increase maximum fork choice retries (#4205) 2022-10-03 13:10:08 +00:00
tersec c367b14ad9
deprecate `--safe-slots-to-import-optimistically` (#4182) 2022-09-29 06:29:49 +00:00
tersec 1819d79e07
avoid potential database inconsistency after fcU `INVALID`+crash (#4192)
* avoid database race-condition inconsistency after fcU `INVALID` then crash

* ensure head doesn't fall behind finalized; add more tests for head movement/reloading DAG
2022-09-28 21:07:31 +00:00
tersec 0f6d19b4b3
implement v1.2.0 optimistic sync tests (#4174)
* implement v1.2.0 optimistic sync tests

* Update beacon_chain/consensus_object_pools/blockchain_dag.nim

Co-authored-by: Etan Kissling <etan@status.im>

* `lvh` -> `latestValidHash` and only invalidate one specific block

* `getEarliestInvalidRoot` -> `getEarliestInvalidBlockRoot`; `defaultEarliestInvalidRoot` -> `defaultEarliestInvalidBlockRoot`

Co-authored-by: Etan Kissling <etan@status.im>
2022-09-27 15:11:47 +03:00
tersec a0ead042ad
newPayload `INVALIDATED` should be `unviableFork` (#4180) 2022-09-26 21:24:32 +00:00
tersec deb043796b
a few more manual v1.2.0 consensus spec ref URL updates (#4165) 2022-09-23 12:00:17 +00:00
tersec 3c03ba86c1
update consensus spec ref URLs to v1.2.0 (#4164) 2022-09-23 07:56:06 +00:00
zah 154723947b
Don't search for the TTD block after the merge (#4152) 2022-09-20 09:17:25 +03:00
tersec 56720dd808
update consensus layer spec ref URLs to v1.2.0-rc.3 (#4143) 2022-09-20 02:08:09 +02:00
tersec ab3ac64b19
Remove optimistic sync candidate check (#4129) 2022-09-17 20:45:35 +00:00
Etan Kissling 0244671cb8
rm optimistic candidate block check from LC (#4131)
The optimistic candidate block check that only imports a new block into
the EL client if its parent block also had execution enabled is not
needed anymore, as mainnet has merged and the attack period is over.
2022-09-17 00:42:19 +00:00
tersec 8be964a152
update consensus layer spec ref URLs to v1.2.0-rc.3 (#4109) 2022-09-10 17:16:38 +00:00
tersec 19bf460a3b
more `withState` `state` -> `forkyState` (#4104) 2022-09-10 08:12:07 +02:00
tersec 1d620f0123
consensus spec URL updates to v1.2.0-rc.3 (#4105) 2022-09-09 21:56:06 +00:00
tersec cd46af17e9
handle INVALIDATED forkchoiceUpdated better (#4081) 2022-09-07 22:54:37 +02:00
tersec bf3a014287
more efficient forkchoiceUpdated usage (#4055)
* more efficient forkchoiceUpdated usage

* await rather than asyncSpawn; ensure head update before dag.updateHead

* use action tracker rather than attached validators to check for next slot proposal; use wall slot + 1 rather than state slot + 1 to correctly check when missing blocks

* re-add two-fcU case for when newPayload not VALID

* check dynamicFeeRecipientsStore for potential proposal

* remove duplicate checks for whether next proposer
2022-09-07 20:34:52 +02:00
tersec ad0d30093f
state/forkyState cleanup; spec URL updates; rm unused imports (#4052) 2022-08-31 13:29:34 +02:00
Etan Kissling 613f4a9a50
accelerate EL sync with LC with `--sync-light-client` (#4041)
When the BN-embedded LC makes sync progress, pass the corresponding
execution block hash to the EL via `engine_forkchoiceUpdatedV1`.
This allows the EL to sync to wall slot while the chain DAG is behind.
Renamed `--light-client` to `--sync-light-client` for clarity, and
`--light-client-trusted-block-root` to `--trusted-block-root` for
consistency with `nimbus_light_client`.

Note that this does not work well in practice at this time:
- Geth sticks to the optimistic sync:
  "Ignoring payload while snap syncing" (when passing the LC head)
  "Forkchoice requested unknown head" (when updating to LC head)
- Nethermind syncs to LC head but does not report ancestors as VALID,
  so the main forward sync is still stuck in optimistic mode:
  "Pre-pivot block, ignored and returned Syncing"

To aid EL client teams in fixing those issues, having this available
as a hidden option is still useful.
2022-08-29 12:16:35 +00:00
tersec 2545d1d053
remove incorrect block gossip validation condition (#4044)
* remove incorrect block gossip validation condition

* clarify explanation
2022-08-29 13:01:32 +03:00
Etan Kissling 64972e3c8a
set `safe_block_hash` to fork choice justified (#4010)
Implements the fork choice safe block spec, where `safe_block_hash` in
`forkChoiceUpdated` is set to justified (used to be `ZERO_HASH`).
https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.3/fork_choice/safe-block.md#get_safe_execution_payload_hash
2022-08-25 23:34:02 +00:00
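The change in #4010 amounts to which hash goes into the `safeBlockHash` field of the Engine API fork choice state. A minimal Python sketch, assuming a hypothetical helper name and plain hex strings for the hashes:

```python
def forkchoice_state(head_hash: str, justified_hash: str, finalized_hash: str) -> dict:
    """Build a ForkchoiceStateV1-shaped dict with the safe block set to justified."""
    return {
        "headBlockHash": head_hash,
        # Previously a zero hash; now the justified block's execution hash.
        "safeBlockHash": justified_hash,
        "finalizedBlockHash": finalized_hash,
    }
```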
Etan Kissling 9180f09641
reduce LC optsync latency (#4002)
The optimistic sync spec was updated since the LC based optsync module
was introduced. It is no longer necessary to wait for the justified
checkpoint to have execution enabled; instead, any block is okay to be
optimistically imported to the EL client, as long as its parent block
has execution enabled. Complex syncing logic has been removed, and the
LC optsync module will now follow gossip directly, reducing the latency
when using this module. Note that because this is now based on gossip
instead of using sync manager / request manager, individual blocks
may be missed. However, EL clients should recover from this by fetching
missing blocks themselves.
2022-08-25 03:53:59 +00:00
Etan Kissling eec6c04d32
do not descore peer when EL connection fails (#4020)
When the EL fails to respond to `newPayload`, e.g., because connection
to the EL got interrupted, or due to misconfiguration, optimistic blocks
cannot be imported according to spec. This condition is treated the same
as if the peer returned a block with missing parent which gets the block
out of our processing queue, but can have nasty side effects.

For example, if sync manager asks for validation of a block known to be
in the finalized range, if it receives a `MissingParent` verdict, the
peer is immediately removed from the peer pool.

```
DBG 2022-08-24 11:45:26.874+02:00 newPayload: inserting block into execution engine parentHash=e4ca7424 blockHash=36cdc198 stateRoot=cf3902c1 receiptsRoot=56e81f17 prevRandao=0b49a172 blockNumber=1518089 gasLimit=30000000 gasUsed=0 timestamp=1657980396 extraDataLen=0 baseFeePerGas=7 numTransactions=0
ERR 2022-08-24 11:45:26.875+02:00 newPayload failed                          msg="Transport is not initialised (missing a call to connect?)"
DBG 2022-08-24 11:45:26.875+02:00 Block pool rejected peer's response        topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx blocks_count=31 ok=false unviable=false missing_parent=true sync_ident=main
ERR 2022-08-24 11:45:26.875+02:00 Unexpected missing parent at finalized epoch slot topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward rewind_to_slot=187232 blocks_count=31 blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx sync_ident=main
DBG 2022-08-24 11:45:26.875+02:00 Peer was removed from PeerPool due to low score topics="beacnde" peer=16U*MsCJdx peer_score=-1000 score_low_limit=0 score_high_limit=1000
DBG 2022-08-24 11:45:26.875+02:00 Lost connection to peer                    topics="networking" peer=16U*MsCJdx connections=0
```

By delaying issuing a verdict until the EL connection is restored and
`newPayload` successfully ran, the problem should be fixed. This also
induces back pressure to the sync manager by stopping download of new
blocks (or re-downloading the same block over and over again).
2022-08-24 16:55:41 +00:00
tersec 1d55743ebb
allow execution clients several seconds to construct blocks (#4012) 2022-08-23 19:19:52 +03:00
tersec c65eaca1bf
update spec ref URLs (#4005) 2022-08-20 16:03:32 +00:00
tersec f537f263df
don't use empty execution payload when newPayload rejects it (#3999)
* don't use empty execution payload when newPayload rejects it

* disallow optimistic import except when accepted/syncing
2022-08-20 00:20:57 +03:00
zah df5ef95111
Doppelganger detection bug fix (#3997)
When the client was started without any validators, the doppelganger
detection structures were never initialized properly. Later, when
validators were added through the Keymanager API, they interacted
with the uninitialized doppelganger detection structures and their
duties were inappropriately skipped.
2022-08-19 13:34:08 +03:00
Jacek Sieka 0d9fd54857
cache shuffling separately from other EpochRef data (fixes #2677) (#3990)
In order to avoid full replays when validating attestations hailing from
untaken forks, it's better to keep shufflings separate from `EpochRef`
and perform a lookahead on the shuffling when processing the block that
determines them.

This also helps performance in the case where REST clients are trying to
perform lookahead on attestation duties and decreases memory usage by
sharing shufflings between EpochRef instances of the same dependent
root.
2022-08-18 21:07:01 +03:00
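The idea in #3990 of sharing one shuffling between all users with the same dependent root can be pictured as a cache keyed by epoch and dependent root. The Python sketch below is purely illustrative: the class, its key layout, and the `compute` callback are hypothetical and do not mirror the Nim data structures.

```python
class ShufflingCache:
    """Memoize shufflings per (epoch, dependent_root) so they can be shared."""

    def __init__(self, compute):
        self._compute = compute  # hypothetical callback: (epoch, root) -> shuffling
        self._entries = {}

    def get(self, epoch, dependent_root):
        key = (epoch, dependent_root)
        if key not in self._entries:
            self._entries[key] = self._compute(epoch, dependent_root)
        return self._entries[key]
```

Two lookups with the same key return the very same object, which is where the memory saving described in the commit would come from.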
tersec 3ad1d251ef
make newPayload/forkchoiceUpdated failures errors (#3989) 2022-08-18 12:57:32 +00:00
Etan Kissling 5c8e58ea23
update LC spec references for v1.2.0-rc.2 (#3982)
Updates light client spec references for latest spec (no more `vFuture`)
2022-08-17 19:47:06 +00:00
tersec 8274d5373b
update spec ref URLs (#3979) 2022-08-17 11:33:19 +00:00
zah ca3245c4f0
Doppelganger exit code changed from 1031 to 129 (addresses #3973) (#3977) 2022-08-17 08:13:55 +02:00
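One plausible reason for the change in #3977, stated here as an assumption because the commit title does not spell it out: on POSIX systems a parent process only sees the low 8 bits of an exit status, so a value above 255 cannot be observed as written. A short Python check of the arithmetic (the helper name is hypothetical):

```python
def observed_exit_status(code: int) -> int:
    """Exit status as seen by a POSIX parent: only the low 8 bits survive."""
    return code & 0xFF
```

Under this rule 1031 would be reported as 7, whereas 129 fits in 8 bits and is reported unchanged.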
zah dc50abbc90
Implement a missing ignore rule for sync committee contributions (#3941) 2022-08-09 12:52:11 +03:00
Miran dfd4afc9f2
compatibility with Nim 1.4+ (#3888) 2022-07-29 10:53:42 +00:00
Etan Kissling 3bc42994e4
update to latest LC test format (#3879)
The EF test format for the LC sync protocol is modified to verify checks
after each step: https://github.com/ethereum/consensus-specs/pull/2938 -
The test runner is updated accordingly.
2022-07-23 05:54:01 +00:00
tersec 2f77f05a1a
optimistic block gossip validation (#3876) 2022-07-21 21:39:43 +03:00
tersec f4208cfb23
opportunistically even less async optimistic sync (#3880) 2022-07-21 21:26:36 +03:00
Etan Kissling 735c1df62f
add strict mode to light client processor (#3894)
The light client sync protocol employs heuristics to ensure it does not
become stuck during non-finality or low sync committee participation.
These can enable use cases that prefer availability of recent data
over security. For our syncing use case, though, security is preferred.
An option is added to light client processor to configure this tradeoff.
2022-07-21 11:16:10 +02:00
tersec 06c8e10ae2
move consensus_manager to consensus_object_pools (#3852) 2022-07-13 14:13:54 +00:00
tersec ce6cbd84e2
rename verifyFinalization internal flag to strictVerification (#3866)
* rename verifyFinalization internal flag to strictVerification

* Update beacon_chain/extras.nim

Co-authored-by: Etan Kissling <etan@status.im>

Co-authored-by: Etan Kissling <etan@status.im>
2022-07-13 13:48:09 +00:00
tersec 1250c56e32
less async optimistic sync (#3842)
* less async optimistic sync

* use asyncSpawn; adapt changes to message router
2022-07-07 16:57:52 +00:00
Jacek Sieka e1830519a4
Introduce message router (#3829)
Whether new blocks/attestations/etc are produced internally or received
via REST, their journey through the node is the same - to ensure that
they get the same treatment (logging, metrics, processing), this PR
moves the routing to a dedicated module and fixes several small
differences that existed before.

* `xxxValidator` -> `processMessageName` - the processor also was adding
messages to pools, so we want the name to reflect that action
* add missing "sent" metrics for some messages
* document ignore policy better - already-seen messages are not actually
rebroadcast by libp2p
* skip redundant signature checks for internal validators consistently
2022-07-06 16:11:44 +00:00
Etan Kissling 2a2bcea70d
group justified and finalized `Checkpoint` (#3841)
The justified and finalized `Checkpoint` are frequently passed around
together. This introduces a new `FinalityCheckpoint` data structure that
combines them into one.

Due to the large usage of this structure in fork choice, also took this
opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec.
Many additional tests enabled, some need more work, e.g. EL mock blocks.
Also implemented `discard_equivocations` which was skipped in #3661,
and improved code reuse across fork choice logic while at it.
2022-07-06 13:33:02 +03:00
tersec 1221bb66e8
optimistic sync (#3793)
* optimistic sync

* flag that initially loaded blocks from database might need execution block root filled in

* return optimistic status in REST calls

* refactor blockslot pruning

* ensure beacon_blocks_by_{root,range} do not provide optimistic blocks

* handle forkchoice head being pre-merge with block being postmerge

* re-enable blocking head updates on validator duties

* fix is_optimistic_candidate_block per spec; don't crash with nil future

* mark blocks sans execution payloads valid during head update
2022-07-04 23:35:33 +03:00
Etan Kissling 2e98c7722f
encapsulate LC data variables into single structure (#3777)
Combines the LC data configuration options (serve / importMode), the
callbacks (finality / optimistic LC update) as well as the cache storing
light client data, into a new `LightClientDataStore` structure.
Also moves the structure into a light client specific file.
2022-06-24 16:57:50 +02:00
Jacek Sieka 347a485b5b
bearssl: split abi (#3755) 2022-06-21 10:29:16 +02:00
tersec 2c623e5f92
don't try to fcU on pre-merge bellatrix blocks (#3773) 2022-06-18 13:39:21 +03:00
tersec d41c2a293b
rewrite merge sync (#3759) 2022-06-17 17:16:03 +03:00
tersec 8d421f3d91
keep fcU consistent with actual DAG (#3748) 2022-06-14 08:28:30 +00:00
tersec cc5f95dbbb
separate non-zero exit code for doppelganger detection (#3728) 2022-06-10 14:53:19 +03:00
tersec 65cecc50ca
cleanups: unused and duplicate imports, inconsistent naming conventions, URL updates (#3724) 2022-06-09 14:30:13 +00:00
Etan Kissling 72a46bd520
integrate light client into beacon node (#3557)
Adds a `LightClient` instance to the beacon node as preparation to
accelerate syncing in the future (optimistic sync).

- `--light-client-enable` turns on the feature
- `--light-client-trusted-block-root` configures block to start from

If no block root is configured, light client tracks DAG `finalizedHead`.
2022-06-07 19:01:11 +02:00
tersec 7492f99f35
update CL spec URLs (#3696) 2022-06-03 09:01:58 +00:00
tersec ce143a1078
update CL spec URLs (#3690) 2022-06-01 15:52:45 +00:00
tersec 62bfe97bbe
fix ExecutionPayload(Header) JSON serialization (#3679) 2022-06-01 14:57:28 +02:00
tersec f929980bf3
update 20 CL spec ref URLs (#3677) 2022-05-31 11:15:31 +00:00
Etan Kissling 01efa93cf6
add light client (standalone) (#3653)
Introduces a new library for syncing using libp2p based light client
sync protocol, and adds a new `nimbus_light_client` executable that uses
this library for syncing. The new executable emits log messages when
new beacon block headers are received, and is integrated into testing.
2022-05-31 12:45:37 +02:00
tersec a3413963a1
update (or for one, remove) 15 CL spec ref URLs (#3671) 2022-05-30 12:24:43 +00:00
tersec dfd8cd22b7
bump nim-web3 and use engine API v1.0.0.alpha.9 (#3663) 2022-05-25 10:30:37 +00:00
tersec b3d603f364
more CL spec URL updates to v1.2.0-rc.1 (#3657) 2022-05-24 08:26:35 +00:00
Jacek Sieka 1101c745b9
document and clean up `ValidatorIndex` usage (#3651)
* document static vs dynamic range checking requirements
* add `vindices` iterator to iterate over valid validator indices in a
state
* clean up spec comments in general

* fixup

Co-authored-by: tersec <tersec@users.noreply.github.com>
2022-05-23 23:39:08 +00:00
tersec c73239f60b
CL spec URL updates to v1.2.0-rc.1 (#3655) 2022-05-23 19:30:24 +00:00
Etan Kissling c808f17a37
update to latest light client libp2p protocol (#3623)
Incorporates the latest changes to the light client sync protocol based
on Devconnect AMS feedback. Note that this breaks compatibility with the
previous prototype, due to changes to data structures and endpoints.
See https://github.com/ethereum/consensus-specs/pull/2802
2022-05-23 14:02:54 +02:00
tersec 1177f33363
standardize on upcoming/specified engine API timeouts (#3637) 2022-05-17 13:57:33 +00:00
zah a2ba34f686
Implement all sync committee duties in the validator client (#3583)
Other changes:

* logtrace can now verify sync committee messages and contributions
* Many unnecessary uses of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
2022-05-10 10:03:40 +00:00
tersec 104cc3053f
fcU on syncing newPayload syncing response (#3618) 2022-05-08 09:09:46 +02:00
tersec ab1fac7236
post-merge Bellatrix block proposals (#3570)
* post-merge Bellatrix block proposals

* tolerate running without an Eth1Monitor better

* remove obsolete comment

* use correct empty receipts root

* handle invalid CLI parameters in parseCmdArg overloads
2022-04-14 20:15:34 +00:00
Zahary Karadjov d450681b15
Fix another off-by-one causing rejected sync contributions at period boundaries 2022-04-08 22:47:47 +03:00
Jacek Sieka f70ff38b53
enable `styleCheck:usages` (#3573)
Some upstream repos still need fixes, but this gets us close enough that
style hints can be enabled by default.

In general, "canonical" spellings are preferred even if they violate
nep-1 - this applies in particular to spec-related stuff like
`genesis_validators_root` which appears throughout the codebase.
2022-04-08 16:22:49 +00:00
Jacek Sieka 30eef0a369
Validator monitor polish (#3569)
* lower "Previous epoch attestation missing" to `NOTICE` for easier
filtering
* add delay logging to validator monitor logs
* simplify delay logging code post-`BeaconTime`
2022-04-06 09:23:01 +00:00
tersec 759a793764
use Eth1Monitor as abstraction; increase timeouts; handle newPayload 'accepted' (#3563) 2022-04-05 08:40:59 +00:00
tersec 9b43a76f2f
kiln beacon node (#3540)
* kiln bn

* use  version of beacon_chain_db

* have Eth1Monitor abstract more tightly over web3provider
2022-03-25 11:40:10 +00:00
Etan Kissling 12dc427535
introduce light client processor (#3509)
Adds `LightClientProcessor` as the counterpart to `BlockProcessor` while
operating in light client mode. Note that a similar mechanism based on
async futures is used for interoperability with existing infrastructure,
despite light client object validation being done synchronously.
2022-03-17 23:26:56 +01:00
Jacek Sieka 05ffe7b2bf
Prune `BlockRef` on finalization (#3513)
Up until now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.

By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory
landing us at a steady state of ~750mb normal memory usage for a
validating node.

Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time -  instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).

In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensures that consensus performance remains unchanged.

What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no perceivable cost to functionality or performance.

A nice collateral benefit is the almost-instant startup since we no
longer load any large indices at dag init.

The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.

* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this needs revisiting :)
2022-03-17 17:42:56 +00:00
tersec 8fbcf29775
update unchanged specs/phase0/p2p-interface.md URL references from v1.1.9 to v1.1.10 (#3510) 2022-03-16 10:40:35 +00:00
Jacek Sieka c64bf045f3
remove StateData (#3507)
One more step on the journey to reduce `BlockRef` usage across the
codebase - this one gets rid of `StateData` whose job was to keep track
of which block was last assigned to a state - these duties have now been
taken over by `latest_block_root`, a fairly recent addition that
computes this block root from state data (at a small cost that should be
insignificant)

99% mechanical change.
2022-03-16 08:20:40 +01:00
Jacek Sieka a3bd01b58d
move dependent root computations to `BeaconState` / `EpochRef` (#3478)
* fewer deps on `BlockRef` traversal in anticipation of pruning
* allows identifying EpochRef:s by their shuffling as a first step of
* tighten error handling around missing blocks

using the zero hash for signalling "missing block" is fragile and easy
to miss - with checkpoint sync now, and pruning in the future, missing
blocks become "normal".
2022-03-15 09:24:55 +01:00
Etan Kissling a08114e996
libp2p light client gossip validation (#3486)
When `--serve-light-client-data` is specified, provides stability on the
`optimistic_light_client_update` GossipSub topic.
2022-03-14 14:05:38 +01:00
Jacek Sieka 4363215a32
relax `BlockRef` database assumptions (#3472)
* remove `getForkedBlock(BlockRef)` which assumes block data exists but
doesn't support archive/backfilled blocks
* fix REST `/eth/v1/beacon/headers` request not returning
archive/backfilled blocks
* avoid re-encoding in REST block SSZ requests (using `getBlockSSZ`)
2022-03-11 13:08:17 +01:00
tersec c18cd8ee0c
rename random -> prev_randao in Bellatrix for CL specs v1.1.10 (#3460) 2022-03-03 16:08:14 +00:00
tersec f0ada15dac
automated CL spec ref URL updates from v1.1.9 to v1.1.10 (#3455) 2022-03-02 10:00:21 +00:00
Jacek Sieka 92e7e288e7
Ignore seen aggregates (#3439)
https://github.com/ethereum/consensus-specs/pull/2225 removed an ignore
rule that would filter out duplicate aggregates from gossip publishing -
however, this causes increased bandwidth and CPU usage as discussed in
https://github.com/ethereum/consensus-specs/issues/2183 - the intent is
to revert the removal and reinstate the rule.

This PR implements ignore filtering which cuts down on CPU usage (fewer
aggregates to validate) and bandwidth usage (less fanout of duplicates)
- as #2225 points out, this may lead to a small increase in IHAVE
messages.
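
A minimal Python sketch of the ignore rule, for illustration only (it is not the Nim code from this PR). `SeenAggregates` and `Verdict` are hypothetical names, and SHA-256 over the raw bytes stands in for the aggregate's real identifier:

```python
import hashlib
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"  # first sighting: validate and forward as usual
    IGNORE = "ignore"  # duplicate: skip validation, do not forward


class SeenAggregates:
    """Remembers which aggregates have already been observed on gossip."""

    def __init__(self) -> None:
        self._seen = set()

    def check(self, aggregate_bytes: bytes) -> Verdict:
        # Stand-in key; a real client would key on the aggregate's hash tree root.
        digest = hashlib.sha256(aggregate_bytes).digest()
        if digest in self._seen:
            return Verdict.IGNORE
        self._seen.add(digest)
        return Verdict.ACCEPT


cache = SeenAggregates()
first = cache.check(b"aggregate-1")   # Verdict.ACCEPT
second = cache.check(b"aggregate-1")  # Verdict.IGNORE
```

Ignoring the duplicate before signature checks is what saves CPU, and not forwarding it is what saves bandwidth. A real implementation would also need to expire old entries, which this sketch omits.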
2022-02-25 17:15:39 +01:00
tersec 79761c78a4
proc -> func, mainly in spec/state transition and adjacent modules (#3405) 2022-02-17 11:53:55 +00:00
tersec 2275fad335
only show setting up doppelganger detection log message if enabled (#3391)
* only show setting up doppelganger detection log message if enabled

* correct indentation
2022-02-14 19:24:38 +00:00
tersec 873a8ec1e6
use isZeroMemory for Eth2Digest comparisons (#3386)
* use isZeroMemory for Eth2Digest comparisons

* use Eth2Digest.isZero abstraction
2022-02-14 05:26:19 +00:00
tersec bf3ef987e4
deactivate doppelganger protection during genesis (#3362)
* deactivate Doppelganger Protection during genesis

* also don't actually flag supposed-doppelgangers (because they're before broadcastStartEpoch) on GENESIS_SLOT start
2022-02-07 07:12:36 +02:00
Jacek Sieka 49282e9477
val_mon: register locally produced aggregates (#3352)
These use a separate flow, and were previously only registered from the
network

* don't log successes in totals mode (TMI)
* remove `attestation-sent` event which is unused
2022-02-04 08:33:20 +01:00
tersec 0c814f49ee
rename sync_{committee_,}aggregate and execute_payload -> notify_new_payload (#3347) 2022-02-01 07:31:53 +00:00
tersec 89ffa8a1a7
spec URL & copyright year update (#3338) 2022-01-29 01:05:39 +00:00
tersec 7c51da037f
add block gossip validation condition (#3325) 2022-01-26 17:22:06 +00:00
Jacek Sieka f70aceef37
Harden handling of unviable forks (#3312)
* Harden handling of unviable forks

In our current handling of unviable forks, we allow peers to send us
blocks that come from a different fork - this is not necessarily an
error as it can happen naturally, but it does open up the client to a
case where the same unviable fork keeps getting requested - rather than
allowing this to happen, we'll now give these peers a small negative
score - if it keeps happening, we'll disconnect them.
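
The scoring idea can be sketched in Python as follows. This is an illustration, not the Nim implementation: `Peer`, `report_unviable_block` and all three constants are hypothetical, and the numeric values are made up.

```python
from dataclasses import dataclass

# Hypothetical values chosen for the example only.
NEW_PEER_SCORE = 300
UNVIABLE_FORK_PENALTY = 100
DISCONNECT_THRESHOLD = 0


@dataclass
class Peer:
    score: int = NEW_PEER_SCORE
    connected: bool = True


def report_unviable_block(peer: Peer) -> bool:
    """Apply a small penalty; disconnect once the score is used up.

    Returns True while the peer is still connected.
    """
    peer.score -= UNVIABLE_FORK_PENALTY
    if peer.score <= DISCONNECT_THRESHOLD:
        peer.connected = False
    return peer.connected
```

A single unviable block is tolerated because it can happen naturally; only repeated offences exhaust the score and lead to disconnection.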

* keep track of unviable forks in quarantine, to avoid filling it with
known junk
* collect peer scores in single module
* descore peers when they send unviable blocks during sync
* don't give score for duplicate blocks
* increase quarantine size to a level that allows finality to happen
under optimal conditions - this helps avoid downloading the same blocks
over and over in case of an unviable fork
* increase initial score for new peers to make room for one more failure
before disconnection
* log and score invalid/unviable blocks in requestmanager too
* avoid ChainDAG dependency in quarantine
* reject gossip blocks with unviable parent
* continue processing unviable sync blocks in order to build unviable
dag

* docs

* Update beacon_chain/consensus_object_pools/block_pools_types.nim

* add unviable queue test
2022-01-26 13:20:08 +01:00
tersec 351c2fd48a
rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315) 2022-01-24 16:23:13 +00:00
Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.
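
A rough Python sketch of the two access paths, under the assumption that unfinalized blocks are looked up by root and finalized ones by slot. `BlockIndex` and its methods are hypothetical names for illustration, not the ChainDAG API:

```python
from typing import Dict, List, Optional


class BlockIndex:
    def __init__(self) -> None:
        # root -> slot, only for blocks newer than the finalized slot
        self.fork_blocks: Dict[bytes, int] = {}
        # position = slot, value = canonical root (None for an empty slot)
        self.finalized: List[Optional[bytes]] = []

    def add(self, root: bytes, slot: int) -> None:
        self.fork_blocks[root] = slot

    def finalize(self, finalized_slot: int, canonical: Dict[int, bytes]) -> None:
        # Extend the linear slot index up to and including the finalized slot.
        for slot in range(len(self.finalized), finalized_slot + 1):
            self.finalized.append(canonical.get(slot))
        # Drop finalized and orphaned entries from the by-root index.
        self.fork_blocks = {
            root: slot
            for root, slot in self.fork_blocks.items()
            if slot > finalized_slot
        }

    def by_root(self, root: bytes) -> Optional[int]:
        return self.fork_blocks.get(root)

    def by_slot(self, slot: int) -> Optional[bytes]:
        return self.finalized[slot] if 0 <= slot < len(self.finalized) else None
```

After `finalize`, a by-root lookup for a finalized block returns `None`, so callers have to handle the "not found" case explicitly and fall back to the slot index.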

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
tersec 9c0c9c98ce
complete switch to beacon_chain/specs/datatypes/bellatrix (#3295) 2022-01-18 13:36:52 +00:00
tersec d878948ed2
update sync committee gossip validation comments; spec URL updates (#3280) 2022-01-13 13:46:08 +00:00
tersec 14aab2c13f
update 10 modules from using merge to bellatrix (#3272) 2022-01-12 15:50:30 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing with the time units - the new module does not deal with
actual wall time (that remains in `beacon_clock`).

Collecting the time-related stuff in one place makes it easier to find,
avoids some circular imports and makes it easier to identify the code
that actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
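
As an illustration of the renamed helpers and the saturating conversions, here is a small Python sketch. It is not the Nim code: `SLOTS_PER_EPOCH = 32` is the mainnet preset, `FAR_FUTURE_EPOCH` is the spec constant, while `FAR_FUTURE_SLOT` and `time_diff_ns` are assumptions made for this example.

```python
SLOTS_PER_EPOCH = 32             # mainnet preset
FAR_FUTURE_EPOCH = 2**64 - 1     # consensus spec constant
FAR_FUTURE_SLOT = 2**64 - 1      # assumed slot counterpart, for illustration


def epoch(slot: int) -> int:
    """Epoch containing `slot`; the far-future sentinel maps to itself."""
    if slot == FAR_FUTURE_SLOT:
        return FAR_FUTURE_EPOCH
    return slot // SLOTS_PER_EPOCH


def start_slot(epoch_: int) -> int:
    """First slot of `epoch_`, saturating instead of overflowing uint64."""
    if epoch_ >= FAR_FUTURE_SLOT // SLOTS_PER_EPOCH:
        return FAR_FUTURE_SLOT
    return epoch_ * SLOTS_PER_EPOCH


def time_diff_ns(actual_ns: int, expected_ns: int) -> int:
    """Signed difference: negative when something happened early."""
    return actual_ns - expected_ns
```

Python integers do not overflow, so the saturation check only mirrors what a fixed-width `uint64` implementation has to guard against.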
2022-01-11 11:01:54 +01:00
tersec ae61512ee9
rename upgrade_to_{merge,bellatrix}; detect unchanging spec YAMLs (#3265) 2022-01-10 09:39:43 +00:00
Jacek Sieka 20e700fae4
Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex (#3259)
* Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex

Harden the use of `CommitteeIndex` et al to prevent future issues by
using a distinct type, then validating before use in several cases -
datatypes in spec are kept simple though so that invalid data still can
be read.

* fix invalid epoch used in REST
`/eth/v1/beacon/states/{state_id}/committees` committee length (could
return invalid data)
* normalize some variable names
* normalize committee index loops
* fix `RestAttesterDuty` to use `uint64` for `validator_committee_index`
* validate `CommitteeIndex` on ingress in REST API
* update rest rules with stricter parsing
* better REST serializers
* save lots of memory by not using `zip` ...at least a few bytes!
2022-01-09 01:28:49 +02:00
tersec 0fd8bf7b56
spec URL updates (#3254) 2022-01-06 18:35:38 +00:00
Jacek Sieka 0a4728a241
Handle access to historical data for which there is no state (#3217)
With checkpoint sync in particular, and state pruning in the future,
loading states or state-dependent data may fail. This PR adjusts the
code to allow this to be handled gracefully.

In particular, the new availability assumption is that states are always
available for the finalized checkpoint and newer, but may fail for
anything older.

The `tail` remains the point where state loading de-facto fails, meaning
that between the tail and the finalized checkpoint, we can still get
historical data (but code should be prepared to handle this as an
error).

However, to harden the code against long replays, several operations
which are assumed to work only with non-final data (such as gossip
verification and validator duties) now limit their search horizon to
post-finalized data.

* harden several state-dependent operations by logging an error instead
of introducing a panic when state loading fails
* `withState` -> `withUpdatedState` to differentiate from the other
`withState`
* `updateStateData` can now fail if no state is found in database - it
is also hardened against excessively long replays
* `getEpochRef` can now fail when replay fails
* reject blocks with invalid target root - they would be ignored
previously
* fix recursion bug in `isProposed`
2022-01-05 19:38:04 +01:00
tersec b81c06edab
rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240) 2022-01-04 09:45:38 +00:00
tersec da017d2ca5
update from phase0/altair v1.1.6 URLs to v1.1.8 spec URLs (#3238) 2022-01-04 03:57:15 +00:00
Jacek Sieka c4ce59e55b
Assorted logging improvements (#3237)
* log doppelganger detection when it activates and when it causes missed
duties
* less prominent eth1 sync progress
* log in-progress sync at notice only when actually missing duties
* better detail in replay log
* don't log finalization checkpoints - this is quite verbose when
syncing and already included in "Slot start"
2022-01-03 22:18:49 +01:00
tersec e78d12beb9
support GOSSIP_MAX_SIZE_MERGE blocks; prevent fork choice stutter via aggregate attestations (#3230)
* support GOSSIP_MAX_SIZE_MERGE-sized blocks; prevent fork choice clock stutter via aggregate attestations

* relay max gossip size to libp2p, use tight uncompressed bounds for fixed-size messages

* Update beacon_chain/networking/eth2_network.nim

Co-authored-by: Jacek Sieka <jacek@status.im>
2022-01-03 16:20:15 +00:00
Jacek Sieka 6b60a774e0
Lazy aggregated batch verification (#3212)
A novel optimisation for attestation and sync committee message
validation: when batching, we look for signatures of the same message
and aggregate these before batch-validating: this results in up to 60%
fewer signature verifications on a busy server, leading to a significant
reduction in CPU usage.

* increase batch size slightly which helps finding more aggregates
* add metrics for batch verification efficiency
* use simple `blsVerify` when there is only one signature to verify in
the batch, avoiding the RNG
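
The grouping step can be sketched in Python as below. This is a toy model and not the PR's code: plain integers with `+` stand in for BLS public keys and signatures (where aggregation is point addition), and `aggregate_by_message` is a hypothetical name.

```python
from collections import defaultdict
from typing import List, Tuple

# (message, pubkey, signature); integers are toy stand-ins for curve points.
SignatureSet = Tuple[bytes, int, int]


def aggregate_by_message(sets: List[SignatureSet]) -> List[SignatureSet]:
    """Merge all signature sets that sign the same message into one set."""
    grouped = defaultdict(lambda: [0, 0])
    for message, pubkey, signature in sets:
        grouped[message][0] += pubkey      # aggregate public keys
        grouped[message][1] += signature   # aggregate signatures
    return [(m, pk, sig) for m, (pk, sig) in grouped.items()]


batch = [
    (b"att-A", 2, 20),
    (b"att-B", 3, 30),
    (b"att-A", 5, 50),
    (b"att-A", 7, 70),
    (b"att-B", 11, 110),
]
reduced = aggregate_by_message(batch)
# 5 signature sets collapse into 2, one per distinct message.
```

Only the reduced list is then handed to batch verification, which is where the saving comes from when many validators sign the same attestation data.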
2021-12-29 15:28:40 +01:00
tersec 1a6a56bdb1
use BeaconTime instead of Slot in fork choice (#3138)
* use v1.1.6 test vectors; use BeaconTime instead of Slot in fork choice

* tick through every slot at least once

* use div INTERVALS_PER_SLOT and use precomputed constants of them

* use correct (even if numerically equal) constant
2021-12-21 18:56:08 +00:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top of performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner
2021-12-20 20:20:31 +01:00
tersec d7799ecdcc
v1.1.6 spec updates (#3206) 2021-12-17 06:56:33 +00:00
Jacek Sieka 118840d241
SyncManager cleanups for backfill support (#3189)
* SyncManager cleanups for backfill support

Cleanups, fixes and simplifications, in anticipation of backfill support
for the `SyncManager`:

* reformat sync progress indicator to show time left and % done more
prominently:
  * old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"`
  * new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"`
* reset average speed when going out of sync
* pass all block errors to sync manager, including duplicate/unviable
* penalize peers for reporting a head block that is outside of our
expected wall clock time (they're likely on a different network or
trying to disrupt sync)
* remove `SyncFailureKind` (unused)
* remove `inRange` (unused)
* add `Q` for sync queue requests that are in the `SyncQueue` but not
yet in the `BlockProcessor` queue
* update last slot in `SyncQueue` after getting peer status
* fix race condition between `wakeupWaiters` and `resetWait`, where
workers would not be correctly reset if block verification returned a
completed future without event loop
* log syncmanager direction

* Fix ordering issue.
Requests whose size is not equal to `chunkSize` could be processed in the wrong order, which could freeze the sync process.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2021-12-16 15:57:16 +01:00
tersec 36ade1c1c6
v1.1.6 spec updates (minor, mostly URLs) (#3197) 2021-12-14 21:02:29 +00:00
tersec f09686e835
update some spec URLs to v1.1.6 (#3188) 2021-12-13 15:45:48 +00:00
Jacek Sieka 03005f48e1
Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.

When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.

When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.

Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.

This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.

Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
Jacek Sieka dfbd50b4d6
avoid SyncCommitteeMsgPool copy (#3185)
introduced by batch verification, when verifiers were made async
2021-12-11 16:39:24 +01:00
Jacek Sieka 069bccd51b
batch-verify sync messages for a small perf boost (#3151)
* batch-verify sync messages for a small perf boost

Generally reuses the same structure as attestation and aggregate
verification

* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
2021-12-09 14:56:54 +02:00
tersec 2ca28fb861
Merge BeaconBlock gossip validation (#3165)
* Merge BeaconBlock gossip validation

* figure/ground inversion

* revert cosmetic cleanups to reduce merge conflicts
2021-12-08 17:29:22 +00:00
Jacek Sieka 1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
tersec e6921f808f
cleanups, partly from kintsugi branch (#3161)
* cleanups, partly from kintsugi branch

* re-export shortLog(EthBlock) and preserve exception messages in batchVerify and processBatch
2021-12-05 17:32:41 +00:00
tersec 4378f3f096
almost all remaining ethereum/{eth2.0-specs -> consensus-specs} (#3158) 2021-12-03 20:01:13 +00:00
tersec cc51f3fd12
v1.1.{5 -> 6} phase 0 and altair spec URL updates (#3157) 2021-12-03 17:40:23 +00:00
Jacek Sieka 065d72fb15
move head update to storeBlock
when blocks are supplied via rest, this ensures the newly posted head is
chosen
2021-12-03 11:18:37 +02:00
Jacek Sieka aa1dea03cd
speed up gossip and sync block validation (#3143)
* avoid recomputing hash for block signature check
* check block slot match before hitting the database
2021-12-01 10:52:40 +01:00
Jacek Sieka a223d62b07
Cleanups (#3123)
Renames and cleanups split out from the validator monitoring branch, so
as to reduce conflict area vs other PR:s

* add constants for expected message timing
* name validators after the messages they validate, mostly, to make
grepping easier
* unify field naming of EpochInfo across forks to make cross-fork code
easier
2021-11-25 13:20:36 +01:00
Jacek Sieka 9c2f43ed0e
Speed up altair block processing 2x (#3115)
* Speed up altair block processing >2x

Like #3089, this PR drastically speeds up historical REST queries and
other long state replays.

* cache sync committee validator indices
* use ~80mb less memory for validator pubkey mappings
* batch-verify sync aggregate signature (fixes #2985)
* document sync committee hack with head block vs sync message block
* add batch signature verification failure tests

Before:

```
../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    5830.675,        0.000,     5830.675,     5830.675,            1, Initialize DB
       0.481,        1.878,        0.215,       59.167,          981, Load block from database
    8422.566,        0.000,     8422.566,     8422.566,            1, Load state from database
       6.996,        1.678,        0.042,       14.385,          969, Advance slot, non-epoch
      93.217,        8.318,       84.192,      122.209,           32, Advance slot, epoch
      20.513,       23.665,       11.510,      201.561,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

After:

```
    7081.422,        0.000,     7081.422,     7081.422,            1, Initialize DB
       0.553,        2.122,        0.175,       66.692,          981, Load block from database
    5439.446,        0.000,     5439.446,     5439.446,            1, Load state from database
       6.829,        1.575,        0.043,       12.156,          969, Advance slot, non-epoch
      94.716,        2.749,       88.395,      100.026,           32, Advance slot, epoch
      11.636,       23.766,        4.889,      205.250,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

* add comment
2021-11-24 13:43:50 +01:00
tersec 9e395011d9
update 22 spec URLs to v1.1.5 (#3111) 2021-11-18 08:08:00 +00:00
tersec 2e868dc2ba
mass/mechanical update of 1.1.4 phase0 and altair spec URLs to 1.1.5 (#3067) 2021-11-09 07:40:41 +00:00
tersec 2c8600e746
mass/mechanical update of 1.1.3 phase0 spec URLs to 1.1.4 in markdown (#3059) 2021-11-08 09:26:18 +00:00