Commit Graph

77 Commits

Author SHA1 Message Date
Jacek Sieka 62cbdeefc5
verify `genesis_time` more strictly (fixes #1667) (#5694)
Bogus values lead to crashes down the line when timers overflow
2024-01-06 15:26:56 +01:00
Etan Kissling e7bc41e005
`blck` --> `forkyBlck` when using `withBlck` / `withStateAndBlck` (#5451)
For symmetry with `forkyState` when using `withState`, and to avoid
problems with shadowing of `blck` when using `withBlck` in `template`,
also rename the injected `blck` to `forkyBlck`.

- https://github.com/nim-lang/Nim/issues/22698
2023-09-21 12:49:14 +02:00
Etan Kissling 176ea09c2b
cleanup `OnBlockAdded` usage (#5426)
Reduce repetitiveness when using forked `OnBlockAdded` callbacks by
introducing a template to obtain appropriate cb from `ConsensusFork`.
2023-09-13 17:57:54 +00:00
Etan Kissling cd68c71c6c
add gossip tests for period boundary (#5423)
Test `validateSyncCommitteeMessage` and `validateContribution`
around sync committee period boundary to cover edge cases.
2023-09-13 08:32:11 +02:00
Jacek Sieka b8a32419b8
async batch verification (+40% sig verification throughput) (#5176)
* async batch verification

When batch verification is done, the main thread is blocked reducing
concurrency.

With this PR, the new thread signalling primitive in chronos is used to
offload the full batch verification process to a separate thread
allowing the main threads to continue async operations while the other
threads verify signatures.

Similar to previous behavior, the number of ongoing batch verifications
is capped to prevent runaway resource usage.

In addition to the asynchronous processing, 3 addition changes help
drive throughput:

* A loop is used for batch accumulation: this prevents a stampede of
small batches in eager mode where both the eager and the scheduled batch
runner would pick batches off the queue, prematurely picking "fresh"
batches off the queue
* An additional small wait is introduced for small batches - this helps
create slightly larger batches which make better used of the increased
concurrency
* Up to 2 batches are scheduled to the threadpool during high pressure,
reducing startup latency for the threads

Together, these changes increase attestation verification throughput
under load up to 30%.

* fixup

* Update submodules

* fix blst build issues (and a PIC warning)

* bump

---------

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2023-08-03 11:36:45 +03:00
Etan Kissling 2722778ce5
reduce `nim-eth` dependencies just for RNG (#5099)
We have several modules that import `nim-eth` for the sole purpose of
its `keys.newRng` function. This function is meanwhile a simple wrapper
around `nim-bearssl`'s `HmacDrbgContext.new()`, so the import doesn't
really serve a use anymore. Replace `keys.newRng` with the direct call
to reduce `nim-eth` imports.
2023-06-19 22:43:50 +00:00
Etan Kissling 8c6c8a0ffa
use correct slot when producing sync aggregate around forks (#5089)
`produceSyncAggregate` is called in new slot when block is produced,
while the other functions in `sync_committee_msg_pool` are called in
previous slot. So, need to subtract 1 slot when producing sync aggregate
to accept the signatures using the old digest during fork transition.
2023-06-16 21:30:36 +00:00
Etan Kissling c70fd8fe97
Merge branch 'unstable' into dev/etan/rd-shufflingacc 2023-05-17 14:06:31 +02:00
Etan Kissling 40e89937c5
segregate sync committee messages by period / fork (#4953)
`SyncCommitteeMsgPool` grouped messages by their `beacon_block_root`.
This is problematic around sync committee period boundaries and forks.
Around sync committee period boundaries, members from both the current
and next sync committee may sign the same `beacon_block_root`; mixing
the signatures from both committees together is a mistake. Likewise,
around fork transitions, the `signing_root` changes, so those messages
also need to be segregated.
2023-05-17 07:55:55 +03:00
Etan Kissling dbba003a38
Revert "Revert "accelerate `getShufflingRef` (#4911)" (#4958)"
This reverts commit 748be8b67b.
2023-05-15 17:41:40 +02:00
Etan Kissling 748be8b67b
Revert "accelerate `getShufflingRef` (#4911)" (#4958)
This reverts commit ea97e93e74.
2023-05-15 15:25:51 +00:00
Etan Kissling ec049bc493
use correct root in `test_gossip_validation` (#4951)
Use block root not state root for sync committee message in test.
2023-05-14 16:18:50 +02:00
Etan Kissling ea97e93e74
accelerate `getShufflingRef` (#4911)
When an uncached `ShufflingRef` is requested, we currently replay state
which can take several seconds. Acceleration is possible by:

1. Start from any state with locked-in `get_active_validator_indices`.
   Any blocks / slots applied to such a state can only affect that
   result for future epochs, so are viable for querying target epoch.
   `compute_activation_exit_epoch(state.slot.epoch) > target.epoch`

2. Determine highest common ancestor among `state` and `target.blck`.
   At the ancestor slot, same rules re `get_active_validator_indices`.
   `compute_activation_exit_epoch(ancestorSlot.epoch) > target.epoch`

3. We now have a `state` that shares history with `target.blck` up
   through a common ancestor slot. Any blocks / slots that the `state`
   contains, which are not part of the `target.blck` history, affect
   `get_active_validator_indices` at epochs _after_ `target.epoch`.

4. Select `state.randao_mixes[N]` that is closest to common ancestor.
   Either direction is fine (above / below ancestor).

5. From that RANDAO mix, mix in / out all RANDAO reveals from blocks
   in-between. This is just an XOR operation, so fully reversible.
   `mix = mix xor SHA256(blck.message.body.randao_reveal)`

6. Compute the attester dependent slot from `target.epoch`.
   `if epoch >= 2: (target.epoch - 1).start_slot - 1 else: GENESIS_SLOT`

7. Trace back from `target.blck` to the attester dependent slot.
   We now have the destination for which we want to obtain RANDAO.

8. Mix in all RANDAO reveals from blocks up through the `dependentBlck`.
   Same method, no special handling necessary for epoch transitions.

9. Combine `get_active_validator_indices` from `state` at `target.epoch`
   with the recovered RANDAO value at `dependentBlck` to obtain the
   requested shuffling, and construct the `ShufflingRef` without replay.

* more tests and simplify logic

* test with different number of deposits per branch

* Update beacon_chain/consensus_object_pools/blockchain_dag.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

* `commonAncestor` tests

* lint

---------

Co-authored-by: Jacek Sieka <jacek@status.im>
2023-05-12 19:36:59 +02:00
Etan Kissling b0577bdc6c
rename `OnEIP4844BlockAdded` > `OnDenebBlockAdded` (#4714) 2023-03-10 17:14:27 +00:00
tersec e3d96ef147
rename most eip4844Data to denebData (#4693) 2023-03-04 22:23:52 +00:00
tersec 3b41e6a0e7
rename ConsensusFork.EIP4844 to ConsensusFork.Deneb (#4692) 2023-03-04 13:35:39 +00:00
tersec d058aa09c8
more withdrowls (#4674) 2023-03-02 17:13:35 +01:00
tersec 629b005c27
refactor batch validation not to require genesis_validators_root each time (#4640) 2023-02-20 09:26:22 +01:00
tersec 0fb726c420
`BeaconStateFork/BeaconBlockFork` -> `ConsensusFork` (#4560)
* `BeaconStateFork/BeaconBlockFork` -> `ConsensusFork`

* revert unrelated change

* revert unrelated changes

* update test summaries
2023-01-28 19:53:41 +00:00
Jacek Sieka 6bfc766629
drop subset sync contributions in gossip (#4490)
* correctly report ignored contributions in metrics
* avoid counting subset contributions in vmon (bring in line with
attestation aggregates)
* avoid signature checks for subset attestations

A being a non-strict subset is a sufficient condition to ignore.
2023-01-12 15:08:08 +01:00
henridf 8251cc223d
eip4844 gossip (#4444)
* eip4844 gossip

* Check BLSFieldElement range validity in gossip validation

* lint/nits cleanup

* Use template to avoid an assignment with copy.

* More review feedback

* lint

* lint

* processSignedBeaconBlockAndBlobsSidecar: clean up error handling flow

* Undo factoring-out of beacon blocks validator installation
2023-01-04 12:34:15 +00:00
henridf f0329b2212
Types and scaffolding for EIP-4844 (#4365)
* Types and scaffolding for EIP-4844

This commit adds the EIP-4844 spec types, and fills in
scaffolding/boilerplate for the use of these types across the repo.

None of the actual EIP-4844 logic is introduced yet.

This follows the pattern used by @tersec when introducing Capella (#4276).

* use eth2-networks fork

* review feedback: add static check EIP4844_FORK_EPOCH == FAR_FUTURE_EPOCH

* review feedback: remove EIP4844 from /eth/v1/config/spec response

* Cleanup / review feedback

* Fix REST test
2022-12-05 16:29:09 +00:00
tersec 38827d776c
capella gossip support (#4386) 2022-12-04 08:42:03 +01:00
tersec 5b46f0b723
add Capella support to Forked* (#4276)
* add Capella support to Forked*

* remove cruft

* add `OnForkyBlockAdded`
2022-11-02 16:23:30 +00:00
Jacek Sieka ef8bab58eb
load suggested fee recipient file also when keymanager is disabled (#4078)
Since these files may have been created in a previous run or manually,
we want to keep loading them even on nodes that don't enable the
keystore API (for example static setups)

Other changes:

* log keystore loading progressively (#3699)
* print initial fee recipient when loading validators
* log dynamic fee recipient updates
2022-09-17 08:30:07 +03:00
zah b1ac9c9fe4
Fix a potential segfault and various potential stalls (#4003)
* Fixes a segfault during block production when the Keymanager API
  is disabled. The Keymanager is now disabled on half of the local
  testnet nodes to catch such problems in the future.

* Fixes multiple potential stalls from REST requests being done
  without a timeout. From practice, we know that such requests
  can hang forever if not cancelled with a timeout. At best,
  this would be a resource leak, at worst, it may lead to a
  full stall of the client and missed validator duties.

* Changes some Options usages to Opt (for easier use of valueOr)
2022-08-19 21:51:30 +00:00
zah dc50abbc90
Implement a missing ingnore rule for sync committee contributions (#3941) 2022-08-09 12:52:11 +03:00
Etan Kissling 2a2bcea70d
group justified and finalized `Checkpoint` (#3841)
The justified and finalized `Checkpoint` are frequently passed around
together. This introduces a new `FinalityCheckpoint` data structure that
combines them into one.

Due to the large usage of this structure in fork choice, also took this
opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec.
Many additional tests enabled, some need more work, e.g. EL mock blocks.
Also implemented `discard_equivocations` which was skipped in #3661,
and improved code reuse across fork choice logic while at it.
2022-07-06 13:33:02 +03:00
Jacek Sieka c145916414
cleanups (#3819)
* avoid circular panda imports
* move deposit merkleization helpers to spec/
* normalize validator signature helpers to spec names / params
* remove redundant functions for remote signing
2022-06-29 18:53:59 +02:00
tersec a07d14cd99
remove unused imports in tests/ (#3713) 2022-06-07 17:05:06 +00:00
Jacek Sieka 48f01186d6
fix unnecessary HashList/HashArray cache invalidation (#3660)
* SSZ `[]` -> `mitem`
* `[]` -> `item`

immutable access via mutable instance cannot rely on template
overloading, and `[]` cannot be a `func` because of special seq handling
in compiler.
2022-05-30 13:30:42 +00:00
Jacek Sieka c64bf045f3
remove StateData (#3507)
One more step on the journey to reduce `BlockRef` usage across the
codebase - this one gets rid of `StateData` whose job was to keep track
of which block was last assigned to a state - these duties have now been
taken over by `latest_block_root`, a fairly recent addition that
computes this block root from state data (at a small cost that should be
insignificant)

99% mechanical change.
2022-03-16 08:20:40 +01:00
tersec 84588b34da
var => let in specs/ and tests/ (#3425) 2022-02-20 20:13:06 +00:00
tersec 2b4a960270
rename On{Merge,Bellatrix}BlockAdded and Rollback{Merge,Bellatrix}HashedProc (#3321) 2022-01-26 13:21:29 +01:00
tersec 00a347457a
dynamic sync committee subscriptions (#3308)
* dynamic sync committee subscriptions

* fast-path trivial case rather than rely on RNG with probability 1 outcome

Co-authored-by: zah <zahary@gmail.com>

* use func instead of template; avoid calling async function unnecessarily

* avoid unnecessary sync committee topic computation; use correct epoch lookahead; enforce exception/effect tracking

* don't over-optimistically update ENR syncnets; non-looping version of nearSyncCommitteePeriod

* allow separately setting --allow-all-{sub,att,sync}nets

* remove unnecessary async

Co-authored-by: zah <zahary@gmail.com>
2022-01-24 20:40:59 +00:00
tersec 351c2fd48a
rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315) 2022-01-24 16:23:13 +00:00
Jacek Sieka 836f6984bb
move `state_transition` to `Result` (#3284)
* better error messages in api
* avoid `BlockData` copies when replaying blocks
2022-01-17 12:19:58 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
Jacek Sieka 20e700fae4
Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex (#3259)
* Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex

Harden the use of `CommitteeIndex` et al to prevent future issues by
using a distinct type, then validating before use in several cases -
datatypes in spec are kept simple though so that invalid data still can
be read.

* fix invalid epoch used in REST
`/eth/v1/beacon/states/{state_id}/committees` committee length (could
return invalid data)
* normalize some variable names
* normalize committee index loops
* fix `RestAttesterDuty` to use `uint64` for `validator_committee_index`
* validate `CommitteeIndex` on ingress in REST API
* update rest rules with stricter parsing
* better REST serializers
* save lots of memory by not using `zip` ...at least a few bytes!
2022-01-09 01:28:49 +02:00
Zahary Karadjov 54d0d588b1 Implementation of the Keymanager API (BETA)
https://github.com/ethereum/keymanager-APIs
2022-01-04 18:51:45 +02:00
tersec b81c06edab
rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240) 2022-01-04 09:45:38 +00:00
tersec 1a6a56bdb1
use BeaconTime instead of Slot in fork choice (#3138)
* use v1.1.6 test vectors; use BeaconTime instead of Slot in fork choice

* tick through every slot at least once

* use div INTERVALS_PER_SLOT and use precomputed constants of them

* use correct (even if numerically equal) constant
2021-12-21 18:56:08 +00:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top on performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner)
2021-12-20 20:20:31 +01:00
Jacek Sieka 03005f48e1
Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.

When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.

When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.

Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.

This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.

Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
Jacek Sieka dfbd50b4d6
avoid SyncCommitteMsgPool copy (#3185)
introduced by batch verification, when verifiers were made async
2021-12-11 16:39:24 +01:00
Jacek Sieka 069bccd51b
batch-verify sync messages for a small perf boost (#3151)
* batch-verify sync messages for a small perf boost

Generally reuses the same structure as attestation and aggregate
verification

* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
2021-12-09 14:56:54 +02:00
Jacek Sieka 1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
Eugene Kabanov e62c7c7c37
Remote signing client/server. (#3077) 2021-11-30 03:20:21 +02:00
Jacek Sieka a223d62b07
Cleanups (#3123)
Renames and cleanups split out from the validator monitoring branch, so
as to reduce conflict area vs other PR:s

* add constants for expected message timing
* name validators after the messages they validate, mostly, to make
grepping easier
* unify field naming of EpochInfo across forks to make cross-fork code
easier
2021-11-25 13:20:36 +01:00
Jacek Sieka 9c2f43ed0e
Speed up altair block processing 2x (#3115)
* Speed up altair block processing >2x

Like #3089, this PR drastially speeds up historical REST queries and
other long state replays.

* cache sync committee validator indices
* use ~80mb less memory for validator pubkey mappings
* batch-verify sync aggregate signature (fixes #2985)
* document sync committee hack with head block vs sync message block
* add batch signature verification failure tests

Before:

```
../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    5830.675,        0.000,     5830.675,     5830.675,            1, Initialize DB
       0.481,        1.878,        0.215,       59.167,          981, Load block from database
    8422.566,        0.000,     8422.566,     8422.566,            1, Load state from database
       6.996,        1.678,        0.042,       14.385,          969, Advance slot, non-epoch
      93.217,        8.318,       84.192,      122.209,           32, Advance slot, epoch
      20.513,       23.665,       11.510,      201.561,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

After:

```
    7081.422,        0.000,     7081.422,     7081.422,            1, Initialize DB
       0.553,        2.122,        0.175,       66.692,          981, Load block from database
    5439.446,        0.000,     5439.446,     5439.446,            1, Load state from database
       6.829,        1.575,        0.043,       12.156,          969, Advance slot, non-epoch
      94.716,        2.749,       88.395,      100.026,           32, Advance slot, epoch
      11.636,       23.766,        4.889,      205.250,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

* add comment
2021-11-24 13:43:50 +01:00