nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
tersec	b3f6be71d5	refactor `makeBeaconBlock`; some capella support for `ncli_db` and `wss_sim` (#4321 )	2022-11-11 15:37:43 +01:00
tersec	35b1104bea	`block_sim` runs capella by default (#4315 )	2022-11-11 10:17:27 +00:00
tersec	04cbea754b	don't require attached validator for blinded block BN endpoint (#4313 )	2022-11-10 20:18:08 +00:00
Jacek Sieka	b170a09c47	remove `news` leftovers (#4299 )	2022-11-08 20:06:54 +00:00
tersec	5b46f0b723	add Capella support to Forked* (#4276 ) * add Capella support to Forked* * remove cruft * add `OnForkyBlockAdded`	2022-11-02 16:23:30 +00:00
Jacek Sieka	d839b9d07e	State-only checkpoint state startup (#4251 ) Currently, we require genesis and a checkpoint block and state to start from an arbitrary slot - this PR relaxes this requirement so that we can start with a state alone. The current trusted-node-sync algorithm works by first downloading blocks until we find an epoch aligned non-empty slot, then downloads the state via slot. However, current [proposals](https://github.com/ethereum/beacon-APIs/pull/226) for checkpointing prefer finalized state as the main reference - this allows more simple access control and caching on the server side - in particular, this should help checkpoint-syncing from sources that have a fast `finalized` state download (like infura and teku) but are slow when accessing state via slot. Earlier versions of Nimbus will not be able to read databases created without a checkpoint block and genesis. In most cases, backfilling makes the database compatible except where genesis is also missing (custom networks). * backfill checkpoint block from libp2p instead of checkpoint source, when doing trusted node sync * allow starting the client without genesis / checkpoint block * perform epoch start slot lookahead when loading tail state, so as to deal with the case where the epoch start slot does not have a block * replace `--blockId` with `--state-id` in TNS command line * when replaying, also look at the parent of the last-known-block (even if we don't have the parent block data, we can still replay from a "parent" state) - in particular, this clears the way for implementing state pruning * deprecate `--finalized-checkpoint-block` option (no longer needed)	2022-11-02 10:02:38 +00:00
Jacek Sieka	36e2518d79	fakeee: Increase incoming POST size (#4252 ) Needed to handle payloads	2022-10-25 20:01:45 +00:00
Jacek Sieka	fa9c60089c	add fake execution engine server (#4250 ) Useful for testing beacon node without running an execution client (results in an optimistically synced node)	2022-10-18 22:18:36 +00:00
tersec	0410aec9d8	remove rest of `withState.state` usage (#4120 ) * remove rest of `withState.state` usage * remove scaffolding	2022-09-16 15:35:00 +02:00
tersec	19bf460a3b	more `withState` `state` -> `forkyState` (#4104 )	2022-09-10 08:12:07 +02:00
Etan Kissling	634408ff2c	use `nim-websock` instead of `news` (#4061 ) `news` has a few open issues that are not present in `nim-websock`: 1. There is a 1 second delay between each MB of sent data. 2. Cancelling an ongoing `send` makes the entire WebSocket unusable. 3. Control packets do not have priority over ongoing message frames. Using `news`, there are quite a few of these messages in Geth: ``` Previously seen beacon client is offline. Please ensure it is operational to follow the chain! ``` It may take quite some time to reconnect when this happens. Using `nim-websock`, this message still occurs because `eth1_monitor` reconnects the EL connection when no new blocks occurred for 5 minutes, but reconnecting is quick and the message is rarer.	2022-09-06 23:41:33 +02:00
tersec	ad0d30093f	state/forkyState cleanup; spec URL updates; rm unused imports (#4052 )	2022-08-31 13:29:34 +02:00
tersec	9ae796daed	Cache and resend, rather than recreate, builder API registrations (#4040 )	2022-08-31 03:29:03 +03:00
tersec	b60456fdf3	`withState`: `state` -> `forkyState` (#4038 )	2022-08-26 22:47:40 +00:00
zah	b1ac9c9fe4	Fix a potential segfault and various potential stalls (#4003 ) * Fixes a segfault during block production when the Keymanager API is disabled. The Keymanager is now disabled on half of the local testnet nodes to catch such problems in the future. * Fixes multiple potential stalls from REST requests being done without a timeout. From practice, we know that such requests can hang forever if not cancelled with a timeout. At best, this would be a resource leak, at worst, it may lead to a full stall of the client and missed validator duties. * Changes some Options usages to Opt (for easier use of valueOr)	2022-08-19 21:51:30 +00:00
zah	d64c17ffc3	Minor post-merge cleanups (#3945 ) https://github.com/status-im/nimbus-eth2/pull/3944 The use of nested `awaitWithRetries` calls would have resulted in an unexpected number of retries (3x3). We now use regular `await` in outer layer to avoid the problem. https://github.com/status-im/nimbus-eth2/pull/3943 The new code has an invariant that the `headMerkleizer` field in the `Eth1Chain` is always kept in sync with the blocks stored in the chain. This invariant is now enforced better by doing the necessary merkleizer updates in the `Eth1Chain.addBlock` function, in the `Eth1Chain.init` function and in the `Eth1Chain.reset` function.	2022-08-10 12:31:10 +00:00
zah	dc50abbc90	Implement a missing ingnore rule for sync committee contributions (#3941 )	2022-08-09 12:52:11 +03:00
Etan Kissling	2a2bcea70d	group justified and finalized `Checkpoint` (#3841 ) The justified and finalized `Checkpoint` are frequently passed around together. This introduces a new `FinalityCheckpoint` data structure that combines them into one. Due to the large usage of this structure in fork choice, also took this opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec. Many additional tests enabled, some need more work, e.g. EL mock blocks. Also implemented `discard_equivocations` which was skipped in #3661, and improved code reuse across fork choice logic while at it.	2022-07-06 13:33:02 +03:00
Jacek Sieka	c145916414	cleanups (#3819 ) * avoid circular panda imports * move deposit merkleization helpers to spec/ * normalize validator signature helpers to spec names / params * remove redundant functions for remote signing	2022-06-29 18:53:59 +02:00
tersec	8eb5d5de09	use ZERO_HASH for default(Eth2Digest)/Eth2Digest() in func calls (#3770 )	2022-06-18 04:57:37 +00:00
Etan Kissling	c808f17a37	update to latest light client libp2p protocol (#3623 ) Incorporates the latest changes to the light client sync protocol based on Devconnect AMS feedback. Note that this breaks compatibility with the previous prototype, due to changes to data structures and endpoints. See https://github.com/ethereum/consensus-specs/pull/2802	2022-05-23 14:02:54 +02:00
zah	a2ba34f686	Implement all sync committee duties in the validator client (#3583 ) Other changes: * logtrace can now verify sync committee messages and contributions * Many unnecessary use of pairs() have been removed for consistency * Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC	2022-05-10 10:03:40 +00:00
tersec	9e738a92b4	stylecheck fixes (#3595 )	2022-04-15 12:46:56 +00:00
tersec	61ba308e13	stylecheck fixes (#3593 )	2022-04-14 17:39:37 +02:00
tersec	28ba2d5544	stylecheck fixes (#3592 )	2022-04-14 13:47:14 +03:00
Jacek Sieka	f70ff38b53	enable `styleCheck:usages` (#3573 ) Some upstream repos still need fixes, but this gets us close enough that style hints can be enabled by default. In general, "canonical" spellings are preferred even if they violate nep-1 - this applies in particular to spec-related stuff like `genesis_validators_root` which appears throughout the codebase.	2022-04-08 16:22:49 +00:00
Jacek Sieka	05ffe7b2bf	Prune `BlockRef` on finalization (#3513 ) Up til now, the block dag has been using `BlockRef`, a structure adapted for a full DAG, to represent all of chain history. This is a correct and simple design, but does not exploit the linearity of the chain once parts of it finalize. By pruning the in-memory `BlockRef` structure at finalization, we save, at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory landing us at a steady state of ~750mb normal memory usage for a validating node. Above all though, we prevent memory usage from growing proportionally with the length of the chain, something that would not be sustainable over time - instead, the steady state memory usage is roughly determined by the validator set size which grows much more slowly. With these changes, the core should remain sustainable memory-wise post-merge all the way to withdrawals (when the validator set is expected to grow). In-memory indices are still used for the "hot" unfinalized portion of the chain - this ensure that consensus performance remains unchanged. What changes is that for historical access, we use a db-based linear slot index which is cache-and-disk-friendly, keeping the cost for accessing historical data at a similar level as before, achieving the savings at no percievable cost to functionality or performance. A nice collateral benefit is the almost-instant startup since we no longer load any large indicies at dag init. The cost of this functionality instead can be found in the complexity of having to deal with two ways of traversing the chain - by `BlockRef` and by slot. * use `BlockId` instead of `BlockRef` where finalized / historical data may be required * simplify clearance pre-advancement * remove dag.finalizedBlocks (~50:ish mb) * remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead * `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef` instance, unlike `BlockRef` traversal * prune `BlockRef` parents on finality (~200:ish mb) * speed up ChainDAG init by not loading finalized history index * mess up light client server error handling - this need revisiting :)	2022-03-17 17:42:56 +00:00
Jacek Sieka	c64bf045f3	remove StateData (#3507 ) One more step on the journey to reduce `BlockRef` usage across the codebase - this one gets rid of `StateData` whose job was to keep track of which block was last assigned to a state - these duties have now been taken over by `latest_block_root`, a fairly recent addition that computes this block root from state data (at a small cost that should be insignificant) 99% mechanical change.	2022-03-16 08:20:40 +01:00
tersec	79761c78a4	proc -> func, mainly in spec/state transition and adjecent modules (#3405 )	2022-02-17 11:53:55 +00:00
Eugene Kabanov	40c77e5928	Remote KeyManager API and number of fixes/tests for KeyManager API (#3360 ) * Initial commit. * Fix current test suite. * Fix keymanager api test. * Fix wss_sim. * Add more keystore_management tests. * Recover deleted isEmptyDir(). * Add `HttpHostUri` distinct type. Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls. Add REST serialization of RemoteKeystore and Keystore object. Add tests for Remote Keystore management API. Add tests for Keystore management API (Add keystore). Fix serialzation issues. * Fix test to use HttpHostUri instead of Uri. * Add links to specification in comments. * Remove debugging echoes.	2022-02-07 22:36:09 +02:00
tersec	8e6a920bf4	rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH (#3350 ) * rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH * fix REST test rules	2022-02-02 14:06:55 +01:00
tersec	00a347457a	dynamic sync committee subscriptions (#3308 ) * dynamic sync committee subscriptions * fast-path trivial case rather than rely on RNG with probability 1 outcome Co-authored-by: zah <zahary@gmail.com> * use func instead of template; avoid calling async function unnecessarily * avoid unnecessary sync committee topic computation; use correct epoch lookahead; enforce exception/effect tracking * don't over-optimistically update ENR syncnets; non-looping version of nearSyncCommitteePeriod * allow separately setting --allow-all-{sub,att,sync}nets * remove unnecessary async Co-authored-by: zah <zahary@gmail.com>	2022-01-24 20:40:59 +00:00
tersec	351c2fd48a	rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315 )	2022-01-24 16:23:13 +00:00
Jacek Sieka	c7e92bfd84	wss_sim: state transition simulator (#3309 ) It's sometimes useful to simulate what happens when a chain runs from a given state with a given set of private keys - `wss_sim` allows running such a simulation. One use of such a tool is to simulate a weak subjectivity attack, creating alternative histories of the same chain: https://notes.status.im/nimbus-insecura-network#	2022-01-22 10:25:30 +01:00
tersec	9c0c9c98ce	complete switch to beacon_chain/specs/datatypes/bellatrix (#3295 )	2022-01-18 13:36:52 +00:00
Jacek Sieka	e9486f5e5b	state_sim: clean up attestation production (#3274 ) * use same naming as everywhere * avoid iterator bug that leads to state copy	2022-01-12 21:42:03 +01:00
Jacek Sieka	805e85e1ff	time: spring cleaning (#3262 ) Time in the beacon chain is expressed relative to the genesis time - this PR creates a `beacon_time` module that collects helpers and utilities for dealing the time units - the new module does not deal with actual wall time (that's remains in `beacon_clock`). Collecting the time related stuff in one place makes it easier to find, avoids some circular imports and allows more easily identifying the code actually needs wall time to operate. * move genesis-time-related functionality into `spec/beacon_time` * avoid using `chronos.Duration` for time differences - it does not support negative values (such as when something happens earlier than it should) * saturate conversions between `FAR_FUTURE_XXX`, so as to avoid overflows * fix delay reporting in validator client so it uses the expected deadline of the slot, not "closest wall slot" * simplify looping over the slots of an epoch * `compute_start_slot_at_epoch` -> `start_slot` * `compute_epoch_at_slot` -> `epoch` A follow-up PR will (likely) introduce saturating arithmetic for the time units - this is merely code moves, renames and fixing of small bugs.	2022-01-11 11:01:54 +01:00
Jacek Sieka	20e700fae4	Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex (#3259 ) * Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex Harden the use of `CommitteeIndex` et al to prevent future issues by using a distinct type, then validating before use in several cases - datatypes in spec are kept simple though so that invalid data still can be read. * fix invalid epoch used in REST `/eth/v1/beacon/states/{state_id}/committees` committee length (could return invalid data) * normalize some variable names * normalize committee index loops * fix `RestAttesterDuty` to use `uint64` for `validator_committee_index` * validate `CommitteeIndex` on ingress in REST API * update rest rules with stricter parsing * better REST serializers * save lots of memory by not using `zip` ...at least a few bytes!	2022-01-09 01:28:49 +02:00
Jacek Sieka	0a4728a241	Handle access to historical data for which there is no state (#3217 ) With checkpoint sync in particular, and state pruning in the future, loading states or state-dependent data may fail. This PR adjusts the code to allow this to be handled gracefully. In particular, the new availability assumption is that states are always available for the finalized checkpoint and newer, but may fail for anything older. The `tail` remains the point where state loading de-facto fails, meaning that between the tail and the finalized checkpoint, we can still get historical data (but code should be prepared to handle this as an error). However, to harden the code against long replays, several operations which are assumed to work only with non-final data (such as gossip verification and validator duties) now limit their search horizon to post-finalized data. * harden several state-dependent operations by logging an error instead of introducing a panic when state loading fails * `withState` -> `withUpdatedState` to differentiate from the other `withState` * `updateStateData` can now fail if no state is found in database - it is also hardened against excessively long replays * `getEpochRef` can now fail when replay fails * reject blocks with invalid target root - they would be ignored previously * fix recursion bug in `isProposed`	2022-01-05 19:38:04 +01:00
tersec	66c9b7fbce	shift block_sim fork epochs; allow VC to work with non-multiple-of-3 SECONDS_PER_SLOT (#3244 )	2022-01-05 13:41:39 +00:00
tersec	b81c06edab	rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240 )	2022-01-04 09:45:38 +00:00
tersec	1a6a56bdb1	use BeaconTime instead of Slot in fork choice (#3138 ) * use v1.1.6 test vectors; use BeaconTime instead of Slot in fork choice * tick through every slot at least once * use div INTERVALS_PER_SLOT and use precomputed constants of them * use correct (even if numerically equal) constant	2021-12-21 18:56:08 +00:00
Jacek Sieka	c270ec21e4	Validator monitoring (#2925 ) Validator monitoring based on and mostly compatible with the implementation in Lighthouse - tracks additional logs and metrics for specified validators so as to stay on top on performance. The implementation works more or less the following way: * Validator pubkeys are singled out for monitoring - these can be running on the node or not * For every action that the validator takes, we record steps in the process such as messages being seen on the network or published in the API * When the dust settles at the end of an epoch, we report the information from one epoch before that, which coincides with the balances being updated - this is a tradeoff between being correct (waiting for finalization) and providing relevant information in a timely manner)	2021-12-20 20:20:31 +01:00
Jacek Sieka	03005f48e1	Backfill support for ChainDAG (#3171 ) In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This PR adds one more block pointer: the backfill block which represents the block that has been backfilled so far. When doing a checkpoint sync, a random block is given as starting point - this is the tail block, and we require that the tail block has a corresponding state. When backfilling, we end up with blocks without corresponding states, hence we cannot use `tail` as a backfill pointer - there is no state. Nonetheless, we need to keep track of where we are in the backfill process between restarts, such that we can answer GetBeaconBlocksByRange requests. This PR adds the basic support for backfill handling - it needs to be integrated with backfill sync, and the REST API needs to be adjusted to take advantage of the new backfilled blocks when responding to certain requests. Future work will also enable moving the tail in either direction: * pruning means moving the tail forward in time and removing states * backwards means recreating past states from genesis, such that intermediate states are recreated step by step all the way to the tail - at that point, tail, genesis and backfill will match up. * backfilling is done when backfill != genesis - later, this will be the WSS checkpoint instead	2021-12-13 14:36:06 +01:00
Jacek Sieka	dfbd50b4d6	avoid SyncCommitteMsgPool copy (#3185 ) introduced by batch verification, when verifiers were made async	2021-12-11 16:39:24 +01:00
Jacek Sieka	069bccd51b	batch-verify sync messages for a small perf boost (#3151 ) * batch-verify sync messages for a small perf boost Generally reuses the same structure as attestation and aggregate verification * normalize `signatures` and `signature_batch` to use the same pattern of verification * normalize parameter names, order etc for signature stuff in general * avoid calling `blsSign` directly - instead, go through `signatures` consistently	2021-12-09 14:56:54 +02:00
Jacek Sieka	1a8b7469e3	move quarantine outside of chaindag (#3124 ) * move quarantine outside of chaindag The quarantine has been part of the ChainDAG for the longest time, but this design has a few issues: * the function in which blocks are verified and added to the dag becomes reentrant and therefore difficult to reason about - we're currently using a stateful flag to work around it * quarantined blocks bypass the processing queue leading to a processing stampede * the quarantine flow is unsuitable for orphaned attestations - these should also should be quarantined eventually Instead of processing the quarantine inside ChainDAG, this PR moves re-queueing to `block_processor` which already is responsible for dealing with follow-up work when a block is added to the dag This sets the stage for keeping attestations in the quarantine as well. Also: * make `BlockError` `{.pure.}` * avoid use of `ValidationResult` in block clearance (that's for gossip)	2021-12-06 10:49:01 +01:00
Jacek Sieka	a223d62b07	Cleanups (#3123 ) Renames and cleanups split out from the validator monitoring branch, so as to reduce conflict area vs other PR:s * add constants for expected message timing * name validators after the messages they validate, mostly, to make grepping easier * unify field naming of EpochInfo across forks to make cross-fork code easier	2021-11-25 13:20:36 +01:00
Jacek Sieka	9c2f43ed0e	Speed up altair block processing 2x (#3115 ) * Speed up altair block processing >2x Like #3089, this PR drastially speeds up historical REST queries and other long state replays. * cache sync committee validator indices * use ~80mb less memory for validator pubkey mappings * batch-verify sync aggregate signature (fixes #2985) * document sync committee hack with head block vs sync message block * add batch signature verification failure tests Before: ``` ../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000 All time are ms Average, StdDev, Min, Max, Samples, Test Validation is turned off meaning that no BLS operations are performed 5830.675, 0.000, 5830.675, 5830.675, 1, Initialize DB 0.481, 1.878, 0.215, 59.167, 981, Load block from database 8422.566, 0.000, 8422.566, 8422.566, 1, Load state from database 6.996, 1.678, 0.042, 14.385, 969, Advance slot, non-epoch 93.217, 8.318, 84.192, 122.209, 32, Advance slot, epoch 20.513, 23.665, 11.510, 201.561, 981, Apply block, no slot processing 0.000, 0.000, 0.000, 0.000, 0, Database load 0.000, 0.000, 0.000, 0.000, 0, Database store ``` After: ``` 7081.422, 0.000, 7081.422, 7081.422, 1, Initialize DB 0.553, 2.122, 0.175, 66.692, 981, Load block from database 5439.446, 0.000, 5439.446, 5439.446, 1, Load state from database 6.829, 1.575, 0.043, 12.156, 969, Advance slot, non-epoch 94.716, 2.749, 88.395, 100.026, 32, Advance slot, epoch 11.636, 23.766, 4.889, 205.250, 981, Apply block, no slot processing 0.000, 0.000, 0.000, 0.000, 0, Database load 0.000, 0.000, 0.000, 0.000, 0, Database store ``` * add comment	2021-11-24 13:43:50 +01:00
Jacek Sieka	f19a497eec	ncli_db: add putState, putBlock (#3096 ) * ncli_db: add putState, putBlock These tools allow modifying an existing nimbus database for the purpose of recovery or reorg, moving the head, tail and genesis to arbitrary points. * remove potentially expensive `putState` in `BeaconStateDB` * introduce `latest_block_root` which computes the root of the latest applied block from the `latest_block_header` field (instead of passing it in separately) * avoid some unnecessary BeaconState copies during init * discover https://github.com/nim-lang/Nim/issues/19094 * prefer `HashedBeaconState` in a few places to avoid recomputing state root * fetch latest block root from state when creating blocks * harden `get_beacon_proposer_index` against invalid slots and document * move random spec function tests to `test_spec.nim` * avoid unnecessary state root computation before block proposal	2021-11-18 13:02:43 +01:00

1 2 3 4 5

238 Commits