nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
tersec	d0314f0942	more ncli_db Deneb support (#5336)	2023-08-23 19:37:25 +03:00
Jacek Sieka	e8379389e7	speed up state/block loading (#5207) * speed up state/block loading When loading blocks and states from db/era, we currently redundantly check their CRC32 - for a state, this costs 50ms of loading time presently (110mb uncompressed size) on a decent laptop. * remove `maxDecompressedDbRecordSize` - not actually used on recent data since we store the framed format - also, we're in luck: we blew past the limit quite some time ago * fix obsolete exception-based error checking * avoid `zeroMem` when reading from era store see https://github.com/status-im/nim-snappy/pull/22 for benchmarks * bump snappy	2023-07-26 10:47:46 +03:00
Jacek Sieka	4ae1857898	fix missing era regeneration (#5098) +1 is already done via a `defer`	2023-06-19 15:47:24 +00:00
Jacek Sieka	58b93ccbe0	era: Capella+ support (fixes #4752) (#4853) Post-Capella, historical roots are computed from historical summaries instead of being directly stored in the beacon state. Slightly messy to pass both lists around - this is done to avoid computing the historical root unnecessarily.	2023-04-24 15:26:28 +02:00
tersec	f71a279d17	more deneb support in ncli_db and forks (#4774)	2023-03-30 10:06:23 +00:00
tersec	e6043f656f	add Deneb support to ncli_db (#4767)	2023-03-28 16:44:38 +03:00
Etan Kissling	ad118cd354	rename `stateFork` > `consensusFork` (#4718) Just the variable, not yet `lcDataForkAtStateFork` / `atStateFork`. - Shorten comment in `light_client.nim` to keep line width - Do not rename `stateFork` mention in `runProposalForkchoiceUpdated`. - Do not rename `stateFork` in `getStateField(dag.headState, fork)` Rest is just a mechanical mass replace	2023-03-11 00:35:52 +00:00
tersec	3b41e6a0e7	rename ConsensusFork.EIP4844 to ConsensusFork.Deneb (#4692)	2023-03-04 13:35:39 +00:00
tersec	8f269c92d7	rename eip48844ImplementationMissing to denebImplementationMissing (#4654)	2023-02-23 10:37:45 +00:00
tersec	dc0bbe3a57	rm blockForkAtEpoch and switch callers to consensusForkAtEpoch (#4634)	2023-02-16 21:16:54 +01:00
tersec	956aee2d35	fill some capella/EIP4844 missing implementations (#4585)	2023-02-02 22:24:06 +00:00
tersec	0fb726c420	`BeaconStateFork/BeaconBlockFork` -> `ConsensusFork` (#4560) * `BeaconStateFork/BeaconBlockFork` -> `ConsensusFork` * revert unrelated change * revert unrelated changes * update test summaries	2023-01-28 19:53:41 +00:00
Jacek Sieka	3e565e9878	exportEra: allow exporting pruned databases (#4485) When a database has been pruned, we can still export the non-pruned part - running the era exporter together with pruning allows archiving the full ethereum history for future reference without wasting space in the database. * use logging for reporting era write progress * less noise when skipping existing files * load blocks from era store also when working with `ncli_db` * write to temporary file then rename when era is complete, to reduce risk of corruption * also avoids loading the in-progress era file when writing and reading from the same era folder	2023-01-11 17:20:47 +01:00
henridf	64878888bd	Blob storage (#4454) * Blob storage * fix indentation * Fix build (none->Opt.none) * putBlobs -> putBlobsSidecar * getBlobs -> getBlobsSidecar * Check blob correctness when storing a backfill block * Blobs table: rename and conditionally create * Check block<->blob match in storeBackfillBlock * Use when .. toFork() to condition on type * Check blob viability in block_processor.storeBlock() * Fix build * Review feedback	2023-01-09 18:42:10 +00:00
Jacek Sieka	0ba9fc4ede	History pruning (fixes #4419) (#4445) Introduce (optional) pruning of historical data - a pruned node will continue to answer queries for historical data up to `MIN_EPOCHS_FOR_BLOCK_REQUESTS` epochs, or roughly 5 months, capping typical database usage at around 60-70gb. To enable pruning, add `--history=prune` to the command line - on the first start, old data will be cleared (which may take a while) - after that, data is pruned continuously. When pruning an existing database, the database will not shrink - instead, the freed space is recycled as the node continues to run - to free up space, perform a trusted node sync with a fresh database. When switching on archive mode in a pruned node, history is retained from that point onwards. History pruning is scheduled to be enabled by default in a future release. In this PR, `minimal` mode from #4419 is not implemented meaning retention periods for states and blocks are always the same - depending on user demand, a future PR may implement `minimal` as well.	2023-01-07 10:02:15 +00:00
tersec	7faef7827e	fix EIP4844 withBlck (#4411) * fix EIP4844 withBlck * don't raiseAssert by default	2022-12-14 18:30:56 +01:00
tersec	2932d3b808	extend `BeaconStateFork` enum (#4396)	2022-12-07 16:47:23 +00:00
henridf	f0329b2212	Types and scaffolding for EIP-4844 (#4365) * Types and scaffolding for EIP-4844 This commit adds the EIP-4844 spec types, and fills in scaffolding/boilerplate for the use of these types across the repo. None of the actual EIP-4844 logic is introduced yet. This follows the pattern used by @tersec when introducing Capella (#4276). * use eth2-networks fork * review feedback: add static check EIP4844_FORK_EPOCH == FAR_FUTURE_EPOCH * review feedback: remove EIP4844 from /eth/v1/config/spec response * Cleanup / review feedback * Fix REST test	2022-12-05 16:29:09 +00:00
tersec	474b0d8502	`withUpdatedState` injects `updatedState` rather than `state` template (#4375)	2022-11-30 16:37:23 +02:00
Jacek Sieka	cd160b5650	more strict read-only database mode (#4362) * avoid creating pre-altair backwards compatibility tables * allow running ncli_db era export without above tables present * drop unused pre-altair backwards compatibility tables * run benchmark on read-only database * fix running benchmark from genesis	2022-11-28 23:21:58 +00:00
Jacek Sieka	aac61165d5	ncli_db: better error message on missing history	2022-11-16 10:12:50 +01:00
tersec	b3f6be71d5	refactor `makeBeaconBlock`; some capella support for `ncli_db` and `wss_sim` (#4321)	2022-11-11 15:37:43 +01:00
tersec	5b46f0b723	add Capella support to Forked* (#4276) * add Capella support to Forked* * remove cruft * add `OnForkyBlockAdded`	2022-11-02 16:23:30 +00:00
Jacek Sieka	819442acc3	Allow chain dag without genesis / block (#4230) * Allow chain dag without genesis / block This PR enables the initialization of the dag without access to blocks or genesis state - it is a prerequisite for implementing a number of interesting features: * checkpoint sync without any block download * pruning of blocks and states * backfill checkpoint block	2022-10-14 22:40:10 +03:00
tersec	0410aec9d8	remove rest of `withState.state` usage (#4120) * remove rest of `withState.state` usage * remove scaffolding	2022-09-16 15:35:00 +02:00
tersec	02a99543c6	more `withState` `state` -> `forkyState` (#4112)	2022-09-13 14:53:12 +03:00
Jacek Sieka	48f01186d6	fix unnecessary HashList/HashArray cache invalidation (#3660) * SSZ `[]` -> `mitem` * `[]` -> `item` immutable access via mutable instance cannot rely on template overloading, and `[]` cannot be a `func` because of special seq handling in compiler.	2022-05-30 13:30:42 +00:00
zah	a2ba34f686	Implement all sync committee duties in the validator client (#3583) Other changes: * logtrace can now verify sync committee messages and contributions * Many unnecessary uses of pairs() have been removed for consistency * Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC	2022-05-10 10:03:40 +00:00
Jacek Sieka	011e0ca02f	era file verification (#3605) * era file verification Implement and document era file verification * era file states now come with block applied for easier verification * clarify conflicting version handling * document verification requirements * remove count from name, use start-era, end-root to discover range * remove obsolete todo * abstract out block root loading	2022-05-10 03:28:46 +03:00
Jacek Sieka	d0dbc4a8f9	Snappy revamp (#3564) This PR makes the necessary adjustments to deal with the revamped snappy API. In practical terms for nimbus-eth2, there are performance increases to gossip processing, database reading and writing as well as era file processing. Exporting `.era` files for example, a snappy-heavy operation, almost halves in total processing time: Pre: ``` Average, StdDev, Min, Max, Samples, Test 39.088, 8.735, 23.619, 53.301, 50, tState 237.079, 46.692, 165.620, 355.481, 49, tBlocks ``` Post: ``` All time are ms Average, StdDev, Min, Max, Samples, Test 25.350, 5.303, 15.351, 41.856, 50, tState 141.238, 24.164, 99.990, 199.329, 49, tBlocks ```	2022-04-15 09:44:06 +02:00
tersec	28ba2d5544	stylecheck fixes (#3592)	2022-04-14 13:47:14 +03:00
Jacek Sieka	5092fc41c7	use snappy-framed format for compressing bellatrix+ database entries (#3551) `.era` files and Req/Resp protocols use framed formats - aligning the database with these makes for less recompression work overall as gossip is sent only once while req/resp repeats (potentially) - this also allows efficient pruning-to-era where snappy-recompression is the major cycle thief.	2022-03-29 11:33:06 +00:00
Jacek Sieka	4207b127f9	era: load blocks and states (#3394) * era: load blocks and states Era files contain finalized history and can be thought of as an alternative source for block and state data that allows clients to avoid syncing this information from the P2P network - the P2P network is then used to "top up" the client with the most recent data. They can be freely shared in the community via whatever means (http, torrent, etc) and serve as a permanent cold store of consensus data (and, after the merge, execution data) for history buffs and bean counters alike. This PR gently introduces support for loading blocks and states in two cases: block requests from rest/p2p and frontfilling when doing checkpoint sync. The era files are used as a secondary source if the information is not found in the database - compared to the database, there are a few key differences: * the database stores the block indexed by block root while the era file indexes by slot - the former is used only in rest, while the latter is used both by p2p and rest. * when loading blocks from era files, the root is no longer trivially available - if it is needed, it must either be computed (slow) or cached (messy) - the good news is that for p2p requests, it is not needed * in era files, "framed" snappy encoding is used while in the database we store unframed snappy - for p2p requests, the latter requires recompression while the former could avoid it * front-filling is the process of using era files to replace backfilling - in theory this front-filling could happen from any block and front-fills with gaps could also be entertained, but our backfilling algorithm cannot take advantage of this because there's no (simple) way to tell it to "skip" a range.
* front-filling, as implemented, is a bit slow (10s to load mainnet): we load the full BeaconState for every era to grab the roots of the blocks - it would be better to partially load the state - as such, it would also be good to be able to partially decompress snappy blobs * lookups from REST via root are served by first looking up a block summary in the database, then using the slot to load the block data from the era file - however, there needs to be an option to create the summary table from era files to fully support historical queries To test this, `ncli_db` has an era file exporter: the files it creates should be placed in an `era` folder next to `db` in the data directory. What's interesting in particular about this setup is that `db` remains as the source of truth for security purposes - it stores the latest synced head root which in turn determines where a node "starts" its consensus participation - the era directory however can be freely shared between nodes / people without any (significant) security implications, assuming the era files are consistent / not broken. 
There's lots of future improvements to be had: * we can drop the in-memory `BlockRef` index almost entirely - at this point, resident memory usage of Nimbus should drop to a cool 500-600 mb * we could serve era files via REST trivially: this would drop backfill times to whatever time it takes to download the files - unlike the current implementation that downloads block by block, downloading an era at a time almost entirely cuts out request overhead * we can "reasonably" recreate detailed state history from almost any point in time, turning an O(slot) process into O(1) effectively - we'll still need caches and indices to do this with sufficient efficiency for the rest api, but at least it cuts the whole process down to minutes instead of hours, for arbitrary points in time * CI: ignore failures with Nim-1.6 (temporary) * test fixes Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>	2022-03-23 09:58:17 +01:00
Jacek Sieka	05ffe7b2bf	Prune `BlockRef` on finalization (#3513) Up until now, the block dag has been using `BlockRef`, a structure adapted for a full DAG, to represent all of chain history. This is a correct and simple design, but does not exploit the linearity of the chain once parts of it finalize. By pruning the in-memory `BlockRef` structure at finalization, we save, at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory landing us at a steady state of ~750mb normal memory usage for a validating node. Above all though, we prevent memory usage from growing proportionally with the length of the chain, something that would not be sustainable over time - instead, the steady state memory usage is roughly determined by the validator set size which grows much more slowly. With these changes, the core should remain sustainable memory-wise post-merge all the way to withdrawals (when the validator set is expected to grow). In-memory indices are still used for the "hot" unfinalized portion of the chain - this ensures that consensus performance remains unchanged. What changes is that for historical access, we use a db-based linear slot index which is cache-and-disk-friendly, keeping the cost for accessing historical data at a similar level as before, achieving the savings at no perceivable cost to functionality or performance. A nice collateral benefit is the almost-instant startup since we no longer load any large indices at dag init. The cost of this functionality instead can be found in the complexity of having to deal with two ways of traversing the chain - by `BlockRef` and by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data may be required * simplify clearance pre-advancement * remove dag.finalizedBlocks (~50:ish mb) * remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead * `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef` instance, unlike `BlockRef` traversal * prune `BlockRef` parents on finality (~200:ish mb) * speed up ChainDAG init by not loading finalized history index * mess up light client server error handling - this needs revisiting :)	2022-03-17 17:42:56 +00:00
Jacek Sieka	c64bf045f3	remove StateData (#3507) One more step on the journey to reduce `BlockRef` usage across the codebase - this one gets rid of `StateData` whose job was to keep track of which block was last assigned to a state - these duties have now been taken over by `latest_block_root`, a fairly recent addition that computes this block root from state data (at a small cost that should be insignificant) 99% mechanical change.	2022-03-16 08:20:40 +01:00
Jacek Sieka	4363215a32	relax `BlockRef` database assumptions (#3472) * remove `getForkedBlock(BlockRef)` which assumes block data exists but doesn't support archive/backfilled blocks * fix REST `/eth/v1/beacon/headers` request not returning archive/backfilled blocks * avoid re-encoding in REST block SSZ requests (using `getBlockSSZ`)	2022-03-11 13:08:17 +01:00
zah	9c1ff78f84	Fix a reward calculation bug affecting Prater epoch 64781 (#3428) To calculate the deltas correctly, the `process_inactivity_updates` function must be called before the rewards and penalties processing code in order to update the `inactivity_scores` field in the state. This would have required duplicating more logic from the spec in the ncli modules, so I've decided to pay the price of introducing a run-time copy of the state at each epoch which eliminates the need to duplicate logic (both for this fix and the previous one). Other changes: * Fixes for the read-only mode of the `BeaconChainDb` * Fix an uint64 underflow in the debug output procedure for printing balance deltas * Allow Bellatrix states in the reward computation helpers	2022-02-22 14:14:17 +02:00
Jacek Sieka	adfe655b16	db: make block loading generic (#3413) Streamline lookup with Forky and BeaconBlockFork (then we can do the same for era) We use type to avoid conditionals, as fork is often already known at a "higher" level. * load blockid before loading block by root - this is needed to map root to slot and will eventually be done via block summary table for "old" blocks Co-authored-by: tersec <tersec@users.noreply.github.com>	2022-02-21 09:48:02 +01:00
Jacek Sieka	a88427bd39	ncli_db: more readonly support (#3411) Update several `ncli_db` commands to run in readOnly mode, allowing them to be used with a running instance - in particular era export. * export all eras by default * skip already-exported eras	2022-02-18 07:37:44 +01:00
tersec	9c18765b3b	remove ncli_db pruneDatabase (#3356)	2022-02-03 20:03:01 +01:00
Zahary Karadjov	ac16eb4691	Streamline the validator reward analysis Notable improvements: * A separate aggregation pass is no longer required. * The user can opt to produce only aggregated data (resulting in a much smaller data set). * A large portion of the number crunching in Jupyter is now done in C through the rich DataFrames API. * Added support for comparisons against the "median" validator performance in the network.	2022-02-01 11:30:14 +02:00
Jacek Sieka	d076e1a11b	ncli_db: import states and blocks from era file (#3313)	2022-01-25 09:28:26 +01:00
Zahary Karadjov	54a745cb0e	Bugfix: Take into account the finalization delay in the ncli_db rewards calculation This fixes a reward calculation error affecting Prater's epoch 31256	2022-01-23 23:10:56 +02:00
Jacek Sieka	61342c2449	limit by-root requests to non-finalized blocks (#3293) * limit by-root requests to non-finalized blocks Presently, we keep a mapping from block root to `BlockRef` in memory - this has simplified reasoning about the dag, but is not sustainable with the chain growing. We can distinguish between two cases where by-root access is useful: * unfinalized blocks - this is where the beacon chain is operating generally, by validating incoming data as interesting for future fork choice decisions - bounded by the length of the unfinalized period * finalized blocks - historical access in the REST API etc - no bounds, really In this PR, we limit the by-root block index to the first use case: finalized chain data can more efficiently be addressed by slot number. Future work includes: * limiting the `BlockRef` horizon in general - each instance is 40 bytes+overhead which adds up - this needs further refactoring to deal with the tail vs state problem * persisting the finalized slot-to-hash index - this one also keeps growing unbounded (albeit slowly) Anyway, this PR easily shaves ~128mb of memory usage at the time of writing. * No longer honor `BeaconBlocksByRoot` requests outside of the non-finalized period - previously, Nimbus would generously return any block through this libp2p request - per the spec, finalized blocks should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found - this becomes a lot more common now and thus deserves more attention * `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized blocks from now - `finalizedBlocks` covers the other `BlockRef` instances * in backfill, verify that the last backfilled block leads back to genesis, or panic * add backfill timings to log * fix missing check that `BlockRef` block can be fetched with `getForkedBlock` reliably * shortcut doppelganger check when feature is not enabled * in REST/JSON-RPC, fetch blocks without involving `BlockRef` * fix dag.blocks ref	2022-01-21 13:33:16 +02:00
Zahary Karadjov	8a7cdc61f6	[ncli db] Add a requirements file for the Jupyter notebook	2022-01-18 20:24:20 +02:00
tersec	2f635d3337	rename *_{MERGE => BELLATRIX} constant names (#3296)	2022-01-18 16:31:05 +00:00
tersec	9c0c9c98ce	complete switch to beacon_chain/specs/datatypes/bellatrix (#3295)	2022-01-18 13:36:52 +00:00
Zahary Karadjov	47f1f7ff1a	More efficient reward data persistence; Address review comments The new format is based on compressed CSV files in two channels: * Detailed per-epoch data * Aggregated "daily" summaries The use of append-only CSV files significantly speeds up epoch processing during data generation. The use of compression results in smaller storage requirements overall. The use of the aggregated files has a very minor cost in both CPU and storage, but leads to near interactive speed for report generation. Other changes: - Implemented support for graceful shut downs to avoid corrupting the saved files. - Fixed a memory leak caused by lacking `StateCache` clean up on each iteration. - Addressed review comments - Moved the rewards and penalties calculation code in a separate module Required invasive changes to existing modules: - The `data` field of the `KeyedBlockRef` type is made public to be used by the validator rewards monitor's Chain DAG update procedure. - The `getForkedBlock` procedure from the `blockchain_dag.nim` module is made public to be used by the validator rewards monitor's Chain DAG update procedure.	2022-01-18 01:56:56 +02:00
Zahary Karadjov	29aad0241b	Precise per-component ETH-denominated rewards tracking This is an alternative take on https://github.com/status-im/nimbus-eth2/pull/3107 that aims for more minimal interventions in the spec modules at the expense of duplicating more of the spec logic in ncli_db.	2022-01-18 01:56:56 +02:00
Jacek Sieka	836f6984bb	move `state_transition` to `Result` (#3284) * better error messages in api * avoid `BlockData` copies when replaying blocks	2022-01-17 12:19:58 +01:00

117 Commits