nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
Jacek Sieka	13fafe3a40	simplify unviable head pruning (#3528 ) Also note bug that exists that potentially prevents states from being pruned correctly	2022-03-21 09:20:26 +00:00
Etan Kissling	fd1ffd62dd	update light client server for DAG failure modes (#3514 ) Gracefully handles the new failure modes recently introduced to the DAG as part of https://github.com/status-im/nimbus-eth2/pull/3513 Data that is deemed to exist but fails to load leads to an error log to avoid suppressing logic errors accidentally. In `verifyFinalization` mode, the assertions remain active.	2022-03-20 11:58:59 +01:00
Etan Kissling	18bd6df1b4	fix light client data collection for checkpoint sync (#3498 ) When doing checkpoint sync, collecting light client data of known blocks and states incorrectly assumes that `finalized_checkpoint` information is also known. Hardens collection to only collect finalized checkpoint data after `dag.computeEarliestLightClientSlot`.	2022-03-18 15:47:53 +01:00
Jacek Sieka	d0223d1f28	fix finalized epoch ref loading on checkpoint start (#3517 ) regression from #3513 that did not take tail into consideration when loading epoch ancestor	2022-03-18 13:13:57 +01:00
tersec	d11d61c745	engine API alpha.7 -> alpha.8 and a few remaining v1.1.9 to v1.1.0 CL spec URL updates (#3519 )	2022-03-18 11:46:39 +00:00
Etan Kissling	12dc427535	introduce light client processor (#3509 ) Adds `LightClientProcessor` as the pendant to `BlockProcessor` while operating in light client mode. Note that a similar mechanism based on async futures is used for interoperability with existing infrastructure, despite light client object validation being done synchronously.	2022-03-17 23:26:56 +01:00
Jacek Sieka	05ffe7b2bf	Prune `BlockRef` on finalization (#3513 ) Up until now, the block dag has been using `BlockRef`, a structure adapted for a full DAG, to represent all of chain history. This is a correct and simple design, but does not exploit the linearity of the chain once parts of it finalize. By pruning the in-memory `BlockRef` structure at finalization, we save, at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory landing us at a steady state of ~750mb normal memory usage for a validating node. Above all though, we prevent memory usage from growing proportionally with the length of the chain, something that would not be sustainable over time - instead, the steady state memory usage is roughly determined by the validator set size which grows much more slowly. With these changes, the core should remain sustainable memory-wise post-merge all the way to withdrawals (when the validator set is expected to grow). In-memory indices are still used for the "hot" unfinalized portion of the chain - this ensures that consensus performance remains unchanged. What changes is that for historical access, we use a db-based linear slot index which is cache-and-disk-friendly, keeping the cost for accessing historical data at a similar level as before, achieving the savings at no perceivable cost to functionality or performance. A nice collateral benefit is the almost-instant startup since we no longer load any large indices at dag init. The cost of this functionality instead can be found in the complexity of having to deal with two ways of traversing the chain - by `BlockRef` and by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data may be required * simplify clearance pre-advancement * remove dag.finalizedBlocks (~50:ish mb) * remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead * `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef` instance, unlike `BlockRef` traversal * prune `BlockRef` parents on finality (~200:ish mb) * speed up ChainDAG init by not loading finalized history index * mess up light client server error handling - this needs revisiting :)	2022-03-17 17:42:56 +00:00
Jacek Sieka	8a63efc413	move `BlockId` to `spec` (#3511 ) The spec implicitly talks about the slot of a block in several places, and keeping it readily available is useful in a number of contexts - might as well put this implicitly referenced helper in the spec code directly	2022-03-16 16:00:18 +01:00
Jacek Sieka	c64bf045f3	remove StateData (#3507 ) One more step on the journey to reduce `BlockRef` usage across the codebase - this one gets rid of `StateData` whose job was to keep track of which block was last assigned to a state - these duties have now been taken over by `latest_block_root`, a fairly recent addition that computes this block root from state data (at a small cost that should be insignificant) 99% mechanical change.	2022-03-16 08:20:40 +01:00
Etan Kissling	6d1d31dd01	avoid re-requesting finalized blocks during sync (#3461 ) When a `beaconBlocksByRange` response advances the `safeSlot`, but later has errors, the sync queue keeps repeating that same request until it is fulfilled without errors. Data up through `safeSlot` is considered to be immutable, i.e., finalized, so re-requesting that data is not useful. By advancing the sync progress in that scenario, those redundant query portions can be avoided. Note, the finalized block _itself_ is always requested, even in the initial request. This behaviour is kept the same.	2022-03-15 18:56:56 +01:00
Jacek Sieka	a3bd01b58d	move dependent root computations to `BeaconState` / `EpochRef` (#3478 ) * fewer deps on `BlockRef` traversal in anticipation of pruning * allows identifying EpochRef:s by their shuffling as a first step of * tighten error handling around missing blocks using the zero hash for signalling "missing block" is fragile and easy to miss - with checkpoint sync now, and pruning in the future, missing blocks become "normal".	2022-03-15 09:24:55 +01:00
Etan Kissling	29e5a4a752	error and progress codes for light client sync (#3490 ) When syncing as a light client, different behaviour is needed to handle the various ways how errors may occur. The existing logic for blocks can also be applied to light client objects: - `Invalid`: Malformed object that is clearly an error by its producer. - `MissingParent`: More data is needed to decide applicability. - `UnviableFork`: Object may be valid but will never apply on this fork. - `Duplicate`: No errors were encountered but the object was not useful.	2022-03-14 10:25:54 +01:00
Etan Kissling	ae408c279a	add option to collect light client data (#3474 ) Light clients require full nodes to serve additional data so that they can stay in sync with the network. This patch adds a new launch option `--import-light-client-data` to configure what data to make available. For now, data is only kept in memory; it is not persisted at this time. Note that data is only locally collected, a separate patch is needed to actually make it available over the network. `--serve-light-client-data` will be used for serving data, but is not functional yet outside tests.	2022-03-11 21:28:10 +01:00
Jacek Sieka	d0183ccd77	Historical state reindex for trusted node sync (#3452 ) When performing trusted node sync, historical access is limited to states after the checkpoint. Reindexing restores full historical access by replaying historical blocks against the state and storing snapshots in the database. The process can be initiated or resumed at any point in time.	2022-03-11 12:49:47 +00:00
Jacek Sieka	4363215a32	relax `BlockRef` database assumptions (#3472 ) * remove `getForkedBlock(BlockRef)` which assumes block data exists but doesn't support archive/backfilled blocks * fix REST `/eth/v1/beacon/headers` request not returning archive/backfilled blocks * avoid re-encoding in REST block SSZ requests (using `getBlockSSZ`)	2022-03-11 13:08:17 +01:00
Tanguy	f589bf2119	Peer dialing/kicking system overhaul (#3346 ) * Force dial + excess peer trimmer * Ensure we always have outgoing peers * Add configurable hard-max-peers	2022-03-11 10:51:53 +00:00
Etan Kissling	5a3ba5d968	update to pre-release light client sync protocol (#3465 ) This adopts the spec sections of the pre-release proposal of the libp2p based light client sync protocol, and also adds a test runner for the new accompanying tests. While the release version of the light client sync protocol contains conflicting definitions, it is currently unused, and the code specific to the pre-release proposal is marked as such. See https://github.com/ethereum/consensus-specs/pull/2802	2022-03-08 13:21:56 +01:00
Etan Kissling	aaa5a5ad40	add `start_slot` overload for sync periods (#3469 ) Adds a `start_slot` overload for `SyncCommitteePeriod` as a shortcut for `period.start_epoch.start_slot`.	2022-03-08 11:38:58 +01:00
Etan Kissling	a84ab5d47f	validate `fork_version` as light client (#3459 ) The spec does not provide code for validating the `fork_version` field of `LightClientUpdate`. However, we can use our own logic for additional validation of that field. The spec's python test suite sets up states that do not follow the fork schedule (e.g., that use Altair fork version before Altair fork epoch), which complicates upstreaming this as code.	2022-03-04 17:09:33 +01:00
Mamy Ratsimbazafy	ef7e8bdbd2	Minify slashing protection before SQLite (#3393 )	2022-03-04 16:43:34 +02:00
tersec	c18cd8ee0c	rename random -> prev_randao in Bellatrix for CL specs v1.1.10 (#3460 )	2022-03-03 16:08:14 +00:00
Etan Kissling	47d7814518	update light client to v1.1.10 spec (#3457 ) Adopts the changes introduced in the v1.1.10 ETH consensus-specs: - Introduces `is_finality_update` helper - Ensures `optimistic_header` always >= `finalized_header` - Updates spec references	2022-03-03 14:03:08 +01:00
Etan Kissling	3ffab01b07	Refactor and optimize sync logs. (#3451 ) * Refactor and optimize logs. * Introduce shortLog(SyncRequest). * Address review comment. * make sync queue logs more consistent Adds a few minor logging improvements: - Fixes a typo (`was happened` -> `has happened`) - Avoids passing `reset_slot` argument to log statement multiple times - Uses same `rewind_to_slot` label when logging in both sync directions - Consistent rewind point logging Co-authored-by: cheatfate <eugene.kabanov@status.im>	2022-03-03 09:05:33 +01:00
Etan Kissling	3b20d57277	use next slot when signing for light client tests (#3447 ) In practice, the sync committee signs `LightClientUpdate` instances at the next slot following the block. This is not correctly reflected in the tests, where it is signed one slot early. This patch updates the tests to use the correct slot for the computation.	2022-03-02 11:46:17 +01:00
tersec	f0ada15dac	automated CL spec ref URL updates from v1.1.9 to v1.1.10 (#3455 )	2022-03-02 10:00:21 +00:00
Etan Kissling	0e34c6023e	cleanup light client sync tests (#3445 ) Various cleanups in the light client sync test suite without semantic impact to make the various tests more streamlined.	2022-02-28 20:58:32 +01:00
tersec	ef9767eb7a	implement --jwt-secret and HS256 JWT/JWS signing for engine API alpha.7 (#3440 )	2022-02-27 16:55:02 +00:00
Jacek Sieka	40a4c01086	chaindag: don't keep backfill block table in memory (#3429 ) This PR names and documents the concept of the archive: a range of slots for which we have degraded functionality in terms of historical access - in particular: * we don't support rewinding to states in this range * we don't keep an in-memory representation of the block dag The archive de-facto exists in a trusted-node-synced node, but this PR gives it a name and drops the in-memory digest index. In order to satisfy `GetBlocksByRange` requests, we ensure that we have blocks for the entire archive period via backfill. Future versions may relax this further, adding a "pre-archive" period that is fully pruned. During by-slot searches in the archive (both for libp2p and rest requests), an extra database lookup is used to convert the given `slot` to a `root` - future versions will avoid this using era files which natively are indexed by `slot`. That said, the lookup is quite fast compared to the actual block loading given how trivial the table is - it's hard to measure, even. A collateral benefit of this PR is that checkpoint-synced nodes will see 100-200MB memory usage savings, thanks to the dropped in-memory cache - future pruning work will bring this benefit to full nodes as well. * document chaindag storage architecture and assumptions * look up parent using block id instead of full block in clearance (future-proofing the code against a future in which blocks come from era files) * simplify finalized block init, always writing the backfill portion to db at startup (to ensure lookups work as expected) * preallocate some extra memory for finalized blocks, to avoid immediate realloc	2022-02-26 19:16:19 +01:00
Jacek Sieka	92e7e288e7	Ignore seen aggregates (#3439 ) https://github.com/ethereum/consensus-specs/pull/2225 removed an ignore rule that would filter out duplicate aggregates from gossip publishing - however, this causes increased bandwidth and CPU usage as discussed in https://github.com/ethereum/consensus-specs/issues/2183 - the intent is to revert the removal and reinstate the rule. This PR implements ignore filtering which cuts down on CPU usage (fewer aggregates to validate) and bandwidth usage (less fanout of duplicates) - as #2225 points out, this may lead to a small increase in IHAVE messages.	2022-02-25 17:15:39 +01:00
tersec	05bc61b712	add mev-boost RPC test, with docs (#3430 ) * bump nim-web3 and add mev-boost RPC test, with docs * remove trailing space * use specific commithash	2022-02-24 14:38:31 +01:00
tersec	7de3f00f35	generic putCorruptState; {Merge=>Bellatrix}BeaconStateNoImmutableValidators (#3427 )	2022-02-21 12:55:56 +01:00
Jacek Sieka	adfe655b16	db: make block loading generic (#3413 ) Streamline lookup with Forky and BeaconBlockFork (then we can do the same for era) We use type to avoid conditionals, as fork is often already known at a "higher" level. * load blockid before loading block by root - this is needed to map root to slot and will eventually be done via block summary table for "old" blocks Co-authored-by: tersec <tersec@users.noreply.github.com>	2022-02-21 09:48:02 +01:00
tersec	84588b34da	var => let in specs/ and tests/ (#3425 )	2022-02-20 20:13:06 +00:00
Etan Kissling	9790c4958b	converter function for reducing blocks to headers (#3410 ) This introduces a function to convert `SignedBeaconBlock` to just their `BeaconBlockHeader` and updates the usages for reduced code duplication.	2022-02-18 21:35:52 +01:00
tersec	79761c78a4	proc -> func, mainly in spec/state transition and adjacent modules (#3405 )	2022-02-17 11:53:55 +00:00
tersec	5eecb9a21f	rename no{R=>r}eturn, no{I=>i}init, short{l=>L}og, E{T=>t}h2Node, Beacon{c=>C}hainDB (#3403 )	2022-02-16 23:24:44 +01:00
tersec	873a8ec1e6	use isZeroMemory for Eth2Digest comparisons (#3386 ) * use isZeroMemory for Eth2Digest comparisons * use Eth2Digest.isZero abstraction	2022-02-14 05:26:19 +00:00
tersec	d02daf8cbd	bump nim-web3 to fix kiln interop (#3373 )	2022-02-11 18:38:44 +00:00
Eugene Kabanov	40c77e5928	Remote KeyManager API and number of fixes/tests for KeyManager API (#3360 ) * Initial commit. * Fix current test suite. * Fix keymanager api test. * Fix wss_sim. * Add more keystore_management tests. * Recover deleted isEmptyDir(). * Add `HttpHostUri` distinct type. Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls. Add REST serialization of RemoteKeystore and Keystore object. Add tests for Remote Keystore management API. Add tests for Keystore management API (Add keystore). Fix serialization issues. * Fix test to use HttpHostUri instead of Uri. * Add links to specification in comments. * Remove debugging echoes.	2022-02-07 22:36:09 +02:00
Jacek Sieka	c7abc97545	harden and speed up block sync (#3358 ) * harden and speed up block sync The `GetBlockBy` server implementation currently reads SSZ bytes from database, deserializes them into a Nim object then serializes them right back to SSZ - here, we eliminate the deser/ser steps and send the bytes straight to the network. Unfortunately, the snappy recoding must still be done because of differences in framing. Also, the quota system makes one giant request for quota right before sending all blocks - this means that a 1024 block request will be "paused" for a long time, then all blocks will be sent at once causing a spike in database reads which potentially will see the reading client time out before any block is sent. Finally, on the reading side we make several copies of blocks as they travel through various queues - this was not noticeable before but becomes a problem in two cases: bellatrix blocks are up to 10mb (instead of .. 30-40kb) and when backfilling, we process a lot more of them a lot faster. fix status comparisons for nodes syncing from genesis (#3327 was a bit too hard) * don't hit database at all for post-altair slots in GetBlock v1 requests	2022-02-07 19:20:10 +02:00
tersec	02349b4181	update to engine API alpha.6 (#3351 )	2022-02-04 12:12:19 +00:00
tersec	d358299875	fork choice proposer boosting support (#3349 ) * fork choice proposer boosting support * detect nodeDelta underflow/overflow	2022-02-04 12:59:40 +01:00
tersec	8e6a920bf4	rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH (#3350 ) * rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH * fix REST test rules	2022-02-02 14:06:55 +01:00
tersec	0c814f49ee	rename sync_{committee_,}aggregate and execute_payload -> notify_new_payload (#3347 )	2022-02-01 07:31:53 +00:00
tersec	c9aa1bee01	spec URL updates (#3342 )	2022-01-31 09:56:59 +00:00
Jacek Sieka	d583e8e4ac	Store finalized block roots in database (3s startup) (#3320 ) * Store finalized block roots in database (3s startup) When the chain has finalized a checkpoint, the history from that point onwards becomes linear - this is exploited in `.era` files to allow constant-time by-slot lookups. In the database, we can do the same by storing finalized block roots in a simple sparse table indexed by slot, bringing the two representations closer to each other in terms of conceptual layout and performance. Doing so has a number of interesting effects: * mainnet startup time is improved 3-5x (3s on my laptop) * the _first_ startup might take slightly longer as the new index is being built - ~10s on the same laptop * we no longer rely on the beacon block summaries to load the full dag - this is a lot faster because we no longer have to look up each block by parent root * a collateral benefit is that we no longer need to load the full summaries table into memory - we get the RSS benefits of #3164 without the CPU hit. Other random stuff: * simplify forky block generics * fix withManyWrites multiple evaluation * fix validator key cache not being updated properly in chaindag read-only mode * drop pre-altair summaries from `kvstore` * recreate missing summaries from altair+ blocks as well (in case database has lost some to an involuntary restart) * print database startup timings in chaindag load log * avoid allocating superfluous state at startup * use a recursive sql query to load the summaries of the unfinalized blocks	2022-01-30 18:51:04 +02:00
tersec	29e2169585	phase 0 & altair beacon chain and altair validator spec URL updates (#3339 )	2022-01-29 13:53:31 +00:00
tersec	89ffa8a1a7	spec URL & copyright year update (#3338 )	2022-01-29 01:05:39 +00:00
tersec	60bf5b8bf4	use v1.1.9 test vectors (#3337 )	2022-01-28 22:47:48 +00:00
tersec	95fee10328	clean up hashed rollback proc declarations (#3333 ) * clean up hashed rollback proc declarations * use generic hashed rollback proc type	2022-01-28 14:24:37 +00:00

1039 Commits