nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
Etan Kissling	21b69d5901	avoid small gaps in optimistic block stream (#3749 ) Ensures that all intermediate blocks are reported if a small gap is encountered when downloading optimistic blocks. Gaps may occur when a block is missed and still downloading, or when EL processing is slow. If the gap exceeds 1 epoch, optimistic block stream jumps to latest.	2022-06-16 15:24:08 +00:00
Eugene Kabanov	1b6651dfc3	Fix /eth/v1/node/syncing (#3720 ) * Fix REST `/eth/v1/node/syncing` call to return values even if SyncManager is not running. * Use syncManager.inProgress as is_syncing indicator.	2022-06-14 22:26:23 +02:00
Etan Kissling	52ba4f7999	rename light client config parameters (#3740 ) For consistency with other options, use a common prefix for light client data configuration options. * `--serve-light-client-data` --> `--light-client-data-serve` * `--import-light-client-data` --> `--light-client-data-import-mode` No deprecation of the old identifiers as they were only sparingly used and all usage can be easily updated without interferance.	2022-06-14 12:03:39 +03:00
tersec	8d421f3d91	keep fcU consistent with actual DAG (#3748 )	2022-06-14 08:28:30 +00:00
Etan Kissling	7b04a94d43	fix #3674 (Sync progress >100% on checkpoint sync) (#3736 ) Corrects an off-by-1 in the reported sync percentage computation. New logic is based on `SyncQueue.total` and `SyncQueue.progress` with `pivot` instead of `sq.startSlot`.	2022-06-13 20:00:36 +03:00
Etan Kissling	15967c4076	keep track of latest blocks for optimistic sync (#3715 ) When launched with `--light-client-enable` the latest blocks are fetched and optimistic candidate blocks are passed to a callback (log for now). This helps accelerate syncing in the future (optimistic sync).	2022-06-10 14:16:37 +00:00
Jacek Sieka	7ec1521c52	use unsigned literals (#3717 ) in the hopes of avoiding potential for conversion bugs on i386	2022-06-08 11:09:33 +00:00
Jacek Sieka	b35584632b	sync: remove `step` from sync client implementation (#3678 ) * sync: remove `step` from sync client implementation Deprecated in the spec: https://github.com/ethereum/consensus-specs/pull/2856 - future PR:s will deprecate server support as well.	2022-06-06 16:56:59 +03:00
tersec	ea113fc420	disallow non-(genesis, far-future) equal transition epochs (#3691 )	2022-06-03 09:37:03 +00:00
Eugene Kabanov	50f9596108	Eliminate rpc_types.nim usage. (#3692 )	2022-06-02 09:39:08 +00:00
Etan Kissling	01efa93cf6	add light client (standalone) (#3653 ) Introduces a new library for syncing using libp2p based light client sync protocol, and adds a new `nimbus_light_client` executable that uses this library for syncing. The new executable emits log messages when new beacon block headers are received, and is integrated into testing.	2022-05-31 12:45:37 +02:00
Etan Kissling	c808f17a37	update to latest light client libp2p protocol (#3623 ) Incorporates the latest changes to the light client sync protocol based on Devconnect AMS feedback. Note that this breaks compatibility with the previous prototype, due to changes to data structures and endpoints. See https://github.com/ethereum/consensus-specs/pull/2802	2022-05-23 14:02:54 +02:00
Etan Kissling	8cfb630aa9	never request blocks before `safeSlot` in sync (#3512 ) Follows up on https://github.com/status-im/nimbus-eth2/pull/3461 which ensured that repeated `beaconBlocksByRange` requests get shrinked to account for potential out-of-band advancements to `safeSlot`, with similar logic for the initial request.	2022-05-10 13:46:14 +02:00
Jacek Sieka	138c40161d	avoid unnecessary recompression in block protocol (#3598 ) Blocks can be sent straight from compressed data sources Co-authored-by: Etan Kissling <etan@status.im>	2022-05-05 11:00:02 +00:00
tersec	7bb40d28ae	ensure MAX_CHUNK_SIZE usage consistent in sync_protocol (#3615 )	2022-05-05 09:17:14 +00:00
tersec	4a372410a4	use MAX_CHUNK_SIZE_BELLATRIX for signed Bellatrix blocks (#3613 ) * use MAX_CHUNK_SIZE_BELLATRIX for signed Bellatrix blocks * Update beacon_chain/networking/eth2_network.nim Co-authored-by: Etan Kissling <etan@status.im> * localPassC to localPassc * check against maxChunkSize rather than constant Co-authored-by: Etan Kissling <etan@status.im>	2022-05-05 05:45:35 +00:00
Eugene Kabanov	5592c7c674	NoMonitor and removed clock check for SyncManager. (#3420 ) * Add `NoMonitor` flag to stop SyncManager from monitoring sync situation. * Remove `toleranceValue` and `PeerScoreHeadTooNew`. Co-authored-by: Etan Kissling <etan@status.im>	2022-04-14 15:17:44 +02:00
Jacek Sieka	f70ff38b53	enable `styleCheck:usages` (#3573 ) Some upstream repos still need fixes, but this gets us close enough that style hints can be enabled by default. In general, "canonical" spellings are preferred even if they violate nep-1 - this applies in particular to spec-related stuff like `genesis_validators_root` which appears throughout the codebase.	2022-04-08 16:22:49 +00:00
Jacek Sieka	4207b127f9	era: load blocks and states (#3394 ) * era: load blocks and states Era files contain finalized history and can be thought of as an alternative source for block and state data that allows clients to avoid syncing this information from the P2P network - the P2P network is then used to "top up" the client with the most recent data. They can be freely shared in the community via whatever means (http, torrent, etc) and serve as a permanent cold store of consensus data (and, after the merge, execution data) for history buffs and bean counters alike. This PR gently introduces support for loading blocks and states in two cases: block requests from rest/p2p and frontfilling when doing checkpoint sync. The era files are used as a secondary source if the information is not found in the database - compared to the database, there are a few key differences: * the database stores the block indexed by block root while the era file indexes by slot - the former is used only in rest, while the latter is used both by p2p and rest. * when loading blocks from era files, the root is no longer trivially available - if it is needed, it must either be computed (slow) or cached (messy) - the good news is that for p2p requests, it is not needed * in era files, "framed" snappy encoding is used while in the database we store unframed snappy - for p2p2 requests, the latter requires recompression while the former could avoid it * front-filling is the process of using era files to replace backfilling - in theory this front-filling could happen from any block and front-fills with gaps could also be entertained, but our backfilling algorithm cannot take advantage of this because there's no (simple) way to tell it to "skip" a range. * front-filling, as implemented, is a bit slow (10s to load mainnet): we load the full BeaconState for every era to grab the roots of the blocks - it would be better to partially load the state - as such, it would also be good to be able to partially decompress snappy blobs * lookups from REST via root are served by first looking up a block summary in the database, then using the slot to load the block data from the era file - however, there needs to be an option to create the summary table from era files to fully support historical queries To test this, `ncli_db` has an era file exporter: the files it creates should be placed in an `era` folder next to `db` in the data directory. What's interesting in particular about this setup is that `db` remains as the source of truth for security purposes - it stores the latest synced head root which in turn determines where a node "starts" its consensus participation - the era directory however can be freely shared between nodes / people without any (significant) security implications, assuming the era files are consistent / not broken. There's lots of future improvements to be had: * we can drop the in-memory `BlockRef` index almost entirely - at this point, resident memory usage of Nimbus should drop to a cool 500-600 mb * we could serve era files via REST trivially: this would drop backfill times to whatever time it takes to download the files - unlike the current implementation that downloads block by block, downloading an era at a time almost entirely cuts out request overhead * we can "reasonably" recreate detailed state history from almost any point in time, turning an O(slot) process into O(1) effectively - we'll still need caches and indices to do this with sufficient efficiency for the rest api, but at least it cuts the whole process down to minutes instead of hours, for arbitrary points in time * CI: ignore failures with Nim-1.6 (temporary) * test fixes Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>	2022-03-23 09:58:17 +01:00
Etan Kissling	b2b7b0bd56	serve libp2p protocol for light client sync (#3341 ) This extends the `--serve-light-client-data` launch option to serve locally collected light client data via libp2p. Backfill of historic best `LightClientUpdate` is not yet implemented. See https://github.com/ethereum/consensus-specs/pull/2802	2022-03-22 21:23:36 +01:00
Etan Kissling	6d1d31dd01	avoid re-requesting finalized blocks during sync (#3461 ) When a `beaconBlocksByRange` response advances the `safeSlot`, but later has errors, the sync queue keeps repeating that same request until it is fulfilled without errors. Data up through `safeSlot` is considered to be immutable, i.e., finalized, so re-requesting that data is not useful. By advancing the sync progress in that scenario, those redundant query portions can be avoided. Note, the finalized block _itself_ is always requested, even in the initial request. This behaviour is kept same.	2022-03-15 18:56:56 +01:00
Jacek Sieka	a3bd01b58d	move dependent root computations to `BeaconState` / `EpochRef` (#3478 ) * fewer deps on `BlockRef` traversal in anticipation of pruning * allows identifying EpochRef:s by their shuffling as a first step of * tighten error handling around missing blocks using the zero hash for signalling "missing block" is fragile and easy to miss - with checkpoint sync now, and pruning in the future, missing blocks become "normal".	2022-03-15 09:24:55 +01:00
Etan Kissling	3ffab01b07	Refactor and optimize sync logs. (#3451 ) * Refactor and optimize logs. * Introduce shortLog(SyncRequest). * Address review comment. * make sync queue logs more consistent Adds a few minor logging improvements: - Fixes a typo (`was happened` -> `has happened`) - Avoids passing `reset_slot` argument to log statement multiple times - Uses same `rewind_to_slot` label when logging in both sync directions - Consistent rewind point logging Co-authored-by: cheatfate <eugene.kabanov@status.im>	2022-03-03 09:05:33 +01:00
Etan Kissling	6849536742	fix `firstSlot` computation for backfill sync When initializing backfill sync, the implementation intends to start at the first unknown slot (`1` before tail). However, an incorrect variable is passed, and backfill sync actually starts at the tail slot instead. This patch corrects this by passing the intended variable. The problem was introduced with the original backfill implementation at #3263.	2022-02-14 18:53:38 +02:00
Etan Kissling	d1f97e209a	remove unused `sleepTime` from `SyncManager` (#3384 ) The `SyncManager` has a leftover optional `sleepTime` parameter in its constructor that used to configure the sync loop polling rate. This parameter was replaced with a constant in #1602 and is no longer functional. This patch removes the `sleepTime` leftovers.	2022-02-14 12:05:01 +01:00
Etan Kissling	a28900c348	fix slot number display during sync (#3383 ) #3304 introduced a regression to the sync status string displayed in the status bar; during the main forward sync, the current slot is no longer reported and always displays as `0`. This patch corrects the computation to accurately report the current slot once more.	2022-02-14 12:04:04 +01:00
tersec	873a8ec1e6	use isZeroMemory for Eth2Digest comparisons (#3386 ) * use isZeroMemory for Eth2Digest comparisons * use Eth2Digest.isZero abstraction	2022-02-14 05:26:19 +00:00
Etan Kissling	15fc7534cf	remove unused `maxStatusAge` from `SyncManager` (#3382 ) The `SyncManager` has a leftover optional `maxStatusAge` parameter in its constructor that used to configure the libp2p `Status` polling rate. This parameter was replaced with a constant in #1827 and is no longer functional. This patch removes the `maxStatusAge` leftovers.	2022-02-13 16:17:13 +01:00
Jacek Sieka	1760f4d7a7	move wallet/deposit commands to separate files (#3372 ) These commands have little to do with the "normal" beacon node operation - ergo, they deserve to live in their own module. * clean up imports/exports	2022-02-11 21:40:49 +01:00
Jacek Sieka	c7abc97545	harden and speed up block sync (#3358 ) * harden and speed up block sync The `GetBlockBy` server implementation currently reads SSZ bytes from database, deserializes them into a Nim object then serializes them right back to SSZ - here, we eliminate the deser/ser steps and send the bytes straight to the network. Unfortunately, the snappy recoding must still be done because of differences in framing. Also, the quota system makes one giant request for quota right before sending all blocks - this means that a 1024 block request will be "paused" for a long time, then all blocks will be sent at once causing a spike in database reads which potentially will see the reading client time out before any block is sent. Finally, on the reading side we make several copies of blocks as they travel through various queues - this was not noticeable before but becomes a problem in two cases: bellatrix blocks are up to 10mb (instead of .. 30-40kb) and when backfilling, we process a lot more of them a lot faster. fix status comparisons for nodes syncing from genesis (#3327 was a bit too hard) * don't hit database at all for post-altair slots in GetBlock v1 requests	2022-02-07 19:20:10 +02:00
Jacek Sieka	84b6ad871d	harden status message handling Additional sanity checking of the status message exchanged during a fresh connection: * check that head and finalized make sense, slot-wise * verify that finalized root lies on the canonical chain, when possible * re-check these things for every status message during sync	2022-01-27 18:46:47 +02:00
Jacek Sieka	f70aceef37	Harden handling of unviable forks (#3312 ) * Harden handling of unviable forks In our current handling of unviable forks, we allow peers to send us blocks that come from a different fork - this is not necessarily an error as it can happen naturally, but it does open up the client to a case where the same unviable fork keeps getting requested - rather than allowing this to happen, we'll now give these peers a small negative score - if it keeps happening, we'll disconnect them. * keep track of unviable forks in quarantine, to avoid filling it with known junk * collect peer scores in single module * descore peers when they send unviable blocks during sync * don't give score for duplicate blocks * increase quarantine size to a level that allows finality to happen under optimal conditions - this helps avoid downloading the same blocks over and over in case of an unviable fork * increase initial score for new peers to make room for one more failure before disconnection * log and score invalid/unviable blocks in requestmanager too * avoid ChainDAG dependency in quarantine * reject gossip blocks with unviable parent * continue processing unviable sync blocks in order to build unviable dag * docs * Update beacon_chain/consensus_object_pools/block_pools_types.nim * add unviable queue test	2022-01-26 13:20:08 +01:00
tersec	351c2fd48a	rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315 )	2022-01-24 16:23:13 +00:00
Jacek Sieka	61342c2449	limit by-root requests to non-finalized blocks (#3293 ) * limit by-root requests to non-finalized blocks Presently, we keep a mapping from block root to `BlockRef` in memory - this has simplified reasoning about the dag, but is not sustainable with the chain growing. We can distinguish between two cases where by-root access is useful: * unfinalized blocks - this is where the beacon chain is operating generally, by validating incoming data as interesting for future fork choice decisions - bounded by the length of the unfinalized period * finalized blocks - historical access in the REST API etc - no bounds, really In this PR, we limit the by-root block index to the first use case: finalized chain data can more efficiently be addressed by slot number. Future work includes: * limiting the `BlockRef` horizon in general - each instance is 40 bytes+overhead which adds up - this needs further refactoring to deal with the tail vs state problem * persisting the finalized slot-to-hash index - this one also keeps growing unbounded (albeit slowly) Anyway, this PR easily shaves ~128mb of memory usage at the time of writing. * No longer honor `BeaconBlocksByRoot` requests outside of the non-finalized period - previously, Nimbus would generously return any block through this libp2p request - per the spec, finalized blocks should be fetched via `BeaconBlocksByRange` instead. * return `Opt[BlockRef]` instead of `nil` when blocks can't be found - this becomes a lot more common now and thus deserves more attention * `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized blocks from now - `finalizedBlocks` covers the other `BlockRef` instances * in backfill, verify that the last backfilled block leads back to genesis, or panic * add backfill timings to log * fix missing check that `BlockRef` block can be fetched with `getForkedBlock` reliably * shortcut doppelganger check when feature is not enabled * in REST/JSON-RPC, fetch blocks without involving `BlockRef` * fix dag.blocks ref	2022-01-21 13:33:16 +02:00
Eugene Kabanov	0ea6dfa517	Fix current slot value and finishing progress for backfilling. (#3304 )	2022-01-21 10:35:54 +01:00
Jacek Sieka	570379d3d9	Backfiller (#3263 ) Backfilling is the process of downloading historical blocks via P2P that are required to fulfill `GetBlocksByRange` duties - this happens during both trusted node and finalized checkpoint syncs. In particular, backfilling happens after syncing to head, such that attestation work can start as soon as possible. * Fix SyncQueue initialization procedure. Remove usage of `awaitne`. Add cancellation support. Remove unneeded `sleepAsync()` if peer's head is older than needed. Add `direction` field to all logs. Fix syncmanager wedge issue. Add proper resource cleaning procedure on backward sync finish. Co-authored-by: cheatfate <eugene.kabanov@status.im>	2022-01-20 08:25:45 +01:00
tersec	9c0c9c98ce	complete switch to beacon_chain/specs/datatypes/bellatrix (#3295 )	2022-01-18 13:36:52 +00:00
Jacek Sieka	836f6984bb	move `state_transition` to `Result` (#3284 ) * better error messages in api * avoid `BlockData` copies when replaying blocks	2022-01-17 12:19:58 +01:00
Jacek Sieka	68247f81b3	Trusted node sync (#3209 ) * Trusted node sync Trusted node sync, aka checkpoint sync, allows syncing tyhe chain from a trusted node instead of relying on a full sync from genesis. Features include: * sync from any slot, including the latest finalized slot * backfill blocks either from the REST api (default) or p2p (#3263) Future improvements: * top up blocks between head in database and some other node - this makes for an efficient backup tool * recreate historical state to enable historical queries * fixes * load genesis from network metadata * check checkpoint block root against state * fix invalid block root in rest json decoding * odds and ends * retry looking for epoch-boundary checkpoint blocks	2022-01-17 10:27:08 +01:00
Jacek Sieka	d57c2dc4e5	use tail block as sync pivot (#3276 ) When syncing, we show how much of the sync has completed - with checkpoint sync, the syncing does not always go from slot 0 to head, but rather can start in the middle. To show a consistent `%` between restarts, we introduce the concept of a pivot point, such that if I sync 10% of the chain, then restart the client, it picks up at 10% (instead of counting from 0). What it looks like: ``` INF ... sync="01d12h41m (15.96%) 13.5158slots/s (QDDQDDQQDP:339018)" ... ```	2022-01-13 10:37:53 +01:00
tersec	14aab2c13f	update 10 modules from using merge to bellatrix (#3272 )	2022-01-12 15:50:30 +01:00
Jacek Sieka	805e85e1ff	time: spring cleaning (#3262 ) Time in the beacon chain is expressed relative to the genesis time - this PR creates a `beacon_time` module that collects helpers and utilities for dealing the time units - the new module does not deal with actual wall time (that's remains in `beacon_clock`). Collecting the time related stuff in one place makes it easier to find, avoids some circular imports and allows more easily identifying the code actually needs wall time to operate. * move genesis-time-related functionality into `spec/beacon_time` * avoid using `chronos.Duration` for time differences - it does not support negative values (such as when something happens earlier than it should) * saturate conversions between `FAR_FUTURE_XXX`, so as to avoid overflows * fix delay reporting in validator client so it uses the expected deadline of the slot, not "closest wall slot" * simplify looping over the slots of an epoch * `compute_start_slot_at_epoch` -> `start_slot` * `compute_epoch_at_slot` -> `epoch` A follow-up PR will (likely) introduce saturating arithmetic for the time units - this is merely code moves, renames and fixing of small bugs.	2022-01-11 11:01:54 +01:00
Jacek Sieka	6f7e0e3393	REST cleanups (#3255 ) * REST cleanups * reject out-of-range committee requests * print all hex values as lower-case * allow requesting state information by head state root * turn `DomainType` into array (follow spec) * `uint_to_bytesXX` -> `uint_to_bytes` (follow spec) * fix wrong dependent root in `/eth/v1/validator/duties/proposer/` * update documentation - `--subscribe-all-subnets` is no longer needed when using the REST interface with validator clients * more fixes * common helpers for dependent block * remove test rules obsoleted by more strict epoch tests * fix trailing commas * Update docs/the_nimbus_book/src/rest-api.md * Update docs/the_nimbus_book/src/rest-api.md Co-authored-by: sacha <sacha@status.im>	2022-01-08 22:06:34 +02:00
tersec	5878d34117	rename forkDigests.merge to forkDigests.bellatrix (#3245 )	2022-01-05 14:24:15 +00:00
tersec	b81c06edab	rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240 )	2022-01-04 09:45:38 +00:00
tersec	6ef3834f4a	fix type-conversions-to-self, unexport from nimbus_beacon_node, and rm unused vars/procs (#3211 )	2021-12-20 12:21:17 +01:00
Jacek Sieka	118840d241	SyncManager cleanups for backfill support (#3189 ) * SyncManager cleanups for backfill support Cleanups, fixes and simplifications, in anticipation of backfill support for the `SyncManager`: * reformat sync progress indicator to show time left and % done more prominently: * old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"` * new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"` * reset average speed when going out of sync * pass all block errors to sync manager, including duplicate/unviable * penalize peers for reporting a head block that is outside of our expected wall clock time (they're likely on a different network or trying to disrupt sync) * remove `SyncFailureKind` (unused) * remove `inRange` (unused) * add `Q` for sync queue requests that are in the `SyncQueue` but not yet in the `BlockProcessor` queue * update last slot in `SyncQueue` after getting peer status * fix race condition between `wakeupWaiters` and `resetWait`, where workers would not be correctly reset if block verification returned a completed future without event loop * log syncmanager direction * Fix ordering issue. Some of the requests size of which are not equal to `chunkSize` could be processed in wrong order which could lead to sync process freezes. Co-authored-by: cheatfate <eugene.kabanov@status.im>	2021-12-16 15:57:16 +01:00
Jacek Sieka	03005f48e1	Backfill support for ChainDAG (#3171 ) In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This PR adds one more block pointer: the backfill block which represents the block that has been backfilled so far. When doing a checkpoint sync, a random block is given as starting point - this is the tail block, and we require that the tail block has a corresponding state. When backfilling, we end up with blocks without corresponding states, hence we cannot use `tail` as a backfill pointer - there is no state. Nonetheless, we need to keep track of where we are in the backfill process between restarts, such that we can answer GetBeaconBlocksByRange requests. This PR adds the basic support for backfill handling - it needs to be integrated with backfill sync, and the REST API needs to be adjusted to take advantage of the new backfilled blocks when responding to certain requests. Future work will also enable moving the tail in either direction: * pruning means moving the tail forward in time and removing states * backwards means recreating past states from genesis, such that intermediate states are recreated step by step all the way to the tail - at that point, tail, genesis and backfill will match up. * backfilling is done when backfill != genesis - later, this will be the WSS checkpoint instead	2021-12-13 14:36:06 +01:00
Eugene Kabanov	b05734f610	Backward sync support for SyncManager. (#3131 ) * Unbundle SyncQueue from sync_manager.nim. Unbundle Peer scores constants to peer_scores.nim. Add Forward/Backward enum. * Further improvements and tests. * Adopt getRewindPoint() and fix MissingParent handler. * Remove unused procedures. Refactor `result` usage. Fix resetWait(). * Add all the tests and fix the issue with rewind point. * Fix get() issue. * Fix flaky tests. * test fixes Co-authored-by: Jacek Sieka <jacek@status.im>	2021-12-08 22:15:29 +01:00
Jacek Sieka	1a8b7469e3	move quarantine outside of chaindag (#3124 ) * move quarantine outside of chaindag The quarantine has been part of the ChainDAG for the longest time, but this design has a few issues: * the function in which blocks are verified and added to the dag becomes reentrant and therefore difficult to reason about - we're currently using a stateful flag to work around it * quarantined blocks bypass the processing queue leading to a processing stampede * the quarantine flow is unsuitable for orphaned attestations - these should also should be quarantined eventually Instead of processing the quarantine inside ChainDAG, this PR moves re-queueing to `block_processor` which already is responsible for dealing with follow-up work when a block is added to the dag This sets the stage for keeping attestations in the quarantine as well. Also: * make `BlockError` `{.pure.}` * avoid use of `ValidationResult` in block clearance (that's for gossip)	2021-12-06 10:49:01 +01:00

1 2

89 Commits