README.md

Block syncing

This folder holds all modules related to block syncing.

Block syncing uses the Eth2 RPC protocol.

Reference diagram

Eth2 RPC in

Blocks are requested during sync by the SyncManager.

Blocks are received by batch:

syncStep(SyncManager, index, peer)
in case of success:
- push(SyncQueue, SyncRequest, seq[SignedBeaconBlock]) is called to handle a successful sync step. It calls validate(SyncQueue, SignedBeaconBlock) on each retrieved block, one by one.
- validate only enqueues the block in the SharedBlockQueue AsyncQueue[BlockEntry]; it performs no extra validation, which happens only in the GossipSub case.
in case of failure:
- push(SyncQueue, SyncRequest) is called to reschedule the sync request.
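The success and failure paths above can be sketched roughly as follows. This is an illustrative Python model, not the Nim implementation: `SyncQueueSketch`, `push_success`, `push_failure`, `sync_step` and the `download` callback are hypothetical names, and a plain `deque` stands in for the shared AsyncQueue[BlockEntry].

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SyncRequest:
    # Hypothetical request shape: a contiguous range of slots.
    start_slot: int
    count: int


@dataclass
class SyncQueueSketch:
    # Stand-in for the shared AsyncQueue[BlockEntry].
    block_queue: deque = field(default_factory=deque)
    # Requests waiting to be issued again after a failure.
    pending: deque = field(default_factory=deque)

    def validate(self, block):
        # Mirrors the README: only enqueues, no extra validation.
        self.block_queue.append(block)

    def push_success(self, request, blocks):
        # Success path: hand over each retrieved block one by one.
        for block in blocks:
            self.validate(block)

    def push_failure(self, request):
        # Failure path: reschedule the same request.
        self.pending.append(request)


def sync_step(queue, request, download):
    """Run one sync step; `download` is an assumed peer-fetch callback."""
    try:
        blocks = download(request)
    except ConnectionError:
        queue.push_failure(request)
        return False
    queue.push_success(request, blocks)
    return True
```

In this sketch a failed download leaves the block queue untouched and puts the request back on `pending`, which is the rescheduling behaviour described for push(SyncQueue, SyncRequest).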

Every second, when sync is not in progress, the beacon node asks the RequestManager to download all missing blocks currently in quarantine.

via handleMissingBlocks
which calls fetchAncestorBlocks
which asynchronously enqueues the request in the SharedBlockQueue AsyncQueue[BlockEntry].

The RequestManager runs an event loop:

that calls fetchAncestorBlocksFromNetwork
which RPC calls peers with beaconBlocksByRoot
and calls validate(RequestManager, SignedBeaconBlock) on each block retrieved one-by-one
validate only enqueues the block in the AsyncQueue[BlockEntry]; it performs no extra validation, which happens only in the GossipSub case.
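The event loop above can be modelled with a small asyncio sketch. It is only an approximation of the flow under stated assumptions: `request_manager_loop`, `fetch_by_root` and the `None` stop signal are hypothetical, and `fetch_by_root` stands in for the beaconBlocksByRoot RPC call to peers.

```python
import asyncio


async def request_manager_loop(missing_roots, fetch_by_root, block_queue):
    """Process batches of missing block roots until a None sentinel arrives."""
    while True:
        roots = await missing_roots.get()
        if roots is None:
            # Hypothetical stop signal so the sketch can terminate.
            break
        # Stand-in for the beaconBlocksByRoot request to a peer.
        blocks = await fetch_by_root(roots)
        for block in blocks:
            # "validate" step: only enqueue, no extra validation.
            await block_queue.put(block)


async def demo():
    missing_roots = asyncio.Queue()
    block_queue = asyncio.Queue()

    async def fake_fetch(roots):
        # Fake peer that returns one block per requested root.
        return [f"block-for-{root}" for root in roots]

    await missing_roots.put(["0xaa", "0xbb"])
    await missing_roots.put(None)
    await request_manager_loop(missing_roots, fake_fetch, block_queue)

    received = []
    while not block_queue.empty():
        received.append(block_queue.get_nowait())
    return received
```

Calling `asyncio.run(demo())` drives the loop once with a fake peer and returns the blocks that ended up in the queue.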

Weak subjectivity sync

Not implemented!

Comments

The validate procedure name for SyncManager and RequestManager is misleading, as no P2P validation actually occurs.

Sync vs Steady State

During sync:

The RequestManager is deactivated
The SyncManager is working full speed ahead
Gossip is deactivated

Bottlenecks during sync

During sync:

The bottleneck is clearing the SharedBlockQueue AsyncQueue[BlockEntry] via storeBlock, which requires full verification (state transition + cryptography).

Backpressure

The SyncManager handles backpressure by ensuring that current_queue_slot <= request.slot <= current_queue_slot + sq.queueSize * sq.chunkSize.

queueSize is -1 (unbounded) by default according to the comment, but all init paths appear to use 1 (?)
chunkSize is SLOTS_PER_EPOCH = 32
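As a worked example of the window check, here is a minimal Python sketch. The function name `in_sync_window` is hypothetical, and treating a negative queue size as "no upper bound" is an assumption based on the "-1, unbounded" comment mentioned above, not confirmed behaviour.

```python
SLOTS_PER_EPOCH = 32


def in_sync_window(request_slot, current_queue_slot, queue_size,
                   chunk_size=SLOTS_PER_EPOCH):
    """Check current_queue_slot <= request_slot <= current_queue_slot + queue_size * chunk_size."""
    if request_slot < current_queue_slot:
        return False
    if queue_size < 0:
        # Assumption: a negative size means the window has no upper bound.
        return True
    upper = current_queue_slot + queue_size * chunk_size
    return request_slot <= upper


# With queueSize = 1 and chunkSize = 32, a queue at slot 100
# accepts requests for slots 100 through 132 inclusive.
window = [s for s in range(90, 140) if in_sync_window(s, 100, 1)]
```

With queueSize = 1, the window spans a single epoch of slots ahead of the current queue slot, so at most one chunk of requests can be outstanding beyond it.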

However, the shared AsyncQueue[BlockEntry] itself is unbounded. Concretely:

The shared AsyncQueue[BlockEntry] is effectively bounded for sync, through the SyncManager window above
The shared AsyncQueue[BlockEntry] is unbounded for validated gossip blocks

RequestManager and Gossip are deactivated during sync and so do not contribute to pressure.