nimbus-eth2/beacon_chain/sync
Eugene Kabanov 18409a69e1
Light forward sync mechanism (#6515)
* Initial commit.

* Add hybrid syncing.

* Compilation fixes.

* Cast custom event for our purposes.

* Instantiate AsyncEventQueue properly.

* Fix mistype.

* Further research on optimistic updates.

* Fixing circular deps.

* Add backfilling.

* Add block download feature.

* Add block store.

* Update backfill information before storing block.

* Use custom block verifier for backfilling sync.

* Skip signature verification in backfilling.

* Add one more generic reload to storeBackfillBlock().

* Add block verification debugging statements.

* Add more debugging

* Do not use database for backfilling, part 1.

* Fix for stash.

* Stash fixes part 2.

* Prepare for testing.

* Fix assertion.

* Fix post-restart syncing process.

* Update backfill loading log statement.
Use proper backfill slot callback for sync manager.

* Add handling of Duplicates.

* Fix store duration and block backfilled log statements.

* Add proper syncing state log statement.

* Add snappy compression to beaconchain_file.
Format syncing speed properly.

* Add blobs verification.

* Add `slot` number to file structure for easy navigation over stream of compressed objects.

* Change database filename.

* Fix structure size.

* Add more consistency properties.

* Fix checkRepair() issues.

* Preparation to state rebuild process.

* Add plain & compressed size.

* Debugging snappy encode process.

* Add one more debugging line.

* Dump blocks.

* One more filedump.

* Fix chunk corruption code.

* Fix detection issue.

* Some fixes in state rebuilding process.

* Add more clearance steps.

* Move updateHead() back to block_processor.

* Fix compilation issues.

* Make code more async friendly.

* Fix async issues.
Add more information when proposer verification failed.

* Fix 8192 slots issue.

* Fix Future double completion issue.

* Pass updateFlags to some of the core procedures.

* Fix tests.

* Improve initial sync handling mechanism.

* Fix checkStateTransition() performance improvements.

* Add some performance tuning and meters.

* Light client performance tuning.

* Remove debugging statement.

* Use single file descriptor for blockchain file.

* Attempt to fix LC.

* Fix timeleft calculation when untrusted sync backfilling started right after LC block received.

* Workaround for `chronicles` + `results` `error` issue.
Remove some compilation warnings.
Fix `CatchableError` leaks on Windows.

* Address review comments.

* Address review comments part 2.

* Address review comments part 1.

* Rebase and fix the issues.

* Address review comments part 3.

* Add tests and fix some issues in auto-repair mechanism.

* Add tests to all_tests.

* Rename binary test file to pass restrictions.

* Add `bin` extension to excluded list.
Recover binary test data.

* Rename fixture file to .bin again.

* Update AllTests.

* Address review comments part 4.

* Address review comments part 5 and fix tests.

* Address review comments part 6.

* Eliminate foldl and combine from blobs processing.
Add some tests to ensure that checkResponse() also checks for correct order.

* Fix forgotten place.

* Post rebase fixes.

* Add unique slots tests.

* Optimize updateHead() code.

* Add forgotten changes.

* Address review comments on state as argument.
2024-10-30 05:38:53 +00:00
README.md docs: fix typos (#5571) 2023-11-06 03:56:07 +00:00
light_client_manager.nim automated consensus spec URL updating to v1.5.0-alpha.8 (#6617) 2024-10-09 08:37:35 +02:00
light_client_protocol.nim automated consensus spec URL updating to v1.5.0-alpha.8 (#6617) 2024-10-09 08:37:35 +02:00
light_client_sync_helpers.nim verify `genesis_time` more strictly (fixes #1667) (#5694) 2024-01-06 15:26:56 +01:00
request_manager.nim add Electra blob support to block/blob quarantines, block processor, and request manager (#6201) 2024-04-11 09:31:39 +00:00
sync_manager.nim Light forward sync mechanism (#6515) 2024-10-30 05:38:53 +00:00
sync_overseer.nim Light forward sync mechanism (#6515) 2024-10-30 05:38:53 +00:00
sync_protocol.nim Fix blob syncing for Electra (#6438) 2024-07-23 03:10:41 +00:00
sync_queue.nim BN: Disable genesis sync via long-range-sync argument. (#6361) 2024-06-20 18:57:08 +00:00
sync_types.nim Light forward sync mechanism (#6515) 2024-10-30 05:38:53 +00:00

README.md

Block syncing

This folder holds all modules related to block syncing.

Block syncing uses the Eth2 RPC protocol.

Reference diagram

Block flow

Eth2 RPC in

Blocks are requested during sync by the SyncManager.

Blocks are received in batches:

  • syncStep(SyncManager, index, peer)
  • in case of success:
    • push(SyncQueue, SyncRequest, seq[SignedBeaconBlock]) is called to handle a successful sync step. It calls validate(SyncQueue, SignedBeaconBlock) on each retrieved block, one by one
    • validate only enqueues the block in the SharedBlockQueue AsyncQueue[BlockEntry]; it does no extra validation (only the GossipSub case validates)
  • in case of failure:
    • push(SyncQueue, SyncRequest) is called to reschedule the sync request.
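
The success and failure paths above can be sketched roughly as follows. This is a toy Python model, not the Nim implementation: `ToySyncQueue`, its `rescheduled` list, and the string "blocks" are hypothetical stand-ins, and an `asyncio.Queue` plays the role of the SharedBlockQueue AsyncQueue[BlockEntry].

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SyncRequest:
    start_slot: int
    count: int


@dataclass
class ToySyncQueue:
    # Stands in for the SharedBlockQueue AsyncQueue[BlockEntry].
    block_queue: asyncio.Queue
    # Hypothetical bookkeeping: requests waiting to be retried.
    rescheduled: List[SyncRequest] = field(default_factory=list)

    def validate(self, block: str) -> None:
        # No verification happens here; the block is only enqueued.
        self.block_queue.put_nowait(block)

    def push(self, request: SyncRequest,
             blocks: Optional[List[str]] = None) -> None:
        if blocks is None:
            # Failure path: reschedule the sync request.
            self.rescheduled.append(request)
            return
        # Success path: hand the blocks over one by one, in order.
        for block in blocks:
            self.validate(block)


async def demo():
    sq = ToySyncQueue(block_queue=asyncio.Queue())
    sq.push(SyncRequest(start_slot=0, count=2), ["block@0", "block@1"])
    sq.push(SyncRequest(start_slot=2, count=2))  # failed request
    queued = [sq.block_queue.get_nowait()
              for _ in range(sq.block_queue.qsize())]
    return queued, sq.rescheduled


queued, rescheduled = asyncio.run(demo())
```

In this sketch the successful batch ends up in the block queue in its original order, while the failed request is only recorded for a later retry.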

Every second when sync is not in progress, the beacon node will ask the RequestManager to download all missing blocks currently in quarantine.

  • via handleMissingBlocks
  • which calls fetchAncestorBlocks
  • which asynchronously enqueues the request in the SharedBlockQueue AsyncQueue[BlockEntry].

The RequestManager runs an event loop:

  • that calls fetchAncestorBlocksFromNetwork
  • which calls peers over RPC with beaconBlocksByRoot
  • and calls validate(RequestManager, SignedBeaconBlock) on each retrieved block, one by one
  • validate only enqueues the block in the AsyncQueue[BlockEntry]; it does no extra validation (only the GossipSub case validates)
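
The event loop above can be modelled with a short Python sketch. Everything here is an illustrative assumption rather than the actual Nim code: `PEER_BLOCKS` is made-up peer data, `beacon_blocks_by_root` is a local stand-in for the beaconBlocksByRoot RPC call, and `None` is used as a sentinel so that the toy loop terminates.

```python
import asyncio
from typing import Dict, List

# Hypothetical peer data: block root -> block.
PEER_BLOCKS: Dict[str, str] = {"0xaa": "block-aa", "0xbb": "block-bb"}


async def beacon_blocks_by_root(roots: List[str]) -> List[str]:
    # Stand-in for the RPC call: returns the requested blocks the peer has.
    await asyncio.sleep(0)
    return [PEER_BLOCKS[r] for r in roots if r in PEER_BLOCKS]


async def request_manager_loop(requests: asyncio.Queue,
                               block_queue: asyncio.Queue) -> None:
    while True:
        roots = await requests.get()
        if roots is None:  # sentinel: stop the toy loop
            break
        for block in await beacon_blocks_by_root(roots):
            # "validate" step: enqueue only, no extra validation.
            block_queue.put_nowait(block)


async def demo() -> List[str]:
    requests: asyncio.Queue = asyncio.Queue()
    block_queue: asyncio.Queue = asyncio.Queue()
    requests.put_nowait(["0xaa", "0xcc"])  # 0xcc is unknown to the peer
    requests.put_nowait(["0xbb"])
    requests.put_nowait(None)
    await request_manager_loop(requests, block_queue)
    return [block_queue.get_nowait() for _ in range(block_queue.qsize())]


fetched = asyncio.run(demo())
```

Roots the peer does not know are simply skipped in this model; how the real RequestManager retries or penalizes peers is not covered here.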

Weak subjectivity sync

Not implemented!

Comments

The validate procedure name for SyncManager and RequestManager is misleading, as no P2P validation actually occurs.

Sync vs Steady State

During sync:

  • The RequestManager is deactivated
  • The SyncManager is working full speed ahead
  • Gossip is deactivated

Bottlenecks during sync

During sync:

  • The bottleneck is clearing the SharedBlockQueue AsyncQueue[BlockEntry] via storeBlock, which requires full verification (state transition + cryptography)

Backpressure

The SyncManager handles backpressure by ensuring that current_queue_slot <= request.slot <= current_queue_slot + sq.queueSize * sq.chunkSize.

  • queueSize is -1 (unbounded) by default according to a comment, but all init paths use 1 (?)
  • chunkSize is SLOTS_PER_EPOCH = 32
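
As a worked example of the window condition, here is a minimal Python sketch. The function name `within_window` and its parameter names are hypothetical; the defaults assume queueSize = 1 and chunkSize = 32, as described above.

```python
SLOTS_PER_EPOCH = 32


def within_window(request_slot: int, current_queue_slot: int,
                  queue_size: int = 1,
                  chunk_size: int = SLOTS_PER_EPOCH) -> bool:
    # current_queue_slot <= request.slot
    #   <= current_queue_slot + queueSize * chunkSize
    upper = current_queue_slot + queue_size * chunk_size
    return current_queue_slot <= request_slot <= upper


# With current_queue_slot = 100 the accepted range is slots 100..132.
inside_low = within_window(100, 100)
inside_high = within_window(132, 100)
too_far_ahead = within_window(133, 100)
behind = within_window(99, 100)
```

A request one slot past the upper bound (133 here) falls outside the window, which is what keeps the sync side from running arbitrarily far ahead of the queue.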

However, the shared AsyncQueue[BlockEntry] itself is unbounded. Concretely:

  • The shared AsyncQueue[BlockEntry] is effectively bounded for sync, through the request window above
  • The shared AsyncQueue[BlockEntry] is unbounded for validated gossip blocks

RequestManager and Gossip are deactivated during sync and so do not contribute to pressure.