Up until now, the block DAG has been using `BlockRef`, a structure adapted for a full DAG, to represent all of chain history. This is a correct and simple design, but it does not exploit the linearity of the chain once parts of it finalize.

By pruning the in-memory `BlockRef` structure at finalization, we save, at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory, landing us at a steady state of ~750mb normal memory usage for a validating node. Above all though, we prevent memory usage from growing proportionally with the length of the chain, something that would not be sustainable over time - instead, the steady-state memory usage is roughly determined by the validator set size, which grows much more slowly. With these changes, the core should remain sustainable memory-wise post-merge all the way to withdrawals (when the validator set is expected to grow).

In-memory indices are still used for the "hot" unfinalized portion of the chain - this ensures that consensus performance remains unchanged. What changes is that for historical access, we use a db-based linear slot index which is cache-and-disk-friendly, keeping the cost of accessing historical data at a similar level as before and achieving the savings at no perceivable cost to functionality or performance.

A nice collateral benefit is the almost-instant startup, since we no longer load any large indices at dag init. The cost of this functionality can instead be found in the complexity of having to deal with two ways of traversing the chain - by `BlockRef` and by slot.

* use `BlockId` instead of `BlockRef` where finalized / historical data may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef` instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this needs revisiting :)
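As a rough illustration of the distinction, here is a minimal sketch with simplified, hypothetical field layouts - the real `BlockId`, `BlockRef` and ChainDAG types carry more state. A `BlockRef` links to its parent in memory and is therefore only kept for the unfinalized portion of the chain, while a `BlockId` is just a (root, slot) pair that can be resolved against a linear slot index for finalized history.

```nim
# Simplified sketch only -- the real definitions live in the nimbus-eth2
# ChainDAG code and differ in detail; names and fields here are illustrative.
type
  Eth2Digest = array[32, byte]  # stand-in for the real digest type
  Slot = uint64

  BlockId = object
    # Enough to identify a block; no links to other blocks, so finalized
    # history costs no extra memory beyond the (db-backed) slot index.
    root: Eth2Digest
    slot: Slot

  BlockRef = ref object
    # Kept only for the "hot", unfinalized part of the chain: the parent
    # pointer gives fast in-memory fork traversal, but retaining it for
    # all of history would grow memory with chain length.
    bid: BlockId
    parent: BlockRef

proc getBlockIdAtSlot(finalizedIndex: openArray[BlockId], slot: Slot): BlockId =
  # Finalized history is linear, so a slot-ordered index (on disk in the
  # real implementation, an array here) answers "which block covers slot X"
  # with a binary search instead of walking parent pointers.
  # Assumes a non-empty index whose first entry is at or before `slot`.
  var lo = 0
  var hi = finalizedIndex.high
  while lo < hi:
    let mid = (lo + hi + 1) div 2
    if finalizedIndex[mid].slot <= slot:
      lo = mid
    else:
      hi = mid - 1
  finalizedIndex[lo]
```

In the actual code, `parent` and `atSlot` for `BlockId` go through a `ChainDAGRef` instance (unlike `BlockRef` traversal), so the lookup can consult the database for finalized history.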
Gossip Processing
This folder holds a collection of modules to:
- validate raw gossip data before
  - rebroadcasting it (potentially aggregated)
  - sending it to one of the consensus object pools
Validation
Gossip validation is different from consensus verification, in particular for blocks. The per-topic rules are specified in the consensus spec; a simplified sketch of how they map to gossip validation results follows the list below.
- Blocks: https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#beacon_block
- Attestations (aggregated): https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#beacon_aggregate_and_proof
- Attestations (unaggregated): https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#attestation-subnets
- Voluntary exits: https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#voluntary_exit
- Proposer slashings: https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#proposer_slashing
- Attester slashings: https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#attester_slashing
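As a rough sketch of what these rules produce - simplified and hypothetical, since the real per-topic validators implement the exact checks from the links above and use libp2p's `ValidationResult` type mentioned below - gossip validation maps each message to Accept, Ignore or Reject:

```nim
# Illustrative sketch only: the real validators follow the linked spec
# sections exactly (including clock-disparity allowances, committee checks
# and first-seen rules omitted here); fields and helpers are hypothetical.
type
  ValidationResult = enum
    Accept  # relay on gossip and hand over to the local pools
    Ignore  # drop silently, e.g. not timely (no peer penalty)
    Reject  # drop and penalize the peer (provably invalid)

  Attestation = object
    slot: uint64
    hasValidSignature: bool  # stand-in for the real BLS signature check

proc validateAttestation(a: Attestation, currentSlot: uint64): ValidationResult =
  # Cheap timeliness checks come first, expensive cryptography last,
  # so bursts of useless messages are filtered at minimal cost.
  const attestationPropagationSlotRange = 32'u64  # per the phase0 p2p spec
  if a.slot > currentSlot:
    return Ignore                 # from the future: maybe clock skew, not malice
  if currentSlot - a.slot > attestationPropagationSlotRange:
    return Ignore                 # too old to still be useful
  if not a.hasValidSignature:
    return Reject                 # provably invalid: the sender is at fault
  Accept
```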
There are multiple consumers of validated consensus objects:
- a `ValidationResult.Accept` output triggers rebroadcasting in libp2p
  - We jump into method `validate(PubSub, Message)` in libp2p/protocols/pubsub/pubsub.nim
  - which was called by `rpcHandler(GossipSub, PubSubPeer, RPCMsg)`
- a `blockValidator` message enqueues the validated object to the processing queue in `block_processor`
  - `blockQueue: AsyncQueue[BlockEntry]` (shared with request_manager and sync_manager)
  - This queue is then regularly processed to be made available to the consensus object pools.
- a `xyzValidator` message adds the validated object to a pool in eth2_processor
  - Attestations (unaggregated and aggregated) get collected into batches.
  - Once a threshold is exceeded or after a timeout, they get validated together using BatchCrypto (a simplified sketch of this batching follows the list).
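A minimal sketch of the batching idea - not the actual BatchCrypto or eth2_processor API, the names below are made up: verifications are accumulated and flushed together once a size threshold is hit (with a timer flushing smaller batches in the real pipeline), because verifying many BLS signatures in one batch is much cheaper than verifying them one by one.

```nim
# Sketch only: the real implementation batches asynchronously (chronos),
# resolves a future per item with the shared result, and falls back to
# individual verification when a batch fails.
type
  PendingItem = object
    description: string  # stand-in for the signed object and its signature

  Batch = object
    pending: seq[PendingItem]

const batchSizeThreshold = 72  # hypothetical trigger; a timeout also flushes

proc flush(b: var Batch) =
  # This is where one combined signature verification would run for all
  # pending items, instead of one verification per item.
  echo "verifying ", b.pending.len, " signatures as a single batch"
  b.pending.setLen(0)

proc add(b: var Batch, item: PendingItem) =
  b.pending.add item
  if b.pending.len >= batchSizeThreshold:
    flush(b)
```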
Security concerns
As the first line of defense in Nimbus, modules must be able to handle bursts of data that may come:
- from malicious nodes trying to DOS us
- from long periods of non-finality, creating lots of forks and attestations (see the bounded-queue sketch below)
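One way to make such bursts survivable is to bound every intake queue, as in this sketch - illustrative only, since the actual Nimbus queues are chronos AsyncQueue instances with their own limits and back-pressure:

```nim
import std/deques

type
  GossipItem = object
    payload: seq[byte]

  BoundedQueue = object
    items: Deque[GossipItem]
    maxLen: int
    dropped: int  # kept for metrics / logging

proc initBoundedQueue(maxLen: int): BoundedQueue =
  BoundedQueue(items: initDeque[GossipItem](), maxLen: maxLen)

proc push(q: var BoundedQueue, item: GossipItem) =
  # Under a burst -- a DOS attempt, or a long non-finality period producing
  # many forks and attestations -- memory stays bounded: excess items are
  # counted and dropped rather than queued without limit.
  if q.items.len >= q.maxLen:
    inc q.dropped
  else:
    q.items.addLast item

proc pop(q: var BoundedQueue): GossipItem =
  q.items.popFirst()
```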