Jacek Sieka 05ffe7b2bf
Prune BlockRef on finalization (#3513)
Up until now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.

By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250 MB (roughly 25%) chunk of memory,
landing us at a steady state of ~750 MB normal memory usage for a
validating node.

Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time - instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).

In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensures that consensus performance remains unchanged.

What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no perceivable cost to functionality or performance.

A nice collateral benefit is the almost-instant startup since we no
longer load any large indices at dag init.

The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.

* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove `dag.finalizedBlocks` (~50 MB)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200 MB)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this needs revisiting :)
2022-03-17 17:42:56 +00:00

Gossip Processing

This folder holds a collection of modules to:

  • validate raw gossip data before
    • rebroadcasting it (potentially aggregated)
    • sending it to one of the consensus object pools

Validation

Gossip validation is different from consensus verification, in particular for blocks.

There are multiple consumers of validated consensus objects:

  • a ValidationResult.Accept output triggers rebroadcasting in libp2p
    • We jump into method validate(PubSub, Message) in libp2p/protocols/pubsub/pubsub.nim
    • which was called by rpcHandler(GossipSub, PubSubPeer, RPCMsg)
  • a blockValidator message enqueues the validated object to the processing queue in block_processor
    • blockQueue: AsyncQueue[BlockEntry] (shared with request_manager and sync_manager)
    • This queue is then regularly processed to be made available to the consensus object pools.
  • an xyzValidator message adds the validated object to a pool in eth2_processor
    • Attestations (unaggregated and aggregated) get collected into batches.
    • Once a threshold is exceeded or after a timeout, they get validated together using BatchCrypto.

Security concerns

As the first line of defense in Nimbus, these modules must be able to handle bursts of data that may come:

  • from malicious nodes trying to DoS us
  • from long periods of non-finality, creating lots of forks and attestations
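One generic way to tolerate such bursts is to bound the amount of queued work and shed the excess. The text above does not say how the Nimbus modules do this, so the following Python sketch is only an illustration of the idea; `BoundedQueue`, `offer`, `take` and the drop-newest policy are all assumptions.

```python
from collections import deque


class BoundedQueue:
    """Fixed-capacity FIFO that rejects new items when full (hypothetical)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0  # number of items rejected because the queue was full

    def offer(self, item) -> bool:
        """Try to enqueue; return False and count a drop if the queue is full."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def take(self):
        """Return the oldest queued item, or None if the queue is empty."""
        return self.items.popleft() if self.items else None
```

With a bound like this, memory use stays constant no matter how fast data arrives, and the `dropped` counter gives a simple signal that the node is under load.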