Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
..

Gossip Processing

This folder holds a collection of modules to:

  • validate raw gossip data before
    • rebroadcasting it (potentially aggregated)
    • sending it to one of the consensus object pools

Validation

Gossip validation is different from consensus verification in particular for blocks.

There are multiple consumers of validated consensus objects:

  • a ValidationResult.Accept output triggers rebroadcasting in libp2p
    • We jump into method validate(PubSub, Message) in libp2p/protocols/pubsub/pubsub.nim
    • which was called by rpcHandler(GossipSub, PubSubPeer, RPCMsg)
  • a blockValidator message enqueues the validated object to the processing queue in block_processor
    • blockQueue: AsyncQueue[BlockEntry] (shared with request_manager and sync_manager)
    • This queue is then regularly processed to be made available to the consensus object pools.
  • a xyzValidator message adds the validated object to a pool in eth2_processor
    • Attestations (unaggregated and aggregated) get collected into batches.
    • Once a threshold is exceeded or after a timeout, they get validated together using BatchCrypto.

Security concerns

As the first line of defense in Nimbus, modules must be able to handle bursts of data that may come:

  • from malicious nodes trying to DOS us
  • from long periods of non-finality, creating lots of forks, attestations