nimbus-eth2/beacon_chain/gossip_processing
Jacek Sieka 9c2f43ed0e
Speed up altair block processing 2x (#3115)
* Speed up altair block processing >2x

Like #3089, this PR drastically speeds up historical REST queries and
other long state replays.

* cache sync committee validator indices
* use ~80mb less memory for validator pubkey mappings
* batch-verify sync aggregate signature (fixes #2985)
* document sync committee hack with head block vs sync message block
* add batch signature verification failure tests

Before:

```
../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    5830.675,        0.000,     5830.675,     5830.675,            1, Initialize DB
       0.481,        1.878,        0.215,       59.167,          981, Load block from database
    8422.566,        0.000,     8422.566,     8422.566,            1, Load state from database
       6.996,        1.678,        0.042,       14.385,          969, Advance slot, non-epoch
      93.217,        8.318,       84.192,      122.209,           32, Advance slot, epoch
      20.513,       23.665,       11.510,      201.561,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

After:

```
    7081.422,        0.000,     7081.422,     7081.422,            1, Initialize DB
       0.553,        2.122,        0.175,       66.692,          981, Load block from database
    5439.446,        0.000,     5439.446,     5439.446,            1, Load state from database
       6.829,        1.575,        0.043,       12.156,          969, Advance slot, non-epoch
      94.716,        2.749,       88.395,      100.026,           32, Advance slot, epoch
      11.636,       23.766,        4.889,      205.250,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```
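
As a rough reading of the two tables, the speedup for a row is the "Before" time divided by the "After" time. The sketch below only does that arithmetic on figures copied from the tables. It is written in Python purely for illustration, and the dictionary and variable names are not part of the PR.

```python
# Figures copied from the "Before" and "After" benchmark tables (all in ms).
before = {
    "Load state from database": 8422.566,
    "Apply block, no slot processing (average)": 20.513,
    "Apply block, no slot processing (min)": 11.510,
}
after = {
    "Load state from database": 5439.446,
    "Apply block, no slot processing (average)": 11.636,
    "Apply block, no slot processing (min)": 4.889,
}

# Speedup = time before / time after; a value above 1 means "faster after".
speedup = {name: before[name] / after[name] for name in before}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

On these numbers the fastest block applications (the Min column) improve by more than 2x, while the average improves by somewhat less because the slowest blocks still take around 200 ms in both runs.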

* add comment
2021-11-24 13:43:50 +01:00
| File | Last commit | Date |
|---|---|---|
| README.md | update 22 spec URLs to v1.1.5 (#3111) | 2021-11-18 08:08:00 +00:00 |
| batch_validation.nim | import cleanup (#2997) | 2021-10-19 16:09:26 +02:00 |
| block_processor.nim | Better REST/RPC error messages (#3046) | 2021-11-05 17:39:47 +02:00 |
| consensus_manager.nim | disentangle eth2 types from the ssz library (#2785) | 2021-08-18 20:57:58 +02:00 |
| eth2_processor.nim | Better REST/RPC error messages (#3046) | 2021-11-05 17:39:47 +02:00 |
| gossip_validation.nim | Speed up altair block processing 2x (#3115) | 2021-11-24 13:43:50 +01:00 |

README.md

Gossip Processing

This folder holds a collection of modules to:

  • validate raw gossip data before
    • rebroadcasting it (potentially aggregated)
    • sending it to one of the consensus object pools

Validation

Gossip validation is different from consensus verification, in particular for blocks.

There are multiple consumers of validated consensus objects:

  • a ValidationResult.Accept output triggers rebroadcasting in libp2p
    • We jump into method validate(PubSub, Message) in libp2p/protocols/pubsub/pubsub.nim
    • which was called by rpcHandler(GossipSub, PubSubPeer, RPCMsg)
  • a blockValidator message enqueues the validated object to the processing queue in block_processor
    • blocksQueue: AsyncQueue[BlockEntry] (shared with request_manager and sync_manager)
    • This queue is then regularly processed to be made available to the consensus object pools.
  • an xyzValidator message adds the validated object to a pool in eth2_processor
    • Attestations (unaggregated and aggregated) get collected into batches.
    • Once a threshold is exceeded or after a timeout, they get validated together using BatchCrypto.

Security concerns

As the first line of defense in Nimbus, modules must be able to handle bursts of data that may come:

  • from malicious nodes trying to DoS us
  • from long periods of non-finality, creating lots of forks and attestations
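
One generic way to survive such bursts is to bound the amount of pending work and to count what gets rejected. The sketch below illustrates that idea in Python. It is an assumption about a possible approach, not a description of how the Nimbus queues behave, and `BoundedQueue`, `offer` and `take` are hypothetical names.

```python
from collections import deque
from typing import Deque, Generic, Optional, TypeVar

T = TypeVar("T")


class BoundedQueue(Generic[T]):
    """FIFO queue that rejects new items once it is full (hypothetical helper)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._items: Deque[T] = deque()
        self._capacity = capacity
        self.dropped = 0  # number of items rejected because the queue was full

    def offer(self, item: T) -> bool:
        """Try to enqueue; return False (and count a drop) when the queue is full."""
        if len(self._items) >= self._capacity:
            self.dropped += 1
            return False
        self._items.append(item)
        return True

    def take(self) -> Optional[T]:
        """Return the oldest item, or None when the queue is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)
```

With a fixed capacity, memory use stays bounded however fast peers send data, and the `dropped` counter gives a simple signal that the node is under pressure.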