eec6c04d32
When the EL fails to respond to `newPayload`, e.g., because the connection to the EL got interrupted or due to misconfiguration, optimistic blocks cannot be imported according to spec. This condition is treated the same as if the peer had returned a block with a missing parent, which gets the block out of our processing queue but can have nasty side effects. For example, if the sync manager asks for validation of a block known to be in the finalized range and receives a `MissingParent` verdict, the peer is immediately removed from the peer pool.

```
DBG 2022-08-24 11:45:26.874+02:00 newPayload: inserting block into execution engine parentHash=e4ca7424 blockHash=36cdc198 stateRoot=cf3902c1 receiptsRoot=56e81f17 prevRandao=0b49a172 blockNumber=1518089 gasLimit=30000000 gasUsed=0 timestamp=1657980396 extraDataLen=0 baseFeePerGas=7 numTransactions=0
ERR 2022-08-24 11:45:26.875+02:00 newPayload failed msg="Transport is not initialised (missing a call to connect?)"
DBG 2022-08-24 11:45:26.875+02:00 Block pool rejected peer's response topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx blocks_count=31 ok=false unviable=false missing_parent=true sync_ident=main
ERR 2022-08-24 11:45:26.875+02:00 Unexpected missing parent at finalized epoch slot topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward rewind_to_slot=187232 blocks_count=31 blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx sync_ident=main
DBG 2022-08-24 11:45:26.875+02:00 Peer was removed from PeerPool due to low score topics="beacnde" peer=16U*MsCJdx peer_score=-1000 score_low_limit=0 score_high_limit=1000
DBG 2022-08-24 11:45:26.875+02:00 Lost connection to peer topics="networking" peer=16U*MsCJdx connections=0
```

By delaying the verdict until the EL connection is restored and `newPayload` has run successfully, the problem should be fixed. This also applies back pressure to the sync manager by stopping the download of new blocks (or the re-downloading of the same block over and over again).
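Below is a minimal sketch of that idea, not the actual nimbus-eth2 implementation: keep retrying `newPayload` while the EL is unreachable instead of returning a `MissingParent`-style verdict. The names `ExecutionEngine`, `PayloadStatus`, `sendNewPayload` and `expectValidPayload` are hypothetical placeholders; only the use of chronos for async matches the code base.

```nim
import chronos

type
  PayloadStatus = enum
    psValid, psInvalid, psSyncing

  ExecutionEngine = object
    connected: bool

proc sendNewPayload(el: ExecutionEngine): Future[PayloadStatus] {.async.} =
  # stand-in for the real engine API call; fails while the transport is down
  if not el.connected:
    raise newException(IOError, "Transport is not initialised")
  return psValid

proc expectValidPayload(el: ExecutionEngine): Future[PayloadStatus] {.async.} =
  # Instead of translating an unreachable EL into a MissingParent-style
  # verdict, keep retrying; the caller (block processor / sync manager)
  # simply waits, which is the back pressure described above.
  while true:
    try:
      return await el.sendNewPayload()
    except CatchableError as exc:
      debugEcho "newPayload failed, retrying: ", exc.msg
      await sleepAsync(1.seconds)

when isMainModule:
  let el = ExecutionEngine(connected: true)
  let status = waitFor expectValidPayload(el)
  echo "payload status: ", status
```

Because the caller simply does not get an answer until the EL is back, the sync manager naturally stops asking for more blocks, which is the back pressure mentioned above.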
Files in this folder:
- README.md
- batch_validation.nim
- block_processor.nim
- eth2_processor.nim
- gossip_validation.nim
- light_client_processor.nim
README.md
Gossip Processing
This folder holds a collection of modules to:
- validate raw gossip data before
  - rebroadcasting it (potentially aggregated)
  - sending it to one of the consensus object pools
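As a rough illustration of that flow (made-up names, not the actual module APIs): a message is checked first, and only forwarded and handed to consensus processing if it passes.

```nim
type
  GossipOutcome = enum
    goAccept, goReject, goIgnore

  RawGossip = object
    data: seq[byte]

proc validateGossip(msg: RawGossip): GossipOutcome =
  # placeholder rule: an empty message is malformed
  if msg.data.len == 0:
    goReject
  else:
    goAccept

proc rebroadcast(msg: RawGossip) =
  discard # in reality handled by libp2p gossipsub

proc enqueueForConsensus(msg: RawGossip) =
  discard # e.g. hand off to a block queue or attestation pool

proc onGossip(msg: RawGossip) =
  if validateGossip(msg) == goAccept:
    rebroadcast(msg)
    enqueueForConsensus(msg)
```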
Validation
Gossip validation is different from consensus verification, in particular for blocks.
- Blocks: https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#beacon_block
- Attestations (aggregated): https://github.com/ethereum/consensus-specs/blob/v1.1.10/specs/phase0/p2p-interface.md#beacon_aggregate_and_proof
- Attestations (unaggregated): https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.2/specs/phase0/p2p-interface.md#attestation-subnets
- Voluntary exits: https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.2/specs/phase0/p2p-interface.md#voluntary_exit
- Proposer slashings: https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.2/specs/phase0/p2p-interface.md#proposer_slashing
- Attester slashings: https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.2/specs/phase0/p2p-interface.md#attester_slashing
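To make the distinction concrete, here is a minimal sketch of what a gossip-level check looks like. The types, the thresholds and the `currentSlot` parameter are illustrative stand-ins (the real rules live in the specs linked above and in gossip_validation.nim), and the local `ValidationResult` enum only mirrors the libp2p outcome referenced below.

```nim
type
  ValidationResult = enum
    Accept, Reject, Ignore

  SignedBeaconBlockStub = object
    slot: uint64
    proposerIndex: uint64

proc validateBeaconBlockGossip(
    blck: SignedBeaconBlockStub, currentSlot: uint64): ValidationResult =
  ## Cheap gossip-level checks only; full consensus verification happens
  ## later in the block processor and the consensus object pools.
  if blck.slot > currentSlot + 1:
    # block from the future: don't penalise the peer, just don't propagate it
    return Ignore
  if blck.proposerIndex > 1_000_000'u64:
    # obviously malformed content: reject, letting gossipsub descore the peer
    return Reject
  return Accept
```

The Ignore/Reject split matters for peer scoring: rejected messages count against the sending peer, ignored ones do not.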
There are multiple consumers of validated consensus objects:
- a `ValidationResult.Accept` output triggers rebroadcasting in libp2p
  - We jump into method `validate(PubSub, Message)` in libp2p/protocols/pubsub/pubsub.nim
  - which was called by `rpcHandler(GossipSub, PubSubPeer, RPCMsg)`
- a `blockValidator` message enqueues the validated object to the processing queue in `block_processor`
  - `blockQueue: AsyncQueue[BlockEntry]` (shared with request_manager and sync_manager)
  - This queue is then regularly processed to be made available to the consensus object pools.
- an `xyzValidator` message adds the validated object to a pool in `eth2_processor`
  - Attestations (unaggregated and aggregated) get collected into batches.
  - Once a threshold is exceeded or after a timeout, they get validated together using `BatchCrypto`.
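The batching step can be pictured with the following sketch; the names (`PendingSignature`, `verifyBatch`, `addForBatchVerification`) and the threshold/timeout values are assumptions for illustration, not the actual batch_validation.nim API.

```nim
import chronos

const BatchThreshold = 64  # verify as soon as this many signatures are queued

type
  PendingSignature = object
    # stand-in for the (public key, message, signature) triple that a real
    # batch verifier would accumulate
    label: string

var pending: seq[PendingSignature]

proc verifyBatch(batch: seq[PendingSignature]) =
  # placeholder for a single batched BLS verification over all queued items
  echo "verifying ", batch.len, " signatures in one batch"

proc flush() =
  if pending.len > 0:
    verifyBatch(pending)
    pending.setLen(0)

proc addForBatchVerification(sig: PendingSignature) {.async.} =
  pending.add sig
  if pending.len >= BatchThreshold:
    flush()                            # threshold reached: verify immediately
  elif pending.len == 1:
    await sleepAsync(10.milliseconds)  # the first entry arms a timeout
    flush()                            # timeout expired: verify what we have

when isMainModule:
  proc demo() {.async.} =
    var futs: seq[Future[void]]
    for i in 0 ..< 100:
      futs.add addForBatchVerification(PendingSignature(label: "sig " & $i))
    for f in futs:
      await f
  waitFor demo()
```

A production version would typically hand each caller a future that completes once its batch has been verified, rather than printing; the point here is only the "threshold or timeout, whichever comes first" trigger.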
Security concerns
As the first line of defense in Nimbus, these modules must be able to handle bursts of data that may come:
- from malicious nodes trying to DoS us
- from long periods of non-finality, which create lots of forks and attestations
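One common way to cope with such bursts, shown below purely as an illustration (an assumption, not a description of what nimbus-eth2 actually does), is to bound the amount of pending work and shed the excess; the names and the limit are made up.

```nim
const MaxPendingAttestations = 16_384  # illustrative cap on pending work

type AttestationStub = object
  slot: uint64

var pendingAttestations: seq[AttestationStub]

proc tryEnqueue(att: AttestationStub): bool =
  ## Returns false (dropping the item) when the buffer is full, so a flood of
  ## forks and attestations degrades into dropped messages rather than
  ## unbounded memory growth.
  if pendingAttestations.len >= MaxPendingAttestations:
    return false
  pendingAttestations.add att
  return true
```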