[RFC - Doc] State of Nimbus block & attestation flows (#2351)

* Expand documentation on block flow [skip ci]

* address review comments [skip ci]

* Update with GossipFlow out [skip ci]

* LocalBlockProposer -> LocalValidatorDuties +  WeakSubjectivitySync

* First outline of attestation flow

* finish up prose
This commit is contained in:
Mamy Ratsimbazafy 2021-03-01 11:22:16 +01:00 committed by GitHub
parent 9c241f805b
commit 08f063aba9
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 401 additions and 62 deletions

View File

@ -392,7 +392,7 @@ proc installBeaconApiHandlers*(rpcServer: RpcServer, node: BeaconNode) =
if head.slot >= blck.message.slot: if head.slot >= blck.message.slot:
node.network.broadcast(getBeaconBlocksTopic(node.forkDigest), blck) node.network.broadcast(getBeaconBlocksTopic(node.forkDigest), blck)
# The block failed validation, but was successfully broadcast anyway. # The block failed validation, but was successfully broadcast anyway.
# It was not integrated into the beacon node''s database. # It was not integrated into the beacon node's database.
return 202 return 202
else: else:
let res = proposeSignedBlock(node, head, AttachedValidator(), blck) let res = proposeSignedBlock(node, head, AttachedValidator(), blck)

30
docs/attestation_flow.dot Normal file
View File

@ -0,0 +1,30 @@
digraph architecture{
node [shape = invhouse]; Eth2RPC GossipSub;
node [shape = octagon]; AttestationPool Eth2Processor DoppelgangerDetection;
node [shape = doubleoctagon]; AsyncQueueAttestationEntry AsyncQueueAggregateEntry;
node [shape = octagon]; ForkChoice Quarantine;
{rank = same; Eth2RPC GossipSub;}
AsyncQueueAttestationEntry [label="AsyncQueue[AttestationEntry]"];
AsyncQueueAggregateEntry [label="AsyncQueue[AggregateEntry]"];
GossipSub -> Eth2Processor [label="getAttestationTopic: attestationValidator->attestationPool.validateAttestation()"];
GossipSub -> Eth2Processor [label="node.topicAggregateAndProofs: aggregateValidator->attestationPool.validateAggregate()"];
Eth2Processor -> AttestationAggregation [label="validateAttestation() validateAggregate()"];
Eth2Processor -> AttestationPool [label="attestation_pool.selectHead()"];
Eth2Processor -> DoppelgangerDetection
Eth2Processor -> AsyncQueueAttestationEntry [label="Move to queue after P2P validation via validateAttestation()"];
Eth2Processor -> AsyncQueueAggregateEntry [label="Move to queue after P2P validation via validateAggregate()"];
AsyncQueueAttestationEntry -> AttestationPool [label="runQueueProcessingLoop() -> processAttestation()\n->addAttestation (to fork choice)"];
AsyncQueueAggregateEntry -> AttestationPool [label="runQueueProcessingLoop() -> processAggregate()\n->addAttestation (to fork choice)"];
AttestationPool -> Quarantine [label="Missing block targets"];
AttestationPool -> ForkChoice [label="addAttestation(), prune(), selectHead()"];
Eth2RPC -> AttestationPool [dir="back" label="get_v1_beacon_pool_attestations() -> attestations iterator"]
LocalValidatorDuties -> AttestationPool [label="Vote for head\nMake a block (with local validators attestations)"];
LocalValidatorDuties -> AttestationAggregation [label="aggregate_attestations()"];
AttestationAggregation -> AttestationPool
}

111
docs/attestation_flow.md Normal file
View File

@ -0,0 +1,111 @@
# Attestation Flow
This is a WIP document to explain the attestation flows.
## Validation & Verification flow
Important distinction:
- We distinguish attestation `validation` which is defined in the P2P specs:
- single: https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_attestation_subnet_id
- aggregated: https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_aggregate_and_proof
A validated attestation or aggregate can be forwarded on gossipsub.
- and we distinguish `verification` which is defined in consensus specs:
https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/beacon-chain.md#attestations
An attestation needs to be verified to enter fork choice, or be included in a block.
To be clarified: from the specs it seems like gossip attestation validation is a superset of consensus attestation verification.
### Inputs
Attestations can be received from the following sources:
- Gossipsub
- Aggregate: `/eth2/{$forkDigest}/beacon_aggregate_and_proof/ssz` stored `topicAggregateAndProofs` field of the beacon node
- Unaggregated `/eth2/{$forkDigest}/beacon_attestation_{subnetIndex}/ssz`
- Included in blocks received
- the NBC database (within a block)
- a local validator vote
- Devtools: test suite, ncli, fuzzing
The related base types are
- Attestation
- IndexedAttestation
The base types are defined in the Eth2 specs.
On top, Nimbus builds new types to represent the level of trust and validation we have with regards to each BeaconBlock.
Those types allow the Nim compiler to help us ensure proper usage at compile-time and zero runtime cost.
#### TrustedAttestation & TrustedIndexedAttestation
An attestation or indexed_attestation that was verified as per the consensus spec or that was retrieved from the database or any source of trusted blocks is considered trusted. In practice we assume that its signature was already verified.
_TODO Note: it seems like P2P validation is a superset of consensus verification in terms of check and that we might use TrustedAttestation earlier in the attestation flow._
## Attestation processing architecture
How the various modules interact with block is described in a diagram:
![./attestation_flow.png](./attestation_flow.png)
Note: The Eth2Processor as 2 queues for attestations (before and after P2P validation) and 2 queues for aggregates. The diagram highlights that with separated `AsyncQueue[AttestationEntry]` and `AsyncQueue[AggregateEntry]`
### Gossip flow in
Attestatios are listened to via the gossipsub topics
- Aggregate: `/eth2/{$forkDigest}/beacon_aggregate_and_proof/ssz` stored `topicAggregateAndProofs` field of the beacon node
- Unaggregated `/eth2/{$forkDigest}/beacon_attestation_{subnetIndex}/ssz`
They are then
- validated by `validateAttestation()` or `validateAggregate()` in either `attestationValidator()` or `aggregateValidator()`
according to spec
- https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_aggregate_and_proof
- https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#attestation-subnets
- It seems like P2P validation is a superset of consensus verification as in `process_attestation()`:
- https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/beacon-chain.md#attestations
- Enqueued in
- `attestationsQueue: AsyncQueue[AttestationEntry]`
- `aggregatesQueue: AsyncQueue[AggregateEntry]`
- dropped in case of error
### Gossip flow out
- After validation in `attestationValidator()` or `aggregateValidator()` in the Eth2Processor
- Important: P2P validation is different from verification at the consensus level.
- We jump into libp2p/protocols/pubsub/pubsub.nim in the method `validate(PubSub, message)`
- which was called by `rpcHandler(GossipSub, PubSubPeer, RPCMsg)`
### Eth2 RPC in
There is no RPC for attestations but attestations might be included in synced blocks.
### Comments
### Sync vs Steady State
During sync the only attestation we receive are within synced blocks.
Afterwards attestations come from GossipSub
#### Bottlenecks during sync
During sync, attestations are not a bottleneck, they are a small part of the large block processing.
##### Backpressure
The `attestationsQueue` to store P2P validated attestations has a max size of `TARGET_COMMITTEE_SIZE * MAX_COMMITTEES_PER_SLOT` (128 * 64 = 8192)
The `aggregatesQueue` has a max size of `TARGET_AGGREGATORS_PER_COMMITTEE * MAX_COMMITTEES_PER_SLOT` (16 * 64 = 1024)
##### Latency & Throughput sensitiveness
We distinguish an aggregated attestation:
- aggregation is done according to eth2-spec rules, usually implies diffusion of aggregate signature and the aggregate public key is computed on-the-fly.
- and a batched validation (batching done client side), implies aggregation of signatures and public keys are done on-the-fly
During sync attestations are included in blocks and so do not require gossip validation,
they are also aggregated per validators and can be batched with other signatures within a block.
After sync, attestations need to be rebroadcasted fast to:
- maintain the quality of the GossipSub mesh
- not be booted by peers
Attestations need to be validated or aggregated fast to avoid delaying other networking operations, in particular they are bottleneck by cryptography.
The number of attestation to process grows with the number of peers. An aggregated attestation is as cheap to process as a non-aggregated one. Batching is worth it even with only 2 attestations to batch.

BIN
docs/attestation_flow.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 148 KiB

40
docs/block_flow.dot Normal file
View File

@ -0,0 +1,40 @@
digraph architecture{
node [shape = invhouse]; Eth2RPC GossipSub WeakSubjectivitySync;
node [shape = octagon]; SyncManager SyncProtocol RequestManager;
SyncManager [label="SyncManager (Sync in)"];
node [shape = doubleoctagon] SharedBlockQueue;
{rank = same; SyncManager RequestManager;}
{rank = same; Eth2RPC GossipSub WeakSubjectivitySync;}
WeakSubjectivitySync [label="WeakSubjectivitySync (impl TBD)"];
node [shape = octagon]; Eth2Processor RequestManager;
node [shape = octagon]; ChainDAG Quarantine Clearance;
Eth2RPC -> SyncProtocol [dir=both]
SyncProtocol -> SyncManager [dir=both, label="beaconBlocksByRange() (mixin)"]
GossipSub -> Eth2Processor [label="node.topicBeaconBlocks: blockValidator->isValidBeaconBlock (no transition or signature check yet)\nthen enqueued in blocksQueue"];
GossipSub -> Eth2Processor [dir=back, label="node.topicBeaconBlocks: blockValidator()->ValidationResult.Accept->libp2p/gossipsub.nim\nvalidate() in rpcHandler()"];
Eth2Processor -> Clearance [label="storeBlock(): enqueue in clearance/quarantine and callback to fork choice"];
SyncProtocol -> RequestManager [dir=both, label="fetchAncestorBlocksFromNetwork()"];
SyncManager -> SharedBlockQueue [dir=both, label="Eth2Processor.blocksQueue\n== SyncManager.outQueue (shared state!)"];
Eth2Processor -> SharedBlockQueue [dir=both, label="Eth2Processor.blocksQueue\n== RequestManager.outQueue (shared state!)"];
SharedBlockQueue -> RequestManager [dir=both, label="SyncManager.outQueue\n== RequestManager.outQueue (shared state!)"];
LocalValidatorDuties -> Clearance
RequestManager -> Quarantine [dir=back, label="Retrieve missing ancestors"]
Clearance -> Quarantine
Clearance -> ChainDAG
Eth2Processor -> ForkChoice
LocalValidatorDuties -> ForkChoice
node [shape = cylinder]; BeaconChainDB;
ChainDAG -> BeaconChainDB [dir=both]
SyncProtocol -> ChainDAG [dir=back, label="Sync out: getBlockRange()\nbeaconBlocksByRoot()\n"]
}

219
docs/block_flow.md Normal file
View File

@ -0,0 +1,219 @@
# Beacon Block Flow
This is a WIP document to explain the beacon block flows.
## Validation & Verification flow
Important distinction:
- We distinguish block `validation` which is defined in the P2P specs:
https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_block.
A validated block can be forwarded on gossipsub.
- and we distinguish `verification` which is defined in consensus specs:
https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/beacon-chain.md#block-processing
A block needs to be verified to enter fork choice, the DAG and the BeaconChainDB
In particular in terms of costly checks validating a block only requires checking:
- the **block proposer signature**
while verifying a block requires checking:
- the state transition
- the block proposer signature
- and also the signatures within a block:
- Randao Reveal
- Proposer slashings
- Attester slashings
- Attestations
- VoluntaryExits
### Inputs
Blocks can be received from the following sources:
- Gossipsub
- via `topicBeaconBlocks` (`&"/eth2/{$forkDigest}/{topicBeaconBlocksSuffix}"`). They are then validated by `blockValidator` (in `eth2_processor.nim`) which calls `isValidBeaconBlock`
- ETH2 RPC
- via SyncManager (when Nimbus syncs)
- via RequestManager (when Nimbus needs ancestor blocks)
- the NBC database
- a local validator block proposal
- Devtools: test suite, ncli, fuzzing
The related base types are:
- BeaconBlockBody
- BeaconBlock
- BeaconBlockBody
- + metadata (slot, blockchain state before/after, proposer)
- BeaconBlockHeader
- metadata (slot, blockchain state before/after, proposer)
- merkle hash of the BeaconBlockBody
- SignedBeaconBlock
- BeaconBlock
- + BLS signature
The base types are defined in the Eth2 specs.
On top, Nimbus builds new types to represent the level of trust and validation we have with regards to each BeaconBlock.
Those types allow the Nim compiler to help us ensure proper usage at compile-time and zero runtime cost.
#### BeaconBlocks
Those are spec-defined types.
On deserialization the SSZ code guarantees that BeaconBlock are correctly max-sized
according to:
- MAX_PROPOSER_SLASHINGS
- MAX_ATTESTER_SLASHINGS
- MAX_ATTESTATIONS
- MAX_DEPOSITS
- MAX_VOLUNTARY_EXITS
#### TrustedBeaconBlocks
A block that has been fully checked to be sound
both in regards to the blockchain protocol and its cryptographic signatures is known as a `TrustedBeaconBlock` or `TrustedSignedBeaconBlock`.
This allows skipping expensive signature checks.
Blocks are considered trusted if they come from:
- the NBC database
- produced by a local validator
#### SigVerifiedBeaconBlocks
A block with a valid cryptographic signature is considered SigVerified.
This is a weaker guarantee than Trusted as the block might still be invalid according to the state transition function.
Such a block are produced if incoming gossip blocks' signatures are batched together for batch verification **before** being passed to state transition.
#### TransitionVerifiedBeaconBlocks
A block that passes the state transition checks and can be successfully applied to the beacon chain is considered `TransitionVerified`.
Such a block can be produced if incoming blocks' signatures are batched together for batch verification **after** successfully passing state transition.
_This is not used in Nimbus at the moment_
## Block processing architecture
How the various modules interact with block is described in a diagram:
![./block_flow.png](./block_flow.png)
It is important to note that 3 data structures are sharing the same `AsyncQueue[BlockEntry]`:
- Eth2Processor.blocksQueue
- SyncManager.outQueue
- RequestManager.outQueue
### Gossip flow in
Blocks are listened to via the gossipsub topic `/eth2/{$forkDigest}/beacon_block/ssz` (`topicBeaconBlocks` variable)
They are then:
- validated by `blockValidator()` in the Eth2Processor by `isValidBeaconBlock()` according to spec https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_block
- Important: P2P validation is not full verification (state transition and internal cryptographic signatures were not checked)
- enqueued in the shared block queue `AsyncQueue[BlockEntry]` in case of success
- dropped in case of error
Logs:
- debug: "Dropping already-seen gossip block"
- debug: "Block received"
- trace: "Block validated"
### Gossip flow out
- After validation in `blockValidator()` in the Eth2Processor by `isValidBeaconBlock()` according to spec https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#beacon_block
- Important: P2P validation is not full verification (state transition and internal cryptographic signatures were not checked)
- We jump into libp2p/protocols/pubsub/pubsub.nim in the method `validate(PubSub, message)`
- which was called by `rpcHandler(GossipSub, PubSubPeer, RPCMsg)`
### Eth2 RPC in
Blocks are requested during sync by the SyncManager.
Blocks are received by batch:
- `syncStep(SyncManager, index, peer)`
- in case of success:
- `push(SyncQueue, SyncRequest, seq[SignedBeaconBlock]) is called to handle a successful sync step.
It calls `validate(SyncQueue, SignedBeaconBlock)` on each block retrieved one-by-one
- `validate` only enqueues the block in the `AsyncQueue[BlockEntry]` but does no extra validation only the GossipSub case
- in case of failure:
- `push(SyncQueue, SyncRequest)` is called to reschedule the sync request.
Every second when sync is not in progress, the beacon node will ask the RequestManager to download all missing blocks currently in quarantaine.
- via `handleMissingBlocks`
- which calls `fetchAncestorBlocks`
- which asynchronously enqueue the request
The RequestManager runs an event loop:
- that calls `fetchAncestorBlocksFromNetwork`
- which RPC calls peers with `beaconBlocksByRoot`
- and calls `validate(RequestManager, SignedBeaconBlock)` on each block retrieved one-by-one
- `validate` only enqueues the block in the `AsyncQueue[BlockEntry]` but does no extra validation only the GossipSub case
### Weak subjectivity sync
Not implemented!
### Consuming the shared block queue
The SharedBlockQueue is consumed by the Eth2Processor via `runQueueProcessingLoop(Eth2Processor)`
To mitigate blocking networking and timeshare between Io and compute, blocks are processed 1 by 1 by `processBlock(Eth2Processor, BlockEntry)`
This in turn calls:
- `storeBlock(Eth2Processor, SignedBeaconBlock, Slot)`
- `addRawBlock(ChainDagRef, QuarantineRef, SignedBeaconBlock, forkChoiceCallback)`
- trigger sending attestation if relevant
### Comments
The `validate` procedure name for `SyncManager` and `RequestManager`
as no P2P validation actually occurs.
Furthermore it seems like we are not gossiping blocks as soon as they are validated. TODO: When do we gossip them? Are we waiting for full verification.
### Sync vs Steady State
During sync:
- The RequestManager is deactivated
- The syncManager is working full speed ahead
- Gossip is deactivated
#### Bottlenecks during sync
During sync:
- The bottleneck is clearing the SharedBlockQueue via `storeBlock`
which requires full verification (state transition + cryptography)
##### Backpressure
The SyncManager handles backpressure by ensuring that
`current_queue_slot <= request.slot <= current_queue_slot + sq.queueSize * sq.chunkSize`.
- queueSize is -1 unbounded by default according to comment but it seems like all init path uses 1 (?)
- chunkSize is SLOTS_PER_EPOCH = 32
However the shared `AsyncQueue[BlockEntry]` is unbounded (?) with.
RequestManager and Gossip are deactivated during sync and so do not contribute to pressure
##### Latency & Throughput sensitiveness
Latency TODO
Throughput:
- storeBlock is bottlenecked by verification (state transition and cryptography) of a block.
It might also be bottlenecked by the BeaconChainDB database speed.
#### Bottlenecks when synced
The SyncManager is deactivated
RequestManager and Gossip are activated.
##### Backpressure
There is no backpressure handling at the RequestManager and Gossip level with regards to the SharedBlockQueue
There is backpressure handling at the Quarantine level:
- Blocks in the SharedBlockQueue that are missing parents
are put in quarantine, only 16 can be stored and new candidate are dropped as long as the older ones are unresolved.
##### Latency & Throughput sensitiveness
When synced, blocks are a small part of the whole processing compared to attestations, there is no more throughput constraint however a single block should be processed ASAP as it blocks attestation flow.
Furthermore to ensure the stability of the gossip mesh, blocks should be validated and rebroadcasted ASAP as well.

BIN
docs/block_flow.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 182 KiB

View File

@ -1,61 +0,0 @@
# Block Validation Flow
This is a WIP document to explain the block validation flow.
This should be transformed into diagram that explain
the implicit block validation state machine.
## Inputs
Blocks can be received from the following sources:
- Gossipsub
- the NBC database
- a local validator block proposal
- Devtools: test suite, ncli, fuzzing
The related base types are:
- BeaconBlockBody
- BeaconBlock
- BeaconBlockBody
- + metadata (slot, blockchain state before/after, proposer)
- BeaconBlockHeader
- metadata (slot, blockchain state before/after, proposer)
- merkle hash of the BeaconBlockBody
- SignedBeaconBlock
- BeaconBlock
- + BLS signature
The base types are defined in the Eth2 specs.
On top, Nimbus builds new types to represent the level of trust and validation we have with regards to each BeaconBlock.
Those types allow the Nim compiler to help us ensure proper usage at compile-time and zero runtime cost.
### BeaconBlocks
Those are spec-defined types.
On deserialization the SSZ code guarantees that BeaconBlock are correctly max-sized
according to:
- MAX_PROPOSER_SLASHINGS
- MAX_ATTESTER_SLASHINGS
- MAX_ATTESTATIONS
- MAX_DEPOSITS
- MAX_VOLUNTARY_EXITS
### TrustedBeaconBlocks
A block that has been fully checked to be sound
both in regards to the blockchain protocol and its cryptographic signatures is known as a `TrustedBeaconBlock` or `TrustedSignedBeaconBlock`.
This allows skipping expensive signature checks.
Blocks are considered trusted if they come from:
- the NBC database
- produced by a local validator
### SigVerifiedBeaconBlocks
A block with a valid cryptographic signature is considered SigVerified.
This is a weaker guarantee than Trusted as the block might still be invalid according to the state transition function.
Such a block are produced if incoming gossip blocks' signatures are batched together for batch verification **before** being passed to state transition.
### TransitionVerifiedBeaconBlocks
A block that passes the state transition checks and can be successfully applied to the beacon chain is considered `TransitionVerified`.
Such a block can be produced if incoming blocks' signatures are batched together for batch verification **after** successfully passing state transition.