mirror of
https://github.com/status-im/nimbus-eth2.git
synced 2025-01-09 22:06:21 +00:00
do not descore peer when EL connection fails (#4020)
When the EL fails to respond to `newPayload`, e.g., because connection to the EL got interrupted, or due to misconfiguration, optimistic blocks cannot be imported according to spec. This condition is treated the same as if the peer returned a block with missing parent which gets the block out of our processing queue, but can have nasty side effects. For example, if sync manager asks for validation of a block known to be in the finalized range, if it receives a `MissingParent` verdict, the peer is immediately removed from the peer pool. ``` DBG 2022-08-24 11:45:26.874+02:00 newPayload: inserting block into execution engine parentHash=e4ca7424 blockHash=36cdc198 stateRoot=cf3902c1 receiptsRoot=56e81f17 prevRandao=0b49a172 blockNumber=1518089 gasLimit=30000000 gasUsed=0 timestamp=1657980396 extraDataLen=0 baseFeePerGas=7 numTransactions=0 ERR 2022-08-24 11:45:26.875+02:00 newPayload failed msg="Transport is not initialised (missing a call to connect?)" DBG 2022-08-24 11:45:26.875+02:00 Block pool rejected peer's response topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx blocks_count=31 ok=false unviable=false missing_parent=true sync_ident=main ERR 2022-08-24 11:45:26.875+02:00 Unexpected missing parent at finalized epoch slot topics="syncman" request=187232:32@1475 peer=16U*MsCJdx direction=forward rewind_to_slot=187232 blocks_count=31 blocks_map=xxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxx sync_ident=main DBG 2022-08-24 11:45:26.875+02:00 Peer was removed from PeerPool due to low score topics="beacnde" peer=16U*MsCJdx peer_score=-1000 score_low_limit=0 score_high_limit=1000 DBG 2022-08-24 11:45:26.875+02:00 Lost connection to peer topics="networking" peer=16U*MsCJdx connections=0 ``` By delaying issuing a verdict until the EL connection is restored and `newPayload` successfully ran, the problem should be fixed. This also induces back pressure to the sync manager by stopping download of new blocks (or re-downloading the same block over and over again).
This commit is contained in:
parent
492a2ccfac
commit
eec6c04d32
@ -506,6 +506,17 @@ proc runQueueProcessingLoop*(self: ref BlockProcessor) {.async.} =
|
|||||||
# TODO detect have-TTD-but-not-is_execution_block case, and where
|
# TODO detect have-TTD-but-not-is_execution_block case, and where
|
||||||
# execution payload was non-zero when TTD detection more reliable
|
# execution payload was non-zero when TTD detection more reliable
|
||||||
when true:
|
when true:
|
||||||
|
# When an execution engine returns an error or fails to respond to a
|
||||||
|
# payload validity request for some block, a consensus engine:
|
||||||
|
# - MUST NOT optimistically import the block.
|
||||||
|
# - MUST NOT apply the block to the fork choice store.
|
||||||
|
# - MAY queue the block for later processing.
|
||||||
|
# https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.3/sync/optimistic.md#execution-engine-errors
|
||||||
|
template reEnqueueBlock: untyped =
|
||||||
|
await sleepAsync(chronos.seconds(1))
|
||||||
|
self[].addBlock(
|
||||||
|
blck.src, blck.blck, blck.resfut, blck.validationDur)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Minimize window for Eth1 monitor to shut down connection
|
# Minimize window for Eth1 monitor to shut down connection
|
||||||
await self.consensusManager.eth1Monitor.ensureDataProvider()
|
await self.consensusManager.eth1Monitor.ensureDataProvider()
|
||||||
@ -520,22 +531,14 @@ proc runQueueProcessingLoop*(self: ref BlockProcessor) {.async.} =
|
|||||||
|
|
||||||
let executionPayloadStatus = await newExecutionPayload(
|
let executionPayloadStatus = await newExecutionPayload(
|
||||||
self.consensusManager.eth1Monitor, executionPayload)
|
self.consensusManager.eth1Monitor, executionPayload)
|
||||||
|
|
||||||
if executionPayloadStatus.isNone:
|
if executionPayloadStatus.isNone:
|
||||||
# https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.3/sync/optimistic.md#execution-engine-errors
|
reEnqueueBlock()
|
||||||
if not blck.resfut.isNil:
|
|
||||||
blck.resfut.complete(
|
|
||||||
Result[void, BlockError].err(BlockError.MissingParent))
|
|
||||||
continue
|
continue
|
||||||
|
|
||||||
executionPayloadStatus.get
|
executionPayloadStatus.get
|
||||||
except CatchableError as err:
|
except CatchableError as err:
|
||||||
error "runQueueProcessingLoop: newPayload failed",
|
error "runQueueProcessingLoop: newPayload failed", err = err.msg
|
||||||
err = err.msg
|
reEnqueueBlock()
|
||||||
# https://github.com/ethereum/consensus-specs/blob/v1.2.0-rc.3/sync/optimistic.md#execution-engine-errors
|
|
||||||
if not blck.resfut.isNil:
|
|
||||||
blck.resfut.complete(
|
|
||||||
Result[void, BlockError].err(BlockError.MissingParent))
|
|
||||||
continue
|
continue
|
||||||
else:
|
else:
|
||||||
debug "runQueueProcessingLoop: got execution payload before TTD"
|
debug "runQueueProcessingLoop: got execution payload before TTD"
|
||||||
|
Loading…
x
Reference in New Issue
Block a user