2020-05-19 16:18:07 +02:00
# beacon_chain
2022-01-26 12:21:29 +00:00
# Copyright (c) 2018-2022 Status Research & Development GmbH
2020-05-19 16:18:07 +02:00
# Licensed and distributed under either of
# * MIT license (license terms in the root directory or at https://opensource.org/licenses/MIT).
# * Apache v2 license (license terms in the root directory or at https://www.apache.org/licenses/LICENSE-2.0).
# at your option. This file may not be copied, modified, or distributed except according to those terms.

2022-07-29 12:53:42 +02:00
when (NimMajor, NimMinor) < (1, 4):
  {.push raises: [Defect].}
else:
  {.push raises: [].}
2020-06-16 07:45:04 +02:00

2020-05-19 16:18:07 +02:00
import
2020-07-28 15:54:32 +02:00
  chronicles,
2020-12-16 09:37:22 +01:00
  stew/[assign2, results],
2022-07-06 03:33:02 -07:00
  ../spec/[
    beaconstate, forks, signatures, signatures_batch,
    state_transition, state_transition_epoch],
2022-03-11 21:28:10 +01:00
  "."/[block_dag, blockchain_dag, blockchain_dag_light_client]
2020-05-19 16:18:07 +02:00

2022-11-02 16:23:30 +00:00
# TODO remove when forks re-exports this
from ../spec/datatypes/capella import asSigVerified, asTrusted, shortLog

limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks
Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.
We can distinguish between two cases where by-root access is useful:
* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really
In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.
Future work includes:
* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)
Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.
* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now on - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`
* fix dag.blocks ref
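The split between by-root and by-slot access described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Nimbus implementation: `MiniDag`, `MiniBlockRef`, `Root`, `lookupByRoot` and `rootAtSlot` are hypothetical names, block roots are plain strings, and `Option` from `std/options` stands in for the `Opt` type mentioned in the message.

```nim
import std/[options, tables]

type
  Root = string                  # stand-in for a 32-byte block root (assumption)
  MiniBlockRef = ref object
    root: Root
    slot: int

  MiniDag = object
    forkBlocks: Table[Root, MiniBlockRef]  # by-root index: unfinalized blocks only
    finalizedRoots: seq[Root]              # finalized history, indexed by slot

proc lookupByRoot(dag: MiniDag, root: Root): Option[MiniBlockRef] =
  ## By-root access only covers the unfinalized part of the chain; a miss is
  ## reported as `none` rather than `nil`.
  if root in dag.forkBlocks:
    some(dag.forkBlocks[root])
  else:
    none(MiniBlockRef)

proc rootAtSlot(dag: MiniDag, slot: int): Option[Root] =
  ## Finalized blocks are addressed by slot number instead.
  if slot >= 0 and slot < dag.finalizedRoots.len:
    some(dag.finalizedRoots[slot])
  else:
    none(Root)

var dag = MiniDag(finalizedRoots: @["genesis", "a"])
dag.forkBlocks["b"] = MiniBlockRef(root: "b", slot: 2)

doAssert dag.lookupByRoot("b").isSome
doAssert dag.lookupByRoot("a").isNone   # finalized, so not in the by-root index
doAssert dag.rootAtSlot(1) == some("a")
```

Returning an `Option` forces callers to handle the "not found" case explicitly, which matters once misses become the common outcome for finalized roots.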
2022-01-21 12:33:16 +01:00
export results, signatures_batch, block_dag, blockchain_dag
2020-05-21 19:08:31 +02:00

2020-05-19 16:18:07 +02:00
# Clearance
# ---------------------------------------------
#
# This module is in charge of making the
# "quarantined" network blocks
2020-07-30 22:18:17 +03:00
# pass the firewall and be stored in the chain DAG
2020-05-19 16:18:07 +02:00

2020-06-16 07:45:04 +02:00
logScope:
  topics = "clearance"
2020-05-19 16:18:07 +02:00

Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.
When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.
When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.
Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.
This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.
Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
proc addResolvedHeadBlock(
2021-12-06 10:49:01 +01:00
    dag: ChainDAGRef,
2022-03-16 08:20:40 +01:00
    state: var ForkedHashedBeaconState,
2021-11-05 08:34:34 +01:00
    trustedBlock: ForkyTrustedSignedBeaconBlock,
2022-07-04 20:35:33 +00:00
    blockVerified: bool,
2020-08-05 08:28:43 +02:00
    parent: BlockRef, cache: var StateCache,
2022-11-02 16:23:30 +00:00
    onBlockAdded: OnForkyBlockAdded,
2021-10-19 17:20:55 +02:00
    stateDataDur, sigVerifyDur, stateVerifyDur: Duration
2021-12-06 10:49:01 +01:00
    ): BlockRef =
2022-03-16 08:20:40 +01:00
  doAssert state.matches_block_slot(
    trustedBlock.root, trustedBlock.message.slot),
    "Given state must have the new block applied"
2020-05-19 16:18:07 +02:00

2020-07-16 15:16:51 +02:00
  let
2021-01-25 19:45:48 +01:00
    blockRoot = trustedBlock.root
    blockRef = BlockRef.init(blockRoot, trustedBlock.message)
2021-05-28 18:34:00 +02:00
    startTick = Moment.now()
2020-08-11 21:39:53 +02:00

  link(parent, blockRef)

2022-01-21 12:33:16 +01:00
  dag.forkBlocks.incl(KeyedBlockRef.init(blockRef))
2020-05-19 16:18:07 +02:00

  # Resolved blocks should be stored in database
2021-06-24 18:34:08 +00:00
  dag.putBlock(trustedBlock)
2021-05-28 21:03:20 +02:00
  let putBlockTick = Moment.now()
2020-05-19 16:18:07 +02:00

2021-05-29 20:56:30 +02:00
  var foundHead: bool
2020-05-19 16:18:07 +02:00
  for head in dag.heads.mitems():
2020-07-28 15:54:32 +02:00
    if head.isAncestorOf(blockRef):
      head = blockRef
2021-05-29 20:56:30 +02:00
      foundHead = true
2020-05-19 16:18:07 +02:00
      break

2021-05-29 20:56:30 +02:00
  if not foundHead:
    dag.heads.add(blockRef)

2021-06-01 13:13:40 +02:00
  # Regardless of the chain we're on, the deposits come in the same order so
  # as soon as we import a block, we'll also update the shared public key
  # cache
2022-03-16 08:20:40 +01:00
  dag.updateValidatorKeys(getStateField(state, validators).asSeq())
2021-06-01 13:13:40 +02:00

2021-05-29 20:56:30 +02:00
  # Getting epochRef with the state will potentially create a new EpochRef
  let
Prune `BlockRef` on finalization (#3513)
Up until now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.
By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory
landing us at a steady state of ~750mb normal memory usage for a
validating node.
Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time - instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).
In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensures that consensus performance remains unchanged.
What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no perceivable cost to functionality or performance.
A nice collateral benefit is the almost-instant startup since we no
longer load any large indices at dag init.
The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this needs revisiting :)
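The pruning of `BlockRef` parents on finality can be illustrated with a small sketch. It assumes a simplified `MiniBlockRef` type with only a slot and a parent pointer; the type and the procs `inMemoryAncestors` and `pruneParents` are hypothetical and only meant to show why memory stops growing with chain length. Historical access through the slot index is left out.

```nim
type
  MiniBlockRef = ref object
    slot: int
    parent: MiniBlockRef   # nil once older history has been pruned

proc inMemoryAncestors(blck: MiniBlockRef): int =
  ## Count the ancestors still reachable through parent pointers.
  var cur = blck.parent
  while cur != nil:
    inc result
    cur = cur.parent

proc pruneParents(finalized: MiniBlockRef) =
  ## Cut the link to everything older than the finalized block so that the
  ## garbage collector can reclaim it.
  finalized.parent = nil

# Build a linear chain: slot 0 <- 1 <- 2 <- 3
var head: MiniBlockRef = nil
var blocks: seq[MiniBlockRef]
for slot in 0 .. 3:
  head = MiniBlockRef(slot: slot, parent: head)
  blocks.add head

doAssert head.inMemoryAncestors() == 3
pruneParents(blocks[2])           # pretend slot 2 just finalized
doAssert head.inMemoryAncestors() == 1
```

After pruning, only the unfinalized blocks and the finalized block itself stay linked in memory; anything older would have to be looked up by slot.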
2022-03-17 18:42:56 +01:00
    epochRef = dag.getEpochRef(state, cache)
2021-05-29 20:56:30 +02:00
    epochRefTick = Moment.now()
2020-05-19 16:18:07 +02:00

2020-10-01 20:56:42 +02:00
  debug "Block resolved",
2020-05-19 16:18:07 +02:00
    blockRoot = shortLog(blockRoot),
2021-11-05 16:39:47 +01:00
    blck = shortLog(trustedBlock.message),
2022-07-04 20:35:33 +00:00
    blockVerified,
2021-05-28 18:34:00 +02:00
    heads = dag.heads.len(),
2021-05-28 21:03:20 +02:00
    stateDataDur, sigVerifyDur, stateVerifyDur,
2021-05-29 20:56:30 +02:00
    putBlockDur = putBlockTick - startTick,
    epochRefDur = epochRefTick - putBlockTick
2020-08-18 22:29:33 +02:00

2022-03-11 21:28:10 +01:00
  # Update light client data
2022-03-20 11:58:59 +01:00
  dag.processNewBlockForLightClient(state, trustedBlock, parent.bid)
2022-03-11 21:28:10 +01:00

2022-08-18 20:07:01 +02:00
  # Pre-heat the shuffling cache with the shuffling caused by this block - this
  # is useful for attestation duty lookahead, REST API queries and attestation
  # validation of untaken forks (in case of instability / multiple heads)
  if dag.findShufflingRef(blockRef.bid, blockRef.slot.epoch + 1).isNone:
    dag.putShufflingRef(
      ShufflingRef.init(state, cache, blockRef.slot.epoch + 1))

2022-07-04 20:35:33 +00:00
  if not blockVerified:
    dag.optimisticRoots.incl blockRoot

2020-07-30 17:48:25 +02:00
  # Notify others of the new block before processing the quarantine, such that
  # notifications for parents happen before those of the children
2020-07-22 11:42:55 +02:00
  if onBlockAdded != nil:
2022-07-06 03:33:02 -07:00
    var unrealized: FinalityCheckpoints
    if enableTestFeatures in dag.updateFlags:
      unrealized = withState(state):
2022-11-02 16:23:30 +00:00
        when stateFork >= BeaconStateFork.Capella:
          raiseAssert $capellaImplementationMissing
        elif stateFork >= BeaconStateFork.Altair:
2022-09-10 06:12:07 +00:00
          forkyState.data.compute_unrealized_finality()
2022-07-06 03:33:02 -07:00
        else:
2022-09-10 06:12:07 +00:00
          forkyState.data.compute_unrealized_finality(cache)
2022-07-06 03:33:02 -07:00
    onBlockAdded(blockRef, trustedBlock, epochRef, unrealized)
2021-09-22 15:17:15 +03:00
  if not(isNil(dag.onBlockAdded)):
    dag.onBlockAdded(ForkedTrustedSignedBeaconBlock.init(trustedBlock))
2020-07-09 11:29:32 +02:00

2021-12-06 10:49:01 +01:00
  blockRef
2020-07-09 11:29:32 +02:00

2021-06-01 13:13:40 +02:00
proc checkStateTransition(
2022-02-28 13:58:34 +01:00
    dag: ChainDAGRef, signedBlock: ForkySigVerifiedSignedBeaconBlock,
2021-12-06 10:49:01 +01:00
    cache: var StateCache): Result[void, BlockError] =
2021-06-01 13:13:40 +02:00
  ## Ensure block can be applied on a state
2021-06-11 17:51:46 +00:00
  func restore(v: var ForkedHashedBeaconState) =
2021-05-05 08:54:21 +02:00
    assign(dag.clearanceState, dag.headState)
2021-01-25 19:45:48 +01:00

2022-01-17 12:19:58 +01:00
  let res = state_transition_block(
2022-03-16 08:20:40 +01:00
    dag.cfg, dag.clearanceState, signedBlock,
2022-01-17 12:19:58 +01:00
    cache, dag.updateFlags, restore)
2022-03-16 08:20:40 +01:00

2022-01-17 12:19:58 +01:00
  if res.isErr():
    info "Invalid block",
      blockRoot = shortLog(signedBlock.root),
      blck = shortLog(signedBlock.message),
      error = res.error()
2021-01-25 19:45:48 +01:00

2021-12-06 10:49:01 +01:00
    err(BlockError.Invalid)
  else:
    ok()
2021-01-25 19:45:48 +01:00

2021-07-07 12:09:47 +03:00
proc advanceClearanceState*(dag: ChainDAGRef) =
2021-05-29 20:56:30 +02:00
  # When the chain is synced, the most likely block to be produced is the block
  # right after head - we can exploit this assumption and advance the state
  # to that slot before the block arrives, thus allowing us to do the expensive
  # epoch transition ahead of time.
  # Notably, we use the clearance state here because that's where the block will
  # first be seen - later, this state will be copied to the head state!
2022-03-17 18:42:56 +01:00
  let advanced = withState(dag.clearanceState):
2022-08-26 22:47:40 +00:00
    forkyState.data.slot > forkyState.data.latest_block_header.slot
2022-03-17 18:42:56 +01:00
  if not advanced:
    let next = getStateField(dag.clearanceState, slot) + 1
2021-06-01 13:13:40 +02:00

2021-06-01 17:33:00 +02:00
    let startTick = Moment.now()
2022-03-17 18:42:56 +01:00
    var
      cache = StateCache()
      info = ForkedEpochInfo()

    dag.advanceSlots(dag.clearanceState, next, true, cache, info)

    debug "Prepared clearance state for next block",
      next, updateStateDur = Moment.now() - startTick
2021-06-01 17:33:00 +02:00

2021-12-13 14:36:06 +01:00
proc addHeadBlock*(
2021-12-06 10:49:01 +01:00
    dag: ChainDAGRef, verifier: var BatchVerifier,
    signedBlock: ForkySignedBeaconBlock,
2022-07-04 20:35:33 +00:00
    blockVerified: bool,
2022-11-02 16:23:30 +00:00
    onBlockAdded: OnForkyBlockAdded
2021-12-06 10:49:01 +01:00
    ): Result[BlockRef, BlockError] =
  ## Try adding a block to the chain, verifying first that it passes the state
  ## transition function and contains a correct cryptographic signature.
  ##
2022-04-14 15:39:37 +00:00
  ## Cryptographic checks can be skipped by adding skipBlsValidation to
2022-01-26 12:21:29 +00:00
  ## dag.updateFlags
2021-08-05 10:26:10 +02:00
  logScope:
    blockRoot = shortLog(signedBlock.root)
2021-11-05 16:39:47 +01:00
    blck = shortLog(signedBlock.message)
2022-02-26 19:16:19 +01:00
    signature = shortLog(signedBlock.signature)
2021-12-06 10:49:01 +01:00

  template blck(): untyped = signedBlock.message # shortcuts without copy
  template blockRoot(): untyped = signedBlock.root

  # If the block we get is older than what we finalized already, we drop it.
  # One way this can happen is that we start requesting a block and finalization
  # happens in the meantime - the block we requested will then be stale
  # by the time it gets here.
  if blck.slot <= dag.finalizedHead.slot:
2022-02-26 19:16:19 +01:00
    let existing = dag.getBlockIdAtSlot(blck.slot)
    # The exact slot match ensures we reject blocks that were orphaned in
    # the finalized chain
2022-03-15 09:24:55 +01:00
    if existing.isSome:
      if existing.get().bid.slot == blck.slot and
          existing.get().bid.root == blockRoot:
        debug "Duplicate block"
        return err(BlockError.Duplicate)

      # Block is older than finalized, but different from the block in our
      # canonical history: it must be from an unviable branch
      debug "Block from unviable fork",
        existing = shortLog(existing.get()),
        finalizedHead = shortLog(dag.finalizedHead),
        tail = shortLog(dag.tail)
2022-01-21 12:33:16 +01:00

2022-03-15 09:24:55 +01:00
      return err(BlockError.UnviableFork)
2021-12-06 10:49:01 +01:00

2022-01-21 12:33:16 +01:00
  # Check non-finalized blocks as well
  if dag.containsForkBlock(blockRoot):
    return err(BlockError.Duplicate)
2021-12-06 10:49:01 +01:00

2022-01-21 12:33:16 +01:00
|
|
|
let parent = dag.getBlockRef(blck.parent_root).valueOr:
|
|
|
|
# There are two cases where the parent won't be found: we don't have it or
|
|
|
|
# it has been finalized already, and as a result the branch the new block
|
|
|
|
# is on is no longer a viable fork candidate - we can't tell which is which
|
|
|
|
# at this stage, but we can check if we've seen the parent block previously
|
|
|
|
# and thus prevent requests for it to be downloaded again.
|
2022-02-26 19:16:19 +01:00
|
|
|
let parentId = dag.getBlockId(blck.parent_root)
|
2022-09-22 20:33:26 +02:00
|
|
|
if parentId.isSome() and parentId.get.slot < dag.finalizedHead.slot:
|
2022-02-26 19:16:19 +01:00
|
|
|
debug "Block unviable due to pre-finalized-checkpoint parent",
|
|
|
|
parentId = parentId.get()
|
2022-01-21 12:33:16 +01:00
|
|
|
return err(BlockError.UnviableFork)
|
|
|
|
|
2022-02-26 19:16:19 +01:00
|
|
|
debug "Block parent unknown or finalized already", parentId
|
2021-12-06 10:49:01 +01:00
|
|
|
return err(BlockError.MissingParent)
|
2021-01-25 19:45:48 +01:00
|
|
|
|
2022-02-26 19:16:19 +01:00
|
|
|
if parent.slot >= blck.slot:
|
2021-01-25 19:45:48 +01:00
|
|
|
# A block whose parent is newer than the block itself is clearly invalid -
|
|
|
|
# discard it immediately
|
2022-02-26 19:16:19 +01:00
|
|
|
debug "Block older than parent",
|
|
|
|
parent = shortLog(parent)
|
2021-01-25 19:45:48 +01:00
|
|
|
|
2021-12-06 10:49:01 +01:00
|
|
|
return err(BlockError.Invalid)
|
2021-01-25 19:45:48 +01:00
|
|
|
|
|
|
|
# The block is resolved, now it's time to validate it to ensure that the
|
|
|
|
# blocks we add to the database are clean for the given state
|
2021-05-28 21:03:20 +02:00
|
|
|
let startTick = Moment.now()
|
2020-05-19 16:18:07 +02:00
|
|
|
|
2021-12-30 12:33:03 +01:00
|
|
|
# The clearance state works as the canonical
|
|
|
|
# "let's make things permanent" point and saves things to the database -
|
|
|
|
# storing things is slow, so we don't want to do so before there's a
|
|
|
|
# reasonable chance that the information will become more permanently useful -
|
|
|
|
# by the time a new block reaches this point, the parent block will already
|
|
|
|
# have "established" itself in the network to some degree at least.
|
2021-01-25 19:45:48 +01:00
|
|
|
var cache = StateCache()
|
2022-03-23 12:42:16 +01:00
|
|
|
|
|
|
|
# We've verified that the slot of the new block is newer than that of the
|
|
|
|
# parent, so we should now be able to create an appropriate clearance state
|
|
|
|
# onto which we can apply the new block
|
|
|
|
let clearanceBlock = BlockSlotId.init(parent.bid, signedBlock.message.slot)
|
2022-03-16 08:20:40 +01:00
|
|
|
if not updateState(
|
Prune `BlockRef` on finalization (#3513)
Up until now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.
By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory
landing us at a steady state of ~750mb normal memory usage for a
validating node.
Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time - instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).
In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensures that consensus performance remains unchanged.
What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no perceivable cost to functionality or performance.
A nice collateral benefit is the almost-instant startup since we no
longer load any large indices at dag init.
The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this needs revisiting :)
2022-03-17 18:42:56 +01:00
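The commit message above describes two ways of addressing blocks: by root through in-memory `BlockRef` instances for the unfinalized part of the chain, and by slot for finalized history. A minimal sketch of that split follows. It is not the actual `ChainDAGRef` implementation: `MiniDag`, `finalizedBySlot` and the string-based `Root` are hypothetical stand-ins, and `std/options` replaces the `Opt` type used in the real code.

```nim
import std/[options, tables]

type
  Slot = uint64
  Root = string            # hypothetical stand-in for a 32-byte block root

  BlockId = object         # cheap value type: enough for finalized history
    root: Root
    slot: Slot

  BlockRef = ref object    # in-memory DAG node, kept only for unfinalized blocks
    bid: BlockId
    parent: BlockRef       # nil once the parent has been pruned

  MiniDag = object         # hypothetical, heavily simplified DAG
    forkBlocks: Table[Root, BlockRef]    # by-root index of the "hot" blocks
    finalizedBySlot: Table[Slot, Root]   # stand-in for the db-backed slot index

proc getBlockRef(dag: MiniDag, root: Root): Option[BlockRef] =
  ## By-root lookup: only succeeds for unfinalized blocks.
  if root in dag.forkBlocks:
    some(dag.forkBlocks[root])
  else:
    none(BlockRef)

proc getBlockIdAtSlot(dag: MiniDag, slot: Slot): Option[BlockId] =
  ## By-slot lookup: how finalized history is addressed in this sketch.
  if slot in dag.finalizedBySlot:
    some(BlockId(root: dag.finalizedBySlot[slot], slot: slot))
  else:
    none(BlockId)

# Usage: one finalized block (slot 10) and one unfinalized block (slot 11)
var dag: MiniDag
dag.finalizedBySlot[Slot(10)] = "root-a"
let head = BlockRef(bid: BlockId(root: "root-b", slot: 11))
dag.forkBlocks["root-b"] = head
```

In this sketch a by-root query for a finalized block returns `none`, which mirrors why callers that may touch historical data are moved to `BlockId` and slot-based access.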
|
|
|
dag, dag.clearanceState, clearanceBlock, true, cache):
|
2022-01-05 19:38:04 +01:00
|
|
|
# We should never end up here - the parent must be a block no older than and
|
|
|
|
# rooted in the finalized checkpoint, hence we should always be able to
|
|
|
|
# load its corresponding state
|
|
|
|
error "Unable to load clearance state for parent block, database corrupt?",
|
2022-03-17 18:42:56 +01:00
|
|
|
clearanceBlock = shortLog(clearanceBlock)
|
2022-01-05 19:38:04 +01:00
|
|
|
return err(BlockError.MissingParent)
|
|
|
|
|
2021-05-28 21:03:20 +02:00
|
|
|
let stateDataTick = Moment.now()
|
2020-05-19 16:18:07 +02:00
|
|
|
|
2021-06-01 13:13:40 +02:00
|
|
|
# First, batch-verify all signatures in block
|
2022-04-14 15:39:37 +00:00
|
|
|
if skipBlsValidation notin dag.updateFlags:
|
|
|
|
# TODO: remove skipBlsValidation
|
2021-01-25 19:45:48 +01:00
|
|
|
var sigs: seq[SignatureSet]
|
2021-08-05 10:26:10 +02:00
|
|
|
if (let e = sigs.collectSignatureSets(
|
2021-08-09 13:14:28 +02:00
|
|
|
signedBlock, dag.db.immutableValidators,
|
2022-03-16 08:20:40 +01:00
|
|
|
dag.clearanceState, cache); e.isErr()):
|
2022-01-21 12:33:16 +01:00
|
|
|
# A PublicKey or Signature isn't on the BLS12-381 curve
|
2021-08-05 10:26:10 +02:00
|
|
|
info "Unable to load signature sets",
|
|
|
|
err = e.error()
|
2021-12-06 10:49:01 +01:00
|
|
|
return err(BlockError.Invalid)
|
2022-02-26 19:16:19 +01:00
|
|
|
|
2021-12-06 10:49:01 +01:00
|
|
|
if not verifier.batchVerify(sigs):
|
2022-01-21 12:33:16 +01:00
|
|
|
info "Block signature verification failed",
|
|
|
|
signature = shortLog(signedBlock.signature)
|
2021-12-06 10:49:01 +01:00
|
|
|
return err(BlockError.Invalid)
|
2020-05-19 16:18:07 +02:00
|
|
|
|
2021-05-28 21:03:20 +02:00
|
|
|
let sigVerifyTick = Moment.now()
|
2021-12-06 10:49:01 +01:00
|
|
|
|
|
|
|
? checkStateTransition(dag, signedBlock.asSigVerified(), cache)
|
2021-05-29 20:56:30 +02:00
|
|
|
|
2021-05-28 21:03:20 +02:00
|
|
|
let stateVerifyTick = Moment.now()
|
2021-01-25 19:45:48 +01:00
|
|
|
# Careful, clearanceState.data has been updated but not blck - we need to
|
|
|
|
# create the BlockRef first!
|
Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.
When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.
When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.
Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.
This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.
Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
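The backfill pointer described above can be pictured as a summary of the earliest block known so far: a candidate block is accepted only when its root equals that summary's `parent_root`, after which the pointer moves one block back. The sketch below illustrates this under simplifying assumptions; `BlockSummary`, `BackfillOutcome` and `tryBackfill` are hypothetical names, and signature checks, database writes and the duplicate/unviable distinction are left out.

```nim
type
  Root = string                 # hypothetical stand-in for a 32-byte block root

  BlockSummary = object         # the earliest block known so far
    slot: uint64
    parent_root: Root

  BackfillOutcome = enum        # hypothetical, simplified result type
    Added, Duplicate, MissingParent

proc tryBackfill(
    backfill: var BlockSummary, root: Root, blck: BlockSummary): BackfillOutcome =
  ## Accept `blck` only if it is the parent of the earliest block we know about.
  if blck.slot >= backfill.slot:
    return Duplicate            # simplification: slot already covered by known history
  if root != backfill.parent_root:
    return MissingParent        # does not link up with the known chain
  backfill = blck               # the pointer moves one block back in time
  Added

# Usage: the earliest known block sits at slot 100 and names "r99" as its parent
var backfill = BlockSummary(slot: 100, parent_root: "r99")
let first = tryBackfill(
  backfill, "r99", BlockSummary(slot: 99, parent_root: "r98"))
let second = tryBackfill(
  backfill, "r50", BlockSummary(slot: 50, parent_root: "r49"))
```

Because each accepted block is pinned by the `parent_root` of its successor, only one history can ever be backfilled, which is why forks need no special handling in this path.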
|
|
|
ok addResolvedHeadBlock(
|
2021-12-06 10:49:01 +01:00
|
|
|
dag, dag.clearanceState,
|
2021-01-25 19:45:48 +01:00
|
|
|
signedBlock.asTrusted(),
|
2022-07-04 20:35:33 +00:00
|
|
|
blockVerified = blockVerified,
|
2021-01-25 19:45:48 +01:00
|
|
|
parent, cache,
|
2021-05-28 21:03:20 +02:00
|
|
|
onBlockAdded,
|
|
|
|
stateDataDur = stateDataTick - startTick,
|
|
|
|
sigVerifyDur = sigVerifyTick - stateDataTick,
|
|
|
|
stateVerifyDur = stateVerifyTick - sigVerifyTick)
|
2021-12-13 14:36:06 +01:00
|
|
|
|
2022-07-04 20:35:33 +00:00
|
|
|
proc addHeadBlock*(
|
|
|
|
dag: ChainDAGRef, verifier: var BatchVerifier,
|
|
|
|
signedBlock: ForkySignedBeaconBlock,
|
2022-11-02 16:23:30 +00:00
|
|
|
onBlockAdded: OnForkyBlockAdded
|
2022-07-04 20:35:33 +00:00
|
|
|
): Result[BlockRef, BlockError] =
|
|
|
|
addHeadBlock(
|
|
|
|
dag, verifier, signedBlock, blockVerified = true, onBlockAdded)
|
|
|
|
|
2021-12-13 14:36:06 +01:00
|
|
|
proc addBackfillBlock*(
|
|
|
|
dag: ChainDAGRef,
|
2022-11-10 11:44:47 +01:00
|
|
|
signedBlock: ForkySignedBeaconBlock | ForkySigVerifiedSignedBeaconBlock):
|
|
|
|
Result[void, BlockError] =
|
2021-12-13 14:36:06 +01:00
|
|
|
## When performing checkpoint sync, we need to backfill historical blocks
|
|
|
|
## in order to respond to `BeaconBlocksByRange` requests. Backfill blocks are
|
|
|
|
## added in backwards order, one by one, based on the `parent_root` of the
|
|
|
|
## earliest block we know about.
|
|
|
|
##
|
|
|
|
## Because only one history is relevant when backfilling, one doesn't have to
|
|
|
|
## consider forks or other finalization-related issues - a block is either
|
|
|
|
## valid and finalized, or not.
|
|
|
|
logScope:
|
|
|
|
blockRoot = shortLog(signedBlock.root)
|
|
|
|
blck = shortLog(signedBlock.message)
|
2022-02-26 19:16:19 +01:00
|
|
|
signature = shortLog(signedBlock.signature)
|
2021-12-21 11:40:14 +01:00
|
|
|
backfill = (dag.backfill.slot, shortLog(dag.backfill.parent_root))
|
2021-12-13 14:36:06 +01:00
|
|
|
|
|
|
|
template blck(): untyped = signedBlock.message # shortcuts without copy
|
|
|
|
template blockRoot(): untyped = signedBlock.root
|
2022-10-14 21:40:10 +02:00
|
|
|
template checkSignature =
|
|
|
|
# If the hash is correct, the block itself must be correct, but the root does
|
|
|
|
# not cover the signature, which we check next
|
2022-11-10 11:44:47 +01:00
|
|
|
when signedBlock.signature isnot TrustedSig:
|
|
|
|
if blck.slot == GENESIS_SLOT:
|
|
|
|
# The genesis block must have an empty signature (since there's no proposer)
|
|
|
|
if signedBlock.signature != ValidatorSig():
|
|
|
|
info "Invalid genesis block signature"
|
|
|
|
return err(BlockError.Invalid)
|
|
|
|
else:
|
|
|
|
let proposerKey = dag.validatorKey(blck.proposer_index)
|
|
|
|
if proposerKey.isNone():
|
|
|
|
# We've verified that the block root matches our expectations by following
|
|
|
|
# the chain of parents all the way from checkpoint. If all those blocks
|
|
|
|
# were valid, the proposer_index in this block must also be valid, and we
|
|
|
|
# should have a key for it but we don't: this is either a bug on our side from
|
|
|
|
# which we cannot recover, or an invalid checkpoint state was given in which
|
|
|
|
# case we're in trouble.
|
|
|
|
fatal "Invalid proposer in backfill block - checkpoint state corrupt?",
|
|
|
|
head = shortLog(dag.head), tail = shortLog(dag.tail)
|
|
|
|
|
|
|
|
quit 1
|
|
|
|
|
|
|
|
if not verify_block_signature(
|
|
|
|
dag.forkAtEpoch(blck.slot.epoch),
|
|
|
|
getStateField(dag.headState, genesis_validators_root),
|
|
|
|
blck.slot,
|
|
|
|
signedBlock.root,
|
|
|
|
proposerKey.get(),
|
|
|
|
signedBlock.signature):
|
|
|
|
info "Block signature verification failed"
|
|
|
|
return err(BlockError.Invalid)
|
2021-12-13 14:36:06 +01:00
|
|
|
|
2022-01-21 12:33:16 +01:00
|
|
|
let startTick = Moment.now()
|
|
|
|
|
|
|
|
if blck.slot >= dag.backfill.slot:
|
2022-02-26 19:16:19 +01:00
|
|
|
let existing = dag.getBlockIdAtSlot(blck.slot)
|
2022-03-15 09:24:55 +01:00
|
|
|
if existing.isSome:
|
|
|
|
if existing.get().bid.slot == blck.slot and
|
|
|
|
existing.get().bid.root == blockRoot:
|
2022-10-14 21:40:10 +02:00
|
|
|
|
|
|
|
# Special case: when starting with only a checkpoint state, we will not
|
|
|
|
# have the head block data in the database
|
|
|
|
if dag.getForkedBlock(existing.get().bid).isNone():
|
|
|
|
checkSignature()
|
|
|
|
|
|
|
|
debug "Block backfilled (checkpoint)"
|
|
|
|
dag.putBlock(signedBlock.asTrusted())
|
|
|
|
return ok()
|
|
|
|
|
2022-03-15 09:24:55 +01:00
|
|
|
debug "Duplicate block"
|
|
|
|
return err(BlockError.Duplicate)
|
|
|
|
|
|
|
|
# Block is older than finalized, but different from the block in our
|
|
|
|
# canonical history: it must be from an unviable branch
|
|
|
|
debug "Block from unviable fork",
|
|
|
|
existing = shortLog(existing.get()),
|
|
|
|
finalizedHead = shortLog(dag.finalizedHead)
|
|
|
|
|
|
|
|
return err(BlockError.UnviableFork)
|
2021-12-13 14:36:06 +01:00

  if dag.frontfill.isSome():
    let frontfill = dag.frontfill.get()
    if blck.slot == frontfill.slot and
        dag.backfill.parent_root == frontfill.root:
      if blockRoot != frontfill.root:
        # We've matched the backfill blocks all the way back to frontfill via the
        # `parent_root` chain and ended up at a different block - one way this
        # can happen is when an invalid `--network` parameter is given during
        # startup (though in theory, we check that - maybe the database was
        # swapped or something?).
        fatal "Checkpoint given during initial startup inconsistent with genesis block - wrong network used when starting the node?",
          tail = shortLog(dag.tail), head = shortLog(dag.head)
        quit 1

      # Signal that we're done by resetting backfill
      reset(dag.backfill)
      dag.db.finalizedBlocks.insert(blck.slot, blockRoot)
      dag.updateFrontfillBlocks()

      notice "Received final block during backfill, backfill complete"

      # Backfill done - dag.backfill.slot now points to genesis block just like
      # it would if we loaded a fully synced database - returning duplicate
      # here is appropriate, though one could also call it ... ok?
      return err(BlockError.Duplicate)

  if dag.backfill.parent_root != blockRoot:
    debug "Block does not match expected backfill root"
    return err(BlockError.MissingParent) # MissingChild really, but ..

  checkSignature()

  let sigVerifyTick = Moment.now

  dag.putBlock(signedBlock.asTrusted())
  dag.db.finalizedBlocks.insert(blck.slot, blockRoot)

  dag.backfill = blck.toBeaconBlockSummary()

  let putBlockTick = Moment.now
  debug "Block backfilled",
    sigVerifyDur = sigVerifyTick - startTick,
    putBlockDur = putBlockTick - sigVerifyTick

  ok()