387 lines
14 KiB
Nim
Raw Normal View History

# Nimbus
# Copyright (c) 2018-2025 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed except
# according to those terms.
{.push raises: [].}
import
std/[sequtils, sets, strformat],
pkg/blscurve, # Kludge: needed to compile `eip4844` -- sometimes :)
../db/ledger,
2024-10-26 14:19:48 +07:00
../common/common,
../transaction/call_types,
../transaction,
../utils/utils,
"."/[dao, eip4844, eip7702, eip7691, gaslimit, withdrawals],
./pow/difficulty,
2024-05-30 14:54:03 +02:00
stew/objects,
results
from stew/byteutils
import nil
export
results
const
daoForkBlockExtraData* =
byteutils.hexToByteArray[13](DAOForkBlockExtra).toSeq
# ------------------------------------------------------------------------------
# Private validator functions
# ------------------------------------------------------------------------------
proc validateHeader(
com: CommonRef;
blk: Block;
parentHeader: Header;
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
txFrame: CoreDbTxRef;
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
): Result[void,string] =
template header: Header = blk.header
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
# TODO this code is used for validating uncles also, though these get passed
# an empty body - avoid this by separating header and block validation
template inDAOExtraRange(blockNumber: BlockNumber): bool =
# EIP-799
# Blocks with block numbers in the range [1_920_000, 1_920_009]
# MUST have DAOForkBlockExtra
2022-12-02 11:35:41 +07:00
let daoForkBlock = com.daoForkBlock.get
let DAOHigh = daoForkBlock + DAOForkExtraRange
2022-12-02 11:35:41 +07:00
daoForkBlock <= blockNumber and
blockNumber < DAOHigh
if header.extraData.len > 32:
return err("Header.extraData larger than 32 bytes")
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
if header.gasUsed == 0 and 0 < blk.transactions.len:
return err("zero gasUsed but transactions present");
if header.gasUsed < 0 or header.gasUsed > header.gasLimit:
return err("gasUsed should be non negative and smaller or equal gasLimit")
if header.number != parentHeader.number + 1:
return err("Blocks must be numbered consecutively")
if header.timestamp <= parentHeader.timestamp:
return err("timestamp must be strictly later than parent")
if header.gasLimit > GAS_LIMIT_MAXIMUM:
return err("gasLimit exceeds GAS_LIMIT_MAXIMUM")
if com.daoForkSupport and inDAOExtraRange(header.number):
if header.extraData != daoForkBlockExtraData:
return err("header extra data should be marked DAO")
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
if com.proofOfStake(header, txFrame):
# EIP-4399 and EIP-3675
# no need to check mixHash because EIP-4399 override this field
# checking rule
if not header.difficulty.isZero:
return err("Non-zero difficulty in a post-merge block")
if not header.nonce.isZeroMemory:
return err("Non-zero nonce in a post-merge block")
if header.ommersHash != EMPTY_UNCLE_HASH:
return err("Invalid ommers hash in a post-merge block")
else:
2022-12-02 11:35:41 +07:00
let calcDiffc = com.calcDifficulty(header.timestamp, parentHeader)
if header.difficulty < calcDiffc:
return err("provided header difficulty is too low")
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
? com.validateWithdrawals(header, blk.withdrawals)
? com.validateEip4844Header(header, parentHeader, blk.transactions)
? com.validateGasLimitOrBaseFee(header, parentHeader)
ok()
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
proc validateUncles(com: CommonRef; header: Header; txFrame: CoreDbTxRef,
uncles: openArray[Header]): Result[void,string]
{.gcsafe, raises: [].} =
let hasUncles = uncles.len > 0
let shouldHaveUncles = header.ommersHash != EMPTY_UNCLE_HASH
if not hasUncles and not shouldHaveUncles:
# optimization to avoid loading ancestors from DB, since the block has
# no uncles
return ok()
if hasUncles and not shouldHaveUncles:
return err("Block has uncles but header suggests uncles should be empty")
if shouldHaveUncles and not hasUncles:
return err("Header suggests block should have uncles but block has none")
# Check for duplicates
var uncleSet = HashSet[Hash32]()
for uncle in uncles:
Fearture/poa clique tuning (#765) * Provide API details: API is bundled via clique.nim. * Set extraValidation as default for PoA chains why: This triggers consensus verification and an update of the list of authorised signers. These signers are integral part of the PoA block chain. todo: Option argument to control validation for the nimbus binary. * Fix snapshot state block number why: Using sub-sequence here, so the len() function was wrong. * Optional start where block verification begins why: Can speed up time building loading initial parts of block chain. For PoA, this allows to prove & test that authorised signers can be (correctly) calculated starting at any point on the block chain. todo: On Goerli around blocks #193537..#197568, processing time increases disproportionally -- needs to be understand * For Clique test, get old grouping back (7 transactions per log entry) why: Forgot to change back after troubleshooting * Fix field/function/module-name misunderstanding why: Make compilation work * Use eth_types.blockHash() rather than utils.hash() in Clique modules why: Prefer lib module * Dissolve snapshot_misc.nim details: .. into clique_verify.nim (the other source file clique_unused.nim is inactive) * Hide unused AsyncLock in Clique descriptor details: Unused here but was part of the Go reference implementation * Remove fakeDiff flag from Clique descriptor details: This flag was a kludge in the Go reference implementation used for the canonical tests. The tests have been adapted so there is no need for the fakeDiff flag and its implementation. * Not observing minimum distance from epoch sync point why: For compiling PoA state, the go implementation will walk back to the epoch header with at least 90000 blocks apart from the current header in the absence of other synchronisation points. Here just the nearest epoch header is used. The assumption is that all the checkpoints before have been vetted already regardless of the current branch. details: The behaviour of using the nearest vs the minimum distance epoch is controlled by a flag and can be changed at run time. * Analysing processing time (patch adds some debugging/visualisation support) why: At the first half million blocks of the Goerli replay, blocks on the interval #194854..#196224 take exceptionally long to process, but not due to PoA processing. details: It turns out that much time is spent in p2p/excecutor.processBlock() where the elapsed transaction execution time is significantly greater for many of these blocks. Between the 1371 blocks #194854..#196224 there are 223 blocks with more than 1/2 seconds execution time whereas there are only 4 such blocks before and 13 such after this range up to #504192. * fix debugging symbol in clique_desc (causes CI failing) * Fixing canonical reference tests why: Two errors were introduced earlier but ovelooked: 1. "Remove fakeDiff flag .." patch was incomplete 2. "Not observing minimum distance .." introduced problem w/tests 23/24 details: Fixing 2. needed to revert the behaviour by setting the applySnapsMinBacklog flag for the Clique descriptor. Also a new test was added to lock the new behaviour. * Remove cruft why: Clique/PoA processing was intended to take place somewhere in executor/process_block.processBlock() but was decided later to run from chain/persist_block.persistBlock() instead. * Update API comment * ditto
2021-07-30 15:06:51 +01:00
let uncleHash = uncle.blockHash
if uncleHash in uncleSet:
return err("Block contains duplicate uncles")
else:
uncleSet.incl uncleHash
let
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
recentAncestorHashes = ?txFrame.getAncestorsHashes(MAX_UNCLE_DEPTH + 1, header)
recentUncleHashes = ?txFrame.getUncleHashes(recentAncestorHashes)
blockHash = header.blockHash
for uncle in uncles:
Fearture/poa clique tuning (#765) * Provide API details: API is bundled via clique.nim. * Set extraValidation as default for PoA chains why: This triggers consensus verification and an update of the list of authorised signers. These signers are integral part of the PoA block chain. todo: Option argument to control validation for the nimbus binary. * Fix snapshot state block number why: Using sub-sequence here, so the len() function was wrong. * Optional start where block verification begins why: Can speed up time building loading initial parts of block chain. For PoA, this allows to prove & test that authorised signers can be (correctly) calculated starting at any point on the block chain. todo: On Goerli around blocks #193537..#197568, processing time increases disproportionally -- needs to be understand * For Clique test, get old grouping back (7 transactions per log entry) why: Forgot to change back after troubleshooting * Fix field/function/module-name misunderstanding why: Make compilation work * Use eth_types.blockHash() rather than utils.hash() in Clique modules why: Prefer lib module * Dissolve snapshot_misc.nim details: .. into clique_verify.nim (the other source file clique_unused.nim is inactive) * Hide unused AsyncLock in Clique descriptor details: Unused here but was part of the Go reference implementation * Remove fakeDiff flag from Clique descriptor details: This flag was a kludge in the Go reference implementation used for the canonical tests. The tests have been adapted so there is no need for the fakeDiff flag and its implementation. * Not observing minimum distance from epoch sync point why: For compiling PoA state, the go implementation will walk back to the epoch header with at least 90000 blocks apart from the current header in the absence of other synchronisation points. Here just the nearest epoch header is used. The assumption is that all the checkpoints before have been vetted already regardless of the current branch. details: The behaviour of using the nearest vs the minimum distance epoch is controlled by a flag and can be changed at run time. * Analysing processing time (patch adds some debugging/visualisation support) why: At the first half million blocks of the Goerli replay, blocks on the interval #194854..#196224 take exceptionally long to process, but not due to PoA processing. details: It turns out that much time is spent in p2p/excecutor.processBlock() where the elapsed transaction execution time is significantly greater for many of these blocks. Between the 1371 blocks #194854..#196224 there are 223 blocks with more than 1/2 seconds execution time whereas there are only 4 such blocks before and 13 such after this range up to #504192. * fix debugging symbol in clique_desc (causes CI failing) * Fixing canonical reference tests why: Two errors were introduced earlier but ovelooked: 1. "Remove fakeDiff flag .." patch was incomplete 2. "Not observing minimum distance .." introduced problem w/tests 23/24 details: Fixing 2. needed to revert the behaviour by setting the applySnapsMinBacklog flag for the Clique descriptor. Also a new test was added to lock the new behaviour. * Remove cruft why: Clique/PoA processing was intended to take place somewhere in executor/process_block.processBlock() but was decided later to run from chain/persist_block.persistBlock() instead. * Update API comment * ditto
2021-07-30 15:06:51 +01:00
let uncleHash = uncle.blockHash
if uncleHash == blockHash:
return err("Uncle has same hash as block")
# ensure the uncle has not already been included.
if uncleHash in recentUncleHashes:
return err("Duplicate uncle")
# ensure that the uncle is not one of the canonical chain blocks.
if uncleHash in recentAncestorHashes:
return err("Uncle cannot be an ancestor")
# ensure that the uncle was built off of one of the canonical chain
# blocks.
if (uncle.parentHash notin recentAncestorHashes) or
(uncle.parentHash == header.parentHash):
return err("Uncle's parent is not an ancestor")
if uncle.number >= header.number:
return err("uncle block number larger than current block number")
# check uncle against own parent
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
let parent = ?txFrame.getBlockHeader(uncle.parentHash)
if uncle.timestamp <= parent.timestamp:
return err("Uncle's parent must me older")
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
let uncleParent = ?txFrame.getBlockHeader(uncle.parentHash)
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
? com.validateHeader(
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
Block.init(uncle, BlockBody()), uncleParent, txFrame)
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
ok()
# ------------------------------------------------------------------------------
# Public function, extracted from executor
# ------------------------------------------------------------------------------
func validateLegacySignatureForm(tx: Transaction, fork: EVMFork): bool =
2024-10-26 14:19:48 +07:00
var
vMin = 27'u64
vMax = 28'u64
if tx.V >= EIP155_CHAIN_ID_OFFSET:
let chainId = (tx.V - EIP155_CHAIN_ID_OFFSET) div 2
vMin = 35 + (2 * chainId)
vMax = vMin + 1
var isValid = tx.R >= UInt256.one
isValid = isValid and tx.S >= UInt256.one
isValid = isValid and tx.V >= vMin
isValid = isValid and tx.V <= vMax
isValid = isValid and tx.S < SECPK1_N
isValid = isValid and tx.R < SECPK1_N
if fork >= FkHomestead:
isValid = isValid and tx.S < SECPK1_N div 2
isValid
func validateEip2930SignatureForm(tx: Transaction): bool =
2024-10-26 14:19:48 +07:00
var isValid = tx.V == 0'u64 or tx.V == 1'u64
isValid = isValid and tx.S >= UInt256.one
isValid = isValid and tx.S < SECPK1_N
isValid = isValid and tx.R < SECPK1_N
isValid
func gasCost*(tx: Transaction): UInt256 =
if tx.txType >= TxEip4844:
tx.gasLimit.u256 * tx.maxFeePerGas.u256 + tx.getTotalBlobGas.u256 * tx.maxFeePerBlobGas
elif tx.txType >= TxEip1559:
tx.gasLimit.u256 * tx.maxFeePerGas.u256
else:
tx.gasLimit.u256 * tx.gasPrice.u256
func validateTxBasic*(
com: CommonRef,
tx: Transaction; ## tx to validate
fork: EVMFork,
validateFork: bool = true): Result[void, string] =
if validateFork:
if tx.txType == TxEip2930 and fork < FkBerlin:
return err("invalid tx: Eip2930 Tx type detected before Berlin")
if tx.txType == TxEip1559 and fork < FkLondon:
return err("invalid tx: Eip1559 Tx type detected before London")
if tx.txType == TxEip4844 and fork < FkCancun:
return err("invalid tx: Eip4844 Tx type detected before Cancun")
if tx.txType == TxEip7702 and fork < FkPrague:
return err("invalid tx: Eip7702 Tx type detected before Prague")
2023-01-10 11:25:23 +07:00
if fork >= FkShanghai and tx.contractCreation and tx.payload.len > EIP3860_MAX_INITCODE_SIZE:
return err("invalid tx: initcode size exceeds maximum")
# The total must be the larger of the two
if tx.maxFeePerGasNorm < tx.maxPriorityFeePerGasNorm:
return err(&"invalid tx: maxFee is smaller than maxPriorityFee. maxFee={tx.maxFeePerGas}, maxPriorityFee={tx.maxPriorityFeePerGasNorm}")
let
(intrinsicGas, floorDataGas) = tx.intrinsicGas(fork)
minGasLimit = max(intrinsicGas, floorDataGas)
if tx.gasLimit < minGasLimit:
return err(&"invalid tx: not enough gas to perform calculation. avail={tx.gasLimit}, require={minGasLimit}")
if fork >= FkCancun:
if tx.payload.len > MAX_CALLDATA_SIZE:
return err(&"invalid tx: payload len exceeds MAX_CALLDATA_SIZE. len={tx.payload.len}")
if tx.accessList.len > MAX_ACCESS_LIST_SIZE:
return err("invalid tx: access list len exceeds MAX_ACCESS_LIST_SIZE. len=" &
$tx.accessList.len)
for i, acl in tx.accessList:
if acl.storageKeys.len > MAX_ACCESS_LIST_STORAGE_KEYS:
return err("invalid tx: access list storage keys len exceeds MAX_ACCESS_LIST_STORAGE_KEYS. " &
&"index={i}, len={acl.storageKeys.len}")
2024-10-26 14:19:48 +07:00
if tx.txType == TxLegacy:
if not validateLegacySignatureForm(tx, fork):
return err("invalid tx: invalid legacy signature form")
else:
if not validateEip2930SignatureForm(tx):
return err("invalid tx: invalid post EIP-2930 signature form")
if tx.txType == TxEip4844:
if tx.to.isNone:
return err("invalid tx: destination must be not empty")
if tx.versionedHashes.len == 0:
return err("invalid tx: there must be at least one blob")
let maxBlobsPerBlock = getMaxBlobsPerBlock(com, fork)
if tx.versionedHashes.len.uint64 > maxBlobsPerBlock:
return err(&"invalid tx: versioned hashes len exceeds MAX_BLOBS_PER_BLOCK={maxBlobsPerBlock}, get={tx.versionedHashes.len}")
for i, bv in tx.versionedHashes:
if bv.data[0] != VERSIONED_HASH_VERSION_KZG:
return err("invalid tx: one of blobVersionedHash has invalid version. " &
&"get={bv.data[0].int}, expect={VERSIONED_HASH_VERSION_KZG.int}")
if tx.txType == TxEip7702:
if tx.authorizationList.len == 0:
return err("invalid tx: authorization list must not empty")
ok()
proc validateTransaction*(
roDB: ReadOnlyLedger; ## Parent accounts environment for transaction
tx: Transaction; ## tx to validate
2024-10-26 14:19:48 +07:00
sender: Address; ## tx.recoverSender
maxLimit: GasInt; ## gasLimit from block header
baseFee: UInt256; ## baseFee from block header
2024-10-26 14:19:48 +07:00
excessBlobGas: uint64; ## excessBlobGas from parent block header
com: CommonRef,
fork: EVMFork): Result[void, string] =
? validateTxBasic(com, tx, fork)
let
balance = roDB.getBalance(sender)
nonce = roDB.getNonce(sender)
# Note that the following check bears some plausibility but is _not_
# covered by the eip-1559 reference (sort of) pseudo code, for details
# see `https://eips.ethereum.org/EIPS/eip-1559#specification`_
#
# Rather this check is needed for surviving the post-London unit test
# eth_tests/GeneralStateTests/stEIP1559/lowGasLimit.json which seems to
# be sourced and generated from
# eth_tests/src/GeneralStateTestsFiller/stEIP1559/lowGasLimitFiller.yml
#
# Interestingly, the hive tests do not use this particular test but rather
# eth_tests/BlockchainTests/GeneralStateTests/stEIP1559/lowGasLimit.json
# from a parallel tests series which look like somehow expanded versions.
#
# The parallel lowGasLimit.json test never triggers the case checked below
# as the paricular transaction is omitted (the txs list is just set empty.)
if maxLimit < tx.gasLimit:
return err(&"invalid tx: block header gasLimit exceeded. maxLimit={maxLimit}, gasLimit={tx.gasLimit}")
# ensure that the user was willing to at least pay the base fee
if tx.maxFeePerGasNorm < baseFee.truncate(GasInt):
return err(&"invalid tx: maxFee is smaller than baseFee. maxFee={tx.maxFeePerGas}, baseFee={baseFee}")
# the signer must be able to fully afford the transaction
let gasCost = tx.gasCost()
if balance < gasCost:
return err(&"invalid tx: not enough cash for gas. avail={balance}, require={gasCost}")
if balance - gasCost < tx.value:
return err(&"invalid tx: not enough cash to send. avail={balance}, availMinusGas={balance-gasCost}, require={tx.value}")
if tx.nonce != nonce:
return err(&"invalid tx: account nonce mismatch. txNonce={tx.nonce}, accNonce={nonce}")
if tx.nonce == high(uint64):
return err(&"invalid tx: nonce at maximum")
# EIP-3607 Reject transactions from senders with deployed code
# The EIP spec claims this attack never happened before
# Clients might choose to disable this rule for RPC calls like
# `eth_call` and `eth_estimateGas`
# EOA = Externally Owned Account
2024-11-27 14:59:42 +07:00
let
code = roDB.getCode(sender)
delegated = code.parseDelegation()
if code.len > 0 and not delegated:
return err(&"invalid tx: sender is not an EOA. sender={sender.toHex}, codeLen={code.len}")
if tx.txType == TxEip4844:
# ensure that the user was willing to at least pay the current data gasprice
let blobGasPrice = getBlobBaseFee(excessBlobGas, com, fork)
if tx.maxFeePerBlobGas < blobGasPrice:
return err("invalid tx: maxFeePerBlobGas smaller than blobGasPrice. " &
&"maxFeePerBlobGas={tx.maxFeePerBlobGas}, blobGasPrice={blobGasPrice}")
ok()
# ------------------------------------------------------------------------------
# Public functions, extracted from test_blockchain_json
# ------------------------------------------------------------------------------
proc validateHeaderAndKinship*(
2022-12-02 11:35:41 +07:00
com: CommonRef;
blk: Block;
parent: Header;
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
txFrame: CoreDbTxRef
): Result[void, string]
{.gcsafe, raises: [].} =
template header: Header = blk.header
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
if header.isGenesis:
if header.extraData.len > 32:
return err("Header.extraData larger than 32 bytes")
return ok()
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
? com.validateHeader(blk, parent, txFrame)
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
if blk.uncles.len > MAX_UNCLES:
return err("Number of uncles exceed limit.")
aristo: fork support via layers/txframes (#2960) * aristo: fork support via layers/txframes This change reorganises how the database is accessed: instead holding a "current frame" in the database object, a dag of frames is created based on the "base frame" held in `AristoDbRef` and all database access happens through this frame, which can be thought of as a consistent point-in-time snapshot of the database based on a particular fork of the chain. In the code, "frame", "transaction" and "layer" is used to denote more or less the same thing: a dag of stacked changes backed by the on-disk database. Although this is not a requirement, in practice each frame holds the change set of a single block - as such, the frame and its ancestors leading up to the on-disk state represents the state of the database after that block has been applied. "committing" means merging the changes to its parent frame so that the difference between them is lost and only the cumulative changes remain - this facility enables frames to be combined arbitrarily wherever they are in the dag. In particular, it becomes possible to consolidate a set of changes near the base of the dag and commit those to disk without having to re-do the in-memory frames built on top of them - this is useful for "flattening" a set of changes during a base update and sending those to storage without having to perform a block replay on top. Looking at abstractions, a side effect of this change is that the KVT and Aristo are brought closer together by considering them to be part of the "same" atomic transaction set - the way the code gets organised, applying a block and saving it to the kvt happens in the same "logical" frame - therefore, discarding the frame discards both the aristo and kvt changes at the same time - likewise, they are persisted to disk together - this makes reasoning about the database somewhat easier but has the downside of increased memory usage, something that perhaps will need addressing in the future. Because the code reasons more strictly about frames and the state of the persisted database, it also makes it more visible where ForkedChain should be used and where it is still missing - in particular, frames represent a single branch of history while forkedchain manages multiple parallel forks - user-facing services such as the RPC should use the latter, ie until it has been finalized, a getBlock request should consider all forks and not just the blocks in the canonical head branch. Another advantage of this approach is that `AristoDbRef` conceptually becomes more simple - removing its tracking of the "current" transaction stack simplifies reasoning about what can go wrong since this state now has to be passed around in the form of `AristoTxRef` - as such, many of the tests and facilities in the code that were dealing with "stack inconsistency" are now structurally prevented from happening. The test suite will need significant refactoring after this change. Once this change has been merged, there are several follow-ups to do: * there's no mechanism for keeping frames up to date as they get committed or rolled back - TODO * naming is confused - many names for the same thing for legacy reason * forkedchain support is still missing in lots of code * clean up redundant logic based on previous designs - in particular the debug and introspection code no longer makes sense * the way change sets are stored will probably need revisiting - because it's a stack of changes where each frame must be interrogated to find an on-disk value, with a base distance of 128 we'll at minimum have to perform 128 frame lookups for *every* database interaction - regardless, the "dag-like" nature will stay * dispose and commit are poorly defined and perhaps redundant - in theory, one could simply let the GC collect abandoned frames etc, though it's likely an explicit mechanism will remain useful, so they stay for now More about the changes: * `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less" corresponds to the old `balancer` field * `AristoDbRef.stack` is gone - instead, there's a chain of `AristoTxRef` objects that hold their respective "layer" which has the actual changes * No more reasoning about "top" and "stack" - instead, each `AristoTxRef` can be a "head" that "more or less" corresponds to the old single-history `top` notion and its stack * `level` still represents "distance to base" - it's computed from the parent chain instead of being stored * one has to be careful not to use frames where forkedchain was intended - layers are only for a single branch of history! * fix layer vtop after rollback * engine fix * Fix test_txpool * Fix test_rpc * Fix copyright year * fix simulator * Fix copyright year * Fix copyright year * Fix tracer * Fix infinite recursion bug * Remove aristo and kvt empty files * Fic copyright year * Fix fc chain_kvt * ForkedChain refactoring * Fix merge master conflict * Fix copyright year * Reparent txFrame * Fix test * Fix txFrame reparent again * Cleanup and fix test * UpdateBase bugfix and fix test * Fixe newPayload bug discovered by hive * Fix engine api fcu * Clean up call template, chain_kvt, andn txguid * Fix copyright year * work around base block loading issue * Add test * Fix updateHead bug * Fix updateBase bug * Change func commitBase to proc commitBase * Touch up and fix debug mode crash --------- Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 08:04:50 +01:00
if not com.proofOfStake(header, txFrame):
? com.validateUncles(header, txFrame, blk.uncles)
Consolidate block type for block processing (#2325) This PR consolidates the split header-body sequences into a single EthBlock sequence and cleans up the fallout from that which significantly reduces block processing overhead during import thanks to less garbage collection and fewer copies of things all around. Notably, since the number of headers must always match the number of bodies, we also get rid of a pointless degree of freedom that in the future could introduce unnecessary bugs. * only read header and body from era file * avoid several unnecessary copies along the block processing way * simplify signatures, cleaning up unused arguemnts and returns * use `stew/assign2` in a few strategic places where the generated nim assignent is slow and add a few `move` to work around poor analysis in nim 1.6 (will need to be revisited for 2.0) ``` stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92% (713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21% (928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28% (1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50% (1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80% (1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91% (1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91% (2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61% (2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80% blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s bpsd (mean): 19.55% tpsd (mean): 19.55% Time (total): -29m13s, -15.14% ```
2024-06-09 16:32:20 +02:00
ok()
# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------