* aristo: fork support via layers/txframes
This change reorganises how the database is accessed: instead of
holding a "current frame" in the database object, a dag of frames is
created based on the "base frame" held in `AristoDbRef`, and all
database access happens through this frame, which can be thought of as
a consistent point-in-time snapshot of the database based on a
particular fork of the chain.
In the code, "frame", "transaction" and "layer" are used to denote more
or less the same thing: a dag of stacked changes backed by the on-disk
database.
Although this is not a requirement, in practice each frame holds the
change set of a single block - as such, the frame and its ancestors
leading up to the on-disk state represent the state of the database
after that block has been applied.
"committing" means merging the changes to its parent frame so that the
difference between them is lost and only the cumulative changes remain -
this facility enables frames to be combined arbitrarily wherever they
are in the dag.
In particular, it becomes possible to consolidate a set of changes near
the base of the dag and commit those to disk without having to re-do the
in-memory frames built on top of them - this is useful for "flattening"
a set of changes during a base update and sending those to storage
without having to perform a block replay on top.
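As a minimal sketch of the idea - hypothetical names stand in for the
actual Aristo types - a frame is just a change set plus a parent link,
and committing folds the change set into the parent:
```nim
import std/tables

type
  Frame = ref object
    parent: Frame                    # nil for the base (on-disk) frame
    layer: Table[string, seq[byte]]  # this frame's own change set

proc commit(frame: Frame) =
  # fold this frame's changes into its parent - the difference between
  # the two frames is lost, only the cumulative change set remains
  for k, v in frame.layer:
    frame.parent.layer[k] = v
```
Since a frame only knows its parent, any frame in the dag can be
committed this way - which is what makes it possible to consolidate
changes near the base and persist them without replaying the in-memory
frames built on top.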
Looking at abstractions, a side effect of this change is that the KVT
and Aristo are brought closer together by considering them part of the
"same" atomic transaction set. The way the code gets organised,
applying a block and saving it to the KVT happen in the same "logical"
frame - therefore, discarding the frame discards both the Aristo and
KVT changes at the same time, and likewise they are persisted to disk
together. This makes reasoning about the database somewhat easier but
has the downside of increased memory usage, something that perhaps
will need addressing in the future.
Because the code reasons more strictly about frames and the state of
the persisted database, it also becomes more visible where ForkedChain
should be used and where it is still missing - in particular, frames
represent a single branch of history while forkedchain manages multiple
parallel forks. User-facing services such as the RPC should use the
latter, i.e. until a block has been finalized, a getBlock request
should consider all forks and not just the blocks in the canonical head
branch.
Another advantage of this approach is that `AristoDbRef` conceptually
becomes simpler - removing its tracking of the "current" transaction
stack simplifies reasoning about what can go wrong, since this state
now has to be passed around in the form of `AristoTxRef`. As such, the
"stack inconsistency" conditions that many tests and facilities in the
code were dealing with are now structurally prevented from happening.
The test suite will need significant refactoring after this change.
Once this change has been merged, there are several follow-ups to do:
* there's no mechanism for keeping frames up to date as they get
committed or rolled back - TODO
* naming is confused - many names for the same thing for legacy reasons
* forkedchain support is still missing in lots of code
* clean up redundant logic based on previous designs - in particular the
debug and introspection code no longer makes sense
* the way change sets are stored will probably need revisiting - because
it's a stack of changes where each frame must be interrogated to find an
on-disk value, with a base distance of 128 we'll at minimum have to
perform 128 frame lookups for *every* database interaction (see the
lookup sketch after this list) - regardless, the "dag-like" nature will
stay
* dispose and commit are poorly defined and perhaps redundant - in
theory, one could simply let the GC collect abandoned frames etc, though
it's likely an explicit mechanism will remain useful, so they stay for
now
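To illustrate the lookup cost flagged in the list above (again with
hypothetical names rather than the actual Aristo API): a read walks the
parent chain towards the base, so a key that misses every in-memory
layer touches each stacked frame before reaching disk.
```nim
import std/[options, tables]

type
  Frame = ref object
    parent: Frame                    # nil at the base frame
    layer: Table[string, seq[byte]]

proc get(frame: Frame, key: string): Option[seq[byte]] =
  # walk towards the base - with a base distance of 128, a miss costs
  # 128 frame lookups before falling through to the on-disk backend
  var f = frame
  while f != nil:
    if key in f.layer:
      return some(f.layer[key])
    f = f.parent
  return none(seq[byte])  # in no layer: consult the database backend
```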
More about the changes:
* `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less"
corresponds to the old `balancer` field
* `AristoDbRef.stack` is gone - instead, there's a chain of
`AristoTxRef` objects that hold their respective "layer" which has the
actual changes
* No more reasoning about "top" and "stack" - instead, each
`AristoTxRef` can be a "head" that "more or less" corresponds to the old
single-history `top` notion and its stack
* `level` still represents "distance to base" - it's computed from the
parent chain instead of being stored (see the sketch after this list)
* one has to be careful not to use frames where forkedchain was intended
- layers are only for a single branch of history!
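For the `level` item above, a minimal sketch of the derivation - the
type is a simplified stand-in for the real `AristoTxRef`:
```nim
type
  AristoTxRef = ref object
    parent: AristoTxRef   # nil at the base frame

proc level(tx: AristoTxRef): int =
  # "distance to base" is derived on demand by walking the parent
  # chain rather than being stored on the frame itself
  var f = tx.parent
  while f != nil:
    inc result
    f = f.parent
```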
* fix layer vtop after rollback
* engine fix
* Fix test_txpool
* Fix test_rpc
* Fix copyright year
* fix simulator
* Fix copyright year
* Fix copyright year
* Fix tracer
* Fix infinite recursion bug
* Remove aristo and kvt empty files
* Fix copyright year
* Fix fc chain_kvt
* ForkedChain refactoring
* Fix merge master conflict
* Fix copyright year
* Reparent txFrame
* Fix test
* Fix txFrame reparent again
* Cleanup and fix test
* UpdateBase bugfix and fix test
* Fix newPayload bug discovered by hive
* Fix engine api fcu
* Clean up call template, chain_kvt, and txguid
* Fix copyright year
* work around base block loading issue
* Add test
* Fix updateHead bug
* Fix updateBase bug
* Change func commitBase to proc commitBase
* Touch up and fix debug mode crash
---------
Co-authored-by: jangko <jangko128@gmail.com>
* Sync scheduler provides an independent `ticker` loop process
why:
Can be used to update `metrics` and for debug logging. While an event
driven solution would stall if there are no events at the moment (e.g.
when the syncer hibernates), the `ticker` will run regardless.
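A minimal sketch of the shape of such a driver, assuming chronos - the
`runTicker` signature here is illustrative, not the actual interface:
```nim
import chronos

proc runTicker(interval: Duration,
               tick: proc() {.gcsafe, raises: [].}) {.async.} =
  # free-running loop: fires at a fixed interval whether or not any
  # sync events arrive, so metrics updates and debug logging keep
  # flowing even while the syncer hibernates
  while true:
    tick()
    await sleepAsync(interval)
```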
* Use a `runTicker()`-style loop interface for updating the ticker
why:
Not event driven anymore so it will not stall when the syncer
hibernates.
* Re-implement logging ticker by running it within the `runTicker()` driver
why:
Simplifies implementation
* Re-name metrics variable to better fit into the current naming schemes
* Fix copyright header
The EVM stack is a hot spot in EVM execution and we end up paying a
Nim `seq` tax in several ways, adding up to ~5% of execution time:
* on initial allocation, all bytes get zeroed - this means we have to
choose between allocating a full stack or just a partial one and then
growing it
* pushing and popping introduce additional zeroing
* reallocations on growth copy + zero - expensive again!
* redundant range checking on every operation, reducing inlining etc.
Here a custom stack using C memory is introduced:
* no zeroing on allocation
* full stack allocated on EVM startup -> no reallocation during
execution
* fast push/pop - no zeroing again
* 32-byte alignment - this makes it easier for the compiler to use
vector instructions
* no stack allocated for precompiles (these never use it anyway)
Of course, this change also means we have to manage memory manually -
for the EVM, this turns out to be not too bad because we already manage
database transactions the same way (they have to be freed "manually") so
we can simply latch on to this mechanism.
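As a rough sketch of the approach - hypothetical names, and with the
32-byte alignment handling elided - the stack is a fixed-capacity block
of manually managed memory:
```nim
const
  evmStackLimit = 1024   # stack depth mandated by the EVM spec
  wordBytes = 32         # one 256-bit EVM word

type
  EvmStack = object
    data: ptr UncheckedArray[byte]  # manually managed backing memory
    len: int                        # words currently on the stack

proc init(T: type EvmStack): T =
  # the full stack is allocated up front, never zeroed, never resized
  T(data: cast[ptr UncheckedArray[byte]](
      allocShared(evmStackLimit * wordBytes)), len: 0)

proc dispose(s: var EvmStack) =
  # freed explicitly, latching on to the same lifetime management
  # already used for database transactions
  deallocShared(s.data)
  s.data = nil

proc push(s: var EvmStack, word: array[wordBytes, byte]): bool =
  # no zeroing, no reallocation - just a bounds check and a copy
  if s.len >= evmStackLimit:
    return false                    # stack overflow
  copyMem(addr s.data[s.len * wordBytes], unsafeAddr word, wordBytes)
  inc s.len
  return true
```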
While we're at it, this PR also skips database lookup for known
precompiles by resolving such addresses earlier.
`updateOk` is obsolete and always set to true - callers should not have
to care about this detail
Also, take the opportunity to clean up storage root naming:
* prefer the spec-derived name where possible
* don't pass stateRoot to LedgerRef and friends (it doesn't do anything)
* add deprecation warning in graphql - it needs updating to use
forkedchain instead
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).
This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.
`ldb` says this is 60GB of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize \
  --hex --from=0x05 \
  --to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
* Extract `CoreDb` constructor helpers from `base.nim` into separate module
why:
This makes it easier to avoid circular imports.
* Extract `Ledger` constructor helpers from `base.nim` into separate module
why:
Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
layout resembles that of the `core_db`.
* Updates and corrections
* Extract `CoreDb` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports, in particular
when the capture journal (aka tracer) is revived.
* Extract `Ledger` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports (if any).
also:
Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
layout resembles that of the `core_db`.
* Tighten `CoreDb` API for accounts
why:
Apart from cruft, the way to fetch the accounts state root via a
`CoreDbColRef` record was unnecessarily complicated.
* Extend `CoreDb` API for accounts to cover storage tries
why:
In future, this will make the notion of column objects obsolete. Storage
trees will then be indexed by the account address rather than the vertex
ID equivalent like a `CoreDbColRef`.
* Apply new/extended accounts API to ledger and tests
details:
This makes the `distinct_ledger` module obsolete
* Remove column object constructors
why:
They were needed as an abstraction of MPT sub-trees including storage
trees. Now, storage trees are handled by the account (e.g. via address)
they belong to and all other trees can be identified by a constant well
known vertex ID. So there is no need for column objects anymore.
Still there are some left-over column object methods which will be
removed next.
* Remove `serialise()` and `PayloadRef` from default Aristo API
why:
Not needed. `PayloadRef` was used for unstructured/unknown payload
formats (account or blob) and `serialise()` was used for decoding
`PayloadRef`. Now it is known in advance what the payload looks
like.
* Added query function `hasStorageData()` for checking whether a storage area exists
why:
Useful for supporting `slotStateEmpty()` of the `CoreDb` API
* In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()`
* On Aristo, hide the storage root/vertex ID in the `PayloadRef`
why:
The storage vertex ID is fully controlled by Aristo while the
`AristoAccount` object is controlled by the application. With the
storage root part of the `AristoAccount` object, there was a useless
administrative burden to keep that storage root field up to date.
* Remove cruft, update comments etc.
* Update changed MPT access paradigms
why:
Fixes verified proxy tests
* Fluffy cosmetics
* Update some docu & messages
* Remove cruft from the ledger modules
* Must not overwrite genesis data on an initialised database
why:
Re-writing genesis data would overwrite the global state of the Aristo
single state DB, making it impossible to resume at the last synced
state.
* Provide latest block number from journal
why:
This relates the global state of the DB directly to the corresponding
block number.
* Implemented unit test providing DB pre-load and resume
* Aristo: Re-phrase `LayerDelta` and `LayerFinal` as object references
why:
Avoids copying in some cases
* Fix copyright header
* Aristo: Verify `leafTie.root` function argument for `merge()` proc
why:
Zero root will lead to inconsistent DB entry
* Aristo: Update failure condition for hash labels compiler `hashify()`
why:
Node need not be rejected as long as links are on the schedule. In
that case, `redo[]` is to become `wff.base[]` at a later stage.
This amends an earlier fix, part of #1952, by also testing against
the target nodes of the `wff.base[]` sets.
* Aristo: Add storage root glue record to `hashify()` schedule
why:
An account leaf node might refer to a non-resolvable storage root ID.
Storage root node chains will end up at the storage root. So the link
`storage-root->account-leaf` needs an extra item in the schedule.
* Aristo: fix error code returned by `fetchPayload()`
details:
Final error code is implied by the error code from the `hikeUp()`
function.
* CoreDb: Discard `createOk` argument in API `getRoot()` function
why:
Not needed for the legacy DB. For the `Aristo` DB, a lazy approach is
implemented where a storage root node is created on-the-fly.
* CoreDb: Prevent `$$` logging in some cases
why:
Logging the function `$$` is not useful when it is used internally,
i.e. for retrieving an error text for logging.
* CoreDb: Add `tryHashFn()` to API for pretty printing
why:
Pretty printing must not change the hashification status for the
`Aristo` DB. So there is an independent API wrapper for getting the
node hash which never updates the hashes.
* CoreDb: Discard `update` argument in API `hash()` function
why:
When calling the API function `hash()`, the latest state is always
wanted. For a version that uses the current state as-is without checking,
the function `tryHash()` was added to the backend.
* CoreDb: Update opaque vertex ID objects for the `Aristo` backend
why:
For `Aristo`, vID objects encapsulate a numeric `VertexID`
referencing a vertex (rather than a node hash as used on the
legacy backend.) For storage sub-tries, there might be no initial
vertex known when the descriptor is created. So opaque vertex ID
objects are supported without a valid `VertexID` which will be
initialised on-the-fly when the first item is merged.
* CoreDb: Add pretty printer for opaque vertex ID objects
* Cosmetics, printing profiling data
* CoreDb: Fix segfault in `Aristo` backend when creating MPT descriptor
why:
Missing initialisation error
* CoreDb: Allow MPT to inherit shared context on `Aristo` backend
why:
Creates descriptors with different storage roots for the same
shared `Aristo` DB descriptor.
* Cosmetics, update diagnostic message items for `Aristo` backend
* Fix Copyright year
* Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup.
why:
Speeds up lookup time with the `Aristo` backend. For writing `Clique`
data, the `Companion` model allows writing past the database locked by
EVM transactions.
* Implement `CoreDb` profiling with API tracking
why:
Chasing time spent per API proc ...
* Implement `Ledger` profiling with API tracking
why:
Chasing time spent per API proc ...
* Always hashify when committing or storing
why:
A dirty cache makes no sense when committing
* Make sure that a zero key is created when adding/updating vertices
why:
This is an error fix mainly for edge cases. A typical error was
that the root key got deleted when there were only a few vertices
left on the DB.
* Need all created and changed vertices zero-keyed on the cache
why:
A zero key (i.e. empty Merkle hash) indicates that a vertex key
needs to be updated. This would not be needed immediately after
a merge as there is an actual leaf path on the cache layer. But
after subsequent merge and delete operations this information
might get blurred.
* Re-org hashing algorithm
why:
Apart from errors, the previous implementation was too slow for
two reasons:
+ some control hashes were calculated for debugging (now all
verification is done in `aristo_check` module)
+ the leaf paths stored on the cache are used to build the
labelling (aka hashing) schedule; these paths were accumulated
over successive hash sessions although it is clear that all
keys were already generated
* Make sure that storage tries are not pruned (by default) on the new Ledger API
why:
Pruning might unintentionally kill some needed entries from storage
tries, ending up with an unstable database leading to crashes.
* Implement `CoreDb` and `LedgerRef` API tracing
details:
+ Locally enabled at compile time via constants `ProvideCoreDbLegacyAPI`
and `EnableApiTracking` in either `base.nim` source
+ If enabled it can be selectively turned on/off via public switches in
the `CoreDb` descriptor.
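A minimal sketch of that gating pattern - `EnableApiTracking` is named
in the text, everything else here is illustrative:
```nim
const EnableApiTracking = true   # compile-time gate in `base.nim`

when EnableApiTracking:
  var apiTracking = false        # runtime switch; in the real code a
                                 # public field on the `CoreDb` descriptor

template track(name: string; body: untyped) =
  # wraps an API call; compiles away entirely when tracking is disabled
  when EnableApiTracking:
    if apiTracking: echo "CoreDb API: ", name
  body
```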
* Allow suppressing opportunistic `ifNecessaryGetXxx()` functions
why:
Better troubleshooting when the system crashes (assertions will then
most probably happen outside an `async` function.)
* Provide TDD/debug facility for inspecting how `persistBlocks()` works
detail:
+ Make sure that the last block of a test sample is the first batch
item in `persistBlocks()`.
+ Additionally, allow `AccountsCache` API tracing by setting the flag
`extraTraceMessages = true` in the file `accounts_cache.nim`
* Overload AccountsCache by abstraction wrapper
details:
Can facilitate CoreDb API switch, details in `ledger/README.md`.