nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
Jordan Hrycaj	1452e7b1c0	Misc updates (#2513 ) * Update config for Ledger and CoreDb why: Prepare for tracer which depends on the API jump table (as well as the profiler.) The API jump table is now enabled in unit/integration test mode piggybacking on the `unittest2DisableParamFiltering` compiler flag or on an extra compiler flag `dbjapi_enabled`. * No deed for error field in `NodeRef` why: Was opnly needed by proof nodes pre-loader which will be re-implemented * Cosmetics	2024-07-22 18:10:04 +00:00
Jordan Hrycaj	f08178c592	Separate constructor helpers for core db and ledger (#2480 ) * Extract `CoreDb` constructor helpers from `base.nim` into separate module why: This makes it easier to avoid circular imports. * Extract `Ledger` constructor helpers from `base.nim` into separate module why: Move `accounts_ledger.nim` file to sub-folder `backend`. That way the layout resembles that of the `core_db`.	2024-07-12 19:32:31 +00:00
Jordan Hrycaj	b924fdcaa7	Separate config for core db and ledger (#2479 ) * Updates and corrections * Extract `CoreDb` configuration from `base.nim` into separate module why: This makes it easier to avoid circular imports, in particular when the capture journal (aka tracer) is revived. * Extract `Ledger` configuration from `base.nim` into separate module why: This makes it easier to avoid circular imports (if any.) also: Move `accounts_ledger.nim` file to sub-folder `backend`. That way the layout resembles that of the `core_db`.	2024-07-12 13:12:25 +00:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449 ) The state and account MPT:s currenty share key space in the database based on that vertex id:s are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside at large distance from each other. Here, we prefix each vertex id by its root causing them to be sorted together thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
Jordan Hrycaj	6dc2773957	Only use pre hashed addresses as account keys (#2424 ) * Normalised storage tree addressing in function prototypes detail: Argument list is always `<db> <account-path> <slot-path> ..` with both path arguments as `openArray[]` * Remove cruft * CoreDb internally Use full account paths rather than addresses * Update API logging * Use hashed account address only in prototypes why: This avoids unnecessary repeated hashing of the same account address. The burden of doing that is upon the application. In the case here, the ledger caches all kinds of stuff anyway so it is common sense to exploit that for account address hashes. caveat: Using `openArray[byte]` argument types for hashed accounts is inherently fragile. In non-release mode, a length verification `doAssert` is enabled by default. * No accPath in data record (use `AristoAccount` as `CoreDbAccount`) * Remove now unused `eAddr` field from ledger `AccountRef` type why: Is duplicate of lookup key * Avoid merging the account record/statement in the ledger twice.	2024-06-27 19:21:01 +00:00
Jordan Hrycaj	61bbf40014	Update storage tree admin (#2419 ) * Tighten `CoreDb` API for accounts why: Apart from cruft, the way to fetch the accounts state root via a `CoreDbColRef` record was unnecessarily complicated. * Extend `CoreDb` API for accounts to cover storage tries why: In future, this will make the notion of column objects obsolete. Storage trees will then be indexed by the account address rather than the vertex ID equivalent like a `CoreDbColRef`. * Apply new/extended accounts API to ledger and tests details: This makes the `distinct_ledger` module obsolete * Remove column object constructors why: They were needed as an abstraction of MPT sub-trees including storage trees. Now, storage trees are handled by the account (e.g. via address) they belong to and all other trees can be identified by a constant well known vertex ID. So there is no need for column objects anymore. Still there are some left-over column object methods wnich will be removed next. * Remove `serialise()` and `PayloadRef` from default Aristo API why: Not needed. `PayloadRef` was used for unstructured/unknown payload formats (account or blob) and `serialise()` was used for decodng `PayloadRef`. Now it is known in advance what the payload looks like. * Added query function `hasStorageData()` whether a storage area exists why: Useful for supporting `slotStateEmpty()` of the `CoreDb` API * In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()` * On Aristo, hide the storage root/vertex ID in the `PayloadRef` why: The storage vertex ID is fully controlled by Aristo while the `AristoAccount` object is controlled by the application. With the storage root part of the `AristoAccount` object, there was a useless administrative burden to keep that storage root field up to date. * Remove cruft, update comments etc. * Update changed MPT access paradigms why: Fixes verified proxy tests * Fluffy cosmetics	2024-06-27 09:01:26 +00:00
Jacek Sieka	f294d1e086	Clear account cache after each block (#2411 ) When processing long ranges of blocks, the account cache grows unbounded which cause huge memory spikes. Here, we move the cache to a second-level cache after each block - the second-level cache is cleared on the next block after that which creates a simple LRU effect. There's a small performance cost of course, though overall the freed-up memory can now be reassigned to the rocksdb row cache which not only makes up for the loss but overall leads to a performance increase. The bump to 2gb of rocksdb row cache here needs more testing but is slightly less and loosely basedy on the savings from this PR and the circular ref fix in #2408 - another way to phrase this is that it's better to give rocksdb more breathing room than let the memory sit unused until circular ref collection happens ;)	2024-06-25 07:30:32 +02:00
Jacek Sieka	768307d91d	Cache code and invalid jump destination tables (fixes #2268 ) (#2404 ) It is common for many accounts to share the same code - at the database level, code is stored by hash meaning only one copy exists per unique program but when loaded in memory, a copy is made for each account. Further, every time we execute the code, it must be scanned for invalid jump destinations which slows down EVM exeuction. Finally, the extcodesize call causes code to be loaded even if only the size is needed. This PR improves on all these points by introducing a shared CodeBytesRef type whose code section is immutable and that can be shared between accounts. Further, a dedicated `len` API call is added so that the EXTCODESIZE opcode can operate without polluting the GC and code cache, for cases where only the size is requested - rocksdb will in this case cache the code itself in the row cache meaning that lookup of the code itself remains fast when length is asked for first. With 16k code entries, there's a 90% hit rate which goes up to 99% during the 2.3M attack - the cache significantly lowers memory consumption and execution time not only during this event but across the board.	2024-06-21 09:44:10 +02:00
Jordan Hrycaj	081cb15493	Coredb maintenance (#2398 ) * CoreDb: remove PHK tries why: There is no general use anymore for an MPT with a pre-hashed key. It was used to resemble the `SecureHexaryTrie` logic from the legacy DB. The only pace where this is needed is the `Leger` which uses a a distinct MPT version anyway (see `distinct_ledgers.nim`.) * Rename `CoreDx` -> `CoreDb` why: The naming `CoreDx` was used to differentiate the new CoreDb API from the legacy API which had descriptors named `CoreDb`.	2024-06-19 14:13:12 +00:00
andri lim	69044dda60	Remove AccountStateDB (#2368 ) * Remove AccountStateDB AccountStateDB should no longer be used. It's usage have been reduce to read only operations. Replace it with LedgerRef to reduce maintenance burden. * remove extra spaces Co-authored-by: tersec <tersec@users.noreply.github.com> --------- Co-authored-by: tersec <tersec@users.noreply.github.com>	2024-06-16 10:21:02 +07:00
web3-developer	db8c5b90bd	Cleanup stateless and block witness code. (#2295 ) * Cleanup unneeded stateless and block witness code. Keeping MultiKeys which is used in the eth_getProofsByBlockNumber RPC endpoint which is needed for the Fluffy state network bridge. * Rename generateWitness flag to collectWitnessData to better describe what the flag does. We only collect the keys of the touched accounts and storage slots but no block witness generation is supported for now. * Move remaining stateless code into nimbus directory. * Add vmstate parameter to ChainRef to fix test. * Exclude *.in from check copyright year --------- Co-authored-by: jangko <jangko128@gmail.com>	2024-06-08 15:05:00 +07:00
Jordan Hrycaj	e9eae4df70	Core db disable legacy api n remove distinct tries (#2299 ) * CoreDb: Remove crufty second/off-site KVT why: Was used to allow late `Clique` to store directly to disk * CoreDb: Remove prune flag related functionality why: Is completely legacy stuff * CoreDb: Remove dependence on legacy API (tests unsupported yet) why: Does not fully support Aristo * Re-factoring `state_db` using new API details: Only minimum changes needed to compile `nimbus` * Update tests and aux modules * Turn off legacy API and remove `distinct_tries` comment: The legacy API has now cruft status, will be removed soon * Fix copyright years * Update rpc for verified proxy --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-06-05 20:52:04 +00:00
Jacek Sieka	7f76586214	Speed up account ledger a little (#2279 ) `persist` is a hotspot when processing blocks because it is run at least once per transaction and loops over the entire account cache every time. Here, we introduce an extra `dirty` map that keeps track of all accounts that need checking during `persist` which fixes the immediate inefficiency, though probably this could benefit from a more thorough review - we also get rid of the unused clearCache flag - we start with a fresh cache on every fresh vmState. * avoid unnecessary code hash comparisons * avoid unnecessary copies when iterating * use EMPTY_CODE_HASH throughout for code hash comparison	2024-06-02 21:21:29 +02:00
andri lim	eaf3d9897e	Simplify AccountsLedgerRef complexity (#2239 )	2024-05-29 13:06:49 +02:00
Jordan Hrycaj	2c322054e0	Attempt to roll back stateless mode implementation in a single PR (#2209 ) * Attempt to roll back stateless mode implementation in a single PR why: + Stateless mode is not fully working and in the way + Single PR should make it feasible to investigate for a possible re-implementation * Fix copyright year * Fix annotation for exception (evmc mode)	2024-05-22 21:01:19 +00:00
Jordan Hrycaj	7d9e1d8607	Misc updates for full sync (#2140 ) * Code cosmetics * Aristo+Kvt: Fix api wrappers why: Api setup killed the backend descriptor when backend mapping was disabled. * Aristo: Implement masked profiling entries why: Database backend should be listed but not counted in tally * CoreDb: Simplify backend() methods why: DBMS backend access Was provided very early and over engineered. Now there are only two backend machines, one for `Kvt` and the other one for an `Mpt` available only via new API. * CoreDb: Code cleanup regarding descriptor types * CoreDb: Refactor/redefine `persistent()` methods why: There were `persistent()` methods for any type of caching storage facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single `persistent()` method storing all facilities in tandem (similar to how transactions work.) For non shared `Kvt` tables, there is now an extra storage method `saveOffSite()`. * CoreDb lingo update: `trie` becomes `column` why: Notion of a `trie` is pretty much hidden by the new `CoreDb` api. Revealed are sort of database columns for accounts an storage data, any of which have an internal state represented by a Keccack hash. So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a column state. * Aristo: rename backend filed `filters` => `journal` * Update full sync logging details: + Disable eth handler noise while syncing + Log journal depth (if available) * Fix copyright year * Fix cruft and unwanted imports	2024-04-19 18:37:27 +00:00
jangko	7a941c8ca0	Rename hasCodeOrNonce to contractCollision	2024-04-16 09:31:10 +07:00
Jordan Hrycaj	14a5f46d13	Core db implement ctx layer for mpt state admin (#2082 ) * CoreDb+Ledger: Update logging why: Use symbol `api` rather than `ctx` because the latter will be used as name for particular objects * CoreDb: Remove cruft * CoreDb: Remove `TxID` support why: It is nowhere used and ugly implemented. The upcoming context layer will be a cleaner alternative to use, instead should this particular functionality be needed. * CoreDb: Rearrange base methods in source code for better reading * CoreDb+Aristo: Update API closures for better reading & maintenance * CoreDb: Implement context layer for MPT why: On `Aristo` the context layer allows to manage different views on the same backend database. This is an abstraction of the legacy hexary trie which can be localised on a particular root nose. details: The `ctx` context provides the state (equiv. to state root) of the database for MPT and account descriptors. * Fix Copyright headers	2024-03-18 19:40:23 +00:00
Jordan Hrycaj	587ca3abbe	Coredb use stackable api for aristo backend (#2060 ) * Aristo/Kvt: Provide function hooks APIs why: These APIs can be used for installing tracers, profiling functoinality, and other niceties on the databases. * Aristo: Provide optional API profiling details: It basically is a re-implementation of the `CoreDb` profiling implementation * Kvt: Provide optional API profiling similar to `Aristo` * CoreDb: Re-implementing profiling using `aristo_profile` * Ledger: Re-implementing profiling using `aristo_profile` * CoreDb: Update unit tests for maintainability * update copyright dates	2024-02-29 21:10:24 +00:00
andri lim	7722a907dc	Implement getAccessList of db/ledger (#2055 )	2024-02-29 18:16:47 +07:00
andri lim	6ff2edc416	Fix styles (#2046 ) * Fix styles * Fix copyright year	2024-02-21 23:04:59 +07:00
Jordan Hrycaj	13f51939f6	Core db aristo hasher profiling and timing improvement (#1938 ) * Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup. why: Speeds up lookup time with `Aristo` backend. For writing `Clique` data, the `Companion` model allows to write `Clique` data past the database locked by evm transactions. * Implement `CoreDb` profiling with API tracking why: Chasing time spent per APT procs ... * Implement `Ledger` profiling with API tracking why: Chasing time spent per APT procs ... * Always hashify when commiting or storing why: A dirty cache makes no sense when committing * Make sure that a zero key is created when adding/updating vertices why: This is an error fix mainly for edge cases. A typical error was that the root key got deleted when there were only a few vertices left on the DB. * Need all created and changed vertices zero-keyed on the cache why: A zero key (i.e. empty Merkle hash) indicates that a vertex key needs to be updated. This would not be needed immediately after a merge as there is an actual leaf path on the cache layer. But after subsequent merge and delete operations this information might get blurred. * Re-org hashing algorithm why: Apart from errors, the previous implementation was too slow for two reasons: + some control hashes were calculated for debugging (now all verification is done in `aristo_check` module) + the leaf paths stored on the cache are used to build the labelling (aka hashing) schedule; there paths were accumulated over successive hash sessions although it is clear that all keys were generated, already	2023-12-12 17:47:41 +00:00
Jordan Hrycaj	5462c05dc6	Core db update api tracking (#1907 ) * Fix copyright year * Show elapsed times with enabled `CoreDb` API tracking * Show elapsed times with enabled `LedgerRef` API tracking * Reorg `CoreDb` auto destructors for `Aristo` DB why: While `Aristo` supports some parallelism for concurrent database access, this comes with a price of management overhead. With a naive approach, the auto-destructor will slow down execution because the ledger and evm treat the database in a shared mode where a DB descriptor is just created and thrown away shortly after. This is reflected in the `Coredb` abstraction layer above `Aristo`/`Kvt` where a few `Shared` type descriptors are cached and a shared reference is returned rather than a disposable new object. * For `CoreDb` support transaction level tracking details: This is mainly an extra for the legacy DB as `Aristo` and `Kvt` support this already. Also return an error on the legacy DB backend when `persistent()` is called while there are transactions pending (the `persistent()` call does nothing otherwise on the legacy backend.) * Clear compiler warnings (remove unused variables etc.)	2023-11-24 22:16:21 +00:00
Jordan Hrycaj	610e2d338d	Core db fix legacy db root vertex fetcher (#1899 ) * Using different `tmp` directories for `Kvt` and `Aristo` why: Closing one database would leave the other set of directories incomplete. * Code cosmetics, silence compiler * Fix typo `EMPTY_ROOT_HASH` vs. `EMPTY_CODE_HASH` * Fix copyright years	2023-11-20 20:22:27 +00:00
Jordan Hrycaj	3e88589eb1	Optional accounts cache module for creating genesis (#1897 ) * Split off `ReadOnlyStateDB` from `AccountStateDB` from `state_db.nim` why: Apart from testing, applications use `ReadOnlyStateDB` as an easy way to access the accounts ledger. This is well supported by the `Aristo` db, but writable mode is only parially supported. The writable AccountStateDB` object for modifying accounts is not used by production code. So, for lecgacy and testing apps, the full support of the previous `AccountStateDB` is now enabled by `import db/state_db/read_write` and the `import db/state_db` provides read-only mode. * Encapsulate `AccountStateDB` as `GenesisLedgerRef` or genesis creation why: `AccountStateDB` has poor support for `Aristo` and is not widely used in favour of `AccountsLedger` (which will be abstracted as `ledger`.) Currently, using other than the `AccountStateDB` ledgers within the `GenesisLedgerRef` wrapper is experimental and test only. Eventually, the wrapper should disappear so that the `Ledger` object (which encapsulates `AccountsCache` and `AccountsLedger`) will prevail. * For the `Ledger`, provide access to raw accounts `MPT` why: This gives to the `CoreDbMptRef` descriptor from the `CoreDb` (which is the legacy version of CoreDxMptRef`.) For the new `ledger` API, the accounts are based on the `CoreDxMAccRef` descriptor which uses a particular sub-system for accounts while legacy applications use the `CoreDbPhkRef` equivalent of the `SecureHexaryTrie`. The only place where this feature will currently be used is the `genesis.nim` source file. * Fix `Aristo` bugs, missing boundary checks, typos, etc. * Verify root vertex in `MPT` and account constructors why: Was missing so far, in particular the accounts constructor must verify `VertexID(1) * Fix include file	2023-11-20 11:51:43 +00:00
Jordan Hrycaj	c47f021596	Core db and aristo updates for destructor and tx logic (#1894 ) * Disable `TransactionID` related functions from `state_db.nim` why: Functions `getCommittedStorage()` and `updateOriginalRoot()` from the `state_db` module are nowhere used. The emulation of a legacy `TransactionID` type functionality is administratively expensive to provide by `Aristo` (the legacy DB version is only partially implemented, anyway). As there is no other place where `TransactionID`s are used, they will not be provided by the `Aristo` variant of the `CoreDb`. For the legacy DB API, nothing will change. * Fix copyright headers in source code * Get rid of compiler warning * Update Aristo code, remove unused `merge()` variant, export `hashify()` why: Adapt to upcoming `CoreDb` wrapper * Remove synced tx feature from `Aristo` why: + This feature allowed to synchronise transaction methods like begin, commit, and rollback for a group of descriptors. + The feature is over engineered and not needed for `CoreDb`, neither is it complete (some convergence features missing.) * Add debugging helpers to `Kvt` also: Update database iterator, add count variable yield argument similar to `Aristo`. * Provide optional destructors for `CoreDb` API why; For the upcoming Aristo wrapper, this allows to control when certain smart destruction and update can take place. The auto destructor works fine in general when the storage/cache strategy is known and acceptable when creating descriptors. * Add update option for `CoreDb` API function `hash()` why; The hash function is typically used to get the state root of the MPT. Due to lazy hashing, this might be not available on the `Aristo` DB. So the `update` function asks for re-hashing the gurrent state changes if needed. * Update API tracking log mode: `info` => `debug * Use shared `Kvt` descriptor in new Ledger API why: No need to create a new descriptor all the time	2023-11-16 19:35:03 +00:00
Jordan Hrycaj	3198ad1bbd	Fix default pruning for ledger and update core db and ledger logging (#1861 ) * Make sure that storage tries are not pruned (by default) on the new Ledger API why: Pruning might kill some unwanted entries from storage tries ending up with an unstable database leading to crashes. * Implement `CoreDb` and `LedgerRef` API tracing details: + Locally enabled at compile time via constants `ProvideCoreDbLegacyAPI` and `EnableApiTracking` in either `base.nim` source + If enabled it can be selectively turned on/off via public switches in the `CoreDb` descriptor. * Allow suppressing opportunistic `ifNecessaryGetXxx()` functions why: Better troubleshooting when the system crashes (assertions will then most probably happen outside an `async` function.)	2023-10-25 15:03:09 +01:00
Jordan Hrycaj	e8ad950e0a	Ledger abstraction for accounts cache (#1824 ) * Provide TDD/debug facility for inspecting `persistBlocks()` working detail: + Make sure that the last block of a test sample is the first batch item in `persistBlocks()`. + Additionally, allow `AccountsCache` API tracing by setting the flag `extraTraceMessages = true` in the file `accounts_cache.nim` * Overload AccountsCache by abstraction wrapper details: Can facilitate CoreDb API switch, details in `ledger/README.md`.	2023-10-18 20:27:22 +01:00

28 Commits