nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
Jordan Hrycaj	5b6ccddaa0	Db folder sources and related remove compiler warnings (#2673 ) * Aristo: Rename `Hash256` -> `Hash32` * CoreDb: Rename `Hash256` -> `Hash32` * Ledger: Rename `Hash256` -> `Hash32` * StorageTypes: Rename `Hash256` -> `Hash32` * Aristo: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256` * Kvt: Rename `Blob` -> `seq[byte]` * CoreDb: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256` * Ledger: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256` * CoreDb: Rename `BlockHeader` -> `Header`, `BlockNonce` -> `Bytes8` * Misc: Rename `StorageKey` -> `Bytes32` * Tracer: `Hash256` -> `Hash32`, `BlockHeader` -> `Header`, etc. * Fix copyright header	2024-10-01 21:03:10 +00:00
Jacek Sieka	c210885b73	eth: bump to new types (#2660 ) This is a minimal set of changes to make things work with the new types in nim-eth - this is the minimal PR that merely resolves incompatibilities while the full change set would include more cleanup and migration.	2024-09-29 14:37:09 +02:00
Jacek Sieka	adb8d64377	simplify VertexRef (#2626 ) * move pfx out of variant which avoids pointless field type panic checks and copies on access * make `VertexRef` a non-inheritable object which reduces its memory footprint and simplifies its use - it's also unclear from a semantic point of view why inheritance makes sense for storing keys	2024-09-13 18:55:17 +02:00
Jacek Sieka	5c1e2e7d3b	Migrate `keyed_queue` to `minilru` (#2608 ) Compared to `keyed_queue`, `minilru` uses significantly less memory, in particular for the 32-byte hash keys where `kq` stores several copies of the key redundantly.	2024-09-13 15:47:50 +02:00
Jacek Sieka	ef1bab0802	avoid some trivial memory allocations (#2587 ) * pre-allocate `blobify` data and remove redundant error handling (cannot fail on correct data) * use threadvar for temporary storage when decoding rdb, avoiding closure env * speed up database walkers by avoiding many temporaries ~5% perf improvement on block import, 100x on database iteration (useful for building analysis tooling)	2024-09-02 16:03:10 +02:00
Jordan Hrycaj	38572bd8ea	Cache a storage root ID forever in the leaf payload of an account (#2551 ) details: Stale root IDs are marked disabled while the ID is kept in the leaf payload. why: This might lead to further caching advantages.	2024-08-07 13:28:01 +00:00
andri lim	a59cc84fca	Not using deprecated functions in config anymore (#2495 )	2024-07-17 02:57:19 +00:00
Jordan Hrycaj	a84a2131cd	No ext update (#2494 ) * Imported/rebase from `no-ext`, PR #2485 Store extension nodes together with the branch Extension nodes must be followed by a branch - as such, it makes sense to store the two together both in the database and in memory: * fewer reads, writes and updates to traverse the tree * simpler logic for maintaining the node structure * less space used, both memory and storage, because there are fewer nodes overall There is also a downside: hashes can no longer be cached for an extension - instead, only the extension+branch hash can be cached - this seems like a fine tradeoff since computing it should be fast. TODO: fix commented code * Fix merge functions and `toNode()` * Update `merkleSignCommit()` prototype why: Result is always a 32 byte hash * Update short Merkle hash key generation details: Ethereum reference MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) This is specified in the yellow paper, appendix D. Different to the `Aristo` implementation, the reference MPT would not store such a node on the key-value database. Rather, the RLP encoded node value is stored in the parent node in place of a node link. Only for the root hash, the top level node is always referred to by the hash. * Fix/update `Extension` sections why: Were commented out after removal of a dedicated `Extension` type which left the system dysfunctional. * Clean up unused error codes * Update unit tests * Update docu --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-07-16 19:47:59 +00:00
Jacek Sieka	f3a56002ca	Turn payload into value type (#2483 ) The Vertex type unifies branches, extensions and leaves into a single memory area where the largest member is the branch (128 bytes + overhead) - the payloads we have are all smaller than 128 thus wrapping them in an extra layer of `ref` is wasteful from a memory usage perspective. Further, the ref:s must be visited during the M&S phase of garbage collection - since we keep millions of these, many of them short-lived, this takes up significant CPU time. ``` Function CPU Time: Total CPU Time: Self Module Function (Full) Source File Start Address system::markStackAndRegisters 10.0% 4.922s nimbus system::markStackAndRegisters(var<system::GcHeap>).constprop.0 gc.nim 0x701230 ```	2024-07-14 12:02:05 +02:00
Jacek Sieka	7d78fd97d5	avoid allocations for slot storage (#2455 ) Introduce a new `StoData` payload type similar to `AccountData` * slightly more efficient storage format * typed api * fewer seqs * fix encoding docs - it wasn't rlp after all :)	2024-07-04 23:48:45 +00:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449 ) The state and account MPT:s currently share key space in the database based on that vertex id:s are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside at large distance from each other. Here, we prefix each vertex id by its root causing them to be sorted together thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
Jacek Sieka	c364426422	Smaller in-database representations (#2436 ) These representations use ~15-20% less data compared to the status quo, mainly by removing redundant zeroes in the integer encodings - a significant effect of this change is that the various rocksdb caches see better efficiency since more items fit in the same amount of space. * use RLP encoding for `VertexID` and `UInt256` wherever it appears * pack `VertexRef`/`PayloadRef` more tightly	2024-07-02 20:25:06 +02:00
Jordan Hrycaj	8dd038144b	Some cleanups (#2428 ) * Remove `dirty` set from structural objects why: Not used anymore, the tree is dirty by default. * Rename `aristo_hashify` -> `aristo_compute` * Remove cruft, update comments, cosmetics, etc. * Simplify `SavedState` object why: The key chaining has become obsolete after extra lazy hashing. There is some available space for a state hash to be maintained in future. details: Accept the legacy `SavedState` object serialisation format for a while (which will be overwritten by new format.)	2024-06-28 18:43:04 +00:00
Jordan Hrycaj	61bbf40014	Update storage tree admin (#2419 ) * Tighten `CoreDb` API for accounts why: Apart from cruft, the way to fetch the accounts state root via a `CoreDbColRef` record was unnecessarily complicated. * Extend `CoreDb` API for accounts to cover storage tries why: In future, this will make the notion of column objects obsolete. Storage trees will then be indexed by the account address rather than the vertex ID equivalent like a `CoreDbColRef`. * Apply new/extended accounts API to ledger and tests details: This makes the `distinct_ledger` module obsolete * Remove column object constructors why: They were needed as an abstraction of MPT sub-trees including storage trees. Now, storage trees are handled by the account (e.g. via address) they belong to and all other trees can be identified by a constant well known vertex ID. So there is no need for column objects anymore. Still there are some left-over column object methods which will be removed next. * Remove `serialise()` and `PayloadRef` from default Aristo API why: Not needed. `PayloadRef` was used for unstructured/unknown payload formats (account or blob) and `serialise()` was used for decoding `PayloadRef`. Now it is known in advance what the payload looks like. * Added query function `hasStorageData()` whether a storage area exists why: Useful for supporting `slotStateEmpty()` of the `CoreDb` API * In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()` * On Aristo, hide the storage root/vertex ID in the `PayloadRef` why: The storage vertex ID is fully controlled by Aristo while the `AristoAccount` object is controlled by the application. With the storage root part of the `AristoAccount` object, there was a useless administrative burden to keep that storage root field up to date. * Remove cruft, update comments etc. * Update changed MPT access paradigms why: Fixes verified proxy tests * Fluffy cosmetics	2024-06-27 09:01:26 +00:00
Jacek Sieka	6b68ff92d3	Allocation-free nibbles buffer (#2406 ) This buffer eliminates a large part of allocations during MPT traversal, reducing overall memory usage and GC pressure. Ideally, we would use it throughout in the API instead of `openArray[byte]` since the built-in length limit appropriately exposes the natural 64-nibble depth constraint that `openArray` fails to capture.	2024-06-22 22:33:37 +02:00
Jordan Hrycaj	51f02090b8	Aristo uses pre classified tree types (#2385 ) * Remove unused `merge()` functions (for production) details: Some functionality moved to test suite Make sure that only `AccountData` leaf type is exactly used on VertexID(1) * clean up payload type * Provide dedicated functions for merging accounts and storage trees why: Storage trees are always linked to an account, so there is no need for an application to fiddle about (e.g. creating, re-cycling) with storage tree vertex IDs. * CoreDb: Disable tracer functionality why: Must be updated to accommodate new/changed `Aristo` functions. * CoreDb: Use new `mergeXXX()` functions why: Makes explicit vertex ID management obsolete for creating new storage trees. * Remove `mergePayload()` and other cruft from API, `aristo_merge`, etc. * clean up merge functions details: The merge implementation `mergePayloadImpl()` does not need to be super generic anymore as all the edge cases are covered by the specialised functions `mergeAccountPayload()`, `mergeGenericData()`, and `mergeStorageData()`. * No tracer available at the moment, so disable offending tests	2024-06-18 11:14:02 +00:00
Jordan Hrycaj	392088e5e9	Coredb fix storage tree issues (#2317 ) * Code cosmetics * Re-org `aristo_merge`, internally split into sub-modules why: Became a burden for maintenance because it hosts two different functionalities under the same merge paradigm: account/data merge and snap proof merge where the latter produces a partial trie. * Fix CoreDb tracer * Ledger: fix potential account vs. storage tree sync problems * Remove bound on the size of removable whole storage trees * Activate `test_tracer_json`	2024-06-07 10:56:31 +00:00
Jordan Hrycaj	8985535ab2	Core db+aristo updates n fixes (#2298 ) * Fix `blobify()` for `SavedState` object why: Have to treat varying sizes for `HashKey`, in particular for an empty key which has zero size. * Store correct block number in `SavedState` record why: Stored `block-number - 1` for some obscure reason. * Cosmetics, docu	2024-06-05 18:17:50 +00:00
Jordan Hrycaj	69a158864c	Remove vid recycling feature (#2294 )	2024-06-04 15:05:13 +00:00
Jordan Hrycaj	cc909c99f2	Fix crash in de-serialiser (#2289 ) why: Late change from `Hash256` to `HashKey` without fully updating the serialiser.	2024-06-04 10:38:11 +00:00
Jordan Hrycaj	f926222fec	Aristo cull journal related stuff (#2288 ) * Remove all journal related stuff * Refactor function names journal() => delta(), filter() => delta() * remove `trg` field from `FilterRef` why: Same as `kMap[$1]` * Re-type FilterRef.src as `HashKey` why: So it is directly comparable to `kMap[$1]` * Moved `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef` why: Then a separate `FilterRef` type is not needed, anymore * Rename `roFilter` field in `AristoDbRef` => `balancer` why: New name more appropriate. * Replace `FilterRef` by `LayerDeltaRef` type why: This allows to avoid copying into the `balancer` (see next patch set) most of the time. Typically, only one instance is running on the backend and the `balancer` is only used as a stage before saving data. * Refactor way how to store data persistently why: Avoid useless copy when staging `top` layer for persistently saving to backend. * Fix copyright header?	2024-06-03 20:10:35 +00:00
Jacek Sieka	9f879406f3	append instead of reallocate in blobify (#2277 ) ...otherwise, we get lots and lots of temporary allocations of seq's	2024-06-01 17:13:24 +02:00
Jordan Hrycaj	bda760f41d	Run coredb without journal (#2266 ) * Add persistent last state stamp feature why: This allows to run `CoreDb` without journal * Start `CoreDb` without journal * Remove journal related functions from `CoreDb`	2024-05-31 17:32:22 +00:00
Jordan Hrycaj	0f430c70fd	Aristo avoid storage trie update race conditions (#2251 ) * Update TDD suite logger output format choices why: New format is not practical for TDD as it just dumps data across a wide range (considerably larger than 80 columns.) So the new format can be turned on by function argument. * Update unit tests samples configuration why: Slightly changed the way to find the `era1` directory * Remove compiler warnings (fix deprecated expressions and phrases) * Update `Aristo` debugging tools * Always update the `storageID` field of account leaf vertices why: Storage tries are weakly linked to an account leaf object in that the `storageID` field is updated by the application. Previously, `Aristo` verified that leaf objects make sense when passed to the database. As a consequence * the database was inconsistent for a short while * the burden for correctness was all on the application which led to delayed error handling which is hard to debug. So `Aristo` will internally update the account leaf objects so that there are no race conditions due to the storage trie handling * Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)` why: The journal and filter logic depends on the hash of the `VertexID(1)` which is commonly known as the state root. This implies that all changes to the database are somehow related to that. * Make sure that a `Ledger` account does not overwrite the storage trie reference why: Due to the abstraction of a sub-trie (now referred to as column with a hash describing its state) there was a weakness in the `Aristo` handler where an account leaf could be overwritten though changing the validity of the database. This has been changed and the database will now reject such changes. This patch fixes the behaviour on the application layer. In particular, the column handle returned by the `CoreDb` needs to be updated by the `Aristo` database state. 
This mitigates the problem that a storage trie might have vanished or re-appeared with a different vertex ID. * Fix sub-trie deletion test why: Was originally hinged on `VertexID(1)` which cannot be wholesale deleted anymore after the last Aristo update. Also, running with `VertexID(2)` needs an artificial `VertexID(1)` for making `stow()` or `persist()` work. * Cosmetics * Activate `test_generalstate_json` * Temporarily deactivate `test_tracer_json` * Fix copyright header --------- Co-authored-by: jordan <jordan@dry.pudding> Co-authored-by: Jacek Sieka <jacek@status.im>	2024-05-30 17:48:38 +00:00
Jacek Sieka	0a49833d69	avoid a few more copies (#2215 )	2024-05-24 11:27:17 +02:00
Jacek Sieka	f38c5e631e	trivial memory-based speedups (#2205 ) * trivial memory-based speedups * HashKey becomes non-ref * use openArray instead of seq in lots of places * avoid sequtils.reversed when unnecessary * add basic perf stats to test_coredb * copyright	2024-05-23 17:37:51 +02:00
andri lim	7c1af9a78f	Add style check to config.nims and fix styles in source code (#2038 ) * Add style check to config.nims and fix styles in source code * Fix copyright year	2024-02-20 10:07:38 +07:00
Guido Vranken	b6599b73f0	Add missing import to aristo_blobify.nim (#1983 )	2024-01-24 12:09:05 +07:00
Jordan Hrycaj	5462c05dc6	Core db update api tracking (#1907 ) * Fix copyright year * Show elapsed times with enabled `CoreDb` API tracking * Show elapsed times with enabled `LedgerRef` API tracking * Reorg `CoreDb` auto destructors for `Aristo` DB why: While `Aristo` supports some parallelism for concurrent database access, this comes with a price of management overhead. With a naive approach, the auto-destructor will slow down execution because the ledger and evm treat the database in a shared mode where a DB descriptor is just created and thrown away shortly after. This is reflected in the `CoreDb` abstraction layer above `Aristo`/`Kvt` where a few `Shared` type descriptors are cached and a shared reference is returned rather than a disposable new object. * For `CoreDb` support transaction level tracking details: This is mainly an extra for the legacy DB as `Aristo` and `Kvt` support this already. Also return an error on the legacy DB backend when `persistent()` is called while there are transactions pending (the `persistent()` call does nothing otherwise on the legacy backend.) * Clear compiler warnings (remove unused variables etc.)	2023-11-24 22:16:21 +00:00
Jordan Hrycaj	6e0397e276	Aristo and ledger small updates (#1888 ) * Fix debug noise in `hashify()` for perfectly normal situation why: Was previously considered a fixable error * Fix test sample file names why: The larger test file `goerli68161.txt.gz` is already in the local archive. So there is no need to use the smaller one from the external repo. * Activate `accounts_cache` module from `db/ledger` why: A copy of the original `accounts_cache.nim` source to be integrated into the `Ledger` module wrapper which allows to switch between different `accounts_cache` implementations under the same API. details: At a later stage, the `db/accounts_cache.nim` wrapper will be removed so that there is only one access to that module via `db/ledger/accounts_cache.nim`. * Fix copyright headers in source code	2023-11-08 16:52:25 +00:00
Jordan Hrycaj	4feaa2cfab	Aristo db update for short nodes key edge cases (#1887 ) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is not stored on key-value database. Rather the RLP encoded node value is stored instead of a node link in a parent node. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. 
This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)	2023-11-08 12:18:32 +00:00
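The node link rule described in #1887 and #2494 (a hash for RLP encoded nodes of at least 32 bytes, the encoded node itself otherwise, and always a hash for the root) can be illustrated with a minimal Python sketch. This is not the `Aristo` code: the names `node_link`, `is_hash_link` and `sha3_stand_in` are hypothetical, and `hashlib.sha3_256` is only a stand-in for Keccak-256, which uses different padding and yields different digests.

```python
import hashlib
from typing import Callable


def sha3_stand_in(data: bytes) -> bytes:
    """Stand-in 32-byte hash. Ethereum uses Keccak-256, which is NOT
    identical to NIST SHA3-256; swap in a real Keccak-256 for actual use."""
    return hashlib.sha3_256(data).digest()


def node_link(rlp_node: bytes,
              is_root: bool = False,
              hash_fn: Callable[[bytes], bytes] = sha3_stand_in) -> bytes:
    """Return the link a parent would hold for an RLP encoded node.

    - The root node is always referred to by its hash.
    - A node of at least 32 bytes is referred to by its hash.
    - A shorter node is embedded as-is (a blob of at most 31 bytes).
    """
    if is_root or len(rlp_node) >= 32:
        return hash_fn(rlp_node)
    return rlp_node


def is_hash_link(link: bytes) -> bool:
    """A link is a hash exactly when it is 32 bytes long; embedded
    nodes are at most 31 bytes, so the two cases cannot collide."""
    return len(link) == 32


short_node = bytes([0xC2, 0x80, 0x80])   # a 3-byte RLP list, illustrative only
long_node = bytes(40)                    # 40 bytes of illustrative payload

print(is_hash_link(node_link(short_node)))                # embedded blob
print(is_hash_link(node_link(long_node)))                 # hashed
print(is_hash_link(node_link(short_node, is_root=True)))  # root is hashed
```

Because an embedded node can never reach 32 bytes, the length alone distinguishes the two kinds of link, which is what allows a single `HashKey`-like value to hold either.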

31 Commits