nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
Jordan Hrycaj	72c3ab8ced	Provide partial tree support for preloading tests (#2536 ) * Implement partial trees why: This is currently needed for unit tests to pre-load the database with test data similar to `proof` node pre-load. The basic features for `snap-sync` boundary proofs are available as well for future use. What is missing is the final proof verification and a complete storage data load/merge function (stub is available.) * Cosmetics, clean up	2024-07-29 20:15:17 +00:00
Jordan Hrycaj	5ac362fe6f	Aristo and kvt balancer management update (#2504 ) * Aristo: Merge `delta_siblings` module into `deltaPersistent()` * Aristo: Add `isEmpty()` for canonical checking whether a layer is empty * Aristo: Merge `LayerDeltaRef` into `LayerObj` why: No need to maintain nested object refs anymore. Previously the `LayerDeltaRef` object had a companion `LayerFinalRef` which held non-delta layer information. * Kvt: Merge `LayerDeltaRef` into `LayerRef` why: No need to maintain nested object refs (as with `Aristo`) * Kvt: Re-write balancer logic similar to `Aristo` why: Although `Kvt` was a cheap copy of `Aristo` it sort of got out of sync and the balancer code was wrong. * Update iterator over forked peers why: Yield additional field `isLast` indicating that the last iteration cycle was approached. * Optimise balancer calculation. why: One can often avoid providing a new object containing the merge of two layers for the balancer. This avoids copying tables. In some cases this is replaced by `hasKey()` look ups though. One uses one of the two to combine and merges the other into the first. Of course, this needs some checks for making sure that none of the components to merge is eventually shared with something else. * Fix copyright year	2024-07-18 21:32:32 +00:00
Jordan Hrycaj	6677f57ea9	Aristo balancer clean up (#2501 ) * Remove `chunkedMpt` from `persistent()`/`stow()` function why: Proof-mode code was removed with PR #2445 and needs to be re-designed. * Remove unused `beStateRoot` argument from `deltaMerge()` * Update/drastically simplify `txStow()` why: Got rid of many boundary conditions details: Many pre-conditions have changed. In particular, previous versions used the account state (hash) which was conveniently available and checked it against the backend in order to find out whether there was something to do, at all. Currently, only an empty set of all tables in the delta layer has the balancer update ignored. Notable changes are: * no check against account state (see above) * balancer filters have no hash signature (some legacy stuff left over from journals) * no (shap sync) proof data which made the generation of the a top layer more complex * Cosmetics, cruft removal * Update unit test file & function name why: Was legacy module	2024-07-17 19:27:33 +00:00
Jordan Hrycaj	a84a2131cd	No ext update (#2494 ) * Imported/rebase from `no-ext`, PR #2485 Store extension nodes together with the branch Extension nodes must be followed by a branch - as such, it makes sense to store the two together both in the database and in memory: * fewer reads, writes and updates to traverse the tree * simpler logic for maintaining the node structure * less space used, both memory and storage, because there are fewer nodes overall There is also a downside: hashes can no longer be cached for an extension - instead, only the extension+branch hash can be cached - this seems like a fine tradeoff since computing it should be fast. TODO: fix commented code * Fix merge functions and `toNode()` * Update `merkleSignCommit()` prototype why: Result is always a 32bit hash * Update short Merkle hash key generation details: Ethereum reference MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) This is specified in the yellow paper, appendix D. Different to the `Aristo` implementation, the reference MPT would not store such a node on the key-value database. Rather the RLP encoded node value is stored instead of a node link in a parent node is stored as a node link on the parent database. Only for the root hash, the top level node is always referred to by the hash. * Fix/update `Extension` sections why: Were commented out after removal of a dedicated `Extension` type which left the system disfunctional. * Clean up unused error codes * Update unit tests * Update docu --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-07-16 19:47:59 +00:00
Jacek Sieka	f3a56002ca	Turn payload into value type (#2483 ) The Vertex type unifies branches, extensions and leaves into a single memory area where the larges member is the branch (128 bytes + overhead) - the payloads we have are all smaller than 128 thus wrapping them in an extra layer of `ref` is wasteful from a memory usage perspective. Further, the ref:s must be visited during the M&S phase of garbage collection - since we keep millions of these, many of them short-lived, this takes up significant CPU time. ``` Function CPU Time: Total CPU Time: Self Module Function (Full) Source File Start Address system::markStackAndRegisters 10.0% 4.922s nimbus system::markStackAndRegisters(var<system::GcHeap>).constprop.0 gc.nim 0x701230` ```	2024-07-14 12:02:05 +02:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449 ) The state and account MPT:s currenty share key space in the database based on that vertex id:s are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside at large distance from each other. Here, we prefix each vertex id by its root causing them to be sorted together thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
Jacek Sieka	b23795ab39	remove pPrf, fRpp (#2445 ) No longer used now that hashify is gone	2024-07-03 22:21:57 +02:00
Jacek Sieka	1f60e8e453	Use `Hash256` directly for account path (#2439 ) Account paths are always a hash - passing it around as such helps avoid confusion as to how long it is	2024-07-03 10:14:26 +02:00
Jordan Hrycaj	2c87fd1636	Aristo code cosmetics and tests update (#2434 ) * Update some docu * Resolve obsolete compile time option why: Not optional anymore * Update checks why: The notion of what constitutes a valid `Aristo` db has changed due to (even more) lazy calculating Merkle hash keys. * Disable redundant unit test for production	2024-07-01 10:59:18 +00:00
Jordan Hrycaj	8dd038144b	Some cleanups (#2428 ) * Remove `dirty` set from structural objects why: Not used anymore, the tree is dirty by default. * Rename `aristo_hashify` -> `aristo_compute` * Remove cruft, update comments, cosmetics, etc. * Simplify `SavedState` object why: The key chaining have become obsolete after extra lazy hashing. There is some available space for a state hash to be maintained in future. details: Accept the legacy `SavedState` object serialisation format for a while (which will be overwritten by new format.)	2024-06-28 18:43:04 +00:00
Jordan Hrycaj	14c3772545	On demand mpt revisited (#2426 ) * rebased from `github/on-demand-mpt` ackn: wip: on-demand mpt construction Given that actual data is stored in the `Vertex` structure, it's useful to think of the MPT as a cache for computing roots rather than being a functional requirement on its own. This PR engenders this line of thinking by incrementally computing the MPT only when it's needed, ie when a state (or similar) root is needed. This has the effect of siginficantly reducing memory usage as well as improving performance: * no need for dirty-mpt-node book-keeping * no need to build complex forest of upcoming hashing work * only hashes that are functionally needed are ever computed - intermediate nodes whose MTP root is not observed are never computed / processed * Unit test hot fixes * Unit test hot fixes cont. (somehow lost that part) --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-06-28 15:03:12 +00:00
Jordan Hrycaj	61bbf40014	Update storage tree admin (#2419 ) * Tighten `CoreDb` API for accounts why: Apart from cruft, the way to fetch the accounts state root via a `CoreDbColRef` record was unnecessarily complicated. * Extend `CoreDb` API for accounts to cover storage tries why: In future, this will make the notion of column objects obsolete. Storage trees will then be indexed by the account address rather than the vertex ID equivalent like a `CoreDbColRef`. * Apply new/extended accounts API to ledger and tests details: This makes the `distinct_ledger` module obsolete * Remove column object constructors why: They were needed as an abstraction of MPT sub-trees including storage trees. Now, storage trees are handled by the account (e.g. via address) they belong to and all other trees can be identified by a constant well known vertex ID. So there is no need for column objects anymore. Still there are some left-over column object methods wnich will be removed next. * Remove `serialise()` and `PayloadRef` from default Aristo API why: Not needed. `PayloadRef` was used for unstructured/unknown payload formats (account or blob) and `serialise()` was used for decodng `PayloadRef`. Now it is known in advance what the payload looks like. * Added query function `hasStorageData()` whether a storage area exists why: Useful for supporting `slotStateEmpty()` of the `CoreDb` API * In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()` * On Aristo, hide the storage root/vertex ID in the `PayloadRef` why: The storage vertex ID is fully controlled by Aristo while the `AristoAccount` object is controlled by the application. With the storage root part of the `AristoAccount` object, there was a useless administrative burden to keep that storage root field up to date. * Remove cruft, update comments etc. * Update changed MPT access paradigms why: Fixes verified proxy tests * Fluffy cosmetics	2024-06-27 09:01:26 +00:00
Jacek Sieka	41cf81f80b	Fix dboptions init (#2391 ) For the block cache to be shared between column families, the options instance must be shared between the various column families being created. This also ensures that there is only one source of truth for configuration options instead of having two different sets depending on how the tables were initialized. This PR also removes the re-opening mechanism which can double startup time - every time the database is opened, the log is replayed - a large log file will take a long time to open. Finally, several options got correclty implemented as column family options, including an one that puts a hash index in the SST files.	2024-06-19 10:55:57 +02:00
Jordan Hrycaj	8727307ef4	Aristo uses pre classified tree types cont1 (#2389 ) * Provide dedicated functions for deleteing accounts and storage trees why: Storage trees are always linked to an account, so there is no need for an application to fiddle about (e.g. re-cycling, unlinking) storage tree vertex IDs. * Remove `delete()` and other cruft from API, `aristo_delete`, etc. * clean up delete functions details: The delete implementations `deleteImpl()` and `delTreeImpl()` do not need to be super generic anymore as all the edge cases are covered by the specialised `deleteAccountPayload()`, `deleteGenericData()`, etc. * Avoid unnecessary re-calculations of account keys why: The function `registerAccountForUpdate()` did extract the storage ID (if any) and automatically marked the Merkle keys along the account path for re-hashing. This would also apply if there was later detected that the account or the storage tree did not need to be updated. So the `registerAccountForUpdate()` function was split into a part which retrieved the storage ID, and another one which marked the Merkle keys for re-calculation to be applied only when needed.	2024-06-18 19:30:01 +00:00
Jordan Hrycaj	51f02090b8	Aristo uses pre classified tree types (#2385 ) * Remove unused `merge()` functions (for production) details: Some functionality moved to test suite Make sure that only `AccountData` leaf type is exactly used on VertexID(1) * clean up payload type * Provide dedicated functions for merging accounts and storage trees why: Storage trees are always linked to an account, so there is no need for an application to fiddle about (e.e. creating, re-cycling) with storage tree vertex IDs. * CoreDb: Disable tracer functionality why: Must be updated to accommodate new/changed `Aristo` functions. * CoreDb: Use new `mergeXXX()` functions why: Makes explicit vertex ID management obsolete for creating new storage trees. * Remove `mergePayload()` and other cruft from API, `aristo_merge`, etc. * clean up merge functions details: The merge implementation `mergePayloadImpl()` does not need to be super generic anymore as all the edge cases are covered by the specialised functions `mergeAccountPayload()`, `mergeGenericData()`, and `mergeStorageData()`. * No tracer available at the moment, so disable offending tests	2024-06-18 11:14:02 +00:00
Jordan Hrycaj	debba5a620	Coeredb related clean up and maint fixes (#2360 ) * Fix initialiser why: Possible crash (app profiling, tracer etc.) * Update column family options processing why: Same for kvt as for aristo * Move `AristoDbDualRocks` backend type to the test suite why: So it is not available for production * Fix typos in API jump table why: Used for tracing and app profiling only. Needed some update * Purged CoreDb legacy API why: Not needed anymore, was transitionary and disabled. * Rename `flush` argument to `eradicate` in a DB close context why: The word `eradicate` leaves no doubt what is meant * Rename `stoFlush()` -> `stoDelete()` * Rename `core_apps_newapi` -> `core_apps` (not so new anymore)	2024-06-14 11:19:48 +00:00
Jordan Hrycaj	5a5cc6295e	Triggered write event for kvt (#2351 ) * bump rockdb * Rename `KVT` objects related to filters according to `Aristo` naming details: filter* => delta* roFilter => balancer * Compulsory error handling if `persistent()` fails * Add return code to `reCentre()` why: Might eventually fail if re-centring is blocked. Some logic will be added in subsequent patch sets. * Add column families from earlier session to rocksdb in opening procedure why: All previously used CFs must be declared when re-opening an existing database. * Update `init()` and add rocksdb `reinit()` methods for changing parameters why: Opening a set column families (with different open options) must span at least the ones that are already on disk. * Provide write-trigger-event interface into `Aristo` backend why: This allows to save data from a guest application (think `KVT`) to get synced with the write cycle so the guest and `Aristo` save all atomically. * Use `KVT` with new column family interface from `Aristo` * Remove obsolete guest interface * Implement `KVT` piggyback on `Aristo` backend * CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging * Remove `rocks_db` import from `persist()` function why: Some systems (i.p `fluffy` and friends) use the `Aristo` memory backend emulation and do not link against rocksdb when building the application. So this should fix that problem.	2024-06-13 18:15:11 +00:00
Jordan Hrycaj	a347291413	Aristo use rocksdb cf instead of key pfx (#2332 ) * Use RocksDb column families instead of a prefixed single column why: Better performance * Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb why: Avoids repeated de/serialisation	2024-06-10 12:04:22 +00:00
Jordan Hrycaj	392088e5e9	Coredb fix storage tree issues (#2317 ) * Code cosmetics * Re-org `aristo_merge`, internally split into sub-modules why: Became a burden for maintenance because it hosts two different functionalities under the same merge paradigm: account/data merge and snap proof merge where the latter produces a partial trie. * Fix CoreDb tracer * Ledger: fix potential account vs. storage tree sync problems * Remove bound on the size of removable whole storage trees * Activate `test_tracer_json`	2024-06-07 10:56:31 +00:00
Jordan Hrycaj	1e65093b3e	Remove obsolete tests (#2307 ) * Remove `test_sync_snap` why: Snap sync needs to be re-factored. All the interesting database parts from this test suite has been recycled into `Aristo` * Remove `test_rocksdb_timing` * Update `all_tests`	2024-06-06 09:29:38 +00:00
Jacek Sieka	c876729c4d	Add some basic rocksdb options to command line (#2286 ) These options are there mainly to drive experiments, and are therefore hidden. One thing that this PR brings in is an initial set of caches and buffers for rocksdb - the set that I've been using during various performance tests to get to a viable baseline performance level.	2024-06-05 17:08:29 +02:00
Jordan Hrycaj	69a158864c	Remove vid recycling feature (#2294 )	2024-06-04 15:05:13 +00:00
Jordan Hrycaj	f926222fec	Aristo cull journal related stuff (#2288 ) * Remove all journal related stuff * Refactor function names journal() => delta(), filter() => delta() * remove `trg` fileld from `FilterRef` why: Same as `kMap[$1]` * Re-type FilterRef.src as `HashKey` why: So it is directly comparable to `kMap[$1]` * Moved `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef` why: Then a separate `FilterRef` type is not needed, anymore * Rename `roFilter` field in `AristoDbRef` => `balancer` why: New name more appropriate. * Replace `FilterRef` by `LayerDeltaRef` type why: This allows to avoid copying into the `balancer` (see next patch set) most of the time. Typically, only one instance is running on the backend and the `balancer` is only used as a stage before saving data. * Refactor way how to store data persistently why: Avoid useless copy when staging `top` layer for persistently saving to backend. * Fix copyright header?	2024-06-03 20:10:35 +00:00
Jordan Hrycaj	0f430c70fd	Aristo avoid storage trie update race conditions (#2251 ) * Update TDD suite logger output format choices why: New format is not practical for TDD as it just dumps data across a wide range (considerably larder than 80 columns.) So the new format can be turned on by function argument. * Update unit tests samples configuration why: Slightly changed the way to find the `era1` directory * Remove compiler warnings (fix deprecated expressions and phrases) * Update `Aristo` debugging tools * Always update the `storageID` field of account leaf vertices why: Storage tries are weekly linked to an account leaf object in that the `storageID` field is updated by the application. Previously, `Aristo` verified that leaf objects make sense when passed to the database. As a consequence * the database was inconsistent for a short while * the burden for correctness was all on the application which led to delayed error handling which is hard to debug. So `Aristo` will internally update the account leaf objects so that there are no race conditions due to the storage trie handling * Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)` why: The journal and filter logic depends on the hash of the `VertexID(1)` which is commonly known as the state root. This implies that all changes to the database are somehow related to that. * Make sure that a `Ledger` account does not overwrite the storage trie reference why: Due to the abstraction of a sub-trie (now referred to as column with a hash describing its state) there was a weakness in the `Aristo` handler where an account leaf could be overwritten though changing the validity of the database. This has been changed and the database will now reject such changes. This patch fixes the behaviour on the application layer. In particular, the column handle returned by the `CoreDb` needs to be updated by the `Aristo` database state. This mitigates the problem that a storage trie might have vanished or re-apperaed with a different vertex ID. * Fix sub-trie deletion test why: Was originally hinged on `VertexID(1)` which cannot be wholesale deleted anymore after the last Aristo update. Also, running with `VertexID(2)` needs an artificial `VertexID(1)` for making `stow()` or `persist()` work. * Cosmetics * Activate `test_generalstate_json` * Temporarily `deactivate test_tracer_json` * Fix copyright header --------- Co-authored-by: jordan <jordan@dry.pudding> Co-authored-by: Jacek Sieka <jacek@status.im>	2024-05-30 17:48:38 +00:00
tersec	c466edfd8d	change an Aristo function name to avoid Nim stdlib ambiguity (#2240 )	2024-05-29 15:08:00 +00:00
Jordan Hrycaj	143f2e99f5	Core db+aristo fixes and tx handling updates (#2164 ) * Aristo: Rename journal related sources and functions why: Previously, the naming was hinged on the phrases `fifo`, `filter` etc. which reflect the inner workings of cascaded filters. This was unfortunate for reading/understanding the source code for actions where the focus is the journal as a whole. * Aristo: Fix buffer overflow (path length truncating error) * Aristo: Tighten `hikeUp()` stop check, update error code why: Detect dangling vertex links. These are legit with `snap` sync processing but not with regular processing. * Aristo: Raise assert in regular mode `merge()` at a dangling link/edge why: With `snap` sync processing, partial trees are ok and can be amended. Not so in regular mode. Previously there was only a debug message when a non-legit dangling edge was encountered. * Aristo: Make sure that vertices are copied before modification why: Otherwise vertices from lower layers might also be modified * Aristo: Fix relaxed mode for validity checker `check()` * Remove cruft * Aristo: Update API for transaction handling details: + Split `aristo_tx.nim` into sub-modules + Split `forkWith()` into `findTx()` + `forkTx()` + Removed `forkTop()`, `forkBase()` (now superseded by new `forkTx()`) * CoreDb+Aristo: Fix initialiser (missing methods)	2024-05-03 17:38:17 +00:00
Jordan Hrycaj	961f63358e	Core db+aristo update recovery journal management (#2156 ) * Aristo: Allow to define/set `FilterID` for journal filter records why: After some changes, the `FilterID` is isomorphic to the `BlockNumber` scalar (well, the first 2^64 entries of a `BlockNumber`.) The needed change for `FilterID` is that the `FilterID(0)` value is valid part of the `FilterID` scalar. A non-valid `FilterID` entry is represented by `none(FilterID)`. * Aristo: Split off function `persist()` as persistent version of `stow()` why: In production, `stow(persistent=false,..)` is currently unused. So, using `persist()` rather than `stow(persistent=true,..)` improves readability and is better to maintain. * CoreDb+Aristo: Store block numbers in journal records why: This makes journal records searchable by block numbers * Aristo: Rename some journal related functions why: The name journal is more appropriate to api functions than something with fifo or filter. * CoreDb+Aristo: Update last/oldest journal state retrieval * CoreDb+Aristo: Register block number with state root in journal why: No need anymore for extra lookup table `stRootToBlockNum` which maps a storage root -> block number. * Aristo: Remove unused function `getFilUbe()` from api * CoreDb: Remove now unused virtual table `stRootToBlockNum` why: Was used to map a state root to a block number. This functionality is now embedded into the recovery journal backend. * Turn of API tracking (will fail on `fluffy`)	2024-04-29 20:17:17 +00:00
Jordan Hrycaj	0d4ef023ed	Update aristo journal functionality (#2155 ) * Aristo: Code cosmetics, e.g. update some CamelCase names * CoreDb+Aristo: Provide oldest known state root implied details: The Aristo journal allows to recover earlier but not all state roots. * Aristo: Fix journal backward index operator, e.g. `[^1]` * Aristo: Fix journal updater why: The `fifosStore()` store function slightly misinterpreted the update instructions when translation is to database `put()` functions. The effect was that the journal was ever growing due to stale entries which were never deleted. * CoreDb+Aristo: Provide utils for purging stale data from the KVT details: See earlier patch, not all state roots are available. This patch provides a mapping from some state root to a block number and allows to remove all KVT data related to a particular block number * Aristo+Kvt: Implement a clean up schedule for expired data in KVT why: For a single state ledger like `Aristo`, there is only a limited backlog of states. So KVT data (i.e. headers etc.) are cleaned up regularly * Fix copyright year	2024-04-26 13:43:52 +00:00
Jordan Hrycaj	b9187e0493	Aristo selective read cashing for rocksdb backend (#2145 ) * Aristo+Kvt: Better RocksDB profiling why: Providing more detailed information, mainly for `Aristo` * Aristo: Renamed journal `stats()` to `capacity()` why: `Stats()` was a misnomer * Aristo: Provide backend read caches for key and vertex IDs why: Dedicated LRU caching for particular types gives a throughput advantage. The sizes of the LRU queues used for caching are currently constant but might be adjusted at a later time. * Fix copyright year	2024-04-22 19:02:22 +00:00
Jordan Hrycaj	7d9e1d8607	Misc updates for full sync (#2140 ) * Code cosmetics * Aristo+Kvt: Fix api wrappers why: Api setup killed the backend descriptor when backend mapping was disabled. * Aristo: Implement masked profiling entries why: Database backend should be listed but not counted in tally * CoreDb: Simplify backend() methods why: DBMS backend access Was provided very early and over engineered. Now there are only two backend machines, one for `Kvt` and the other one for an `Mpt` available only via new API. * CoreDb: Code cleanup regarding descriptor types * CoreDb: Refactor/redefine `persistent()` methods why: There were `persistent()` methods for any type of caching storage facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single `persistent()` method storing all facilities in tandem (similar to how transactions work.) For non shared `Kvt` tables, there is now an extra storage method `saveOffSite()`. * CoreDb lingo update: `trie` becomes `column` why: Notion of a `trie` is pretty much hidden by the new `CoreDb` api. Revealed are sort of database columns for accounts an storage data, any of which have an internal state represented by a Keccack hash. So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a column state. * Aristo: rename backend filed `filters` => `journal` * Update full sync logging details: + Disable eth handler noise while syncing + Log journal depth (if available) * Fix copyright year * Fix cruft and unwanted imports	2024-04-19 18:37:27 +00:00
Jordan Hrycaj	1502014e36	Core db+aristo re org tracer (#2123 ) * Kvt: Update API hooks * Aristo: Generalised merging snap proofs, now for multiple state roots why: This accommodates pre-loading partial tries for unit tests * Aristo: Update some unit tests * CoreDb+Aristo: Re-factor tracer why: Was bonkers anyway. The main change is that the trace journal is now kept in a way similar to a transaction layer so that it can predictably interact with DB transactions. * Ledger: Debugging helper * Update tracer unit test applicable for `Aristo` * Fix copyright year * Disable `dump()` function as compile time default why: This needs to pull in the `rocks_db` library at compile time.	2024-04-03 15:48:35 +00:00
Jordan Hrycaj	99238ce0e4	Core db maintenance update (#2087 ) * CoreDb+Aristo: Fix handler code * Aristo+Kvt: Remove cruft * Aristo+Kvt: The function `forkTop()` always provides a single transaction why: Previously it provided a single squashed tx only if there were any. Now it will provide a blind one if there were none. * Fix Copyright header	2024-03-20 15:15:56 +00:00
andri lim	c41206be39	Fix styles and reduce compiler warnings (#2086 ) * Fix styles and reduce compiler warnings * Fix copyright year	2024-03-20 14:35:38 +07:00
Jordan Hrycaj	5379302ce9	Aristo+Kvt: Let destructor crash when `nil` argument is given (#2080 ) why: Ignoring `nil` objects was handy for a while but eventually led to lazy programming which in turn led to double destructor calls for the rocks-db.	2024-03-15 14:20:00 +00:00
Jordan Hrycaj	0d73637f14	Core db simplify new api storage modes (#2075 ) * Aristo+Kvt: Fix backend `dup()` function in api setup why: Backend object is subject to an inheritance cascade which was not taken care of, before. Only the base object was duplicated. * Kvt: Simplify DB clone/peers management * Aristo: Simplify DB clone/peers management * Aristo: Adjust unit test for working with memory DB only why: This currently causes some memory corruption persumably in the `libc` background layer. * CoredDb+Kvt: Simplify API for KVT why: Simplified storage models (was over engineered) for better performance and code maintenance. * CoredDb+Aristo: Simplify API for `Aristo` why: Only single database state needed here. Accessing a similar state will be implemented from outside this module using a context layer. This gives better performance and improves code maintenance. * Fix Copyright headers * CoreDb: Turn off API tracking why: CI would ot go through. Was accidentally turned on.	2024-03-14 22:17:43 +00:00
Jordan Hrycaj	587ca3abbe	Coredb use stackable api for aristo backend (#2060 ) * Aristo/Kvt: Provide function hooks APIs why: These APIs can be used for installing tracers, profiling functoinality, and other niceties on the databases. * Aristo: Provide optional API profiling details: It basically is a re-implementation of the `CoreDb` profiling implementation * Kvt: Provide optional API profiling similar to `Aristo` * CoreDb: Re-implementing profiling using `aristo_profile` * Ledger: Re-implementing profiling using `aristo_profile` * CoreDb: Update unit tests for maintainability * update copyright dates	2024-02-29 21:10:24 +00:00
Jordan Hrycaj	8e18e85288	Aristodb remove obsolete and time consuming admin features (#2048 ) * Aristo: Reorg `hashify()` using different schedule algorithm why: Directly calculating the search tree top down from the roots turns out to be faster than using the cached structures left over by `merge()` and `delete()`. Time gains is short of 20% * Aristo: Remove `lTab[]` leaf entry object type why: Not used anymore. It was previously needed to build the schedule for `hashify()`. * Aristo: Avoid unnecessary re-org of the vertex ID recycling list why: This list can become quite large so a heuristic is employed whether it makes sense to re-org. Also, re-org check is only done by `delete()` functions. * Aristo: Remove key/reverse lookup table from tx layers why: It is ignored except for handling proof nodes and costs unnecessary run time resources. This feature was originally needed to accommodate the mental transition from the legacy MPT to the `Aristo` trie :). * Fix copyright year	2024-02-22 08:24:58 +00:00
andri lim	6ff2edc416	Fix styles (#2046 ) * Fix styles * Fix copyright year	2024-02-21 23:04:59 +07:00
andri lim	bea558740f	Reduce compiler warnings (#2030 ) * Reduce compiler warnings * Reduce compiler warnings in test code	2024-02-16 16:08:07 +07:00
Jordan Hrycaj	1b4a43c140	Aristo db remove over engineered object type (#2027 ) * CoreDb: update test suite * Aristo: Simplify reverse key map why: The reverse key map `pAmk: (root,key) -> {vid,..}` as been simplified to `pAmk: key -> {vid,..}` as the state `root` domain argument is not used, anymore * Aristo: Remove `HashLabel` object type and replace it by `HashKey` why: The `HashLabel` object attaches a root hash to a hash key. This is nowhere used, anymore. * Fix copyright	2024-02-14 19:11:59 +00:00
Jordan Hrycaj	9e50af839f	Core db+aristo update storage trie handling (#2023 ) * CoreDb: Test module with additional sample selector cmd line options * Aristo: Do not automatically remove a storage trie with the account why: This is an unnecessary side effect. Rather than using an automatism, a a storage root must be deleted manually. * Aristo: Can handle stale storage root vertex IDs as empty IDs. why: This is currently needed for the ledger API supporting both, a legacy and the `Aristo` database backend. This feature can be disabled at compile time by re-setting the `LOOSE_STORAGE_TRIE_COUPLING` flag in the `aristo_constants` module. * CoreDb+Aristo: Flush/delete storage trie when deleting account why: On either backend, a deleted account leave a dangling storage trie on the database. For consistency nn the legacy backend, storage tries must not be deleted as they might be shared by several accounts whereas on `Aristo` they are always unique.	2024-02-12 19:37:00 +00:00
Jordan Hrycaj	2c35390bdf	Core db and aristo maintenance update (#2014 ) * Aristo: Update error return code why: Failing of `Aristo` function `delete()` might fail because there is no such data item on the db. This must return a single error code as is done with `fetch()`. * Ledger: Better error handling why: The `expect()` clauses have been replaced by raising asserts indicating the error from the database backend. Also, `delete()` failures are legitimate if the item to delete does not exist. * Aristo: Delete function must always leave a label on DB for `hashify()` why: The `hashify()` uses the labels left bu `merge()` and `delete()` to compile (and optimise) a scheduler for subsequent hashing. Originally, the labels were not used for deleted entries and `delete()` still had some edge case where the deletion label was not properly handled. * Aristo: Update `hashify()` scheduler, remove buggy optimisation why: Was left over from version without virtual state roots which did not know about account payload leaf vertices referring to storage roots. * Aristo: Label storage trie account in `delete()` similar to `merge()` details; The `delete()` function applied to a non-static state root (assumed to be a storage root) will check the payload of an accounts leaf and mark its Merkle keys to be re-checked when runninh `hashify()` * Aristo: Clean up and re-org recycled vertex IDs in `hashify()` why: Re-organising the recycled vertex IDs list intends to reduce the size of the list. This list is organised as a LIFO (or stack.) By reorganising it in a way so that the least vertex ID numbers are on top, the list will be kept smaller as observed on some examples (less than 30%.) * CoreDb: Accept storage trie deletion requests in non-initialised state why: Due to lazy initialisation, the root vertex ID might not yet exist. So the `Aristo` database handlers would reject this call with an error and this condition needs to be handled by the API (which realises the lazy feature.) * Cosmetics & code massage, prettify logging * fix missing import	2024-02-08 16:32:16 +00:00
Jordan Hrycaj	3b306a9689	Aristo: Update unit test suite (#2002 ) * Aristo: Update unit test suite * Aristo/Kvt: Fix iterators why: Generic iterators were not properly updated after backend change * Aristo: Add sub-trie deletion functionality why: For storage tries linked to an account payload vertex ID, a the whole storage trie needs to be deleted with the account. * Aristo: Reserve vertex ID numbers for static custom state roots why: Static custom state roots may be controlled by an application, e.g. for a receipt or a transaction root. The `Aristo` functions are agnostic of what the static state roots are when different from the internal tree vertex ID 1. details; The `merge()` function applied to a non-static state root (assumed to be a storage root) will check the payload of an accounts leaf and mark its Merkle keys to be re-checked. * Aristo: Correct error code symbol * Aristo: Update error code symbols * Aristo: Code cosmetics/comments * Aristo: Fix hashify schedule calculator why: Had a tendency to stop early leaving an incomplete job	2024-02-01 21:27:48 +00:00
Jordan Hrycaj	43e5f428af	Aristo db kvt maintenance update (#1952 ) * Update KVT layers abstraction details: modelled after Aristo layers * Simplified KVT database iterators (removed item counters) why: Not needed for production functions * Simplify KVT merge function `layersCc()` * Simplified Aristo database iterators (removed item counters) why: Not needed for production functions * Update failure condition for hash labels compiler `hashify()` why: Node need not be rejected as long as links are on the schedule. In that case, `redo[]` is to become `wff.base[]` at a later stage. * Update merging layers and label update functions why: + Merging a stack of layers with `layersCc()` could be simplified + Merging layers will optimise the reverse `kMap[]` table maps `pAmk: label->{vid, ..}` by deleting empty mappings `label->{}` where they are redundant. + Updated `layersPutLabel()` for optimising `pAmk[]` tables	2023-12-20 16:19:00 +00:00
Jordan Hrycaj	ffa8ad2246	Core db use differential tx layers for aristo and kvt (#1949 ) * Fix kvt headers * Provide differential layers for KVT transaction stack why: Significant performance improvement * Provide abstraction layer for database top cache layer why: This will eventually implemented as a differential database layers or transaction layers. The latter is needed to improve performance. behavioural changes: Zero vertex and keys (i.e. delete requests) are not optimised out until the last layer is written to the database. * Provide differential layers for Aristo transaction stack why: Significant performance improvement	2023-12-19 12:39:23 +00:00
Jordan Hrycaj	13f51939f6	Core db aristo hasher profiling and timing improvement (#1938 ) * Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup. why: Speeds up lookup time with `Aristo` backend. For writing `Clique` data, the `Companion` model allows to write `Clique` data past the database locked by evm transactions. * Implement `CoreDb` profiling with API tracking why: Chasing time spent per APT procs ... * Implement `Ledger` profiling with API tracking why: Chasing time spent per APT procs ... * Always hashify when commiting or storing why: A dirty cache makes no sense when committing * Make sure that a zero key is created when adding/updating vertices why: This is an error fix mainly for edge cases. A typical error was that the root key got deleted when there were only a few vertices left on the DB. * Need all created and changed vertices zero-keyed on the cache why: A zero key (i.e. empty Merkle hash) indicates that a vertex key needs to be updated. This would not be needed immediately after a merge as there is an actual leaf path on the cache layer. But after subsequent merge and delete operations this information might get blurred. * Re-org hashing algorithm why: Apart from errors, the previous implementation was too slow for two reasons: + some control hashes were calculated for debugging (now all verification is done in `aristo_check` module) + the leaf paths stored on the cache are used to build the labelling (aka hashing) schedule; there paths were accumulated over successive hash sessions although it is clear that all keys were generated, already	2023-12-12 17:47:41 +00:00
Jordan Hrycaj	657379f484	Aristo db update merkle hasher (#1925 ) * Register paths for added leafs because of trie re-balancing why: While the payload would not change, the prefix in the leaf vertex would. So it needs to be flagged for hash recompilation for the `hashify()` module. also: Make sure that `Hike` paths which might have vertex links into the backend filter are replaced by vertex copies before manipulating. Otherwise the vertices on the immutable filter might be involuntarily changed. * Also check for paths where the leaf vertex is on the backend, already why: A a path can have dome vertices on the top layer cache with the `Leaf` vertex on the backend. * Re-define a void `HashLabel` type. why: A `HashLabel` type is a pair `(root-vertex-ID, Keccak-hash)`. Previously, a valid `HashLabel` consisted of a non-empty hash and a non-zero vertex ID. This definition leads to a non-unique representation of a void `HashLabel` with either root-ID or has void. This has been changed to the unique void `HashLabel` exactly if the hash entry is void. * Update consistency checkers * Re-org `hashify()` procedure why: Syncing against block chain showed serious deficiencies which produced wrong hashes or simply bailed out with error. So all fringe cases (mainly due to deleted entries) could be integrated into the labelling schedule rather than handling separate fringe cases.	2023-12-04 20:39:26 +00:00
Jordan Hrycaj	c47f021596	Core db and aristo updates for destructor and tx logic (#1894 ) * Disable `TransactionID` related functions from `state_db.nim` why: Functions `getCommittedStorage()` and `updateOriginalRoot()` from the `state_db` module are nowhere used. The emulation of a legacy `TransactionID` type functionality is administratively expensive to provide by `Aristo` (the legacy DB version is only partially implemented, anyway). As there is no other place where `TransactionID`s are used, they will not be provided by the `Aristo` variant of the `CoreDb`. For the legacy DB API, nothing will change. * Fix copyright headers in source code * Get rid of compiler warning * Update Aristo code, remove unused `merge()` variant, export `hashify()` why: Adapt to upcoming `CoreDb` wrapper * Remove synced tx feature from `Aristo` why: + This feature allowed to synchronise transaction methods like begin, commit, and rollback for a group of descriptors. + The feature is over engineered and not needed for `CoreDb`, neither is it complete (some convergence features missing.) * Add debugging helpers to `Kvt` also: Update database iterator, add count variable yield argument similar to `Aristo`. * Provide optional destructors for `CoreDb` API why; For the upcoming Aristo wrapper, this allows to control when certain smart destruction and update can take place. The auto destructor works fine in general when the storage/cache strategy is known and acceptable when creating descriptors. * Add update option for `CoreDb` API function `hash()` why; The hash function is typically used to get the state root of the MPT. Due to lazy hashing, this might be not available on the `Aristo` DB. So the `update` function asks for re-hashing the gurrent state changes if needed. * Update API tracking log mode: `info` => `debug * Use shared `Kvt` descriptor in new Ledger API why: No need to create a new descriptor all the time	2023-11-16 19:35:03 +00:00
Jordan Hrycaj	4feaa2cfab	Aristo db update for short nodes key edge cases (#1887 ) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)	2023-11-08 12:18:32 +00:00
Jordan Hrycaj	3fe0a49a5e	Aristo db allow shorter than 64 nibbles path keys (#1864 ) * Aristo: Single `FetchPathNotFound` error in `fetchXxx()` and `hasPath()` why: Missing path hike returns too many detailed reasons why it failed which becomes cumbersome to handle. also: Renamed `contains()` => `hasPath()` which disables the `in` operator on non-boolean `contains()` functions * Kvt: Renamed `contains()` => `hasKey()` why: which disables the `in` operator on non-boolean `contains()` functions * Aristo: Generalising `HashID` by variable length `PathID` why: There are cases when the `Aristo` database is to be used with shorter than 64 nibbles keys when handling transactions indexes with sequence IDs. caveat: This patch only works reliable for full length `PathID` values. Tests for shorter `PathID` values are currently missing.	2023-10-27 22:36:51 +01:00

1 2

86 Commits