nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
Jacek Sieka	ef1bab0802	avoid some trivial memory allocations (#2587 ) * pre-allocate `blobify` data and remove redundant error handling (cannot fail on correct data) * use threadvar for temporary storage when decoding rdb, avoiding closure env * speed up database walkers by avoiding many temporaries ~5% perf improvement on block import, 100x on database iteration (useful for building analysis tooling)	2024-09-02 16:03:10 +02:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449 ) The state and account MPT:s currenty share key space in the database based on that vertex id:s are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside at large distance from each other. Here, we prefix each vertex id by its root causing them to be sorted together thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
Jacek Sieka	c364426422	Smaller in-database representations (#2436 ) These representations use ~15-20% less data compared to the status quo, mainly by removing redundant zeroes in the integer encodings - a significant effect of this change is that the various rocksdb caches see better efficiency since more items fit in the same amount of space. * use RLP encoding for `VertexID` and `UInt256` wherever it appears * pack `VertexRef`/`PayloadRef` more tightly	2024-07-02 20:25:06 +02:00
Jacek Sieka	3e001e322c	Fix memory usage spikes during sync, give memory to rocksdb (#2413 ) * creating a seq from a table that holds lots of changes means copying all data into the table - this can be several GB of data while syncing blocks * nim fails to optimize the moving of the `WidthFirstForest` - the real solution is to not construct a `wff` to begin with, but this PR provides relief while that is being worked on This spike fix allows us to bump the rocksdb cache by another 2 GB and still have a significantly lower peak memory usage during sync.	2024-06-25 13:39:53 +02:00
Jacek Sieka	41cf81f80b	Fix dboptions init (#2391 ) For the block cache to be shared between column families, the options instance must be shared between the various column families being created. This also ensures that there is only one source of truth for configuration options instead of having two different sets depending on how the tables were initialized. This PR also removes the re-opening mechanism which can double startup time - every time the database is opened, the log is replayed - a large log file will take a long time to open. Finally, several options got correclty implemented as column family options, including an one that puts a hash index in the SST files.	2024-06-19 10:55:57 +02:00
Jordan Hrycaj	debba5a620	Coeredb related clean up and maint fixes (#2360 ) * Fix initialiser why: Possible crash (app profiling, tracer etc.) * Update column family options processing why: Same for kvt as for aristo * Move `AristoDbDualRocks` backend type to the test suite why: So it is not available for production * Fix typos in API jump table why: Used for tracing and app profiling only. Needed some update * Purged CoreDb legacy API why: Not needed anymore, was transitionary and disabled. * Rename `flush` argument to `eradicate` in a DB close context why: The word `eradicate` leaves no doubt what is meant * Rename `stoFlush()` -> `stoDelete()` * Rename `core_apps_newapi` -> `core_apps` (not so new anymore)	2024-06-14 11:19:48 +00:00
Jordan Hrycaj	5a5cc6295e	Triggered write event for kvt (#2351 ) * bump rockdb * Rename `KVT` objects related to filters according to `Aristo` naming details: filter* => delta* roFilter => balancer * Compulsory error handling if `persistent()` fails * Add return code to `reCentre()` why: Might eventually fail if re-centring is blocked. Some logic will be added in subsequent patch sets. * Add column families from earlier session to rocksdb in opening procedure why: All previously used CFs must be declared when re-opening an existing database. * Update `init()` and add rocksdb `reinit()` methods for changing parameters why: Opening a set column families (with different open options) must span at least the ones that are already on disk. * Provide write-trigger-event interface into `Aristo` backend why: This allows to save data from a guest application (think `KVT`) to get synced with the write cycle so the guest and `Aristo` save all atomically. * Use `KVT` with new column family interface from `Aristo` * Remove obsolete guest interface * Implement `KVT` piggyback on `Aristo` backend * CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging * Remove `rocks_db` import from `persist()` function why: Some systems (i.p `fluffy` and friends) use the `Aristo` memory backend emulation and do not link against rocksdb when building the application. So this should fix that problem.	2024-06-13 18:15:11 +00:00
Jacek Sieka	eb041abba7	avoid unnecessary memory allocations and lookups (#2334 ) * use `withValue` instead of `hasKey` + `[]` * avoid `@` et al * parse database data inside `onData` instead of making seq then parsing	2024-06-11 11:38:58 +02:00
Jordan Hrycaj	a347291413	Aristo use rocksdb cf instead of key pfx (#2332 ) * Use RocksDb column families instead of a prefixed single column why: Better performance * Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb why: Avoids repeated de/serialisation	2024-06-10 12:04:22 +00:00
Jacek Sieka	c5b3081828	eth: bump (#2308 ) * eth: bump Speed up basic operations like hashing and creating RLP:s - up to 25% improvement in certain block ranges! ``` 876729c.csv /data/nimbus_stats/stats-20240605_2204-ed4f6221.csv stats-20240605_2000-c876729c.csv vs stats-20240605_2204-ed4f6221.csv bps_x bps_y tps_x tps_y bpsd tpsd timed block_number (500001, 888889] 1,017.72 996.07 1,784.96 1742.438676 -2.72% -2.72% 3.31% (888889, 1277778] 528.00 536.30 2,159.79 2198.781046 1.69% 1.69% -1.44% (1277778, 1666667] 324.29 317.78 2,064.48 2008.106377 -2.82% -2.82% 3.33% (1666667, 2055556] 253.87 258.74 1,840.94 1872.935273 1.67% 1.67% -1.39% (2055556, 2444445] 175.79 178.66 1,340.61 1363.248939 0.93% 0.93% -0.74% (2444445, 2833334] 137.27 159.74 958.75 1113.323757 14.24% 14.24% -10.69% (2833334, 3222223] 170.48 228.63 1,272.70 1704.047195 34.41% 34.41% -25.17% (3222223, 3611112] 127.49 125.48 1,572.39 1548.835791 -1.19% -1.19% 1.47% (3611112, 4000001] 37.25 40.42 1,100.65 1184.740493 9.58% 9.58% -7.04% blocks: 3501696, baseline: 11h59m40s, contender: 11h21m38s bpsd (mean): 6.18% tpsd (mean): 6.18% Time (sum): -38m1s, -4.26% bpsd = blocks per sec diff (+), tpsd = txs per sec diff, timed = time to process diff (-) + = more is better, - = less is better ``` * ignore gitignore	2024-06-06 23:39:09 +00:00
Jordan Hrycaj	8985535ab2	Core db+aristo updates n fixes (#2298 ) * Fix `blobify()` for `SavedState` object why: Have to treat varying sizes for `HashKey`, i.p. for an empty key which has zero size. * Store correct block number in `SavedState` record why: Stored `block-number - 1` for some obscure reason. * Cosmetcs, docu	2024-06-05 18:17:50 +00:00
Jacek Sieka	c876729c4d	Add some basic rocksdb options to command line (#2286 ) These options are there mainly to drive experiments, and are therefore hidden. One thing that this PR brings in is an initial set of caches and buffers for rocksdb - the set that I've been using during various performance tests to get to a viable baseline performance level.	2024-06-05 17:08:29 +02:00
Jacek Sieka	95a4adc1e8	use statically linked rocksdb on linux/mac, dll on windows (#2291 ) The `rocksdb` version shipped with distributions is typically old and therefore often lacks features we use - it also doesn't match the one assumed by nim-rocksdb leading to ABI mismatch risks. Instead of depending on the system rocksdb, we'll now use the rocksdb version assumed by nim-rocksdb and locked in its vendor folder by always building it together with nimbus. This avoids the problem of unknown rocksdb versions at a (small) cost to build time. CI caching and full windows support for building from source [remains TODO](https://github.com/status-im/nim-rocksdb/issues/44).	2024-06-04 18:15:33 +02:00
Jordan Hrycaj	69a158864c	Remove vid recycling feature (#2294 )	2024-06-04 15:05:13 +00:00
Jordan Hrycaj	f926222fec	Aristo cull journal related stuff (#2288 ) * Remove all journal related stuff * Refactor function names journal() => delta(), filter() => delta() * remove `trg` fileld from `FilterRef` why: Same as `kMap[$1]` * Re-type FilterRef.src as `HashKey` why: So it is directly comparable to `kMap[$1]` * Moved `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef` why: Then a separate `FilterRef` type is not needed, anymore * Rename `roFilter` field in `AristoDbRef` => `balancer` why: New name more appropriate. * Replace `FilterRef` by `LayerDeltaRef` type why: This allows to avoid copying into the `balancer` (see next patch set) most of the time. Typically, only one instance is running on the backend and the `balancer` is only used as a stage before saving data. * Refactor way how to store data persistently why: Avoid useless copy when staging `top` layer for persistently saving to backend. * Fix copyright header?	2024-06-03 20:10:35 +00:00
Jordan Hrycaj	bda760f41d	Run coredb without journal (#2266 ) * Add persistent last state stamp feature why: This allows to run `CoreDb` without journal * Start `CoreDb` without journal * Remove journal related functions from `CoredDb`	2024-05-31 17:32:22 +00:00
Jordan Hrycaj	0f430c70fd	Aristo avoid storage trie update race conditions (#2251 ) * Update TDD suite logger output format choices why: New format is not practical for TDD as it just dumps data across a wide range (considerably larder than 80 columns.) So the new format can be turned on by function argument. * Update unit tests samples configuration why: Slightly changed the way to find the `era1` directory * Remove compiler warnings (fix deprecated expressions and phrases) * Update `Aristo` debugging tools * Always update the `storageID` field of account leaf vertices why: Storage tries are weekly linked to an account leaf object in that the `storageID` field is updated by the application. Previously, `Aristo` verified that leaf objects make sense when passed to the database. As a consequence * the database was inconsistent for a short while * the burden for correctness was all on the application which led to delayed error handling which is hard to debug. So `Aristo` will internally update the account leaf objects so that there are no race conditions due to the storage trie handling * Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)` why: The journal and filter logic depends on the hash of the `VertexID(1)` which is commonly known as the state root. This implies that all changes to the database are somehow related to that. * Make sure that a `Ledger` account does not overwrite the storage trie reference why: Due to the abstraction of a sub-trie (now referred to as column with a hash describing its state) there was a weakness in the `Aristo` handler where an account leaf could be overwritten though changing the validity of the database. This has been changed and the database will now reject such changes. This patch fixes the behaviour on the application layer. In particular, the column handle returned by the `CoreDb` needs to be updated by the `Aristo` database state. This mitigates the problem that a storage trie might have vanished or re-apperaed with a different vertex ID. * Fix sub-trie deletion test why: Was originally hinged on `VertexID(1)` which cannot be wholesale deleted anymore after the last Aristo update. Also, running with `VertexID(2)` needs an artificial `VertexID(1)` for making `stow()` or `persist()` work. * Cosmetics * Activate `test_generalstate_json` * Temporarily `deactivate test_tracer_json` * Fix copyright header --------- Co-authored-by: jordan <jordan@dry.pudding> Co-authored-by: Jacek Sieka <jacek@status.im>	2024-05-30 17:48:38 +00:00
Jordan Hrycaj	54f784bef1	Kvt remodel tx and forked descriptors (#2168 ) * Aristo: Generalise alien/guest interface for piggiback on database * Aristo: Code cosmetics * CoreDb+Kvt: Update transaction API why: Use single addressable function `forkTx(backLevel: int)` as used in `Aristo`. So `Kvt` can be synced simultaneously to `Aristo`. also: Refactored `kvt_tx.nim` in a similar fashion to `Aristo`. * Kvt: Replace `LayerDelta` object by reference why: Will be needed when introducing filters * Kvt: Remodel backend filter facility similar to `Aristo` why: This allows to operate on several KVT instances simultaneously. * CoreDb+Kvt: Fix on-disk storage why: Overlooked name change: `stow()` => `persist()` for permanent storage * Fix copyright headers	2024-05-07 19:59:27 +00:00
Jordan Hrycaj	b9187e0493	Aristo selective read cashing for rocksdb backend (#2145 ) * Aristo+Kvt: Better RocksDB profiling why: Providing more detailed information, mainly for `Aristo` * Aristo: Renamed journal `stats()` to `capacity()` why: `Stats()` was a misnomer * Aristo: Provide backend read caches for key and vertex IDs why: Dedicated LRU caching for particular types gives a throughput advantage. The sizes of the LRU queues used for caching are currently constant but might be adjusted at a later time. * Fix copyright year	2024-04-22 19:02:22 +00:00
Jordan Hrycaj	7d9e1d8607	Misc updates for full sync (#2140 ) * Code cosmetics * Aristo+Kvt: Fix api wrappers why: Api setup killed the backend descriptor when backend mapping was disabled. * Aristo: Implement masked profiling entries why: Database backend should be listed but not counted in tally * CoreDb: Simplify backend() methods why: DBMS backend access Was provided very early and over engineered. Now there are only two backend machines, one for `Kvt` and the other one for an `Mpt` available only via new API. * CoreDb: Code cleanup regarding descriptor types * CoreDb: Refactor/redefine `persistent()` methods why: There were `persistent()` methods for any type of caching storage facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single `persistent()` method storing all facilities in tandem (similar to how transactions work.) For non shared `Kvt` tables, there is now an extra storage method `saveOffSite()`. * CoreDb lingo update: `trie` becomes `column` why: Notion of a `trie` is pretty much hidden by the new `CoreDb` api. Revealed are sort of database columns for accounts an storage data, any of which have an internal state represented by a Keccack hash. So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a column state. * Aristo: rename backend filed `filters` => `journal` * Update full sync logging details: + Disable eth handler noise while syncing + Log journal depth (if available) * Fix copyright year * Fix cruft and unwanted imports	2024-04-19 18:37:27 +00:00
Jordan Hrycaj	e8eb3268f5	Generalise prune mode option 4 different db models (#2139 ) * Update README * Nimbus-main: replaced `PruneMode` options by `ChainDbMode` options details: For the legacy database, this changes the phrase - `conf.pruneMode == PruneMode.Full` to the expression + `conf.chainDbMode == ChainDbMode.Prune`. * Fix issues moaned about by NIM compiler * Fix copyright year	2024-04-17 18:09:55 +00:00
Jordan Hrycaj	d6a4205324	Aristo update rocksdb backend drivers (#2135 ) * Aristo+RocksDB: Update backend drivers why: RocksDB update allows use some of the newly provided methods which were previously implemented by using the very C backend (for the lack of NIM methods.) * Aristo+RocksDB: Simplify drivers wrapper * Kvt: Update backend drivers and wrappers similar to `Aristo` * Aristo+Kvm: Use column families for RocksDB * Aristo+MemoryDB: Code cosmetics * Aristo: Provide guest column family for export why: So `Kvt` can piggyback on `Aristo` so there avoiding to run a second DBMS system in parallel. * Kvt: Provide import mechanism for RoksDB guest column family why: So `Kvt` can piggyback on `Aristo` so there avoiding to run a second DBMS system in parallel. * CoreDb+Aristo: Run persistent `Kvt` DB piggybacked on `Aristo` why: Avoiding to run two DBMS systems in parallel. * Fix copyright year * Ditto	2024-04-16 20:39:11 +00:00
Jordan Hrycaj	0d73637f14	Core db simplify new api storage modes (#2075 ) * Aristo+Kvt: Fix backend `dup()` function in api setup why: Backend object is subject to an inheritance cascade which was not taken care of, before. Only the base object was duplicated. * Kvt: Simplify DB clone/peers management * Aristo: Simplify DB clone/peers management * Aristo: Adjust unit test for working with memory DB only why: This currently causes some memory corruption persumably in the `libc` background layer. * CoredDb+Kvt: Simplify API for KVT why: Simplified storage models (was over engineered) for better performance and code maintenance. * CoredDb+Aristo: Simplify API for `Aristo` why: Only single database state needed here. Accessing a similar state will be implemented from outside this module using a context layer. This gives better performance and improves code maintenance. * Fix Copyright headers * CoreDb: Turn off API tracking why: CI would ot go through. Was accidentally turned on.	2024-03-14 22:17:43 +00:00
Jordan Hrycaj	43e5f428af	Aristo db kvt maintenance update (#1952 ) * Update KVT layers abstraction details: modelled after Aristo layers * Simplified KVT database iterators (removed item counters) why: Not needed for production functions * Simplify KVT merge function `layersCc()` * Simplified Aristo database iterators (removed item counters) why: Not needed for production functions * Update failure condition for hash labels compiler `hashify()` why: Node need not be rejected as long as links are on the schedule. In that case, `redo[]` is to become `wff.base[]` at a later stage. * Update merging layers and label update functions why: + Merging a stack of layers with `layersCc()` could be simplified + Merging layers will optimise the reverse `kMap[]` table maps `pAmk: label->{vid, ..}` by deleting empty mappings `label->{}` where they are redundant. + Updated `layersPutLabel()` for optimising `pAmk[]` tables	2023-12-20 16:19:00 +00:00
Jordan Hrycaj	6e0397e276	Aristo and ledger small updates (#1888 ) * Fix debug noise in `hashify()` for perfectly normal situation why: Was previously considered a fixable error * Fix test sample file names why: The larger test file `goerli68161.txt.gz` is already in the local archive. So there is no need to use the smaller one from the external repo. * Activate `accounts_cache` module from `db/ledger` why: A copy of the original `accounts_cache.nim` source to be integrated into the `Ledger` module wrapper which allows to switch between different `accounts_cache` implementations unser tha same API. details: At a later state, the `db/accounts_cache.nim` wrapper will be removed so that there is only one access to that module via `db/ledger/accounts_cache.nim`. * Fix copyright headers in source code	2023-11-08 16:52:25 +00:00
Jordan Hrycaj	4feaa2cfab	Aristo db update for short nodes key edge cases (#1887 ) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)	2023-11-08 12:18:32 +00:00
Jordan Hrycaj	8e00143313	Aristo db code massage n cosmetics (#1745 ) * Rewrite remaining `AristoError` return code into `Result[void,AristoError]` why: Better code maintenance * Update import sections * Update Aristo DB paths why: More systematic so directory can be shared with other DB types * More cosmetcs * Update unit tests runners why: Proper handling of persistent and mem-only DB. The latter can be consistently triggered by an empty DB path.	2023-09-12 19:45:12 +01:00
Jordan Hrycaj	3936d4d0ad	Aristo db fixes n updates needed for filter fifo (#1728 ) * Set scheduler state as part of the backend descriptor details: Moved type definitions `QidLayoutRef` and `QidSchedRef` to `desc_structural.nim` so that it shares the same folder as `desc_backend.nim` * Automatic filter queue table initialisation in backend details: Scheduler can be tweaked or completely disabled * Updated backend unit tests details: + some code clean up/beautification, reads better now + disabled persistent filters so that there is no automated filter management which will be implemented next * Prettify/update unit tests source code details: Mostly replacing the `check()` paradigm by `xCheck()` * Somewhat simplified backend type management why: Backend objects are labelled with a `BackendType` symbol where the `BackendVoid` label is implicitly assumed for a `nil` backend object reference. To make it easier, a `kind()` function is used now applicable to `nil` references as well. * Fix DB storage layout for filter objects why: Need to store the filter ID with the object * Implement reverse [] index on fifo why: An integer index argument on `[]` retrieves the QueueID (label) of the fifo item while a QueueID argument on `[]` retrieves the index (so it is inverse to the former variant). * Provide iterator over filters as fifo why: This iterator goes along the cascased fifo structure (i.e. in historical order)	2023-09-05 14:57:20 +01:00
Jordan Hrycaj	465d694834	Aristo db implement filter storage scheduler (#1713 ) * Rename FilterID => QueueID why: The current usage does not identify a particular filter but uses it as storage tag to manage it on the database (to be organised in a set of FIFOs or queues.) * Split `aristo_filter` source into sub-files why: Make space for filter management API * Store filter queue IDs in pairs on the backend why: Any pair will will describe a FIFO accessed by bottom/top IDs * Reorg some source file names why: The "aristo_" prefix for make local/private files is tedious to use, so removed. * Implement filter slot scheduler details: Filters will be stored on the database on cascaded FIFOs. When a FIFO queue is full, some filter items are bundled together and stored on the next FIFO.	2023-08-25 23:53:59 +01:00

29 Commits