nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
Jacek Sieka	b4b4d16729	speed up key computation (#2642) * batch database key writes during `computeKey` calls * log progress when there are many keys to update * avoid evicting the vertex cache when traversing the trie for key computation purposes * avoid storing trivial leaf hashes that can be loaded directly from the vertex	2024-09-20 07:43:53 +02:00
Jacek Sieka	5c1e2e7d3b	Migrate `keyed_queue` to `minilru` (#2608) Compared to `keyed_queue`, `minilru` uses significantly less memory, in particular for the 32-byte hash keys where `kq` stores several copies of the key redundantly.	2024-09-13 15:47:50 +02:00
Jacek Sieka	d39c589ec3	lru cache updates (#2590) * replace rocksdb row cache with larger rdb lru caches - these serve the same purpose but are more efficient because they skip serialization, locking and rocksdb layering * don't append fresh items to cache - this has the effect of evicting the existing items and replacing them with low-value entries that might never be read - during write-heavy periods of processing, the newly-added entries were evicted during the store loop * allow tuning rdb lru size at runtime * add (hidden) option to print lru stats at exit (replacing the compile-time flag) pre: ``` INF 2024-09-03 15:07:01.136+02:00 Imported blocks blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042 mgas=181911.265 bps=11.675 tps=1870.397 mgps=176.819 avgBps=10.288 avgTps=1574.889 avgMGps=155.952 elapsed=19m26s458ms ``` post: ``` INF 2024-09-03 13:54:26.730+02:00 Imported blocks blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042 mgas=181911.265 bps=11.637 tps=1864.384 mgps=176.250 avgBps=11.202 avgTps=1714.920 avgMGps=169.818 elapsed=17m51s211ms ``` 9%-ish import perf improvement on similar mem usage :)	2024-09-05 11:18:32 +02:00
Jacek Sieka	35cc78c86d	add metrics for rdb lru cache (#2586) This is a first step towards measuring the efficiency of the LRU caches over time - metrics can be collected during import or when running regularly. Since `nim-metrics` carries some overhead for its default way of reporting metrics, this PR implements a custom collector over atomic counters, given that this is one of the hottest spots in the block processing pipeline. Using a compile-time flag, the same metrics can be printed on exit which is useful when comparing different strategies for caching - here's a recent run over blocks 16000001-1616384 - this is a good candidate to expose in a better way in the future, maybe: ``` state vtype miss hit total hitrate Account Leaf 4909417 4466215 9375632 47.64% Account Branch 20742574 72015123 92757697 77.64% World Leaf 940483 1140946 2081429 54.82% World Branch 8224151 131496580 139720731 94.11% all all 34816625 209118864 243935489 85.73% ```	2024-09-02 17:34:10 +02:00
Jacek Sieka	ef1bab0802	avoid some trivial memory allocations (#2587) * pre-allocate `blobify` data and remove redundant error handling (cannot fail on correct data) * use threadvar for temporary storage when decoding rdb, avoiding closure env * speed up database walkers by avoiding many temporaries ~5% perf improvement on block import, 100x on database iteration (useful for building analysis tooling)	2024-09-02 16:03:10 +02:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449) The state and account MPTs currently share key space in the database, and vertex IDs are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside far apart from each other. Here, we prefix each vertex ID with its root, causing them to be sorted together and thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT, whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
Jacek Sieka	c364426422	Smaller in-database representations (#2436) These representations use ~15-20% less data compared to the status quo, mainly by removing redundant zeroes in the integer encodings - a significant effect of this change is that the various rocksdb caches see better efficiency since more items fit in the same amount of space. * use RLP encoding for `VertexID` and `UInt256` wherever it appears * pack `VertexRef`/`PayloadRef` more tightly	2024-07-02 20:25:06 +02:00
Jacek Sieka	eb041abba7	avoid unnecessary memory allocations and lookups (#2334) * use `withValue` instead of `hasKey` + `[]` * avoid `@` et al * parse database data inside `onData` instead of making seq then parsing	2024-06-11 11:38:58 +02:00
Jordan Hrycaj	a347291413	Aristo use rocksdb cf instead of key pfx (#2332) * Use RocksDb column families instead of a prefixed single column why: Better performance * Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb why: Avoids repeated de/serialisation	2024-06-10 12:04:22 +00:00
Jacek Sieka	0a49833d69	avoid a few more copies (#2215)	2024-05-24 11:27:17 +02:00
Jordan Hrycaj	b9187e0493	Aristo selective read caching for rocksdb backend (#2145) * Aristo+Kvt: Better RocksDB profiling why: Providing more detailed information, mainly for `Aristo` * Aristo: Renamed journal `stats()` to `capacity()` why: `Stats()` was a misnomer * Aristo: Provide backend read caches for key and vertex IDs why: Dedicated LRU caching for particular types gives a throughput advantage. The sizes of the LRU queues used for caching are currently constant but might be adjusted at a later time. * Fix copyright year	2024-04-22 19:02:22 +00:00
Jordan Hrycaj	d6a4205324	Aristo update rocksdb backend drivers (#2135) * Aristo+RocksDB: Update backend drivers why: The RocksDB update allows the use of some newly provided methods which were previously implemented by calling the C backend directly (for lack of Nim methods.) * Aristo+RocksDB: Simplify drivers wrapper * Kvt: Update backend drivers and wrappers similar to `Aristo` * Aristo+Kvm: Use column families for RocksDB * Aristo+MemoryDB: Code cosmetics * Aristo: Provide guest column family for export why: So `Kvt` can piggyback on `Aristo`, thereby avoiding running a second DBMS in parallel. * Kvt: Provide import mechanism for RocksDB guest column family why: So `Kvt` can piggyback on `Aristo`, thereby avoiding running a second DBMS in parallel. * CoreDb+Aristo: Run persistent `Kvt` DB piggybacked on `Aristo` why: Avoids running two DBMSs in parallel. * Fix copyright year * Ditto	2024-04-16 20:39:11 +00:00
Jordan Hrycaj	6e0397e276	Aristo and ledger small updates (#1888) * Fix debug noise in `hashify()` for perfectly normal situation why: Was previously considered a fixable error * Fix test sample file names why: The larger test file `goerli68161.txt.gz` is already in the local archive. So there is no need to use the smaller one from the external repo. * Activate `accounts_cache` module from `db/ledger` why: A copy of the original `accounts_cache.nim` source to be integrated into the `Ledger` module wrapper, which allows switching between different `accounts_cache` implementations under the same API. details: At a later stage, the `db/accounts_cache.nim` wrapper will be removed so that there is only one access to that module via `db/ledger/accounts_cache.nim`. * Fix copyright headers in source code	2023-11-08 16:52:25 +00:00
Jordan Hrycaj	8e00143313	Aristo db code massage n cosmetics (#1745) * Rewrite remaining `AristoError` return code into `Result[void,AristoError]` why: Better code maintenance * Update import sections * Update Aristo DB paths why: More systematic so directory can be shared with other DB types * More cosmetics * Update unit tests runners why: Proper handling of persistent and mem-only DB. The latter can be consistently triggered by an empty DB path.	2023-09-12 19:45:12 +01:00
Jordan Hrycaj	465d694834	Aristo db implement filter storage scheduler (#1713) * Rename FilterID => QueueID why: The current usage does not identify a particular filter but uses it as storage tag to manage it on the database (to be organised in a set of FIFOs or queues.) * Split `aristo_filter` source into sub-files why: Make space for filter management API * Store filter queue IDs in pairs on the backend why: Any pair will describe a FIFO accessed by bottom/top IDs * Reorg some source file names why: The "aristo_" prefix for local/private files is tedious to use, so removed. * Implement filter slot scheduler details: Filters will be stored on the database on cascaded FIFOs. When a FIFO queue is full, some filter items are bundled together and stored on the next FIFO.	2023-08-25 23:53:59 +01:00

15 Commits
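
Illustration for 81e75622cf (root id stored together with vid): a minimal Python sketch of why prefixing each vertex ID with its root clusters a contract's data in a sorted key space. The fixed-width big-endian layout, the helper names (`flat_key`, `rooted_key`, `span`, `prefix_scan`) and the sample vertex list are all assumptions made for this example; the actual nimbus-eth1 key encoding may differ.

```python
import struct

def flat_key(vid: int) -> bytes:
    # Assumed "before" layout: vertex ID only, 8-byte big-endian.
    return struct.pack(">Q", vid)

def rooted_key(root: int, vid: int) -> bytes:
    # Assumed "after" layout: root ID first, then vertex ID.
    return struct.pack(">QQ", root, vid)

def span(keys, wanted):
    # Index distance between the first and last wanted key in sorted order.
    ordered = sorted(keys)
    positions = [i for i, k in enumerate(ordered) if k in wanted]
    return positions[-1] - positions[0]

def prefix_scan(keys, root):
    # Range read: every key that starts with the given root prefix.
    prefix = struct.pack(">Q", root)
    return [k for k in sorted(keys) if k.startswith(prefix)]

# Hypothetical (root, vid) pairs; vids are interleaved across three roots.
vertices = [(1, 10), (2, 11), (1, 12), (3, 13), (2, 14), (1, 15), (3, 16)]

flat = [flat_key(v) for _, v in vertices]
rooted = [rooted_key(r, v) for r, v in vertices]

# How far apart do the vertices of root 1 end up in each layout?
flat_span = span(flat, {flat_key(v) for r, v in vertices if r == 1})
rooted_span = span(rooted, {rooted_key(r, v) for r, v in vertices if r == 1})
root1_keys = prefix_scan(rooted, 1)
```

With the rooted layout, the three vertices of root 1 sit next to each other in sort order, and the same prefix supports the kind of range read or batch delete that the commit message mentions as a future use.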