nimbus-eth1

Commit Graph

Author	SHA1	Message	Date
andri lim	4d9e288340	Wiring ForkedChainRef to other components (#2423 ) * Wiring ForkedChainRef to other components - Disable majority of hive simulators - Only enable pyspec_sim for the moment - The pyspec_sim is using a smaller RPC service wired to ForkedChainRef - The RPC service will gradually grow * Addressing PR review * Fix test_beacon/setup_env * Enable consensus_sim (#2441) * Enable consensus_sim * Remove isFile check * Enable Engine API jwt auth tests and exchange cap tests * Enable engine api in build_sim.sh * Wire ForkedChainRef to Engine API newPayload * Wire Engine API getBodies to ForkedChainRef * Wire Engine API api_forkchoice to ForkedChainRef * Wire more RPC methods to ForkedChainRef * Implement eth_syncing * Implement eth_call and eth_getlogs * TxPool: simplify smartHead * Fix smartHead usage * Fix txpool headDiff * Remove hasBlockHeader and use headerExists * Addressing review	2024-09-04 09:54:54 +00:00
Jacek Sieka	35cc78c86d	add metrics for rdb lru cache (#2586 ) This is a first step towards measuring the efficiency of the LRU caches over time - metrics can be collected during import or when running regularly. Since `nim-metrics` carries some overhead for its default way of reporting metrics, this PR implements a custom collector over atomic counters, given that this is one of the hottest spots in the block processing pipeline. Using a compile-time flag, the same metrics can be printed on exit which is useful when comparing different strategies for caching - here's a recent run over blocks 16000001-1616384 - this is a good candidate to expose in a better way in the future, maybe: ``` state vtype miss hit total hitrate Account Leaf 4909417 4466215 9375632 47.64% Account Branch 20742574 72015123 92757697 77.64% World Leaf 940483 1140946 2081429 54.82% World Branch 8224151 131496580 139720731 94.11% all all 34816625 209118864 243935489 85.73% ```	2024-09-02 17:34:10 +02:00
Jacek Sieka	ef1bab0802	avoid some trivial memory allocations (#2587 ) * pre-allocate `blobify` data and remove redundant error handling (cannot fail on correct data) * use threadvar for temporary storage when decoding rdb, avoiding closure env * speed up database walkers by avoiding many temporaries ~5% perf improvement on block import, 100x on database iteration (useful for building analysis tooling)	2024-09-02 16:03:10 +02:00
Jordan Hrycaj	a25ea63dec	Revert lazy implementation (#2585 )	2024-09-02 10:34:42 +00:00
Jacek Sieka	84a72c8658	Use zstd compression in bottommost layer (#2582 ) Tested up to block ~14m, zstd uses ~12% less space which seems to result in a small:ish (2-4%) performance improvement on block import speed - this seems like a better baseline for more extensive testing in the future. Pre: 57383308 kb Post: 50831236 kb	2024-08-30 17:32:13 +02:00
Jordan Hrycaj	42a08cfba9	Coredb and sync maintenance update (#2583 ) * bump metrics * Remove cruft * Cosmetics, update some logging, noise control * Renamed `CoreDb` function `hasKey` => `hasKeyRc` and provided `hasKey` why: Currently, `hasKey` returns a `Result[]` rather than a `bool` which is what one would expect from a function prototype of this name. This was a bit of an annoyance and cost unnecessary attention.	2024-08-30 11:18:36 +00:00
Jacek Sieka	8857fccb44	create per-fork opcode dispatcher (#2579 ) In the current VM opcode dispatcher, a two-level case statement is generated that first matches the opcode and then uses another nested case statement to select the actual implementation based on which fork it is, causing the dispatcher to grow by `O(opcodes) * O(forks)`. The fork does not change between instructions causing significant inefficiency for this approach - not only because it repeats the fork lookup but also because of code size bloat and missed optimizations. A second source of inefficiency in dispatching is the tracer code which in the vast majority of cases is disabled but nevertheless sees multiple conditionals being evaluated for each instruction only to remain disabled throughout execution. This PR rewrites the opcode dispatcher macro to generate a separate dispatcher for each fork and tracer setting and goes on to pick the right one at the start of the computation. This has many advantages: * much smaller dispatcher * easier to compile * better inlining * fewer pointlessly repeated instructions * simplified macro (!) * slow "low-compiler-memory" dispatcher code can be removed Net block import improvement at about 4-6% depending on the contract - synthetic EVM benchmarks would show an even better result most likely.	2024-08-28 10:20:36 +02:00
web3-developer	8bf581e72d	Cleanup unused exp_getProofsByBlockNumber endpoint (#2577 ) * Cleanup unused exp_getProofsByBlockNumber endpoint.	2024-08-23 22:39:33 +08:00
Jacek Sieka	dbabe7e0a7	import: reduce stack usage (#2575 ) Because EthBlock is quite large, the stack usage that results from the multiple copies (temporary and not) present in the import command is larger than it should be - this PR moves some of that data to a closure environment allocated once per EthBlock - a larger restructuring of the code is due but in the meantime, this simple change speeds up garbage collection a little bit.	2024-08-22 10:06:45 +02:00
Jacek Sieka	d72a73de8b	avoid digest when loading era block (#2572 ) Computing the digest is unnecessary but takes a little bit of time - remove computation and reduce mem usage slightly when loading era blocks	2024-08-20 15:23:14 +02:00
Jordan Hrycaj	4db9c5c2d5	Small updates and fixes for rlpx suite (#2571 ) * Remove redundant `eth/68` message and clean up docu details: There is only eth/68 available at the moment * Allow to turn on chronicles line number logging in `Makefile` * Accept (and forget) tx hashes announcements why: Does no harm to just ignore it at the moment * Bump nim-eth (rlp fix)	2024-08-19 14:00:10 +00:00
Jacek Sieka	226cdb7c68	avoid exceptions, tx copy (#2569 ) * avoid some exceptions using the new compile-time strformat * avoid copying the full transaction only to normalise 2 fields (expensive)	2024-08-19 09:42:07 +02:00
Jacek Sieka	5941fef211	Avoid unnecessary layer copy (#2567 ) When the stack has an empty layer on top, there's no need to copy the contents of `top` to it since it would be the same. ~13% processing saved (!) pre ``` INF 2024-08-17 19:11:31.748+02:00 Imported blocks blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812 mgas=181135.177 bps=8.763 tps=1375.062 mgps=132.125 avgBps=6.798 avgTps=1018.501 avgMGps=102.617 elapsed=29m25s154ms ``` post ``` INF 2024-08-17 18:22:52.513+02:00 Imported blocks blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812 mgas=181135.177 bps=9.648 tps=1513.961 mgps=145.472 avgBps=7.876 avgTps=1179.998 avgMGps=118.888 elapsed=25m23s572ms ```	2024-08-19 08:46:04 +02:00
Jacek Sieka	43d93bcdab	Don't write slot hashes on import (#2564 ) The reverse slot hash mechanism causes quite a bit of database traffic but is broadly not useful except for iterating the storage of an account, something that a validator never does (it's used by the tracers). This flag adds one more thing that is not stored in the database, to be explored more comprehensively when designing full, validator and archive modes with different pruning options in the future. `ldb` says this is 60gb of data (!): ``` ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize --hex --from=0x05 --to=0x05ffffffffffffffffffffffffffffffffffffffffffffff 66488353954 ```	2024-08-16 08:22:51 +02:00
Jordan Hrycaj	4dbc1653ea	Cleanup (#2565 ) * Move snap un-dumpers to aristo unit test folder why: The only place where it is used, now to test the database against legacy snap sync dump samples. While the details of the dumped data have mostly outlived their purpose, its use as entropy data thrown against `Aristo` has still been useful to find/debug tricky DB problems. * Remove cruft * `nimbus-eth1-blobs` not used anymore as test data source	2024-08-15 12:31:07 +00:00
Jordan Hrycaj	cbe5131927	Simplify aristo tree deletion functionality (#2563 ) * Cleaning up, removing cruft and debugging statements * Make `aristo_delta` fluffy compatible why: A sub-module that uses `chronicles` must import all possible modules used by a parent module that imports the sub-module. * update TODO	2024-08-14 12:09:30 +00:00
Jordan Hrycaj	d148de5b1c	Remove chunked rlpx (#2562 ) * bump nim-eth * Update make environment	2024-08-14 10:56:49 +00:00
Jordan Hrycaj	ce713d95fc	Aristo lazily delete larger subtrees (#2560 ) * Extract sub-tree deletion functions into separate sub-modules * Move/rename `aristo_desc.accLruSize` => `aristo_constants.ACC_LRU_SIZE` * Lazily delete sub-trees why: This gives some control of the memory used to keep the deleted vertices in the cached layers. For larger sub-trees, keys and vertices might be on the persistent backend to a large extent. This would pull an amount of extra information from the backend into the cached layer. For lazy deleting it is enough to remember sub-trees by a small set of (at most 16) sub-roots to be processed when storing persistent data. Marking the tree root deleted immediately allows to let most of the code base work as before. * Comments and cosmetics * No need to import all for `Aristo` here * Kludge to make `chronicle` usage in sub-modules work with `fluffy` why: That `fluffy` would not run with any logging in `core_db` is a problem I have known for a while. Up to now, logging was only used for debugging. With the current `Aristo` PR, there are cases where logging might be wanted but this works only if `chronicles` runs without the `json[dynamic]` sinks. So this should be re-visited. * More of a kludge	2024-08-14 08:54:44 +00:00
Jordan Hrycaj	7becf4e389	Remove vertex ID recycle function (#2558 ) why: It is not safe in general to recycle vertex IDs while the `RocksDb` cache has `VertexID` rather than `RootedVertexID` where the former type seems preferable. In some fringe cases one might remove a vertex with key `(root1,vid)` and insert another vertex with key `(root2,vid)` while re-using the vertex ID `vid`. Without knowledge of `root1` and `root2`, the LRU cache will return the same vertex for `(root2,vid)` also for `(root1,vid)`.	2024-08-12 20:56:15 +00:00
Jacek Sieka	19451cadff	rebalance rocksdb cache sizes (#2557 ) Based on some simple testing done with a few combinations of cache sizes, it seems that the block cache has grown in importance compared to where we were before changing on-disk format and adding a lot of other point caches. With these settings, there's roughly a 15% performance increase when processing blocks in the 18M range over the status quo while memory usage decreases by more than 1gb! Only a few values were tested so there's certainly more to do here but this change sets up a better baseline for any future optimizations. In particular, since the initial defaults were chosen, root vertex id:s were introduced as key prefixes meaning that storage for each account will be grouped together and thus it becomes more likely that a block loaded from disk will be hit multiple times - this seems to give the block cache an edge over the row cache, especially when traversing the storage trie.	2024-08-12 05:52:09 +00:00
andri lim	b8e128203f	Rewire blockValue from Txpool to EngineAPI (#2554 )	2024-08-09 06:05:18 +07:00
Jacek Sieka	3dc30195ad	log http/jwt information on startup (#2553 )	2024-08-08 10:03:30 +00:00
Jacek Sieka	094486d0ce	Hash bump	2024-08-08 07:46:35 +02:00
Jacek Sieka	3cefd7ed38	move db init to init (#2552 ) When using the common interface, the database always (potentially) needs init - take the opportunity to log some basic database info on startup.	2024-08-08 07:45:30 +02:00
andri lim	d5786758b5	TxPool: Merge tx_chain and tx_packer to reduce complexity (#2549 ) * TxPool: Merge tx_chain and tx_packer to reduce complexity * Fix copyright year	2024-08-07 22:35:17 +07:00
Jordan Hrycaj	38572bd8ea	Cache a storage root ID forever in the leaf payload of an account (#2551 ) details: Stale root IDs are marked disabled while the ID is kept in the leaf payload. why: This might lead to further caching advantages.	2024-08-07 13:28:01 +00:00
Jordan Hrycaj	488bdbc267	Provide portal proof functionality with coredb (#2550 ) * Provide portal proof functions in `aristo_api` why: So it can be fully supported by `CoreDb` * Fix prototype in `kvt_api` * Fix node constructor for account leafs with storage trees * Provide simple path check based on portal proof functionality * Provide portal proof functionality in `CoreDb` * Update TODO list	2024-08-07 11:30:55 +00:00
andri lim	3cef119b78	Return empty list instead of error in getPooledTxs handler (#2547 )	2024-08-06 22:06:48 +07:00
Jordan Hrycaj	6bae929439	Added comments (#2546 )	2024-08-06 12:43:39 +00:00
Jordan Hrycaj	5b502a06c4	Added portal proof nodes generation functionality (#2539 ) * Extracted `test_tx.testTxMergeProofAndKvpList()` => separate file * Fix serialiser why: Typo led to duplicate rlp-encoded nodes in chain * Remove cruft * Implement portal proof nodes generators `partXxxTwig()` * Add unit test for portal proof nodes generator `partAccountTwig()` * Cosmetics * Simplify serialiser return code format * Fix proof generator for extension nodes why: Code was simply bonkers, not detected before the unit tests were adapted to check for just this. * Implemented portal proof nodes verifier `partUntwig()` * Cosmetics * Fix `testutp` cli problem	2024-08-06 11:29:26 +00:00
andri lim	ec118a438a	Refactor txpool: reduce complexity (#2542 )	2024-08-06 16:12:56 +07:00
andri lim	9dacfed943	Disable txpool in eth wire protocol handler (#2540 )	2024-08-06 11:26:55 +07:00
Jordan Hrycaj	01b5c08763	Revive json tracer unit tests (#2538 ) * Some `Aristo` clean-ups/updates * Re-implemented core-db tracer functionality * Rename nimbus tracer `no-tracer.nim` => `tracer.nim` why: Restore original name for easy diff tracking with upcoming update * Update nimbus tracer using new core-db tracer functionality * Updating json tracer unit tests * Enable json tracer unit tests	2024-08-01 10:41:20 +00:00
andri lim	e331c9e9b7	TxPool: Replace GasPrice and GasPriceEx with GasInt (#2537 ) * TxPool: Replace GasPrice and GasPriceEx with GasInt	2024-07-31 14:33:30 +07:00
Jordan Hrycaj	72c3ab8ced	Provide partial tree support for preloading tests (#2536 ) * Implement partial trees why: This is currently needed for unit tests to pre-load the database with test data similar to `proof` node pre-load. The basic features for `snap-sync` boundary proofs are available as well for future use. What is missing is the final proof verification and a complete storage data load/merge function (stub is available.) * Cosmetics, clean up	2024-07-29 20:15:17 +00:00
Jacek Sieka	bdc86b3fd4	small cleanups (#2526 ) * remove some redundant EH * avoid pessimising move (introduces a copy in this case!) * shift less data around when reading era files (reduces stack usage)	2024-07-26 12:32:01 +07:00
andri lim	254bda365f	Remove txpool sender locality (#2525 ) * Remove txpool sender locality We no longer distinguish between local and remote senders * Fix copyright year	2024-07-25 22:36:08 +07:00
andri lim	0cc730dd05	Fix CodeBytes: invalidPositions out of bound crash (#2523 )	2024-07-25 19:23:53 +07:00
andri lim	01ba18da74	Fix sepolia chain config: mergeForkBlock -> 1450409 (#2518 ) * Fix sepolia chain config: mergeForkBlock -> 1450407 * Fix test_forkid	2024-07-24 03:07:55 +00:00
Advaita Saha	08bbb0079f	faster slot finding in nimbus import (#2491 ) * faster slot finding in nimbus import * feat: blocknumber based slot finding * fix: formatting * added comments * fix: added is_execution_block * added comment	2024-07-22 21:17:07 +00:00
Jordan Hrycaj	1452e7b1c0	Misc updates (#2513 ) * Update config for Ledger and CoreDb why: Prepare for tracer which depends on the API jump table (as well as the profiler.) The API jump table is now enabled in unit/integration test mode piggybacking on the `unittest2DisableParamFiltering` compiler flag or on an extra compiler flag `dbjapi_enabled`. * No need for error field in `NodeRef` why: Was only needed by proof nodes pre-loader which will be re-implemented * Cosmetics	2024-07-22 18:10:04 +00:00
andri lim	6d03acec30	TxPool refactoring: Simplify TxChainRef and remove gauges (#2506 ) This is one of the txPool refactoring series to make it ready for integration with the new ForkedChainRef	2024-07-19 16:24:36 +07:00
andri lim	fb196849ee	EVM cosmetic changes, one less indirect access of VmCpt (#2503 )	2024-07-19 08:44:01 +07:00
Jordan Hrycaj	5ac362fe6f	Aristo and kvt balancer management update (#2504 ) * Aristo: Merge `delta_siblings` module into `deltaPersistent()` * Aristo: Add `isEmpty()` for canonical checking whether a layer is empty * Aristo: Merge `LayerDeltaRef` into `LayerObj` why: No need to maintain nested object refs anymore. Previously the `LayerDeltaRef` object had a companion `LayerFinalRef` which held non-delta layer information. * Kvt: Merge `LayerDeltaRef` into `LayerRef` why: No need to maintain nested object refs (as with `Aristo`) * Kvt: Re-write balancer logic similar to `Aristo` why: Although `Kvt` was a cheap copy of `Aristo` it sort of got out of sync and the balancer code was wrong. * Update iterator over forked peers why: Yield additional field `isLast` indicating that the last iteration cycle was approached. * Optimise balancer calculation. why: One can often avoid providing a new object containing the merge of two layers for the balancer. This avoids copying tables. In some cases this is replaced by `hasKey()` look ups though. One uses one of the two to combine and merges the other into the first. Of course, this needs some checks for making sure that none of the components to merge is eventually shared with something else. * Fix copyright year	2024-07-18 21:32:32 +00:00
andri lim	ee323d5ff8	Optimize EVM stack usage (#2502 ) * EVM: Optimize CALL family stack usage * EVM: Optimize CREATE family stack usage * EVM: Optimize arith stack usage * EVM: Optimize stack usage in the rest of opcodes * Fix test_op_env and clean up unused imports * EVM: Optimize arithmetic binary ops	2024-07-18 18:59:53 +07:00
Jacek Sieka	df4a21c910	Store cached hash at the layer corresponding to the source data (#2492 ) When lazily verifying state roots, we may end up with an entire state without roots that gets computed for the whole database - in the current design, that would result in hashes for the entire trie being held in memory. Since the hash depends only on the data in the vertex, we can store it directly at the top-most level derived from the vertices it depends on - be that memory or database - this makes the memory usage broadly linear with respect to the already-existing in-memory change set stored in the layers. It also ensures that if we have multiple forks in memory, hashes get cached in the correct layer maximising reuse between forks. The same layer numbering scheme as elsewhere is reused, where -2 is the backend, -1 is the balancer, then 0+ is the top of the stack and stack. A downside of this approach is that we create many small batches - a future improvement could be to collect all such writes in a single batch, though the memory profile of this approach should be examined first (where is the batch kept, exactly?).	2024-07-18 09:13:56 +02:00
Jordan Hrycaj	6677f57ea9	Aristo balancer clean up (#2501 ) * Remove `chunkedMpt` from `persistent()`/`stow()` function why: Proof-mode code was removed with PR #2445 and needs to be re-designed. * Remove unused `beStateRoot` argument from `deltaMerge()` * Update/drastically simplify `txStow()` why: Got rid of many boundary conditions details: Many pre-conditions have changed. In particular, previous versions used the account state (hash) which was conveniently available and checked it against the backend in order to find out whether there was something to do, at all. Currently, only an empty set of all tables in the delta layer has the balancer update ignored. Notable changes are: * no check against account state (see above) * balancer filters have no hash signature (some legacy stuff left over from journals) * no (snap sync) proof data which made the generation of a top layer more complex * Cosmetics, cruft removal * Update unit test file & function name why: Was legacy module	2024-07-17 19:27:33 +00:00
andri lim	cfe14f1825	EVM: use assign2 whenever possible (#2499 ) Before: GST finish in 59 secs. After: GST finish in 52 secs!	2024-07-17 20:48:50 +07:00
andri lim	8d1e21bbae	Simplify txPool gasLimit calculator (#2498 ) Our need is only a baseline tx pool gasLimit calculator. If needed, we can expand it in the future. But for now, a simple but understandable tx pool is more important.	2024-07-17 20:48:35 +07:00
Jordan Hrycaj	17391b58d0	Hash keys and hash256 revisited (#2497 ) * Remove cruft left-over from PR #2494 * TODO * Update comments on `HashKey` type values * Remove obsolete hash key conversion flag `forceRoot` why: Is treated implicitly by having vertex keys as `HashKey` type and root vertex states converted to `Hash256`	2024-07-17 20:48:21 +07:00
andri lim	916f88a373	Use block number or timestamp to determine fork rules (#2496 ) * Use block number or timestamp to determine fork rules Avoid confusion raised by `forkGTE` usage where block information is present. * Get rid of forkGTE	2024-07-17 17:05:53 +07:00
andri lim	a59cc84fca	Not using deprecated functions in config anymore (#2495 )	2024-07-17 02:57:19 +00:00
Jordan Hrycaj	a84a2131cd	No ext update (#2494 ) * Imported/rebase from `no-ext`, PR #2485 Store extension nodes together with the branch Extension nodes must be followed by a branch - as such, it makes sense to store the two together both in the database and in memory: * fewer reads, writes and updates to traverse the tree * simpler logic for maintaining the node structure * less space used, both memory and storage, because there are fewer nodes overall There is also a downside: hashes can no longer be cached for an extension - instead, only the extension+branch hash can be cached - this seems like a fine tradeoff since computing it should be fast. TODO: fix commented code * Fix merge functions and `toNode()` * Update `merkleSignCommit()` prototype why: Result is always a 32 byte hash * Update short Merkle hash key generation details: Ethereum reference MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) This is specified in the yellow paper, appendix D. Unlike the `Aristo` implementation, the reference MPT would not store such a node on the key-value database. Rather, the RLP encoded node value is stored in the parent node in place of a node link. Only for the root hash, the top level node is always referred to by the hash. * Fix/update `Extension` sections why: Were commented out after removal of a dedicated `Extension` type which left the system dysfunctional. * Clean up unused error codes * Update unit tests * Update docu --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-07-16 19:47:59 +00:00
Jacek Sieka	0e36a17e5b	avoid re-writing code (#2490 ) Avoids pointless rocksdb writes that cause write compaction / amplification, especially in the case where code is shared between multiple accounts	2024-07-15 15:02:23 +02:00
Jacek Sieka	9d91191154	storage hike cache (#2484 ) This PR adds a storage hike cache similar to the account hike cache already present - this cache is less efficient because account storage is already partially cached in the account ledger but nonetheless helps keep hiking down. Notably, there's an opportunity to optimise this cache and the others so that they cooperate better instead of overlapping, which is left for a future PR. This PR also fixes an O(N) memory usage for storage slots where the delete would keep the full storage in a work list which on mainnet can grow very large - the work list is replaced with a more conventional recursive `O(log N)` approach.	2024-07-14 19:12:10 +02:00
Jacek Sieka	f3a56002ca	Turn payload into value type (#2483 ) The Vertex type unifies branches, extensions and leaves into a single memory area where the largest member is the branch (128 bytes + overhead) - the payloads we have are all smaller than 128 thus wrapping them in an extra layer of `ref` is wasteful from a memory usage perspective. Further, the ref:s must be visited during the M&S phase of garbage collection - since we keep millions of these, many of them short-lived, this takes up significant CPU time. ``` Function CPU Time: Total CPU Time: Self Module Function (Full) Source File Start Address system::markStackAndRegisters 10.0% 4.922s nimbus system::markStackAndRegisters(var<system::GcHeap>).constprop.0 gc.nim 0x701230` ```	2024-07-14 12:02:05 +02:00
Jacek Sieka	72947b3647	odds and ends (#2481 ) small cleanups to reduce memory allocations	2024-07-13 20:42:49 +02:00
Jordan Hrycaj	f08178c592	Separate constructor helpers for core db and ledger (#2480 ) * Extract `CoreDb` constructor helpers from `base.nim` into separate module why: This makes it easier to avoid circular imports. * Extract `Ledger` constructor helpers from `base.nim` into separate module why: Move `accounts_ledger.nim` file to sub-folder `backend`. That way the layout resembles that of the `core_db`.	2024-07-12 19:32:31 +00:00
Jordan Hrycaj	b924fdcaa7	Separate config for core db and ledger (#2479 ) * Updates and corrections * Extract `CoreDb` configuration from `base.nim` into separate module why: This makes it easier to avoid circular imports, in particular when the capture journal (aka tracer) is revived. * Extract `Ledger` configuration from `base.nim` into separate module why: This makes it easier to avoid circular imports (if any.) also: Move `accounts_ledger.nim` file to sub-folder `backend`. That way the layout resembles that of the `core_db`.	2024-07-12 13:12:25 +00:00
Jacek Sieka	01ab209497	cache account payload (#2478 ) Instead of caching just the storage id, we can cache the full payload which further reduces expensive hikes	2024-07-12 15:08:26 +02:00
Jacek Sieka	d07540766f	coredb: tracking fixes (#2476 )	2024-07-12 13:40:13 +02:00
Advaita Saha	25af347dfd	Shift era helpers to a different file (#2475 ) * shift helpers to a different file * fix: few logic fixed for transition from era1 to era	2024-07-12 03:15:14 +00:00
Jacek Sieka	a6764670f0	merge: avoid hike allocations (#2472 ) hike allocations (and the garbage collection maintenance that follows) are responsible for some 10% of cpu time (not wall time!) at this point - this PR avoids them by stepping through the layers one step at a time, simplifying the code at the same time.	2024-07-11 13:26:46 +02:00
Jordan Hrycaj	800fd77333	Core db remove legacy phrases (#2468 ) * Rename `newKvt()` -> `ctx.getKvt()` why: Clean up legacy shortcut. Also, the `KVT` returned is not instantiated but refers to the shared `KVT` that resides in a context which is a generalisation of an in-memory database fork. The function `ctx` retrieves the default context. * Rename `newTransaction()` -> `ctx.newTransaction()` why: Clean up legacy shortcut. The transaction is applied to a context as a generalisation of an in-memory database fork. The function `ctx` retrieves the default context. * Rename `getColumn(CtGeneric)` -> `getGeneric()` why: No more a list of well known sub-tries needed, a single one is enough. In fact, `getColumn()` did only support a single sub-tree by now. * Reduce TODO list	2024-07-10 12:19:35 +00:00
Jacek Sieka	3382c2427b	increase rdb cache sizes (#2466 ) This trivial bump should improve performance a bit without costing too much memory - as the trie grows, so does the number of levels in it and creating hikes becomes ever more expensive - hopefully this cache increase should give a nice little boost even if it's not a lot.	2024-07-09 17:35:27 +02:00
Jacek Sieka	ab23148aab	don't rewrite hash->slot map (#2463 ) Avoid writing the same slot/hash values to the hash->slot mapping to avoid spamming the rocksdb WAL and causing unnecessary compaction. In the same vein, avoid writing trivially detectable A-B-A storage changes which happen with surprising frequency.	2024-07-09 17:25:43 +02:00
Advaita Saha	9a499eb45f	Era support for nimbus import (#2429 ) * add the era-dir option * feat: support for era files in nimbus import * fix: metric logs * fix: eraDir check * fix: redundant code and sepolia support * fix: remove dependency from csv + formatting * fix: typo * fix: RVO * fix: parseBiggestInt * fix: opt impl * fix: network agnostic loading * fix: shift to int64	2024-07-09 15:28:01 +02:00
andri lim	4fa3756860	Convert GasInt to uint64, bump nim-eth and nimbus-eth2 (#2461 ) * Convert GasInt to uint64, bump nim-eth and nimbus-eth2 * Bump nimbus-eth2 * int64.high.GasInt instead of 0x7fffffffffffffff.GasInt	2024-07-07 06:52:11 +00:00
andri lim	e8683692fd	EVM gasSstore refund reduction using positive integer (#2460 ) This is hopefully the last part of preparations before converting GasInt to uint64	2024-07-06 08:39:38 +07:00
andri lim	4eaae5cbfa	EVM gasCall values always stay on positive side (#2459 ) * EVM gasCall values always stay on positive side This is also another part of preparations before converting GasInt to uint64 * Fix test_evm_support	2024-07-06 08:39:22 +07:00
andri lim	c775c906a2	Fix LedgerRef storage iterator and add test (#2458 )	2024-07-05 10:15:48 +00:00
andri lim	6fe7411ac0	Saner EVM gasCosts (#2457 ) This is also a part of preparations before converting GasInt to uint64	2024-07-05 11:55:13 +07:00
andri lim	23c00ce88c	Separate evmc gasCosts and nim-evm gasCosts (#2454 ) This is part of preparations before converting GasInt to uint64	2024-07-05 07:00:03 +07:00
Jacek Sieka	7d78fd97d5	avoid allocations for slot storage (#2455 ) Introduce a new `StoData` payload type similar to `AccountData` * slightly more efficient storage format * typed api * fewer seqs * fix encoding docs - it wasn't rlp after all :)	2024-07-04 23:48:45 +00:00
Jacek Sieka	79788c01d4	Add debug mode for disabling per-chunk state root validation (#2453 ) This significantly speeds up block import at the cost of less protection against invalid data, potentially resulting in an invalid database getting stored. The risk is small given that import is used only for validated data - evaluating the right level of validation vs performance is left for a future PR. A side effect of this approach is that there is no cached state root in the database - computing it currently requires a lot of memory since the intermediate roots get cached in memory in full while the computation is ongoing - a future PR will need to address this deficiency, for example by streaming the already-computed hashes directly to the database.	2024-07-04 16:51:50 +02:00
andri lim	f04f30c72b	Reduce EVM complexity by removing forkOverride (#2448 ) * Reduce EVM complexity by removing forkOverride * Fixes	2024-07-04 15:48:36 +02:00
Jacek Sieka	81e75622cf	storage: store root id together with vid, for better locality of refe… (#2449 ) The state and account MPT:s currently share key space in the database based on that vertex id:s are assigned essentially randomly, which means that when two adjacent slot values from the same contract are accessed, they might reside at large distance from each other. Here, we prefix each vertex id by its root causing them to be sorted together thus bringing all data belonging to a particular contract closer together - the same effect also happens for the main state MPT whose nodes now end up clustered together more tightly. In the future, the prefix given to the storage keys can also be used to perform range operations such as reading all the storage at once and/or deleting an account with a batch operation. Notably, parts of the API already supported this rooting concept while parts didn't - this PR makes the API consistent by always working with a root+vid.	2024-07-04 15:46:52 +02:00
andri lim	b82dcdcc76	Remove unused StructLog (#2447 )	2024-07-04 19:23:53 +07:00
andri lim	d9e502bbc5	Bump web3/kzg4844/nimbus-eth2 and related fixes (#2446 )	2024-07-04 05:41:32 +00:00
Jacek Sieka	b23795ab39	remove pPrf, fRpp (#2445 ) No longer used now that hashify is gone	2024-07-03 22:21:57 +02:00
Jacek Sieka	443c6d1f8e	Cache account path storage id (#2443 ) The storage id is frequently accessed when executing contract code and finding the path via the database requires several hops making the process slow - here, we add a cache to keep the most recently used account storage id:s in memory. A possible future improvement would be to cache all account accesses so that for example updating the balance doesn't cause several hikes.	2024-07-03 17:58:25 +02:00
Jordan Hrycaj	ea7c756a9d	Core db reorg (#2444 ) * CoreDb: Merged all sub-descriptors into `base_desc` module * Dissolve `aristo_db/common_desc.nim` * No need to export `Aristo` methods in `CoreDb` * Resolve/tighten methods in `aristo_db` sub-modules why: So they can be straight implemented into the `base` module * Moved/re-implemented `KVT` methods into `base` module * Moved/re-implemented `MPT` methods into `base` module * Moved/re-implemented account methods into `base` module * Moved/re-implemented `CTX` methods into `base` module * Moved/re-implemented `handler_{aristo,kvt}` into `aristo_db` module * Moved/re-implemented `TX` methods into `base` module * Moved/re-implemented base methods into `base` module * Replaced `toAristoSavedStateBlockNumber()` by proper base method why: Was the last reason for keeping low level backend access methods * Remove dedicated low level access to `Aristo` backend why: Not needed anymore, for debugging the descriptors can be accessed directly also: some clean up stuff * Re-factor `CoreDb` descriptor layout and adjust base methods * Moved/re-implemented iterators into `base_iterator` modules Update docu	2024-07-03 15:50:27 +00:00
Jacek Sieka	1f60e8e453	Use `Hash256` directly for account path (#2439 ) Account paths are always a hash - passing it around as such helps avoid confusion as to how long it is	2024-07-03 10:14:26 +02:00
Jacek Sieka	c364426422	Smaller in-database representations (#2436 ) These representations use ~15-20% less data compared to the status quo, mainly by removing redundant zeroes in the integer encodings - a significant effect of this change is that the various rocksdb caches see better efficiency since more items fit in the same amount of space. * use RLP encoding for `VertexID` and `UInt256` wherever it appears * pack `VertexRef`/`PayloadRef` more tightly	2024-07-02 20:25:06 +02:00
web3-developer	e163b69261	Bump RocksDb version and enable autoClose on opt types to prevent memory leaks (#2427 ) * Bump RocksDb version and enable autoClose on opt types to prevent memory leaks.	2024-07-02 13:44:09 +08:00
Jacek Sieka	3d3831dde8	Small cleanups (#2435 ) * avoid costly hike memory allocations for operations that don't need to re-traverse it * avoid unnecessary state checks (which might trigger unwanted state root computations) * disable optimize-for-hits due to the MPT no longer being complete at all times	2024-07-01 14:07:39 +02:00
Jordan Hrycaj	2c87fd1636	Aristo code cosmetics and tests update (#2434 ) * Update some docu * Resolve obsolete compile time option why: Not optional anymore * Update checks why: The notion of what constitutes a valid `Aristo` db has changed due to (even more) lazy calculating Merkle hash keys. * Disable redundant unit test for production	2024-07-01 10:59:18 +00:00
andri lim	401537ad38	Add ForkedChainRef tests (#2430 ) ForkedChainRef has become quite complex. test_blockchain_json does not sufficiently cover edge cases or synthetic cases.	2024-06-30 14:40:14 +07:00
andri lim	c24affadee	Use simpler schema when writing transactions, receipts, and withdrawals (#2420 ) * Use simpler schema when writing transactions, receipts, and withdrawals Using MPT is not only slow but also takes up more space than needed. Aristo will remove older tries and only keep the last block tries. Using simpler schema will avoid those problems. * Rename getTransaction to getTransactionByIndex	2024-06-29 12:43:17 +07:00
Jordan Hrycaj	8dd038144b	Some cleanups (#2428 ) * Remove `dirty` set from structural objects why: Not used anymore, the tree is dirty by default. * Rename `aristo_hashify` -> `aristo_compute` * Remove cruft, update comments, cosmetics, etc. * Simplify `SavedState` object why: The key chaining have become obsolete after extra lazy hashing. There is some available space for a state hash to be maintained in future. details: Accept the legacy `SavedState` object serialisation format for a while (which will be overwritten by new format.)	2024-06-28 18:43:04 +00:00
Jordan Hrycaj	14c3772545	On demand mpt revisited (#2426 ) * rebased from `github/on-demand-mpt` ackn: wip: on-demand mpt construction Given that actual data is stored in the `Vertex` structure, it's useful to think of the MPT as a cache for computing roots rather than being a functional requirement on its own. This PR engenders this line of thinking by incrementally computing the MPT only when it's needed, ie when a state (or similar) root is needed. This has the effect of significantly reducing memory usage as well as improving performance: * no need for dirty-mpt-node book-keeping * no need to build complex forest of upcoming hashing work * only hashes that are functionally needed are ever computed - intermediate nodes whose MPT root is not observed are never computed / processed * Unit test hot fixes * Unit test hot fixes cont. (somehow lost that part) --------- Co-authored-by: Jacek Sieka <jacek@status.im>	2024-06-28 15:03:12 +00:00
Jordan Hrycaj	6dc2773957	Only use pre hashed addresses as account keys (#2424 ) * Normalised storage tree addressing in function prototypes detail: Argument list is always `<db> <account-path> <slot-path> ..` with both path arguments as `openArray[]` * Remove cruft * CoreDb internally Use full account paths rather than addresses * Update API logging * Use hashed account address only in prototypes why: This avoids unnecessary repeated hashing of the same account address. The burden of doing that is upon the application. In the case here, the ledger caches all kinds of stuff anyway so it is common sense to exploit that for account address hashes. caveat: Using `openArray[byte]` argument types for hashed accounts is inherently fragile. In non-release mode, a length verification `doAssert` is enabled by default. * No accPath in data record (use `AristoAccount` as `CoreDbAccount`) * Remove now unused `eAddr` field from ledger `AccountRef` type why: Is duplicate of lookup key * Avoid merging the account record/statement in the ledger twice.	2024-06-27 19:21:01 +00:00
Jordan Hrycaj	61bbf40014	Update storage tree admin (#2419 ) * Tighten `CoreDb` API for accounts why: Apart from cruft, the way to fetch the accounts state root via a `CoreDbColRef` record was unnecessarily complicated. * Extend `CoreDb` API for accounts to cover storage tries why: In future, this will make the notion of column objects obsolete. Storage trees will then be indexed by the account address rather than the vertex ID equivalent like a `CoreDbColRef`. * Apply new/extended accounts API to ledger and tests details: This makes the `distinct_ledger` module obsolete * Remove column object constructors why: They were needed as an abstraction of MPT sub-trees including storage trees. Now, storage trees are handled by the account (e.g. via address) they belong to and all other trees can be identified by a constant well known vertex ID. So there is no need for column objects anymore. Still there are some left-over column object methods which will be removed next. * Remove `serialise()` and `PayloadRef` from default Aristo API why: Not needed. `PayloadRef` was used for unstructured/unknown payload formats (account or blob) and `serialise()` was used for decoding `PayloadRef`. Now it is known in advance what the payload looks like. * Added query function `hasStorageData()` whether a storage area exists why: Useful for supporting `slotStateEmpty()` of the `CoreDb` API * In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()` * On Aristo, hide the storage root/vertex ID in the `PayloadRef` why: The storage vertex ID is fully controlled by Aristo while the `AristoAccount` object is controlled by the application. With the storage root part of the `AristoAccount` object, there was a useless administrative burden to keep that storage root field up to date. * Remove cruft, update comments etc. * Update changed MPT access paradigms why: Fixes verified proxy tests * Fluffy cosmetics	2024-06-27 09:01:26 +00:00
web3-developer	ea94e8a351	Use RocksDb column family handles instead of name strings. (#2418 ) * Bump RocksDb to latest and update Nimbus database to pass column family handles to RocksDb API. * Bump RocksDb version.	2024-06-27 16:51:43 +08:00
andri lim	b80521a84d	ForkedChain become ForkedChainRef (#2417 ) * ForkedChain become ForkedChainRef It will be shared between engine API, RPC, and txPool * Fix ForkedChainRef constructor	2024-06-27 12:54:52 +07:00
andri lim	27339e9520	Simplify txpool baseFeeGet (#2416 ) * Simplify txpool baseFeeGet - Avoid using toEVMFork because we are not in EVM - Rename `isLondon` to `isLondonOrLater` * Remove timestamp from isLondonOrLater	2024-06-27 12:54:36 +07:00
Jacek Sieka	c8cdffa775	Small cleanups (#2414 ) * remove unnecessary / expensive error checking * avoid some trivial memory allocs * work around table move bug	2024-06-26 09:25:09 +02:00
andri lim	cd21c4fbec	ForkedChain implementation (#2405 ) * ForkedChain implementation - revamp test_blockchain_json using ForkedChain - re-enable previously failing test cases. * Remove excess error handling * Avoid reloading parent header * Do not force base update * Write baggage to database * Add findActiveChain to finalizedSegment * Create new stagingTx in addBlock * Check last stateRoot existence in test_blockchain_json * Resolve rebase conflict * More precise nomenclature for block import cursor * Ensure bad block not imported and good block not rejected * finalizeSegment become forkChoice and align with engine API forkChoice spec * Display reason when good block rejected * Fix comments * Put BaseDistance into CalculateNewBase equation * Separate finalizedHash from baseHash * Add more doAssert constraint * Add push raises: []	2024-06-26 07:27:48 +07:00
Jacek Sieka	3e001e322c	Fix memory usage spikes during sync, give memory to rocksdb (#2413 ) * creating a seq from a table that holds lots of changes means copying all data into the table - this can be several GB of data while syncing blocks * nim fails to optimize the moving of the `WidthFirstForest` - the real solution is to not construct a `wff` to begin with, but this PR provides relief while that is being worked on This spike fix allows us to bump the rocksdb cache by another 2 GB and still have a significantly lower peak memory usage during sync.	2024-06-25 13:39:53 +02:00
Jacek Sieka	f294d1e086	Clear account cache after each block (#2411 ) When processing long ranges of blocks, the account cache grows unbounded which causes huge memory spikes. Here, we move the cache to a second-level cache after each block - the second-level cache is cleared on the next block after that which creates a simple LRU effect. There's a small performance cost of course, though overall the freed-up memory can now be reassigned to the rocksdb row cache which not only makes up for the loss but overall leads to a performance increase. The bump to 2gb of rocksdb row cache here needs more testing but is slightly less and loosely based on the savings from this PR and the circular ref fix in #2408 - another way to phrase this is that it's better to give rocksdb more breathing room than let the memory sit unused until circular ref collection happens ;)	2024-06-25 07:30:32 +02:00

1933 Commits