4167 Commits

Author SHA1 Message Date
Kim De Mey 157fb4f1ef
Re-add toString function for proper hex logs of content keys (#2508) 2024-07-19 14:48:03 +02:00
andri lim 6d03acec30
TxPool refactoring: Simplify TxChainRef and remove gauges (#2506)
This is one of the txPool refactoring series to make it ready
for integration with the new ForkedChainRef
2024-07-19 16:24:36 +07:00
andri lim 57b36dc4ee
Bump nim-stew to 54cc67cbb83f61b6e3168b09701758c5b805120a (#2505) 2024-07-19 16:23:54 +07:00
andri lim fb196849ee
EVM cosmetic changes, one less indirect access of VmCpt (#2503) 2024-07-19 08:44:01 +07:00
Jordan Hrycaj 5ac362fe6f
Aristo and kvt balancer management update (#2504)
* Aristo: Merge `delta_siblings` module into `deltaPersistent()`

* Aristo: Add `isEmpty()` for canonical checking whether a layer is empty

* Aristo: Merge `LayerDeltaRef` into `LayerObj`

why:
  No need to maintain nested object refs anymore. Previously the
 `LayerDeltaRef` object had a companion `LayerFinalRef` which held
  non-delta layer information.

* Kvt: Merge `LayerDeltaRef` into `LayerRef`

why:
  No need to maintain nested object refs (as with `Aristo`)

* Kvt: Re-write balancer logic similar to `Aristo`

why:
  Although `Kvt` was a cheap copy of `Aristo` it sort of got out of
  sync and the balancer code was wrong.

* Update iterator over forked peers

why:
  Yield additional field `isLast` indicating that the last iteration
  cycle was approached.

* Optimise balancer calculation.

why:
  One can often avoid providing a new object containing the merge of two
  layers for the balancer. This avoids copying tables. In some cases this
  is replaced by `hasKey()` look ups though. One uses one of the two
  to combine and merges the other into the first.

  Of course, this needs some checks for making sure that none of the
  components to merge is eventually shared with something else.

* Fix copyright year
2024-07-18 21:32:32 +00:00
andri lim ee323d5ff8
Optimize EVM stack usage (#2502)
* EVM: Optimize CALL family stack usage

* EVM: Optimize CREATE family stack usage

* EVM: Optimize arith stack usage

* EVM: Optimize stack usage in the rest of opcodes

* Fix test_op_env and clean up unused imports

* EVM: Optimize arithmetic binary ops
2024-07-18 18:59:53 +07:00
web3-developer f9956eba59
Fluffy State Bridge - Building state using state diffs. (#2486)
* Started state bridge.

* Implement call to fetch stateDiffs using trace_replayBlockTransactions.

* Convert JSON responses to stateDiff types.

* State updates working for first few blocks.

* Correctly building state for first 200K blocks.

* Add storage of code and cleanup.

* Start state bridge refactor.

* More cleanup and fixes.

* Use RocksDb as backend for state.

* Implement transactions.

* Build RocksDb dependency when building fluffy tools.

* Move code to world state helper.

* Implement producer and consumer queue.

* Cleanup exceptions.

* Improve logging.

* Add update caches to DatabaseRef backends.
2024-07-18 17:01:40 +08:00
Jacek Sieka df4a21c910
Store cached hash at the layer corresponding to the source data (#2492)
When lazily verifying state roots, we may end up with an entire state
without roots that gets computed for the whole database - in the current
design, that would result in hashes for the entire trie being held in
memory.

Since the hash depends only on the data in the vertex, we can store it
directly at the top-most level derived from the vertices it depends on
- be that memory or database - this makes the memory usage broadly
linear with respect to the already-existing in-memory change set stored
in the layers.

It also ensures that if we have multiple forks in memory, hashes get
cached in the correct layer maximising reuse between forks.

The same layer numbering scheme as elsewhere is reused, where -2 is the
backend, -1 is the balancer, then 0+ is the top of the stack and stack.

A downside of this approach is that we create many small batches - a
future improvement could be to collect all such writes in a single
batch, though the memory profile of this approach should be examined
first (where is the batch kept, exactly?).
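
As a rough illustration of the idea above (not the Aristo code), the following
Python sketch caches a hash in the same layer that holds the vertex it was
computed from. The class `LayeredStore`, its methods, the dictionary layout and
the use of SHA-256 as a stand-in hash are all assumptions made for the example,
as is the rule that higher layer numbers shadow lower ones. It is also
simplified so that each hash depends on a single vertex. Only the numbering
(-2 backend, -1 balancer, 0+ stack) comes from the text.

```python
import hashlib

BACKEND, BALANCER = -2, -1  # layer numbers as described in the commit text


class LayeredStore:
    """Hypothetical layered store: backend, balancer, then in-memory layers."""

    def __init__(self, stack_depth):
        layers = range(BACKEND, stack_depth)
        # Every layer keeps its own vertices and its own cached hashes.
        self.vertices = {n: {} for n in layers}
        self.hashes = {n: {} for n in layers}
        # Assumption: higher layer numbers shadow lower ones.
        self.order = sorted(layers, reverse=True)

    def find_vertex(self, vid):
        """Return (layer, data) for the first layer holding vid, or None."""
        for n in self.order:
            if vid in self.vertices[n]:
                return n, self.vertices[n][vid]
        return None

    def cached_hash(self, vid):
        """Compute the hash lazily and cache it where the vertex lives."""
        found = self.find_vertex(vid)
        if found is None:
            return None
        layer, data = found
        if vid not in self.hashes[layer]:
            # SHA-256 is only a stand-in for the real node hash.
            self.hashes[layer][vid] = hashlib.sha256(data).digest()
        return self.hashes[layer][vid]


store = LayeredStore(stack_depth=2)
store.vertices[BACKEND][1] = b"from database"
store.vertices[1][2] = b"from a memory layer"
store.cached_hash(1)  # cached at layer -2, next to its source data
store.cached_hash(2)  # cached at layer 1, next to its source data
```

Because a hash never lands in a layer above its source data, the cache in this
sketch cannot outgrow the change set already held in the layers.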
2024-07-18 09:13:56 +02:00
Jordan Hrycaj 6677f57ea9
Aristo balancer clean up (#2501)
* Remove `chunkedMpt` from `persistent()`/`stow()` function

why:
  Proof-mode code was removed with PR #2445 and needs to be re-designed.

* Remove unused `beStateRoot` argument from `deltaMerge()`

* Update/drastically simplify `txStow()`

why:
  Got rid of many boundary conditions

details:
  Many pre-conditions have changed. In particular, previous versions
  used the account state (hash) which was conveniently available and
  checked it against the backend in order to find out whether there
  was something to do, at all. Currently, only an empty set of all
  tables in the delta layer has the balancer update ignored.

  Notable changes are:
  * no check against account state (see above)
  * balancer filters have no hash signature (some legacy stuff left over
    from journals)
  * no (snap sync) proof data which made the generation of a top layer
    more complex

* Cosmetics, cruft removal

* Update unit test file & function name

why:
  Was legacy module
2024-07-17 19:27:33 +00:00
Kim De Mey 51cf991439
Bump ssz_serialization and use ByteList[n] + add ContentKeyByteList (#2500) 2024-07-17 17:07:27 +02:00
andri lim cfe14f1825
EVM: use assign2 whenever possible (#2499)
Before: GST finishes in 59 secs.
After: GST finishes in 52 secs!
2024-07-17 20:48:50 +07:00
andri lim 8d1e21bbae
Simplify txPool gasLimit calculator (#2498)
Our need is only a baseline tx pool gasLimit calculator.
If needed, we can expand it in the future.
But for now, a simple but understandable tx pool is more important.
2024-07-17 20:48:35 +07:00
Jordan Hrycaj 17391b58d0
Hash keys and hash256 revisited (#2497)
* Remove cruft left-over from PR #2494

* TODO

* Update comments on `HashKey` type values

* Remove obsolete hash key conversion flag `forceRoot`

why:
  Is treated implicitly by having vertex keys as `HashKey` type and
  root vertex states converted to `Hash256`
2024-07-17 20:48:21 +07:00
andri lim 916f88a373
Use block number or timestamp to determine fork rules (#2496)
* Use block number or timestamp to determine fork rules

Avoid confusion raised by `forkGTE` usage where block information is present.

* Get rid of forkGTE
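
A minimal sketch of the approach, assuming each fork is scheduled either by
block number or by timestamp. The `ForkActivation` type, the `SCHEDULE` list
and all activation values are made up for illustration and are not taken from
the nimbus-eth1 sources or from any real network configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForkActivation:
    name: str
    block: Optional[int] = None      # set for forks scheduled by block number
    timestamp: Optional[int] = None  # set for forks scheduled by timestamp

    def is_active(self, block_number: int, block_time: int) -> bool:
        """Decide from the block's own data, not from a fork comparison."""
        if self.timestamp is not None:
            return block_time >= self.timestamp
        if self.block is not None:
            return block_number >= self.block
        return False


# Made-up schedule: one fork keyed by block number, one by timestamp.
SCHEDULE = [
    ForkActivation("london", block=100),
    ForkActivation("shanghai", timestamp=5_000),
]


def active_forks(block_number: int, block_time: int) -> list:
    """Names of all forks whose rules apply to the given block."""
    return [f.name for f in SCHEDULE if f.is_active(block_number, block_time)]


print(active_forks(150, 1_000))
print(active_forks(150, 5_000))
```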
2024-07-17 17:05:53 +07:00
Kim De Mey 3bb707422b
Rework beacon block proofs to better structure (#2493)
The 3 proofs can be reworked to two proofs as we can use the
BeaconBlock directly instead of BeaconBlockHeader and
BeaconBlockBody. This is possible because the HTR of the
BeaconBlock is the same as the one of the BeaconBlockHeader.

This results in 32 bytes less as an intermediate hash can be
removed. But more importantly looks more clean and compact in
structure and code.
2024-07-17 11:32:05 +02:00
andri lim a59cc84fca
Not using deprecated functions in config anymore (#2495) 2024-07-17 02:57:19 +00:00
Jordan Hrycaj a84a2131cd
No ext update (#2494)
* Imported/rebase from `no-ext`, PR #2485

  Store extension nodes together with the branch

  Extension nodes must be followed by a branch - as such, it makes sense
  to store the two together both in the database and in memory:

  * fewer reads, writes and updates to traverse the tree
  * simpler logic for maintaining the node structure
  * less space used, both memory and storage, because there are fewer
    nodes overall

  There is also a downside: hashes can no longer be cached for an
  extension - instead, only the extension+branch hash can be cached - this
  seems like a fine tradeoff since computing it should be fast.

  TODO: fix commented code

* Fix merge functions and `toNode()`

* Update `merkleSignCommit()` prototype

why:
  Result is always a 32 byte hash

* Update short Merkle hash key generation

details:
  Ethereum reference MPTs use Keccak hashes as node links if the size of
  an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded
  node value is used as a pseudo node link (rather than a hash.) This is
  specified in the yellow paper, appendix D.

  Different to the `Aristo` implementation, the reference MPT would not
  store such a node on the key-value database. Rather, the RLP encoded
  node value is stored in place of a node link in the parent node.

  Only for the root hash, the top level node is always referred to by the
  hash.

* Fix/update `Extension` sections

why:
  Were commented out after removal of a dedicated `Extension` type which
  left the system dysfunctional.

* Clean up unused error codes

* Update unit tests

* Update docu
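
The node link rule described under `Update short Merkle hash key generation`
above can be sketched as follows. This is an illustration only: the function
name `node_link` and its parameters are hypothetical, and the hash function is
passed in because Python's standard library has no Keccak-256. The
`standin_hash` helper uses `hashlib.sha3_256`, which is NOT Keccak-256 (the
padding differs), so the digests here do not match real Ethereum values.

```python
import hashlib
from typing import Callable


def node_link(rlp_node: bytes,
              hash_fn: Callable[[bytes], bytes],
              is_root: bool = False) -> bytes:
    """Return what a parent stores to refer to an RLP encoded node.

    Nodes of at least 32 bytes are referred to by their hash, shorter
    nodes are embedded as is. The root is always referred to by hash.
    """
    if is_root or len(rlp_node) >= 32:
        return hash_fn(rlp_node)
    return rlp_node


def standin_hash(data: bytes) -> bytes:
    # Stand-in only: SHA3-256 is not the Keccak-256 used by Ethereum.
    return hashlib.sha3_256(data).digest()


short_node = b"\xc1\x80"   # 2 bytes: embedded directly
long_node = b"x" * 32      # 32 bytes: replaced by a 32 byte digest

embedded = node_link(short_node, standin_hash)
hashed = node_link(long_node, standin_hash)
root = node_link(short_node, standin_hash, is_root=True)
```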

---------

Co-authored-by: Jacek Sieka <jacek@status.im>
2024-07-16 19:47:59 +00:00
Jacek Sieka 0e36a17e5b
avoid re-writing code (#2490)
Avoids pointless rocksdb writes that cause write compaction /
amplification, especially in the case where code is shared between
multiple accounts
2024-07-15 15:02:23 +02:00
Kim De Mey 6cf57d9912
Remove duplicate logging code from fluffy (#2488)
Code was copied back when nimbus-eth2 was not a dependency.
Now that it is an established dependency for many parts, let's reuse
the original code.
2024-07-15 11:41:17 +02:00
Jacek Sieka 9d91191154
storage hike cache (#2484)
This PR adds a storage hike cache similar to the account hike cache
already present - this cache is less efficient because account storage
is already partially cached in the account ledger but nonetheless helps
keep hiking down.

Notably, there's an opportunity to optimise this cache and the others so
that they cooperate better instead of overlapping, which is left for a
future PR.

This PR also fixes an O(N) memory usage for storage slots where the
delete would keep the full storage in a work list which on mainnet can
grow very large - the work list is replaced with a more conventional
recursive `O(log N)` approach.
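
To illustrate the memory argument, here is a minimal sketch of a recursive
delete over a toy tree. The dictionary-based node layout and the functions
`build_tree` and `delete_subtree` are assumptions for this example and do not
reflect the actual Aristo data structures. The point is only that the live
state is bounded by the recursion depth rather than by the number of nodes.

```python
import itertools


def build_tree(levels, ids):
    """Build a balanced binary toy tree with the given number of levels."""
    node = {"id": next(ids), "children": []}
    if levels > 1:
        node["children"] = [build_tree(levels - 1, ids) for _ in range(2)]
    return node


def delete_subtree(node, on_delete, depth=1):
    """Delete children first, then the node itself.

    Returns the deepest recursion level reached, which is the only
    per-call state kept alive, as opposed to a list of all nodes.
    """
    deepest = depth
    for child in node["children"]:
        deepest = max(deepest, delete_subtree(child, on_delete, depth + 1))
    on_delete(node["id"])
    return deepest


deleted = []
tree = build_tree(4, itertools.count())   # 15 nodes, 4 levels
max_depth = delete_subtree(tree, deleted.append)
```

For a balanced tree the depth grows with the logarithm of the node count, which
is where the `O(log N)` bound comes from in this sketch.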
2024-07-14 19:12:10 +02:00
Jacek Sieka f3a56002ca
Turn payload into value type (#2483)
The Vertex type unifies branches, extensions and leaves into a single
memory area where the largest member is the branch (128 bytes + overhead) -
the payloads we have are all smaller than 128 thus wrapping them in an
extra layer of `ref` is wasteful from a memory usage perspective.

Further, the ref:s must be visited during the M&S phase of garbage
collection - since we keep millions of these, many of them
short-lived, this takes up significant CPU time.

```
Function	CPU Time: Total	CPU Time: Self	Module	Function (Full)	Source File	Start Address
system::markStackAndRegisters	10.0%	4.922s	nimbus	system::markStackAndRegisters(var<system::GcHeap>).constprop.0	gc.nim	0x701230
```
2024-07-14 12:02:05 +02:00
Jacek Sieka 72947b3647
odds and ends (#2481)
small cleanups to reduce memory allocations
2024-07-13 20:42:49 +02:00
Jordan Hrycaj f08178c592
Separate constructor helpers for core db and ledger (#2480)
* Extract `CoreDb` constructor helpers from `base.nim` into separate module

why:
  This makes it easier to avoid circular imports.

* Extract `Ledger` constructor helpers from `base.nim` into separate module

why:
  Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
  layout resembles that of the `core_db`.
2024-07-12 19:32:31 +00:00
Jordan Hrycaj b924fdcaa7
Separate config for core db and ledger (#2479)
* Updates and corrections

* Extract `CoreDb` configuration from `base.nim` into separate module

why:
  This makes it easier to avoid circular imports, in particular
  when the capture journal (aka tracer) is revived.

* Extract `Ledger` configuration from `base.nim` into separate module

why:
  This makes it easier to avoid circular imports (if any.)

also:
  Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
  layout resembles that of the `core_db`.
2024-07-12 13:12:25 +00:00
Jacek Sieka 01ab209497
cache account payload (#2478)
Instead of caching just the storage id, we can cache the full payload
which further reduces expensive hikes
2024-07-12 15:08:26 +02:00
Jacek Sieka d07540766f
coredb: tracking fixes (#2476) 2024-07-12 13:40:13 +02:00
Kim De Mey 70682cd4a8
Bump NimYAML module and related changes (#2474) 2024-07-12 09:30:50 +02:00
Advaita Saha 25af347dfd
Shift era helpers to a different file (#2475)
* shift helpers to a different file

* fix: few logic fixed for transition from era1 to era
2024-07-12 03:15:14 +00:00
Kim De Mey d996e60347
Rename to EpochRecord and other accumulator spec changes (#2473)
- EpochAccumulator got renamed to EpochRecord
- MasterAccumulator is now HistoricalHashesAccumulator
- The List size for the accumulator got a different maximum which
also results in a different encoding and HTR
2024-07-11 17:42:45 +02:00
Jacek Sieka a6764670f0
merge: avoid hike allocations (#2472)
hike allocations (and the garbage collection maintenance that follows)
are responsible for some 10% of cpu time (not wall time!) at this point
- this PR avoids them by stepping through the layers one step at a time,
simplifying the code at the same time.
2024-07-11 13:26:46 +02:00
Kim De Mey 4a20756e6b
Remove unused seed_db and related code (#2471) 2024-07-10 23:02:15 +02:00
Kim De Mey 94340037bf
Set the routing table ip limits back to defaults (#2470) 2024-07-10 17:26:30 +02:00
Kim De Mey 5ae5bd8b69
Bump portal-mainnet for updated fluffy bootstrap ENRs (#2469) 2024-07-10 15:45:11 +02:00
Jordan Hrycaj 800fd77333
Core db remove legacy phrases (#2468)
* Rename `newKvt()` -> `ctx.getKvt()`

why:
  Clean up legacy shortcut. Also, the `KVT` returned is not instantiated
  but refers to the shared `KVT` that resides in a context which is a
  generalisation of an in-memory database fork. The function `ctx`
  retrieves the default context.

* Rename `newTransaction()` -> `ctx.newTransaction()`

why:
  Clean up legacy shortcut. The transaction is applied to a context as a
  generalisation of an in-memory database fork. The function `ctx`
  retrieves the default context.

* Rename `getColumn(CtGeneric)` -> `getGeneric()`

why:
  A list of well known sub-tries is no longer needed, a single one is enough.
  In fact, `getColumn()` only supported a single sub-tree so far.

* Reduce TODO list
2024-07-10 12:19:35 +00:00
web3-developer 9fc5495d49
Update nim-rocksdb to latest version. (#2467) 2024-07-10 14:18:02 +08:00
Kim De Mey 54e3fd1a94
Move Portal wire and networks setup to new portal_node module (#2464) 2024-07-09 19:22:25 +02:00
Jacek Sieka 25b5f01357
bump stint (#2465)
avoids extreme modmul bottleneck
2024-07-09 18:07:21 +02:00
Jacek Sieka 3382c2427b
increase rdb cache sizes (#2466)
This trivial bump should improve performance a bit without costing too
much memory - as the trie grows, so does the number of levels in it and
creating hikes becomes ever more expensive - hopefully this cache
increase should give a nice little boost even if it's not a lot.
2024-07-09 17:35:27 +02:00
Jacek Sieka ab23148aab
don't rewrite hash->slot map (#2463)
Avoid writing the same slot/hash values to the hash->slot mapping
to avoid spamming the rocksdb WAL and causing unnecessary compaction

In the same vein, avoid writing trivially detectable A-B-A storage
changes which happen with surprising frequency.
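
A minimal sketch of this kind of write elision, assuming a simple pending-write
buffer in front of the database. The class `SlotWriteBuffer` and its methods
are hypothetical and are not part of the nimbus-eth1 code. A plain dictionary
stands in for rocksdb.

```python
class SlotWriteBuffer:
    """Hypothetical buffer that drops writes with no net effect."""

    def __init__(self, persisted):
        self.persisted = persisted  # dict slot -> value, stands in for the db
        self.pending = {}

    def set(self, slot, value):
        if self.persisted.get(slot) == value:
            # Either the value never changed or it went A -> B -> A:
            # there is nothing to write for this slot.
            self.pending.pop(slot, None)
        else:
            self.pending[slot] = value

    def flush(self):
        """Write the remaining changes and return how many were written."""
        written = len(self.pending)
        self.persisted.update(self.pending)
        self.pending.clear()
        return written


buf = SlotWriteBuffer({1: "A"})
buf.set(1, "B")   # A -> B, queued
buf.set(1, "A")   # B -> A, back to the stored value, write dropped
buf.set(2, "C")   # genuinely new value
writes = buf.flush()
```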
2024-07-09 17:25:43 +02:00
Advaita Saha 9a499eb45f
Era support for nimbus import (#2429)
* add the era-dir option

* feat: support for era files in nimbus import

* fix: metric logs

* fix: eraDir check

* fix: redundant code and sepolia support

* fix: remove dependency from csv + formatting

* fix: typo

* fix: RVO

* fix: parseBiggestInt

* fix: opt impl

* fix: network agnostic loading

* fix: shift to int64
2024-07-09 15:28:01 +02:00
andri lim 4fa3756860
Convert GasInt to uint64, bump nim-eth and nimbus-eth2 (#2461)
* Convert GasInt to uint64, bump nim-eth and nimbus-eth2

* Bump nimbus-eth2

* int64.high.GasInt instead of 0x7fffffffffffffff.GasInt
2024-07-07 06:52:11 +00:00
andri lim e8683692fd
EVM gasSstore refund reduction using positive integer (#2460)
This is hopefully the last part of preparations
before converting GasInt to uint64
2024-07-06 08:39:38 +07:00
andri lim 4eaae5cbfa
EVM gasCall values always stay on positive side (#2459)
* EVM gasCall values always stay on positive side

This is also another part of preparations before
converting GasInt to uint64

* Fix test_evm_support
2024-07-06 08:39:22 +07:00
andri lim c775c906a2
Fix LedgerRef storage iterator and add test (#2458) 2024-07-05 10:15:48 +00:00
andri lim 6fe7411ac0
Saner EVM gasCosts (#2457)
This is also a part of preparations before converting GasInt to uint64
2024-07-05 11:55:13 +07:00
andri lim 23c00ce88c
Separate evmc gasCosts and nim-evm gasCosts (#2454)
This is part of preparations before converting GasInt to uint64
2024-07-05 07:00:03 +07:00
Jacek Sieka 7d78fd97d5
avoid allocations for slot storage (#2455)
Introduce a new `StoData` payload type similar to `AccountData`

* slightly more efficient storage format
* typed api
* fewer seqs
* fix encoding docs - it wasn't rlp after all :)
2024-07-04 23:48:45 +00:00
tersec 1f40b710ee
fix UnusedImport warnings; bump nim-bearssl, nim-stint, and nim-stew (#2456) 2024-07-05 06:46:59 +07:00
Jacek Sieka 893bfa4305
Enable LTO compilation (#2450)
* Enable LTO compilation

Similar to nimbus-eth2, LTO gives a significant boost for any CPU-bound operations such as the EVM.

The options are copied straight from nimbus-eth2 - for example at block height 1.7M there's a computation-heavy section where we can see a 15%-20% improvement in block processing time.

```
                       bps_x     bps_y     tps_x     tps_y time_x time_y     bpsd     tpsd    timed

(1722223, 1733334]    102.52    138.90  1,049.67  1,420.61  2m41s  1m58s   35.78%   35.78%  -26.32%
```

* avoid defer

When evmc recursion is enabled together with LTO, we run out of stack
space.

`defer` creates an exception handling context that takes up hundreds of
bytes of stack space - now that the EVM is no longer using exceptions,
we can safely get rid of it.
2024-07-04 23:10:40 +07:00
Jacek Sieka 79788c01d4
Add debug mode for disabling per-chunk state root validation (#2453)
This significantly speeds up block import at the cost of less protection
against invalid data, potentially resulting in an invalid database
getting stored.

The risk is small given that import is used only for validated data -
evaluating the right level of validation vs performance is left for a
future PR.

A side effect of this approach is that there is no cached state root in
the database - computing it currently requires a lot of memory since the
intermediate roots get cached in memory in full while the computation is
ongoing - a future PR will need to address this deficiency, for example
by streaming the already-computed hashes directly to the database.
2024-07-04 16:51:50 +02:00