433 Commits

Author SHA1 Message Date
Jacek Sieka
189a20bbae
Avoid recomputing hashes when persisting data (#2350) 2024-06-14 07:10:00 +02:00
Jordan Hrycaj
5a5cc6295e
Triggered write event for kvt (#2351)
* bump rockdb

* Rename `KVT` objects related to filters according to `Aristo` naming

details:
  filter* => delta*
  roFilter => balancer

* Compulsory error handling if `persistent()` fails

* Add return code to `reCentre()`

why:
  Might eventually fail if re-centring is blocked. Some logic will be
  added in subsequent patch sets.

* Add column families from earlier session to rocksdb in opening procedure

why:
  All previously used CFs must be declared when re-opening an existing
  database.

* Update `init()` and add rocksdb `reinit()` methods for changing parameters

why:
  Opening a set column families (with different open options) must span
  at least the ones that are already on disk.

* Provide write-trigger-event interface into `Aristo` backend

why:
  This allows to save data from a guest application (think `KVT`) to
  get synced with the write cycle so the guest and `Aristo` save all
  atomically.

* Use `KVT` with new column family interface from `Aristo`

* Remove obsolete guest interface

* Implement `KVT` piggyback on `Aristo` backend

* CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging

* Remove `rocks_db` import from `persist()` function

why:
  Some systems (i.p `fluffy` and friends) use the `Aristo` memory
  backend emulation and do not link against rocksdb when building the
  application. So this should fix that problem.
2024-06-13 18:15:11 +00:00
Jacek Sieka
5f44be1bcd
Remove rehashing in storage ledger init (#2340)
This looks like a debug leftover, but it causes a state root
recomputation which is slow
2024-06-12 21:13:53 +02:00
Jacek Sieka
54f793f946
Apply some basic rocksdb options (#2339)
These options, inspired by Nethermind and general internet wisdom, bring
the database size down to 2/3 without affecting throughput. In theory,
they should also bring down memory usage and/or make more efficient use
of whatever memory is already assigned to rocksdb but this needs
verification in a longer test at synced-mainnet sizes.

In the meantime, they make testing easier by removing some noise that
the profiler says are bad, such as excessive SkipList access (countered
by bloom filters).
2024-06-12 14:52:27 +02:00
Jacek Sieka
eb041abba7
avoid unnecessary memory allocations and lookups (#2334)
* use `withValue` instead of `hasKey` + `[]`
* avoid `@` et al
* parse database data inside `onData` instead of making seq then parsing
2024-06-11 11:38:58 +02:00
Jordan Hrycaj
a347291413
Aristo use rocksdb cf instead of key pfx (#2332)
* Use RocksDb column families instead of a prefixed single column

why:
  Better performance

* Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb

why:
  Avoids repeated de/serialisation
2024-06-10 12:04:22 +00:00
Jacek Sieka
f6be4bd0ec
avoid initTable (#2328)
`initTable` is obsolete since nim 0.19 and can introduce significant
memory overhead while providing no benefit (since the table will be
grown to the default initial size on first use anyway).

In particular, aristo layers will not necessarily use all tables they
initialize, for exampe when many empty accounts are being created.
2024-06-10 11:05:30 +02:00
Jacek Sieka
359f7ada65
eth: avoid sink (#2331)
* eth: avoid sink

* bump

* fix extra transactions

from #2330
2024-06-10 14:16:22 +07:00
Jacek Sieka
0b32078c4b
Consolidate block type for block processing (#2325)
This PR consolidates the split header-body sequences into a single EthBlock
sequence and cleans up the fallout from that which significantly reduces
block processing overhead during import thanks to less garbage collection
and fewer copies of things all around.

Notably, since the number of headers must always match the number of bodies,
we also get rid of a pointless degree of freedom that in the future could
introduce unnecessary bugs.

* only read header and body from era file
* avoid several unnecessary copies along the block processing way
* simplify signatures, cleaning up unused arguemnts and returns
* use `stew/assign2` in a few strategic places where the generated
  nim assignent is slow and add a few `move` to work around poor
  analysis in nim 1.6 (will need to be revisited for 2.0)

```
stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv
                       bps_x     bps_y     tps_x        tps_y    bpsd    tpsd    timed
block_number
(498305, 713245]    1,540.52  1,809.73  2,361.58  2775.340189  17.63%  17.63%  -14.92%
(713245, 928185]      730.36    865.26  1,715.90  2028.973852  18.01%  18.01%  -15.21%
(928185, 1143126]     663.03    789.10  2,529.26  3032.490771  19.79%  19.79%  -16.28%
(1143126, 1358066]    393.46    508.05  2,152.50  2777.578119  29.13%  29.13%  -22.50%
(1358066, 1573007]    370.88    440.72  2,351.31  2791.896052  18.81%  18.81%  -15.80%
(1573007, 1787947]    283.65    335.11  2,068.93  2441.373402  17.60%  17.60%  -14.91%
(1787947, 2002888]    287.29    342.11  2,078.39  2474.179448  18.99%  18.99%  -15.91%
(2002888, 2217828]    293.38    343.16  2,208.83   2584.77457  17.16%  17.16%  -14.61%
(2217828, 2432769]    140.09    167.86  1,081.87  1296.336926  18.82%  18.82%  -15.80%

blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s
bpsd (mean): 19.55%
tpsd (mean): 19.55%
Time (total): -29m13s, -15.14%
```
2024-06-09 16:32:20 +02:00
web3-developer
db8c5b90bd
Cleanup stateless and block witness code. (#2295)
* Cleanup unneeded stateless and block witness code. Keeping MultiKeys which is used in the eth_getProofsByBlockNumber RPC endpoint which is needed for the Fluffy state network bridge.

* Rename generateWitness flag to collectWitnessData to better describe what the flag does. We only collect the keys of the touched accounts and storage slots but no block witness generation is supported for now.

* Move remaining stateless code into nimbus directory.

* Add vmstate parameter to ChainRef to fix test.

* Exclude *.in from check copyright year

---------

Co-authored-by: jangko <jangko128@gmail.com>
2024-06-08 15:05:00 +07:00
tersec
5008b89185
EIP-2537 BLS12-381 G1 add/mul/exp and G2 add/mul support with tests (#2315) 2024-06-08 07:39:53 +07:00
tersec
fd03038cab
Replace some usage of std/options with results Opt (#2323)
* Replace some usage of std/options with results Opt

* more updates
2024-06-07 23:39:58 +02:00
Jordan Hrycaj
392088e5e9
Coredb fix storage tree issues (#2317)
* Code cosmetics

* Re-org `aristo_merge`, internally split into sub-modules

why:
  Became a burden for maintenance because it hosts two different
  functionalities under the same merge paradigm: account/data merge
  and snap proof merge where the latter produces a partial trie.

* Fix CoreDb tracer

* Ledger: fix potential account vs. storage tree sync problems

* Remove bound on the size of removable whole storage trees

* Activate `test_tracer_json`
2024-06-07 10:56:31 +00:00
Jacek Sieka
c5b3081828
eth: bump (#2308)
* eth: bump

Speed up basic operations like hashing and creating RLP:s - up to 25%
improvement in certain block ranges!

```
876729c.csv /data/nimbus_stats/stats-20240605_2204-ed4f6221.csv
stats-20240605_2000-c876729c.csv vs stats-20240605_2204-ed4f6221.csv
                       bps_x   bps_y     tps_x        tps_y    bpsd    tpsd    timed
block_number
(500001, 888889]    1,017.72  996.07  1,784.96  1742.438676  -2.72%  -2.72%    3.31%
(888889, 1277778]     528.00  536.30  2,159.79  2198.781046   1.69%   1.69%   -1.44%
(1277778, 1666667]    324.29  317.78  2,064.48  2008.106377  -2.82%  -2.82%    3.33%
(1666667, 2055556]    253.87  258.74  1,840.94  1872.935273   1.67%   1.67%   -1.39%
(2055556, 2444445]    175.79  178.66  1,340.61  1363.248939   0.93%   0.93%   -0.74%
(2444445, 2833334]    137.27  159.74    958.75  1113.323757  14.24%  14.24%  -10.69%
(2833334, 3222223]    170.48  228.63  1,272.70  1704.047195  34.41%  34.41%  -25.17%
(3222223, 3611112]    127.49  125.48  1,572.39  1548.835791  -1.19%  -1.19%    1.47%
(3611112, 4000001]     37.25   40.42  1,100.65  1184.740493   9.58%   9.58%   -7.04%

blocks: 3501696, baseline: 11h59m40s, contender: 11h21m38s
bpsd (mean): 6.18%
tpsd (mean): 6.18%
Time (sum): -38m1s, -4.26%

bpsd = blocks per sec diff (+), tpsd = txs per sec diff, timed = time to process diff (-)
+ = more is better, - = less is better
```

* ignore gitignore
2024-06-06 23:39:09 +00:00
Jordan Hrycaj
e9eae4df70
Core db disable legacy api n remove distinct tries (#2299)
* CoreDb: Remove crufty second/off-site KVT

why:
  Was used to allow late `Clique` to store directly to disk

* CoreDb: Remove prune flag related functionality

why:
  Is completely legacy stuff

* CoreDb: Remove dependence on legacy API (tests unsupported yet)

why:
  Does not fully support Aristo

* Re-factoring `state_db` using new API

details:
  Only minimum changes needed to compile `nimbus`

* Update tests and aux modules

* Turn off legacy API and remove `distinct_tries`

comment:
  The legacy API has now cruft status, will be removed soon

* Fix copyright years

* Update rpc for verified proxy

---------

Co-authored-by: Jacek Sieka <jacek@status.im>
2024-06-05 20:52:04 +00:00
Jordan Hrycaj
8985535ab2
Core db+aristo updates n fixes (#2298)
* Fix `blobify()` for `SavedState` object

why:
  Have to treat varying sizes for `HashKey`, i.p. for an empty key which
  has zero size.

* Store correct block number in `SavedState` record

why:
  Stored `block-number - 1` for some obscure reason.

* Cosmetcs, docu
2024-06-05 18:17:50 +00:00
Jacek Sieka
c876729c4d
Add some basic rocksdb options to command line (#2286)
These options are there mainly to drive experiments, and are therefore
hidden.

One thing that this PR brings in is an initial set of caches and buffers for rocksdb - the set that I've been using during various performance tests to get to a viable baseline performance level.
2024-06-05 17:08:29 +02:00
Jacek Sieka
95a4adc1e8
use statically linked rocksdb on linux/mac, dll on windows (#2291)
The `rocksdb` version shipped with distributions is typically old and
therefore often lacks features we use - it also doesn't match the one
assumed by nim-rocksdb leading to ABI mismatch risks.

Instead of depending on the system rocksdb, we'll now use the rocksdb
version assumed by nim-rocksdb and locked in its vendor folder by always
building it together with nimbus.

This avoids the problem of unknown rocksdb versions at a (small) cost to
build time.

CI caching and full windows support for building from source [remains
TODO](https://github.com/status-im/nim-rocksdb/issues/44).
2024-06-04 18:15:33 +02:00
Jordan Hrycaj
69a158864c
Remove vid recycling feature (#2294) 2024-06-04 15:05:13 +00:00
Jordan Hrycaj
cc909c99f2
Fix crash in de-serialiser (#2289)
why:
  Late change from `Hash256` to `HashKey` without fully updating
  the serialiser.
2024-06-04 10:38:11 +00:00
Jordan Hrycaj
f926222fec
Aristo cull journal related stuff (#2288)
* Remove all journal related stuff

* Refactor function names journal*() => delta*(), filter*() => delta*()

* remove `trg` fileld from `FilterRef`

why:
  Same as `kMap[$1]`

* Re-type FilterRef.src as `HashKey`

why:
  So it is directly comparable to `kMap[$1]`

* Moved `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef`

why:
  Then a separate `FilterRef` type is not needed, anymore

* Rename `roFilter` field in `AristoDbRef` => `balancer`

why:
  New name more appropriate.

* Replace `FilterRef` by `LayerDeltaRef` type

why:
  This allows to avoid copying into the `balancer` (see next patch set)
  most of the time. Typically, only one instance is running on the backend
  and the `balancer` is only used as a stage before saving data.

* Refactor way how to store data persistently

why:
  Avoid useless copy when staging `top` layer for persistently saving to
  backend.

* Fix copyright header?
2024-06-03 20:10:35 +00:00
Jacek Sieka
7f76586214
Speed up account ledger a little (#2279)
`persist` is a hotspot when processing blocks because it is run at least
once per transaction and loops over the entire account cache every time.

Here, we introduce an extra `dirty` map that keeps track of all accounts
that need checking during `persist` which fixes the immediate
inefficiency, though probably this could benefit from a more thorough
review - we also get rid of the unused clearCache flag - we start with
a fresh cache on every fresh vmState.

* avoid unnecessary code hash comparisons
* avoid unnecessary copies when iterating
* use EMPTY_CODE_HASH throughout for code hash comparison
2024-06-02 21:21:29 +02:00
Jacek Sieka
ef864ba167
disable vid reuse compaction (#2276)
The current implementation cannot be used practically since it causes
several full reallocations of the whole free list per deletion - it
needs to be reimplemented, or the chain cannot practically progress
beyond ~2.5M blocks where a lot of removals happen.

Co-authored-by: tersec <tersec@users.noreply.github.com>
2024-06-01 17:14:54 +02:00
Jacek Sieka
9f879406f3
append instead of reallocate in blobify (#2277)
...otherwise, we get lots and lots of temporary allocations of seq's
2024-06-01 17:13:24 +02:00
andri lim
f9765e617b
Early exit from some of CoreDbRef functions if nothing to do (#2259)
* Early exit from some of CoreDbRef functions if nothing to do

* More exits

* persistReceipts early exit if nothing to do
2024-06-01 16:54:02 +02:00
tersec
cfbbcda4f7
rm unused Nim modules (#2270) 2024-06-01 17:49:46 +07:00
Jordan Hrycaj
bda760f41d
Run coredb without journal (#2266)
* Add persistent last state stamp feature

why:
  This allows to run `CoreDb` without journal

* Start `CoreDb` without journal

* Remove journal related functions from `CoredDb`
2024-05-31 17:32:22 +00:00
Jordan Hrycaj
0f430c70fd
Aristo avoid storage trie update race conditions (#2251)
* Update TDD suite logger output format choices

why:
  New format is not practical for TDD as it just dumps data across a wide
  range (considerably larder than 80 columns.)

  So the new format can be turned on by function argument.

* Update unit tests samples configuration

why:
  Slightly changed the way to find the `era1` directory

* Remove compiler warnings (fix deprecated expressions and phrases)

* Update `Aristo` debugging tools

* Always update the `storageID` field of account leaf vertices

why:
  Storage tries are weekly linked to an account leaf object in that
  the `storageID` field is updated by the application.

  Previously, `Aristo` verified that leaf objects make sense when passed
  to the database. As a consequence
  * the database was inconsistent for a short while
  * the burden for correctness was all on the application which led
    to delayed error handling which is hard to debug.

  So `Aristo` will internally update the account leaf objects so that
  there are no race conditions due to the storage trie handling

* Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)`

why:
  The journal and filter logic depends on the hash of the `VertexID(1)`
  which is commonly known as the state root. This implies that all
  changes to the database are somehow related to that.

* Make sure that a `Ledger` account does not overwrite the storage trie reference

why:
  Due to the abstraction of a sub-trie (now referred to as column with a
  hash describing its state) there was a weakness in the `Aristo` handler
  where an account leaf could be overwritten though changing the validity
  of the database. This has been changed and the database will now reject
  such changes.

  This patch fixes the behaviour on the application layer. In particular,
  the column handle returned by the `CoreDb` needs to be updated by
  the `Aristo` database state. This mitigates the problem that a storage
  trie might have vanished or re-apperaed with a different vertex ID.

* Fix sub-trie deletion test

why:
  Was originally hinged on `VertexID(1)` which cannot be wholesale
  deleted anymore after the last Aristo update. Also, running with
  `VertexID(2)` needs an artificial `VertexID(1)` for making `stow()`
  or `persist()` work.

* Cosmetics

* Activate `test_generalstate_json`

* Temporarily `deactivate test_tracer_json`

* Fix copyright header

---------

Co-authored-by: jordan <jordan@dry.pudding>
Co-authored-by: Jacek Sieka <jacek@status.im>
2024-05-30 17:48:38 +00:00
Jacek Sieka
919242c98e
results: use canonical import (#2248) 2024-05-30 14:54:03 +02:00
andri lim
0a07425112
Cleanup unused raises in evm/state and other obsolete informations (#2243) 2024-05-30 09:03:54 +00:00
tersec
674394b924
fix import path; force refc memory management even with Nim 2.0+ (#2241) 2024-05-29 20:47:06 +02:00
tersec
c466edfd8d
change an Aristo function name to avoid Nim stdlib ambiguity (#2240) 2024-05-29 15:08:00 +00:00
andri lim
eaf3d9897e
Simplify AccountsLedgerRef complexity (#2239) 2024-05-29 13:06:49 +02:00
Jordan Hrycaj
3a62250d04
Make test op memory work again (#2236)
* Remove crufty `pruneTrie` arguments

* Replaced legacy `distinct_trie` logic by new `ledger` functionality

why:
  The module `distinct_trie` is supported by `Aristo` in trivial cases.

* Activate `test_op_memory`
2024-05-28 14:24:10 +00:00
Jacek Sieka
08e98eb385
restore a few tests, cleanup (#2234)
* remove `compensateLegacySetup`, `localDbOnly`
* enable trivially fixable tests
2024-05-28 14:49:35 +02:00
tersec
ca60b13e6a
rm clique/mining remnants; rm unused code (#2232) 2024-05-28 07:10:10 +02:00
andri lim
eb50dcc2f9
Noop in AccountsLedger.getCode if the codehash is EMPTY_CODE_HASH (#2229) 2024-05-27 10:26:26 +02:00
andri lim
5db6ee07fb
Fix accounts_ledger savePoint conversion (#2226) 2024-05-26 17:42:02 +07:00
Jacek Sieka
9c3de888a4
era: simplify, instant startup (#2218)
This PR exploits structural properties of era files to simplify the
implementation and in particular remove the need to load all era file
indicies at startup which may be slow (due to archival storage residing
on slow drives)
2024-05-26 08:24:13 +02:00
tersec
34ac68990f
fix warnings around unused imports of std/algorithm; proc -> func (#2220) 2024-05-25 21:01:28 +02:00
Jacek Sieka
0a49833d69
avoid a few more copies (#2215) 2024-05-24 11:27:17 +02:00
Jacek Sieka
f38c5e631e
trivial memory-based speedups (#2205)
* trivial memory-based speedups

* HashKey becomes non-ref
* use openArray instead of seq in lots of places
* avoid sequtils.reversed when unnecessary
* add basic perf stats to test_coredb

* copyright
2024-05-23 17:37:51 +02:00
Jordan Hrycaj
2c322054e0
Attempt to roll back stateless mode implementation in a single PR (#2209)
* Attempt to roll back stateless mode implementation in a single PR

why:
+ Stateless mode is not fully working and in the way
+ Single PR should make it feasible to investigate for a possible
  re-implementation

* Fix copyright year

* Fix annotation for exception (evmc mode)
2024-05-22 21:01:19 +00:00
Jordan Hrycaj
9cc6e5a3aa
Aristo resume off line syncing on pre loaded database (#2203)
* Update some docu & messages

* Remove cruft from the ledger modules

* Must not overwrite genesis data on an initialised database

why:
  This will overwrite the global state of the Aristo single state DB.
  Otherwise resuming at the last synced state becomes impossible.

* Provide latest block number from journal

why:
  This relates the global state of the DB directly to the corresponding
  block number.

* Implemented unit test providing DB pre-load and resume
2024-05-22 13:41:14 +00:00
Jordan Hrycaj
2629d412d7
CoreDb+Aristo: Update tracer (#2201)
why:
  When deleting accounts while restoring the previous state, storage tries
  must be deleted first. Otherwise a `DelDanglingStoTrie` error will occur
  when trying to delete an account which refers to an active storage trie.
2024-05-21 12:51:06 +00:00
Jordan Hrycaj
de0388919f
Unified mode for undumping gzip-ed or era1-ed encoded block dumps (#2198)
ackn:
  Built on Daniel's work
2024-05-20 13:59:18 +00:00
Jordan Hrycaj
ee9aea171d
Culling legacy DB and accounts cache (#2197)
details:
+ Compiles nimbus all_tests
+ Failing tests have been commented out
2024-05-20 10:17:51 +00:00
Jacek Sieka
293ce28e4d
remove geth_db (#2189)
doesn't look like it's used
2024-05-16 22:00:20 +07:00
Etan Kissling
c4c37302b1
Introduce wrapper type for EIP-4844 transactions (#2177)
* Introduce wrapper type for EIP-4844 transactions

EIP-4844 blob sidecars are a concept that only exists in the mempool.
After inclusion of a transaction into an execution block, only the
versioned hash within the transaction remains. To improve type safety,
replace the `Transaction.networkPayload` member with a wrapper type
`PooledTransaction` that is used in contexts where blob sidecars exist.

* Bump nimbus-eth2 to 87605d08a7f9cfc3b223bd32143e93a6cdf351ac

* IPv6 'listen-address' in `nimbus_verified_proxy`

* Bump nim-libp2p to 21cbe3a91a70811522554e89e6a791172cebfef2

* Fix beacon_lc_bridge payload conversion and conf.listenAddress type

* Change nimbus_verified_proxy.asExecutionData param to SomeExecutionPayload

* Rerun nph to fix asExecutionData style format

* nimbus_verified_proxy listenAddress

* Use PooledTransaction in nimbus-eth1 tests

---------

Co-authored-by: jangko <jangko128@gmail.com>
2024-05-15 10:07:59 +07:00
Jordan Hrycaj
54f784bef1
Kvt remodel tx and forked descriptors (#2168)
* Aristo: Generalise alien/guest interface for piggiback on database

* Aristo: Code cosmetics

* CoreDb+Kvt: Update transaction API

why:
  Use single addressable function `forkTx(backLevel: int)` as used
  in `Aristo`. So `Kvt` can be synced simultaneously to `Aristo`.

also:
  Refactored `kvt_tx.nim` in a similar fashion to `Aristo`.

* Kvt: Replace `LayerDelta` object by reference

why:
  Will be needed when introducing filters

* Kvt: Remodel backend filter facility similar to `Aristo`

why:
  This allows to operate on several KVT instances simultaneously.

* CoreDb+Kvt: Fix on-disk storage

why:
  Overlooked name change: `stow()` => `persist()` for permanent storage

* Fix copyright headers
2024-05-07 19:59:27 +00:00