Commit Graph

83 Commits

Jacek Sieka 06a544ac85
Remove `forkTx` and friends (#2951)
The forking facility has been replaced by ForkedChain - frames and
layers are two other mechanisms that mostly do the same thing at the
aristo level, without quite providing the functionality FC needs - this
cleanup will make that integration easier.
2024-12-18 11:56:46 +01:00
Jacek Sieka 7b88bb3b30
Add branch cache (#2923)
Now that branches are small, we can add a branch cache that fits more
vertices in memory by storing only the branch portion (16 bytes) of the
VertexRef (136 bytes).

Where the original vertex cache hovers around a ~60% hit rate, this
branch cache instead reaches a >90% hit rate around block 20M, which
gives a nice boost to processing.

A downside of this approach is that a new VertexRef must be allocated
for every cache hit instead of reusing an existing instance - this
causes some GC overhead that needs to be addressed.

Nice 15% improvement nonetheless, can't complain!

```
blocks: 19630784, baseline: 161h18m38s, contender: 136h23m23s
Time (total): -24h55m14s, -15.45%
```
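
A hedged sketch of the idea, with illustrative names (`CompactBranch`,
`putBranch`, ...) standing in for the actual types: the cache holds only
the compact branch portion, so every hit pays for a fresh `VertexRef`
allocation.

```
import std/tables

type
  VertexId = uint64
  CompactBranch = object   # the ~16-byte "branch portion"
    startVid: VertexId     # first pre-allocated child id
    used: uint16           # bitmap of occupied sub-items
  VertexRef = ref object   # stand-in for the full ~136-byte vertex
    startVid: VertexId
    used: uint16

var branchCache: Table[VertexId, CompactBranch]

proc putBranch(vid: VertexId, v: VertexRef) =
  # store only the compact portion; the full vertex is not retained
  branchCache[vid] = CompactBranch(startVid: v.startVid, used: v.used)

proc getBranch(vid: VertexId): VertexRef =
  # a hit must allocate a new VertexRef - the GC overhead noted above
  branchCache.withValue(vid, c):
    result = VertexRef(startVid: c.startVid, used: c.used)
```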
2024-12-11 11:53:26 +01:00
Jacek Sieka f034af422a
Pre-allocate vids for branches (#2882)
Each branch node may have up to 16 sub-items - currently, these are
assigned a VertexID when they are first needed, leading to a
mostly-random vertexid order across the subitems.

Here, we pre-allocate all 16 vertex ids such that when a branch subitem
is filled, it already has a vertexid waiting for it. This brings several
important benefits:

* subitems are sorted and "close" in their id sequencing - this means
that when rocksdb stores them, they are likely to end up in the same
data block thus improving read efficiency
* because the ids are consecutive, we can store just the starting id and
a bitmap representing which subitems are in use (see the sketch below) -
this reduces disk space usage for branches, allowing more of them to fit
into a single disk read and further improving disk read and caching
performance - disk usage at block 18M is down from 84 to 78gb!
* the in-memory footprint of VertexRef is reduced, allowing more
instances to fit into caches and less memory to be used overall.

Because of the increased locality of reference, it turns out that we no
longer need to iterate over the entire database to efficiently generate
the hash key database because the normal computation is now faster -
this significantly benefits "live" chain processing as well where each
dirtied key must be accompanied by a read of all branch subitems next to
it - most of the performance benefit in this branch comes from this
locality-of-reference improvement.

On a sample resync, there's already a ~20% improvement, with later
blocks seeing increasing benefit (the trie is deeper in later blocks,
leading to more benefit from the branch read perf improvements)

```
blocks: 18729664, baseline: 190h43m49s, contender: 153h59m0s
Time (total): -36h44m48s, -19.27%
```
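
A minimal sketch of the startVid+bitmap layout referenced in the list
above (illustrative names, not the actual API):

```
import std/bitops

type
  VertexId = uint64
  Branch = object
    startVid: VertexId   # 16 consecutive ids reserved up front
    used: uint16         # bit n set => sub-item n is occupied

proc occupy(b: var Branch, nibble: range[0 .. 15]) =
  b.used.setBit(nibble)  # the id was already waiting; just mark it used

proc childVid(b: Branch, nibble: range[0 .. 15]): VertexId =
  # consecutive ids mean only the start id needs storing on disk
  doAssert b.used.testBit(nibble)
  b.startVid + VertexId(nibble)
```

On disk, `(startVid, used)` replaces 16 independently stored ids, which
is where the space saving comes from.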

Note: clients need to be resynced as the PR changes the on-disk format

R.I.P. little bloom filter - your life in the repo was short but
valuable
2024-12-04 11:42:04 +01:00
andri lim 7b2b59a976
Add missing fields to RPC object conversion (#2863)
* Add missing fields to RPC object conversion

* Fix populateBlockObject call

* Remove server_api_helpers.nim

* Add metric defined conditional compilation

* link with rocksdb
2024-11-22 17:07:53 +07:00
Jacek Sieka 01ca415721
Store keys together with node data (#2849)
Currently, computed hash keys are stored in a separate column family
with respect to the MPT data they're generated from - this has several
disadvantages:

* A lot of space is wasted because the lookup key (`RootedVertexID`) is
repeated in both tables - this is 30% of the `AriKey` content!
* rocksdb must maintain in-memory bloom filters and LRU caches for said
keys, doubling its "minimal efficient cache size"
* An extra disk traversal must be made to check for existence of cached
hash key
* Doubles the amount of files on disk due to each column family being
its own set of files

Here, the two CFs are joined such that both key and data are stored in
`AriVtx`. This means:

* we save ~30% disk space on repeated lookup keys
* we save ~2gb of memory overhead that can be used to cache data instead
of indices
* we can skip storing hash keys for MPT leaf nodes - these are trivial
to compute and waste a lot of space - previously they had to be present
in the `AriKey` CF to avoid having to look in two tables on the happy
path.
* There is a small increase in write amplification because when a hash
value is updated for a branch node, we must write both key and branch
data - previously we would write only the key
* There's a small shift in CPU usage - instead of performing lookups in
the database, hashes for leaf nodes are (re)-computed on the fly
* We can return to slightly smaller on-disk SST files since there are
fewer of them, which should reduce disk traffic a bit

Internally, there are also other advantages:

* when clearing keys, we no longer have to store a zero hash in memory -
instead, we deduce staleness of the cached key from the presence of an
updated VertexRef - this saves ~1gb of mem overhead during import
* hash key cache becomes dedicated to branch keys since leaf keys are no
longer stored in memory, reducing churn
* key computation is a lot faster thanks to the skipped second disk
traversal - a key computation for mainnet can be completed in 11 hours
instead of ~2 days (!) thanks to better cache usage and less read
amplification - with additional improvements to the on-disk format, we
can probably get rid of the initial full traversal method of seeding the
key cache on first start after import

All in all, this PR reduces the size of a mainnet database from 160gb to
110gb and the peak memory footprint during import by ~1-2gb.
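
A hedged sketch of the joined layout, with illustrative types rather
than the actual `AriVtx` schema: the cached hash key travels with the
vertex data, and leaf hashes are recomputed on the fly.

```
import std/[options, tables]

type
  RootedVertexID = tuple[root, vid: uint64]
  HashKey = array[32, byte]
  VtxRecord = object
    data: seq[byte]        # serialized MPT node
    key: Option[HashKey]   # cached hash; none for leaves and stale keys

var ariVtx: Table[RootedVertexID, VtxRecord]

proc lookupKey(rvid: RootedVertexID,
               leafHash: proc(data: seq[byte]): HashKey): Option[HashKey] =
  # a single traversal returns key and data together
  ariVtx.withValue(rvid, rec):
    if rec.key.isSome:
      return rec.key                 # branch: key stored with the data
    return some leafHash(rec.data)   # leaf: recomputed on the fly
```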
2024-11-20 09:56:27 +01:00
tersec 73661fd8a4
switch to Nim v2.0.12 (#2817)
* switch to Nim v2.0.12

* fix LruCache capitalization for styleCheck

* KzgProof/KzgCommitment for styleCheck

* TxEip4844 for styleCheck

* styleCheck issues in nimbus/beacon/payload_conv.nim

* ENode for styleCheck

* isOk for styleCheck

* some more styleCheck fixes

* more styleCheck fixes

---------

Co-authored-by: jangko <jangko128@gmail.com>
2024-11-01 19:06:26 +00:00
Jacek Sieka 188d689d9d
Speed up initial MPT root computation after import (#2788)
When `nimbus import` runs, we end up with a database without MPT roots
leading to long startup times the first time one is needed.

Computing the state root is slow because the on-disk order, based on
VertexID sorting, does not match the trie traversal order and therefore
makes lookups inefficient.

Here we introduce a helper that speeds up this computation by traversing
the trie in on-disk order and computing the trie hashes bottom up
instead - even though this leads to some redundant reads of nodes that
we cannot yet compute, it's still a net win as leaves and "bottom"
branches make up the majority of the database.
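
A hedged sketch of the bottom-up sweep, using illustrative signatures
(the real helper works against the rocksdb backend): each pass over the
on-disk order hashes every vertex whose children are already hashed,
accepting redundant reads of nodes that cannot be computed yet.

```
import std/tables

type
  VertexId = uint64
  HashKey = array[32, byte]

proc computeKeysBottomUp(
    diskOrder: seq[(VertexId, seq[VertexId], seq[byte])],
    hashNode: proc(data: seq[byte], childKeys: seq[HashKey]): HashKey
): Table[VertexId, HashKey] =
  var progress = true
  while progress:
    progress = false
    for (vid, children, data) in diskOrder:  # on-disk, not trie, order
      if vid in result:
        continue
      var childKeys: seq[HashKey]
      var ready = true
      for c in children:
        if c notin result:
          ready = false   # redundant read; retry on the next sweep
          break
        childKeys.add result[c]
      if ready:
        result[vid] = hashNode(data, childKeys)
        progress = true
```

Leaves hash on the first sweep, which is why their majority share of the
database makes this a net win.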

This PR also addresses a few other sources of inefficiency largely due
to the separation of AriKey and AriVtx into their own column families.

Each column family is its own LSM tree that produces hundreds of SST
files - with a limit of 512 open files, rocksdb must keep closing and
opening files, which leads to expensive metadata reads during random
access.

When rocksdb performs a lookup, it has to read several layers of files.
Ribbon filters let it skip files that don't have the requested data, but
when these filters are not in memory, reading them is slow - this
happens in two cases: when opening a file and when the filter has been
evicted from the LRU cache. Addressing the open file limit solves one
source of inefficiency, but we must also increase the block cache size
to deal with this problem.

* rocksdb.max_open_files increased to 2048
* per-file size limits increased so that fewer files are created
* WAL size increased to avoid partial flushes which lead to small files
* rocksdb block cache increased

All these increases of course lead to increased memory usage, but at
least performance is acceptable - in the future, we'll need to explore
options such as joining AriVtx and AriKey and/or reducing the row count
(by grouping branch layers under a single vertexid).

With this PR, the mainnet state root can be computed in ~8 hours (down
from 2-3 days) - not great, but still better.

Further, we now write all keys to the database, including those that are
less than 32 bytes - because the mpt path is part of the input, it is
very rare that we actually hit a key like this (about 200k such entries
on mainnet), so the code complexity of special-casing them is not worth
the benefit in the current database layout / design.
2024-10-27 11:08:37 +00:00
Jordan Hrycaj 5b6ccddaa0
Db folder sources and related remove compiler warnings (#2673)
* Aristo: Rename `Hash256` -> `Hash32`

* CoreDb: Rename `Hash256` -> `Hash32`

* Ledger: Rename `Hash256` -> `Hash32`

* StorageTypes: Rename `Hash256` -> `Hash32`

* Aristo: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* Kvt: Rename `Blob` -> `seq[byte]`

* CoreDb: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* Ledger: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* CoreDb: Rename `BlockHeader` -> `Header`, `BlockNonce` -> `Bytes8`

* Misc: Rename `StorageKey` -> `Bytes32`

* Tracer: `Hash256` -> `Hash32`, `BlockHeader` -> `Header`, etc.

* Fix copyright header
2024-10-01 21:03:10 +00:00
Jacek Sieka b4b4d16729
speed up key computation (#2642)
* batch database key writes during `computeKey` calls (sketched below)
* log progress when there are many keys to update
* avoid evicting the vertex cache when traversing the trie for key
computation purposes
* avoid storing trivial leaf hashes that directly can be loaded from the
vertex
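
A generic sketch of the batching idea from the first bullet - the
`write` callback is hypothetical, not the actual backend API:

```
type
  VertexId = uint64
  HashKey = array[32, byte]

proc writeBatched(updates: seq[(VertexId, HashKey)],
                  write: proc(kvs: seq[(VertexId, HashKey)]),
                  batchSize = 1024) =
  # hand the backend one write per batch instead of one per key
  var i = 0
  while i < updates.len:
    let j = min(i + batchSize, updates.len)
    write(updates[i ..< j])
    i = j
```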
2024-09-20 07:43:53 +02:00
Jacek Sieka 5c1e2e7d3b
Migrate `keyed_queue` to `minilru` (#2608)
Compared to `keyed_queue`, `minilru` uses significantly less memory, in
particular for the 32-byte hash keys where `kq` stores several copies of
the key redundantly.
2024-09-13 15:47:50 +02:00
Jacek Sieka d39c589ec3
lru cache updates (#2590)
* replace the rocksdb row cache with larger rdb lru caches - these serve
the same purpose but are more efficient because they skip serialization,
locking and rocksdb layering
* don't append fresh items to the cache (see the sketch after this
list) - appending has the effect of evicting existing items and
replacing them with low-value entries that might never be read - during
write-heavy periods of processing, the newly-added entries were evicted
during the store loop
* allow tuning rdb lru size at runtime
* add (hidden) option to print lru stats at exit (replacing the
compile-time flag)
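
A toy sketch of the don't-append write policy, using a stand-in cache
type (the real code uses the rdb lru caches):

```
import std/tables

type LruCache[K, V] = object   # stand-in; eviction logic elided
  data: Table[K, V]

proc contains[K, V](c: LruCache[K, V], k: K): bool = k in c.data
proc put[K, V](c: var LruCache[K, V], k: K, v: V) = c.data[k] = v

proc writeThrough[K, V](c: var LruCache[K, V], k: K, v: V) =
  # only refresh keys that earlier reads proved useful - fresh items
  # from a store loop would otherwise evict the hot entries
  if c.contains(k):
    c.put(k, v)
```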

pre:
```
INF 2024-09-03 15:07:01.136+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.675 tps=1870.397 mgps=176.819 avgBps=10.288
avgTps=1574.889 avgMGps=155.952 elapsed=19m26s458ms
```

post:
```
INF 2024-09-03 13:54:26.730+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.637 tps=1864.384 mgps=176.250 avgBps=11.202
avgTps=1714.920 avgMGps=169.818 elapsed=17m51s211ms
```

~9% import perf improvement at similar mem usage :)
2024-09-05 11:18:32 +02:00
Jordan Hrycaj 3c6400673d
Coredb fix kvt save only fringe condition (#2592)
* Cosmetics, spelling, etc.

* Aristo: make sure that a save cycle always commits even when empty

why:
  If `Kvt` is tied to the `Aristo` DB save cycle, then this save cycle
  must also be committed if there is no data to save for `Aristo`.

  Otherwise this will lead to excessive core memory use in some fringe
  conditions where Eth headers (or blocks) are downloaded while syncing
  and not really stored on disk.

* CoreDb: Correct persistent save mode

why:
  Saving `Kvt` first is seen as a harbinger (or canary) for `Aristo` as
  both run in sync. If `Kvt` succeeds in saving first, `Aristo` must
  succeed next. Anything else is a defect.
2024-09-04 13:48:38 +00:00
Jacek Sieka 35cc78c86d
add metrics for rdb lru cache (#2586)
This is a first step towards measuring the efficiency of the LRU caches
over time - metrics can be collected during import or when running
regularly.

Since `nim-metrics` carries some overhead for its default way of
reporting metrics, this PR implements a custom collector over atomic
counters, given that this is one of the hottest spots in the block
processing pipeline.

Using a compile-time flag, the same metrics can be printed on exit which
is useful when comparing different strategies for caching - here's a
recent run over blocks 16000001-1616384 - this is a good candidate to
expose in a better way in the future, maybe:

```
   state    vtype       miss        hit      total hitrate
 Account     Leaf    4909417    4466215    9375632  47.64%
 Account   Branch   20742574   72015123   92757697  77.64%
   World     Leaf     940483    1140946    2081429  54.82%
   World   Branch    8224151  131496580  139720731  94.11%
     all      all   34816625  209118864  243935489  85.73%
```
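
A minimal sketch of counting with atomics on the hot path - names are
illustrative, not the actual collector:

```
import std/atomics

type CacheStats = object
  hits, misses: Atomic[int64]

var stats: CacheStats   # one instance per (state, vtype) in the real table

proc recordHit() {.inline.} = stats.hits += 1     # lock-free hot path
proc recordMiss() {.inline.} = stats.misses += 1

proc hitRate(): float =
  let h = stats.hits.load(moRelaxed)
  let m = stats.misses.load(moRelaxed)
  if h + m == 0: 0.0
  else: h.float / float(h + m)
```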
2024-09-02 17:34:10 +02:00
Jacek Sieka ef1bab0802
avoid some trivial memory allocations (#2587)
* pre-allocate `blobify` data and remove redundant error handling
(cannot fail on correct data)
* use threadvar for temporary storage when decoding rdb, avoiding
closure env (sketched below)
* speed up database walkers by avoiding many temporaries

~5% perf improvement on block import, 100x on database iteration (useful
for building analysis tooling)
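
A sketch of the threadvar trick from the second bullet: one per-thread
scratch buffer replaces a fresh seq (and closure environment) per decode
call. Names are illustrative.

```
var decodeBuf {.threadvar.}: seq[byte]  # reused across calls on this thread

proc onData(data: openArray[byte]) =
  decodeBuf.setLen(data.len)   # grows once, then stays allocated
  if data.len > 0:
    copyMem(addr decodeBuf[0], unsafeAddr data[0], data.len)
  # ... parse decodeBuf here instead of allocating a temporary seq
```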
2024-09-02 16:03:10 +02:00
Jordan Hrycaj 7becf4e389
Remove vertex ID recycle function (#2558)
why:
  It is not safe in general to recycle vertex IDs while the `RocksDb`
  cache is keyed by `VertexID` rather than `RootedVertexID`, where the
  former type seems preferable.

  In some fringe cases one might remove a vertex with key `(root1,vid)`
  and insert another vertex with key `(root2,vid)` while re-using the
  vertex ID `vid`. Without knowledge of `root1` and `root2`, the LRU
  cache will return the same vertex for `(root2,vid)` also for
  `(root1,vid)`.
2024-08-12 20:56:15 +00:00
Jordan Hrycaj 5b502a06c4
Added portal proof nodes generation functionality (#2539)
* Extracted `test_tx.testTxMergeProofAndKvpList()` => separate file

* Fix serialiser

why:
  Typo led to duplicate rlp-encoded nodes in the chain

* Remove cruft

* Implement portal proof nodes generators `partXxxTwig()`

* Add unit test for portal proof nodes generator `partAccountTwig()`

* Cosmetics

* Simplify serialiser return code format

* Fix proof generator for extension nodes

why:
  Code was simply bonkers, not detected before the unit tests were
  adapted to check for just this.

* Implemented portal proof nodes verifier `partUntwig()`

* Cosmetics

* Fix `testutp` cli problem
2024-08-06 11:29:26 +00:00
andri lim 254bda365f
Remove txpool sender locality (#2525)
* Remove txpool sender locality

We no longer distinguish between local and remote senders

* Fix copyright year
2024-07-25 22:36:08 +07:00
Jordan Hrycaj 5ac362fe6f
Aristo and kvt balancer management update (#2504)
* Aristo: Merge `delta_siblings` module into `deltaPersistent()`

* Aristo: Add `isEmpty()` for canonical checking whether a layer is empty

* Aristo: Merge `LayerDeltaRef` into `LayerObj`

why:
  No need to maintain nested object refs anymore. Previously the
 `LayerDeltaRef` object had a companion `LayerFinalRef` which held
  non-delta layer information.

* Kvt: Merge `LayerDeltaRef` into `LayerRef`

why:
  No need to maintain nested object refs (as with `Aristo`)

* Kvt: Re-write balancer logic similar to `Aristo`

why:
  Although `Kvt` was a cheap copy of `Aristo`, it sort of got out of
  sync and the balancer code was wrong.

* Update iterator over forked peers

why:
  Yield an additional field `isLast` indicating that the last iteration
  cycle has been reached.

* Optimise balancer calculation.

why:
  One can often avoid providing a new object containing the merge of two
  layers for the balancer. This avoids copying tables, though in some
  cases it is replaced by `hasKey()` look-ups. One of the two layers is
  used as the combination target and the other is merged into it.

  Of course, this needs some checks for making sure that none of the
  components to merge is shared with something else.

* Fix copyright year
2024-07-18 21:32:32 +00:00
Jacek Sieka 3382c2427b
increase rdb cache sizes (#2466)
This trivial bump should improve performance a bit without costing too
much memory - as the trie grows, so does the number of levels in it and
creating hikes becomes ever more expensive - hopefully this cache
increase should give a nice little boost even if it's not a lot.
2024-07-09 17:35:27 +02:00
Jacek Sieka 81e75622cf
storage: store root id together with vid, for better locality of refe… (#2449)
The state and account MPTs currently share key space in the database,
and vertex ids are assigned essentially randomly, which means that when
two adjacent slot values from the same contract are accessed, they might
reside at a large distance from each other.

Here, we prefix each vertex id by its root causing them to be sorted
together thus bringing all data belonging to a particular contract
closer together - the same effect also happens for the main state MPT
whose nodes now end up clustered together more tightly.

In the future, the prefix given to the storage keys can also be used to
perform range operations such as reading all the storage at once and/or
deleting an account with a batch operation.

Notably, parts of the API already supported this rooting concept while
parts didn't - this PR makes the API consistent by always working with a
root+vid.
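
A hedged sketch of the rooted key layout (illustrative serialization -
the actual encoding may differ): putting the root first makes all
vertices under one root sort, and therefore cluster, together.

```
type RootedVertexID = tuple[root, vid: uint64]

proc toDbKey(rvid: RootedVertexID): array[16, byte] =
  # big-endian root, then vid: lexicographic key order groups by root,
  # which also enables prefix scans for future range operations
  for i in 0 ..< 8:
    result[i] = byte(rvid.root shr (56 - 8 * i))
    result[8 + i] = byte(rvid.vid shr (56 - 8 * i))
```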
2024-07-04 15:46:52 +02:00
Jacek Sieka b23795ab39
remove pPrf, fRpp (#2445)
No longer used now that hashify is gone
2024-07-03 22:21:57 +02:00
Jacek Sieka c364426422
Smaller in-database representations (#2436)
These representations use ~15-20% less data compared to the status quo,
mainly by removing redundant zeroes in the integer encodings - a
significant effect of this change is that the various rocksdb caches see
better efficiency since more items fit in the same amount of space.

* use RLP encoding for `VertexID` and `UInt256` wherever it appears
(zero-trimming sketched below)
* pack `VertexRef`/`PayloadRef` more tightly
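
A minimal sketch of the zero-trimming that drives the savings - an
RLP-style shortest-form integer, using an illustrative helper rather
than the actual serializer:

```
proc trimmed(v: uint64): seq[byte] =
  # emit only the significant bytes; 0 becomes the empty byte string
  var x = v
  while x > 0:
    result.insert(byte(x and 0xff), 0)
    x = x shr 8
```

For example, `trimmed(5)` takes a single byte where a fixed-width
encoding would spend eight.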
2024-07-02 20:25:06 +02:00
web3-developer ea94e8a351
Use RocksDb column family handles instead of name strings. (#2418)
* Bump RocksDb to latest and update Nimbus database to pass column family handles to RocksDb API.

* Bump RocksDb version.
2024-06-27 16:51:43 +08:00
Jacek Sieka 3e001e322c
Fix memory usage spikes during sync, give memory to rocksdb (#2413)
* creating a seq from a table that holds lots of changes means copying
all the data out of the table into the seq - this can be several GB of
data while syncing blocks
* nim fails to optimize the moving of the `WidthFirstForest` - the real
solution is to not construct a `wff` to begin with, but this PR provides
relief while that is being worked on

This spike fix allows us to bump the rocksdb cache by another 2 GB and
still have a significantly lower peak memory usage during sync.
2024-06-25 13:39:53 +02:00
Jacek Sieka 41cf81f80b
Fix dboptions init (#2391)
For the block cache to be shared between column families, the options
instance must be shared between the various column families being
created. This also ensures that there is only one source of truth for
configuration options instead of having two different sets depending on
how the tables were initialized.

This PR also removes the re-opening mechanism which can double startup
time - every time the database is opened, the log is replayed - a large
log file will take a long time to open.

Finally, several options got correctly implemented as column family
options, including one that puts a hash index in the SST files.
2024-06-19 10:55:57 +02:00
Jacek Sieka af34f90fe4
fix `max_total_wal_size` which should be set on the DB (#2363) 2024-06-16 02:11:30 +00:00
Jordan Hrycaj debba5a620
Coeredb related clean up and maint fixes (#2360)
* Fix initialiser

why:
  Possible crash (app profiling, tracer etc.)

* Update column family options processing

why:
  Same for kvt as for aristo

* Move `AristoDbDualRocks` backend type to the test suite

why:
  So it is not available for production

* Fix typos in API jump table

why:
  Used for tracing and app profiling only. Needed some update

* Purged CoreDb legacy API

why:
  Not needed anymore, was transitionary and disabled.

* Rename `flush` argument to `eradicate` in a DB close context

why:
  The word `eradicate` leaves no doubt what is meant

* Rename `stoFlush()` -> `stoDelete()`

* Rename `core_apps_newapi` -> `core_apps` (not so new anymore)
2024-06-14 11:19:48 +00:00
Jordan Hrycaj 5a5cc6295e
Triggered write event for kvt (#2351)
* bump rocksdb

* Rename `KVT` objects related to filters according to `Aristo` naming

details:
  filter* => delta*
  roFilter => balancer

* Compulsory error handling if `persistent()` fails

* Add return code to `reCentre()`

why:
  Might eventually fail if re-centring is blocked. Some logic will be
  added in subsequent patch sets.

* Add column families from earlier session to rocksdb in opening procedure

why:
  All previously used CFs must be declared when re-opening an existing
  database.

* Update `init()` and add rocksdb `reinit()` methods for changing parameters

why:
  Opening a set of column families (with different open options) must
  span at least the ones that are already on disk.

* Provide write-trigger-event interface into `Aristo` backend

why:
  This allows data from a guest application (think `KVT`) to be saved
  in sync with the write cycle, so that the guest and `Aristo` save
  atomically.

* Use `KVT` with new column family interface from `Aristo`

* Remove obsolete guest interface

* Implement `KVT` piggyback on `Aristo` backend

* CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging

* Remove `rocks_db` import from `persist()` function

why:
  Some systems (in particular `fluffy` and friends) use the `Aristo`
  memory backend emulation and do not link against rocksdb when building
  the application. So this should fix that problem.
2024-06-13 18:15:11 +00:00
Jacek Sieka 54f793f946
Apply some basic rocksdb options (#2339)
These options, inspired by Nethermind and general internet wisdom, bring
the database size down to 2/3 without affecting throughput. In theory,
they should also bring down memory usage and/or make more efficient use
of whatever memory is already assigned to rocksdb but this needs
verification in a longer test at synced-mainnet sizes.

In the meantime, they make testing easier by removing some noise that
the profiler flags as bad, such as excessive SkipList access (countered
by bloom filters).
2024-06-12 14:52:27 +02:00
Jacek Sieka eb041abba7
avoid unnecessary memory allocations and lookups (#2334)
* use `withValue` instead of `hasKey` + `[]` (example below)
* avoid `@` et al
* parse database data inside `onData` instead of making seq then parsing
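
The `withValue` pattern from the first bullet, shown against
`std/tables` (a generic example, not the actual call site):

```
import std/tables

var t = {"key": 1}.toTable

# before: two hash lookups
if t.hasKey("key"):
  echo t["key"]

# after: a single lookup; v is a pointer to the stored value
t.withValue("key", v):
  echo v[]
do:
  discard   # key not present
```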
2024-06-11 11:38:58 +02:00
Jordan Hrycaj a347291413
Aristo use rocksdb cf instead of key pfx (#2332)
* Use RocksDb column families instead of a prefixed single column

why:
  Better performance

* Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb

why:
  Avoids repeated de/serialisation
2024-06-10 12:04:22 +00:00
tersec fd03038cab
Replace some usage of std/options with results Opt (#2323)
* Replace some usage of std/options with results Opt (sketch below)

* more updates
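
A small sketch of the target style, assuming nim-results' `Opt` (a
`Result` with a `void` error type):

```
import results

proc findIndex(xs: seq[int], v: int): Opt[int] =
  for i, x in xs:
    if x == v:
      return Opt.some(i)
  Opt.none(int)

let idx = findIndex(@[3, 1, 4], 4)
if idx.isSome:
  echo idx.get()
```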
2024-06-07 23:39:58 +02:00
Jacek Sieka c5b3081828
eth: bump (#2308)
* eth: bump

Speed up basic operations like hashing and creating RLP:s - up to 25%
improvement in certain block ranges!

```
876729c.csv /data/nimbus_stats/stats-20240605_2204-ed4f6221.csv
stats-20240605_2000-c876729c.csv vs stats-20240605_2204-ed4f6221.csv
                       bps_x   bps_y     tps_x        tps_y    bpsd    tpsd    timed
block_number
(500001, 888889]    1,017.72  996.07  1,784.96  1742.438676  -2.72%  -2.72%    3.31%
(888889, 1277778]     528.00  536.30  2,159.79  2198.781046   1.69%   1.69%   -1.44%
(1277778, 1666667]    324.29  317.78  2,064.48  2008.106377  -2.82%  -2.82%    3.33%
(1666667, 2055556]    253.87  258.74  1,840.94  1872.935273   1.67%   1.67%   -1.39%
(2055556, 2444445]    175.79  178.66  1,340.61  1363.248939   0.93%   0.93%   -0.74%
(2444445, 2833334]    137.27  159.74    958.75  1113.323757  14.24%  14.24%  -10.69%
(2833334, 3222223]    170.48  228.63  1,272.70  1704.047195  34.41%  34.41%  -25.17%
(3222223, 3611112]    127.49  125.48  1,572.39  1548.835791  -1.19%  -1.19%    1.47%
(3611112, 4000001]     37.25   40.42  1,100.65  1184.740493   9.58%   9.58%   -7.04%

blocks: 3501696, baseline: 11h59m40s, contender: 11h21m38s
bpsd (mean): 6.18%
tpsd (mean): 6.18%
Time (sum): -38m1s, -4.26%

bpsd = blocks per sec diff (+), tpsd = txs per sec diff, timed = time to process diff (-)
+ = more is better, - = less is better
```

* ignore gitignore
2024-06-06 23:39:09 +00:00
Jordan Hrycaj 8985535ab2
Core db+aristo updates n fixes (#2298)
* Fix `blobify()` for `SavedState` object

why:
  Have to treat varying sizes for `HashKey`, i.p. for an empty key which
  has zero size.

* Store correct block number in `SavedState` record

why:
  Stored `block-number - 1` for some obscure reason.

* Cosmetics, docu
2024-06-05 18:17:50 +00:00
Jacek Sieka c876729c4d
Add some basic rocksdb options to command line (#2286)
These options are there mainly to drive experiments, and are therefore
hidden.

One thing that this PR brings in is an initial set of caches and buffers for rocksdb - the set that I've been using during various performance tests to get to a viable baseline performance level.
2024-06-05 17:08:29 +02:00
Jacek Sieka 95a4adc1e8
use statically linked rocksdb on linux/mac, dll on windows (#2291)
The `rocksdb` version shipped with distributions is typically old and
therefore often lacks features we use - it also doesn't match the one
assumed by nim-rocksdb leading to ABI mismatch risks.

Instead of depending on the system rocksdb, we'll now use the rocksdb
version assumed by nim-rocksdb and locked in its vendor folder by always
building it together with nimbus.

This avoids the problem of unknown rocksdb versions at a (small) cost to
build time.

CI caching and full windows support for building from source [remains
TODO](https://github.com/status-im/nim-rocksdb/issues/44).
2024-06-04 18:15:33 +02:00
Jordan Hrycaj 69a158864c
Remove vid recycling feature (#2294) 2024-06-04 15:05:13 +00:00
Jordan Hrycaj f926222fec
Aristo cull journal related stuff (#2288)
* Remove all journal related stuff

* Refactor function names journal*() => delta*(), filter*() => delta*()

* remove `trg` fileld from `FilterRef`

why:
  Same as `kMap[$1]`

* Re-type FilterRef.src as `HashKey`

why:
  So it is directly comparable to `kMap[$1]`

* Moved `vGen[]` field from `LayerFinalRef` to `LayerDeltaRef`

why:
  Then a separate `FilterRef` type is not needed, anymore

* Rename `roFilter` field in `AristoDbRef` => `balancer`

why:
  New name more appropriate.

* Replace `FilterRef` by `LayerDeltaRef` type

why:
  This allows to avoid copying into the `balancer` (see next patch set)
  most of the time. Typically, only one instance is running on the backend
  and the `balancer` is only used as a stage before saving data.

* Refactor way how to store data persistently

why:
  Avoid useless copy when staging `top` layer for persistently saving to
  backend.

* Fix copyright header?
2024-06-03 20:10:35 +00:00
Jacek Sieka 9f879406f3
append instead of reallocate in blobify (#2277)
...otherwise, we get lots and lots of temporary allocations of seq's
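
A sketch of the append pattern with a hypothetical `blobifyTo` helper:
serializers take a destination buffer and append to it, instead of each
returning a fresh seq that then gets concatenated.

```
proc blobifyTo(v: uint64, dst: var seq[byte]) =
  # append big-endian bytes in place - no temporary seq per value
  for i in countdown(7, 0):
    dst.add byte(v shr (8 * i))

var buf: seq[byte]
blobifyTo(0xdeadbeef'u64, buf)   # many values share one growing buffer
blobifyTo(42'u64, buf)
```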
2024-06-01 17:13:24 +02:00
Jordan Hrycaj bda760f41d
Run coredb without journal (#2266)
* Add persistent last state stamp feature

why:
  This allows to run `CoreDb` without journal

* Start `CoreDb` without journal

* Remove journal related functions from `CoredDb`
2024-05-31 17:32:22 +00:00
Jordan Hrycaj 0f430c70fd
Aristo avoid storage trie update race conditions (#2251)
* Update TDD suite logger output format choices

why:
  New format is not practical for TDD as it just dumps data across a wide
  range (considerably larger than 80 columns.)

  So the new format can be turned on by function argument.

* Update unit tests samples configuration

why:
  Slightly changed the way to find the `era1` directory

* Remove compiler warnings (fix deprecated expressions and phrases)

* Update `Aristo` debugging tools

* Always update the `storageID` field of account leaf vertices

why:
  Storage tries are weakly linked to an account leaf object in that
  the `storageID` field is updated by the application.

  Previously, `Aristo` verified that leaf objects make sense when passed
  to the database. As a consequence
  * the database was inconsistent for a short while
  * the burden for correctness was all on the application which led
    to delayed error handling which is hard to debug.

  So `Aristo` will internally update the account leaf objects so that
  there are no race conditions due to the storage trie handling

* Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)`

why:
  The journal and filter logic depends on the hash of the `VertexID(1)`
  which is commonly known as the state root. This implies that all
  changes to the database are somehow related to that.

* Make sure that a `Ledger` account does not overwrite the storage trie reference

why:
  Due to the abstraction of a sub-trie (now referred to as a column with
  a hash describing its state) there was a weakness in the `Aristo`
  handler where an account leaf could be overwritten, thereby changing
  the validity of the database. This has been changed and the database
  will now reject such changes.

  This patch fixes the behaviour on the application layer. In particular,
  the column handle returned by the `CoreDb` needs to be updated by
  the `Aristo` database state. This mitigates the problem that a storage
  trie might have vanished or re-appeared with a different vertex ID.

* Fix sub-trie deletion test

why:
  Was originally hinged on `VertexID(1)` which cannot be wholesale
  deleted anymore after the last Aristo update. Also, running with
  `VertexID(2)` needs an artificial `VertexID(1)` for making `stow()`
  or `persist()` work.

* Cosmetics

* Activate `test_generalstate_json`

* Temporarily `deactivate test_tracer_json`

* Fix copyright header

---------

Co-authored-by: jordan <jordan@dry.pudding>
Co-authored-by: Jacek Sieka <jacek@status.im>
2024-05-30 17:48:38 +00:00
Jacek Sieka 0a49833d69
avoid a few more copies (#2215) 2024-05-24 11:27:17 +02:00
Jacek Sieka f38c5e631e
trivial memory-based speedups (#2205)
* trivial memory-based speedups

* HashKey becomes non-ref
* use openArray instead of seq in lots of places (example below)
* avoid sequtils.reversed when unnecessary
* add basic perf stats to test_coredb

* copyright
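
An example of the openArray change from the list above: callers can pass
a seq, an array or a slice without conversion or copying.

```
proc checksum(data: openArray[byte]): uint32 =
  for b in data:
    result = result * 31 + uint32(b)

let s = @[byte 1, 2, 3, 4]
var a = [byte 9, 8]
echo checksum(s)                    # seq
echo checksum(a)                    # array
echo checksum(s.toOpenArray(1, 2))  # slice, no copy
```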
2024-05-23 17:37:51 +02:00
Jordan Hrycaj 54f784bef1
Kvt remodel tx and forked descriptors (#2168)
* Aristo: Generalise alien/guest interface for piggyback on database

* Aristo: Code cosmetics

* CoreDb+Kvt: Update transaction API

why:
  Use single addressable function `forkTx(backLevel: int)` as used
  in `Aristo`. So `Kvt` can be synced simultaneously to `Aristo`.

also:
  Refactored `kvt_tx.nim` in a similar fashion to `Aristo`.

* Kvt: Replace `LayerDelta` object by reference

why:
  Will be needed when introducing filters

* Kvt: Remodel backend filter facility similar to `Aristo`

why:
  This allows to operate on several KVT instances simultaneously.

* CoreDb+Kvt: Fix on-disk storage

why:
  Overlooked name change: `stow()` => `persist()` for permanent storage

* Fix copyright headers
2024-05-07 19:59:27 +00:00
Jordan Hrycaj b9187e0493
Aristo selective read caching for rocksdb backend (#2145)
* Aristo+Kvt: Better RocksDB profiling

why:
  Providing more detailed information, mainly for `Aristo`

* Aristo: Renamed journal `stats()` to `capacity()`

why:
  `Stats()` was a misnomer

* Aristo: Provide backend read caches for key and vertex IDs

why:
  Dedicated LRU caching for particular types gives a throughput advantage.
  The sizes of the LRU queues used for caching are currently constant
  but might be adjusted at a later time.

* Fix copyright year
2024-04-22 19:02:22 +00:00
Jordan Hrycaj 7d9e1d8607
Misc updates for full sync (#2140)
* Code cosmetics

* Aristo+Kvt: Fix api wrappers

why:
  Api setup killed the backend descriptor when backend mapping was
  disabled.

* Aristo: Implement masked profiling entries

why:
  Database backend should be listed but not counted in tally

* CoreDb: Simplify backend() methods

why:
  DBMS backend access was provided very early and over-engineered. Now
  there are only two backend machines, one for `Kvt` and the other one
  for an `Mpt`, available only via the new API.

* CoreDb: Code cleanup regarding descriptor types

* CoreDb: Refactor/redefine `persistent()` methods

why:
  There were `persistent()` methods for any type of caching storage
  facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single
  `persistent()` method storing all facilities in tandem (similar to
  how transactions work.)

  For non-shared `Kvt` tables, there is now an extra storage method
  `saveOffSite()`.

* CoreDb lingo update: `trie` becomes `column`

why:
  Notion of a `trie` is pretty much hidden by the new `CoreDb` api.
  Revealed are sort-of database columns for accounts and storage data,
  each of which has an internal state represented by a Keccak hash.
  So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a
  column state.

* Aristo: rename backend field `filters` => `journal`

* Update full sync logging

details:
  + Disable eth handler noise while syncing
  + Log journal depth (if available)

* Fix copyright year

* Fix cruft and unwanted imports
2024-04-19 18:37:27 +00:00
Jordan Hrycaj e8eb3268f5
Generalise prune mode option 4 different db models (#2139)
* Update README

* Nimbus-main: replaced `PruneMode` options by `ChainDbMode` options

details:
  For the legacy database, this changes the phrase
  - `conf.pruneMode == PruneMode.Full` to the expression
  + `conf.chainDbMode == ChainDbMode.Prune`.

* Fix issues moaned about by NIM compiler

* Fix copyright year
2024-04-17 18:09:55 +00:00
Jordan Hrycaj d6a4205324
Aristo update rocksdb backend drivers (#2135)
* Aristo+RocksDB: Update backend drivers

why:
  The RocksDB update allows using some of the newly provided methods
  which previously had to be implemented via the raw C backend (for the
  lack of Nim methods.)

* Aristo+RocksDB: Simplify drivers wrapper

* Kvt: Update backend drivers and wrappers similar to `Aristo`

* Aristo+Kvt: Use column families for RocksDB

* Aristo+MemoryDB: Code cosmetics

* Aristo: Provide guest column family for export

why:
  So `Kvt` can piggyback on `Aristo`, thereby avoiding the need to run
  a second DBMS system in parallel.

* Kvt: Provide import mechanism for RocksDB guest column family

why:
  So `Kvt` can piggyback on `Aristo`, thereby avoiding the need to run
  a second DBMS system in parallel.

* CoreDb+Aristo: Run persistent `Kvt` DB piggybacked on `Aristo`

why:
  Avoiding to run two DBMS systems in parallel.

* Fix copyright year

* Ditto
2024-04-16 20:39:11 +00:00
Jordan Hrycaj 5379302ce9
Aristo+Kvt: Let destructor crash when `nil` argument is given (#2080)
why:
  Ignoring `nil` objects was handy for a while but eventually led to
  lazy programming which in turn led to double destructor calls for
  the rocks-db.
2024-03-15 14:20:00 +00:00
Jordan Hrycaj 0d73637f14
Core db simplify new api storage modes (#2075)
* Aristo+Kvt: Fix backend `dup()` function in api setup

why:
  Backend object is subject to an inheritance cascade which was not
  taken care of before. Only the base object was duplicated.

* Kvt: Simplify DB clone/peers management

* Aristo: Simplify DB clone/peers management

* Aristo: Adjust unit test for working with memory DB only

why:
  This currently causes some memory corruption, presumably in the
  `libc` background layer.

* CoreDb+Kvt: Simplify API for KVT

why:
  Simplified storage models (was over engineered) for better performance
  and code maintenance.

* CoreDb+Aristo: Simplify API for `Aristo`

why:
  Only single database state needed here. Accessing a similar state will
  be implemented from outside this module using a context layer. This
  gives better performance and improves code maintenance.

* Fix Copyright headers

* CoreDb: Turn off API tracking

why:
  CI would not go through. Was accidentally turned on.
2024-03-14 22:17:43 +00:00