14 Commits

Author SHA1 Message Date
Jacek Sieka
2961905a95
aristo: fork support via layers/txframes (#2960)
* aristo: fork support via layers/txframes

This change reorganises how the database is accessed: instead holding a
"current frame" in the database object, a dag of frames is created based
on the "base frame" held in `AristoDbRef` and all database access
happens through this frame, which can be thought of as a consistent
point-in-time snapshot of the database based on a particular fork of the
chain.

In the code, "frame", "transaction" and "layer" is used to denote more
or less the same thing: a dag of stacked changes backed by the on-disk
database.

Although this is not a requirement, in practice each frame holds the
change set of a single block - as such, the frame and its ancestors
leading up to the on-disk state represents the state of the database
after that block has been applied.

"committing" means merging the changes to its parent frame so that the
difference between them is lost and only the cumulative changes remain -
this facility enables frames to be combined arbitrarily wherever they
are in the dag.

In particular, it becomes possible to consolidate a set of changes near
the base of the dag and commit those to disk without having to re-do the
in-memory frames built on top of them - this is useful for "flattening"
a set of changes during a base update and sending those to storage
without having to perform a block replay on top.

Looking at abstractions, a side effect of this change is that the KVT
and Aristo are brought closer together by considering them to be part of
the "same" atomic transaction set - the way the code gets organised,
applying a block and saving it to the kvt happens in the same "logical"
frame - therefore, discarding the frame discards both the aristo and kvt
changes at the same time - likewise, they are persisted to disk together
- this makes reasoning about the database somewhat easier but has the
downside of increased memory usage, something that perhaps will need
addressing in the future.

Because the code reasons more strictly about frames and the state of the
persisted database, it also makes it more visible where ForkedChain
should be used and where it is still missing - in particular, frames
represent a single branch of history while forkedchain manages multiple
parallel forks - user-facing services such as the RPC should use the
latter, ie until it has been finalized, a getBlock request should
consider all forks and not just the blocks in the canonical head branch.

Another advantage of this approach is that `AristoDbRef` conceptually
becomes more simple - removing its tracking of the "current" transaction
stack simplifies reasoning about what can go wrong since this state now
has to be passed around in the form of `AristoTxRef` - as such, many of
the tests and facilities in the code that were dealing with "stack
inconsistency" are now structurally prevented from happening. The test
suite will need significant refactoring after this change.

Once this change has been merged, there are several follow-ups to do:

* there's no mechanism for keeping frames up to date as they get
committed or rolled back - TODO
* naming is confused - many names for the same thing for legacy reason
* forkedchain support is still missing in lots of code
* clean up redundant logic based on previous designs - in particular the
debug and introspection code no longer makes sense
* the way change sets are stored will probably need revisiting - because
it's a stack of changes where each frame must be interrogated to find an
on-disk value, with a base distance of 128 we'll at minimum have to
perform 128 frame lookups for *every* database interaction - regardless,
the "dag-like" nature will stay
* dispose and commit are poorly defined and perhaps redundant - in
theory, one could simply let the GC collect abandoned frames etc, though
it's likely an explicit mechanism will remain useful, so they stay for
now

More about the changes:

* `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less"
corresponds to the old `balancer` field
* `AristoDbRef.stack` is gone - instead, there's a chain of
`AristoTxRef` objects that hold their respective "layer" which has the
actual changes
* No more reasoning about "top" and "stack" - instead, each
`AristoTxRef` can be a "head" that "more or less" corresponds to the old
single-history `top` notion and its stack
* `level` still represents "distance to base" - it's computed from the
parent chain instead of being stored
* one has to be careful not to use frames where forkedchain was intended
- layers are only for a single branch of history!

* fix layer vtop after rollback

* engine fix

* Fix test_txpool

* Fix test_rpc

* Fix copyright year

* fix simulator

* Fix copyright year

* Fix copyright year

* Fix tracer

* Fix infinite recursion bug

* Remove aristo and kvt empty files

* Fic copyright year

* Fix fc chain_kvt

* ForkedChain refactoring

* Fix merge master conflict

* Fix copyright year

* Reparent txFrame

* Fix test

* Fix txFrame reparent again

* Cleanup and fix test

* UpdateBase bugfix and fix test

* Fixe newPayload bug discovered by hive

* Fix engine api fcu

* Clean up call template, chain_kvt, andn txguid

* Fix copyright year

* work around base block loading issue

* Add test

* Fix updateHead bug

* Fix updateBase bug

* Change func commitBase to proc commitBase

* Touch up and fix debug mode crash

---------

Co-authored-by: jangko <jangko128@gmail.com>
2025-02-06 14:04:50 +07:00
Jacek Sieka
036dd23e9b
cleanups, import fixes (#2964)
* more generic-path removal
* tighter imports
2024-12-20 12:57:15 +01:00
Jacek Sieka
06a544ac85
Remove forkTx and friends (#2951)
The forking facility has been replaced by ForkedChain - frames and
layers are two other mechanisms that mostly do the same thing at the
aristo level, without quite providing the functionality FC needs - this
cleanup will make that integration easier.
2024-12-18 11:56:46 +01:00
Jordan Hrycaj
5b6ccddaa0
Db folder sources and related remove compiler warnings (#2673)
* Aristo: Rename `Hash256` -> `Hash32`

* CoreDb: Rename `Hash256` -> `Hash32`

* Ledger: Rename `Hash256` -> `Hash32`

* StorageTypes: Rename `Hash256` -> `Hash32`

* Aristo: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* Kvt: Rename `Blob` -> `seq[byte]`

* CoreDb: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* Ledger: Rename `Blob` -> `seq[byte]`, `keccakHash` -> `keccak256`

* CoreDb: Rename `BlockHeader` -> `Header`, `BlockNonce` -> `Bytes8`

* Misc: Rename `StorageKey` -> `Bytes32`

* Tracer: `Hash256` -> `Hash32`, `BlockHeader` -> `Header`, etc.

* Fix copyright header
2024-10-01 21:03:10 +00:00
Jordan Hrycaj
5ac362fe6f
Aristo and kvt balancer management update (#2504)
* Aristo: Merge `delta_siblings` module into `deltaPersistent()`

* Aristo: Add `isEmpty()` for canonical checking whether a layer is empty

* Aristo: Merge `LayerDeltaRef` into `LayerObj`

why:
  No need to maintain nested object refs anymore. Previously the
 `LayerDeltaRef` object had a companion `LayerFinalRef` which held
  non-delta layer information.

* Kvt: Merge `LayerDeltaRef` into `LayerRef`

why:
  No need to maintain nested object refs (as with `Aristo`)

* Kvt: Re-write balancer logic similar to `Aristo`

why:
  Although `Kvt` was a cheap copy of `Aristo` it sort of got out of
  sync and the balancer code was wrong.

* Update iterator over forked peers

why:
  Yield additional field `isLast` indicating that the last iteration
  cycle was approached.

* Optimise balancer calculation.

why:
  One can often avoid providing a new object containing the merge of two
  layers for the balancer. This avoids copying tables. In some cases this
  is replaced by `hasKey()` look ups though. One uses one of the two
  to combine and merges the other into the first.

  Of course, this needs some checks for making sure that none of the
  components to merge is eventually shared with something else.

* Fix copyright year
2024-07-18 21:32:32 +00:00
Jordan Hrycaj
5a5cc6295e
Triggered write event for kvt (#2351)
* bump rockdb

* Rename `KVT` objects related to filters according to `Aristo` naming

details:
  filter* => delta*
  roFilter => balancer

* Compulsory error handling if `persistent()` fails

* Add return code to `reCentre()`

why:
  Might eventually fail if re-centring is blocked. Some logic will be
  added in subsequent patch sets.

* Add column families from earlier session to rocksdb in opening procedure

why:
  All previously used CFs must be declared when re-opening an existing
  database.

* Update `init()` and add rocksdb `reinit()` methods for changing parameters

why:
  Opening a set column families (with different open options) must span
  at least the ones that are already on disk.

* Provide write-trigger-event interface into `Aristo` backend

why:
  This allows to save data from a guest application (think `KVT`) to
  get synced with the write cycle so the guest and `Aristo` save all
  atomically.

* Use `KVT` with new column family interface from `Aristo`

* Remove obsolete guest interface

* Implement `KVT` piggyback on `Aristo` backend

* CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging

* Remove `rocks_db` import from `persist()` function

why:
  Some systems (i.p `fluffy` and friends) use the `Aristo` memory
  backend emulation and do not link against rocksdb when building the
  application. So this should fix that problem.
2024-06-13 18:15:11 +00:00
Jacek Sieka
f38c5e631e
trivial memory-based speedups (#2205)
* trivial memory-based speedups

* HashKey becomes non-ref
* use openArray instead of seq in lots of places
* avoid sequtils.reversed when unnecessary
* add basic perf stats to test_coredb

* copyright
2024-05-23 17:37:51 +02:00
Jordan Hrycaj
54f784bef1
Kvt remodel tx and forked descriptors (#2168)
* Aristo: Generalise alien/guest interface for piggiback on database

* Aristo: Code cosmetics

* CoreDb+Kvt: Update transaction API

why:
  Use single addressable function `forkTx(backLevel: int)` as used
  in `Aristo`. So `Kvt` can be synced simultaneously to `Aristo`.

also:
  Refactored `kvt_tx.nim` in a similar fashion to `Aristo`.

* Kvt: Replace `LayerDelta` object by reference

why:
  Will be needed when introducing filters

* Kvt: Remodel backend filter facility similar to `Aristo`

why:
  This allows to operate on several KVT instances simultaneously.

* CoreDb+Kvt: Fix on-disk storage

why:
  Overlooked name change: `stow()` => `persist()` for permanent storage

* Fix copyright headers
2024-05-07 19:59:27 +00:00
Jordan Hrycaj
0d73637f14
Core db simplify new api storage modes (#2075)
* Aristo+Kvt: Fix backend `dup()` function in api setup

why:
  Backend object is subject to an inheritance cascade which was not
  taken care of, before. Only the base object was duplicated.

* Kvt: Simplify DB clone/peers management

* Aristo: Simplify DB clone/peers management

* Aristo: Adjust unit test for working with memory DB only

why:
  This currently causes some memory corruption persumably in the
  `libc` background layer.

* CoredDb+Kvt: Simplify API for KVT

why:
  Simplified storage models (was over engineered) for better performance
  and code maintenance.

* CoredDb+Aristo: Simplify API for `Aristo`

why:
  Only single database state needed here. Accessing a similar state will
  be implemented from outside this module using a context layer. This
  gives better performance and improves code maintenance.

* Fix Copyright headers

* CoreDb: Turn off API tracking

why:
  CI would ot go through. Was accidentally turned on.
2024-03-14 22:17:43 +00:00
Jordan Hrycaj
ffa8ad2246
Core db use differential tx layers for aristo and kvt (#1949)
* Fix kvt headers

* Provide differential layers for KVT transaction stack

why:
  Significant performance improvement

* Provide abstraction layer for database top cache layer

why:
  This will eventually implemented as a differential database layers
  or transaction layers. The latter is needed to improve performance.

behavioural changes:
  Zero vertex and keys (i.e. delete requests) are not optimised out
  until the last layer is written to the database.

* Provide differential layers for Aristo transaction stack

why:
  Significant performance improvement
2023-12-19 12:39:23 +00:00
Jordan Hrycaj
c47f021596
Core db and aristo updates for destructor and tx logic (#1894)
* Disable `TransactionID` related functions from `state_db.nim`

why:
  Functions `getCommittedStorage()` and `updateOriginalRoot()` from
  the `state_db` module are nowhere used. The emulation of a legacy
  `TransactionID` type functionality is administratively expensive to
  provide by `Aristo` (the legacy DB version is only partially
  implemented, anyway).

  As there is no other place where `TransactionID`s are used, they will
  not be provided by the `Aristo` variant of the `CoreDb`. For the
  legacy DB API, nothing will change.

* Fix copyright headers in source code

* Get rid of compiler warning

* Update Aristo code, remove unused `merge()` variant, export `hashify()`

why:
  Adapt to upcoming `CoreDb` wrapper

* Remove synced tx feature from `Aristo`

why:
+ This feature allowed to synchronise transaction methods like begin,
  commit, and rollback for a group of descriptors.
+ The feature is over engineered and not needed for `CoreDb`, neither
  is it complete (some convergence features missing.)

* Add debugging helpers to `Kvt`

also:
  Update database iterator, add count variable yield argument similar
  to `Aristo`.

* Provide optional destructors for `CoreDb` API

why;
  For the upcoming Aristo wrapper, this allows to control when certain
  smart destruction and update can take place. The auto destructor works
  fine in general when the storage/cache strategy is known and acceptable
  when creating descriptors.

* Add update option for `CoreDb` API function `hash()`

why;
  The hash function is typically used to get the state root of the MPT.
  Due to lazy hashing, this might be not available on the `Aristo` DB.
  So the `update` function asks for re-hashing the gurrent state changes
  if needed.

* Update API tracking log mode: `info` => `debug

* Use shared `Kvt` descriptor in new Ledger API

why:
  No need to create a new descriptor all the time
2023-11-16 19:35:03 +00:00
Jordan Hrycaj
6d132811ba
Core db update providing additional results code interface (#1776)
* Split `core_db/base.nim` into several sources

* Rename `core_db/legacy.nim` => `core_db/legacy_db.nim`

* Update `CoreDb` API, dual methods returning `Result[]` or plain value

detail:
  Plain value methods implemet the legacy API, they defect on error results

* Redesign `CoreDB` direct backend access

why:
  Made the `backend` directive integral part of the API

* Discontinue providing unused or otherwise available functions

details:
+ setTransactionID() removed, not used and not easily replicable in Aristo
+ maybeGet() removed, available via direct backend access
+ newPhk() removed, never used & was experimental anyway

* Update/reorg backend API

why:
+ Added error print function `$$()`
+ General descriptor completion (and optional validation) via `bless()`

* Update `Aristo`/`Kvt` exception handling

why:
  Avoid `CatchableError` exceptions, rather pass them as error code where
  appropriate.

* More `CoreDB` compliant `Aristo` and `Kvt` methods

details:
+ Providing functions like `contains()`, `getVtxRc()` (returns `Result[]`).
+ Additional error code: `NotImplemented`

* Rewrite/reorg of Aristo DB constructor

why:
  Previously used global object `DefaultQidLayoutRef` as default
  initialiser. This object was created at compile time which lead to
  non-gc safe functions.

* Update nimbus/db/core_db/legacy_db.nim

Co-authored-by: Kim De Mey <kim.demey@gmail.com>

* Update nimbus/db/aristo/aristo_transcode.nim

Co-authored-by: Kim De Mey <kim.demey@gmail.com>

* Update nimbus/db/core_db/legacy_db.nim

Co-authored-by: Kim De Mey <kim.demey@gmail.com>

---------

Co-authored-by: Kim De Mey <kim.demey@gmail.com>
2023-09-26 10:21:13 +01:00
Jordan Hrycaj
6bc55d4e6f
Core db aristo and kvt updates preparing for integration (#1760)
* Kvt: Implemented multi-descriptor access on the same backend

why:
  This behaviour mirrors the one of Aristo and can be used for
  simultaneous transactions on Aristo + Kvt

* Kvt: Update database iterators

why:
  Forgot to run on the top layer first

* Kvt: Misc fixes

* Aristo, use `openArray[byte]` rather than `Blob` in prototype

* Aristo, by default hashify right after cloning descriptor

why:
  Typically, a completed descriptor is expected after cloning. Hashing
  can be suppressed by argument flag.

* Aristo provides `replicate()` iterator, similar to legacy `replicate()`

* Aristo API fixes and updates

* CoreDB: Rename `legacy_persistent` => `legacy_rocksdb`

why:
  More systematic, will be in line with Aristo DB which might have
  more than one persistent backends

* CoreDB: Prettify API sources

why:
  Better to read and maintain

details:
  Annotating with custom pragmas which cleans up the prototypes

* CoreDB: Update MPT/put() prototype allowing `CatchableError`

why:
  Will be needed for Aristo API (legacy is OK with `RlpError`)
2023-09-18 21:20:28 +01:00
Jordan Hrycaj
dda049cd43
Simple stupid key-value table companion for Aristo DB (#1746)
why:
  Additional tables needed for the `CoreDB` object with separate
  key-value table and MPT.

details:
+ Stripped down copy of Aristo DB to have a similar look'n feel. Otherwise
  it is just a posh way for accessing `Table` objects or `RocksDB` data.
+ No unit tests yet, will be tested on the go.
2023-09-12 19:44:45 +01:00