Commit Graph

26 Commits

Author SHA1 Message Date
Jordan Hrycaj 143f2e99f5
Core db+aristo fixes and tx handling updates (#2164)
* Aristo: Rename journal related sources and functions

why:
  Previously, the naming was hinged on the phrases `fifo`, `filter` etc.
  which reflect the inner workings of cascaded filters. This was
  unfortunate for reading/understanding the source code for actions where
  the focus is the journal as a whole.

* Aristo: Fix buffer overflow (path length truncating error)

* Aristo: Tighten `hikeUp()` stop check, update error code

why:
  Detect dangling vertex links. These are legit with `snap` sync
  processing but not with regular processing.

* Aristo: Raise assert in regular mode `merge()` at a dangling link/edge

why:
  With `snap` sync processing, partial trees are ok and can be amended.
  Not so in regular mode.

  Previously there was only a debug message when a non-legit dangling edge
  was encountered.

* Aristo: Make sure that vertices are copied before modification

why:
  Otherwise vertices from lower layers might also be modified

* Aristo: Fix relaxed mode for validity checker `check()`

* Remove cruft

* Aristo: Update API for transaction handling

details:
+ Split `aristo_tx.nim` into sub-modules
+ Split `forkWith()` into `findTx()` + `forkTx()`
+ Removed `forkTop()`, `forkBase()` (now superseded by new `forkTx()`)

* CoreDb+Aristo: Fix initialiser (missing methods)
2024-05-03 17:38:17 +00:00
Jordan Hrycaj 0d4ef023ed
Update aristo journal functionality (#2155)
* Aristo: Code cosmetics, e.g. update some CamelCase names

* CoreDb+Aristo: Provide oldest known state root implied

details:
  The Aristo journal allows to recover earlier but not all state roots.

* Aristo: Fix journal backward index operator, e.g. `[^1]`

* Aristo: Fix journal updater

why:
  The `fifosStore()` store function slightly misinterpreted the update
  instructions when translation is to database `put()` functions. The
  effect was that the journal was ever growing due to stale entries which
  were never deleted.

* CoreDb+Aristo: Provide utils for purging stale data from the KVT

details:
  See earlier patch, not all state roots are available. This patch
  provides a mapping from some state root to a block number and allows to
  remove all KVT data related to a particular block number

* Aristo+Kvt: Implement a clean up schedule for expired data in KVT

why:
  For a single state ledger like `Aristo`, there is only a limited
  backlog of states. So KVT data (i.e. headers etc.) are cleaned up
  regularly

* Fix copyright year
2024-04-26 13:43:52 +00:00
Jordan Hrycaj 7d9e1d8607
Misc updates for full sync (#2140)
* Code cosmetics

* Aristo+Kvt: Fix api wrappers

why:
  Api setup killed the backend descriptor when backend mapping was
  disabled.

* Aristo: Implement masked profiling entries

why:
  Database backend should be listed but not counted in tally

* CoreDb: Simplify backend() methods

why:
  DBMS backend access Was provided very early and over engineered. Now
  there are only two backend machines, one for `Kvt` and the other one
  for an `Mpt` available only via new API.

* CoreDb: Code cleanup regarding descriptor types

* CoreDb: Refactor/redefine `persistent()` methods

why:
  There were `persistent()` methods for any type of caching storage
  facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single
  `persistent()` method storing all facilities in tandem (similar to
  how transactions work.)

  For non shared `Kvt` tables, there is now an extra storage method
  `saveOffSite()`.

* CoreDb lingo update: `trie` becomes `column`

why:
  Notion of a `trie` is pretty much hidden by the new `CoreDb` api.
  Revealed are sort of database columns for accounts an storage data,
  any of which have an internal state represented by a Keccack hash.
  So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a
  column state.

* Aristo: rename backend filed `filters` => `journal`

* Update full sync logging

details:
  + Disable eth handler noise while syncing
  + Log journal depth (if available)

* Fix copyright year

* Fix cruft and unwanted imports
2024-04-19 18:37:27 +00:00
Jordan Hrycaj 587ca3abbe
Coredb use stackable api for aristo backend (#2060)
* Aristo/Kvt: Provide function hooks APIs

why:
  These APIs can be used for installing tracers, profiling functoinality,
  and other niceties on the databases.

* Aristo: Provide optional API profiling

details:
  It basically is a re-implementation of the `CoreDb` profiling
  implementation

* Kvt: Provide optional API profiling similar to `Aristo`

* CoreDb: Re-implementing profiling using `aristo_profile`

* Ledger: Re-implementing profiling using `aristo_profile`

* CoreDb: Update unit tests for maintainability

* update copyright dates
2024-02-29 21:10:24 +00:00
Jordan Hrycaj 8e18e85288
Aristodb remove obsolete and time consuming admin features (#2048)
* Aristo: Reorg `hashify()` using different schedule algorithm

why:
  Directly calculating the search tree top down from the roots turns
  out to be faster than using the cached structures left over by `merge()`
  and `delete()`.
  Time gains is short of 20%

* Aristo: Remove `lTab[]` leaf entry object type

why:
  Not used anymore. It was previously needed to build the schedule for
  `hashify()`.

* Aristo: Avoid unnecessary re-org of the vertex ID recycling list

why:
  This list can become quite large so a heuristic is employed whether
  it makes sense to re-org.

  Also, re-org check is only done by `delete()` functions.

* Aristo: Remove key/reverse lookup table from tx layers

why:
  It is ignored except for handling proof nodes and costs unnecessary
  run time resources.

  This feature was originally needed to accommodate the mental transition
  from the legacy MPT to the `Aristo` trie :).

* Fix copyright year
2024-02-22 08:24:58 +00:00
andri lim 6ff2edc416
Fix styles (#2046)
* Fix styles

* Fix copyright year
2024-02-21 23:04:59 +07:00
andri lim bea558740f
Reduce compiler warnings (#2030)
* Reduce compiler warnings

* Reduce compiler warnings in test code
2024-02-16 16:08:07 +07:00
Jordan Hrycaj 1b4a43c140
Aristo db remove over engineered object type (#2027)
* CoreDb: update test suite

* Aristo: Simplify reverse key map

why:
  The reverse key map `pAmk: (root,key) -> {vid,..}` as been simplified to
  `pAmk: key -> {vid,..}` as the state `root` domain argument is not used,
  anymore

* Aristo: Remove `HashLabel` object type and replace it by `HashKey`

why:
  The `HashLabel` object attaches a root hash to a hash key. This is
  nowhere used, anymore.

* Fix copyright
2024-02-14 19:11:59 +00:00
Jordan Hrycaj 3b306a9689
Aristo: Update unit test suite (#2002)
* Aristo: Update unit test suite

* Aristo/Kvt: Fix iterators

why:
  Generic iterators were not properly updated after backend change

* Aristo: Add sub-trie deletion functionality

why:
  For storage tries linked to an account payload vertex ID, a the
  whole storage trie needs to be deleted with the account.

* Aristo: Reserve vertex ID numbers for static custom state roots

why:
  Static custom state roots may be controlled by an application,
  e.g. for a receipt or a transaction root. The `Aristo` functions
  are agnostic of what the static state roots are when different
  from the internal tree vertex ID 1.

details;
  The `merge()` function applied to a non-static state root (assumed
  to be a storage root) will check the payload of an accounts leaf
  and mark its Merkle keys to be re-checked.

* Aristo: Correct error code symbol

* Aristo: Update error code symbols

* Aristo: Code cosmetics/comments

* Aristo: Fix hashify schedule calculator

why:
  Had a tendency to stop early leaving an incomplete job
2024-02-01 21:27:48 +00:00
Jordan Hrycaj 43e5f428af
Aristo db kvt maintenance update (#1952)
* Update KVT layers abstraction

details:
  modelled after Aristo layers

* Simplified KVT database iterators (removed item counters)

why:
  Not needed for production functions

* Simplify KVT merge function `layersCc()`

* Simplified Aristo database iterators (removed item counters)

why:
  Not needed for production functions

* Update failure condition for hash labels compiler `hashify()`

why:
  Node need not be rejected as long as links are on the schedule. In
  that case, `redo[]` is to become `wff.base[]` at a later stage.

* Update merging layers and label update functions

why:
+ Merging a stack of layers with `layersCc()` could be simplified
+ Merging layers will optimise the reverse `kMap[]` table maps
  `pAmk: label->{vid, ..}` by deleting empty mappings `label->{}` where
  they are redundant.
+ Updated `layersPutLabel()` for optimising `pAmk[]` tables
2023-12-20 16:19:00 +00:00
Jordan Hrycaj ffa8ad2246
Core db use differential tx layers for aristo and kvt (#1949)
* Fix kvt headers

* Provide differential layers for KVT transaction stack

why:
  Significant performance improvement

* Provide abstraction layer for database top cache layer

why:
  This will eventually implemented as a differential database layers
  or transaction layers. The latter is needed to improve performance.

behavioural changes:
  Zero vertex and keys (i.e. delete requests) are not optimised out
  until the last layer is written to the database.

* Provide differential layers for Aristo transaction stack

why:
  Significant performance improvement
2023-12-19 12:39:23 +00:00
Jordan Hrycaj 13f51939f6
Core db aristo hasher profiling and timing improvement (#1938)
* Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup.

why:
  Speeds up lookup time with `Aristo` backend. For writing `Clique` data,
  the `Companion` model allows to write `Clique` data past the database
  locked by evm transactions.

* Implement `CoreDb` profiling with API tracking

why:
  Chasing time spent per APT procs ...

* Implement `Ledger` profiling with API tracking

why:
  Chasing time spent per APT procs ...

* Always hashify when commiting or storing

why:
  A dirty cache makes no sense when committing

* Make sure that a zero key is created when adding/updating vertices

why:
  This is an error fix mainly for edge cases. A typical error was
  that the root key got deleted when there were only a few vertices
  left on the DB.

* Need all created and changed vertices zero-keyed on the cache

why:
  A zero key (i.e. empty Merkle hash) indicates that a vertex key
  needs to be updated. This would not be needed immediately after
  a merge as there is an actual leaf path on the cache layer. But
  after subsequent merge and delete operations this information
  might get blurred.

* Re-org hashing algorithm

why:
  Apart from errors, the previous implementation was too slow for
  two reasons:
  + some control hashes were calculated for debugging (now all
    verification is done in `aristo_check` module)
  + the leaf paths stored on the cache are used to build the
    labelling (aka hashing) schedule; there paths were accumulated
    over successive hash sessions although it is clear that all
    keys were generated, already
2023-12-12 17:47:41 +00:00
Jordan Hrycaj 657379f484
Aristo db update merkle hasher (#1925)
* Register paths for added leafs because of trie re-balancing

why:
  While the payload would not change, the prefix in the leaf vertex
  would. So it needs to be flagged for hash recompilation for the
  `hashify()` module.

also:
  Make sure that `Hike` paths which might have vertex links into the
  backend filter are replaced by vertex copies before manipulating.
  Otherwise the vertices on the immutable filter might be involuntarily
  changed.

* Also check for paths where the leaf vertex is on the backend, already

why:
  A a path can have dome vertices on the top layer cache with the
  `Leaf` vertex on  the backend.

* Re-define a void `HashLabel` type.

why:
  A `HashLabel` type is a pair `(root-vertex-ID, Keccak-hash)`. Previously,
  a valid `HashLabel` consisted of a non-empty hash and a non-zero vertex
  ID. This definition leads to a non-unique representation of a void
  `HashLabel` with either root-ID or has void. This has been changed to
  the unique void `HashLabel` exactly if the hash entry is void.

* Update consistency checkers

* Re-org `hashify()` procedure

why:
  Syncing against block chain showed serious deficiencies which produced
  wrong hashes or simply bailed out with error.

  So all fringe cases (mainly due to deleted entries) could be integrated
  into the labelling schedule rather than handling separate fringe cases.
2023-12-04 20:39:26 +00:00
Jordan Hrycaj 6e0397e276
Aristo and ledger small updates (#1888)
* Fix debug noise in `hashify()` for perfectly normal situation

why:
  Was previously considered a fixable error

* Fix test sample file names

why:
  The larger test file `goerli68161.txt.gz` is already in the local
  archive. So there is no need to use the smaller one from the external
  repo.

* Activate `accounts_cache` module from `db/ledger`

why:
  A copy of the original `accounts_cache.nim` source to be integrated
  into the `Ledger` module wrapper which allows to switch between
  different `accounts_cache` implementations unser tha same API.

details:
  At a later state, the `db/accounts_cache.nim` wrapper will be
  removed so that there is only one access to that module via
  `db/ledger/accounts_cache.nim`.

* Fix copyright headers in source code
2023-11-08 16:52:25 +00:00
Jordan Hrycaj 4feaa2cfab
Aristo db update for short nodes key edge cases (#1887)
* Aristo: Provide key-value list signature calculator

detail:
  Simple wrappers around `Aristo` core functionality

* Update new API for `CoreDb`

details:
+ Renamed new API functions `contains()` => `hasKey()` or `hasPath()`
  which disables the `in` operator on non-boolean 	`contains()` functions
+ The functions `get()` and `fetch()` always return a not-found error if
  there is no item, available. The new functions `getOrEmpty()` and
  `mergeOrEmpty()` return an an empty `Blob` if there is no such key
  found.

* Rewrite `core_apps.nim` using new API from `CoreDb`

* Use `Aristo` functionality for calculating Merkle signatures

details:
  For debugging, the `VerifyAristoForMerkleRootCalc` can be set so
  that `Aristo` results will be verified against the legacy versions.

* Provide general interface for Merkle signing key-value tables

details:
  Export `Aristo` wrappers

* Activate `CoreDb` tests

why:
  Now, API seems to be stable enough for general tests.

* Update `toHex()` usage

why:
  Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join`

* Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify`

why:
+ Different modules for different purposes
+ `aristo_serialise`: RLP encoding/decoding
+ `aristo_blobify`: Aristo database encoding/decoding

* Compacted representation of small nodes' links instead of Keccak hashes

why:
  Ethereum MPTs use Keccak hashes as node links if the size of an RLP
  encoded node is at least 32 bytes. Otherwise, the RLP encoded node
  value is used as a pseudo node link (rather than a hash.) Such a node
  is nor stored on key-value database. Rather the RLP encoded node value
  is stored instead of a lode link in a parent node instead. Only for
  the root hash, the top level node is always referred to by the hash.

  This feature needed an abstraction of the `HashKey` object which is now
  either a hash or a blob of length at most 31 bytes. This leaves two
  ways of representing an empty/void `HashKey` type, either as an empty
  blob of zero length, or the hash of an empty blob.

* Update `CoreDb` interface (mainly reducing logger noise)

* Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
Jordan Hrycaj cd1d370543
Aristo db api extensions for use as core db backend (#1754)
* Update docu

* Update Aristo/Kvt constructor prototype

why:
  Previous version used an `enum` value to indicate what backend is to
  be used. This was replaced by using the backend object type.

* Rewrite `hikeUp()` return code into `Result[Hike,(Hike,AristoError)]`

why:
  Better code maintenance. Previously, the `Hike` object was returned. It
  had an internal error field so partial success was also available on
  a failure. This error field has been removed.

* Use `openArray[byte]` rather than `Blob` in functions prototypes

* Provide synchronised multi instance transactions

why:
  The `CoreDB` object was geared towards the legacy DB which used a single
  transaction for the key-value backend DB. Different state roots are
  provided by the backend database, so all instances work directly on the
  same backend.

  Aristo db instances have different in-memory mappings (aka different
  state roots) and the transactions are on top of there mappings. So each
  instance might run different transactions.

  Multi instance transactions are a compromise to converge towards the
  legacy behaviour. The synchronised transactions span over all instances
  available at the time when base transaction was opened. Instances
  created later are unaffected.

* Provide key-value pair database iterator

why:
  Needed in `CoreDB` for `replicate()` emulation

also:
  Some update of internal code

* Extend API (i.e. prototype variants)

why:
  Needed for `CoreDB` geared towards the legacy backend which has a more
  basic API than Aristo.
2023-09-15 16:23:53 +01:00
Jordan Hrycaj 8e00143313
Aristo db code massage n cosmetics (#1745)
* Rewrite remaining `AristoError` return code into `Result[void,AristoError]`

why:
  Better code maintenance

* Update import sections

* Update Aristo DB paths

why:
 More systematic so directory can be shared with other DB types

* More cosmetcs

* Update unit tests runners

why:
  Proper handling of persistent and mem-only DB. The latter can be
  consistently triggered by an empty DB path.
2023-09-12 19:45:12 +01:00
Jordan Hrycaj 8e46953390
Aristo db state root repos and reorg (#1744)
* Reorg of distributed backend access

details:
  Now handled via API provided in `aristo_desc`.

* Rename `checkCache()` => `checkTop()`

why:
  Better naming for top layer cache checker

also:
  Provide cascaded fifos checker

* Provide `eq` directive for finding filter by exact filter ID (think block number)

* Some code beautification (for better code reading)

* State root reposition and reorg

details:
  Repositioning is supported by forking a new descriptor. Reorg is then
  accomplished by writing this forked state on the backend database.
2023-09-11 21:38:49 +01:00
Jordan Hrycaj 445fa75251
Aristo db consolidate and clean up (#1699)
* Removed dedicated transcoder tests

why:
  will implicitely be provided by other tests:
  + encode/write -> hashify -> test_tx
  + decode/read -> merge raw nodes -> test_tx
  + de/blobfiy -> backend operations, taext_tx, test_backend, test_filter

* Clarify how the vertex ID generator state is accessed from the backend

why:
  This state is a list of unused vertex IDs. It was just stored somewhere
  on the backend which details were exposed when iterating over some
  sub-table(s).

  As there will be more such single information records, an admin
  sub-tables has been defined (formerly ID generator table) with dedicated
  access keys and type. Also, the iterator over the single ID generator
  state item has been removed. It must be accessed via the `get()`
  interface.

* Remove trailing space from file name

why:
  fixes windows bail out
2023-08-21 15:58:30 +01:00
Jordan Hrycaj 3078c207ca
Aristo db implement distributed backend access (#1688)
* Fix hashing algorithm

why:
  Particular case where a sub-tree is on the backend, linked by an
  Extension vertex to the top level.

* Update backend verification to report `dirty` top layer

* Implement distributed merge of backend filters

* Implement distributed backend access management

details:
  Implemented and tested as described in chapter 5 of the `README.md`
  file.
2023-08-17 14:42:01 +01:00
Jordan Hrycaj 01fe172738
Aristo db integrate hashify into tx (#1679)
* Renamed type `NoneBackendRef` => `VoidBackendRef`

* Clarify names: `BE=filter+backend` and `UBE=backend (unfiltered)`

why:
  Most functions used full names as `getVtxUnfilteredBackend()` or
  `getKeyBackend()`. After defining abbreviations (and its meaning) it
   seems easier to use `getVtxUBE()` and `getKeyBE()`.

* Integrate `hashify()` process into transaction logic

why:
  Is now transparent unless explicitly controlled.

details:
  Cache changes imply setting a `dirty` flag which in turn triggers
  `hashify()` processing in transaction and `pack()` directives.

* Removed `aristo_tx.exec()` directive

why:
  Inconsistent implementation, functionality will be provided with a
  different paradigm.
2023-08-11 18:23:57 +01:00
Jordan Hrycaj 09fabd04eb
Aristo db use filter betw backend and tx cache (#1678)
* Provide deep copy for each transaction layer

why:
  Localising changes. Selective deep copy was just overlooked.

* Generalise vertex ID generator state reorg function `vidReorg()`

why:
  makes it somewhat easier to handle when saving layers.

* Provide dummy back end descriptor `NoneBackendRef`

* Optional read-only filter between backend and transaction cache

why:
  Some staging area for accumulating changes to the backend DB. This
  will eventually be an access layer for emulating a backend with
  multiple/historic state roots.

* Re-factor `persistent()` with filter between backend/tx-cache => `stow()`

why:
  The filter provides an abstraction from the physically stored data on
  disk. So, there can be several MPT instances using the same disk data
  with different state roots. Of course, all the MPT instances should
  not differ too much for practical reasons :).

TODO:
  Filter administration tools need to be provided.
2023-08-10 21:01:28 +01:00
Jordan Hrycaj 56d5c382d7
Aristo db traversal helpers (#1638)
* Misc fixes

detail:
* Fix de-serialisation for account leafs
* Update node recovery from unit tests

* Remove `LegacyAccount` from `PayloadRef` object

why:
  Legacy accounts use a hash key as storage root which is detrimental
  to the working of the Aristo database which uses a vertex ID.

* Dissolve `hashify_helper` into `aristo_utils` and `aristo_transcode`

why:
  Functions are of general interest so they should live in first level
  code files.

* Added left/right iterators over leaf nodes

* Some helper/wrapper functions that might be useful
2023-07-13 00:03:14 +01:00
Jordan Hrycaj 93a72025a1
Extended data Payload specs for the backend. (#1630)
why:
  For the main tree with root vertex ID 1, the leaf nodes hold the
  account data. These accounts may link to sub trees the storage root
  node ID of which must be registered here. There is no reverse key
  lookup on the backend.

note:
  These definitions are experimental. Also, there are some tests missing
  for validating Payload data conversions.
2023-07-05 21:27:48 +01:00
Jordan Hrycaj ff6673beac
Aristo db tidy up a bit (#1625)
* Slightly tighten some self-check conditions

* Redefined the database descriptor object as reference (to the object)

why:
  The upcoming transaction wrapper will work with a database reference
  rather than the object itself

* Append state before `save()` to the Aristo descriptor

why:
  This stae was previously returned by the function. Appending it to
  a field of the Aristo descriptor seems easier to handle.
2023-07-04 19:24:03 +01:00
Jordan Hrycaj dd1c8ed6f2
Aristo db update delete functionality (#1621)
* Fix missing branch checks in transcoder

why:
  Symmetry problem. `Blobify()` allowed for encoding degenerate branch
  vertices while `Deblobify()` rejected decoding wrongly encoded data.

* Update memory backend so that it rejects storing bogus vertices.

why:
  Error behaviour made similar to the rocks DB backend.

* Make sure that leaf vertex IDs are not repurposed

why:
  This makes it easier to record leaf node changes

* Update error return code for next()/right() traversal

why:
  Returning offending vertex ID (besides error code) helps debugging

* Update Merkle hasher for deleted nodes

why:
  Not implemented, yet

also:
  Provide cache & backend consistency check functions. This was
  partly re-implemented from `hashifyCheck()`

* Simplify some unit tests

* Fix delete function

why:
  Was conceptually wrong
2023-06-30 23:22:33 +01:00