These representations use ~15-20% less data compared to the status quo,
mainly by removing redundant zeroes in the integer encodings - a
significant effect of this change is that the various rocksdb caches see
better efficiency since more items fit in the same amount of space.
* use RLP encoding for `VertexID` and `UInt256` wherever it appears
* pack `VertexRef`/`PayloadRef` more tightly
* creating a seq from a table that holds lots of changes means copying
all data into the table - this can be several GB of data while syncing
blocks
* nim fails to optimize the moving of the `WidthFirstForest` - the real
solution is to not construct a `wff` to begin with, but this PR provides
relief while that is being worked on
This spike fix allows us to bump the rocksdb cache by another 2 GB and
still have a significantly lower peak memory usage during sync.
* bump rockdb
* Rename `KVT` objects related to filters according to `Aristo` naming
details:
filter* => delta*
roFilter => balancer
* Compulsory error handling if `persistent()` fails
* Add return code to `reCentre()`
why:
Might eventually fail if re-centring is blocked. Some logic will be
added in subsequent patch sets.
* Add column families from earlier session to rocksdb in opening procedure
why:
All previously used CFs must be declared when re-opening an existing
database.
* Update `init()` and add rocksdb `reinit()` methods for changing parameters
why:
Opening a set column families (with different open options) must span
at least the ones that are already on disk.
* Provide write-trigger-event interface into `Aristo` backend
why:
This allows to save data from a guest application (think `KVT`) to
get synced with the write cycle so the guest and `Aristo` save all
atomically.
* Use `KVT` with new column family interface from `Aristo`
* Remove obsolete guest interface
* Implement `KVT` piggyback on `Aristo` backend
* CoreDb: Add separate `KVT`/`Aristo` backend mode for debugging
* Remove `rocks_db` import from `persist()` function
why:
Some systems (i.p `fluffy` and friends) use the `Aristo` memory
backend emulation and do not link against rocksdb when building the
application. So this should fix that problem.
* Use RocksDb column families instead of a prefixed single column
why:
Better performance
* Use structural objects `VertexRef` and `HashKey` in LRU cache for RocksDb
why:
Avoids repeated de/serialisation
* Aristo+Kvt: Better RocksDB profiling
why:
Providing more detailed information, mainly for `Aristo`
* Aristo: Renamed journal `stats()` to `capacity()`
why:
`Stats()` was a misnomer
* Aristo: Provide backend read caches for key and vertex IDs
why:
Dedicated LRU caching for particular types gives a throughput advantage.
The sizes of the LRU queues used for caching are currently constant
but might be adjusted at a later time.
* Fix copyright year
* Update README
* Nimbus-main: replaced `PruneMode` options by `ChainDbMode` options
details:
For the legacy database, this changes the phrase
- `conf.pruneMode == PruneMode.Full` to the expression
+ `conf.chainDbMode == ChainDbMode.Prune`.
* Fix issues moaned about by NIM compiler
* Fix copyright year
* Aristo+RocksDB: Update backend drivers
why:
RocksDB update allows use some of the newly provided methods which
were previously implemented by using the very C backend (for the lack
of NIM methods.)
* Aristo+RocksDB: Simplify drivers wrapper
* Kvt: Update backend drivers and wrappers similar to `Aristo`
* Aristo+Kvm: Use column families for RocksDB
* Aristo+MemoryDB: Code cosmetics
* Aristo: Provide guest column family for export
why:
So `Kvt` can piggyback on `Aristo` so there avoiding to run a second
DBMS system in parallel.
* Kvt: Provide import mechanism for RoksDB guest column family
why:
So `Kvt` can piggyback on `Aristo` so there avoiding to run a second
DBMS system in parallel.
* CoreDb+Aristo: Run persistent `Kvt` DB piggybacked on `Aristo`
why:
Avoiding to run two DBMS systems in parallel.
* Fix copyright year
* Ditto
* Fix copyright year
* Show elapsed times with enabled `CoreDb` API tracking
* Show elapsed times with enabled `LedgerRef` API tracking
* Reorg `CoreDb` auto destructors for `Aristo` DB
why:
While `Aristo` supports some parallelism for concurrent database access,
this comes with a price of management overhead. With a naive approach,
the auto-destructor will slow down execution because the ledger and
evm treat the database in a shared mode where a DB descriptor is just
created and thrown away shortly after.
This is reflected in the `Coredb` abstraction layer above `Aristo`/`Kvt`
where a few `Shared` type descriptors are cached and a shared reference
is returned rather than a disposable new object.
* For `CoreDb` support transaction level tracking
details:
This is mainly an extra for the legacy DB as `Aristo` and `Kvt` support
this already.
Also return an error on the legacy DB backend when `persistent()` is
called while there are transactions pending (the `persistent()` call
does nothing otherwise on the legacy backend.)
* Clear compiler warnings (remove unused variables etc.)
* Using different `tmp` directories for `Kvt` and `Aristo`
why:
Closing one database would leave the other set of directories
incomplete.
* Code cosmetics, silence compiler
* Fix typo `EMPTY_ROOT_HASH` vs. `EMPTY_CODE_HASH`
* Fix copyright years
* Fix debug noise in `hashify()` for perfectly normal situation
why:
Was previously considered a fixable error
* Fix test sample file names
why:
The larger test file `goerli68161.txt.gz` is already in the local
archive. So there is no need to use the smaller one from the external
repo.
* Activate `accounts_cache` module from `db/ledger`
why:
A copy of the original `accounts_cache.nim` source to be integrated
into the `Ledger` module wrapper which allows to switch between
different `accounts_cache` implementations unser tha same API.
details:
At a later state, the `db/accounts_cache.nim` wrapper will be
removed so that there is only one access to that module via
`db/ledger/accounts_cache.nim`.
* Fix copyright headers in source code
* Split `core_db/base.nim` into several sources
* Rename `core_db/legacy.nim` => `core_db/legacy_db.nim`
* Update `CoreDb` API, dual methods returning `Result[]` or plain value
detail:
Plain value methods implemet the legacy API, they defect on error results
* Redesign `CoreDB` direct backend access
why:
Made the `backend` directive integral part of the API
* Discontinue providing unused or otherwise available functions
details:
+ setTransactionID() removed, not used and not easily replicable in Aristo
+ maybeGet() removed, available via direct backend access
+ newPhk() removed, never used & was experimental anyway
* Update/reorg backend API
why:
+ Added error print function `$$()`
+ General descriptor completion (and optional validation) via `bless()`
* Update `Aristo`/`Kvt` exception handling
why:
Avoid `CatchableError` exceptions, rather pass them as error code where
appropriate.
* More `CoreDB` compliant `Aristo` and `Kvt` methods
details:
+ Providing functions like `contains()`, `getVtxRc()` (returns `Result[]`).
+ Additional error code: `NotImplemented`
* Rewrite/reorg of Aristo DB constructor
why:
Previously used global object `DefaultQidLayoutRef` as default
initialiser. This object was created at compile time which lead to
non-gc safe functions.
* Update nimbus/db/core_db/legacy_db.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Update nimbus/db/aristo/aristo_transcode.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Update nimbus/db/core_db/legacy_db.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
---------
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Rewrite remaining `AristoError` return code into `Result[void,AristoError]`
why:
Better code maintenance
* Update import sections
* Update Aristo DB paths
why:
More systematic so directory can be shared with other DB types
* More cosmetcs
* Update unit tests runners
why:
Proper handling of persistent and mem-only DB. The latter can be
consistently triggered by an empty DB path.
* Reorg of distributed backend access
details:
Now handled via API provided in `aristo_desc`.
* Rename `checkCache()` => `checkTop()`
why:
Better naming for top layer cache checker
also:
Provide cascaded fifos checker
* Provide `eq` directive for finding filter by exact filter ID (think block number)
* Some code beautification (for better code reading)
* State root reposition and reorg
details:
Repositioning is supported by forking a new descriptor. Reorg is then
accomplished by writing this forked state on the backend database.
* Rename FilterID => QueueID
why:
The current usage does not identify a particular filter but uses it as
storage tag to manage it on the database (to be organised in a set of
FIFOs or queues.)
* Split `aristo_filter` source into sub-files
why:
Make space for filter management API
* Store filter queue IDs in pairs on the backend
why:
Any pair will will describe a FIFO accessed by bottom/top IDs
* Reorg some source file names
why:
The "aristo_" prefix for make local/private files is tedious to
use, so removed.
* Implement filter slot scheduler
details:
Filters will be stored on the database on cascaded FIFOs. When a FIFO
queue is full, some filter items are bundled together and stored on the
next FIFO.