* Aristo: Update unit test suite
* Aristo/Kvt: Fix iterators
why:
Generic iterators were not properly updated after backend change
* Aristo: Add sub-trie deletion functionality
why:
For storage tries linked to an account payload vertex ID, a the
whole storage trie needs to be deleted with the account.
* Aristo: Reserve vertex ID numbers for static custom state roots
why:
Static custom state roots may be controlled by an application,
e.g. for a receipt or a transaction root. The `Aristo` functions
are agnostic of what the static state roots are when different
from the internal tree vertex ID 1.
details;
The `merge()` function applied to a non-static state root (assumed
to be a storage root) will check the payload of an accounts leaf
and mark its Merkle keys to be re-checked.
* Aristo: Correct error code symbol
* Aristo: Update error code symbols
* Aristo: Code cosmetics/comments
* Aristo: Fix hashify schedule calculator
why:
Had a tendency to stop early leaving an incomplete job
* Return slots when verifying witness regardless of code length.
* Prevent AccountCache WitnessData codeTouched from being reset.
* Default to using {wfNoFlag} for witness flags in tests to allow running with data from before EIP170.
* Add additional json files to experimental JSON RPC test.
* Use HTTP RPC server in tests.
* Add test to check that block witness contains bytecode.
* Aristo: Re-phrase `LayerDelta` and `LayerFinal` as object references
why:
Avoids copying in some cases
* Fix copyright header
* Aristo: Verify `leafTie.root` function argument for `merge()` proc
why:
Zero root will lead to inconsistent DB entry
* Aristo: Update failure condition for hash labels compiler `hashify()`
why:
Node need not be rejected as long as links are on the schedule. In
that case, `redo[]` is to become `wff.base[]` at a later stage.
This amends an earlier fix, part of #1952 by also testing against
the target nodes of the `wff.base[]` sets.
* Aristo: Add storage root glue record to `hashify()` schedule
why:
An account leaf node might refer to a non-resolvable storage root ID.
Storage root node chains will end up at the storage root. So the link
`storage-root->account-leaf` needs an extra item in the schedule.
* Aristo: fix error code returned by `fetchPayload()`
details:
Final error code is implied by the error code form the `hikeUp()`
function.
* CoreDb: Discard `createOk` argument in API `getRoot()` function
why:
Not needed for the legacy DB. For the `Arsto` DB, a lazy approach is
implemented where a stprage root node is created on-the-fly.
* CoreDb: Prevent `$$` logging in some cases
why:
Logging the function `$$` is not useful when it is used for internal
use, i.e. retrieving an an error text for logging.
* CoreDb: Add `tryHashFn()` to API for pretty printing
why:
Pretty printing must not change the hashification status for the
`Aristo` DB. So there is an independent API wrapper for getting the
node hash which never updated the hashes.
* CoreDb: Discard `update` argument in API `hash()` function
why:
When calling the API function `hash()`, the latest state is always
wanted. For a version that uses the current state as-is without checking,
the function `tryHash()` was added to the backend.
* CoreDb: Update opaque vertex ID objects for the `Aristo` backend
why:
For `Aristo`, vID objects encapsulate a numeric `VertexID`
referencing a vertex (rather than a node hash as used on the
legacy backend.) For storage sub-tries, there might be no initial
vertex known when the descriptor is created. So opaque vertex ID
objects are supported without a valid `VertexID` which will be
initalised on-the-fly when the first item is merged.
* CoreDb: Add pretty printer for opaque vertex ID objects
* Cosmetics, printing profiling data
* CoreDb: Fix segfault in `Aristo` backend when creating MPT descriptor
why:
Missing initialisation error
* CoreDb: Allow MPT to inherit shared context on `Aristo` backend
why:
Creates descriptors with different storage roots for the same
shared `Aristo` DB descriptor.
* Cosmetics, update diagnostic message items for `Aristo` backend
* Fix Copyright year
* Initial implementation of eth_getProof endpoint.
* Implemented generation of account and storage proofs.
* Minor fixes and additional tests.
* Refactor getBranch code into a separate file.
* Improve usage of test data.
* Fix copyright year.
* Return zero hash for codeHash and storageHash if account doesn't exist.
* Update copyright notice and moved trie key hashing to inside getBranch proc.
* Update KVT layers abstraction
details:
modelled after Aristo layers
* Simplified KVT database iterators (removed item counters)
why:
Not needed for production functions
* Simplify KVT merge function `layersCc()`
* Simplified Aristo database iterators (removed item counters)
why:
Not needed for production functions
* Update failure condition for hash labels compiler `hashify()`
why:
Node need not be rejected as long as links are on the schedule. In
that case, `redo[]` is to become `wff.base[]` at a later stage.
* Update merging layers and label update functions
why:
+ Merging a stack of layers with `layersCc()` could be simplified
+ Merging layers will optimise the reverse `kMap[]` table maps
`pAmk: label->{vid, ..}` by deleting empty mappings `label->{}` where
they are redundant.
+ Updated `layersPutLabel()` for optimising `pAmk[]` tables
* Fix kvt headers
* Provide differential layers for KVT transaction stack
why:
Significant performance improvement
* Provide abstraction layer for database top cache layer
why:
This will eventually implemented as a differential database layers
or transaction layers. The latter is needed to improve performance.
behavioural changes:
Zero vertex and keys (i.e. delete requests) are not optimised out
until the last layer is written to the database.
* Provide differential layers for Aristo transaction stack
why:
Significant performance improvement
* Activate `LedgerRef` wrapper for `AccountsCache`
details:
`accounts_cache.nim` methods are indirectly processed by the wrapper
methods from `ledger.nim`.
This works for all sources except `test_state_db.nim` where the source
`accounts_cache.nim` is included (rather than imported) in order to
access objects privy to the very source.
* Provide facility to switch to a preselected `LedgerRef` type
details:
Can be set as suggestion when initialising `CommonRef`
* Update `CoreDb` test suite for better time tracking
details:
+ Allow time logging by pre-defined block intervals
+ Print `CoreDb`/`Ledger`profiling results (if enabled)
* Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup.
why:
Speeds up lookup time with `Aristo` backend. For writing `Clique` data,
the `Companion` model allows to write `Clique` data past the database
locked by evm transactions.
* Implement `CoreDb` profiling with API tracking
why:
Chasing time spent per APT procs ...
* Implement `Ledger` profiling with API tracking
why:
Chasing time spent per APT procs ...
* Always hashify when commiting or storing
why:
A dirty cache makes no sense when committing
* Make sure that a zero key is created when adding/updating vertices
why:
This is an error fix mainly for edge cases. A typical error was
that the root key got deleted when there were only a few vertices
left on the DB.
* Need all created and changed vertices zero-keyed on the cache
why:
A zero key (i.e. empty Merkle hash) indicates that a vertex key
needs to be updated. This would not be needed immediately after
a merge as there is an actual leaf path on the cache layer. But
after subsequent merge and delete operations this information
might get blurred.
* Re-org hashing algorithm
why:
Apart from errors, the previous implementation was too slow for
two reasons:
+ some control hashes were calculated for debugging (now all
verification is done in `aristo_check` module)
+ the leaf paths stored on the cache are used to build the
labelling (aka hashing) schedule; there paths were accumulated
over successive hash sessions although it is clear that all
keys were generated, already
* Register paths for added leafs because of trie re-balancing
why:
While the payload would not change, the prefix in the leaf vertex
would. So it needs to be flagged for hash recompilation for the
`hashify()` module.
also:
Make sure that `Hike` paths which might have vertex links into the
backend filter are replaced by vertex copies before manipulating.
Otherwise the vertices on the immutable filter might be involuntarily
changed.
* Also check for paths where the leaf vertex is on the backend, already
why:
A a path can have dome vertices on the top layer cache with the
`Leaf` vertex on the backend.
* Re-define a void `HashLabel` type.
why:
A `HashLabel` type is a pair `(root-vertex-ID, Keccak-hash)`. Previously,
a valid `HashLabel` consisted of a non-empty hash and a non-zero vertex
ID. This definition leads to a non-unique representation of a void
`HashLabel` with either root-ID or has void. This has been changed to
the unique void `HashLabel` exactly if the hash entry is void.
* Update consistency checkers
* Re-org `hashify()` procedure
why:
Syncing against block chain showed serious deficiencies which produced
wrong hashes or simply bailed out with error.
So all fringe cases (mainly due to deleted entries) could be integrated
into the labelling schedule rather than handling separate fringe cases.
* Fix copyright year
* Show elapsed times with enabled `CoreDb` API tracking
* Show elapsed times with enabled `LedgerRef` API tracking
* Reorg `CoreDb` auto destructors for `Aristo` DB
why:
While `Aristo` supports some parallelism for concurrent database access,
this comes with a price of management overhead. With a naive approach,
the auto-destructor will slow down execution because the ledger and
evm treat the database in a shared mode where a DB descriptor is just
created and thrown away shortly after.
This is reflected in the `Coredb` abstraction layer above `Aristo`/`Kvt`
where a few `Shared` type descriptors are cached and a shared reference
is returned rather than a disposable new object.
* For `CoreDb` support transaction level tracking
details:
This is mainly an extra for the legacy DB as `Aristo` and `Kvt` support
this already.
Also return an error on the legacy DB backend when `persistent()` is
called while there are transactions pending (the `persistent()` call
does nothing otherwise on the legacy backend.)
* Clear compiler warnings (remove unused variables etc.)
* Using different `tmp` directories for `Kvt` and `Aristo`
why:
Closing one database would leave the other set of directories
incomplete.
* Code cosmetics, silence compiler
* Fix typo `EMPTY_ROOT_HASH` vs. `EMPTY_CODE_HASH`
* Fix copyright years
* Split off `ReadOnlyStateDB` from `AccountStateDB` from `state_db.nim`
why:
Apart from testing, applications use `ReadOnlyStateDB` as an easy
way to access the accounts ledger. This is well supported by the
`Aristo` db, but writable mode is only parially supported.
The writable AccountStateDB` object for modifying accounts is not
used by production code.
So, for lecgacy and testing apps, the full support of the previous
`AccountStateDB` is now enabled by `import db/state_db/read_write`
and the `import db/state_db` provides read-only mode.
* Encapsulate `AccountStateDB` as `GenesisLedgerRef` or genesis creation
why:
`AccountStateDB` has poor support for `Aristo` and is not widely used
in favour of `AccountsLedger` (which will be abstracted as `ledger`.)
Currently, using other than the `AccountStateDB` ledgers within the
`GenesisLedgerRef` wrapper is experimental and test only. Eventually,
the wrapper should disappear so that the `Ledger` object (which
encapsulates `AccountsCache` and `AccountsLedger`) will prevail.
* For the `Ledger`, provide access to raw accounts `MPT`
why:
This gives to the `CoreDbMptRef` descriptor from the `CoreDb` (which is
the legacy version of CoreDxMptRef`.) For the new `ledger` API, the
accounts are based on the `CoreDxMAccRef` descriptor which uses a
particular sub-system for accounts while legacy applications use the
`CoreDbPhkRef` equivalent of the `SecureHexaryTrie`.
The only place where this feature will currently be used is the
`genesis.nim` source file.
* Fix `Aristo` bugs, missing boundary checks, typos, etc.
* Verify root vertex in `MPT` and account constructors
why:
Was missing so far, in particular the accounts constructor must
verify `VertexID(1)
* Fix include file
* Disable `TransactionID` related functions from `state_db.nim`
why:
Functions `getCommittedStorage()` and `updateOriginalRoot()` from
the `state_db` module are nowhere used. The emulation of a legacy
`TransactionID` type functionality is administratively expensive to
provide by `Aristo` (the legacy DB version is only partially
implemented, anyway).
As there is no other place where `TransactionID`s are used, they will
not be provided by the `Aristo` variant of the `CoreDb`. For the
legacy DB API, nothing will change.
* Fix copyright headers in source code
* Get rid of compiler warning
* Update Aristo code, remove unused `merge()` variant, export `hashify()`
why:
Adapt to upcoming `CoreDb` wrapper
* Remove synced tx feature from `Aristo`
why:
+ This feature allowed to synchronise transaction methods like begin,
commit, and rollback for a group of descriptors.
+ The feature is over engineered and not needed for `CoreDb`, neither
is it complete (some convergence features missing.)
* Add debugging helpers to `Kvt`
also:
Update database iterator, add count variable yield argument similar
to `Aristo`.
* Provide optional destructors for `CoreDb` API
why;
For the upcoming Aristo wrapper, this allows to control when certain
smart destruction and update can take place. The auto destructor works
fine in general when the storage/cache strategy is known and acceptable
when creating descriptors.
* Add update option for `CoreDb` API function `hash()`
why;
The hash function is typically used to get the state root of the MPT.
Due to lazy hashing, this might be not available on the `Aristo` DB.
So the `update` function asks for re-hashing the gurrent state changes
if needed.
* Update API tracking log mode: `info` => `debug
* Use shared `Kvt` descriptor in new Ledger API
why:
No need to create a new descriptor all the time
* Fix debug noise in `hashify()` for perfectly normal situation
why:
Was previously considered a fixable error
* Fix test sample file names
why:
The larger test file `goerli68161.txt.gz` is already in the local
archive. So there is no need to use the smaller one from the external
repo.
* Activate `accounts_cache` module from `db/ledger`
why:
A copy of the original `accounts_cache.nim` source to be integrated
into the `Ledger` module wrapper which allows to switch between
different `accounts_cache` implementations unser tha same API.
details:
At a later state, the `db/accounts_cache.nim` wrapper will be
removed so that there is only one access to that module via
`db/ledger/accounts_cache.nim`.
* Fix copyright headers in source code
* Aristo: Provide key-value list signature calculator
detail:
Simple wrappers around `Aristo` core functionality
* Update new API for `CoreDb`
details:
+ Renamed new API functions `contains()` => `hasKey()` or `hasPath()`
which disables the `in` operator on non-boolean `contains()` functions
+ The functions `get()` and `fetch()` always return a not-found error if
there is no item, available. The new functions `getOrEmpty()` and
`mergeOrEmpty()` return an an empty `Blob` if there is no such key
found.
* Rewrite `core_apps.nim` using new API from `CoreDb`
* Use `Aristo` functionality for calculating Merkle signatures
details:
For debugging, the `VerifyAristoForMerkleRootCalc` can be set so
that `Aristo` results will be verified against the legacy versions.
* Provide general interface for Merkle signing key-value tables
details:
Export `Aristo` wrappers
* Activate `CoreDb` tests
why:
Now, API seems to be stable enough for general tests.
* Update `toHex()` usage
why:
Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join`
* Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify`
why:
+ Different modules for different purposes
+ `aristo_serialise`: RLP encoding/decoding
+ `aristo_blobify`: Aristo database encoding/decoding
* Compacted representation of small nodes' links instead of Keccak hashes
why:
Ethereum MPTs use Keccak hashes as node links if the size of an RLP
encoded node is at least 32 bytes. Otherwise, the RLP encoded node
value is used as a pseudo node link (rather than a hash.) Such a node
is nor stored on key-value database. Rather the RLP encoded node value
is stored instead of a lode link in a parent node instead. Only for
the root hash, the top level node is always referred to by the hash.
This feature needed an abstraction of the `HashKey` object which is now
either a hash or a blob of length at most 31 bytes. This leaves two
ways of representing an empty/void `HashKey` type, either as an empty
blob of zero length, or the hash of an empty blob.
* Update `CoreDb` interface (mainly reducing logger noise)
* Fix copyright years (to make `Lint` happy)
* Aristo: Single `FetchPathNotFound` error in `fetchXxx()` and `hasPath()`
why:
Missing path hike returns too many detailed reasons why it failed
which becomes cumbersome to handle.
also:
Renamed `contains()` => `hasPath()` which disables the `in` operator on
non-boolean `contains()` functions
* Kvt: Renamed `contains()` => `hasKey()`
why:
which disables the `in` operator on non-boolean `contains()` functions
* Aristo: Generalising `HashID` by variable length `PathID`
why:
There are cases when the `Aristo` database is to be used with
shorter than 64 nibbles keys when handling transactions indexes
with sequence IDs.
caveat:
This patch only works reliable for full length `PathID` values. Tests
for shorter `PathID` values are currently missing.
* Make sure that storage tries are not pruned (by default) on the new Ledger API
why:
Pruning might kill some unwanted entries from storage tries ending up with an unstable database
leading to crashes.
* Implement `CoreDb` and `LedgerRef` API tracing
details:
+ Locally enabled at compile time via constants `ProvideCoreDbLegacyAPI`
and `EnableApiTracking` in either `base.nim` source
+ If enabled it can be selectively turned on/off via public switches in
the `CoreDb` descriptor.
* Allow suppressing opportunistic `ifNecessaryGetXxx()` functions
why:
Better troubleshooting when the system crashes (assertions will then
most probably happen outside an `async` function.)
* Provide TDD/debug facility for inspecting `persistBlocks()` working
detail:
+ Make sure that the last block of a test sample is the first batch
item in `persistBlocks()`.
+ Additionally, allow `AccountsCache` API tracing by setting the flag
`extraTraceMessages = true` in the file `accounts_cache.nim`
* Overload AccountsCache by abstraction wrapper
details:
Can facilitate CoreDb API switch, details in `ledger/README.md`.
* CoreDB: Re-org API
details:
Legacy API internally uses vertex ID for root node abstraction
* Cosmetics: Move some unit test helpers to common sub-directory
* Extract constant from `accouns_cache.nim` => `constants.nim`
* Fix tracer methods
why:
Logger dump data were wrongly dumped from the production database. This
caused an assert exception when iterating over the persistent database
(instead of the memory logger.) This event in turn was enabled after
fixing another inconsistency which just set up an empty iterator. Unit
tests failed to detect that.
* Aristo: remove obsolete functions
* Aristo: Fix error code for non-available hash keys
why:
Must not return `not-found` when the key is not available (i.e. the
current changes were not hashified, yet.)
* CoreDB: Provide TDD and test framework
* Split `core_db/base.nim` into several sources
* Rename `core_db/legacy.nim` => `core_db/legacy_db.nim`
* Update `CoreDb` API, dual methods returning `Result[]` or plain value
detail:
Plain value methods implemet the legacy API, they defect on error results
* Redesign `CoreDB` direct backend access
why:
Made the `backend` directive integral part of the API
* Discontinue providing unused or otherwise available functions
details:
+ setTransactionID() removed, not used and not easily replicable in Aristo
+ maybeGet() removed, available via direct backend access
+ newPhk() removed, never used & was experimental anyway
* Update/reorg backend API
why:
+ Added error print function `$$()`
+ General descriptor completion (and optional validation) via `bless()`
* Update `Aristo`/`Kvt` exception handling
why:
Avoid `CatchableError` exceptions, rather pass them as error code where
appropriate.
* More `CoreDB` compliant `Aristo` and `Kvt` methods
details:
+ Providing functions like `contains()`, `getVtxRc()` (returns `Result[]`).
+ Additional error code: `NotImplemented`
* Rewrite/reorg of Aristo DB constructor
why:
Previously used global object `DefaultQidLayoutRef` as default
initialiser. This object was created at compile time which lead to
non-gc safe functions.
* Update nimbus/db/core_db/legacy_db.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Update nimbus/db/aristo/aristo_transcode.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Update nimbus/db/core_db/legacy_db.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
---------
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Kvt: Implemented multi-descriptor access on the same backend
why:
This behaviour mirrors the one of Aristo and can be used for
simultaneous transactions on Aristo + Kvt
* Kvt: Update database iterators
why:
Forgot to run on the top layer first
* Kvt: Misc fixes
* Aristo, use `openArray[byte]` rather than `Blob` in prototype
* Aristo, by default hashify right after cloning descriptor
why:
Typically, a completed descriptor is expected after cloning. Hashing
can be suppressed by argument flag.
* Aristo provides `replicate()` iterator, similar to legacy `replicate()`
* Aristo API fixes and updates
* CoreDB: Rename `legacy_persistent` => `legacy_rocksdb`
why:
More systematic, will be in line with Aristo DB which might have
more than one persistent backends
* CoreDB: Prettify API sources
why:
Better to read and maintain
details:
Annotating with custom pragmas which cleans up the prototypes
* CoreDB: Update MPT/put() prototype allowing `CatchableError`
why:
Will be needed for Aristo API (legacy is OK with `RlpError`)
* Update docu
* Update Aristo/Kvt constructor prototype
why:
Previous version used an `enum` value to indicate what backend is to
be used. This was replaced by using the backend object type.
* Rewrite `hikeUp()` return code into `Result[Hike,(Hike,AristoError)]`
why:
Better code maintenance. Previously, the `Hike` object was returned. It
had an internal error field so partial success was also available on
a failure. This error field has been removed.
* Use `openArray[byte]` rather than `Blob` in functions prototypes
* Provide synchronised multi instance transactions
why:
The `CoreDB` object was geared towards the legacy DB which used a single
transaction for the key-value backend DB. Different state roots are
provided by the backend database, so all instances work directly on the
same backend.
Aristo db instances have different in-memory mappings (aka different
state roots) and the transactions are on top of there mappings. So each
instance might run different transactions.
Multi instance transactions are a compromise to converge towards the
legacy behaviour. The synchronised transactions span over all instances
available at the time when base transaction was opened. Instances
created later are unaffected.
* Provide key-value pair database iterator
why:
Needed in `CoreDB` for `replicate()` emulation
also:
Some update of internal code
* Extend API (i.e. prototype variants)
why:
Needed for `CoreDB` geared towards the legacy backend which has a more
basic API than Aristo.
* Rewrite remaining `AristoError` return code into `Result[void,AristoError]`
why:
Better code maintenance
* Update import sections
* Update Aristo DB paths
why:
More systematic so directory can be shared with other DB types
* More cosmetcs
* Update unit tests runners
why:
Proper handling of persistent and mem-only DB. The latter can be
consistently triggered by an empty DB path.
why:
Additional tables needed for the `CoreDB` object with separate
key-value table and MPT.
details:
+ Stripped down copy of Aristo DB to have a similar look'n feel. Otherwise
it is just a posh way for accessing `Table` objects or `RocksDB` data.
+ No unit tests yet, will be tested on the go.
* Reorg of distributed backend access
details:
Now handled via API provided in `aristo_desc`.
* Rename `checkCache()` => `checkTop()`
why:
Better naming for top layer cache checker
also:
Provide cascaded fifos checker
* Provide `eq` directive for finding filter by exact filter ID (think block number)
* Some code beautification (for better code reading)
* State root reposition and reorg
details:
Repositioning is supported by forking a new descriptor. Reorg is then
accomplished by writing this forked state on the backend database.
details:
* Tested features
+ Successively store filters with increasing filter ID (think block number)
+ Cascading through fifos, deeper fifos merge groups of filters
+ Fetch squash merged N top fifos
+ Delete N top fifos, push back merged fifo, continue storing
+ Fifo chain is verified by hashes and filter ID
* Not tested yet
+ Real live scenario (using data dumps)
+ Real filter data (only shallow filters used so far)
* Set scheduler state as part of the backend descriptor
details:
Moved type definitions `QidLayoutRef` and `QidSchedRef` to
`desc_structural.nim` so that it shares the same folder as
`desc_backend.nim`
* Automatic filter queue table initialisation in backend
details:
Scheduler can be tweaked or completely disabled
* Updated backend unit tests
details:
+ some code clean up/beautification, reads better now
+ disabled persistent filters so that there is no automated filter
management which will be implemented next
* Prettify/update unit tests source code
details:
Mostly replacing the `check()` paradigm by `xCheck()`
* Somewhat simplified backend type management
why:
Backend objects are labelled with a `BackendType` symbol where the
`BackendVoid` label is implicitly assumed for a `nil` backend object
reference.
To make it easier, a `kind()` function is used now applicable to
`nil` references as well.
* Fix DB storage layout for filter objects
why:
Need to store the filter ID with the object
* Implement reverse [] index on fifo
why:
An integer index argument on `[]` retrieves the QueueID (label) of the
fifo item while a QueueID argument on `[]` retrieves the index (so
it is inverse to the former variant).
* Provide iterator over filters as fifo
why:
This iterator goes along the cascased fifo structure (i.e. in
historical order)
* Add backwards index `[]` operator into fifo
also:
Need another maintenance instruction: The last overflow queue must
irrevocably delete some item in order to make space for a new one.
* Add re-org scheduler
details:
Generates instructions how to extract and merge some leading entries
* Add filter ID selector
details:
This allows to find the next filter now newer that a given filter ID
* Message update
* Rename FilterID => QueueID
why:
The current usage does not identify a particular filter but uses it as
storage tag to manage it on the database (to be organised in a set of
FIFOs or queues.)
* Split `aristo_filter` source into sub-files
why:
Make space for filter management API
* Store filter queue IDs in pairs on the backend
why:
Any pair will will describe a FIFO accessed by bottom/top IDs
* Reorg some source file names
why:
The "aristo_" prefix for make local/private files is tedious to
use, so removed.
* Implement filter slot scheduler
details:
Filters will be stored on the database on cascaded FIFOs. When a FIFO
queue is full, some filter items are bundled together and stored on the
next FIFO.
* Remove unused unit test sources
* Redefine and document serialised data records for Aristo backend
why:
Unique record types determined by marker byte, i.e. the last byte of a
serialisation record. This just needed some tweaking after adding new
record types.
* Removed dedicated transcoder tests
why:
will implicitely be provided by other tests:
+ encode/write -> hashify -> test_tx
+ decode/read -> merge raw nodes -> test_tx
+ de/blobfiy -> backend operations, taext_tx, test_backend, test_filter
* Clarify how the vertex ID generator state is accessed from the backend
why:
This state is a list of unused vertex IDs. It was just stored somewhere
on the backend which details were exposed when iterating over some
sub-table(s).
As there will be more such single information records, an admin
sub-tables has been defined (formerly ID generator table) with dedicated
access keys and type. Also, the iterator over the single ID generator
state item has been removed. It must be accessed via the `get()`
interface.
* Remove trailing space from file name
why:
fixes windows bail out
* Remove concept of empty/blind filters
why:
Not needed. A non-existent filter is is coded as a nil reference.
* Slightly generalised backend iterators
why:
* VertexID as key for the ID generator state makes no sense
* there will be more tables addressed by non-VertexID keys
* Store serialised/blobified vertices on memory backend
why:
This is more in line with the RocksDB backend so more appropriate
for testing when comparing behaviour. For a speedy memory database,
a backend-less variant should be used.
* Drop the `Aristo` prefix from names `AristoLayerRef`, etc.
* Suppress compiler warning
why:
duplicate imports
* Add filter serialisation transcoder
why:
Will be used as storage format
* Fix hashing algorithm
why:
Particular case where a sub-tree is on the backend, linked by an
Extension vertex to the top level.
* Update backend verification to report `dirty` top layer
* Implement distributed merge of backend filters
* Implement distributed backend access management
details:
Implemented and tested as described in chapter 5 of the `README.md`
file.
* Renamed type `NoneBackendRef` => `VoidBackendRef`
* Clarify names: `BE=filter+backend` and `UBE=backend (unfiltered)`
why:
Most functions used full names as `getVtxUnfilteredBackend()` or
`getKeyBackend()`. After defining abbreviations (and its meaning) it
seems easier to use `getVtxUBE()` and `getKeyBE()`.
* Integrate `hashify()` process into transaction logic
why:
Is now transparent unless explicitly controlled.
details:
Cache changes imply setting a `dirty` flag which in turn triggers
`hashify()` processing in transaction and `pack()` directives.
* Removed `aristo_tx.exec()` directive
why:
Inconsistent implementation, functionality will be provided with a
different paradigm.
* Provide deep copy for each transaction layer
why:
Localising changes. Selective deep copy was just overlooked.
* Generalise vertex ID generator state reorg function `vidReorg()`
why:
makes it somewhat easier to handle when saving layers.
* Provide dummy back end descriptor `NoneBackendRef`
* Optional read-only filter between backend and transaction cache
why:
Some staging area for accumulating changes to the backend DB. This
will eventually be an access layer for emulating a backend with
multiple/historic state roots.
* Re-factor `persistent()` with filter between backend/tx-cache => `stow()`
why:
The filter provides an abstraction from the physically stored data on
disk. So, there can be several MPT instances using the same disk data
with different state roots. Of course, all the MPT instances should
not differ too much for practical reasons :).
TODO:
Filter administration tools need to be provided.
* Better error handling
why:
Bail out on some error as early as possible before any changes.
* Implement `fetch()` as opposite of `merge()`
rationale:
In the `Aristo` realm, the action named `fetch()` and `merge()` indicate
leaf value related actions on the MPT, while actions `get()` and `put()`
handle vertex or hash key related operations that constitute the MPT.
* Re-factor `merge()` prototypes
why:
The most used variant of `merge()` should have the simplest prototype.
* Persistent DB constructor needs to import `aristo/aristo_init/persistent`
why:
Most applications use memory DB anyway. This avoids linking `-lrocksdb`
or any other back end libraries by default.
* Re-factor transaction module
why:
Got the paradigm wrong. The transaction descriptor did replace the
database one but should be handled separately.
* Nimbus folder environment update
details:
* Integrated `CoreDbRef` for the sources in the `nimbus` sub-folder.
* The `nimbus` program does not compile yet as it needs the updates
in the parallel `stateless` sub-folder.
* Stateless environment update
details:
* Integrated `CoreDbRef` for the sources in the `stateless` sub-folder.
* The `nimbus` program compiles now.
* Premix environment update
details:
* Integrated `CoreDbRef` for the sources in the `premix` sub-folder.
* Fluffy environment update
details:
* Integrated `CoreDbRef` for the sources in the `fluffy` sub-folder.
* Tools environment update
details:
* Integrated `CoreDbRef` for the sources in the `tools` sub-folder.
* Nodocker environment update
details:
* Integrated `CoreDbRef` for the sources in the
`hive_integration/nodocker` sub-folder.
* Tests environment update
details:
* Integrated `CoreDbRef` for the sources in the `tests` sub-folder.
* The unit tests compile and run cleanly now.
* Generalise `CoreDbRef` to any `select_backend` supported database
why:
Generalisation was just missed due to overcoming some compiler oddity
which was tied to rocksdb for testing.
* Suppress compiler warning for `newChainDB()`
why:
Warning was added to this function which must be wrapped so that
any `CatchableError` is re-raised as `Defect`.
* Split off persistent `CoreDbRef` constructor into separate file
why:
This allows to compile a memory only database version without linking
the backend library.
* Use memory `CoreDbRef` database by default
detail:
Persistent DB constructor needs to import `db/core_db/persistent
why:
Most tests use memory DB anyway. This avoids linking `-lrocksdb` or
any other backend by default.
* fix `toLegacyBackend()` availability check
why:
got garbled after memory/persistent split.
* Clarify raw access to MPT for snap sync handler
why:
Logically, `kvt` is not the raw access for the hexary trie (although
this holds for the legacy database)
why:
* Resolves some compiler coughing when it bails out on persitent
db constructor inside `test()` caluses (works perfectly outside.)
* API looks cleaner and better to maintain for the price of slightly
more work at the backend
* Remove 32bit os support from `custom_network` unit test
also:
* Fix compilation annoyance #1648
* Fix unit test on Kiln (changed `merge` logic?)
* Hide unused sources do not compile
why:
* Get them out of the way before major update
* Import and function prototype mismatch -- maybe some changes got out
of scope.
* Re-implemented `db_chain` as `core_db`
why:
Hiding `TrieDatabaseRef` and `HexaryTrie` by default allows to replace
the current db wrapper by some other one, e.g. Aristo
* Support compiler exception warnings for CoreDbRef base methods.
* Allow `pairs()` iterator on all memory based key-value tables
why:
Previously only available for capture recorder.
* Backport `chain_db.nim` changes into its re-implementation `core_apps.nim`
* Fix exception annotation
* Misc fixes
detail:
* Fix de-serialisation for account leafs
* Update node recovery from unit tests
* Remove `LegacyAccount` from `PayloadRef` object
why:
Legacy accounts use a hash key as storage root which is detrimental
to the working of the Aristo database which uses a vertex ID.
* Dissolve `hashify_helper` into `aristo_utils` and `aristo_transcode`
why:
Functions are of general interest so they should live in first level
code files.
* Added left/right iterators over leaf nodes
* Some helper/wrapper functions that might be useful
why:
For the main tree with root vertex ID 1, the leaf nodes hold the
account data. These accounts may link to sub trees the storage root
node ID of which must be registered here. There is no reverse key
lookup on the backend.
note:
These definitions are experimental. Also, there are some tests missing
for validating Payload data conversions.
* Provide transaction based interface for standard operations
* Provide unit tests for new Aristo interface using transactions
details:
These new tests combine and replace several single-purpose tests.
The now unused test sources will be kept for a while to be eventually
removed.
* Slightly tighten some self-check conditions
* Redefined the database descriptor object as reference (to the object)
why:
The upcoming transaction wrapper will work with a database reference
rather than the object itself
* Append state before `save()` to the Aristo descriptor
why:
This stae was previously returned by the function. Appending it to
a field of the Aristo descriptor seems easier to handle.
* Fix missing branch checks in transcoder
why:
Symmetry problem. `Blobify()` allowed for encoding degenerate branch
vertices while `Deblobify()` rejected decoding wrongly encoded data.
* Update memory backend so that it rejects storing bogus vertices.
why:
Error behaviour made similar to the rocks DB backend.
* Make sure that leaf vertex IDs are not repurposed
why:
This makes it easier to record leaf node changes
* Update error return code for next()/right() traversal
why:
Returning offending vertex ID (besides error code) helps debugging
* Update Merkle hasher for deleted nodes
why:
Not implemented, yet
also:
Provide cache & backend consistency check functions. This was
partly re-implemented from `hashifyCheck()`
* Simplify some unit tests
* Fix delete function
why:
Was conceptually wrong
* Added missing deferred cleanup directive to sub-test functions
why:
Rocksdb keeps the files locked for a short while leading to errors. This
was previously solved my using different db sub-directories
* Provide vertex deep-copy function globally.
why:
is just handy
* Avoid unnecessary vertex caching when merging proof nodes
also:
Run all merge tests on the rocksdb backend
Previously, proof node tests were run without backend
* Fix vertex ID generator state handling for rocksdb backend
why:
* Key error in walk iterator
* Needs to be loaded when opening the database
* Use non-zero sub-table prefixes for rocksdb
why:
Handy for debugging
* Fix error code for missing key on rocksdb backend
why:
Previously returned `VOID_HASH_KEY` rather than `GetKeyNotFound`
* Explicitly copy vertex data between internal table and function/result argument
why:
Function argument or return reference may still refer to the same data
object.
* Updated error symbols
why:
Error symbol names for the hike module now start with the prefix `Hike`.
* Write back modified branch node into local top layer cache
why:
With the backend available, the source of the branch node references
might not be the top layer cache. So any change must be explicitely
recorded.
* Generalised Aristo DB constructor for any type of backend
details:
* Records to be deleted are represented as key-void (rather than
key-value) pairs by the put-function arguments
* Allow direct driver access, iterators as example implementation and
for testing.
* Provide backend storage interface
details:
Stores the top layer onto backend tables
* Implemented Rocks DB backend
details:
Transaction based `put()` functionality
Iterators (based on direct RocksDB access)
* Fix include
why:
Eth67 not default yet so that got missed
* Rename `LeafKey` => `LeafTie`
why:
Name is a pen picture of what this object is for. Also, it avoids the
ubiquitous term `key`.
* Provided `getOrVoid()` wrapper for `getOrDefault()`
also:
Provide `isValid()` syntactic sugar for `.isNil.not`, `!= 0` etc.
Reorg descriptor source, split into sub-sources
* Bundled `NodeKey` objects with root ID and called it `HashLabel`
why:
`NodeKey` (aka repurposed Hash265) objects are unique only within a
particular sub-trie (e.g. storage slots) which are kept separated
(i.e non-interleaved) by design. This is not applied to the backend
as the map VertexID->NodeKey labelling the nodes needs not be injective.
For the in-memory database (transaction) layers, the injective map
VertexID->(VertexID,NodeKey) is used where the first field of the image
tuple is the root ID of the sub-trie the `NodeKey` object is valid. So
identical storage tries for different accounts can be represented.
* Exclude some storage tests
why:
These test running on external dumps slipped through. The particular
dumps were reported earlier as somehow dodgy.
This was changed in `#1457` but having a second look, the change on
hexary_interpolate.nim(350) might be incorrect.
* Redesign `Aristo DB` descriptor for transaction based layers
why:
Previous descriptor layout made it cumbersome to push/pop
database delta layers.
The new architecture keeps each layer with the full delta set
relative to the database backend.
* Keep root ID as part of the `Patricia Trie` leaf path
why;
That way, forests are supported
* Fix missing Merkle key removal in `merge()`
* Accept optional root hash argument in `hashify()`
why:
For importing a full database, there will be no proof data except the
root key. So this can be used to check and set the root key in the
database descriptor.
also:
Associate vertex ID to `hashify()` error return code
* Added Aristo Trie traversal function
why:
* step along leaf vertices in sorted order
* tree/trie consistency checks when debugging
* Enabled storage slots test data for Aristo DB
* Keep vertex ID generator state with each db-layer
why:
The vertex ID generator state is part of the difference to the below
layer
* Move otherwise unused source to test directory
* Add Merkle hash generator
also:
* Verification facility for debugging
* Empty Merkle key hashes encoded as `EMPTY_ROOT_HASH`
details:
1. Merging a leaf vertex merges a `Patricia Trie` path (while
adding/modiying vertices) and adds a leaf node with payload
2. Merging a Merkel node merges a single vertex to the `Patricia Trie`
and registers merkel hashes
3. Action 2 can be used before action 1 in order to construct a
Merkel proof as required for handling `snap/1` data.
4. Unit tests show that action 3 is benign for now :)
* Cosmetics, renamed fields (eVtx, bVtx) -> (eVid, bVid)
* Multilayered delta architecture for Aristo DB
details:
Any VertexID or data retrieval needs to go down the rabbit hole and
fetch/get/manipulate the bottom layer -- even without explicit
backend.
* Direct reference to backend from top-level layer
why:
Some services as the vid management needs to be synchronised among all
layers. So access is optimised.
* Experimental MP-trie
why:
Deleting records is a infeasible with the current structure
* Added vertex ID recycling management
Todo:
Provide some unit tests
* DB layout update
why:
Main news is the separation of `Merkel` hashes into an extra table.
details:
The code fragments cover conversion between compact MPT records and
Aristo DB records as well as some rudimentary cache handling for
the `Merkel` hashes (i.e. the extra table entries.)
todo:
Add some simple unit test for the descriptor record (currently used
for vertex ID management, only.)
* Updated vertex ID recycling management
details:
added simple unit tests (mainly testing ABI)
* docu update
* Recreating some of the speculative-execution code.
Not really using it yet. Also there's some new inefficiency in
memory.nim, but it's fixable - just haven't gotten around to it yet.
The big thing introduced here is the idea of "cells" for stack,
memory, and storage values. A cell is basically just a Future (though
there's also the option of making it an Identity - just a simple
distinct wrapper around a value - if you want to turn off the
asynchrony).
* Bumped nim-eth.
* Cleaned up a few comments.
* Bumped nim-secp256k1.
* Oops.
* Fixing a few compiler errors that show up with EVMC enabled.
* Update sync scheduler pool mode
why:
The pool mode allows to loop over active peers one after another. This
is ideal for soft re-starting peers. As this is a two tier experience
(start/stop, setup/release) the loop must be run twice. This is
controlled by a more rigid re-definition of how to use the `poolMode`
flag.
* Mitigate RLP serialiser deficiency
why:
Currently, serialising the `BlockBody` in not conevrtible and need
to be checked in the `eth` module. Currently a local fix for the
wire protocol applies. Unit tests will stay (after this local solution
will have been removed.)
* Code cosmetics and massage
details:
Main part is `types.toStr()` as a unified function for logging block
numbers.
* Allow to use a logical genesis replacement (start of history)
why:
Snap sync will set up an arbitrary pivot at a block number different
from zero. In fact, the higher the block number the better.
details:
A non-genesis start of history will currently only affect the score
values which were derived from the difficulty.
* Provide function to store the snap pivot block header in chain db
why:
Together with the start of history facility, this allows to proceed
with full syncing once snap has finished.
details:
Snap db storage was switched from a sub-tables to the flat chain db.
* Provide database completeness and sanity checker
details:
For debugging on smaller databases, only
* Implement snap -> full sync switch
t8n: a silly bug contract address generator, should use original
tx nonce instead of read the nonce from sender address in state db.
Although in EVM contract address generated by reading nonce from state db
is correct, outside EVM that nonce value might have been modified,
thus generating incorrect contract address.
accounts cache: when clearing account storage, the originalValue
cache is not cleared, only the storageRoot set to empty storage root,
this will cause getStorage and getCommitedStorage return wrong value
if the originalValue cache contains old value.
simplify EVM and delegete those things to accounts cache.
also no more manual state clearing, accounts cache will be
responsible for both collecting touched account and perform
state clearing.
* Gwei conversion should use u256 because u64 can overflow.
* Make withdrawals follow the EIP-158 state-clearing rules.
(i.e. Empty accounts should be deleted.)
* Allow the zero address in normalizeNumber.
(Necessary for one of the new withdrawals-related tests.)
* Silence some compiler gossip -- part 5, common
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 6, db, rpc, utils
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 7, randomly collected source files
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 8, assorted tests
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Clique update
why:
More impossible exceptions (undoes temporary fix from previous PR)
* Updated to the latest nim-eth, nim-rocksdb, nim-web3
* Bump nimbus-eth2 module and fix related issues
Temporarily disabling Portal beacon light client network as it is
a lot of copy pasted code that did not yet take into account
forks. This will require a bigger rework and was not yet tested
in an actual network anyhow.
* More nimbus fixes after module bumps
---------
Co-authored-by: Adam Spitz <adamspitz@status.im>
Co-authored-by: jangko <jangko128@gmail.com>
Two unresolved items currently:
- Three tests that are temporarily disabled as they fail in the
macro_assembler code, which seems to be due to an ambigious
identifier Stop (Ops and chronos ServerCommand enum).
- i386 CI disabled as it fails at Nim compilation already. Failed
tests where already ignored for this target.
* Piecemeal trie inspection
details:
Trie inspection will stop after maximum number of nodes visited.
The inspection can be resumed using the returned state from the
last session.
why:
This feature allows for task switch between `piecemeal` sessions.
* Extract pivot helper code from `worker.nim` => `pivot_helper.nim`
* Accounts import will now return dangling paths from `proof` nodes
why:
With proper bookkeeping, this can be used to start healing without
analysing the the probably full trie.
* Update `unprocessed` account range handling
why:
More generally, the API of a pairs of unprocessed intervals favours
the first set and not before that is exhausted the second set comes
into play.
This was unfortunately implemented which caused the ranges to be
unnecessarily fractioned. Now the number of range interval typically
remains in the lower single digit numbers.
* Save sync state after end of downloading some accounts
details:
restore/resume to be implemented later
* Update log ticker, using time interval rather than ticker count
why:
Counting and logging ticker occurrences is inherently imprecise. So
time intervals are used.
* Use separate storage tables for snap sync data
* Left boundary proof update
why:
Was not properly implemented, yet.
* Capture pivot in peer worker (aka buddy) tasks
why:
The pivot environment is linked to the `buddy` descriptor. While
there is a task switch, the pivot may change. So it is passed on as
function argument `env` rather than retrieved from the buddy at
the start of a sub-function.
* Split queues `fetchStorage` into `fetchStorageFull` and `fetchStoragePart`
* Remove obsolete account range returned from `GetAccountRange` message
why:
Handler returned the wrong right value of the range. This range was
for convenience, only.
* Prioritise storage slots if the queue becomes large
why:
Currently, accounts processing is prioritised up until all accounts
are downloaded. The new prioritisation has two thresholds for
+ start processing storage slots with a new worker
+ stop account processing and switch to storage processing
also:
Provide api for `SnapTodoRanges` pair of range sets in `worker_desc.nim`
* Generalise left boundary proof for accounts or storage slots.
why:
Detailed explanation how this works is documented with
`snapdb_accounts.importAccounts()`.
Instead of enforcing a left boundary proof (which is still the default),
the importer functions return a list of `holes` (aka node paths) found in
the argument ranges of leaf nodes. This in turn is used by the book
keeping software for data download.
* Forgot to pass on variable in function wrapper
also:
+ Start healing not before 99% accounts covered (previously 95%)
+ Logging updated/prettified
* Provided common scheduler API, applied to `full` sync
* Use hexary trie as storage for proofs_db records
also:
+ Store metadata with account for keeping track of account state
+ add iterator over accounts
* Common scheduler API applied to `snap` sync
* Prepare for accounts bulk import
details:
+ Added some ad-hoc checks for proving accounts data received from the
snap/1 (will be replaced by proper database version when ready)
+ Added code that dumps some of the received snap/1 data into a file
(turned of by default, see `worker_desc.nim`)
* Relocated `IntervalSets` to nim-stew repo
* Accumulate accounts on temporary kv-DB
why:
Explore the data as returned from snap/1. Will be converted to a
`eth/db` next.
details:
Verify and accumulate per/state-root accounts downloaded via snap.
also:
Some unit tests
* Replace `Table` by `TrieDatabaseRef` for accounts accumulator
* update ticker statistics
details:
mean/variance based counter update
* allow persistent db for proved accounts
* rebase, and globally activate unit test
* fix statistics
Includes a simple test harness for the merge interop M1 milestone
This aims to enable connecting nimbus-eth2 to nimbus-eth1 within
the testing protocol described here:
https://github.com/status-im/nimbus-eth2/blob/amphora-merge-interop/docs/interop_merge.md
To execute the work-in-progress test, please run:
In terminal 1:
tests/amphora/launch-nimbus.sh
In terminal 2:
tests/amphora/check-merge-test-vectors.sh
* Redesign of BaseVMState descriptor
why:
BaseVMState provides an environment for executing transactions. The
current descriptor also provides data that cannot generally be known
within the execution environment, e.g. the total gasUsed which is
available not before after all transactions have finished.
Also, the BaseVMState constructor has been replaced by a constructor
that does not need pre-initialised input of the account database.
also:
Previous constructor and some fields are provided with a deprecated
annotation (producing a lot of noise.)
* Replace legacy directives in production sources
* Replace legacy directives in unit test sources
* fix CI (missing premix update)
* Remove legacy directives
* chase CI problem
* rebased
* Re-introduce 'AccountsCache' constructor optimisation for 'BaseVmState' re-initialisation
why:
Constructing a new 'AccountsCache' descriptor can be avoided sometimes
when the current state root is properly positioned already. Such a
feature existed already as the update function 'initStateDB()' for the
'BaseChanDB' where the accounts cache was linked into this desctiptor.
The function 'initStateDB()' was removed and re-implemented into the
'BaseVmState' constructor without optimisation. The old version was of
restricted use as a wrong accounts cache state would unconditionally
throw an exception rather than conceptually ask for a remedy.
The optimised 'BaseVmState' re-initialisation has been implemented for
the 'persistBlocks()' function.
also:
moved some test helpers to 'test/replay' folder
* Remove unused & undocumented fields from Chain descriptor
why:
Reduces attack surface in general & improves reading the code.
previously, every time the VMState was created, it will also create
new stateDB, and this action will nullify the advantages of cached accounts.
the new changes will conserve the accounts cache if the executed blocks
are contiguous. if not the stateDB need to be reinited.
this changes also allow rpcCallEvm and rpcEstimateGas executed properly
using current stateDB instead of creating new one each time they are called.
this is a preparation for migration to confutils based config
although there is still some getConfiguration usage in tests code
it will be removed after new config arrived
instead of using stdlib/json, now we switch to json_serialization
the result is much tidier code and more robust when parsing
optional fields.
fixes#635
instead of using header as input param, now getReceipts using
receiptRoot hash, the intention is clearer and less data passed around
when we only using receiptRoot instead of whole block header.