`initTable` is obsolete since nim 0.19 and can introduce significant
memory overhead while providing no benefit (since the table will be
grown to the default initial size on first use anyway).
In particular, aristo layers will not necessarily use all tables they
initialize, for exampe when many empty accounts are being created.
* Cleanup unneeded stateless and block witness code. Keeping MultiKeys which is used in the eth_getProofsByBlockNumber RPC endpoint which is needed for the Fluffy state network bridge.
* Rename generateWitness flag to collectWitnessData to better describe what the flag does. We only collect the keys of the touched accounts and storage slots but no block witness generation is supported for now.
* Move remaining stateless code into nimbus directory.
* Add vmstate parameter to ChainRef to fix test.
* Exclude *.in from check copyright year
---------
Co-authored-by: jangko <jangko128@gmail.com>
* Code cosmetics
* Re-org `aristo_merge`, internally split into sub-modules
why:
Became a burden for maintenance because it hosts two different
functionalities under the same merge paradigm: account/data merge
and snap proof merge where the latter produces a partial trie.
* Fix CoreDb tracer
* Ledger: fix potential account vs. storage tree sync problems
* Remove bound on the size of removable whole storage trees
* Activate `test_tracer_json`
* CoreDb: Remove crufty second/off-site KVT
why:
Was used to allow late `Clique` to store directly to disk
* CoreDb: Remove prune flag related functionality
why:
Is completely legacy stuff
* CoreDb: Remove dependence on legacy API (tests unsupported yet)
why:
Does not fully support Aristo
* Re-factoring `state_db` using new API
details:
Only minimum changes needed to compile `nimbus`
* Update tests and aux modules
* Turn off legacy API and remove `distinct_tries`
comment:
The legacy API has now cruft status, will be removed soon
* Fix copyright years
* Update rpc for verified proxy
---------
Co-authored-by: Jacek Sieka <jacek@status.im>
`persist` is a hotspot when processing blocks because it is run at least
once per transaction and loops over the entire account cache every time.
Here, we introduce an extra `dirty` map that keeps track of all accounts
that need checking during `persist` which fixes the immediate
inefficiency, though probably this could benefit from a more thorough
review - we also get rid of the unused clearCache flag - we start with
a fresh cache on every fresh vmState.
* avoid unnecessary code hash comparisons
* avoid unnecessary copies when iterating
* use EMPTY_CODE_HASH throughout for code hash comparison
* Update TDD suite logger output format choices
why:
New format is not practical for TDD as it just dumps data across a wide
range (considerably larder than 80 columns.)
So the new format can be turned on by function argument.
* Update unit tests samples configuration
why:
Slightly changed the way to find the `era1` directory
* Remove compiler warnings (fix deprecated expressions and phrases)
* Update `Aristo` debugging tools
* Always update the `storageID` field of account leaf vertices
why:
Storage tries are weekly linked to an account leaf object in that
the `storageID` field is updated by the application.
Previously, `Aristo` verified that leaf objects make sense when passed
to the database. As a consequence
* the database was inconsistent for a short while
* the burden for correctness was all on the application which led
to delayed error handling which is hard to debug.
So `Aristo` will internally update the account leaf objects so that
there are no race conditions due to the storage trie handling
* Aristo: Let `stow()`/`persist()` bail out unless there is a `VertexID(1)`
why:
The journal and filter logic depends on the hash of the `VertexID(1)`
which is commonly known as the state root. This implies that all
changes to the database are somehow related to that.
* Make sure that a `Ledger` account does not overwrite the storage trie reference
why:
Due to the abstraction of a sub-trie (now referred to as column with a
hash describing its state) there was a weakness in the `Aristo` handler
where an account leaf could be overwritten though changing the validity
of the database. This has been changed and the database will now reject
such changes.
This patch fixes the behaviour on the application layer. In particular,
the column handle returned by the `CoreDb` needs to be updated by
the `Aristo` database state. This mitigates the problem that a storage
trie might have vanished or re-apperaed with a different vertex ID.
* Fix sub-trie deletion test
why:
Was originally hinged on `VertexID(1)` which cannot be wholesale
deleted anymore after the last Aristo update. Also, running with
`VertexID(2)` needs an artificial `VertexID(1)` for making `stow()`
or `persist()` work.
* Cosmetics
* Activate `test_generalstate_json`
* Temporarily `deactivate test_tracer_json`
* Fix copyright header
---------
Co-authored-by: jordan <jordan@dry.pudding>
Co-authored-by: Jacek Sieka <jacek@status.im>
* Remove crufty `pruneTrie` arguments
* Replaced legacy `distinct_trie` logic by new `ledger` functionality
why:
The module `distinct_trie` is supported by `Aristo` in trivial cases.
* Activate `test_op_memory`
* Attempt to roll back stateless mode implementation in a single PR
why:
+ Stateless mode is not fully working and in the way
+ Single PR should make it feasible to investigate for a possible
re-implementation
* Fix copyright year
* Fix annotation for exception (evmc mode)
* Update some docu & messages
* Remove cruft from the ledger modules
* Must not overwrite genesis data on an initialised database
why:
This will overwrite the global state of the Aristo single state DB.
Otherwise resuming at the last synced state becomes impossible.
* Provide latest block number from journal
why:
This relates the global state of the DB directly to the corresponding
block number.
* Implemented unit test providing DB pre-load and resume
* Code cosmetics
* Aristo+Kvt: Fix api wrappers
why:
Api setup killed the backend descriptor when backend mapping was
disabled.
* Aristo: Implement masked profiling entries
why:
Database backend should be listed but not counted in tally
* CoreDb: Simplify backend() methods
why:
DBMS backend access Was provided very early and over engineered. Now
there are only two backend machines, one for `Kvt` and the other one
for an `Mpt` available only via new API.
* CoreDb: Code cleanup regarding descriptor types
* CoreDb: Refactor/redefine `persistent()` methods
why:
There were `persistent()` methods for any type of caching storage
facilities `Kvt`, `Mpt`, `Phk`, and `Acc`. Now there is only a single
`persistent()` method storing all facilities in tandem (similar to
how transactions work.)
For non shared `Kvt` tables, there is now an extra storage method
`saveOffSite()`.
* CoreDb lingo update: `trie` becomes `column`
why:
Notion of a `trie` is pretty much hidden by the new `CoreDb` api.
Revealed are sort of database columns for accounts an storage data,
any of which have an internal state represented by a Keccack hash.
So a `trie` or `MPT` becomes a `column` and a `rootHash` becomes a
column state.
* Aristo: rename backend filed `filters` => `journal`
* Update full sync logging
details:
+ Disable eth handler noise while syncing
+ Log journal depth (if available)
* Fix copyright year
* Fix cruft and unwanted imports
* Update README
* Nimbus-main: replaced `PruneMode` options by `ChainDbMode` options
details:
For the legacy database, this changes the phrase
- `conf.pruneMode == PruneMode.Full` to the expression
+ `conf.chainDbMode == ChainDbMode.Prune`.
* Fix issues moaned about by NIM compiler
* Fix copyright year
* Kvt: Update API hooks
* Aristo: Generalised merging snap proofs, now for multiple state roots
why:
This accommodates pre-loading partial tries for unit tests
* Aristo: Update some unit tests
* CoreDb+Aristo: Re-factor tracer
why:
Was bonkers anyway. The main change is that the trace journal is now
kept in a way similar to a transaction layer so that it can predictably
interact with DB transactions.
* Ledger: Debugging helper
* Update tracer unit test applicable for `Aristo`
* Fix copyright year
* Disable `dump()` function as compile time default
why:
This needs to pull in the `rocks_db` library at compile time.
* Remove cruft
* Docu/code cosmetics
* Aristo: Update `forkBase()`
why:
Was not up to the job
* Update/correct tracer for running against `Aristo`
details:
This patch makes sure that before creating a new `BaseVMState` the
`CoreDb` context is adjusted to accommodate for the state root that
is passed to the `BaseVMState` constructor.
* CpreDb+legacy: Always return current context with `ctxFromTx()`
why:
There was an experimental setting trying to find the node with the
proper setting in the KVT (not the hexary tie layer) which currently
does not work reliable, probably due to `Ledger` caching effects.
* Fix 'value out of range' RangeDefect caused by large/expensive blocks/transactions during DOS period.
* Clear witness cache in AccountCache persist.
* Revert previous fix and force clear cache after processing each block.
* Revert clear cache in process block.
* CoreDb+Ledger: Update logging
why:
Use symbol `api` rather than `ctx` because the latter will be used
as name for particular objects
* CoreDb: Remove cruft
* CoreDb: Remove `TxID` support
why:
It is nowhere used and ugly implemented. The upcoming context layer
will be a cleaner alternative to use, instead should this particular
functionality be needed.
* CoreDb: Rearrange base methods in source code for better reading
* CoreDb+Aristo: Update API closures for better reading & maintenance
* CoreDb: Implement context layer for MPT
why:
On `Aristo` the context layer allows to manage different views on
the same backend database. This is an abstraction of the legacy
hexary trie which can be localised on a particular root nose.
details:
The `ctx` context provides the state (equiv. to state root) of the
database for MPT and account descriptors.
* Fix Copyright headers
* Aristo+Kvt: Fix backend `dup()` function in api setup
why:
Backend object is subject to an inheritance cascade which was not
taken care of, before. Only the base object was duplicated.
* Kvt: Simplify DB clone/peers management
* Aristo: Simplify DB clone/peers management
* Aristo: Adjust unit test for working with memory DB only
why:
This currently causes some memory corruption persumably in the
`libc` background layer.
* CoredDb+Kvt: Simplify API for KVT
why:
Simplified storage models (was over engineered) for better performance
and code maintenance.
* CoredDb+Aristo: Simplify API for `Aristo`
why:
Only single database state needed here. Accessing a similar state will
be implemented from outside this module using a context layer. This
gives better performance and improves code maintenance.
* Fix Copyright headers
* CoreDb: Turn off API tracking
why:
CI would ot go through. Was accidentally turned on.
* Aristo/Kvt: Provide function hooks APIs
why:
These APIs can be used for installing tracers, profiling functoinality,
and other niceties on the databases.
* Aristo: Provide optional API profiling
details:
It basically is a re-implementation of the `CoreDb` profiling
implementation
* Kvt: Provide optional API profiling similar to `Aristo`
* CoreDb: Re-implementing profiling using `aristo_profile`
* Ledger: Re-implementing profiling using `aristo_profile`
* CoreDb: Update unit tests for maintainability
* update copyright dates
* Aristo: Update schedule runner for `hashify()`
why:
Width-first schedule walker overlooked resolvable vertex and ended in
a deadlock situation
* CoreDb+Aristo: Update error code for fringe condition
* Ledger: Remove redundant extra MPT hashing statements
why:
These statement were used for troubleshooting
* CoreDb: Couch medicine for legacy DB backend
why:
MPT will occasionally enter inconsistent states after deletion, some
dirty fix is randomly added to mitgate that.
* CoreDb: Test module with additional sample selector cmd line options
* Aristo: Do not automatically remove a storage trie with the account
why:
This is an unnecessary side effect. Rather than using an automatism, a
a storage root must be deleted manually.
* Aristo: Can handle stale storage root vertex IDs as empty IDs.
why:
This is currently needed for the ledger API supporting both, a legacy
and the `Aristo` database backend.
This feature can be disabled at compile time by re-setting the
`LOOSE_STORAGE_TRIE_COUPLING` flag in the `aristo_constants` module.
* CoreDb+Aristo: Flush/delete storage trie when deleting account
why:
On either backend, a deleted account leave a dangling storage trie on
the database.
For consistency nn the legacy backend, storage tries must not be
deleted as they might be shared by several accounts whereas on `Aristo`
they are always unique.
* Update experimental rpc test to do further validation on proof responses.
* Enable zero value storage slots in the witness cache data so that proofs will be returned when a storage slot is updated to zero. Refactor and simplify implementation of getProofs endpoint.
* Improve test validation.
* Minor fixes and added test to track the changes introduced in every block using a local state.
* Refactor and cleanup test.
* Comments added to test and account cache fixes applied to account ledger.
* Return updated storage slots even when storage is empty and add test to build.
* Fix copyright and remove incorrect depth check during witness building in writeShortRlp assertion.
* Aristo: Update error return code
why:
Failing of `Aristo` function `delete()` might fail because there is
no such data item on the db. This must return a single error code
as is done with `fetch()`.
* Ledger: Better error handling
why:
The `expect()` clauses have been replaced by raising asserts indicating
the error from the database backend.
Also, `delete()` failures are legitimate if the item to delete does not
exist.
* Aristo: Delete function must always leave a label on DB for `hashify()`
why:
The `hashify()` uses the labels left bu `merge()` and `delete()` to
compile (and optimise) a scheduler for subsequent hashing.
Originally, the labels were not used for deleted entries and `delete()`
still had some edge case where the deletion label was not properly
handled.
* Aristo: Update `hashify()` scheduler, remove buggy optimisation
why:
Was left over from version without virtual state roots which did not
know about account payload leaf vertices referring to storage roots.
* Aristo: Label storage trie account in `delete()` similar to `merge()`
details;
The `delete()` function applied to a non-static state root (assumed
to be a storage root) will check the payload of an accounts leaf
and mark its Merkle keys to be re-checked when runninh `hashify()`
* Aristo: Clean up and re-org recycled vertex IDs in `hashify()`
why:
Re-organising the recycled vertex IDs list intends to reduce the size of the
list.
This list is organised as a LIFO (or stack.) By reorganising it in a way
so that the least vertex ID numbers are on top, the list will be kept
smaller as observed on some examples (less than 30%.)
* CoreDb: Accept storage trie deletion requests in non-initialised state
why:
Due to lazy initialisation, the root vertex ID might not yet exist. So
the `Aristo` database handlers would reject this call with an error and
this condition needs to be handled by the API (which realises the lazy
feature.)
* Cosmetics & code massage, prettify logging
* fix missing import
* CoreDb: Improve API and API tracking
why:
Now logs state roots where appropriate
* CoreDb: re-implement `CoreDbVidRef` => `CoreDbTrieRef`
why:
Instead of a root vertex ID wrapper, the purpose of this object type
has been upgrades to a sub-trie prototype.
caveat:
`Aristo` backend not fully functional, yet.
* CoreDb: Update `Aristo` backend
why:
Supports virtual sub-tries
* CoreDb: Account address tracking for `StorageTrie` virtual tries
details:
Supported with API tracking/logging
* CoreDb: Keep account address in payload object
why:
No need to provide extra address argument for `merge()`, also
provides tracking possibility for account debugging.
* Ledger: Update new API for `Aristo` specific storage trie handling
* CoreDb+Ledger: Update unit tests
* Fix copyright headers
* Return slots when verifying witness regardless of code length.
* Prevent AccountCache WitnessData codeTouched from being reset.
* Default to using {wfNoFlag} for witness flags in tests to allow running with data from before EIP170.
* Add additional json files to experimental JSON RPC test.
* Use HTTP RPC server in tests.
* Add test to check that block witness contains bytecode.
* Aristo: Re-phrase `LayerDelta` and `LayerFinal` as object references
why:
Avoids copying in some cases
* Fix copyright header
* Aristo: Verify `leafTie.root` function argument for `merge()` proc
why:
Zero root will lead to inconsistent DB entry
* Aristo: Update failure condition for hash labels compiler `hashify()`
why:
Node need not be rejected as long as links are on the schedule. In
that case, `redo[]` is to become `wff.base[]` at a later stage.
This amends an earlier fix, part of #1952 by also testing against
the target nodes of the `wff.base[]` sets.
* Aristo: Add storage root glue record to `hashify()` schedule
why:
An account leaf node might refer to a non-resolvable storage root ID.
Storage root node chains will end up at the storage root. So the link
`storage-root->account-leaf` needs an extra item in the schedule.
* Aristo: fix error code returned by `fetchPayload()`
details:
Final error code is implied by the error code form the `hikeUp()`
function.
* CoreDb: Discard `createOk` argument in API `getRoot()` function
why:
Not needed for the legacy DB. For the `Arsto` DB, a lazy approach is
implemented where a stprage root node is created on-the-fly.
* CoreDb: Prevent `$$` logging in some cases
why:
Logging the function `$$` is not useful when it is used for internal
use, i.e. retrieving an an error text for logging.
* CoreDb: Add `tryHashFn()` to API for pretty printing
why:
Pretty printing must not change the hashification status for the
`Aristo` DB. So there is an independent API wrapper for getting the
node hash which never updated the hashes.
* CoreDb: Discard `update` argument in API `hash()` function
why:
When calling the API function `hash()`, the latest state is always
wanted. For a version that uses the current state as-is without checking,
the function `tryHash()` was added to the backend.
* CoreDb: Update opaque vertex ID objects for the `Aristo` backend
why:
For `Aristo`, vID objects encapsulate a numeric `VertexID`
referencing a vertex (rather than a node hash as used on the
legacy backend.) For storage sub-tries, there might be no initial
vertex known when the descriptor is created. So opaque vertex ID
objects are supported without a valid `VertexID` which will be
initalised on-the-fly when the first item is merged.
* CoreDb: Add pretty printer for opaque vertex ID objects
* Cosmetics, printing profiling data
* CoreDb: Fix segfault in `Aristo` backend when creating MPT descriptor
why:
Missing initialisation error
* CoreDb: Allow MPT to inherit shared context on `Aristo` backend
why:
Creates descriptors with different storage roots for the same
shared `Aristo` DB descriptor.
* Cosmetics, update diagnostic message items for `Aristo` backend
* Fix Copyright year
* Explicitly use shared `Kvt` table on `Ledger` and `Clique` lookup.
why:
Speeds up lookup time with `Aristo` backend. For writing `Clique` data,
the `Companion` model allows to write `Clique` data past the database
locked by evm transactions.
* Implement `CoreDb` profiling with API tracking
why:
Chasing time spent per APT procs ...
* Implement `Ledger` profiling with API tracking
why:
Chasing time spent per APT procs ...
* Always hashify when commiting or storing
why:
A dirty cache makes no sense when committing
* Make sure that a zero key is created when adding/updating vertices
why:
This is an error fix mainly for edge cases. A typical error was
that the root key got deleted when there were only a few vertices
left on the DB.
* Need all created and changed vertices zero-keyed on the cache
why:
A zero key (i.e. empty Merkle hash) indicates that a vertex key
needs to be updated. This would not be needed immediately after
a merge as there is an actual leaf path on the cache layer. But
after subsequent merge and delete operations this information
might get blurred.
* Re-org hashing algorithm
why:
Apart from errors, the previous implementation was too slow for
two reasons:
+ some control hashes were calculated for debugging (now all
verification is done in `aristo_check` module)
+ the leaf paths stored on the cache are used to build the
labelling (aka hashing) schedule; there paths were accumulated
over successive hash sessions although it is clear that all
keys were generated, already
* Fix copyright year
* Show elapsed times with enabled `CoreDb` API tracking
* Show elapsed times with enabled `LedgerRef` API tracking
* Reorg `CoreDb` auto destructors for `Aristo` DB
why:
While `Aristo` supports some parallelism for concurrent database access,
this comes with a price of management overhead. With a naive approach,
the auto-destructor will slow down execution because the ledger and
evm treat the database in a shared mode where a DB descriptor is just
created and thrown away shortly after.
This is reflected in the `Coredb` abstraction layer above `Aristo`/`Kvt`
where a few `Shared` type descriptors are cached and a shared reference
is returned rather than a disposable new object.
* For `CoreDb` support transaction level tracking
details:
This is mainly an extra for the legacy DB as `Aristo` and `Kvt` support
this already.
Also return an error on the legacy DB backend when `persistent()` is
called while there are transactions pending (the `persistent()` call
does nothing otherwise on the legacy backend.)
* Clear compiler warnings (remove unused variables etc.)
* Using different `tmp` directories for `Kvt` and `Aristo`
why:
Closing one database would leave the other set of directories
incomplete.
* Code cosmetics, silence compiler
* Fix typo `EMPTY_ROOT_HASH` vs. `EMPTY_CODE_HASH`
* Fix copyright years
* Split off `ReadOnlyStateDB` from `AccountStateDB` from `state_db.nim`
why:
Apart from testing, applications use `ReadOnlyStateDB` as an easy
way to access the accounts ledger. This is well supported by the
`Aristo` db, but writable mode is only parially supported.
The writable AccountStateDB` object for modifying accounts is not
used by production code.
So, for lecgacy and testing apps, the full support of the previous
`AccountStateDB` is now enabled by `import db/state_db/read_write`
and the `import db/state_db` provides read-only mode.
* Encapsulate `AccountStateDB` as `GenesisLedgerRef` or genesis creation
why:
`AccountStateDB` has poor support for `Aristo` and is not widely used
in favour of `AccountsLedger` (which will be abstracted as `ledger`.)
Currently, using other than the `AccountStateDB` ledgers within the
`GenesisLedgerRef` wrapper is experimental and test only. Eventually,
the wrapper should disappear so that the `Ledger` object (which
encapsulates `AccountsCache` and `AccountsLedger`) will prevail.
* For the `Ledger`, provide access to raw accounts `MPT`
why:
This gives to the `CoreDbMptRef` descriptor from the `CoreDb` (which is
the legacy version of CoreDxMptRef`.) For the new `ledger` API, the
accounts are based on the `CoreDxMAccRef` descriptor which uses a
particular sub-system for accounts while legacy applications use the
`CoreDbPhkRef` equivalent of the `SecureHexaryTrie`.
The only place where this feature will currently be used is the
`genesis.nim` source file.
* Fix `Aristo` bugs, missing boundary checks, typos, etc.
* Verify root vertex in `MPT` and account constructors
why:
Was missing so far, in particular the accounts constructor must
verify `VertexID(1)
* Fix include file
* Disable `TransactionID` related functions from `state_db.nim`
why:
Functions `getCommittedStorage()` and `updateOriginalRoot()` from
the `state_db` module are nowhere used. The emulation of a legacy
`TransactionID` type functionality is administratively expensive to
provide by `Aristo` (the legacy DB version is only partially
implemented, anyway).
As there is no other place where `TransactionID`s are used, they will
not be provided by the `Aristo` variant of the `CoreDb`. For the
legacy DB API, nothing will change.
* Fix copyright headers in source code
* Get rid of compiler warning
* Update Aristo code, remove unused `merge()` variant, export `hashify()`
why:
Adapt to upcoming `CoreDb` wrapper
* Remove synced tx feature from `Aristo`
why:
+ This feature allowed to synchronise transaction methods like begin,
commit, and rollback for a group of descriptors.
+ The feature is over engineered and not needed for `CoreDb`, neither
is it complete (some convergence features missing.)
* Add debugging helpers to `Kvt`
also:
Update database iterator, add count variable yield argument similar
to `Aristo`.
* Provide optional destructors for `CoreDb` API
why;
For the upcoming Aristo wrapper, this allows to control when certain
smart destruction and update can take place. The auto destructor works
fine in general when the storage/cache strategy is known and acceptable
when creating descriptors.
* Add update option for `CoreDb` API function `hash()`
why;
The hash function is typically used to get the state root of the MPT.
Due to lazy hashing, this might be not available on the `Aristo` DB.
So the `update` function asks for re-hashing the gurrent state changes
if needed.
* Update API tracking log mode: `info` => `debug
* Use shared `Kvt` descriptor in new Ledger API
why:
No need to create a new descriptor all the time
* Make sure that storage tries are not pruned (by default) on the new Ledger API
why:
Pruning might kill some unwanted entries from storage tries ending up with an unstable database
leading to crashes.
* Implement `CoreDb` and `LedgerRef` API tracing
details:
+ Locally enabled at compile time via constants `ProvideCoreDbLegacyAPI`
and `EnableApiTracking` in either `base.nim` source
+ If enabled it can be selectively turned on/off via public switches in
the `CoreDb` descriptor.
* Allow suppressing opportunistic `ifNecessaryGetXxx()` functions
why:
Better troubleshooting when the system crashes (assertions will then
most probably happen outside an `async` function.)
* Provide TDD/debug facility for inspecting `persistBlocks()` working
detail:
+ Make sure that the last block of a test sample is the first batch
item in `persistBlocks()`.
+ Additionally, allow `AccountsCache` API tracing by setting the flag
`extraTraceMessages = true` in the file `accounts_cache.nim`
* Overload AccountsCache by abstraction wrapper
details:
Can facilitate CoreDb API switch, details in `ledger/README.md`.