When lazily verifying state roots, we may end up with an entire state,
covering the whole database, whose hashes have not yet been computed -
in the current design, computing them would result in hashes for the
entire trie being held in memory.
Since the hash depends only on the data in the vertex, we can store it
directly at the top-most level that holds the vertices it was derived
from - be that memory or database - which makes the memory usage broadly
linear with respect to the already-existing in-memory change set stored
in the layers.
It also ensures that if we have multiple forks in memory, hashes get
cached in the correct layer, maximising reuse between forks.
The same layer numbering scheme as elsewhere is reused, where -2 is the
backend, -1 is the balancer, and 0 and above are the in-memory layer stack.
A downside of this approach is that we create many small batches - a
future improvement could be to collect all such writes in a single
batch, though the memory profile of this approach should be examined
first (where is the batch kept, exactly?).
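Roughly, the caching flow looks like this - a minimal sketch with made-up types and helper names (`Layer`, `cachedKey`, `cacheKey`), not the actual Aristo API:
```
import std/tables

type
  VertexID = uint64
  HashKey = array[32, byte]
  Layer = object
    level: int                       # -2 backend, -1 balancer, 0+ in-memory stack
    keys: Table[VertexID, HashKey]   # hashes cached alongside this layer's data

proc cachedKey(layers: seq[Layer], vid: VertexID): (HashKey, bool) =
  ## Search the layer stack top-down for an already computed hash.
  for i in countdown(layers.high, 0):
    if layers[i].keys.hasKey(vid):
      return (layers[i].keys[vid], true)
  (default(HashKey), false)

proc cacheKey(layers: var seq[Layer], level: int, vid: VertexID, key: HashKey) =
  ## Store a freshly computed hash at the top-most level that owns the vertex
  ## data it was derived from - be that memory or database.
  for l in layers.mitems:
    if l.level == level:
      l.keys[vid] = key
      return
```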
* Remove `chunkedMpt` from `persistent()`/`stow()` function
why:
Proof-mode code was removed with PR #2445 and needs to be re-designed.
* Remove unused `beStateRoot` argument from `deltaMerge()`
* Update/drastically simplify `txStow()`
why:
Got rid of many boundary conditions
details:
Many pre-conditions have changed. In particular, previous versions
used the account state (hash), which was conveniently available, and
checked it against the backend in order to find out whether there
was anything to do at all. Currently, the balancer update is only
ignored when all tables in the delta layer are empty.
Notable changes are:
* no check against account state (see above)
* balancer filters have no hash signature (some legacy stuff left over
from journals)
* no (snap sync) proof data, which made the generation of a top layer
more complex
* Cosmetics, cruft removal
* Update unit test file & function name
why:
Was legacy module
Our need is only a baseline tx pool gasLimit calculator.
If needed, we can expand it in the future.
But for now, a simple, understandable tx pool is more important.
* Remove cruft left-over from PR #2494
* TODO
* Update comments on `HashKey` type values
* Remove obsolete hash key conversion flag `forceRoot`
why:
Is treated implicitly by having vertex keys as `HashKey` type and
root vertex states converted to `Hash256`
* Use block number or timestamp to determine fork rules
Avoid confusion raised by `forkGTE` usage where block information is present.
* Get rid of forkGTE
The 3 proofs can be reworked to two proofs as we can use the
BeaconBlock directly instead of BeaconBlockHeader and
BeaconBlockBody. This is possible because the HTR of the
BeaconBlock is the same as the one of the BeaconBlockHeader.
This saves 32 bytes, as an intermediate hash can be removed. More
importantly, the structure and code become cleaner and more compact.
* Imported/rebase from `no-ext`, PR #2485
Store extension nodes together with the branch
Extension nodes must be followed by a branch - as such, it makes sense
to store the two together both in the database and in memory:
* fewer reads, writes and updates to traverse the tree
* simpler logic for maintaining the node structure
* less space used, both memory and storage, because there are fewer
nodes overall
There is also a downside: hashes can no longer be cached for an
extension - instead, only the extension+branch hash can be cached - this
seems like a fine tradeoff since computing it should be fast.
TODO: fix commented code
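Roughly, the combined vertex looks like this - a sketch with illustrative field names rather than the actual Aristo declarations:
```
type
  VertexID = uint64
  NibblePath = seq[byte]        # one nibble per entry; empty if no extension

  BranchVtx = object
    pfx: NibblePath             # former extension prefix, possibly empty
    bVid: array[16, VertexID]   # child links, 0 meaning "no child"

proc hasExtension(v: BranchVtx): bool =
  ## A non-empty prefix is what used to be a separate Extension node.
  v.pfx.len > 0
```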
* Fix merge functions and `toNode()`
* Update `merkleSignCommit()` prototype
why:
Result is always a 32 byte hash
* Update short Merkle hash key generation
details:
Ethereum reference MPTs use Keccak hashes as node links if the size of
an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded
node value is used as a pseudo node link (rather than a hash.) This is
specified in the yellow paper, appendix D.
Unlike the `Aristo` implementation, the reference MPT would not
store such a node on the key-value database. Rather, the RLP encoded node
value is stored inline in the parent node, in place of a node link.
Only the top level (root) node is always referred to by its hash.
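Expressed as a sketch (hypothetical helper, hashing itself elided), the rule reads:
```
proc nodeLink(rlpNode: seq[byte], keccakOfNode: array[32, byte]): seq[byte] =
  ## `keccakOfNode` stands in for keccak256(rlpNode).
  if rlpNode.len < 32:
    rlpNode            # short node: the RLP encoding itself is the link
  else:
    @keccakOfNode      # 32 bytes or more: refer to the node by its hash
```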
* Fix/update `Extension` sections
why:
Were commented out after removal of the dedicated `Extension` type,
which left the system dysfunctional.
* Clean up unused error codes
* Update unit tests
* Update docu
---------
Co-authored-by: Jacek Sieka <jacek@status.im>
This PR adds a storage hike cache similar to the account hike cache
already present - this cache is less efficient because account storage
is already partially cached in the account ledger, but it nonetheless
helps keep hiking down.
Notably, there's an opportunity to optimise this cache and the others so
that they cooperate better instead of overlapping, which is left for a
future PR.
This PR also fixes an O(N) memory usage for storage slots where the
delete would keep the full storage in a work list which on mainnet can
grow very large - the work list is replaced with a more conventional
recursive `O(log N)` approach.
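A simplified sketch of the recursive variant over a hypothetical in-memory trie - stack depth is bounded by the trie depth instead of the number of slots:
```
import std/tables

type
  VertexID = uint64
  Vertex = object
    children: array[16, VertexID]   # 0 = empty slot; leaves have no children

proc delSubTree(db: var Table[VertexID, Vertex], vid: VertexID) =
  ## Recursively delete `vid` and everything below it, without ever holding
  ## the full set of storage slots in a work list.
  if vid == 0 or not db.hasKey(vid):
    return
  for child in db[vid].children:
    delSubTree(db, child)
  db.del(vid)
```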
The Vertex type unifies branches, extensions and leaves into a single
memory area where the largest member is the branch (128 bytes + overhead) -
the payloads we have are all smaller than 128 bytes, thus wrapping them in an
extra layer of `ref` is wasteful from a memory usage perspective.
Further, the `ref`s must be visited during the mark-and-sweep phase of garbage
collection - since we keep millions of these, many of them
short-lived, this takes up significant CPU time.
```
Function CPU Time: Total CPU Time: Self Module Function (Full) Source File Start Address
system::markStackAndRegisters 10.0% 4.922s nimbus system::markStackAndRegisters(var<system::GcHeap>).constprop.0 gc.nim 0x701230
```
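The shape of the change, sketched with illustrative names (extensions are already folded into branches per the `no-ext` change above):
```
type
  VertexID = uint64
  VertexType = enum
    Leaf, Branch

  # Value type: payloads live inline, so there is no extra `ref` for the GC
  # to trace during mark & sweep.
  VertexObj = object
    case vType: VertexType
    of Leaf:
      lPfx: array[32, byte]        # leaf path suffix (illustrative)
    of Branch:
      bVid: array[16, VertexID]    # 16 * 8 = 128 bytes, the largest member
```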
* Extract `CoreDb` constructor helpers from `base.nim` into separate module
why:
This makes it easier to avoid circular imports.
* Extract `Ledger` constructor helpers from `base.nim` into separate module
why:
Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
layout resembles that of the `core_db`.
* Updates and corrections
* Extract `CoreDb` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports, in particular
when the capture journal (aka tracer) is revived.
* Extract `Ledger` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports (if any.)
also:
Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
layout resembles that of the `core_db`.
- EpochAccumulator got renamed to EpochRecord
- MasterAccumulator is now HistoricalHashesAccumulator
- The List size for the accumulator got a different maximum, which
also results in a different encoding and HTR
Hike allocations (and the garbage collection maintenance that follows)
are responsible for some 10% of CPU time (not wall time!) at this point
- this PR avoids them by stepping through the layers one step at a time,
simplifying the code along the way.
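A rough illustration of the shape of the change (hypothetical types, not the real lookup code): the path is followed one step at a time instead of collecting every visited leg into a hike first.
```
import std/tables

type
  VertexID = uint64
  Vertex = object
    isLeaf: bool
    children: array[16, VertexID]   # 0 = no child

proc walk(db: Table[VertexID, Vertex], root: VertexID,
          nibbles: openArray[int]): VertexID =
  ## Follow `nibbles` from `root` one vertex at a time; no seq of legs is
  ## allocated along the way. Returns 0 if the path does not exist.
  var vid = root
  for n in nibbles:
    if not db.hasKey(vid) or db[vid].isLeaf:
      return 0
    vid = db[vid].children[n]
    if vid == 0:
      return 0
  vid
```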
* Rename `newKvt()` -> `ctx.getKvt()`
why:
Clean up legacy shortcut. Also, the `KVT` returned is not instantiated
but refers to the shared `KVT` that resides in a context which is a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `newTransaction()` -> `ctx.newTransaction()`
why:
Clean up legacy shortcut. The transaction is applied to a context as a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `getColumn(CtGeneric)` -> `getGeneric()`
why:
A list of well-known sub-tries is no longer needed; a single one is
enough. In fact, `getColumn()` only supported a single sub-trie by now anyway.
* Reduce TODO list
This trivial bump should improve performance a bit without costing too
much memory - as the trie grows, so does the number of levels in it and
creating hikes becomes ever more expensive - hopefully this cache
increase should give a nice little boost even if it's not a lot.
Avoid writing the same slot/hash values to the hash->slot mapping,
which spams the RocksDB WAL and causes unnecessary compaction.
In the same vein, avoid writing trivially detectable A-B-A storage
changes, which happen with surprising frequency.
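A sketch of the filtering idea with made-up types, assuming the pending changes keep both the value read from the backend and the current value:
```
import std/tables

type
  PendingWrites = object
    baseline: Table[string, seq[byte]]   # values as read from the backend
    updates:  Table[string, seq[byte]]   # values as they stand now

proc commit(p: PendingWrites): seq[(string, seq[byte])] =
  ## Emit only the updates that actually differ from the baseline; redundant
  ## writes and A-B-A sequences (A -> B -> A within the same span) drop out.
  for key, val in p.updates:
    if p.baseline.getOrDefault(key) != val:
      result.add((key, val))
```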
Introduce a new `StoData` payload type similar to `AccountData`
* slightly more efficient storage format
* typed api
* fewer seqs
* fix encoding docs - it wasn't rlp after all :)
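Roughly, the leaf payload variant gains a typed member - a sketch where `UInt256` is a placeholder for the project's 256-bit integer type and the field names are illustrative:
```
type
  UInt256 = array[32, byte]      # placeholder for the real 256-bit type
  PayloadType = enum
    RawData, AccountData, StoData

  LeafPayload = object
    case pType: PayloadType
    of RawData:
      rawBlob: seq[byte]         # generic, seq-based encoding
    of AccountData:
      nonce: uint64
      balance: UInt256
    of StoData:
      stoData: UInt256           # typed slot value, no seq needed
```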
* Enable LTO compilation
Similar to nimbus-eth2, LTO gives a significant boost for any CPU-bound operations such as the EVM.
The options are copied straight from nimbus-eth2. For example, at block height 1.7M there's a computation-heavy section where we can see a 15%-20% improvement in block processing time:
```
bps_x bps_y tps_x tps_y time_x time_y bpsd tpsd timed
(1722223, 1733334] 102.52 138.90 1,049.67 1,420.61 2m41s 1m58s 35.78% 35.78% -26.32%
```
* avoid defer
When evmc recursion is enabled together with LTO, we run out of stack
space.
`defer` creates an exception handling context that takes up hundreds of
bytes of stack space - now that the EVM is no longer using exceptions,
we can safely get rid of it.
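Illustratively (hypothetical proc), the pattern change looks like this:
```
proc withDefer(frame: var seq[byte]) =
  defer: frame.setLen(0)   # sets up a try/finally style frame on the stack
  # ... interpreter work ...

proc withoutDefer(frame: var seq[byte]) =
  # ... interpreter work, known not to raise ...
  frame.setLen(0)          # plain call at the end, no exception context
```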
This significantly speeds up block import at the cost of less protection
against invalid data, potentially resulting in an invalid database
getting stored.
The risk is small given that import is used only for validated data -
evaluating the right level of validation vs performance is left for a
future PR.
A side effect of this approach is that there is no cached state root in
the database - computing it currently requires a lot of memory since the
intermediate roots get cached in memory in full while the computation is
ongoing - a future PR will need to address this deficiency, for example
by streaming the already-computed hashes directly to the database.
The state and account MPTs currently share key space in the database,
and vertex ids are assigned essentially randomly, which means
that when two adjacent slot values from the same contract are accessed,
they might reside a large distance from each other.
Here, we prefix each vertex id with its root, causing them to be sorted
together and thus bringing all data belonging to a particular contract
closer together - the same effect also happens for the main state MPT,
whose nodes now end up clustered together more tightly.
In the future, the prefix given to the storage keys can also be used to
perform range operations such as reading all the storage at once and/or
deleting an account with a batch operation.
Notably, parts of the API already supported this rooting concept while
parts didn't - this PR makes the API consistent by always working with a
root+vid.
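A sketch of the resulting key layout - the 8-byte ids and big-endian encoding here are assumptions for illustration, not the exact on-disk format:
```
type
  VertexID = uint64
  RootedVertexID = tuple[root, vid: VertexID]

proc toDbKey(rvid: RootedVertexID): array[16, byte] =
  ## Big-endian root id first, then the vertex id, so that lexicographic key
  ## ordering groups everything under the same root together.
  for i in 0 .. 7:
    result[i]     = byte((rvid.root shr (8 * (7 - i))) and 0xff)
    result[8 + i] = byte((rvid.vid  shr (8 * (7 - i))) and 0xff)
```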
The storage id is frequently accessed when executing contract code, and
finding the path via the database requires several hops, making the
process slow - here, we add a cache to keep the most recently used
account storage ids in memory.
A possible future improvement would be to cache all account accesses so
that for example updating the balance doesn't cause several hikes.
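A minimal LRU sketch with made-up names and capacity, not the actual cache implementation:
```
import std/tables

const CacheCap = 1000                       # illustrative capacity

type
  StoIdCache = object
    entries: OrderedTable[string, uint64]   # account key -> storage root vid

proc get(c: var StoIdCache, acc: string): (uint64, bool) =
  if acc in c.entries:
    let vid = c.entries[acc]
    c.entries.del(acc)         # delete and re-insert to mark as most recent
    c.entries[acc] = vid
    return (vid, true)
  (0'u64, false)

proc put(c: var StoIdCache, acc: string, vid: uint64) =
  if acc notin c.entries and c.entries.len >= CacheCap:
    var oldest: string
    for k in c.entries.keys:   # first key = least recently used
      oldest = k
      break
    c.entries.del(oldest)
  c.entries[acc] = vid
```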