nimbus-eth1/nimbus/db/aristo/aristo_hashify.nim

# nimbus-eth1
# Copyright (c) 2023-2024 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.
## Aristo DB -- Patricia Trie Merkleisation
## ========================================
##
## For the current state of the `Patricia Trie`, keys (equivalent to hashes)
## are associated with the vertex IDs. Existing key associations are taken
## as-is/unchecked unless the ID is marked as a proof node. In the latter case,
## the key is assumed to be correct after re-calculation.
##
## The labelling algorithm works roughly as follows:
##
## * Given a set of start or root vertices, build the forest (of trees)
##   downwards towards the leaf vertices so that none of these vertices has
##   a Merkle hash label.
##
## * Starting at the leaf vertices in width-first fashion, calculate the
##   Merkle hashes and label the leaf vertices. Then work upwards
##   recursively, labelling vertices until the root nodes are reached.
##
## Note that there are some tweaks for `proof` node vertices which lead to
## incomplete trees in a way that the algorithm handles existing Merkle hash
## labels for missing vertices.
##
{.push raises: [].}

import
  std/[algorithm, sequtils, sets, tables],
  chronicles,
  eth/common,
  results,
  "."/[aristo_desc, aristo_get, aristo_layers, aristo_serialise, aristo_utils]

type
  WidthFirstForest = object
    ## Collected width first search trees
    root: HashSet[VertexID]                ## Top level, root targets
    pool: Table[VertexID,VertexID]         ## Upper links pool
    base: Table[VertexID,VertexID]         ## Width-first leaf level links
    leaf: seq[VertexID]                    ## Stand-alone leaf to process
    rev: Table[VertexID,HashSet[VertexID]] ## Reverse look up table

logScope:
  topics = "aristo-hashify"

# ------------------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------------------

func getOrVoid(tab: Table[VertexID,VertexID]; vid: VertexID): VertexID =
  tab.getOrDefault(vid, VertexID(0))

# ------------------------------------------------------------------------------
# Private functions
# ------------------------------------------------------------------------------

func hasValue(
    wffTable: Table[VertexID,VertexID];
    vid: VertexID;
    wff: var WidthFirstForest;
      ): bool =
  ## Helper for efficient `value` access:
  ## ::
  ##   wffTable.hasValue(vid, wff)
  ##
  ## instead of
  ## ::
  ##   vid in wffTable.values.toSeq
  ##
  wff.rev.withValue(vid, v):
    for w in v[]:
      if w in wffTable:
        return true
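
# Illustration only, not part of the original module: a minimal sketch of the
# reverse look up idea behind `hasValue()`, assuming plain `int` keys instead
# of `VertexID`. The names `links`, `rev` and `hasValueSketch` are
# hypothetical. Rather than scanning all table values, a reverse table maps
# each value to the set of keys that pointed to it when the link was stored.
# Unlike `hasValue()`, the sketch also compares the stored value, which is
# a stricter test than checking for the presence of the key only.
import std/[sets, tables] # repeated here so that the sketch is self-contained

func hasValueSketch(
    links: Table[int,int];
    rev: Table[int,HashSet[int]];
    val: int;
      ): bool =
  ## True if some `key->val` entry is currently stored in `links`.
  for key in rev.getOrDefault(val):
    if key in links and links.getOrDefault(key) == val:
      return true

when isMainModule:
  var
    links = {1: 10, 2: 10, 3: 20}.toTable
    rev = {10: [1, 2].toHashSet, 20: [3].toHashSet}.toTable
  doAssert links.hasValueSketch(rev, 10)
  links.del 3                                  # `rev[]` may become stale ..
  doAssert not links.hasValueSketch(rev, 20)   # .. so `links` is re-checked
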
proc pedigree(
    db: AristoDbRef;                   # Database, top layer
    wff: var WidthFirstForest;
    ancestors: HashSet[VertexID];      # Vertex IDs to start connecting from
    proofs: HashSet[VertexID];         # Additional proof nodes to start from
      ): Result[void, (VertexID,AristoError)] =
  ## For each vertex ID from the argument set `ancestors` find all un-labelled
  ## grand child vertices and build a forest (of trees) starting from the
  ## grand child vertices.
  ##
  var
    leafs: HashSet[VertexID]

  proc register(wff: var WidthFirstForest; fromVid, toVid: VertexID) =
    if toVid in wff.base:
      # * there is `toVid->*` in `base[]`
      # * so `toVid->*` is moved to `pool[]`
      wff.pool[toVid] = wff.base.getOrVoid toVid
      wff.base.del toVid
    if wff.base.hasValue(fromVid, wff):
      # * there is `*->fromVid` in `base[]`
      # * so store `fromVid->toVid` in `pool[]`
      wff.pool[fromVid] = toVid
    else:
      # store `fromVid->toVid` in `base[]`
      wff.base[fromVid] = toVid

    # Register reverse pair for quick table value lookup
    wff.rev.withValue(toVid, val):
      val[].incl fromVid
    do:
      wff.rev[toVid] = [fromVid].toHashSet

    # Remove unnecessary sub-trie roots (e.g. for a storage root)
    wff.root.excl fromVid

  # Initialise greedy search which will keep a set of current leafs in the
  # `leafs{}` set and follow up links in the `pool[]` table, leading all the
  # way up to the `root{}` set.
  #
  # Process root nodes if they are unlabelled
  var rootWasDeleted = VertexID(0)
  for root in ancestors:
    let vtx = db.getVtx root
    if vtx.isNil:
      if VertexID(LEAST_FREE_VID) <= root:
        # There must be another root as well (e.g. `$1` for a storage
        # root). Only the last one of them will be reported with error code.
        rootWasDeleted = root
    elif not db.getKey(root).isValid:
      # Need to process `root` node
      let children = vtx.subVids
      if children.len == 0:
        # This is an isolated leaf node
        wff.leaf.add root
      else:
        wff.root.incl root
        for child in vtx.subVids:
          if not db.getKey(child).isValid:
            leafs.incl child
            wff.register(child, root)
  if rootWasDeleted.isValid and
     wff.root.len == 0 and
     wff.leaf.len == 0:
    return err((rootWasDeleted,HashifyRootVtxUnresolved))

  # Initialisation for `proof` nodes which are sort of similar to `root` nodes.
  for proof in proofs:
    let vtx = db.getVtx proof
    if vtx.isNil or not db.getKey(proof).isValid:
      return err((proof,HashifyVtxUnresolved))
    let children = vtx.subVids
    if 0 < children.len:
      # To be treated as a root node
      wff.root.incl proof
      for child in vtx.subVids:
        if not db.getKey(child).isValid:
          leafs.incl child
          wff.register(child, proof)

  # Recursively step down and collect unlabelled vertices
  while 0 < leafs.len:
    var redo: typeof(leafs)

    for parent in leafs:
      assert parent.isValid
      assert not db.getKey(parent).isValid

      let vtx = db.getVtx parent
      if not vtx.isNil:
        let children = vtx.subVids.filterIt(not db.getKey(it).isValid)
        if 0 < children.len:
          for child in children:
            redo.incl child
            wff.register(child, parent)
          continue

      if parent notin wff.base:
        # The buck stops here:
        #  move `(parent,granny)` from `pool[]` to `base[]`
        let granny = wff.pool.getOrVoid parent
        assert granny.isValid
        wff.register(parent, granny)
        wff.pool.del parent

    redo.swap leafs

  ok()

# ------------------------------------------------------------------------------
# Private functions, tree traversal
# ------------------------------------------------------------------------------

proc createSched(
    wff: var WidthFirstForest;         # Search tree to create
    db: AristoDbRef;                   # Database, top layer
      ): Result[void,(VertexID,AristoError)] =
  ## Create width-first search schedule (aka forest)
  ##
  ? db.pedigree(wff, db.dirty, db.pPrf)

  if 0 < wff.leaf.len:
    for vid in wff.leaf:
      let node = db.getVtx(vid).toNode(db, beKeyOk=false).valueOr:
        # Make sure that all those nodes are reachable
        for needed in error:
          if needed notin wff.base and
             needed notin wff.pool:
            return err((needed,HashifyVtxUnresolved))
        continue
      db.layersPutKey(VertexID(1), vid, node.digestTo(HashKey))

    wff.leaf.reset() # No longer needed

  ok()

proc processSched(
    wff: var WidthFirstForest;         # Search tree to process
    db: AristoDbRef;                   # Database, top layer
      ): Result[void,(VertexID,AristoError)] =
  ## Traverse width-first schedule and update vertex hash labels.
  ##
  while 0 < wff.base.len:
    var
      accept = false
      redo: typeof(wff.base)

    for (vid,toVid) in wff.base.pairs:
      let vtx = db.getVtx vid
      assert vtx.isValid

      # Try to convert the vertex to a node. This is possible only if all
      # link references have Merkle hash keys, already.
      let node = vtx.toNode(db, stopEarly=false).valueOr:
        # Do this vertex later, again
        if wff.pool.hasValue(vid, wff):
          wff.pool[vid] = toVid
          accept = true # `redo[]` will be different from `base[]`
        else:
          redo[vid] = toVid
        continue
        # End `valueOr` terminates error clause

      # Could resolve => update Merkle hash
      db.layersPutKey(VertexID(1), vid, node.digestTo HashKey)

      # Set follow up link for next round
      let toToVid = wff.pool.getOrVoid toVid
      if toToVid.isValid:
        if toToVid in redo:
          # Got predecessor `(toVid,toToVid)` of `(toToVid,xxx)`,
          # so move `(toToVid,xxx)` from `redo[]` to `pool[]`
          wff.pool[toToVid] = redo.getOrVoid toToVid
          redo.del toToVid
        # Move `(toVid,toToVid)` from `pool[]` to `redo[]`
        wff.pool.del toVid
        redo[toVid] = toToVid

      accept = true # `redo[]` will be different from `base[]`

      # End `for (vid,toVid)..`

    # Make sure that `base[]` is different from `redo[]`
    if not accept:
      let vid = wff.base.keys.toSeq[0]
      return err((vid,HashifyVtxUnresolved))

    # Restart `wff.base[]`
    wff.base.swap redo

  ok()

proc finaliseRoots(
    wff: var WidthFirstForest;         # Search tree to process
    db: AristoDbRef;                   # Database, top layer
      ): Result[void,(VertexID,AristoError)] =
  ## Process root vertices after all other vertices are done.
  ##
  # Make sure that the pool has been exhausted
  if 0 < wff.pool.len:
    let vid = wff.pool.keys.toSeq.sorted[0]
    return err((vid,HashifyVtxUnresolved))

  # Update or verify root nodes
  for vid in wff.root:
    # Calculate hash key
    let
      node = db.getVtx(vid).toNode(db).valueOr:
        return err((vid,HashifyRootVtxUnresolved))
      key = node.digestTo(HashKey)
    if vid notin db.pPrf:
      db.layersPutKey(VertexID(1), vid, key)
    elif key != db.getKey vid:
      return err((vid,HashifyProofHashMismatch))

  ok()

# ------------------------------------------------------------------------------
# Public functions
# ------------------------------------------------------------------------------

proc hashify*(
    db: AristoDbRef;                   # Database, top layer
      ): Result[void,(VertexID,AristoError)] =
  ## Add keys to the `Patricia Trie` so that it becomes a `Merkle Patricia
  ## Tree`.
  ##
  if 0 < db.dirty.len:
    # Set up width-first traversal schedule
    var wff: WidthFirstForest
    ? wff.createSched db

    # Traverse tree spanned by `wff` and label remaining vertices.
    ? wff.processSched db

    # Do/complete state root vertices
    ? wff.finaliseRoots db

    db.top.final.dirty.clear # Mark top layer clean

  ok()
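
# ------------------------------------------------------------------------------
# Illustration only (hypothetical, not part of the original module)
# ------------------------------------------------------------------------------

# A toy sketch of the two phases outlined in the module header, assuming
# a plain tree given as a `vertex -> children` table with `int` vertex IDs.
# The name `labelBottomUpSketch` and the string labels `h(..)` are made up
# for this example and stand in for the Merkle hash keys. The sketch labels
# every vertex and ignores existing labels, proof nodes and error handling,
# all of which `hashify()` takes into account.
import std/tables # repeated here so that the sketch is self-contained

func labelBottomUpSketch(
    kids: Table[int,seq[int]];         # Vertex -> child vertices
    root: int;                         # Start vertex
      ): Table[int,string] =
  ## Phase 1 collects the vertices level by level going down from `root`.
  ## Phase 2 visits the levels in reverse order so that all children are
  ## labelled before their parent.
  var levels = @[@[root]]
  while true:
    var next: seq[int]
    for vid in levels[^1]:
      next.add kids.getOrDefault(vid)
    if next.len == 0:
      break
    levels.add next

  for n in countdown(levels.len - 1, 0):
    for vid in levels[n]:
      var label = "h(" & $vid
      for kid in kids.getOrDefault(vid):
        label.add "," & result.getOrDefault(kid)
      result[vid] = label & ")"

when isMainModule:
  let labels = {1: @[2, 3], 2: @[4]}.toTable.labelBottomUpSketch(1)
  doAssert labels.getOrDefault(4) == "h(4)"
  doAssert labels.getOrDefault(1) == "h(1,h(2,h(4)),h(3))"
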
# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------