nimbus-eth1/nimbus/db/aristo/aristo_delete.nim

# nimbus-eth1
# Copyright (c) 2023 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.
## Aristo DB -- Patricia Trie delete functionality
## ===============================================
##
## Delete by `Hike` type chain of vertices.
{.push raises: [].}
import
std/[sets, tables],
chronicles,
eth/[common, trie/nibbles],
results,
"."/[aristo_desc, aristo_get, aristo_hike, aristo_layers, aristo_path,
aristo_vid]
logScope:
topics = "aristo-delete"
# ------------------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------------------
func toVae(err: AristoError): (VertexID,AristoError) =
## Map single error to error pair with dummy vertex
(VertexID(0),err)
func toVae(err: (Hike,AristoError)): (VertexID,AristoError) =
if 0 < err[0].legs.len:
(err[0].legs[^1].wp.vid, err[1])
else:
(VertexID(0), err[1])
proc branchStillNeeded(vtx: VertexRef): Result[int,void] =
## Returns the nibble if there is only one reference left.
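  ##
  ## A rough usage sketch (illustrative only, mirroring the call site in
  ## `deleteImpl()` below):
  ## ::
  ##   let rc = vtx.branchStillNeeded()
  ##   if rc.isErr:
  ##     discard # degenerated `Branch` vertex without any references
  ##   elif 0 <= rc.value:
  ##     discard # exactly one reference left, on nibble `rc.value`
  ##   else:
  ##     discard # more than one reference left, `Branch` is still needed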
var nibble = -1
for n in 0 .. 15:
if vtx.bVid[n].isValid:
if 0 <= nibble:
return ok(-1)
nibble = n
if 0 <= nibble:
return ok(nibble)
# Oops, degenerated branch node
err()
# -----------
proc nullifyKey(
db: AristoDbRef; # Database, top layer
vid: VertexID; # Vertex IDs to clear
) =
# Register for void hash (to be recompiled)
db.layersResLabel vid
proc disposeOfVtx(
db: AristoDbRef; # Database, top layer
vid: VertexID; # Vertex IDs to clear
) =
# Remove entry
db.layersResVtx vid
db.layersResLabel vid
db.vidDispose vid # Recycle ID
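
# Note: `nullifyKey()` merely clears the Merkle label of a vertex that stays
# in place (so it gets re-hashed later), whereas `disposeOfVtx()` also removes
# the vertex from the structural table and recycles its vertex ID.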
# ------------------------------------------------------------------------------
# Private functions
# ------------------------------------------------------------------------------
proc collapseBranch(
db: AristoDbRef; # Database, top layer
hike: Hike; # Fully expanded path
nibble: byte; # Applicable link for `Branch` vertex
): Result[void,(VertexID,AristoError)] =
## Convert/merge vertices:
## ::
## current | becomes | condition
## | |
## ^3 ^2 | ^3 ^2 |
## -------------------+---------------------+------------------
## Branch <br> Branch | Branch <ext> Branch | 2 < legs.len (1)
## Ext <br> Branch | <ext> Branch | 2 < legs.len (2)
## <br> Branch | <ext> Branch | legs.len == 2 (3)
##
## Depending on whether the parent `par` is an extension, merge `br` into
## `par`. Otherwise replace `br` by an extension.
##
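  ## Here `^2` and `^3` refer to `hike.legs[^2]` and `hike.legs[^3]`, i.e. the
  ## last and the second to last leg of the fully expanded path. A call sketch
  ## (illustrative only, mirroring the dispatcher in `deleteImpl()`):
  ## ::
  ##   ? db.collapseBranch(hike, nibble.byte)
  ##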
let br = hike.legs[^2].wp
var xt = VidVtxPair( # Rewrite `br`
vid: br.vid,
vtx: VertexRef(
vType: Extension,
ePfx: @[nibble].initNibbleRange.slice(1),
eVid: br.vtx.bVid[nibble]))
if 2 < hike.legs.len: # (1) or (2)
let par = hike.legs[^3].wp
case par.vtx.vType:
of Branch: # (1)
# Replace `br` (use `xt` as-is)
discard
of Extension: # (2)
# Merge `br` into ^3 (update `xt`)
db.disposeOfVtx xt.vid
xt.vid = par.vid
xt.vtx.ePfx = par.vtx.ePfx & xt.vtx.ePfx
of Leaf:
return err((par.vid,DelLeafUnexpected))
else: # (3)
# Replace `br` (use `xt` as-is)
discard
db.layersPutVtx(xt.vid, xt.vtx)
ok()
proc collapseExt(
db: AristoDbRef; # Database, top layer
hike: Hike; # Fully expanded path
nibble: byte; # Link for `Branch` vertex `^2`
vtx: VertexRef; # Follow up extension vertex (nibble)
): Result[void,(VertexID,AristoError)] =
## Convert/merge vertices:
## ::
## ^3 ^2 `vtx` | ^3 ^2 |
## --------------------+-----------------------+------------------
## Branch <br> Ext | Branch <ext> | 2 < legs.len (1)
## Ext <br> Ext | <ext> | 2 < legs.len (2)
## <br> Ext | <ext> | legs.len == 2 (3)
##
## Merge `vtx` into `br` and unlink `vtx`.
##
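  ## The `vtx` argument is the `Extension` vertex referenced by the branch
  ## slot `br.vtx.bVid[nibble]`; it is merged upwards and disposed of. Call
  ## sketch (illustrative only, as used in `deleteImpl()`):
  ## ::
  ##   ? db.collapseExt(hike, nibble.byte, nxt.vtx)
  ##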
let br = hike.legs[^2].wp
var xt = VidVtxPair( # Merge `vtx` into `br`
vid: br.vid,
vtx: VertexRef(
vType: Extension,
ePfx: @[nibble].initNibbleRange.slice(1) & vtx.ePfx,
eVid: vtx.eVid))
db.disposeOfVtx br.vtx.bVid[nibble] # `vtx` is obsolete now
if 2 < hike.legs.len: # (1) or (2)
let par = hike.legs[^3].wp
case par.vtx.vType:
of Branch: # (1)
# Replace `br` by `^2 & vtx` (use `xt` as-is)
discard
of Extension: # (2)
# Replace ^3 by `^3 & ^2 & vtx` (update `xt`)
db.disposeOfVtx xt.vid
xt.vid = par.vid
xt.vtx.ePfx = par.vtx.ePfx & xt.vtx.ePfx
of Leaf:
return err((par.vid,DelLeafUnexpected))
else: # (3)
# Replace ^2 by `^2 & vtx` (use `xt` as-is)
discard
db.layersPutVtx(xt.vid, xt.vtx)
ok()
proc collapseLeaf(
db: AristoDbRef; # Database, top layer
hike: Hike; # Fully expanded path
nibble: byte; # Link for `Branch` vertex `^2`
vtx: VertexRef; # Follow up leaf vertex (from nibble)
): Result[void,(VertexID,AristoError)] =
## Convert/merge vertices:
## ::
## current | becomes | condition
## | |
## ^4 ^3 ^2 `vtx` | ^4 ^3 ^2 |
## -------------------------+----------------------------+------------------
## .. Branch <br> Leaf | .. Branch <Leaf> | 2 < legs.len (1)
## Branch Ext <br> Leaf | Branch <Leaf> | 3 < legs.len (2)
## Ext <br> Leaf | <Leaf> | legs.len == 3 (3)
## <br> Leaf | <Leaf> | legs.len == 2 (4)
##
## Merge `<br>` and `Leaf` replacing one and removing the other.
##
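  ## The `vtx` argument is the `Leaf` vertex referenced by the branch slot
  ## `br.vtx.bVid[nibble]`; it absorbs the link nibble (and possibly an
  ## `Extension` prefix) into its own path prefix. Call sketch (illustrative
  ## only, as used in `deleteImpl()`):
  ## ::
  ##   ? db.collapseLeaf(hike, nibble.byte, nxt.vtx)
  ##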
let br = hike.legs[^2].wp
var lf = VidVtxPair( # Merge `br` into `vtx`
vid: br.vtx.bVid[nibble],
vtx: VertexRef(
vType: Leaf,
lPfx: @[nibble].initNibbleRange.slice(1) & vtx.lPfx,
lData: vtx.lData))
db.nullifyKey lf.vid # `vtx` was modified
if 2 < hike.legs.len: # (1), (2), or (3)
db.disposeOfVtx br.vid # `br` is obsolete now
# Merge `br` into the leaf `vtx` and unlink `br`.
let par = hike.legs[^3].wp.dup # Writable vertex
case par.vtx.vType:
of Branch: # (1)
# Replace `vtx` by `^2 & vtx` (use `lf` as-is)
par.vtx.bVid[hike.legs[^3].nibble] = lf.vid
db.layersPutVtx(par.vid, par.vtx)
db.layersPutVtx(lf.vid, lf.vtx)
      # Make sure that there is a cache entry in case the leaf was pulled from
      # the backend!
let
lfPath = hike.legsTo(hike.legs.len - 2, NibblesSeq) & lf.vtx.lPfx
tag = lfPath.pathToTag.valueOr:
return err((lf.vid,error))
db.top.final.lTab[LeafTie(root: hike.root, path: tag)] = lf.vid
return ok()
of Extension: # (2) or (3)
# Merge `^3` into `lf` but keep the leaf vertex ID unchanged. This
# avoids some `lTab[]` registry update.
lf.vtx.lPfx = par.vtx.ePfx & lf.vtx.lPfx
if 3 < hike.legs.len: # (2)
# Grandparent exists
let gpr = hike.legs[^4].wp.dup # Writable vertex
if gpr.vtx.vType != Branch:
return err((gpr.vid,DelBranchExpexted))
db.disposeOfVtx par.vid # `par` is obsolete now
gpr.vtx.bVid[hike.legs[^4].nibble] = lf.vid
db.layersPutVtx(gpr.vid, gpr.vtx)
db.layersPutVtx(lf.vid, lf.vtx)
        # Make sure that there is a cache entry in case the leaf was pulled
        # from the backend!
let
lfPath = hike.legsTo(hike.legs.len - 3, NibblesSeq) & lf.vtx.lPfx
tag = lfPath.pathToTag.valueOr:
return err((lf.vid,error))
db.top.final.lTab[LeafTie(root: hike.root, path: tag)] = lf.vid
return ok()
# No grandparent, so ^3 is root vertex # (3)
db.layersPutVtx(par.vid, lf.vtx)
# Continue below
of Leaf:
return err((par.vid,DelLeafUnexpected))
else: # (4)
# Replace ^2 by `^2 & vtx` (use `lf` as-is) # `br` is root vertex
db.nullifyKey br.vid # root was changed
db.layersPutVtx(br.vid, lf.vtx)
# Continue below
# Common part for setting up `lf` as root vertex # Rest of (3) or (4)
let rc = lf.vtx.lPfx.pathToTag
if rc.isErr:
return err((br.vid,rc.error))
#
# No need to update the cache unless `lf` is present there. The leaf path
# as well as the value associated with the leaf path has not been changed.
let lfTie = LeafTie(root: hike.root, path: rc.value)
if db.top.final.lTab.hasKey lfTie:
db.top.final.lTab[lfTie] = lf.vid
# Clean up stale leaf vertex which has moved to root position
db.disposeOfVtx lf.vid
  # If some `Leaf` vertex was installed as root, there must be an extra
# `LeafTie` lookup entry.
let rootVtx = db.getVtx hike.root
if rootVtx.isValid and
rootVtx != hike.legs[0].wp.vtx and
rootVtx.vType == Leaf:
let tag = rootVtx.lPfx.pathToTag.valueOr:
return err((hike.root,error))
db.top.final.lTab[LeafTie(root: hike.root, path: tag)] = hike.root
ok()
# -------------------------
proc deleteImpl(
db: AristoDbRef; # Database, top layer
hike: Hike; # Fully expanded path
lty: LeafTie; # `Patricia Trie` path root-to-leaf
): Result[void,(VertexID,AristoError)] =
## Implementation of *delete* functionality.
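  ##
  ## Rough outline (a summary of the code below, not a specification): dispose
  ## of the leaf vertex, unlink it from its parent `Branch` vertex, invalidate
  ## the Merkle keys up to the root, and collapse the parent `Branch` if only
  ## a single reference is left in it.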
# Remove leaf entry on the top
let lf = hike.legs[^1].wp
if lf.vtx.vType != Leaf:
return err((lf.vid,DelLeafExpexted))
if lf.vid in db.pPrf:
return err((lf.vid, DelLeafLocked))
  # Will be needed at the end. Just detect an error early enough
let leafVidBe = block:
let rc = db.getVtxBE lf.vid
if rc.isErr:
if rc.error != GetVtxNotFound:
return err((lf.vid, rc.error))
VertexRef(nil)
else:
rc.value
db.disposeOfVtx lf.vid
if 1 < hike.legs.len:
# Get current `Branch` vertex `br`
let br = block:
var wp = hike.legs[^2].wp
      wp.vtx = wp.vtx.dup # make sure that layers are not implicitly modified
wp
if br.vtx.vType != Branch:
return err((br.vid,DelBranchExpexted))
# Unlink child vertex from structural table
br.vtx.bVid[hike.legs[^2].nibble] = VertexID(0)
db.layersPutVtx(br.vid, br.vtx)
# Clear all keys up to the root key
for n in 0 .. hike.legs.len - 2:
let vid = hike.legs[n].wp.vid
if vid in db.top.final.pPrf:
return err((vid, DelBranchLocked))
db.nullifyKey vid
let nibble = block:
let rc = br.vtx.branchStillNeeded()
if rc.isErr:
return err((br.vid,DelBranchWithoutRefs))
rc.value
# Convert to `Extension` or `Leaf` vertex
if 0 <= nibble:
# Get child vertex (there must be one after a `Branch` node)
let nxt = block:
let vid = br.vtx.bVid[nibble]
VidVtxPair(vid: vid, vtx: db.getVtx vid)
if not nxt.vtx.isValid:
return err((nxt.vid, DelVidStaleVtx))
# Collapse `Branch` vertex `br` depending on `nxt` vertex type
case nxt.vtx.vType:
of Branch:
? db.collapseBranch(hike, nibble.byte)
of Extension:
? db.collapseExt(hike, nibble.byte, nxt.vtx)
of Leaf:
? db.collapseLeaf(hike, nibble.byte, nxt.vtx)
# Delete leaf entry
if leafVidBe.isValid:
# To be recorded on change history
db.top.final.lTab[lty] = VertexID(0)
else:
# No need to keep it any longer in cache
db.top.final.lTab.del lty
ok()
# ------------------------------------------------------------------------------
# Public functions
# ------------------------------------------------------------------------------
proc delete*(
db: AristoDbRef; # Database, top layer
hike: Hike; # Fully expanded chain of vertices
): Result[void,(VertexID,AristoError)] =
## Delete argument `hike` chain of vertices from the database
##
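  ## A minimal usage sketch (illustrative only, mirroring the byte-path
  ## variant of `delete()` further below):
  ## ::
  ##   let hike = ? path.initNibbleRange.hikeUp(root, db).mapErr toVae
  ##   ? db.delete(hike)
  ##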
# Need path in order to remove it from `lTab[]`
let lty = LeafTie(
root: hike.root,
path: ? hike.to(NibblesSeq).pathToTag().mapErr toVae)
db.deleteImpl(hike, lty)
proc delete*(
db: AristoDbRef; # Database, top layer
lty: LeafTie; # `Patricia Trie` path root-to-leaf
): Result[void,(VertexID,AristoError)] =
## Variant of `delete()`
##
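  ## This variant resolves the `Hike` internally via `lty.hikeUp(db)`. A usage
  ## sketch (illustrative only; `someRoot` and `someTag` are placeholders):
  ## ::
  ##   let lty = LeafTie(root: someRoot, path: someTag)
  ##   if db.delete(lty).isErr:
  ##     discard # e.g. no such leaf, or a locked vertex on the path
  ##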
db.deleteImpl(? lty.hikeUp(db).mapErr toVae, lty)
proc delete*(
db: AristoDbRef;
root: VertexID;
path: openArray[byte];
): Result[void,(VertexID,AristoError)] =
## Variant of `delete()`
##
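  ## The byte `path` is converted to a nibble sequence and expanded into a
  ## `Hike` via `hikeUp()` before delegating to the `Hike` variant above.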
db.delete(? path.initNibbleRange.hikeUp(root, db).mapErr toVae)
# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------