nimbus-eth1/nimbus/db/aristo/aristo_desc.nim

182 lines
6.1 KiB
Nim
Raw Permalink Normal View History

# nimbus-eth1
Core db update storage root management for sub tries (#1964) * Aristo: Re-phrase `LayerDelta` and `LayerFinal` as object references why: Avoids copying in some cases * Fix copyright header * Aristo: Verify `leafTie.root` function argument for `merge()` proc why: Zero root will lead to inconsistent DB entry * Aristo: Update failure condition for hash labels compiler `hashify()` why: Node need not be rejected as long as links are on the schedule. In that case, `redo[]` is to become `wff.base[]` at a later stage. This amends an earlier fix, part of #1952 by also testing against the target nodes of the `wff.base[]` sets. * Aristo: Add storage root glue record to `hashify()` schedule why: An account leaf node might refer to a non-resolvable storage root ID. Storage root node chains will end up at the storage root. So the link `storage-root->account-leaf` needs an extra item in the schedule. * Aristo: fix error code returned by `fetchPayload()` details: Final error code is implied by the error code form the `hikeUp()` function. * CoreDb: Discard `createOk` argument in API `getRoot()` function why: Not needed for the legacy DB. For the `Arsto` DB, a lazy approach is implemented where a stprage root node is created on-the-fly. * CoreDb: Prevent `$$` logging in some cases why: Logging the function `$$` is not useful when it is used for internal use, i.e. retrieving an an error text for logging. * CoreDb: Add `tryHashFn()` to API for pretty printing why: Pretty printing must not change the hashification status for the `Aristo` DB. So there is an independent API wrapper for getting the node hash which never updated the hashes. * CoreDb: Discard `update` argument in API `hash()` function why: When calling the API function `hash()`, the latest state is always wanted. For a version that uses the current state as-is without checking, the function `tryHash()` was added to the backend. * CoreDb: Update opaque vertex ID objects for the `Aristo` backend why: For `Aristo`, vID objects encapsulate a numeric `VertexID` referencing a vertex (rather than a node hash as used on the legacy backend.) For storage sub-tries, there might be no initial vertex known when the descriptor is created. So opaque vertex ID objects are supported without a valid `VertexID` which will be initalised on-the-fly when the first item is merged. * CoreDb: Add pretty printer for opaque vertex ID objects * Cosmetics, printing profiling data * CoreDb: Fix segfault in `Aristo` backend when creating MPT descriptor why: Missing initialisation error * CoreDb: Allow MPT to inherit shared context on `Aristo` backend why: Creates descriptors with different storage roots for the same shared `Aristo` DB descriptor. * Cosmetics, update diagnostic message items for `Aristo` backend * Fix Copyright year
2024-01-11 19:11:38 +00:00
# Copyright (c) 2023-2024 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.
## Aristo DB -- a Patricia Trie with labeled edges
## ===============================================
##
## These data structures allow to overlay the *Patricia Trie* with *Merkel
## Trie* hashes. See the `README.md` in the `aristo` folder for documentation.
##
## Some semantic explanations;
##
## * HashKey, NodeRef etc. refer to the standard/legacy `Merkle Patricia Tree`
## * VertexID, VertexRef, etc. refer to the `Aristo Trie`
##
{.push raises: [].}
import
std/[hashes, sets, tables],
eth/common,
results,
./aristo_constants,
./aristo_desc/[desc_error, desc_identifiers, desc_nibbles, desc_structural],
minilru
from ./aristo_desc/desc_backend
import BackendRef
# Not auto-exporting backend
export
tables, aristo_constants, desc_error, desc_identifiers, desc_nibbles,
desc_structural, minilru, common
type
AristoTxRef* = ref object
## Transaction descriptor
db*: AristoDbRef ## Database descriptor
parent*: AristoTxRef ## Previous transaction
txUid*: uint ## Unique ID among transactions
level*: int ## Stack index for this transaction
AristoDbRef* = ref object
## Three tier database object supporting distributed instances.
top*: LayerRef ## Database working layer, mutable
stack*: seq[LayerRef] ## Stashed immutable parent layers
balancer*: LayerRef ## Balance out concurrent backend access
backend*: BackendRef ## Backend database (may well be `nil`)
txRef*: AristoTxRef ## Latest active transaction
txUidGen*: uint ## Tx-relative unique number generator
accLeaves*: LruCache[Hash32, VertexRef]
## Account path to payload cache - accounts are frequently accessed by
## account path when contracts interact with them - this cache ensures
## that we don't have to re-traverse the storage trie for every such
## interaction
## TODO a better solution would probably be to cache this in a type
## exposed to the high-level API
stoLeaves*: LruCache[Hash32, VertexRef]
## Mixed account/storage path to payload cache - same as above but caches
## the full lookup of storage slots
# Debugging data below, might go away in future
xMap*: Table[HashKey,RootedVertexID] ## For pretty printing/debugging
Leg* = object
## For constructing a `VertexPath`
wp*: VidVtxPair ## Vertex ID and data ref
nibble*: int8 ## Next vertex selector for `Branch` (if any)
Hike* = object
## Trie traversal path
root*: VertexID ## Handy for some fringe cases
legs*: ArrayBuf[NibblesBuf.high + 1, Leg] ## Chain of vertices and IDs
tail*: NibblesBuf ## Portion of non completed path
# ------------------------------------------------------------------------------
# Public helpers
# ------------------------------------------------------------------------------
template mixUp*(accPath, stoPath: Hash32): Hash32 =
# Insecure but fast way of mixing the values of two hashes, for the purpose
# of quick lookups - this is certainly not a good idea for general Hash32
# values but account paths are generated from accounts which would be hard
# to create pre-images for, for the purpose of collisions with a particular
# storage slot
var v {.noinit.}: Hash32
for i in 0..<v.data.len:
# `+` wraps leaving all bits used
v.data[i] = accPath.data[i] + stoPath.data[i]
v
func getOrVoid*[W](tab: Table[W,VertexRef]; w: W): VertexRef =
tab.getOrDefault(w, VertexRef(nil))
Aristo db update for short nodes key edge cases (#1887) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
func getOrVoid*[W](tab: Table[W,NodeRef]; w: W): NodeRef =
tab.getOrDefault(w, NodeRef(nil))
func getOrVoid*[W](tab: Table[W,HashKey]; w: W): HashKey =
tab.getOrDefault(w, VOID_HASH_KEY)
func getOrVoid*[W](tab: Table[W,RootedVertexID]; w: W): RootedVertexID =
tab.getOrDefault(w, default(RootedVertexID))
func getOrVoid*[W](tab: Table[W,HashSet[RootedVertexID]]; w: W): HashSet[RootedVertexID] =
tab.getOrDefault(w, default(HashSet[RootedVertexID]))
Aristo db update for short nodes key edge cases (#1887) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
# --------
func isValid*(vtx: VertexRef): bool =
vtx != VertexRef(nil)
func isValid*(nd: NodeRef): bool =
nd != NodeRef(nil)
func isValid*(pid: PathID): bool =
pid != VOID_PATH_ID
func isValid*(layer: LayerRef): bool =
layer != LayerRef(nil)
func isValid*(root: Hash32): bool =
Aristo db update for short nodes key edge cases (#1887) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
root != EMPTY_ROOT_HASH
Aristo db update for short nodes key edge cases (#1887) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
func isValid*(key: HashKey): bool =
assert key.len != 32 or key.to(Hash32).isValid
0 < key.len
func isValid*(vid: VertexID): bool =
vid != VertexID(0)
func isValid*(rvid: RootedVertexID): bool =
rvid.vid.isValid and rvid.root.isValid
func isValid*(sqv: HashSet[RootedVertexID]): bool =
sqv.len > 0
Aristo db update for short nodes key edge cases (#1887) * Aristo: Provide key-value list signature calculator detail: Simple wrappers around `Aristo` core functionality * Update new API for `CoreDb` details: + Renamed new API functions `contains()` => `hasKey()` or `hasPath()` which disables the `in` operator on non-boolean `contains()` functions + The functions `get()` and `fetch()` always return a not-found error if there is no item, available. The new functions `getOrEmpty()` and `mergeOrEmpty()` return an an empty `Blob` if there is no such key found. * Rewrite `core_apps.nim` using new API from `CoreDb` * Use `Aristo` functionality for calculating Merkle signatures details: For debugging, the `VerifyAristoForMerkleRootCalc` can be set so that `Aristo` results will be verified against the legacy versions. * Provide general interface for Merkle signing key-value tables details: Export `Aristo` wrappers * Activate `CoreDb` tests why: Now, API seems to be stable enough for general tests. * Update `toHex()` usage why: Byteutils' `toHex()` is superior to `toSeq.mapIt(it.toHex(2)).join` * Split `aristo_transcode` => `aristo_serialise` + `aristo_blobify` why: + Different modules for different purposes + `aristo_serialise`: RLP encoding/decoding + `aristo_blobify`: Aristo database encoding/decoding * Compacted representation of small nodes' links instead of Keccak hashes why: Ethereum MPTs use Keccak hashes as node links if the size of an RLP encoded node is at least 32 bytes. Otherwise, the RLP encoded node value is used as a pseudo node link (rather than a hash.) Such a node is nor stored on key-value database. Rather the RLP encoded node value is stored instead of a lode link in a parent node instead. Only for the root hash, the top level node is always referred to by the hash. This feature needed an abstraction of the `HashKey` object which is now either a hash or a blob of length at most 31 bytes. This leaves two ways of representing an empty/void `HashKey` type, either as an empty blob of zero length, or the hash of an empty blob. * Update `CoreDb` interface (mainly reducing logger noise) * Fix copyright years (to make `Lint` happy)
2023-11-08 12:18:32 +00:00
# ------------------------------------------------------------------------------
# Public functions, miscellaneous
# ------------------------------------------------------------------------------
# Hash set helper
func hash*(db: AristoDbRef): Hash =
## Table/KeyedQueue/HashSet mixin
cast[pointer](db).hash
# ------------------------------------------------------------------------------
# Public helpers
# ------------------------------------------------------------------------------
iterator rstack*(db: AristoDbRef): LayerRef =
# Stack in reverse order
for i in 0..<db.stack.len:
yield db.stack[db.stack.len - i - 1]
proc deltaAtLevel*(db: AristoDbRef, level: int): LayerRef =
if level == 0:
db.top
elif level > 0:
doAssert level <= db.stack.len
db.stack[^level]
elif level == -1:
doAssert db.balancer != nil
db.balancer
elif level == -2:
nil
else:
raiseAssert "Unknown level " & $level
# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------