Root encoding is on the hot path for block verification both in the
consensus (when syncing) and execution clients and oddly consititutes a
significant part of resource usage even though it is not that much work.
While the trie code is capable of producing a transaction root and
similar feats, it turns out that it is quite inefficient - even for
small work loads.
This PR brings in a helper for the specific use case of building tries
of lists of values whose key is the RLP-encoded index of the item.
As it happens, such keys follow a particular structure where items end
up "almost" sorted, with the exception for the item at index 0 which
gets encoded as `[0x80]`, ie the empty list, thus moving it to a new
location.
Armed with this knowledge and the understanding that inserting ordered
items into a trie easily can be done with a simple recursion, this PR
brings a ~100x improvement in CPU usage (360ms vs 33s) and a ~50x
reduction in memory usage (70mb vs >3gb!) for the simple test of
encoding 1000000 keys.
In part, the memory usage reduction is due to a trick where the hash of
the item is computed as the item is being added instead of storing it in
the value.
There are further reductions possible such as maintaining a hasher per
level instead of storing hash values as well as using a direct-to-hash
rlp encoder.
Since these types were written, we've gained an executable spec:
https://github.com/ethereum/execution-specs
This PR aligns some of the types we use with this spec to simplify
comparisons and cross-referencing.
Using a `distinct` type is a tradeoff between nim ergonomics, type
safety and the ability to work around nim quirks and stdlib weaknesses.
In particular, it allows us to overload common functions such as `hash`
with correct and performant versions as well as maintain control over
string conversions etc at the cost of a little bit of ceremony when
instantiating them.
Apart from distinct byte types, `Hash32`, is introduced in lieu of the
existing `Hash256`, again aligning this commonly used type with the spec
which picks bytes rather than bits in the name.
Currently only setting `--styleCheck:hint` as there are some
dependency fixes required and the compiler seems to trip over the
findnode MessageKind, findnode Message field and the findNode
proc. Also over protocol.Protocol usage.
* Add build_dcli target and add it to CI
* Fix local imports for dcli
* And use local imports for all other files too
* Use local imports in tests and rlpx protocols
* port kvstore from nim-beacon-chain
* remove old database backends
* use kvstore in trie database
* add sqlite dep
* avoid template param double evaluation
* clean up heterogenous lookup todo