nim-eth

Commit Graph

Author

SHA1

Message

Date

Jacek Sieka

00c91a1dca

Ordered trie for computing roots (#744)

Root encoding is on the hot path for block verification in both the
consensus (when syncing) and execution clients, and oddly constitutes a
significant part of resource usage even though it is not that much work.

While the trie code is capable of producing a transaction root and
similar feats, it turns out that it is quite inefficient, even for
small workloads.

This PR brings in a helper for the specific use case of building tries
of lists of values whose key is the RLP-encoded index of the item.

As it happens, such keys follow a particular structure where items end
up "almost" sorted, with the exception of the item at index 0, which
gets encoded as `[0x80]`, i.e. the empty byte string, thus moving it to
a new location.
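
A minimal sketch in Python (not the nim-eth code) of why the keys end up
"almost" sorted. The helper names `rlp_encode_index` and `order_by_key`
are hypothetical, and the encoder only covers the small non-negative
integers that list indices need.

```python
def rlp_encode_index(n: int) -> bytes:
    """RLP-encode a non-negative integer used as a list index."""
    if n == 0:
        return b"\x80"  # zero is encoded as the empty byte string
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] < 0x80:
        return payload  # 1..127 are encoded as the byte itself
    # Short-string form: length prefix, then big-endian bytes
    # (valid for payloads shorter than 56 bytes).
    return bytes([0x80 + len(payload)]) + payload


def order_by_key(count: int) -> list:
    """Indices 0..count-1 ordered by their RLP-encoded key."""
    return sorted(range(count), key=rlp_encode_index)


if __name__ == "__main__":
    for i in (0, 1, 127, 128, 256):
        print(i, rlp_encode_index(i).hex())
    print(order_by_key(130)[125:])
```

Indices 1 to 127 sort first because their keys are the single bytes
0x01 to 0x7f; index 0 follows with 0x80, and indices from 128 upwards
carry a prefix of 0x81 or higher.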

Armed with this knowledge and the understanding that inserting ordered
items into a trie can easily be done with a simple recursion, this PR
brings a ~100x improvement in CPU usage (360 ms vs 33 s) and a ~50x
reduction in memory usage (70 MB vs >3 GB!) for the simple test of
encoding 1000000 keys.
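
The recursion over ordered keys can be sketched roughly as follows. This
is a toy illustration in Python, not the nim-eth implementation: all
names (`trie_order`, `subtree_hash`, `ordered_root`) are hypothetical,
and SHA-256 with an ad-hoc node layout stands in for the Keccak-256 and
RLP node encoding of the real Merkle Patricia trie, so the resulting
roots do not match Ethereum's.

```python
import hashlib
from itertools import groupby

EMPTY = bytes(32)  # toy placeholder for an unused branch slot


def rlp_index(n: int) -> bytes:
    """RLP-encode a list index (small non-negative integers only)."""
    if n == 0:
        return b"\x80"
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] < 0x80:
        return payload
    return bytes([0x80 + len(payload)]) + payload


def nibbles(key: bytes) -> tuple:
    """Split a key into 4-bit nibbles, high nibble first."""
    return tuple(nib for byte in key for nib in (byte >> 4, byte & 0x0F))


def trie_order(count: int):
    """Yield indices in key order without sorting: 1..127, 0, 128..."""
    yield from range(1, min(count, 128))
    if count > 0:
        yield 0
    yield from range(128, count)


def subtree_hash(items, depth=0) -> bytes:
    """Hash a run of (path, value) pairs that share path[:depth].

    Items must be in key order, so every child of a branch is a
    contiguous slice and groupby can split it in a single pass.
    """
    if len(items) == 1:
        path, value = items[0]
        return hashlib.sha256(b"leaf" + bytes(path[depth:]) + value).digest()
    children = [EMPTY] * 16
    for nib, group in groupby(items, key=lambda kv: kv[0][depth]):
        children[nib] = subtree_hash(list(group), depth + 1)
    return hashlib.sha256(b"branch" + b"".join(children)).digest()


def ordered_root(values) -> bytes:
    """Toy root of a list whose keys are the RLP-encoded indices."""
    if not values:
        return hashlib.sha256(b"").digest()
    items = [(nibbles(rlp_index(i)), values[i])
             for i in trie_order(len(values))]
    return subtree_hash(items)
```

Because RLP encodings are self-delimiting, no key is a prefix of
another, which is why the sketch needs no special case for a value
stored on a branch node.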

In part, the memory usage reduction is due to a trick where the hash of
the item is computed as the item is being added instead of storing it in
the value.
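
As a rough illustration of that trick, the sketch below hashes each item
at insertion time and keeps only the fixed-size digest. The class name
`DigestOnlyLeaves` is hypothetical, SHA-256 stands in for Keccak-256,
and the special cases a real trie applies to its nodes are ignored.

```python
import hashlib


class DigestOnlyLeaves:
    """Collects items but retains only their 32-byte digests."""

    def __init__(self):
        self.digests = []

    def add(self, value: bytes) -> None:
        # Hash immediately; the value itself is not stored.
        self.digests.append(hashlib.sha256(value).digest())

    def stored_bytes(self) -> int:
        """Total payload retained, independent of the item sizes."""
        return sum(len(d) for d in self.digests)
```

With this layout the retained memory grows with the number of items
rather than with their total size.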

There are further reductions possible, such as maintaining a hasher per
level instead of storing hash values, as well as using a direct-to-hash
RLP encoder.

2024-10-08 20:02:58 +02:00

1 Commit