Because EthBlock is quite large, the stack usage that results from the
multiple copies (temporary and not) present in the import command is
larger than it should be - this PR moves some of that data to a closure
environment allocated once per EthBlock - a larger restructuring of the
code is due but in the meantime, this simple change speeds up garbage
collection a little bit.
* Remove redundant `eth/68` message and clean up docu
details:
There is only eth/68 available at the moment
* Allow to turn on chronicles line number logging in `Makefile`
* Accept (and forget) tx hashes announcements
why:
Does no harm to just ignore it at the moment
* Bump nim-eth (rlp fix)
When the stack has an empty layer on top, there's no need to copy the
contents of `top` to it since it would be the same.
~13% processing saved (!)
pre
```
INF 2024-08-17 19:11:31.748+02:00 Imported blocks
blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812
mgas=181135.177 bps=8.763 tps=1375.062 mgps=132.125 avgBps=6.798
avgTps=1018.501 avgMGps=102.617 elapsed=29m25s154ms
```
post
```
INF 2024-08-17 18:22:52.513+02:00 Imported blocks
blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812
mgas=181135.177 bps=9.648 tps=1513.961 mgps=145.472 avgBps=7.876
avgTps=1179.998 avgMGps=118.888 elapsed=25m23s572ms
```
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).
This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.
`ldb` says this is 60gb of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize
--hex --from=0x05
--to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
* Move snap un-dumpers to aristo unit test folder
why:
The only place where it is used, now to test the database against
legacy snap sync dump samples.
While the details of the dumped data have mostly outlived their purpuse,
its use as **entropy** data thrown against `Aristo` has still been
useful to find/debug tricky DB problems.
* Remove cruft
* `nimbus-eth1-blobs` not used anymore as test data source
* Cleaning up, removing cruft and debugging statements
* Make `aristo_delta` fluffy compatible
why:
A sub-module that uses `chronicles` must import all possible
modules used by a parent module that imports the sub-module.
* update TODO
* Extract sub-tree deletion functions into separate sub-modules
* Move/rename `aristo_desc.accLruSize` => `aristo_constants.ACC_LRU_SIZE`
* Lazily delete sub-trees
why:
This gives some control of the memory used to keep the deleted vertices
in the cached layers. For larger sub-trees, keys and vertices might be
on the persistent backend to a large extend. This would pull an amount
of extra information from the backend into the cached layer.
For lazy deleting it is enough to remember sub-trees by a small set of
(at most 16) sub-roots to be processed when storing persistent data.
Marking the tree root deleted immediately allows to let most of the code
base work as before.
* Comments and cosmetics
* No need to import all for `Aristo` here
* Kludge to make `chronicle` usage in sub-modules work with `fluffy`
why:
That `fluffy` would not run with any logging in `core_deb` is a problem
I have known for a while. Up to now, logging was only used for debugging.
With the current `Aristo` PR, there are cases where logging might be
wanted but this works only if `chronicles` runs without the
`json[dynamic]` sinks.
So this should be re-visited.
* More of a kludge
why:
It is not safe in general to recycle vertex IDs while the `RocksDb`
cache has `VertexID` rather than `RootedVertexID` where the former
type seems preferable.
In some fringe cases one might remove a vertex with key `(root1,vid)`
and insert another vertex with key `(root2,vid)` while re-using the
vertex ID `vid`. Without knowledge of `root1` and `root2`, the LRU
cache will return the same vertex for `(root2,vid)` also for
`(root1,vid)`.
Based on some simple testing done with a few combinations of cache
sizes, it seems that the block cache has grown in importance compared to
the where we were before changing on-disk format and adding a lot of
other point caches.
With these settings, there's roughly a 15% performance increase when
processing blocks in the 18M range over the status quo while memory
usage decreases by more than 1gb!
Only a few values were tested so there's certainly more to do here but
this change sets up a better baseline for any future optimizations.
In particular, since the initial defaults were chosen root vertex id:s
were introduced as key prefixes meaning that storage for each account
will be grouped together and thus it becomes more likely that a block
loaded from disk will be hit multiple times - this seems to give the
block cache an edge over the row cache, specially when traversing the
storage trie.
* Update state network to use addressHash instead of address in contract trie and contract code content keys.
* Fix path calculation bug in getParent when working with extension nodes.
* Bump portal spec tests repo.
* Finish updating tests due to portal test vector changes.
* Update Fluffy state bridge to use addressHash.
* Update Fluffy book with correct commands for running portal hive tests.
* Provide portal proof functions in `aristo_api`
why:
So it can be fully supported by `CoreDb`
* Fix prototype in `kvt_api`
* Fix node constructor for account leafs with storage trees
* Provide simple path check based on portal proof functionality
* Provide portal proof functionality in `CoreDb`
* Update TODO list
* Extracted `test_tx.testTxMergeProofAndKvpList()` => separate file
* Fix serialiser
why:
Typo lead to duplicate rlp-encoded nodes in chain
* Remove cruft
* Implemnt portal proof nodes generators `partXxxTwig()`
* Add unit test for portal proof nodes generator `partAccountTwig()`
* Cosmetics
* Simplify serialiser return code format
* Fix proof generator for extension nodes
why:
Code was simply bonkers, not detected before the unit tests were
adapted to check for just this.
* Implemented portal proof nodes verifier `partUntwig()`
* Cosmetics
* Fix `testutp` cli poblem
* Use RPC batching to send offer requests and filter out duplicate offers.
* Lookup offers after gossip to check if gossip successful.
* Use multiple workers for gossiping offers.
* Update Fluffy state network logging.
* Use single RPC calls instead of batching.
* Update cli parameters.
* Fix bug in contract trie offer building.
* Create block offers queue and collect account preimages.
* Implement iterators to return account and storage proofs and bytecode from updatedCaches.
* Implement building offers from proofs.
* Refactor BlockDataRef type to only include required fields.
* Store block data in database.
* Improve state diff types.
* Implement start state backfill from specific block.
* Record last persisted block number in database.
* Persist account preimages in db.
* Apply state updates for DAO hard fork.
* Implement state gossip of block offers via portal JSON RPC.
* Implement partial trees
why:
This is currently needed for unit tests to pre-load the database
with test data similar to `proof` node pre-load.
The basic features for `snap-sync` boundary proofs are available
as well for future use. What is missing is the final proof verification
and a complete storage data load/merge function (stub is available.)
* Cosmetics, clean up
* remove some redundant EH
* avoid pessimising move (introduces a copy in this case!)
* shift less data around when reading era files (reduces stack usage)