This PR consolidates the split header-body sequences into a single EthBlock
sequence and cleans up the fallout from that which significantly reduces
block processing overhead during import thanks to less garbage collection
and fewer copies of things all around.
Notably, since the number of headers must always match the number of bodies,
we also get rid of a pointless degree of freedom that in the future could
introduce unnecessary bugs.
* only read header and body from era file
* avoid several unnecessary copies along the block processing way
* simplify signatures, cleaning up unused arguemnts and returns
* use `stew/assign2` in a few strategic places where the generated
nim assignent is slow and add a few `move` to work around poor
analysis in nim 1.6 (will need to be revisited for 2.0)
```
stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv
bps_x bps_y tps_x tps_y bpsd tpsd timed
block_number
(498305, 713245] 1,540.52 1,809.73 2,361.58 2775.340189 17.63% 17.63% -14.92%
(713245, 928185] 730.36 865.26 1,715.90 2028.973852 18.01% 18.01% -15.21%
(928185, 1143126] 663.03 789.10 2,529.26 3032.490771 19.79% 19.79% -16.28%
(1143126, 1358066] 393.46 508.05 2,152.50 2777.578119 29.13% 29.13% -22.50%
(1358066, 1573007] 370.88 440.72 2,351.31 2791.896052 18.81% 18.81% -15.80%
(1573007, 1787947] 283.65 335.11 2,068.93 2441.373402 17.60% 17.60% -14.91%
(1787947, 2002888] 287.29 342.11 2,078.39 2474.179448 18.99% 18.99% -15.91%
(2002888, 2217828] 293.38 343.16 2,208.83 2584.77457 17.16% 17.16% -14.61%
(2217828, 2432769] 140.09 167.86 1,081.87 1296.336926 18.82% 18.82% -15.80%
blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s
bpsd (mean): 19.55%
tpsd (mean): 19.55%
Time (total): -29m13s, -15.14%
```
* Update some docu & messages
* Remove cruft from the ledger modules
* Must not overwrite genesis data on an initialised database
why:
This will overwrite the global state of the Aristo single state DB.
Otherwise resuming at the last synced state becomes impossible.
* Provide latest block number from journal
why:
This relates the global state of the DB directly to the corresponding
block number.
* Implemented unit test providing DB pre-load and resume
* Disable `TransactionID` related functions from `state_db.nim`
why:
Functions `getCommittedStorage()` and `updateOriginalRoot()` from
the `state_db` module are nowhere used. The emulation of a legacy
`TransactionID` type functionality is administratively expensive to
provide by `Aristo` (the legacy DB version is only partially
implemented, anyway).
As there is no other place where `TransactionID`s are used, they will
not be provided by the `Aristo` variant of the `CoreDb`. For the
legacy DB API, nothing will change.
* Fix copyright headers in source code
* Get rid of compiler warning
* Update Aristo code, remove unused `merge()` variant, export `hashify()`
why:
Adapt to upcoming `CoreDb` wrapper
* Remove synced tx feature from `Aristo`
why:
+ This feature allowed to synchronise transaction methods like begin,
commit, and rollback for a group of descriptors.
+ The feature is over engineered and not needed for `CoreDb`, neither
is it complete (some convergence features missing.)
* Add debugging helpers to `Kvt`
also:
Update database iterator, add count variable yield argument similar
to `Aristo`.
* Provide optional destructors for `CoreDb` API
why;
For the upcoming Aristo wrapper, this allows to control when certain
smart destruction and update can take place. The auto destructor works
fine in general when the storage/cache strategy is known and acceptable
when creating descriptors.
* Add update option for `CoreDb` API function `hash()`
why;
The hash function is typically used to get the state root of the MPT.
Due to lazy hashing, this might be not available on the `Aristo` DB.
So the `update` function asks for re-hashing the gurrent state changes
if needed.
* Update API tracking log mode: `info` => `debug
* Use shared `Kvt` descriptor in new Ledger API
why:
No need to create a new descriptor all the time
* Nimbus folder environment update
details:
* Integrated `CoreDbRef` for the sources in the `nimbus` sub-folder.
* The `nimbus` program does not compile yet as it needs the updates
in the parallel `stateless` sub-folder.
* Stateless environment update
details:
* Integrated `CoreDbRef` for the sources in the `stateless` sub-folder.
* The `nimbus` program compiles now.
* Premix environment update
details:
* Integrated `CoreDbRef` for the sources in the `premix` sub-folder.
* Fluffy environment update
details:
* Integrated `CoreDbRef` for the sources in the `fluffy` sub-folder.
* Tools environment update
details:
* Integrated `CoreDbRef` for the sources in the `tools` sub-folder.
* Nodocker environment update
details:
* Integrated `CoreDbRef` for the sources in the
`hive_integration/nodocker` sub-folder.
* Tests environment update
details:
* Integrated `CoreDbRef` for the sources in the `tests` sub-folder.
* The unit tests compile and run cleanly now.
* Generalise `CoreDbRef` to any `select_backend` supported database
why:
Generalisation was just missed due to overcoming some compiler oddity
which was tied to rocksdb for testing.
* Suppress compiler warning for `newChainDB()`
why:
Warning was added to this function which must be wrapped so that
any `CatchableError` is re-raised as `Defect`.
* Split off persistent `CoreDbRef` constructor into separate file
why:
This allows to compile a memory only database version without linking
the backend library.
* Use memory `CoreDbRef` database by default
detail:
Persistent DB constructor needs to import `db/core_db/persistent
why:
Most tests use memory DB anyway. This avoids linking `-lrocksdb` or
any other backend by default.
* fix `toLegacyBackend()` availability check
why:
got garbled after memory/persistent split.
* Clarify raw access to MPT for snap sync handler
why:
Logically, `kvt` is not the raw access for the hexary trie (although
this holds for the legacy database)
* Extract RocksDB timing tests from snap unit tests as separate module
why:
Declutter, make space for more snap related unit tests.
* Renamed `undumpNextGroup()` => `undumpBlocks()`
why:
Source file name is called `undump_blocks.nim` which should be sort
of in sync with the method name(s).
* Implement snap/1 server method `getByteCodes()`
* Implement snap/1 client method `getByteCodes()`
* Implement faculty for handling contract code fetching via snap/1
* Provide persistent storage for contract code records
* Implement contract code snap sync fetch & store
* Code massage, cosmetics
* Unit tests for verifying snap sync snapshot dump
details:
Use `undump_kvp.dumpAllDb()` to dump any database.
* Add state root to node steps path register `RPath` or `XPath`
why:
Typically, the first node in the path register is the state root. There
are occasions, when the path register is empty (i.e. there are no node
references) which typically applies to a zero node key.
In order to find the next node key greater than zero, the state root is
is needed which is now part of the `RPath` or `XPath` data types.
* Extracted hexary tree debugging functions into separate files
* Update empty path fringe case for left/right node neighbour
why:
When starting at zero, the node steps path register would be empty. So
will any path that is before the fist non-zero link of a state root (if
it is a `Branch` node.)
The `hexaryNearbyRight()` or `hexaryNearbyLeft()` function required a
non-zero node steps path register. Now the first node is to be advanced
starting at the first state root link if necessary.
* Simplify/reorg neighbour node finder
why:
There was too mach code repetition for the cases
* persistent or in-memory database
* left or right move
details:
Most algorithms apply for persistent and in-memory alike. Using
templates/generic functions most of these algorithms can be stated
in a unified way
* Update storage slots snap/1 handler
details:
Minor changes to be more debugging friendly.
* Fix detection of full database for snap sync
* Docu: Snap sync test & debugging scenario
* Re-implemented `hexaryFollow()` in a more general fashion
details:
+ New name for re-implemented `hexaryFollow()` is `hexaryPath()`
+ Renamed `rTreeFollow()` as `hexaryPath()`
why:
Returning similarly organised structures, the results of the
`hexaryPath()` functions become comparable when running over
the persistent and the in-memory databases.
* Added traversal functionality for persistent ChainDB
* Using `Account` values as re-packed Blob
* Repack samples as compressed data files
* Produce test data
details:
+ Can force pivot state root switch after minimal coverage.
+ For emulating certain network behaviour, downloading accounts stops for
a particular pivot state root if 30% (some static number) coverage is
reached. Following accounts are downloaded for a later pivot state root.