* Block header download starting at Beacon down to Era1
details:
The header download implementation is intended to be completed to a
full sync facility.
Downloaded block headers are stored in a `CoreDb` table. Later on they
should be fetched, complemented by a block body, executed/imported,
and deleted from the table.
The Era1 repository may be partial or missing. Era1 headers are neither
downloaded nor stored on the `CoreDb` table.
Headers are downloaded top down (largest block number first) using the
hash of the block header by one peer. Other peers fetch headers
opportunistically using block numbers
Observed download times for 14m `MainNet` headers varies between 30min
and 1h (Era1 size truncated to 66m blocks.), full download 52min
(anectdotal.) The number of peers downloading concurrently is crucial
here.
* Activate `flare` by command line option
* Fix copyright year
* replace rocksdb row cache with larger rdb lru caches - these serve the
same purpose but are more efficient because they skips serialization,
locking and rocksdb layering
* don't append fresh items to cache - this has the effect of evicting
the existing items and replacing them with low-value entries that might
never be read - during write-heavy periods of processing, the
newly-added entries were evicted during the store loop
* allow tuning rdb lru size at runtime
* add (hidden) option to print lru stats at exit (replacing the
compile-time flag)
pre:
```
INF 2024-09-03 15:07:01.136+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.675 tps=1870.397 mgps=176.819 avgBps=10.288
avgTps=1574.889 avgMGps=155.952 elapsed=19m26s458ms
```
post:
```
INF 2024-09-03 13:54:26.730+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.637 tps=1864.384 mgps=176.250 avgBps=11.202
avgTps=1714.920 avgMGps=169.818 elapsed=17m51s211ms
```
9%:ish import perf improvement on similar mem usage :)
* bump metrics
* Remove cruft
* Cosmetics, update some logging, noise control
* Renamed `CoreDb` function `hasKey` => `hasKeyRc` and provided `hasKey`
why:
Currently, `hasKey` returns a `Result[]` rather than a `bool` which
is what one would expect from a function prototype of this name.
This was a bit of an annoyance and cost unnecessary attention.
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).
This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.
`ldb` says this is 60gb of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize
--hex --from=0x05
--to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
This significantly speeds up block import at the cost of less protection
against invalid data, potentially resulting in an invalid database
getting stored.
The risk is small given that import is used only for validated data -
evaluating the right level of of validation vs performance is left for a
future PR.
A side effect of this approach is that there is no cached stated root in
the database - computing it currently requires a lot of memory since the
intermediate roots get cached in memory in full while the computation is
ongoing - a future PR will need to address this deficiency, for example
by streaming the already-computed hashes directly to the database.
When performing block import, we can batch state root verifications and
header checks, doing them only once per chunk of blocks, assuming that
the other blocks in the batch are valid by extension.
When we're not generating receipts, we can also skip per-transaction
state root computation pre-byzantium, which is what provides a ~20%
speedup in this PR, at least on those early blocks :)
We also stop storing transactions, receipts and uncles redundantly when
importing from era1 - there is no need to waste database storage on this
when we can load it from the era1 file (eventually).
* Cleanup unneeded stateless and block witness code. Keeping MultiKeys which is used in the eth_getProofsByBlockNumber RPC endpoint which is needed for the Fluffy state network bridge.
* Rename generateWitness flag to collectWitnessData to better describe what the flag does. We only collect the keys of the touched accounts and storage slots but no block witness generation is supported for now.
* Move remaining stateless code into nimbus directory.
* Add vmstate parameter to ChainRef to fix test.
* Exclude *.in from check copyright year
---------
Co-authored-by: jangko <jangko128@gmail.com>
This new option saves a CSV to disk while performing `import` such that
the performance of one import can be compared with the other.
This early version is likely to change in the future
These options are there mainly to drive experiments, and are therefore
hidden.
One thing that this PR brings in is an initial set of caches and buffers for rocksdb - the set that I've been using during various performance tests to get to a viable baseline performance level.
This PR extends the `nimbus import` command to also allow reading from
era files - this command allows creating or topping up an existing
database with data coming from era files instead of network sync.
* add `--era1-dir` and `--max-blocks` options to command line
* make `persistBlocks` report basic stats like transactions and gas
* improve error reporting in several API
* allow importing multiple RLP files in one go
* clean up logging options to match nimbus-eth2
* make sure database is closed properly on shutdown
* Attempt to roll back stateless mode implementation in a single PR
why:
+ Stateless mode is not fully working and in the way
+ Single PR should make it feasible to investigate for a possible
re-implementation
* Fix copyright year
* Fix annotation for exception (evmc mode)
* Update README
* Nimbus-main: replaced `PruneMode` options by `ChainDbMode` options
details:
For the legacy database, this changes the phrase
- `conf.pruneMode == PruneMode.Full` to the expression
+ `conf.chainDbMode == ChainDbMode.Prune`.
* Fix issues moaned about by NIM compiler
* Fix copyright year
* Added procs to get and store block witness in db and add generate-witness cli flag.
* Completed initial implementation of block witness storage.
* Added test to verify witness is persisted to db after call to persistBlock.
* Update getBlockWitness to return witness using Result type.
* Make generate witness parameter hidden.
* Nimbus light client integration with status-go
* Add cleanup code, address review comments
* Disable metrics for libverifproxy only
* Update confutils
* missing import
* build proxy in tests
* more build stuff
* namespace make vars
* export NimMain for windows
* reduce dependency on Nim compiler in header file
* copyright
---------
Co-authored-by: Vitaliy Vlasov <siphiuel@protonmail.com>
Co-authored-by: Jacek Sieka <jacek@status.im>
* Completed draft implementation of witness JSON-RPC endpoints for portal network bridge.
* Updated Nimbus RPC configuration to support enabling experimental endpoints.
* Moved witness verification tests.
* Added json test for getProof.
* Added main procs to new tests to fix test suite.
* Added getBlockWitness test to blockchain json test suite.
* Added tests for experimental RPC endpoints and improved the API to support returning state proofs from before or after block execution.
* Correctly rollback transaction in getBlockWitness proc.
* Improve logging and logging options in Fluffy
- Allow selection of log format, including:
- JSON
- automatic selection based on tty
- Allow log levels per topic configured on cli
* Somewhat tighten error handling
why:
Zombie state is invoked when the current peer turns out to be useless
for further communication. While there is a chance to further talk
to a peer about another topic (aka healing) after some protocol failure,
it makes no sense to do so after a network problem.
The latter state is explained bu the `peerDegraded` flag that goes
together with the `zombie` state flag. A degraded peer is dropped
immediately.
* Remove `--sync-mode=snapCtx` option, always start snap in recovery mode
why:
No need for a snap sync option without recovery mode, can be achieved
by deleting the database.
* Code cosmetics, typos, prettify logging, debugging helper, etc.
* Split off snap sync sub-mode handler into separate modules
details:
The original `worker.nim` source has become a multiplexer for several
snap sync sub-modes `full` and `snap`. The source modules of the
incarnations of a particular sync sub-mode are places into the
`worker/play` directory.
* Update ticker for snap and full sync logging
* Clean up some function prototypes
why:
Simplify polymorphic prototype variances for easier maintenance.
* Fix fringe condition crash when importing bogus RLP node
why:
Accessing non-list RLP entry as a list causes `Defect`
* Fix left boundary proof at range extractor
why:
Was insufficient. The main problem was that there was no unit test for
the validity of the generated left boundary.
* Handle incomplete left boundary proofs early
why:
Attempt to do it later leads to overly complex code in order to prevent
looping when the same peer repeats to send the same incomplete proof.
Contrary, gaps in the leaf sequence can be handled gracefully with
registering the gaps
* Implement a manual pivot setup mechanism for snap sync
why:
For a test scenario it is convenient to set the pivot to something
lower than the beacon header from the consensus layer. This does not
need rely on any RPC mechanism.
details:
The file containing the pivot specs is specified by the
`--sync-ctrl-file` option. It is regularly parsed for updates.
* Fix calculation error
why:
Prevent from calculating negative square root
* Enable `snap/1` accounts range service
* Allow to change the garbage collector to `boehm` as a Makefile option.
why:
There is still an unsolved memory corruption problem that might be
related to the standard `gc`. It seemingly goes away if the `gc` is
changed to `boehm`.
Specifying another `gc` on the make level simplifies debugging and
development.
* Code cosmetics
details:
* updated exception annotations
* extracted `worker_desc.nim` from `full/worker.nim`
* etc.
* Implement option to state a sync modifier file
why:
This allows to specify extra sync type specific options which might
change over time. This file is regularly checked for updates.
* Implement a threshold when to suspend full syncing
why:
For a test scenario, a full sync beep may work as a local snap server.
There is no need to download the full block chain.
details:
The file containing the pivot specs is specified by the
`--sync-ctrl-file` option. It is regularly parsed for updates.
* Removed some Windows specific unit test annoyances
details:
+ Short put()/get() cycles on persistent database have a race condition
with vendor rocksdb. On a specific (and slow) qemu/win7 a 50ms `sleep()`
in between will mostly do the job (i.e. unless heavy CPU load.) This
issue was not observed on github/ci.
+ Removed annoyances when qemu/Win7 keeps the rocksdb database files
locked even after closing the db. The problem is solved by strictly
using fresh names for each test. No assumption made to be able to
properly clean up. This issue was not observed on github/ci.
* Silence some compiler gossip -- part 7, misc/non(sync or graphql)
details:
Adding some missing exception annotation
* Reduce Nim 1.6 compiler warnings/hints for Fluffy and Nimbus proxy
Mostly raises Defect removals, TaintedString removal and some
unnecessary imports.
Also updating the copyright years alongside.
* Further reduce Nim 1.6 compiler warnings/hints for Nimbus
* Silence some compiler gossip -- part 5, common
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 6, db, rpc, utils
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 7, randomly collected source files
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Silence some compiler gossip -- part 8, assorted tests
details:
Mostly removing redundant imports and `Defect` tracer after switch
to nim 1.6
* Clique update
why:
More impossible exceptions (undoes temporary fix from previous PR)
* Cosmetics, update logger `topics`
* Clean up sync/start methods in nimbus
why:
* The `protocols` list selects served (as opposed to sync) protocols only.
* The `SyncMode.Default` object is allocated with the other possible sync
mode objects.
* Add snap service stub to `nimbus`
* Provide full set of snap response handler stubs
* Bicarb for the latest CI hiccup
why:
Might be a change in the CI engine for MacOS.