* Cosmetics, update log and exception messages
* Update `FC` base tree updater `updateBase()`
why:
Correct `forkJunction` of canonical cursor head record. When moving
the `base`, this field would be below `base` unless updated.
* Fix `FC` chain selector `findCanonicalHead()`
why:
Given a sample ref `hash` the function searched for the unique chain
containing the block header referenced by `hash`.
Unfortunately, when searching down the ancestry lineage, the function
did not necessarily stop at the end of the sub-chain. Rather, it
continued with the parent chain without noticing, thus returning the
wrong result.
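A minimal sketch of the corrected search, written in Python rather than Nim. All names (`Header`, `Head`, `fork_junction`, `find_head`) are hypothetical illustrations, not the `FC` module's API. Each sub-chain is walked from its head down to its own fork junction and no further:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Header:
    number: int
    parent: Optional[str]  # parent hash, None for the root block

@dataclass(frozen=True)
class Head:
    hash: str           # hash of the sub-chain's head block
    fork_junction: int  # lowest block number belonging to this sub-chain

def find_head(headers: dict[str, Header], heads: list[Head],
              target: str) -> Optional[Head]:
    """Return the head whose own sub-chain contains `target`, if any."""
    for head in heads:
        cur: Optional[str] = head.hash
        while cur is not None:
            hdr = headers[cur]
            if hdr.number < head.fork_junction:
                break  # end of this sub-chain: do not enter the parent chain
            if cur == target:
                return head
            cur = hdr.parent
    return None

# Main chain a1..a4, side chain b3..b4 forking off after a2.
headers = {
    "a1": Header(1, None), "a2": Header(2, "a1"),
    "a3": Header(3, "a2"), "a4": Header(4, "a3"),
    "b3": Header(3, "a2"), "b4": Header(4, "b3"),
}
heads = [Head("b4", fork_junction=3), Head("a4", fork_junction=1)]
```

Without the `fork_junction` check, a lookup of `a2` would be attributed to the side chain headed by `b4`, because the walk from `b4` passes through `a2` on its way down.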
* When calculating a new base, it must reside on the cursor arc (or leg.)
why:
The finalised block argument (that will eventually be the new base)
might be moved further down the cursor arc if it is too close to the
cursor head (typically smaller than 128 blocks.)
So the finalised block selection is shifted down the cursor arc. And
it might happen that the cursor arc itself is too small and one would
end up at a parent cursor arc. This is rejected.
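The selection rule above can be sketched as follows, in Python rather than Nim. The function name, the block-number model of an arc, and the constant are assumptions; only the figure 128 comes from the text:

```python
from typing import Optional

MIN_DISTANCE = 128  # assumed minimum gap between the new base and the cursor head

def select_base(arc_start: int, head: int, finalised: int,
                min_distance: int = MIN_DISTANCE) -> Optional[int]:
    """Pick a new base block number on the cursor arc [arc_start, head].

    The finalised block is shifted down the arc until it is at least
    `min_distance` blocks below the head. If that lands below the start
    of the arc (i.e. on a parent arc), the selection is rejected.
    """
    candidate = min(finalised, head - min_distance)
    if candidate < arc_start:
        return None  # would end up on a parent cursor arc: rejected
    return candidate
```

For an arc covering blocks 1000 to 2000, a finalised block 1950 is shifted down to 1872, while on a short arc 1950 to 2000 the same request is rejected.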
* Not starting a new cursor arc with a block already on another arc
why:
This leads to an inconsistent set of cursor arcs which are supposed to
be mutually disjoint.
* Tighten condition: A block that is not on the base tree must be on the DB
* One less TODO item
* Ignore `FC` overlapping blocks and the ones <= `base`
why:
Due to concurrently running `importBlock()` by `newPayload` RPC
requests the `FC` module layout might differ when re-visiting for
importing blocks.
* Update logging and docu
details:
Reduce some logging noise
Clarify activating/suspending syncer in log messages
* Log/trace cancellation events in scheduler
* Provide `clear()` functions for explicitly flushing data objects
* Renaming header cache functions
why:
More systematic, all functions start with prefix `dbHeader`
* Remove `danglingParent` from layout
why:
Already provided by header cache
* Remove `couplerHash` and `headHash` from layout
why:
No need to cache, `headHash` is unused and `couplerHash` used typically
once, only.
* Remove `lastLayout` from sync descriptor
why:
No need to compare changes, saving is always triggered after actively
changing the sync layout state
* Early reject unsuitable head + finalised header from CL
why:
The finalised header is only passed by its hash so the header must be
fetched somewhere, e.g. from a peer via eth/xx.
Also, finalised headers earlier than the `base` from `FC` cannot be
handled due to the `Aristo` single state database architecture.
Luckily, on a full node, the complete block history is available so
unsuitable finalised headers are stored there already which is exploited
here to avoid unnecessary network traffic.
* Code cosmetics, remove cruft, prettify logging, remove `final` metrics
detail:
The `final` layout parameter will be deprecated and later removed
* Update/re-calibrate syncer logic documentation
why:
The current implementation sucks if the `FC` module changes the
canonical branch in the middle of completing a header chain (due
to concurrent updates by the `newPayload()` logic.)
* Implement according to re-calibrated syncer docu
details:
The implementation employs the notion of named layout states (see
`SyncLayoutState` in `worker_desc.nim`) which are derived from the
state parameter triple `(C,D,H)` as described in `README.md`.
* Annotate `async` functions for non-exception tracking at compile time
details:
This also requires some additional try/except catching in the function
bodies.
* Update sync logic docu to what is to be updated
why:
The understanding of the details of how to accommodate merging
sub-chains of blocks or headers has changed. Some previous set-ups
are outright wrong.
* Clear rejected sync target so that it would not be processed again
* Use in-memory table to stash headers after FCU import has started
why:
After block import has started, there is no way to save/stash block
headers persistently. The FCU handlers always maintain a positive
transaction level and in some instances the current transaction is
flushed and re-opened.
This patch fixes an exception thrown when a block header has gone
missing.
* When resuming sync, delete stale headers and state
why:
Deleting headers saves some persistent space that would get lost
otherwise. Deleting the state after resuming prevents race
conditions.
* On clean start hibernate sync `daemon` entity before first update from CL
details:
Only reduced services are running
* accept FCU from CL
* fetch finalised header after accepting FCU (provides hash only)
* Improve text/meaning of some log messages
* Revisit error handling for useless peers
why:
A peer is abandoned if the error score is too high. This was not
properly handled for some fringe case when the error was detected at
staging time but fetching via eth/xx was ok.
* Clarify `break` meaning by using labelled `break` statements
* Fix action how to commit when sync target has been reached
why:
The sync target block number might precede the latest FCU block number.
This happens when the engine API squeezes in some request to execute
and import subsequent blocks.
This patch fixes an assert thrown when, after reaching the target, the latest
FCU block number is higher than the expected target block number.
* Update TODO list
* switch to Nim v2.0.12
* fix LruCache capitalization for styleCheck
* KzgProof/KzgCommitment for styleCheck
* TxEip4844 for styleCheck
* styleCheck issues in nimbus/beacon/payload_conv.nim
* ENode for styleCheck
* isOk for styleCheck
* some more styleCheck fixes
* more styleCheck fixes
---------
Co-authored-by: jangko <jangko128@gmail.com>
* Clarifying/commenting FCU setup condition & small fixes, comments etc.
* Update some logging
* Reorg metrics updater and activation
* Better `async` responsiveness
why:
Block import does not allow `async` task activation while
executing. So allow potential switch after each imported
block (rather than a group of 32 blocks.)
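The idea can be illustrated with Python's `asyncio` (the actual code is Nim; `import_blocks` and `import_one` are hypothetical names). The synchronous import step cannot be interrupted, so control is handed back to the scheduler after every single block:

```python
import asyncio

async def import_blocks(blocks, import_one):
    """Import blocks one by one, yielding to the event loop after each."""
    for blk in blocks:
        import_one(blk)          # synchronous step, no task switch possible here
        await asyncio.sleep(0)   # give other tasks a chance to run

async def demo():
    log = []

    async def other_task():
        for i in range(3):
            log.append(f"task {i}")
            await asyncio.sleep(0)

    await asyncio.gather(
        import_blocks(range(3), lambda b: log.append(f"import {b}")),
        other_task(),
    )
    return log
```

Running `asyncio.run(demo())` shows the other task making progress between imports instead of waiting for the whole batch to finish.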
* Handle resuming after previous sync followed by import
why:
In this case the ledger state is more recent than the saved
sync state. So this is considered a pristine sync where any
previous sync state is forgotten.
This fixes some assert thrown because of inconsistent internal
state at some point.
* Provide option for clearing saved beacon sync state before starting syncer
why:
It would resume with the last state otherwise which might be undesired
sometimes.
Without RPC available, the syncer typically stops and terminates with
the canonical head larger than the base/finalised head. The latter one
will be saved as database/ledger state and the canonical head as syncer
target. Resuming syncing here will repeat itself.
So clearing the syncer state can prevent the syncer from starting
unnecessarily, avoiding useless actions.
* Allow workers to request syncer shutdown from within
why:
In one-trick-pony mode (after resuming without RPC support) the
syncer can be stopped from within, thus avoiding unnecessary polling.
In that case, the syncer can (theoretically) be restarted externally
with `startSync()`.
* Terminate beacon sync after a single run target is reached
why:
Stops doing useless polling (typically when there is no RPC available)
* Remove crufty comments
* Tighten state reload condition when resuming
why:
Some pathological case might apply if the syncer is stopped while the
distance between finalised block and head is very large and the FCU
base becomes larger than the locked finalised state.
* Verify that finalised number from CL is at least FCU base number
why:
The FCU base number is determined by the database, non zero if
manually imported. The finalised number is passed via RPC by the CL
node and will increase over time. Unless fully synced, this number
will be pretty low.
On the other hand, the FCU call `forkChoice()` will eventually fail
if the `finalizedHash` argument refers to something outside the
internal chain starting at the FCU base block.
* Remove support for completing interrupted sync without RPC support
why:
Simplifies start/stop logic
* Remove unused import
* Update comments & logs
* Do not start beacon sync unless there is possibly something to do
why:
It would continue polling without having any effect other than
logging. Now it will not start unless there is RPC available
or there was a previously interrupted sync to be resumed.
* Accept finalised hash from RPC with the canon header as well
* Reorg internal sync descriptor(s)
details:
Update target from RPC to provide the `consensus header` as well as
the `finalised` block number
why:
Prepare for using `importBlock()` instead of `persistBlocks()`
* Cosmetic updates
details:
+ Collect all pretty printers in `helpers.nim`
+ Remove unused return codes from function prototype
* Use `importBlock()` + `forkChoice()` rather than `persistBlocks()`
* Update logging and metrics
* Update docu
* rename nimbus binary to nimbus_execution_client
* additional replacements
* makefile and dockerfile
* fix ci building errors
* github workflows
* improved Makefile target
---------
Co-authored-by: Pedro Miranda <pedro.miranda@nimbus.team>
* Fix fringe condition clarifying how to handle an empty range
why:
The `interval_set` module would treat an undefined interval construct
`[2,1]` as `[2,2]` (the right bound being `max(2,1)`.)
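A small Python sketch of the pitfall and of an explicit guard. `naive_interval` only mimics the behaviour described above and `checked_interval` is a hypothetical helper; neither is the `interval_set` API:

```python
from typing import Optional, Tuple

def naive_interval(lo: int, hi: int) -> Tuple[int, int]:
    """Mimics the described behaviour: the right bound becomes max(lo, hi)."""
    return (lo, max(lo, hi))

def checked_interval(lo: int, hi: int) -> Optional[Tuple[int, int]]:
    """Return None for an empty range instead of silently building [lo, lo]."""
    if hi < lo:
        return None
    return (lo, hi)
```

With the guard, an empty range such as `[2,1]` is reported as "nothing to do" rather than turning into the one-element interval `[2,2]`.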
* Use the `consensus head` rather than the `finalised` block as sync target
why:
The former is ahead of the `finalised` block.
* In ctx descriptor rename `final` field to `target`
* Update docu, rename `F` -> `T`
* bump nimbus-build-system to use Nim v2.0.10
* 2.0.10 fixes
* fluffy linting
* make trivial change which should trigger whole-nimbus+fluffy rebuild/ci
* Nim v2.0.10 chronicles.error/macros.error ambiguity workaround
* another contentType enum specifier
* fluffy linting
* Rename `base` -> `coupler`, `B` -> `C`
why:
Glossary: The jargon `base` is used for the `base state` block number
which can be smaller than what is now the `coupler`.
* Rename `global state` -> `base`, `T` -> `B`
why:
See glossary
* Rename `final` -> `end`, `F` -> `E`
why:
See glossary. Previously, `final` denoted some finalised block but not
`the finalised` block from the glossary (which is maximal.)
* Properly name finalised block as such, rename `Z` -> `F`
why:
See glossary
* Rename `least` -> `dangling`, `L` -> `D`
* Metrics update (variables not covered yet)
* Docu update and corrections
* Logger updates
* Remove obsolete `skeleton*Key` kvt columns from `storage_types` module
* Remove `--sync-mode` option from nimbus config
why:
Currently there is only one sync mode available.
* Rename `flare` -> `beacon`, but not base module folder and nim source
why:
The name `flare` was used to designate an alternative `beacon` mode.
Leaving the base folder and source as-is for a moment, makes it easier
to read change diffs.
* Rename `flare` base module folder and nim source: `flare` -> `beacon`
* Dissolve legacy `sync/types.nim` into `*/eth/eth_types.nim`
* Flare sync: Simplify scheduler and remove `runSingle()` method
why:
`runSingle()` is not used anymore (main purpose was for negotiating
best headers in legacy full sync.)
Also, `runMulti()` was renamed `runPeer()`
* Flare sync: Move `chain` field from `sync_desc` -> `worker_desc`
* Flare sync: Remove handler descriptor lateral reference
why:
Not used anymore. It made it possible to turn eth handler activity on/off
with regard to the sync state, i.e. from within the sync worker.
* Flare sync: Update `Hash256` and other deprecated `std/eth` symbols
* Protocols: Update `Hash256` and other deprecated `std/eth` symbols
* Eth handler: Update `Hash256` and other deprecated `std/eth` symbols
* Update flare TODO
* Remove redundant `sync/type` import
why:
The import module `type` has been removed
* Remove duplicate implementation
This is a minimal set of changes to make things work with the new types
in nim-eth - this is the minimal PR that merely resolves
incompatibilities while the full change set would include more cleanup
and migration.
* Cosmetics, small fixes, add stashed headers verifier
* Remove direct `Era1` support
why:
Era1 is indirectly supported by using the import tool before syncing.
* Clarify database persistent save function.
why:
Function relied on the last saved state block number which was wrong.
It now relies on the tx-level. If it is 0, then data are saved directly.
Otherwise the task that owns the tx will do it.
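The decision can be sketched like this, in Python rather than Nim. `persist_if_idle`, `tx_level` and `persist` are hypothetical names for illustration only:

```python
from typing import Callable

def persist_if_idle(tx_level: int, persist: Callable[[], None]) -> bool:
    """Save directly only when no transaction is open (tx level 0).

    Returns True if data was saved here, False if saving is left to
    the task that owns the open transaction.
    """
    if tx_level == 0:
        persist()
        return True
    return False
```

The point of the design is that the function no longer guesses from a saved block number; the transaction level alone decides who is responsible for writing.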
* Extracted configuration constants into separate file
* Enable single peer mode for debugging
* Fix peer losing issue in multi-mode
details:
Running concurrent download peers was previously programmed as running
a batch downloading and storing ~8k headers and then leaving the `async`
function to be restarted by a scheduler.
This was unfortunate because of occasional long waiting times for
restart.
While the time gap until restarting was typically observed to be a few
milliseconds, there were always a few outliers which well exceeded several
seconds. This seemed to let remote peers run into timeouts.
* Prefix function names `unprocXxx()` and `stagedYyy()` by `headers`
why:
There will be other `unproc` and `staged` modules.
* Remove cruft, update logging
* Fix accounting issue
details:
When staging after fetching headers from the network, there was an
off-by-one error occurring when the result was one smaller than requested.
Also, a whole range was mis-accounted when a peer was terminating
connection immediately after responding.
* Fix slow/error header accounting when fetching
why:
Originally set for detecting slow headers in a row, the counter
was wrongly extended to general errors.
* Ban peers for a while that respond with too few headers continuously
why:
Some peers only returned one header at a time. If these peers sit on a
farm, they might collectively slow down the download process.
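A minimal sketch of such a policy in Python. The thresholds and the `PeerStats` type are assumptions made for illustration, and the timed expiry of the ban ("for a while") is left out for brevity:

```python
from dataclasses import dataclass

MIN_HEADERS = 10        # assumed: fewer headers than this counts as a short reply
MAX_SHORT_IN_A_ROW = 3  # assumed: ban after this many consecutive short replies

@dataclass
class PeerStats:
    short_in_a_row: int = 0
    banned: bool = False

    def record_response(self, n_headers: int) -> None:
        """Track consecutive short replies and ban the peer on a streak."""
        if n_headers < MIN_HEADERS:
            self.short_in_a_row += 1
        else:
            self.short_in_a_row = 0  # a proper reply resets the streak
        if self.short_in_a_row >= MAX_SHORT_IN_A_ROW:
            self.banned = True
```

Counting only consecutive short replies keeps a peer with an occasional small response in the pool, while a peer that continuously returns one header at a time is dropped quickly.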
* Update RPC beacon header updater
why:
Old function hook has slightly changed its meaning since it was used
for snap sync. Also, the old hook is used by other functions already.
* Limit number of peers or set to single peer mode
details:
Merge several concepts, single peer mode being one of them.
* Some code clean up, fixes for removing compiler warnings
* De-noise header fetch related sources
why:
Header download looks relatively stable, so general debugging is not
needed, anymore. This is the equivalent of removing the scaffold from
the part of the building where work has completed.
* More clean up and code prettification for headers stuff
* Implement body fetch and block import
details:
Available headers are used to stage blocks by combining existing headers
with newly fetched blocks. Then these blocks are imported/executed via
`persistBlocks()`.
* Logger cosmetics and cleanup
* Remove staged block queue debugging
details:
Feature still available, just not executed anymore
* Docu, logging update
* Update/simplify `runDaemon()`
* Re-calibrate block body requests and soft config for import blocks batch
why:
* For fetching, larger fetch requests are mostly truncated anyway on
MainNet.
* For executing, smaller batch sizes reduce the memory needed at the
price of longer execution times.
* Update metrics counters
* Docu update
* Some fixes, formatting updates, etc.
* Update `borrowed` type: uint -> uint64
also:
Always convert to `uint64` rather than `uint` where appropriate
* init style for Hash256
https://github.com/status-im/nim-eth/pull/733 updates `Hash256` to
become an array instead of an object - unfortunately, nim does not allow
constructing arrays with `name()`, so this PR changes it to `default`
which works with both.
* lint
* Reverse order in staged blob lists
why:
having the largest block number with the least header list index `0`
makes it easier to grow the list with parent headers, i.e. decreasing
block numbers.
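The benefit of the reversed order shows in a short Python sketch (block numbers stand in for headers; `extend_with_parents` is a hypothetical helper). With the largest block number at index `0`, parents are simply appended at the end:

```python
def extend_with_parents(staged: list[int], parents: list[int]) -> None:
    """Append parent block numbers to a list kept in descending order."""
    for num in parents:
        if staged and num != staged[-1] - 1:
            raise ValueError("parent must directly precede the last entry")
        staged.append(num)

staged = [100, 99]
extend_with_parents(staged, [98, 97])
```

After the call, `staged[0]` is still the largest block number and no existing entry had to be shifted.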
* Set a header response threshold when to ditch peer
* Refactor extension of staged header chains record
why:
Was cobbled together as a proof of concept after several approaches of
how to run the download.
* TODO update
* Make debugging code independent of `release` flag
* Update import from jacek
* Block header download starting at Beacon down to Era1
details:
The header download implementation is intended to be completed to a
full sync facility.
Downloaded block headers are stored in a `CoreDb` table. Later on they
should be fetched, complemented by a block body, executed/imported,
and deleted from the table.
The Era1 repository may be partial or missing. Era1 headers are neither
downloaded nor stored on the `CoreDb` table.
Headers are downloaded top down (largest block number first) using the
hash of the block header by one peer. Other peers fetch headers
opportunistically using block numbers.
Observed download times for 14m `MainNet` headers vary between 30min
and 1h (Era1 size truncated to 66m blocks.), full download 52min
(anecdotal.) The number of peers downloading concurrently is crucial
here.
* Activate `flare` by command line option
* Fix copyright year
* Wiring ForkedChainRef to other components
- Disable majority of hive simulators
- Only enable pyspec_sim for the moment
- The pyspec_sim is using a smaller RPC service wired to ForkedChainRef
- The RPC service will gradually grow
* Addressing PR review
* Fix test_beacon/setup_env
* Enable consensus_sim (#2441)
* Enable consensus_sim
* Remove isFile check
* Enable Engine API jwt auth tests and exchange cap tests
* Enable engine api in build_sim.sh
* Wire ForkedChainRef to Engine API newPayload
* Wire Engine API getBodies to ForkedChainRef
* Wire Engine API api_forkchoice to ForkedChainRef
* Wire more RPC methods to ForkedChainRef
* Implement eth_syncing
* Implement eth_call and eth_getlogs
* TxPool: simplify smartHead
* Fix smartHead usage
* Fix txpool headDiff
* Remove hasBlockHeader and use headerExists
* Addressing review
* bump metrics
* Remove cruft
* Cosmetics, update some logging, noise control
* Renamed `CoreDb` function `hasKey` => `hasKeyRc` and provided `hasKey`
why:
Currently, `hasKey` returns a `Result[]` rather than a `bool` which
is what one would expect from a function prototype of this name.
This was a bit of an annoyance and cost unnecessary attention.
* Remove redundant `eth/68` message and clean up docu
details:
There is only eth/68 available at the moment
* Allow to turn on chronicles line number logging in `Makefile`
* Accept (and forget) tx hashes announcements
why:
Does no harm to just ignore it at the moment
* Bump nim-eth (rlp fix)
* Move snap un-dumpers to aristo unit test folder
why:
The only place where it is used, now to test the database against
legacy snap sync dump samples.
While the details of the dumped data have mostly outlived their purpose,
its use as **entropy** data thrown against `Aristo` has still been
useful to find/debug tricky DB problems.
* Remove cruft
* `nimbus-eth1-blobs` not used anymore as test data source
* remove some redundant EH
* avoid pessimising move (introduces a copy in this case!)
* shift less data around when reading era files (reduces stack usage)
* Use block number or timestamp to determine fork rules
Avoid confusion raised by `forkGTE` usage where block information is present.
* Get rid of forkGTE
* Rename `newKvt()` -> `ctx.getKvt()`
why:
Clean up legacy shortcut. Also, the `KVT` returned is not instantiated
but refers to the shared `KVT` that resides in a context which is a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `newTransaction()` -> `ctx.newTransaction()`
why:
Clean up legacy shortcut. The transaction is applied to a context as a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `getColumn(CtGeneric)` -> `getGeneric()`
why:
A list of well-known sub-tries is no longer needed, a single one is enough.
In fact, `getColumn()` only supported a single sub-tree so far.
* Reduce TODO list
* Bump nim-eth, nim-web3, nimbus-eth2
- Replace std.Option with results.Opt
- Fields name changes
* More fixes
* Fix Portal stream async raises and portal testnet Opt usage
* Bump eth + nimbus-eth2 + more fixes related to eth_types changes
* Fix in utp test app and nimbus-eth2 bump
* Fix test_blockchain_json rebase conflict
* Fix EVMC block_timestamp conversion plus commentary
---------
Co-authored-by: kdeme <kim.demey@gmail.com>
`initTable` is obsolete since nim 0.19 and can introduce significant
memory overhead while providing no benefit (since the table will be
grown to the default initial size on first use anyway).
In particular, aristo layers will not necessarily use all tables they
initialize, for example when many empty accounts are being created.
This PR consolidates the split header-body sequences into a single EthBlock
sequence and cleans up the fallout from that which significantly reduces
block processing overhead during import thanks to less garbage collection
and fewer copies of things all around.
Notably, since the number of headers must always match the number of bodies,
we also get rid of a pointless degree of freedom that in the future could
introduce unnecessary bugs.
* only read header and body from era file
* avoid several unnecessary copies along the block processing way
* simplify signatures, cleaning up unused arguments and returns
* use `stew/assign2` in a few strategic places where the generated
nim assignment is slow and add a few `move` to work around poor
analysis in nim 1.6 (will need to be revisited for 2.0)
```
stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv
                       bps_x     bps_y     tps_x        tps_y    bpsd    tpsd    timed
block_number
(498305, 713245]    1,540.52  1,809.73  2,361.58  2775.340189  17.63%  17.63%  -14.92%
(713245, 928185]      730.36    865.26  1,715.90  2028.973852  18.01%  18.01%  -15.21%
(928185, 1143126]     663.03    789.10  2,529.26  3032.490771  19.79%  19.79%  -16.28%
(1143126, 1358066]    393.46    508.05  2,152.50  2777.578119  29.13%  29.13%  -22.50%
(1358066, 1573007]    370.88    440.72  2,351.31  2791.896052  18.81%  18.81%  -15.80%
(1573007, 1787947]    283.65    335.11  2,068.93  2441.373402  17.60%  17.60%  -14.91%
(1787947, 2002888]    287.29    342.11  2,078.39  2474.179448  18.99%  18.99%  -15.91%
(2002888, 2217828]    293.38    343.16  2,208.83   2584.77457  17.16%  17.16%  -14.61%
(2217828, 2432769]    140.09    167.86  1,081.87  1296.336926  18.82%  18.82%  -15.80%
blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s
bpsd (mean): 19.55%
tpsd (mean): 19.55%
Time (total): -29m13s, -15.14%
```
* CoreDb: Remove crufty second/off-site KVT
why:
Was used to allow late `Clique` to store directly to disk
* CoreDb: Remove prune flag related functionality
why:
Is completely legacy stuff
* CoreDb: Remove dependence on legacy API (tests unsupported yet)
why:
Does not fully support Aristo
* Re-factoring `state_db` using new API
details:
Only minimum changes needed to compile `nimbus`
* Update tests and aux modules
* Turn off legacy API and remove `distinct_tries`
comment:
The legacy API has now cruft status, will be removed soon
* Fix copyright years
* Update rpc for verified proxy
---------
Co-authored-by: Jacek Sieka <jacek@status.im>