* Rename `LeafRange` => `NodeTagRange`
* Replacing storage slot partition point by interval
why:
The partition point only allows describing the slot range `[point,high(Uint256)]`
when fetching slot ranges. This has been generalised to arbitrary
intervals (see the sketch below.)
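A minimal sketch of the difference (hypothetical names, `uint64` standing in for `UInt256`):

```nim
# Sketch only: a partition point can only express the upper range
# [point, high], whereas an interval can express any slot sub-range.
type
  PartitionPoint = uint64                    # old scheme
  SlotRange = tuple[minPt, maxPt: uint64]    # new scheme: closed interval

proc asRange(p: PartitionPoint): SlotRange =
  (minPt: p, maxPt: high(uint64))            # the only range a point can express

when isMainModule:
  doAssert asRange(5).maxPt == high(uint64)
```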
* Replacing `SnapAccountRanges` by `SnapTrieRangeBatch`
why:
Generalised healing status for accounts, and later for storage slots.
* Improve accounts healing loop
* Split `snap_db` into accounts and storage modules
why:
It is cleaner to have separate session descriptors for accounts and
storage slots (based on a common base descriptor.)
Also, persistent storage handling might change in the future, which
requires the storage slot implementation to be disentangled from the
accounts handling.
* Re-model worker queues for storage slots
why:
There is a dynamic list of storage sub-tries, each of which has to be
treated similarly to the accounts database. This applies to slot
interval downloads as well as to healing.
* Compress some return value report lists for snapdb methods
why:
No need to report all handling details for work items that are filtered
out and discarded anyway.
* Remove inner loop frame from healing function
why:
The healing function runs as a loop body already.
* Split fetch accounts into sub-modules
details:
There will be separate modules for the accounts snapshot, the storage
snapshot, and healing for either.
* Allow to rebase pivot before negotiated header
why:
Peers seem to have only a few snapshots available. By setting back the
pivot block header slightly, the chances might be higher to find more
peers able to serve this pivot. An experiment on mainnet showed that
setting it back too far (tested with 1024) seems to decrease the chances
of finding matching snapshot peers.
* Add accounts healing
* Update variable/field naming in `worker_desc` for readability
* Handle leaf nodes in accounts healing
why:
There is no need to fetch accounts that have already been added by the
healing process. On the flip side, these accounts must be checked for
storage data and the batch queue updated accordingly.
* Reorganising accounts hash ranges batch queue
why:
The aim is to formally cover as many accounts as possible for different
pivot state root environments. Formerly, this was attempted by starting
the accounts batch queue at a random value for each pivot (and wrapping
around.)
Now, each pivot environment starts with an interval set mutually
disjoint from the interval sets retrieved with other pivot state roots
(see the sketch below.)
also:
Stop fishing for more pivots in `worker` once 100% download is reached
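A minimal sketch of the disjoint-range idea (hypothetical names, `uint64` tuples standing in for `UInt256` interval sets):

```nim
# Sketch only: every pivot environment carves its account-hash ranges out
# of one shared pool, so ranges handed to different pivots can never
# overlap, by construction.
type AccRange = tuple[lo, hi: uint64]        # closed interval [lo, hi]

var unprocessed: seq[AccRange] = @[(lo: 0'u64, hi: high(uint64))]

proc fetchRange(maxLen: uint64): AccRange =
  ## Hand out the next unprocessed chunk (at most maxLen values wide).
  doAssert unprocessed.len > 0
  let r = unprocessed[0]
  if r.hi - r.lo < maxLen:                   # pool exhausted by this chunk
    unprocessed.setLen(0)
    return r
  result = (lo: r.lo, hi: r.lo + maxLen - 1)
  unprocessed[0] = (lo: r.lo + maxLen, hi: r.hi)

when isMainModule:
  let forPivotA = fetchRange(0x10000)        # e.g. assigned to pivot A
  let forPivotB = fetchRange(0x10000)        # assigned to pivot B, disjoint
  doAssert forPivotA.hi < forPivotB.lo
```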
* Reorganise/update accounts healing
why:
Error handling was wrong, and the whole process (and its mathematical
complexity) could be better managed.
details:
Much of the algorithm is now documented at the top of the file
`heal_accounts.nim`
* Miscellaneous updates TBC
* Disentangled pivot2 module from snap
why:
Written as a template on top of sync so it can be shared by fast and snap
sync.
* Renamed and relocated pivot sources
* Integrated `best_pivot` module into full and snap sync
why:
Full sync used an older version of `best_pivot`
* isolating download module from full sync
why:
might be shared with snap sync at a later stage
* Added inspect module
why:
Find dangling references for trie healing support.
details:
+ This patch set provides only the inspect module and some unit tests.
+ There are also extensive unit tests which need bulk data from the
`nimbus-eth1-blob` module.
* Alternative pivot finder
why:
Attempt to be faster on start-up. Also trying to decouple the pivot finder
somewhat by providing different mechanisms (this one runs in `single`
mode.)
* Use inspect module for healing
details:
+ After some progress with account and storage data, the inspect facility
is used to find dangling links in the database to be filled node-wise.
+ This is a crude attempt to cobble together functional elements. The
set-up needs to be honed.
* fix scheduler to avoid starting dead peers
why:
Some peers drop out while in `sleepAsync()`. So extra `if` clauses
make sure that this event is detected early.
* Fix bugs causing crashes
details:
+ prettify.toPC():
int/intToStr() numeric range over/underflow
+ hexary_inspect.hexaryInspectPath():
handle a half-initialised step that has a branch node but is missing the
index into the branch array
* improve handling of dropped peers in alternative pivot finder
why:
Strange things may happen while querying data from the network.
Additional checks make sure that the state of other peers is updated
immediately.
* Update trace messages
* reorganise snap fetch & store schedule
* Re-implemented `hexaryFollow()` in a more general fashion
details:
+ New name for re-implemented `hexaryFollow()` is `hexaryPath()`
+ Renamed `rTreeFollow()` as `hexaryPath()`
why:
Returning similarly organised structures, the results of the
`hexaryPath()` functions become comparable when running over
the persistent and the in-memory databases.
* Added traversal functionality for persistent ChainDB
* Using `Account` values as re-packed Blob
* Repack samples as compressed data files
* Produce test data
details:
+ Can force pivot state root switch after minimal coverage.
+ For emulating certain network behaviour, downloading accounts stops for
a particular pivot state root once 30% (some static number) coverage is
reached. Subsequent accounts are then downloaded for a later pivot state root.
* Bump nim-stew
why:
Need fixed interval set
* Keep track of accumulated account ranges over all state roots
* Added comments and explanations to unit tests
* typo
* Extracted functionality into sub-modules for maintainability
* Setting SST bulk load as default in `accounts_db`
details:
+ currently, the same data are stored via rocksdb if available, and
additionally via the embedded `storage_type` with (non-standard) prefix 200
for time comparisons
+ fall back to normal `put()` unless rocksdb is accessible
* Provided common scheduler API, applied to `full` sync
* Use hexary trie as storage for proofs_db records
also:
+ Store metadata with account for keeping track of account state
+ add iterator over accounts
* Common scheduler API applied to `snap` sync
* Prepare for accounts bulk import
details:
+ Added some ad-hoc checks for proving accounts data received via
snap/1 (will be replaced by a proper database version when ready)
+ Added code that dumps some of the received snap/1 data into a file
(turned off by default, see `worker_desc.nim`)
* Error return in `persistBlocks()` on initial `VmState` problem
why:
previously threw an exception
* Updated sync mode option
why:
using an enum rather than a bool leaves space for more modes
* Added sync mode `full`, re-factored legacy sync
also:
rebased
* Fix typo (crashes `persistBlocks()` otherwise)
also:
rebase to master
* Reduce log ticker noise by suppressing duplicate messages
* Clarify staged queue overflow handling
why:
backtrack/re-org mode in `stageItem()` should be detected by both the
global indicator and the work item it might have moved into.
also:
rebased
* Added sepolia specs
* temporarily avoid latest `master` branch from `nim-eth1`
why:
Currently does not cleanly compile after the `bearssl` split api update.
* Relocated `IntervalSets` to nim-stew repo
* Accumulate accounts on temporary kv-DB
why:
Explore the data as returned from snap/1. Will be converted to an
`eth/db` next.
details:
Verify and accumulate per-state-root accounts downloaded via snap.
also:
Some unit tests
* Replace `Table` by `TrieDatabaseRef` for accounts accumulator
* update ticker statistics
details:
mean/variance based counter update
* allow persistent db for proved accounts
* rebase, and globally activate unit test
* fix statistics
* Using `IntervalSet` type data for `LeafRange`
* Updated log ticker
* Update to `eth67`
details:
eth/67 is disabled by default; build with `ENABLE_LEGACY_ETH66=0` to enable it.
No support for the `Get/NodeData` dialogue via eth anymore.
* Dissolved fetch/common.nim
details:
the log/ticker part becomes ticker.nim
the interval range management is merged into fetch.nim
* Updated account scheduler
why:
The previous scheduler fetched each account once (for different state
roots.) The updated scheduler re-calibrates after a change of the state
root and potentially (until told otherwise) fetches all possible
accounts.
* Fix `high(P)` fringe cases in `IntervalSet` handling
why:
The `high(P)` value for a point type `P` cannot be represented with
half-open intervals `[a,b)` for points a, b of `P`. So this single value
needs extra treatment, which was slightly wrong (see the sketch below.)
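A minimal sketch of the fringe case, with `uint8` standing in for the actual point type and a closed-interval tuple standing in for the library types:

```nim
# Sketch only: with half-open intervals [a, b), covering high(P) itself
# would require b = high(P) + 1, which does not exist. A closed interval
# [lo, hi] can represent it, at the price of special-casing elsewhere.
type Interval = tuple[lo, hi: uint8]

proc covers(iv: Interval, p: uint8): bool =
  iv.lo <= p and p <= iv.hi

when isMainModule:
  let full: Interval = (lo: 0'u8, hi: high(uint8))
  doAssert full.covers(high(uint8))     # 255 is representable here
```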
* Updated docu/comments
also:
rebased
* Update scheduler
details:
Change the `pivot` management when creating new accounts lists. It is
strictly increasing (and wrapping around) depending on the last updated
accounts list.
* Fix/recover download flag
why:
The fetch indicator used to control the data download somehow got
lost during re-org.
* Updated chronicles/logger topics
* Reorganised run state flags
why:
The original code used a pair of boolean flags `(stopped,stopThisState)`
which had been translated to the three states `running`, `stoppedPending`,
and `stopped`. It is currently not clear whether collapsing some states was
correct, so the original logic has been restored, albeit wrapped into
directives like `isStopped()` etc.
also:
Moving some function bodies in `worker.nim`
* Moved `reply_data.nim` and `validate_trienode.nim` to sub-directory `fetch_trie`
why:
Only used in `fetch_trie.nim`.
* Move `fetch_*` file and directory objects to `fetch` subdirectory
why:
Only used in `fetch.nim`
* Added start/stop and/or setup/release methods for all sub-modules
why:
good housekeeping
also:
updated getters/setters for ctrl states
updated trace messages
If we are not voting, the coinbase should be filled with zero,
because other subsystems, e.g. the txpool, can produce block headers
with a non-zero coinbase. If that coinbase is one of the signers
and the nonce is zero, that signer would be voted out of the
signer list.
* Disentangle `collect` module from `reply_data`
why:
Now the module visible from `collect` for fetching data is `peer/fetch`
only.
* Merge `SnapPeerHunt` into `collect`
why:
This part needs to be known by `collect`, only
* rename collect => worker
* Dissolve `sync_fetch_xdesc` module into `common`
why:
Descriptor is only used in `common` and `fetch_trie`
* rename `snap/peer` directory => `snap/worker`
* rename `SnapSync` -> `Worker`, `SnapPeer` -> `WorkerBuddy`
* moved `snap/base_desc.nim` -> `snap/worker/worker_desc.nim`
* Unified opaque object ref naming in `worker_desc.nim`
details:
indicated by the inheriting module (exactly one, always)
* Reorg SnapPeerBase descriptor, notably start/stop flags
details:
Instead of using three boolean flags startedFetch, stopped, and
stopThisState a single enum type is used with values SyncRunningOk,
SyncStopRequest, and SyncStopped.
* Restricting snap to eth66 and later
why:
The id-tracked request/response wire protocol can handle overlapped
responses when several requests are sent in a row (see the sketch below.)
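A minimal sketch of the request-id idea (illustrative types only, not the eth/66 wire codec):

```nim
import std/tables

# Sketch only: each response echoes the id of the request it answers, so
# several requests can be in flight and replies may arrive in any order.
var pending = initTable[uint64, string]()     # requestId -> request kind

proc sendRequest(id: uint64, what: string) =
  pending[id] = what                          # remember what was asked

proc onResponse(id: uint64, payload: string) =
  if id in pending:                           # match reply to its request
    echo "reply to ", pending[id], ": ", payload
    pending.del(id)

when isMainModule:
  sendRequest(1, "GetBlockHeaders")
  sendRequest(2, "GetBlockBodies")
  onResponse(2, "bodies")                     # out-of-order reply is fine
  onResponse(1, "headers")
```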
* Align function names with source code file names
why:
Easier to reconcile when following the implemented logic.
* Update trace logging (want file locations)
why:
The macros previously used hid the relevant file location (when
`chroniclesLineNumbers` is turned on.) They rather printed the file
location of the template that was wrapping `trace`.
* Use KeyedQueue table instead of sequence
why:
Quick access, easy configuration as LRU or FIFO with max entries
(currently LRU; see the sketch below.)
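A minimal sketch of the LRU-with-max-entries behaviour using the standard library (this is not the stew `KeyedQueue` API, just the idea):

```nim
import std/tables

# Sketch only: an OrderedTable keeps insertion order, so re-inserting a
# key on access moves it to the most-recently-used end; eviction drops
# the oldest key once the table exceeds maxEntries.
const maxEntries = 3
var cache = initOrderedTable[string, int]()

proc lruPut(key: string, val: int) =
  if key in cache:
    cache.del(key)                     # refresh position on update
  cache[key] = val
  if cache.len > maxEntries:
    var oldest: string
    for k in cache.keys:               # first key = least recently used
      oldest = k
      break
    cache.del(oldest)

proc lruGet(key: string): int =
  result = cache[key]                  # assumes the key is present
  cache.del(key)                       # re-insert to mark as recently used
  cache[key] = result
```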
* Dissolve `SnapPeerEx` object extension into `SnapPeer`
why:
It is logically cleaner and more obvious not to inherit from
`SnapPeerBase` but to specify opaque field object references of the
merged `SnapPeer` object. These can then be locally inherited.
* Dissolve `SnapSyncEx` object extension into `SnapSync`
why:
It is logically cleaner and more obvious not to inherit from
`SnapSyncEx` but to specify opaque field object references of
the `SnapPeer` object. These can then be locally inherited.
Also, in the re-factored code here the interface descriptor
`SnapSyncCtx` inherited `SnapSyncEx` which was sub-optimal (OO
inheritance makes it easier to work with call back functions.)
* new: time_helper, types
* new: path_desc
* new: base_desc
* Re-organised objects inheritance
why:
Previous code used macros to instantiate opaque object references. This
has been re-implemented with OO inheritance based logic.
* Normalised trace macros
* Using distinct types for Hash256 aliases
why:
Better control of the meaning of the hashes, all of the same format
(see the sketch after this entry.)
caveat:
The protocol handler DSL used by eth66.nim and snap1.nim uses the
underlying type Hash256 and cannot handle the distinct alias in
rlp and chronicles/log macros. So Hash256 is used directly (does
not change readability as the type is clear by parameter names.)
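A minimal sketch of the distinct-type pattern (illustrative names `NodeKey`/`NodeTag` and a 32-byte array standing in for `Hash256`; not the actual snap sources):

```nim
# Sketch only: distinct aliases keep the two hash roles from being mixed
# up at compile time, although both share the same underlying format.
type
  Hash256ish = array[32, byte]          # stand-in for eth_types.Hash256
  NodeKey = distinct Hash256ish         # hash of a trie node
  NodeTag = distinct Hash256ish         # path/leaf position in the trie

proc toTag(h: Hash256ish): NodeTag = NodeTag(h)

when isMainModule:
  var h: Hash256ish
  let tag = h.toTag
  doAssert tag is NodeTag
  # let key: NodeKey = tag              # would not compile: distinct types
```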
* Use type name eth and snap (rather than snap1)
* Prettified snap/eth handler trace messages
* Regrouped sync sources
details:
Snap storage-related sources are moved to a common directory.
Option --new-sync renamed to --snap-sync
also:
Normalised logging for secondary/non-protocol handlers.
* Merge protocol wrapper files => protocol.nim
details:
Merge wrapper sync/protocol_ethxx.nim and sync/protocol_snapxx.nim
into single file snap/protocol.nim
* Comments cosmetics
* Similar start logic for blockchain_sync.nim and sync/snap.nim
* Renamed p2p/blockchain_sync.nim -> sync/fast.nim
why:
Accidentally wrapped into a waitFor() directive when reviving the jl/sync
branch.
also:
Decorate eth/66 and snap/1 protocol trace messages with protocol
type and version
* Squashed snap-sync-preview patch
why:
Providing end results makes it easier to have an overview.
Collected patch set comments are available as nimbus/sync/ChangeLog.md
in chronological order, oldest first.
* Removed some cruft and obsolete imports, normalised logging
* Update exception tracing
* Use lazy JSON parser
why:
The UInt256 type is not directly supported by the JSON serializer
* Json parser update for stringified UInt256 integers
why:
Now available
* update sub-module branch reference
- remove `waitFor` and store the async result in a Future[T] for later execution,
instead of blocking other services from starting and running.
This also fixes the very long Ctrl-C delay issue. Fixes #585
- always create a sealing engine instance even if there is no
signer, because other services such as the engine API need it.
If the engine API receives forkChoiceUpdated and it turns out
a reorg happened, the sealing engine must update the txpool
head, so that in the next iteration, when the txpool is asked
to produce a fresh block, it produces it on the current
'correct' chain and abandons the side chain.
* Prepare unit tests for running without tx-pool job queue
why:
Most of the job queue logic can be emulated. This adapts to a few
pathological test cases.
* Replace tx-pool job queue logic with in-place actions
why:
This additional execution layer is not needed anymore, as learned
from working with the integration/hive tests.
details:
Add and delete jobs are now executed in-place. Some actions
-- as in smartHead() -- have been combined to facilitate the correct
order of actions.
* Update production functions, remove txpool legacy stuff
* Enable JWT authentication for websockets
details:
Currently, this is optional and only enabled when the jwtsecret option
is set.
There is a default mechanism to generate a JWT secret if it is not
explicitly stated. This mechanism is currently unused.
* Make JWT authentication compulsory for websockets
* Fix unit test entry point + cosmetics
* Update JSON-RPC link
* Improvements as suggested by Mamy
result.gasPrice = baseFee + min(result.maxPriorityFee, result.maxFee - baseFee)
cannot be simplified into
result.gasPrice = min(result.maxPriorityFee + baseFee, result.maxFee)
because the two expressions are not equivalent (see the sketch below.)
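Whether and how the two forms diverge depends on the integer types doing the arithmetic; as an assumption for illustration, the sketch below uses fixed-width unsigned arithmetic (`uint8` for brevity) to show one case where they disagree:

```nim
# Illustration only (uint8 standing in for the real gas types; that
# bounded/unsigned arithmetic is the culprit is an assumption). Over
# unbounded integers both forms are equal; under wrap-around they are not.
proc formA(baseFee, maxPriorityFee, maxFee: uint8): uint8 =
  baseFee + min(maxPriorityFee, maxFee - baseFee)

proc formB(baseFee, maxPriorityFee, maxFee: uint8): uint8 =
  min(maxPriorityFee + baseFee, maxFee)

when isMainModule:
  # maxFee below baseFee: maxFee - baseFee wraps around in formA
  echo formA(200, 10, 100), " vs ", formB(200, 10, 100)   # prints: 210 vs 100
```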
why:
Causes havoc in most bit fringe cases.
details:
When setting the head forward, the delta was wrongly registered from
the static "left" end (which limits the loop) rather than the moving
"right" end.
* Fix database sort order for local txs
why:
For convenience, packed txs were stored in the block sorted by
rank->nonce. With local accounts, the greedy grabber uses the sort
order (local,non-local)->rank->nonce, which leads to a wrong calculation
of the txRoot (see the sketch below.)
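A minimal sketch of the two orderings with a simplified tx tuple (illustrative, not the txpool types):

```nim
import std/algorithm

# Sketch only: the greedy packing order (local, non-local) -> rank -> nonce
# differs from the rank -> nonce order required inside the block, so
# storing txs in packing order yields a different txRoot.
type Tx = tuple[local: bool, rank, nonce: int]

let txs: seq[Tx] = @[
  (local: false, rank: 1, nonce: 0),
  (local: true,  rank: 2, nonce: 0),
  (local: false, rank: 2, nonce: 1)]

let packingOrder = txs.sortedByIt((not it.local, it.rank, it.nonce))
let blockOrder   = txs.sortedByIt((it.rank, it.nonce))

when isMainModule:
  doAssert packingOrder != blockOrder    # orders (and hence txRoot) differ
```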
* Housekeeping
details:
Replaced a couple of local eip1559TxNormalization() functions by a
single public one
* Support for local accounts
why:
Accounts tagged local will be packed with priority over untagged
accounts
* Added functions for queuing txs and simultaneously setting account locality
why:
Might be a popular task, in particular for unconditionally adding txs to
a local (aka prioritised) account via "xp.addLocal(tx,true)"
caveat:
Untested yet
* fix typo
* backup
* No baseFee for pre-London tx in verifier
why:
The packer would wrongly discard valid legacy txs.
* Remove unused import of config to avoid select_backend db import
- Importing nimbus-eth1 config.nim causes an import of select_backend,
which by default causes an import of kvstore_rocksdb and thus
requires rocksdb. Remove the unused import to avoid the rocksdb
dependency for Fluffy.
- Remove some whitespace in bridge_client (to make fluffy CI
trigger for sure).
* Use specific cache keys for fluffy CI workflow
* Disable Fluffy CI reproducibility test
* Enable optional chunked RLPx messages
why:
Legacy feature used by Nethermind
details:
Disable with make flag: ENABLE_CHUNKED_RLPX=0
* Rebase & bump nim-eth
* Fix default behaviour
why:
Got lost somehow. Comments do not match GNU make code directives.
* De-noisify some Clique logging
why:
Too annoying when syncing against Goerli
* Replay devnet# and kiln sessions
why:
Compiled as a local program, the unit test was used for TDD.
* dist: precompiled binaries and Docker images
The builds are reproducible, the binaries are portable and statically link librocksdb.
This took some patching. Upstream PR: https://github.com/facebook/rocksdb/pull/9752
32-bit ARM is missing as a target because two different GCC versions
fail with an ICE when trying to cross-compile RocksDB. Using Clang
instead is too much trouble for a platform that nobody should be using
anyway.
(Clang doesn't come with its own target headers and libraries, can't be
easily convinced to use the ones from GCC, so it needs an fs image from
a 32-bit ARM distro - at which point I stopped caring).
* CI: disable reproducibility test
* Activate wire protocol eth/66
and:
Disentangle protocol_eth66.nim from import sections
why:
Importing the protocol_eth66 module is not necessary. There is
no need to know too many details of the underlying wire protocol. All
that is needed will be exported by blockchain_sync.nim.
* fixes, and rebase
* Update nimbus/p2p/blockchain_sync.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Fixes and rebase
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
why:
Testing against a replay unit test for Devnet4 made it necessary to
adjust the TTD handling. Without the update, importing fails at block #5646
which is the parent of the terminal PoW block. Similar considerations
apply for Devnet5 and Kiln.
- prevRandao/mixDigest = payloadAttrs.prevRandao
- timestamp = payloadAttrs.timestamp
- coinbase = payloadAttrs.suggestedFeeRecipient
- extraData = it can be anything 0-32 bytes, but currently set to empty bytes.
* Rearrange/rename test_kintsugu => test_custom_network
why:
Debug, fix and test more general problems related to running
nimbus on a custom network.
* Update UInt256/Json parser for --custom-network command line option
why:
As found out with the Kintsugi configuration, block number and balance
have the same Nim type, which led to misunderstandings. This patch makes
sure that UInt256-encoded string values decode as intended: "0x11" to 17,
and "b" and "11" both to 11 (see the sketch below.)
* Refactored genesis.toBlock() => genesis.toBlockHeader()
why:
The function toBlock(g,db) may return different results depending on
whether the db descriptor argument is nil, or initialised. This is due
to the db.config data sub-descriptor which may give various outcomes
for the baseFee field of the genesis header.
Also, the version where db is non-nil initialised is used internally
only. So the public rewrite toBlockHeader() that replaces the toBlock()
function expects a full set of NetworkParams.
* update comments
* Rename toBlockHeader() => toGenesisHeader()
why:
Polymorphic prototype used for BaseChainDB or NetworkParams argument.
With a BaseChainDB descriptor argument, the name shall imply that the
header is generated from the config fields rather than fetched from
the database.
* Added command line option --static-peers-file
why:
Handy feature to keep peer nodes in a file, similar to the
--bootstrap-file option.
Any transaction where tx.sender has a CODEHASH != EMPTYCODEHASH MUST
be rejected as invalid, where EMPTYCODEHASH =
0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470.
The invalid transaction MUST be rejected by the client and not be included
in a block. A block containing such a transaction MUST be considered invalid.
Because EIP-4399 overrides the mixDigest validation rule,
there is no need to check mixDigest == ZERO_HASH;
`mixDigest` will carry the PoS block randomness value
* Kludge needed for setting up custom network
why:
Some non-features in the persistent hexary trie DB produce an assert
error when initiating the Kintsugi network.
details:
This fix should be temporary, only.
* Fix OS detection
why:
directive detectOs() bails out on Windows if checking for Ubuntu
Includes a simple test harness for the merge interop M1 milestone
This aims to enable connecting nimbus-eth2 to nimbus-eth1 within
the testing protocol described here:
https://github.com/status-im/nimbus-eth2/blob/amphora-merge-interop/docs/interop_merge.md
To execute the work-in-progress test, please run:
In terminal 1:
tests/amphora/launch-nimbus.sh
In terminal 2:
tests/amphora/check-merge-test-vectors.sh
details:
For documentation, see comments in the file tx_pool.nim.
For prettified manual pages run 'make docs' in the nimbus directory and
point your web browser to the newly created 'docs' directory.
why:
Some helper files will not be generated by the nim document generator,
so they have been stashed from a later nim version to be provided when
missing.
This problem was known with an earlier nim version (see here
https://github.com/nim-lang/Nim/issues/8952) but was reported solved.
Maybe we need a second look into that.
why:
Previously, the function 'snapshot_desc.loadSnapshot()' contained the
equivalent of 'eth.decode(@[],SnapshotData)' for some type 'SnapshotData'
which should result in an exception of type 'RlpTypeMismatch'.
Before mid October, this worked for all systems on the Github CI. Since
then, a segfault message in the Github CI can be reproduced on all 64bit
Windows runs when running 'build/all_tests <id-of-test_txpool>' after the
failed 'make test' directive (the latter one needs to be extended by
'|| true'.) This error cannot be reproduced on my local Win7/64 system
with the same MSYS2 and gcc 11.2.0 compiler.
The fix is, rather than catching an exception, to explicitly check the
first argument of 'eth.decode(@[],SnapshotData)' and act if it is empty.
also:
removed some obsolete {.inline.} annotations.
why:
Previous version was based on lru_cache which is ugly. This module is
based on the stew/keyed_queue library module.
other:
There are still some other modules relying on lru_cache which should be
removed.
* Redesign of BaseVMState descriptor
why:
BaseVMState provides an environment for executing transactions. The
current descriptor also provides data that cannot generally be known
within the execution environment, e.g. the total gasUsed, which only
becomes available after all transactions have finished.
Also, the BaseVMState constructor has been replaced by a constructor
that does not need pre-initialised input of the account database.
also:
Previous constructor and some fields are provided with a deprecated
annotation (producing a lot of noise.)
* Replace legacy directives in production sources
* Replace legacy directives in unit test sources
* fix CI (missing premix update)
* Remove legacy directives
* chase CI problem
* rebased
* Re-introduce 'AccountsCache' constructor optimisation for 'BaseVmState' re-initialisation
why:
Constructing a new 'AccountsCache' descriptor can be avoided sometimes
when the current state root is properly positioned already. Such a
feature existed already as the update function 'initStateDB()' for the
'BaseChainDB' where the accounts cache was linked into this descriptor.
The function 'initStateDB()' was removed and re-implemented into the
'BaseVmState' constructor without optimisation. The old version was of
restricted use as a wrong accounts cache state would unconditionally
throw an exception rather than conceptually ask for a remedy.
The optimised 'BaseVmState' re-initialisation has been implemented for
the 'persistBlocks()' function.
also:
moved some test helpers to 'test/replay' folder
* Remove unused & undocumented fields from Chain descriptor
why:
Reduces attack surface in general & improves reading the code.
details:
1. The check for cumulativeGasUsed + tx.gasLimit <= header.gasLimit
neither makes sense nor is part of the EIP-1559 specs. Nevertheless,
a check tx.gasLimit <= header.gasLimit is added to satisfy some
unit test (see comments in validateTransaction() body.)
2. As a replacement check for the one removed in 1, a check for
cumulativeGasUsed + gasBurned <= header.gasLimit has been added
(see comments in processTransactionImpl() body.)
3. Prototypes for processTransaction() variants have been cleaned up and
commented.
why:
Detail 1 in particular produces an error for tightly packed blocks when
the last tx in the list has a generous gasLimit (see the worked example below.)
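A small worked example of detail 1 (the numbers are made up for illustration):

```nim
# Made-up numbers illustrating detail 1: a tightly packed block whose last
# tx has a generous gasLimit but burns far less gas than that.
let
  headerGasLimit    = 10_000_000
  cumulativeGasUsed =  9_900_000       # gas burned by the preceding txs
  lastTxGasLimit    =    200_000
  lastTxGasBurned   =     50_000

# removed check: rejects the block although it is perfectly valid
doAssert cumulativeGasUsed + lastTxGasLimit > headerGasLimit

# replacement check: accepts it, since only burned gas counts
doAssert cumulativeGasUsed + lastTxGasBurned <= headerGasLimit
```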
- Remove the `--evm` option on non-EVMC builds.
`when` around an option doesn't work with confutils; it fails to compile.
Workaround that by setting the `ignore` pragma on EVMC-specific options.
(Thanks @jangko for that new pragma). I prefer this to a solution which
moves the whole option's pragma elsewhere, especially if we add more options.
- Improve the help text, so that it shows the standard library extension on
each target platform (or none if on another platform).
- Undo b3f21bf4 "add missing evmc_enabled conditional compilation in
evmc_dynamic_loader". Move the conditional to `nimbus.nim`, and take more
care there to only use the loader function in EVMC builds.
It's ok to just not include this EVMC-only module (like some other EVMC
modules), rather than making the module itself a bit broken: Without this
change, it references a function that's not imported or linked to, and it
only links because there is no call sequence reaching that function.
Signed-off-by: Jamie Lokier <jamie@shareable.org>