* Split fetch accounts into sub-modules
details:
There will be separated modules for accounts snapshot, storage snapshot,
and healing for either.
* Allow to rebase pivot before negotiated header
why:
Peers seem to have not too many snapshots available. By setting back the
pivot block header slightly, the chances might be higher to find more
peers to serve this pivot. Experiment on mainnet showed that setting back
too much (tested with 1024), the chances to find matching snapshot peers
seem to decrease.
* Add accounts healing
* Update variable/field naming in `worker_desc` for readability
* Handle leaf nodes in accounts healing
why:
There is no need to fetch accounts when they had been added by the
healing process. On the flip side, these accounts must be checked for
storage data and the batch queue updated, accordingly.
* Reorganising accounts hash ranges batch queue
why:
The aim is to formally cover as many accounts as possible for different
pivot state root environments. Formerly, this was tried by starting the
accounts batch queue at a random value for each pivot (and wrapping
around.)
Now, each pivot environment starts with an interval set mutually
disjunct from any interval set retrieved with other pivot state roots.
also:
Stop fishing for more pivots in `worker` if 100% download is reached
* Reorganise/update accounts healing
why:
Error handling was wrong and the (math. complexity of) whole process
could be better managed.
details:
Much of the algorithm is now documented at the top of the file
`heal_accounts.nim`
* Added inspect module
why:
Find dangling references for trie healing support.
details:
+ This patch set provides only the inspect module and some unit tests.
+ There are also extensive unit tests which need bulk data from the
`nimbus-eth1-blob` module.
* Alternative pivot finder
why:
Attempt to be faster on start up. Also tying to decouple pivot finder
somehow by providing different mechanisms (this one runs in `single`
mode.)
* Use inspect module for healing
details:
+ After some progress with account and storage data, the inspect facility
is used to find dangling links in the database to be filled nose-wise.
+ This is a crude attempt to cobble together functional elements. The
set up needs to be honed.
* fix scheduler to avoid starting dead peers
why:
Some peers drop out while in `sleepAsync()`. So extra `if` clauses
make sure that this event is detected early.
* Bug fixes causing crashes
details:
+ prettify.toPC():
int/intToStr() numeric range over/underflow
+ hexary_inspect.hexaryInspectPath():
take care of half initialised step with branch but missing index into
branch array
* improve handling of dropped peers in alternaive pivot finder
why:
Strange things may happen while querying data from the network.
Additional checks make sure that the state of other peers is updated
immediately.
* Update trace messages
* reorganise snap fetch & store schedule
* Re-implemented `hexaryFollow()` in a more general fashion
details:
+ New name for re-implemented `hexaryFollow()` is `hexaryPath()`
+ Renamed `rTreeFollow()` as `hexaryPath()`
why:
Returning similarly organised structures, the results of the
`hexaryPath()` functions become comparable when running over
the persistent and the in-memory databases.
* Added traversal functionality for persistent ChainDB
* Using `Account` values as re-packed Blob
* Repack samples as compressed data files
* Produce test data
details:
+ Can force pivot state root switch after minimal coverage.
+ For emulating certain network behaviour, downloading accounts stops for
a particular pivot state root if 30% (some static number) coverage is
reached. Following accounts are downloaded for a later pivot state root.
* Bump nim-stew
why:
Need fixed interval set
* Keep track of accumulated account ranges over all state roots
* Added comments and explanations to unit tests
* typo
* Extracted functionality into sub-modules for maintainability
* Setting SST bulk load as default in `accounts_db`
details:
+ currently, the same data are stored via rocksdb if available, and
the same via embedded `storage_type` with (non-standard) prefix 200
for time comparisons
+ fallback to normal `put()` unless rocksdb is accessible
* Provided common scheduler API, applied to `full` sync
* Use hexary trie as storage for proofs_db records
also:
+ Store metadata with account for keeping track of account state
+ add iterator over accounts
* Common scheduler API applied to `snap` sync
* Prepare for accounts bulk import
details:
+ Added some ad-hoc checks for proving accounts data received from the
snap/1 (will be replaced by proper database version when ready)
+ Added code that dumps some of the received snap/1 data into a file
(turned of by default, see `worker_desc.nim`)
* Added sepolia specs
* temporarily avoid latest `master` branch from `nim-eth1`
why:
Currently does not cleanly compile after the `bearssl` split api update.
* Relocated `IntervalSets` to nim-stew repo
* Accumulate accounts on temporary kv-DB
why:
Explore the data as returned from snap/1. Will be converted to a
`eth/db` next.
details:
Verify and accumulate per/state-root accounts downloaded via snap.
also:
Some unit tests
* Replace `Table` by `TrieDatabaseRef` for accounts accumulator
* update ticker statistics
details:
mean/variance based counter update
* allow persistent db for proved accounts
* rebase, and globally activate unit test
* fix statistics
* Using `IntervalSet` type data for `LeafRange`
* Updated log ticker
* Update to `eth67`
details:
Disabled by default, use `ENABLE_LEGACY_ETH66=0` to enable
No support for `Get/NodeData` dialogue via eth, anymore
* Dissolved fetch/common.nim
details;
the log/ticker part becomes ticker.nim
the interval range management is merged into fetch.nim
* Updated account scheduler
why:
The previous scheduler fetched each account once (for different state
roots.) The updated scheduler re-calibrates after a change of the state
root and potentially (until told otherwise) fetches all possible
accounts.
* Fix `high(P)` fringe cases in `IntervalSet` handling
why:
The `high(P)` value for a point type `P` cannot be represented with
half open intervals `[a,b)` for a,b points of `P`. So this single value
needs extra treatment which was slightly wrong.
* Updated docu/comments
also:
rebased
* Update scheduler
details:
Change the `pivot` management when creating new accounts lists. It is
strictly increasing (and wrapping around) depending on last updated
accounts list.
* Use type name eth and snap (rather than snap1)
* Prettified snap/eth handler trace messages
* Regrouped sync sources
details:
Snap storage related sources are moved to common directory.
Option --new-sync renamed to --snap-sync
also:
Normalised logging for secondary/non-protocol handlers.
* Merge protocol wrapper files => protocol.nim
details:
Merge wrapper sync/protocol_ethxx.nim and sync/protocol_snapxx.nim
into single file snap/protocol.nim
* Comments cosmetics
* Similar start logic for blockchain_sync.nim and sync/snap.nim
* Renamed p2p/blockchain_sync.nim -> sync/fast.nim
* Update exception tracinig
* Use lazy JSON parser
why:
Uint246 type is not directly supported by the JSON serializer
* Json parser update for stringified UInt256 integers
why:
Now available
* update sub-module branch reference
* Prepare unit tests for running without tx-pool job queue
why:
Most of the job queue logic can be emulated. This adapts to a few
pathological test cases.
* Replace tx-pool job queue logic with in-place actions
why:
This additional execution layer is not needed, anymore which has been
learned from working with the integration/hive tests.
details:
Execution of add or deletion jobs are executed in-place. Some actions
-- as in smartHead() -- have been combined for facilitating the correct
order actions
* Update production functions, remove txpool legacy stuff
* Enable JWT authentication for websockets
details:
Currently, this is optional and only enabled when the jwtsecret option
is set.
There is a default mechanism to generate a JWT secret if it is not
explicitly stated. This mechanism is currently unused.
* Make JWT authentication compulsory for websockets
* Fix unit test entry point + cosmetics
* Update JSON-RPC link
* Improvements as suggested by Mamy
why:
Causes havoc in most bit fringe cases.
details:
When setting the head forward, the delta was wrongly registered from
the static "left" end (which limits the loop) rather than the moving
"right" end.
* Fix database sort order for local txs
why:
For convenience, packed txs were stored in the block sorted by
rank->nonce. Using local accounts, the greedy grabber uses the sort
order (local,non-local)->rank->nonce which leads to a wrong calculation
of the txRoot.
* Housekeeping
details:
Replaced a couple of local eip1559TxNormalization() functions by a
single public
* Support for local accounts
why:
Accounts tagged local will be packed with priority over untagged
accounts
* Added functions for queuing txs and simultaneously setting account locality
why:
Might be a popular task, in particular for unconditionally adding txs to
a local (aka prioritised account) via "xp.addLocal(tx,true)"
caveat:
Untested yet
* fix typo
* backup
* No baseFee for pre-London tx in verifier
why:
The packer would wrongly discard valid legacy txs.
* De-noisify some Clique logging
why:
Too annoying when syncing against Goerly
* Replay devnet# and kiln sessions
why:
Compiled as local program, the unit test was used for TDD.
* dist: precompiled binaries and Docker images
The builds are reproducible, the binaries are portable and statically link librocksdb.
This took some patching. Upstream PR: https://github.com/facebook/rocksdb/pull/9752
32-bit ARM is missing as a target because two different GCC versions
fail with an ICE when trying to cross-compile RocksDB. Using Clang
instead is too much trouble for a platform that nobody should be using
anyway.
(Clang doesn't come with its own target headers and libraries, can't be
easily convinced to use the ones from GCC, so it needs an fs image from
a 32-bit ARM distro - at which point I stopped caring).
* CI: disable reproducibility test
* Activate wire protocol eth/66
and:
Disentangle protocol_eth66.nim from import sections
why:
Importing the protocol_eth66 module is not necessary. There is
no need to know too many details of the underlying wire protocol. All
that is needed will be exported by blockchain_sync.nim.
* fixes, and rebase
* Update nimbus/p2p/blockchain_sync.nim
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
* Fixes and rebase
Co-authored-by: Kim De Mey <kim.demey@gmail.com>
why:
TDD data and test script that are not needed for CI are externally held.
This saves space.
also:
Added support for test-custom_networks.nim to run import Devnet5 dump.
* Rearrange/rename test_kintsugu => test_custom_network
why:
Debug, fix and test more general problems related to running
nimbus on a custom network.
* Update UInt265/Json parser for --custom-network command line option
why:
As found out with the Kintsugi configuration, block number and balance
have the same Nim type which led to misunderstandings. This patch makes
sure that UInt265 encoded string values "0x11" decodes to 17, and "b"
and "11" to 11.
* Refactored genesis.toBlock() => genesis.toBlockHeader()
why:
The function toBlock(g,db) may return different results depending on
whether the db descriptor argument is nil, or initialised. This is due
to the db.config data sub-descriptor which may give various outcomes
for the baseFee field of the genesis header.
Also, the version where db is non-nil initialised is used internally
only. So the public rewrite toBlockHeader() that replaces the toBlock()
function expects a full set of NetworkParams.
* update comments
* Rename toBlockHeader() => toGenesisHeader()
why:
Polymorphic prototype used for BaseChainDB or NetworkParams argument.
With a BaseChainDB descriptor argument, the name shall imply that the
header is generated from the config fields rather than fetched from
the database.
* Added command line option --static-peers-file
why:
Handy feature to keep peer nodes in a file, similar to the
--bootstrap-file option.
* Kludge needed for setting up custom network
why:
Some non-features in the persistent hexary trie DB produce an assert
error when initiating the Kinsugi network.
details:
This fix should be temporary, only.
* Fix OS detection
why:
directive detectOs() bails out on Windows if checking for Ubuntu
* test environment for studying crash of hexary trie
why:
the persistent test case will crash unless in genesis.toBlock():
+ pruneTrie is set false, or
+ the directive "tdb.put(emptyRlpHash.data,emptyRlp)" is added right
before the "for k, v in account.storage:" loop
* different tests for OS variants
Includes a simple test harness for the merge interop M1 milestone
This aims to enable connecting nimbus-eth2 to nimbus-eth1 within
the testing protocol described here:
https://github.com/status-im/nimbus-eth2/blob/amphora-merge-interop/docs/interop_merge.md
To execute the work-in-progress test, please run:
In terminal 1:
tests/amphora/launch-nimbus.sh
In terminal 2:
tests/amphora/check-merge-test-vectors.sh
details:
For documentation, see comments in the file tx_pool.nim.
For prettified manual pages run 'make docs' in the nimbus directory and
point your web browser to the newly created 'docs' directory.
why:
Previous version was based on lru_cache which is ugly. This module is
based on the stew/keyed_queue library module.
other:
There are still some other modules rely on lru_cache which should be
removed.
* Redesign of BaseVMState descriptor
why:
BaseVMState provides an environment for executing transactions. The
current descriptor also provides data that cannot generally be known
within the execution environment, e.g. the total gasUsed which is
available not before after all transactions have finished.
Also, the BaseVMState constructor has been replaced by a constructor
that does not need pre-initialised input of the account database.
also:
Previous constructor and some fields are provided with a deprecated
annotation (producing a lot of noise.)
* Replace legacy directives in production sources
* Replace legacy directives in unit test sources
* fix CI (missing premix update)
* Remove legacy directives
* chase CI problem
* rebased
* Re-introduce 'AccountsCache' constructor optimisation for 'BaseVmState' re-initialisation
why:
Constructing a new 'AccountsCache' descriptor can be avoided sometimes
when the current state root is properly positioned already. Such a
feature existed already as the update function 'initStateDB()' for the
'BaseChanDB' where the accounts cache was linked into this desctiptor.
The function 'initStateDB()' was removed and re-implemented into the
'BaseVmState' constructor without optimisation. The old version was of
restricted use as a wrong accounts cache state would unconditionally
throw an exception rather than conceptually ask for a remedy.
The optimised 'BaseVmState' re-initialisation has been implemented for
the 'persistBlocks()' function.
also:
moved some test helpers to 'test/replay' folder
* Remove unused & undocumented fields from Chain descriptor
why:
Reduces attack surface in general & improves reading the code.
details:
1. The check for cumulativeGasUsed + tx.gasLimit <= header.gasLimit
makes neither sense nor is it part of the Eip1559 specs. Nevertheless
a check tx.gasLimit <= header.gasLimit is added to satisfy some
unit test (see comments in validateTransaction() body.)
2. As a replacement check for the one removed in 1, a check for
cumulativeGasUsed + gasBurned <= header.gasLimit has been added
(see comments in processTransactionImpl() body.)
3. Prototypes for processTransaction() variants have been cleaned up and
commented.
why:
Detail 1. in particular produces an error for tightly packed blocks when
the last tx in the list has a generous gasLimit.
* crash test scenario
details:
Example code for inspecting nested block chain and accounts cache
database transaction framework. There seems to be a pathological
case where the system crashes after a rollback (as appeared in the
tx-pool packer code.)
* simplified crash scenario
* Workable solution (as suggested by Andri)
details:
Avoiding db.rollback() (db.commit() is OK) while vmState.stateDB is
alive.
* Rename text_txcrash => test_accounts_cache
why:
Unit tests covers part of accounts_cache handling
* comment update
Add the new [Arrow Glacier fork](https://eips.ethereum.org/EIPS/eip-4345).
Only the difficulty calculation is changed, but as a new fork it still affects
a number of places in the code.
To the best of my knowledge the change is only scheduled on Mainnet.
In addition:
- The fork date comments in `chain_config.nim` have been checked against the
real networks, set consistently in UTC instead of random timezones, and made
neater. Maybe we'll keep these when transferring config to a file someday.
- It's added to forkid hash tests (EIP-2124/EIP-2364), of course.
Signed-off-by: Jamie Lokier <jamie@shareable.org>
* PoW wrapper for verification & mining
why:
It eases data management of per-Epoch lookup tables. Also some unit
tests show limits of usefulness on non-specialised machines for
mining besides developing tests.
details:
For PoW verification, this patch provides a pretty wrapper hiding the
details of the ethash/Hashimoto lookup cache management.
For mining on my development system without special hardware, the
underlying ethash functions are prohibitively slow. It takes
* ~20 minutes to prepare the full ethash/Hashimoto lookup dataset
* a second to run ~25k nonce tests (in the mining loop)
The mining part might be of some use for generating test data for
the tx-pool, though.
* Using PowRef as replacement for EpochHashCache + hashimotoLight()
* Fix typo (CI failed)
why:
was below log level when testing locally
* fix canonical naming
previously, every time the VMState was created, it will also create
new stateDB, and this action will nullify the advantages of cached accounts.
the new changes will conserve the accounts cache if the executed blocks
are contiguous. if not the stateDB need to be reinited.
this changes also allow rpcCallEvm and rpcEstimateGas executed properly
using current stateDB instead of creating new one each time they are called.
EIP-2718:
- chainID: Long! of Query
- chainID: Long of Transaction
EIP-1559:
- baseFeePerGas: BigInt of Block
- effectiveGasPrice: BigInt of Transaction
- maxFeePerGas: BigInt of Transaction
- maxPriorityFeePerGas: BigInt of Transaction
this is a preparation for migration to confutils based config
although there is still some getConfiguration usage in tests code
it will be removed after new config arrived
although they are technically different, but in reality,
many networks are using the same id for ChainId dan NetworkId.
in this commit, we set networkid from config file's chainId.
- allow clique period and epoch to be configured via config file
- this also activate poaEngine mode
- remove clique period configuration from cli to reduce confusion
- fix#786
Add these to the list of specially disabled tests during `make test`,
as they are exceptionally slow.
Signed-off-by: Jamie Lokier <jamie@shareable.org>
* Provide PoA voting header generator
why:
Handy for hive/smoke test
details:
Header generator is a re-implementation of the generator previously
used for the canonical reference tests.
* try fixing ci out-of-mem condition
why:
for some reason, the ci began behaving like a real win7/i386 machine
where gcc is limited to 64k optimiser space
* fix comments, typos ..
* Provide API
details:
API is bundled via clique.nim.
* Set extraValidation as default for PoA chains
why:
This triggers consensus verification and an update of the list
of authorised signers. These signers are integral part of the
PoA block chain.
todo:
Option argument to control validation for the nimbus binary.
* Fix snapshot state block number
why:
Using sub-sequence here, so the len() function was wrong.
* Optional start where block verification begins
why:
Can speed up time building loading initial parts of block chain. For
PoA, this allows to prove & test that authorised signers can be
(correctly) calculated starting at any point on the block chain.
todo:
On Goerli around blocks #193537..#197568, processing time increases
disproportionally -- needs to be understand
* For Clique test, get old grouping back (7 transactions per log entry)
why:
Forgot to change back after troubleshooting
* Fix field/function/module-name misunderstanding
why:
Make compilation work
* Use eth_types.blockHash() rather than utils.hash() in Clique modules
why:
Prefer lib module
* Dissolve snapshot_misc.nim
details:
.. into clique_verify.nim (the other source file clique_unused.nim
is inactive)
* Hide unused AsyncLock in Clique descriptor
details:
Unused here but was part of the Go reference implementation
* Remove fakeDiff flag from Clique descriptor
details:
This flag was a kludge in the Go reference implementation used for the
canonical tests. The tests have been adapted so there is no need for
the fakeDiff flag and its implementation.
* Not observing minimum distance from epoch sync point
why:
For compiling PoA state, the go implementation will walk back to the
epoch header with at least 90000 blocks apart from the current header
in the absence of other synchronisation points.
Here just the nearest epoch header is used. The assumption is that all
the checkpoints before have been vetted already regardless of the
current branch.
details:
The behaviour of using the nearest vs the minimum distance epoch is
controlled by a flag and can be changed at run time.
* Analysing processing time (patch adds some debugging/visualisation support)
why:
At the first half million blocks of the Goerli replay, blocks on the
interval #194854..#196224 take exceptionally long to process, but not
due to PoA processing.
details:
It turns out that much time is spent in p2p/excecutor.processBlock()
where the elapsed transaction execution time is significantly greater
for many of these blocks.
Between the 1371 blocks #194854..#196224 there are 223 blocks with more
than 1/2 seconds execution time whereas there are only 4 such blocks
before and 13 such after this range up to #504192.
* fix debugging symbol in clique_desc (causes CI failing)
* Fixing canonical reference tests
why:
Two errors were introduced earlier but ovelooked:
1. "Remove fakeDiff flag .." patch was incomplete
2. "Not observing minimum distance .." introduced problem w/tests 23/24
details:
Fixing 2. needed to revert the behaviour by setting the
applySnapsMinBacklog flag for the Clique descriptor. Also a new
test was added to lock the new behaviour.
* Remove cruft
why:
Clique/PoA processing was intended to take place somewhere in
executor/process_block.processBlock() but was decided later to run
from chain/persist_block.persistBlock() instead.
* Update API comment
* ditto
Use a hard-coded number instead of the same expression as the client, so that
bugs introduced via that expression are detected. Using the same expression as
the client can hide issues when the value is wrong in both places.
When the expected value genuinely changes, it'll be obvious.
Just change this number.
Signed-off-by: Jamie Lokier <jamie@shareable.org>