* Use unittest2 test runner
Since upgrading to unittest2, the test runner prints the command line to
re-run a failed test - this however relies on actually using the
unittest2 command line runner.
Previously, test files were assigned numbers - with the unittest2
runner, tests are run using suite/category names instead, like so:
```
# run the Genesis suite
build/all_tests "Genesis::``
# run all tests with "blsMapG1" in the name
build/all_tests "blsMapG1*"
# run tests verbosely
build/all_tests -v
```
A reasonable follow-up here would be to review the suite names to make
them easier to run :)
* lint
* easier-to-compare test order
* bump unittest2 (also the repo)
* renamed nimbus folder to execution_chain
* Renamed "nimbus" references to "execution_chain"
* fixed wrongly changed http reference
* delete snap types file given that it was deleted before this PR merge
* missing 'execution_chain' replacement
---------
Co-authored-by: pmmiranda <pedro.miranda@nimbus.team>
* aristo: fork support via layers/txframes
This change reorganises how the database is accessed: instead holding a
"current frame" in the database object, a dag of frames is created based
on the "base frame" held in `AristoDbRef` and all database access
happens through this frame, which can be thought of as a consistent
point-in-time snapshot of the database based on a particular fork of the
chain.
In the code, "frame", "transaction" and "layer" is used to denote more
or less the same thing: a dag of stacked changes backed by the on-disk
database.
Although this is not a requirement, in practice each frame holds the
change set of a single block - as such, the frame and its ancestors
leading up to the on-disk state represents the state of the database
after that block has been applied.
"committing" means merging the changes to its parent frame so that the
difference between them is lost and only the cumulative changes remain -
this facility enables frames to be combined arbitrarily wherever they
are in the dag.
In particular, it becomes possible to consolidate a set of changes near
the base of the dag and commit those to disk without having to re-do the
in-memory frames built on top of them - this is useful for "flattening"
a set of changes during a base update and sending those to storage
without having to perform a block replay on top.
Looking at abstractions, a side effect of this change is that the KVT
and Aristo are brought closer together by considering them to be part of
the "same" atomic transaction set - the way the code gets organised,
applying a block and saving it to the kvt happens in the same "logical"
frame - therefore, discarding the frame discards both the aristo and kvt
changes at the same time - likewise, they are persisted to disk together
- this makes reasoning about the database somewhat easier but has the
downside of increased memory usage, something that perhaps will need
addressing in the future.
Because the code reasons more strictly about frames and the state of the
persisted database, it also makes it more visible where ForkedChain
should be used and where it is still missing - in particular, frames
represent a single branch of history while forkedchain manages multiple
parallel forks - user-facing services such as the RPC should use the
latter, ie until it has been finalized, a getBlock request should
consider all forks and not just the blocks in the canonical head branch.
Another advantage of this approach is that `AristoDbRef` conceptually
becomes more simple - removing its tracking of the "current" transaction
stack simplifies reasoning about what can go wrong since this state now
has to be passed around in the form of `AristoTxRef` - as such, many of
the tests and facilities in the code that were dealing with "stack
inconsistency" are now structurally prevented from happening. The test
suite will need significant refactoring after this change.
Once this change has been merged, there are several follow-ups to do:
* there's no mechanism for keeping frames up to date as they get
committed or rolled back - TODO
* naming is confused - many names for the same thing for legacy reason
* forkedchain support is still missing in lots of code
* clean up redundant logic based on previous designs - in particular the
debug and introspection code no longer makes sense
* the way change sets are stored will probably need revisiting - because
it's a stack of changes where each frame must be interrogated to find an
on-disk value, with a base distance of 128 we'll at minimum have to
perform 128 frame lookups for *every* database interaction - regardless,
the "dag-like" nature will stay
* dispose and commit are poorly defined and perhaps redundant - in
theory, one could simply let the GC collect abandoned frames etc, though
it's likely an explicit mechanism will remain useful, so they stay for
now
More about the changes:
* `AristoDbRef` gains a `txRef` field (todo: rename) that "more or less"
corresponds to the old `balancer` field
* `AristoDbRef.stack` is gone - instead, there's a chain of
`AristoTxRef` objects that hold their respective "layer" which has the
actual changes
* No more reasoning about "top" and "stack" - instead, each
`AristoTxRef` can be a "head" that "more or less" corresponds to the old
single-history `top` notion and its stack
* `level` still represents "distance to base" - it's computed from the
parent chain instead of being stored
* one has to be careful not to use frames where forkedchain was intended
- layers are only for a single branch of history!
* fix layer vtop after rollback
* engine fix
* Fix test_txpool
* Fix test_rpc
* Fix copyright year
* fix simulator
* Fix copyright year
* Fix copyright year
* Fix tracer
* Fix infinite recursion bug
* Remove aristo and kvt empty files
* Fic copyright year
* Fix fc chain_kvt
* ForkedChain refactoring
* Fix merge master conflict
* Fix copyright year
* Reparent txFrame
* Fix test
* Fix txFrame reparent again
* Cleanup and fix test
* UpdateBase bugfix and fix test
* Fixe newPayload bug discovered by hive
* Fix engine api fcu
* Clean up call template, chain_kvt, andn txguid
* Fix copyright year
* work around base block loading issue
* Add test
* Fix updateHead bug
* Fix updateBase bug
* Change func commitBase to proc commitBase
* Touch up and fix debug mode crash
---------
Co-authored-by: jangko <jangko128@gmail.com>
* Refactor TxPool: leaner and simpler
* Rewrite test_txpool
Reduce number of tables used, from 5 to 2. Reduce number of files.
If need to modify the price rule or other filters, now is far more easier because only one table to work with(sender/nonce).
And the other table is just a map from txHash to TxItemRef.
Removing transactions from txPool either because of producing new block or syncing became much easier.
Removing expired transactions also simple.
Explicit Tx Pending, Staged, or Packed status is removed. The status of the transactions can be inferred implicitly.
Developer new to TxPool can easily follow the logic.
But the most important is we can revive the test_txpool without dirty trick and remove usage of getCanonicalHead furthermore to prepare for better integration with ForkedChain.
The current getCanonicalHead of core db should not be confused with ForkedChain.latestHeader.
Therefore we need to use getCanonicalHead to restricted case only, e.g. initializing ForkedChain.
In block processing, depending on the complexity of a transaction and
hotness of caches etc, signature checking can actually make up the
majority of time needed to process a transaction (60% observed in some
randomly sampled block ranges).
Fortunately, this is a task that trivially can be offloaded to a task
pool similar to how nimbus-eth2 does it.
This PR introduces taskpools in the most simple way possible, by
performing signature checking concurrently with other TX processing,
assigning a taskpool task per TX effectively.
With this little trick, we're in gigagas land 🎉 on my laptop!
```
INF 2024-12-10 21:05:35.170+01:00 Imported blocks
blockNumber=3874817 b... mgps=1222.707 ...
```
Tests don't use the taskpool for now because it needs manual cleanup and
we don't have a good mechanism in place. Future PR:s should address this
by creating a common shutdown sequence that also closes and cleans up
other resources like the DB.
Co-authored-by: andri lim <jangko128@gmail.com>
* prefer the spec-derived name where possible
* don't pass stateRoot to LedgerRef and friends (it doesn't do anything)
* add deprecation warning in graphql - it needs updating to use
forkedchain instead
* partial commit
* fixes
* remove converters too
* revert changes on nimbus_verified_proxy
* revert changes in converter
* revert changes(re-xport) in rpc_types
* update copyright year
* replace types in other binaries
* chain config bug
* fix rebase conflict imcomplete buffer
* fix more rebase buffers
* remove ditto types and converters
* fix the tests
* update copyright year
This is a minimal set of changes to make things work with the new types
in nim-eth - this is the minimal PR that merely resolves
incompatibilities while the full change set would include more cleanup
and migration.
* Wiring ForkedChainRef to other components
- Disable majority of hive simulators
- Only enable pyspec_sim for the moment
- The pyspec_sim is using a smaller RPC service wired to ForkedChainRef
- The RPC service will gradually grow
* Addressing PR review
* Fix test_beacon/setup_env
* Enable consensus_sim (#2441)
* Enable consensus_sim
* Remove isFile check
* Enable Engine API jwt auth tests and exchange cap tests
* Enable engine api in build_sim.sh
* Wire ForkedChainRef to Engine API newPayload
* Wire Engine API getBodies to ForkedChainRef
* Wire Engine API api_forkchoice to ForkedChainRef
* Wire more RPC methods to ForkedChainRef
* Implement eth_syncing
* Implement eth_call and eth_getlogs
* TxPool: simplify smartHead
* Fix smartHead usage
* Fix txpool headDiff
* Remove hasBlockHeader and use headerExists
* Addressing review
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).
This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.
`ldb` says this is 60gb of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize
--hex --from=0x05
--to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
* Updates and corrections
* Extract `CoreDb` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports, in particular
when the capture journal (aka tracer) is revived.
* Extract `Ledger` configuration from `base.nim` into separate module
why:
This makes it easier to avoid circular imports (if any.)
also:
Move `accounts_ledger.nim` file to sub-folder `backend`. That way the
layout resembles that of the `core_db`.
* Rename `newKvt()` -> `ctx.getKvt()`
why:
Clean up legacy shortcut. Also, the `KVT` returned is not instantiated
but refers to the shared `KVT` that resides in a context which is a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `newTransaction()` -> `ctx.newTransaction()`
why:
Clean up legacy shortcut. The transaction is applied to a context as a
generalisation of an in-memory database fork. The function `ctx`
retrieves the default context.
* Rename `getColumn(CtGeneric)` -> `getGeneric()`
why:
No more a list of well known sub-tries needed, a single one is enough.
In fact, `getColumn()` did only support a single sub-tree by now.
* Reduce TODO list
1. test_state_db and test_ledger -> test_ledger.
They are the same thing now.
2. stack, memory, code_stream, gas_meter, misc,
overflow -> test_evm_support.
They are small tests and fall into the same area.
* Tighten `CoreDb` API for accounts
why:
Apart from cruft, the way to fetch the accounts state root via a
`CoreDbColRef` record was unnecessarily complicated.
* Extend `CoreDb` API for accounts to cover storage tries
why:
In future, this will make the notion of column objects obsolete. Storage
trees will then be indexed by the account address rather than the vertex
ID equivalent like a `CoreDbColRef`.
* Apply new/extended accounts API to ledger and tests
details:
This makes the `distinct_ledger` module obsolete
* Remove column object constructors
why:
They were needed as an abstraction of MPT sub-trees including storage
trees. Now, storage trees are handled by the account (e.g. via address)
they belong to and all other trees can be identified by a constant well
known vertex ID. So there is no need for column objects anymore.
Still there are some left-over column object methods wnich will be
removed next.
* Remove `serialise()` and `PayloadRef` from default Aristo API
why:
Not needed. `PayloadRef` was used for unstructured/unknown payload
formats (account or blob) and `serialise()` was used for decodng
`PayloadRef`. Now it is known in advance what the payload looks
like.
* Added query function `hasStorageData()` whether a storage area exists
why:
Useful for supporting `slotStateEmpty()` of the `CoreDb` API
* In the `Ledger` replace `storage.stateEmpty()` by `slotStateEmpty()`
* On Aristo, hide the storage root/vertex ID in the `PayloadRef`
why:
The storage vertex ID is fully controlled by Aristo while the
`AristoAccount` object is controlled by the application. With the
storage root part of the `AristoAccount` object, there was a useless
administrative burden to keep that storage root field up to date.
* Remove cruft, update comments etc.
* Update changed MPT access paradigms
why:
Fixes verified proxy tests
* Fluffy cosmetics
The module name is a misnomer, because AccountsCache have been
replaced by LedgerRef. But the test still applicable.
Instead of replaying unsupported goerli blocks,
we generate our own transactions and block.