Commit Graph

142 Commits

Author SHA1 Message Date
Jacek Sieka 8857fccb44
create per-fork opcode dispatcher (#2579)
In the current VM opcode dispatcher, a two-level case statement is
generated that first matches the opcode and then uses another nested
case statement to select the actual implementation based on which fork
it is, causing the dispatcher to grow by `O(opcodes) * O(forks)`.

The fork does not change between instructions causing significant
inefficiency for this approach - not only because it repeats the fork
lookup but also because of code size bloat and missed optimizations.

A second source of inefficiency in dispatching is the tracer code which
in the vast majority of cases is disabled but nevertheless sees multiple
conditionals being evaluated for each instruction only to remain
disabled throughout exeuction.

This PR rewrites the opcode dispatcher macro to generate a separate
dispatcher for each fork and tracer setting and goes on to pick the
right one at the start of the computation.

This has many advantages:

* much smaller dispatcher
* easier to compile
* better inlining
* fewer pointlessly repeated instruction
* simplified macro (!)
* slow "low-compiler-memory" dispatcher code can be removed

Net block import improvement at about 4-6% depending on the contract -
synthetic EVM benchmnarks would show an even better result most likely.
2024-08-28 10:20:36 +02:00
Jacek Sieka 43d93bcdab
Don't write slot hashes on import (#2564)
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).

This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.

`ldb` says this is 60gb of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize
--hex --from=0x05
--to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
2024-08-16 08:22:51 +02:00
Jacek Sieka 094486d0ce
Hash bump 2024-08-08 07:46:35 +02:00
andri lim 0cc730dd05
Fix CodeBytes: invalidPositions out of bound crash (#2523) 2024-07-25 19:23:53 +07:00
andri lim fb196849ee
EVM cosmetic changes, one less indirect access of VmCpt (#2503) 2024-07-19 08:44:01 +07:00
andri lim ee323d5ff8
Optimize EVM stack usage (#2502)
* EVM: Optimize CALL family stack usage

* EVM: Optimize CREATE family stack usage

* EVM: Optimize arith stack usage

* EVM: Optimize stack usage in the rest of opcodes

* Fix test_op_env and clean up unused imports

* EVM: Optimize arithmetic binary ops
2024-07-18 18:59:53 +07:00
andri lim cfe14f1825
EVM: use assign2 whenever possible (#2499)
Before: GST finish in 59 secs.
After: GST finish in 52 secs!
2024-07-17 20:48:50 +07:00
andri lim 916f88a373
Use block number or timestamp to determine fork rules (#2496)
* Use block number or timestamp to determine fork rules

Avoid confusion raised by `forkGTE` usage where block informations are present.

* Get rid of forkGTE
2024-07-17 17:05:53 +07:00
Jacek Sieka 0e36a17e5b
avoid re-writing code (#2490)
Avoids pointless rocksdb writes that cause write compaction /
amplification, specially in the case where code is shared between
multiple accounts
2024-07-15 15:02:23 +02:00
Jacek Sieka 72947b3647
odds and ends (#2481)
small cleanups to reduce memory allocations
2024-07-13 20:42:49 +02:00
andri lim 4fa3756860
Convert GasInt to uint64, bump nim-eth and nimbus-eth2 (#2461)
* Convert GasInt to uint64, bump nim-eth and nimbus-eth2

* Bump nimbus-eth2

* int64.high.GasInt instead of 0x7fffffffffffffff.GasInt
2024-07-07 06:52:11 +00:00
andri lim e8683692fd
EVM gasSstore refund reduction using positive integer (#2460)
This is the hopefully the last part of preparations
before converting GasInt to uint64
2024-07-06 08:39:38 +07:00
andri lim 4eaae5cbfa
EVM gasCall values always stay on positive side (#2459)
* EVM gasCall values always stay on positive side

This is also another part of preparations before
converting GasInt to uint64

* Fix test_evm_support
2024-07-06 08:39:22 +07:00
andri lim 6fe7411ac0
Saner EVM gasCosts (#2457)
This is also a part of preparations before converting GasInt to uint64
2024-07-05 11:55:13 +07:00
andri lim 23c00ce88c
Separate evmc gasCosts and nim-evm gasCosts (#2454)
This is part of preparations before converting GasInt to uint64
2024-07-05 07:00:03 +07:00
Jacek Sieka 7d78fd97d5
avoid allocations for slot storage (#2455)
Introduce a new `StoData` payload type similar to `AccountData`

* slightly more efficient storage format
* typed api
* fewer seqs
* fix encoding docs - it wasn't rlp after all :)
2024-07-04 23:48:45 +00:00
Jacek Sieka 79788c01d4
Add debug mode for disabling per-chunk state root validation (#2453)
This significantly speeds up block import at the cost of less protection
against invalid data, potentially resulting in an invalid database
getting stored.

The risk is small given that import is used only for validated data -
evaluating the right level of of validation vs performance is left for a
future PR.

A side effect of this approach is that there is no cached stated root in
the database - computing it currently requires a lot of memory since the
intermediate roots get cached in memory in full while the computation is
ongoing - a future PR will need to address this deficiency, for example
by streaming the already-computed hashes directly to the database.
2024-07-04 16:51:50 +02:00
andri lim f04f30c72b
Reduce EVM complexity by removing forkOverride (#2448)
* Reduce EVM complexity by removing forkOverride

* Fixes
2024-07-04 15:48:36 +02:00
andri lim b82dcdcc76
Remove unused StructLog (#2447) 2024-07-04 19:23:53 +07:00
Jordan Hrycaj 6dc2773957
Only use pre hashed addresses as account keys (#2424)
* Normalised storage tree addressing in function prototypes

detail:
  Argument list is always `<db> <account-path> <slot-path> ..` with
  both path arguments as `openArray[]`

* Remove cruft

* CoreDb internally Use full account paths rather than addresses

* Update API logging

* Use hashed account address only in prototypes

why:
  This avoids unnecessary repeated hashing of the same account address.
  The burden of doing that is upon the application. In the case here,
  the ledger caches all kinds of stuff anyway so it is common sense to
  exploit that for account address hashes.

caveat:
  Using `openArray[byte]` argument types for hashed accounts is inherently
  fragile. In non-release mode, a length verification `doAssert` is
  enabled by default.

* No accPath in data record (use `AristoAccount` as `CoreDbAccount`)

* Remove now unused `eAddr` field from ledger `AccountRef` type

why:
  Is duplicate of lookup key

* Avoid merging the account record/statement in the ledger twice.
2024-06-27 19:21:01 +00:00
andri lim 6a10dfd0fe
Remove pre and post opcode handlers from EVM (#2409) 2024-06-24 07:58:15 +02:00
andri lim 99ff8dc876
Fix t8n: blobGasUsed exceeds allowance issue (#2407)
* Fix t8n: blobGasUsed exceeds allowance issue

* Put blobGasUsed validation into transaction precessing pipeline
2024-06-24 07:56:24 +02:00
Jacek Sieka 768307d91d
Cache code and invalid jump destination tables (fixes #2268) (#2404)
It is common for many accounts to share the same code - at the database
level, code is stored by hash meaning only one copy exists per unique
program but when loaded in memory, a copy is made for each account.

Further, every time we execute the code, it must be scanned for invalid
jump destinations which slows down EVM exeuction.

Finally, the extcodesize call causes code to be loaded even if only the
size is needed.

This PR improves on all these points by introducing a shared
CodeBytesRef type whose code section is immutable and that can be shared
between accounts. Further, a dedicated `len` API call is added so that
the EXTCODESIZE opcode can operate without polluting the GC and code
cache, for cases where only the size is requested - rocksdb will in this
case cache the code itself in the row cache meaning that lookup of the
code itself remains fast when length is asked for first.

With 16k code entries, there's a 90% hit rate which goes up to 99%
during the 2.3M attack - the cache significantly lowers memory
consumption and execution time not only during this event but across the
board.
2024-06-21 09:44:10 +02:00
andri lim 035ef696a6
EVMC refundGas not breaching host/evm separation anymore (#2395) 2024-06-19 14:15:23 +02:00
andri lim 83f6f89869
Add t8n debugging tool and fix EVM regression (#2386)
- fix blockNumber overflow in blockHash op code
- reenable 3 test cases of test_blockchain_json
- fix t8n crash when creating invalid tracer stream
2024-06-19 08:58:08 +07:00
Jacek Sieka 1a96b4a97c
evm: generate more specialized functions (#2390)
Nicer name in profiler and avoids a few range checks
2024-06-19 08:57:29 +07:00
Miran ea0d18424a
use Nim 2.0.6 (#2384)
* use Nim 2.0.6

* Fixes for nim 2.0.6

* Workaround nim 2.0 array indexing issue

* Remove excess gcsafe pragma

* Oops, fix recursive template

* Fix imports

* Fluffy nph linting

---------

Co-authored-by: jangko <jangko128@gmail.com>
Co-authored-by: tersec <tersec@users.noreply.github.com>
2024-06-19 01:27:54 +00:00
Jacek Sieka 8926da02b6
Fix lowest-hanging fruit in VM (#2382)
* replace set with bitseq for code validity test
* remove unusued code from CodeStream
* avoid unnecessary byte-by-byte copies
2024-06-18 07:55:35 +07:00
Jacek Sieka 135ef222a2
avoid intermediate const in opcodes (#2381)
The extra layer of `const` makes the function name harder to see a
debugger / profiler
2024-06-17 18:13:38 +02:00
andri lim 61a809cf4d
Remove EVM indirect imports and unused EVM errors (#2370)
Those indirect imports are used when there was two EVMs.
2024-06-17 09:56:39 +02:00
andri lim c5508b8dac
Bump nim-blscurve for gcc-14 compatibility (#2365)
* Bump nim-blscurve for gcc-14 compatibility

* Fix evm/blscurve.nim pointer works when using blscurve_abi
2024-06-15 17:34:07 +00:00
andri lim 27d710294b
Vm2Ctx -> VmCtx, Vm2Op -> VmOp (#2369)
Legacy evm  have been removed, no longer to keep the Vm2 prefix.
2024-06-15 23:18:53 +07:00
Jacek Sieka 68f462e3e4
avoid state root lookup when computing linear history (#2362)
State lookups potentially trigger expensive re-hashings - this is the
first of several steps to remove the unnecessary ones from the general
flow of block processing

* avoid re-reading parent block header from database when it's already
in memory
2024-06-14 15:56:56 +02:00
andri lim 5a18537450
Bump nim-eth, nim-web3, nimbus-eth2 (#2344)
* Bump nim-eth, nim-web3, nimbus-eth2

- Replace std.Option with results.Opt
- Fields name changes

* More fixes

* Fix Portal stream async raises and portal testnet Opt usage

* Bump eth + nimbus-eth2 + more fixes related to eth_types changes

* Fix in utp test app and nimbus-eth2 bump

* Fix test_blockchain_json rebase conflict

* Fix EVMC block_timestamp conversion plus commentary

---------

Co-authored-by: kdeme <kim.demey@gmail.com>
2024-06-14 14:31:08 +07:00
Jacek Sieka f6be4bd0ec
avoid initTable (#2328)
`initTable` is obsolete since nim 0.19 and can introduce significant
memory overhead while providing no benefit (since the table will be
grown to the default initial size on first use anyway).

In particular, aristo layers will not necessarily use all tables they
initialize, for exampe when many empty accounts are being created.
2024-06-10 11:05:30 +02:00
Jacek Sieka 0b32078c4b
Consolidate block type for block processing (#2325)
This PR consolidates the split header-body sequences into a single EthBlock
sequence and cleans up the fallout from that which significantly reduces
block processing overhead during import thanks to less garbage collection
and fewer copies of things all around.

Notably, since the number of headers must always match the number of bodies,
we also get rid of a pointless degree of freedom that in the future could
introduce unnecessary bugs.

* only read header and body from era file
* avoid several unnecessary copies along the block processing way
* simplify signatures, cleaning up unused arguemnts and returns
* use `stew/assign2` in a few strategic places where the generated
  nim assignent is slow and add a few `move` to work around poor
  analysis in nim 1.6 (will need to be revisited for 2.0)

```
stats-20240607_2223-a814aa0b.csv vs stats-20240608_0714-21c1d0a9.csv
                       bps_x     bps_y     tps_x        tps_y    bpsd    tpsd    timed
block_number
(498305, 713245]    1,540.52  1,809.73  2,361.58  2775.340189  17.63%  17.63%  -14.92%
(713245, 928185]      730.36    865.26  1,715.90  2028.973852  18.01%  18.01%  -15.21%
(928185, 1143126]     663.03    789.10  2,529.26  3032.490771  19.79%  19.79%  -16.28%
(1143126, 1358066]    393.46    508.05  2,152.50  2777.578119  29.13%  29.13%  -22.50%
(1358066, 1573007]    370.88    440.72  2,351.31  2791.896052  18.81%  18.81%  -15.80%
(1573007, 1787947]    283.65    335.11  2,068.93  2441.373402  17.60%  17.60%  -14.91%
(1787947, 2002888]    287.29    342.11  2,078.39  2474.179448  18.99%  18.99%  -15.91%
(2002888, 2217828]    293.38    343.16  2,208.83   2584.77457  17.16%  17.16%  -14.61%
(2217828, 2432769]    140.09    167.86  1,081.87  1296.336926  18.82%  18.82%  -15.80%

blocks: 1934464, baseline: 3h13m1s, contender: 2h43m47s
bpsd (mean): 19.55%
tpsd (mean): 19.55%
Time (total): -29m13s, -15.14%
```
2024-06-09 16:32:20 +02:00
web3-developer db8c5b90bd
Cleanup stateless and block witness code. (#2295)
* Cleanup unneeded stateless and block witness code. Keeping MultiKeys which is used in the eth_getProofsByBlockNumber RPC endpoint which is needed for the Fluffy state network bridge.

* Rename generateWitness flag to collectWitnessData to better describe what the flag does. We only collect the keys of the touched accounts and storage slots but no block witness generation is supported for now.

* Move remaining stateless code into nimbus directory.

* Add vmstate parameter to ChainRef to fix test.

* Exclude *.in from check copyright year

---------

Co-authored-by: jangko <jangko128@gmail.com>
2024-06-08 15:05:00 +07:00
tersec 5008b89185
EIP-2537 BLS12-381 G1 add/mul/exp and G2 add/mul support with tests (#2315) 2024-06-08 07:39:53 +07:00
tersec fd03038cab
Replace some usage of std/options with results Opt (#2323)
* Replace some usage of std/options with results Opt

* more updates
2024-06-07 23:39:58 +02:00
andri lim b3a5c67532
Remove exceptions from EVM (#2314)
* Remove exception from evm memory

* Remove exception from gas meter

* Remove exception from stack

* Remove exception from precompiles

* Remove exception from gas_costs

* Remove exception from op handlers

* Remove exception from op dispatcher

* Remove exception from call_evm

* Remove exception from EVM

* Fix tools and tests

* Remove exception from EVMC

* fix evmc

* Fix evmc

* Remove remnants of async evm stuff

* Remove superflous error handling

* Proc to func

* Fix errors detected by CI

* Fix EVM op call stack usage

* REmove exception handling from getVmState

* Better error message instead of just doAssert

* Remove unused validation

* Remove superflous catchRaise

* Use results.expect instead of unsafeValue
2024-06-07 15:24:32 +07:00
tersec d9375858fe
rm withdrawn EIP-2315 (#2309)
* rm withdrawn EIP-2315

* copyright year linting

* EVMC
2024-06-07 08:59:05 +07:00
Jacek Sieka 7f76586214
Speed up account ledger a little (#2279)
`persist` is a hotspot when processing blocks because it is run at least
once per transaction and loops over the entire account cache every time.

Here, we introduce an extra `dirty` map that keeps track of all accounts
that need checking during `persist` which fixes the immediate
inefficiency, though probably this could benefit from a more thorough
review - we also get rid of the unused clearCache flag - we start with
a fresh cache on every fresh vmState.

* avoid unnecessary code hash comparisons
* avoid unnecessary copies when iterating
* use EMPTY_CODE_HASH throughout for code hash comparison
2024-06-02 21:21:29 +02:00
Jacek Sieka 8b658343f6
simplify EVM callback (#2282)
The EVM callbacks currently don't require closures, so we can use
`nimcall` to reduce overhead slightly.
2024-06-02 13:00:27 +02:00
tersec cfbbcda4f7
rm unused Nim modules (#2270) 2024-06-01 17:49:46 +07:00
tersec 828fd63348
rm Miracl and remaining i386 (32-bit) build support (#2250) 2024-05-30 21:08:09 +02:00
Jacek Sieka 919242c98e
results: use canonical import (#2248) 2024-05-30 14:54:03 +02:00
andri lim 0a07425112
Cleanup unused raises in evm/state and other obsolete informations (#2243) 2024-05-30 09:03:54 +00:00
tersec 6a9ca40cc8
use --styleCheck:error (#2242)
* use --styleCheck:error

* rename evm/types.nim statusCode to evmStatus
2024-05-30 09:01:07 +00:00
andri lim eaf3d9897e
Simplify AccountsLedgerRef complexity (#2239) 2024-05-29 13:06:49 +02:00
Jordan Hrycaj 2c322054e0
Attempt to roll back stateless mode implementation in a single PR (#2209)
* Attempt to roll back stateless mode implementation in a single PR

why:
+ Stateless mode is not fully working and in the way
+ Single PR should make it feasible to investigate for a possible
  re-implementation

* Fix copyright year

* Fix annotation for exception (evmc mode)
2024-05-22 21:01:19 +00:00