Commit Graph

4004 Commits

Author SHA1 Message Date
Jordan Hrycaj 71e466d173
Block header download beacon to era1 (#2601)
* Block header download starting at Beacon down to Era1

details:
  The header download implementation is intended to be completed to a
  full sync facility.

  Downloaded block headers are stored in a `CoreDb` table. Later on they
  should be fetched, complemented by a block body, executed/imported,
  and deleted from the table.

  The Era1 repository may be partial or missing. Era1 headers are neither
  downloaded nor stored on the `CoreDb` table.

  Headers are downloaded top down (largest block number first) using the
  hash of the block header by one peer. Other peers fetch headers
  opportunistically using block numbers

  Observed download times for 14m `MainNet` headers varies between 30min
  and 1h (Era1 size truncated to 66m blocks.), full download 52min
  (anectdotal.) The number of peers downloading concurrently is crucial
  here.

* Activate `flare` by command line option

* Fix copyright year
2024-09-09 09:12:56 +00:00
Kim De Mey cb94dd0c5b
Small cleanup of the ContentDB store handler (#2597) 2024-09-07 15:03:13 +02:00
Jacek Sieka 0a8986bc77
avoid copying data when merging save points (#2584)
Saving both memory and processing, we can move entries from one
savepoint to another, specially when the target is empty as it often is
during transaction processing
2024-09-06 22:45:29 +02:00
Kim De Mey e919f57902
Remove old header network code as this no longer exists in specs (#2596) 2024-09-06 18:00:58 +02:00
Kim De Mey 8a19118ddf
Fix initial radius bug due to uninitialized localId (#2595) 2024-09-06 11:45:27 +02:00
andri lim a70bb78d27
Fix engine_sim compilation issue (#2594) 2024-09-06 11:06:31 +07:00
Kim De Mey 0a80a3bb25
Reverse calculate the radius at node restart (#2593)
This avoid restarting the node always with a full radius, which
causes the node the be bombarded with offers which it later has
to delete anyhow.

In order to implement this functionality, several changes were
made as the radius needed to move from the Portal wire protocol
current location to the contentDB and beaconDB, which is
conceptually more correct anyhow.

So radius is now part of the database objects and a handler is
used in the portal wire protocol to access its value.
2024-09-05 18:31:55 +02:00
Jacek Sieka d39c589ec3
lru cache updates (#2590)
* replace rocksdb row cache with larger rdb lru caches - these serve the
same purpose but are more efficient because they skips serialization,
locking and rocksdb layering
* don't append fresh items to cache - this has the effect of evicting
the existing items and replacing them with low-value entries that might
never be read - during write-heavy periods of processing, the
newly-added entries were evicted during the store loop
* allow tuning rdb lru size at runtime
* add (hidden) option to print lru stats at exit (replacing the
compile-time flag)

pre:
```
INF 2024-09-03 15:07:01.136+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.675 tps=1870.397 mgps=176.819 avgBps=10.288
avgTps=1574.889 avgMGps=155.952 elapsed=19m26s458ms
```

post:
```
INF 2024-09-03 13:54:26.730+02:00 Imported blocks
blockNumber=20012001 blocks=12000 importedSlot=9216851 txs=1837042
mgas=181911.265 bps=11.637 tps=1864.384 mgps=176.250 avgBps=11.202
avgTps=1714.920 avgMGps=169.818 elapsed=17m51s211ms
```

9%:ish import perf improvement on similar mem usage :)
2024-09-05 11:18:32 +02:00
Jordan Hrycaj 3c6400673d
Coredb fix kvt save only fringe condition (#2592)
* Cosmetics, spelling, etc.

* Aristo: make sure that a save cycle always commits even when empty

why:
  If `Kvt` is tied to the `Aristo` DB save cycle, then this save cycle
  must also be committed if there is no data to save for `Aristo`.

  Otherwise this will lead to excessive core memory use with some fringe
  condition where Eth headers (or blocks) are downloaded while syncing
  and not really stored on disk.

* CoreDb: Correct persistent save mode

why:
  Saving `Kvt` first is seen as a harbinger (or canary) for `Aristo` as
  both run in sync. If `Kvt` succeeds saving first, so must be `Aristo`
  next. Other than this is a defect.
2024-09-04 13:48:38 +00:00
andri lim 4d9e288340
Wiring ForkedChainRef to other components (#2423)
* Wiring ForkedChainRef to other components

- Disable majority of hive simulators
- Only enable pyspec_sim for the moment
- The pyspec_sim is using a smaller RPC service wired to ForkedChainRef
- The RPC service will gradually grow

* Addressing PR review

* Fix test_beacon/setup_env

* Enable consensus_sim (#2441)

* Enable consensus_sim

* Remove isFile check

* Enable Engine API jwt auth tests and exchange cap tests

* Enable engine api in build_sim.sh

* Wire ForkedChainRef to Engine API newPayload

* Wire Engine API getBodies to ForkedChainRef

* Wire Engine API api_forkchoice to ForkedChainRef

* Wire more RPC methods to ForkedChainRef

* Implement eth_syncing

* Implement eth_call and eth_getlogs

* TxPool: simplify smartHead

* Fix smartHead usage

* Fix txpool headDiff

* Remove hasBlockHeader and use headerExists

* Addressing review
2024-09-04 09:54:54 +00:00
Kim De Mey 486c492bd4
Bump nim-eth for uTP overflow fix (#2591) 2024-09-04 09:49:58 +00:00
web3-developer 139b35a80a
Fluffy State Bridge: State building fixes (#2589)
* Fix bug where code was not correctly being updated during contract creation.

* Delete value from update cache during db delete.
2024-09-03 21:05:38 +08:00
Jacek Sieka 35cc78c86d
add metrics for rdb lru cache (#2586)
This is a first step towards measuring the efficiency of the LRU caches
over time - metrics can be collected during import or when running
regulary.

Since `nim-metrics` carries some overhead for its default way of
reporting metrics, this PR implements a custom collector over atomic
counters, given that this is one of the hottest spots in the block
processing pipeline.

Using a compile-time flag, the same metrics can be printed on exit which
is useful when comparing different strategies for caching - here's a
recent run over blocks 16000001-1616384 - this is a good candidate to
expose in a better way in the future, maybe:

```
   state    vtype       miss        hit      total hitrate
 Account     Leaf    4909417    4466215    9375632  47.64%
 Account   Branch   20742574   72015123   92757697  77.64%
   World     Leaf     940483    1140946    2081429  54.82%
   World   Branch    8224151  131496580  139720731  94.11%
     all      all   34816625  209118864  243935489  85.73%
```
2024-09-02 17:34:10 +02:00
Jacek Sieka ef1bab0802
avoid some trivial memory allocations (#2587)
* pre-allocate `blobify` data and remove redundant error handling
(cannot fail on correct data)
* use threadvar for temporary storage when decoding rdb, avoiding
closure env
* speed up database walkers by avoiding many temporaries

~5% perf improvement on block import, 100x on database iteration (useful
for building analysis tooling)
2024-09-02 16:03:10 +02:00
Jordan Hrycaj a25ea63dec
Revert lazy implementation (#2585) 2024-09-02 10:34:42 +00:00
Jacek Sieka 84a72c8658
Use zstd compression in bottommost layer (#2582)
Tested up to block ~14m, zstd uses ~12% less space which seems to result
in a small:ish (2-4%) performance improvement on block import speed -
this seems like a better baseline for more extensive testing in the
future.

Pre: 57383308 kb
Post: 50831236 kb
2024-08-30 17:32:13 +02:00
Jordan Hrycaj 42a08cfba9
Coredb and sync maintenance update (#2583)
* bump metrics

* Remove cruft

* Cosmetics, update some logging, noise control

* Renamed `CoreDb` function `hasKey` => `hasKeyRc` and provided `hasKey`

why:
  Currently, `hasKey` returns a `Result[]` rather than a `bool` which
  is what one would expect from a function prototype of this name.

  This was a bit of an annoyance and cost unnecessary attention.
2024-08-30 11:18:36 +00:00
web3-developer ee6a7e8259
Fluffy state endpoint improvements (#2580)
* Return default values when account, slot or code doesn't exist.

* Handle case when storage doesn't exist due to account not existing or being a non contract account.
2024-08-28 16:27:36 +08:00
Jacek Sieka 8857fccb44
create per-fork opcode dispatcher (#2579)
In the current VM opcode dispatcher, a two-level case statement is
generated that first matches the opcode and then uses another nested
case statement to select the actual implementation based on which fork
it is, causing the dispatcher to grow by `O(opcodes) * O(forks)`.

The fork does not change between instructions causing significant
inefficiency for this approach - not only because it repeats the fork
lookup but also because of code size bloat and missed optimizations.

A second source of inefficiency in dispatching is the tracer code which
in the vast majority of cases is disabled but nevertheless sees multiple
conditionals being evaluated for each instruction only to remain
disabled throughout exeuction.

This PR rewrites the opcode dispatcher macro to generate a separate
dispatcher for each fork and tracer setting and goes on to pick the
right one at the start of the computation.

This has many advantages:

* much smaller dispatcher
* easier to compile
* better inlining
* fewer pointlessly repeated instruction
* simplified macro (!)
* slow "low-compiler-memory" dispatcher code can be removed

Net block import improvement at about 4-6% depending on the contract -
synthetic EVM benchmnarks would show an even better result most likely.
2024-08-28 10:20:36 +02:00
web3-developer fa59898388
Fluffy state debug endpoints (#2578)
* Add debug endpoints that support looking up state by state root.

* Test lookup by state root endpoints.
2024-08-27 20:35:27 +08:00
web3-developer 8bf581e72d
Cleanup unused exp_getProofsByBlockNumber endpoint (#2577)
* Cleanup unused exp_getProofsByBlockNumber endpoint.
2024-08-23 22:39:33 +08:00
web3-developer 507a9e71df
Fluffy state network fixes and improvements (#2576)
* Cleanup tests.

* Improve offer sort function in state bridge.

* Verify nibble prefix for leaf and extension nodes during lookup.
2024-08-23 15:46:23 +08:00
Jacek Sieka dbabe7e0a7
import: reduce stack usage (#2575)
Because EthBlock is quite large, the stack usage that results from the
multiple copies (temporary and not) present in the import command is
larger than it should be - this PR moves some of that data to a closure
environment allocated once per EthBlock - a larger restructuring of the
code is due but in the meantime, this simple change speeds up garbage
collection a little bit.
2024-08-22 10:06:45 +02:00
Jacek Sieka 22dacdd81c
t8n: enable reverse slot hash map (#2573)
turns out tracer still needs it
2024-08-20 15:23:24 +02:00
Jacek Sieka d72a73de8b
avoid digest when loading era block (#2572)
Computing the digest is unnecessary but takes a little bit of time -
remove computation and reduce mem usage slightly when loading era blocks
2024-08-20 15:23:14 +02:00
Jordan Hrycaj 4db9c5c2d5
Small updates and fixes for rlpx suite (#2571)
* Remove redundant `eth/68` message and clean up docu

details:
  There is only eth/68 available at the moment

* Allow to turn on chronicles line number logging in `Makefile`

* Accept (and forget) tx hashes announcements

why:
  Does no harm to just ignore it at the moment

* Bump nim-eth (rlp fix)
2024-08-19 14:00:10 +00:00
web3-developer 60c9b2c00d
Fluffy State Bridge: Fetch portal client nodeId on startup and sort offers by distance from nodeId (#2570)
* Fetch portal client nodeId on startup and sort offers by distance from nodeId.

* Fix logging.

* Improve error handling at startup.
2024-08-19 20:49:07 +08:00
Jacek Sieka 9826557184
fix make_states script dir 2024-08-19 10:16:32 +02:00
Jacek Sieka 226cdb7c68
avoid exceptions, tx copy (#2569)
* avoid some exceptions using the new compile-time strformat
* avoid copying the full transaction only to normalise 2 fields
(expensive)
2024-08-19 09:42:07 +02:00
Jacek Sieka 5941fef211
Avoid unnecessary layer copy (#2567)
When the stack has an empty layer on top, there's no need to copy the
contents of `top` to it since it would be the same.

~13% processing saved (!)

pre
```
INF 2024-08-17 19:11:31.748+02:00 Imported blocks
blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812
mgas=181135.177 bps=8.763 tps=1375.062 mgps=132.125 avgBps=6.798
avgTps=1018.501 avgMGps=102.617 elapsed=29m25s154ms
```

post
```
INF 2024-08-17 18:22:52.513+02:00 Imported blocks
blockNumber=18667648 blocks=12000 importedSlot=7860043 txs=1797812
mgas=181135.177 bps=9.648 tps=1513.961 mgps=145.472 avgBps=7.876
avgTps=1179.998 avgMGps=118.888 elapsed=25m23s572ms
```
2024-08-19 08:46:04 +02:00
web3-developer c8d34eba9b
Bump portal-spec-tests and update Fluffy state tests (#2568)
* Bump portal-spec-tests.

* Update state network tests to use block_header instead of state_root.
2024-08-19 14:45:54 +08:00
web3-developer 9699293bfc
Update nim-rocksdb to latest version and cleanup outdated RocksDb install instructions in readme. (#2566) 2024-08-16 08:23:04 +02:00
Jacek Sieka 43d93bcdab
Don't write slot hashes on import (#2564)
The reverse slot hash mechanism causes quite a bit of database traffic
but is broadly not useful except for iterating the storage of an
account, something that a validator never does (it's used by the
tracers).

This flag adds one more thing that is not stored in the database, to be
explored more comprehensively when designing full, validator and archive
modes with different pruning options in the future.

`ldb` says this is 60gb of data (!):
```
ldb --db=. --ignore_unknown_options --column_family=KvtGen approxsize
--hex --from=0x05
--to=0x05ffffffffffffffffffffffffffffffffffffffffffffff
66488353954
```
2024-08-16 08:22:51 +02:00
Jordan Hrycaj 4dbc1653ea
Cleanup (#2565)
* Move snap un-dumpers to aristo unit test folder

why:
  The only place where it is used, now to test the database against
  legacy snap sync dump samples.

   While the details of the dumped data have mostly outlived their purpuse,
   its use as **entropy** data thrown against `Aristo` has still been
   useful to find/debug tricky DB problems.

* Remove cruft

* `nimbus-eth1-blobs` not used anymore as test data source
2024-08-15 12:31:07 +00:00
Jordan Hrycaj cbe5131927
Simplify aristo tree deletion functionality (#2563)
* Cleaning up, removing cruft and debugging statements

* Make `aristo_delta` fluffy compatible

why:
  A sub-module that uses `chronicles` must import all possible
  modules used by a parent module that imports the sub-module.

* update TODO
2024-08-14 12:09:30 +00:00
Jordan Hrycaj d148de5b1c
Remove chunked rlpx (#2562)
* bump nim-eth

* Update make environment
2024-08-14 10:56:49 +00:00
Jacek Sieka ec7dd5ccf6
results: bump (lent) (#2561) 2024-08-14 11:02:14 +02:00
Jordan Hrycaj ce713d95fc
Aristo lazily delete larger subtrees (#2560)
* Extract sub-tree deletion functions into separate sub-modules

* Move/rename `aristo_desc.accLruSize` => `aristo_constants.ACC_LRU_SIZE`

* Lazily delete sub-trees

why:
  This gives some control of the memory used to keep the deleted vertices
  in the cached layers. For larger sub-trees, keys and vertices might be
  on the persistent backend to a large extend. This would pull an amount
  of extra information from the backend into the cached layer.

  For lazy deleting it is enough to remember sub-trees by a small set of
  (at most 16) sub-roots to be processed when storing persistent data.
  Marking the tree root deleted immediately allows to let most of the code
  base work as before.

* Comments and cosmetics

* No need to import all for `Aristo` here

* Kludge to make `chronicle` usage in sub-modules work with `fluffy`

why:
  That `fluffy` would not run with any logging in `core_deb` is a problem
  I have known for a while. Up to now, logging was only used for debugging.

  With the current `Aristo` PR, there are cases where logging might be
  wanted but this works only if `chronicles` runs without the
  `json[dynamic]` sinks.

  So this should be re-visited.

* More of a kludge
2024-08-14 08:54:44 +00:00
Jacek Sieka e3908a7b0d
results: bump (genericsOpenSym support) (#2559)
Makes `error` in `valueOr` behave.
2024-08-13 14:50:43 +02:00
Jordan Hrycaj 7becf4e389
Remove vertex ID recycle function (#2558)
why:
  It is not safe in general to recycle vertex IDs while the `RocksDb`
  cache has `VertexID` rather than `RootedVertexID` where the former
  type seems preferable.

  In some fringe cases one might remove a vertex with key `(root1,vid)`
  and insert another vertex with key `(root2,vid)` while re-using the
  vertex ID `vid`. Without knowledge of `root1` and `root2`, the LRU
  cache will return the same vertex for `(root2,vid)` also for
  `(root1,vid)`.
2024-08-12 20:56:15 +00:00
Jacek Sieka 8723a79225
add era dir to make_states 2024-08-12 14:49:32 +02:00
Jacek Sieka 19451cadff
rebalance rocksdb cache sizes (#2557)
Based on some simple testing done with a few combinations of cache
sizes, it seems that the block cache has grown in importance compared to
the where we were before changing on-disk format and adding a lot of
other point caches.

With these settings, there's roughly a 15% performance increase when
processing blocks in the 18M range over the status quo while memory
usage decreases by more than 1gb!

Only a few values were tested so there's certainly more to do here but
this change sets up a better baseline for any future optimizations.

In particular, since the initial defaults were chosen root vertex id:s
were introduced as key prefixes meaning that storage for each account
will be grouped together and thus it becomes more likely that a block
loaded from disk will be hit multiple times - this seems to give the
block cache an edge over the row cache, specially when traversing the
storage trie.
2024-08-12 05:52:09 +00:00
Jacek Sieka b3184b716e
bncurve: bump (#2555)
...for significant speed gain
2024-08-09 13:02:30 +02:00
Jacek Sieka 32e2206d68
fix fluffy/eth (#2556)
* fix fluffy/eth

* revert speedups
2024-08-09 09:00:00 +02:00
andri lim b8e128203f
Rewire blockValue from Txpool to EngineAPI (#2554) 2024-08-09 06:05:18 +07:00
Jacek Sieka 3dc30195ad
log http/jwt information on startup (#2553) 2024-08-08 10:03:30 +00:00
Jacek Sieka 094486d0ce
Hash bump 2024-08-08 07:46:35 +02:00
Jacek Sieka 3cefd7ed38
move db init to init (#2552)
When using the common interface, the database always (potentially) needs
init - take the opportunity to log some basic database info on startup.
2024-08-08 07:45:30 +02:00
web3-developer 93a160b569
Update Fluffy State Network to match Portal spec addressHash change (#2548)
* Update state network to use addressHash instead of address in contract trie and contract code content keys.

* Fix path calculation bug in getParent when working with extension nodes.

* Bump portal spec tests repo.

* Finish updating tests due to portal test vector changes.

* Update Fluffy state bridge to use addressHash.

* Update Fluffy book with correct commands for running portal hive tests.
2024-08-08 00:01:30 +08:00
andri lim d5786758b5
TxPool: Merge tx_chain and tx_packer to reduce complexity (#2549)
* TxPool: Merge tx_chain and tx_packer to reduce complexity

* Fix copyright year
2024-08-07 22:35:17 +07:00