18 Commits

Author SHA1 Message Date
Jordan Hrycaj c5e895aaab
Code reorg 4 snap sync suite (#1560)
* Rename `playXXX` => `passXXX`

why:
  Better purpose match

* Code massage, log message updates

* Moved `ticker.nim` to `misc` folder to be used the same by full and snap sync

why:
  Simplifies maintenance

* Move `worker/pivot*` => `worker/pass/pass_snap/*`

why:
  better for maintenance

* Moved helper source file => `pass/pass_snap/helper`

* Renamed ComError => GetError, `worker/com/` => `worker/get/`

* Keep ticker enable flag in worker descriptor

why:
  This allows passing the flag with the descriptor rather than as an extra
  function argument when calling the setup function.

* Extracted setup/release code from `worker.nim` => `pass/pass_init.nim`
2023-04-24 21:24:07 +01:00
Jordan Hrycaj 0a3bc102eb
Pre functional snap to full sync (#1546)
* Update sync scheduler pool mode

why:
  The pool mode allows looping over active peers one after another. This
  is ideal for soft re-starting peers. As this is a two tier experience
  (start/stop, setup/release) the loop must be run twice. This is
  controlled by a more rigid re-definition of how to use the `poolMode`
  flag.

* Mitigate RLP serialiser deficiency

why:
  Currently, serialising the `BlockBody` is not convertible and needs
  to be checked in the `eth` module. For now, a local fix for the
  wire protocol applies. Unit tests will stay (after this local solution
  has been removed.)

* Code cosmetics and massage

details:
  Main part is `types.toStr()` as a unified function for logging block
  numbers.

* Allow to use a logical genesis replacement (start of history)

why:
  Snap sync will set up an arbitrary pivot at a block number different
  from zero. In fact, the higher the block number the better.

details:
  A non-genesis start of history will currently only affect the score
  values which were derived from the difficulty.

* Provide function to store the snap pivot block header in chain db

why:
  Together with the start of history facility, this allows proceeding
  with full syncing once snap has finished.

details:
  Snap db storage was switched from sub-tables to the flat chain db.

* Provide database completeness and sanity checker

details:
  For debugging on smaller databases, only

* Implement snap -> full sync switch
2023-04-14 23:28:57 +01:00
Jordan Hrycaj fe3a6d67c6
Prepare snap server client test scenario cont2 (#1487)
* Clean up some function prototypes

why:
  Simplify polymorphic prototype variances for easier maintenance.

* Fix fringe condition crash when importing bogus RLP node

why:
  Accessing non-list RLP entry as a list causes `Defect`

* Fix left boundary proof at range extractor

why:
  Was insufficient. The main problem was that there was no unit test for
  the validity of the generated left boundary.

* Handle incomplete left boundary proofs early

why:
  Attempting to do it later leads to overly complex code in order to prevent
  looping when the same peer repeatedly sends the same incomplete proof.

  In contrast, gaps in the leaf sequence can be handled gracefully by
  registering the gaps.

* Implement a manual pivot setup mechanism for snap sync

why:
  For a test scenario it is convenient to set the pivot to something
  lower than the beacon header from the consensus layer. This does not
  need to rely on any RPC mechanism.

details:
  The file containing the pivot specs is specified by the
  `--sync-ctrl-file` option. It is regularly parsed for updates.

* Fix calculation error

why:
  Prevent calculating the square root of a negative number
2023-03-07 14:23:22 +00:00
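A note on the "Fix calculation error" item in #1487: the commit message does not say where the square root is taken. A common place for this kind of error is a standard deviation computed from running sums, where rounding can make the variance slightly negative. The sketch below illustrates that guard under this assumption; `std_dev` and its parameters are hypothetical names, not taken from the code base.

```python
import math


def std_dev(total: float, total_sq: float, n: int) -> float:
    """Standard deviation from running sums (hypothetical helper)."""
    if n <= 0:
        return 0.0
    mean = total / n
    variance = total_sq / n - mean * mean
    # Rounding can push a near-zero variance slightly below zero;
    # clamp it so math.sqrt() never sees a negative argument.
    return math.sqrt(max(variance, 0.0))
```

Clamping at zero is one possible design choice; returning an error value instead would also avoid the negative argument.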
Jordan Hrycaj 10ad7867e4
Prepare snap server client test scenario cont1 (#1485)
* Renaming androgynous sub-object names according to where they belong

why:
  These objects are not explicitly dealt with. They give meaning to
  some generic wrapper objects. Naming them after their origin may
  help troubleshooting.

* Redefine proof nodes list data type for `snap/1` wire protocol

why:
  The current specification suffered from the fact that the basic data
  type for a proof node is an RLP encoded hexary node. This slightly
  confused the encoding/decoding magic.

details:
  This is the second attempt, now wrapping the `seq[Blob]` into a
  wrapper object of `seq[SnapProof]` for a distinct alias sequence.

  In the previous attempt, `SnapProof` was a wrapper object holding the
  `Blob` with magic applied to the `seq[]`. This needed the `append`
  mixin to strip the outer wrapper that was applied to the `Blob` already
  when it was passed as argument.

* Fix some prototype inconsistency

why:
  For easy reading, the `getAccountRange()` handler return code should
  resemble the `accountRange()` arguments prototype.
2023-03-03 20:01:59 +00:00
Jordan Hrycaj f20f20f962
Prepare snap server client test scenario (#1483)
* Enable `snap/1` accounts range service

* Allow to change the garbage collector to `boehm` as a Makefile option.

why:
  There is still an unsolved memory corruption problem that might be
  related to the standard `gc`. It seemingly goes away if the `gc` is
  changed to `boehm`.

  Specifying another `gc` on the make level simplifies debugging and
  development.

* Code cosmetics

details:
* updated exception annotations
* extracted `worker_desc.nim` from `full/worker.nim`
* etc.

* Implement option to state a sync modifier file

why:
  This allows specifying extra sync type specific options which might
  change over time. This file is regularly checked for updates.

* Implement a threshold when to suspend full syncing

why:
  For a test scenario, a full sync peer may work as a local snap server.
  There is no need to download the full block chain.

details:
  The file containing the pivot specs is specified by the
  `--sync-ctrl-file` option. It is regularly parsed for updates.
2023-03-02 09:57:58 +00:00
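A note on the sync modifier file in #1483, which is "regularly checked for updates": one simple way to do this is to compare the file's modification time and size on each poll and re-read it only when they change. The sketch below assumes this approach; the class name `CtrlFileWatcher` is hypothetical, and the file format is not interpreted because the commit message does not describe it.

```python
import os
from typing import List, Optional, Tuple


class CtrlFileWatcher:
    """Re-reads a control file only when it has changed (hypothetical)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._stamp: Optional[Tuple[int, int]] = None

    def poll(self) -> Optional[List[str]]:
        """Return the non-empty lines if the file changed, else None."""
        try:
            st = os.stat(self.path)
        except OSError:
            return None  # a missing file is treated as "no update"
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp == self._stamp:
            return None
        self._stamp = stamp
        with open(self.path, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]
```

A caller would invoke `poll()` from its periodic loop and apply new settings only when the result is not `None`.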
Jordan Hrycaj bf53226c2c
Minor updates for testing and cosmetics (#1476)
* Fix locked database file annoyance with unit tests on Windows

why:
  Need to clean up old files first from previous session as files remain
  locked despite closing of database.

* Fix initialisation order

detail:
  Apparently this has no real effect as the ticker is only initialised
  here but started later.

  This possible bug has been in for a while and was running with the
  previous compiler and libraries.

* Better naming of data fields for sync descriptors

details:
* BuddyRef[S,W]: buddy.data -> buddy.only
* CtxRef[S]: ctx.data -> ctx.pool
2023-02-23 13:13:02 +00:00
Jordan Hrycaj 880313d7a4
Silence some compiler gossip -- part 8, sync (#1467)
details:
  Adding some missing exception annotation
2023-02-14 23:38:33 +00:00
Jordan Hrycaj 89ae9621c4
Silence compiler gossip after nim upgrade (#1454)
* Silence some compiler gossip -- part 1, tx_pool

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 2, clique

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 3, misc core

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 4, sync

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Clique update

why:
  Missing exception annotation
2023-01-30 22:10:23 +00:00
Jordan Hrycaj d55a72ae49
Full sync peer negotiation control (#1390)
* Additional logging for scheduler

* Fix duplicate occurrence of `bestNumber`

why:
  Happened when the `block_queue` module was separated out of
  the `worker` module. Somehow testing was insufficient or skipped
  altogether.

* Update `runPool()` mixin for scheduler

details:
  Could be simplified

* Dynamically adapt pivot header negotiation mode

details:
  After accepting one peer and some timeout, do not search for more
  peers for start syncing but rather continue in relaxed mode with a
  single peer.
2022-12-18 16:06:43 +00:00
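A note on "Dynamically adapt pivot header negotiation mode" in #1390: the rule described there (after one accepted peer and some timeout, continue in relaxed mode with a single peer) can be written as a small decision function. This is a sketch only; `PivotNegotiator`, `min_peers` and `relax_after` are hypothetical names, and the default values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class PivotNegotiator:
    min_peers: int = 2         # peers wanted in strict mode (assumed value)
    relax_after: float = 30.0  # seconds before relaxing (assumed value)

    def ready(self, accepted: int, elapsed: float) -> bool:
        """Decide whether syncing may start with the peers accepted so far."""
        if accepted >= self.min_peers:
            return True  # strict mode satisfied
        # Relaxed mode: a single accepted peer is enough after the timeout.
        return accepted >= 1 and elapsed >= self.relax_after
```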
Jordan Hrycaj c0d580715e
Remodel persistent snapdb access (#1274)
* Re-model persistent database access

why:
  Storage slots healing just ran on the wrong sub-trie (i.e. the wrong
  key mapping). So get/put and bulk functions now use the definitions
  in `snapdb_desc` (earlier there were some shortcuts for `get()`.)

* Fixes: missing return code, typo, redundant imports etc.

* Remove obsolete debugging directives from `worker_desc` module

* Correct failing unit tests for storage slots trie inspection

why:
  Some pathological cases for the extended tests do not produce any
  hexary trie data. This is rightly detected by the trie inspection
  and the result checks needed to be adjusted.
2022-10-20 17:59:54 +01:00
jangko 3fa1b012e6
initial wire protocol transformation
rework on the eth wire protocol handlers.
currently still missing 4 handlers implementation.
but the framework is ready for expansion.
2022-10-15 19:48:21 +07:00
Jordan Hrycaj d53eacb854
Prep for full sync after snap (#1253)
* Split fetch accounts into sub-modules

details:
  There will be separated modules for accounts snapshot, storage snapshot,
  and healing for either.

* Allow to rebase pivot before negotiated header

why:
  Peers do not seem to have many snapshots available. By setting back the
  pivot block header slightly, the chances might be higher to find more
  peers to serve this pivot. An experiment on mainnet showed that when
  setting back too much (tested with 1024), the chances of finding
  matching snapshot peers seem to decrease.

* Add accounts healing

* Update variable/field naming in `worker_desc` for readability

* Handle leaf nodes in accounts healing

why:
  There is no need to fetch accounts when they had been added by the
  healing process. On the flip side, these accounts must be checked for
  storage data and the batch queue updated, accordingly.

* Reorganising accounts hash ranges batch queue

why:
  The aim is to formally cover as many accounts as possible for different
  pivot state root environments. Formerly, this was tried by starting the
  accounts batch queue at a random value for each pivot (and wrapping
  around.)

  Now, each pivot environment starts with an interval set mutually
  disjoint from any interval set retrieved with other pivot state roots.

also:
  Stop fishing for more pivots in `worker` if 100% download is reached

* Reorganise/update accounts healing

why:
  Error handling was wrong and the (math. complexity of) whole process
  could be better managed.

details:
  Much of the algorithm is now documented at the top of the file
  `heal_accounts.nim`
2022-10-08 18:20:50 +01:00
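A note on "Reorganising accounts hash ranges batch queue" in #1253: giving each pivot an interval set that does not overlap ranges already fetched for other pivots amounts to subtracting the covered intervals from the full range. The sketch below shows that subtraction on small inclusive integer intervals; the real account hash space is far larger, and `uncovered` is a hypothetical helper rather than a function from the code base.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # inclusive (low, high)


def uncovered(lo: int, hi: int, covered: Iterable[Interval]) -> List[Interval]:
    """Return the parts of [lo, hi] not contained in any covered interval."""
    gaps: List[Interval] = []
    cursor = lo
    for a, b in sorted(covered):
        if b < cursor:
            continue  # entirely before the current position
        if a > hi:
            break     # entirely after the range of interest
        if a > cursor:
            gaps.append((cursor, a - 1))
        cursor = max(cursor, b + 1)
        if cursor > hi:
            break
    if cursor <= hi:
        gaps.append((cursor, hi))
    return gaps
```

A new pivot environment would then be seeded with `uncovered(...)` computed against the union of ranges fetched for earlier pivots.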
Jordan Hrycaj eca5882238
Isolating sync action modules (#1249)
* Miscellaneous updates TBC

* Disentangled pivot2 module from snap

why:
  Wrote as template on top of sync so it can be shared by fast and snap
  sync.

* Renamed and relocated pivot sources

* Integrated `best_pivot` module into full and snap sync

why:
  Full sync used an older version of `best_pivot`

* isolating download module from full sync

why:
  might be shared with snap sync at a later stage
2022-09-30 09:22:14 +01:00
Jordan Hrycaj 4ff0948fed
Snap sync accounts healing (#1225)
* Added inspect module

why:
  Find dangling references for trie healing support.

details:
 + This patch set provides only the inspect module and some unit tests.
 + There are also extensive unit tests which need bulk data from the
   `nimbus-eth1-blob` module.

* Alternative pivot finder

why:
  Attempt to be faster on start up. Also trying to decouple pivot finder
  somehow by providing different mechanisms (this one runs in `single`
  mode.)

* Use inspect module for healing

details:
 + After some progress with account and storage data, the inspect facility
   is used to find dangling links in the database to be filled node-wise.
 + This is a crude attempt to cobble together functional elements. The
   set up needs to be honed.

* fix scheduler to avoid starting dead peers

why:
  Some peers drop out while in `sleepAsync()`. So extra `if` clauses
  make sure that this event is detected early.

* Bug fixes causing crashes

details:

+ prettify.toPC():
  int/intToStr() numeric range over/underflow

+ hexary_inspect.hexaryInspectPath():
  take care of half initialised step with branch but missing index into
  branch array

* improve handling of dropped peers in alternative pivot finder

why:
  Strange things may happen while querying data from the network.
  Additional checks make sure that the state of other peers is updated
  immediately.

* Update trace messages

* reorganise snap fetch & store schedule
2022-09-16 08:24:12 +01:00
Jordan Hrycaj de2c13e136
Update snap offline tests (#1199)
* Re-implemented `hexaryFollow()` in a more general fashion

details:
+ New name for re-implemented `hexaryFollow()` is `hexaryPath()`
+ Renamed `rTreeFollow()` as `hexaryPath()`

why:
  Returning similarly organised structures, the results of the
  `hexaryPath()` functions become comparable when running over
  the persistent and the in-memory databases.

* Added traversal functionality for persistent ChainDB

* Using `Account` values as re-packed Blob

* Repack samples as compressed data files

* Produce test data

details:
+ Can force pivot state root switch after minimal coverage.
+ For emulating certain network behaviour, downloading accounts stops for
  a particular pivot state root if 30% (some static number) coverage is
  reached. Following accounts are downloaded for a later pivot state root.
2022-08-24 14:44:18 +01:00
Jordan Hrycaj 5f0e89a41e
Snap accounts bulk import preparer (#1183)
* Provided common scheduler API, applied to `full` sync

* Use hexary trie as storage for proofs_db records

also:
 + Store metadata with account for keeping track of account state
 + add iterator over accounts

* Common scheduler API applied to `snap` sync

* Prepare for accounts bulk import

details:
+ Added some ad-hoc checks for proving accounts data received from the
  snap/1 (will be replaced by proper database version when ready)
+ Added code that dumps some of the received snap/1 data into a file
  (turned off by default, see `worker_desc.nim`)
2022-08-04 09:04:30 +01:00
Jordan Hrycaj 73b628491d
Clique snapshots reorg (#1169)
* Add persistent snapshot size logging

why:
  Suspecting too much space used

snapshot statistic:
  [..]
  blockNumber=2214912 nSnaps=2236 snapsTotal=1.14m
  blockNumber=2215936 nSnaps=2237 snapsTotal=1.14m
  [..]
  Persisting blocks fromBlock=2216449 toBlock=2216640
  36458496	datadir-nimbus-goerlish/data/nimbus/

* Replace legacy `lru_cache` by `keyed_queue`

why:
  `keyed_queue` generalises `lru_cache`

snapshot statistic:
  [..]
  blockNumber=2234368 nSnaps=2259 snapsTotal=1.15m
  blockNumber=2235392 nSnaps=2260 snapsTotal=1.15m
  [..]
  Persisting blocks fromBlock=2235649 toBlock=2235840
  37627288	datadir-nimbus-goerlish/data/nimbus/

* Increase persistent snapshot storage interval by 300%

snapshot statistic:
      [..]
      blockNumber=2232320 nSnaps=620 snapsTotal=0.30m
      blockNumber=2236416 nSnaps=621 snapsTotal=0.30m
      [..]
      Persisting blocks fromBlock=2237185 toBlock=2237376
      37627288	datadir-nimbus-goerlish/data/nimbus/

* Cull legacy debugging environment for clique

why:
  Chronicles provides a better choice (when properly set up)
2022-07-21 19:16:28 +01:00
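A note on "Increase persistent snapshot storage interval by 300%" in #1169: the figure can be checked against the block numbers printed in the snapshot statistics of that commit. The short calculation below uses only those logged values; the variable names are illustrative.

```python
# Consecutive snapshot block numbers from the statistics, before the change.
before = 2215936 - 2214912
# Consecutive snapshot block numbers from the statistics, after the change.
after = 2236416 - 2232320

# Relative increase of the interval in percent.
increase_pct = (after - before) * 100 // before

print(before, after, increase_pct)  # prints "1024 4096 300"
```

An increase of 300% means the interval is four times as long, which fits the drop in `snapsTotal` from 1.15m to 0.30m.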
Jordan Hrycaj 5d98f68c09
Sync update to work with sepolia reorgs (#1168)
* Error return in `persistBlocks()` on initial `VmState` problem

why:
  previously threw an exception

* Updated sync mode option

why:
 using enum rather than bool => space for more

* Added sync mode `full`, refactored legacy sync

also:
  rebased

* Fix typo (crashes `persistBlocks()` otherwise)

also:
  rebase to master

* Reduce log ticker noise by suppressing duplicate messages

* Clarify staged queue overflow handling

why:
  backtrack/re-org mode in `stageItem()` should be detected by either
  the global indicator or the work item where it might have moved into.

also:
  rebased
2022-07-21 13:14:41 +01:00
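A note on "Reduce log ticker noise by suppressing duplicate messages" in #1168: the commit message does not describe the mechanism. A minimal sketch, assuming the ticker simply skips a message that is identical to the previous one, could look as follows; `QuietTicker` is a hypothetical name.

```python
from typing import Callable, Optional


class QuietTicker:
    """Forwards a message only if it differs from the last one emitted."""

    def __init__(self, emit: Callable[[str], None]) -> None:
        self._emit = emit
        self._last: Optional[str] = None

    def tick(self, message: str) -> bool:
        """Emit the message unless it repeats the previous one."""
        if message == self._last:
            return False  # duplicate, suppressed
        self._last = message
        self._emit(message)
        return True
```

Only consecutive duplicates are suppressed here, so a message that reappears after a different one is logged again.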