Commit Graph

27 Commits

Author SHA1 Message Date
Jordan Hrycaj c5e895aaab
Code reorg 4 snap sync suite (#1560)
* Rename `playXXX` => `passXXX`

why:
  Better purpose match

* Code massage, log message updates

* Moved `ticker.nim` to `misc` folder to be used the same by full and snap sync

why:
  Simplifies maintenance

* Move `worker/pivot*` => `worker/pass/pass_snap/*`

why:
  better for maintenance

* Moved helper source file => `pass/pass_snap/helper`

* Renamed ComError => GetError, `worker/com/` => `worker/get/`

* Keep ticker enable flag in worker descriptor

why:
  This allows to pass this flag with the descriptor and not an extra
  function argument when calling the setup function.

* Extracted setup/release code from `worker.nim` => `pass/pass_init.nim`
2023-04-24 21:24:07 +01:00
Jordan Hrycaj 0a3bc102eb
Pre functional snap to full sync (#1546)
* Update sync scheduler pool mode

why:
  The pool mode allows to loop over active peers one after another. This
  is ideal for soft re-starting peers. As this is a two tier experience
  (start/stop, setup/release) the loop must be run twice. This is
  controlled by a more rigid re-definition of how to use the `poolMode`
  flag.

* Mitigate RLP serialiser deficiency

why:
  Currently, serialising the `BlockBody` in not conevrtible and need
  to be checked in the `eth` module. Currently a local fix for the
  wire protocol applies. Unit tests will stay (after this local solution
  will have been removed.)

* Code cosmetics and massage

details:
  Main part is `types.toStr()` as a unified function for logging block
  numbers.

* Allow to use a logical genesis replacement (start of history)

why:
  Snap sync will set up an arbitrary pivot at a block number different
  from zero. In fact, the higher the block number the better.

details:
  A non-genesis start of history will currently only affect the score
  values which were derived from the difficulty.

* Provide function to store the snap pivot block header in chain db

why:
  Together with the start of history facility, this allows to proceed
  with full syncing once snap has finished.

details:
  Snap db storage was switched from a sub-tables to the flat chain db.

* Provide database completeness and sanity checker

details:
  For debugging on smaller databases, only

* Implement snap -> full sync switch
2023-04-14 23:28:57 +01:00
Jordan Hrycaj 9facab91cb
Prepare snap client for continuing with full sync (#1534)
* Somewhat tighten error handling

why:
  Zombie state is invoked when the current peer turns out to be useless
  for further communication. While there is a chance to further talk
  to a peer about another topic (aka healing) after some protocol failure,
  it makes no sense to do so after a network problem.

  The latter state is explained bu the `peerDegraded` flag that goes
  together with the `zombie` state flag. A degraded peer is dropped
  immediately.

* Remove `--sync-mode=snapCtx` option, always start snap in recovery mode

why:
  No need for a snap sync option without recovery mode, can be achieved
  by deleting the database.

* Code cosmetics, typos, prettify logging, debugging helper, etc.

* Split off snap sync sub-mode handler into separate modules

details:
  The original `worker.nim` source has become a multiplexer for several
  snap sync sub-modes `full` and `snap`. The source modules of the
  incarnations of a particular sync sub-mode are places into the
  `worker/play` directory.

* Update ticker for snap and full sync logging
2023-04-06 20:42:07 +01:00
Jordan Hrycaj 10ad7867e4
Prepare snap server client test scenario cont1 (#1485)
* Renaming androgynous sub-object names according to where they belong

why:
  These objects are not explicitly dealt with. They give meaning to
  some generic wrapper objects. Naming them after their origin may
  help troubleshooting.

* Redefine proof nodes list data type for `snap/1` wire protocol

why:
  The current specification suffered from the fact that the basic data
  type for a proof node is an RLP encoded hexary node. This slightly
  confused the encoding/decoding magic.

details:
  This is the second attempt, now wrapping the `seq[Blob]` into a
  wrapper object of `seq[SnapProof]` for a distinct alias sequence.

  In the previous attempt, `SnapProof` was a wrapper object holding the
  `Blob` with magic applied to the `seq[]`. This needed the `append`
  mixin to strip the outer wrapper that was applied to the `Blob` already
  when it was passed as argument.

* Fix some prototype inconsistency

why:
  For easy reading, `getAccountRange()` handler return code should
  resemble the `accoundRange()` anruments prototype.
2023-03-03 20:01:59 +00:00
Jordan Hrycaj f20f20f962
Prepare snap server client test scenario (#1483)
* Enable `snap/1` accounts range service

* Allow to change the garbage collector to `boehm` as a Makefile option.

why:
  There is still an unsolved memory corruption problem that might be
  related to the standard `gc`. It seemingly goes away if the `gc` is
  changed to `boehm`.

  Specifying another `gc` on the make level simplifies debugging and
  development.

* Code cosmetics

details:
* updated exception annotations
* extracted `worker_desc.nim` from `full/worker.nim`
* etc.

* Implement option to state a sync modifier file

why:
  This allows to specify extra sync type specific options which might
  change over time. This file is regularly checked for updates.

* Implement a threshold when to suspend full syncing

why:
  For a test scenario, a full sync beep may work as a local snap server.
  There is no need to download the full block chain.

details:
  The file containing the pivot specs is specified by the
  `--sync-ctrl-file` option. It is regularly parsed for updates.
2023-03-02 09:57:58 +00:00
Jordan Hrycaj bf53226c2c
Minor updates for testing and cosmetics (#1476)
* Fix locked database file annoyance with unit tests on Windows

why:
  Need to clean up old files first from previous session as files remain
  locked despite closing of database.

* Fix initialisation order

detail:
  Apparently this has no real effect as the ticker is only initialised
  here but started later.

  This possible bug has been in all for a while and was running with the
  previous compiler and libraries.

* Better naming of data fields for sync descriptors

details:
* BuddyRef[S,W]: buddy.data -> buddy.only
* CtxRef[S]: ctx.data -> ctx.pool
2023-02-23 13:13:02 +00:00
Jordan Hrycaj 880313d7a4
Silence some compiler gossip -- part 8, sync (#1467)
details:
  Adding some missing exception annotation
2023-02-14 23:38:33 +00:00
Jordan Hrycaj 89ae9621c4
Silence compiler gossip after nim upgrade (#1454)
* Silence some compiler gossip -- part 1, tx_pool

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 2, clique

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 3, misc core

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Silence some compiler gossip -- part 4, sync

details:
  Mostly removing redundant imports and `Defect` tracer after switch
  to nim 1.6

* Clique update

why:
  Missing exception annotation
2023-01-30 22:10:23 +00:00
Jordan Hrycaj 6fb48517ba
Add snap protocol service stub (#1438)
* Cosmetics, update logger `topics`

* Clean up sync/start methods in nimbus

why:
* The `protocols` list selects served (as opposed to sync) protocols only.
* The `SyncMode.Default` object is allocated with the other possible sync
  mode objects.

* Add snap service stub to `nimbus`

* Provide full set of snap response handler stubs

* Bicarb for the latest CI hiccup

why:
  Might be a change in the CI engine for MacOS.
2023-01-20 15:01:29 +00:00
Jordan Hrycaj 179b4adac3
Snap sync tweaks n fixes (#1359)
* Miscellaneous tweaks & fixes

details:
+ Catch `TransportError` exception in `legacy.nim` module
+ Fix self-calling wrapper `hexaryEnvelopeTouchedBy()`

* Update documentation, logging etc.

* Changed `checkNode` batch list `seq[Blob]` => `seq[NodeSpecs]`

why:
  The `NodeSpecs` type as used here is a tuple `(partial-path,node-key)`.

  When `checkNode` partial paths are collected, also the node key is
  available so it should be registered and not repeatedly recovered from
  the database.

* Add optional begin/end trace statement in snap scheduler

why:
  Allows to trace invoked entity and scheduler state variables
2022-12-09 13:43:55 +00:00
jangko 94a94c5b65 implement better hardfork management 2022-12-02 13:51:42 +07:00
Jordan Hrycaj 7688148565
Snap sync can start on saved checkpoint (#1327)
* Stop negotiating pivot if peer repeatedly replies w/usesless answers

why:
  There is some fringe condition where a peer replies with legit but
  useless empty headers repetely. This goes on until somebody stops.
  We stop now.

* Rename `missingNodes` => `sickSubTries`

why:
  These (probably missing) nodes represent in reality fully or partially
  missing sub-tries. The top nodes may even exist, e.g. as a shallow
  sub-trie.

also:
  Keep track of account healing on/of by bool variable `accountsHealing`
  controlled in `pivot_helper.execSnapSyncAction()`

* Add `nimbus` option argument `snapCtx` for starting snap recovery (if any)

also:
+ Trigger the recovery (or similar) process from inside the global peer
  worker initialisation `worker.setup()` and not by the `snap.start()`
  function.
+ Have `runPool()` returned a `bool` code to indicate early stop to
  scheduler.

* Can import partial snap sync checkpoint at start

details:
 + Modified what is stored with the checkpoint in `snapdb_pivot.nim`
 + Will be loaded within `runDaemon()` if activated

* Forgot to import total coverage range

why:
  Only the top (or latest) pivot needs coverage but the total coverage
  is the list of all ranges for all pivots -- simply forgotten.
2022-11-25 14:56:42 +00:00
Jordan Hrycaj 9aa925cf36
Update sync scheduler (#1297)
* Add `stop()` methods to shutdown to shutdown procedure

why:
  Nasty behaviour when hitting Ctrl-C, otherwise

* Add background service to sync scheduler

why:
  The background service will be used for sync data import and recovery
  after restart.

   It is controlled by the sync scheduler for an easy turn/on off API.

also:
  Simplified snap ticker time calc.

* Fix typo
2022-11-14 14:13:00 +00:00
Jordan Hrycaj c0d580715e
Remodel persistent snapdb access (#1274)
* Re-model persistent database access

why:
  Storage slots healing just run on the wrong sub-trie (i.e. the wrong
  key mapping). So get/put and bulk functions now use the definitions
  in `snapdb_desc` (earlier there were some shortcuts for `get()`.)

* Fixes: missing return code, typo, redundant imports etc.

* Remove obsolete debugging directives from `worker_desc` module

* Correct failing unit tests for storage slots trie inspection

why:
  Some pathological cases for the extended tests do not produce any
  hexary trie data. This is rightly detected by the trie inspection
  and the result checks needed to adjusted.
2022-10-20 17:59:54 +01:00
Jordan Hrycaj 85fdb61699
Prep for full sync after snap make 3 (#1270)
* For snap sync, publish `EthWireRef` in sync descriptor

why:
  currently used for noise control

* Detect and reuse existing storage slots

* Provide healing module for storage slots

* Update statistic ticker (adding range factor for unprocessed storage)

* Complete mere function for work item ranges

why:
  Merging interval into existing partial item was missing

* Show av storage queue lengths in ticker

detail;
  Previous attempt shows average completeness which did not tell much

* Correct the meaning of the storage counter (per pivot)

detail:
  Is the # accounts that have a storage saved
2022-10-19 11:04:06 +01:00
jangko 3fa1b012e6
initial wire protocol transformation
rework on the eth wire protocol handlers.
curently still missing 4 handlers implementation.
but the framework is ready for eexpansion.
2022-10-15 19:48:21 +07:00
Jordan Hrycaj de2c13e136
Update snap offline tests (#1199)
* Re-implemented `hexaryFollow()` in a more general fashion

details:
+ New name for re-implemented `hexaryFollow()` is `hexaryPath()`
+ Renamed `rTreeFollow()` as `hexaryPath()`

why:
  Returning similarly organised structures, the results of the
  `hexaryPath()` functions become comparable when running over
  the persistent and the in-memory databases.

* Added traversal functionality for persistent ChainDB

* Using `Account` values as re-packed Blob

* Repack samples as compressed data files

* Produce test data

details:
+ Can force pivot state root switch after minimal coverage.
+ For emulating certain network behaviour, downloading accounts stops for
  a particular pivot state root if 30% (some static number) coverage is
  reached. Following accounts are downloaded for a later pivot state root.
2022-08-24 14:44:18 +01:00
Jordan Hrycaj 7d7e26d45f
Experimental bulk loader tests (#1187)
why:
  Rocksdb bulk loading might provide a slight advantage when loading
  larger data sets into the system
2022-08-12 16:42:07 +01:00
Jordan Hrycaj 5f0e89a41e
Snap accounts bulk import preparer (#1183)
* Provided common scheduler API, applied to `full` sync

* Use hexary trie as storage for proofs_db records

also:
 + Store metadata with account for keeping track of account state
 + add iterator over accounts

* Common scheduler API applied to `snap` sync

* Prepare for accounts bulk import

details:
+ Added some ad-hoc checks for proving accounts data received from the
  snap/1 (will be replaced by proper database version when ready)
+ Added code that dumps some of the received snap/1 data into a file
  (turned of by default, see `worker_desc.nim`)
2022-08-04 09:04:30 +01:00
Jordan Hrycaj 5d98f68c09
Sync update to work with sepolia reorgs (#1168)
* Error return in `persistBlocks()` on initial `VmState` roblem

why:
  previously threw an exception

* Updated sync mode option

why:
 using enum rather than bool => space for more

* Added sync mode `full`, re-factued legacy sync

also:
  rebased

* Fix typo (crashes `pesistBlocks()` otherwise)

also:
  rebase to master

* Reduce log ticker noise by suppressing duplicate messages

* Clarify staged queue overflow handling

why:
  backtrack/re-org mode in `stageItem()` should be detected by both,
  the global indicator or the work item where it might have moved into.

also:
  rebased
2022-07-21 13:14:41 +01:00
Jordan Hrycaj 134fe26997
Store proved snap accounts (#1145)
* Relocated `IntervalSets` to nim-stew repo

* Accumulate accounts on temporary kv-DB

why:
  Explore the data as returned from snap/1. Will be converted to a
  `eth/db` next.

details:
  Verify and accumulate per/state-root accounts downloaded via snap.

also:
  Some unit tests

* Replace `Table` by `TrieDatabaseRef` for accounts accumulator

* update ticker statistics

details:
  mean/variance based counter update

* allow persistent db for proved accounts

* rebase, and globally activate unit test

* fix statistics
2022-07-01 12:42:17 +01:00
Jordan Hrycaj c123e1eb93
Updated account scheduler (#1124)
* Using `IntervalSet` type data for `LeafRange`

* Updated log ticker

* Update to `eth67`

details:
  Disabled by default, use `ENABLE_LEGACY_ETH66=0` to enable
  No support for `Get/NodeData` dialogue via eth, anymore

* Dissolved fetch/common.nim

details;
  the log/ticker part becomes ticker.nim
  the interval range management is merged into fetch.nim

* Updated account scheduler

why:
  The previous scheduler fetched each account once (for different state
  roots.) The updated scheduler re-calibrates after a change of the state
  root and potentially (until told otherwise) fetches all possible
  accounts.

* Fix `high(P)` fringe cases in `IntervalSet` handling

why:
  The `high(P)` value for a point type `P` cannot be represented with
  half open intervals `[a,b)` for a,b points of `P`. So this single value
  needs extra treatment which was slightly wrong.

* Updated docu/comments

also:
  rebased

* Update scheduler

details:
  Change the `pivot` management when creating new accounts lists. It is
  strictly increasing (and wrapping around) depending on last updated
  accounts list.
2022-06-16 09:58:50 +01:00
Jordan Hrycaj 76f6de8059
Normalise snap objects (#1114)
* Fix/recover download flag

why:
  The fetch indicator used to control the data download somehow got
  lost during re-org.

* Updated chronicles/logger topics

* Reorganised run state flags

why:
  The original code used a pair of boolean flags `(stopped,stopThisState)`
  which was translated to three states running, stoppedPending, and
  stopped. It is currently not clear whether collapsing some states was
  correct. So the original logic has been re-stored, albeit wrapped into
  directives like `isStopped()` etc.

also:
  Moving some function bodies in `worker.nim`

* Moved `reply_data.nim` and `validate_trienode.nim` to sub-directory `fetch_trie`

why:
  Only used in `fetch_trie.nim`.

* Move `fetch_*` file and directory objects to `fetch` subdirectory

why:
  Only used in `fetch.nim`

* Added start/stop and/or setup/release methods for all sub-modules

why:
  good housekeeping

also:
  updated getters/setters for ctrl states
  updated trace messages
2022-06-06 14:42:08 +01:00
Jordan Hrycaj 96bb09457e
Snap sync rename objects (#1099)
* Disentangle `collect` module from `reply_data`

why:
  Now the module visible from `collect` for fetching data is `peer/fetch`
  only.

* Merge `SnapPeerHunt` into `collect`

why:
  This part needs to be known by `collect`, only

* rename collect => worker

* Dissolve `sync_fetch_xdesc` module into `common`

why:
  Descriptor is only used in `common` and `fetch_trie`

* rename `snap/peer` directory => `snap/worker`

* rename `SnapSync` -> `Worker`, `SnapPeer` -> `WorkerBuddy`

* moved `snap/base_desc.nim` -> `snap/worker/worker_desc.nim`

* Unified opaque object ref naming in `worker_desc.nim`

details:
  indicated my inheriting module (exactly one, always)
2022-05-24 09:07:39 +01:00
Jordan Hrycaj ba940a5ce7
Snap sync simplify object inheritance (#1098)
* Reorg SnapPeerBase descriptor, notably start/stop flags

details:
  Instead of using three boolean flags startedFetch, stopped, and
  stopThisState a single enum type is used with values SyncRunningOk,
  SyncStopRequest, and SyncStopped.

* Restricting snap to eth66 and later

why:
  Id-tracked request/response wire protocol can handle overlapped
  responses when requests are sent in row.

* Align function names with source code file names

why:
  Easier to reconcile when following the implemented logic.

* Update trace logging (want file locations)

why:
  The macros previously used hid the relevant file location (when
  `chroniclesLineNumbers` turned on.) It rather printed the file
  location of the template that was wrapping `trace`.

* Use KeyedQueue table instead of sequence

why:
  Quick access, easy configuration as LRU or FIFO with max entries
  (currently LRU.)

* Dissolve `SnapPeerEx` object extension into `SnapPeer`

why;
  It is logically cleaner and more obvious not to inherit from
  `SnapPeerBase` but to specify opaque field object references of the
  merged `SnapPeer` object. These can then be locally inherited.

* Dissolve `SnapSyncEx` object extension into `SnapSync`

why;
  It is logically cleaner and more obvious not to inherit from
  `SnapSyncEx` but to specify opaque field object references of
  the `SnapPeer` object. These can then be locally inherited.

  Also, in the re-factored code here the interface descriptor
  `SnapSyncCtx` inherited `SnapSyncEx` which was sub-optimal (OO
  inheritance makes it easier to work with call back functions.)
2022-05-23 17:53:19 +01:00
Jordan Hrycaj 575c69e6ba
Objects inheritance reorg for snap sync (#1091)
* new: time_helper, types

* new: path_desc

* new: base_desc

* Re-organised objects inheritance

why:
  Previous code used macros to instantiate opaque object references. This
  has been re-implemented with OO inheritance based logic.

* Normalised trace macros

* Using distinct types for Hash256 aliases

why:
  Better control of the meaning of the hashes, all or the same format

caveat:
  The protocol handler DSL used by eth66.nim and snap1.nim uses the
  underlying type Hash256 and cannot handle the distinct alias in
  rlp and chronicles/log macros. So Hash256 is used directly (does
  not change readability as the type is clear by parameter names.)
2022-05-17 12:09:49 +01:00
Jordan Hrycaj 62d31d6f1d
Normalise sync handler prototypes (#1087)
* Use type name eth and snap (rather than snap1)

* Prettified snap/eth handler trace messages

* Regrouped sync sources

details:
  Snap storage related sources are moved to common directory.
  Option --new-sync renamed to --snap-sync

also:
  Normalised logging for secondary/non-protocol handlers.

* Merge protocol wrapper files => protocol.nim

details:
  Merge wrapper sync/protocol_ethxx.nim and sync/protocol_snapxx.nim
  into single file snap/protocol.nim

* Comments cosmetics

* Similar start logic for blockchain_sync.nim and sync/snap.nim

* Renamed p2p/blockchain_sync.nim -> sync/fast.nim
2022-05-13 17:30:10 +01:00