Jordan Hrycaj e14fd4b96c
Prep for full sync after snap make 6 (#1291)
* Update log ticker, using time interval rather than ticker count

why:
  Counting and logging ticker occurrences is inherently imprecise. So
  time intervals are used.
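
A minimal sketch of interval-based log throttling, as described above. The type `IntervalLogger` and the proc `shouldLog` are hypothetical names chosen for illustration and are not taken from the Nimbus sources; the 10 second interval is likewise an assumption.

```nim
import std/[monotimes, times]

type
  IntervalLogger = object
    ## Hypothetical helper, not part of the Nimbus sources.
    interval: Duration   # minimum time between two log lines
    lastLog: MonoTime    # time of the most recent log line
    started: bool        # false until the first log line was written

proc shouldLog(lg: var IntervalLogger; now: MonoTime): bool =
  ## Returns `true` if no line was logged yet, or if at least `interval`
  ## has elapsed since the last logged line.
  if not lg.started or lg.interval <= now - lg.lastLog:
    lg.started = true
    lg.lastLog = now
    return true
  false

when isMainModule:
  # Assumed interval of 10 seconds, for illustration only.
  var lg = IntervalLogger(interval: initDuration(seconds = 10))
  if lg.shouldLog(getMonoTime()):
    echo "ticker: first log line"
```

Passing the current time in as an argument keeps the decision independent of how often the ticker callback fires, which is the point of switching away from occurrence counting.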

* Use separate storage tables for snap sync data

* Left boundary proof update

why:
  Was not properly implemented, yet.

* Capture pivot in peer worker (aka buddy) tasks

why:
  The pivot environment is linked to the `buddy` descriptor. During a
  task switch, the pivot may change. So it is passed on as the function
  argument `env` rather than retrieved from the buddy at the start of
  a sub-function.

* Split queues `fetchStorage` into `fetchStorageFull` and `fetchStoragePart`

* Remove obsolete account range returned from `GetAccountRange` message

why:
  The handler returned the wrong value for the right boundary of the
  range. This range was provided for convenience only.

* Prioritise storage slots if the queue becomes large

why:
  Currently, accounts processing is prioritised up until all accounts
  are downloaded. The new prioritisation has two thresholds for
  + start processing storage slots with a new worker
  + stop account processing and switch to storage processing
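
The two thresholds above could be modelled roughly as follows. This is a sketch only: the enum `FetchMode`, the proc `fetchMode`, both constant names and their values are hypothetical, and whether the real comparison is inclusive is also an assumption.

```nim
type
  FetchMode = enum
    AccountsOnly        # all workers keep fetching account ranges
    AccountsAndStorage  # additionally start a storage slots worker
    StorageOnly         # stop account processing, drain storage queue

const
  # Hypothetical values, the real thresholds are not given in the text.
  storageStartThreshold = 100
  storageOnlyThreshold = 1000

proc fetchMode(queueLen: int): FetchMode =
  ## Map the storage slots queue length to a processing mode.
  if storageOnlyThreshold <= queueLen:
    StorageOnly
  elif storageStartThreshold <= queueLen:
    AccountsAndStorage
  else:
    AccountsOnly

when isMainModule:
  for n in [0, 100, 1000]:
    echo n, " queued => ", fetchMode(n)
```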

also:
  Provide api for `SnapTodoRanges` pair of range sets in `worker_desc.nim`

* Generalise left boundary proof for accounts or storage slots.

why:
  A detailed explanation of how this works is documented with
  `snapdb_accounts.importAccounts()`.

  Instead of enforcing a left boundary proof (which is still the default),
  the importer functions return a list of `holes` (aka node paths) found in
  the argument ranges of leaf nodes. This in turn is used by the
  bookkeeping software for data download.

* Forgot to pass on variable in function wrapper

also:
  + Start healing only once 99% of accounts are covered (previously 95%)
  + Logging updated/prettified
2022-11-08 18:56:04 +00:00

# Nimbus
# Copyright (c) 2021 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.

import
  chronos,
  ../../../sync_desc,
  ../../constants

type
  ComErrorStatsRef* = ref object
    ## particular error counters so connections will not be cut immediately
    ## after a particular error.
    nTimeouts*: uint
    nNoData*: uint
    nNetwork*: uint

  ComError* = enum
    ComNothingSerious
    ComAccountsMaxTooLarge
    ComAccountsMinTooSmall
    ComEmptyAccountsArguments
    ComEmptyPartialRange
    ComEmptyRequestArguments
    ComNetworkProblem
    ComNoAccountsForStateRoot
    ComNoByteCodesAvailable
    #ComNoHeaderAvailable -- unused, see get_block_header.nim
    ComNoStorageForAccounts
    ComNoTrieNodesAvailable
    ComResponseTimeout
    ComTooManyByteCodes
    ComTooManyHeaders
    ComTooManyStorageSlots
    ComTooManyTrieNodes

proc resetComError*(stats: ComErrorStatsRef) =
  ## Reset error counts after successful network operation
  stats[].reset

proc stopAfterSeriousComError*(
    ctrl: BuddyCtrlRef;
    error: ComError;
    stats: ComErrorStatsRef;
      ): Future[bool]
      {.async.} =
  ## Error handling after data protocol failed. Returns `true` if the current
  ## worker should be terminated as *zombie*.
  case error:
  of ComResponseTimeout:
    stats.nTimeouts.inc
    if comErrorsTimeoutMax < stats.nTimeouts:
      # Mark this peer dead, i.e. avoid fetching from this peer for a while
      ctrl.zombie = true
      return true

    when 0 < comErrorsTimeoutSleepMSecs:
      # Otherwise try again some time later.
      await sleepAsync(comErrorsTimeoutSleepMSecs.milliseconds)

  of ComNetworkProblem:
    stats.nNetwork.inc
    if comErrorsNetworkMax < stats.nNetwork:
      ctrl.zombie = true
      return true

    when 0 < comErrorsNetworkSleepMSecs:
      # Otherwise try again some time later.
      await sleepAsync(comErrorsNetworkSleepMSecs.milliseconds)

  of ComNoAccountsForStateRoot,
     ComNoByteCodesAvailable,
     ComNoStorageForAccounts,
     #ComNoHeaderAvailable,
     ComNoTrieNodesAvailable:
    stats.nNoData.inc
    if comErrorsNoDataMax < stats.nNoData:
      ctrl.zombie = true
      return true

    when 0 < comErrorsNoDataSleepMSecs:
      # Otherwise try again some time later.
      await sleepAsync(comErrorsNoDataSleepMSecs.milliseconds)

  of ComAccountsMinTooSmall,
     ComAccountsMaxTooLarge,
     ComTooManyByteCodes,
     ComTooManyHeaders,
     ComTooManyStorageSlots,
     ComTooManyTrieNodes:
    # Mark this peer dead, i.e. avoid fetching from this peer for a while
    ctrl.zombie = true
    return true

  of ComEmptyAccountsArguments,
     ComEmptyRequestArguments,
     ComEmptyPartialRange,
     ComNothingSerious:
    discard

# End
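
The counter logic above tolerates a limited number of soft errors before a peer is dropped, and a successful operation clears the counters. A simplified, synchronous sketch of that pattern follows, without `chronos` or the sleep delays. The names `ErrStats`, `timeoutMax`, `stopAfterTimeout` and `resetStats` are hypothetical stand-ins, and the limit of 2 is an assumed value, not the real `comErrorsTimeoutMax`.

```nim
type
  ErrStats = ref object
    ## Hypothetical stand-in for `ComErrorStatsRef`, timeouts only.
    nTimeouts: uint

const
  timeoutMax = 2'u   # assumed limit, stands in for `comErrorsTimeoutMax`

proc stopAfterTimeout(stats: ErrStats): bool =
  ## Count a timeout; returns `true` once the limit is exceeded.
  stats.nTimeouts.inc
  timeoutMax < stats.nTimeouts

proc resetStats(stats: ErrStats) =
  ## Clear all counters, e.g. after a successful network operation.
  stats[].reset

when isMainModule:
  let stats = ErrStats()
  discard stats.stopAfterTimeout()   # first timeout is tolerated
  stats.resetStats()                 # success in between clears the count
  echo "timeouts counted: ", stats.nTimeouts
```

With the assumed limit, the third timeout in a row is the first one that asks the caller to stop; any reset in between starts the count over.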