nimbus-eth1/nimbus/sync/snap/constants.nim

207 lines
7.3 KiB
Nim
Raw Normal View History

# Nimbus
# Copyright (c) 2021 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.
{.push raises: [].}
import
std/sets,
eth/[common, trie/nibbles]
const
EmptyBlob* = seq[byte].default
## Useful shortcut
EmptyBlobSet* = HashSet[Blob].default
## Useful shortcut
EmptyBlobSeq* = seq[Blob].default
## Useful shortcut
EmptyNibbleSeq* = EmptyBlob.initNibbleRange
## Useful shortcut
# ---------
pivotTableLruEntriesMax* = 50
## Max depth of pivot table. On overflow, the oldest one will be removed.
pivotBlockDistanceMin* = 128
## The minimal depth of two block headers needed to activate a new state
## root pivot.
##
## Effects on assembling the state via `snap/1` protocol:
##
## * A small value of this constant increases the propensity to update the
## pivot header more often. This is so because each new peer negoiates a
## pivot block number at least the current one.
##
## * A large value keeps the current pivot more stable but some experiments
## suggest that the `snap/1` protocol is answered only for later block
## numbers (aka pivot blocks.) So a large value tends to keep the pivot
## farther away from the chain head.
##
## Note that 128 is the magic distance for snapshots used by *Geth*.
# --------------
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
fetchRequestBytesLimit* = 2 * 1024 * 1024
## Soft bytes limit to request in `snap/1` protocol calls.
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
fetchRequestTrieNodesMax* = 1024
## Informal maximal number of trie nodes to fetch at once in `snap/1`
## protocol calls. This is not an official limit but found with several
## implementations (e.g. Geth.)
##
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
## Resticting the fetch list length early allows to better parallelise
## healing.
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
fetchRequestStorageSlotsMax* = 2 * 1024
## Maximal number of storage tries to fetch with a single request message.
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
# --------------
fetchRequestContractsMax* = 1024
## Maximal number of contract codes fetch with a single request message.
# --------------
saveAccountsProcessedChunksMax* = 1000
## Recovery data are stored if the processed ranges list contains no more
## than this many range *chunks*.
##
## If the range set is too much fragmented, no data will be saved and
## restart has to perform from scratch or an earlier checkpoint.
saveStorageSlotsMax* = 20_000
## Recovery data are stored if the oustanding storage slots to process do
## not amount to more than this many entries.
##
## If there are too many dangling nodes, no data will be saved and restart
## has to perform from scratch or an earlier checkpoint.
saveContactsMax* = 10_000
## Similar to `saveStorageSlotsMax`
# --------------
storageSlotsFetchFailedFullMax* = fetchRequestStorageSlotsMax + 100
## Maximal number of failures when fetching full range storage slots.
## These failed slot ranges are only called for once in the same cycle.
storageSlotsFetchFailedPartialMax* = 300
## Ditto for partial range storage slots.
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
storageSlotsTrieInheritPerusalMax* = 30_000
## Maximal number of nodes to visit in order to find out whether this
## storage slots trie is complete. This allows to *inherit* the full trie
## for an existing root node if the trie is small enough.
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
storageSlotsQuPrioThresh* = 5_000
Prep for full sync after snap make 6 (#1291) * Update log ticker, using time interval rather than ticker count why: Counting and logging ticker occurrences is inherently imprecise. So time intervals are used. * Use separate storage tables for snap sync data * Left boundary proof update why: Was not properly implemented, yet. * Capture pivot in peer worker (aka buddy) tasks why: The pivot environment is linked to the `buddy` descriptor. While there is a task switch, the pivot may change. So it is passed on as function argument `env` rather than retrieved from the buddy at the start of a sub-function. * Split queues `fetchStorage` into `fetchStorageFull` and `fetchStoragePart` * Remove obsolete account range returned from `GetAccountRange` message why: Handler returned the wrong right value of the range. This range was for convenience, only. * Prioritise storage slots if the queue becomes large why: Currently, accounts processing is prioritised up until all accounts are downloaded. The new prioritisation has two thresholds for + start processing storage slots with a new worker + stop account processing and switch to storage processing also: Provide api for `SnapTodoRanges` pair of range sets in `worker_desc.nim` * Generalise left boundary proof for accounts or storage slots. why: Detailed explanation how this works is documented with `snapdb_accounts.importAccounts()`. Instead of enforcing a left boundary proof (which is still the default), the importer functions return a list of `holes` (aka node paths) found in the argument ranges of leaf nodes. This in turn is used by the book keeping software for data download. * Forgot to pass on variable in function wrapper also: + Start healing not before 99% accounts covered (previously 95%) + Logging updated/prettified
2022-11-08 18:56:04 +00:00
## For a new worker, prioritise processing the storage slots queue over
## processing accounts if the queue has more than this many items.
##
Prep for full sync after snap make 6 (#1291) * Update log ticker, using time interval rather than ticker count why: Counting and logging ticker occurrences is inherently imprecise. So time intervals are used. * Use separate storage tables for snap sync data * Left boundary proof update why: Was not properly implemented, yet. * Capture pivot in peer worker (aka buddy) tasks why: The pivot environment is linked to the `buddy` descriptor. While there is a task switch, the pivot may change. So it is passed on as function argument `env` rather than retrieved from the buddy at the start of a sub-function. * Split queues `fetchStorage` into `fetchStorageFull` and `fetchStoragePart` * Remove obsolete account range returned from `GetAccountRange` message why: Handler returned the wrong right value of the range. This range was for convenience, only. * Prioritise storage slots if the queue becomes large why: Currently, accounts processing is prioritised up until all accounts are downloaded. The new prioritisation has two thresholds for + start processing storage slots with a new worker + stop account processing and switch to storage processing also: Provide api for `SnapTodoRanges` pair of range sets in `worker_desc.nim` * Generalise left boundary proof for accounts or storage slots. why: Detailed explanation how this works is documented with `snapdb_accounts.importAccounts()`. Instead of enforcing a left boundary proof (which is still the default), the importer functions return a list of `holes` (aka node paths) found in the argument ranges of leaf nodes. This in turn is used by the book keeping software for data download. * Forgot to pass on variable in function wrapper also: + Start healing not before 99% accounts covered (previously 95%) + Logging updated/prettified
2022-11-08 18:56:04 +00:00
## For a running worker processing accounts, stop processing accounts
## and switch to processing the storage slots queue if the queue has
## more than this many items.
# --------------
contractsQuPrioThresh* = 2_000
## Similar to `storageSlotsQuPrioThresh`
# --------------
healAccountsCoverageTrigger* = 1.01
## Apply accounts healing if the global snap download coverage factor
## exceeds this setting. The global coverage factor is derived by merging
## all account ranges retrieved for all pivot state roots (see
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
## `coveredAccounts` in the object `CtxData`.) Note that a coverage factor
## greater than 100% is not exact but rather a lower bound estimate.
healAccountsInspectionPlanBLevel* = 4
## Search this level deep for missing nodes if `hexaryEnvelopeDecompose()`
## only produces existing nodes.
healAccountsInspectionPlanBRetryMax* = 2
## Retry inspection with depth level argument starting at
## `healAccountsInspectionPlanBLevel-1` and counting down at most this
## many times until there is at least one dangling node found and the
## depth level argument remains positive. The cumulative depth of the
## iterated seach is
## ::
## b 1
## Σ ν = --- (b - a + 1) (a + b)
## a 2
## for
## ::
## b = healAccountsInspectionPlanBLevel
## a = b - healAccountsInspectionPlanBRetryMax
##
healAccountsInspectionPlanBRetryNapMSecs* = 2
## Sleep beween inspection retrys to allow thread switch. If this constant
## is set `0`, `1`ns wait is used.
# --------------
healStorageSlotsInspectionPlanBLevel* = 5
Snap sync refactor healing (#1397) * Simplify accounts healing threshold management why: Was over-engineered. details: Previously, healing was based on recursive hexary trie perusal. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Control number of dangling result nodes in `hexaryInspectTrie()` also: + Returns number of visited nodes available for logging so the maximum number of nodes can be tuned accordingly. + Some code and docu update * Update names of constants why: Declutter, more systematic naming * Re-implemented `worker_desc.merge()` for storage slots why: Provided as proper queue management in `storage_queue_helper`. details: + Several append modes (replaces `merge()`) + Added third queue to record entries currently fetched by a worker. So another parallel running worker can safe the complete set of storage slots in as checkpoint. This was previously lost. * Refactor healing why: Simplify and remove deep hexary trie perusal for finding completeness. Due to "cheap" envelope decomposition of a range complement for the hexary trie, the cost of running extra laps have become time-affordable again and a simple trigger mechanism for healing will do. * Docu update * Run a storage job only once in download loop why: Download failure or rejection (i.e. missing data) lead to repeated fetch requests until peer disconnects, otherwise.
2022-12-24 09:54:18 +00:00
## Similar to `healAccountsInspectionPlanBLevel`
healStorageSlotsInspectionPlanBRetryMax* = 99 # 5 + 4 + .. + 1 => 15
## Similar to `healAccountsInspectionPlanBRetryMax`
healStorageSlotsInspectionPlanBRetryNapMSecs* = 2
## Similar to `healAccountsInspectionPlanBRetryNapMSecs`
healStorageSlotsBatchMax* = 32
## Maximal number of storage tries to to heal in a single batch run. Only
## this many items will be removed from the batch queue. These items will
## then be processed one by one.
healStorageSlotsFailedMax* = 300
## Ditto for partial range storage slots.
# --------------
Prep for full sync after snap make 6 (#1291) * Update log ticker, using time interval rather than ticker count why: Counting and logging ticker occurrences is inherently imprecise. So time intervals are used. * Use separate storage tables for snap sync data * Left boundary proof update why: Was not properly implemented, yet. * Capture pivot in peer worker (aka buddy) tasks why: The pivot environment is linked to the `buddy` descriptor. While there is a task switch, the pivot may change. So it is passed on as function argument `env` rather than retrieved from the buddy at the start of a sub-function. * Split queues `fetchStorage` into `fetchStorageFull` and `fetchStoragePart` * Remove obsolete account range returned from `GetAccountRange` message why: Handler returned the wrong right value of the range. This range was for convenience, only. * Prioritise storage slots if the queue becomes large why: Currently, accounts processing is prioritised up until all accounts are downloaded. The new prioritisation has two thresholds for + start processing storage slots with a new worker + stop account processing and switch to storage processing also: Provide api for `SnapTodoRanges` pair of range sets in `worker_desc.nim` * Generalise left boundary proof for accounts or storage slots. why: Detailed explanation how this works is documented with `snapdb_accounts.importAccounts()`. Instead of enforcing a left boundary proof (which is still the default), the importer functions return a list of `holes` (aka node paths) found in the argument ranges of leaf nodes. This in turn is used by the book keeping software for data download. * Forgot to pass on variable in function wrapper also: + Start healing not before 99% accounts covered (previously 95%) + Logging updated/prettified
2022-11-08 18:56:04 +00:00
comErrorsTimeoutMax* = 3
## Maximal number of non-resonses accepted in a row. If there are more than
## `comErrorsTimeoutMax` consecutive errors, the worker will be degraded
## as zombie.
comErrorsTimeoutSleepMSecs* = 5000
## Wait/suspend for this many seconds after a timeout error if there are
## not more than `comErrorsTimeoutMax` errors in a row (maybe some other
## network or no-data errors mixed in.) Set 0 to disable.
comErrorsNetworkMax* = 5
## Similar to `comErrorsTimeoutMax` but for network errors.
comErrorsNetworkSleepMSecs* = 5000
## Similar to `comErrorsTimeoutSleepSecs` but for network errors.
## Set 0 to disable.
comErrorsNoDataMax* = 3
## Similar to `comErrorsTimeoutMax` but for missing data errors.
comErrorsNoDataSleepMSecs* = 0
## Similar to `comErrorsTimeoutSleepSecs` but for missing data errors.
## Set 0 to disable.
static:
doAssert storageSlotsQuPrioThresh < saveStorageSlotsMax
doAssert contractsQuPrioThresh < saveContactsMax
doAssert 0 <= storageSlotsFetchFailedFullMax
doAssert 0 <= storageSlotsFetchFailedPartialMax
# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------