# Nimbus - Beacon sync progress ticker
#
# Copyright (c) 2021-2024 Status Research & Development GmbH
# Licensed under either of
# * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
# http://www.apache.org/licenses/LICENSE-2.0)
# * MIT license ([LICENSE-MIT](LICENSE-MIT) or
# http://opensource.org/licenses/MIT)
# at your option. This file may not be copied, modified, or distributed
# except according to those terms.

{.push raises: [].}

import
std/strutils,
pkg/[chronos, chronicles, eth/common, stint],
../../../../utils/prettify,
../helpers

logScope:
topics = "beacon ticker"

type
TickerStatsUpdater* = proc: TickerStats {.gcsafe, raises: [].}
## Full sync state update function

  TickerStats* = object
    ## Full sync state (see `TickerStatsUpdater`)
base*: BlockNumber
latest*: BlockNumber
coupler*: BlockNumber
dangling*: BlockNumber
head*: BlockNumber
headOk*: bool
target*: BlockNumber
targetOk*: bool

    hdrUnprocTop*: BlockNumber
nHdrUnprocessed*: uint64
nHdrUnprocFragm*: int
nHdrStaged*: int
hdrStagedTop*: BlockNumber

    blkUnprocBottom*: BlockNumber
nBlkUnprocessed*: uint64
nBlkUnprocFragm*: int
nBlkStaged*: int
blkStagedBottom*: BlockNumber

    reorg*: int
nBuddies*: int

  TickerRef* = ref object
## Ticker descriptor object
started: Moment
visited: Moment
prettyPrint: proc(t: TickerRef) {.gcsafe, raises: [].}
statsCb: TickerStatsUpdater
lastStats: TickerStats

const
tickerStartDelay = chronos.milliseconds(100)
tickerLogInterval = chronos.seconds(1)
tickerLogSuppressMax = chronos.seconds(100)
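
# Timing: the ticker fires every `tickerLogInterval`, but `tickerLogger()`
# below only emits a line when the stats have changed since the last line,
# or when more than `tickerLogSuppressMax` has elapsed since the last visit.
# The first line is delayed by `tickerStartDelay` after `init()`.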

# ------------------------------------------------------------------------------
# Private functions: printing ticker messages
# ------------------------------------------------------------------------------

proc tickerLogger(t: TickerRef) {.gcsafe.} =
let
data = t.statsCb()
now = Moment.now()

  if data != t.lastStats or
tickerLogSuppressMax < (now - t.visited):
let
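      # One-letter legend: B=base, L=latest, C=coupler, D=dangling, H=head,
      # T=target. When a value coincides with its right-hand neighbour, the
      # neighbour's letter is printed instead of the block number.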
B = if data.base == data.latest: "L" else: data.base.bnStr
L = if data.latest == data.coupler: "C" else: data.latest.bnStr
C = if data.coupler == data.dangling: "D" else: data.coupler.bnStr
D = if data.dangling == data.head: "H"
else: data.dangling.bnStr
H = if data.headOk:
if data.head == data.target: "T" else: data.head.bnStr
else:
if data.head == data.target: "?T" else: "?" & $data.head
T = if data.targetOk: data.target.bnStr else: "?" & $data.target

      hS = if data.nHdrStaged == 0: "n/a"
else: data.hdrStagedTop.bnStr & "(" & $data.nHdrStaged & ")"
hU = if data.nHdrUnprocFragm == 0 and data.nHdrUnprocessed == 0: "n/a"
else: data.hdrUnprocTop.bnStr & "(" &
data.nHdrUnprocessed.toSI & "," & $data.nHdrUnprocFragm & ")"

      bS = if data.nBlkStaged == 0: "n/a"
else: data.blkStagedBottom.bnStr & "(" & $data.nBlkStaged & ")"
bU = if data.nBlkUnprocFragm == 0 and data.nBlkUnprocessed == 0: "n/a"
else: data.blkUnprocBottom.bnStr & "(" &
data.nBlkUnprocessed.toSI & "," & $data.nBlkUnprocFragm & ")"

      rrg = data.reorg
peers = data.nBuddies

      # With `int64` seconds, the range covers more than 29*10^10 years
up = (now - t.started).seconds.uint64.toSI
mem = getTotalMem().uint.toSI

    t.lastStats = data
t.visited = now

    debug "Sync state", up, peers, B, L, C, D, H, T, hS, hU, bS, bU, rrg, mem

# ------------------------------------------------------------------------------
# Private functions: ticking log messages
# ------------------------------------------------------------------------------

proc setLogTicker(t: TickerRef; at: Moment) {.gcsafe.}
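
# The ticker is a self-rescheduling timer loop: `runLogTicker()` prints via
# `prettyPrint()` and re-arms the timer through `setLogTicker()`. The chain
# ends once the stats callback has been reset to `nil` by `destroy()`.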
proc runLogTicker(t: TickerRef) {.gcsafe.} =
if not t.statsCb.isNil:
t.prettyPrint(t)
t.setLogTicker(Moment.fromNow(tickerLogInterval))

proc setLogTicker(t: TickerRef; at: Moment) =
if t.statsCb.isNil:
debug "Ticker stopped"
else:
    # Wrap `runLogTicker()` in a closure to avoid garbage-collection
    # related memory corruption issues that might occur otherwise.
discard setTimer(at, proc(ign: pointer) = runLogTicker(t))

# ------------------------------------------------------------------------------
# Public constructor and start/stop functions
# ------------------------------------------------------------------------------

proc init*(T: type TickerRef; cb: TickerStatsUpdater): T =
## Constructor
result = TickerRef(
prettyPrint: tickerLogger,
statsCb: cb,
started: Moment.now())
result.setLogTicker Moment.fromNow(tickerStartDelay)

proc destroy*(t: TickerRef) =
## Stop ticker unconditionally
if not t.isNil:
t.statsCb = TickerStatsUpdater(nil)
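
# A minimal usage sketch (the `collectStats()` callback below is
# hypothetical; a real client would read the figures off its sync state):
#
#   proc collectStats(): TickerStats {.gcsafe, raises: [].} =
#     TickerStats(latest: 123.BlockNumber, nBuddies: 5)
#
#   let ticker = TickerRef.init(collectStats)  # first line after `tickerStartDelay`
#   # ... while syncing, a line is logged whenever the stats change ...
#   ticker.destroy()                           # timer chain stops, "Ticker stopped"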

# ------------------------------------------------------------------------------
# End
# ------------------------------------------------------------------------------