* Harden handling of unviable forks
In our current handling of unviable forks, we allow peers to send us
blocks that come from a different fork - this is not necessarily an
error as it can happen naturally, but it does open up the client to a
case where the same unviable fork keeps getting requested - rather than
allowing this to happen, we'll now give these peers a small negative
score - if it keeps happening, we'll disconnect them.
* keep track of unviable forks in quarantine, to avoid filling it with
known junk
* collect peer scores in single module
* descore peers when they send unviable blocks during sync
* don't give score for duplicate blocks
* increase quarantine size to a level that allows finality to happen
under optimal conditions - this helps avoid downloading the same blocks
over and over in case of an unviable fork
* increase initial score for new peers to make room for one more failure
before disconnection
* log and score invalid/unviable blocks in requestmanager too
* avoid ChainDAG dependency in quarantine
* reject gossip blocks with unviable parent
* continue processing unviable sync blocks in order to build unviable
dag
* docs
* Update beacon_chain/consensus_object_pools/block_pools_types.nim
* add unviable queue test
* SyncManager cleanups for backfill support
Cleanups, fixes and simplifications, in anticipation of backfill support
for the `SyncManager`:
* reformat sync progress indicator to show time left and % done more
prominently:
* old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"`
* new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"`
* reset average speed when going out of sync
* pass all block errors to sync manager, including duplicate/unviable
* penalize peers for reporting a head block that is outside of our
expected wall clock time (they're likely on a different network or
trying to disrupt sync)
* remove `SyncFailureKind` (unused)
* remove `inRange` (unused)
* add `Q` for sync queue requests that are in the `SyncQueue` but not
yet in the `BlockProcessor` queue
* update last slot in `SyncQueue` after getting peer status
* fix race condition between `wakeupWaiters` and `resetWait`, where
workers would not be correctly reset if block verification returned a
completed future without event loop
* log syncmanager direction
* Fix ordering issue.
Some of the requests size of which are not equal to `chunkSize` could be processed in wrong order which could lead to sync process freezes.
Co-authored-by: cheatfate <eugene.kabanov@status.im>
* move quarantine outside of chaindag
The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:
* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually
Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag
This sets the stage for keeping attestations in the quarantine as well.
Also:
* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
* reorganize ssz dependencies
This PR continues the work in
https://github.com/status-im/nimbus-eth2/pull/2646,
https://github.com/status-im/nimbus-eth2/pull/2779 as well as past
issues with serialization and type, to disentangle SSZ from eth2 and at
the same time simplify imports and exports with a structured approach.
The principal idea here is that when a library wants to introduce SSZ
support, they do so via 3 files:
* `ssz_codecs` which imports and reexports `codecs` - this covers the
basic byte conversions and ensures no overloads get lost
* `xxx_merkleization` imports and exports `merkleization` to specialize
and get access to `hash_tree_root` and friends
* `xxx_ssz_serialization` imports and exports `ssz_serialization` to
specialize ssz for a specific library
Those that need to interact with SSZ always import the `xxx_` versions
of the modules and never `ssz` itself so as to keep imports simple and
safe.
This is similar to how the REST / JSON-RPC serializers are structured in
that someone wanting to serialize spec types to REST-JSON will import
`eth2_rest_serialization` and nothing else.
* split up ssz into a core library that is independendent of eth2 types
* rename `bytes_reader` to `codec` to highlight that it contains coding
and decoding of bytes and native ssz types
* remove tricky List init overload that causes compile issues
* get rid of top-level ssz import
* reenable merkleization tests
* move some "standard" json serializers to spec
* remove `ValidatorIndex` serialization for now
* remove test_ssz_merkleization
* add tests for over/underlong byte sequences
* fix broken seq[byte] test - seq[byte] is not an SSZ type
There are a few things this PR doesn't solve:
* like #2646 this PR is weak on how to handle root and other
dontSerialize fields that "sometimes" should be computed - the same
problem appears in REST / JSON-RPC etc
* Fix a build problem on macOS
* Another way to fix the macOS builds
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
The spec imports are a mess to work with, so this branch cleans them up
a bit to ensure that we avoid generic sandwitches and that importing
stuff generally becomes easier.
* reexport crypto/digest/presets because these are part of the public
symbol set of the rest of the spec types
* don't export `merge` types from `base` - this causes circular deps
* fix circular deps in `ssz/spec_types` - this is the first step in
disentangling ssz from spec
* be explicit about phase0 vs altair - longer term, `altair` will become
the "natural" type set, then merge and so on, so no point in giving
`phase0` special preferential treatment
Simpler module name for stuff that covers forks
* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
* bump libp2p
* altair sync v2
Use V2 sync requests after the altair fork has happened, according to
the wall clock
* Fix the behavior of the v1 req/resp calls after Altair
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
* gossip_to_consensus -> block_processor (it's processing only blocks,
but not only from gossip)
* measure queue and validation time for blocks
* measure assignment and state loading times for updateStateData
* avoid some unnecessary block copies in block sync
* warn that database is corrupt if we hit tail without a state