33 Commits

Author SHA1 Message Date
Jacek Sieka
805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
Jacek Sieka
118840d241
SyncManager cleanups for backfill support (#3189)
* SyncManager cleanups for backfill support

Cleanups, fixes and simplifications, in anticipation of backfill support
for the `SyncManager`:

* reformat sync progress indicator to show time left and % done more
prominently:
  * old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"`
  * new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"`
* reset average speed when going out of sync
* pass all block errors to sync manager, including duplicate/unviable
* penalize peers for reporting a head block that is outside of our
expected wall clock time (they're likely on a different network or
trying to disrupt sync)
* remove `SyncFailureKind` (unused)
* remove `inRange` (unused)
* add `Q` for sync queue requests that are in the `SyncQueue` but not
yet in the `BlockProcessor` queue
* update last slot in `SyncQueue` after getting peer status
* fix race condition between `wakeupWaiters` and `resetWait`, where
workers would not be correctly reset if block verification returned a
completed future without event loop
* log syncmanager direction

* Fix ordering issue.
Some of the requests size of which are not equal to `chunkSize` could be processed in wrong order which could lead to sync process freezes.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2021-12-16 15:57:16 +01:00
Eugene Kabanov
b05734f610
Backward sync support for SyncManager. (#3131)
* Unbundle SyncQueue from sync_manager.nim.
Unbundle Peer scores constants to peer_scores.nim.
Add Forward/Backward enum.

* Further improvements and tests.

* Adopt getRewindPoint() and fix MissingParent handler.

* Remove unused procedures.
Refactor `result` usage.
Fix resetWait().

* Add all the tests and fix the issue with rewind point.

* Fix get() issue.

* Fix flaky tests.

* test fixes

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-12-08 22:15:29 +01:00
Jacek Sieka
1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
Jacek Sieka
c40cc6cec1 clean up fork enum and field names
* single naming strategy
* simplify some fork code
* simplify forked block production
2021-10-19 11:06:38 +03:00
tersec
a060985abc
unexport various parts of tests/ and remove unused code (#2794) 2021-08-18 13:58:43 +00:00
Jacek Sieka
9697b73e71
forkedbeaconstate_helpers -> forks (#2772)
Simpler module name for stuff that covers forks

* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
2021-08-10 22:46:35 +02:00
Jacek Sieka
2d6a661ac6
Syncv2 (#2723)
* bump libp2p

* altair sync v2

Use V2 sync requests after the altair fork has happened, according to
the wall clock

* Fix the behavior of the v1 req/resp calls after Altair

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2021-07-15 21:01:07 +02:00
Jacek Sieka
7f52ffb8d9
clean up block processing (#2610)
* gossip_to_consensus -> block_processor (it's processing only blocks,
but not only from gossip)
* measure queue and validation time for blocks
* measure assignment and state loading times for updateStateData
* avoid some unnecessary block copies in block sync
* warn that database is corrupt if we hit tail without a state
2021-05-28 19:34:00 +03:00
Eugene Kabanov
5b5ea2e813
Fix integer overflow issue in sync_manager. (#2564)
* Make Refactor rewind point assignment more concrete.

* Fix overflow issue in getRewindPoint().
Add tests.
2021-05-18 12:25:14 +02:00
Jacek Sieka
ce49da6c0a
Introduce unittest2 and junit reports (#2522)
* Introduce unittest2 and junit reports

* fix XML path

* don't combine multiple CI runs

* fixup

* public combined report also

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2021-04-28 18:41:02 +02:00
Dustin Brody
97504fdb9d ncli_db pruneDatabase checkpointing; remove onSlotEnd lookaheadTime 2021-03-12 23:15:46 +02:00
Mamy Ratsimbazafy
c47d636cb3
Split Eth2Processor in prep for batching (#2396)
* Split Eth2Processor in gossip and consensus part and materialize the shared block queue

* Update initialization in test_sync_manager
2021-03-11 11:10:57 +01:00
Mamy Ratsimbazafy
d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
Mamy Ratsimbazafy
3276dfc683
Consolidate modules by areas [part 1] (#2365)
* Move sync in subfolder

* move validator related thingies in validators

* fix binary builds

* update bounds comment [skip ci]
2021-03-02 11:27:45 +01:00
Jacek Sieka
22998fdfd4 avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
Eugene Kabanov
55fcece0b2
SyncManager fix to process blocks one by one. (#1464)
* Allow sync manager process blocks one by one.

* Log storeBlock() and updateHead() duration.

* Calculate duration only for blocks added without any error.

* Fix float compilation error.

* Fix duration.

* Fix SyncQueue tests.
2020-08-10 09:15:50 +02:00
Eugene Kabanov
1fc9413c48
Fix #1153. (#1160)
Add ability for SyncQueue to recover from unexpected MissingParent.
2020-06-11 16:20:53 +02:00
cheatfate
12e28a1fa9 Add proper concurrent connections.
Add SeenTable to avoid continuous attempts to dead peers.
Refactor onSecond.
Block backward sync while forward sync is working.
SyncManager now checks responses according corresponding requests + tests.
SyncManager now watching for not progressing local_head_slot and resets SyncQueue.
2020-06-03 12:53:57 +03:00
Eugene Kabanov
21131e629b
Sync freeze fixes. (#1072)
* Add ability to reset state of sync manager.
Fix bug when sync got stuck on `zero-point` reset.
Fix bug when sync got stuck when some of the workers waiting for failing one.

* Remove debugging comments and imports.

* Remove not used pendingLock.
2020-05-28 07:02:28 +02:00
Eugene Kabanov
ea95021073
Fix sync issues. (#1035)
* Fix sync issues.

* Add documentation about zero-point.
Add more comments about syncing loops.
Change to 4 blocks per request.
2020-05-19 14:08:50 +02:00
Jacek Sieka
ed74770451
spec: regulate exceptions (#913)
* spec: regulate exceptions

* a few more simple raises
2020-04-22 07:53:02 +02:00
Eugene Kabanov
3d42da90a8
Syncing. (#909) 2020-04-20 16:59:18 +02:00
Ștefan Talpalaru
b7a32a17ba
bump submodules
and remove failing syncManagerGroupRecoveryTest
2020-04-14 18:21:56 +02:00
Zahary Karadjov
22876da593 Fix gcsafety issues in the test suite 2020-03-24 22:14:40 +02:00
Zed
6ba7b4b117 Generate markdown test reports 2020-03-13 14:38:59 +00:00
Ștefan Talpalaru
1caafba79c
Merge pull request #803 from status-im/misc/bump-libp2p
bumping libp2p to latest master
2020-03-12 01:57:11 +01:00
cheatfate
399e91fe5b
Fix failing test. 2020-03-12 02:02:33 +02:00
Dustin Brody
0d3de00714 remove unused imports 2020-03-11 10:50:55 +00:00
cheatfate
98dc701473 Add PeerPool.addPeer async version and tests. 2020-01-29 15:28:41 +00:00
cheatfate
8b229d68ad Add testutil and timedTest. 2020-01-29 15:28:41 +00:00
cheatfate
db20fc1172 Fix SyncQueue push(data) bug.
Rename lastSlot to HeadSlot.
Add failure test.
2020-01-29 15:28:41 +00:00
cheatfate
73dc72583f Initial commit. 2020-01-29 15:28:41 +00:00