Commit Graph

45 Commits

Author SHA1 Message Date
Eugene Kabanov 687cbaf94c
Adjust number of sync workers from 20 to 10. (#2077)
Adjust watch task min-pause-time to 1.minute.
Slowed down pause-time recovery by factor 3/4 instead of 1/2.
2020-11-26 09:19:27 +01:00
Jacek Sieka 7cc3dc8027
Revert "Full "node" RPC calls implementation and fixes to peer lifetime states. (#2065)" (#2082)
This reverts commit d041287a4f.
2020-11-26 09:05:23 +01:00
Eugene Kabanov d041287a4f
Full "node" RPC calls implementation and fixes to peer lifetime states. (#2065)
* Initial commit.

* Fix log lines and compilation error.

* Add get_v1_node_peers() implementation.

* Fix peer's lifetime states.

* Use the most recent multiaddress.

* Fix assign NewPeerScore again.
Fix compilation error with last seen address.
Fix Peer upgraded log line place.

* syncing, health, peer_count, peer_id and fixes for identity.

* Fix compilation problems.

* Move object declaration to callsigs.
Fix identity addresses fields.

* Finish node RPC calls.

* Avoid leak of lifetime future.

* Bump chronos.

* Fix json generator problem.
2020-11-26 08:09:59 +01:00
tersec 1d7fb2ed0c
remove {.inline.} pragmas (#2033)
* remove {.inline.} pragmas

* re-add inline on bitseqs functions and tweak inlining threshold

* remove macOS/LLVM inlining setting; revert non-init/module-local/tests inline pragma removal
2020-11-20 11:00:22 +01:00
tersec 9e716b32bd
address some XDeclaredButNotUsed hints (#2028) 2020-11-17 11:14:53 +01:00
cheatfate f642f71457 Calculate how much time syncing takes, and show it at statusbar. 2020-11-12 14:16:12 +02:00
Eugene Kabanov ea4d94c65c
Add guard task to monitor synchronization worker Future. (#1972)
* Add guard task to monitor synchronization worker Future.

* Change log events from debug to warning level.
2020-11-10 14:47:26 +01:00
Zahary Karadjov 14b2d4324d openarray -> openArray 2020-11-03 23:23:10 +02:00
Eugene Kabanov 3f5c7c36bc
Some syncing fixes (#1919)
* Add exponential rewind on MissingParent.

* Try to avoid peers which are useless for syncing.
Fix forward sync restart at proper point.
Fix getLocalWallSlot() to not return slots from the future.

* Fix incorrect logs.

* Fix logging.
Enable peer's status messages log on DEBUG level.

* Fix watch task to monitor operation progress, but not local head progress.

* Add more logging information.
Remove recurring failures detection mechanism.
2020-10-30 13:33:52 +01:00
Eugene Kabanov c82ff24b5c
Syncing fixes (#1909)
* Fix continuous sync queue rewinds on slow PCs.
Fix recurring disconnects on low peer score.

* Calculate average syncing speed, not the current one.
Move speed calculation to different task.

* Address review comments.
2020-10-27 10:25:28 +01:00
Eugene Kabanov 3bd7ab4c20
Do not reward empty responses. (#1827)
Request status from peers every StatusExpirationTime.
2020-10-08 14:50:48 +02:00
Mamy Ratsimbazafy 0280d6c73e
Revisiting log levels (#1788)
* Update log level - https://github.com/status-im/nim-beacon-chain/issues/1779 https://github.com/status-im/nim-beacon-chain/issues/1785

* Address review comments

* Document the logging strategy [skip ci]
2020-10-01 20:56:42 +02:00
Eugene Kabanov 2cd0c3adaa
Fix condition. (#1734) 2020-09-24 14:14:29 +02:00
Eugene Kabanov 08795b3f5d
Fix tight loop at the end of sync process. (#1731) 2020-09-23 17:58:02 +02:00
Eugene Kabanov 654b8d66bf
Peer management (#1707)
* addPeer() and addPeerNoWait() now returns PeerStatus, not bool.
Minor refactoring of PeerPool.
Fix tests.

* Refactor PeerPool.
Add lenSpace.
Add tests for lenSpace.
PeerPool.add procedures now return different error codes.
Fix SyncManager break/continue problem.
Fix connectWorker break/continue problem.
Refactor connectWorker and discoveryLoop.
Fix incoming/outgoing blocking problem.

* Refactor discovery loop.
Add checkPeer.

* Fix logic and compilation bugs.

* Adjust position of debugging log.

* Fix issue with maximum peers in PeerPool.
Optimize node record decoding.

* fix discoveryLoop.

* Remove aliases and fix tests using aliases.
2020-09-21 18:02:27 +02:00
Eugene Kabanov 9abdbdabd8
Fix sync_manager.nim not rewarding peers for good responses. (#1660) 2020-09-16 09:15:06 +02:00
Eugene Kabanov c7c9b9d5f1
Syncing V2 (#1602)
* Syncing workers now not bound to peers.
Sync status is now printed in statusbar.

* Add `SyncQueue.outSlot` to statusbar too.

* Add `inRangeEvent` and `rangeAge` parameter.

* Fix rangeAge is not depends on SyncQueue latest slot.
Fix syncManager to start from latest local head slot.

* Add notInRange event.

* Remove suspects field.
2020-09-11 14:46:01 +02:00
Jacek Sieka a087909683
fix req/resp protocol (#1621)
per spec, we must half-close request stream - not doing so may lead to
failure of the other end to start processing our request leading to
timeouts.

In particular, this fixes many sync problems that have been seen on
medalla.

* remove safeClose - close no longer raises
* use per-chunk timeouts in request processing
2020-09-10 21:40:09 +02:00
Jacek Sieka c810b64ed8
log getblocks error 2020-08-27 10:24:41 +02:00
Dustin Brody 95d5736128 don't rely on head updates for topic subscription decision 2020-08-22 01:50:50 +03:00
Jacek Sieka 22998fdfd4 avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
Eugene Kabanov 711f1f88ee
Use one single async queue and loop for processing blocks. (#1487)
* Initial commit

* Fix compilation problem.

* Address review comments.
2020-08-12 11:29:11 +02:00
Eugene Kabanov 55fcece0b2
SyncManager fix to process blocks one by one. (#1464)
* Allow sync manager process blocks one by one.

* Log storeBlock() and updateHead() duration.

* Calculate duration only for blocks added without any error.

* Fix float compilation error.

* Fix duration.

* Fix SyncQueue tests.
2020-08-10 09:15:50 +02:00
cheatfate 99dcb81e77 Initial commit. 2020-07-27 17:48:26 +03:00
Eugene Kabanov 01c00c960c
Fix syncman topics log lines. (#1295)
* Fix syncman log topics is not applied properly.
2020-07-10 11:25:58 +03:00
cheatfate 6ef2e71468 Fix names 2020-07-07 15:34:04 +03:00
cheatfate 322ec3d2f9 Forward sync should always start from finalized epoch's first slot. 2020-07-07 15:34:04 +03:00
Eugene Kabanov 96f26c447c
Replace zero-point rewind with rewind to latest finalized epoch's first slot. (#1176)
* Replace zero-point rewind with rewind to latest finalized epoch's first slot.

* Fix tests.

* Add missing penalty for MissingParent
Fix comments.
2020-06-15 21:41:26 +02:00
Eugene Kabanov 50c5d47250
Add maximum number of workers (peers used) by SyncManager (default: 10) (#1172)
Refactor and simplification of `sync` procedure.
Fix aggressive looping on excessive recurring failures.
2020-06-14 11:45:53 +02:00
Eugene Kabanov 1fc9413c48
Fix #1153. (#1160)
Add ability for SyncQueue to recover from unexpected MissingParent.
2020-06-11 16:20:53 +02:00
Eugene Kabanov 3ce98d5bca
Add checks for penalties which are not applied immediately. (#1139)
Change default maxHeadAge value to 1 epoch.
Set zero-point at the SyncQueue's initialization.
Remove annoying logs in runDiscoveryLoop.
2020-06-07 17:36:24 +02:00
Jacek Sieka 56ffb696be
reorder ssz (#1099)
* reorder ssz

* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports

* docs, imports
2020-06-03 15:52:02 +02:00
cheatfate 12e28a1fa9 Add proper concurrent connections.
Add SeenTable to avoid continuous attempts to dead peers.
Refactor onSecond.
Block backward sync while forward sync is working.
SyncManager now checks responses according corresponding requests + tests.
SyncManager now watching for not progressing local_head_slot and resets SyncQueue.
2020-06-03 12:53:57 +03:00
Eugene Kabanov 21131e629b
Sync freeze fixes. (#1072)
* Add ability to reset state of sync manager.
Fix bug when sync got stuck on `zero-point` reset.
Fix bug when sync got stuck when some of the workers waiting for failing one.

* Remove debugging comments and imports.

* Remove not used pendingLock.
2020-05-28 07:02:28 +02:00
Eugene Kabanov ea95021073
Fix sync issues. (#1035)
* Fix sync issues.

* Add documentation about zero-point.
Add more comments about syncing loops.
Change to 4 blocks per request.
2020-05-19 14:08:50 +02:00
Zahary Karadjov 75c1c6a95c Enable Snappy by default (using LibP2P steams for now)
This refactors the newly added Snappy streaming back-ends trying to
make them more similar and to reduce the code duplication to a minimum.
2020-05-13 12:18:42 +03:00
Eugene Kabanov da0b1a4993
Fix status handling. (#1008)
* Fix status handling.
Add log map of received blocks.

* Fix review comments.
Fix UnusedImport in sync_protocol.nim
2020-05-13 08:37:58 +02:00
Jacek Sieka fb2e0ddbec
sync fixes (#1005)
* sync fixes

* fix Status message finalized info
* work around sync starting before initial status exchange
* don't fail block on deposit signature check failure (fixes #989)
* print ForkDigest and Version nicely
* dump incoming blocks
* fix crash when libp2p peer connection is closed
* update chunk size to 16 to work around missing blocks when syncing

* bump libp2p

* bump libp2p

* better deposit skip message
2020-05-11 18:08:52 +00:00
Eugene Kabanov be89a3c54d
Add "drop by score" ability to PeerPool. (#917)
* Add "drop by score" ability to PeerPool.
Add tests.
Fix syncmanager queue to start from most fresh data.

* Fix endless cycle at the end of syncing process.
2020-04-23 17:31:00 +02:00
Jacek Sieka 65ca74c980 req: cap requested blocks better
also cap blocks in roots request
2020-04-22 12:09:26 +03:00
Eugene Kabanov 3d42da90a8
Syncing. (#909) 2020-04-20 16:59:18 +02:00
Ștefan Talpalaru b7a32a17ba
bump submodules
and remove failing syncManagerGroupRecoveryTest
2020-04-14 18:21:56 +02:00
Zahary Karadjov 22876da593 Fix gcsafety issues in the test suite 2020-03-24 22:14:40 +02:00
cheatfate db20fc1172 Fix SyncQueue push(data) bug.
Rename lastSlot to HeadSlot.
Add failure test.
2020-01-29 15:28:41 +00:00
cheatfate 73dc72583f Initial commit. 2020-01-29 15:28:41 +00:00