46 Commits

Author SHA1 Message Date
Jacek Sieka
d16e127daf
Peer stuff (#2084)
* Revert "Revert "Full "node" RPC calls implementation and fixes to peer lifetime states. (#2065)" (#2082)"

This reverts commit 7cc3dc8027a46cb9dee1ae56534880010151481e.

* fix nil disconnectedFut crash

* fixes

don't resetPeer, it causes peer miscounts

* disconnect disconnecting peers

...when there's a race.

* avoid connection spamming

* never decrease SeenTable timeout
* only recover ENR for known peers

* seen only when really disconnected
2020-11-26 20:23:45 +01:00
Eugene Kabanov
687cbaf94c
Adjust number of sync workers from 20 to 10. (#2077)
Adjust watch task min-pause-time to 1.minute.
Slowed down pause-time recovery by factor 3/4 instead of 1/2.
2020-11-26 09:19:27 +01:00
Jacek Sieka
7cc3dc8027
Revert "Full "node" RPC calls implementation and fixes to peer lifetime states. (#2065)" (#2082)
This reverts commit d041287a4fbb7f00d59096b8d3abc94f40e5cb7d.
2020-11-26 09:05:23 +01:00
Eugene Kabanov
d041287a4f
Full "node" RPC calls implementation and fixes to peer lifetime states. (#2065)
* Initial commit.

* Fix log lines and compilation error.

* Add get_v1_node_peers() implementation.

* Fix peer's lifetime states.

* Use the most recent multiaddress.

* Fix assign NewPeerScore again.
Fix compilation error with last seen address.
Fix Peer upgraded log line place.

* syncing, health, peer_count, peer_id and fixes for identity.

* Fix compilation problems.

* Move object declaration to callsigs.
Fix identity addresses fields.

* Finish node RPC calls.

* Avoid leak of lifetime future.

* Bump chronos.

* Fix json generator problem.
2020-11-26 08:09:59 +01:00
tersec
1d7fb2ed0c
remove {.inline.} pragmas (#2033)
* remove {.inline.} pragmas

* re-add inline on bitseqs functions and tweak inlining threshold

* remove macOS/LLVM inlining setting; revert non-init/module-local/tests inline pragma removal
2020-11-20 11:00:22 +01:00
tersec
9e716b32bd
address some XDeclaredButNotUsed hints (#2028)
2020-11-17 11:14:53 +01:00
cheatfate
f642f71457
Calculate how much time syncing takes, and show it in the statusbar.
2020-11-12 14:16:12 +02:00
Eugene Kabanov
ea4d94c65c
Add guard task to monitor synchronization worker Future. (#1972)
* Add guard task to monitor synchronization worker Future.

* Change log events from debug to warning level.
2020-11-10 14:47:26 +01:00
Zahary Karadjov
14b2d4324d
openarray -> openArray
2020-11-03 23:23:10 +02:00
Eugene Kabanov
3f5c7c36bc
Some syncing fixes (#1919)
* Add exponential rewind on MissingParent.

* Try to avoid peers which are useless for syncing.
Fix forward sync restart at proper point.
Fix getLocalWallSlot() to not return slots from the future.

* Fix incorrect logs.

* Fix logging.
Enable peer's status messages log on DEBUG level.

* Fix watch task to monitor operation progress, but not local head progress.

* Add more logging information.
Remove recurring failures detection mechanism.
2020-10-30 13:33:52 +01:00
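
Note on "exponential rewind on MissingParent" (#1919): the commit message does not describe the exact policy, so the following is only a minimal sketch of one plausible reading, written in Python rather than the project's Nim. The function name `rewind_slot`, the doubling per consecutive failure, the 32-slot epoch, and the clamp at the finalized slot are all assumptions made for illustration.

```python
SLOTS_PER_EPOCH = 32  # assumed epoch length for this sketch


def rewind_slot(failed_slot: int, finalized_slot: int, failures: int) -> int:
    """Pick a restart slot after the `failures`-th consecutive MissingParent.

    Hypothetical policy: rewind 1, 2, 4, ... epochs (doubling per failure),
    align down to an epoch boundary, and never go below the finalized slot.
    """
    if failures < 1:
        raise ValueError("failures must be >= 1")
    epochs_back = 2 ** (failures - 1)
    target = failed_slot - epochs_back * SLOTS_PER_EPOCH
    target -= target % SLOTS_PER_EPOCH  # align to the first slot of an epoch
    return max(target, finalized_slot)
```

With these assumptions, repeated failures back off quickly, while the finalized slot bounds how far the queue can be rewound.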
Eugene Kabanov
c82ff24b5c
Syncing fixes (#1909)
* Fix continuous sync queue rewinds on slow PCs.
Fix recurring disconnects on low peer score.

* Calculate average syncing speed, not the current one.
Move speed calculation to different task.

* Address review comments.
2020-10-27 10:25:28 +01:00
Eugene Kabanov
3bd7ab4c20
Do not reward empty responses. (#1827)
Request status from peers every StatusExpirationTime.
2020-10-08 14:50:48 +02:00
Mamy Ratsimbazafy
0280d6c73e
Revisiting log levels (#1788)
* Update log level - https://github.com/status-im/nim-beacon-chain/issues/1779 https://github.com/status-im/nim-beacon-chain/issues/1785

* Address review comments

* Document the logging strategy [skip ci]
2020-10-01 20:56:42 +02:00
Eugene Kabanov
2cd0c3adaa
Fix condition. (#1734)
2020-09-24 14:14:29 +02:00
Eugene Kabanov
08795b3f5d
Fix tight loop at the end of sync process. (#1731)
2020-09-23 17:58:02 +02:00
Eugene Kabanov
654b8d66bf
Peer management (#1707)
* addPeer() and addPeerNoWait() now return PeerStatus, not bool.
Minor refactoring of PeerPool.
Fix tests.

* Refactor PeerPool.
Add lenSpace.
Add tests for lenSpace.
PeerPool.add procedures now return different error codes.
Fix SyncManager break/continue problem.
Fix connectWorker break/continue problem.
Refactor connectWorker and discoveryLoop.
Fix incoming/outgoing blocking problem.

* Refactor discovery loop.
Add checkPeer.

* Fix logic and compilation bugs.

* Adjust position of debugging log.

* Fix issue with maximum peers in PeerPool.
Optimize node record decoding.

* fix discoveryLoop.

* Remove aliases and fix tests using aliases.
2020-09-21 18:02:27 +02:00
Eugene Kabanov
9abdbdabd8
Fix sync_manager.nim not rewarding peers for good responses. (#1660)
2020-09-16 09:15:06 +02:00
Eugene Kabanov
c7c9b9d5f1
Syncing V2 (#1602)
* Syncing workers are no longer bound to peers.
Sync status is now printed in statusbar.

* Add `SyncQueue.outSlot` to statusbar too.

* Add `inRangeEvent` and `rangeAge` parameter.

* Fix rangeAge not depending on SyncQueue's latest slot.
Fix syncManager to start from latest local head slot.

* Add notInRange event.

* Remove suspects field.
2020-09-11 14:46:01 +02:00
Jacek Sieka
a087909683
fix req/resp protocol (#1621)
Per spec, we must half-close the request stream. Not doing so may keep
the other end from starting to process our request, leading to
timeouts.

In particular, this fixes many sync problems that have been seen on
medalla.

* remove safeClose - close no longer raises
* use per-chunk timeouts in request processing
2020-09-10 21:40:09 +02:00
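
Note on the half-close in #1621: the idea can be shown with plain sockets. This is only an analogue using Python's standard `socket` module, not the libp2p stream API the project uses; the helper names `read_until_eof` and `request_response` are hypothetical. The responder here reads until end-of-stream before answering, so it would wait forever if the requester never half-closed its write side.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Read from sock until the peer closes (or half-closes) its write side."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read means the peer signalled end-of-stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


def request_response(request: bytes, handler) -> bytes:
    """Send a small request, half-close, then read the full response."""
    client, server = socket.socketpair()
    try:
        client.sendall(request)
        # Half-close: no more data will be written, but reading stays open.
        client.shutdown(socket.SHUT_WR)
        # The responder can only finish reading because of the half-close.
        received = read_until_eof(server)
        server.sendall(handler(received))
        server.shutdown(socket.SHUT_WR)
        return read_until_eof(client)
    finally:
        client.close()
        server.close()
```

The sketch runs both ends in one thread, which is only safe for payloads small enough to fit in the socket buffers.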
Jacek Sieka
c810b64ed8
log getblocks error
2020-08-27 10:24:41 +02:00
Dustin Brody
95d5736128
don't rely on head updates for topic subscription decision
2020-08-22 01:50:50 +03:00
Jacek Sieka
22998fdfd4
avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
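
Note on "avoid double deserialization": the pattern described above (decode once during validation, enqueue the decoded object, prioritise blocks and aggregates, drop excess attestations and aggregates) might look roughly like the sketch below. It is written in Python, not Nim, and is not the project's code: `GossipQueue`, `Entry`, the per-kind limit, the priority table, and the use of JSON as a stand-in for SSZ are all assumptions for illustration.

```python
import heapq
import itertools
import json
from dataclasses import dataclass, field
from typing import Any

# Assumed ordering: lower number is processed first.
PRIORITY = {"block": 0, "aggregate": 1, "attestation": 2}


@dataclass(order=True)
class Entry:
    priority: int
    seq: int  # arrival order, used as a tie-breaker within one priority
    kind: str = field(compare=False)
    decoded: Any = field(compare=False)


class GossipQueue:
    def __init__(self, limit_per_kind: int = 2) -> None:
        self._heap = []
        self._seq = itertools.count()
        self._limit = limit_per_kind
        self._pending = {"aggregate": 0, "attestation": 0}
        self.decode_calls = 0  # lets callers check that decoding happens once

    def on_gossip(self, kind: str, raw: bytes) -> bool:
        """Decode once, validate, then enqueue the decoded object itself."""
        decoded = json.loads(raw)  # stand-in for the slow SSZ decode
        self.decode_calls += 1
        if "slot" not in decoded:  # placeholder validation rule
            return False
        if kind in self._pending:
            if self._pending[kind] >= self._limit:
                return False  # too many queued: drop instead of growing
            self._pending[kind] += 1
        entry = Entry(PRIORITY[kind], next(self._seq), kind, decoded)
        heapq.heappush(self._heap, entry)
        return True

    def pop(self) -> tuple:
        """Return (kind, decoded) for the highest-priority queued item."""
        entry = heapq.heappop(self._heap)
        if entry.kind in self._pending:
            self._pending[entry.kind] -= 1
        return entry.kind, entry.decoded
```

The processing side only ever sees already-decoded objects, so no second decode is needed after `pop()`.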
Eugene Kabanov
711f1f88ee
Use one single async queue and loop for processing blocks. (#1487)
* Initial commit

* Fix compilation problem.

* Address review comments.
2020-08-12 11:29:11 +02:00
Eugene Kabanov
55fcece0b2
SyncManager fix to process blocks one by one. (#1464)
* Allow sync manager to process blocks one by one.

* Log storeBlock() and updateHead() duration.

* Calculate duration only for blocks added without any error.

* Fix float compilation error.

* Fix duration.

* Fix SyncQueue tests.
2020-08-10 09:15:50 +02:00
cheatfate
99dcb81e77
Initial commit.
2020-07-27 17:48:26 +03:00
Eugene Kabanov
01c00c960c
Fix syncman topics log lines. (#1295)
* Fix syncman log topics not being applied properly.
2020-07-10 11:25:58 +03:00
cheatfate
6ef2e71468
Fix names
2020-07-07 15:34:04 +03:00
cheatfate
322ec3d2f9
Forward sync should always start from finalized epoch's first slot.
2020-07-07 15:34:04 +03:00
Eugene Kabanov
96f26c447c
Replace zero-point rewind with rewind to latest finalized epoch's first slot. (#1176)
* Replace zero-point rewind with rewind to latest finalized epoch's first slot.

* Fix tests.

* Add missing penalty for MissingParent
Fix comments.
2020-06-15 21:41:26 +02:00
Eugene Kabanov
50c5d47250
Add maximum number of workers (peers used) by SyncManager (default: 10) (#1172)
Refactor and simplification of `sync` procedure.
Fix aggressive looping on excessive recurring failures.
2020-06-14 11:45:53 +02:00
Eugene Kabanov
1fc9413c48
Fix #1153. (#1160)
Add ability for SyncQueue to recover from unexpected MissingParent.
2020-06-11 16:20:53 +02:00
Eugene Kabanov
3ce98d5bca
Add checks for penalties which are not applied immediately. (#1139)
Change default maxHeadAge value to 1 epoch.
Set zero-point at the SyncQueue's initialization.
Remove annoying logs in runDiscoveryLoop.
2020-06-07 17:36:24 +02:00
Jacek Sieka
56ffb696be
reorder ssz (#1099)
* reorder ssz

* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports

* docs, imports
2020-06-03 15:52:02 +02:00
cheatfate
12e28a1fa9
Add proper concurrent connections.
Add SeenTable to avoid continuous attempts to dead peers.
Refactor onSecond.
Block backward sync while forward sync is working.
SyncManager now checks responses against their corresponding requests; add tests.
SyncManager now watches for a non-progressing local_head_slot and resets SyncQueue.
2020-06-03 12:53:57 +03:00
Eugene Kabanov
21131e629b
Sync freeze fixes. (#1072)
* Add ability to reset state of sync manager.
Fix bug where sync got stuck on `zero-point` reset.
Fix bug where sync got stuck when some of the workers were waiting for a failing one.

* Remove debugging comments and imports.

* Remove unused pendingLock.
2020-05-28 07:02:28 +02:00
Eugene Kabanov
ea95021073
Fix sync issues. (#1035)
* Fix sync issues.

* Add documentation about zero-point.
Add more comments about syncing loops.
Change to 4 blocks per request.
2020-05-19 14:08:50 +02:00
Zahary Karadjov
75c1c6a95c
Enable Snappy by default (using LibP2P streams for now)
This refactors the newly added Snappy streaming back-ends trying to
make them more similar and to reduce the code duplication to a minimum.
2020-05-13 12:18:42 +03:00
Eugene Kabanov
da0b1a4993
Fix status handling. (#1008)
* Fix status handling.
Add log map of received blocks.

* Fix review comments.
Fix UnusedImport in sync_protocol.nim
2020-05-13 08:37:58 +02:00
Jacek Sieka
fb2e0ddbec
sync fixes (#1005)
* sync fixes

* fix Status message finalized info
* work around sync starting before initial status exchange
* don't fail block on deposit signature check failure (fixes #989)
* print ForkDigest and Version nicely
* dump incoming blocks
* fix crash when libp2p peer connection is closed
* update chunk size to 16 to work around missing blocks when syncing

* bump libp2p

* bump libp2p

* better deposit skip message
2020-05-11 18:08:52 +00:00
Eugene Kabanov
be89a3c54d
Add "drop by score" ability to PeerPool. (#917)
* Add "drop by score" ability to PeerPool.
Add tests.
Fix syncmanager queue to start from the freshest data.

* Fix endless cycle at the end of syncing process.
2020-04-23 17:31:00 +02:00
Jacek Sieka
65ca74c980
req: cap requested blocks better
also cap blocks in roots request
2020-04-22 12:09:26 +03:00
Eugene Kabanov
3d42da90a8
Syncing. (#909)
2020-04-20 16:59:18 +02:00
Ștefan Talpalaru
b7a32a17ba
bump submodules
and remove failing syncManagerGroupRecoveryTest
2020-04-14 18:21:56 +02:00
Zahary Karadjov
22876da593
Fix gcsafety issues in the test suite
2020-03-24 22:14:40 +02:00
cheatfate
db20fc1172
Fix SyncQueue push(data) bug.
Rename lastSlot to HeadSlot.
Add failure test.
2020-01-29 15:28:41 +00:00
cheatfate
73dc72583f
Initial commit.
2020-01-29 15:28:41 +00:00