Commit Graph

253 Commits

Author SHA1 Message Date
Zahary Karadjov 8a6281aad2 Simple cost model for sync requests; Penalize peers performing flooding or invalid requests 2020-10-15 20:15:51 +03:00
tersec 3ee2dd8da4
p2p-interface spec ref bump (except non-updated places) (#1862) 2020-10-12 14:37:14 +00:00
Zahary Karadjov 00a8a68671
Address #1695
Better error messages when the beacon node is asked to listen on a
reserved port (0) or an already taken one.
2020-10-09 16:39:03 +03:00
tersec b79e5f8af5
update nim-beacon-chain to nimbus-eth2 in beacon_chain/, ncli/, tests/, and README.md (#1843) 2020-10-08 19:02:05 +00:00
cheatfate f091c8d4df Add chronicles.formatIt for PublicKey.
Fix logs.
Rename checkFilePermissions to checkSensitiveFilePermissions.
2020-10-05 22:19:50 +03:00
cheatfate add22a20e1 Update local_testnet and simulation scripts to use netkey-file and insecure-netkey-password.
Add more logging
2020-10-05 22:19:50 +03:00
cheatfate e1182f8000 Add insecure password for automated testing.
Fix checkDataDir to run before setupLogging.
2020-10-05 22:19:50 +03:00
cheatfate 40f2b74f73 Add keystore management and interactive password handling. 2020-10-05 22:19:50 +03:00
cheatfate c5c788a9db Secure network key file and data directory. 2020-10-05 22:19:50 +03:00
Mamy Ratsimbazafy b57693ec0d
Logging update (#1795)
* Fix discovery log message trigger

* Bump chronicles - include https://github.com/status-im/nim-chronicles/pull/89 for better NOTICE/WARNING color
2020-10-03 08:35:45 +02:00
tersec 5e95fd7468
Revert "update to v0.12.3 message ID for Spadina launch (#1762)" (#1801)
This reverts commit a2270a5f27.
2020-10-02 19:50:21 +00:00
Mamy Ratsimbazafy 0280d6c73e
Revisiting log levels (#1788)
* Update log level - https://github.com/status-im/nim-beacon-chain/issues/1779 https://github.com/status-im/nim-beacon-chain/issues/1785

* Address review comments

* Document the logging strategy [skip ci]
2020-10-01 20:56:42 +02:00
tersec a2270a5f27
update to v0.12.3 message ID for Spadina launch (#1762)
* update to v0.12.3 message ID for Spadina launch

* remove base64 import
2020-09-28 17:07:10 +02:00
Kim De Mey 23bec99341
Let discovery also use the listen-address CLI option instead of always using any address (#1658) 2020-09-27 22:00:24 +02:00
Jacek Sieka 7837646079
anonymize libp2p messages (#1756)
* anonymize libp2p messages

* bump
2020-09-25 18:40:30 +02:00
Eugene Kabanov 1bf8d3af33
Disconnect peers with low score. (#1747)
* Disconnect peers with low score.

* Change PeerScoreLow value.

* Add spec url for DisconnectionReason.
2020-09-25 15:43:45 +02:00
tersec f96ad87d28
switch another 50+ spec refs from v0.12.2 to v0.12.3 (#1749) 2020-09-25 11:52:50 +00:00
Jacek Sieka b3a9afa0b1
libp2p: limit max gossip writes (#1739)
* libp2p: limit max gossip writes

* bump
2020-09-24 19:03:17 +02:00
Jacek Sieka e1c177cdd1
bump libp2p (#1721)
gossipsub 1.1 can be enabled with -d:nbc_gossipsub_11
2020-09-22 19:34:34 +02:00
Eugene Kabanov 654b8d66bf
Peer management (#1707)
* addPeer() and addPeerNoWait() now return PeerStatus, not bool.
Minor refactoring of PeerPool.
Fix tests.

* Refactor PeerPool.
Add lenSpace.
Add tests for lenSpace.
PeerPool.add procedures now return different error codes.
Fix SyncManager break/continue problem.
Fix connectWorker break/continue problem.
Refactor connectWorker and discoveryLoop.
Fix incoming/outgoing blocking problem.

* Refactor discovery loop.
Add checkPeer.

* Fix logic and compilation bugs.

* Adjust position of debugging log.

* Fix issue with maximum peers in PeerPool.
Optimize node record decoding.

* fix discoveryLoop.

* Remove aliases and fix tests using aliases.
2020-09-21 18:02:27 +02:00
Jacek Sieka fc10f5121a protect against data after initial request
spec requires that channel is closed

also, avoid some unnecessary futures
2020-09-18 21:34:07 +03:00
tersec e106549efe
keep REJECT/IGNORE of messages failing validation for libp2p scoring (#1676)
* keep REJECT/IGNORE status of messages failing validation for libp2p scoring

* fix test suite
2020-09-18 13:53:09 +02:00
Dmitriy Ryajov 2f89e2ab4e
drop subscribePeer, it's called from pubsub now (#1677) 2020-09-17 11:40:21 +02:00
Eugene Kabanov 6e463257f4
PeerPool fixes. (#1654)
* Refactor peer_pool.
Fix eth2_network peer counters.
Fix PeerPool not allowing more peers to be added when empty space is available.

* Remove unused imports.

* Add test for a bug.

* Fix eth2_network disconnect should deletePeer not release.
More PeerPool refactoring.
2020-09-16 13:00:11 +03:00
Jacek Sieka c76305f824
fix some todo (#1645)
* remove some superfluous gcsafes
* remove getTailState (unused)
* don't store old epochrefs in blocks
* document attestation pool a bit
* remove `pcs =` cruft from log
2020-09-14 14:50:03 +00:00
Jacek Sieka a087909683
fix req/resp protocol (#1621)
per spec, we must half-close request stream - not doing so may lead to
failure of the other end to start processing our request leading to
timeouts.

In particular, this fixes many sync problems that have been seen on
medalla.

* remove safeClose - close no longer raises
* use per-chunk timeouts in request processing
2020-09-10 21:40:09 +02:00
tersec d0de1a49a3
Fix some warnings and hints and partly revert #1610 (#1615)
* address some XDeclaredButNotUsed, ConvFromXtoItselfNotNeeded, and UnusedImport hints and warnings

* partly revert #1610
2020-09-08 11:32:43 +00:00
Jacek Sieka d584591ded
simplify libp2p logging (#1605)
and a few other small logging fixes
2020-09-06 10:39:25 +02:00
Jacek Sieka a7a279d615
add option to disable discv5 (#1509) 2020-08-24 13:52:06 +02:00
Jacek Sieka 22998fdfd4 avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
Zahary Karadjov 2c19e3f8cd
[skip ci] Use GOSSIP_MAX_SIZE when snappy decoding in the inspector as well; Bumps 2020-08-19 14:33:52 +03:00
Zahary Karadjov 3433c77c35 Prevent Snappy decompression bombs 2020-08-19 10:13:04 +03:00
Dmitriy Ryajov 87f983c639 use split out pubsub 2020-08-17 17:24:36 +03:00
tersec 612881b95d
refactor topic (un)subscribing/validating to collate each (#1510)
* refactor topic (un)subscribing/validating to collate each

* fix comment

* tweak comment
2020-08-17 14:07:29 +02:00
tersec af3355e0f8
create local testnet mode for eth2_network (#1494) 2020-08-12 14:16:59 +00:00
tersec 22c1ef5a8d
split subscribe into non-validating subscribe and addValidator (#1485)
* split subscribe into non-validating subscribe and addValidator

* stop exporting get_committee_assignments
2020-08-11 15:08:44 +00:00
Jacek Sieka 280e72f3c9
remove snappy RPC support (#1477)
removed in 0.12.2 - the flow, in particular when the other peer doesn't
support snappy, is hard to follow because of the trial-and-error
approach - removing it simplifies things and removes some of the
hard-to-read parts of the thunking etc
2020-08-10 15:18:17 +02:00
Jacek Sieka 936440fccd
use libp2p peer events to track peer (#1468)
this resolves some peer counting issues that were happening because the
lifetime future in PeerInfo was unreliable (multiple PeerInfo instances
existed per peer)

In addition, this solves another race condition: when connecting to a
peer and later dialling that protocol, it is not certain that the same
connection will be used if there's a concurrent incoming peer connection
ongoing - better not make too many assumptions about who sent statuses
when.
2020-08-10 12:58:34 +02:00
Jacek Sieka 585c410d90 remove randompeers
unused, requires importing `random` which we're trying to avoid
2020-08-10 11:48:33 +03:00
Dmitriy Ryajov c5077af4bc
decrease amount of concurrent dials (#1460) 2020-08-06 19:21:12 +00:00
Jacek Sieka f4c16ed0db
eh cleanups (#1458)
current exception sometimes buggy in nim
2020-08-06 18:47:39 +00:00
Jacek Sieka 221f372dbc use peer id in a number of places 2020-08-05 19:34:59 +03:00
Jacek Sieka d22a2cec2b
Start libp2p before writing ENR file (#1418)
this makes sure that all libp2p transports are open for business when
the file hits the ground
2020-08-03 19:35:27 +02:00
Dmitriy Ryajov 52d9d269d7
bump libp2p (delayed send, without hooks) (#1413)
* use `switch.isConnected`

* libp2p

* add timeout to publish

* use isConnected

* adjust timeouts

* latest libp2p master

* do not drop peers
2020-08-03 16:43:22 +00:00
tersec e0a6f58abe
convert 10 v0.12.1 spec refs to v0.12.2 (#1406) 2020-07-31 09:59:14 +00:00
Jacek Sieka c5fecd472f
more fork-choice fixes (#1388)
* more fork-choice fixes

* use target block/epoch to validate attestations
* make addLocalValidators sync
* add current and previous epoch to cache before doing state transition
* update head state using clearance state as a shortcut, when possible
* use blockslot for fork choice balances
* send attestations using epochref cache

* fix invalid finalized parent being used

also simplify epoch block traversal

* single error handling style in fork choice

* import fix, remove unused async
2020-07-30 17:48:25 +02:00
tersec 99f2d8e06c
update 14 v0.12.1 spec refs to v0.12.2 (#1400) 2020-07-30 09:47:57 +00:00
cheatfate 99dcb81e77 Initial commit. 2020-07-27 17:48:26 +03:00
tersec 90708a8287
Prefer converting int to uint64 and switch foo.len.uint64 to .len64 (#1375)
* avoid converting from uint64 to int, and where most feasible, int type conversion at all

* .len.uint64 -> .len64

* fix 32-bit compilation

* try keeping state_sim loop variable/bounds as int for 32-bit Azure

* len64 -> lenu64
2020-07-26 20:55:48 +02:00
Ștefan Talpalaru c47532f2b0
deal with a temporary loss of network connectivity (#1354)
* don't kill the program if not connected to a bootstrap node within 30 seconds

* recover faster from loss of network connectivity

* connectWorker(): sleep 1s between dials

* launch_local_testnet.sh: increase BOOTSTRAP_TIMEOUT

* don't use metric value in program logic

* refactor some ungainly variable names
2020-07-23 22:51:56 +02:00
Eugene Kabanov 8c5aa7cbe7
Use only secp256k1 as identity in libp2p. (#1343)
* Add libp2p_pki_schemes to beacon_node and inspector configuration files.

* Fix inspector.nim.cfg file name.

* Do not allow beacon_node to be built without libp2p_pki_schemes option value.

* Fix compilation problems.

* Fix tests.

* Fix validator_client.
2020-07-21 18:07:14 +02:00
Dustin Brody 4140b3b9d9 update 29 spec refs to v0.12.1 2020-07-08 20:49:25 +00:00
Jacek Sieka f53425873c Only use noise 2020-07-08 08:05:38 +00:00
Jacek Sieka 6fe0a623f5
Crypto rng (#1284)
* use bearssl rng throughout

* bump

* bump

* move keygen out of crypto
2020-07-07 17:51:02 +02:00
Jacek Sieka ef2f037571
bump libp2p (#1267) 2020-07-01 13:41:40 +02:00
Jacek Sieka 816779733e
use the eth2 message id for gossip (in logging too) (#1246)
* use the eth2 message id for gossip (in logging too)

* bump

* add spec link
2020-06-28 22:06:50 +02:00
Jacek Sieka eeccaaf16d
stop gossipping non-snappy (#1240)
* stop gossipping non-snappy

Also simplify subscription and actually handle decoding errors

* log weird states too
2020-06-27 12:16:43 +02:00
Jacek Sieka 1301600341
Trusted blocks (#1227)
* cleanups

* fix ncli state root check flag
* add block dump to ncli_db
* limit ncli_db benchmark length
* tone down finalization logs

* introduce trusted blocks

We only store blocks whose signature we've verified in the database - as
such, there's no need to check it again, and most importantly, no need
to deserialize the signature when loading from database.

50x startup time improvement, 200x block load time improvement.

* fix rewinding when deposits have invalid signature
* speed up ancestor iteration by avoiding copy
* avoid deserializing signatures for trusted data
* load blocks lazily when rewinding (less memory used)

* chronicles workarounds

* document trustedbeaconblock
2020-06-25 12:23:10 +02:00
Jacek Sieka 1d709c09f4
secp: requiresInit (#1210)
* secp: requiresInit

* bump
2020-06-22 21:40:19 +02:00
Zahary Karadjov 14274587cf More user-friendly logging during mainchain monitoring 2020-06-22 17:30:04 +03:00
Eugene Kabanov 47eaaa7696
Fix connection workers race. (#1204) 2020-06-21 18:49:48 +02:00
Jacek Sieka a661ecbae1
bump libp2p (#1209) 2020-06-21 18:45:28 +02:00
Jacek Sieka 7e0e4dc327
don't crash on unknown disconnection reason, fix disconnection reason enum (#1208) 2020-06-20 09:24:33 +02:00
Jacek Sieka e813111b3b peers rpc call
simple way to display nbc peer table
2020-06-18 07:29:20 +00:00
Jacek Sieka 8fbbd59885
metric names (#1188)
* fix metric names to not clash with native libp2p metrics
* run testnet node with rpc enabled by default
2020-06-17 13:04:24 +02:00
kdeme a25bc025d1 Start discovery after starting libp2p switch 2020-06-16 13:33:46 +00:00
Kim De Mey 68a8b7d969
Filter discovery nodes on forkId (#1162) 2020-06-12 16:14:18 +02:00
Zahary Karadjov cf6a869e9e Address some TODO items; Handle start-up before genesis more properly 2020-06-11 17:40:08 +03:00
Zahary Karadjov c773e10c1a Attempt to reduce the risk of dropped network connections during the loading of KeyStores 2020-06-11 17:40:08 +03:00
Zahary Karadjov 25821331c4 More greppable code for the onPeerConnected operation 2020-06-11 17:40:08 +03:00
Jacek Sieka 016cc22173
show peer info on connect (#1155) 2020-06-11 07:14:26 +02:00
Eugene Kabanov 040e38529a
Fix #1140. (#1143)
SeenTable is now able to hold peers with different timeout values.
2020-06-08 18:02:50 +02:00
Eugene Kabanov 3ce98d5bca
Add checks for penalties which are not applied immediately. (#1139)
Change default maxHeadAge value to 1 epoch.
Set zero-point at the SyncQueue's initialization.
Remove annoying logs in runDiscoveryLoop.
2020-06-07 17:36:24 +02:00
Zahary Karadjov 0c78fc39e7
Use the latest LibP2P 2020-06-05 19:34:12 +03:00
Jacek Sieka bcbfa736c9
format ErrorMsg messages reasonably (#1109) 2020-06-04 08:19:25 +02:00
Jacek Sieka 56ffb696be
reorder ssz (#1099)
* reorder ssz

* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports

* docs, imports
2020-06-03 15:52:02 +02:00
cheatfate 405e9db199
Fix problem of good peers also being logged as timed out. 2020-06-03 13:48:01 +03:00
cheatfate 12e28a1fa9 Add proper concurrent connections.
Add SeenTable to avoid continuous attempts to dead peers.
Refactor onSecond.
Block backward sync while forward sync is working.
SyncManager now checks responses according to corresponding requests + tests.
SyncManager now watches for a non-progressing local_head_slot and resets SyncQueue.
2020-06-03 12:53:57 +03:00
kdeme 06f025b228 Add timeout to switch.connect 2020-06-02 23:06:11 +03:00
Ștefan Talpalaru a90b0dd197
Merge pull request #1077 from status-im/timeout
Eth2Node.stop(): trace msg on timeout
2020-05-29 16:30:13 +02:00
Kim De Mey e33c8d9067
Bump nim-eth and accompanying discv5 cleanup (#1081) 2020-05-29 12:03:29 +02:00
Eugene Kabanov 21131e629b
Sync freeze fixes. (#1072)
* Add ability to reset state of sync manager.
Fix bug when sync got stuck on `zero-point` reset.
Fix bug where sync got stuck when some of the workers were waiting for a failing one.

* Remove debugging comments and imports.

* Remove not used pendingLock.
2020-05-28 07:02:28 +02:00
Ștefan Talpalaru 273a912ae0
Eth2Node.stop(): trace msg on timeout 2020-05-28 03:14:01 +02:00
Zahary Karadjov 28128f4d2f Add a handler for the Goodbye message
The lack of a body for `goodbye` in sync_protocol.nim was preventing
the respective LibP2P protocol from being mounted and advertised on the
network.

Adding a body fixes that, but I've also made some changes in the
P2P protocol codegen that will prevent the issue from happening
again (no body is now considered the equivalent of having an empty
body).
2020-05-26 22:17:26 +03:00
Zahary Karadjov 833f19e942 Reform the networking layer in order to handle the new stricter SSZ API 2020-05-24 19:00:34 +03:00
Zahary Karadjov accd5fe954 Don't use StackArray in ssz; Drop the support for strings 2020-05-24 19:00:34 +03:00
Ștefan Talpalaru b2193f1b8f
Eth2Node.stop(): 5s timeout 2020-05-21 00:06:01 +02:00
Jacek Sieka a38eddcaac
remove ssz stint support (#1046) 2020-05-20 19:05:22 +02:00
Ștefan Talpalaru 383b22795c
bump submodules (#1043) 2020-05-20 06:57:39 +02:00
Ștefan Talpalaru c4462af4ab
beacon_node: graceful shutdown (#1033)
* beacon_node: graceful shutdown

* separate BeaconNodeStatus and BeaconNode instances
2020-05-19 20:57:35 +02:00
Dmitriy Ryajov 0649d47df0 use proper transport flags 2020-05-18 21:51:03 +00:00
Jacek Sieka 6be7d64e8c
bump libp2p (#1031) 2020-05-18 10:11:21 +02:00
Zahary Karadjov 24a17f5814 Fix an RPC error in Lighthouse triggered by the getMetadata message 2020-05-16 09:56:13 +03:00
tersec 74db0f3c8d
fix some XDeclaredButNotUsed hints (#1027) 2020-05-15 14:41:00 +02:00
Zahary Karadjov 75c1c6a95c Enable Snappy by default (using LibP2P streams for now)
This refactors the newly added Snappy streaming back-ends trying to
make them more similar and to reduce the code duplication to a minimum.
2020-05-13 12:18:42 +03:00
Zahary Karadjov f055fad08a Make the Snappy FastStreams integration optional by duplicating it for LibP2P streams 2020-05-13 12:18:42 +03:00
Zahary Karadjov 15f0153441 Cosmetic improvements 2020-05-13 12:18:42 +03:00
Zahary Karadjov 9538b60704 Integrate the async Snappy implementation 2020-05-13 12:18:42 +03:00
Zahary Karadjov a739d7e8d6 Adapt SSZ to the latest FastStreams API 2020-05-13 12:18:42 +03:00
Eugene Kabanov da0b1a4993
Fix status handling. (#1008)
* Fix status handling.
Add log map of received blocks.

* Fix review comments.
Fix UnusedImport in sync_protocol.nim
2020-05-13 08:37:58 +02:00