Commit Graph

641 Commits

Author SHA1 Message Date
Eugene Kabanov 654b8d66bf
Peer management (#1707)
* addPeer() and addPeerNoWait() now returns PeerStatus, not bool.
Minor refactoring of PeerPool.
Fix tests.

* Refactor PeerPool.
Add lenSpace.
Add tests for lenSpace.
PeerPool.add procedures now return different error codes.
Fix SyncManager break/continue problem.
Fix connectWorker break/continue problem.
Refactor connectWorker and discoveryLoop.
Fix incoming/outgoing blocking problem.

* Refactor discovery loop.
Add checkPeer.

* Fix logic and compilation bugs.

* Adjust position of debugging log.

* Fix issue with maximum peers in PeerPool.
Optimize node record decoding.

* fix discoveryLoop.

* Remove aliases and fix tests using aliases.
2020-09-21 18:02:27 +02:00
tersec 3190c695b0
minimal v0.12.3 update (#1716) 2020-09-21 15:58:35 +00:00
tersec e106549efe
keep REJECT/IGNORE of messages failing validation for libp2p scoring (#1676)
* keep REJECT/IGNORE status of messages failing validation for libp2p scoring

* fix test suite
2020-09-18 13:53:09 +02:00
Jacek Sieka dcf8a6b05d
improve slot processing speeds (#1670)
about 40% better slot processing times (with LTO enabled) - these don't
do BLS but are used
heavily during replay (state transition = slot + block transition)

tests using a recent medalla state and advancing it 1000 slots:

```
./ncli slots --preState2:state-302271-3c1dbf19-c1f944bf.ssz --slot:1000
--postState2:xx.ssz
```
pre:

```

All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
39.236,        0.000,       39.236,       39.236,            1,
Load state from file
0.049,        0.002,        0.046,        0.063,          968,
Apply slot
256.504,       81.008,      213.471,      591.902,           32,
Apply epoch slot
28.597,        0.000,       28.597,       28.597,            1,
Save state to file
```

cast:
```
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
37.079,        0.000,       37.079,       37.079,            1,
Load state from file
0.042,        0.002,        0.040,        0.090,          968,
Apply slot
215.552,       68.763,      180.155,      500.103,           32,
Apply epoch slot
25.106,        0.000,       25.106,       25.106,            1,
Save state to file
```

cast+rewards:
```
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
40.049,        0.000,       40.049,       40.049,            1,
Load state from file
0.048,        0.001,        0.045,        0.060,          968,
Apply slot
164.981,       76.273,      142.099,      477.868,           32,
Apply epoch slot
28.498,        0.000,       28.498,       28.498,            1,
Save state to file
```

cast+rewards+shr
```
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
12.898,        0.000,       12.898,       12.898,            1,
Load state from file
0.039,        0.002,        0.038,        0.054,          968,
Apply slot
139.971,       68.797,      120.088,      428.844,           32,
Apply epoch slot
24.761,        0.000,       24.761,       24.761,            1,
Save state to file

```
2020-09-16 20:59:33 +00:00
Mamy Ratsimbazafy 52548f079b
Opt-in Slashing protection + interchange (#1643)
* Slashing protection + interchange initial commit

* Restrict the when UseSlashingProtection dance in other modules

* Integrate slashing tests in other all_tests

* Add attestation slashing protection support

* Add a message that mention if built with/without slashing protection

* no op the initialization proc

* test slashing protection in Jenkins (temp)

* where to configure NIMFLAGS in Jenkins ...

* Jenkins -> ensure Built with slashing protection

* Add slashing protection complete import

* use Opt.get(otherwise)

* Don't use negation in proc name

* Turn slashing protection on by default
2020-09-16 13:30:03 +02:00
Eugene Kabanov 6e463257f4
PeerPool fixes. (#1654)
* Refactor peer_pool.
Fix eth2_network peer counters.
Fix PeerPool do not allow to add more peers when empty space available.

* Remove unused imports.

* Add test for a bug.

* Fix eth2_network disconnect should deletePeer not release.
More PeerPool refactoring.
2020-09-16 13:00:11 +03:00
Jacek Sieka c76305f824
fix some todo (#1645)
* remove some superfluous gcsafes
* remove getTailState (unused)
* don't store old epochrefs in blocks
* document attestation pool a bit
* remove `pcs =` cruft from log
2020-09-14 14:50:03 +00:00
tersec aca1a318f2
cleanly close kvstore databases and bump nim-eth (#1630)
* cleanly close kvstore databases

* close databases for all subcommands and during error conditions
2020-09-12 05:35:58 +00:00
Jacek Sieka 8a5a261fcd
Quick fix to prune some states, pending smarter state storage (#1624)
* Quick fix to prune some states, pending smarter state storage

Adverse effects might include slow rewinds - typically the protocol
doesn't ask for pre-finalized states but RPC might

* document issue, add test

* fix cache miss log
2020-09-11 10:03:50 +02:00
tersec d0de1a49a3
Fix some warnings and hints and partly revert #1610 (#1615)
* address some XDeclaredButNotUsed, ConvFromXtoItselfNotNeeded, and UnusedImport hints and warnings

* partly revert #1610
2020-09-08 11:32:43 +00:00
tersec 3d5f24f14c
stop discarding future epochs; remove a StateCache() construction (#1610)
* stop discarding non-existent future epochs during epoch state transitions; remove a pointless StateCache() construction in advance_slots()

* update nbench to pass StateCache to process_slots()
2020-09-07 15:04:33 +00:00
Viktor Kirilov d9f9949ef0 use a separate process for the private keys (Off by default) - there is a new signing_process binary which loads all validators of the beacon node and the BN dictates through stdin of the signing process what to be signed and when and reads from stdout of the process 2020-09-02 12:47:00 +03:00
Viktor Kirilov 65d7787b1e 50/50 bn/vc split for the validator keys ON by default for the testnet scripts 2020-09-01 16:39:07 +03:00
tersec ab255662df
bound block quarantine size (#1564)
* bound block quarantine size

* add additional logging for block quarantining

* re-add quarantine.add() call

* remove pre-finalization blocks; add logging for full quarantine

* clear quarantine on chain reorganization

* update block_sim and tests

* update test_attestation_pool
2020-08-31 11:00:38 +02:00
Jacek Sieka fa1621db46
implement clock disparity for attestation validation (#1568)
This implements disparity, resolving a part of
https://github.com/status-im/nim-beacon-chain/issues/1367

* make BeaconTime a duration for fractional seconds
* factor out attestation/aggregate validation
* simplify recording of queued attestations
* simplify attestation signature check
* fix blocks_received metric
* add some trivial validation tests
* remove unresolved attestation table - attestations for unknown blocks
are dropped instead (cannot verify their signature)
2020-08-27 09:34:12 +02:00
Mamy Ratsimbazafy 81788becfc
Fork choice - almost free pruning - fix #1534 (#1535)
* initial - cheaper pruning - addresses  #1534

* Pass tests: update offset when pruning, proper handling of pruned parents

* Use options instead of nil for nilable newHead (finalization passing but rootcause not solved)

* First line of defense against stackoverflow in tests

* Fix compute_delta offset after pruning

* Rebase fix - medalla ready

* Remove Option[BlockRef]
2020-08-26 17:23:34 +02:00
Jacek Sieka f26d6a4fd3
reuse validator key cache better (#1562)
new key cache can be used for old epochs in the same tree
2020-08-26 17:06:40 +02:00
Jacek Sieka 61538fa581 speed up shuffling
Replace shuffling function with zrnt version - `get_shuffled_seq` in
particular puts more strain on the GC by allocating superfluous seq's
which turns out to have a significant impact on block processing (when
replaying blocks for example) - 4x improvement on non-epoch, 1.5x on
epoch blocks (replay is done without signature checking)

Medalla, first 10k slots - pre:

```
Loaded 68973 blocks, head slot 117077
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
76855.848,        0.000,    76855.848,    76855.848,            1,
Initialize DB
1.073,        0.914,        0.071,       12.454,         7831,
Load block from database
31.382,        0.000,       31.382,       31.382,            1,
Load state from database
85.644,       30.350,        3.056,      466.136,         7519,
Apply block
506.569,       91.129,      130.654,      874.786,          312,
Apply epoch block
```

post:

```
Loaded 68973 blocks, head slot 117077
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
72457.303,        0.000,    72457.303,    72457.303,            1,
Initialize DB
1.015,        0.858,        0.070,       11.231,         7831,
Load block from database
28.983,        0.000,       28.983,       28.983,            1,
Load state from database
21.725,       17.461,        2.659,      393.217,         7519,
Apply block
324.012,       33.954,       45.452,      440.532,          312,
Apply epoch block
```
2020-08-21 16:05:10 +03:00
Jacek Sieka 22998fdfd4 avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
Jacek Sieka 46c94a18ba rework epoch cache referencing
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!

this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
2020-08-19 10:09:06 +03:00
Jacek Sieka 9da8b2692f
simplify fork choice code (#1521)
* standardize init
* avoid loading state on init
* avoid some inefficient exception-based code
* remove some TODO
2020-08-18 16:56:32 +02:00
Jacek Sieka 79ff4f7c41
fork choice refresh (#1520)
* add attestation processing queue so attestations don't get processed
too early
* rework justified slot delay to match spec / lighthouse better
* keep less state in fork choice
* request epochref less
2020-08-17 20:36:13 +02:00
tersec ed9bec0147
add EF finality tests (#1515) 2020-08-17 07:19:48 +00:00
tersec f34eddb6e9
fix get_unslashed_attesting_indices() and add official EF rewards tests for it (#1514) 2020-08-17 01:09:27 +00:00
Mamy Ratsimbazafy 454b9d0724
Bump nim-blscurve (#1491)
* Bump BLSCurve

* Use unified aggregation API

* use new blscurve with unified aggregate API

* bump

* fix toRaw

* replace state_sim combine with AggregateSignature

* Fix 32-bit

* Fix 32-bit for real and test deactivating ccache for fno-tree-lopp-vectorize flag

* change compilation switches to narrow down Linux issue

* Use -fno-tree-vectorize to disable both tree-loop-vectorize and tree-slp-vectorize

* blscurve now disables both Loop and SLP vectorization

* Add tests for the miracl/milagro fallback

* Travis has max log size of 4MB

* Test with Miracl in the finalization test

* fix state_sim log level

* Coment out the slow fallback tests
2020-08-15 19:33:58 +02:00
tersec 611c5097cc
use cache in process_voluntary_exit() (#1507) 2020-08-14 12:42:59 +00:00
Jacek Sieka 58d77153fc
fix invalid state root being written to database (#1493)
* fix invalid state root being written to database

When rewinding state data, the wrong block reference would be used when
saving the state root - this would cause state loading to fail by
loading a different state than expected, preventing blocks to be
applied.

* refactor state loading and saving to consistently use and set
StateData block
* avoid rollback when state is missing from database (as opposed to
being partially overwritten and therefore in need of rollback)
* don't store state roots for empty slots - previously, these were used
as a cache to avoid recalculating them in state transition, but this has
been superceded by hash tree root caching
* don't attempt loading states / state roots for non-epoch slots, these
are not saved to the database
* simplify rewinder and clean up funcitions after caches have been
reworked
* fix chaindag logscope
* add database reload metric
* re-enable clearance epoch tests

* names
2020-08-13 11:50:05 +02:00
tersec af3355e0f8
create local testnet mode for eth2_network (#1494) 2020-08-12 14:16:59 +00:00
Eugene Kabanov 711f1f88ee
Use one single async queue and loop for processing blocks. (#1487)
* Initial commit

* Fix compilation problem.

* Address review comments.
2020-08-12 11:29:11 +02:00
Jacek Sieka 8b0f2cc96f
share validator keys in EpochRef (#1486) 2020-08-11 21:39:53 +02:00
Zahary Karadjov 30a8ec410d More spec compliant blocksByRange requests
* Eliminate possibilities for range errors and overflows
* Handle more properly invalid requests for furute slots
* Eliminate the confusing surrounding the MAX_REQUEST_BLOCKS constant

Addresses https://github.com/status-im/nim-beacon-chain/issues/1366
2020-08-10 22:09:13 +03:00
Eugene Kabanov 55fcece0b2
SyncManager fix to process blocks one by one. (#1464)
* Allow sync manager process blocks one by one.

* Log storeBlock() and updateHead() duration.

* Calculate duration only for blocks added without any error.

* Fix float compilation error.

* Fix duration.

* Fix SyncQueue tests.
2020-08-10 09:15:50 +02:00
Jacek Sieka 1c830cf623
add state root test (#1466) 2020-08-07 22:17:24 +02:00
Viktor Kirilov 5bbeb38f2d fixes the BN/VC communication - properly getting the attestation duties & also fixed start.sh 2020-08-06 15:29:05 +03:00
protolambda e90c5440e8 make eth1 distance runtime configurable 2020-08-06 14:49:58 +03:00
Zahary Karadjov 4b8ebb5d71 Correct instructions in the README for running prometheus on Medalla 2020-08-05 19:28:35 +03:00
Jacek Sieka cf8cd8321b
Revert "Lazy crypto [alt design #1369] (#1371)" (#1435)
This reverts commit 023f7f4518.
2020-08-04 17:15:13 +00:00
Jacek Sieka c6674de5d2 use epoch ref to update fork choice
this dramatically speeds up startup in long periods of non-finality
2020-08-04 20:00:31 +03:00
tersec a5d900d052
skip hotfixed DEPOSIT_CHAIN_ID/DEPOSIT_NETWORK_ID const checks (#1434) 2020-08-04 15:53:39 +00:00
tersec df80071bcf
update attestation and block validation to v0.12.2; clean up getAncestorAt()/get_ancestor() (#1417)
* update attestation validation to v0.12.2; clean up getAncestorAt()/get_ancestor()

* update beacon block validation to v0.12.2
2020-08-03 19:47:42 +00:00
Zahary Karadjov c882b7c2f3 Add Scrypt support in the Keystores 2020-08-02 23:00:43 +03:00
Zahary Karadjov 1aba7aed6d Updated Keystore test vectors 2020-08-02 23:00:43 +03:00
Zahary Karadjov c293254ded Add 'deposits import' command; Switch to NJS when loading the keystores and improve the data validation 2020-08-02 23:00:43 +03:00
Viktor Kirilov 0a96e5f564
renamed CandidateChains to ChainDagRef and made the Quarantine type a ref type so there is a single instance in the beacon node (#1407) 2020-07-31 14:49:06 +00:00
Viktor Kirilov c032366547
removed the BlockPool type and all of the proxy functions around it (#1401)
* removed the BlockPool type and all of the proxy functions around it - passing the chain DAG and the quarantine explicitly where appropriately - they don't need to be bundled in a type

* fixed the build after the rebase
2020-07-30 21:18:17 +02:00
Jacek Sieka c5fecd472f
more fork-choice fixes (#1388)
* more fork-choice fixes

* use target block/epoch to validate attestations
* make addLocalValidators sync
* add current and previous epoch to cache before doing state transition
* update head state using clearance state as a shortcut, when possible
* use blockslot for fork choice balances
* send attestations using epochref cache

* fix invalid finalized parent being used

also simplify epoch block traversal

* single error handling style in fork choice

* import fix, remove unused async
2020-07-30 17:48:25 +02:00
Mamy Ratsimbazafy 023f7f4518
Lazy crypto [alt design #1369] (#1371)
* Lazy loading of crypto objects

* Try to fix incorrect field access by hiding fields but no luck. SSZ/Chronicles/macro bug?

* Fix incorrect blsValue access. was "aggregate" not "chronicles"

* Fix tests that rely on the internal BLSValue representation
2020-07-29 18:13:05 +00:00
Ștefan Talpalaru fa9f35e148
Jenkins: run local testnet test on macOS (#1391) 2020-07-29 14:08:27 +02:00
Jacek Sieka 157ddd2ac4
Fork choice fixes 5 (#1381)
* limit attestations kept in attestation pool

With fork choice updated, the attestation pool only needs to keep track
of attestations that will eventually end up in blocks - we can thus
limit the horizon of attestations that we keep more aggressively.

To get here, we expose getEpochRef which gets metadata about a
particular epochref, and make sure to populate it when a block is added
- this ensures that state rewinds during block addition are minimized.

In addition, we'll use the target root/epoch when validating
attestations - this helps minimize the number of different states that
we need to rewind to, in general.

* remove CandidateChains.justifiedState

unused

* remove BlockPools.Head object

* avoid quadratic quarantine loop

* fix
2020-07-28 13:54:32 +00:00
Zahary Karadjov f4c19e303a Non-interactive generation of keystores in the local sim 2020-07-28 07:36:25 +03:00