Commit Graph

1315 Commits

Author SHA1 Message Date
Dustin Brody 99e330d014 fix underflow in deposit procesing (#1542) 2020-08-21 19:00:36 +03:00
Jacek Sieka 61538fa581 speed up shuffling
Replace shuffling function with zrnt version - `get_shuffled_seq` in
particular puts more strain on the GC by allocating superfluous seq's
which turns out to have a significant impact on block processing (when
replaying blocks for example) - 4x improvement on non-epoch, 1.5x on
epoch blocks (replay is done without signature checking)

Medalla, first 10k slots - pre:

```
Loaded 68973 blocks, head slot 117077
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
76855.848,        0.000,    76855.848,    76855.848,            1,
Initialize DB
1.073,        0.914,        0.071,       12.454,         7831,
Load block from database
31.382,        0.000,       31.382,       31.382,            1,
Load state from database
85.644,       30.350,        3.056,      466.136,         7519,
Apply block
506.569,       91.129,      130.654,      874.786,          312,
Apply epoch block
```

post:

```
Loaded 68973 blocks, head slot 117077
All time are ms
Average,       StdDev,          Min,          Max,      Samples,
Test
Validation is turned off meaning that no BLS operations are performed
72457.303,        0.000,    72457.303,    72457.303,            1,
Initialize DB
1.015,        0.858,        0.070,       11.231,         7831,
Load block from database
28.983,        0.000,       28.983,       28.983,            1,
Load state from database
21.725,       17.461,        2.659,      393.217,         7519,
Apply block
324.012,       33.954,       45.452,      440.532,          312,
Apply epoch block
```
2020-08-21 16:05:10 +03:00
cheatfate 5fc07fef75 Workaround fix password issues on Windows. 2020-08-21 12:55:49 +03:00
Viktor Kirilov 678a7efaaa moved away from WithState() for the common validator duties in the API - using EpochRef 2020-08-21 11:47:43 +03:00
Jacek Sieka 22998fdfd4 avoid double deserialization
When blocks and attestations arrive, they are SSZ-decoded twice: once
for validation and once for processing. This branch enqueues the decoded
block directly for processing, avoiding the second, slow
deserialization.

* move processing of blocks and attestations to queue
* ...and out from beacon_node
* split attestation processing into attestations and aggregates
  * also updates metrics
* clean up logging to better follow the lifetime of gossip: arrival,
validation and processing
* drop attestations and aggregates if there are too many
* try to prioritise blocks and aggregates before single-validator
attestations
2020-08-21 11:46:25 +03:00
Dustin Brody bbc90afa27 fix attestation aggregation broadcasting 2020-08-21 11:32:43 +03:00
Jacek Sieka 9244ae7a38 more speedups
* evaluate block attestations under the epochref of the block - this is
what the state transition function does
* avoid copying attestation seq unnecessarily
* avoid unnecessary hashset for unslashed indices
2020-08-19 14:51:04 +03:00
Jacek Sieka 7de05efaaf small perf fixes
* don't sort shuffled_validator_indices, just get them directly with
iteration
* grab full epoch of proposer indices while we have the data available -
they'll get cached and reused
* avoid computing active validator set when not used for logging
2020-08-19 14:51:04 +03:00
Zahary Karadjov 2c19e3f8cd
[skip ci] Use GOSSIP_MAX_SIZE when snappy decoding in the inspector as well; Bumps 2020-08-19 14:33:52 +03:00
Zahary Karadjov 3433c77c35 Prevent Snappy decompression bombs 2020-08-19 10:13:04 +03:00
Jacek Sieka 46c94a18ba rework epoch cache referencing
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!

this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
2020-08-19 10:09:06 +03:00
Jacek Sieka 9da8b2692f
simplify fork choice code (#1521)
* standardize init
* avoid loading state on init
* avoid some inefficient exception-based code
* remove some TODO
2020-08-18 16:56:32 +02:00
tersec 17af7f34f4
increase Jenkins timeout from 90 to 100 minutes (#1519) 2020-08-18 07:13:53 +00:00
Jacek Sieka 79ff4f7c41
fork choice refresh (#1520)
* add attestation processing queue so attestations don't get processed
too early
* rework justified slot delay to match spec / lighthouse better
* keep less state in fork choice
* request epochref less
2020-08-17 20:36:13 +02:00
Dmitriy Ryajov 87f983c639 use split out pubsub 2020-08-17 17:24:36 +03:00
tersec 612881b95d
refactor topic (un)subscribing/validating to collate each (#1510)
* refactor topic (un)subscribing/validating to collate each

* fix comment

* tweak comment
2020-08-17 14:07:29 +02:00
tersec f34eddb6e9
fix get_unslashed_attesting_indices() and add official EF rewards tests for it (#1514) 2020-08-17 01:09:27 +00:00
tersec bc6eefe31e
add --enable-logtrace argument to launch_local_testnet (#1502)
* add --enable-logtrace argument to launch_local_testnet

* scan for all available logfiles

* remove specific filename references

* update v0.11.3 spec ref to v0.12.2
2020-08-16 11:12:19 +02:00
Mamy Ratsimbazafy 454b9d0724
Bump nim-blscurve (#1491)
* Bump BLSCurve

* Use unified aggregation API

* use new blscurve with unified aggregate API

* bump

* fix toRaw

* replace state_sim combine with AggregateSignature

* Fix 32-bit

* Fix 32-bit for real and test deactivating ccache for fno-tree-lopp-vectorize flag

* change compilation switches to narrow down Linux issue

* Use -fno-tree-vectorize to disable both tree-loop-vectorize and tree-slp-vectorize

* blscurve now disables both Loop and SLP vectorization

* Add tests for the miracl/milagro fallback

* Travis has max log size of 4MB

* Test with Miracl in the finalization test

* fix state_sim log level

* Coment out the slow fallback tests
2020-08-15 19:33:58 +02:00
tersec 611c5097cc
use cache in process_voluntary_exit() (#1507) 2020-08-14 12:42:59 +00:00
Dustin Brody 3d121d9734 remove quadratic deposit Merkle tree initialization 2020-08-14 12:33:58 +03:00
Jacek Sieka a1bd44f4b0
small spec cleanups (#1501)
* clean up logging a bit
* return error on indexed attestation check
2020-08-13 13:47:06 +00:00
Zahary Karadjov c765c5ae2d
Bugfix: Correct wallet by UIID search in 'deposits create' 2020-08-13 14:32:22 +03:00
Jacek Sieka 58d77153fc
fix invalid state root being written to database (#1493)
* fix invalid state root being written to database

When rewinding state data, the wrong block reference would be used when
saving the state root - this would cause state loading to fail by
loading a different state than expected, preventing blocks to be
applied.

* refactor state loading and saving to consistently use and set
StateData block
* avoid rollback when state is missing from database (as opposed to
being partially overwritten and therefore in need of rollback)
* don't store state roots for empty slots - previously, these were used
as a cache to avoid recalculating them in state transition, but this has
been superceded by hash tree root caching
* don't attempt loading states / state roots for non-epoch slots, these
are not saved to the database
* simplify rewinder and clean up funcitions after caches have been
reworked
* fix chaindag logscope
* add database reload metric
* re-enable clearance epoch tests

* names
2020-08-13 11:50:05 +02:00
tersec ab34584f23
initial dynamic subscribe/unsubscribe for attestations to/from subnets (#1462)
* initial dynamic subscribe/unsubscribe for attestations to/from subnets

* implement random stability subnet and clean up

* switch from HashSet[uint64] to set[uint8]

* refactor subnet logic out from beacon_node and actual (un)subscribing

* only try to subscribe to marginally different subnets

* add assertions

* maintain ENR subnets

* assert that beacon_node and eth2_network have consistent view of subscribed subnets

* disable actual cycling
2020-08-12 17:48:31 +00:00
tersec af3355e0f8
create local testnet mode for eth2_network (#1494) 2020-08-12 14:16:59 +00:00
Eugene Kabanov 711f1f88ee
Use one single async queue and loop for processing blocks. (#1487)
* Initial commit

* Fix compilation problem.

* Address review comments.
2020-08-12 11:29:11 +02:00
Jacek Sieka 5da25e76be
avoid rewind in fork choice application (#1489) 2020-08-12 04:49:52 +00:00
Jacek Sieka 8b0f2cc96f
share validator keys in EpochRef (#1486) 2020-08-11 21:39:53 +02:00
tersec 22c1ef5a8d
split subscribe into non-validating subscribe and addValidator (#1485)
* split subscribe into non-validating subscribe and addValidator

* stop exporting get_committee_assignments
2020-08-11 15:08:44 +00:00
Zahary Karadjov 224ebdfd72 A simple metric for measuring the delay in the onSecond timer 2020-08-10 23:53:55 +03:00
Zahary Karadjov 30a8ec410d More spec compliant blocksByRange requests
* Eliminate possibilities for range errors and overflows
* Handle more properly invalid requests for furute slots
* Eliminate the confusing surrounding the MAX_REQUEST_BLOCKS constant

Addresses https://github.com/status-im/nim-beacon-chain/issues/1366
2020-08-10 22:09:13 +03:00
Ștefan Talpalaru 7763df95a4
storeBlock() duration metric (#1480) 2020-08-10 19:10:43 +02:00
Jacek Sieka 2b4526e743
bls: avoid exception flow on cache miss (#1479) 2020-08-10 14:51:23 +00:00
tersec fe1a7922c8
update attestation aggregation validation to spec refs of v0.12.2 (#1481) 2020-08-10 14:49:18 +00:00
Jacek Sieka 10da7fe9da remove eth from default status bar
not viable at higher validator counts - linear scan + forced public key
init makes it extremely slow
2020-08-10 17:01:53 +03:00
Jacek Sieka 2a36949913
use epochcache for attesting (#1478) 2020-08-10 15:21:31 +02:00
Jacek Sieka 280e72f3c9
remove snappy RPC support (#1477)
removed in 0.12.2 - the flow, in particular when the other peer doesn't
support snappy, is hard to follow because of the trial-and-error
approach - removing it simplifies things and removes some of the
hard-to-read parts of the thunking etc
2020-08-10 15:18:17 +02:00
Jacek Sieka 936440fccd
use libp2p peer events to track peer (#1468)
this resolves some peer counting issues that were happening because the
lifetime future in PeerInfo was unreliable (multiple PeerInfo instances
existed per peer)

In addition, this solves another race condition: when connecting to a
peer and later dialling that protocol, it is not certain that the same
connection will be used if there's a concurrent incoming peer connection
ongoing - better not make too many assumptions about who sent statuses
when.
2020-08-10 12:58:34 +02:00
Jacek Sieka 585c410d90 remove randompeers
unused, requires importing `random` which we're trying to avoid
2020-08-10 11:48:33 +03:00
Eugene Kabanov 55fcece0b2
SyncManager fix to process blocks one by one. (#1464)
* Allow sync manager process blocks one by one.

* Log storeBlock() and updateHead() duration.

* Calculate duration only for blocks added without any error.

* Fix float compilation error.

* Fix duration.

* Fix SyncQueue tests.
2020-08-10 09:15:50 +02:00
Jacek Sieka 3b6a8a692d
cleanup unused chaindag epoch features
these are somewhat obsoleted by the more extensive use of EpochRef
2020-08-07 19:49:52 +02:00
Eugene Kabanov 38bf8ccbec
Implement tracing of lags in the logs. (#1465) 2020-08-07 16:22:58 +00:00
Jacek Sieka 84a501d1ff
remove one cache, add another (#1449)
* remove one cache, add another

This cache removes the need for rewinding in most attestation validation
flow since the attestations come from one of two epochs and must be
targetting a viable block.

Additionally, it also removes all state caches which are less likely to
be used over-all - more metrics are needed to track the rewinding.

On risk is that when chains don't finalize, we'll have lots of epochrefs
in memory meaning lots of validator key databases, most being exactly
the same. This can be addressed in any number of ways. Some of the
memory usage is mitigated by the fact that we previously had lots of big
state caches and now we're keeping only keys instead.

* cleanups

* doc
2020-08-06 19:48:47 +00:00
Dmitriy Ryajov c5077af4bc
decreate amount of concurent dials (#1460) 2020-08-06 19:21:12 +00:00
Zahary Karadjov 9861eb1152
Use the same keystore directory names as Lighthouse
Rationale: this makes moving keys between the clients eaiser

Other changes:

* Restore building with custom presets
  (defaultRuntimePreset is not a template in this mode)
2020-08-06 21:50:19 +03:00
Jacek Sieka f4c16ed0db
eh cleanups (#1458)
current exception sometimes buggy in nim
2020-08-06 18:47:39 +00:00
Zahary Karadjov b902fddd19 Allow loading keystores produced by Lighthouse
The spec allows the description to be set to 'null'
2020-08-06 17:33:57 +03:00
tersec 81b3c0ea40
update spec refs to v0.12.2 (#1457)
* update spec refs to v0.12.2 and change a .len.uint64 to .lenu64

* pull back from any non-pure-comment changes, since Jenkins is being wonky
2020-08-06 13:05:13 +00:00
Viktor Kirilov 5bbeb38f2d fixes the BN/VC communication - properly getting the attestation duties & also fixed start.sh 2020-08-06 15:29:05 +03:00