Commit Graph

112 Commits

Author SHA1 Message Date
Jacek Sieka 58d77153fc
fix invalid state root being written to database (#1493)
* fix invalid state root being written to database

When rewinding state data, the wrong block reference would be used when
saving the state root - this would cause state loading to fail by
loading a different state than expected, preventing blocks from being
applied.

* refactor state loading and saving to consistently use and set
StateData block
* avoid rollback when state is missing from database (as opposed to
being partially overwritten and therefore in need of rollback)
* don't store state roots for empty slots - previously, these were used
as a cache to avoid recalculating them in state transition, but this has
been superseded by hash tree root caching
* don't attempt loading states / state roots for non-epoch slots, these
are not saved to the database
* simplify rewinder and clean up functions after caches have been
reworked
* fix chaindag logscope
* add database reload metric
* re-enable clearance epoch tests

* names
2020-08-13 11:50:05 +02:00
Jacek Sieka 5da25e76be
avoid rewind in fork choice application (#1489) 2020-08-12 04:49:52 +00:00
Jacek Sieka 8b0f2cc96f
share validator keys in EpochRef (#1486) 2020-08-11 21:39:53 +02:00
Zahary Karadjov 30a8ec410d More spec compliant blocksByRange requests
* Eliminate possibilities for range errors and overflows
* Handle invalid requests for future slots more properly
* Eliminate the confusion surrounding the MAX_REQUEST_BLOCKS constant

Addresses https://github.com/status-im/nim-beacon-chain/issues/1366
2020-08-10 22:09:13 +03:00
Jacek Sieka 2a36949913
use epochcache for attesting (#1478) 2020-08-10 15:21:31 +02:00
Jacek Sieka 3b6a8a692d
cleanup unused chaindag epoch features
these are somewhat obsoleted by the more extensive use of EpochRef
2020-08-07 19:49:52 +02:00
Jacek Sieka 84a501d1ff
remove one cache, add another (#1449)
* remove one cache, add another

This cache removes the need for rewinding in most attestation validation
flow since the attestations come from one of two epochs and must be
targeting a viable block.

Additionally, it also removes all state caches which are less likely to
be used over-all - more metrics are needed to track the rewinding.

One risk is that when chains don't finalize, we'll have lots of epochrefs
in memory meaning lots of validator key databases, most being exactly
the same. This can be addressed in any number of ways. Some of the
memory usage is mitigated by the fact that we previously had lots of big
state caches and now we're keeping only keys instead.

* cleanups

* doc
2020-08-06 19:48:47 +00:00
Jacek Sieka deaeb62de3
clean up quarantine 2020-08-05 16:19:55 +02:00
Jacek Sieka 15b99e4c11
cache beacon proposer indices (#1440)
also clear old epochrefs as they're growing unwieldy

in particular, this speeds up gossip block validation by avoiding the
rewind
2020-08-05 08:28:43 +02:00
Dustin Brody c142de4b7f be more consistent about pubkeys fed to verify_foo_signature() not being separately initialized, while pubkeys, generally, used for matching purposes, elsewhere explicitly initialized 2020-08-04 23:00:33 +03:00
Jacek Sieka ac78e75bf8
clear missing on orphan add in quarantine (#1441) 2020-08-04 19:49:25 +00:00
Jacek Sieka 70df0ad057 don't mark quarantined blocks as missing 2020-08-04 22:37:06 +03:00
Jacek Sieka c6674de5d2 use epoch ref to update fork choice
this dramatically speeds up startup in long periods of non-finality
2020-08-04 20:00:31 +03:00
tersec df80071bcf
update attestation and block validation to v0.12.2; clean up getAncestorAt()/get_ancestor() (#1417)
* update attestation validation to v0.12.2; clean up getAncestorAt()/get_ancestor()

* update beacon block validation to v0.12.2
2020-08-03 19:47:42 +00:00
Viktor Kirilov 0a96e5f564
renamed CandidateChains to ChainDagRef and made the Quarantine type a ref type so there is a single instance in the beacon node (#1407) 2020-07-31 14:49:06 +00:00
tersec e0a6f58abe
convert 10 v0.12.1 spec refs to v0.12.2 (#1406) 2020-07-31 09:59:14 +00:00
Viktor Kirilov c032366547
removed the BlockPool type and all of the proxy functions around it (#1401)
* removed the BlockPool type and all of the proxy functions around it - passing the chain DAG and the quarantine explicitly where appropriate - they don't need to be bundled in a type

* fixed the build after the rebase
2020-07-30 21:18:17 +02:00
Jacek Sieka c5fecd472f
more fork-choice fixes (#1388)
* more fork-choice fixes

* use target block/epoch to validate attestations
* make addLocalValidators sync
* add current and previous epoch to cache before doing state transition
* update head state using clearance state as a shortcut, when possible
* use blockslot for fork choice balances
* send attestations using epochref cache

* fix invalid finalized parent being used

also simplify epoch block traversal

* single error handling style in fork choice

* import fix, remove unused async
2020-07-30 17:48:25 +02:00
tersec 99f2d8e06c
update 14 v0.12.1 spec refs to v0.12.2 (#1400) 2020-07-30 09:47:57 +00:00
Jacek Sieka 157ddd2ac4
Fork choice fixes 5 (#1381)
* limit attestations kept in attestation pool

With fork choice updated, the attestation pool only needs to keep track
of attestations that will eventually end up in blocks - we can thus
limit the horizon of attestations that we keep more aggressively.

To get here, we expose getEpochRef which gets metadata about a
particular epochref, and make sure to populate it when a block is added
- this ensures that state rewinds during block addition are minimized.

In addition, we'll use the target root/epoch when validating
attestations - this helps minimize the number of different states that
we need to rewind to, in general.

* remove CandidateChains.justifiedState

unused

* remove BlockPools.Head object

* avoid quadratic quarantine loop

* fix
2020-07-28 13:54:32 +00:00
Jacek Sieka fd4d319450
Use fork v2 (#1358)
* fork choice fixes, round 3

* introduce checkpoint tracker
* split out fork choice backend that is independent of dag
* correctly update best checkpoint to use for head selection
* correctly consider wall clock when processing attestations
* preload head history only (only one history is loaded from database
anyway)
* love the DAG

* switch to fork choice v2

also remove BlockRef.children

* fix
2020-07-25 21:41:12 +02:00
Jacek Sieka fb2f742972
Fork choice fixes 2 (#1356)
* fork choice cleanup

* enable v2 pruning
* prefer `get_current_epoch`
* fix finalization check to use correct epoch

* small cleanups

* add `count_active_validators`
* remove misleading logs
* fix justified checkpoint slot calculation in rpc
2020-07-22 23:01:44 +02:00
Jacek Sieka f0720faf17
Fork choice fixes (#1350)
* remove cruft

* reenable fork choice and fix several issues

* in addForkChoice_v2, the `.error` field would be accessed even when
Result is ok
* remove workaround for invalid block structure in fork choice
* fix `tmpState` being used recursively in callback, causing state
corruption while processing attestation
* fix block callback being called twice per block
* pass state to callback to avoid unnecessary rewinding

* enable head select, fix another bug

* never use `get` without `isOk`
* log nil blockref in case blockref is nil

* add missing error checking

* use correct epoch when updating attestation message
2020-07-22 11:42:55 +02:00
tersec 4a9a7be271
faster syncing (#1348)
* maybe faster syncing

* 80-character lines

* remove instrumentation debugEchos; fix target attestation epoch in attestation pool validation

* use the epoch-granularity matching in attestation.addResolved(...)
2020-07-22 09:51:45 +02:00
Jacek Sieka 7e0bf7b1b5
Add separate state for clearance (#1352)
When clearing blocks, a callback is called - this callback, if it uses
`tmpState`, will be corrupted because it's not fully up to date when the
callback is called - we thus introduce a specific state cache for this
purpose - ideally, it can be removed later when epoch caching is
improved.

Incidentally, this helps block sync speed a lot - without this state,
the block sync would ping-pong between attestation state and block state
which is costly.
2020-07-22 08:25:13 +02:00
tersec 83abbcb917
drop get_attesting_indices()/get_unslashed_attesting_indices() from 15% to 1% of workload at block_sim at 100k validators (#1351) 2020-07-21 18:35:43 +02:00
Jacek Sieka 8b01284b0e
cache block hash (#1329)
hash_tree_root was turning up when running beacon_node, turns out to be
repeated hash_tree_root invocations - this pr brings them back down to
normal.

this PR caches the root of a block in the SignedBeaconBlock object -
this has the potential downside that even invalid blocks will be hashed
(as part of deserialization) - later, one could imagine delaying this
until checks have passed

there's also some cleanup of the `cat=` logs which were applied randomly
and haphazardly, and to a large degree are duplicated by other
information in the log statements - in particular, topics fulfill the
same role
2020-07-16 15:16:51 +02:00
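
The caching idea described in the commit above (#1329) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Nim code: `SignedBlock`, `compute_root`, and `decode_signed_block` are hypothetical names, and SHA-256 is only a stand-in for the SSZ `hash_tree_root`.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedBlock:
    """Hypothetical stand-in for SignedBeaconBlock."""
    message: bytes    # serialized block contents
    signature: bytes
    root: bytes       # computed once, at decode time, then reused


def compute_root(message: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the SSZ hash_tree_root.
    return hashlib.sha256(message).digest()


def decode_signed_block(message: bytes, signature: bytes) -> SignedBlock:
    # The root is computed as part of deserialization, so even blocks that
    # later fail validation pay for one hash (the downside the commit notes).
    return SignedBlock(message, signature, compute_root(message))


block = decode_signed_block(b"example block", b"\x00" * 96)
# Later consumers read block.root instead of hashing the message again.
cached_root = block.root
```

Delaying the hash until validation has passed, as the commit message suggests, would amount to computing `root` lazily on first access rather than in `decode_signed_block`.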
tersec 26e893ffc2
restore EpochRef and flush statecaches on epoch transitions (#1312)
* restore EpochRef and flush statecaches on epoch transitions

* more targeted cache invalidation

* remove get_empty_per_epoch_cache(); implement simpler but still faster get_beacon_proposer_index()/compute_proposer_index() approach; add some abstraction layer for accessing the shuffled validator indices cache

* reduce integer type conversions

* remove most of rest of integer type conversion in compute_proposer_index()
2020-07-15 12:44:18 +02:00
tersec 853bd5b799
quick workaround for epochref cache issue (#1296)
* quick workaround for epochref cache issue

* disable assertion which doesn't work without epochref caches

* get local testnets and altona running again
2020-07-12 17:09:49 +02:00
Jacek Sieka d6f317950c
Compute state root instead of loading from database (#1297)
htr is fast now, so hitting the database to load the state root is no
longer motivated - the simpler code is less vulnerable to database
corruption.
2020-07-10 22:47:39 +02:00
Zahary Karadjov 3ec6a02b12
Merge devel and resolve conflicts 2020-07-10 02:02:40 +03:00
tersec 61b0b5af17
update most remaining non-fork-choice spec refs, updating code where necessary (#1292)
* update most of the remaining non-fork-choice spec refs, updating code where necessary

* revert presumably harmless compute_signing_root() change, but this way, keep things really unchanged outside inspector
2020-07-09 11:43:27 +00:00
Mamy Ratsimbazafy 3cdae9f6be
Dual headed fork choice [Revolution] (#1238)
* Dual headed fork choice

* fix finalizedEpoch not moving

* reduce fork choice verbosity

* Add failing tests due to pruning

* Properly handle duplicate blocks in sync

* test_block_pool also add a test for duplicate blocks

* comments addressing review

* Fix fork choice v2, was missing integrating block proposed

* remove a spurious debug writeStackTrace

* update block_sim

* Use OrderedTable to ensure that we always load parents before children in fork choice

* Load the DAG data in fork choice at init if there is some (can sync witti)

* Cluster of quarantined blocks were not properly added to the fork choice

* Workaround async gcsafe warnings

* Update blockpool tests

* Do the callback before clearing the quarantine

* Revert OrderedTable, implement topological sort of DAG, allow forkChoice to be initialized from arbitrary finalized heads

* Make it work with latest devel - Altona readiness

* Add a recovery mechanism when forkchoice desyncs with blockpool

* add the current problematic node to the stack

* Fix rebase indentation bug (but still producing invalid block)

* Fix cache at epoch boundaries and lateBlock addition
2020-07-09 11:29:32 +02:00
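
The "topological sort of DAG" step mentioned in the commit above (#1238), which ensures parents are loaded into fork choice before their children, might look something like the sketch below. This is an assumption-laden Python illustration rather than the project's Nim implementation: `parents_first` and the `parent_of` mapping are hypothetical, and block roots are represented as plain strings.

```python
from collections import deque
from typing import Dict, List, Optional


def parents_first(parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Order block roots so that every parent precedes its children.

    parent_of maps a block root to its parent root. A parent of None, or a
    parent that is not itself in the map, marks a starting point (for
    example a finalized head).
    """
    children: Dict[str, List[str]] = {}
    queue = deque()
    for block, parent in parent_of.items():
        if parent is None or parent not in parent_of:
            queue.append(block)          # no known parent: start here
        else:
            children.setdefault(parent, []).append(block)

    order: List[str] = []
    while queue:
        block = queue.popleft()
        order.append(block)              # parent is emitted first...
        queue.extend(children.get(block, []))  # ...then its children
    return order


# Hypothetical DAG: "a" is the finalized head, "b" and "d" fork from it.
dag = {"c": "b", "b": "a", "a": None, "d": "a"}
order = parents_first(dag)
```

Because each block has exactly one parent, a breadth-first walk from the starting points is enough to obtain a valid ordering regardless of the order in which blocks appear in the input map.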
Dustin Brody 4140b3b9d9 update 29 spec refs to v0.12.1 2020-07-08 20:49:25 +00:00
Zahary Karadjov 318b225ccd
Merge devel and resolve the conflicts 2020-07-08 15:36:03 +03:00
Dustin Brody fc8502c54e halve memory usage from state caches 2020-07-08 10:21:41 +00:00
Zahary Karadjov c4af4e2f35
Working test suite with run-time presets 2020-07-08 02:02:14 +03:00
tersec c64737e7f2
implement aggregated attestation receiving/validating (#1272)
* implement aggregated attestation receiving/validating

* document the conditions without explicit implementations in isValidAggregatedAttestation()
2020-07-02 16:15:27 +00:00
Jacek Sieka 66c230ffd1
check that parent of added block is sufficiently recent (#1269)
Otherwise, we might introduce a fork into the DAG that is no longer
viable, creating trouble for both sync and fork choice
2020-07-01 17:21:21 +02:00
Jacek Sieka 1301600341
Trusted blocks (#1227)
* cleanups

* fix ncli state root check flag
* add block dump to ncli_db
* limit ncli_db benchmark length
* tone down finalization logs

* introduce trusted blocks

We only store blocks whose signature we've verified in the database - as
such, there's no need to check it again, and most importantly, no need
to deserialize the signature when loading from database.

50x startup time improvement, 200x block load time improvement.

* fix rewinding when deposits have invalid signature
* speed up ancestor iteration by avoiding copy
* avoid deserializing signatures for trusted data
* load blocks lazily when rewinding (less memory used)

* chronicles workarounds

* document trustedbeaconblock
2020-06-25 12:23:10 +02:00
Mamy Ratsimbazafy 902093f57c
Revert "Dual headed fork choice [Reloaded] (#1223)" (#1234)
This reverts commit 6836d41ebd.
2020-06-25 11:36:03 +02:00
Mamy Ratsimbazafy 6836d41ebd
Dual headed fork choice [Reloaded] (#1223)
* Dual headed fork choice

* fix finalizedEpoch not moving

* reduce fork choice verbosity

* Add failing tests due to pruning

* Properly handle duplicate blocks in sync

* test_block_pool also add a test for duplicate blocks

* comments addressing review
2020-06-24 20:24:36 +02:00
tersec 807b920c19
state_transition implements the spec fairly directly (#1220) 2020-06-23 13:54:24 +00:00
tersec a683656238
send and validate with v0.12.1 attestations (#1213)
* send and validate with v0.12.1 attestations

* use EpochRef instead of empty cache in attestation validation
2020-06-23 10:38:59 +00:00
Eugene Kabanov f60235b3e9
Attestation validator now populates list of missing blocks. (#1211) 2020-06-23 11:29:08 +02:00
tersec dc1a565b3f
support v0.12.1 attestation topics in beacon node/inspector subscribing (#1187)
* support v0.12.1 attestation topics in beacon node and inspector subscribing

* bump is_valid_merkle_branch() spec ref
2020-06-18 15:10:25 +02:00
Eugene Kabanov 4436c85ff7
Forward sync refactoring. (#1191)
* Forward sync refactoring.
Rename Quarantine.pending to Quarantine.orphans.
Removing "old" fields.

* Fix test's FetchRecord.

* Fix `checkResponse` to not allow duplicates in response.
2020-06-18 12:03:36 +02:00
Dustin Brody ffca27b45f update 24 v0.11.x spec refs to v0.12.1 2020-06-17 12:11:03 +00:00
Jacek Sieka 49e9167b28 clean up dump feature
* don't write blocks that get added to database
* don't write states
* write to folders
* add state dumping feature to `ncli_db` to get any known state from the
database
2020-06-16 13:44:37 +00:00
Jacek Sieka 89e4819ce9
collect signature production and verification in one place (#1179)
* collect signature production and verification in one place

Signatures are made over data and domain - here we collect all such
activities in one place.

Also:
* security: fix cast-before-range-check
* log block/attestation verification consistently
* run block verification based on `getProposer` in its own history
* clean up some unused stuff

* import

* missing raises
2020-06-16 07:45:04 +02:00
Jacek Sieka 78b767f645
avoid genericAssign for beacon node types (#1166)
* avoid genericAssign for beacon node types

ok, I got fed up of this function messing up cpu measurements - it's so
ridiculously slow, it's sad.

before, while syncing:

```
  40,65%  beacon_node_shared_witti_0  [.] genericAssignAux__U5DxFPRpHCCZDKWQzM9adaw
   9,02%  libc-2.31.so                [.] __memmove_avx_unaligned_erms
   7,07%  beacon_node_shared_witti_0  [.] BIG_384_58_monty
   5,19%  beacon_node_shared_witti_0  [.] BIG_384_58_mul
   2,72%  beacon_node_shared_witti_0  [.] memcpy@plt
   1,18%  [kernel]                    [k] rb_next
   1,17%  beacon_node_shared_witti_0  [.] genericReset
   1,06%  [kernel]                    [k] map_private_extent_buffer
```

after:

```
  24,88%  beacon_node_shared_witti_0  [.] BIG_384_58_monty
  20,29%  beacon_node_shared_witti_0  [.] BIG_384_58_mul
   3,15%  beacon_node_shared_witti_0  [.] BIG_384_58_norm
   2,93%  beacon_node_shared_witti_0  [.] BIG_384_58_add
   2,55%  beacon_node_shared_witti_0  [.] BIG_384_58_sqr
   1,64%  beacon_node_shared_witti_0  [.] BIG_384_58_mod
   1,63%  beacon_node_shared_witti_0  [.] sha256Transform__BJNBQtWr9bJwzqbyfKXd38Q
   1,48%  beacon_node_shared_witti_0  [.] FP_BLS381_add
   1,39%  beacon_node_shared_witti_0  [.] BIG_384_58_sub
   1,33%  beacon_node_shared_witti_0  [.] BIG_384_58_dnorm
   1,14%  beacon_node_shared_witti_0  [.] FP2_BLS381_mul
   1,05%  beacon_node_shared_witti_0  [.] BIG_384_58_cmove
   1,05%  beacon_node_shared_witti_0  [.] get_shuffled_seq__4uncAHNsSG3Pndo5H11U9aQ
```

* better field iteration
2020-06-12 21:10:22 +02:00
Jacek Sieka 42832cefa8
Small fixes (#1165)
* random fixes

* create dump dir on startup
* don't crash on failure to write dump
* fix a few `uint64` instances being used when indexing arrays - this
should be a compile error but isn't due to compiler bugs
* fix standalone test_block_pool compilation
* add signed block processing in ncli

* reuse cache entry instead of allocating a new one

* allow for small clock disparities when validating blocks
2020-06-12 18:43:20 +02:00
tersec c8f24ae3b8
Remove three skipMerkleValidation usages (#1164)
* remove three skipMerkleValidation usages

* remove a couple obsolete comments/TODOs
2020-06-12 18:03:46 +02:00
Mamy Ratsimbazafy ce897fe83f
[Split fork choice PR] Derisk-ed attestation checks changes (#1154)
* Derisked attestation pool improvements

* tune down frequent logs

* VoteTracker logging
2020-06-10 08:58:12 +02:00
Dustin Brody 74dc2fffa6 3x blocksim speedup by using EpochRef in attestation pool addResolved(...) 2020-06-05 13:02:35 +00:00
Jacek Sieka 56ffb696be
reorder ssz (#1099)
* reorder ssz

* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports

* docs, imports
2020-06-03 15:52:02 +02:00
tersec a327e8581b
switch state transition caching to match EpochRef (#1089)
* switch state transition caching usage to shuffled active validator indices to match EpochRef

* refactor the EpochRef -> StateCache transformation; elide pointless mapIt

* limit state passed between get_beacon_committee(...) and compute_committee(...)

* tweaks
2020-06-01 09:44:50 +02:00
tersec b5f45db5e9
keep cache of per-epoch items in block pool (#1068)
* plumbing between block pool and state transition functions around active validator indices and committees

* have shared epochrefs followed by blockref tree while allowing for skipped slots

* factor out the epoch info extraction; document how the EpochRef follows forks
2020-05-29 08:10:20 +02:00
Dustin Brody 0929d90d93 unexport candidate_chains.init; some spec version bumps 2020-05-26 05:06:37 +00:00
Jacek Sieka f06df1cea6 remove some copies
* in makeBeaconBlock - use rollback instead
* in tests - this helps state_sim give more accurate data and makes it
30% faster
* fix some usages of raw BeaconState
2020-05-22 17:15:35 +00:00
Jacek Sieka 7fbb8c0bc2
return block result details (#1049) 2020-05-21 19:08:31 +02:00
Mamy Ratsimbazafy c014f0b301
Split quarantine (#1038)
* split blockpool into hotDB and Quarantine

* Rename hotdb -> dag/candidate chains
2020-05-19 16:18:07 +02:00