Commit Graph

38 Commits

Author SHA1 Message Date
tersec d69ab20d47 address issue #1552 (#1601) 2020-09-16 19:21:59 +02:00
tersec a52b0436fd bound block quarantine size (#1564)
* bound block quarantine size

* add additional logging for block quarantining

* re-add quarantine.add() call

* remove pre-finalization blocks; add logging for full quarantine

* clear quarantine on chain reorganization

* update block_sim and tests

* update test_attestation_pool
2020-09-16 19:21:59 +02:00
Jacek Sieka 1ef98c36ac reuse validator key cache better (#1562)
new key cache can be used for old epochs in the same tree
2020-09-16 19:21:59 +02:00
Jacek Sieka 46c94a18ba rework epoch cache referencing
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!

this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
2020-08-19 10:09:06 +03:00
Jacek Sieka 58d77153fc
fix invalid state root being written to database (#1493)
* fix invalid state root being written to database

When rewinding state data, the wrong block reference would be used when
saving the state root - this would cause state loading to fail by
loading a different state than expected, preventing blocks to be
applied.

* refactor state loading and saving to consistently use and set
StateData block
* avoid rollback when state is missing from database (as opposed to
being partially overwritten and therefore in need of rollback)
* don't store state roots for empty slots - previously, these were used
as a cache to avoid recalculating them in state transition, but this has
been superceded by hash tree root caching
* don't attempt loading states / state roots for non-epoch slots, these
are not saved to the database
* simplify rewinder and clean up funcitions after caches have been
reworked
* fix chaindag logscope
* add database reload metric
* re-enable clearance epoch tests

* names
2020-08-13 11:50:05 +02:00
Jacek Sieka 8b0f2cc96f
share validator keys in EpochRef (#1486) 2020-08-11 21:39:53 +02:00
Jacek Sieka 15b99e4c11
cache beacon proposer indices (#1440)
also clear old epochrefs as they're growing unwieldy

in particular, this speeds up gossip block validation by avoiding the
rewind
2020-08-05 08:28:43 +02:00
Jacek Sieka 70df0ad057 don't mark quarantined blocks as missing 2020-08-04 22:37:06 +03:00
tersec df80071bcf
update attestation and block validation to v0.12.2; clean up getAncestorAt()/get_ancestor() (#1417)
* update attestation validation to v0.12.2; clean up getAncestorAt()/get_ancestor()

* update beacon block validation to v0.12.2
2020-08-03 19:47:42 +00:00
Viktor Kirilov 0a96e5f564
renamed CandidateChains to ChainDagRef and made the Quarantine type a ref type so there is a single instance in the beacon node (#1407) 2020-07-31 14:49:06 +00:00
Viktor Kirilov c032366547
removed the BlockPool type and all of the proxy functions around it (#1401)
* removed the BlockPool type and all of the proxy functions around it - passing the chain DAG and the quarantine explicitly where appropriately - they don't need to be bundled in a type

* fixed the build after the rebase
2020-07-30 21:18:17 +02:00
Jacek Sieka c5fecd472f
more fork-choice fixes (#1388)
* more fork-choice fixes

* use target block/epoch to validate attestations
* make addLocalValidators sync
* add current and previous epoch to cache before doing state transition
* update head state using clearance state as a shortcut, when possible
* use blockslot for fork choice balances
* send attestations using epochref cache

* fix invalid finalized parent being used

also simplify epoch block traversal

* single error handling style in fork choice

* import fix, remove unused async
2020-07-30 17:48:25 +02:00
tersec 99f2d8e06c
update 14 v0.12.1 spec refs to v0.12.2 (#1400) 2020-07-30 09:47:57 +00:00
Jacek Sieka 157ddd2ac4
Fork choice fixes 5 (#1381)
* limit attestations kept in attestation pool

With fork choice updated, the attestation pool only needs to keep track
of attestations that will eventually end up in blocks - we can thus
limit the horizon of attestations that we keep more aggressively.

To get here, we expose getEpochRef which gets metadata about a
particular epochref, and make sure to populate it when a block is added
- this ensures that state rewinds during block addition are minimized.

In addition, we'll use the target root/epoch when validating
attestations - this helps minimize the number of different states that
we need to rewind to, in general.

* remove CandidateChains.justifiedState

unused

* remove BlockPools.Head object

* avoid quadratic quarantine loop

* fix
2020-07-28 13:54:32 +00:00
Jacek Sieka fb2f742972
Fork choice fixes 2 (#1356)
* fork choice cleanup

* enable v2 pruning
* prefer `get_current_epoch`
* fix finalization check to use correct epoch

* small cleanups

* add `count_active_validators`
* remove misleading logs
* fix justified checkpoint slot calculation in rpc
2020-07-22 23:01:44 +02:00
Jacek Sieka f0720faf17
Fork choice fixes (#1350)
* remove cruft

* reenable fork choice and fix several issues

* in addForkChoice_v2, the `.error` field would be accessed even when
Result is ok
* remove workaround for invalid block structure in fork choice
* fix `tmpState` being used recursively in callback, causing state
corruption while processing attestation
* fix block callback being called twice per block
* pass state to callback to avoid unnecessary rewinding

* enable head select, fix another bug

* never use `get` without `isOk`
* log nil blockref in case blockref is nil

* add missing error checking

* use correct epoch when updating attestation message
2020-07-22 11:42:55 +02:00
Jacek Sieka 7e0bf7b1b5
Add separate state for clearance (#1352)
When clearing blocks, a callback is called - this callback, if it uses
`tmpState`, will be corrupted because it's not fully up to date when the
callback is called - we thus introduce a specific state cache for this
purpose - ideally, it can be removed later when epoch caching is
improved.

Incidentally, this helps block sync speed a lot - without this state,
the block sync would ping-pong between attestation state and block state
which is costly.
2020-07-22 08:25:13 +02:00
Jacek Sieka 8b01284b0e
cache block hash (#1329)
hash_tree_root was turning up when running beacon_node, turns out to be
repeated hash_tree_root invocations - this pr brings them back down to
normal.

this PR caches the root of a block in the SignedBeaconBlock object -
this has the potential downside that even invalid blocks will be hashed
(as part of deserialization) - later, one could imagine delaying this
until checks have passed

there's also some cleanup of the `cat=` logs which were applied randomly
and haphazardly, and to a large degree are duplicated by other
information in the log statements - in particular, topics fulfill the
same role
2020-07-16 15:16:51 +02:00
Zahary Karadjov 3ec6a02b12
Merge devel and resolve conflicts 2020-07-10 02:02:40 +03:00
Mamy Ratsimbazafy 3cdae9f6be
Dual headed fork choice [Revolution] (#1238)
* Dual headed fork choice

* fix finalizedEpoch not moving

* reduce fork choice verbosity

* Add failing tests due to pruning

* Properly handle duplicate blocks in sync

* test_block_pool also add a test for duplicate blocks

* comments addressing review

* Fix fork choice v2, was missing integrating block proposed

* remove a spurious debug writeStackTrace

* update block_sim

* Use OrderedTable to ensure that we always load parents before children in fork choice

* Load the DAG data in fork choice at init if there is some (can sync witti)

* Cluster of quarantined blocks were not properly added to the fork choice

* Workaround async gcsafe warnings

* Update blockpoool tests

* Do the callback before clearing the quarantine

* Revert OrderedTable, implement topological sort of DAG, allow forkChoice to be initialized from arbitrary finalized heads

* Make it work with latest devel - Altona readyness

* Add a recovery mechanism when forkchoice desyncs with blockpool

* add the current problematic node to the stack

* Fix rebase indentation bug (but still producing invalid block)

* Fix cache at epoch boundaries and lateBlock addition
2020-07-09 11:29:32 +02:00
Zahary Karadjov c4af4e2f35
Working test suite with run-time presets 2020-07-08 02:02:14 +03:00
Jacek Sieka 66c230ffd1
check that parent of added block is sufficiently recent (#1269)
Otherwise, we might introduce a fork into the DAG that is no longer
viable, creating trouble for both sync and fork choice
2020-07-01 17:21:21 +02:00
Mamy Ratsimbazafy 902093f57c
Revert "Dual headed fork choice [Reloaded] (#1223)" (#1234)
This reverts commit 6836d41ebd.
2020-06-25 11:36:03 +02:00
Mamy Ratsimbazafy 6836d41ebd
Dual headed fork choice [Reloaded] (#1223)
* Dual headed fork choice

* fix finalizedEpoch not moving

* reduce fork choice verbosity

* Add failing tests due to pruning

* Properly handle duplicate blocks in sync

* test_block_pool also add a test for duplicate blocks

* comments addressing review
2020-06-24 20:24:36 +02:00
tersec 807b920c19
state_transition implements the spec fairly directly (#1220) 2020-06-23 13:54:24 +00:00
Eugene Kabanov 4436c85ff7
Forward sync refactoring. (#1191)
* Forward sync refactoring.
Rename Quarantine.pending to Quarantine.orphans.
Removing "old" fields.

* Fix test's FetchRecord.

* Fix `checkResponse` to not allow duplicates in response.
2020-06-18 12:03:36 +02:00
Dustin Brody ffca27b45f update 24 v0.11.x spec refs to v0.12.1 2020-06-17 12:11:03 +00:00
Jacek Sieka 49e9167b28 clean up dump feature
* don't write blocks that get added to database
* don't write states
* write to folders
* add state dumping feature to `ncli_db` to get any known state from the
database
2020-06-16 13:44:37 +00:00
Jacek Sieka 89e4819ce9
collect signature production and verificaiton in one place (#1179)
* collect signature production and verificaiton in one place

Signatures are made over data and domain - here we collect all such
activities in one place.

Also:
* security: fix cast-before-range-check
* log block/attestation verification consistently
* run block verification based on `getProposer` in its own history
* clean up some unused stuff

* import

* missing raises
2020-06-16 07:45:04 +02:00
Jacek Sieka 78b767f645
avoid genericAssign for beacon node types (#1166)
* avoid genericAssign for beacon node types

ok, I got fed up of this function messing up cpu measurements - it's so
ridiculously slow, it's sad.

before, while syncing:

```
40,65%  beacon_node_shared_witti_0  [.]
genericAssignAux__U5DxFPRpHCCZDKWQzM9adaw
   9,02%  libc-2.31.so                [.] __memmove_avx_unaligned_erms
   7,07%  beacon_node_shared_witti_0  [.] BIG_384_58_monty
   5,19%  beacon_node_shared_witti_0  [.] BIG_384_58_mul
   2,72%  beacon_node_shared_witti_0  [.] memcpy@plt
   1,18%  [kernel]                    [k] rb_next
   1,17%  beacon_node_shared_witti_0  [.] genericReset
   1,06%  [kernel]                    [k] map_private_extent_buffer
```

after:

```
  24,88%  beacon_node_shared_witti_0  [.] BIG_384_58_monty
  20,29%  beacon_node_shared_witti_0  [.] BIG_384_58_mul
   3,15%  beacon_node_shared_witti_0  [.] BIG_384_58_norm
   2,93%  beacon_node_shared_witti_0  [.] BIG_384_58_add
   2,55%  beacon_node_shared_witti_0  [.] BIG_384_58_sqr
   1,64%  beacon_node_shared_witti_0  [.] BIG_384_58_mod
1,63%  beacon_node_shared_witti_0  [.]
sha256Transform__BJNBQtWr9bJwzqbyfKXd38Q
   1,48%  beacon_node_shared_witti_0  [.] FP_BLS381_add
   1,39%  beacon_node_shared_witti_0  [.] BIG_384_58_sub
   1,33%  beacon_node_shared_witti_0  [.] BIG_384_58_dnorm
   1,14%  beacon_node_shared_witti_0  [.] FP2_BLS381_mul
   1,05%  beacon_node_shared_witti_0  [.] BIG_384_58_cmove
1,05%  beacon_node_shared_witti_0  [.]
get_shuffled_seq__4uncAHNsSG3Pndo5H11U9aQ
```

* better field iteration
2020-06-12 21:10:22 +02:00
Jacek Sieka 42832cefa8
Small fixes (#1165)
* random fixes

* create dump dir on startup
* don't crash on failure to write dump
* fix a few `uint64` instances being used when indexing arrays - this
should be a compile error but isn't due to compiler bugs
* fix standalone test_block_pool compilation
* add signed block processing in ncli

* reuse cache entry instead of allocating a new one

* allow for small clock disparities when validating blocks
2020-06-12 18:43:20 +02:00
Mamy Ratsimbazafy ce897fe83f
[Split fork choice PR] Derisk-ed attestation checks changes (#1154)
* Derisked attestation pool improvements

* tune down frequent logs

* VoteTracker logging
2020-06-10 08:58:12 +02:00
Dustin Brody 74dc2fffa6 3x blocksim speedup by using EpochRef in attestation pool addResolved(...) 2020-06-05 13:02:35 +00:00
Jacek Sieka 56ffb696be
reorder ssz (#1099)
* reorder ssz

* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports

* docs, imports
2020-06-03 15:52:02 +02:00
tersec a327e8581b
switch state transition caching to match EpochRef (#1089)
* switch state transition caching usage to shuffled active validator indices to match EpochRef

* refactor the EpochRef -> StateCache transformation; elide pointless mapIt

* limit state passed between get_beacon_committee(...) and compute_committee(...)

* tweaks
2020-06-01 09:44:50 +02:00
tersec b5f45db5e9
keep cache of per-epoch items in block pool (#1068)
* plumbing between block pool and state transition functions around active validator indices and committees

* have shared epochrefs followed by blockref tree while allowing for skipped slots

* factor out the epoch info extraction; document how the EpochRef follows forks
2020-05-29 08:10:20 +02:00
Jacek Sieka 7fbb8c0bc2
return block result details (#1049) 2020-05-21 19:08:31 +02:00
Mamy Ratsimbazafy c014f0b301
Split quarantine (#1038)
* split blockpool into hotDB and Quarantine

* Rename hotdb -> dag/candidate chains
2020-05-19 16:18:07 +02:00