This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.
The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.
Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.
A basic greed scoring mechanism is used to select attestations for
blocks - attestations are added based on how much many new votes they
bring to the table.
* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes#2490)
* only deserialize attestation and aggregation gossiped signatures once
* re-indent some aggregate checks into block scope
* spelling
* remove debugging assertion
* put part of gossip validation back into block context
* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free
* remove more development doAsserts; don't exportRaw in loops
* batch attestations
* Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518
* Try to remove the processing loop to no avail :/
* batch aggregates
* use resultsBuffer size for triggering deadline schedule
* pass attestation pool tests
* Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518)
* Put logging at debug level, add speed info
* remove unnecessary batch info when it is known to be one
* downgrade some logs to trace level
* better comments [skip ci]
* Address most review comments
* only use ref for async proc
* fix exceptions in eth2_network
* update async exceptions in gossip_validation
* eth2_network 2nd pass
* change to sleepAsync
* Update beacon_chain/gossip_processing/batch_validation.nim
Co-authored-by: Jacek Sieka <jacek@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>
* Deferred DAG and fork choice pruning
* fixup
* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSLotEnd for state pruning
* no need to store needPruning in the data structure
* lastPrunePoint is updated in pruning proc
* Split eager and LazyPruning
* enforce pruning in updateHead
* use IntSet rather than HashSet[ValidatorIndex]
* add bounds check before uint64 -> int conversion
* use intsets in block transitions
* remove superfluous Nim issue explanation/reference
* fix subnet calculation in RPC and insert broadcast attestations into node's pool
* unify codepaths to ensure only mostly-checked-to-be-valid attestations enter the pool, even from node's own broadcasts
* update attestation pool tests for new validateAttestation param
* detect already-aggregate-voted condition before attestation pool; add is_aggregator tests
* replace pair of attestation-per-epoch tracking lists with single list and remove Option use
* fix attestation condition
* use safer type conversions; add more is_aggregator tests
* remove some superfluous gcsafes
* remove getTailState (unused)
* don't store old epochrefs in blocks
* document attestation pool a bit
* remove `pcs =` cruft from log
* stop discarding non-existent future epochs during epoch state transitions; remove a pointless StateCache() construction in advance_slots()
* update nbench to pass StateCache to process_slots()
This implements disparity, resolving a part of
https://github.com/status-im/nim-beacon-chain/issues/1367
* make BeaconTime a duration for fractional seconds
* factor out attestation/aggregate validation
* simplify recording of queued attestations
* simplify attestation signature check
* fix blocks_received metric
* add some trivial validation tests
* remove unresolved attestation table - attestations for unknown blocks
are dropped instead (cannot verify their signature)
* initial - cheaper pruning - addresses #1534
* Pass tests: update offset when pruning, proper handling of pruned parents
* Use options instead of nil for nilable newHead (finalization passing but rootcause not solved)
* First line of defense against stackoverflow in tests
* Fix compute_delta offset after pruning
* Rebase fix - medalla ready
* Remove Option[BlockRef]
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!
this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
* add attestation processing queue so attestations don't get processed
too early
* rework justified slot delay to match spec / lighthouse better
* keep less state in fork choice
* request epochref less
* Bump BLSCurve
* Use unified aggregation API
* use new blscurve with unified aggregate API
* bump
* fix toRaw
* replace state_sim combine with AggregateSignature
* Fix 32-bit
* Fix 32-bit for real and test deactivating ccache for fno-tree-lopp-vectorize flag
* change compilation switches to narrow down Linux issue
* Use -fno-tree-vectorize to disable both tree-loop-vectorize and tree-slp-vectorize
* blscurve now disables both Loop and SLP vectorization
* Add tests for the miracl/milagro fallback
* Travis has max log size of 4MB
* Test with Miracl in the finalization test
* fix state_sim log level
* Coment out the slow fallback tests
* removed the BlockPool type and all of the proxy functions around it - passing the chain DAG and the quarantine explicitly where appropriately - they don't need to be bundled in a type
* fixed the build after the rebase
* limit attestations kept in attestation pool
With fork choice updated, the attestation pool only needs to keep track
of attestations that will eventually end up in blocks - we can thus
limit the horizon of attestations that we keep more aggressively.
To get here, we expose getEpochRef which gets metadata about a
particular epochref, and make sure to populate it when a block is added
- this ensures that state rewinds during block addition are minimized.
In addition, we'll use the target root/epoch when validating
attestations - this helps minimize the number of different states that
we need to rewind to, in general.
* remove CandidateChains.justifiedState
unused
* remove BlockPools.Head object
* avoid quadratic quarantine loop
* fix
* add helper for beacon committee length (used for quickly validating
attestations)
* refactor some attestation checks to do cheap checks early
* validate attestation epoch before computing active validator set
* clean up documentation / comments
* fill state cache on demand
* fork choice fixes, round 3
* introduce checkpoint tracker
* split out fork choice backend that is independent of dag
* correctly update best checkpoint to use for head selection
* correctly consider wall clock when processing attestations
* preload head history only (only one history is loaded from database
anyway)
* love the DAG
* switch to fork choice v2
also remove BlockRef.children
* fix
* cache beacon committee size calculation
this fixes a bug in get_validator_churn_limit as well
* fix
* make committee counts consistently uint64
mixing feels like the worst of the two worlds
* remove cruft
* reenable fork choice and fix several issues
* in addForkChoice_v2, the `.error` field would be accessed even when
Result is ok
* remove workaround for invalid block structure in fork choice
* fix `tmpState` being used recursively in callback, causing state
corruption while processing attestation
* fix block callback being called twice per block
* pass state to callback to avoid unnecessary rewinding
* enable head select, fix another bug
* never use `get` without `isOk`
* log nil blockref in case blockref is nil
* add missing error checking
* use correct epoch when updating attestation message
hash_tree_root was turning up when running beacon_node, turns out to be
repeated hash_tree_root invocations - this pr brings them back down to
normal.
this PR caches the root of a block in the SignedBeaconBlock object -
this has the potential downside that even invalid blocks will be hashed
(as part of deserialization) - later, one could imagine delaying this
until checks have passed
there's also some cleanup of the `cat=` logs which were applied randomly
and haphazardly, and to a large degree are duplicated by other
information in the log statements - in particular, topics fulfill the
same role
* restore EpochRef and flush statecaches on epoch transitions
* more targeted cache invalidation
* remove get_empty_per_epoch_cache(); implement simpler but still faster get_beacon_proposer_index()/compute_proposer_index() approach; add some abstraction layer for accessing the shuffled validator indices cache
* reduce integer type conversions
* remove most of rest of integer type conversion in compute_proposer_index()
* Dual headed fork choice
* fix finalizedEpoch not moving
* reduce fork choice verbosity
* Add failing tests due to pruning
* Properly handle duplicate blocks in sync
* test_block_pool also add a test for duplicate blocks
* comments addressing review
* Fix fork choice v2, was missing integrating block proposed
* remove a spurious debug writeStackTrace
* update block_sim
* Use OrderedTable to ensure that we always load parents before children in fork choice
* Load the DAG data in fork choice at init if there is some (can sync witti)
* Cluster of quarantined blocks were not properly added to the fork choice
* Workaround async gcsafe warnings
* Update blockpoool tests
* Do the callback before clearing the quarantine
* Revert OrderedTable, implement topological sort of DAG, allow forkChoice to be initialized from arbitrary finalized heads
* Make it work with latest devel - Altona readyness
* Add a recovery mechanism when forkchoice desyncs with blockpool
* add the current problematic node to the stack
* Fix rebase indentation bug (but still producing invalid block)
* Fix cache at epoch boundaries and lateBlock addition
* in makeBeaconBlock - use rollback instead
* in tests - this helps state_sim give more accurate data and makes it
30% faster
* fix some usages of raw BeaconState
* remove all but one UnusedImport warning
* bump a few more spec version references from v0.10.1 to v0.11.1
* more v0.10.1 spec reference updates/removals
* yet more v0.10.1 spec reference updates
* state data cache in block pool
* keep head state around
* more attestation logic in attestation pool
* first fork choice tests (!)
* fix fork choice (it's still likely broken / out of date)
* rename compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* remove some unnecessary imports; remove some crosslink-related code and tests; complete renaming of compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* rm more transfer-related code and tests; rm more unnecessary strutils imports
* rm remaining unused imports
* remove useless get_empty_per_epoch_cache(...)/compute_start_slot_of_epoch(...) calls
* rename compute_start_slot_of_epoch(...) to compute_start_slot_at_epoch(...)
* rename ACTIVATION_EXIT_DELAY to MAX_SEED_LOOKAHEAD
* update domain types to 0.9.0
* mark AttesterSlashing, IndexedAttestation, AttestationDataAndCustodyBit, DepositData, BeaconBlockHeader, Fork, integer_squareroot(...), and process_voluntary_exit(...) as 0.9.0
* mark increase_balance(...), decrease_balance(...), get_block_root(...), CheckPoint, Deposit, PendingAttestation, HistoricalBatch, is_active_validator(...), and is_slashable_attestation_data(...) as 0.9.0
* mark compute_activation_exit_epoch(...), bls_verify(...), Validator, get_active_validator_indices(...), get_current_epoch(...), get_total_active_balance(...), and get_previous_epoch(...) as 0.9.0
* mark get_block_root_at_slot(...), ProposerSlashing, get_domain(...), VoluntaryExit, mainnet preset Gwei values, minimal preset max operations, process_block_header(...), and is_slashable_validator(...) as 0.9.0
* mark makeWithdrawalCredentials(...), get_validator_churn_limit(...), get_total_balance(...), is_valid_indexed_attestation(...), bls_aggregate_pubkeys(...), initial genesis value/constants, Attestation, get_randao_mix(...), mainnet preset max operations per block constants, minimal preset Gwei values and time parameters, process_eth1_data(...), get_shuffled_seq(...), compute_committee(...), and process_slots(...) as 0.9.0; partially update get_indexed_attestation(...) to 0.9.0 by removing crosslink refs and associated tests
* mark initiate_validator_exit(...), process_registry_updates(...), BeaconBlock, Eth1Data, compute_domain(...), process_randao(...), process_attester_slashing(...), get_base_reward(...), and process_slot(...) as 0.9.0
* relax attestation validation when attestation is incoming but make it
more strict when adding to block
* share attestation validation logic between attestation pool and state
transition
* remove a bunch of redundant logging
* fix potential underflow in attestation delay checking
* fix committee used for attestation shard selection when attesting
* fix attestation data construction
* State sim compiles and run
* Works on libp2p testnet1
* Successfully update the state transition tests
* naive update of attestation pool (failing attstations can arrive in any order)
* Remove an official assert to allow attestation reordering
* rename get_genesis_beacon_state(...) to initialize_beacon_state_from_eth1(...); remove unused vestiges get_temporary_block_header(...) and get_empty_block(...); partly align initialize_beacon_state_from_eth1(...) with 0.8.0 version; implement CompactCommittee object type; update process_final_updates(...) to 0.8.0
* mark get_shard_delta(...) as 0.8.0
* mark get_total_active_balance(...) and epoch transition helper functions as 0.8.0
* move FORK_CHOICE_BALANCE_INCREMENT, not in spec anymore, to sole user in fork_choice
* rename slot_to_epoch(...) to compute_epoch_of_slot(...)
* rename aggregation_bitfield/custody_bitfield fields to aggregation_bits/custody_bits; rename convert_to_indexed(...) to get_indexed_attestation(...); mark BeaconBlockHeader as 0.8.0; update minimal preset MIN_ATTESTATION_INCLUSION_DELAY/time parameters in general to 0.8
* fix beacon node compilation; it used slightly different capitalizations of slot_to_epoch/slotToEpoch
* mark integer_squareroot(...), Deposit, VoluntaryExit, and Transfer as 0.8.0; rename get_epoch_start_slot(...) to compute_start_slot_of_epoch(...)
* add new Domain alias for uint64; rename bls_domain(...) to compute_domain(...), MAX_INDICES_PER_ATTESTATION to MAX_VALIDATORS_PER_COMMITTEE, and SignatureDomain to DomainType; update get_domain(...) to 0.8.0
* update/mark initiate_validator_exit(...) and Crosslink as 0.8.0; rename BeaconState.latest_block_roots -> BeaconState.block_roots; mark HistoricalBatch as 0.8.0
* migrate attestation pool tests from get_crosslink_committees_at_slot(...) to get_crosslink_committee(...)
* rm obsolete, unused get_crosslink_committees_at_slot_cached(...) and migrate tests/test_state_transition from get_crosslink_committees_at_slot(...) to get_crosslink_committee(...)
* migrate tests/testutil from get_crosslink_committees_at_slot(...) to get_crosslink_committee(...)
* use more pervasive caching infrastructure, initially of compute_committee; remove buggy (and per-index) shuffling to fix research/state_sim, which was noticing validators on multiple shard committees; rm now-unused specific get_attesting_balance_cached
* add some get_active_validator_index caching
* rm obsolete/unused get_attesting_indices_cached
* rm get_crosslink_committees_at_slot(...)
* some more 0.7 changes -- some of which can't be completed pending some data structure updates -- and shard handling changes to offset with epoch start shards
* Initial move constants to mainnet preset
* Bump fixtures to include shuffling with minimal preset
* Add minimal preset with shuffling test
* Add minimal preset to the full test suite
* use preset() everywhere
* no need to pass prev_block_root when updating state
* some Slot fixes
* fix `hash_tree_root` for Slot, Epoch
* detect missing hash_tree_root type support
* remove some <0.5 state checks
* keep track of a finalized block
* keep track of all justified blocks
* use naive spec version of LMD ghost
* cache slot number and a few more things in BlockRef
* keep track of the latest vote of each validator
* depend less on the state of node.state (it's a cache, effectively)
* fix state db lookup typo
* fix randao reveal slot when proposing blocks
* only store blocks that can be applied to a state
* store state at every epoch boundary (yes, needs pruning!)
* split out state advancement function when there's no block
* default state sim to 0.9 attestation ratio
* implement in-memory block graph
* store tail block in database
* resolve unknown parents by syncing them from peers
* introduce concept of resolved blocks and attestations - those that
follow minimal protocol rules
* update state head lazily
* log more stuff
* shortHash -> shortLog
* start 9/10 beacon nodes by default, last can be started manually
* see also #134
* fix start.sh epoch length
* convert some asserts to doAsserts to keep them in release mode builds; rename get_initial_beacon_state to get_genesis_beacon_state to track spec; switch target spec version to 0.3.0; switch references to penalize_validator to slash_validator/slashValidator to track spec; make some function returns safer by omitting 'return'
* 2x shuffling speedup by hoisting pivot calculations per https://github.com/protolambda/eth2-shuffle
* fix initial attestation pool on reordered attestations
* simplify db layer api
* load head block from database on startup, then load state
* significantly changes database format
* move subscriptions to separate proc's
* implement block replay from historical state
* avoid rescheduling epoch actions on block receipt (why?)
* make sure genesis block is created and used
* relax initial state sim parameters a bit
* move attestation pool to separate file
* combine attestations lazily when needed
* advance state when there's a gap while attesting
* compile beacon node with optimizations - it's tooo slow right now
* log when unable to keep up