With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.
* add successfully validated attestations to attestion pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, specially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
* only deserialize attestation and aggregation gossiped signatures once
* re-indent some aggregate checks into block scope
* spelling
* remove debugging assertion
* put part of gossip validation back into block context
* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free
* remove more development doAsserts; don't exportRaw in loops
* set upper bound on EpochRef cache
* max 32 EpochRef instances
* less memory waste in BlockRef by removing EpochRef seq that is mostly
unused (~20mb)
* less memory waste in dag block lookup by not keeping an extra copy of
digest (~70mb)
* fix `==` and `$` for Eth2Digest
* remove `ChainDAG.tmpState` (~50mb?)
all in all, this branch cuts mainnet memory usage by ~160-180mb and puts
limits on EpochRef cache usage - where normally it hovered around 950mb
before, it's now sitting at 600-700mb on my machine.
* docs
* Deferred DAG and fork choice pruning
* fixup
* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSLotEnd for state pruning
* no need to store needPruning in the data structure
* lastPrunePoint is updated in pruning proc
* Split eager and LazyPruning
* enforce pruning in updateHead
* use IntSet rather than HashSet[ValidatorIndex]
* add bounds check before uint64 -> int conversion
* use intsets in block transitions
* remove superfluous Nim issue explanation/reference
* in exit pool, filter out already-packaged messages; bundle remaining messages into beaconblocks
* filter messages at block construction time
* allow adding up to intended capacity of buffers, beyond per-block limits
* document rationale/design for filtering mechanism
* remove some superfluous gcsafes
* remove getTailState (unused)
* don't store old epochrefs in blocks
* document attestation pool a bit
* remove `pcs =` cruft from log
This implements disparity, resolving a part of
https://github.com/status-im/nim-beacon-chain/issues/1367
* make BeaconTime a duration for fractional seconds
* factor out attestation/aggregate validation
* simplify recording of queued attestations
* simplify attestation signature check
* fix blocks_received metric
* add some trivial validation tests
* remove unresolved attestation table - attestations for unknown blocks
are dropped instead (cannot verify their signature)
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!
this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
* removed the BlockPool type and all of the proxy functions around it - passing the chain DAG and the quarantine explicitly where appropriately - they don't need to be bundled in a type
* fixed the build after the rebase
* limit attestations kept in attestation pool
With fork choice updated, the attestation pool only needs to keep track
of attestations that will eventually end up in blocks - we can thus
limit the horizon of attestations that we keep more aggressively.
To get here, we expose getEpochRef which gets metadata about a
particular epochref, and make sure to populate it when a block is added
- this ensures that state rewinds during block addition are minimized.
In addition, we'll use the target root/epoch when validating
attestations - this helps minimize the number of different states that
we need to rewind to, in general.
* remove CandidateChains.justifiedState
unused
* remove BlockPools.Head object
* avoid quadratic quarantine loop
* fix
* fork choice fixes, round 3
* introduce checkpoint tracker
* split out fork choice backend that is independent of dag
* correctly update best checkpoint to use for head selection
* correctly consider wall clock when processing attestations
* preload head history only (only one history is loaded from database
anyway)
* love the DAG
* switch to fork choice v2
also remove BlockRef.children
* fix
* cache beacon committee size calculation
this fixes a bug in get_validator_churn_limit as well
* fix
* make committee counts consistently uint64
mixing feels like the worst of the two worlds
* remove cruft
* reenable fork choice and fix several issues
* in addForkChoice_v2, the `.error` field would be accessed even when
Result is ok
* remove workaround for invalid block structure in fork choice
* fix `tmpState` being used recursively in callback, causing state
corruption while processing attestation
* fix block callback being called twice per block
* pass state to callback to avoid unnecessary rewinding
* enable head select, fix another bug
* never use `get` without `isOk`
* log nil blockref in case blockref is nil
* add missing error checking
* use correct epoch when updating attestation message
hash_tree_root was turning up when running beacon_node, turns out to be
repeated hash_tree_root invocations - this pr brings them back down to
normal.
this PR caches the root of a block in the SignedBeaconBlock object -
this has the potential downside that even invalid blocks will be hashed
(as part of deserialization) - later, one could imagine delaying this
until checks have passed
there's also some cleanup of the `cat=` logs which were applied randomly
and haphazardly, and to a large degree are duplicated by other
information in the log statements - in particular, topics fulfill the
same role
* restore EpochRef and flush statecaches on epoch transitions
* more targeted cache invalidation
* remove get_empty_per_epoch_cache(); implement simpler but still faster get_beacon_proposer_index()/compute_proposer_index() approach; add some abstraction layer for accessing the shuffled validator indices cache
* reduce integer type conversions
* remove most of rest of integer type conversion in compute_proposer_index()
* Dual headed fork choice
* fix finalizedEpoch not moving
* reduce fork choice verbosity
* Add failing tests due to pruning
* Properly handle duplicate blocks in sync
* test_block_pool also add a test for duplicate blocks
* comments addressing review
* Fix fork choice v2, was missing integrating block proposed
* remove a spurious debug writeStackTrace
* update block_sim
* Use OrderedTable to ensure that we always load parents before children in fork choice
* Load the DAG data in fork choice at init if there is some (can sync witti)
* Cluster of quarantined blocks were not properly added to the fork choice
* Workaround async gcsafe warnings
* Update blockpoool tests
* Do the callback before clearing the quarantine
* Revert OrderedTable, implement topological sort of DAG, allow forkChoice to be initialized from arbitrary finalized heads
* Make it work with latest devel - Altona readyness
* Add a recovery mechanism when forkchoice desyncs with blockpool
* add the current problematic node to the stack
* Fix rebase indentation bug (but still producing invalid block)
* Fix cache at epoch boundaries and lateBlock addition
* add tests for unviable blocks
also enable finalization tests in all test configs - they're plenty fast
now
also fix newClone for non-rvo cases. sigh.
* fixes
* collect signature production and verificaiton in one place
Signatures are made over data and domain - here we collect all such
activities in one place.
Also:
* security: fix cast-before-range-check
* log block/attestation verification consistently
* run block verification based on `getProposer` in its own history
* clean up some unused stuff
* import
* missing raises
* reorder ssz
* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports
* docs, imports
* in makeBeaconBlock - use rollback instead
* in tests - this helps state_sim give more accurate data and makes it
30% faster
* fix some usages of raw BeaconState
When replaying state transitions, for the slots that have a block, the
state root is taken from the block. For slots that lack a block, it's
currently calculated using hash_tree_root which is expensive.
Caching the empty slot state roots helps us avoid recalculating this
hash, meaning that for replay, hashes are never calculated. This turns
blocks into fairly lightweight "state-diffs"!
* avoid re-saving state when replaying blocks
* advance empty slots slot-by-slot and save root
* fix sim randomness
* fix sim genesis filename
* introduce `isEpoch` to check if a slot is an epoch slot