Currently, we require genesis and a checkpoint block and state to start
from an arbitrary slot - this PR relaxes this requirement so that we can
start with a state alone.
The current trusted-node-sync algorithm works by first downloading
blocks until we find an epoch aligned non-empty slot, then downloads the
state via slot.
However, current
[proposals](https://github.com/ethereum/beacon-APIs/pull/226) for
checkpointing prefer finalized state as
the main reference - this allows more simple access control and caching
on the server side - in particular, this should help checkpoint-syncing
from sources that have a fast `finalized` state download (like infura
and teku) but are slow when accessing state via slot.
Earlier versions of Nimbus will not be able to read databases created
without a checkpoint block and genesis. In most cases, backfilling makes
the database compatible except where genesis is also missing (custom
networks).
* backfill checkpoint block from libp2p instead of checkpoint source,
when doing trusted node sync
* allow starting the client without genesis / checkpoint block
* perform epoch start slot lookahead when loading tail state, so as to
deal with the case where the epoch start slot does not have a block
* replace `--blockId` with `--state-id` in TNS command line
* when replaying, also look at the parent of the last-known-block (even
if we don't have the parent block data, we can still replay from a
"parent" state) - in particular, this clears the way for implementing
state pruning
* deprecate `--finalized-checkpoint-block` option (no longer needed)
`news` has a few open issues that are not present in `nim-websock`:
1. There is a 1 second delay between each MB of sent data.
2. Cancelling an ongoing `send` makes the entire WebSocket unusable.
3. Control packets do not have priority over ongoing message frames.
Using `news`, there are quite a few of these messages in Geth:
```
Previously seen beacon client is offline. Please ensure it is
operational to follow the chain!
```
It may take quite some time to reconnect when this happens.
Using `nim-websock`, this message still occurs because `eth1_monitor`
reconnects the EL connection when no new blocks occurred for 5 minutes,
but reconnecting is quick and the message is rarer.
* Fixes a segfault during block production when the Keymanager API
is disabled. The Keymanager is now disabled on half of the local
testnet nodes to catch such problems in the future.
* Fixes multiple potential stalls from REST requests being done
without a timeout. From practice, we know that such requests
can hang forever if not cancelled with a timeout. At best,
this would be a resource leak, at worst, it may lead to a
full stall of the client and missed validator duties.
* Changes some Options usages to Opt (for easier use of valueOr)
https://github.com/status-im/nimbus-eth2/pull/3944
The use of nested `awaitWithRetries` calls would have
resulted in an unexpected number of retries (3x3).
We now use regular `await` in outer layer to avoid the problem.
https://github.com/status-im/nimbus-eth2/pull/3943
The new code has an invariant that the `headMerkleizer` field in
the `Eth1Chain` is always kept in sync with the blocks stored in
the chain.
This invariant is now enforced better by doing the necessary merkleizer updates
in the `Eth1Chain.addBlock` function, in the `Eth1Chain.init` function and in the
`Eth1Chain.reset` function.
The justified and finalized `Checkpoint` are frequently passed around
together. This introduces a new `FinalityCheckpoint` data structure that
combines them into one.
Due to the large usage of this structure in fork choice, also took this
opportunity to update fork choice tests to the latest v1.2.0-rc.1 spec.
Many additional tests enabled, some need more work, e.g. EL mock blocks.
Also implemented `discard_equivocations` which was skipped in #3661,
and improved code reuse across fork choice logic while at it.
Incorporates the latest changes to the light client sync protocol based
on Devconnect AMS feedback. Note that this breaks compatibility with the
previous prototype, due to changes to data structures and endpoints.
See https://github.com/ethereum/consensus-specs/pull/2802
Other changes:
* logtrace can now verify sync committee messages and contributions
* Many unnecessary use of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
Some upstream repos still need fixes, but this gets us close enough that
style hints can be enabled by default.
In general, "canonical" spellings are preferred even if they violate
nep-1 - this applies in particular to spec-related stuff like
`genesis_validators_root` which appears throughout the codebase.
Up til now, the block dag has been using `BlockRef`, a structure adapted
for a full DAG, to represent all of chain history. This is a correct and
simple design, but does not exploit the linearity of the chain once
parts of it finalize.
By pruning the in-memory `BlockRef` structure at finalization, we save,
at the time of writing, a cool ~250mb (or 25%:ish) chunk of memory
landing us at a steady state of ~750mb normal memory usage for a
validating node.
Above all though, we prevent memory usage from growing proportionally
with the length of the chain, something that would not be sustainable
over time - instead, the steady state memory usage is roughly
determined by the validator set size which grows much more slowly. With
these changes, the core should remain sustainable memory-wise post-merge
all the way to withdrawals (when the validator set is expected to grow).
In-memory indices are still used for the "hot" unfinalized portion of
the chain - this ensure that consensus performance remains unchanged.
What changes is that for historical access, we use a db-based linear
slot index which is cache-and-disk-friendly, keeping the cost for
accessing historical data at a similar level as before, achieving the
savings at no percievable cost to functionality or performance.
A nice collateral benefit is the almost-instant startup since we no
longer load any large indicies at dag init.
The cost of this functionality instead can be found in the complexity of
having to deal with two ways of traversing the chain - by `BlockRef` and
by slot.
* use `BlockId` instead of `BlockRef` where finalized / historical data
may be required
* simplify clearance pre-advancement
* remove dag.finalizedBlocks (~50:ish mb)
* remove `getBlockAtSlot` - use `getBlockIdAtSlot` instead
* `parent` and `atSlot` for `BlockId` now require a `ChainDAGRef`
instance, unlike `BlockRef` traversal
* prune `BlockRef` parents on finality (~200:ish mb)
* speed up ChainDAG init by not loading finalized history index
* mess up light client server error handling - this need revisiting :)
One more step on the journey to reduce `BlockRef` usage across the
codebase - this one gets rid of `StateData` whose job was to keep track
of which block was last assigned to a state - these duties have now been
taken over by `latest_block_root`, a fairly recent addition that
computes this block root from state data (at a small cost that should be
insignificant)
99% mechanical change.
* Initial commit.
* Fix current test suite.
* Fix keymanager api test.
* Fix wss_sim.
* Add more keystore_management tests.
* Recover deleted isEmptyDir().
* Add `HttpHostUri` distinct type.
Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls.
Add REST serialization of RemoteKeystore and Keystore object.
Add tests for Remote Keystore management API.
Add tests for Keystore management API (Add keystore).
Fix serialzation issues.
* Fix test to use HttpHostUri instead of Uri.
* Add links to specification in comments.
* Remove debugging echoes.
It's sometimes useful to simulate what happens when a chain runs from a
given state with a given set of private keys - `wss_sim` allows running
such a simulation.
One use of such a tool is to simulate a weak subjectivity attack,
creating alternative histories of the same chain:
https://notes.status.im/nimbus-insecura-network#
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).
Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.
* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`
A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
* Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex
Harden the use of `CommitteeIndex` et al to prevent future issues by
using a distinct type, then validating before use in several cases -
datatypes in spec are kept simple though so that invalid data still can
be read.
* fix invalid epoch used in REST
`/eth/v1/beacon/states/{state_id}/committees` committee length (could
return invalid data)
* normalize some variable names
* normalize committee index loops
* fix `RestAttesterDuty` to use `uint64` for `validator_committee_index`
* validate `CommitteeIndex` on ingress in REST API
* update rest rules with stricter parsing
* better REST serializers
* save lots of memory by not using `zip` ...at least a few bytes!
With checkpoint sync in particular, and state pruning in the future,
loading states or state-dependent data may fail. This PR adjusts the
code to allow this to be handled gracefully.
In particular, the new availability assumption is that states are always
available for the finalized checkpoint and newer, but may fail for
anything older.
The `tail` remains the point where state loading de-facto fails, meaning
that between the tail and the finalized checkpoint, we can still get
historical data (but code should be prepared to handle this as an
error).
However, to harden the code against long replays, several operations
which are assumed to work only with non-final data (such as gossip
verification and validator duties) now limit their search horizon to
post-finalized data.
* harden several state-dependent operations by logging an error instead
of introducing a panic when state loading fails
* `withState` -> `withUpdatedState` to differentiate from the other
`withState`
* `updateStateData` can now fail if no state is found in database - it
is also hardened against excessively long replays
* `getEpochRef` can now fail when replay fails
* reject blocks with invalid target root - they would be ignored
previously
* fix recursion bug in `isProposed`
* use v1.1.6 test vectors; use BeaconTime instead of Slot in fork choice
* tick through every slot at least once
* use div INTERVALS_PER_SLOT and use precomputed constants of them
* use correct (even if numerically equal) constant
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top on performance.
The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.
When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.
When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.
Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.
This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.
Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
* batch-verify sync messages for a small perf boost
Generally reuses the same structure as attestation and aggregate
verification
* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
* move quarantine outside of chaindag
The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:
* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually
Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag
This sets the stage for keeping attestations in the quarantine as well.
Also:
* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
Renames and cleanups split out from the validator monitoring branch, so
as to reduce conflict area vs other PR:s
* add constants for expected message timing
* name validators after the messages they validate, mostly, to make
grepping easier
* unify field naming of EpochInfo across forks to make cross-fork code
easier
* ncli_db: add putState, putBlock
These tools allow modifying an existing nimbus database for the purpose
of recovery or reorg, moving the head, tail and genesis to arbitrary
points.
* remove potentially expensive `putState` in `BeaconStateDB`
* introduce `latest_block_root` which computes the root of the latest
applied block from the `latest_block_header` field (instead of passing
it in separately)
* avoid some unnecessary BeaconState copies during init
* discover https://github.com/nim-lang/Nim/issues/19094
* prefer `HashedBeaconState` in a few places to avoid recomputing state
root
* fetch latest block root from state when creating blocks
* harden `get_beacon_proposer_index` against invalid slots and document
* move random spec function tests to `test_spec.nim`
* avoid unnecessary state root computation before block proposal
* Support starting from altair
* hide `finalized-checkpoint-` - they are incomplete and usage may cause
crashes
* remove genesis detection code (broken, obsolete)
* enable starting ChainDAG from altair checkpoints - this is a
prerequisite for checkpoint sync (TODO: backfill)
* tighten checkpoint state conditions
* show error when starting from checkpoint with existing database (not
supported)
* print rest-compatible JSON in ncli/state_sim
* altair/merge support in ncli
* more altair/merge support in ncli_db
* pre-load header to speed up loading
* fix forked block decoding
* fix stack overflow crash in REST/debug/getStateV2
* introduce `ForkyXxx` for generic type matching of `Xxx` across
branches (SomeHashedBeaconState -> ForkyHashedBeaconState et al) -
`Some` is already used for other types of type classes
* consolidate function naming in BeaconChainDB, use some generics
* import `forks.nim` from other spec modules and move `Forked*` helpers
around to resolve circular imports
* remove `ForkedBeaconState`, use `ForkedHashedBeaconState` throughout
(less data shuffling between the types)
* fix several cases of states being stored on stack in tests, causing
random failures on some platforms
* remove reading json support from ncli - this should be ported to the
rest json reading instead (doesn't currently work because stack sizes)
* `SyncCommitteeIndex` -> `SyncSubcommitteeIndex`
* `syncCommitteePeriod` -> `sync_committee_period` (spec spelling)
* tighten period comparisons
* fix assert when validating committee message with non-altair state in
REST api
There are a few locations in the code that compare the current epoch to
the various FORK_EPOCH constants and branch off into fork-specific code.
When a new fork is introduced, it is sometimes forgotten to update all
of those branch locations. This patch introduces a compile-time check
that ensures that all branches need to be covered exhaustively. This is
done by replacing if-elif structures with case expressions.
In tests, the private key was put into the validator deposit's withdraw
credentials so that it can be recovered later. This leads to problems
when creating the validators through other means that do not put the key
there. In general, mock private keys only depend on the validator index,
though, and because it is clear what the index of a validator is, it is
not actually needed to put the key into the credentials.
There are a number of locations in the code that get attestations on a
forked beacon state. For attestation pools test, a convenience wrapper
was available to reduce clutter. This patch integrates that wrapper into
the core component so that it can also take advantage of the wrapper.
The initialization of a `SyncAggregate` to its default value is not very
intuitive. There is an `init` function in `sync_committee_msg_pool` that
provides a convenience wrapper. This patch exports that initializer so
that the rest of the code base can also take advantage of it.
The phase0 and altair overloads of `makeBeaconBlock` slightly differ in
their signatures which makes using them unnecessarily verbose.
- A placeholder `sync_aggregate` argument similar to `executionPayload`
is added to the phase0 overload to match the altair signature.
- A wrapper operating on `ForkedHashedBeaconState` is introduced.
testblockutil currently fails to add test blocks to Altair state because
it assumes phase0 blocks in various places. This patch corrects this
limitation by using forked types internally. Note that existing clients
so far solely operate on phase0 blocks and should eventually be updated.
* Add parallel attestation verification
* Update tests, batchVerify doesn't use the threadpool with only single core (nim-blscurve update)
* bump nim-blscurve
* Debug info for failing eth2 test vectors
* remove submodule eth2-testnets
* verbose debugging of make failure on Windows (libbacktrace?)
* Remove CI debug mode
* initialization convention
* Fix new altair tests
In edd9826464, VCS ignores were introduced
to ignore the output from building and running the research tools from
their directory. Over time, new research tools got introduced, but the
ignore list did not get updated. Updating now for current set of tools.
* reorganize ssz dependencies
This PR continues the work in
https://github.com/status-im/nimbus-eth2/pull/2646,
https://github.com/status-im/nimbus-eth2/pull/2779 as well as past
issues with serialization and type, to disentangle SSZ from eth2 and at
the same time simplify imports and exports with a structured approach.
The principal idea here is that when a library wants to introduce SSZ
support, they do so via 3 files:
* `ssz_codecs` which imports and reexports `codecs` - this covers the
basic byte conversions and ensures no overloads get lost
* `xxx_merkleization` imports and exports `merkleization` to specialize
and get access to `hash_tree_root` and friends
* `xxx_ssz_serialization` imports and exports `ssz_serialization` to
specialize ssz for a specific library
Those that need to interact with SSZ always import the `xxx_` versions
of the modules and never `ssz` itself so as to keep imports simple and
safe.
This is similar to how the REST / JSON-RPC serializers are structured in
that someone wanting to serialize spec types to REST-JSON will import
`eth2_rest_serialization` and nothing else.
* split up ssz into a core library that is independendent of eth2 types
* rename `bytes_reader` to `codec` to highlight that it contains coding
and decoding of bytes and native ssz types
* remove tricky List init overload that causes compile issues
* get rid of top-level ssz import
* reenable merkleization tests
* move some "standard" json serializers to spec
* remove `ValidatorIndex` serialization for now
* remove test_ssz_merkleization
* add tests for over/underlong byte sequences
* fix broken seq[byte] test - seq[byte] is not an SSZ type
There are a few things this PR doesn't solve:
* like #2646 this PR is weak on how to handle root and other
dontSerialize fields that "sometimes" should be computed - the same
problem appears in REST / JSON-RPC etc
* Fix a build problem on macOS
* Another way to fix the macOS builds
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
The spec imports are a mess to work with, so this branch cleans them up
a bit to ensure that we avoid generic sandwitches and that importing
stuff generally becomes easier.
* reexport crypto/digest/presets because these are part of the public
symbol set of the rest of the spec types
* don't export `merge` types from `base` - this causes circular deps
* fix circular deps in `ssz/spec_types` - this is the first step in
disentangling ssz from spec
* be explicit about phase0 vs altair - longer term, `altair` will become
the "natural" type set, then merge and so on, so no point in giving
`phase0` special preferential treatment
Simpler module name for stuff that covers forks
* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
* Implement split preset/config support
This is the initial bulk refactor to introduce runtime config values in
a number of places, somewhat replacing the existing mechanism of loading
network metadata.
It still needs more work, this is the initial refactor that introduces
runtime configuration in some of the places that need it.
The PR changes the way presets and constants work, to match the spec. In
particular, a "preset" now refers to the compile-time configuration
while a "cfg" or "RuntimeConfig" is the dynamic part.
A single binary can support either mainnet or minimal, but not both.
Support for other presets has been removed completely (can be readded,
in case there's need).
There's a number of outstanding tasks:
* `SECONDS_PER_SLOT` still needs fixing
* loading custom runtime configs needs redoing
* checking constants against YAML file
* yeerongpilly support
`build/nimbus_beacon_node --network=yeerongpilly --discv5:no --log-level=DEBUG`
* load fork epoch from config
* fix fork digest sent in status
* nicer error string for request failures
* fix tools
* one more
* fixup
* fixup
* fixup
* use "standard" network definition folder in local testnet
Files are loaded from their standard locations, including genesis etc,
to conform to the format used in the `eth2-networks` repo.
* fix launch scripts, allow unknown config values
* fix base config of rest test
* cleanups
* bundle mainnet config using common loader
* fix spec links and names
* only include supported preset in binary
* drop yeerongpilly, add altair-devnet-0, support boot_enr.yaml
* add Altair support to the block quarantine
* switch some spec/datatypes imports to spec/datatypes/base
* add Altair support to block_clearance
* allow runtime configuration of Altair transition slot
* enable Altair in block_sim, including in CI
* remove false OnBlockAdded dependency on phase.HashedBeaconState
* introduce altair data types into block_clearance; update some alpha.6 spec refs to alpha.7; add get_active_validator_indices_len ForkedHashedBeaconState wrapper
* switch many modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7 and remove rm'd G2_POINT_AT_INFINITY
* switch more modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7
* remove unnecessary phase0-only wrapper of get_attesting_indices(); allow signatures_batch to process either fork; remove O(n^2) nested loop in process_inactivity_updates(); add altair support to getAttestationsforTestBlock()
* add Altair versions of asSigVerified(), asTrusted(), and makeBeaconBlock()
* fix spec URL to be Altair for Altair makeBeaconBlock()
* use ForkedHashedBeaconState in StateData
* fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign()
* avoid stack allocation in maybeUpgradeStateToAltair()
* create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit()
* use getStateRoot() instead of various state.data.hbsPhase0.root
* remove withStateVars.hashedState(), which doesn't work as a design anymore
* introduce spec/datatypes/altair into beacon_chain_db
* fix inefficient codegen for getStateField(largeStateField)
* state_transition_slots() doesn't either need/use blocks or runtime presets
* combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization
* getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...)
* fix rollback
* switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState
* remove state_transition(phase0.HashedBeaconState)
* remove process_slots(phase0.HashedBeaconState)
* remove state_transition_block(phase0.HashedBeaconState)
* remove unused callWithBS(); separate case expression from if statement
* switch back from nested-ref-object construction to (ref Foo)(Bar())
Instead of keeping a validator key list per EpochRef, this PR introduces
a single shared validator key list in ChainDAG, and cleans up some other
ChainDAG and key-related issues.
The PR does not introduce the validator key list in the state transition
- this is because we batch-check all signatures before entering the spec
code, thus the spec code never hits the cache.
A future refactor should _probably_ remove the threadvar altogether.
There's a few other small fixes in here that make the flow easier to
read:
* fix `var ChainDAGRef` -> `ChainDAGRef`
* fix `var QuarantineRef` -> `QuarantineRef`
* consistent `dag` variable name
* avoid using threadvar pubkey cache in most cases
* better error messages in batch signature checking
* use StateData in place of BeaconState outside state transition code
* propagate more StateData usage
* remove withStateVars().state
* wrap get_beacon_committee(BeaconState, ...) as gbc(StateData, ...)
* switch makeAttestation() to use StateData
* use StateData wrapper/dispatcher for get_committee_count_per_slot()
* convert AttestationCache.init(), weak subjectivity functions, and updateValidatorMetrics()
* add get_shuffled_active_validator_indices(StateData) and get_block_root_at_slot(StateData)
* switch makeAttestationData() to StateData
* sync AllTests-mainnet.md after rebase
With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.
* add successfully validated attestations to attestion pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, specially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.
The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.
Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.
A basic greed scoring mechanism is used to select attestations for
blocks - attestations are added based on how much many new votes they
bring to the table.
* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes#2490)
* only deserialize attestation and aggregation gossiped signatures once
* re-indent some aggregate checks into block scope
* spelling
* remove debugging assertion
* put part of gossip validation back into block context
* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free
* remove more development doAsserts; don't exportRaw in loops
* set upper bound on EpochRef cache
* max 32 EpochRef instances
* less memory waste in BlockRef by removing EpochRef seq that is mostly
unused (~20mb)
* less memory waste in dag block lookup by not keeping an extra copy of
digest (~70mb)
* fix `==` and `$` for Eth2Digest
* remove `ChainDAG.tmpState` (~50mb?)
all in all, this branch cuts mainnet memory usage by ~160-180mb and puts
limits on EpochRef cache usage - where normally it hovered around 950mb
before, it's now sitting at 600-700mb on my machine.
* docs
* Deferred DAG and fork choice pruning
* fixup
* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSLotEnd for state pruning
* no need to store needPruning in the data structure
* lastPrunePoint is updated in pruning proc
* Split eager and LazyPruning
* enforce pruning in updateHead
* database state storage benchmarking via ncli_db
* more cleanups from immutable validator state branch
* unexport some eth2_network constants and remove unused variables/templates
* make two PeerScore constants public
* use IntSet rather than HashSet[ValidatorIndex]
* add bounds check before uint64 -> int conversion
* use intsets in block transitions
* remove superfluous Nim issue explanation/reference