Instead of keeping a validator key list per EpochRef, this PR introduces
a single shared validator key list in ChainDAG, and cleans up some other
ChainDAG and key-related issues.
The PR does not introduce the validator key list in the state transition
- this is because we batch-check all signatures before entering the spec
code, thus the spec code never hits the cache.
A future refactor should _probably_ remove the threadvar altogether.
There's a few other small fixes in here that make the flow easier to
read:
* fix `var ChainDAGRef` -> `ChainDAGRef`
* fix `var QuarantineRef` -> `QuarantineRef`
* consistent `dag` variable name
* avoid using threadvar pubkey cache in most cases
* better error messages in batch signature checking
With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.
* add successfully validated attestations to attestion pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, specially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.
The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.
Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.
A basic greed scoring mechanism is used to select attestations for
blocks - attestations are added based on how much many new votes they
bring to the table.
* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes#2490)
* batch attestations
* Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518
* Try to remove the processing loop to no avail :/
* batch aggregates
* use resultsBuffer size for triggering deadline schedule
* pass attestation pool tests
* Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518)
* Put logging at debug level, add speed info
* remove unnecessary batch info when it is known to be one
* downgrade some logs to trace level
* better comments [skip ci]
* Address most review comments
* only use ref for async proc
* fix exceptions in eth2_network
* update async exceptions in gossip_validation
* eth2_network 2nd pass
* change to sleepAsync
* Update beacon_chain/gossip_processing/batch_validation.nim
Co-authored-by: Jacek Sieka <jacek@status.im>
Co-authored-by: Jacek Sieka <jacek@status.im>
* performance fixes
* don't mark tree cache as dirty on read-only List accesses
* store only blob in memory for keys and signatures, parse blob lazily
* compare public keys by blob instead of parsing / converting to raw
* compare Eth2Digest using non-constant-time comparison
* avoid some unnecessary validator copying
This branch will in particular speed up deposit processing which has
been slowing down block replay.
Pre (mainnet, 1600 blocks):
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3450.269, 0.000, 3450.269, 3450.269, 1, Initialize DB
0.417, 0.822, 0.036, 21.098, 1400, Load block from database
16.521, 0.000, 16.521, 16.521, 1, Load state from database
27.906, 50.846, 8.104, 1507.633, 1350, Apply block
52.617, 37.029, 20.640, 135.938, 50, Apply epoch block
```
Post:
```
3502.715, 0.000, 3502.715, 3502.715, 1, Initialize DB
0.080, 0.560, 0.035, 21.015, 1400, Load block from database
17.595, 0.000, 17.595, 17.595, 1, Load state from database
15.706, 11.028, 8.300, 107.537, 1350, Apply block
33.217, 12.622, 17.331, 60.580, 50, Apply epoch block
```
* more perf fixes
* load EpochRef cache into StateCache more aggressively
* point out security concern with public key cache
* reuse proposer index from state when processing block
* avoid genericAssign in a few more places
* don't parse key when signature is unparseable
* fix `==` overload for Eth2Digest
* preallocate validator list when getting active validators
* speed up proposer index calculation a little bit
* reuse cache when replaying blocks in ncli_db
* avoid a few more copying loops
```
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3279.158, 0.000, 3279.158, 3279.158, 1, Initialize DB
0.072, 0.357, 0.035, 13.400, 1400, Load block from database
17.295, 0.000, 17.295, 17.295, 1, Load state from database
5.918, 9.896, 0.198, 98.028, 1350, Apply block
15.888, 10.951, 7.902, 39.535, 50, Apply epoch block
0.000, 0.000, 0.000, 0.000, 0, Database block store
```
* clear full balance cache before processing rewards and penalties
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3947.901, 0.000, 3947.901, 3947.901, 1, Initialize DB
0.124, 0.506, 0.026, 202.370, 363345, Load block from database
97.614, 0.000, 97.614, 97.614, 1, Load state from database
0.186, 0.188, 0.012, 99.561, 357262, Advance slot, non-epoch
14.161, 5.966, 1.099, 395.511, 11524, Advance slot, epoch
1.372, 4.170, 0.017, 276.401, 363345, Apply block, no slot processing
0.000, 0.000, 0.000, 0.000, 0, Database block store
```
* collect all epochrefs in specific blocks to make them easier to find
and to avoid lots of small seqs
* reuse validator key databases more aggressively by comparing keys
* make state cache available from within `withState`
* make epochRef available from within onBlockAdded callback
* integrate getEpochInfo into block resolution and epoch ref logic such
that epochrefs are created when blocks are added to pool or lazily when
needed by a getEpochRef
* fill state cache better from EpochRef, speeding up replay and
validation
* store epochRef in specific blocks to make them easier to find and
reuse
* fix database corruption when state is saved while replaying quarantine
* replay slots fully from block pool before processing state
* compare bls values more smartly
* store epoch state without block applied in database - it's recommended
to resync the node!
this branch will drastically speed up processing in times of long
non-finality, as well as cut memory usage by 10x during the recent
medalla madness.
* Bump BLSCurve
* Use unified aggregation API
* use new blscurve with unified aggregate API
* bump
* fix toRaw
* replace state_sim combine with AggregateSignature
* Fix 32-bit
* Fix 32-bit for real and test deactivating ccache for fno-tree-lopp-vectorize flag
* change compilation switches to narrow down Linux issue
* Use -fno-tree-vectorize to disable both tree-loop-vectorize and tree-slp-vectorize
* blscurve now disables both Loop and SLP vectorization
* Add tests for the miracl/milagro fallback
* Travis has max log size of 4MB
* Test with Miracl in the finalization test
* fix state_sim log level
* Coment out the slow fallback tests
* Lazy loading of crypto objects
* Try to fix incorrect field access by hiding fields but no luck. SSZ/Chronicles/macro bug?
* Fix incorrect blsValue access. was "aggregate" not "chronicles"
* Fix tests that rely on the internal BLSValue representation
* cleanups
* fix ncli state root check flag
* add block dump to ncli_db
* limit ncli_db benchmark length
* tone down finalization logs
* introduce trusted blocks
We only store blocks whose signature we've verified in the database - as
such, there's no need to check it again, and most importantly, no need
to deserialize the signature when loading from database.
50x startup time improvement, 200x block load time improvement.
* fix rewinding when deposits have invalid signature
* speed up ancestor iteration by avoiding copy
* avoid deserializing signatures for trusted data
* load blocks lazily when rewinding (less memory used)
* chronicles workarounds
* document trustedbeaconblock
* collect signature production and verificaiton in one place
Signatures are made over data and domain - here we collect all such
activities in one place.
Also:
* security: fix cast-before-range-check
* log block/attestation verification consistently
* run block verification based on `getProposer` in its own history
* clean up some unused stuff
* import
* missing raises
* reorder ssz
* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports
* docs, imports
* crypto: cleanup
* fix several Defect-on-user-input
* make crypto interface more similar to secp
* use `crypto.nim` in all of nbc
* digest: raises
* fix
* vendor
* Initial implementation of runtime bls skipping.
Add libnfuzz skipBLSValidation handling, check that it propagates.
* Rename skipBLSValidation -> skipBlsValidation, start skipStateRootValidation
* Replace skipValidation flags with more granular flags.
Also added skipBlockParentRootValidation flag
Mainly replaced with skipBlsValidation but also StateRoot or
BlockParentRootValidation flags where appropriate.
* Adjust interop test to pass when skipping merkle validation.
* Stop skipping validation for mainchain_monitor.
* Remove comment.
* Also skipMerkleValidation for test_beacon_chain_db.
* re-enable test_interop based on zcli with 0.9.1 specs and update initialize_beacon_state_from_eth1(...) to 0.9.1
* switch many procs to funcs
* fix import os.nim instead; ospaths is deprecated [Deprecated] warnings
When the connect_to_testnet script is invoked it will first verify that
the genesis file of the testnet hasn't changed. If it has changed, any
previously created database associated with the testnet will be erased.
To facilitate this, the genesis file of each network is written to the
data folder of the beacon node. The beacon node will refuse to start if
it detects a discrepancy between the data folder and any state snapshot
specified on the command-line.
Since the testnet sharing spec requires us to use SSZ snapshots, the Json
support is now phased out. To help with the transition and to preserve the
functionality of the multinet scripts, the beacon node now supports a CLI
query command that can extract any data from the genesis state. This is
based on new developments in the SSZ navigators.
* rename compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* remove some unnecessary imports; remove some crosslink-related code and tests; complete renaming of compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* rm more transfer-related code and tests; rm more unnecessary strutils imports
* rm remaining unused imports
* remove useless get_empty_per_epoch_cache(...)/compute_start_slot_of_epoch(...) calls
* rename compute_start_slot_of_epoch(...) to compute_start_slot_at_epoch(...)
* rename ACTIVATION_EXIT_DELAY to MAX_SEED_LOOKAHEAD
* update domain types to 0.9.0
* mark AttesterSlashing, IndexedAttestation, AttestationDataAndCustodyBit, DepositData, BeaconBlockHeader, Fork, integer_squareroot(...), and process_voluntary_exit(...) as 0.9.0
* mark increase_balance(...), decrease_balance(...), get_block_root(...), CheckPoint, Deposit, PendingAttestation, HistoricalBatch, is_active_validator(...), and is_slashable_attestation_data(...) as 0.9.0
* mark compute_activation_exit_epoch(...), bls_verify(...), Validator, get_active_validator_indices(...), get_current_epoch(...), get_total_active_balance(...), and get_previous_epoch(...) as 0.9.0
* mark get_block_root_at_slot(...), ProposerSlashing, get_domain(...), VoluntaryExit, mainnet preset Gwei values, minimal preset max operations, process_block_header(...), and is_slashable_validator(...) as 0.9.0
* mark makeWithdrawalCredentials(...), get_validator_churn_limit(...), get_total_balance(...), is_valid_indexed_attestation(...), bls_aggregate_pubkeys(...), initial genesis value/constants, Attestation, get_randao_mix(...), mainnet preset max operations per block constants, minimal preset Gwei values and time parameters, process_eth1_data(...), get_shuffled_seq(...), compute_committee(...), and process_slots(...) as 0.9.0; partially update get_indexed_attestation(...) to 0.9.0 by removing crosslink refs and associated tests
* mark initiate_validator_exit(...), process_registry_updates(...), BeaconBlock, Eth1Data, compute_domain(...), process_randao(...), process_attester_slashing(...), get_base_reward(...), and process_slot(...) as 0.9.0
* Add attestation unit test
* process_attestation doesn't throw exceptions
* Allow SSZ deserialization of both real and invalid signatures
* Add new process_attestation checks - pass all process_attestation tests
* Add sanity check for #361
* Fix SSZ testing after fromBytes/fromSSZBytes changes
* fix network sim
* mark BeaconState, state list/vector lengths, misc values, get_base_reward(...), verifyStateRoot(...), and process_slot(...) as 0.8.3; update minimal/mainnet config initial values to 0.8.3 by removing GENESIS_FORK_VERSION
* Add sanity check for slot processing (also impacted by https://github.com/status-im/nim-beacon-chain/issues/373)
* use reportDiff also for all state tests vs EF
* initial sanity checks for blocks - workaround zero signature in block headers: https://github.com/status-im/nim-beacon-chain/issues/374
* Remove generic object variant compare commented code
* Add the one block state transition sanity checks
* generalize blocks test to multiple blocks
* simplify slots test runner
* Add official epoch transitions, sanity blocks, sanity slots to the test suite
* Fix index out-of-bounds in initiate_validator_exit - enable proposer slashings unittest
* add interop launcher scripts
* stick validator_keygen into beacon_node
* fix lmd ghost slot number on missing block
* use mocked eth1data when producing blocks
* use bls public key method for withdrawal credentials
* fix deposit domain
* prefer lowercase for a bunch of toHex
* build simulation binary in data folder to avoid data types confusion
* 0.7.0 updates which don't change semantics; mostly comment changes
* mark get_attesting_indices, bls_verify, ProposerSlashing, BeaconBlockBody, deposit contract constants, state list length constants, compute_committee, get_crosslink_committee, is_slashable_attestation_data, processAttesterSlashings as 0.7.0; rename BASE_REWARD_QUOTIENT to BASE_REWARD_FACTOR
* complete marking unchanged-in-0.7.0 datatypes as such; a few more have trivial field renamings, etc
* update process_justification_and_finalization to 0.6.2; mark AttesterSlashing as 0.6.2
* replace get_effective_balance(...) with state.validator_registry[idx].effective_balance; rm get_effective_balance, process_ejections, should_update_validator_registry, update_validator_registry, and update_registry_and_shuffling_data; update get_total_balance to 0.6.2; implement process_registry_updates
* rm exit_validator; implement is_slashable_attestation_data; partly update processAttesterSlashings
* mark HistoricalBatch and Eth1Data as 0.6.2; implement get_shard_delta(...); replace 0.5 finish_epoch_update with 0.6 process_final_updates
* mark increase_balance, decrease_balance, get_delayed_activation_exit_epoch, bls_aggregate_pubkeys, bls_verify_multiple, Attestation, Transfer, slot_to_epoch, Crosslink, get_current_epoch, int_to_bytes*, various constants, processEth1Data, processTransfers, and verifyStateRoot as 0.6.2; rm is_double_vote and is_surround_vote
* mark get_bitfield_bit, verify_bitfield, ProposerSlashing, DepositData, VoluntaryExit, PendingAttestation, Fork, integer_squareroot, get_epoch_start_slot, is_active_validator, generate_seed, some constants to 0.6.2; rename MIN_PENALTY_QUOTIENT to MIN_SLASHING_PENALTY_QUOTIENT
* rm get_previous_total_balance, get_current_epoch_boundary_attestations, get_previous_epoch_boundary_attestations, and get_previous_epoch_matching_head_attestations
* update BeaconState to 0.6.2; simplify legacy get_crosslink_committees_at_slot infrastructure a bit by noting that registry_change is always false; reimplment 0.5 get_crosslink_committees_at_slot in terms of 0.6 get_crosslink_committee
* mark process_deposit(...), get_block_root_at_slot(...), get_block_root(...), Deposit, BeaconBlockHeader, BeaconBlockBody, hash(...), get_active_index_root(...), various constants, get_shard_delta(...), get_epoch_start_shard(...), get_crosslink_committee(...), processRandao(...), processVoluntaryExits(...), cacheState(...) as 0.6.2
* rm removed-since-0.5 split(...), is_power_of_2(...), get_shuffling(...); rm 0.5 versions of get_active_validator_indices and get_epoch_committee_count; add a few tests for integer_squareroot
* mark bytes_to_int(...) and advanceState(...) as 0.6.2
* rm 0.5 get_attesting_indices; update get_attesting_balance to 0.6.2
* another tiny commit to poke AppVeyor to maybe not timeout at connecting to GitHub partway through CI: mark get_churn_limit(...), initiate_validator_exit(...), and Validator as 0.6.2
* mark get_attestation_slot(...), AttestationDataAndCustodyBit, and BeaconBlock as 0.6.2
* begin 0.6.0: new get_domain/increase_balance/reduce_balance, BeaconState.validator_balances -> BeaconState.balances, some renamed constants, transaction processing changes, SlashableAttestation field name changes, 0.6.0 get_beacon_proposer_index always uses given state's slot, update tests subrepo
* mark get_bitfield_bit/bls_verify_multiple/stat-list-lengths/is_active_validator/is_surround_vote/slot_to_epoch/int_to_bytes/etc as unchanged in 0.6.0; rm Eth1DataVote/maybe_reset_eth1_period and thus adjust expected tree hash test results
* mark verify_bitfield/bls_verify/deposit-contract/VoluntaryExit/PendingAttestation/Historicalbatch/Fork as 0.6.0; update DOMAIN_BEACON_BLOCK to DOMAIN_BEACON_PROPOSER
* update Crosslink to 0.6.0 (also requires tree hashing test result change, so isolate in individual commit)
* mark verify_merkle_branch/get_delayed_activation_exit_epoch/ProposerSlashing/Attestation/AttestationDataAndCustodyBit/hash/integer_squareroot/get_epoch_start_slot/is_double_vote/get_randao_mix/generate_seed as 0.6.0; update reward and penalty quotients; SlashableAttestation -> IndexedAttestation; rm get_fork_version; ATTESTATION_INCLUSION_REWARD_QUOTIENT -> PROPOSER_REWARD_QUOTIENT
* ssz: update to 0.5.1:ish
* slightly fewer seq allocations
* still a lot of potential for optimization
* fixes#174
* ssz: avoid reallocating leaves (logN merkle impl)