* reorganize ssz dependencies
This PR continues the work in
https://github.com/status-im/nimbus-eth2/pull/2646,
https://github.com/status-im/nimbus-eth2/pull/2779 as well as past
issues with serialization and type, to disentangle SSZ from eth2 and at
the same time simplify imports and exports with a structured approach.
The principal idea here is that when a library wants to introduce SSZ
support, they do so via 3 files:
* `ssz_codecs` which imports and reexports `codecs` - this covers the
basic byte conversions and ensures no overloads get lost
* `xxx_merkleization` imports and exports `merkleization` to specialize
and get access to `hash_tree_root` and friends
* `xxx_ssz_serialization` imports and exports `ssz_serialization` to
specialize ssz for a specific library
Those that need to interact with SSZ always import the `xxx_` versions
of the modules and never `ssz` itself so as to keep imports simple and
safe.
This is similar to how the REST / JSON-RPC serializers are structured in
that someone wanting to serialize spec types to REST-JSON will import
`eth2_rest_serialization` and nothing else.
* split up ssz into a core library that is independendent of eth2 types
* rename `bytes_reader` to `codec` to highlight that it contains coding
and decoding of bytes and native ssz types
* remove tricky List init overload that causes compile issues
* get rid of top-level ssz import
* reenable merkleization tests
* move some "standard" json serializers to spec
* remove `ValidatorIndex` serialization for now
* remove test_ssz_merkleization
* add tests for over/underlong byte sequences
* fix broken seq[byte] test - seq[byte] is not an SSZ type
There are a few things this PR doesn't solve:
* like #2646 this PR is weak on how to handle root and other
dontSerialize fields that "sometimes" should be computed - the same
problem appears in REST / JSON-RPC etc
* Fix a build problem on macOS
* Another way to fix the macOS builds
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
The spec imports are a mess to work with, so this branch cleans them up
a bit to ensure that we avoid generic sandwitches and that importing
stuff generally becomes easier.
* reexport crypto/digest/presets because these are part of the public
symbol set of the rest of the spec types
* don't export `merge` types from `base` - this causes circular deps
* fix circular deps in `ssz/spec_types` - this is the first step in
disentangling ssz from spec
* be explicit about phase0 vs altair - longer term, `altair` will become
the "natural" type set, then merge and so on, so no point in giving
`phase0` special preferential treatment
Simpler module name for stuff that covers forks
* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
* some whole-file copies from altair branch
* rpc/node_api and rpc/node_rest_api also need to be copied
* remove new sync committee-related functionality
* bump libp2p
* altair sync v2
Use V2 sync requests after the altair fork has happened, according to
the wall clock
* Fix the behavior of the v1 req/resp calls after Altair
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
* Implement split preset/config support
This is the initial bulk refactor to introduce runtime config values in
a number of places, somewhat replacing the existing mechanism of loading
network metadata.
It still needs more work, this is the initial refactor that introduces
runtime configuration in some of the places that need it.
The PR changes the way presets and constants work, to match the spec. In
particular, a "preset" now refers to the compile-time configuration
while a "cfg" or "RuntimeConfig" is the dynamic part.
A single binary can support either mainnet or minimal, but not both.
Support for other presets has been removed completely (can be readded,
in case there's need).
There's a number of outstanding tasks:
* `SECONDS_PER_SLOT` still needs fixing
* loading custom runtime configs needs redoing
* checking constants against YAML file
* yeerongpilly support
`build/nimbus_beacon_node --network=yeerongpilly --discv5:no --log-level=DEBUG`
* load fork epoch from config
* fix fork digest sent in status
* nicer error string for request failures
* fix tools
* one more
* fixup
* fixup
* fixup
* use "standard" network definition folder in local testnet
Files are loaded from their standard locations, including genesis etc,
to conform to the format used in the `eth2-networks` repo.
* fix launch scripts, allow unknown config values
* fix base config of rest test
* cleanups
* bundle mainnet config using common loader
* fix spec links and names
* only include supported preset in binary
* drop yeerongpilly, add altair-devnet-0, support boot_enr.yaml
* remove false OnBlockAdded dependency on phase.HashedBeaconState
* introduce altair data types into block_clearance; update some alpha.6 spec refs to alpha.7; add get_active_validator_indices_len ForkedHashedBeaconState wrapper
* switch many modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7 and remove rm'd G2_POINT_AT_INFINITY
* switch more modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7
* remove unnecessary phase0-only wrapper of get_attesting_indices(); allow signatures_batch to process either fork; remove O(n^2) nested loop in process_inactivity_updates(); add altair support to getAttestationsforTestBlock()
* add Altair versions of asSigVerified(), asTrusted(), and makeBeaconBlock()
* fix spec URL to be Altair for Altair makeBeaconBlock()
* use ForkedHashedBeaconState in StateData
* fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign()
* avoid stack allocation in maybeUpgradeStateToAltair()
* create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit()
* use getStateRoot() instead of various state.data.hbsPhase0.root
* remove withStateVars.hashedState(), which doesn't work as a design anymore
* introduce spec/datatypes/altair into beacon_chain_db
* fix inefficient codegen for getStateField(largeStateField)
* state_transition_slots() doesn't either need/use blocks or runtime presets
* combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization
* getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...)
* fix rollback
* switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState
* remove state_transition(phase0.HashedBeaconState)
* remove process_slots(phase0.HashedBeaconState)
* remove state_transition_block(phase0.HashedBeaconState)
* remove unused callWithBS(); separate case expression from if statement
* switch back from nested-ref-object construction to (ref Foo)(Bar())
* proposed structure for hf1
* refactor datatypes.nim into datatypes/{base, phase0, hf1}.nim
* hf1 is Altair
* some syncing with alpha 2
* adjust epoch processing to disambiguate access to RewardFlags
* relocate StateData to stay consistent with meaning phase 0 StateData
* passes v1.1.0 alpha 5 SSZ consensus object tests
* Altair block header test fixtures work
* fix slash_validator() so that Altair attester slashings, proposer slashings, and voluntary exit textures work
* deposit operation Altair test fixtures work
* slot sanity and all but a couple epoch transition tests switched to Altair
* attestation Altair test fixtures work
* Altair block sanity test fixtures work
* add working altair sync committee tests
* improve workarounds for sum-types-across-modules Nim bug; incorporate SignedBeaconBlock root reconstuction to SSZ byte reader
* use StateData in place of BeaconState outside state transition code
* propagate more StateData usage
* remove withStateVars().state
* wrap get_beacon_committee(BeaconState, ...) as gbc(StateData, ...)
* switch makeAttestation() to use StateData
* use StateData wrapper/dispatcher for get_committee_count_per_slot()
* convert AttestationCache.init(), weak subjectivity functions, and updateValidatorMetrics()
* add get_shuffled_active_validator_indices(StateData) and get_block_root_at_slot(StateData)
* switch makeAttestationData() to StateData
* sync AllTests-mainnet.md after rebase
* Revert "Revert "Upgrade database schema" (#2570)"
This reverts commit 6057c2ffb4.
* ssz: fix loading empty lists into existing instances
Not a problem earlier because we didn't reuse instances
* bump nim-eth
* bump nim-web3
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.
With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.
* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
This PR decreases the lead subscription time which should help
decrease bandwidth usage and CPU making the subscription for future
aggregation happen a bit later. There's room for more tuning here,
probably.
* fix missing negation from in #2550
* fix silly bitarray issues
* decrease subnet lead subscription time
* log all subnet switching source data
* rename subnet trackers to refer to stability and aggregate subnets
* more tests
Currently, we have a bit of a convoluted flow where when sending
attestations, we start broadcasting them over gossip then pass them to
the attestation validation to include them in the local attestation pool
- it should be the other way around: we should be checking attestations
_before_ gossipping them - this serves as an additional safety net to
ensure that we don't publish junk - this becomes more important when
publishing attestations from the API.
Also, the REST API was performing its own validation meaning
attestations coming from REST would be validated twice - finally, the
JSON RPC wasn't pre-validating and would happily broadcast invalid
attestations.
* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.
The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.
Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.
A basic greed scoring mechanism is used to select attestations for
blocks - attestations are added based on how much many new votes they
bring to the table.
* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes#2490)
* Reset cached indices when resetting cache on SSZ read
When deserializing into an existing structure, the cache should be
cleared - goes for json also. Also improve error messages.
Since quite a lot of additional procs were now compiled as generics, this lead to compiler bugs that had
to be worked-around:
* The `Domain` type was renamed to `Eth2Domain` to avoid compilation errors
due to conflicts with `nativesockets.Domain`.
Similarly, `eth2_network.KeyPair` was renamed to `NetKeyPair`.
* A new more robust version of `hexToByteArray` was added to stew
* performance fixes
* don't mark tree cache as dirty on read-only List accesses
* store only blob in memory for keys and signatures, parse blob lazily
* compare public keys by blob instead of parsing / converting to raw
* compare Eth2Digest using non-constant-time comparison
* avoid some unnecessary validator copying
This branch will in particular speed up deposit processing which has
been slowing down block replay.
Pre (mainnet, 1600 blocks):
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3450.269, 0.000, 3450.269, 3450.269, 1, Initialize DB
0.417, 0.822, 0.036, 21.098, 1400, Load block from database
16.521, 0.000, 16.521, 16.521, 1, Load state from database
27.906, 50.846, 8.104, 1507.633, 1350, Apply block
52.617, 37.029, 20.640, 135.938, 50, Apply epoch block
```
Post:
```
3502.715, 0.000, 3502.715, 3502.715, 1, Initialize DB
0.080, 0.560, 0.035, 21.015, 1400, Load block from database
17.595, 0.000, 17.595, 17.595, 1, Load state from database
15.706, 11.028, 8.300, 107.537, 1350, Apply block
33.217, 12.622, 17.331, 60.580, 50, Apply epoch block
```
* more perf fixes
* load EpochRef cache into StateCache more aggressively
* point out security concern with public key cache
* reuse proposer index from state when processing block
* avoid genericAssign in a few more places
* don't parse key when signature is unparseable
* fix `==` overload for Eth2Digest
* preallocate validator list when getting active validators
* speed up proposer index calculation a little bit
* reuse cache when replaying blocks in ncli_db
* avoid a few more copying loops
```
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3279.158, 0.000, 3279.158, 3279.158, 1, Initialize DB
0.072, 0.357, 0.035, 13.400, 1400, Load block from database
17.295, 0.000, 17.295, 17.295, 1, Load state from database
5.918, 9.896, 0.198, 98.028, 1350, Apply block
15.888, 10.951, 7.902, 39.535, 50, Apply epoch block
0.000, 0.000, 0.000, 0.000, 0, Database block store
```
* clear full balance cache before processing rewards and penalties
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
3947.901, 0.000, 3947.901, 3947.901, 1, Initialize DB
0.124, 0.506, 0.026, 202.370, 363345, Load block from database
97.614, 0.000, 97.614, 97.614, 1, Load state from database
0.186, 0.188, 0.012, 99.561, 357262, Advance slot, non-epoch
14.161, 5.966, 1.099, 395.511, 11524, Advance slot, epoch
1.372, 4.170, 0.017, 276.401, 363345, Apply block, no slot processing
0.000, 0.000, 0.000, 0.000, 0, Database block store
```
* Handle some web3 timeouts better
* Add support for developer .env files
* Eth1 improvements; Mainnet genesis state
Notable changes:
* The deposits table have been removed from the database. The client
will no longer process all deposits on start-up.
* The network metadata now includes a "state snapshot" of the deposit
contract. This allows the client to skip syncing deposits made prior
to the snapshot (i.e. genesis). Suitable metadata added for Pyrmont
and Mainnet.
* The Eth1 monitor won't be started unless there are validators attached
to the node.
* The genesis detection code is now optional and disabled by default
* Bugfix: The client should not produce blocks that will fail validation
when it hasn't downloaded the latest deposits yet
* Bugfix: Work around the database corruption affecting Pyrmont nodes
* Remove metadata for Toledo and Medalla
about 40% better slot processing times (with LTO enabled) - these don't
do BLS but are used
heavily during replay (state transition = slot + block transition)
tests using a recent medalla state and advancing it 1000 slots:
```
./ncli slots --preState2:state-302271-3c1dbf19-c1f944bf.ssz --slot:1000
--postState2:xx.ssz
```
pre:
```
All time are ms
Average, StdDev, Min, Max, Samples,
Test
Validation is turned off meaning that no BLS operations are performed
39.236, 0.000, 39.236, 39.236, 1,
Load state from file
0.049, 0.002, 0.046, 0.063, 968,
Apply slot
256.504, 81.008, 213.471, 591.902, 32,
Apply epoch slot
28.597, 0.000, 28.597, 28.597, 1,
Save state to file
```
cast:
```
All time are ms
Average, StdDev, Min, Max, Samples,
Test
Validation is turned off meaning that no BLS operations are performed
37.079, 0.000, 37.079, 37.079, 1,
Load state from file
0.042, 0.002, 0.040, 0.090, 968,
Apply slot
215.552, 68.763, 180.155, 500.103, 32,
Apply epoch slot
25.106, 0.000, 25.106, 25.106, 1,
Save state to file
```
cast+rewards:
```
All time are ms
Average, StdDev, Min, Max, Samples,
Test
Validation is turned off meaning that no BLS operations are performed
40.049, 0.000, 40.049, 40.049, 1,
Load state from file
0.048, 0.001, 0.045, 0.060, 968,
Apply slot
164.981, 76.273, 142.099, 477.868, 32,
Apply epoch slot
28.498, 0.000, 28.498, 28.498, 1,
Save state to file
```
cast+rewards+shr
```
All time are ms
Average, StdDev, Min, Max, Samples,
Test
Validation is turned off meaning that no BLS operations are performed
12.898, 0.000, 12.898, 12.898, 1,
Load state from file
0.039, 0.002, 0.038, 0.054, 968,
Apply slot
139.971, 68.797, 120.088, 428.844, 32,
Apply epoch slot
24.761, 0.000, 24.761, 24.761, 1,
Save state to file
```