nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
zah	eb2dc5cbbb	Implement the new Altair req/resp protocols (#2676 ) * Implement the new Altair req/resp protocols Also fixes the altair message-id computation by providing the correct forkdigest prefix in `isAltairTopic`. Co-authored-by: Tanguy Cizain <tanguycizain@gmail.com>	2021-07-07 12:09:47 +03:00
tersec	ac7f719382	use isomorphicCast between beacon block types (#2698 )	2021-07-06 14:32:49 +02:00
tersec	7577f8c2ef	add blockchain_dag altair database reading; add rollback tests (#2683 ) * add blockchain_dag altair database reading; add rollback tests; fix some unnecessary type conversions * remove debugging scaffolding * proposeSignedBlock() will need to be async for merge; introduce altair types to VC	2021-06-29 15:09:29 +00:00
tersec	445def6c8b	block_clearance, ncli, and ncli_db Altair state saving (#2672 ) * block_clearance, ncli, and ncli_db Altair state saving * avoid invalidating SSZ hash caches with every assignment	2021-06-24 18:34:08 +00:00
tersec	41e0a7abc0	introduce database support for Altair (#2667 ) * introduce immutable Altair BeaconState * add database support for Altair blocks and states * add tests for Altair get/put/contains/delete state * enable blockchain_dag Altair state database storing * properly return error on getting missing altair block	2021-06-24 07:11:47 +00:00
tersec	ae1abf24af	add Altair support to block quarantine/clearance and block_sim (#2662 ) * add Altair support to the block quarantine * switch some spec/datatypes imports to spec/datatypes/base * add Altair support to block_clearance * allow runtime configuration of Altair transition slot * enable Altair in block_sim, including in CI	2021-06-23 14:43:18 +00:00
tersec	b1d5609171	remove false OnBlockAdded dependency on phase0 HashedBeaconState (#2661 ) * remove false OnBlockAdded dependency on phase.HashedBeaconState * introduce altair data types into block_clearance; update some alpha.6 spec refs to alpha.7; add get_active_validator_indices_len ForkedHashedBeaconState wrapper * switch many modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7 and remove rm'd G2_POINT_AT_INFINITY * switch more modules from using datatypes (with phase0 states/blocks) to datatypes/base (fork-independent); update spec refs from alpha.6 to alpha.7 * remove unnecessary phase0-only wrapper of get_attesting_indices(); allow signatures_batch to process either fork; remove O(n^2) nested loop in process_inactivity_updates(); add altair support to getAttestationsforTestBlock() * add Altair versions of asSigVerified(), asTrusted(), and makeBeaconBlock() * fix spec URL to be Altair for Altair makeBeaconBlock()	2021-06-21 08:35:24 +00:00
tersec	9616220280	implement Altair attestation pool cache init (#2659 ) * implement Altair attestation pool cache init * remove code duplication around previous/current epoch updates	2021-06-17 17:13:14 +00:00
tersec	1c3314f08b	update to Altair as of v1.1.0-alpha.7 (#2649 ) * update to Altair as of v1.1.0-alpha.7 * introduce Altair types into attestation pool * avoid allocating/copying pubkeys excessively in get_next_sync_committee()	2021-06-14 17:42:46 +00:00
tersec	146fa48454	use ForkedHashedBeaconState in StateData (#2634 ) * use ForkedHashedBeaconState in StateData * fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign() * avoid stack allocation in maybeUpgradeStateToAltair() * create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit() * use getStateRoot() instead of various state.data.hbsPhase0.root * remove withStateVars.hashedState(), which doesn't work as a design anymore * introduce spec/datatypes/altair into beacon_chain_db * fix inefficient codegen for getStateField(largeStateField) * state_transition_slots() doesn't either need/use blocks or runtime presets * combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization * getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...) * fix rollback * switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState * remove state_transition(phase0.HashedBeaconState) * remove process_slots(phase0.HashedBeaconState) * remove state_transition_block(phase0.HashedBeaconState) * remove unused callWithBS(); separate case expression from if statement * switch back from nested-ref-object construction to (ref Foo)(Bar())	2021-06-11 20:51:46 +03:00
Jacek Sieka	9193be9b7b	fix epoch logging (fixes #2283 ) (#2642 ) Also put epoch first to disambiguate vs slot	2021-06-11 01:07:16 +03:00
Jacek Sieka	d859bc12f0	write uncompressed validator keys to database (#2639 ) * write uncompressed validator keys to database Loading 150k+ validator keys on startup in compressed format takes a lot of time - better store them in uncompressed format which makes behaviour just after startup faster / more predictable. * refactor cached validator key access * fix isomorphic cast to work with non-var instances * remove cooked pubkey cache - directly use database cache in chaindag as well (one less cache to keep in sync) * bump blscurve, introduce loadValid for known-to-be-valid keys	2021-06-10 10:37:02 +03:00
tersec	8ebd496fbe	Altair transition tests (#2624 ) * Working Altair transition tests * with fixed upstream test vectors, remove state root workaround * switch upgrade_to_altair() to returning a reference * remove test_state_transition * fix invalid fork state/block combinations error messages * avoid memory copies by reintroducing state_transition_slots(var SomeHashedBeaconState)	2021-06-04 10:38:00 +00:00
Jacek Sieka	b11da2cb34	fix state cache loading * load the cache of the current state epoch instead of the target state epoch, when applying states and slots * load state cache for each slot/block (for longer slot jumps) * load state cache after full updateStateData * look up two state cache epochs, instead of the same epoch twice :)	2021-06-03 21:37:52 +03:00
tersec	28a5bca71a	split state_transition() into slots/block parts and use only block where appropriate (#2630 )	2021-06-03 11:42:25 +02:00
Jacek Sieka	0fb02b5206	log state update duration, lower info threshold for detail logging	2021-06-01 20:43:44 +03:00
Jacek Sieka	abe0d7b4ae	single validator key cache Instead of keeping a validator key list per EpochRef, this PR introduces a single shared validator key list in ChainDAG, and cleans up some other ChainDAG and key-related issues. The PR does not introduce the validator key list in the state transition - this is because we batch-check all signatures before entering the spec code, thus the spec code never hits the cache. A future refactor should _probably_ remove the threadvar altogether. There's a few other small fixes in here that make the flow easier to read: * fix `var ChainDAGRef` -> `ChainDAGRef` * fix `var QuarantineRef` -> `QuarantineRef` * consistent `dag` variable name * avoid using threadvar pubkey cache in most cases * better error messages in batch signature checking	2021-06-01 20:43:44 +03:00
tersec	ea9ceb693a	update ChainDAG.effective_balance() to use StateData; rm ChainDAG.getBlockByPreciseSlot() (#2622 ) * update ChainDAG.effective_balance() to use StateData; rm unused ChainDAG.getBlockByPreciseSlot() * update get_effective_balances to avoid god object; avoid most memory allocation in Altair epoch reward and penalty processing	2021-06-01 12:40:13 +00:00
Jacek Sieka	9b89f58089	revert advance back to trace	2021-06-01 14:09:11 +02:00
Jacek Sieka	60df17786e	avoid reading legacy db on write * don't consider legacy database when writing state - this read is slow on kvstore * avoid epoch transition when there's an exact match in cache already * simplify init to only consider checkpoint states	2021-05-30 12:32:51 +03:00
Jacek Sieka	df7bc87af5	Pre-compute slot transition for clearance state This way we perform the expensive epoch processing before the block arrives. Of course, this may lead to speculative misses which in turn lead to replays - it's likely that in the case of a miss, we'll see a replay regardless.	2021-05-30 12:04:09 +03:00
Jacek Sieka	2df8a3b28d	add more block processing durations (#2611 )	2021-05-28 21:03:20 +02:00
Jacek Sieka	7f52ffb8d9	clean up block processing (#2610 ) * gossip_to_consensus -> block_processor (it's processing only blocks, but not only from gossip) * measure queue and validation time for blocks * measure assignment and state loading times for updateStateData * avoid some unnecessary block copies in block sync * warn that database is corrupt if we hit tail without a state	2021-05-28 19:34:00 +03:00
tersec	46c5a0110a	log doppelganger attestation signature; rm withState.HashedBeaconState uses (#2608 )	2021-05-28 15:51:15 +03:00
Jacek Sieka	eebc828778	create new database in separate file (#2596 ) The V1 table structure shows great improvements in performance, but if there's an old `kvstore` without rowid:s, these benefits are nullified: reorgs during writes and deletes remain expensive (even if the degradation is reduced somewhat). This PR creates the tables in a new file instead, and uses the old file as a read-only store - this has several interesting properties: * the old database is left completely untouched - this guarantees that downgrades work smoothly (they'll only need to resync their missing portions) * starting sync after this PR means only a v1 database is created * v0 databases stick around - no migration is performed (for now) Future PR:s can introduce migration of the data from one database to another - a simple copy will take hours which is downtime we want to avoid - at that point, it might make sense to migrate straight to era files instead.	2021-05-26 09:07:18 +02:00
tersec	0b0bfd1de0	use StateData in place of BeaconState outside state transition code (#2551 ) * use StateData in place of BeaconState outside state transition code * propagate more StateData usage * remove withStateVars().state * wrap get_beacon_committee(BeaconState, ...) as gbc(StateData, ...) * switch makeAttestation() to use StateData * use StateData wrapper/dispatcher for get_committee_count_per_slot() * convert AttestationCache.init(), weak subjectivity functions, and updateValidatorMetrics() * add get_shuffled_active_validator_indices(StateData) and get_block_root_at_slot(StateData) * switch makeAttestationData() to StateData * sync AllTests-mainnet.md after rebase	2021-05-21 09:23:28 +00:00
Jacek Sieka	97f4e1fffe	Db1 cont (#2573 ) * Revert "Revert "Upgrade database schema" (#2570)" This reverts commit `6057c2ffb4`. * ssz: fix loading empty lists into existing instances Not a problem earlier because we didn't reuse instances * bump nim-eth * bump nim-web3	2021-05-17 18:37:26 +02:00
tersec	6057c2ffb4	Revert "Upgrade database schema" (#2570 ) This reverts commit `22ddf74752`.	2021-05-17 06:34:44 +00:00
Jacek Sieka	22ddf74752	Upgrade database schema The `kvstore` design we're using now turns out to not be the best way to use `sqlite` - in particular, there are some significant benefits to using rowid in certain situations and to keep data in separate tables. With this branch, there are massive improvements in startup time (seconds instead of minutes) and state/block storage and pruning times (milliseconds instead of seconds) - these improvements can in particular be seen on slow drives and translate directly into better attestation performance. * update kvstore to new keyspace design * remove `DirStoreRef` and the hidden `--state-db-kind` option - this was an experiment to store large blobs in files, but with the new kvstore, there's no compelling reason to do so * remove `DbMap` - unused and would need updating for new keyspace design * introduce separate tables for each data type (blocks, states etc) * remove "WITHOUT ROWID" pessimization for tables with large blobs * close DbSeq statements explicitly (and earlier) * store beacon block summaries in separate table, without SSZ compression and load them all with single query on startup * stop storing backwards compat full states * mark genesis beacon block as trusted * avoid faststreams when loading SSZ data * remove `DisagreementBehavior` (unused)	2021-05-14 20:05:23 +03:00
Jacek Sieka	867d8f3223	Perform attestation check before broadcast (#2550 ) Currently, we have a bit of a convoluted flow where when sending attestations, we start broadcasting them over gossip then pass them to the attestation validation to include them in the local attestation pool - it should be the other way around: we should be checking attestations _before_ gossipping them - this serves as an additional safety net to ensure that we don't publish junk - this becomes more important when publishing attestations from the API. Also, the REST API was performing its own validation meaning attestations coming from REST would be validated twice - finally, the JSON RPC wasn't pre-validating and would happily broadcast invalid attestations. * Unified attestation production pipeline with the same flow for gossip, locally and API-produced attestations: all are now validated and entered into the pool, then broadcast/republished * Refactor subnet handling with specific SubnetId alias, streamlining where subnets are computed, avoiding the need to pass around the number of active validators * Move some of the subnet handling code to eth2_network * Use BitArray throughout for subnet handling	2021-05-10 09:13:36 +02:00
Jacek Sieka	646923c3dd	add attestation stats tool to ncli_db (#2539 ) This also makes future efforts to provide metrics and logs for attestation efficiency easier * Export rewards from epoch transition * Use less memory for reward calculation (bool -> set[enum], field alignment) * Reuse reward memory when replaying, avoiding spike * Allow replaying any range in ncli_db benchmark	2021-05-07 13:36:21 +02:00
tersec	1d6c8ee9ab	store full state 4x less often (#2542 )	2021-05-06 07:36:18 +02:00
Jacek Sieka	427c0f307c	avoid extraneous hash root calculation (#2537 ) When applying a block, we'll currently compute a state root for the state after slot processing but before block processing - this is unnecessary when a block is being applied because the intermediate state root is never observed.	2021-05-05 08:54:21 +02:00
Jacek Sieka	ce49da6c0a	Introduce unittest2 and junit reports (#2522 ) * Introduce unittest2 and junit reports * fix XML path * don't combine multiple CI runs * fixup * public combined report also Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>	2021-04-28 18:41:02 +02:00
Jacek Sieka	7dba1b37dd	remove attestation/aggregate queue (#2519 ) With the introduction of batching and lazy attestation aggregation, it no longer makes sense to enqueue attestations between the signature check and adding them to the attestation pool - this only takes up valuable CPU without any real benefit. * add successfully validated attestations to attestation pool directly * avoid copying participant list around for single-vote attestations, pass single validator index instead * release decompressed gossip memory earlier, specially during async message validation * use cooked signatures in a few more places to avoid reloads and errors * remove some Defect-raising versions of signature-loading * release decompressed data memory before validating message	2021-04-26 22:39:44 +02:00
Jacek Sieka	54d6884c89	fix sync issue when upgrading from 1.1.0-inited db This patch writes a full genesis state to `kvstore` if one was missing, which fixes 1.2.0 restarting sync when upgrading from 1.1.0, or when downgrading to a pre-1.1.0 release.	2021-04-20 16:55:18 +03:00
tersec	99fccaee6e	more abstraction over BeaconState (#2509 ) * more abstraction over BeaconState * use HashedBeaconState copy of htr	2021-04-16 08:49:37 +00:00
Jacek Sieka	f1f424cc2d	attestation processing speedups * avoid creating indexed attestation just to check signatures - above all, don't create it when not checking signatures ;) * avoid pointer op when adding attestation to pool * better iterator for yielding attestations * add metric / log for attestation packing time	2021-04-14 21:51:17 +03:00
Jacek Sieka	4ed2e34a9e	Revamp attestation pool This is a revamp of the attestation pool that cleans up several aspects of attestation processing as the network grows larger and block space becomes more precious. The aim is to better exploit the divide between attestation subnets and aggregations by keeping the two kinds separate until it's time to either produce a block or aggregate. This means we're no longer eagerly combining single-vote attestations, but rather wait until the last moment, and then try to add singles to all aggregates, including those coming from the network. Importantly, the branch improves on poor aggregate quality and poor attestation packing in cases where block space is running out. A basic greedy scoring mechanism is used to select attestations for blocks - attestations are added based on how many new votes they bring to the table. * Collect single-vote attestations separately and store these until it's time to make aggregates * Create aggregates based on single-vote attestations * Select _best_ aggregate rather than _first_ aggregate when on aggregation duty * Top up all aggregates with singles when it's time to make the attestation cut, thus improving the chances of grabbing the best aggregates out there * Improve aggregation test coverage * Improve bitseq operations * Simplify aggregate signature creation * Make attestation cache temporary instead of storing it in attestation pool - most of the time, blocks are not being produced, no need to keep the data around * Remove redundant aggregate storage that was used only for RPC * Use tables to avoid some linear seeks when looking up attestation data * Fix long cleanup on large slot jumps * Avoid some pointers * Speed up iterating all attestations for a slot (fixes #2490)	2021-04-13 20:24:02 +03:00
cheatfate	477decbcf5	Address #2490 .	2021-04-13 17:07:41 +03:00
tersec	498c998552	abstract over most withStateVars/withState state var usage (#2484 ) * abstract over most withStateVars/withState state var usage * cleanups	2021-04-13 15:05:44 +02:00
tersec	79bb0d5379	only deserialize attestation and aggregation gossiped signatures once (#2472 ) * only deserialize attestation and aggregation gossiped signatures once * re-indent some aggregate checks into block scope * spelling * remove debugging assertion * put part of gossip validation back into block context * attestation pool test signature loading isn't so unsafe, and exportRaw isn't free * remove more development doAsserts; don't exportRaw in loops	2021-04-09 14:59:24 +02:00
tersec	d3cad92693	remove some BeaconState use and abstract over other uses (#2482 ) * remove some BeaconState use and abstract over other uses * remove out-of-context comment	2021-04-08 08:24:25 +00:00
tersec	8d7792e6e9	add Altair domains and participation flags; clean up imports (#2462 )	2021-04-04 16:24:45 +00:00
Jacek Sieka	3cd7cebc7c	Fix block dag pruning frequency (#2469 ) Should always prune after finality change but not more than once	2021-04-01 13:26:17 +02:00
tersec	bd8b60f8c8	use epochref for get_committee_assignments() to avoid repeated shuffling (#2463 ) * use epochref for get_committee_assignments() to avoid repeated shuffling * remove unnecessary imports and StateCache() construction	2021-03-30 15:01:47 +00:00
Mamy Ratsimbazafy	a9938a2067	Fix pruning time display (#2461 ) * Fix pruning time display * remove import	2021-03-30 09:40:28 +02:00
Jacek Sieka	8b76ceed52	Fix minor exception effect issues (#2448 ) Makes code compatible with https://github.com/status-im/nim-chronos/pull/166 without requiring it.	2021-03-24 17:20:55 +01:00
tersec	b059cb42c5	increase block proposal speed with many validators (#2423 ) * increase block proposal speed with many validators * document CookedSig rationale	2021-03-17 13:35:59 +00:00
Jacek Sieka	3cb31e66b4	set upper bound on EpochRef cache (#2403 ) * set upper bound on EpochRef cache * max 32 EpochRef instances * less memory waste in BlockRef by removing EpochRef seq that is mostly unused (~20mb) * less memory waste in dag block lookup by not keeping an extra copy of digest (~70mb) * fix `==` and `$` for Eth2Digest * remove `ChainDAG.tmpState` (~50mb?) all in all, this branch cuts mainnet memory usage by ~160-180mb and puts limits on EpochRef cache usage - where normally it hovered around 950mb before, it's now sitting at 600-700mb on my machine. * docs	2021-03-17 11:17:15 +01:00
Mamy Ratsimbazafy	6e38d474cc	Add pruning timings (#2422 )	2021-03-17 07:30:16 +01:00
tersec	8def2486b0	immutable validator database factoring (#2297 ) * initial immutable validator database factoring * remove changes from chain_dag: this abstraction properly belongs in beacon_chain_db * add merging mutable/immutable validator portions; individually test database roundtripping of immutable validators and states-sans-immutable-validators * update test summaries * use stew/assign2 instead of Nim assignment * add reading/writing of immutable validators in chaindag * remove unused import * replace chunked k/v store of immutable validators with per-row SQL table storage * use List instead of HashList * un-stub some ncli_db code so that it uses * switch HashArray to array; move BeaconStateNoImmutableValidators from datatypes to beacon_chain_db * begin only-mutable-part state storage * uncomment some assigns * work around https://github.com/nim-lang/Nim/issues/17253 * fix most of the issues/oversights; local sim runs again * fix test suite by adding missing beaconstate field to copy function * have ncli bench also store immutable validators * extract some immutable-validator-specific code from the beacon chain db module * add more rigorous database state roundtripping, with changing validator sets * adjust ncli_db to use new schema * simplify putState/getState by moving all immutable validator accounting into beacon state DB * remove redundant test case and move code to immutable-beacon-chain module * more efficient, but still brute-force, mutable+immutable validator merging * reuse BeaconState in getState * ensure HashList/HashArray caches are cleared when reusing getState buffers; add ncli_db and a unit test to verify this * HashList.clear() -> HashList.clearCache() * only copy incrementally necessary immutable validators * increase strictness of test cases and fix/work around resulting HashList cache invalidation issues * remove explanatory scaffolding * allow for storage of full (with all validators) states for backwards/forwards-compatibility * adjust DbSeq type usage * store full, with-validators, state every 64 epochs to enable reverting versions * reduce memory allocation and intermediate objects in state storage codepath * eliminate allocation/copying through intermediate BeaconStateNoImmutableValidators objects * skip benchmarking initial genesis-validator-heavy state store * always store new-style state and sometimes old-style state * document intent behind BeaconState/Validator type-punnery * more accurate failure message on SQLite in-memory database initialization failure	2021-03-15 14:11:51 +00:00
Jacek Sieka	aabdd34704	e2store: add era format (#2382 ) Era files contain 8192 blocks and a state corresponding to the length of the array holding block roots in the state, meaning that each block is verifiable using the pubkeys and block roots from the state. Of course, one would need to know the root of the state as well, which is available in the first block of the _next_ file - or known from outside. This PR also adds an implementation to write e2s, e2i and era files, as well as a python script to inspect them. All in all, the format is very similar to what goes on in the network requests meaning it can trivially serve as a backing format for serving said requests. Mainnet, up to the first 671k slots, take up 3.5gb - in each era file, the BeaconState contributes about 9mb at current validator set sizes, up from ~3mb in the early blocks, for a grand total of ~558mb for the 82 eras tested - this overhead could potentially be calculated but one would lose the ability to verify individual blocks (eras could still be verified using historical roots). ``` -rw-rw-r--. 1 arnetheduck arnetheduck 16 5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2i -rw-rw-r--. 1 arnetheduck arnetheduck 1,8M 5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2s -rw-rw-r--. 1 arnetheduck arnetheduck 65K 5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2i -rw-rw-r--. 1 arnetheduck arnetheduck 18M 5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2s ... -rw-rw-r--. 1 arnetheduck arnetheduck 65K 5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2i -rw-rw-r--. 1 arnetheduck arnetheduck 68M 5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2s -rw-rw-r--. 1 arnetheduck arnetheduck 61K 5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2i -rw-rw-r--. 1 arnetheduck arnetheduck 62M 5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2s ```	2021-03-15 11:31:39 +01:00
Mamy Ratsimbazafy	8e28a05cea	Move pruning out of latency critical path (#2384 ) * Deferred DAG and fork choice pruning * fixup * Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSLotEnd for state pruning * no need to store needPruning in the data structure * lastPrunePoint is updated in pruning proc * Split eager and LazyPruning * enforce pruning in updateHead	2021-03-09 15:36:17 +01:00
Mamy Ratsimbazafy	de1060e7f3	centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383 )	2021-03-06 08:32:55 +01:00
Mamy Ratsimbazafy	d47f53cd9d	Reorg (5/5) (#2377 ) * Reorg things left into networking and gossip_processing * time -> beacon_clock * fix builds	2021-03-05 14:12:00 +01:00
Mamy Ratsimbazafy	5d7f9c3a04	Consensus object pools [reorg 4/5] (#2374 ) * Add documentation * make test doesn't try to build the beacon node :/	2021-03-04 10:13:44 +01:00


157 Commits