Commit Graph

136 Commits

Author SHA1 Message Date
Jacek Sieka 2df8a3b28d
add more block processing durations (#2611) 2021-05-28 21:03:20 +02:00
Jacek Sieka 7f52ffb8d9
clean up block processing (#2610)
* gossip_to_consensus -> block_processor (it's processing only blocks,
but not only from gossip)
* measure queue and validation time for blocks
* measure assignment and state loading times for updateStateData
* avoid some unnecessary block copies in block sync
* warn that database is corrupt if we hit tail without a state
2021-05-28 19:34:00 +03:00
tersec 46c5a0110a
log doppelganger attestation signature; rm withState.HashedBeaconState uses (#2608) 2021-05-28 15:51:15 +03:00
Jacek Sieka eebc828778
create new database in separate file (#2596)
The V1 table structure shows great improvements in performance, but if
there's an old `kvstore` without rowids, these benefits are nullified:
reorgs during writes and deletes remain expensive (even if the
degradation is somewhat reduced).

This PR creates the tables in a new file instead, and uses the old file
as a read-only store - this has several interesting properties:

* the old database is left completely untouched - this guarantees that
downgrades work smoothly (they'll only need to resync their missing
portions)
* starting sync after this PR means only a v1 database is created
* v0 databases stick around - no migration is performed (for now)

Future PRs can introduce migration of the data from one database to
another - a simple copy would take hours, which is downtime we want to
avoid - at that point, it might make sense to migrate straight to era
files instead.
2021-05-26 09:07:18 +02:00
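A minimal Python/sqlite3 sketch of the layering described in the commit above - the table and column names (`blocks`, `kvstore`, `key`, `value`) are assumptions for illustration, not the actual schema:

```python
import sqlite3

class LayeredStore:
    """New writable database in a separate file; old file opened read-only."""
    def __init__(self, new_path: str, old_path: str):
        self.new = sqlite3.connect(new_path)
        # mode=ro guarantees the legacy file stays byte-for-byte untouched,
        # so a downgrade finds it exactly as it was
        self.old = sqlite3.connect(f"file:{old_path}?mode=ro", uri=True)
        self.new.execute(
            "CREATE TABLE IF NOT EXISTS blocks (root BLOB PRIMARY KEY, data BLOB)")

    def get_block(self, root: bytes):
        row = self.new.execute(
            "SELECT data FROM blocks WHERE root = ?", (root,)).fetchone()
        if row is None:
            # fall back to the old store for data written before the upgrade
            row = self.old.execute(
                "SELECT value FROM kvstore WHERE key = ?", (root,)).fetchone()
        return row[0] if row else None

    def put_block(self, root: bytes, data: bytes):
        # all new writes land in the new file only
        with self.new:
            self.new.execute(
                "INSERT OR REPLACE INTO blocks (root, data) VALUES (?, ?)",
                (root, data))
```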
tersec 0b0bfd1de0
use StateData in place of BeaconState outside state transition code (#2551)
* use StateData in place of BeaconState outside state transition code

* propagate more StateData usage

* remove withStateVars().state

* wrap get_beacon_committee(BeaconState, ...) as gbc(StateData, ...)

* switch makeAttestation() to use StateData

* use StateData wrapper/dispatcher for get_committee_count_per_slot()

* convert AttestationCache.init(), weak subjectivity functions, and updateValidatorMetrics()

* add get_shuffled_active_validator_indices(StateData) and get_block_root_at_slot(StateData)

* switch makeAttestationData() to StateData

* sync AllTests-mainnet.md after rebase
2021-05-21 09:23:28 +00:00
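A hypothetical Python sketch of the wrapper/dispatcher pattern the bullets above describe - `BeaconState`, `StateData` and `get_current_epoch` stand in for the Nim types and spec functions, with the mainnet 32-slot epoch used only for the example:

```python
from dataclasses import dataclass

SLOTS_PER_EPOCH = 32          # mainnet constant, used only for this example

@dataclass
class BeaconState:            # stand-in for the spec-level state type
    slot: int

@dataclass
class StateData:              # wrapper carrying the state plus cached metadata
    data: BeaconState
    block_root: bytes = b""

def get_current_epoch(state: BeaconState) -> int:
    # spec-level accessor, used inside state-transition code
    return state.slot // SLOTS_PER_EPOCH

def get_current_epoch_sd(state_data: StateData) -> int:
    # dispatcher: code outside the state transition passes StateData around
    # and unwraps it only at this boundary
    return get_current_epoch(state_data.data)

assert get_current_epoch_sd(StateData(BeaconState(slot=64))) == 2
```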
Jacek Sieka 97f4e1fffe
Db1 cont (#2573)
* Revert "Revert "Upgrade database schema" (#2570)"

This reverts commit 6057c2ffb4.

* ssz: fix loading empty lists into existing instances

Not a problem earlier because we didn't reuse instances

* bump nim-eth

* bump nim-web3
2021-05-17 18:37:26 +02:00
tersec 6057c2ffb4
Revert "Upgrade database schema" (#2570)
This reverts commit 22ddf74752.
2021-05-17 06:34:44 +00:00
Jacek Sieka 22ddf74752 Upgrade database schema
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.

With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.

* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
2021-05-14 20:05:23 +03:00
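A sketch of the keyspace idea in SQL (via Python's sqlite3), with assumed table and column names: one ordinary rowid table per data type, no `WITHOUT ROWID` for large blobs, plus a dedicated summaries table that startup can load in a single query:

```python
import sqlite3

con = sqlite3.connect(":memory:")   # the real database lives in a file

con.executescript("""
-- one table per data type; plain rowid tables, since WITHOUT ROWID keeps the
-- whole row in the index B-tree and becomes a pessimization for large blobs
CREATE TABLE blocks (root BLOB PRIMARY KEY, data BLOB NOT NULL);
CREATE TABLE states (root BLOB PRIMARY KEY, data BLOB NOT NULL);

-- block summaries are small and stored uncompressed so that startup can load
-- them all with a single scan instead of many point lookups
CREATE TABLE block_summaries (root BLOB PRIMARY KEY, summary BLOB NOT NULL);
""")

summaries = con.execute("SELECT root, summary FROM block_summaries").fetchall()
```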
Jacek Sieka 867d8f3223
Perform attestation check before broadcast (#2550)
Currently, we have a somewhat convoluted flow: when sending
attestations, we start broadcasting them over gossip and only then pass
them to attestation validation to include them in the local attestation
pool. It should be the other way around: we should be checking
attestations _before_ gossiping them - this serves as an additional
safety net to ensure that we don't publish junk, which becomes more
important when publishing attestations from the API.

Also, the REST API was performing its own validation, meaning
attestations coming from REST would be validated twice; the JSON RPC,
meanwhile, wasn't pre-validating at all and would happily broadcast
invalid attestations.

* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
2021-05-10 09:13:36 +02:00
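A minimal Python sketch of the unified pipeline from the commit above; the validation hook, pool and broadcast callbacks are stand-ins for the real components, but the ordering - check, pool, then publish - is the point:

```python
from dataclasses import dataclass

@dataclass
class ValidationResult:
    ok: bool
    reason: str = ""

def send_attestation(attestation, pool, validate, broadcast):
    """Same flow for gossip-, locally- and API-produced attestations."""
    result = validate(attestation)   # the same check applied to incoming gossip
    if not result.ok:
        return result                # never publish junk
    pool.append(attestation)         # enter the local attestation pool first
    broadcast(attestation)           # only then gossip/republish it
    return result

# trivial wiring for illustration
pool = []
sent = send_attestation(
    {"slot": 1}, pool,
    validate=lambda a: ValidationResult(ok=True),
    broadcast=lambda a: None)
assert sent.ok and len(pool) == 1
```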
Jacek Sieka 646923c3dd
add attestation stats tool to ncli_db (#2539)
This also makes future efforts to provide metrics and logs for
attestation efficiency easier

* Export rewards from epoch transition
* Use less memory for reward calculation (bool -> set[enum], field
alignment)
* Reuse reward memory when replaying, avoiding spike
* Allow replaying any range in ncli_db benchmark
2021-05-07 13:36:21 +02:00
tersec 1d6c8ee9ab
store full state 4x less often (#2542) 2021-05-06 07:36:18 +02:00
Jacek Sieka 427c0f307c
avoid extraneous hash root calculation (#2537)
When applying a block, we'll currently compute a state root for the
state after slot processing but before block processing - this is
unnecessary when a block is being applied because the intermediate state
root is never observed.
2021-05-05 08:54:21 +02:00
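The optimization above, sketched in Python with a toy state and a stand-in `compute_root`: when a block is applied immediately afterwards, the root of the intermediate (post-slot, pre-block) state is never observed and need not be computed.

```python
def process_slots(state, target_slot, compute_root, skip_last_state_root=False):
    """Advance the toy `state` slot by slot, optionally skipping the last root."""
    while state["slot"] < target_slot:
        state["slot"] += 1
        is_last = state["slot"] == target_slot
        if not (skip_last_state_root and is_last):
            state["root"] = compute_root(state)   # stand-in for hash_tree_root

roots_computed = []
state = {"slot": 0, "root": None}
process_slots(state, 3, lambda s: roots_computed.append(s["slot"]) or b"root",
              skip_last_state_root=True)
assert roots_computed == [1, 2]   # slot 3's root is skipped - the block supplies it
```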
Jacek Sieka ce49da6c0a
Introduce unittest2 and junit reports (#2522)
* Introduce unittest2 and junit reports

* fix XML path

* don't combine multiple CI runs

* fixup

* public combined report also

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2021-04-28 18:41:02 +02:00
Jacek Sieka 7dba1b37dd
remove attestation/aggregate queue (#2519)
With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.

* add successfully validated attestations to attestation pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, specially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
2021-04-26 22:39:44 +02:00
Jacek Sieka 54d6884c89
fix sync issue when upgrading from 1.1.0-inited db
This patch writes a full genesis state to `kvstore` if one was missing,
which fixes 1.2.0 restarting sync when upgrading from 1.1.0, or when
downgrading to a pre-1.1.0 release.
2021-04-20 16:55:18 +03:00
tersec 99fccaee6e
more abstraction over BeaconState (#2509)
* more abstraction over BeaconState

* use HashedBeaconState copy of htr
2021-04-16 08:49:37 +00:00
Jacek Sieka f1f424cc2d attestation processing speedups
* avoid creating indexed attestation just to check signatures - above
all, don't create it when not checking signatures ;)
* avoid pointer op when adding attestation to pool
* better iterator for yielding attestations
* add metric / log for attestation packing time
2021-04-14 21:51:17 +03:00
Jacek Sieka 4ed2e34a9e Revamp attestation pool
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.

The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.

Importantly, the branch addresses poor aggregate quality and poor
attestation packing in cases where block space is running out.

A basic greedy scoring mechanism is used to select attestations for
blocks - attestations are added based on how many new votes they
bring to the table.

* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time to make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes #2490)
2021-04-13 20:24:02 +03:00
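A small Python sketch of the greedy scoring idea (not the actual implementation): each candidate aggregate is scored by how many validators it would add that are not yet covered for the same attestation data, and the best remaining candidate is taken until the block is full or nothing adds new votes.

```python
def pack_attestations(candidates, max_count):
    """candidates: list of (attestation_data_id, set_of_validator_indices)."""
    covered = {}            # data_id -> validators already voting in the block
    picked = []
    remaining = list(candidates)
    while remaining and len(picked) < max_count:
        def new_votes(cand):
            data_id, bits = cand
            return len(bits - covered.get(data_id, set()))
        best = max(remaining, key=new_votes)
        if new_votes(best) == 0:
            break           # nothing left brings any new vote
        data_id, bits = best
        covered.setdefault(data_id, set()).update(bits)
        picked.append(best)
        remaining.remove(best)
    return picked

# the second aggregate for data "A" adds only one new vote; the aggregate for
# "B" adds two, so it is preferred once "A"'s best aggregate has been taken
block = pack_attestations(
    [("A", {1, 2, 3}), ("A", {2, 3, 4}), ("B", {5, 6})], max_count=2)
assert [data for data, _ in block] == ["A", "B"]
```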
cheatfate 477decbcf5 Address #2490. 2021-04-13 17:07:41 +03:00
tersec 498c998552
abstract over most withStateVars/withState state var usage (#2484)
* abstract over most withStateVars/withState state var usage

* cleanups
2021-04-13 15:05:44 +02:00
tersec 79bb0d5379
only deserialize attestation and aggregation gossiped signatures once (#2472)
* only deserialize attestation and aggregation gossiped signatures once

* re-indent some aggregate checks into block scope

* spelling

* remove debugging assertion

* put part of gossip validation back into block context

* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free

* remove more development doAsserts; don't exportRaw in loops
2021-04-09 14:59:24 +02:00
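A sketch of the "deserialize once" idea in Python - `CookedSig`, `parse_signature` and the dict-based attestation are hypothetical stand-ins, but the shape follows the commit: parse the raw signature during gossip validation and carry the parsed object from then on.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CookedSig:
    """A signature already deserialized from its 96-byte wire form."""
    point: bytes                       # the parsed curve point in the real code

def parse_signature(raw: bytes) -> CookedSig:
    # placeholder for the expensive BLS deserialization + validity check
    return CookedSig(point=raw)

def validate_gossip_attestation(att: dict) -> dict:
    # parse exactly once, at validation time...
    att["cooked_signature"] = parse_signature(att["signature"])
    return att

def add_to_pool(pool: list, att: dict) -> None:
    # ...and later stages reuse the cooked form instead of reparsing raw bytes
    assert isinstance(att["cooked_signature"], CookedSig)
    pool.append(att)

pool = []
add_to_pool(pool, validate_gossip_attestation({"signature": b"\x00" * 96}))
```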
tersec d3cad92693
remove some BeaconState use and abstract over other uses (#2482)
* remove some BeaconState use and abstract over other uses

* remove out-of-context comment
2021-04-08 08:24:25 +00:00
tersec 8d7792e6e9
add Altair domains and participation flags; clean up imports (#2462) 2021-04-04 16:24:45 +00:00
Jacek Sieka 3cd7cebc7c
Fix block dag pruning frequency (#2469)
Should always prune after a finality change, but not more than once per change
2021-04-01 13:26:17 +02:00
tersec bd8b60f8c8
use epochref for get_committee_assignments() to avoid repeated shuffling (#2463)
* use epochref for get_committee_assignments() to avoid repeated shuffling

* remove unnecessary imports and StateCache() construction
2021-03-30 15:01:47 +00:00
Mamy Ratsimbazafy a9938a2067
Fix pruning time display (#2461)
* Fix pruning time display

* remove import
2021-03-30 09:40:28 +02:00
Jacek Sieka 8b76ceed52
Fix minor exception effect issues (#2448)
Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.
2021-03-24 17:20:55 +01:00
tersec b059cb42c5
increase block proposal speed with many validators (#2423)
* increase block proposal speed with many validators

* document CookedSig rationale
2021-03-17 13:35:59 +00:00
Jacek Sieka 3cb31e66b4
set upper bound on EpochRef cache (#2403)
* set upper bound on EpochRef cache

* max 32 EpochRef instances
* less memory waste in BlockRef by removing EpochRef seq that is mostly
unused (~20mb)
* less memory waste in dag block lookup by not keeping an extra copy of
digest (~70mb)
* fix `==` and `$` for Eth2Digest
* remove `ChainDAG.tmpState` (~50mb?)

all in all, this branch cuts mainnet memory usage by ~160-180mb and puts
limits on EpochRef cache usage - where normally it hovered around 950mb
before, it's now sitting at 600-700mb on my machine.

* docs
2021-03-17 11:17:15 +01:00
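A rough Python sketch of a bounded cache with the 32-entry limit mentioned above; the eviction policy shown here (least recently used) is an assumption for illustration, not necessarily what the dag actually uses.

```python
from collections import OrderedDict

MAX_EPOCHREFS = 32     # the upper bound named in the commit

class EpochRefCache:
    """Keeps at most MAX_EPOCHREFS entries, evicting the least recently used."""
    def __init__(self):
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)          # mark as recently used
            return self.entries[key]
        return None

    def put(self, key, epoch_ref):
        self.entries[key] = epoch_ref
        self.entries.move_to_end(key)
        if len(self.entries) > MAX_EPOCHREFS:
            self.entries.popitem(last=False)       # drop the stalest entry

cache = EpochRefCache()
for epoch in range(40):
    cache.put(epoch, object())
assert cache.get(0) is None and cache.get(39) is not None
assert len(cache.entries) == MAX_EPOCHREFS
```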
Mamy Ratsimbazafy 6e38d474cc
Add pruning timings (#2422) 2021-03-17 07:30:16 +01:00
tersec 8def2486b0
immutable validator database factoring (#2297)
* initial immutable validator database factoring

* remove changes from chain_dag: this abstraction properly belongs in beacon_chain_db

* add merging mutable/immutable validator portions; individually test database roundtripping of immutable validators and states-sans-immutable-validators

* update test summaries

* use stew/assign2 instead of Nim assignment

* add reading/writing of immutable validators in chaindag

* remove unused import

* replace chunked k/v store of immutable validators with per-row SQL table storage

* use List instead of HashList

* un-stub some ncli_db code so that it uses

* switch HashArray to array; move BeaconStateNoImmutableValidators from datatypes to beacon_chain_db

* begin only-mutable-part state storage

* uncomment some assigns

* work around https://github.com/nim-lang/Nim/issues/17253

* fix most of the issues/oversights; local sim runs again

* fix test suite by adding missing beaconstate field to copy function

* have ncli bench also store immutable validators

* extract some immutable-validator-specific code from the beacon chain db module

* add more rigorous database state roundtripping, with changing validator sets

* adjust ncli_db to use new schema

* simplify putState/getState by moving all immutable validator accounting into beacon state DB

* remove redundant test case and move code to immutable-beacon-chain module

* more efficient, but still brute-force, mutable+immutable validator merging

* reuse BeaconState in getState

* ensure HashList/HashArray caches are cleared when reusing getState buffers; add ncli_db and a unit test to verify this

* HashList.clear() -> HashList.clearCache()

* only copy incrementally necessary immutable validators

* increase strictness of test cases and fix/work around resulting HashList cache invalidation issues

* remove explanatory scaffolding

* allow for storage of full (with all validators) states for backwards/forwards-compatibility

* adjust DbSeq type usage

* store full, with-validators, state every 64 epochs to enable reverting versions

* reduce memory allocation and intermediate objects in state storage codepath

* eliminate allocation/copying through intermediate BeaconStateNoImmutableValidators objects

* skip benchmarking initial genesis-validator-heavy state store

* always store new-style state and sometimes old-style state

* document intent behind BeaconState/Validator type-punnery

* more accurate failure message on SQLite in-memory database initialization failure
2021-03-15 14:11:51 +00:00
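A schema sketch (Python/sqlite3, with assumed table and column names) of the split described above: immutable validator fields are appended once, one row per validator index, while states are stored without them and the two are merged again on load.

```python
import sqlite3

con = sqlite3.connect(":memory:")

con.executescript("""
CREATE TABLE immutable_validators (
    idx    INTEGER PRIMARY KEY,   -- validator index, assigned once
    pubkey BLOB NOT NULL,
    withdrawal_credentials BLOB NOT NULL
);
CREATE TABLE states_no_immutable_validators (
    root BLOB PRIMARY KEY,
    data BLOB NOT NULL            -- SSZ of the mutable part of the state
);
""")

def put_state(root: bytes, mutable_ssz: bytes, new_validators):
    """new_validators: iterable of (index, pubkey, withdrawal_credentials)."""
    with con:
        # only validators not stored before need to be appended
        con.executemany(
            "INSERT OR IGNORE INTO immutable_validators VALUES (?, ?, ?)",
            new_validators)
        con.execute(
            "INSERT OR REPLACE INTO states_no_immutable_validators VALUES (?, ?)",
            (root, mutable_ssz))
```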
Jacek Sieka aabdd34704
e2store: add era format (#2382)
Era files contain 8192 blocks and a state - the block count matching the
length of the array holding block roots in the state - meaning that each block is
verifiable using the pubkeys and block roots from the state. Of course,
one would need to know the root of the state as well, which is available
in the first block of the _next_ file - or known from outside.

This PR also adds an implementation to write e2s, e2i and era files, as
well as a python script to inspect them.

All in all, the format is very similar to what goes on in the network
requests meaning it can trivially serve as a backing format for serving
said requests.

Mainnet, up to the first 671k slots, takes up 3.5gb - in each era file,
the BeaconState contributes about 9mb at current validator set sizes, up
from ~3mb in the early blocks, for a grand total of ~558mb for the 82 eras
tested - this state overhead could potentially be recalculated rather than
stored, but one would lose the ability to verify individual blocks (eras
could still be verified using historical roots).

```
-rw-rw-r--. 1 arnetheduck arnetheduck   16  5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck 1,8M  5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2s
-rw-rw-r--. 1 arnetheduck arnetheduck  65K  5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  18M  5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2s
...
-rw-rw-r--. 1 arnetheduck arnetheduck  65K  5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  68M  5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2s
-rw-rw-r--. 1 arnetheduck arnetheduck  61K  5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  62M  5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2s
```
2021-03-15 11:31:39 +01:00
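The arithmetic behind the listing above, as a short Python sketch: the 8192-slot era length comes from the commit text, while the file-name pattern (the era index appears to be hexadecimal, with a constant trailing field) is inferred from the listing rather than from a spec.

```python
SLOTS_PER_ERA = 8192   # matches the length of the state's block-roots array

def era_of_slot(slot: int) -> int:
    return slot // SLOTS_PER_ERA

def era_file_names(network: str, era: int) -> tuple[str, str]:
    # naming inferred from the listing above; treat it as illustrative only
    stem = f"{network}-{era:08x}-{1:08d}"
    return f"{stem}.e2i", f"{stem}.e2s"

assert era_of_slot(671_000) == 81   # hence ~82 era files for the slots tested
assert era_file_names("ethereum2-mainnet", 81)[1] == \
    "ethereum2-mainnet-00000051-00000001.e2s"
```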
Mamy Ratsimbazafy 8e28a05cea
Move pruning out of latency critical path (#2384)
* Deferred DAG and fork choice pruning

* fixup

* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSlotEnd for state pruning

* no need to store needPruning in the data structure

* lastPrunePoint is updated in pruning proc

* Split eager and lazy pruning

* enforce pruning in updateHead
2021-03-09 15:36:17 +01:00
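A small Python sketch of the deferral described above: updating the head only records that pruning is due, and the per-slot handler does the actual work outside the latency-critical path. Names and structure are illustrative, not the actual code.

```python
class DeferredPruner:
    def __init__(self, prune_fn):
        self.prune_fn = prune_fn
        self.pending_epoch = None
        self.last_prune_point = None   # updated inside the pruning proc itself

    def on_update_head(self, finalized_epoch):
        # latency-critical path: just remember that pruning is due
        if finalized_epoch != self.last_prune_point:
            self.pending_epoch = finalized_epoch

    def on_slot_end(self):
        # runs once per slot, after the time-critical duties are done
        if self.pending_epoch is not None:
            self.prune_fn(self.pending_epoch)
            self.last_prune_point = self.pending_epoch
            self.pending_epoch = None

pruned = []
p = DeferredPruner(pruned.append)
p.on_update_head(10)        # cheap - no pruning happens here
p.on_slot_end()             # pruning happens here instead
assert pruned == [10]
```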
Mamy Ratsimbazafy de1060e7f3
centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383) 2021-03-06 08:32:55 +01:00
Mamy Ratsimbazafy d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
Mamy Ratsimbazafy 5d7f9c3a04
Consensus object pools [reorg 4/5] (#2374)
* Add documentation

* make test doesn't try to build the beacon node :/
2021-03-04 10:13:44 +01:00