nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
tersec	f09686e835	update some spec URLs to v1.1.6 (#3188 )	2021-12-13 15:45:48 +00:00
Jacek Sieka	dfbd50b4d6	avoid SyncCommitteMsgPool copy (#3185 ) introduced by batch verification, when verifiers were made async	2021-12-11 16:39:24 +01:00
Jacek Sieka	069bccd51b	batch-verify sync messages for a small perf boost (#3151 ) * batch-verify sync messages for a small perf boost Generally reuses the same structure as attestation and aggregate verification * normalize `signatures` and `signature_batch` to use the same pattern of verification * normalize parameter names, order etc for signature stuff in general * avoid calling `blsSign` directly - instead, go through `signatures` consistently	2021-12-09 14:56:54 +02:00
tersec	2ca28fb861	Merge BeaconBlock gossip validation (#3165 ) * Merge BeaconBlock gossip validation * figure/ground inversion * revert cosmetic cleanups to reduce merge conflicts	2021-12-08 17:29:22 +00:00
Jacek Sieka	1a8b7469e3	move quarantine outside of chaindag (#3124 ) * move quarantine outside of chaindag The quarantine has been part of the ChainDAG for the longest time, but this design has a few issues: * the function in which blocks are verified and added to the dag becomes reentrant and therefore difficult to reason about - we're currently using a stateful flag to work around it * quarantined blocks bypass the processing queue leading to a processing stampede * the quarantine flow is unsuitable for orphaned attestations - these should also be quarantined eventually Instead of processing the quarantine inside ChainDAG, this PR moves re-queueing to `block_processor` which already is responsible for dealing with follow-up work when a block is added to the dag This sets the stage for keeping attestations in the quarantine as well. Also: * make `BlockError` `{.pure.}` * avoid use of `ValidationResult` in block clearance (that's for gossip)	2021-12-06 10:49:01 +01:00
tersec	4378f3f096	almost all remaining ethereum/{eth2.0-specs -> consensus-specs} (#3158 )	2021-12-03 20:01:13 +00:00
Jacek Sieka	aa1dea03cd	speed up gossip and sync block validation (#3143 ) * avoid recomputing hash for block signature check * check block slot match before hitting the database	2021-12-01 10:52:40 +01:00
Jacek Sieka	a223d62b07	Cleanups (#3123 ) Renames and cleanups split out from the validator monitoring branch, so as to reduce conflict area vs other PR:s * add constants for expected message timing * name validators after the messages they validate, mostly, to make grepping easier * unify field naming of EpochInfo across forks to make cross-fork code easier	2021-11-25 13:20:36 +01:00
Jacek Sieka	9c2f43ed0e	Speed up altair block processing 2x (#3115 ) * Speed up altair block processing >2x Like #3089, this PR drastically speeds up historical REST queries and other long state replays. * cache sync committee validator indices * use ~80mb less memory for validator pubkey mappings * batch-verify sync aggregate signature (fixes #2985) * document sync committee hack with head block vs sync message block * add batch signature verification failure tests Before: ``` ../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000 All time are ms Average, StdDev, Min, Max, Samples, Test Validation is turned off meaning that no BLS operations are performed 5830.675, 0.000, 5830.675, 5830.675, 1, Initialize DB 0.481, 1.878, 0.215, 59.167, 981, Load block from database 8422.566, 0.000, 8422.566, 8422.566, 1, Load state from database 6.996, 1.678, 0.042, 14.385, 969, Advance slot, non-epoch 93.217, 8.318, 84.192, 122.209, 32, Advance slot, epoch 20.513, 23.665, 11.510, 201.561, 981, Apply block, no slot processing 0.000, 0.000, 0.000, 0.000, 0, Database load 0.000, 0.000, 0.000, 0.000, 0, Database store ``` After: ``` 7081.422, 0.000, 7081.422, 7081.422, 1, Initialize DB 0.553, 2.122, 0.175, 66.692, 981, Load block from database 5439.446, 0.000, 5439.446, 5439.446, 1, Load state from database 6.829, 1.575, 0.043, 12.156, 969, Advance slot, non-epoch 94.716, 2.749, 88.395, 100.026, 32, Advance slot, epoch 11.636, 23.766, 4.889, 205.250, 981, Apply block, no slot processing 0.000, 0.000, 0.000, 0.000, 0, Database load 0.000, 0.000, 0.000, 0.000, 0, Database store ``` * add comment	2021-11-24 13:43:50 +01:00
Zahary Karadjov	29e5700838	Bugfix: Avoid the aggregation of duplicate signatures when creating sync committee contributions	2021-11-07 21:41:10 +02:00
Jacek Sieka	ea0a191723	Better REST/RPC error messages (#3046 ) * Better REST/RPC error messages * homogenise block logging (root first) * homogenise message verification pipeline (verify in `gossip_verification`, act in `eth2_processor`) * use `subcommitteeIdx` consistently * log each sent contribution * fix block_sim * fix block topic * don't recalc root on gossip block validation * move position loop into sync pool	2021-11-05 17:39:47 +02:00
Jacek Sieka	9cf32c3748	clean up sync subcommittee handling * `SyncCommitteeIndex` -> `SyncSubcommitteeIndex` * `syncCommitteePeriod` -> `sync_committee_period` (spec spelling) * tighten period comparisons * fix assert when validating committee message with non-altair state in REST api	2021-10-20 22:59:13 +03:00
Jacek Sieka	bf6ad41d7d	add drop and sync committee metrics * use storeBlock for processing API blocks * avoid double block dump * count all gossip metrics at the same spot * simplify block broadcast	2021-10-20 18:20:12 +03:00
Jacek Sieka	c247702ebc	normalize subnet logging * call it subnet id everywhere * log aggregate sent from VC * log subnet with aggregate	2021-10-20 15:06:44 +03:00
tersec	c0a2f1c98e	refactor executionPayload tests; reduce HashSet creation (#3003 )	2021-10-20 13:36:38 +02:00
Jacek Sieka	df3fc9525f	import cleanup (#2997 ) * import cleanup ...and remove some unused types * add random imports * more imports	2021-10-19 16:09:26 +02:00
Etan Kissling	4743807079	use errReject template everywhere There were still a few instances that used the expansion of `errReject` instead of using the template itself. It seems that those cases were forgotten as part of other cleanups in #2809. Done now for readability.	2021-09-29 14:16:09 +03:00
Etan Kissling	01a9b275ec	handle duplicate pubkeys in sync committee (#2902 ) When sync committee message handling was introduced in #2830, the edge case of the same validator being selected multiple times as part of a sync subcommittee was not covered. Not handling that edge case makes sync contributions have a lower-than-expected participation rate as each sync validator is only counted up through once per subcommittee. This patch ensures that this edge case is properly covered.	2021-09-28 07:44:20 +00:00
Etan Kissling	ba3884f449	ignore instead of reject duplicate sync msgs (#2903 ) The P2P spec defines how certain error classes should be handled through either IGNORE or REJECT verdicts. For sync committee message, the spec defines that only the first message from each validator per subcommittee and slot shall be accepted, the rest is ignored. However, current code rejects those messages instead of ignoring them. Fixed to match spec.	2021-09-27 14:36:28 +00:00
Eugene Kabanov	b566d4657f	REST /eth/v1/events API call implementation. (#2878 ) * Placing callbacks into strategic places. * Initial events call implementation. * Post rebase fixes. * Change addSyncContribution() implementation. * Add `attestation-sent` event. Remove gcsafe, raises from callbacks implementations. Move `attestation-received` fire at the end of attestation processing. * Address review comments.	2021-09-22 14:17:15 +02:00
Zahary Karadjov	7d1efa443d	Restore the sync committee pool pruning and add tests	2021-08-30 11:06:45 +03:00
tersec	166e22a43b	sync committee message pool and gossip validation (#2830 )	2021-08-28 10:40:01 +00:00
Jacek Sieka	ba06f13942	cleanups (#2809 ) * cleanups * use ForkedTrustedSignedBeaconBlock.init where appropriate * move `is_aggregator` to `spec/` * use `errReject` in a few more places * update enr fork id when time is auspicious * use network broadcast functions * Return Ignore for aggregate signature validation timeouts ...consistently between aggregates and attestations. * clean up some more reject/ignore rules * shorten texts a bit * errReject->checkedReject, use err helpers throughout * get rid of quarantine in exitpool as well	2021-08-24 21:49:51 +02:00
tersec	0a61d1112e	add errIgnore helper and refactor errReject helper (#2808 )	2021-08-23 12:39:06 +02:00
Jacek Sieka	a7a65bce42	disentangle eth2 types from the ssz library (#2785 ) * reorganize ssz dependencies This PR continues the work in https://github.com/status-im/nimbus-eth2/pull/2646, https://github.com/status-im/nimbus-eth2/pull/2779 as well as past issues with serialization and type, to disentangle SSZ from eth2 and at the same time simplify imports and exports with a structured approach. The principal idea here is that when a library wants to introduce SSZ support, they do so via 3 files: * `ssz_codecs` which imports and reexports `codecs` - this covers the basic byte conversions and ensures no overloads get lost * `xxx_merkleization` imports and exports `merkleization` to specialize and get access to `hash_tree_root` and friends * `xxx_ssz_serialization` imports and exports `ssz_serialization` to specialize ssz for a specific library Those that need to interact with SSZ always import the `xxx_` versions of the modules and never `ssz` itself so as to keep imports simple and safe. This is similar to how the REST / JSON-RPC serializers are structured in that someone wanting to serialize spec types to REST-JSON will import `eth2_rest_serialization` and nothing else. 
* split up ssz into a core library that is independent of eth2 types * rename `bytes_reader` to `codec` to highlight that it contains coding and decoding of bytes and native ssz types * remove tricky List init overload that causes compile issues * get rid of top-level ssz import * reenable merkleization tests * move some "standard" json serializers to spec * remove `ValidatorIndex` serialization for now * remove test_ssz_merkleization * add tests for over/underlong byte sequences * fix broken seq[byte] test - seq[byte] is not an SSZ type There are a few things this PR doesn't solve: * like #2646 this PR is weak on how to handle root and other dontSerialize fields that "sometimes" should be computed - the same problem appears in REST / JSON-RPC etc * Fix a build problem on macOS * Another way to fix the macOS builds Co-authored-by: Zahary Karadjov <zahary@gmail.com>	2021-08-18 20:57:58 +02:00
Jacek Sieka	70259e4e64	treat gossip decoding errors more strictly (#2793 ) * penalize peers for sending gossip messages that fail decoding * add metrics for decoding/decompression errors * clean up obsolete exception handlers	2021-08-18 14:30:05 +02:00
Jacek Sieka	7a622e8505	rework spec imports (#2779 ) The spec imports are a mess to work with, so this branch cleans them up a bit to ensure that we avoid generic sandwiches and that importing stuff generally becomes easier. * reexport crypto/digest/presets because these are part of the public symbol set of the rest of the spec types * don't export `merge` types from `base` - this causes circular deps * fix circular deps in `ssz/spec_types` - this is the first step in disentangling ssz from spec * be explicit about phase0 vs altair - longer term, `altair` will become the "natural" type set, then merge and so on, so no point in giving `phase0` special preferential treatment	2021-08-12 13:08:20 +00:00
Jacek Sieka	9697b73e71	forkedbeaconstate_helpers -> forks (#2772 ) Simpler module name for stuff that covers forks * check that runtime config matches database state * also include some assorted altair cleanups * use "standard" genesis fork in local testnet to work around missing runtime config support	2021-08-10 22:46:35 +02:00
tersec	38ce948647	partial altair merge (#2735 ) * partial altair merge * exclude still-in-flux sync committee data structures from partial merge * undo the remaining sync_committee mention	2021-07-26 09:51:14 +00:00
tersec	aebc606cb7	tighten local network simulation correctness checking (#2706 ) * tighten local network simulation correctness checking * rename rejectFirmly to errReject	2021-07-19 11:58:22 +00:00
tersec	e4afc36d71	use ForkedTrustedSignedBeaconBlock (#2720 ) * use ForkedTrustedSignedBeaconBlock * remove --subscribe-all-subnets * https://ethereum.github.io/eth2.0-APIs/#/Beacon/getBlock implementation was passing through forked beaconblocks	2021-07-14 12:18:52 +00:00
Jacek Sieka	23eea197f6	Implement split preset/config support (#2710 ) * Implement split preset/config support This is the initial bulk refactor to introduce runtime config values in a number of places, somewhat replacing the existing mechanism of loading network metadata. It still needs more work, this is the initial refactor that introduces runtime configuration in some of the places that need it. The PR changes the way presets and constants work, to match the spec. In particular, a "preset" now refers to the compile-time configuration while a "cfg" or "RuntimeConfig" is the dynamic part. A single binary can support either mainnet or minimal, but not both. Support for other presets has been removed completely (can be readded, in case there's need). There's a number of outstanding tasks: * `SECONDS_PER_SLOT` still needs fixing * loading custom runtime configs needs redoing * checking constants against YAML file * yeerongpilly support `build/nimbus_beacon_node --network=yeerongpilly --discv5:no --log-level=DEBUG` * load fork epoch from config * fix fork digest sent in status * nicer error string for request failures * fix tools * one more * fixup * fixup * fixup * use "standard" network definition folder in local testnet Files are loaded from their standard locations, including genesis etc, to conform to the format used in the `eth2-networks` repo. * fix launch scripts, allow unknown config values * fix base config of rest test * cleanups * bundle mainnet config using common loader * fix spec links and names * only include supported preset in binary * drop yeerongpilly, add altair-devnet-0, support boot_enr.yaml	2021-07-12 15:01:38 +02:00
tersec	146fa48454	use ForkedHashedBeaconState in StateData (#2634 ) * use ForkedHashedBeaconState in StateData * fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign() * avoid stack allocation in maybeUpgradeStateToAltair() * create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit() * use getStateRoot() instead of various state.data.hbsPhase0.root * remove withStateVars.hashedState(), which doesn't work as a design anymore * introduce spec/datatypes/altair into beacon_chain_db * fix inefficient codegen for getStateField(largeStateField) * state_transition_slots() doesn't either need/use blocks or runtime presets * combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization * getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...) * fix rollback * switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState * remove state_transition(phase0.HashedBeaconState) * remove process_slots(phase0.HashedBeaconState) * remove state_transition_block(phase0.HashedBeaconState) * remove unused callWithBS(); separate case expression from if statement * switch back from nested-ref-object construction to (ref Foo)(Bar())	2021-06-11 20:51:46 +03:00
Jacek Sieka	d859bc12f0	write uncompressed validator keys to database (#2639 ) * write uncompressed validator keys to database Loading 150k+ validator keys on startup in compressed format takes a lot of time - better store them in uncompressed format which makes behaviour just after startup faster / more predictable. * refactor cached validator key access * fix isomorphic cast to work with non-var instances * remove cooked pubkey cache - directly use database cache in chaindag as well (one less cache to keep in sync) * bump blscurve, introduce loadValid for known-to-be-valid keys	2021-06-10 10:37:02 +03:00
Jacek Sieka	abe0d7b4ae	single validator key cache Instead of keeping a validator key list per EpochRef, this PR introduces a single shared validator key list in ChainDAG, and cleans up some other ChainDAG and key-related issues. The PR does not introduce the validator key list in the state transition - this is because we batch-check all signatures before entering the spec code, thus the spec code never hits the cache. A future refactor should _probably_ remove the threadvar altogether. There's a few other small fixes in here that make the flow easier to read: * fix `var ChainDAGRef` -> `ChainDAGRef` * fix `var QuarantineRef` -> `QuarantineRef` * consistent `dag` variable name * avoid using threadvar pubkey cache in most cases * better error messages in batch signature checking	2021-06-01 20:43:44 +03:00
Mamy André-Ratsimbazafy	d05c9dbcf4	improve batch crypto sanity checks error reporting	2021-05-26 18:17:12 +03:00
Jacek Sieka	867d8f3223	Perform attestation check before broadcast (#2550 ) Currently, we have a bit of a convoluted flow where when sending attestations, we start broadcasting them over gossip then pass them to the attestation validation to include them in the local attestation pool - it should be the other way around: we should be checking attestations _before_ gossipping them - this serves as an additional safety net to ensure that we don't publish junk - this becomes more important when publishing attestations from the API. Also, the REST API was performing its own validation meaning attestations coming from REST would be validated twice - finally, the JSON RPC wasn't pre-validating and would happily broadcast invalid attestations. * Unified attestation production pipeline with the same flow for gossip, locally and API-produced attestations: all are now validated and entered into the pool, then broadcast/republished * Refactor subnet handling with specific SubnetId alias, streamlining where subnets are computed, avoiding the need to pass around the number of active validators * Move some of the subnet handling code to eth2_network * Use BitArray throughout for subnet handling	2021-05-10 09:13:36 +02:00
Jacek Sieka	7dba1b37dd	remove attestation/aggregate queue (#2519 ) With the introduction of batching and lazy attestation aggregation, it no longer makes sense to enqueue attestations between the signature check and adding them to the attestation pool - this only takes up valuable CPU without any real benefit. * add successfully validated attestations to attestation pool directly * avoid copying participant list around for single-vote attestations, pass single validator index instead * release decompressed gossip memory earlier, specially during async message validation * use cooked signatures in a few more places to avoid reloads and errors * remove some Defect-raising versions of signature-loading * release decompressed data memory before validating message	2021-04-26 22:39:44 +02:00
Jacek Sieka	daf98e4330	fix committee vs subnet confusion	2021-04-18 14:17:45 +03:00
Jacek Sieka	f1f424cc2d	attestation processing speedups * avoid creating indexed attestation just to check signatures - above all, don't create it when not checking signatures ;) * avoid pointer op when adding attestation to pool * better iterator for yielding attestations * add metric / log for attestation packing time	2021-04-14 21:51:17 +03:00
Jacek Sieka	4ed2e34a9e	Revamp attestation pool This is a revamp of the attestation pool that cleans up several aspects of attestation processing as the network grows larger and block space becomes more precious. The aim is to better exploit the divide between attestation subnets and aggregations by keeping the two kinds separate until it's time to either produce a block or aggregate. This means we're no longer eagerly combining single-vote attestations, but rather wait until the last moment, and then try to add singles to all aggregates, including those coming from the network. Importantly, the branch improves on poor aggregate quality and poor attestation packing in cases where block space is running out. A basic greed scoring mechanism is used to select attestations for blocks - attestations are added based on how many new votes they bring to the table. * Collect single-vote attestations separately and store these until it's time to make aggregates * Create aggregates based on single-vote attestations * Select _best_ aggregate rather than _first_ aggregate when on aggregation duty * Top up all aggregates with singles when it's time to make the attestation cut, thus improving the chances of grabbing the best aggregates out there * Improve aggregation test coverage * Improve bitseq operations * Simplify aggregate signature creation * Make attestation cache temporary instead of storing it in attestation pool - most of the time, blocks are not being produced, no need to keep the data around * Remove redundant aggregate storage that was used only for RPC * Use tables to avoid some linear seeks when looking up attestation data * Fix long cleanup on large slot jumps * Avoid some pointers * Speed up iterating all attestations for a slot (fixes #2490)	2021-04-13 20:24:02 +03:00
tersec	79bb0d5379	only deserialize attestation and aggregation gossiped signatures once (#2472 ) * only deserialize attestation and aggregation gossiped signatures once * re-indent some aggregate checks into block scope * spelling * remove debugging assertion * put part of gossip validation back into block context * attestation pool test signature loading isn't so unsafe, and exportRaw isn't free * remove more development doAsserts; don't exportRaw in loops	2021-04-09 14:59:24 +02:00
tersec	d3cad92693	remove some BeaconState use and abstract over other uses (#2482 ) * remove some BeaconState use and abstract over other uses * remove out-of-context comment	2021-04-08 08:24:25 +00:00
Mamy Ratsimbazafy	6b13cdce36	Batch attestations (#2439 ) * batch attestations * Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518 * Try to remove the processing loop to no avail :/ * batch aggregates * use resultsBuffer size for triggering deadline schedule * pass attestation pool tests * Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518) * Put logging at debug level, add speed info * remove unnecessary batch info when it is known to be one * downgrade some logs to trace level * better comments [skip ci] * Address most review comments * only use ref for async proc * fix exceptions in eth2_network * update async exceptions in gossip_validation * eth2_network 2nd pass * change to sleepAsync * Update beacon_chain/gossip_processing/batch_validation.nim Co-authored-by: Jacek Sieka <jacek@status.im> Co-authored-by: Jacek Sieka <jacek@status.im>	2021-04-02 16:36:43 +02:00
Mamy Ratsimbazafy	de1060e7f3	centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383 )	2021-03-06 08:32:55 +01:00


195 Commits