nimbus-eth2

Commit Graph

Author	SHA1	Message	Date
Etan Kissling	e4c4d11480	Merge branch 'dev/etan/nw-nilpeer' into feat/splitview	2024-03-28 01:40:00 +01:00
Etan Kissling	95aed2b220	filter out `nil` values when iterating peers Iterating peers should only yield peers present in registry, otherwise `nil` pointers are returned and depending on comparison function it will break, see #6149.	2024-03-28 01:38:49 +01:00
Etan Kissling	ee80daba2a	Merge branch 'dev/etan/nw-ranking' into feat/splitview	2024-03-27 23:32:22 +01:00
Etan Kissling	5fb293c595	explain the `<` usage	2024-03-27 23:31:02 +01:00
Etan Kissling	0f679d9463	Merge branch 'unstable' into feat/splitview	2024-03-27 23:13:58 +01:00
Etan Kissling	40242ac277	rank peers by their score instead of their memory address The `<` function to compare peers was not exported, leading to the same peer being acquired over and over again until kicked. `mixin` doesn't pull it into `peerCmp` without `*` export, and with the export no mixin is needed.	2024-03-27 23:10:42 +01:00
diegomrsantos	885989f3df	bump libp2p (#6148 )	2024-03-27 15:53:02 +00:00
Etan Kissling	55a5ffaf8c	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-27 16:40:29 +01:00
Etan Kissling	b37ad4dccb	fix	2024-03-27 16:39:42 +01:00
Etan Kissling	3ab0767b35	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-27 16:25:07 +01:00
Etan Kissling	986c548b00	fix	2024-03-27 16:24:26 +01:00
Etan Kissling	b869546524	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-27 16:14:38 +01:00
Etan Kissling	ce19875583	Merge branch 'unstable' into feat/splitview	2024-03-27 16:14:13 +01:00
Etan Kissling	3376887ba7	add research notes	2024-03-27 16:00:51 +01:00
Etan Kissling	02a69be4e2	generic branch discovery version that supports mocking peers	2024-03-27 16:00:36 +01:00
Etan Kissling	f8be7c326e	be careful not to disconnect syncing peers in fragmented network	2024-03-27 16:00:21 +01:00
Etan Kissling	9f37ffdc62	suspend light client sync while branch discovery is in progress	2024-03-27 16:00:02 +01:00
diegomrsantos	edad7c8a4c	bump libp2p (#6132 )	2024-03-27 11:16:57 +01:00
tersec	7a3edb6961	more initialize_validator_exit optimization (#6146 )	2024-03-27 09:18:50 +01:00
tersec	f9e5294802	dump EL-INVALID blocks if requested the same way as CL-INVALID blocks; optimize epoch transition validator exit (#6144 )	2024-03-27 04:34:56 +01:00
tersec	605bf99344	remove macOS/aarch64 workaround from proposeBlockAux (#6138 )	2024-03-26 23:05:49 +00:00
tersec	21daaad754	support special characters in network metadata paths (#6141 )	2024-03-26 22:47:42 +01:00
Etan Kissling	be5ad82f33	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-26 11:17:38 +01:00
Etan Kissling	1c04697e1d	tweak rate limiting	2024-03-26 11:17:07 +01:00
Etan Kissling	1744d68af8	Merge branch 'dev/etan/vd-incprop' into feat/splitview	2024-03-26 05:14:15 +01:00
Etan Kissling	2b169efa23	keep proposal state around in clearance to reapply block lagfree	2024-03-26 05:13:35 +01:00
Etan Kissling	bc58c3249f	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-26 03:55:52 +01:00
Etan Kissling	7f26fb1670	filter out useless peers earlier	2024-03-26 03:55:22 +01:00
Etan Kissling	ea2cf8e69b	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-25 23:44:50 +01:00
Etan Kissling	74606c6e1b	handoff useless peers from sync manager directly into branch discovery	2024-03-25 23:44:05 +01:00
Etan Kissling	58383d1ca0	Merge branch 'dev/etan/vd-incprop' into feat/splitview	2024-03-25 22:29:03 +01:00
Etan Kissling	97ec45e939	after a deep reorg, both `newPayload` and `forkchoiceUpdated` are needed	2024-03-25 22:25:46 +01:00
Etan Kissling	db5b8b0bc2	enable `--debug-split-views-merge` on this research branch	2024-03-25 22:05:24 +01:00
Etan Kissling	63971c0e1f	Merge branch 'dev/etan/zf-branchpull' into feat/splitview	2024-03-25 22:05:01 +01:00
Etan Kissling	08b87e2506	add branch discovery module for use in split view scenarios When the network is partitioned for a long time, e.g., Goerli, branches start forming where different peers have distinct views about the chain state. The current syncing solution with sync manager doesn't handle the case well, as it is optimized for a healthy network where syncing can be parallelized across different peers. To support sync manager discovering additional branches, a new module is added that pulls in histories from peers on unknown branches in a backwards manner.	2024-03-25 22:02:23 +01:00
Etan Kissling	17e8a5137f	enable `--debug-propose-stale` on this research branch	2024-03-25 21:16:37 +01:00
Etan Kissling	c5352cf89f	add option to incrementally compute proposal state if behind There can be situations where proposals need to be made on top of stale parent state. Because the distance to the wall slot is big, the proposal state (`getProposalState`) has to be computed incrementally to avoid lag spikes. Proposals have to be timely to have a chance to propagate on the network. Peak memory requirements don't change, but the proposal state needs to be allocated for a longer duration than usual.	2024-03-25 21:11:37 +01:00
Etan Kissling	2dbe24c740	move split view catchup to research branch (#6133 ) Using a dedicated branch for researching the effectiveness of split view scenario handling simplifies testing and avoids having partial work on `unstable`. If we want, we can reintroduce it under a `--debug` flag at a later time. But for now, Goerli is a rare opportunity to test this, maybe just for another week or so. - https://github.com/status-im/infra-nimbus/pull/179	2024-03-25 19:09:31 +01:00
Etan Kissling	fc9bc1da3a	add branch discovery module for supporting chain stall situation (#6125 ) In split view situation, the canonical chain may only be served by a tiny amount of peers, and branches may span long durations. Minority branches may still have a large weight from attestations and should be discovered. To assist with that, add a branch discovery module that assists in such a situation by specifically targeting peers with unknown histories and downloading from them, in addition to sync manager work which handles popular branches.	2024-03-24 08:41:47 +00:00
Etan Kissling	66a9304fea	use separate state when catching up to perform validator duties (#6131 ) There are situations where all states in the `blockchain_dag` are occupied and cannot be borrowed. - headState: Many assumptions in the code that it cannot be advanced - clearanceState: Resets every time a new block gets imported, including blocks from non-canonical branches - epochRefState: Used even more frequently than clearanceState This means that during the catch-up mechanic where the head state is slowly advanced to wall clock to catch up on validator duties in the situation where the canonical head is way behind non-canonical heads, we cannot use any of the three existing states. In that situation, Nimbus already consumes an increased amount of memory due to all the `BlockRef`, fork choice states and so on, so experience is degraded. It seems reasonable to allocate a fourth state temporarily during that mechanic, until a new proposal could be made on the canonical chain. Note that currently, on `unstable`, proposals _do_ happen every couple hours because sync manager doesn't manage to discover additional heads in a split-view scenario on Goerli. However, with the branch discovery module, new blocks are discovered all the time, and the clearanceState may no longer be borrowed as it is reset to different branch too often. The extra state could also find other uses in the future, e.g., for incremental computations as in reindexing the database, or online collection of historical light client data.	2024-03-24 07:18:33 +01:00
Etan Kissling	c4a5bca629	update block quarantine eviction order to FIFO (#6129 ) Use the same eviction policy for blocks as already the case for blobs. FIFO makes more sense, because it favors keeping ancestors of blocks which need to be applied to the DAG before their children get eligible.	2024-03-24 06:03:51 +01:00
Etan Kissling	991e7cafbc	descore when opening connection fails, same as when reading fails (#6130 ) `eth2_network` forgets to descore peers when opening connection times out. It only descores when opening the connection succeeds and then there is a subsequent error. The caller cannot distinguish the cases, so ensure that the descore is also applied if the request fails during its initial portion.	2024-03-24 05:37:47 +01:00
Etan Kissling	3765e8ac06	ensure blobs are quarantined when block is quarantined (#6127 ) When quarantining a block from block processor, we should also keep a copy of its blobs. Otherwise, this involves more network roundtrips to obtain information we already have. This is in line with how blobs arrive from gossip and request manager sources. The existing flow does not work when applying blocks from quarantine, which is addressed here.	2024-03-24 04:56:30 +01:00
Etan Kissling	bedc601903	increase blob quarantine capacity to match block quarantine capacity (#6128 ) Blobs are cached from gossip and other sources for all orphans, not just those specifically tagged as `blobless`. `blobless` only means that they are actively fetched from the network. The `MaxBlobs` should be aligned to match `MaxOrphans`. Note that blobs are tiny compared to blocks, so this isn't a huge memory hog.	2024-03-24 04:29:44 +01:00
tersec	c5f0d1def3	Revert "Revert "Set default localBlockValueBoost to 10 (#6103 )" (#6118 )" (#6126 ) This reverts commit `213076e4cd`.	2024-03-23 10:17:29 +01:00
Etan Kissling	33e34ee8bd	handle case of unreachable block in `is_optimstic` helper (#6124 ) * handle case of unreachable block in `is_optimstic` helper When a non-canonical block is still in the DB, it can be accessed via `BlockId`, but `BlockRef` may be unavailable if the block was not properly cleaned when it got orphaned. Report it as optimistic. * `template` -> `func`	2024-03-22 22:50:21 +00:00
Etan Kissling	2d9586a5a8	enqueue missing parent block if stored in local DB (#6122 ) When checking for `MissingParent`, it may be that the parent block was already discovered as part of a prior run. In that case, it can be loaded from storage and processed without having to rediscover the entire branch from the network. This is similar to #6112 but for blocks that are discovered via gossip / sync mgr instead of via request mgr.	2024-03-22 14:35:46 +01:00
Eugene Kabanov	a6e9e0774c	VC: Refactor some timing code around sync committee processing (#6073 ) * Add some duration metering. Refactor some log statements. Rework sync contribution deadline waiting. Add some cancellation reporting handlers. * Make all validator's shortLog to become validatorLog. Optimize some logs with logScope. * Add `raises`. * More log statements polishing.	2024-03-22 02:37:44 +00:00
Etan Kissling	9d5643240b	only request blobs if a sync response actually provided blocks (#6121 ) During sync, we can skip the `blobSidecarsByRange` request when there are no blocks with `kzg_commitments` in the blocks data. Avoids running into throttling from peers during long periods of non-finality.	2024-03-22 03:27:02 +01:00
Etan Kissling	17ee40b39b	make blobs use less quota when other nodes sync from us (#6120 ) Each individual blob currently uses as much quota from the network limit as an entire block does, 128 items per second shared across all peers. Blobs are 128 KB each instead of up to several MB and are simpler to encode. There can be multiple per block (6 currently), so allow 2000 blobs per second across all peers. That decreases the cost per block from `3125 + 3125 * blobs.len` quota (= `[3125, 21875]`) to a lower `3125 + 200 * blobs.len` quota (= `[3125, 4325]`), accounting for the slight increase in data transfer and encoding time.	2024-03-22 02:36:08 +01:00
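
The quota figures in the message of commit 17ee40b39b can be checked with a short calculation. The following is a minimal Python sketch, not the Nimbus implementation: the per-second budget of 400000 is inferred from the stated figures (128 items per second at a cost of 3125 each), and all names are hypothetical.

```python
# Sketch of the quota arithmetic described in commit 17ee40b39b.
# Assumption: the shared per-second budget is 128 * 3125 = 400_000 quota units,
# inferred from the commit message; names here are illustrative only.
QUOTA_PER_SECOND = 128 * 3125

MAX_BLOBS_PER_BLOCK = 6  # "6 currently", per the commit message


def cost_per_item(items_per_second: int) -> int:
    """Quota consumed by one item when `items_per_second` items fit the budget."""
    return QUOTA_PER_SECOND // items_per_second


BLOCK_COST = cost_per_item(128)      # blocks: 128 per second across all peers
OLD_BLOB_COST = BLOCK_COST           # before: a blob cost as much as a block
NEW_BLOB_COST = cost_per_item(2000)  # after: 2000 blobs per second across all peers


def response_cost(num_blobs: int, blob_cost: int) -> int:
    """Total quota for serving one block together with `num_blobs` blobs."""
    return BLOCK_COST + blob_cost * num_blobs


old_range = [response_cost(n, OLD_BLOB_COST) for n in (0, MAX_BLOBS_PER_BLOCK)]
new_range = [response_cost(n, NEW_BLOB_COST) for n in (0, MAX_BLOBS_PER_BLOCK)]

print("old cost range:", old_range)
print("new cost range:", new_range)
```

Under these assumptions the old and new ranges match the `[3125, 21875]` and `[3125, 4325]` intervals quoted in the message.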

4053 Commits