Commit Graph

628 Commits

Author SHA1 Message Date
Agnish Ghosh 1d80f7608d attempt reconstruction from gossip itself 2024-10-10 04:34:39 +05:30
Agnish Ghosh daa55d8e10 add more caveats to has dc 2024-10-10 02:00:54 +05:30
Agnish Ghosh ad699dd389 has dc checks 2024-10-10 01:55:59 +05:30
Agnish Ghosh a89847b1f1 missing data columns improvement 2024-10-10 01:41:18 +05:30
Agnish Ghosh 44c95a0667 data column fetch record 2024-10-10 01:00:32 +05:30
Agnish Ghosh 1927366eaa data availability fix 2024-10-09 14:46:32 +05:30
Agnish Ghosh 44fac84a4f quarantine fix 2024-10-08 11:19:49 +05:30
Agnish Ghosh 162a3bd6cf don't need all data columns to declare it columnless 2024-10-08 03:16:44 +05:30
Agnish Ghosh e962f591cc fixed reconstruction issue, + quarantine issue 2024-10-08 02:28:42 +05:30
Agnish Ghosh 64f1d235df minor fix 2024-10-05 05:28:24 +05:30
Agnish Ghosh 19ed1bd28c fixed quarantine issue for full nodes 2024-10-05 05:19:14 +05:30
Agnish Ghosh 44a5514861 fix initial columnless situation 2024-10-03 15:49:01 +05:30
Agnish Ghosh 6e057edc89 initial start fix 2024-10-03 15:43:45 +05:30
Agnish Ghosh 6d837bcdaa fix more quarantine issues 2024-10-03 02:33:25 +05:30
Agnish Ghosh 5bf1e021a7 initiate data column quarantine 2024-06-28 14:53:08 +05:30
Agnish Ghosh 791d2fb0d1 add: forward and backward syncing for data columns, broadcasting data columns created from blobs, added dc support to sync_queue 2024-06-24 17:32:06 +05:30
Agnish Ghosh e2afc583cb fix: reviews, pass1 2024-06-21 14:51:54 +05:30
Agnish Ghosh 51f189ef53 add: getMissingDataColumns, requestManagerDataColumnLoop 2024-06-19 03:46:03 +05:30
Agnish Ghosh 46d07b140d add: data column support in sync_protocol, sync_manager, request_manager, fix: gossipValidation rules 2024-06-18 19:01:56 +05:30
tersec c7bf6fb542 rm debugRaiseAssert; clean up several debugComments (#6308)
* rm debugRaiseAssert; clean up several debugComments

* exception linting
2024-05-23 23:51:09 +02:00
tersec b56a671122 fix most ConvFromXtoItselfNotNeeded hints and unhide remaining ones (#6307) 2024-05-22 13:56:37 +02:00
Jacek Sieka 045c4cf185 electra attestation updates (#6295)
* electra attestation updates

In Electra, we have two attestation formats: on-chain and on-network -
the former combines all committees of a slot into a single committee
bit list.

This PR makes a number of cleanups to move towards fixing this;
attestation packing, however, still needs to be fixed, as it currently
creates attestations with a single committee only, which is very
inefficient (see the sketch after this entry).

* more attestations in the blocks

* signing and aggregation fixes

* tool fix

* test, import
2024-05-17 15:37:41 +03:00
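The two formats described in the entry above can be pictured with a short Nim sketch; the type shapes and the `combine` helper are illustrative assumptions, not the actual nimbus-eth2 definitions.

```nim
type
  # On-network: one committee per attestation (hypothetical shape).
  NetworkAttestation = object
    committeeIndex: int
    aggregationBits: seq[bool]   # one bit per member of the committee

  # On-chain: all committees of a slot combined (hypothetical shape).
  OnChainAttestation = object
    committeeBits: seq[bool]     # one bit per committee in the slot
    aggregationBits: seq[bool]   # member bits, concatenated by committee

proc combine(atts: openArray[NetworkAttestation],
             committeesPerSlot: int): OnChainAttestation =
  ## Merge single-committee attestations (assumed sorted by committee
  ## index) into the combined on-chain representation.
  result.committeeBits = newSeq[bool](committeesPerSlot)
  for att in atts:
    result.committeeBits[att.committeeIndex] = true
    result.aggregationBits.add att.aggregationBits
```

Packing only one committee per on-chain attestation wastes the combined format's capacity, which is why the commit flags it as inefficient.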
tersec c1b9e82502 electra attestation gossip plumbing (#6287) 2024-05-14 19:01:26 +03:00
tersec 1c3aaa7be2 add (Signed)AggregateAndProof SSZ tests (#6285) 2024-05-14 13:51:06 +02:00
tersec 6b8061b5d6 automated consensus spec URL updating to v1.5.0-alpha.2 (#6279) 2024-05-09 05:03:10 +00:00
tersec 1b30dcc165 initial electra attestation pool changes; electra block_sim (#6255) 2024-05-07 15:01:51 +00:00
tersec 6119389c3a add EF consensus spec test Electra attestation operations fixture (#6248) 2024-04-28 00:52:14 +00:00
tersec 87452374e4 add Electra SSZ object test fixture (#6225) 2024-04-22 09:00:38 +00:00
tersec d139c92df9 explicitly scope AttesterSlashing and IndexedAttestation types to phase0 (#6224) 2024-04-21 05:49:11 +00:00
tersec 0132f5d689 some consensus spec v1.4.0 spec URL updates (#6215) 2024-04-18 03:00:04 +02:00
tersec 603c83522e explicitly refer to phase0.{Attestation,TrustedAttestation} rather than sans module name (#6214) 2024-04-17 20:44:29 +00:00
tersec 867995acd1 some consensus spec v1.4.0 spec URL updates (#6208) 2024-04-17 05:51:16 +02:00
tersec e51c5ec783 add Electra blob support to block/blob quarantines, block processor, and request manager (#6201) 2024-04-11 09:31:39 +00:00
Etan Kissling 9bffc51cf6 allow frontfilling finalized CP block from era file (#6164)
Add support for using era file for the initial checkpoint block.
This should also avoid an error when the beacon node is restarted
before the backfill process has made any progress (#6059).
2024-04-10 14:11:07 +02:00
tersec 6ce5d5814c support electra block proposals for internal BN validators (#6187) 2024-04-09 12:04:33 +02:00
tersec 937cc62b85 block_sim runs electra (#6181) 2024-04-07 09:58:11 +02:00
tersec 27921406e9 remove some debugRaiseAsserts and fill in Electra functionality (#6179) 2024-04-06 15:11:47 +02:00
Etan Kissling df0ff5f0fb fix initialization of sync committee cache after loading non-epoch state (#6160)
When initializing from a state that's not aligned to an epoch boundary,
an earlier state is loaded that's epoch aligned, and subsequently topped
up with the missing blocks. `dag.headSyncCommittee` is initialized prior
to topping up the missing blocks, though. If the sync committee changes
while applying the blocks (e.g., a sync committee period boundary hits),
the cached information becomes unlinked from `dag.head`, leading to
valid blocks based on that chain being rejected. To fix this, move
cache initialization to after topping up with the missing blocks.
This has been observed on
Goerli by initializing from 7919502 and attempting to top up 7920111.
The block gets rejected with an invalid state root on nodes that have
restarted after setting 7920111 as head, while it gets accepted by all
other nodes. Error message is `block: state root verification failed`.
The incorrect initialization behaviour was introduced in #4592, before
which the sync committee cache was initialized after applying blocks.
2024-04-03 23:03:06 +02:00
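A minimal sketch of the ordering fix described above; `loadAlignedState`, `applyBlocks` and `cachedPeriod` are hypothetical stand-ins for the actual `blockchain_dag` routines and `dag.headSyncCommittee`.

```nim
const
  SlotsPerEpoch = 32'u64
  SlotsPerPeriod = 8192'u64    # 256 epochs per sync committee period

type
  Slot = uint64
  State = object
    slot: Slot
    period: uint64             # sync committee period of this state
  Dag = object
    cachedPeriod: uint64       # stands in for dag.headSyncCommittee

proc loadAlignedState(target: Slot): State =
  ## Load the nearest epoch-aligned state at or before `target`.
  let aligned = target - target mod SlotsPerEpoch
  State(slot: aligned, period: aligned div SlotsPerPeriod)

proc applyBlocks(state: var State, target: Slot) =
  ## Top up with the missing blocks; the sync committee period may
  ## advance while the blocks are applied.
  state.slot = target
  state.period = target div SlotsPerPeriod

proc initFromState(dag: var Dag, target: Slot) =
  var state = loadAlignedState(target)
  applyBlocks(state, target)       # fix: top up with blocks first,
  dag.cachedPeriod = state.period  # then initialize the cache
```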
tersec 7fa32b7f02 add Electra to ConsensusFork enum (#6169)
* add Electra to ConsensusFork enum

* fix gnosis check
2024-04-03 16:43:43 +02:00
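A simplified sketch of what such an enum extension looks like; the real `ConsensusFork` in nimbus-eth2 carries far more associated machinery (fork-specific type selection, fork schedules, and so on).

```nim
type ConsensusFork = enum
  # Order matters: comparisons such as `fork >= ConsensusFork.Electra`
  # rely on the chronological ordering of the variants.
  Phase0, Altair, Bellatrix, Capella, Deneb, Electra
```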
Etan Kissling 6a318d0f1a avoid rejecting empty era file in verification (#6163)
`batchVerify`'s precondition is a non-empty signature list:

```nim
  if input.len == 0:
    # Spec precondition
    return false
```

This means that in eras without any blocks (as has happened on Goerli),
calling it leads to era files being reported as invalid.
2024-04-03 10:06:21 +02:00
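A minimal sketch of the resulting guard, assuming a hypothetical `verifyEra` wrapper around a `batchVerify`-style routine with the precondition quoted above.

```nim
proc batchVerify(sigs: openArray[string]): bool =
  ## Stand-in for the real batched BLS verification; per the spec
  ## precondition, an empty input is rejected.
  if sigs.len == 0:
    return false
  true

proc verifyEra(blockSigs: seq[string]): bool =
  ## An era that contains no blocks (possible on Goerli) has nothing
  ## to verify and should be accepted, rather than handed to the
  ## verifier, which would report it as invalid.
  if blockSigs.len == 0:
    return true
  batchVerify(blockSigs)
```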
Etan Kissling 0000f81df0 remove unused and redundant `PayloadID` type definition (#6165)
`PayloadID` is defined in `nim-web3` and our own Bellatrix definition
can be removed.
2024-04-03 07:27:00 +02:00
Etan Kissling 2dbe24c740 move split view catchup to research branch (#6133)
Using a dedicated branch for researching the effectiveness of split view
scenario handling simplifies testing and avoids having partial work on
`unstable`. If we want, we can reintroduce it under a `--debug` flag at
a later time. But for now, Goerli is a rare opportunity to test this,
maybe just for another week or so.

- https://github.com/status-im/infra-nimbus/pull/179
2024-03-25 19:09:31 +01:00
Etan Kissling fc9bc1da3a add branch discovery module for supporting chain stall situation (#6125)
In a split view situation, the canonical chain may be served by only a
tiny number of peers, and branches may span long durations. Minority
branches may still have a large weight from attestations and should
be discovered. To assist with that, add a branch discovery module that
helps in such a situation by specifically targeting peers with unknown
histories and downloading from them, in addition to the sync manager
work which handles popular branches.
2024-03-24 08:41:47 +00:00
Etan Kissling 66a9304fea use separate state when catching up to perform validator duties (#6131)
There are situations where all states in the `blockchain_dag` are
occupied and cannot be borrowed.

- headState: Many assumptions in the code that it cannot be advanced
- clearanceState: Resets every time a new block gets imported, including
  blocks from non-canonical branches
- epochRefState: Used even more frequently than clearanceState

This means that during the catch-up mechanic, where the head state is
slowly advanced to the wall clock to catch up on validator duties when
the canonical head is way behind non-canonical heads, we cannot use
any of the three existing states. In that situation, Nimbus already
consumes an increased amount of memory due to all the `BlockRef`, fork
choice states and so on, so the experience is degraded. It seems
reasonable to allocate a fourth state temporarily during that mechanic,
until a new proposal can be made on the canonical chain (see the
sketch after this entry).

Note that currently, on `unstable`, proposals _do_ happen every couple
of hours because the sync manager doesn't manage to discover additional
heads in a split-view scenario on Goerli. However, with the branch
discovery module, new blocks are discovered all the time, and the
clearanceState may no longer be borrowed, as it is reset to a different
branch too often.

The extra state could also find other uses in the future, e.g., for
incremental computations as in reindexing the database, or online
collection of historical light client data.
2024-03-24 07:18:33 +01:00
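A sketch of the temporary fourth state, assuming a lazily allocated `dutyState` field; the field name and `advanceForDuties` are hypothetical, not the actual nimbus-eth2 API.

```nim
import std/options

type
  State = object
    slot: uint64
  Dag = object
    headState: State
    dutyState: Option[State]   # hypothetical fourth state, on demand

proc advanceForDuties(dag: var Dag, wallSlot: uint64) =
  ## Advance a dedicated state copy towards the wall clock so that
  ## headState, clearanceState and epochRefState stay untouched.
  if dag.dutyState.isNone:
    dag.dutyState = some(dag.headState)  # allocated only when needed
  var s = dag.dutyState.get()
  while s.slot < wallSlot:
    inc s.slot                 # stands in for processing empty slots
  dag.dutyState = some(s)
```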
Etan Kissling c4a5bca629 update block quarantine eviction order to FIFO (#6129)
Use the same eviction policy for blocks as is already the case for
blobs. FIFO makes more sense, because it favors keeping ancestors,
which need to be applied to the DAG before their children become
eligible.
2024-03-24 06:03:51 +01:00
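A sketch of FIFO eviction for a bounded quarantine; `MaxOrphans` and the `Deque`-based layout are assumptions for illustration, not the actual nimbus-eth2 data structure.

```nim
import std/deques

const MaxOrphans = 16          # assumed capacity, not the real constant

type BlockQuarantine = object
  order: Deque[string]         # block roots in insertion order

proc init(T: type BlockQuarantine): T =
  T(order: initDeque[string]())

proc add(q: var BlockQuarantine, root: string) =
  ## FIFO, matching the blob quarantine: when full, evict the entry
  ## that was inserted first.
  if q.order.len >= MaxOrphans:
    discard q.order.popFirst()
  q.order.addLast root
```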
Etan Kissling bedc601903 increase blob quarantine capacity to match block quarantine capacity (#6128)
Blobs are cached from gossip and other sources for all orphans, not just
those specifically tagged as `blobless`. `blobless` only means that they
are actively fetched from the network. The `MaxBlobs` should be aligned
to match `MaxOrphans`. Note that blobs are tiny compared to blocks, so
this isn't a huge memory hog.
2024-03-24 04:29:44 +01:00
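One plausible alignment of the constants, sketched with assumed values; `MaxOrphans = 16` is illustrative, while 6 is the Deneb `MAX_BLOBS_PER_BLOCK`.

```nim
const
  MaxOrphans = 16                          # assumed block quarantine cap
  MaxBlobsPerBlock = 6                     # Deneb MAX_BLOBS_PER_BLOCK
  MaxBlobs = MaxOrphans * MaxBlobsPerBlock # blob cap tracks block cap
```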
Etan Kissling 33e34ee8bd handle case of unreachable block in `is_optimistic` helper (#6124)
* handle case of unreachable block in `is_optimistic` helper

When a non-canonical block is still in the DB, it can be accessed via
`BlockId`, but `BlockRef` may be unavailable if the block was not
properly cleaned when it got orphaned. Report it as optimistic.

* `template` -> `func`
2024-03-22 22:50:21 +00:00
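A sketch of the fallback described above, using hypothetical sets in place of the DAG's `BlockRef` table and execution-validity tracking.

```nim
import std/sets

proc isOptimistic(blockRefs: HashSet[string],
                  executionValid: HashSet[string],
                  root: string): bool =
  ## If no BlockRef is reachable for the block (e.g., an orphan that
  ## was not properly cleaned), report it as optimistic rather than
  ## claiming it is verified.
  if root notin blockRefs:
    return true
  root notin executionValid
```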
Etan Kissling 12a2f8c026 when adding duplicates to quarantine, schedule deepest missing parent (#6112)
During sync, sometimes the same block gets encountered and added to
quarantine multiple times. If its parent is already known, quarantine
incorrectly registers it as missing, leading to re-download. This can
be fixed by registering the parent's deepest missing parent recursively.

Also increase the stickiness of `missing`: we only perform 4 attempts
within ~16 seconds before giving up. Very frequently, this is not
enough, and there is no progress until the sync manager kicks in, even
on Holesky.
2024-03-21 18:41:05 +00:00
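A sketch of the recursive registration, assuming a quarantine represented as a root-to-parent table (a hypothetical layout, not the actual nimbus-eth2 one).

```nim
import std/[tables, sets]

proc deepestMissingParent(quarantine: Table[string, string], # root -> parent
                          dag: HashSet[string],
                          root: string): string =
  ## Follow parent links through the quarantine; the first ancestor
  ## that is neither quarantined nor in the DAG is the one to fetch.
  var cur = root
  while quarantine.hasKey(cur):
    let parent = quarantine[cur]
    if parent in dag:
      return ""      # chain already connects; nothing is missing
    cur = parent
  cur                # deepest unresolved ancestor: schedule this one
```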
Etan Kissling 6f466894ab answer `RequestManager` queries from disk if possible (#6109)
When restarting the beacon node, orphaned blocks remain in the
database, but on startup only the canonical chain as selected by fork
choice is loaded. When a new block is discovered that builds on top of
an orphaned block, the orphaned block is re-downloaded using the
sync/request manager, despite already being present on disk. Such
queries can be answered locally to improve the discovery speed of
alternate forks.
2024-03-21 18:37:31 +01:00
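A sketch of the local-first lookup, with a hypothetical `resolve` helper and a plain table standing in for the block database.

```nim
import std/[options, tables]

type SignedBlock = object
  root: string

proc resolve(db: Table[string, SignedBlock],
             root: string): Option[SignedBlock] =
  ## Check local storage first: an orphaned block that survived a
  ## restart can be answered from disk instead of re-downloaded.
  if root in db:
    return some(db[root])     # served locally, no network round trip
  none(SignedBlock)           # fall through to the request manager
```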
Etan Kissling eb5acdb7dd make sure `clearanceState` builds on top of `headState` in chain stall (#6108)
The `clearanceState` points to the latest resolved block, regardless of
whether that block is canonical according to fork choice. If the chain is
stalled and we want to prepare for resuming validator duties, we need
a recent state according to fork choice to avoid lag spikes and missing
slot timings.
2024-03-20 16:41:56 +01:00
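A sketch of the invariant, with toy types; `prepareForDuties` is a hypothetical name for the rebasing step described above.

```nim
type
  State = object
    root: string               # root of the block this state is for
  Dag = object
    headState: State           # follows fork choice
    clearanceState: State      # follows the latest resolved block

proc prepareForDuties(dag: var Dag) =
  ## In a chain stall, rebase clearanceState onto the fork-choice head
  ## so that resuming duties needs no expensive state replay.
  if dag.clearanceState.root != dag.headState.root:
    dag.clearanceState = dag.headState
```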