Commit Graph

969 Commits

Author SHA1 Message Date
Jacek Sieka ba99c8fe4f
update era file documentation / impl (#3226)
Overhaul of era files, including documentation and reference
implementations

* store blocks, then state, then slot indices for easy lookup at low
cost
* document era file rationale
* altair+ support in era writer
2022-01-07 11:13:19 +01:00
tersec 8242e57f41
initial migration from spec/datatypes/{merge => bellatrix} (#3249) 2022-01-06 12:25:35 +01:00
Jacek Sieka 0e2b4e39fa
REST JSON support improvements (#3232)
* support downloading blocks / states via JSON in addition to SSZ -
slow, but needed for infura support - SSZ is still used when server
supports it
* use common forked block/state reader in REST API
* fix stack overflows in REST JSON decoder
* fix invalid serialization of `justification_bits` in
`/eth/v1/debug/beacon/states` and `/eth/v2/debug/beacon/states`
* fix REST client to use `/eth/...` instead of `/api/eth/...`, update
"default" URLs to expose REST API via `/eth` as well, since this is what the
standard says - `/api` was added early on based on an example "base url"
in the spec that has been removed since
* expose Nimbus REST extensions via `/nimbus` in addition to
`/api/nimbus` to stay consistent with `/eth`
* fix invalid state root when reading states via REST
* fix recursive imports in `spec/ssz_codec`
* remove usages of `serialization.useCustomSerialization` - fickle
2022-01-06 08:38:40 +01:00
Jacek Sieka 0a4728a241
Handle access to historical data for which there is no state (#3217)
With checkpoint sync in particular, and state pruning in the future,
loading states or state-dependent data may fail. This PR adjusts the
code to allow this to be handled gracefully.

In particular, the new availability assumption is that states are always
available for the finalized checkpoint and newer, but may fail for
anything older.

The `tail` remains the point where state loading de-facto fails, meaning
that between the tail and the finalized checkpoint, we can still get
historical data (but code should be prepared to handle this as an
error).

However, to harden the code against long replays, several operations
which are assumed to work only with non-final data (such as gossip
verification and validator duties) now limit their search horizon to
post-finalized data.

* harden several state-dependent operations by logging an error instead
of introducing a panic when state loading fails
* `withState` -> `withUpdatedState` to differentiate from the other
`withState`
* `updateStateData` can now fail if no state is found in database - it
is also hardened against excessively long replays
* `getEpochRef` can now fail when replay fails
* reject blocks with invalid target root - they would be ignored
previously
* fix recursion bug in `isProposed`
2022-01-05 19:38:04 +01:00
tersec 66c9b7fbce
shift block_sim fork epochs; allow VC to work with non-multiple-of-3 SECONDS_PER_SLOT (#3244) 2022-01-05 13:41:39 +00:00
tersec 7594fa660d
copyright year and spec URL updates (#3243) 2022-01-05 11:07:14 +00:00
tersec cd77377375
add Bellatrix fork and transition tests; "Ethereum Foundation" -> EF (#3242) 2022-01-05 09:42:56 +01:00
Zahary Karadjov 54d0d588b1 Implementation of the Keymanager API (BETA)
https://github.com/ethereum/keymanager-APIs
2022-01-04 18:51:45 +02:00
tersec b81c06edab
rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240) 2022-01-04 09:45:38 +00:00
tersec d20387e910
update copyright years and spec URLs (#3239) 2022-01-04 06:08:19 +00:00
tersec da017d2ca5
update from phase0/altair v1.1.6 URLs to v1.1.8 spec URLs (#3238) 2022-01-04 03:57:15 +00:00
tersec 3c63a78c01
use v1.1.8 test vectors (#3236) 2022-01-03 17:43:00 +00:00
tersec 8be1699014
use v1.1.7 test vectors (#3231)
* use v1.1.7 test vectors
2022-01-03 13:06:14 +00:00
tersec d4680df8d2
convert between engine and consensus ExecutionPayloads (#3228)
* convert between engine and consensus ExecutionPayloads
2022-01-03 13:22:56 +01:00
Jacek Sieka 7ec97a6b35
Fix missing checkpoint states (#3225)
With the right sequence of events (for example a REST request or a
validation), it can happen that the first traversal across a state
checkpoint boundary is done without storing that state on disk - this
causes problems when replaying states, because now states may be missing
from the database.

Here, we simply avoid using the caches when advancing a state that will
go into the database, ensuring that the information lost during caching
always is permanently stored.

* fix recursion bug in `isProposed`
2021-12-30 12:33:03 +01:00
Zahary Karadjov 6b4f32ae23
Replicate a recent fix from the launch_local_testnet script due to widespread code duplication 2021-12-22 17:59:45 +02:00
tersec 1a6a56bdb1
use BeaconTime instead of Slot in fork choice (#3138)
* use v1.1.6 test vectors; use BeaconTime instead of Slot in fork choice

* tick through every slot at least once

* use div INTERVALS_PER_SLOT and use precomputed constants of them

* use correct (even if numerically equal) constant
2021-12-21 18:56:08 +00:00
tersec 0d4e49f946
Merge fork gossip support (#3213)
* Merge fork gossip support

* index directly by BeaconStateFork and remove debugging log statement
2021-12-21 15:24:23 +01:00
Jacek Sieka 1021e3324e
Revert writing backfill root to database (#3215)
Introduced in #3171, it turns out we can just follow the block headers
to achieve the same effect

* leaves the constant in the code so as to avoid confusion when reading
database that had the constant written (such as the fleet nodes and
other unstable users)
2021-12-21 11:40:14 +01:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top of performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner
2021-12-20 20:20:31 +01:00
tersec d7799ecdcc
v1.1.6 spec updates (#3206) 2021-12-17 06:56:33 +00:00
Jacek Sieka 118840d241
SyncManager cleanups for backfill support (#3189)
* SyncManager cleanups for backfill support

Cleanups, fixes and simplifications, in anticipation of backfill support
for the `SyncManager`:

* reformat sync progress indicator to show time left and % done more
prominently:
  * old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"`
  * new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"`
* reset average speed when going out of sync
* pass all block errors to sync manager, including duplicate/unviable
* penalize peers for reporting a head block that is outside of our
expected wall clock time (they're likely on a different network or
trying to disrupt sync)
* remove `SyncFailureKind` (unused)
* remove `inRange` (unused)
* add `Q` for sync queue requests that are in the `SyncQueue` but not
yet in the `BlockProcessor` queue
* update last slot in `SyncQueue` after getting peer status
* fix race condition between `wakeupWaiters` and `resetWait`, where
workers would not be correctly reset if block verification returned a
completed future without event loop
* log syncmanager direction

* Fix ordering issue.
Requests whose size is not equal to `chunkSize` could be processed in the wrong order, which could cause the sync process to freeze.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2021-12-16 15:57:16 +01:00
tersec 36ade1c1c6
v1.1.6 spec updates (minor, mostly URLs) (#3197) 2021-12-14 21:02:29 +00:00
tersec 4498d96a9a
don't build tests_blockchain_dag or tests_keystore on i386 (#3190) 2021-12-14 06:06:05 +00:00
tersec f09686e835
update some spec URLs to v1.1.6 (#3188) 2021-12-13 15:45:48 +00:00
Jacek Sieka 03005f48e1
Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.

When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.

When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.

Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.

This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.

Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
Jacek Sieka dfbd50b4d6
avoid SyncCommitteMsgPool copy (#3185)
introduced by batch verification, when verifiers were made async
2021-12-11 16:39:24 +01:00
Etan Kissling 984dc18dc6
import `is_valid_merkle_branch` test cases from `nim-eth` (#3182)
As of https://github.com/status-im/nim-eth/pull/379 `nim-eth` defines a
couple static test cases for merkle proof verification.
Since the EF has defined a `is_valid_merkle_branch` function in the spec
we are no longer using the custom implementation from `nim-eth`, but the
tests were never ported to target the new implementation. This patch now
follows up on that and integrates those tests from `nim-eth`.
2021-12-10 16:56:26 +01:00
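
For reference, the `is_valid_merkle_branch` check targeted by the tests in #3182 can be sketched in Python, following the shape of the function in the consensus spec. This is only an illustration, not the Nim implementation from the repository; the `sha256` helper name is an assumption made for this sketch.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    # Helper for this sketch: SHA-256 digest of the input bytes.
    return hashlib.sha256(data).digest()


def is_valid_merkle_branch(leaf: bytes, branch: list, depth: int,
                           index: int, root: bytes) -> bool:
    # Walk from the leaf up to the root. At each level, bit i of `index`
    # tells whether the current node is a right child (sibling goes first)
    # or a left child (sibling goes second).
    value = leaf
    for i in range(depth):
        if (index >> i) & 1:
            value = sha256(branch[i] + value)
        else:
            value = sha256(value + branch[i])
    return value == root


# Usage: a 4-leaf tree, proving the leaf at index 2.
leaves = [bytes([i]) * 32 for i in range(4)]
node01 = sha256(leaves[0] + leaves[1])
node23 = sha256(leaves[2] + leaves[3])
tree_root = sha256(node01 + node23)
proof_ok = is_valid_merkle_branch(leaves[2], [leaves[3], node01], 2, 2, tree_root)
```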
Jacek Sieka 9f27f0d97c
BlockId reform (#3176)
* BlockId reform

Introduce `BlockId` that helps track a root/slot pair - this prepares
the codebase for backfilling and handling out-of-dag blocks

* move block dag code to separate module
* fix finalised state root in REST event stream
* fix finalised head computation on head update, when starting from
checkpoint
* clean up chaindag init
* revert the `epochAncestor` change introduced in #3144 that would return
an epoch ancestor from the canonical history instead of the given
history, causing `EpochRef` keys to point to the wrong block
2021-12-09 19:06:21 +02:00
Etan Kissling 5cc6db5e20
remove disabled incorrect attestation test (#3175)
In #780 a test was disabled that verified that an attestation with
empty `aggregation_bits` completes successfully. The test was never
re-introduced, and as of the current consensus spec v1.1.6, such
attestations are not considered valid, as they fail the check in
`is_valid_indexed_attestation`. This patch fully removes that outdated
test, and moves it to the list of pending invalid attestation tests.
2021-12-09 14:03:22 +01:00
Jacek Sieka 069bccd51b
batch-verify sync messages for a small perf boost (#3151)
* batch-verify sync messages for a small perf boost

Generally reuses the same structure as attestation and aggregate
verification

* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
2021-12-09 14:56:54 +02:00
tersec d93a279565
engine API alpha.5 field renaming (#3174) 2021-12-09 11:18:38 +00:00
Eugene Kabanov b05734f610
Backward sync support for SyncManager. (#3131)
* Unbundle SyncQueue from sync_manager.nim.
Unbundle Peer scores constants to peer_scores.nim.
Add Forward/Backward enum.

* Further improvements and tests.

* Adopt getRewindPoint() and fix MissingParent handler.

* Remove unused procedures.
Refactor `result` usage.
Fix resetWait().

* Add all the tests and fix the issue with rewind point.

* Fix get() issue.

* Fix flaky tests.

* test fixes

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-12-08 22:15:29 +01:00
Jacek Sieka 89d6a1b403
Introduce slot->BlockRef mapping for finalized chain (#3144)
* Introduce slot->BlockRef mapping for finalized chain

The finalized chain is linear, thus we can use a seq to lookup blocks by
slot number.

Here, we introduce such a seq, even though in the future, it should
likely be backed by a database structure instead, or, more likely, a
flat era file with a flat lookup index.

This dramatically speeds up requests by slot, such as those coming from
the REST interface or GetBlocksByRange, as these are currently served by
a linear iteration from head.

* fix REST block requests to not return blocks from an earlier slot when
the given slot is empty
* fix StateId interpretation such that it doesn't treat state roots as
block roots
* don't load full block from database just to return its root
2021-12-06 20:52:35 +02:00
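
The slot-indexed lookup described in #3144 can be sketched as follows. This is a minimal Python illustration, assuming a list indexed by slot offset with `None` for empty slots; the `FinalizedBlocks` class and its method names are hypothetical and do not mirror the Nim code.

```python
from typing import List, Optional


class FinalizedBlocks:
    """Hypothetical slot -> block root table for a linear finalized chain."""

    def __init__(self, tail_slot: int) -> None:
        self.tail_slot = tail_slot
        # Position i holds the block root at slot tail_slot + i, or None
        # when that slot has no block.
        self.blocks: List[Optional[str]] = []

    def add(self, slot: int, root: str) -> None:
        index = slot - self.tail_slot
        if index < len(self.blocks):
            raise ValueError("blocks must be added in increasing slot order")
        # Pad skipped (empty) slots so that list position == slot offset.
        self.blocks.extend([None] * (index - len(self.blocks)))
        self.blocks.append(root)

    def block_at_slot(self, slot: int) -> Optional[str]:
        # Constant-time lookup instead of iterating back from the head.
        index = slot - self.tail_slot
        if 0 <= index < len(self.blocks):
            return self.blocks[index]
        return None


chain = FinalizedBlocks(tail_slot=100)
chain.add(100, "root-a")
chain.add(102, "root-c")  # slot 101 is empty
```

In this sketch an empty slot yields `None` rather than the block from an earlier slot, which matches the REST fix listed in the same commit.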
Jacek Sieka 1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
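
The re-queueing flow that #3124 moves into `block_processor` can be sketched roughly as below. This is a simplified Python model under stated assumptions: blocks are `(root, parent_root)` pairs, the DAG is a set of known roots, and the `BlockProcessor` class with its `enqueue` and `run` methods is hypothetical rather than the actual Nim API.

```python
from collections import deque


class BlockProcessor:
    """Hypothetical model: orphans wait in a quarantine kept outside the DAG."""

    def __init__(self) -> None:
        self.dag = {"genesis"}   # roots of blocks already added
        self.quarantine = {}     # missing parent root -> waiting blocks
        self.queue = deque()     # the single processing queue

    def enqueue(self, root: str, parent: str) -> None:
        self.queue.append((root, parent))

    def run(self) -> list:
        added = []
        while self.queue:
            root, parent = self.queue.popleft()
            if root in self.dag:
                continue  # duplicate, nothing to do
            if parent not in self.dag:
                # Parent unknown: park the block until the parent arrives.
                self.quarantine.setdefault(parent, []).append((root, parent))
                continue
            self.dag.add(root)
            added.append(root)
            # Follow-up work: children waiting on this block go back through
            # the normal queue instead of being processed re-entrantly.
            for child in self.quarantine.pop(root, []):
                self.queue.append(child)
        return added


processor = BlockProcessor()
processor.enqueue("c", "b")   # arrives before its parent
processor.enqueue("b", "a")   # arrives before its parent
processor.enqueue("a", "genesis")
order = processor.run()
```

Because quarantined blocks re-enter the same queue, the function that adds a block never calls itself in this model, which is the property the commit message is after.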
tersec a8c801eddd
fix Altair fork tests in minimal preset (#3163) 2021-12-06 05:56:46 +00:00
tersec e6921f808f
cleanups, partly from kintsugi branch (#3161)
* cleanups, partly from kintsugi branch

* re-export shortLog(EthBlock) and preserve exception messages in batchVerify and processBatch
2021-12-05 17:32:41 +00:00
tersec 4378f3f096
almost all remaining ethereum/{eth2.0-specs -> consensus-specs} (#3158) 2021-12-03 20:01:13 +00:00
tersec cc51f3fd12
v1.1.{5 -> 6} phase 0 and altair spec URL updates (#3157) 2021-12-03 17:40:23 +00:00
Zahary Karadjov 6fddff524c Switch back to WebSocket URLs in the Eth1Monitor
The HTTP support is not stable enough yet.
2021-12-03 17:04:29 +02:00
Etan Kissling 5e9625c1be extend `makeTestBlocks` for sync aggregates
This extends the `makeTestBlocks` function used in tests with a new
parameter `syncCommitteeRatio` to control whether the produced blocks
should be signed by the validators assigned to the sync committee.
A similar parameter already exists to configure whether attestations
for the test blocks should be produced.
2021-12-01 20:10:33 +02:00
tersec 61fb458f89
use v1.1.6 test vectors (#3146) 2021-12-01 12:55:42 +00:00
Jacek Sieka aa1dea03cd
speed up gossip and sync block validation (#3143)
* avoid recomputing hash for block signature check
* check block slot match before hitting the database
2021-12-01 10:52:40 +01:00
cheatfate b3ee5d67bd Remove nimbus_signing_process. 2021-11-30 16:48:36 +02:00
Eugene Kabanov e62c7c7c37
Remote signing client/server. (#3077) 2021-11-30 03:20:21 +02:00
Mamy Ratsimbazafy 97da6e1365
Fork choice EF consensus tests (#3041)
* add EF fork choice tests to CI

* checkpoints

* compilation fixes and add test to preset dependent suite

* support longpaths on Windows CI

* skip minimal tests (long paths issue + impl details tested)

* fix stackoverflow on some platforms

* rebase on top of https://github.com/status-im/nimbus-eth2/pull/3054

* fix stack usage
2021-11-25 19:41:39 +01:00
Jacek Sieka a223d62b07
Cleanups (#3123)
Renames and cleanups split out from the validator monitoring branch, so
as to reduce conflict area vs other PRs:

* add constants for expected message timing
* name validators after the messages they validate, mostly, to make
grepping easier
* unify field naming of EpochInfo across forks to make cross-fork code
easier
2021-11-25 13:20:36 +01:00
Jacek Sieka 9c2f43ed0e
Speed up altair block processing 2x (#3115)
* Speed up altair block processing >2x

Like #3089, this PR drastically speeds up historical REST queries and
other long state replays.

* cache sync committee validator indices
* use ~80mb less memory for validator pubkey mappings
* batch-verify sync aggregate signature (fixes #2985)
* document sync committee hack with head block vs sync message block
* add batch signature verification failure tests

Before:

```
../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    5830.675,        0.000,     5830.675,     5830.675,            1, Initialize DB
       0.481,        1.878,        0.215,       59.167,          981, Load block from database
    8422.566,        0.000,     8422.566,     8422.566,            1, Load state from database
       6.996,        1.678,        0.042,       14.385,          969, Advance slot, non-epoch
      93.217,        8.318,       84.192,      122.209,           32, Advance slot, epoch
      20.513,       23.665,       11.510,      201.561,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

After:

```
    7081.422,        0.000,     7081.422,     7081.422,            1, Initialize DB
       0.553,        2.122,        0.175,       66.692,          981, Load block from database
    5439.446,        0.000,     5439.446,     5439.446,            1, Load state from database
       6.829,        1.575,        0.043,       12.156,          969, Advance slot, non-epoch
      94.716,        2.749,       88.395,      100.026,           32, Advance slot, epoch
      11.636,       23.766,        4.889,      205.250,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

* add comment
2021-11-24 13:43:50 +01:00
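
The before/after averages from the #3115 benchmark tables can be compared with a few lines of Python. The numbers are copied from the "Average" column above (in ms); the dictionary layout and variable names are just for this sketch.

```python
# Average times in ms, copied from the "Before" and "After" tables.
before = {
    "Load state from database": 8422.566,
    "Advance slot, non-epoch": 6.996,
    "Advance slot, epoch": 93.217,
    "Apply block, no slot processing": 20.513,
}
after = {
    "Load state from database": 5439.446,
    "Advance slot, non-epoch": 6.829,
    "Advance slot, epoch": 94.716,
    "Apply block, no slot processing": 11.636,
}

# Ratio > 1 means the step got faster, < 1 means it got slower.
speedup = {name: before[name] / after[name] for name in before}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

By these averages, applying a block is roughly 1.76x faster and slot advancement is essentially unchanged; the per-block minimum (11.510 ms to 4.889 ms) improves by more than the average does.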
Zahary Karadjov 88c623e250 Add support for HTTPS Web3 providers 2021-11-23 15:56:18 +02:00
Jacek Sieka f19a497eec
ncli_db: add putState, putBlock (#3096)
* ncli_db: add putState, putBlock

These tools allow modifying an existing nimbus database for the purpose
of recovery or reorg, moving the head, tail and genesis to arbitrary
points.

* remove potentially expensive `putState` in `BeaconStateDB`
* introduce `latest_block_root` which computes the root of the latest
applied block from the `latest_block_header` field (instead of passing
it in separately)
* avoid some unnecessary BeaconState copies during init
* discover https://github.com/nim-lang/Nim/issues/19094
* prefer `HashedBeaconState` in a few places to avoid recomputing state
root
* fetch latest block root from state when creating blocks
* harden `get_beacon_proposer_index` against invalid slots and document
* move random spec function tests to `test_spec.nim`
* avoid unnecessary state root computation before block proposal
2021-11-18 13:02:43 +01:00