Commit Graph

2365 Commits

Author SHA1 Message Date
Tanguy bcd7b4598c
Tune peering (#3348)
- Request metadata_v2 (altair) by default instead of the v1
- Change the metadata pinger to a 3 failure-then-kick, instead of being time based
- Update kicker scorer to take into account topics which we're not subscribed to, to be sure that we will be able to publish correctly
- Add some metrics to give "fanout" health (in the same spirit of mesh health)
2022-02-01 18:20:55 +01:00
tersec 0c814f49ee
rename sync_{committee_,}aggregate and execute_payload -> notify_new_payload (#3347) 2022-02-01 07:31:53 +00:00
EmilIvanichkovv 336403d18b Refactor `handleValidatorExitCommand`
Make `validator exit command` work both with `JSON-RPC` and `REST` APIs
Fix problem with specifying rest-url using `localhost`
Change back exit error messages in `state_transition_block`
2022-02-01 01:24:05 +02:00
Jacek Sieka 3df9ffca9f val-mon: remove redundant `_total` suffix from counters
It turns out nim-metrics adds this suffix on its own - it also turns out
some of the names are non-conventional and need follow-up.
2022-01-31 18:51:24 +02:00
tersec c9aa1bee01
spec URL updates (#3342) 2022-01-31 09:56:59 +00:00
Jacek Sieka ad327a8769
Fix counters in validator monitor totals mode (#3332)
The current counters set gauges etc to the value of the _last_ validator
to be processed - as the name of the feature implies, we should be using
sums instead.

* fix missing beacon state metrics on startup, pre-first-head-selection
* fix epoch metrics not being updated on cross-epoch reorg
2022-01-31 08:36:29 +01:00
Jacek Sieka d583e8e4ac
Store finalized block roots in database (3s startup) (#3320)
* Store finalized block roots in database (3s startup)

When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in `.era` files to allow
constant-time by-slot lookups.

In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.

Doing so has a number of interesting effects:

* mainnet startup time is improved 3-5x (3s on my laptop)
* the _first_ startup might take slightly longer as the new index is
being built - ~10s on the same laptop
* we no longer rely on the beacon block summaries to load the full dag -
this is a lot faster because we no longer have to look up each block by
parent root
* a collateral benefit is that we no longer need to load the full
summaries table into memory - we get the RSS benefits of #3164 without
the CPU hit.

Other random stuff:

* simplify forky block generics
* fix withManyWrites multiple evaluation
* fix validator key cache not being updated properly in chaindag
read-only mode
* drop pre-altair summaries from `kvstore`
* recreate missing summaries from altair+ blocks as well (in case
database has lost some to an involuntary restart)
* print database startup timings in chaindag load log
* avoid allocating superfluos state at startup
* use a recursive sql query to load the summaries of the unfinalized
blocks
2022-01-30 18:51:04 +02:00
Emil 0051af430b Put `application/json` as a higher preference than `application/octet-stream` 2022-01-30 18:50:14 +02:00
tersec 29e2169585
phase 0 & altair beacon chain and altair validator spec URL updates (#3339) 2022-01-29 13:53:31 +00:00
tersec 89ffa8a1a7
spec URL & copyright year update (#3338) 2022-01-29 01:05:39 +00:00
tersec 60bf5b8bf4
use v1.1.9 test vectors (#3337) 2022-01-28 22:47:48 +00:00
tersec 95fee10328
clean up hashed rollback proc declarations (#3333)
* clean up hashed rollback proc declarations

* use generic hashed rollback proc type
2022-01-28 14:24:37 +00:00
cheatfate 1287a20b13 Use HTTP status codes instead of status in body. 2022-01-28 15:36:27 +02:00
Jacek Sieka e264276b36
keep unviables in quarantine (#3331)
they remain unviable even after a reorg
2022-01-28 11:59:55 +01:00
Zahary Karadjov 49b7daa39d [ncli_db] bugfix: take into account finalization delay in reward calc post Altair
This fixes a problem affecting Prater's epoch 64444.
2022-01-28 12:03:23 +02:00
tersec dcb671617c
add/support TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH (#3303) 2022-01-27 19:52:08 +00:00
Jacek Sieka 84b6ad871d harden status message handling
Additional sanity checking of the status message exchanged during a
fresh connection:

* check that head and finalized make sense, slot-wise
* verify that finalized root lies on the canonical chain, when possible
* re-check these things for every status message during sync
2022-01-27 18:46:47 +02:00
Eugene Kabanov aa27baacf5
Fix 408 Timeout error returned by REST server. (#3301)
* Disable REST server timeouts.
* Add options to CLI to tune REST server parameters.
2022-01-27 18:41:05 +02:00
tersec 7c51da037f
add block gossip validation condition (#3325) 2022-01-26 17:22:06 +00:00
tersec 2b4a960270
rename On{Merge,Bellatrix}BlockAdded and Rollback{Merge,Bellatrix}HashedProc (#3321) 2022-01-26 13:21:29 +01:00
Jacek Sieka f70aceef37
Harden handling of unviable forks (#3312)
* Harden handling of unviable forks

In our current handling of unviable forks, we allow peers to send us
blocks that come from a different fork - this is not necessarily an
error as it can happen naturally, but it does open up the client to a
case where the same unviable fork keeps getting requested - rather than
allowing this to happen, we'll now give these peers a small negative
score - if it keeps happening, we'll disconnect them.

* keep track of unviable forks in quarantine, to avoid filling it with
known junk
* collect peer scores in single module
* descore peers when they send unviable blocks during sync
* don't give score for duplicate blocks
* increase quarantine size to a level that allows finality to happen
under optimal conditions - this helps avoid downloading the same blocks
over and over in case of an unviable fork
* increase initial score for new peers to make room for one more failure
before disconnection
* log and score invalid/unviable blocks in requestmanager too
* avoid ChainDAG dependency in quarantine
* reject gossip blocks with unviable parent
* continue processing unviable sync blocks in order to build unviable
dag

* docs

* Update beacon_chain/consensus_object_pools/block_pools_types.nim

* add unviable queue test
2022-01-26 13:20:08 +01:00
tersec bd0a3a9b10
rearrange MEV code (#3319) 2022-01-25 19:43:28 +00:00
Emil efbd939108 Make `handleValidatorExitCommand` work with `REST API` 2022-01-25 14:00:29 +02:00
Jacek Sieka d076e1a11b
ncli_db: import states and blocks from era file (#3313) 2022-01-25 09:28:26 +01:00
tersec 00a347457a
dynamic sync committee subscriptions (#3308)
* dynamic sync committee subscriptions

* fast-path trivial case rather than rely on RNG with probability 1 outcome

Co-authored-by: zah <zahary@gmail.com>

* use func instead of template; avoid calling async function unnecessarily

* avoid unnecessary sync committee topic computation; use correct epoch lookahead; enforce exception/effect tracking

* don't over-optimistically update ENR syncnets; non-looping version of nearSyncCommitteePeriod

* allow separately setting --allow-all-{sub,att,sync}nets

* remove unnecessary async

Co-authored-by: zah <zahary@gmail.com>
2022-01-24 20:40:59 +00:00
tersec 062275461c
add flashbots (milestone 1) consensus beacon block types (#3314)
* add flashbots (milestone 1) consensus beacon block types

* remove MEV types from main bellatrix spec module
2022-01-24 20:15:22 +00:00
tersec 351c2fd48a
rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315) 2022-01-24 16:23:13 +00:00
Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
tersec 1a37cae329
allow Eth1 monitor to run without genesis_deposit_contract_snapshot.ssz (#3279) 2022-01-21 12:59:09 +02:00
Eugene Kabanov 0ea6dfa517
Fix current slot value and finishing progress for backfilling. (#3304) 2022-01-21 10:35:54 +01:00
Dustin Brody 9699858422 rename MERGE_FORK_VERSION to BELLATRIX_FORK_VERSION 2022-01-20 19:33:05 +02:00
Mamy Ratsimbazafy 9e9ccf4a1f
Slashing prot interchange tests v5.2.1 (#3277)
* initial support for minification and new interchange tests. Removal of v1 and v1 migration.

* Synthetic attestations: SQLite3 requires one statement/query per prepared statement

* Fix DB import interrupted if no attestation was found

* Skip test relying on undocumented test behavior (https://github.com/eth-clients/slashing-protection-interchange-tests/pull/12#issuecomment-1011158701)

* Skip test relying on unclear minification behavior:
creating an invalid minified attestation with source > target or setting target = max(source, target)

* remove DB v1 and update submodule

* Apply suggestions from code review

Co-authored-by: Jacek Sieka <jacek@status.im>

Co-authored-by: Jacek Sieka <jacek@status.im>
2022-01-20 17:14:06 +01:00
Jacek Sieka 1df549143e
rest: fix GraffitiBytes serialization (#3299) 2022-01-20 13:31:55 +01:00
Jacek Sieka 570379d3d9
Backfiller (#3263)
Backfilling is the process of downloading historical blocks via P2P that
are required to fulfill `GetBlocksByRange` duties - this happens during
both trusted node and finalized checkpoint syncs.

In particular, backfilling happens after syncing to head, such that
attestation work can start as soon as possible.

* Fix SyncQueue initialization procedure.
Remove usage of `awaitne`.
Add cancellation support.
Remove unneeded `sleepAsync()` if peer's head is older than needed.
Add `direction` field to all logs.
Fix syncmanager wedge issue.
Add proper resource cleaning procedure on backward sync finish.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2022-01-20 08:25:45 +01:00
Jacek Sieka e4939538cd
fix non-totalized validator metric (#3297)
...when running on host with many validators, this explodes
2022-01-19 15:20:48 +01:00
tersec 2f635d3337
rename *_{MERGE => BELLATRIX} constant names (#3296) 2022-01-18 16:31:05 +00:00
tersec 9c0c9c98ce
complete switch to beacon_chain/specs/datatypes/bellatrix (#3295) 2022-01-18 13:36:52 +00:00
Zahary Karadjov 47f1f7ff1a More efficient reward data persistance; Address review comments
The new format is based on compressed CSV files in two channels:

* Detailed per-epoch data
* Aggregated "daily" summaries

The use of append-only CSV file speeds up significantly the epoch
processing speed during data generation. The use of compression
results in smaller storage requirements overall. The use of the
aggregated files has a very minor cost in both CPU and storage,
but leads to near interactive speed for report generation.

Other changes:

- Implemented support for graceful shut downs to avoid corrupting
  the saved files.

- Fixed a memory leak caused by lacking `StateCache` clean up on each
  iteration.

- Addressed review comments

- Moved the rewards and penalties calculation code in a separate module

Required invasive changes to existing modules:

- The `data` field of the `KeyedBlockRef` type is made public to be used
  by the validator rewards monitor's Chain DAG update procedure.

- The `getForkedBlock` procedure from the `blockchain_dag.nim` module
  is made public to be used by the validator rewards monitor's Chain DAG
  update procedure.
2022-01-18 01:56:56 +02:00
Zahary Karadjov 29aad0241b Precise per-component ETH-denominated rewards tracking
This is an alternative take on https://github.com/status-im/nimbus-eth2/pull/3107
that aims for more minimal interventions in the spec modules at the expense of
duplicating more of the spec logic in ncli_db.
2022-01-18 01:56:56 +02:00
Jacek Sieka 4e2d2ff7f4 rest: fix invalid type `RestSyncCommitteeSubscription`
Using the wrong type here causes requests to fail due to the overly
zealous parameter validation - the failure is harmless in the current
duty subscription model, but would have caused more serious failures
down the line.
2022-01-17 22:33:24 +02:00
Jacek Sieka 6bf3330d73
fix ugly delay logging 2022-01-17 20:12:36 +01:00
Jacek Sieka ff5b91cd58
Revert "Don't use GC memory for the initial beacon block summaries loading" (#3292)
This reverts commit 7e2fc2b726.
2022-01-17 12:07:49 +00:00
Jacek Sieka 836f6984bb
move `state_transition` to `Result` (#3284)
* better error messages in api
* avoid `BlockData` copies when replaying blocks
2022-01-17 12:19:58 +01:00
Jacek Sieka 68247f81b3
Trusted node sync (#3209)
* Trusted node sync

Trusted node sync, aka checkpoint sync, allows syncing tyhe chain from a
trusted node instead of relying on a full sync from genesis.

Features include:

* sync from any slot, including the latest finalized slot
* backfill blocks either from the REST api (default) or p2p (#3263)

Future improvements:

* top up blocks between head in database and some other node - this
makes for an efficient backup tool
* recreate historical state to enable historical queries

* fixes

* load genesis from network metadata
* check checkpoint block root against state
* fix invalid block root in rest json decoding
* odds and ends

* retry looking for epoch-boundary checkpoint blocks
2022-01-17 10:27:08 +01:00
Zahary Karadjov ebde027262 Re-enable the HTTP support in Eth1Monitor
This reverts commit 6fddff524c.
2022-01-16 18:26:21 +02:00
Zahary Karadjov 7e2fc2b726 Don't use GC memory for the initial beacon block summaries loading 2022-01-15 10:15:17 +02:00
Jacek Sieka 167068e739
valmon: fix `--validator-monitor-totals` feature (#3286)
Else log size explodes for machines with many validators
2022-01-14 15:57:46 +01:00
Zahary Karadjov bef13b6cce
Version 1.6.0 2022-01-14 13:52:06 +02:00
tersec d878948ed2
update sync committee gossip validation comments; spec URL updates (#3280) 2022-01-13 13:46:08 +00:00
Jacek Sieka d57c2dc4e5
use tail block as sync pivot (#3276)
When syncing, we show how much of the sync has completed - with
checkpoint sync, the syncing does not always go from slot 0 to head, but
rather can start in the middle.

To show a consistent `%` between restarts, we introduce the concept of a
pivot point, such that if I sync 10% of the chain, then restart the
client, it picks up at 10% (instead of counting from 0).

What it looks like:
```
INF ... sync="01d12h41m (15.96%) 13.5158slots/s (QDDQDDQQDP:339018)" ...
```
2022-01-13 10:37:53 +01:00