Commit Graph

4376 Commits

Author SHA1 Message Date
Jacek Sieka a50e21e229
fix doppelganger detection logging
* update action tracker on dependent-root-changing reorg (instead of
epoch change)
* don't try to log duties while syncing - we're not tracking actions yet
* fix slot used for doppelganger loss detection
2022-02-04 12:25:32 +01:00
Jacek Sieka 49282e9477
val_mon: register locally produced aggregates (#3352)
These use a separate flow, and were previously only registered from the
network

* don't log successes in totals mode (TMI)
* remove `attestation-sent` event which is unused
2022-02-04 08:33:20 +01:00
tersec 9c18765b3b
remove ncli_db pruneDatabase (#3356) 2022-02-03 20:03:01 +01:00
tersec de0d473ea1
docs: don't rely on ncli_db pruneDatabase for reducing storage usage (#3355)
* don't rely on ncli_db pruneDatabase for reducing storage usage

* remove "Running out of storage" section altogether

* Update docs/the_nimbus_book/src/troubleshooting.md

Co-authored-by: sacha <sacha@status.im>

Co-authored-by: sacha <sacha@status.im>
2022-02-03 12:35:34 +00:00
Zahary Karadjov 215caa21ae Eth1 monitor fixes
* Fix a resource leak introduced in https://github.com/status-im/nimbus-eth2/pull/3279

* Don't restart the Eth1 syncing proggress from scratch in case of
  monitor failures during Eth2 syncing.

* Switch to the primary operator as soon as it is back online.

* Log the web3 credentials in fewer places

Other changes:

The 'web3 test' command has been enhanced to obtain and print more
data regarding the selected provider.
2022-02-03 14:01:55 +02:00
tersec 702d9e8c55
for x86 macOS, require >= Nehalem (#3353) 2022-02-02 15:24:41 +00:00
tersec 8e6a920bf4
rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH (#3350)
* rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH

* fix REST test rules
2022-02-02 14:06:55 +01:00
Jacek Sieka ff4f2a6b6c
better log on finalized slot failure 2022-02-01 21:23:18 +01:00
Tanguy bcd7b4598c
Tune peering (#3348)
- Request metadata_v2 (altair) by default instead of the v1
- Change the metadata pinger to a 3 failure-then-kick, instead of being time based
- Update kicker scorer to take into account topics which we're not subscribed to, to be sure that we will be able to publish correctly
- Add some metrics to give "fanout" health (in the same spirit of mesh health)
2022-02-01 18:20:55 +01:00
Zachinquarantine f5de887df7
Delete Pyrmont docs (#3340)
* Delete pyrmont.md

* Update log-rotate.md

* Update pi-guide.md
2022-02-01 12:05:20 +01:00
sacha 7d731322b2
Book: trusted sync (edits and clarifications) (#3329)
* edits

* more edits and clarifications

* edit

* add clarification on --trusted-node-url

* address feedback

* remove repetition
2022-02-01 12:02:04 +01:00
Zahary Karadjov ac16eb4691 Streamline the validator reward analysis
Notable improvements:

* A separate aggregation pass is no longer required.

* The user can opt to produce only aggregated data
  (resuing in a much smaller data set).

* Large portion of the number cruching in Jupyter is now done in C
  through the rich DataFrames API.

* Added support for comparisons against the "median" validator
  performance in the network.
2022-02-01 11:30:14 +02:00
tersec 0c814f49ee
rename sync_{committee_,}aggregate and execute_payload -> notify_new_payload (#3347) 2022-02-01 07:31:53 +00:00
EmilIvanichkovv 336403d18b Refactor `handleValidatorExitCommand`
Make `validator exit command` work both with `JSON-RPC` and `REST` APIs
Fix problem with specifying rest-url using `localhost`
Change back exit error messages in `state_transition_block`
2022-02-01 01:24:05 +02:00
Jacek Sieka 3df9ffca9f val-mon: remove redundant `_total` suffix from counters
It turns out nim-metrics adds this suffix on its own - it also turns out
some of the names are non-conventional and need follow-up.
2022-01-31 18:51:24 +02:00
tersec c9aa1bee01
spec URL updates (#3342) 2022-01-31 09:56:59 +00:00
Jacek Sieka ad327a8769
Fix counters in validator monitor totals mode (#3332)
The current counters set gauges etc to the value of the _last_ validator
to be processed - as the name of the feature implies, we should be using
sums instead.

* fix missing beacon state metrics on startup, pre-first-head-selection
* fix epoch metrics not being updated on cross-epoch reorg
2022-01-31 08:36:29 +01:00
Jacek Sieka d583e8e4ac
Store finalized block roots in database (3s startup) (#3320)
* Store finalized block roots in database (3s startup)

When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in `.era` files to allow
constant-time by-slot lookups.

In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.

Doing so has a number of interesting effects:

* mainnet startup time is improved 3-5x (3s on my laptop)
* the _first_ startup might take slightly longer as the new index is
being built - ~10s on the same laptop
* we no longer rely on the beacon block summaries to load the full dag -
this is a lot faster because we no longer have to look up each block by
parent root
* a collateral benefit is that we no longer need to load the full
summaries table into memory - we get the RSS benefits of #3164 without
the CPU hit.

Other random stuff:

* simplify forky block generics
* fix withManyWrites multiple evaluation
* fix validator key cache not being updated properly in chaindag
read-only mode
* drop pre-altair summaries from `kvstore`
* recreate missing summaries from altair+ blocks as well (in case
database has lost some to an involuntary restart)
* print database startup timings in chaindag load log
* avoid allocating superfluos state at startup
* use a recursive sql query to load the summaries of the unfinalized
blocks
2022-01-30 18:51:04 +02:00
Emil 0051af430b Put `application/json` as a higher preference than `application/octet-stream` 2022-01-30 18:50:14 +02:00
tersec 29e2169585
phase 0 & altair beacon chain and altair validator spec URL updates (#3339) 2022-01-29 13:53:31 +00:00
tersec 89ffa8a1a7
spec URL & copyright year update (#3338) 2022-01-29 01:05:39 +00:00
tersec 60bf5b8bf4
use v1.1.9 test vectors (#3337) 2022-01-28 22:47:48 +00:00
tersec 95fee10328
clean up hashed rollback proc declarations (#3333)
* clean up hashed rollback proc declarations

* use generic hashed rollback proc type
2022-01-28 14:24:37 +00:00
cheatfate 1287a20b13 Use HTTP status codes instead of status in body. 2022-01-28 15:36:27 +02:00
Jacek Sieka e264276b36
keep unviables in quarantine (#3331)
they remain unviable even after a reorg
2022-01-28 11:59:55 +01:00
Zahary Karadjov 49b7daa39d [ncli_db] bugfix: take into account finalization delay in reward calc post Altair
This fixes a problem affecting Prater's epoch 64444.
2022-01-28 12:03:23 +02:00
tersec dcb671617c
add/support TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH (#3303) 2022-01-27 19:52:08 +00:00
Jacek Sieka 84b6ad871d harden status message handling
Additional sanity checking of the status message exchanged during a
fresh connection:

* check that head and finalized make sense, slot-wise
* verify that finalized root lies on the canonical chain, when possible
* re-check these things for every status message during sync
2022-01-27 18:46:47 +02:00
Eugene Kabanov aa27baacf5
Fix 408 Timeout error returned by REST server. (#3301)
* Disable REST server timeouts.
* Add options to CLI to tune REST server parameters.
2022-01-27 18:41:05 +02:00
Ștefan Talpalaru d5a2c75963
restapi.sh: cleanup on exit (#3328)
also rename a confusing option/var combo
2022-01-27 13:03:38 +01:00
Jacek Sieka 5dd362fc6e
book: add trusted node sync to index (#3326)
* book: add trusted node sync to index

...and some doc updates

* Update docs/the_nimbus_book/src/start-syncing.md

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2022-01-27 10:19:13 +01:00
tersec 7c51da037f
add block gossip validation condition (#3325) 2022-01-26 17:22:06 +00:00
Ștefan Talpalaru fdb76b8e0b
bump nimbus-build-system (#3324) 2022-01-26 17:11:28 +01:00
Ștefan Talpalaru 47a6f9950f
bump nimbus-build-system (#3322) 2022-01-26 14:15:53 +01:00
tersec 2b4a960270
rename On{Merge,Bellatrix}BlockAdded and Rollback{Merge,Bellatrix}HashedProc (#3321) 2022-01-26 13:21:29 +01:00
Jacek Sieka f70aceef37
Harden handling of unviable forks (#3312)
* Harden handling of unviable forks

In our current handling of unviable forks, we allow peers to send us
blocks that come from a different fork - this is not necessarily an
error as it can happen naturally, but it does open up the client to a
case where the same unviable fork keeps getting requested - rather than
allowing this to happen, we'll now give these peers a small negative
score - if it keeps happening, we'll disconnect them.

* keep track of unviable forks in quarantine, to avoid filling it with
known junk
* collect peer scores in single module
* descore peers when they send unviable blocks during sync
* don't give score for duplicate blocks
* increase quarantine size to a level that allows finality to happen
under optimal conditions - this helps avoid downloading the same blocks
over and over in case of an unviable fork
* increase initial score for new peers to make room for one more failure
before disconnection
* log and score invalid/unviable blocks in requestmanager too
* avoid ChainDAG dependency in quarantine
* reject gossip blocks with unviable parent
* continue processing unviable sync blocks in order to build unviable
dag

* docs

* Update beacon_chain/consensus_object_pools/block_pools_types.nim

* add unviable queue test
2022-01-26 13:20:08 +01:00
tersec bd0a3a9b10
rearrange MEV code (#3319) 2022-01-25 19:43:28 +00:00
sacha ed38b187cc
book: migration guide update (#3316)
* rest api: first draft

* minor edit

* add lighthouse export

* add teku

* cp

* cp

* fix merge conflict"

* fix typo

* incorporate feedback

* correct path mistake
2022-01-25 15:16:50 +01:00
Emil efbd939108 Make `handleValidatorExitCommand` work with `REST API` 2022-01-25 14:00:29 +02:00
Jacek Sieka d076e1a11b
ncli_db: import states and blocks from era file (#3313) 2022-01-25 09:28:26 +01:00
tersec 00a347457a
dynamic sync committee subscriptions (#3308)
* dynamic sync committee subscriptions

* fast-path trivial case rather than rely on RNG with probability 1 outcome

Co-authored-by: zah <zahary@gmail.com>

* use func instead of template; avoid calling async function unnecessarily

* avoid unnecessary sync committee topic computation; use correct epoch lookahead; enforce exception/effect tracking

* don't over-optimistically update ENR syncnets; non-looping version of nearSyncCommitteePeriod

* allow separately setting --allow-all-{sub,att,sync}nets

* remove unnecessary async

Co-authored-by: zah <zahary@gmail.com>
2022-01-24 20:40:59 +00:00
tersec 062275461c
add flashbots (milestone 1) consensus beacon block types (#3314)
* add flashbots (milestone 1) consensus beacon block types

* remove MEV types from main bellatrix spec module
2022-01-24 20:15:22 +00:00
tersec 351c2fd48a
rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315) 2022-01-24 16:23:13 +00:00
sacha e07044754a
Update rest-api.md 2022-01-24 15:06:35 +01:00
sacha b7b517d97e
improve rest api documentation: first draft (#3306)
* rest api: first draft

* minor edit

* Update docs/the_nimbus_book/src/rest-api.md

Co-authored-by: Jacek Sieka <jacek@status.im>

* Update docs/the_nimbus_book/src/rest-api.md

Co-authored-by: Jacek Sieka <jacek@status.im>

* Update docs/the_nimbus_book/src/rest-api.md

Co-authored-by: Jacek Sieka <jacek@status.im>

* incorporate feedback

Co-authored-by: Jacek Sieka <jacek@status.im>
2022-01-24 14:46:40 +01:00
Zahary Karadjov 54a745cb0e Bugfix: Take into account the finalization delay in the ncli_db rewards calculation
This fixes a reward calculation error affecting Prater's epoch 31256
2022-01-23 23:10:56 +02:00
Jacek Sieka c7e92bfd84
wss_sim: state transition simulator (#3309)
It's sometimes useful to simulate what happens when a chain runs from a
given state with a given set of private keys - `wss_sim` allows running
such a simulation.

One use of such a tool is to simulate a weak subjectivity attack,
creating alternative histories of the same chain:
https://notes.status.im/nimbus-insecura-network#
2022-01-22 10:25:30 +01:00
Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
tersec 1a37cae329
allow Eth1 monitor to run without genesis_deposit_contract_snapshot.ssz (#3279) 2022-01-21 12:59:09 +02:00
Eugene Kabanov 0ea6dfa517
Fix current slot value and finishing progress for backfilling. (#3304) 2022-01-21 10:35:54 +01:00