Commit Graph

258 Commits

Author SHA1 Message Date
Jacek Sieka 167068e739
valmon: fix `--validator-monitor-totals` feature (#3286)
Else log size explodes for machines with many validators
2022-01-14 15:57:46 +01:00
Jacek Sieka d57c2dc4e5
use tail block as sync pivot (#3276)
When syncing, we show how much of the sync has completed - with
checkpoint sync, the syncing does not always go from slot 0 to head, but
rather can start in the middle.

To show a consistent `%` between restarts, we introduce the concept of a
pivot point, such that if I sync 10% of the chain, then restart the
client, it picks up at 10% (instead of counting from 0).

What it looks like:
```
INF ... sync="01d12h41m (15.96%) 13.5158slots/s (QDDQDDQQDP:339018)" ...
```
2022-01-13 10:37:53 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
tersec ae61512ee9
rename upgrade_to_{merge,bellatrix}; detect unchanging spec YAMLs (#3265) 2022-01-10 09:39:43 +00:00
Jacek Sieka 20e700fae4
Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex (#3259)
* Harden CommitteeIndex, SubnetId, SyncSubcommitteeIndex

Harden the use of `CommitteeIndex` et al to prevent future issues by
using a distinct type, then validating before use in several cases -
datatypes in spec are kept simple though so that invalid data still can
be read.

* fix invalid epoch used in REST
`/eth/v1/beacon/states/{state_id}/committees` committee length (could
return invalid data)
* normalize some variable names
* normalize committee index loops
* fix `RestAttesterDuty` to use `uint64` for `validator_committee_index`
* validate `CommitteeIndex` on ingress in REST API
* update rest rules with stricter parsing
* better REST serializers
* save lots of memory by not using `zip` ...at least a few bytes!
2022-01-09 01:28:49 +02:00
Jacek Sieka 0e2b4e39fa
REST JSON support improvements (#3232)
* support downloading blocks / states via JSON in addition to SSZ -
slow, but needed for infura support - SSZ is still used when server
supports it
* use common forked block/state reader in REST API
* fix stack overflows in REST JSON decoder
* fix invalid serialization of `justification_bits` in
`/eth/v1/debug/beacon/states` and `/eth/v2/debug/beacon/states`
* fix REST client to use `/eth/...` instead of `/api/eth/...`, update
"default" urls to expose REST api via `/eth` as well as this is what the
standard says - `/api` was added early on based on an example "base url"
in the spec that has been removed since
* expose Nimbus REST extensions via `/nimbus` in addition to
`/api/nimbus` to stay consistent with `/eth`
* fix invalid state root when reading states via REST
* fix recursive imports in `spec/ssz_codec`
* remove usages of `serialization.useCustomSerialization` - fickle
2022-01-06 08:38:40 +01:00
Jacek Sieka 0a4728a241
Handle access to historical data for which there is no state (#3217)
With checkpoint sync in particular, and state pruning in the future,
loading states or state-dependent data may fail. This PR adjusts the
code to allow this to be handled gracefully.

In particular, the new availability assumption is that states are always
available for the finalized checkpoint and newer, but may fail for
anything older.

The `tail` remains the point where state loading de-facto fails, meaning
that between the tail and the finalized checkpoint, we can still get
historical data (but code should be prepared to handle this as an
error).

However, to harden the code against long replays, several operations
which are assumed to work only with non-final data (such as gossip
verification and validator duties) now limit their search horizon to
post-finalized data.

* harden several state-dependent operations by logging an error instead
of introducing a panic when state loading fails
* `withState` -> `withUpdatedState` to differentiate from the other
`withState`
* `updateStateData` can now fail if no state is found in database - it
is also hardened against excessively long replays
* `getEpochRef` can now fail when replay fails
* reject blocks with invalid target root - they would be ignored
previously
* fix recursion bug in `isProposed`
2022-01-05 19:38:04 +01:00
zah fba1f08a5e
Implement #3129 (Optimized history traversals in the REST API) (#3219)
* Fix REST some rest call signatures and implement a simple API benchmark tool

* Implement #3129 (Optimized history traversals in the REST API)

Other notable changes:

The `updateStateData` procedure in the `blockchain_dag.nim` module is
optimized to not rewind down to the last snapshot state saved in the
database if the supplied input state can be used as a starting point
instead.

* Disallow await in withStateForBlockSlot
2022-01-05 15:49:10 +01:00
tersec 5878d34117
rename forkDigests.merge to forkDigests.bellatrix (#3245) 2022-01-05 14:24:15 +00:00
Zahary Karadjov 54d0d588b1 Implementation of the Keymanager API (BETA)
https://github.com/ethereum/keymanager-APIs
2022-01-04 18:51:45 +02:00
tersec da017d2ca5
update from phase0/altair v1.1.6 URLs to v1.1.8 spec URLs (#3238) 2022-01-04 03:57:15 +00:00
Jacek Sieka c4ce59e55b
Assorted logging improvements (#3237)
* log doppelganger detection when it activates and when it causes missed
duties
* less prominent eth1 sync progress
* log in-progress sync at notice only when actually missing duties
* better detail in replay log
* don't log finalization checkpoints - this is quite verbose when
syncing and already included in "Slot start"
2022-01-03 22:18:49 +01:00
tersec d4680df8d2
convert between engine and consensus ExecutionPayloads (#3228)
* convert between engine and consensus ExecutionPayloads
2022-01-03 13:22:56 +01:00
Zahary Karadjov a860cd6250
Restore the build support of the -d:has_genesis_detection feature 2021-12-23 16:58:54 +02:00
tersec 0d4e49f946
Merge fork gossip support (#3213)
* Merge fork gossip support

* index directly by BeaconStateFork and remove debugging log statement
2021-12-21 15:24:23 +01:00
Jacek Sieka 1021e3324e
Revert writing backfill root to database (#3215)
Introduced in #3171, it turns out we can just follow the block headers
to achieve the same effect

* leaves the constant in the code so as to avoid confusion when reading
database that had the constant written (such as the fleet nodes and
other unstable users)
2021-12-21 11:40:14 +01:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top on performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner)
2021-12-20 20:20:31 +01:00
tersec 6ef3834f4a
fix type-conversions-to-self, unexport from nimbus_beacon_node, and rm unused vars/procs (#3211) 2021-12-20 12:21:17 +01:00
tersec c7be88b432
some spec URL updates (#3210) 2021-12-19 15:12:33 +00:00
Jacek Sieka 118840d241
SyncManager cleanups for backfill support (#3189)
* SyncManager cleanups for backfill support

Cleanups, fixes and simplifications, in anticipation of backfill support
for the `SyncManager`:

* reformat sync progress indicator to show time left and % done more
prominently:
  * old: `sync="sPssPsssss:2:2.4229:00h57m (2706898)"`
  * new: `sync="14d12h31m (0.52%) 1.1378slots/s (wQQQQQDDQQ:1287520)"`
* reset average speed when going out of sync
* pass all block errors to sync manager, including duplicate/unviable
* penalize peers for reporting a head block that is outside of our
expected wall clock time (they're likely on a different network or
trying to disrupt sync)
* remove `SyncFailureKind` (unused)
* remove `inRange` (unused)
* add `Q` for sync queue requests that are in the `SyncQueue` but not
yet in the `BlockProcessor` queue
* update last slot in `SyncQueue` after getting peer status
* fix race condition between `wakeupWaiters` and `resetWait`, where
workers would not be correctly reset if block verification returned a
completed future without event loop
* log syncmanager direction

* Fix ordering issue.
Some of the requests size of which are not equal to `chunkSize` could be processed in wrong order which could lead to sync process freezes.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2021-12-16 15:57:16 +01:00
Jacek Sieka 0f44d2eff7
additional startup logging 2021-12-15 11:13:48 +01:00
Etan Kissling ad26284614
fix typo (`snapshop` -> `snapshot`) (#3192)
This fixes a typo in the word `snapshot` of function name
`extractGenesisValidatorRootFromSnapshop`.
2021-12-14 15:44:34 +00:00
Jacek Sieka 069bccd51b
batch-verify sync messages for a small perf boost (#3151)
* batch-verify sync messages for a small perf boost

Generally reuses the same structure as attestation and aggregate
verification

* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
2021-12-09 14:56:54 +02:00
tersec 2ca28fb861
Merge BeaconBlock gossip validation (#3165)
* Merge BeaconBlock gossip validation

* figure/ground inversion

* revert cosmetic cleanups to reduce merge conflicts
2021-12-08 17:29:22 +00:00
Jacek Sieka 1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
tersec 4378f3f096
almost all remaining ethereum/{eth2.0-specs -> consensus-specs} (#3158) 2021-12-03 20:01:13 +00:00
Eugene Kabanov e62c7c7c37
Remote signing client/server. (#3077) 2021-11-30 03:20:21 +02:00
Zahary Karadjov 7902e7684c Sync with Eth1 even when there are no validators attached 2021-11-27 18:43:01 +02:00
Zahary Karadjov ef1de66316 Add polling support in the Eth1Monitor (extracted from the merge branch) 2021-11-27 18:43:01 +02:00
Jacek Sieka a223d62b07
Cleanups (#3123)
Renames and cleanups split out from the validator monitoring branch, so
as to reduce conflict area vs other PR:s

* add constants for expected message timing
* name validators after the messages they validate, mostly, to make
grepping easier
* unify field naming of EpochInfo across forks to make cross-fork code
easier
2021-11-25 13:20:36 +01:00
tersec cc0dbd5bc0
implement terminal-total-difficulty-override; keep kintsugi TTDs consistent (#3118)
* implement terminal-total-difficulty-override; keep TTD consistent for m2 scripts/docs

* use Option[uint64] instead of uint64
2021-11-25 10:53:31 +00:00
Zahary Karadjov 88c623e250 Add support for HTTPS Web3 providers 2021-11-23 15:56:18 +02:00
Jacek Sieka f19a497eec
ncli_db: add putState, putBlock (#3096)
* ncli_db: add putState, putBlock

These tools allow modifying an existing nimbus database for the purpose
of recovery or reorg, moving the head, tail and genesis to arbitrary
points.

* remove potentially expensive `putState` in `BeaconStateDB`
* introduce `latest_block_root` which computes the root of the latest
applied block from the `latest_block_header` field (instead of passing
it in separately)
* avoid some unnecessary BeaconState copies during init
* discover https://github.com/nim-lang/Nim/issues/19094
* prefer `HashedBeaconState` in a few places to avoid recomputing state
root
* fetch latest block root from state when creating blocks
* harden `get_beacon_proposer_index` against invalid slots and document
* move random spec function tests to `test_spec.nim`
* avoid unnecessary state root computation before block proposal
2021-11-18 13:02:43 +01:00
tersec 6f8eff8f13
merge topic sub/unsub infrastructure (#3099) 2021-11-14 09:00:25 +01:00
tersec 00d066f7dd
add merge gossip validation, except for beaconblocks (#3095) 2021-11-13 22:26:02 +01:00
Jacek Sieka ec650c7fd7
Support starting from altair (#3054)
* Support starting from altair

* hide `finalized-checkpoint-` - they are incomplete and usage may cause
crashes
* remove genesis detection code (broken, obsolete)
* enable starting ChainDAG from altair checkpoints - this is a
prerequisite for checkpoint sync (TODO: backfill)
* tighten checkpoint state conditions
* show error when starting from checkpoint with existing database (not
supported)
* print rest-compatible JSON in ncli/state_sim
* altair/merge support in ncli
* more altair/merge support in ncli_db
* pre-load header to speed up loading
* fix forked block decoding
2021-11-10 13:39:08 +02:00
Zahary Karadjov 200902ccb3 Don't start the Eth1Monitor when there are no validators attached
This is a temporary revert of the new logic introduced in
d6cd1cd46c, because it leads
to a memory leak in the Eth1Monitor caused by the lack of
pruning of the Eth1Chain when the validator duties are not
executed.
2021-11-09 12:23:11 +02:00
Jacek Sieka ea0a191723
Better REST/RPC error messages (#3046)
* Better REST/RPC error messages
* homogenise block logging (root first)
* homegenise message verification pipeline (verify in
`gossip_verification`, act in `eth2_processor`)
* use `subcommitteeIdx` consistently
* log each sent contribution
* fix block_sim
* fix block topic
* don't recalc root on gossip block validation
* move position loop into sync pool
2021-11-05 17:39:47 +02:00
Jacek Sieka a086cf01ac
altair fork handling cleanups (#3050)
* fix stack overflow crash in REST/debug/getStateV2
* introduce `ForkyXxx` for generic type matching of `Xxx` across
branches (SomeHashedBeaconState -> ForkyHashedBeaconState et al) -
`Some` is already used for other types of type classes
* consolidate function naming in BeaconChainDB, use some generics
* import `forks.nim` from other spec modules and move `Forked*` helpers
around to resolve circular imports
* remove `ForkedBeaconState`, use `ForkedHashedBeaconState` throughout
(less data shuffling between the types)
* fix several cases of states being stored on stack in tests, causing
random failures on some platforms
* remove reading json support from ncli - this should be ported to the
rest json reading instead (doesn't currently work because stack sizes)
2021-11-05 08:34:34 +01:00
Jacek Sieka 233d756518
Logging and startup improvements (#3038)
* Logging and startup improvements

Color support for released binaries!

* startup scripts no longer log to file by default - this only affects
source builds - released binaries don't support file logging
* add --log-stdout option to control logging to stdout (colors, json)
* detect tty:s vs redirected logs and log accordingly
* add option to disable log colors at runtime
* simplify several "common" logs, showing the most important information
earlier and more clearly
* remove line numbers / file information / tid - these take up space and
are of little use to end users
  * still enabled in debug builds and tools
* remove `testnet_servers_image` compile-time option
* server images, released binaries and compile-from-source now offer
the same behaviour and features
* fixes https://github.com/status-im/nimbus-eth2/issues/2326
* fixes https://github.com/status-im/nimbus-eth2/issues/1794
* remove instanteneous block speed from sync message, keeping only
average

before:

```
INF 2021-10-28 16:45:59.000+02:00 Slot start                                 topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:884 lastSlot=2384027 wallSlot=2384028 delay=461us84ns peers=0 head=75a10ee5:3348 headEpoch=104 finalized=cd6804ba:3264 finalizedEpoch=102 sync="wwwwwwwwww:0:0.0000:0.0000:00h00m (3348)"
INF 2021-10-28 16:45:59.046+02:00 Slot end                                   topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:821 slot=2384028 nextSlot=2384029 head=75a10ee5:3348 headEpoch=104 finalizedHead=cd6804ba:3264 finalizedEpoch=102 nextAttestationSlot=-1 nextProposalSlot=-1 nextActionWait=n/a
```

after:

```
INF 2021-10-28 22:43:23.033+02:00 Slot start                                 topics="beacnde" slot=2385815 epoch=74556 sync="DDPDDPUDDD:10:5.2258:01h19m (2361088)" peers=37 head=eacd2dae:2361096 finalized=73782:a4751487 delay=33ms687us715ns
INF 2021-10-28 22:43:23.291+02:00 Slot end                                   topics="beacnde" slot=2385815 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=eacd2dae:2361096
```

* fix comment

* documentation updates

* mention `--log-file` may be deprecated in the future
* update various docs
2021-11-02 18:06:36 +01:00
Jacek Sieka d6cd1cd46c
Remove web3-mode, always keep web3 monitor enabled when given (#3042)
* Init some components fully before BeaconNode, per component dependency
graph
* remove `--web3-mode` option
* fixes https://github.com/status-im/nimbus-eth2/issues/2685
* reshuffle some beacon node init code
2021-11-01 15:50:24 +01:00
Ștefan Talpalaru 99113a4b62
new metric: network_name (#3018) 2021-10-21 17:13:35 +02:00
Jacek Sieka 421bf936ff
odds and ends (#3015)
* `allSyncCommittees` => `allSyncSubcommittees`
* simplify `_snappy` topic generation (avoid pointless string copies)
* simplify gossip id generator (avoid pointless string copies)
* avoid redundant syncnet ENR updates
* simplify topic validation (allow only validated topics)
2021-10-21 15:09:19 +02:00
Jacek Sieka df3fc9525f
import cleanup (#2997)
* import cleanup

...and remove some unused types

* add random imports

* more imports
2021-10-19 16:09:26 +02:00
Jacek Sieka 4f7a8cf79d
register vc duties with subnet tracker (#2949)
* register vc duties with subnet tracker
* fix activation logging during startup
* cache slot signature to avoid duplicate signature work
* schedule aggregation duties one slot at a time to avoid CPU spike at
each epoch
* lower aggregation subnet pre-subscription time to 4 slots (lowers
bandwidth and CPU usage)
* update stability subnets in ENR on startup
* log gossip state
* perform gossip subscriptions just before the next slot starts
* document stuff
* add random include
* don't overwrite subscription state when not subscribed
* log target gossip state
* updating gossip status once is enough
* add test
* remove syncQueueLen - this one is not updated at the end of the sync
and may cause gossip to disconnect itself completely - use a simple head
distance instead
* fix gossip disconnection - if in hysteresis, node.gossipState will be
set to disabled even though we don't disable topic subscriptions
* fix extra duty registration call
2021-10-18 11:11:44 +02:00
Eugene Kabanov 75a8fe3b2c
Increase REST server default limits. (#2991)
* Increase REST server default limits.

* Fix compilation error.

* Fix tests.
2021-10-18 09:14:44 +02:00
Eugene Kabanov 7319f150e8
Number of REST fixes. (#2987)
* Initial commit.

* Fix path.

* Add validator keys to indices cache mechanism.
Move syncComitteeParticipants to common place.

* Fix sync participants order issue.

* Fix error code when state could not be found.
Refactor `state/validators` to use keysToIndices mechanism.

* Fix RestValidatorIndex to ValidatorIndex conversion TODOs.

* Address review comments.

* Fix REST test rules.
2021-10-14 12:38:38 +02:00
Eugene Kabanov 65257b82f8
Validator key management API (#2755)
Implements https://github.com/ethereum/beacon-APIs/pull/151
2021-10-04 22:08:31 +03:00
Tanguy 06df40454f
Improve discovery during fork (#2913) 2021-09-29 14:06:16 +03:00
Etan Kissling c510ffb16a
use `allSyncCommittees` everywhere (#2917)
There are still a few cases with manual loops through sync subcommittees
even though an `allSyncCommittees` iterator exists. Adjusted the few
remaining instances to also use the iterator instead.
2021-09-28 18:02:01 +00:00
Jacek Sieka e47a8cbe42
fixes (#2901)
* export kvstore from beacon_chain_db
* fix rest HashList deserialization
* fix asTrusted
2021-09-27 11:24:58 +02:00
tersec 6f391691d9
disable sync committee subnets in simulation mode (#2897)
* disable sync committee subnets in simulation mode

* also gate getSyncCommitteeContributionAndProofTopic()
2021-09-24 14:43:53 +00:00
tersec bef98f6f50
sync committees don't exist in phase 0 (#2894) 2021-09-24 07:39:35 +00:00
Zahary Karadjov 9b41bb64da
Register the Altair topics in the valid topics list 2021-09-22 16:28:48 +03:00
Eugene Kabanov b566d4657f
REST /eth/v1/events API call implementation. (#2878)
* Placing callbacks into strategic places.

* Initial events call implementation.

* Post rebase fixes.

* Change addSyncContribution() implementation.

* Add `attestation-sent` event.
Remove gcsafe, raises from callbacks implementations.
Move `attestation-received` fire at the end of attestation processing.

* Address review comments.
2021-09-22 14:17:15 +02:00
Mamy Ratsimbazafy d1cb5b7220
Parallel attestation verification (#2718)
* Add parallel attestation verification

* Update tests, batchVerify doesn't use the threadpool with only single core (nim-blscurve update)

* bump nim-blscurve

* Debug info for failing eth2 test vectors

* remove submodule eth2-testnets

* verbose debugging of make failure on Windows (libbacktrace?)

* Remove CI debug mode

* initialization convention

* Fix new altair tests
2021-09-17 03:13:52 +03:00
Zahary Karadjov 7d1efa443d Restore the sync committee pool pruning and add tests 2021-08-30 11:06:45 +03:00
tersec 3efcdb0de5
subscribe to/unsubscribe from sync committee subnets (#2832) 2021-08-29 05:58:27 +00:00
tersec 0418fbada2
introduce SyncCommitteeMsgPool to eth2_processor and nimbus_beacon_node (#2831) 2021-08-28 22:27:51 +00:00
Jacek Sieka 01596c45dd
cleanups and fixes (#2827)
* import cleanup
* fix json-rpc exception handlers
* avoid unnecessary presto client import
* introduce ForkedBeaconBlock, some altair logging
* url fixes
2021-08-27 11:00:06 +02:00
Jacek Sieka ba06f13942
cleanups (#2809)
* cleanups

* use ForkedTrustedSignedBeaconBlock.ionit where appropriate
* move `is_aggregator` to `spec/`
* use `errReject` in a few more places
* update enr fork id when time is auspicious
* use network broadcast functions

* Return Ignore for aggregate signature validation timeouts

...consistently between aggregates and attestations.

* clean up some more reject/ignore rules
* shorten texts a bit

* errReject->checkedReject, use err helpers throughout

* get rid of quarantine in exitpool as well
2021-08-24 21:49:51 +02:00
tersec 4678a2bee7
move BeaconClock from ChainDAG to BeaconNode (#2796) 2021-08-20 08:58:15 +00:00
tersec 317b6de4e6
send attestations and exit messages on fork-appropriate topic (#2773)
* send attestations and exit messages on fork-appropriate topic

* document why use wall clock over attestation slot

* centralize some fork-topic-picking-logic in eth2_network

* pick up new test in summary

* allow specified GetTimeFn for testing purposes

* add GenesisTime and use it in eth2_network

* replace GetTimeFn and GenesisTime with GetBeaconTimeFn
2021-08-19 10:45:31 +00:00
Jacek Sieka a7a65bce42
disentangle eth2 types from the ssz library (#2785)
* reorganize ssz dependencies

This PR continues the work in
https://github.com/status-im/nimbus-eth2/pull/2646,
https://github.com/status-im/nimbus-eth2/pull/2779 as well as past
issues with serialization and type, to disentangle SSZ from eth2 and at
the same time simplify imports and exports with a structured approach.

The principal idea here is that when a library wants to introduce SSZ
support, they do so via 3 files:

* `ssz_codecs` which imports and reexports `codecs` - this covers the
basic byte conversions and ensures no overloads get lost
* `xxx_merkleization` imports and exports `merkleization` to specialize
and get access to `hash_tree_root` and friends
* `xxx_ssz_serialization` imports and exports `ssz_serialization` to
specialize ssz for a specific library

Those that need to interact with SSZ always import the `xxx_` versions
of the modules and never `ssz` itself so as to keep imports simple and
safe.

This is similar to how the REST / JSON-RPC serializers are structured in
that someone wanting to serialize spec types to REST-JSON will import
`eth2_rest_serialization` and nothing else.

* split up ssz into a core library that is independendent of eth2 types
* rename `bytes_reader` to `codec` to highlight that it contains coding
and decoding of bytes and native ssz types
* remove tricky List init overload that causes compile issues
* get rid of top-level ssz import
* reenable merkleization tests
* move some "standard" json serializers to spec
* remove `ValidatorIndex` serialization for now
* remove test_ssz_merkleization
* add tests for over/underlong byte sequences
* fix broken seq[byte] test - seq[byte] is not an SSZ type

There are a few things this PR doesn't solve:

* like #2646 this PR is weak on how to handle root and other
dontSerialize fields that "sometimes" should be computed - the same
problem appears in REST / JSON-RPC etc

* Fix a build problem on macOS

* Another way to fix the macOS builds

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2021-08-18 20:57:58 +02:00
Jacek Sieka 70259e4e64
treat gossip decoding errors more strictly (#2793)
* penalize peers for sending gossip messages that fail decoding
* add metrics for decoding/decompression errors
* clean up obsolete exception handlers
2021-08-18 14:30:05 +02:00
Jacek Sieka 63717531dc remove remaining traces of nim-prompt
upstream dead, as is feature in eth2
2021-08-16 21:56:50 +03:00
Jacek Sieka 7a622e8505
rework spec imports (#2779)
The spec imports are a mess to work with, so this branch cleans them up
a bit to ensure that we avoid generic sandwitches and that importing
stuff generally becomes easier.

* reexport crypto/digest/presets because these are part of the public
symbol set of the rest of the spec types
* don't export `merge` types from `base` - this causes circular deps
* fix circular deps in `ssz/spec_types` - this is the first step in
disentangling ssz from spec
* be explicit about phase0 vs altair - longer term, `altair` will become
the "natural" type set, then merge and so on, so no point in giving
`phase0` special preferential treatment
2021-08-12 13:08:20 +00:00
Jacek Sieka 9697b73e71
forkedbeaconstate_helpers -> forks (#2772)
Simpler module name for stuff that covers forks

* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
2021-08-10 22:46:35 +02:00
tersec 2eb6246b90
altair enr switching (#2770)
* altair enr switching

* altair transitions only happen on epoch boundaries

* determine ENR period by wall time, not chain head
2021-08-10 08:19:13 +02:00
tersec 2afe2802b6
altair topic switching (#2767)
* altair topic switching

* remove validate{Committee,Validator}IndexOr unused within branch
2021-08-09 12:54:45 +00:00
Jacek Sieka 3d7bee8502
REST API client, JSON-RPC cleanups (#2756)
This refactoring puts the JSON-RPC and REST APIs on more equal footing
by renaming and moving things around, creating a separation between
client and server, and documenting what they are - the aim is to have a
simple-to-use base to start from when developing API clients, as well as
make it easier to navigate the code when looking for the legacy JSON-RPC
interface vs the new REST API.

* move REST client, serialization and supporting types to spec/eth2_apis
* REST stuff now starts with `rest_`, JSON-RPC stuff starts with `rpc_`,
more or less
* simplify imports such that there's a simple module to import for both
server and client
* map REST type and proc names to yaml spec more closely - in
particular, reuse operation and type names in `rest_types` to make
comparisons against spec more easy
* cleaner separation between client and server modules - modules common
between server and client such as `rest_types` and serialization move to
the spec folder - this allows the client to be built with less knowledge
about server internals
2021-08-03 17:17:11 +02:00
Kim De Mey 06b2a1642d
Remove incorrect enode import (#2748) 2021-07-29 20:53:58 +02:00
tersec 9d9c37c561
some whole-file copies from altair branch (#2728)
* some whole-file copies from altair branch

* rpc/node_api and rpc/node_rest_api also need to be copied

* remove new sync committee-related functionality
2021-07-19 11:58:30 +00:00
Jacek Sieka 2d6a661ac6
Syncv2 (#2723)
* bump libp2p

* altair sync v2

Use V2 sync requests after the altair fork has happened, according to
the wall clock

* Fix the behavior of the v1 req/resp calls after Altair

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2021-07-15 21:01:07 +02:00
tersec e4afc36d71
use ForkedTrustedSignedBeaconBlock (#2720)
* use ForkedTrustedSignedBeaconBlock

* remove --subscribe-all-subnets

* https://ethereum.github.io/eth2.0-APIs/#/Beacon/getBlock implementation was passing through forked beaconblocks
2021-07-14 12:18:52 +00:00
Jacek Sieka 3f9c1fdf4e
More RuntimeConfig cleanup (#2716)
* remove from BeaconChainDB (doesn't depend on runtime config)
* eth2-testnets -> eth2-networks
* use `cfg` name throughout
2021-07-13 16:27:10 +02:00
Eugene Kabanov 3b6f4fab4a
New validator client using REST API. (#2651)
* Initial commit.

* Exporting getConfig().

* Add beacon node checking procedures.

* Post rebase fixes.

* Use runSlotLoop() from nimbus_beacon_node.
Fallback implementation.
Fixes for ETH2 REST serialization.

* Add beacon_clock.durationToNextSlot().
Move type declarations from beacon_rest_api to json_rest_serialization.
Fix seq[ValidatorIndex] serialization.
Refactor ValidatorPool and add some utility procedures.
Create separate version of validator_client.

* Post-rebase fixes.
Remove CookedPubKey from validator_pool.nim.

* Now we should be able to produce attestations and aggregate and proofs.
But its not working yet.

* Debugging attestation sending.

* Add durationToNextAttestation.
Optimize some debug logs.
Fix aggregation_bits encoding.
Bump chronos/presto.

* Its alive.

* Fixes for launch_local_testnet script.
Bump chronos.

* Switch client API to not use `/api` prefix.

* Post-rebase adjustments.

* Fix endpoint for publishBlock().

* Add CONFIG_NAME.
Add more checks to ensure that beacon_node is compatible.

* Add beacon committee subscription support to validator_client.

* Fix stacktrace should be an array of strings.
Fix committee subscriptions should not be `data` keyed.

* Log duration to next block proposal.

* Fix beacon_node_status import.

* Use jsonMsgResponse() instead of jsonError().

* Fix graffityBytes usage.
Remove unnecessary `await`.
Adjust creation of SignedBlock instance.
Remove legacy files.

* Rework durationToNextSlot() and durationToNextEpoch() to use `fromNow`.

* Fix race condition for block proposal and attestations for same slot.
Fix local_testnet script to properly kill tasks on Windows.
Bump chronos and nim-http-tools, to allow connections to infura.io (basic auth).

* Catch services errors.
Improve performance of local_testnet.sh script on Windows.
Fix race condition when attestation producing.

* Post-rebase fixes.

* Bump chronos and presto.

* Calculate block publishing delay.
Fix pkill in one more place.

* Add error handling and timeouts to firstSuccess() template.
Add onceToAll() template.
Add checkNodes() procedure.
Refactor firstSuccess() template.
Add error checking to api.nim calls.

* Deprecated usage onceToAll() for better stability.
Address comment and send attestations asap.

* Avoid unnecessary loop when calculating minimal duration.
2021-07-13 13:15:07 +02:00
Jacek Sieka 23eea197f6
Implement split preset/config support (#2710)
* Implement split preset/config support

This is the initial bulk refactor to introduce runtime config values in
a number of places, somewhat replacing the existing mechanism of loading
network metadata.

It still needs more work, this is the initial refactor that introduces
runtime configuration in some of the places that need it.

The PR changes the way presets and constants work, to match the spec. In
particular, a "preset" now refers to the compile-time configuration
while a "cfg" or "RuntimeConfig" is the dynamic part.

A single binary can support either mainnet or minimal, but not both.
Support for other presets has been removed completely (can be readded,
in case there's need).

There's a number of outstanding tasks:

* `SECONDS_PER_SLOT` still needs fixing
* loading custom runtime configs needs redoing
* checking constants against YAML file

* yeerongpilly support

`build/nimbus_beacon_node --network=yeerongpilly --discv5:no --log-level=DEBUG`

* load fork epoch from config

* fix fork digest sent in status
* nicer error string for request failures
* fix tools

* one more

* fixup

* fixup

* fixup

* use "standard" network definition folder in local testnet

Files are loaded from their standard locations, including genesis etc,
to conform to the format used in the `eth2-networks` repo.

* fix launch scripts, allow unknown config values

* fix base config of rest test

* cleanups

* bundle mainnet config using common loader
* fix spec links and names
* only include supported preset in binary

* drop yeerongpilly, add altair-devnet-0, support boot_enr.yaml
2021-07-12 15:01:38 +02:00
Ștefan Talpalaru 0aef63948f
add version metric (using labels) (#2711) 2021-07-09 07:41:44 +02:00
zah eb2dc5cbbb
Implement the new Altair req/resp protocols (#2676)
* Implement the new Altair req/resp protocols

Also fixes the altair message-id computation by providing the correct
forkdigest prefix in `isAltairTopic`.

Co-authored-by: Tanguy Cizain <tanguycizain@gmail.com>
2021-07-07 12:09:47 +03:00
tersec 146fa48454
use ForkedHashedBeaconState in StateData (#2634)
* use ForkedHashedBeaconState in StateData

* fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign()

* avoid stack allocation in maybeUpgradeStateToAltair()

* create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit()

* use getStateRoot() instead of various state.data.hbsPhase0.root

* remove withStateVars.hashedState(), which doesn't work as a design anymore

* introduce spec/datatypes/altair into beacon_chain_db

* fix inefficient codegen for getStateField(largeStateField)

* state_transition_slots() doesn't either need/use blocks or runtime presets

* combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization

* getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...)

* fix rollback

* switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState

* remove state_transition(phase0.HashedBeaconState)

* remove process_slots(phase0.HashedBeaconState)

* remove state_transition_block(phase0.HashedBeaconState)

* remove unused callWithBS(); separate case expression from if statement

* switch back from nested-ref-object construction to (ref Foo)(Bar())
2021-06-11 20:51:46 +03:00
Eugene Kabanov 84f3b2dd09
Increase REST service timeout setting to 2xSECONDS_PER_SLOT. (#2627)
Fix `ansi_c` warning on Windows.
2021-06-03 11:43:04 +02:00
Jacek Sieka abe0d7b4ae singe validator key cache
Instead of keeping a validator key list per EpochRef, this PR introduces
a single shared validator key list in ChainDAG, and cleans up some other
ChainDAG and key-related issues.

The PR does not introduce the validator key list in the state transition
- this is because we batch-check all signatures before entering the spec
code, thus the spec code never hits the cache.

A future refactor should _probably_ remove the threadvar altogether.

There's a few other small fixes in here that make the flow easier to
read:

* fix `var ChainDAGRef` -> `ChainDAGRef`
* fix `var QuarantineRef` -> `QuarantineRef`
* consistent `dag` variable name
* avoid using threadvar pubkey cache in most cases
* better error messages in batch signature checking
2021-06-01 20:43:44 +03:00
Jacek Sieka df7bc87af5 Pre-compute slot transition for clearance state
This way we perform the expensive epoch processing before the block
arrives.

Of course, this may lead to speculative misses which in turn lead to
replays - it's likely that in the case of a miss, we'll see a replay
regardless.
2021-05-30 12:04:09 +03:00
Jacek Sieka 7f52ffb8d9
clean up block processing (#2610)
* gossip_to_consensus -> block_processor (it's processing only blocks,
but not only from gossip)
* measure queue and validation time for blocks
* measure assignment and state loading times for updateStateData
* avoid some unnecessary block copies in block sync
* warn that database is corrupt if we hit tail without a state
2021-05-28 19:34:00 +03:00
Jacek Sieka ab70f371e1
Revert "create new database in separate file (#2596)" (#2604)
This reverts commit eebc828778.

Adding a separate file turns out not to be enough. This PR reverts the
separate file change.

Another theory is that the large kvstore table causes cache thrashing -
all database connections share a common page cache which would explain
the poor performance of the separate file solution.
2021-05-27 12:59:42 +02:00
Jacek Sieka eebc828778
create new database in separate file (#2596)
The V1 table structure shows great improvements in performance, but if
there's an old `kvstore` without rowid:s, these benefits are nullified:
reorgs during writes and deletes remain expensive (even if the
degradation is reduced somewhat).

This PR creates the tables in a new file instead, and uses the old file
as a read-only store - this has several interesting properties:

* the old database is left completely untouched - this guarantees that
downgrades work smooth (they'll only need to resync their missing
portions)
* starting sync after this PR means only a v1 database is created
* v0 databases stick around - no migration is performed (for now)

Future PR:s can introduce migration of the data from one database to
another - a simply copy will take hours which is downtime we want to
avoid - at that point, it might make sense to migrate straight to era
files instead.
2021-05-26 09:07:18 +02:00
tersec 0b0bfd1de0
use StateData in place of BeaconState outside state transition code (#2551)
* use StateData in place of BeaconState outside state transition code

* propagate more StateData usage

* remove withStateVars().state

* wrap get_beacon_committee(BeaconState, ...) as gbc(StateData, ...)

* switch makeAttestation() to use StateData

* use StateData wrapper/dispatcher for get_committee_count_per_slot()

* convert AttestationCache.init(), weak subjectivity functions, and updateValidatorMetrics()

* add get_shuffled_active_validator_indices(StateData) and get_block_root_at_slot(StateData)

* switch makeAttestationData() to StateData

* sync AllTests-mainnet.md after rebase
2021-05-21 09:23:28 +00:00
Zahary Karadjov b7aa30adfd
Merge stable into unstable 2021-05-20 13:50:40 +03:00
Eugene Kabanov cf06c4e87e
Make REST server more compatible with Lighthouse and Teku validator clients. (#2575)
* Allow REST server to parse arrays with comma delimiter.

* Fix compilation issues because of new presto framework.
2021-05-18 12:24:57 +02:00
Jacek Sieka 97f4e1fffe
Db1 cont (#2573)
* Revert "Revert "Upgrade database schema" (#2570)"

This reverts commit 6057c2ffb4.

* ssz: fix loading empty lists into existing instances

Not a problem earlier because we didn't reuse instances

* bump nim-eth

* bump nim-web3
2021-05-17 18:37:26 +02:00
tersec 6057c2ffb4
Revert "Upgrade database schema" (#2570)
This reverts commit 22ddf74752.
2021-05-17 06:34:44 +00:00
Jacek Sieka 22ddf74752 Upgrade database schema
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.

With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.

* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
2021-05-14 20:05:23 +03:00
Jacek Sieka 0a477eb2d3
fix subnet logic (#2555)
This PR decreases the lead subscription time which should help
decrease bandwidth usage and CPU making the subscription for future
aggregation happen a bit later. There's room for more tuning here,
probably.

* fix missing negation from in #2550
* fix silly bitarray issues
* decrease subnet lead subscription time
* log all subnet switching source data
* rename subnet trackers to refer to stability and aggregate subnets
* more tests
2021-05-11 22:03:40 +02:00
Mamy Ratsimbazafy e6b559a35a
Slashing db pruning [Merge only after v2 has been default for 1 noticeable release] (#2452)
* Enable slashing DB pruning

* integrate slashing DB pruning with onSlotEnd

* rebase tests
2021-05-10 16:32:28 +02:00
Jacek Sieka 867d8f3223
Perform attestation check before broadcast (#2550)
Currently, we have a bit of a convoluted flow where when sending
attestations, we start broadcasting them over gossip then pass them to
the attestation validation to include them in the local attestation pool
- it should be the other way around: we should be checking attestations
_before_ gossipping them - this serves as an additional safety net to
ensure that we don't publish junk - this becomes more important when
publishing attestations from the API.

Also, the REST API was performing its own validation meaning
attestations coming from REST would be validated twice - finally, the
JSON RPC wasn't pre-validating and would happily broadcast invalid
attestations.

* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
2021-05-10 09:13:36 +02:00
Jacek Sieka 4d74c742da
move ENRForkID into `spec` (#2538)
* move ENRForkID into `spec`

also get rid of strformat in topic formation and fix some case
discrepancies

* also move `Eth2Metadata`
2021-05-04 17:28:48 +02:00
Jacek Sieka efdf759cc0
avoid some slashing protection queries (#2528)
This PR reduces the number of database queries for slashing protection
from 5 reads and 1 write to 2 reads and 1 write in the optimistic case.

In the process, it removes user-level support for writing the database
in the version 1 format in order to simplify the code flow, and prevent
code rot. In particular, the v1 format was not covered by any unit tests
and has no advantages over v2. The concrete code to read and write it
remains for now, in particular to support upgrades from v1 to v2.

The branch also removes the use of concepts which doesn't work with
checked exceptions - in particular, this highlights code that both
raises exceptions and returns error codes, which could be cleaned up in
the future.

* Cache internal validator ID
* Rely on unique index to check for trivial duplicate votes
* Combine two surround vote queries into one
* Combine API for checking and registering slashing into single function

The slashing DB is normally not a bottleneck, but may become one with
high attached validator counts.
2021-05-04 15:17:28 +02:00
tersec e0f4d28116
rename initialize_beacon_state to initialize_beacon_state_from_eth1 (#2536) 2021-05-04 12:19:11 +02:00
Dustin Brody 7f42d38219 rename initialize_beacon_state{_from_eth1,}; suppress warnings when doppelganger detection disabled 2021-04-28 00:12:41 +03:00