Commit Graph

44 Commits

Author SHA1 Message Date
Eugene Kabanov eea13ee5ed
VC: roles & strategies. (#4113)
* Initial commit.

* Roles changes.

* Fix all the compilation issues.

* Add beacon node roles.
Add loop for firstSuccessParallel().

* Remove unused variables.
2022-09-29 09:57:14 +02:00
zah b1ac9c9fe4
Fix a potential segfault and various potential stalls (#4003)
* Fixes a segfault during block production when the Keymanager API
  is disabled. The Keymanager is now disabled on half of the local
  testnet nodes to catch such problems in the future.

* Fixes multiple potential stalls from REST requests being done
  without a timeout. From practice, we know that such requests
  can hang forever if not cancelled with a timeout. At best,
  this would be a resource leak, at worst, it may lead to a
  full stall of the client and missed validator duties.

* Changes some Options usages to Opt (for easier use of valueOr)
2022-08-19 21:51:30 +00:00
zah fca20e08d6
Keymanager API for the validator client (#3976)
* Keymanager API for the validator client
* Properly treat the 'description' field as optional when loading Keystores
* Spec-compliant serialization of the slashing data in Keymanager's DeleteKeys response ()

Fixes #3940
Fixes #3964
Closes #3884 by adding test
2022-08-19 13:30:07 +03:00
Eugene Kabanov ce9e50e275
VC: metrics (#3915)
* Initial commit.
Enable MetricsHttpServerRef and configuration.

* Add metrics.

* Add headers.
Add compilation issue fixes.
2022-07-29 11:36:20 +03:00
Eugene Kabanov c3d3397843
VC: doppelganger protection (#3877)
* Improve fallback_service.

* Improve logging in fallback_service.

* Apply signal handling for all stages.

* Fix some logging statements.

* Add doppelganger REST api endpoint.
Add some structures to VC.

* Add client API call implementation.

* Initial fix & refactor onceToAll()
Add doppelganger service.
Add doppelganger helpers.

* Add doppelganger checks.

* Move doppelganger log messages to higher levels.

* Fix firstSuccess().

* Bump chronos.

* Post rebase fixes.

* Proper chronos bump.

* Address review comments.

* Attempt to fix finalization test issue.

* Fix nimbus_signing_node.

* Mark validators which are added at GENESIS_SLOT in GENESIS_EPOCH as passed doppelganger validation.

* Do not send empty requests to server.

* Fix log statement.

* Address review comments and re-raise cancellations.

Co-authored-by: zah <zahary@gmail.com>
2022-07-21 19:54:07 +03:00
Eugene Kabanov d4bafdf5a4
VC: cancellation hot-fixes. (#3875)
* Fix cancellation issues.
* Add exitEvent which will allow gracefully shutdown validator client.
* Fix firstSuccessTimeout() template.
* Fix service names.
* Modify waitOnlineNodes to include timeout parameter.
2022-07-15 00:11:25 +03:00
Eugene Kabanov 263a2ffa14
Validator client various fixes. (#3840)
* Improve fallback_service.
* Fix nextAction negative time issue.
* Improve logging in fallback_service.
* Improve logging in sync_committee_service.
* Prepare all services for cancellation.
* Signals handlers for validator client
* Address #3800

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2022-07-13 17:43:57 +03:00
Jacek Sieka 6a0433cb08
REST parameter defaults (#3689)
* add default BN for VC
* unify "default" beacon node across commands
* clean up validator exit docs
2022-06-01 10:47:52 +00:00
zah a2ba34f686
Implement all sync committee duties in the validator client (#3583)
Other changes:

* logtrace can now verify sync committee messages and contributions
* Many unnecessary use of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
2022-05-10 10:03:40 +00:00
zah 6d11ad6ce1
Support for distributed keystores with multiple remotes based on threshold signatures (#3616)
Other fixes:

* Fix bit rot in the `make prater-dev-deposit` target.
* Correct content-type in the responses of the Nimbus signing node
* Invalid JSON payload was being sent in the web3signer requests
2022-05-10 03:32:12 +03:00
Zahary Karadjov e6723ddb24
Allow running Nimbus as a Windows service (--run-as-service) 2022-03-05 15:53:47 +02:00
Eugene Kabanov 3a80b9951c
VC: Fix forks handling. (#3389)
* Trying to debug the finalization issue.

* Add debug logs to understand signature issue.

* Remove all the debugging helpers.

* Initial commit.

* Address review comments.

* Remove unneeded checks for empty fork schedule.

* Fix bellatrix ExecutionAddress serialization/deserialization procedures.
2022-02-16 12:31:23 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
Zahary Karadjov 54d0d588b1 Implementation of the Keymanager API (BETA)
https://github.com/ethereum/keymanager-APIs
2022-01-04 18:51:45 +02:00
tersec 0d4e49f946
Merge fork gossip support (#3213)
* Merge fork gossip support

* index directly by BeaconStateFork and remove debugging log statement
2021-12-21 15:24:23 +01:00
Jacek Sieka 233d756518
Logging and startup improvements (#3038)
* Logging and startup improvements

Color support for released binaries!

* startup scripts no longer log to file by default - this only affects
source builds - released binaries don't support file logging
* add --log-stdout option to control logging to stdout (colors, json)
* detect tty:s vs redirected logs and log accordingly
* add option to disable log colors at runtime
* simplify several "common" logs, showing the most important information
earlier and more clearly
* remove line numbers / file information / tid - these take up space and
are of little use to end users
  * still enabled in debug builds and tools
* remove `testnet_servers_image` compile-time option
* server images, released binaries and compile-from-source now offer
the same behaviour and features
* fixes https://github.com/status-im/nimbus-eth2/issues/2326
* fixes https://github.com/status-im/nimbus-eth2/issues/1794
* remove instanteneous block speed from sync message, keeping only
average

before:

```
INF 2021-10-28 16:45:59.000+02:00 Slot start                                 topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:884 lastSlot=2384027 wallSlot=2384028 delay=461us84ns peers=0 head=75a10ee5:3348 headEpoch=104 finalized=cd6804ba:3264 finalizedEpoch=102 sync="wwwwwwwwww:0:0.0000:0.0000:00h00m (3348)"
INF 2021-10-28 16:45:59.046+02:00 Slot end                                   topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:821 slot=2384028 nextSlot=2384029 head=75a10ee5:3348 headEpoch=104 finalizedHead=cd6804ba:3264 finalizedEpoch=102 nextAttestationSlot=-1 nextProposalSlot=-1 nextActionWait=n/a
```

after:

```
INF 2021-10-28 22:43:23.033+02:00 Slot start                                 topics="beacnde" slot=2385815 epoch=74556 sync="DDPDDPUDDD:10:5.2258:01h19m (2361088)" peers=37 head=eacd2dae:2361096 finalized=73782:a4751487 delay=33ms687us715ns
INF 2021-10-28 22:43:23.291+02:00 Slot end                                   topics="beacnde" slot=2385815 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=eacd2dae:2361096
```

* fix comment

* documentation updates

* mention `--log-file` may be deprecated in the future
* update various docs
2021-11-02 18:06:36 +01:00
Eugene Kabanov 65257b82f8
Validator key management API (#2755)
Implements https://github.com/ethereum/beacon-APIs/pull/151
2021-10-04 22:08:31 +03:00
tersec 9c0d9b546a
successfull -> successful (#2842) 2021-09-01 18:08:24 +02:00
Jacek Sieka 3d7bee8502
REST API client, JSON-RPC cleanups (#2756)
This refactoring puts the JSON-RPC and REST APIs on more equal footing
by renaming and moving things around, creating a separation between
client and server, and documenting what they are - the aim is to have a
simple-to-use base to start from when developing API clients, as well as
make it easier to navigate the code when looking for the legacy JSON-RPC
interface vs the new REST API.

* move REST client, serialization and supporting types to spec/eth2_apis
* REST stuff now starts with `rest_`, JSON-RPC stuff starts with `rpc_`,
more or less
* simplify imports such that there's a simple module to import for both
server and client
* map REST type and proc names to yaml spec more closely - in
particular, reuse operation and type names in `rest_types` to make
comparisons against spec more easy
* cleaner separation between client and server modules - modules common
between server and client such as `rest_types` and serialization move to
the spec folder - this allows the client to be built with less knowledge
about server internals
2021-08-03 17:17:11 +02:00
Eugene Kabanov f0c30e31b4
VC: various fixes (#2730)
* Fix firstSuccess() template missing timeouts.

* Fix validator race condition.
Fix logs to be compatible with beacon_node logs.
Add CatchableError handlers to avoid crashes.
Move some logs from Notice to Debug level.
Fix some [unused] warnings.

* Fix block proposal issue for slots in the past and from the future.

* Change sent to published.

* Address review comments #1.
2021-07-19 14:31:02 +00:00
Eugene Kabanov 3b6f4fab4a
New validator client using REST API. (#2651)
* Initial commit.

* Exporting getConfig().

* Add beacon node checking procedures.

* Post rebase fixes.

* Use runSlotLoop() from nimbus_beacon_node.
Fallback implementation.
Fixes for ETH2 REST serialization.

* Add beacon_clock.durationToNextSlot().
Move type declarations from beacon_rest_api to json_rest_serialization.
Fix seq[ValidatorIndex] serialization.
Refactor ValidatorPool and add some utility procedures.
Create separate version of validator_client.

* Post-rebase fixes.
Remove CookedPubKey from validator_pool.nim.

* Now we should be able to produce attestations and aggregate and proofs.
But its not working yet.

* Debugging attestation sending.

* Add durationToNextAttestation.
Optimize some debug logs.
Fix aggregation_bits encoding.
Bump chronos/presto.

* Its alive.

* Fixes for launch_local_testnet script.
Bump chronos.

* Switch client API to not use `/api` prefix.

* Post-rebase adjustments.

* Fix endpoint for publishBlock().

* Add CONFIG_NAME.
Add more checks to ensure that beacon_node is compatible.

* Add beacon committee subscription support to validator_client.

* Fix stacktrace should be an array of strings.
Fix committee subscriptions should not be `data` keyed.

* Log duration to next block proposal.

* Fix beacon_node_status import.

* Use jsonMsgResponse() instead of jsonError().

* Fix graffityBytes usage.
Remove unnecessary `await`.
Adjust creation of SignedBlock instance.
Remove legacy files.

* Rework durationToNextSlot() and durationToNextEpoch() to use `fromNow`.

* Fix race condition for block proposal and attestations for same slot.
Fix local_testnet script to properly kill tasks on Windows.
Bump chronos and nim-http-tools, to allow connections to infura.io (basic auth).

* Catch services errors.
Improve performance of local_testnet.sh script on Windows.
Fix race condition when attestation producing.

* Post-rebase fixes.

* Bump chronos and presto.

* Calculate block publishing delay.
Fix pkill in one more place.

* Add error handling and timeouts to firstSuccess() template.
Add onceToAll() template.
Add checkNodes() procedure.
Refactor firstSuccess() template.
Add error checking to api.nim calls.

* Deprecated usage onceToAll() for better stability.
Address comment and send attestations asap.

* Avoid unnecessary loop when calculating minimal duration.
2021-07-13 13:15:07 +02:00
tersec 7577f8c2ef
add blockchain_dag altair database reading; add rollback tests (#2683)
* add blockchain_dag altair database reading; add rollback tests; fix some unnecessary type conversions

* remove debugging scaffolding

* proposeSignedBlock() will need to be async for merge; introduce altair types to VC
2021-06-29 15:09:29 +00:00
Jacek Sieka 867d8f3223
Perform attestation check before broadcast (#2550)
Currently, we have a bit of a convoluted flow where when sending
attestations, we start broadcasting them over gossip then pass them to
the attestation validation to include them in the local attestation pool
- it should be the other way around: we should be checking attestations
_before_ gossipping them - this serves as an additional safety net to
ensure that we don't publish junk - this becomes more important when
publishing attestations from the API.

Also, the REST API was performing its own validation meaning
attestations coming from REST would be validated twice - finally, the
JSON RPC wasn't pre-validating and would happily broadcast invalid
attestations.

* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
2021-05-10 09:13:36 +02:00
Jacek Sieka efdf759cc0
avoid some slashing protection queries (#2528)
This PR reduces the number of database queries for slashing protection
from 5 reads and 1 write to 2 reads and 1 write in the optimistic case.

In the process, it removes user-level support for writing the database
in the version 1 format in order to simplify the code flow, and prevent
code rot. In particular, the v1 format was not covered by any unit tests
and has no advantages over v2. The concrete code to read and write it
remains for now, in particular to support upgrades from v1 to v2.

The branch also removes the use of concepts which doesn't work with
checked exceptions - in particular, this highlights code that both
raises exceptions and returns error codes, which could be cleaned up in
the future.

* Cache internal validator ID
* Rely on unique index to check for trivial duplicate votes
* Combine two surround vote queries into one
* Combine API for checking and registering slashing into single function

The slashing DB is normally not a bottleneck, but may become one with
high attached validator counts.
2021-05-04 15:17:28 +02:00
Jacek Sieka 74732a23fe
json cleanups (#2456)
* move json-rpc specific marshalling to rpc
* serialize Epoch/Slot with cast to avoid Defect
* avoid a few eth1 deps
* simplify imports
2021-03-26 15:11:06 +01:00
Jacek Sieka 2695cfa864
EH cleanup (#2455)
almost 100% raises in nimbus-eth2 now!

* fix some rare exception-related crashes in json-rpc
2021-03-26 07:52:01 +01:00
tersec 3076f5c3b6
rm std/random from beacon_chain and rm attestation timing randomness (#2442)
* remove added attestation timing randomness

* remove os/random from rest of beacon_chain, primarily deposit_contract

* remove scaffolding

* randomize std/random seed in beacon node and validator client

* use CSPRNG to more securely seed std/random
2021-03-23 06:57:10 +00:00
Mamy Ratsimbazafy de1060e7f3
centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383) 2021-03-06 08:32:55 +01:00
Mamy Ratsimbazafy d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
tersec 4278e80657
document two uint64 -> int64 conversions (#2375)
* document two uint64 -> int64 conversions

* fix minimal preset slot time & calculation
2021-03-04 10:13:23 +01:00
Mamy Ratsimbazafy 3276dfc683
Consolidate modules by areas [part 1] (#2365)
* Move sync in subfolder

* move validator related thingies in validators

* fix binary builds

* update bounds comment [skip ci]
2021-03-02 11:27:45 +01:00
tersec 97f7284e51
bump spec refs from v1.0.0 to v1.0.1 and update copyright years (#2357) 2021-02-25 13:37:22 +00:00
tersec 5cab17dc1a
database state storage benchmarking via ncli_db (#2312)
* database state storage benchmarking via ncli_db

* more cleanups from immutable validator state branch

* unexport some eth2_network constants and remove unused variables/templates

* make two PeerScore constants public
2021-02-15 17:40:00 +01:00
Mamy Ratsimbazafy 03f47c8f2f
Slashing protection refactor - EIP 3076 (#2094)
* Create CLI tool for slashing export

* Use SQLite as a DB instead of a KV-store

* Keeps v1 and v2 DBs around

* Uses the same schema as Lighthouse v1.1.0

* Passes all interchange tests + skeleton of finalization pruning

* Removes tests that would violate v5 / minimal slashing DB and MinSlot rules

* Migration tool added using low-watermark scheme for faster migration of large number of validators
2021-02-09 17:23:06 +02:00
Jacek Sieka 5d3c995d42 json-rpc fixes
* expose node signatures
* format bitseqs as hex strings
* format trusted sigs as hex strings (same as untrusted)
* reuse rpc client sigs
* include validator index in duties
* move SyncInfo to spec
2021-02-02 14:55:36 +02:00
Jacek Sieka 5713a3ce4c
performance fixes (#2259)
* performance fixes

* don't mark tree cache as dirty on read-only List accesses
* store only blob in memory for keys and signatures, parse blob lazily
* compare public keys by blob instead of parsing / converting to raw
* compare Eth2Digest using non-constant-time comparison
* avoid some unnecessary validator copying

This branch will in particular speed up deposit processing which has
been slowing down block replay.

Pre (mainnet, 1600 blocks):

```
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    3450.269,        0.000,     3450.269,     3450.269,            1, Initialize DB
       0.417,        0.822,        0.036,       21.098,         1400, Load block from database
      16.521,        0.000,       16.521,       16.521,            1, Load state from database
      27.906,       50.846,        8.104,     1507.633,         1350, Apply block
      52.617,       37.029,       20.640,      135.938,           50, Apply epoch block
```

Post:

```
    3502.715,        0.000,     3502.715,     3502.715,            1, Initialize DB
       0.080,        0.560,        0.035,       21.015,         1400, Load block from database
      17.595,        0.000,       17.595,       17.595,            1, Load state from database
      15.706,       11.028,        8.300,      107.537,         1350, Apply block
      33.217,       12.622,       17.331,       60.580,           50, Apply epoch block
```

* more perf fixes

* load EpochRef cache into StateCache more aggressively
* point out security concern with public key cache
* reuse proposer index from state when processing block
* avoid genericAssign in a few more places
* don't parse key when signature is unparseable
* fix `==` overload for Eth2Digest
* preallocate validator list when getting active validators
* speed up proposer index calculation a little bit
* reuse cache when replaying blocks in ncli_db
* avoid a few more copying loops

```
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    3279.158,        0.000,     3279.158,     3279.158,            1, Initialize DB
       0.072,        0.357,        0.035,       13.400,         1400, Load block from database
      17.295,        0.000,       17.295,       17.295,            1, Load state from database
       5.918,        9.896,        0.198,       98.028,         1350, Apply block
      15.888,       10.951,        7.902,       39.535,           50, Apply epoch block
       0.000,        0.000,        0.000,        0.000,            0, Database block store
```

* clear full balance cache before processing rewards and penalties

```
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    3947.901,        0.000,     3947.901,     3947.901,            1, Initialize DB
       0.124,        0.506,        0.026,      202.370,       363345, Load block from database
      97.614,        0.000,       97.614,       97.614,            1, Load state from database
       0.186,        0.188,        0.012,       99.561,       357262, Advance slot, non-epoch
      14.161,        5.966,        1.099,      395.511,        11524, Advance slot, epoch
       1.372,        4.170,        0.017,      276.401,       363345, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database block store
```
2021-01-25 13:04:18 +01:00
tersec 97c4f7c5c0
fix subnet calculation in RPC and insert broadcast attestations into node's pool (#2207)
* fix subnet calculation in RPC and insert broadcast attestations into node's pool

* unify codepaths to ensure only mostly-checked-to-be-valid attestations enter the pool, even from node's own broadcasts

* update attestation pool tests for new validateAttestation param
2020-12-23 13:59:04 +01:00
tersec afbaa36ef7
make subnet cycling more robust; use one stability subnet/validator; explicitly represent gossip enabled/disabled (#2201)
* make subnet cycling more robust; use one stability subnet/validator; explicitly represent gossip enabled/disabled

* fix asymmetry in _snappy being used for subscriptions but not unsubscriptions

* remove redundant comment

* minimal RPC and VC support for infoming BN of subnets

* create and verify slot signatures in RPC interface and VC

* loosen old slot check

* because Slot + uint64 works but uint64 + Slot doesn't

* document assumptions for head state use; don't clear stability subnets; guard against VC not having checked an epoch ahead, fixing a crash; clarify unsigned comparison

* revert unsub fix
2020-12-22 10:05:36 +01:00
Eugene Kabanov 46c2740097
Documentation for Validators API. (#2147)
* Recover proper validator API call and remove incorrect one.
Add more examples to API documentation.
2020-12-07 14:51:14 +02:00
Jacek Sieka 8685eb7042 add validator balance metrics 2020-11-28 16:22:09 +02:00
tersec 21c4ce8fd4
remove superfluous TODOs/not-really-TODOs, type conversion, imports (#2025) 2020-11-16 17:10:51 +01:00
Zahary Karadjov bd1047b715 0.6.0 release fixes
* Updated README
* No double v in nimbus_beacon_node --version
* No md5 checksums
2020-11-09 17:09:49 +02:00
tersec 271df8b604
bump 1.0.0rc-0 spec refs to 1.0.0 (#1974) 2020-11-09 14:18:55 +00:00
Zahary Karadjov e9b9cd75ee Rename binaries; Mimic the original repo layout in the distribution 2020-11-09 11:38:52 +02:00