Commit Graph

12 Commits

Author SHA1 Message Date
zah b1ac9c9fe4
Fix a potential segfault and various potential stalls (#4003)
* Fixes a segfault during block production when the Keymanager API
  is disabled. The Keymanager is now disabled on half of the local
  testnet nodes to catch such problems in the future.

* Fixes multiple potential stalls from REST requests being done
  without a timeout. From practice, we know that such requests
  can hang forever if not cancelled with a timeout. At best,
  this would be a resource leak, at worst, it may lead to a
  full stall of the client and missed validator duties.

* Changes some Options usages to Opt (for easier use of valueOr)
2022-08-19 21:51:30 +00:00
Jacek Sieka c8fb447020
valmon: log autoregistration once only (#3993) 2022-08-18 23:09:49 +00:00
Miran dfd4afc9f2
compatibility with Nim 1.4+ (#3888) 2022-07-29 10:53:42 +00:00
zah a2ba34f686
Implement all sync committee duties in the validator client (#3583)
Other changes:

* logtrace can now verify sync committee messages and contributions
* Many unnecessary use of pairs() have been removed for consistency
* Map 40x BN response codes to BeaconNodeStatus.Incompatible in the VC
2022-05-10 10:03:40 +00:00
Jacek Sieka 30eef0a369
Validator monitor polish (#3569)
* lower "Previous epoch attestation missing" to `NOTICE` for easier
filtering
* add delay logging to validator monitor logs
* simplify delay logging code post-`BeaconTime`
2022-04-06 09:23:01 +00:00
tersec ef9767eb7a
implement --jwt-secret and HS256 JWT/JWS signing for engine API alpha.7 (#3440) 2022-02-27 16:55:02 +00:00
Jacek Sieka 49282e9477
val_mon: register locally produced aggregates (#3352)
These use a separate flow, and were previously only registered from the
network

* don't log successes in totals mode (TMI)
* remove `attestation-sent` event which is unused
2022-02-04 08:33:20 +01:00
Jacek Sieka 3df9ffca9f val-mon: remove redundant `_total` suffix from counters
It turns out nim-metrics adds this suffix on its own - it also turns out
some of the names are non-conventional and need follow-up.
2022-01-31 18:51:24 +02:00
Jacek Sieka ad327a8769
Fix counters in validator monitor totals mode (#3332)
The current counters set gauges etc to the value of the _last_ validator
to be processed - as the name of the feature implies, we should be using
sums instead.

* fix missing beacon state metrics on startup, pre-first-head-selection
* fix epoch metrics not being updated on cross-epoch reorg
2022-01-31 08:36:29 +01:00
Jacek Sieka e4939538cd
fix non-totalized validator metric (#3297)
...when running on host with many validators, this explodes
2022-01-19 15:20:48 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top on performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner)
2021-12-20 20:20:31 +01:00