Commit Graph

171 Commits

Author SHA1 Message Date
tersec 5eecb9a21f
rename no{R=>r}eturn, no{I=>i}init, short{l=>L}og, E{T=>t}h2Node, Beacon{c=>C}hainDB (#3403) 2022-02-16 23:24:44 +01:00
Jacek Sieka 7db5647a6e
clean up / document init (#3387)
* clean up / document init

* drop `immutable_validators` data (pre-altair)
* document versions where data is first added
* avoid needlessly loading genesis block data on startup
* add a few more internal database consistency checks
* remove duplicate state root lookup on state load

* comment
2022-02-16 16:44:04 +01:00
tersec 873a8ec1e6
use isZeroMemory for Eth2Digest comparisons (#3386)
* use isZeroMemory for Eth2Digest comparisons

* use Eth2Digest.isZero abstraction
2022-02-14 05:26:19 +00:00
Jacek Sieka 40fe8f5336 fix missing backfill when restarting node
When node is restarted before backfill has started but after some blocks
have finalized with forward sync, we would not start the backfill.

* also clean up one last `SomeSome`
2022-02-11 23:08:50 +02:00
Zahary Karadjov 215caa21ae Eth1 monitor fixes
* Fix a resource leak introduced in https://github.com/status-im/nimbus-eth2/pull/3279

* Don't restart the Eth1 syncing proggress from scratch in case of
  monitor failures during Eth2 syncing.

* Switch to the primary operator as soon as it is back online.

* Log the web3 credentials in fewer places

Other changes:

The 'web3 test' command has been enhanced to obtain and print more
data regarding the selected provider.
2022-02-03 14:01:55 +02:00
tersec 8e6a920bf4
rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH (#3350)
* rename MERGE_FORK_EPOCH to BELLATRIX_FORK_EPOCH

* fix REST test rules
2022-02-02 14:06:55 +01:00
Jacek Sieka ff4f2a6b6c
better log on finalized slot failure 2022-02-01 21:23:18 +01:00
Jacek Sieka 3df9ffca9f val-mon: remove redundant `_total` suffix from counters
It turns out nim-metrics adds this suffix on its own - it also turns out
some of the names are non-conventional and need follow-up.
2022-01-31 18:51:24 +02:00
Jacek Sieka ad327a8769
Fix counters in validator monitor totals mode (#3332)
The current counters set gauges etc to the value of the _last_ validator
to be processed - as the name of the feature implies, we should be using
sums instead.

* fix missing beacon state metrics on startup, pre-first-head-selection
* fix epoch metrics not being updated on cross-epoch reorg
2022-01-31 08:36:29 +01:00
Jacek Sieka d583e8e4ac
Store finalized block roots in database (3s startup) (#3320)
* Store finalized block roots in database (3s startup)

When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in `.era` files to allow
constant-time by-slot lookups.

In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.

Doing so has a number of interesting effects:

* mainnet startup time is improved 3-5x (3s on my laptop)
* the _first_ startup might take slightly longer as the new index is
being built - ~10s on the same laptop
* we no longer rely on the beacon block summaries to load the full dag -
this is a lot faster because we no longer have to look up each block by
parent root
* a collateral benefit is that we no longer need to load the full
summaries table into memory - we get the RSS benefits of #3164 without
the CPU hit.

Other random stuff:

* simplify forky block generics
* fix withManyWrites multiple evaluation
* fix validator key cache not being updated properly in chaindag
read-only mode
* drop pre-altair summaries from `kvstore`
* recreate missing summaries from altair+ blocks as well (in case
database has lost some to an involuntary restart)
* print database startup timings in chaindag load log
* avoid allocating superfluos state at startup
* use a recursive sql query to load the summaries of the unfinalized
blocks
2022-01-30 18:51:04 +02:00
tersec 29e2169585
phase 0 & altair beacon chain and altair validator spec URL updates (#3339) 2022-01-29 13:53:31 +00:00
Jacek Sieka e264276b36
keep unviables in quarantine (#3331)
they remain unviable even after a reorg
2022-01-28 11:59:55 +01:00
Jacek Sieka d076e1a11b
ncli_db: import states and blocks from era file (#3313) 2022-01-25 09:28:26 +01:00
tersec 351c2fd48a
rename mergeData to bellatrixData and mergeFork to bellatrixFork (#3315) 2022-01-24 16:23:13 +00:00
Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
Dustin Brody 9699858422 rename MERGE_FORK_VERSION to BELLATRIX_FORK_VERSION 2022-01-20 19:33:05 +02:00
Jacek Sieka 570379d3d9
Backfiller (#3263)
Backfilling is the process of downloading historical blocks via P2P that
are required to fulfill `GetBlocksByRange` duties - this happens during
both trusted node and finalized checkpoint syncs.

In particular, backfilling happens after syncing to head, such that
attestation work can start as soon as possible.

* Fix SyncQueue initialization procedure.
Remove usage of `awaitne`.
Add cancellation support.
Remove unneeded `sleepAsync()` if peer's head is older than needed.
Add `direction` field to all logs.
Fix syncmanager wedge issue.
Add proper resource cleaning procedure on backward sync finish.

Co-authored-by: cheatfate <eugene.kabanov@status.im>
2022-01-20 08:25:45 +01:00
tersec 9c0c9c98ce
complete switch to beacon_chain/specs/datatypes/bellatrix (#3295) 2022-01-18 13:36:52 +00:00
Zahary Karadjov 47f1f7ff1a More efficient reward data persistance; Address review comments
The new format is based on compressed CSV files in two channels:

* Detailed per-epoch data
* Aggregated "daily" summaries

The use of append-only CSV file speeds up significantly the epoch
processing speed during data generation. The use of compression
results in smaller storage requirements overall. The use of the
aggregated files has a very minor cost in both CPU and storage,
but leads to near interactive speed for report generation.

Other changes:

- Implemented support for graceful shut downs to avoid corrupting
  the saved files.

- Fixed a memory leak caused by lacking `StateCache` clean up on each
  iteration.

- Addressed review comments

- Moved the rewards and penalties calculation code in a separate module

Required invasive changes to existing modules:

- The `data` field of the `KeyedBlockRef` type is made public to be used
  by the validator rewards monitor's Chain DAG update procedure.

- The `getForkedBlock` procedure from the `blockchain_dag.nim` module
  is made public to be used by the validator rewards monitor's Chain DAG
  update procedure.
2022-01-18 01:56:56 +02:00
Zahary Karadjov 29aad0241b Precise per-component ETH-denominated rewards tracking
This is an alternative take on https://github.com/status-im/nimbus-eth2/pull/3107
that aims for more minimal interventions in the spec modules at the expense of
duplicating more of the spec logic in ncli_db.
2022-01-18 01:56:56 +02:00
Jacek Sieka 836f6984bb
move `state_transition` to `Result` (#3284)
* better error messages in api
* avoid `BlockData` copies when replaying blocks
2022-01-17 12:19:58 +01:00
Jacek Sieka 805e85e1ff
time: spring cleaning (#3262)
Time in the beacon chain is expressed relative to the genesis time -
this PR creates a `beacon_time` module that collects helpers and
utilities for dealing the time units - the new module does not deal with
actual wall time (that's remains in `beacon_clock`).

Collecting the time related stuff in one place makes it easier to find,
avoids some circular imports and allows more easily identifying the code
actually needs wall time to operate.

* move genesis-time-related functionality into `spec/beacon_time`
* avoid using `chronos.Duration` for time differences - it does not
support negative values (such as when something happens earlier than it
should)
* saturate conversions between `FAR_FUTURE_XXX`, so as to avoid
overflows
* fix delay reporting in validator client so it uses the expected
deadline of the slot, not "closest wall slot"
* simplify looping over the slots of an epoch
* `compute_start_slot_at_epoch` -> `start_slot`
* `compute_epoch_at_slot` -> `epoch`

A follow-up PR will (likely) introduce saturating arithmetic for the
time units - this is merely code moves, renames and fixing of small
bugs.
2022-01-11 11:01:54 +01:00
tersec ae61512ee9
rename upgrade_to_{merge,bellatrix}; detect unchanging spec YAMLs (#3265) 2022-01-10 09:39:43 +00:00
Jacek Sieka 6f7e0e3393
REST cleanups (#3255)
* REST cleanups

* reject out-of-range committee requests
* print all hex values as lower-case
* allow requesting state information by head state root
* turn `DomainType` into array (follow spec)
* `uint_to_bytesXX` -> `uint_to_bytes` (follow spec)
* fix wrong dependent root in `/eth/v1/validator/duties/proposer/`
* update documentation - `--subscribe-all-subnets` is no longer needed
when using the REST interface with validator clients
* more fixes
* common helpers for dependent block
* remove test rules obsoleted by more strict epoch tests
* fix trailing commas

* Update docs/the_nimbus_book/src/rest-api.md
* Update docs/the_nimbus_book/src/rest-api.md

Co-authored-by: sacha <sacha@status.im>
2022-01-08 22:06:34 +02:00
Jacek Sieka ba99c8fe4f
update era file documentation / impl (#3226)
Overhaul of era files, including documentation and reference
implementations

* store blocks, then state, then slot indices for easy lookup at low
cost
* document era file rationale
* altair+ support in era writer
2022-01-07 11:13:19 +01:00
tersec 0fd8bf7b56
spec URL updates (#3254) 2022-01-06 18:35:38 +00:00
Jacek Sieka 0a4728a241
Handle access to historical data for which there is no state (#3217)
With checkpoint sync in particular, and state pruning in the future,
loading states or state-dependent data may fail. This PR adjusts the
code to allow this to be handled gracefully.

In particular, the new availability assumption is that states are always
available for the finalized checkpoint and newer, but may fail for
anything older.

The `tail` remains the point where state loading de-facto fails, meaning
that between the tail and the finalized checkpoint, we can still get
historical data (but code should be prepared to handle this as an
error).

However, to harden the code against long replays, several operations
which are assumed to work only with non-final data (such as gossip
verification and validator duties) now limit their search horizon to
post-finalized data.

* harden several state-dependent operations by logging an error instead
of introducing a panic when state loading fails
* `withState` -> `withUpdatedState` to differentiate from the other
`withState`
* `updateStateData` can now fail if no state is found in database - it
is also hardened against excessively long replays
* `getEpochRef` can now fail when replay fails
* reject blocks with invalid target root - they would be ignored
previously
* fix recursion bug in `isProposed`
2022-01-05 19:38:04 +01:00
zah fba1f08a5e
Implement #3129 (Optimized history traversals in the REST API) (#3219)
* Fix REST some rest call signatures and implement a simple API benchmark tool

* Implement #3129 (Optimized history traversals in the REST API)

Other notable changes:

The `updateStateData` procedure in the `blockchain_dag.nim` module is
optimized to not rewind down to the last snapshot state saved in the
database if the supplied input state can be used as a starting point
instead.

* Disallow await in withStateForBlockSlot
2022-01-05 15:49:10 +01:00
tersec 5878d34117
rename forkDigests.merge to forkDigests.bellatrix (#3245) 2022-01-05 14:24:15 +00:00
tersec b81c06edab
rename Beacon{Block,State}Fork.Merge to Bellatrix; update copyright years (#3240) 2022-01-04 09:45:38 +00:00
Jacek Sieka c4ce59e55b
Assorted logging improvements (#3237)
* log doppelganger detection when it activates and when it causes missed
duties
* less prominent eth1 sync progress
* log in-progress sync at notice only when actually missing duties
* better detail in replay log
* don't log finalization checkpoints - this is quite verbose when
syncing and already included in "Slot start"
2022-01-03 22:18:49 +01:00
Jacek Sieka 7ec97a6b35
Fix missing checkpoint states` (#3225)
With the right sequence of events (for example a REST request or a
validation), it can happen that the first traversal across a state
checkpoint boundary is done without storing that state on disk - this
causes problens when replaying states, because now states may be missing
from the database.

Here, we simply avoid using the caches when advancing a state that will
go into the database, ensuring that the information lost during caching
always is permanently stored.

* fix recursion bug in `isProposed`
2021-12-30 12:33:03 +01:00
tersec 0d4e49f946
Merge fork gossip support (#3213)
* Merge fork gossip support

* index directly by BeaconStateFork and remove debugging log statement
2021-12-21 15:24:23 +01:00
Jacek Sieka 1021e3324e
Revert writing backfill root to database (#3215)
Introduced in #3171, it turns out we can just follow the block headers
to achieve the same effect

* leaves the constant in the code so as to avoid confusion when reading
database that had the constant written (such as the fleet nodes and
other unstable users)
2021-12-21 11:40:14 +01:00
Jacek Sieka c270ec21e4
Validator monitoring (#2925)
Validator monitoring based on and mostly compatible with the
implementation in Lighthouse - tracks additional logs and metrics for
specified validators so as to stay on top on performance.

The implementation works more or less the following way:
* Validator pubkeys are singled out for monitoring - these can be
running on the node or not
* For every action that the validator takes, we record steps in the
process such as messages being seen on the network or published in the
API
* When the dust settles at the end of an epoch, we report the
information from one epoch before that, which coincides with the
balances being updated - this is a tradeoff between being correct
(waiting for finalization) and providing relevant information in a
timely manner)
2021-12-20 20:20:31 +01:00
Jacek Sieka 03005f48e1
Backfill support for ChainDAG (#3171)
In the ChainDAG, 3 block pointers are kept: genesis, tail and head. This
PR adds one more block pointer: the backfill block which represents the
block that has been backfilled so far.

When doing a checkpoint sync, a random block is given as starting point
- this is the tail block, and we require that the tail block has a
corresponding state.

When backfilling, we end up with blocks without corresponding states,
hence we cannot use `tail` as a backfill pointer - there is no state.

Nonetheless, we need to keep track of where we are in the backfill
process between restarts, such that we can answer GetBeaconBlocksByRange
requests.

This PR adds the basic support for backfill handling - it needs to be
integrated with backfill sync, and the REST API needs to be adjusted to
take advantage of the new backfilled blocks when responding to certain
requests.

Future work will also enable moving the tail in either direction:
* pruning means moving the tail forward in time and removing states
* backwards means recreating past states from genesis, such that
intermediate states are recreated step by step all the way to the tail -
at that point, tail, genesis and backfill will match up.
* backfilling is done when backfill != genesis - later, this will be the
WSS checkpoint instead
2021-12-13 14:36:06 +01:00
Jacek Sieka 9f27f0d97c
BlockId reform (#3176)
* BlockId reform

Introduce `BlockId` that helps track a root/slot pair - this prepares
the codebase for backfilling and handling out-of-dag blocks

* move block dag code to separate module
* fix finalised state root in REST event stream
* fix finalised head computation on head update, when starting from
checkpoint
* clean up chaindag init
* revert `epochAncestor` change in introduced in #3144 that would return
an epoch ancestor from the canoncial history instead of the given
history, causing `EpochRef` keys to point to the wrong block
2021-12-09 19:06:21 +02:00
Jacek Sieka 069bccd51b
batch-verify sync messages for a small perf boost (#3151)
* batch-verify sync messages for a small perf boost

Generally reuses the same structure as attestation and aggregate
verification

* normalize `signatures` and `signature_batch` to use the same pattern
of verification
* normalize parameter names, order etc for signature stuff in general
* avoid calling `blsSign` directly - instead, go through `signatures`
consistently
2021-12-09 14:56:54 +02:00
tersec 2ca28fb861
Merge BeaconBlock gossip validation (#3165)
* Merge BeaconBlock gossip validation

* figure/ground inversion

* revert cosmetic cleanups to reduce merge conflicts
2021-12-08 17:29:22 +00:00
Etan Kissling 38e64b3441 cleanup sync subcommittee accessors
This removes some dead code from `getSubcommitteePositionsAux` which is
no longer needed since the introduction of `SyncCommitteeCache`.
This also cleans up some formatting, uses `let` instead of `var` where
possible, and uses implicit `pairs` in one case for consistency.
2021-12-07 18:17:03 +02:00
Jacek Sieka 89d6a1b403
Introduce slot->BlockRef mapping for finalized chain (#3144)
* Introduce slot->BlockRef mapping for finalized chain

The finalized chain is linear, thus we can use a seq to lookup blocks by
slot number.

Here, we introduce such a seq, even though in the future, it should
likely be backed by a database structure instead, or, more likely, a
flat era file with a flat lookup index.

This dramatically speeds up requests by slot, such as those coming from
the REST interface or GetBlocksByRange, as these are currently served by
a linear iteration from head.

* fix REST block requests to not return blocks from an earlier slot when
the given slot is empty
* fix StateId interpretation such that it doesn't treat state roots as
block roots
* don't load full block from database just to return its root
2021-12-06 20:52:35 +02:00
Jacek Sieka 1a8b7469e3
move quarantine outside of chaindag (#3124)
* move quarantine outside of chaindag

The quarantine has been part of the ChainDAG for the longest time, but
this design has a few issues:

* the function in which blocks are verified and added to the dag becomes
reentrant and therefore difficult to reason about - we're currently
using a stateful flag to work around it
* quarantined blocks bypass the processing queue leading to a processing
stampede
* the quarantine flow is unsuitable for orphaned attestations - these
should also should be quarantined eventually

Instead of processing the quarantine inside ChainDAG, this PR moves
re-queueing to `block_processor` which already is responsible for
dealing with follow-up work when a block is added to the dag

This sets the stage for keeping attestations in the quarantine as well.

Also:

* make `BlockError` `{.pure.}`
* avoid use of `ValidationResult` in block clearance (that's for gossip)
2021-12-06 10:49:01 +01:00
tersec 4378f3f096
almost all remaining ethereum/{eth2.0-specs -> consensus-specs} (#3158) 2021-12-03 20:01:13 +00:00
Jacek Sieka aa1dea03cd
speed up gossip and sync block validation (#3143)
* avoid recomputing hash for block signature check
* check block slot match before hitting the database
2021-12-01 10:52:40 +01:00
Etan Kissling eb777a6c8b allow `withState` to be called multiple times
This allows `blockchain_dag`'s `withState` template to be called more
than once in a single function. This led to a compilation error before
because the injected variables and functions shared the same scope.
2021-11-29 15:24:12 +02:00
Jacek Sieka 9c2f43ed0e
Speed up altair block processing 2x (#3115)
* Speed up altair block processing >2x

Like #3089, this PR drastially speeds up historical REST queries and
other long state replays.

* cache sync committee validator indices
* use ~80mb less memory for validator pubkey mappings
* batch-verify sync aggregate signature (fixes #2985)
* document sync committee hack with head block vs sync message block
* add batch signature verification failure tests

Before:

```
../env.sh nim c -d:release -r ncli_db --db:mainnet_0/db bench --start-slot:-1000
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
    5830.675,        0.000,     5830.675,     5830.675,            1, Initialize DB
       0.481,        1.878,        0.215,       59.167,          981, Load block from database
    8422.566,        0.000,     8422.566,     8422.566,            1, Load state from database
       6.996,        1.678,        0.042,       14.385,          969, Advance slot, non-epoch
      93.217,        8.318,       84.192,      122.209,           32, Advance slot, epoch
      20.513,       23.665,       11.510,      201.561,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

After:

```
    7081.422,        0.000,     7081.422,     7081.422,            1, Initialize DB
       0.553,        2.122,        0.175,       66.692,          981, Load block from database
    5439.446,        0.000,     5439.446,     5439.446,            1, Load state from database
       6.829,        1.575,        0.043,       12.156,          969, Advance slot, non-epoch
      94.716,        2.749,       88.395,      100.026,           32, Advance slot, epoch
      11.636,       23.766,        4.889,      205.250,          981, Apply block, no slot processing
       0.000,        0.000,        0.000,        0.000,            0, Database load
       0.000,        0.000,        0.000,        0.000,            0, Database store
```

* add comment
2021-11-24 13:43:50 +01:00
Jacek Sieka f19a497eec
ncli_db: add putState, putBlock (#3096)
* ncli_db: add putState, putBlock

These tools allow modifying an existing nimbus database for the purpose
of recovery or reorg, moving the head, tail and genesis to arbitrary
points.

* remove potentially expensive `putState` in `BeaconStateDB`
* introduce `latest_block_root` which computes the root of the latest
applied block from the `latest_block_header` field (instead of passing
it in separately)
* avoid some unnecessary BeaconState copies during init
* discover https://github.com/nim-lang/Nim/issues/19094
* prefer `HashedBeaconState` in a few places to avoid recomputing state
root
* fetch latest block root from state when creating blocks
* harden `get_beacon_proposer_index` against invalid slots and document
* move random spec function tests to `test_spec.nim`
* avoid unnecessary state root computation before block proposal
2021-11-18 13:02:43 +01:00
Jacek Sieka 222674b203
better logging on invalid database (#3097)
* remove redundant test report
2021-11-13 17:27:28 +01:00
Jacek Sieka ec650c7fd7
Support starting from altair (#3054)
* Support starting from altair

* hide `finalized-checkpoint-` - they are incomplete and usage may cause
crashes
* remove genesis detection code (broken, obsolete)
* enable starting ChainDAG from altair checkpoints - this is a
prerequisite for checkpoint sync (TODO: backfill)
* tighten checkpoint state conditions
* show error when starting from checkpoint with existing database (not
supported)
* print rest-compatible JSON in ncli/state_sim
* altair/merge support in ncli
* more altair/merge support in ncli_db
* pre-load header to speed up loading
* fix forked block decoding
2021-11-10 13:39:08 +02:00
Jacek Sieka ea0a191723
Better REST/RPC error messages (#3046)
* Better REST/RPC error messages
* homogenise block logging (root first)
* homegenise message verification pipeline (verify in
`gossip_verification`, act in `eth2_processor`)
* use `subcommitteeIdx` consistently
* log each sent contribution
* fix block_sim
* fix block topic
* don't recalc root on gossip block validation
* move position loop into sync pool
2021-11-05 17:39:47 +02:00
Jacek Sieka a086cf01ac
altair fork handling cleanups (#3050)
* fix stack overflow crash in REST/debug/getStateV2
* introduce `ForkyXxx` for generic type matching of `Xxx` across
branches (SomeHashedBeaconState -> ForkyHashedBeaconState et al) -
`Some` is already used for other types of type classes
* consolidate function naming in BeaconChainDB, use some generics
* import `forks.nim` from other spec modules and move `Forked*` helpers
around to resolve circular imports
* remove `ForkedBeaconState`, use `ForkedHashedBeaconState` throughout
(less data shuffling between the types)
* fix several cases of states being stored on stack in tests, causing
random failures on some platforms
* remove reading json support from ncli - this should be ported to the
rest json reading instead (doesn't currently work because stack sizes)
2021-11-05 08:34:34 +01:00
Jacek Sieka 233d756518
Logging and startup improvements (#3038)
* Logging and startup improvements

Color support for released binaries!

* startup scripts no longer log to file by default - this only affects
source builds - released binaries don't support file logging
* add --log-stdout option to control logging to stdout (colors, json)
* detect tty:s vs redirected logs and log accordingly
* add option to disable log colors at runtime
* simplify several "common" logs, showing the most important information
earlier and more clearly
* remove line numbers / file information / tid - these take up space and
are of little use to end users
  * still enabled in debug builds and tools
* remove `testnet_servers_image` compile-time option
* server images, released binaries and compile-from-source now offer
the same behaviour and features
* fixes https://github.com/status-im/nimbus-eth2/issues/2326
* fixes https://github.com/status-im/nimbus-eth2/issues/1794
* remove instanteneous block speed from sync message, keeping only
average

before:

```
INF 2021-10-28 16:45:59.000+02:00 Slot start                                 topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:884 lastSlot=2384027 wallSlot=2384028 delay=461us84ns peers=0 head=75a10ee5:3348 headEpoch=104 finalized=cd6804ba:3264 finalizedEpoch=102 sync="wwwwwwwwww:0:0.0000:0.0000:00h00m (3348)"
INF 2021-10-28 16:45:59.046+02:00 Slot end                                   topics="beacnde" tid=386429 file=nimbus_beacon_node.nim:821 slot=2384028 nextSlot=2384029 head=75a10ee5:3348 headEpoch=104 finalizedHead=cd6804ba:3264 finalizedEpoch=102 nextAttestationSlot=-1 nextProposalSlot=-1 nextActionWait=n/a
```

after:

```
INF 2021-10-28 22:43:23.033+02:00 Slot start                                 topics="beacnde" slot=2385815 epoch=74556 sync="DDPDDPUDDD:10:5.2258:01h19m (2361088)" peers=37 head=eacd2dae:2361096 finalized=73782:a4751487 delay=33ms687us715ns
INF 2021-10-28 22:43:23.291+02:00 Slot end                                   topics="beacnde" slot=2385815 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=eacd2dae:2361096
```

* fix comment

* documentation updates

* mention `--log-file` may be deprecated in the future
* update various docs
2021-11-02 18:06:36 +01:00
Jacek Sieka 9cf32c3748 clean up sync subcommittee handling
* `SyncCommitteeIndex` -> `SyncSubcommitteeIndex`
* `syncCommitteePeriod` -> `sync_committee_period` (spec spelling)
* tighten period comparisons
* fix assert when validating committee message with non-altair state in
REST api
2021-10-20 22:59:13 +03:00
Jacek Sieka c40cc6cec1 clean up fork enum and field names
* single naming strategy
* simplify some fork code
* simplify forked block production
2021-10-19 11:06:38 +03:00
Jacek Sieka 5b33783898
load altair state from database on startup (#2994)
* fixes replay from last phase0 block on startup
* better forked block loading
2021-10-18 14:32:54 +02:00
Eugene Kabanov 7319f150e8
Number of REST fixes. (#2987)
* Initial commit.

* Fix path.

* Add validator keys to indices cache mechanism.
Move syncComitteeParticipants to common place.

* Fix sync participants order issue.

* Fix error code when state could not be found.
Refactor `state/validators` to use keysToIndices mechanism.

* Fix RestValidatorIndex to ValidatorIndex conversion TODOs.

* Address review comments.

* Fix REST test rules.
2021-10-14 12:38:38 +02:00
Jacek Sieka f90b2b8b1f
reward accounting for altair+ (#2981)
Similar to the existing `RewardInfo`, this PR adds the infrastructure
needed to export epoch processing information from altair+. Because
accounting is done somewhat differently, the PR uses a fork-specific
object to extrct the information in order to make the cost on the spec
side low.

* RewardInfo -> EpochInfo, ForkedEpochInfo
* use array for computing new sync committee
* avoid repeated total active balance computations in block processing
* simplify proposer index check
* simplify epoch transition tests
* pre-compute base increment and reuse in epoch processing, and a few
other small optimizations

This PR introduces the type and does the heavy lifting in terms of
refactoring - the tools that use the accounting will need separate PR:s
(as well as refinements to the exportred information)
2021-10-13 16:24:36 +02:00
tersec 2eb9a608a4
add payloadId; add merge vector test script; remove consensusValidated (#2982) 2021-10-13 16:08:50 +02:00
Jacek Sieka fabec894dd
harden sync sub committee selection (#2965)
* harden sync sub committee selection

also turn it into an iterator

* fix test and warning
2021-10-07 13:19:47 +00:00
Ștefan Talpalaru 359937fa70
interop metrics (#2964)
https://github.com/ethereum/beacon-metrics/blob/master/metrics.md#interop-metrics
2021-10-07 06:19:07 +00:00
Etan Kissling 9ee134324b
allow `withXxx` to access fork-specific fields (#2943)
So far, `withState` and `withBlck` templates could only be used to have
convenience access to fork-agnostic BeaconState and BeaconBlock fields.
This patch:
- injects an additional `stateFork` constant that allows to use
  `when` expressions to also access Altair and Merge-specific fields.
- introduces a `withStateAndBlck` template to support operating on both
  a `BeaconState` and `BeaconBlock` at a time.
- makes sync committee related functions Merge aware.
- changes a couple if-else trees for forks into case statements so that
  forgotten future forks are promoted to compile-time errors.
2021-10-06 20:05:06 +03:00
Etan Kissling 2bbffbde10
abort compile when fork epoch is forgotten (#2939)
There are a few locations in the code that compare the current epoch to
the various FORK_EPOCH constants and branch off into fork-specific code.
When a new fork is introduced, it is sometimes forgotten to update all
of those branch locations. This patch introduces a compile-time check
that ensures that all branches need to be covered exhaustively. This is
done by replacing if-elif structures with case expressions.
2021-10-04 08:31:21 +00:00
tersec 6b3bf7eb7b
merge hardfork database support (#2911)
* merge hardfork database support

* working block_sim

* recreate state transition changes
2021-09-30 01:07:24 +00:00
Etan Kissling 01a9b275ec
handle duplicate pubkeys in sync committee (#2902)
When sync committee message handling was introduced in #2830, the edge
case of the same validator being selected multiple times as part of a
sync subcommittee was not covered. Not handling that edge case makes
sync contributions have a lower-than-expected participation rate as each
sync validator is only counted up through once per subcommittee.
This patch ensures that this edge case is properly covered.
2021-09-28 07:44:20 +00:00
tersec 2b2846b468
implement forked merge state/block support (#2890)
* implement forked state/block support

* merge support for containsOrphan; import cleanup; 80-column lines

* add merge block header operations and slot sanity fixture

* add epoch state transition tests; implement is_valid_gas_limit(), is_merge_block(), is_execution_enabled(), and compute_timestamp_at_slot()

* implement process_execution_payload() and add merge deposit operations tests

* add merge block sanity tests

* add merge case to syncCommitteeParticipants

* v1.1.0-beta.5 updates

* reduce getTestStates-based memory usage; don't try to REST-serialize ExecutionPayload transactions without underlying support

* add execution payload tests; switch var to let in tests/official/
2021-09-27 14:22:58 +00:00
Eugene Kabanov 0c635334a2
Sync committee related REST API implementation. (#2856) 2021-09-24 01:13:25 +03:00
Eugene Kabanov b566d4657f
REST /eth/v1/events API call implementation. (#2878)
* Placing callbacks into strategic places.

* Initial events call implementation.

* Post rebase fixes.

* Change addSyncContribution() implementation.

* Add `attestation-sent` event.
Remove gcsafe, raises from callbacks implementations.
Move `attestation-received` fire at the end of attestation processing.

* Address review comments.
2021-09-22 14:17:15 +02:00
Dustin Brody 6638476b5f discard putative blocks from invalid hardforks 2021-09-16 16:18:40 +03:00
tersec 166e22a43b
sync committee message pool and gossip validation (#2830) 2021-08-28 10:40:01 +00:00
Jacek Sieka ba06f13942
cleanups (#2809)
* cleanups

* use ForkedTrustedSignedBeaconBlock.ionit where appropriate
* move `is_aggregator` to `spec/`
* use `errReject` in a few more places
* update enr fork id when time is auspicious
* use network broadcast functions

* Return Ignore for aggregate signature validation timeouts

...consistently between aggregates and attestations.

* clean up some more reject/ignore rules
* shorten texts a bit

* errReject->checkedReject, use err helpers throughout

* get rid of quarantine in exitpool as well
2021-08-24 21:49:51 +02:00
tersec 4678a2bee7
move BeaconClock from ChainDAG to BeaconNode (#2796) 2021-08-20 08:58:15 +00:00
Jacek Sieka a7a65bce42
disentangle eth2 types from the ssz library (#2785)
* reorganize ssz dependencies

This PR continues the work in
https://github.com/status-im/nimbus-eth2/pull/2646,
https://github.com/status-im/nimbus-eth2/pull/2779 as well as past
issues with serialization and type, to disentangle SSZ from eth2 and at
the same time simplify imports and exports with a structured approach.

The principal idea here is that when a library wants to introduce SSZ
support, they do so via 3 files:

* `ssz_codecs` which imports and reexports `codecs` - this covers the
basic byte conversions and ensures no overloads get lost
* `xxx_merkleization` imports and exports `merkleization` to specialize
and get access to `hash_tree_root` and friends
* `xxx_ssz_serialization` imports and exports `ssz_serialization` to
specialize ssz for a specific library

Those that need to interact with SSZ always import the `xxx_` versions
of the modules and never `ssz` itself so as to keep imports simple and
safe.

This is similar to how the REST / JSON-RPC serializers are structured in
that someone wanting to serialize spec types to REST-JSON will import
`eth2_rest_serialization` and nothing else.

* split up ssz into a core library that is independendent of eth2 types
* rename `bytes_reader` to `codec` to highlight that it contains coding
and decoding of bytes and native ssz types
* remove tricky List init overload that causes compile issues
* get rid of top-level ssz import
* reenable merkleization tests
* move some "standard" json serializers to spec
* remove `ValidatorIndex` serialization for now
* remove test_ssz_merkleization
* add tests for over/underlong byte sequences
* fix broken seq[byte] test - seq[byte] is not an SSZ type

There are a few things this PR doesn't solve:

* like #2646 this PR is weak on how to handle root and other
dontSerialize fields that "sometimes" should be computed - the same
problem appears in REST / JSON-RPC etc

* Fix a build problem on macOS

* Another way to fix the macOS builds

Co-authored-by: Zahary Karadjov <zahary@gmail.com>
2021-08-18 20:57:58 +02:00
Jacek Sieka 7a622e8505
rework spec imports (#2779)
The spec imports are a mess to work with, so this branch cleans them up
a bit to ensure that we avoid generic sandwitches and that importing
stuff generally becomes easier.

* reexport crypto/digest/presets because these are part of the public
symbol set of the rest of the spec types
* don't export `merge` types from `base` - this causes circular deps
* fix circular deps in `ssz/spec_types` - this is the first step in
disentangling ssz from spec
* be explicit about phase0 vs altair - longer term, `altair` will become
the "natural" type set, then merge and so on, so no point in giving
`phase0` special preferential treatment
2021-08-12 13:08:20 +00:00
Jacek Sieka 9697b73e71
forkedbeaconstate_helpers -> forks (#2772)
Simpler module name for stuff that covers forks

* check that runtime config matches database state
* also include some assorted altair cleanups
* use "standard" genesis fork in local testnet to work around missing
runtime config support
2021-08-10 22:46:35 +02:00
Jacek Sieka a663ed65f4
Merge branch 'merge-stable' into unstable 2021-08-09 15:00:58 +02:00
tersec 2afe2802b6
altair topic switching (#2767)
* altair topic switching

* remove validate{Committee,Validator}IndexOr unused within branch
2021-08-09 12:54:45 +00:00
Jacek Sieka 7bb76a6cd1
Merge remote-tracking branch 'origin/stable' into merge-stable 2021-08-09 13:14:28 +02:00
Jacek Sieka ee79c10a7d
update validator key cache on startup (#2760)
* update validator key cache on startup

Versions prior to 1.1.0 do not write a validator key cache at all.

Versions from 1.4.0 and upwards require an immutable validator key cache
to verify blocks - normally, block verification fills the cache but that
assumes that at least one block was verified by a version that has the
key cache.

Taken together, this breaks direct upgrades from anything <1.1.0 to
1.4.0.

The fix is simply to refresh fill the cache from an existing state on
startup.

* also log serious block validation failures at info level
2021-08-05 11:26:10 +03:00
tersec a2c1b96ac9
prevent resyncing from genesis with altair head block (#2750) 2021-08-01 10:20:43 +02:00
tersec e4afc36d71
use ForkedTrustedSignedBeaconBlock (#2720)
* use ForkedTrustedSignedBeaconBlock

* remove --subscribe-all-subnets

* https://ethereum.github.io/eth2.0-APIs/#/Beacon/getBlock implementation was passing through forked beaconblocks
2021-07-14 12:18:52 +00:00
Jacek Sieka 23eea197f6
Implement split preset/config support (#2710)
* Implement split preset/config support

This is the initial bulk refactor to introduce runtime config values in
a number of places, somewhat replacing the existing mechanism of loading
network metadata.

It still needs more work, this is the initial refactor that introduces
runtime configuration in some of the places that need it.

The PR changes the way presets and constants work, to match the spec. In
particular, a "preset" now refers to the compile-time configuration
while a "cfg" or "RuntimeConfig" is the dynamic part.

A single binary can support either mainnet or minimal, but not both.
Support for other presets has been removed completely (can be readded,
in case there's need).

There's a number of outstanding tasks:

* `SECONDS_PER_SLOT` still needs fixing
* loading custom runtime configs needs redoing
* checking constants against YAML file

* yeerongpilly support

`build/nimbus_beacon_node --network=yeerongpilly --discv5:no --log-level=DEBUG`

* load fork epoch from config

* fix fork digest sent in status
* nicer error string for request failures
* fix tools

* one more

* fixup

* fixup

* fixup

* use "standard" network definition folder in local testnet

Files are loaded from their standard locations, including genesis etc,
to conform to the format used in the `eth2-networks` repo.

* fix launch scripts, allow unknown config values

* fix base config of rest test

* cleanups

* bundle mainnet config using common loader
* fix spec links and names
* only include supported preset in binary

* drop yeerongpilly, add altair-devnet-0, support boot_enr.yaml
2021-07-12 15:01:38 +02:00
zah eb2dc5cbbb
Implement the new Altair req/resp protocols (#2676)
* Implement the new Altair req/resp protocols

Also fixes the altair message-id computation by providing the correct
forkdigest prefix in `isAltairTopic`.

Co-authored-by: Tanguy Cizain <tanguycizain@gmail.com>
2021-07-07 12:09:47 +03:00
tersec 7577f8c2ef
add blockchain_dag altair database reading; add rollback tests (#2683)
* add blockchain_dag altair database reading; add rollback tests; fix some unnecessary type conversions

* remove debugging scaffolding

* proposeSignedBlock() will need to be async for merge; introduce altair types to VC
2021-06-29 15:09:29 +00:00
tersec 445def6c8b
block_clearance, ncli, and ncli_db Altair state saving (#2672)
* block_clearance, ncli, and ncli_db Altair state saving

* avoid invalidating SSZ hash caches with every assignment
2021-06-24 18:34:08 +00:00
tersec 41e0a7abc0
introduce database support for Altair (#2667)
* introduce immutable Altair BeaconState

* add database support for Altair blocks and states

* add tests for Altair get/put/contains/delete state

* enable blockchain_dag Altair state database storing

* properly return error on getting missing altair block
2021-06-24 07:11:47 +00:00
tersec ae1abf24af
add Altair support to block quarantine/clearance and block_sim (#2662)
* add Altair support to the block quarantine

* switch some spec/datatypes imports to spec/datatypes/base

* add Altair support to block_clearance

* allow runtime configuration of Altair transition slot

* enable Altair in block_sim, including in CI
2021-06-23 14:43:18 +00:00
tersec 9616220280
implement Altair attestation pool cache init (#2659)
* implement Altair attestation pool cache init

* remove code duplication around previous/current epoch updates
2021-06-17 17:13:14 +00:00
tersec 146fa48454
use ForkedHashedBeaconState in StateData (#2634)
* use ForkedHashedBeaconState in StateData

* fix FAR_FUTURE_EPOCH -> slot overflow; almost always use assign()

* avoid stack allocation in maybeUpgradeStateToAltair()

* create and use dispatch functions for check_attester_slashing(), check_proposer_slashing(), and check_voluntary_exit()

* use getStateRoot() instead of various state.data.hbsPhase0.root

* remove withStateVars.hashedState(), which doesn't work as a design anymore

* introduce spec/datatypes/altair into beacon_chain_db

* fix inefficient codegen for getStateField(largeStateField)

* state_transition_slots() doesn't either need/use blocks or runtime presets

* combine process_slots(HBS)/state_transition_slots(HBS) which differ only in last-slot htr optimization

* getStateField(StateData, ...) was replaced by getStateField(ForkedHashedBeaconState, ...)

* fix rollback

* switch some state_transition(), process_slots, makeTestBlocks(), etc to use ForkedHashedBeaconState

* remove state_transition(phase0.HashedBeaconState)

* remove process_slots(phase0.HashedBeaconState)

* remove state_transition_block(phase0.HashedBeaconState)

* remove unused callWithBS(); separate case expression from if statement

* switch back from nested-ref-object construction to (ref Foo)(Bar())
2021-06-11 20:51:46 +03:00
Jacek Sieka 9193be9b7b
fix epoch logging (fixes #2283) (#2642)
Also put epoch first to disambiguate vs slot
2021-06-11 01:07:16 +03:00
Jacek Sieka d859bc12f0
write uncompressed validator keys to database (#2639)
* write uncompressed validator keys to database

Loading 150k+ validator keys on startup in compressed format takes a lot
of time - better store them in uncompressed format which makes behaviour
just after startup faster / more predictable.

* refactor cached validator key access
* fix isomorphic cast to work with non-var instances
* remove cooked pubkey cache - directly use database cache in chaindag
as well (one less cache to keep in sync)
* bump blscurve, introduce loadValid for known-to-be-valid keys
2021-06-10 10:37:02 +03:00
Jacek Sieka b11da2cb34 fix state cache loading
* load the cache of the current state epoch instead of the target state
epoch, when applying states and slots
* load state cache for each slot/block (for longer slot jumps)
* load state cache after full updateStateData
* look up two state cache epochs, instead of the same epoch twice :)
2021-06-03 21:37:52 +03:00
tersec 28a5bca71a
split state_transition() into slots/block parts and use only block where appropriate (#2630) 2021-06-03 11:42:25 +02:00
Jacek Sieka 0fb02b5206 log state update duration, lower info threshold for detail logging 2021-06-01 20:43:44 +03:00
Jacek Sieka abe0d7b4ae singe validator key cache
Instead of keeping a validator key list per EpochRef, this PR introduces
a single shared validator key list in ChainDAG, and cleans up some other
ChainDAG and key-related issues.

The PR does not introduce the validator key list in the state transition
- this is because we batch-check all signatures before entering the spec
code, thus the spec code never hits the cache.

A future refactor should _probably_ remove the threadvar altogether.

There's a few other small fixes in here that make the flow easier to
read:

* fix `var ChainDAGRef` -> `ChainDAGRef`
* fix `var QuarantineRef` -> `QuarantineRef`
* consistent `dag` variable name
* avoid using threadvar pubkey cache in most cases
* better error messages in batch signature checking
2021-06-01 20:43:44 +03:00
tersec ea9ceb693a
update ChainDAG.effective_balance() to use StateData; rm ChainDAG.getBlockByPreciseSlot() (#2622)
* update ChainDAG.effective_balance() to use StateData; rm unused ChainDAG.getBlockByPreciseSlot()

* update get_effective_balances to avoid god object; avoid most memory allocation in Altair epoch reward and penalty processing
2021-06-01 12:40:13 +00:00
Jacek Sieka 9b89f58089
revert advance back to trace 2021-06-01 14:09:11 +02:00
Jacek Sieka 60df17786e avoid reading legacy db on write
* don't consider legacy database when writing state - this read is slow
on kvstore
* avoid epoch transition when there's an exact match in cache already
* simplify init to only consider checkpoint states
2021-05-30 12:32:51 +03:00
Jacek Sieka df7bc87af5 Pre-compute slot transition for clearance state
This way we perform the expensive epoch processing before the block
arrives.

Of course, this may lead to speculative misses which in turn lead to
replays - it's likely that in the case of a miss, we'll see a replay
regardless.
2021-05-30 12:04:09 +03:00
Jacek Sieka 2df8a3b28d
add more block processing durations (#2611) 2021-05-28 21:03:20 +02:00
Jacek Sieka 7f52ffb8d9
clean up block processing (#2610)
* gossip_to_consensus -> block_processor (it's processing only blocks,
but not only from gossip)
* measure queue and validation time for blocks
* measure assignment and state loading times for updateStateData
* avoid some unnecessary block copies in block sync
* warn that database is corrupt if we hit tail without a state
2021-05-28 19:34:00 +03:00