Commit Graph

196 Commits

Author SHA1 Message Date
Jacek Sieka 10d99c166c
print attestation/aggregate drop notice once per slot (#2475)
* add metrics for queue-related drops
* avoid importing beacon node conf in processor
2021-04-06 13:59:11 +02:00
Mamy Ratsimbazafy 6b13cdce36
Batch attestations (#2439)
* batch attestations

* Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518)

* Try to remove the processing loop to no avail :/

* batch aggregates

* use resultsBuffer size for triggering deadline schedule

* pass attestation pool tests

* Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518)

* Put logging at debug level, add speed info

* remove unnecessary batch info when it is known to be one

* downgrade some logs to trace level

* better comments [skip ci]

* Address most review comments

* only use ref for async proc

* fix exceptions in eth2_network

* update async exceptions in gossip_validation

* eth2_network 2nd pass

* change to sleepAsync

* Update beacon_chain/gossip_processing/batch_validation.nim

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-04-02 16:36:43 +02:00
Jacek Sieka f821bc878e
Remove `-d:insecure` compile option (#2468)
With metrics running on top of chronos, the metrics server no longer
needs to be compiled in conditionally - it remains disabled by default.
2021-04-01 14:44:11 +02:00
tersec bd8b60f8c8
use epochref for get_committee_assignments() to avoid repeated shuffling (#2463)
* use epochref for get_committee_assignments() to avoid repeated shuffling

* remove unnecessary imports and StateCache() construction
2021-03-30 15:01:47 +00:00
Jacek Sieka 74732a23fe
json cleanups (#2456)
* move json-rpc specific marshalling to rpc
* serialize Epoch/Slot with cast to avoid Defect
* avoid a few eth1 deps
* simplify imports
2021-03-26 15:11:06 +01:00
Jacek Sieka 2695cfa864
EH cleanup (#2455)
almost 100% raises in nimbus-eth2 now!

* fix some rare exception-related crashes in json-rpc
2021-03-26 07:52:01 +01:00
Jacek Sieka 8b76ceed52
Fix minor exception effect issues (#2448)
Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.
2021-03-24 17:20:55 +01:00
tersec 36311bfc05
incorporate proposals into nextActionWait; switch some proc to func (#2438) 2021-03-24 10:05:04 +00:00
tersec 3076f5c3b6
rm std/random from beacon_chain and rm attestation timing randomness (#2442)
* remove added attestation timing randomness

* remove os/random from rest of beacon_chain, primarily deposit_contract

* remove scaffolding

* randomize std/random seed in beacon node and validator client

* use CSPRNG to more securely seed std/random
2021-03-23 06:57:10 +00:00
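
Note on the commit above: its last bullet describes seeding a general-purpose PRNG from a CSPRNG. A minimal Python sketch of that idea follows. It is not the Nim code from the commit; `make_seeded_rng` and the jitter value are hypothetical names chosen for illustration.

```python
import random
import secrets


def make_seeded_rng() -> random.Random:
    """Return a general-purpose PRNG whose seed comes from the OS CSPRNG.

    Hypothetical helper: the generator itself is still not suitable for
    cryptographic use, only its seed is unpredictable.
    """
    seed = secrets.randbits(64)  # 64 bits drawn from the operating system's CSPRNG
    return random.Random(seed)


rng = make_seeded_rng()
jitter_ms = rng.randrange(0, 1000)  # example use: a non-security-critical delay
```

Seeding this way keeps the cheap generator for non-critical randomness while avoiding a predictable, time-based seed.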
Jacek Sieka 01fe564e46
chronos-based metrics (#2432)
This opens up the road for removing `-d:insecure` for metrics.
2021-03-19 10:36:38 +01:00
Ștefan Talpalaru 683edbff7a
restore terminal echoing after pressing Ctrl+C at a password prompt (#2412) 2021-03-16 08:06:45 +00:00
Dustin Brody 97504fdb9d ncli_db pruneDatabase checkpointing; remove onSlotEnd lookaheadTime 2021-03-12 23:15:46 +02:00
tersec 5ebf36f54d
add metric for nextActionWait (#2399)
* add metric for nextActionWait

* use toFloatSeconds
2021-03-12 09:46:26 +00:00
Mamy Ratsimbazafy c47d636cb3
Split Eth2Processor in prep for batching (#2396)
* Split Eth2Processor in gossip and consensus part and materialize the shared block queue

* Update initialization in test_sync_manager
2021-03-11 11:10:57 +01:00
Mamy Ratsimbazafy 8e28a05cea
Move pruning out of latency critical path (#2384)
* Deferred DAG and fork choice pruning

* fixup

* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSlotEnd for state pruning

* no need to store needPruning in the data structure

* lastPrunePoint is updated in pruning proc

* Split eager and LazyPruning

* enforce pruning in updateHead
2021-03-09 15:36:17 +01:00
Mamy Ratsimbazafy de1060e7f3
centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383) 2021-03-06 08:32:55 +01:00
kdeme 5f750f84b4 Use setupAddress for better IP and ports configuration 2021-03-06 00:09:09 +02:00
Mamy Ratsimbazafy d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
Mamy Ratsimbazafy 5d7f9c3a04
Consensus object pools [reorg 4/5] (#2374)
* Add documentation

* make test doesn't try to build the beacon node :/
2021-03-04 10:13:44 +01:00
Mamy Ratsimbazafy 2f17ac7b64
Move SSZ, deposit_contracts & eth1_monitor [reorg files 3/5] (#2371)
* move deposit_contract

* Move SSZ

* fix ssz import in tests

* move also eth1_monitor

* forgot to delete the original

* fix comma [skip ci]

* Fix "make" & tools imports

* Fix import

* Fix import again

* rename deposit_contract -> eth1

* Revert ssz move to subfolder

* path fixes [skip ci]
2021-03-03 07:23:05 +01:00
Mamy Ratsimbazafy 3276dfc683
Consolidate modules by areas [part 1] (#2365)
* Move sync in subfolder

* move validator related thingies in validators

* fix binary builds

* update bounds comment [skip ci]
2021-03-02 11:27:45 +01:00
tersec 5653b2e13c
more spec v1.0.1 spec ref URL and copyright year updates (#2367) 2021-03-02 06:04:14 +00:00
tersec e661f7d0c7
prevent uint64 to int64-induced RangeError/RangeDefects in metrics (#2358)
* prevent uint64 to int64-induced RangeError/RangeDefects in metrics

* remove redundant min(foo, int64.high)

* adjust spacing to be consistent
2021-03-01 20:55:25 +01:00
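
Note on the commit above: converting an unsigned 64-bit value to a signed 64-bit metric can overflow when the value exceeds the signed maximum. One common remedy is a saturating conversion, sketched here in Python. This is an assumption about the general technique, not the commit's Nim code; `to_metric_value` is a hypothetical name.

```python
INT64_MAX = 2**63 - 1   # largest value a signed 64-bit metric can hold
UINT64_MAX = 2**64 - 1  # largest value of the unsigned source type


def to_metric_value(value: int) -> int:
    """Clamp a uint64-range value into the int64 range instead of raising."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError("value is outside the uint64 range")
    return min(value, INT64_MAX)


small = to_metric_value(12345)          # unchanged
saturated = to_metric_value(UINT64_MAX)  # clamped to INT64_MAX
```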
Jacek Sieka 3e2c0a220c
refactor slot loop (#2355)
* refactor slot loop

* fix attestations being sent out early when _any_ block arrives (as
opposed to the block for the "correct" slot)
* fix attestations being sent out late when block already arrived
* refactor slot processing loop
* shutdown if clock moves backwards significantly
* fix docs

* notify caller whether the block actually arrived
2021-03-01 17:36:06 +01:00
Jacek Sieka 0dbc7162ac startup cleanup
* fix several memory leaks due to temporaries not being reset during
init
* avoid massive main() function with lots of stuff in it
* disable nim-prompt (unused)
* reuse validator pool instance in eth2_processor
* style cleanup
2021-02-22 23:32:54 +02:00
Zahary Karadjov e1d6df1e5d Continue using the V1 Slashing DB by default 2021-02-20 22:46:35 +02:00
Mamy Ratsimbazafy 5daafd480f
Slashing protection updates (#2333)
* Fix slashing protection always trying to migrate at startup
* Add CLI option for dual DB
2021-02-19 17:18:17 +02:00
tersec a3a0df17f8
remove too-aggressive assertion (#2343) 2021-02-19 10:54:47 +00:00
Dustin Brody c7093c4ab5 show next attestation slot & wait time in Slot end log 2021-02-15 22:49:20 +02:00
tersec 5cab17dc1a
database state storage benchmarking via ncli_db (#2312)
* database state storage benchmarking via ncli_db

* more cleanups from immutable validator state branch

* unexport some eth2_network constants and remove unused variables/templates

* make two PeerScore constants public
2021-02-15 17:40:00 +01:00
Mamy Ratsimbazafy 03f47c8f2f
Slashing protection refactor - EIP 3076 (#2094)
* Create CLI tool for slashing export

* Use SQLite as a DB instead of a KV-store

* Keeps v1 and v2 DBs around

* Uses the same schema as Lighthouse v1.1.0

* Passes all interchange tests + skeleton of finalization pruning

* Removes tests that would violate v5 / minimal slashing DB and MinSlot rules

* Migration tool added using low-watermark scheme for faster migration of large number of validators
2021-02-09 17:23:06 +02:00
Giovanni Petrantoni 72b01161c1
populate gossipsub scores (#2091)
* force pushing to fix unstable base

* increase attestation/aggregate queue sizes

when there are many validators, many aggregates and attestations arrive
every slot - increase the queue size a bit - also do batches on each
idle loop iteration since it's fairly quick

* don't score subnets for now

* wrapping up

* refactor and cleanups

* gossip parameters fixes

* comment fix

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-02-09 10:20:55 +01:00
Kim De Mey 73d9c2aa53
Add enr auto update cli option and bump nim-eth (#2278) 2021-02-02 09:07:21 +01:00
tersec 6141286547
rename doppelganger protection to doppelganger detection; switch default from warn to stop (#2281) 2021-02-01 12:18:16 +01:00
Zahary Karadjov fa99c3b417 Fix #2261
Also bumps Confutils to allow setting the hidden --web3-mode param
(to allow testing the eth1 syncing without validators)
2021-01-30 01:32:20 +02:00
Dustin Brody 281853eee8 rename options and internal structures to doppelgangerFoo and remove probing 2021-01-30 00:17:54 +02:00
Kim De Mey 40a5d44887
Fix selection of bootstrap nodes from metadata (#2273)
Also removes again the doubling of max peers
2021-01-29 08:56:02 +01:00
tersec 1bdbf099cc
use IntSet rather than HashSet[ValidatorIndex] (#2267)
* use IntSet rather than HashSet[ValidatorIndex]

* add bounds check before uint64 -> int conversion

* use intsets in block transitions

* remove superfluous Nim issue explanation/reference
2021-01-26 12:52:00 +01:00
Mamy Ratsimbazafy 70a03658e3
Block validation flow v2 + Batch (serial) sig verification (#2250)
* bump nim-blscurve

* Outline the block validation flow

* introduce the SigVerified types, pass the tests

* Split clearance/quarantine to prepare for batch crypto verif

* Add a batch signature collector

* Make clearance use SigVerified block and split verification between crypto and state transition

* Always use signedBeaconBlock for the onBlockAdded callback

* RANDAO signing_root is the epoch instead of the full block

* Support skipping BLS for testing

* Fix compilation of the validator client

* Try to fix strange errors on macOS and Jenkins (Clang, unknown type name br_hmac_drbg_context in stdlib_assertions.nim.c)

* address https://github.com/status-im/nimbus-eth2/pull/2250#discussion_r561819858

* address https://github.com/status-im/nimbus-eth2/pull/2250#discussion_r561828025

* onBlockAdded callback should use TrustedSignedBeaconBlock https://github.com/status-im/nimbus-eth2/pull/2250#discussion_r561837261

* address https://github.com/status-im/nimbus-eth2/pull/2250#discussion_r561828946

* Use the application RNG: https://github.com/status-im/nimbus-eth2/pull/2250#discussion_r561815336

* Improve codegen of conversion (zero-cost)

* Quick fixes with loadWithCache after #2259 (TODO: graceful error since pubkey validation is now done first in signatures_batch)

* Gracefully handle rogue pubkeys and signatures now that those are lazy-loaded
2021-01-25 20:45:48 +02:00
tersec 7d74d3bfbc
only subscribe to subnets when aggregating (#2254)
Only subscribe to subnets when aggregating
2021-01-25 19:39:56 +02:00
Zahary Karadjov 960666d1ed
Remove std/random again 2021-01-21 19:39:04 +02:00
Mamy Ratsimbazafy 718feef802
Fix unstable after #2244 (#2255) 2021-01-21 18:27:24 +01:00
Dustin Brody a16f5afcd5 pre-emptive duplicate validator detection heuristic 2021-01-21 16:03:02 +02:00
tersec 55ecb61c3a cycle attestation subnets every slot (#2240)
Cycle attestation subnets every slot
2021-01-19 19:44:03 +02:00
tersec 0fce8ad0d7
Revert "only checkpoint every four slots (#2236)" (#2242)
This reverts commit 7da16f4908.
2021-01-18 11:02:56 +01:00
Zahary Karadjov c8c819359c
More clear error message when a validator exit was rejected 2021-01-15 19:40:05 +02:00
tersec 7da16f4908
only checkpoint every four slots (#2236)
* only checkpoint every four slots

* only checkpoint every 16 slots

* every 8 slots

* every 4 slots; 8 seems probably okay, but be a bit conservative
2021-01-15 05:23:54 +00:00
Giovanni Petrantoni 295e3c9c73
Topics validation and direct peers (#2237)
* pick the right libp2p branch

* add topics validation
2021-01-15 04:17:06 +00:00
tersec fa75c477cd
only initially subscribe to relevant attestation subnets (#2231) 2021-01-14 09:43:21 +01:00
tersec 0fad1b6b26
don't special-case zero-validator subnet cycling (#2230) 2021-01-12 17:17:43 +01:00
tersec dde973e2d4
allow always-on subscription to all attestation subnets when gossiping (#2225)
* allow always-on subscription to all attestation subnets when gossiping

* in subscribe-all-subnets mode, consider all subnets to be stability subnets for ENR purposes
2021-01-12 13:43:15 +01:00
Giovanni Petrantoni a3a651b565
always enable topic and aggregate metric topics (#2229) 2021-01-12 04:27:09 +01:00
Zahary Karadjov 338428cbd7 Add Eth1 deposits simulation to block_sim 2021-01-04 13:22:00 +02:00
Giovanni Petrantoni ed24f60f70
remove async from sub/unsub (#2197)
* remove await/async from sub/unsub

* fix unsubscribe wrong key (missed _snappy)

* use the right libp2p commit hash

* remove unused async

* fix inspector

* fix subnet calculation in RPC and insert broadcast attestations into node's pool

* unify codepaths to ensure only mostly-checked-to-be-valid attestations enter the pool, even from node's own broadcasts

* update attestation pool tests for new validateAttestation param

Co-authored-by: Dustin Brody <tersec@users.noreply.github.com>
2020-12-24 09:48:52 +01:00
tersec afbaa36ef7
make subnet cycling more robust; use one stability subnet/validator; explicitly represent gossip enabled/disabled (#2201)
* make subnet cycling more robust; use one stability subnet/validator; explicitly represent gossip enabled/disabled

* fix asymmetry in _snappy being used for subscriptions but not unsubscriptions

* remove redundant comment

* minimal RPC and VC support for informing BN of subnets

* create and verify slot signatures in RPC interface and VC

* loosen old slot check

* because Slot + uint64 works but uint64 + Slot doesn't

* document assumptions for head state use; don't clear stability subnets; guard against VC not having checked an epoch ahead, fixing a crash; clarify unsigned comparison

* revert unsub fix
2020-12-22 10:05:36 +01:00
Jacek Sieka 6c8f630170
Revert "have each validator randomly pick a stability subnet, per spec (#2194)"
This reverts commit 048a67d525.

Fails with:
```
Error: unhandled exception: /data/beacon-node-builds/devel-large/repo/beacon_chain/nimbus_beacon_node.nim(442, 12) `node.attestationSubnets.stabilitySubnets.len == 0`  [AssertionError]
```
2020-12-18 23:04:31 +01:00
Jacek Sieka 0f8a3a5ae8
checkpoint database at end of each slot (#2195)
* checkpoint database at end of each slot

To avoid spending time on synchronizing with the file system while doing
processing, the manual checkpointing mode turns off fsync during
processing and instead checkpoints the database when the slot has ended.

From an sqlite perspective, in WAL mode this guarantees database
consistency but may lead to data loss which is fine - anything missing
from the beacon chain database can be recovered on the next startup.

* log sync status and delay in slot start message

* bump
2020-12-18 22:01:24 +01:00
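
Note on the commit above: the message describes turning off fsync during processing and checkpointing the WAL once the slot has ended. A minimal Python sketch of that pattern with the standard `sqlite3` module follows. The specific pragmas (`synchronous=OFF`, `wal_autocheckpoint=0`) are assumptions about how such a mode could be configured, not settings taken from the commit, and `open_db` and `on_slot_end` are hypothetical names.

```python
import os
import sqlite3
import tempfile


def open_db(path: str) -> sqlite3.Connection:
    """Open a database in WAL mode with automatic syncing and checkpoints off."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")      # write-ahead logging
    conn.execute("PRAGMA synchronous=OFF")       # assumed: no fsync while processing
    conn.execute("PRAGMA wal_autocheckpoint=0")  # assumed: checkpoint only on request
    return conn


def on_slot_end(conn: sqlite3.Connection) -> tuple:
    """Checkpoint the WAL into the main database file; returns (busy, log, checkpointed)."""
    return conn.execute("PRAGMA wal_checkpoint(FULL)").fetchone()


with tempfile.TemporaryDirectory() as tmp:
    conn = open_db(os.path.join(tmp, "beacon.sqlite3"))
    conn.execute("CREATE TABLE blocks (slot INTEGER PRIMARY KEY, root TEXT)")
    conn.execute("INSERT INTO blocks VALUES (1, 'abc')")
    conn.commit()                 # the write lands in the WAL file
    result = on_slot_end(conn)    # moved into the main file at slot end
    conn.close()
```

The trade-off is the one the commit message names: a crash between checkpoints can lose recent writes, which is acceptable only when the lost data can be recovered some other way.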
tersec 048a67d525
have each validator randomly pick a stability subnet, per spec (#2194) 2020-12-18 15:46:07 +01:00
Kim De Mey 8cc7effe52
Fix ENR attnets update to only hold persistent subnets (#2193)
* Fix ENR attnets update to only hold persistent subnets

* Use only stability subnet in metadata and enr
2020-12-18 09:50:29 +01:00
Zahary Karadjov 7d95e86c50
Merge branch 'stable' into devel 2020-12-16 22:22:21 +02:00
Jacek Sieka 5d8cdb88c6
update validator metrics on startup 2020-12-16 20:44:48 +02:00
Jacek Sieka de779c7812 update validator metrics on startup 2020-12-16 19:42:19 +02:00
Jacek Sieka 7d5edb4353
use new stew helpers for assignment (#2172)
* bump libp2p (reduces libp2p gossip memory usage to ~1/3)
* use "generic" assign version
2020-12-16 09:37:22 +01:00
Zahary Karadjov 8ebf9c30b0
More complete reset of the web3 provider on each failure; Fix #2184 2020-12-16 00:21:11 +02:00
Ștefan Talpalaru 9daf6be73c
graceful exit on SIGTERM (#2178)
Much easier than convincing all users to change the default signal in
their service definition file to SIGINT.
2020-12-14 16:45:14 +00:00
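
Note on the commit above: treating SIGTERM the same as SIGINT lets service managers stop the process with their default signal. A minimal Python sketch of the idea follows. It is not the client's Nim implementation; `request_shutdown` and `install_shutdown_handlers` are hypothetical names.

```python
import signal
import threading

shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record the request so the main loop can exit cleanly."""
    shutdown_requested.set()


def install_shutdown_handlers():
    """Route both SIGINT (Ctrl+C) and SIGTERM to the same graceful path.

    Must be called from the main thread.
    """
    signal.signal(signal.SIGINT, request_shutdown)
    signal.signal(signal.SIGTERM, request_shutdown)


install_shutdown_handlers()
```

A main loop would then poll `shutdown_requested.is_set()` between units of work and run its cleanup before exiting.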
Jacek Sieka bc977799f6 Log warning when running without metrics support 2020-12-10 17:22:29 +02:00
Zahary Karadjov 983b3c9fbf Add a 'web3 test' command for verifying the compatibility of a web3 provider 2020-12-10 02:54:58 +02:00
Kim De Mey 0ec90b26a5
Update ENR record with metadata attnets at each attestation subnet cycle (#2148) 2020-12-09 10:13:51 +01:00
tersec 8b8b25ddac
always check whether gossip should be enabled in onslotstart (#2162) 2020-12-08 18:11:54 +01:00
Dustin Brody 32a18769e6 remove waitFor in attestation subnet cycling 2020-12-07 14:48:04 +02:00
Ștefan Talpalaru be107df7f1 status bar: actually display it every second, after updating its data 2020-12-03 11:41:40 +02:00
Zahary Karadjov 4feb0a308e Fix #2125 (ETH status bar display); Bump LibP2P 2020-12-02 00:03:59 +02:00
Dustin Brody 68c91d1d1b don't wait until after the first slot to enable gossip 2020-12-01 15:39:03 +02:00
Zahary Karadjov 38f7558e50 Work around a strange codegen issue to fix local sim in CI; Bump LibP2P 2020-12-01 15:38:00 +02:00
Zahary Karadjov 3bdda3dd46
Hotfix: use the mainnet bootstrap nodes without specifying --network=mainnet explicitly 2020-12-01 10:44:30 +02:00
Zahary Karadjov 4328576e18
Hotfix: 'deposits import' was ignoring its arguments in Linux builds 2020-12-01 00:59:57 +02:00
Zahary Karadjov ac9bdde543
Don't rely on a metric value for the ETH display in the status bar 2020-11-29 23:35:39 +02:00
zah cabb07a186 Apply suggestions from code review
Co-authored-by: Sacha Saint-Leger <sacha@status.im>
2020-11-29 23:08:07 +02:00
Zahary Karadjov 3c0dfc2fbe Implement the 'deposits exit' command; Remove 'deposits create' 2020-11-29 23:08:07 +02:00
Zahary Karadjov ae19ab72c0 Implement #2067 2020-11-29 18:27:26 +02:00
Zahary Karadjov bf2673abc4
Restore the ETH display in the status bar 2020-11-28 20:53:51 +02:00
Jacek Sieka e7f2735271
fix broken metrics during replay (#2090)
* move metrics out of state transition
* add validator count metric
* remove expensive beacon_current_validators, beacon_previous_validators
metrics (they should be reimplemented with cache), add cheap
beacon_active_validators to approximate
* remove unused validator count metrics
* tidy imports/defects
2020-11-27 23:16:13 +01:00
tersec 2421d338c1
node.network.metadata.attnets is only secondary source of truth (#2089) 2020-11-27 15:54:13 +01:00
kdeme e69b5ff473 Add a record create and print command 2020-11-25 18:32:59 +02:00
Zahary Karadjov 3594fa2a22
Version 1.0.0-rc1 2020-11-25 03:13:58 +02:00
Zahary Karadjov b9e4fef616
Check that the selected data dir is compatible with the selected network 2020-11-25 03:13:58 +02:00
zah 372c9b798c
Fix the corrupted database state on Pyrmont nodes; Add mainnet genesis (#2056)
* Handle some web3 timeouts better

* Add support for developer .env files

* Eth1 improvements; Mainnet genesis state

Notable changes:

* The deposits table has been removed from the database. The client
  will no longer process all deposits on start-up.

* The network metadata now includes a "state snapshot" of the deposit
  contract. This allows the client to skip syncing deposits made prior
  to the snapshot (i.e. genesis). Suitable metadata added for Pyrmont
  and Mainnet.

* The Eth1 monitor won't be started unless there are validators attached
  to the node.

* The genesis detection code is now optional and disabled by default

* Bugfix: The client should not produce blocks that will fail validation
  when it hasn't downloaded the latest deposits yet

* Bugfix: Work around the database corruption affecting Pyrmont nodes

* Remove metadata for Toledo and Medalla
2020-11-24 22:21:47 +01:00
tersec 54c388b7b4
close slashing protection database (#2050) 2020-11-20 14:23:55 +01:00
Jacek Sieka a6b188bfd4
misc fixes (#2027)
* log when database is loading (to avoid confusion)
* generate network keys later during startup
* fix quarantine not scheduling chain of parents for download and
increase size to one epoch
* log validator count, enr and peerid more clearly on startup
2020-11-16 20:15:43 +01:00
tersec 21c4ce8fd4
remove superfluous TODOs/not-really-TODOs, type conversion, imports (#2025) 2020-11-16 17:10:51 +01:00
tersec e2f161dbf7
fix attestation sending schedules to avoid timing attack (#1995) 2020-11-16 10:44:18 +01:00
Zahary Karadjov b022dc4d1f Use O(n) algorithm in initialize_beacon_state_from_eth1; Avoid unnecessary merkle proofs generation 2020-11-15 21:40:40 +02:00
Zahary Karadjov 17d35e1fd9 Allow the node to start when it fails to initialize the Eth1 monitor
* Avoid hangs when wss:// is specified for a non-secure HTTP server
* Produce an ERROR when the web3 provider is unsupported, but still launch the node
2020-11-12 22:29:43 +02:00
Zahary Karadjov 389c11743a
Review TODO items and self-assign the most important ones 2020-11-10 20:41:04 +02:00
tersec 271df8b604
bump 1.0.0rc-0 spec refs to 1.0.0 (#1974) 2020-11-09 14:18:55 +00:00
Zahary Karadjov e9b9cd75ee Rename binaries; Mimic the original repo layout in the distribution 2020-11-09 11:38:52 +02:00