Commit Graph

767 Commits

Author SHA1 Message Date
Eugene Kabanov 5b5ea2e813
Fix integer overflow issue in sync_manager. (#2564)
* Make Refactor rewind point assignment more concrete.

* Fix overflow issue in getRewindPoint().
Add tests.
2021-05-18 12:25:14 +02:00
Jacek Sieka 97f4e1fffe
Db1 cont (#2573)
* Revert "Revert "Upgrade database schema" (#2570)"

This reverts commit 6057c2ffb4.

* ssz: fix loading empty lists into existing instances

Not a problem earlier because we didn't reuse instances

* bump nim-eth

* bump nim-web3
2021-05-17 18:37:26 +02:00
tersec 6057c2ffb4
Revert "Upgrade database schema" (#2570)
This reverts commit 22ddf74752.
2021-05-17 06:34:44 +00:00
Jacek Sieka 22ddf74752 Upgrade database schema
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.

With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.

* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
2021-05-14 20:05:23 +03:00
Jacek Sieka 0a477eb2d3
fix subnet logic (#2555)
This PR decreases the lead subscription time which should help
decrease bandwidth usage and CPU making the subscription for future
aggregation happen a bit later. There's room for more tuning here,
probably.

* fix missing negation from in #2550
* fix silly bitarray issues
* decrease subnet lead subscription time
* log all subnet switching source data
* rename subnet trackers to refer to stability and aggregate subnets
* more tests
2021-05-11 22:03:40 +02:00
Mamy Ratsimbazafy e6b559a35a
Slashing db pruning [Merge only after v2 has been default for 1 noticeable release] (#2452)
* Enable slashing DB pruning

* integrate slashing DB pruning with onSlotEnd

* rebase tests
2021-05-10 16:32:28 +02:00
Jacek Sieka 867d8f3223
Perform attestation check before broadcast (#2550)
Currently, we have a bit of a convoluted flow where when sending
attestations, we start broadcasting them over gossip then pass them to
the attestation validation to include them in the local attestation pool
- it should be the other way around: we should be checking attestations
_before_ gossipping them - this serves as an additional safety net to
ensure that we don't publish junk - this becomes more important when
publishing attestations from the API.

Also, the REST API was performing its own validation meaning
attestations coming from REST would be validated twice - finally, the
JSON RPC wasn't pre-validating and would happily broadcast invalid
attestations.

* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
2021-05-10 09:13:36 +02:00
Jacek Sieka 646923c3dd
add attestation stats tool to ncli_db (#2539)
This also makes future efforts to provide metrics and logs for
attestation efficiency easier

* Export rewards from epoch transition
* Use less memory for reward calculation (bool -> set[enum], field
alignment)
* Reuse reward memory when replaying, avoiding spike
* Allow replaying any range in ncli_db benchmark
2021-05-07 13:36:21 +02:00
Jacek Sieka 427c0f307c
avoid extraneous hash root calculation (#2537)
When applying a block, we'll currently compute a state root for the
state after slot processing but before block processing - this is
unnecessary when a block is being applied because the intermediate state
root is never observed.
2021-05-05 08:54:21 +02:00
Jacek Sieka efdf759cc0
avoid some slashing protection queries (#2528)
This PR reduces the number of database queries for slashing protection
from 5 reads and 1 write to 2 reads and 1 write in the optimistic case.

In the process, it removes user-level support for writing the database
in the version 1 format in order to simplify the code flow, and prevent
code rot. In particular, the v1 format was not covered by any unit tests
and has no advantages over v2. The concrete code to read and write it
remains for now, in particular to support upgrades from v1 to v2.

The branch also removes the use of concepts which doesn't work with
checked exceptions - in particular, this highlights code that both
raises exceptions and returns error codes, which could be cleaned up in
the future.

* Cache internal validator ID
* Rely on unique index to check for trivial duplicate votes
* Combine two surround vote queries into one
* Combine API for checking and registering slashing into single function

The slashing DB is normally not a bottleneck, but may become one with
high attached validator counts.
2021-05-04 15:17:28 +02:00
tersec e0f4d28116
rename initialize_beacon_state to initialize_beacon_state_from_eth1 (#2536) 2021-05-04 12:19:11 +02:00
Jacek Sieka ce49da6c0a
Introduce unittest2 and junit reports (#2522)
* Introduce unittest2 and junit reports

* fix XML path

* don't combine multiple CI runs

* fixup

* public combined report also

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2021-04-28 18:41:02 +02:00
Dustin Brody 7f42d38219 rename initialize_beacon_state{_from_eth1,}; suppress warnings when doppelganger detection disabled 2021-04-28 00:12:41 +03:00
Eugene Kabanov 0c3302b826
REST API test framework and tests. (#2492)
* REST API test framework and tests.

* Fix ValidatorIndex tests to properly handle int32, but not uint32 values.

* Fix tests to follow latest REST fixes.

* refactor restapi.sh

and add it to the test suite

* Fix issues.
Add delay timeout which is required.

* Fix restapi.sh script for Windows.

Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
2021-04-27 23:46:24 +03:00
Jacek Sieka 7dba1b37dd
remove attestation/aggregate queue (#2519)
With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.

* add successfully validated attestations to attestion pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, specially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
2021-04-26 22:39:44 +02:00
tersec 99fccaee6e
more abstraction over BeaconState (#2509)
* more abstraction over BeaconState

* use HashedBeaconState copy of htr
2021-04-16 08:49:37 +00:00
Jacek Sieka 4ed2e34a9e Revamp attestation pool
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.

The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.

Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.

A basic greed scoring mechanism is used to select attestations for
blocks - attestations are added based on how much many new votes they
bring to the table.

* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes #2490)
2021-04-13 20:24:02 +03:00
tersec 79bb0d5379
only deserialize attestation and aggregation gossiped signatures once (#2472)
* only deserialize attestation and aggregation gossiped signatures once

* re-indent some aggregate checks into block scope

* spelling

* remove debugging assertion

* put part of gossip validation back into block context

* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free

* remove more development doAsserts; don't exportRaw in loops
2021-04-09 14:59:24 +02:00
Zahary Karadjov 9776fbfe17
Merge branch 'version-1.1.0' into unstable 2021-04-08 20:50:06 +03:00
Zahary Karadjov 61669f269d Clarify that the deposit cretion tools are intended for testnets 2021-04-08 19:58:22 +03:00
Jacek Sieka 7165e0ac31
Reset cached indices when resetting cache on SSZ read (#2480)
* Reset cached indices when resetting cache on SSZ read

When deserializing into an existing structure, the cache should be
cleared - goes for json also. Also improve error messages.
2021-04-08 13:11:04 +03:00
Jacek Sieka beceb060c4
Write state diffs to separate table (and experimentally, files instead of db) (#2460) 2021-04-06 21:56:45 +03:00
Mamy Ratsimbazafy 6b13cdce36
Batch attestations (#2439)
* batch attestations

* Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518

* Try to remove the processing loop to no avail :/

* batch aggregates

* use resultsBuffer size for triggering deadline schedule

* pass attestation pool tests

* Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518)

* Put logging at debug level, add speed info

* remove unnecessary batch info when it is known to be one

* downgrade some logs to trace level

* better comments [skip ci]

* Address most review comments

* only use ref for async proc

* fix exceptions in eth2_network

* update async exceptions in gossip_validation

* eth2_network 2nd pass

* change to sleepAsync

* Update beacon_chain/gossip_processing/batch_validation.nim

Co-authored-by: Jacek Sieka <jacek@status.im>

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-04-02 16:36:43 +02:00
Jacek Sieka f821bc878e
Remove `-d:insecure` compile option (#2468)
With metrics running on top of chronos, the metrics server no longer
needs to be compiled in conditionally - it remains disabled by default.
2021-04-01 14:44:11 +02:00
tersec 49a5667288
update some v1.1.0 alpha1 to alpha2 (#2457)
* update some v1.1.0 alpha1 to alpha2

* remove unused getDepositMessage overload and move other out of datatypes/base

* bump nim-eth2-scenarios to download v1.1.0-alpha.2 test vectors

* construct object rather than result
2021-03-29 19:17:48 +00:00
Zahary Karadjov 2eacfc4685 Bump modules to take advantage of the new Json format flavors support
Since quite a lot of additional procs were now compiled as generics, this lead to compiler bugs that had
to be worked-around:

* The `Domain` type was renamed to `Eth2Domain` to avoid compilation errors
  due to conflicts with `nativesockets.Domain`.
  Similarly, `eth2_network.KeyPair` was renamed to `NetKeyPair`.

* A new more robust version of `hexToByteArray` was added to stew
2021-03-25 09:37:35 +02:00
Kim De Mey a4a2c1c0e1
Add discovery query with forkId and attnets filter (#2420) 2021-03-24 11:48:53 +01:00
tersec 97850741a0
remove unused imports in slashing prtection (#2436) 2021-03-19 08:26:02 +00:00
Jacek Sieka 3cb31e66b4
set upper bound on EpochRef cache (#2403)
* set upper bound on EpochRef cache

* max 32 EpochRef instances
* less memory waste in BlockRef by removing EpochRef seq that is mostly
unused (~20mb)
* less memory waste in dag block lookup by not keeping an extra copy of
digest (~70mb)
* fix `==` and `$` for Eth2Digest
* remove `ChainDAG.tmpState` (~50mb?)

all in all, this branch cuts mainnet memory usage by ~160-180mb and puts
limits on EpochRef cache usage - where normally it hovered around 950mb
before, it's now sitting at 600-700mb on my machine.

* docs
2021-03-17 11:17:15 +01:00
Kim De Mey 5e3770e994
Remove unused test (#2424) 2021-03-17 06:36:48 +00:00
tersec 8def2486b0
immutable validator database factoring (#2297)
* initial immutable validator database factoring

* remove changes from chain_dag: this abstraction properly belongs in beacon_chain_db

* add merging mutable/immutable validator portions; individually test database roundtripping of immutable validators and states-sans-immutable-validators

* update test summaries

* use stew/assign2 instead of Nim assignment

* add reading/writing of immutable validators in chaindag

* remove unused import

* replace chunked k/v store of immutable validators with per-row SQL table storage

* use List instead of HashList

* un-stub some ncli_db code so that it uses

* switch HashArray to array; move BeaconStateNoImmutableValidators from datatypes to beacon_chain_db

* begin only-mutable-part state storage

* uncomment some assigns

* work around https://github.com/nim-lang/Nim/issues/17253

* fix most of the issues/oversights; local sim runs again

* fix test suite by adding missing beaconstate field to copy function

* have ncli bench also store immutable validators

* extract some immutable-validator-specific code from the beacon chain db module

* add more rigorous database state roundtripping, with changing validator sets

* adjust ncli_db to use new schema

* simplify putState/getState by moving all immutable validator accounting into beacon state DB

* remove redundant test case and move code to immutable-beacon-chain module

* more efficient, but still brute-force, mutable+immutable validator merging

* reuse BeaconState in getState

* ensure HashList/HashArray caches are cleared when reusing getState buffers; add ncli_db and a unit test to verify this

* HashList.clear() -> HashList.clearCache()

* only copy incrementally necessary immutable validators

* increase strictness of test cases and fix/work around resulting HashList cache invalidation issues

* remove explanatory scaffolding

* allow for storage of full (with all validators) states for backwards/forwards-compatibility

* adjust DbSeq type usage

* store full, with-validators, state every 64 epochs to enable reverting versions

* reduce memory allocation and intermediate objects in state storage codepath

* eliminate allocation/copying through intermediate BeaconStateNoImmutableValidators objects

* skip benchmarking initial genesis-validator-heavy state store

* always store new-style state and sometimes old-style state

* document intent behind BeaconState/Validator type-punnery

* more accurate failure message on SQLite in-memory database initialization failure
2021-03-15 14:11:51 +00:00
tersec dfc3322fe2
last few v1.0.0 spec refs to v1.0.1 (#2404) 2021-03-13 20:51:39 +00:00
Dustin Brody 97504fdb9d ncli_db pruneDatabase checkpointing; remove onSlotEnd lookaheadTime 2021-03-12 23:15:46 +02:00
Mamy Ratsimbazafy c47d636cb3
Split Eth2Processor in prep for batching (#2396)
* Split Eth2Processor in gossip and consensus part and materialize the shared block queue

* Update initialization in test_sync_manager
2021-03-11 11:10:57 +01:00
tersec ef4a5b0cc3
remove delta-encoding from state diff balances (#2397)
* remove delta-encoding from state diff balances

* switch HashList to List
2021-03-11 05:39:04 +00:00
Mamy Ratsimbazafy 8e28a05cea
Move pruning out of latency critical path (#2384)
* Deferred DAG and fork choice pruning

* fixup

* Address https://github.com/status-im/nimbus-eth2/pull/2384/files#r589448448, rely only on onSLotEnd for state pruning

* no need to store needPruning in the data structure

* lastPrunePoint is updated in pruning proc

* Split eager and LazyPruning

* enforce pruning in updateHead
2021-03-09 15:36:17 +01:00
Mamy Ratsimbazafy de1060e7f3
centralize p2p validation in a single file and address https://github.com/status-im/nimbus-eth2/pull/2377#issuecomment-791313118 (#2383) 2021-03-06 08:32:55 +01:00
Mamy Ratsimbazafy d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
Mamy Ratsimbazafy 5d7f9c3a04
Consensus object pools [reorg 4/5] (#2374)
* Add documentation

* make test doesn't try to build the beacon node :/
2021-03-04 10:13:44 +01:00
tersec 4278e80657
document two uint64 -> int64 conversions (#2375)
* document two uint64 -> int64 conversions

* fix minimal preset slot time & calculation
2021-03-04 10:13:23 +01:00
Mamy Ratsimbazafy 2f17ac7b64
Move SSZ, deposit_contracts & eth1_monitor [reorg files 3/5] (#2371)
* move deposit_contract

* Move SSZ

* fix ssz import in tests

* move also eth1_monitor

* forgot to delete the original

* fix comma [skip ci]

* Fix "make" & tools imports

* Fix import

* Fix import again

* rename deposit_contract -> eth1

* Revert ssz move to subfolder

* path fixes [skip ci]
2021-03-03 07:23:05 +01:00
Mamy Ratsimbazafy 3276dfc683
Consolidate modules by areas [part 1] (#2365)
* Move sync in subfolder

* move validator related thingies in validators

* fix binary builds

* update bounds comment [skip ci]
2021-03-02 11:27:45 +01:00
Jacek Sieka 3f8764ee61
fix replays stalling processing (#2361)
* fix replays stalling processing

Occasionally, attestations will arrive that vote for a target derived
either from the finalized block or earlier. In these cases, Nimbus would
replay the state transition of up to 32 epochs worth of blocks because
the finalized state has been pruned, delaying other processing and
leading to poor inclusion distance.

* put cheap attestation checks before forming EpochRef
* check that attestation target is not from an unviable history with
regards to finalization
* fix overly aggressive state pruning removing the state close to the
finalized checkpoint resulting in rare long replays for valid
attestations
* log long replays
* harden logging and traversal of nil BlockSlot

* simplify target check

no need to lookup target in chain dag again

* fixup

* fixup
2021-03-01 20:50:43 +01:00
tersec 97f7284e51
bump spec refs from v1.0.0 to v1.0.1 and update copyright years (#2357) 2021-02-25 13:37:22 +00:00
Ștefan Talpalaru 22620afa8f bump NimYAML for Nim-1.4 compatibility 2021-02-25 14:20:26 +02:00
Dustin Brody c7093c4ab5 show next attestation slot & wait time in Slot end log 2021-02-15 22:49:20 +02:00
tersec bdd2ec03e3
fix slashing db interchange test compile warnings (#2311) 2021-02-11 13:54:54 +01:00
Mamy Ratsimbazafy 03f47c8f2f
Slashing protection refactor - EIP 3076 (#2094)
* Create CLI tool for slashing export

* Use SQLite as a DB instead of a KV-store

* Keeps v1 and v2 DBs around

* Uses the same schema as Lighthouse v1.1.0

* Passes all interchange tests + skeleton of finalization pruning

* Removes tests that would violate v5 / minimal slashing DB and MinSlot rules

* Migration tool added using low-watermark scheme for faster migration of large number of validators
2021-02-09 17:23:06 +02:00
tersec 8d25663681
remove several IntSet usages in lieu of seq[ValidatorIndex] (#2288)
* remove several IntSet usages in lieu of seq[ValidatorIndex]

* convert smaller types to larger types

* larger type, again
2021-02-08 08:27:30 +01:00
Dustin Brody 67e4a045a3 simplify doppelganger detection to boolean 2021-02-03 20:55:33 +02:00