Commit Graph

757 Commits

Author SHA1 Message Date
tersec 6057c2ffb4
Revert "Upgrade database schema" (#2570)
This reverts commit 22ddf74752.
2021-05-17 06:34:44 +00:00
Jacek Sieka 22ddf74752 Upgrade database schema
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.

With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.

* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
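
The "separate tables for each data type" idea above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`blocks`, `summaries`, `root`, `slot`, `parent`) and the helper functions are hypothetical and are not taken from the actual schema; the sketch only shows ordinary rowid tables (no `WITHOUT ROWID`) and loading all summaries with a single query.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a database with one table per data type (hypothetical schema)."""
    db = sqlite3.connect(path)
    # Ordinary rowid tables: large blobs live in their own table.
    db.execute(
        "CREATE TABLE IF NOT EXISTS blocks "
        "(root BLOB PRIMARY KEY, data BLOB NOT NULL)"
    )
    # Small, uncompressed summaries are kept apart from the large blobs.
    db.execute(
        "CREATE TABLE IF NOT EXISTS summaries "
        "(root BLOB PRIMARY KEY, slot INTEGER NOT NULL, parent BLOB NOT NULL)"
    )
    return db


def put_block(db, root, slot, parent, data):
    """Store a block and its summary in one transaction."""
    with db:
        db.execute("INSERT OR REPLACE INTO blocks VALUES (?, ?)", (root, data))
        db.execute(
            "INSERT OR REPLACE INTO summaries VALUES (?, ?, ?)",
            (root, slot, parent),
        )


def load_summaries(db):
    """Load every summary with a single query, ordered by slot."""
    return db.execute(
        "SELECT root, slot, parent FROM summaries ORDER BY slot"
    ).fetchall()
```

Because the summaries table holds only small rows, reading all of it at startup does not touch the pages that hold the large block blobs.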
2021-05-14 20:05:23 +03:00
Jacek Sieka 867d8f3223
Perform attestation check before broadcast (#2550)
Currently, we have a bit of a convoluted flow where when sending
attestations, we start broadcasting them over gossip then pass them to
the attestation validation to include them in the local attestation pool
- it should be the other way around: we should be checking attestations
_before_ gossipping them - this serves as an additional safety net to
ensure that we don't publish junk - this becomes more important when
publishing attestations from the API.

Also, the REST API was performing its own validation, meaning
attestations coming from REST would be validated twice. Finally, the
JSON RPC wasn't pre-validating and would happily broadcast invalid
attestations.

* Unified attestation production pipeline with the same flow for gossip,
locally and API-produced attestations: all are now validated and entered
into the pool, then broadcast/republished
* Refactor subnet handling with specific SubnetId alias, streamlining
where subnets are computed, avoiding the need to pass around the number
of active validators
* Move some of the subnet handling code to eth2_network
* Use BitArray throughout for subnet handling
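
The unified flow described above (validate, add to the pool, then broadcast) might look roughly like the following sketch. `AttestationPipeline`, `validate`, and `broadcast` are hypothetical names used for illustration only and do not correspond to identifiers in the codebase.

```python
class AttestationPipeline:
    """Single entry point for gossip, local and API-produced attestations."""

    def __init__(self, validate, broadcast):
        self.validate = validate    # callable: attestation -> bool (assumed)
        self.broadcast = broadcast  # callable: attestation -> None (assumed)
        self.pool = []

    def submit(self, attestation):
        # Check first, so that junk is never published.
        if not self.validate(attestation):
            return False
        # Only validated attestations enter the local pool...
        self.pool.append(attestation)
        # ...and only then are they broadcast or republished.
        self.broadcast(attestation)
        return True
```

Since every source goes through the same `submit` call, an attestation is validated exactly once regardless of where it came from.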
2021-05-10 09:13:36 +02:00
Jacek Sieka 646923c3dd
add attestation stats tool to ncli_db (#2539)
This also makes future efforts to provide metrics and logs for
attestation efficiency easier

* Export rewards from epoch transition
* Use less memory for reward calculation (bool -> set[enum], field
alignment)
* Reuse reward memory when replaying, avoiding spike
* Allow replaying any range in ncli_db benchmark
2021-05-07 13:36:21 +02:00
tersec dd43a2c3b0
bump nim-eth2-scenarios to get merge SSZ test vectors (#2541) 2021-05-05 15:35:36 +00:00
Jacek Sieka 427c0f307c
avoid extraneous hash root calculation (#2537)
When applying a block, we'll currently compute a state root for the
state after slot processing but before block processing - this is
unnecessary when a block is being applied because the intermediate state
root is never observed.
2021-05-05 08:54:21 +02:00
Jacek Sieka 4d74c742da
move ENRForkID into `spec` (#2538)
* move ENRForkID into `spec`

also get rid of strformat in topic formation and fix some case
discrepancies

* also move `Eth2Metadata`
2021-05-04 17:28:48 +02:00
tersec 290b889ce6
non-intrusive, novel portions of merge (#2535) 2021-05-04 11:54:19 +00:00
tersec e0f4d28116
rename initialize_beacon_state to initialize_beacon_state_from_eth1 (#2536) 2021-05-04 12:19:11 +02:00
Dustin Brody 7f42d38219 rename initialize_beacon_state{_from_eth1,}; suppress warnings when doppelganger detection disabled 2021-04-28 00:12:41 +03:00
Jacek Sieka 7dba1b37dd
remove attestation/aggregate queue (#2519)
With the introduction of batching and lazy attestation aggregation, it
no longer makes sense to enqueue attestations between the signature
check and adding them to the attestation pool - this only takes up
valuable CPU without any real benefit.

* add successfully validated attestations to attestation pool directly
* avoid copying participant list around for single-vote attestations,
pass single validator index instead
* release decompressed gossip memory earlier, especially during async
message validation
* use cooked signatures in a few more places to avoid reloads and errors
* remove some Defect-raising versions of signature-loading
* release decompressed data memory before validating message
2021-04-26 22:39:44 +02:00
Jacek Sieka f1f424cc2d attestation processing speedups
* avoid creating indexed attestation just to check signatures - above
all, don't create it when not checking signatures ;)
* avoid pointer op when adding attestation to pool
* better iterator for yielding attestations
* add metric / log for attestation packing time
2021-04-14 21:51:17 +03:00
Jacek Sieka 4ed2e34a9e Revamp attestation pool
This is a revamp of the attestation pool that cleans up several aspects
of attestation processing as the network grows larger and block space
becomes more precious.

The aim is to better exploit the divide between attestation subnets and
aggregations by keeping the two kinds separate until it's time to either
produce a block or aggregate. This means we're no longer eagerly
combining single-vote attestations, but rather wait until the last
moment, and then try to add singles to all aggregates, including those
coming from the network.

Importantly, the branch improves on poor aggregate quality and poor
attestation packing in cases where block space is running out.

A basic greedy scoring mechanism is used to select attestations for
blocks - attestations are added based on how many new votes they
bring to the table.

* Collect single-vote attestations separately and store these until it's
time to make aggregates
* Create aggregates based on single-vote attestations
* Select _best_ aggregate rather than _first_ aggregate when on
aggregation duty
* Top up all aggregates with singles when it's time to make the attestation
cut, thus improving the chances of grabbing the best aggregates out
there
* Improve aggregation test coverage
* Improve bitseq operations
* Simplify aggregate signature creation
* Make attestation cache temporary instead of storing it in attestation
pool - most of the time, blocks are not being produced, no need to keep
the data around
* Remove redundant aggregate storage that was used only for RPC
* Use tables to avoid some linear seeks when looking up attestation data
* Fix long cleanup on large slot jumps
* Avoid some pointers
* Speed up iterating all attestations for a slot (fixes #2490)
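
The scoring idea described above (pick attestations by how many new votes they contribute) can be sketched as a greedy set-cover loop. This is an illustrative sketch only: `pick_attestations` is a hypothetical name, and representing each aggregate as a set of validator indices is an assumption, not the data structure used in the pool.

```python
def pick_attestations(candidates, max_attestations):
    """Greedily pick aggregates that add the most not-yet-covered votes.

    candidates: list of frozensets of validator indices (assumed shape).
    Returns the chosen aggregates and the set of votes they cover.
    """
    covered = set()
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < max_attestations:
        # Score each candidate by the number of votes it would newly add.
        best = max(remaining, key=lambda votes: len(votes - covered))
        if not (best - covered):
            break  # nothing left adds a new vote
        chosen.append(best)
        covered |= best
        remaining.remove(best)
    return chosen, covered
```

Scores are recomputed after each pick, so an aggregate that overlaps heavily with what is already chosen drops in priority as block space fills up.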
2021-04-13 20:24:02 +03:00
cheatfate 80e79aef97 Add TODO comments for missing implementations.
Change default REST port to use 5052 (Lighthouse).
Add missing checks for maximum amount of validator ids.
2021-04-09 21:42:13 +03:00
cheatfate 9de65fa293 Fixing issues after bump. 2021-04-09 21:42:13 +03:00
cheatfate c4d891f583 Fix sync_manager.nim to return proper status.
Bump REST API dependencies.
2021-04-09 21:42:13 +03:00
tersec 79bb0d5379
only deserialize attestation and aggregation gossiped signatures once (#2472)
* only deserialize attestation and aggregation gossiped signatures once

* re-indent some aggregate checks into block scope

* spelling

* remove debugging assertion

* put part of gossip validation back into block context

* attestation pool test signature loading isn't so unsafe, and exportRaw isn't free

* remove more development doAsserts; don't exportRaw in loops
2021-04-09 14:59:24 +02:00
Zahary Karadjov 9776fbfe17
Merge branch 'version-1.1.0' into unstable 2021-04-08 20:50:06 +03:00
Jacek Sieka 7165e0ac31
Reset cached indices when resetting cache on SSZ read (#2480)
* Reset cached indices when resetting cache on SSZ read

When deserializing into an existing structure, the cache should be
cleared - goes for json also. Also improve error messages.
2021-04-08 13:11:04 +03:00
tersec 8d7792e6e9
add Altair domains and participation flags; clean up imports (#2462) 2021-04-04 16:24:45 +00:00
Mamy Ratsimbazafy 6b13cdce36
Batch attestations (#2439)
* batch attestations

* Fixes (but now need to investigate the chronos 0 .. 4095 crash similar to https://github.com/status-im/nimbus-eth2/issues/1518)

* Try to remove the processing loop to no avail :/

* batch aggregates

* use resultsBuffer size for triggering deadline schedule

* pass attestation pool tests

* Introduce async gossip validators. May fix the 4096 bug (reentrancy issue?) (similar to sync unknown blocks #1518)

* Put logging at debug level, add speed info

* remove unnecessary batch info when it is known to be one

* downgrade some logs to trace level

* better comments [skip ci]

* Address most review comments

* only use ref for async proc

* fix exceptions in eth2_network

* update async exceptions in gossip_validation

* eth2_network 2nd pass

* change to sleepAsync

* Update beacon_chain/gossip_processing/batch_validation.nim

Co-authored-by: Jacek Sieka <jacek@status.im>
2021-04-02 16:36:43 +02:00
tersec bd8b60f8c8
use epochref for get_committee_assignments() to avoid repeated shuffling (#2463)
* use epochref for get_committee_assignments() to avoid repeated shuffling

* remove unnecessary imports and StateCache() construction
2021-03-30 15:01:47 +00:00
tersec 49a5667288
update some v1.1.0 alpha1 to alpha2 (#2457)
* update some v1.1.0 alpha1 to alpha2

* remove unused getDepositMessage overload and move other out of datatypes/base

* bump nim-eth2-scenarios to download v1.1.0-alpha.2 test vectors

* construct object rather than result
2021-03-29 19:17:48 +00:00
Jacek Sieka 74732a23fe
json cleanups (#2456)
* move json-rpc specific marshalling to rpc
* serialize Epoch/Slot with cast to avoid Defect
* avoid a few eth1 deps
* simplify imports
2021-03-26 15:11:06 +01:00
Zahary Karadjov 2eacfc4685 Bump modules to take advantage of the new Json format flavors support
Since quite a lot of additional procs were now compiled as generics, this led to compiler bugs that had
to be worked around:

* The `Domain` type was renamed to `Eth2Domain` to avoid compilation errors
  due to conflicts with `nativesockets.Domain`.
  Similarly, `eth2_network.KeyPair` was renamed to `NetKeyPair`.

* A new more robust version of `hexToByteArray` was added to stew
2021-03-25 09:37:35 +02:00
Jacek Sieka 8b76ceed52
Fix minor exception effect issues (#2448)
Makes code compatible with
https://github.com/status-im/nim-chronos/pull/166 without requiring it.
2021-03-24 17:20:55 +01:00
tersec 36311bfc05
incorporate proposals into nextActionWait; switch some proc to func (#2438) 2021-03-24 10:05:04 +00:00
tersec 3076f5c3b6
rm std/random from beacon_chain and rm attestation timing randomness (#2442)
* remove added attestation timing randomness

* remove os/random from rest of beacon_chain, primarily deposit_contract

* remove scaffolding

* randomize std/random seed in beacon node and validator client

* use CSPRNG to more securely seed std/random
2021-03-23 06:57:10 +00:00
tersec dfd99ec943
Altair (HF1/v1.1.0) minimal and mainnet presets/constants (#2444)
* Altair mainnet & minimal presets

* std/math not used
2021-03-22 14:44:45 +00:00
tersec b059cb42c5
increase block proposal speed with many validators (#2423)
* increase block proposal speed with many validators

* document CookedSig rationale
2021-03-17 13:35:59 +00:00
Jacek Sieka 3cb31e66b4
set upper bound on EpochRef cache (#2403)
* set upper bound on EpochRef cache

* max 32 EpochRef instances
* less memory waste in BlockRef by removing EpochRef seq that is mostly
unused (~20mb)
* less memory waste in dag block lookup by not keeping an extra copy of
digest (~70mb)
* fix `==` and `$` for Eth2Digest
* remove `ChainDAG.tmpState` (~50mb?)

all in all, this branch cuts mainnet memory usage by ~160-180mb and puts
limits on EpochRef cache usage - where normally it hovered around 950mb
before, it's now sitting at 600-700mb on my machine.

* docs
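
A cache with a fixed upper bound, as described above, can be sketched as follows. The commit only states the limit of 32 instances; the least-recently-used eviction policy and the `BoundedCache` name are assumptions made for this illustration.

```python
from collections import OrderedDict


class BoundedCache:
    """Cache that never holds more than max_entries items (LRU eviction assumed)."""

    def __init__(self, max_entries=32):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        # Mark the entry as recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Drop the least recently used entries once the bound is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

With a hard bound, memory use for cached entries stays roughly constant instead of growing with the number of epochs visited.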
2021-03-17 11:17:15 +01:00
tersec 8def2486b0
immutable validator database factoring (#2297)
* initial immutable validator database factoring

* remove changes from chain_dag: this abstraction properly belongs in beacon_chain_db

* add merging mutable/immutable validator portions; individually test database roundtripping of immutable validators and states-sans-immutable-validators

* update test summaries

* use stew/assign2 instead of Nim assignment

* add reading/writing of immutable validators in chaindag

* remove unused import

* replace chunked k/v store of immutable validators with per-row SQL table storage

* use List instead of HashList

* un-stub some ncli_db code so that it uses

* switch HashArray to array; move BeaconStateNoImmutableValidators from datatypes to beacon_chain_db

* begin only-mutable-part state storage

* uncomment some assigns

* work around https://github.com/nim-lang/Nim/issues/17253

* fix most of the issues/oversights; local sim runs again

* fix test suite by adding missing beaconstate field to copy function

* have ncli bench also store immutable validators

* extract some immutable-validator-specific code from the beacon chain db module

* add more rigorous database state roundtripping, with changing validator sets

* adjust ncli_db to use new schema

* simplify putState/getState by moving all immutable validator accounting into beacon state DB

* remove redundant test case and move code to immutable-beacon-chain module

* more efficient, but still brute-force, mutable+immutable validator merging

* reuse BeaconState in getState

* ensure HashList/HashArray caches are cleared when reusing getState buffers; add ncli_db and a unit test to verify this

* HashList.clear() -> HashList.clearCache()

* only copy incrementally necessary immutable validators

* increase strictness of test cases and fix/work around resulting HashList cache invalidation issues

* remove explanatory scaffolding

* allow for storage of full (with all validators) states for backwards/forwards-compatibility

* adjust DbSeq type usage

* store full, with-validators, state every 64 epochs to enable reverting versions

* reduce memory allocation and intermediate objects in state storage codepath

* eliminate allocation/copying through intermediate BeaconStateNoImmutableValidators objects

* skip benchmarking initial genesis-validator-heavy state store

* always store new-style state and sometimes old-style state

* document intent behind BeaconState/Validator type-punnery

* more accurate failure message on SQLite in-memory database initialization failure
2021-03-15 14:11:51 +00:00
Mamy Ratsimbazafy c47d636cb3
Split Eth2Processor in prep for batching (#2396)
* Split Eth2Processor in gossip and consensus part and materialize the shared block queue

* Update initialization in test_sync_manager
2021-03-11 11:10:57 +01:00
tersec ef4a5b0cc3
remove delta-encoding from state diff balances (#2397)
* remove delta-encoding from state diff balances

* switch HashList to List
2021-03-11 05:39:04 +00:00
Mamy Ratsimbazafy d47f53cd9d
Reorg (5/5) (#2377)
* Reorg things left into networking and gossip_processing

* time -> beacon_clock

* fix builds
2021-03-05 14:12:00 +01:00
tersec 451cc03d76
datatypes spec ref url updates (#2372) 2021-03-02 17:31:34 +01:00
tersec de643d9926
allow multiple hard fork datatypes to coexist (#2328)
* allow multiple hard fork datatypes to coexist

* update to 1.0.1

* merge recent datatypes.nim updates

* trigger rebuild now that the out-of-disk-space machine is offline
2021-03-02 10:13:39 +00:00
tersec 5653b2e13c
more spec v1.0.1 spec ref URL and copyright year updates (#2367) 2021-03-02 06:04:14 +00:00
tersec e661f7d0c7
prevent uint64 to int64-induced RangeError/RangeDefects in metrics (#2358)
* prevent uint64 to int64-induced RangeError/RangeDefects in metrics

* remove redundant min(foo, int64.high)

* adjust spacing to be consistent
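
One common way to avoid range errors when an unsigned 64-bit value is reported through a signed 64-bit metric is a saturating conversion. The sketch below illustrates that general technique; `to_metric_value` is a hypothetical helper, and it is an assumption that the change works this way.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def to_metric_value(value):
    """Clamp a non-negative (uint64-style) value into the int64 range."""
    if value < 0:
        raise ValueError("expected a non-negative value")
    # Saturate instead of overflowing when the value exceeds int64.high.
    return min(value, INT64_MAX)
```

Values that already fit pass through unchanged, so the clamp only matters for sentinel-like values near the top of the unsigned range.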
2021-03-01 20:55:25 +01:00
tersec 97f7284e51
bump spec refs from v1.0.0 to v1.0.1 and update copyright years (#2357) 2021-02-25 13:37:22 +00:00
Dustin Brody f14e7babb6 update eth2 specs to version v1.0.1 2021-02-25 14:21:59 +02:00
Ștefan Talpalaru 44a1263ece fix Eth2Digest compile-time comparison 2021-02-25 14:20:26 +02:00
Ștefan Talpalaru 16abf2989b bump NimYAML 2021-02-25 14:20:26 +02:00
Dustin Brody c7093c4ab5 show next attestation slot & wait time in Slot end log 2021-02-15 22:49:20 +02:00
tersec 5cab17dc1a
database state storage benchmarking via ncli_db (#2312)
* database state storage benchmarking via ncli_db

* more cleanups from immutable validator state branch

* unexport some eth2_network constants and remove unused variables/templates

* make two PeerScore constants public
2021-02-15 17:40:00 +01:00
tersec aca3e4cd5c
per HF1, split process_final_updates() (#2319) 2021-02-14 19:31:01 +00:00
Jacek Sieka 9968944329
more callsigs! (#2302) 2021-02-09 10:23:26 +01:00
Jacek Sieka 8a09286423
tune attestation params (#2301)
* a little bit more priority to attestation processing
* better implementation for bytes_to_uint64
2021-02-08 16:13:02 +01:00
Ștefan Talpalaru 80c11546ff Windows binary release
CI: use both cores on GitHub Actions and set timeouts for the local testnet tests
2021-02-04 10:25:44 +02:00
Dustin Brody 707fcd99cb remove unused beacon chain spec and test code 2021-02-02 14:56:38 +02:00