This reverts commit eebc828778.
Adding a separate file turns out not to be enough. This PR reverts the
separate file change.
Another theory is that the large kvstore table causes cache thrashing -
all database connections share a common page cache which would explain
the poor performance of the separate file solution.
The V1 table structure shows great improvements in performance, but if
there's an old `kvstore` without rowid:s, these benefits are nullified:
reorgs during writes and deletes remain expensive (even if the
degradation is reduced somewhat).
This PR creates the tables in a new file instead, and uses the old file
as a read-only store - this has several interesting properties:
* the old database is left completely untouched - this guarantees that
downgrades work smooth (they'll only need to resync their missing
portions)
* starting sync after this PR means only a v1 database is created
* v0 databases stick around - no migration is performed (for now)
Future PR:s can introduce migration of the data from one database to
another - a simply copy will take hours which is downtime we want to
avoid - at that point, it might make sense to migrate straight to era
files instead.
* Revert "Revert "Upgrade database schema" (#2570)"
This reverts commit 6057c2ffb4.
* ssz: fix loading empty lists into existing instances
Not a problem earlier because we didn't reuse instances
* bump nim-eth
* bump nim-web3
The `kvstore` design we're using now turns out to not be the best way to
use `sqlite` - in particular, there are some significant benefits to
using rowid in certain situations and to keep data in separate tables.
With this branch, there are massive improvements in startup time
(seconds instead of minutes) and state/block storage and pruning times
(milliseconds instead of seconds) - these improvements can in particular
be seen on slow drives and translate directly into better attestation
performance.
* update kvstore to new keyspace design
* remove `DirStoreRef` and the hidden `--state-db-kind` option - this
was an experiment to store large blobs in files, but with the new
kvstore, there's no compelling reason to do so
* remove `DbMap` - unused and would need updating for new keyspace
design
* introduce separate tables for each data type (blocks, states etc)
* remove "WITHOUT ROWID" pessimization for tables with large blobs
* close DbSeq statements explicitly (and earlier)
* store beacon block summaries in separate table, without SSZ
compression and load them all with single query on startup
* stop storing backwards compat full states
* mark genesis beacon block as trusted
* avoid faststreams when loading SSZ data
* remove `DisagreementBehavior` (unused)
This patch writes a full genesis state to `kvstore` if one was missing,
which fixes 1.2.0 restarting sync when upgrading from 1.1.0, or when
downgrading to a pre-1.1.0 release.
* Reset cached indices when resetting cache on SSZ read
When deserializing into an existing structure, the cache should be
cleared - goes for json also. Also improve error messages.
* initial immutable validator database factoring
* remove changes from chain_dag: this abstraction properly belongs in beacon_chain_db
* add merging mutable/immutable validator portions; individually test database roundtripping of immutable validators and states-sans-immutable-validators
* update test summaries
* use stew/assign2 instead of Nim assignment
* add reading/writing of immutable validators in chaindag
* remove unused import
* replace chunked k/v store of immutable validators with per-row SQL table storage
* use List instead of HashList
* un-stub some ncli_db code so that it uses
* switch HashArray to array; move BeaconStateNoImmutableValidators from datatypes to beacon_chain_db
* begin only-mutable-part state storage
* uncomment some assigns
* work around https://github.com/nim-lang/Nim/issues/17253
* fix most of the issues/oversights; local sim runs again
* fix test suite by adding missing beaconstate field to copy function
* have ncli bench also store immutable validators
* extract some immutable-validator-specific code from the beacon chain db module
* add more rigorous database state roundtripping, with changing validator sets
* adjust ncli_db to use new schema
* simplify putState/getState by moving all immutable validator accounting into beacon state DB
* remove redundant test case and move code to immutable-beacon-chain module
* more efficient, but still brute-force, mutable+immutable validator merging
* reuse BeaconState in getState
* ensure HashList/HashArray caches are cleared when reusing getState buffers; add ncli_db and a unit test to verify this
* HashList.clear() -> HashList.clearCache()
* only copy incrementally necessary immutable validators
* increase strictness of test cases and fix/work around resulting HashList cache invalidation issues
* remove explanatory scaffolding
* allow for storage of full (with all validators) states for backwards/forwards-compatibility
* adjust DbSeq type usage
* store full, with-validators, state every 64 epochs to enable reverting versions
* reduce memory allocation and intermediate objects in state storage codepath
* eliminate allocation/copying through intermediate BeaconStateNoImmutableValidators objects
* skip benchmarking initial genesis-validator-heavy state store
* always store new-style state and sometimes old-style state
* document intent behind BeaconState/Validator type-punnery
* more accurate failure message on SQLite in-memory database initialization failure
* database state storage benchmarking via ncli_db
* more cleanups from immutable validator state branch
* unexport some eth2_network constants and remove unused variables/templates
* make two PeerScore constants public
* checkpoint database at end of each slot
To avoid spending time on synchronizing with the file system while doing
processing, the manual checkpointing mode turns off fsync during
processing and instead checkpoints the database when the slot has ended.
From an sqlite perspecitve, in WAL mode this guaranees database
consistency but may lead to data loss which is fine - anything missing
from the beacon chain database can be recovered on the next startup.
* log sync status and delay in slot start message
* bump
* Handle some web3 timeouts better
* Add support for developer .env files
* Eth1 improvements; Mainnet genesis state
Notable changes:
* The deposits table have been removed from the database. The client
will no longer process all deposits on start-up.
* The network metadata now includes a "state snapshot" of the deposit
contract. This allows the client to skip syncing deposits made prior
to the snapshot (i.e. genesis). Suitable metadata added for Pyrmont
and Mainnet.
* The Eth1 monitor won't be started unless there are validators attached
to the node.
* The genesis detection code is now optional and disabled by default
* Bugfix: The client should not produce blocks that will fail validation
when it hasn't downloaded the latest deposits yet
* Bugfix: Work around the database corruption affecting Pyrmont nodes
* Remove metadata for Toledo and Medalla
This introcudes a cache for block summaries, useful for instantiating
the block dag on startup, bringing medalla startup times down from
minutes to seconds.
This is something of a temporary band-aid that would be obsoleted by a
finalized block store.
* Concentrate all sensitive writeFile/createPath calls in one place.
Fix eth2_network_simulation for Windows.
* Remove artifacts.
* fix import
Co-authored-by: Jacek Sieka <jacek@status.im>
* fix invalid state root being written to database
When rewinding state data, the wrong block reference would be used when
saving the state root - this would cause state loading to fail by
loading a different state than expected, preventing blocks to be
applied.
* refactor state loading and saving to consistently use and set
StateData block
* avoid rollback when state is missing from database (as opposed to
being partially overwritten and therefore in need of rollback)
* don't store state roots for empty slots - previously, these were used
as a cache to avoid recalculating them in state transition, but this has
been superceded by hash tree root caching
* don't attempt loading states / state roots for non-epoch slots, these
are not saved to the database
* simplify rewinder and clean up funcitions after caches have been
reworked
* fix chaindag logscope
* add database reload metric
* re-enable clearance epoch tests
* names
hash_tree_root was turning up when running beacon_node, turns out to be
repeated hash_tree_root invocations - this pr brings them back down to
normal.
this PR caches the root of a block in the SignedBeaconBlock object -
this has the potential downside that even invalid blocks will be hashed
(as part of deserialization) - later, one could imagine delaying this
until checks have passed
there's also some cleanup of the `cat=` logs which were applied randomly
and haphazardly, and to a large degree are duplicated by other
information in the log statements - in particular, topics fulfill the
same role
* cleanups
* fix ncli state root check flag
* add block dump to ncli_db
* limit ncli_db benchmark length
* tone down finalization logs
* introduce trusted blocks
We only store blocks whose signature we've verified in the database - as
such, there's no need to check it again, and most importantly, no need
to deserialize the signature when loading from database.
50x startup time improvement, 200x block load time improvement.
* fix rewinding when deposits have invalid signature
* speed up ancestor iteration by avoiding copy
* avoid deserializing signatures for trusted data
* load blocks lazily when rewinding (less memory used)
* chronicles workarounds
* document trustedbeaconblock
* 3gb vs 12gb for 4000 epochs of witti
* 3-4x sync blocks/sec performance improvement on my FDE SSD drive
* generate less quirky code for primitive types
* reorder ssz
* split into hash_trees and ssz_serialization, roughly, for hashing and
IO
* move bitseqs into ssz (from stew)
* clean up imports
* docs, imports
* fix crash when state root is present but state is missing
* fix state root removal when state is removed
* fix block pool initialization which needs tail state
* remove tail block pruning
* incomplete - fork states are not pruned
* incomplete - fork blocks are not pruned
* incomplete - empty slot states are not pruned
* unknown - tail/finalized block on empty slot might be incorrect
* simplify data storage to key-value, tries are not relevant for NBC
* locked-down version of lmdb dependency
* easier to build / maintain on various platforms
* rename compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* remove some unnecessary imports; remove some crosslink-related code and tests; complete renaming of compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* rm more transfer-related code and tests; rm more unnecessary strutils imports
* rm remaining unused imports
* remove useless get_empty_per_epoch_cache(...)/compute_start_slot_of_epoch(...) calls
* rename compute_start_slot_of_epoch(...) to compute_start_slot_at_epoch(...)
* rename ACTIVATION_EXIT_DELAY to MAX_SEED_LOOKAHEAD
* update domain types to 0.9.0
* mark AttesterSlashing, IndexedAttestation, AttestationDataAndCustodyBit, DepositData, BeaconBlockHeader, Fork, integer_squareroot(...), and process_voluntary_exit(...) as 0.9.0
* mark increase_balance(...), decrease_balance(...), get_block_root(...), CheckPoint, Deposit, PendingAttestation, HistoricalBatch, is_active_validator(...), and is_slashable_attestation_data(...) as 0.9.0
* mark compute_activation_exit_epoch(...), bls_verify(...), Validator, get_active_validator_indices(...), get_current_epoch(...), get_total_active_balance(...), and get_previous_epoch(...) as 0.9.0
* mark get_block_root_at_slot(...), ProposerSlashing, get_domain(...), VoluntaryExit, mainnet preset Gwei values, minimal preset max operations, process_block_header(...), and is_slashable_validator(...) as 0.9.0
* mark makeWithdrawalCredentials(...), get_validator_churn_limit(...), get_total_balance(...), is_valid_indexed_attestation(...), bls_aggregate_pubkeys(...), initial genesis value/constants, Attestation, get_randao_mix(...), mainnet preset max operations per block constants, minimal preset Gwei values and time parameters, process_eth1_data(...), get_shuffled_seq(...), compute_committee(...), and process_slots(...) as 0.9.0; partially update get_indexed_attestation(...) to 0.9.0 by removing crosslink refs and associated tests
* mark initiate_validator_exit(...), process_registry_updates(...), BeaconBlock, Eth1Data, compute_domain(...), process_randao(...), process_attester_slashing(...), get_base_reward(...), and process_slot(...) as 0.9.0
* signed_root -> signing_root
* implement new process_deposit; update timing parameters to 0.6.1; update Deposit to 0.6.1 and remove DepositInput
* update get_active_validator_indices and get_epoch_committee_count to 0.6.1; rm get_current_epoch_committee_count, get_previous_epoch_committee_count, and get_next_epoch_committee_count; bump state_sim default validator count a bit more
* re-introduce 0.5.1-ish get_active_validator_indices, get_epoch_committee_count as scaffolding for still-0.5.1ish shuffling
this is the beginning of tracking block-slots more precisely, so we can
the justification epoch slot bug.
* avoid asyncDiscard (swallows assertions)
* fix attestation delay
* fix several state root cache bugs
* introduce workaround for genesis epoch spec issue
* no need to pass prev_block_root when updating state
* some Slot fixes
* fix `hash_tree_root` for Slot, Epoch
* detect missing hash_tree_root type support
* remove some <0.5 state checks
* ssz: update to 0.5.1:ish
* slightly fewer seq allocations
* still a lot of potential for optimization
* fixes#174
* ssz: avoid reallocating leaves (logN merkle impl)
* begin 0.5.0 spec update: parent_root -> previous_block_root, BeaconState.justified_epoch -> BeaconState.current_justified_epoch, DOMAIN_PROPOSAL -> DOMAIN_BEACON_BLOCK, temporarily rename BeaconBlockHeader to BeaconBlockHeaderRLP to allow for gradual re-merging without disrupting RLP; mark a few unchanged functions and data types, implement get_temporary_block_header/get_empty_block/should_update_validator_registry/processBlockHeader/cacheState; update a few others
* a dozen or so more trivial mostly comment changes, finding more unchanged parts of 0.5.0
* several more trivial changes; goal is to reduce noise around the upcoming substantial changes (epoch processing, etc)
* keep track of a finalized block
* keep track of all justified blocks
* use naive spec version of LMD ghost
* cache slot number and a few more things in BlockRef
* keep track of the latest vote of each validator
* depend less on the state of node.state (it's a cache, effectively)
To enable it, comment out the 'withLibp2p' line in nim.cfg
The history was squashed in order to remove an accidentally
commited binary file.
Other changes:
* SSZ was adapted to use the common serialization framework
* gossibsup.subscribe is not using async handlers at the moment
and this allowed me to simplify it
* fix state db lookup typo
* fix randao reveal slot when proposing blocks
* only store blocks that can be applied to a state
* store state at every epoch boundary (yes, needs pruning!)
* split out state advancement function when there's no block
* default state sim to 0.9 attestation ratio
* implement in-memory block graph
* store tail block in database
* resolve unknown parents by syncing them from peers
* introduce concept of resolved blocks and attestations - those that
follow minimal protocol rules
* update state head lazily
* log more stuff
* shortHash -> shortLog
* start 9/10 beacon nodes by default, last can be started manually
* see also #134
* fix start.sh epoch length
* fix initial attestation pool on reordered attestations
* simplify db layer api
* load head block from database on startup, then load state
* significantly changes database format
* move subscriptions to separate proc's
* implement block replay from historical state
* avoid rescheduling epoch actions on block receipt (why?)
* make sure genesis block is created and used
* relax initial state sim parameters a bit
You'll need the latest versions of nim-eth-p2p, nim-serialization
and nim-json-serialization.
Before starting the simulation script, make sure to delete any previous
json files from the simulation folder:
```
rm tests/simulation/*.json
tests/simulation/start.sh
```
This should survive the creation of few blocks before diying with a
block validation error.
* spec updates
* balances move out to separate seq
* bunch of placeholders for proof-of-custody / phase1
* fix inclusion distance adjustment
* modify state in-place in `updateState` (tests spent over 80% time
copying state! now it's down to 25-50)
* document several conditions and conversations
* some renames here and there to follow spec