The work on this was started last week while I was waiting
for a decision on the "Async Snappy" PR. It was prompted by
a failing test in the test suite, where the HashingStream
was inserting some incorrectly padded chunks that affected
the result of `hash_tree_root`. Instead of working around
the problem in the HashingStream, I've decided to implement
a planned optimisation that allows us to remove the hashing
stream altogether.
With the optimisation in place, `hash_tree_root` will now
use only stack memory and only the precise amount necessary
to build the chunks-merging tree.
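
A minimal sketch of the idea in Python, not the project's Nim code: chunks can be merged as they arrive while keeping only one pending node per tree level, so the working memory is proportional to the tree depth rather than to the number of chunks. The names `ChunkMerger`, `hash_pair`, `zero_hashes` and `naive_root` are hypothetical, and SHA-256 over 32-byte chunks with zero padding is assumed.

```python
import hashlib

def hash_pair(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def zero_hashes(depth: int) -> list:
    # zero_hashes(d)[i] is the root of an all-zero subtree with 2**i leaves.
    out = [b"\x00" * 32]
    for _ in range(depth):
        out.append(hash_pair(out[-1], out[-1]))
    return out

class ChunkMerger:
    """Merges 32-byte chunks into a Merkle root using depth + 1 slots."""

    def __init__(self, depth: int):
        self.depth = depth
        self.zeros = zero_hashes(depth)
        self.pending = [None] * (depth + 1)  # one pending node per level
        self.count = 0

    def add_chunk(self, chunk: bytes) -> None:
        assert len(chunk) == 32 and self.count < 2 ** self.depth
        node, index, level = chunk, self.count, 0
        # Every trailing 1 bit in the index means a left sibling is waiting.
        while index & 1:
            node = hash_pair(self.pending[level], node)
            index >>= 1
            level += 1
        self.pending[level] = node
        self.count += 1

    def root(self) -> bytes:
        if self.count == 2 ** self.depth:
            return self.pending[self.depth]
        node = None  # None means "nothing but zero padding so far"
        index = self.count
        for level in range(self.depth):
            if index & 1:
                right = node if node is not None else self.zeros[level]
                node = hash_pair(self.pending[level], right)
            elif node is not None:
                node = hash_pair(node, self.zeros[level])
            index >>= 1
        return node if node is not None else self.zeros[self.depth]

def naive_root(chunks: list, depth: int) -> bytes:
    # Reference version that materialises every layer, for comparison only.
    layer = list(chunks) + [b"\x00" * 32] * (2 ** depth - len(chunks))
    while len(layer) > 1:
        layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]
```

Both functions produce the same root; the difference is that `naive_root` holds a full layer in memory while `ChunkMerger` holds at most `depth + 1` nodes.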
* switch state cache to use ref statedata objects to limit memory usage
* more directly initialize ref StateData
* use HashedBeaconState instead of StateData to try to fix memory leak
* switch cache to seq[ref HashedBeaconState]
* remove unused import
Co-authored-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
* sync fixes
* fix Status message finalized info
* work around sync starting before initial status exchange
* don't fail block on deposit signature check failure (fixes #989)
* print ForkDigest and Version nicely
* dump incoming blocks
* fix crash when libp2p peer connection is closed
* update chunk size to 16 to work around missing blocks when syncing
* bump libp2p
* bump libp2p
* better deposit skip message
* fix some warnings related to beacon_node splitting; reimplement finalization verification more robustly; improve attestation pool block selection logic
* re-add missing import
* whitelist allowed state transition flags and make rollback/restore naming more consistent
* restore usage of update flags passed into skipAndUpdateState(...) in addition to the potential verifyFinalization flag
* switch rest of rollback -> restore
When replaying state transitions, for the slots that have a block, the
state root is taken from the block. For slots that lack a block, it's
currently calculated using hash_tree_root which is expensive.
Caching the empty slot state roots helps us avoid recalculating this
hash, meaning that for replay, hashes are never calculated. This turns
blocks into fairly lightweight "state-diffs"!
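
The caching idea can be sketched as follows. This is an illustrative Python model under stated assumptions, not the Nim implementation: `EmptySlotRootCache` and `hash_tree_root_stand_in` are hypothetical names, a plain SHA-256 stands in for the expensive `hash_tree_root`, and the cache key is assumed to be the pair (root of the last applied block, slot).

```python
import hashlib

def hash_tree_root_stand_in(state: bytes) -> bytes:
    # Stand-in for the expensive hash_tree_root of a full state.
    return hashlib.sha256(state).digest()

class EmptySlotRootCache:
    """Remembers state roots of empty slots so replay never re-hashes them."""

    def __init__(self):
        self._roots = {}
        self.hash_calls = 0  # counts how often the expensive hash ran

    def get(self, last_block_root: bytes, slot: int, state: bytes) -> bytes:
        key = (last_block_root, slot)
        root = self._roots.get(key)
        if root is None:
            self.hash_calls += 1
            root = hash_tree_root_stand_in(state)
            self._roots[key] = root
        return root
```

On the first pass over an empty slot the root is computed and stored; any later replay through the same (block root, slot) pair is a dictionary lookup.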
* avoid re-saving state when replaying blocks
* advance empty slots slot-by-slot and save root
* fix sim randomness
* fix sim genesis filename
* introduce `isEpoch` to check if a slot is an epoch slot
* remove incorrect/obsolete comment; deprecate BeaconState state transition functions
* remove deprecated state_transition(state: var BeaconState)
* add specific workarounds for state_transition() and process_slots() to nfuzz_block() and addTestBlock()
* remove near-duplicate code paths: process_slot(), process_slots(), and state_transition() for BeaconState are now wrappers around the HashedBeaconState versions
* convert tests/test_state_transition.nim to use HashedBeaconState
* convert mocking infrastructure and spec_block/epoch_processing tests to use HashedBeaconBlock, and remove thus unused process_slot*(state: var BeaconState)
* ssz: move ref support outside
Instead of allocating ref's inside SSZ, move it to separate helper:
* makes `ref` allocations explicit
* less magic inside SSZ
* `ref` in nim generally means reference whereas SSZ was loading as
value - if a type indeed used references it would get copies instead of
references to a single value on roundtrip which is unexpected
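
The last point can be illustrated with a small Python analogy, where JSON stands in for SSZ purely for demonstration (the `roundtrip` helper is hypothetical): two fields that share one object before serialization come back as two independent copies.

```python
import json

def roundtrip(value):
    # Serialize and deserialize; any sharing between fields is lost.
    return json.loads(json.dumps(value))

shared = [1, 2, 3]
original = {"a": shared, "b": shared}   # both fields reference one list
decoded = roundtrip(original)

# Equal by value, but no longer the same object behind both fields.
same_before = original["a"] is original["b"]
same_after = decoded["a"] is decoded["b"]
```

Here `same_before` is true and `same_after` is false, which is the surprise the commit avoids by making `ref` allocation explicit and outside the serializer.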
TODO: EF tests would benefit from some refactoring since they all do
practically the same thing.
Co-authored-by: Zahary Karadjov <zahary@gmail.com>
* refactor block pool caches to directly use TableRef to avoid SSZ decoding, which was consuming 20% of profile on mainnet eth2_network_simulation
* use table's hasKeyOrPut
* bump eth2 spec reference to v0.11.1
* cache whole StateData objects and switch from expensive clear() to cheaper new object instantiation for caching
* remove scaffolding and stop re-assigning to part of StateData object
* 80-character lines
Please see the newly added 'schlesi-dev' Makefile target.
It demonstrates how the log level can be specified for individual topics.
Additionally, when connecting to testnets like 'schlesi' there will be
two additional log files produced in the working directory:
* json-log.txt
* text-log.txt (in the textblocks format)
In BlockPool, we keep the head state around, so it's trivial to restore
the temporary state there and keep going as if nothing happened.
This solves 3 problems:
* stack space - the state copy on mainnet is huge
* GC scanning - using stack space for state slows down the GC
significantly
* reckless copying - the copy itself takes a long time
In state_sim, we'll do the same and allocate on heap - this helps a
little with GC - without it, the collection of the temporary strings
created with `toHex` while printing the json dominates the trace.
* add another check for inconsistent aggregation and committee length, since ncli_transition bypasses process_attestation(...)/check_attestation(...) and calls almost directly into process_epoch(...)
* bump validator functions to v0.11.1 spec references
* bump some spec references to v0.11.1
* poke
* Add "drop by score" ability to PeerPool.
Add tests.
Fix syncmanager queue to start from most fresh data.
* Fix endless cycle at the end of syncing process.
* reduce stack space usage in process_final_updates(...) to avoid fuzzed segfault in https://github.com/status-im/nim-beacon-chain/issues/921
* document motivation behind manually constructing hash_tree_root of a HistoricalBatch
* fix remaining block pool extended validation issues and re-enable first-block-received and block-signature EV checks; enable Merkle validation in beacon_node in eth2_network_simulation; refactor some Merkle proof generation code outside tests/ as a result
* re-enable Merkle validation skipping, since while it works on make eth2_network_simulation, it has issues with local testnet
* tighten already-seen-block blockpool check; move comment closer to conceptually proximate code; queue up maybe-future-valid-blocks as pending to keep libp2p-synchronous interrupt handling time lower
* revert the cleanups, now in a separate PR
* remove the remaining merkle_minimal cleanup remnants, also moved to other PR
* restore PR to only modifying one file after rebasing
* use signatures as summary to compare block contents
* switch signature comparison to be raw byte-wise to ensure no attempts to deserialize it to valid (or not) BLS signatures first
* fix mainnet finalization and switch eth2_network_simulation to a kind of small-mainnet profile
* Fix slot reference in trace logging
* bump a couple of spec refs from v0.11.0 to v0.11.1
* bump another spec ref to v0.11.1, one more try at Jenkins test vector download CI issue
* fix other slot reference in trace logging and skip past single-block/multi-slot gaps to re-approach from ancestry side by state_transitioning, by requiring exact match on both root hash and slot for fast path
* make the fast path condition more precise
* redo logic to make uniform with BeaconChainDB; fix chronos deprecation warning
* revert not-working replacement of deprecated chronos futures `or`
* switch testnet1 to mainnet
* if we don't have validators, don't consider aggregation work
* if we do have validators, don't aggregate when we're out of sync
* when we do aggregate, use a fresh state, and not one from before
sleeping
* fix warnings by switching from deprecated chronos API addTimer(...) to setTimer(...) and removing especially some unnecessary chronicles and extras imports from test suite modules
* update a couple v0.10.1 spec references to v0.11.1
* crypto: cleanup
* fix several Defect-on-user-input
* make crypto interface more similar to secp
* use `crypto.nim` in all of nbc
* digest: raises
* fix
* vendor
* remove all but one UnusedImport warning
* bump a few more spec version references from v0.10.1 to v0.11.1
* more v0.10.1 spec reference updates/removals
* yet more v0.10.1 spec reference updates
* refactor and fix merkle proof construction in test suite and thereby remove most remaining skipMerkleValidation flags, now unnecessary
* a few non-semantic comment update/removals
* initial fork-choice refactor
* Add fork_choice test for "no votes"
* Initial test with voting: fix handling of unknown validators and parent blocks
* Fix tiebreak of votes
* Cleanup debugging traces
* Complexify the vote test
* fakeHash use the bigEndian repr of number + fix tiebreak for good
* Stash changes: found critical bug in nimcrypto `==` and var openarray
* Passing fork choice tests with varying votes
* Add FFG fork choice scenario + fork choice to the test suite
* Not sure why lmdb / rocksdb reappeared in rebase
* Add sanity checks to .nimble file + integrate fork choice tests to the test DB and test timing
* Cleanup debugging echos
* nimcrypto fix https://github.com/status-im/nim-beacon-chain/pull/864 has been merged, remove TODO comment
* Turn fork choice exception-free
* Cleanup "result" to ensure early return is properly used
* Add a comment on private/public error code vs Result
* result -> results following https://github.com/status-im/nim-beacon-chain/pull/866
* Address comments:
- raises: [Defect] doesn't work -> TODO
- process_attestation cannot fail
- try/except as expression pending Nim v1.2.0
- cleanup TODOs
* re-enable all sanity checks
* tag no raise for process_attestation
* use raises defect everywhere in fork choice and fix process_attestation test
* initial attestation aggregation
* fix usage of committee index, vs index in committee; uniformly set trailing/following distance; document how the only-broadcast-if mechanism works better and what aggregation already happens, not otherwise sufficiently clear; use correct BlockSlot across epoch boundaries
* address inconsistent notion of which slot in past to target for aggregate broadcast; follow 0.11.x aggregate broadcast p2p interface topic
* Fix get_slot_signature(...) call after get_domain(...) change required genesis_validators_root
* mark all spec references which aren't dealt with in other PRs as v0.11.1
* update two more spec refs to v0.11.1
* initial extended validation setup
* flesh out all TODO items for attestation and beaconblock verification
* fix finalization and add chronicles debugging messages
* directly use blockPool.headState rather than pointlessly updating it and document this constraint
* fix logic relating to first-attestation checking; support validating blocks across multiple forks
* initial 0.11.1 spec commit; no test regressions and finalizes in eth2_network_simulation
* with BLS 0.10/0.11 available, stop skipping attester slashing, proposer slashing, and voluntary exit operations fixture tests
* switch param orders to group state.{fork, genesis_validators_root}; bump spec/datatypes spec version for network purposes
* mark attestation construction and broadcast and some minimal/mainnet constants as 0.11.1-compatible; remove phase 1 sharding constants from minimal which don't exist in that preset
* complete (except for get_domain(...)) 0.11.0 beacon chain spec update
* mark compute_start_slot_at_epoch(...), is_active_validator(...), compute_signing_root(...), and get_seed(...) as 0.11.0
* Initial implementation of runtime bls skipping.
Add libnfuzz skipBLSValidation handling, check that it propagates.
* Rename skipBLSValidation -> skipBlsValidation, start skipStateRootValidation
* Replace skipValidation flags with more granular flags.
Also added skipBlockParentRootValidation flag
Mainly replaced with skipBlsValidation but also StateRoot or
BlockParentRootValidation flags where appropriate.
* Adjust interop test to pass when skipping merkle validation.
* Stop skipping validation for mainchain_monitor.
* Remove comment.
* Also skipMerkleValidation for test_beacon_chain_db.
This used to behave properly before the rebase, but currently
it forces the bootstrap node to exit, because it ends up being
launched with an ENR list telling it to connect to itself.
The root cause will be investigated in a follow-up PR.
Turns out the DiscV5 code relies heavily on the presence of ENR
records at the moment, so we cannot drive it with ENodes. @kdeme
is working on refactoring that will relax these requirements.
* The bootstrap_nodes.txt file in the node's data dir is now optional
* Log more data on start-up
* Use the latest ENR APIs
* Fix simulation build errors
We no longer discriminate between ENR, MultiAddress or ENode
bootstrap records (all of them are remapped to ENodes).
The discovery loop will stochastically try to reconnect to
accidentally disconnected nodes.
* beacon node code cleanup
* rudimentary error checking on mainnet monitor
* start client even when sending deposit
* work around missing block number exception
* connect to testnet with web3 url
* pretty-print digests in json
* update to 0.10.1
* SSZ Generic and nbench uses the v0.10.1 fixtures
* Tests + spec links: v0.10.0 -> v0.10.1
* Add v0.10.1 TODO in get_latest_attesting_balance (forkchoice)
* SSZ Bytes are now ByteList
* Remove the nim-result submodule that was left over/added by mistake in the branch
* fix crash when state root is present but state is missing
* fix state root removal when state is removed
* fix block pool initialization which needs tail state
* remove tail block pruning
* incomplete - fork states are not pruned
* incomplete - fork blocks are not pruned
* incomplete - empty slot states are not pruned
* unknown - tail/finalized block on empty slot might be incorrect
The loader has been tested with the presets published by Lighthouse.
You can try connecting to one of their testnets by running:
    cd nim-beacon-chain
    ./connect-to-testnet lighthouse/testnet0
* simplify data storage to key-value, tries are not relevant for NBC
* locked-down version of lmdb dependency
* easier to build / maintain on various platforms
* fix slot time navigation, add tests
* skip block proposal if head is more recent already - shouldn't happen
* use correct head when attesting to previous blocks
* log slot start/end processing
* nbench PoC
* Remove the yaml files from the example scenarios
* update README with current status
* Add an alternative implementation that uses defer
* Forgot to add the old proc body
* slots-processing
* allow benching state_transition failures
* Add Attestations processing (workaround confutils bugs:
- https://github.com/status-im/nim-confutils/issues/10
- https://github.com/status-im/nim-confutils/issues/11
- https://github.com/status-im/nim-confutils/issues/12)
* Add CLI command in the readme
* Filter report and add notes about CPU cycles
* Report averages
* Add debugecho style time/cycle print
* Report when we skip BLS and state root verification
* Update to 0.9.3
* Generalize scenario parsing
* Support all block processing scenarios
* parallel bench runner PoC
* Better load issues reporting (the load issues were invalid signatures and expected to fail)
* state data cache in block pool
* keep head state around
* more attestation logic in attestation pool
* first fork choice tests (!)
* fix fork choice (it's still likely broken / out of date)
* per honest validator and naïve/simple aggregator attestation specs, move attesting up from halfway to one third of the way through slots
* Update beacon_chain/beacon_node.nim
Co-Authored-By: Jacek Sieka <jacek@status.im>
On your very first connection to each testnet, you'll be asked to
become a validator. Please consult our private repo for a Goerli
Eth1 private key that you can use for deposits.
Other changes:
* Added a simple wrapper ./connect-to-testnet script calling the
nims file in the correct environment. No extension was used to
make the command the same on Unix and Windows.
* Bumped a number of modules with fixes from this week
* `make testnet0` and `make testnet1` will no longer delete your
existing database. This is considered a more appropriate behavior
for testing forward sync.
This is a temporary measure until we figure out something better - as it
stands, we'll advance with empty slots and crash because all validators
are out.
* Move BeaconNode type to its own file (fewer imports)
* disentangle sync protocol/request manager
* fix some old nimisms
* de-fear some logs
* simplify eth1 data production
* add stack tracing to release builds
* drop release compile flag for testnet
* re-enable test_interop based on zcli with 0.9.1 specs and update initialize_beacon_state_from_eth1(...) to 0.9.1
* switch many procs to funcs
* fix "ospaths is deprecated" [Deprecated] warnings by importing os.nim instead
* off-by-one error in the returned range of blocks
* larger request time-outs to deal with non-responsive servers
* fix an unhandled exception when we fail to deliver a response chunk
The number of user nodes is now specified with `USER_NODES`.
To make the instructions more stable, the "numeric id" of the user
nodes will be starting from 0 (so you can always use `run_node.sh 0`
to start a user node).
If you specify a node index above the total number of nodes, you'll
launch a node without any validators attached (this is useful for
testing the sync for example).
When the connect_to_testnet script is invoked it will first verify that
the genesis file of the testnet hasn't changed. If it has changed, any
previously created database associated with the testnet will be erased.
To facilitate this, the genesis file of each network is written to the
data folder of the beacon node. The beacon node will refuse to start if
it detects a discrepancy between the data folder and any state snapshot
specified on the command-line.
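
The script-side check described above could look roughly like this. It is a hedged Python sketch, not the actual connect_to_testnet script: the function name `sync_genesis` and the file names `genesis.ssz` and `db` inside the data folder are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def sync_genesis(data_dir: Path, testnet_genesis: Path) -> bool:
    """Copy the testnet genesis into data_dir; wipe the db if it changed.

    Returns True when a previously created database was erased.
    """
    stored = data_dir / "genesis.ssz"   # hypothetical file name
    database = data_dir / "db"          # hypothetical directory name
    new_genesis = testnet_genesis.read_bytes()

    wiped = False
    if stored.exists() and stored.read_bytes() != new_genesis:
        # The testnet was reset: the old database belongs to another chain.
        shutil.rmtree(database, ignore_errors=True)
        wiped = True

    data_dir.mkdir(parents=True, exist_ok=True)
    stored.write_bytes(new_genesis)
    return wiped
```

Keeping a copy of the genesis file next to the database is what lets a later start-up compare the two and refuse to run on a mismatch.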
Since the testnet sharing spec requires us to use SSZ snapshots, the Json
support is now phased out. To help with the transition and to preserve the
functionality of the multinet scripts, the beacon node now supports a CLI
query command that can extract any data from the genesis state. This is
based on new developments in the SSZ navigators.
* update get_seed(...) and get_beacon_proposer_index(...) to 0.9.0, implement compute_proposer_index(...), and render 3 more test fixtures working
* rm stray Crosslink reference which prevented static SSZ tests from building
* remove references to removed tests in attestations test fixture; add minimal-preset block sanity test, plus all but one of mainnet tests for block sanity to transition fixtures
* transition deposit operations fixture to 0.9.0
* mark slash_validator(...) as 0.9.0
* switch remaining non-ref objects to ref objects to maybe avoid crashes in CI
* remove unused helpers/debug_state imports
* rename compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* remove some unnecessary imports; remove some crosslink-related code and tests; complete renaming of compute_epoch_of_slot(...) to compute_epoch_at_slot(...)
* rm more transfer-related code and tests; rm more unnecessary strutils imports
* rm remaining unused imports
* remove useless get_empty_per_epoch_cache(...)/compute_start_slot_of_epoch(...) calls
* rename compute_start_slot_of_epoch(...) to compute_start_slot_at_epoch(...)
* rename ACTIVATION_EXIT_DELAY to MAX_SEED_LOOKAHEAD
* update domain types to 0.9.0
* mark AttesterSlashing, IndexedAttestation, AttestationDataAndCustodyBit, DepositData, BeaconBlockHeader, Fork, integer_squareroot(...), and process_voluntary_exit(...) as 0.9.0
* mark increase_balance(...), decrease_balance(...), get_block_root(...), CheckPoint, Deposit, PendingAttestation, HistoricalBatch, is_active_validator(...), and is_slashable_attestation_data(...) as 0.9.0
* mark compute_activation_exit_epoch(...), bls_verify(...), Validator, get_active_validator_indices(...), get_current_epoch(...), get_total_active_balance(...), and get_previous_epoch(...) as 0.9.0
* mark get_block_root_at_slot(...), ProposerSlashing, get_domain(...), VoluntaryExit, mainnet preset Gwei values, minimal preset max operations, process_block_header(...), and is_slashable_validator(...) as 0.9.0
* mark makeWithdrawalCredentials(...), get_validator_churn_limit(...), get_total_balance(...), is_valid_indexed_attestation(...), bls_aggregate_pubkeys(...), initial genesis value/constants, Attestation, get_randao_mix(...), mainnet preset max operations per block constants, minimal preset Gwei values and time parameters, process_eth1_data(...), get_shuffled_seq(...), compute_committee(...), and process_slots(...) as 0.9.0; partially update get_indexed_attestation(...) to 0.9.0 by removing crosslink refs and associated tests
* mark initiate_validator_exit(...), process_registry_updates(...), BeaconBlock, Eth1Data, compute_domain(...), process_randao(...), process_attester_slashing(...), get_base_reward(...), and process_slot(...) as 0.9.0
Multi-client testing requires more portable formats, and SSZ is
much better specified than our flavour of Json.
Tools like ncli and zcli can be now used to inspect the contents
of the SSZ files.
* use service/category/process for blockpool logs
Only track fork choice logs in block pool (vs beacon_node)
Reduce verbosity on usual event in block pool
* rework beacon node logs
* log for attestations in blockpool
* log - att pool improvement
* use logScope and topics cf review and discussion
* use 7 letters for beacon_node
[log] report peers at slot start + fix bracket prefix [Block pool] Attestation sent
* Prepare test suite for transfers
* split API process_transfer / processTransfers
* Add range checks on transfer
* Fix invalid transfer conditions
* don't test on windows 64-bit #435
Changes:
* Do not send separate network packets for response codes and msg
len prefixes
* Close streams according to the spec
* Implement more timeouts according to the spec
* Make hello requests during syncing to update our knowledge of
the head block of the other peer.
* Hello is no longer a handshake message
(all handshakes related code was deleted for clarity)
* Deal with the single-parameter inlining defined in the new spec
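
As an illustration of the first item, the response code, the length prefix and the payload can be assembled into one buffer and handed to the stream in a single write. This Python sketch assumes a one-byte response code followed by an unsigned varint length prefix; `encode_response_chunk`, `send_response` and `RecordingStream` are hypothetical names, not the project's API.

```python
def encode_varint(value: int) -> bytes:
    # Unsigned LEB128 / protobuf-style varint.
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_response_chunk(code: int, payload: bytes) -> bytes:
    # Code, length prefix and payload share one contiguous buffer.
    return bytes([code]) + encode_varint(len(payload)) + payload

class RecordingStream:
    """Test double that records every write it receives."""

    def __init__(self):
        self.writes = []

    def write(self, data: bytes) -> None:
        self.writes.append(bytes(data))

def send_response(stream, code: int, payload: bytes) -> None:
    # One write call, so the pieces are not sent as separate packets.
    stream.write(encode_response_chunk(code, payload))
```

With `RecordingStream`, a test can check that exactly one write happens per response chunk.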
* add test suite for voluntary exit
* update API to process_voluntary_exit
* Add range check of validator_index for voluntary exits
* Revert to dual single + multiple voluntary exits API + enable in test suite
* no cache or mocking needed
* Add attestation unit test
* process_attestation doesn't throw exceptions
* Allow SSZ deserialization of both real and invalid signatures
* Add new process_attestation checks - pass all process_attestation tests
* Add sanity check for #361
* Fix SSZ testing after fromBytes/fromSSZBytes changes
* Fixed getBeaconBlocks() and getRecentBeaconBlocks() to use BlockPool, not db.
* Got sync_protocol to a compiling state; Removed all obsolete RPC calls
* fix network sim
* mark BeaconState, state list/vector lengths, misc values, get_base_reward(...), verifyStateRoot(...), and process_slot(...) as 0.8.3; update minimal/mainnet config initial values to 0.8.3 by removing GENESIS_FORK_VERSION
* Add sanity check for slot processing (also impacted by https://github.com/status-im/nim-beacon-chain/issues/373)
* use reportDiff also for all state tests vs EF
* initial sanity checks for blocks - workaround zero signature in block headers: https://github.com/status-im/nim-beacon-chain/issues/374
* Remove generic object variant compare commented code
* Add the one block state transition sanity checks
* generalize blocks test to multiple blocks
* simplify slots test runner
* Add official epoch transitions, sanity blocks, sanity slots to the test suite
* Fix index out-of-bounds in initiate_validator_exit - enable proposer slashings unittest
* Update BLS fixtures to 0.8.3
* Bump fixtures with shuffling with minimal preset
* Update shuffling tests
* parseTest generic over file format (Json or SSZ)
* Initial crosslink parsing commit for debugging Nim crash
* Workaround https://github.com/status-im/nim-beacon-chain/issues/369
* Crosslink test works for minimal - https://github.com/status-im/nim-beacon-chain/issues/369 is back on mainnet
* Use ref objects to workaround https://github.com/status-im/nim-beacon-chain/issues/369
* Generalize state transition epoch test to all epoch tests
* Fix slashing (potential uint64 overflow in previous spec)
* Add a state debugging macro to deeply inspect the wrong fields
* make reportDiff visible
* Improve the debug state macro for containers
* add interop launcher scripts
* stick validator_keygen into beacon_node
* fix lmd ghost slot number on missing block
* use mocked eth1data when producing blocks
* use bls public key method for withdrawal credentials
* fix deposit domain
* prefer lowercase for a bunch of toHex
* build simulation binary in data folder to avoid data types confusion