This PR allows sharing the pubkey data between validators by using a
thread-local cache for pubkey data, netting about a 400mb mem usage
reduction on holesky due to us keeping 3 permanent + several ephemeral
state copies in memory at all times and each state copy holding a full
validator.
The PR also introduces a hash cache for the key which gives ~14% speedup
for a full state `hash_tree_root` - the key makes up for a large part of
the `Validator` htr time.
Finally, the time it takes to copy a state goes down as well from ~80m
ms to ~60, for reasons similar to htr.
We use a `ptr` even if a `ref` could in theory have been used - there is
not much practical benefit to a `ref` (given it's mutable) while a `ptr`
is cheaper and easier to copy (when copying temporary states).
We could go further and cache a cooked pubkey but it turns out this is
quite intrusive - in all the relevant places, we're already using a
cooked key from the immutable validator data so there are no immediate
performance gains of doing so while managing the compressed -> cooked
key mapping would become more difficult - something for a future PR
perhaps.
Co-authored-by: Etan Kissling <etan@status.im>
- Refactor p2pProtocol internals
- Fix chronos related deprecated warnings in discv5 and uTP
- normalise nimble, ci
- Fix chronos related deprecated warnings in uTP code part II
- Remove usage of stew/shim/net
- Upgrade github actions to v4
- Fix two bugs in Receipts RLP encoding/decoding
- Move Ethereum specific RLP encoding tests under tests/common
- Reduce compiler warnings in rlp
- test refc in CI in Nim 2.0 and later
- test Nim 2.0 in CI and use non-EOL macOS version
- Use asyncraises in p2p
- Don't auto write p2pProtocol macro expansion to file
- Fix improper yield usage in rlpx and refine exception handling
- Client also handle error message if id is null
- Client pass meaningful error to newFut when processMessage failed
- Refactoring: extract rpc handler from HTTP and WebSocket server
- Use pragma push/pop pair to disable warning
- http server better exception handling
- Fix CI badge url
- Upgrade github actions to v4
- Revert "Fix CI badge url"
- HttpAuthHook use async raises
- Move CancelledError handling to outer try/except of RpcWebsocketServer
- Implement RPC batch call both in servers and clients
- v0.4.0
- Should compile if chronicles log turned on
- Add framework to support more optional types
- v0.4.2
- test refc in CI in Nim 2.0 and later
- use non-EOL macOS version for GitHub Actions CI
- avoid failing uninitialized `Future`
- Improve batch call example and wrapper comments
- Fix ws and socket client error handling and add test to #212
- Add build test with chronicles to json enabled
* track latest duration instead of total in new timing metrics
Change `db_checkpoint_seconds` and `state_replay_seconds` metrics to
record the latest duration instead of the total. `nim-metrics` already
synthesizes a `_total` metric from these implicitly.
* still have to use inc, metrics only synthesizes the name not the sum
* prefix with `beacon_dag`
- update instructions for tracking upstream MIRACL Core
- bump `bls12-381-tests` to `v0.1.2`
- bump `miracl-core` to `0f67878bee7c4108405deb2b0b5e4e58d1ae30fc`
- test refc in CI in Nim 2.0 and later
- rename `milagro.nims` -> `miracl.nims`
- rename `milagro.nim` -> `miracl.nim`
- rename `milagro(Path|_func)` -> `miracl(Path|_func)`
- rename `milagro` references -> `miracl` in documentation
Validator monitoring gained 2 new metrics for tracking when blocks are
included or not on the head chain.
Similar to attestations, if the block is produced in epoch N, reporting
will use the state when switching to epoch N+2 to do the reporting (so
as to reasonably stabilise the block inclusion in the face of reorgs).
Database checkpointing can take seconds, e.g., while Geth is syncing.
Add a debug log + metric for it, and also info log if it takes longer
than 250ms, same as for the existing `State replayed` log. If the log
shows up for a user while the system is not overloaded, it may point
to slow disk speed or thermal issue.
Make raised exceptions explicit in `ncli_common.nim`, and handle more of
them in `ncli_db.nim` to have better UX when directories cannot be read
or file names do not parse against the expected format.
`simutils.nim` is quite outdated w.r.t. code style. Apply the following:
- Use string concatenation instead of `strformat` for simple cases
- Catch `IOError` and `SerializationError` when loading/saving SSZ files
- Catch `ValueError` for remaining `strformat` usage
- Consistently use `chronicles` in `loadGenesis`
* compute post-merge randao mix without loading state
* avoid copying state on shuffling computation and compute epochref
* speed up state copy for block production