Commit Graph

56 Commits

Author SHA1 Message Date
Jacek Sieka b3d80827fb
tns: checkpoint wal periodically while backfilling (#3516)
Witout this, we end up with a massive .wal file that needs to be
checkpointed on first startup (which takes a few minutes) - it's much
more efficient to do smaller checkpoints, it turns out.
2022-03-18 12:32:20 +01:00
Jacek Sieka d0183ccd77
Historical state reindex for trusted node sync (#3452)
When performing trusted node sync, historical access is limited to
states after the checkpoint.

Reindexing restores full historical access by replaying historical
blocks against the state and storing snapshots in the database.

The process can be initiated or resumed at any point in time.
2022-03-11 12:49:47 +00:00
Jacek Sieka 1f89b7f7b9
speed up trusted node backfill (#3371)
With these changes, we can backfill about 400-500 slots/sec, which means
a full backfill of mainnet takes about 2-3h.

However, the CPU is not saturated - neither in server nor in client
meaning that somewhere, there's an artificial inefficiency in the
communication - 16 parallel downloads *should* saturate the CPU.

One plasible cause would be "too many async event loop iterations" per
block request, which would introduce multiple "sleep-like" delays along
the way.

I can push the speed up to 800 slots/sec by increasing parallel
downloads even further, but going after the root cause of the slowness
would be better.

* avoid some unnecessary block copies
* double parallel requests
2022-02-12 12:09:59 +01:00
Jacek Sieka d076e1a11b
ncli_db: import states and blocks from era file (#3313) 2022-01-25 09:28:26 +01:00
Jacek Sieka 61342c2449
limit by-root requests to non-finalized blocks (#3293)
* limit by-root requests to non-finalized blocks

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`

* fix dag.blocks ref
2022-01-21 13:33:16 +02:00
Jacek Sieka 68247f81b3
Trusted node sync (#3209)
* Trusted node sync

Trusted node sync, aka checkpoint sync, allows syncing tyhe chain from a
trusted node instead of relying on a full sync from genesis.

Features include:

* sync from any slot, including the latest finalized slot
* backfill blocks either from the REST api (default) or p2p (#3263)

Future improvements:

* top up blocks between head in database and some other node - this
makes for an efficient backup tool
* recreate historical state to enable historical queries

* fixes

* load genesis from network metadata
* check checkpoint block root against state
* fix invalid block root in rest json decoding
* odds and ends

* retry looking for epoch-boundary checkpoint blocks
2022-01-17 10:27:08 +01:00