This is a work-in-progress proof of concept for syncing with the
proposed light client protocol.
See https://github.com/ethereum/consensus-specs/pull/2802
Note: THINGS ARE STILL BUGGY.
To test, first launch a local testnet:
```
rm -rf local_testnet_data && scripts/launch_local_testnet.sh \
--kill-old-processes --preset minimal --disable-htop --disable-vc -- \
--log-format=auto --max-peers=160 --import-light-client-data=full \
--serve-light-client-data
```
Then, monitor http://127.0.0.1:7500/eth/v1/beacon/headers/finalized
until a slot >= 8 is finalized (Altair fork).
Finally, in a second terminal, start an additional node that syncs
with this testnet using the light client protocol.
```
rm -rf "local_testnet_data/client_$(git rev-parse --abbrev-ref HEAD)"; \
make -j16 nimbus_beacon_node \
NIMFLAGS="-d:local_testnet -d:const_preset=minimal" && \
build/nimbus_beacon_node \
--nat:extip:127.0.0.1 \
"--network=local_testnet_data" \
--tcp-port=9999 --udp-port=9999 --rest-port=5999 \
"--data-dir=local_testnet_data/client_$(git rev-parse --abbrev-ref HEAD)" \
"--bootstrap-node=$(curl http://127.0.0.1:7500/eth/v1/node/identity | jq -r .data.enr)" \
--subscribe-all-subnets --rest --log-level=DEBUG \
"--light-client-state-node-url=http://127.0.0.1:7500" \
"--light-client-trusted-block-root=$(curl http://127.0.0.1:7500/eth/v1/beacon/headers/8 | jq -r .data.root)"
```
The libp2p light client sync protocol defines an endpoint for syncing
historic data that is structured similarly to `beaconBlocksByRange`,
i.e., it uses a start/count/step tuple to sync from finalized to head.
See https://github.com/ethereum/consensus-specs/pull/2802
As preparation, this patch extends the `SyncQueue` and `SyncManager`
implementations to also support such new `ByRange` endpoints.
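As a rough illustration, a start/count/step tuple expands into the
covered indices like this (a minimal Nim sketch; the type and iterator
names are illustrative, not the actual nimbus-eth2 definitions):
```nim
# Minimal sketch of ByRange semantics; names are illustrative only.
type ByRangeRequest = object
  start: uint64   # first requested index (slot, or period for updates)
  count: uint64   # number of results requested
  step: uint64    # distance between consecutive results

iterator coveredIndices(req: ByRangeRequest): uint64 =
  ## Yields each index the request covers; e.g. start=100, count=4,
  ## step=2 covers 100, 102, 104 and 106.
  for i in 0'u64 ..< req.count:
    yield req.start + i * req.step
```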
This extends the `--serve-light-client-data` launch option to serve
locally collected light client data via libp2p.
Backfilling historic best `LightClientUpdate`s is not yet implemented.
See https://github.com/ethereum/consensus-specs/pull/2802
To test, in `conf.nim` change the `defaultValue` of:
- `serveLightClientData` to `true`
- `importLightClientData` to `ImportLightClientData.Full` (or others)
Then, run:
```
scripts/launch_local_testnet.sh --kill-old-processes --preset minimal \
--nodes 4 --disable-htop --stop-at-epoch 7
```
The log files of the beacon nodes are in the `local_testnet_data`
directory, named `log0.txt` through `log3.txt`. They can be browsed
for light-client-related messages.
Light clients require full nodes to serve additional data so that they
can stay in sync with the network. This patch adds a new launch option
`--serve-light-client-data` to enable collection of light client data.
`--import-light-client-data` configures the classes of data to import.
This can be set to `none`, `only-new`, `full`, or `on-demand`.
Note that data is only collected locally; a separate patch is needed
to actually make it available over the network. Likewise, data is only
kept in memory; it is not persisted at this time.
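For reference, the four import modes map naturally onto an enum; the
sketch below only mirrors the CLI values, and the actual definition in
`conf.nim` may differ:
```nim
# Sketch of the import modes implied by the CLI values.
type ImportLightClientData = enum
  None = "none"           # do not collect light client data
  OnlyNew = "only-new"    # collect data for newly received blocks only
  Full = "full"           # collect data for all blocks
  OnDemand = "on-demand"  # compute data lazily when it is requested
```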
When initializing backfill sync, the implementation intends to start at
the first unknown slot (`1` before tail). However, an incorrect variable
is passed, and backfill sync actually starts at the tail slot instead.
This patch corrects the issue by passing the intended variable. The
problem was introduced with the original backfill implementation in
#3263.
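In pseudocode terms (with hypothetical variable names, not the actual
ones from the codebase), the intent versus the bug looks like:
```nim
# Hypothetical names, for illustration only.
let
  tailSlot = 1000'u64          # oldest slot already in the database
  firstUnknown = tailSlot - 1  # intended backfill starting point
# The bug passed `tailSlot` to the backfill initializer instead of
# `firstUnknown`, so backfill effectively started at the tail slot.
```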
The added options are opt-in. If they are not specified, the server
will respond to all requests as if the CORS specification doesn't
exist, which will result in errors in CORS-enabled clients.
Please note that future versions may support more than one allowed
origin. The option names will stay the same, but the user will be
able to repeat them on the command line (similar to other options
such as `--web3-url`).
To be documented in the guide in a separate PR.
The `SyncManager` has a leftover optional `sleepTime` parameter in
its constructor that used to configure the sync loop polling rate.
This parameter was replaced with a constant in #1602 and is no longer
functional. This patch removes the `sleepTime` leftovers.
#3304 introduced a regression to the sync status string displayed in the
status bar; during the main forward sync, the current slot is no longer
reported and always displays as `0`. This patch corrects the computation
to accurately report the current slot once more.
The `SyncManager` has a leftover optional `maxStatusAge` parameter in
its constructor that used to configure the libp2p `Status` polling rate.
This parameter was replaced with a constant in #1827 and is no longer
functional. This patch removes the `maxStatusAge` leftovers.
With these changes, we can backfill about 400-500 slots/sec, which means
a full backfill of mainnet takes about 2-3h.
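As a back-of-the-envelope check (assuming roughly 3.2 million mainnet
slots at the time, which is an estimate, not a measured figure):
```nim
# Estimate only; the mainnet slot count is an assumption.
let mainnetSlots = 3_200_000
echo mainnetSlots div 500  # 6400 s, just under 2 h at 500 slots/sec
echo mainnetSlots div 400  # 8000 s, just over 2 h at 400 slots/sec
```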
However, the CPU is not saturated on either the server or the client,
meaning that somewhere there's an artificial inefficiency in the
communication - 16 parallel downloads *should* saturate the CPU.
One plausible cause would be "too many async event loop iterations"
per block request, which would introduce multiple "sleep-like" delays
along the way.
I can push the speed up to 800 slots/sec by increasing parallel
downloads even further, but going after the root cause of the slowness
would be better.
* avoid some unnecessary block copies
* double parallel requests
When the node is restarted before backfill has started, but after some
blocks have been finalized by forward sync, backfill would not start.
* also clean up one last `SomeSome`
* Initial commit.
* Fix current test suite.
* Fix keymanager api test.
* Fix wss_sim.
* Add more keystore_management tests.
* Recover deleted isEmptyDir().
* Add `HttpHostUri` distinct type.
Move keymanager calls away from rest_beacon_calls to rest_keymanager_calls.
Add REST serialization of RemoteKeystore and Keystore object.
Add tests for Remote Keystore management API.
Add tests for Keystore management API (Add keystore).
Fix serialization issues.
* Fix test to use HttpHostUri instead of Uri.
* Add links to specification in comments.
* Remove debugging echoes.
* harden and speed up block sync
The `GetBlockBy*` server implementation currently reads SSZ bytes from
the database, deserializes them into a Nim object, then serializes them
right back to SSZ - here, we eliminate the deserialization/serialization
steps and send the bytes straight to the network. Unfortunately, the
snappy recoding must still be done because of differences in framing.
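The idea, as a hedged sketch - the helpers below are stubs standing in
for the database and network calls, not the actual nimbus-eth2 API:
```nim
# Stubs standing in for database and network calls.
proc getBlockSSZ(root: string): seq[byte] = discard
proc recodeSnappyFraming(data: seq[byte]): seq[byte] = data
proc send(data: seq[byte]) = discard

proc respondBlock(root: string) =
  # Before: bytes -> SignedBeaconBlock -> bytes (a decode/encode
  # round-trip). After: forward the stored SSZ bytes as-is; only the
  # snappy framing differs between storage and wire, so recode that.
  send(recodeSnappyFraming(getBlockSSZ(root)))
```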
Also, the quota system makes one giant request for quota right before
sending all blocks - this means that a 1024-block request will be
"paused" for a long time, then all blocks will be sent at once,
causing a spike in database reads that may make the requesting client
time out before any block is sent.
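One way to address this - sketched here with stub names, not the
actual API - is to request quota per block so the response streams
steadily instead of pausing and then bursting:
```nim
# Stub names; illustrates per-block quota requests.
proc awaitQuota(cost: int) = discard   # stand-in for the rate limiter
proc sendChunk(data: seq[byte]) = discard

proc sendBlocks(blocks: seq[seq[byte]]) =
  for blk in blocks:
    awaitQuota(blk.len)  # pay for each block as it is sent
    sendChunk(blk)
```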
Finally, on the reading side we make several copies of blocks as they
travel through various queues - this was not noticeable before, but it
becomes a problem in two cases: Bellatrix blocks are up to 10mb
(instead of roughly 30-40kb), and when backfilling, we process a lot
more of them a lot faster.
* fix status comparisons for nodes syncing from genesis (#3327 was a bit
too hard)
* don't hit the database at all for post-altair slots in GetBlock v1
requests
* deactivate Doppelganger Protection during genesis
* also don't actually flag supposed-doppelgangers (because they're before broadcastStartEpoch) on GENESIS_SLOT start
* update action tracker on dependent-root-changing reorg (instead of
epoch change)
* don't try to log duties while syncing - we're not tracking actions yet
* fix slot used for doppelganger loss detection
These use a separate flow, and were previously only registered from the
network
* don't log successes in totals mode (TMI)
* remove `attestation-sent` event which is unused