* Testnet improvements
Increase timeout for reading
Add more logs
The Offer endpoint can fail due to a talkReq timeout. To avoid
test failures, retry it a few times until it succeeds.
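
The retry behaviour described above could look roughly like the sketch
below. It is not the actual test code: `retry`, `operation`, `attempts`
and `delay_seconds` are hypothetical names, and a plain `TimeoutError`
stands in for the talkReq timeout.

```python
import time


def retry(operation, attempts=3, delay_seconds=0.0):
    """Call `operation` until it succeeds or `attempts` is used up.

    `operation` is a hypothetical zero-argument callable standing in
    for the Offer request. Only TimeoutError is retried here.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TimeoutError as err:
            # Remember the failure and try again after a short pause.
            last_error = err
            time.sleep(delay_seconds)
    # Every attempt timed out: surface the last timeout to the caller.
    raise last_error
```

Bounding the number of attempts keeps a test from hanging when a peer is
really unreachable, while still absorbing an occasional timeout.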
- Add retries when validation of a block header, body or receipts
received from a network request fails.
- Shuffle the first set of neighbours that the request is sent to
in order to not always hit the same peer first for the same content
- Some general clean-up
- Move the accumulator definitions to a history accumulator file
- Add accumulator build helper calls + temporary database
- Add a header gossip content key encoding test
- Refactor & some cleanup
- More consistent naming in calls used by the JSON-RPC debug API
- Use storeContent everywhere to make sure content is only stored
if in range
- Remove populateHistoryDb subcommand and replace it with JSON-RPC
call storeContent
- Remove some whitespace and fix some indentations
Also allow concurrent neighborhood gossip jobs when seeding data
into the network.
Update Grafana dashboard for two additional metrics regarding
lookups in neighborhood gossip.
* Improvements to the propagation and seeding of data
- Use a lookup for nodes selection in neighborhoodGossip
- Rework populate db code and add `propagateBlockHistoryDb` call
and portal_history__propagateBlock json-rpc call
- Small adjustment to blockwalk
* Avoid storing out-of-range data in the propagate db calls
Currently, re-using http connections for the json-rpc requests
seems to slow down the requests considerably. Avoid this by
forcing a close after each request in the blockwalk tool.
* Add concurrency to the content offers of neighborhoodGossip proc
And remove some whitespace
* Remove more whitespace and adjust for 80 char line limit
* Update fluffy grafana dashboard to include gossip offer results
* Remove unused import of config to avoid select_backend db import
- Importing nimbus-eth1 config.nim causes an import of select_backend,
which by default causes an import of kvstore_rocksdb and thus
requires rocksdb. Remove the unused import to avoid a rocksdb
dependency for Fluffy.
- Remove some whitespace in bridge_client (to make fluffy CI
trigger for sure).
* Use specific cache keys for fluffy CI workflow
* Disable Fluffy CI reproducibility test
- Truncate returned number of ENRs in Portal Nodes message to fit
the discv5 max. packet size
- Truncate returned number of ENRs in Portal Content message to
fit the discv5 max. packet size
- Use more detailed packet size calculation for max content payload
in Portal Content message.
- Add metrics to track how many ENRs get packed in the Portal
Nodes and Content messages.
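
The truncation described above can be sketched as a prefix selection
under a byte budget. This is an illustration only, not the Fluffy
implementation: `truncate_enrs`, `max_payload_size` and
`per_item_overhead` are hypothetical names, and the 4-byte per-item
overhead is an assumption that models one SSZ offset per list entry.

```python
def truncate_enrs(encoded_enrs, max_payload_size, per_item_overhead=4):
    """Return the longest prefix of `encoded_enrs` that fits the budget.

    `encoded_enrs` is a list of already-encoded ENRs (bytes).
    `max_payload_size` is the number of bytes left for the ENR list
    after all other packet and message overhead (an assumed input).
    """
    selected = []
    used = 0
    for enr in encoded_enrs:
        cost = len(enr) + per_item_overhead
        if used + cost > max_payload_size:
            # Adding this ENR would exceed the packet budget: stop here.
            break
        selected.append(enr)
        used += cost
    return selected
```

The length of the returned list is the kind of value the metrics
mentioned above could track.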
* Add block bodies to the propagation and lookups
- Read and propagate block bodies next to the headers
- Add block bodies content (via lookups) to the eth_getBlockByHash
call
- Test the above in test_portal_testnet
* Fix storage/propagation of block bodies
- Data format is an actual block: [header, txs, uncles], which
requires some adjustment to store the block body
- Also added `eth_getBlockTransactionCountByHash` json rpc call
By default, the network key is now read from a network key file
instead of being randomly generated on each run. This is done because
the data that gets stored in the content database is dependent on
the network key used, as the node id is derived from this.
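
A minimal sketch of the load-or-create behaviour described above,
assuming a raw 32-byte key file. The function name, the file format and
the use of random bytes in place of a proper secp256k1 private key are
all assumptions made for illustration.

```python
import os
import secrets


def load_or_create_network_key(path, key_size=32):
    """Read the key from `path`, creating the file on first use.

    Re-using the same key across runs keeps the derived node id stable,
    so previously stored content stays valid for this node.
    """
    try:
        with open(path, "rb") as key_file:
            key = key_file.read()
    except FileNotFoundError:
        # First run: generate a key and persist it with owner-only
        # permissions. O_EXCL avoids overwriting an existing file.
        key = secrets.token_bytes(key_size)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as key_file:
            key_file.write(key)
        return key
    if len(key) != key_size:
        raise ValueError("unexpected network key length")
    return key
```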
This is to avoid having a json rpc call fail if there was a
previous call done more than 10 seconds ago. 10 seconds because
that is the default timeout on the http server side.
* Sharing block header data around in a Portal history network (PoC)
- Rework PortalStream to have an instance per PortalProtocol (this
needs to be improved eventually). Each instance uses the same
UtpDiscv5Protocol instance.
- Add processContent on receipt of accepted data
- Add dumb neighborhoodGossip: dumb in the sense that it only
offers one piece of content at a time.
- Add to / adjust populate_db to also allow for propagation of
the data and add debug rpc call: portal_history_propagate
- Add eth_rpc_client
- Add eth_getBlockByHash (no txs or uncles) to eth API
- Add additional test to test_portal_testnet which loads 5 block
headers to 1 node, and offers this data to a few nodes, which should
propagate it further over the network. Next, query every node for
this data.
* Adjust paths on which Fluffy CI is triggered
* Add documentation on the local testnet
* Improve the tests of the local testnet
The local testnet test was rather flaky and would occasionally
fail. It has been made more robust by adding the ENRs directly
to the routing table instead of doing some random lookups.
Additionally, the number of nodes was increased (=64), ip limits
configuration was added, and the bits-per-hop value was set to 1
in order to make the lookups more likely to hit the network
instead of only the local routing table.
Failure is obviously still possible when sufficient packets get
lost. If this turns out to be the case with the current number of
nodes, we might have to revise the testing strategy here.
* Disable lookup test for State network
Disable lookup test for State network due to an issue with the
custom distance function causing the lookup to not always converge
towards the target.
- Allow access to contentDB from portal wire protocol
- Use this to do the db.get in `handleFindContent` directly
- Use this to check the `contentKeys` list in `handleOffer`
* Change History content key to use SSZ Union and adjust tests
* Change slot to byteBE instead of LE
This is currently not specified in the Portal network
specifications, but we are already using BE for the actual content
key, so change it here as well to remain consistent.
- Add portal network ping and findNodes JSON-RPC endpoints
- clean-up of some cli arguments
- Add protocol_interop.md document and adjust/fix other
documentation
Until now, bootstrap nodes for discv5 and for the Portal networks
were provided through separate cli arguments. This is however
confusing and cumbersome, as typically (currently) a node under test
will have both discv5 and the Portal networks enabled.
We merge them into one argument.
If a node happens not to support a Portal network, it will be
removed after message request failure.
Commit also contains additional clean-up and nim-eth bump.
* Rename FindNode to FindNodes as per spec
* Use consistently lower case starting camelCase for consts
Style guide nep1 allows for both CamelCase and camelCase for
consts, but we seem to use camelCase more often. Use this
consistently now.
* Change some procs to func
* Add resolve call for Portal networks
And:
- Refactor some code by adding a findNodeVerified call
- add the portal network lookup json-rpc call that uses resolve
- add usage of this lookup in the portal testnet tests
- Additional comments
* Let recordsFromBytes fail on invalid ENRs
This behaviour is more similar to how it is done in the discovery v5
base layer.
- Add basic discv5 and portal json-rpc calls and activate them in
fluffy
- Renames in the rpc folder
- Add local testnet script and run this script in CI
- bump nim-eth
* Add SSZ Unions through case objects
* Add connection id content response test and improve other test vectors
* Implement content keys and ids for state network as per spec
Content keys case object is used so that it can be serialized and
deserialized as an SSZ Union.
* Let message Union in Portal wire protocol start at 0 as per new spec
- Search for the node on an incoming portal message and try to add
it to the routing table
- Don't rely on discv5 nodes for bootstrapping portal networks for
now.
- Attempt to repopulate the routing table when it is at 0 or drops
to 0.
* Add a basic ContentDB for Portal networks
* Use ContentDB in StateNetwork
* Avoid what is probably some form of sandwich problem by re-exporting kvstore_sqlite3 from content_db
* Allow for passing Portal specific bootstrap nodes
* Fix to also replaceNode when decodeMessage fails
* Add portal bootstrap node tests and reorder test cases
The Portal Network tests fail with the custom functions,
in part because the portal tests still use the xor
functions, but also because the
`neighboursAtDistances` call appears to filter them out.
The reverse calculation might be off.
* Generalize network layer for portal
* Make messages free from any content references
* Use portal network in main fluffy module
* Fix cli
* Use lookup in portal network
* Avoid using result
* Update readme with adjusted intro and dev updates link
* Remove pcre from prerequisites as it no longer is one
* Change to hackmd doc as status notes seem not accessible