mirror of https://github.com/status-im/nim-eth.git
e4b4b7f4af
Fixes status-im/nim-eth#341, status-im/nimbus-eth1#489. When using discv4 (Kademlia) to find peers, there is a crash after a few minutes. It occurs for most of us on Eth1 mainnet, and everyone on Ropsten. The cause is `findNodes` being called twice in succession to the same peer, within about 5 seconds of each other. ("About" 5 seconds, because Chronos does not guarantee to run the timeout branch at a particular time, due to queuing and clock reading delays.) Then `findNodes` sends a duplicate message to the peer and calls `waitNeighbours` to listen for the reply. There's already a `waitNeighbours` callback in a shared table, so that function hits an assert failure. Ignoring the assert would be wrong as it would break timeout logic, and sending `FindNodes` twice in rapid succession also makes us a bad peer. As a simple workaround, just skip `findNodes` in this state and return a fake empty `Neighbours` reply. This is a bit of a hack as `findNodes` should not be called like this; there's a logic error at a higher level. But it works. Tested for about 4 days of constant operation on Ropsten. The crash which occurred every few minutes no longer occurs, and discv4 keeps working. Signed-off-by: Jamie Lokier <jamie@shareable.org> |
||
---|---|---|
discoveryv5 | ||
private | ||
rlpx_protocols | ||
auth.nim | ||
blockchain_sync.nim | ||
blockchain_utils.nim | ||
bootnodes.nim | ||
discovery.nim | ||
ecies.nim | ||
enode.nim | ||
kademlia.nim | ||
mock_peers.nim | ||
p2p_backends_helpers.nim | ||
p2p_protocol_dsl.nim | ||
p2p_tracing.nim | ||
p2p_tracing_ctail_plugin.nim | ||
peer_pool.nim | ||
rlpx.nim | ||
rlpxcrypt.nim | ||
sync.nim | ||
whispernodes.nim |
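
The workaround described in the commit message above (skip a second `findNodes` to a peer while a reply is still pending, and return an empty `Neighbours` result instead) can be sketched roughly as follows. This is an illustrative Python `asyncio` model, not the Nim/Chronos code in `kademlia.nim`; the class `Discovery`, its fields `pending` and `sent`, and the method names are all hypothetical and chosen only for the sketch.

```python
import asyncio


class Discovery:
    """Toy stand-in for a discv4 node. All names here are hypothetical."""

    def __init__(self, reply_timeout=5.0):
        self.reply_timeout = reply_timeout
        self.pending = {}  # peer id -> Future awaiting a Neighbours reply
        self.sent = []     # record of FindNode messages "sent" on the wire

    async def find_nodes(self, peer, target):
        # Workaround: a request to this peer is already in flight, so do not
        # send a duplicate or register a second callback. Report no neighbours.
        if peer in self.pending:
            return []
        fut = asyncio.get_running_loop().create_future()
        self.pending[peer] = fut
        self.sent.append((peer, target))
        try:
            return await asyncio.wait_for(fut, self.reply_timeout)
        except asyncio.TimeoutError:
            # No reply in time: treat it as an empty result.
            return []
        finally:
            # Always clear the slot so later requests to this peer can proceed.
            self.pending.pop(peer, None)

    def recv_neighbours(self, peer, nodes):
        # Deliver a reply to whoever is waiting on this peer, if anyone.
        fut = self.pending.get(peer)
        if fut is not None and not fut.done():
            fut.set_result(nodes)


async def demo():
    d = Discovery(reply_timeout=1.0)
    first = asyncio.ensure_future(d.find_nodes("peerA", "t1"))
    await asyncio.sleep(0)  # let the first call register its pending slot
    second = await d.find_nodes("peerA", "t2")  # overlaps the first call
    d.recv_neighbours("peerA", ["n1", "n2"])
    return await first, second, d.sent
```

Running `asyncio.run(demo())` exercises the overlapping case: the second call returns an empty list without sending anything, while the first call still receives its reply. As the commit message notes, this only masks the higher-level logic error that causes the duplicate call in the first place.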