With the introduction of the new non-blocking code, events for the
peerpool might come out-of-order. That's generally not an issue, but it
made tests fail.
I have changed the code so that order is more consistent (It's still
theoretically possible that a stop signal would arrive out of order in a
real scenario, but impact is low and I don't want to change this code
too much).
There was another deadlock in the peer pool.
Because we made the event handler asynchrnous, another deadlock popped
up, as the loop locks the global peerpool lock before processing events.
But the handlers also take the global look, effectively resulting in the
same situation we had before, i.e the loop is not running.
THE LOOP MUST BE RUNNING AT ALL TIMES OTHERWISE THE SERVER HANGS.
This is a bit complicated, so:
1) Peerpool was subscribing to `event.Feed`, which is a global event
emitter for ethereum.
2) The p2p.Server was publshing on `event.Feed`, this triggered in the
same routine a publish on `event.Feed`.
3) Peerpool was listening to `event.Feed`, react on it, and in the same
routine, trigger some code on p2p.Server that would publish on
`event.Feed`
This meant that if the size of the channel was unbufferred, it would deadlock, as
peerPool would not be consuming when it would publish (the same go
routine publishes and listen effectively, through a lot of indirection
and non-buffered channels, p2p.Server->event.Feed)
The channel though was a buffered channel with size 10, and this meant that most of the times is
fine.
The issue is that peerpool is not the only producer to this channel.
So it's possible that while is processing an event, the buffer would
fill up, and it would hange trying to publish, and nobody is listening
to the channel, hanging EVERYTHING.
At least that's what I think, needs to be tested, but definitely an
issue.
I kept the code changes to a minimum, this code is a bit hairy, but it's
fairly critical so I don't want to make too many changes.
* Replace mclock with time in peer pools
we used mclock as golang before 1.9 did not support monotonic clocks,
https://github.com/gavv/monotime, it does now https://golang.org/pkg/time/
so we can fallback on the system implementation, which will return
nanoseconds with a resolution that is system dependent.
* Handle case where peer have same discovered time
In some system the resolution of the clock is not high enough so
multiple peers are added on the same nanosecond.
This result in the peer just added being immediately removed.
This code adds a check making sure we don't assume that a peer is added.
Another approach would be to make sure to include the peer in the list,
so prevent the peer just being added to be evicted, but it's slightly
more complicated and the resolution is generally accurate enough for our
purpose so that peers will be fresh enough if they have the same
discovered time.
It also adds a regression test, I had to use an interface to stub the
clock.
Fixes: https://github.com/status-im/nim-status-client/issues/522
* bump version to 0.55.3
This commit updates geth to 1.8.17 and adds a possibility to enable metrics during compilation time.
The cascade of issues forced us to upgrade geth to 1.8.17 in order to allow enabling metrics during compilation time. 1.8.17 introduced `NodeID` refactoring and `enode` package which affected our peers pool and integration with Discovery V5.
* Mark peers that were added by peer pool
* Implement thorough test to verify that connection was indeed preserved
* Add more debug logs and remove caching of the peers
* add verifier and test using simulated backend
* add ContractCaller
* commit simulated backend after deploy and after smart contract writes
* use bind.NewKeyedTransactor for all transactions in tests
* rename RegistryVerifier to Verifier
* initialize contract verifier if MailServerRegistryAddress config is set
* use contractAddress.Hash()
* refactoring
* use fmt.Sprintf to format contract address in logs
* fix test and lint warnings
* update Gopkg.lock
* update Gopkg.lock once more
The goal of this PR is to add an interface to verify MailServers. In this PR, MailServers are hardcoded in status-go. The next iteration will use a smart contract.
Update vendor
Integrate rendezvous into status node
Add a test with failover using rendezvous
Use multiple servers in client
Use discovery V5 by default and test that node can be started with rendezvous discovet
Fix linter
Update rendezvous client to one with instrumented stream
Address feedback
Fix test with updated topic limits
Apply several suggestions
Change log to debug for request errors because we continue execution
Remove web3js after rebase
Update rendezvous package