devp2p Simulations
The p2p/simulations package implements a simulation framework which supports creating a collection of devp2p nodes, connecting them together to form a simulation network, performing simulation actions in that network and then extracting useful information.
Nodes
Each node in a simulation network runs multiple services by wrapping a collection of objects which implement the node.Service interface meaning they:
- can be started and stopped
- run p2p protocols
- expose RPC APIs
This means that any object which implements the node.Service interface can be used to run a node in the simulation.
Services
Before running a simulation, a set of service initializers must be registered which can then be used to run nodes in the network.
A service initializer is a function with the following signature:
    func(ctx *adapters.ServiceContext) (node.Service, error)
These initializers should be registered by calling the adapters.RegisterServices function in an init() hook:
    func init() {
        adapters.RegisterServices(adapters.Services{
            "service1": initService1,
            "service2": initService2,
        })
    }
Node Adapters
The simulation framework includes multiple "node adapters" which are responsible for creating an environment in which a node runs.
SimAdapter
The SimAdapter runs nodes in-memory, connecting them using an in-memory, synchronous net.Pipe and connecting to their RPC server using an in-memory rpc.Client.
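To make "synchronous" concrete, here is a minimal standard-library sketch of a net.Pipe round trip. It does not use the SimAdapter itself; the helper name echoOverPipe is made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// echoOverPipe writes msg into one end of an in-memory pipe and reads it
// back from the other end. net.Pipe is unbuffered: a Write blocks until the
// peer reads the data, so the write runs in its own goroutine.
func echoOverPipe(msg string) (string, error) {
	if msg == "" {
		return "", nil // nothing to transfer
	}
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()

	errc := make(chan error, 1)
	go func() {
		_, err := a.Write([]byte(msg))
		errc <- err
	}()

	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(b, buf); err != nil {
		return "", err
	}
	if err := <-errc; err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := echoOverPipe("hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Because neither end buffers data, two in-memory nodes connected this way exchange bytes in lockstep, without touching the network stack.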
ExecAdapter
The ExecAdapter runs nodes as child processes of the running simulation.
It does this by executing the binary which is running the simulation but setting argv[0] (i.e. the program name) to p2p-node which is then detected by an init hook in the child process which runs the node.Service using the devp2p node stack rather than executing main().
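The argv[0] technique can be sketched with the standard library alone. This is an illustration of the general pattern, not the ExecAdapter's actual code: the helpers isChild and childCommand are hypothetical, and only the name p2p-node comes from the description above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// childName is the argv[0] value the child process looks for.
const childName = "p2p-node"

// isChild reports whether the process was started under the child name.
func isChild(args []string) bool {
	return len(args) > 0 && args[0] == childName
}

// childCommand builds a command that executes the binary at self but
// presents childName as argv[0] to the new process.
func childCommand(self string) *exec.Cmd {
	return &exec.Cmd{
		Path:   self,
		Args:   []string{childName},
		Stdout: os.Stdout,
		Stderr: os.Stderr,
	}
}

func init() {
	if isChild(os.Args) {
		// A real child would run its services here instead of main().
		fmt.Println("running as", childName)
		os.Exit(0)
	}
}

func main() {
	self, err := os.Executable()
	if err != nil {
		panic(err)
	}
	// Re-execute this binary; the init hook above takes over in the child.
	if err := childCommand(self).Run(); err != nil {
		panic(err)
	}
}
```

The appeal of this design is that a single binary contains both the simulation and the node, so no separate node executable has to be built or located.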
The nodes listen for devp2p connections and WebSocket RPC clients on random localhost ports.
Network
A simulation network is created with an ID and default service (which is used if a node is created without an explicit service), exposes methods for creating, starting, stopping, connecting and disconnecting nodes, and emits events when certain actions occur.
Events
A simulation network emits the following events:
- node event - when nodes are created / started / stopped
- connection event - when nodes are connected / disconnected
- message event - when a protocol message is sent between two nodes
The events have a "control" flag which when set indicates that the event is the outcome of a controlled simulation action (e.g. creating a node or explicitly connecting two nodes together).
This is in contrast to a non-control event, otherwise called a "live" event, which is the outcome of something happening in the network as a result of a control event (e.g. a node actually started up or a connection was actually established between two nodes).
Live events are detected by the simulation network by subscribing to node peer events via RPC when the nodes start up.
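The control/live distinction can be illustrated with a small sketch. The Event struct below is a simplified, hypothetical stand-in (string fields, made-up details) rather than the package's own event type; it only shows how a consumer might separate the two kinds of event.

```go
package main

import "fmt"

// Event is a hypothetical, simplified network event.
type Event struct {
	Type    string // "node", "conn" or "msg"
	Control bool   // true if caused by an explicit simulation action
	Detail  string // free-form description, for illustration only
}

// splitEvents separates control events from live events, keeping order.
func splitEvents(events []Event) (control, live []Event) {
	for _, ev := range events {
		if ev.Control {
			control = append(control, ev)
		} else {
			live = append(live, ev)
		}
	}
	return control, live
}

func main() {
	events := []Event{
		{Type: "conn", Control: true, Detail: "connect node1 to node2 requested"},
		{Type: "conn", Control: false, Detail: "node1 and node2 are connected"},
		{Type: "msg", Control: false, Detail: "node1 sent a message to node2"},
	}
	control, live := splitEvents(events)
	fmt.Println(len(control), "control,", len(live), "live")
}
```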
Testing Framework
The Simulation type can be used in tests to perform actions in a simulation network and then wait for expectations to be met.
With a running simulation network, the Simulation.Run method can be called with a Step which has the following fields:
- Action - a function which performs some action in the network
- Expect - an expectation function which returns whether or not a given node meets the expectation
- Trigger - a channel which receives node IDs which then trigger a check of the expectation function to be performed against that node
As a concrete example, consider a simulated network of Ethereum nodes. The Action could be the sending of a transaction, the Expect a check that the transaction has been included in a block, and the Trigger a channel which fires a check for every block that is mined.
On return, the Simulation.Run method returns a StepResult which can be used to determine if all nodes met the expectation, how long it took them to meet the expectation and what network events were emitted during the step run.
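The interplay of Action, Expect and Trigger can be sketched as a small loop. The types below are simplified, hypothetical stand-ins: node IDs are plain strings, the Nodes field and the timeout parameter are assumptions, and network events are not recorded. The sketch shows the idea only, not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Step is a simplified stand-in with the three fields described above,
// plus an assumed list of nodes that must meet the expectation.
type Step struct {
	Action  func() error
	Expect  func(id string) (bool, error)
	Trigger chan string
	Nodes   []string
}

// runStep performs the action, then checks the expectation for every node ID
// received on Trigger until all nodes have passed or the timeout elapses.
// It returns how long each passing node took.
func runStep(step Step, timeout time.Duration) (map[string]time.Duration, error) {
	start := time.Now()
	if err := step.Action(); err != nil {
		return nil, err
	}
	pending := make(map[string]bool, len(step.Nodes))
	for _, id := range step.Nodes {
		pending[id] = true
	}
	passes := make(map[string]time.Duration)
	deadline := time.After(timeout)
	for len(pending) > 0 {
		select {
		case id := <-step.Trigger:
			if !pending[id] {
				continue // unknown node, or one that already passed
			}
			ok, err := step.Expect(id)
			if err != nil {
				return passes, err
			}
			if ok {
				passes[id] = time.Since(start)
				delete(pending, id)
			}
		case <-deadline:
			return passes, errors.New("step timed out")
		}
	}
	return passes, nil
}

// demo runs a step in which both nodes are triggered once and always pass.
func demo() (int, error) {
	trigger := make(chan string, 2)
	step := Step{
		Action: func() error {
			trigger <- "node1"
			trigger <- "node2"
			return nil
		},
		Expect:  func(id string) (bool, error) { return true, nil },
		Trigger: trigger,
		Nodes:   []string{"node1", "node2"},
	}
	passes, err := runStep(step, time.Second)
	return len(passes), err
}

func main() {
	n, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(n, "nodes met the expectation")
}
```

A node that fails its check stays pending and is re-checked the next time its ID arrives on the trigger channel, which is what makes a per-block trigger useful in the Ethereum example.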
HTTP API
The simulation framework includes an HTTP API which can be used to control the simulation.
The API is initialised with a particular node adapter and has the following endpoints:
    GET    /                            Get network information
    POST   /start                       Start all nodes in the network
    POST   /stop                        Stop all nodes in the network
    GET    /events                      Stream network events
    GET    /snapshot                    Take a network snapshot
    POST   /snapshot                    Load a network snapshot
    POST   /nodes                       Create a node
    GET    /nodes                       Get all nodes in the network
    GET    /nodes/:nodeid               Get node information
    POST   /nodes/:nodeid/start         Start a node
    POST   /nodes/:nodeid/stop          Stop a node
    POST   /nodes/:nodeid/conn/:peerid  Connect two nodes
    DELETE /nodes/:nodeid/conn/:peerid  Disconnect two nodes
    GET    /nodes/:nodeid/rpc           Make RPC requests to a node via WebSocket
For convenience, nodeid in the URL can be the name of a node rather than its ID.
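A client for these endpoints needs little more than net/http. The sketch below is a hedged illustration: the helpers nodePath and call are made up, the node name node01 is a placeholder, and a stub server that merely echoes the method and path stands in for a real simulation server, whose responses will differ.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// nodePath builds a per-node endpoint path such as /nodes/node01/start.
// The node argument may be a node name or a node ID.
func nodePath(node, action string) string {
	return "/nodes/" + url.PathEscape(node) + "/" + action
}

// call sends one request with an empty body and returns the status code
// and the raw response body.
func call(client *http.Client, method, baseURL, path string) (int, string, error) {
	req, err := http.NewRequest(method, baseURL+path, nil)
	if err != nil {
		return 0, "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

// newStubServer stands in for a real simulation server so the sketch runs
// on its own. It only echoes the request method and path.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%s %s", r.Method, r.URL.Path)
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	for _, c := range []struct{ method, path string }{
		{http.MethodGet, "/nodes"},
		{http.MethodPost, nodePath("node01", "start")},
		{http.MethodPost, nodePath("node01", "stop")},
	} {
		status, body, err := call(srv.Client(), c.method, srv.URL, c.path)
		if err != nil {
			panic(err)
		}
		fmt.Println(status, body)
	}
}
```

Against a real server, replace srv.URL with the address the API is listening on and srv.Client() with an ordinary *http.Client.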
Command line client
p2psim is a command line client for the HTTP API, located in cmd/p2psim.
It provides the following commands:
    p2psim show
    p2psim events [--current] [--filter=FILTER]
    p2psim snapshot
    p2psim load
    p2psim node create [--name=NAME] [--services=SERVICES] [--key=KEY]
    p2psim node list
    p2psim node show <node>
    p2psim node start <node>
    p2psim node stop <node>
    p2psim node connect <node> <peer>
    p2psim node disconnect <node> <peer>
    p2psim node rpc <node> <method> [<args>] [--subscribe]