op-geth/p2p/simulations
Felix Lange 90caa2cabb
p2p: new dial scheduler (#20592)
* p2p: new dial scheduler

This change replaces the peer-to-peer dial scheduler with a new and
improved implementation. The new code is better than the previous
implementation in two key aspects:

- The time between discovery of a node and dialing that node is
  significantly lower in the new version. The old dialState kept
  a buffer of nodes and launched a task to refill it whenever the buffer
  became empty. This worked well with the discovery interface we used to
  have, but doesn't really work with the new iterator-based discovery
  API.

- Selection of static dial candidates (created by Server.AddPeer or
  through static-nodes.json) performs much better for large amounts of
  static peers. Connections to static nodes are now limited like dynanic
  dials and can no longer overstep MaxPeers or the dial ratio.

* p2p/simulations/adapters: adapt to new NodeDialer interface

* p2p: re-add check for self in checkDial

* p2p: remove peersetCh

* p2p: allow static dials when discovery is disabled

* p2p: add test for dialScheduler.removeStatic

* p2p: remove blank line

* p2p: fix documentation of maxDialPeers

* p2p: change "ok" to "added" in static node log

* p2p: improve dialTask docs

Also increase log level for "Can't resolve node"

* p2p: ensure dial resolver is truly nil without discovery

* p2p: add "looking for peers" log message

* p2p: clean up Server.run comments

* p2p: fix maxDialedConns for maxpeers < dialRatio

Always allocate at least one dial slot unless dialing is disabled using
NoDial or MaxPeers == 0. Most importantly, this fixes MaxPeers == 1 to
dedicate the sole slot to dialing instead of listening.

* p2p: fix RemovePeer to disconnect the peer again

Also make RemovePeer synchronous and add a test.

* p2p: remove "Connection set up" log message

* p2p: clean up connection logging

We previously logged outgoing connection failures up to three times.

- in SetupConn() as "Setting up connection failed addr=..."
- in setupConn() with an error-specific message and "id=... addr=..."
- in dial() as "Dial error task=..."

This commit ensures a single log message is emitted per failure and adds
"id=... addr=... conn=..." everywhere (id= omitted when the ID isn't
known yet).

Also avoid printing a log message when a static dial fails but can't be
resolved because discv4 is disabled. The light client hit this case all
the time, increasing the message count to four lines per failed
connection.

* p2p: document that RemovePeer blocks
2020-02-13 11:10:03 +01:00
..
adapters p2p: new dial scheduler (#20592) 2020-02-13 11:10:03 +01:00
examples p2p/simulations: fix a deadlock and clean up adapters (#17891) 2018-10-11 20:32:14 +02:00
pipes all: update author list and licenses 2019-07-22 12:17:27 +03:00
README.md p2p/simulations: fix a deadlock and clean up adapters (#17891) 2018-10-11 20:32:14 +02:00
connect.go p2p, swarm: fix node up races by granular locking (#18976) 2019-02-18 07:38:14 +01:00
connect_test.go p2p/simulations: eliminate concept of pivot (#18426) 2019-01-11 10:23:45 +01:00
events.go build: use golangci-lint (#20295) 2019-11-18 10:49:17 +02:00
http.go p2p/simulations: fix staticcheck warnings (#20322) 2019-11-19 17:16:42 +01:00
http_test.go build: use golangci-lint (#20295) 2019-11-18 10:49:17 +02:00
mocker.go
mocker_test.go p2p/simulations: fix staticcheck warnings (#20322) 2019-11-19 17:16:42 +01:00
network.go build: use golangci-lint (#20295) 2019-11-18 10:49:17 +02:00
network_test.go p2p/simulations: fix staticcheck warnings (#20322) 2019-11-19 17:16:42 +01:00
simulation.go
test.go all: update author list and licenses 2019-07-22 12:17:27 +03:00

README.md

devp2p Simulations

The p2p/simulations package implements a simulation framework which supports creating a collection of devp2p nodes, connecting them together to form a simulation network, performing simulation actions in that network and then extracting useful information.

Nodes

Each node in a simulation network runs multiple services by wrapping a collection of objects which implement the node.Service interface meaning they:

  • can be started and stopped
  • run p2p protocols
  • expose RPC APIs

This means that any object which implements the node.Service interface can be used to run a node in the simulation.

Services

Before running a simulation, a set of service initializers must be registered which can then be used to run nodes in the network.

A service initializer is a function with the following signature:

func(ctx *adapters.ServiceContext) (node.Service, error)

These initializers should be registered by calling the adapters.RegisterServices function in an init() hook:

func init() {
	adapters.RegisterServices(adapters.Services{
		"service1": initService1,
		"service2": initService2,
	})
}

Node Adapters

The simulation framework includes multiple "node adapters" which are responsible for creating an environment in which a node runs.

SimAdapter

The SimAdapter runs nodes in-memory, connecting them using an in-memory, synchronous net.Pipe and connecting to their RPC server using an in-memory rpc.Client.

ExecAdapter

The ExecAdapter runs nodes as child processes of the running simulation.

It does this by executing the binary which is running the simulation but setting argv[0] (i.e. the program name) to p2p-node which is then detected by an init hook in the child process which runs the node.Service using the devp2p node stack rather than executing main().

The nodes listen for devp2p connections and WebSocket RPC clients on random localhost ports.

Network

A simulation network is created with an ID and default service (which is used if a node is created without an explicit service), exposes methods for creating, starting, stopping, connecting and disconnecting nodes, and emits events when certain actions occur.

Events

A simulation network emits the following events:

  • node event - when nodes are created / started / stopped
  • connection event - when nodes are connected / disconnected
  • message event - when a protocol message is sent between two nodes

The events have a "control" flag which when set indicates that the event is the outcome of a controlled simulation action (e.g. creating a node or explicitly connecting two nodes together).

This is in contrast to a non-control event, otherwise called a "live" event, which is the outcome of something happening in the network as a result of a control event (e.g. a node actually started up or a connection was actually established between two nodes).

Live events are detected by the simulation network by subscribing to node peer events via RPC when the nodes start up.

Testing Framework

The Simulation type can be used in tests to perform actions in a simulation network and then wait for expectations to be met.

With a running simulation network, the Simulation.Run method can be called with a Step which has the following fields:

  • Action - a function which performs some action in the network

  • Expect - an expectation function which returns whether or not a given node meets the expectation

  • Trigger - a channel which receives node IDs which then trigger a check of the expectation function to be performed against that node

As a concrete example, consider a simulated network of Ethereum nodes. An Action could be the sending of a transaction, Expect it being included in a block, and Trigger a check for every block that is mined.

On return, the Simulation.Run method returns a StepResult which can be used to determine if all nodes met the expectation, how long it took them to meet the expectation and what network events were emitted during the step run.

HTTP API

The simulation framework includes a HTTP API which can be used to control the simulation.

The API is initialised with a particular node adapter and has the following endpoints:

GET    /                            Get network information
POST   /start                       Start all nodes in the network
POST   /stop                        Stop all nodes in the network
GET    /events                      Stream network events
GET    /snapshot                    Take a network snapshot
POST   /snapshot                    Load a network snapshot
POST   /nodes                       Create a node
GET    /nodes                       Get all nodes in the network
GET    /nodes/:nodeid               Get node information
POST   /nodes/:nodeid/start         Start a node
POST   /nodes/:nodeid/stop          Stop a node
POST   /nodes/:nodeid/conn/:peerid  Connect two nodes
DELETE /nodes/:nodeid/conn/:peerid  Disconnect two nodes
GET    /nodes/:nodeid/rpc           Make RPC requests to a node via WebSocket

For convenience, nodeid in the URL can be the name of a node rather than its ID.

Command line client

p2psim is a command line client for the HTTP API, located in cmd/p2psim.

It provides the following commands:

p2psim show
p2psim events [--current] [--filter=FILTER]
p2psim snapshot
p2psim load
p2psim node create [--name=NAME] [--services=SERVICES] [--key=KEY]
p2psim node list
p2psim node show <node>
p2psim node start <node>
p2psim node stop <node>
p2psim node connect <node> <peer>
p2psim node disconnect <node> <peer>
p2psim node rpc <node> <method> [<args>] [--subscribe]

Example

See p2p/simulations/examples/README.md.