The macro skeptic's guide to the p2pProtocol macro
zah edited this page 2021-01-22 16:16:01 +02:00

This guide tries to provide the rationale behind some of the decisions taken when designing the p2pProtocol macro and aims to explain its inner workings.

So, why does the macro exist in the first place?

The first and foremost goal of the macro is to establish the peer-to-peer protocols as symbols that can be named and referenced in the code. For example, to enable a certain protocol in the beacon node, it should be enough to provide its name:

proc initNetworking =
  var node = Eth2Node.init(...)
  node.addCapability BeaconSync

Here, BeaconSync is not a global variable, but rather a strongly-typed symbol. It also represents the protocol in a number of other APIs, such as:

if remotePeer.supports(BeaconSync):
  ...

or

for peer in peerPool.randomPeers(BeaconSync):
  # do something only with peers we can sync with

The key here is that, because the protocol is referenced through a type symbol rather than a global variable, you don't have to worry about gcsafety. The metadata of the protocol is a mix of constants and run-time structures associated with its type symbol.

Furthermore, the metadata is inter-linked. For example, to check whether a peer supports a particular message, we can do:

if peer.supports(BeaconSync.getBlocksByRange):
  ...

Here, BeaconSync.getBlocksByRange is also a type symbol. We can continue accessing information such as BeaconSync.getBlocksByRange.libp2pStreamType and so on.

This rich web of metadata makes it easier to write generic code that implements routines like "Perform any type of request" or "Handle any type of incoming stream".
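As a rough illustration of how metadata can hang off type symbols, here is a minimal sketch in plain Nim. It does not reproduce what the macro actually generates: DemoSync, DemoSyncGetBlocks, protocolName, getBlocks, streamType and the simplified Peer type are all hypothetical names invented for this example.

```nim
# Hypothetical stand-ins; none of these names come from the real code base.
type
  DemoSync = object            # the protocol is a type, not a global variable
  DemoSyncGetBlocks = object   # each message is a type symbol as well

  Peer = object
    protocols: seq[string]     # protocol names advertised by the remote peer

# Metadata is attached to the type symbols and resolved at compile time.
template protocolName(P: type DemoSync): string = "DemoSync"
template getBlocks(P: type DemoSync): untyped = DemoSyncGetBlocks
template streamType(M: type DemoSyncGetBlocks): string = "/demo/get_blocks/1/ssz"

proc supports(peer: Peer, P: type): bool =
  ## Generic over any protocol type that defines `protocolName`.
  P.protocolName in peer.protocols

let peer = Peer(protocols: @["DemoSync"])
doAssert peer.supports(DemoSync)
# The metadata is inter-linked: protocol -> message -> stream type.
doAssert DemoSync.getBlocks.streamType == "/demo/get_blocks/1/ssz"
```

Since every lookup goes through a type, no global variable is touched, and the generic supports proc works for any protocol that provides the same templates.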

What is a p2pProtocol definition?

The tricky part to understand here is that the macro assumes that we are building a peer-to-peer application. In a peer-to-peer application, all nodes can act as both clients and servers, and every request and response exists symmetrically in both worldviews.

When you write the body of the macro, you are writing the implementation of the server worldview. After all, every message may be processed in a different way on the server, but the client code for making a request always follows a precise pattern defined in the Eth2 spec.

In other words, if we define only the request handlers for the server, we can mechanically generate the client-side procs needed to perform these requests. That's what the p2pProtocol macro does for us.
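The idea of deriving the client side from the server handler can be sketched with a much smaller macro. The following is only an illustration under simplifying assumptions, not the p2pProtocol implementation: withClientStub, Request and getBlocks are hypothetical names, arguments are stringified instead of SSZ-encoded, and nothing is sent over a network.

```nim
import std/macros

type
  Request = object
    msgName: string       # which message is being requested
    fields: seq[string]   # stringified arguments, standing in for SSZ bytes

macro withClientStub(handler: untyped): untyped =
  ## Hypothetical pragma macro: for a handler `proc foo(a: A, b: B)` it keeps
  ## the handler and also emits `proc fooRequest(a: A, b: B): Request`.
  handler.expectKind nnkProcDef
  let
    handlerName = handler.name
    clientName = ident($handlerName & "Request")
    encodedVar = ident("encoded")
  var clientParams = @[ident("Request")]   # first entry is the return type
  var body = newStmtList()
  body.add quote do:
    var `encodedVar`: seq[string]
  # Reuse the handler's parameter list and "serialize" every argument.
  for i in 1 ..< handler.params.len:
    let identDefs = handler.params[i]
    clientParams.add copyNimTree(identDefs)
    for j in 0 ..< identDefs.len - 2:      # last two slots: type and default
      let arg = identDefs[j]
      body.add quote do:
        `encodedVar`.add($`arg`)
  let nameLit = newLit($handlerName)
  body.add quote do:
    Request(msgName: `nameLit`, fields: `encodedVar`)
  result = newStmtList(handler, newProc(clientName, clientParams, body))

# Only the server-side handler is written by hand.
proc getBlocks(startSlot: int, count: int): seq[int] {.withClientStub.} =
  for i in 0 ..< count:
    result.add startSlot + i

# The client-side proc was generated mechanically from the handler signature.
let req = getBlocksRequest(10, 3)
doAssert req.msgName == "getBlocks"
doAssert req.fields == @["10", "3"]
```

The point of the sketch is that the handler signature already contains everything needed to build the request, so the client proc never has to be written by hand.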

What kind of executable code is generated?

Every message in the network needs exactly 3 routines:

1. Client-side request proc

This is very straightforward code that looks like this:

proc beaconBlocksByRange*(peer: Peer; headBlockRoot: Eth2Digest; startSlot: Slot;
                         count: uint64; step: uint64;
                         timeout: Duration = milliseconds(10000'i64)): Future[Option[seq[SignedBeaconBlock]]] =
  var outputStream = init OutputStream
  var writer = init(WriterType(SSZ), outputStream)
  var recordWriterCtx = beginRecord(writer, beaconBlocksByRangeObj)
  writeField(writer, recordWriterCtx, "headBlockRoot", headBlockRoot)
  writeField(writer, recordWriterCtx, "startSlot", startSlot)
  writeField(writer, recordWriterCtx, "count", count)
  writeField(writer, recordWriterCtx, "step", step)
  endRecord(writer, recordWriterCtx)
  let msgBytes = getOutput(outputStream)
  makeEth2Request(peer, "/eth2/beacon_chain/req/beacon_blocks_by_range/1/ssz",
                  msgBytes, seq[SignedBeaconBlock], timeout)

As you can see, writing this code by hand would be quite tedious and error-prone. There is not much interesting going on. The bulk of the request logic is in the makeEth2Request and sendNotificationMsg procs that you'll find in eth2_network.nim.

2. Server-side thunk proc

This is a server-side proc that takes the raw contents of a P2PStream and deserializes them into the expected parameter types. We call these procs "thunks", although "trampoline" might have been a better name. These procs are even simpler:

  proc status_thunk(stream: Connection; proto`gensym175130276: string): Future[void] {.gcsafe.} =
    return handleIncomingStream(network, stream, statusObj, SSZ)

When you say node.addCapability(Protocol), these are installed as protocol handlers through the switch.mount API. Again, the bulk of the actual logic is in handleIncomingStream, defined in eth2_network.nim.

3. Server-side user handler

This is the handler that the user wrote in the body of the p2pProtocol macro. At the end of handleIncomingStream you'll find the following call:

  await callUserHandler(peer, conn, msg)

This is another simple glue proc generated by the macro that just unpacks the deserialized message data into the parameters of the right user handler.
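The way the three server-side pieces fit together can be sketched without any networking or async code. This is a simplified model, not the real implementation: StatusMsg, decode, handleIncoming, statusUserHandler and the comma-separated wire format are assumptions made for the example, and the table stands in for the handlers installed through switch.mount.

```nim
import std/[strutils, tables]

type
  StatusMsg = object     # hypothetical message record
    forkVersion: int
    headSlot: int

  Thunk = proc (raw: string): string {.nimcall, gcsafe.}

proc statusUserHandler(forkVersion, headSlot: int): string =
  ## What the user writes: ordinary parameters, no decoding logic.
  "fork " & $forkVersion & " at slot " & $headSlot

proc callUserHandler(msg: StatusMsg): string =
  ## Glue: unpacks the deserialized record into the handler's parameters.
  statusUserHandler(msg.forkVersion, msg.headSlot)

proc decode(raw: string): StatusMsg =
  ## Stand-in for SSZ decoding: expects "forkVersion,headSlot".
  let parts = raw.split(',')
  StatusMsg(forkVersion: parseInt(parts[0]), headSlot: parseInt(parts[1]))

proc handleIncoming(raw: string): string =
  ## Shared logic: decode first, then dispatch to the user handler.
  callUserHandler(decode(raw))

proc statusThunk(raw: string): string {.gcsafe.} =
  ## The thunk only forwards to the shared logic.
  handleIncoming(raw)

# Stand-in for mounting the thunk as a protocol handler.
var handlers = initTable[string, Thunk]()
handlers["/demo/status/1"] = statusThunk

doAssert handlers["/demo/status/1"]("1,42") == "fork 1 at slot 42"
```

Only statusUserHandler carries protocol-specific behaviour; everything else follows a fixed pattern, which is why it can be generated.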

Is that all?

The rest consists of simple helpers for APIs such as peer.state(BeaconSync), eth2Node.protocolState(BeaconState), etc. Forcing the protocol definitions to a common standard guarantees that we don't end up using certain Nim features, such as closures, to capture arbitrary application state in the handlers. I consider this a small win: it makes the protocols easier to audit and reason about, and it exposes their state to other parts of the application through a well-defined API. It will also help us migrate more easily to a manual memory management scheme in the future.
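To make the point about closures concrete, here is a minimal sketch of a typed state accessor keyed by the protocol type. It only illustrates the idea; DemoSync, DemoSyncPeerState, demoSyncIndex, handleGetBlocks and the layout of Peer are hypothetical and not taken from the real code.

```nim
type
  Peer = ref object
    protocolStates: seq[RootRef]   # one slot per enabled protocol

  DemoSync = object                # protocol type symbol
  DemoSyncPeerState = ref object of RootObj
    requestsServed: int

const demoSyncIndex = 0            # slot assumed to be assigned to DemoSync

proc state(peer: Peer, P: type DemoSync): DemoSyncPeerState =
  ## Typed accessor: the kind of helper that could be generated per protocol.
  DemoSyncPeerState(peer.protocolStates[demoSyncIndex])

proc handleGetBlocks(peer: Peer, count: int): int =
  ## A plain proc, not a closure: all state is reached through `peer`.
  let st = peer.state(DemoSync)
  st.requestsServed += 1
  count

let peer = Peer(protocolStates: @[RootRef(DemoSyncPeerState())])
discard handleGetBlocks(peer, 3)
discard handleGetBlocks(peer, 5)
doAssert peer.state(DemoSync).requestsServed == 2
```

Because the handler captures nothing, any other part of the application can inspect the same state through peer.state(DemoSync).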

Closing words on intentional programming and regularity

When you design an API, you should always strive to capture the intent of the programmer as precisely as possible. When you do that, it becomes much easier to add features such as "Implement this protocol in a mock peer that will reply with pre-recorded messages", "Dump the JSON contents of all network messages", or "Create a visualization tool of the network traffic". All of these cross-cutting concerns become a matter of enhancing the protocol processing layer with new capabilities.

When the programmers are responsible for getting the low-level details right, some irregularity will inevitably sneak in. The Waku protocol, for example, had a number of such irregularities that Kim was able to point out to the design team only because they were difficult to express within the constraints of the p2pProtocol macro.