Note: Scaling in the size of a single content topic is out of scope for this document.
Background and Motivation # Unstructured P2P networks are more robust and resilient against DoS attacks compared to structured P2P networks). However, they do not scale to large traffic loads. A single libp2p gossipsub mesh, which carries messages associated with a single pubsub topic, can be seen as a separate unstructured P2P network (gossip and control messages go beyond these boundaries, but at its core, it is a separate P2P network).">
Note: Scaling in the size of a single content topic is out of scope for this document.
Background and Motivation # Unstructured P2P networks are more robust and resilient against DoS attacks compared to structured P2P networks). However, they do not scale to large traffic loads. A single libp2p gossipsub mesh, which carries messages associated with a single pubsub topic, can be seen as a separate unstructured P2P network (gossip and control messages go beyond these boundaries, but at its core, it is a separate P2P network)." />
However, they do not scale to large traffic loads.
A single <ahref="https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#gossipsub-the-gossiping-mesh-router">libp2p gossipsub mesh</a>,
which carries messages associated with a single pubsub topic, can be seen as a separate unstructured P2P network
(gossip and control messages go beyond these boundaries, but at its core, it is a separate P2P network).
With this, the number of <ahref="/spec/11/">Waku relay</a> content topics that can be carried over a pubsub topic is limited.
This prevents app protocols that aim to span many multicast groups (realized by content topics) from scaling.</p>
<p>This document specifies three pubsub topic sharding methods (with varying degrees of automation),
which allow application protocols to scale in the number of content topics.
This document also covers discovery of topic shards.</p>
<h1id="named-sharding">
Named Sharding
<aclass="anchor"href="#named-sharding">#</a>
</h1>
<p><em>Named sharding</em> offers apps to freely choose pubsub topic names.
App protocols SHOULD follow the naming structure detailed in <ahref="/spec/23/">23/WAKU2-TOPICS</a>.
With named sharding, managing discovery falls into the responsibility of apps.</p>
<p>The default Waku pubsub topic <code>/waku/2/default-waku/proto</code> can be seen as a named shard available to all app protocols.</p>
<blockquote>
<p><em>Note</em>: Future versions of this document are planned to give more guidance with respect to discovery via
<p>an example for the 2nd shard in the global shard cluster:</p>
<p><code>/waku/2/static-rshard/0/2</code>.</p>
<blockquote>
<p><em>Note</em>: Because <em>all</em> shards distribute payload defined in <ahref="spec/14/">14/WAKU2-MESSAGE</a> via <ahref="https://developers.google.com/protocol-buffers/">protocol buffers</a>,
the pubsub topic name does not explicitly add <code>/proto</code> to indicate protocol buffer encoding.
We use <code>rshard</code> to indicate it is a relay shards; further shard types might follow in the future.</p>
</blockquote>
<p>From an app point of view, a subscription to a content topic <code>waku2/xxx</code> on a static shard would look like:</p>
<p>For each shard cluster a node is part of, the node adds a separate key to its ENR.
The representation corresponds to <ahref="https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/validator.md#sync-committee-subnet-stability">Ethereum shard ENRs</a>).</p>
<p>Example</p>
<table>
<thead>
<tr>
<th>key</th>
<th>value</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>rshard-0</code></td>
<td><code>0x0000100000000000</code></td>
</tr>
<tr>
<td><code>rshard-16</code></td>
<td><code>0x0000100000003000</code></td>
</tr>
</tbody>
</table>
<p>This example node is part of shard <code>45</code> in the global shard cluster,
and part shards <code>13</code>, <code>14</code>, and <code>45</code> in the Status main-net shard cluster.</p>
<p>This method is easier to read.
It is feasible, assuming nodes are only part of a few apps using specific shard clusters.</p>
<p>The app is oblivious to the pubsub topic layer.
(Future versions could deprecate the default pubsub topic and remove the necessity for <code>auto=true</code>.)</p>
<p><em>The basic idea behind automatic sharding</em>:
Content topics are mapped using <ahref="https://en.wikipedia.org/wiki/Consistent_hashing">consistent hashing</a>.
Like with DHTs, the hash space is split into parts,
each covered by a Pubsub topic (mesh network) that carries content topics which are mapped into the respective part of the hash space.</p>
<p>There are (at least) two issues that have to be solved: <em>Hot spots</em> and <em>Discovery</em> (see next subsection).</p>
<p>Hot spots occur (similar to DHTs), when a specific mesh network becomes responsible for (several) large multicast groups (content topics).
The opposite problem occurs when a mesh only carries multicast groups with very few participants: this might cause bad connectivity within the mesh.
Our research goal here is finding efficient ways of distribution.
We could get inspired by the DHT literature.
We also have to consider:
If a node is part of many content topics which are all spread over different shards,
the node will potentially be exposed to a lot of network traffic.</p>
<h2id="discovery-1">
Discovery
<aclass="anchor"href="#discovery-1">#</a>
</h2>
<p>For the discovery of automatic shards this document specifies two methods (the second method will be detailed in a future version of this document).</p>
<p>The first method uses the discovery introduced above in the context of static shards.
The index range <code>49152 - 65535</code> is reserved for automatic sharding.
Each index can be seen as a hash bucket.
Consistent hashing maps content topics in one of these buckets.</p>
<p>The second discovery method will be a successor to the first method,
but is planned to preserve the index range allocation.
Instead of adding the data to the ENR, it will treat each array index as a capability,
which can be hierarchical, having each shard in the indexed shard cluster as a sub-capability.
When scaling to a very large number of shards, this will avoid blowing up the ENR size, and allows efficient discovery.
We currently use <ahref="https://rfc.vac.dev/spec/33/">33/WAKU2-DISCV5</a> for discovery,
which is based on Ethereum’s <ahref="https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md">discv5</a>.
While this allows to sample nodes from a distributed set of nodes efficiently and offers good resilience,
it does not allow to efficiently discover nodes with specific capabilities within this node set.
Our <ahref="https://vac.dev/wakuv2-apd">research log post</a> explains this in more detail.
Adding efficient (but still preserving resilience) capability discovery to discv5 is ongoing research.
<ahref="https://github.com/harnen/service-discovery-paper">A paper on this</a> has been completed,
but the <ahref="https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md">Ethereum discv5 specification</a>
has yet to be updated.
When the new capability discovery is available,
this document will be updated with a specification of the second discovery method.
The transition to the second method will be seamless and fully backwards compatible because nodes can still advertise and discover shard memberships in ENRs.</p>