<p>Future versions of this document will serve as a comprehensive list of adversarial models and attack-based threats relevant for <a href="/spec/10/">Waku v2</a>.
The main purpose of this document is to be a linkable resource for specifications that address protection and mitigation mechanisms within the listed models.</p>
<p>Discussing and introducing countermeasures to specific attacks in specific models is out of scope for this document.
Analyses and further information about Waku’s properties within these models may be found in our <em>Waku v2 Anonymity Analysis</em> series of research log posts:</p>
<ul>
<li><a href="https://vac.dev/wakuv2-relay-anon">Part I: Definitions and Waku Relay</a></li>
</ul>
<p>Note: This document adds to the adversarial models and threat list discussed in our <a href="https://vac.dev/wakuv2-relay-anon">research log post</a>.
It does not cover analysis of Waku, as the research log post does.
Future versions of this document will extend the adversarial models and threat list.</p>
<p>The concepts of security, privacy, and anonymity are linked and have quite a bit of overlap.</p>
<h2 id="security">
Security
<a class="anchor" href="#security">#</a>
</h2>
<p>Of the three, <a href="https://en.wikipedia.org/wiki/Information_security">Security</a> has the clearest agreed-upon definition,
at least regarding its key concepts: <em>confidentiality</em>, <em>integrity</em>, and <em>availability</em>.</p>
<ul>
<li>confidentiality: data is not disclosed to unauthorized entities.</li>
<li>integrity: data is not modified by unauthorized entities.</li>
<li>availability: data is available, i.e. accessible by authorized entities.</li>
</ul>
<p>While these are the key concepts, the definition of information security has been extended over time to include further concepts,
e.g. <a href="https://en.wikipedia.org/wiki/Authentication">authentication</a> and <a href="https://en.wikipedia.org/wiki/Non-repudiation">non-repudiation</a>.</p>
<h2 id="privacy">
Privacy
<a class="anchor" href="#privacy">#</a>
</h2>
<p>Privacy allows users to choose which data and information</p>
<ul>
<li>they want to share</li>
<li>and with whom they want to share it.</li>
</ul>
<p>This includes data and information that is associated with and/or generated by users.
Protected data also comprises metadata that might be generated without users being aware of it.
This means that no further information about the sender or the message is leaked.
Metadata that is protected as part of the privacy-preserving property does not cover protecting the identities of sender and receiver.
Identities are protected by the <a href="#anonymity">anonymity property</a>.</p>
<p>Often privacy is realized by the confidentiality property of security.
This neither makes privacy and security the same, nor does it make one a subcategory of the other.
While security is abstract itself (its properties can be realized in various ways), privacy lives on a more abstract level using security properties.
Privacy typically does not use integrity and availability.
An adversary who has no access to the private data, because the message has been encrypted, could still alter the message.</p>
<h2 id="anonymity">
Anonymity
<a class="anchor" href="#anonymity">#</a>
</h2>
<p>Privacy and anonymity are closely linked.
Both the identity of a user and data that allows inferring a user’s identity should be part of the privacy policy.
For the purpose of analysis, we want to have a clearer separation between these concepts.</p>
<p>Because each <a href="/spec/14/">Waku message</a> is associated with a content topic, and each receiver is interested in messages with specific content topics,
receiver anonymity in the context of Waku corresponds to <em>subscriber-topic unlinkability</em>.
An example of the “action” part of our receiver anonymity definition is subscribing to a specific topic.</p>
<p>Because the data in the context of Waku is Waku messages, sender anonymity corresponds to <em>sender-message unlinkability</em>.</p>
<h2 id="anonymity-trilemma">
Anonymity Trilemma
<a class="anchor" href="#anonymity-trilemma">#</a>
</h2>
<p><a href="https://freedom.cs.purdue.edu/projects/trilemma.html">The Anonymity trilemma</a> states that only two out of <em>strong anonymity</em>, <em>low bandwidth</em>, and <em>low latency</em> can be guaranteed in the <em>global attacker</em> model.
Waku’s goal, being a modular set of protocols, is to offer any combination of two out of these three properties, as well as blends.</p>
<p>A fourth factor that influences <a href="https://freedom.cs.purdue.edu/projects/trilemma.html">the anonymity trilemma</a> is the <em>frequency and patterns</em> of messages.
The more messages there are, and the more randomly distributed they are, the better the anonymity protection offered by a given anonymous communication protocol.
So, incentivising users to use the protocol, for instance by lowering entry barriers, helps protect the anonymity of all users.
The frequency/patterns factor is also related to <a href="https://en.wikipedia.org/wiki/K-anonymity">k-anonymity</a>.</p>
<p>The following lists various attacks against <a href="/spec/10/">Waku v2</a> protocols.
If not specifically mentioned, the attacks refer to <a href="/spec/11">Waku relay</a> and the underlying <a href="https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/README.md">libp2p GossipSub</a>.
We also list the weakest attacker model in which the attack can be successfully performed.</p>
<p>An attack is considered more powerful if it can be successfully performed in a weaker attacker model.</p>
<p>Note: This list is a work in progress.
We will either expand this list adding more attacks in future versions of this document,
or remove it and point to the “Security Considerations” sections of respective RFCs.</p>
<p>This section lists attacks that aim at deanonymizing a message sender.</p>
<p>We assume that protocol messages are transmitted within a secure channel set up using the <a href="https://noiseprotocol.org/">Noise Protocol Framework</a>.
For <a href="/spec/11">Waku Relay</a> this means we only consider messages with version field <code>2</code>,
which indicates that the payload has to be encoded using <a href="/spec/35/">35/WAKU2-NOISE</a>.</p>
<p>Note: The currently listed attacks are against libp2p in general.
The <a href="/spec/11/#message-fields">data field of Waku v2 relay</a> must be a <a href="/spec/14/">Waku v2 message</a>.
The attacks listed in the following do not leverage that fact.</p>
<h3 id="replay-attack">
Replay Attack
<a class="anchor" href="#replay-attack">#</a>
</h3>
<p>In a replay attack, the attacker replays a valid message it received.</p>
<p>Waku relay is inherently safe against replay attacks,
because GossipSub nodes, and by extension Waku relay nodes,
feature a <code>seen</code> cache, and only relay messages they have not seen before.</p>
<p>Further, replay attacks will be punished by <a href="/spec/17/">RLN Relay</a>.</p>
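<p>The idea behind the <code>seen</code> cache can be illustrated with a minimal sketch. The class name <code>SeenCache</code>, the method <code>should_relay</code>, and the time-to-live parameter are hypothetical names chosen for illustration; this is not the GossipSub or Waku implementation, and the retention window is an assumption.</p>

```python
import time
from typing import Dict, Optional


class SeenCache:
    """Hypothetical time-bounded cache of message IDs (illustration only)."""

    def __init__(self, ttl_seconds: float = 120.0) -> None:
        # Assumption: entries are forgotten after a fixed retention window.
        self.ttl_seconds = ttl_seconds
        self._first_seen: Dict[str, float] = {}

    def should_relay(self, message_id: str, now: Optional[float] = None) -> bool:
        """Return True only the first time a message ID is seen within the window."""
        now = time.monotonic() if now is None else now
        # Forget entries that are older than the retention window.
        self._first_seen = {
            mid: t for mid, t in self._first_seen.items()
            if now - t < self.ttl_seconds
        }
        if message_id in self._first_seen:
            return False  # replayed or duplicate message: do not relay again
        self._first_seen[message_id] = now
        return True
```

<p>In this sketch, a message replayed within the retention window is dropped, while a message replayed after its entry has expired would be relayed again. How a real deployment bounds the cache is outside the scope of this example.</p>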
<h3 id="observing-messages">
Observing Messages
<a class="anchor" href="#observing-messages">#</a>
</h3>
<p>If Waku relay were not protected with Noise, the AS attacker could simply check for messages leaving $v$ which have not been relayed to $v$.
These are the messages sent by $v$.
Waku relay protects against this attack by employing secure channels set up using Noise.</p>
<p>This attack can be performed by a single node attacker that is connected to all peers of the victim node $v$ with respect to a specific topic mesh.
The attacker also has to be connected to $v$.
In this position, the attacker will receive messages $m_v$ sent by $v$ both on the direct path from $v$, and on indirect paths relayed by peers of $v$.
It will also receive messages $m_x$ that are not sent by $v$. These messages $m_x$ are relayed by both $v$ and the peers of $v$.
Messages that are received (significantly) faster from $v$ than from any other of $v$’s peers are very likely messages that $v$ sent,
because for these messages the attacker is one hop closer to the source.</p>
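<p>The timing heuristic described above can be sketched as follows. The function name, the layout of <code>arrivals</code> (message ID mapped to per-peer arrival times in seconds), and the <code>min_lead</code> threshold are assumptions made for illustration; a real attacker would calibrate the threshold from measured latencies.</p>

```python
from typing import Dict, List


def likely_sent_by_victim(
    arrivals: Dict[str, Dict[str, float]],
    victim: str,
    min_lead: float,
) -> List[str]:
    """Flag messages that arrived from the victim clearly earlier than from any other peer."""
    suspected = []
    for message_id, times in arrivals.items():
        if victim not in times:
            continue  # never received directly from the victim
        others = [t for peer, t in times.items() if peer != victim]
        if not others:
            continue  # no comparison path available
        # Lead of the direct path over the fastest indirect path.
        lead = min(others) - times[victim]
        if lead >= min_lead:
            suspected.append(message_id)
    return suspected
```

<p>With <code>arrivals = {"m_v": {"v": 0.010, "p1": 0.060}, "m_x": {"v": 0.050, "p1": 0.045}}</code> and <code>min_lead = 0.03</code>, only <code>m_v</code> is flagged, since <code>m_x</code> arrived earlier via <code>p1</code> than via <code>v</code>.</p>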
<p>The attacker can (periodically) measure latency between itself and $v$, and between itself and the peers of $v$ to get more accurate estimates for the expected timings.
An AS attacker (and if the topology allows, even a local attacker) could also learn the latency between $v$ and its well-behaving peers.
An active AS attacker could also increase the latency between $v$ and its peers to make the timing differences more prominent.
This, however, might lead to $v$ switching to other peers.</p>
<p>This attack cannot (reliably) distinguish messages $m_v$ sent by $v$ from messages $m_y$ relayed by peers of $v$ the attacker is not connected to.
Still, there are hop-count variations that can be leveraged.
Messages $m_v$ always have a hop-count of 1 on the path from $v$ to the attacker, while all other paths are longer.
Messages $m_y$ might have the same hop-count on the path from $v$ as well as on other paths.
Further techniques that are part of the <em>mass deanonymization</em> category, such as <a href="#bayesian-analysis">Bayesian analysis</a>, can be used here as well.</p>
<p>If a multi node attacker manages to control all peers of the victim node, it can trivially tell which messages originated from $v$.</p>
<h3 id="correlation">
Correlation
<a class="anchor" href="#correlation">#</a>
</h3>
<p>Monitoring all traffic (in an AS or globally) allows the attacker to identify traffic correlated with messages originating from $v$.
This (alone) does not allow an external attacker to learn which message $v$ sent, but it allows identifying the respective traffic propagating through the network.
The more traffic in the network, the lower the success rate of this attack.</p>
<p>Combined with just a few nodes controlled by the attacker, the actual message associated with the correlated traffic can eventually be identified.</p>
<p>While attacks in the <em>sender deanonymization</em> category target a set of either specific or arbitrary users,
attacks in the <em>mass deanonymization</em> category aim at deanonymizing (parts of) the whole network.
Mass deanonymization attacks do not necessarily link messages to senders.
They might only reduce the anonymity set in which senders hide,
or infer information about the network topology.</p>
<h3 id="graph-learning">
Graph Learning
<a class="anchor" href="#graph-learning">#</a>
</h3>
<p>Graph learning attacks are a prerequisite for some mass deanonymization attacks,
in which the attacker learns the overlay network topology.
Graph learning attacks require a <em>scaling multi node</em> attacker.</p>
<p>For GossipSub this means an attacker learns the topic mesh for specific pubsub topics.
<a href="https://arxiv.org/abs/1805.11060">Dandelion++</a> describes ways to perform this attack.</p>
<h3 id="bayesian-analysis">
Bayesian Analysis
<a class="anchor" href="#bayesian-analysis">#</a>
</h3>
<p>Bayesian analysis allows attackers to assign each node in the network a likelihood of having sent (originated) a specific message.
Bayesian analysis for mass deanonymization is detailed in <a href="https://arxiv.org/pdf/2201.11860">On the Anonymity of Peer-To-Peer Network Anonymity Schemes Used by Cryptocurrencies</a>.
It requires a <em>scaling node</em> attacker as well as knowledge of the network topology,
which can be learned via <em>graph learning</em> attacks.</p>
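<p>As a rough illustration of assigning each node a likelihood of being the originator, the sketch below applies Bayes' rule to a prior over nodes and a per-node likelihood of the attacker's observation. The function name and both input dictionaries are hypothetical; how the likelihoods are derived from the learned topology is described in the cited paper and is not modelled here.</p>

```python
from typing import Dict


def posterior_over_senders(
    prior: Dict[str, float],
    likelihood: Dict[str, float],
) -> Dict[str, float]:
    """Compute P(node sent the message | observation) for every node in the prior.

    prior[n]      -- assumed probability that node n originates a message
    likelihood[n] -- assumed probability of the observation if n were the origin
    """
    unnormalized = {n: prior[n] * likelihood.get(n, 0.0) for n in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every candidate sender")
    # Normalize so the posterior sums to one.
    return {n: weight / total for n, weight in unnormalized.items()}
```

<p>Nodes whose posterior is close to zero can be excluded from the anonymity set, which is the sense in which such an analysis shrinks the set in which senders hide even when it does not identify a single sender.</p>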
<p>In a flooding attack, attackers flood the network with bogus messages.</p>
<p>Waku employs <a href="/spec/17/">RLN Relay</a> as the main countermeasure to flooding.
<a href="/spec/18/">SWAP</a> also helps mitigate DoS attacks.</p>
<h3 id="black-hole-internal">
Black Hole (internal)
<a class="anchor" href="#black-hole-internal">#</a>
</h3>
<p>In a black hole attack, the attacker does not relay messages it is supposed to relay.
Analogous to a black hole, attacker nodes do not allow messages to leave once they have entered.</p>
<p>While <em>single node</em> and smaller <em>multi node</em> attackers can have a negative effect on availability, the impact is not significant.
A <em>scaling multi node</em> attacker, however, can significantly disrupt the network with such an attack.</p>
<p>The effects of this attack are especially severe in conjunction with deanonymization mitigation techniques that reduce the out-degree of the overlay,
such as <a href="/spec/44/">Waku Dandelion</a>.
(<a href="/spec/44/">Waku Dandelion</a> also discusses mitigation techniques that compensate for the amplified black hole potential.)</p>
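<p>The effect of black hole nodes on availability can be explored with a toy flooding model. The sketch below is a simplification under stated assumptions: the overlay is a static adjacency map, honest nodes forward to all peers, and black hole nodes receive but never forward. The function name and graph layout are hypothetical and do not model GossipSub's mesh maintenance or scoring.</p>

```python
from collections import deque
from typing import Dict, Iterable, Set


def reached_nodes(
    adjacency: Dict[str, Iterable[str]],
    source: str,
    black_holes: Set[str],
) -> Set[str]:
    """Return the set of nodes that receive a message flooded from the source."""
    reached = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node in black_holes and node != source:
            continue  # a black hole receives the message but does not relay it
        for peer in adjacency.get(node, ()):
            if peer not in reached:
                reached.add(peer)
                queue.append(peer)
    return reached
```

<p>On a ring of five nodes, a single black hole next to the source leaves every node reachable via the other direction, while black holes on both sides of the source cut off the remaining nodes. This mirrors the observation that a low out-degree amplifies the impact of the attack.</p>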