logos-storage/transport-over-mix

mirror of https://github.com/logos-storage/transport-over-mix.git synced 2026-05-19 03:19:36 +00:00

Balazs Komuves 402bf60006

WIP Sphinx packet format documentation

2026-04-26 16:16:40 +02:00

Sphinx packet format

Sphinx is a concrete packet format for mixnets, with the following goals:

compact (small overhead over the payload)
hiding the path length and relay position
unlinkability between the legs
indistinguishability of forward and reply packets
provable security

The main trick by which Sphinx achieves compactness is re-using a single public key (via a blinding mechanism) for each leg.

Security parameters

We denote the intended security level by \lambda as usual. By default we target \lambda=128 bits of security. Hence:

private keys are of size p = 2\lambda, that is, 32 bytes
as we use elliptic curve groups, public keys are the same size
symmetric keys are size s=\lambda, that is, 16 bytes
MACs are also of size \lambda, that is, 16 bytes

A further parameter is the maximum number of hops (as all packets need to have the same size, this must be agreed on in advance). This is denoted by r, and a good recommended default value is r = 5.

Elliptic curve

For public key cryptography, we use elliptic curves as they are compact (unlike for example RSA), and also conceptually simple.

We denote the elliptic curve group by \mathbb{G}, its fixed generator by \mathbf{g}\in \mathbb{G}, and the corresponding scalar field by \mathbb{F}_q \cong \mathbb{Z}_q. The elliptic curve scalar multiplication is denoted by *:\mathbb{Z}_q \times \mathbb{G} \to \mathbb{G}.

The standard curve choice is Curve25519.

Symmetric primitives

we will need a MAC; the standard choice is HMAC-SHA256 truncated to \lambda=128 bits
we will need a (set of) key derivation function(s) \mathsf{KDF}. A standard choice is SHA256 with a domain separation; eg. \mathsf{KDF}_{\mathsf{MAC}}(x):=H(\texttt{"MAC"}\|x), truncated as necessary
we need a pseudo-random stream generator, which will be used to encrypt the routing information. Usually this is AES128-CTR (that is, AES in counter mode), but it could also be a XOF like SHAKE128. Below we will denote this by \mathsf{XOF}(k) where k is the key.
we also need a pseudo-random permutation (sometimes called a "large block cipher") to encrypt the payload. In particular, one SHOULD NOT USE a standard block cipher in counter mode for this. Note: This is kind of tricky, see more about this below.

Remark: While this is what the original Sphinx paper needs, there may be tweaks of the design, using more modern symmetric primitives, eg. AEAD to replace the problematic pseudo-random permutation. See below, near the bottom.
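The primitives listed above (except the PRP) can be sketched with the Python standard library. This is only a minimal illustration: the function names `kdf`, `mac` and `xof` are made up here, and SHAKE128 is used for the stream instead of AES128-CTR simply because it is available in `hashlib`.

```python
import hashlib
import hmac

LAMBDA_BYTES = 16  # lambda = 128 bits

def kdf(label: bytes, secret: bytes, size: int = LAMBDA_BYTES) -> bytes:
    """Domain-separated KDF: SHA256(label || secret), truncated to `size` bytes."""
    return hashlib.sha256(label + secret).digest()[:size]

def mac(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA256 truncated to lambda bits."""
    return hmac.new(key, data, hashlib.sha256).digest()[:LAMBDA_BYTES]

def xof(key: bytes, nbytes: int) -> bytes:
    """Pseudo-random stream XOF(k); assumption: SHAKE128 applied to the key."""
    return hashlib.shake_128(key).digest(nbytes)

# Example: derive the per-hop MAC key from a (serialized) shared secret.
shared_secret = bytes(32)  # placeholder value, for illustration only
mac_key = kdf(b"mac-key", shared_secret)
tag = mac(mac_key, b"routing info")
```

Different labels give independent-looking keys from the same shared secret, which is the whole point of the domain separation.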

Packet format

A Sphinx packet consists of four parts, traditionally denoted by \alpha, \beta, \gamma, and \delta:

\alpha is a (blinded) public key (group element; \alpha\in\mathbb{G}).
\beta is the (encrypted) routing information
\gamma is a MAC (message authentication code)
\delta is the (encrypted) payload

The first three of these together, \mathcal{H}=(\alpha,\beta,\gamma), are called the header.

Observation: Because we need to allow for the user to create reply paths in advance (SURBs, or single-use reply blocks), where the payload is not yet known, we cannot use traditional "onion" encryption; instead, the header and the payload must be separate. However, the payload will still be encrypted in several layers, and its integrity is protected (if implemented properly).

Constructing a packet

First, the user selects a random path of mix nodes n_0,\dots n_{\ell-1} with \ell\le r. We will also need a final destination \Delta_\mathrm{final}; and in case of replies, a message identifier J.

Computing the shared secrets

We assume all mix nodes have a long-term private-public keypair (\mathsf{sk}_i,\mathsf{pk}_i).

The user first generates an ephemeral secret key x; in practice this is just a random number x\in\mathbb{Z}_q^\times. It can then iteratively derive a sequence of keys (one set per hop) consisting of:

a per-node secret key x_i\in\mathbb{Z}_q^\times
a per-node public key \alpha_i:=x_i*\mathbf{g}\in \mathbb{G}
a per-node shared secret s_i\in \mathbb{G}, derived using Diffie-Hellman: s_i:=x_i*\mathsf{pk}_i = \mathsf{sk}_i*\alpha_i
a blinding factor b_i\in\mathbb{Z}_q^\times, computed from \mathsf{KDF}_{\mathsf{blind}}(\alpha_i,s_i)

Remark: The blinding factor is supposed to be in \mathbb{Z}_q^\times (recall that q\approx 2^{256}). We can ensure this either by simply taking it modulo q, or more properly by rejection sampling a deterministic sequence. Implementations however usually simply say b_i:=H(\alpha_i,s_i) where H is eg. SHA256. Some care should be taken in an actual implementation to avoid possible corner cases.

The key sequence is defined iteratively:

x_0:=x
x_{i+1}:=b_i\cdot x_i

All the rest can be computed from x_i:

the public key is \alpha_i = x_i * \mathbf{g}
the shared secret is s_i=x_i*\mathsf{pk}_i
and the blinding factor is b_i=H(\alpha_i,s_i)

The idea behind this construction is that each hop will have a unique sender public key, from which a shared secret can be derived, and then further symmetric keys for MAC and encryption via a KDF. When composing the forwarded packet for the next hop, the node can then "tweak" this public key using their own blinding factor: \alpha_{i+1}=b_i * \alpha_i.

Thus each hop can only decrypt their own header, and if they try to break the protocol, the message will be ruined.
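The key chain can be illustrated with a toy group. The sketch below is not Curve25519: it uses a tiny Schnorr group (the order-1019 subgroup of the integers modulo 2039), which is completely insecure and chosen only so that the example runs with the standard library. The names `smul`, `blind` and `sender_keys` are hypothetical.

```python
import hashlib
import secrets

# Toy group standing in for Curve25519 (NOT secure, illustration only).
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup, plays the role of q
G = 4      # generator of the order-Q subgroup, plays the role of g

def smul(k, point):
    """'Scalar multiplication' k * point, written multiplicatively as point^k mod P."""
    return pow(point, k, P)

def random_scalar():
    """Uniform element of Z_q^x, i.e. in the range 1..Q-1."""
    return 1 + secrets.randbelow(Q - 1)

def blind(alpha, s):
    """Blinding factor b = H(alpha, s), mapped into 1..Q-1 so it is never zero."""
    h = hashlib.sha256(b"blind" + alpha.to_bytes(2, "big") + s.to_bytes(2, "big"))
    return 1 + int.from_bytes(h.digest(), "big") % (Q - 1)

def sender_keys(x, pks):
    """Return the list of (alpha_i, s_i, b_i) along the path, starting from x_0 = x."""
    out = []
    for pk in pks:
        alpha, s = smul(x, G), smul(x, pk)   # alpha_i = x_i * g, s_i = x_i * pk_i
        b = blind(alpha, s)
        out.append((alpha, s, b))
        x = (b * x) % Q                      # x_{i+1} = b_i * x_i
    return out

# What the relays see: each one only needs its own sk_i and the incoming alpha.
sks = [random_scalar() for _ in range(3)]
pks = [smul(sk, G) for sk in sks]
keys = sender_keys(random_scalar(), pks)
alpha = keys[0][0]
for sk, (alpha_i, s_i, b_i) in zip(sks, keys):
    s = smul(sk, alpha)                      # s_i = sk_i * alpha_i
    assert (alpha, s) == (alpha_i, s_i)
    alpha = smul(blind(alpha, s), alpha)     # alpha_{i+1} = b_i * alpha_i
```

The sender never transmits anything but \alpha_0; every later \alpha_i is recomputed by the relays themselves.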

Fillers (header padding)

In layered, "onion" routing, each node removes a layer of encryption, and forwards the resulting payload. Done naively, this would cause both the headers and the packets to shrink while travelling through the mix path. This is obviously bad; we need to ensure the headers are of uniform size.

However, we also need to ensure that at all hops, the processing of the header and payload is exactly the same (otherwise nodes could figure out where they are in the path). These requirements result in the following, somewhat convoluted construction.

The filler strings \phi_i \in \{0,1\}^{2\lambda i} are also constructed iteratively:

\phi_0 is the empty string
to construct \phi_{i+1}, take \phi_i, append 2\lambda zero bits (in our case, 32 zero bytes), and XOR the resulting string with the last 2\lambda (i+1) bits of the (fixed size) random stream derived from the shared secret s_i:


\begin{align*}
  e_i &:= \mathsf{KDF}(\texttt{"route-enc-key"}\|s_i) \\
  \widetilde\phi_{i+1} &:= [\,\phi_i \;\|\; 0^{2\lambda}\,] \\
  \phi_{i+1} &:= \widetilde\phi_{i+1} \;\oplus\; \mathsf{XOF}(e_i)_{\rho\dots \rho+2\lambda (i+1)}
\end{align*}

where the offset is \rho = (2r+3)\lambda - 2\lambda (i+1). Note: We use the convention that a tilde denotes the unencrypted version.

One reason for all this complication is that the headers are constructed backwards, but the fillers must be constructed forwards. Furthermore, we also need to replace truncated bits while processing and forwarding mix packets; that's why we have the 2\lambda zero bits at the end (it could be any constant, zero is just the simplest).
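The iterative filler construction can be sketched as follows. Assumptions: sizes are in bytes (\lambda = 16), r = 5, the stream is SHAKE128 as a stand-in for \mathsf{XOF}, and the shared secrets are given as already-serialized byte strings. The helper names are hypothetical.

```python
import hashlib

LAM = 16                         # lambda in bytes
R = 5                            # maximum number of hops r
STREAM_LEN = (2 * R + 3) * LAM   # length of the per-hop random stream

def enc_key(s):
    """e_i = KDF("route-enc-key" || s_i), truncated to lambda bits."""
    return hashlib.sha256(b"route-enc-key" + s).digest()[:LAM]

def xof(key, nbytes):
    """Stand-in for XOF(k): SHAKE128 applied to the key."""
    return hashlib.shake_128(key).digest(nbytes)

def filler(shared_secrets):
    """Fold s_0, s_1, ... into the filler; k secrets yield phi_k (2*lambda*k bits)."""
    phi = b""
    for s in shared_secrets:
        padded = phi + bytes(2 * LAM)                  # [phi_i || 0^{2 lambda}]
        stream = xof(enc_key(s), STREAM_LEN)
        tail = stream[STREAM_LEN - len(padded):]       # last 2*lambda*(i+1) bits
        phi = bytes(a ^ b for a, b in zip(padded, tail))
    return phi
```

For a path of length \ell, the header construction needs \phi_{\ell-1}, that is `filler` applied to s_0,\dots,s_{\ell-2}.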

Constructing the header

We now want to construct the initial message header \mathcal{H}_0=(\alpha_0,\beta_0,\gamma_0). This needs to encode the routing information in encrypted layers. So, we will do it iteratively, backwards from the last hop's header \mathcal{H}_{\ell-1}=(\alpha_{\ell-1},\beta_{\ell-1},\gamma_{\ell-1}).

Sidenote: In the paper these headers are denoted by M_i, but that's a bit confusing as the reader could think "message" instead.

Apart from the mix path, we also require a destination address \Delta, and message identifier J\in\{0,1\}^\lambda. These are only used in replies (\Delta would be our address, and J is to identify the reply among several); they don't seem to play a role in forward messages (apart from detecting being the exit node). The destination \Delta must fit into 2\lambda (r-\ell+1) bits; for r=\ell this means 2\lambda (32 bytes), and to simplify we will assume that, ie. \Delta\in\{0,1\}^{2\lambda}.

Remark: In the Logos implementation, \Delta is actually somewhat bigger (but still bounded), 48 or 96 bytes (?). It's relatively straightforward to generalize this construction to allow for that (though 96 bytes instead of 16 is very wasteful, as it gets multiplied by the maximum number of hops r=5...).

\alpha_i\in\mathbb{G} is just the per-hop public key computed above
\beta_i \in \{0,1\}^{(2r+1)\lambda} is the layer-by-layer encrypted routing information
\gamma_i \in \{0,1\}^\lambda is the MAC of \beta_i, computed with a mac key derived from the shared secret s_i

Question: Why isn't \alpha_i also included in the MAC? I guess it's not really necessary, as if \alpha_i is tampered with, then decryption and everything else will fail?

We will need several symmetric keys, all of size \lambda; these are all derived from the shared secret:

m_i := \mathsf{KDF}(\texttt{"mac-key"}\;\|\;s_i) is the MAC key
e_i := \mathsf{KDF}(\texttt{"route-enc-key"}\;\|\;s_i) is the key used to encrypt the routing info
\mathrm{V}_i := \mathsf{KDF}(\texttt{"iv"}\;\|\;s_i) is the initialization vector (when using AES or some other block cipher as our PRG stream generator)

Then we construct the headers iteratively:

\widetilde\beta_{\ell-1} := [\,\Delta \;\|\; J \;\|\; 0^{2(r-\ell)\lambda} \;\|\; \widetilde{\phi}_{\ell-1} \,]
\widetilde\beta_{i-1}:= [\, A_i \;\|\; \gamma_i \;\|\; \mathsf{trunc}_{(2r-1)\lambda}(\beta_i) \,]
\beta_i := \widetilde\beta_i \oplus \mathsf{XOF}(e_i)
\gamma_i := \mathsf{MAC}_{m_i}(\beta_i)

where A_i\in\{0,1\}^\lambda (only 16 bytes, unlike \Delta!) is the address of the i-th node in the path. Note: When computing \beta_{i-1}, the \beta_i coming from the next hop is truncated (the last 2\lambda bits are discarded), so that's where the address and MAC fit in. This is not a problem because as we remove the layers, less and less routing information is required; the actual useful content of the final header is only (\Delta,J).

It's useful to have a picture of the sizes of the various components here:

|\Delta| = 2\lambda
|J| = |\gamma_i| = \lambda
|\phi_i| = 2\lambda i
|\beta_{\ell-1}| = 3\lambda + 2(r-1)\lambda = (2r+1)\lambda
|A_i| = \lambda
|\beta_i| = 2\lambda + (2r-1)\lambda = (2r+1)\lambda = |\beta_{\ell-1}|
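The backward construction of (\beta_0,\gamma_0) can be sketched as below, together with a `peel` helper that mimics what a hop does (check the MAC, pad with 2\lambda zero bits, decrypt). Assumptions, all made for this illustration only: sizes are in bytes, r = 5, SHAKE128 stands in for \mathsf{XOF}, shared secrets are serialized byte strings, and the group element \alpha is left out. One assumption deserves a loud label: for the last hop, the sketch encrypts only the [\Delta\|J\|0\dots] prefix and appends the filler \phi_{\ell-1} as-is, following the original Sphinx paper; that is how it interprets \widetilde\phi_{\ell-1}.

```python
import hashlib
import hmac

LAM, R = 16, 5                       # lambda in bytes, maximum number of hops
BETA_LEN = (2 * R + 1) * LAM
STREAM_LEN = (2 * R + 3) * LAM

def kdf(label, s):
    return hashlib.sha256(label + s).digest()[:LAM]

def stream(s):
    """XOF(e_i) for the hop with shared secret s (SHAKE128 stand-in)."""
    return hashlib.shake_128(kdf(b"route-enc-key", s)).digest(STREAM_LEN)

def mac(s, beta):
    return hmac.new(kdf(b"mac-key", s), beta, hashlib.sha256).digest()[:LAM]

def xor(a, b):
    """XOR; zip() truncates to the shorter input, here always the first one."""
    return bytes(x ^ y for x, y in zip(a, b))

def filler(shared):
    phi = b""
    for s in shared:
        padded = phi + bytes(2 * LAM)
        phi = xor(padded, stream(s)[STREAM_LEN - len(padded):])
    return phi

def build_header(shared, addrs, delta, j):
    """Return (beta_0, gamma_0) for hops with secrets `shared` and addresses `addrs`."""
    ell = len(shared)
    prefix = delta + j + bytes(2 * (R - ell) * LAM)
    # Last hop: encrypted prefix, then the filler phi_{ell-1} (assumption, see lead-in).
    beta = xor(prefix, stream(shared[-1])) + filler(shared[:-1])
    gamma = mac(shared[-1], beta)
    for i in range(ell - 1, 0, -1):
        plain = addrs[i] + gamma + beta[:(2 * R - 1) * LAM]   # A_i || gamma_i || trunc(beta_i)
        beta = xor(plain, stream(shared[i - 1]))
        gamma = mac(shared[i - 1], beta)
    return beta, gamma

def peel(s, beta, gamma):
    """Hop-side processing of the routing info: returns B, or raises on a bad MAC."""
    if not hmac.compare_digest(mac(s, beta), gamma):
        raise ValueError("MAC mismatch")
    return xor(beta + bytes(2 * LAM), stream(s))
```

After `B = peel(...)`, a forwarding hop reads the next address from `B[:LAM]`, the next MAC from `B[LAM:2*LAM]`, and the next routing info from `B[2*LAM:]`; the fillers are exactly what makes that next MAC verify.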

Creating a forward message

Let the message to be sent be \mathtt{msg}\in\{0,1\}^N, and let the final destination (the intended receiver) be \Delta_{\mathrm{final}}\in\{0,1\}^{2\lambda}. Note: N must be a constant, so that each packet has the same size.

The header \mathcal{H}_0 is computed as above, setting \Delta=0^{2\lambda} and J=0^{\lambda}. We also need the per-hop shared secrets s_i. From these, we calculate the payload encryption keys:


k_i := \mathsf{KDF}( \texttt{"payload-key"} \; \| \; s_i)

Now we can compute the encrypted payload iteratively:

P:=[\,0^\lambda\;\|\;\Delta_{\textrm{final}}\;\|\;\mathtt{msg}\,]
\delta_{\ell-1} := \mathsf{ENC}(\,k_{\ell-1};\; P )
\delta_{i-1} := \mathsf{ENC}(\,k_{i-1};\; \delta_i )

The final packet is (\alpha_0,\beta_0,\gamma_0,\delta_0).

This has size 2\lambda+(2r+1)\lambda+\lambda+(3\lambda+N) = N + (2r+7)\lambda. So for the default parameter \lambda=128 and r=5, the overhead is 17\times 16 = 272 bytes (the header being 224 bytes and the integrity check + destination 16+32 = 48 bytes).
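The size bookkeeping above is easy to check mechanically. A small sketch (the function name is hypothetical; sizes are in bytes):

```python
def forward_packet_overhead(r=5, lam_bytes=16):
    """Overhead of a forward packet over the raw message, split into its parts."""
    alpha = 2 * lam_bytes               # group element
    beta = (2 * r + 1) * lam_bytes      # encrypted routing information
    gamma = lam_bytes                   # MAC
    header = alpha + beta + gamma
    payload_extra = 3 * lam_bytes       # 0^lambda integrity check + Delta_final
    return {"header": header, "payload_extra": payload_extra,
            "total": header + payload_extra}

sizes = forward_packet_overhead()
assert sizes == {"header": 224, "payload_extra": 48, "total": 272}
```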

Creating a reply message

First, the original sender creates a SURB (single-use reply block). To do this, they first pick a random path A_0,\dots,A_{\ell-1} and compute a message header \mathcal{H}_0 as above, with \Delta being their own address (though technically it could be somebody else's too), and J a random message identifier.

Pick a random symmetric key K\in\{0,1\}^\lambda, and save the following key mapping in some local table:


J \mapsto (K,k_0,\dots,k_{\ell-1})

where the encryption keys k_i = \mathsf{KDF}( \texttt{"payload-key"} \| s_i) are the same as above.

The SURB is the triple

\mathsf{SURB} := (A_0,\mathcal{H}_0,K)

This has the size (2r+6)\lambda, in our case 256 bytes.

To compose a reply message using a SURB, the sender encrypts their payload \mathtt{reply}\in\{0,1\}^{(N+2\lambda)} into \delta := \mathsf{ENC}(K;\, [0^{\lambda} \,\|\, \mathtt{reply}]) and sends the packet (\mathcal{H}_0,\delta) to the node with address A_0.

Processing mix packets

Upon receiving a mix packet, a node should do the following:

check that the packet size conforms to the expected fixed size, and split it into (\alpha,\beta,\gamma,\delta)
check if the first 32 bytes correspond to a valid group element \alpha\in\mathbb{G} (apparently this is a no-op for Curve25519)
compute the shared secret s = x*\alpha \in \mathbb{G}, where x\in \mathbb{Z}_q^\times is our long-term secret key (denoted \mathsf{sk}_i above; not to be confused with the sender's ephemeral key)
check if this shared secret was seen before by looking up its hash H(s) in a table we keep. Reject if it was seen before.
recompute the MAC of \beta using the derived MAC key m_i=\mathsf{KDF}(\texttt{"mac-key"}\|s), and compare it with \gamma. Reject if they differ
decrypt the routing info B:=\mathsf{DEC}(e_i;\,[\beta\|0^{2\lambda}]) using the derived encryption key e_i=\mathsf{KDF}(\texttt{"route-enc-key"}\|s)
Parse the address in the first \lambda bits of B: If it's all zeros, then we are the exit node of a forward message. If it's an address of a mix node, then we have to forward. Otherwise we are the exit node of a reply message, and this address is the recipient.
Remove a layer of encryption from the payload \delta' := \mathsf{DEC}(k;\, \delta) using the payload encryption key k := \mathsf{KDF}( \texttt{"payload-key"} \| s)

If we are the exit node of a forward message:

Check if the first \lambda bits of \delta' are zero. Reject if not
Extract the destination address \Delta from the payload (bits [\lambda\dots (3\lambda-1)] of \delta')
If it looks like a valid network address, send them the remaining payload

If we are the exit node of a reply message:

It's basically the same as above, except that we also need to extract J and send it together with the payload
We also don't have to remove the \Delta piece from the payload (as it's not there)

If we have to forward it:

compute the blinding factor b = \mathsf{KDF}_{\mathsf{blind}}(\alpha,s)
compute \alpha' = b * \alpha\in\mathbb{G}
let \gamma' be the second \lambda bits of B
let \beta' be the remaining bits of B (recall that before decrypting \beta, we appended 2\lambda zero bits; thus \beta' is the right size!)
forward (\alpha',\beta',\gamma',\delta') to the next mix node, whose address was in the first \lambda bits of B
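Two of the processing steps above, the replay check and the three-way decision based on the first \lambda bits of B, can be sketched as follows. The class and function names are hypothetical, and an in-memory set stands in for whatever table a real node would keep (a real node would also have to expire entries, for example on key rotation).

```python
import hashlib

LAM = 16  # lambda in bytes

class ReplayCache:
    """Remembers H(s) for every shared secret seen so far."""

    def __init__(self):
        self._seen = set()

    def check_and_add(self, shared_secret):
        """Return True if the secret is fresh (and record it), False on a replay."""
        tag = hashlib.sha256(shared_secret).digest()
        if tag in self._seen:
            return False
        self._seen.add(tag)
        return True

def classify(b, mix_addresses):
    """Decide what to do from the first lambda bits of the decrypted routing info B."""
    addr = b[:LAM]
    if addr == bytes(LAM):
        return "exit-forward"     # all zeros: exit node of a forward message
    if addr in mix_addresses:
        return "forward"          # known mix node: blind alpha and pass it on
    return "exit-reply"           # otherwise: the address is the reply recipient
```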

Decrypting replies

Unlike forward message payloads, replies are encrypted; see above. (Of course the sender can encrypt the forward messages themselves, eg. if they know a public key or shared secret with the intended recipient).

To decrypt, the recipient should look up the message id J in their local table, and find the \ell+1 symmetric keys (K,k_0,\dots,k_{\ell-1}).

The recipient of the reply message can decrypt the payload \delta with


\mathsf{DEC}(K;\; \mathsf{ENC}(k_0;\; \mathsf{ENC}(k_1;\;
\dots \mathsf{ENC}(k_{\ell-1};\;\delta)\dots)))

If the first \lambda bits after decryption are zero, then the message is considered valid.

Message integrity

A subtle part of this protocol is message integrity.

This is ensured by prepending \lambda zero bits to any payload, and using a "large block cipher", or pseudo-random permutation (PRP) to encrypt the payload in layers:

\pi : \{0,1\}^\lambda \times \{0,1\}^{N+3\lambda} \to \{0,1\}^{N+3\lambda}

This means that \pi(k) should be indistinguishable from a random permutation of the set of all 2^{N+3\lambda} possible payloads (so it does not permute the bits of a single payload, but the payloads themselves!)

Encryption and decryption is then simply


\begin{align*}
\mathsf{ENC}(k;\delta) &:= \pi(k)(\delta) \\
\mathsf{DEC}(k;\delta) &:= \pi^{-1}(k)(\delta) 
\end{align*}

With such a construction, if one or more bits of the payload are flipped, then after applying the permutation, the whole result should be uncorrelated with the unmodified one. This means that if somebody tries to modify the payload en route, at the end the bits which are supposed to be zeros won't be.

In particular, this most assuredly doesn't work for AES-CTR or any other block cipher in counter mode!

While AES itself is a pseudo-random permutation, it can only permute blocks of fixed size, namely 128 bits. That unfortunately isn't helpful here.
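The failure of stream-style encryption is easy to demonstrate. In the sketch below, SHAKE128 stands in for AES-CTR (an assumption made only to stay within the standard library; both simply XOR a keystream onto the data). An attacker flips one bit of the ciphertext, and the zero prefix survives untouched.

```python
import hashlib

LAM = 16  # lambda in bytes

def stream_enc(key, data):
    """XOR with a keystream; encryption and decryption are the same operation."""
    keystream = hashlib.shake_128(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))

key = b"k" * LAM
plaintext = bytes(LAM) + b"pay 10 coins to alice"

ciphertext = bytearray(stream_enc(key, plaintext))
ciphertext[-1] ^= 0x01                      # attacker flips a single bit in transit
tampered = stream_enc(key, bytes(ciphertext))

assert tampered[:LAM] == bytes(LAM)         # the integrity check still passes...
assert tampered != plaintext                # ...although the message was modified
```

Exactly one bit of the decrypted message changes, at the position the attacker chose. With a PRP over the whole payload, the same flip would scramble everything, including the zero prefix.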

Possible choices for a PRP

While an arbitrary-sized PRP is a very powerful building block, there appear to be fewer choices than for block ciphers.

the paper recommends LIONESS; see the paper "Two practical and provably secure block ciphers: BEAR and LION". These are built using the Luby-Rackoff construction
see also this paper on the security of these schemes
AEZ appears to have an arbitrary-sized block cipher, and should be fast, but I find the paper hard to decipher, and it has probably received less analysis.
see the paper "Improving the Sphinx Mix Network" about modifying the Sphinx construction to use alternative symmetric constructions (eg. AE) (???)
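To make the Luby-Rackoff idea concrete, here is a LIONESS-style sketch: four rounds of an unbalanced Feistel network, alternating a stream cipher keyed by the short left part with a keyed hash of the long right part. The instantiation is an assumption for illustration only (SHAKE256 as the stream cipher, keyed BLAKE2b as the hash, subkeys derived with SHA256); it has not been vetted and is not the parameter choice of any particular implementation.

```python
import hashlib

L_BYTES = 32  # size of the short left part; blocks must be longer than this

def _subkeys(k):
    """Four round keys derived from one key with domain separation (assumption)."""
    return [hashlib.sha256(b"lioness-%d" % i + k).digest() for i in range(4)]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _stream(key, left, nbytes):
    """S(L xor K): keystream of the length of the right part."""
    return hashlib.shake_256(_xor(left, key)).digest(nbytes)

def _hash(key, right):
    """H_K(R): keyed hash of the right part, as long as the left part."""
    return hashlib.blake2b(right, key=key, digest_size=L_BYTES).digest()

def enc(k, block):
    k1, k2, k3, k4 = _subkeys(k)
    l, r = block[:L_BYTES], block[L_BYTES:]
    r = _xor(r, _stream(k1, l, len(r)))
    l = _xor(l, _hash(k2, r))
    r = _xor(r, _stream(k3, l, len(r)))
    l = _xor(l, _hash(k4, r))
    return l + r

def dec(k, block):
    k1, k2, k3, k4 = _subkeys(k)
    l, r = block[:L_BYTES], block[L_BYTES:]
    l = _xor(l, _hash(k4, r))               # undo the rounds in reverse order
    r = _xor(r, _stream(k3, l, len(r)))
    l = _xor(l, _hash(k2, r))
    r = _xor(r, _stream(k1, l, len(r)))
    return l + r

key = b"k" * 16
payload = bytes(16) + b"x" * 112            # 0^lambda || rest of the payload
ct = enc(key, payload)
flipped = ct[:-1] + bytes([ct[-1] ^ 0x01])  # tamper with the last bit
```

Here `dec(key, ct)` returns the payload, while `dec(key, flipped)` should look random, so the zero prefix check fails except with negligible probability.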

Variations

The above is basically what is described in the Sphinx paper. But for different use cases, some modifications could be useful.

Larger address size

Logos Mix is built on top of libp2p, so the natural address format is libp2p addresses. These are apparently not just an IP address, port number and protocol selector (eg. TCP, UDP, QUIC), but also include a "PeerID" (here limited to 39 bytes, but in general it could be longer). Furthermore, a suggested average delay is also encoded in the address field.

In any case, because of the PeerID this doesn't fit into 32 bytes, but just fits into 48 bytes.

Let us allow an address size of t\times \lambda, for both mix nodes and final destination (with t\ge 2). In this case, the pieces change like this:

|\Delta|=|A_i|=t\lambda
|\phi_{k}| = (t+1)k\lambda
|\beta_{k}| = (t+1)r\lambda

The total overhead is the sum of 2\lambda (the group element \alpha), (t+1)r\lambda (the routing info \beta), \lambda (the MAC), and (t+1)\lambda (the final destination address and the integrity check).

In total that's ((t+1)(r+1)+3)\lambda. For the Logos choice of t=3 and r=5, this would mean 432 bytes.

TODO: double-check this!
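The arithmetic itself (though not the underlying size assumptions) can be checked with a few lines of Python; the function name is hypothetical and sizes are in bytes:

```python
def generalized_overhead(t, r, lam_bytes=16):
    """Total overhead when addresses and the destination are t*lambda long."""
    alpha = 2 * lam_bytes                   # group element
    beta = (t + 1) * r * lam_bytes          # routing info: (address + MAC) per hop
    gamma = lam_bytes                       # MAC
    payload_extra = (t + 1) * lam_bytes     # final destination + integrity check
    total = alpha + beta + gamma + payload_extra
    assert total == ((t + 1) * (r + 1) + 3) * lam_bytes   # closed form from the text
    return total

assert generalized_overhead(t=3, r=5) == 432
```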

Mix nodes as recipients

We imagine that sometimes our recipients will also be mix nodes themselves. This probably requires a slight modification of the exit node handling.

TODO

Alternative symmetric cryptography