2025-10-31 12:46:57 +01:00


The RS outsourcing protocol

The Reed-Solomon outsourcing protocol is an interactive protocol to convince a client that an untrusted server applied Reed-Solomon encoding to the client's data correctly.

More precisely, the data is organized as a matrix; it is committed to by a Merkle root built on top of the row hashes; and Reed-Solomon encoding is applied to each column independently.

The client uploads the data and the server replies with the Merkle root of the RS-encoded data, together with a proof connecting the Merkle root of the original data to the Merkle root of the encoded data (and also proving the correctness of the encoding).

As usual, the protocol can be made mostly non-interactive; but still there are two phases of communication: data upload by the client, and then a server reply with a proof.

Note: In the proposed use case, there isn't really a "client", and the original data is already committed; however it's easier to describe in a two-party setting (which could be also useful in other settings).

The protocol has three major parts:

  • data preparation
  • FRI protocol to prove the codeword
  • connecting the original data to the encoded data

Furthermore we need to specify the conventions used, for example:

  • encoding of raw data into field elements
  • hash function used to hash the rows
  • Merkle tree construction

Global parameters

The client and the server need to agree on the global parameters. These include:

  • the size and shape of the data to be encoded
  • the rate \rho = 2^{-r} of the Reed-Solomon coding
  • all the parameters of the FRI proof

We use the Goldilocks prime field p=2^{64}-2^{32}+1. In the FRI protocol we also use the quadratic field extension \widetilde{\mathbb{F}} = \mathbb{F}_p[X]/(X^2-7); this particular extension was chosen mainly for compatibility reasons: X^2\pm 7 are the two "simplest" irreducible polynomials over \mathbb{F}_p, and X^2-7 was chosen by Plonky2 and Plonky3.
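The field conventions above can be illustrated with a minimal Python sketch. The function names (`is_quadratic_residue`, `ext_mul`, `ext_inv`) and the pair representation of extension elements are assumptions made for this illustration, not part of the specification. The sketch checks via Euler's criterion that neither 7 nor -7 is a square modulo p, which is why X^2-7 and X^2+7 are both irreducible.

```python
# Goldilocks prime and arithmetic in the quadratic extension F_p[X]/(X^2 - 7).
# Extension elements are represented as pairs (a0, a1) meaning a0 + a1*X.
P = 2**64 - 2**32 + 1
W = 7  # X^2 = W in the extension


def is_quadratic_residue(a: int) -> bool:
    """Euler's criterion: a nonzero a is a square mod P iff a^((P-1)/2) == 1."""
    return pow(a, (P - 1) // 2, P) == 1


def ext_add(x, y):
    """Componentwise addition."""
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)


def ext_mul(x, y):
    """(a0 + a1*X) * (b0 + b1*X), reduced using X^2 = W."""
    a0, a1 = x
    b0, b1 = y
    return ((a0 * b0 + W * a1 * b1) % P, (a0 * b1 + a1 * b0) % P)


def ext_inv(x):
    """Inverse via the norm a0^2 - W*a1^2, which is nonzero whenever x != 0."""
    a0, a1 = x
    norm_inv = pow((a0 * a0 - W * a1 * a1) % P, P - 2, P)
    return (a0 * norm_inv % P, (-a1) * norm_inv % P)
```

For example, `ext_mul((0, 1), (0, 1))` returns `(7, 0)`, reflecting X^2 = 7.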

We use the Monolith hash function. For linear hashing, we use the sponge construction, with state size t = 12 and rate = 8, the 10* padding strategy, and a custom IV (for domain separation).
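As a rough illustration of the 10* padding with rate 8, here is a small Python sketch. It assumes the common reading of 10* padding (always append a single 1, then the minimal number of zeros to reach a multiple of the rate); the helper names `pad_10star` and `rate_blocks` are hypothetical, and the Monolith permutation and the custom IV are not modelled at all.

```python
RATE = 8  # sponge rate from the text (the state size is t = 12)


def pad_10star(elems, rate=RATE):
    """Append a single 1, then zeros up to the next multiple of `rate`."""
    padded = list(elems) + [1]
    padded += [0] * (-len(padded) % rate)
    return padded


def rate_blocks(elems, rate=RATE):
    """Split the padded input into the blocks a sponge would absorb."""
    padded = pad_10star(elems, rate)
    return [padded[i:i + rate] for i in range(0, len(padded), rate)]
```

Under this reading, an input whose length is already a multiple of the rate still gets one extra block, so distinct inputs never pad to the same sequence.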

For Merkle trees we use a custom construction with a keyed compression function based on the Monolith permutation.

Data preparation

We need to convert the linear sequence of bytes on some hard drive into a matrix of field elements. There are many ways to do that, and some important trade-offs to weigh.

TODO;

See also the separate disk layout document.

Connecting the original data to the encoded data

As mentioned above, the original data is encoded as a matrix D\in\mathbb{F}^{N\times M} of field elements, and the columnwise RS-encoded data as another such matrix A\in\mathbb{F}^{N'\times M}. Ideally both N and N' are powers of two; in practice, however, we may need trade-offs where this is not satisfied.

Both matrices are committed by a Merkle root, with the binary Merkle tree built over the (linear) hashes of the matrix rows.

The ideal case

In the simplest, ideal case, we have N=2^n and N'=N/\rho=2^{n+r} where the code rate \rho=2^{-r}.

The encoded data can be partitioned as (D;P_1,P_2,\dots,P_{R-1}) where D\in \mathbb{F}^{N\times M} is the original data, and P_i\in \mathbb{F}^{N\times M} are the parity data (here R=2^r=\rho^{-1}).

Each of these R matrices can be committed with a Merkle root \mathsf{h}_i\in\mathcal{H}; the commitment of the encoded data A (the vertical concatenation of these matrices) will be then the root of the Merkle tree built on these R=2^r roots.

Here is an ASCII diagram for \rho = 1/4:

                    root(A)
                      /  \
                    /      \
                  /          \
               /\              /\ 
              /  \            /  \
 root(D) = h0      h1      h2      h3 = root(P3) 
           /\      /\      /\      /\   
          /  \    /  \    /  \    /  \ 
         /____\  /____\  /____\  /____\
           D       P1      P2      P3   

We want to connect the commitment of the original data \mathsf{root}(D)=\mathsf{h}_0 with the commitment of the encoded data \mathsf{root}(A).

Note that (\mathsf{h}_1,\mathsf{h}_2,\mathsf{h}_3)\in\mathcal{H}^3 form a Merkle proof for this connection! To verify we only need to check that:

$$ \mathsf{root}(A) \;=\; \mathsf{hash}\Big(\;\mathsf{hash}\big(\mathsf{root}(D)\,\|\,\mathsf{h}_1\big) \;\big\|\; \mathsf{hash}\big(\mathsf{h}_2\,\|\,\mathsf{h}_3\big)\;\Big) $$
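The check for the ideal case with rate 1/4 can be sketched in Python as follows. This is only an illustration: SHA-256 stands in for the keyed Monolith compression function, the row hashes are toy values, and the names `H`, `merkle_root` and `check_connection` are hypothetical.

```python
import hashlib


def H(left: bytes, right: bytes) -> bytes:
    # Stand-in compression function (SHA-256), NOT the keyed Monolith one.
    return hashlib.sha256(left + right).digest()


def merkle_root(leaves):
    """Root of a binary Merkle tree over a power-of-two number of leaves."""
    layer = list(leaves)
    assert layer and len(layer) & (len(layer) - 1) == 0
    while len(layer) > 1:
        layer = [H(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def check_connection(root_D, root_A, h1, h2, h3):
    """Recompute root(A) from root(D) = h0 and the proof (h1, h2, h3)."""
    return root_A == H(H(root_D, h1), H(h2, h3))


# Toy data: 16 row hashes; the first N = 4 rows play the role of D,
# the remaining three blocks play the role of P1, P2, P3.
N = 4
row_hashes = [hashlib.sha256(bytes([i])).digest() for i in range(4 * N)]
h0, h1, h2, h3 = (merkle_root(row_hashes[i:i + N]) for i in range(0, 4 * N, N))
root_A = merkle_root(row_hashes)
```

With these toy values, `check_connection(h0, root_A, h1, h2, h3)` accepts, while substituting any other value for `root_D` makes it reject.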



#### What if $N$ is not a power of two?

We may want to allow the number of rows $N$ of the original data matrix (that is, the length of each column) not to be a power of two.

Reasons to do this include:

- finer control over data size than just changing $M$
- having some other restrictions on the number of columns $M$ and wishing for less waste of space

As we want to use the Fast Fourier Transform for the encoding, the simplest solution is to pad to the next power of two $2^n$, for example with zeros.

We don't have to store these zeros on the disk; they can always be "added" at run time. By default, however, the parity data will still have $(\rho^{-1}-1)\times 2^n$ rows.
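A minimal Python sketch of this virtual zero padding is shown below. The helper names (`next_pow2_exponent`, `read_padded`, `parity_rows`) are hypothetical and only illustrate the bookkeeping; they are not taken from any implementation.

```python
def next_pow2_exponent(n_rows: int) -> int:
    """Smallest n with 2^n >= n_rows (for n_rows >= 1)."""
    return max(n_rows - 1, 0).bit_length()


def read_padded(column, i: int, n_log2: int) -> int:
    """Entry i of the virtually zero-padded column; the zeros are never stored."""
    if not 0 <= i < (1 << n_log2):
        raise IndexError(i)
    return column[i] if i < len(column) else 0


def parity_rows(n_rows: int, r: int) -> int:
    """Number of parity rows, (rho^{-1} - 1) * 2^n, for rate rho = 2^{-r}."""
    return ((1 << r) - 1) << next_pow2_exponent(n_rows)
```

For instance, a column of 5 rows is padded to 8 rows, and with rate 1/2 the parity part still has 8 rows.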

#### What if $N'$ is not a power of two?

This is more interesting. In the above ideal setting, we allow $\rho=1/2^r$; in practice, however, that means $\rho=1/2$, as already $\rho=1/4$ would mean a 4x storage overhead!

But even this only practical case has a 2x overhead, and we may want a smaller one, say 1.5x.

Unfortunately, the FRI protocol as described only works on power of two sizes, so in this case we would still do the encoding with $\rho=1/2$, but then discard half of the parity data (**WARNING!** Doing this may have serious security implications, which we ignore here).

In this case we will have to connect _three_ Merkle roots:

- the original data $D$;
- the RS codewords $E$ with $\rho=1/2$ (recall that FRI itself is also a proof against a Merkle commitment);
- and the truncated codewords (with effective $\rho=2/3$) $A$.

In glorious ASCII, this would look something like this:

               root(E) = h0123
                           /\
                         /    \
                       /        \
                     /            \
         root(D) = h01 -- r(A)     h23
                   /\       \      /\ 
                  /  \       \    /  \
               h0      h1      h2      h3 = root(P3) 
               /\      /\      /\      /\   
              /  \    /  \    /  \    /  \ 
             /____\  /____\  /____\  /____\
               D0      D1      P2      P3   
             \____________/  \____________/
                data D          parity
             \____________________/
                   truncated A

Here the public information is $\mathsf{root}(D)=\mathsf{h}_{01}$ and 

$$\mathsf{root}(A) \;:=\; \mathsf{hash}\big(\mathsf{root}(D)\,\|\,\mathsf{root}(P_2)\big) \;=\; \mathsf{hash}(\mathsf{h}_{01}\,\|\,\mathsf{h}_2)$$

The connection proof will then consist of \mathsf{h}_2=\mathsf{root}(P_2), \mathsf{h}_3=\mathsf{root}(P_3) and \mathsf{root}(E)=\mathsf{h}_{0123}; and we can then check that:


\begin{align*}
  \mathsf{root}(A) &= \mathsf{hash}\big(\;
      \mathsf{h}_{01}\;\|\; \mathsf{h}_2\;\big) \\
  \mathsf{root}(E) &= \mathsf{hash}\big(\;\mathsf{h}_{01} \;\|\; \mathsf{hash}(\mathsf{h}_{2}\,\|\,\mathsf{h}_3)\;\big) 
\end{align*}

and that E is really a matrix of codewords.
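The two hash checks for the truncated case can be sketched in Python as follows. Again SHA-256 is only a stand-in for the keyed Monolith compression function, the sub-roots are toy values, and `check_truncated_connection` is a hypothetical name. The sketch covers only the Merkle part; the FRI proof that E consists of codewords is a separate check.

```python
import hashlib


def H(left: bytes, right: bytes) -> bytes:
    # Stand-in compression function (SHA-256), NOT the keyed Monolith one.
    return hashlib.sha256(left + right).digest()


def check_truncated_connection(h01, root_A, root_E, h2, h3):
    """Check root(A) = hash(h01 || h2) and root(E) = hash(h01 || hash(h2 || h3))."""
    ok_A = root_A == H(h01, h2)
    ok_E = root_E == H(h01, H(h2, h3))
    return ok_A and ok_E


# Toy sub-roots standing in for root(D0), root(D1), root(P2), root(P3).
h0, h1, h2, h3 = (hashlib.sha256(bytes([i])).digest() for i in range(4))
h01 = H(h0, h1)            # root(D), public
root_A = H(h01, h2)        # commitment to the truncated codewords, public
root_E = H(h01, H(h2, h3)) # commitment to the full rate-1/2 codewords
```

Note that the tree for A is unbalanced in this sketch: the left child h01 covers twice as many rows as the right child h2.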

What if we want \rho^{-1}>2, that is, a larger code?

The security reasoning of the FRI protocol is very involved (see the corresponding security document), and because of that we may want the codeword to be significantly larger; for example \rho=1/8.

From the "connecting original to encoded" point of view, this situation is similar to the previous one.

Batched FRI protocol

The FRI protocol we use is essentially the same as the one in Plonky2, which is furthermore essentially the same as in the paper "DEEP-FRI: Sampling outside the box improves soundness" by Eli Ben-Sasson et al.

Setup: We have a matrix of Goldilocks field elements of size N\times M with N=2^n being a power of two. We encode each column with Reed-Solomon encoding into size N/\rho (also assumed to be a power of two), interpreting the data as the values of a polynomial on a coset \mathcal{C}\subset \mathbb{F}^\times, and the codeword as its values on a larger coset \mathcal{C}'\supset\mathcal{C}.
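The setup can be made concrete with a small Python sketch of the columnwise encoding. It uses naive O(N^2) interpolation and evaluation instead of an FFT, and it assumes a particular choice of cosets (a shift times the subgroups generated by powers of 7) and the natural ordering of the domain; these choices, and the names `root_of_unity` and `rs_encode_column`, are assumptions made for the illustration.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime


def root_of_unity(log_n: int) -> int:
    """Element of order exactly 2^log_n (log_n <= 32); 7 is a non-residue mod P."""
    return pow(7, (P - 1) >> log_n, P)


def rs_encode_column(column, r: int, shift: int = 1):
    """Encode one column of length N = 2^n with rate 2^{-r} (naive, no FFT)."""
    n = len(column)
    log_n = n.bit_length() - 1
    assert n == 1 << log_n
    w_inv = pow(root_of_unity(log_n), P - 2, P)
    n_inv = pow(n, P - 2, P)
    s_inv = pow(shift, P - 2, P)
    # Naive inverse DFT: coefficients of f with f(shift * w^i) = column[i].
    coeffs = []
    for k in range(n):
        d_k = sum(v * pow(w_inv, i * k, P) for i, v in enumerate(column)) % P
        coeffs.append(d_k * n_inv % P * pow(s_inv, k, P) % P)
    # Evaluate f on the larger coset shift * <w_big> of size N * 2^r.
    w_big = root_of_unity(log_n + r)
    codeword = []
    for j in range(n << r):
        x = shift * pow(w_big, j, P) % P
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % P
        codeword.append(acc)
    return codeword
```

In this ordering the original data appears at every 2^r-th position of the codeword (`codeword[0::2**r] == column`), which is the systematic property; regrouping the positions by their residue modulo 2^r would give a block layout of data followed by parity parts.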

The protocol proves that (the columns of) the matrix are "close to" Reed-Solomon codewords (in a precise sense).

The prover's side:

  • the prover and the verifier agree on the public parameters
  • the prover computes the Reed-Solomon encoding of the columns, and commits to the encoded (larger) matrix with a Merkle root (or Merkle cap)
  • the verifier samples a random combining coefficient \alpha\in\widetilde{\mathbb{F}}
  • the prover computes the linear combination of the RS-encoded columns with coefficients 1,\alpha,\alpha^2,\dots,\alpha^{M-1}
  • "commit phase": the prover repeatedly
    • commits the current vector of values
    • the verifier chooses a random folding coefficient \beta_k\in\widetilde{\mathbb{F}}
    • "folds" the polynomial with the pre-agreed folding arity A_k = 2^{a_k}
    • evaluates the folded polynomial on the evaluation domain \mathcal{D}_{k+1} = \mathcal{D}_{k} ^ {A_k}
  • until the degree of the folded polynomial becomes small enough
  • then the final polynomial is sent in clear (by its coefficients)
  • an optional proof-of-work "grinding" is performed by the prover
  • the verifier samples random row indices 0 \le \mathsf{idx}_j < N/\rho for 0\le j < n_{\mathrm{rounds}}
  • "query phase": repeatedly (by the pre-agreed number n_{\mathrm{rounds}} of times)
    • the prover sends the full row corresponding to the index \mathsf{idx}_j, together with a Merkle proof (against the Merkle root of the encoded matrix)
    • repeatedly (for each folding step):
      • extract the folding coset including the "upstream index" from the folded encoded vector
      • send it together with a Merkle proof against the corresponding commit phase Merkle root
  • serialize the proof into a bytestring
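Two of the prover steps listed above, the column combination and a single folding step, can be sketched in Python. This is a simplified illustration under explicit assumptions: the challenges are taken from the base field rather than the quadratic extension, the folding arity is fixed to 2, and the domain is in natural order (so the value at -x sits half the domain away). The names `combine_columns`, `fold2` and `poly_eval` are hypothetical.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime


def root_of_unity(log_n: int) -> int:
    """Element of order exactly 2^log_n; 7 is a quadratic non-residue mod P."""
    return pow(7, (P - 1) >> log_n, P)


def poly_eval(coeffs, x: int) -> int:
    """Evaluate a polynomial given by its coefficients (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def combine_columns(rows, alpha: int):
    """For each row, compute sum_j alpha^j * row[j]."""
    return [poly_eval(row, alpha) for row in rows]


def fold2(values, beta: int, shift: int, omega: int):
    """One arity-2 folding step.

    values[i] = f(shift * omega^i); writing f(x) = f_e(x^2) + x * f_o(x^2),
    the result holds g(y) = f_e(y) + beta * f_o(y) on y = (shift * omega^i)^2.
    """
    half = len(values) // 2
    inv2 = pow(2, P - 2, P)
    folded = []
    for i in range(half):
        x = shift * pow(omega, i, P) % P
        a, b = values[i], values[i + half]  # f(x) and f(-x)
        even = (a + b) * inv2 % P
        odd = (a - b) * inv2 % P * pow(x, P - 2, P) % P
        folded.append((even + beta * odd) % P)
    return folded
```

Each folding step halves both the domain size and the degree bound, so repeating it eventually yields a polynomial small enough to send in the clear.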

The verifier's side:

  • deserialize the proof from a bytestring
  • check the "shape" of the proof data structure against the global parameters:
    • all the global parameters match the expected (if included in the proof)
    • Merkle cap sizes
    • number of folding steps and folding arities
    • number of commit phase Merkle caps
    • degree of the final polynomial
    • all Merkle proof lengths
    • number of query rounds
    • number of steps in each query round
    • opened matrix row sizes
    • opened folding coset sizes
  • compute all the FRI challenges from the transcript:
    • combining coeff \alpha\in\widetilde{\mathbb{F}}
    • folding coeffs \beta_k\in\widetilde{\mathbb{F}}
    • grinding PoW response
    • query indices 0 \le \mathsf{idx}_j < N/\rho
  • check the grinding proof-of-work condition
  • for each query round:
    • check the row opening Merkle proof
    • compute the combined "upstream value" u_0 = \sum \alpha^j\cdot \mathsf{row}_j \in\widetilde{\mathbb{F}}
    • for each folding step:
      • check the "upstream value" u_k\in\widetilde{\mathbb{F}} against the corresponding element in the opened coset
      • check the folding coset values opening Merkle proof
      • compute the "downstream value" u_{k+1}\in\widetilde{\mathbb{F}} from the coset values using the folding coefficient \beta_k (for example by applying an IFFT on the values and linearly combining the result with powers of \beta)
    • check the final downstream value against the evaluation of the final polynomial at the right location
  • accept if all checks passed.
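The Merkle proof checks in the query rounds follow the usual pattern, sketched here in Python. SHA-256 again stands in for the keyed Monolith compression function, the tree has a single root rather than a Merkle cap, and the function names are hypothetical.

```python
import hashlib


def H(left: bytes, right: bytes) -> bytes:
    # Stand-in compression function (SHA-256), NOT the keyed Monolith one.
    return hashlib.sha256(left + right).digest()


def merkle_layers(leaves):
    """All layers of a binary Merkle tree, from the leaves up to the root."""
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([H(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def merkle_path(layers, idx: int):
    """Sibling hashes from the leaf at `idx` up to (but excluding) the root."""
    path = []
    for layer in layers[:-1]:
        path.append(layer[idx ^ 1])
        idx >>= 1
    return path


def verify_path(root: bytes, leaf: bytes, idx: int, path) -> bool:
    """Recompute the root from a leaf hash, its index and its sibling path."""
    node = leaf
    for sibling in path:
        node = H(node, sibling) if idx & 1 == 0 else H(sibling, node)
        idx >>= 1
    return node == root and idx == 0


# Toy row hashes for an encoded matrix with 8 rows.
leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
layers = merkle_layers(leaves)
root = layers[-1][0]
```

The final `idx == 0` condition rejects indices that are out of range for the given path length, which is one of the "shape" checks the verifier has to perform anyway.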

This concludes the batched FRI protocol.

Summary

We have to prove two things:

  • that the commitment to the "encoded data" really corresponds to something which looks like a set of Reed-Solomon codewords
  • and that this is really an encoding of the original data; because we use a so-called "systematic code", this practically means that the original data is contained in the encoded data

The first point can be proved using the FRI protocol, and the second via a very simple Merkle proof-type argument.

There are a lot of complications in the details, starting from how to encode the data into a matrix of field elements (important because of performance considerations) to all the peculiar details of the (optimized) FRI protocol.