Storage proofs & erasure coding
===============================

Erasure coding is used for multiple purposes in Codex:

- To restore data when a host drops from the network; other hosts can restore
  the data that the missing host was storing.
- To speed up downloads
- To increase the probability of detecting missing data on a host

For the first two items we'll use a different erasure coding scheme than we do
for the last. In this document we focus on the last item: an erasure coding
scheme that makes it easier to detect missing or corrupted data on a host
through storage proofs.

Storage proofs
--------------

Our proofs of storage allow a host to prove that it is still in possession of
the data that it promised to hold. A proof is generated by sampling a number of
blocks and providing a Merkle proof for those blocks. The Merkle proof is
generated inside a SNARK, which compresses it to a small size and allows for
cost-effective verification on a blockchain.

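As an illustration, here is a minimal sketch of such a proof in plain code. It
assumes SHA-256 and a power-of-two number of blocks; the real scheme would use a
SNARK-friendly hash and produce the proof inside a SNARK circuit rather than in
Python.

```python
# Minimal sketch of block sampling with Merkle proofs (illustrative only).
# SHA-256 stands in for a SNARK-friendly hash; the number of blocks is assumed
# to be a power of two.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def merkle_tree(leaves):
    """All tree levels, from the hashed leaves up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def merkle_proof(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return proof

def verify(root, block, index, proof):
    digest = h(block)
    for sibling in proof:
        digest = h(digest, sibling) if index % 2 == 0 else h(sibling, digest)
        index //= 2
    return digest == root

blocks = [bytes([i]) * 64 for i in range(8)]   # 8 toy "blocks"
levels = merkle_tree(blocks)
root = levels[-1][0]
sampled = 5                                    # block index chosen by the verifier
assert verify(root, blocks[sampled], sampled, merkle_proof(levels, sampled))
```
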
These storage proofs depend on erasure coding to ensure that a large part of the
data needs to be missing before the original dataset can no longer be restored.
This makes it easier to detect when a dataset is no longer recoverable.

Consider this example without erasure coding:

-------------------------------------
|///|///|///|///|///|///|///|   |///|
-------------------------------------
                              ^
                              |
                           missing

When we query a block from this dataset, we have a low chance of detecting the
missing block. But the dataset is no longer recoverable, because a single block
is missing.

When we add erasure coding:

---------------------------------    ---------------------------------
|   |///|   |///|   |   |///|   |    |///|///|   |   |///|///|   |   |
---------------------------------    ---------------------------------
          original data                         parity data

In this example, more than 50% of the erasure-coded data needs to be missing
before the dataset can no longer be recovered. When we now query a block from
this dataset, we have a more than 50% chance of detecting a missing block. And
when we query multiple blocks, the odds of detecting a missing block increase
dramatically: with n randomly sampled blocks, the probability of detecting an
unrecoverable dataset is at least 1 - 0.5^n.

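As a rough sketch of this calculation (assuming blocks are sampled independently
and that at least half of the erasure-coded blocks must be missing before
recovery fails, as in the example above):

```python
# Probability of detecting an unrecoverable dataset when sampling blocks at
# random. Assumes independent samples and that more than half of the blocks
# must be missing before the dataset becomes unrecoverable.
def detection_probability(samples: int, fraction_missing: float = 0.5) -> float:
    return 1 - (1 - fraction_missing) ** samples

print(detection_probability(1))     # 0.5
print(detection_probability(20))    # ~0.999999 ("six nines")
print(detection_probability(80))    # ~1.0
```
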
Erasure coding
--------------

Reed-Solomon erasure coding works by representing data as a polynomial, and then
sampling parity data from that polynomial.

    __          __          __          __
   /  \        /  \        /  \        /  \
  /    \      /    \      /    \      /    \
_/      \____/      \____/      \____/      \_
 ^  ^  ^  ^  ^  ^  ^  ^      |  |  |  |  |  |  |  |
 |  |  |  |  |  |  |  |      v  v  v  v  v  v  v  v

-------------------------    -------------------------
|//|//|//|//|//|//|//|//|    |//|//|//|//|//|//|//|//|
-------------------------    -------------------------

      original data                   parity

This only works for small amounts of data. When the polynomial is, for instance,
defined over byte-sized elements from a Galois field of size 2^8, you can only
encode 2^8 = 256 bytes (data and parity combined).

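To make the idea concrete, here is a toy Reed-Solomon sketch. It uses a small
prime field and Lagrange interpolation purely for illustration; real
implementations such as leopard work over binary Galois fields with much faster
FFT-based encoders.

```python
# Toy Reed-Solomon over the prime field GF(257) (illustrative only). Data
# symbols are evaluations of a polynomial at x = 0..k-1; parity symbols are
# further evaluations of the same polynomial. Any k of the n symbols suffice
# to reconstruct the polynomial, and therefore the data.
P = 257

def lagrange_eval(points, x):
    """Evaluate the unique polynomial through `points` at `x`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, parity_count):
    """Parity symbols: extra evaluations of the data polynomial."""
    points = list(enumerate(data))
    return [lagrange_eval(points, len(data) + i) for i in range(parity_count)]

def recover(known, k):
    """Recover the k data symbols from any k known (x, y) pairs."""
    points = known[:k]
    return [lagrange_eval(points, x) for x in range(k)]

data = [10, 20, 30, 40]
parity = encode(data, 4)
remaining = list(enumerate(data + parity))[2:6]    # lose 4 of the 8 symbols
assert recover(remaining, len(data)) == data
```
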
Interleaving
------------

To encode larger pieces of data with erasure coding, interleaving is used. This
works by taking larger blocks of data, and encoding smaller elements from these
blocks.

                           data blocks

------------- ------------- ------------- -------------
|x| | | | | | |x| | | | | | |x| | | | | | |x| | | | | |
------------- ------------- ------------- -------------
  |              /              /               |
  \___________   |   __________/                |
              \  |  /    ______________________/
              |  |  |   /
              v  v  v  v

             ---------     ---------
      data   |x|x|x|x| --> |p|p|p|p|   parity
             ---------     ---------

                            |  |  |  |
     _______________________/  |  |  \___________
    /        __________________/  |              \
    |       /                     |               |
    v       v                     v               v
------------- ------------- ------------- -------------
|p| | | | | | |p| | | | | | |p| | | | | | |p| | | | | |
------------- ------------- ------------- -------------

                          parity blocks

This is repeated for each element inside the blocks.

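A sketch of interleaved encoding, reusing the hypothetical `encode` function
from the Reed-Solomon sketch above:

```python
# Interleaved encoding: the i-th element of every data block forms one small
# codeword, and its parity symbols become the i-th element of the parity
# blocks. Assumes an `encode(symbols, parity_count)` function such as the
# Reed-Solomon sketch in the previous section.
def interleave_encode(data_blocks, parity_block_count):
    block_length = len(data_blocks[0])
    parity_blocks = [[0] * block_length for _ in range(parity_block_count)]
    for i in range(block_length):
        symbols = [block[i] for block in data_blocks]   # i-th element of each block
        parity = encode(symbols, parity_block_count)    # one codeword per position
        for j, symbol in enumerate(parity):
            parity_blocks[j][i] = symbol
    return parity_blocks

data_blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
parity_blocks = interleave_encode(data_blocks, 4)
```
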
Adversarial erasure
-------------------

The disadvantage of interleaving is that it weakens the protection against
adversarial erasure that Reed-Solomon provides.

An adversary can now strategically remove only the first element from more than
half of the blocks, and the dataset will be damaged beyond repair. For example,
with a dataset of 1TB erasure coded into 256 data and parity blocks, an
adversary could strategically remove just 129 bytes (the first byte of 129
different blocks), and the data can no longer be fully recovered.

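A back-of-the-envelope check of this example (assuming the 1TB covers data and
parity combined, split evenly over 256 blocks with byte-sized elements):

```python
# Adversarial erasure arithmetic for the example above.
total_size = 2**40                     # 1TB of data and parity combined (assumption)
blocks = 256                           # limited by the 2^8 field size
block_size = total_size // blocks      # 4GB per block
codewords = block_size                 # one interleaved codeword per byte position
bytes_to_destroy = blocks // 2 + 1     # 129: erase one element in >half the blocks
print(block_size // 2**30, "GB per block,", bytes_to_destroy, "bytes to destroy")
```
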
Implications for storage proofs
-------------------------------

This means that when we check for missing data, we should perform our checks on
entire blocks to protect against adversarial erasure. In the case of our Merkle
storage proofs, this means that we need to hash the entire block, and then check
that hash with a Merkle proof. This is unfortunate, because hashing large
amounts of data is rather expensive to perform in a SNARK, which is what we use
to compress proofs.

A large amount of input data in a SNARK leads to a larger circuit, and to more
iterations of the hashing algorithm, which also leads to a larger circuit. A
larger circuit means longer computation and higher memory consumption.

Ideally, we'd like to have small blocks to keep Merkle proofs inside SNARKs
relatively performant, but we are limited by the maximum number of blocks that a
particular Reed-Solomon algorithm supports. For instance, the [leopard][1]
library can create at most 65536 blocks, because it uses a Galois field of size
2^16. Should we use this to encode a 1TB file, we'd end up with blocks of 16MB,
far too large to be practical in a SNARK.

Design space
------------

This limits the choices that we can make. The limiting factors seem to be:

- Maximum number of blocks, determined by the field size of the erasure coding
  algorithm
- Number of blocks per proof, which determines how likely we are to detect
  missing blocks
- Capacity of the SNARK algorithm; how many bytes can we hash in a reasonable
  time inside the SNARK

From these limiting factors we can derive:

- Block size
- Maximum slot size; the maximum amount of data that can be verified with a
  proof
- Erasure coding memory requirements

For example, when we use the leopard library with its Galois field of size 2^16,
require 80 blocks to be sampled per proof, and can implement a SNARK that can
hash 80 * 64KB, then we have:

- Block size: 64KB
- Maximum slot size: 4GB (2^16 * 64KB)
- Erasure coding memory: > 128KB (2^16 * 16 bits)

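The arithmetic behind these numbers can be captured in a small calculator sketch
(parameter names are hypothetical; the values reproduce the example above):

```python
# Back-of-the-envelope calculator for the design space above. Parameter names
# are hypothetical; the numbers reproduce the leopard / 80-blocks example.
def derive(field_bits, blocks_per_proof, snark_capacity_bytes):
    max_blocks = 2 ** field_bits                     # limited by the field size
    block_size = snark_capacity_bytes // blocks_per_proof
    max_slot_size = max_blocks * block_size
    coding_memory = max_blocks * field_bits // 8     # one symbol per block
    return block_size, max_slot_size, coding_memory

block, slot, memory = derive(field_bits=16,
                             blocks_per_proof=80,
                             snark_capacity_bytes=80 * 64 * 1024)
print(block // 1024, "KB")            # 64 KB
print(slot // 2**30, "GB")            # 4 GB
print(memory // 1024, "KB")           # 128 KB
```
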
This has the disadvantage of a rather low maximum slot size of 4GB. If we want
to improve on this to support e.g. 1TB slot sizes, we'll need to either increase
the capacity of the SNARK, increase the field size of the erasure coding
algorithm, or decrease the durability guarantees.

> The [accompanying spreadsheet][4] allows you to explore the design space
> yourself.

Increasing SNARK capacity
-------------------------

Increasing the computational capacity of SNARKs is an active field of study, but
it is unlikely that we'll see an implementation of SNARKs that is 100-1000x
faster before we launch Codex. Better hashing algorithms are also being designed
for use in SNARKs, but it is equally unlikely that we'll see such a speedup
there either.

Decreasing durability guarantees
--------------------------------

We could reduce the durability guarantees by requiring e.g. 20 instead of 80
blocks per proof. This would still give us a probability of detecting missing
data of 1 - 0.5^20, which is 0.999999046, or "six nines". Arguably this is still
good enough. Because the same SNARK capacity now covers 4x larger blocks,
choosing 20 blocks per proof allows for slots up to 16GB:

- Block size: 256KB
- Maximum slot size: 16GB (2^16 * 256KB)
- Erasure coding memory: > 128KB (2^16 * 16 bits)

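Using the hypothetical `derive` and `detection_probability` sketches from
earlier, this trade-off looks as follows:

```python
# Same assumed SNARK capacity (80 * 64KB of hashing), but only 20 sampled blocks.
block, slot, memory = derive(field_bits=16,
                             blocks_per_proof=20,
                             snark_capacity_bytes=80 * 64 * 1024)
print(block // 1024, "KB")             # 256 KB
print(slot // 2**30, "GB")             # 16 GB
print(detection_probability(20))       # 0.9999990463256836 ("six nines")
```
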
Erasure coding field size
-------------------------

If we could perform erasure coding on a field of size around 2^20 to 2^30, then
this would allow us to get to larger slots. For instance, with a field of size
at least 2^24, we could support slot sizes up to 1TB:

- Block size: 64KB
- Maximum slot size: 1TB (2^24 * 64KB)
- Erasure coding memory: > 48MB (2^24 * 24 bits)

We are, however, unaware of any implementation of Reed-Solomon that uses a field
size larger than 2^16 while remaining efficient (O(N log N)). [FastECC][2] uses
a prime field of 20 bits, but it lacks a decoder and a byte encoding scheme. The
paper ["An Efficient (n,k) Information Dispersal Algorithm Based on Fermat
Number Transforms"][3] describes a scheme that uses Proth fields of size 2^30,
but lacks an implementation.

If we were to adopt an erasure coding scheme with a large field, it is likely
that we'll have to implement one ourselves.

Conclusion
----------

It is likely that with the current state of the art in SNARK design and erasure
coding implementations we can only support slot sizes up to 4GB. The most
promising way to increase the supported slot sizes seems to be to implement an
erasure coding algorithm using a field size of around 2^24.

[1]: https://github.com/catid/leopard
[2]: https://github.com/Bulat-Ziganshin/FastECC
[3]: https://ieeexplore.ieee.org/abstract/document/6545355
[4]: ./proof-erasure-coding.ods