From 6271a18ae7f48405b9a550453d932f4fa02d8f46 Mon Sep 17 00:00:00 2001
From: Mark Spanbroek
Date: Thu, 21 Sep 2023 15:45:59 +0200
Subject: [PATCH] erasure coding for storage proofs writeup

---
 design/proof-erasure-coding.md | 244 +++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)
 create mode 100644 design/proof-erasure-coding.md

diff --git a/design/proof-erasure-coding.md b/design/proof-erasure-coding.md
new file mode 100644
index 0000000..f397cd3
--- /dev/null
+++ b/design/proof-erasure-coding.md
@@ -0,0 +1,244 @@
Storage proofs & erasure coding
===============================

Erasure coding is used for multiple purposes in Codex:

- To restore data when a host drops from the network; other hosts can restore
  the data that the missing host was storing.
- To speed up downloads.
- To increase the probability of detecting missing data on a host.

For the first two items we'll use a different erasure coding scheme than we do
for the last. In this document we focus on the last item: an erasure coding
scheme that makes it easier to detect missing or corrupted data on a host
through storage proofs.

Storage proofs
--------------

Our storage proofs allow a host to prove that it is still in possession of the
data that it promised to hold. A proof is generated by sampling a number of
blocks and providing a Merkle proof for those blocks. The Merkle proof is
generated inside a SNARK, which compresses it to a small size and allows for
cost-effective verification on a blockchain.

These storage proofs depend on erasure coding to ensure that a large part of the
data needs to be missing before the original dataset can no longer be restored.
This makes it easier to detect when a dataset is no longer recoverable.

Consider this example without erasure coding:

    -------------------------------------
    |///|///|///|///|///|///|///|   |///|
    -------------------------------------
                                  ^
                                  |
                               missing

When we query a block from this dataset, we have a low chance of hitting the
missing block. Yet the dataset is already beyond repair, because even a single
missing block prevents restoring the original data.

When we add erasure coding:

    ---------------------------------     ---------------------------------
    |   |///|   |///|   |   |///|   |     |///|///|   |   |///|///|   |   |
    ---------------------------------     ---------------------------------
              original data                          parity data

In this example, more than 50% of the erasure coded data needs to be missing
before the dataset can no longer be recovered. So by the time the dataset is
damaged beyond repair, more than half of its blocks are missing, and querying a
single random block detects the damage with a probability of more than 50%. And
when we query multiple blocks, the odds of detecting the damage increase
dramatically.

Erasure coding
--------------

Reed-Solomon erasure coding works by representing data as a polynomial, and then
sampling parity data from that polynomial:

        __              __              __              __
       /  \            /  \            /  \            /  \
    __/    \__________/    \__________/    \__________/    \
       ^  ^  ^  ^  ^  ^  ^  ^
       |  |  |  |  |  |  |  |         |  |  |  |  |  |  |  |
                                      v  v  v  v  v  v  v  v

      -------------------------      -------------------------
      |//|//|//|//|//|//|//|//|      |//|//|//|//|//|//|//|//|
      -------------------------      -------------------------

           original data                      parity
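To make this concrete, here is a minimal, illustrative sketch in Python (not the
codec Codex uses): it treats a handful of data symbols as evaluations of a
polynomial over a small prime field and takes extra evaluations of the same
polynomial as parity, so that any sufficiently large subset of the symbols
recovers the data. Field size, symbol values and helper names are arbitrary
choices for this sketch.

    # k data symbols define a polynomial of degree k-1; parity symbols are
    # extra evaluations of that polynomial. Any k of the k+m symbols are
    # enough to rebuild the polynomial, and with it the original data.

    P = 257  # small prime field for illustration; real codecs use GF(2^8) or GF(2^16)

    def interpolate(points, x):
        """Evaluate, at x, the unique polynomial through the given (xi, yi)
        points, using Lagrange interpolation modulo P."""
        result = 0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    # pow(a, -1, P) is the modular inverse (Python 3.8+)
                    term = term * (x - xj) * pow(xi - xj, -1, P) % P
            result = (result + term) % P
        return result

    def encode(data, parity_count):
        """Data symbol i is the polynomial's value at x = i; parity symbols
        are its values at x = len(data) ... len(data) + parity_count - 1."""
        points = list(enumerate(data))
        return [interpolate(points, x)
                for x in range(len(data), len(data) + parity_count)]

    data = [104, 101, 108, 108, 111]  # 5 data symbols
    parity = encode(data, 3)          # 3 parity symbols

    # Lose any 3 of the 8 symbols; the remaining 5 still determine the
    # polynomial, so the original data can be restored.
    symbols = list(enumerate(data + parity))
    remaining = [s for s in symbols if s[0] not in (1, 3, 6)]
    restored = [interpolate(remaining, x) for x in range(len(data))]
    assert restored == data

Production Reed-Solomon codecs implement the same idea far more efficiently over
binary Galois fields, which is where the size limits discussed next come from.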
This only works for small amounts of data. When the polynomial is, for instance,
defined over byte-sized elements from the Galois field GF(2^8), you can only
encode 2^8 = 256 bytes (data and parity combined).

Interleaving
------------

To encode larger pieces of data with erasure coding, interleaving is used. This
works by taking larger blocks of data and encoding the smaller elements taken
from these blocks:

                           data blocks

    -------------  -------------  -------------  -------------
    |x| | | | | |  |x| | | | | |  |x| | | | | |  |x| | | | | |
    -------------  -------------  -------------  -------------
     |              |              |              |
     \_____________________        |              |
                    \________      |              |
                           | |  ___/              |
                           | | |  ________________/
                           v v v v

                          ---------        ---------
                   data   |x|x|x|x|  -->   |p|p|p|p|   parity
                          ---------        ---------

                                            | | | |
      ______________________________________/ | | |
     |               _________________________/ | |
     |              |               ____________/ |
     |              |              |              |
     v              v              v              v
    -------------  -------------  -------------  -------------
    |p| | | | | |  |p| | | | | |  |p| | | | | |  |p| | | | | |
    -------------  -------------  -------------  -------------

                          parity blocks

This is repeated for each element inside the blocks.

Adversarial erasure
-------------------

The disadvantage of interleaving is that it weakens the protection against
adversarial erasure that Reed-Solomon provides.

An adversary can now strategically remove only the first element from more than
half of the blocks, and the dataset will be damaged beyond repair. For example,
with a dataset of 1TB erasure coded into 256 data and parity blocks, an
adversary could strategically remove just 129 bytes (the first byte of 129 of
the 256 blocks), and the data can no longer be fully recovered.

Implications for storage proofs
-------------------------------

This means that when we check for missing data, we should perform our checks on
entire blocks to protect against adversarial erasure. In the case of our Merkle
storage proofs, this means that we need to hash an entire block, and then check
that hash with a Merkle proof. This is unfortunate, because hashing large
amounts of data is expensive to perform in a SNARK, which is what we use to
compress our proofs.

A large amount of input data in a SNARK leads to a larger circuit, and to more
iterations of the hashing algorithm, which also leads to a larger circuit. A
larger circuit means longer computation and higher memory consumption.

Ideally, we'd like to have small blocks to keep Merkle proofs inside SNARKs
relatively performant, but we are limited by the maximum number of blocks that a
particular Reed-Solomon algorithm supports. For instance, the [leopard][1]
library can create at most 65536 blocks, because it uses a Galois field of 2^16.
Should we use it to encode a 1TB file, we'd end up with blocks of 16MB, far too
large to be practical in a SNARK.

Design space
------------

This limits the choices that we can make. The limiting factors seem to be:

- Maximum number of blocks, determined by the field size of the erasure coding
  algorithm
- Number of blocks per proof, which determines how likely we are to detect
  missing blocks
- Capacity of the SNARK algorithm: how many bytes we can hash in a reasonable
  time inside the SNARK

From these limiting factors we can derive:

- Block size
- Maximum slot size: the maximum amount of data that can be verified with a
  single proof
- Erasure coding memory requirements

For example, when we use the leopard library with a Galois field of 2^16,
require 80 blocks to be sampled per proof, and can implement a SNARK that hashes
80 * 64KB per proof, then we have:

- Block size: 64KB
- Maximum slot size: 4GB (2^16 * 64KB)
- Erasure coding memory: > 128KB (2^16 * 16 bits)

This has the disadvantage of a rather low maximum slot size of 4GB. When we want
to improve on this to support e.g. 1TB slot sizes, we'll need to either increase
the capacity of the SNARK, increase the field size of the erasure coding
algorithm, or decrease the durability guarantees.

> The [accompanying spreadsheet][4] allows you to explore the design space
> yourself.
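As a rough, illustrative counterpart to the spreadsheet, the small Python sketch
below derives these numbers from the limiting factors. The formulas and names
are simplifications based on the examples in this document, not Codex code; the
configurations with 20 blocks per proof and with a 2^24 field are the
alternatives explored in the sections below.

    KB = 1024

    def design_point(field_bits, blocks_per_proof, snark_bytes_per_proof):
        """Derive block size, maximum slot size and erasure coding memory
        from the limiting factors listed above."""
        max_blocks = 2 ** field_bits                # bounded by the Galois field
        block_size = snark_bytes_per_proof // blocks_per_proof
        max_slot_size = max_blocks * block_size
        ec_memory = max_blocks * field_bits // 8    # at least one field element per block
        detection = 1 - 0.5 ** blocks_per_proof     # chance of detecting an unrecoverable slot
        return block_size, max_slot_size, ec_memory, detection

    snark_capacity = 80 * 64 * KB  # assumed hashing budget per proof (5MB)

    print(design_point(16, 80, snark_capacity))  # 64KB blocks,  4GB slots,  128KB, ~1.0
    print(design_point(16, 20, snark_capacity))  # 256KB blocks, 16GB slots, 128KB, ~0.999999
    print(design_point(24, 80, snark_capacity))  # 64KB blocks,  1TB slots,  48MB,  ~1.0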
Increasing SNARK capacity
-------------------------

Increasing the computational capacity of SNARKs is an active field of study, but
it is unlikely that we'll see an implementation of SNARKs that is 100-1000x
faster before we launch Codex. Better hashing algorithms are also being designed
for use in SNARKs, but it is equally unlikely that we'll see such a speedup here
either.

Decreasing durability guarantees
--------------------------------

We could reduce the durability guarantees by requiring e.g. 20 instead of 80
blocks per proof. This would still give us a probability of detecting missing
data of 1 - 0.5^20, which is 0.999999046, or "six nines". Arguably this is still
good enough. Choosing 20 blocks per proof allows for slots up to 16GB:

- Block size: 256KB
- Maximum slot size: 16GB (2^16 * 256KB)
- Erasure coding memory: > 128KB (2^16 * 16 bits)

Erasure coding field size
-------------------------

If we could perform erasure coding on a field of around 2^20 to 2^30 elements,
then this would allow us to get to larger slots. For instance, with a field of
at least size 2^24, we could support slot sizes up to 1TB:

- Block size: 64KB
- Maximum slot size: 1TB (2^24 * 64KB)
- Erasure coding memory: > 48MB (2^24 * 24 bits)

We are however unaware of any implementation of Reed-Solomon that uses a field
size larger than 2^16 while remaining efficient (O(N log N)). [FastECC][2] uses
a prime field of 20 bits, but it lacks a decoder and a byte encoding scheme. The
paper ["An Efficient (n,k) Information Dispersal Algorithm Based on Fermat
Number Transforms"][3] describes a scheme that uses Proth fields of 2^30, but
lacks an implementation.

If we were to adopt an erasure coding scheme with a larger field, it is likely
that we'd have to implement it ourselves.

Conclusion
----------

It is likely that with the current state of the art in SNARK design and erasure
coding implementations we can only support slot sizes up to 4GB. The most
promising way to increase the supported slot sizes seems to be to implement an
erasure coding algorithm that uses a field size of around 2^24.

[1]: https://github.com/catid/leopard
[2]: https://github.com/Bulat-Ziganshin/FastECC
[3]: https://ieeexplore.ieee.org/abstract/document/6545355
[4]: ./proof-erasure-coding.ods