mirror of
https://github.com/logos-storage/logos-storage-proofs-circuits.git
synced 2026-01-07 16:03:08 +00:00
update the README to include the per-block hashing convention
parent
55015008e7
commit
4d101442ca
README.md
@@ -5,14 +5,24 @@ Codex Storage Proofs for the MVP
This document describes the storage proof system for the Codex 2023 Q4 MVP.
Repo organization
-----------------

- `README.md` - this document
- `circuit/` - the proof circuit (`circom` code)
- `reference/haskell/` - Haskell reference implementation of the proof input generation
- `reference/nim/` - Nim reference implementation of the proof input generation
- `test/` - tests for (some parts of the) circuit (using the `r1cs-solver` tool)
Setup
-----
We assume that a user dataset is split into `nSlots` number of (not necessarily
uniformly sized) "slots" of size `slotSize`, for example 10 GB or 100 GB or even
1,000 GB (for the MVP we may choose smaller sizes). The slots of the same dataset
are spread over different storage nodes, but a single storage node can hold several
slots (of different sizes, and belonging to different datasets). The slots themselves
can be optionally erasure coded, but this does not change the proof system, only
its robustness.
@@ -26,23 +36,30 @@ Note that we can simply calculate:
    nCells = slotSize / cellSize
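As a quick sanity check of these quantities (the slot size here is chosen only for illustration; the cell size follows from the 64 kb blocks of 32 cells described below):

```python
# Illustrative parameter choices; the MVP's actual sizes may differ.
slot_size  = 10 * 2**30        # a hypothetical 10 GB slot
block_size = 64 * 2**10        # 64 kb networking blocks
cell_size  = block_size // 32  # 2 kb cells, since each block holds 32 cells

n_cells  = slot_size // cell_size   # nCells = slotSize / cellSize
n_blocks = slot_size // block_size  # number of per-block Merkle roots

print(n_cells, n_blocks, n_cells // n_blocks)
```

With these numbers a 10 GB slot contains 5,242,880 cells grouped into 163,840 blocks of 32 cells each.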
We hash each cell independently, using the sponge construction with Poseidon2
(see below for details).

The cells are then organized into `blockSize = 64kb` blocks, each block containing
`blockSize / cellSize = 32` cells. This is for compatibility with the networking
layer, which uses larger (right now 64kb) blocks. For each block, we compute a
block hash by building a depth `5 = log2(32)` complete Merkle tree, again using
the Poseidon2 hash, with the Merkle tree conventions described below.
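The per-block hashing step can be sketched as follows. This is only an illustration of the tree shape: SHA-256 stands in for the circuit's Poseidon2-based compression, and it ignores the Merkle tree conventions specified later in the README.

```python
import hashlib

def h(left: bytes, right: bytes) -> bytes:
    # Stand-in two-to-one compression; the real circuit uses Poseidon2
    # with specific conventions described later in the README.
    return hashlib.sha256(left + right).digest()

def merkle_root(leaves):
    # Complete binary Merkle tree; the leaf count must be a power of two.
    assert len(leaves) > 0 and len(leaves) & (len(leaves) - 1) == 0
    level = list(leaves)
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# A block holds blockSize / cellSize = 32 cells, so the block hash is the
# root of a depth-5 complete tree over the 32 cell hashes.
cell_hashes = [hashlib.sha256(bytes([i])).digest() for i in range(32)]
block_hash = merkle_root(cell_hashes)
```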
Then on the set of block hashes in a slot (we have `slotSize / blockSize` many
of them), we build another (big) Merkle tree, whose root will identify the slot;
this is called the "slot root", and is denoted by `slotRoot`.
Then for a given dataset, containing several slots, we can build a third binary
Merkle tree on top of its slot roots, resulting in the "dataset root" (note:
this is not the same as the SHA256 hash associated with the original dataset
uploaded by the user). Grafting these Merkle trees together we get a big dataset
Merkle tree; however, one should be careful about the padding conventions
(it makes sense to construct the dataset-level Merkle tree separately, as `nSlots`
may not be a power of two, and later maybe `nCells` and `nBlocks` won't be
powers of two either).
The dataset root is a commitment to the whole (erasure coded) dataset, and will
be posted on-chain, to ensure that the nodes really store the user's data and not
something else. Optionally, the slot roots can also be posted on-chain, but this
seems to be somewhat wasteful.
@@ -194,16 +211,15 @@ the samples in a single slot; then use Groth16 to prove it.
Public inputs:

- dataset root
- slot index within the dataset
- entropy (public randomness)
Private inputs:

- the slot root
- the number of cells in the slot
- the number of slots in the dataset
- the underlying data of the cells, as sequences of field elements
- the Merkle paths from the leaves (the cell hashes) to the slot root
- the Merkle path from the slot root to the dataset root
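The inputs above can be pictured as a simple record handed to the prover. The field names here are purely illustrative, not the circuit's actual signal names:

```python
# Hypothetical shape of the prover's inputs, mirroring the lists above.
# None of these names are taken from the actual circom circuit.
public_inputs = {
    "dataSetRoot": None,   # Merkle root committed on-chain
    "slotIndex": None,     # which slot of the dataset is being proven
    "entropy": None,       # public randomness selecting the samples
}
private_inputs = {
    "slotRoot": None,          # root of the slot's Merkle tree
    "nCells": None,            # number of cells in the slot
    "nSlots": None,            # number of slots in the dataset
    "cellData": None,          # cell contents as field-element sequences
    "cellMerklePaths": None,   # paths: cell hash -> slot root
    "slotMerklePath": None,    # path: slot root -> dataset root
}
```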