`recover_cells_and_kzg_proofs` & matrix refactor (#3788)

* Recover cells and proofs & matrix clean up

* Fix table of contents

* Update reference tests generator

* Update test format

* Remove unused imports

* Fix some minor nits

* Rename MatrixEntry's proof to kzg_proof

* Move RowIndex & ColumnIndex to das-core
Justin Traglia, 2024-06-11 06:52:24 -05:00, committed by GitHub
parent 5633417156, commit 5ace424cd8
7 changed files with 254 additions and 163 deletions
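
At a glance, the commit folds proof recovery into cell recovery and replaces the `ExtendedMatrix` / `cells_dict` plumbing with a flat list of self-describing `MatrixEntry` values. A minimal before/after sketch of the call sites changed in the diffs below:

```python
# Before: cells only; proofs had to be recomputed separately.
recovered_cells = recover_all_cells(cell_ids, cells)

# After: cells and proofs are recovered together.
recovered_cells, recovered_proofs = recover_cells_and_kzg_proofs(cell_ids, cells, proofs)
```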

---

@@ -17,6 +17,7 @@
 - [Custody setting](#custody-setting)
 - [Containers](#containers)
   - [`DataColumnSidecar`](#datacolumnsidecar)
+  - [`MatrixEntry`](#matrixentry)
 - [Helper functions](#helper-functions)
   - [`get_custody_columns`](#get_custody_columns)
   - [`compute_extended_matrix`](#compute_extended_matrix)
@@ -53,12 +54,10 @@ The following values are (non-configurable) constants used throughout the specification.
 ## Custom types
 We define the following Python custom types for type hinting and readability:
 | Name | SSZ equivalent | Description |
 | - | - | - |
-| `DataColumn` | `List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]` | The data of each column in EIP-7594 |
-| `ExtendedMatrix` | `List[Cell, MAX_CELLS_IN_EXTENDED_MATRIX]` | The full data of one-dimensional erasure coding extended blobs (in row major format). |
+| `RowIndex` | `uint64` | Row identifier in the matrix of cells |
+| `ColumnIndex` | `uint64` | Column identifier in the matrix of cells |
 ## Configuration
@@ -79,7 +78,7 @@ We define the following Python custom types for type hinting and readability:
 | Name | Value | Description |
 | - | - | - |
-| `SAMPLES_PER_SLOT` | `8` | Number of `DataColumn` random samples a node queries per slot |
+| `SAMPLES_PER_SLOT` | `8` | Number of `DataColumnSidecar` random samples a node queries per slot |
 | `CUSTODY_REQUIREMENT` | `1` | Minimum number of subnets an honest node custodies and serves samples from |
 | `TARGET_NUMBER_OF_PEERS` | `70` | Suggested minimum peer count |
@@ -90,13 +89,23 @@
 ```python
 class DataColumnSidecar(Container):
     index: ColumnIndex  # Index of column in extended matrix
-    column: DataColumn
+    column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
     kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
     kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
     signed_block_header: SignedBeaconBlockHeader
     kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
 ```
+
+#### `MatrixEntry`
+
+```python
+class MatrixEntry(Container):
+    cell: Cell
+    kzg_proof: KZGProof
+    column_index: ColumnIndex
+    row_index: RowIndex
+```
 ### Helper functions
 #### `get_custody_columns`
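
For illustration (not part of the diff): because each `MatrixEntry` now carries its own `row_index` and `column_index`, the row-major flat ordering produced by `compute_extended_matrix` below can be stated as a one-line invariant. A minimal sketch, where `flat_index` is a hypothetical helper and `CELLS_PER_EXT_BLOB` comes from the spec:

```python
# Sketch only: in a well-formed extended matrix, each entry's flat position
# equals its row-major index.
def flat_index(entry: MatrixEntry) -> int:  # hypothetical helper, not in the spec
    return entry.row_index * CELLS_PER_EXT_BLOB + entry.column_index

# all(flat_index(entry) == i for i, entry in enumerate(extended_matrix))
```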
@@ -132,7 +141,7 @@ def get_custody_columns(node_id: NodeID, custody_subnet_count: uint64) -> Sequence[ColumnIndex]:
 #### `compute_extended_matrix`
 ```python
-def compute_extended_matrix(blobs: Sequence[Blob]) -> ExtendedMatrix:
+def compute_extended_matrix(blobs: Sequence[Blob]) -> List[MatrixEntry, MAX_CELLS_IN_EXTENDED_MATRIX]:
     """
     Return the full ``ExtendedMatrix``.
@@ -140,29 +149,44 @@ def compute_extended_matrix(blobs: Sequence[Blob]) -> ExtendedMatrix:
     The data structure for storing cells is implementation-dependent.
     """
     extended_matrix = []
-    for blob in blobs:
-        extended_matrix.extend(compute_cells(blob))
-    return ExtendedMatrix(extended_matrix)
+    for blob_index, blob in enumerate(blobs):
+        cells, proofs = compute_cells_and_kzg_proofs(blob)
+        for cell_id, (cell, proof) in enumerate(zip(cells, proofs)):
+            extended_matrix.append(MatrixEntry(
+                cell=cell,
+                kzg_proof=proof,
+                row_index=blob_index,
+                column_index=cell_id,
+            ))
+    return extended_matrix
 ```
 #### `recover_matrix`
 ```python
-def recover_matrix(cells_dict: Dict[Tuple[BlobIndex, CellID], Cell], blob_count: uint64) -> ExtendedMatrix:
+def recover_matrix(partial_matrix: Sequence[MatrixEntry],
+                   blob_count: uint64) -> List[MatrixEntry, MAX_CELLS_IN_EXTENDED_MATRIX]:
     """
-    Return the recovered ``ExtendedMatrix``.
-    This helper demonstrates how to apply ``recover_all_cells``.
+    Return the recovered extended matrix.
+    This helper demonstrates how to apply ``recover_cells_and_kzg_proofs``.
     The data structure for storing cells is implementation-dependent.
     """
-    extended_matrix: List[Cell] = []
+    extended_matrix = []
     for blob_index in range(blob_count):
-        cell_ids = [cell_id for b_index, cell_id in cells_dict.keys() if b_index == blob_index]
-        cells = [cells_dict[(BlobIndex(blob_index), cell_id)] for cell_id in cell_ids]
-        all_cells_for_row = recover_all_cells(cell_ids, cells)
-        extended_matrix.extend(all_cells_for_row)
-    return ExtendedMatrix(extended_matrix)
+        cell_ids = [e.column_index for e in partial_matrix if e.row_index == blob_index]
+        cells = [e.cell for e in partial_matrix if e.row_index == blob_index]
+        proofs = [e.kzg_proof for e in partial_matrix if e.row_index == blob_index]
+        recovered_cells, recovered_proofs = recover_cells_and_kzg_proofs(cell_ids, cells, proofs)
+        for cell_id, (cell, proof) in enumerate(zip(recovered_cells, recovered_proofs)):
+            extended_matrix.append(MatrixEntry(
+                cell=cell,
+                kzg_proof=proof,
+                row_index=blob_index,
+                column_index=cell_id,
+            ))
+    return extended_matrix
 ```
 #### `get_data_column_sidecars`
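
For illustration (not part of the diff): a minimal round-trip sketch of the two helpers above, assuming `blobs` is a list of valid `Blob` values:

```python
# Sketch only: build the matrix, keep any 50% of each row, recover the rest.
matrix = compute_extended_matrix(blobs)
partial = [entry for entry in matrix if entry.column_index % 2 == 0]
recovered = recover_matrix(partial, blob_count=len(blobs))
assert recovered == matrix
```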
@@ -182,15 +206,15 @@ def get_data_column_sidecars(signed_block: SignedBeaconBlock,
     proofs = [cells_and_proofs[i][1] for i in range(blob_count)]
     sidecars = []
     for column_index in range(NUMBER_OF_COLUMNS):
-        column = DataColumn([cells[row_index][column_index]
-                             for row_index in range(blob_count)])
-        kzg_proof_of_column = [proofs[row_index][column_index]
-                               for row_index in range(blob_count)]
+        column_cells = [cells[row_index][column_index]
+                        for row_index in range(blob_count)]
+        column_proofs = [proofs[row_index][column_index]
+                         for row_index in range(blob_count)]
         sidecars.append(DataColumnSidecar(
             index=column_index,
-            column=column,
+            column=column_cells,
             kzg_commitments=block.body.blob_kzg_commitments,
-            kzg_proofs=kzg_proof_of_column,
+            kzg_proofs=column_proofs,
             signed_block_header=signed_block_header,
             kzg_commitments_inclusion_proof=kzg_commitments_inclusion_proof,
         ))
@@ -283,7 +307,7 @@ Such trailing techniques and their analysis will be valuable for any DAS construction.
 ### Row (blob) custody
-In the one-dimension construction, a node samples the peers by requesting the whole `DataColumn`. In reconstruction, a node can reconstruct all the blobs by 50% of the columns. Note that nodes can still download the row via `blob_sidecar_{subnet_id}` subnets.
+In the one-dimension construction, a node samples the peers by requesting the whole `DataColumnSidecar`. In reconstruction, a node can reconstruct all the blobs by 50% of the columns. Note that nodes can still download the row via `blob_sidecar_{subnet_id}` subnets.
 The potential benefits of having row custody could include:
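
For illustration of the reconstruction claim above (not part of the diff), a minimal sketch assuming `sidecars` holds at least 50% of the `DataColumnSidecar` objects for a block with `blob_count` blobs:

```python
# Sketch only: turn the custodied columns into matrix entries, then recover rows.
partial_matrix = [
    MatrixEntry(cell=sidecar.column[row], kzg_proof=sidecar.kzg_proofs[row],
                row_index=row, column_index=sidecar.index)
    for sidecar in sidecars
    for row in range(blob_count)
]
full_matrix = recover_matrix(partial_matrix, blob_count)
```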

---

@@ -1,4 +1,4 @@
-# EIP-7594 -- Polynomial Commitments
+# EIP-7594 -- Polynomial Commitments Sampling
 ## Table of contents
@@ -46,7 +46,7 @@
 - [`construct_vanishing_polynomial`](#construct_vanishing_polynomial)
 - [`recover_shifted_data`](#recover_shifted_data)
 - [`recover_original_data`](#recover_original_data)
-- [`recover_all_cells`](#recover_all_cells)
+- [`recover_cells_and_kzg_proofs`](#recover_cells_and_kzg_proofs)
 <!-- END doctoc generated TOC please keep comment here to allow auto update -->
 <!-- /TOC -->
@@ -67,9 +67,7 @@ Public functions MUST accept raw bytes as input and perform the required cryptographic normalization before invoking any internal functions.
 | `Coset` | `Vector[BLSFieldElement, FIELD_ELEMENTS_PER_CELL]` | The evaluation domain of a cell |
 | `CosetEvals` | `Vector[BLSFieldElement, FIELD_ELEMENTS_PER_CELL]` | The internal representation of a cell (the evaluations over its Coset) |
 | `Cell` | `ByteVector[BYTES_PER_FIELD_ELEMENT * FIELD_ELEMENTS_PER_CELL]` | The unit of blob data that can come with its own KZG proof |
-| `CellID` | `uint64` | Cell identifier |
-| `RowIndex` | `uint64` | Row identifier |
-| `ColumnIndex` | `uint64` | Column identifier |
+| `CellID` | `uint64` | Validation: `x < CELLS_PER_EXT_BLOB` |
 ## Constants
@@ -660,14 +658,18 @@ def recover_original_data(eval_shifted_extended_evaluation: Sequence[BLSFieldElement],
     return reconstructed_data
 ```
-### `recover_all_cells`
+### `recover_cells_and_kzg_proofs`
 ```python
-def recover_all_cells(cell_ids: Sequence[CellID], cells: Sequence[Cell]) -> Sequence[Cell]:
+def recover_cells_and_kzg_proofs(cell_ids: Sequence[CellID],
+                                 cells: Sequence[Cell],
+                                 proofs_bytes: Sequence[Bytes48]) -> Tuple[
+        Vector[Cell, CELLS_PER_EXT_BLOB],
+        Vector[KZGProof, CELLS_PER_EXT_BLOB]]:
     """
-    Recover all of the cells in the extended blob from FIELD_ELEMENTS_PER_EXT_BLOB evaluations,
-    half of which can be missing.
-    This algorithm uses FFTs to recover cells faster than using Lagrange implementation, as can be seen here:
+    Given at least 50% of cells/proofs for a blob, recover all the cells/proofs.
+    This algorithm uses FFTs to recover cells faster than using Lagrange
+    implementation, as can be seen here:
     https://ethresear.ch/t/reed-solomon-erasure-code-recovery-in-n-log-2-n-time-with-ffts/3039
     A faster version thanks to Qi Zhou can be found here:
@@ -675,17 +677,20 @@ def recover_all_cells(cell_ids: Sequence[CellID], cells: Sequence[Cell]) -> Sequence[Cell]:
     Public method.
     """
-    assert len(cell_ids) == len(cells)
+    assert len(cell_ids) == len(cells) == len(proofs_bytes)
     # Check we have enough cells to be able to perform the reconstruction
     assert CELLS_PER_EXT_BLOB / 2 <= len(cell_ids) <= CELLS_PER_EXT_BLOB
     # Check for duplicates
     assert len(cell_ids) == len(set(cell_ids))
-    # Check that each cell is the correct length
-    for cell in cells:
-        assert len(cell) == BYTES_PER_CELL
     # Check that the cell ids are within bounds
     for cell_id in cell_ids:
         assert cell_id < CELLS_PER_EXT_BLOB
+    # Check that each cell is the correct length
+    for cell in cells:
+        assert len(cell) == BYTES_PER_CELL
+    # Check that each proof is the correct length
+    for proof_bytes in proofs_bytes:
+        assert len(proof_bytes) == BYTES_PER_PROOF
     # Get the extended domain
     roots_of_unity_extended = compute_roots_of_unity(FIELD_ELEMENTS_PER_EXT_BLOB)
@@ -716,9 +721,21 @@ def recover_all_cells(cell_ids: Sequence[CellID], cells: Sequence[Cell]) -> Sequence[Cell]:
         end = (cell_id + 1) * FIELD_ELEMENTS_PER_CELL
         assert reconstructed_data[start:end] == coset_evals
-    reconstructed_data_as_cells = [
+    recovered_cells = [
         coset_evals_to_cell(reconstructed_data[i * FIELD_ELEMENTS_PER_CELL:(i + 1) * FIELD_ELEMENTS_PER_CELL])
         for i in range(CELLS_PER_EXT_BLOB)]
+    polynomial_eval = reconstructed_data[:FIELD_ELEMENTS_PER_BLOB]
+    polynomial_coeff = polynomial_eval_to_coeff(polynomial_eval)
+    recovered_proofs = [None] * CELLS_PER_EXT_BLOB
+    for i, cell_id in enumerate(cell_ids):
+        recovered_proofs[cell_id] = bytes_to_kzg_proof(proofs_bytes[i])
+    for i in range(CELLS_PER_EXT_BLOB):
+        if recovered_proofs[i] is None:
+            coset = coset_for_cell(CellID(i))
+            proof, ys = compute_kzg_proof_multi_impl(polynomial_coeff, coset)
+            assert coset_evals_to_cell(ys) == recovered_cells[i]
+            recovered_proofs[i] = proof
-    return reconstructed_data_as_cells
+    return recovered_cells, recovered_proofs
 ```
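
For illustration (not part of the diff): a minimal sketch of calling the new public method, assuming `compute_cells_and_kzg_proofs` and the constants above are in scope:

```python
# Sketch only: drop the second half of the cells/proofs, then recover everything.
cells, proofs = compute_cells_and_kzg_proofs(blob)  # blob: a valid Blob (assumed)
cell_ids = list(range(CELLS_PER_EXT_BLOB // 2))     # keep the first half only
recovered_cells, recovered_proofs = recover_cells_and_kzg_proofs(
    cell_ids,
    [cells[cell_id] for cell_id in cell_ids],
    [proofs[cell_id] for cell_id in cell_ids],
)
assert list(recovered_cells) == list(cells)
assert list(recovered_proofs) == list(proofs)
```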

---

@@ -9,6 +9,11 @@ from eth2spec.test.helpers.sharding import (
 )
+
+
+def chunks(lst, n):
+    """Helper that splits a list into N sized chunks."""
+    return [lst[i:i + n] for i in range(0, len(lst), n)]
 @with_eip7594_and_later
 @spec_test
 @single_phase
@@ -20,15 +25,15 @@ def test_compute_extended_matrix(spec):
     extended_matrix = spec.compute_extended_matrix(input_blobs)
     assert len(extended_matrix) == spec.CELLS_PER_EXT_BLOB * blob_count
-    rows = [extended_matrix[i:(i + spec.CELLS_PER_EXT_BLOB)]
-            for i in range(0, len(extended_matrix), spec.CELLS_PER_EXT_BLOB)]
+    rows = chunks(extended_matrix, spec.CELLS_PER_EXT_BLOB)
     assert len(rows) == blob_count
-    assert len(rows[0]) == spec.CELLS_PER_EXT_BLOB
+    for row in rows:
+        assert len(row) == spec.CELLS_PER_EXT_BLOB
     for blob_index, row in enumerate(rows):
         extended_blob = []
-        for cell in row:
-            extended_blob.extend(spec.cell_to_coset_evals(cell))
+        for entry in row:
+            extended_blob.extend(spec.cell_to_coset_evals(entry.cell))
         blob_part = extended_blob[0:len(extended_blob) // 2]
         blob = b''.join([spec.bls_field_to_bytes(x) for x in blob_part])
         assert blob == input_blobs[blob_index]
@@ -43,27 +48,19 @@ def test_recover_matrix(spec):
     # Number of samples we will be recovering from
     N_SAMPLES = spec.CELLS_PER_EXT_BLOB // 2
+    # Compute an extended matrix with two blobs
     blob_count = 2
-    cells_dict = {}
-    original_cells = []
-    for blob_index in range(blob_count):
-        # Get the data we will be working with
-        blob = get_sample_blob(spec, rng=rng)
-        # Extend data with Reed-Solomon and split the extended data in cells
-        cells = spec.compute_cells(blob)
-        original_cells.append(cells)
-        cell_ids = []
-        # First figure out just the indices of the cells
-        for _ in range(N_SAMPLES):
-            cell_id = rng.randint(0, spec.CELLS_PER_EXT_BLOB - 1)
-            while cell_id in cell_ids:
-                cell_id = rng.randint(0, spec.CELLS_PER_EXT_BLOB - 1)
-            cell_ids.append(cell_id)
-            cell = cells[cell_id]
-            cells_dict[(blob_index, cell_id)] = cell
-        assert len(cell_ids) == N_SAMPLES
-    # Recover the matrix
-    recovered_matrix = spec.recover_matrix(cells_dict, blob_count)
-    flatten_original_cells = [cell for cells in original_cells for cell in cells]
-    assert recovered_matrix == flatten_original_cells
+    blobs = [get_sample_blob(spec, rng=rng) for _ in range(blob_count)]
+    extended_matrix = spec.compute_extended_matrix(blobs)
+    # Construct a matrix with some entries missing
+    partial_matrix = []
+    for blob_entries in chunks(extended_matrix, spec.CELLS_PER_EXT_BLOB):
+        rng.shuffle(blob_entries)
+        partial_matrix.extend(blob_entries[:N_SAMPLES])
+    # Given the partial matrix, recover the missing entries
+    recovered_matrix = spec.recover_matrix(partial_matrix, blob_count)
+    # Ensure that the recovered matrix matches the original matrix
+    assert recovered_matrix == extended_matrix

---

@@ -64,7 +64,7 @@ def test_verify_cell_kzg_proof_batch(spec):
 @with_eip7594_and_later
 @spec_test
 @single_phase
-def test_recover_all_cells(spec):
+def test_recover_cells_and_kzg_proofs(spec):
     rng = random.Random(5566)
     # Number of samples we will be recovering from
@@ -74,7 +74,7 @@ def test_recover_all_cells(spec):
     blob = get_sample_blob(spec)
     # Extend data with Reed-Solomon and split the extended data in cells
-    cells = spec.compute_cells(blob)
+    cells, proofs = spec.compute_cells_and_kzg_proofs(blob)
     # Compute the cells we will be recovering from
     cell_ids = []
@@ -84,19 +84,21 @@
         while j in cell_ids:
             j = rng.randint(0, spec.CELLS_PER_EXT_BLOB - 1)
         cell_ids.append(j)
-    # Now the cells themselves
+    # Now the cells/proofs themselves
     known_cells = [cells[cell_id] for cell_id in cell_ids]
+    known_proofs = [proofs[cell_id] for cell_id in cell_ids]
-    # Recover all of the cells
-    recovered_cells = spec.recover_all_cells(cell_ids, known_cells)
+    # Recover the missing cells and proofs
+    recovered_cells, recovered_proofs = spec.recover_cells_and_kzg_proofs(cell_ids, known_cells, known_proofs)
     recovered_data = [x for xs in recovered_cells for x in xs]
     # Check that the original data match the non-extended portion of the recovered data
     blob_byte_array = [b for b in blob]
     assert blob_byte_array == recovered_data[:len(recovered_data) // 2]
-    # Check that the recovered cells match the original cells
+    # Check that the recovered cells/proofs match the original cells/proofs
     assert cells == recovered_cells
+    assert proofs == recovered_proofs
 @with_eip7594_and_later

---

@@ -1,23 +0,0 @@
-# Test format: Recover all cells
-
-Recover all cells given at least 50% of the original `cells`.
-
-## Test case format
-
-The test data is declared in a `data.yaml` file:
-
-```yaml
-input:
-  cell_ids: List[CellID] -- the cell identifier for each cell
-  cells: List[Cell] -- the partial collection of cells
-output: List[Cell] -- all cells, including recovered cells
-```
-
-- `CellID` is an unsigned 64-bit integer.
-- `Cell` is a 2048-byte hexadecimal string, prefixed with `0x`.
-
-All byte(s) fields are encoded as strings, hexadecimal encoding, prefixed with `0x`.
-
-## Condition
-
-The `recover_all_cells` handler should recover missing cells, and the result should match the expected `output`. If any cell is invalid (e.g. incorrect length or one of the 32-byte blocks does not represent a BLS field element) or any `cell_id` is invalid (e.g. greater than the number of cells for an extended blob), it should error, i.e. the output should be `null`.

---

@@ -0,0 +1,24 @@
+# Test format: Recover cells and KZG proofs
+
+Recover all cells/proofs given at least 50% of the original `cells` and `proofs`.
+
+## Test case format
+
+The test data is declared in a `data.yaml` file:
+
+```yaml
+input:
+  cell_ids: List[CellID] -- the cell identifier for each cell
+  cells: List[Cell] -- the partial collection of cells
+output: Tuple[List[Cell], List[KZGProof]] -- all cells and proofs
+```
+
+- `CellID` is an unsigned 64-bit integer.
+- `Cell` is a 2048-byte hexadecimal string, prefixed with `0x`.
+- `KZGProof` is a 48-byte hexadecimal string, prefixed with `0x`.
+
+All byte(s) fields are encoded as strings, hexadecimal encoding, prefixed with `0x`.
+
+## Condition
+
+The `recover_cells_and_kzg_proofs` handler should recover missing cells and proofs, and the result should match the expected `output`. If any cell is invalid (e.g. incorrect length or one of the 32-byte blocks does not represent a BLS field element), any proof is invalid (e.g. not on the curve or not in the G1 subgroup of the BLS curve), or any `cell_id` is invalid (e.g. greater than the number of cells for an extended blob), it should error, i.e. the output should be `null`.
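
For illustration (not part of the diff): a minimal sketch of a consumer for this format. Note that the generator below also emits a `proofs` field under `input`, which this sketch reads; `load_yaml` and `test_dir` are hypothetical stand-ins for your test runner's plumbing:

```python
# Sketch only: run one vector against the handler and compare outcomes.
data = load_yaml(test_dir / 'data.yaml')  # hypothetical loader
try:
    result = spec.recover_cells_and_kzg_proofs(
        data['input']['cell_ids'],
        [bytes.fromhex(cell[2:]) for cell in data['input']['cells']],
        [bytes.fromhex(proof[2:]) for proof in data['input']['proofs']],
    )
except Exception:
    result = None  # invalid input: the expected output should be null
assert (result is None) == (data['output'] is None)
```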

---

@@ -11,11 +11,9 @@ from eth2spec.gen_helpers.gen_base import gen_runner, gen_typing
 from eth2spec.test.helpers.constants import EIP7594
 from eth2spec.test.helpers.typing import SpecForkName
 from eth2spec.test.utils.kzg_tests import (
-    BLOB_RANDOM_VALID1,
-    BLOB_RANDOM_VALID2,
-    BLOB_RANDOM_VALID3,
     CELL_RANDOM_VALID1,
     CELL_RANDOM_VALID2,
+    G1,
     INVALID_BLOBS,
     INVALID_G1_POINTS,
     INVALID_INDIVIDUAL_CELL_BYTES,
@@ -616,187 +614,237 @@ def case04_verify_cell_kzg_proof_batch():
 ###############################################################################
-# Test cases for recover_all_cells
+# Test cases for recover_cells_and_kzg_proofs
 ###############################################################################
-def case05_recover_all_cells():
+def case05_recover_cells_and_kzg_proofs():
     # Valid: No missing cells
-    blob = BLOB_RANDOM_VALID1
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[0]
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB))
-    recovered_cells = spec.recover_all_cells(cell_ids, cells)
+    recovered_cells, recovered_proofs = spec.recover_cells_and_kzg_proofs(cell_ids, cells, proofs)
     assert recovered_cells == cells
-    identifier = make_id(cell_ids, cells)
-    yield f'recover_all_cells_case_valid_no_missing_{identifier}', {
+    assert recovered_proofs == proofs
+    identifier = make_id(cell_ids, cells, proofs)
+    yield f'recover_cells_and_kzg_proofs_case_valid_no_missing_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(cells),
+            'proofs': encode_hex_list(proofs),
         },
-        'output': encode_hex_list(recovered_cells)
+        'output': (encode_hex_list(recovered_cells), encode_hex_list(recovered_proofs))
     }
     # Valid: Half missing cells (every other cell)
-    blob = BLOB_RANDOM_VALID2
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[1]
     cell_ids = list(range(0, spec.CELLS_PER_EXT_BLOB, 2))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
-    recovered_cells = spec.recover_all_cells(cell_ids, partial_cells)
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    recovered_cells, recovered_proofs = spec.recover_cells_and_kzg_proofs(cell_ids, partial_cells, partial_proofs)
     assert recovered_cells == cells
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_valid_half_missing_every_other_cell_{identifier}', {
+    assert recovered_proofs == proofs
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_valid_half_missing_every_other_cell_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
        },
-        'output': encode_hex_list(recovered_cells)
+        'output': (encode_hex_list(recovered_cells), encode_hex_list(recovered_proofs))
     }
     # Valid: Half missing cells (first half)
-    blob = BLOB_RANDOM_VALID3
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[2]
     cell_ids = list(range(0, spec.CELLS_PER_EXT_BLOB // 2))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
-    recovered_cells = spec.recover_all_cells(cell_ids, partial_cells)
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    recovered_cells, recovered_proofs = spec.recover_cells_and_kzg_proofs(cell_ids, partial_cells, partial_proofs)
     assert recovered_cells == cells
+    assert recovered_proofs == proofs
     identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_valid_half_missing_first_half_{identifier}', {
+    yield f'recover_cells_and_kzg_proofs_case_valid_half_missing_first_half_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
-        'output': encode_hex_list(recovered_cells)
+        'output': (encode_hex_list(recovered_cells), encode_hex_list(recovered_proofs))
     }
     # Valid: Half missing cells (second half)
-    blob = BLOB_RANDOM_VALID1
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[3]
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2, spec.CELLS_PER_EXT_BLOB))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
-    recovered_cells = spec.recover_all_cells(cell_ids, partial_cells)
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    recovered_cells, recovered_proofs = spec.recover_cells_and_kzg_proofs(cell_ids, partial_cells, partial_proofs)
     assert recovered_cells == cells
+    assert recovered_proofs == proofs
     identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_valid_half_missing_second_half_{identifier}', {
+    yield f'recover_cells_and_kzg_proofs_case_valid_half_missing_second_half_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
-        'output': encode_hex_list(recovered_cells)
+        'output': (encode_hex_list(recovered_cells), encode_hex_list(recovered_proofs))
     }
     # Edge case: All cells are missing
     cell_ids, partial_cells = [], []
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells)
     identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_all_cells_are_missing_{identifier}', {
+    yield f'recover_cells_and_kzg_proofs_case_invalid_all_cells_are_missing_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
     # Edge case: More than half missing
-    blob = BLOB_RANDOM_VALID2
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[4]
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2 - 1))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_more_than_half_missing_{identifier}', {
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_more_than_half_missing_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
     # Edge case: More cells provided than CELLS_PER_EXT_BLOB
-    blob = BLOB_RANDOM_VALID2
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[5]
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB)) + [0]
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_more_cells_than_cells_per_ext_blob_{identifier}', {
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_more_cells_than_cells_per_ext_blob_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
     # Edge case: Invalid cell_id
-    blob = BLOB_RANDOM_VALID1
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[6]
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
     # Replace first cell_id with an invalid value
     cell_ids[0] = spec.CELLS_PER_EXT_BLOB
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_cell_id_{identifier}', {
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_cell_id_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
     # Edge case: Invalid cell
-    blob = BLOB_RANDOM_VALID2
     for cell in INVALID_INDIVIDUAL_CELL_BYTES:
-        cells = spec.compute_cells(blob)
+        cells, proofs = VALID_CELLS_AND_PROOFS[6]
         cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2))
         partial_cells = [cells[cell_id] for cell_id in cell_ids]
+        partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
         # Replace first cell with an invalid value
         partial_cells[0] = cell
-        expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-        identifier = make_id(cell_ids, partial_cells)
-        yield f'recover_all_cells_case_invalid_cell_{identifier}', {
+        expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+        identifier = make_id(cell_ids, partial_cells, partial_proofs)
+        yield f'recover_cells_and_kzg_proofs_case_invalid_cell_{identifier}', {
             'input': {
                 'cell_ids': cell_ids,
                 'cells': encode_hex_list(partial_cells),
+                'proofs': encode_hex_list(partial_proofs),
            },
            'output': None
        }
+    # Edge case: Invalid proof
+    for proof in INVALID_G1_POINTS:
+        cells, proofs = VALID_CELLS_AND_PROOFS[0]
+        cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2))
+        partial_cells = [cells[cell_id] for cell_id in cell_ids]
+        partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+        # Replace first proof with an invalid value
+        partial_proofs[0] = proof
+        expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+        identifier = make_id(cell_ids, partial_cells, partial_proofs)
+        yield f'recover_cells_and_kzg_proofs_case_invalid_proof_{identifier}', {
+            'input': {
+                'cell_ids': cell_ids,
+                'cells': encode_hex_list(partial_cells),
+                'proofs': encode_hex_list(partial_proofs),
+            },
+            'output': None
+        }
     # Edge case: More cell_ids than cells
-    blob = BLOB_RANDOM_VALID3
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[0]
     cell_ids = list(range(0, spec.CELLS_PER_EXT_BLOB, 2))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
     # Add another cell_id
     cell_ids.append(spec.CELLS_PER_EXT_BLOB - 1)
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_more_cell_ids_than_cells_{identifier}', {
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_more_cell_ids_than_cells_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
     # Edge case: More cells than cell_ids
-    blob = BLOB_RANDOM_VALID1
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[1]
     cell_ids = list(range(0, spec.CELLS_PER_EXT_BLOB, 2))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
     # Add another cell
     partial_cells.append(CELL_RANDOM_VALID1)
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_more_cells_than_cell_ids_{identifier}', {
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_more_cells_than_cell_ids_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
+    # Edge case: More proofs than cell_ids
+    cells, proofs = VALID_CELLS_AND_PROOFS[1]
+    cell_ids = list(range(0, spec.CELLS_PER_EXT_BLOB, 2))
+    partial_cells = [cells[cell_id] for cell_id in cell_ids]
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
+    # Add another proof
+    partial_proofs.append(G1)
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_more_proofs_than_cell_ids_{identifier}', {
+        'input': {
+            'cell_ids': cell_ids,
+            'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
+        },
+        'output': None
+    }
     # Edge case: Duplicate cell_id
-    blob = BLOB_RANDOM_VALID2
-    cells = spec.compute_cells(blob)
+    cells, proofs = VALID_CELLS_AND_PROOFS[2]
     # There will be 65 cells, where 64 are unique and 1 is a duplicate.
     # Depending on the implementation, 63 & 1 might not fail for the right
     # reason. For example, if the implementation assigns cells in an array
@@ -804,14 +852,16 @@ def case05_recover_all_cells():
     # to insufficient cell count, not because of a duplicate cell.
     cell_ids = list(range(spec.CELLS_PER_EXT_BLOB // 2 + 1))
     partial_cells = [cells[cell_id] for cell_id in cell_ids]
+    partial_proofs = [proofs[cell_id] for cell_id in cell_ids]
     # Replace first cell_id with the second cell_id
     cell_ids[0] = cell_ids[1]
-    expect_exception(spec.recover_all_cells, cell_ids, partial_cells)
-    identifier = make_id(cell_ids, partial_cells)
-    yield f'recover_all_cells_case_invalid_duplicate_cell_id_{identifier}', {
+    expect_exception(spec.recover_cells_and_kzg_proofs, cell_ids, partial_cells, partial_proofs)
+    identifier = make_id(cell_ids, partial_cells, partial_proofs)
+    yield f'recover_cells_and_kzg_proofs_case_invalid_duplicate_cell_id_{identifier}', {
         'input': {
             'cell_ids': cell_ids,
             'cells': encode_hex_list(partial_cells),
+            'proofs': encode_hex_list(partial_proofs),
         },
         'output': None
     }
@@ -853,5 +903,5 @@ if __name__ == "__main__":
     create_provider(EIP7594, 'compute_cells_and_kzg_proofs', case02_compute_cells_and_kzg_proofs),
     create_provider(EIP7594, 'verify_cell_kzg_proof', case03_verify_cell_kzg_proof),
     create_provider(EIP7594, 'verify_cell_kzg_proof_batch', case04_verify_cell_kzg_proof_batch),
-    create_provider(EIP7594, 'recover_all_cells', case05_recover_all_cells),
+    create_provider(EIP7594, 'recover_cells_and_kzg_proofs', case05_recover_cells_and_kzg_proofs),
 ])