Add `recover_matrix` and remove unused `FlatExtendedMatrix` type

Parent: 428c166283
Commit: c47d5f3578
@@ -18,7 +18,7 @@
 - [Helper functions](#helper-functions)
   - [`get_custody_columns`](#get_custody_columns)
   - [`compute_extended_data`](#compute_extended_data)
-  - [`compute_extended_matrix`](#compute_extended_matrix)
+  - [`recover_matrix`](#recover_matrix)
   - [`get_data_column_sidecars`](#get_data_column_sidecars)
 - [Custody](#custody)
   - [Custody requirement](#custody-requirement)
@@ -47,7 +47,6 @@ We define the following Python custom types for type hinting and readability:
 | - | - | - |
 | `DataColumn` | `List[Cell, MAX_BLOBS_PER_BLOCK]` | The data of each column in EIP-7594 |
 | `ExtendedMatrix` | `List[Cell, MAX_BLOBS_PER_BLOCK * NUMBER_OF_COLUMNS]` | The full data of one-dimensional erasure coding extended blobs (in row major format) |
-| `FlatExtendedMatrix` | `List[BLSFieldElement, FIELD_ELEMENTS_PER_CELL * MAX_BLOBS_PER_BLOCK * NUMBER_OF_COLUMNS]` | The flattened format of `ExtendedMatrix` |
 
 ## Configuration
 
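For orientation while reading the hunks below: the table keeps `ExtendedMatrix` in row-major layout, so cell `(blob_index, column_index)` sits at a single flat offset. A minimal sketch, assuming only the row-major layout stated in the table (`get_cell` is a hypothetical helper, not a spec function):

```python
# Illustrative only: row-major indexing into ExtendedMatrix as described in the
# table above. `get_cell` is a hypothetical helper, not part of the spec.
def get_cell(matrix, blob_index, column_index):
    return matrix[blob_index * NUMBER_OF_COLUMNS + column_index]
```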
@@ -122,12 +121,29 @@ def compute_extended_data(data: Sequence[BLSFieldElement]) -> Sequence[BLSFieldElement]:
     ...
 ```
 
-#### `compute_extended_matrix`
+#### `recover_matrix`
 
 ```python
-def compute_extended_matrix(blobs: Sequence[Blob]) -> FlatExtendedMatrix:
-    matrix = [compute_extended_data(blob) for blob in blobs]
-    return FlatExtendedMatrix(matrix)
+def recover_matrix(cells_dict: Dict[Tuple[BlobIndex, CellID], Cell], blob_count: uint64) -> ExtendedMatrix:
+    """
+    Return the recovered ``ExtendedMatrix``.
+
+    This helper demonstrates how to apply ``recover_polynomial``.
+    The data structure for storing cells is implementation-dependent.
+    """
+    extended_matrix = []
+    for blob_index in range(blob_count):
+        cell_ids = [cell_id for b_index, cell_id in cells_dict.keys() if b_index == blob_index]
+        cells = [cells_dict[(blob_index, cell_id)] for cell_id in cell_ids]
+        cells_bytes = [[bls_field_to_bytes(element) for element in cell] for cell in cells]
+
+        full_polynomial = recover_polynomial(cell_ids, cells_bytes)
+        cells_from_full_polynomial = [
+            full_polynomial[i * FIELD_ELEMENTS_PER_CELL:(i + 1) * FIELD_ELEMENTS_PER_CELL]
+            for i in range(CELLS_PER_BLOB)
+        ]
+        extended_matrix.extend(cells_from_full_polynomial)
+    return ExtendedMatrix(extended_matrix)
 ```
 
 #### `get_data_column_sidecars`
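A minimal usage sketch of the new helper, mirroring the unit test added later in this commit (the `blobs` variable and the choice of the first half of the cells are assumptions for illustration; any 50%+ subset of each blob's cells works):

```python
# Sketch: collect any 50%+ of each blob's cells keyed by (blob_index, cell_id),
# then recover the full extended matrix; mirrors test_recover_matrix below.
cells_dict = {}
for blob_index, blob in enumerate(blobs):  # `blobs` is assumed to be given
    cells = spec.compute_cells(blob)
    for cell_id in range(spec.CELLS_PER_BLOB // 2):  # any half of the cells works
        cells_dict[(blob_index, cell_id)] = cells[cell_id]
extended_matrix = spec.recover_matrix(cells_dict, blob_count=len(blobs))
```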
@@ -204,7 +220,7 @@ To custody a particular column, a node joins the respective gossip subnet. Verif
 
 ### Reconstruction and cross-seeding
 
-If the node obtains 50%+ of all the columns, they can reconstruct the full data matrix via `recover_samples_impl` helper.
+If the node obtains 50%+ of all the columns, it can reconstruct the full data matrix via the `recover_matrix` helper.
 
 If a node fails to sample a peer or fails to get a column on the column subnet, a node can utilize the Req/Resp message to query the missing column from other peers.
 
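To make the 50%+ threshold concrete, a hedged sketch of a reconstruction trigger (the `columns_received` mapping, the trigger function, and the column-to-cell correspondence are assumptions, not spec APIs):

```python
# Sketch, not spec code: trigger reconstruction once 50%+ of columns arrived.
# Assumes `columns_received` maps column_id -> DataColumn (one cell per blob)
# and that column indices coincide with cell indices, per the row-major matrix.
def try_reconstruct(spec, columns_received, blob_count):
    if 2 * len(columns_received) < spec.NUMBER_OF_COLUMNS:
        return None  # fewer than 50% of columns: cannot reconstruct yet
    cells_dict = {
        (blob_index, column_id): column[blob_index]
        for column_id, column in columns_received.items()
        for blob_index in range(blob_count)
    }
    return spec.recover_matrix(cells_dict, blob_count)
```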
@@ -218,7 +234,7 @@ Once the node obtain the column, the node should send the missing columns to the
 
 ## Peer sampling
 
-At each slot, a node makes (locally randomly determined) `SAMPLES_PER_SLOT` queries for samples from their peers via `DataColumnSidecarByRoot` request. A node utilizes `get_custody_columns` helper to determine which peer(s) to request from. If a node has enough good/honest peers across all rows and columns, this has a high chance of success.
+At each slot, a node makes (locally randomly determined) `SAMPLES_PER_SLOT` queries for samples from its peers via the `DataColumnSidecarsByRoot` request. A node utilizes the `get_custody_columns` helper to determine which peer(s) to request from. If a node has enough good/honest peers across all rows and columns, this has a high chance of success.
 
 ## Peer scoring
 
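A hedged sketch of the per-slot sampling loop described above; only `SAMPLES_PER_SLOT`, `NUMBER_OF_COLUMNS`, and `get_custody_columns` are spec names, everything else (the peer objects and the request helper) is an assumption:

```python
import random

# Sketch, not spec code: SAMPLES_PER_SLOT locally random sample queries per slot.
# `peers`, `peer.node_id`, `peer.custody_subnet_count`, and the request helper
# are assumptions; the constants and get_custody_columns come from the spec.
def sample_slot(spec, block_root, peers, rng):
    column_ids = rng.sample(range(spec.NUMBER_OF_COLUMNS), spec.SAMPLES_PER_SLOT)
    for column_id in column_ids:
        custodians = [
            peer for peer in peers
            if column_id in spec.get_custody_columns(peer.node_id, peer.custody_subnet_count)
        ]
        # Hypothetical request helper wrapping DataColumnSidecarsByRoot:
        request_data_column_sidecars_by_root(rng.choice(custodians), block_root, [column_id])
```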
@@ -240,7 +256,7 @@ The fork choice rule (essentially a DA filter) is *orthogonal to a given DAS des
 
 In any DAS design, there are probably a few degrees of freedom around timing, acceptability of short-term re-orgs, etc.
 
-For example, the fork choice rule might require validators to do successful DAS on slot N to be able to include block of slot `N` in its fork choice. That's the tightest DA filter. But trailing filters are also probably acceptable, knowing that there might be some failures/short re-orgs but that they don't hurt the aggregate security. For example, the rule could be — DAS must be completed for slot N-1 for a child block in N to be included in the fork choice.
+For example, the fork choice rule might require validators to do successful DAS on slot `N` to be able to include the block of slot `N` in their fork choice. That's the tightest DA filter. But trailing filters are also probably acceptable, knowing that there might be some failures/short re-orgs but that they don't hurt the aggregate security. For example, the rule could be: DAS must be completed for slot `N-1` for a child block in `N` to be included in the fork choice.
 
 Such trailing techniques and their analysis will be valuable for any DAS construction. The question is — can you relax how quickly you need to do DA and in the worst case not confirm unavailable data via attestations/finality, and what impact does it have on short-term re-orgs and fast confirmation rules.
 
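As a concrete reading of the trailing variant, a minimal sketch (the viability check and its inputs are assumptions, not fork choice spec code):

```python
# Sketch, not spec code: a trailing DA filter. `das_completed_slots` is a
# hypothetical set of slots for which this node's sampling succeeded.
def is_da_viable(block_slot, das_completed_slots, trailing=1):
    # trailing=0 is the tightest filter (require DAS on slot N itself);
    # trailing=1 only requires DAS to have completed for slot N-1.
    return (block_slot - trailing) in das_completed_slots
```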
@@ -20,6 +20,7 @@
 - [BLS12-381 helpers](#bls12-381-helpers)
   - [`hash_to_bls_field`](#hash_to_bls_field)
   - [`bytes_to_bls_field`](#bytes_to_bls_field)
+  - [`bls_field_to_bytes`](#bls_field_to_bytes)
   - [`validate_kzg_g1`](#validate_kzg_g1)
   - [`bytes_to_kzg_commitment`](#bytes_to_kzg_commitment)
   - [`bytes_to_kzg_proof`](#bytes_to_kzg_proof)
@@ -170,6 +171,12 @@ def bytes_to_bls_field(b: Bytes32) -> BLSFieldElement:
     return BLSFieldElement(field_element)
 ```
 
+#### `bls_field_to_bytes`
+
+```python
+def bls_field_to_bytes(x: BLSFieldElement) -> Bytes32:
+    return int.to_bytes(x % BLS_MODULUS, 32, KZG_ENDIANNESS)
+```
+
 #### `validate_kzg_g1`
 
 
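A quick round-trip check relating the new helper to the existing `bytes_to_bls_field` above (a sketch; both helpers exist in this spec file, the specific value 3 is arbitrary):

```python
# Sketch: bls_field_to_bytes serializes a canonical field element so that
# bytes_to_bls_field recovers it exactly.
x = BLSFieldElement(3)
assert bytes_to_bls_field(bls_field_to_bytes(x)) == x
```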
@@ -32,10 +32,6 @@ def bls_add_one(x):
     )
 
 
-def field_element_bytes(x):
-    return int.to_bytes(x % BLS_MODULUS, 32, "big")
-
-
 @with_deneb_and_later
 @spec_test
 @single_phase
@@ -43,7 +39,7 @@ def test_verify_kzg_proof(spec):
     """
     Test the wrapper functions (taking bytes arguments) for computing and verifying KZG proofs.
     """
-    x = field_element_bytes(3)
+    x = spec.bls_field_to_bytes(3)
     blob = get_sample_blob(spec)
     commitment = spec.blob_to_kzg_commitment(blob)
     proof, y = spec.compute_kzg_proof(blob, x)
@@ -58,7 +54,7 @@ def test_verify_kzg_proof_incorrect_proof(spec):
     """
     Test the wrapper function `verify_kzg_proof` fails on an incorrect proof.
    """
-    x = field_element_bytes(3465)
+    x = spec.bls_field_to_bytes(3465)
     blob = get_sample_blob(spec)
     commitment = spec.blob_to_kzg_commitment(blob)
     proof, y = spec.compute_kzg_proof(blob, x)
@@ -0,0 +1,44 @@
+import random
+from eth2spec.test.context import (
+    spec_test,
+    single_phase,
+    with_eip7594_and_later,
+)
+from eth2spec.test.helpers.sharding import (
+    get_sample_blob,
+)
+
+
+@with_eip7594_and_later
+@spec_test
+@single_phase
+def test_recover_matrix(spec):
+    rng = random.Random(5566)
+
+    # Number of samples we will be recovering from
+    N_SAMPLES = spec.CELLS_PER_BLOB // 2
+
+    blob_count = 2
+    cells_dict = {}
+    original_cells = []
+    for blob_index in range(blob_count):
+        # Get the data we will be working with
+        blob = get_sample_blob(spec, rng=rng)
+        # Extend data with Reed-Solomon and split the extended data in cells
+        cells = spec.compute_cells(blob)
+        original_cells.append(cells)
+        cell_ids = []
+        # First figure out just the indices of the cells
+        for _ in range(N_SAMPLES):
+            cell_id = rng.randint(0, spec.CELLS_PER_BLOB - 1)
+            while cell_id in cell_ids:
+                cell_id = rng.randint(0, spec.CELLS_PER_BLOB - 1)
+            cell_ids.append(cell_id)
+            cell = cells[cell_id]
+            cells_dict[(blob_index, cell_id)] = cell
+        assert len(cell_ids) == N_SAMPLES
+
+    # Recover the matrix
+    recovered_matrix = spec.recover_matrix(cells_dict, blob_count)
+    flatten_original_cells = [cell for cells in original_cells for cell in cells]
+    assert recovered_matrix == flatten_original_cells
@@ -10,10 +10,6 @@ from eth2spec.test.helpers.sharding import (
 from eth2spec.utils.bls import BLS_MODULUS
 
 
-def field_element_bytes(x):
-    return int.to_bytes(x % BLS_MODULUS, 32, "big")
-
-
 @with_eip7594_and_later
 @spec_test
 @single_phase
@@ -39,7 +35,7 @@ def test_verify_cell_proof(spec):
     commitment = spec.blob_to_kzg_commitment(blob)
     cells, proofs = spec.compute_cells_and_proofs(blob)
 
-    cells_bytes = [[field_element_bytes(element) for element in cell] for cell in cells]
+    cells_bytes = [[spec.bls_field_to_bytes(element) for element in cell] for cell in cells]
 
     cell_id = 0
     assert spec.verify_cell_proof(commitment, cell_id, cells_bytes[cell_id], proofs[cell_id])
@@ -54,7 +50,7 @@ def test_verify_cell_proof_batch(spec):
     blob = get_sample_blob(spec)
     commitment = spec.blob_to_kzg_commitment(blob)
     cells, proofs = spec.compute_cells_and_proofs(blob)
-    cells_bytes = [[field_element_bytes(element) for element in cell] for cell in cells]
+    cells_bytes = [[spec.bls_field_to_bytes(element) for element in cell] for cell in cells]
 
     assert len(cells) == len(proofs)
 
@@ -83,15 +79,15 @@ def test_recover_polynomial(spec):
 
     # Extend data with Reed-Solomon and split the extended data in cells
     cells = spec.compute_cells(blob)
-    cells_bytes = [[field_element_bytes(element) for element in cell] for cell in cells]
+    cells_bytes = [[spec.bls_field_to_bytes(element) for element in cell] for cell in cells]
 
     # Compute the cells we will be recovering from
     cell_ids = []
     # First figure out just the indices of the cells
     for i in range(N_SAMPLES):
-        j = rng.randint(0, spec.CELLS_PER_BLOB)
+        j = rng.randint(0, spec.CELLS_PER_BLOB - 1)
         while j in cell_ids:
-            j = rng.randint(0, spec.CELLS_PER_BLOB)
+            j = rng.randint(0, spec.CELLS_PER_BLOB - 1)
         cell_ids.append(j)
     # Now the cells themselves
     known_cells_bytes = [cells_bytes[cell_id] for cell_id in cell_ids]
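The bound fix above matters because Python's `random.randint(a, b)` is inclusive of both endpoints, so the old call could return `CELLS_PER_BLOB`, one past the last valid cell index. A quick self-contained demonstration:

```python
import random

# randint(a, b) includes b itself, so an upper bound of CELLS_PER_BLOB could
# produce an out-of-range cell id; CELLS_PER_BLOB - 1 is the correct bound.
rng = random.Random(0)
assert any(rng.randint(0, 3) == 3 for _ in range(1000))  # the bound itself is drawn
```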
@@ -12,8 +12,6 @@ def run_get_custody_columns(spec, peer_count, custody_subnet_count):
     columns_per_subnet = spec.NUMBER_OF_COLUMNS // spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT
     for assignment in assignments:
         assert len(assignment) == custody_subnet_count * columns_per_subnet
-        print('assignment', assignment)
-        print('set(assignment)', set(assignment))
         assert len(assignment) == len(set(assignment))
 
 