219 lines
7.2 KiB
Markdown
219 lines
7.2 KiB
Markdown
# Ethereum 2.0 Phase 1 -- Data Availability Sampling
|
|
|
|
**Notice**: This document is a work-in-progress for researchers and implementers.
|
|
|
|
## Table of contents
|
|
|
|
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
|
|
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
|
|
|
|
- [Custom types](#custom-types)
|
|
- [Configuration](#configuration)
|
|
- [Misc](#misc)
|
|
- [New containers](#new-containers)
|
|
- [DASSample](#dassample)
|
|
- [ShardBlob](#shardblob)
|
|
- [SignedShardBlob](#signedshardblob)
|
|
- [Helper functions](#helper-functions)
|
|
- [Reverse bit ordering](#reverse-bit-ordering)
|
|
- [`reverse_bit_order`](#reverse_bit_order)
|
|
- [`reverse_bit_order_list`](#reverse_bit_order_list)
|
|
- [Data extension](#data-extension)
|
|
- [Data recovery](#data-recovery)
|
|
- [DAS functions](#das-functions)
|
|
|
|
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
|
|
|
|
## Custom types
|
|
|
|
We define the following Python custom types for type hinting and readability:
|
|
|
|
| Name | SSZ equivalent | Description |
|
|
| - | - | - |
|
|
| `SampleIndex` | `uint64` | A sample index, corresponding to chunk of extended data |
|
|
| `BLSPoint` | `uint256` | A number `x` in the range `0 <= x < MODULUS` |
|
|
|
|
|
|
## Configuration
|
|
|
|
### Misc
|
|
|
|
| Name | Value | Notes |
|
|
| - | - | - |
|
|
| `MAX_RESAMPLE_TIME` | `TODO` (= TODO) | Time window to sample a shard blob and put it on vertical subnets |
|
|
|
|
|
|
## New containers
|
|
|
|
### DASSample
|
|
|
|
```python
|
|
class DASSample(Container):
|
|
slot: Slot
|
|
shard: Shard
|
|
index: SampleIndex
|
|
proof: BLSCommitment
|
|
data: Vector[BLSPoint, POINTS_PER_SAMPLE]
|
|
```
|
|
|
|
### ShardBlob
|
|
|
|
The blob of data, effectively a block. Network-only.
|
|
|
|
```python
|
|
class ShardBlob(Container):
|
|
# Slot and shard that this blob is intended for
|
|
slot: Slot
|
|
shard: Shard
|
|
# The actual data
|
|
data: List[BLSPoint, POINTS_PER_SAMPLE * MAX_SAMPLES_PER_BLOCK]
|
|
```
|
|
|
|
Note that the hash-tree-root of the `ShardBlob` does not match the `ShardHeader`,
|
|
since the blob deals with full data, whereas the header includes the KZG commitment instead.
|
|
|
|
### SignedShardBlob
|
|
|
|
Network-only.
|
|
|
|
```python
|
|
class SignedShardBlob(Container):
|
|
blob: ShardBlob
|
|
# The signature, the message is the commitment on the blob
|
|
signature: BLSSignature
|
|
```
|
|
|
|
|
|
## Helper functions
|
|
|
|
### Reverse bit ordering
|
|
|
|
#### `reverse_bit_order`
|
|
|
|
```python
|
|
def reverse_bit_order(n: int, order: int):
|
|
"""
|
|
Reverse the bit order of an integer n
|
|
"""
|
|
assert is_power_of_two(order)
|
|
return int(('{:0' + str(order.bit_length() - 1) + 'b}').format(n)[::-1], 2)
|
|
```
|
|
|
|
#### `reverse_bit_order_list`
|
|
|
|
```python
|
|
def reverse_bit_order_list(elements: Sequence[int]) -> Sequence[int]:
|
|
order = len(elements)
|
|
assert is_power_of_two(order)
|
|
return [elements[reverse_bit_order(i, order)] for i in range(order)]
|
|
```
|
|
|
|
### Data extension
|
|
|
|
Implementations:
|
|
- [Python](https://github.com/protolambda/partial_fft/blob/master/das_fft.py)
|
|
- [Go](https://github.com/protolambda/go-kate/blob/master/das_extension.go)
|
|
|
|
```python
|
|
def das_fft_extension(data: Sequence[Point]) -> Sequence[Point]:
|
|
"""
|
|
Given some even-index values of an IFFT input, compute the odd-index inputs,
|
|
such that the second output half of the IFFT is all zeroes.
|
|
"""
|
|
poly = inverse_fft(data)
|
|
return fft(poly + [0]*len(poly))[1::2]
|
|
```
|
|
|
|
### Data recovery
|
|
|
|
See [Reed-Solomon erasure code recovery in n*log^2(n) time with FFTs](https://ethresear.ch/t/reed-solomon-erasure-code-recovery-in-n-log-2-n-time-with-ffts/3039) for theory.
|
|
Implementations:
|
|
- [Original Python](https://github.com/ethereum/research/blob/master/mimc_stark/recovery.py)
|
|
- [New optimized approach in python](https://github.com/ethereum/research/tree/master/polynomial_reconstruction)
|
|
- [Old approach in Go](https://github.com/protolambda/go-kate/blob/master/recovery.go)
|
|
|
|
```python
|
|
def recover_data(data: Sequence[Optional[Sequence[Point]]]) -> Sequence[Point]:
|
|
"""Given an a subset of half or more of subgroup-aligned ranges of values, recover the None values."""
|
|
...
|
|
```
|
|
|
|
## DAS functions
|
|
|
|
```python
|
|
def extend_data(data: Sequence[Point]) -> Sequence[Point]:
|
|
"""
|
|
The input data gets reverse-bit-ordered, such that the first half of the final output matches the original data.
|
|
We calculated the odd-index values with the DAS FFT extension, reverse-bit-order to put them in the second half.
|
|
"""
|
|
rev_bit_odds = reverse_bit_order_list(das_fft_extension(reverse_bit_order_list(data)))
|
|
return data + rev_bit_odds
|
|
```
|
|
|
|
```python
|
|
def unextend_data(extended_data: Sequence[Point]) -> Sequence[Point]:
|
|
return extended_data[:len(extended_data)//2]
|
|
```
|
|
|
|
```python
|
|
def check_multi_kzg_proof(commitment: BLSCommitment, proof: BLSCommitment, x: Point, ys: Sequence[Point]) -> bool:
|
|
"""
|
|
Run a KZG multi-proof check to verify that for the subgroup starting at x,
|
|
the proof indeed complements the ys to match the commitment.
|
|
"""
|
|
... # Omitted for now, refer to KZG implementation resources.
|
|
```
|
|
|
|
```python
|
|
def construct_proofs(extended_data_as_poly: Sequence[Point]) -> Sequence[BLSCommitment]:
|
|
"""
|
|
Constructs proofs for samples of extended data (in polynomial form, 2nd half being zeroes).
|
|
Use the FK20 multi-proof approach to construct proofs for a chunk length of POINTS_PER_SAMPLE.
|
|
"""
|
|
... # Omitted for now, refer to KZG implementation resources.
|
|
```
|
|
|
|
```python
|
|
def commit_to_data(data_as_poly: Sequence[Point]) -> BLSCommitment:
|
|
"""Commit to a polynomial by """
|
|
```
|
|
|
|
```python
|
|
def sample_data(slot: Slot, shard: Shard, extended_data: Sequence[Point]) -> Sequence[DASSample]:
|
|
sample_count = len(extended_data) // POINTS_PER_SAMPLE
|
|
assert sample_count <= MAX_SAMPLES_PER_BLOCK
|
|
# get polynomial form of full extended data, second half will be all zeroes.
|
|
poly = ifft(reverse_bit_order_list(extended_data))
|
|
assert all(v == 0 for v in poly[len(poly)//2:])
|
|
proofs = construct_proofs(poly)
|
|
return [
|
|
DASSample(
|
|
slot=slot,
|
|
shard=shard,
|
|
# The proof applies to `x = w ** (reverse_bit_order(i, sample_count) * POINTS_PER_SAMPLE)`
|
|
index=i,
|
|
# The computed proofs match the reverse_bit_order_list(extended_data), undo that to get the right proof.
|
|
proof=proofs[reverse_bit_order(i, sample_count)],
|
|
# note: we leave the sample data as-is so it matches the original nicely.
|
|
# The proof applies to `ys = reverse_bit_order_list(sample.data)`
|
|
data=extended_data[i*POINTS_PER_SAMPLE:(i+1)*POINTS_PER_SAMPLE]
|
|
) for i in range(sample_count)
|
|
]
|
|
```
|
|
|
|
```python
|
|
def verify_sample(sample: DASSample, sample_count: uint64, commitment: BLSCommitment):
|
|
domain_pos = reverse_bit_order(sample.index, sample_count)
|
|
sample_root_of_unity = ROOT_OF_UNITY**MAX_SAMPLES_PER_BLOCK # change point-level to sample-level domain
|
|
x = sample_root_of_unity**domain_pos
|
|
ys = reverse_bit_order_list(sample.data)
|
|
assert check_multi_kzg_proof(commitment, sample.proof, x, ys)
|
|
```
|
|
|
|
```python
|
|
def reconstruct_extended_data(samples: Sequence[Optional[DASSample]]) -> Sequence[Point]:
|
|
# Instead of recovering with a point-by-point approach, recover the samples by recovering missing subgroups.
|
|
subgroups = [None if sample is None else reverse_bit_order_list(sample.data) for sample in samples]
|
|
return recover_data(subgroups)
|
|
```
|