# Data Availability Sampling -- Core
**Notice**: This document is a work-in-progress for researchers and implementers.
## Table of contents
<!-- TOC -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
- [Custom types](#custom-types)
- [Configuration](#configuration)
  - [Misc](#misc)
- [New containers](#new-containers)
  - [`DASSample`](#dassample)
- [Helper functions](#helper-functions)
  - [Reverse bit ordering](#reverse-bit-ordering)
    - [`reverse_bit_order`](#reverse_bit_order)
    - [`reverse_bit_order_list`](#reverse_bit_order_list)
  - [Data extension](#data-extension)
  - [Data recovery](#data-recovery)
- [DAS functions](#das-functions)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- /TOC -->
## Custom types
We define the following Python custom types for type hinting and readability:
| Name | SSZ equivalent | Description |
| - | - | - |
| `SampleIndex` | `uint64` | A sample index, corresponding to a chunk of extended data |
## Configuration
### Misc
| Name | Value | Notes |
| - | - | - |
| `MAX_RESAMPLE_TIME` | `TODO` (= TODO) | Time window to sample a shard blob and put it on vertical subnets |
## New containers
### `DASSample`
```python
class DASSample(Container):
    slot: Slot
    shard: Shard
    index: SampleIndex
    proof: BLSCommitment
    data: Vector[BLSPoint, POINTS_PER_SAMPLE]
```
## Helper functions
### Reverse bit ordering
#### `reverse_bit_order`
```python
def reverse_bit_order(n: int, order: int) -> int:
    """
    Reverse the bit order of an integer n, using log2(order) bits.
    """
    assert is_power_of_two(order)
    return int(('{:0' + str(order.bit_length() - 1) + 'b}').format(n)[::-1], 2)
```
#### `reverse_bit_order_list`
```python
def reverse_bit_order_list(elements: Sequence[int]) -> Sequence[int]:
    order = len(elements)
    assert is_power_of_two(order)
    return [elements[reverse_bit_order(i, order)] for i in range(order)]
```
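As a quick illustration of the permutation, the sketch below applies the two helpers to the indices `0..7`. The `is_power_of_two` helper is not defined in this document, so the version here is an assumption about its behaviour.

```python
from typing import Sequence


def is_power_of_two(value: int) -> bool:
    # Assumed helper: used by the spec functions but not defined in this document.
    return value > 0 and value & (value - 1) == 0


def reverse_bit_order(n: int, order: int) -> int:
    assert is_power_of_two(order)
    width = order.bit_length() - 1  # number of bits needed to index `order` items
    return int(('{:0' + str(width) + 'b}').format(n)[::-1], 2)


def reverse_bit_order_list(elements: Sequence[int]) -> Sequence[int]:
    order = len(elements)
    assert is_power_of_two(order)
    return [elements[reverse_bit_order(i, order)] for i in range(order)]


permuted = reverse_bit_order_list(list(range(8)))
print(permuted)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The permutation is its own inverse, and it moves all even indices into the first half of the list and all odd indices into the second half, which is the property the extension functions rely on.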
### Data extension
Implementations:
- [Python](https://github.com/protolambda/partial_fft/blob/master/das_fft.py)
- [Go](https://github.com/protolambda/go-kate/blob/master/das_extension.go)
```python
def das_fft_extension(data: Sequence[Point]) -> Sequence[Point]:
    """
    Given some even-index values of an IFFT input, compute the odd-index inputs,
    such that the second output half of the IFFT is all zeroes.
    """
    poly = inverse_fft(data)
    return fft(poly + [0]*len(poly))[1::2]
```
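To make the extension property concrete, here is a minimal sketch over a toy prime field. The modulus `337`, the 8th root of unity `85`, and the `naive_fft` / `naive_inverse_fft` helpers are assumptions for illustration only: they stand in for the BLS scalar field and the `fft` / `inverse_fft` routines, which this document does not define.

```python
from typing import List, Sequence

MODULUS = 337         # toy prime field (assumption), not the BLS scalar field
ROOT_OF_UNITY_8 = 85  # primitive 8th root of unity modulo 337


def naive_fft(coeffs: Sequence[int], root: int) -> List[int]:
    # O(n^2) stand-in for `fft`: evaluate at root**0, root**1, ..., root**(n-1).
    return [
        sum(c * pow(root, i * j, MODULUS) for j, c in enumerate(coeffs)) % MODULUS
        for i in range(len(coeffs))
    ]


def naive_inverse_fft(values: Sequence[int], root: int) -> List[int]:
    # Inverse DFT: transform with the inverse root, then scale by 1/n.
    inv_n = pow(len(values), MODULUS - 2, MODULUS)
    inv_root = pow(root, MODULUS - 2, MODULUS)
    return [v * inv_n % MODULUS for v in naive_fft(values, inv_root)]


def das_fft_extension(data: Sequence[int]) -> List[int]:
    assert len(data) == 4  # the toy root of unity only supports 8 points in total
    # The even-index values live on the half-size domain generated by root**2.
    poly = naive_inverse_fft(data, pow(ROOT_OF_UNITY_8, 2, MODULUS))
    return naive_fft(poly + [0] * len(poly), ROOT_OF_UNITY_8)[1::2]


evens = [3, 1, 4, 1]
odds = das_fft_extension(evens)

# Interleave evens and odds to obtain all evaluations on the full domain.
full = [v for pair in zip(evens, odds) for v in pair]
coeffs = naive_inverse_fft(full, ROOT_OF_UNITY_8)
```

In this sketch `coeffs[4:]` is all zeroes, so the interleaved values are evaluations of a polynomial of degree below 4. A production implementation would use an `O(n log n)` FFT instead of the quadratic loops shown here.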
### Data recovery
See [Reed-Solomon erasure code recovery in n*log^2(n) time with FFTs](https://ethresear.ch/t/reed-solomon-erasure-code-recovery-in-n-log-2-n-time-with-ffts/3039) for theory.
Implementations:
- [Original Python](https://github.com/ethereum/research/blob/master/mimc_stark/recovery.py)
- [New optimized approach in Python](https://github.com/ethereum/research/tree/master/polynomial_reconstruction)
- [Old approach in Go](https://github.com/protolambda/go-kate/blob/master/recovery.go)
```python
def recover_data(data: Sequence[Optional[Sequence[Point]]]) -> Sequence[Point]:
    """Given a subset of half or more of subgroup-aligned ranges of values, recover the None values."""
    ...
```
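The recovery idea can be sketched at the level of single points with plain Lagrange interpolation. This is deliberately not the FFT-based method from the references above: it runs in quadratic time, works on individual points in natural order rather than on subgroup-aligned ranges, and uses a toy field (modulus `337`, 8th root of unity `85`). The name `recover_points_naive` is hypothetical.

```python
from typing import List, Optional, Sequence

MODULUS = 337         # toy prime field (assumption), not the BLS scalar field
ROOT_OF_UNITY_8 = 85  # primitive 8th root of unity modulo 337


def recover_points_naive(values: Sequence[Optional[int]]) -> List[int]:
    """Fill in missing evaluations of a polynomial of degree < n/2 (toy, O(n^2))."""
    n = len(values)
    assert n == 8  # the toy root of unity has order 8
    domain = [pow(ROOT_OF_UNITY_8, i, MODULUS) for i in range(n)]
    known = [(x, v) for x, v in zip(domain, values) if v is not None]
    assert len(known) >= n // 2, "need at least half of the evaluations"
    # Any n/2 known points determine a polynomial of degree < n/2.
    known = known[: n // 2]

    def evaluate(x: int) -> int:
        total = 0
        for xi, yi in known:
            num, den = 1, 1
            for xj, _ in known:
                if xj != xi:
                    num = num * (x - xj) % MODULUS
                    den = den * (xi - xj) % MODULUS
            # Divide by `den` via Fermat's little theorem.
            total += yi * num * pow(den, MODULUS - 2, MODULUS)
        return total % MODULUS

    return [evaluate(x) if v is None else v for x, v in zip(domain, values)]


# Build consistent extended data from a polynomial of degree < 4, then erase half of it.
coeffs = [5, 2, 7, 1]
full = [
    sum(c * pow(x, j, MODULUS) for j, c in enumerate(coeffs)) % MODULUS
    for x in (pow(ROOT_OF_UNITY_8, i, MODULUS) for i in range(8))
]
damaged = [full[0], None, None, full[3], None, full[5], full[6], None]
recovered = recover_points_naive(damaged)
```

With exactly half of the points present, `recovered` equals `full`. With fewer than half, the polynomial is no longer uniquely determined and the assertion fails.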
## DAS functions
```python
def extend_data(data: Sequence[Point]) -> Sequence[Point]:
    """
    The input data gets reverse-bit-ordered, such that the first half of the final output matches the original data.
    We calculate the odd-index values with the DAS FFT extension, then reverse-bit-order them to put them in the second half.
    """
    rev_bit_odds = reverse_bit_order_list(das_fft_extension(reverse_bit_order_list(data)))
    return data + rev_bit_odds
```
```python
def unextend_data(extended_data: Sequence[Point]) -> Sequence[Point]:
    return extended_data[:len(extended_data)//2]
```
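The round trip through `extend_data` and `unextend_data` can be checked end to end with a small sketch. It assumes a toy field (modulus `337`, 8th root of unity `85`) and quadratic-time `naive_fft` / `naive_inverse_fft` stand-ins, none of which come from this document, and it uses plain integers in place of `Point`.

```python
from typing import List, Sequence

MODULUS = 337         # toy prime field (assumption), not the BLS scalar field
ROOT_OF_UNITY_8 = 85  # primitive 8th root of unity modulo 337


def reverse_bit_order(n: int, order: int) -> int:
    width = order.bit_length() - 1
    return int(('{:0' + str(width) + 'b}').format(n)[::-1], 2)


def reverse_bit_order_list(elements: Sequence[int]) -> List[int]:
    order = len(elements)
    return [elements[reverse_bit_order(i, order)] for i in range(order)]


def naive_fft(coeffs: Sequence[int], root: int) -> List[int]:
    # O(n^2) stand-in for `fft`: evaluate at root**0, root**1, ..., root**(n-1).
    return [
        sum(c * pow(root, i * j, MODULUS) for j, c in enumerate(coeffs)) % MODULUS
        for i in range(len(coeffs))
    ]


def naive_inverse_fft(values: Sequence[int], root: int) -> List[int]:
    inv_n = pow(len(values), MODULUS - 2, MODULUS)
    inv_root = pow(root, MODULUS - 2, MODULUS)
    return [v * inv_n % MODULUS for v in naive_fft(values, inv_root)]


def das_fft_extension(data: Sequence[int]) -> List[int]:
    poly = naive_inverse_fft(data, pow(ROOT_OF_UNITY_8, 2, MODULUS))
    return naive_fft(poly + [0] * len(poly), ROOT_OF_UNITY_8)[1::2]


def extend_data(data: Sequence[int]) -> List[int]:
    rev_bit_odds = reverse_bit_order_list(das_fft_extension(reverse_bit_order_list(data)))
    return list(data) + rev_bit_odds


def unextend_data(extended_data: Sequence[int]) -> List[int]:
    return list(extended_data[:len(extended_data) // 2])


data = [3, 1, 4, 1]  # four points, so the extended data has eight
extended = extend_data(data)
# Same check that `sample_data` performs: the polynomial form has a zero second half.
poly = naive_inverse_fft(reverse_bit_order_list(extended), ROOT_OF_UNITY_8)
```

Here `extended[:4]` is the original data untouched, `unextend_data(extended)` returns it, and `poly[4:]` is all zeroes.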
```python
def check_multi_kzg_proof(commitment: BLSCommitment, proof: BLSCommitment, x: Point, ys: Sequence[Point]) -> bool:
    """
    Run a KZG multi-proof check to verify that for the subgroup starting at x,
    the proof indeed complements the ys to match the commitment.
    """
    ...  # Omitted for now, refer to KZG implementation resources.
```
```python
def construct_proofs(extended_data_as_poly: Sequence[Point]) -> Sequence[BLSCommitment]:
    """
    Constructs proofs for samples of extended data (in polynomial form, 2nd half being zeroes).
    Use the FK20 multi-proof approach to construct proofs for a chunk length of POINTS_PER_SAMPLE.
    """
    ...  # Omitted for now, refer to KZG implementation resources.
```
```python
def commit_to_data(data_as_poly: Sequence[Point]) -> BLSCommitment:
    """Commit to a polynomial by """
```
```python
def sample_data(slot: Slot, shard: Shard, extended_data: Sequence[Point]) -> Sequence[DASSample]:
    sample_count = len(extended_data) // POINTS_PER_SAMPLE
    assert sample_count <= MAX_SAMPLES_PER_BLOCK
    # get polynomial form of full extended data, second half will be all zeroes.
    poly = ifft(reverse_bit_order_list(extended_data))
    assert all(v == 0 for v in poly[len(poly)//2:])
    proofs = construct_proofs(poly)
    return [
        DASSample(
            slot=slot,
            shard=shard,
            # The proof applies to `x = w ** (reverse_bit_order(i, sample_count) * POINTS_PER_SAMPLE)`
            index=i,
            # The computed proofs match the reverse_bit_order_list(extended_data), undo that to get the right proof.
            proof=proofs[reverse_bit_order(i, sample_count)],
            # note: we leave the sample data as-is so it matches the original nicely.
            # The proof applies to `ys = reverse_bit_order_list(sample.data)`
            data=extended_data[i*POINTS_PER_SAMPLE:(i+1)*POINTS_PER_SAMPLE]
        ) for i in range(sample_count)
    ]
```
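As a small illustration of how `sample_data` slices the extended data, the sketch below lists, for each sample index, the exponents of the root of unity `w` at which that sample's points are evaluated. It assumes a toy `POINTS_PER_SAMPLE = 2` with eight extended points, and the helper name `sample_domain_exponents` is hypothetical. It also assumes the usual FFT convention that position `k` of `reverse_bit_order_list(extended_data)` holds the evaluation at `w ** k`.

```python
from typing import Dict, List

POINTS_PER_SAMPLE = 2  # toy value (assumption), chosen only to keep the output short


def reverse_bit_order(n: int, order: int) -> int:
    width = order.bit_length() - 1
    return int(('{:0' + str(width) + 'b}').format(n)[::-1], 2)


def sample_domain_exponents(index: int, total_points: int) -> List[int]:
    # Position m of the extended data holds the evaluation at
    # w ** reverse_bit_order(m, total_points).
    start = index * POINTS_PER_SAMPLE
    return [reverse_bit_order(m, total_points) for m in range(start, start + POINTS_PER_SAMPLE)]


total_points = 8
sample_count = total_points // POINTS_PER_SAMPLE
layout: Dict[int, List[int]] = {
    i: sample_domain_exponents(i, total_points) for i in range(sample_count)
}
print(layout)  # {0: [0, 4], 1: [2, 6], 2: [1, 5], 3: [3, 7]}
```

In this toy layout every exponent in sample `i` is congruent to `reverse_bit_order(i, sample_count)` modulo `sample_count`, which is the `domain_pos` value that `verify_sample` derives from the sample index.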
```python
def verify_sample(sample: DASSample, sample_count: uint64, commitment: BLSCommitment):
    domain_pos = reverse_bit_order(sample.index, sample_count)
    sample_root_of_unity = ROOT_OF_UNITY**MAX_SAMPLES_PER_BLOCK  # change point-level to sample-level domain
    x = sample_root_of_unity**domain_pos
    ys = reverse_bit_order_list(sample.data)
    assert check_multi_kzg_proof(commitment, sample.proof, x, ys)
```
```python
def reconstruct_extended_data(samples: Sequence[Optional[DASSample]]) -> Sequence[Point]:
    # Instead of recovering with a point-by-point approach, recover the samples by recovering missing subgroups.
    subgroups = [None if sample is None else reverse_bit_order_list(sample.data) for sample in samples]
    return recover_data(subgroups)
```