# Ethereum 2.0 Data Availability Sampling

**Notice**: This document is a work-in-progress for researchers and implementers.

## Table of contents

<!-- TOC -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Data Availability Sampling](#data-availability-sampling)
- [GossipSub](#gossipsub)
  - [Horizontal subnets](#horizontal-subnets)
  - [Vertical subnets](#vertical-subnets)
    - [Slow rotation: Backbone](#slow-rotation-backbone)
    - [Quick rotation: Sampling](#quick-rotation-sampling)
  - [DAS during network instability](#das-during-network-instability)
    - [Stage 0: Waiting on missing samples](#stage-0-waiting-on-missing-samples)
    - [Stage 1: Pulling missing samples from known peers](#stage-1-pulling-missing-samples-from-known-peers)
    - [Stage 2: Pulling missing data from validators with custody.](#stage-2-pulling-missing-data-from-validators-with-custody)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- /TOC -->


## Data Availability Sampling

TODO: Summary of Data Availability problem

TODO: Summary of solution, why 2x extension, and randomized samples

## GossipSub

### Horizontal subnets

TODO

### Vertical subnets

#### Slow rotation: Backbone

TODO

#### Quick rotation: Sampling

TODO


### DAS during network instability

GossipSub-based retrieval of samples may not always work.
In that event, a node can move through the stages below until it recovers data availability.

#### Stage 0: Waiting on missing samples

Wait for the sample to be re-broadcast. The publisher may be slow, or another node may be able to do the work.

Any node can do the following work to keep the network healthy:
- Common: Listen on a horizontal subnet, chunkify the block data into samples, and propagate the samples to vertical subnets.
- Extreme: Listen on enough vertical subnets, reconstruct the missing samples by recovery, and propagate the recovered samples.

This is not a requirement, but it should improve network stability with few resources, and without any central party.
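
The "Common" task above can be sketched as follows. This is a minimal illustration, not the specified algorithm: it omits the 2x extension entirely, `SAMPLE_SIZE` is an assumption derived from the `~ 0.5 KB` sample size mentioned in Stage 1, and `VERTICAL_SUBNET_COUNT` and the round-robin mapping in `subnet_for_sample` are hypothetical placeholders.

```python
from typing import List

SAMPLE_SIZE = 512  # bytes; assumption based on the "~ 0.5 KB" sample size
VERTICAL_SUBNET_COUNT = 8  # hypothetical value, for illustration only


def chunkify(data: bytes, sample_size: int = SAMPLE_SIZE) -> List[bytes]:
    """Split block data into equally sized samples, zero-padding the tail."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    padding = (-len(data)) % sample_size
    padded = data + b"\x00" * padding
    return [padded[i:i + sample_size] for i in range(0, len(padded), sample_size)]


def subnet_for_sample(sample_index: int, subnet_count: int = VERTICAL_SUBNET_COUNT) -> int:
    """Hypothetical mapping: assign samples to vertical subnets round-robin."""
    return sample_index % subnet_count


# Pair every sample with the vertical subnet it would be propagated to.
block_data = b"\x01" * 1000
for index, sample in enumerate(chunkify(block_data)):
    print(index, subnet_for_sample(index), len(sample))
```

A node doing this work would publish each sample on the GossipSub topic of its vertical subnet; the publishing step is left out because the topic layout is not defined in this document.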

#### Stage 1: Pulling missing samples from known peers

The more realistic option, to execute when a sample is missing, is to query any node that is known to hold it.
Since *consensus identity is disconnected from network identity*, there is no direct way to contact custody holders
without explicitly asking for the data.

However, *network identities* are still used to build a backbone for each vertical subnet.
These nodes should have received the samples, and can serve a buffer of them on demand.
Although serving these is not directly incentivised, it takes little work:
1. Buffer any message seen on the backbone vertical subnets, for up to two weeks.
2. Serve the samples on request. An individual sample is expected to be only `~ 0.5 KB`, and does not require any pre-processing to serve.
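
The two steps above could look roughly like the sketch below. It is an illustration under assumptions: the `SampleBuffer` class, its method names, and the `(slot, sample_index)` key are hypothetical, and only the two-week retention period comes from the text.

```python
from typing import Dict, Optional, Tuple

RETENTION_SECONDS = 14 * 24 * 60 * 60  # "up to two weeks"

SampleKey = Tuple[int, int]  # hypothetical key: (slot, sample_index)


class SampleBuffer:
    """In-memory buffer of samples seen on backbone vertical subnets."""

    def __init__(self, retention_seconds: int = RETENTION_SECONDS) -> None:
        self.retention_seconds = retention_seconds
        self._samples: Dict[SampleKey, Tuple[float, bytes]] = {}

    def add(self, key: SampleKey, sample: bytes, now: float) -> None:
        """Step 1: buffer a sample together with the time it was seen."""
        self._samples[key] = (now, sample)

    def prune(self, now: float) -> None:
        """Drop samples older than the retention period."""
        cutoff = now - self.retention_seconds
        self._samples = {k: v for k, v in self._samples.items() if v[0] >= cutoff}

    def get(self, key: SampleKey, now: float) -> Optional[bytes]:
        """Step 2: serve a sample as stored, or None if it is not buffered."""
        self.prune(now)
        entry = self._samples.get(key)
        return entry[1] if entry is not None else None
```

Samples are returned exactly as they were received, which matches the note that serving requires no pre-processing. A real node would likely persist the buffer and prune on a timer rather than on every request.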

A validator SHOULD make `DASQuery` requests to random peers until its failures exceed the configured failure rate.

TODO: detailed failure-mode spec. Stop after trying e.g. 3 peers for any sample in a configured time window (after the gossip period).
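
Until the failure mode is specified, the stopping rule can be illustrated with a sketch like the one below. Everything here is an assumption: `pull_sample` and `query_peer` are hypothetical names standing in for a `DASQuery` round trip, the limit of 3 comes from the "e.g. 3 peers" in the TODO above, and the configured time window is not modelled.

```python
import random
from typing import Callable, Optional, Sequence

MAX_FAILED_PEERS = 3  # assumption taken from the "e.g. 3 peers" in the TODO


def pull_sample(
    peers: Sequence[str],
    query_peer: Callable[[str], Optional[bytes]],
    max_failures: int = MAX_FAILED_PEERS,
    rng: Optional[random.Random] = None,
) -> Optional[bytes]:
    """Query peers in random order until one returns the sample.

    `query_peer` is a hypothetical stand-in for a DASQuery request; it
    returns the sample bytes, or None when the peer cannot serve it.
    """
    rng = rng or random.Random()
    candidates = list(peers)
    rng.shuffle(candidates)
    failures = 0
    for peer in candidates:
        if failures >= max_failures:
            break  # give up and fall through to the next stage
        sample = query_peer(peer)
        if sample is not None:
            return sample
        failures += 1
    return None
```

Returning `None` signals that Stage 1 did not recover the sample, so the caller can decide whether to wait, retry in a later window, or escalate.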

#### Stage 2: Pulling missing data from validators with custody.

Pulling samples directly from nodes with validators that have a custody responsibility,
without revealing their identity to the network, is an open problem.