From 91e935d4f35f9c373013365f9d97ef7ed6a91378 Mon Sep 17 00:00:00 2001
From: protolambda
Date: Fri, 1 Jan 2021 21:52:29 +0100
Subject: [PATCH] more DAS spec work: DAS function signatures, gossip details

---
 specs/phase1/beacon-chain.md               |  5 ++
 specs/phase1/data-availability-sampling.md | 27 ++++++++++
 specs/phase1/p2p-das.md                    | 59 +++++++++++++++++++++-
 specs/phase1/validator.md                  |  4 ++
 4 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/specs/phase1/beacon-chain.md b/specs/phase1/beacon-chain.md
index 6ef3cf1d7..8e81ae237 100644
--- a/specs/phase1/beacon-chain.md
+++ b/specs/phase1/beacon-chain.md
@@ -145,7 +145,10 @@ class AttestationData(Container):
 
 ```python
 class BeaconBlock(phase0.BeaconBlock):
     # insert phase 0 fields
     shard_headers: List[SignedShardHeader, MAX_SHARD_HEADERS]
 ```
 
@@ -191,6 +194,8 @@ class ShardHeader(Container):
     degree_proof: BLSKateProof
 ```
 
+TODO: add shard-proposer-index to shard headers, similar to the optimization done with beacon blocks.
+
 ### `SignedShardHeader`
 
 ```python
diff --git a/specs/phase1/data-availability-sampling.md b/specs/phase1/data-availability-sampling.md
index 051106e94..5a62b202e 100644
--- a/specs/phase1/data-availability-sampling.md
+++ b/specs/phase1/data-availability-sampling.md
@@ -30,6 +30,33 @@ class DASSample(Container):
     data: Vector[BLSPoint, POINTS_PER_SAMPLE]
 ```
 
+### ShardBlob
+
+The blob of data, effectively a block. Network-only.
+
+```python
+class ShardBlob(Container):
+    # Slot and shard that this blob is intended for
+    slot: Slot
+    shard: Shard
+    # The actual data
+    data: List[BLSPoint, POINTS_PER_SAMPLE * MAX_SAMPLES_PER_BLOCK]
+```
+
+Note that the hash-tree-root of the `ShardBlob` does not match that of the `ShardHeader`,
+since the blob deals with the full data, whereas the header includes the Kate commitment instead.
+
+### SignedShardBlob
+
+Network-only.
+
+```python
+class SignedShardBlob(Container):
+    message: ShardBlob
+    signature: BLSSignature
+```
+
 ## Helper functions
 
 ### Data extension
diff --git a/specs/phase1/p2p-das.md b/specs/phase1/p2p-das.md
index 6d31e121b..b72340498 100644
--- a/specs/phase1/p2p-das.md
+++ b/specs/phase1/p2p-das.md
@@ -85,6 +85,12 @@
 sufficient to avoid any significant amount of nodes from being 100% predictable.
 As soon as a sample is missing after the expected propagation time window, nodes can divert to the pull-model, or ultimately flag it as unavailable data.
 
+Note that the vertical subnets are shared between the different shards,
+and a simple hash function `(shard, slot, sample_index) -> subnet_index` defines which samples go where.
+This is to evenly distribute samples across subnets, even when one shard has more activity than the others.
+
+TODO: define the `(shard, slot, sample_index) -> subnet_index` hash function.
+
 #### Slow rotation: Backbone
 
 To allow for subscriptions to rotate quickly and randomly, a backbone is formed to help onboard peers into other topics.
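For illustration only (this is not part of the patch): the `(shard, slot, sample_index) -> subnet_index` mapping in the hunk above is deliberately left as a TODO. One possible shape, in spec-style Python, reuses the phase 0 `hash`, `uint_to_bytes` and `bytes_to_uint64` helpers; the function name and the `SAMPLE_SUBNET_COUNT` configuration value are hypothetical placeholders, not spec definitions.

```python
def compute_das_subnet_index(shard: Shard, slot: Slot, sample_index: uint64) -> uint64:
    # Mix the full (shard, slot, sample_index) key so that samples spread evenly
    # over the vertical subnets, regardless of which shard produced them.
    key = uint_to_bytes(uint64(shard)) + uint_to_bytes(uint64(slot)) + uint_to_bytes(sample_index)
    return bytes_to_uint64(hash(key)[:8]) % SAMPLE_SUBNET_COUNT  # SAMPLE_SUBNET_COUNT: assumed configuration value
```

Any deterministic function of the tuple that all nodes agree on would do; the hash is only there to balance load across subnets when shard activity is uneven.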
@@ -113,12 +119,61 @@ If the node does not already have connected peers on the topic it needs to sampl
 
 ### Topics and messages
 
-#### Horizontal subnets
+Following the same scheme as the [Phase0 gossip topics](../phase0/p2p-interface.md#topics-and-messages), the names and payload types are:
+
+| Name                        | Message Type        |
+|-----------------------------|---------------------|
+| `shard_blob_{shard}`        | `SignedShardBlob`   |
+| `shard_header_{shard}`      | `SignedShardHeader` |
+| `das_sample_{subnet_index}` | `DASSample`         |
+
+TODO: separate phase 1 network spec.
+
+#### Horizontal subnets: `shard_blob_{shard}`
 
-#### Vertical subnets
+Shard block data, in the form of a `SignedShardBlob`, is published to the `shard_blob_{shard}` subnets.
+If participating in DAS, upon receiving a `blob` for the first time with a `slot` not older than `MAX_RESAMPLE_TIME`,
+a subscriber of a `shard_blob_{shard}` topic SHOULD reconstruct the samples and publish them to vertical subnets:
+1. Extend the data: `extended_data = extend_data(blob.data)`
+2. Create samples with proofs: `samples = sample_data(blob.slot, blob.shard, extended_data)`
+3. Fanout-publish the samples to the vertical subnets of its peers (not all vertical subnets may be reached).
+
+The [DAS validator spec](./validator.md#data-availability-sampling) outlines when and where to participate in DAS on horizontal subnets.
+
+The following validations MUST pass before forwarding the `blob` on the horizontal subnet or creating samples for it:
+- _[REJECT]_ `blob.message.shard` MUST match the topic `{shard}` parameter (and is thus within the valid shard index range).
+- _[IGNORE]_ The `blob` is not from a future slot (with a `MAXIMUM_GOSSIP_CLOCK_DISPARITY` allowance) --
+  i.e. validate that `blob.message.slot <= current_slot`
+  (a client MAY queue future blobs for processing at the appropriate slot).
+- _[IGNORE]_ The blob is the first blob with a valid signature received for the proposer for the `(blob.message.slot, blob.message.shard)` combination.
+- _[REJECT]_ The blob is well-formatted and not too large, as already constrained by the SSZ list-limit.
+- _[REJECT]_ The `blob.message.data` MUST NOT contain any point `p >= MODULUS`. Although it is a `uint256`, not the full 256-bit range is valid.
+- _[REJECT]_ The proposer signature, `blob.signature`, is valid with respect to the `proposer_index` pubkey.
+- _[REJECT]_ The blob is proposed by the expected `proposer_index` for the blob's slot.
+
+TODO: define a blob header (note: hash-tree-root instead of commitment data) and make double blob proposals slashable?
+
+#### Vertical subnets: `das_sample_{subnet_index}`
+
+Shard blob samples can be verified with just a 48-byte Kate proof, against the commitment specific to that `(shard, slot)` key.
+
+The following validations MUST pass before forwarding the `sample` on the vertical subnet:
+- _[IGNORE]_ The commitment for the (`sample.shard`, `sample.slot`, `sample.index`) tuple must be known.
+  If not known, the client MAY queue the sample, if it passes formatting conditions.
+- _[REJECT]_ `sample.shard`, `sample.slot` and `sample.index` are hashed into a `subnet_index` (TODO: define hash) which MUST match the topic `{subnet_index}` parameter.
+- _[REJECT]_ `sample.shard` must be within valid range: `0 <= sample.shard < get_active_shard_count(state, compute_epoch_at_slot(sample.slot))`.
+- _[REJECT]_ `sample.index` must be within valid range: `0 <= sample.index < sample_count`, where:
+  - `sample_count = (points_count + POINTS_PER_SAMPLE - 1) // POINTS_PER_SAMPLE`
+  - `points_count` is the length as claimed along with the commitment, which must be smaller than `MAX_SAMPLES_PER_BLOCK`.
+- _[IGNORE]_ The `sample` is not from a future slot (with a `MAXIMUM_GOSSIP_CLOCK_DISPARITY` allowance) --
+  i.e. validate that `sample.slot <= current_slot`. A client MAY queue future samples for processing at the appropriate slot, if they pass formatting conditions.
+- _[IGNORE]_ This is the first received sample with the (`sample.shard`, `sample.slot`, `sample.index`) key tuple.
+- _[REJECT]_ The sample data is well-formatted and not too large, as already constrained by the SSZ list-limit.
+- _[REJECT]_ The `sample.data` MUST NOT contain any point `p >= MODULUS`. Although it is a `uint256`, not the full 256-bit range is valid.
+- _[REJECT]_ The `sample.proof` MUST be valid: `verify_sample(sample, sample_count, commitment)`.
+
+A valid sample SHOULD be retained for a buffer period if the local node is part of the backbone that covers this sample.
+This is to serve other peers that may have missed it.
 
 ## DAS in the Req-Resp domain: Pull
 
diff --git a/specs/phase1/validator.md b/specs/phase1/validator.md
index dce794e1b..58ffcf6eb 100644
--- a/specs/phase1/validator.md
+++ b/specs/phase1/validator.md
@@ -575,6 +575,10 @@
 Although serving these is not directly incentivised, it is little work:
 1. Buffer any message you see on the backbone vertical subnets, for a buffer of up to two weeks.
 2. Serve the samples on request. An individual sample is just expected to be `~ 0.5 KB`, and does not require any pre-processing to serve.
 
+A validator SHOULD make a `DASQuery` request to random peers, until the configured failure-rate is exceeded.
+
+TODO: detailed failure-mode spec. Stop after trying e.g. 3 peers for any sample in a configured time window (after the gossip period).
+
 Pulling samples directly from nodes with a custody responsibility,
 without revealing their identity to the network, is an open problem.
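For illustration only (not part of the patch): a rough sketch, in spec-style Python, of the pull-side retry behaviour the TODO above describes. `DASQuery`, `DASSample` and `verify_sample` come from the text above; `get_random_peers_for_sample`, `send_das_query`, the `MAX_QUERY_PEER_TRIES` budget and the `BLSCommitment` argument type are assumptions made only for this sketch.

```python
from typing import Optional

MAX_QUERY_PEER_TRIES = 3  # assumed retry budget, mirroring the "e.g. 3 peers" in the TODO above

def try_pull_sample(query: DASQuery, sample_count: uint64, commitment: BLSCommitment) -> Optional[DASSample]:
    # Ask a few random peers on the sample's vertical subnet via the Req-Resp domain.
    peers = get_random_peers_for_sample(query)  # assumed helper
    for peer in peers[:MAX_QUERY_PEER_TRIES]:
        sample = send_das_query(peer, query)  # assumed helper; returns None on failure or timeout
        if sample is not None and verify_sample(sample, sample_count, commitment):
            return sample
    # Every try failed within the configured window: count this towards the failure-rate
    # and consider flagging the data as unavailable.
    return None
```

How failures are counted, and how the time window relates to the gossip period, is exactly what the TODO leaves open.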