address comments, iteration 1

Igor Sirotin 2026-05-05 22:37:30 +01:00
parent 06a99219b8
commit f7970ee404

@ -44,15 +44,15 @@ message SegmentMessageProto {
bytes entire_message_hash = 1;
// Data segment indexing
uint32 index = 2; // zero-based sequence number for data segments
uint32 segments_count = 3; // number of data segments (>= 1)
uint32 data_segment_index = 2; // zero-indexed sequence number for data segments
uint32 data_segment_count = 3; // number of data segments (>= 1)
// Segment payload (data or parity shard)
bytes payload = 4;
// Parity segment indexing
uint32 parity_segment_index = 5; // zero-based sequence number for parity segments
uint32 parity_segments_count = 6; // number of parity segments
uint32 parity_segment_count = 6; // number of parity segments
// Segment type
bool is_parity = 7; // true for parity segments, false (default) for data segments
@ -62,11 +62,11 @@ message SegmentMessageProto {
**Field descriptions:**
- `entire_message_hash`: A 32-byte Keccak256 hash of the original complete payload, used to identify which segments belong together and verify reconstruction integrity.
- `index`: Zero-based sequence number identifying this data segment's position (0, 1, 2, ..., segments_count - 1). Set only on data segments.
- `segments_count`: Total number of data segments the original message was split into. Set on every segment (data and parity).
- `data_segment_index`: Zero-indexed sequence number identifying this data segment's position (0, 1, 2, ..., data_segment_count - 1). Set only on data segments.
- `data_segment_count`: Total number of data segments the original message was split into. Set on every segment (data and parity).
- `payload`: The actual chunk of data or parity information for this segment.
- `parity_segment_index`: Zero-based sequence number for parity segments. Set only on parity segments.
- `parity_segments_count`: Total number of parity segments generated. Set on every segment (data and parity) when ReedSolomon parity is used; `0` (default) otherwise.
- `parity_segment_count`: Total number of parity segments generated. Set on every segment (data and parity) when ReedSolomon parity is used; `0` (default) otherwise.
- `is_parity`: Explicit segment type marker. `false` (default) for data segments; `true` for parity segments.
A message is either a **data segment** (when `is_parity == false`) or a **parity segment** (when `is_parity == true`).
@ -76,14 +76,15 @@ A message is either a **data segment** (when `is_parity == false`) or a **parity
Receivers **MUST** enforce:
- `entire_message_hash.length == 32`
- `segments_count >= 1`
- `data_segment_count >= 1`
- `data_segment_count + parity_segment_count < maxTotalSegments`
- **Data segments** (`is_parity == false`):
`index < segments_count`
`data_segment_index < data_segment_count`
- **Parity segments** (`is_parity == true`):
`parity_segments_count > 0` AND `parity_segment_index < parity_segments_count`
`parity_segment_count > 0` AND `parity_segment_index < parity_segment_count`
No other combinations are permitted.
A `SegmentMessageProto` with `segments_count == 1` and `index == 0` is a valid single-segment data message: the `payload` field carries the entire original payload (see [Sending](#sending)).
A `SegmentMessageProto` with `data_segment_count == 1` and `data_segment_index == 0` is a valid single-segment data message: the `payload` field carries the entire original payload (see [Sending](#sending)).
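The validation rules above can be sketched as a single predicate. This is a minimal illustration, not the reference implementation: the `Segment` dataclass and `is_valid_segment` function are hypothetical names that mirror the renamed protobuf fields, and `max_total_segments` is passed in by the caller because its value is implementation-specific.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical in-memory mirror of SegmentMessageProto (renamed fields)."""
    entire_message_hash: bytes
    data_segment_index: int = 0
    data_segment_count: int = 0
    payload: bytes = b""
    parity_segment_index: int = 0
    parity_segment_count: int = 0
    is_parity: bool = False


def is_valid_segment(seg: Segment, max_total_segments: int) -> bool:
    """Apply the receiver-side checks listed in the Validation section."""
    if len(seg.entire_message_hash) != 32:
        return False
    if seg.data_segment_count < 1:
        return False
    # Strict inequality, as written in the validation list.
    if seg.data_segment_count + seg.parity_segment_count >= max_total_segments:
        return False
    if seg.is_parity:
        return (seg.parity_segment_count > 0
                and seg.parity_segment_index < seg.parity_segment_count)
    return seg.data_segment_index < seg.data_segment_count
```

A single-segment data message (`data_segment_count == 1`, `data_segment_index == 0`) passes this check, while a parity segment with `parity_segment_count == 0` is rejected.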
## Segmentation
@ -94,17 +95,16 @@ To transmit a payload, the sender:
- **MUST** compute a 32-byte `entire_message_hash = Keccak256(original_payload)`.
- **MUST** split the payload into one or more **data segments**,
each of size up to `segmentSize` bytes.
A payload of size ≤ `segmentSize` produces a single data segment (`segments_count == 1`).
A payload of size ≤ `segmentSize` produces a single data segment (`data_segment_count == 1`).
- **MAY** use ReedSolomon erasure coding at the predefined parity rate.
- **MUST** encode every segment as a `SegmentMessageProto` with:
- The `entire_message_hash`
- `segments_count` (total number of data segments, always set)
- When ReedSolomon parity is used, `parity_segments_count` (total number of parity segments, set on every segment)
- For data segments: `is_parity = false`, `index`
- `data_segment_count` (total number of data segments, always set)
- When ReedSolomon parity is used, `parity_segment_count` (total number of parity segments, set on every segment)
- For data segments: `is_parity = false`, `data_segment_index`
- For parity segments: `is_parity = true`, `parity_segment_index`
- The raw payload data
- Send each segment as an individual transport message according to the underlying transport protocol,
preserving application-level metadata (e.g., content topic).
- Send each segment as an individual transport message according to the underlying transport service.
This yields a deterministic wire format: every transmitted payload is a `SegmentMessageProto`.
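The data-segment part of the sending steps can be sketched as follows. This is an illustration under stated assumptions: `split_into_data_segments` is a hypothetical helper, segments are plain dictionaries rather than encoded `SegmentMessageProto` messages, parity generation is omitted, and the 32-byte hash is supplied by the caller because Keccak256 is not part of the Python standard library (`hashlib.sha3_256` implements the standardized SHA-3 variant, which uses different padding).

```python
from typing import Dict, List


def split_into_data_segments(payload: bytes, segment_size: int,
                             entire_message_hash: bytes) -> List[Dict]:
    """Split a payload into data segments of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    # Ceiling division; a payload that fits in one segment (including an
    # empty one) still yields exactly one data segment.
    count = max(1, -(-len(payload) // segment_size))
    return [
        {
            "entire_message_hash": entire_message_hash,
            "data_segment_index": i,
            "data_segment_count": count,
            "payload": payload[i * segment_size:(i + 1) * segment_size],
            "is_parity": False,
        }
        for i in range(count)
    ]
```

Every returned segment carries the same `data_segment_count`, so a receiver can learn the total from any single segment.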
@ -114,8 +114,8 @@ Upon receiving a segmented message, the receiver:
- **MUST** validate each segment according to [Wire Format → Validation](#validation).
- **MUST** cache received segments.
- **MUST** attempt reconstruction once at least `segments_count` distinct segments (data and parity combined) have been received:
- If all data segments are present, concatenate their `payload` fields in `index` order.
- **MUST** attempt reconstruction once at least `data_segment_count` distinct segments (data and parity combined) have been received:
- If all data segments are present, concatenate their `payload` fields in `data_segment_index` order.
- Otherwise, recover the payload via ReedSolomon decoding over the available data and parity segments.
- **MUST** verify `Keccak256(reconstructed_payload)` matches `entire_message_hash`.
On mismatch,
@ -154,7 +154,7 @@ Implementations **SHOULD** support:
Implementation-specific parameter, fixed. The reference implementation uses **256**.
**Reconstruction capability:**
With the predefined parity rate, reconstruction is possible if **all data segments** are received or if **any combination of data + parity** totals at least `segments_count` (i.e., up to the predefined percentage of loss tolerated).
With the predefined parity rate, reconstruction is possible if **all data segments** are received or if **any combination of data + parity** totals at least `data_segment_count` (i.e., up to the predefined percentage of loss tolerated).
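The reconstruction condition can be illustrated with a short sketch. Both function names are hypothetical, and the sketch covers only the threshold check and the all-data-present fast path; the Reed-Solomon decoding fallback and the Keccak256 verification step are deliberately left out.

```python
from typing import Dict, Optional, Set


def ready_to_reconstruct(data_indices: Set[int], parity_indices: Set[int],
                         data_segment_count: int) -> bool:
    """True once the distinct data + parity segments reach data_segment_count."""
    return len(data_indices) + len(parity_indices) >= data_segment_count


def concatenate_if_complete(data_payloads: Dict[int, bytes],
                            data_segment_count: int) -> Optional[bytes]:
    """Join payloads in data_segment_index order if no data segment is missing."""
    if all(i in data_payloads for i in range(data_segment_count)):
        return b"".join(data_payloads[i] for i in range(data_segment_count))
    # Some data segment is missing: erasure decoding would be needed here.
    return None
```

Using sets and a dictionary keyed by index means duplicate deliveries of the same segment are counted only once.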
**API simplicity:**
Libraries **SHOULD** require only `segmentSize` from the application for normal operation.
@ -190,14 +190,15 @@ Implementations **MUST** verify the Keccak256 hash post-reconstruction and disca
To mitigate resource exhaustion:
- Limit concurrent reconstructions and per-sender storage
- Limit total concurrent reconstructions and aggregate buffered bytes
- When sender identity is available, apply the same two limits per sender
- Enforce timeouts and size caps
- Validate segment counts (≤ 256)
- Consider rate-limiting at the transport layer (for example, via [17/WAKU2-RLN-RELAY](https://rfc.vac.dev/waku/standards/core/17/rln-relay) on Waku)
### Compatibility
Nodes that do **not** implement this specification cannot reconstruct large messages.
Nodes that do **not** implement this specification cannot reconstruct any messages.
---