mirror of
https://github.com/logos-storage/logos-storage-docs-obsidian.git
synced 2026-01-05 06:43:12 +00:00
Latest block exchange module spec
Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
This commit is contained in:
parent
4008ce553b
commit
782a283d21
@ -0,0 +1,32 @@
At a high level, the Block Exchange Module is responsible for sending and receiving the blocks that belong to a given piece of content. Formally, a block is a tuple consisting of a sequence of bytes and the corresponding CID, or, in a pseudo-language: `(seq[byte], CID)`.

When uploading content to the network, the following steps are taken:

1. The content is chunked into fixed-size blocks. The default block size is `64KiB`. The last block is padded with `0`s if necessary.
2. For each chunk, the corresponding `CID` is created, using `(codex-block, 0xCD02)` as the [[Multicodec]] and `(sha2-256, 0x12)` as the [[Multihash]].
3. The resulting block, `(seq[byte], CID)`, becomes the input to the Block Exchange Module `putBlock` operation.

When downloading content from the network:

1. Given the Codex Manifest CID, the corresponding Codex Manifest is retrieved. Using the manifest attributes `datasetSize` and `blockSize`, the number of blocks, $blockCount$, is computed as $\lceil datasetSize/blockSize \rceil$.
2. For each $blockIndex \in [0 .. blockCount - 1]$, a $BlockAddress$ is defined as a tuple $(treeCid, blockIndex)$, where $treeCid$ is an attribute of the Codex Manifest. $BlockAddress$ is the input of the Block Exchange Module `requestBlock` operation.

The `putBlock` and `requestBlock` operations defined above form the external interface of the Block Exchange Module. The module interacts directly with other modules, mainly the [[Discovery Module]] and the [[Repo Store]], which are used for node discovery and local storage respectively.
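
The download-side computation described above can be sketched as follows. This is a minimal illustration, not the Codex implementation: the `BlockAddress` tuple, the function names, and the placeholder tree CID are hypothetical, and the sketch assumes block indices run from `0` to `blockCount - 1`.

```python
from typing import List, NamedTuple


class BlockAddress(NamedTuple):
    """Hypothetical stand-in for the (treeCid, blockIndex) tuple."""
    tree_cid: str
    block_index: int


def block_count(dataset_size: int, block_size: int) -> int:
    """ceil(datasetSize / blockSize) using integer arithmetic only."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return -(-dataset_size // block_size)


def block_addresses(tree_cid: str, dataset_size: int, block_size: int) -> List[BlockAddress]:
    """One address per block; each would be passed to requestBlock."""
    count = block_count(dataset_size, block_size)
    return [BlockAddress(tree_cid, index) for index in range(count)]


# Example: a 200000-byte dataset with the default 64 KiB block size.
addresses = block_addresses("tree-cid-placeholder", 200_000, 64 * 1024)
```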

## Functional Requirements

Before defining the operational model of the Block Exchange Module, which describes *how* the exchange module achieves its goals, let's define the functional requirements that specify *what* the exchange module is expected to achieve.

1. Given $BlockAddress = (treeCid, blockIndex)$, retrieve the corresponding block from the network. In the operational model, we will see that the exchange module attempts $3000$ retries, waiting $500ms$ in each iteration for the block to be delivered. Thus, from a declarative perspective, we can claim that the exchange module should deliver the block within $3000 \times 500ms = 1500s = 25min$, or fail otherwise.
2. When uploading content $C$, for each block $b \in C$, store $b$ in the [[Repo Store]] and (1) announce the *presence* of the block to the peers that signaled interest in $b$, and (2) deliver $b$ to the peers that requested $b$ earlier. Include a Merkle proof in the delivery if $b$ is not a Codex Manifest block.
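
The retry bound in the first requirement can be illustrated with a small sketch. It assumes a hypothetical `attempt` coroutine that re-issues the request and resolves once the block arrives; the function names are illustrative only, and the real module's retry mechanics may differ.

```python
import asyncio
from typing import Awaitable, Callable

RETRIES = 3000           # number of attempts stated in the requirement
RETRY_INTERVAL_S = 0.5   # 500 ms wait per attempt


def worst_case_seconds(retries: int = RETRIES, interval_s: float = RETRY_INTERVAL_S) -> float:
    """Upper bound on the time spent waiting before giving up."""
    return retries * interval_s


async def request_with_retries(
    attempt: Callable[[], Awaitable[bytes]],
    retries: int = RETRIES,
    interval_s: float = RETRY_INTERVAL_S,
) -> bytes:
    """Wait up to interval_s per attempt; fail after `retries` attempts."""
    for _ in range(retries):
        try:
            return await asyncio.wait_for(attempt(), timeout=interval_s)
        except asyncio.TimeoutError:
            continue  # block not delivered in this iteration, try again
    raise TimeoutError(f"block not delivered after {retries} attempts")


print(worst_case_seconds() / 60)  # 25.0 (minutes)
```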

## Operational Model

In this section we focus on how the exchange module operates to deliver on the [[#Functional Requirements]].

The image below provides a high-level description of the exchange module's operational model. A more detailed discussion follows.

![[BlockExchangeProtocol.svg]]

> The image above is in SVG format, which allows high-resolution local rendering. You can also access an online version at: https://link.excalidraw.com/readonly/GLtqSUDCiRe38gDb2MOX.
@ -1,32 +1,226 @@
# Codex Block Exchange Module Specification

## Introduction

The Block Exchange Module is a core component of Codex responsible for peer-to-peer content distribution. It handles the sending and receiving of blocks across the network, enabling efficient data sharing between Codex nodes.

## Overview

The Codex Block Exchange Protocol is a libp2p-based protocol for exchanging content blocks between Codex nodes.

The protocol enables efficient peer-to-peer content distribution by:
- Advertising block availability to interested peers
- Requesting blocks from peers who have them
- Delivering blocks with Merkle proofs for tree-structured data (both erasure-coded and regular datasets)

### Block Format

In Codex, a block is formally defined as a tuple consisting of raw data and its content identifier: `(data: seq[byte], cid: Cid)`.

**Block Creation:**
- Content is chunked into fixed-size blocks (default: 64 KiB)
- The last block is padded with zeros if necessary
- Each block's CID uses:
  - Multicodec: `codex-block` (0xCD02)
  - Multihash: `sha2-256` (0x12)

**Block Addressing:**
Blocks can be addressed in two ways:
- **Standalone blocks**: Direct CID reference
- **Tree blocks**: Reference by `(treeCid, blockIndex)` for blocks within Merkle tree structures (both regular files and erasure-coded datasets)
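
The block creation rules above can be sketched as follows. This is an illustration under stated assumptions rather than the Codex implementation: the function names are hypothetical, the digest is assumed to be computed over the padded block, and only the multihash part of the CID is built (the standard `<code><length><digest>` layout). Assembling a full CID with the `codex-block` multicodec is left out.

```python
import hashlib
from typing import List

BLOCK_SIZE = 64 * 1024   # default block size (64 KiB)
SHA2_256_CODE = 0x12     # multihash code for sha2-256


def chunk_and_pad(content: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split content into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for offset in range(0, len(content), block_size):
        chunk = content[offset:offset + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


def sha2_256_multihash(data: bytes) -> bytes:
    """Multihash bytes: hash code, digest length, then the digest."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256_CODE, len(digest)]) + digest


# Example: 100000 bytes of content become two 64 KiB blocks.
blocks = chunk_and_pad(b"a" * 100_000)
multihashes = [sha2_256_multihash(block) for block in blocks]
```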

### Module Context

The Block Exchange Protocol is the wire protocol used by Codex's Block Exchange Module. The module provides the following public interface:

- `requestBlock(address: BlockAddress): Future[?!Block]` - Request a single block by address
- `requestBlock(cid: Cid): Future[?!Block]` - Request a single block by CID
- `requestBlocks(addresses: seq[BlockAddress]): SafeAsyncIter[Block]` - Request multiple blocks as an async iterator

The module integrates with:
- **Discovery Module**: DHT-based peer discovery for content
- **Local Store (Repo Store)**: Persistent block storage
- **Advertiser**: Announces block availability to the DHT
- **Network Layer**: libp2p-based peer connections and message transport

## Protocol Identifier

The Block Exchange Protocol uses the following libp2p protocol identifier:

```
/codex/blockexc/1.0.0
```

## Connection Model

The protocol operates over libp2p streams. When a node wants to communicate with a peer:

1. The initiating node dials the peer using the protocol identifier
2. A bidirectional stream is established
3. Both sides can send and receive messages on this stream
4. Messages are encoded using Protocol Buffers
5. The stream remains open for the duration of the exchange session
6. Peers track active connections in a peer context store

The protocol handles peer lifecycle events:
- **Peer Joined**: When a peer connects, it is added to the active peer set
- **Peer Departed**: When a peer disconnects gracefully, its context is cleaned up
- **Peer Dropped**: When a peer connection fails, it is removed from the active set
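
The peer lifecycle events above could be handled by a store along these lines. This is a minimal sketch: `PeerContext`, `PeerContextStore`, and their fields are hypothetical names, and the actual per-peer state kept by Codex is not described in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PeerContext:
    """Hypothetical per-peer state, e.g. the addresses the peer wants."""
    peer_id: str
    wants: Set[str] = field(default_factory=set)


class PeerContextStore:
    """Tracks the set of active peers across lifecycle events."""

    def __init__(self) -> None:
        self._peers: Dict[str, PeerContext] = {}

    def on_peer_joined(self, peer_id: str) -> PeerContext:
        # Add the peer to the active set (idempotent on repeated joins).
        return self._peers.setdefault(peer_id, PeerContext(peer_id))

    def on_peer_departed(self, peer_id: str) -> None:
        # Graceful disconnect: clean up the peer's context.
        self._peers.pop(peer_id, None)

    def on_peer_dropped(self, peer_id: str) -> None:
        # Failed connection: remove the peer from the active set.
        self._peers.pop(peer_id, None)

    def active_peers(self) -> List[str]:
        return sorted(self._peers)
```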

## Message Format

All messages exchanged between peers use Protocol Buffers encoding.

### Message Structure

```protobuf
message Message {
  Wantlist wantlist = 1;
  repeated BlockDelivery payload = 3;
  repeated BlockPresence blockPresences = 4;
  int32 pendingBytes = 5;
  AccountMessage account = 6;
  StateChannelUpdate payment = 7;
}
```

**Field Descriptions:**

- `wantlist`: Requests for blocks (presence checks or full block requests)
- `payload`: Block deliveries being sent to the peer
- `blockPresences`: Block presence information (have/don't have)
- `pendingBytes`: Number of bytes currently pending delivery to this peer
- `account`: Ethereum account information for receiving payments
- `payment`: Nitro state channel update for micropayments

### Block Addressing

Codex uses a block addressing scheme that supports both simple content-addressed blocks and blocks within Merkle tree structures.

```protobuf
message BlockAddress {
  bool leaf = 1;
  bytes treeCid = 2;  // Present when leaf = true
  uint64 index = 3;   // Present when leaf = true
  bytes cid = 4;      // Present when leaf = false
}
```

**Addressing Modes:**

- **Simple Block** (`leaf = false`): Direct CID reference to a standalone content block
- **Tree Block** (`leaf = true`): Reference to a block within a Merkle tree by tree CID and index. The tree may represent either an erasure-coded dataset or a regular uploaded file organized in a tree structure
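
The two addressing modes can be mirrored in application code with a small validated type. This is a sketch only: the Python class and its checks are illustrative, and whether a receiver should also reject fields that belong to the other mode is an assumption left open here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockAddress:
    """In-memory counterpart of the BlockAddress message (illustrative)."""
    leaf: bool
    tree_cid: Optional[bytes] = None  # required when leaf is True
    index: Optional[int] = None       # required when leaf is True
    cid: Optional[bytes] = None       # required when leaf is False

    def __post_init__(self) -> None:
        if self.leaf:
            if self.tree_cid is None or self.index is None:
                raise ValueError("tree block needs tree_cid and index")
            if self.index < 0:
                raise ValueError("index must be non-negative")
        elif self.cid is None:
            raise ValueError("simple block needs cid")


# Example addresses; the byte strings are placeholders, not real CIDs.
tree_block = BlockAddress(leaf=True, tree_cid=b"tree-cid", index=3)
simple_block = BlockAddress(leaf=False, cid=b"block-cid")
```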

### WantList

The WantList allows a peer to request blocks or check for block availability.

```protobuf
message Wantlist {
  enum WantType {
    wantBlock = 0;
    wantHave = 1;
  }

  message Entry {
    BlockAddress address = 1;
    int32 priority = 2;
    bool cancel = 3;
    WantType wantType = 4;
    bool sendDontHave = 5;
  }

  repeated Entry entries = 1;
  bool full = 2;
}
```

**Field Descriptions:**

- **Entry.address**: The block being requested
- **Entry.priority**: Request priority (currently always 0)
- **Entry.cancel**: If true, cancels a previous want for this block
- **Entry.wantType**:
  - `wantHave (1)`: Only check if peer has the block
  - `wantBlock (0)`: Request full block data
- **Entry.sendDontHave**: If true, peer should respond even if it doesn't have the block
- **full**: If true, this WantList replaces all previous wants; if false, it's a delta update

**Delta WantList Updates:**

The protocol supports efficient delta updates where only changes to the WantList are transmitted:
- New blocks are added to the peer's want list
- Cancelled blocks are removed from the peer's want list
- Delta updates use `full = false`
- Full replacements use `full = true`
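
The full and delta update rules above can be sketched as a pure function over the want list a node keeps for one peer. The `Entry` class and `apply_wantlist` are hypothetical names that model only the fields needed here, and how a `cancel` entry inside a full replacement should be treated is an assumption (it is simply not stored).

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable


@dataclass(frozen=True)
class Entry:
    """Reduced model of Wantlist.Entry (illustrative)."""
    address: Hashable
    cancel: bool = False
    want_block: bool = True  # True models wantBlock, False models wantHave


def apply_wantlist(
    current: Dict[Hashable, Entry],
    entries: Iterable[Entry],
    full: bool,
) -> Dict[Hashable, Entry]:
    """Return the peer's want list after applying one Wantlist message."""
    # A full WantList replaces all previous wants; a delta starts from them.
    updated: Dict[Hashable, Entry] = {} if full else dict(current)
    for entry in entries:
        if entry.cancel:
            updated.pop(entry.address, None)  # cancelled wants are removed
        else:
            updated[entry.address] = entry    # new wants are added
    return updated


wants: Dict[Hashable, Entry] = {}
wants = apply_wantlist(wants, [Entry("a"), Entry("b")], full=False)
wants = apply_wantlist(wants, [Entry("a", cancel=True)], full=False)
```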

### Block Delivery

Block deliveries contain the actual block data with Merkle proofs for tree blocks.

```protobuf
message BlockDelivery {
  bytes cid = 1;
  bytes data = 2;
  BlockAddress address = 3;
  bytes proof = 4;  // Present only when address.leaf = true
}
```

**Field Descriptions:**

- `cid`: Content identifier of the block
- `data`: Raw block data (up to 100 MiB)
- `address`: The address that was requested
- `proof`: Merkle proof (CodexProof) verifying block correctness (required for tree blocks)

**Merkle Proof Verification:**

When delivering tree blocks (`address.leaf = true`):
- The delivery must include a Merkle proof (CodexProof)
- The proof verifies that the block at the given index is correctly part of the Merkle tree identified by the tree CID
- This applies to all tree-structured data, whether erasure-coded or not
- Recipients must verify the proof before accepting the block
- Invalid proofs result in block rejection
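
To illustrate the general idea behind the verification step, the sketch below checks an inclusion proof against a plain binary SHA-256 Merkle tree. It is not the CodexProof format: the tree construction (duplicating the last node on odd levels), the proof layout (a list of sibling hashes from leaf to root), and all function names are assumptions made for this example.

```python
import hashlib
from typing import List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: List[bytes], index: int) -> Tuple[bytes, List[bytes]]:
    """Build the tree over the leaf hashes; return (root, sibling path)."""
    level = [_h(leaf) for leaf in leaves]
    proof: List[bytes] = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])    # duplicate last node on odd levels
        proof.append(level[index ^ 1])  # sibling of the current node
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(block: bytes, index: int, proof: List[bytes], root: bytes) -> bool:
    """Recompute the root from the block and its sibling path."""
    node = _h(block)
    for sibling in proof:
        # The low bit of the index says whether the node is a right child.
        node = _h(sibling + node) if index & 1 else _h(node + sibling)
        index >>= 1
    return node == root


leaves = [b"block-0", b"block-1", b"block-2"]
root, proof = merkle_root_and_proof(leaves, 2)
accepted = verify_proof(b"block-2", 2, proof, root)
rejected = verify_proof(b"tampered", 2, proof, root)
```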

### Block Presence

Block presence messages indicate whether a peer has specific blocks.

```protobuf
enum BlockPresenceType {
  presenceHave = 0;
  presenceDontHave = 1;
}

message BlockPresence {
  BlockAddress address = 1;
  BlockPresenceType type = 2;
  bytes price = 3;
}
```

**Field Descriptions:**

- `address`: The block address being referenced
- `type`: Whether the peer has the block or not
- `price`: Price (UInt256 format)
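
A responder's presence logic, combined with the `sendDontHave` flag from the WantList, might look like the sketch below. The Python types and `presence_for` are hypothetical, addresses are modeled as plain hashable values, and the `price` field is omitted because its computation is not described here.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Hashable, Iterable, List, Set


class BlockPresenceType(IntEnum):
    PRESENCE_HAVE = 0
    PRESENCE_DONT_HAVE = 1


@dataclass(frozen=True)
class WantEntry:
    """Reduced model of a wantHave entry (illustrative)."""
    address: Hashable
    send_dont_have: bool = False


@dataclass(frozen=True)
class BlockPresence:
    address: Hashable
    type: BlockPresenceType


def presence_for(entries: Iterable[WantEntry], local_blocks: Set[Hashable]) -> List[BlockPresence]:
    """Answer presence checks; report misses only when asked to."""
    presences = []
    for entry in entries:
        if entry.address in local_blocks:
            presences.append(BlockPresence(entry.address, BlockPresenceType.PRESENCE_HAVE))
        elif entry.send_dont_have:
            presences.append(BlockPresence(entry.address, BlockPresenceType.PRESENCE_DONT_HAVE))
    return presences


replies = presence_for(
    [WantEntry("a"), WantEntry("b", send_dont_have=True), WantEntry("c")],
    local_blocks={"a"},
)
```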

### Payment Messages

Payment-related messages for micropayments using Nitro state channels.

```protobuf
message AccountMessage {
  bytes address = 1;  // Ethereum address to which payments should be made
}

message StateChannelUpdate {
  bytes update = 1;  // Signed Nitro state, serialized as JSON
}
```

**Field Descriptions:**

- `AccountMessage.address`: Ethereum address for receiving payments
- `StateChannelUpdate.update`: Nitro state channel update containing payment information