adds functional requirements

Marcin Czenko 2025-08-14 16:54:36 +02:00
parent 4b10871baf
commit f819ccb3de


@ -23,15 +23,25 @@ The resulting `Manifest` is encoded and the corresponding CID - *Manifest CID* -
Because in Codex the Manifest CID is announced on the DHT, the nodes storing the corresponding Manifest block can be found. From the resolved manifest, the nodes storing the relevant blocks can be identified using the `treeCid` attribute. The `treeCid` in Codex is thus similar to the `infoHash` from BitTorrent. In version 2 of the BitTorrent protocol, `infoHash` is also announced on the BitTorrent DHT, but a torrent file or the so-called *magnet link* (also introduced later) has to be distributed out-of-band.
Knowing the purpose of the Codex Manifest, let us now look more closely at selected technical aspects.
From this rationale we see almost immediately that the most important use case for the Codex Manifest in general, and for the Codex Manifest CID in particular, is the ability to uniquely identify the content and to retrieve that content from any single Codex client. This and other functional requirements are the subject of the next section.
## 2. Technical Specification
## 2. Functional Requirements
In this section we describe *how* the manifest should be built from a given dataset. We focus on the manifest only: how the data is chunked and stored on the network is outside the scope of this specification. What we specify here are the manifest attributes and a manifest encoding suitable for network storage.
The Codex client should enable the user to achieve the following use cases.
### Codex Manifest Attributes
- List the manifest CIDs stored locally in the node.
- Upload a file in a streaming manner. Once the upload has finished, the file can be retrieved from any node in the network using the returned manifest CID. The file name and the *MIME type* (if it can be determined) should be recorded in the Codex Manifest.
- Download a file from the local node in a streaming manner. If the file is not available locally, an error message should be returned.
- Delete manifest CID from the local node.
- Given its manifest CID, trigger downloading a file from the network to the local node if it is not yet available locally. The download is performed in the background, and the operation can return before the download has completed.
- Given its manifest CID, download a file from the network in a streaming manner. If the file is not available locally, it will be retrieved from the network on a best-effort basis. This operation has no associated timeout and may take a long time to finish, depending on the availability of the nodes keeping the relevant blocks. The user can interrupt the operation at any time.
- Given a Codex manifest CID, retrieve the corresponding Codex manifest from the local node if it is stored locally; otherwise, download it from the network.
- For each requirement listed above, a compliant Codex client must provide the corresponding API.
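
To make the local-node use cases above more concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration: the class `LocalNodeSketch`, its method names, the `NotAvailableLocally` error, and the `sketch-` identifier (a hex SHA-256 of the content, which is *not* a real Codex manifest CID). It covers only listing, streaming upload, streaming local download, and deletion. Network retrieval and the manifest itself are left out.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List


class NotAvailableLocally(Exception):
    """Hypothetical error: the requested content is not stored on this node."""


class LocalNodeSketch:
    """Toy stand-in for a local node; not the Codex API."""

    def __init__(self) -> None:
        # Maps a stand-in identifier to the stored file content.
        self._store: Dict[str, bytes] = {}

    def upload(self, chunks: Iterable[bytes]) -> str:
        """Consume the file chunk by chunk and return its identifier."""
        hasher = hashlib.sha256()
        buffered = bytearray()
        for chunk in chunks:
            hasher.update(chunk)
            buffered.extend(chunk)
        # Assumption: a hex digest stands in for the real manifest CID.
        identifier = "sketch-" + hasher.hexdigest()
        self._store[identifier] = bytes(buffered)
        return identifier

    def list_manifest_cids(self) -> List[str]:
        """List the identifiers stored locally."""
        return sorted(self._store)

    def download_local(self, identifier: str, chunk_size: int = 4) -> Iterator[bytes]:
        """Stream a locally stored file, or fail if it is not available."""
        if identifier not in self._store:
            raise NotAvailableLocally(identifier)
        data = self._store[identifier]
        return (data[i:i + chunk_size] for i in range(0, len(data), chunk_size))

    def delete(self, identifier: str) -> None:
        """Remove the identifier and its content from the local node."""
        self._store.pop(identifier, None)
```

In this sketch, `download_local` raises before any data is streamed, which mirrors the requirement that a local download returns an error when the file is not available locally.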
In this section we describe the Codex Manifest Attributes, using Nim as an example of a concrete realization.
TBD: here comes the list of APIs with examples.
## 3. Non-functional Requirements
As we already saw in the previous section, given a Codex Manifest CID, the user should be able to use any compliant Codex client to download the content identified by that CID. In this section we focus on the non-functional requirements that guarantee interoperability between compliant Codex clients.
#### multicodecs
@ -39,6 +49,10 @@ In this section we describe the Codex Manifest Attributes together using Nim as
The code of a multicodec is an unsigned integer encoded as an unsigned varint as defined by [multiformats/unsigned-varint](https://github.com/multiformats/unsigned-varint). It is then used as a prefix to identify the data that follows.
For human readability, where appropriate and unambiguous, we can refer to various multicodecs by their symbolic names. For instance, the multicodec code for a SHA-256 [multihash](https://github.com/multiformats/multihash) is `0x12` and its symbolic name is `sha2-256`. In this specification we often refer to multicodecs through a tuple containing the descriptive name and the corresponding hex value, e.g.: `(sha2-256, 0x12)`. There is a canonical table of multicodecs at [table.csv](https://github.com/multiformats/multicodec/blob/master/table.csv). Codex-specific multicodecs are currently defined in [nim-libp2p](https://github.com/vacp2p/nim-libp2p/blob/master/libp2p/multicodec.nim).
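
The varint prefixing described above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function names are made up for this example, and it does not enforce the length limit or the minimal-encoding rule that the unsigned-varint specification places on decoders.

```python
from typing import Tuple


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte, LSB first)."""
    if value < 0:
        raise ValueError("unsigned varint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_uvarint(data: bytes) -> Tuple[int, int]:
    """Decode a varint from the start of data; return (value, bytes consumed)."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("truncated varint")


def add_multicodec_prefix(code: int, payload: bytes) -> bytes:
    """Prefix payload with the varint-encoded multicodec code."""
    return encode_uvarint(code) + payload
```

Codes below `0x80`, such as `(sha2-256, 0x12)`, fit in a single byte, so `encode_uvarint(0x12)` is just `b"\x12"`. Larger codes need several bytes: `0xCD03`, for example, encodes to the three bytes `83 9a 03`.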
### Codex Manifest Attributes
In this section we describe the Codex Manifest Attributes, using Nim as an example of a concrete realization.
#### treeCid
The `treeCid` is the CID of the root of the [[Codex Tree]], which is a form of a Merkle Tree corresponding to the dataset described by the manifest. Its multicodec is `(codex-root, 0xCD03)`.
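
As an illustration of how the `(codex-root, 0xCD03)` multicodec ends up inside a CID, the sketch below assembles the binary form of a CIDv1: the version, the content multicodec, and a multihash, each code written as an unsigned varint. Two things here are assumptions made for the example only: the root digest is wrapped as a `(sha2-256, 0x12)` multihash, and the digest itself is a placeholder, because how the Codex Tree root is actually computed is not covered here. The helper names are hypothetical.

```python
import hashlib

CODEX_ROOT = 0xCD03  # multicodec of the Codex Tree root, as given in the text
SHA2_256 = 0x12      # assumption for this sketch: the root is a SHA-256 digest


def uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def multihash_sha2_256(digest: bytes) -> bytes:
    """Wrap a 32-byte digest as a multihash: <hash code><digest length><digest>."""
    if len(digest) != 32:
        raise ValueError("a SHA-256 digest must be 32 bytes long")
    return uvarint(SHA2_256) + uvarint(len(digest)) + digest


def cid_v1(content_codec: int, multihash: bytes) -> bytes:
    """Binary CIDv1: <version 1><content multicodec><multihash>."""
    return uvarint(1) + uvarint(content_codec) + multihash


# Placeholder digest; a real treeCid would carry the actual tree root here.
placeholder_root = hashlib.sha256(b"placeholder tree root").digest()
tree_cid_bytes = cid_v1(CODEX_ROOT, multihash_sha2_256(placeholder_root))
```

Under these assumptions the resulting byte string is 38 bytes long and begins with `01 83 9a 03 12 20`: the CID version, the varint-encoded `0xCD03`, the `sha2-256` code, and the digest length of 32.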