From b99bd876a9f8247c587740666e8414ac0e6108b9 Mon Sep 17 00:00:00 2001
From: Ben
Date: Fri, 15 Aug 2025 09:42:48 +0200
Subject: [PATCH] data workflows

---
 10 Notes/Component Specification - Codex.md | 52 ++++++++++++++-------
 1 file changed, 34 insertions(+), 18 deletions(-)

diff --git a/10 Notes/Component Specification - Codex.md b/10 Notes/Component Specification - Codex.md
index 0652c1f..b22d876 100644
--- a/10 Notes/Component Specification - Codex.md
+++ b/10 Notes/Component Specification - Codex.md
@@ -42,6 +42,8 @@ List of features or behaviors the component must provide.
 - The component provides logging information.
 - The component returns error objects in case of errors.
 - The component returns nullable or optional values where appropriate.
+- The component provides other components with their required configuration information.
+- The component ensures configuration information is safely persisted.
 ### Network
 - The component enables the user to view the current network status information as provided by the p2p module and discovery module.
 - The component enables the user to query the discovery module for connection information.
@@ -73,31 +75,45 @@ Description of algorithms, workflows, or state machines.
 Include diagrams if needed.
 ### Data
-#### Reading data
+#### Block retrieval
 - Given a CID
 1. Engage local data storage module to check for block presence
-1. If present, use local data storage module to read the data
+1. If present, use local data storage module to read and yield the block
 1. Otherwise, use block exchange to retrieve the block from the network
-1. When completed, read the data
-1. Otherwise, fail with retrieval exception
+1. When completed, use the local data storage module to store the block
+1. Yield the block
+#### Reading data block
+- Given a data block CID
+- Given successful block retrieval
+1. Read and yield the block data
+
+#### Reading a manifest
+- Given a manifest CID
+- Given successful block retrieval
+1. Read and deserialize the manifest
+1. Yield the manifest object
+
+#### Reading dataset
+- Given a manifest CID
+- Given the manifest was read successfully
+1. Iterate the data block information from the manifest
+1. Read each data block and yield its data
+
+#### Writing dataset
+- Given a raw dataset in the form of a stream
+1. Read data from the stream until a block-sized amount is read or until the stream ends, whichever comes first.
+1. If the read data is less than a block-sized amount, pad the remaining space with NULL-bytes.
+1. Engage the local data storage module to store the data as a block. It will yield a CID.
+1. Append the CID to a temporary array.
+1. Repeat the previous steps until the stream ends.
+1. Engage the manifest module to create a manifest object from the array of CIDs.
+1. Serialize the manifest object.
+1. Engage the local data storage module to store the serialized manifest data. It will yield a CID.
+1. Yield the manifest CID.
 *** EACH FLOW ***
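
Review note: the block retrieval, dataset writing, and dataset reading flows added by this patch could be sketched roughly as below. This is only an illustration under stated assumptions, not the component's actual API: `LocalStore`, `retrieve_block`, `write_dataset`, `read_dataset`, the `fetch` callback (standing in for the block exchange), the SHA-256 hex digest used in place of a real CID, the JSON manifest layout, and the 16-byte `BLOCK_SIZE` are all hypothetical names and values chosen for the example.

```python
import hashlib
import io
import json
from typing import BinaryIO, Callable, Dict, Iterator, List

BLOCK_SIZE = 16  # hypothetical value, kept tiny so the example is easy to follow


class LocalStore:
    """Hypothetical in-memory stand-in for the local data storage module."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Placeholder identifier: a SHA-256 hex digest, not a real CID.
        cid = hashlib.sha256(data).hexdigest()
        self._blocks[cid] = data
        return cid

    def has(self, cid: str) -> bool:
        return cid in self._blocks

    def get(self, cid: str) -> bytes:
        return self._blocks[cid]


def retrieve_block(store: LocalStore, cid: str,
                   fetch: Callable[[str], bytes]) -> bytes:
    """Block retrieval: local store first, otherwise fetch, store, then yield."""
    if store.has(cid):
        return store.get(cid)
    block = fetch(cid)  # assumed stand-in for the block exchange
    store.put(block)
    return block


def _read_block(stream: BinaryIO, size: int) -> bytes:
    """Read until `size` bytes are collected or the stream ends."""
    buf = bytearray()
    while len(buf) < size:
        piece = stream.read(size - len(buf))
        if not piece:
            break
        buf.extend(piece)
    return bytes(buf)


def write_dataset(store: LocalStore, stream: BinaryIO,
                  block_size: int = BLOCK_SIZE) -> str:
    """Writing dataset: chunk, NUL-pad the last block, store, build the manifest."""
    cids: List[str] = []
    while True:
        chunk = _read_block(stream, block_size)
        if not chunk:
            break
        if len(chunk) < block_size:
            chunk = chunk.ljust(block_size, b"\x00")
        cids.append(store.put(chunk))
    # Assumed manifest format: a JSON object listing the block identifiers.
    manifest = json.dumps({"blocks": cids}).encode("utf-8")
    return store.put(manifest)


def read_dataset(store: LocalStore, manifest_cid: str,
                 fetch: Callable[[str], bytes]) -> Iterator[bytes]:
    """Reading dataset: read the manifest, then yield each data block in order."""
    manifest = json.loads(retrieve_block(store, manifest_cid, fetch))
    for cid in manifest["blocks"]:
        yield retrieve_block(store, cid, fetch)


if __name__ == "__main__":
    def no_network(cid: str) -> bytes:
        raise LookupError(f"block {cid} not available")

    store = LocalStore()
    manifest_cid = write_dataset(store, io.BytesIO(b"hello codex dataset!"))
    data = b"".join(read_dataset(store, manifest_cid, no_network))
    print(len(data))  # 32: two 16-byte blocks, the second NUL-padded
```

In this sketch the reader gets the padding bytes back, because the flows as written do not say where the original dataset length is recorded. If the manifest is meant to carry that length, the read flow could use it to trim the final block.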