From 6da57469d6b89af4a29be7b30fe1bc017662b0f9 Mon Sep 17 00:00:00 2001
From: Corey Petty
Date: Mon, 2 Oct 2023 20:42:43 -0400
Subject: [PATCH] Codex sept monthly to review

---
 content/codex/monthly-reports/2023-sept.md | 40 ++++++++++++++++++----
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/content/codex/monthly-reports/2023-sept.md b/content/codex/monthly-reports/2023-sept.md
index f0fa176c5..6154ed3fa 100644
--- a/content/codex/monthly-reports/2023-sept.md
+++ b/content/codex/monthly-reports/2023-sept.md
@@ -5,13 +5,14 @@ lastmod: 2023-09-18
 ---

 ## Executive Summary
+September updates for the Codex project were mainly focused on the ongoing research and analysis of the proofing schemes and their impact on the overall architecture and network economy.
+
 ## Key Updates

 ### Personnel
-A new Business Development job description [was posted](https://jobs.status.im/?gh_jid=5329400) and candidates are currently being interviewed. This role is expected to help facilitate strategy around the much needed partnerships for Codex and laisse with the other BD related resources we have within Logos to ensure efficient communications.
+A new Business Development (BD) job description [was posted](https://jobs.status.im/?gh_jid=5329400) and candidates are currently being interviewed. This role is expected to help facilitate strategy around the much-needed partnerships for Codex and liaise with the other BD-related resources we have within Logos to ensure efficient communications.

 ### Milestones
-
 The Codex team is broken up into 5 sections, and the weekly reports give details on how each of them have performed. Currently the Milestone definitions are not in line with this reporting process and will be worked on in the subsequent month. The teams are broken up into the following sections:
 - `Client`
 - `Infra`
@@ -41,6 +42,26 @@ In order to alleviate a concurrency issue with Data Availabilities in the contra

 #### Research
 The [Codex Whitepaper v0.1](https://docs.google.com/document/d/1LCy23m90IHf32aUVhRT4r4772w1BfVcSLaJ0z9VTw9A/edit#heading=h.qs3bayckj5u4) was drafted and scheduled for release in October 2023. It is currently under review and improving based on feedback.
+There was extensive discussion this month around Erasure Coding (EC) for sampling. An [analysis](https://github.com/codex-storage/zk-research-artifacts/blob/master/sampling/sampling.pdf) was performed which looked at the various effects Erasure Coding schemes have on the sampling process and associated data guarantees. A quote of the conclusion on parameter choices is below:
+
+> [!QUOTE]
+> - we cannot have a small slot size, because that would mean too many proofs by a node (≈ 1 Tb seems to be a minimum)
+> - we cannot have a too small block size, because the Merkle tree of the commitments will take too much space (say a minimum of 1024 bytes)
+> - we cannot have a too big “checked sample” size, because we cannot do proofs for large amount of data (say a maximum of 65536 bytes)
+> - we cannot have too much sampling checks per slot, because we cannot do proofs for many samples (depends on the block size and SNARK tech)
+> - we probably want as big N, K parameters as possible, but actual implementations have limit
+
+A short review of the [Interleaving Schemes for Multidimensional Cluster Errors](https://ieeexplore.ieee.org/abstract/document/661516) was performed [here](https://hackmd.io/DxJzAuTZROulBhPWqScmCg?view) and some general notes on Erasure Coding as it pertains to Codex were written up [here](https://hackmd.io/kxSF8wjPS3arDFcqFJrNDw). Many of these thoughts are being captured in the Erasure Coding Proofing document [here](https://github.com/codex-storage/codex-research/blob/80a88c19989f5095b71db306393fc030df278673/design/proof-erasure-coding.md). The conclusion section (at time of writing) is copied here for convenience:
+
+> [!QUOTE]
+> It is likely that with the current state of the art in SNARK design and erasure coding implementations we can only support slot sizes up to 4GB. There are two design directions that allow an increase of slot size. One is to extend or implement an erasure coding implementation to use a larger field size. The other is to use existing erasure coding implementation in a multi-dimensional setting.
+>
+>Two concrete options are:
+>
+>1. Erasure code with a field size that allows for 2^28 shards. Check 20 shards per proof. For 1TB this leads to shards of 4KB. This means the SNARK needs to hash 80KB plus the Merkle paths for a storage proof. Requires custom implementation of Reed-Solomon, and requires at least 1 GB of memory while performing erasure coding.
+>2. Erasure code with a field size of 2^16 in two dimensions. Check 160 shards per proof. For 1TB this leads to a shards of 256 bytes. This means that the SNARK needs to hash 40KB plus the Merkle paths for a storage proof. We can use the leopard library for erasure coding and keep memory requirements for erasure coding to a negligable level.
+
+The team appears to prefer the multi-dimensional approach to EC.

 #### DAS
 Work continues on the DAS research in coordination with the Ethereum Foundation (EF). As a result of SBC, a blog post was written by the EF that discussed a forward thinking proposal for[ _PeerDAS - a simpler DAS approach using battle-tested p2p components_](https://ethresear.ch/t/peerdas-a-simpler-das-approach-using-battle-tested-p2p-components/16541) which the team has contributed to (referenced inside). Conversations of relevancy continue.
@@ -54,22 +75,29 @@ A Codex Blog post [was published](https://blog.codex.storage/big-blocks-on-mainn
 Discussions with Felix Lange began around some fixes for `Discv5`.

 ### Other
-A [Codex YouTube channel](https://www.youtube.com/@CodexStorage) has been setup and many tutorial videos and conference talks were uploaded.
+A [Codex YouTube channel](https://www.youtube.com/@CodexStorage) has been set up and many tutorial videos and conference talks were uploaded. Go like and subscribe!

 ## Perceived Changes in Project Risk
-In an effort to meet the MVP launch by the end of the year, significant resources have been diverted to engineering efforts.
+In an effort to meet the MVP launch by the end of the year, significant resources have been diverted to engineering efforts. Jessie has taken on more responsibility in the administration and project management duties while Dmitriy has started to focus more on the research and engineering needs.
+
+The ongoing research around the Data Availability Proof system could still drastically change the overall architecture of the system and the associated resource costs for the various participants within the Codex Network. It is unclear how "locked in" the parts of the system included in the MVP launch are.

 ## Future Plans

 ### Insight
-Because of the mismatch of weekly updates with Milestone definitions, it is difficult to assess the impact of any given update. Next month should have all milestone definitions within this site and a reporting structure that is more intuitively associated with it.
+Because of the mismatch of weekly updates with Milestone definitions, it is difficult to assess the impact of any given update. Next month should have all milestone definitions within this site and a reporting structure that is more intuitively associated with it. It has been noted that the current structure makes it difficult to track cross-team work, which next month's changes aim to fix.
 A `Logos Collaborations` section will be included next month to highlight differences in alignment with the Logos Collective as well as cross project collaboration updates.
+The reporting process has missed a lot of work around the network simulation and modeling of Codex, which we expect the previously mentioned actions to correct by next month.
-
+Depending on the uptake and viability of the [Waku reporting process](https://github.com/waku-org/pm) for other projects, a myriad of quantitative measures will be included in the next monthly report.

 ### Project
+NEED INPUT HERE

 ## Sources and Useful Links
+- [Zenhub tracker](https://app.zenhub.com/workspaces/engineering-62cee4c7a335690012f826fa/roadmap) - mapping of milestones -> epics -> issues
+- [May 2023 - Codex Project Status Report](https://docs.google.com/spreadsheets/d/1ejNnnBPeHqyJBqPfpZyfNGYCV10ae2pvc1mUQm2vOtU/edit#gid=1180662520) - list of risks and details on the milestones
+
 Weekly Reports
 - [[2023-09-15]]