| title | tags | date | lastmod | draft |
|---|---|---|---|---|
| 2023-07-21 Codex weekly | | 2023-07-21 | 2023-08-03 | false |
Codex update 07/12/2023 to 07/21/2023
Overall, we continue working in various directions: distributed testing, the marketplace, the p2p client, research, etc.
Our main milestone is to have a fully functional testnet with the marketplace and durability guarantees deployed by the end of the year. A lot of grunt work is being done to make that possible. Progress is steady, but there is a lot of stabilization, testing, and infrastructure work going on.
We're also onboarding several new members to the team (four, to be precise). This will ultimately accelerate our progress, but it requires some upfront investment from the more experienced team members.
DevOps/Infrastructure:
- Adopted nim-codex Docker builds for Dist-Tests.
- Ordered a dedicated node on Hetzner.
- Configured a Hetzner StorageBox for local backup on the dedicated server.
- Configured a new log shipper and Grafana in the Dist-Tests cluster.
- Created Geth and Prometheus Docker images for Dist-Tests.
- Created a separate codex-contracts-eth Docker image for Dist-Tests.
- Set up an Ingress Controller in the Dist-Tests cluster.
Testing:
- Set up the deployer to gather metrics.
- Debugged and identified a potential deadlock in the Codex client.
- Added metrics, built an image, and ran tests.
- Updated the dist-test log format for Kibana compatibility.
- Ran dist-tests on a new master image.
- Debugged the continuous tests.
Development:
- Worked on codex-dht Nimble updates and fixed a key-format issue.
- Updated CI and split Windows CI tests to run on two CI machines.
- Continued updating dependencies in codex-dht.
- Fixed decoding of large manifests (PR #479).
- Explored the existing implementation of NAT traversal techniques in nim-libp2p.
Research:
- Explored additional directions for remote verification techniques and the interplay of different encoding approaches and cryptographic primitives.
- Onboarded Balázs as our ZK researcher/engineer.
- Continued research on DAS-related topics.
- Ran simulations on the newly set up infrastructure.
- Devised a new direction to reduce metadata overhead and enable remote verification: https://github.com/codex-storage/codex-research/blob/master/design/metadata-overhead.md
- Looked into NAT traversal (issue #166).
Cross-functional (Combination of DevOps/Testing/Development):
- Fixed discovery-related issues.
- Planned the Codex demo update for the Logos event and prepared the environment for the demo.
- Described requirements for the Dist-Tests log format and updated the format for Kibana compatibility.
- Configured the Hetzner dedicated server.
Conversations:
- zk_id — 07/24/2023 11:59 AM
We've explored VDI for rollups ourselves in the last week, curious to know your thoughts
- dryajov — 07/25/2023 1:28 PM
It depends on what you mean. From a high level, (A)VID is probably the closest thing to DAS in academic research; in fact, DAS is probably either a subset or a superset of VID, so it's definitely worth digging into. But I'm not sure what exactly you're interested in, in the context of rollups...
-
zk_id — 07/25/2023 3:28 PM
The rollup part seems to be the basis for choosing proofs that scale linearly with the number of nodes (which makes it impractical for large numbers of nodes). The protocol is very simple, and would only need to instead provide constant-size proofs with the Kate commitments (at the cost of large computational resources, is my understanding). This was at least the rationale that I got from reading the paper and the conversation with Bunz, one of the founders of the Espresso shared sequencer (which is where I found the first reference to this paper). I guess my main open question is why you would do the sampling if you can do VID in the context of blockchains as well. With the proofs of dispersal on-chain, you wouldn't need sampling for agreement on the dispersal. You would still need the sampling for the light clients, though, of course.
-
dryajov — 07/25/2023 8:31 PM
I guess my main open question is why you would do the sampling if you can do VID in the context of blockchains as well. With the proofs of dispersal on-chain, you wouldn't need sampling for agreement on the dispersal.
Yeah, great question. What follows is strictly IMO, as I haven't seen this formally contrasted anywhere, so my reasoning can be wrong in subtle ways.
- (A)VID - dispersing and storing data in a verifiable manner
- Sampling - verifying already dispersed data
tl;dr Sampling allows light nodes to protect against dishonest-majority attacks. In other words, a light node cannot be tricked into following an incorrect chain by a dishonest validator majority that withholds data. More details are here - https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html
First, DAS implies (A)VID, as there is an initial phase where data is distributed to some subset of nodes. Moreover, these nodes, usually the validators, attest that they received the data and that it is correct. If a majority of validators accepts, then the block is considered correct; otherwise it is rejected. This is the verifiable-dispersal part. But what if the majority of validators are dishonest? Can you prevent them from tricking the rest of the network into following the chain?
-
[8:31 PM]
Dealing with dishonest majorities
This is easy if all the data is downloaded by all nodes all the time, but we're trying to avoid just that. But let's assume, for the sake of the argument, that there are full nodes in the network that download all the data and are able to construct fraud proofs for missing data; can this mitigate the problem? It turns out that it can't, because proving data (un)availability isn't a directly attributable fault - in other words, you can observe/detect it, but there is no way to prove it to the rest of the network reliably. More details here: https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding
So, if there isn't much that can be done by detecting that a block isn't available, what good is detection? Well, nodes can still avoid following the unavailable chain and thus avoid being tricked by a dishonest majority. However, simply attesting that data has been published is not enough to prevent a dishonest majority from attacking the network. (edited)
-
dryajov — 07/25/2023 9:06 PM
To complement, the relevant quote from https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding is:
Here, fraud proofs are not a solution, because not publishing data is not a uniquely attributable fault - in any scheme where a node ("fisherman") has the ability to "raise the alarm" about some piece of data not being available, if the publisher then publishes the remaining data, all nodes who were not paying attention to that specific piece of data at that exact time cannot determine whether it was the publisher that was maliciously withholding data or whether it was the fisherman that was maliciously making a false alarm.
The relevant quote from https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html is:
There is one gap in the solution of using fraud proofs to protect light clients from incorrect state transitions: What if a consensus supermajority has signed a block header, but will not publish some of the data (in particular, it could be fraudulent transactions that they will publish later to trick someone into accepting printed/stolen money)? Honest full nodes, obviously, will not follow this chain, as they can’t download the data. But light clients will not know that the data is not available since they don’t try to download the data, only the header. So we are in a situation where the honest full nodes know that something fishy is going on, but they have no means of alerting the light clients, as they are missing the piece of data that might be needed to create a fraud proof.
Both articles are a bit old, but the intuitions still hold.
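To make the sampling intuition above concrete, here is a minimal back-of-the-envelope sketch (an illustration added for clarity, not part of the conversation; the rate-1/2 code and the sample counts are assumptions). With a rate-1/2 erasure code, any 50% of the chunks reconstruct the block, so an attacker must withhold more than half of the chunks; a light client sampling k random chunks then misses the withholding with probability at most (1/2)^k:

```python
# Back-of-the-envelope for data-availability sampling.
# Assumption: a rate-1/2 erasure code, so any 50% of chunks reconstruct the
# block, and an attacker must withhold > 50% of chunks to make it
# unavailable. Sample counts below are illustrative, not Codex parameters.

def miss_probability(samples: int, withheld_fraction: float = 0.5) -> float:
    """Chance that every one of `samples` uniformly random chunk queries
    lands on an available chunk, i.e. the withholding goes unnoticed."""
    return (1.0 - withheld_fraction) ** samples

for k in (5, 10, 20, 30):
    print(f"{k:2d} samples -> miss probability ~ {miss_probability(k):.2e}")
```

Twenty samples already push the miss probability below one in a million, which is why light nodes can check availability without downloading whole blocks.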
July 26, 2023
-
zk_id — 07/26/2023 10:42 AM
Thanks a ton @dryajov ! We are on the same page. TBH it took me a while to get to this point, as it's not an intuitive problem at first. The relationship between VID and DAS, and what each is for, is crucial for us, btw. Your writing here and your references give us the confidence that we understand the problem and are equipped to evaluate the different solutions. Deeply appreciate that you took the time to write this; it is very valuable.
-
[10:45 AM]
The dishonest majority is a critical scenario for Nomos (an essential part of the whole sovereignty narrative), and is generally not considered by most blockchain designs
-
zk_id
Thanks a ton @dryajov ! We are on the same page. TBH it took me a while to get to this point, as it's not an intuitive problem at first. The relationship between VID and DAS, and what each is for, is crucial for us, btw. Your writing here and your references give us the confidence that we understand the problem and are equipped to evaluate the different solutions. Deeply appreciate that you took the time to write this; it is very valuable.
dryajov — 07/26/2023 4:42 PM
Great! Glad to help anytime
-
zk_id
The dishonest majority is a critical scenario for Nomos (an essential part of the whole sovereignty narrative), and is generally not considered by most blockchain designs
dryajov — 07/26/2023 4:43 PM
Yes, I'd argue it is crucial in a network with distributed validation, where all nodes are either fully light or partially light nodes.
-
[4:46 PM]
Btw, there is probably more we can share/compare notes on in this problem space. We're looking at similar things, perhaps from a slightly different perspective in Codex's case, but the work done on DAS directly with the EF is probably very relevant for you as well
July 27, 2023
-
zk_id — 07/27/2023 3:05 AM
I would love to. Do you have those notes somewhere?
-
zk_id — 07/27/2023 4:01 AM
all the links you have, anything, would be useful
-
zk_id
I would love to. Do you have those notes somewhere?
dryajov — 07/27/2023 4:50 PM
A bit scattered all over the place, mainly from @Leobago and @cskiraly. @cskiraly has a draft paper somewhere
July 28, 2023
-
zk_id — 07/28/2023 5:47 AM
Would love to see anything that is possible
-
[5:47 AM]
Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us
-
zk_id
Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us
dryajov — 07/28/2023 4:07 PM
Yes, we're also working in this direction, as this is crucial for us as well. There should be some results coming soon(tm), now that @bkomuves is helping us with this part.
-
zk_id
Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us
bkomuves — 07/28/2023 4:44 PM
My current view (it's changing pretty often :) is that there is tension between:
- commitment cost
- proof cost
- and verification cost
The holy grail that is best for all of them doesn't seem to exist. Hence, you have to make tradeoffs, and it depends on your specific use case what you should optimize for, or what balance you aim for. We plan to find some points in this 3-dimensional space that are hopefully close to the optimal surface, in parallel figure out what balance to aim for, and then choose a solution based on that (and also based on what's possible; there are external restrictions).
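The "points in this 3-dimensional space close to the optimal surface" framing can be made concrete with a small sketch (added for illustration; the scheme names and cost numbers are made-up placeholders, not measurements): keep only the candidates that no other candidate beats on all three axes at once.

```python
# Sketch of the commitment/proof/verification trade-off space.
# Scheme names and costs are placeholders, not real benchmarks.

candidates = {
    "scheme-A": (1.0, 9.0, 2.0),  # (commitment, proof, verification) cost
    "scheme-B": (6.0, 2.0, 1.0),
    "scheme-C": (3.0, 4.0, 3.0),
    "scheme-D": (7.0, 5.0, 4.0),  # worse than scheme-C on every axis
}

def dominates(a: tuple, b: tuple) -> bool:
    """True if `a` is no worse than `b` on every axis and differs somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

# The non-dominated schemes approximate the "optimal surface" to choose from.
pareto = {
    name: cost
    for name, cost in candidates.items()
    if not any(dominates(other, cost) for other in candidates.values())
}
print(pareto)  # scheme-D drops out; A, B, and C are the available trade-offs
```

Which of the surviving points to pick is then exactly the "what balance to aim for" question above.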
July 29, 2023
-
bkomuves
My current view (it's changing pretty often :) is that there is tension between:
- commitment cost
- proof cost
- and verification cost
The holy grail that is best for all of them doesn't seem to exist. Hence, you have to make tradeoffs, and it depends on your specific use case what you should optimize for, or what balance you aim for. We plan to find some points in this 3-dimensional space that are hopefully close to the optimal surface, in parallel figure out what balance to aim for, and then choose a solution based on that (and also based on what's possible; there are external restrictions).
zk_id — 07/29/2023 4:23 AM
I agree. That's also my understanding (although surely much more superficial).
-
[4:24 AM]
There is also the dimension of computation vs size cost
-
[4:25 AM]
i.e., the VID scheme (of the paper that kickstarted this conversation) has all the properties we need, but it scales as n^2 in message complexity, which makes it lose the properties we are looking for beyond 1k nodes. We need to scale comfortably to 10k nodes.
-
[4:29 AM]
So at the moment we are most likely to use KZG commitments with a 2D RS polynomial. Basically just copy Ethereum. The reasons are:
- Our rollup/EZ leader will generate this, and those are beefier machines than the base layer's. The base-layer nodes just need to verify and sign the EC fragments and return them to complete the VID protocol (and then run consensus on the aggregated signed proofs).
- If we ever decide to change the design so that VID dispersal is done by base-layer leaders (in a multi-leader fashion), it can be distributed (rows/columns can be reconstructed and proven separately). I don't think we will pursue this, but we will have to if this scheme doesn't scale with the first option.
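For a rough sense of why the n^2 message complexity mentioned earlier bites somewhere between 1k and 10k nodes, here is a toy comparison (an illustrative model added for clarity; the counts are not from the paper under discussion): total dispersal traffic that grows quadratically versus one constant-size fragment plus proof per node.

```python
# Toy comparison of dispersal message counts; purely illustrative.
# "quadratic" models a VID scheme with O(n^2) total messages;
# "linear" models one constant-size fragment + proof per node (O(n) total),
# as with KZG-style constant-size openings.

for n in (1_000, 10_000):
    quadratic = n * n  # O(n^2) dispersal messages
    linear = n         # one fragment + proof per node
    print(f"n={n:6d}: quadratic ~ {quadratic:.1e}, linear ~ {linear:.1e}, "
          f"ratio {quadratic // linear}x")
```

Going from 1k to 10k nodes multiplies the quadratic cost by 100x but the linear cost only by 10x, which is the gap that motivates constant-size proofs despite their heavier proving cost.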
August 1, 2023
-
dryajov
A bit scattered all over the place, mainly from @Leobago and @cskiraly. @cskiraly has a draft paper somewhere
Leobago — 08/01/2023 1:13 PM
Not much public write-up yet. You can find some content on the Codex Storage Blog and in the codex-storage/das-research GitHub repository, which hosts the research on DAS from the collaboration between Codex and the EF. We also have a few Jupyter notebooks, but they are not public yet. As soon as that content is out we can let you know.
-
zk_id
So at the moment we are most likely to use KZG commitments with a 2D RS polynomial. Basically just copy Ethereum. The reasons are:
- Our rollup/EZ leader will generate this, and those are beefier machines than the base layer's. The base-layer nodes just need to verify and sign the EC fragments and return them to complete the VID protocol (and then run consensus on the aggregated signed proofs).
- If we ever decide to change the design so that VID dispersal is done by base-layer leaders (in a multi-leader fashion), it can be distributed (rows/columns can be reconstructed and proven separately). I don't think we will pursue this, but we will have to if this scheme doesn't scale with the first option.
dryajov — 08/01/2023 1:55 PM
This might interest you as well - https://blog.subspace.network/combining-kzg-and-erasure-coding-fc903dc78f1a
-
[1:56 PM]
This is a great analysis of the current state of the art in the structure of data + commitments and their interplay. I would also recommend reading the first article of the series, which it links to.
-
zk_id — 08/01/2023 3:04 PM
Thanks @dryajov @Leobago ! Much appreciated!
-
[3:05 PM]
Very glad that we can discuss these things with you. Maybe I'll have some specific questions once I finish reading the huge pile of pending docs that I'm tackling starting today...
-
zk_id — 08/01/2023 6:34 PM
@Leobago @dryajov I was playing with the DAS simulator. It seems the results are a bunch of XML. Is there a way to visualize the results?
-
zk_id
@Leobago @dryajov I was playing with the DAS simulator. It seems the results are a bunch of XML. Is there a way to visualize the results?
Leobago — 08/01/2023 6:36 PM
Yes, check out the visual branch and make sure to enable plotting in the config file; it should produce a bunch of figures
-
zk_id — 08/01/2023 7:44 PM
Thanks!