dagger-research/design/storage-proof-timing.md

Timing of Storage Proofs
========================

We present a design that allows a smart contract to determine when proofs of
storage should be provided by hosts.

Context
-------

Hosts that are compensated for providing storage of data, are held accountable
by providing proofs of storage periodically. It's important that a host is not
able to pre-compute those proofs, otherwise it could simply delete the data and
only store the proofs.

A smart contract should be able to check whether those proofs were delivered in
a correct and timely manner. Either the smart contract will be used to perform
these checks directly, or as part of an arbitration mechanism to keep validators
honest.

Design 1: block by block
------------------------

A first design used the property that blocks on Ethereum's main chain arrive in
a predictable cadence; about once every 14 seconds a new block is produced. The
idea is to use the block hashes as a source of non-predictable randomness.

From the block hash you can derive a challenge, which the host uses together
with the stored data to generate a proof of storage.

Furthermore, we use the block hash to perform a die roll to determine whether a
proof is required or not. For instance, if a storage contract stipulates that a
proof is required once every 1000 blocks, then each new block hash leads to a 1
in 1000 chance that a proof is required. This ensures that a host should always
be able to deliver a storage proof at a moments notice, while keeping the total
costs of generating and validating proofs relatively low.

Problems with block cadence
---------------------------

We see a couple of problems emerging from this design. The main problem is that
block production rate is not as reliable as it may seem. The steady cadence of
the Ethereum main net is (by design) not shared by L2 solutions, such as
rollups.

Most L2 solutions therefore [warn][2] [against][3] the use of the block interval
as a measure of time, and tell you to use the block timestamp instead. Even
though these are susceptible to some miner influence, over longer time intervals
they are deemed reliable.

Another issue that we run into is that on some L2 designs, block production
increases when there are more transactions. This could lead to a death spiral
where an increasing number of blocks leads to an increase in required proofs,
leading to more transactions, leading to even more blocks, etc.

And finally, because storage contracts between a client and a host are best
expressed in wall clock time, there are going to be two different ways of
measuring time in the same smart contract, which could lead to some subtle bugs.

These problems lead us to the second design.

Design 2: block pointers
------------------------

In our second design we separate cadence from random number selection. For
cadence we use a time interval measured in seconds. This divides time into
periods that have a unique number. Each period represents a chance that a proof
is required.

We want to associate a random number with each period that is used for the proof
of storage challenge, and for the die roll to determine whether a proof is
required. But since we no longer have a one-on-one relationship between a period
and a block hash, we need to get creative.

EVM and solidity
----------------

For context, our smart contracts are written in Solidity and execute on the EVM.
In this environment we have access to the [most recent 256 block hashes][1], and
to the current time, but not to the timestamps of the previous blocks. We also
have access to the current block number.

Block pointers
--------------

We introduce the notion of a block pointer. This is a number between 0 and 256
that points to one of the latest 256 block hashes. We count from 0 (latest
block) to 255 (oldest available block).

         oldest                              latest
     - - - |-----------------------------------|
          255                    ^             0
                                 |
                              pointer

We want to associate a block pointer with a period such that it keeps pointing
to the same block hash when new blocks are produced. We need this because the
block hash is used to check whether a proof is required, to check a proof when
it's submitted, or to prove absence of a proof, all at different times.

To ensure that the block pointer points to the same block hash for longer
periods of time, we derive it from the current block number:

    pointer(period) = (blocknumber + period) % 256

Each time a new block is produced the block pointer increases by one, which
ensures that it keeps pointing to the same block. Over time, when more blocks
are produced, we get this picture:

     |
     |      - - - |-----------------------------------|
     |           255                          ^       0
     |                                        |
     |                                     pointer
     t
     i      - - - |-----------------------------------|
     m           255                      ^           0
     e                                    |
     |                                  pointer
     |
     |      - - - |-----------------------------------|
     |           255                  ^               0
     |                                |
     v                             pointer

Avoiding surprises
------------------

There is one problem left when we use the pointer as we've just described.
Because of the modulus, there are periods in which the pointer wraps around. It
moves from 255 to 0 from one block to the next. This is undesirable because it
would mean that new proof requirements could all of a sudden appear, leaving too
little time for the host to calculate and submit a proof.

We identified two ways of dealing with this problem: pointer downtime and
pointer duos. Pointer downtime appears to be the simplest solution, so we
present it here. Pointer duos are described in the appendix.

Pointer downtime
----------------

We ignore any proof requirements when the pointer is pointing to one of the most
recent blocks:

     - - - |-------------------------|/////////|
          255                            ^     0
                                         |
                                       pointer

When the pointer is in the grey zone, no proof is required. The amount of blocks
in the grey zone should be chosen such that it is not possible to generate them
all inside a period; this ensures that a host cannot be surprised by new proof
requirements popping up.

If we want a host to provide a proof on average once every N periods, it now no
longer suffices to have a 1 in N chance to provide a proof. Because there are no
proof requirements in the grey zone, the odds have changed in favor of the host.
To compensate, the odds outside of the grey zone should be increased. For
instance, if the grey zone is 64 blocks (¼ of the available blocks), then the
odds of requiring a proof should be 1 in ¾N.

--------------------------------------------------------------------------------

Appendix: Pointer duos
----------------------

An alternative solution to the pointer wrapping problem is pointer duos:

    pointer1(period) = (blocknumber + period) % 256
    pointer2(period) = (blocknumber + period + 128) % 256

The pointers are 128 blocks apart, ensuring that when one pointer wraps, the
other remains stable.

     - - - |-----------------------------------|
          255   ^                ^             0
                |                |
              pointer          pointer

We allow hosts to choose which of the two pointers to use. This has implications
for the die roll that we perform to determine whether a proof is required.

If we want a host to provide a proof on average once every N periods, it no
longer suffices to have a 1 in N chance to provide a proof. Should a host be
completely free to choose between the two pointers (which is not entirely true,
as we shall see shortly) then the odds of a single pointer should be 1 in √N to
get to a probability of `1/√N * 1/√N = 1/N` of both pointers leading to a proof
requirement.

In reality, a host will not be able to always choose either of the two pointers.
When one of the pointers is about to wrap before validation is due, it can no
longer be relied upon. A really conservative host would follow the strategy of
always choosing the pointer that points to the most recent block, requiring the
odds to be 1 in N. A host that tries to optimize towards providing as little
proofs as necessary will require the odds to be nearer to 1 in √N.

[1]: https://docs.soliditylang.org/en/v0.8.12/units-and-global-variables.html#block-and-transaction-properties
[2]: https://community.optimism.io/docs/developers/build/differences/#block-numbers-and-timestamps
[3]: https://support.avax.network/en/articles/5106526-measuring-time-in-smart-contracts