14 KiB
A marketplace for storage durability
We present a new design for a storage marketplace, that is both simpler and includes incentives for repair.
Context
Our current storage marketplace is designed around the notion of sending out requests for storage, waiting for hosts to offer storage, and then choosing a selection from these hosts to start a storage contract with. It requires separate contracts for each of these hosts, active participation of the client during the negotiation phase, and does not yet have any provisions for repairing storage when hosts fail to deliver on their contracts.
In this document we describe a new design that is simpler, requires less interactions, and has repair incentives built in.
A new design
We propose to create new type of storage contract, containing a number of slots. Each of these slots represents an agreement with a storage host to store a part of the content. When a client wants store data on the network with durability guarantees, it posts a storage Request on the blockchain. Hosts that want to offer storage can fill a slot in the Request.
--------
---- fill slot --- | Host |
| --------
|
v
--------------
---------- | | --------
| Client | --- request ---> | Blockchain | <--- fill slot --- | Host |
---------- | | --------
--------------
^
|
| --------
---- fill slot --- | Host |
--------
The Request contains the content identifier, so that hosts can locate and download the content. It also contains the reward that hosts receive for storing the data and the collateral that hosts are expected to deposit. It contains parameters pertaining to storage proofs and erasure coding. And finally, it contains the amount of hosts that are expected to store the content, including a small amount of host losses that can be tolerated.
Request
cid # content identifier
reward # tokens paid per second per filled slot
collateral # amount of collateral required per host and slot
proof probability # frequency at which proofs are required
proof parameters # proof of retrievability parameters
erasure coding # erasure coding parameters
dispersal # dispersal parameter
repair reward # amount of tokens paid for repairs
hosts # amount of storage hosts (including loss)
loss # number of allowed host losses
slots # assigned host slots
expire # slots need to be filled before timeout
Slots
Initially all host slots are empty. An empty slot can be filled by anyone by submitting a correct storage proof together with collateral.
proof & proof &
collateral proof missed collateral missed
| | | | |
v v v v v
-------------------------------------------------------------------
slot: |///////////////////////| |////////////////////|
-------------------------------------------------------------------
| |
v v
collateral collateral
lost lost
---------------- time ---------------->
The time interval that a slot is filled by a host determines the host payout; for every second of the interval a certain amount of tokens are awarded to the host. Hosts that fill a slot are required to submit frequent proofs of storage.
When a certain number of proofs is missed, the slot is considered empty again. The collateral associated with the slot is mostly burned. Some of it is used to pay a fee to the node that indicated that proofs were missing, and some of it is reserved for repairs. An empty slot can be filled again once another host submits a correct proof together with collateral. Payouts for the time interval that a slot is empty are burned.
Payouts for all hosts are accumulated in the smart contract and paid out at Request end. This is to ensure that the incentive posed by the collateral is not diminished over time.
Contract lifecycle
A Request starts when all slots are filled. Regular storage proofs will be required from the hosts that filled the slots.
Some Requests may not attract the required amount of hosts, for instance because the payment is insufficient or the storage demands on the network are too high. To ensure that such Requests end, we add a timeout to the Request. If the Request failed to attract sufficient hosts before the timeout is reached, it is considered cancelled, and the hosts that filled any of the slots are able to withdraw their collateral. They are also paid for the time interval before the timeout. The client is able to withdraw the rest of the tokens in the Request.
A Request ends when the money that was paid upfront runs out. The end time can be calculated from the amount of tokens that are paid out per second. Note that in our scheme this amount does not change during the lifetime of the Request, even when proofs are missed and repair happens. This is a desirable property for hosts; they can be sure of a steady source of income, and a predetermined Request length. When a Request ends, the hosts may withdraw their collateral.
When too many hosts fail to submit storage proofs, and no other hosts take over the slots that they vacate, then the content can be considered lost. The Request is considered failed. The collateral of every host in the Request is burned as an additional incentive for the network hosts to avoid this scenario. The client is able to retrieve any funds that are left in the Request.
|
| create
|
v
----------- timeout -------------
| new | ------------------> | cancelled |
----------- -------------
|
| all slots filled
|
v
----------- too many losses ----------
| started | -------------------> | failed |
----------- ----------
|
| money runs out
|
v
------------
| finished |
------------
Repairs
When a slot is freed because of missing too many storage proofs, some collateral from the host that previously filled the slot is used as an incentive to repair the lost content. Repair typically involves downloading other parts of the content and using erasure coding to restore the missing parts. To incentive other nodes to do this repair, there is repair fee. It is a partial amount of the original host's collateral. The size of the reward is a fraction of slot's collateral where the fraction is parameter of the smart contract.
The size of the reward should be chosen carefully. It should not be too low, to incentivize hosts in the network to prioritize repairs over filling new slots in the network. It should also not be too high, to prevent malicious nodes in the network to try to disable hosts in an attempt to collect the reward.
Renewal
When a Request is about to end, and someone in the network wants the Request to continue for longer, then they can post a new Request with the same content identifier.
We've chosen not to allow top-ups of existing Requests with new funds. Even though this has many advantages (it's a very simple way to extend the lifetime of the Request, it allows people to easily chip in to host content, etc.) it has one big disadvantage: hosts no longer know for how long they'll be kept to the Request. When a Request is continuously topped up, they cannot leave the Request without losing their collateral.
Dispersal
Here we propose an an alternative way to select hosts for slots that is a variant of the "first come, first serve" approach that we described earlier. It intends to alleviate these problems:
- a single host can fill all slots in a Request
- a small group of powerful hosts is able to fill most slots in the network
- resources are wasted when many hosts try to fill the same slot
For a client it is beneficial when their content is stored on as many different hosts as possible, to guard against host failures. Should a single host fill all slots in the Request, then the failure of this single host could mean that the content is lost.
On a network level, we also want to avoid that a few large players are able to fill most Request slots, which would mean that the network becomes fairly centralized.
When too many nodes compete for a slot in a Request, and only one is selected, then this leads to wasted resources in the network. Wasted resources ultimately lead to a higher cost of storage.
To alleviate these problems, we introduce a dispersal parameter in the Request. The dispersal parameter allows a client to choose the amount of spreading within the network. When a slot becomes empty then only a small amount of hosts in the network are allowed to fill the slot. Over time, more and more hosts will be allowed to fill a slot. Each slot starts with a different set of allowed hosts.
The speed at which new hosts are included is chosen by the client. When the client choses a high speed, then very quickly every host in the network will be able to fill slots. This increases the chances of a single host to fill all slots in a Request. When the client choses a low speed, then it is more likely that different hosts fill the slots.
We use the Kademlia distance function to indicate which hosts are allowed to fill a slot.
distance between a and b: xor(a, b)
slot start point: hash(nonce || slot number)
allowed distance: elapsed time * dispersal parameter
Each slot has a different start point:
slot 4 slot 0 slot 2 slot 3 slot 1
| | | | |
v v v v v
----·--------·------------------·-------------------·-------------·----
A host is allowed to fill a slot when the distance between its id and the start point is less that the allowed distance.
start point
| Kademlia distance
t=3 t=2 t=1 v
<------(------(------(------·------)------)------)------>
^ ^
| |
this host is this host is
allowed at t=2 allowed at t=3
Note that even though we use the Kademlia distance function, this bears no relation to the DHT. We use the blockchain address of the host, not its peer id.
This dispersal mechanism still requires modeling to check that it meets its goals, and to find the optimal value for the dispersal parameter, given certain network conditions. It is also worth looking into simpler alternatives.
Conclusion
The design that we presented here deviates significantly from the previous marketplace design.
There is no explicit negotiation phase for Requests. Clients are no longer able to choose which hosts will be responsible for keeping the content on the network. This removes the selection step that was required in the old design. Instead, a host presents the network with an opportunity to earn money by storing content. Hosts can decide whether they want to take part in the Request, and if they do they are expected to keep to their part of the deal lest they lose their collateral.
The first hosts that download the content and provide initial storage proofs are awarded slots in the Request. This removes the explicit Request start (and its associated timeout behavior) that was required in the old design. It also adds an incentive to quickly start storing the content while slots are available in the Request.
While the old design required separate negotiations per host, this design ensures that either the single Request starts with all hosts, or is cancelled. This is a significant reduction in the amount of interactions required.
The old design required new negotiations when a host is not able to fulfill its obligations, and a separately designed repair protocol. In this design we managed to include repair incentives and a repair protocol that is nearly identical to Request start.
In the old design we had a single collateral per host that could be used to cover many Requests. Here we decided to include collateral per Request. This is done to simplify collateral handling, but it is not a requirement of the new design. The new design can also be made to work with a single collateral per host.