A marketplace for storage durability
====================================

We present a new design for a storage marketplace that is both simpler and
includes incentives for repair.

Context
-------

Our current storage marketplace is designed around the notion of sending out
requests for storage, waiting for hosts to offer storage, and then choosing a
selection from these hosts to start a storage contract with. It requires
separate contracts for each of these hosts, active participation of the client
during the negotiation phase, and does not yet have any provisions for repairing
storage when hosts fail to deliver on their contracts.

In this document we describe a new design that is simpler, requires fewer
interactions, and has repair incentives built in.

A new design
------------

We propose to create a new type of storage contract, containing a number of
slots. Each of these slots represents an agreement with a storage host to store
a part of the content. When a client wants to store data on the network with
durability guarantees, it posts a storage Request on the blockchain. Hosts that
want to offer storage can fill a slot in the Request.

                                                          --------
                                       ---- fill slot --- | Host |
                                       |                  --------
                                       |
                                       v
                                --------------
    ----------                  |            |                    --------
    | Client | --- request ---> | Blockchain | <--- fill slot --- | Host |
    ----------                  |            |                    --------
                                --------------
                                       ^
                                       |
                                       |                  --------
                                       ---- fill slot --- | Host |
                                                          --------

The Request contains the content identifier, so that hosts can locate
and download the content. It also contains the reward that hosts receive for
storing the data and the collateral that hosts are expected to deposit. It
contains parameters pertaining to storage proofs and erasure coding. And
finally, it contains the number of hosts that are expected to store the content,
including a small number of host losses that can be tolerated.

    Request

      cid                 # content identifier

      reward              # tokens paid per second per filled slot
      collateral          # amount of collateral required per host and slot

      proof probability   # frequency at which proofs are required
      proof parameters    # proof of retrievability parameters
      erasure coding      # erasure coding parameters
      dispersal           # dispersal parameter
      repair reward       # amount of tokens paid for repairs

      hosts               # amount of storage hosts (including loss)
      loss                # number of allowed host losses

      slots               # assigned host slots

      expire              # slots need to be filled before timeout
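
The field list above could be modelled roughly as follows. This is only an illustrative sketch in Python, not the on-chain representation: the field types, the use of `None` for an empty slot, and the helper methods `filled` and `all_filled` are assumptions made for the example, and the proof, erasure coding and dispersal parameters are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Request:
    cid: str                  # content identifier
    reward: int               # tokens paid per second per filled slot
    collateral: int           # collateral required per host and slot
    proof_probability: float  # frequency at which proofs are required
    hosts: int                # number of storage hosts (including loss)
    loss: int                 # number of allowed host losses
    expire: int               # time by which all slots must be filled
    repair_reward: int = 0    # tokens paid for repairs
    # One entry per slot; None marks an empty slot (assumed encoding).
    slots: List[Optional[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.loss < self.hosts:
            raise ValueError("loss must be non-negative and smaller than hosts")
        if not self.slots:
            # Initially all host slots are empty.
            self.slots = [None] * self.hosts

    def filled(self) -> int:
        """Number of slots that currently have a host."""
        return sum(slot is not None for slot in self.slots)

    def all_filled(self) -> bool:
        """True once every slot has a host."""
        return self.filled() == self.hosts
```

A Request created this way starts with `hosts` empty slots, and `all_filled()` only becomes true once every slot has been assigned.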

Slots
-----

Initially all host slots are empty. An empty slot can be filled by anyone by
submitting a correct storage proof together with collateral.

               proof &                             proof &
             collateral   proof        missed    collateral             missed
                  |         |             |           |                    |
                  v         v             v           v                    v
            -------------------------------------------------------------------
    slot:         |///////////////////////|           |////////////////////|
            -------------------------------------------------------------------
                                          |                                |
                                          v                                v
                                     collateral                       collateral
                                        lost                             lost


            ---------------- time ---------------->

The time interval during which a slot is filled by a host determines the host
payout; for every second of the interval a certain amount of tokens is awarded
to the host. Hosts that fill a slot are required to submit frequent proofs of
storage.

When a certain number of proofs is missed, the slot is considered empty again.
The collateral associated with the slot is mostly burned. Some of it is used to
pay a fee to the node that indicated that proofs were missing, and some of it is
reserved for repairs. An empty slot can be filled again once another host
submits a correct proof together with collateral. Payouts for the time interval
that a slot is empty are burned.
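
The accounting described above can be sketched in a few lines of Python. The function names and the use of basis points for the fee and repair fractions are hypothetical choices made for this example; the document does not fix the actual fractions or how they are encoded.

```python
def host_payout(filled_intervals, reward_per_second):
    """Tokens earned for one slot: the reward for every second it was filled.

    filled_intervals is a list of (start, end) timestamps in seconds.
    """
    return sum((end - start) * reward_per_second
               for start, end in filled_intervals)


def burned_payout(filled_intervals, request_start, request_end,
                  reward_per_second):
    """Tokens burned for one slot: the reward for every second it was empty."""
    filled = sum(end - start for start, end in filled_intervals)
    return (request_end - request_start - filled) * reward_per_second


def split_collateral(collateral, fee_bps, repair_bps):
    """Split forfeited collateral into (fee, repair reserve, burned).

    fee_bps and repair_bps are assumed parameters, expressed in basis
    points (1/100 of a percent) to keep the arithmetic in integers.
    """
    fee = collateral * fee_bps // 10_000
    repair = collateral * repair_bps // 10_000
    return fee, repair, collateral - fee - repair
```

For a slot that was filled during seconds 0 to 100 and again from 150 to 300 of a 300 second Request at 2 tokens per second, `host_payout` yields 500 tokens and `burned_payout` yields 100 tokens.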

Payouts for all hosts are accumulated in the smart contract and paid out at
Request end. This is to ensure that the incentive posed by the collateral is not
diminished over time.

Contract lifecycle
------------------

A Request starts when all slots are filled. Regular storage proofs will be
required from the hosts that filled the slots.

Some Requests may not attract the required number of hosts, for instance
because the payment is insufficient or the storage demands on the network are
too high. To ensure that such Requests end, we add a timeout to the Request.
If the Request fails to attract sufficient hosts before the timeout is
reached, it is considered cancelled, and the hosts that filled any of the slots
are able to withdraw their collateral. They are also paid for the time interval
before the timeout. The client is able to withdraw the rest of the tokens in the
Request.

A Request ends when the money that was paid upfront runs out. The end time can
be calculated from the amount of tokens that are paid out per second. Note that
in our scheme this amount does not change during the lifetime of the Request,
even when proofs are missed and repair happens. This is a desirable property
for hosts; they can be sure of a steady source of income, and a predetermined
Request length. When a Request ends, the hosts may withdraw their collateral.
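
As a small worked example of the end time calculation, the sketch below assumes that the tokens paid out per second equal the per-slot reward multiplied by the number of slots, and that time is measured in whole seconds. Both the function name and these assumptions are illustrative only.

```python
def request_end(start_time, funds, reward_per_slot_per_second, slots):
    """Time at which the upfront funds of a Request run out.

    Assumes a constant payout rate of reward * slots tokens per second,
    regardless of whether individual slots are temporarily empty.
    """
    rate = reward_per_slot_per_second * slots
    if rate <= 0:
        raise ValueError("payout rate must be positive")
    return start_time + funds // rate
```

With 36000 tokens paid upfront, a reward of 2 tokens per second per slot and 5 slots, the Request runs for 3600 seconds after it starts.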

When too many hosts fail to submit storage proofs, and no other hosts take over
the slots that they vacate, then the content can be considered lost. The
Request is considered failed. The collateral of every host in the Request is
burned as an additional incentive for the network hosts to avoid this scenario.
The client is able to retrieve any funds that are left in the Request.

         |
         | create
         |
         v
    -----------       timeout       -------------
    |   new   | ------------------> | cancelled |
    -----------                     -------------
         |
         | all slots filled
         |
         v
    -----------   too many losses    ----------
    | started | -------------------> | failed |
    -----------                      ----------
         |
         | money runs out
         |
         v
    ------------
    | finished |
    ------------
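
The lifecycle in the diagram can be written down as a small transition table. The sketch below is one possible encoding in Python; the `State` enum, the event strings and the `step` function are names invented for this example.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    CANCELLED = "cancelled"
    STARTED = "started"
    FAILED = "failed"
    FINISHED = "finished"


# Allowed transitions, keyed by (current state, event).
TRANSITIONS = {
    (State.NEW, "timeout"): State.CANCELLED,
    (State.NEW, "all slots filled"): State.STARTED,
    (State.STARTED, "too many losses"): State.FAILED,
    (State.STARTED, "money runs out"): State.FINISHED,
}


def step(state, event):
    """Return the next state, or raise ValueError for an invalid transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event!r}") from None
```

The states cancelled, failed and finished have no outgoing transitions, so any event applied to them raises an error.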

Repairs
-------

When a slot is freed because of missing too many storage proofs, some
collateral from the host that previously filled the slot is used as an incentive
to repair the lost content. Repair typically involves downloading other parts of
the content and using erasure coding to restore the missing parts. To
incentivize other nodes to do this repair, there is a repair reward. It is paid
out of the original host's collateral: the size of the reward is a fraction of
the slot's collateral, where the fraction is a parameter of the smart contract.

The size of the reward should be chosen carefully. It should not be too low, so
that hosts in the network are incentivized to prioritize repairs over filling
new slots in the network. It should also not be too high, to prevent malicious
nodes in the network from trying to disable hosts in an attempt to collect the
reward.

Renewal
-------

When a Request is about to end, and someone in the network wants the Request
to continue for longer, then they can post a new Request with the same content
identifier.

We've chosen not to allow top-ups of existing Requests with new funds. Even
though this has many advantages (it's a very simple way to extend the lifetime
of the Request, it allows people to easily chip in to host content, etc.) it
has one big disadvantage: hosts no longer know for how long they'll be kept to
the Request. When a Request is continuously topped up, they cannot leave the
Request without losing their collateral.

Dispersal
---------

Here we propose an alternative way to select hosts for slots that is a
variant of the "first come, first served" approach that we described earlier. It
intends to alleviate these problems:

1. a single host can fill all slots in a Request
2. a small group of powerful hosts is able to fill most slots in the network
3. resources are wasted when many hosts try to fill the same slot

For a client it is beneficial when their content is stored on as many different
hosts as possible, to guard against host failures. Should a single host fill all
slots in the Request, then the failure of this single host could mean that the
content is lost.

On a network level, we also want to prevent a few large players from filling
most Request slots, which would mean that the network becomes fairly
centralized.

When too many nodes compete for a slot in a Request, and only one is selected,
then this leads to wasted resources in the network. Wasted resources ultimately
lead to a higher cost of storage.

To alleviate these problems, we introduce a dispersal parameter in the Request.
The dispersal parameter allows a client to choose the amount of
spreading within the network. When a slot becomes empty, only a small number
of hosts in the network are allowed to fill the slot. Over time, more and more
hosts will be allowed to fill a slot. Each slot starts with a different set of
allowed hosts.

The speed at which new hosts are included is chosen by the client. When the
client chooses a high speed, then very quickly every host in the network will be
able to fill slots. This increases the chances of a single host filling all
slots in a Request. When the client chooses a low speed, then it is more likely
that different hosts fill the slots.

We use the Kademlia distance function to indicate which hosts are allowed to
fill a slot.

    distance between a and b:   xor(a, b)
    slot start point:           hash(nonce || slot number)
    allowed distance:           elapsed time * dispersal parameter

Each slot has a different start point:

     slot 4   slot 0             slot 2              slot 3        slot 1
        |        |                  |                   |             |
        v        v                  v                   v             v
    ----·--------·------------------·-------------------·-------------·----

A host is allowed to fill a slot when the distance between its id and the start
point is less than the allowed distance.

                           start point
                                |         Kademlia distance
           t=3    t=2    t=1    v
    <------(------(------(------·------)------)------)------>
                     ^                           ^
                     |                           |
               this host is                this host is
              allowed at t=2              allowed at t=3

Note that even though we use the Kademlia distance function, this bears no
relation to the DHT. We use the blockchain address of the host, not its peer id.
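
The eligibility rule can be sketched as follows. The document does not specify the hash function, the encoding of the slot number, the width of host addresses or the unit of elapsed time, so the choices below (SHA-256, an 8-byte big-endian slot number, 160-bit addresses, seconds) are assumptions made purely for illustration.

```python
import hashlib

ID_BITS = 160  # assumption: host ids are 20-byte blockchain addresses


def slot_start_point(nonce: bytes, slot_number: int) -> int:
    """Start point of a slot: hash(nonce || slot number), truncated to ID_BITS."""
    digest = hashlib.sha256(nonce + slot_number.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")


def distance(a: int, b: int) -> int:
    """Kademlia distance between two ids."""
    return a ^ b


def allowed_distance(elapsed_seconds: int, dispersal: int) -> int:
    """Allowed distance grows linearly with the time a slot has been empty."""
    return elapsed_seconds * dispersal


def may_fill(host_address: int, nonce: bytes, slot_number: int,
             elapsed_seconds: int, dispersal: int) -> bool:
    """True when the host is close enough to the slot's start point."""
    start = slot_start_point(nonce, slot_number)
    return distance(host_address, start) < allowed_distance(elapsed_seconds,
                                                           dispersal)
```

A host whose address is at distance 5 from the start point is rejected while the allowed distance is 5 or less, and accepted as soon as it grows beyond 5. A larger dispersal parameter makes that happen sooner.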

This dispersal mechanism still requires modeling to check that it meets its
goals, and to find the optimal value for the dispersal parameter, given certain
network conditions. It is also worth looking into simpler alternatives.

Conclusion
----------

The design that we presented here deviates significantly from the previous
marketplace design.

There is no explicit negotiation phase for Requests. Clients are no
longer able to choose which hosts will be responsible for keeping the content on
the network. This removes the selection step that was required in the old
design. Instead, a client presents the network with an opportunity to earn money
by storing content. Hosts can decide whether they want to take part in the
Request, and if they do they are expected to keep to their part of the deal
lest they lose their collateral.

The first hosts that download the content and provide initial storage proofs are
awarded slots in the Request. This removes the explicit Request start (and its
associated timeout behavior) that was required in the old design. It also adds
an incentive to quickly start storing the content while slots are available in
the Request.

While the old design required separate negotiations per host, this design
ensures that either the single Request starts with all hosts, or is cancelled.
This is a significant reduction in the number of interactions required.

The old design required new negotiations when a host is not able to fulfill its
obligations, and a separately designed repair protocol. In this design we
managed to include repair incentives and a repair protocol that is nearly
identical to Request start.

In the old design we had a single collateral per host that could be used to
cover many Requests. Here we decided to include collateral per Request. This
is done to simplify collateral handling, but it is not a requirement of the new
design. The new design can also be made to work with a single collateral per
host.