add fourth slot reservations proposal, removing the other 3 (2 documents)

2025-01-10 10:46:16 +00:00 · 2024-05-09 14:59:09 +10:00 · 2024-05-09 14:59:09 +10:00 · 1bfd89bf95
commit 1bfd89bf95
parent 8c20d40336
3 changed files with 225 additions and 857 deletions
--- a/design/preventing-node-and-network-overload.md
+++ b/design/preventing-node-and-network-overload.md
@ -1,575 +0,0 @@
 # Preventing node and network overload during slot filling and slot repair
 When a new storage request is created, slots in the request can be filled by the
 first storage provider (SP) to download the slot data, generate a storage proof,
 then supply the proof and collateral to the onchain contract. This is inherently
 a race and all SPs except the one who "won" will have wasted funds downloading
 data that they ultimately did not need, which may eventually lead to higher
 costs of storage. Additionally, clients will have to serve data requests for all
 the racing SPs downloading data for all the slots in the request. This not only
 could cause issues with nodes failing to handle the request load, but also
 creates unnecessary congestion on the network.
 ## Proposed solution #1: Fill reward, Pay2Play, and random sliding window rate of expansion
 One proposed solution to this problem involves using the [expanding window
 mechanism](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal)
 as the sole means of throttling node request overload and network congestion.
 The expanding window will have a randomized rate of expansion factor to prevent
 waiting and collusion. Additionally, a fill reward incentivizes fast fills and
 disincentivizes waiting. Pay2Play is a collateral that is put up by the client
 to disincentivize bad behavior: withholding data and spamming the network.
 ### Fill reward
 A fill reward will be issued to the SP that fills the slot. The reward will
 offset the cost of the SPs collateral by a little bit. The fill reward must be
 much less than the collateral.
 This reward will decrease exponentially over time, so that it is inversely
 proportional to the expanding window. This means that while the field of
 eligible addresses is small, the fill reward will be high. Over time, the field
 of eligible addresses will increase exponentially as the fill reward decreases
 exponentially. This incentivizes SPs to fill the slot as fast as possible, with
 the lucky few SPs that closest to the source point of the expanding window
 getting a bigger reward.
 The fill reward curve would be a configuration parameter in the smart contract,
 making it a network-level value, that affects all storage requests.
 The source of the fill reward comes from the client's [Pay2Play
 collateral](#pay2play-client-collateral), and therefore fill rewards will not be
 paid until the contract successful starts (all slots are filled).
 There is one caveat: the smallest collateral that a client can require must be
 greater than the maximum fill reward. If this was not the case, an SP that is
 actively filling a slot, that sees a more profitable opportunity with a high
 fill reward (ie SPs address was close to the source of the expanding window),
 would be incentivized to abandon their existing duties and fill the slot in the
 new request. If the collateral for a request is always greater than the largest
 fill reward, then the fill reward will never be an incentive for an SP
 to abandon their existing duties.
 #### Possible fill reward:collateral factor?
 The maximum fill reward must be less than the smallest allowed collateral by
 some factor. This factor is to be defined a configuration parameter in the smart
 contract at the network-level.
 ### Expanding window mechanism
 The expanding window mechanism prevents node and network overload once a slot
 becomes available to be filled (or repaired) by allowing only a very small
 number of SP addresses to fill/repair the slot at the start. Over time, the number
 of eligible SP addresses increases, until eventually all SP addresses in the
 network are eligible.
 The expanding window mechanism starts off with a random source address, defined
 as $hash(nonce || slot number)$ and a distance, $d = 0$, defined as $xor(a, b)$.
 Over time, $d$ increases and $2^d$ addresses will be eligible to participate. At
 time $t_0$, $d == 0$, so only 1 address (the source address) is eligible. At
 time $t_1$, the distance increases to 1, and 2 addresses will be included. At
 time $t_2$ and kademlia distance of 2, there are 4 addresses, etc, until
 eventually the entire address space of SPs participating in the network are
 eligible to fill a slot.
 Because the source address for the expanding window is generated using the
 slot number, that means the source address for each slot will be different. This
 has the added benefit of preventing a single node from filling all slots in
 request.
 The client can set the rate of sliding window expansion by setting the
 [dispersal
 parameter](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal),
 $p_d$ when the storage request is created. Therefore the allowed distance,
 $d_a$, for eligible SP addresses can be defined as:
 $d_a = t_e * p_d$, where $t_e$ is the elapsed time.
 #### Randomizing the rate of expansion
 This presents an issue where if an SP could pre-calculate how long it would take
 for themselves or other SPs to be eligible to fill a slot. For example, if an SP
 gets "lucky" and is close to the source of the expanding window, and they know
 that other SP addresses are not close to the source, they would know there is
 some time before other providers will be eligible to fill the slot. In that
 case, they could potentially wait an amount of time before filling the slot, in
 the hopes that a better opportunity arises. The goal should always be to fill
 slots as fast as possible, so any "waiting" behavior should be discouraged. It
 should be noted that while all SPs in the network should be reachable, their
 addresses may not be known publicly until they fill a slot, when their address
 is stored onchain against a storage request. Therefore, only SPs that have
 filled a slot would be known to other SPs. It is also not possible to know what
 the availabilities of the SP are, meaning that even if that SP is active on the
 network and its address may fall into an expanding window, it may not be able to
 service new requests. Of note, the DHT does not contain SP addresses, only SP
 peer ids, so these values cannot be used in expanding window pre-calculations.
 This means that SP pre-calculation is not an exact science: best guesses can be
 made using the available information at the time.
 A waiting SP is already disincentivized by the exponentially decreasing fill
 reward, however this may not be a large enough reward to create a change in
 behavior. To further mitigate this attack, the SP's pre-calculation can be
 prevented by introducing a random factor into the rate calculation that changes
 for each time/distance increase. The random factor, $r$, will be a factor
 between 0.5 and 1.5, as an example (the range of $r$ should be configurable as a
 network-wide configuration parameter). This means that the expanding window rate
 set by the client in the storage request will be randomly decreased (by up to
 50%) or increased (by up to 150%). The source of randomness for this could be
 the block hash, which means the frequency in which $r$ changes would be
 equivalent to the frequency in which new blocks are produced on chain.
 Therefore, the allowed distance for eligible SPs can then be defined as:
 $d_a = t_e * p_d * r$, where $r \in [0.5, 1.5]$
 The range of $r$ should be a smart contract configuration parameter, making it a
 network-level value, that affects all storage requests.
 $r$ should be unique for each slot to prevent SPs from knowing when all slots of
 a request will be open for it to attempt to fill all slots of a request.
 #### Expanding window trade-offs
 The main trade-off for the expanding window is that after a certain time, many
 SP addresses will be eligible to fill a slot if it was not already filled. This
 will allow a race between all eligible SP addresses, which can overload the
 capacity of the client and cause network congestion. This can be further
 exacerbated by a client choosing a large dispersal parameter in their request.
 However, the decreasing fill reward should incentivize SPs to fill slots as soon
 as possible. This may discourage too many addresses being allowed to fill a
 slot, but it will not prevent it entirely. A situation could occur where SPs
 close to the source of the sliding window do not have availability or the
 request parameters are not within the bounds of SP availabilities, then
 inevitably, the slot will open up to more addresses.
 Additionally, the rate of sliding window expansion that is set by the client can
 be bounded in such a way that expansion does not accelerate too quickly.
 #### Expanding window time interval
 The time interval duration, $t_i$, of the expanding window mechanism should be a
 function of the request expiry and the maximum value of $d_a$, so that the
 maximum value of $d_a$ occurs at the time interval just before expiry. This
 means that only the very last $t_i$ will have all addresses in the network
 available, thus encouraging data distribution, but mainly discouraging too many
 addresses from being eligible too quickly. Because each time interval is an
 increase in kademlia distance, each time interval will add twice the number of
 eligible addresses as the time interval before it. In other words, the number of
 eligible addresses is cumulative: each increase in time interval / distance will
 add double the eligible addresses that were added in the previous time interval.
 Because $d_a$ is a hash, we can assume 256 bits, and therefore the maximum value
 of $d$ (kademlia distance) is `256`.
 ```
 ------.------.------.------.------.------> t_elapsed
                    ^      ^      ^
                    |   d_a=256   |
                    |  All addrs  |
                    |  eligible   |
                    |             |
                 d_a=255        expiry
                half addrs
                 eligible
 ```
 $d_a = t_i*i + 1$
 $t_e = t_i * i$
 $t_i = d_max - 1 / expiry$, eg $t_i = 255/expiry$
 The caveat here is that expiry must always be greater than or equal to 255.
 ### Pay2Play: client collateral
 Pay2Play is an *additional* collateral that is deposited by the client when
 creating a request for storage. It is burned if a contract does not start (all
 slots were not filled before expiry). If the contract starts (all slots are
 filled), it is used to pay for the fill reward, with the difference being
 returned once the contract starts.
 The sole purpose of the collateral is to disincentivize bad behavior, namely,
 initially withholding data and spam prevention. This collateral is "pay to play"
 because the client needs to deposit additional collateral just to submit a
 request for storage. They are taking the risk that their request will not
 complete successfully, however this collateral is necessary for mitigation of
 certain attacks.
 When a client submits a request for storage, SPs that fill the slots download
 the data from the client. The client could withhold this data to disrupt the
 network and create opportunity loss for SPs that are attempting to download the
 data. If this happens, then the slot would not be filled and the client would
 lose its Pay2Play collateral.
 A client could also spam the network by submitting many requests for storage
 that have conditions that would likely not be met by any SP's availability.
 Ultimately, SPs would not waste resources acting on these requests, however the
 requests still need to be processed by the SP, which could, if numerous enough,
 cause unknown issues overloading the SP with requests, unknown issues
 overloading the network, and create opportunity loss associated with processing
 a high volume of untenable requests. While the transaction cost alone could be a
 deterrent for this type of behavior, once deployed on an L2, spamming the
 network in this way would be achievable. The Pay2Play collateral adds to the
 expense of such spam and would further disincentivize this behavior.
 #### How much should the Pay2Play collateral be?
 The Pay2Play collateral should be set to the value of the maximum fill reward
 multiplied by the number of slots, because if the contract starts in optimal
 conditions, where slots are filled straight away, the collateral would need to
 pay for the maximum fill reward for each slot. In reality, slots will not fill
 at optimal conditions, and so there will be some collateral leftover to return
 to the client once the contract starts successfully.
 #### Pay2Play collateral trade-offs
 The trade-off, however, is that there is an increased barrier to entry due to
 the requirement of clients to provide an additional collateral.
 ### Pay2Play alternate solution: client fee for cancelled contracts
 Pay2Play is a fee that is burned if a contract does not start (all slots were
 not filled before expiry). The source of the burned funds comes from the total
 value put up by the client to pay for SPs to store their data. However, if the
 contract is successfully completed, the fee is not burned and goes toward
 payment of the SP/validator rewards.
 The main downside to this is that there is no source of funding for the fill
 reward.
 #### Pay2Play fee as a value
 The Pay2Play fee should not be a percentage of total request reward, because in
 a situation where a client has created a very cheap request, a percentage of the
 total reward (which is small) would be very small. This would make the fee much
 less of an incentive for good behavior. Instead, the fee should be a
 configuration parameter set at the network level, as a value, not a percentage.
 There is a caveat to the Pay2Play fee as a value, however: because the source of
 the funds is the total value of the storage request, the total price of a
 storage contract cannot exceed the fee value. This could potentially limit how
 short in duration requests can be, depending on the value of the fee that is
 set. There is an alternative solution that avoids this caveat: client
 collateral.
 ### Pay2Play trade-off
 However, there is one downside to the Pay2Play collateral/fee. If a client is
 behaving correctly, but submits a request for storage that is out of the range
 of acceptable market conditions, then the request will not be started, and the
 client will lose their collateral.
 ### Solution #1 attacks
 Name               | Attack description
 :------------------|:-----------------------------------------------------------------
 Clever SP          | SP drops a filled slot when a better opportunity presents itself
 Lazy SP            | SP waits to fill a slot to see if better opportunities will arise
 Censoring SP       | SP withholds specific CIDs that it tries to censor
 Greedy SP          | SP tries to fill multiple slots in a request
 Sticky SP          | SP withholds data from other SPs to win re-posted contract
 Withholding client | client withholds data after creating a request
 Controlling client | like a censoring client, but for specific SPs on the network
 Spammy client      | client spams the network with untenable storage requests
 #### Clever SP attack
 The "clever SP attack" is when an SP has already filled a slot, but finds a more
 profitable opportunity in a different request, so the existing slot is abandoned
 in order to clear up storage space for the more profitable slot. This attack is
 mitigated by several factors. Firstly, payouts for SP duties performed (proofs)
 only happen upon contract completion, meaning if an SP abandoned a slot, they'd
 be giving up payment for duties already performed and resources already spent.
 Secondly, the collateral provided for the slot fill would also be burnt.
 Thirdly, the fill reward for the new slot fill must be much less than the
 minimum collateral required for the new slot, so even if the SP's address is
 close to the source of the expanding window, the fill reward will not be greater
 than the collateral being given up. All of these factors together would make it
 very difficult to pull off an attack without losing money.
 #### Lazy SP attack
 In this attack, an SP's address is eligible to fill a slot, but the SP waits to
 fill the slot hoping to come across a more profitable opportunity. This attack
 is partially mitigated by the additional fill reward provided to fill the slot
 early (if their address is close to the source). In addition, SP will not be
 able to pre-calculate when other SPs in the network will become eligible due to
 the randomized rate of expansion in the expanding window mechanism. This means
 that as soon as the window expands, other SPs will be eligible to fill the slot
 first.
 #### Censoring SP attack
 The "censoring SP attack" is when an SP attempts to withhold providing specific
 CIDs from the network in an attempt to censor certain content. In this case, the
 dataset can be reconstructed from K chunks (provided by other SPs) allowing the
 censored CID to be accessed. An SP could also prevent slots from being filled
 (eg in the case of repair) while waiting for the expanding window to open up its
 address. In this case, the SP would need to control `M + 1` chunks to prevent data
 reconstruction by other nodes in the network. The fill reward and expanding
 window mechanism seek to prevent SPs from filling multiple slots in the same
 request.
 #### Greedy SP attack
 The "greedy SP attack" mitigation is the same as the censoring SP attack where
 `M + 1` chunks would need to be filled before this becomes an issue, which is
 already mitigated by the expanding windows and fill reward. However, this is
 only discouraged by not impossible if a controlling entity sets up a sybil
 attack with many highly distributed nodes.
 #### Sticky SP attack
 The "sticky SP attack" is created once a new storage request for the same CID
 already being stored by the SP is posted (this is how contract renewal works).
 SP withholds data from all other SPs until the expanding window allows their
 address, then they quickly fill the slot (they are quick because they don't need
 to download the data). This should not be a problem, as the distribution of data
 around the network for the previous request should be sufficient to be repeated.
 And this is only for one slot. If all SPs in the request did the same, the
 distribution for a successfully executed contract would still remain the same.
 This should not occur for requests with new CIDs as SPs will not already have
 the slot data downloaded.
 #### Withholding client attack
 In this attack, a client might want to disrupt the network by creating requests
 for storage but never releasing the data to SPs attempting to fill the slot.
 The Pay2Play fee/collateral mitigates this as it would become prohibitively
 expensive.
 #### Controlling client attack
 The "controlling client attack" is when a client withholds data from specific
 SPs after creating a storage request. This attack would cause some disruption
 for the SP as it attempts to download the data for the slot, and creates
 opportunity loss as it could be spending the time and resources filling other
 slots. The client, however, cannot do this to too many SPs, or else the slots
 will not get filled and they will lose their Pay2Play fee/collateral.
 #### Spammy client attack
 In this attack, a client seeks to disrupt the network by creating many untenable
 requests for storage. The Pay2Play fee/collateral mitigates this as it would
 become prohibitively expensive on a large scale.
 ## Proposed solution #2: slot reservations aka the "bloem method"
 Competition between hosts to fill slots has some advantages, such as providing
 an incentive for hosts to become proficient in downloading content and
 generating proofs. It also has some drawbacks, for instance it can lead to
 network inefficiencies because multiple hosts do the work of downloading and
 proving, while only one host is rewarded for it. These inefficiencies lead to
 higher costs for hosts, which leads to an overall increase in the price of
 storage on the network. It can also lead to clients inadvertently inviting too
 much network traffic to themselves. Should they for instance post a very
 lucrative storage request, then this invites a lot of hosts to start downloading
 the content from the client simultaneously, not unlike a DDOS attack.
 Slot reservations are a means to avoid these inefficiencies. Before downloading
 the content associated with a slot, a limited number of hosts can reserve the
 slot. Only hosts that have reserved the slot can fill the slot. After the host
 downloads the content and calculates a proof, it can move the slot from its
 reserved state into the filled state by providing collateral and the storage
 proof. Then it begins to periodically provide storage proofs and accrue payments
 for the slot.
 ```
         reserve         proof & collateral
            |                  |
            v                  v
            ---------------------------------------------
     slot:  |/ / / / / / / / / |/////////////////////////
            ---------------------------------------------
            |                  |
            v                  v
          slot                slot
         reserved            filled
            ---------------- time ---------------->
 ```
 Reserving a slot requires some collateral, so there is an initial race for SPs
 who can can deposit collateral first to secure a reservation, then a second race
 amongst the SPs with a reservation to fill the slot (with collateral and the
 generated proof). However, not all SPs in the network can reserve a slot
 initially: the [expanding window
 mechanism](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal)
 dictates which SPs are eligible to reserve the slot. As time progresses for an
 unreserved slot (or a slot with less than $R$ reservations), more SPs will be
 allowed to reserve the slot, until eventually any SP in the network can reserve
 the slot. This ensures fair participation opportunity across SPs in the network.
 ### Reservation collateral and reward
 Reservation collateral is paid to reserve a slot. This will decreases over time
 to incentivize participation in aging slots.
 Reservation reward is paid to the SP who reserves and eventually fills the slot.
 This increases over time, to incentivize participation in aging slots.
 [TODO: INSERT GRAPH OF RESERVATION COLLATERAL AND REWARD OVER TIME]
 ### Reservations expiry
 Reservations can expiry after some time to prevent opportunity loss for other
 SPs willing to participate.
 ### Reservation retries
 After expiry, an SP can retry a slot reservation, but if it was the last SP to
 reserve the slot, it can only retry once. In other words, SPs can reserve the
 same slot only twice in a row.
 ### Reservations per slot
 Each slot is allowed to have three reservations, which effectively limits the
 quantity of racing to three SPs.
 ### Expanding windows per slot
 Each slot will have three reservations per slot, and each reservation will have
 its own expanding window (these windows will have a unique starting point). The
 purpose of this is to distribute the reservation potentials
 ### Solution #2 attacks
 Name         | Attack description
 :------------|:--------------------------------------------------------------
 Clever SP    | SP drops reservation when a better opportunity presents itself
 Lazy SP      | SP reserves a slot, but doesn't fill it
 Censoring SP | acts like a lazy SP for specific CIDs that it tries to censor
 Hooligan SP  | acts like a lazy SP for many request to damage to the network
 Greedy SP    | SP tries to fill multiple slots in a request
 Lazy client  | client doesn't release content on the network
 #### Clever SP attack
 In this attack, an SP could reserve a slot, then if a better opportunity comes
 along, forfeit the reservation collateral by reserving and filling another slot,
 with the idea that the reward earned in the new opportunity would make the
 reservation collateral loss from the original slot worthwhile.
 This attack is mitigated by allowing for multiple reservations per slot. All SPs
 that have secured a reservation (capped at three) will race to fill the slot.
 Thus, if one or more SPs that have reserved the slot decide to pursue other
 opportunities, the other SPs that have reserved the slot will still be able to
 fill the slot.
 In addition, the expanding window mechanism allows for more slots
 to participate (reserve/fill) as time progresses, so there will be a larger pool
 of SPs that could potentially fill the slot.
 The increasing reservation reward over time also incentivizes late comers to
 reserve slots, so if there are initially SPs that reserve slots but fail to fill
 the slot due to better opportunities elsewhere, other SPs will be incentivized
 to participate.
 After some time, the slot reservation of the attacking SP will expire, and other
 SPs wills will be allowed to reserve and fill the slot. As time will have passed
 in this scenario, increased reservation rewards and decreased collateral
 requirements will incentivize other SPs to participate.
 #### Lazy SP attack
 The "lazy SP attack" is when an SP reserves a slot, but does not fill it. The
 vector is very similar to the "clever SP attack". The slot reservations
 mechanism mitigates this attack in the same ways, please see the "Clever SP
 attack" section above.
 #### Censoring SP attack
 A "censoring SP attack" is performed by an SP that wants to disrupt storage of
 particular CIDs by reserving a slot and then not filling it.
 Mitigation of this attack is exactly the same as the "lazy SP attack".
 #### Hooligan SP attack
 In this attack, an SP would attempt to disrupt the network by reserving and
 failing to fill random slots in the network
 #### Greedy SP attack
 A "greedy SP attack" is when one SP tries to fill more than M slots (and up to K
 slots) of a request in an attempt to control whether or not the contract
 fails. In the case of M slots controlled, the attacker could cause the contract
 to fail and the client would get only funds not already spent on proof provision
 back. All SPs in the contract would forfeit their collateral in this case,
 however, so this attack does have a significant cost associated.
 In the case of K slots, the SPs could withhold data from the network, and if no
 other SPs or caching nodes hold this data, could prevent retrieval and repair of
 the data.
 This particular attack is difficult to mitigate because there is a sybil
 component to it where an entity could control many nodes in the network but all
 those nodes could collude on the attack.
 At this time, slot reservations does not mitigate against this attack, nor does
 it incentivize behavior that would prevent it.
 #### Lazy client attack
 This attack happens when a client creates a request for storage, but ultimately
 does not release the data to the network when it requested. SPs may reserve the slot,
 with collateral, and yet would never be able to fill the slot as they cannot
 download the data. The result of this attack is that any SPs who reserve the
 slot may lose their collateral.
 At this time, slot reservations does not mitigate against this attack, nor does
 it incentivize behavior that would prevent it.
 ### Open questions
 #### Reservation collateral goes to 0 at time of contract expiry?
 #### Fill race to start at the same time?
 When SPs reserve a slot, the first three SPs can race to fill the slot. However,
 if there is no point in time at which all three SPs can start the race, then it
 would allow any SP that secures a reservation to immediately start downloading
 and filling the slot.
 Perso note: no, want to fill slots as fast as possible.
 ### Trade offs
 The main advantage to this design is that nodes and the network would not be overloaded at
 the outset of slots being available for SP participation.
 The downside of this proposal is that an SP would have to participate in two
 races: one for reserving the slot and another for filling the slot once
 reserved, which brings additional complexities in the smart contract.
 Additionally, there are additional complexities introduced with the reservation
 collateral and reward "dutch auctions" that change over time. It remain unclear
 if the additional complexity in the smart contracts for benefits that may not be
 substantially greater than having the sliding mechanism window on its own.
 In addition, there are two attack vectors, the "greedy SP attack" and the "lazy
 client attack" that are not well covered in the slot reservation design. There
 could be even more complexities added to the design to accommodate these two
 attacks (see the other proposed solution for the mitigation of these attacks).
 // THOUGHTS: Perhaps the expanding window mechanism should be network-aware such
 that there are always a minimum of 2 SPs in a window at a given time, to
 encourage competition. If not, an SP could watch the network, and given a new
 opportunity to fill a slot, understand that it could wait some time before
 reserving the slot knowing that it is the only SP in the allowed address space
 in the expanding window for some time.
 The slot reservation mechanism should be able to mitigate client and SP attacks
 on the network.
 Allowing slots to be claimed by a storage provider (SP) offering both collateral and proof of
 storage creates a race condition where the fastest SP to provide these "wins"
 the slot. . In other words, a host would need to download the data before it's
 allowed to fill a slot.
 This can be problematic. Because hosts are racing to fill a slot, by all downloading the data, they could overwhelm the capacity of the client that is offering the data to the network. Bandwidth incentives and [dispersal](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal) could alleviate this, but this is far from certain.
 This is why it's useful to investigate an alternative: allow a host to claim a slot before it starts downloading the data. A host can claim a slot by providing collateral. This changes the dynamics quite a bit; hosts are now racing to put down collateral, and it's not immediately clear what should happen when a contract fails. The incentives need to be carefully changed to avoid unwanted behavior.
--- a/design/slot-reservations-simplified.md
+++ b/design/slot-reservations-simplified.md
@ -1,282 +0,0 @@
 # Preventing node and network overload during slot filling and slot repair
 When a new storage request is created, slots in the request can be filled by the
 first storage provider (SP) to download the slot data, generate a storage proof,
 then supply the proof and collateral to the onchain contract. This is inherently
 a race and all SPs except the one who "won" will have wasted funds downloading
 data that they ultimately did not need, which may eventually lead to higher
 costs of storage. Additionally, clients will have to serve data requests for all
 the racing SPs downloading data for all the slots in the request. This not only
 could cause issues with nodes failing to handle the request load, but also
 creates unnecessary congestion on the network.
 ## Proposed solution: Slot reservations aka the "bloem method"
 Competition between hosts to fill slots has some advantages, such as providing
 an incentive for hosts to become proficient in downloading content and
 generating proofs. It also has some drawbacks, for instance it can lead to
 network inefficiencies because multiple hosts do the work of downloading and
 proving, while only one host is rewarded for it. These inefficiencies lead to
 higher costs for hosts, which leads to an overall increase in the price of
 storage on the network. It can also lead to clients inadvertently inviting too
 much network traffic to themselves. Should they for instance post a very
 lucrative storage request, then this invites a lot of hosts to start downloading
 the content from the client simultaneously, not unlike a DDOS attack.
 Slot reservations are a means to avoid these inefficiencies. Before downloading
 the content associated with a slot, a limited number of hosts can reserve the
 slot. Only hosts that have reserved the slot can fill the slot. After the host
 downloads the content and calculates a proof, it can move the slot from its
 reserved state into the filled state by providing collateral and the storage
 proof. Then it begins to periodically provide storage proofs and accrue payments
 for the slot.
 ```
         reserve         proof & collateral
            |                  |
            v                  v
            ---------------------------------------------
     slot:  |/ / / / / / / / / |/////////////////////////
            ---------------------------------------------
            |                  |
            v                  v
          slot                slot
         reserved            filled
            ---------------- time ---------------->
 ```
 Reserving a slot requires some collateral, so there is an initial race for SPs
 who can can deposit collateral first to secure a reservation, then a second race
 amongst the SPs with a reservation to fill the slot (with collateral and the
 generated proof). However, not all SPs in the network can reserve a slot
 initially: the [expanding window
 mechanism](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal)
 dictates which SPs are eligible to reserve the slot. As time progresses for an
 unreserved slot (or a slot with less than $R$ reservations), more SPs will be
 allowed to reserve the slot, until eventually any SP in the network can reserve
 the slot. This ensures fair participation opportunity across SPs in the network.
 Additionally, the SP that fills the slot will be rewarded with a fill reward
 that decreases linearly from the time the slot is available to fill to the
 request expiry.
 ### Expanding window mechanism
 The expanding window mechanism prevents node and network overload once a slot
 becomes available to be filled (or repaired) by allowing only a very small
 number of SP addresses to fill/repair the slot at the start. Over time, the
 number of eligible SP addresses increases, until eventually all SP addresses in
 the network are eligible.
 The [expanding window
 mechanism](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md)
 starts off with a random source address, defined as $hash(block number, request
 id, slot index)$ and a distance defined as $XOR(A, A_0)$. Over time, $t_i$, the
 allowed distance [can be defined as $2^{256} *
 F(t_i)$](https://hackmd.io/@bkomuves/BkDXRJ-fC). As this value gradually
 increases, only addresses that have less of a distance than this value will be
 eligible to participate. In total, eligible addresses are those that satisfy:
 $XOR(A, A_0) < 2^{256} * F(t_i)$
 Because the source address for the expanding window is generated using the slot
 number, that means the source address for each slot will be different. Note that
 the reservation index is not included, meaning that a single node could
 potentially fill all slots in a request. The reason this was done was to
 simplify the expanding window design. The reservation index could be added in if
 necessary.
 The client can set the rate of expansion by defining the [parameter
 $h$](https://hackmd.io/@bkomuves/BkDXRJ-fC#Parametrizing-the-speed-of-expansion).
 Changing the value of $h$ will [affect the curve of the rate of
 expansion](https://www.desmos.com/calculator/pjas1m1472) (interactive graph).
 ### Fill reward
 A fill reward will be issued to the SP that fills the slot. The client will deposit
 an additional fee when creating a request for storage to cover the maximum fill
 reward for all slots. Any difference in fill reward paid versus fill reward
 deposited will be returned to the client after the request is completed
 (including failed and cancelled requests).
 This reward will decrease linearly over time, starting with the maximum value
 at the time the slot is available to fill, and decreasing to zero at the request
 expiry. This incentivizes SPs to fill the slot as fast as possible, with
 the lucky few SPs that closest to the source point of the expanding window
 getting a bigger reward.
 The fill reward maximum value is specified by the client in the request for
 storage.
 #### Fill reward versus request collateral
 There is one caveat to the fill reward: if the fill reward is larger than the
 required collateral in an active request, an SP that is actively filling a slot
 will see a more profitable opportunity with a high fill reward (assuming SP's
 address was close to the source of the expanding window), and would be
 incentivized to abandon their existing duties and fill the slot in the new
 request.
 There are two ways to approach this issue. The first approach is to set bounds
 in the protocol restricting the minimum collateral of new storage requests to be
 greater than the average fill reward in all active requests, increased by a
 percentage (specified at the network-level). The average fill reward at time of
 slot fill would need to be persisted in the contract to calculate what the next
 available minimum collateral limit would be. New slot fills would append to the
 average and completed contracts to detract from this persisted value. This
 method was inspired by the way the base gas fee is calculated in
 [EIP-1559](https://consensys.io/blog/what-is-eip-1559-how-will-it-change-ethereum).
 If the fill reward is continually getting
 The second approach does not set any protocol bounds, allowing any request
 collateral and fill reward for new storage requests. This approach may
 potentially be harmful to the health of existing storage requests if the fill
 reward is higher compared to the collateral of active storage requests. The lack
 of disturbance in market dynamics may be enough for this behavior to be
 acceptable. Clients that set a high fill reward should likely also set a high
 collateral so that the same does not happen to their storage requests. The high
 collateral may be a deterrent to SPs filling the slots, and other aspects of
 their request should be sufficient to attract SPs. In other words, normal market
 behaviors will determine what the values should be. Codex's UI available to
 clients should help guide them when making decisions on their storage request
 parameters.
 Without empirical data on the real world behaviors of SPs, the types of
 behaviors to guard against may be purely speculative and not worth the
 complexity impact on the protocol design. In that regard, perhaps moving forward
 with the second approach is the right choice, and then moving to the first
 approach if real world SP behavior warrants its implementation.
 ### No reservation collateral and reward
 In this simplified slot reservations proposal, there will not be reservation
 collateral and reward requirements until the behavior in a live environment can
 be observed and it is determined this are necessary mechanisms.
 ### No reservations expiry and retries
 As a difference from the originally proposed slot reservations, there will be no
 reservation expiry and no reservation retries until actual behavior on the
 network is observed and it is determined this is a needed mechanism.
 ### Reservations per slot
 Each slot is allowed to have three reservations, which effectively limits the
 quantity of racing to three SPs.
 ### Expanding windows per slot
 The slot will have one expanding window of eligible SP addresses that can
 reserve the slot. This expanding window is shared across all three reservations
 in the slot. This is different to the originally proposed slot reservations,
 which had a unique expanding window per reservation.
 ### Solution #2 attacks
 Name         | Attack description
 :------------|:--------------------------------------------------------------
 Clever SP    | SP drops reservation when a better opportunity presents itself
 Lazy SP      | SP reserves a slot, but doesn't fill it
 Censoring SP | acts like a lazy SP for specific CIDs that it tries to censor
 Hooligan SP  | acts like a lazy SP for many request to damage to the network
 Greedy SP    | SP tries to fill multiple slots in a request
 Lazy client  | client doesn't release content on the network
 #### Clever SP attack
 In this attack, an SP could reserve a slot, then if a better opportunity comes
 along, forfeit the reservation by reserving and filling another slot,
 with the idea that the reward earned in the new opportunity would make the
 reservation collateral loss from the original slot worthwhile.
 This attack is mitigated by allowing for multiple reservations per slot. All SPs
 that have secured a reservation (capped at three) will race to fill the slot.
 Thus, if one or more SPs that have reserved the slot decide to pursue other
 opportunities, the other SPs that have reserved the slot will still be able to
 fill the slot.
 In addition, the expanding window mechanism allows for more slots
 to participate (reserve/fill) as time progresses, so there will be a larger pool
 of SPs that could potentially fill the slot.
 There is also a decreasing fill reward that incentivizes the SP to fill the
 slot as fast as possible to gain the most reward. By waiting to see if there are
 better opportunities that arise, the SP will miss out on a larger fill reward.
 #### Lazy SP attack
 The "lazy SP attack" is when an SP reserves a slot, but does not fill it. The
 vector is very similar to the "clever SP attack". The slot reservations
 mechanism mitigates this attack in the same ways, please see the "Clever SP
 attack" section above.
 #### Censoring SP attack
 A "censoring SP attack" is performed by an SP that wants to disrupt storage of
 particular CIDs by reserving a slot and then not filling it.
 Mitigation of this attack is exactly the same as the "lazy SP attack".
 #### Hooligan SP attack
 In this attack, an SP would attempt to disrupt the network by reserving and
 failing to fill random slots in the network
 #### Greedy SP attack
 A "greedy SP attack" is when one SP tries to fill more than M slots (and up to K
 slots) of a request in an attempt to control whether or not the contract
 fails. In the case of M slots controlled, the attacker could cause the contract
 to fail and the client would get only funds not already spent on proof provision
 back. All SPs in the contract would forfeit their collateral in this case,
 however, so this attack does have a significant cost associated.
 In the case of K slots, the SPs could withhold data from the network, and if no
 other SPs or caching nodes hold this data, could prevent retrieval and repair of
 the data.
 This particular attack is difficult to mitigate because there is a sybil
 component to it where an entity could control many nodes in the network but all
 those nodes could collude on the attack.
 At this time, slot reservations does not mitigate against this attack, nor does
 it incentivize behavior that would prevent it, however the large cost associated
 with this attack is a natural deterrent and is less probably to occur.
 #### Lazy client attack
 This attack happens when a client creates a request for storage, but ultimately
 does not release the data to the network when it requested. SPs may reserve the
 slot, with collateral, and yet would never be able to fill the slot as they
 cannot download the data. The result of this attack is that any SPs who reserve
 the slot may lose their collateral.
 At this time, slot reservations does not mitigate against this attack, nor does
 it incentivize behavior that would prevent it.
 ### Open questions
 Perhaps the expanding window mechanism should be network-aware such
 that there are always a minimum of two SPs in a window at a given time, to
 encourage competition. The downside of this is that active SPs need to be
 persisted and tracked in the contract, with larger transaction costs resulting
 from this.
 ### Trade offs
 The main advantage to this design is that nodes and the network would not be
 overloaded at the outset of slots being available for SP participation.
 The downside of this proposal is that an SP would have to participate in two
 races: one for reserving the slot and another for filling the slot once
 reserved, which brings additional complexities in the smart contract.
 Additionally, there are additional complexities introduced with the reservation
 collateral and reward "dutch auctions" that change over time. It remain unclear
 if the additional complexity in the smart contracts for benefits that may not be
 substantially greater than having the sliding mechanism window on its own.
 In addition, there are two attack vectors, the "greedy SP attack" and the "lazy
 client attack" that are not well covered in the slot reservation design. There
 could be even more complexities added to the design to accommodate these two
 attacks (see the other proposed solution for the mitigation of these attacks).
--- a/design/slot-reservations.md
+++ b/design/slot-reservations.md
@ -0,0 +1,225 @@
 # Slot reservations
 Competition between hosts to fill slots has some advantages, such as providing
 an incentive for hosts to become proficient in downloading content and
 generating proofs. It also has some drawbacks, for instance it can lead to
 network inefficiencies because multiple hosts do the work of downloading and
 proving, while only one host is rewarded for it. These inefficiencies lead to
 higher costs for hosts, which leads to an overall increase in the price of
 storage on the network. It can also lead to clients inadvertently inviting too
 much network traffic to themselves. Should they for instance post a very
 lucrative storage request, then this invites a lot of hosts to start downloading
 the content from the client simultaneously, not unlike a DDOS attack.
 Slot reservations are a means to avoid these inefficiencies by only allowing SPs
 who have secured a slot reservation to fill the slot. Furthermore, slots can
 only be reserved by eligible SPs, governed by a window of eligible addresses
 that starts small and grows larger over time, eventually encompassing the entire
 address space on the network.
 ## Proposed solution: slot reservations
 Before downloading the content associated with a slot, a limited number of hosts
 can reserve the slot. Only hosts that have reserved the slot can fill the slot.
 After the host downloads the content and calculates a proof, it can move the
 slot from its reserved state into the filled state by providing collateral and
 the storage proof. Then it begins to periodically provide storage proofs and
 accrue payments for the slot.
 ```
         reserve         proof & collateral
            |                  |
            v                  v
            ---------------------------------------------
     slot:  |/ / / / / / / / / |/////////////////////////
            ---------------------------------------------
            |                  |
            v                  v
          slot                slot
         reserved            filled
            ---------------- time ---------------->
 ```
 There is an initial race for eligible SPs who are first to secure a reservation, then a
 second race amongst the SPs with a reservation to fill the slot (with collateral
 and the generated proof). However, not all SPs in the network can reserve a slot
 initially: the [expanding window
 mechanism](https://github.com/status-im/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal)
 dictates which SPs are eligible to reserve the slot.
 ### Expanding window mechanism
 The expanding window mechanism prevents node and network overload once a slot
 becomes available to be filled (or repaired) by allowing only a very small
 number of SP addresses to fill/repair the slot at the start. Over time, the
 number of eligible SP addresses increases, until eventually all SP addresses in
 the network are eligible.
 The expanding window mechanismstarts off with a random source address, defined
 as $hash(block hash, request id, slot index, reservationindex)$. The distance
 between each SP address and the source address can be defined as $XOR(A, A_0)$
 (kademlia distance). Once the allowed distance is greater than SP's distance,
 the SP is considered eligible to reserve a slot. The allowed distance for
 eligible addresses over time $t_i$ can be [defined as $2^{256} *
 F(t_i)$](https://hackmd.io/@bkomuves/BkDXRJ-fC). As this allowed distance value
 increases along a curve, more and more addresses will be eligible to participate
 in reserving that slot. In total, eligible addresses are those that satisfy:
 $XOR(A, A_0) < 2^{256} * F(t_i)$
 Because the source address for the expanding window is generated using the slot
 index and the reservation index, that means the source address for each
 reservation in each slot will be different.
 Furthermore, the client can set the rate of expansion by defining the [parameter
 $h$](https://hackmd.io/@bkomuves/BkDXRJ-fC#Parametrizing-the-speed-of-expansion)
 as the [dispersal parameter](https://github.com/codex-storage/codex-research/blob/ad41558900ff8be91811aa5de355148d8d78404f/design/marketplace.md#dispersal) of the request.
 Changing the value of $h$ will [affect the curve of the rate of
 expansion](https://www.desmos.com/calculator/pjas1m1472) (interactive graph).
 ### Abandoned ideas
 #### No reservation collateral
 Reservation collateral was thought to be able to prevent a situation where an SP
 would reserve a slot then fail to fill it. However, collateral could not be
 burned as it created an attack vector for clients: clients could withhold the
 data and cause SPs to lose their reservation collateral. The reservation
 transaction itself creates a signal of intent from an SP to fill the slot. If
 the SP were to not fill the slot, then other SPs that have reserved the slot
 will fill it.
 #### No reservation reward
 Adding a fill reward for SPs who ultimately fill the slot is not necessary
 because, like the SP rewards for providing proofs, fill rewards would be paid
 once the storage request successfully completes. This would mean that the fill
 reward is effectively the same as an increase in value of the storage request
 payout. Therefore, if a client is so inclined to provide a fill reward, they
 could instead increase the total reward of the storage request.
 In this simplified slot reservations proposal, there will not be reservation
 collateral and reward requirements until the behavior in a live environment can
 be observed and it is determined this are necessary mechanisms.
 ### Slot reservation attacks
 Name         | Attack description
 :------------|:--------------------------------------------------------------
 Clever SP    | SP drops reservation when a better opportunity presents itself
 Lazy SP      | SP reserves a slot, but doesn't fill it
 Censoring SP | acts like a lazy SP for specific CIDs that it tries to censor
 Greedy SP    | SP tries to fill multiple slots in a request
 Sticky SP    | SP tries to fill the same slot in a contract renewal
 Lazy client  | client doesn't release content on the network
 #### Clever SP attack
 In this attack, an SP could fill a slot, and while fulfilling its duties, see
 that a better opportunity has arisen, and abandon its duties in the first slot
 to fill the second slot.
 This attack is mitigated by the SP losing its request collateral for the first
 slot once it is abandoned. Additionally, once the SP fills the first slot, it
 will accrue rewards over time that will not be paid out until the request
 successfully completes. These rewards act as another disincentive for the SP to
 abandon the slot.
 The behavior of SPs filling better opportunities is not necessarily an attack.
 If an SP is fulfilling its duties on a slot and finds a better opportunity
 elsewhere, it should be allowed to do so. The repair mechanisms will allow the
 abandoned slot to be refilled by another SP that deems it profitable.
 #### Lazy SP attack
 In this attack, a SP reserves a slot, but waits to fill the slot hoping a better
 opportunity will arise, in which the reward earned in the new opportunity would
 be greater than the reward earned in the original slot.
 This attack is mitigated by allowing for multiple reservations per slot. All SPs
 that have secured a reservation (capped at three) will race to fill the slot.
 Thus, if one or more SPs that have reserved the slot decide to pursue other
 opportunities, the other SPs that have reserved the slot will still be able to
 fill the slot.
 In addition, the expanding window mechanism allows for more SPs to participate
 (reserve/fill) as time progresses, so there will be a larger pool of SPs that
 could potentially fill the slot. Because each reservation will have its own
 unique expanding window source, SPs reserving one slot in a request will likely
 not have the same opportunities to reserve/fill the same slot in another
 request.
 #### Censoring SP attack
 The "censoring SP attack" is when an SP attempts to withhold providing specific
 CIDs from the network in an attempt to censor certain content. An SP could also
 try this attack in the case of repair, hoping to prevent a freed slot from being
 repaired.
 Even if one SP withholds specific content, the dataset, along with the withheld
 CID can be reconstructed from K chunks (provided by other SPs) allowing the
 censored CID to be accessed. In the case of repair, the SP would need to control
 M+1 chunks to prevent data reconstruction by other nodes in the network. The
 expanding window mechanism seeks to prevent SPs from filling multiple slots in
 the same request, which should prevent any one SP from controlling M+1 slots.
 #### Greedy SP attack
 The "greedy SP attack" is when one SP tries to fill multiple slots in a single
 request. Mitigation of this attack is achieved through the expanding windows for
 each request not allowing a single SP address to fill all the slots. This is
 only effective for the majority of time before expiry, however, meaning it is
 not impossible for this attack to occur. If a request is offered and the slots
 are not filled after some time, the expanding windows across the slots may open
 up to allow all SPs in the network to fill multiple slots in the request.
 A controlling entity may try to circumvent the expanding window by setting up a
 sybil attack with many highly distributed nodes. Even with many nodes covering a
 large distribution of the address space, the randomness of the expanding window
 will make this attack highly improbable, except for undervalued requests that do
 not have slots filled early, in which case there would be a lack of motivation
 to attack data that is undervalued.
 #### Sticky SP attack
 The "sticky SP attack" is where an SP tries to withhold data for a contract
 renewal so they are able to fill the slot again. The SP withholds data from all
 other SPs until the expanding window allows their address, then they quickly
 fill the slot (they are quick because they don't need to download the data). As
 in the censoring SP attack, the SP would need to control M+1 slots for this to
 be effective, because that is the only way to prevent the CID from being
 reconstructed from K slots available from other SPs.
 #### Lazy client attack
 In this attack, a client might want to disrupt the network by creating requests
 for storage but never releasing the data to SPs attempting to fill the slot. The
 transaction cost associated with this type of behavior should provide some
 mitigation. Additionally, if a client tries to spam the network with these types
 of untenable storage requests, the transaction cost will increase with the
 number of requests due to increasing block fill rate and rising gas costs
 associated. However, this attack is not impossible.
 ### Open questions
 Perhaps the expanding window mechanism should be network-aware such
 that there are always a minimum of two SPs in a window at a given time, to
 encourage competition? The downside of this is that active SPs need to be
 persisted and tracked in the contract, with larger transaction costs resulting
 from this.
 ### Trade offs
 The main advantage to this design is that nodes and the network would not be
 overloaded at the outset of slots being available for SP participation.
 The downside of this proposal is that an SP would have to participate in two
 races: one for reserving the slot and another for filling the slot once
 reserved, which brings additional complexities in the smart contract.
 In addition, there are two attack vectors, the "greedy SP attack" and the "lazy
 client attack" that are not well covered in the slot reservation design. There
 could be even more complexities added to the design to accommodate these two
 attacks (see the other proposed solution for the mitigation of these attacks).