mirror of
https://github.com/logos-storage/logos-storage-research.git
synced 2026-01-04 06:23:09 +00:00
add phases, start on phase 2 analysis
This commit is contained in:
parent
2d7e6c01e0
commit
daf6afda9c
@ -44,11 +44,28 @@ RepoStore will add support for explicit deletes.
|
|||||||
- Marketplace calls `onStore` with a `requestId, slotIdx` (and a manifest cid if needed)
|
- Marketplace calls `onStore` with a `requestId, slotIdx` (and a manifest cid if needed)
|
||||||
- Marketplace calls `delete` with a `requestId, slotIdx` (and a manifest cid if needed)
|
- Marketplace calls `delete` with a `requestId, slotIdx` (and a manifest cid if needed)
|
||||||
|
|
||||||
|
#### `onStore` considerations
|
||||||
|
|
||||||
|
`onStore` is a long-running operation that may not finish before a request
|
||||||
|
cancellation. If this happens, `SaleDownloading.run` will be cancelled, which
|
||||||
|
will cancel the `onStore` call, so `onStore` will need to handle this exception
|
||||||
|
appropriately.
|
||||||
|
|
||||||
### Phase III
|
### Phase III
|
||||||
|
|
||||||
Phase III will include additional Marketplace features that go beyond the
|
Phase III will include [resumption of local state in the Marketplace](https://github.com/codex-storage/codex-research/blob/refactor/simplified-sales-and-purchasing/design/sales2.md).
|
||||||
"baseline" [simplification of the Marketplace sales
|
|
||||||
design](https://github.com/codex-storage/codex-research/blob/refactor/simplified-sales-and-purchasing/design/sales2.md).
|
#### Resumable downloads support
|
||||||
|
|
||||||
|
Depends on: [Marketplace resumption of local state](https://github.com/codex-storage/codex-research/blob/refactor/simplified-sales-and-purchasing/design/sales2.md#resuming-local-state-eg-downloading)
|
||||||
|
|
||||||
|
- `onStore` will need to internally track the state of building a slot, to allow
|
||||||
|
for recovery (Download Manager), remembering to handle cancellations
|
||||||
|
appropriately.
|
||||||
|
- When the Marketplace restores a local sale to the downloading state, `onStore`
|
||||||
|
will be called and it will resume its operation based on the state.
|
||||||
|
|
||||||
|
### Phase IV and beyond
|
||||||
|
|
||||||
#### Marketplace concurrency support
|
#### Marketplace concurrency support
|
||||||
|
|
||||||
@ -58,22 +75,6 @@ Depends on: [Marketplace support for concurrent workers](https://github.com/code
|
|||||||
stores/deletes (due to locking) by implementing a Marketplace-level dataset
|
stores/deletes (due to locking) by implementing a Marketplace-level dataset
|
||||||
ref count.
|
ref count.
|
||||||
|
|
||||||
#### Resumable downloads support
|
|
||||||
|
|
||||||
Depends on: [Marketplace resumption of local state](https://github.com/codex-storage/codex-research/blob/refactor/simplified-sales-and-purchasing/design/sales2.md#resuming-local-state-eg-downloading)
|
|
||||||
|
|
||||||
- `onStore` will need to internally track the state of building a slot, to allow
|
|
||||||
for recovery (Download Manager)
|
|
||||||
- When the Marketplace restores a local sale to the downloading state, `onStore`
|
|
||||||
will be called and it will resume its operation based on the state.
|
|
||||||
|
|
||||||
#### `onStore` considerations
|
|
||||||
|
|
||||||
`onStore` is a long-running operation that may not finish before a request
|
|
||||||
cancellation. If this happens, `SaleDownloading.run` will be cancelled, which
|
|
||||||
will cancel the `onStore` call, so `onStore` will need to handle this exception
|
|
||||||
appropriately.
|
|
||||||
|
|
||||||
## Phase I: Expiries but no delete
|
## Phase I: Expiries but no delete
|
||||||
|
|
||||||
Identify points in the sales state machine with crashes, exceptions, or
|
Identify points in the sales state machine with crashes, exceptions, or
|
||||||
@ -140,6 +141,11 @@ flowchart LR
|
|||||||
|
|
||||||
### Downloading
|
### Downloading
|
||||||
|
|
||||||
|
It is important to note that the Marketplace does not update the expiry
|
||||||
|
directly. Updating of the dataset expiry is done indirectly via
|
||||||
|
`onStore(expiry)`. However, consideration of expiration update success is still
|
||||||
|
needed to analyse recoverability.
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
---
|
---
|
||||||
config:
|
config:
|
||||||
@ -179,7 +185,7 @@ config:
|
|||||||
flowchart LR
|
flowchart LR
|
||||||
WaitForStableChallenge["Wait for stable challenge"] -->
|
WaitForStableChallenge["Wait for stable challenge"] -->
|
||||||
GetChallenge["Get challenge"] -- challenge -->
|
GetChallenge["Get challenge"] -- challenge -->
|
||||||
onProve["onProve(slot, challege)"] -->
|
onProve["onProve(slot, challenge)"] -->
|
||||||
SaleFilling
|
SaleFilling
|
||||||
|
|
||||||
WaitForStableChallenge -- Exception --> SaleErrored
|
WaitForStableChallenge -- Exception --> SaleErrored
|
||||||
@ -193,7 +199,7 @@ flowchart LR
|
|||||||
| Situation | Outcome |
|
| Situation | Outcome |
|
||||||
|--------------------------------------------------------------------------------|---------------------------------------------|
|
|--------------------------------------------------------------------------------|---------------------------------------------|
|
||||||
| CRASH at any point | The slot is forfeited. |
|
| CRASH at any point | The slot is forfeited. |
|
||||||
| EXCEPTION at any point | Goes to SaleErrored. The slot is forfeited. |
|
| EXCEPTION at any point | Goes to `SaleErrored`. The slot is forfeited. |
|
||||||
| CANCELLATION during "wait for stable challenge", "get challenge", or `onProve` | The slot is forfeited. |
|
| CANCELLATION during "wait for stable challenge", "get challenge", or `onProve` | The slot is forfeited. |
|
||||||
|
|
||||||
### Filling
|
### Filling
|
||||||
@ -290,10 +296,10 @@ flowchart TB
|
|||||||
WaitUntilNextPeriod -- Exception --> SaleErrored
|
WaitUntilNextPeriod -- Exception --> SaleErrored
|
||||||
```
|
```
|
||||||
|
|
||||||
| Situation | Outcome | Solution |
|
| Situation | Outcome | Solution |
|
||||||
|------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||||
| CRASH at any point | On chain state is restored at startup, starting in `SaleFilled` state which extends the expiry to request end (no op), then moves back to `SaleProving`. | |
|
| CRASH at any point | On chain state is restored at startup, starting in `SaleFilled` state which extends the expiry to request end (no op), then moves back to `SaleProving`. | |
|
||||||
| EXCEPTION during `getSlotState`, `waitForNextPeriod`, `getCurrentPeriod`, or `getChallenge` | Goes to `SaleErrored`, all proving is stopped and SP becomes at risk of being slashed. | For network exceptions, implement retry functionality, with exponential backoff. For other exceptions, go to `SaleErrored`. Exceptions may include any errors resulting from the JSON RPC call, or errors originating in ethers. |
|
| EXCEPTION during `getSlotState`, `waitForNextPeriod`, `getCurrentPeriod`, or `getChallenge` | Goes to `SaleErrored`, all proving is stopped and SP becomes at risk of being slashed. | For network exceptions, implement retry functionality, with exponential backoff. For other exceptions, go to `SaleErrored`. Exceptions may include any errors resulting from the JSON RPC call, or errors originating in ethers. |
|
||||||
| CANCELLATION during `getSlotState`, `waitForNextPeriod`, `getCurrentPeriod`, or `getChallenge` | Goes to `SaleErrored`, all proving is stopped and SP becomes at risk of being slashed. | Possible options include:<br>1. Raise a Defect, crashing the node and forcing a restart. On startup, slot will enter the `SaleFilled` state, then move to `SaleProving`.<br>2. Log an error and increment a metric counter.<br>2. Do not propagate the cancellation. The sale will continue in the `SaleProving` state. Waiters (eg `cancelAndWait` may wait indefinitely). |
|
| CANCELLATION during `getSlotState`, `waitForNextPeriod`, `getCurrentPeriod`, or `getChallenge` | Goes to `SaleErrored`, all proving is stopped and SP becomes at risk of being slashed. | Possible options include:<br>1. Raise a Defect, crashing the node and forcing a restart. On startup, slot will enter the `SaleFilled` state, then move to `SaleProving`.<br>2. Log an error and increment a metric counter.<br>2. Do not propagate the cancellation. The sale will continue in the `SaleProving` state. Waiters (eg `cancelAndWait` may wait indefinitely). |
|
||||||
|
|
||||||
### Payout
|
### Payout
|
||||||
@ -313,13 +319,13 @@ flowchart LR
|
|||||||
OnFailed["RequestFailed contract event"] --> SaleFailed
|
OnFailed["RequestFailed contract event"] --> SaleFailed
|
||||||
```
|
```
|
||||||
|
|
||||||
| Situation | Outcome | Solution |
|
| Situation | Outcome | Solution |
|
||||||
|-----------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
|
|-----------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------|
|
||||||
| `freeSlot` is successful | SUCCESS | |
|
| `freeSlot` is successful | SUCCESS | |
|
||||||
| CRASH before `freeSlot` completes | On chain state is restored at startup, starting in `SalePayout`, where `freeSlot` will be tried again. | |
|
| CRASH before `freeSlot` completes | On chain state is restored at startup, starting in `SalePayout`, where `freeSlot` will be tried again. | |
|
||||||
| CRASH after `freeSlot` completes | Slot is no longer part of `mySlots`, so on startup, slot state will not be restored. | |
|
| CRASH after `freeSlot` completes | Slot is no longer part of `mySlots`, so on startup, slot state will not be restored. | |
|
||||||
| EXCEPTION during `freeSlot` | Goes to `SaleErrored`. If `freeSlot` succeeded before the exception, no clean up routines will be performed. If `freeSlot` did not succeed before the exception, no funds will have been paid out and there are no recovery options. | For network exceptions, implement retry functionality with exponential backoff. |
|
| EXCEPTION during `freeSlot` | Goes to `SaleErrored`. If `freeSlot` succeeded before the exception, no clean up routines will be performed. If `freeSlot` did not succeed before the exception, no funds will have been paid out and there are no recovery options. | For network exceptions, implement retry functionality with exponential backoff. |
|
||||||
| CANCELLATION during `freeSlot` | Goes to `SaleErrored`. If `freeSlot` succeeded before the cancellation, no clean up routines will be performed. If `freeSlot` did not succeed before the cancellation, no funds will have been paid out and there are no recovery options. | |
|
| CANCELLATION during `freeSlot` | Goes to `SaleErrored`. If `freeSlot` succeeded before the cancellation, no clean up routines will be performed. If `freeSlot` did not succeed before the cancellation, no funds will have been paid out and there are no recovery options. | |
|
||||||
|
|
||||||
### Finished
|
### Finished
|
||||||
|
|
||||||
@ -348,13 +354,47 @@ flowchart LR
|
|||||||
|
|
||||||
## Phase II: No expiries but deletes
|
## Phase II: No expiries but deletes
|
||||||
|
|
||||||
Depends on: `SalesOrder` implementation
|
Depends on: `SalesOrder` "baseline" design implementation.
|
||||||
|
|
||||||
Identify points of `SalesOrder` updates and `RepoStore` writes, with crashes,
|
Identify points of `SalesOrder` updates and `RepoStore` writes, with crashes,
|
||||||
exceptions, or cancellations in between.
|
exceptions, or cancellations in between.
|
||||||
|
|
||||||
### Downloading
|
### Downloading
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
---
|
||||||
|
config:
|
||||||
|
layout: elk
|
||||||
|
---
|
||||||
|
flowchart LR
|
||||||
|
FetchSlotState["Fetch slot state"] -- isRepairing -->
|
||||||
|
CreateSalesOrder["Create SalesOrder"] -->
|
||||||
|
OnStore["onStore(isRepairing)"] -->
|
||||||
|
SaleInitialProving
|
||||||
|
|
||||||
|
OnStore -- Exception --> SaleErrored
|
||||||
|
FetchSlotState -- Exception --> SaleErrored
|
||||||
|
CreateSalesOrder -- Exception --> SaleErrored
|
||||||
|
|
||||||
|
OnCancelled["Cancelled timer elapsed"] --> SaleCancelled
|
||||||
|
OnFailed["RequestFailed contract event"] --> SaleFailed
|
||||||
|
```
|
||||||
|
|
||||||
|
| Situation | Outcome |
|
||||||
|
|-------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||||
|
| No crash/exception in `onStore(expiry)` | Success |
|
||||||
|
| CRASH before create `SalesOrder` | The slot is forfeited, with no recovery on startup since the slot was not filled. |
|
||||||
|
| CRASH before `onStore(expiry)` | The slot is forfeited, with no recovery on startup since the slot was not filled. On restart, the `SalesOrder` will be archived. |
|
||||||
|
| CRASH in `onStore(expiry)` | The slot is forfeited, with no recovery on startup since the slot was not filled. On restart, the dataset will be cleaned up (corrective cleanup) and the `SalesOrder` archived. |
|
||||||
|
| CRASH after `onStore(expiry)` but before the transition to `SaleInitialProving` | The slot is forfeited, with no recovery on startup since the slot was not filled. On restart, the dataset will be cleaned up (corrective cleanup) and the `SalesOrder` archived. |
|
||||||
|
| EXCEPTION before create `SalesOrder` | Goes to `SaleErrored`. The slot is forfeited. |
|
||||||
|
| EXCEPTION before `onStore(expiry)` | Goes to `SaleErrored`. The slot is forfeited. The `SalesOrder` will be archived. |
|
||||||
|
| EXCEPTION in `onStore(expiry)` | Goes to `SaleErrored`. The slot is forfeited. The dataset will be cleaned up (active cleanup) and the `SalesOrder` archived. |
|
||||||
|
| EXCEPTION after `onStore(expiry)` but before the transition to `SaleInitialProving` | The slot is forfeited. The dataset will be cleaned up (active cleanup) and the `SalesOrder` archived. |
|
||||||
|
| CANCELLATION while fetching slot state | The slot is forfeited. |
|
||||||
|
| CANCELLATION during `onStore` | The slot is forfeited. The dataset will be cleaned up (active cleanup) and the `SalesOrder` archived. |
|
||||||
|
|
||||||
|
|
||||||
`SalesOrders` are created in the downloading state before any data is written to
|
`SalesOrders` are created in the downloading state before any data is written to
|
||||||
disk.
|
disk.
|
||||||
|
|
||||||
|
|||||||
@ -192,8 +192,9 @@ until it is deleted. The properties of a created `Availability` can be updated
|
|||||||
at any time.
|
at any time.
|
||||||
|
|
||||||
Because availability(ies) represents *future* sales (and not active sales), and
|
Because availability(ies) represents *future* sales (and not active sales), and
|
||||||
because fields of the matching `Availability` are persisted in a `SalesOrder`,
|
because fields of the matching `Availability` can be persisted in a `SalesOrder`
|
||||||
availabilities are not tied to active sales and can be manipulated at any time.
|
(if needed), availabilities are not tied to active sales and can be manipulated
|
||||||
|
at any time.
|
||||||
|
|
||||||
### `SalesOrder` object
|
### `SalesOrder` object
|
||||||
|
|
||||||
@ -328,17 +329,17 @@ The underlying `RepoStore` of the `SalesRepo` is responsible to reading and
|
|||||||
writing datasets to storage. Its API will include:
|
writing datasets to storage. Its API will include:
|
||||||
|
|
||||||
```nim
|
```nim
|
||||||
proc store(id: DatasetId)
|
proc onStore(id: DatasetId)
|
||||||
## Stores blocks of the dataset, incrementing their ref count.
|
## Stores a dataset, incrementing its ref count.
|
||||||
proc delete(id: DatasetId)
|
proc onClear(id: DatasetId)
|
||||||
## Decreases the ref count of blocks of the dataset, deleting if the ref count is 0.
|
## Decreases the ref count of the dataset, deleting if the ref count is 0.
|
||||||
```
|
```
|
||||||
|
|
||||||
Datasets will be tracked by a particular id, but it is TBD as to what that ID
|
Datasets will be tracked by a particular id, but it is TBD as to what that ID
|
||||||
will be:
|
will be:
|
||||||
|
|
||||||
- Preferred option for MP is `manifestCID + slotIndex`.
|
- Preferred option for MP: `requestId + slotIndex`.
|
||||||
- Alternative options discussed: `treeCid + slotIndex`, `slotRoot`.
|
- Alternative options discussed: `treeCid + slotIndex`, `slotRoot`, `requestId + slotIndex + manifestCid`
|
||||||
|
|
||||||
## Total collateral
|
## Total collateral
|
||||||
|
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user