nim-dagger/codex/sales/reservations.nim
Eric 1d161d383e
Slot queue (#455)
## Slot queue
Adds a slot queue, as per the [slot queue design](https://github.com/codex-storage/codex-research/blob/master/design/sales.md#slot-queue).

Any time storage is requested, all slots from that request are immediately added to the queue. Finished, Cancelled, and Failed requests remove all slots with that request id from the queue. SlotFreed events add a new slot to the queue and SlotFilled events remove the slot from the queue. This allows popping a slot each time one is processed, which makes processing much simpler.

When an entire request of slots is added to the queue, the slot indices are shuffled randomly, so that nodes picking up the same storage requested event are less likely to clash on the first processed slot index. This allowed removing the random slot index assignment in the SalePreparing state. It also ensures that every SalesAgent has a slot index assigned at the start, which is why the optional slotIndex was removed.
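The shuffling described above can be sketched as follows. This is a minimal illustration using only the Nim standard library; the name `shuffledSlotIndices` and the use of `std/random` are assumptions made for the example, not the code added by this change.

```nim
import std/[random, sequtils, algorithm]

proc shuffledSlotIndices(slots: int, rng: var Rand): seq[int] =
  ## Hypothetical helper: returns the indices 0 ..< slots in random order.
  result = toSeq(0 ..< slots)
  rng.shuffle(result)

# A fixed seed keeps the sketch reproducible; a node would seed differently.
var rng = initRand(42)
let indices = shuffledSlotIndices(8, rng)

# Every slot index still appears exactly once, only the order changes.
doAssert indices.len == 8
doAssert indices.sorted == toSeq(0 ..< 8)
```

Because each node shuffles independently, two nodes reacting to the same event are unlikely to start with the same slot index.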

Remove slotId from the SlotFreed event as it was not being used. RequestId and slotIndex were added to the SlotFreed event earlier and those are now being used.

The slot queue invariant that prioritises queue items added to the queue relies on a scoring mechanism to sort them based on the [sort order in the design document](https://github.com/codex-storage/codex-research/blob/master/design/sales.md#sort-order).

Previously, when a storage request was handled by the sales module, a slot index was randomly assigned and then the slot was filled. Now, a random slot index is only assigned when adding an entire request to the slot queue. Additionally, the slot's state is checked to be `SlotState.Free` before continuing with the download process.

SlotQueue should always ensure the underlying AsyncHeapQueue holds at most one less than the maximum number of items, so that the SlotQueue always has space to add an additional item, regardless of whether it is full.

Constructing `SlotQueue.workers` in `SlotQueue.new` calls `newAsyncQueue` which causes side effects, so the construction call had to be moved to `SlotQueue.start`.

Prevent loading request from contract (network request) if there is an existing item in queue for that request.

Check availability before adding request to queue.

Add ability to query market contract for past events. When new availabilities are added, the `onReservationAdded` callback is triggered in which past `StorageRequested` events are queried, and those slots are added to the queue (filtered by availability on `push` and filtered by state in `SalePreparing`).
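The availability filter can be illustrated with the matching condition used by `find` in this file. The sketch below is self-contained: `Avail` and `Ask` are hypothetical, simplified stand-ins that use `uint64` instead of `UInt256`, and `matches` is an illustrative name, not part of the module.

```nim
type
  Avail = object   # hypothetical stand-in for Availability
    size, duration, minPrice, maxCollateral: uint64
  Ask = object     # hypothetical stand-in for the requested terms
    size, duration, minPrice, collateral: uint64

func matches(a: Avail, ask: Ask): bool =
  ## Same comparisons as `find`: the request must fit within the
  ## availability's size, duration and collateral limits, and must pay
  ## at least the availability's minimum price.
  ask.size <= a.size and
  ask.duration <= a.duration and
  ask.collateral <= a.maxCollateral and
  ask.minPrice >= a.minPrice

let avail = Avail(size: 100, duration: 10, minPrice: 5, maxCollateral: 50)
doAssert avail.matches(Ask(size: 80, duration: 10, minPrice: 5, collateral: 20))
doAssert not avail.matches(Ask(size: 120, duration: 10, minPrice: 5, collateral: 20))
```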

#### Request Workers
Limit the concurrent requests being processed in the queue by using a limited pool of workers (default = 3). Workers are in a data structure of type `AsyncQueue[SlotQueueWorker]`. This allows us to await a `popFirst` for available workers inside of the main SlotQueue event loop.

Add an `onCleanUp` that stops the agents and removes them from the sales module agent list. `onCleanUp` is called from sales end states (eg ignored, cancelled, finished, failed, errored).

Add a `doneProcessing` future to `SlotQueueWorker` to be completed in the `onProcessSlot` callback. Each `doneProcessing` future created is cancelled and awaited in `SlotQueue.stop` (thanks to `TrackableFutures`), which forced `stop` to become async.
  - Cancel dispatched workers and the `onProcessSlot` callbacks, which prevents zombie callbacks

#### Add TrackableFutures
Allow tracking of futures in a module so they can be cancelled at a later time. Useful for asyncSpawned futures, but works for any future.

### Sales module
The sales module needed to subscribe to request events to ensure that the request queue was managed correctly on each event. In the process of doing this, the sales agents were updated to avoid subscribing to events in each agent, and instead dispatch received events from the sales module to all created sales agents. This prevents memory leaks caused by having too many event emitter subscriptions.
  - prevent removal of agents from sales module while stopping, otherwise the agents seq len is modified while iterating

An additional sales agent state was added, `SalePreparing`, that handles all state machine setup, such as retrieving the request and subscribing to events that were previously in the `SaleDownloading` state.

Once agents had parked in an end state (eg ignored, cancelled, finished, failed, errored), they were not getting cleaned up and the sales module kept a handle on their reference. An `onCleanUp` callback was created to be called after the state machine enters an end state, which prevents a memory leak when the number of incoming requests is high.

Move the SalesAgent callback raises pragmas from the Sales module to the proc definition in SalesAgent. This avoids having to catch `Exception`.
  - remove unneeded error handling as pragmas were moved

Move sales.subscriptions from an object containing named subscriptions to a `seq[Subscription]` directly on the sales object.

Sales tests: shut down repo after sales stop, to fix SIGABRT in CI

### Add async Promise API
  - modelled after JavaScript Promise API
  - alternative to `asyncSpawn` that allows handling of async calls in a synchronous context (including access to the synchronous closure) with less additional procs to be declared
  - Write less code, catch errors that would otherwise raise a defect in `asyncSpawn`, and execute a callback after completion
  - Add cancellation callbacks to utils/then, ensuring cancellations are handled properly

## Dependencies
- bump codex-contracts-eth to support slot queue (https://github.com/codex-storage/codex-contracts-eth/pull/61)
- bump nim-ethers to 0.5.0
- Bump nim-json-rpc submodule to 0bf2bcb

---------

Co-authored-by: Jaremy Creechley <creechley@gmail.com>
2023-07-25 12:50:30 +10:00

370 lines
10 KiB
Nim

## Nim-Codex
## Copyright (c) 2022 Status Research & Development GmbH
## Licensed under either of
## * Apache License, version 2.0, ([LICENSE-APACHE](LICENSE-APACHE))
## * MIT license ([LICENSE-MIT](LICENSE-MIT))
## at your option.
## This file may not be copied, modified, or distributed except according to
## those terms.
import std/typetraits
import pkg/chronos
import pkg/chronicles
import pkg/upraises
import pkg/json_serialization
import pkg/json_serialization/std/options
import pkg/stint
import pkg/stew/byteutils
import pkg/nimcrypto
import pkg/questionable
import pkg/questionable/results
push: {.upraises: [].}
import pkg/datastore
import ../stores
import ../contracts/requests
export requests
logScope:
  topics = "reservations"

type
  AvailabilityId* = distinct array[32, byte]
  Availability* = object
    id*: AvailabilityId
    size*: UInt256
    duration*: UInt256
    minPrice*: UInt256
    maxCollateral*: UInt256
    used*: bool
  Reservations* = ref object
    repo: RepoStore
    onReservationAdded: ?OnReservationAdded
  GetNext* = proc(): Future[?Availability] {.upraises: [], gcsafe, closure.}
  OnReservationAdded* = proc(availability: Availability): Future[void] {.upraises: [], gcsafe.}
  AvailabilityIter* = ref object
    finished*: bool
    next*: GetNext
  AvailabilityError* = object of CodexError
  AvailabilityAlreadyExistsError* = object of AvailabilityError
  AvailabilityReserveFailedError* = object of AvailabilityError
  AvailabilityReleaseFailedError* = object of AvailabilityError
  AvailabilityDeleteFailedError* = object of AvailabilityError
  AvailabilityGetFailedError* = object of AvailabilityError
  AvailabilityUpdateFailedError* = object of AvailabilityError

const
  SalesKey = (CodexMetaKey / "sales").tryGet # TODO: move to sales module
  ReservationsKey = (SalesKey / "reservations").tryGet
proc new*(T: type Reservations,
          repo: RepoStore): Reservations =
  T(repo: repo)

proc init*(
  _: type Availability,
  size: UInt256,
  duration: UInt256,
  minPrice: UInt256,
  maxCollateral: UInt256): Availability =
  var id: array[32, byte]
  doAssert randomBytes(id) == 32
  Availability(id: AvailabilityId(id), size: size, duration: duration, minPrice: minPrice, maxCollateral: maxCollateral)

func toArray*(id: AvailabilityId): array[32, byte] =
  array[32, byte](id)

proc `==`*(x, y: AvailabilityId): bool {.borrow.}

proc `==`*(x, y: Availability): bool =
  x.id == y.id and
  x.size == y.size and
  x.duration == y.duration and
  x.maxCollateral == y.maxCollateral and
  x.minPrice == y.minPrice

proc `$`*(id: AvailabilityId): string = id.toArray.toHex

proc toErr[E1: ref CatchableError, E2: AvailabilityError](
  e1: E1,
  _: type E2,
  msg: string = e1.msg): ref E2 =
  return newException(E2, msg, e1)
proc writeValue*(
  writer: var JsonWriter,
  value: AvailabilityId) {.upraises:[IOError].} =
  mixin writeValue
  writer.writeValue value.toArray

proc readValue*[T: AvailabilityId](
  reader: var JsonReader,
  value: var T) {.upraises: [SerializationError, IOError].} =
  mixin readValue
  value = T reader.readValue(T.distinctBase)

proc `onReservationAdded=`*(self: Reservations,
                            onReservationAdded: OnReservationAdded) =
  self.onReservationAdded = some onReservationAdded

func key(id: AvailabilityId): ?!Key =
  (ReservationsKey / id.toArray.toHex)

func key*(availability: Availability): ?!Key =
  return availability.id.key

func available*(self: Reservations): uint = self.repo.available

func hasAvailable*(self: Reservations, bytes: uint): bool =
  self.repo.available(bytes)
proc exists*(
  self: Reservations,
  id: AvailabilityId): Future[?!bool] {.async.} =
  without key =? id.key, err:
    return failure(err)

  let exists = await self.repo.metaDs.contains(key)
  return success(exists)

proc get*(
  self: Reservations,
  id: AvailabilityId): Future[?!Availability] {.async.} =
  if exists =? (await self.exists(id)) and not exists:
    let err = newException(AvailabilityGetFailedError,
      "Availability does not exist")
    return failure(err)

  without key =? id.key, err:
    return failure(err.toErr(AvailabilityGetFailedError))

  without serialized =? await self.repo.metaDs.get(key), err:
    return failure(err.toErr(AvailabilityGetFailedError))

  without availability =? Json.decode(serialized, Availability).catch, err:
    return failure(err.toErr(AvailabilityGetFailedError))

  return success availability
proc update(
  self: Reservations,
  availability: Availability): Future[?!void] {.async.} =
  trace "updating availability", id = availability.id, size = availability.size,
    used = availability.used

  without key =? availability.key, err:
    return failure(err)

  if err =? (await self.repo.metaDs.put(
    key,
    @(availability.toJson.toBytes))).errorOption:
    return failure(err.toErr(AvailabilityUpdateFailedError))

  return success()

proc delete(
  self: Reservations,
  id: AvailabilityId): Future[?!void] {.async.} =
  trace "deleting availability", id

  without availability =? (await self.get(id)), err:
    return failure(err)

  without key =? availability.key, err:
    return failure(err)

  if err =? (await self.repo.metaDs.delete(key)).errorOption:
    return failure(err.toErr(AvailabilityDeleteFailedError))

  return success()
proc reserve*(
  self: Reservations,
  availability: Availability): Future[?!void] {.async.} =
  if exists =? (await self.exists(availability.id)) and exists:
    let err = newException(AvailabilityAlreadyExistsError,
      "Availability already exists")
    return failure(err)

  without key =? availability.key, err:
    return failure(err)

  let bytes = availability.size.truncate(uint)

  if reserveErr =? (await self.repo.reserve(bytes)).errorOption:
    return failure(reserveErr.toErr(AvailabilityReserveFailedError))

  if updateErr =? (await self.update(availability)).errorOption:
    # rollback the reserve
    trace "rolling back reserve"
    if rollbackErr =? (await self.repo.release(bytes)).errorOption:
      rollbackErr.parent = updateErr
      return failure(rollbackErr)

    return failure(updateErr)

  if onReservationAdded =? self.onReservationAdded:
    try:
      await onReservationAdded(availability)
    except CatchableError as e:
      # we don't have any insight into types of errors that
      # `onReservationAdded` can throw because it is caller-defined
      warn "Unknown error during 'onReservationAdded' callback",
        availabilityId = availability.id, error = e.msg

  return success()
proc release*(
  self: Reservations,
  id: AvailabilityId,
  bytes: uint): Future[?!void] {.async.} =
  trace "releasing bytes and updating availability", bytes, id

  without var availability =? (await self.get(id)), err:
    return failure(err)

  without key =? id.key, err:
    return failure(err)

  if releaseErr =? (await self.repo.release(bytes)).errorOption:
    return failure(releaseErr.toErr(AvailabilityReleaseFailedError))

  availability.size = (availability.size.truncate(uint) - bytes).u256

  template rollbackRelease(e: ref CatchableError) =
    trace "rolling back release"
    if rollbackErr =? (await self.repo.reserve(bytes)).errorOption:
      rollbackErr.parent = e
      return failure(rollbackErr)

  # remove completely used availabilities
  if availability.size == 0.u256:
    if err =? (await self.delete(availability.id)).errorOption:
      rollbackRelease(err)
      return failure(err)

    return success()

  # persist partially used availability with updated size
  if err =? (await self.update(availability)).errorOption:
    rollbackRelease(err)
    return failure(err)

  return success()
proc markUsed*(
  self: Reservations,
  id: AvailabilityId): Future[?!void] {.async.} =
  without var availability =? (await self.get(id)), err:
    return failure(err)

  availability.used = true
  let r = await self.update(availability)
  if r.isOk:
    trace "availability marked used", id = id.toArray.toHex
  return r

proc markUnused*(
  self: Reservations,
  id: AvailabilityId): Future[?!void] {.async.} =
  without var availability =? (await self.get(id)), err:
    return failure(err)

  availability.used = false
  let r = await self.update(availability)
  if r.isOk:
    trace "availability marked unused", id = id.toArray.toHex
  return r
iterator items*(self: AvailabilityIter): Future[?Availability] =
  while not self.finished:
    yield self.next()

proc availabilities*(
  self: Reservations): Future[?!AvailabilityIter] {.async.} =
  var iter = AvailabilityIter()
  let query = Query.init(ReservationsKey)

  without results =? await self.repo.metaDs.query(query), err:
    return failure(err)

  proc next(): Future[?Availability] {.async.} =
    await idleAsync()
    iter.finished = results.finished
    if not results.finished and
      r =? (await results.next()) and
      serialized =? r.data and
      serialized.len > 0:

      return some Json.decode(string.fromBytes(serialized), Availability)

    return none Availability

  iter.next = next
  return success iter

proc unused*(r: Reservations): Future[?!seq[Availability]] {.async.} =
  var ret: seq[Availability] = @[]

  without availabilities =? (await r.availabilities), err:
    return failure(err)

  for a in availabilities:
    if availability =? (await a) and not availability.used:
      ret.add availability

  return success(ret)
proc find*(
  self: Reservations,
  size, duration, minPrice, collateral: UInt256,
  used: bool): Future[?Availability] {.async.} =
  without availabilities =? (await self.availabilities), err:
    error "failed to get all availabilities", error = err.msg
    return none Availability

  for a in availabilities:
    if availability =? (await a):

      if used == availability.used and
        size <= availability.size and
        duration <= availability.duration and
        collateral <= availability.maxCollateral and
        minPrice >= availability.minPrice:

        trace "availability matched",
          used, availUsed = availability.used,
          size, availsize = availability.size,
          duration, availDuration = availability.duration,
          minPrice, availMinPrice = availability.minPrice,
          collateral, availMaxCollateral = availability.maxCollateral

        return some availability

      trace "availability did not match",
        used, availUsed = availability.used,
        size, availsize = availability.size,
        duration, availDuration = availability.duration,
        minPrice, availMinPrice = availability.minPrice,
        collateral, availMaxCollateral = availability.maxCollateral